Tips

#Local LLM

3 tips tagged #Local LLM, 1-3 showing

1 / 1

Upstream unavailable

Apache-2.0

hadihonarvar/flock은 macOS·Linux 머신의 Ollama, vLLM, MLX-LM, llama.cpp-RPC 백엔드를 OpenAI·Anthropic 호환 API, 키·쿼터·감사 로그, 대시보드로 묶는 G...

단일 Go 바이너리로 leader/worker/CLI를 겸하며, GitHub Release v0.1.0은 darwin·linux의 amd64/arm64 tarball...
`/v1/chat/completions`, `/v1/models`, `/v1/messages`, `/v1/messages/count_tokens`를 인증·쿼터 mid...
Ollama를 기본 엔진으로 쓰되 vLLM, MLX-LM, llama.cpp/RPC 엔진과 model catalog, `flock shard create <model...
`flock connect`는 Claude Code, Cursor, Aider, Continue, Zed, Cline, Qwen Code, OpenAI/Anthrop...
기본 listen 값은 `:8080`이고 API key는 켜져 있지만, LAN/외부 노출·worker join token·cloud fallback key·Prome...

hadihonarvar/flockUnavailable

Free web calculator

Proprietary / Terms of Use

ApX Machine Learning의 APXML VRAM Calculator는 LLM 추론·파인튜닝에서 모델 크기, 양자화, KV 캐시, 컨텍스트 길이, 배치, 동시 사용자, GPU VRAM을 조합해 메모리와 대략적인 처...

설치형 앱이나 오픈소스 저장소가 아니라 브라우저에서 쓰는 ApX Machine Learning의 무료 웹 계산기이며, 조사 시점에 공개된 calculator sour...
Inference와 Fine-tuning 탭을 나누고, FP16/Q8/Q4 같은 모델 weight quantization과 KV cache quantization을...
모델 구조, layer/hidden dimension, active experts, attention 구조, batch size, sequence length, co...
Full fine-tuning, LoRA, QLoRA 쪽도 다루지만 공식 문구처럼 optimizer, parallelism, framework 구현에 따라 실제 필요...
결과는 하드웨어 구매·모델 후보 압축용 ballpark로 쓰고, 최종 배포 전에는 실제 런타임(Ollama, vLLM, llama.cpp, Transformers 등...

ApX Machine Learning / VRAM CalculatorSource

Open source beta

MIT

antoinezambelli/forge는 작은 self-hosted LLM이 multi-step tool-calling workflow에서 덜 흔들리도록 rescue parsing, retry nudges, step enf...

PyPI 패키지는 `forge-guardrails` v0.6.0이고 Python 3.12+, MIT license, GitHub tag는 v0.6.0/v0.5.0이지...
WorkflowRunner, Guardrails middleware, OpenAI-compatible proxy 세 가지 방식으로 붙일 수 있어 새 agent loo...
Ollama, llama-server, Llamafile, Anthropic backend를 지원하며 README와 Model Guide는 llama-server +...
v0.6.0 eval 문서는 46 configs × 26 scenarios × 2 ablations × 50 runs, 총 119,600 rows 기준으로 sampl...
Proxy는 기본 127.0.0.1:8081이지만 inspected CLI에는 auth 옵션이 보이지 않으므로 0.0.0.0/LAN 노출, prompt/tool sc...

antoinezambelli/forgeSource