
Llmfit Hardware Model Matcher
Discover which local LLM weights actually fit your RAM, CPU, and GPU before you download multi-gigabyte models.
Overview
llmfit Hardware Model Matcher is an agent skill for the Build phase that maps your hardware to scored local LLM recommendations via the llmfit CLI and TUI.
Install
npx skills add https://github.com/aradotso/trending-skills --skill llmfit-hardware-model-matcher
What is this skill?
- Detects system RAM, CPU, and GPU, then scores hundreds of models on quality, speed, fit, and context
- Interactive TUI plus CLI for scripting recommendations
- Supports multi-GPU, MoE architectures, and dynamic quantization awareness
- Integrates with Ollama, llama.cpp, MLX, and Docker Model Runner workflows
- Install paths: Homebrew, curl installer, Scoop, and container images
Adoption & trust: 1.3k installs on skills.sh; 31 GitHub stars; 0/3 security scanners passed (skills.sh audits).
What problem does it solve?
You want local models for coding agents but cannot tell which checkpoints will fit your GPU RAM and still perform.
Who is it for?
Solo builders sizing first local LLM stacks on Mac, Linux, or Windows with mixed CPU/GPU setups.
Skip if: Teams only using hosted APIs with no on-device inference plans.
When should I use this skill?
The user asks which local LLMs fit their hardware or GPU memory, or wants llmfit recommendations.
What do I get? / Deliverables
You get a ranked, hardware-scored shortlist aligned to runtimes like Ollama or MLX before downloading weights.
- Hardware-aware model recommendation list
- Optional JSON output piped through jq for automation
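A minimal sketch of that automation, assuming the JSON has a top-level `models` array whose entries carry a `name` field (the shape implied by the `.models[].name` filter in the skill's own README). The helper name `model_names` and the sample data are hypothetical, and `jq` must be installed.

```sh
#!/bin/sh
# Hypothetical helper: print model names from llmfit-style JSON on stdin.
# Assumes a top-level "models" array with "name" fields; check this against
# your own `llmfit recommend --json` output before relying on it.
model_names() {
    jq -r '.models[].name'
}

# Stand-in data so the sketch runs without llmfit installed (made-up names).
sample='{"models":[{"name":"example-model-a"},{"name":"example-model-b"}]}'
printf '%s\n' "$sample" | model_names
```

With llmfit installed, the same function could be fed real output, for example `llmfit recommend --json --use-case coding --limit 3 | model_names`.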
Journey fit
Local model choice is a Build decision: wrong picks waste days of download and break agent tooling before Ship. Agent-tooling is where solo builders wire Ollama, MLX, and llama.cpp runtimes to their coding agents.
How it compares
Hardware-scored model picker instead of blindly pulling the latest trending checkpoint from skills.sh.
Common Questions / FAQ
Who is llmfit-hardware-model-matcher for?
Indie developers and agent users who run or plan local LLMs and need compatibility scoring tied to real machine specs.
When should I use llmfit-hardware-model-matcher?
Use it in Build agent-tooling when picking models, before Ship when hardening a local inference box, and in Operate infra when upgrading GPU or RAM and rebalancing model choices.
Is llmfit-hardware-model-matcher safe to install?
Treat install scripts and package managers like any third-party binary; review the Security Audits panel on this page and verify publisher URLs before curling installers.
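One way to act on that advice is to download the installer to a file and read it before running it, rather than piping it straight into a shell. The sketch below assumes `curl` is available; `fetch_installer` is a hypothetical helper name, not part of llmfit.

```sh
#!/bin/sh
# Sketch: save an install script to disk so it can be reviewed before running.
# fetch_installer is a hypothetical helper, not an llmfit command.
fetch_installer() {
    url=$1
    dest=$2
    # -f: fail on HTTP errors; -sS: quiet but still report errors;
    # -L: follow redirects; -o: write to a file instead of stdout.
    curl -fsSL "$url" -o "$dest"
}

# Example usage (not run automatically):
#   fetch_installer https://llmfit.axjns.dev/install.sh ./llmfit-install.sh
#   less ./llmfit-install.sh          # read the script first
#   sh ./llmfit-install.sh --local    # then run it; --local targets ~/.local/bin
```

Running the saved file with `sh ./llmfit-install.sh --local` passes the same argument as the piped form `sh -s -- --local`, so the only change is that the script can be inspected first.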
SKILL.md
# llmfit Hardware Model Matcher

> Skill by [ara.so](https://ara.so) - Daily 2026 Skills collection.

llmfit detects your system's RAM, CPU, and GPU, then scores hundreds of LLM models across quality, speed, fit, and context dimensions, telling you exactly which models will run well on your hardware. It ships with an interactive TUI and a CLI, supports multi-GPU, MoE architectures, dynamic quantization, and local runtime providers (Ollama, llama.cpp, MLX, Docker Model Runner).

---

## Installation

### macOS / Linux (Homebrew)

```sh
brew install llmfit
```

### Quick install script

```sh
curl -fsSL https://llmfit.axjns.dev/install.sh | sh

# Without sudo, installs to ~/.local/bin
curl -fsSL https://llmfit.axjns.dev/install.sh | sh -s -- --local
```

### Windows (Scoop)

```sh
scoop install llmfit
```

### Docker / Podman

```sh
docker run ghcr.io/alexsjones/llmfit

# With jq for scripting
podman run ghcr.io/alexsjones/llmfit recommend --use-case coding | jq '.models[].name'
```

### From source (Rust)

```sh
git clone https://github.com/AlexsJones/llmfit.git
cd llmfit
cargo build --release
# binary at target/release/llmfit
```

---

## Core Concepts

- **Fit tiers**: `perfect` (runs great), `good` (runs well), `marginal` (runs but tight), `too_tight` (won't run)
- **Scoring dimensions**: quality, speed (tok/s estimate), fit (memory headroom), context capacity
- **Run modes**: GPU, CPU+GPU offload, CPU-only, MoE
- **Quantization**: automatically selects best quant (e.g. Q4_K_M, Q5_K_S, mlx-4bit) for your hardware
- **Providers**: Ollama, llama.cpp, MLX, Docker Model Runner

---

## Key Commands

### Launch Interactive TUI

```sh
llmfit
```

### CLI Table Output

```sh
llmfit --cli
```

### Show System Hardware Detection

```sh
llmfit system
llmfit --json system   # JSON output
```

### List All Models

```sh
llmfit list
```

### Search Models

```sh
llmfit search "llama 8b"
llmfit search "mistral"
llmfit search "qwen coding"
```

### Fit Analysis

```sh
# All runnable models ranked by fit
llmfit fit

# Only perfect fits, top 5
llmfit fit --perfect -n 5

# JSON output
llmfit --json fit -n 10
```

### Model Detail

```sh
llmfit info "Mistral-7B"
llmfit info "Llama-3.1-70B"
```

### Recommendations

```sh
# Top 5 recommendations (JSON default)
llmfit recommend --json --limit 5

# Filter by use case: general, coding, reasoning, chat, multimodal, embedding
llmfit recommend --json --use-case coding --limit 3
llmfit recommend --json --use-case reasoning --limit 5
```

### Hardware Planning (invert: what hardware do I need?)

```sh
llmfit plan "Qwen/Qwen3-4B-MLX-4bit" --context 8192
llmfit plan "Qwen/Qwen3-4B-MLX-4bit" --context 8192 --quant mlx-4bit
llmfit plan "Qwen/Qwen3-4B-MLX-4bit" --context 8192 --target-tps 25 --json
llmfit plan "Qwen/Qwen2.5-Coder-0.5B-Instruct" --context 8192 --json
```

### REST API Server (for cluster scheduling)

```sh
llmfit serve
llmfit serve --host 0.0.0.0 --port 8787
```

---

## Hardware Overrides

When autodetection fails (VMs, broken nvidia-smi, passthrough setups):

```sh
# Override GPU VRAM
llmfit --memory=32G
llmfit --memory=24G --cli
llmfit --memory=24G fit --perfect -n 5
llmfit --memory=24G recommend --json

# Megabytes
llmfit --memory=32000M

# Works with any subcommand
llmfit --memory=16G info "Llama-3.1-70B"
```

Accepted suffixes: `G`/`GB`/`GiB`, `M`/`MB`/`MiB`, `T`/`TB`/`TiB` (case-insensitive).

### Context Length Cap

```sh
# Estimate memory fit at 4K context
llmfit --max-context 4096 --cli

# With subcommands
llmfit --