Running large language models locally still involves a lot of guesswork. LLM Checker, an open-source CLI tool, aims to eliminate that by profiling your hardware and scoring every compatible model across four dimensions: quality, speed, fit, and context window support. It then outputs ranked recommendations with ready-to-run Ollama pull commands.
The tool detects CPU, GPU, and memory specs, calculates usable VRAM, and filters a dynamic catalog of over 200 scraped Ollama models (with a 35+ curated fallback). It picks the best quantization level your machine can handle. On an Apple M4 Pro with 24GB unified memory, for instance, the tool classified the system as "medium high" tier and recommended qwen2.5-coder:14b for coding and deepseek-r1:14b for reasoning. Those scores are deterministic: same hardware, same results every time.
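The fit-and-quantization step described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual algorithm: the bytes-per-weight table, the overhead reserve, and the function names are all assumptions for the sake of the example.

```javascript
// Hypothetical sketch: given usable VRAM and a model's parameter count,
// pick the largest (highest-quality) quantization that still fits.
// Bytes-per-weight figures are rough approximations, not the tool's tables.
const QUANT_BYTES = { q8_0: 1.0, q6_K: 0.82, q5_K_M: 0.69, q4_K_M: 0.58 };

function bestQuant(paramsB, usableVramGB, overheadGB = 1.5) {
  // Reserve some memory for the KV cache and runtime overhead (assumed value).
  const budgetBytes = (usableVramGB - overheadGB) * 1e9;
  for (const [quant, bytesPerWeight] of Object.entries(QUANT_BYTES)) {
    if (paramsB * 1e9 * bytesPerWeight <= budgetBytes) return quant;
  }
  return null; // model does not fit at any listed quantization
}

// A 14B model against ~17 GB of usable unified memory
// (e.g. 24 GB total minus an OS share):
console.log(bestQuant(14, 17)); // prints "q8_0"
```

Because the inputs are fixed hardware specs and the tables are static, a calculation like this is deterministic by construction, which matches the "same hardware, same results" behavior the tool advertises.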
Beyond basic recommendations, LLM Checker now ships a built-in MCP server so Claude Code can query your hardware directly. Recent releases added vLLM and MLX runtime support, calibrated routing for the recommendation engine, and policy enforcement via YAML files for team governance workflows. The tool runs on macOS, Linux, and Windows with Node.js 16+.
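The policy-enforcement idea can be sketched as a filter over candidate models driven by a parsed policy file. The field names below (`allowLicenses`, `maxParamsB`, `blockedFamilies`) are invented for illustration; the project's actual YAML schema is not documented here.

```javascript
// Hypothetical shape of a team policy after parsing a YAML file.
// All field names are assumptions, not the tool's real schema.
const policy = {
  allowLicenses: ['apache-2.0', 'mit'],
  maxParamsB: 14,
  blockedFamilies: ['uncensored'],
};

// Keep only models a team policy permits.
function isAllowed(model, policy) {
  return (
    policy.allowLicenses.includes(model.license) &&
    model.paramsB <= policy.maxParamsB &&
    !policy.blockedFamilies.some((f) => model.name.includes(f))
  );
}

const candidate = { name: 'qwen2.5-coder:14b', license: 'apache-2.0', paramsB: 14 };
console.log(isAllowed(candidate, policy)); // prints true
```

Running the recommendation engine's output through a check like this is what makes governance enforceable: disallowed models never reach the ranked list.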
One caveat: the scoring relies on the tool's own benchmark estimates, not real inference benchmarks on your actual machine. The project is licensed under NPDL-1.0, which allows free use but prohibits paid redistribution without a commercial license. Current version on npm is 3.4.2.
Bottom Line
LLM Checker scores 200+ Ollama models against your hardware across four dimensions and outputs ready-to-run pull commands, now at version 3.4.2 on npm.
Quick Facts
- 200+ dynamically scraped models (35+ curated fallback)
- 4 scoring dimensions: Quality, Speed, Fit, Context
- Supports Apple Silicon, NVIDIA, and Intel Arc GPUs
- Current npm version: 3.4.2
- License: NPDL-1.0 (free use, no paid redistribution)