LLM Checker includes a built-in Model Context Protocol (MCP) server. Once connected, Claude Code can detect your hardware, rank models, manage Ollama, run benchmarks, and execute any LLM Checker command without leaving the Claude interface.
What the MCP Server Provides
The MCP server exposes the full LLM Checker surface as structured tools that Claude can call autonomously:
- Hardware detection and tier analysis
- Model compatibility scoring and ranked recommendations
- Ollama model management (list, pull, run, remove)
- Benchmarking, comparison, and optimization tools
- Policy validation and audit export
- Calibration artifact generation
- Direct CLI execution for any allowlisted command
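To make "structured tools" concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; calling `hw_detect` with an empty argument object is an assumption, since the actual input schema of each tool is whatever the server reports via `tools/list`.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires each request to carry an id.
_ids = count(1)


def build_tool_call(name, arguments=None):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {
            "name": name,
            # Assumption: tools with no inputs accept an empty object.
            "arguments": arguments or {},
        },
    }


request = build_tool_call("hw_detect")
print(json.dumps(request, indent=2))
```

In practice Claude Code builds and sends these messages itself; the sketch only shows what travels over the wire when Claude decides to call a tool.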
Setup
- Global install (recommended)
- npx (no global install)
Generate the Setup Command
If you want the exact command for your environment printed to stdout (for scripting or manual config file editing), run the setup command generator. It prints the claude mcp add command and the corresponding JSON config snippet. Useful flags:
| Flag | Effect |
|---|---|
| --apply | Run the setup command automatically |
| --json | Output config as JSON only |
| --npx | Use npx transport instead of global binary |
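For manual config file editing, the JSON snippet generally follows the `mcpServers` layout that Claude Code reads from files such as `.mcp.json`. The sketch below builds that structure; the server name `llm-checker` and the `<mcp-server-command>` placeholder are assumptions for illustration, so substitute the values that the generated output prints for your environment.

```python
import json


def build_mcp_config(server_name, command, args=()):
    """Return an `mcpServers` config entry for a stdio MCP server."""
    return {
        "mcpServers": {
            server_name: {
                "command": command,
                "args": list(args),
            }
        }
    }


# Placeholder values: replace them with the command printed for your setup.
config = build_mcp_config("llm-checker", "<mcp-server-command>")
print(json.dumps(config, indent=2))
```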
Available MCP Tools
Core Analysis
hw_detect
Detect your hardware — CPU, GPU, RAM, and acceleration backend (Metal, CUDA, ROCm, CPU).
check
Full compatibility analysis with all models ranked by score.
recommend
Top model picks by category: coding, reasoning, multimodal, and more.
installed
Rank your already-downloaded Ollama models by compatibility score.
search
Search the Ollama model catalog with filters for family, quantization, size, and use-case.
smart_recommend
Advanced recommendations using the full 4D scoring engine.
ollama_plan
Build a capacity plan for local models with recommended NUM_CTX, NUM_PARALLEL, and memory settings.
ollama_plan_env
Return ready-to-paste export ... env vars from the recommended or fallback plan profile.
policy_validate
Validate a policy file against the v1 schema and return structured validation output.
audit_export
Run policy compliance export (json/csv/sarif/all) for check or recommend flows.
calibrate
Generate calibration artifacts from a prompt suite with typed MCP inputs.
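As an illustration of what "ready-to-paste" env vars from ollama_plan_env might look like, the sketch below renders a settings dictionary as shell `export` lines. The variable names `OLLAMA_NUM_PARALLEL` and `OLLAMA_FLASH_ATTENTION` are real Ollama settings, but the values and the exact set of variables are assumptions; the tool's actual output may differ.

```python
import shlex


def render_exports(env):
    """Render a mapping as sorted, shell-safe `export KEY=value` lines."""
    return "\n".join(
        f"export {key}={shlex.quote(str(value))}"
        for key, value in sorted(env.items())
    )


# Illustrative values only, not a recommendation for any specific hardware.
plan_env = {"OLLAMA_NUM_PARALLEL": 2, "OLLAMA_FLASH_ATTENTION": 1}
exports = render_exports(plan_env)
print(exports)
```

Quoting each value with `shlex.quote` keeps the lines safe to paste even when a value contains spaces or shell metacharacters.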
Ollama Management
ollama_list
List all downloaded models with params, quantization, family, and size.
ollama_pull
Download a model from the Ollama registry.
ollama_run
Run a prompt against a local model and receive tok/s metrics alongside the response.
ollama_remove
Delete a model to free disk space.
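The tok/s figure reported by ollama_run can be understood through the timing fields Ollama returns with a generation. The sketch below assumes the metric is derived from Ollama's `eval_count` (generated tokens) and `eval_duration` (nanoseconds) response fields; how LLM Checker computes its own number is not stated here, so treat this as an approximation of the idea.

```python
def tokens_per_second(response):
    """Compute generation speed from Ollama-style timing fields."""
    eval_count = response.get("eval_count", 0)
    eval_duration_ns = response.get("eval_duration", 0)
    if eval_duration_ns <= 0:
        # Avoid division by zero when timing data is missing.
        return 0.0
    return eval_count / (eval_duration_ns / 1e9)


# Sample values: 120 tokens generated in 3 seconds.
sample = {"eval_count": 120, "eval_duration": 3_000_000_000}
rate = tokens_per_second(sample)
print(f"{rate:.1f} tok/s")  # prints "40.0 tok/s"
```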
Advanced (MCP-exclusive)
These tools are only available through the MCP server and have no direct CLI equivalent.
ollama_optimize
Generate optimal Ollama env vars for your hardware: NUM_GPU, NUM_PARALLEL, FLASH_ATTENTION, and more.
benchmark
Benchmark a model with 3 standardized prompts, measuring tok/s, load time, and prompt eval.
compare_models
Head-to-head comparison of two models on the same prompt with speed and response side-by-side.
cleanup_models
Analyze installed models — find redundancies, cloud-only models, oversized models, and upgrade candidates.
project_recommend
Scan a project directory (languages, frameworks, size) and recommend the best model for that codebase.
ollama_monitor
Real-time system status: RAM usage, loaded models, and memory headroom analysis.
cli_help
List all allowlisted CLI commands exposed through MCP.
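The allowlist idea behind cli_help and cli_exec can be sketched as a simple gate: a command is only turned into a process invocation if it appears in a fixed set. The set below is hypothetical and is not LLM Checker's actual allowlist or implementation; ask cli_help for the real list.

```python
# Hypothetical allowlist for illustration; the real one is reported by cli_help.
ALLOWED_COMMANDS = {"check", "recommend", "installed", "search"}


def build_argv(command, args=()):
    """Return an argv list for an allowlisted command, or raise ValueError."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowlisted: {command!r}")
    # An argv list (rather than a shell string) avoids shell injection.
    return ["llm-checker", command, *args]


argv = build_argv("check")
print(argv)  # prints "['llm-checker', 'check']"
```

Rejecting unknown commands before anything is executed keeps an autonomous caller limited to a known, reviewable surface.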
cli_exec
Execute any allowlisted llm-checker CLI command with custom args (policy, audit, calibrate, sync, ai-run, etc.).
Example Claude Prompts
After setup, ask Claude things like:
Hardware and model selection
- “What’s the best coding model for my hardware?”
- “What model should I use for this Rust project?”
- “Do you see both my iGPU and dGPU?”
Benchmarking and comparison
- “Benchmark qwen2.5-coder and show me the tok/s”
- “Compare llama3.2 vs codellama for coding tasks”
Ollama management
- “Clean up my Ollama — what should I remove?”
- “Optimize my Ollama config for maximum performance”
- “How much RAM is Ollama using right now?”

