LLM Checker maintains two catalog layers that are merged at runtime: a dynamic catalog scraped from the Ollama registry, and a curated fallback catalog used when the dynamic pool is unavailable.
Dynamic Catalog
When the `sync` command has been run (requires sql.js), LLM Checker operates against the full scraped Ollama catalog, covering all families, sizes, and quantization variants. This pool typically contains 200+ models.
`search` and `smart-recommend` query this database directly with full scoring.
Curated Fallback Catalog
When the dynamic scraped pool is unavailable, LLM Checker falls back to a built-in curated catalog of 35+ models from the most popular Ollama families. This catalog is stored at `src/models/catalog.json`.
The curated fallback is used only when the dynamic scraped pool is unavailable. If you have run `llm-checker sync`, the full dynamic catalog is used instead.

| Family | Models | Best For |
|---|---|---|
| Qwen 2.5/3 | 7B, 14B, Coder 7B/14B/32B, VL 3B/7B | Coding, general, vision |
| Llama 3.x | 1B, 3B, 8B, Vision 11B | General, chat, multimodal |
| DeepSeek | R1 8B/14B/32B, Coder V2 16B | Reasoning, coding |
| Phi-4 | 14B | Reasoning, math |
| Gemma 2 | 2B, 9B | General, efficient |
| Mistral | 7B, Nemo 12B | Creative, chat |
| CodeLlama | 7B, 13B | Coding |
| LLaVA | 7B, 13B | Vision |
| Embeddings | nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm | RAG, search |
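
The fallback rule described above can be sketched roughly as follows. This is an illustration only, not LLM Checker's actual code: the function name `select_catalog` and the model entries are hypothetical, and the sketch assumes that "unavailable" means the dynamic pool is missing or empty.

```python
from typing import Optional


def select_catalog(
    dynamic: Optional[list[dict]], curated: list[dict]
) -> tuple[str, list[dict]]:
    """Return (source, models), preferring the dynamic pool when it has entries."""
    # Assumption: a missing (None) or empty dynamic pool counts as unavailable.
    if dynamic:
        return "dynamic", dynamic
    return "curated", curated


# Hypothetical placeholder entries, not real catalog records.
curated_pool = [{"name": "example-curated-model"}]
dynamic_pool = [{"name": "example-scraped-model-a"}, {"name": "example-scraped-model-b"}]

source, models = select_catalog(None, curated_pool)
print(source, len(models))

source, models = select_catalog(dynamic_pool, curated_pool)
print(source, len(models))
```

The first call falls back to the curated list because no dynamic pool was supplied; the second uses the dynamic pool as-is.
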
Locally Installed Models
All available catalog models are automatically combined with locally installed Ollama models before scoring. Installed models receive priority consideration in recommendations.
Supported Quantization Types
The following quantization formats are recognized and used for memory estimation and candidate filtering:

| Format | Description |
|---|---|
| Q8_0 | 8-bit quantization: highest quality, ~1.05 bytes/param |
| Q4_K_M | 4-bit K-quant medium: best balance, ~0.58 bytes/param |
| Q3_K | 3-bit K-quant: smallest footprint, ~0.48 bytes/param |
| FP16 | 16-bit float: full precision, largest size |
| Q4_0, Q5_0, Q5_K_M | Additional common quantization variants |
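
To make the bytes-per-parameter figures concrete, here is a minimal sketch of a weights-only size estimate. It is not LLM Checker's real estimator: the function names are hypothetical, the FP16 value of 2 bytes/param is an assumption based on the 16-bit width, and the result ignores KV cache and runtime overhead.

```python
# Approximate bytes per parameter. The first three values come from the table
# above; FP16 is assumed to be 2.0 bytes (16 bits) with no overhead factor.
BYTES_PER_PARAM = {
    "Q8_0": 1.05,
    "Q4_K_M": 0.58,
    "Q3_K": 0.48,
    "FP16": 2.0,
}


def estimate_weights_gb(params_billions: float, quant: str) -> float:
    """Estimate weight size in GB (1 GB = 1e9 bytes), weights only."""
    # (params_billions * 1e9 params) * bytes/param / 1e9 bytes per GB
    return params_billions * BYTES_PER_PARAM[quant]


def fits(params_billions: float, quant: str, available_gb: float) -> bool:
    """Hypothetical candidate filter: keep a variant if its weights fit in memory."""
    return estimate_weights_gb(params_billions, quant) <= available_gb


for quant in BYTES_PER_PARAM:
    size = estimate_weights_gb(7, quant)
    print(f"7B @ {quant}: ~{size:.2f} GB, fits in 8 GB: {fits(7, quant, 8.0)}")
```

For a 7B model this gives roughly 7.35 GB at Q8_0, 4.06 GB at Q4_K_M, and 3.36 GB at Q3_K, so under these assumptions only the FP16 variant (about 14 GB) would be filtered out on a machine with 8 GB available.
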
Fine-Tuning Suitability Labels
The output of `check`, `recommend`, and `ai-check` includes a fine-tuning suitability label for each recommended model:
| Label | Meaning |
|---|---|
| Full FT | Supports full fine-tuning (requires significant GPU memory) |
| LoRA | Supports LoRA adapter training |
| QLoRA | Supports quantized LoRA (most memory-efficient) |
| LoRA+QLoRA | Supports both LoRA and QLoRA paths |
| Full+LoRA+QLoRA | Supports all fine-tuning modes |
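
A script that consumes these labels may want them as individual modes. The helper below is a hypothetical sketch, assuming the label text matches the table above exactly; it is not part of LLM Checker.

```python
def parse_ft_label(label: str) -> set[str]:
    """Split a fine-tuning suitability label into its individual modes."""
    # "Full FT" and the "Full" in combined labels both denote full fine-tuning.
    normalized = label.replace("Full FT", "Full")
    return {part.strip() for part in normalized.split("+") if part.strip()}


for label in ["Full FT", "LoRA", "QLoRA", "LoRA+QLoRA", "Full+LoRA+QLoRA"]:
    print(label, "->", sorted(parse_ft_label(label)))
```

Returning a set makes membership checks such as `"QLoRA" in parse_ft_label(label)` straightforward.
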

