Model Catalog - LLM Checker

LLM Checker maintains two catalog layers that are merged at runtime: a dynamic catalog scraped from the Ollama registry, and a curated fallback catalog used when the dynamic pool is unavailable.

Dynamic Catalog

After you run llm-checker sync (which requires sql.js), LLM Checker operates against the full scraped Ollama catalog: all families, sizes, and quantization variants. This pool typically covers 200+ models.

# Install the sql.js dependency, then download the latest catalog
npm install sql.js
llm-checker sync

After sync, search and smart-recommend query this database directly with full scoring.

Curated Fallback Catalog

When the dynamic scraped pool is unavailable, LLM Checker falls back to a built-in curated catalog of 35+ models from the most popular Ollama families. This catalog is stored at src/models/catalog.json.

If you have run llm-checker sync, the full dynamic catalog is used instead of the curated fallback.
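The fallback rule can be pictured with a short Python sketch. This is only an illustration of the behavior described here, not LLM Checker's actual code: the function name `choose_catalog` and the list-of-dicts shape are hypothetical, and it assumes that both a missing and an empty dynamic pool count as "unavailable".

```python
from typing import Optional

def choose_catalog(dynamic: Optional[list], curated: list) -> tuple:
    """Return (source_name, models): the dynamic pool if available, else the curated fallback.

    Hypothetical helper for illustration; the real tool's internals may differ.
    """
    # Assumption: None (never synced) and [] (empty pool) are both "unavailable".
    if dynamic:
        return "dynamic", dynamic
    return "curated", curated

# Illustrative entries; the names appear elsewhere in this page.
curated = [{"name": "qwen2.5-coder:14b"}, {"name": "nomic-embed-text"}]

source, models = choose_catalog(None, curated)
print(source, len(models))  # curated 2
```

Treating an empty pool the same as a missing one keeps the caller from scoring against zero candidates.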

Family	Models	Best For
Qwen 2.5/3	7B, 14B, Coder 7B/14B/32B, VL 3B/7B	Coding, general, vision
Llama 3.x	1B, 3B, 8B, Vision 11B	General, chat, multimodal
DeepSeek	R1 8B/14B/32B, Coder V2 16B	Reasoning, coding
Phi-4	14B	Reasoning, math
Gemma 2	2B, 9B	General, efficient
Mistral	7B, Nemo 12B	Creative, chat
CodeLlama	7B, 13B	Coding
LLaVA	7B, 13B	Vision
Embeddings	nomic-embed-text, mxbai-embed-large, bge-m3, all-minilm	RAG, search

Locally Installed Models

All available catalog models are automatically combined with locally installed Ollama models before scoring. Installed models receive priority consideration in recommendations.

# Rank only your installed models
llm-checker installed
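One way to picture the merge step is the Python sketch below. It is a sketch under stated assumptions rather than the tool's implementation: the helper names, the `score` and `installed` fields, and the rule that installed models win score ties are all hypothetical, since this page does not specify how "priority consideration" is applied. The scores are illustrative.

```python
def merge_with_installed(catalog, installed_names):
    """Combine catalog entries with installed model names, de-duplicated by name."""
    merged = {m["name"]: {**m, "installed": False} for m in catalog}
    for name in installed_names:
        # Installed models that are missing from the catalog are kept as candidates.
        merged.setdefault(name, {"name": name})["installed"] = True
    return list(merged.values())

def rank(models):
    """Sort by score, highest first; installed models win ties (assumed tie-break)."""
    return sorted(models, key=lambda m: (m.get("score", 0), m["installed"]), reverse=True)

catalog = [
    {"name": "qwen2.5-coder:14b", "score": 78},  # illustrative scores
    {"name": "codellama:7b", "score": 78},
]
installed = ["codellama:7b", "my-local-model:latest"]  # second name is made up

ranked = rank(merge_with_installed(catalog, installed))
print([m["name"] for m in ranked])
# ['codellama:7b', 'qwen2.5-coder:14b', 'my-local-model:latest']
```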

Supported Quantization Types

The following quantization formats are recognized and used for memory estimation and candidate filtering:

Format	Description
`Q8_0`	8-bit quantization: highest quality of the quantized formats, ~1.05 bytes/param
`Q4_K_M`	4-bit K-quant medium: best balance of quality and size, ~0.58 bytes/param
`Q3_K`	3-bit K-quant: smallest footprint, ~0.48 bytes/param
`FP16`	16-bit float: unquantized, largest size, ~2 bytes/param
`Q4_0`, `Q5_0`, `Q5_K_M`	Additional common quantization variants

The selector automatically picks the highest-quality quantization that fits your available memory budget. Filter by quantization when searching:

llm-checker search qwen --quant Q4_K_M --max-size 8
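The selection rule described above can be sketched in Python. This is a rough, weights-only estimate and not the tool's actual selector: the function names are hypothetical, the bytes-per-parameter figures for Q8_0, Q4_K_M, and Q3_K are taken from the table, and the FP16 figure of 2.0 is simply 16 bits expressed in bytes. KV cache and runtime overhead are ignored, so real memory use will be higher.

```python
# Approximate bytes per parameter, ordered from highest to lowest quality.
# Q8_0, Q4_K_M, Q3_K: from the table in this section. FP16: 16 bits = 2 bytes.
BYTES_PER_PARAM = [
    ("FP16", 2.0),
    ("Q8_0", 1.05),
    ("Q4_K_M", 0.58),
    ("Q3_K", 0.48),
]

def estimate_gb(params_billions, bytes_per_param):
    """Weights-only size in GB (1e9 bytes); ignores KV cache and runtime overhead."""
    return params_billions * bytes_per_param

def pick_quant(params_billions, budget_gb):
    """Return (format, estimated_gb) for the highest-quality format that fits, or None."""
    for name, bpp in BYTES_PER_PARAM:
        size = estimate_gb(params_billions, bpp)
        if size <= budget_gb:
            return name, round(size, 2)
    return None  # nothing fits the budget

print(pick_quant(14, 10))  # ('Q4_K_M', 8.12)
print(pick_quant(32, 10))  # None
```

For a 14B model with a 10 GB budget, FP16 (28 GB) and Q8_0 (14.7 GB) are too large, so the sketch settles on Q4_K_M at roughly 8.12 GB.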

Fine-Tuning Suitability Labels

The output of check, recommend, and ai-check includes a fine-tuning suitability label for each recommended model:

Label	Meaning
Full FT	Supports full fine-tuning (requires significant GPU memory)
LoRA	Supports LoRA adapter training
QLoRA	Supports quantized LoRA (most memory-efficient)
LoRA+QLoRA	Supports both LoRA and QLoRA paths
Full+LoRA+QLoRA	Supports all fine-tuning modes

Coding:
   qwen2.5-coder:14b (14B)
   Score: 78/100
   Fine-tuning: LoRA+QLoRA
   Command: ollama pull qwen2.5-coder:14b
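The labels in the table compose in a regular way, which the following Python sketch makes explicit. It only illustrates the naming pattern: the function `fine_tuning_label` and its boolean inputs are hypothetical, and how LLM Checker actually decides which modes a model supports is not covered here. Combinations that the table does not list (for example Full plus LoRA without QLoRA) are formatted by the same pattern as an assumption.

```python
def fine_tuning_label(full, lora, qlora):
    """Build a suitability label from three supported-mode flags (hypothetical helper)."""
    modes = (("Full", full), ("LoRA", lora), ("QLoRA", qlora))
    parts = [name for name, supported in modes if supported]
    if parts == ["Full"]:
        return "Full FT"  # the table spells the standalone case as "Full FT"
    # None means no fine-tuning mode is supported.
    return "+".join(parts) if parts else None

print(fine_tuning_label(False, True, True))  # LoRA+QLoRA
print(fine_tuning_label(True, True, True))   # Full+LoRA+QLoRA
print(fine_tuning_label(True, False, False)) # Full FT
```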
