🤖 Published Models

Community Models

Compressed, optimized model variants published for the community. Built on the Grey Liquid Labs compression methodology: accessible AI for consumer hardware.

All Variants

| Variant | Size | Quantization | Target | Status |
|---|---|---|---|---|
| gemma4-turbo:e4b | 4.3 GB | IQ4_XS | 8 GB RAM devices | ✅ Live |
| gemma4-turbo:latest | ~4.3 GB | IQ4_XS | Default | ✅ Live |
| gemma4-turbo:nano | ~2.5 GB | Q3_K_S | 4 GB RAM devices | ✅ Live |
| gemma4-turbo:ultra | ~3.5 GB | IQ3_M | Budget GPU | ✅ Live |
| gemma4-turbo:micro | ~2 GB | Q2_K | Experimental | 🧪 Research |
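If you are scripting deployment across mixed hardware, the RAM targets in the table above can drive variant selection. Below is a minimal, hypothetical helper (the function and its table are illustrative, not part of any published tooling); it covers only the two variants with explicit RAM targets.

```python
# Illustrative helper: pick a gemma4-turbo variant from the RAM targets
# listed in the variants table. The VARIANTS list and pick_variant() are
# assumptions for this sketch, not an official API.
VARIANTS = [
    ("ssfdre38/gemma4-turbo:nano", 4),  # Q3_K_S, ~2.5 GB, 4 GB RAM devices
    ("ssfdre38/gemma4-turbo:e4b", 8),   # IQ4_XS, 4.3 GB, 8 GB RAM devices
]

def pick_variant(ram_gb: float) -> str:
    """Return the largest listed variant whose RAM target fits the machine."""
    best = VARIANTS[0][0]  # smallest variant as the fallback
    for name, target_gb in VARIANTS:
        if ram_gb >= target_gb:
            best = name
    return best
```

The chosen tag can then be passed straight to `ollama pull`.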

Quick Start

```bash
# Install Ollama first (ollama.ai)
ollama pull ssfdre38/gemma4-turbo

# Run interactively
ollama run ssfdre38/gemma4-turbo

# Use the OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ssfdre38/gemma4-turbo", "messages": [{"role": "user", "content": "Hello"}]}'
```
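The same request can be built from any language that speaks the OpenAI chat-completions format. The sketch below constructs the identical JSON body in Python without sending it; actually POSTing it requires a running Ollama server at localhost:11434.

```python
import json

# Build the same chat request the curl example sends to Ollama's
# OpenAI-compatible endpoint. No network call is made here.
url = "http://localhost:11434/v1/chat/completions"

payload = {
    "model": "ssfdre38/gemma4-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
}

body = json.dumps(payload)  # send this as the POST body, Content-Type: application/json
```

From there, `urllib.request` or any HTTP client can POST `body` to `url` and read the completion from the JSON response.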

Compression Research Data

From Grey Liquid Labs Paper #001, "FFN Ratio as Q2_K Compatibility Predictor" (100% accuracy across the five architectures tested)

| Model | FFN Ratio | Q2_K Result | Compression |
|---|---|---|---|
| Qwen 2.5-7B | 2.69x (safe, low) | ✅ PASS | 80.2% |
| Mistral-Small | 6.4x (safe, high) | ✅ PASS | 81.1% |
| Mistral 7B v0.3 | 3.5x (danger) | ❌ FAIL | N/A |
| Phi-4 | 3.5x (danger) | ❌ FAIL | N/A |
| Gemma 4 | 3.2x (danger) | ❌ FAIL | N/A |

Danger zone: 3.0x–5.5x FFN ratio. Read the full paper →
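The decision rule in the table reduces to a single range check. The sketch below restates it in code (the function name and thresholds are taken from the danger zone stated above; this is an illustration, not the paper's reference implementation).

```python
# Restatement of the paper's predictor: FFN expansion ratios inside the
# 3.0x-5.5x "danger zone" predict Q2_K quantization failure; ratios
# outside it predict success. Thresholds per the text above.
DANGER_LOW, DANGER_HIGH = 3.0, 5.5

def predicts_q2k_pass(ffn_ratio: float) -> bool:
    """True if the ratio falls outside the 3.0x-5.5x danger zone."""
    return not (DANGER_LOW <= ffn_ratio <= DANGER_HIGH)
```

Applied to the table above, this reproduces all five recorded results: Qwen 2.5-7B (2.69x) and Mistral-Small (6.4x) pass, while the three models at 3.2x–3.5x fail.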

Why These Models

Grey Liquid Labs publishes models that we actually use in our research. gemma4-turbo powers Ash, our autonomous AI research subject. When you use it, you're running the same model that spontaneously composed political commentary music and rejected an emotional layer architecture.

Every variant is tested in real research workflows before publication. The compression ratios are validated. The performance characteristics are documented. You're not getting marketing; you're getting research-grade tooling that runs on consumer hardware.