🤖 Published Models

Community Models

Compressed, optimized model variants published for the community. Built on the Grey Liquid Labs compression methodology: accessible AI for consumer hardware.

All Variants

| Variant | Size | Quantization | Target | Status |
|---|---|---|---|---|
| gemma4-turbo:e4b | 4.3 GB | IQ4_XS | 8 GB RAM devices | ✅ Live |
| gemma4-turbo:latest | ~4.3 GB | IQ4_XS | Default | ✅ Live |
| gemma4-turbo:nano | ~2.5 GB | Q3_K_S | 4 GB RAM devices | ✅ Live |
| gemma4-turbo:ultra | ~3.5 GB | IQ3_M | Budget GPU | ✅ Live |
| gemma4-turbo:micro | ~2 GB | Q2_K | Experimental | 🧪 Research |
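If you are scripting deployment across mixed hardware, the RAM targets in the table above can drive variant selection. Below is a minimal, hypothetical helper (the function and its table are illustrative, not part of any published tooling); it covers only the two variants with explicit RAM targets.

```python
# Illustrative helper: pick a gemma4-turbo variant from the RAM targets
# listed in the variants table. The VARIANTS list and pick_variant() are
# assumptions for this sketch, not an official API.
VARIANTS = [
    ("ssfdre38/gemma4-turbo:nano", 4),  # Q3_K_S, ~2.5 GB, 4 GB RAM devices
    ("ssfdre38/gemma4-turbo:e4b", 8),   # IQ4_XS, 4.3 GB, 8 GB RAM devices
]

def pick_variant(ram_gb: float) -> str:
    """Return the largest listed variant whose RAM target fits the machine."""
    best = VARIANTS[0][0]  # smallest variant as the fallback
    for name, target_gb in VARIANTS:
        if ram_gb >= target_gb:
            best = name
    return best
```

The chosen tag can then be passed straight to `ollama pull`.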

Quick Start

```bash
# Install Ollama first (ollama.ai)
ollama pull ssfdre38/gemma4-turbo

# Run interactively
ollama run ssfdre38/gemma4-turbo

# Use the OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ssfdre38/gemma4-turbo", "messages": [{"role": "user", "content": "Hello"}]}'
```
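The same request can be built from any language that speaks the OpenAI chat-completions format. The sketch below constructs the identical JSON body in Python without sending it; actually POSTing it requires a running Ollama server at localhost:11434.

```python
import json

# Build the same chat request the curl example sends to Ollama's
# OpenAI-compatible endpoint. No network call is made here.
url = "http://localhost:11434/v1/chat/completions"

payload = {
    "model": "ssfdre38/gemma4-turbo",
    "messages": [{"role": "user", "content": "Hello"}],
}

body = json.dumps(payload)  # send this as the POST body, Content-Type: application/json
```

From there, `urllib.request` or any HTTP client can POST `body` to `url` and read the completion from the JSON response.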

Compression Research Data

From Grey Liquid Labs Paper #001, "FFN Ratio as Q2_K Compatibility Predictor" (100% accuracy across the five architectures tested)

| Model | FFN Ratio | Q2_K Result | Compression |
|---|---|---|---|
| Qwen 2.5-7B | 2.69x (safe, low) | ✅ PASS | 80.2% |
| Mistral-Small | 6.4x (safe, high) | ✅ PASS | 81.1% |
| Mistral 7B v0.3 | 3.5x (danger) | ❌ FAIL | N/A |
| Phi-4 | 3.5x (danger) | ❌ FAIL | N/A |
| Gemma 4 | 3.2x (danger) | ❌ FAIL | N/A |

Danger zone: 3.0x–5.5x FFN ratio. Read the full paper →
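The decision rule in the table reduces to a single range check. The sketch below restates it in code (the function name and thresholds are taken from the danger zone stated above; this is an illustration, not the paper's reference implementation).

```python
# Restatement of the paper's predictor: FFN expansion ratios inside the
# 3.0x-5.5x "danger zone" predict Q2_K quantization failure; ratios
# outside it predict success. Thresholds per the text above.
DANGER_LOW, DANGER_HIGH = 3.0, 5.5

def predicts_q2k_pass(ffn_ratio: float) -> bool:
    """True if the ratio falls outside the 3.0x-5.5x danger zone."""
    return not (DANGER_LOW <= ffn_ratio <= DANGER_HIGH)
```

Applied to the table above, this reproduces all five recorded results: Qwen 2.5-7B (2.69x) and Mistral-Small (6.4x) pass, while the three models at 3.2x–3.5x fail.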

Why These Models

Grey Liquid Labs publishes models that we actually use in our research. gemma4-turbo powers Ash, our autonomous AI research subject. When you use it, you're running the same model that spontaneously composed political commentary music and rejected an emotional layer architecture.

Every variant is tested in real research workflows before publication. The compression ratios are validated. The performance characteristics are documented. You're not getting marketing; you're getting research-grade tooling that runs on consumer hardware.