Ollama
Run AI models locally or in the cloud. Ollama provides an easy-to-use runtime for open-source LLMs including Llama, Mistral, Phi, Gemma, Qwen, DeepSeek, and more.
Visit Ollama →

Plans & Pricing
Local (Free)
$0
- ✓ Run models locally on your machine
- ✓ No API costs
- ✓ Supports 100+ open models
- ✓ GPU acceleration
Ollama Cloud
Free (beta)
- ✓ Cloud-hosted inference
- ✓ REST API
- ✓ No setup required
- ✓ Rate-limited on the free tier
Free Tier
Ollama is free to download and run locally. Ollama Cloud has a free tier with rate limits.
Models (4)
| Model | Context | In /1M | Out /1M | Capabilities |
|---|---|---|---|---|
| Llama 4 Scout (Ollama) | 128K | Free | Free | Chat, Streaming |
| Mistral Small 3.1 (Ollama) | 128K | Free | Free | Chat, Streaming |
| Qwen3 32B (Ollama) | 131K | Free | Free | Chat, Streaming |
| DeepSeek V3 (Ollama) | 128K | Free | Free | Chat, Streaming |
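Once a model from the table above is pulled locally (e.g. `ollama pull qwen3:32b`), it can be queried through Ollama's local REST API, which listens on port 11434 by default. The sketch below uses only the Python standard library and assumes Ollama is running locally; the model tag `qwen3:32b` is one plausible name for the Qwen3 32B entry and may differ on your install.

```python
import json
import urllib.request

# Default endpoint for Ollama's chat API on a local install.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str, stream: bool = False) -> dict:
    """Build the JSON body expected by Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,  # False returns a single JSON response
    }

def chat(model: str, prompt: str) -> str:
    """Send one chat turn to a locally running Ollama server."""
    body = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the reply under message.content.
        return json.loads(resp.read())["message"]["content"]

if __name__ == "__main__":
    # Requires `ollama serve` running and the model pulled beforehand.
    print(chat("qwen3:32b", "Why is the sky blue?"))
```

Setting `"stream": True` instead yields newline-delimited JSON chunks, which is how the Streaming capability listed above is exposed over the same endpoint.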