1. Why Ollama?
- Turn-key local LLMs – one command installs both the runtime and a curated model library.
- Cross-platform – native builds for macOS (Apple Silicon & Intel), Linux, and Windows.
- GPU acceleration – CUDA on NVIDIA GPUs; Metal on Apple Silicon; fallback to CPU if no GPU.
- Dev-friendly – a simple CLI plus a REST API whose chat endpoint closely resembles OpenAI's Chat Completions API.
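That REST API is easy to call from any language with an HTTP client. Here is a minimal Python sketch against the default `/api/chat` endpoint on port 11434; the model name `llama3` is an assumption — substitute any model you have pulled locally.

```python
import json
import urllib.request

# Default endpoint for a locally running `ollama serve` instance.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # request one JSON response instead of a stream
    }
    return json.dumps(body).encode("utf-8")

def chat(model: str, prompt: str) -> str:
    """Send one chat turn to a locally running Ollama server."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_chat_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Example (requires `ollama serve` running and the model pulled):
# print(chat("llama3", "Why is the sky blue?"))
```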
2. Prerequisites
| Minimum | Recommended |
| --- | --- |
| 4 CPU cores, 16 GB RAM | 8 CPU cores, 32 GB+ RAM |
| 25 GB free disk space (models) | NVMe SSD |
| Optional GPU: Apple M-series, or NVIDIA (Compute 6.1+, 8 GB VRAM) | Latest vendor drivers / CUDA 12.3+ |
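A quick way to sanity-check these figures is a back-of-envelope estimate: model weights alone need roughly parameters × bits-per-weight ÷ 8 bytes, and 4-bit quantization (common for local models) is about 0.5 bytes per parameter. The sketch below is only a sizing heuristic — it ignores KV cache and runtime overhead, which add more on top.

```python
def approx_weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """Rough size of the model weights alone, in GB.

    Heuristic only: ignores KV cache and runtime overhead,
    which consume additional memory at inference time.
    """
    return params_billions * bits_per_weight / 8

# A 7B model at 4-bit quantization needs roughly 3.5 GB just for weights,
# which is why 16 GB RAM is a comfortable minimum for 7B-class models.
print(approx_weight_gb(7, 4))   # → 3.5
print(approx_weight_gb(13, 4))  # → 6.5
```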
⚠️ On Linux, be sure your NVIDIA driver & CUDA toolkit are installed before you start Ollama; the runtime will automatically detect and use the GPU when available.
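You can verify that the driver is visible before installing. A minimal sketch, assuming a standard driver install that puts `nvidia-smi` on your PATH:

```shell
#!/bin/sh
# Check whether the NVIDIA driver is visible before installing Ollama.
# If nvidia-smi is absent, Ollama simply falls back to CPU.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
    echo "No NVIDIA driver detected; Ollama will run on CPU."
fi
```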
3. Installation
macOS 12 (Monterey) or later
- Click “Download for macOS” on the official site and drag the app to /Applications.
- Launch Ollama.app once to register the background service (`ollama serve`).
Linux (x86-64 / aarch64)
curl -fsSL https://ollama.com/install.sh | sh
This script adds the package repo, installs the binary, and sets up a systemd service.
(Prefer manual steps? Click “Manual install instructions” on the same page.)
Windows 10 / 11 (64-bit)
- Download the signed installer from the “Download for Windows” link.