Ollama
Updated: September 11, 2025
Categories: AI
Ollama Comprehensive Cheatsheet
1. Installation Instructions
macOS
Bash
# Using Homebrew
brew install ollama
# Manual download
curl https://ollama.ai/download/ollama-darwin.zip --output ollama-darwin.zip
unzip ollama-darwin.zip
sudo mv ollama /usr/local/bin/
Linux
Bash
# Debian/Ubuntu, Fedora/RHEL, and most other distributions (official install script)
curl -fsSL https://ollama.ai/install.sh | sh
# Manual installation
wget https://ollama.ai/download/ollama-linux-amd64
chmod +x ollama-linux-amd64
sudo mv ollama-linux-amd64 /usr/local/bin/ollama
Windows
PowerShell
# Download the installer from the official website, or:
winget install ollama
# Or use Docker
docker pull ollama/ollama
2. Basic Commands and Usage
Starting the Ollama Service
Bash
# Start Ollama service
ollama serve
# Start in background
ollama serve &
Quick Model Pull and Run
Bash
# Pull a model
ollama pull llama2
ollama pull mistral
# Run a model interactively
ollama run llama2
ollama run mistral
3. Model Management
Downloading Models
Bash
# Pull a specific model version
ollama pull llama2:13b
ollama pull mistral:7b
# List available models
ollama list
# Show model details
ollama show llama2
Removing Models
Bash
# Remove a specific model
ollama rm llama2
# Remove several models at once
ollama rm llama2 mistral
4. Running Models and Chat Interactions
Interactive Chat
Bash
# Start interactive chat
ollama run llama2
# Exit chat
/bye
Passing Prompts
Bash
# Run with a prompt
ollama run llama2 "Explain quantum computing"
# Pipe input
echo "Write a Python script" | ollama run codellama
5. API Usage and Endpoints
Local API Endpoints
Bash
# Generate text
curl http://localhost:11434/api/generate \
-d '{
"model": "llama2",
"prompt": "Tell me a story"
}'
# Chat completion
curl http://localhost:11434/api/chat \
-d '{
"model": "mistral",
"messages": [
{"role": "user", "content": "Hello"}
]
}'
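By default these endpoints stream newline-delimited JSON: one fragment per line, with the final object carrying `"done": true`. A minimal sketch of reassembling the streamed fragments into the full completion (the sample lines below are illustrative, not captured server output):

```python
import json

def join_stream(ndjson_lines):
    """Concatenate the 'response' fragments from Ollama's streaming NDJSON output."""
    parts = []
    for line in ndjson_lines:
        if not line.strip():
            continue
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):  # final chunk signals end of stream
            break
    return "".join(parts)

# Illustrative sample of the NDJSON shape the endpoint streams back:
sample = [
    '{"model":"llama2","response":"Once","done":false}',
    '{"model":"llama2","response":" upon","done":false}',
    '{"model":"llama2","response":" a time","done":true}',
]
print(join_stream(sample))  # → Once upon a time
```

If you prefer a single response object instead of a stream, pass `"stream": false` in the request body.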
6. Configuration Options
Environment Variables
Bash
# Set model directory
export OLLAMA_MODELS=/path/to/models
# Limit which GPUs Ollama can use (NVIDIA)
export CUDA_VISIBLE_DEVICES=0,1
Modelfile Customization
Dockerfile
# Example Modelfile
FROM llama2
PARAMETER temperature 0.7
SYSTEM You are a helpful assistant
7. Advanced Features
Custom Model Creation
Bash
# Create a custom model
ollama create mymodel -f Modelfile
# Use custom model
ollama run mymodel
Quantization
Bash
# Pull quantized models
ollama pull llama2:7b-q4_0
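The `q4_0` suffix means weights are stored at roughly 4 bits each, which is where most of the memory savings come from. As a rough rule of thumb (weights only; the KV cache and runtime overhead add more on top), the footprint can be estimated as:

```python
def approx_weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only size estimate in GB; ignores KV cache and runtime overhead."""
    # params_billion * 1e9 weights * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    return params_billion * bits_per_weight / 8

# A 7B model at 4-bit quantization vs. unquantized 16-bit:
print(approx_weight_size_gb(7, 4))   # → 3.5
print(approx_weight_size_gb(7, 16))  # → 14.0
```

This is why a 7B q4_0 model fits comfortably on an 8 GB machine while the fp16 original does not.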
8. Troubleshooting
Common Issues
- Check service status: systemctl status ollama
- Verify model compatibility
- Update Ollama to the latest version
- Check GPU drivers and CUDA installation
Logging
Bash
# View Ollama logs
journalctl -u ollama
9. Tool Integration
Langchain Integration
Python
# Note: newer LangChain releases move this class to the
# langchain_community / langchain-ollama packages
from langchain.llms import Ollama
llm = Ollama(model="mistral")
result = llm("Write a poem")
CLI Piping
Bash
# Use with other CLI tools (pbcopy is macOS; use xclip or wl-copy on Linux)
ollama run llama2 "Summarize this:" | pbcopy
10. Best Practices
- Always pull latest model versions
- Use quantized models for resource-constrained systems
- Configure GPU acceleration
- Monitor model performance
- Experiment with different models
- Keep Ollama updated
- Secure your local API endpoints
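On the last point: Ollama binds its API to loopback by default, and the API itself has no authentication. If you change `OLLAMA_HOST` to expose it on your network, put a firewall rule or authenticating reverse proxy in front of it. A sketch of the relevant configuration (`OLLAMA_HOST` is read by the server at startup):

```shell
# Default: loopback only, not reachable from other machines
export OLLAMA_HOST=127.0.0.1:11434

# Deliberate network exposure -- only do this behind a firewall/reverse proxy
# export OLLAMA_HOST=0.0.0.0:11434

# Then restart the service (e.g. `ollama serve`) to pick up the change
```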
Additional Resources
- Official Documentation: https://ollama.ai/docs
- GitHub Repository: https://github.com/ollama/ollama
- Community Forums: https://ollama.ai/community
Disclaimer: Always refer to the latest official documentation for the most up-to-date information.