
Grepai Ollama Setup
Install Ollama and pull embedding models so GrepAI can index and search your repo locally without sending code to cloud APIs.
Overview
grepai-ollama-setup is an agent skill for the Build phase that installs and configures Ollama as GrepAI’s local embedding provider.
Install
npx skills add https://github.com/yoanbernabeu/grepai-skills --skill grepai-ollama-setup
What is this skill?
- Install paths for macOS (Homebrew and DMG), Linux one-liner, and Windows installer
- Documents the benefits of local embedding generation: privacy, zero API cost, offline use, and low latency
- Covers choosing and downloading embedding models required by GrepAI
- Includes troubleshooting guidance for Ollama service and connection issues
- Keeps code search 100% on-machine, so source code never leaves your computer
Adoption & trust: 604 installs on skills.sh; 17 GitHub stars; 1/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You want GrepAI semantic search but cannot—or will not—send your codebase to a remote embedding API.
Who is it for?
Privacy-conscious solo builders setting up GrepAI on a laptop or workstation with shell access.
Skip if: your team requires managed cloud embeddings, shared GPU clusters, or search without any local runtime.
When should I use this skill?
Setting up GrepAI with local private embeddings, first-time Ollama install, choosing embedding models, or troubleshooting Ollama connectivity.
What do I get? / Deliverables
Ollama is installed, serving locally, and loaded with a compatible embedding model so GrepAI can index and search code entirely on your machine.
- Running Ollama service (ollama serve)
- Downloaded embedding model for GrepAI
- Verified local connection for embedding generation
Journey fit
Embedding provider setup is a build-time integration step before GrepAI semantic search works in the dev environment. Ollama is wired as the embedding backend for the GrepAI toolchain—classic third-party/local service integration work.
How it compares
This is a local embedding setup skill; by itself it is not a hosted vector database or an MCP search server.
Common Questions / FAQ
Who is grepai-ollama-setup for?
Solo and indie developers who use GrepAI and want fully local, private embedding generation via Ollama before agent-assisted code search.
When should I use grepai-ollama-setup?
During Build integrations when you first wire GrepAI, when migrating off paid embedding APIs, or when fixing Ollama connection errors blocking indexing.
Is grepai-ollama-setup safe to install?
It describes standard Ollama installs and model pulls; review the Security Audits panel on this page and verify download URLs and model names before running scripts on your machine.
SKILL.md
# Ollama Setup for GrepAI

This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.

## When to Use This Skill

- Setting up GrepAI with local, private embeddings
- Installing Ollama for the first time
- Choosing and downloading embedding models
- Troubleshooting Ollama connection issues

## Why Ollama?

| Benefit | Description |
|---------|-------------|
| 🔒 **Privacy** | Code never leaves your machine |
| 💰 **Free** | No API costs |
| ⚡ **Fast** | Local processing, no network latency |
| 🔌 **Offline** | Works without internet |

## Installation

### macOS (Homebrew)

```bash
# Install Ollama
brew install ollama

# Start the Ollama service
ollama serve
```

### macOS (Direct Download)

1. Download from [ollama.com](https://ollama.com)
2. Open the `.dmg` and drag to Applications
3. Launch Ollama from Applications

### Linux

```bash
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh

# Start the service
ollama serve
```

### Windows

1. Download installer from [ollama.com](https://ollama.com/download/windows)
2. Run the installer
3. Ollama starts automatically as a service

## Downloading Embedding Models

GrepAI requires an embedding model to convert code into vectors.

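
Before downloading, it can be useful to check whether a model is already present so that setup scripts can be re-run safely. The following is a minimal sketch built on `ollama list` and `ollama pull`; the helper names `has_model` and `ensure_model` are hypothetical and not part of Ollama or GrepAI, and the parsing assumes that `ollama list` prints a header row followed by one model per line with the name in the first column.

```bash
#!/usr/bin/env bash
# Sketch only: has_model and ensure_model are hypothetical helper names.

# has_model MODEL
# Reads `ollama list`-style output on stdin and succeeds if MODEL appears
# in the first column, with or without a tag such as ":latest".
has_model() {
  local model="$1"
  awk -v m="$model" '
    NR > 1 {                      # skip the header row (assumed present)
      split($1, parts, ":")       # separate the name from its tag
      if ($1 == m || parts[1] == m) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# ensure_model MODEL
# Pulls MODEL only when `ollama list` does not already show it.
ensure_model() {
  local model="$1"
  if ollama list | has_model "$model"; then
    echo "$model is already downloaded"
  else
    ollama pull "$model"
  fi
}
```

With these functions sourced into a shell, `ensure_model nomic-embed-text` would pull the model only when it is missing.
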
### Recommended Model: nomic-embed-text

```bash
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
```

**Specifications:**

- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized

### Alternative Models

```bash
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe

# Larger, more accurate
ollama pull bge-m3

# Maximum quality
ollama pull mxbai-embed-large
```

| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| `nomic-embed-text` | 768 | 274 MB | General code search |
| `nomic-embed-text-v2-moe` | 768 | 500 MB | Multilingual codebases |
| `bge-m3` | 1024 | 1.2 GB | Large codebases |
| `mxbai-embed-large` | 1024 | 670 MB | Maximum accuracy |

## Verifying Installation

### Check Ollama is Running

```bash
# Check if Ollama server is responding
curl http://localhost:11434/api/tags

# Expected output: JSON with available models
```

### List Downloaded Models

```bash
ollama list

# Output:
# NAME                       ID           SIZE      MODIFIED
# nomic-embed-text:latest    abc123...    274 MB    2 hours ago
```

### Test Embedding Generation

```bash
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
```

## Configuring GrepAI for Ollama

After installing Ollama, configure GrepAI to use it:

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
```

This is the **default configuration** when you run `grepai init`, so no changes are needed if using `nomic-embed-text`.

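
To confirm that the configured model returns vectors of the size listed in the model table, the response from `/api/embeddings` can be counted. The sketch below assumes the response is a single JSON object with an `"embedding"` array of numbers. The helper names `embedding_dims` and `check_model_dims` are hypothetical. It relies only on `curl`, `sed`, `tr`, and `grep`, so it is a rough check rather than a real JSON parser.

```bash
#!/usr/bin/env bash
# Sketch only: embedding_dims and check_model_dims are hypothetical helpers.

# embedding_dims
# Reads an /api/embeddings JSON response on stdin and prints how many
# values its "embedding" array contains (prints 0 if the field is missing).
embedding_dims() {
  tr -d '\n' |
    sed -n 's/.*"embedding"[[:space:]]*:[[:space:]]*\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -c '[0-9]'
}

# check_model_dims MODEL EXPECTED
# Requests one embedding from the local Ollama server and compares its
# length with EXPECTED (for example 768 for nomic-embed-text).
check_model_dims() {
  local model="$1" expected="$2" actual
  actual=$(curl -s http://localhost:11434/api/embeddings \
    -d "{\"model\": \"$model\", \"prompt\": \"hello\"}" | embedding_dims)
  if [ "$actual" = "$expected" ]; then
    echo "OK: $model returns $actual dimensions"
  else
    echo "Mismatch: $model returned $actual dimensions, expected $expected" >&2
    return 1
  fi
}
```

With Ollama running, `check_model_dims nomic-embed-text 768` would report whether the model matches the 768 dimensions listed above. A mismatch usually means a different model is being served than the one named in the configuration.
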
## Running Ollama

### Foreground (Development)

```bash
# Run in current terminal (see logs)
ollama serve
```

### Background (macOS/Linux)

```bash
# Using nohup
nohup ollama serve &

# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
```

### Check Status

```bash
# Check if running
pgrep -f ollama

# Or test the API
curl -s http://localhost:11434/api/tags | head -1
```

## Resource Considerations

### Memory Usage

Embedding models load into RAM:

- `nomic-embed-text`: ~500 MB RAM
- `bge-m3`: ~1.5 GB RAM
- `mxbai-embed-large`: ~1 GB RAM

### CPU vs GPU

Ollama uses CPU by default. For faster embeddings:

- **macOS:** Uses Metal (Apple Silicon) automatically
- **Linux/Windows:** Install CUDA for NVIDIA GP