Grepai Ollama Setup

Name: Grepai Ollama Setup
Author: yoanbernabeu

yoanbernabeu/grepai-skills

Install Ollama and pull embedding models so GrepAI can index and search your repo locally without sending code to cloud APIs.

Overview

grepai-ollama-setup is an agent skill for the Build phase that installs and configures Ollama as GrepAI’s local embedding provider.

Install

npx skills add https://github.com/yoanbernabeu/grepai-skills --skill grepai-ollama-setup

What is this skill?

Covers install paths for macOS (Homebrew and DMG), Linux (one-line installer), and Windows (installer)
Documents the benefits of local embedding generation: privacy, zero API cost, offline use, and low latency
Covers choosing and downloading the embedding models GrepAI requires
Includes troubleshooting guidance for Ollama service and connection issues
Keeps code search 100% on-machine, so source never leaves the machine

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 604 installs on skills.sh; 17 GitHub stars; 1/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

You want GrepAI semantic search but cannot—or will not—send your codebase to a remote embedding API.

Who is it for?

Privacy-conscious solo builders setting up GrepAI on a laptop or workstation with shell access.

Skip if: your team requires managed cloud embeddings, shared GPU clusters, or search without any local runtime.

When should I use this skill?

Setting up GrepAI with local private embeddings, first-time Ollama install, choosing embedding models, or troubleshooting Ollama connectivity.

What do I get? / Deliverables

Ollama is installed, serving locally, and loaded with a compatible embedding model so GrepAI can index and search code entirely on your machine.

Running Ollama service (ollama serve)
Downloaded embedding model for GrepAI
Verified local connection for embedding generation

Journey fit

Primary fit

Build: Integrations & version control

Embedding provider setup is a build-time integration step before GrepAI semantic search works in the dev environment. Ollama is wired as the embedding backend for the GrepAI toolchain—classic third-party/local service integration work.

How it compares

This is a local embedding setup skill; on its own it is not a hosted vector database or an MCP search server.

Common Questions / FAQ

Who is grepai-ollama-setup for?

Solo and indie developers who use GrepAI and want fully local, private embedding generation via Ollama before agent-assisted code search.

When should I use grepai-ollama-setup?

During Build-phase integration work: when you first wire up GrepAI, when migrating off paid embedding APIs, or when fixing Ollama connection errors that block indexing.

Is grepai-ollama-setup safe to install?

It describes standard Ollama installs and model pulls; review the Security Audits panel on this page and verify download URLs and model names before running scripts on your machine.

SKILL.md

# Ollama Setup for GrepAI

This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.

## When to Use This Skill

- Setting up GrepAI with local, private embeddings
- Installing Ollama for the first time
- Choosing and downloading embedding models
- Troubleshooting Ollama connection issues

## Why Ollama?

| Benefit | Description |
|---------|-------------|
| 🔒 **Privacy** | Code never leaves your machine |
| 💰 **Free** | No API costs |
| ⚡ **Fast** | Local processing, no network latency |
| 🔌 **Offline** | Works without internet |

## Installation

### macOS (Homebrew)

```bash
# Install Ollama
brew install ollama

# Start the Ollama service
ollama serve
```

### macOS (Direct Download)

1. Download from [ollama.com](https://ollama.com)
2. Open the `.dmg` and drag to Applications
3. Launch Ollama from Applications

### Linux

```bash
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh

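# The installer normally registers and starts a systemd service.
# Start the server manually only if it is not already running:
ollama serve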
```

### Windows

1. Download installer from [ollama.com](https://ollama.com/download/windows)
2. Run the installer
3. Ollama starts automatically in the background

## Downloading Embedding Models

GrepAI requires an embedding model to convert code into vectors.

### Recommended Model: nomic-embed-text

```bash
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
```

**Specifications:**
- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized

### Alternative Models

```bash
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe

# Larger, more accurate
ollama pull bge-m3

# Maximum quality
ollama pull mxbai-embed-large
```

| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| `nomic-embed-text` | 768 | 274 MB | General code search |
| `nomic-embed-text-v2-moe` | 768 | 500 MB | Multilingual codebases |
| `bge-m3` | 1024 | 1.2 GB | Large codebases |
| `mxbai-embed-large` | 1024 | 670 MB | Maximum accuracy |

## Verifying Installation

### Check Ollama is Running

```bash
# Check if Ollama server is responding
curl http://localhost:11434/api/tags

# Expected output: JSON with available models
```

### List Downloaded Models

```bash
ollama list

# Output:
# NAME                     ID           SIZE    MODIFIED
# nomic-embed-text:latest  abc123...    274 MB  2 hours ago
```
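
The listing above can also be checked from a script. The following is a minimal sketch, assuming the `NAME ... MODIFIED` column layout shown in the sample output; `has_model` is a hypothetical helper name, not part of Ollama or GrepAI.

```bash
#!/usr/bin/env bash
# has_model (hypothetical helper): succeeds if a model name appears in
# `ollama list` output read from stdin.
# Assumes the first row is a header and the first column is NAME[:tag].
# Pass the bare model name (no ":tag" suffix), since tags are stripped.
has_model() {
  local wanted="$1"
  # Skip the header row, keep the NAME column, drop any ":tag" suffix,
  # then require an exact, literal match on the whole name.
  awk 'NR > 1 { sub(/:.*/, "", $1); print $1 }' \
    | grep -xF -- "$wanted" > /dev/null
}

# Example usage (requires Ollama to be installed):
#   ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```

Matching on the exact name avoids false positives from models that merely share a prefix, such as `nomic-embed-text-v2-moe`.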

### Test Embedding Generation

```bash
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
```
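
To confirm that the returned vector has the expected number of dimensions, the entries can be counted. This is a rough sketch, assuming the response is a JSON object with a flat numeric `embedding` array; `count_dims` is a hypothetical helper, and a real JSON parser such as `jq` would be more robust if one is available.

```bash
#!/usr/bin/env bash
# count_dims (hypothetical helper): print the number of entries in the
# "embedding" array of a JSON response read from stdin.
# Assumes a response shaped like {"embedding":[0.1,0.2,...]}.
count_dims() {
  tr -d ' \n' \
    | sed -n 's/.*"embedding":\[\([^]]*\)].*/\1/p' \
    | tr ',' '\n' \
    | grep -c .
}

# Example usage (requires a running Ollama server with the model pulled):
#   curl -s http://localhost:11434/api/embeddings \
#     -d '{"model": "nomic-embed-text", "prompt": "hello"}' | count_dims
```

For `nomic-embed-text`, the count should match the 768 dimensions listed in the model table.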

## Configuring GrepAI for Ollama

After installing Ollama, configure GrepAI to use it:

```yaml
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
```

This is the **default configuration** when you run `grepai init`, so no changes are needed if using `nomic-embed-text`.

## Running Ollama

### Foreground (Development)

```bash
# Run in current terminal (see logs)
ollama serve
```

### Background (macOS/Linux)

```bash
# Using nohup (logs go to ollama.log instead of nohup.out)
nohup ollama serve > ollama.log 2>&1 &

# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
```

### Check Status

```bash
# Check if running
pgrep -f ollama

# Or test the API
curl -s http://localhost:11434/api/tags | head -1
```
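
When Ollama is started in the background, the API may take a moment to come up. A small polling loop can wait for it before indexing starts. This is a sketch under stated assumptions: `wait_for_ollama` is a hypothetical helper, and the default attempt count and delay are arbitrary choices.

```bash
#!/usr/bin/env bash
# wait_for_ollama (hypothetical helper): poll the /api/tags endpoint
# until it answers or the attempts run out.
# Usage: wait_for_ollama [base_url] [attempts] [delay_seconds]
wait_for_ollama() {
  local base_url="${1:-http://localhost:11434}"
  local attempts="${2:-10}"
  local delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s keeps it quiet.
    if curl -sf --max-time 2 "$base_url/api/tags" > /dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage:
#   wait_for_ollama && echo "Ollama is ready" || echo "Ollama did not respond"
```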

## Resource Considerations

### Memory Usage

Embedding models load into RAM:
- `nomic-embed-text`: ~500 MB RAM
- `bge-m3`: ~1.5 GB RAM
- `mxbai-embed-large`: ~1 GB RAM

### CPU vs GPU

Ollama runs on the CPU when no supported GPU is available. For faster embeddings:
- **macOS:** Uses Metal (Apple Silicon) automatically
- **Linux/Windows:** Install CUDA for NVIDIA GPUs
