Grepai Embeddings Openai

Name: Grepai Embeddings Openai
Author: yoanbernabeu

yoanbernabeu/grepai-skills

Point GrepAI at OpenAI embeddings when you want cloud-quality vectors without running a local embed server.

Overview

GrepAI Embeddings OpenAI is an agent skill for the Build phase that configures OpenAI as GrepAI’s cloud embedding provider via .grepai/config.yaml.

Install

npx skills add https://github.com/yoanbernabeu/grepai-skills --skill grepai-embeddings-openai

What is this skill?

OpenAI embedder block for .grepai/config.yaml with env-based API key
Models such as text-embedding-3-small with optional parallelism for concurrent requests
Documents quality, speed, scalability vs privacy, cost, and internet dependency
Team-friendly shared infrastructure without local embedding compute
Prerequisites: OpenAI API key and billing-enabled account
Optional parallelism: 8 concurrent embedding requests in config example

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 1 installs on skills.sh; 17 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

You want GrepAI semantic search with top-tier embeddings but do not want to host or tune a local embedding server on your laptop.

Who is it for?

Indie devs or tiny teams indexing large repos who accept sending code to OpenAI in exchange for quality, speed, and zero local embed ops.

Skip if: Air-gapped or strict data-residency codebases, or builders who refuse API spend and on-prem privacy requirements.

When should I use this skill?

You need highest-quality GrepAI embeddings, shared team setup, no local embed server, and accept OpenAI privacy/cost tradeoffs.

What do I get? / Deliverables

You get a working OpenAI-backed GrepAI embedder configuration—with optional parallelism—so indexing and agent retrieval use cloud vectors you can share on a team setup.

.grepai/config.yaml embedder provider openai block
OPENAI_API_KEY environment setup
Tuned parallelism settings for large codebases

Recommended Skills

Microsoft Foundrymicrosoft/azure-skills

Microsoft Foundry skill guides agents through the full Azure AI Foundry lifecycle—containerizing agents, pushing to ACR,…377k installs·1.2k stars

Azure Aimicrosoft/azure-skills

azure-ai is a Prism-oriented quick reference for Microsoft Azure AI work, with the published body centered on the Azure …375k installs·1.2k stars

Azure Hosted Copilot Sdkmicrosoft/azure-skills

Azure Hosted Copilot SDK is Microsoft's entry skill for repos using @github/copilot-sdk—it detects CopilotClient usage, …346k installs·1.2k stars

Lark Eventlarksuite/cli

Lark real-time subscription skill via lark-cli event consume for building bots and streaming webhook-style agent workers…208k installs·13.7k stars

Running Claude Code Via Litellm Copilotxixu-me/skills

Running Claude Code via LiteLLM Copilot walks through pointing Claude Code at a local LiteLLM proxy that forwards Anthro…200k installs·61 stars

Setup Matt Pocock Skillsmattpocock/skills

One-time per-repo setup so Matt Pocock engineering skills share correct issue tracker, triage strings, and domain docume…180k installs·121k stars

Journey fit

Primary fit

BuildIntegrations & version control

Embedding provider wiring is build-time agent-tooling setup that makes semantic code search work in your repo workflow. Integrations covers third-party APIs and config files like .grepai/config.yaml and OPENAI_API_KEY.

How it compares

Integration config for GrepAI’s embedder—not a standalone RAG framework or MCP server.

Common Questions / FAQ

Who is grepai-embeddings-openai for?

Developers using GrepAI who need high-quality cloud embeddings and are comfortable managing an OpenAI API key and token billing.

When should I use grepai-embeddings-openai?

During Build integrations when you are wiring agent-tooling search—after choosing GrepAI and before you index the repo for Claude Code, Cursor, or Codex workflows.

Is grepai-embeddings-openai safe to install?

The skill instructs sending codebase content to OpenAI when embeddings run—review the Security Audits panel on this Prism page and your org’s data policy before enabling it.

SKILL.md

READMESKILL.md - Grepai Embeddings Openai

# GrepAI Embeddings with OpenAI

This skill covers using OpenAI's embedding API with GrepAI for high-quality, cloud-based embeddings.

## When to Use This Skill

- Need highest quality embeddings
- Team environment with shared infrastructure
- Don't want to manage local embedding server
- Willing to trade privacy for quality/convenience

## Considerations

| Aspect | Details |
|--------|---------|
| ✅ **Quality** | State-of-the-art embeddings |
| ✅ **Speed** | Fast, no local compute needed |
| ✅ **Scalability** | Handles any codebase size |
| ⚠️ **Privacy** | Code sent to OpenAI servers |
| ⚠️ **Cost** | Pay per token |
| ⚠️ **Internet** | Requires connection |

## Prerequisites

1. OpenAI API key
2. Billing enabled on OpenAI account

Get your API key at: https://platform.openai.com/api-keys

## Configuration

### Basic Configuration

```yaml
# .grepai/config.yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
```

Set the environment variable:

```bash
export OPENAI_API_KEY="sk-..."
```

### With Parallel Processing

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
  parallelism: 8  # Concurrent requests for speed
```

### Direct API Key (Not Recommended)

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: sk-your-api-key-here  # Avoid committing secrets!
```

**Warning:** Never commit API keys to version control.

## Available Models

### text-embedding-3-small (Recommended)

| Property | Value |
|----------|-------|
| Dimensions | 1536 |
| Price | $0.00002 / 1K tokens |
| Quality | Very high |
| Speed | Fast |

**Best for:** Most use cases, good balance of cost/quality.

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
```

### text-embedding-3-large

| Property | Value |
|----------|-------|
| Dimensions | 3072 |
| Price | $0.00013 / 1K tokens |
| Quality | Highest |
| Speed | Fast |

**Best for:** Maximum accuracy, cost not a concern.

```yaml
embedder:
  provider: openai
  model: text-embedding-3-large
  dimensions: 3072
```

### Dimension Reduction

You can reduce dimensions to save storage:

```yaml
embedder:
  provider: openai
  model: text-embedding-3-large
  dimensions: 1024  # Reduced from 3072
```

## Model Comparison

| Model | Dimensions | Cost/1K tokens | Quality |
|-------|------------|----------------|---------|
| `text-embedding-3-small` | 1536 | $0.00002 | ⭐⭐⭐⭐ |
| `text-embedding-3-large` | 3072 | $0.00013 | ⭐⭐⭐⭐⭐ |

## Cost Estimation

Approximate costs per 1000 source files:

| Codebase Size | Chunks | Small Model | Large Model |
|---------------|--------|-------------|-------------|
| Small (100 files) | ~500 | $0.01 | $0.06 |
| Medium (1000 files) | ~5,000 | $0.10 | $0.65 |
| Large (10000 files) | ~50,000 | $1.00 | $6.50 |

**Note:** Costs are one-time for initial indexing. Updates only re-embed changed files.

## Optimizing for Speed

### Parallel Requests

GrepAI v0.24.0+ supports adaptive rate limiting and parallel requests:

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
  parallelism: 8  # Adjust based on your rate limit tier
```

Parallelism recommendations:
- **Tier 1 (Free):** 1-2
- **Tier 2:** 4-8
- **Tier 3+:** 8-16

### Batching

GrepAI automatically batches chunks for efficient API usage.

## Rate Limits

OpenAI has rate limits based on your account tier:

| Tier | RPM | TPM |
|------|-----|-----|
| Free | 3 | 150,000 |
| Tier 1 | 500 | 1,000,000 |
| Tier 2 | 5,000 | 5,000,000 |

GrepAI handles rate limiting automatically with adaptive backoff.

## Environment Variables

### Setting the API Key

**macOS/Linux:**
```bash
# In ~/.bashrc, ~/.zshrc, or ~/.profile
export OPENAI_API_KEY="sk-..."
```

**Windows (PowerShell):**
```powershell
$env:OPENAI_API_K

What is this skill?

OpenAI embedder block for .grepai/config.yaml with env-based API key

Models such as text-embedding-3-small with optional parallelism for concurrent requests

Documents quality, speed, scalability vs privacy, cost, and internet dependency

Team-friendly shared infrastructure without local embedding compute

Prerequisites: OpenAI API key and billing-enabled account

Optional parallelism: 8 concurrent embedding requests in config example

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 1 installs on skills.sh; 17 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What do I get? / Deliverables

You get a working OpenAI-backed GrepAI embedder configuration—with optional parallelism—so indexing and agent retrieval use cloud vectors you can share on a team setup.

.grepai/config.yaml embedder provider openai block

OPENAI_API_KEY environment setup

Tuned parallelism settings for large codebases

Journey fit

Primary fit

BuildIntegrations & version control

SKILL.md

READMESKILL.md - Grepai Embeddings Openai

# GrepAI Embeddings with OpenAI

This skill covers using OpenAI's embedding API with GrepAI for high-quality, cloud-based embeddings.

## When to Use This Skill

- Need highest quality embeddings
- Team environment with shared infrastructure
- Don't want to manage local embedding server
- Willing to trade privacy for quality/convenience

## Considerations

| Aspect | Details |
|--------|---------|
| ✅ **Quality** | State-of-the-art embeddings |
| ✅ **Speed** | Fast, no local compute needed |
| ✅ **Scalability** | Handles any codebase size |
| ⚠️ **Privacy** | Code sent to OpenAI servers |
| ⚠️ **Cost** | Pay per token |
| ⚠️ **Internet** | Requires connection |

## Prerequisites

1. OpenAI API key
2. Billing enabled on OpenAI account

Get your API key at: https://platform.openai.com/api-keys

## Configuration

### Basic Configuration

```yaml
# .grepai/config.yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
```

Set the environment variable:

```bash
export OPENAI_API_KEY="sk-..."
```

### With Parallel Processing

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
  parallelism: 8  # Concurrent requests for speed
```

### Direct API Key (Not Recommended)

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: sk-your-api-key-here  # Avoid committing secrets!
```

**Warning:** Never commit API keys to version control.

## Available Models

### text-embedding-3-small (Recommended)

| Property | Value |
|----------|-------|
| Dimensions | 1536 |
| Price | $0.00002 / 1K tokens |
| Quality | Very high |
| Speed | Fast |

**Best for:** Most use cases, good balance of cost/quality.

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
```

### text-embedding-3-large

| Property | Value |
|----------|-------|
| Dimensions | 3072 |
| Price | $0.00013 / 1K tokens |
| Quality | Highest |
| Speed | Fast |

**Best for:** Maximum accuracy, cost not a concern.

```yaml
embedder:
  provider: openai
  model: text-embedding-3-large
  dimensions: 3072
```

### Dimension Reduction

You can reduce dimensions to save storage:

```yaml
embedder:
  provider: openai
  model: text-embedding-3-large
  dimensions: 1024  # Reduced from 3072
```

## Model Comparison

| Model | Dimensions | Cost/1K tokens | Quality |
|-------|------------|----------------|---------|
| `text-embedding-3-small` | 1536 | $0.00002 | ⭐⭐⭐⭐ |
| `text-embedding-3-large` | 3072 | $0.00013 | ⭐⭐⭐⭐⭐ |

## Cost Estimation

Approximate costs per 1000 source files:

| Codebase Size | Chunks | Small Model | Large Model |
|---------------|--------|-------------|-------------|
| Small (100 files) | ~500 | $0.01 | $0.06 |
| Medium (1000 files) | ~5,000 | $0.10 | $0.65 |
| Large (10000 files) | ~50,000 | $1.00 | $6.50 |

**Note:** Costs are one-time for initial indexing. Updates only re-embed changed files.

## Optimizing for Speed

### Parallel Requests

GrepAI v0.24.0+ supports adaptive rate limiting and parallel requests:

```yaml
embedder:
  provider: openai
  model: text-embedding-3-small
  api_key: ${OPENAI_API_KEY}
  parallelism: 8  # Adjust based on your rate limit tier
```

Parallelism recommendations:
- **Tier 1 (Free):** 1-2
- **Tier 2:** 4-8
- **Tier 3+:** 8-16

### Batching

GrepAI automatically batches chunks for efficient API usage.

## Rate Limits

OpenAI has rate limits based on your account tier:

| Tier | RPM | TPM |
|------|-----|-----|
| Free | 3 | 150,000 |
| Tier 1 | 500 | 1,000,000 |
| Tier 2 | 5,000 | 5,000,000 |

GrepAI handles rate limiting automatically with adaptive backoff.

## Environment Variables

### Setting the API Key

**macOS/Linux:**
```bash
# In ~/.bashrc, ~/.zshrc, or ~/.profile
export OPENAI_API_KEY="sk-..."
```

**Windows (PowerShell):**
```powershell
$env:OPENAI_API_K

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Who is grepai-embeddings-openai for?

When should I use grepai-embeddings-openai?

Is grepai-embeddings-openai safe to install?

SKILL.md

This week for builders

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Who is grepai-embeddings-openai for?

When should I use grepai-embeddings-openai?

Is grepai-embeddings-openai safe to install?

SKILL.md