
RAG Implementation
Walk through a full RAG build—requirements, embeddings, vector store, chunking, and retrieval tuning—for document Q&A and grounded agents.
Overview
rag-implementation is an agent skill most often used in Build (also Validate, Ship) that orchestrates a multi-phase RAG workflow from requirements through embeddings, vector DB, chunking, and retrieval optimization.
Install
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill rag-implementation
What is this skill?
- Phased workflow from requirements analysis through embedding selection, vector database setup, chunking, and retrieval optimization
- Explicit evaluation planning: accuracy requirements, latency targets, and retrieval metrics in Phase 1
- Copy-paste prompts to invoke companion skills such as ai-product, rag-engineer, and embedding-strategies
- Covers semantic search, knowledge-grounded AI, and document Q&A system setup
- Tagged granular-workflow-bundle with a declared risk level of safe; a pattern for orchestrating multi-skill RAG delivery
- Workflow documents eight implementation phases, from requirements analysis through evaluation
- Phase 1 explicitly lists five planning actions including evaluation metrics
Adoption & trust: 584 installs on skills.sh; 40.1k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need grounded answers from your own documents but do not have an ordered plan for embeddings, chunking, vector storage, and retrieval evaluation.
Who is it for?
Indie builders shipping RAG chat, support bots, or internal knowledge agents who want a checklist-driven stack setup.
Skip if: you only need simple one-shot prompting without a corpus, or your team already runs a frozen enterprise RAG platform with no custom pipeline work.
When should I use this skill?
Building RAG-powered applications, semantic search, knowledge-grounded AI, document Q&A systems, or optimizing retrieval quality.
What do I get? / Deliverables
You follow phased RAG delivery with named skill handoffs so requirements, model choice, indexing, and retrieval tuning are specified before agents write integration code.
- RAG requirements brief with metrics and data-source map
- Embedding and vector-store choices with chunking and retrieval configuration notes
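As an illustration of what the chunking part of those configuration notes might pin down, here is a minimal Python sketch of fixed-size chunking with overlap. `ChunkingConfig`, `chunk_text`, the character-based sizes, and the default values are assumptions made for this example; the skill itself does not prescribe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingConfig:
    """Hypothetical settings; names and defaults are illustrative only."""
    chunk_size: int = 200  # characters per chunk (assumed unit)
    overlap: int = 40      # characters shared by neighbouring chunks


def chunk_text(text: str, cfg: ChunkingConfig) -> list[dict]:
    """Split text into fixed-size, overlapping chunks with offset metadata."""
    if cfg.overlap >= cfg.chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = cfg.chunk_size - cfg.overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + cfg.chunk_size]
        chunks.append({"start": start, "end": start + len(piece), "text": piece})
        if start + cfg.chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

Storing `start` and `end` offsets with each chunk is one simple form of chunk metadata; it lets a later citation step point back to the exact span in the source document.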
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
RAG is implemented when you wire models, embeddings, and retrieval into the product, after you know the use case but before you rely on it in production. Vector DBs, embedding APIs, and retrieval pipelines are integration work connecting data sources to your LLM/agent stack.
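To make that integration work concrete, the sketch below wires the three pieces together in miniature. It is not the skill's implementation: `toy_embed` is a hypothetical word-count stand-in for a real embedding API, and `InMemoryIndex` is a brute-force stand-in for a vector DB. Both names are assumptions for this example.

```python
import math
from collections import Counter


def toy_embed(text: str) -> dict[str, float]:
    """Unit-length word-count vector; a stand-in for a real embedding model."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


class InMemoryIndex:
    """Brute-force cosine search; a stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[str, dict[str, float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        self._rows.append((doc_id, toy_embed(text)))

    def search(self, query: str, top_k: int = 3) -> list[tuple[str, float]]:
        q = toy_embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [
            (doc_id, sum(w * vec.get(tok, 0.0) for tok, w in q.items()))
            for doc_id, vec in self._rows
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


index = InMemoryIndex()
index.add("refunds", "refunds are issued within 14 days of purchase")
index.add("shipping", "orders ship from the warehouse in two business days")
hits = index.search("how many days until refunds are issued", top_k=1)
```

In a real build, the retrieved chunk text would then be placed into the LLM prompt as context. Swapping `toy_embed` for an embedding API call and `InMemoryIndex` for a vector DB client is the integration work described above.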
Where it fits
Define RAG use case, corpora, accuracy bar, and latency budget before buying embedding API spend.
Stand up vector index, chunking pipeline, and retrieval API hooks your agent product calls at runtime.
Run planned retrieval metrics and tune top-k or reranking before customer-facing doc Q&A goes live.
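One way to run planned retrieval metrics while tuning top-k is sketched below in Python. Recall@k and mean reciprocal rank (MRR) are standard retrieval metrics, but choosing them here is an assumption: the skill only says to plan evaluation metrics. The function names and the sample document IDs are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int) -> dict[str, float]:
    """Average both metrics over (retrieved_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return {"recall@k": 0.0, "mrr": 0.0}
    n = len(runs)
    return {
        "recall@k": sum(recall_at_k(ret, rel, k) for ret, rel in runs) / n,
        "mrr": sum(reciprocal_rank(ret, rel) for ret, rel in runs) / n,
    }


# Hypothetical labelled test set: ranked results and the known-relevant IDs.
runs = [
    (["d3", "d1", "d7"], {"d1"}),
    (["d2", "d9", "d4"], {"d4", "d5"}),
]
for k in (1, 3):
    print(k, evaluate(runs, k))
```

Comparing the scores across values of `k` shows whether a larger top-k actually recovers more relevant chunks. A low MRR alongside acceptable recall suggests the right chunks are retrieved but ranked too low, which is the case reranking is meant to address.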
How it compares
An agent workflow skill that chains procedural steps and @-skills, not a hosted vector database or embedding API by itself.
Common Questions / FAQ
Who is rag-implementation for?
Solo developers and small teams implementing semantic search or document Q&A with Claude Code, Cursor, or Codex who need a structured RAG rollout path.
When should I use rag-implementation?
In Validate when scoping data sources and accuracy; in Build when selecting embeddings and wiring a vector store; in Ship when tuning latency and retrieval quality against planned metrics.
Is rag-implementation safe to install?
See the Security Audits panel on this Prism page for declared risk (source lists safe) and verify before granting network access for embedding APIs.
Workflow Chain
Then invoke: embedding-strategies, rag-engineer
SKILL.md - RAG Implementation
# RAG Implementation Workflow

## Overview

Specialized workflow for implementing RAG (Retrieval-Augmented Generation) systems including embedding model selection, vector database setup, chunking strategies, retrieval optimization, and evaluation.

## When to Use This Workflow

Use this workflow when:

- Building RAG-powered applications
- Implementing semantic search
- Creating knowledge-grounded AI
- Setting up document Q&A systems
- Optimizing retrieval quality

## Workflow Phases

### Phase 1: Requirements Analysis

#### Skills to Invoke

- `ai-product` - AI product design
- `rag-engineer` - RAG engineering

#### Actions

1. Define use case
2. Identify data sources
3. Set accuracy requirements
4. Determine latency targets
5. Plan evaluation metrics

#### Copy-Paste Prompts

```
Use @ai-product to define RAG application requirements
```

### Phase 2: Embedding Selection

#### Skills to Invoke

- `embedding-strategies` - Embedding selection
- `rag-engineer` - RAG patterns

#### Actions

1. Evaluate embedding models
2. Test domain relevance
3. Measure embedding quality
4. Consider cost/latency
5. Select model

#### Copy-Paste Prompts

```
Use @embedding-strategies to select optimal embedding model
```

### Phase 3: Vector Database Setup

#### Skills to Invoke

- `vector-database-engineer` - Vector DB
- `similarity-search-patterns` - Similarity search

#### Actions

1. Choose vector database
2. Design schema
3. Configure indexes
4. Set up connection
5. Test queries

#### Copy-Paste Prompts

```
Use @vector-database-engineer to set up vector database
```

### Phase 4: Chunking Strategy

#### Skills to Invoke

- `rag-engineer` - Chunking strategies
- `rag-implementation` - RAG implementation

#### Actions

1. Choose chunk size
2. Implement chunking
3. Add overlap handling
4. Create metadata
5. Test retrieval quality

#### Copy-Paste Prompts

```
Use @rag-engineer to implement chunking strategy
```

### Phase 5: Retrieval Implementation

#### Skills to Invoke

- `similarity-search-patterns` - Similarity search
- `hybrid-search-implementation` - Hybrid search

#### Actions

1. Implement vector search
2. Add keyword search
3. Configure hybrid search
4. Set up reranking
5. Optimize latency

#### Copy-Paste Prompts

```
Use @similarity-search-patterns to implement retrieval
```

```
Use @hybrid-search-implementation to add hybrid search
```

### Phase 6: LLM Integration

#### Skills to Invoke

- `llm-application-dev-ai-assistant` - LLM integration
- `llm-application-dev-prompt-optimize` - Prompt optimization

#### Actions

1. Select LLM provider
2. Design prompt template
3. Implement context injection
4. Add citation handling
5. Test generation quality

#### Copy-Paste Prompts

```
Use @llm-application-dev-ai-assistant to integrate LLM
```

### Phase 7: Caching

#### Skills to Invoke

- `prompt-caching` - Prompt caching
- `rag-engineer` - RAG optimization

#### Actions

1. Implement response caching
2. Set up embedding cache
3. Configure TTL
4. Add cache invalidation
5. Monitor hit rates

#### Copy-Paste Prompts

```
Use @prompt-caching to implement RAG caching
```

### Phase 8: Evaluation

#### Skills to Invoke

- `llm-evaluation` - LLM evaluation
- `evaluation` - AI evaluation

#### Actions

1. Define evaluation metrics
2. Create test dataset
3. Measure retrieval accuracy
4. Evaluate generation quality
5. Iterate on improvements

#### Copy-Paste Prompts

```
Use @llm-evaluation to evaluate RAG system
```

## RAG Architecture

```
User Query -> Embedding -> Vector Search -> Retrieved Docs -> LLM -> Response
                  |              |                |            |
                Model        Vector DB       Chunk Store  Prompt + Context
```

## Quality Gates