
Rag Architect
Install this when you need to design chunking, embeddings, vector retrieval, and evaluation for a production RAG knowledge base behind your AI product.
Install
npx skills add https://github.com/alirezarezvani/claude-skills --skill rag-architect
What is this skill?
- End-to-end RAG design: document chunking through evaluation frameworks
- Chunking strategies: fixed-size, sentence-based, with overlap and tradeoff guidance
- Covers embedding model choice, vector search, and retrieval optimization patterns
- Oriented toward scalable, production-grade retrieval pipelines
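As a rough illustration of the fixed-size strategy with overlap listed above, here is a minimal Python sketch. The function name `chunk_fixed` and its defaults (512 characters with 64 characters of overlap) are assumptions chosen for illustration; they are not taken from the skill's own tooling.

```python
def chunk_fixed(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into character windows of `chunk_size`.

    Consecutive windows share `overlap` characters so that context
    cut at a boundary still appears at the start of the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so the
        # tail is not emitted again as a redundant shorter chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_fixed("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Character windows like these are predictable in size but can cut through sentences, which is the tradeoff the sentence-based strategy is meant to address.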
Adoption & trust: 542 installs on skills.sh; 17.5k GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
RAG architecture is decided and implemented while building agent features and knowledge-backed APIs, before you rely on retrieval in production. The skill targets retrieval-augmented generation stacks—chunking, embeddings, vector search—which are core agent-tooling concerns for solo AI products.
Common Questions / FAQ
Is Rag Architect safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# RAG Architect - POWERFUL

## Overview

The RAG (Retrieval-Augmented Generation) Architect skill provides comprehensive tools and knowledge for designing, implementing, and optimizing production-grade RAG pipelines. This skill covers the entire RAG ecosystem from document chunking strategies to evaluation frameworks, enabling you to build scalable, efficient, and accurate retrieval systems.

## Core Competencies

### 1. Document Processing & Chunking Strategies

#### Fixed-Size Chunking

- **Character-based chunking**: Simple splitting by character count (e.g., 512, 1024, 2048 chars)
- **Token-based chunking**: Splitting by token count to respect model limits
- **Overlap strategies**: 10-20% overlap to maintain context continuity
- **Pros**: Predictable chunk sizes, simple implementation, consistent processing time
- **Cons**: May break semantic units, context boundaries ignored
- **Best for**: Uniform documents, when consistent chunk sizes are critical

#### Sentence-Based Chunking

- **Sentence boundary detection**: Using NLTK, spaCy, or regex patterns
- **Sentence grouping**: Combining sentences until size threshold is reached
- **Paragraph preservation**: Avoiding mid-paragraph splits when possible
- **Pros**: Preserves natural language boundaries, better readability
- **Cons**: Variable chunk sizes, potential for very short/long chunks
- **Best for**: Narrative text, articles, books

#### Paragraph-Based Chunking

- **Paragraph detection**: Double newlines, HTML tags, markdown formatting
- **Hierarchical splitting**: Respecting document structure (sections, subsections)
- **Size balancing**: Merging small paragraphs, splitting large ones
- **Pros**: Preserves logical document structure, maintains topic coherence
- **Cons**: Highly variable sizes, may create very large chunks
- **Best for**: Structured documents, technical documentation

#### Semantic Chunking

- **Topic modeling**: Using TF-IDF, embeddings similarity for topic detection
- **Heading-aware splitting**: Respecting document hierarchy (H1, H2, H3)
- **Content-based boundaries**: Detecting topic shifts using semantic similarity
- **Pros**: Maintains semantic coherence, respects document structure
- **Cons**: Complex implementation, computationally expensive
- **Best for**: Long-form content, technical manuals, research papers

#### Recursive Chunking

- **Hierarchical approach**: Try larger chunks first, recursively split if needed
- **Multi-level splitting**: Different strategies at different levels
- **Size optimization**: Minimize number of chunks while respecting size limits
- **Pros**: Optimal chunk utilization, preserves context when possible
- **Cons**: Complex logic, potential performance overhead
- **Best for**: Mixed content types, when chunk count optimization is important

#### Document-Aware Chunking

- **File type detection**: PDF pages, Word sections, HTML elements
- **Metadata preservation**: Headers, footers, page numbers, sections
- **Table and image handling**: Special processing for non-text elements
- **Pros**: Preserves document structure and metadata
- **Cons**: Format-specific implementation required
- **Best for**: Multi-format document collections, when metadata is important

### 2. Embedding Model Selection

#### Dimension Considerations

- **128-256 dimensions**: Fast retrieval, lower memory usage, suitable for simple domains
- **512-768 dimensions**: Balanced performance, good for most applications
- **1024-1536 dimensions**: High quality, better for complex domains, higher cost
- **2048+ dimensions**: Maximum quality, specialized use cases, significant resources

#### Speed vs Quality Tradeoffs

- **Fast models**: sentence-transformers/all-MiniLM-L6-v2 (384 dim, ~14k tokens/sec)
- **Balanced models**: sentence-transformers/all-mp