
RAG Architect
Pick and implement RAG chunking and retrieval architecture so agent apps retrieve the right context without bloating prompts.
Overview
RAG Architect is an agent skill for the Build phase that guides chunking strategy selection and retrieval architecture for LLM and agent products.
Install
npx skills add https://github.com/jeffallan/claude-skills --skill rag-architect
What is this skill?
- Chunking strategy comparison matrix across fixed-size, recursive, sentence, semantic, document-aware, agentic, and late chunking
- When-to-use and when-to-avoid guidance per strategy for logs, articles, technical docs, and code-heavy content
- Retrieval-precision tradeoffs including cost and real-time ingestion constraints for semantic approaches
- Default starting-point recommendation: recursive character splitting for general RAG baselines
- Structured decision cues for LangChain/LlamaIndex-style defaults vs semantic boundaries
- 7 chunking strategies in comparison matrix
- 3 complexity tiers: Simple, Medium, Complex implementation
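The comparison matrix and when-to-use guidance listed above can be sketched as a small lookup. This is a minimal illustration, not part of the skill itself: the content-type labels, the `suggest_strategy` helper, and the choice to fall back to recursive character splitting when semantic chunking is ruled out are all assumptions made for the example.

```python
from dataclasses import dataclass

# Rows mirror the skill's comparison matrix:
# strategy -> (chunk quality, implementation complexity)
STRATEGY_MATRIX: dict[str, tuple[str, str]] = {
    "fixed-size": ("Low-Medium", "Simple"),
    "recursive character": ("Medium", "Simple"),
    "sentence-based": ("Medium-High", "Medium"),
    "semantic": ("High", "Medium"),
    "document-aware": ("High", "Medium"),
    "agentic/contextual": ("Very High", "Complex"),
    "late chunking": ("High", "Medium"),
}

# Hypothetical content-type labels mapped to the "Best For" guidance.
CONTENT_DEFAULTS: dict[str, str] = {
    "logs": "fixed-size",
    "article": "recursive character",
    "conversation": "sentence-based",
    "technical_docs": "semantic",
    "markdown": "document-aware",
    "html": "document-aware",
    "code": "document-aware",
}


@dataclass(frozen=True)
class Recommendation:
    strategy: str
    chunk_quality: str
    complexity: str


def suggest_strategy(
    content_type: str,
    real_time: bool = False,
    cost_sensitive: bool = False,
) -> Recommendation:
    """Pick a starting strategy; unknown content falls back to the default."""
    strategy = CONTENT_DEFAULTS.get(content_type, "recursive character")
    # Semantic chunking needs embeddings at ingestion time, so the guidance
    # says to avoid it for real-time or cost-sensitive pipelines.
    # Falling back to the recursive baseline here is an assumption.
    if strategy == "semantic" and (real_time or cost_sensitive):
        strategy = "recursive character"
    quality, complexity = STRATEGY_MATRIX[strategy]
    return Recommendation(strategy, quality, complexity)


if __name__ == "__main__":
    print(suggest_strategy("technical_docs"))
    print(suggest_strategy("technical_docs", real_time=True))
```

A lookup like this is only a starting point; the matrix rates chunk quality qualitatively, so the final choice should still be checked against retrieval results on your own documents.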
Adoption & trust: 2.8k installs on skills.sh; 9.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your RAG app returns the wrong chunks or breaks context apart because documents were split with a default chunker instead of a strategy matched to their content.
Who is it for?
Builders shipping knowledge-backed agents or SaaS search who need a decision framework across seven chunking strategies.
Skip if: you are on a frozen enterprise RAG platform, use a pure embedding API setup with no chunking choices, or work only with non-text modalities.
When should I use this skill?
When you are designing or fixing RAG ingestion, chunking, or retrieval quality for an LLM product.
What do I get? / Deliverables
You choose a documented chunking approach matched to content type and precision needs before implementing ingestion and retrieval code.
- Selected chunking strategy with rationale
- Implementation notes aligned to content structure
Journey fit
RAG architecture is build-phase agent tooling—the core retrieval stack ships with the product, not as a one-time launch tactic. Agent-tooling is the shelf for skills that shape how LLM products ingest, chunk, embed, and retrieve knowledge.
How it compares
It is an architecture and chunking playbook, not a hosted vector DB or turnkey ingestion MCP.
Common Questions / FAQ
Who is rag-architect for?
Solo developers building agents, internal copilots, or API products that retrieve from docs, tickets, or manuals.
When should I use rag-architect?
During the Build phase, when designing ingestion pipelines, evaluating semantic vs recursive splits, or debugging poor retrieval on technical content.
Is rag-architect safe to install?
It is guidance-only; confirm trust via the Security Audits panel on this Prism page for the jeffallan package source.
SKILL.md
# Chunking Strategies

---

## Strategy Comparison Matrix

| Strategy | Best For | Chunk Quality | Implementation Complexity |
|----------|----------|---------------|---------------------------|
| **Fixed-size** | Simple documents, logs | Low-Medium | Simple |
| **Recursive character** | General text, articles | Medium | Simple |
| **Sentence-based** | Conversational, Q&A | Medium-High | Medium |
| **Semantic** | Technical docs, manuals | High | Medium |
| **Document-aware** | Structured content (MD, HTML) | High | Medium |
| **Agentic/Contextual** | Complex documents | Very High | Complex |
| **Late chunking** | Long-context embeddings | High | Medium |

---

## When to Use Each Strategy

### Fixed-Size Chunking

```
Best For:
- Log files and structured data
- Quick prototyping
- When content has no natural structure
- Baseline comparison

When to Avoid:
- Technical documentation
- Content with semantic units (paragraphs, sections)
- When context preservation matters
```

### Recursive Character Splitting

```
Best For:
- General articles and blog posts
- Mixed content types
- Default starting point for most RAG
- LangChain/LlamaIndex default

When to Avoid:
- Highly structured documents
- Code-heavy content
- Tables and lists
```

### Semantic Chunking

```
Best For:
- Technical documentation
- Research papers
- Content with natural topic boundaries
- When retrieval precision is critical

When to Avoid:
- Real-time ingestion (slower)
- Very short documents
- Cost-sensitive pipelines (requires embeddings)
```

### Document-Aware Chunking

```
Best For:
- Markdown documentation
- HTML pages
- LaTeX papers
- Code files

When to Avoid:
- Plain text without structure
- Inconsistent formatting
```

---

## Fixed-Size Chunking

```python
def fixed_size_chunk(
    text: str,
    chunk_size: int = 500,
    overlap: int = 50
) -> list[str]:
    """Simple fixed-size chunking with overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    chunks = []
    start = 0

    while start < len(text):
        end = start + chunk_size
        chunk = text[start:end]

        # Try to break at word boundary
        if end < len(text):
            last_space = chunk.rfind(' ')
            if last_space > chunk_size * 0.8:  # Only if reasonably far in
                chunk = chunk[:last_space]
                end = start + last_space

        chunks.append(chunk.strip())

        # Stop once the text is consumed; otherwise the loop would emit
        # an extra trailing chunk made up only of the overlap.
        if end >= len(text):
            break

        # Step back by the overlap, but always move forward at least one character
        start = max(end - overlap, start + 1)

    return chunks

# Usage
chunks = fixed_size_chunk(document_text, chunk_size=500, overlap=50)
```

---

## Recursive Character Splitting (LangChain Style)

```python
from typing import Callable


class RecursiveCharacterSplitter:
    """Split text recursively using multiple separators."""

    def __init__(
        self,
        chunk_size: int = 1000,
        chunk_overlap: int = 200,
        separators: list[str] | None = None,
        length_function: Callable[[str], int] = len
    ):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        self.separators = separators or ["\n\n", "\n", ". ", " ", ""]
        self.length_function = length_function

    def split_text(self, text: str) -> list[str]:
        """Split text into chunks."""
        return self._split_text(text, self.separators)

    def _split_text(self, text: str, separators: list[str]) -> list[str]:
        final_chunks = []

        # Use the first separator that occurs in the text
        separator = separators[-1]
        for i, sep in enumerate(separators):
            if sep == "":
                separator = sep
                break
            if sep in text:
                separator = sep
                break

        # An empty separator means splitting into individual characters
        splits = text.split(separator) if separator else list(text)

        good_splits = []
        for split in splits:
            if self.length_function(split) < self.chunk_size:
                good_splits.append(split)
            else:
                if good_splits:
                    merged = self._merge_splits(good_splits, separator)
                    final_chunks.extend(merged)
                    good_splits = []