
Pinecone
Wire a managed Pinecone index into production RAG or semantic search with hybrid vectors, metadata filters, and serverless scaling.
Overview
Pinecone is an agent skill for the Build phase that guides production integration of Pinecone’s managed vector database for RAG, hybrid search, and low-latency retrieval.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill pinecone
What is this skill?
- Managed serverless Pinecone with auto-scaling toward billion-vector scale
- Hybrid search combining dense and sparse vectors plus metadata filtering and namespaces
- Stated targets of **p95 latency under 100ms** and a 99.9% uptime SLA
- Quick start with `pinecone-client` install and ServerlessSpec index creation
- Guidance on when to prefer self-hosted Chroma, FAISS, or Weaviate instead
- Skill version 1.0.0 with pinecone-client dependency
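The hybrid dense/sparse search listed above is often tuned by weighting the two query vectors before sending them. What follows is a minimal sketch in plain Python, assuming a convex weighting controlled by a hypothetical `alpha` parameter and a sparse vector shaped as `{"indices": [...], "values": [...]}`. The helper name `weight_hybrid` is illustrative and is not part of the skill or of the Pinecone client.

```python
from typing import Dict, List, Tuple


def weight_hybrid(
    dense: List[float],
    sparse: Dict[str, list],
    alpha: float,
) -> Tuple[List[float], Dict[str, list]]:
    """Scale dense values by alpha and sparse values by (1 - alpha).

    alpha=1.0 keeps only the dense (semantic) signal,
    alpha=0.0 keeps only the sparse (keyword) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    weighted_dense = [v * alpha for v in dense]
    weighted_sparse = {
        "indices": list(sparse["indices"]),  # positions are left unchanged
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return weighted_dense, weighted_sparse


# Toy query vectors (assumed values, far smaller than real embeddings).
dense_q = [0.5, 1.0]
sparse_q = {"indices": [3, 17], "values": [2.0, 4.0]}
d, s = weight_hybrid(dense_q, sparse_q, alpha=0.75)
print(d)
print(s)
```

The weighted pair would then be passed to a query as the dense and sparse inputs. Which `alpha` works best depends on the data and is usually found by experiment.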
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have embeddings and agent retrieval requirements but do not want to run or scale your own vector database infrastructure.
Who is it for?
Indie SaaS or agent products that need serverless, auto-scaling semantic search with hybrid retrieval in production.
Skip if: you run offline-only FAISS experiments, operate a fully self-hosted vector stack, or cannot use a managed service that requires an API key.
When should I use this skill?
Use it when you need managed, serverless vector search with hybrid dense/sparse vectors, metadata filtering, and production RAG at scale.
What do I get? / Deliverables
You can create indexes, query with metadata and hybrid vectors, and align embedding dimensions with a production-minded Pinecone setup.
- Index creation and upsert/query code patterns
- Integration checklist for namespaces, filters, and hybrid search
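Aligning embedding dimensions with the index is easy to check before any upsert. The sketch below, in plain Python, validates vector lengths and splits records into batches. The helper names `check_dimensions` and `chunked`, the toy dimension of 4, and the batch size of 2 are assumptions for illustration only.

```python
from typing import Dict, Iterator, List


def check_dimensions(records: List[Dict], expected_dim: int) -> None:
    """Raise ValueError if any record's vector length differs from expected_dim."""
    for record in records:
        actual = len(record["values"])
        if actual != expected_dim:
            raise ValueError(
                f"{record['id']}: got {actual} dimensions, expected {expected_dim}"
            )


def chunked(records: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Toy data: five 4-dimensional vectors (a real index might use 1536).
records = [{"id": f"vec{i}", "values": [0.1] * 4} for i in range(5)]
check_dimensions(records, expected_dim=4)
batch_sizes = [len(batch) for batch in chunked(records, size=2)]
print(batch_sizes)
```

Running the check before upserting surfaces a mismatched embedding model early, rather than as a rejected request from the service.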
Journey fit
Build → Integrations is where you connect embedding pipelines to a hosted vector store—not distribution, validation, or runtime incident response. Integrations captures third-party SaaS APIs like Pinecone client setup, index creation, and query patterns for agent retrieval.
How it compares
Integration skill for Pinecone SaaS—not a self-hosted Chroma/FAISS recipe or an MCP server by itself.
Common Questions / FAQ
Who is pinecone for?
Solo builders adding production RAG or recommendation retrieval who want managed Pinecone instead of operating vector infra.
When should I use pinecone?
During Build integrations when you are connecting embedders and agents to a hosted index with filtering, namespaces, and latency-sensitive queries.
Is pinecone safe to install?
Review the Security Audits panel on this Prism page; using Pinecone requires API keys and network access to the managed service—treat secrets accordingly.
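One way to treat the key as a secret is to read it from the environment instead of hardcoding it. The sketch below assumes the variable is named `PINECONE_API_KEY`. The helper `load_api_key` is hypothetical and only shows the pattern.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "PINECONE_API_KEY",  # assumed variable name
) -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it instead of hardcoding the key")
    return key


# Demo with an explicit mapping so the example runs without a real key.
demo_key = load_api_key({"PINECONE_API_KEY": "demo-key"})
print(len(demo_key))
```

In real use the function would be called without arguments, and its result passed as the `api_key` argument when constructing the client.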
SKILL.md
# Pinecone - Managed Vector Database

The vector database for production AI applications.

## When to use Pinecone

**Use when:**
- Need managed, serverless vector database
- Production RAG applications
- Auto-scaling required
- Low latency critical (<100ms)
- Don't want to manage infrastructure
- Need hybrid search (dense + sparse vectors)

**Metrics**:
- Fully managed SaaS
- Auto-scales to billions of vectors
- **p95 latency <100ms**
- 99.9% uptime SLA

**Use alternatives instead**:
- **Chroma**: Self-hosted, open-source
- **FAISS**: Offline, pure similarity search
- **Weaviate**: Self-hosted with more features

## Quick start

### Installation

```bash
pip install pinecone-client
```

### Basic usage

```python
from pinecone import Pinecone, ServerlessSpec

# Initialize
pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="my-index",
    dimension=1536,  # Must match embedding dimension
    metric="cosine",  # or "euclidean", "dotproduct"
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

# Connect to index
index = pc.Index("my-index")

# Upsert vectors
index.upsert(vectors=[
    {"id": "vec1", "values": [0.1, 0.2, ...], "metadata": {"category": "A"}},
    {"id": "vec2", "values": [0.3, 0.4, ...], "metadata": {"category": "B"}}
])

# Query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    include_metadata=True
)

print(results["matches"])
```

## Core operations

### Create index

```python
# Serverless (recommended)
pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="aws",  # or "gcp", "azure"
        region="us-east-1"
    )
)

# Pod-based (for consistent performance)
from pinecone import PodSpec

pc.create_index(
    name="my-index",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-east1-gcp",
        pod_type="p1.x1"
    )
)
```

### Upsert vectors

```python
# Single upsert
index.upsert(vectors=[
    {
        "id": "doc1",
        "values": [0.1, 0.2, ...],  # 1536 dimensions
        "metadata": {
            "text": "Document content",
            "category": "tutorial",
            "timestamp": "2025-01-01"
        }
    }
])

# Batch upsert (recommended)
vectors = [
    {"id": f"vec{i}", "values": embedding, "metadata": metadata}
    for i, (embedding, metadata) in enumerate(zip(embeddings, metadatas))
]

index.upsert(vectors=vectors, batch_size=100)
```

### Query vectors

```python
# Basic query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    include_metadata=True,
    include_values=False
)

# With metadata filtering
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    filter={"category": {"$eq": "tutorial"}}
)

# Namespace query
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=5,
    namespace="production"
)

# Access results
for match in results["matches"]:
    print(f"ID: {match['id']}")
    print(f"Score: {match['score']}")
    print(f"Metadata: {match['metadata']}")
```

### Metadata filtering

```python
# Exact match
filter = {"category": "tutorial"}

# Comparison
filter = {"price": {"$gte": 100}}  # $gt, $gte, $lt, $lte, $ne

# Logical operators
filter = {
    "$and": [
        {"category": "tutorial"},
        {"difficulty": {"$lte": 3}}
    ]
}  # Also: $or

# In operator
filter = {"tags": {"$in": ["python", "ml"]}}
```

## Namespaces

```