
Qdrant Vector Search
Design and operate Qdrant vector search—collections, sharding, and multi-node clusters—for RAG and semantic retrieval backends.
Overview
Qdrant Vector Search is an agent skill, used mainly in the Build phase and also in Operate, that documents distributed Qdrant deployment, sharding, and collection setup for vector retrieval workloads.
Install
npx skills add https://github.com/orchestra-research/ai-research-skills --skill qdrant-vector-search
What is this skill?
- 3-node Docker Compose cluster pattern with Raft, P2P ports, and bootstrap peers
- Python client examples for collections with VectorParams, Distance, and ShardingMethod
- Distributed deployment guidance: HTTP/gRPC ports, per-node storage volumes, cluster env vars
- Targets advanced self-hosted Qdrant rather than a minimal single-node quickstart
- Orchestra research skill packaging for AI/RAG data plane design
- 3-node cluster example in docker-compose.yml
- Raft consensus called out for distributed coordination
Adoption & trust: 1 install on skills.sh; 9.4k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need semantic search or RAG storage but only have fragile single-node Qdrant notes and no clear cluster or sharding recipe.
Who is it for?
Indie builders self-hosting Qdrant for agents or SaaS features who must plan cluster networking and collection sharding early.
Skip if: Teams wanting a managed-only vector DB with zero ops, or frontend-only search UI work without a Qdrant backend.
When should I use this skill?
Implementing or scaling Qdrant for vector search, RAG storage, clustered deployment, or sharding configuration.
What do I get? / Deliverables
You can stand up a multi-node Qdrant topology and configure collections with sharding-aware client code aligned to the skill’s compose and Python patterns.
- docker-compose cluster layout with per-node storage
- Python collection definitions including sharding options
- Environment-variable Qdrant cluster configuration reference
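When planning the sharding options, it can help to work out how many shard replicas the cluster will hold before creating a collection. The sketch below uses the values from the skill's own example (`shard_number=6`, `replication_factor=2`) on a 3-node cluster. The `ShardPlan` class and its methods are hypothetical helpers for illustration, not part of the Qdrant client, and the feasibility check rests on the assumption that each replica of a shard is placed on a different node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardPlan:
    """Hypothetical helper: back-of-envelope sizing for a sharded collection."""

    shard_number: int
    replication_factor: int
    node_count: int

    @property
    def total_replicas(self) -> int:
        # Every shard is stored replication_factor times across the cluster.
        return self.shard_number * self.replication_factor

    @property
    def replicas_per_node(self) -> float:
        # Average load per node if replicas are spread evenly.
        return self.total_replicas / self.node_count

    def is_feasible(self) -> bool:
        # Assumption: replicas of one shard live on distinct nodes,
        # so the replication factor cannot exceed the node count.
        return 1 <= self.replication_factor <= self.node_count


plan = ShardPlan(shard_number=6, replication_factor=2, node_count=3)
print(plan.total_replicas, plan.replicas_per_node, plan.is_feasible())
```

With these inputs the cluster holds 12 shard replicas, about 4 per node, which is one way to sanity-check per-node storage volumes before ingestion starts.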
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Vector stores are implemented with the product backend and later tuned when traffic and data volume grow in production. Cluster, sharding, and client setup are backend infrastructure choices that sit alongside APIs and embedding pipelines.
Where it fits
Define collection VectorParams and sharding before exposing a search API to your agent feature.
Wire embedding ingestion into a Qdrant collection once cluster endpoints are known.
Expand from one node to a 3-node compose cluster with bootstrap peers and isolated storage volumes.
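The expansion from one node to three follows a regular pattern in the skill's compose file: each node shifts its host ports by 10, gets its own storage directory, and every node after the first bootstraps from the first node's P2P port. A minimal sketch of that pattern is shown here. The `node_service` function is a hypothetical helper, and it omits the explicit HTTP and gRPC port variables that the original sets on node 1, since 6333 and 6334 are Qdrant's defaults.

```python
import json


def node_service(index: int, bootstrap_host: str = "qdrant-node-1") -> dict:
    """Build one compose service entry for node `index` (1-based)."""
    offset = 10 * (index - 1)  # host ports: 6333.., 6343.., 6353..
    environment = [
        "QDRANT__CLUSTER__ENABLED=true",
        "QDRANT__CLUSTER__P2P__PORT=6335",
    ]
    service = {
        "image": "qdrant/qdrant:latest",
        # HTTP, gRPC, and P2P container ports mapped to shifted host ports.
        "ports": [f"{port + offset}:{port}" for port in (6333, 6334, 6335)],
        # One storage directory per node so data is never shared.
        "volumes": [f"./node{index}_storage:/qdrant/storage"],
        "environment": environment,
    }
    if index > 1:
        # Later nodes join the cluster through the first node's P2P port.
        environment.append(
            f"QDRANT__CLUSTER__BOOTSTRAP=http://{bootstrap_host}:6335"
        )
        service["depends_on"] = [bootstrap_host]
    return service


services = {f"qdrant-node-{i}": node_service(i) for i in (1, 2, 3)}
print(json.dumps(services["qdrant-node-2"], indent=2))
```

Generating the entries this way keeps port offsets and volume names consistent if more nodes are added later; the output still needs to be written into a compose file by hand or with a YAML library.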
How it compares
Self-hosted Qdrant operations guide, not a generic embedding tutorial or a hosted Pinecone-style integration skill.
Common Questions / FAQ
Who is qdrant-vector-search for?
Solo developers and small teams building RAG or semantic search who run Qdrant themselves and need cluster and sharding reference material in the agent.
When should I use qdrant-vector-search?
Use it in Build when designing the retrieval backend and embedding pipeline, and in Operate when scaling to a multi-node cluster or adjusting sharding as data grows.
Is qdrant-vector-search safe to install?
The skill is documentation and example configs for infrastructure you deploy; review the Security Audits panel on this page and harden Docker networks, volumes, and secrets in your own environment.
SKILL.md
# Qdrant Advanced Usage Guide

## Distributed Deployment

### Cluster Setup

Qdrant uses Raft consensus for distributed coordination.

```yaml
# docker-compose.yml for 3-node cluster
version: '3.8'

services:
  qdrant-node-1:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
      - "6335:6335"
    volumes:
      - ./node1_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__SERVICE__GRPC_PORT=6334

  qdrant-node-2:
    image: qdrant/qdrant:latest
    ports:
      - "6343:6333"
      - "6344:6334"
      - "6345:6335"
    volumes:
      - ./node2_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
      - QDRANT__CLUSTER__BOOTSTRAP=http://qdrant-node-1:6335
    depends_on:
      - qdrant-node-1

  qdrant-node-3:
    image: qdrant/qdrant:latest
    ports:
      - "6353:6333"
      - "6354:6334"
      - "6355:6335"
    volumes:
      - ./node3_storage:/qdrant/storage
    environment:
      - QDRANT__CLUSTER__ENABLED=true
      - QDRANT__CLUSTER__P2P__PORT=6335
      - QDRANT__CLUSTER__BOOTSTRAP=http://qdrant-node-1:6335
    depends_on:
      - qdrant-node-1
```

### Sharding Configuration

```python
from qdrant_client import QdrantClient
from qdrant_client.models import VectorParams, Distance, ShardingMethod

client = QdrantClient(host="localhost", port=6333)

# Create sharded collection
client.create_collection(
    collection_name="large_collection",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE),
    shard_number=6,  # Number of shards
    replication_factor=2,  # Replicas per shard
    write_consistency_factor=1  # Required acks for write
)

# Check cluster status
cluster_info = client.get_cluster_info()
print(f"Peers: {cluster_info.peers}")
print(f"Raft state: {cluster_info.raft_info}")
```

### Replication and Consistency

```python
from qdrant_client.models import WriteOrdering

# `points` and `query` are assumed to be defined elsewhere.

# Strongly ordered write
client.upsert(
    collection_name="critical_data",
    points=points,
    ordering=WriteOrdering.STRONG  # Serialize writes through the permanent leader
)

# Weak ordering (faster)
client.upsert(
    collection_name="logs",
    points=points,
    ordering=WriteOrdering.WEAK  # No ordering guarantees; writes may be reordered
)

# Read with majority consistency
results = client.search(
    collection_name="documents",
    query_vector=query,
    consistency="majority"  # Read from majority of replicas
)
```

## Hybrid Search

### Dense + Sparse Vectors

Combine semantic (dense) and keyword (sparse) search:

```python
from qdrant_client.models import (
    VectorParams, SparseVectorParams, SparseIndexParams,
    Distance, PointStruct, SparseVector, Prefetch, Query
)

# Create hybrid collection
client.create_collection(
    collection_name="hybrid",
    vectors_config={
        "dense": VectorParams(size=384, distance=Distance.COSINE)
    },
    sparse_vectors_config={
        "sparse": SparseVectorParams(
            index=SparseIndexParams(on_disk=False)
        )
    }
)

# Insert with both vector types
def encode_sparse(text: str) -> SparseVector:
    """Simple term-frequency sparse encoding (a rough stand-in for BM25)"""
    import zlib
    from collections import Counter
    tokens = text.lower().split()
    counts = Counter(tokens)
    # Map tokens to indices (use vocabulary in production).
    # zlib.crc32 is stable across processes, unlike the built-in hash();
    # colliding tokens are merged so each index appears only once.
    weights = Counter()
    for token, count in counts.items():
        weights[zlib.crc32(token.encode("utf-8")) % 30000] += count
    indices = list(weights.keys())
    values = [float(v) for v in weights.values()]
    return SparseVector(indices=indices, values=values)

# `dense_encoder` is assumed to be an embedding model exposing .encode()
# that returns 384-dimensional vectors.
client.upsert(
    collection_name="hybrid",
    points=[
        PointStruct(
            id=1,
            vector={
                "dense": dense_encoder.encode("Python programming").tolist(),
                "sparse": encode_sparse("Python programming language code")
            },
            payload={"text": "Python programming language code"}
        )
    ]
)

# Hybrid search with Reciprocal Rank Fusion (RRF)
from qdrant_client.models i