
Storing And Querying Vectors
Design AWS S3 Vectors storage and query patterns for RAG or semantic search without paying in-memory vector DB prices at indie scale.
Overview
Storing and Querying Vectors is an agent skill most often used in Build (also Operate infra, Grow analytics) that documents AWS S3 Vectors patterns for scalable, cost-aware embedding storage and retrieval.
Install
npx skills add https://github.com/aws/agent-toolkit-for-aws --skill storing-and-querying-vectors
What is this skill?
- Positions S3 Vectors for large, long-term, cost-optimized embeddings with strong read-after-write consistency
- Latency guidance: subsecond for infrequent access and as low as 100ms for hotter query paths
- Per-tenant index pattern for isolation versus single-index metadata filtering by tenant_id
- Batch ingestion up to 500 vectors per PutVectors call with parallel workers and ServiceUnavailable backoff
- Points to current AWS documentation for S3 Vectors limits rather than hard-coding quotas
Adoption & trust: 1.1k installs on skills.sh; 819 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need semantic search or RAG storage on AWS but are unsure whether S3 Vectors fits your latency, tenancy, and bulk ingest requirements.
Who is it for?
Indie builders shipping AWS-hosted agents or SaaS with large, slowly changing embedding corpora and tolerance for S3-backed query latency.
Skip if: you run ultra-low-latency workloads that demand an always-hot in-memory vector database, or your team is not on AWS and needs a portable vector store abstraction.
When should I use this skill?
Designing or scaling vector storage and queries on AWS S3 Vectors for large, long-term embedding corpora with defined tenancy and ingestion needs.
What do I get? / Deliverables
You leave with a concrete tenancy model, batch ingestion approach, and query expectations aligned to S3 Vectors limits documented in AWS.
- Chosen multi-tenant vector index strategy with isolation rationale
- Batch ingestion plan with worker parallelism and backoff handling
- Documented latency and cost expectations for S3 Vectors versus in-memory stores
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Build backend because the skill teaches how to architect vector indexes, ingestion, and query paths for a product feature. Backend subphase matches multi-tenant index design, batch PutVectors ingestion, and query scoping that ships as application infrastructure.
Where it fits
Compare S3 Vectors latency and cost against a managed vector DB before committing your RAG MVP architecture.
Pick per-tenant indexes so each customer’s embeddings stay isolated in a shared vector bucket.
Tune parallel PutVectors workers with backoff after ServiceUnavailable spikes during a bulk reindex.
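The bulk-reindex scenario above can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: `ServiceUnavailable`, `put_batch`, `put_with_backoff`, and `ingest` are hypothetical names, the worker count and delays are arbitrary, and the 500-vector ceiling is the figure quoted by the skill, which should be checked against current AWS documentation.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

MAX_BATCH = 500  # per-call ceiling quoted by the skill; confirm in current AWS docs


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for the SDK's ServiceUnavailableException."""


def chunked(items, size=MAX_BATCH):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def put_with_backoff(put_batch, batch, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call put_batch(batch), retrying with jittered exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return put_batch(batch)
        except ServiceUnavailable:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))


def ingest(vectors, put_batch, workers=4, sleep=time.sleep):
    """Send all vectors in batches across a small thread pool; return the batch count."""
    batches = list(chunked(vectors))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception here.
        list(pool.map(lambda b: put_with_backoff(put_batch, b, sleep=sleep), batches))
    return len(batches)
```

Here `put_batch` is assumed to be a caller-supplied function that wraps the actual PutVectors request and raises `ServiceUnavailable` when the service asks the client to slow down. Keeping it injected means the batching and retry logic can be exercised without AWS credentials.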
How it compares
This skill provides architecture patterns for S3 Vectors at scale; it is not a drop-in replacement for the setup wizards of Pinecone or OpenSearch Serverless.
Common Questions / FAQ
Who is storing-and-querying-vectors for?
Solo builders and small teams implementing retrieval or RAG backends on AWS who want agent guidance on S3 Vectors tenancy, ingestion, and query tradeoffs.
When should I use storing-and-querying-vectors?
Use it during Build backend when choosing vector storage, during Operate infra when scaling ingestion workers, and during Validate scope when estimating AWS retrieval costs versus in-memory options.
Is storing-and-querying-vectors safe to install?
It is documentation-style guidance without embedded secrets; review the Security Audits panel on this Prism page and follow your own IAM least-privilege policies for S3 Vectors APIs.
SKILL.md
# Patterns for S3 Vectors at Scale

For current limits: search AWS docs for `"S3 Vectors limitations and restrictions"`

## When to Use S3 Vectors

Use S3 Vectors for large, long-term vector data that doesn't require the high-throughput performance of in-memory vector databases. S3 Vectors provides a cost-optimized data foundation with query performance optimized for long-term storage and infrequent access of data. You also benefit from a storage architecture with strong consistency guarantees, ensuring subsequent queries always include your most recently added data. S3 Vectors delivers subsecond latency for infrequent queries and as low as 100ms for more frequent queries.

## Multi-Tenant Patterns

**Per-tenant index** (recommended for isolation):

- Each tenant gets their own index within a shared vector bucket
- Queries naturally scoped to one tenant
- Easy to delete a tenant's data (delete the index)
- Use when: tenants need strict isolation, different schemas, or independent scaling

**Single index with metadata filtering** (simpler):

- All tenants share one index, filter by `tenant_id` metadata
- Simpler to manage, single query endpoint
- Use when: tenants have identical schemas and moderate scale
- Risk: noisy neighbor if one tenant dominates the index

## Batch Ingestion Pattern

For large-scale ingestion (millions of vectors):

1. Batch vectors into groups of up to 500 per PutVectors call
2. Use parallel workers with backoff on `ServiceUnavailableException`
3. For sustained throughput beyond per-index limits, shard across multiple indexes
4. Search AWS docs for `"S3 Vectors limitations and restrictions"` for current per-call and per-second limits

## SSE-KMS Encryption

To create a vector bucket with SSE-KMS:

```bash
aws s3vectors create-vector-bucket \
  --vector-bucket-name <BUCKET_NAME> \
  --encryption-configuration '{"sseType":"aws:kms","kmsKeyArn":"arn:aws:kms:<REGION>:<ACCOUNT>:key/<KEY_ID>"}'
```

You MUST use the full KMS key ARN (not alias or key ID).
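
One way to catch an alias or bare key ID before calling the CLI is a small pre-flight check. The sketch below is illustrative only: `KEY_ARN` and `encryption_configuration` are hypothetical names, and the regular expression is an approximation of the key ARN shape shown in the command above, not an official AWS grammar.

```python
import json
import re

# Assumed shape of a full KMS key ARN: partition, region, 12-digit account, key ID.
# This is an illustrative approximation, not an official AWS grammar.
KEY_ARN = re.compile(r"^arn:aws[a-z-]*:kms:[a-z0-9-]+:\d{12}:key/[A-Za-z0-9-]+$")


def encryption_configuration(kms_key_arn):
    """Return the JSON string for --encryption-configuration, rejecting non-ARN input."""
    if not KEY_ARN.match(kms_key_arn):
        raise ValueError(f"expected a full KMS key ARN, got {kms_key_arn!r}")
    return json.dumps({"sseType": "aws:kms", "kmsKeyArn": kms_key_arn})
```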

The KMS key policy MUST grant `kms:GenerateDataKey` and `kms:Decrypt` to the S3 Vectors service principal `indexing.s3vectors.amazonaws.com`. Encryption cannot be changed after bucket or index creation. For full KMS policy examples, search AWS docs for `"S3 Vectors data encryption KMS"`.

## Migration Pattern

When migrating from another vector DB (pgVector, AOSS, etc.):

1. Create vector bucket and index matching source dimensions + distance metric
2. Export vectors from source (with metadata)
3. Batch PutVectors into S3 Vectors
4. Verify with QueryVectors using known test vectors
5. S3 Vectors only supports `cosine` and `euclidean`; if source used dotProduct, use `cosine` on normalized vectors as equivalent

# Metadata Filtering

For full docs: search AWS docs for `"S3 Vectors metadata filtering"`

## Filterable vs Non-filterable

- **Filterable** (default): All metadata is filterable unless explicitly declared otherwise. Can be used in query `--filter` expressions. Limited to 2 KB per vector.
- **Non-filterable**: Declared at index creation via `--metadata-configuration`. Search AWS docs for `"S3 Vectors non-filterable metadata"` for JSON syntax. Cannot be used in filters but can store larger data. Total metadata per vector (filterable + non-filterable combined) is limited to 40 KB. Ideal for text chunks, descriptions, raw content. Immutable: cannot change after index creation. Max 10 non-filterable keys per index.

## Filter Operators

| Operator | Input types | Description |
|----------|------------|-------------|
| `$eq` | string, number, boolean | Exact match (default when no operator specified) |
| `$ne` | string, number, boolean | Not equal |
| `$gt` | number | Greater than |
| `$gte` | number | Greater than or equal |
| `$lt` | number | Less than |
| `$lte` | number | Less than or equal |
| `$in` | array of primitives | Match any value in array |
| `$nin` | array of primitives | Match none of the values |
| `$exists` | boolean | Ch