Faiss

Name: Faiss
Author: orchestra-research

orchestra-research/ai-research-skills

516 installs
11.2k repo stars
Updated June 16, 2026
faiss is a Claude Code skill that guides selection of FAISS vector index types and Python code patterns for developers who need fast approximate or exact nearest-neighbor search in embeddings, RAG, and recommendation backends.

About

faiss is an orchestra-research skill that documents FAISS index tradeoffs across dataset sizes: Flat for under 10K vectors with 100% accuracy, IVF for 10K–1M at 95–99% accuracy, HNSW for 1M–10M near 99% accuracy, and IVF+PQ beyond 10M for memory-efficient 90–95% accuracy. It includes Python examples such as IndexFlatL2 with 128-dimensional vectors and k-neighbor search snippets developers can paste into RAG or recommendation services. Reach for faiss when you must pick an index family before wiring embedding storage rather than after latency problems appear in production.

Dataset-size matrix: Flat (<10K), IVF (10K–1M), HNSW (1M–10M), IVF+PQ (>10M)
IndexFlatL2 and IndexFlatIP recipes with normalize_L2 for cosine similarity
IVF cluster setup with quantizer, nlist, and training requirements spelled out
Accuracy vs speed vs memory tradeoffs in one comparison table

Faiss by the numbers

516 all-time installs (skills.sh)
+31 installs in the week ending Jul 26, 2026 (Skillselion tracking)
Ranked #434 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/orchestra-research/ai-research-skills --skill faiss

Which FAISS index type fits my vector dataset?

Pick the right FAISS index type and Python snippets when you add fast vector search to embeddings, RAG, or recommendation backends.

Who is it for?

ML engineers adding vector similarity search to RAG or recommendation backends who need index guidance tied to dataset size.

Skip if: Teams already running a managed vector database with no plan to embed FAISS directly in Python services.

When should I use this skill?

User asks about FAISS indexes, vector search performance, IVF, HNSW, or embedding nearest-neighbor retrieval in Python.

What you get

Chosen FAISS index configuration, Python search code, and accuracy-speed tradeoff notes

Index selection guide
Python FAISS snippets
Accuracy-speed tradeoff table

By the numbers

Flat index recommended for datasets under 10K vectors
IVF fits 10K–1M vectors at 95–99% accuracy
HNSW targets 1M–10M vectors near 99% accuracy

Files

SKILL.md (Markdown)

FAISS - Efficient Similarity Search

Facebook AI's library for billion-scale vector similarity search.

When to use FAISS

Use FAISS when:

Need fast similarity search on large vector datasets (millions/billions)
GPU acceleration required
Pure vector similarity (no metadata filtering needed)
High throughput, low latency critical
Offline/batch processing of embeddings

Metrics:

31,700+ GitHub stars
Meta/Facebook AI Research
Handles billions of vectors
C++ with Python bindings

Use alternatives instead:

Chroma/Pinecone: Need metadata filtering
Weaviate: Need full database features
Annoy: Simpler, fewer features

Quick start

Installation

# CPU only
pip install faiss-cpu

# GPU support
pip install faiss-gpu

Basic usage

import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)             # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")

Index types

1. Flat (exact search)

# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)

# Slowest, most accurate

2. IVF (inverted file) - Fast approximate

# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = clusters to search)
index.nprobe = 10
distances, indices = index.search(query, k)

3. HNSW (Hierarchical NSW) - Best quality/speed

# HNSW index
M = 32  # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)

4. Product Quantization - Memory efficient

# PQ shrinks each vector from 4*d bytes to m bytes when nbits = 8 (64× for d = 128, m = 8)
m = 8   # Number of subquantizers
nbits = 8
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)

Save and load

# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)

GPU acceleration

# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# 10-100× faster than CPU

LangChain integration

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True
)

# Search
results = vectorstore.similarity_search("query", k=5)

LlamaIndex integration

from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536
faiss_index = faiss.IndexFlatL2(d)

vector_store = FaissVectorStore(faiss_index=faiss_index)

Best practices

1. Choose right index type - Flat for <10K, IVF for 10K-1M, HNSW for quality
2. Normalize for cosine - Use IndexFlatIP with normalized vectors
3. Use GPU for large datasets - 10-100× faster
4. Save trained indices - Training is expensive
5. Tune nprobe/ef_search - Balance speed/accuracy
6. Monitor memory - PQ for large datasets
7. Batch queries - Better GPU utilization

Performance

Index Type	Build Time	Search Time	Memory	Accuracy
Flat	Fast	Slow	High	100%
IVF	Medium	Fast	Medium	95-99%
HNSW	Slow	Fastest	High	99%
PQ	Medium	Fast	Low	90-95%

Resources

GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
Wiki: https://github.com/facebookresearch/faiss/wiki
License: MIT

FAISS Index Types Guide

Complete guide to choosing and using FAISS index types.

Index selection guide

Dataset Size	Index Type	Training	Accuracy	Speed
< 10K	Flat	No	100%	Slow
10K-1M	IVF	Yes	95-99%	Fast
1M-10M	HNSW	No	99%	Fastest
> 10M	IVF+PQ	Yes	90-95%	Fast, low memory
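
The table above can be read as a plain lookup by dataset size. A minimal sketch, assuming the thresholds are exactly those in the table; `suggest_index` is a hypothetical helper for illustration and not part of FAISS. The table lists 1M in two rows, so the sketch arbitrarily assigns exactly 1M to IVF.

```python
def suggest_index(n_vectors: int) -> str:
    """Map a dataset size to the index family from the selection table."""
    if n_vectors < 0:
        raise ValueError("n_vectors must be non-negative")
    if n_vectors < 10_000:
        return "Flat"
    if n_vectors <= 1_000_000:  # assumption: the shared 1M boundary goes to IVF
        return "IVF"
    if n_vectors <= 10_000_000:
        return "HNSW"
    return "IVF+PQ"


for n in (5_000, 250_000, 5_000_000, 50_000_000):
    print(f"{n:>11,} vectors -> {suggest_index(n)}")
```

Size is only a first filter; memory budget and recall targets from the other columns can still override the suggestion.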

Flat indices (exact search)

IndexFlatL2 - L2 (Euclidean) distance

import faiss
import numpy as np

d = 128  # Dimension
index = faiss.IndexFlatL2(d)

# Add vectors
vectors = np.random.random((1000, d)).astype('float32')
index.add(vectors)

# Search
k = 5
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

Use when:

Dataset < 10,000 vectors
Need 100% accuracy
Serving as baseline
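
To make "exact search" concrete, here is a pure-Python sketch of the computation a flat L2 index performs: score the query against every stored vector and keep the k smallest. It is an illustration only and uses no FAISS calls; `flat_l2_search` is a hypothetical name. It returns squared Euclidean distances, which is also what IndexFlatL2 reports.

```python
import heapq


def flat_l2_search(vectors, query, k):
    """Exact k-NN: squared L2 distance from the query to every stored vector."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(vec, query)), i)
        for i, vec in enumerate(vectors)
    )
    nearest = heapq.nsmallest(k, scored)  # sorted by distance, then by id
    distances = [dist for dist, _ in nearest]
    indices = [i for _, i in nearest]
    return distances, indices


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = flat_l2_search(vectors, [0.9, 0.1], k=2)
print(indices)  # [1, 0]
```

The cost grows linearly with the number of stored vectors, which is why the guide limits Flat to small datasets and baselines.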

IndexFlatIP - Inner product (cosine similarity)

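# For cosine similarity, normalize vectors first
import faiss
import numpy as np

d = 128
index = faiss.IndexFlatIP(d)

# Normalize vectors in place (required for cosine similarity)
vectors = np.random.random((1000, d)).astype('float32')
faiss.normalize_L2(vectors)
index.add(vectors)

# Search (the query must be normalized too)
k = 5
query = np.random.random((1, d)).astype('float32')
faiss.normalize_L2(query)
distances, indices = index.search(query, k)  # values are similarities, higher = closer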

Use when:

Need cosine similarity
Recommendation systems
Text embeddings
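
Why normalization turns inner product into cosine similarity can be checked by hand. The sketch below is pure Python with hypothetical helper names and no FAISS calls; unlike `faiss.normalize_L2`, which works in place, `l2_normalize` here returns a new list.

```python
import math


def l2_normalize(vec):
    """Return a unit-length copy of vec (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a = [3.0, 4.0]
b = [4.0, 3.0]
ip_after_norm = inner_product(l2_normalize(a), l2_normalize(b))
print(math.isclose(ip_after_norm, cosine_similarity(a, b)))  # True
```

Both sides must be normalized: skipping the query leaves the ranking unchanged but scales the returned scores by the query norm, and skipping the stored vectors changes the ranking itself.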

IVF indices (inverted file)

IndexIVFFlat - Cluster-based search

# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# Create IVF index with 100 clusters
nlist = 100  # Number of clusters
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data (required!)
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = clusters to search)
index.nprobe = 10  # Search 10 closest clusters
distances, indices = index.search(query, k)

Parameters:

nlist: Number of clusters (√N to 4√N recommended)
nprobe: Clusters to search (1-nlist, higher = more accurate)
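
The nlist recommendation above is simple arithmetic. A small sketch, assuming the √N to 4√N rule as stated here; `nlist_range` is a hypothetical helper, not a FAISS function.

```python
import math


def nlist_range(n_vectors: int) -> tuple[int, int]:
    """Return the (low, high) nlist bounds from the sqrt(N) to 4*sqrt(N) rule."""
    if n_vectors <= 0:
        raise ValueError("n_vectors must be positive")
    root = math.sqrt(n_vectors)
    return max(1, round(root)), max(1, round(4 * root))


low, high = nlist_range(1_000_000)
print(low, high)  # 1000 4000
```

For the 1,000-vector toy data used in the snippets this gives roughly 32 to 126 clusters, so nlist = 100 sits inside the range.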

Use when:

Dataset 10K-1M vectors
Need fast approximate search
Can afford training time

Tuning nprobe

# Test different nprobe values
for nprobe in [1, 5, 10, 20, 50]:
    index.nprobe = nprobe
    distances, indices = index.search(query, k)
    # Measure recall/speed trade-off

Guidelines (recall figures are indicative and depend on the data and nlist):

nprobe=1: Fastest, ~50% recall
nprobe=10: Good balance, ~95% recall
nprobe=nlist: Exact search (same as Flat)
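
Measuring the trade-off needs a recall number. One common definition is the fraction of the exact top-k neighbours that the approximate search also returned, averaged over queries. A minimal pure-Python sketch with a hypothetical `recall_at_k` helper; the id lists are made-up toy data, not FAISS output.

```python
def recall_at_k(approx_ids, exact_ids):
    """Average fraction of exact neighbours that the approximate search found."""
    if not exact_ids or len(approx_ids) != len(exact_ids):
        raise ValueError("need one non-empty row of ids per query on both sides")
    total = 0.0
    for approx_row, exact_row in zip(approx_ids, exact_ids):
        truth = set(exact_row)
        total += len(truth.intersection(approx_row)) / len(truth)
    return total / len(exact_ids)


# Toy ids for two queries with k = 4: the first query misses one true neighbour.
exact = [[1, 2, 3, 4], [5, 6, 7, 8]]
approx = [[1, 2, 9, 4], [5, 6, 7, 8]]
print(recall_at_k(approx, exact))  # 0.875
```

In practice the exact rows would come from a Flat index over the same data and the approximate rows from the IVF index at each nprobe value, both converted to lists.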

HNSW indices (graph-based)

IndexHNSWFlat - Hierarchical NSW

# HNSW index
M = 32  # Number of connections per layer (16-64)
index = faiss.IndexHNSWFlat(d, M)

# Optional: Set ef_construction (build time parameter)
index.hnsw.efConstruction = 40  # Higher = better quality, slower build

# Add vectors (no training needed!)
index.add(vectors)

# Search
index.hnsw.efSearch = 16  # Search time parameter
distances, indices = index.search(query, k)

Parameters:

M: Connections per layer (16-64, default 32)
efConstruction: Build quality (40-200, higher = better)
efSearch: Search quality (16-512, higher = more accurate)

Use when:

Need best quality approximate search
Can afford higher memory (more connections)
Dataset 1M-10M vectors
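
The memory cost of extra connections can be estimated up front. The sketch below assumes the rule of thumb from the FAISS wiki guidelines, about d × 4 + M × 2 × 4 bytes per vector for IndexHNSWFlat; treat the result as an approximation, and `hnsw_memory_bytes` as a hypothetical helper.

```python
def hnsw_memory_bytes(n_vectors: int, d: int, M: int) -> int:
    """Approximate IndexHNSWFlat size: raw float32 vectors plus graph links."""
    vector_bytes = d * 4      # float32 components
    link_bytes = M * 2 * 4    # assumption: about 2*M int32 links per vector
    return n_vectors * (vector_bytes + link_bytes)


total = hnsw_memory_bytes(1_000_000, d=128, M=32)
print(f"{total / 1e6:.0f} MB")  # 768 MB
```

Doubling M to 64 adds another 256 bytes per vector in this estimate, which is the memory side of the quality gain.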

PQ indices (product quantization)

IndexPQ - Memory-efficient

# PQ shrinks each vector from 4*d bytes to m bytes when nbits = 8 (64× for d = 128, m = 8)
m = 8   # Number of subquantizers (divides d)
nbits = 8  # Bits per subquantizer

index = faiss.IndexPQ(d, m, nbits)

# Train (required!)
index.train(vectors)

# Add vectors
index.add(vectors)

# Search
distances, indices = index.search(query, k)

Parameters:

m: Subquantizers (d must be divisible by m)
nbits: Bits per subquantizer code (8 or 16)

Memory savings:

Original: d × 4 bytes (float32)
PQ: m × nbits / 8 bytes (m bytes when nbits = 8)
Compression ratio: 4d/m when nbits = 8 (64× for d = 128, m = 8)
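
The memory arithmetic above, worked through for the parameters used in the snippet. A small sketch with hypothetical helper names; it counts only the stored codes, not the codebooks or vector ids.

```python
import math


def pq_code_bytes(m: int, nbits: int = 8) -> int:
    """Bytes per encoded vector: m sub-codes of nbits bits each, rounded up."""
    return math.ceil(m * nbits / 8)


def pq_compression_ratio(d: int, m: int, nbits: int = 8) -> float:
    """Ratio of raw float32 size (4*d bytes) to the PQ code size."""
    if d % m != 0:
        raise ValueError("d must be divisible by m")
    return (d * 4) / pq_code_bytes(m, nbits)


print(pq_code_bytes(8))              # 8
print(pq_compression_ratio(128, 8))  # 64.0
```

Raising m to 32 for the same d brings the ratio down to 16×, trading memory for finer quantization.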

Use when:

Limited memory
Large datasets (> 10M vectors)
Can accept ~90-95% accuracy

IndexIVFPQ - IVF + PQ combined

# Best for very large datasets
nlist = 4096
m = 8
nbits = 8

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFPQ(quantizer, d, nlist, m, nbits)

# Train (required; k-means needs at least nlist training vectors), then add
index.train(vectors)
index.add(vectors)

# Search
index.nprobe = 32
distances, indices = index.search(query, k)

Use when:

Dataset > 10M vectors
Need fast search + low memory
Can accept 90-95% accuracy
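
With nlist = 4096 the size of the training set matters: k-means cannot fit more clusters than it has points. The sketch below is a pre-flight check under two assumptions about FAISS clustering defaults: training fails with fewer points than clusters, and a warning appears below 39 points per centroid. `check_training_size` is a hypothetical helper.

```python
def check_training_size(n_train: int, nlist: int,
                        min_points_per_centroid: int = 39) -> str:
    """Classify a training set as 'too few', 'warning' or 'ok' for IVF training."""
    if n_train < nlist:
        return "too few"   # fewer points than clusters
    if n_train < nlist * min_points_per_centroid:
        return "warning"   # trainable, but centroids may be poorly estimated
    return "ok"


for n_train in (1_000, 100_000, 200_000):
    print(n_train, check_training_size(n_train, nlist=4096))
```

Under these assumptions nlist = 4096 wants at least 159,744 training vectors, so the 1,000-vector toy array from the earlier snippets is not enough for this index.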

GPU indices

Single GPU

import faiss

# Create CPU index
index_cpu = faiss.IndexFlatL2(d)

# Move to GPU
res = faiss.StandardGpuResources()  # GPU resources
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Use normally
index_gpu.add(vectors)
distances, indices = index_gpu.search(query, k)

Multi-GPU

# Use all available GPUs
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# Or specific GPUs
gpus = [0, 1, 2, 3]  # Use GPUs 0-3
index_gpu = faiss.index_cpu_to_gpus_list(index_cpu, gpus)

Speedup:

Single GPU: 10-50× faster than CPU
Multi-GPU: Near-linear scaling

Index factory

# Easy index creation with string descriptors
index = faiss.index_factory(d, "IVF100,Flat")
index = faiss.index_factory(d, "HNSW32")
index = faiss.index_factory(d, "IVF4096,PQ8")

# Train (required for IVF and PQ, a no-op for Flat and HNSW) and add
index.train(vectors)
index.add(vectors)

Common descriptors:

"Flat": Exact search
"IVF100,Flat": IVF with 100 clusters
"HNSW32": HNSW with M=32
"IVF4096,PQ8": IVF + PQ compression

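The descriptors listed above follow a regular pattern, so they can be assembled from parameters rather than typed by hand. A minimal sketch covering only those four forms; `factory_descriptor` and its argument names are hypothetical, and the resulting string would be passed to `faiss.index_factory(d, ...)`.

```python
def _positive(name, value):
    """Validate that a required parameter is a positive integer."""
    if not isinstance(value, int) or value <= 0:
        raise ValueError(f"{name} must be a positive integer")
    return value


def factory_descriptor(kind, nlist=None, pq_m=None, hnsw_m=None):
    """Build an index_factory string for the Flat, IVF, HNSW and IVF+PQ cases."""
    if kind == "flat":
        return "Flat"
    if kind == "ivf":
        return f"IVF{_positive('nlist', nlist)},Flat"
    if kind == "hnsw":
        return f"HNSW{_positive('hnsw_m', hnsw_m)}"
    if kind == "ivfpq":
        return f"IVF{_positive('nlist', nlist)},PQ{_positive('pq_m', pq_m)}"
    raise ValueError(f"unknown index kind: {kind!r}")


print(factory_descriptor("ivfpq", nlist=4096, pq_m=8))  # IVF4096,PQ8
```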
Performance comparison

Search speed (1M vectors, k=10)

Index	Build Time	Search Time	Memory	Recall
Flat	0s	50ms	512 MB	100%
IVF100	5s	2ms	512 MB	95%
HNSW32	60s	1ms	1 GB	99%
IVF4096+PQ8	30s	3ms	32 MB	90%

CPU (16 cores), 128-dim vectors
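
Figures like those in the table depend on hardware and data, so it is worth measuring on your own setup. A minimal timing sketch using only the standard library; `median_latency_ms` is a hypothetical helper and the workload here is a stand-in, not a FAISS search.

```python
import statistics
import time


def median_latency_ms(search_fn, repeats: int = 20) -> float:
    """Call search_fn repeatedly and return the median wall-clock time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        search_fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload; with FAISS this would be: lambda: index.search(query, k)
latency = median_latency_ms(lambda: sum(range(10_000)))
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that a single slow first call does not dominate the result.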

Best practices

1. Start with Flat - Baseline for comparison
2. Use IVF for medium datasets - Good balance
3. Use HNSW for best quality - If memory allows
4. Add PQ for memory savings - Large datasets
5. GPU for > 100K vectors - 10-50× speedup
6. Tune nprobe/efSearch - Trade-off speed/accuracy
7. Train on representative data - Better clustering
8. Save trained indices - Avoid retraining

Resources

Wiki: https://github.com/facebookresearch/faiss/wiki
Paper: https://arxiv.org/abs/1702.08734

FAQ

Which FAISS index suits datasets under 10K vectors?

The faiss skill recommends Flat indices for under 10K vectors, delivering 100% exact search accuracy without training, at the cost of slower search on larger sets.

What index does faiss recommend above 10 million vectors?

For more than 10 million vectors, faiss guides developers toward IVF+PQ indexes that trade roughly 90–95% accuracy for fast, low-memory approximate search.

Is Faiss safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
