Neo4j Vector Index Skill

Name: Neo4j Vector Index Skill
Author: neo4j-contrib

neo4j-contrib/neo4j-skills

Create Neo4j vector indexes, ingest embeddings, and run hybrid vector-plus-graph retrieval for RAG or similarity search.

Install

npx skills add https://github.com/neo4j-contrib/neo4j-skills --skill neo4j-vector-index-skill

What is this skill?

CREATE VECTOR INDEX with dimensions and similarity function; wait for ONLINE via SHOW VECTOR INDEXES
Python batch ingestion with UNWIND and db.create.setNodeVectorProperty
In-Cypher ai.text.embed() [2025.12+] and ai.text.embedBatch(); notes that genai.vector.encode() is deprecated
Vector SEARCH clause [2026.01+] with db.index.vector.queryNodes() fallback on 5.x+
Hybrid semantic, lexical, and structural retrieval plus chunking strategies (fixed-size, sentence, semantic)

Adoption & trust: 1 install on skills.sh; 80 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Recommended Skills

Supabase Postgres Best Practices (supabase/agent-skills)

Supabase Postgres Best Practices is an MIT-licensed reference skill from Supabase that packages performance and reliabil… 217k installs · 2.2k stars

Lark Base (larksuite/cli)

Lark CLI skill for Feishu multidimensional tables, including schema, records, and analysis-oriented query patterns. 210k installs · 13.7k stars

Convex Migration Helper (get-convex/agent-skills)

Convex Migration Helper is an agent skill from the Convex toolkit that walks solo builders through safe schema and data … 61.9k installs · 31 stars

Neon Postgres (neondatabase/agent-skills)

neon-postgres guides coding agents through any Neon Serverless Postgres task: creating projects, choosing connection str… 38.3k installs · 68 stars

Firebase Firestore Standard (firebase/agent-skills)

firebase-firestore-standard is a comprehensive Firestore agent skill for solo builders who need Cloud Firestore Standard… 36.7k installs · 345 stars

Postgresql Table Design (wshobson/agents)

PostgreSQL Table Design is an agent skill that walks solo builders through designing or reviewing Postgres-specific sche… 18.5k installs · 36.5k stars

Journey fit

Primary fit

Build · Backend, data & payments

Vector index setup and Cypher query patterns are backend data-layer work during product build, before launch-scale traffic. Backend is the canonical shelf because the skill focuses on indexes, ingestion loops, and search procedures, not frontend UI or growth analytics.

Common Questions / FAQ

Is Neo4j Vector Index Skill safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

SKILL.md

SKILL.md - Neo4j Vector Index Skill

# neo4j-vector-index-skill

Skill for creating and querying vector indexes in Neo4j for semantic or structural similarity search.

**Covers:**
- Creating vector indexes: `CREATE VECTOR INDEX` with dimensions and similarity function
- Waiting for index `ONLINE` status; `SHOW VECTOR INDEXES`
- Embedding ingestion: Python batch loop with `UNWIND`, `db.create.setNodeVectorProperty`
- In-Cypher embedding with `ai.text.embed()` [2025.12+], which replaces the deprecated `genai.vector.encode()`
- Batch embedding procedure `ai.text.embedBatch()` for large datasets
- Vector search: `SEARCH` clause [2026.01+] and `db.index.vector.queryNodes()` procedure fallback
- Combining vector search with graph traversal (hybrid retrieval)
- Hybrid search, including semantic + lexical + structural sources
- Vector indexes over embeddings already written by GDS algorithms
- Chunking strategy before ingestion (fixed-size, sentence, semantic)
- Similarity function guidance: cosine vs euclidean; match the similarity metric your embedding model was trained with
- Common errors: wrong dimensions, index not ONLINE, provider null returns
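
As a rough illustration of the index-creation, batch-ingestion, and wrong-dimension items above, the following Python sketch builds the two Cypher statements and prepares validated batches. The index name, label, and property mirror the skill's own `chunk_embedding` example; the `id` key on `Chunk`, the 1536 dimension, the cosine choice, and the helper names `validate_rows` and `batched` are assumptions made for illustration only.

```python
from typing import Iterable, Iterator

# Assumed values for illustration; use your embedding model's real dimension.
EMBEDDING_DIM = 1536
SIMILARITY = "cosine"  # or "euclidean"

# Index definition: dimensions and similarity function live in indexConfig.
CREATE_INDEX = f"""
CREATE VECTOR INDEX chunk_embedding IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {{ indexConfig: {{
  `vector.dimensions`: {EMBEDDING_DIM},
  `vector.similarity_function`: '{SIMILARITY}'
}} }}
"""

# One round trip per batch: UNWIND the rows and write each embedding.
# The `id` lookup key is hypothetical; substitute your own stable key.
INGEST_BATCH = """
UNWIND $rows AS row
MATCH (c:Chunk {id: row.id})
CALL db.create.setNodeVectorProperty(c, 'embedding', row.embedding)
RETURN count(c) AS updated
"""


def validate_rows(rows: Iterable[dict], dim: int = EMBEDDING_DIM) -> None:
    """Fail early on the 'wrong dimensions' error before touching the database."""
    for row in rows:
        if len(row["embedding"]) != dim:
            raise ValueError(
                f"row {row['id']!r}: expected {dim} dimensions, "
                f"got {len(row['embedding'])}"
            )


def batched(rows: Iterable[dict], size: int) -> Iterator[list]:
    """Yield lists of at most `size` rows, preserving input order."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

With the official `neo4j` Python driver, each batch could then be sent with something like `driver.execute_query(INGEST_BATCH, rows=batch)`; that call is not exercised here and the batch size is left to the caller.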

**Version / compatibility:**
- `SEARCH` clause requires Neo4j 2026.01+; `db.index.vector.queryNodes` available 5.x+
- `ai.text.embed()` requires Neo4j 2025.12+ and CYPHER 25; `genai.vector.encode()` is deprecated
- Vector type is native in CYPHER 25; stored as `LIST<FLOAT>` in older versions
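
The version thresholds above can be turned into a small feature check. This is only a sketch: the helper names are hypothetical, it assumes a plain numeric `major.minor[.patch]` version string, and it follows the coarse "5.x+" wording of the list rather than an exact minor release.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Return (major, minor) from a string such as '5.26.0' or '2026.01.0'."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def available_features(version: str) -> dict[str, bool]:
    """Map each feature named in the compatibility list to availability."""
    v = parse_version(version)
    return {
        # Listed as "available 5.x+"; check the exact minor for your deployment.
        "db.index.vector.queryNodes": v >= (5, 0),
        # Also requires CYPHER 25 according to the list above.
        "ai.text.embed": v >= (2025, 12),
        "SEARCH": v >= (2026, 1),
    }
```

A caller might use the `SEARCH` flag to decide between the `SEARCH` clause and the `db.index.vector.queryNodes()` fallback.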

**Not covered:**
- Full `ai.text.*` plugin reference (completion, chat, structured output) → `neo4j-genai-plugin-skill`
- GraphRAG pipelines with `neo4j-graphrag` → `neo4j-graphrag-skill`
- Fulltext-only / keyword-only search → `neo4j-cypher-skill`
- Computing GDS node embedding algorithms (FastRP, GraphSAGE) → `neo4j-gds-skill`

**Install:**
```bash
npx skills add https://github.com/neo4j-contrib/neo4j-skills --skill neo4j-vector-index-skill
```

Or paste this link into your coding assistant:
https://github.com/neo4j-contrib/neo4j-skills/tree/main/neo4j-vector-index-skill


# Hybrid Search

Hybrid search is useful when one retrieval signal is not enough:
- Semantic vector search finds paraphrases; misses exact names, acronyms, codes, and domain terms.
- Lexical fulltext search finds exact words; misses related concepts that do not share words.
- Structural search uses graph topology, paths, communities, or GDS node embeddings; captures relationships text does not contain.

Combining ranked sources improves recall and can boost results that are supported by more than one signal. The common pattern is vector + fulltext, but the same query shape works for any two or more ranked/scored sources: several vector indexes, title/body fulltext indexes, GDS-written structural embeddings, graph-derived candidate scores, or external retrieval scores.

Use when the user asks for custom Cypher hybrid search, WRRF/RRF, vector + fulltext, semantic + lexical + structural search, multiple vector indexes, or combining two+ ranked/scored retrieval sources.

## When NOT to Use
- `neo4j-graphrag` package `HybridRetriever` / `HybridCypherRetriever` -> use `neo4j-graphrag-skill`
- Fulltext-only / keyword-only search -> use `neo4j-cypher-skill`
- Single vector search -> use main `neo4j-vector-index-skill`

## Rules
- Run each source independently.
- Rank each source by `score DESC, stable_id ASC`.
- Do not compare raw scores from different sources.
- Compute `contribution = sourceWeight / (rrfConstant + sourceRank)`.
- Sum contributions per node.
- Order final rows by `wrrf DESC, stable_id ASC`.
- Use `sourceK > finalK`; combine before final limiting.
- Use a stable unique property for tie breaks. If no stable key exists, add one before production use.
- Keep `LIMIT $sourceK` inside `SEARCH`; Cypher rejects a `LET` alias there.
- For structural vector sources, compute/write GDS embeddings first, then create a vector index over that property.

## Index Setup

Vector index:
```cypher
CYPHER 25
CREATE VECTOR INDEX chunk_embedding IF NOT EXISTS
FOR (c:Chunk) ON (c.embedding)
OPTIONS {
  indexConfig: {
    …
```
