
Elasticsearch File Ingest
Bulk-ingest JSON or log files into Elasticsearch with optional JavaScript transforms, index mappings, and validation before documents hit a cluster.
Overview
Elasticsearch File Ingest is an agent skill for the Operate phase that guides file-based bulk ingestion into Elasticsearch with mappings and optional JavaScript document transforms.
Install
npx skills add https://github.com/elastic/agent-skills --skill elasticsearch-file-ingest
What is this skill?
- Elasticsearch index mapping example covering @timestamp, user.id/name/email, message, level, and tags (keyword) fields
- Custom Node transform hooks that return null to skip invalid or test documents
- Example split-transform pattern to fan one source record into multiple indexed documents
- CLI-oriented ingest.js workflow: --file, --target, --transform flags
- Skip-transform example that requires email and name, rejects emails without "@", and filters @test.com / @example.com rows before indexing
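The transform behaviors listed above imply a small contract: a transform may return one document, null to skip, or an array to fan out. The following is a minimal sketch of how a runner might normalize those return values. `applyTransform`, `splitByHashtag`, and the sample records are hypothetical names introduced for illustration; the real ingest.js may handle results differently.

```javascript
// Hypothetical helper (not part of the skill): flattens whatever a
// transform returns into a single array of documents to index.
function applyTransform(transform, sourceDocs) {
  const out = [];
  for (const doc of sourceDocs) {
    const result = transform(doc);
    if (result === null || result === undefined) continue; // skipped document
    if (Array.isArray(result)) out.push(...result); // one record split into many
    else out.push(result); // one record, one document
  }
  return out;
}

// Split-style transform modeled on the hashtag example in this skill.
function splitByHashtag(doc) {
  const hashtags = (doc.text || "").match(/#\w+/g) || [];
  if (hashtags.length === 0) return null;
  return hashtags.map((tag) => ({ hashtag: tag.toLowerCase(), tweet_id: doc.id }));
}

const docs = applyTransform(splitByHashtag, [
  { id: 1, text: "Shipping #Elastic and #search" },
  { id: 2, text: "no tags here" },
]);
console.log(docs);
```

Treating null and arrays uniformly in one place keeps each transform module small: it only decides what a single source record becomes, not how the results are batched.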
Adoption & trust: 1.1k installs on skills.sh; 502 GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have JSON or log files to load into Elasticsearch but need mappings, skip rules, and split logic without ad-hoc one-off scripts.
Who is it for?
Indie builders operating Elasticsearch for logs, support data, or search who want agent-assisted ingest.js and transform examples.
Skip if: you are building a greenfield app with no Elasticsearch cluster, your team only needs streaming ingest via Beats, or you run a serverless-only stack.
When should I use this skill?
You need to ingest local JSON or log files into Elasticsearch with optional custom transform modules.
What do I get? / Deliverables
You can run a documented ingest path with transforms that filter invalid rows and shape documents before they land in the right target index.
- Index mapping JSON aligned to your document shape
- Transform module(s) for skip, validate, or split behaviors
- Documented ingest command with --file, --target, and --transform
Journey fit
File-based ingest and index hygiene are production operations once you are shipping data into Elasticsearch for search and observability. Running ingest scripts, transforms, and target indices is infrastructure and pipeline work under Operate, not frontend or launch SEO.
How it compares
This is an agent skill for scripted file ingest and transforms, not a hosted Elastic Cloud wizard or a generic database bulk loader.
Common Questions / FAQ
Who is elasticsearch-file-ingest for?
Solo developers and small teams who operate Elasticsearch and need repeatable file ingest with optional validation and document shaping.
When should I use elasticsearch-file-ingest?
Use it in the Operate phase (infrastructure work) when backfilling logs, importing JSON dumps, or reindexing with skip/split transforms via ingest.js-style commands.
Is elasticsearch-file-ingest safe to install?
Ingest skills can involve cluster credentials and arbitrary transform code. Check the Security Audits panel on this page, and run transforms only against non-production clusters first.
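One low-risk way to act on that advice is to exercise a transform against a few local sample records before any cluster is involved. The sketch below is an illustration under stated assumptions: `dryRun`, `validate`, and the sample records are hypothetical, the validation rules mirror the skill's skip-transform example, and nothing here connects to Elasticsearch.

```javascript
// Validation rules mirroring the skip-transform example. No timestamp is
// added here so the dry-run output stays deterministic.
function validate(doc) {
  if (!doc.email || !doc.name) return null;
  if (!doc.email.includes("@")) return null;
  if (doc.email.endsWith("@test.com") || doc.email.endsWith("@example.com")) return null;
  return { ...doc };
}

// Hypothetical dry-run helper: runs a transform over sample records and
// reports what would be indexed, skipped, or rejected by a thrown error.
function dryRun(transform, samples) {
  const report = { indexed: [], skipped: 0, errors: [] };
  for (const sample of samples) {
    try {
      const result = transform(sample);
      if (result == null) report.skipped += 1; // null or undefined means skip
      else report.indexed.push(...[].concat(result)); // object or array of objects
    } catch (err) {
      report.errors.push({ sample, message: err.message });
    }
  }
  return report;
}

const report = dryRun(validate, [
  { id: 1, name: "Ada", email: "ada@corp.io" },
  { id: 2, name: "Bot", email: "bot@test.com" },
  { id: 3, name: "NoEmail" },
  { id: 4, name: "Num", email: 42 }, // non-string email makes includes() throw
]);
console.log(report.indexed.length, report.skipped, report.errors.length);
```

The last sample shows why a dry run is worth the effort: a non-string email passes the presence check but throws on `includes`, which is cheaper to discover locally than partway through a bulk load.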
SKILL.md - Elasticsearch File Ingest
```json
{
  "properties": {
    "@timestamp": { "type": "date" },
    "user": {
      "properties": {
        "id": { "type": "keyword" },
        "name": { "type": "keyword" },
        "email": { "type": "keyword" }
      }
    },
    "message": { "type": "text" },
    "level": { "type": "keyword" },
    "tags": { "type": "keyword" }
  }
}
```

```javascript
/**
 * Example transform that conditionally skips documents.
 *
 * This validates documents and only indexes valid ones.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file data.json --target validated --transform examples/skip-transform.js
 */
export default function transform(doc) {
  // Skip documents without required fields
  if (!doc.email || !doc.name) {
    console.warn(`Skipping document without email or name:`, doc.id);
    return null;
  }

  // Skip invalid email addresses
  if (!doc.email.includes("@")) {
    console.warn(`Skipping document with invalid email:`, doc.email);
    return null;
  }

  // Skip test data
  if (doc.email.endsWith("@test.com") || doc.email.endsWith("@example.com")) {
    return null;
  }

  // Return the document if all validations pass
  return {
    ...doc,
    validated_at: new Date().toISOString(),
  };
}
```

```javascript
/**
 * Example transform that splits one document into multiple documents.
 *
 * This example takes a tweet and creates a separate document for each hashtag.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file tweets.json --target hashtags --transform examples/split-transform.js
 */
export default function transform(doc) {
  // Extract hashtags from tweet text
  const hashtags = (doc.text || "").match(/#\w+/g) || [];

  // If no hashtags, skip this document
  if (hashtags.length === 0) {
    return null;
  }

  // Create one document per hashtag
  return hashtags.map((tag) => ({
    hashtag: tag.toLowerCase(),
    tweet_id: doc.id,
    user_id: doc.user_id,
    created_at: doc.created_at,
    original_text: doc.text,
  }));
}
```

```javascript
/**
 * Example transform function that enriches documents during ingestion.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file data.json --target my-index --transform examples/transform.js
 */
export default function transform(doc) {
  // Add processing metadata
  const enriched = {
    ...doc,
    processed_at: new Date().toISOString(),
    source: "batch-import",
  };

  // Combine first and last name if present
  if (doc.first_name && doc.last_name) {
    enriched.full_name = `${doc.first_name} ${doc.last_name}`;
  }

  // Extract year from timestamp if present
  if (doc.timestamp || doc["@timestamp"]) {
    const timestamp = doc.timestamp || doc["@timestamp"];
    enriched.year = new Date(timestamp).getFullYear();
  }

  // Normalize email to lowercase
  if (doc.email) {
    enriched.email = doc.email.toLowerCase();
  }

  return enriched;
}

// For CommonJS compatibility
// module.exports = transform;
```

```json
{
  "name": "elasticsearch-file-ingest",
  "version": "0.0.1",
  "description": "Agent skill for ingesting and transforming large data files (CSV/JSON/Parquet/Arrow IPC) into Elasticsearch indices. Stream-based ingestion and custom transformations.",
  "type": "module",
  "private": true,
  "dependencies": {
    "@elastic/elasticsearch": "^8.17.0",
    "node-es-transformer": "^1.2.2"
  }
}
```

# Common Ingestion Patterns

Detailed examples for common data ingestion scenarios.

## Pattern 1: Load CSV with Custom Mappings

```bash
# 1. Create mappings.json with your schema
cat > mappings.json << 'EOF'
{
  "properties": {
    "timestamp": { "type": "date" },
    "user_id": { "type": "keyword" },
    "action": { "type": "keyword" },
    "value": { "type": "double" }
  }
}
EOF

# 2. Ingest CSV (skip header row)
node scripts/ingest.js ingest \
  --file events.csv \
  --target events \
  --mappings mappings.json \
  --skip-header
```

## Pattern 2: Batch Ingest Multiple Files

```bash
# Ingest all JSON files in a directory
node scripts/ingest.js ingest \
  --file "logs/*.json" \