Elasticsearch File Ingest

Author: elastic
Repository: elastic/agent-skills

Bulk-ingest JSON or log files into Elasticsearch with optional JavaScript transforms, index mappings, and validation before documents hit a cluster.

Overview

Elasticsearch File Ingest is an agent skill for the Operate phase that guides file-based bulk ingestion into Elasticsearch with mappings and optional JavaScript document transforms.

Install

npx skills add https://github.com/elastic/agent-skills --skill elasticsearch-file-ingest

What is this skill?

Elasticsearch index mapping example covering @timestamp, user.id/name/email, message, level, and keyword tags
Custom Node.js transform hooks that return null to skip invalid or test documents
Example split-transform pattern that fans one source record out into multiple indexed documents
CLI-oriented ingest.js workflow with --file, --target, and --transform flags
Example skip transform that rejects emails without an "@" and filters @test.com / @example.com rows before indexing

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 1.1k installs on skills.sh; 502 GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

You have JSON or log files to load into Elasticsearch but need mappings, skip rules, and split logic without ad-hoc one-off scripts.

Who is it for?

Indie builders operating Elasticsearch for logs, support data, or search who want agent-assisted ingest.js and transform examples.

Skip if: you are building a greenfield app with no Elasticsearch cluster yet, or your team only needs streaming ingest via Beats or runs a serverless-only stack.

When should I use this skill?

You need to ingest local JSON or log files into Elasticsearch with optional custom transform modules.

What do I get? / Deliverables

You can run a documented ingest path with transforms that filter invalid rows and shape documents before they land in the right target index.

Index mapping JSON aligned to your document shape
Transform module(s) for skip, validate, or split behaviors
Documented ingest command with --file, --target, and --transform

Journey fit

Primary fit

Operate: Infrastructure & cost

File-based ingest and index hygiene are production operations once you are shipping data into Elasticsearch for search and observability. Running ingest scripts, transforms, and target indices is infrastructure and pipeline work under Operate, not frontend or launch SEO.

Also useful

Build: Integrations & version control

How it compares

An agent skill for scripted file ingest and transforms, not a hosted Elastic Cloud wizard or a generic database bulk loader.

Common Questions / FAQ

Who is elasticsearch-file-ingest for?

Solo developers and small teams who operate Elasticsearch and need repeatable file ingest with optional validation and document shaping.

When should I use elasticsearch-file-ingest?

In the Operate phase (infrastructure), when backfilling logs, importing JSON dumps, or reindexing with skip/split transforms via ingest.js-style commands.

Is elasticsearch-file-ingest safe to install?

Ingest skills can involve cluster credentials and arbitrary transform code. Check the Security Audits panel on this page, and run transforms against non-production clusters first.
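One way to vet a transform before it touches any cluster is to dry-run it against a few sample lines locally. The following is a minimal sketch, assuming newline-delimited JSON input; `dryRun` and `sampleTransform` are hypothetical names for illustration and are not part of the skill.

```javascript
// Hypothetical dry-run helper: applies a transform to NDJSON text and
// reports what would be indexed, without contacting any cluster.
function dryRun(ndjsonText, transform) {
  const summary = { indexed: 0, skipped: 0, failed: 0 };
  for (const line of ndjsonText.split("\n")) {
    if (line.trim() === "") continue; // ignore blank lines
    try {
      const result = transform(JSON.parse(line));
      if (result === null || result === undefined) {
        summary.skipped += 1;
      } else {
        // A transform may return one document or an array of documents.
        summary.indexed += Array.isArray(result) ? result.length : 1;
      }
    } catch (err) {
      summary.failed += 1; // malformed JSON or a throwing transform
    }
  }
  return summary;
}

// Sample transform (assumption): keep documents whose email is a string
// that does not end with @test.com.
const sampleTransform = (doc) =>
  typeof doc.email === "string" && !doc.email.endsWith("@test.com") ? doc : null;

const sample = [
  '{"email":"ada@corp.io"}',
  '{"email":"qa@test.com"}',
  "not json",
].join("\n");

console.log(dryRun(sample, sampleTransform));
```

Counting failures separately from skips makes it easier to tell a deliberate filter apart from a transform that crashes on unexpected input.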

SKILL.md

SKILL.md - Elasticsearch File Ingest

{
  "properties": {
    "@timestamp": {
      "type": "date"
    },
    "user": {
      "properties": {
        "id": { "type": "keyword" },
        "name": { "type": "keyword" },
        "email": { "type": "keyword" }
      }
    },
    "message": {
      "type": "text"
    },
    "level": {
      "type": "keyword"
    },
    "tags": {
      "type": "keyword"
    }
  }
}
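
With Elasticsearch's default dynamic mapping, fields that a mapping does not declare are added automatically with guessed types, so it can help to check a sample document against the mapping before ingesting. The sketch below lists document fields that a mapping like the one above does not cover; `mappedFields` and `unmappedFields` are hypothetical helpers, not part of the skill.

```javascript
// Flatten a mapping's nested "properties" into dotted field paths.
function mappedFields(mapping, prefix = "") {
  const fields = [];
  for (const [name, def] of Object.entries(mapping.properties || {})) {
    const path = prefix + name;
    if (def.properties) {
      fields.push(...mappedFields(def, path + "."));
    } else {
      fields.push(path);
    }
  }
  return fields;
}

// List leaf paths in a document that the mapping does not declare.
// Arrays are treated as leaf values, matching how keyword fields hold lists.
function unmappedFields(doc, mapping) {
  const known = new Set(mappedFields(mapping));
  const walk = (value, prefix) =>
    Object.entries(value).flatMap(([key, v]) => {
      const path = prefix + key;
      const isObject = v !== null && typeof v === "object" && !Array.isArray(v);
      if (isObject) return walk(v, path + ".");
      return known.has(path) ? [] : [path];
    });
  return walk(doc, "");
}

// Abbreviated copy of the mapping above, plus an illustrative document.
const mapping = {
  properties: {
    "@timestamp": { type: "date" },
    user: { properties: { id: { type: "keyword" }, email: { type: "keyword" } } },
    tags: { type: "keyword" },
  },
};

const doc = {
  "@timestamp": "2024-01-01T00:00:00Z",
  user: { id: "u1", nickname: "ada" },
  tags: ["a", "b"],
  extra: 1,
};

console.log(unmappedFields(doc, mapping));
```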


/**
 * Example transform that conditionally skips documents.
 *
 * This validates documents and only indexes valid ones.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file data.json --target validated --transform examples/skip-transform.js
 */

export default function transform(doc) {
  // Skip documents without required fields
  if (!doc.email || !doc.name) {
    console.warn(`Skipping document without email or name:`, doc.id);
    return null;
  }

  // Skip invalid email addresses (non-string values or values without "@")
  if (typeof doc.email !== "string" || !doc.email.includes("@")) {
    console.warn(`Skipping document with invalid email:`, doc.email);
    return null;
  }

  // Skip test data
  if (doc.email.endsWith("@test.com") || doc.email.endsWith("@example.com")) {
    return null;
  }

  // Return the document if all validations pass
  return {
    ...doc,
    validated_at: new Date().toISOString(),
  };
}


/**
 * Example transform that splits one document into multiple documents.
 *
 * This example takes a tweet and creates a separate document for each hashtag.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file tweets.json --target hashtags --transform examples/split-transform.js
 */

export default function transform(doc) {
  // Extract hashtags from tweet text
  const hashtags = (doc.text || "").match(/#\w+/g) || [];

  // If no hashtags, skip this document
  if (hashtags.length === 0) {
    return null;
  }

  // Create one document per hashtag
  return hashtags.map((tag) => ({
    hashtag: tag.toLowerCase(),
    tweet_id: doc.id,
    user_id: doc.user_id,
    created_at: doc.created_at,
    original_text: doc.text,
  }));
}
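
The two transforms above return either null, a single document, or an array of documents. How scripts/ingest.js handles each case internally is not shown here, so the following is only a sketch of one plausible approach: normalize every result to a list, then emit the action/source line pairs that the Elasticsearch Bulk API expects. `toDocs`, `toBulkBody`, and the compact `splitByHashtag` copy are hypothetical.

```javascript
// Normalize a transform result: null -> [], object -> [object], array -> array.
function toDocs(result) {
  if (result === null || result === undefined) return [];
  return Array.isArray(result) ? result : [result];
}

// Build a Bulk API NDJSON body: one action line, then one source line,
// per output document. The body must end with a newline.
function toBulkBody(sourceDocs, transform, index) {
  const lines = [];
  for (const source of sourceDocs) {
    for (const doc of toDocs(transform(source))) {
      lines.push(JSON.stringify({ index: { _index: index } }));
      lines.push(JSON.stringify(doc));
    }
  }
  return lines.length ? lines.join("\n") + "\n" : "";
}

// Compact version of the hashtag split transform, for demonstration only.
const splitByHashtag = (doc) => {
  const hashtags = (doc.text || "").match(/#\w+/g) || [];
  if (hashtags.length === 0) return null;
  return hashtags.map((tag) => ({ hashtag: tag.toLowerCase(), tweet_id: doc.id }));
};

const tweets = [
  { id: 1, text: "Shipping #Search and #Logs today" },
  { id: 2, text: "no tags here" },
];

const body = toBulkBody(tweets, splitByHashtag, "hashtags");
console.log(body);
```

Here the first tweet fans out into two documents and the second is skipped, so the body holds two action/source pairs.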


/**
 * Example transform function that enriches documents during ingestion.
 *
 * Usage:
 *   node scripts/ingest.js ingest --file data.json --target my-index --transform examples/transform.js
 */

export default function transform(doc) {
  // Add processing metadata
  const enriched = {
    ...doc,
    processed_at: new Date().toISOString(),
    source: "batch-import",
  };

  // Combine first and last name if present
  if (doc.first_name && doc.last_name) {
    enriched.full_name = `${doc.first_name} ${doc.last_name}`;
  }

  // Extract the UTC year from the timestamp if it parses as a valid date
  if (doc.timestamp || doc["@timestamp"]) {
    const timestamp = doc.timestamp || doc["@timestamp"];
    const parsed = new Date(timestamp);
    if (!Number.isNaN(parsed.getTime())) {
      enriched.year = parsed.getUTCFullYear();
    }
  }

  // Normalize email to lowercase
  if (doc.email) {
    enriched.email = doc.email.toLowerCase();
  }

  return enriched;
}

// For CommonJS compatibility
// module.exports = transform;
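
Each usage line above passes a single module to --transform, so combining validation and enrichment means putting both in one function. A minimal sketch of a composition helper follows; `composeTransforms`, `requireEmail`, and `normalizeEmail` are hypothetical, and the helper assumes every step returns one document or null (array-returning split transforms would need extra handling).

```javascript
// Hypothetical helper: run transforms left to right; a null result
// from any step skips the document and stops the chain.
function composeTransforms(...steps) {
  return (doc) => {
    let current = doc;
    for (const step of steps) {
      current = step(current);
      if (current === null || current === undefined) return null;
    }
    return current;
  };
}

// Two small single-purpose steps for illustration.
const requireEmail = (doc) => (doc.email ? doc : null);
const normalizeEmail = (doc) => ({ ...doc, email: doc.email.toLowerCase() });

const combined = composeTransforms(requireEmail, normalizeEmail);

console.log(combined({ email: "Ada@Corp.IO" })); // { email: 'ada@corp.io' }
console.log(combined({ name: "no email" })); // null
```

Putting the validation step first means later steps can rely on the fields it checked.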


{
  "name": "elasticsearch-file-ingest",
  "version": "0.0.1",
  "description": "Agent skill for ingesting and transforming large data files (CSV/JSON/Parquet/Arrow IPC) into Elasticsearch indices. Stream-based ingestion and custom transformations.",
  "type": "module",
  "private": true,
  "dependencies": {
    "@elastic/elasticsearch": "^8.17.0",
    "node-es-transformer": "^1.2.2"
  }
}


# Common Ingestion Patterns

Detailed examples for common data ingestion scenarios.

## Pattern 1: Load CSV with Custom Mappings

```bash
# 1. Create mappings.json with your schema
cat > mappings.json << 'EOF'
{
  "properties": {
    "timestamp": { "type": "date" },
    "user_id": { "type": "keyword" },
    "action": { "type": "keyword" },
    "value": { "type": "double" }
  }
}
EOF

# 2. Ingest CSV (skip header row)
node scripts/ingest.js ingest \
  --file events.csv \
  --target events \
  --mappings mappings.json \
  --skip-header
```
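
CSV parsers commonly deliver every column as a string, so a transform can normalize values to match the date and double types declared in mappings.json. The sketch below is written under that assumption; `coerceEvent` is a hypothetical transform, and whether ingest.js already converts types is not stated here.

```javascript
// Hypothetical transform for the events.csv example: normalize the
// timestamp to ISO 8601 and the value to a number, skipping bad rows.
function coerceEvent(row) {
  const time = new Date(row.timestamp);
  const value = Number(row.value);
  // Number("") is 0, so an empty value column is rejected explicitly.
  if (Number.isNaN(time.getTime()) || row.value === "" || Number.isNaN(value)) {
    return null; // skip rows that would not fit the date/double mappings
  }
  return { ...row, timestamp: time.toISOString(), value };
}

const good = coerceEvent({
  timestamp: "2024-01-01T00:00:00Z",
  user_id: "u1",
  action: "click",
  value: "3.5",
});
const bad = coerceEvent({ timestamp: "not a date", user_id: "u2", action: "click", value: "1" });

console.log(good);
console.log(bad);
```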

## Pattern 2: Batch Ingest Multiple Files

```bash
# Ingest all JSON files in a directory
node scripts/ingest.js ingest \
  --file "logs/*.json" \
```
