Csv Excel Merger

Name: Csv Excel Merger
Author: onewave-ai

onewave-ai/claude-skills

1k installs
229 repo stars
Updated July 15, 2026

csv-excel-merger is a Claude Code skill that combines multiple CSV or Excel files with automatic column alignment and deduplication for developers who consolidate spreadsheet exports.

About

csv-excel-merger is a data-consolidation skill that merges multiple CSV, Excel, or TSV files with column matching, deduplication, and conflict resolution across differing schemas. The agent first checks how many files are present, their formats, their column names, and whether the schemas align, then produces a single combined dataset. Developers reach for csv-excel-merger when SaaS exports, survey results, or departmental spreadsheets must become one analyzable table without manual VLOOKUP work. The skill handles differing column orders, mixed formats, and duplicate rows so downstream scripts, dashboards, or imports receive a clean unified file.

Automatic column matching using exact, case-insensitive, and fuzzy logic
Intelligent conflict resolution with keep-first, keep-last, and custom rules
Handles mismatched schemas, different formats (CSV, Excel, TSV), and encoding detection
Performs data deduplication based on primary key identification
Creates consolidated output with full audit trail of merge decisions

Csv Excel Merger by the numbers

1,009 all-time installs (skills.sh)
+19 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #295 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

npx skills add https://github.com/onewave-ai/claude-skills --skill csv-excel-merger

Installs	1k
repo stars	★ 229
Security audit	3 / 3 scanners passed
Last updated	July 15, 2026
Repository	onewave-ai/claude-skills ↗

How do you merge CSV files with different columns?

The skill reads each file's header, maps differing column names onto one unified schema, stacks the rows, and removes duplicates on a primary key, producing a single CSV or Excel file.

Who is it for?

Developers combining recurring exports, CRM dumps, or survey spreadsheets before analysis or database import.

Skip if: you run real-time streaming ETL pipelines, or your database already enforces a single normalized schema at ingest.

When should I use this skill?

Use it when you need to merge spreadsheets, combine data exports, or consolidate multiple CSV or Excel files into one.

What you get

A single consolidated CSV or Excel file with aligned columns and duplicates removed.

merged CSV file
merged Excel workbook

Files

SKILL.md (Markdown)

CSV/Excel Merger

Merge multiple CSV or Excel files with automatic column matching, deduplication, and conflict resolution.

Workflow — the step-by-step merge process
Verification — confirm the merge before handing it back
Special cases — encoding, compound keys, large files
Guidelines — quality and transparency standards
Example triggers
references/merge_strategies.md — column matching, conflict resolution, and dedup options
references/output_template.md — the merge-report format

Workflow

1. Inspect the inputs. Determine file count, format (CSV / Excel / TSV), and whether the files are attached or read from disk. Read each header; identify column names, data types, and encoding (UTF-8, Latin-1). Note the candidate primary key.

2. Plan the merge. Match columns across files to one unified schema, choose a conflict-resolution rule, and pick a deduplication strategy. See references/merge_strategies.md for the matching heuristics and the full set of options.

3. Execute the merge with pandas:

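   import pandas as pd

   # Read both inputs; pass encoding="latin-1" if the UTF-8 default fails
   df1 = pd.read_csv("file1.csv")
   df2 = pd.read_csv("file2.csv")

   # Normalize, then map column names onto the unified schema
   for df in (df1, df2):
       df.columns = df.columns.str.strip().str.lower()
   df2 = df2.rename(columns={"firstname": "first_name", "e_mail": "email"})

   # Stack the rows, then keep the most recent row per key.
   # Note: drop_duplicates treats missing keys as equal, so rows with a
   # null email collapse to one; set them aside first if they must survive.
   merged = pd.concat([df1, df2], ignore_index=True)
   merged = merged.drop_duplicates(subset=["email"], keep="last")
   merged.to_csv("merged_output.csv", index=False)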

4. Verify the result before reporting — see Verification.

5. Report using the layout in references/output_template.md, then offer export options: CSV (UTF-8), Excel (.xlsx), JSON, SQL INSERT statements, or Parquet for large datasets.
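
Step 1 of the workflow (reading each header and settling on an encoding) could look roughly like the sketch below. It uses only the standard library; `sniff_header` is a hypothetical helper name, and the UTF-8 then Latin-1 order is an assumption based on the two encodings the step mentions, not a documented part of the skill.

```python
import csv
import io

def sniff_header(raw: bytes, delimiter: str = ","):
    """Return (encoding, column_names) for the first row of a delimited file."""
    # Try UTF-8 first ("utf-8-sig" also strips a byte-order mark). Latin-1 can
    # decode any byte sequence, so it is a last-resort fallback, not a detection.
    for encoding in ("utf-8-sig", "latin-1"):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        header = next(csv.reader(io.StringIO(text), delimiter=delimiter), [])
        return encoding, [name.strip() for name in header]
    raise ValueError("could not decode input")  # not reached: latin-1 always decodes

# Illustrative in-memory inputs; real code would read bytes from each file.
utf8_bytes = "Email,First Name\na@x.com,Ana\n".encode("utf-8")
latin1_bytes = "Email,Prénom\nb@x.com,Zoé\n".encode("latin-1")

utf8_result = sniff_header(utf8_bytes)
latin1_result = sniff_header(latin1_bytes)
```

The encoding that succeeds can then be passed to `pd.read_csv(path, encoding=...)`, and `delimiter="\t"` covers TSV inputs.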

Verification

Never hand back a merge without checking it. After merging, assert the row math holds and the key is actually unique:

total_in = len(df1) + len(df2)
assert len(merged) > 0, "merge produced an empty frame"
assert len(merged) <= total_in, "more rows than inputs — check the concat/join"
assert merged["email"].is_unique, "duplicate keys remain after dedup"

print(f"in: {total_in} rows | out: {len(merged)} rows | removed: {total_in - len(merged)}")
print(f"null keys: {merged['email'].isna().sum()} | columns: {list(merged.columns)}")

Report rows in vs. out, duplicates removed, and per-column completeness so the user can sanity-check the numbers against their own expectations.
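
Per-column completeness can be computed in one line with pandas. The sketch below runs on a small hypothetical frame named `sample`; in practice the same expression would be applied to the merged frame.

```python
import pandas as pd

# Hypothetical merged data, used only to illustrate the calculation.
sample = pd.DataFrame({
    "email": ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
    "phone": ["555-0100", None, "555-0102", None],
    "company": ["Acme", "Globex", None, "Initech"],
})

# notna().mean() gives the fraction of non-null cells in each column.
completeness = (sample.notna().mean() * 100).round(1)
print(completeness.to_string())
```

Empty strings count as present here, so whitespace-only cells should be converted to missing values first if they are meant to lower the score.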

Special cases

Compound keys — when no single column is unique, key on a tuple: subset=["email", "company"].
Mixed data types — standardize dates, phone numbers, and country codes; strip whitespace and normalize casing before deduping, or near-duplicates slip through.
Missing columns — fill absent columns with empty values and flag them in the report; never silently drop data.
Large files (>100MB) — read in chunks (pd.read_csv(path, chunksize=...)), report progress, and estimate memory before loading everything at once.
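
For the large-file case, one possible approach is to deduplicate while streaming, so the full file never sits in memory at once. This is a sketch under assumptions: `dedup_in_chunks` is a hypothetical helper, and it accepts a list of key columns so compound keys work the same way as single keys.

```python
import io
import pandas as pd

def dedup_in_chunks(source, key, chunksize):
    """Stream a CSV in chunks, keeping the last row seen for each key."""
    kept = None
    for chunk in pd.read_csv(source, chunksize=chunksize):
        # Re-deduplicate the running result so a key repeated in a later
        # chunk replaces the earlier row instead of surviving twice.
        kept = chunk if kept is None else pd.concat([kept, chunk], ignore_index=True)
        kept = kept.drop_duplicates(subset=key, keep="last")
    return kept.reset_index(drop=True)

# Tiny in-memory stand-in for a large file, read two rows at a time.
csv_text = (
    "email,company\n"
    "a@x.com,Acme\n"
    "b@x.com,Globex\n"
    "a@x.com,Acme Corp\n"
    "c@x.com,Initech\n"
)
result = dedup_in_chunks(io.StringIO(csv_text), key=["email"], chunksize=2)
```

Memory use still grows with the number of distinct keys, so this helps most when the input contains many duplicates.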

Guidelines

Column matching — prefer exact, then case-insensitive, then fuzzy. Always emit the original → unified mapping so every match is auditable, and allow manual override.
Data quality — trim whitespace, standardize formats, flag invalid values, preserve types.
Transparency — track the source file for every surviving row, log each merge decision, and report all conflicts with their resolutions.
Performance — chunk large files, process in batches, and show progress on long-running merges.
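
The transparency guideline (tracking the source file for every surviving row) can be met by tagging rows before they are stacked. A minimal sketch, assuming the inputs are already loaded into a dict keyed by file name; the frames below are made-up illustrations.

```python
import pandas as pd

# Hypothetical inputs keyed by file name; real code would read these from disk.
frames = {
    "contacts_jan.csv": pd.DataFrame(
        {"email": ["a@x.com", "b@x.com"], "phone": ["555-0100", "555-0101"]}
    ),
    "contacts_feb.csv": pd.DataFrame(
        {"email": ["b@x.com", "c@x.com"], "phone": ["555-0199", "555-0102"]}
    ),
}

# Tag each row with its origin before concatenating, so the label survives dedup.
tagged = [df.assign(source=name) for name, df in frames.items()]
combined = pd.concat(tagged, ignore_index=True)
merged_tracked = (
    combined.drop_duplicates(subset=["email"], keep="last").reset_index(drop=True)
)
```

Because the dict preserves insertion order, `keep="last"` favors rows from the file added last.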

Example triggers

"Merge these three CSV files"
"Combine multiple Excel sheets into one file"
"Deduplicate and merge customer data"
"Join spreadsheets with different column names"
"Consolidate contact lists from different sources"

Merge Strategies

Reference for the csv-excel-merger skill. Covers column matching, conflict resolution, and deduplication.

Column matching

Map columns from different files onto a single unified schema, in order of confidence:

Exact — email = email
Case-insensitive — Email = email
Fuzzy — E-mail ≈ email

Common groupings seen in real data:

Unified	Variants
`first_name`	`firstname`, `First Name`, `fname`
`last_name`	`lastname`, `Last Name`, `lname`
`email`	`e-mail`, `email_address`, `Email`
`phone`	`phone_number`, `mobile`, `tel`
`company`	`organization`, `org`
`title`	`job_title`, `position`

Always emit the original → unified mapping in the report so the matching is auditable, and let the user override it.
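
The three matching tiers above could be sketched with the standard library's `difflib`. The `match_column` and `normalize` helpers, the `UNIFIED` list, and the 0.8 cutoff are illustrative assumptions, not the skill's actual matcher.

```python
import difflib
import re

UNIFIED = ["first_name", "last_name", "email", "phone", "company", "title"]

def normalize(name: str) -> str:
    """Lowercase and drop spaces, hyphens, and underscores for comparison."""
    return re.sub(r"[\s\-_]+", "", name.strip().lower())

def match_column(name: str, cutoff: float = 0.8):
    """Return (unified_name, method), or (None, 'unmatched')."""
    if name in UNIFIED:
        return name, "exact"
    lowered = {u.lower(): u for u in UNIFIED}
    if name.strip().lower() in lowered:
        return lowered[name.strip().lower()], "case-insensitive"
    normalized = {normalize(u): u for u in UNIFIED}
    close = difflib.get_close_matches(
        normalize(name), list(normalized), n=1, cutoff=cutoff
    )
    if close:
        return normalized[close[0]], "fuzzy"
    return None, "unmatched"

# Record the method alongside each match so the mapping stays auditable.
mapping = {
    col: match_column(col)
    for col in ["email", "Email", "E-mail", "First Name", "mobile"]
}
```

String similarity cannot recover synonyms such as `mobile` for `phone`, so a variant table like the one above, plus manual override, is still needed for those.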

Conflict resolution

When the same record appears in multiple files with differing values:

Keep first — value from the first file
Keep last — value from the last (most recent) file
Keep longest — the most complete value
Merge — combine non-conflicting fields across sources
Manual review — flag the conflict for the user to resolve
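
As a rough illustration of these rules, conflicts can be detected by counting distinct values per key, and a different rule can be applied to each column. The data and the `longest` helper below are hypothetical.

```python
import pandas as pd

# Hypothetical rows stacked from two files, already mapped to the unified schema.
rows = pd.DataFrame({
    "email": ["john.doe@example.com", "john.doe@example.com",
              "ann@example.com", "ann@example.com"],
    "phone": ["(555) 123-4567", "(555) 987-6543", "555-0100", "555-0100"],
    "company": ["Acme", "Acme Corporation", "Globex", "Globex"],
})

# A conflict is a key whose rows disagree on some column (nunique ignores nulls).
distinct = rows.groupby("email").nunique()
conflicts = distinct[(distinct > 1).any(axis=1)]

def longest(values: pd.Series):
    """Keep-longest rule: the non-null value with the most characters."""
    present = values.dropna().astype(str)
    return max(present, key=len) if len(present) else None

# Per-column rules: keep-last for phone, keep-longest for company.
resolved = rows.groupby("email").agg({"phone": "last", "company": longest})
```

The index of `conflicts` lists the keys to include in the report or to flag for manual review.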

Deduplication

Identify duplicates by primary key, then choose:

keep first / keep last / keep all
merge values — fold complementary fields into one row

Track the source file for every surviving row so data lineage is preserved.
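
The "merge values" option might be implemented with a grouped aggregation, since pandas' `first` aggregation returns the first non-null value per group. This is a sketch on made-up data; the semicolon-joined `source` format is an assumption.

```python
import pandas as pd

# Hypothetical duplicate records that each hold part of the picture.
dupes = pd.DataFrame({
    "email": ["a@x.com", "a@x.com", "b@x.com"],
    "phone": ["555-0100", None, "555-0101"],
    "company": [None, "Acme", "Globex"],
    "source": ["contacts_jan.csv", "contacts_feb.csv", "contacts_feb.csv"],
})

folded = (
    dupes.groupby("email", sort=False)
    .agg({
        "phone": "first",    # first non-null value per key
        "company": "first",
        "source": lambda s: ";".join(s.unique()),  # keep every contributing file
    })
    .reset_index()
)
```

Joining the contributing file names keeps lineage intact even when one output row draws on several inputs.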

Merge Report Template

Reference for the csv-excel-merger skill. Use this layout when reporting a completed merge.

CSV/EXCEL MERGER REPORT

INPUT FILES
  File 1: contacts_jan.csv    — 1,245 rows, 8 cols (name, email, phone, company, ...)
  File 2: contacts_feb.csv    —   987 rows, 9 cols (firstname, lastname, email, mobile, ...)
  File 3: leads_export.xlsx   — 2,103 rows, 12 cols (full_name, email_address, phone, ...)

COLUMN MAPPING (unified schema)
  first_name  <- firstname, first name, fname
  last_name   <- lastname, last name, lname
  email       <- email, e-mail, email_address
  phone       <- phone, mobile, phone_number, tel
  company     <- company, organization, org
  title       <- title, job_title, position
  source      <- file-origin tracking

MERGE ANALYSIS
  Rows before merge:   4,335
  Duplicates found:      892
  Conflicts detected:     47
  Primary key:         email
  Dedup strategy:      keep most recent (by source file date)

CONFLICTS (top 10)
  john.doe@example.com
    File 1 phone: (555) 123-4567
    File 2 phone: (555) 987-6543
    -> kept most recent (File 2)

RESULTS
  Output:      merged_contacts.csv
  Total rows:  3,443
  Columns:     7
  Removed:     892 duplicates

  By source:
    contacts_jan.csv    1,245 rows (819 unique)
    contacts_feb.csv      987 rows (521 unique)
    leads_export.xlsx   2,103 rows (2,103 unique)

  Completeness:
    email   98.2%
    phone   87.5%
    company 91.3%

RECOMMENDATIONS
  - Review 47 conflict records manually
  - Standardize phone number format
  - Fill missing company names (8.7% incomplete)
  - Export conflicts to conflicts_review.csv

FAQ

Can csv-excel-merger handle different schemas?

csv-excel-merger matches columns intelligently across CSV and Excel files with differing schemas, then deduplicates rows and resolves conflicts before writing one consolidated output file.

Which formats does csv-excel-merger support?

csv-excel-merger works with CSV, Excel, and TSV inputs. The skill inspects how many files are provided, their formats, and whether column names align before merging.

Is Csv Excel Merger safe to install?

skills.sh reports that 3 of 3 security scanners passed, and its security screen rates the skill MEDIUM risk. Review the Security Audits panel on this page before installing in production.

Tags: Data Science & ML, databases, analytics, pipelines