Parallel Data Enrichment

Name: Parallel Data Enrichment
Author: parallel-web

parallel-web/parallel-agent-skills

10.8k installs
62 repo stars
Updated July 17, 2026

Web-sourced bulk data enrichment that adds fields like CEO name, founding year, funding amount, and contact info to lists of companies, people, or products.

About

Bulk data enrichment tool that augments CSV files or inline datasets with web-sourced information (CEO names, founding years, recent funding, contact details). Developers use it to enrich lead lists, company databases, or people records without manual research. It supports multi-turn workflows by chaining to prior research tasks via interaction IDs. It runs through parallel-cli in --no-wait mode for asynchronous execution; results are fetched later with a poll command that waits up to 9 minutes per invocation and can be re-run. Output is always a JSON array of {input, output} objects, regardless of input format.

Enriches CSV or inline JSON data with web-sourced fields (CEO, funding, contact info) in bulk
Multi-turn support via --previous-interaction-id to carry context from prior research tasks
Async execution with --no-wait; developers get taskgroup_id and monitoring URL immediately
Auto-suggest enrichment columns via enrich suggest if user intent is vague
9-minute polling timeout; can re-run poll command if dataset enrichment takes longer

Parallel Data Enrichment by the numbers

10,839 all-time installs (skills.sh)
+341 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #11 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: HIGH risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

parallel-data-enrichment capabilities & compatibility

Depends on parallel-cli balance; check it with parallel-cli balance get and add funds with parallel-cli balance add <amount_cents>.

Capabilities: bulk CSV or JSON enrichment with web-sourced data · auto-suggest enrichment columns from vague intent · multi-turn context chaining from prior research · async execution with taskgroup tracking and monitoring · polling with resumable timeout for long running
Use cases: data analysis · research · web scraping
Platforms: macOS · Windows · Linux · WSL
Runs: Remote server
Pricing: Bring your own API key

From the docs

What parallel-data-enrichment says it does

Adds web-sourced fields (CEO names, funding, contact info) to lists of companies, people, or products.

skill:parallel-web/parallel-agent-skills#parallel-data-enrichment description

Supports multi-turn: pass --previous-interaction-id from a prior research task to carry context forward.

skill:parallel-web/parallel-agent-skills#parallel-data-enrichment description

npx skills add https://github.com/parallel-web/parallel-agent-skills --skill parallel-data-enrichment

Installs	10.8k
repo stars	★ 62
Security audit	2 / 3 scanners passed
Last updated	July 17, 2026
Repository	parallel-web/parallel-agent-skills ↗

What it does

Add web-sourced fields like CEO names, funding amounts, and contact info to CSV lists of companies or people.

Who is it for?

Enriching company or person datasets before lead scoring, market research, or CRM uploads; carrying context forward from prior research tasks.

Skip if: Real-time streaming enrichment; single-record lookups; building a searchable database of entities.

When should I use this skill?

User has a CSV or list needing bulk field additions; user references a prior research task they want to chain context from.

What you get

Enriched dataset as JSON array with original fields plus new web-sourced columns, ready for analytics or CRM import.

JSON file with {input, output} array; input contains original fields, output contains enriched fields; row count of succ

By the numbers

Enrichment may take several minutes depending on row count and field count
9-minute polling timeout per command invocation
Output is always JSON regardless of input or target extension

Files

SKILL.md (Markdown)

Data Enrichment

Enrich: $ARGUMENTS

Before starting

Inform the user that enrichment may take several minutes depending on the number of rows and fields requested.

Optional: Suggest output columns

If the user gave a vague intent ("enrich these companies with useful info") and you're not sure what columns to add, ask the API for a suggestion before kicking off the run:

parallel-cli enrich suggest "Find CEO and recent funding info" --json

The response is an envelope: {title, processor, enriched_columns, warnings}. Extract just the `enriched_columns` array (not the whole envelope) and pass it as the value of --enriched-columns on enrich run, in place of `--intent` — the two flags are alternative ways to specify what to enrich, not combined. If suggest returned a processor, pass it through explicitly via --processor on the run call (it's a tuned recommendation for the schema). Skip this whole section if the user already specified the fields they want.
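The hand-off from suggest to run can be sketched as follows. This is a minimal illustration, not the CLI's documented contract: it assumes --enriched-columns accepts the array serialized as a JSON string, and the sample envelope (column objects, processor name) is hypothetical. The helper name build_run_args_from_suggestion is also made up for this example.

```python
import json


def build_run_args_from_suggestion(envelope_json, data, target):
    """Build argv for `enrich run` from an `enrich suggest --json` envelope."""
    envelope = json.loads(envelope_json)
    args = [
        "parallel-cli", "enrich", "run",
        "--data", json.dumps(data),
        # Forward only the enriched_columns array, never the whole envelope.
        # Assumption: the flag takes the array as a JSON string.
        "--enriched-columns", json.dumps(envelope["enriched_columns"]),
        "--target", target,
        "--no-wait", "--json",
    ]
    # --intent is deliberately absent: it is an alternative to --enriched-columns.
    processor = envelope.get("processor")
    if processor:
        # Pass the recommended processor through only when suggest returned one.
        args += ["--processor", processor]
    return args


# Hypothetical envelope; real column objects and processor names may differ.
sample_envelope = json.dumps({
    "title": "Company leadership and funding",
    "processor": "example-processor",
    "enriched_columns": [{"name": "ceo", "description": "Current CEO"}],
    "warnings": [],
})
args = build_run_args_from_suggestion(
    sample_envelope, [{"company": "Example Co"}], "output.csv"
)
print(args)
```

Building the command as a list keeps the JSON values intact when it is handed to a process launcher without a shell.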

enrich suggest requires parallel-cli ≥ 0.3.0. If it errors with anything resembling no such command / No such command / unknown command, do not bail — skip the suggestion step, fall through to step 1 with --intent, complete the run, and mention parallel-cli update (or pipx upgrade parallel-web-tools) in the final response so the user picks up the feature next time.
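One way to recognize the "command not available" case is a case-insensitive match on the error text. The sketch below assumes the message arrives as text alongside a non-zero exit code; which stream and exit code parallel-cli actually uses is not stated here, and the function name suggest_unavailable is hypothetical.

```python
import re

# Matches "no such command", "No such command", and "unknown command".
_MISSING_COMMAND = re.compile(r"no such command|unknown command", re.IGNORECASE)


def suggest_unavailable(returncode, error_text):
    """Return True when a failed call looks like a missing `enrich suggest`."""
    return returncode != 0 and bool(_MISSING_COMMAND.search(error_text))


# Illustrative messages only; the exact wording of the CLI may differ.
print(suggest_unavailable(2, "Error: No such command 'suggest'."))
print(suggest_unavailable(1, "HTTP 403 Forbidden"))
```

When the check returns True, the run proceeds with --intent and the final response mentions the upgrade command; any other failure should be handled as a real error.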

Step 1: Start the enrichment

Use ONE of these command patterns (substitute user's actual data):

For inline data:

parallel-cli enrich run --data '[{"company": "Google"}, {"company": "Microsoft"}]' --intent "CEO name and founding year" --target "output.csv" --no-wait --json

For CSV file:

parallel-cli enrich run --source-type csv --source "input.csv" --target "output.csv" --source-columns '[{"name": "company", "description": "Company name"}]' --intent "CEO name and founding year" --no-wait --json

If this is a follow-up to a previous research task and you have its interaction_id, add context chaining:

parallel-cli enrich run --data '...' --intent "..." --target "output.csv" --no-wait --json --previous-interaction-id "$INTERACTION_ID"

The enrichment will run with the full context of that prior research — so you can enrich entities discovered earlier without restating what was already found. Note: enrichment does not itself produce a new interaction_id, so you cannot chain a further follow-up off of an enrichment.

IMPORTANT: Always include --no-wait so the command returns immediately instead of blocking.
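When the user's data contains quotes or apostrophes, hand-writing the --data argument is error-prone. A minimal sketch of assembling the inline-data command programmatically is shown below; it uses only the flags from the patterns above, and the helper name build_inline_run_args and the sample rows are illustrative.

```python
import json
import shlex


def build_inline_run_args(rows, intent, target, previous_interaction_id=None):
    """Build argv for the inline-data pattern; --no-wait is always included."""
    args = [
        "parallel-cli", "enrich", "run",
        "--data", json.dumps(rows),
        "--intent", intent,
        "--target", target,
        "--no-wait", "--json",
    ]
    if previous_interaction_id:
        # Optional context chaining from a prior research task.
        args += ["--previous-interaction-id", previous_interaction_id]
    return args


# Sample rows; the apostrophe shows why manual shell quoting is fragile.
rows = [{"company": "O'Reilly Media"}, {"company": "Microsoft"}]
args = build_inline_run_args(rows, "CEO name and founding year", "output.csv")

# shlex.join renders a copy-pasteable shell line with correct quoting.
print(shlex.join(args))
```

The list form can be passed directly to a process launcher without a shell, which avoids quoting problems altogether.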

Parse the --json output to extract taskgroup_id and url. The output is {taskgroup_id, url, num_runs} — there is no interaction_id field, do not look for one. Immediately tell the user:

Enrichment has been kicked off
The monitoring URL where they can track progress

Tell them they can background the polling step to continue working while it runs.
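Reading the start response might look like the sketch below. The field names come from the output shape described above; the sample payload values and the function name parse_start_output are placeholders.

```python
import json


def parse_start_output(stdout_text):
    """Extract taskgroup_id and url from the `enrich run --json` output."""
    payload = json.loads(stdout_text)
    # The payload has taskgroup_id, url, and num_runs; there is no interaction_id.
    return payload["taskgroup_id"], payload["url"]


# Placeholder payload for illustration only.
sample_stdout = json.dumps({
    "taskgroup_id": "tg_example",
    "url": "https://example.invalid/monitor/tg_example",
    "num_runs": 2,
})
taskgroup_id, url = parse_start_output(sample_stdout)
print(f"Enrichment has been kicked off. Track progress at: {url}")
```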

Step 2: Poll for results

Pick a concrete output path (e.g., /tmp/enrichment-acme.json). Note: the file is JSON regardless of the extension you choose — it's an array of {input, output} objects, not a CSV. Name it .json to avoid confusing yourself or the user.

parallel-cli enrich poll "$TASKGROUP_ID" --timeout 540 --output "/tmp/enrichment-<descriptive-name>.json"

Important:

Use --timeout 540 (9 minutes) to stay within tool execution limits
The --target from step 1 is unused in --no-wait mode — only --output here determines where results are saved, and the file is always JSON

If the poll times out

Enrichment of large datasets can take longer than 9 minutes. If the poll exits without completing:

1. Tell the user the enrichment is still running server-side
2. Re-run the same parallel-cli enrich poll command to continue waiting
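The re-run behavior can be sketched as a small loop. This rests on an assumption not confirmed by the text: that the poll command writes the output file only once the enrichment has completed, so a parseable JSON array at the output path is treated as the completion signal. The names poll_until_done and output_is_complete are hypothetical, and the poll call is injected so the example runs without the CLI.

```python
import json
import tempfile
from pathlib import Path


def output_is_complete(path):
    """Assumed completion signal: the output file exists and holds a JSON array."""
    if not path.exists():
        return False
    try:
        return isinstance(json.loads(path.read_text(encoding="utf-8")), list)
    except json.JSONDecodeError:
        return False


def poll_until_done(run_poll, output_path, max_attempts=3):
    """Re-run the same poll call until results appear or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        run_poll()
        if output_is_complete(output_path):
            return True
        print(f"Attempt {attempt}: still running server-side, polling again")
    return False


# Demo with a stand-in poll that only "finishes" on its second invocation.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "enrichment-demo.json"
    calls = {"n": 0}

    def fake_poll():
        calls["n"] += 1
        if calls["n"] == 2:
            out.write_text('[{"input": {}, "output": {}}]', encoding="utf-8")

    finished = poll_until_done(fake_poll, out)

print(finished, calls["n"])
```

In real use, run_poll would wrap the parallel-cli enrich poll command shown above, with the same taskgroup ID, --timeout 540, and --output path on every attempt.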

Response format

After step 1: Share the monitoring URL (for tracking progress).

After step 2:

1. Report number of rows enriched
2. Preview first few rows from the output file (it's a JSON array of {input, output} objects)
3. Tell the user the full path to the output file
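A minimal sketch of producing that report from the output file follows. It assumes each array element corresponds to one enriched row; the sample rows and the function name summarize_results are illustrative, not real results.

```python
import json
import tempfile
from pathlib import Path


def summarize_results(output_path, preview_rows=3):
    """Return the row count, a short preview, and the full output path."""
    path = Path(output_path)
    rows = json.loads(path.read_text(encoding="utf-8"))
    return {
        # Assumption: one array element per enriched row.
        "rows_enriched": len(rows),
        "preview": rows[:preview_rows],
        "output_file": str(path.resolve()),
    }


# Demo with placeholder data written to a temporary file.
sample_rows = [
    {"input": {"company": "Example Co"}, "output": {"ceo": "A. Person"}},
    {"input": {"company": "Sample Inc"}, "output": {"ceo": "B. Person"}},
]
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "enrichment-demo.json"
    demo_path.write_text(json.dumps(sample_rows), encoding="utf-8")
    summary = summarize_results(demo_path, preview_rows=1)

print(summary["rows_enriched"], summary["preview"])
```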

Do NOT re-share the monitoring URL after completion — the results are in the output file.

Setup

If parallel-cli is not found, install and authenticate:

/parallel:parallel-cli-setup

If any parallel-cli enrich command returns 403, tell the user balance is likely required. Offer to run parallel-cli balance get, and if needed ask for explicit confirmation before running parallel-cli balance add <amount_cents>. Then retry the original enrichment command.
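Because balance add takes an amount in cents and must not run without explicit confirmation, a guard like the one below can help. This is an illustrative sketch: the function name balance_add_args and the dollar-string input are assumptions, and only the parallel-cli balance add <amount_cents> form comes from the text above.

```python
from decimal import Decimal


def balance_add_args(amount_dollars, confirmed):
    """Build argv for `balance add`, refusing unless the user confirmed."""
    if not confirmed:
        raise PermissionError("balance add requires explicit user confirmation")
    # Decimal avoids float rounding surprises when converting dollars to cents.
    cents = int((Decimal(amount_dollars) * 100).to_integral_value())
    if cents <= 0:
        raise ValueError("amount must be positive")
    return ["parallel-cli", "balance", "add", str(cents)]


topup_args = balance_add_args("12.50", confirmed=True)
print(topup_args)
```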

Forks & variants (1)

Parallel Data Enrichment has 1 known copy in the catalog, with 8 installs. It canonicalizes to this original listing.

parallel-web - 8 installs

How it compares

Use parallel-data-enrichment for scripted bulk CSV field population instead of manual per-row research or static database-only merges.

FAQ

Can I chain enrichment off a prior research task?

Yes. Use --previous-interaction-id from the research task to carry context forward. Enrichment itself does not produce an interaction_id, so you cannot chain further tasks off enrichment.

Why is my output JSON and not CSV?

All enrichment output is a JSON array of {input, output} objects, regardless of the --target extension. This preserves full data fidelity. Convert to CSV separately if needed.
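If a CSV is needed, the conversion can be sketched as below. It assumes input and output are flat objects with scalar values; nested values would need extra handling. The sample rows and the function name rows_to_csv are illustrative.

```python
import csv
import io


def rows_to_csv(rows):
    """Flatten {input, output} rows into CSV text, original fields first."""
    flat = [{**row["input"], **row["output"]} for row in rows]
    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for row in flat:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(flat)
    return buffer.getvalue()


# Placeholder rows; the second one lacks a field to show restval filling.
sample_rows = [
    {"input": {"company": "Example Co"}, "output": {"ceo": "A. Person", "founding_year": 2001}},
    {"input": {"company": "Sample Inc"}, "output": {"ceo": "B. Person"}},
]
csv_text = rows_to_csv(sample_rows)
print(csv_text)
```

Note that an output field with the same name as an input field overwrites it in this sketch; prefix the keys if both need to be kept.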

My poll timed out at 9 minutes. Is my enrichment lost?

No. Enrichment continues server-side. Re-run the same poll command with the taskgroup_id to continue waiting for results.

Is Parallel Data Enrichment safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

Data Science & ML · analytics · pipelines