Data Enrichment

Name: Data Enrichment
Author: hubspot

hubspot/agent-cli-skills

Sync a spreadsheet or export into HubSpot by matching contacts on email or companies on domain without fragile search-then-create loops.

Install

npx skills add https://github.com/hubspot/agent-cli-skills --skill data-enrichment

What is this skill?

Uses `hubspot objects upsert` with `--id-property` so each row is create-or-update in one CLI call keyed by email (conta
Streams JSONL on stdin with per-line `ok` / `new` / error objects—order preserved for auditing
Reshape CSV via tools like csvjson + `jq`; preview with `--dry-run` before live writes
Requires lowercase natural keys for exact CRM match; property names come from `hubspot properties list`, not hard-coded
Builds on the bulk-operations skill for JSONL piping, dry-run digest, history, and rate-limit hygiene

Adoption & trust: 1 install on skills.sh; 2 GitHub stars; trending (+100% hot-view momentum).

Journey fit

Primary fit

Grow · Lifecycle & retention

CRM enrichment from external lists is a lifecycle motion—keeping contact and company records accurate as you nurture and convert—not a one-time build scaffold. Lifecycle is the canonical shelf for write-back enrichment workflows that update who you are talking to and what you know about them in the CRM.

SKILL.md

SKILL.md - Data Enrichment

Prereq: read `bulk-operations/SKILL.md` first — JSONL piping, dry-run/digest, history, and rate-limit hygiene live there. This skill is the upsert-by-natural-key workflow on top.

## The core move: upsert, not search-then-create

`hubspot objects upsert --type X --id-property <natural-key>` reads JSONL on stdin and creates-or-updates each row in **one CLI call per record**, keyed by a property (email for contacts, domain for companies). No race window, no branching. Do not loop `search` → empty? → `create`.

Per line in: `{"id":"jane@example.com","properties":{"firstname":"Jane","jobtitle":"VP"}}`
Per line out: `{"id":"123","ok":true,"data":{...,"new":true|false}}` or `{"ok":false,"error":{...}}`. Order matches input.

## CSV/JSONL → upsert stream

Reshape with `jq`, preview with `--dry-run`, then execute. Always lowercase the natural key — CRM match is exact. Confirm available property names with `hubspot properties list --type contacts`; never hard-code a list. See `bulk-operations/resources/json-patterns.md` for reshape idioms.

```bash
# CSV → JSONL (any tool); example using csvkit
csvjson external.csv | jq -c '.[]' > external.jsonl

# Preview
cat external.jsonl \
| jq -c '{id:(.email|ascii_downcase), properties:{firstname:.first, lastname:.last, jobtitle:.title, company:.company}}' \
| hubspot objects upsert --type contacts --id-property email --dry-run | head

# Execute (same pipeline, drop --dry-run, capture results)
cat external.jsonl \
| jq -c '{id:(.email|ascii_downcase), properties:{firstname:.first, lastname:.last, jobtitle:.title, company:.company}}' \
| hubspot objects upsert --type contacts --id-property email \
| tee /tmp/upsert.results.jsonl
```

Companies: swap `--type companies --id-property domain` and reshape with `.domain|ascii_downcase` as `id`.
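
Because the match is exact and every key is lowercased first, two source rows whose keys differ only in case end up addressing the same record, and the later row would presumably overwrite the earlier one. A minimal pre-flight sketch for spotting such collisions before the dry run is shown below. The helper name `dup_keys` is hypothetical and not part of the `hubspot` CLI; it uses only `tr`, `sort`, and `uniq`.

```bash
# Hypothetical helper (not part of the hubspot CLI).
# stdin:  one natural key per line (email or domain)
# stdout: each key that occurs more than once after lowercasing
dup_keys() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Self-contained demo: the first and third keys collide once lowercased
printf '%s\n' 'Jane@Example.com' 'bob@example.com' 'jane@example.com' | dup_keys
# prints: jane@example.com
```

In practice the keys could be fed from the export, for example `jq -r '.email' external.jsonl | dup_keys`, assuming the source column is named `email` as in the reshape above. An empty result means no collisions were found.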

## Handle per-record OK / error output

Split with `jq`, inspect failure modes, retry just the failures after fixing the inputs:

```bash
jq -c 'select(.ok==true)'  /tmp/upsert.results.jsonl > /tmp/upsert.ok.jsonl
jq -c 'select(.ok==false)' /tmp/upsert.results.jsonl > /tmp/upsert.failed.jsonl
jq -r '.error.status' /tmp/upsert.failed.jsonl | sort | uniq -c   # status → count
jq -r '.data.new'    /tmp/upsert.ok.jsonl     | sort | uniq -c   # created vs updated
```

429s: split the input and rerun smaller chunks (see `bulk-operations` rate-limit notes). 400s usually mean a bad property name or invalid enum value — fix the reshape, rerun the failed inputs.
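
Since output order matches input order, the rows to retry can be recovered by line position. The sketch below assumes two things that the pipeline above does not do by itself: the reshaped input was saved to a file before being piped into `upsert`, and each result is compact JSON on one line containing `"ok":false` exactly as in the format shown earlier. The helper name `failed_inputs` and the file paths in the usage note are hypothetical.

```bash
# Hypothetical helper (not part of the hubspot CLI).
# $1 = results file (one JSON result per line, same order as the input)
# $2 = input file   (the reshaped JSONL that was piped into upsert)
# Prints every input line whose result on the same line number failed.
failed_inputs() {
  awk -v results="$1" '
    BEGIN {
      # Pass 1: remember the line numbers of failed results
      while ((getline line < results) > 0) {
        n++
        if (line ~ /"ok":false/) bad[n] = 1
      }
    }
    # Pass 2: print the input lines at those positions
    FNR in bad
  ' "$2"
}
```

A possible use, with hypothetical paths: `failed_inputs /tmp/upsert.results.jsonl /tmp/upsert.input.jsonl > /tmp/upsert.retry.jsonl`. For 429s, the retry file could then be cut into smaller pieces with `split -l 100 /tmp/upsert.retry.jsonl /tmp/upsert.retry.part.`, where the chunk size of 100 is an arbitrary illustration rather than a documented limit. This pairing only holds if the results file has exactly one line per input line, so it is worth comparing `wc -l` on both files first.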

## Destructive-op safety

`upsert` itself is non-destructive, but write-back can clobber populated fields. Always `--dry-run` first and spot-check. For bulk delete or overwrite of existing data, follow the dry-run → digest → confirm flow in `bulk-operations/SKILL.md`. Recovery: `hubspot history --since 1h`.

## Match without upsert: OR-search → update

When you only want to read matches (no write-back), or the natural key isn't a CRM property, use repeated `--filter` flags — each flag is one OR group.

Verified cap: **5 OR groups per call**. 6+ returns `400 too many filterGroups (count: N, max allowed: 5)`. Chunk 5 at a time:

```bash
# emails.txt: one lowercased email per line
# xargs -n5 regroups the emails five per line to stay within the 5-group cap
xargs -n5 < emails.txt | while read -r e1 e2 e3 e4 e5; do
  args=()
  # The last group may hold fewer than five emails; skip the empty slots
  for e in "$e1" "$e2" "$e3" "$e4" "$e5"; do [ -n "$e" ] && args+=(--filter "email=$e"); done
  hubspot objects search --type contacts "${args[@]}" --properties email,firstname,company
done > /tmp/matches.jsonl

# Preview an update that sets the lifecycle stage on every matched contact
jq -c '{id, properties:{lifecyclestage:"marketingqualifiedlead"}}' /tmp/matches.jsonl \
| hubspot objects update --type contacts --dry-run
```

For larger keyed enrichments, prefer `upsert` — o
