
Sales Data Hygiene
Audit and improve CRM record quality—completeness, accuracy, duplicates, and decay—before outbound or pipeline reporting misleads you.
Overview
Sales Data Hygiene is an agent skill for the Grow phase that audits CRM completeness, accuracy, duplication, and decay before you fix or enrich records.
Install
npx skills add https://github.com/sales-skills/sales --skill sales-data-hygiene
What is this skill?
- Four-pillar audit: completeness, accuracy, duplication rate, and decay rate
- Documented completeness targets (e.g. 95%+ email on contacts, 100% company on accounts)
- Accuracy checks via sampled email verification, phone checks, and LinkedIn title currency
- Calls out cross-object duplication (leads vs contacts) and account spelling variants
- Maintains a living learnings file read each invocation for accumulated CRM gotchas
- Framework cites ~30% annual B2B data decay, ~20% sales contact job churn, ~15% direct-dial invalidation
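The completeness pillar can be measured mechanically before any cleanup starts. The following is a minimal sketch, assuming contacts have been exported as a list of dictionaries; the field names (`email`, `phone`, `title`) and the helpers `completeness` and `scorecard` are hypothetical, and the thresholds are the contact targets from the audit framework in SKILL.md.

```python
# Hypothetical field names; thresholds are the contact targets from SKILL.md.
# The phone target applies to key personas, so filter the input accordingly.
CONTACT_TARGETS = {"email": 0.95, "phone": 0.70, "title": 0.90}


def completeness(records, field):
    """Return the share of records whose `field` is non-empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if str(r.get(field) or "").strip())
    return filled / len(records)


def scorecard(records, targets):
    """Compare measured completeness per field against its target."""
    report = {}
    for field, target in targets.items():
        rate = completeness(records, field)
        report[field] = {"rate": rate, "target": target, "ok": rate >= target}
    return report


# Illustrative sample data, not real CRM records.
contacts = [
    {"email": "a@example.com", "phone": "+14155550100", "title": "VP Sales"},
    {"email": "b@example.com", "phone": "", "title": "Director"},
    {"email": "", "phone": None, "title": "Manager"},
    {"email": "d@example.com", "phone": "+14155550101", "title": ""},
]

for field, row in scorecard(contacts, CONTACT_TARGETS).items():
    status = "ok" if row["ok"] else "below target"
    print(f"{field}: {row['rate']:.0%} (target {row['target']:.0%}) {status}")
```

Whitespace-only values count as empty here, which is a design choice: a field that holds only spaces passes a naive "is not null" check but is useless for outbound.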
Adoption & trust: 1 install on skills.sh; 45 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
Your CRM looks populated but outbound bounces, duplicate accounts distort pipeline, and job changes have quietly invalidated half your contact fields.
Who is it for?
Solo builders doing B2B sales with a CRM who need structured audits before campaigns, handoffs, or reporting.
Skip if: Pure product analytics funnels with no sales objects, or teams wanting a one-click auto-clean without human sampling and verification discipline.
When should I use this skill?
Starting CRM cleanup, post-import validation, or when pipeline metrics disagree with reality.
What do I get? / Deliverables
You get a prioritized picture of what is broken, field-level targets, and hygiene actions grounded in measured completeness and sample-based accuracy checks.
- Completeness and accuracy scorecard
- Duplication and decay assessment
- Appended hygiene learnings log entries
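As an illustration of what the duplication assessment measures, here is a minimal sketch of exact-match detection on normalized email, covering both contact-level duplicates and leads that already exist as contacts. It assumes records are plain dictionaries with an `email` key; the helper names are hypothetical and this is not the skill's own implementation.

```python
def normalize_email(value):
    """Lowercase and trim an email so trivially different spellings compare equal."""
    return (value or "").strip().lower()


def duplicate_rate(records):
    """Share of records with an email whose normalized form appeared earlier."""
    seen = set()
    dupes = 0
    counted = 0
    for record in records:
        key = normalize_email(record.get("email"))
        if not key:
            continue  # blank emails are a completeness problem, not a duplicate
        counted += 1
        if key in seen:
            dupes += 1
        seen.add(key)
    return dupes / counted if counted else 0.0


def cross_object_matches(leads, contacts):
    """Normalized emails that exist both as a lead and as a contact."""
    lead_keys = {normalize_email(lead.get("email")) for lead in leads}
    contact_keys = {normalize_email(contact.get("email")) for contact in contacts}
    return sorted((lead_keys & contact_keys) - {""})


# Illustrative sample data, not real CRM records.
leads = [{"email": "Ana@Example.com"}, {"email": "bo@example.com"}]
contacts = [
    {"email": "ana@example.com"},
    {"email": " ana@example.com "},
    {"email": "cy@example.com"},
    {"email": ""},
]

print(f"contact duplicate rate: {duplicate_rate(contacts):.0%}")
print("leads that are also contacts:", cross_object_matches(leads, contacts))
```

Exact matching on email only catches the low-risk cases; account spelling variants need fuzzy or rule-based matching with human review before merging.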
Journey fit
Grow is where revenue motion depends on trustworthy contacts and accounts; bad CRM data silently kills conversion and forecasting. Lifecycle covers contact/account maintenance, enrichment hygiene, and keeping personas current as people change roles.
How it compares
Audit framework and operational CRM hygiene—not a single-vendor enrichment API skill.
Common Questions / FAQ
Who is sales-data-hygiene for?
Indie SaaS founders and small sales motions who own their CRM data and need agent-guided quality measurement before scaling touches.
When should I use sales-data-hygiene?
During Grow lifecycle work before major outbound, after importing lists, when win rates slip, or quarterly to re-baseline duplication and decay.
Is sales-data-hygiene safe to install?
It may suggest reading or updating CRM records and external verification samples; review the Security Audits panel on this Prism page and avoid pasting production credentials into chat.
SKILL.md - Sales Data Hygiene
# CRM Data Hygiene & Quality Learnings

Accumulated tips, gotchas, and corrections discovered during use. Claude reads this at the start of each invocation and appends new learnings as they're discovered. Once significant learnings have accumulated, use `/sales-request-skill` to share them back to the community. Shared and declined entries are marked so they won't be re-prompted.

<!-- Add entries below in format: **YYYY-MM-DD**: Learning description -->

# CRM Data Hygiene Platform Guide

## Data Quality Audit Framework

Before fixing data, measure what's broken:

1. **Completeness**: what % of records have all critical fields filled?
   - Email: target 95%+ for contacts
   - Phone: target 70%+ for key personas
   - Company: target 100% for accounts
   - Title/department: target 90%+ for contacts
2. **Accuracy**: what % of filled fields are actually correct?
   - Email deliverability: verify a sample with an email verification tool
   - Phone connectivity: check a sample of direct dials
   - Job title currency: compare against LinkedIn for a sample
3. **Duplication rate**: what % of records are duplicates?
   - Contact-level: same person, multiple records
   - Account-level: same company, different spellings
   - Cross-object: leads that are also contacts
4. **Decay rate**: how fast does your data go stale?
   - Industry average: 30% of B2B data decays annually
   - Sales contacts: ~20% change jobs each year
   - Direct dials: ~15% become invalid annually
   - Emails: ~22% bounce rate after 12 months without refresh
5. **Consistency**: are the same things called the same thing?
   - Job titles: "VP Sales" vs "Vice President of Sales" vs "VP, Sales"
   - Industries: "SaaS" vs "Software" vs "Technology"
   - Company names: "IBM" vs "International Business Machines" vs "IBM Corp"

## Deduplication Strategy

| Approach | When to use | Risk level |
|----------|------------|------------|
| **Exact match** | Email, phone, domain (safest) | Low |
| **Fuzzy match** | Names, company names, addresses | Medium; review matches before merging |
| **Rule-based** | Combine multiple fields (name + company + title) | Medium |
| **ML-based** | Large datasets with complex patterns | Low (if trained well), but expensive |

**Merge rules** (which record wins):

- Most recently updated record keeps modifiable fields
- Most complete record keeps enrichment data
- Oldest record keeps the original owner/source
- Always preserve: original lead source, first touch date, opt-in status

## Data Normalization

| Field | Common problems | Solution |
|-------|----------------|----------|
| Job title | Abbreviations, variations, custom titles | Map to standard taxonomy (C-Level, VP, Director, Manager, IC) |
| Industry | Free-text, overlapping categories | Map to SIC/NAICS or your internal taxonomy |
| Company name | Abbreviations, legal suffixes, DBA names | Normalize to official name, store variants as aliases |
| Phone | Mixed formats, extensions, country codes | E.164 format (+1XXXXXXXXXX) |
| Address | Inconsistent formatting, abbreviations | USPS standardization or Google Maps API |
| Country | Mix of codes and names | ISO 3166-1 alpha-2 codes |

## Enrichment Automation

Set up ongoing enrichment to prevent decay:

1. **Trigger-based**: enrich when a record is created or updated
2. **Scheduled**: monthly/quarterly batch enrichment of all records
3. **Decay-based**: re-enrich records older than X days
4. **Event-based**: re-enrich when a contact's company has a news event (funding, acquisition)

## Platform-Specific Guidance

### In ZoomInfo (OperationsOS)

ZoomInfo OperationsOS is purpose-built for CRM data management at scale.

**Deduplication**:

- OperationsOS identifies duplicates across contacts, leads, and accounts using fuzzy matching
- Configurable match rules: email, name+company, phone, domain
- Bulk merge with configurable "winning record" rules
- Cross-object dedup: find leads that already exist as contacts

**Data orchestratio