
Clickhouse Architecture Advisor
Design workload-aware ClickHouse ingestion, partitioning, enrichment, and pre-aggregation before you commit schema and pipeline code.
Overview
ClickHouse Architecture Advisor is an agent skill most often used in Build (also Validate, Operate) that delivers workload-aware ClickHouse architecture decisions across ingestion, partitioning, enrichment, late data, an
Install
npx skills add https://github.com/clickhouse/agent-skills --skill clickhouse-architecture-advisorWhat is this skill?
- Five explicit decision areas: ingestion strategy, time-series partitioning, enrichment path, late-arriving/mutable state
- Every recommendation tagged official, derived, or field against ClickHouse documentation as source of truth
- Complements rule-check skills with workload-aware tradeoffs for workshops and system design
- Covers async inserts, Kafka engine + MV, dictionaries vs JOINs, TTL/retention, and rollup vs raw-only designs
- Targets ClickHouse 24.1+ advisory workflows alongside clickhouse-best-practices
- 5 decision areas for ingestion, partitioning, enrichment, late data, and pre-aggregation
- Version 0.1.0
Adoption & trust: 1.2k installs on skills.sh; 458 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You know ClickHouse rules of thumb but cannot map ingestion, partitioning, enrichment, and MV strategy to your specific workload without guessing.
Who is it for?
Solo builders shipping event or time-series analytics on ClickHouse 24.1+ who are choosing pipelines, partition granularity, enrichment model, and rollup strategy in design or refactor.
Skip if: Teams that only need a quick lint against generic rules without workload context—use clickhouse-best-practices first; skip when you are not using ClickHouse or have no analytics data path.
When should I use this skill?
Use when deciding ClickHouse ingestion strategy, time-series partitioning, enrichment paths, late-arriving or mutable data handling, or real-time pre-aggregation in advisory or system design work.
What do I get? / Deliverables
You receive a structured architecture response across five decision areas with labeled recommendations you can drop into a design doc or implementation plan.
- Architecture response structured across the five decision areas
- Recommendations labeled official, derived, or field
- Documented tradeoffs for ingestion, partitioning, enrichment, and pre-aggregation choices
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Build/backend is the canonical shelf because the skill outputs system-design decisions (tables, engines, MVs, ingestion paths) that ship as part of the product’s data layer. Backend subphase fits advisory architecture for storage, ingestion, and query patterns—not UI, docs, or generic agent tooling.
Where it fits
Compare direct inserts versus Kafka engine plus MV before you prototype an event analytics MVP.
Pick partition granularity and enrichment path (dictionary vs denormalization) for a new metrics table.
Decide upstream buffering and async insert settings for a high-volume telemetry writer.
Re-evaluate refreshable MVs versus rollups when query cost spikes after retention changes.
Reason about late-arriving data and mutable state without relying on frequent mutations.
How it compares
Architecture advisory with decision frameworks and source labeling—not a generic SQL generator or an MCP server integration.
Common Questions / FAQ
Who is clickhouse-architecture-advisor for?
Indie builders and small teams designing or evolving ClickHouse for analytics, observability, or event pipelines who want workshop-style tradeoffs, not ad-hoc chat guesses.
When should I use clickhouse-architecture-advisor?
Use it in Validate when scoping data architecture, in Build when choosing ingestion and MV design for the backend, and in Operate when revisiting partitioning, TTL, or pre-aggregation as traffic and retention change.
Is clickhouse-architecture-advisor safe to install?
Treat it like any third-party agent skill: review the Security Audits panel on this Prism page and inspect SKILL.md in your repo before granting agent permissions.
SKILL.md
READMESKILL.md - Clickhouse Architecture Advisor
# ClickHouse Architecture Advisor **Version 0.1.0** ClickHouse Inc April 2026 ClickHouse 24.1+ ## Abstract This skill complements `clickhouse-best-practices` by adding a workload-aware architecture layer for ClickHouse. It is optimized for advisory, workshop, and system design workflows where a user needs more than a rule check. It provides decision frameworks for ingestion strategy, time-series partitioning, enrichment paths, late-arriving data, and real-time pre-aggregation. ## Core principle Official documentation is the source of truth. Every recommendation must be labeled as: - `official` - `derived` - `field` ## Decision areas ### 1. Ingestion strategy Use when deciding between: - direct inserts - async inserts - Kafka engine + MV - upstream buffering ### 2. Time-series partitioning Use when deciding: - whether to partition - partition granularity - how retention and TTL affect design - how to avoid excessive partition counts ### 3. Enrichment path selection Use when deciding between: - runtime JOINs - dictionaries - denormalization - materialized enrichment ### 4. Late-arriving data and mutable state Use when reasoning about: - immutable append-only events - latest-state queries - replacing or collapsing semantics - whether frequent mutations should be avoided ### 5. Real-time pre-aggregation Use when deciding: - raw-only design - incremental materialized views - refreshable MVs - rollup tables ## Output standard A valid architecture response should include: - workload summary - key decisions - recommendations with provenance labels - suggested target architecture - example DDL or SQL - validation approach ## Required recommendation schema See `schemas/recommendation_schema.yaml`. ## Rule index 1. `decision-ingestion-strategy` 2. `decision-partitioning-timeseries` 3. `decision-join-enrichment` 4. `decision-late-arriving-upserts` 5. `decision-real-time-preaggregation` ## Implementation notes This skill is intentionally narrow: - it does not replace low-level rule enforcement - it does not make commercial recommendations - it does not claim field heuristics are official policy Its purpose is to translate documented ClickHouse capabilities into workload-specific architecture decisions. # Example: Financial Services — Real-time market surveillance ## Scenario - Workload: order and execution event stream - Ingest rate: 80M events/day - Query pattern: - latest order state - time-bounded compliance scans - intraday anomaly and pattern detection - Freshness target: sub-second to low-single-digit seconds - Additional requirement: late-arriving corrections and cancels ## Workload Summary This is not classic OLTP. It is a high-throughput analytical event pipeline with mutable business state derived from ordered events. The architecture should preserve append-only facts and compute latest-state views rather than forcing row-by-row transactional mutations. ## Key Decisions 1. Keep a raw append-only event table 2. Model current state separately 3. Avoid using ClickHouse like a row store 4. Use pre-aggregation only for repeated surveillance views ## Recommendations ### 1. Raw event table plus latest-state projection **What** Store all order lifecycle events immutably, then derive current order state. **Why** This preserves auditability and handles late-arriving business events without relying on heavy mutations. **Category** derived **Confidence** medium **Source** - https://clickhouse.com/docs/en/guides/replacing-merge-tree ### 2. Use ReplacingMergeTree for current-state table if version semantics are clean **What** Maintain a latest-state table keyed by order identifier and version timestamp. **Why** If corrections naturally replace prior state, ReplacingMergeTree is often the cleanest documented pattern. **Category** official **Confidence** high **Source** - https://clickhouse.com/docs/en/guides/replacing-merge-tree ### 3. Use dictionaries for small reference da