Clickhouse Best Practices

Name: Clickhouse Best Practices
Author: clickhouse

clickhouse/agent-skills

Apply ClickHouse schema, query, and engine rules when designing or tuning analytics tables and pipelines.

Install

npx skills add https://github.com/clickhouse/agent-skills --skill clickhouse-best-practices

What is this skill?

Rule-based guidance for schema design, query optimization, engines, and indexing with incorrect vs correct SQL examples
Covers partitioning cardinality, nullable avoidance, ORDER BY filtering, and distributed operations for ClickHouse 24.1+
Optimized for agent-driven design and maintenance with impact metrics per rule (Version 0.1.0, January 2026)
Materialized views and operational practices for sustained query performance
Table of contents structured by priority (e.g. Schema Design marked CRITICAL)

Adoption & trust: 3.5k installs on skills.sh; 458 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Recommended Skills

Supabase Postgres Best Practicessupabase/agent-skills

Supabase Postgres Best Practices is an MIT-licensed reference skill from Supabase that packages performance and reliabil…217k installs·2.2k stars

Lark Baselarksuite/cli

Lark CLI skill for Feishu multidimensional tables, including schema, records, and analysis-oriented query patterns.210k installs·13.7k stars

Convex Migration Helperget-convex/agent-skills

Convex Migration Helper is an agent skill from the Convex toolkit that walks solo builders through safe schema and data …61.9k installs·31 stars

Neon Postgresneondatabase/agent-skills

neon-postgres guides coding agents through any Neon Serverless Postgres task: creating projects, choosing connection str…38.3k installs·68 stars

Firebase Firestore Standardfirebase/agent-skills

firebase-firestore-standard is a comprehensive Firestore agent skill for solo builders who need Cloud Firestore Standard…36.7k installs·345 stars

Postgresql Table Designwshobson/agents

PostgreSQL Table Design is an agent skill that walks solo builders through designing or reviewing Postgres-specific sche…18.5k installs·36.5k stars

Journey fit

Primary fit

BuildBackend, data & payments

Database schema and engine choices are decided while building the data layer, before production load. ClickHouse tables, ORDER BY keys, and materialized views belong in backend data modeling, not frontend or docs alone.

Common Questions / FAQ

Is Clickhouse Best Practices safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

SKILL.md

READMESKILL.md - Clickhouse Best Practices

# ClickHouse Best Practices

**Version 0.1.0**  
ClickHouse Inc  
January 2026
ClickHouse 24.1+

> **Note:**  
> This document is mainly for agents and LLMs to follow when designing,  
> optimizing, or maintaining ClickHouse databases. Humans may also find it  
> useful, but guidance here is optimized for automation and consistency by  
> AI-assisted workflows.

---

## Abstract

Comprehensive best practices for ClickHouse database optimization. Covers schema design, query optimization, table engines, indexing strategies, materialized views, distributed operations, and operational best practices. Each rule includes detailed explanations, SQL examples comparing incorrect vs. correct implementations, and specific impact metrics to guide database design and query optimization.

---

## Table of Contents

1. [Schema Design](#1-schema-design) — **CRITICAL**
   - 1.1 [Avoid Nullable Unless Semantically Required](#11-avoid-nullable-unless-semantically-required)
   - 1.2 [Consider Starting Without Partitioning](#12-consider-starting-without-partitioning)
   - 1.3 [Filter on ORDER BY Columns in Queries](#13-filter-on-order-by-columns-in-queries)
   - 1.4 [Keep Partition Cardinality Low (100-1,000 Values)](#14-keep-partition-cardinality-low-100-1000-values)
   - 1.5 [Minimize Bit-Width for Numeric Types](#15-minimize-bit-width-for-numeric-types)
   - 1.6 [Order Columns by Cardinality (Low to High)](#16-order-columns-by-cardinality-low-to-high)
   - 1.7 [Plan PRIMARY KEY Before Table Creation](#17-plan-primary-key-before-table-creation)
   - 1.8 [Prioritize Filter Columns in ORDER BY](#18-prioritize-filter-columns-in-order-by)
   - 1.9 [Understand Partition Query Performance Trade-offs](#19-understand-partition-query-performance-trade-offs)
   - 1.10 [Use Enum for Finite Value Sets](#110-use-enum-for-finite-value-sets)
   - 1.11 [Use JSON Type for Dynamic Schemas](#111-use-json-type-for-dynamic-schemas)
   - 1.12 [Use LowCardinality for Repeated Strings](#112-use-lowcardinality-for-repeated-strings)
   - 1.13 [Use Native Types Instead of String](#113-use-native-types-instead-of-string)
   - 1.14 [Use Partitioning for Data Lifecycle Management](#114-use-partitioning-for-data-lifecycle-management)
2. [Query Optimization](#2-query-optimization) — **CRITICAL**
   - 2.1 [Choose the Right JOIN Algorithm](#21-choose-the-right-join-algorithm)
   - 2.2 [Consider Alternatives to JOINs](#22-consider-alternatives-to-joins)
   - 2.3 [Filter Tables Before Joining](#23-filter-tables-before-joining)
   - 2.4 [Optimize NULL Handling in Outer JOINs](#24-optimize-null-handling-in-outer-joins)
   - 2.5 [Use ANY JOIN When Only One Match Needed](#25-use-any-join-when-only-one-match-needed)
   - 2.6 [Use Data Skipping Indices for Non-ORDER BY Filters](#26-use-data-skipping-indices-for-non-order-by-filters)
   - 2.7 [Use Incremental MVs for Real-Time Aggregations](#27-use-incremental-mvs-for-real-time-aggregations)
   - 2.8 [Use Refreshable MVs for Complex Joins and Batch Workflows](#28-use-refreshable-mvs-for-complex-joins-and-batch-workflows)
3. [Insert Strategy](#3-insert-strategy) — **CRITICAL**
   - 3.1 [Avoid ALTER TABLE DELETE](#31-avoid-alter-table-delete)
   - 3.2 [Avoid ALTER TABLE UPDATE](#32-avoid-alter-table-update)
   - 3.3 [Avoid OPTIMIZE TABLE FINAL](#33-avoid-optimize-table-final)
   - 3.4 [Batch Inserts Appropriately (10K-100K rows)](#34-batch-inserts-appropriately-10k-100k-rows)
   - 3.5 [Use Async Inserts for High-Frequency Small Batches](#35-use-async-inserts-for-high-frequency-small-batches)
   - 3.6 [Use Native Format for Best Insert Performance](#36-use-native-format-for-best-insert-performance)
4. [Agent Integration](#4-agent-integration) — **CRITICAL**
   - 4.1 [Apply Safety Limits to Agent-Generated Queries](#41-apply-safety-limits-to-agent-generated-queries)
   - 4.2 [Connect AI Agents to ClickHouse](#42-connect-ai-agents-to-clickhouse)
   - 4.3 [Discover Schema Before Querying](#43-discover-schema-before-querying)

---

## 1. Schema

What is this skill?

Rule-based guidance for schema design, query optimization, engines, and indexing with incorrect vs correct SQL examples

Covers partitioning cardinality, nullable avoidance, ORDER BY filtering, and distributed operations for ClickHouse 24.1+

Optimized for agent-driven design and maintenance with impact metrics per rule (Version 0.1.0, January 2026)

Materialized views and operational practices for sustained query performance

Table of contents structured by priority (e.g. Schema Design marked CRITICAL)

Adoption & trust: 3.5k installs on skills.sh; 458 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Journey fit

Primary fit

BuildBackend, data & payments

SKILL.md

READMESKILL.md - Clickhouse Best Practices

# ClickHouse Best Practices

**Version 0.1.0**  
ClickHouse Inc  
January 2026
ClickHouse 24.1+

> **Note:**  
> This document is mainly for agents and LLMs to follow when designing,  
> optimizing, or maintaining ClickHouse databases. Humans may also find it  
> useful, but guidance here is optimized for automation and consistency by  
> AI-assisted workflows.

---

## Abstract

Comprehensive best practices for ClickHouse database optimization. Covers schema design, query optimization, table engines, indexing strategies, materialized views, distributed operations, and operational best practices. Each rule includes detailed explanations, SQL examples comparing incorrect vs. correct implementations, and specific impact metrics to guide database design and query optimization.

---

## Table of Contents

1. [Schema Design](#1-schema-design) — **CRITICAL**
   - 1.1 [Avoid Nullable Unless Semantically Required](#11-avoid-nullable-unless-semantically-required)
   - 1.2 [Consider Starting Without Partitioning](#12-consider-starting-without-partitioning)
   - 1.3 [Filter on ORDER BY Columns in Queries](#13-filter-on-order-by-columns-in-queries)
   - 1.4 [Keep Partition Cardinality Low (100-1,000 Values)](#14-keep-partition-cardinality-low-100-1000-values)
   - 1.5 [Minimize Bit-Width for Numeric Types](#15-minimize-bit-width-for-numeric-types)
   - 1.6 [Order Columns by Cardinality (Low to High)](#16-order-columns-by-cardinality-low-to-high)
   - 1.7 [Plan PRIMARY KEY Before Table Creation](#17-plan-primary-key-before-table-creation)
   - 1.8 [Prioritize Filter Columns in ORDER BY](#18-prioritize-filter-columns-in-order-by)
   - 1.9 [Understand Partition Query Performance Trade-offs](#19-understand-partition-query-performance-trade-offs)
   - 1.10 [Use Enum for Finite Value Sets](#110-use-enum-for-finite-value-sets)
   - 1.11 [Use JSON Type for Dynamic Schemas](#111-use-json-type-for-dynamic-schemas)
   - 1.12 [Use LowCardinality for Repeated Strings](#112-use-lowcardinality-for-repeated-strings)
   - 1.13 [Use Native Types Instead of String](#113-use-native-types-instead-of-string)
   - 1.14 [Use Partitioning for Data Lifecycle Management](#114-use-partitioning-for-data-lifecycle-management)
2. [Query Optimization](#2-query-optimization) — **CRITICAL**
   - 2.1 [Choose the Right JOIN Algorithm](#21-choose-the-right-join-algorithm)
   - 2.2 [Consider Alternatives to JOINs](#22-consider-alternatives-to-joins)
   - 2.3 [Filter Tables Before Joining](#23-filter-tables-before-joining)
   - 2.4 [Optimize NULL Handling in Outer JOINs](#24-optimize-null-handling-in-outer-joins)
   - 2.5 [Use ANY JOIN When Only One Match Needed](#25-use-any-join-when-only-one-match-needed)
   - 2.6 [Use Data Skipping Indices for Non-ORDER BY Filters](#26-use-data-skipping-indices-for-non-order-by-filters)
   - 2.7 [Use Incremental MVs for Real-Time Aggregations](#27-use-incremental-mvs-for-real-time-aggregations)
   - 2.8 [Use Refreshable MVs for Complex Joins and Batch Workflows](#28-use-refreshable-mvs-for-complex-joins-and-batch-workflows)
3. [Insert Strategy](#3-insert-strategy) — **CRITICAL**
   - 3.1 [Avoid ALTER TABLE DELETE](#31-avoid-alter-table-delete)
   - 3.2 [Avoid ALTER TABLE UPDATE](#32-avoid-alter-table-update)
   - 3.3 [Avoid OPTIMIZE TABLE FINAL](#33-avoid-optimize-table-final)
   - 3.4 [Batch Inserts Appropriately (10K-100K rows)](#34-batch-inserts-appropriately-10k-100k-rows)
   - 3.5 [Use Async Inserts for High-Frequency Small Batches](#35-use-async-inserts-for-high-frequency-small-batches)
   - 3.6 [Use Native Format for Best Insert Performance](#36-use-native-format-for-best-insert-performance)
4. [Agent Integration](#4-agent-integration) — **CRITICAL**
   - 4.1 [Apply Safety Limits to Agent-Generated Queries](#41-apply-safety-limits-to-agent-generated-queries)
   - 4.2 [Connect AI Agents to ClickHouse](#42-connect-ai-agents-to-clickhouse)
   - 4.3 [Discover Schema Before Querying](#43-discover-schema-before-querying)

---

## 1. Schema

Install

What is this skill?

Recommended Skills

Journey fit

Is Clickhouse Best Practices safe to install?

SKILL.md

This week for builders

Install

What is this skill?

Recommended Skills

Journey fit

Is Clickhouse Best Practices safe to install?

SKILL.md