Spark SQL

Name: Spark SQL
Author: aidancorrell

aidancorrell/spark-sql-mcp-server

Let Claude Code or Cursor run Spark SQL against Thrift/HiveServer2 clusters (Spark, EMR, Hive, Impala) without leaving the agent session.

Overview

Spark SQL MCP Server is an MCP server for the Build phase that lets AI coding agents execute and explore Spark SQL on Thrift/HiveServer2-backed clusters.

What is this MCP server?

stdio MCP server (PyPI spark-sql-mcp-server v0.1.2) for Spark SQL over Thrift/HiveServer2
Compatible with Spark, AWS EMR, Hive, and Impala-style endpoints
Configurable SPARK_HOST (required), SPARK_PORT (default 10000), SPARK_DATABASE, and SPARK_AUTH (NONE, LDAP, KERBEROS, CUSTOM, NOSASL)
LDAP support via SPARK_USERNAME and secret SPARK_PASSWORD environment variables
Agent-facing SQL execution and schema exploration without a separate JDBC desktop client
Server version 0.1.2 on PyPI identifier spark-sql-mcp-server
Default Thrift port documented as 10000 when SPARK_PORT is unset
Five SPARK_AUTH modes: NONE, LDAP, KERBEROS, CUSTOM, NOSASL
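
The settings listed above can be checked before the server is launched. The following is a minimal sketch, not the server's own code: `SparkSettings` and `load_spark_settings` are hypothetical names, and the rule that LDAP requires both credentials is an assumption, since the listing only says LDAP uses those two variables.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# The five SPARK_AUTH modes named in the listing.
AUTH_MODES = ("NONE", "LDAP", "KERBEROS", "CUSTOM", "NOSASL")
DEFAULT_PORT = 10000  # documented default when SPARK_PORT is unset


@dataclass(frozen=True)
class SparkSettings:
    host: str
    port: int
    database: Optional[str]
    auth: Optional[str]
    username: Optional[str]
    password: Optional[str]


def load_spark_settings(env: Mapping[str, str]) -> SparkSettings:
    """Hypothetical helper: validate SPARK_* variables from a mapping such as os.environ."""
    host = env.get("SPARK_HOST", "").strip()
    if not host:
        raise ValueError("SPARK_HOST is required")

    # Fall back to the documented default port when SPARK_PORT is absent.
    port = int(env.get("SPARK_PORT", DEFAULT_PORT))

    # The listing does not state a default auth mode, so an unset value stays None.
    auth = env.get("SPARK_AUTH")
    if auth is not None and auth not in AUTH_MODES:
        raise ValueError(f"SPARK_AUTH must be one of {AUTH_MODES}, got {auth!r}")

    username = env.get("SPARK_USERNAME")
    password = env.get("SPARK_PASSWORD")
    # Assumption: LDAP needs both credentials to be present.
    if auth == "LDAP" and not (username and password):
        raise ValueError("LDAP auth expects SPARK_USERNAME and SPARK_PASSWORD")

    return SparkSettings(host, port, env.get("SPARK_DATABASE"), auth, username, password)
```

Calling `load_spark_settings(os.environ)` in a wrapper script would surface a missing host or a misspelled auth mode before the agent ever tries to connect.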

Compatible agents: Claude Code, Cursor, Codex, and any other MCP-compatible agent

What problem does it solve?

Agents cannot safely answer questions about your Spark, EMR, Hive, or Impala data without a governed SQL bridge and cluster credentials.

Who is it for?

Indie builders or tiny data teams who already operate a Thrift SQL endpoint and want agent-assisted querying during feature and pipeline work.

Skip if: you have a greenfield project with no Spark infrastructure, or your team needs full DDL admin and cluster lifecycle management from MCP alone.

What do I get? / Deliverables

After you configure SPARK_* env vars and add the stdio server to your agent, the assistant can run warehouse SQL and use live results in implementation and debugging.

Agent-callable Spark SQL query and metadata access over stdio MCP
Documented cluster connection via SPARK_HOST, port, database, and auth env vars
Faster warehouse-grounded answers during integration and debugging sessions

Recommended MCP Servers

A2db (agentic-eng/a2db)

a2db is a stdio MCP server that provides safe, read-only database access for AI agents, supporting five database types w… (2 stars)

AdoMcp (John0King/AdoMCP)

AdoMcp is a database-focused MCP server for solo build…

Afgong Sqlite Mcp Server (afgong/sqlite-mcp-server)

ai.smithery/afgong-sqlite-mcp-server is a Smithery-hosted MCP server that lets coding agents interact with a Messages-st…

AI ERD

com.ai-erd/ai-erd is a hosted MCP server for AI-powered entity-relationship diagramming and database schema management e…

Airtable (mcparmory/registry)

com.mcparmory/airtable is an MCP server that performs create, read, update, and delete operations against Airtable bases… (25 stars)

Airtable (domdomegg/airtable-mcp-server)

The Airtable MCP server by domdomegg bridges Claude Code, Cursor, Codex, and other MCP clients to Airtable’s REST API so… (447 stars)

Journey fit

Primary fit

Build: Integrations & version control

Data warehouse access is wired during product build when agents need live warehouse context for features, ETL debugging, and integration work. Integrations is the canonical shelf because the server is a bridge to external Spark SQL infrastructure via stdio MCP, not a standalone analytics app.

How it compares

This is an MCP database integration for Spark SQL, not an in-repo agent skill or a local SQLite file browser.

Common Questions / FAQ

Who is Spark SQL MCP Server for?

It is for solo builders and small teams using Claude Code, Cursor, or similar agents who need read-oriented Spark SQL access through HiveServer2/Thrift.

When should I use Spark SQL MCP Server?

Use it during build and operations work when you are integrating features, validating ETL, or debugging queries against an existing Spark, EMR, Hive, or Impala endpoint.

How do I add Spark SQL MCP Server to my agent?

Install the PyPI package spark-sql-mcp-server, set SPARK_HOST and optional SPARK_PORT, SPARK_DATABASE, SPARK_AUTH, and LDAP secrets, then register the stdio MCP entry in your client config.
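
The registration step can be sketched as a small script that prints a client config entry. This is an illustration under stated assumptions: the `mcpServers` layout with `command`, `args`, and `env` is the shape commonly used by Claude Code and Cursor config files, the `uvx` launcher and a console script named after the package are assumptions not confirmed by this listing, and `build_mcp_entry` and the `spark-sql` key are hypothetical names.

```python
import json
from typing import Optional


def build_mcp_entry(host: str, port: int = 10000,
                    database: Optional[str] = None,
                    auth: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble a stdio MCP entry for a client config file."""
    # Environment values must be strings in a JSON config.
    env = {"SPARK_HOST": host, "SPARK_PORT": str(port)}
    if database:
        env["SPARK_DATABASE"] = database
    if auth:
        env["SPARK_AUTH"] = auth
    # SPARK_USERNAME / SPARK_PASSWORD are deliberately left out here;
    # supply secrets through your shell or secret manager instead.
    return {
        "mcpServers": {
            "spark-sql": {  # hypothetical server key
                "command": "uvx",  # assumption: package is launched via uvx
                "args": ["spark-sql-mcp-server"],
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder host and database for illustration only.
    print(json.dumps(build_mcp_entry("thrift.example.internal", database="analytics"), indent=2))
```

Check the package's own README for the actual launch command before pasting the output into your client config.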

