Pyspark Mcp

Name: Pyspark Mcp
Author: AnnasMazhar

AnnasMazhar/pyspark_mcp

Convert SQL to PySpark, scaffold AWS Glue jobs, and tighten Spark code from the agent while building data backends.

Overview

PySpark MCP is an MCP server for the Build phase that helps agents convert SQL to PySpark, generate AWS Glue jobs, and optimize Spark code over stdio.

What is this MCP server?

SQL to PySpark conversion for agent-assisted pipeline authoring
AWS Glue job generation from prompts or specs
Spark code optimization suggestions
Focused on batch/ETL stacks rather than generic CRUD APIs
PyPI package: pyspark-tools, version 0.0.4
Capabilities (3): SQL→PySpark, Glue job generation, Spark optimization
Transport: stdio

Compatible agents: Claude Code, Cursor, Codex, and any other MCP-compatible agent

What problem does it solve?

Writing and tuning PySpark and Glue jobs by hand slows solo builders who already use agents for application code.

Who is it for?

Indie data engineers and full-stack solos building AWS-centric ETL or analytics backends with MCP-enabled agents.

Skip if: you are building a pure frontend product, you use only non-Spark databases, or your team has no Spark/Glue runtime in which to test generated code.

What do I get? / Deliverables

After install, your agent can draft PySpark transforms, Glue job scripts, and optimization passes you validate in your data environment.

Draft PySpark from SQL specifications
Scaffold AWS Glue job code for further hardening
Agent-suggested Spark optimizations to benchmark in your cluster
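
To make "draft PySpark from SQL" and "validate in your data environment" concrete, here is a minimal sketch of what such a draft and its check might look like. It is hand-written for illustration and is not output captured from pyspark-tools; the `orders` table, its columns, and the helper names `paid_totals` and `validate_locally` are all hypothetical.

```python
# Illustrative only: a hand-written SQL/PySpark pair, not pyspark-tools output.
# The table name, columns, and function names are assumptions for this sketch.

SQL_SPEC = """
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
WHERE status = 'paid'
GROUP BY customer_id
"""


def paid_totals(orders):
    """DataFrame-API equivalent of SQL_SPEC; `orders` is a pyspark.sql.DataFrame."""
    # Imported lazily so this module still loads where PySpark is not installed.
    from pyspark.sql import functions as F

    return (
        orders
        .filter(F.col("status") == "paid")           # WHERE status = 'paid'
        .groupBy("customer_id")                      # GROUP BY customer_id
        .agg(F.sum("amount").alias("total_amount"))  # SUM(amount) AS total_amount
    )


def validate_locally():
    """Run the SQL and the DataFrame version on sample rows and compare results."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("validate-draft").getOrCreate()
    )
    orders = spark.createDataFrame(
        [(1, "paid", 10.0), (1, "paid", 5.0), (2, "open", 7.0)],
        ["customer_id", "status", "amount"],
    )
    orders.createOrReplaceTempView("orders")

    # Grouped output has one row per key, so comparing as sets is sufficient here.
    expected = {tuple(row) for row in spark.sql(SQL_SPEC).collect()}
    actual = {tuple(row) for row in paid_totals(orders).collect()}
    spark.stop()
    return expected == actual
```

Calling `validate_locally()` needs a local PySpark install and a Java runtime. Comparing the generated DataFrame code against the original SQL on a small sample is one cheap way to check a draft before benchmarking it on a real cluster.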

Journey fit

Primary fit

Build: Backend, data & payments

Data pipeline authoring is backend build work before you ship jobs to staging or production. SQL conversion, Glue generation, and Spark optimization are server-side ETL concerns, not frontend or launch tasks.

How it compares

This is a Spark/Glue code-generation MCP server, not a warehouse admin GUI or a generic SQL lint skill.

Common Questions / FAQ

Who is pyspark-mcp for?

Builders creating PySpark pipelines or AWS Glue jobs who want agent assistance through a stdio MCP Python package.

When should I use pyspark-mcp?

Use it during backend build when translating SQL logic to Spark, bootstrapping Glue jobs, or refactoring slow Spark code.

How do I add pyspark-mcp to my agent?

Install pyspark-tools from PyPI (v0.0.4), configure the stdio MCP server entry pointing at that package, and ensure Python dependencies for PySpark workflows are available locally.
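
As a rough sketch of the "stdio MCP server entry" step, the snippet below builds a client configuration in the `mcpServers` layout that several MCP clients read from a JSON file. The launch command is an assumption: the listing names only the PyPI identifier pyspark-tools, not the executable, so `uvx pyspark-tools` is a guess to confirm against the project's README. The helper `build_mcp_config` is hypothetical.

```python
import json


def build_mcp_config(command, args, name="pyspark-tools"):
    """Return a config dict with one stdio server entry under `mcpServers`."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}


# ASSUMPTION: the package exposes a console command with the same name as the
# PyPI identifier and can be launched through `uvx`. Replace both values with
# the launch command documented by the project.
config = build_mcp_config(command="uvx", args=["pyspark-tools"])

# Paste the printed JSON into your agent's MCP configuration file.
print(json.dumps(config, indent=2))
```

Clients that use a different configuration format will need the same two pieces of information, the command and its arguments, expressed in their own syntax.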
