Senior Data Scientist

Name: Senior Data Scientist
Author: davila7

davila7/claude-code-templates

2.9k installs
29.9k repo stars
Updated July 27, 2026
davila7/claude-code-templates

senior-data-scientist is an agent skill that applies experiment design, feature engineering, and production ML patterns for statistical modeling and data-driven decision workflows.

About

senior-data-scientist is a davila7/claude-code-templates skill embedding experiment design, feature engineering, and production ML patterns for statistical modeling, causal inference, and advanced analytics work. Quick-start scripts include experiment_designer.py, feature_engineering_pipeline.py, and model_evaluation_suite.py for structured analysis workflows. Reference guides load on demand for statistical_methods_advanced.md, experiment_design_frameworks.md, and feature_engineering_patterns.md covering step-by-step processes, architecture patterns, and implementation examples. Production patterns span scalable distributed processing, ML model deployment with A/B testing infrastructure, feature store integration, drift detection, and high-throughput real-time inference with batching and auto-scaling. Best practices emphasize test-driven development, monitoring, automated deployments, canary releases, and team leadership standards for coding and cross-functional collaboration. Performance targets document P50 under fifty milliseconds, P95 under one hundred milliseconds, P99 under two hundred milliseconds, over one thousand requests per second throughput, and 99.9 percent uptime av.

Quick-start scripts cover experiment_designer.py, feature_engineering_pipeline.py, and model_evaluation_suite.py workflo
Reference docs span statistical methods, experiment design frameworks, and feature engineering patterns.
Production patterns include distributed processing, model serving, feature stores, drift detection, and retraining pipel
Best practices cover TDD, monitoring, canary deployments, and comprehensive logging for ML systems.
Performance targets specify sub-100ms P95 latency, 1000+ RPS throughput, and 99.9% uptime goals.

Senior Data Scientist by the numbers

2,943 all-time installs (skills.sh)
+30 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #253 of 16,659 AI & Agent Building skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

senior-data-scientist capabilities & compatibility

Capabilities: experiment design framework guidance · feature engineering pipeline patterns · production ml deployment and monitoring · statistical methods and model evaluation
Use cases: orchestration · debugging

From the docs

What senior-data-scientist says it does

World-class data science skill for statistical modeling, experimentation, causal inference, and advanced analytics.

SKILL.md

python scripts/experiment_designer.py --input data/ --output results/

SKILL.md

Uptime: 99.9%

SKILL.md

npx skills add https://github.com/davila7/claude-code-templates --skill senior-data-scientist

Add your badge

Show developers this skill is listed on Skillselion. Paste this into your README.

[![Listed on Skillselion](https://skillselion.com/badge/skills/davila7/claude-code-templates/senior-data-scientist.svg)](https://skillselion.com/skills/davila7/claude-code-templates/senior-data-scientist)

Installs	2.9k
repo stars	★ 29.9k
Security audit	3 / 3 scanners passed
Last updated	July 27, 2026
Repository	davila7/claude-code-templates ↗

How do I design experiments, engineer features, and ship production ML with observability instead of notebook-only prototypes?

Get world-class experiment design, feature engineering, and production ML patterns that Claude can apply instantly to data-heavy projects.

Who is it for?

Data scientists and ML engineers building predictive models, A/B tests, and production analytics pipelines with Python, SQL, and R tooling.

Skip if: Skip for shallow spreadsheet tasks or when no statistical modeling, experimentation, or ML deployment is required.

When should I use this skill?

User designs experiments, builds predictive models, performs causal analysis, or needs production ML architecture guidance.

What you get

Structured experiment plans, feature pipelines, model evaluation suites, and production deployment patterns with monitoring and performance targets.

experiment design framework
feature engineering plan
production ML architecture guidance

By the numbers

Targets 10x scalability headroom over current load
Specifies 99.9% uptime reliability target

Files

SKILL.mdMarkdownGitHub ↗

Senior Data Scientist

World-class senior data scientist skill for production-grade AI/ML/Data systems.

Quick Start

Main Capabilities

# Core Tool 1
python scripts/experiment_designer.py --input data/ --output results/

# Core Tool 2  
python scripts/feature_engineering_pipeline.py --target project/ --analyze

# Core Tool 3
python scripts/model_evaluation_suite.py --config config.yaml --deploy

Core Expertise

This skill covers world-class capabilities in:

Advanced production patterns and architectures
Scalable system design and implementation
Performance optimization at scale
MLOps and DataOps best practices
Real-time processing and inference
Distributed computing frameworks
Model deployment and monitoring
Security and compliance
Cost optimization
Team leadership and mentoring

Tech Stack

Languages: Python, SQL, R, Scala, Go ML Frameworks: PyTorch, TensorFlow, Scikit-learn, XGBoost Data Tools: Spark, Airflow, dbt, Kafka, Databricks LLM Frameworks: LangChain, LlamaIndex, DSPy Deployment: Docker, Kubernetes, AWS/GCP/Azure Monitoring: MLflow, Weights & Biases, Prometheus Databases: PostgreSQL, BigQuery, Snowflake, Pinecone

Reference Documentation

1. Statistical Methods Advanced

Comprehensive guide available in references/statistical_methods_advanced.md covering:

Advanced patterns and best practices
Production implementation strategies
Performance optimization techniques
Scalability considerations
Security and compliance
Real-world case studies

2. Experiment Design Frameworks

Complete workflow documentation in references/experiment_design_frameworks.md including:

Step-by-step processes
Architecture design patterns
Tool integration guides
Performance tuning strategies
Troubleshooting procedures

3. Feature Engineering Patterns

Technical reference guide in references/feature_engineering_patterns.md with:

System design principles
Implementation examples
Configuration best practices
Deployment strategies
Monitoring and observability

Production Patterns

Pattern 1: Scalable Data Processing

Enterprise-scale data processing with distributed computing:

Horizontal scaling architecture
Fault-tolerant design
Real-time and batch processing
Data quality validation
Performance monitoring

Pattern 2: ML Model Deployment

Production ML system with high availability:

Model serving with low latency
A/B testing infrastructure
Feature store integration
Model monitoring and drift detection
Automated retraining pipelines

Pattern 3: Real-Time Inference

High-throughput inference system:

Batching and caching strategies
Load balancing
Auto-scaling
Latency optimization
Cost optimization

Best Practices

Development

Test-driven development
Code reviews and pair programming
Documentation as code
Version control everything
Continuous integration

Production

Monitor everything critical
Automate deployments
Feature flags for releases
Canary deployments
Comprehensive logging

Team Leadership

Mentor junior engineers
Drive technical decisions
Establish coding standards
Foster learning culture
Cross-functional collaboration

Performance Targets

Latency:

P50: < 50ms
P95: < 100ms
P99: < 200ms

Throughput:

Requests/second: > 1000
Concurrent users: > 10,000

Availability:

Uptime: 99.9%
Error rate: < 0.1%

Security & Compliance

Authentication & authorization
Data encryption (at rest & in transit)
PII handling and anonymization
GDPR/CCPA compliance
Regular security audits
Vulnerability management

Common Commands

# Development
python -m pytest tests/ -v --cov
python -m black src/
python -m pylint src/

# Training
python scripts/train.py --config prod.yaml
python scripts/evaluate.py --model best.pth

# Deployment
docker build -t service:v1 .
kubectl apply -f k8s/
helm upgrade service ./charts/

# Monitoring
kubectl logs -f deployment/service
python scripts/health_check.py

Resources

Advanced Patterns: references/statistical_methods_advanced.md
Implementation Guide: references/experiment_design_frameworks.md
Technical Reference: references/feature_engineering_patterns.md
Automation Scripts: scripts/ directory

Senior-Level Responsibilities

As a world-class senior professional:

1. Technical Leadership

Drive architectural decisions
Mentor team members
Establish best practices
Ensure code quality

2. Strategic Thinking

Align with business goals
Evaluate trade-offs
Plan for scale
Manage technical debt

3. Collaboration

Work across teams
Communicate effectively
Build consensus
Share knowledge

4. Innovation

Stay current with research
Experiment with new approaches
Contribute to community
Drive continuous improvement

5. Production Excellence

Ensure high availability
Monitor proactively
Optimize performance
Respond to incidents

#!/usr/bin/env python3
"""
Experiment Designer
Production-grade tool for senior data scientist
"""

import os
import sys
import json
import logging
import argparse
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

class ExperimentDesigner:
    """Production-grade experiment designer"""
    
    def __init__(self, config: Dict):
        self.config = config
        self.results = {
            'status': 'initialized',
            'start_time': datetime.now().isoformat(),
            'processed_items': 0
        }
        logger.info(f"Initialized {self.__class__.__name__}")
    
    def validate_config(self) -> bool:
        """Validate configuration"""
        logger.info("Validating configuration...")
        # Add validation logic
        logger.info("Configuration validated")
        return True
    
    def process(self) -> Dict:
        """Main processing logic"""
        logger.info("Starting processing...")
        
        try:
            self.validate_config()
            
            # Main processing
            result = self._execute()
            
            self.results['status'] = 'completed'
            self.results['end_time'] = datetime.now().isoformat()
            
            logger.info("Processing completed successfully")
            return self.results
            
        except Exception as e:
            self.results['status'] = 'failed'
            self.results['error'] = str(e)
            logger.error(f"Processing failed: {e}")
            raise
    
    def _execute(self) -> Dict:
        """Execute main logic"""
        # Implementation here
        return {'success': True}

def main():
    """Main entry point"""
    parser = argparse.ArgumentParser(
        description="Experiment Designer"
    )
    parser.add_argument('--input', '-i', required=True, help='Input path')
    parser.add_argument('--output', '-o', required=True, help='Output path')
    parser.add_argument('--config', '-c', help='Configuration file')
    parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output')
    
    args = parser.parse_args()
    
    if args.verbose:
        logging.getLogger().setLevel(logging.DEBUG)
    
    try:
        config = {
            'input': args.input,
            'output': args.output
        }
        
        processor = ExperimentDesigner(config)
        results = processor.process()
        
        print(json.dumps(results, indent=2))
        sys.exit(0)
        
    except Exception as e:
        logger.error(f"Fatal error: {e}")
        sys.exit(1)

if __name__ == '__main__':
    main()

#!/usr/bin/env python3
"""
Feature Engineering Pipeline
Production-grade tool for senior data scientist
"""

import os
import sys
import json
import logging
import argparse
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

class FeatureEngineeringPipeline:
    """Production-grade feature engineering pipeline"""
    
    def __init__(self, config: Dict):
        self.config = config
        self.results = {
            'status': 'initialized',
            'start_time': datetime.now().isoformat(),
            'processed_items': 0
        }
        logger.info(f"Initialized {self.__class__.__name__}")
    
    def validate_config(self) -> bool:
        """Validate configuration"""
        logger.info("Validating configuration...")
        # Add validation logic
        logger.info("Configuration validated")
        return True
    
    def process(self) -> Dict:
        """Main processing logic"""
        logger.info("Starting processing...")
        
        try:
            self.validate_config()
            
            # Main processing
            result = self._execute()
            
            self.results['status'] = 'completed'
            self.results['end_time'] = datetime.now().isoformat()
            
            logger.info("Processing completed successfully")
            return self.results
            
        except Exception as e:
            self.results['status'] = 'failed'
            self.results['error'] = str(e)
            logger.error(f"Processing failed: {e}")
            raise
    
    def _execute(self) -> Dict:
        """Execute main logic"""
        # Implementation here
        return {'success': True}

def main():
    """Main entry point"""
    parser = argparse.ArgumentParser(
        description="Feature Engineering Pipeline"
    )
    parser.add_argument('--input', '-i', required=True, help='Input path')
    parser.add_argument('--output', '-o', required=True, help='Output path')
    parser.add_argument('--config', '-c', help='Configuration file')
    parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output')
    
    args = parser.parse_args()
    
    if args.verbose:
        logging.getLogger().setLevel(logging.DEBUG)
    
    try:
        config = {
            'input': args.input,
            'output': args.output
        }
        
        processor = FeatureEngineeringPipeline(config)
        results = processor.process()
        
        print(json.dumps(results, indent=2))
        sys.exit(0)
        
    except Exception as e:
        logger.error(f"Fatal error: {e}")
        sys.exit(1)

if __name__ == '__main__':
    main()

#!/usr/bin/env python3
"""
Model Evaluation Suite
Production-grade tool for senior data scientist
"""

import os
import sys
import json
import logging
import argparse
from pathlib import Path
from typing import Dict, List, Optional
from datetime import datetime

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

class ModelEvaluationSuite:
    """Production-grade model evaluation suite"""
    
    def __init__(self, config: Dict):
        self.config = config
        self.results = {
            'status': 'initialized',
            'start_time': datetime.now().isoformat(),
            'processed_items': 0
        }
        logger.info(f"Initialized {self.__class__.__name__}")
    
    def validate_config(self) -> bool:
        """Validate configuration"""
        logger.info("Validating configuration...")
        # Add validation logic
        logger.info("Configuration validated")
        return True
    
    def process(self) -> Dict:
        """Main processing logic"""
        logger.info("Starting processing...")
        
        try:
            self.validate_config()
            
            # Main processing
            result = self._execute()
            
            self.results['status'] = 'completed'
            self.results['end_time'] = datetime.now().isoformat()
            
            logger.info("Processing completed successfully")
            return self.results
            
        except Exception as e:
            self.results['status'] = 'failed'
            self.results['error'] = str(e)
            logger.error(f"Processing failed: {e}")
            raise
    
    def _execute(self) -> Dict:
        """Execute main logic"""
        # Implementation here
        return {'success': True}

def main():
    """Main entry point"""
    parser = argparse.ArgumentParser(
        description="Model Evaluation Suite"
    )
    parser.add_argument('--input', '-i', required=True, help='Input path')
    parser.add_argument('--output', '-o', required=True, help='Output path')
    parser.add_argument('--config', '-c', help='Configuration file')
    parser.add_argument('--verbose', '-v', action='store_true', help='Verbose output')
    
    args = parser.parse_args()
    
    if args.verbose:
        logging.getLogger().setLevel(logging.DEBUG)
    
    try:
        config = {
            'input': args.input,
            'output': args.output
        }
        
        processor = ModelEvaluationSuite(config)
        results = processor.process()
        
        print(json.dumps(results, indent=2))
        sys.exit(0)
        
    except Exception as e:
        logger.error(f"Fatal error: {e}")
        sys.exit(1)

if __name__ == '__main__':
    main()

Related skills

Setup Matt Pocock SkillsScaffold the per-repo configuration that Matt Pocock’s engineering agent skills rely on so they understand the issue tracker, triage labels, and domain documentation la462k185k

Lark Skill MakerQuickly turn any Lark/Feishu OpenAPI call or multi-step workflow into a reusable agent skill with its own SKILL.md.379k15.8k

CavemanSlash token usage by roughly 75% while keeping every technical detail intact when working with Claude Code, Cursor or similar agents.378k92.5k

Lark AppsConnect Claude, Cursor or custom agents directly to Lark (Feishu) for messaging, document automation, approval workflows and enterprise data access.375k

Running Claude Code Via Litellm CopilotRun Claude Code at a fraction of the cost by routing requests through LiteLLM to the GitHub Copilot Chat API.270k72

Codex PetGenerate a complete Codex Pet spritesheet and metadata from one reference image without needing an OpenAI key or Codex Pro.246k8

Forks & variants (1)

Senior Data Scientist has 1 known copy in the catalog totaling 47 installs. They canonicalize to this original listing.

ovachiever - 47 installs

FAQ

What scripts does senior-data-scientist provide for quick starts?

experiment_designer.py, feature_engineering_pipeline.py, and model_evaluation_suite.py under the scripts directory.

What latency targets does the skill document for production ML?

P50 under 50ms, P95 under 100ms, P99 under 200ms, plus over 1000 requests per second and 99.9% uptime goals.

Is Senior Data Scientist safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

AI & Agent Buildingagentsllmautomation

About

Senior Data Scientist by the numbers

senior-data-scientist capabilities & compatibility

What senior-data-scientist says it does

Add your badge

How do I design experiments, engineer features, and ship production ML with observability instead of notebook-only prototypes?

Who is it for?

When should I use this skill?

What you get

By the numbers

Files

Senior Data Scientist

Quick Start

Main Capabilities

Core Expertise

Tech Stack

Reference Documentation

1. Statistical Methods Advanced

2. Experiment Design Frameworks

3. Feature Engineering Patterns

Production Patterns

Pattern 1: Scalable Data Processing

Pattern 2: ML Model Deployment

Pattern 3: Real-Time Inference

Best Practices

Development

Production

Team Leadership

Performance Targets

Security & Compliance

Common Commands

Resources

Senior-Level Responsibilities

Experiment Design Frameworks

Overview

Core Principles

Production-First Design

Performance by Design

Security & Privacy

Advanced Patterns

Pattern 1: Distributed Processing

Pattern 2: Real-Time Systems

Pattern 3: ML at Scale

Best Practices

Code Quality

Performance

Reliability

Tools & Technologies

Further Reading

Feature Engineering Patterns

Overview

Core Principles

Production-First Design

Performance by Design

Security & Privacy

Advanced Patterns

Pattern 1: Distributed Processing

Pattern 2: Real-Time Systems

Pattern 3: ML at Scale

Best Practices

Code Quality

Performance

Reliability

Tools & Technologies

Further Reading

Statistical Methods Advanced

Overview

Core Principles

Production-First Design

Performance by Design

Security & Privacy

Advanced Patterns

Pattern 1: Distributed Processing

Pattern 2: Real-Time Systems

Pattern 3: ML at Scale

Best Practices

Code Quality

Performance

Reliability

Tools & Technologies