
Shap
Pick the right SHAP explainer and parameters when you need interpretable predictions from tree ensembles, deep models, or custom Python scorers.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill shap
What is this skill?
- Maps every major SHAP explainer class (Explainer auto-selector, TreeExplainer, and related variants) with constructor pa
- Documents masker/background data choices, feature_perturbation modes, and output_names for multi-output models
- Optimizes for tree ensembles (XGBoost, LightGBM, CatBoost, sklearn, PySpark) with fast exact Tree SHAP paths
- Guidance on switching algorithms explicitly when auto-selection is wrong for your model type
- Reference-oriented: parameters, methods, and architecture-specific tradeoffs in one place for agent-assisted coding
Adoption & trust: 554 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
Journey fit
Primary fit
Model explainability is part of building and hardening ML features before you ship dashboards or agent tools that depend on predictions. SHAP belongs in backend and data science work (training pipelines, inference services, and evaluation notebooks), not frontend polish.
Common Questions / FAQ
Is Shap safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# SHAP Explainers Reference

This document provides comprehensive information about all SHAP explainer classes, their parameters, methods, and when to use each type.

## Overview

SHAP provides specialized explainers for different model types, each optimized for specific architectures. The general `shap.Explainer` class automatically selects the appropriate algorithm based on the model type.

## Core Explainer Classes

### shap.Explainer (Auto-selector)

**Purpose**: Automatically uses Shapley values to explain any machine learning model or Python function by selecting the most appropriate explainer algorithm.

**Constructor Parameters**:
- `model`: The model to explain (function or model object)
- `masker`: Background data or masker object for feature manipulation
- `algorithm`: Optional override to force specific explainer type
- `output_names`: Names for model outputs
- `feature_names`: Names for input features

**When to Use**: Default choice when unsure which explainer to use; automatically selects the best algorithm based on model type.

### TreeExplainer

**Purpose**: Fast and exact SHAP value computation for tree-based ensemble models using the Tree SHAP algorithm.
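For intuition about what "exact" Shapley values computed against background data mean, here is a minimal brute-force sketch in plain Python. It is not the Tree SHAP algorithm and not the `shap` library's implementation: it enumerates every feature coalition, so its cost grows as 2^n, whereas Tree SHAP exploits tree structure to avoid that enumeration. The names `toy_model`, `coalition_value`, and `exact_shapley`, and the sample data, are hypothetical and exist only for this illustration. Features outside a coalition are filled in from background rows, which roughly mirrors the idea behind an interventional treatment of missing features.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    # Hypothetical scorer: one additive term plus an interaction of features 1 and 2.
    return 3.0 * x[0] + x[1] * x[2]


def coalition_value(f, x, background, subset):
    """Mean of f over background rows, with features in `subset` pinned to x."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += f(mixed)
    return total / len(background)


def exact_shapley(f, x, background):
    """Brute-force Shapley values by enumerating all coalitions (illustration only)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = set(subset)
                with_i = without_i | {i}
                gain = (coalition_value(f, x, background, with_i)
                        - coalition_value(f, x, background, without_i))
                phi[i] += weight * gain
    return phi


# Assumed toy data: three background rows and one row to explain.
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 1.0], [2.0, 1.0, 3.0]]
x = [2.0, 3.0, 2.0]

phi = exact_shapley(toy_model, x, background)
base_value = coalition_value(toy_model, x, background, set())

# Additivity: the base value plus all attributions recovers the prediction.
print(base_value + sum(phi), toy_model(x))
```

The final line checks the additivity property that SHAP attributions are expected to satisfy: the base value (the mean prediction over the background rows) plus the per-feature attributions adds up to the model output for the explained row.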

**Constructor Parameters**:
- `model`: Tree-based model (XGBoost, LightGBM, CatBoost, PySpark, or scikit-learn trees)
- `data`: Background dataset for feature integration (optional with tree_path_dependent)
- `feature_perturbation`: How to handle dependent features
  - `"interventional"`: Requires background data; follows causal inference rules
  - `"tree_path_dependent"`: No background data needed; uses training examples per leaf
  - `"auto"`: Defaults to interventional if data provided, otherwise tree_path_dependent
- `model_output`: What model output to explain
  - `"raw"`: Standard model output (default)
  - `"probability"`: Probability-transformed output
  - `"log_loss"`: Natural log of loss function
  - Custom method names like `"predict_proba"`
- `feature_names`: Optional feature naming

**Supported Models**:
- XGBoost (xgboost.XGBClassifier, xgboost.XGBRegressor, xgboost.Booster)
- LightGBM (lightgbm.LGBMClassifier, lightgbm.LGBMRegressor, lightgbm.Booster)
- CatBoost (catboost.CatBoostClassifier, catboost.CatBoostRegressor)
- PySpark MLlib tree models
- scikit-learn (DecisionTreeClassifier, DecisionTreeRegressor, RandomForestClassifier, RandomForestRegressor, ExtraTreesClassifier, ExtraTreesRegressor, GradientBoostingClassifier, GradientBoostingRegressor)

**Key Methods**:
- `shap_values(X)`: Computes SHAP values for samples; returns arrays where each row represents feature attribution
- `shap_interaction_values(X)`: Estimates interaction effects between feature pairs; provides matrices with main effects and pairwise interactions
- `explain_row(row)`: Explains individual rows with detailed attribution information

**When to Use**:
- Primary choice for all tree-based models
- When exact SHAP values are needed (not approximations)
- When computational speed is important for large datasets
- For models like random forests, gradient boosting, or XGBoost

**Example**:
```python
import shap
import xgboost

# X_train, y_train, and X_test are assumed to be defined already

# Train model
model = xgboost.XGBClassifier().fit(X_train, y_train)

# Create explainer
explainer = shap.TreeExplainer(model)

# Compute SHAP values
shap_values = explainer.shap_values(X_test)

# Compute interaction values
shap_interaction = explainer.shap_interaction_values(X_test)
```

### DeepExplainer

**Purpose**: Approximates SHAP values for deep learning models using an enhanced version of the DeepLIFT algorithm.

**Constructor Parameters**:
- `model`: Framework-dependent specification
  - **TensorFlow**: Tuple of (input_tensor, output_tensor) where output is single-dimensional
  - **PyTorch**: `nn.Module` object or tuple of `(model, layer)` for layer-specific explanations
- `data`: Background dataset for feature integration
  - **TensorFlow**: numpy arrays or pandas DataFrames
  - **PyTorch**: torch tensors
  - **Recommended size**: 100-1000 sa