
Scikit Survival
Implement competing-risks survival analysis with cumulative incidence functions when multiple mutually exclusive outcomes block one another.
Overview
scikit-survival is an agent skill for the Build phase that guides competing-risks survival analysis using cumulative incidence instead of Kaplan–Meier when event types are mutually exclusive.
Install
npx skills add https://github.com/k-dense-ai/scientific-agent-skills --skill scikit-survival
What is this skill?
- Models competing risks when one event type prevents others (cancer vs cardiovascular death, churn reasons, failure modes).
- Estimates Cumulative Incidence Function CIF_k(t) instead of Kaplan–Meier when risks compete.
- Documents when to use competing risks vs standard survival vs recurrent-events methods.
- Covers covariate effects across event types in medical and operational examples.
- Clarifies that KM overestimates probabilities when competing risks are present.
Adoption & trust: 549 installs on skills.sh; 27.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have time-to-event data with several exclusive outcomes but default survival curves overstate the chance of the event you care about.
Who is it for?
Indie data builders, health-tech side projects, or agents implementing survival pipelines where death, relapse, churn reasons, or failure modes compete.
Skip if: Single-event survival only, non-exclusive recurrent events, or teams that need clinical trial regulatory sign-off without biostat review.
When should I use this skill?
Multiple mutually exclusive event types exist, one event prevents others, and you need type-specific probabilities or covariate effects—not single-event or recurrent-only survival.
What do I get? / Deliverables
You choose CIF-based competing-risks methods, avoid invalid KM assumptions, and frame covariate effects per event type in analysis code or notebooks.
- Competing-risks analysis plan
- CIF-oriented modeling approach
- Valid vs invalid method selection notes
Journey fit
Statistical modeling and CIF estimation belong in the build phase where analysis code ships alongside product or research pipelines. Survival pipelines are implemented as backend or notebook-side computation, not UI or launch copy.
How it compares
Use instead of textbook Kaplan–Meier when the readme’s competing-risks conditions apply, not as a general stats tutor.
Common Questions / FAQ
Who is scikit-survival for?
Solo builders and agent users implementing survival or time-to-event analysis in Python-oriented scientific workflows when multiple exclusive outcomes exist.
When should I use scikit-survival?
During Build when modeling cohort outcomes—for example transplant competing infections, SaaS churn reasons, or reliability failure modes—before you ship dashboards or reports that quote event probabilities.
Is scikit-survival safe to install?
Review the Security Audits panel on this Prism page and treat statistical outputs as requiring your own validation against domain standards.
SKILL.md
# Competing Risks Analysis

## Overview

Competing risks occur when subjects can experience one of several mutually exclusive events (event types). When one event occurs, it prevents ("competes with") the occurrence of other events.

### Examples of Competing Risks

**Medical Research:**
- Death from cancer vs. death from cardiovascular disease vs. death from other causes
- Relapse vs. death without relapse in cancer studies
- Different types of infections in transplant patients

**Other Applications:**
- Job termination: retirement vs. resignation vs. termination for cause
- Equipment failure: different failure modes
- Customer churn: different reasons for leaving

### Key Concept: Cumulative Incidence Function (CIF)

The **Cumulative Incidence Function (CIF)** represents the probability of experiencing a specific event type by time *t*, accounting for the presence of competing risks.

**CIF_k(t) = P(T ≤ t, event type = k)**

This differs from the Kaplan-Meier estimator: 1 - KM(t), computed with competing events treated as censoring, overestimates event probabilities when competing risks are present.

## When to Use Competing Risks Analysis

**Use competing risks when:**
- Multiple mutually exclusive event types exist
- Occurrence of one event prevents others
- You need to estimate the probability of specific event types
- You want to understand how covariates affect different event types

**Don't use when:**
- Only one event type is of interest (standard survival analysis)
- Events are not mutually exclusive (use recurrent events methods)
- Competing events are extremely rare (can treat as censoring)

## Cumulative Incidence with Competing Risks

### cumulative_incidence_competing_risks Function

Estimates the cumulative incidence function for each event type.
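What such an estimate involves can be sketched from scratch. The block below is a minimal NumPy illustration of the textbook Aalen-Johansen estimator, not scikit-survival's own implementation. The helper names `aalen_johansen_cif` and `naive_one_minus_km` and the toy data are assumptions made for this example. It also computes the naive 1 - KM curve that treats competing events as censoring, so the overestimation described earlier can be seen directly.

```python
import numpy as np


def aalen_johansen_cif(event, time, cause):
    """CIF for one cause: running sum of S(t-) * d_cause / n_at_risk.

    event: integer codes, 0 = censored, 1..K = event type (assumed coding).
    """
    event = np.asarray(event)
    time = np.asarray(time, dtype=float)
    event_times = np.unique(time[event > 0])
    surv = 1.0   # all-cause survival just before the current time
    total = 0.0  # cumulative incidence accumulated so far
    cif = np.empty(len(event_times))
    for i, t in enumerate(event_times):
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (event == cause))
        d_any = np.sum((time == t) & (event > 0))
        total += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        cif[i] = total
    return event_times, cif


def naive_one_minus_km(event, time, cause):
    """1 - KM for one cause, wrongly treating other causes as censoring."""
    event = np.asarray(event)
    time = np.asarray(time, dtype=float)
    event_times = np.unique(time[event > 0])
    surv = 1.0
    out = np.empty(len(event_times))
    for i, t in enumerate(event_times):
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (event == cause))
        surv *= 1.0 - d_cause / at_risk
        out[i] = 1.0 - surv
    return event_times, out


# Toy data (illustrative only): 0 = censored, 1 and 2 = competing event types
event_types = np.array([0, 1, 2, 1, 0, 2, 1])
times = np.array([10.2, 5.3, 8.1, 3.7, 12.5, 6.8, 4.2])

t, cif_1 = aalen_johansen_cif(event_types, times, cause=1)
_, cif_2 = aalen_johansen_cif(event_types, times, cause=2)
_, km_2 = naive_one_minus_km(event_types, times, cause=2)

print("CIF type 1 at end:", cif_1[-1])
print("CIF type 2 at end:", cif_2[-1])
print("Naive 1 - KM type 2 at end:", km_2[-1])
```

On this toy data the type-2 CIF ends at 2/7 while the naive 1 - KM estimate ends at 1/2, because the naive curve ignores that subjects who already had a type-1 event can no longer experience type 2.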
```python
import matplotlib.pyplot as plt
from sksurv.datasets import load_bmt
from sksurv.nonparametric import cumulative_incidence_competing_risks

# Load a bundled dataset with competing risks (scikit-survival >= 0.24)
# status: 0=censored, 1=transplant-related mortality, 2=relapse
X, y = load_bmt()
event = y["status"]
time = y["ftime"]

# Compute cumulative incidence for each event type
# Returns: unique time points and an array of shape (n_risks + 1, n_times);
# row 0 is the total (any-event) incidence, row k is the CIF for event type k
time_points, cif = cumulative_incidence_competing_risks(event, time)

# Plot cumulative incidence functions
plt.figure(figsize=(10, 6))
plt.step(time_points, cif[1], where='post', label='Transplant-related mortality', linewidth=2)
plt.step(time_points, cif[2], where='post', label='Relapse', linewidth=2)
plt.xlabel('Time (months)')
plt.ylabel('Cumulative Incidence')
plt.title('Competing Risks: Transplant-Related Mortality vs Relapse')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
```

### Interpretation

- **CIF at time t**: Probability of experiencing that specific event by time t
- **Sum of all CIFs**: Total probability of experiencing any event (all cause)
- **1 - sum of CIFs**: Probability of remaining free of every event type at time t

## Data Format for Competing Risks

### Creating Structured Array with Event Types

```python
import numpy as np
from sksurv.util import Surv

# Event types: 0 = censored, 1 = event type 1, 2 = event type 2
event_types = np.array([0, 1, 2, 1, 0, 2, 1])
times = np.array([10.2, 5.3, 8.1, 3.7, 12.5, 6.8, 4.2])

# Create survival array
# For competing risks: event=True if any event occurred
# Store event type separately or encode in the event field
y = Surv.from_arrays(
    event=(event_types > 0),  # True if any event
    time=times
)

# Keep event_types for distinguishing between event types;
# cumulative_incidence_competing_risks takes these integer codes directly
```

### Converting Data with Event Types

```python
import pandas as pd
from sksurv.util import Surv

# Assume data has: time, event_type columns
# event_type: 0=censored, 1=type1, 2=type2, etc.
df = pd.read_csv('competing_risks_data.csv')

# Create survival outcome
y = Surv.from_arrays(
    event=(df['event_type'] > 0),
    time=df['time']
)

# Store event types
event_types = df['event_type'].values
```

## Comparing Cumulative Incidence Between Groups

### Stratified Analysis

```python
from sksurv.nonparametric import cumulative_incidence_competing_risks
import matplotlib.pyplot as