Ctf Ai Ml

Name: Ctf Ai Ml
Author: ljagiello

ljagiello/ctf-skills

Run adversarial ML playbooks—FGSM, PGD, C&W, patches, poisoning, and backdoor checks—when solving CTF or red-team ML challenges.

Overview

CTF AI/ML is an agent skill for the Ship phase that documents adversarial ML techniques—FGSM, PGD, C&W, patches, poisoning, and backdoor detection—for CTF and security research workflows.

Install

npx skills add https://github.com/ljagiello/ctf-skills --skill ctf-ai-ml

What is this skill?

Covers FGSM, PGD, and Carlini–Wagner adversarial example generation patterns
Adversarial patch generation and physical-world evasion techniques
Foundational evasion, data poisoning, and neural backdoor detection sections
CTF walkthroughs including foolbox on Keras MNIST-Auth and hand-rolled FGSM via K.gradients
Cross-links to model-attacks.md and llm-attacks.md for weight extraction and LLM-specific chains
Documents FGSM, PGD, and C&W as three core adversarial example generation methods
Table of contents spans six major attack categories plus two named CTF walkthroughs

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 3.3k installs on skills.sh; 2.3k GitHub stars; 1/3 security scanners passed (skills.sh audits).

What problem does it solve?

You have a classifier or CTF ML challenge and need proven adversarial attack patterns instead of guessing perturbation math or library APIs.

Who is it for?

Solo builders and security hobbyists attacking or hardening ML models in CTFs, labs, or deliberate red-team exercises.

Skip if: Teams shipping consumer features without a security or competition context, or anyone seeking benign model training and MLOps pipelines only.

When should I use this skill?

You are solving ML security CTF tasks, crafting adversarial examples or patches, or evaluating classifier robustness with documented attack patterns.

What do I get? / Deliverables

You get technique-specific procedures and CTF-oriented examples you can adapt in foolbox, Keras, or custom gradient code to misclassify targets or validate model defenses.

Adversarial inputs or patches that flip classifier predictions
Documented attack steps aligned to FGSM, PGD, C&W, or poisoning patterns

Journey fit

Primary fit

Adversarial ML work belongs in Ship because it stress-tests model robustness and evasion before or after deployment, not during greenfield product scoping. The Security subphase is the canonical shelf for offensive and defensive ML techniques, and this skill is referenced alongside the model-attacks and llm-attacks docs.

Also useful

Build: Backend, data & payments

Operate: Iteration & experiments

How it compares

Use this attack-pattern playbook instead of generic Data Science skills that focus on training metrics rather than evasion and poisoning.

Common Questions / FAQ

Who is ctf-ai-ml for?

CTF players, indie security researchers, and builders hardening or breaking ML classifiers who already work in Python/Keras-style stacks and want adversarial recipes.

When should I use ctf-ai-ml?

Use it during Ship security reviews of models, when practicing nullcon/UTCTF-style ML challenges, or when you need FGSM/PGD/C&W or patch attacks before writing custom exploit code.

Is ctf-ai-ml safe to install?

Treat it as offensive-security documentation; review the Security Audits panel on this Prism page and only run attacks on systems you own or are authorized to test.

SKILL.md

# CTF AI/ML - Adversarial ML

Adversarial machine learning techniques: generating adversarial examples, physical-world patches, evasion attacks, data poisoning, and backdoor detection. For model weight manipulation and extraction attacks, see [model-attacks.md](model-attacks.md). For LLM-specific attacks, see [llm-attacks.md](llm-attacks.md).

## Table of Contents
- [Adversarial Example Generation (FGSM, PGD, C&W)](#adversarial-example-generation-fgsm-pgd-cw)
  - [FGSM (Fast Gradient Sign Method)](#fgsm-fast-gradient-sign-method)
  - [PGD (Projected Gradient Descent)](#pgd-projected-gradient-descent)
  - [C&W (Carlini & Wagner) Attack](#cw-carlini--wagner-attack)
- [Adversarial Patch Generation](#adversarial-patch-generation)
- [Evasion Attacks on ML Classifiers (Foundational)](#evasion-attacks-on-ml-classifiers-foundational)
- [Data Poisoning (Foundational)](#data-poisoning-foundational)
- [Backdoor Detection in Neural Networks (Foundational)](#backdoor-detection-in-neural-networks-foundational)
- [foolbox L1BasicIterativeAttack on Keras MNIST-Auth (nullcon 2019)](#foolbox-l1basiciterativeattack-on-keras-mnist-auth-nullcon-2019)
- [Hand-Rolled Keras FGSM via K.gradients (UTCTF 2019)](#hand-rolled-keras-fgsm-via-kgradients-utctf-2019)

---

## Adversarial Example Generation (FGSM, PGD, C&W)

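**Pattern:** Craft imperceptible perturbations to input images that cause a classifier to misclassify. These attacks exploit the locally near-linear behavior of neural networks in high-dimensional input spaces: many tiny per-pixel changes, all pushed in the direction of the gradient, add up to a large change in the output. This pattern is common in CTF challenges where you must fool an image classifier into outputting a specific target class.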
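
The linearity argument can be checked without any ML framework. The following is a minimal sketch, assuming a purely linear score `w . x` as a stand-in for a network; the function name `logit_shift`, the random weights, and the dimensions are illustrative assumptions, not part of the original skill. It shows that a sign-aligned perturbation of size `epsilon` per coordinate shifts the score by `epsilon * sum(|w|)`, which grows with input dimension.

```python
import random

def logit_shift(dim, epsilon=0.03, seed=0):
    """Return (observed score shift, epsilon * L1 norm of w) for a toy linear model."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(dim)]   # hypothetical weights
    x = [rng.uniform(0, 1) for _ in range(dim)]    # hypothetical input
    sign = lambda v: (v > 0) - (v < 0)

    # Move every coordinate by epsilon in the direction that increases w . x
    x_adv = [xi + epsilon * sign(wi) for xi, wi in zip(x, w)]

    clean = sum(wi * xi for wi, xi in zip(w, x))
    adv = sum(wi * xi for wi, xi in zip(w, x_adv))
    return adv - clean, epsilon * sum(abs(wi) for wi in w)

for dim in (10, 1000, 100000):
    shift, bound = logit_shift(dim)
    print(f"dim={dim:>6}  shift={shift:.3f}  epsilon*||w||_1={bound:.3f}")
```

The per-coordinate change stays at `epsilon` in every case; only the number of coordinates changes, which is why image-sized inputs are so easy to push across a decision boundary.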

### FGSM (Fast Gradient Sign Method)

Single-step attack. Fast but produces larger perturbations than iterative methods.
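
For reference, the single step can be written as an update rule. The notation here is an assumption introduced for illustration: $f$ is the classifier, $\mathcal{L}$ the cross-entropy loss, $y$ the label being moved away from, $y_{\text{target}}$ a chosen target label, and $\epsilon$ the L-infinity budget.

$$
x_{\text{adv}} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_x \mathcal{L}(f(x), y)\big)
$$

A targeted variant steps in the opposite direction, lowering the loss for the chosen class:

$$
x_{\text{adv}} = x - \epsilon \cdot \operatorname{sign}\big(\nabla_x \mathcal{L}(f(x), y_{\text{target}})\big)
$$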

```python
import torch
import torch.nn.functional as F
from torchvision import transforms, models
from PIL import Image

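# Load model and image
# `pretrained=True` is deprecated in torchvision >= 0.13; the weights enum replaces it
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()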

img = Image.open("input.png").convert("RGB")
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
x = preprocess(img).unsqueeze(0)
x.requires_grad_(True)

# Forward pass
output = model(x)
original_class = output.argmax(dim=1).item()
print(f"Original prediction: class {original_class}")

# Untargeted FGSM: maximize loss for the predicted class (a stand-in for the true label)
loss = F.cross_entropy(output, torch.tensor([original_class]))
loss.backward()

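# Generate adversarial example
epsilon = 0.03  # perturbation budget (L-inf norm, measured in normalized input units)
x_adv = x + epsilon * x.grad.sign()
# x is normalized, so [0, 1] is not its valid range; clamp to the input's own range instead
x_adv = torch.clamp(x_adv, x.min().item(), x.max().item()).detach()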

# Check adversarial prediction
with torch.no_grad():
    adv_output = model(x_adv)
    adv_class = adv_output.argmax(dim=1).item()
    print(f"Adversarial prediction: class {adv_class}")
    print(f"Attack successful: {adv_class != original_class}")
```
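
Because `x` in the example above has been passed through `transforms.Normalize`, `epsilon = 0.03` is a budget in normalized units rather than pixel units. The sketch below shows one way to convert a pixel-space budget and the valid `[0, 1]` pixel range into normalized space, assuming the same ImageNet mean and standard deviation as the preprocessing above. The helper names `normalized_budget` and `normalized_bounds` are hypothetical.

```python
# ImageNet statistics, copied from the transforms.Normalize call in the FGSM example
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalized_budget(eps_pixel):
    """Per-channel epsilon in normalized units for a pixel-space budget in [0, 1] scale."""
    return [eps_pixel / s for s in STD]

def normalized_bounds():
    """Per-channel (lower, upper) values that correspond to pixel values 0 and 1."""
    lower = [(0.0 - m) / s for m, s in zip(MEAN, STD)]
    upper = [(1.0 - m) / s for m, s in zip(MEAN, STD)]
    return lower, upper

eps = normalized_budget(8 / 255)   # 8/255 is a commonly used pixel-space budget
lower, upper = normalized_bounds()
for c, name in enumerate("RGB"):
    print(f"{name}: eps={eps[c]:.4f}  valid range=[{lower[c]:.4f}, {upper[c]:.4f}]")
```

With these values, the perturbation and the clamp can be applied per channel so that the resulting image still maps back to valid pixels when saved for submission.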

### PGD (Projected Gradient Descent)

Iterative FGSM with projection. Stronger attack, considered the standard for robustness evaluation.
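
The step-then-project loop is easier to see on a toy problem first. This is a framework-free sketch, assuming a linear loss `w . x` whose gradient with respect to `x` is simply `w`; the function name `pgd_linear` and the sample values are illustrative assumptions. It omits the random start and only demonstrates how the projection keeps the result inside the epsilon-ball and the valid pixel range.

```python
def pgd_linear(x, w, epsilon=0.03, alpha=0.007, num_steps=40):
    """Toy PGD that maximizes the linear loss w . x inside an L-inf ball around x."""
    sign = lambda v: (v > 0) - (v < 0)
    x_adv = list(x)
    for _ in range(num_steps):
        # Gradient of w . x with respect to x is w, so step along sign(w)
        stepped = [xa + alpha * sign(wi) for xa, wi in zip(x_adv, w)]
        # Project: limit the total offset from the ORIGINAL x to [-epsilon, epsilon]
        projected = [xi + max(-epsilon, min(epsilon, s - xi))
                     for s, xi in zip(stepped, x)]
        # Keep values inside the valid pixel range [0, 1]
        x_adv = [max(0.0, min(1.0, p)) for p in projected]
    return x_adv

x = [0.5, 0.2, 0.99]
w = [1.0, -2.0, 0.5]
x_adv = pgd_linear(x, w)
print([round(a - b, 4) for a, b in zip(x_adv, x)])
```

Each coordinate moves by at most `epsilon`, and the third coordinate moves less because the `[0, 1]` clamp stops it at 1.0 before the budget is spent.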

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y_true, epsilon=0.03, alpha=0.007, num_steps=40):
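    """
    Projected Gradient Descent attack (Madry et al., 2018).
    alpha = step size per iteration, epsilon = total perturbation budget (L-inf).
    Assumes x holds un-normalized pixel values in [0, 1], since results are clamped to that range.
    """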
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-epsilon, epsilon)
    x_adv = torch.clamp(x_adv, 0, 1).detach()

    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        output = model(x_adv)
        loss = F.cross_entropy(output, y_true)
        loss.backward()

        with torch.no_grad():
            # Step in gradient direction
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Project back to epsilon-ball around original input
            delta = torch.clamp(x_adv - x, min=-epsilon, max=epsilon)
            x_adv = torch.clamp(x + delta, 0, 1).detach()

    return x_adv

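def targeted_pgd(model, x, y_target, epsilon=0.03, alpha=0.007, num_steps=40):
    """
    Targeted variant: push x toward class y_target.
    NOTE: the source is cut off after `y_target`; the default arguments and the
    body below are an assumed reconstruction that mirrors pgd_attack.
    """
    x_adv = x.clone().detach()

    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y_target)
        loss.backward()

        with torch.no_grad():
            # Step against the gradient to lower the loss for the target class
            x_adv = x_adv - alpha * x_adv.grad.sign()
            delta = torch.clamp(x_adv - x, min=-epsilon, max=epsilon)
            x_adv = torch.clamp(x + delta, 0, 1).detach()

    return x_adv
```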
