
CTF AI/ML
Run adversarial ML playbooks—FGSM, PGD, C&W, patches, poisoning, and backdoor checks—when solving CTF or red-team ML challenges.
Overview
CTF AI/ML is an agent skill for the Ship phase that documents adversarial ML techniques—FGSM, PGD, C&W, patches, poisoning, and backdoor detection—for CTF and security research workflows.
Install
npx skills add https://github.com/ljagiello/ctf-skills --skill ctf-ai-ml
What is this skill?
- Covers FGSM, PGD, and Carlini–Wagner adversarial example generation patterns
- Adversarial patch generation and physical-world evasion techniques
- Foundational evasion, data poisoning, and neural backdoor detection sections
- CTF walkthroughs including foolbox on Keras MNIST-Auth and hand-rolled FGSM via K.gradients
- Cross-links to model-attacks.md and llm-attacks.md for weight extraction and LLM-specific chains
- Documents FGSM, PGD, and C&W as three core adversarial example generation methods
- Table of contents spans six major attack categories plus two named CTF walkthroughs
Adoption & trust: 3.3k installs on skills.sh; 2.3k GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have a classifier or CTF ML challenge and need proven adversarial attack patterns instead of guessing perturbation math or library APIs.
Who is it for?
Solo builders and security hobbyists attacking or hardening ML models in CTFs, labs, or deliberate red-team exercises.
Skip if: Teams shipping consumer features without a security or competition context, or anyone seeking benign model training and MLOps pipelines only.
When should I use this skill?
You are solving ML security CTF tasks, crafting adversarial examples or patches, or evaluating classifier robustness with documented attack patterns.
What do I get? / Deliverables
You get technique-specific procedures and CTF-oriented examples you can adapt in foolbox, Keras, or custom gradient code to misclassify targets or validate model defenses.
- Adversarial inputs or patches that flip classifier predictions
- Documented attack steps aligned to FGSM, PGD, C&W, or poisoning patterns
Journey fit
Adversarial ML work belongs in Ship because it stress-tests model robustness and evasion before or after deployment, not during greenfield product scoping. The Security subphase is the canonical shelf for offensive and defensive ML techniques, which this skill references alongside the model-attacks and llm-attacks docs.
How it compares
Use this attack-pattern playbook instead of generic Data Science skills that focus on training metrics rather than evasion and poisoning.
Common Questions / FAQ
Who is ctf-ai-ml for?
CTF players, indie security researchers, and builders hardening or breaking ML classifiers who already work in Python/Keras-style stacks and want adversarial recipes.
When should I use ctf-ai-ml?
Use it during Ship security reviews of models, when practicing nullcon/UTCTF-style ML challenges, or when you need FGSM/PGD/C&W or patch attacks before writing custom exploit code.
Is ctf-ai-ml safe to install?
Treat it as offensive-security documentation; review the Security Audits panel on this Prism page and only run attacks on systems you own or are authorized to test.
SKILL.md
# CTF AI/ML - Adversarial ML

Adversarial machine learning techniques: generating adversarial examples, physical-world patches, evasion attacks, data poisoning, and backdoor detection. For model weight manipulation and extraction attacks, see [model-attacks.md](model-attacks.md). For LLM-specific attacks, see [llm-attacks.md](llm-attacks.md).

## Table of Contents

- [Adversarial Example Generation (FGSM, PGD, C&W)](#adversarial-example-generation-fgsm-pgd-cw)
  - [FGSM (Fast Gradient Sign Method)](#fgsm-fast-gradient-sign-method)
  - [PGD (Projected Gradient Descent)](#pgd-projected-gradient-descent)
  - [C&W (Carlini & Wagner) Attack](#cw-carlini--wagner-attack)
- [Adversarial Patch Generation](#adversarial-patch-generation)
- [Evasion Attacks on ML Classifiers (Foundational)](#evasion-attacks-on-ml-classifiers-foundational)
- [Data Poisoning (Foundational)](#data-poisoning-foundational)
- [Backdoor Detection in Neural Networks (Foundational)](#backdoor-detection-in-neural-networks-foundational)
- [foolbox L1BasicIterativeAttack on Keras MNIST-Auth (nullcon 2019)](#foolbox-l1basiciterativeattack-on-keras-mnist-auth-nullcon-2019)
- [Hand-Rolled Keras FGSM via K.gradients (UTCTF 2019)](#hand-rolled-keras-fgsm-via-kgradients-utctf-2019)

---

## Adversarial Example Generation (FGSM, PGD, C&W)

**Pattern:** Craft imperceptible perturbations to input images that cause a classifier to misclassify. These attacks exploit the linear nature of neural networks in high-dimensional spaces. Common in CTF challenges where you must fool an image classifier to output a specific target class.

### FGSM (Fast Gradient Sign Method)

Single-step attack. Fast but produces larger perturbations than iterative methods.

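The linearity argument behind this single step can be checked on a toy model. The following is a minimal sketch, assuming a purely linear score `w·x` in place of a real network. The helper names `sign` and `linear_fgsm_shift`, the random weights, and the chosen dimensions are illustrative and do not come from any challenge code. For a linear score the gradient with respect to `x` is `w`, so the FGSM step is `x + epsilon * sign(w)`.

```python
import random


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def linear_fgsm_shift(w, x, epsilon):
    """Change in the linear score w.x after one FGSM step of size epsilon.

    Assumption: the "model" is just the dot product w.x, so the gradient
    of the score with respect to x is w itself.
    """
    x_adv = [xi + epsilon * sign(wi) for xi, wi in zip(x, w)]
    score = sum(wi * xi for wi, xi in zip(w, x))
    score_adv = sum(wi * xi for wi, xi in zip(w, x_adv))
    return score_adv - score


rng = random.Random(0)
for dim in (10, 1000, 100000):
    w = [rng.gauss(0, 1) for _ in range(dim)]
    x = [rng.gauss(0, 1) for _ in range(dim)]
    shift = linear_fgsm_shift(w, x, epsilon=0.03)
    # Each coordinate moves by at most epsilon, yet the score shift grows with dim.
    print(f"dim={dim:>6}  score shift={shift:.3f}")
```

The shift equals `epsilon` times the L1 norm of `w`, so it grows with input dimension even though no single coordinate changes by more than `epsilon`. This is the effect the gradient-sign step relies on.
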
```python
import torch
import torch.nn.functional as F
from torchvision import transforms, models
from PIL import Image

# Load model and image (weights= replaces the deprecated pretrained=True flag)
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.eval()

img = Image.open("input.png").convert("RGB")
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
x = preprocess(img).unsqueeze(0)
x.requires_grad_(True)

# Forward pass
output = model(x)
original_class = output.argmax(dim=1).item()
print(f"Original prediction: class {original_class}")

# Untargeted FGSM: maximize loss for the originally predicted class
# (used here as a stand-in for the true label)
loss = F.cross_entropy(output, torch.tensor([original_class]))
loss.backward()

# Generate adversarial example
# Note: epsilon is applied in normalized input space, not raw [0, 1] pixel space
epsilon = 0.03  # perturbation budget (L-inf norm)
x_adv = x + epsilon * x.grad.sign()
# Keep values inside the range spanned by the original normalized input
x_adv = torch.clamp(x_adv, x.min().item(), x.max().item()).detach()

# Check adversarial prediction
with torch.no_grad():
    adv_output = model(x_adv)
    adv_class = adv_output.argmax(dim=1).item()

print(f"Adversarial prediction: class {adv_class}")
print(f"Attack successful: {adv_class != original_class}")
```

### PGD (Projected Gradient Descent)

Iterative FGSM with projection. Stronger attack, considered the standard for robustness evaluation.

```python
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y_true, epsilon=0.03, alpha=0.007, num_steps=40):
    """
    Projected Gradient Descent attack (Madry et al., 2018).
    alpha = step size per iteration, epsilon = total perturbation budget.
    """
    # Random start inside the epsilon-ball; inputs are assumed to lie in [0, 1]
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-epsilon, epsilon)
    x_adv = torch.clamp(x_adv, 0, 1).detach()

    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        output = model(x_adv)
        loss = F.cross_entropy(output, y_true)
        loss.backward()

        with torch.no_grad():
            # Step in gradient direction
            x_adv = x_adv + alpha * x_adv.grad.sign()
            # Project back to epsilon-ball around original input
            delta = torch.clamp(x_adv - x, min=-epsilon, max=epsilon)
            x_adv = torch.clamp(x + delta, 0, 1).detach()

    return x_adv


def targeted_pgd(model, x, y_target