I come from offensive security and I have spent a lot of time on AI research, MCP, and vulnerability hunting. When Hack The Box shipped its Certified Offensive AI Expert, I jumped on it. This is an experience report on the AI Red Teamer path and the certification, focused on how I prepared and the math behind the attacks, kept strictly within HTB’s disclosure rules.
I was invited to give a talk at FIC 2026 about a semester-long R&D project: an MCP architecture that orchestrates several home-made MCP servers to test, detect, and improve detection coverage. An attack runs in a GOAD lab; the system checks whether an alert fires, digs through the logs when it does not, writes and tests a rule, then validates that the scenario is now covered. Hundreds of scenarios a month, and three good days in Lille.
Two weeks ago I published a deep dive on prompt engineering for security research. This article is about everything that lives one layer above the prompt: the hooks, MCPs, subagents, scope guards, and validators that make those prompts viable in a real bug bounty workflow. Six axes, sourced numbers, and an honest before-and-after between my first attempt (27 slash commands, a 74k-vuln knowledge base, one monolithic configuration) and the rewrite (8 to 12 skills, no embeddings, hard caps everywhere, a deterministic validator MCP at the gate).
Most people use LLMs for security wrong. They ask ‘find all bugs’ and get noise. This article breaks down the empirical research behind what actually works: structured prompting, adversarial self-verification, CWE-specialized chains, context engineering, and the full composite prompt template that gets you from noise to actionable findings. With numbers.
Sixth article in my MCP security series. A malicious MCP server can poison OAuth Authorization Server Metadata to redirect token exchange, client registration, and PKCE verifiers to attacker-controlled endpoints while the user sees a legitimate identity provider login page. The Python and TypeScript SDKs skip RFC 8414 Section 3.3 issuer validation and perform no endpoint origin checks. Reported to Anthropic VDP, closed as duplicate of an existing tracked issue. Full technical breakdown and PoC.
Fifth article in my MCP security series. Claude Code stores MCP server approvals as plain server names with no hash, no fingerprint, and no config verification. Once approved, swapping the server’s command to an arbitrary binary triggers no re-prompt. Reported to Anthropic VDP, closed as Informative (out of threat model). Full technical breakdown.
Third article in my MCP security series. Claude Code’s .mcp.json discovery walks from CWD to filesystem root with no boundary check and no file ownership verification. On multi-user Linux systems, any user can drop /tmp/.mcp.json to inject MCP servers into another user’s Claude Code session. Not reported to Anthropic. Here’s why, and the full technical breakdown.
Fourth article in my MCP security series. By chaining a transport-layer weakness (session ID as sole routing key) with the Tasks and Elicitation systems, an attacker can inject phantom tasks into a victim’s MCP session and phish credentials through the legitimate, trusted server. CVSS 8.1, reported to Anthropic VDP and disclosed. Full technical breakdown with working PoC.
Second article in my MCP security series. A malicious MCP server returns a 401 with a crafted WWW-Authenticate header pointing resource_metadata at any URL it wants. The MCP SDK fetches that URL without origin validation, resulting in blind SSRF that affects the Python and TypeScript SDKs, Claude Desktop, and Claude Code. Reported to Anthropic VDP, closed as duplicate. Full technical details disclosed here.
The Hacker Recipes said remote SID History injection from Linux was impossible. pySIDHistory proves otherwise with two methods: DRSUAPI and DSInternals.
A deep dive into a protocol-level vulnerability in the Model Context Protocol (MCP) specification where malicious SVG icons delivered via data: URIs can escalate from XSS to full RCE on Electron clients. Reported to Anthropic VDP, closed as Informative. Disclosed here with full technical details.
Building a cross-platform GoldenGMSA tool by reverse engineering Windows cryptographic DLLs and implementing the NIST SP 800-108 KDF from scratch
Complete exploitation of a Windows Active Directory machine highlighting Kerberos resource-based constrained delegation (RBCD) techniques
The Phreaks 2600 team actively participated in the Hack4Values Grand Live Hacking Solidaire 2025, contributing to the security of NGOs and associations
Comprehensive guide to Active Directory exploitation with NetExec at the LeHack 2025 workshop
The Phreaks 2600 team secured 2nd place at the ‘Unlock Your Brain’ Student Bug Bounty, with notable individual performances, including awards for the most creative bug and the best meme.
HackTheBox - Redelegate: Hard-difficulty Windows Active Directory machine. Anonymous FTP access leading to KeePass cracking and Kerberos constrained delegation
PeppermintRoute: Our Journey Through the Challenge. This writeup covers our complete journey solving the PeppermintRoute challenge. This was part of the HackTheBox University CTF 2025, an international cybersecurity competition for students where we participated with the Phreaks 2600 team.
Complete Active Directory lab exploitation with NetExec - Advanced pentesting techniques