LLM

Credential-Blind Agentic Pentesting — Part I: Bidirectional Tokenization of Secrets, Identities and Topology

I want an AI agent that can do offensive and defensive security work without ever leaking a credential, a hostname, an IP or a domain to the model provider, and that keeps this property no matter which provider sits behind the API. This is Part I of the research. It covers the threat model, the state of the art, the core mechanism (bidirectional tokenization with host-side resolution), and four experiments that run on real HackTheBox machines, including an autonomous agent that drives a real domain controller while seeing nothing but opaque tokens.

Studying LLM Workflows Until They Actually Find Cool Bugs

Two weeks ago I published a deep dive on prompt engineering for security research. This article is about everything that lives one layer above the prompt: the hooks, MCPs, subagents, scope guards, and validators that make those prompts viable in a real bug bounty workflow. Six axes, sourced numbers, and an honest before-and-after comparison of my first attempt (27 slash commands, a 74k-vuln knowledge base, one monolithic configuration) and the rewrite (8 to 12 skills, no embeddings, hard caps everywhere, a deterministic validator MCP at the gate).

Prompting for Security Research: How to Build Prompts That Actually Find Vulnerabilities

Most people use LLMs for security wrong. They ask ‘find all bugs’ and get noise. This article breaks down the empirical research behind what actually works: structured prompting, adversarial self-verification, CWE-specialized chains, context engineering, and the full composite prompt template that gets you from noise to actionable findings. With numbers.