Prompt Injection - RAG Security - Agent Testing
Stop prompt injection from hijacking your AI stack
Security testing for RAG, agents, and LLM workflows-delivered by an independent architect.
Security testing that finds how attackers hijack your LLM's instructions before launch. With 15 years shipping production systems and still building the same RAG and agent platforms put under test, every assessment remains manual, adapts in real time, and surfaces the context automated scanners miss.
You get direct access to the person doing the work, real-time findings in Slack/email, and remediation guidance grounded in production engineering-not agency handoffs.
Need comprehensive Penetration Testing or broader AI Security Consulting ? I cover those too.
Honest stats
15 Years
Production engineering across SaaS & Tesla
Independent Architect
Work directly with me-not an agency pod
Hands-On Testing
Manual testing that adapts in real time
Early client? Mention it on the call and I'll apply preferred early-adopter rates.
How I test
How I test: methodology, coverage & tools
One playbook covers everything from scoping to retests: manual prompt injection chains, RAG poisoning, agent/tool abuse, replay harnesses, and remediation pairing.
Scope & attack surface mapping
Walk the product, inventory prompts, review tool catalogs, and decide how aggressive testing should be so I can focus on the riskiest flows first.
Manual prompt injection chains
200+ adversarial payloads, multi-turn conversations, and delimiter-bypass tricks executed manually so responses can be adapted in real time.
RAG security & document poisoning
Replay harnesses push poisoned PDFs/CSV/markdown into ingestion pipelines, validate metadata filters, and confirm tenant binding on retrieval.
Agent security & tool abuse
Attack MCP servers, LangChain tools, and custom function calls to coerce parameter changes, escalate permissions, or trigger unintended actions.
Automated validation harnesses
Lightweight scripts replay successful attacks, fuzz guardrails, and plug into CI so you can keep testing after the engagement.
Report & remediation pairing
Findings ship in Markdown + PDF with impact, repro, and fixes. I stick around for retests and pairing sessions until the mitigations hold.
Guardrails validation harnesses
Structured output validators (Guardrails AI-style) keep prompts, tools, and agents bound to approved schemas and get wired into your eval suite.
Attack examples & prevention
Attack examples & prevention checklists
A sample of the payloads and defenses that show up in every engagement. The specifics stay private, but the categories match what real attackers abuse.
Prompt Injection Attack Examples
- System prompt override payloads that exfiltrate API keys or user data.
- Indirect prompt injection from poisoned PDF/CSV uploads in RAG workflows that turn into prompt injection vulnerabilities.
- Agent tool abuse where function calling instructions leak secrets.
- Multi-turn social-engineering prompts that bypass output filters.
Prompt Injection Attack Prevention
- Context-aware allowlists + deny lists enforced inside model middleware.
- RAG metadata filters, document hashing, and tenant binding.
- Agent policy enforcement, tool parameter validation, and logging.
- Continuous retesting via adversarial replay suites and CI pipelines.
Want a deeper breakdown? Read prompt injection vs. jailbreaking to understand how the two differ and why true prompt injection defense requires manual testing.
Deliverables & reports
Deliverables & reports
You leave with evidence investors, auditors, and engineers can use: human-readable writeups, payloads, repro steps, a retest plan, and the PDF report stakeholders expect.
- Custom adversarial payload catalog mapped to your prompts, tools, and RAG stack.
- Findings shared live in Slack/email so fixes can start immediately.
- Markdown + PDF report with reproduction steps, payloads, and fixes.
- 30-day retest window to validate mitigations.
- Optional pairing sessions to harden guardrails or implement playbooks.
Engagement models & pricing
Engagement models & pricing
Pick the depth you need-from fast threat-modeling sessions to full assessments bundled with penetration testing or platform builds.
Focused audit
Manual injection testing
- 1-2 week assessment focused on injection risk
- RAG + agent coverage with replay harnesses
- Detailed report plus remediation pairing
- 30-day retest included
Full security test
Injection + OWASP coverage
- Blends injection testing with OWASP methodology
- Application, API, and infrastructure layers
- Ideal before investor or enterprise reviews
- Replay scripts for CI/regression suites
RAG build or hardening
Defenses baked in
- Ingestion, eval harnesses, and guardrails
- Security integrated from the start
- For teams without platform engineers
Advisory session
60-minute workshop
- Threat modeling + prioritized next steps
- Expert gut check on your approach
- Follow-up summary with recommendations
Ready to harden it?
Walk through real RAG attack scenarios
Walk through prompt injection chains, RAG poisoning, and agent abuse scenarios in a 30-minute session tailored to your stack. Every demo includes the replay harnesses and mitigation playbooks I deploy in production.
FAQ
Common questions
Is this necessary for early-stage startups?
Yes. Attackers don't check company size-they look for weak guardrails. Most platforms I test have at least one critical injection path.
Do you use automated tools?
Automation helps with bookkeeping, but every attack chain is manual so I can adapt to how your model responds. Scanners miss nuance.
Do you have case studies?
CodeWheel is still building its public portfolio. If you need polished enterprise logos today, I'm not the right fit. If you want transparent work and early-adopter pricing, let's talk.
Will you sign NDAs?
Absolutely. Standard MNDAs or your paper. Role-based accounts in staging plus API credentials are all I need-never full database dumps.
Can you help implement fixes?
Yes. I build RAG systems and guardrails, so if you want implementation help we can scope it with the test or as a follow-on.
What's the difference between prompt injection and jailbreaking?
Jailbreaking defeats model-level safety filters. Prompt injection hijacks downstream systems-tools, APIs, billing-after the model accepts a malicious instruction.
Related services
Need broader coverage?
Injection testing pairs well with full security assessments, consulting, and platform builds.
Security testing
OWASP methodology, API testing, and vulnerability assessment for your full stack.
Explore security testingAI platform development
RAG implementation with security built in from the first sprint.
See platform buildsReady to get started?
Share your architecture, RAG setup, and timeline. I'll outline scope, approach, and pricing. If I'm not the right fit, I'll tell you.
I don't just identify risks - I help fix them through architecture reviews, remediation pairing, and secure platform builds.
Contact
Email: matt@codewheel.ai
Based in the Bay Area. Remote-friendly, but happy to meet founders locally.
Verify my background on LinkedInServing companies across the San Francisco Bay Area, Silicon Valley, and remote teams worldwide.
