Coming Soon

Your agents won't go rogue for much longer...

Privacy Terms © 2026 Rogue Security
▸ SECURE CONNECTION ▸ LATENCY: 4.2ms ▸ AGENTS: 17,432 ▸ THREAT LEVEL: NOMINAL
ROGUE TERMINAL v1.0 ESC to close
← Back to blog
March 15, 2026 by Rogue Security
data-breachvendor-riskAI-securityDLPagentic-securityruntime-security

McKinsey's Lilli Breach: Why Vendor Trust Is Not Enough

On March 9, 2026, a security researcher published a disclosure that should terrify every enterprise CISO: an autonomous AI agent breached McKinsey’s internal AI platform, Lilli, in under two hours. No credentials. No insider access. No human guidance.

The result? Full read and write access to 46.5 million chat messages about strategy, M&A, and client engagements - all stored in plaintext.

2
Hours to breach
46.5M
Messages exposed
728K
Files accessed
57K
User accounts

McKinsey fixed the vulnerabilities within hours of disclosure. They claim no evidence of unauthorized access beyond the security researcher. But that’s not the point.

The point is that one of the world’s most sophisticated consulting firms - a company that advises Fortune 500 companies on digital transformation and cybersecurity strategy - had 46 million confidential messages sitting in plaintext, protected by nothing more than an assumption that their vendor’s security was “good enough.”

The Attack: SQL Injection in 2026

Here’s what makes this breach particularly embarrassing: the vulnerability was SQL injection. Not a zero-day. Not a sophisticated supply chain attack. A vulnerability class that’s been in the OWASP Top 10 since 2003.

1
Public API Docs
22 unauth endpoints
2
JSON Key Injection
Error message leak
3
Full DB Access
Read + Write
4
System Prompts
Writable!

The agent found 22 API endpoints that required no authentication. One of these processed user search queries where the JSON keys - not the values, the keys - were concatenated directly into SQL. When database error messages started leaking production data, the agent recognized what traditional scanners missed.

But here’s the truly dangerous part: Lilli’s system prompts were stored in the same database. With write access, an attacker could silently rewrite how the AI behaves - no code deployment needed, no logs generated. Just a single UPDATE statement.

The Silent Poison

An attacker with write access to system prompts could instruct the AI to subtly bias financial models, exfiltrate data through innocuous responses, or remove safety guardrails entirely. McKinsey consultants would unknowingly integrate poisoned advice into client deliverables.

Why Traditional Security Failed

McKinsey isn’t a startup moving fast and breaking things. They have a dedicated security team, compliance certifications, and presumably run regular penetration tests. Lilli had been in production for over two years with 40,000+ employees using it daily.

So why did a decades-old vulnerability class slip through?

Because traditional scanners look for signatures, not attack chains. They check boxes. They find obvious misconfigurations. What they don’t do is think like an attacker - chaining together minor observations (JSON keys in error messages) into a complete compromise.

”The agent didn’t follow a checklist. It mapped the attack surface, probed for weaknesses, and chained together seemingly minor observations to construct a complex attack path.”
Security Researcher

This is the new reality: autonomous AI agents can probe, adapt, and escalate at machine speed. They don’t get tired. They don’t miss edge cases. And they’re now available to anyone with an API key.

The Real Problem: Plaintext Everything

Let’s be clear about what was actually exposed: 46.5 million messages about strategy, M&A, and client engagements - stored in plaintext.

This means:

  • Every confidential client conversation was readable
  • Every strategic recommendation was exposed
  • Every M&A discussion was accessible
  • Every piece of advice McKinsey consultants asked their AI was logged and vulnerable

The breach didn’t happen because McKinsey had bad security. It happened because they trusted their platform to handle security - and that trust meant no defense-in-depth for the data itself.

What McKinsey Had
  • Perimeter security (bypassed in 2 hours)
  • Authentication on most endpoints (22 were missed)
  • Security team and audits (missed the SQLi)
  • Plaintext data storage
  • No output-layer controls
What Was Missing
  • Runtime data classification
  • Output-layer policy enforcement
  • Real-time sensitive data detection
  • Breach-proof data handling
  • Defense-in-depth for AI outputs

Vendor Trust Is Dead

Here’s the uncomfortable truth: vendor trust is not a security strategy.

McKinsey trusted that their AI platform was secure. They were wrong. And they’re not alone - every enterprise running an internal AI assistant is making the same bet. The question isn’t whether your vendor will be breached. The question is what happens when they are.

If your AI platform stores confidential data in plaintext, a breach means total exposure. If your AI can access sensitive information without output controls, every response is a potential leak. If you’re relying on perimeter security alone, you’re one SQL injection away from disaster.

The New Reality

In the era of autonomous AI agents, breaches happen faster than humans can respond. Your security strategy needs to assume compromise and limit blast radius - not pray the perimeter holds.

Runtime Security: Limiting the Blast Radius

The McKinsey breach exposed a fundamental gap in how enterprises think about AI security. Most organizations focus on preventing breaches - but this incident proved that even sophisticated defenses can be circumvented in hours.

The missing layer is runtime security - controls that operate at the output layer, ensuring that even when a breach occurs, sensitive data doesn’t leave in a policy-breaking way.

This is what defense-in-depth looks like for AI systems:

1
Classify data at runtime
2
Detect sensitive content
3
Enforce output policies
4
Block policy violations

The principle is simple: even if an attacker gains database access, they shouldn’t be able to extract policy-breaking data through the AI’s output layer.

This means:

  • Real-time classification: Every AI response is analyzed for sensitive content before delivery
  • Policy enforcement: Data that violates export controls, confidentiality rules, or regulatory requirements is blocked or redacted
  • Audit trails: Every blocked response is logged for security review
  • Zero-trust outputs: The AI’s responses are treated as untrusted until validated

The Lesson for CISOs

The McKinsey breach isn’t just about SQL injection or API security. It’s about a fundamental shift in how we need to think about AI security.

The old model: Trust your vendor, secure the perimeter, hope for the best.

The new model: Assume breach, enforce policies at runtime, limit blast radius.

Your AI platform will be probed by autonomous agents. Your vendors will have vulnerabilities. Your perimeter will eventually be bypassed. The question is whether your architecture is designed to limit damage when - not if - that happens.

Key Takeaways

1. Vendor trust is not security. McKinsey trusted their platform. The platform had a SQL injection vulnerability in production for 2+ years.

2. Traditional scanners miss adaptive attacks. The vulnerability was discoverable by an autonomous agent in hours. Human-led audits missed it.

3. Runtime security is the missing layer. If 46 million messages had been protected by output-layer policies, the breach impact would have been dramatically reduced.

4. Assume breach, design for containment. The perimeter-first model is dead. Defense-in-depth for AI means controls at every layer - including outputs.


Free Resource: AI Agent Security Checklist

50+ security controls to evaluate your AI platform’s defenses - from input validation to output-layer policies. Don’t wait for the breach to discover what’s missing.

Download the checklist →

Rogue Security provides runtime protection for AI systems, including data loss prevention, output policy enforcement, and real-time threat detection. When breaches happen, we ensure sensitive data stays where it belongs.