McKinsey's Lilli Breach: Why Vendor Trust Is Not Enough
On March 9, 2026, a security researcher published a disclosure that should terrify every enterprise CISO: an autonomous AI agent breached McKinsey’s internal AI platform, Lilli, in under two hours. No credentials. No insider access. No human guidance.
The result? Full read and write access to 46.5 million chat messages about strategy, M&A, and client engagements - all stored in plaintext.
McKinsey fixed the vulnerabilities within hours of disclosure. They claim no evidence of unauthorized access beyond the security researcher. But that’s not the point.
The point is that one of the world’s most sophisticated consulting firms - a company that advises Fortune 500 companies on digital transformation and cybersecurity strategy - had 46.5 million confidential messages sitting in plaintext, protected by nothing more than an assumption that their vendor’s security was “good enough.”
The Attack: SQL Injection in 2026
Here’s what makes this breach particularly embarrassing: the vulnerability was SQL injection. Not a zero-day. Not a sophisticated supply chain attack. A vulnerability class that’s been in the OWASP Top 10 since 2003.
The agent found 22 API endpoints that required no authentication. One of these processed user search queries where the JSON keys - not the values, the keys - were concatenated directly into SQL. When database error messages started leaking production data, the agent recognized what traditional scanners missed.
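To make the failure mode concrete, here is a minimal sketch of the vulnerability class. The schema, column names, and function names are illustrative assumptions, not Lilli’s actual code: the point is that attacker-controlled JSON *keys* are concatenated into the SQL text, while a safe version whitelists column names and parameterizes only the values.

```python
import json
import sqlite3

# VULNERABLE sketch: the JSON keys (attacker-controlled) are concatenated
# directly into the SQL column list; only the values are parameterized.
def vulnerable_search(conn, body: str):
    filters = json.loads(body)
    clauses = " AND ".join(f"{key} = ?" for key in filters)  # keys go into SQL text
    sql = f"SELECT id, message FROM chats WHERE {clauses}"
    return conn.execute(sql, list(filters.values())).fetchall()

# SAFER sketch: whitelist the column names, parameterize the values.
ALLOWED_COLUMNS = {"author", "channel"}

def safe_search(conn, body: str):
    filters = json.loads(body)
    bad = set(filters) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"unknown filter column(s): {bad}")
    clauses = " AND ".join(f"{key} = ?" for key in filters)
    sql = f"SELECT id, message FROM chats WHERE {clauses}"
    return conn.execute(sql, list(filters.values())).fetchall()
```

With the vulnerable version, a request body like `{"1=1 OR author": "nobody"}` turns the WHERE clause into `1=1 OR author = ?` and dumps every row; the safe version rejects the same payload outright.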
But here’s the truly dangerous part: Lilli’s system prompts were stored in the same database. With write access, an attacker could silently rewrite how the AI behaves - no code deployment needed, no logs generated. Just a single UPDATE statement.
An attacker with write access to system prompts could instruct the AI to subtly bias financial models, exfiltrate data through innocuous responses, or remove safety guardrails entirely. McKinsey consultants would unknowingly integrate poisoned advice into client deliverables.
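One mitigation for prompt poisoning is an integrity check: sign each prompt with a key the database never sees, and verify the signature every time the prompt is loaded. The schema and key-management details below are assumptions for illustration, not McKinsey’s architecture.

```python
import hashlib
import hmac
import sqlite3

# Assumption: in production this key would live outside the database,
# e.g. in a KMS, so database write access alone cannot forge signatures.
SIGNING_KEY = b"kept-outside-the-database"

def sign(text: str) -> str:
    return hmac.new(SIGNING_KEY, text.encode(), hashlib.sha256).hexdigest()

def store_prompt(conn, name: str, text: str):
    conn.execute("INSERT OR REPLACE INTO prompts VALUES (?, ?, ?)",
                 (name, text, sign(text)))

def load_prompt(conn, name: str) -> str:
    text, sig = conn.execute(
        "SELECT text, sig FROM prompts WHERE name = ?", (name,)).fetchone()
    # A silent UPDATE changes the text but leaves the signature stale.
    if not hmac.compare_digest(sig, sign(text)):
        raise RuntimeError(f"prompt {name!r} failed integrity check")
    return text
```

An attacker’s single UPDATE statement still rewrites the stored text, but the tampered prompt fails verification at load time instead of silently steering the AI.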
Why Traditional Security Failed
McKinsey isn’t a startup moving fast and breaking things. They have a dedicated security team, compliance certifications, and presumably run regular penetration tests. Lilli had been in production for over two years with 40,000+ employees using it daily.
So why did a decades-old vulnerability class slip through?
Because traditional scanners look for signatures, not attack chains. They check boxes. They find obvious misconfigurations. What they don’t do is think like an attacker - chaining together minor observations (JSON keys in error messages) into a complete compromise.
This is the new reality: autonomous AI agents can probe, adapt, and escalate at machine speed. They don’t get tired. They don’t miss edge cases. And they’re now available to anyone with an API key.
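Some of this gap is closable with boring, systematic checks. A minimal sketch of an auth-coverage audit - hypothetical endpoint paths, with the HTTP call injected as a function so the logic is testable - probes every API route without credentials and flags any that answer with something other than 401/403:

```python
# Minimal audit sketch: call each endpoint with NO credentials and flag
# any that do not refuse the request. `fetch` is an injected callable
# (e.g. a wrapper around an HTTP client) returning a status code.
def find_unauthenticated(endpoints, fetch):
    exposed = []
    for path in endpoints:
        status = fetch(path)          # request sent without an auth header
        if status not in (401, 403):  # anything else means no auth gate
            exposed.append(path)
    return exposed
```

A check this simple, run in CI against the full route table, would have surfaced all 22 unauthenticated endpoints long before an autonomous agent did.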
The Real Problem: Plaintext Everything
Let’s be clear about what was actually exposed: 46.5 million messages about strategy, M&A, and client engagements - stored in plaintext.
This means:
- Every confidential client conversation was readable
- Every strategic recommendation was exposed
- Every M&A discussion was accessible
- Every piece of advice McKinsey consultants asked their AI was logged and vulnerable
The breach didn’t happen because McKinsey had bad security. It happened because they trusted their platform to handle security - and that trust meant no defense-in-depth for the data itself.

What McKinsey had:
- Perimeter security (bypassed in 2 hours)
- Authentication on most endpoints (22 were missed)
- Security team and audits (missed the SQLi)
- Plaintext data storage
- No output-layer controls

What was missing:
- Runtime data classification
- Output-layer policy enforcement
- Real-time sensitive data detection
- Breach-proof data handling
- Defense-in-depth for AI outputs
Vendor Trust Is Dead
Here’s the uncomfortable truth: vendor trust is not a security strategy.
McKinsey trusted that their AI platform was secure. They were wrong. And they’re not alone - every enterprise running an internal AI assistant is making the same bet. The question isn’t whether your vendor will be breached. The question is what happens when they are.
If your AI platform stores confidential data in plaintext, a breach means total exposure. If your AI can access sensitive information without output controls, every response is a potential leak. If you’re relying on perimeter security alone, you’re one SQL injection away from disaster.
In the era of autonomous AI agents, breaches happen faster than humans can respond. Your security strategy needs to assume compromise and limit blast radius - not pray the perimeter holds.
Runtime Security: Limiting the Blast Radius
The McKinsey breach exposed a fundamental gap in how enterprises think about AI security. Most organizations focus on preventing breaches - but this incident proved that even sophisticated defenses can be circumvented in hours.
The missing layer is runtime security: controls that operate at the output layer, so that even when a breach occurs, sensitive data cannot leave in violation of policy.

This is what defense-in-depth looks like for AI systems. The principle is simple: even if an attacker gains database access, they shouldn’t be able to extract protected data through the AI’s output layer.

In practice, this means:
- Real-time classification: Every AI response is analyzed for sensitive content before delivery
- Policy enforcement: Data that violates export controls, confidentiality rules, or regulatory requirements is blocked or redacted
- Audit trails: Every blocked response is logged for security review
- Zero-trust outputs: The AI’s responses are treated as untrusted until validated
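The four controls above can be sketched as a single guard that every AI response passes through before delivery. The pattern names and regex classifiers below are illustrative stand-ins, not a real product API - production systems would use much richer classification - but the classify-enforce-log shape is the point:

```python
import re

# Illustrative classifiers; real deployments would use ML classifiers and
# policy engines, not two regexes.
SENSITIVE_PATTERNS = {
    "client_codename": re.compile(r"\bPROJECT-[A-Z]{3,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

audit_log = []  # stand-in for a real audit sink

def enforce(response: str) -> str:
    # 1. Classify: scan the response for sensitive content.
    violations = [name for name, pat in SENSITIVE_PATTERNS.items()
                  if pat.search(response)]
    if violations:
        # 2. Log every policy hit for security review.
        audit_log.append({"violations": violations, "action": "redacted"})
        # 3. Enforce: redact before the response reaches the user.
        for name in violations:
            response = SENSITIVE_PATTERNS[name].sub("[REDACTED]", response)
    return response
```

The zero-trust part is structural: `enforce` sits between the model and the user unconditionally, so a poisoned prompt or a compromised database still has to get its output past the policy layer.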
The Lesson for CISOs
The McKinsey breach isn’t just about SQL injection or API security. It’s about a fundamental shift in how we need to think about AI security.
The old model: Trust your vendor, secure the perimeter, hope for the best.
The new model: Assume breach, enforce policies at runtime, limit blast radius.
Your AI platform will be probed by autonomous agents. Your vendors will have vulnerabilities. Your perimeter will eventually be bypassed. The question is whether your architecture is designed to limit damage when - not if - that happens.

Four lessons stand out:
1. Vendor trust is not security. McKinsey trusted their platform. The platform had a SQL injection vulnerability in production for 2+ years.
2. Traditional scanners miss adaptive attacks. The vulnerability was discoverable by an autonomous agent in hours. Human-led audits missed it.
3. Runtime security is the missing layer. If the 46.5 million messages had been protected by output-layer policies, the breach impact would have been dramatically reduced.
4. Assume breach, design for containment. The perimeter-first model is dead. Defense-in-depth for AI means controls at every layer - including outputs.
Evaluate your AI platform’s defenses against 50+ security controls - from input validation to output-layer policies. Don’t wait for the breach to discover what’s missing.
Rogue Security provides runtime protection for AI systems, including data loss prevention, output policy enforcement, and real-time threat detection. When breaches happen, we ensure sensitive data stays where it belongs.