McKinsey's Lilli Breach: Why Vendor Trust Is Not Enough
On March 9, 2026, a security researcher published a disclosure that should terrify every enterprise CISO: an autonomous AI agent breached McKinsey’s internal AI platform, Lilli, in under two hours. No credentials. No insider access. No human guidance.
The result? Full read and write access to 46.5 million chat messages about strategy, M&A, and client engagements - all stored in plaintext.
McKinsey fixed the vulnerabilities within hours of disclosure. They claim no evidence of unauthorized access beyond the security researcher. But that’s not the point.
The point is that one of the world’s most sophisticated consulting firms - a company that advises Fortune 500 companies on digital transformation and cybersecurity strategy - had 46.5 million confidential messages sitting in plaintext, protected by nothing more than an assumption that their vendor’s security was “good enough.”
The Attack: SQL Injection in 2026
Here’s what makes this breach particularly embarrassing: the vulnerability was SQL injection. Not a zero-day. Not a sophisticated supply chain attack. A vulnerability class that’s been in the OWASP Top 10 since 2003.
The agent found 22 API endpoints that required no authentication. One of these processed user search queries where the JSON keys - not the values, the keys - were concatenated directly into SQL. When database error messages started leaking production data, the agent recognized what traditional scanners missed.
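To make the failure mode concrete, here is a minimal sketch of the vulnerability class. The schema, column names, and function names are illustrative assumptions, not Lilli’s actual code: the point is that attacker-controlled JSON *keys* are concatenated into the SQL text, while a safe version whitelists column names and parameterizes only the values.

```python
import json
import sqlite3

# VULNERABLE sketch: the JSON keys (attacker-controlled) are concatenated
# directly into the SQL column list; only the values are parameterized.
def vulnerable_search(conn, body: str):
    filters = json.loads(body)
    clauses = " AND ".join(f"{key} = ?" for key in filters)  # keys go into SQL text
    sql = f"SELECT id, message FROM chats WHERE {clauses}"
    return conn.execute(sql, list(filters.values())).fetchall()

# SAFER sketch: whitelist the column names, parameterize the values.
ALLOWED_COLUMNS = {"author", "channel"}

def safe_search(conn, body: str):
    filters = json.loads(body)
    bad = set(filters) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"unknown filter column(s): {bad}")
    clauses = " AND ".join(f"{key} = ?" for key in filters)
    sql = f"SELECT id, message FROM chats WHERE {clauses}"
    return conn.execute(sql, list(filters.values())).fetchall()
```

With the vulnerable version, a request body like `{"1=1 OR author": "nobody"}` turns the WHERE clause into `1=1 OR author = ?` and dumps every row; the safe version rejects the same payload outright.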
But here’s the truly dangerous part: Lilli’s system prompts were stored in the same database. With write access, an attacker could silently rewrite how the AI behaves - no code deployment needed, no logs generated. Just a single UPDATE statement.
An attacker with write access to system prompts could instruct the AI to subtly bias financial models, exfiltrate data through innocuous responses, or remove safety guardrails entirely. McKinsey consultants would unknowingly integrate poisoned advice into client deliverables.
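One mitigation for prompt poisoning is an integrity check: sign each prompt with a key the database never sees, and verify the signature every time the prompt is loaded. The schema and key-management details below are assumptions for illustration, not McKinsey’s architecture.

```python
import hashlib
import hmac
import sqlite3

# Assumption: in production this key would live outside the database,
# e.g. in a KMS, so database write access alone cannot forge signatures.
SIGNING_KEY = b"kept-outside-the-database"

def sign(text: str) -> str:
    return hmac.new(SIGNING_KEY, text.encode(), hashlib.sha256).hexdigest()

def store_prompt(conn, name: str, text: str):
    conn.execute("INSERT OR REPLACE INTO prompts VALUES (?, ?, ?)",
                 (name, text, sign(text)))

def load_prompt(conn, name: str) -> str:
    text, sig = conn.execute(
        "SELECT text, sig FROM prompts WHERE name = ?", (name,)).fetchone()
    # A silent UPDATE changes the text but leaves the signature stale.
    if not hmac.compare_digest(sig, sign(text)):
        raise RuntimeError(f"prompt {name!r} failed integrity check")
    return text
```

An attacker’s single UPDATE statement still rewrites the stored text, but the tampered prompt fails verification at load time instead of silently steering the AI.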
Why Traditional Security Failed
McKinsey isn’t a startup moving fast and breaking things. They have a dedicated security team, compliance certifications, and presumably run regular penetration tests. Lilli had been in production for over two years with 40,000+ employees using it daily.
So why did a decades-old vulnerability class slip through?
Because traditional scanners look for signatures, not attack chains. They check boxes. They find obvious misconfigurations. What they don’t do is think like an attacker - chaining together minor observations (JSON keys in error messages) into a complete compromise.
This is the new reality: autonomous AI agents can probe, adapt, and escalate at machine speed. They don’t get tired. They don’t miss edge cases. And they’re now available to anyone with an API key.
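Some of this gap is closable with boring, systematic checks. A minimal sketch of an auth-coverage audit - hypothetical endpoint paths, with the HTTP call injected as a function so the logic is testable - probes every API route without credentials and flags any that answer with something other than 401/403:

```python
# Minimal audit sketch: call each endpoint with NO credentials and flag
# any that do not refuse the request. `fetch` is an injected callable
# (e.g. a wrapper around an HTTP client) returning a status code.
def find_unauthenticated(endpoints, fetch):
    exposed = []
    for path in endpoints:
        status = fetch(path)          # request sent without an auth header
        if status not in (401, 403):  # anything else means no auth gate
            exposed.append(path)
    return exposed
```

A check this simple, run in CI against the full route table, would have surfaced all 22 unauthenticated endpoints long before an autonomous agent did.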
The Real Problem: Plaintext Everything
Let’s be clear about what was actually exposed: 46.5 million messages about strategy, M&A, and client engagements - stored in plaintext.
This means:
- Every confidential client conversation was readable
- Every strategic recommendation was exposed
- Every M&A discussion was accessible
- Every piece of advice McKinsey consultants asked their AI was logged and vulnerable
The breach didn’t happen because McKinsey had bad security. It happened because they trusted their platform to handle security - and that trust meant no defense-in-depth for the data itself.

What McKinsey had:
- Perimeter security (bypassed in 2 hours)
- Authentication on most endpoints (22 were missed)
- Security team and audits (missed the SQLi)
- Plaintext data storage
- No output-layer controls

What was missing:
- Runtime data classification
- Output-layer policy enforcement
- Real-time sensitive data detection
- Breach-proof data handling
- Defense-in-depth for AI outputs
Vendor Trust Is Dead
Here’s the uncomfortable truth: vendor trust is not a security strategy.
McKinsey trusted that their AI platform was secure. They were wrong. And they’re not alone - every enterprise running an internal AI assistant is making the same bet. The question isn’t whether your vendor will be breached. The question is what happens when they are.
If your AI platform stores confidential data in plaintext, a breach means total exposure. If your AI can access sensitive information without output controls, every response is a potential leak. If you’re relying on perimeter security alone, you’re one SQL injection away from disaster.
In the era of autonomous AI agents, breaches happen faster than humans can respond. Your security strategy needs to assume compromise and limit blast radius - not pray the perimeter holds.
Runtime Security: Limiting the Blast Radius
The McKinsey breach exposed a fundamental gap in how enterprises think about AI security. Most organizations focus on preventing breaches - but this incident proved that even sophisticated defenses can be circumvented in hours.
The missing layer is runtime security: controls that operate at the output layer, so that even when a breach occurs, sensitive data cannot leave in violation of policy.

This is what defense-in-depth looks like for AI systems. The principle is simple: even if an attacker gains database access, they shouldn’t be able to extract protected data through the AI’s output layer.

In practice, this means:
- Real-time classification: Every AI response is analyzed for sensitive content before delivery
- Policy enforcement: Data that violates export controls, confidentiality rules, or regulatory requirements is blocked or redacted
- Audit trails: Every blocked response is logged for security review
- Zero-trust outputs: The AI’s responses are treated as untrusted until validated
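The four controls above can be sketched as a single guard that every AI response passes through before delivery. The pattern names and regex classifiers below are illustrative stand-ins, not a real product API - production systems would use much richer classification - but the classify-enforce-log shape is the point:

```python
import re

# Illustrative classifiers; real deployments would use ML classifiers and
# policy engines, not two regexes.
SENSITIVE_PATTERNS = {
    "client_codename": re.compile(r"\bPROJECT-[A-Z]{3,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

audit_log = []  # stand-in for a real audit sink

def enforce(response: str) -> str:
    # 1. Classify: scan the response for sensitive content.
    violations = [name for name, pat in SENSITIVE_PATTERNS.items()
                  if pat.search(response)]
    if violations:
        # 2. Log every policy hit for security review.
        audit_log.append({"violations": violations, "action": "redacted"})
        # 3. Enforce: redact before the response reaches the user.
        for name in violations:
            response = SENSITIVE_PATTERNS[name].sub("[REDACTED]", response)
    return response
```

The zero-trust part is structural: `enforce` sits between the model and the user unconditionally, so a poisoned prompt or a compromised database still has to get its output past the policy layer.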
The Lesson for CISOs
The McKinsey breach isn’t just about SQL injection or API security. It’s about a fundamental shift in how we need to think about AI security.
The old model: Trust your vendor, secure the perimeter, hope for the best.
The new model: Assume breach, enforce policies at runtime, limit blast radius.
Your AI platform will be probed by autonomous agents. Your vendors will have vulnerabilities. Your perimeter will eventually be bypassed. The question is whether your architecture is designed to limit damage when - not if - that happens.

Four lessons stand out:
1. Vendor trust is not security. McKinsey trusted their platform. The platform had a SQL injection vulnerability in production for 2+ years.
2. Traditional scanners miss adaptive attacks. The vulnerability was discoverable by an autonomous agent in hours. Human-led audits missed it.
3. Runtime security is the missing layer. If the 46.5 million messages had been protected by output-layer policies, the breach impact would have been dramatically reduced.
4. Assume breach, design for containment. The perimeter-first model is dead. Defense-in-depth for AI means controls at every layer - including outputs.
Evaluate your AI platform’s defenses against 50+ security controls - from input validation to output-layer policies. Don’t wait for the breach to discover what’s missing.
Rogue Security provides runtime protection for AI systems, including data loss prevention, output policy enforcement, and real-time threat detection. When breaches happen, we ensure sensitive data stays where it belongs.