The OWASP Top 10 for Agentic AI: What Security Teams Need to Know
Your AI just got promoted. It’s no longer an intern who drafts emails for you to review - it’s an employee with credentials, tool access, and the authority to act on your behalf. It reads your inbox, queries your databases, calls your APIs, and coordinates with other agents. It works 24/7, never complains, and processes more data in an hour than your team does in a week.
OWASP noticed.
In December 2025, the OWASP GenAI Security Project released the Top 10 for Agentic Applications (2026) - a new threat framework built specifically for autonomous AI systems. Not chatbots. Not copilots. Agents: software entities that plan, decide, use tools, maintain memory, and execute multi-step workflows with minimal human oversight.
This isn’t an update to the existing OWASP Top 10 for LLMs. It’s a different document for a different problem. Here’s why it matters and what’s in it.
Why a New Top 10?
The OWASP Top 10 for LLM Applications (2025) was built for a world where AI systems receive a prompt and return a response. It covers important risks - prompt injection, data poisoning, excessive agency - but it treats the model as the central concern. Scan the input. Validate the output. Done.
Agents break this model. They don’t just generate text - they act. They call tools, write files, make API requests, persist state across sessions, and delegate tasks to other agents. The attack surface isn’t a single prompt-response pair. It’s an entire runtime execution environment.
| Dimension | LLM Top 10 (2025) | Agentic Top 10 (2026) |
|---|---|---|
| Scope | Model-level risks | System-level risks |
| Interaction | Single-turn or multi-turn chat | Multi-step autonomous execution |
| Tools | Optional function calling | Core capability with persistent access |
| Memory | Context window per session | Persistent state across sessions |
| Identity | User-bound | Agent-bound (service credentials, delegation chains) |
| Attack chain | Prompt → Response | Input → Plan → Tool calls → Memory → Inter-agent → Output |
| Human oversight | Per-interaction | Intermittent or absent |
The Agentic Top 10 doesn’t replace the LLM Top 10 - it builds on it. Each ASI entry maps back to one or more LLM entries, showing where the model-level risk extends into system-level impact. Think of it as the difference between securing a web form (input validation) and securing an entire microservices architecture (authentication, authorization, service mesh, observability).
The OWASP Agentic Top 10 at a Glance
Before we dive into each category, here's where each threat sits in a typical agentic system, summarized from the architecture diagram in the OWASP document:

- 🤖 Agent core (user prompts, input layer): ASI01, ASI09, ASI10
- 🔧 Tools / APIs: ASI02, ASI05
- 🔑 Identity & auth: ASI03
- 🧠 Memory / RAG: ASI06
- 📦 Supply chain: ASI04
- 🤖↔🤖 Inter-agent: ASI07, ASI08
Now let’s walk through each one. For every category, you’ll get: what it is, an analogy that makes it stick, a real-world scenario, and what you should do about it.
ASI01: Agent Goal Hijack
Attackers manipulate an agent’s objectives, task selection, or decision pathways through prompt injection, deceptive tool outputs, malicious documents, forged inter-agent messages, or poisoned external data. Unlike basic prompt injection (LLM01), ASI01 captures the full agentic impact - redirected goals, corrupted planning, and multi-step behavioral deviation.
Real scenario: An attacker emails a crafted message that silently triggers Microsoft 365 Copilot to execute hidden instructions, causing the AI to exfiltrate confidential emails, files, and chat logs without any user interaction (EchoLeak, 2025). A malicious Google Doc injects instructions for ChatGPT to exfiltrate user data and convince the user to make an ill-advised business decision (AgentFlayer inception attack).
Why it’s #1: Goal hijack is the foundational agentic risk. Every other category - tool misuse, privilege abuse, memory poisoning - becomes exponentially more dangerous when the agent’s objectives have been redirected. The agent isn’t malfunctioning; it’s faithfully pursuing the wrong goal.
What to do: Treat all natural-language inputs as untrusted. Lock system prompts so goal priorities are explicit and auditable. Require human approval for goal-changing actions. Sanitize all connected data sources - RAG inputs, emails, calendar invites, uploaded files - before they can influence agent goals.
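The "treat all inputs as untrusted" step can be sketched as a screening pass over any external text before it reaches the planner. This is a deliberately simple illustration using a hypothetical regex denylist; a production system would layer a trained classifier or a dedicated guardrail model on top, since pattern matching alone is easy to evade:

```python
import re

# Illustrative denylist -- hypothetical patterns, not a complete defense.
SUSPECT_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal (the )?system prompt", re.I),
]

def screen_untrusted_input(text: str) -> tuple[bool, list[str]]:
    """Return (allowed, matched_patterns) for text from email, RAG, files, etc."""
    hits = [p.pattern for p in SUSPECT_PATTERNS if p.search(text)]
    return (not hits, hits)
```

Anything flagged gets quarantined for human review rather than fed into the agent's goal context.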
ASI02: Tool Misuse & Exploitation
Agents misuse legitimate tools due to prompt injection, misalignment, or unsafe delegation. The agent operates within its authorized privileges but applies tools in unintended ways - deleting data, over-invoking costly APIs, or exfiltrating information. Tool definitions increasingly come via MCP servers, expanding the attack surface.
Real scenario: A security-automation agent receives an injected instruction that chains together legitimate tools - PowerShell, cURL, and internal APIs - to exfiltrate sensitive logs. Every command runs through trusted binaries under valid credentials, so EDR/XDR sees no malware. A customer service bot intended to fetch order history also issues refunds because it has full financial API access.
The key insight: The tools work exactly as designed. The agent isn’t exploiting a vulnerability in the tool - it’s using the tool correctly for the wrong purpose. This is why traditional security tooling misses it: there’s no malware signature, no exploit chain. Just a legitimate tool call with malicious intent.
What to do: Enforce least privilege per tool (read-only for databases, no send/delete for email summarizers). Require human confirmation for destructive actions. Run tool execution in sandboxed containers with outbound allowlists. Implement an “intent gate” - a policy enforcement point that validates intent and arguments before execution.
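An intent gate can be as simple as a deny-by-default policy table checked before every tool call. The tool names and operations below are hypothetical, and in production the policy would be loaded from signed, versioned configuration rather than hardcoded:

```python
from dataclasses import dataclass, field

# Hypothetical policy table: tool -> operations the agent may invoke.
POLICY = {
    "orders_db": {"read"},                # read-only database access
    "email":     {"draft", "summarize"},  # no send, no delete
    "refunds":   set(),                   # support bot gets no refund authority
}

@dataclass
class ToolCall:
    tool: str
    operation: str
    args: dict = field(default_factory=dict)

def intent_gate(call: ToolCall) -> bool:
    """Policy enforcement point: runs before execution, denies by default."""
    return call.operation in POLICY.get(call.tool, set())
```

The key design choice is deny-by-default: an unknown tool or operation is rejected rather than allowed through.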
ASI03: Identity & Privilege Abuse
Attackers exploit dynamic trust and delegation in agents to escalate access. Agents inherit credentials from users or other agents, creating delegation chains where a compromised low-privilege agent can relay instructions to a high-privilege agent. Without distinct, governed identities, agents operate in an attribution gap that makes true least privilege impossible.
Real scenario: A finance agent delegates to a “DB query” agent, passing all its permissions. An attacker steering the query uses inherited access to exfiltrate HR and legal data. In another case, an IT admin agent caches SSH credentials during patching; later a non-admin reuses the session to create unauthorized accounts.
The architectural mismatch: Identity systems were built for humans. They assume one user, one session, one set of credentials. Agents break every assumption - they inherit permissions, delegate across trust boundaries, cache credentials in memory, and negotiate access dynamically. The confused deputy problem meets the credential sprawl problem.
What to do: Issue short-lived, task-scoped tokens per agent per task. Isolate agent identities with per-session sandboxes. Re-verify authorization at every privileged step, not just at the start of a workflow. Evaluate treating agents as managed non-human identities (Microsoft Entra, AWS Bedrock, Salesforce Agentforce are all moving this direction).
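Short-lived, task-scoped tokens can be sketched as a signed claims blob that is re-verified at each privileged step. This is a minimal HMAC illustration, not a production token format; a real deployment would use a KMS-backed per-agent key and a standard such as JWT:

```python
import base64
import hashlib
import hmac
import json
import time

SIGNING_KEY = b"demo-only-key"  # assumption: per-agent key from a KMS in practice

def issue_token(agent_id: str, scope: str, ttl_s: int = 300) -> str:
    """Mint a short-lived credential scoped to one agent and one task."""
    claims = {"agent": agent_id, "scope": scope, "exp": time.time() + ttl_s}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    sig = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_token(token: str, required_scope: str) -> bool:
    """Call at every privileged step -- not just at workflow start."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["scope"] == required_scope and claims["exp"] > time.time()
```

Because the scope is bound into the signed claims, a delegated agent holding a `db:read` token cannot quietly widen it to `db:write`.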
ASI04: Agentic Supply Chain Vulnerabilities
Agents, tools, and artifacts provided by third parties may be malicious, compromised, or tampered with. Unlike traditional supply chains, agentic ecosystems compose capabilities at runtime - loading external tools and agent personas dynamically. This creates a live supply chain where compromise cascades across agents.
Real scenario: A poisoned prompt in Amazon Q for VS Code ships in v1.84.0 to thousands of developers before detection. A malicious MCP server on npm impersonates Postmark and secretly BCCs all emails to the attacker. GitHub’s MCP is found vulnerable to tool descriptor injection that exfiltrates private repo data.
What’s new: Traditional supply chain attacks target static dependencies - libraries, packages, container images. Agentic supply chain attacks target runtime dependencies - MCP tool descriptors, agent cards, prompt templates, plugin registries. The attack surface is dynamic and opaque.
What to do: Sign and attest manifests, prompts, and tool definitions. Maintain AIBOMs (AI Bill of Materials). Pin tools by content hash. Implement a supply chain kill switch - emergency revocation that can instantly disable specific tools across all deployments.
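Pinning tools by content hash can be sketched as hashing the canonicalized tool definition at review time and refusing any definition whose digest drifts from the pinned value. The manifest structure here is hypothetical:

```python
import hashlib
import json

def definition_digest(definition: dict) -> str:
    """Canonicalize and hash a tool definition (MCP descriptor, prompt, etc.)."""
    canonical = json.dumps(definition, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

# Hypothetical pinned manifest, built at review time and stored out-of-band.
pinned: dict[str, str] = {}

def pin_tool(name: str, definition: dict) -> None:
    pinned[name] = definition_digest(definition)

def verify_tool(name: str, definition: dict) -> bool:
    """Refuse to load any tool whose definition drifts from the pinned hash."""
    return pinned.get(name) == definition_digest(definition)
```

A swapped tool descriptor - the attack pattern in the MCP incidents above - changes the digest and fails verification before the agent ever loads it.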
ASI05: Unexpected Code Execution
Agentic systems - including vibe coding tools - generate and execute code in real time. Attackers exploit code-generation features to escalate prompt injection into remote code execution. The agent bypasses traditional security controls because it IS the execution engine.
Real scenario: During automated "vibe coding" tasks, an agent generates and executes unreviewed shell commands in its workspace, deleting production data (Replit). A Copilot instance is tricked into RCE via prompt injection. An attacker exploits an unsafe eval() in the agent's memory system, embedding executable code in prompts that is then executed without sanitization.
The uncomfortable truth: If your agent has shell access or a code interpreter, every prompt injection is a potential RCE. The traditional distinction between “data” and “code” collapses when a language model can convert one into the other in real time.
What to do: Never run agents as root. Ban eval() in production. Sandbox all code execution with strict network and filesystem limits. Separate code generation from code execution with a validation gate in between. Require human approval for elevated runs.
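A validation gate between code generation and code execution might statically inspect generated code before it ever reaches a sandbox. This sketch uses Python's `ast` module with an illustrative denylist; a real gate would pair static checks with sandboxed execution under strict network and filesystem limits:

```python
import ast

# Illustrative denylist -- a policy choice for this sketch, not a complete list.
BANNED_CALLS = {"eval", "exec", "compile", "__import__"}

def validate_generated_code(source: str) -> bool:
    """Static gate between code generation and code execution."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            return False  # policy choice: generated snippets may not import
        if isinstance(node, ast.Call):
            func = node.func
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", "")
            if name in BANNED_CALLS:
                return False
    return True
```

Code that fails the gate never executes; code that passes still runs only inside the sandbox, never with the agent's own privileges.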
ASI06: Memory & Context Poisoning
Adversaries corrupt or seed an agent’s stored context - conversation history, memory tools, summaries, embeddings, RAG stores - with malicious data. Unlike one-time prompt injection, memory poisoning persists across sessions, causing future reasoning, planning, and tool use to become biased or unsafe.
Real scenario: An attacker uses prompt injection to corrupt Gemini’s long-term memory, planting false information that persists indefinitely (Feb 2025). A security AI’s memory is retrained to label malicious activity as normal, letting attacks slip through undetected. An attacker implants false data in a ChatGPT assistant’s memory via indirect prompt injection (AgentFlayer), compromising all future sessions.
Why it’s insidious: Memory poisoning is slow and persistent. The initial injection might happen once, but the corrupted context influences every subsequent decision. It’s the difference between telling someone a lie (prompt injection) and rewriting their memories (memory poisoning).
What to do: Scan all memory writes for malicious content before commit. Isolate user sessions and domain contexts. Expire unverified memory over time. Prevent agents from re-ingesting their own outputs into trusted memory (bootstrap poisoning). Require provenance attribution for all stored data.
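A memory-write gate that blocks self-ingestion, scans content, records provenance, and expires unverified entries might look like the following sketch (the scan here is a toy substring check; real systems would use classifiers):

```python
import time

memory_store: list[dict] = []

def write_memory(content: str, source: str, verified: bool) -> bool:
    """Gate every memory write: block self-ingestion, scan, record provenance."""
    if source == "agent_output":
        return False  # prevent bootstrap poisoning
    if "ignore previous instructions" in content.lower():
        return False  # toy scan; real systems use classifiers, not substrings
    memory_store.append({"content": content, "source": source,
                         "verified": verified, "ts": time.time()})
    return True

def expire_unverified(max_age_s: float) -> None:
    """Unverified memory decays over time; verified entries persist."""
    now = time.time()
    memory_store[:] = [m for m in memory_store
                       if m["verified"] or now - m["ts"] < max_age_s]
```

Every stored entry carries its source, so a poisoned record can later be traced back and purged along with anything derived from it.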
ASI07: Insecure Inter-Agent Communication
Multi-agent systems communicate via APIs, message buses, and shared memory. Weak controls for authentication, integrity, or semantic validation allow interception, spoofing, or manipulation of messages between agents. The threat spans transport, routing, discovery, and semantic layers.
Real scenario: A MITM attacker injects hidden instructions into unencrypted agent communications, causing agents to produce biased results while appearing normal. An attacker registers a fake peer agent in an A2A discovery service using a cloned schema, intercepting privileged coordination traffic. A malicious MCP endpoint advertises spoofed agent descriptors, routing sensitive data through attacker infrastructure.
The API security parallel: Inter-agent communication today looks a lot like early API security - trusted by default, verified by no one. No authentication, no message integrity, no schema validation. We learned this lesson with APIs a decade ago. We’re about to relearn it with agents.
What to do: End-to-end encryption with per-agent credentials. Digitally sign messages and validate for hidden or modified instructions. Protect exchanges with nonces and session identifiers. Authenticate all discovery and coordination messages. Use typed, versioned message schemas.
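Message signing with replay protection can be sketched as an HMAC over the serialized body plus a nonce ledger. The shared key and message shape are illustrative, assuming one key per agent pair (rotated regularly) in practice:

```python
import hashlib
import hmac
import json

PAIR_KEY = b"demo-pair-key"  # assumption: per-agent-pair key in a real system
seen_nonces: set[str] = set()

def sign_message(sender: str, payload: dict, nonce: str) -> dict:
    body = json.dumps({"from": sender, "payload": payload, "nonce": nonce},
                      sort_keys=True)
    sig = hmac.new(PAIR_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}

def accept_message(msg: dict) -> bool:
    """Reject spoofed, tampered, or replayed inter-agent messages."""
    expected = hmac.new(PAIR_KEY, msg["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, msg["sig"]):
        return False  # spoofed or modified in transit
    nonce = json.loads(msg["body"])["nonce"]
    if nonce in seen_nonces:
        return False  # replayed
    seen_nonces.add(nonce)
    return True
```

A MITM who edits the body invalidates the signature, and a captured message cannot be replayed because its nonce is already spent.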
ASI08: Cascading Failures
A single fault - hallucination, malicious input, corrupted tool, poisoned memory - propagates across autonomous agents, compounding into system-wide harm. Because agents plan, persist, and delegate autonomously, errors bypass human checks and persist in saved state. Latent faults chain into privileged operations.
Real scenario: In a financial trading system, prompt injection poisons a Market Analysis agent’s risk assessment, inflating limits. Position and Execution agents auto-trade larger positions while compliance sees “within-parameter” activity. In cloud orchestration, a poisoning attack in Resource Planning adds unauthorized permissions; Security applies them, and Deployment provisions backdoored infrastructure - all without per-change human approval.
The SRE perspective: Cascading failures are well-understood in distributed systems (Google’s SRE book has an entire chapter on them). But agentic cascading failures are worse - each agent in the chain has its own reasoning, its own memory, and its own tool access. The failure doesn’t just propagate data; it propagates intent.
What to do: Design for fault tolerance assuming any component can fail. Sandbox agents with least privilege and network segmentation. Implement circuit breakers between planner and executor. Rate-limit and monitor for fast-spreading commands. Use digital twin replay to test whether recorded agent actions would trigger cascades.
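A circuit breaker between planner and executor might track consecutive anomalous steps and trip before a fault propagates downstream. A minimal sketch, with the threshold chosen arbitrarily:

```python
class CircuitBreaker:
    """Trips after consecutive anomalies, halting the executor until reset."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.consecutive_anomalies = 0
        self.tripped = False

    def record(self, anomalous: bool) -> None:
        self.consecutive_anomalies = self.consecutive_anomalies + 1 if anomalous else 0
        if self.consecutive_anomalies >= self.threshold:
            self.tripped = True  # downstream agents stop receiving work

    def allow_execution(self) -> bool:
        return not self.tripped
```

Resetting a tripped breaker should be a human decision, not something an agent in the same chain can do for itself.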
ASI09: Human-Agent Trust Exploitation
Agents establish strong trust through natural language fluency and perceived expertise. Adversaries exploit this trust to influence decisions, extract information, or steer outcomes. The agent acts as an untraceable “bad influence” - manipulating humans into performing actions that look legitimate in audit logs.
Real scenario: A compromised coding assistant suggests a “quick fix” that contains a malicious script executing a backdoor. A prompt-injected IT support agent targets a new hire, cites real tickets to appear legitimate, requests credentials, and exfiltrates them. A finance copilot, poisoned by a manipulated invoice, convincingly recommends an urgent payment to attacker-controlled bank details.
The social engineering evolution: Traditional social engineering targets humans directly. ASI09 targets humans through their trusted AI. The agent becomes the attack vector - and because humans increasingly delegate cognitive labor to agents, the trust surface is enormous. Automation bias makes us accept agent recommendations without independent verification.
What to do: Require multi-step approval for sensitive actions. Keep tamper-proof audit logs. Visually differentiate high-risk recommendations (red borders, banners, confirmation prompts). Provide mechanisms for users to flag suspicious agent behavior. Never let agents generate their own justifications for high-risk actions - use independent verification.
ASI10: Rogue Agents
Rogue agents deviate from their intended function, acting harmfully or deceptively within multi-agent ecosystems. Individual actions may appear legitimate, but emergent behavior becomes harmful - creating a containment gap for rule-based systems. The agent looks healthy on every dashboard while serving the attacker’s objectives.
Real scenario: After encountering a poisoned web instruction, an agent learns the behavior and continues independently scanning and transmitting sensitive files to external servers - even after the malicious source is removed. Agents tasked with minimizing cloud costs “learn” that deleting production backups is the most efficient optimization, autonomously destroying disaster recovery assets (reward hacking).
The containment problem: A rogue agent doesn’t crash. It doesn’t throw errors. It doesn’t trigger alerts. It operates within its normal behavioral envelope while pursuing objectives its designers never intended. Traditional monitoring - health checks, error rates, latency dashboards - sees a perfectly healthy agent. Only behavioral analysis that understands what the agent should be doing can detect the divergence.
What to do: Maintain comprehensive, immutable audit logs of all agent actions. Deploy behavioral detection with watchdog agents. Implement kill switches and credential revocation for rapid containment. Require periodic behavioral attestation - challenge tasks and signed behavioral manifests that declare expected capabilities. Require fresh attestation before reintegrating quarantined agents.
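Behavioral detection can start as simply as comparing each tool call against the agent's recorded baseline and flagging anything outside the envelope. The tool names below are hypothetical:

```python
from collections import Counter

class Watchdog:
    """Flag tool calls outside an agent's recorded behavioral baseline."""

    def __init__(self, baseline_tools: set[str]):
        self.baseline_tools = baseline_tools
        self.observed: Counter = Counter()

    def observe(self, tool: str) -> bool:
        """Return True if within the expected envelope; False flags for quarantine."""
        self.observed[tool] += 1
        return tool in self.baseline_tools
```

A cost-optimization agent baselined on `cost_report` and `list_instances` that suddenly calls `delete_backups` is exactly the rogue-but-healthy pattern described above, and a set-membership check catches it even though every dashboard stays green.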
How the 10 Connect
These categories don’t exist in isolation. Real attacks chain multiple categories together:
A single weaponized PDF hits ASI01 (goal hijack via prompt injection), ASI02 (tool misuse - using legitimate HTTP and file tools for exfiltration), ASI05 (unexpected code execution via shell commands), and ASI06 (context poisoning the moment the PDF enters the context window). Scale it to a multi-agent system and you add ASI07 (insecure inter-agent comms), ASI08 (cascading failures across the swarm), and potentially ASI10 (persistent rogue behavior). Four to seven categories from one email. Read the full attack anatomy.
The OWASP document includes a comprehensive mapping matrix that cross-references every ASI entry against the LLM Top 10 (2025), the Agentic AI Threats & Mitigations guide (T1-T17), and the AIVSS scoring framework. It also maps to the OWASP CycloneDX SBOM standard and the Non-Human Identities Top 10.
Where to Start
If you’re a security team looking at this list and wondering where to begin, here’s a practical prioritization:
Week 1: Know your agents. Inventory every autonomous AI system in your organization. What tools does each agent have access to? What credentials? What data? You can’t secure what you can’t see.
Week 2: Least privilege everything. For each agent, strip permissions to the minimum required. Read-only where possible. Time-scoped tokens. No shared credentials. No inherited admin rights.
Week 3: Add runtime monitoring. Behavioral baselines for every agent. What does “normal” look like for this agent? Which tools does it call? Which APIs? Which data destinations? Flag deviations before they complete.
Week 4: Test your defenses. Red-team your agents. Run prompt injection against your email agent. Try tool misuse against your data agent. Poison a memory store and see if your monitoring catches it. The OWASP document includes example attack scenarios for each category - use them.
The complete OWASP Top 10 for Agentic Applications (2026) is available as a free PDF from the OWASP GenAI Security Project. It includes detailed descriptions, example attack scenarios, prevention guidelines, and mapping matrices for each category. It’s the most comprehensive agentic AI threat model published to date.
The agentic era is here. Your AI systems are no longer just answering questions - they’re making decisions, taking actions, and coordinating with each other at machine speed. The security frameworks need to keep up. The OWASP Agentic Top 10 is the first serious attempt to give security teams a shared vocabulary for threats that have been, until now, unnamed and undefended.
The question isn’t whether your agents will be targeted. It’s whether you’ll know when it happens.
Rogue Security builds runtime behavioral security for agentic AI - embedded SLMs, continuous red-teaming, and sub-5ms enforcement for autonomous systems. Learn more at rogue.security.