MAESTRO

10 - Case Studies

Version: 1.0.0

The OWASP Multi-Agentic System Threat Modelling Guide provides three case studies that demonstrate how MAESTRO analysis surfaces threats in real-world agentic AI systems. This document presents the key patterns and findings from each case study.

Grounding Note: These case studies are derived from the OWASP Multi-Agentic System Threat Modelling Guide v1.0 and represent plausible threat scenarios based on real-world system architectures and known attack patterns. They are not accounts of specific security incidents at named organizations. The system architectures described (RPA expense processing, ElizaOS autonomous agents, MCP protocol interactions) reflect common deployment patterns where the identified threats are most likely to manifest. Use these as templates for your own threat analysis, adapting the specifics to your system’s architecture and threat landscape.


Pattern A: RPA Expense Reimbursement Agent

System: Single agent + RAG + tools + HITL review

Key Findings


Pattern B: ElizaOS (Web3/Blockchain Agent)

System: Multi-agent + blockchain + plugins + cross-chain

Key Findings


Pattern C: MCP Protocol

System: Client-server protocol + tools + resources + prompts

Key Findings


Key Takeaways

Across All Case Studies

  1. Cross-layer threats are the highest-risk findings. Single-layer analysis misses the most dangerous attack chains. The Hallucination -> RAG -> Tool Misuse pattern (Pattern A) and the Plugin -> Wallet Compromise chain (Pattern B) both span multiple MAESTRO layers.

  2. Non-determinism creates unique agentic risks. Traditional systems process identical inputs identically. Agentic systems with LLMs do not. This fundamental property (T16 in Pattern A) undermines security controls that assume deterministic behavior.

  3. Financial actions require hard preventive controls. When agent actions have real-world financial consequences (blockchain transactions in Pattern B, expense approvals in Pattern A), corrective controls alone may be insufficient, because a completed transaction can be difficult or impossible to reverse. Hard limits, circuit breakers, and mandatory human approval gates are essential.

  4. Trust boundaries must be explicit and enforced. MCP servers (Pattern C), plugin ecosystems (Pattern B), and RAG pipelines (Pattern A) all represent trust boundaries. Every trust boundary needs authentication, authorization, input validation, and monitoring.

  5. Autonomy amplifies every threat. In all three case studies, the autonomous nature of the agents – acting without per-action human approval – is what transforms moderate vulnerabilities into high-severity threats. Autonomy boundaries must be carefully designed and enforced.

  6. Extended threats (beyond T1-T15) are critical. The MAESTRO extended threat catalog (T16-T47) captured many of the most impactful findings. Limiting analysis to the base ASI taxonomy would miss threats like Semantic Drift (T17), Runaway Agent (T32), and Rogue MCP Server (T47).

  7. Preventive controls must not rely on LLM compliance. Since LLMs are non-deterministic and susceptible to prompt injection, preventive controls that depend on the LLM “choosing” to comply are unreliable. Controls must be enforced externally to the LLM.
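Takeaways 3 and 7 can be illustrated together with a minimal sketch of a deterministic guard that sits between the LLM and the tool call. This is not taken from the OWASP guide: the class name ActionGuard, its fields, and the threshold values are all hypothetical, chosen only to show a hard limit, a human approval gate, and a circuit breaker enforced outside the model.

```python
from dataclasses import dataclass


@dataclass
class ActionGuard:
    """Deterministic policy layer between the LLM and a financial tool.

    Hypothetical names and thresholds, for illustration only.
    """
    per_action_limit: float = 500.0    # hard cap on any single action
    approval_threshold: float = 100.0  # above this, a human must approve
    max_denials: int = 3               # breaker trips after this many denials
    denials: int = 0
    tripped: bool = False

    def check(self, amount: float, human_approved: bool = False) -> str:
        """Return 'allow', 'needs_approval', or 'deny' for a proposed action."""
        if self.tripped:
            return "deny"  # breaker is open: nothing runs until an operator resets it
        # Written as a positive range test so NaN and infinity are also rejected.
        if not (0 < amount <= self.per_action_limit):
            self._record_denial()
            return "deny"  # hard limit: human approval cannot override it
        if amount > self.approval_threshold and not human_approved:
            return "needs_approval"
        return "allow"

    def _record_denial(self) -> None:
        self.denials += 1
        if self.denials >= self.max_denials:
            self.tripped = True


guard = ActionGuard()
print(guard.check(50.0))                        # allow
print(guard.check(250.0))                       # needs_approval
print(guard.check(250.0, human_approved=True))  # allow
print(guard.check(9999.0))                      # deny
```

The point of the design is that the decision never passes through the model: the LLM can only propose an amount, and plain code decides whether the tool runs. A prompt injection that persuades the model to request a larger payment still hits the same limit, and repeated out-of-range requests open the breaker instead of being retried indefinitely.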


Attribution: OWASP GenAI Security Project - Multi-Agentic System Threat Modelling Guide. Licensed under CC BY-SA 4.0.


See also: 06 - Modeling Process; 09 - Mitigation Catalog