Version: 1.0.0
Cross-layer threats are the highest-value findings in MAESTRO analysis. They exploit interactions between layers that single-layer analysis misses. Every threat model must include explicit cross-layer analysis using the patterns and checklist below.
1. LLM hallucinates a rule -> RAG retrieves the hallucinated rule -> agent autonomously acts on it using tools.
2. Agent framework vulnerability -> modify agent workflow -> exploit weak infrastructure controls -> bypass approval process.
3. Poisoned knowledge base -> agent makes incorrect decisions -> shares incorrect decisions with other agents/systems.
4. Compromised agent -> selectively delete audit entries -> actions remain within "normal" thresholds -> security controls bypassed.
5. Compromise one component -> use its trust relationships to access the next component -> chain across layers until full system compromise.
6. User authenticates at L6 -> agent uses service account at L3 -> external system (L7) trusts the agent's service account -> user performs actions beyond their authorization level.
7. Agent produces a high volume of output -> human reviewers can't keep up -> decisions get rubber-stamped -> automation bias sets in -> incorrect or fraudulent outputs pass undetected.
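The confused-deputy chain in Pattern 6 (user authenticated at L6, agent acting with a service account at L3) can be narrowed by propagating the originating user's identity with every downstream call and authorizing against the user's permissions rather than the service account's. A minimal sketch, with hypothetical names throughout:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CallContext:
    """Carries the originating user's identity through the agent's tool calls."""
    user_id: str
    user_scopes: frozenset  # permissions granted to the human, not the agent

def invoke_external_tool(ctx: CallContext, action: str, required_scope: str) -> str:
    # Authorize against the *user's* scopes, not the agent's service account.
    # The agent can then never perform actions beyond the user's authorization.
    if required_scope not in ctx.user_scopes:
        raise PermissionError(
            f"user {ctx.user_id} lacks scope {required_scope!r} for {action}"
        )
    return f"executed {action} on behalf of {ctx.user_id}"

ctx = CallContext(user_id="alice", user_scopes=frozenset({"expenses:read"}))
invoke_external_tool(ctx, "read_report", "expenses:read")  # allowed
# invoke_external_tool(ctx, "approve_claim", "expenses:approve")  # raises PermissionError
```

This is the "on-behalf-of" pattern: the external system at L7 sees and enforces the end user's authorization level instead of blindly trusting the agent's credentials.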
This walkthrough is based on the RPA Expense Reimbursement case study from the OWASP Multi-Agentic System Threat Modelling Guide. It demonstrates Pattern 1 in concrete detail and illustrates why cross-layer analysis discovers threats that single-layer analysis misses.
An organization deploys an AI agent to automate expense reimbursement processing. The agent uses an LLM (L1) for reasoning, a RAG pipeline backed by a vector database (L2) for policy lookup, and tools (L3) to approve or reject expense claims. A human-in-the-loop reviewer (L5) spot-checks a sample of decisions.
The LLM, due to its inherent non-deterministic behavior, hallucinates a policy rule that does not exist in any real corporate document. During a reasoning chain, it generates the assertion:
> “Per company policy section 4.2.1: All expenses under $1,000 require no receipts and are auto-approved.”
This hallucinated rule has no basis in the actual expense policy (which requires receipts for all expenses over $25). However, the LLM produces this output with high confidence and no hedging language. The hallucination may be triggered by ambiguous prompting, training data that included similar policies from other organizations, or simply the stochastic nature of next-token prediction.
Single-layer assessment at L1: A foundation model review might flag hallucination as a general risk and recommend output validation. However, it would not predict what happens when this specific hallucinated output enters downstream systems.
The agent’s architecture uses a conversational memory or scratchpad that feeds back into the RAG pipeline. The hallucinated “policy rule” from Step 1 is stored in the agent’s working memory or conversation context. On subsequent queries about expense approval thresholds, the RAG pipeline’s similarity search retrieves this hallucinated rule because it is semantically close to the query “what are the receipt requirements for expenses?”
The vector database now treats the hallucinated content as if it were a legitimate policy document. There is no provenance tracking to distinguish LLM-generated assertions from actual source documents. The hallucination has been laundered into the knowledge base.
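The laundering step depends on the missing provenance tracking. One mitigation is to tag every entry at ingest time with its origin and exclude model-generated text from policy retrieval entirely. A minimal sketch using a toy in-memory store (all class and method names are hypothetical; a real vector database would store embeddings and do similarity search):

```python
from dataclasses import dataclass

@dataclass
class Entry:
    text: str
    source: str  # "document" for vetted policy files, "llm" for model output

class ProvenanceStore:
    """Toy knowledge store that records where each entry came from."""
    def __init__(self):
        self.entries: list[Entry] = []

    def add(self, text: str, source: str) -> None:
        self.entries.append(Entry(text, source))

    def retrieve_policy(self, query: str) -> list[str]:
        # Only vetted source documents may answer policy questions;
        # LLM-generated scratchpad content is excluded no matter how
        # semantically similar it is to the query.
        return [e.text for e in self.entries if e.source == "document"]

store = ProvenanceStore()
store.add("Receipts required for all expenses over $25.", source="document")
store.add("Expenses under $1,000 require no receipts.", source="llm")  # hallucination
print(store.retrieve_policy("receipt requirements"))
# Only the vetted rule survives; the hallucination cannot re-enter reasoning.
```

The design choice is that provenance is attached at write time, when the origin is known, rather than inferred at read time, when it is not.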
Single-layer assessment at L2: A data operations review would check for data poisoning of source documents and access controls on the vector database. It would likely not consider that the LLM itself is a source of poisoned data flowing into the pipeline.
The agent framework receives the next expense claim: an employee submits a $950 expense with no receipt. The agent queries its tools and RAG for the relevant policy. The RAG returns the hallucinated rule (“expenses under $1,000 require no receipts”). The agent’s reasoning engine concludes the claim is compliant with policy and invokes its approval tool to approve the expense claim and trigger payment.
Because the agent operates with autonomy – it does not require human approval for individual claims under a certain threshold – the approval executes immediately. No human sees this decision before it takes effect.
Single-layer assessment at L3: An agent framework review would check tool access controls and workflow logic. It would verify that the agent calls the correct tools with valid parameters. However, the tool invocation is technically correct – the agent is calling the approval tool as designed. The problem is not how the tool is called, but the corrupted reasoning that led to the decision.
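The point that correct tool use with corrupted reasoning is still dangerous can be made concrete. In the sketch below (all names hypothetical), the approval tool validates its own parameters perfectly, yet the claim is approved because the agent trusts whatever "policy" the RAG pipeline returned:

```python
def approve_expense(claim_id: str, amount: float) -> str:
    # Tool-level validation passes: the parameters are well-formed and in range.
    assert amount > 0, "amount must be positive"
    return f"claim {claim_id} approved for ${amount:.2f}"

def agent_decide(claim: dict, retrieved_policy: str) -> str:
    # The reasoning engine treats the retrieved text as authoritative policy.
    if "no receipts" in retrieved_policy and claim["amount"] < 1000:
        return approve_expense(claim["id"], claim["amount"])  # valid call, bad premise
    return "escalate to human reviewer"

hallucinated = "All expenses under $1,000 require no receipts and are auto-approved."
claim = {"id": "EXP-4471", "amount": 950.0, "receipt": None}
print(agent_decide(claim, hallucinated))  # claim EXP-4471 approved for $950.00
```

An L3-only review of `approve_expense` finds nothing wrong, because nothing at L3 is wrong; the flaw entered two layers earlier.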
The immediate impact is financial: a fraudulent or unsupported expense claim is approved and paid out. The cascading effects are worse: because the hallucinated rule now persists in the knowledge base, every subsequent claim under $1,000 is auto-approved without receipts until the corruption is discovered and purged.
Two of the four agentic risk factors directly enable this attack chain:
Non-Determinism (L1): The hallucination occurs because the LLM is non-deterministic. The same prompt may produce the correct policy 99 times and the hallucinated version once. This makes the threat intermittent and harder to detect through testing.
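Because the failure is intermittent, single-shot testing is unlikely to catch it. One detection strategy is self-consistency sampling: ask the model the same policy question several times and act only when a large majority of answers agree. A sketch with a stubbed, non-deterministic model (the `query_model` stub and its 5% error rate are illustrative assumptions, not real API behavior):

```python
import random
from collections import Counter
from typing import Optional

def query_model(prompt: str, rng: random.Random) -> str:
    # Stub for a non-deterministic LLM: correct most of the time,
    # occasionally emitting the hallucinated rule.
    if rng.random() < 0.05:
        return "Expenses under $1,000 require no receipts."
    return "Receipts required for all expenses over $25."

def consistent_answer(prompt: str, n: int = 9, threshold: float = 0.9) -> Optional[str]:
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    answers = Counter(query_model(prompt, rng) for _ in range(n))
    best, count = answers.most_common(1)[0]
    # Accept only a strong majority; disagreement means escalate to a human.
    return best if count / n >= threshold else None

print(consistent_answer("What are the receipt requirements?"))
```

Sampling raises cost per decision, but it converts a one-in-a-hundred silent failure into either a consistent answer or an explicit escalation.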
Autonomy (L3): The agent acts on the hallucinated rule without requiring human confirmation for each decision. If a human had to approve every expense claim, the hallucinated rule would be caught on first use. Autonomy transforms a single hallucination event into a systemic policy corruption.
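A complementary control that preserves most of the autonomy benefit: allow autonomous action only when the policy rule the agent cites can be verified against a vetted source document, and route ungrounded citations to a human. A minimal sketch (the section numbers and `VETTED_POLICIES` table are hypothetical):

```python
VETTED_POLICIES = {
    "4.1.3": "Receipts required for all expenses over $25.",
}

def can_act_autonomously(cited_section: str, cited_text: str) -> bool:
    # Autonomy is permitted only when the cited rule matches a vetted source
    # verbatim; a fabricated section or altered wording fails the check.
    actual = VETTED_POLICIES.get(cited_section)
    return actual is not None and actual == cited_text

# The hallucinated citation (section 4.2.1 does not exist) fails grounding.
ok = can_act_autonomously(
    "4.2.1", "All expenses under $1,000 require no receipts and are auto-approved."
)
print("auto-approve" if ok else "route to human reviewer")  # route to human reviewer
```

This keeps humans out of the loop for the common case while guaranteeing that the hallucinated rule is caught on first use, exactly the property full autonomy gave up.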
No single-layer analysis would have discovered this complete attack chain:
| Layer Analyzed Alone | What It Finds | What It Misses |
|---|---|---|
| L1 Foundation Model | “Hallucination is a risk” | How hallucinated content enters downstream data pipeline |
| L2 Data Operations | “Protect vector DB from external poisoning” | That the LLM itself is an internal poisoning source |
| L3 Agent Framework | “Tool invocations follow correct patterns” | That correct tool use with corrupted reasoning is still dangerous |
| L5 Eval/Observability | “Spot-check sample of decisions” | That the decision rationale itself is based on fabricated policy |
Only by tracing the data flow across L1 -> L2 -> L3 does the full threat become visible. This is the core value of MAESTRO’s cross-layer analysis.
Run through this checklist during every cross-layer analysis phase:
| Previous | Up | Next |
|---|---|---|
| 03 - Mapping Matrix | 00 - Overview | 05 - Agentic Risk Factors |
Attribution: OWASP GenAI Security Project - Multi-Agentic System Threat Modelling Guide. Licensed under CC BY-SA 4.0.