Meta Threat Modeling: Using AI to Threat Model Your AI System

Meta Threat Modeling: Using AI to Threat Model Your AI System

The Scenario

Your team has just shipped an internal LLM-powered customer support chatbot. It takes employee questions, queries a knowledge base via a retrieval-augmented generation (RAG) pipeline, and returns answers. It connects to your CRM. It has access to customer records. And it’s live in production.

Now someone in the room asks: “Has anyone threat modeled this?”

Silence.

Then, almost inevitably: “Can we just use the AI to help us threat model it?”

It’s a fair question. And increasingly, the answer is yes, provided you understand exactly what you’re signing up for.

Why AI in Threat Modeling? 

Threat modeling has a persistent scaling problem. Security expertise is scarce, systems grow more complex by the quarter, and the window to do meaningful security design work keeps shrinking. AI offers a way to close some of that gap.

Three capabilities make it genuinely useful here:

Knowledge codification: Threat modeling expertise tends to concentrate in a small number of people. AI can encode that knowledge into reusable prompts, structured workflows, and automated scaffolds and skills making it accessible to teams who don’t have a dedicated threat modeler on hand.

Business context ingestion: Given the right documentation, a well-prompted LLM can build a coherent picture of your system and hold that context throughout the exercise, functioning as a colleague who’s actually read the architecture deck.

Threat library integration: STRIDE tables, OWASP references, and MITRE frameworks can be retrieved and applied against a described system automatically, rather than relying on whoever in the room happens to remember them.

That said: Excel doesn’t make you an accountant, and an LLM doesn’t make you a threat modeler. What it does is make a competent threat modeler significantly faster.

Using AI supported DICE 

Let’s walk through each phase of the [DICE framework](https://www.toreon.com/threat-modeling-in-4-steps/) and look at where AI adds value and where it falls short.

 

D - Description of the Context

For our chatbot, this means mapping the system: user inputs travel to an LLM endpoint, which queries a vector database and CRM, and returns a response. Trust boundaries, data flows, external integrations the usual groundwork.

This is where AI contributes earliest and most cleanly. Feed it your architecture document, API specifications, and any existing data flow diagrams, and a well-prompted LLM can generate a structured system description, flag context that’s missing, and scaffold the remainder of the threat model. In our experience, this alone can shave a meaningful chunk off the initial setup phase of a session.

The catch for AI systems specifically: the description step needs to capture what is unique about AI components. A chatbot is not just a web application with an unusual UI. You also need to describe the model provider, the training data provenance, the embedding pipeline, the vector store, the retrieval mechanism, and critically what happens to user inputs. Are they logged? Are they fed back into fine-tuning? Can they reach downstream systems?

If those components are not described accurately, the threats you identify later will be incomplete. Confident-sounding output does not make incomplete output correct.

 

I - Identification of Threats

This is where the “meta” in meta threat modeling gets interesting.

For general software, AI is an excellent sparring partner. It can populate STRIDE tables, generate threat scenarios, and raise the uncomfortable questions that teams tend to skip past all while retaining the business context established in the previous step.

For AI systems, though, the threat landscape extends well beyond what STRIDE was designed to capture. The OWASP LLM Top 10 introduces a category of threats that simply don’t appear in conventional threat models: prompt injection attacks that hijack the model’s behavior through user inputs, sensitive information disclosure through model outputs, training data poisoning that compromises the model at the source, and model denial of service through adversarial crafted inputs.

MITRE ATLAS (Adversarial Threat Landscape for AI Systems) goes further still. It catalogs real-world adversarial machine learning attack patterns model evasion, inference attacks, supply chain compromise of AI components with documented cases from actual incidents. For teams new to AI-specific threats, ATLAS is one of the most actionable starting points available.

There is also a meta-irony worth naming directly: if your threat modeling assistant is an LLM, and you are using it to help threat model an LLM-powered system, the assistant itself is a potential target. A threat actor who can influence the inputs to your threat modeling tool perhaps through a carefully crafted document you’ve loaded as context could distort the output. You could end up with a threat model that is subtly, confidently wrong in exactly the ways an attacker would prefer.

 

C - Countermeasure Definition

With a threat list in hand, AI helps again. It can suggest mitigations, map them to controls, and cross-reference guidance from OWASP LLM Top 10 and MITRE ATLAS’s mitigation catalog.

For our chatbot scenario, that might produce input validation and sanitization at the API boundary, output filtering to prevent sensitive data leakage, rate limiting, strict role-based access controls on CRM integration, hardened system prompts, and mandatory human review before the model can take any high-stakes action.

Think of the AI here as a well-read colleague  one who has internalized every relevant framework and remembers to mention the things your team would have gotten to eventually. That colleague still needs you to make the final call on what is actually feasible in your context, and what the residual risk is if you can’t implement something.

 

E - Evaluation

The Evaluation step has three jobs: verify that every identified threat has been adequately mitigated, explain residual risks in terms the business actually understands, and plan the next steps to manage what remains. All three are places where AI can contribute and where it can quietly mislead you if you let it.

The mitigation check is where AI contributes most cleanly. An LLM with your full threat model in context can systematically verify whether each identified threat maps to at least one countermeasure, flag anything left uncovered, and suggest options for closing the gaps. For a chatbot with dozens of OWASP LLM Top 10 threats to work through, that cross-referencing is time-consuming by hand  AI compresses it. The familiar caveat applies, though: an LLM asked “did we do a good enough job?” has a pronounced tendency to say yes, even when the answer should be no. It will affirm a model with significant gaps using well-structured, confident prose. The human in the loop is what prevents that from becoming a liability.

Residual risk is where human judgment becomes non-negotiable. Not every identified threat will have a cost-effective mitigation. For those that don’t, the evaluation phase requires you to clearly explain what risk remains and relate it to business impact not just security jargon. AI can draft that language, but a product owner, legal counsel, or regulator does not want an LLM’s interpretation of business consequences. They want a human who understands the business to own that decision.

Finally, the evaluation phase is where you decide on and plan the concrete next steps to manage what remains. And crucially: threat modeling is not a one-time exercise. As your chatbot evolves new integrations, updated models, expanded data access the threat model needs to evolve with it. This is where AI delivers sustained value. It can carry forward the decisions and documented residual risks from a previous iteration and use them to seed the next one, giving your team a running head start rather than a blank page. That continuity is one of the clearest practical wins AI brings to the Evaluation phase, as long as someone is still reading the output.

 

 The Risks You Are Actually Introducing 

Using AI in threat modeling introduces its own threat surface. It would be intellectually dishonest to write a threat modeling blog without naming them plainly.

Machine bias and overreliance is the most pervasive risk in practice. The more AI output a team reviews, the more they tend to accept without challenge not out of laziness, but because cognitive review capacity is finite. When an AI generates 40 threat scenarios in 90 seconds, critically evaluating all 40 is genuinely hard. Acceptance by omission, letting an item through because the list is long, is how hallucinated threats end up as accepted facts in your threat register.

Review and tool fatigue compounds this. The volume of output AI can produce quickly outpaces the human bandwidth available to scrutinize it. Teams that don’t budget explicit review time into their AI-assisted threat modeling sessions will find that the process is faster but not necessarily better.

The human in the loop is not optional It is the primary mitigation. Not because AI is inherently untrustworthy, but because AI threat modeling tools operate on statistical pattern matching, not on genuine understanding of your system. They are confidently wrong at a rate that no security decision should rely on unreviewed.

This matters under the regulatory frameworks your organization is already navigating. Both the EU AI Act and the Cyber Resilience Act expect documented, evidence-based security risk assessments not AI-generated checklists that a tired team waved through. Under the CRA in particular, “we thought about security” will not be enough; you need a traceable record of design decisions, identified threats, and deliberate mitigations. An AI-assisted threat model, properly reviewed by a human, can form exactly that record. An unreviewed one is a liability.

Want to learn more?

Book a discovery call for Toreon’s threat modeling training and turn the ENISA playbook into a working product-security practice.

Conclusion: AI Is the Accelerator, Not the Driver 

The team in our scenario wondering whether AI can help them threat model their chatbot has a practical answer: yes, and you should start now. AI compresses time in the Description and Identification phases meaningfully. It brings threat library knowledge into the session automatically. It helps your team ask better questions, faster, and surfaces AI-specific threats that many teams genuinely would not think to raise.

But threat modeling an AI system with AI is a double-edged exercise. You gain speed and breadth. You also introduce a new attack surface, a hallucination risk at every phase, and a machine bias that compounds with each unreviewed output.

The combination that actually works: AI-assisted speed in the first two DICE phases, paired with human-enforced rigor in the last two. The AI does the heavy lifting on coverage; the human does the heavy lifting on judgment.

That is where the real value is. Not in the speed. In the combination.

About the Author:

Asma is a principal cybersecurity consultant passionate about securing systems and enhancing development practices. With expertise in code analysis and scanning technologies, she specializes in identifying vulnerabilities throughout the software development lifecycle. Asma has conducted research into leveraging generative AI for security improvements, exploring how artificial intelligence can enhance threat detection and automate vulnerability assessment. As a trusted advisor to development teams, she combines technical depth with practical strategies to help organizations build robust security into their development processes.

Start typing and press Enter to search

Shopping Cart