Threat Modeling AI/ML Systems and Dependencies
This document is a deliverable of the AETHER Engineering Practices for AI Working Group and
supplements existing SDL threat modeling practices by providing new guidance on threat
enumeration and mitigation specific to the AI and Machine Learning space. It is intended to be used
as a reference during security design reviews of the following:
1. Products/services interacting with or taking dependencies on AI/ML-based services
2. Products/services being built with AI/ML at their core
Traditional security threat mitigation is more important than ever. The requirements established by the Security Development Lifecycle are essential to establishing a product security foundation that this guidance builds upon. Failure to address traditional security threats helps enable the AI/ML-specific attacks covered in this document in both the software and physical domains, as well as making compromise trivial lower down the software stack. For an introduction to net-new security threats in this space, see Securing the Future of AI and ML at Microsoft.
The skillsets of security engineers and data scientists typically do not overlap. This guidance
provides a way for both disciplines to have structured conversations on these net-new
threats/mitigations without requiring security engineers to become data scientists or vice versa.
This document is divided into two sections:
1. “Key New Considerations in Threat Modeling” focuses on new ways of thinking and new
questions to ask when threat modeling AI/ML systems. Both data scientists and security
engineers should review this as it will be their playbook for threat modeling discussions and
mitigation prioritization.
2. “AI/ML-specific Threats and their Mitigations” provides details on specific attacks as well as
specific mitigation steps in use today to protect Microsoft products and services against these
threats. This section is primarily targeted at data scientists who may need to implement
specific threat mitigations as an output of the threat modeling/security review process.
This guidance is organized around an Adversarial Machine Learning Threat Taxonomy created by Ram Shankar Siva Kumar, David O'Brien, Kendra Albert, Salome Viljoen, and Jeffrey Snover entitled "Failure Modes in Machine Learning." For incident management guidance on triaging security threats detailed in this document, refer to the SDL Bug Bar for AI/ML Threats. All of these are living documents which will evolve over time with the threat landscape.
Key New Considerations in Threat Modeling: Changing the way you view Trust Boundaries

Summary

Assume compromise/poisoning of the data you train from, as well as of the data provider. Learn to detect anomalous and malicious data entries, as well as to distinguish between and recover from them.

Questions to Ask in a Security Review
Training Data stores and the systems that host them are part of your Threat Modeling scope.
The greatest security threat in machine learning today is data poisoning because of the lack of
standard detections and mitigations in this space, combined with dependence on
untrusted/uncurated public datasets as sources of training data. Tracking the provenance and
lineage of your data is essential to ensuring its trustworthiness and avoiding a “garbage in,
garbage out” training cycle.
If your data is poisoned or tampered with, how would you know?
-What telemetry do you have to detect a skew in the quality of your training data?
Are you training from user-supplied inputs?
-What kind of input validation/sanitization are you doing on that content?
-Is the structure of this data documented, similar to Datasheets for Datasets?
If you train against online data stores, what steps do you take to ensure the security of the
connection between your model and the data?
-Do they have a way of reporting compromises to consumers of their feeds?
-Are they even capable of that?
How sensitive is the data you train from?
-Do you catalog it or control the addition/updating/deletion of data entries?
Can your model output sensitive data?
-Was this data obtained with permission from the source?
Does the model only output results necessary to achieving its goal?
Does your model return raw confidence scores or any other direct output which could be
recorded and duplicated?
What is the impact of your training data being recovered by attacking/inverting your model?
If confidence levels of your model output suddenly drop, can you find out how/why, as well as
the data that caused it?
Have you defined a well-formed input for your model? What are you doing to ensure inputs
meet this format and what do you do if they don’t?
If your outputs are wrong but not causing errors to be reported, how would you know?
Do you know if your training algorithms are resilient to adversarial inputs on a mathematical
level?
How do you recover from adversarial contamination of your training data?
-Can you isolate/quarantine adversarial content and re-train impacted models?
-Can you roll back/recover to a model of a prior version for re-training?
Are you using Reinforcement Learning on uncurated public content?
Start thinking about the lineage of your data – were you to find a problem, could you track it to its introduction into the dataset? If not, is that a problem? (A minimal lineage-ledger sketch follows the example attacks below.)
Know where your training data comes from and identify statistical norms, in order to begin understanding what anomalies look like (a minimal skew check is sketched after this list)
-What elements of your training data are vulnerable to outside influence?
-Who can contribute to the data sets you're training from?
-How would you attack your sources of training data to harm a competitor?
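The telemetry and statistical-norms questions above become actionable once you have a baseline to compare against. The following is a minimal sketch, not guidance from this document: it assumes numeric features and a trusted baseline sample, and uses a two-sample Kolmogorov-Smirnov test from scipy; the feature names and the 0.05 threshold are illustrative choices.

```python
# Minimal sketch: flag statistical skew in a new batch of training data
# by comparing each numeric feature against a trusted baseline sample.
# The feature names and the 0.05 significance level are illustrative.
import numpy as np
from scipy.stats import ks_2samp

def detect_feature_skew(baseline: np.ndarray, new_batch: np.ndarray,
                        feature_names: list[str], alpha: float = 0.05) -> list[str]:
    """Return the features whose new-batch distribution differs
    significantly from the baseline (two-sample Kolmogorov-Smirnov test)."""
    flagged = []
    for i, name in enumerate(feature_names):
        result = ks_2samp(baseline[:, i], new_batch[:, i])
        if result.pvalue < alpha:
            flagged.append(name)
    return flagged

# Example: alert on drift before the batch is admitted into training.
baseline = np.random.default_rng(0).normal(0.0, 1.0, size=(5000, 2))
new_batch = np.random.default_rng(1).normal(0.5, 1.0, size=(1000, 2))  # shifted mean
print(detect_feature_skew(baseline, new_batch, ["feature_a", "feature_b"]))
```

In practice a check like this would run as telemetry on every candidate batch, with flagged features routed to human review or quarantine rather than silently entering the training set.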
Related Threats and Mitigations in this Document
Adversarial Perturbation (all variants)
Data Poisoning (all variants)
Example Attacks

Forcing benign emails to be classified as spam or causing a malicious example to go undetected
Attacker-crafted inputs that reduce the confidence level of correct classification, especially in
high-consequence scenarios
Attacker injects noise randomly into the source data being classified to reduce the likelihood of
the correct classification being used in the future, effectively dumbing down the model
Contamination of training data to force the misclassification of select data points, resulting in
specific actions being taken or omitted by a system
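The lineage questions earlier in this section ask whether a poisoned record could be traced back to its introduction. One lightweight approach, sketched below as a hypothetical illustration (the ledger format and field names are assumptions, not anything this document prescribes), is to hash and log every batch at the moment it is admitted into the training set:

```python
# Minimal sketch of batch-level data lineage: every batch admitted to the
# training set is logged with a content hash, its source, and a timestamp,
# so a poisoned record can later be traced to the batch that introduced it.
# The ledger format and field names here are illustrative, not prescribed.
import hashlib
import json
import time

def record_batch(ledger_path: str, source: str, records: list[dict]) -> str:
    """Hash a batch of records and append a lineage entry to the ledger."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    entry = {"time": time.time(), "source": source,
             "count": len(records), "sha256": digest}
    with open(ledger_path, "a", encoding="utf-8") as ledger:
        ledger.write(json.dumps(entry) + "\n")
    return digest

# Example: log a batch from an external feed before it enters training.
batch = [{"text": "free money!!!", "label": "spam"}]
print(record_batch("lineage.jsonl", "public-feed-A", batch))
```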
Identify actions your model(s) or product/service could take which can cause customer harm online or in the physical domain

Summary

Left unmitigated, attacks on AI/ML systems can find their way over to the physical world. Any scenario which can be twisted to psychologically or physically harm users is a catastrophic risk to your product/service. This extends to any sensitive data about your customers used for training, and to design choices that can leak those private data points.

Questions to Ask in a Security Review

Do you train with adversarial examples? What impact do they have on your model output in the physical domain?
What does trolling look like to your product/service? How can you detect and respond to it?
What would it take to get your model to return a result that tricks your service into denying access to legitimate users?
What is the impact of your model being copied/stolen?
Can your model be used to infer membership of an individual person in a particular group, or simply in the training data? (A simple membership-inference probe is sketched after this list.)
Can an attacker cause reputational damage or PR backlash to your product by forcing it to
carry out specific actions?
How do you handle properly formatted but overtly biased data, such as from trolls?
For each way to interact with or query your model that is exposed, can that method be interrogated to disclose training data or model functionality?
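As a rough gauge for the membership-inference question above, you can compare the model's confidence on known training members against held-out data; a persistent gap suggests membership is inferable. This is a minimal sketch on synthetic data, with an illustrative scikit-learn model standing in for a real one:

```python
# Minimal sketch of a membership-inference risk probe: if the model is much
# more confident on training members than on unseen data, an attacker can
# exploit that gap. The model and data here are synthetic and illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] + rng.normal(scale=0.5, size=400) > 0).astype(int)
X_train, y_train, X_held_out = X[:200], y[:200], X[200:]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Average top-class confidence for members vs non-members of the training set.
member_conf = model.predict_proba(X_train).max(axis=1).mean()
outsider_conf = model.predict_proba(X_held_out).max(axis=1).mean()
print(f"member confidence {member_conf:.3f} vs outsider {outsider_conf:.3f}")
# A large gap here is a signal that membership inference may succeed.
```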
Related Threats and Mitigations in this Document

Membership Inference
Model Inversion
Model Stealing

Example Attacks

Reconstruction and extraction of training data by repeatedly querying the model for maximum confidence results (a score-coarsening sketch follows below)
Duplication of the model itself by exhaustive query/response matching
Querying the model in a way that reveals a specific element of private data was included in the training set
Self-driving car being tricked to ignore stop signs/traffic lights
Conversational bots manipulated to troll benign users
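Several of these attacks rely on the raw, high-precision confidence scores flagged in the questions of the previous section. One common hardening step is to return only the top label with a coarsened score. The sketch below illustrates that idea; it is not a mitigation mandated by this document, and the rounding granularity is an arbitrary policy choice:

```python
# Minimal sketch: coarsen model output before it leaves the API boundary.
# Returning only the top label with a rounded score raises the cost of
# model-stealing and inversion attacks that need precise confidences.
# The rounding granularity (0.1) is an illustrative policy choice.
def harden_output(class_probs: dict[str, float], precision: float = 0.1) -> dict:
    """Return only the top class with its confidence rounded down."""
    top_label = max(class_probs, key=class_probs.get)
    coarse = int(class_probs[top_label] / precision) * precision
    return {"label": top_label, "confidence": round(coarse, 2)}

# Example: raw scores are never exposed to the caller.
raw = {"spam": 0.9731, "ham": 0.0269}
print(harden_output(raw))  # {'label': 'spam', 'confidence': 0.9}
```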
Identify all sources of AI/ML dependencies as well as frontend presentation layers in your data/model supply chain
Summary

Many attacks in AI and Machine Learning begin with legitimate access to APIs which are surfaced to provide query access to a model. Because of the rich sources of data and rich user experiences involved here, authenticated but "inappropriate" (there's a gray area here) 3rd-party access to your models is a risk, because of the ability to act as a presentation layer above a Microsoft-provided service.

Questions to Ask in a Security Review
Which customers/partners are authenticated to access your model or service APIs?
-Can they act as a presentation layer on top of your service?
-Can you revoke their access promptly in case of compromise? (A revocable-key sketch follows this list.)
-What is your recovery strategy in the event of malicious use of your service or dependencies?
Can a 3rd party build a façade around your model to re-purpose it and harm Microsoft or its customers?
Do customers provide training data to you directly?
-How do you secure that data?
-What if it’s malicious and your service is the target?
What does a false-positive look like here? What is the impact of a false-negative?
Can you track and measure deviation of True Positive vs False Positive rates across multiple
models?
What kind of telemetry do you need to prove the trustworthiness of your model output to
your customers?
Identify all 3rd-party dependencies in your ML/Training data supply chain – not just open source software, but data providers as well
-Why are you using them and how do you verify their trustworthiness?
Are you using pre-built models from 3rd parties, or submitting training data to 3rd-party MLaaS providers?
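The revocation questions above assume you can actually cut a compromised partner off quickly. Below is a minimal sketch of per-caller keys checked against a revocation set on every model query; the class and method names are hypothetical, and a production service would delegate this to a managed identity/authorization platform rather than keeping state in memory:

```python
# Minimal sketch: per-caller API keys checked against a revocation set on
# every model query, so compromised 3rd-party access can be cut off promptly.
# In production this state would live in a managed auth service, not memory.
class ModelGateway:
    def __init__(self):
        self.issued_keys: dict[str, str] = {}   # api_key -> partner name
        self.revoked_keys: set[str] = set()

    def issue_key(self, partner: str, api_key: str) -> None:
        self.issued_keys[api_key] = partner

    def revoke(self, api_key: str) -> None:
        self.revoked_keys.add(api_key)          # takes effect on the next call

    def query_model(self, api_key: str, payload: str) -> str:
        if api_key not in self.issued_keys or api_key in self.revoked_keys:
            raise PermissionError("caller is not authorized for this model")
        # ... forward payload to the model and audit-log the caller here ...
        return f"prediction for {payload!r} on behalf of {self.issued_keys[api_key]}"

gateway = ModelGateway()
gateway.issue_key("partner-a", "key-123")
print(gateway.query_model("key-123", "example input"))
gateway.revoke("key-123")  # compromise detected: access ends immediately
```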