AWS re:Invent 2020: Understand ML model predictions & biases with Amazon SageMaker Clarify
Summary
TL;DR: In this re:Invent presentation, Pinar Yilmaz and Michael Sun explore Amazon SageMaker Clarify, a tool designed to demystify machine learning predictions and biases. They define bias in ML, introduce SageMaker Clarify, discuss its application at Prudential, and demonstrate its use in detecting bias and enhancing explainability across the ML lifecycle. The tool's integration into various SageMaker components and its ability to help meet regulatory requirements highlight its utility in regulated industries such as financial services.
Takeaways
- 📚 Amazon SageMaker Clarify is a tool designed to help understand machine learning predictions and detect biases within models.
- 🔍 Bias in machine learning is defined as imbalances in prediction accuracy across different groups and is crucial to identify and mitigate throughout the ML lifecycle.
- 🛠️ SageMaker Clarify offers a suite of APIs and core libraries that integrate with various SageMaker components, aiding in bias detection, mitigation, and model explainability.
- 📈 The tool is used in practical applications like Bundesliga match facts, where it helps explain the machine learning model's decisions in real-time for a better fan experience.
- 🏢 Prudential Financials leverages SageMaker Clarify to ensure transparency and trust with regulators, internal stakeholders, and customers by explaining AI decisions and detecting biases.
- 🔑 SageMaker Clarify is particularly important for regulated industries like insurance, where explainability is key for compliance with laws and maintaining customer trust.
- 📊 The tool provides a way to generate bias reports and visualize metrics, helping to understand and address class imbalances and feature importance within models.
- 🔄 SageMaker Clarify can be used to monitor models in production, detecting drift in bias and explainability metrics over time, signaling the need for potential model retraining.
- 🛑 There's a trade-off between model accuracy and interpretability; simple models may be more interpretable but less accurate, while complex models may be more accurate but harder to understand.
- 🌐 Various techniques for explainability exist, such as perturbation-based, gradient-based algorithms, and rule extraction, which can be selected and applied according to the use case.
- 🔬 SageMaker Clarify is being considered for inclusion in Prudential's future AI/ML platform governance, ensuring model explainability is a standard part of their AI practice.
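Two of the pre-training bias metrics these takeaways refer to, difference in positive proportions in labels (DPL) and disparate impact (DI), reduce to simple label arithmetic. A minimal plain-Python sketch (not the Clarify library itself; the counts below are invented for illustration):

```python
def dpl(pos_a, n_a, pos_d, n_d):
    """Difference in positive proportions in labels: q_a - q_d,
    where q is the fraction of each group with a positive label."""
    return pos_a / n_a - pos_d / n_d

def disparate_impact(pos_a, n_a, pos_d, n_d):
    """Ratio of positive proportions q_d / q_a; values near 1 suggest parity."""
    return (pos_d / n_d) / (pos_a / n_a)

# Invented counts: 560 of 800 under-40 applicants labeled "good credit",
# versus 100 of 200 applicants aged 40 and above.
print(dpl(560, 800, 100, 200))               # ~0.2
print(disparate_impact(560, 800, 100, 200))  # ~0.71
```

A DPL near zero and a DI near one indicate that positive labels are distributed similarly across the two groups.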
Q & A
Who is Pinar Yilmaz and what is her role in the AWS Deep Engine team?
-Pinar Yilmaz is a senior software engineer on the AWS Deep Engine team. In this session, she presents Amazon SageMaker Clarify, a tool that helps in understanding machine learning predictions and biases.
What is Amazon SageMaker Clarify and what does it aim to address?
-Amazon SageMaker Clarify is a tool designed to help users understand the predictions and biases in machine learning models. It provides insights into potential imbalances in the accuracy of predictions across different groups and offers methods to detect and mitigate these biases.
What are the three main reasons for addressing bias in the machine learning lifecycle?
-The three main reasons for addressing bias in the machine learning lifecycle are: 1) During the data science phase to understand inherent biases in the dataset or model, 2) When operationalizing models to provide explanations to stakeholders, and 3) For regulatory purposes to comply with laws and regulations around algorithm behavior and the right to explanations.
How does SageMaker Clarify help in the data science phase of a machine learning project?
-SageMaker Clarify helps in the data science phase by allowing users to run a bias report to understand the bias metrics in the dataset before training begins. This helps in identifying any inherent or embedded biases early in the process.
What is the trade-off between accuracy and interpretability in machine learning models?
-The trade-off between accuracy and interpretability in machine learning models is that simple models, which are easy to understand and interpret by humans, may not provide the desired accuracy. Conversely, complex models like deep learning, which offer high accuracy, can be difficult for humans to understand and interpret, essentially becoming a 'closed box'.
Can you explain the concept of 'xGoals' as mentioned in the Bundesliga example?
-xGoals, as mentioned in the Bundesliga example, refers to expected goals statistics. It uses a machine learning model trained on Amazon SageMaker to determine real-time goal-scoring chances based on 16 different factors. With the help of SageMaker Clarify, Bundesliga can explain the key underlying components that influence the prediction of a certain xGoals value.
How does SageMaker Clarify integrate with other SageMaker components?
-SageMaker Clarify integrates with other SageMaker components such as Studio, Data Wrangler, Debugger, Experiments, Model Monitor, and Pipelines. It offers APIs and core libraries that are used for bias detection, mitigation, and explainability, and are optimized to run on AWS.
What is the importance of explainability for a company like Prudential Financials?
-For Prudential Financials, explainability is crucial as it helps build trust with customers and regulators by providing transparency in how data is collected, features are generated, algorithms are used, and decisions are made by AI systems. It ensures an open and honest dialogue, which is fundamental to the company's relationship with its customers.
How does SageMaker Clarify assist in addressing the challenges faced by Prudential Financials in explaining AI models?
-SageMaker Clarify assists Prudential Financials by offering multiple algorithmic choices that can be easily combined, providing flexibility. It also optimizes and parallelizes algorithms, enabling the company to achieve results more quickly. This helps in explaining the AI models to various stakeholders, including regulators and customers.
What are the next steps for Prudential Financials in terms of using SageMaker Clarify?
-The next steps for Prudential Financials include scaling the tasks by incorporating multiple new use cases and scaling up the dataset and algorithms. They are also actively considering SageMaker Clarify as part of their future governance, ensuring that every model on Prudential's AIML platform will incorporate explainability.
How can bias and explainability metrics be monitored over time using SageMaker Model Monitor?
-Bias and explainability metrics can be monitored over time using SageMaker Model Monitor by deploying an endpoint with data capture enabled and creating a model monitoring schedule. This allows for the visualization and understanding of how these metrics change, ensuring that they remain stable and indicating when it might be necessary to collect more data or retrain the model.
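The monitoring behavior described above amounts to comparing each scheduled run's metrics against configured thresholds. A plain-Python sketch of that check (the metric names and threshold values are illustrative, not actual Clarify output):

```python
def check_drift(run_metrics, thresholds):
    """Return the metrics whose absolute value exceeds the configured threshold."""
    return {name: value for name, value in run_metrics.items()
            if abs(value) > thresholds.get(name, float("inf"))}

# Illustrative thresholds and one scheduled run's metric values.
thresholds = {"class_imbalance": 0.3, "dpl": 0.1}
latest_run = {"class_imbalance": 0.42, "dpl": 0.05}
violations = check_drift(latest_run, thresholds)
print(violations)  # {'class_imbalance': 0.42}
```

A non-empty result would be the signal, described in the answer above, that it may be time to collect more data or retrain.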
Outlines
🧑‍💻 Introduction to Amazon SageMaker Clarify
Pinar Yilmaz, a senior software engineer at AWS, introduces Amazon SageMaker Clarify, a tool designed to enhance the understanding of machine learning predictions and detect biases. The session begins with a definition of bias and explainability in the context of machine learning. Pinar outlines the importance of identifying biases and providing explanations throughout the machine learning lifecycle, including the data science phase, operationalization, and regulatory compliance. The talk also touches on the challenges of balancing accuracy with interpretability and the various techniques available for providing model explanations.
🏆 Bundesliga's Application of Explainability with SageMaker Clarify
The script discusses how the DFL Bundesliga uses Amazon SageMaker Clarify to enhance fan engagement during soccer matches. Bundesliga match facts utilize 'xGoals,' a machine learning model trained on SageMaker, to determine real-time goal-scoring chances based on 16 different factors. SageMaker Clarify helps explain the key components that influence the xGoals predictions, allowing for debugging of the model, increasing confidence in the algorithm, and enabling fans to better understand the scoring chances of players from any position on the field.
🛡️ SageMaker Clarify's Role in Prudential's AI/ML Practice
Michael Sun, Vice President of Data Science at Prudential Financials, explains how Prudential, a leading financial company, leverages SageMaker Clarify. Prudential uses the tool to ensure transparency in their AI/ML practices, which is critical for building trust with customers and meeting regulatory requirements. The company focuses on explaining data collection, feature generation, bias assignment, and algorithm usage to various stakeholders. SageMaker Clarify aids in addressing the challenges of explaining AI results, especially with large datasets and diverse algorithms, and is being considered for integration into Prudential's future AI governance platform.
📊 SageMaker Studio Demonstration of Bias Detection and Explainability
The script provides a step-by-step demonstration of how to use SageMaker Studio for detecting bias and generating explainability reports. It starts with running a bias report in Data Wrangler by importing a dataset from an S3 bucket and selecting relevant parameters for the analysis. The process continues with running a processing job in a SageMaker notebook using a pre-trained model and an analysis configuration file. The output is a JSON report containing various metrics, including local and global explanations, pre- and post-training bias metrics. The demonstration also includes using the SHAP library for visualizing feature importance and exploring these metrics within SageMaker Studio.
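The analysis configuration file mentioned in this demonstration is plain JSON. A sketch of what such a configuration might look like for the credit example, generated from Python — the field names follow our reading of the Clarify configuration format and should be checked against the current documentation, and the column names are assumptions:

```python
import json

# Illustrative Clarify analysis configuration for a credit dataset.
# Column names ("good_credit", "age") and method settings are assumed.
analysis_config = {
    "dataset_type": "text/csv",
    "label": "good_credit",
    "label_values_or_threshold": [1],          # a label of 1 is the positive outcome
    "facet": [
        {"name_or_index": "age",               # sensitive group: age 40 and above
         "value_or_threshold": [40]}
    ],
    "methods": {
        "pre_training_bias": {"methods": "all"},
        "post_training_bias": {"methods": "all"},
        "shap": {"num_samples": 100, "agg_method": "mean_abs"},
    },
}

with open("analysis_config.json", "w") as f:
    json.dump(analysis_config, f, indent=2)
```

This file, together with the dataset, is what the processing job in the demonstration takes as input.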
🔍 Model Monitoring with SageMaker Clarify
The final paragraph outlines the use of SageMaker Clarify in Model Monitor to continuously detect bias and explainability drift in deployed models. It describes the process of deploying an endpoint with data capture enabled and setting up a model monitoring schedule to collect relevant metrics. The script explains how to visualize and interpret these metrics in SageMaker Studio, ensuring that feature attributions remain stable and bias metrics do not exceed predefined thresholds. This monitoring helps in maintaining model integrity and deciding when to retrain the model based on changes in real-world conditions.
Keywords
💡Amazon SageMaker Clarify
💡Bias
💡Explainability
💡Machine Learning Lifecycle
💡Prudential
💡DFL Bundesliga
💡SageMaker Processing Jobs
💡Model Monitoring
💡SageMaker Studio
💡Data Wrangler
💡Regulatory Compliance
Highlights
Amazon SageMaker Clarify is introduced as a tool to understand machine learning predictions and biases.
Bias in machine learning is defined as imbalances in prediction accuracy across different groups.
The importance of addressing bias and explainability throughout the machine learning lifecycle is emphasized.
SageMaker Clarify offers APIs and core libraries integrated into SageMaker for bias detection and mitigation.
Prudential uses SageMaker Clarify to enhance customer trust and comply with regulatory requirements.
The trade-off between model accuracy and interpretability is discussed, with simple models being more interpretable but less accurate.
Complex models like deep learning, with high accuracy, are often less interpretable for humans.
Various techniques for providing explanations in machine learning are mentioned, including perturbation-based and gradient-based algorithms.
SageMaker Clarify is used in practice by Bundesliga to enhance fan experience with explainable machine learning models.
Prudential Financials leverages SageMaker Clarify for its AI/ML practice, focusing on customer service, fraud detection, and regulatory compliance.
The necessity of explainability in insurtech is highlighted, with Prudential emphasizing trust and transparency with customers.
SageMaker Clarify's flexibility and optimization for parallelizing algorithms are praised for improving efficiency.
A demonstration of running a bias report in SageMaker Data Wrangler is provided, showcasing how to analyze dataset bias.
SageMaker Studio is used to run processing jobs that compute bias and explainability metrics for models and datasets.
The use of SHAP library for visualizing feature importance computed by SageMaker Clarify is demonstrated.
Model Monitor in SageMaker is shown to continuously monitor bias and explainability metrics for deployed models.
Amazon SageMaker Clarify's capabilities to detect bias, provide explanations, and generate reports for stakeholders are summarized.
Transcripts
Hello, welcome to re:Invent. My name is Pinar Yilmaz.
I'm a senior software engineer in the AWS Deep Engine team.
Today, I'm going to be talking about Amazon SageMaker Clarify
and how this tool helps you understand machine
learning predictions and biases.
Today, we're going to start off by defining what bias
and explainability mean in machine learning.
And then, we're going to give an overview of SageMaker Clarify.
And then, we're going to talk about how it's used at Prudential,
and we're going to finish off with a demo.
Bias can be broadly defined
as imbalances in the accuracy of predictions
across different groups in machine learning.
And the main reasons we would want to address bias
throughout our machine learning lifecycle are threefold.
First off, in the beginning of the machine learning journey,
we will start by the data science phase of the project.
And this is when we collect the data, we clean up the data,
we prepare the data and get it ready for training
and try different algorithms and machine
learning models to understand the business use case
and how it unfolds.
And during this time, it's important to understand
if there are inherent or embedded biases
in the dataset or the model itself.
In the next phase, when we're operationalizing
these machine learning models, we would also want to understand
how to provide these explanations about the model behavior
to different stakeholders. These could be internal or external.
Internal could be people such as loan officers,
customer service representatives, or forecasting teams.
And sometimes, for external parties, it could be even the end-users
or the customers of a particular business.
The third phase is regulatory purposes.
Today, the world governments are coming up with new laws
and regulations around the algorithm behavior
and right to explanations by citizens.
And understanding why the machine
learning made a particular prediction,
and also if the model was influenced by potential bias,
will help you comply with local laws and regulations.
You must have seen headlines in the news lately
where an algorithm was found to behave in undesirable ways,
causing problems and making the headlines.
And this is something we would like to avoid,
and we would like to intervene as early in the machine
learning cycle as possible.
When you think about how to implement explainability in practice,
the first hurdle you're going to run into
is what it means to be accurate and interpretable at the same time.
You will quickly find out that there's actually a trade-off.
When you use simple models that are easy to understand
and interpret by humans, you may not get the accuracy desired,
such as rule-based learning or linear regression
where you can just look at the rules or the coefficients,
and in the case of decision trees just a tree structure,
but this may not give you the best predictions
with the high accuracy that you would want.
Then, when you go for a more complex model such as deep learning,
which may have millions and even billions of parameters,
then the model becomes essentially a closed box
for humans to understand and interpret.
So, what do we do?
The research area is ripe with many different techniques
and algorithms to provide explanations.
And these techniques range from perturbation-based, ablation-
or permutation-based, gradient-based algorithms
using neuron activations and things like sensitivity analysis,
saliency masks, rule extraction.
All of these techniques are available in the world today.
But how do you pick the right method or algorithm for your use case?
And once you do, how do you get these explanations in the form
appropriate for your use case,
and how do you make sure that they are consumable
by the internal or external stakeholders,
such as are they numerical, textual, or visual?
And how to represent them.
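Of the techniques listed here, perturbation-based attribution is the easiest to illustrate: reset one feature to a baseline value and measure how much the prediction moves. A toy sketch against an invented linear scoring function (for intuition only, not one of Clarify's actual algorithms):

```python
def perturbation_importance(predict, instance, baseline):
    """Attribute importance to each feature as the prediction change
    observed when that feature alone is reset to its baseline value."""
    base_pred = predict(instance)
    attributions = []
    for i in range(len(instance)):
        perturbed = list(instance)
        perturbed[i] = baseline[i]
        attributions.append(base_pred - predict(perturbed))
    return attributions

# Toy model: a weighted sum of three features.
model = lambda x: 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2]
attrs = perturbation_importance(model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
print(attrs)  # approximately [0.5, 0.3, 0.2]: the weights fall out for a linear model
```

Real attribution methods such as SHAP refine this idea by averaging over many perturbations and feature combinations rather than a single baseline swap.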
Let's talk about how the DFL Bundesliga
uses explainability in practice.
Bundesliga match facts are powered by AWS,
and it provides a more engaging fan experience
during soccer matches for Bundesliga fans.
xGoals, which is short for expected goals statistics,
uses a machine learning model trained on Amazon SageMaker,
and it determines real-time goal-scoring chances
based on 16 different factors.
With explainability from Amazon SageMaker Clarify,
Bundesliga can explain what some of the key underlying components
are to determine what led the machine
learning model to predict a certain xGoals value.
And knowing the respective feature attributions
and the outcomes helps to debug the model,
increase confidence in the algorithm,
and fans can evaluate the goal-scoring chances
of Bundesliga players from any position in the field.
In SageMaker Clarify, what we're offering
is a collection of APIs and core libraries.
And these tools are broadly integrated into SageMaker.
We offer a first-party container to be used
in SageMaker processing jobs,
which is framework and model agnostic.
And our core libraries are used for bias detection,
mitigation, and explainability, and are optimized to run on AWS.
You can find these tools and features spread throughout
different SageMaker components such as Studio,
Data Wrangler, Debugger, Experiments, Model Monitor, and Pipelines,
and more to come.
If you consider the machine learning lifecycle,
as we were talking earlier,
we start by collecting and preparing data.
And SageMaker Data Wrangler will allow you
to run a bias report to understand the bias metrics in your dataset
before you even start training.
Next, we're going to train and tune the model using SageMaker
training, Autopilot, or hyperparameter tuning.
And at the end, we will have a potential viable model,
and we can use our processing job
with our first-party container to understand
the bias metrics and explainability metrics
given the combination of the dataset and the model.
The next step is to deploy this model in production.
When we look at a deployed model,
we would also want to monitor the bias metrics
and explainability metrics continuously
and make sure that these metrics do not vary wildly,
indicating that the real-world conditions have changed,
and now maybe it's time to collect more data or retrain again.
So, to recap, SageMaker features can be used
during the data preparation to explain the trained models
and detect the bias inherent in the model
and the dataset combination,
explain the inferences made by these models,
and also monitor this model
throughout the lifecycle of the model itself.
Next, I'd like to invite Michael Sun
to talk about how they use SageMaker Clarify at Prudential.
Thank you, Pinar. That was great.
And hello everybody, my name's Michael Sun.
I'm the vice president of data science at Prudential Financials.
So, today, I'm going to tell you something about Prudential
and the current state of the AIML practice in our company.
And also, why we think
SageMaker Clarify has been such a great tool for us.
So, Prudential is one of the largest financial companies in the world.
We have tens of millions of customers in the United States
and also across over 40 countries.
The company was created more than 160 years ago,
and over the last 100 years, we've had a deep bond
and trust with tens of millions of our customers,
providing their financial wellness,
protection needs for themselves and their families,
and also investment opportunities.
When I talk to future data scientists
about the current state of AIML at Prudential
and also why they should be part of this effort,
I would say that, for other companies,
understanding the future,
predicting the future in terms of both risk and opportunities
might be something nice to do,
maybe a sort of current fad if you wish.
But understanding future risk is a part of our DNA,
and it has been for the last 160 years as I said.
So, that is why AIML is so important
for insurers such as Prudential Financials.
So, as we have to build the practice and capability,
we have a focus on all manners and aspects
of how we can serve our customers better,
which includes predicting mortality and morbidity risks,
providing customer experience, and the best customer experience
we can for our customers, fraud detection,
which is another key concern for insurers
as well as a present danger all of us are facing.
So, this is the future of Prudential AIML
and is how it's been practiced. It's a huge part of our DNA,
and we welcome everybody to take a look at us
and please join our efforts.
The need for explainability is particularly important
for an insurtech company, as we're practicing at Prudential.
So, when we talk about AI explainability,
what does that mean to us? There are several aspects.
We want to be able to tell our regulators
and our other stakeholders how our data is collected,
what features are generated,
how the weights and biases are assigned,
and what algorithms were used, how the data is validated,
how the [INDISCERNIBLE 00:21:08]
are validated. All the aspects are important.
As I was saying, we need to explain our results
to our external regulators, as well as internal stakeholders.
Last but definitely not the least is our customers.
As I started by saying that Prudential built
all these amazing products and services over 100 years
but underpinning all those product services
is that deep bond and trust we had to build all those years.
And an open and honest dialogue and communication
is the basis of every trusting relationship.
So, we will not squander that trust by producing something
about which we cannot have an open and honest dialogue
with our customers, explaining to them why a certain underwriting practice
is done the way it is done by AI,
or how the pricing structure has come about.
So, explainability to us is trust, and that is something we hold sacred
and will not sacrifice for anything else.
As we try to explain AIML models to our customers,
we're facing similar challenges,
as a lot of you probably have already faced.
There are multiple approaches for a given problem.
So, as we try to explain those algorithms,
we often have to juxtapose them
and switch from one to the other.
For example, from logistic regression
all the way to deep network learning to treat
[INDISCERNIBLE 00:22:44] algorithms and back.
So, that diversity of multiple algorithms
posed a real challenge for us to explain our results.
On top of that, our dataset tends to be large.
I said we have tens of millions of customers,
and those customers have so many touchpoints with us.
So, the computation needed to explain algorithms on datasets of that size
is a huge challenge in and of itself.
So, here is where SageMaker Clarify comes in.
As Pinar said in her opening remarks,
when the team at AWS set out to tackle these problems,
they had those customer needs and business problems in mind.
And for us, particularly in these two areas,
we found very encouraging and promising results
working with the AWS teams.
First, Clarify offers multiple algorithmic choices
that we can combine easily. So, that gives us flexibility.
Second, the teams have done an amazing job in optimizing
and parallelizing the algorithms.
So, we found that we can get the same results
in a fraction of the time of what we used to take.
So, on both accounts, I say Clarify should be a tool you should consider
when you are tackling an AI explainability problem.
For the next steps, we are actively pursuing scaling the tasks
by both including multiple new use cases,
as well as scaling up the dataset and multiple algorithms,
so we can really test the boundary and how widely it can be incorporated
into our data explainability repertoire.
As we're building a future AIML platform,
a model governance is a key topic as well.
SageMaker Clarify is currently being actively considered
as part of our future governance.
So, every model for Prudential's AIML platform
will have this incorporated,
and AIML model explainability will be enhanced with this tool.
Thanks, everybody, for listening. And back to Pinar.
Thank you, Michael.
Next, we're going to see how these features are used
in SageMaker Studio.
First, we will start by showing you
how to run a bias report in Data Wrangler.
In Data Wrangler, the first step is to import a dataset,
and we're going to import our dataset from an S3 bucket.
In Data Wrangler, you can run transforms or analyses.
Bias report is a form of analysis.
So, once we import a dataset, we're going to create a new analysis,
and we're going to select Bias Report from the dropdown menu.
Next, we're going to select the label,
which is the target attribute in our dataset.
And the next value that we would like to plug into this report
is what constitutes a positive outcome for this use case.
So, this is a credit dataset, and the label
"good credit" having a value of one
means a positive outcome for this particular individual.
Next, we're going to select the sensitive group from our dataset.
In this dataset, we're going to select age
as a sensitive group and indicate that a value of 40,
meaning 40 or above, is the sensitive group in this case.
And we're going to compute the bias metrics for this sensitive group.
We're going to select which bias metrics we want to compute.
You can select all and click
on the report to see the bias metrics.
You can explore these values on the resulting report screen,
and the dropdown menus will help you to pull some information
about what this metric is, how it's computed,
how to interpret and understand the value,
and what to do about it,
and also links to further reading resources
are available right there.
You can also look at the bias metrics as a table.
And then, once we create this report,
we can save it as part of our data flow
and use it as a record of our dataset.
Next, we're going to look at how to run a processing job
to get the bias metrics and explainability metrics
for a model and dataset.
For this, we're going to use a notebook
that we're going to run inside of SageMaker Studio.
So, in this notebook, we have already trained and created a model.
And we're going to use this model to run a processing job,
and we're going to create a processor using the SageMaker Python SDK.
For this processor, we're going to provide two inputs,
an analysis configuration file, which is a JSON file
that contains the various parameters for the algorithms,
and the dataset itself.
And in the output, we're going to indicate
where we would like the job results to go.
Once we run this job, we're going to get a
JSON report with all the metrics computed from inside the job.
This includes the local explanations, global explanations,
pre-training and post-training metrics.
We can download this file and inspect
and consume it right within the notebook.
When we explore these metrics,
we find that the class imbalance is indicated as a high value here.
And we would like to confirm
that by running a quick experiment on the dataset
and plot the values in a chart,
which confirms the class imbalance indicated by this metric.
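The quick experiment described here, counting labels per group before plotting them, can be reproduced in a few lines of plain Python (the group values are invented; Clarify computes this metric inside the processing job):

```python
from collections import Counter

# Invented facet column: 1 marks the sensitive group (age 40 and above).
facet_values = [0] * 830 + [1] * 170

counts = Counter(facet_values)
n_a, n_d = counts[0], counts[1]    # advantaged vs. disadvantaged group sizes
ci = (n_a - n_d) / (n_a + n_d)     # class imbalance, ranges over [-1, 1]
print(counts[0], counts[1], ci)    # 830 170 0.66
```

A value far from zero, as here, confirms the imbalance flagged in the report.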
Next, we're going to explore the feature importance,
and we're going to use the open-source SHAP library
to plot these values, which have already been computed
inside the processing job. This is for visualization only.
All the computation has already happened
inside the processing job, and we have access to the global
and the local SHAP values at this time.
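The global explanations in the report are an aggregate of those local values; a common aggregation is the mean absolute attribution per feature. A plain-Python sketch with invented local values (not real SHAP output):

```python
def global_importance(local_attributions):
    """Aggregate per-instance (local) attributions into one global score
    per feature by averaging absolute values across instances."""
    n = len(local_attributions)
    num_features = len(local_attributions[0])
    return [sum(abs(row[j]) for row in local_attributions) / n
            for j in range(num_features)]

# Invented local attributions: three instances, two features.
local = [[0.4, -0.1], [-0.2, 0.3], [0.6, 0.0]]
print(global_importance(local))  # roughly [0.4, 0.133]
```

The resulting per-feature scores are what a SHAP summary plot ranks and visualizes.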
We can also look at the same values
and explore the report visually inside SageMaker Studio.
For this, we're going to locate our processing job
in experiments as a trial component.
And when we describe this trial component,
a new tab opens up where we can see all the bias metrics
and the feature attributions together in the same tab.
The same dropdown menus that we were looking at earlier
in Data Wrangler are also available here.
And now, we have more metrics,
because now we have access to pre-training
as well as post-training metrics.
Similarly, we have the metrics as a table or the dropdown menus.
And we can also explore different facets here,
a facet being a sensitive group here
as indicated by our configuration file
and see different bias metrics computed for each of these groups.
Next, we're going to look at how this is used in Model Monitor.
So, in Model Monitor, the first step you would want to do
is to deploy your endpoint and enable data capture on it.
Next, we will have created a model monitoring schedule
with a job definition indicating that we would like to collect bias
and explainability metrics. And we have already done that here.
And next, we're going to explore how these are presented
in the endpoint section of SageMaker Studio,
and how we can visualize and understand these metrics.
So, first, we locate our endpoint,
and then when we look at the model insights
and the bias report tabs here,
we can see that the jobs are running on the schedule that we indicated,
and we can visualize different metrics,
compare from run to run, and also understand
how the feature attributions change over time
by looking at individual features
and see how they change in ranking over time.
So, what we would like to see here is that the features don't change
their attributions over time in big increments
and they remain relatively stable over time,
meaning that the assumptions
and the learnings in our dataset and model are still valid.
So, in here, what we are looking at right now
is the feature attribution for one feature,
and see how they change from one run to the next throughout the model
monitoring schedule on the live endpoint.
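A lightweight version of this stability check is to compare the importance ranking of features between a baseline run and the latest run, flagging any feature whose rank moved. Clarify's own drift detection is more involved; this plain-Python sketch, with invented feature names and scores, only illustrates the idea:

```python
def rank_changes(baseline, current):
    """Return features whose importance rank differs between two runs.
    Inputs map feature name -> attribution score."""
    def ranking(scores):
        ordered = sorted(scores, key=lambda f: -abs(scores[f]))
        return {f: i for i, f in enumerate(ordered)}
    base_rank, cur_rank = ranking(baseline), ranking(current)
    return sorted(f for f in baseline if base_rank[f] != cur_rank[f])

# Invented attribution scores from two monitoring runs.
baseline = {"distance_to_goal": 0.52, "angle": 0.31, "num_defenders": 0.17}
latest   = {"distance_to_goal": 0.48, "angle": 0.12, "num_defenders": 0.35}
print(rank_changes(baseline, latest))  # ['angle', 'num_defenders']
```

An empty result means the ranking is stable, which is the healthy state described above.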
Next, we're looking at the bias metrics here.
And in here, we can also see the same bias metrics
that we were looking at earlier
and how they are computed from one run to the next.
And we can see if the bias metrics cross thresholds.
That would indicate that the model is behaving
in a more biased way than we had initially set it up to be.
And this might be a sign that the model is behaving
in a different way than we would like it to be,
and maybe, again, it's time to collect more data and retrain.
We can also plot the different metrics over time
and use different intervals
and different combinations of comparisons.
This concludes our demo.
So, to recap, Amazon SageMaker Clarify will help you
detect bias during data preparation, detect bias in your trained model,
detect drift in bias and explainability
for the model behavior. And you can use SageMaker Clarify
to provide reports to internal and external stakeholders,
explain individual predictions, and explain overall model behavior.
So, thank you for watching. And thank you, Michael,
for joining us here today at re:Invent.