Counterfactual Fairness

Microsoft Research
1 Aug 2018, 19:10

Summary

TL;DR: This talk delves into the concept of Counterfactual Fairness in machine learning, highlighting issues like racial and gender biases in algorithms. The speaker introduces a causal model approach to address unfairness by considering how sensitive attributes influence decisions. The proposed solution involves a metric and algorithm for learning fair classifiers, demonstrated through an example of law school admission. The talk concludes with a discussion on the practical application of these models and the challenges of ensuring fairness in machine learning.

Takeaways

  • 🧠 The talk emphasizes the impressive capabilities of machine learning, such as surpassing human performance in image classification and game playing, but also highlights the need to address significant problems like bias and discrimination.
  • 🔍 The speaker introduces the concept of Counterfactual Fairness, which is about creating algorithms that do not discriminate based on sensitive attributes like race or sex.
  • 🤖 The talk discusses the limitations of 'Fairness Through Unawareness', where simply removing sensitive attributes from a model does not guarantee fairness due to the influence of these attributes on other features.
  • 📈 The 'Equality of Opportunity' approach by Hardt et al. is mentioned, which corrects for unfairness by using sensitive attributes but has limitations as it does not account for biases in the target label itself.
  • 🔗 The importance of causal models is stressed to understand how sensitive attributes like race and sex can influence other variables and lead to unfair outcomes.
  • 📊 Counterfactuals are introduced as a method to evaluate fairness by imagining what the classifier's prediction would be if a person's sensitive attributes were different, thus allowing for a single change to be observed in its effects.
  • 📚 The speaker proposes a learning algorithm that uses causal models to create fair classifiers by only considering features that are not descendants of sensitive attributes.
  • 📉 The trade-off between fairness and accuracy is acknowledged, as fair classifiers may have lower predictive accuracy due to the exclusion of biased information.
  • 📝 The practical application of the proposed method is demonstrated using a dataset of US law school students, showing the impact of different approaches on fairness and accuracy.
  • 🤝 The talk concludes by emphasizing the role of causal models in addressing unfairness in machine learning decisions and the need for further research in this area.
  • 🙏 The speaker thanks the co-authors and the audience, inviting questions and discussion on the presented topic.

Q & A

  • What is the main topic of the talk?

    -The main topic of the talk is Counterfactual Fairness in machine learning, focusing on how to design algorithms that do not discriminate and are fair.

  • What are some examples of machine learning applications mentioned in the talk?

    -Examples mentioned include image classifications, human-level Atari and Go players, skin cancer recognition systems, predicting police officer deployment, deciding on jail incarceration, and personalized advertisements for housing, jobs, and products.

  • What issues are highlighted with machine learning systems in terms of fairness?

    -Issues highlighted include face detection systems that better identify white people, algorithms showing racist tendencies in advertising recommendations, and sexist biases in word embeddings associating men with bosses and women with assistants.

  • What is the intuitive notion of fairness proposed in the talk?

    -The intuitive notion of fairness proposed is that a fair classifier gives the same prediction had the person had a different race or sex.

  • How does the talk address the problem of sensitive attributes in machine learning?

    -The talk proposes a method that involves modeling the influences of sensitive attributes causally before constructing a classifier, using counterfactuals to determine what the classifier would predict had someone's race or sex been different.

  • What is the concept of 'Fairness Through Unawareness' mentioned in the talk?

    -'Fairness Through Unawareness' is a technique where sensitive attributes are removed from the classifier to make it unaware of these attributes, aiming to make fair predictions.

  • What is the issue with simply removing sensitive attributes in a classifier?

    -The issue is that the remaining features may still be influenced by the sensitive attributes, leading to biased predictions even though the classifier is unaware of the sensitive attributes directly.

  • What is the 'Equality of Opportunity' approach proposed by Hardt et al. in 2016?

    -The 'Equality of Opportunity' approach proposes building a classifier that uses sensitive attributes to correct for unfairness, ensuring equal accuracy in predicting outcomes like law school success for different racial groups.

  • How does the talk propose to model unfair influences in data?

    -The talk proposes modeling unfair influences by assigning a variable to each feature, introducing causal links from the sensitive attributes to those features, and using counterfactuals to determine predictions under different conditions.

  • What is the definition of 'Counterfactual Fairness' introduced in the talk?

    -'Counterfactual Fairness' is defined as a predictor being fair if it gives the same prediction in a world where someone had a different race, gender, or other sensitive attributes.

  • How does the talk demonstrate the practical application of the proposed fairness approach?

    -The talk demonstrates the practical application by using a dataset of US law school students, fitting a causal model, computing unobserved variables, and learning a classifier based on features that are not descendants of sensitive attributes.

  • What are the potential limitations or challenges in using the proposed fairness approach?

    -Potential limitations include the need for accurate causal models, the assumption that interventions are real, and the possibility that the model may not account for all biases, such as those in the selection of the dataset.

  • How does the talk address the trade-off between accuracy and fairness in machine learning?

    -The talk acknowledges that achieving counterfactual fairness may come at the cost of reduced accuracy, as some biased but predictive features are removed from the model.

Outlines

00:00

🧑‍🏫 Introduction to Counterfactual Fairness

The speaker begins by acknowledging the impressive advancements in machine learning, such as superior image classification and game-playing capabilities, and effective medical diagnostics. However, they emphasize the need to address significant issues, including racial and gender biases in algorithms, using examples like biased face detection systems and Google's advertising recommendation system. The talk introduces the concept of counterfactual fairness, which posits that a fair classifier should provide the same prediction regardless of a person's race or sex. The speaker outlines the plan to design a metric and an algorithm based on this fairness definition, using the example of a machine learning system for law school admissions.

05:00

🔍 Challenges with Fairness Through Unawareness

This paragraph delves into the complexities of achieving fairness in machine learning by simply removing sensitive attributes like race and sex from the classifier's input. The speaker points out that other features, such as GPA and LSAT scores, may still be influenced by these sensitive attributes due to systemic biases. They reference a paper by Hardt et al. that proposes using sensitive attributes to correct for unfairness, but also highlight its limitations, particularly when the target label itself may be biased due to factors like stereotype threat.
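
For intuition, a minimal sketch of the unawareness baseline is given below (pandas and scikit-learn assumed, column names hypothetical); it simply drops the sensitive columns before training, which, as the speaker argues, does not remove their indirect influence.

```python
# Minimal sketch of "Fairness Through Unawareness": drop the sensitive
# attributes before training. Column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LinearRegression

def fit_unaware_model(df: pd.DataFrame) -> LinearRegression:
    features = df.drop(columns=["race", "sex", "first_year_grade"])
    target = df["first_year_grade"]
    return LinearRegression().fit(features, target)

# The model never sees race or sex directly, but GPA and LSAT may themselves
# be influenced by them, so predictions can still differ systematically by group.
```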

10:04

🔧 Constructing Counterfactually Fair Classifiers

The speaker introduces a novel approach to fairness by using causal models to understand and address the influence of sensitive attributes on other features and the target label. They propose modeling the causal relationships and then using counterfactuals to assess fairness. Counterfactuals allow for the examination of what the classifier's prediction would have been had the individual's race or sex been different. The process involves computing unobserved variables, imagining a counterfactual condition, and recomputing observed variables based on the causal model. The goal is to create a classifier that provides consistent predictions across both actual and counterfactual data.
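
A rough sketch of the three-step procedure, assuming a toy additive linear causal model with standardized features and made-up coefficients (not the model fitted in the talk):

```python
import numpy as np

# Toy structural equations (features assumed standardized, coefficients made up):
#   GPA  = w_g * A + U + noise
#   LSAT = w_l * A + U + noise
w_g, w_l = -0.3, -0.5

def counterfactual_features(gpa: float, lsat: float, a: int, a_cf: int):
    """Abduction, action, prediction for a single individual."""
    # Step 1 (abduction): infer the unobserved knowledge U from what we observed.
    u = float(np.mean([gpa - w_g * a, lsat - w_l * a]))
    # Step 2 (action): imagine the sensitive attribute had been different (a_cf).
    # Step 3 (prediction): recompute the descendants of A from the equations.
    return w_g * a_cf + u, w_l * a_cf + u

gpa_cf, lsat_cf = counterfactual_features(gpa=0.4, lsat=-0.2, a=1, a_cf=0)
```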

15:07

📈 Demonstrating Counterfactual Fairness in Practice

The speaker discusses the practical application of their proposed method using a dataset of US law school students. They compare the outcomes of different classifiers: one using all available features, one omitting sensitive attributes, and one that is counterfactually fair. The counterfactually fair classifier, while potentially less accurate due to the exclusion of biased information, is presented as the preferred model for its fairness. The speaker also acknowledges the importance of the chosen causal model in determining the fairness and accuracy of the classifier, and invites further exploration of potential biases and the selection of individuals in datasets.
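
A rough end-to-end sketch on law-school-style data (hypothetical column names). For simplicity it estimates U with per-feature regression residuals; the talk instead fits a latent-variable causal model and samples U, but the shape of the procedure is the same.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def fit_counterfactually_fair(df: pd.DataFrame) -> LinearRegression:
    A = pd.get_dummies(df[["race", "sex"]], drop_first=True)  # sensitive attributes
    X = df[["gpa", "lsat"]]                                   # descendants of A
    y = df["first_year_grade"]

    # Estimate the latent "law knowledge" U as the part of each feature
    # that the sensitive attributes cannot explain.
    residuals = [X[col] - LinearRegression().fit(A, X[col]).predict(A)
                 for col in X.columns]
    U = np.column_stack(residuals)

    # Learn the predictor only on U (and any non-descendants of A).
    return LinearRegression().fit(U, y)
```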

🤝 Closing Remarks and Q&A

In the concluding part of the talk, the speaker summarizes the key points: the influence of race, gender, and other factors on machine learning decisions, the introduction of Counterfactual Fairness as a metric, and the provision of a learning algorithm for fair predictors. They thank their co-authors and the audience and open the floor for questions. The Q&A session touches on the practicality of implementing the model, the handling of non-linear relationships, and the potential for biases in dataset selection, with the speaker acknowledging the complexity of these issues and the need for further research.

Keywords

💡Counterfactual Fairness

Counterfactual Fairness is a concept that addresses the issue of discrimination in machine learning algorithms by considering hypothetical scenarios where sensitive attributes like race or gender are altered. It is a key theme in the video, as the speaker discusses the importance of creating algorithms that do not discriminate and are fair. The video defines fairness as a classifier giving the same prediction had the person had a different race or sex, and it is used to develop a metric and an algorithm for fair classification.
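
In symbols (following the paper's notation, with $U$ the latent background variables), the definition requires, for an individual with observed features $X = x$ and sensitive attribute $A = a$ and for every alternative value $a'$:

```latex
P\big(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\big)
  = P\big(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\big)
\qquad \text{for all } y.
```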

💡Machine Learning

Machine Learning is a subset of artificial intelligence that enables computers to learn from data without being explicitly programmed. The video script highlights the advancements in machine learning, such as image classifications and skin cancer recognition systems, while also pointing out the significant problems that need to be addressed, such as bias and discrimination in algorithms.

💡Face Detection Systems

Face Detection Systems are a type of machine learning application that can identify and locate faces in images or videos. The script points out that these systems have been found to be better at identifying the faces of white people than black people, illustrating the issue of racial bias in machine learning applications.

💡Word Embeddings

Word Embeddings are a representation of words in a text corpus, where words that have similar meanings are mapped to nearby points in a high-dimensional space. The video script mentions that when trained on the Google News corpus, word embeddings can make sexist associations, such as associating 'men' with 'bosses' and 'women' with 'assistants', demonstrating gender bias in machine learning models.
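
A toy illustration of how such associations show up as geometry, using made-up three-dimensional vectors rather than real Google News embeddings:

```python
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Synthetic vectors chosen only to illustrate the effect; real embeddings
# are learned from a large corpus such as Google News.
emb = {
    "man":       np.array([0.9, 0.1, 0.3]),
    "woman":     np.array([0.1, 0.9, 0.3]),
    "boss":      np.array([0.8, 0.2, 0.5]),
    "assistant": np.array([0.2, 0.8, 0.5]),
}

# In a biased embedding, "boss" sits closer to "man" than to "woman".
print(cosine(emb["man"], emb["boss"]), cosine(emb["woman"], emb["boss"]))
```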

💡Causal Model

A Causal Model is a type of statistical model that describes the relationship between variables and their direct and indirect effects. In the context of the video, the speaker uses a causal model to represent the influence of sensitive attributes like race and sex on other features and the label, in order to address unfairness in machine learning predictions.
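
As a simplified sketch (a linear-Gaussian version with assumed noise terms; the model actually fitted in the talk may differ), the law school example's causal model can be written as structural equations whose weights $w$ and biases $b$ are learned from data:

```latex
U \sim \mathcal{N}(0, 1), \qquad
\mathrm{GPA}  = b_G + w_G^A A + w_G^U U + \varepsilon_G, \qquad
\mathrm{LSAT} = b_L + w_L^A A + w_L^U U + \varepsilon_L, \qquad
Y = b_Y + w_Y^A A + w_Y^U U + \varepsilon_Y.
```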

💡Sensitive Attributes

Sensitive Attributes, in the context of machine learning, refer to variables such as race, sex, or other personal characteristics that could lead to biased or unfair predictions. The script discusses the problem of sensitive attributes influencing other features and the need to account for this in the pursuit of fairness.

💡Fairness Through Unawareness

Fairness Through Unawareness is a technique mentioned in the script where sensitive attributes are removed from the dataset to make the classifier unaware of them, with the intention of making fair predictions. However, the script points out the issue with this approach, as the remaining features may still be influenced by the sensitive attributes.

💡Equality of Opportunity

Equality of Opportunity is a concept introduced by Hardt et al. in 2016, which the script discusses as a way to address fairness by ensuring that a classifier is equally accurate at predicting outcomes for different groups, such as different races. However, the script also points out the limitations of this approach, as it does not account for biases in the target label itself.
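
For a binary outcome, the Hardt et al. criterion amounts to requiring equal true-positive rates across groups:

```latex
P\big(\hat{Y} = 1 \mid A = a, Y = 1\big) = P\big(\hat{Y} = 1 \mid A = a', Y = 1\big)
\qquad \text{for all groups } a, a'.
```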

💡Counterfactuals

Counterfactuals are hypothetical scenarios that explore what would have happened if a particular condition had been different. In the video, counterfactuals are used to ask what the classifier would have predicted had someone's race or sex been different, which is central to the definition of counterfactual fairness.

💡Stereotype Threat

Stereotype Threat is a psychological phenomenon where individuals feel at risk of confirming negative stereotypes about their social group. The script mentions that stereotype threat could be a causal phenomenon leading to different acceptance rates or law grades for members of different racial groups, affecting the fairness of machine learning models.

💡Causal Relationship

A Causal Relationship is a type of relationship between variables where a change in one variable leads to a change in another. The video script discusses causal relationships, particularly how sensitive attributes like race and sex can have causal influences on other features and the label, which is crucial for understanding and addressing fairness in machine learning.

Highlights

Machine learning advancements have led to systems that outperform humans in image classification and game playing, but significant fairness issues remain.

Machine learning is being used in critical areas such as crime prediction, bail decisions, and personalized advertisements, raising ethical concerns.

Issues like biased face detection systems and racist algorithms in advertising recommendation systems highlight the need for fairness in AI.

Sexist biases are evident in word embeddings trained on Google News, associating men with bosses and women with assistants.

The talk introduces the concept of 'Counterfactual Fairness' as an intuitive notion of fairness in classification.

A fair classifier is defined as one that would give the same prediction if the person's race or sex were different.

The approach involves designing a metric and an algorithm for learning fair classifiers based on the counterfactual fairness definition.

An example of a machine learning system for law school admissions is used to illustrate the approach to fairness.

Sensitive attributes like race and sex are distinguished from other features to address fairness concerns in classification.

The technique 'Fairness Through Unawareness' is critiqued for failing to account for indirect biases in non-sensitive features.

The 'Equality of Opportunity' approach is discussed, which uses sensitive attributes to correct for unfairness in predictions.

The causal model is introduced to represent influences and biases in data, allowing for a more nuanced understanding of fairness.

Counterfactuals are used to imagine the prediction a classifier would make if a person's sensitive attributes were different.

A classifier is considered fair if it provides the same predictions on original and counterfactual data.

The method for constructing fair classifiers involves using features that are not descendants of sensitive attributes.

The practical application of the method is demonstrated using a dataset of US law school students.

The trade-off between accuracy and fairness is discussed, acknowledging the reduction in predictive power when excluding biased information.

The importance of selecting the right causal model is emphasized, as different models can lead to different fairness outcomes.

The presentation concludes with a call to address biases in machine learning decisions caused by race, gender, and other factors.

The Counterfactual Fairness metric and learning algorithm are presented as a solution for creating fair predictors.

The Q&A session explores practical aspects of implementing the model, potential biases, and the impact of selection biases in datasets.

Transcripts

play00:11

>> Okay, great. Thanks a lot.

play00:14

Today, I'll be talking about Counterfactual Fairness.

play00:17

It's a joint work with Josh, Chris, and Ricardo.

play00:21

The way I like to start off

play00:24

the talk is just by patting us on the back.

play00:28

Machine learning these days is amazing.

play00:31

We have image classifications that do better than humans.

play00:35

We have human-level Atari and Go players,

play00:39

and we have skin cancer recognition systems that

play00:42

do just as well if not better than human doctors.

play00:46

So, why not use it everywhere? And people are.

play00:51

People are using it for predicting

play00:53

where police officers should go in order to catch crime.

play00:56

They're using it to decide

play00:59

whether or not to keep someone in jail.

play01:01

They're also using it to make

play01:03

increasingly personalized advertisements about

play01:05

housing, jobs, and products.

play01:09

But, in this talk,

play01:11

I want to stress that there's still

play01:12

significant problems with machine learning

play01:14

that we need to address.

play01:16

For instance, there are face detection systems that are

play01:20

better at identifying the faces

play01:21

of white people than black people.

play01:24

Algorithms are even more explicitly racist.

play01:28

So, this was an example of

play01:30

Google's advertising recommendation system.

play01:34

When you search for people's names that

play01:37

were more often associated with black individuals,

play01:40

you get advertisements like,

play01:42

"Is this person arrested?"

play01:43

This is an example shown

play01:46

by the person who discovered this, Latanya Sweeney,

play01:49

who's a Harvard Professor

play01:51

of Government and Technology.

play01:53

Machine learning has also been shown to be sexist.

play01:57

So, there're word embeddings.

play02:00

When trained on the Google News corpus,

play02:02

they make associations like,

play02:04

men are bosses and women are assistants.

play02:07

So, to address this in this talk,

play02:10

we're going to take a step towards a solution.

play02:13

Step towards making algorithms that don't

play02:16

discriminate, that are fair.

play02:19

Specifically, we're going to start

play02:21

from a very intuitive notion of fairness,

play02:23

that a fair classifier gives the same prediction had

play02:26

the person had a different race or sex.

play02:31

We're going to design a metric

play02:32

to test with this definition,

play02:35

and an algorithm that describes

play02:37

how to learn classifiers that are fair.

play02:40

So, let me demonstrate how this approach

play02:42

works by considering an example.

play02:45

A machine learning system that

play02:46

decides who should be accepted into law school.

play02:49

To make this decision,

play02:51

we have data about

play02:53

individuals that have already been in law school,

play02:57

specifically their sex, their race,

play03:00

their college GPA scores before they got to law school,

play03:04

their law school entrance exam score as in the US,

play03:07

this is the LSAT,

play03:09

and their first year law grades,

play03:12

which is what many law firms are going to use as

play03:14

a measure of their success.

play03:17

So, by the same token the way that we're going to try to

play03:23

predict whether we should admit students or

play03:26

not is whether or not their first year law grade is good.

play03:28

So, we're going to try to learn a predictor Y hat

play03:31

from this set of features to this label.

play03:35

Now, straight away many people would be hesitant

play03:38

to use these features in particular

play03:40

for classification because you're

play03:42

ensuring that people with

play03:43

different values of race

play03:45

or sex get different classifications.

play03:49

So, let me distinguish between

play03:53

features where we feel

play03:55

a bit uncomfortable using directly,

play03:56

and call these sensitive attributes A,

play03:58

and the remaining features as X.

play04:03

Because of this distinction,

play04:05

the first thing we might try is just

play04:06

simply remove these sensitive attributes.

play04:10

Because now, the classifier is unaware of

play04:14

the sensitive attributes directly.

play04:17

So, this technique is

play04:18

called Fairness Through Unawareness.

play04:20

This should allow us to make fair predictions, right?

play04:23

Well, actually there's a crucial issue here.

play04:25

The issue is that if we consider

play04:27

the remaining features while we

play04:30

have a good sense that

play04:31

someone's GPA is influenced by their knowledge.

play04:35

At the same time, it may also be unfairly

play04:38

influenced by the sensitive attributes we just removed.

play04:42

The reason for this is that there are

play04:43

studies that show that

play04:44

minority students may feel that

play04:46

teachers are unsupportive of them.

play04:49

At the same time, teachers may believe that students

play04:52

of a certain race have behavior issues,

play04:54

which influences how they assign grades to them.

play05:00

The same goes for

play05:01

the other non-sensitive attribute, the LSAT score.

play05:05

Minority students because of economic history may

play05:08

have limited access to certain academic institutions,

play05:11

and teachers may implicitly decide to place students

play05:15

in honors classes or not based on race.

play05:19

So, even though we removed the sensitive attributes,

play05:22

we've failed to make our classifier unbiased against

play05:26

certain races because the features themselves are biased.

play05:30

So, we might try to do something a bit more clever.

play05:33

In 2016, there's a very nice paper by Hardt et al.

play05:41

called Equality of Opportunity.

play05:43

They realized this problem

play05:46

with just throwing away the sensitive attributes.

play05:48

So, they proposed to build

play05:50

a classifier that uses the sensitive attributes,

play05:52

and use them to correct for unfairness.

play05:54

The way they did this, is they said,

play05:58

well as long as our classifier is equally accurate at

play06:01

predicting law school success in particular,

play06:04

then it provides equal opportunity

play06:06

for individuals who are

play06:09

black and individuals who are white.

play06:12

But, what if race also

play06:14

influences our label, law school success.

play06:18

There's evidence that it does,

play06:19

minority students' grades may be

play06:21

measurably worse simply because there aren't

play06:23

minority race teachers in law school or at least as

play06:26

many because of those biases I talked about earlier.

play06:31

This causal phenomenon is due to

play06:34

a phenomenon called stereotype threat.

play06:37

So, this may lead to this result where we have

play06:39

different acceptance rate or

play06:42

different good law grades

play06:44

for members in different groups.

play06:48

Equality of Opportunity says that as long as we

play06:51

predict these percentages equally accurately.

play06:56

If I have a classifier that gives

play06:58

34 percent of blacks

play07:00

predicts they'll have good law grades,

play07:02

and 51 percent of white students, then we're fair.

play07:07

But, in doing so,

play07:10

we don't account for the fact that the target label,

play07:13

in this case, is unfairly biased.

play07:15

So, in this talk,

play07:17

we're going to take a different approach.

play07:19

We've proposed to model

play07:21

these influences that I mentioned,

play07:23

causally before constructing a classifier.

play07:28

Specifically, let's assign a variable for each of

play07:31

our sensitive attributes, our non-sensitive features, and our label.

play07:35

Because we said, we believe race

play07:37

influences these attributes,

play07:39

we're going to introduce a causal link

play07:42

from race to these attributes.

play07:45

The same goes for sex.

play07:47

We believe this is an unfair causal link.

play07:52

Finally, we said that we believe our features are

play07:55

also influenced by someone's law knowledge.

play07:58

Something that we can't observe directly,

play08:00

but we can model.

play08:01

The useful thing about having a causal model,

play08:05

is now we can talk about how we believe

play08:07

unfairness is playing a role in our data.

play08:11

So, what is this causal model telling us,

play08:13

more formally?

play08:15

Well, every arrow in this causal model is

play08:18

a functional relationship

play08:19

between the variables that it connects,

play08:21

specifically the arrow from U to Y,

play08:23

it means that Y is some function of U.

play08:28

It could be a deterministic or non-deterministic

play08:30

function, but some function.

play08:32

Now we ask,

play08:35

why does this actually help

play08:37

us deal with a problem of fairness?

play08:39

It becomes clear when we go back to our original goal,

play08:43

which was to say we'd like to enforce

play08:45

this definition of fairness,

play08:47

giving a metric for it and then showing

play08:50

how to design algorithms that satisfy it.

play08:53

But this definition seems quite tricky because we have to

play08:58

somehow imagine that someone's race or

play09:00

sex had been different. And how can we do that?

play09:02

Well, we can do that using

play09:04

a quantity called counterfactuals.

play09:06

Imagine we have a classifier that uses GPA

play09:10

and LSAT score in order to make

play09:11

a prediction of law school success.

play09:13

Counterfactuals allow us to ask the question,

play09:15

what would the classifier have predicted,

play09:18

had someone's race or sex been different?

play09:20

I'll tell you how that works.

play09:22

It works for any individual using a three-step procedure.

play09:27

In the first step, we just take

play09:29

our causal model which we believe represents our data,

play09:32

and we compute any unobserved

play09:34

variables in the causal model,

play09:35

in this case, law knowledge.

play09:37

This is what U looks like for

play09:39

this person and what I want to

play09:40

emphasize here is that, in this model,

play09:43

U describes all the information in G, L,

play09:46

and Y that isn't described by race or sex.

play09:50

So it's sort of extracting

play09:51

fair information from these variables.

play09:55

Okay. So after we do that,

play09:58

next we're going to imagine

play09:59

that someone had a different race,

play10:00

for instance, imagine that this person had been white.

play10:03

So this is the counterfactual condition,

play10:07

and in the third and final step,

play10:10

we're going to use the structural equations,

play10:13

equations that are described

play10:14

by all of these arrows in this causal model,

play10:17

the unobserved variable U that I

play10:19

calculated, and the counterfactually changed

play10:21

race, in order to recompute

play10:23

all our observed variables G, L, and Y.

play10:26

Let me denote these new variables

play10:29

by the subscript, this counterfactual subscript.

play10:32

When we do that we may get something like this.

play10:35

So counterfactuals, what I

play10:37

want you to take home about this procedure is,

play10:40

they allow us to imagine making

play10:42

just a single change to a person

play10:44

and to observe how this change

play10:46

propagates and affects everything else about that person.

play10:50

So, let's go back to our goal.

play10:54

A classifier is going to be fair then if it gives us

play10:58

the same predictions on

play10:59

the original data as it does on the counterfactual data.

play11:02

So more formally, for

play11:05

any features x and any sensitive variable A,

play11:09

we have that the distribution over the different

play11:12

counterfactual conditions of

play11:13

our predictor Y hat has to be identical.

play11:17

I'd like to contrast our definition with

play11:21

another recent definition that was at

play11:22

last year's [inaudible] by Kilbertus et al.

play11:25

Their definition is similar,

play11:27

except that instead of using

play11:28

counterfactuals they use this do

play11:30

operator to intervene on the sensitive variable.

play11:35

What this means is

play11:38

that the difference here is that in our definition,

play11:42

we're comparing the same individual with a different,

play11:45

imagined version of themselves

play11:47

according to the causal model,

play11:49

while in their definition,

play11:51

they are sort of grouping all individuals who happen to

play11:53

align on the same observed features.

play11:55

So there's a trade-off: ours is sort of

play11:58

more specific, but it requires additional assumptions.
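
(In symbols, the intervention-based condition from that work compares groups rather than a specific individual; roughly, with notation assumed here rather than taken from a slide:)

```latex
P\big(\hat{Y} = y \mid \mathrm{do}(A = a)\big) = P\big(\hat{Y} = y \mid \mathrm{do}(A = a')\big),
```

whereas counterfactual fairness additionally conditions on the individual's observed $X = x$ and $A = a$ before comparing the counterfactual worlds.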

play12:02

Now, most classifiers, because

play12:05

these features will often

play12:06

change, won't be counterfactually fair.

play12:09

So, how do we go about constructing

play12:10

these sorts of classifiers?

play12:12

Well, intuitively, we're not going to want to use

play12:17

any features that change when race or

play12:19

sex changes like these.

play12:23

But notice that when we change race or

play12:27

sex, because there are no arrows going into U,

play12:29

we could use this law knowledge variable.

play12:33

In general, any features in your causal model

play12:36

which aren't descendants of your sensitive variables,

play12:38

you can use, along with

play12:40

any unobserved variables, to

play12:41

make a counterfactually fair predictor.

play12:44

This sort of makes sense because

play12:46

of how I was describing U

play12:47

earlier: U is sort of everything about G, L,

play12:50

and Y that isn't described,

play12:52

it's like the fair parts of G,

play12:54

L, and Y, so we should be able to use that.
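
(As a small aside not from the talk, the non-descendant criterion can be checked mechanically on the causal graph; the sketch below uses networkx and the edges assumed in the running example.)

```python
import networkx as nx

# Causal graph of the running example: race, sex and the latent knowledge U
# all feed into GPA, LSAT and the first-year grade Y.
G = nx.DiGraph([
    ("race", "GPA"), ("race", "LSAT"), ("race", "Y"),
    ("sex",  "GPA"), ("sex",  "LSAT"), ("sex",  "Y"),
    ("U",    "GPA"), ("U",    "LSAT"), ("U",    "Y"),
])

sensitive = {"race", "sex"}
tainted = set().union(*(nx.descendants(G, a) for a in sensitive))
usable = set(G.nodes) - tainted - sensitive
print(usable)  # {'U'} -- only the latent knowledge variable is safe to use here
```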

play12:56

So how do we go about constructing our classifier?

play12:59

Well, it's just a simple procedure.

play13:01

You're going to take your features, your labels,

play13:04

and your sensitive variables,

play13:05

and then you're going to fit

play13:07

the causal model that you believe

play13:08

best describes your data.

play13:10

Then for every person,

play13:13

you're going to compute any unobserved variables

play13:16

about them in that causal model.

play13:19

Then you're going to learn a predictor by

play13:22

only learning one on

play13:26

features that are unobserved

play13:28

or that aren't descendants of A,

play13:30

and the final classifier is

play13:31

guaranteed to be counterfactually fair.

play13:33

So, how does this work in practice?

play13:36

Well, we demonstrate our method using a dataset of

play13:40

US law school students with the features

play13:43

I just described in this running example.

play13:46

What we can do is, we could say, "Well,

play13:48

if we didn't care about fairness and we use

play13:50

all the features highlighted in red to make a prediction,

play13:53

this is the RMSE you would get from that unfair classifier."

play13:56

Then, as I described before,

play13:59

what we could do is, we

play14:01

could remove the sensitive attributes,

play14:03

we'd still have something that's likely

play14:05

unfair because we believe these attributes are

play14:07

influenced by race and gender and we get

play14:09

this sort of accuracy results.

play14:12

But because we want to make a fair prediction,

play14:14

because we care more about things than just accuracy,

play14:18

we're going to fit

play14:20

a causal model that we believe represents our data

play14:23

and by that I mean we're going to

play14:25

learn all of these weights W and biases B,

play14:28

and we're going to sample

play14:31

unobserved variables U and then

play14:33

learn a classifier just on those samples.

play14:36

The cost of achieving

play14:39

this counterfactual fairness is

play14:41

that our predictor has a higher RMSE.

play14:44

This is sensible because we believe that parts of G,

play14:48

L, and Y are biased,

play14:51

that they are polluted by race.

play14:52

So, when we take away that information from G, L,

play14:55

and Y we should be less able to predict Y.

play14:57

But because accuracy isn't our only goal,

play15:01

this is the model we propose.

play15:03

But I do want to point out that depending on

play15:06

the causal model that you believe is most

play15:08

accurate you'll get different results.

play15:10

So this model here makes

play15:12

less strict assumptions than

play15:18

the previous model, and because of that,

play15:21

the fair classifier will have different accuracy,

play15:24

there's a natural trade-off there.

play15:27

So, I also want to quickly say that,

play15:30

if we believe that

play15:33

the first causal model I showed you

play15:34

is true, then what we can do is,

play15:36

we can sample people from that model that agree with

play15:39

the data and then we can sample

play15:40

counterfactual individuals from that model,

play15:42

and we can see, for

play15:45

classifiers like the first two I showed you,

play15:47

one that uses all the features,

play15:49

one that uses all of them except the sensitive attributes,

play15:51

how much they change

play15:53

when we actually change someone's race,

play15:55

when we actually change someone's gender.

play15:58

So if this is the right causal model then we really

play16:02

should be learning counterfactually fair predictors.
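
(A minimal sketch of that check, assuming a fitted predictor with a scikit-learn-style predict method and feature matrices already recomputed under the counterfactual; names are hypothetical.)

```python
import numpy as np

def counterfactual_shift(predictor, X_factual, X_counterfactual):
    """Average absolute change in predictions when each individual's sensitive
    attribute is flipped and their features recomputed from the causal model."""
    delta = predictor.predict(X_counterfactual) - predictor.predict(X_factual)
    return float(np.mean(np.abs(delta)))

# A counterfactually fair predictor should give a shift near zero; the
# "all features" and "unawareness" baselines generally will not.
```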

play16:06

So, what I want you all to take away from

play16:09

this talk is that race, gender,

play16:11

sexual orientation, other things could

play16:14

cause machine-learning decisions to change unfairly.

play16:17

Our idea to address this was to describe how

play16:21

the sensitive attributes cause unfair

play16:23

decisions by designing a causal model.

play16:26

We then introduce a fairness metric

play16:28

that we call Counterfactual Fairness,

play16:30

which states that a predictor is fair

play16:31

if it gives the same prediction in

play16:33

a world where they had a different race, gender, or otherwise.

play16:36

We then give a learning algorithm

play16:38

to learn these predictors,

play16:40

and we demonstrate our technique for making

play16:42

fair predictions for law school success.

play16:45

I'd like to thank my co-authors: Chris,

play16:47

who's at Turing, Josh,

play16:48

who's at NYU Stern and Ricardo,

play16:50

who's at UCL, and I'd like

play16:52

to thank you all for listening.

play16:53

I'll take your questions now.

play16:59

>> That shows my ignorance of structural causal models.

play17:05

In your final example,

play17:07

when you had your data set,

play17:08

it just looked like a normal latent variable model.

play17:10

Is it also just like- can I

play17:12

put it in Stan or some other software,

play17:13

crank the handle and that's it or do I have to-?

play17:16

>> Yes, we actually use Stan.

play17:18

>> So it's really practical actually?

play17:19

>> Yeah. So the only distinction between

play17:21

the latent variable model and the causal model

play17:23

is that it's like a philosophical one,

play17:24

you believe that interventions are real. Yeah.

play17:29

>> Is anything that you

play17:32

are presenting dependent on the form of

play17:36

the relationship? So that- I noticed

play17:37

your examples are linear and

play17:39

we substituted [inaudible] things in there.

play17:43

>> Yeah. They can be non-linear

play17:45

but you have to know them.

play17:46

So in order to compute counterfactuals,

play17:48

so you can learn them, but they have to be explicit.

play17:54

Yeah, but they could be non-linear. Sure.

play17:56

>> Is that the [inaudible]?

play17:57

>> Exactly.

play18:02

>> Can you find biases in

play18:05

the selection into your dataset that you're training for-

play18:08

>> Yeah.

play18:09

>> -if you assume that there are-

play18:10

>> That's a very good question. Right, maybe- so

play18:15

for predictive policing, maybe certain people are

play18:17

selected more often because of racism.

play18:19

So no, we haven't addressed the selection problem,

play18:21

but I think that's

play18:21

a really interesting future direction.

play18:25

>> In one of the last

play18:28

slides showing the difference between

play18:29

the counterfactual predictor [inaudible] on the attributes,

play18:33

there were some of the [inaudible] that [inaudible].

play18:37

>> Yeah.

play18:37

>> So does it imply that the labels or [inaudible] in that case.

play18:41

>> Yeah. For that causal model that we fit,

play18:47

it implies that at least on

play18:49

the data that we drew and the model we fit,

play18:52

there's not a change there but it

play18:54

may be that there is on different models.

play18:57

>> [inaudible] might be wrong?

play18:58

>> Exactly. Yeah.

play19:00

>> Okay, [inaudible]

play19:05

>> Great, let's thank Matt again

play19:07

for a wonderful presentation.


Related tags: Machine Learning, Counterfactual Fairness, Algorithmic Bias, Data Ethics, Causal Models, Bias Mitigation, Predictive Policing, Educational Equity, Gender Equality, Race Equity, Ethical AI