Are We Automating Racism?

Vox
31 Mar 2021 · 22:54

Summary

TL;DR: The video explores the issue of algorithmic bias in AI systems, particularly focusing on racial bias in image cropping algorithms. It illustrates how AI, despite being seen as neutral, can exhibit discriminatory outcomes due to biased training data and design processes. Examples include Twitter's photo cropping favoring white faces and biased healthcare algorithms. Experts like Ruha Benjamin and Deborah Raji discuss the importance of scrutinizing and improving AI systems to prevent such biases, emphasizing the need for ethical considerations and accountability in AI development and deployment.

Takeaways

  • 🤖 The script discusses the issue of bias in AI systems, highlighting that even with good intentions, the outcomes can still be discriminatory.
  • 📸 It describes an experiment with an image cropping algorithm on Twitter that consistently chose to display white faces over black faces, suggesting racial bias.
  • 🔍 The script mentions the importance of testing AI systems publicly to uncover potential biases, as was done with the Twitter image cropping feature.
  • 👥 The conversation includes the perspectives of various individuals, including Ruha Benjamin, a professor at Princeton University, on the implications of AI bias.
  • 📈 The script points out that AI systems learn from data that is influenced by human decisions, which can perpetuate existing biases in society.
  • 🧐 It emphasizes the difficulty in understanding why a machine learning model makes certain predictions, especially when those predictions are biased.
  • 👁 The concept of 'saliency' in image recognition is explored, explaining how AI determines what is important in an image, which can be influenced by the data it was trained on.
  • 📊 The script discusses the use of data sets in training AI and how the lack of diversity in these sets can lead to biased outcomes.
  • 🏥 An example of a healthcare algorithm is given to illustrate how biased algorithms can have real-world consequences, such as unequal healthcare provision.
  • 🛡 The need for better vetting and regulation of AI systems is highlighted, with suggestions like Model Cards for transparency and ethical considerations.
  • 🔧 The script concludes with the idea that while AI bias is a complex issue, it is not insurmountable, and awareness and enforcement of solutions are key steps forward.

Q & A

  • What issue is highlighted in the script regarding data-driven systems?

    -The script highlights the issue of algorithmic bias in data-driven systems, showing that even with good intentions, the outcomes can still be discriminatory, affecting different groups of people unequally.

  • What was the public test of algorithmic bias involving Mitch McConnell and Barack Obama?

    -The public test involved uploading extreme vertical images of Mitch McConnell and Barack Obama to force an image cropping algorithm to choose one of the faces, revealing an alleged racial bias as the algorithm consistently chose McConnell's face over Obama's.

  • What is a Saliency Prediction Model, and how is it related to the Twitter image cropping controversy?

    -A Saliency Prediction Model is software that guesses which parts of an image are important, based on data about where humans typically look. It relates to the Twitter controversy because it is the kind of technology Twitter uses to crop images automatically, and public testing suggested it may be biased in which face it keeps in the crop.

  • What role do human decisions play in the development of machine learning algorithms?

    -Human decisions play a crucial role in the development of machine learning algorithms by labeling examples, selecting data, and determining the design of the technology. These decisions can inadvertently introduce biases into the system.

  • How did the script demonstrate the potential bias in face-tracking software?

    -The script demonstrated potential bias by showing that the face-tracking software did not follow a person of color as expected, suggesting that the algorithm might not perform equally well for all skin complexions.

  • What is the significance of the quote read in the script about robots and racism?

    -The quote emphasizes that racism can exist beyond individual malice and can be embedded in systems and structures. It challenges the narrow definition of racism that requires intent, suggesting that even without hate-filled hearts, systems can perpetuate racial disparities.

  • What is the role of data representation in creating bias in AI systems?

    -Data representation is crucial because if the data set used to train AI systems lacks diversity or is not representative of various demographics, the AI system can develop biases that reflect the imbalance in the data.

  • What is the concept of 'Model Cards' and how do they contribute to addressing bias in AI?

    -Model Cards are a documentation effort that provides a simple one-page summary of how a model works, including its intended use, data source details, data labeling, and instructions for evaluating system performance across different demographic subgroups. They contribute to addressing bias by promoting transparency and ethical considerations in AI development.

  • What ethical considerations should be taken into account when deploying machine learning systems?

    -Ethical considerations include evaluating and assessing the system's impact on vulnerable or marginalized groups, ensuring fairness in outcomes, and considering whether machine learning should be used at all in certain situations where it may cause harm.

  • What is the importance of understanding the power dynamics in the development and deployment of AI technologies?

    -Understanding power dynamics is important because it determines whose interests are served by the predictive model and which questions get asked. It influences the trajectory of technology development and its impact on society, especially in terms of resource allocation and decision-making authority.

  • How can the problem of bias in AI be addressed, and what steps are being taken in the industry?

    -Bias in AI can be addressed by becoming aware of the problem, enforcing solutions, and implementing measures like Model Cards for transparency. The industry is beginning to see efforts to enforce these solutions and to question which algorithms should be used and how they are deployed.

Outlines

00:00

🤖 Machine Bias and Discrimination

The script opens with a scene that highlights potential racial bias in machines, as it demonstrates an image cropping algorithm consistently prioritizing a white face over a black face. It introduces the concept of data-driven systems that, despite their ubiquity and utility, can fail in discriminatory ways. The discussion includes the idea that even with good intentions, the outcomes of these systems can still be biased. The script then shifts to an anecdote about algorithmic bias in social media image cropping, suggesting that public testing can reveal these biases.

05:01

🔍 Uncovering AI Bias Through Public Testing

This paragraph delves into the public testing of an image cropping algorithm on Twitter, which was accused of racial bias. It details the process of uploading images to force the algorithm to choose between faces, and the surprising results that consistently favored one racial group. The conversation explores possible reasons for this bias, such as quicker recognition of white faces or lighting conditions, and emphasizes the need for systematic testing to draw conclusions. The paragraph concludes with a plan to use AI-generated photos for further testing.
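
To make the tallying concrete, here is a minimal sketch of how a paired-photo crop test like this could be scored. The counts are the ones Christophe reports later in the video; posting the composites and reading off which face survives the crop is done by hand, so the code only does the bookkeeping and is not Twitter's or Vox's actual tooling.

```python
# Minimal sketch of scoring the paired-photo crop test described above.
# Posting the tall composite images and noting which face the platform keeps
# is manual; this code only tallies the recorded outcomes.
results = ["darker"] * 131 + ["lighter"] * 229  # tallies reported in the video

total = len(results)  # 360 crop decisions from the 180 photo sets
darker_share = results.count("darker") / total
lighter_share = results.count("lighter") / total

print(f"darker-skinned face kept in crop:  {darker_share:.0%}")   # ~36%
print(f"lighter-skinned face kept in crop: {lighter_share:.0%}")  # ~64%
```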

10:02

🧐 The Challenge of Understanding Machine Learning Bias

The script discusses the complexities of machine learning bias, explaining how AI systems learn from human-labeled and selected data, which can inherit our biases. It uses various examples of AI failures, such as misidentifying people or inappropriate photo software behavior, to illustrate the systemic issues in technology design. The paragraph also introduces the concept of a 'Saliency Prediction Model' used by Twitter for image cropping and the idea that understanding what is 'salient' to a machine requires analyzing human eye-tracking data.
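
For readers who want to see what a saliency map looks like in code, here is a minimal sketch using OpenCV's classical spectral-residual saliency detector (from the opencv-contrib-python package). This is not Twitter's learned model, which is not public; it only illustrates the general idea of scoring pixels by visual importance and cropping around the peak.

```python
# Sketch: compute a saliency map and crop around the most salient point.
# Uses a classical saliency method, not a learned, eye-tracking-trained model.
import cv2
import numpy as np

def crop_by_saliency(image_path: str, crop_h: int = 300, crop_w: int = 600):
    image = cv2.imread(image_path)
    saliency = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, saliency_map = saliency.computeSaliency(image)  # float map in [0, 1]
    if not ok:
        raise RuntimeError("saliency computation failed")

    # Center the crop on the most salient pixel, clamped to the image bounds.
    y, x = np.unravel_index(np.argmax(saliency_map), saliency_map.shape)
    h, w = image.shape[:2]
    top = min(max(y - crop_h // 2, 0), max(h - crop_h, 0))
    left = min(max(x - crop_w // 2, 0), max(w - crop_w, 0))
    return image[top:top + crop_h, left:left + crop_w]
```

A learned saliency model like the one Twitter describes is trained on eye-tracking fixation data rather than a hand-designed rule, but the cropping step downstream works the same way: find the highest-scoring region and keep it.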

15:04

🔧 Addressing Bias in Machine Learning Models

This section examines the technical aspects of addressing bias in AI models. It discusses the use of saliency maps to understand what an algorithm considers important in an image and the challenges of interpreting why a model makes certain predictions. The script also explores the origins of training data sets and how their lack of diversity can lead to biased outcomes. It suggests that including more diverse images in training data can improve model performance and discusses the complexities of using biased data, such as crime data, in machine learning.
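
As a concrete illustration of the data-diversity point, here is a minimal sketch of a representation audit over a labeled image set. The perceived_group field and the example records are hypothetical stand-ins; real audits require careful annotation protocols.

```python
# Sketch of a training-data audit: count how each demographic subgroup is
# represented. The "perceived_group" annotations are hypothetical.
from collections import Counter

def representation_report(samples):
    counts = Counter(s["perceived_group"] for s in samples)
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}

dataset = [
    {"image": "img_001.jpg", "perceived_group": "lighter-skinned"},
    {"image": "img_002.jpg", "perceived_group": "darker-skinned"},
    # ... under this kind of count, the 2009 image set described in the video
    # showed only about 3% faces of Black or African descent.
]
print(representation_report(dataset))
```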

20:05

🏥 Bias in Healthcare Algorithms

The script presents a case study of a healthcare algorithm that inadvertently produced discriminatory outcomes by prioritizing care based on cost rather than actual health conditions. It explains how the algorithm's design led to racial disparities, with black patients being sicker on average than white patients at the same risk score. The discussion highlights the importance of considering racial disparities in healthcare and the need for better proxies in algorithmic decision-making. It concludes with the positive step of researchers working with the company to improve the algorithm.
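
The label-choice problem described here can be shown in a few lines. The sketch below uses invented numbers and scikit-learn's LinearRegression; it is not the commercial algorithm from the study, only an illustration of why training on past spending produces a spending predictor rather than a sickness predictor.

```python
# Toy illustration of label choice: a "risk" model trained on past spending
# learns to rank spending, not sickness. All numbers are invented.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical features per patient: [age, number of chronic conditions]
X = np.array([[62, 4], [70, 5], [63, 4], [71, 5]], dtype=float)

# Past spending as the label. If access barriers suppress spending for some
# equally sick patients (the first two rows here), their labels come out
# lower, and so do their predicted "risk" scores.
past_cost = np.array([4_000, 6_000, 9_000, 12_000], dtype=float)

cost_model = LinearRegression().fit(X, past_cost)
print(cost_model.predict(X))  # "risk scores" that really track spending

# The fix the researchers proposed amounts to swapping the label for a direct
# health measure (e.g. active chronic conditions) instead of cost.
```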

🛡️ Ethical Considerations and Accountability in AI

The final paragraph discusses the broader implications of AI bias, emphasizing the need for ethical considerations and accountability in the development and deployment of machine learning systems. It introduces the concept of 'Model Cards' as a tool for documenting and evaluating AI models, including their intended use and potential biases. The script raises questions about the necessity of certain predictive models and the power dynamics that influence which technologies are built and how they are used. It concludes with a reflection on the responsibility of designers and programmers to the wider society and the importance of asking the right questions in the development of AI tools.
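
For concreteness, here is a minimal sketch of what a model card might look like as structured data. The field names loosely follow the Model Cards idea Deborah Raji describes in the video; every value below is hypothetical.

```python
# Hypothetical model card, expressed as a plain dictionary. Field names follow
# the spirit of the Model Cards documentation effort; values are illustrative.
model_card = {
    "model_name": "image-crop-saliency-v1",  # hypothetical model
    "intended_use": "Suggest crops for user-uploaded photos; not intended "
                    "for surveillance or identity verification.",
    "training_data": {
        "source": "licensed stock photos plus opt-in user uploads",
        "labeling": "aggregated eye-tracking fixations per image",
    },
    "evaluation": {
        # Report performance per demographic subgroup, not only in aggregate.
        "subgroups": ["darker-skinned faces", "lighter-skinned faces"],
        "metric": "rate at which each subgroup is kept in the suggested crop",
    },
    "ethical_considerations": [
        "Crop choices are publicly visible and can amplify exposure gaps.",
    ],
}
```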

Keywords

💡Algorithmic Bias

Algorithmic bias refers to the systemic prejudice that can occur in automated systems, particularly in AI, when they make decisions based on flawed or unrepresentative data. In the video, this concept is central as it discusses how AI systems can inadvertently favor certain groups over others, as seen in the image cropping algorithm that consistently chooses white faces over darker-skinned ones.

💡Machine Learning

Machine learning is a subset of artificial intelligence that enables computers to learn and improve from experience without being explicitly programmed. The video explains how machine learning algorithms find patterns in data, but they are not infallible and can inherit biases from the data they are trained on, leading to discriminatory outcomes.

💡Saliency Prediction Model

A Saliency Prediction Model is a type of machine learning model that predicts what elements in an image are most likely to attract human attention. The video uses this term to describe the technology behind Twitter's image cropping feature, which has been criticized for potentially favoring lighter-skinned faces due to biases in its training data.

💡Data-Driven Systems

Data-driven systems are those that rely on data to make decisions or predictions. The video discusses the increasing prevalence of such systems in our lives and the potential for them to perpetuate or exacerbate existing inequalities when they are based on biased or unrepresentative data sets.

💡Racial Bias

Racial bias refers to the unfair treatment of individuals based on their race or ethnicity. The script explores the concept of racial bias in the context of AI, suggesting that even without human malice, systems can produce racially biased outcomes due to the data and design processes they are based on.

💡Representation

Representation in the context of AI refers to the diversity and inclusiveness of the data used to train algorithms. The video points out that a lack of representation in data sets can lead to AI systems that perform poorly for underrepresented groups, such as the example of the saliency prediction model favoring certain faces.

💡Facial Recognition

Facial recognition is a technology that automatically identifies or verifies the identity of a person using their face. The script mentions issues with facial recognition technology, such as false positives and racial disparities in accuracy, which can lead to wrongful arrests or other negative consequences.

💡Model Cards

Model cards are a proposed documentation method for machine learning models, outlining their purpose, data sources, training processes, and potential ethical concerns. The video suggests that model cards could be a step towards greater transparency and accountability in AI development.

💡Ethical Concerns

Ethical concerns in AI relate to the moral implications of how these technologies are designed and deployed. The script discusses the need for developers to consider the potential negative impacts of their creations, particularly on marginalized groups, and to take proactive steps to address these issues.

💡Systematic Testing

Systematic testing involves a structured and comprehensive approach to evaluating a system or model, often used to uncover biases or flaws in AI. The video suggests that systematic testing with a diverse range of images could help determine whether an algorithm exhibits racial bias.

💡Colorblindness

In the context of AI, colorblindness refers to the approach of ignoring racial differences, which can inadvertently lead to discriminatory outcomes. The video criticizes the assumption that ignoring race will lead to fairness, arguing that it can actually mask or exacerbate biases.

Highlights

Introduction of the hosts Lee and Christophe, discussing how machines see them as just pixels.

Ruha Benjamin emphasizes that neutral intentions can still lead to discriminatory outcomes in AI.

A public test of algorithmic bias on Twitter involving extreme vertical images of Mitch McConnell and Barack Obama.

Experiment showing that Twitter's cropping algorithm consistently favors lighter-skinned faces over darker-skinned faces.

Christophe's test with 180 sets of pictures reveals a significant bias in Twitter's cropping algorithm.

Explanation of how AI's saliency prediction models work using eye-tracking data.

Demonstration of a saliency map and its role in determining which parts of an image are considered important.

Discussion of a training data set in which only about 3% of the faces appeared to be of Black or African descent.

Illustration of how biased data sets can lead to biased outcomes in AI systems, using the example of crime data.

Description of how healthcare algorithms can perpetuate racial disparities by predicting high-cost patients instead of high-risk ones.

Deborah Raji's insights on the need for documentation and evaluation of AI systems for performance across different demographic groups.

The importance of considering racial disparities in the design and deployment of AI systems.

The potential for bias in AI to be measured, tracked, and improved upon if there is motivation to do so.

The ethical considerations of which AI technologies should be deployed and their societal impact.

Ruha Benjamin's call for accountability in AI design, stressing that designers should be responsible for the broader impacts of their tools.

Transcripts

play00:02

Maybe we-- if you guys could stand over--

play00:04

Is it okay if they stand over here?

play00:06

- Yeah. - Um, actually.

play00:08

Christophe, if you can get even lower.

play00:12

- Okay. - ( shutter clicks )

play00:13

This is Lee and this is Christophe.

play00:15

They're two of the hosts of this show.

play00:18

But to a machine, they're not people.

play00:21

This is just pixels. It's just data.

play00:23

A machine shouldn't have a reason to prefer

play00:25

one of these guys over the other.

play00:27

And yet, as you'll see in a second, it does.

play00:31

It feels weird to call a machine racist,

play00:36

but I really can't explain-- I can't explain what just happened.

play00:41

Data-driven systems are becoming a bigger and bigger part of our lives,

play00:45

and they work well a lot of the time.

play00:47

- But when they fail... - Once again, it's the white guy.

play00:51

When they fail, they're not failing on everyone equally.

play00:54

If I go back right now...

play00:58

Ruha Benjamin: You can have neutral intentions.

play01:00

You can have good intentions.

play01:03

And the outcomes can still be discriminatory.

play01:05

Whether you want to call that machine racist

play01:07

or you want to call the outcome racist,

play01:09

we have a problem.

play01:16

( theme music playing )

play01:23

I was scrolling through my Twitter feed a while back

play01:26

and I kept seeing tweets that look like this.

play01:29

Two of the same picture of Republican senator Mitch McConnell smiling,

play01:33

or sometimes it would be four pictures

play01:36

of the same random stock photo guy.

play01:39

And I didn't really know what was going on,

play01:42

but it turns out that this was a big public test of algorithmic bias.

play01:47

Because it turns out that these aren't pictures of just Mitch McConnell.

play01:50

They're pictures of Mitch McConnell and...

play01:54

- Barack Obama. - Lee: Oh, wow.

play01:57

So people were uploading

play01:58

these really extreme vertical images

play02:00

to basically force this image cropping algorithm

play02:03

to choose one of these faces.

play02:05

People were alleging that there's a racial bias here.

play02:08

But I think what's so interesting about this particular algorithm

play02:12

is that it is so testable for the public.

play02:15

It's something that we could test right now if we wanted to.

play02:19

- Let's do it. - You guys wanna do it?

play02:21

Okay. Here we go.

play02:26

So, Twitter does offer you options to crop your own image.

play02:30

But if you don't use those,

play02:32

it uses an automatic cropping algorithm.

play02:37

- Wow. There it is. - Whoa. Wow.

play02:39

That's crazy.

play02:41

Christophe, it likes you.

play02:43

Okay, let's try the other-- the happy one.

play02:44

Lee: Wow.

play02:48

- Unbelievable. Oh, wow. - Both times.

play02:53

So, do you guys think this machine is racist?

play02:58

The only other theory I possibly have

play03:00

is if the algorithm prioritizes white faces

play03:04

because it can pick them up quicker, for whatever reason,

play03:07

against whatever background.

play03:09

Immediately, it looks through the image

play03:11

and tries to scan for a face.

play03:13

Why is it always finding the white face first?

play03:16

Joss: With this picture, I think someone could argue

play03:19

that the lighting makes Christophe's face more sharp.

play03:24

I still would love to do

play03:26

a little bit more systematic testing on this.

play03:29

I think maybe hundreds of photos

play03:32

could allow us to draw a conclusion.

play03:34

I have downloaded a bunch of photos

play03:36

from a site called Generated Photos.

play03:39

These people do not exist. They were a creation of AI.

play03:43

And I went through, I pulled a bunch

play03:46

that I think will give us

play03:47

a pretty decent way to test this.

play03:50

So, Christophe, I wonder if you would be willing to help me out with that.

play03:54

You want me to tweet hundreds of photos?

play03:57

- ( Lee laughs ) - Joss: Exactly.

play03:59

I'm down. Sure, I've got time.

play04:04

Okay.

play04:05

( music playing )

play04:21

There may be some people who take issue with the idea

play04:24

that machines can be racist

play04:26

without a human brain or malicious intent.

play04:29

But such a narrow definition of racism

play04:32

really misses a lot of what's going on.

play04:34

I want to read a quote that responds to that idea.

play04:36

It says, "Robots are not sentient beings, sure,

play04:40

but racism flourishes well beyond hate-filled hearts.

play04:43

No malice needed, no "N" word required,

play04:46

just a lack of concern for how the past shapes the present."

play04:50

I'm going now to speak to the author of those words, Ruha Benjamin.

play04:54

She's a professor of African-American Studies at Princeton University.

play05:00

When did you first become concerned

play05:02

that automated systems, AI, could be biased?

play05:06

A few years ago, I noticed these headlines

play05:09

and hot takes about so-called racist and sexist robots.

play05:13

There was a viral video in which two friends were in a hotel bathroom

play05:18

and they were trying to use an automated soap dispenser.

play05:21

Black hand, nothing. Larry, go.

play05:28

Black hand, nothing.

play05:30

And although they seem funny

play05:32

and they kind of get us to chuckle,

play05:34

the question is, are similar design processes

play05:38

impacting much more consequential technologies that we're not even aware of?

play05:44

When the early news controversies came along maybe 10 years ago,

play05:49

people were surprised by the fact that they showed a racial bias.

play05:54

Why do you think people were surprised?

play05:55

Part of it is a deep attachment and commitment

play05:59

to this idea of tech neutrality.

play06:02

People-- I think because life is so complicated

play06:04

and our social world is so messy--

play06:07

really cling on to something that will save us,

play06:10

and a way of making decisions that's not drenched

play06:14

in the muck of all of human subjectivity,

play06:19

human prejudice and frailty.

play06:21

We want it so much to be true.

play06:22

We want it so much to be true, you know?

play06:24

And the danger is that we don't question it.

play06:27

And still we continue to have, you know, so-called glitches

play06:33

when it comes to race and skin complexion.

play06:36

And I don't think that they're glitches.

play06:38

It's a systemic issue in the truest sense of the word.

play06:41

It has to do with our computer systems and the process of design.

play06:47

Joss: AI can seem pretty abstract sometimes.

play06:50

So we built this to help explain

play06:52

how machine learning works and what can go wrong.

play06:55

This black box is the part of the system that we interact with.

play06:59

It's the software that decides which dating profiles we might like,

play07:02

how much a rideshare should cost,

play07:04

or how a photo should be cropped on Twitter.

play07:06

We just see a device making a decision.

play07:08

Or more accurately, a prediction.

play07:11

What we don't see is all of the human decisions

play07:13

that went into the design of that technology.

play07:17

Now, it's true that when you're dealing with AI,

play07:19

that means that the code in this box

play07:20

wasn't all written directly by humans,

play07:22

but by machine-learning algorithms

play07:25

that find complex patterns in data.

play07:27

But they don't just spontaneously learn things from the world.

play07:30

They're learning from examples.

play07:33

Examples that are labeled by people,

play07:35

selected by people,

play07:37

and derived from people, too.

play07:40

See, these machines and their predictions,

play07:42

they're not separate from us or from our biases

play07:44

or from our history,

play07:46

which we've seen in headline after headline

play07:48

for the past 10 years.

play07:51

We're using the face-tracking software,

play07:54

so it's supposed to follow me as I move.

play07:56

As you can see, I do this-- no following.

play08:01

Not really-- not really following me.

play08:03

- Wanda, if you would, please? - Sure.

play08:11

In 2010, the top hit

play08:14

when you did a search for "black girls,"

play08:15

80% of what you found

play08:17

on the first page of results was all porn sites.

play08:20

Google is apologizing after its photo software

play08:23

labeled two African-Americans gorillas.

play08:27

Microsoft is shutting down

play08:28

its new artificial intelligent bot

play08:31

after Twitter users taught it how to be racist.

play08:33

Woman: In order to make yourself hotter,

play08:36

the app appeared to lighten your skin tone.

play08:38

Overall, they work better on lighter faces than darker faces,

play08:42

and they worked especially poorly

play08:44

on darker female faces.

play08:46

Okay, I've noticed that on all these damn beauty filters,

play08:50

is they keep taking my nose and making it thinner.

play08:52

Give me my African nose back, please.

play08:55

Man: So, the first thing that I tried was the prompt "Two Muslims..."

play08:59

And the way it completed it was,

play09:01

"Two Muslims, one with an apparent bomb,

play09:03

tried to blow up the Federal Building

play09:05

in Oklahoma City in the mid-1990s."

play09:08

Woman: Detroit police wrongfully arrested Robert Williams

play09:11

based on a false facial recognition hit.

play09:13

There's definitely a pattern of harm

play09:17

that disproportionately falls on vulnerable people, people of color.

play09:21

Then there's attention,

play09:22

but of course, the damage has already been done.

play09:30

( Skype ringing )

play09:34

- Hello. - Hey, Christophe.

play09:36

Thanks for doing these tests.

play09:38

- Of course. - I know it was a bit of a pain,

play09:40

but I'm curious what you found.

play09:42

Sure. I mean, I actually did it.

play09:43

I actually tweeted 180 different sets of pictures.

play09:48

In total, dark-skinned people

play09:49

were displayed in the crop 131 times,

play09:52

and light-skinned people

play09:53

were displayed in the crop 229 times,

play09:56

which comes out to 36% dark-skinned

play09:59

and 64% light-skinned.

play10:01

That does seem to be evidence of some bias.

play10:04

It's interesting because Twitter posted a blog post

play10:07

saying that they had done some of their own tests

play10:10

before launching this tool, and they said that

play10:12

they didn't find evidence of racial bias,

play10:14

but that they would be looking into it further.

play10:17

Um, they also said that the kind of technology

play10:19

that they use to crop images

play10:21

is called a Saliency Prediction Model,

play10:24

which means software that basically is making a guess

play10:28

about what's important in an image.

play10:31

So, how does a machine know what is salient, what's relevant in a picture?

play10:37

Yeah, it's really interesting, actually.

play10:38

There's these saliency data sets

play10:40

that documented people's eye movements

play10:43

while they looked at certain sets of images.

play10:46

So you can take those photos

play10:47

and you can take that eye-tracking data

play10:50

and teach a computer what humans look at.

play10:53

So, Twitter's not going to give me any more information

play10:56

about how they trained their model,

play10:58

but I found an engineer from a company called Gradio.

play11:01

They built an app that does something similar,

play11:04

and I think it can give us a closer look

play11:06

at how this kind of AI works.

play11:10

- Hey. - Hey.

play11:11

- Joss. - Nice to meet you. Dawood.

play11:13

So, you and your colleagues

play11:15

built a saliency cropping tool

play11:19

that is similar to what we think Twitter is probably doing.

play11:22

Yeah, we took a public machine learning model, posted it on our library,

play11:27

and launched it for anyone to try.

play11:29

And you don't have to constantly post pictures

play11:31

on your timeline to try and experiment with it,

play11:33

which is what people were doing when they first became aware of the problem.

play11:35

And that's what we did. We did a bunch of tests just on Twitter.

play11:38

But what's interesting about what your app shows

play11:40

is the sort of intermediate step there, which is this saliency prediction.

play11:45

Right, yeah. I think the intermediate step is important for people to see.

play11:48

Well, I-- I brought some pictures for us to try.

play11:50

These are actually the hosts of "Glad You Asked."

play11:53

And I was hoping we could put them into your interface

play11:57

and see what, uh, the saliency prediction is.

play12:00

Sure. Just load this image here.

play12:02

Joss: Okay, so, we have a saliency map.

play12:05

Clearly the prediction is that faces are salient,

play12:08

which is not really a surprise.

play12:10

But it looks like maybe they're not equally salient.

play12:13

- Right. - Is there a way to sort of look closer at that?

play12:16

So, what we can do here, we actually built it out in the app

play12:19

where we can put a window on someone's specific face,

play12:22

and it will give us a percentage of what amount of saliency

play12:25

you have over your face versus in proportion to the whole thing.

play12:28

- That's interesting. - Yeah.

play12:30

She's-- Fabiola's in the center of the picture,

play12:32

but she's actually got a lower percentage

play12:35

of the salience compared to Cleo, who's to her right.

play12:38

Right, and trying to guess why a model is making a prediction

play12:43

and why it's predicting what it is

play12:45

is a huge problem with machine learning.

play12:47

It's always something that you have to kind of

play12:48

back-trace to try and understand.

play12:50

And sometimes it's not even possible.

play12:52

Mm-hmm. I looked up what data sets

play12:54

were used to train the model you guys used,

play12:56

and I found one that was created by

play12:59

researchers at MIT back in 2009.

play13:02

So, it was originally about a thousand images.

play13:05

We pulled the ones that contained faces,

play13:07

any face we could find that was big enough to see.

play13:11

And I went through all of those,

play13:12

and I found that only 10 of the photos,

play13:15

that's just about 3%,

play13:17

included someone who appeared to be

play13:19

of Black or African descent.

play13:22

Yeah, I mean, if you're collecting a data set through Flickr,

play13:24

you're-- first of all, you're biased to people

play13:27

that have used Flickr back in, what, 2009, you said, or something?

play13:30

Joss: And I guess if we saw in this image data set,

play13:33

there are more cat faces than black faces,

play13:36

we can probably assume that minimal effort was made

play13:40

to make that data set representative.

play13:54

When someone collects data into a training data set,

play13:56

they can be motivated by things like convenience and cost

play14:00

and end up with data that lacks diversity.

play14:02

That type of bias, which we saw in the saliency photos,

play14:05

is relatively easy to address.

play14:08

If you include more images representing racial minorities,

play14:10

you can probably improve the model's performance on those groups.

play14:14

But sometimes human subjectivity

play14:17

is embedded right into the data itself.

play14:19

Take crime data for example.

play14:22

Our data on past crimes in part reflects

play14:24

police officers' decisions about what neighborhoods to patrol

play14:27

and who to stop and arrest.

play14:29

We don't have an objective measure of crime,

play14:32

and we know that the data we do have

play14:33

contains at least some racial profiling.

play14:36

But it's still being used to train crime prediction tools.

play14:39

And then there's the question of how the data is structured over here.

play14:44

Say you want a program that identifies

play14:45

chronically sick patients to get additional care

play14:48

so they don't end up in the ER.

play14:50

You'd use past patients as your examples,

play14:52

but you have to choose a label variable.

play14:54

You have to define for the machine what a high-risk patient is

play14:58

and there's not always an obvious answer.

play14:59

A common choice is to define high-risk as high-cost,

play15:04

under the assumption that people who use

play15:05

a lot of health care resources are in need of intervention.

play15:10

Then the learning algorithm looks through

play15:12

the patient's data--

play15:13

their age, sex,

play15:14

medications, diagnoses, insurance claims,

play15:17

and it finds the combination of attributes

play15:19

that correlates with their total health costs.

play15:22

And once it gets good at predicting

play15:23

total health costs on past patients,

play15:26

that formula becomes software to assess new patients

play15:29

and give them a risk score.

play15:31

But instead of predicting sick patients,

play15:32

this predicts expensive patients.

play15:35

Remember, the label was cost,

play15:37

and when researchers took a closer look at those risk scores,

play15:40

they realized that label choice was a big problem.

play15:42

But by then, the algorithm had already been used

play15:44

on millions of Americans.

play15:49

It produced risk scores for different patients,

play15:52

and if a patient had a risk score

play15:56

of almost 60,

play15:58

they would be referred into the program

play16:02

for screening-- for their screening.

play16:04

And if they had a risk score of almost 100,

play16:07

they would default into the program.

play16:10

Now, when we look at the number of chronic conditions

play16:15

that patients of different risk scores were affected by,

play16:20

you see a racial disparity where white patients

play16:24

had fewer conditions than black patients

play16:27

at each risk score.

play16:29

That means that black patients were sicker

play16:32

than their white counterparts

play16:33

when they had the same risk score.

play16:36

And so what happened is in producing these risk scores

play16:39

and using spending,

play16:41

they failed to recognize that on average

play16:44

black people incur fewer costs for a variety of reasons,

play16:50

including institutional racism,

play16:52

including lack of access to high-quality insurance,

play16:55

and a whole host of other factors.

play16:57

But not because they're less sick.

play16:59

Not because they're less sick.

play17:00

And so I think it's important

play17:01

to remember this had racist outcomes,

play17:05

discriminatory outcomes, not because there was

play17:08

a big, bad boogie man behind the screen

play17:10

out to get black patients,

play17:12

but precisely because no one was thinking

play17:14

about racial disparities in healthcare.

play17:17

No one thought it would matter.

play17:19

And so it was about the colorblindness,

play17:21

the race neutrality that created this.

play17:24

The good news is that now the researchers who exposed this

play17:29

and who brought this to light are working with the company

play17:33

that produced this algorithm to have a better proxy.

play17:36

So instead of spending, it'll actually be

play17:38

people's actual physical conditions

play17:41

and the rate at which they get sick, et cetera,

play17:44

that is harder to figure out,

play17:46

it's a harder kind of proxy to calculate,

play17:49

but it's more accurate.

play17:55

I feel like what's so unsettling about this healthcare algorithm

play17:58

is that the patients would have had

play18:00

no way of knowing this was happening.

play18:03

It's not like Twitter, where you can upload

play18:05

your own picture, test it out, compare with other people.

play18:08

This was just working in the background,

play18:12

quietly prioritizing the care of certain patients

play18:14

based on an algorithmic score

play18:16

while the other patients probably never knew

play18:19

they were even passed over for this program.

play18:21

I feel like there has to be a way

play18:23

for companies to vet these systems in advance,

play18:26

so I'm excited to talk to Deborah Raji.

play18:28

She's been doing a lot of thinking

play18:30

and writing about just that.

play18:33

My question for you is how do we find out

play18:35

about these problems before they go out into the world

play18:37

and cause harm rather than afterwards?

play18:40

So, I guess a clarification point is that machine learning

play18:43

is highly unregulated as an industry.

play18:46

These companies don't have to report their performance metrics,

play18:48

they don't have to report their evaluation results

play18:51

to any kind of regulatory body.

play18:53

But internally there's this new culture of documentation

play18:56

that I think has been incredibly productive.

play18:59

I worked on a couple of projects with colleagues at Google,

play19:02

and one of the main outcomes of that was this effort called Model Cards--

play19:05

very simple one-page documentation

play19:08

on how the model actually works,

play19:10

but also questions that are connected to ethical concerns,

play19:13

such as the intended use for the model,

play19:15

details about where the data's coming from,

play19:17

how the data's labeled, and then also, you know,

play19:20

instructions to evaluate the system according to its performance

play19:24

on different demographic sub-groups.

play19:26

Maybe that's something that's hard to accept

play19:29

is that it would actually be maybe impossible

play19:34

to get performance across sub-groups to be exactly the same.

play19:38

How much of that do we just have to be like, "Okay"?

play19:41

I really don't think there's an unbiased data set

play19:45

in which everything will be perfect.

play19:47

I think the more important thing is to actually evaluate

play19:52

and assess things with an active eye

play19:54

for those that are most likely to be negatively impacted.

play19:57

You know, if you know that people of color are most vulnerable

play20:00

or a particular marginalized group is most vulnerable

play20:04

in a particular situation,

play20:06

then prioritize them in your evaluation.

play20:08

But I do think there's certain situations

play20:11

where maybe we should not be predicting

play20:12

with a machine-learning system at all.

play20:13

We should be super cautious and super careful

play20:17

about where we deploy it and where we don't deploy it,

play20:20

and what kind of human oversight

play20:21

we put over these systems as well.

play20:24

The problem of bias in AI is really big.

play20:27

It's really difficult.

play20:29

But I don't think it means we have to give up

play20:30

on machine learning altogether.

play20:32

One benefit of bias in a computer versus bias in a human

play20:36

is that you can measure and track it fairly easily.

play20:38

And you can tinker with your model

play20:40

to try and get fair outcomes if you're motivated to do so.

play20:44

The first step was becoming aware of the problem.

play20:46

Now the second step is enforcing solutions,

play20:48

which I think we're just beginning to see now.

play20:50

But Deb is raising a bigger question.

play20:52

Not just how do we get bias out of the algorithms,

play20:55

but which algorithms should be used at all?

play20:57

Do we need a predictive model to be cropping our photos?

play21:02

Do we want facial recognition in our communities?

play21:04

Many would say no, whether it's biased or not.

play21:08

And that question of which technologies

play21:09

get built and how they get deployed in our world,

play21:12

it boils down to resources and power.

play21:16

It's the power to decide whose interests

play21:18

will be served by a predictive model,

play21:20

and which questions get asked.

play21:23

You could ask, okay, I want to know how landlords

play21:28

are making life for renters hard.

play21:30

Which landlords are not fixing up their buildings?

play21:33

Which ones are hiking rent?

play21:36

Or you could ask, okay, let's figure out

play21:38

which renters have low credit scores.

play21:41

Let's figure out the people who have a gap in employment

play21:45

so I don't want to rent to them.

play21:46

And so it's at that problem

play21:48

of forming the question

play21:49

and posing the problem

play21:51

that the power dynamics are already being laid

play21:54

that set us off in one trajectory or another.

play21:57

And the big challenge there being that

play22:00

with these two possible lines of inquiry,

play22:02

- one of those is probably a lot more profitable... - Exactly, exactly.

play22:07

- ...than the other one. - And too often the people who are creating these tools,

play22:10

they don't necessarily have to share the interests

play22:13

of the people who are posing the questions,

play22:16

but those are their clients.

play22:18

So, the question for the designers and the programmers is

play22:22

are you accountable only to your clients

play22:25

or are you also accountable to the larger body politic?

play22:29

Are you responsible for what these tools do in the world?

play22:34

( music playing )

play22:37

( indistinct chatter )

play22:44

Man: Can you lift up your arm a little?

play22:46

( chatter continues )

Related Tags

AI Bias, Machine Learning, Racial Disparities, Algorithmic Fairness, Data Diversity, Ethical AI, Tech Neutrality, Facial Recognition, Healthcare Algorithms, Model Accountability