Algorithms can encode & magnify human bias
Case Study 1: Facial Recognition & Predictive Policing
- Joy Buolamwini & Timnit Gebru, gendershades.org
- Microsoft, Face++, IBM: all commercially available systems at the time
- Largest accuracy gap: lighter-skinned males vs. darker-skinned females (error rates under 1% vs. up to 34.7% in the Gender Shades audit)
A US mayor joked that cops should "mount .50-caliber" guns where AI predicts crime:
"With machine learning, with automation, there's a 99% success, so that robot is, will be, 99% accurate in telling us what is going to happen next, which is really interesting."
- a city official in Lancaster, CA, on approving the use of IBM technology for public security
Bias
- Bias is a type of error
- Statistical bias: the difference between a statistic's expected value and the true value (formula below this list)
- Unjust bias: disproportionate preference for, or prejudice against, a group
- Unconscious bias: bias that we don't realize we have
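Written as a formula, the statistical definition above says: for an estimator $\hat{\theta}$ of a true value $\theta$,

$$\mathrm{Bias}(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta$$

An estimator is unbiased when $\mathbb{E}[\hat{\theta}] = \theta$; e.g., the sample mean is an unbiased estimator of the population mean.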
But the term "bias" alone is too generic to be productive.
Different sources of bias have different causes
Representation Bias: the dataset is not representative of the population the model will later be used on.
(Slide contrast: in one diagram the data is fine but the algorithm is the problem; in the other, the data itself is flawed.)
For example, an object-detection model that performs very well on household products common in the US can degrade markedly when the products come from other regions, such as Zimbabwe or the Solomon Islands.
This is not an algorithmic problem, so the remedy is to attend to data coverage across the regions where the model will be deployed (see the sketch below).
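A minimal sketch of how this surfaces in evaluation, assuming hypothetical logged results with `region` and `correct` columns (toy data):

```python
# Minimal sketch: evaluating an object-recognition model per region to
# surface representation bias. "results" is a hypothetical stand-in
# for real evaluation logs (one row per test image).
import pandas as pd

results = pd.DataFrame({
    "region":  ["US", "US", "US", "US", "Zimbabwe", "Zimbabwe", "Solomon Is."],
    "correct": [True, True, True, False, False, False, True],
})

# A single aggregate accuracy hides the regional gap...
print("overall accuracy:", results["correct"].mean())
# ...while grouping by region exposes it.
print(results.groupby("region")["correct"].mean())
```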
Evaluation Bias: benchmark datasets spur research, but they can be skewed; e.g., only 4.4% of IJB-A images are of dark-skinned women, and about 2/3 of ImageNet images come from the Western world (Shankar et al., 2017). A quick composition audit is sketched below.
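A composition audit might look like this; the subgroup labels and counts are toy placeholders, loosely echoing the IJB-A skew above:

```python
# Minimal sketch: auditing a face benchmark's demographic composition
# before trusting headline numbers on it. Counts are toy placeholders.
from collections import Counter

image_subgroups = (["lighter_male"] * 60 + ["lighter_female"] * 21 +
                   ["darker_male"] * 15 + ["darker_female"] * 4)

counts = Counter(image_subgroups)
total = sum(counts.values())
for group, n in counts.most_common():
    print(f"{group:>15}: {n / total:6.1%} of benchmark images")
# A model can fail badly on a ~4% subgroup and still post high
# overall accuracy on this benchmark.
```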
Case Study 2: Recidivism Algorithm Used in Prison Sentencing
Case Study 3: Online Ad Delivery
Bias in NLP
(Nothing to do with the course, but I'm researching this field these days.)
- Most of this research is only about English, though.
- Impact: translating gender-neutral sentences can introduce gender. "The person is a doctor. The person is a nurse." -> "그는 의사다. 그녀는 간호사다." ("He is a doctor. She is a nurse.")
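One common way this kind of bias is quantified is via pretrained word embeddings. A minimal sketch, assuming gensim is installed and the GloVe vectors can be downloaded (the occupation list is illustrative):

```python
# Minimal sketch: measuring gendered occupation associations in
# pretrained word embeddings, one common way NLP bias is quantified.
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-100")   # pretrained GloVe vectors

for job in ["doctor", "nurse", "engineer", "receptionist"]:
    # Positive gap: the occupation sits closer to "he" than to "she".
    gap = model.similarity(job, "he") - model.similarity(job, "she")
    print(f"{job:>13}: he/she similarity gap = {gap:+.3f}")
```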
Concept of "biased data" is often too generic to be useful
- Different sources of bias have different causes
Data, models and systems are not unchanging numbers on a screen. They’re the result of a complex process that starts with years of historical context and involves a series of choices and norms, from data measurement to model evaluation to human interpretation.
- Harini Suresh, "The Problem with Biased Data"
Five Sources of Bias in ML
- Representation Bias
- Evaluation Bias
- Measurement Bias
- Aggregation Bias (46:02)
- Historical Bias (46:26)
  - A few studies (47:13)
    - Racial bias, even when we have good intentions (New York Times) (47:10)
    - Gender (48:59)
Humans are biased, so why does algorithmic bias matter?
Algorithms & humans are used differently (a human is usually the final decision maker):
- Algorithms are often assumed to be accurate and objective
- There is often no way to appeal when there is an error
- They are applied at large scale
- They are cheap to run
Machine learning can amplify bias
Machine learning can create feedback loops.
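A toy illustration of the predictive-policing loop, under the deliberately simplified assumption that crime is recorded only where patrols are present:

```python
# Toy simulation of a predictive-policing feedback loop. Simplifying
# assumption (for illustration only): crime is recorded only where
# patrols are present. Both districts have IDENTICAL true crime rates.
TRUE_RATE = 0.1        # same underlying rate in both districts
TOTAL_PATROLS = 100.0

patrols = [50.0, 50.0]
for step in range(8):
    # Recorded crime scales with patrol presence, not true differences.
    recorded = [p * TRUE_RATE for p in patrols]
    if step == 0:
        recorded[0] += 1.0   # a one-time fluke: one extra recorded incident
    total = sum(recorded)
    # The "predictive" model allocates patrols where crime was recorded.
    patrols = [TOTAL_PATROLS * r / total for r in recorded]
    print(f"step {step}: district A {patrols[0]:5.1f} patrols, "
          f"district B {patrols[1]:5.1f} patrols")
# After the step-0 fluke, district A's share jumps and never corrects:
# the model's own deployments generate the data that justify them.
```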
Technology is power. And with that comes responsibility.
Solutions
- Analyze a project at work/school:
  - questions to ask about an AI system
  - the 5 types of bias (Suresh & Guttag)
- Datasheets for Datasets & Model Cards for Model Reporting
- Check accuracy rates on different sub-groups (see the sketch after this list)
- Work with domain experts & those impacted
- Increase diversity in our workplaces
- Advocate for good policy
- Be on the ongoing lookout for bias
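As a concrete version of the sub-group check above, a minimal disaggregated evaluation in plain Python (all records are toy data; in the spirit of Gender Shades and Model Cards):

```python
# Minimal sketch of a disaggregated evaluation: report error rates per
# subgroup instead of one aggregate number. Records are toy tuples of
# (group, true_label, predicted_label).
from collections import defaultdict

records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1),
    ("group_a", 0, 0), ("group_b", 1, 0), ("group_b", 1, 0),
    ("group_b", 0, 1), ("group_b", 1, 1),
]

stats = defaultdict(lambda: {"fp": 0, "fn": 0, "n": 0})
for group, y_true, y_pred in records:
    s = stats[group]
    s["n"] += 1
    s["fp"] += int(y_pred == 1 and y_true == 0)   # false positive
    s["fn"] += int(y_pred == 0 and y_true == 1)   # false negative

for group, s in sorted(stats.items()):
    print(f"{group}: {s['fp']}/{s['n']} false positives, "
          f"{s['fn']}/{s['n']} false negatives")
# Aggregate accuracy here is 5/8, but every error falls on group_b:
# the pattern at the heart of the COMPAS recidivism debate.
```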