Algorithmic Bias in Facial Recognition Technology on the Basis of Gender and Skin Tone

Sakshee Chawla

A Review of

2018 Proceedings of Machine Learning Research

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Researchers identify disparities in how accurately facial recognition technology classifies gender across genders and skin tones, indicating algorithmic bias.

Reviewed by Sakshee Chawla

Introduction

Artificial intelligence has permeated decision-making related to hiring, loan applications, and even the length of an individual’s prison sentence. Despite its many advantages, facial recognition algorithms that depend on artificial intelligence and machine learning can make errors with dangerous consequences, such as an individual being wrongfully accused of a crime after being misidentified.

This study, by Buolamwini and Gebru, examined three commercial Application Programming Interface (API)-based classifiers that infer gender from facial images and found that their accuracy is not balanced across genders and skin tones. The classifiers had the highest error rates for darker-skinned women and the most accurate results for lighter-skinned men.

Joy Buolamwini is a computer scientist and digital activist at the MIT Media Lab, where she works to promote ethical and inclusive technology and to address algorithmic bias. Timnit Gebru serves as a research scientist on Google’s Ethical AI team and completed a postdoc at Microsoft with the Fairness, Accountability, Transparency, and Ethics in AI group, where she examined algorithmic bias and the ethical implications of data projects.

Methods and Findings

The assessment of gender classification is limited to binary labels because the classification systems studied sort gender into two defined classes. Because the researchers wanted to conduct an intersectional analysis, they provided skin type annotations for the unique subjects in two existing datasets and built a new facial image dataset that is balanced by gender and skin type. The new dataset, called the Pilot Parliaments Benchmark (PPB), includes 1270 people from three African countries (Rwanda, Senegal, South Africa) and three European countries (Iceland, Finland, Sweden).
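One way to check whether a dataset is balanced by gender and skin type is to tally the share of subjects in each combined group. The following is a minimal sketch, not the authors' procedure: the function name `composition`, the group labels, and the toy subject list are all hypothetical and do not reflect the actual makeup of PPB.

```python
from collections import Counter

def composition(subjects):
    """Return the share of subjects in each (skin_type, gender) group."""
    counts = Counter(subjects)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

# Hypothetical toy subjects, NOT the real PPB composition.
toy_subjects = (
    [("darker", "female")] * 3
    + [("darker", "male")] * 3
    + [("lighter", "female")] * 3
    + [("lighter", "male")] * 3
)

shares = composition(toy_subjects)
for group, share in sorted(shares.items()):
    print(group, round(share, 2))
```

In this toy example every group holds the same share of subjects. A benchmark in which one group dominates would show up here as a visibly larger share for that group.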

Buolamwini and Gebru chose not to use race labels because phenotypic features vary significantly among individuals within a racial or ethnic category, and because these categories are unstable, varying across geographies and over time. Labeling faces by skin type allowed the researchers to examine the role of phenotypic attributes directly. Evaluating the classifiers on the benchmark revealed a bias that favors lighter-skinned males and disadvantages darker-skinned individuals, especially darker-skinned females. The classifiers also performed better on male faces than on female faces. The researchers suggested that darker skin may not be the only factor responsible for misclassification and that darker skin may instead be highly correlated with facial geometry or gender presentation standards.
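The intersectional analysis described above amounts to computing a classifier's error rate separately for each combination of skin type and gender, then comparing the groups. The following is a minimal sketch under stated assumptions: the function name `error_rates_by_group`, the record layout, and the toy records are hypothetical and are not data or code from the study.

```python
from collections import defaultdict

def error_rates_by_group(records):
    """Compute the error rate for each (skin_type, true_gender) group.

    records: iterable of (skin_type, true_gender, predicted_gender) tuples.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for skin_type, true_gender, predicted_gender in records:
        group = (skin_type, true_gender)
        totals[group] += 1
        if predicted_gender != true_gender:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Made-up toy records, NOT results from the study.
toy_records = [
    ("darker", "female", "male"),    # misclassified
    ("darker", "female", "female"),
    ("darker", "male", "male"),
    ("darker", "male", "male"),
    ("lighter", "female", "female"),
    ("lighter", "female", "female"),
    ("lighter", "male", "male"),
    ("lighter", "male", "male"),
]

rates = error_rates_by_group(toy_records)
# Gap between the worst-served and best-served groups.
gap = max(rates.values()) - min(rates.values())
for group, rate in sorted(rates.items()):
    print(group, rate)
print("largest gap:", gap)
```

Reporting one rate per combined group, rather than a single overall accuracy or separate rates by gender and by skin type alone, is what lets a disparity concentrated in one subgroup become visible.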

Conclusions

The researchers recommend closing the error gaps between male and female subjects, and between lighter- and darker-skinned subjects, in artificial intelligence classification systems. Because default camera settings are often optimized to expose lighter skin better than darker skin, under- and overexposed images lose crucial information, which makes them unreliable inputs for classification. A lack of representation of specific demographic groups in benchmark datasets can lead to frequent targeting of, and suspicion toward, people who are already marginalized. Inaccurate facial recognition systems often misidentify people of color, women, and young people, which can result in perilous and even life-threatening circumstances. The authors highlight a critical need to ensure that these systems are accurate across phenotypic and demographic groups, both to protect the general public and to keep the technology accountable and transparent.

This research represents a significant development in gender classification benchmarking by introducing the first intersectional demographic and phenotypic assessment of facial gender classification accuracy. Additional research may investigate gender classification on an inclusive benchmark of unconstrained images and extend intersectional error analysis of facial recognition technology to support algorithmic fairness, transparency, and accountability.

Topics

Intersectional Analysis

Gender
