How Human Bias is creating Racist Data in AI

Part of AI4DA’s mission is to prepare Artificial Intelligence (AI) for a better future, one free of bias and inequality. This aligns with the UN’s Sustainable Development Goal 10, “Reduce inequality within and among countries” [1]. But what is the role of AI in this regard?

What is racial bias?

Bias in AI refers to situations where machine learning algorithms discriminate against particular groups based on race, gender, biological sex, nationality, or age. [2] As an example, Marr wrote that a hiring algorithm might pick a white, middle-aged man to fill a vacancy because other white, middle-aged men were previously hired for the same position. He would then be hired not because he was the best person for the job, but because the data the algorithm processed was biased towards white, middle-aged men. [3]

It is also important to know that racial biases are often associations that individuals make unconsciously. This means that the individual is likely not aware of the biased association. Individuals may not be racist, but their perceptions have been shaped by their experiences, and the result is biased thoughts or actions. [4] Racial bias in AI is therefore a form of implicit bias, which refers to the attitudes or stereotypes that unconsciously affect an individual’s understanding, actions, and decisions. [5]

AI and racial bias

AI and Machine Learning (ML) systems use algorithms to receive inputs, organize data, and predict outputs within predetermined ranges and patterns. Algorithms may seem like “objective” mathematical processes, but this is far from the truth. Racial bias seeps into algorithms in several subtle and not-so-subtle ways, leading to discriminatory results and outcomes. Algorithms can get the results you want for the wrong reasons. An automated algorithm often finds patterns that you could not have predicted. Automation poses dangers when data is imperfect, messy, or biased. An algorithm might latch onto unimportant data and reinforce unintentional implicit biases. [6]

How does racial bias materialize in our lives?

One example of AI reinforcing racial inequality is the COMPAS software. COMPAS was used in 2013 to forecast which criminals were most likely to offend again. An analysis of the software compared 7,000 risk assessments of people arrested in a Florida county with how often they actually offended again. The algorithm turned out to be racially biased and unreliable at predicting reoffending. Black offenders were almost twice as likely as white offenders to be labelled likely reoffenders even though they did not go on to reoffend, while white offenders were more often rated low risk even when they did end up reoffending. [7]
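The disparity described above is a difference in error rates between groups. The following is a minimal sketch, not the method used in the COMPAS analysis: it computes the false positive rate (labelled high risk but did not reoffend) and the false negative rate (labelled low risk but did reoffend) per group. The function name, the group labels "A" and "B", and all records are made up for illustration.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute per-group false positive and false negative rates.

    records: iterable of (group, predicted_high_risk, reoffended) tuples.
    """
    counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
    for group, predicted_high_risk, reoffended in records:
        if predicted_high_risk and not reoffended:
            counts[group]["fp"] += 1  # flagged, but did not reoffend
        elif predicted_high_risk and reoffended:
            counts[group]["tp"] += 1
        elif not predicted_high_risk and reoffended:
            counts[group]["fn"] += 1  # rated low risk, but reoffended
        else:
            counts[group]["tn"] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        did_reoffend = c["fn"] + c["tp"]
        rates[group] = {
            "false_positive_rate": c["fp"] / did_not_reoffend if did_not_reoffend else None,
            "false_negative_rate": c["fn"] / did_reoffend if did_reoffend else None,
        }
    return rates


# Made-up records for two hypothetical groups; not real COMPAS data.
toy_records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6
    + [("A", True, True)] * 7 + [("A", False, True)] * 3
    + [("B", True, False)] * 2 + [("B", False, False)] * 8
    + [("B", True, True)] * 5 + [("B", False, True)] * 5
)
toy_rates = error_rates_by_group(toy_records)
for group, group_rates in sorted(toy_rates.items()):
    print(group, group_rates)
```

In this toy data, group A is wrongly flagged twice as often as group B, while group B is more often wrongly rated low risk. A model can look reasonably accurate overall and still distribute its mistakes unevenly in this way.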

A health care algorithm used in the U.S. also demonstrated racial bias. The algorithm was supposed to identify patients who needed extra medical care, but its results favored white patients over black patients. The reason was that it used patients’ previous healthcare spending as a proxy for medical need. This was a poor reading of the historical data: income and race are highly correlated, so relying on only one of the correlated variables led the algorithm to produce inaccurate results. [8]

The biggest problem with this approach was that less wealthy patients simply couldn’t afford more extensive treatment, so they chose less expensive options. Estimating healthcare needs from the amount of money spent on treatment was therefore an exclusionary approach, biased towards wealthier people. Considering variables other than the cost of treatment to estimate a person’s medical needs reduced bias by 84%. [9]
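To illustrate why the choice of proxy matters, here is a small sketch under stated assumptions. It is not the algorithm described above: the patients, the groups "X" and "Y", the field names, and the idea of using a count of chronic conditions as a measure of need are all hypothetical. It selects the same number of patients for extra care twice, once ranked by past spending and once ranked by the need measure, and compares who gets selected.

```python
def select_top(patients, key, k):
    """Return the k patients with the highest value for the given field."""
    return sorted(patients, key=lambda p: p[key], reverse=True)[:k]


def share_of_group(selected, group):
    """Fraction of the selected patients who belong to the given group."""
    return sum(1 for p in selected if p["group"] == group) / len(selected)


# Made-up patients. Group X has similar health needs to group Y
# but lower past spending (for example, because of lower income).
toy_patients = [
    {"group": "X", "conditions": 5, "spending": 3000},
    {"group": "X", "conditions": 4, "spending": 2500},
    {"group": "X", "conditions": 2, "spending": 1000},
    {"group": "Y", "conditions": 4, "spending": 6000},
    {"group": "Y", "conditions": 3, "spending": 5000},
    {"group": "Y", "conditions": 2, "spending": 4000},
]

by_spending = select_top(toy_patients, "spending", 3)
by_need = select_top(toy_patients, "conditions", 3)

print("Share of group X when ranking by spending:", share_of_group(by_spending, "X"))
print("Share of group X when ranking by conditions:", share_of_group(by_need, "X"))
```

With the spending proxy, nobody from group X is selected in this toy data, even though the two sickest patients belong to it. Ranking by the need measure selects both of them.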

In 2018, research by Joy Buolamwini and Timnit Gebru revealed for the first time the extent to which many commercial facial recognition systems (including IBM’s) were biased. [10] Following this, IBM quit the facial-recognition business altogether. In a letter to Congress, chief executive Arvind Krishna condemned the use of such software for mass surveillance, racial profiling, and violations of basic human rights and freedoms. [11]

In image-recognition systems, the bias arises because the training data that machines learn from consists mostly of samples from people of the same skin color. The solution requires more attention during dataset preparation: training datasets need to represent people of different skin colors and genders equally, which is crucial for the algorithms.
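A first step in that kind of dataset preparation is simply measuring who is in the data. The sketch below assumes each training sample carries a group annotation. The labels "lighter" and "darker", the 80/20 split, and the 30% threshold are illustrative choices, not values from the research cited above.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the dataset as a fraction of all samples."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def underrepresented(labels, min_share):
    """List the groups whose share of the dataset falls below min_share."""
    return sorted(g for g, share in group_shares(labels).items() if share < min_share)


# Made-up group annotations for a hypothetical face dataset.
toy_labels = ["lighter"] * 80 + ["darker"] * 20

print(group_shares(toy_labels))
print(underrepresented(toy_labels, min_share=0.3))
```

A check like this only covers groups that are annotated in the first place, and balanced counts alone do not guarantee balanced accuracy, so it complements rather than replaces per-group evaluation of the trained model.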

Fairness of AI

AI free of bias has the potential to tackle a range of social issues, such as enhancing social mobility through fairer access to financing and healthcare, mitigating exclusion and poverty by making judicial systems more objective, enabling bias-free testing in university admissions, and much more. We should expect fair AI solutions from both technology companies and authorities.

Content strategist and co-founder of Rasa Advising, Julie Polk, has come up with four essential tips you should keep in mind to combat bias in AI:

  • It’s not enough to edit your results; they’ll show up again unless you address the underlying bias that produced them.
  • Require gender-neutral language in your style guide.
  • Vet your data.
  • Don’t get sucked into solutions at the expense of inclusion. [12]

By Diksha Tiwari, Artificial Intelligence 4 Development Agency



[1] United Nations SDG Goal 10.
[2] Blier, Noah. “Bias in AI and Machine Learning: Sources and Solutions”. Lexalytics. August 15, 2019.
[3] Marr, Bernard. “Artificial Intelligence Has A Problem with Bias, Here’s How to Tackle It”. Forbes. January 29, 2019.
[4] Maryfield, Bailey. “Implicit Racial Bias.” Sociology. December 2018.
[5] SpearIt. “Implicit Bias in Criminal Justice: Growing Influence as an Insight to Systemic Oppression.” The State of Criminal Justice 2020 (American Bar Association 2020).
[6] Fawcett, Amanda. “Understanding racial bias in machine learning algorithms.” Educative. June 8, 2020.
[7] Alex. “Racial bias and gender bias example in ai systems.” Reuters. September 2, 2018.
[8] Kantarci, Atakan. “Bias in AI: What it is, Types & Examples, How & Tools to fix it.” AI Multiple. February 2, 2021.
[9] Sigmoidal. “AI solutions against bias and discrimination – do 2020 machines give a new chance for humanity?” Accessed February 11, 2021.
[10] Gender Shades. “The Gender Shades project evaluates the accuracy of AI powered gender classification products.” Accessed February 11, 2021.
[11] Peters, Jay. “IBM will no longer offer, develop, or research facial recognition technology.” The Verge. June 8, 2020.
[12] Lindberg, Oliver. “Removing Bias in AI — Part 2: Tackling Gender and Racial Bias” Adobe. January 13, 2020