A new study carried out by researchers at Stanford University showed that the algorithms used in automatic speech recognition systems misunderstood 19% of the words spoken by white speakers but 35% of the words spoken by Black speakers, nearly double the error rate.

The study found that this gap is at its widest for Black men, and that it amounts to a racial disparity that leaves Black speakers more vulnerable to being misunderstood by the many applications that rely on automatic voice identification and analysis, particularly in the area of crime detection.

According to the study, the problem appears in all five of the major commercial systems tested, produced by IBM, Google, Amazon, Microsoft and Apple.

The study, titled "Racial Disparities in Automated Speech Recognition", was recently published on the website of the journal PNAS (www.pnas.org), the official journal of the United States National Academy of Sciences and one of the world's most comprehensive multidisciplinary scientific journals, publishing more than 3,300 papers annually.

The full text of the paper is openly available to researchers around the world, and it carries the names of eight researchers: Allison Koenecke, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Connor Toups, John Rickford and Dan Jurafsky.

ASR

The algorithms studied are used in automatic speech recognition systems, known as ASR, which are in turn the cornerstone of a wide variety of applications that convert spoken language into text. Chief among these are the digital voice assistants built into mobile devices, home appliances and in-car systems, as well as closed-captioning applications, hands-free computing, and automatic transcription of video content. The last two are especially useful for people with hearing or motor impairments, and ASR is also used in digital dictation platforms for healthcare, among other settings.
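As an illustration of what such a system does, the short sketch below hands a recorded clip to an off-the-shelf speech-to-text engine and prints the transcript it returns. It uses the open-source Python package SpeechRecognition purely as an example; the file name and the choice of recognition back end are assumptions for illustration, not details taken from the study.

```python
# Minimal speech-to-text illustration with the open-source
# `speech_recognition` package (pip install SpeechRecognition).
# The audio file name is a placeholder; any short WAV clip works.
import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.AudioFile("interview_clip.wav") as source:
    audio = recognizer.record(source)  # read the whole clip into memory

try:
    # Send the clip to one of several supported recognition back ends.
    transcript = recognizer.recognize_google(audio)
    print("Machine transcript:", transcript)
except sr.UnknownValueError:
    # The engine could not make sense of the speech at all.
    print("Audio could not be transcribed")
```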

Over the past few years, the quality of these systems has improved thanks to advances in deep learning and in the big-data analyses used to train them. However, there has been concern that these tools do not work equally well for everyone and carry a degree of racial bias, a problem that has recently surfaced in many other advanced machine-learning applications, such as facial recognition and natural language processing.

The study sample

The study analyzed two contemporary corpora of recorded conversations. The first was the Corpus of Regional African American Language, a set of sociolinguistic interviews with dozens of Black individuals who speak African American Vernacular English to varying degrees.

These interviews were conducted at three US locations: Princeville, a rural area in eastern North Carolina; Rochester, in western New York; and the District of Columbia.

The second dataset, Voices of California, contains the speech of California residents and was drawn from interviews recorded across the state in both rural and urban areas. The analysis focused on two locations: Sacramento, the state capital, and Humboldt County, a predominantly white rural community in the north of the state.

In both datasets the interviews were transcribed by human experts. The original recordings contain the voices of both the interviewer and the interviewee, so the study relies on a subset of audio clips, 5 to 50 seconds long, that contain only the interviewee. These clips were then matched across the two datasets based on the speaker's age and gender and the length of the clip.

After matching, there were 2,141 clips from each dataset, averaging 17 seconds per clip and totaling 19.8 hours of audio. In the matched data, 44% of the clips came from male speakers, and the average speaker age was 45 years.
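To picture how such matching might work, the sketch below pairs each clip from one corpus with a comparable, not-yet-used clip from the other, requiring the same speaker gender and a similar age and duration. The field names and tolerances are invented for illustration; the study's own matching procedure is not reproduced here.

```python
# Illustrative sketch of matching clips across two corpora by speaker
# gender, speaker age and clip duration. Field names and tolerances
# are assumptions, not taken from the study itself.
def match_clips(corpus_a, corpus_b, age_tol=5, dur_tol=5.0):
    """Pair each clip in corpus_a with an unused, comparable clip in
    corpus_b; clips are dicts with 'gender', 'age' and 'duration'."""
    matches, used = [], set()
    for clip in corpus_a:
        for i, cand in enumerate(corpus_b):
            if i in used:
                continue
            if (cand["gender"] == clip["gender"]
                    and abs(cand["age"] - clip["age"]) <= age_tol
                    and abs(cand["duration"] - clip["duration"]) <= dur_tol):
                matches.append((clip, cand))
                used.add(i)
                break
    return matches
```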

Results

System performance was evaluated using the word error rate, known as WER, a metric that measures the discrepancy between the machine transcription and the human transcription. The results showed that the average WER for Black speakers was much higher than for white speakers.
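Concretely, the word error rate is the number of word substitutions, deletions and insertions needed to turn the machine transcript into the human reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance; the example sentences are invented for illustration.

```python
# Minimal word error rate (WER) sketch: word-level edit distance
# between the human reference and the machine transcript, divided by
# the number of words in the reference.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + cost)
    return d[len(ref)][len(hyp)] / len(ref)

# Invented example: one substitution and one deletion in a 6-word reference.
print(wer("she went to the corner store", "she want to the store"))  # ≈ 0.33
```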

Microsoft's system achieved the best overall performance, with a word error rate of 0.27 (27%) for Black speakers versus 0.15 (15%) for white speakers, while the share of audio clips that could not be usefully transcribed was about 2% for white speakers and 20% for Black speakers.

Apple's system recorded the worst overall performance, with a word error rate of 0.45 (45%) for Black speakers and 0.23 (23%) for white speakers. Averaged across the five systems tested, the error rate was 0.35 (35%) for Black speakers and 0.19 (19%) for white speakers, meaning that errors in recognizing the words of Black speakers were roughly twice as frequent as errors for white speakers.

Causes of racism

The researchers attributed the racial disparity in the performance of these algorithms to the datasets on which they were trained: those datasets contain far more speech, dialects and voices from white speakers than from Black speakers, so the systems learned to recognize the sounds and words of white speakers more accurately than those of Black speakers.

In light of this, the researchers urged the five companies that produce these algorithms to collect better data on the languages and dialects spoken by Black people, including regional and local dialects, in order to eliminate a disparity that would otherwise expose Black people, and possibly other ethnic groups whatever their language, to risks and burdens that are not of their own making, especially when speech recognition is used in professional settings such as job interviews, criminal and legal proceedings, and some aspects of healthcare.

Facial recognition

Research conducted by scientists at the Massachusetts Institute of Technology (MIT) found that facial recognition algorithms show a similar bias. Amazon's algorithms, for example, made virtually no mistakes when identifying the gender of light-skinned men, but made more mistakes when the person was a woman, and even more when the person was dark-skinned. The study demonstrated similar racial and gender biases in facial recognition software from Microsoft, IBM and the Chinese company Megvii.