It is a dismissal that is not going down well.

Google's decision, on Wednesday, December 2, to part ways with Timnit Gebru continues to make waves in the tech and scientific worlds.

By Tuesday, December 8, more than 4,000 engineers and researchers had signed a petition criticizing the Internet giant and expressing their support for the researcher, who co-led Google's work on ethical issues in artificial intelligence.

They lament, first of all, the departure of one of the few African-American specialists in a field dominated by white engineers.

Timnit Gebru is also one of the most respected researchers in the field: she was one of the first to demonstrate how algorithms can be discriminatory and how they can reinforce the racist biases of those who use them.

Her work on facial recognition prompted Microsoft, Amazon and IBM to review their collaborations with the police in 2018.

The myth of always more

But this dismissal also allegedly followed "unprecedented censorship", write the authors of the petition.

Google blocked the publication of a scientific paper co-authored by Timnit Gebru, which tackled the dangers of one of the hottest areas in AI today: natural language.

This is the effort to create bots capable of holding conversations with humans that are as coherent as possible.

This field has seen spectacular advances over the past year, notably with the release of GPT-3, an ultra-powerful conversational model capable of discussing anything and everything with unprecedented fluency.

>> Read also on France 24: How GPT-3 pushes the limits of artificial intelligence

For Timnit Gebru, it is urgent to "take a step back" in this area.

Her censored article, which leaked onto the Internet over the weekend, is a plea to change the way we "feed" algorithms so they become smarter in their understanding of language.

"There is nothing revolutionary in what she writes, but it is a very well written and argued summary of all the risks of this discipline", explains Jessica Heesen, specialist in ethical issues in the treatment of data at the University of Tübingen, contacted by France 24.

The central argument underlying all of Timnit Gebru's analysis is "that we should not think that more data means a smarter system," summarizes Laurence Devillers, professor of artificial intelligence at Sorbonne University, member of the National Pilot Committee for Digital Ethics and author of "Emotional Robots: Health, Surveillance, Sexuality... and Ethics in All of This" (Éditions de l'Observatoire), contacted by France 24.

Discrimination at all levels

The trend currently dominating the development of natural language AI consists of making models ingest ever more data, generally gleaned from the Internet, in order to "enrich" their vocabulary and allow these algorithms to discover associations of ideas and meanings on their own.

GPT-3's feat of learning 500 billion words, "the equivalent of over 150 times the entire Wikipedia encyclopedia (in all languages)", had been hailed as a major achievement.

But this bulimia comes at a price, Timnit Gebru reminds us.

First of all, an environmental one.

Training an AI like GPT-3 consumes the equivalent of the CO2 emissions of a car traveling 700,000 km.

These innovations therefore contribute to global warming, the first victims of which are often the poorest countries.

This, Timnit Gebru argues, is where the first discrimination in this race for data lies.

"Is it fair to ask residents of the Maldives (which may be underwater by 2100) or the 800,000 Sudanese affected by historic floods to pay the price required to train ever better language models? in English when no one does the same with Dhivehi (language spoken in the Maldives) or Sudanese Arabic? ”she asks.

Not to mention that the products built around these conversational agents (such as Google Home or Amazon's Alexa) are sold to the wealthiest consumers, those least exposed to the consequences of global warming.

A little beauty, a lot of ugliness


Moreover, the larger the databases, the less humans can control what the AI feeds on.

"Large linguistic models can lead to a situation where the data on learning are too important to be documented. Nothing will be known of their origins", assures Laurence Devillers. 

That is because the algorithms learn just as readily from reference sites, such as established media outlets, as from forums of dubious reputation.

"Why trust these language models fed not by selected texts, but by Internet data representing a lot of fake news?" Asks the French researcher.

"Feeding AI systems with the beauty of the world but also with its ugliness, and its cruelty, but expect it to reflect only beauty, is a fantasy", summed up the American sociologist Ruha Benjamin in 2019.

The risk is that this ugliness prevails over the rest.

Timnit Gebru points out that the loudest and most prolific voices on the Web are well-off white users who, deliberately or not, transmit their prejudices in what they write.

Biases that ever-hungrier AIs will absorb and perpetuate.

For Google's former star of ethical AI, we should stop always wanting to think bigger.

The solution may lie in less massive databases, which humans can control better, and smarter AI.

"Research efforts should then focus more on how algorithms find correlations between words to do more with less," said Jessica Heesen, ethics researcher at the University of Tübingen.

For her, it is this aspect of Timnit Gebru's article that must have displeased Google, because "it calls into question the group's development model".

The Internet giant is, indeed, in a privileged position, with access to the entire Web.

But if the focus "is on data quality rather than quantity, smaller competitors will be better placed to rival Google," says the German scientist.

Without ever naming her former employer, Timnit Gebru has thus delivered, in this article, a critique of Google's entire AI philosophy.

The Internet giant knows the risks of this race for ever larger databases, but allegedly lets it slide because, along with the other members of the Gafa club, it is among the only players with the means to take advantage of it.

But in doing so, "they are playing sorcerer's apprentice," concludes Laurence Devillers.

The risk is that the AIs of tomorrow will reflect very little of the famous "beauty of the world".
