Google wants to win the artificial intelligence race. On Wednesday, December 6, the American tech giant launched Gemini, its new artificial intelligence (AI) model, which is meant to help it outpace its competitors, OpenAI (the creator of ChatGPT) and Microsoft.
"It's our most consistent, most gifted and also the most general AI model," Eli Collins, a vice president at Google DeepMind, the California-based company's AI research lab, said in a press presentation.
He then presented a video in which a user shows objects, drawings, and videos to Gemini. The AI system comments aloud on what it "sees", identifies objects, plays music, and answers questions requiring a certain degree of analysis, justifying its "reasoning".
For example, faced with the image of a plastic duck that has to choose between two paths – the one on the left leading to another duck drawn on the paper and the one on the right to a threatening-looking bear – Gemini suggests the path on the left because "it's better to make friends than enemies."
The video also shows that Gemini can recognize references with very little context, such as a scene from the movie The Matrix acted out by a person pretending to dodge bullets in slow motion.
Race to Generative AI
The new model is "multimodal from the start, it has sophisticated reasoning capabilities, and it can code at an advanced level," Collins said.
According to him, Gemini is the first AI model to outperform human experts on an industry-standard test, the "MMLU," which is used to evaluate the ability of these computer programs to reason in different fields, from mathematics to history and law.
Since the launch of ChatGPT a year ago, the Silicon Valley giants have been engaged in a frantic race for so-called generative AI, which can produce text, images, or lines of code of a quality comparable to human work from a simple query in everyday language.
Google, the leader in AI, was caught off guard by the phenomenal success of ChatGPT and responded with its own chatbot, Bard. But it all comes down to the models, the computer systems that underpin these applications. They were first trained on text collected online and are now fed all kinds of data so they can process requests containing images and converse aloud with their users.
OpenAI said in September that it had equipped ChatGPT with speech and vision to make it "more intuitive."
A first version available on December 13
Gemini "is another step towards our vision of bringing you the best AI collaborator in the world," Sissie Hsiao, Google's vice president of Bard, said Wednesday.
Bard is expected to become more capable as of Wednesday, though still only for written requests, and only in English. Other functions and formats, such as advanced help with solving math problems, will have to wait until 2024.
Less well known than ChatGPT, Bard now has a chance to regain ground on its rival, which has become a victim of its own success: in mid-November, overwhelmed by demand, OpenAI paused subscriptions to the paid version.
On December 13, Google will also give its cloud (remote computing) customers access to a first version of Gemini, including developers who use its Vertex AI platform to build their own AI applications.
In this field, the internet giant is in direct competition with Microsoft, OpenAI's main investor and the world's number 2 cloud company, behind Amazon.
The two U.S. groups spent the year adding generative AI tools to their respective products (search engine, office and productivity software, cloud platform, etc.).
"This new era of models represents one of the greatest scientific and technical efforts we have undertaken as a company," Google CEO Sundar Pichai said in a statement.