Google, an American IT giant, has announced that it will begin offering a new technology to power generative AI that can understand not only text but also images and audio simultaneously, and can give highly accurate answers and suggestions.
This kind of generative AI is called multimodal AI, and competition to develop it in the United States is growing even more intense.

On the 6th, the American IT giant Google announced "Gemini," a new technology to power generative AI.

It can understand text, images, video, and audio at the same time, and its improved accuracy of understanding allows it to give more appropriate answers and suggestions.

The company plans to offer three models for different applications, beginning with the English version of its conversational generative AI, "Bard."

Use of generative AI has expanded rapidly since OpenAI, an American startup backed by IT giant Microsoft, released ChatGPT in November last year.

Generative AI that can understand images and audio at the same time as text is called multimodal AI. Google is seen as aiming to compete with OpenAI and Microsoft by focusing on this field, and development competition in the United States is becoming more intense.