Europe 1 with AFP, 6:53 p.m., July 12, 2023
Google's medical artificial intelligence chatbot has passed the US medical licensing exam, but its results remain significantly lower than those of humans. Healthcare is an area in which the technology has already shown tangible progress.
Google's medical artificial intelligence chatbot has passed the US medical licensing exam. However, its results still fall short of those of humans, according to a study published Wednesday in Nature. Last year's release of ChatGPT, whose developer, OpenAI, is backed by Google's rival Microsoft, set off a race among tech giants in the burgeoning field of AI. Healthcare is an area where the technology has already shown tangible advances, with some algorithms proving capable of reading medical scans better than humans.
Med-PaLM was the first large language model to pass the exam, according to Google
Google unveiled its AI tool dedicated to medical questions, called Med-PaLM, in a preprint in December. Unlike ChatGPT, it has not been opened to the general public. Google says Med-PaLM was the first large language model, a type of AI system trained on large amounts of human-produced text, to pass the USMLE (US Medical Licensing Examination).
Passing this exam allows candidates to practice medicine in the United States. The passing score is about 60 percent. In February, a study found that ChatGPT performed quite well on the exam. In a new peer-reviewed study published Wednesday in the journal Nature, Google researchers said Med-PaLM scored 67.6 percent on USMLE-style multiple-choice questions. These results are "encouraging, but still inferior to those of humans," the study said.
The model will be tested on "administrative tasks"
To identify and reduce so-called "hallucinations," the term for a blatantly false response produced by an AI model, Google said it has developed a new benchmark. Karan Singhal, a researcher at Google and lead author of the new study, told AFP that his team had tested a newer version of the model. Med-PaLM 2 reportedly scored 86.5 percent on the USMLE exam, surpassing the previous version by nearly 20 percentage points, according to a study published in May that was not peer-reviewed.
According to the Wall Street Journal, Med-PaLM 2 has been in testing at the prestigious Mayo Clinic research hospital since April. Any testing of Med-PaLM 2 will not be "clinical, face-to-face, or likely to harm patients," Singhal said. Instead, the model will be tested on "administrative tasks that can be automated relatively easily, with low stakes," he added.