Even more data, even more computing power, even more complex neural networks: in Artificial Intelligence (AI), algorithmic systems of unprecedented size are currently causing a stir - and concern.

They can answer sophisticated questions, write texts, or even continue program code in a way that amazes even many professionals - and this precisely because their knowledge is apparently not highly specialized but spans a wide variety of fields at the same time.


The best known goes by the abbreviation GPT-3, which stands for "Generative Pretrained Transformer 3".

Researchers at the American AI company OpenAI, which is backed among others by Elon Musk and the IT group Microsoft, built the model and trained it on an enormous amount of data.

GPT-3 has around 175 billion parameters; this number gives a rough indication of how powerful the system is.
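Where such a figure comes from can be illustrated with a rough back-of-the-envelope calculation. The sketch below uses the architecture figures reported for GPT-3 (96 transformer layers, a hidden width of 12,288, a vocabulary of roughly 50,000 tokens) together with the common approximation of 12 × layers × width² weights for the transformer blocks; it ignores biases and layer norms, so it is an estimate rather than an exact count.

```python
# Back-of-the-envelope estimate of GPT-3's parameter count, based on the
# architecture figures reported for the model (96 layers, hidden width 12288,
# ~50k-token vocabulary, 2048-token context). The 12 * layers * width^2 rule
# of thumb covers the attention and feed-forward weight matrices only.

n_layers = 96          # transformer blocks
d_model = 12_288       # hidden width
vocab_size = 50_257    # BPE vocabulary
context = 2_048        # maximum sequence length

block_params = 12 * n_layers * d_model**2             # attention + MLP weights
embedding_params = (vocab_size + context) * d_model   # token + position embeddings

total = block_params + embedding_params
print(f"~{total / 1e9:.0f} billion parameters")       # prints: ~175 billion parameters
```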

When OpenAI first unveiled GPT-3 last year, it was the largest artificial neural network ever created.

It did not hold that record for long; there are now much larger systems.

At the beginning of this year, Google presented its language model “Switch Transformer” with 1.6 trillion parameters.

The current record holders are no longer on the American west coast but in China, whose leadership sees AI as a key technology not only for commercial success but also for strategic and political strength - and is mobilizing billions for it. The Beijing Academy of Artificial Intelligence (BAAI) recently presented its AI system WuDao 2.0, which, according to its own statements, has 1.75 trillion parameters and is thus ten times as large as GPT-3. BAAI chairman Zhang Hongjiang made clear the ambition behind WuDao, which translates as "enlightenment"; he is quoted as saying: "The way to general Artificial Intelligence lies in large models and large computers."

The presentation of WuDao 2.0 also caught the attention of researchers in Europe and America because, in contrast to GPT-3, it is a so-called multimodal system - one trained not only on text but also on extensive image material, and which should accordingly be able to handle an even wider range of tasks.

The American competitors OpenAI (with DALL-E and CLIP) and Google (with LaMDA and MUM), in turn, have offerings of their own in this field.

Special supercomputers

Researchers in Europe are alarmed by this series of new giant models for several reasons, even though there are quite a few initiatives in this area in Europe as well. On the one hand, understanding and processing language has been a central AI discipline with far-reaching effects for decades. The quality of search engines, social networks, chatbots, translations and recommendation algorithms in general already depends on it today - and will even more in the future, regardless of whether the text is written or spoken.

On the other hand, the AI systems mentioned require not only enormous amounts of well-curated language data but also enormous computing resources. “The hardware requirements are such that standard GPU data centers are not suitable. These special technical requirements must already be taken into account when planning the data centers,” says a position paper recently published by the German AI Association (KI Bundesverband) together with leading academics and company representatives under the project title LEAM, which stands for "Large European AI Models". The authors warn: “Because of the effort required, especially in terms of computing capacity, these models can only be built by companies with strong finances and resources. In the medium term, this will lead to central AI functionality being provided by a few market participants and being used by accessing programming interfaces in the cloud.”

And they are not alone. "There is an acute danger that AI developments and applications will become dependent on American models, and the effects on European technological sovereignty, but also on the economy, can hardly be overstated," says Holger Hoos, AI professor at Leiden University and co-founder of the European research network CLAIRE. Analogous to the physics research center of the same name, he advocates a “CERN for AI” in which researchers would have access to the computing and data infrastructure needed to develop such models in Europe themselves.

Wolfgang Wahlster, the former head of the German Research Center for Artificial Intelligence (DFKI), assesses the danger somewhat less dramatically, but also calls for more to be done. “A large multilingual language data collection is important for Europe. The preparation, curation and validation of such language data is, however, time-consuming work that will certainly require a sum in the hundreds of millions to cover all official European languages.”

Yann LeCun, on the other hand, Facebook's chief AI scientist and professor at New York University, points out that France, for example, has already responded. The state has set up the powerful “Jean Zay” supercomputer, which is specially designed for machine learning, and has launched a project together with the AI company Hugging Face to develop a GPT-3-like model for French. Like Wahlster, LeCun, who is one of the international pioneers of deep learning, advises against overestimating the new giant models. He finds them impressive, but at the same time very far from real general intelligence. “It's not a good dialogue system, not a good question-answer system, not a good medical diagnostic system,” he says of GPT-3. To be continued.
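For readers who want a sense of how such language models are used in practice, here is a minimal sketch with Hugging Face's open-source transformers library. It loads the small, freely available GPT-2 model purely as a stand-in; the French GPT-3-like model mentioned above is not named in the article, and whatever model that project produces would simply replace the model identifier here.

```python
# Minimal sketch: text generation with a GPT-style model via Hugging Face's
# open-source "transformers" library. GPT-2 is used here only as a small,
# freely available stand-in for the kind of model discussed in the article.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator("Artificial intelligence in Europe", max_length=40, num_return_sequences=1)
print(outputs[0]["generated_text"])
```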