China News Service, March 25. The 2024 Annual Meeting of the China Development Forum is being held on March 24-25, 2024. The "Artificial Intelligence Development and Governance Seminar" took place on the afternoon of March 24. Zhang Hongjiang, founder and founding chairman of Beijing Zhiyuan Research Institute (the Beijing Academy of Artificial Intelligence), said that the future direction of today's popular multi-modal large models should not be limited to generating and editing videos, making films or producing TV series. From a technical perspective, such a model can serve as the brain of a machine, recognize the surrounding world, and power future autonomous driving, thereby turning today's information systems and model systems into the action systems of the future.

  In Zhang Hongjiang’s view, the most exciting thing about action systems, and about large multi-modal models in particular, is that they can give robots a brain. For example, if a robot is instructed to pick out an extinct animal from a pile of toys on a table, it can reason about and identify the toys, such as a tiger, a lion, a bird and a dinosaur, and correctly pick out the dinosaur. Robots in the past could not do this. You could tell a robot what to grab and it would grab it, but if you gave it an abstract concept, it could not act on it. Similarly, if you tell a current robot that you are thirsty, it will pick a bottle of water out of a pile of objects. Both demonstrations show that once robots are equipped with large multi-modal models, they no longer simply follow the literal instructions they are given, but can think about what those instructions mean. This is the prototype of the autonomous robots of the future, and it shows how much multi-modal large models can already astonish us today.

  All of this is the result of the continuous development of artificial intelligence over the past 70 years. Artificial intelligence has gone through three waves of development, and within the third wave there has been a further surge of deep learning over the past 10 years. Large models emerged in 2020, just as the third wave was at a trough, and the arrival of ChatGPT 3.0 was a turning point that led to Sora and a series of Chinese large models. In the past ten months, we have seen rapid progress from language models to multi-modal models and visual models, and onward toward the large models of the future. What is the principle behind this? A very important factor is that artificial intelligence today is no longer treated as an algorithm, but as a system. Today's models are not only large in scale, but also general-purpose. The driving force behind this is the "scaling law". It is this scale effect that allows models to solve, one after another, problems that could not be solved in the past. For example, when a language model's scale is only a few billion, it can solve only some natural language processing problems. But once it exceeds 500 billion, it can solve essentially all natural language problems. This is the capability that scale brings.

  What is at the core of the large model? Zhang Hongjiang thinks it is a new operating system. In the traditional PC era, output was produced by computation on the CPU; today, the core of large model computation is no longer the CPU but the GPU, which is why it can be called a new operating system. Today, all Internet platform companies are working hard to build large models. The fundamental reason is that without large models, they will no longer be platform companies in the future. Looking at the developments of the past few years, and especially the past 18 months, we can summarize a new Moore's Law: model capability improves by one generation every one to two years, the cost of training falls to 1/4 of its previous value every 18 months, and the cost of inference falls to 1/10 of its previous value every 4 months. This new Moore's Law will drive the rapid adoption, development and application of large models. Another indicator is the meteoric rise of Nvidia's stock over the past 12 months, which has made it one of the three most valuable companies in the world. The entire large model industry chain is now developing and growing rapidly. Large models will empower our software tools, our lives, and our work.
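
  To give a sense of what these rates would mean if they held steady, the short Python sketch below compounds them over a longer period. It is only an illustration: the function name `cost_fraction` and the assumption of smooth exponential decline are not from the speech, and the chosen time spans (3 years, 1 year) are arbitrary examples.

```python
def cost_fraction(months: float, factor: float, period_months: float) -> float:
    """Fraction of the original cost left after `months`, assuming the cost
    is multiplied by `factor` once every `period_months` (illustrative model)."""
    return factor ** (months / period_months)


# Training cost: 1/4 every 18 months, so after 36 months it is (1/4)^2 = 1/16.
training_after_3_years = cost_fraction(36, 1 / 4, 18)

# Inference cost: 1/10 every 4 months, so after 12 months it is (1/10)^3 = 1/1000.
inference_after_1_year = cost_fraction(12, 1 / 10, 4)

print(f"Training cost after 3 years: {training_after_3_years:.4f} of the original")
print(f"Inference cost after 1 year: {inference_after_1_year:.4f} of the original")
```

  Under these assumed rates, training would cost about one sixteenth as much after three years, and inference about one thousandth as much after a single year.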

  Zhang Hongjiang said that artificial intelligence has entered a new stage of development. The stage represented by large models amounts to a fourth technological revolution (the previous three being the agricultural, industrial, and information revolutions). It will bring large gains in efficiency, make our lives far more convenient, and create enormous value and one new industry after another. At the same time, we must also recognize the globally catastrophic consequences that artificial intelligence may cause. To avoid such dangers, we need to draw red lines and improve governance mechanisms, and we also need to develop more safety technologies to keep artificial intelligence from crossing those red lines. To achieve this, the most important thing is to maintain and strengthen cooperation on safety between the international scientific community and the policy community. Only in this way can we avoid such a disaster.