The "guide" to the metaverse?

Virtual idols' million-level cost dilemma remains to be solved

  "My name is Liu Yexi."

  Recently, the debut video of the new beauty makeup artist Liu Yexi went viral online. The special effects in the video have a high-end feel, the virtual person is vivid, her hair texture and hand movements are almost indistinguishable from a real person's, and the interaction between the virtual person and real people is very smooth.

Less than 30 hours after the video of this monster-catching beauty makeup artist was released, her fan count soared to 1.3 million.

As of November 23, the number of likes for the first video reached 3.366 million, and the number of fans has reached 5.36 million.

  Xie Duosheng, the founder of Chuangyi Technology, the company behind Liu Yexi, told a Shell Finance reporter that the two-minute video is only a preview of Liu Yexi's debut, and subsequent stories will be released on Douyin as single episodes.

At present, the large middle-office team at Chuangyi Technology Co., Ltd. that supports Liu Yexi has more than 150 people, while the small front-office team has fewer than 10.

  In 2007, when "Hatsune Miku" sang for the first time with an electronically synthesized voice and was called "Her Royal Highness" by otaku in Japan's Akihabara, many people thought it was just a carnival for two-dimensional fans.

Now that "Luo Tianyi" has walked into Li Jiaqi's live broadcast room to sell goods and "Liu Yexi" has attracted millions of fans in a day, virtual humans have quietly entered the lives of the general public.

  This year, as the concept of the metaverse became popular, virtual humans, one of its elements, were also pushed into the spotlight.

"In 2018, when we entered this track, many people did not understand what a virtual person is, but this year it seems that everyone understands, and many investors are also looking for relevant investment targets," Chen Yan, founder of Next World Culture Company, told a Beijing News Shell Finance reporter.

  Shell Finance reporters interviewed virtual human practitioners and learned that virtual humans currently fall into three categories: hyper-realistic virtual humans, virtual idols, and virtual human interactive products, each of which has achieved commercial value in a different field.

However, production costs running from thousands to tens of thousands per second, the technical difficulty of real-time rendering, and AI hurdles that have yet to be overcome have also become bottlenecks in the development of virtual humans.

  ●Build

  Some virtual idols attract more than ten million fans, but "a few minutes of video cost tens of thousands of yuan"

  Holding idol posters in one hand, lined up, shouting "Happy Birthday" in unison... Around November 2, students from the University of Chinese Academy of Sciences and Shanghai Jiaotong University, and from Cambridge and New York University overseas, spontaneously uploaded birthday videos to station B (Bilibili).

The fan support lineup was strong. The star being celebrated, however, is not a real person in the traditional sense but A-Soul's virtual idol "Carol".

  A-Soul is a virtual idol group launched by Lehua Entertainment in November 2020. It was initially resisted by fans of existing virtual anchors. Shortly after the launch, however, its exquisite modeling and the strong professional skills of its "people in the middle" (the real performers behind the avatars) turned many opponents from detractors into fans.

  Within a year, A-Soul has ranked among the top virtual anchors on station B, and the member "Jiaran" has reached 1.26 million fans there.

A-Soul's popularity has even spawned many subculture memes such as "I really like you", "Cute pinch", "Take me away" and so on.

  Behind the virtual idol group's breakout into the mainstream lies an industry-wide boom.

According to figures disclosed by Chen Rui, the chairman of station B, there were more than 32,000 virtual anchors on station B in 2019.

  However, a rush to test the waters does not mean that everyone will see the dawn of success.

A set of public data shows that as of August 18, 2021, among the 3,472 virtual anchors with relatively high attention on station B, 1,827 had a monthly revenue of 0 yuan. In other words, more than half did not earn a cent.

  "In fact, as long as you design a set of 3D models and purchase a set of motion capture equipment, you can become an entry-level virtual idol." Liu Wen (a pseudonym), an observer of the virtual idol industry, told a Shell Finance reporter that the technology used by ordinary virtual game anchors is based on facial motion capture. In other words, as long as you put on a 2D or 3D "skin", you can become a virtual anchor and broadcast live like a real one.

  However, Liu Wen said that both motion capture equipment and 3D modeling cost money. The better the effect, the higher the cost of equipment and models, which has left many virtual anchors unable to make ends meet.

  Li Hao, the founder of Mars Culture, launched a virtual idol business as early as 2017. At present, his virtual character "Silent Sauce" has more than 18 million fans across all platforms.

He told a Shell Finance reporter that the production processes for virtual idols are currently broadly similar. "First, use modeling tools to create 3D models and iterate continuously, then use motion capture technology to drive the actions of the character models, and find a 'person in the middle'."

  Li Hao told reporters that the vast majority of virtual anchors currently use "people in the middle" and motion capture technology for live broadcasts.

"'Hatsune Miku' and 'Luo Tianyi' were originally released with electronically synthesized voices, but since the birth of the second batch of virtual idols in Japan, 'people in the middle' have been used in large numbers. This is because virtual idols need to sound more like humans when they speak. Technically, electronically synthesized voices now come fairly close to real people in speech, but in singing there are still obvious technical barriers: they cannot effectively handle breathing, airflow and similar effects, which makes the audience feel that 'something's wrong'. Using motion capture technology can also effectively reduce production costs."

  In fact, live streaming a virtual idol is an order of magnitude more difficult than ordinary live streaming.

It is understood that facial motion capture and body motion capture are currently different technologies. In extreme cases, therefore, when a virtual idol appears in a live broadcast room, two people have to wear motion capture equipment, one on the face and one on the body.

In addition, technicians are required to merge the two motion capture animations and then combine the result with the voice of the "person in the middle" to produce what the audience finally sees in the live broadcast room.

  In April last year, the gimmick of "Luo Tianyi" and Li Jiaqi sharing a live broadcast drew a wave of attention.

During the live broadcast, there was a mishap in which Li Jiaqi could hear "Luo Tianyi" but the audience could not.

  Li Hao told reporters that a virtual idol usually needs a team behind it. "Taking 'Silent Sauce' as an example, the content team has ten people, including directors, scriptwriters, motion capture personnel, voice actors, etc. On the technical side, animators also need to modify the 'Silent Sauce' model according to the video being released. An ordinary short video may cost around 6,000 yuan, and a few minutes of customized video costs tens of thousands of yuan."

  While the technology and animation teams behind a virtual idol can be replaced, the "person in the middle" is undoubtedly the idol's soul.

Le Element launched its virtual idol project "Battle Diva!" in September 2018.

After more than two years of operation, the "people in the middle" of six singers "graduated" in February this year and released a farewell video on station B.

Later, when the operator announced that it was recruiting new "people in the middle", many fans left messages saying: "Changing the 'person in the middle' is not acceptable; the old one is still better."

  "We are deeply bound to the 'person in the middle'. If the virtual idol loses its 'person in the middle', we will have to stop for at least one month, because even if we find a new voice actor and train the voice, chatting is fine, but singing is easy to see through," Li Hao said.

  ●Bottlenecks

  Behind Liu Yexi's cash burn: high costs and technical barriers

  As early as six months ago, Chuangyi Technology sensed the metaverse opportunity and began to create the Liu Yexi virtual human IP. Every aspect, from market positioning, character setting, character production and storyline creation to shooting and post-production, has been constantly polished.

Liu Yexi's oriental face, Chinese-style makeup and identity as a monster catcher fit the prevailing national trend. At the same time, the fluorescent elements in her makeup, the sci-fi special effects and the cyberpunk-style color grading in post-production cater to the preferences of Generation Z.

  For Liu Yexi's popularity, Xie Duosheng, founder of Chuangyi Technology, said that he was not surprised.

The team also discussed this during its review: 50% of Liu Yexi's popularity is due to the concept of the metaverse, 30% is due to her 2.5-dimensional setting and technical level, and 20% is due to video creativity and world-building.

At present, most of the virtual people on the market are operated as virtual idols, which can be roughly divided into types such as cultivation, personification, and two-dimensional girl groups. The worlds these virtual people inhabit are mostly two-dimensional or three-dimensional.

Chuangyi Technology positions Liu Yexi as 2.5-dimensional: the second dimension is pure CG, the third dimension is the real world, and the 2.5th dimension sits between the two.

  For now, it will be difficult for Liu Yexi and subsequent virtual human IPs to monetize in the short term. Whether they can sustain long-term operations by burning money alone remains an open question.

However, in Chuangyi's strategic plan, Liu Yexi and subsequent virtual people have two main routes to monetization: the traditional IP economy and the future commercial possibilities of the metaverse.

  Shell Finance reporters noticed that most virtual idols that currently broadcast live are drawn in the two-dimensional style, while virtual idols such as "Liu Yexi" and "Ling", which look and feel close to real people, are more often called "hyper-realistic virtual people". Virtual people of this type often do not broadcast live but appear on social platforms such as Weibo, Douyin, and Xiaohongshu. Like Internet celebrities, they use photos and videos to attract fans and receive commercial endorsements.

  "We do not do traditional two-dimensional content, nor do we touch the field of virtual anchors," Chen Yan, the producer of "Ling" and the founder of Next World Culture Company, told a Shell Finance reporter. "The virtual humans we launched are mainly used in pan-entertainment and branding, and their fans are mostly people who pay close attention to fashion and lifestyle. If the fans of two-dimensional virtual idols can be compared to the users of station B, then the fans of hyper-realistic virtual people are more like the users of Xiaohongshu."

  A Shell Finance reporter's review found that for "Ling", a Chinese-style hyper-realistic virtual person, most of the commercial advertisements received resemble those of fashion stars, including luxury goods and beauty brands.

  Compared with two-dimensional virtual idols, the video production cost of hyper-realistic virtual people is also a level higher.

In an interview with a Shell Finance reporter, the "Liu Yexi" team stated that in the more than half a year before the launch of "Liu Yexi", the investment in research and development, personnel, and technology "far exceeded one million."

  Chen Yan told the Shell Finance reporter that in order to cover costs, the company plans its products very strictly. Before each product is released, it goes through an internal evaluation of five or six steps, covering questions such as the scenarios in which each IP will be used and what level the IP is meant to reach.

"Taking 'Ling' as an example, we break videos down into 15-second and 1-to-2-minute formats, and plan a major event once a quarter. Otherwise, with weekly or daily updates, the cost could not be covered at all."

  In fact, an earlier realistic virtual person in China can be traced back to May 2018, when NExT Studios and Epic jointly launched the high-fidelity digital virtual person Siren.

During the research and development process of this project, one can see the expensive side of the virtual human industry.

  Tencent Interactive Entertainment engineer David once used the word "difficult" to describe the Siren project in the article "The Birth of the Virtual Digital Man Siren": "All calculations must happen in real time. The virtual human program runs at 60 frames per second, so all calculations must be completed within a period of 16 milliseconds." In the end, top teams from four countries overcame the software and hardware bottlenecks, and the project was completed.
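
  The 16-millisecond figure in the quote follows directly from the frame rate: one second divided by 60 frames leaves about 16.7 milliseconds per frame. The short Python sketch below only illustrates that arithmetic; the function name `frame_budget_ms` is made up for this example and does not come from the Siren project.

```python
# Illustrative arithmetic only: the time available per frame at a given frame rate.
# frame_budget_ms is a hypothetical helper, not part of any real engine API.
def frame_budget_ms(frames_per_second: float) -> float:
    """Return the per-frame time budget in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


if __name__ == "__main__":
    # At 60 frames per second, every calculation for one frame must fit in this window.
    print(f"{frame_budget_ms(60):.2f} ms per frame")
```

  Halving the frame rate to 30 frames per second would double the budget, which is one reason offline or lower-frame-rate rendering is easier than real-time rendering at 60 frames per second.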

  At present, most hyper-realistic virtual human projects still struggle to broadcast live in real time.

"The real-time (virtual people) on the market now are basically stylized characters. Realistic styles are usually videos made with offline CG processes, and some only use Unreal Engine as a renderer. We have always adhered to the real-time + realism line, because our goal is to achieve real-time digital humans that can be deployed in real-time interactive scenarios," said Ge Cheng, deputy director of NExT Studios New Technology R&D Center.

  With the iteration of technology, the application scenarios of hyper-realistic virtual humans will become more and more extensive.

On June 20th, Xinhua News Agency and Tencent jointly launched the digital astronaut and digital reporter "Xiao Yu". This hyper-realistic virtual person will take on the mission of "on-site reporting" for manned space projects and planetary exploration projects that are difficult for ordinary reporters to reach.

Ge Cheng told a Shell Finance reporter that the NExT digital human team has been kept to no more than 20 people.

  ●Change

  Will virtual humans + AI become the "guide" to the metaverse?

  With the development of technology, the boundaries between hyper-realistic virtual humans, virtual idols and even smart interactive products have gradually blurred. Whether the virtual human field can achieve technical "unification" in the future leaves the market a lot of room for imagination.

  Tencent Interactive Entertainment stated that when "black swan" events such as the epidemic keep people apart, there will still be more and more interaction and connection between them. Virtual people and virtual worlds are not just entertainment scenarios; they also need to take into account the sociality of human beings and their dependence on one another, so that digital humans can deliver greater social value.

In addition to the identities of digital astronaut and digital journalist, "Xiao Yu" will also have more interactions with users and young people in the future.

After being known and loved by more and more friends, "Xiao Yu" can also become one of the "virtual idols" of contemporary young people representing mainstream values.

  "In fact, compared to the beautiful appearance of virtual people, it is more important to be able to have continuous communication with users." Chen Yan said that his vision is, while maintaining existing business lines, to pursue more intelligent and scenario-based interaction between users and virtual IPs, and to develop into a "virtual human ecosystem" company.

  "At present, Next World Culture is cooperating with top AI companies such as 'XiaoIce' to develop more intelligent virtual products. That is to say, virtual humans are IPs, but many AI functions will also be added to them to meet the detailed needs of various scenarios," he said.

  The development of technology such as character modeling, real-time rendering, speech recognition, and action recognition has allowed many practitioners to see the future application prospects of virtual humans.

At the opening ceremony of the 2021 World Artificial Intelligence Conference, four virtual people appeared on the same stage as the real host: the virtual idol of station B "Ling Yuan", Baidu's "Xiaodu", Xiaomi's "Xiaoai Classmate" and Microsoft's "Xiaobing".

Among them, the last three virtual people all have their own application scenarios. For example, users can request songs on Xiaoai speakers by calling out "Xiaoai Classmate".

  The Shell Finance reporter saw that many virtual human products are already being promoted in the market. For example, companies such as iFLYTEK and Xiangxin Technology have successively launched B2B (to-business) virtual human products.

  A salesperson for virtual human products told Shell Finance reporters that using the ready-made virtual news anchor his company provides costs about 1 million yuan a year. The virtual anchor can automatically generate suitable voices and expressions according to the input text, thereby converting a text report into a video broadcast.

"If you customize a virtual anchor in your own image, it will cost millions, because we need to perform motion capture and algorithm analysis on the models you provide."

  "Many people look at virtual humans and only see the appearance, but I have always felt that virtual humans are just a carrier, and what ultimately drives the development of virtual humans is human needs. In the real world, when a person has physical defects, they can re-choose their self-image in the virtual world by building their own virtual avatar. Through AI technology, a virtual human IP should also establish a more stable relationship with humans," Chen Yan said.

  In an interview with a Shell Finance reporter, Jingdong Group Vice President Mei Tao said that the AI technology behind virtual digital humans may have disruptive effects in the future.

"Take JD's own digital humans as an example. There are both 2D and 3D cartoon digital humans, as well as real-person digital humans. Digital humans involve a wide range of technologies, including vision and speech recognition, speech synthesis and dialogue, and graphics. In the future, we hope that digital humans can really accomplish some tasks, such as chatting with children, accompanying the elderly, staffing citizen hotlines, providing intelligent customer service, etc. For this reason, we are building digital humans with our own characteristics, based on our rich practice of intelligent customer service in Jingdong's e-commerce scenarios. We hope that a relatively mature standardized service will take shape in the next one to two years. As you can see, there are also many start-up companies working on digital humans. After several years of development, it can be said that digital human technology and products are about to reach an explosive period."

  It is understood that at the SIGGRAPH Asia booth in 2018, "Siren" once demonstrated an AI-driven sample, but at that time the AI could only manage a single round of dialogue.

  "Now AI can achieve multiple rounds of dialogue and is more intelligent, so the future of AI-driven digital humans is promising. The integration of the virtual and the real is the future trend. Digital humans can gradually resonate emotionally with the real world and become a part of society. Think about the Transformers, Saint Seiya, Doraemon, and Calabash Brothers that we watched when we were young. They are all virtual characters, and they became IPs that influenced our generation. As digital humans flourish, it can be foreseen that there will be more and more virtual characters that act as bonds in real society. The 'person in the middle' is a topic that digital humans cannot escape at present, but I believe that in the future, AI will be able to better drive digital humans in certain areas," Ge Cheng said.

  "Virtual idols can broadcast live, but they need a 'person in the middle'. Hyper-realistic virtual people cannot broadcast live and 'have no soul'. The smart voice assistant in a mobile phone can interact, but the level of interaction is still too low." Liu Wen told a Shell Finance reporter that if the virtual idols, hyper-realistic virtual humans and virtual assistants already on the market could be integrated, that may be the future of virtual humans on the road to the metaverse.

  Beijing News Shell Finance reporter Luo Yidan and Li Menghan