The "little secrets" of AI painting are all in this article

  How are real photos turned into anime-style ("two-dimensional") characters?

Why does AI sometimes "flop" and draw cats and dogs as humans?

  ◎Our reporter Jin Feng

  Upload a picture, or enter some simple keywords, and the system can automatically generate a cartoon image... Recently, AI painting has become popular on Internet social platforms.

  AI painting, as the name suggests, means using artificial intelligence to create paintings, and it is one of the typical application scenarios of AI-generated content.

Its main principle is to collect a large number of existing works, analyze their content and style characteristics through algorithms, and finally generate new works, so algorithms are the core of AI painting.

  At present, AI painting tools that generate images "out of thin air" still flop at every turn: one second the AI turns your photo into an anime-style portrait full of artistic flair, and the next it depicts your pet cat or dog as a cute girl or a muscular man.

  In fact, AI painting has long been popular all over the world.

The first publicly exhibited painting created by artificial intelligence, "Portrait of Edmond de Belamy", was sold at Christie's auction house in 2018 for $432,500. The portrait was generated automatically by an algorithm trained on 15,000 portraits dating from the 14th to the 20th century.

  How does AI painting realize the "out of thin air" drawing?

Besides entertainment, what are the potential applications of AI painting?

  From "Generating Images with Images" to "Generating Images with Speech"

  In 2022, "Space Opera Theater" ("Théâtre D'opéra Spatial"), a work created with artificial intelligence, went viral.

The work won first prize in the digital art / digitally manipulated photography category at a competition for emerging digital artists held in Colorado, USA.

Its composition, color scheme and picture details are exquisite.

However, the creator of this work is not an artist, but a game designer from Colorado, USA.

  Using an AI creation tool called "Midjourney", the game designer first entered several keywords covering light source, composition, atmosphere and so on, and obtained 100 works. The designer then spent about 80 hours retouching the pictures, selected 3 works, and finally printed them on canvas.

  The "art" works generated in a short period of time through simple interactive dialogues caused human artists to start a debate on "whether AI painting entries are cheating".

This massive debate has also made the public intuitively aware of how far AI painting has developed today.

  "The use of artificial intelligence in artistic creation can be traced back to the end of the last century, when AI painting technology was called the 'image stylization filter'," said Dong Weiming, a researcher at the National Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences. The earliest AI painting methods were relatively simple: image processing algorithms transformed the pixels of an ordinary photo geometrically or in color, and different parameters were then adjusted to simulate an oil painting or watercolor style.

  After about 20 years of development, AI painting has progressed unevenly across input types, or modalities. "Image-to-image" generation has the longest history, followed by the recently popular "text + image" generation.

Of course, there are also teams that have developed the technology to generate images from speech.
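
The early "image stylization filter" idea described above, transforming pixel colors and tuning a parameter, can be illustrated with a minimal sketch. It assumes a photo stored as nested lists of (r, g, b) tuples; the function name `posterize` and the `levels` parameter are illustrative choices and do not come from any tool mentioned in this article.

```python
def posterize(image, levels=4):
    """Snap every RGB channel to one of `levels` evenly spaced values.

    `image` is a list of rows; each row is a list of (r, g, b) tuples
    with channel values in 0..255.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)  # distance between two allowed channel values
    return [
        [tuple(int(round(round(c / step) * step)) for c in pixel) for pixel in row]
        for row in image
    ]


# A made-up 1x2 "photo": fewer levels give flatter, more paint-like colors.
photo = [[(12, 100, 250), (90, 200, 30)]]
coarse = posterize(photo, levels=2)
finer = posterize(photo, levels=4)
```

Lowering `levels` merges nearby colors into flat patches, which is roughly how a simple filter can hint at a painted look without any learning involved.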

  AI painting relies mainly on three technologies

  According to Dong Weiming, AI painting is currently realized mainly through image style transfer, image-text pre-training models, and diffusion models.

  "Image style transfer refers to image processing algorithms that extract the content features of an input real image and the style features of a reference artistic image, and then fuse the two to generate a new artistic image," Dong Weiming said. For example, if an exterior photo of the Palace of Fine Arts in San Francisco is fused with a work by Monet, a founder of Impressionism, the result is a painting of the Palace of Fine Arts that looks as if Monet had painted it.

It was this technique that was used in the original AI painting.
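
One simple way to picture the "fusion" of content features and style features is to keep the pattern of the content features while giving them the mean and spread of the style features. The sketch below uses that statistic-matching idea (known in the research literature as adaptive instance normalization). It is a swapped-in illustration, not necessarily the method Dong Weiming describes, and the feature values are made up.

```python
from statistics import fmean, pstdev


def match_style_statistics(content, style, eps=1e-5):
    """Shift and scale content features to the style features' mean and spread."""
    c_mean, c_std = fmean(content), pstdev(content)
    s_mean, s_std = fmean(style), pstdev(style)
    # Normalize the content features, then re-color them with style statistics.
    return [s_std * (x - c_mean) / (c_std + eps) + s_mean for x in content]


# Made-up feature values standing in for one channel of a neural network.
content_features = [1.0, 2.0, 3.0]     # "what is in the photo"
style_features = [10.0, 20.0, 30.0]    # "how the painting looks"
fused = match_style_statistics(content_features, style_features)
```

The fused values keep the ordering of the content features but take on the mean and spread of the style features. In a real system, a decoder network would then turn such features back into an image.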

  However, in Dong Weiming's view, image style transfer mostly relies on the Generative Adversarial Network (GAN) algorithm, whose biggest problem is that the generated paintings are not very artistic: the brushstrokes and composition feel different from those of real paintings. As a result, AI painting remained little known for a long time.

  While image style transfer was still struggling with the aesthetics of its output, the emergence of image-text pre-training models accelerated the rise of AI painting.

  "With an image-text pre-training model, you only need to enter a sentence or upload a picture with a distinct style, and the algorithm can 'align' the image features with the text features. The content of the generated painting resembles that of the uploaded picture, and its artistic quality is much higher than that of images produced by image style transfer," Dong Weiming said. He cited the Contrastive Language-Image Pre-training (CLIP) algorithm as an example of such a model: its ability to align image and text features, combined with existing generative models, enables "image-to-image" generation or "image + text" generation.

  However, Dong Weiming said frankly that the promotion of image-text pre-training models is also controversial. Some people point out that such models need a large number of graphics processing units (GPUs) for training, which consumes a great deal of electricity and is very costly, while their application scenarios are still not clear enough.

Others think that such a model might one day be developed into a general-purpose artificial intelligence model able to handle more algorithmic tasks, but this will take time to verify.
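
The "alignment" of image and text features mentioned above can be pictured as placing both in the same vector space and measuring how close they are. The sketch below uses cosine similarity on hand-written toy vectors. A real image-text model such as CLIP produces these vectors with trained neural encoders, so the numbers and the helper `best_caption` here are purely hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_embedding, text_embeddings):
    """Return the caption whose embedding lies closest to the image embedding."""
    return max(
        text_embeddings,
        key=lambda caption: cosine_similarity(image_embedding, text_embeddings[caption]),
    )


# Toy embeddings (assumed values, not output of any real model).
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "an oil painting of a cat": [0.8, 0.2, 0.1],
    "a photo of a city street": [0.1, 0.9, 0.3],
}
match = best_caption(image_embedding, text_embeddings)
```

A generative model guided this way can be steered toward images whose features score high against the user's text, which is one reason a single sentence can be enough to direct the result.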

  It is true that no technology is perfect, which also provides infinite motivation for human beings to explore more advanced technologies.

The diffusion model, currently the most popular, is one such technology.

  "The latest AI painting technology uses the diffusion model. Randomly sampled noise is fed into the model, which then tries to generate an image through denoising," Dong Weiming said. The diffusion model also has weaknesses: because of its limited ability to recognize image content, its difficulty in fully understanding the input text, and biases in the training data, it sometimes produces works that are "neither fish nor fowl".

In addition, the speed of generating pictures by the diffusion model is relatively slow, and it is not yet possible to generate pictures in real time.
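
The idea of starting from random noise and denoising toward an image, and the reason generation is slow, can be shown with a toy loop. This is only the shape of the process under strong assumptions: a real diffusion model uses a trained neural network to predict the noise at each step, whereas the stand-in below simply knows the target. The names `toy_denoise`, `steps` and `strength` are illustrative.

```python
import random


def toy_denoise(target, steps=50, strength=0.2, seed=0):
    """Start from Gaussian noise and remove a fraction of the 'noise' each step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise, one value per pixel
    for _ in range(steps):
        # Stand-in for the trained network: the noise is whatever separates
        # the current values from the target image.
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        x = [xi - strength * ni for xi, ni in zip(x, predicted_noise)]
    return x


# A made-up three-pixel grayscale "image".
target_image = [0.2, 0.8, 0.5]
result = toy_denoise(target_image)
```

Each step depends on the output of the previous one, so the steps cannot run in parallel. In a real model every step is a full pass through a large network, which helps explain why generation is relatively slow.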

  Internet governance and the metaverse may be potential application areas

  The current application scenarios of AI painting focus more on social software.

Recently, the AI painting tools that have become popular on Chinese social networks are mostly mini programs and apps.

As AI painting mini programs gained popularity, the short video platform Douyin also quickly launched AI painting special effects.

At the same time, Tencent has previously launched the "QQ Small World AI Painter" activity, and Baidu has also launched the first AI art and creative assistance platform "Wen Xin Yi Ge".

  With AI, everyone can be an artist.

The emergence of AI painting echoes the words of the Swiss artist Paul Klee: "Art is not to reproduce the visible, but to make the invisible visible." "AI has now realized this goal perfectly: through machine computation, people can draw many scenes that cannot be seen in the real world," Dong Weiming said. He envisions that in the near future AI painting may find even richer application scenarios.

  "The Internet is now full of harmful content, which often appears in the form of paintings in order to evade supervision. Many current content recognition models can accurately identify real pictures but lack training data on artworks with harmful content, so they identify such content inaccurately. Perhaps AI painting technology could be used to build up data on artworks with inappropriate content and use it to train recognition models, improving both the safety supervision of Internet content and the accuracy of identification," Dong Weiming suggested.

  In Dong Weiming's view, as a form of art presentation, AI painting will also give birth to new business models in industries such as metaverse, design, and cultural tourism.

For example, AI painting is already being deployed in AI-assisted creation, short video, film and television production, and the metaverse, because these fields all depend on creativity. With simple feature inputs, AI painting can help creators preview their ideas, or even create works directly.

  However, Dong Weiming did not deny that there are still copyright disputes in AI paintings.

The core of AI painting is the model, and training the model requires the use of a large amount of image and text data.

For unauthorized pictures, it is still difficult to define the copyright ownership of the images generated after calculation.

"Some painters have very obvious styles. If the painter's paintings are used to train the algorithm model to generate works, who will own the final copyright?" Dong Weiming's question is exactly the real problem faced by most AI paintings.

  AI painting has set off a collective frenzy of capital. The hope is that one day it can shed the embarrassment of crude imitation, of "drawing a tiger with a cat as the model", and truly serve artistic creation and create more value.