Screenshot from a clip generated with Sora: OpenAI and Sam Altman have published a few such example videos

Photo: OpenAI

Several text-to-video generators are now available online. The start-up Runway is considered a pioneer, and Google and Meta are also experimenting in this area. Now OpenAI, currently the most closely watched company in the field of artificial intelligence (AI), has presented a project of its own. OpenAI chief Sam Altman announced on X that access to the new AI model, called Sora, will initially be limited to selected creatives. Experts are also to examine possible security risks before the program is released.

Videos created by Sora can be up to one minute long, with a resolution of up to 1080p (Full HD). On its website, OpenAI has published several example clips intended to demonstrate what the tool can do. One of them shows a stylishly dressed woman walking through a city center.

According to OpenAI, the example videos were created entirely by artificial intelligence from text input, which is published alongside each clip. The prompt for the video with the woman specifies, among other things, that she should wear a leather jacket and a red dress, and that the street should be reminiscent of Tokyo, with lots of neon signs reflected in puddles.

Other videos show, among other things, a town in California during the gold rush and mammoths running through the snow. The videos have no sound. OpenAI writes that earlier research on the text-to-image generator Dall-E and on ChatGPT was incorporated into the development of Sora.

“Sam, please don’t make me homeless.”

Many users on social networks were enthusiastic about the clips, which Sam Altman also shared there himself. MrBeast, the most successful YouTuber in the world, who is known for elaborately produced videos, wrote to Altman: "Sam, please don't make me homeless." With that, OpenAI has already succeeded in promoting itself.

Steven Levy, one of the best-known tech journalists, wrote in "Wired" that he was impressed by the "astonishing photorealism" of the AI-generated clips. However, he emphasized that the longest video he was shown lasted 17 seconds. Towards the end of his article, Levy predicted that it will be a very long time, if ever, before text-to-video threatens actual filmmaking. In his view, coherent films cannot be made by stringing together 120 one-minute Sora clips, because the model does not always respond to requests in the same way. Continuity cannot be achieved like this.

But Levy sees potential for Sora to change platforms like TikTok. “This model will enable the average person who produces videos for social media to create very high-quality content,” he quotes a researcher from OpenAI.

The tech journalist also mentions that the OpenAI researchers did not want to tell him how long it took to render the AI videos. When pressed, however, they said the waiting time was closer to "just getting a burrito" than to "taking a few days off."

Restrictions likely to be similar to those for Dall-E 3

According to "Wired", the tool will probably be subject to the same content restrictions as images created with Dall-E 3. This means that OpenAI will actively try to prevent users from creating pornography, violent videos or footage showing celebrities. Technical measures are also meant to prevent attempts to pass off historical-looking AI videos as real recordings, for example.

How easily the videos shown so far can be recognized as AI-generated varies. With some, small image errors or glitches may only become apparent after several viewings. In other clips, however, such as one of an older lady blowing out candles on a birthday cake, it quickly becomes obvious that details such as the hands of the people celebrating look unrealistic.

“AI really has no idea what hands are or how they work,” comments a reporter from “The Verge.” OpenAI itself writes about the video: "Simulating complex interactions between objects and several people is often a challenge for the model, which sometimes leads to funny results."

When and how OpenAI will make Sora available to a larger audience is still unclear.

mbö/dpa