How Sora, OpenAI’s new AI that creates Oscar-worthy videos, works

by time news

We already know that OpenAI’s chatbots are able to pass the bar exam without needing to study law. Now – just in time for the Oscars – a new application from the artificial intelligence company, Sora, promises to deliver videos that have nothing to envy of feature films, without ever having attended a directing course. For now, Sora is only a research product, and it will be made available to a group of creators and security experts who will probe it for vulnerabilities. OpenAI plans to open it up to all aspiring filmmakers at an as yet unspecified date, but has nevertheless decided to present the application as a preview.

Several companies – from giants like Google to startups like Runway – have already launched text-to-video artificial intelligence projects, capable of creating videos based on users’ textual instructions. But OpenAI says Sora stands out from other applications in the field for its astonishing photorealism (an aspect I personally have not found in its competitors) and for its ability to produce longer videos, which can last up to one minute. The researchers I spoke to wouldn’t reveal exactly how long it takes to generate a video, but they said it was closer to the time needed to go out for a bite to eat than to a couple of days.

What Sora can do

OpenAI didn’t let me submit prompts to Sora myself, but it shared four examples that show the potential of its new AI (though none comes close to a minute in length, topping out at 17 seconds). The first comes from a detailed prompt that would appear to have been written by an obsessive screenwriter: “A magnificent, snow-covered Tokyo is in turmoil. The camera moves through the crowded streets of the city, following several people enjoying the snow and shopping at nearby stalls. Gorgeous sakura petals fly in the wind along with snowflakes.”

A video generated by OpenAI’s Sora

Courtesy of OpenAI

The result is a convincing video set in a very recognizable Tokyo, in that magical time of year when snowflakes and cherry blossoms coexist. The virtual camera moves as if attached to a drone, following a slowly strolling couple. One of the passers-by is wearing a mask. To their left, cars whiz by on a road that runs alongside a river, while to the right, customers enter and exit a series of small shops.

The video isn’t perfect, though. Watching it several times, you realize that the protagonists – a couple strolling on a snowy sidewalk – would have faced a dilemma if the virtual camera had kept rolling: the sidewalk they are walking on appears to be a dead end, which would have forced them to step over a small guardrail to reach a strange pedestrian crossing on their right. Despite this small hiccup, the Tokyo example is an extraordinary exercise in world-building. In the future, production designers will be divided between those who consider tools like these a powerful aid and those who see them as an existential threat to the profession. The people depicted in this video – generated entirely by a digital neural network – are never shown in close-up and make no sounds, but Sora’s team says that in other cases its footage has included virtual actors displaying real emotions.
