OpenAI teased, and repeatedly delayed, the release of Sora for nearly a year. On Tuesday, the company finally unveiled a fully functional version of the new video-generation model destined for public use and, despite the initial buzz, many early users don't seem overly impressed. Neither am I.
Introducing Sora, our text-to-video model.
Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. https://t.co/7j2JN27M3W
Prompt: “Beautiful, snowy… pic.twitter.com/ruTEWn87vf
— OpenAI (@OpenAI) February 15, 2024
The company first introduced Sora last February to critical acclaim for its hyperrealistic video renderings. “Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt,” OpenAI wrote in its announcement blog at the time. “The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world.”
OpenAI keeps dropping more insane Sora videos
These are 100% AI generated
9 reality bending videos
1. Elephant made out of leaves pic.twitter.com/tPsHNGbFPS
— Linus Ekenstam (@LinusEkenstam) March 18, 2024
The company released more Sora-generated footage in March, this time of an elephant made of leaves, further hyping the model's capabilities. The Sora program subsequently ran into a series of development delays, which OpenAI's chief product officer Kevin Weil blamed in a recent Reddit AMA on the "need to perfect the model, get safety/impersonation/other things right, and scale compute." At the same time, The Information reported that early iterations of Sora suffered from poor performance and struggled to stay focused on the user's prompts, requiring up to 10 real-world minutes to generate a minute-long clip. The model was also recently leaked online by a group of disgruntled beta testers who objected to OpenAI's "art-washing" practices; in response, the company swiftly had the group's unauthorized UI removed from Hugging Face.
While OpenAI was tweaking and refining Sora’s performance, the company’s competition was eating its lunch. Adobe’s Firefly AI, Runway’s Gen 3 Alpha, Meta’s Movie Gen, and Kuaishou Technology’s Kling (not to mention countless free-to-use options) proliferated throughout the internet this past year, with many offering clips of superior quality and faster inference times than what OpenAI had repeatedly promised.
On Tuesday, OpenAI officially unveiled the production-ready version of Sora and released it to its $20-a-month Plus and $200-a-month (lol) Pro subscribers. Or, at least, the company did for a few hours. As technology commentator Ed Zitron noted on Bluesky Wednesday, “mere hours — maybe even less — after saying Sora was out, OpenAI stopped accepting new account registrations with no clear timeline. OpenAI bait-and-switched the entire tech media. There’s no way this company can afford to have their video generator available to the public.”
For the folks who did manage to gain access, the videos that Sora generated were less than impressive. As YouTube personality Marques Brownlee pointed out in his hands-on video with the model, it took multiple minutes to generate a single 20-second 1080p clip and struggled to render a subject's legs and their movements, with the front and rear legs unnaturally swapping positions throughout the clip. One need only look at the generated video below of a gymnast swapping their arms, legs, and head on the fly as they tumble across a mat to see what he meant.
here's a Sora generated video of gymnastics
— Peter Labuza (@labuzamovies.bsky.social) 2024-12-11T17:35:23.989Z
Bluesky user Peter Labuza, who posted the gymnastics video, did not hold back on his criticism of the model, stating: “I’m sorry, but if you make a text-to-video generator and you tell it “make a cat run through a field” and you give it the starting image, and the cat simply STANDS, your generator Does Not Work.”
Bluesky user Chris Offner held a similar opinion, sarcastically noting that “Sora is a data-driven physics engine” while sharing an absolutely bonkers clip of a skier defying most, if not all, known laws of physics.
The Verge also tried out the model, bemoaning the fact that it still couldn’t avoid unsightly inclusions like “additional limbs or distorted objects.”
"Sora is a data-driven physics engine."x.com/chrisoffner3…
— Chris Offner (@chrisoffner3d.bsky.social) 2024-12-10T12:42:53.674Z
Not everybody hated Sora on sight, mind you. X user Nathan Shipley showed off the model's "remix" feature, which lets users map a generated video onto the movements of objects in an uploaded sample. In this case, he made a generated crane's head move in the same manner as a pair of scissors he videotaped himself holding.
Sora Remix test: Scissors to crane
Prompt was "Close up of a curious crane bird looking around a beautiful nature scene by a pond. The birds head pops into the shot and then out." pic.twitter.com/CvAkdkmFBQ
— Nathan Shipley (@CitizenPlain) December 10, 2024
There’s no word yet on when the company will be able to reliably reopen account signups for interested Sora users. Whether OpenAI can court Hollywood with Sora in its current state, as Runway recently did with Gen 3 and Lionsgate, also remains to be seen.
One thing remains certain: OpenAI, despite its initial lead in the AI boom, is quickly being surpassed by the rest of the industry, and lackluster product releases like what we just saw with Sora will only further harm the company's reputation.