Google’s new anything-to-anything AI model is wild

May 26, 2026 Twila Rosenbaum

At Google I/O 2026, the company unveiled Omni, a new family of generative AI models designed to eventually turn any type of input—text, image, video, audio—into any other type of output. For now, Omni’s capabilities are focused on video generation and editing, but even that limited scope has already produced results that are both astonishing and unsettling.

Omni Flash, the first model available, is integrated into Google’s Flow platform, the company’s AI video creation and editing tool. Users can upload a reference video and pair it with a text prompt to generate new clips, or ask the model to edit existing footage. Google claims Omni has improved real-world knowledge and character consistency compared to its predecessor, Veo. But a hands-on test reveals a more complicated picture.

The Stuffed Deer Experiment

To evaluate Omni, we revisited a project from earlier this year: generating vacation videos of a child’s plush deer named Buddy. Using a single reference image of the toy, we prompted Omni to create clips of Buddy rafting, skydiving, and packing for a cruise. Some results were surprisingly good—the deer maintained its appearance across scenes, and the rafting clip looked almost real. But then came the glitches. In the skydiving video, Buddy suddenly flipped orientation mid-fall. In the packing sequence, a jar of honey mutated into a squirt bottle and back again. The model struggled to keep objects consistent.

Deepfake Yourself in Seconds

Where Omni truly shines—and alarms—is in its ability to insert a real person into AI-generated scenarios. We provided a short selfie video with a neutral expression, then asked Omni to show us eating spaghetti, sitting on an airplane, and posing in front of the Eiffel Tower with a baguette. The results were convincing enough to fool family members. In one clip, the only clue that something was off was the unfamiliar bowl of pasta. The model even generated realistic chewing motions and background details, though occasional tells remained, like a duplicate person in the airplane scene or an overly metallic clink of the fork.

These deepfakes require minimal effort—just a few credits and a text prompt. Google’s AI Pro plan costs $20 per month for 1,000 credits, and each video generation consumes 15 to 40 credits depending on length and complexity. After producing about 20 clips with some edits, we had only 145 credits left. The cost can add up quickly if you’re iterating to get a perfect result.
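To make those numbers concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above; the one assumption is that testing began with a full 1,000-credit allotment, which the article does not state outright. The function names are illustrative and not part of any Google tool.

```python
# Figures quoted in the article.
PLAN_PRICE_USD = 20.0        # AI Pro plan, per month
PLAN_CREDITS = 1000          # credits included in the plan
MIN_CREDITS_PER_GEN = 15     # cheapest video generation
MAX_CREDITS_PER_GEN = 40     # most expensive video generation
CREDITS_LEFT = 145           # balance after the hands-on test

# Assumption: the test started from a full monthly allotment.
credits_used = PLAN_CREDITS - CREDITS_LEFT

usd_per_credit = PLAN_PRICE_USD / PLAN_CREDITS


def cost_usd(credits: int) -> float:
    """Dollar value of a number of credits at the plan's rate."""
    return round(credits * usd_per_credit, 2)


def generations_affordable(credits: int) -> tuple[int, int]:
    """(fewest, most) full generations a credit balance can pay for.

    The low end assumes every clip costs the maximum, the high end
    assumes every clip costs the minimum.
    """
    return credits // MAX_CREDITS_PER_GEN, credits // MIN_CREDITS_PER_GEN


if __name__ == "__main__":
    print("Cost per clip (USD):", cost_usd(MIN_CREDITS_PER_GEN), "to", cost_usd(MAX_CREDITS_PER_GEN))
    print("Credits used in testing:", credits_used, "worth about", cost_usd(credits_used), "USD")
    print("Generations left on the remaining balance:", generations_affordable(CREDITS_LEFT))
    print("Generations in a full month:", generations_affordable(PLAN_CREDITS))
```

Under that assumption, each clip works out to between 30 and 80 cents, and the remaining 145 credits cover somewhere between three and nine more generations, which is not much room for retries.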

The State of Generative Video

Omni represents a significant step forward in generative AI, but it also highlights the persistent challenges. While the model can produce realistic skin textures and lighting, it still struggles with temporal consistency and physical logic. Objects warp, characters gain or lose features (like Buddy’s antlers appearing and disappearing), and movements can feel uncanny. Google claims future versions of Omni will handle audio and other modalities, but for now, the video output is a mixed bag.

The broader implications are hard to ignore. Tools like Omni make it trivial to create convincing fake videos of anyone, raising concerns about misinformation, fraud, and privacy. Even Google acknowledges the risks, though the model is currently released without extensive safeguards. The company has historically been cautious with generative tools, but the pace of deployment has accelerated.

Generative AI has come a long way since the first text-to-image models like DALL-E and Midjourney. Video generation has followed, with companies like OpenAI (Sora), Runway, and now Google pushing the boundaries. Omni’s key innovation is its “anything-to-anything” architecture, which aims to unify multiple modalities into a single model. In theory, this could enable seamless translation between text, images, audio, and video, but the current implementation is focused on video generation and editing, guided by a text prompt and a reference image or video.

For creators and hobbyists, Omni offers a glimpse of a future where complex video edits can be done with a sentence. For the rest of us, it’s a reminder that seeing is no longer believing. As one tester put it, “We’re definitely deep in the uncanny valley.”

Source: The Verge News
