Nvidia's new AI platform makes a fake you for better video calls
The AI-powered tools can improve audio and video quality for teleconferencing.
Streaming video is the number one source of traffic on the internet, and more than 30 million web meetings are estimated to take place every day. To capitalize on the growing work-from-home movement, Nvidia has announced a new AI-powered suite of tools that it says improves the quality of video calls while also reducing bandwidth usage. How? By faking your face with artificial intelligence, essentially.
Something for everyone — Called Nvidia Maxine, the platform can be tapped into by developers of video chat software to offer a range of features, like resolution upscaling and automatic camera positioning that always keeps your face in the center of the frame. But perhaps its most impressive feature is the ability to use AI to create a model of a speaker's face, and then only update their expressions as they happen. Put together, all the enhancements provided by Maxine should save money for developers and help workers better contend with home settings that aren't always conducive to professional-grade teleconferencing.
Because streaming video can quickly get expensive, Nvidia found a way to analyze the key facial features of everyone on a call and intelligently re-animate them in the cloud instead of streaming an entire screen of pixels. Maxine also converts lower-resolution streams to higher-resolution video in real time and can compress the resulting feed by 90 percent more than the current H.264 standard can manage. The result is that the amount of data being sent back and forth is significantly reduced without compromising on image quality.
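To get a feel for why sending facial features is so much cheaper than sending pixels, here is a rough back-of-the-envelope sketch in Python. It is not Nvidia's method or any part of the Maxine API. The frame size, the 68-landmark face model, the 4-byte coordinates, and the 1,000 kbps baseline bitrate are all assumptions picked for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic only. Every number below is an
# illustrative assumption, not a figure published by Nvidia.

def raw_frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed RGB frame, in bytes."""
    return width * height * bytes_per_pixel


def keypoint_payload_bytes(num_keypoints: int,
                           coords_per_point: int = 2,
                           bytes_per_coord: int = 4) -> int:
    """Size of one set of facial keypoints (x, y as 32-bit values), in bytes."""
    return num_keypoints * coords_per_point * bytes_per_coord


def percent_saved(baseline_kbps: float, reduced_kbps: float) -> float:
    """Bandwidth saved relative to a baseline bitrate, as a percentage."""
    return 100.0 * (1.0 - reduced_kbps / baseline_kbps)


if __name__ == "__main__":
    # Assumed 720p frame and an assumed 68-point face landmark model.
    frame = raw_frame_bytes(1280, 720)
    points = keypoint_payload_bytes(68)
    print(f"raw 720p frame:   {frame:,} bytes")
    print(f"68 keypoints:     {points:,} bytes")

    # Assumed 1,000 kbps H.264 call cut to one-tenth of that bitrate.
    print(f"bandwidth saved:  {percent_saved(1000, 100):.0f} percent")
```

The raw-frame figure overstates the gap, since real calls already send compressed video rather than raw pixels. The last line is the fairer comparison: a stream that needs one-tenth of the baseline bitrate saves 90 percent of the bandwidth.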
Combatting unpredictable environments — Other features are more useful for end users. Talking over video feels unnatural because most people look at their screens while they speak rather than at the camera. With gaze correction, the AI will simulate eye contact, and autoframe will adjust your face so it looks as if you're facing the other people in the call.
Everyone knows what it's like to have family members who won't respect your need for peace and quiet, or construction workers who conveniently start drilling as soon as your meeting begins. With the new "denoise" feature, your voice is isolated from everything happening in the background. In the demo video it seems to work incredibly well, completely blunting the sound of children playing in the background as a woman in the foreground talks on a call.
Maxine is one example of AI actually being useful because it can predict the common issues that arise to interrupt video calls. The technology suite uses Nvidia GPU acceleration and runs in the cloud, so you don't need one of the company's graphics cards to enable the features of Maxine, and they should work on most devices capable of video calls. Developers of video calling apps who are interested in using Maxine can request early access on Nvidia's website.