EMO, or Emote Portrait Alive, is a recent development from Alibaba. With the world scrambling to develop AI models, EMO stands out because it tackles a persistent problem in AI-generated video: capturing the full spectrum of human expression.

Article author Michael Nuñez explains that EMO is built on an AI technique known as a diffusion model, which he credits with producing “realistic synthetic imagery.” What is most significant about this new approach to AI-generated media is the following: “Unlike previous methods that rely on 3D face models or blend shapes to approximate facial movements, EMO directly converts the audio waveform into video frames. This allows it to capture subtle motions and identity-specific quirks associated with natural speech.”
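To make that idea concrete, here is a minimal, hypothetical sketch of an audio-conditioned diffusion loop in PyTorch. None of this is EMO's actual code; the AudioEncoder and FrameDenoiser classes, their dimensions, and the fixed 50-step update are toy assumptions meant only to illustrate how audio features can drive frame denoising directly, with no intermediate 3D face model.

```python
# Toy sketch of audio-conditioned frame diffusion. All names and sizes
# here are illustrative assumptions, not EMO's real architecture.
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    """Maps a raw audio waveform to per-frame conditioning vectors."""
    def __init__(self, dim=64):
        super().__init__()
        # Strided 1-D conv turns samples into a sequence of frame features.
        self.proj = nn.Conv1d(1, dim, kernel_size=400, stride=160)

    def forward(self, wav):                    # wav: (B, 1, samples)
        return self.proj(wav).transpose(1, 2)  # (B, frames, dim)

class FrameDenoiser(nn.Module):
    """Predicts the noise in a frame, conditioned on its audio feature."""
    def __init__(self, dim=64, pixels=16 * 16 * 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(pixels + dim, 256), nn.ReLU(),
            nn.Linear(256, pixels),
        )

    def forward(self, noisy_frames, audio_vecs):
        # Concatenate each noisy frame with its audio conditioning vector.
        return self.net(torch.cat([noisy_frames, audio_vecs], dim=-1))

# Reverse-diffusion loop: start from pure noise and iteratively denoise,
# letting the audio waveform (not a 3D face model) steer each frame.
audio = torch.randn(1, 1, 16000)              # one second of placeholder audio
enc, denoiser = AudioEncoder(), FrameDenoiser()
cond = enc(audio)                             # (1, frames, 64)
frames = torch.randn(1, cond.shape[1], 16 * 16 * 3)  # noise to be denoised
for t in range(50):                           # crude fixed-step schedule
    pred_noise = denoiser(frames, cond)
    frames = frames - 0.02 * pred_noise       # simplified denoising update
```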

By training the AI on hours of footage drawn from media sources rich in human expression and speech, EMO can generate a video that closely resembles human expression from a single reference image. Compared to the many AI generators we have viewed throughout the course, such as OpenAI's DALL-E, EMO has managed to narrow the gap between AI output and an accurate human depiction.
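In practice, driving a still portrait with an audio track would look something like the call below. The emo_like package, the PortraitAnimator class, and the generate() signature are invented for illustration; this is not a real API, just a sketch of the reference-image-plus-audio workflow the article describes.

```python
# Hypothetical inference call for an EMO-style model. The package,
# class, method names, and file names below are assumptions.
from emo_like import PortraitAnimator  # hypothetical package

animator = PortraitAnimator.from_pretrained("emo-like-base")  # assumed model name
video = animator.generate(
    reference_image="portrait.png",  # a single still image of the subject
    audio="speech.wav",              # the driving speech or singing track
    fps=25,
)
video.save("talking_portrait.mp4")
```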

Beyond video generation, EMO also tackles the relationship between audio and visuals: it can animate a subject singing, not just speaking. At present, EMO generates convincing speaking and singing videos; as the technology matures, it is likely to reshape personal content creation.

Of course, as the article notes, there are ethical concerns with generative content like this. The main issue is consent, since reference images can be pulled from anywhere on the internet, which creates an even larger breeding ground for deepfakes.

The research paper linked in the article (15-page reading): EMO: Emote Portrait Alive – Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

