Have you ever wondered what it would be like to see your favorite portrait come to life and sing or speak any language or style? Well, now you can, thanks to a new AI framework called EMO, which stands for Expressive Motion for Vocal Avatars.
What is EMO?
EMO is a new method that can generate realistic and expressive portrait videos from a single reference image and a vocal audio input, such as singing or talking. EMO can handle various head poses, facial expressions, and lip movements, as well as different languages, genres, and durations of audio.
How does EMO work?
EMO works by first extracting the audio features and the facial landmarks from the input audio and image, respectively. Then, it uses a neural network to learn the mapping between the audio features and the facial landmarks and generates a sequence of facial landmarks that are synchronized with the audio. Finally, it uses another neural network to synthesize the portrait video from the facial landmarks and the reference image.
EMO can produce stunning results for a wide range of portraits, including paintings, historical figures, celebrities, cartoons, and even AI-generated characters. EMO can also animate portraits from different cultures and languages, such as Mandarin, Japanese, Korean, Cantonese, and English. EMO can also handle fast-paced and complex audio inputs, such as rap songs and Shakespearean monologues.
remember the video of the woman walking in Tokyo streets from Sora? In this EMO-generated video, she is singing a song by Dua Lipa that looks so realistic.
EMO is not only a fun and creative way to bring portraits to life, but also a potential tool for various applications, such as education, entertainment, art, and communication. Imagine being able to learn from the portraits of famous people, watch them perform in different languages and styles, or even interact with them in real-time.
How to access EMO?
If you want to learn more about EMO, you can check out the original paper and the demo videos on the project website. You can also try it out yourself when the code becomes available to the public on the official Github project. EMO is a breakthrough in vocal avatars generation and a glimpse into the future of AI and human expression.
0 Comments