This AI Makes The Mona Lisa Speak…And More!

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.

In an earlier episode, we covered a paper by the name Everybody Dance Now. In this stunning work, we could take a video of a professional dancer, then record a video of our own, let’s be diplomatic – less beautiful moves, and then transfer the dancer’s performance onto our own body in the video. We called this process motion transfer.

Now, look at this new, also learning-based technique that does something similar: in goes a description of a pose and just one image of the target person, and on the other side, out comes a proper animation of this character according to our prescribed motions. Now, before you think that this means we would need to draw and animate stick figures to use this, I will stress that this is not the case. There are many techniques that perform pose estimation, where we just insert a photo, or even a video, and it creates all these stick figures for us that represent the poses that people are taking in these videos. This means that we can even have a video of someone dancing, and just one image of the target person, and the rest is history. Insanity.

That is already amazing and very convenient, but this paper works with a video-to-video problem formulation, which is a concept that is more general than just generating movement. Way more. For instance, we can also specify the input video of us, then add one, or at most a few images of the target subject, and we can make them speak and behave using our gestures. This is already absolutely amazing; however, the more creative minds out there are already thinking that if we are talking about images, it can be a painting as well, right? Yes, indeed, we can make the Mona Lisa speak with it as well.

It can also take a labeled image, this is what you see here, where the colored and animated patches show the object boundaries for different object classes. Then we take an input photo of a street scene, and we get photorealistic footage with all the cars, buildings, and vegetation.

Now, make no mistake, some of these applications were possible before, many of which we showcased in previous videos, some of which you can see here. What is new and interesting is that we have just one architecture that can handle many of these tasks. Beyond that, this architecture requires much less data than previous techniques, as it often needs just one or at most a few images of the target subject to do all this magic.

The paper offers ample comparisons to these other methods. For instance, the FID measures the quality and the diversity of the generated output images and is subject to minimization, and you see that it is miles beyond these previous works.

Some limitations also apply: if the inputs stray too far away from topics that the neural networks were trained on, we shouldn’t expect results of this quality, and we are also dependent on proper inputs for the poses and segmentation maps for it to work well.

The pace of progress in machine learning research is absolutely incredible, and we are getting very close to producing tools that can be actively used to empower artists working in the industry. What a time to be alive!

Thanks for watching and for your generous support, and I’ll see you next time!
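
A rough illustration of the FID comparison mentioned in the transcript: FID stands for Fréchet Inception Distance. It compares the mean and covariance of features extracted from real images with those of generated images, and lower values are better. The sketch below computes that distance in Python with NumPy, assuming the features have already been extracted. In the actual metric they come from a pretrained Inception network, which is not included here. The function name `frechet_distance` and the random stand-in features are hypothetical, chosen only for illustration; this is not the paper's evaluation code.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Formula: ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((C_a C_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of C_a @ C_b. Tiny negative or imaginary parts caused
    # by round-off are discarded before taking the square root.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in "features": random vectors, NOT real Inception activations.
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + np.array([1.0, 0.0, 0.0, 0.0])  # same spread, mean moved by 1

print(frechet_distance(real, real))     # close to 0: identical statistics
print(frechet_distance(real, shifted))  # close to 1: only the mean term changes
```

Identical feature sets give a distance of about zero, and the distance grows as the means or covariances of the two sets drift apart, which is why the metric is "subject to minimization".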

82 thoughts on “This AI Makes The Mona Lisa Speak…And More!”

  1. Can you find this kind of generative program applied to synthesis of text bodies? Input the cast of mad men to a model of the expanse to generate a series of stories about space barons.

  2. "Tools to empower artists", more like tools to empower fake news and disrupt everything we consider evidence.
    Followed by everyone being completely useless. Lately people do not even spend time making their own video clips from their media; they wait until Google makes them for them.
    All these new techniques sound amazing, but there is no need to be so clueless about their implications.

  3. The artistic possibilities leave me drooling with anticipation! Once again you leave me breathlessly anticipating the next paper.

  4. I'm going into computer engineering/ computer science. This excites me so much. This also terrifies me to no end.

    All in all wonderful video <3 !

  5. I wonder, given the AI physics engine, whether it would be possible to have an AI aerodynamics engine combined with the physics engine?

  6. It's all nice and all… but if the people writing those papers could release software based on their papers that we could use, that'd be nice.

  7. Wow…thank you so much for all the love everyone! -> https://www.youtube.com/channel/UCbfYPyITQ-7l4upoX8nvctg/community?show_create_dialog=1

  8. This is getting cliché; we have watched this kind of animation many times. Love your content, the majority of it, but those simulations and these animations are getting cliché.

  9. The work at 0:38 seemed to give female hip/waist ratio and thigh thickness to the male subjects, interested to see this two papers down the road!

  10. So, I can see the whole array of morally dubious uses for all these papers, but what is the ACTUAL intended use? Why are they developing this technology in the first place?

  11. What I look forward to is when AI can take midi music and translate it into real instrumental music like the violin, simulating the nuance between notes.

  12. That ballet-dancing dude still gets me every time XD This was the first time though, that I noticed the original ballerina is wearing a tutu (which partly obscures her legs). Makes it even more impressive, imo!

  13. Actors will be obsolete in just a few years, computers will terk der jerbs. Deep-fakes and thispersondoesnotexist.com are already nearly there.

  14. Now imagine a museum, where you can come and talk with paintings, and they will reply meaningfully with advanced AI and tell you about their era.

  15. 2:15 WHAAT!?

    This is absolutely incredible. Thank you for making these videos, it's always a treat to see them pop up in my notifications.

  16. Can't wait to see all the pervert people take pictures of their favorite movie star and transpose them into p0*n movie

  17. I sometimes feel like this channel isn’t even set in our timeline or dimension, but rather in a totally different one, far in the future and we’re just able to have a glimpse of it…

  18. This is great. I also know a good use case for this: people who are gone from us too early. Now we can bring them back to life, at least their visual appearance. But with the use of AI, it would even be able to reproduce their character (maybe?) based on a number of decisions they made early in their lives. So you could tell them what you couldn't years ago and more. Combine all of this with a holographic component and we have Star Wars, great 🙂
