AI Creates Near Perfect Images Of People, Dogs and More

January 4, 2020 posted by

Dear Fellow Scholars, this is Two Minute Papers with Károly Zsolnai-Fehér.

In the last few years, we have seen a bunch of new AI-based techniques that were specialized in generating novel images. This is mainly done through learning-based techniques, typically a Generative Adversarial Network, a GAN in short, which is an architecture where a generator neural network creates new images and passes them to a discriminator network, which learns to distinguish real photos from these fake, generated images. These two networks learn and improve together, so much so that the results of many of these techniques have become so realistic that we often can’t tell they are synthetic images unless we look really closely. You see some examples here from BigGAN, a previous technique that is based on this architecture.

So these days, many of us are wondering, is there life beyond GANs? Can they be matched in terms of visual quality by a different kind of technique? Well, have a look at this paper, because it proposes a much simpler architecture that is able to generate convincing, high-resolution images quickly for a ton of different object classes. The results it is able to churn out are nothing short of amazing. Just look at that!

To be able to proceed to the key idea here, we first have to talk about latent spaces. You can think of a latent space as a compressed representation that tries to capture the essence of the dataset that we have at hand. You can see a similar latent space method in action here that captures the key features that set different kinds of fonts apart and presents these options on a 2D plane, and here, you see our technique that builds a latent space for modeling a wide range of photorealistic material models.

And now, onto the promised key idea! As you have guessed, this new technique uses a latent space, which means that instead of thinking in pixels, it thinks more in terms of these features that commonly appear in natural photos, which also makes the generation of these images up to 30 times quicker, which is super useful, especially in the case of larger images. While we are at that, it can rapidly generate new images with the size of approximately a thousand by thousand pixels.

Machine learning is a research field that is enjoying a great deal of popularity these days, which also means that so many papers appear every day that it’s getting really difficult to keep track of all of them. The complexity of the average technique is also increasing rapidly over time, and what I like most about this paper is that it shows us that surprisingly simple ideas can still lead to breakthroughs. What a time to be alive! Make sure to have a look at the paper in the description as it also describes how this method is able to generate more diverse images than previous techniques and how we can measure diversity at all, because that is no trivial matter.

This episode has been supported by Weights & Biases. Weights & Biases provides tools to track your experiments in your deep learning projects. It is like a shared logbook for your team, and with this, you can compare your own experiment results, put them next to what your colleagues did and you can discuss your successes and failures much more easily. It takes less than 5 minutes to set up and is being used by OpenAI, Toyota Research, Stanford and Berkeley. In fact, it is so easy to add to your project that the CEO himself, Lukas, instrumented it for you for this paper, and if you look here, you can see how the output images and the reconstruction error evolve over time and you can even add your own visualizations. It is a sight to behold, really, so make sure to check it out in the video description, and if you liked it, visit them through or just use the link in the video description and sign up for a free demo today. Our thanks to Weights & Biases for helping us make better videos for you.

Thanks for watching and for your generous support, and I’ll see you next time!
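
The latent-space idea from the transcript can be made concrete with a toy example. The sketch below is an illustration only, not the paper's code. It assumes, as one commenter below suggests, that the method is a VQ-VAE-style model, where each small image patch is replaced by the index of its nearest entry in a codebook. The codebook values, the 4-pixel patches, and the function names `encode` and `decode` are all made up for this example; a real model learns its codebook and uses neural networks for both steps.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(patches: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each patch to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(patch, codebook[k]))
        for patch in patches
    ]


def decode(codes: Sequence[int], codebook: Sequence[Vector]) -> List[List[float]]:
    """Turn codebook indices back into (approximate) patches."""
    return [list(codebook[k]) for k in codes]


# Hypothetical hand-written codebook: three "features", each a 4-pixel patch.
CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],  # dark patch
    [1.0, 1.0, 1.0, 1.0],  # bright patch
    [0.0, 1.0, 0.0, 1.0],  # striped patch
]

# Three noisy patches standing in for pieces of an image.
patches = [
    [0.1, 0.0, 0.2, 0.1],
    [0.9, 1.0, 0.8, 1.0],
    [0.1, 0.9, 0.0, 1.0],
]

codes = encode(patches, CODEBOOK)           # one integer per patch
reconstruction = decode(codes, CODEBOOK)    # approximate patches

print("latent codes:", codes)
print("reconstruction:", reconstruction)
```

Here 12 pixel values are summarized by 3 integers, which is the sense in which a model can work with features instead of pixels. The reconstruction is only approximate, and that approximation is what a reconstruction error measures.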


100 Replies to “AI Creates Near Perfect Images Of People, Dogs and More”

  1. Karl Kastor says:

    Since it's a kind of auto-encoder, can you translate existing images into the latent space and then do latent space interpolation? This wasn't (easily) possible with GANs before!

    It's interesting how different the errors that VQ-VAE produces look from the blob-like GAN results.

  2. Loo says:

    would dream's visions kind of work like this? when i have lucid dreams i tend to focus on the images, they are very very clear but there is always a little something off (which i often ignore until i wake up because i am dreamy crazy when i focus)

  3. Jamie Fishers says:

    If they can make this why can't they make an ai that convincingly mimics the human consciousness?

  4. Robert Roberts says:

    So, when will there be photorealistic video games?

  5. Deadly Robotics says:

    Why have you stopped Explaining the technical part of these Papers ? This isn't Fair!!

  6. Philipp Jeremias says:

    Hey, that's not a two minute paper

  7. Will says:

    Hey Károly – great intro to the research, as usual.
    Is there a link to the W and B instrumentation of this paper that you mentioned?

  8. Lorenzo Blz says:

    It would be interesting to mix the two approaches together: one loss composed by this one plus a discriminator based one.

  9. PRO SKUB says:

    but can it kick up the 4d3d3d3?

  10. Werex Zenok says:

    Can you explain the neural network?

  11. veggiet2009 says:

    What's the difference with previous latent spaces?

  12. Peetiegonzalez says:

    2:04 wait… you're surely not telling us that these images are purely generated?

  13. OneSaile says:

    will photorealistic games come from this tech?

  14. Michael Marks says:

    Your videos are consistently brilliant and exciting. My favorite part of browsing the web. /bravo

  15. Moby Motion says:

    Love that this uses variational autoencoders, and therefore comes with all their benefits. For example, I assume you can pick out your learnt “mean” and “standard deviation”, giving you a much more theoretically grounded basis for generating new pictures.

  16. Programming Universe says:

    Thanks for the upload, this video also delivered the usual fantastic quality! What's more, we get to see videos every day! In English I would say: Keep up the good work!

  17. Dremekeks says:

    Generative Adversarial Networks are really insane.. an AI teaching itself!

  18. Raf galvão says:

    Make photorealistic dinosaurs and cat-girls ofcs

  19. HulluJaska says:

    That's actually a relevant sponsor. Congrats for the first commercial sponsor in a while!

  20. A.I says:

    create me a russian programmed robot

  21. Alejandro Aguiar says:

    Those ostriches are ridiculously well made abominations from Lovecraft's nightmares, just look at each image separately. By the way, nice video as always.

  22. Chandler Supple says:

    I've been waiting for something like this! I can't wait to try out the model.

  23. Philip Fry says:

    They need to implement that into games so every npc has a unique face for free.

  24. John Jacobson says:

    Great Channel. Please link-list all the references in videos in the video comment.
    For example, the Campbell 2014 referenced in the video is
    Campbell, Neill D. F. & Kautz, Jan (2014). Learning a Manifold of Fonts. ACM Transactions on Graphics, 33, 1-11. doi:10.1145/2601097.2601212.

  25. mahchymk93 says:

    WHAT A TIME TO BE ALIVE!! ❤️❤️❤️

  26. Wajih bec says:

    As mind blowing as this is it makes me sad knowing I'll never reach the capabilities of these brilliant minds.

  27. spiddyman007 says:

    This dude's voice is crazy

  28. Марцус Åкерман says:

    I see a problem with this channel. Although it’s immensely interesting, it’s also chunking the information down to a level where the original study is easily misinterpreted or the effect of its results is extrapolated by the viewers to something far from the original intention of the authors. This is not a critique of the presenter, as he’s very good at what he’s doing, but more of a critique of the format.

    And further, where are the papers that work on how to identify a computer generated face from a real face? Until then all these papers are just one step closer to dismantling our democracy. I mean, do the viewers really understand, or care, about the eventual negative results of these findings?

    This is exactly why I’m feeling disheartened by studying and working in this field. It’s just full speed ahead and no concern of what you might eventually hit. More findings on how to repair the eventual damage might make me hopeful about the field, but as it seems now, we are just heading towards disaster.

    I’m not saying this paper makes it so, but eventually if we are not careful and contemplating the effects of our actions we might see ourselves in such a situation.

  29. Orwell Klimt says:

    are genders discriminative to each other.
    and if so. what are the implications of this?

  30. Crash159 says:

    Hey Károly, you could change your channel name to "Too Many Papers", in order to not be restrained by time anymore.
    Your videos now often pass the 2 minute mark, and you really shouldn't be restrained by that time limit, after all, there's so much interesting stuff to be discussed about these projects.

  31. Дмитрий Хлеб says:

    Look at all these people that dont exist

  32. Awdhoot Kanawade says:

    Wewww faceapp data sold😂😂

  33. m m says:

    Videogame related ideas.

    Auto-rig and skin 3D models, with learning algorithms. Text based character creation.

  34. suri4Musiq says:

    Kindly do a paper explanation for the faceapp

  35. Sekai Yukki says:

    Deepfake getting better =D

  36. Mussadiq Zainal says:

    Instead Of Input By Keyboards… Why Not Instructions By Voice.. 😄 It's Not Just The Next Thing.. It's Just Us.. On How We Perceive Instructions.. Then There Is Turning Voice Recognition Like A Thumbprint.. For Security Measures… 😌 When You Reach A Certain Point.. You Gotta Reach Another Simply Because People Are Getting Smarter And Smarter.. 😄

  37. zero s says:

    U N L I M I T E D H E N T A I !!

  38. KotSR says:

    Where do you learn about these papers being released?

  39. Artisan says:

    There is no AI, it's people behind algorithms so it's people making these things, these programs are just a better version of Photoshop filters and tools. It's great, will be very good for speeding up art works and design, but that's all. Needs people behind it.

  40. Mich ael says:

    With so many machine learning papers, what you need is an AI to read, collate and summarise them for you.

  41. Bjarke Erik says:

  42. Account says:

    Hello Károly
    I'm looking for image inpainting with DL for texture images, may you recommend me a source for this kind of models?

  43. Arcade Alchemist says:

    would like to not see fake images on tinder. i mean a lot of images i see are fake

  44. Raw Vid says:

    Truth be told I don't know about half the things you cover, but the way you explain is just too wonderful and just keeps me coming to see all these new technological marvels

  45. Felipe Dias Rieth says:

    Rockstar should use on next gta

  46. Rilum Osmanaj says:

    Dear fellow scholars. This is Rilum Osmanaj. Peace
    Dear fellow Károly Zsolnai-Fehér. This is Rilum Osmanaj. Peace

  47. Deeparth Gupta says:

    Catfishing people just got much easier.

  48. Cyan Josh says:


  49. wertyuiop says:

    Jesus is the way, the truth and the life. No-one comes to the Father except by Him. Repent and be saved.

  50. Y2Kvids says:

    Is there impact on Human brain watching fake faces ?

  51. Jens Törnell says:

    Is that an AI voice speaking?

  52. Seetor says:

    So… What do I do if I kind of sort of fell in love with the girl in the thumbnail?

  53. Just Another Weeb says:

    is this what the russians are using faceapp data for?

  54. Charles Okonkwo says:

    This is fucking scary. Post-truth era here we come. With artificial photos and videos you'll be able to make people believe ANYTHING! Humanity is gonna end.

  55. neohumanity says:

    We will live inside computers in the 2030s

  56. Chris says:

    perfect for gaslighting lmao

  57. Joey Koningsbruggen says:


  58. Wojciech D says:

    0:54 hahahaha. Look at the dolphin photos. One of them is different than others. It's good to see some memers in scientific community.

  59. Richard Gargya says:

    Hi Karcsi. Your English is super good. But, meaning no offense, may I ask whether this is a script, a text written out well in advance that you read aloud? What I mean is that the narration doesn't sound like natural speech to me, but rather as if you were just reading it out quickly, and because of that it is also rather flat, e.g. when you say "What a great time to be alive", it's as if that opinion weren't even your own, it's so emotionless… Then again, maybe I'm just expecting too much, sorry. I don't mean to pick on you, I love your videos, thanks if you reply 🙂

  60. TJP projects says:

    So what you're saying is that I can now have my computer make my hentai for me?

  61. SafetySkull says:

    Is this similar to the principal component analysis found in CodeParade's videos?

  62. skrr yuhh says:

    Ok now how can we apply this to porn is my question

  63. Scramble says:

    Photoshop Surgeon want to know your location

  64. Scramble says:

    What's the object at 3:30?

  65. a thing says:

    0:44 ham bur gur

  66. Michael Brown says:

    Not sure what the point is.

  67. Ang Bart says:

    Of its creators perspective. The perspective in these channel is to damn must must gen.

  68. Kardop says:

    AI made asian girl on thumbnail is a cutie

  69. Michael Cooney says:

    Who is funding this? So much research on generating false truth and potentially incredibly oppressive applications rather than entertainment and improving human welfare.

    I'm wondering if government secret police and intelligence from nations like China and Russia finance this, and these cute graphics demos are public fronts for much larger R&D programs dedicated to mass surveillance, propaganda, and mass control and indoctrination.

  70. 08wolfeyes says:

    This kind of A.I imaging might be useful for the police when creating a photofit of a suspect!

  71. Digital Down says:

    I’m going to check out what it comes up with when fed a dataset of album covers. I don’t have high hopes but it’s worth a shot.

  72. Dan Iel says:

    I'm tired of seeing so many AI learning tools being demonstrated but never released. The only one that seems available is the fake app. Being able to make new images out of a set of images would be an amazing tool. You could use it for texture creation, make varieties on famous paintings etc etc.

  73. Ivan Guerra says:

    Man, what are you going to do when one of those people asks you not to use their images without permission?

  74. Oriru Bastard says:

    I wonder how well these work with cartoon characters?

  75. PERT DOHERTY says:

    2:37 deform

  76. Caleb says:

    "All images in this video were produced by our generator, they are not photographs of real people"
    Literally the next cut, there are four photographs of real people.

  77. Sarabella Gignac says:

    Mueller released Epstein after only 13 months

    Epstein then spent his billions backing Marvin Minsky

    (father of the internet) & emotion machine & Ben goertzel Sophia's creator

    Therefore > Evil is written into DARPA'S / Silicon Valley/ AI Source Code

  78. Solve Everything says:

    At this point it has become a badge of honour to be picked out by TMP (two minute papers) for a video.

    Maybe you should start a small financial reward fund for those papers too?

  79. Cheydinal says:

    They should use AI to create 3D models. Have it create them, then take a digital photo of the model, and let another AI compare it to real-world photos

  80. Simon Persson says:

    I need the green haired girl in my life <3

  81. Philipp Hoehn says:

    This is how our future waifus will be created

  82. David Beddoe says:

    i could do this mentally when i was 5 or 6

  83. Paul Hamacher says:

    soooo…. can it create nudes? 😏

  84. Ubi Vermis Cerritulus says:

    Whenever I see stuff like this I can not help but think the military probably had this sort of tech decades ago…

  85. BoomSnap SnapBoom says:

    Send it back to the factory it thinks humans have green hair.

  86. Tsz fung Cheung says:

    what if the robot factory uses those human faces to make a real AI Robot.

  87. My Account says:

    Like a camera! omg

  88. BungyStudios says:

    More Waifus every day

  89. husain mfh says:

    Your voice like an AI generated voice tho 🤔

  90. SingingintheDark says:

    your link to the latent space papers goes to gaussian material synthesis, I was really hoping to download the software used to make the faces, is it available? why do you have a link for it if it isn't? I watched the whole video, but you really do not separate the two so are they the same? if so how do I teach it to make the faces? because it seems to only change textures and lighting from what I read. you should really be more clear.

  91. IHasNoLife Productions says:

    So I can create my perfect girlfriend?

  92. yoooyoyooo says:

    AI is going to make human intelligence obsolete in professional fields so fast it's not even funny. We are basically making ourselves competition that we can't win against. How that is going to influence humanity we can only theorize about.

  93. Arun Chauhan says:

    This technology would be very helpful in state surveillance. Thanks

  94. youtoober2013 says:

    0:51 A few key things we can take away from this…
    A shark turns into an octopus and a sloth before it turns into a dog.
    Dogs are horrifying with bird feet.
    Birds are really just big flies.
    If you tie a goose's neck it turns into a robin.
    … and dogs resemble other dogs.
    Good to know.
    Oh and yeah, I love how the skunk's butt turns into that one dog's nose. Irony.

  95. youtoober2013 says:

    The invertebrates were one thing, but when you've got birds growing out of other birds' backs (bottom right)…
    … these are things that can't be unseen! 1:02
    Droopy faced dogs are always cool, but those amphibians… ugh!
    This is one thing I think AI should learn faster.

  96. youtoober2013 says:


  97. youtoober2013 says:

    All kidding aside, this is really impressive.

  98. Gajaraj Kalburgi says:

    Is there any research going on into detecting abusive language in text content?
