AI Reconstructs Faces Using Audio Recordings

More terrifying news from the AI world: complex algorithms can now reconstruct a facial image from an audio recording of a person speaking.

In a paper titled “Speech2Face: Learning the Face Behind a Voice”, a team of researchers examines an approach for inferring facial attributes from audio recordings. “How much can we infer about a person’s looks from the way they speak? In this paper, we study the task of reconstructing a facial image of a person from a short audio recording of that person speaking”, states the description.

The team behind the paper designed and trained a deep neural network that performs this task using millions of natural Internet/YouTube videos of people speaking. During training, their model learned voice-face correlations that let it produce images that “capture various physical attributes of the speakers such as age, gender and ethnicity”. This is said to be done in a self-supervised manner, utilizing the natural co-occurrence of faces and speech in Internet videos, without the need to model attributes explicitly.

The paper studies the idea in depth and evaluates how closely the reconstructions resemble the true face images of the speakers. You can learn more about the work and find the full paper here.


Comments 4

  • gay

    5 days. im here now b*tch



    ·5 years ago·
  • joe

    What about research that links your voice to your criminal record? Minority Report is not very far off.



    ·5 years ago·
  • Beggar Midas

    Well, that's a very novel, frankly terrifying method of walking backwards from the data to the identity. Comes jam packed with a bunch of consequences I strongly doubt the study authors have bothered to consider, too. You've just found a means of unmasking everyone, everywhere that's ever been, or ever will be recorded.

    Let that sink in, roll it around in your skull for a bit to more fully process it. Take some time letting that maxwell's demon eat away any remaining peace of mind you may have remained in possession of until this moment, despite everything previous to it best efforts to erode those remaining dregs of faith among this bitterly caustic modern geopolitical landscape we all call home.

    It means unmasking every anonymous source. Every anonymous tipster. Every whistleblower. Every leaker. Every insider's tip-off. Every dissident and activist trying to hide in the crowd. Every undercover cop or investigative reporter. Every witness to the worst crimes imaginable. Every principled call to action deemed criminal by a criminal government. Every youthful indiscretion. Every person who ever spoke ill of corrupt leaders or their psychotic running dogs. Past **or** present. Anyone who ever has taken, or ever will take, a stand against status quos within earshot of a microphone. Wow. If your goal was to exponentially fuck over our entire species, I'd say you've gone well above and beyond the call of duty in doing so.

    Congratulations for inventing the digital fourth nail for the crucifixion of free thought, liberty, and democracy everywhere. Hope those thirty pieces of silver will buy you a very nice rope to hang yourselves with, brothers of Iscariot.



    ·5 years ago·
  • Frank Trifonovic

    Let's see how long it takes before the SJWs accuse this algorithm of racism and try to dox it.



    ·5 years ago·

