SIGGRAPH: A New Method for Real-time Speech Animation
10 August, 2017
News

A team of researchers from the University of East Anglia, Caltech, Carnegie Mellon University and Disney has presented a method for animating speech in real time. The new method lets new dialogue be animated automatically, with much less time and effort.

The team recorded over eight hours of audio and video of a speaker reciting more than 2,500 different sentences. The speaker’s face was tracked, and the data was used to create a reference face for an animation model. They then used special software to transcribe the speech sounds into phonemes. This data was used to train a neural network to animate the reference face, frame by frame, based on phonemes.

Training the AI is said to take only a couple of hours, and the result lets specialists use speech from any speaker, with any accent and even in different languages. The method can also handle singing.

Abstract 

We introduce a simple and effective deep learning approach to automatically generate natural looking speech animation that synchronizes to input speech. Our approach uses a sliding window predictor that learns arbitrary nonlinear mappings from phoneme label input sequences to mouth movements in a way that accurately captures natural motion and visual coarticulation effects. Our deep learning approach enjoys several attractive properties: it runs in real-time, requires minimal parameter tuning, generalizes well to novel input speech sequences, is easily edited to create stylized and emotional speech, and is compatible with existing animation retargeting approaches. One important focus of our work is to develop an effective approach for speech animation that can be easily integrated into existing production pipelines. We provide a detailed description of our end-to-end approach, including machine learning design decisions. Generalized speech animation results are demonstrated over a wide range of animation clips on a variety of characters and voices, including singing and foreign language input. Our approach can also generate on-demand speech animation in real-time from user speech input.
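To make the "sliding window predictor" idea in the abstract more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The phoneme inventory, the per-phoneme mouth targets, the window length and the `toy_predictor` function are all hypothetical stand-ins for the trained neural network, and averaging the overlapping window outputs is an assumption about how per-window predictions could be blended into one value per frame.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy values, not taken from the paper:
# each phoneme maps to a (jaw_open, lip_round) target in [0, 1].
TARGETS: Dict[str, Tuple[float, float]] = {
    "sil": (0.0, 0.0),
    "HH": (0.3, 0.0),
    "AH": (0.8, 0.1),
    "L": (0.4, 0.0),
    "OW": (0.5, 0.9),
}
N_PARAMS = 2   # mouth-shape parameters per frame (assumed)
WINDOW = 5     # frames per input/output window, odd (assumed)

Frame = List[float]
Predictor = Callable[[Sequence[str]], List[Frame]]


def toy_predictor(window: Sequence[str]) -> List[Frame]:
    """Stand-in for the trained network.

    Takes a window of per-frame phoneme labels and returns one parameter
    vector per frame. Each frame is the mean target of itself and its
    immediate neighbours, a crude imitation of coarticulation.
    """
    out: List[Frame] = []
    for i in range(len(window)):
        lo, hi = max(0, i - 1), min(len(window), i + 2)
        neighbours = [TARGETS[p] for p in window[lo:hi]]
        out.append(
            [sum(t[k] for t in neighbours) / len(neighbours) for k in range(N_PARAMS)]
        )
    return out


def animate(
    labels: Sequence[str],
    predictor: Predictor = toy_predictor,
    window: int = WINDOW,
) -> List[Frame]:
    """Slide a window over the phoneme labels and blend the predictions.

    Every window yields a short output sequence; frames covered by several
    windows are averaged (an assumed blending rule).
    """
    if not labels:
        return []
    half = window // 2
    # Pad with silence so the first and last frames are seen by full windows.
    padded = ["sil"] * half + list(labels) + ["sil"] * half
    sums = [[0.0] * N_PARAMS for _ in padded]
    counts = [0] * len(padded)
    for start in range(len(padded) - window + 1):
        prediction = predictor(padded[start:start + window])
        for offset, frame in enumerate(prediction):
            for k, value in enumerate(frame):
                sums[start + offset][k] += value
            counts[start + offset] += 1
    frames = [[s / counts[i] for s in sums[i]] for i in range(len(padded))]
    return frames[half:half + len(labels)]


if __name__ == "__main__":
    # One label per video frame for a toy utterance of "hello".
    utterance = ["sil", "HH", "HH", "AH", "AH", "L", "OW", "OW", "OW", "sil"]
    for label, (jaw, lips) in zip(utterance, animate(utterance)):
        print(f"{label:>3}  jaw_open={jaw:.2f}  lip_round={lips:.2f}")
```

Because each output frame is influenced by the labels around it, the mouth parameters change gradually between phonemes instead of snapping from one target to the next. In a real system the `predictor` argument would be a trained model rather than a lookup table.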

Full paper

via engadget.com
Comments

Jeff

Your link to the full paper is broken.

Leon Chen

Can’t see the full paper.