
NVIDIA Study: World-Consistent Video-to-Video Synthesis

Learn the details of NVIDIA's GAN-based approach for converting semantic inputs into photorealistic videos.

NVIDIA has shared a paper presenting a new vid2vid method that generates photorealistic videos from semantic inputs while maintaining a form of AI memory. The approach stores 3D world data recovered from previously generated frames and reuses it when rendering new frames. To feed this world structure back into the generator, the team introduced guidance images: physically grounded estimates of what an object should look like, based on what has already been synthesized.

As their name suggests, these guidance images steer the generative model toward colors and textures that are consistent with its previous outputs.
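The core idea of respecting previous outputs can be illustrated with backward warping: pixels from an earlier generated frame are resampled into the current frame's coordinates, and regions with no valid source (newly revealed areas) are masked out so the generator knows it must synthesize them from scratch. The following is a minimal NumPy sketch under simplified assumptions, not NVIDIA's implementation; the function name and the use of 2D optical flow (rather than the paper's 3D point-cloud projection) are illustrative.

```python
import numpy as np

def warp_previous_frame(prev_frame, flow):
    """Backward-warp a previously generated frame into the current view.

    prev_frame: (H, W, C) float array, an earlier generated frame.
    flow: (H, W, 2) array; flow[y, x] = (dx, dy) points from each current
          pixel back to its source location in prev_frame.
    Returns the warped guidance image and a validity mask that is False
    where the source falls outside the previous frame (newly revealed
    regions the generator must fill in itself).
    """
    H, W, _ = prev_frame.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_x = xs + flow[..., 0]
    src_y = ys + flow[..., 1]

    # Mark samples whose source lies inside the previous frame.
    valid = (src_x >= 0) & (src_x <= W - 1) & (src_y >= 0) & (src_y <= H - 1)

    # Bilinear sampling: blend the four neighbouring source pixels.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = src_x - x0, src_y - y0

    x0c, x1c = np.clip(x0, 0, W - 1), np.clip(x1, 0, W - 1)
    y0c, y1c = np.clip(y0, 0, H - 1), np.clip(y1, 0, H - 1)

    w00 = ((1 - wx) * (1 - wy))[..., None]
    w01 = (wx * (1 - wy))[..., None]
    w10 = ((1 - wx) * wy)[..., None]
    w11 = (wx * wy)[..., None]

    warped = (w00 * prev_frame[y0c, x0c]
              + w01 * prev_frame[y0c, x1c]
              + w10 * prev_frame[y1c, x0c]
              + w11 * prev_frame[y1c, x1c])
    return warped * valid[..., None], valid
```

The validity mask is what lets a generator distinguish "copy from memory" regions from "hallucinate new content" regions.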


In total, the study introduces a new architecture built on a multi-SPADE module that combines semantic maps, optical-flow warping, and guidance images to produce photorealistic, temporally smooth videos.
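A SPADE layer normalizes a feature map and then modulates it with a spatially varying scale and shift predicted from a conditioning image; a multi-SPADE block chains several such modulations, one per conditioning signal. The sketch below is a simplified NumPy illustration, not the paper's code: the function names are hypothetical, and the small conv nets that predict the modulation parameters are reduced to 1x1 (per-pixel linear) projections.

```python
import numpy as np

def spade_modulate(x, cond, w_gamma, w_beta):
    """One SPADE step: per-channel normalize x, then apply a spatially
    varying scale/shift predicted from the conditioning map.

    x: (C, H, W) feature map; cond: (K, H, W) conditioning input.
    w_gamma, w_beta: (C, K) weights standing in for 1x1 convolutions.
    """
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True) + 1e-5
    x_norm = (x - mu) / sigma
    # Per-pixel linear maps from conditioning channels to modulation params.
    gamma = np.einsum('ck,khw->chw', w_gamma, cond)
    beta = np.einsum('ck,khw->chw', w_beta, cond)
    return (1 + gamma) * x_norm + beta

def multi_spade(x, semantic_map, warped_prev, guidance, weights):
    """Chain three SPADE modulations: semantic map, flow-warped previous
    frame, and guidance image, each with its own weights."""
    for cond, (wg, wb) in zip((semantic_map, warped_prev, guidance), weights):
        x = spade_modulate(x, cond, wg, wb)
    return x
```

Stacking the modulations lets each conditioning signal imprint its own spatial structure on the features without the signals having to be concatenated into one input.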

Check out the full paper on GitHub.
