The model can change scenes and poses.
Take a look at NeuMan – a new framework that reconstructs the human and the scene from a single video. The model, developed by Apple, can change the background and people's poses.
Given a video captured by a moving camera, the researchers trained two NeRF models, one for the human and one for the scene. They relied on existing methods to estimate the rough geometry, which allowed them to create a warping field from the observation space to a canonical, pose-independent space.
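To give a rough idea of what warping from observation space to a canonical space means, here is a minimal sketch. It assumes the simplest possible case: a single body part whose observed pose differs from its canonical pose by one rigid transform. The function name and parameters are hypothetical and chosen for illustration; this is not NeuMan's actual code, which the article does not describe in detail.

```python
from typing import List, Sequence


def warp_to_canonical(
    point: Sequence[float],
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> List[float]:
    """Undo a rigid transform x_obs = R @ x_can + t.

    Returns x_can = R^T @ (x_obs - t). All names here are illustrative.
    """
    # Remove the translation first
    shifted = [point[i] - translation[i] for i in range(3)]
    # A rotation matrix is orthonormal, so its inverse is its transpose
    return [sum(rotation[j][i] * shifted[j] for j in range(3)) for i in range(3)]


# Example: the body part was rotated 90 degrees about the z-axis and lifted by 2 units
rotation_z90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
translation = [0.0, 0.0, 2.0]
observed_point = [0.0, 1.0, 2.0]
canonical_point = warp_to_canonical(observed_point, rotation_z90, translation)
```

In a sketch like this, a point sampled in a video frame is mapped back to the canonical pose before the human model is queried, so the model only has to learn one pose-independent appearance.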
According to the paper, the method can learn subject-specific details, including cloth wrinkles and accessories, from a 10-second video and produce high-quality renderings of the human in new poses, from novel views, together with the background.
For example, NeuMan can make the people from a short video jump, dance, shake hands, or do parkour.