RICH: A New Dataset to Predict Human-Scene Contact

Using the dataset, the team built a network that requires just one image to determine which parts of the human body interact with the surroundings.

Computer Scientists Chun-Hao P. Huang, Hongwei Yi, Markus Höschle, Matvey Safroshkin, Tsvetelina Alexiadis, Senya Polikovsky, Daniel Scharstein, and Michael J. Black have presented RICH, a cool dataset that contains multiview outdoor/indoor video sequences at 4K resolution, ground-truth 3D human bodies captured using markerless motion capture, 3D body scans, high-resolution 3D scene scans, and accurate vertex-level contact labels on the body.

Using RICH, the team managed to train a network that predicts dense body-scene contacts from a single RGB image. According to the team, their solution is the first method to directly estimate 3D body-scene contact from a single image.

"Inferring human-scene contact (HSC) is the first step toward understanding how humans interact with their surroundings. While detecting 2D human-object interaction (HOI) and reconstructing 3D human pose and shape (HPS) have enjoyed significant progress, reasoning about 3D human-scene contact from a single image is still challenging," commented the team. "Existing HSC detection methods consider only a few types of predefined contact, often reduce body and scene to a small number of primitives, and even overlook image evidence. To predict human-scene contact from a single image, we address the limitations above from both data and algorithmic perspectives."

You can learn more here. Also, don't forget to join our new Reddit pageour new Telegram channel, follow us on Instagram and Twitter, where we are sharing breakdowns, the latest news, awesome artworks, and more.

Join discussion

Comments 0

    You might also like

    We need your consent

    We use cookies on this website to make your browsing experience better. By using the site you agree to our use of cookies.Learn more