Existing reconstruction techniques have significant limitations. For example, take a camera, walk around a building recording a video, and use that data to create a 3D model. You won’t get a clean and accurate reconstruction of the building. Walk around the inside of a house with your camera and try reconstructing it, and the result will be just as disappointing. To help measure and close this gap, Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun from Intel Labs presented a new benchmark for image-based 3D reconstruction.
The researchers used a state-of-the-art industrial laser scanner with a range of 330 meters and submillimeter accuracy to acquire ground-truth models of large-scale scenes. They scanned objects and environments from multiple viewpoints and registered the scans to obtain the ground-truth models. Then they used 8-megapixel video of the scenes as the input for reconstructing models that can be compared against that ground truth.
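To make the idea of comparing a reconstruction against a laser-scanned ground truth more concrete, here is a minimal sketch. The article does not say how the benchmark scores submissions, so the metric below (precision, recall, and F-score at a distance threshold, a common choice for point clouds) is an assumption, and the names `nearest_distance`, `score_reconstruction`, and `threshold` are hypothetical. It uses brute-force nearest-neighbor search, which is only practical for tiny point sets.

```python
import math


def nearest_distance(point, cloud):
    """Distance from `point` to its closest neighbor in `cloud` (brute force)."""
    return min(math.dist(point, other) for other in cloud)


def score_reconstruction(reconstruction, ground_truth, threshold):
    """Return (precision, recall, f_score) for two non-empty 3D point lists.

    Assumed metric, not taken from the article:
    precision = share of reconstructed points within `threshold` of the ground truth,
    recall    = share of ground-truth points within `threshold` of the reconstruction.
    """
    close_reconstructed = sum(
        nearest_distance(p, ground_truth) <= threshold for p in reconstruction
    )
    close_ground_truth = sum(
        nearest_distance(g, reconstruction) <= threshold for g in ground_truth
    )
    precision = close_reconstructed / len(reconstruction)
    recall = close_ground_truth / len(ground_truth)
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy data: four ground-truth points on a line, three reconstructed points.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.02), (2.0, 0.0, 0.5)]

precision, recall, f_score = score_reconstruction(
    reconstruction, ground_truth, threshold=0.05
)
print(f"precision={precision:.3f} recall={recall:.3f} f_score={f_score:.3f}")
```

In this toy case two of the three reconstructed points lie close to the ground truth, and two of the four ground-truth points are covered, so the score penalizes both the inaccurate point and the missing part of the scene. A real evaluation would also need to align the two models and use a spatial index instead of brute-force search.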
The presented benchmark has a number of characteristics that can support the development of new reconstruction techniques:
- The input modality is video. This can help future pipelines track the camera, reason about illumination and reflectance, and reconstruct small details.
- The benchmark evaluates complete reconstruction pipelines. This leaves scope for tackling camera localization and dense reconstruction jointly, potentially increasing robustness and precision via co-adaptation to the performance characteristics of each task.
- The benchmark includes both outdoor and indoor scans of complete scenes, pushing current reconstruction pipelines to their limits and beyond.
This research could substantially change the way developers reconstruct outdoor and indoor scenes and stimulate the development of robust, broad-competence systems. The researchers plan to set up an evaluation server and an online leaderboard that the community can use to track progress in image-based 3D reconstruction.