The team presented a method that predicts ten times as many landmarks as usual.
At the 2022 European Conference on Computer Vision (ECCV), a team of researchers from Microsoft demonstrated a novel method for reconstructing human faces in 3D. The method accurately predicts ten times as many landmarks as usual, covering the whole head, including the eyes and teeth. Landmarks are key features on the human face used for markerless facial motion capture.
The paper published by the team states that the system uses synthetic training data, which guarantees perfect landmark annotations. By fitting a morphable model to these dense landmarks, the researchers achieve state-of-the-art results for monocular 3D face reconstruction.
What's more, the method is highly efficient: it can predict dense landmarks and fit a 3D face model at over 150 FPS on a single CPU thread.
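To give a rough idea of what "fitting a morphable model to landmarks" can mean, here is a minimal Python sketch. It is not the team's implementation. It assumes a purely linear shape model (a mean face plus a weighted sum of basis shapes), an orthographic camera with known pose, and a simple ridge-regularized least-squares solve. The function names, the array shapes, the random stand-in model, and the figure of 700 landmarks are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_morphable_model(landmarks_2d, mean_shape, basis, reg=1e-3):
    """Estimate shape coefficients from 2D landmarks (illustrative only).

    landmarks_2d: (N, 2) observed landmark positions
    mean_shape:   (N, 3) mean 3D position of each landmark
    basis:        (K, N, 3) linear shape basis
    Assumes an orthographic camera with known pose (x, y kept, z dropped).
    """
    num_coeffs = basis.shape[0]
    # Project the basis orthographically and flatten to a (2N, K) matrix.
    design = basis[:, :, :2].reshape(num_coeffs, -1).T
    # Residual between the observations and the projected mean face.
    target = (landmarks_2d - mean_shape[:, :2]).reshape(-1)
    # Ridge-regularized normal equations keep the solve well conditioned.
    lhs = design.T @ design + reg * np.eye(num_coeffs)
    rhs = design.T @ target
    return np.linalg.solve(lhs, rhs)


def reconstruct(mean_shape, basis, coeffs):
    """Rebuild the 3D landmark positions from the fitted coefficients."""
    return mean_shape + np.tensordot(coeffs, basis, axes=1)


# Toy usage with a random stand-in model (hypothetical sizes).
rng = np.random.default_rng(0)
num_landmarks, num_coeffs = 700, 10
mean_shape = rng.normal(size=(num_landmarks, 3))
basis = rng.normal(size=(num_coeffs, num_landmarks, 3))
true_coeffs = rng.normal(size=num_coeffs)

# Simulate noise-free observations by projecting the true 3D shape.
observed = reconstruct(mean_shape, basis, true_coeffs)[:, :2]
fitted_coeffs = fit_morphable_model(observed, mean_shape, basis)
fitted_shape = reconstruct(mean_shape, basis, fitted_coeffs)
```

With many landmarks, each coefficient is constrained by far more observations than a sparse 68-point set would provide, which is one intuitive reason dense landmarks can help a fit like this. A real system would also need to estimate camera pose and expression, which this sketch leaves out.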
"While a human might consistently label images with 68 landmarks, manually annotating images with dense landmarks would be impossible. Instead, we rendered 100,000 synthetic training images using our Face Synthetics system. Without the perfect annotations provided by synthetic data, dense landmark prediction would not be possible," commented the team.
You can learn more about the team's method here.