New Depth Estimation Method Will Make Video Editing Easier

Depth Anything is a collaborative work from TikTok, The University of Hong Kong, Zhejiang Lab, and Zhejiang University.

Image credit: The University of Hong Kong et al.

Researchers from TikTok, The University of Hong Kong, Zhejiang Lab, and Zhejiang University presented Depth Anything, a new image-based depth estimation method that might make video editing easier.

Trained on 1.5 million labeled and 62 million unlabeled images, it provides foundation models for Monocular Depth Estimation (MDE), meaning depth predicted from a single image, with these features:

  • zero-shot relative depth estimation
  • zero-shot metric depth estimation
  • optimal in-domain fine-tuning and evaluation on NYUv2 and KITTI datasets 
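To make the difference between relative and metric depth concrete, here is a minimal sketch in plain Python. It assumes the common convention that a relative depth map is only meaningful up to an unknown scale and shift, so it is aligned to metric values with a least-squares fit before the two are compared. The function names and sample numbers are hypothetical and are not taken from the Depth Anything code.

```python
from statistics import fmean


def fit_scale_shift(pred, target):
    """Return (s, t) minimizing sum((s * p + t - g) ** 2) over paired samples."""
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("need two equally sized sequences with at least 2 samples")
    mean_p, mean_g = fmean(pred), fmean(target)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant, so the scale is undetermined")
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, target))
    scale = cov / var_p
    shift = mean_g - scale * mean_p
    return scale, shift


def align(pred, scale, shift):
    """Apply the fitted scale and shift to every predicted value."""
    return [scale * p + shift for p in pred]


# Hypothetical data: a relative prediction and metric values (in meters)
# that differ from it only by a scale of 2 and a shift of 1.
relative = [0.1, 0.4, 0.5, 0.9]
metric = [2.0 * r + 1.0 for r in relative]

scale, shift = fit_scale_shift(relative, metric)
aligned = align(relative, scale, shift)
```

A model that predicts metric depth directly would not need this alignment step, which is roughly what separates the first two items in the list above.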


The creators want to build "a simple yet powerful foundation model dealing with any images under any circumstances" without pursuing novel technical modules.

"We investigate two simple yet effective strategies that make data scaling-up promising. First, a more challenging optimization target is created by leveraging data augmentation tools. It compels the model to actively seek extra visual knowledge and acquire robust representations. Second, an auxiliary supervision is developed to enforce the model to inherit rich semantic priors from pre-trained encoders. We evaluate its zero-shot capabilities extensively, including six public datasets and randomly captured photos."
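The second strategy in the quote, inheriting semantic priors from a pre-trained encoder, can be pictured as an extra loss that pulls the depth model's per-pixel features toward those of a frozen encoder. The sketch below is one plausible form of such a loss, not the authors' implementation: the cosine-similarity formulation, the optional `margin` parameter, and all names and values are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def feature_alignment_loss(student, frozen, margin=None):
    """1 minus the mean cosine similarity between paired feature vectors.

    If `margin` is given (an assumed tolerance), pairs whose similarity
    already exceeds it are skipped, so well-aligned pixels stop
    contributing to the loss.
    """
    sims = [cosine(s, f) for s, f in zip(student, frozen)]
    if margin is not None:
        sims = [c for c in sims if c <= margin]
    if not sims:
        return 0.0
    return 1.0 - sum(sims) / len(sims)


# Hypothetical 2-D features for three pixels.
student_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frozen_feats = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

loss_all = feature_alignment_loss(student_feats, frozen_feats)
loss_margin = feature_alignment_loss(student_feats, frozen_feats, margin=0.9)
```

With the margin set, the first pixel is already aligned and drops out, so only the two poorly aligned pixels drive the loss.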

You can find more examples, the code, and training data on the project's page.

Blender Guru seems to approve of the tool.

Don't forget to join our 80 Level Talent platform and our Telegram channel, and follow us on Instagram, Twitter, and LinkedIn, where we share breakdowns, the latest news, awesome artworks, and more.
