LooseControl introduces a generalized version of depth conditioning for ControlNet.
Image credit: Shariq Farooq Bhat et al.
We usually post AI models that turn 2D into 3D, but this time, let's look at something different. Researchers from KAUST, University College London, and Adobe presented LooseControl, a curious model that allows generalized depth conditioning for diffusion-based image generation.
It uses ControlNet, a popular neural network for controlling diffusion models, to create 2D images based on a 3D layout. Simply put, you set up 3D boxes wherever you envision the objects to be in the final picture, change their position and size, and type a text prompt, and the AI will generate an image that takes your setup into consideration.
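To give a rough sense of the idea, the sketch below shows the general plumbing: rasterize coarse boxes into a depth-like conditioning image and feed it to a depth ControlNet pipeline together with a text prompt. This is not the authors' released code; the box coordinates are made up, and the standard depth ControlNet checkpoint is used only as a stand-in for LooseControl's own generalized-depth weights.

```python
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Rasterize axis-aligned boxes into a crude depth-like map: nearer boxes are brighter.
# LooseControl conditions on a loose 3D layout rather than an exact depth map;
# this flat rendering is only a placeholder to illustrate the workflow.
def boxes_to_depth(boxes, size=(512, 512)):
    depth = np.zeros(size, dtype=np.uint8)
    for x0, y0, x1, y1, near in boxes:  # pixel coords plus 0..255 "nearness"
        depth[y0:y1, x0:x1] = np.maximum(depth[y0:y1, x0:x1], near)
    return Image.fromarray(depth).convert("RGB")

# Hypothetical layout: a sofa-sized block and a smaller side table.
layout = boxes_to_depth([(100, 260, 400, 460, 200), (410, 330, 500, 460, 150)])

# Standard depth ControlNet as a stand-in; LooseControl's generalized-depth
# weights would be loaded here instead (assumption).
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a cozy living room with a sofa and a side table, warm lighting",
    image=layout,
    num_inference_steps=30,
).images[0]
image.save("loose_layout_render.png")
```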
Image credit: Shariq Farooq Bhat et al.
"Specifically, we allow scene boundary control for loosely specifying scenes with only boundary conditions, and 3D box control for specifying layout locations of the target objects rather than the exact shape and appearance of the objects. Using LooseControl, along with text guidance, users can create complex environments (e.g., rooms, street views, etc.) by specifying only scene boundaries and locations of primary objects."
Image credit: Shariq Farooq Bhat et al.
The paper also provides two editing mechanisms to improve the results. 3D box editing lets you refine images by changing, adding, or removing boxes while retaining the style of the image. Meanwhile, attribute editing suggests possible editing directions for changing a particular aspect of the scene.
The creators believe that LooseControl can become an important design tool that will help users easily create complex environments.
Find the project here and try it out yourself.