Imagic: Editing Images with Text Prompts

Another impressive application of diffusion models.

If you've ever needed to edit photos but lacked the skills for it, this research might interest you. The authors present Imagic, a method that transforms images based on a text prompt. It can change the posture and composition of one or more objects in an image while preserving the image's original characteristics.

The method requires only a single input image and a text prompt describing the desired result. It doesn't need image masks or additional views of the object, which makes it far easier to use than approaches that depend on them.

"Our method, which we call "Imagic", leverages a pre-trained text-to-image diffusion model for this task. It produces a text embedding that aligns with both the input image and the target text, while fine-tuning the diffusion model to capture the image-specific appearance," states the research paper.

The method is not as effortless as it might seem. The implementation of Imagic built on Stable Diffusion requires a GPU with about 30 GB of VRAM, and its examples were processed in around 5 minutes per image on a Lambda Labs A100.

