Apple Ships a New AI For Editing Images With Text Prompts

The framework, dubbed MGIE, leverages multimodal large language models.

Shortly after the launch of Apple Vision Pro, a futuristic AR/VR headset that garnered attention for its unique ability to make wearers look like tunnel-visioned blockheads, Apple, in collaboration with UC Santa Barbara, quietly released MLLM-Guided Image Editing (MGIE), a new AI framework that lets users edit images with text prompts.

Powered by multimodal large language models (MLLMs), the framework recognizes the objects depicted in an image and lets the user tweak the picture: add or remove objects, change the lighting, apply effects, and edit smaller details. According to the developers, MGIE notably improves results on both automatic metrics and human evaluation while maintaining competitive inference efficiency.

"MGIE learns to derive expressive instructions and provides explicit guidance," commented the team. "The editing model jointly captures this visual imagination and performs manipulation through end-to-end training. We evaluate various aspects of Photoshop-style modification, global photo optimization, and local editing. Extensive experimental results demonstrate that expressive instructions are crucial to instruction-based image editing, and our MGIE can lead to a notable improvement in automatic metrics and human evaluation while maintaining competitive inference efficiency."

You can learn more about MGIE here and try it out yourself over here. Also, don't forget to join our 80 Level Talent platform and our Telegram channel, follow us on Instagram, Twitter, and LinkedIn, where we share breakdowns, the latest news, awesome artworks, and more.
