
Researchers from Microsoft have presented Visual ChatGPT – an open-source extension that connects the AI dialogue system with a series of Visual Foundation Models so that it can send and receive images.
Right now, ChatGPT can't process or generate images; it can only write a text description that you then run through Stable Diffusion, DALL-E, or Midjourney. With this project, however, the system can create a picture, edit it, remove objects from it, and more.
Visual ChatGPT allows users to:
- send and receive not only text but also images;
- ask complex visual questions or give visual editing instructions that require several AI models to collaborate over multiple steps;
- provide feedback and ask for corrected results.
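To make the multi-step idea more concrete, here is a minimal sketch of how a dialogue session might chain several visual tools and then apply a follow-up correction. It is not the project's actual code: the tool functions, the `TOOLS` table, and the `Session` class are hypothetical stand-ins that only build strings, and the plan of steps is written by hand, whereas in the real system the language model would decide which models to call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A hypothetical "tool": takes the current image reference and a text
# argument, and returns a new image reference. Real tools would wrap models.
Tool = Callable[[str, str], str]


def generate(image: str, prompt: str) -> str:
    # Stand-in for a text-to-image model; ignores any existing image.
    return f"generated({prompt})"


def edit(image: str, instruction: str) -> str:
    # Stand-in for an instruction-based image editing model.
    return f"edited({image}, {instruction})"


def remove_object(image: str, target: str) -> str:
    # Stand-in for an object-removal (inpainting) model.
    return f"removed({image}, {target})"


# Hypothetical registry mapping tool names to callables.
TOOLS: Dict[str, Tool] = {
    "generate": generate,
    "edit": edit,
    "remove": remove_object,
}


@dataclass
class Session:
    """Keeps the latest image so later steps and feedback can build on it."""

    image: str = ""
    history: List[Tuple[str, str]] = field(default_factory=list)

    def run(self, steps: List[Tuple[str, str]]) -> str:
        # Apply each (tool name, argument) pair to the current image in order.
        for tool_name, argument in steps:
            self.image = TOOLS[tool_name](self.image, argument)
            self.history.append((tool_name, self.image))
        return self.image


session = Session()
# One request that needs two tools: create a picture, then remove an object.
first = session.run([("generate", "a cat on a sofa"), ("remove", "sofa")])
# Feedback on the result is just another step applied to the stored image.
second = session.run([("edit", "make the cat orange")])
print(first)
print(second)
```

Keeping the latest image in the session is what lets a correction apply to the previous result instead of starting from scratch.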
Find the project on GitHub and read the paper here.