
Visual ChatGPT: Combining Talking with Visuals

Text-to-image models can now be used from within the AI chatbot.

Researchers from Microsoft presented Visual ChatGPT, an open-source extension that connects the AI dialogue system with a series of Visual Foundation Models so that users can send and receive images.

Right now, ChatGPT can't process or generate images. It can only write a description that you then run through Stable Diffusion, DALL-E, or Midjourney yourself. With this project, however, the system can create a picture, edit it, remove objects from it, and more.

Visual ChatGPT allows:

  1. sending and receiving not only text but also images;
  2. asking complex visual questions or giving visual editing instructions that require several AI models to work together over multiple steps;
  3. providing feedback and asking for corrected results.
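
To make the multi-step idea concrete, here is a minimal sketch of how one request might be routed through several visual tools in sequence. It is not the project's actual code. The tool names, trigger words, and string "images" are hypothetical stand-ins, and simple keyword matching takes the place of the language model that decides which tool to call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VisualTool:
    name: str                   # hypothetical tool name
    trigger: str                # keyword standing in for the model's decision
    run: Callable[[str], str]   # takes an image reference, returns a new one


# Stand-in tools: real Visual Foundation Models would work on pixels,
# these only build strings so the chaining is easy to follow.
TOOLS = [
    VisualTool("generate", "draw", lambda image: "generated.png"),
    VisualTool("remove_object", "remove", lambda image: f"edited({image})"),
    VisualTool("describe", "describe", lambda image: f"caption of {image}"),
]


def plan(request: str) -> List[VisualTool]:
    """Pick tools whose trigger appears in the request, in order of mention."""
    text = request.lower()
    hits = [(text.find(t.trigger), t) for t in TOOLS if t.trigger in text]
    return [tool for _, tool in sorted(hits, key=lambda hit: hit[0])]


def handle(request: str, image: str = "") -> List[str]:
    """Run the planned tools one after another, feeding each output forward."""
    trace = []
    current = image
    for tool in plan(request):
        current = tool.run(current)
        trace.append(f"{tool.name} -> {current}")
    return trace


if __name__ == "__main__":
    for step in handle("Draw a cat, then remove the background and describe it"):
        print(step)
```

Each step receives the previous step's output, which is what lets a single instruction turn into a generate, edit, and describe chain. A follow-up correction from the user would simply be another call to `handle` with the latest image.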

Find the project on GitHub and read the paper here.
