
New ChatGPT Can See, Hear & Speak

OpenAI has introduced an improved version of its chatbot.

OpenAI has presented new voice and image capabilities in ChatGPT: the AI can now "look" at images, listen to your spoken requests, and reply out loud. Instead of typing everything out, you can show the bot a picture and talk to it much like you would to a person.

"Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you."

Image credit: OpenAI

The AI's voice is powered by a text-to-speech model that generates human-like audio from text and a few seconds of sample speech; OpenAI worked with professional voice actors to create each of the available voices. Your spoken requests are transcribed into text by Whisper, OpenAI's open-source speech recognition system.
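ChatGPT handles all of this inside its own apps, but similar building blocks are exposed through OpenAI's public API. Purely as an illustration, here is a minimal Python sketch of such a voice round trip: Whisper transcribes the question, a chat model answers it, and a text-to-speech model reads the answer back. The model names ("whisper-1", "gpt-4", "tts-1") and the "alloy" voice are assumptions drawn from OpenAI's API documentation, not details confirmed in this article.

```python
# Illustrative sketch only: a voice round trip assembled from OpenAI API calls.
# ChatGPT's built-in voice feature does not require any of this code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Speech -> text with Whisper
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Text -> answer with a chat model
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = reply.choices[0].message.content

# 3. Answer -> audio with a text-to-speech model (assumed model/voice names)
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer,
)
speech.write_to_file("answer.mp3")  # play this file back to the user
```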

Image recognition looks like a fantastic feature to have: instead of explaining something in words, you can simply show the chatbot what you mean. The ability is powered by GPT-4V (GPT-4 with vision), a new model that lets GPT-4 analyze image inputs you provide.
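The feature itself lives inside the ChatGPT apps, but for a sense of what "image inputs" means in practice, here is a rough Python sketch of sending a photo and a question to a vision-capable GPT-4 model through the OpenAI API. The model name "gpt-4-vision-preview" and the image-message format are assumptions based on OpenAI's API documentation rather than details given in this article; the file name and prompt are just examples.

```python
# Illustrative sketch only: asking a vision-capable GPT-4 model about a local photo.
import base64

from openai import OpenAI

client = OpenAI()

# Encode the photo as a data URL so it can be embedded in the request
with open("fridge.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What could I cook with these ingredients?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```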

OpenAI is planning to roll the features out gradually: Plus and Enterprise users will get to try them over the next two weeks, with other users gaining access "soon after".

Find out more about it here and join our 80 Level Talent platform and our Telegram channel, follow us on Instagram, Twitter, and LinkedIn, where we share breakdowns, the latest news, awesome artworks, and more.
