
New ChatGPT Can See, Hear & Speak

OpenAI has introduced an improved version of its chatbot.

OpenAI has presented new voice and image capabilities in ChatGPT: the AI can now "look" at images, listen to your spoken requests, and reply out loud. Instead of typing everything out, you can show the bot a picture and talk to it much like you would to a person.

"Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow up questions for a step by step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you."

Image credit: OpenAI

The AI's voice is powered by a text-to-speech model that generates human-like audio from text and a few seconds of sample speech; OpenAI worked with professional voice actors to create each of the available voices. Your spoken requests are transcribed into text by Whisper, OpenAI's open-source speech recognition system.
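ChatGPT handles all of this inside its own apps, but similar building blocks are exposed through OpenAI's public API. Purely as an illustration, here is a minimal Python sketch of such a voice round trip: Whisper transcribes the question, a chat model answers it, and a text-to-speech model reads the answer back. The model names ("whisper-1", "gpt-4", "tts-1") and the "alloy" voice are assumptions drawn from OpenAI's API documentation, not details confirmed in this article.

```python
# Illustrative sketch only: a voice round trip assembled from OpenAI API calls.
# ChatGPT's built-in voice feature does not require any of this code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Speech -> text with Whisper
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Text -> answer with a chat model
reply = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = reply.choices[0].message.content

# 3. Answer -> audio with a text-to-speech model (assumed model/voice names)
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer,
)
speech.write_to_file("answer.mp3")  # play this file back to the user
```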

Image recognition looks like a fantastic feature to have: instead of explaining something in words, you can simply show the chatbot what you mean. The ability is powered by GPT-4V (GPT-4 with vision), a new model that lets GPT-4 analyze image inputs you provide.
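The feature itself lives inside the ChatGPT apps, but for a sense of what "image inputs" means in practice, here is a rough Python sketch of sending a photo and a question to a vision-capable GPT-4 model through the OpenAI API. The model name "gpt-4-vision-preview" and the image-message format are assumptions based on OpenAI's API documentation rather than details given in this article; the file name and prompt are just examples.

```python
# Illustrative sketch only: asking a vision-capable GPT-4 model about a local photo.
import base64

from openai import OpenAI

client = OpenAI()

# Encode the photo as a data URL so it can be embedded in the request
with open("fridge.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed vision-capable model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What could I cook with these ingredients?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```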

OpenAI is planning to roll the features out gradually: Plus and Enterprise users will get to try them over the next two weeks, with other users gaining access "soon after".

Find out more about it here and join our 80 Level Talent platform and our Telegram channel, follow us on Instagram, Twitter, and LinkedIn, where we share breakdowns, the latest news, awesome artworks, and more.
