OpenAI gives its AI chatbot ears, eyes and a voice

It is one of the biggest updates ChatGPT has received since its release in November 2022. OpenAI is giving its popular AI chatbot two new ways to interact with users. Or as the provider writes: “ChatGPT can now see, hear and speak”.




ChatGPT: Conversation with AI chatbot

The new voice and image capabilities are meant to let users communicate more intuitively with ChatGPT. A kind of spoken conversation with the AI chatbot should become possible.

Users should also be able to “show” ChatGPT what they are talking about, for example by uploading a picture of a landscape and talking about its beauty.




Recipes based on pictures

Other uses suggested by OpenAI include taking and uploading pictures of the fridge or pantry, followed by a discussion of possible dinners and recipes with step-by-step instructions.

It should also be possible to take photos of the children’s homework and get tips on how to solve it. Or you can take a photo of a broken garden tool and get advice on repair options.




Dall-E 3 integrated into ChatGPT

In addition, ChatGPT will be able to generate images itself in the future thanks to the integration of the text-to-image AI Dall-E 3. Users can use the voice capabilities, for example, to have bedtime stories told to them or to settle an argument.

Users can choose from different voices for ChatGPT, which were generated with the help of professional actors. To convert the users’ spoken words into text that ChatGPT can understand, OpenAI’s Whisper speech recognition system is used, among other things.




Translation tool for Spotify podcasts

To output ChatGPT’s answers in spoken form, OpenAI has developed a new text-to-speech model. This model is already available to several other companies, including Spotify, which has used it to create a translation tool for podcasts. The hosts’ original voices are reproduced in the translated languages, such as Spanish, French and German.

OpenAI points out that the new ChatGPT functions can only be used in English. The company wants to prevent fraudsters from misusing the AI chatbot’s voice capabilities by limiting users to a set of selectable voices.




OpenAI: Problems with image recognition

OpenAI has also dealt with possible problems caused by image recognition in recent months. While ChatGPT doesn’t answer questions like “How do I make a bomb?”, this protection could have been bypassed with a picture of a bomb and the question “How do I make the object shown in the picture?”, as MIT Technology Review explained.

OpenAI had to close potential loopholes like these before releasing the new functions. In any case, the ChatGPT provider seems confident that these and similar problems have been resolved.


The voice and image recognition features will be available to all Plus and Enterprise users of ChatGPT within the next two weeks. However, only image recognition will be accessible on all platforms. The voice features are limited to the app (iOS and Android).




Enable the new feature in Settings

To use the voice features, users have to go to Settings and enable voice conversations under New Features. You can then select one of the five voice variants currently offered by tapping the headphone symbol at the top right.

To upload a photo, click on the corresponding icon. In the app you have to tap the “plus” button first. It is then also possible to discuss several photos or use the integrated drawing tool to point out specific content.
