OpenAI launches DALL-E 3 API, new text-to-speech models

OpenAI launched a slew of new APIs during its first-ever developer day.

DALL-E 3, OpenAI’s text-to-image model, is now available via an API after first coming to ChatGPT and Bing Chat. Similar to the previous version of DALL-E, the API incorporates built-in moderation to help protect against misuse, OpenAI says.

The DALL-E 3 API offers different format and quality options, with prices starting at $0.04 per image generated.

Elsewhere, OpenAI is now providing a text-to-speech API that offers six preset voices and two generative AI model variants. It’s available starting today, with pricing starting at $0.015 per 1,000 input characters.
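For a rough sense of what those starting prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the figures are the starting prices quoted above (higher format and quality tiers cost more), and the text-to-speech estimate assumes billing scales linearly with character count, which the announcement does not spell out.

```python
from decimal import Decimal

# Starting prices quoted in the article; other tiers may cost more.
DALLE3_STARTING_PRICE_PER_IMAGE = Decimal("0.04")
TTS_STARTING_PRICE_PER_1K_CHARS = Decimal("0.015")


def estimate_image_cost(num_images: int) -> Decimal:
    """Lower-bound cost in USD for generating num_images images."""
    return DALLE3_STARTING_PRICE_PER_IMAGE * num_images


def estimate_tts_cost(text: str) -> Decimal:
    """Lower-bound cost in USD for converting text to speech.

    Assumes cost is proportional to the number of input characters.
    """
    return TTS_STARTING_PRICE_PER_1K_CHARS * len(text) / 1000


if __name__ == "__main__":
    script = "a" * 5000  # stand-in for a 5,000-character script
    print(f"100 images: ${estimate_image_cost(100)}")
    print(f"5,000 characters of speech: ${estimate_tts_cost(script)}")
```

Under those assumptions, 100 images at the starting tier come to about $4, and a 5,000-character script comes to about $0.075.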

“This is much more natural than anything else we’ve heard out there, which can make apps more natural to interact with and more accessible,” OpenAI CEO Sam Altman said on stage. “It also unlocks a lot of use cases like language learning and voice assistance.”

In a related announcement, OpenAI launched the next version of its open source automatic speech recognition model, Whisper large-v3, which the company claims offers improved performance across languages.
