Microsoft Build 2024: What GPT-4o can do on Azure AI

OpenAI’s multimodal model GPT-4o is now available to developers on Microsoft’s Azure AI.

At Microsoft Build 2024, the company’s developer conference, Microsoft shared that those itching to get their hands on GPT-4o can now access it through Azure AI Studio and via an API.

Microsoft’s Azure AI Studio is a playground for developers to try out the latest tools supported by Azure, including OpenAI models like GPT-4 Turbo — and now GPT-4o.
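
For developers, the API route works like any other Azure OpenAI call. Here's a minimal sketch using the Azure OpenAI Python client; the endpoint, API key variable, and deployment name are placeholders for whatever your own Azure resource uses:

import os

from openai import AzureOpenAI

# Minimal sketch: calling a GPT-4o deployment on Azure OpenAI.
# The endpoint and deployment name are placeholders; substitute
# the values from your own Azure resource.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="gpt-4o",  # the name of your deployment, which may differ
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "In one sentence, what is GPT-4o?"},
    ],
)

print(response.choices[0].message.content)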

GPT-4o’s image and vision capabilities are already available via OpenAI’s own API and ChatGPT, but the highly anticipated Voice Mode is still a few weeks away. The same goes for GPT-4o access through Azure AI Studio and Microsoft’s API — no Voice Mode yet. A blog post on the Microsoft Tech Community hub said audio capabilities would come “in the future.”
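
Those vision capabilities ride on the same chat completions format, which accepts image input alongside text. A rough sketch against OpenAI's own API (the image URL is a placeholder, and the same message shape works against an Azure GPT-4o deployment):

from openai import OpenAI

# Sketch: sending an image alongside text in a single chat request.
# The image URL is a placeholder for your own content.
client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's happening in this screenshot?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/screenshot.png"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)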

Audio isn’t available through Azure AI just yet, but Microsoft CEO Satya Nadella shared some of the ways people can (eventually) use GPT-4o through Copilot. That included sharing your screen or session with the GPT-4o-powered Copilot and asking it for help playing Minecraft. As Mashable’s Alex Perry noted, however, if you’re struggling with Minecraft, “you can either play the game for 10 minutes or just Google it.”

[Image: Minecraft screen with Copilot running in the background. Caption: Help with Minecraft that could easily be googled. Credit: Microsoft]

Nadella went on to talk about what developers can do with GPT-4o on Azure AI.

“One of the coolest things is how any app, any website can essentially be turned into a full multimodal full duplex conversational canvas,” Nadella said. That means developers can create agents that help people navigate apps and websites. In one demo, an agent helped a man who was hurrying to pack for an overnight camping trip choose the right shoes, then actually added them to his shopping cart.
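
Microsoft didn’t show the plumbing behind that demo, but agents like this are commonly built with the chat completions tool-calling feature, where the model asks the app to run a function on its behalf. A hypothetical sketch; the add_to_cart tool and its parameters are invented for illustration:

import json

from openai import OpenAI

# Hypothetical sketch of a shopping agent built on tool calling.
# The add_to_cart tool is invented for illustration; a real app
# would wire the call to its own cart backend.
client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "add_to_cart",
            "description": "Add a product to the user's shopping cart.",
            "parameters": {
                "type": "object",
                "properties": {
                    "product_id": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
                "required": ["product_id", "quantity"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "I need trail shoes for an overnight trip. Add a pair to my cart.",
        }
    ],
    tools=tools,
)

# If the model chose to call the tool, the app executes it.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "add_to_cart":
        args = json.loads(call.function.arguments)
        print(f"Adding {args['quantity']} x {args['product_id']} to cart")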

[Image: A man holds hiking sandals up to his computer. Caption: GPT-4o as a shopping agent can help clueless hikers find the right shoes. Credit: Microsoft]

Later in the keynote, Microsoft CTO Kevin Scott showed how GPT-4o could help with code, emphasizing that models will continue to get faster and more powerful. When Principal Engineer Jennifer Marsman pointed her phone at a screen of code, a ChatGPT-style bot powered by GPT-4o read the code and helped her troubleshoot the problem in real time.

[Image: An iPhone pointed at a computer screen, showing the code that appears on it. Caption: GPT-4o can help troubleshoot code problems. Credit: Microsoft]

At Build, Microsoft sprinkled GPT-4o throughout many of its announcements, including Copilot, Teams, and more. But GPT-4o on Azure AI puts the multimodal model in the hands of developers, which means plenty of multimodal apps and tools are sure to follow.
