Microsoft has launched its first in-house AI models, MAI-Voice-1 and MAI-1 Preview, marking a shift towards competing directly with OpenAI.
On Thursday, Microsoft announced a significant milestone in its AI journey: the debut of its first in-house models, MAI-Voice-1 and MAI-1 Preview. The move marks a strategic pivot: Microsoft, until now primarily dependent on OpenAI's GPT models to power Copilot, is stepping into direct competition with ChatGPT's creator and a host of other AI pioneers.
What Can Microsoft’s New AI Models Do?
MAI-Voice-1 is a cutting-edge model for generating natural speech. Microsoft touts it as one of the most efficient vocal AI models, capable of creating a full minute of audio in less than a second using just a single GPU. This efficiency sets it apart from other speech tools, reducing both computational demands and delivery time.
MAI-Voice-1 already powers the Copilot Daily and Podcast features and is now rolling out in Copilot Labs, giving users immediate access to Microsoft's latest advances in AI-driven voice synthesis.
Meanwhile, MAI-1 Preview is Microsoft's in-house foundation large language model, now open for public testing on LMArena, a leading platform for benchmarking LLMs through crowdsourced head-to-head comparisons. At present, MAI-1 Preview ranks 13th on the LMArena leaderboard, trailing top models such as GPT-5, Gemini 2.5 Pro, DeepSeek R1, and Grok 3. Notably, MAI-1 Preview was trained on roughly 15,000 Nvidia H100 GPUs, a fraction of the hardware some competitors have employed; Grok, for instance, was trained on more than 100,000 GPUs.
Insights into Microsoft’s Evolving AI Strategy
In a discussion with Semafor, Mustafa Suleyman, CEO of Microsoft AI, explained that his team drew valuable lessons from the open-source AI community on maximizing results with relatively modest hardware. According to Suleyman, "The art and craft of training models is selecting the perfect data and not wasting any of your flops on unnecessary tokens." The focus, he says, is on efficiency and selectivity: getting the model to learn the right things, not just more things.
Microsoft also revealed that next-generation models are already in development at some of the world's largest data centers, now equipped with Nvidia's powerful new GB200 chips. Looking ahead, Microsoft plans to build a broad ecosystem of specialized AI models to address diverse user needs and deliver even greater value.
As the company wrote in a recent blog post: "We have big ambitions for where we go next. Not only will we pursue further advances here, but we believe that orchestrating a range of specialized models serving different user intents and use cases will unlock immense value. There will be a lot more to come from this team on both fronts in the near future."