Grok 4 is here: Beats OpenAI and Google in new AI test, adds code and voice tools

New Delhi: After a delayed livestream and weeks of online criticism, Elon Musk and his xAI team officially launched Grok 4 on July 10 (IST). The rollout was long and scattered, but the message was clear: this is Musk’s most advanced AI model yet. Grok 4 introduces new multimodal capabilities, a coding-focused variant, and a heavy focus on reasoning. But the launch also comes just hours after X CEO Linda Yaccarino resigned, raising questions about the timing and the state of internal affairs at the company.

The release follows a wave of controversy over Grok’s past behavior, with older versions of the chatbot producing racist and offensive replies. Musk has stayed mostly silent on that front, instead using the launch event to frame Grok 4 as a leap into a new phase of intelligence. “Reality is the ultimate reasoning test,” he said during the stream.

ARC AGI leaderboard: Source – arcprize.org/leaderboard

What’s new in Grok 4

Grok 4 comes with a long list of upgrades. It has been trained on xAI’s Colossus supercomputer, which Musk says enables “scientist-grade reasoning.” That claim is backed up by early benchmark results from Artificial Analysis, which show Grok 4 scoring an Intelligence Index of 73, ahead of OpenAI’s o3 (70), Gemini 2.5 Pro (70), and Claude 4 Opus (64).

Multimodal input is now part of Grok’s offering. It can process text and images, with support for video expected later. The context window is 256,000 tokens, larger than Claude 4 Opus and o3, though not as big as Gemini’s 1 million tokens.

One standout addition is Grok 4 Code, a developer-focused version that writes and debugs code. Musk even posted on X, saying, “You can cut & paste your entire source code file into the query entry box on grok.com and Grok 4 will fix it for you… works better than Cursor.”

Voice has also been improved. Grok 4 Voice now sounds more human, with fewer interruptions and smoother interactions.

Artificial Analysis Intelligence Index

Real-time web access and meme fluency

Grok 4 continues to use DeepSearch, pulling real-time data from the web, especially from Musk’s X platform. This gives it access to live updates without needing a browser.

xAI also claims the model understands internet culture better than its competitors. It’s being tuned to interpret memes, slang, and cultural jokes accurately, making it more fluent in how people talk online. Whether that’s a feature or a risk depends on how the model handles boundaries.

Grok 4 – Pricing

Pricing and performance

The model is available via xAI’s API and Grok chatbot on X. According to Artificial Analysis, Grok 4 outputs at 75 tokens per second. That’s slower than o3 (188/s) and Gemini 2.5 Pro (142/s), but faster than Claude 4 Opus (66/s).
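To put those throughput figures in practical terms, here is a rough sketch in Python that converts tokens per second into waiting time for a response of a given length. The speeds are the Artificial Analysis numbers quoted above; the function name is made up for illustration, and the calculation ignores time to first token and network latency, so real-world waits will differ.

```python
# Output speeds in tokens per second, as quoted in this article
# (source: Artificial Analysis).
OUTPUT_TOKENS_PER_SECOND = {
    "Grok 4": 75,
    "o3": 188,
    "Gemini 2.5 Pro": 142,
    "Claude 4 Opus": 66,
}


def generation_seconds(model: str, output_tokens: int) -> float:
    """Approximate seconds needed to stream `output_tokens` tokens.

    Hypothetical helper: assumes a constant output speed and ignores
    time to first token and network latency.
    """
    return output_tokens / OUTPUT_TOKENS_PER_SECOND[model]


if __name__ == "__main__":
    # Compare how long a 1,500-token answer would take on each model.
    for name in OUTPUT_TOKENS_PER_SECOND:
        print(f"{name}: {generation_seconds(name, 1500):.1f} s")
```

By this estimate, a 1,500-token answer takes about 20 seconds on Grok 4, compared with roughly 8 seconds on o3.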

Pricing remains the same as Grok 3: $3 (₹261) per 1M input tokens and $15 (₹1,305) per 1M output tokens. Cached input is billed at a lower rate of $0.75 (₹65) per 1M tokens. The paid subscription plans are SuperGrok at $30/month (₹2,610) and SuperGrok Heavy at $300/month (₹26,100).
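For developers wondering what those API rates mean per request, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the rates are the ones quoted above, the helper name is made up, the cached rate is assumed to apply per 1M tokens, and any other tiers or surcharges xAI may apply are not modelled.

```python
# API rates in USD per 1 million tokens, as quoted in this article.
INPUT_USD_PER_M = 3.00
CACHED_INPUT_USD_PER_M = 0.75  # assumed to be per 1M cached input tokens
OUTPUT_USD_PER_M = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one API request (hypothetical helper).

    `cached_input_tokens` is the part of `input_tokens` billed at the
    cheaper cached rate; the rest is billed at the normal input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (fresh_input_tokens * INPUT_USD_PER_M
             + cached_input_tokens * CACHED_INPUT_USD_PER_M
             + output_tokens * OUTPUT_USD_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # A large prompt (200,000 tokens) with a 10,000-token reply.
    print(f"${estimate_cost_usd(200_000, 10_000):.2f}")
```

A 200,000-token prompt with a 10,000-token reply works out to about $0.75 at these rates, and output tokens cost five times as much as input tokens, so long answers dominate the bill faster than long prompts.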

Introducing Grok 4, the world’s most powerful AI model. Watch the livestream now: https://t.co/59iDX5s2ck

— xAI (@xai) July 10, 2025