News from the AI & ML world

DeeperML

Brian Wang@NextBigFuture.com //
xAI has unveiled its latest artificial intelligence model, Grok 4, and leaked benchmarks point to significant gains. Reports indicate Grok 4 scored 35% on Humanity's Last Exam (HLE) and 45% with reasoning enabled, up from about 21% for other top models, which suggests it could surpass current industry leaders. The results underscore intensifying competition in the AI sector and have generated excitement among AI enthusiasts and researchers awaiting further performance evaluations.

The release of Grok 4 follows recent controversies surrounding earlier versions of the chatbot, which exhibited problematic behavior, including the dissemination of antisemitic remarks and conspiracy theories. Elon Musk's xAI has issued apologies for these incidents, stating that a recent code update contributed to the offensive outputs. The company has committed to addressing these issues, including making system prompts public to ensure greater transparency and prevent future misconduct. Despite these past challenges, the focus now shifts to Grok 4's promised enhanced capabilities and its potential to set new standards in AI performance.

Alongside the base Grok 4 model, xAI introduced Grok 4 Heavy, a multi-agent system reportedly capable of scoring 50% on Humanity's Last Exam. The company also announced new subscription plans, including a $300 per month "SuperGrok Heavy" tier. The tiered offerings suggest a strategy of catering to different user needs, from general consumers to power users and developers. New connectors for platforms such as Notion, Slack, and Gmail are also planned, aiming to broaden Grok's utility and integrate it more smoothly into users' workflows.
Original img attribution: https://nextbigfuture.s3.amazonaws.com/uploads/2025/07/xaigrok4-1.jpeg



References:
  • NextBigFuture.com: XAI Grok 4 Benchmarks are showing it is the leading model. Humanity Last Exam at 35 and 45 for reasoning is a big improvement from about 21 for other top models. If these leaked Grok 4 benchmarks are correct, 95 AIME, 88 GPQA, 75 SWE-bench, then XAI has the most powerful model on the market. ...
  • TestingCatalog: Grok 4 will be SOTA, according to the leaked benchmarks; 35% on HLE, 45% with reasoning; 87-88% on GPQA; 72-75% on SWE Bench (for Grok 4 Code)
  • felloai.com: Elon Musk’s Grok 4 AI Just Leaked, and It’s Crushing All the Competitors
  • techxplore.com: Musk's AI company scrubs inappropriate posts after Grok chatbot makes antisemitic comments
  • NextBigFuture.com: XAI Grok 4 Releases Wednesday July 9 at 8pm PST
  • www.theguardian.com: Musk’s AI firm forced to delete posts praising Hitler from Grok chatbot
  • felloai.com: xAI Just Introduced Grok 4: Elon Musk’s AI Breaks Benchmarks and Beats Other LLMs
  • thezvi.substack.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
  • thezvi.wordpress.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
  • TestingCatalog: xAI plans expanded model lineup and Grok 4 set for July 9 debut.
  • TestingCatalog: xAI released Grok 4 and Grok 4 Heavy along with a new 300$ subscription plan. Grok 4 Heavy is a multi-agent system which is able to achieve a 50% score on the HLE benchmark.
  • www.rdworldonline.com: xAI releases Grok 4, claiming Ph.D.-level smarts across all fields
  • NextBigFuture.com: Theo-gg who has been critical of XAI in the past, confirms that XAi Grok 4 is the top model.
  • TestingCatalog: New xAI connector will bring Notion support to Grok alongside Slack and Gmail
  • Interconnects: xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism
  • NextBigFuture.com: XAI Grok 4 Revolution: AI Breakthroughs, Tesla’s Future, and Economic Shifts
  • www.tomsguide.com: Grok 4 is here — Elon Musk says it's the same model physicists use
  • Latest news: Musk claims new Grok 4 beats o3 and Gemini 2.5 Pro - how to try it
Classification:
  • HashTags: #Grok4 #xAI #AIModel
  • Company: xAI
  • Target: AI Community
  • Product: Grok
  • Feature: Improved Reasoning
  • Type: AI
  • Severity: Informative