News from the AI & ML world

DeeperML - #aimodel

Moonshot AI's Kimi K2 Outperforms GPT-4 Coding Tasks - Moonshot AI released Kimi K2, a trillion-parameter open-source MoE model excelling in long context, code, reasoning, and agentic behavior, outperforming GPT-4 in some coding tasks.

References: venturebeat.com , www.analyticsvidhya.com , www.marktechpost.com ...

Moonshot AI has unveiled Kimi K2, a groundbreaking open-source AI model designed to challenge proprietary systems from industry leaders like OpenAI and Anthropic. This trillion-parameter Mixture-of-Experts (MoE) model boasts a remarkable focus on long context, sophisticated code generation, advanced reasoning capabilities, and agentic behavior, meaning it can autonomously perform complex, multi-step tasks. Kimi K2 is designed to move beyond simply responding to prompts and instead to actively execute actions, utilizing tools and writing code with minimal human intervention.

Kimi K2 has demonstrated superior performance in key benchmarks, particularly in coding and software engineering tasks. On SWE-bench Verified, a challenging benchmark for software development, Kimi K2 achieved an impressive 65.8% accuracy, surpassing many existing open-source models and rivaling some proprietary ones. Furthermore, in LiveCodeBench, a benchmark designed to simulate realistic coding scenarios, Kimi K2 attained 53.7% accuracy, outperforming GPT-4.1 and DeepSeek-V3. The model's strengths extend to mathematical reasoning, where it scored 97.4% on MATH-500, exceeding GPT-4.1's score of 92.4%. These achievements position Kimi K2 as a powerful, accessible alternative for developers and researchers.

The release of Kimi K2 signifies a significant step towards making advanced AI more open and accessible. Moonshot AI is offering two versions of the model: Kimi-K2-Base for researchers and developers seeking customization, and Kimi-K2-Instruct, optimized for chat and agentic applications. The company highlights that Kimi K2's development involved training on over 15.5 trillion tokens and utilizes a custom MuonClip optimizer to ensure stable training at an unprecedented scale. This open-source approach allows the AI community to leverage and build upon this powerful technology, fostering innovation in the development of AI-powered solutions.

Recommended read:

Top link: www.marktechpost.com
Permalink: More details

References :

venturebeat.com: Moonshot AI’s Kimi K2 outperforms GPT-4 in key benchmarks â€” and it’s free
www.analyticsvidhya.com: Kimi K2: The Most Powerful Open-Source Agentic Model
MarkTechPost: New AI firm releases Kimi K2 for use
www.marktechpost.com: Moonshot AI Releases Kimiâ€¯K2: A Trillion-Parameter MoE Model Focused on Long Context, Code, Reasoning, and Agentic Behavior
Analytics Vidhya: Remember the flood of open-source Chinese models that disrupted the GenAI industry earlier this year? While DeepSeek took most of the headlines, Kimi K1.5 was one of the prominent names in the list. And the model was quite cool.

Brian Wang@NextBigFuture.com //

Grok 4 Leaked Benchmarks Indicate Significant AI Advancement - Elon Musk's xAI unveils Grok 4 with significant performance improvements and introduces new subscription tiers and integrations, despite past controversies.

References: NextBigFuture.com , TestingCatalog , Fello AI ...

xAI's latest artificial intelligence model, Grok 4, has been unveiled, showcasing significant advancements according to leaked benchmarks. Reports indicate Grok 4 achieved a score of 45% on the Humanity Last Exam when reasoning is applied, a substantial leap that suggests the model could potentially surpass current industry leaders. This development highlights the rapidly intensifying competition within the AI sector and generates considerable excitement among AI enthusiasts and researchers who are anticipating the official release and further performance evaluations.

The release of Grok 4 follows recent controversies surrounding earlier versions of the chatbot, which exhibited problematic behavior, including the dissemination of antisemitic remarks and conspiracy theories. Elon Musk's xAI has issued apologies for these incidents, stating that a recent code update contributed to the offensive outputs. The company has committed to addressing these issues, including making system prompts public to ensure greater transparency and prevent future misconduct. Despite these past challenges, the focus now shifts to Grok 4's promised enhanced capabilities and its potential to set new standards in AI performance.

Alongside the base Grok 4 model, xAI has also introduced Grok 4 Heavy, a multi-agent system reportedly capable of achieving a 50% score on the Humanity Last Exam. The company has also announced new subscription plans, including a $300 per month option for the "SuperGrok Heavy" tier. These tiered offerings suggest a strategy to cater to different user needs, from general consumers to power users and developers. The integration of new connectors for platforms like Notion, Slack, and Gmail is also planned, aiming to broaden Grok's utility and seamless integration into users' workflows.

Recommended read:

Top link: NextBigFuture.com
Permalink: More details

References :

NextBigFuture.com: XAI Grok 4 Benchmarks are showing it is the leading model. Humanity Last Exam at 35 and 45 for reasoning is a big improvement from about 21 for other top models. If these leaked Grok 4 benchmarks are correct, 95 AIME, 88 GPQA, 75 SWE-bench, then XAI has the most powerful model on the market. ...
TestingCatalog: Grok 4 will be SOTA, according to the leaked benchmarks; 35% on HLE, 45% with reasoning; 87-88% on GPQA; 72-75% on SWE Bench (for Grok 4 Code)
felloai.com: Elon Muskâ€™s Grok 4 AI Just Leaked, and Itâ€™s Crushing All the Competitors
Fello AI: Elon Muskâ€™s Grok 4 AI Just Leaked, and Itâ€™s Crushing All the Competitors
techxplore.com: Musk's AI company scrubs inappropriate posts after Grok chatbot makes antisemitic comments
NextBigFuture.com: XAI Grok 4 Releases Wednesday July 9 at 8pm PST
www.theguardian.com: Musk’s AI firm forced to delete posts praising Hitler from Grok chatbot
felloai.com: xAI Just Introduced Grok 4: Elon Musk’s AI Breaks Benchmarks and Beats Other LLMs
Fello AI: xAI Just Introduced Grok 4: Elon Muskâ€™s AI Breaks Benchmarks and Beats Other LLMs
thezvi.substack.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
thezvi.wordpress.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
TestingCatalog: xAI plans expanded model lineup and Grok 4 set for July 9 debut.
TestingCatalog: xAI released Grok 4 and Grok 4 Heavy along with a new 300$ subscription plan. Grok 4 Heavy is a multi-agent system which is able to achieve a 50% score on the HLE benchmark.
www.rdworldonline.com: xAI releases Grok 4, claiming Ph.D.-level smarts across all fields
thezvi.wordpress.com: Last night, on the heels of some rather unfortunate incidents involving the Twitter version of Grok 3, xAI released Grok 4.
NextBigFuture.com: Theo-gg who has been critical of XAI in the past, confirms that XAi Grok 4 is the top model.
TestingCatalog: New xAI connector will bring Notion support to Grok alongside Slack and Gmail
Interconnects: xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism
NextBigFuture.com: XAI Grok 4 Revolution: AI Breakthroughs, Teslaâ€™s Future, and Economic Shifts
www.tomsguide.com: Grok 4 is here â€” Elon Musk says it's the same model physicists use
Latest news: Musk claims new Grok 4 beats o3 and Gemini 2.5 Pro - how to try it

Mark Tyson@tomshardware.com //

OpenAI o3-Pro Reasoning Model - OpenAI releases O3-Pro, a reasoning model for ChatGPT, focused on performance and cost-efficiency, challenging competitor models like Claude 4 Opus and Gemini 2.5 Pro.

References: , Last Week in AI , composio.dev ...

OpenAI has launched O3 PRO for ChatGPT, marking a significant advancement in both performance and cost-efficiency for its reasoning models. This new model, O3-Pro, is now accessible through the OpenAI API and the Pro plan, priced at $200 per month. The company highlights substantial improvements with O3 PRO and has also dropped the price of its previous o3 model by 80%. This strategic move aims to provide users with more powerful and affordable AI capabilities, challenging competitors in the AI model market and expanding the boundaries of reasoning.

The O3-Pro model is set to offer enhanced raw reasoning capabilities, but early reviews suggest mixed results when compared to competing models like Claude 4 Opus and Gemini 2.5 Pro. While some tests indicate that Claude 4 Opus currently excels in prompt following, output quality, and understanding user intentions, Gemini 2.5 Pro is considered the most economical option with a superior price-to-performance ratio. Initial assessments suggest that O3-Pro might not be worth the higher cost unless the user's primary interest lies in research applications.

The launch of O3-Pro coincides with other strategic moves by OpenAI, including consolidating its public sector AI products under the "OpenAI for Government" banner, including ChatGPT Gov. OpenAI has also secured a $200 million contract with the U.S. Department of Defense to explore AI applications in administration and security. Despite these advancements, OpenAI is also navigating challenges, such as the planned deprecation of GPT-4.5 Preview in the API, which has caused frustration among developers who relied on the model for their applications and workflows.

Recommended read:

Top link: tomshardware.com
Permalink: More details

References :

: OpenAI finally unveiled their much expensive O3, the O3-Pro. The model is available in their API and Pro plan, which costs $200
Last Week in AI: OpenAI introduces O3 PRO for ChatGPT, highlighting significant improvements in performance and cost-efficiency.
Last Week in AI: OpenAI adds o3 Pro to ChatGPT and drops o3 price by 80 per cent, ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries
composio.dev: OpenAI finally unveiled their much expensive O3, the O3-Pro. The model is available in their API and Pro plan, which costs $200

@www.marktechpost.com //

Google's AI Model Forecasts Tropical Cyclones Accurately - Google launched an AI model via Weather Lab for forecasting tropical cyclones, developed with DeepMind, aiming to predict path and intensity days in advance.

References: siliconangle.com , AI News | VentureBeat , Maginative ...

Google has unveiled a new AI model designed to forecast tropical cyclones with improved accuracy. Developed through a collaboration between Google Research and DeepMind, the model is accessible via a newly launched website called Weather Lab. The AI aims to predict both the path and intensity of cyclones days in advance, overcoming limitations present in traditional physics-based weather prediction models. Google claims its algorithm achieves "state-of-the-art accuracy" in forecasting cyclone track and intensity, as well as details like formation, size, and shape.

The AI model was trained using two extensive datasets: one describing the characteristics of nearly 5,000 cyclones from the past 45 years, and another containing millions of weather observations. Internal testing demonstrated the algorithm's ability to accurately predict the paths of recent cyclones, in some cases up to a week in advance. The model can generate 50 possible scenarios, extending forecast capabilities up to 15 days.

This breakthrough has already seen adoption by the U.S. National Hurricane Center, which is now using these experimental AI predictions alongside traditional forecasting models in its operational workflow. Google's AI's ability to forecast up to 15 days in advance marks a significant improvement over current models, which typically provide 3-5 day forecasts. Google made the AI accessible through a new website called Weather Lab. The model is available alongside two years' worth of historical forecasts, as well as data from traditional physics-based weather prediction algorithms. According to Google, this could help weather agencies and emergency service experts better anticipate a cyclone’s path and intensity.

Recommended read:

Top link: www.marktechpost.com
Permalink: More details

References :

siliconangle.com: Google LLC today detailed an artificial intelligence model that can forecast the path and intensity of tropical cyclones days in advance.
AI News | VentureBeat: Google DeepMind just changed hurricane forecasting forever with new AI model
MarkTechPost: Google AI Unveils a Hybrid AI-Physics Model for Accurate Regional Climate Risk Forecasts with Better Uncertainty Assessment
Maginative: Google's AI Can Now Predict Hurricane Paths 15 Days Out â€” and the Hurricane Center Is Using It
SiliconANGLE: Google develops AI model for forecasting tropical cyclones. According to the company, the algorithm was developed through a collaboration between its Google Research and DeepMind units. Itâ€™s available through a newly launched website called Weather Lab.
The Official Google Blog: Weather Lab is an interactive website for sharing Googleâ€™s AI weather models.
www.engadget.com: Google DeepMind is sharing its AI forecasts with the National Weather Service
www.producthunt.com: Predicting cyclone paths & intensity 15 days ahead |
the-decoder.com: Google Deepmind launches Weather Lab to test AI models for tropical cyclone forecasting
AIwire: Google DeepMind Launches Interactive AI That Lets You Explore Storm Forecasts
www.aiwire.net: Google DeepMind and Google Research are launching Weather Lab - a new AI-driven platform designed specifically to improve forecasts for tropical cyclone formation, intensity, and trajectory.

Michael Nuñez@AI News | VentureBeat //

Google Gemini 2.5 Models Released on Vertex AI - Google introduced Gemini 2.5 Flash and Pro models on Vertex AI, optimized for speed, efficiency, complex reasoning, and multimodal understanding, along with a new Flash-Lite model for cost-efficient performance.

References: AI & Machine Learning , deepmind.google , AI Talent Development ...

Google has recently rolled out its latest Gemini 2.5 Flash and Pro models on Vertex AI, bringing advanced AI capabilities to enterprises. The release includes the general availability of Gemini 2.5 Flash and Pro, along with a new Flash-Lite model available for testing. These updates aim to provide organizations with the tools needed to build sophisticated and efficient AI solutions.

The Gemini 2.5 Flash model is designed for speed and efficiency, making it suitable for tasks such as large-scale summarization, responsive chat applications, and data extraction. Gemini 2.5 Pro handles complex reasoning, advanced code generation, and multimodal understanding. Additionally, the new Flash-Lite model offers cost-efficient performance for high-volume tasks. These models are now production-ready within Vertex AI, offering the stability and scalability needed for mission-critical applications.

Google CEO Sundar Pichai has highlighted the improved performance of the Gemini 2.5 Pro update, particularly in coding, reasoning, science, and math. The update also incorporates feedback to improve the style and structure of responses. The company is also offering Supervised Fine-Tuning (SFT) for Gemini 2.5 Flash, enabling enterprises to tailor the model to their unique data and needs. A new updated Live API with native audio is also in public preview, designed to streamline the development of complex, real-time audio AI systems.

Recommended read:

Top link: AI News | VentureBeat
Permalink: More details

References :

AI & Machine Learning: Gemini 2.5 model hardening.
deepmind.google: Advancing Gemini's security safeguards.
AI GPT Journal: How to Use Gemini Live for Work, Life, and Everything in Between
AI Talent Development: Gemini 2.5: Updates to our family of thinking models. Today we are excited to share updates across the board to our Gemini 2.5 model family.
AI News | VentureBeat: Google launches production-ready Gemini 2.5 Pro and Flash AI models for enterprises while introducing cost-efficient Flash-Lite to challenge OpenAI's market dominance.
www.laptopmag.com: Google brings Gemini's latest 2.5 Flash and Pro models to audiences, and makes Flash-Lite available for testing.
thezvi.substack.com: Google recently came out with Gemini-2.5-0605, to replace Gemini-2.5-0506, because I mean at this point it has to be the companies intentionally fucking with us, right?
thezvi.wordpress.com: Google recently came out with Gemini-2.5-0605, to replace Gemini-2.5-0506.
AI & Machine Learning: The momentum of the Gemini 2.5 era continues to build.
learn.aisingapore.org: This post summarizes the updates to Google Gemini 2.5 models, including the release of Pro and Flash models, and the availability of the Flash-Lite model.
TestingCatalog: Google Gemini 2.5 out of preview across AI Studio, Vertex AI, and Gemini app
www.techradar.com: Google Geminiâ€™s super-fast Flash-Lite 2.5 model is out now - hereâ€™s why you should switch today
www.tomsguide.com: Google Gemini 2.5 models are now generally available for developers and researchers.

Sana Hassan@MarkTechPost //

Google AI Predicts Hurricanes and Bing Video Creator - Google's AI advancements include a Sora-powered Bing Video Creator, an AI model for forecasting tropical cyclones and Google DeepMind noted that Google's AI usage grew 50 times in one year, reaching 500 trillion tokens per month.

References: siliconangle.com , Maginative

Google has recently unveiled significant advancements in artificial intelligence, showcasing its continued leadership in the tech sector. One notable development is an AI model designed for forecasting tropical cyclones. This model, developed through a collaboration between Google Research and DeepMind, is available via the newly launched Weather Lab website. It can predict the path and intensity of hurricanes up to 15 days in advance. The AI system learns from decades of historical storm data, reconstructing past weather conditions from millions of observations and utilizing a specialized database containing key information about storm tracks and intensity.

The tech giant's Weather Lab marks the first time the National Hurricane Center will use experimental AI predictions in its official forecasting workflow. The announcement comes at an opportune time, coinciding with forecasters predicting an above-average Atlantic hurricane season in 2025. This AI model can generate 50 different hurricane scenarios, offering a more comprehensive prediction range than current models, which typically provide forecasts for only 3-5 days. The AI has achieved a 1.5-day improvement in prediction accuracy, equivalent to about a decade's worth of traditional forecasting progress.

Furthermore, Google is experiencing exponential growth in AI usage. Google DeepMind noted that Google's AI usage grew 50 times in one year, reaching 500 trillion tokens per month. Logan Kilpatrick from Google DeepMind discussed Google's transformation from a "sleeping giant" to an AI powerhouse, citing superior compute infrastructure, advanced models like Gemini 2.5 Pro, and a deep talent pool in AI research.

Recommended read:

Top link: MarkTechPost
Permalink: More details

References :

siliconangle.com: Google develops AI model for forecasting tropical cyclones
Maginative: Google's AI Can Now Predict Hurricane Paths 15 Days Out â€” and the Hurricane Center Is Using It

Mark Tyson@tomshardware.com //

OpenAI Launches o3-pro Reasoning Model - OpenAI released o3-pro, a smarter model with enhanced capabilities in math, science, and programming, available to ChatGPT Pro and Team subscribers and through OpenAI's API, with a price cut of 87%.

References: Maginative , AI News | VentureBeat , THE DECODER ...

OpenAI has recently launched its newest reasoning model, o3-pro, making it available to ChatGPT Pro and Team subscribers, as well as through OpenAI’s API. Enterprise and Edu subscribers will gain access the following week. The company touts o3-pro as a significant upgrade, emphasizing its enhanced capabilities in mathematics, science, and coding, and its improved ability to utilize external tools.

OpenAI has also slashed the price of o3 by 80% and o3-pro by 87%, positioning the model as a more accessible option for developers seeking advanced reasoning capabilities. This price adjustment comes at a time when AI providers are competing more aggressively on both performance and affordability. Experts note that evaluations consistently prefer o3-pro over the standard o3 model across all categories, especially in science, programming, and business tasks.

O3-pro utilizes the same underlying architecture as o3, but it’s tuned to be more reliable, especially on complex tasks, with better long-range reasoning. The model supports tools like web browsing, code execution, vision analysis, and memory. While the increased complexity can lead to slower response times, OpenAI suggests that the tradeoff is worthwhile for the most challenging questions "where reliability matters more than speed, and waiting a few minutes is worth the tradeoff.”

Recommended read:

Top link: tomshardware.com
Permalink: More details

References :

Maginative: OpenAI’s new o3-pro model is now available in ChatGPT and the API, offering top-tier performance in math, science, and coding—at a dramatically lower price.
AI News | VentureBeat: OpenAI's most powerful reasoning model, o3, is now 80% cheaper, making it more affordable for businesses, researchers, and individual developers.
Latent.Space: OpenAI just dropped the price of their o3 model by 80% today and launched o3-pro.
THE DECODER: OpenAI has lowered the price of its o3 language model by 80 percent, CEO Sam Altman said.
Simon Willison's Weblog: OpenAI's Adam Groth explained that the engineers have optimized inference, allowing a significant price reduction for the o3 model.
the-decoder.com: OpenAI lowered the price of its o3 language model by 80 percent, CEO Sam Altman said.
AI News | VentureBeat: OpenAI released the latest in its o-series of reasoning model that promises more reliable and accurate responses for enterprises.
bsky.app: The OpenAI API is back to running at 100% again, plus we dropped o3 prices by 80% and launched o3-pro - enjoy!
Sam Altman: We are past the event horizon; the takeoff has started. Humanity is close to building digital superintelligence, and at least so far it’s much less weird than it seems like it should be.
siliconangle.com: OpenAIâ€™s newest reasoning model o3-pro surpasses rivals on multiple benchmarks, but itâ€™s not very fast
SiliconANGLE: OpenAI’s newest reasoning model o3-pro surpasses rivals on multiple benchmarks, but it’s not very fast
bsky.app: the OpenAI API is back to running at 100% again, plus we dropped o3 prices by 80% and launched o3-pro - enjoy!
bsky.app: OpenAI has launched o3-pro. The new model is available to ChatGPT Pro and Team subscribers and in OpenAI’s API now, while Enterprise and Edu subscribers will get access next week. If you use reasoning models like o1 or o3, try o3-pro, which is much smarter and better at using external tools.
The Algorithmic Bridge: OpenAI o3-Pro Is So Good That I Can’t Tell How Good It Is
datafloq.com: What is OpenAI o3 and How is it Different than other LLMs?
www.marketingaiinstitute.com: [The AI Show Episode 153]: OpenAI Releases o3-Pro, Disney Sues Midjourney, Altman: â€œGentle Singularityâ€ Is Here, AI and Jobs & News Sites Getting Crushed by AI Search

Mark Tyson@tomshardware.com //

OpenAI releases o3-pro model for Reliability in ChatGPT - OpenAI released o3-pro, a new version of its AI model, which aims to provide more reliable and thoughtful responses for complex tasks, replacing the earlier o1-pro model and is available to Pro and Team users in ChatGPT and via API.

References: bsky.app , the-decoder.com , Maginative ...

OpenAI has launched o3-pro, a new and improved version of its AI model designed to provide more reliable and thoughtful responses, especially for complex tasks. Replacing the o1-pro model, o3-pro is accessible to Pro and Team users within ChatGPT and through the API, marking OpenAI's ongoing effort to refine its AI technology. The focus of this upgrade is to enhance the model’s reasoning capabilities and maintain consistency in generating responses, directly addressing shortcomings found in previous models.

The o3-pro model is designed to handle tasks requiring deep analytical thinking and advanced reasoning. While built upon the same transformer architecture and deep learning techniques as other OpenAI chatbots, o3-pro distinguishes itself with an improved ability to understand context. Some users have noted that o3-pro feels like o3, but is only modestly better in exchange for being slower.

Comparisons with other leading models such as Claude 4 Opus and Gemini 2.5 Pro reveal interesting insights. While Claude 4 Opus has been praised for prompt following and understanding user intentions, Gemini 2.5 Pro stands out for its price-to-performance ratio. Early user experiences suggest o3-pro might not always be worth the expense due to its speed, except for research purposes. Some users have suggested that o3-pro hallucinates modestly less, though this is still being debated.

Recommended read:

Top link: tomshardware.com
Permalink: More details

References :

bsky.app: the OpenAI API is back to running at 100% again, plus we dropped o3 prices by 80% and launched o3-pro - enjoy!
the-decoder.com: OpenAI cuts o3 model prices by 80% and launches o3-pro today
AI News | VentureBeat: OpenAI launches o3-pro AI model, offering increased reliability and tool use for enterprises â€” while sacrificing speed
Maginative: OpenAI’s new o3-pro model is now available in ChatGPT and the API, offering top-tier performance in math, science, and coding—at a dramatically lower price.
THE DECODER: OpenAI has lowered the price of its o3 language model by 80 percent, CEO Sam Altman said. The article appeared first on The Decoder.
AI News | VentureBeat: OpenAI announces 80% price drop for o3, it’s most powerful reasoning model
www.cnbc.com: The figure includes sales from the companyâ€™s consumer products, ChatGPT business products and its application programming interface, or API.
Latent.Space: OpenAI dropped o3 pricing 80% today and launched o3-pro. Ben Hylak of Raindrop.ai returns with the world's first early review.
siliconangle.com: OpenAIâ€™s newest reasoning model o3-pro surpasses rivals on multiple benchmarks, but itâ€™s not very fast
SiliconANGLE: Silicon Angle reports on OpenAI’s newest reasoning model o3-pro surpassing rivals.
bsky.app: OpenAI has launched o3-pro. The new model is available to ChatGPT Pro and Team subscribers and in OpenAIâ€™s API now, while Enterprise and Edu subscribers will get access next week. If you use reasoning models like o1 or o3, try o3-pro, which is much smarter and better at using external tools.
The Algorithmic Bridge: OpenAI o3-Pro Is So Good That I Canâ€™t Tell How Good It Is
Windows Report: OpenAI rolls out new o3-pro AI model to ChatGPT Pro subscribers
Windows Report: OpenAI Rolls out new o3-pro AI Model to ChatGPT Pro Subscribers
www.indiatoday.in: OpenAI introduces O3 PRO for ChatGPT, highlighting significant improvements in performance and cost-efficiency.
www.marketingaiinstitute.com: [The AI Show Episode 153]: OpenAI Releases o3-Pro, Disney Sues Midjourney, Altman: â€œGentle Singularityâ€ Is Here, AI and Jobs & News Sites Getting Crushed by AI Search
datafloq.com: OpenAI is now considered another name of innovation in the field of AI, and with the launch of OpenAI o3, this claim is becoming stronger.
composio.dev: OpenAI finally unveiled their much expensive O3, the O3-Pro. The model is available in their API and Pro plan, which costs $200

News from the AI & ML world

DeeperML - #aimodel

Moonshot AI's Kimi K2 Outperforms GPT-4 Coding Tasks - Moonshot AI released Kimi K2, a trillion-parameter open-source MoE model excelling in long context, code, reasoning, and agentic behavior, outperforming GPT-4 in some coding tasks.

Grok 4 Leaked Benchmarks Indicate Significant AI Advancement - Elon Musk's xAI unveils Grok 4 with significant performance improvements and introduces new subscription tiers and integrations, despite past controversies.

OpenAI o3-Pro Reasoning Model - OpenAI releases O3-Pro, a reasoning model for ChatGPT, focused on performance and cost-efficiency, challenging competitor models like Claude 4 Opus and Gemini 2.5 Pro.

Google's AI Model Forecasts Tropical Cyclones Accurately - Google launched an AI model via Weather Lab for forecasting tropical cyclones, developed with DeepMind, aiming to predict path and intensity days in advance.

Google Gemini 2.5 Models Released on Vertex AI - Google introduced Gemini 2.5 Flash and Pro models on Vertex AI, optimized for speed, efficiency, complex reasoning, and multimodal understanding, along with a new Flash-Lite model for cost-efficient performance.

Google AI Predicts Hurricanes and Bing Video Creator - Google's AI advancements include a Sora-powered Bing Video Creator, an AI model for forecasting tropical cyclones and Google DeepMind noted that Google's AI usage grew 50 times in one year, reaching 500 trillion tokens per month.

OpenAI Launches o3-pro Reasoning Model - OpenAI released o3-pro, a smarter model with enhanced capabilities in math, science, and programming, available to ChatGPT Pro and Team subscribers and through OpenAI's API, with a price cut of 87%.

OpenAI releases o3-pro model for Reliability in ChatGPT - OpenAI released o3-pro, a new version of its AI model, which aims to provide more reliable and thoughtful responses for complex tasks, replacing the earlier o1-pro model and is available to Pro and Team users in ChatGPT and via API.

Benchmarks

Blogs

Research Tools