Google Gemini 2.5 Flash Introduces Thinking Budgets for AI

Michael Nuñez@venturebeat.com //

Google Gemini 2.5 Flash Introduces Thinking Budgets for AI

Google has unveiled Gemini 2.5 Flash, a new AI model designed to give businesses greater control over AI costs and performance. Available in preview through Google AI Studio and Vertex AI, Gemini 2.5 Flash introduces adjustable "thinking budgets," allowing developers to specify the amount of computational power the AI should use for reasoning. This innovative approach aims to strike a balance between advanced AI capabilities and cost-efficiency, addressing a key concern for businesses integrating AI into their operations. The model is also capable of generating SVGs.

The introduction of "thinking budgets" marks a strategic move by Google to deliver cost-effective AI solutions. Developers can now fine-tune the AI's processing power, allocating resources based on the complexity of the task at hand. With Gemini 2.5 Flash, the "thinking" capability can be turned on or off, creating a hybrid reasoning model that prioritizes speed and cost when needed. This flexibility allows businesses to optimize their AI usage and pay only for the brainpower they require.

Benchmarks demonstrate significant improvements in Gemini 2.5 Flash compared to the older Gemini 2.0 Flash model. Google has stated that the latest version delivers a major upgrade in reasoning capabilities, while still prioritizing speed and cost. The "thinking budget" feature offers fine-grained control over the maximum number of tokens a model can generate while thinking, ranging from 0 to 24,576 tokens. A higher budget allows the model to reason further to improve quality, but the model automatically decides how much to think based on the perceived task complexity.

Original img attribution: https://venturebeat.com/wp-content/uploads/2025/04/nuneybits_Vector_art_of_a_retro_computer_on_the_screen_is_a_lig_b7962004-900f-4fdd-8b99-620ed4be1597.webp?w=1024?w=1200&strip=all

ImgSrc: venturebeat.com

References :

venturebeat.com: Google’s new Gemini 2.5 Flash AI model introduces adjustable "thinking budgets" that let businesses pay only for the reasoning power they need, balancing advanced capabilities with cost efficiency.
Google DeepMind Blog: Transform text-based prompts into high-resolution eight-second videos in Gemini Advanced and use Whisk Animate to turn images into eight-second animated clips.
TestingCatalog: Google integrates Veo 2 AI into Gemini Advanced, enabling subscribers to create 8-second, 720p videos for TikTok and YouTube. Download MP4s with SynthID watermark.
Simon Willison's Weblog: Start building with Gemini 2.5 Flash
www.zdnet.com: Google reveals Gemini 2.5 Flash, its 'most cost-efficient thinking model'
developers.googleblog.com: Google's Gemini 2.5 Flash has hybrid reasoning, can be turned on or off and provides the ability for developers to set budgets to find the right trade-off between cost, quality, and latency.
venturebeat.com: Googleâ€™s Gemini 2.5 Flash introduces â€˜thinking budgetsâ€™ that cut AI costs by 600% when turned down
the-decoder.com: Googleâ€™s Gemini 2.5 Flash gives you speed when you need it and reasoning when you can afford it
THE DECODER: Provides information about the release of Gemini 2.5 Flash, highlighting its reasoning capabilities and cost-effectiveness.
TestingCatalog: Google launches Gemini 2.5 Flash model with hybrid reasoning
bsky.app: New LLM release from Google Gemini: Gemini 2.5 Flash (preview), which lets you set a budget for how many "thinking" tokens it can use. I got it to draw me some pelicans - it has very good taste in SVG styles and comments.
www.marktechpost.com: Google Unveils Gemini 2.5 FlashÂ inÂ Preview through the Gemini API viaÂ Google AI StudioÂ andÂ Vertex AI.
LearnAI: Start building with Gemini 2.5 Flash
www.infoworld.com: Google previews Gemini 2.5 Flash hybrid reasoning model
MarkTechPost: Google Unveils Gemini 2.5 Flash in Preview through the Gemini API via Google AI Studio and Vertex AI.
Google DeepMind Blog: Gemini 2.5 Flash is our first fully hybrid reasoning model, giving developers the ability to turn thinking on or off.
learn.aisingapore.org: Start building with Gemini 2.5 Flash
www.marketingaiinstitute.com: This blog post highlights Google Cloud Next '25 event reveals, including Gemini 2.5 Pro, AI Agents, and more.
bsky.app: Gemini 2.5 Pro and Flash now have the ability to return image segmentation masks on command, as base64 encoded PNGs embedded in JSON strings I vibe coded an interactive tool for exploring this new capability - it costs a fraction of a cent per image
Last Week in AI: Last Week in AI discussing GPT 4.1 and Gemini 2.5 Flash
TestingCatalog: Testing Catalog about Geminiâ€™s Scheduled Actions may offer AI task scheduling
The Official Google Blog: This model allows for adjustable thinking budgets, enabling users to control costs and choose the level of reasoning needed for specific tasks.
simonwillison.net: The model also allows developers to set thinking budgets to find the right tradeoff between quality, cost, and latency. Gemini AI Studio product lead Logan Kilpatrick :
Analytics Vidhya: 7 Things Gemini 2.5 Pro Does Better Than Any Other Chatbot!
Last Week in AI: OpenAIâ€™s new GPT-4.1 AI models focus on coding, Googleâ€™s newest Gemini AI model focuses on efficiency, and more!
Simon Willison: Turns out Gemini 2.5 Flash non-thinking mode can do the same trick at an even lower cost... 0.0119 cents (around 1/100th of a cent) Notes here, including how I upgraded my tool to use the non-thinking model by vibe coding o4-mini:
techcrunch.com: Google’s newest Gemini AI model focuses on efficiency, and more!
www.analyticsvidhya.com: o3 vs o4-mini vs Gemini 2.5 pro: The Ultimate Reasoning Battle
Digital Information World: Google launches Gemini 2.5 Flash model with hybrid reasoning, multimodal support, and cost-effective token pricing.
IEEE Spectrum: This article discusses the release of Google's new leading-edge LLM, Gemini 2.5 Pro, which has attracted much attention and interest.
www.analyticsvidhya.com: This article explores the capabilities of Gemini 2.5 Pro and compares it to other AI chatbots.
Analytics Vidhya: o3 vs o4-mini vs Gemini 2.5 pro: The Ultimate Reasoning Battle
TechHQ: Google unveils â€œreasoning dialâ€ for Gemini 2.5 flash: thinking vs. cost
techhq.com: Google unveils â€œreasoning dialâ€ for Gemini 2.5 flash: thinking vs. cost
Last Week in AI: Last Week in AI #307 - GPT 4.1, o3, o4-mini, Gemini 2.5 Flash, Veo 2
Towards AI: Google's Gemini 2.5 Flash model with reasoning control allows for greater precision and control in AI applications, optimizing resources and cost.
www.artificialintelligence-news.com: Google's Gemini 2.5 Flash model features a "thinking budget" that allows developers to restrict processing power for problem-solving, addressing concerns about excessive resource consumption.
AI News: Google has introduced an AI reasoning control mechanism for its Gemini 2.5 Flash model that allows developers to limit how much processing power the system expends on problem-solving. Released on April 17, this â€œthinking budgetâ€ feature responds to a growing industry challenge: advanced AI models frequently overanalyse straightforward queries, consuming unnecessary computational resources and driving

Classification:

HashTags: #Gemini2.5 #GoogleAI #AICostEfficiency
Company: Google
Target: Businesses, Developers
Product: Gemini 2.5 Flash
Feature: thinking budgets
Type: AI
Severity: Informative

News from the AI & ML world

DeeperML

Google Gemini 2.5 Flash Introduces Thinking Budgets for AI

Classification: