@techcrunch.com
//
OpenAI has unveiled significant advancements in its AI model lineup, introducing the o3 and o4-mini reasoning models alongside the GPT-4.1 family. These models bring improved capabilities in several key areas, including multimodal functionality, coding proficiency, and instruction following. The o3 and o4-mini models are particularly notable for their ability to see, code, plan, and use tools independently, marking a significant step toward more autonomous AI systems.
The advancements extend to OpenAI's API and subscription services. Operator, OpenAI's autonomous web browsing agent, has been upgraded to use the o3 model, enhancing its capabilities within the ChatGPT Pro subscription. This upgrade makes the $200-per-month ChatGPT Pro subscription more attractive, offering a more powerful AI experience capable of completing web-based tasks such as booking reservations and gathering online data, and it positions OpenAI competitively against other AI subscription bundles on the market. Alongside the new reasoning models, OpenAI has introduced GPT-4.1, optimized for coding and instruction following. The family includes GPT-4.1 Mini and Nano variants and offers a one-million-token context window, improvements designed to make OpenAI's services more efficient and affordable. The company is also exploring new frontiers in AI, focusing on agents with tool use and autonomous functionality, suggesting a future where AI can take on more complex and independent tasks.
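As a rough sketch of what the million-token context window enables, a Chat Completions request targeting the GPT-4.1 family might look like the JSON below. The request shape follows OpenAI's standard chat API; the system prompt and user content are illustrative placeholders, not from the announcement.

```json
{
  "model": "gpt-4.1-mini",
  "messages": [
    {"role": "system", "content": "You are a code-review assistant."},
    {"role": "user", "content": "Review the following repository snapshot: ..."}
  ],
  "max_tokens": 1024
}
```

With a one-million-token window, the user message here could in principle carry an entire mid-sized codebase in a single request rather than being chunked across many calls.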
@Google DeepMind Blog
//
Google has launched Gemini 2.0, its most capable AI model yet, designed for the new agentic era. This model introduces advancements in multimodality, including native image and audio output, and native tool use, enabling the development of new AI agents. Gemini 2.0 is being rolled out to developers and trusted testers initially, with plans to integrate it into Google products like Gemini and Search. Starting today, the Gemini 2.0 Flash experimental model is available to all Gemini users.
New features powered by Project Astra are now accessible to Google One AI Premium subscribers, enabling live video analysis and screen sharing. This update transforms Gemini into a more interactive visual helper, capable of instantly answering questions about what it sees through the device's camera. Users can point their camera at an object, and Gemini will describe it or offer suggestions, providing a more contextual understanding of the real world. These advanced tools will also enhance AI Overviews in Google Search.
Tris Warkentin@The Official Google Blog
//
Google AI has released Gemma 3, a new family of open-source AI models designed for efficient and on-device AI applications. Gemma 3 models are built with technology similar to Gemini 2.0, intended to run efficiently on a single GPU or TPU. The models are available in various sizes: 1B, 4B, 12B, and 27B parameters, with options for both pre-trained and instruction-tuned variants, allowing users to select the model that best fits their hardware and specific application needs.
Gemma 3 offers practical advantages in efficiency and portability. For example, the 27B version has demonstrated robust performance in evaluations while still being capable of running on a single GPU. The 4B, 12B, and 27B models can process both text and images and support more than 140 languages. The models have a context window of 128,000 tokens, making them well suited for tasks that require processing large amounts of information. Google has built safety protocols into Gemma 3, including ShieldGemma 2, a safety checker for images.
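Google's announcement emphasizes choosing the variant that fits your hardware. As a rough illustration, the sketch below estimates which Gemma 3 size fits a given memory budget; the parameter counts (1B, 4B, 12B, 27B) are from the release, but the footprint formula (16-bit weights, ~2 bytes per parameter, 20% overhead) is an assumption for illustration, not an official requirement.

```python
# Sketch: pick the largest Gemma 3 variant whose estimated weight
# footprint fits in the available accelerator memory.
# Variant sizes are from the announcement; memory math is a rough assumption.

GEMMA3_VARIANTS_B = [1, 4, 12, 27]  # parameter counts, in billions


def suggest_variant(vram_gb: float, bytes_per_param: float = 2.0,
                    overhead: float = 1.2) -> str:
    """Return the largest variant whose estimated footprint fits vram_gb.

    Estimate: params (billions) * bytes per parameter * overhead factor,
    since 1e9 parameters occupy ~1 GB per byte-per-parameter.
    """
    best = None
    for params_b in GEMMA3_VARIANTS_B:
        est_gb = params_b * bytes_per_param * overhead
        if est_gb <= vram_gb:
            best = params_b
    if best is None:
        raise ValueError("No variant fits; consider quantized weights.")
    return f"{best}B"


print(suggest_variant(80.0))  # a single 80 GB GPU fits the 27B weights -> "27B"
print(suggest_variant(16.0))  # a 16 GB consumer GPU -> "4B"
```

Under these assumptions the 27B model's weights (~65 GB at 16-bit) fit on a single high-memory GPU, consistent with the single-GPU claim above, while smaller cards would target the 4B or 1B variants (or quantized weights).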