Megan Crouse@techrepublic.com
//
Microsoft has unveiled BitNet b1.58, a groundbreaking language model designed for ultra-efficient operation. Unlike traditional language models that rely on 16- or 32-bit floating-point numbers, BitNet restricts each weight to one of three values (-1, 0, or +1), which takes about 1.58 bits (log2 3) to encode. This approach significantly reduces memory requirements and energy consumption, enabling powerful AI to run on devices with limited resources. The model is based on the standard transformer architecture but incorporates efficiency-oriented modifications such as BitLinear layers and 8-bit activation functions.
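A minimal sketch of the ternary rounding such 1.58-bit weights rely on, in plain Python. The absmean scaling follows the published BitNet description, but the function name and example values are illustrative assumptions, not Microsoft's code:

```python
def quantize_ternary(weights):
    """Round a weight matrix to {-1, 0, +1} using absmean scaling,
    the scheme described for BitNet-style 1.58-bit weights."""
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat)  # mean absolute weight
    # Scale, round to the nearest integer, then clip into {-1, 0, +1}.
    quantized = [[max(-1, min(1, round(w / scale))) for w in row]
                 for row in weights]
    return quantized, scale

w = [[0.4, -0.05, -0.9], [1.2, 0.02, -0.3]]
q, s = quantize_ternary(w)
print(q)  # [[1, 0, -1], [1, 0, -1]]
```

At inference time each weight then needs only a ternary value plus one shared scale per matrix, which is where the memory and energy savings come from.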
The BitNet b1.58 2B4T model contains two billion parameters and was trained on a massive dataset of four trillion tokens, roughly equivalent to the contents of 33 million books. Despite its reduced precision, BitNet reportedly performs comparably to models two to three times its size. In benchmark tests, it outperformed other compact models and was competitive with significantly larger, less efficient systems. Its memory footprint is just 400MB, making it suitable for deployment on laptops or in cloud environments. Microsoft has released dedicated inference tools for both GPU and CPU execution, including a lightweight C++ version, to facilitate adoption; the company demonstrated the model running on an Apple M2 chip. The model is available on Hugging Face. Future development plans include support for longer texts, additional languages, and multimodal inputs such as images. Microsoft is also working on another efficient model family under the Phi series. Recommended read:
References :
@simonwillison.net
//
Google has broadened access to its advanced AI model, Gemini 2.5 Pro, showcasing impressive capabilities and competitive pricing designed to challenge rival models like OpenAI's GPT-4o and Anthropic's Claude 3.7 Sonnet. Google's latest flagship model is currently recognized as a top performer, excelling in Optical Character Recognition (OCR), audio transcription, and long-context coding tasks. Alphabet CEO Sundar Pichai highlighted Gemini 2.5 Pro as Google's "most intelligent model + now our most in demand." Demand has increased by over 80 percent this month alone across both Google AI Studio and the Gemini API.
Google's expansion includes a tiered pricing structure for the Gemini 2.5 Pro API, offering a more affordable option than competitors. Prompts of up to 200,000 tokens are priced at $1.25 per million tokens for input and $10 per million for output; larger prompts increase to $2.50 and $15 per million tokens, respectively. Prompt caching is not yet available, but its future implementation could lower costs further. The free tier allows 500 grounding queries with Google Search per day, the paid tier adds another 1,500 free queries per day, and queries beyond that cost $35 per 1,000. The AI research group EpochAI reported that Gemini 2.5 Pro scored 84% on the GPQA Diamond benchmark, surpassing the typical 70% score of human experts; this benchmark poses challenging multiple-choice questions in biology, chemistry, and physics, and the result independently validates Google's own benchmark figures. The model is now available in both a paid and a free tier. Data from the free tier may be used to improve Google's products, while paid-tier data is not. Rate limits vary by tier, ranging from 150 to 2,000 requests per minute. Google will retire the Gemini 2.0 Pro preview entirely in favor of 2.5. Recommended read:
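Given those tier boundaries, a back-of-the-envelope bill is easy to estimate. The helper below is an illustrative sketch, assuming the output rate follows the same prompt-size tier as the input:

```python
def gemini_25_pro_cost(input_tokens, output_tokens):
    """Rough USD cost under the published tiers: prompts up to 200k
    tokens bill at $1.25/M input and $10/M output; larger prompts
    bill at $2.50/M input and $15/M output."""
    if input_tokens <= 200_000:
        in_rate, out_rate = 1.25, 10.00
    else:
        in_rate, out_rate = 2.50, 15.00
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# A 150k-token prompt producing a 10k-token answer stays in the cheap tier:
print(gemini_25_pro_cost(150_000, 10_000))  # 0.2875
```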
References :
Ellie Ramirez-Camara@Data Phoenix
//
Google has launched Gemini 2.5 Pro, hailed as its most intelligent "thinking model" to date. This new AI model excels in reasoning and coding benchmarks, featuring an impressive 1M token context window. Gemini 2.5 Pro is currently accessible to Gemini Advanced users, with integration into Vertex AI planned for the near future. The model has already secured the top position on the Chatbot Arena LLM Leaderboard, showcasing its superior performance in areas like math, instruction following, creative writing, and handling challenging prompts.
Gemini 2.5 Pro represents a new category of "thinking models" designed to enhance performance through reasoning before responding. Google reports that it achieved this level of performance by combining an enhanced base model with improved post-training techniques, and it aims to build these capabilities into all of its models. The model also obtained leading scores in math and science benchmarks, including GPQA and AIME 2025, without using test-time techniques. A significant focus of Gemini 2.5's development has been coding performance, where Google reports the new model excels at creating visually compelling web apps. Recommended read:
References :
Ryan Daws@AI News
//
OpenAI has secured a massive $40 billion funding round, led by SoftBank, catapulting its valuation to an unprecedented $300 billion. This landmark investment makes OpenAI the world's second-most valuable private company alongside TikTok parent ByteDance Ltd, trailing only Elon Musk's SpaceX Corp. This deal marks one of the largest capital infusions in the tech industry and signifies a major milestone for the company, underscoring the escalating significance of AI.
The fresh infusion of capital is expected to fuel several key initiatives at OpenAI. The funding will support expanded research and development, and upgrades to computational infrastructure. This includes the upcoming release of a new open-weight language model with enhanced reasoning capabilities. OpenAI said the funding round would allow the company to “push the frontiers of AI research even further” and “pave the way” towards AGI, or artificial general intelligence. Recommended read:
References :
Michael Nuñez@AI News | VentureBeat
//
OpenAI, the company behind ChatGPT, has announced a significant strategic shift by planning to release its first open-weight AI model since 2019. This move comes amidst mounting economic pressures from competitors like DeepSeek and Meta, whose open-source models are increasingly gaining traction. CEO Sam Altman revealed the plans on X, stating that the new model will have reasoning capabilities and allow developers to run it on their own hardware, departing from OpenAI's cloud-based subscription model.
This decision marks a notable change for OpenAI, which has historically defended closed, proprietary models. The company is now looking to gather developer feedback to make the new model as useful as possible, and is planning events in San Francisco, Europe, and Asia-Pacific. As models improve, startups and developers increasingly want tunable latency and on-premises deployments with full data control, according to OpenAI. The shift comes alongside a monumental $40 billion funding round led by SoftBank, which has catapulted OpenAI's valuation to $300 billion. SoftBank will initially invest $10 billion, with the remaining $30 billion contingent on OpenAI transitioning to a for-profit structure by the end of the year. This funding will help OpenAI continue building AI systems that drive scientific discovery, enable personalized education, enhance human creativity, and pave the way toward artificial general intelligence. The release of the open-weight model is expected to help OpenAI compete with the growing number of efficient open-source alternatives and counter criticism of its closed approach. Recommended read:
References :
Michael Nuñez@AI News | VentureBeat
Maximilian Schreiner@THE DECODER
//
References :
Data Phoenix, SiliconANGLE
Google has unveiled Gemini 2.5 Pro, marking it as the company's most intelligent AI model to date. This new "thinking model" excels in reasoning and coding benchmarks, boasting a 1 million token context window for analyzing complex inputs. Gemini 2.5 Pro leads in areas like math, instruction following, creative writing, and hard prompts, according to the Chatbot Arena LLM Leaderboard.
The enhanced reasoning abilities of Gemini 2.5 Pro allow it to go beyond basic classification and prediction. It can now analyze information, draw logical conclusions, incorporate context, and make informed decisions. Google achieved this performance by combining an enhanced base model with improved post-training techniques. The model scored 18.8% on Humanity's Last Exam, which Google notes is state-of-the-art among models without tool use.

Separately, Amazon Web Services is integrating its AI-powered assistant, Amazon Q Developer, into the Amazon OpenSearch Service. The integration gives users AI capabilities to investigate and visualize operational data across hundreds of applications. Amazon Q Developer removes the need for specialized knowledge of query languages, visualization tools, and alerting features, making the platform's advanced functionality accessible through natural language commands, so anyone can perform sophisticated data explorations to uncover insights and patterns. During application or service incidents on Amazon OpenSearch Service, users can quickly create visualizations to understand the cause and monitor the application to prevent recurrences. Amazon Q Developer can also provide instant summaries and insights within the alert interface, facilitating faster issue resolution. Recommended read:
References :
Matthias Bastian@THE DECODER
//
References :
Simon Willison's Weblog
OpenAI has released another update to its GPT-4o model in ChatGPT, delivering enhanced instruction following, particularly for prompts with multiple requests. The upgrade lifted the model to second place on the LM Arena leaderboard, behind only Gemini 2.5. The update also improves handling of complex technical and coding problems, alongside enhanced intuition and creativity, with the added benefit of fewer emojis in its responses.
This update, referred to as chatgpt-4o-latest, is now available in the API as well, giving developers access to the same model used in ChatGPT. This version is priced at $5 per million input tokens and $15 per million output tokens, compared with $2.50/$10 for the regular GPT-4o. OpenAI plans to bring these improvements to a dated model in the API in the coming weeks. The update was announced on Twitter, though users have complained that the OpenAI Platform Changelog would be a more suitable venue. Recommended read:
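At those rates, the premium over the regular GPT-4o endpoint is easy to quantify. A quick sketch with a hypothetical monthly workload (the token counts are ours, chosen only for illustration):

```python
def api_cost(input_tokens, output_tokens, in_rate, out_rate):
    """USD cost given per-million-token input and output rates."""
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical month: 2M input tokens, 500k output tokens.
latest = api_cost(2_000_000, 500_000, 5.00, 15.00)   # chatgpt-4o-latest
regular = api_cost(2_000_000, 500_000, 2.50, 10.00)  # regular GPT-4o
print(latest, regular)  # 17.5 10.0
```

For this mix, the chatgpt-4o-latest endpoint costs about 1.75x the regular model, so the dated API snapshot OpenAI has promised is the cheaper way to get the improvements.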
References :
Maximilian Schreiner@THE DECODER
//
Google has unveiled Gemini 2.5 Pro, its latest and "most intelligent" AI model to date, showcasing significant advancements in reasoning, coding proficiency, and multimodal functionalities. According to Google, these improvements come from combining a significantly enhanced base model with improved post-training techniques. The model is designed to analyze complex information, incorporate contextual nuances, and draw logical conclusions with unprecedented accuracy. Gemini 2.5 Pro is now available for Gemini Advanced users and on Google's AI Studio.
Google emphasizes the model's "thinking" capabilities, achieved through chain-of-thought reasoning, which allows it to break down complex tasks into multiple steps and reason through them before responding. This new model can handle multimodal input from text, audio, images, videos, and large datasets. Additionally, Gemini 2.5 Pro exhibits strong performance in coding tasks, surpassing Gemini 2.0 in specific benchmarks and excelling at creating visually compelling web apps and agentic code applications. The model also achieved 18.8% on Humanity’s Last Exam, demonstrating its ability to handle complex knowledge-based questions. Recommended read:
References :
Ryan Daws@AI News
//
DeepSeek has released DeepSeek V3-0324, an upgraded version of its large language model, marking a significant milestone for open-source AI. According to Artificial Analysis, the new iteration is the highest-scoring non-reasoning model available, surpassing even proprietary counterparts from Google, Anthropic, and Meta, and its open availability broadens access for AI researchers. Early reports indicate substantial improvements in reasoning and coding abilities, positioning it as a genuine rival to OpenAI's models.
The updated model, V3-0324, excels in benchmarks such as MMLU-Pro, GPQA, AIME, and LiveCodeBench, demonstrating enhanced problem-solving and knowledge retention. It runs at 20 tokens per second on a Mac Studio, showcasing its efficiency. With its MIT license, DeepSeek-V3-0324 is freely available for commercial use, and it can run directly on consumer-grade hardware. DeepSeek's advancements signal a shift in the AI sector, as open-source frameworks increasingly compete with closed systems, offering developers a powerful and adaptable tool. Recommended read:
References :
Ryan Daws@AI News
//
References :
venturebeat.com, AI News
DeepSeek, a Chinese AI startup, is making waves in the artificial intelligence industry with its DeepSeek-V3 model. The model demonstrates performance rivaling Western AI models from the likes of OpenAI and Anthropic, but at significantly lower development cost. Its release is seen as jumpstarting AI development across China, with other startups and established companies releasing their own advanced models and further fueling competition. This has narrowed the technology gap between China and the United States, as Chinese firms have adapted to international restrictions through creative approaches to AI development.
One particularly notable aspect of DeepSeek-V3 is its ability to run efficiently on consumer-grade hardware, such as a Mac Studio with an M3 Ultra chip, where reports indicate speeds of over 20 tokens per second, making it a potential "nightmare for OpenAI". This contrasts sharply with the data center requirements typically associated with state-of-the-art AI models. Despite restricted access to the latest silicon, the company has achieved notable gains by focusing on algorithmic efficiency and novel approaches to model architecture. Recommended read:
References :
Ryan Daws@AI News
//
DeepSeek V3-0324, the latest large language model from Chinese AI startup DeepSeek, is making waves in the artificial intelligence industry. The model, quietly released with an MIT license for commercial use, has quickly become the highest-scoring non-reasoning model on the Artificial Analysis Intelligence Index. This marks a significant milestone for open-source AI, surpassing proprietary counterparts like Google’s Gemini 2.0 Pro, Anthropic’s Claude 3.7 Sonnet, and Meta’s Llama 3.3 70B.
DeepSeek V3-0324's efficiency is particularly notable. Early reports indicate that it can run directly on consumer-grade hardware, specifically Apple’s Mac Studio with an M3 Ultra chip, achieving speeds of over 20 tokens per second. This capability is a major departure from the typical data center requirements associated with state-of-the-art AI. The updated version demonstrates substantial improvements in reasoning and benchmark performance, as well as enhanced Chinese writing proficiency and optimized translation quality. Recommended read:
References :