Jaime Hampton@AIwire
//
China's multi-billion-dollar AI infrastructure boom is now facing a significant downturn, according to a new report. The rush to build AI datacenters, fueled by the rise of generative AI and encouraged by government incentives, has resulted in billions of dollars in idle infrastructure. Many newly built facilities are now sitting empty, with some reports indicating that up to 80% of China’s new computing resources remain unused.
The "DeepSeek Effect" is a major factor in this reversal. DeepSeek's AI models, particularly DeepSeek-V3, have demonstrated impressive training efficiency, reducing the demand for large-scale datacenter deployments. Smaller players are abandoning plans to pretrain large models because DeepSeek’s open-source models match ChatGPT-level performance at a fraction of the cost, leading to a collapse in demand for training infrastructure just as new facilities were ready to come online.
Dashveenjit Kaur@AI News
//
Chinese AI startup DeepSeek is shaking up the global technology landscape with its latest large language model, DeepSeek-V3-0324. This new model has been lauded for matching the performance of American AI models, while boasting significantly lower development costs. According to Lee Kai-fu, CEO of Chinese startup 01.AI, the gap between Chinese and American AI capabilities has narrowed dramatically, with China even ahead in some specific areas.
DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance in multiple benchmarks, particularly in mathematics. The model scored 59.4 on the American Invitational Mathematics Examination (AIME), a significant improvement over its predecessor. Häme University lecturer Kuittinen Petri noted DeepSeek's achievements were realized with just a fraction of the resources available to competitors like OpenAI. This breakthrough has been attributed to DeepSeek’s focus on algorithmic efficiency and novel approaches to model architecture, allowing the company to overcome restrictions on access to the latest silicon. The disruption is not going unnoticed: when DeepSeek launched its R1 model in January, America’s Nasdaq plunged 3.1%, while the S&P 500 fell 1.5%. Although DeepSeek claimed a $5.6 million training cost, this represented only the marginal cost of the final training run. SemiAnalysis estimates DeepSeek's actual hardware investment at closer to $1.6 billion, with hundreds of millions in operating costs. The developments present opportunities and challenges for the.
Dashveenjit Kaur@AI News
//
DeepSeek, a Chinese AI startup, is causing a stir in the AI industry with its new large language model, DeepSeek-V3-0324. Released with little fanfare on the Hugging Face AI repository, the 641-gigabyte model is freely available for commercial use under an MIT license. Early reports indicate it can run directly on consumer-grade hardware, such as Apple’s Mac Studio with the M3 Ultra chip, especially in a 4-bit quantized version that reduces the storage footprint to 352GB. This innovation challenges the previous notion that Silicon Valley held a chokehold on the AI industry.
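The storage figures above can be sanity-checked with some rough arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes decimal gigabytes (1 GB = 1e9 bytes) and borrows the 685-billion-parameter count from a related report on this page, and the helper names `bits_per_param` and `ideal_size_gb` are hypothetical.

```python
# Back-of-the-envelope storage arithmetic for a quantized model.
# Assumptions (not from the model's documentation): decimal gigabytes
# (1 GB = 1e9 bytes) and a 685-billion-parameter count as reported
# in a related summary on this page.

PARAMS = 685e9        # reported parameter count (assumed to apply here)
FULL_SIZE_GB = 641    # reported size of the released weights
QUANT_SIZE_GB = 352   # reported size of the 4-bit quantized version


def bits_per_param(size_gb: float, params: float) -> float:
    """Average number of bits stored per parameter for a given file size."""
    return size_gb * 1e9 * 8 / params


def ideal_size_gb(params: float, bits: float) -> float:
    """Size in GB if every parameter took exactly `bits` bits, with no overhead."""
    return params * bits / 8 / 1e9


if __name__ == "__main__":
    print(f"released weights: {bits_per_param(FULL_SIZE_GB, PARAMS):.2f} bits/param")
    print(f"4-bit version:    {bits_per_param(QUANT_SIZE_GB, PARAMS):.2f} bits/param")
    print(f"ideal 4-bit size: {ideal_size_gb(PARAMS, 4):.1f} GB")
```

Under these assumptions the quantized file works out to slightly more than 4 bits per parameter, so the reported 352GB is a little larger than a pure 4-bit encoding would be. The reports do not say where the difference comes from.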
China's focus on algorithmic efficiency over hardware superiority has allowed companies like DeepSeek to flourish despite restrictions on access to the latest silicon. DeepSeek's R1 model, launched earlier this year, already rivaled OpenAI's ChatGPT-4 at a fraction of the cost. Now DeepSeek-V3-0324 adds enhanced reasoning capabilities and improved performance. This has sparked a gold rush among Chinese tech startups, rewriting the playbook for AI development and allowing smaller companies to believe they have a shot in the market.
Ryan Daws@AI News
//
DeepSeek, a Chinese AI company, has released DeepSeek V3-0324, an updated AI model that demonstrates impressive performance. The model is now running at 20 tokens per second on a Mac Studio. It is said to contain 685 billion parameters, and its cost-effectiveness challenges the dominance of American AI models, signaling that China continues to innovate in AI despite chip restrictions. Early testers report improvements over previous versions, and the model now tops the rankings for non-reasoning AI models, a first for open source.
This new model runs on consumer-grade hardware, specifically Apple's Mac Studio with the M3 Ultra chip, diverging from the typical data center requirements for AI. It is freely available for commercial use under the MIT license. According to AI researcher Awni Hannun, the model runs at over 20 tokens per second on a 512GB M3 Ultra. The company has made no formal announcement, just an empty README file and the model weights themselves. This stands in contrast to the carefully orchestrated product launches by Western AI companies.
Ryan Daws@AI News
//
DeepSeek, a Chinese AI startup, has emerged as a significant player in the artificial intelligence landscape, challenging the dominance of Western AI companies. Its release of the V3 large language model under the MIT open-source license marks a notable development, potentially shifting the global AI landscape. The DeepSeek-V3 model forms the foundation of DeepSeek-R1, showcasing innovation through a Mixture of Experts (MoE) design and an efficient parameter activation system.
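For readers unfamiliar with the term, the general idea behind Mixture of Experts is that a router selects a few "experts" per input, so only a fraction of the parameters is active at a time. The toy sketch below illustrates generic top-k routing only and is not DeepSeek's actual architecture. The function names, the scalar "experts", and the hand-supplied router scores are all hypothetical; in a real model a learned router computes the scores from each token.

```python
import math
import random


def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are the softmax probabilities renormalized over the chosen experts.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_scores, k=2):
    """Combine outputs of the selected experts only; the others are never called."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))


if __name__ == "__main__":
    random.seed(0)
    # Eight hypothetical experts, each a simple scalar function for illustration.
    experts = [lambda x, a=a: a * x for a in range(1, 9)]
    scores = [random.gauss(0, 1) for _ in experts]
    chosen = route_top_k(scores, k=2)
    print("active experts:", [i for i, _ in chosen], "of", len(experts))
    print("output:", moe_forward(3.0, experts, scores, k=2))
```

In this sketch only two of the eight experts run for a given input, which is the sense in which an MoE model activates a subset of its parameters per token.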
DeepSeek V3-0324 has achieved the position of highest-scoring non-reasoning model on the Artificial Analysis Intelligence Index. This open-source model outperforms proprietary counterparts like Google's Gemini 2.0 Pro and Meta's Llama 3.3 70B in real-time use cases. While DeepSeek models demonstrate strong performance, especially in mathematics and reasoning tasks, concerns have been raised regarding intellectual property, government connections, and security vulnerabilities.
Ryan Daws@AI News
//
References: venturebeat.com, AI News
DeepSeek, a Chinese AI startup, is making waves in the artificial intelligence industry with its DeepSeek-V3 model. This model is demonstrating performance that rivals Western AI models like those from OpenAI and Anthropic, but at significantly lower development costs. The release of DeepSeek-V3 is seen as jumpstarting AI development across China, with other startups and established companies releasing their own advanced models, further fueling competition. This has narrowed the technology gap between China and the United States as China has adapted to and overcome international restrictions through creative approaches to AI development.
One particularly notable aspect of DeepSeek-V3 is its ability to run efficiently on consumer-grade hardware, such as the Mac Studio with an M3 Ultra chip. Reports indicate that the model achieves speeds of over 20 tokens per second on this platform, making it a potential "nightmare for OpenAI". This contrasts sharply with the data center requirements typically associated with state-of-the-art AI models. The company's focus on algorithmic efficiency and novel approaches to model architecture has allowed it to achieve notable gains despite restricted access to the latest silicon.
@tomshardware.com
//
References: Jon Keegan, www.tomshardware.com
Ant Group has announced a significant breakthrough in AI, achieving a 20% reduction in AI costs by training models on domestically produced Chinese chips. According to reports, the company utilized chips from Chinese tech giants Alibaba and Huawei, reaching performance levels comparable to those obtained with Nvidia's H800 chips. The AI models, named Ling-Plus and Ling-Lite, are said to match or even outperform leading models, with Ant Group claiming its AI models outperformed Meta’s in benchmarks and cut inference costs.
This accomplishment signals a potential leap forward in China's AI development efforts and a move towards self-reliance in semiconductor technology. While Ant Group still uses Nvidia hardware for some tasks, it is now relying more on alternatives, including chips from AMD and Chinese manufacturers, driven in part by U.S. sanctions that limit access to Nvidia's advanced GPUs. This shift could lessen the country’s dependence on foreign technology.
Ryan Daws@AI News
//
DeepSeek V3-0324, the latest large language model from Chinese AI startup DeepSeek, is making waves in the artificial intelligence industry. The model, quietly released with an MIT license for commercial use, has quickly become the highest-scoring non-reasoning model on the Artificial Analysis Intelligence Index. This marks a significant milestone for open-source AI, surpassing proprietary counterparts like Google’s Gemini 2.0 Pro, Anthropic’s Claude 3.7 Sonnet, and Meta’s Llama 3.3 70B.
DeepSeek V3-0324's efficiency is particularly notable. Early reports indicate that it can run directly on consumer-grade hardware, specifically Apple’s Mac Studio with an M3 Ultra chip, achieving speeds of over 20 tokens per second. This capability is a major departure from the typical data center requirements associated with state-of-the-art AI. The updated version demonstrates substantial improvements in reasoning and benchmark performance, as well as enhanced Chinese writing proficiency and optimized translation quality.
Ryan Daws@AI News
//
DeepSeek V3-0324 has emerged as a leading AI model, topping benchmarks for non-reasoning AI in an open-source breakthrough. It is the first time an open-weights model has taken the top position among non-reasoning models. The model surpasses proprietary counterparts and comes close to proprietary reasoning models, highlighting the growing viability of open-source solutions for latency-sensitive applications. DeepSeek V3-0324 offers developers and enterprises a powerful and adaptable tool.
DeepSeek-V3 now runs at 20 tokens per second on Apple’s Mac Studio, presenting a challenge to OpenAI’s cloud-dependent business model. The 685-billion-parameter model, DeepSeek-V3-0324, is freely available for commercial use under the MIT license. This achievement, coupled with its cost efficiency and performance, signals a shift in the AI sector, where open-source frameworks increasingly compete with closed systems. Early testers report significant improvements over previous versions, positioning DeepSeek's new model above Claude Sonnet 3.5 from Anthropic.
Matthias Bastian@THE DECODER
//
Baidu has released its advanced AI models, ERNIE 4.5 and ERNIE X1, making them freely available to users through the ERNIE Bot platform. This move is a direct challenge to AI giants like OpenAI, Google, and DeepSeek, aiming to provide broader access to cutting-edge AI technology. By offering these models for free, Baidu seeks to accelerate user engagement and gather real-world data to refine their AI capabilities, potentially shifting the balance in the AI landscape by making sophisticated AI tools a new standard rather than a luxury.
ERNIE 4.5 is a multimodal foundation model capable of integrating and understanding text, images, audio, and video. It enhances language understanding, reasoning, generation, and memory, and can even interpret internet memes and satirical cartoons. ERNIE X1, on the other hand, is designed for reasoning-intensive tasks, excelling in logical inference, problem-solving, and structured decision-making. Baidu claims ERNIE 4.5 can match the performance of models like DeepSeek R1 at half the deployment cost, while ERNIE X1 has demonstrated strong capabilities in areas like Chinese knowledge Q&A and complex calculations.
Matthias Bastian@THE DECODER
//
Baidu has launched two new AI models, ERNIE 4.5 and ERNIE X1, designed to compete with DeepSeek's R1 model. The company is making these models freely accessible to individual users through the ERNIE Bot platform, ahead of the initially planned schedule. ERNIE 4.5 is a multimodal foundation model, integrating text, images, audio, and video to enhance understanding and content generation across various data types. This model demonstrates significant improvements in language understanding, reasoning, and coding abilities.
ERNIE X1 is Baidu's first model specifically designed for complex reasoning tasks, excelling in logical inference, problem-solving, and structured decision-making suitable for applications in finance, law, and data analysis. Baidu claims that ERNIE X1 matches DeepSeek R1’s performance at half the cost. ERNIE 4.5 has shown performance on par with models like DeepSeek-R1, but at approximately half the deployment cost.
Matthias Bastian@THE DECODER
//
Baidu has released two new large language models, ERNIE 4.5 and ERNIE X1, claiming they outperform OpenAI's GPT-4.5 and DeepSeek-R1. These models are more cost-effective, offering high quality at a fraction of the price. ERNIE 4.5 is a multimodal foundation model that integrates text, images, audio, and video, enhancing its ability to understand and generate different kinds of content. ERNIE X1 is a deep-thinking reasoning model with multimodal capabilities, excelling in tasks requiring advanced reasoning.
Baidu has made both models freely accessible to individual users via the ERNIE Bot platform, ahead of schedule. For enterprise users and developers, ERNIE 4.5 is available via APIs on Baidu AI Cloud's Qianfan platform, with ERNIE X1 set to follow. Baidu also plans to integrate the models into its existing products, including Baidu Search and the Wenxiaoyan app. This move positions Baidu as a competitive force in the AI landscape, challenging Western AI companies.
Jason Ly@cset.georgetown.edu
//
References: Gradient Flow, www.techradar.com
China's DeepSeek is emerging as a significant competitor to US AI companies by offering advanced AI models at notably lower costs, potentially reshaping the AI market. Sam Bresnick and Cole McFaul highlighted in Barron's the challenge US firms face due to DeepSeek's cost-effective AI solutions. This competition is expected to democratize AI, signaling that Silicon Valley companies are no longer the sole leaders in this technology.
DeepSeek's impact extends to healthcare, where it is being rapidly adopted in Chinese tertiary hospitals to improve clinical decision-making and operational efficiency. Hospitals like Fudan University Affiliated Huashan Hospital are testing DeepSeek models, with deployments localized within hospital intranets for data security. Applications range from intelligent pathology automating tumor analysis to imaging analysis achieving high accuracy in lung nodule differentiation, and AI pre-consultation systems reducing patient wait times.
Nitika Sharma@Analytics Vidhya
//
China's Manus AI, developed by Monica, is generating buzz as an invite-only multi-agent AI product. This AI agent is designed to autonomously tackle complex, real-world tasks by operating as a multi-agent system. It utilizes a planner optimized for strategic reasoning, and an executor driven by Claude 3.5 Sonnet, incorporating code execution, web browsing, and multi-file code management.
The AI agent has sparked considerable global attention, igniting discussions about its technological and ethical implications, as well as its potential impact on the AI landscape. Manus reportedly outperformed OpenAI's o3-powered Deep Research agent on benchmarks, as showcased on the Manus website, leading some to believe it is among the most effective autonomous agents currently available. However, there is some skepticism because it appears to be a Claude wrapper with a jailbreak and tools optimized for the GAIA benchmark.