Jaime Hampton@AIwire
//
DeepSeek's innovative AI models are reshaping China's AI data center infrastructure, disrupting the market and potentially leaving resources underutilized. The company's DeepSeek-V3 model has demonstrated performance that rivals ChatGPT at a significantly reduced cost. This has altered demand for the extensive GPU clusters used in traditional AI training, shifting focus toward low-latency hardware, particularly near tech hubs. The shift has fueled speculation, and established players now face the challenge DeepSeek-V3 poses.
The open-source nature of DeepSeek's model is also allowing smaller players to compete without the need for extensive pretraining, undermining demand for large data centers. DeepSeek-V3, which runs at 20 tokens per second on a Mac Studio, poses a new challenge for existing AI models. Chinese AI startups are riding DeepSeek's momentum and building an ecosystem that is reshaping the AI landscape, narrowing the technology divide between China and the United States.
Dashveenjit Kaur@AI News
//
Chinese AI startup DeepSeek is shaking up the global technology landscape with its latest large language model, DeepSeek-V3-0324. The new model has been lauded for matching the performance of American AI models while boasting significantly lower development costs. According to Kai-Fu Lee, CEO of Chinese startup 01.AI, the gap between Chinese and American AI capabilities has narrowed dramatically, with China even ahead in some specific areas.
DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance across multiple benchmarks, particularly in mathematics. The model scored 59.4 on the American Invitational Mathematics Examination (AIME), a significant improvement over its predecessor. Häme University lecturer Petri Kuittinen noted that DeepSeek's achievements were realized with just a fraction of the resources available to competitors like OpenAI. The breakthrough has been attributed to DeepSeek's focus on algorithmic efficiency and novel approaches to model architecture, allowing it to overcome restrictions on access to the latest silicon. The disruption has not gone unnoticed: when DeepSeek launched its R1 model in January, America's Nasdaq plunged 3.1%, while the S&P 500 fell 1.5%. While DeepSeek claimed a $5.6 million training cost, this represented only the marginal cost of the final training run; SemiAnalysis estimates DeepSeek's actual hardware investment at closer to $1.6 billion, with hundreds of millions in operating costs. The developments present both opportunities and challenges for the global AI industry.
Dashveenjit Kaur@AI News
//
DeepSeek, a Chinese AI startup, is causing a stir in the AI industry with its new large language model, DeepSeek-V3-0324. Released with little fanfare on the Hugging Face AI repository, the 641-gigabyte model is freely available for commercial use under an MIT license. Early reports indicate it can run directly on consumer-grade hardware, such as Apple’s Mac Studio with the M3 Ultra chip, especially in a 4-bit quantized version that reduces the storage footprint to 352GB. This innovation challenges the previous notion that Silicon Valley held a chokehold on the AI industry.
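The reported checkpoint sizes are consistent with simple parameter-count arithmetic. The sketch below is a rough back-of-envelope estimate, assuming the ~685 billion parameter figure cited for the model; real checkpoints also store quantization scales and metadata, so actual sizes differ slightly.

```python
# Back-of-envelope check of the checkpoint sizes reported above.
# PARAMS is the ~685 billion parameter figure cited for DeepSeek-V3-0324.

PARAMS = 685e9  # total parameters (assumption from the reporting above)

def footprint_gb(bits_per_param: float) -> float:
    """Approximate weight-storage size in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"8-bit: ~{footprint_gb(8):.0f} GB")   # ~685 GB; the 641 GB release averages just under 8 bits/param
print(f"4-bit: ~{footprint_gb(4):.1f} GB")   # ~342.5 GB, close to the reported 352 GB once scale factors are added
```

The small gap between the 342.5 GB raw estimate and the reported 352 GB footprint is expected: 4-bit formats store per-group scale factors alongside the packed weights.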
China's focus on algorithmic efficiency over hardware superiority has allowed companies like DeepSeek to flourish despite restrictions on access to the latest silicon. DeepSeek's R1 model, launched earlier this year, already rivaled OpenAI's ChatGPT-4 at a fraction of the cost. Now DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance. This has sparked a gold rush among Chinese tech startups, rewriting the playbook for AI development and giving smaller companies confidence that they have a shot in the market.
Ryan Daws@AI News
//
DeepSeek, a Chinese AI company, has released DeepSeek V3-0324, an updated AI model that demonstrates impressive performance, running at 20 tokens per second on a Mac Studio. The model is said to contain 685 billion parameters, and its cost-effectiveness challenges the dominance of American AI models, signaling that China continues to innovate in AI despite chip restrictions. Early testers report improvements over previous versions, and the model is reportedly the first open-source model to top non-reasoning AI models on key benchmarks.
This new model runs on consumer-grade hardware, specifically Apple's Mac Studio with the M3 Ultra chip, diverging from the typical data center requirements for AI. It is freely available for commercial use under the MIT license. According to AI researcher Awni Hannun, the model runs at over 20 tokens per second on a 512GB M3 Ultra. The company has made no formal announcement, just an empty README file and the model weights themselves. This stands in contrast to the carefully orchestrated product launches by Western AI companies.
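The ~20 tokens per second figure is plausible on memory-bandwidth grounds alone. The sketch below uses assumptions not stated in the reporting above: DeepSeek-V3 is a mixture-of-experts model that activates roughly 37 billion parameters per generated token, weights are 4-bit quantized, and the M3 Ultra offers roughly 819 GB/s of unified-memory bandwidth.

```python
# Plausibility check for the reported decode speed on an M3 Ultra.
# Assumptions (not from the article above): ~37B parameters activated per
# generated token, 4-bit weights, ~819 GB/s of unified-memory bandwidth.

ACTIVE_PARAMS = 37e9      # parameters read from memory per token
BITS_PER_PARAM = 4        # 4-bit quantization
BANDWIDTH = 819e9         # bytes per second

bytes_per_token = ACTIVE_PARAMS * BITS_PER_PARAM / 8   # 18.5 GB per token
ceiling_tps = BANDWIDTH / bytes_per_token              # ~44 tokens/s

print(f"~{bytes_per_token / 1e9:.1f} GB read per token")
print(f"bandwidth ceiling ~= {ceiling_tps:.0f} tokens/s")
```

The observed ~20 tokens per second is roughly half that theoretical ceiling, a typical real-world efficiency once attention caches, activations, and framework overhead are accounted for.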
Ryan Daws@AI News
//
References: venturebeat.com, AI News
DeepSeek, a Chinese AI startup, is making waves in the artificial intelligence industry with its DeepSeek-V3 model. This model is demonstrating performance that rivals Western AI models like those from OpenAI and Anthropic, but at significantly lower development costs. The release of DeepSeek-V3 is seen as jumpstarting AI development across China, with other startups and established companies releasing their own advanced models, further fueling competition. This has narrowed the technology gap between China and the United States as China has adapted to and overcome international restrictions through creative approaches to AI development.
One particularly notable aspect of DeepSeek-V3 is its ability to run efficiently on consumer-grade hardware, such as the Mac Studio with an M3 Ultra chip. Reports indicate that the model achieves speeds of over 20 tokens per second on this platform, making it a potential "nightmare for OpenAI". This contrasts sharply with the data center requirements typically associated with state-of-the-art AI models. The company's focus on algorithmic efficiency and novel approaches to model architecture has allowed it to achieve notable gains despite restricted access to the latest silicon.
@tomshardware.com
//
References: Jon Keegan, www.tomshardware.com
Ant Group has announced a significant breakthrough in AI, achieving a 20% reduction in AI costs by training models on domestically produced Chinese chips. According to reports, the company utilized chips from Chinese tech giants Alibaba and Huawei, reaching performance levels comparable to those obtained with Nvidia's H800 chips. The models, named Ling-Plus and Ling-Lite, are said to match or even outperform leading alternatives, with Ant Group claiming they beat Meta's models in benchmarks while cutting inference costs.
This accomplishment signals a potential leap forward in China's AI development efforts and a move towards self-reliance in semiconductor technology. While Ant Group still uses Nvidia hardware for some tasks, it is now relying more on alternatives, including chips from AMD and Chinese manufacturers, driven in part by U.S. sanctions that limit access to Nvidia's advanced GPUs. This shift could lessen the country's dependence on foreign technology.
Matthias Bastian@THE DECODER
//
Baidu has launched two new AI models, ERNIE 4.5 and ERNIE X1, designed to compete with DeepSeek's R1 model. The company is making these models freely accessible to individual users through the ERNIE Bot platform, ahead of the initially planned schedule. ERNIE 4.5 is a multimodal foundation model, integrating text, images, audio, and video to enhance understanding and content generation across various data types. This model demonstrates significant improvements in language understanding, reasoning, and coding abilities.
ERNIE X1 is Baidu's first model specifically designed for complex reasoning tasks, excelling in logical inference, problem-solving, and structured decision-making, making it suitable for applications in finance, law, and data analysis. Baidu claims that ERNIE X1 matches DeepSeek R1's performance at half the cost. ERNIE 4.5 has likewise shown performance on par with models like DeepSeek-R1 at approximately half the deployment cost.
Matthias Bastian@THE DECODER
//
Baidu has released two new large language models, ERNIE 4.5 and ERNIE X1, claiming they outperform OpenAI's GPT-4.5 and DeepSeek-R1. These models are more cost-effective, offering high quality at a fraction of the price. ERNIE 4.5 is a multimodal foundation model that integrates text, images, audio, and video, enhancing its ability to understand and generate different kinds of content. ERNIE X1 is a deep-thinking reasoning model with multimodal capabilities, excelling in tasks requiring advanced reasoning.
Baidu has made both models freely accessible to individual users via the ERNIE Bot platform, ahead of schedule. For enterprise users and developers, ERNIE 4.5 is available via APIs on Baidu AI Cloud's Qianfan platform, with ERNIE X1 set to follow. Baidu also plans to integrate the models into its existing products, including Baidu Search and the Wenxiaoyan app. This move positions Baidu as a competitive force in the AI landscape, challenging Western AI companies.
Ryan Daws@AI News
//
Leading US artificial intelligence companies, including OpenAI, Anthropic, and Google, are urging the US government to take decisive action to secure the nation's AI leadership. The companies have submitted documents to the government warning that America's lead in AI is diminishing due to the increasing capabilities of Chinese models like DeepSeek R1. This call for action comes in response to a request for information on developing an AI Action Plan.
These submissions highlight concerns about national security, economic competitiveness, and the need for strategic regulatory frameworks. OpenAI warns that DeepSeek shows the US lead is not wide and is narrowing, characterizing the model as state-subsidized, state-controlled, and freely available. Anthropic's filing focuses on biosecurity concerns, particularly DeepSeek-R1's willingness to provide information about biological weapons, which it says demonstrates the need for better government oversight of AI systems.
Nitika Sharma@Analytics Vidhya
//
China's Manus AI, developed by Monica, is generating buzz as an invite-only multi-agent AI product. The AI agent is designed to autonomously tackle complex, real-world tasks by operating as a multi-agent system. It utilizes a planner optimized for strategic reasoning and an executor driven by Claude 3.5 Sonnet, incorporating code execution, web browsing, and multi-file code management.
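Manus's internals are not public, but the planner/executor split described above can be sketched minimally. Everything below, the planner's canned plan, the tool names, and the task, is a hypothetical stand-in for illustration, not Manus's actual API.

```python
# Minimal sketch of a planner/executor multi-agent loop.
# In a real system both roles would be LLM-driven (Manus reportedly uses
# Claude 3.5 Sonnet as the executor); here the planner returns a canned
# plan and the tools are trivial stand-ins.

from typing import Callable, Dict, List

def planner(task: str) -> List[Dict[str, str]]:
    """Decompose a task into tool-call steps (a real planner would reason with an LLM)."""
    return [
        {"tool": "browse", "arg": "https://example.com/prices"},
        {"tool": "run_code", "arg": "sum([10, 4])"},
    ]

# Executor-side tools: web browsing and code execution, as described above.
# A production executor would sandbox code instead of calling eval().
TOOLS: Dict[str, Callable[[str], str]] = {
    "browse": lambda url: f"fetched {url}",
    "run_code": lambda src: str(eval(src)),
}

def executor(steps: List[Dict[str, str]]) -> List[str]:
    """Run each planned step and collect the observations."""
    return [TOOLS[step["tool"]](step["arg"]) for step in steps]

observations = executor(planner("total the prices on the page"))
print(observations)  # ['fetched https://example.com/prices', '14']
```

The key design point is the division of labor: the planner emits a structured plan, and the executor grounds each step in a concrete tool call, feeding observations back for the next round of planning in a full system.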
The AI agent has sparked considerable global attention, igniting discussions about its technological and ethical implications, as well as its potential impact on the AI landscape. Manus reportedly outperformed OpenAI's o3-powered Deep Research agent on benchmarks showcased on the Manus website, leading some to believe it is among the most effective autonomous agents currently available. However, there is some skepticism, as it appears to be a Claude wrapper with a jailbreak and tools optimized for the GAIA benchmark.