News from the AI & ML world

DeeperML - #deepseek

Tris Warkentin@The Official Google Blog //
Google AI has released Gemma 3, a new family of open AI models designed for efficient, on-device applications. Gemma 3 models are built from research and technology similar to that behind Gemini 2.0 and are intended to run efficiently on a single GPU or TPU. The models are available in four sizes: 1B, 4B, 12B, and 27B parameters, each with pre-trained and instruction-tuned variants, allowing users to select the model that best fits their hardware and specific application needs.

Gemma 3 offers practical advantages in efficiency and portability. The 27B version, for example, has demonstrated robust performance in evaluations while still being capable of running on a single GPU. The 4B, 12B, and 27B models can process both text and images and support more than 140 languages. These larger variants also offer a 128,000-token context window, making them well suited to tasks that require processing large amounts of information. Google has built safety protocols into Gemma 3, including ShieldGemma 2, a safety checker for images.
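
In practice these checkpoints load like any other Hugging Face release. Below is a minimal sketch of single-GPU inference with the transformers library; the Hub id google/gemma-3-1b-it and the dtype/device settings are assumptions, and the larger multimodal sizes follow the same pattern under their own ids.

```python
# Minimal local inference sketch for a Gemma 3 instruction-tuned checkpoint.
# Assumes: pip install transformers torch, a Hugging Face login that has
# accepted the Gemma license, and the Hub id below (illustrative).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",    # pick the 1B/4B/12B/27B size that fits your GPU
    torch_dtype=torch.bfloat16,
    device_map="auto",               # places the model on a single GPU if one is available
)

messages = [{"role": "user", "content": "In one paragraph, what does a 128K context window enable?"}]
result = pipe(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # last turn is the model's reply
```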

References:
  • MarkTechPost: Google AI Releases Gemma 3: Lightweight Multimodal Open Models for Efficient and On‑Device AI
  • The Official Google Blog: Introducing Gemma 3: The most capable model you can run on a single GPU or TPU
  • AI News | VentureBeat: Google unveils open source Gemma 3 model with 128k context window
  • AI News: Details on the launch of Gemma 3 open AI models by Google.
  • The Verge: Google calls Gemma 3 the most powerful AI model you can run on one GPU
  • Maginative: Google DeepMind’s Gemma 3 Brings Multimodal AI, 128K Context Window, and More
  • TestingCatalog: Gemma 3 sets new benchmarks for open compact models with top score on LMarena
  • AI & Machine Learning: Announcing Gemma 3 on Vertex AI
  • Analytics Vidhya: Gemma 3 vs DeepSeek-R1: Is Google’s New 27B Model a Tough Competition to the 671B Giant?
  • AI & Machine Learning: How to deploy serverless AI with Gemma 3 on Cloud Run
  • The Tech Portal: Google rolls outs Gemma 3, its latest collection of lightweight AI models
  • eWEEK: Google’s Gemma 3: Does the ‘World’s Best Single-Accelerator Model’ Outperform DeepSeek-V3?
  • The Tech Basic: Gemma 3 by Google: Multilingual AI with Image and Video Analysis
  • Analytics Vidhya: Google’s Gemma 3: Features, Benchmarks, Performance and Implementation
  • www.infoworld.com: Google unveils Gemma 3 multi-modal AI models
  • www.zdnet.com: Google claims Gemma 3 reaches 98% of DeepSeek's accuracy - using only one GPU
  • AIwire: Google unveils open-source Gemma 3: multimodal, available in four sizes, and able to handle more information and instructions thanks to a larger context window
  • Ars OpenForum: Google’s new Gemma 3 AI model is optimized to run on a single GPU
  • THE DECODER: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.
  • Gradient Flow: Gemma 3: What You Need To Know
  • Interconnects: Gemma 3, OLMo 2 32B, and the growing potential of open-source AI
  • OODAloop: Gemma 3, Google's newest lightweight, open-source AI model, is designed for multimodal tasks and efficient deployment on various devices.
  • NVIDIA Technical Blog: Google has released lightweight, multimodal, multilingual models called Gemma 3. The models are designed to run efficiently on phones and laptops.
  • LessWrong: Google DeepMind has unveiled Gemma 3, a new generation of open AI models designed to deliver high performance with a relatively small footprint, making them suitable for running on individual GPUs or TPUs.

Ryan Daws@AI News //
DeepSeek V3-0324 has emerged as a leading AI model, topping benchmarks for non-reasoning AI in an open-source first: it is the first time an open-weights model has taken the top position among non-reasoning models. Its performance surpasses proprietary non-reasoning counterparts and edges closer to that of proprietary reasoning models, highlighting the growing viability of open-source solutions for latency-sensitive applications, where the faster responses of non-reasoning models matter most. DeepSeek V3-0324 represents a new era for open-source AI, offering a powerful and adaptable tool for developers and enterprises.

DeepSeek-V3 now runs at 20 tokens per second on Apple's Mac Studio, presenting a challenge to OpenAI's cloud-dependent business model. The 685-billion-parameter model, DeepSeek-V3-0324, is freely available for commercial use under the MIT license. This achievement, coupled with its cost efficiency and performance, signals a shift in the AI sector, where open-source frameworks increasingly compete with closed systems. Early testers report significant improvements over previous versions, positioning DeepSeek's new model above Anthropic's Claude 3.5 Sonnet.

References:
  • Analytics India Magazine: The model outperformed all other non-reasoning models across several benchmarks but trailed behind DeepSeek-R1, OpenAI’s o1, o3-mini, and other reasoning models.
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • Analytics Vidhya: DeepSeek V3-0324: Generated 700 Lines of Code without Breaking
  • Analytics Vidhya: DeepSeek V3-0324 vs Claude 3.7: Which is the Better Coder?
  • Cloud Security Alliance: Markets reacted dramatically, with Nvidia alone losing nearly $600 billion in value in a single day, part of a broader...
  • GZERO Media: Just a few short months ago, Silicon Valley seemed to have the artificial intelligence industry in a chokehold.
  • MarkTechPost: DeepSeek AI Unveils DeepSeek-V3-0324: Blazing Fast Performance on Mac Studio, Heating Up the Competition with OpenAI
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • techstrong.ai: DeepSeek Ups Ante (Again) in Duel with OpenAI, Anthropic
  • www.zdnet.com: DeepSeek V3 model gets a major upgrade
  • www.techradar.com: DeepSeek’s new AI is smarter, faster, cheaper, and a real rival to OpenAI's models
  • Composio: Deepseek v3 0324: Finally, the Sonnet 3.5 at Home
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide

Ryan Daws@AI News //
DeepSeek V3-0324, the latest large language model from Chinese AI startup DeepSeek, is making waves in the artificial intelligence industry. The model, quietly released with an MIT license for commercial use, has quickly become the highest-scoring non-reasoning model on the Artificial Analysis Intelligence Index. This marks a significant milestone for open-source AI, surpassing proprietary counterparts like Google’s Gemini 2.0 Pro, Anthropic’s Claude 3.7 Sonnet, and Meta’s Llama 3.3 70B.

DeepSeek V3-0324's efficiency is particularly notable. Early reports indicate that it can run directly on consumer-grade hardware, specifically Apple’s Mac Studio with an M3 Ultra chip, achieving speeds of over 20 tokens per second. This capability is a major departure from the typical data center requirements associated with state-of-the-art AI. The updated version demonstrates substantial improvements in reasoning and benchmark performance, as well as enhanced Chinese writing proficiency and optimized translation quality.

References:
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • Analytics Vidhya: DeepSeek V3-0324: Generated 700 Lines of Code without Breaking
  • Analytics India Magazine: The model outperformed all other non-reasoning models across several benchmarks but trailed behind DeepSeek-R1, OpenAI’s o1, o3-mini, and other reasoning models.
  • Cloud Security Alliance: DeepSeek: Behind the Hype and Headlines
  • techstrong.ai: DeepSeek Ups Ante (Again) in Duel with OpenAI, Anthropic
  • www.techradar.com: Deepseek’s new AI is smarter, faster, cheaper, and a real rival to OpenAI's models
  • Analytics Vidhya: DeepSeek V3-0324 vs Claude 3.7: Which is the Better Coder?
  • MarkTechPost: DeepSeek AI Unveils DeepSeek-V3-0324: Blazing Fast Performance on Mac Studio, Heating Up the Competition with OpenAI
  • www.zdnet.com: It's called V3-0324, but the real question is: Is it foreshadowing the upcoming launch of R2?
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • Composio: DeepSeek V3-0324, a new checkpoint, released quietly with no marketing or hype, just a tweet
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Ryan Daws@AI News //
DeepSeek has released DeepSeek V3-0324, an upgraded version of their large language model, marking a significant milestone in open-source AI. According to Artificial Analysis, this new iteration is the highest-scoring non-reasoning model available, surpassing even proprietary counterparts from Google, Anthropic, and Meta. Its open weights and permissive license make state-of-the-art capabilities broadly accessible to researchers. Early reports indicate substantial improvements in reasoning and coding abilities, positioning it as a real rival to OpenAI's models.

The updated model, V3-0324, excels in benchmarks such as MMLU-Pro, GPQA, AIME, and LiveCodeBench, demonstrating enhanced problem-solving and knowledge retention. It runs at 20 tokens per second on a Mac Studio, showcasing its efficiency. With its MIT license, DeepSeek-V3-0324 is freely available for commercial use, and it can run directly on consumer-grade hardware. DeepSeek's advancements signal a shift in the AI sector, as open-source frameworks increasingly compete with closed systems, offering developers a powerful and adaptable tool.
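
For developers who would rather call the hosted model than download hundreds of gigabytes of weights, DeepSeek also serves it behind an OpenAI-compatible API. A minimal sketch, assuming the publicly documented base URL and the deepseek-chat alias still route to the V3-0324 checkpoint:

```python
# Query DeepSeek's hosted V3 model via its OpenAI-compatible endpoint.
# Assumes: pip install openai, and DEEPSEEK_API_KEY set in the environment.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",   # per DeepSeek's public API docs
)

resp = client.chat.completions.create(
    model="deepseek-chat",  # alias for the served V3 checkpoint (assumed)
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
    temperature=0.3,
)
print(resp.choices[0].message.content)
```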

References:
  • The Algorithmic Bridge: This article mentions DeepSeek V3 update among other AI-related topics, providing a broader context for the DeepSeek release.
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • Analytics Vidhya: DeepSeek V3, developed by the Chinese AI research lab DeepSeek under High-Flyer, has been a standout in the AI landscape since its initial open-source release in December 2024.
  • www.techradar.com: DeepSeek releases upgraded AI model with better performance at lower costs.
  • SiliconANGLE: DeepSeek releases improved V3 model under MIT license
  • Quinta's weblog: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

Ryan Daws@AI News //
DeepSeek, a Chinese AI company, has released DeepSeek V3-0324, an updated AI model that demonstrates impressive performance. The model now runs at 20 tokens per second on a Mac Studio. It is said to contain 685 billion parameters, and its cost-effectiveness challenges the dominance of American AI models, signaling that China continues to innovate in AI despite chip restrictions. Early testers report improvements over previous versions, and the release makes it the first open-weights model to top the non-reasoning category.

This new model runs on consumer-grade hardware, specifically Apple's Mac Studio with the M3 Ultra chip, diverging from the typical data center requirements for AI. It is freely available for commercial use under the MIT license. According to AI researcher Awni Hannun, the model runs at over 20 tokens per second on a 512GB M3 Ultra. The company has made no formal announcement, just an empty README file and the model weights themselves. This stands in contrast to the carefully orchestrated product launches by Western AI companies.
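
Hannun's numbers come from Apple's MLX framework rather than a CUDA stack. The sketch below shows that style of local run with the mlx-lm package; the 4-bit community conversion id is illustrative, not confirmed, and any MLX-format DeepSeek-V3-0324 repo would work the same way.

```python
# Local inference on Apple silicon with MLX (pip install mlx-lm).
# Loading ~350GB of 4-bit weights needs a very large unified-memory machine,
# e.g. the 512GB M3 Ultra Mac Studio mentioned above.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/DeepSeek-V3-0324-4bit")  # assumed repo id

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True))
```

MLX can do this on a desktop because Apple's unified memory lets the GPU address the same pool as the CPU, so weights that would normally be sharded across several data-center GPUs fit in one box.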

References:
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: Chinese AI innovation is reshaping the global technology landscape, challenging assumptions about Western dominance in advanced computing. Recent developments from companies like DeepSeek illustrate how quickly China has adapted to and overcome international restrictions through creative approaches to AI development.
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • MarkTechPost: DeepSeek AI Unveils DeepSeek-V3-0324: Blazing Fast Performance on Mac Studio, Heating Up the Competition with OpenAI
  • Cloud Security Alliance: DeepSeek: Behind the Hype and Headlines
  • Quinta's weblog: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Dashveenjit Kaur@AI News //
Chinese AI startup DeepSeek is shaking up the global technology landscape with its latest large language model, DeepSeek-V3-0324. The new model has been lauded for matching the performance of American AI models while boasting significantly lower development costs. According to Kai-Fu Lee, CEO of Chinese startup 01.AI, the gap between Chinese and American AI capabilities has narrowed dramatically, with China even ahead in some specific areas.

DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance across multiple benchmarks, particularly in mathematics. The model scored 59.4 on the American Invitational Mathematics Examination (AIME), a significant improvement over its predecessor. Häme University lecturer Petri Kuittinen noted that DeepSeek achieved this with just a fraction of the resources available to competitors like OpenAI. The breakthrough has been attributed to DeepSeek's focus on algorithmic efficiency and novel approaches to model architecture, allowing the company to overcome restrictions on access to the latest silicon.

This disruption has not gone unnoticed: when DeepSeek launched its R1 model in January, America's Nasdaq plunged 3.1% and the S&P 500 fell 1.5%. While DeepSeek claimed a $5.6 million training cost, this represented only the marginal cost of the final training run. SemiAnalysis estimates DeepSeek's actual hardware investment at closer to $1.6 billion, with hundreds of millions in operating costs. The developments present both opportunities and challenges for the wider AI industry.

References:
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • GZERO Media: How DeepSeek changed China’s AI ambitions
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Nordic APIs: ChatGPT vs. DeepSeek: A Side-by-Side Comparison
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

George Fitzmaurice@Latest from ITPro //
DeepSeek, a Chinese AI startup founded in 2023, is rapidly gaining traction as a competitor to established models like ChatGPT and Claude. It has risen to prominence quickly, fielding models that compete with much larger-parameter rivals while requiring far less compute. As of January 2025, DeepSeek boasts 33.7 million monthly active users and 22.15 million daily active users globally, showcasing its rapid adoption and impact.

Qwen has recently introduced QwQ-32B, a 32-billion-parameter reasoning model designed to improve performance on complex problem-solving tasks. Trained through a reward-based, multi-stage reinforcement learning (RL) process, it shows robust performance on tasks requiring deep analytical thinking and can match the 671B-parameter DeepSeek-R1 on key benchmarks. QwQ-32B demonstrates that scaling RL can dramatically enhance model intelligence without requiring massive parameter counts.
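
Consuming a reasoning model like QwQ-32B looks like any other causal-LM call, except the generation budget must leave room for a long chain of thought before the final answer. A minimal sketch with transformers, assuming the checkpoint is published under the Hub id Qwen/QwQ-32B:

```python
# Prompt QwQ-32B and decode only the newly generated tokens.
# Assumes: pip install transformers torch, and enough GPU memory for a 32B
# model in bfloat16 (or swap in a quantized variant).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/QwQ-32B"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "How many positive integers below 100 are divisible by 3 or 5?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=2048)  # generous budget for the reasoning trace
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```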

References:
  • Analytics Vidhya: QwQ-32B Vs DeepSeek-R1: Can a 32B Model Challenge a 671B Parameter Model?
  • MarkTechPost: Qwen Releases QwQ-32B: A 32B Reasoning Model that Achieves Significantly Enhanced Performance in Downstream Task
  • Fello AI: DeepSeek is rapidly emerging as a significant player in the AI space, particularly since its public release in January 2025.
  • Groq: A Guide to Reasoning with Qwen QwQ 32B
  • www.itpro.com: ‘Awesome for the community’: DeepSeek open sourced its code repositories, and experts think it could give competitors a scare

Jaime Hampton@AIwire //
China's multi-billion-dollar AI infrastructure boom is now facing a significant downturn, according to a new report. The rush to build AI datacenters, fueled by the rise of generative AI and encouraged by government incentives, has resulted in billions of dollars in idle infrastructure. Many newly built facilities are now sitting empty, with some reports indicating that up to 80% of China’s new computing resources remain unused.

The "DeepSeek Effect" is a major factor in this reversal. DeepSeek's AI models, particularly DeepSeek-V3, have demonstrated impressive efficiency in training, reducing the demand for large-scale datacenter deployments. Smaller players are abandoning plans to pretrain large models because DeepSeek's open-source models match ChatGPT-level performance at a fraction of the cost, leading to a collapse in demand for training infrastructure just as new facilities were ready to come online.

References:
  • AIwire: Report: China’s Race to Build AI Datacenters Has Hit a Wall
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • www.tomshardware.com: China's AI data center boom goes bust: Rush leaves billions of dollars in idle infrastructure

Matthew S.@IEEE Spectrum //
Recent research indicates that AI models, particularly large language models (LLMs), can struggle with overthinking and analysis paralysis, impacting their efficiency and success rates. A study has found that reasoning LLMs sometimes overthink problems, which leads to increased computational costs and a reduction in their overall performance. This issue is being addressed through various optimization techniques, including scaling inference-time compute, reinforcement learning, and supervised fine-tuning, to ensure models use only the necessary amount of reasoning for tasks.

The size and training methods of these models play a crucial role in their reasoning abilities. For instance, Alibaba's Qwen team introduced QwQ-32B, a 32-billion-parameter model that outperforms much larger rivals in key problem-solving tasks. QwQ-32B achieves superior performance in math, coding, and scientific reasoning using multi-stage reinforcement learning, despite being significantly smaller than DeepSeek-R1. This advancement highlights the potential of reinforcement learning to unlock reasoning capabilities in smaller models, rivaling the performance of giant models while requiring less computational power.

References:
  • IEEE Spectrum: It’s Not Just Us: AI Models Struggle With Overthinking
  • Sebastian Raschka, PhD: This article explores recent research advancements in reasoning-optimized LLMs, with a particular focus on inference-time compute scaling that have emerged since the release of DeepSeek R1.

Dashveenjit Kaur@AI News //
DeepSeek, a Chinese AI startup, is causing a stir in the AI industry with its new large language model, DeepSeek-V3-0324. Released with little fanfare on the Hugging Face AI repository, the 641-gigabyte model is freely available for commercial use under an MIT license. Early reports indicate it can run directly on consumer-grade hardware, such as Apple’s Mac Studio with the M3 Ultra chip, especially in a 4-bit quantized version that reduces the storage footprint to 352GB. This innovation challenges the previous notion that Silicon Valley held a chokehold on the AI industry.
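
Those two storage figures line up with simple parameter-count arithmetic, under the assumptions that the release ships its weights in 8-bit precision and that weight files dominate the download:

```python
# Back-of-envelope check on the reported footprints (illustrative only;
# embeddings, per-layer metadata, and quantization scales are ignored).
params = 685e9  # reported parameter count

print(f"8-bit release: ~{params * 1.0 / 2**30:.0f} GiB")  # ~638, near the reported 641
print(f"4-bit quant:   ~{params * 0.5 / 1e9:.0f} GB")     # ~342, near the reported 352
```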

China's focus on algorithmic efficiency over hardware superiority has allowed companies like DeepSeek to flourish despite restrictions on access to the latest silicon. DeepSeek's R1 model, launched earlier this year, already rivaled OpenAI's GPT-4 at a fraction of the cost. Now DeepSeek-V3-0324 adds enhanced reasoning capabilities and improved performance. This has sparked a gold rush among Chinese tech startups, rewriting the playbook for AI development and convincing smaller companies they have a shot in the market.

References:
  • Quinta's weblog: Chinese AI startup DeepSeek has quietly released a new large language model that’s already sending ripples through the artificial intelligence industry — not just for its capabilities, but for how it’s being deployed.
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • SiliconANGLE: DeepSeek releases improved V3 model under MIT license
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Drew@SecureWorld News //
DeepSeek R1, an open-source AI model, has been shown to generate rudimentary malware, including keyloggers and ransomware. Researchers at Tenable demonstrated that while the AI model initially refuses malicious requests, these safeguards can be bypassed with carefully crafted prompts. This capability signals an urgent need for security teams to adapt their defenses against AI-generated threats.

While DeepSeek R1 may not autonomously launch sophisticated cyberattacks yet, it can produce semi-functional code that knowledgeable attackers could refine into working exploits. Cybersecurity experts emphasize the dual-use nature of generative AI and the need for organizations to adopt strategies such as behavioral detection over static signatures to mitigate risks from AI-powered threats. Cybercrime Magazine has also highlighted an episode of CrowdStrike's new Adversary Universe Podcast discussing DeepSeek and the risks associated with foreign large language models.

Ellie Ramirez-Camara@Data Phoenix //
Anthropic founder Dario Amodei's recent comments on AI writing code have sparked widespread discussion regarding the future role of coders. The community is grappling with the potential implications of AI's increasing ability to generate code, contemplating how this shift might reshape the software engineering landscape and the broader tech industry. This debate highlights the rapid advancements in AI capabilities and the need for professionals to adapt to evolving roles in the age of automation.

OpenAI has introduced a suite of new tools designed to empower developers in building AI agents. This comprehensive offering includes the Responses API, providing a flexible foundation for agent creation, along with built-in capabilities for web search, file search, and computer use. Furthermore, the open-source Agents SDK allows for seamless orchestration of single and multi-agent workflows, incorporating configurable large language models, safety checks, and tracing tools. These tools are designed to make it significantly easier for developers to build AI agents.
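
For a concrete taste of the Responses API, here is a minimal sketch based on OpenAI's launch materials; the model name and the web_search_preview tool type are assumptions to verify against the current API reference:

```python
# One call to the Responses API with the built-in web search tool enabled.
# Assumes: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

resp = client.responses.create(
    model="gpt-4o",                           # assumed model name
    tools=[{"type": "web_search_preview"}],   # built-in web search (assumed tool type)
    input="Summarize this week's developer-facing AI agent tooling news.",
)
print(resp.output_text)  # convenience accessor for the final text output
```

The Agents SDK layers Agent and Runner abstractions on top of the same primitives, which is where the multi-agent orchestration, safety checks, and tracing described above come in.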

References:
  • Data Phoenix: OpenAI has launched a comprehensive suite of new tools including the Responses API, built-in capabilities for web search, file search, and computer use, and an open-source Agents SDK—all designed to make it significantly easier for developers to build AI agents.
  • John Werner: Dario Amodei of Anthropic fame has had some things to say about AI writing code. In responding, the community is considering what this might mean for coders and the rest of us.

Jason Ly@cset.georgetown.edu //
China's DeepSeek is emerging as a significant competitor to US AI companies by offering advanced AI models at notably lower costs, potentially reshaping the AI market. Sam Bresnick and Cole McFaul highlighted in Barron's the challenge US firms face due to DeepSeek's cost-effective AI solutions. This competition is expected to democratize AI, signaling that Silicon Valley companies are no longer the sole leaders in this technology.

DeepSeek's impact extends to healthcare, where it is being rapidly adopted in Chinese tertiary hospitals to improve clinical decision-making and operational efficiency. Hospitals like Fudan University Affiliated Huashan Hospital are testing DeepSeek models, with deployments localized within hospital intranets for data security. Applications range from intelligent pathology automating tumor analysis to imaging analysis achieving high accuracy in lung nodule differentiation, and AI pre-consultation systems reducing patient wait times.

References:
  • Gradient Flow: DeepSeek in Action: Practical AI Applications Transforming Chinese Healthcare
  • www.techradar.com: DeepSeek kicks off the next wave of the AI rush

George Fitzmaurice@Latest from ITPro //
DeepSeek has recently open-sourced its code repositories, a move lauded by experts as potentially disruptive to its competitors. The Chinese AI firm's decision to share its code marks a significant step towards transparency, particularly when contrasted with the more guarded approach of some of its US counterparts. Alistair Pullen, CEO of Cosine, emphasized that while Meta's Llama offers open weights, DeepSeek goes further by open-sourcing a substantial amount of its code, a move that is beneficial to the open-source community.

DeepSeek's open-source initiative is seen as a strategic move to challenge competitors, even if it cedes some competitive edge. Experts like Pullen suggest DeepSeek may be comfortable with this trade-off, as it is not solely reliant on its models. Meanwhile, DeepSeek is rapidly being adopted across China's tertiary hospitals to improve clinical decision-making and operational efficiency.

References:
  • www.itpro.com: ‘Awesome for the community’: DeepSeek open sourced its code repositories, and experts think it could give competitors a scare
  • Mozilla.ai: Evaluating DeepSeek V3 with Lumigator
  • OVHcloud Blog: Deep Dive into DeepSeek-R1 – Part 1

Jaime Hampton@AIwire //
DeepSeek's innovative AI models are reshaping China's AI data center infrastructure, disrupting the market and leaving resources potentially underutilized. The company's DeepSeek-V3 model has demonstrated performance that rivals ChatGPT at a significantly reduced cost, altering demand for the extensive GPU clusters used in traditional AI training and shifting the focus toward low-latency inference hardware near major tech hubs. The result is rising speculation, while established players now face the challenge DeepSeek-V3 poses.

The open-source nature of DeepSeek’s model is also allowing smaller players to compete without the need for extensive pretraining, which is undermining the demand for large data centers. DeepSeek-V3, which runs at 20 tokens per second on a Mac Studio, poses a new challenge for existing AI models. Chinese AI startups are now riding DeepSeek's momentum and building an ecosystem that is revolutionizing the AI landscape. This narrows the technology divide between China and the United States.

References:
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet
  • AIwire: Report: China’s Race to Build AI Datacenters Has Hit a Wall
  • Quinta's weblog: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

Muhammad Zulhusni@AI News //
Several major US artificial intelligence companies have expressed fears that America is losing its edge in AI development. In submissions to the US government in March 2025, these companies warned that Chinese AI models, like DeepSeek R1, are becoming increasingly sophisticated and competitive. The submissions, prompted by a request for input on an AI Action Plan, underscore the growing challenge posed by China in terms of technological capabilities and pricing within the AI sector.

China's growing AI capabilities are exemplified by DeepSeek R1, a state-supported model that has garnered attention from US developers. OpenAI noted that DeepSeek demonstrates a narrowing technological gap between the US and China, expressing concerns about the model's potential to influence global AI development, particularly given its "state-subsidized, state-controlled, and freely available" nature. Competition from China also includes Ernie X1 and Ernie 4.5, released by Baidu, which are designed to compete with Western systems.

References:
  • AI News: Is America falling behind in the AI race?
  • GZERO Media: How DeepSeek changed China’s AI ambitions