News from the AI & ML world

DeeperML - #chineseai

Jaime Hampton@AIwire //
References: AIwire , AI News , www.tomshardware.com ...
China's multi-billion-dollar AI infrastructure boom is now facing a significant downturn, according to a new report. The rush to build AI datacenters, fueled by the rise of generative AI and encouraged by government incentives, has resulted in billions of dollars in idle infrastructure. Many newly built facilities are now sitting empty, with some reports indicating that up to 80% of China’s new computing resources remain unused.

The "DeepSeek Effect" is a major factor in this reversal. DeepSeek's AI models, particularly DeepSeek-V3, have demonstrated impressive efficiency in training, reducing the demand for large-scale datacenter deployments. Smaller players are abandoning plans to pretrain large models because DeepSeek’s open-source models match ChatGPT-level performance at a fraction of the cost, leading to a collapse in demand for training infrastructure just as new facilities were ready to come online.

References :
  • AIwire: Report: China’s Race to Build AI Datacenters Has Hit a Wall
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • www.tomshardware.com: China's AI data center boom goes bust: Rush leaves billions of dollars in idle infrastructure

Jaime Hampton@AIwire //
References: AI News , Sify , AIwire ...
DeepSeek's innovative AI models are reshaping China's AI data center infrastructure, leading to a market disruption and potentially underutilized resources. The company's DeepSeek-V3 model has demonstrated performance that rivals ChatGPT but at a significantly reduced cost. This has altered the demand for extensive GPU clusters used in traditional AI training, shifting the focus towards hardware that prioritizes low latency, particularly near tech hubs. As a result, both speculators and experienced players now face the challenge posed by DeepSeek-V3.

The open-source nature of DeepSeek’s model is also allowing smaller players to compete without the need for extensive pretraining, which is undermining the demand for large data centers. DeepSeek-V3, which runs at 20 tokens per second on a Mac Studio, poses a new challenge for existing AI models. Chinese AI startups are now riding DeepSeek's momentum and building an ecosystem that is revolutionizing the AI landscape. This narrows the technology divide between China and the United States.

References :
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet
  • AIwire: Report: China’s Race to Build AI Datacenters Has Hit a Wall
  • Quinta’s weblog: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI

Dashveenjit Kaur@AI News //
References: venturebeat.com , AI News , Nordic APIs ...
Chinese AI startup DeepSeek is shaking up the global technology landscape with its latest large language model, DeepSeek-V3-0324. This new model has been lauded for matching the performance of American AI models, while boasting significantly lower development costs. According to Lee Kai-fu, CEO of Chinese startup 01.AI, the gap between Chinese and American AI capabilities has narrowed dramatically, with China even ahead in some specific areas.

DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance in multiple benchmarks, particularly in mathematics. The model scored 59.4 on the American Invitational Mathematics Examination (AIME), a significant improvement over its predecessor. Häme University lecturer Kuittinen Petri noted DeepSeek's achievements were realized with just a fraction of the resources available to competitors like OpenAI. This breakthrough has been attributed to DeepSeek’s focus on algorithmic efficiency and novel approaches to model architecture, allowing them to overcome restrictions on access to the latest silicon.

This disruption is not going unnoticed. When DeepSeek launched its R1 model in January, America’s Nasdaq plunged 3.1%, while the S&P 500 fell 1.5%. Although DeepSeek claimed a $5.6 million training cost, this represented only the marginal cost of the final training run. SemiAnalysis estimates DeepSeek's actual hardware investment at closer to $1.6 billion, with hundreds of millions in operating costs. The developments present opportunities and challenges for the.

References :
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Nordic APIs: ChatGPT vs. DeepSeek: A Side-by-Side Comparison
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Dashveenjit Kaur@AI News //
References: AI News , MarkTechPost , AI News ...
DeepSeek, a Chinese AI startup, is causing a stir in the AI industry with its new large language model, DeepSeek-V3-0324. Released with little fanfare on the Hugging Face AI repository, the 641-gigabyte model is freely available for commercial use under an MIT license. Early reports indicate it can run directly on consumer-grade hardware, such as Apple’s Mac Studio with the M3 Ultra chip, especially in a 4-bit quantized version that reduces the storage footprint to 352GB. This innovation challenges the previous notion that Silicon Valley held a chokehold on the AI industry.
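
As a rough sanity check on the sizes quoted above, the sketch below estimates a weight-storage footprint from a parameter count and a bit width. The 685-billion-parameter figure is the one reported in other items of this digest. The function name is illustrative, and the estimate deliberately ignores quantization scales and any layers kept at higher precision, so it only approximates the reported 352GB.

```python
def weight_footprint_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Parameter count reported for DeepSeek-V3-0324 elsewhere in this digest.
PARAMS = 685e9
# 4-bit footprint reported for the quantized version.
REPORTED_4BIT_GB = 352

four_bit_gb = weight_footprint_gb(PARAMS, 4)
gap_gb = REPORTED_4BIT_GB - four_bit_gb

print(f"Estimated 4-bit weights: {four_bit_gb:.1f} GB")
print(f"Reported 4-bit footprint: {REPORTED_4BIT_GB} GB")
print(f"Unexplained gap: {gap_gb:.1f} GB")
```

The naive estimate lands a few gigabytes under the reported figure. The gap may reflect quantization metadata or tensors stored at higher precision, though the source reports do not say.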

China's focus on algorithmic efficiency over hardware superiority has allowed companies like DeepSeek to flourish despite restrictions on access to the latest silicon. DeepSeek's R1 model, launched earlier this year, already rivaled OpenAI's ChatGPT-4 at a fraction of the cost. Now DeepSeek-V3-0324 features enhanced reasoning capabilities and improved performance. This has sparked a gold rush among Chinese tech startups, rewriting the playbook for AI development and allowing smaller companies to believe they have a shot in the market.

References :
  • AI News: DeepSeek V3-0324 has become the highest-scoring non-reasoning model on the Artificial Analysis Intelligence Index in a landmark achievement for open-source AI.
  • MarkTechPost: Artificial intelligence (AI) has made significant strides in recent years, yet challenges persist in achieving efficient, cost-effective, and high-performance models.
  • Quinta’s weblog: Chinese AI startup DeepSeek has quietly released a new large language model that’s already sending ripples through the artificial intelligence industry — not just for its capabilities, but for how it’s being deployed.
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide
  • Composio: Deepseek v3 o324, a new checkpoint, has been released by Deepseek in silence, with no marketing or hype, just a tweet and
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • Sify: DeepSeek’s AI Revolution: Creating an Entire AI Ecosystem
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Ryan Daws@AI News //
References: SiliconANGLE , venturebeat.com , AI News ...
DeepSeek, a Chinese AI company, has released DeepSeek V3-0324, an updated AI model that demonstrates impressive performance. The model is now running at 20 tokens per second on a Mac Studio. It is said to contain 685 billion parameters, and its cost-effectiveness challenges the dominance of American AI models, signaling that China continues to innovate in AI despite chip restrictions. Early testers report improvements over previous versions, and the model tops the rankings of non-reasoning AI models, a first for an open-source model.

This new model runs on consumer-grade hardware, specifically Apple's Mac Studio with the M3 Ultra chip, diverging from the typical data center requirements for AI. It is freely available for commercial use under the MIT license. According to AI researcher Awni Hannun, the model runs at over 20 tokens per second on a 512GB M3 Ultra. The company has made no formal announcement, just an empty README file and the model weights themselves. This stands in contrast to the carefully orchestrated product launches by Western AI companies.

References :
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: Chinese AI innovation is reshaping the global technology landscape, challenging assumptions about Western dominance in advanced computing. Recent developments from companies like DeepSeek illustrate how quickly China has adapted to and overcome international restrictions through creative approaches to AI development.
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • MarkTechPost: DeepSeek AI Unveils DeepSeek-V3-0324: Blazing Fast Performance on Mac Studio, Heating Up the Competition with OpenAI
  • Cloud Security Alliance: DeepSeek: Behind the Hype and Headlines
  • Quinta’s weblog: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • Composio: Deepseek v3-0324 vs. Claude 3.7 Sonnet

Madeline Clarke@techrepublic.com //
Ant Group has announced a significant breakthrough in artificial intelligence, claiming to have slashed AI costs by 20% using Chinese-made chips. The company's Ling-Plus and Ling-Lite models reportedly match or outperform leading AI models, demonstrating China's increasing ability to innovate around US export controls. This achievement marks a potential leap forward in China’s AI development efforts, signaling a move towards self-reliance in AI development and reduced dependence on foreign technologies.

Ant Group leveraged chips from Chinese tech giants Alibaba and Huawei to train its AI model, reaching performance levels comparable to those obtained with Nvidia’s H800 chips. While Ant Group continues to utilize Nvidia’s hardware for certain AI development tasks, the company is now relying increasingly on alternatives — particularly chips from AMD and Chinese manufacturers — for its latest models. This strategic pivot reflects a broader trend within China’s tech industry, driven in part by tightening U.S. sanctions that limit access to Nvidia’s most advanced GPUs.

References :
  • Jon Keegan: Ant Group boasts of breakthrough with new fast, cheap Chinese AI models trained on Chinese chips. Ant says its Ling-Plus and Ling-Lite models match or beat leading models. This shows Chinese AI developers are able to innovate around US export controls.
  • www.techrepublic.com: Ant Group Slashes AI Costs by 20% With Chinese-Made Chips: What It Means for U.S. Tech
  • www.tomshardware.com: Ant Group reportedly reduces AI costs 20% with Chinese chips

Ryan Daws@AI News //
DeepSeek V3-0324 has emerged as a leading AI model, topping benchmarks for non-reasoning AI in an open-source breakthrough. The milestone marks the first time an open-weights model has achieved the top position among non-reasoning models. The model's performance surpasses proprietary counterparts and edges closer to that of proprietary reasoning models, highlighting the growing viability of open-source solutions for latency-sensitive applications. DeepSeek V3-0324 represents a new era for open-source AI, offering a powerful and adaptable tool for developers and enterprises.

DeepSeek-V3 now runs at 20 tokens per second on Apple’s Mac Studio, presenting a challenge to OpenAI’s cloud-dependent business model. The 685-billion-parameter model, DeepSeek-V3-0324, is freely available for commercial use under the MIT license. This achievement, coupled with its cost efficiency and performance, signals a shift in the AI sector, where open-source frameworks increasingly compete with closed systems. Early testers report significant improvements over previous versions, positioning DeepSeek's new model above Claude Sonnet 3.5 from Anthropic.

References :
  • Analytics India Magazine: The model outperformed all other non-reasoning models across several benchmarks but trailed behind DeepSeek-R1, OpenAI’s o1, o3-mini, and other reasoning models.
  • venturebeat.com: DeepSeek-V3 now runs at 20 tokens per second on Mac Studio, and that’s a nightmare for OpenAI
  • AI News: DeepSeek V3-0324 tops non-reasoning AI models in open-source first
  • Analytics Vidhya: DeepSeek V3-0324: Generated 700 Lines of Code without Breaking
  • Analytics Vidhya: DeepSeek V3-0324 vs Claude 3.7: Which is the Better Coder?
  • Cloud Security Alliance: Markets reacted dramatically, with Nvidia alone losing nearly $600 billion in value in a single day, part of a broader...
  • GZERO Media: Just a few short months ago, Silicon Valley seemed to have the artificial intelligence industry in a chokehold.
  • MarkTechPost: DeepSeek AI Unveils DeepSeek-V3-0324: Blazing Fast Performance on Mac Studio, Heating Up the Competition with OpenAI
  • SiliconANGLE: DeepSeek today released an improved version of its DeepSeek-V3 large language model under a new open-source license.
  • techstrong.ai: DeepSeek Ups Ante (Again) in Duel with OpenAI, Anthropic
  • www.zdnet.com: DeepSeek V3 model gets a major upgrade
  • www.techradar.com: DeepSeek’s new AI is smarter, faster, cheaper, and a real rival to OpenAI's models
  • Composio: Deepseek v3 0324: Finally, the Sonnet 3.5 at Home
  • AI News: DeepSeek disruption: Chinese AI innovation narrows global technology divide

Matthias Bastian@THE DECODER //
Baidu has launched two new AI models, ERNIE 4.5 and ERNIE X1, designed to compete with DeepSeek's R1 model. The company is making these models freely accessible to individual users through the ERNIE Bot platform, ahead of the initially planned schedule. ERNIE 4.5 is a multimodal foundation model, integrating text, images, audio, and video to enhance understanding and content generation across various data types. This model demonstrates significant improvements in language understanding, reasoning, and coding abilities.

ERNIE X1 is Baidu's first model specifically designed for complex reasoning tasks, excelling in logical inference, problem-solving, and structured decision-making suitable for applications in finance, law, and data analysis. Baidu claims that ERNIE X1 matches DeepSeek R1’s performance at half the cost. ERNIE 4.5 has shown performance on par with models like DeepSeek-R1, but at approximately half the deployment cost.

References :
  • AiThority: With the launch of ERNIE 4.5 and ERNIE X1, ERNIE Bot is made free to the public ahead of schedule, and users can access both models free of charge.
  • techxplore.com: Chinese internet search giant Baidu released a new artificial intelligence reasoning model Sunday and made its AI chatbot services free to consumers as ferocious competition grips the sector.
  • THE DECODER: Baidu claims its Ernie X1 reasoning model matches Deepseek-R1 performance at half the price
  • Analytics Vidhya: China has done it again with its AI models and this time the blow is bigger and better! Baidu – a Chinese AI company, recently released two large language models (LLMs) – ERNIE 4.5 & X1.
  • TestingCatalog: Discover Baidu's new AI models, ERNIE 4.5 and ERNIE X1, now freely accessible via ERNIE Bot. Experience cutting-edge AI tech ahead of schedule!
  • Analytics India Magazine: China’s Baidu Launches Two New AI Models, Rivals DeepSeek R1 at Half the Price
  • TechCrunch: Chinese search engine Baidu has launched two new AI models — Ernie 4.5, the latest version of the company’s foundational model first released two years ago, as well as a new reasoning model, Ernie X1. According to Reuters, Baidu claims that Ernie X1’s performance is “on par with DeepSeek R1 at only
  • AIwire: With the launch of ERNIE 4.5 and ERNIE X1, ERNIE Bot is made free to the public ahead of schedule, and users can access both models free of charge.
  • AI News: Baidu undercuts rival AI models with ERNIE 4.5 and ERNIE X1
  • AI News | VentureBeat: Baidu has also announced plans to integrate ERNIE 4.5 and ERNIE X1 into its broader ecosystem, including Baidu Search and the Wenxiaoyan app.
  • www.tomshardware.com: ERNIE 4.5 AI model by Baidu claims to match DeepSeek R1 at half the cost
  • Fello AI: Baidu’s New ERNIE 4.5 & X1 – A Free AI That Is Better Than GPT-4.5 & Costs Pennies!

Matthias Bastian@THE DECODER //
References: TestingCatalog , THE DECODER , AiThority ...
Baidu has released two new large language models, ERNIE 4.5 and ERNIE X1, claiming they outperform OpenAI's GPT-4.5 and DeepSeek-R1. These models are more cost-effective, offering high quality at a fraction of the price. ERNIE 4.5 is a multimodal foundation model that integrates text, images, audio, and video, enhancing its ability to understand and generate different kinds of content. ERNIE X1 is a deep-thinking reasoning model with multimodal capabilities, excelling in tasks requiring advanced reasoning.

Baidu has made both models freely accessible to individual users via the ERNIE Bot platform, ahead of schedule. For enterprise users and developers, ERNIE 4.5 is available via APIs on Baidu AI Cloud's Qianfan platform, with ERNIE X1 set to follow. Baidu also plans to integrate the models into its existing products, including Baidu Search and the Wenxiaoyan app. This move positions Baidu as a competitive force in the AI landscape, challenging Western AI companies.

References :
  • TestingCatalog: This article discusses Baidu's ERNIE 4.5 and ERNIE X1 models, highlighting their performance and lower prices compared to DeepSeek.
  • THE DECODER: This article discusses Baidu’s new LLMs, ERNIE 4.5 and ERNIE X1, highlighting their competitive pricing and plans for open-source release in the context of the AI market.
  • Analytics Vidhya: This article discusses Baidu’s release of ERNIE 4.5 and ERNIE X1 LLMs, highlighting their claimed performance advantages over GPT-4.5 and cost-effectiveness.
  • AiThority: With the launch of ERNIE 4.5 and ERNIE X1, ERNIE Bot is made free to the public ahead of schedule, and users can access both models free of charge. As a deep-thinking reasoning model with multimodal capabilities, ERNIE X1 delivers performance on par with DeepSeek R1 at only half the price. ERNIE 4.5 is the [...]
  • techxplore.com: Chinese internet search giant Baidu released a new artificial intelligence reasoning model Sunday and made its AI chatbot services free to consumers as ferocious competition grips the sector.
  • TechCrunch: Chinese search engine Baidu has launched two new AI models — Ernie 4.5, the latest version of the company’s foundational model first released two years ago, as well as a new reasoning model, Ernie X1.
  • AI News: Baidu undercuts rival AI models with ERNIE 4.5 and ERNIE X1
  • techstrong.ai: Baidu Unleashes Speedy New AI Model to Rival DeepSeek
  • AI News | VentureBeat: Baidu has also announced plans to integrate ERNIE 4.5 and ERNIE X1 into its broader ecosystem, including Baidu Search and the Wenxiaoyan app.
  • Fello AI: Baidu’s New ERNIE 4.5 & X1 – A Free AI That Is Better Than GPT-4.5 & Costs Pennies!

Nitika Sharma@Analytics Vidhya //
China's Manus AI, developed by Monica, is generating buzz as an invite-only multi-agent AI product. This AI agent is designed to autonomously tackle complex, real-world tasks by operating as a multi-agent system. It utilizes a planner optimized for strategic reasoning, and an executor driven by Claude 3.5 Sonnet, incorporating code execution, web browsing, and multi-file code management.

The AI agent has sparked considerable global attention, igniting discussions about its technological and ethical implications, as well as its potential impact on the AI landscape. Manus reportedly outperformed OpenAI's o3-powered Deep Research agent on benchmarks, as showcased on the Manus website, leading some to believe it is among the most effective autonomous agents currently available. However, there is some skepticism due to it appearing to be a Claude wrapper with a jailbreak and tools optimized for the GAIA benchmark.

References :
  • Maginative: Manus AI, China's new autonomous agent, is making waves with its ability to independently analyze, plan, and execute tasks. With industry leaders calling it “the AI agent we were promised,” it's raising the stakes in the global AI race.
  • MarkTechPost: In today’s digital era, the way we work is rapidly evolving, yet many challenges persist. Conventional AI assistants and manual workflows struggle to keep pace with the complexity and volume of modern tasks. Professionals and businesses face repetitive manual processes, inefficient research methods, and a lack of true automation. While traditional tools offer suggestions and […]
  • Fello AI: Manus AI is a newly announced autonomous AI agent developed by the Chinese startup Monica. It has been designed as a general AI agent that goes beyond simple text generation by autonomously planning, executing, and delivering complex tasks. The system is positioned as a breakthrough in AI technology, offering capabilities that mimic a human team working […]
  • Analytics Vidhya: Ever felt buried under a mountain of tasks, wishing for an extra set of hands to get things done? What if you could offload those tasks and get results without being glued to your screen? Manus – an AI agent from China gaining attention for its ability to handle general tasks with ease. In a […]
  • The Rundown AI: PLUS: China's Manus demos ‘world’s first fully autonomous’ AI agent
  • Craig Smith: Forbes discusses China’s Autonomous Agent, Manus, Changes Everything
  • AI News | VentureBeat: What you need to know about Manus, the new AI agentic system from China
  • AI Accelerator Institute: China’s new AI agent, Manus, operates autonomously, sparking debate on its impact, ethics, and global AI competition. Here’s what you need to know.
  • thezvi.wordpress.com: The Manus Marketing Madness
  • Analytics Vidhya: This article compares China's new AI agent 'Manus' with OpenAI's 'Operator'.
  • The Register - Software: Prompts see it scour the web for info and turn it into decent documents at reasonable speed. Chinese researchers’ AI prowess is again a hot topic after a startup called Monica.im last week revealed “Manus”, a service it bills as a “general agent” that might improve on tools offered by Western companies.
  • AIwire: China’s Manus AI: A Game-Changer or Just Another Overhyped Agent?
  • bdtechtalks.com: What is Manus, the AI agent taking on OpenAI Deep Research
  • OODAloop: China’s new AI agent, Manus, operates autonomously, sparking debate on its impact, ethics, and global AI competition. Here’s what you need to know.
  • pub.towardsai.net: Discussion on Manus AI's architecture, performance, and potential.
  • Tech News | Euronews RSS: A new Chinese AI platform is causing a frenzy. But is it worth the hype? Euronews Next takes a look.
  • techxplore.com: What to know about Manus, China's latest AI assistant
  • www.laptopmag.com: What is Manus AI? The autonomous assistant that wants to do the work for you
  • techstrong.ai: Chinese Startup’s Manus AI Agent Generates Hype, Skepticism
  • www.tomsguide.com: Manus AI is the new challenger to DeepSeek — everything you need to know
  • Gradient Flow: Manus: What You Need To Know
  • hackernoon.com: Founder of China’s New AI Model Says His Agent is More Autonomous Than Rivals'
  • iHLS: Introducing Manus: The World’s First Fully Autonomous AI Agent
  • TechNode: China’s AI agent Manus gains traction amid growing demand for autonomous AI

Matthias Bastian@THE DECODER //
References: OODAloop , THE DECODER , MarkTechPost ...
Chinese AI company DeepSeek is making waves in the global AI market with its high profit margins and low pricing. The company makes $200 million per year at 85% or greater profit margins, even while charging $2.19 per million tokens on its R1 model, about 25 times less than OpenAI. DeepSeek's financial data suggests a theoretical peak revenue could exceed operating costs by six times when using optimal R1 model pricing.
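
The margin figures in this item use two different bases, which is easy to misread. The sketch below converts a cost-based margin (profit divided by cost) into a revenue multiple and a revenue-based margin (profit divided by revenue). It takes as input the 545% "cost profit margin" quoted in the TechCrunch reference for this item. The function names are illustrative, and the calculation assumes both margins describe the same theoretical revenue and cost.

```python
def revenue_multiple(cost_margin: float) -> float:
    """Revenue expressed as a multiple of cost, given profit / cost."""
    return 1.0 + cost_margin


def revenue_margin(cost_margin: float) -> float:
    """Profit / revenue, given profit / cost."""
    return cost_margin / (1.0 + cost_margin)


# The 545% "cost profit margin" cited in the TechCrunch reference.
COST_MARGIN = 5.45

multiple = revenue_multiple(COST_MARGIN)
margin_on_revenue = revenue_margin(COST_MARGIN)

print(f"Revenue is {multiple:.2f}x cost")
print(f"Margin on revenue: {margin_on_revenue:.1%}")
```

Under that assumption, a 545% cost-based margin corresponds to revenue of about 6.45 times cost and a margin on revenue of roughly 84.5%, which is in line with the "six times" and "85%" figures reported here.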

The company's success has prompted Tencent to unveil its own AI platform, Hunyuan Turbo S, designed specifically to compete with DeepSeek. Although Hunyuan Turbo S is the clear winner in certain cases, it still falls behind DeepSeek-R1-Zero in several instances. DeepSeek uses smart resource management and a dynamic resource allocation system which keeps costs down.

References :
  • OODAloop: In the two months since a little-known Chinese company called DeepSeek released a powerful new open-source AI model, the breakthrough has already begun to transform the global AI market.
  • THE DECODER: Newly released data from Chinese AI provider Deepseek reveals that AI language models could, in theory, generate substantial profit margins—even at prices significantly lower than OpenAI’s.
  • Neural Magic: The 4-bit Breakdown Quantized Reasoning Models In recent research, including We Ran Over Half a Million Evaluations on Quantized LLMs and How Well Do Quantized Models Handle Long-Context Tasks?, we’ve shown that quantized large language models (LLMs) rival their full-precision counterparts in accuracy across diverse benchmarks, covering academic, real-world use cases, and long-context evaluations while…
  • MarkTechPost: DeepSeek AI Releases Fire-Flyer File System (3FS): A High-Performance Distributed File System Designed to Address the Challenges of AI Training and Inference Workload
  • eWEEK: Headquartered in Shenzhen, China, the team with Tencent recently unveiled their new AI platform called Hunyuan Turbo S.
  • NextBigFuture.com: DeepSeek has revealed they make $200M/yr at 85%+ profit margins. This means their profit margins are larger than the 72-77% profit margins of Nvidia.
  • TechCrunch: Chinese AI startup DeepSeek recently declared that its AI models could be very profitable — with some asterisks. In a post on X, DeepSeek boasted that its online services have a “cost profit margin” of 545%. However, that margin is calculated based on “theoretical income.”
  • Stuff South Africa: DeepSeek is now a global force. But it’s just one player in China’s booming AI industry
  • Unite.AI: DeepSeek and AI Power Shift: Key Insights for Investors and Entrepreneurs
  • AI News | VentureBeat: While DeepSeek-R1 operates with 671 billion parameters, QwQ-32B achieves comparable performance with a much smaller footprint.
  • bdtechtalks.com: Alibaba’s QwQ-32B reasoning model matches DeepSeek-R1, outperforms OpenAI o1-mini