News from the AI & ML world

DeeperML - #alibaba

Alibabaâ€™s Qwen3 Challenges Silicon Valley in AI - Alibaba has launched Qwen3, a new generation of AI models designed to compete with Silicon Valley's leading AI technologies.

References: felloai.com , techcrunch.com , www.techradar.com ...

Alibaba has launched Qwen3, a new generation of AI models designed to compete with Silicon Valley's leading AI technologies. Qwen3 represents a significant advancement in AI, with capabilities that directly challenge models from OpenAI, Google, and Meta. The unveiling has generated excitement and apprehension across the global tech industry, signaling a narrowing of the gap between China and the US in AI development.

The Qwen3 family includes models with varying parameter sizes, catering to different applications and offering exceptional capabilities in complex reasoning, mathematical problem-solving, and code generation. The models support 119 languages and have been trained on a massive dataset of over 36 trillion tokens, providing a broad understanding of global information. One key innovation is the "hybrid reasoning" approach, allowing the models to switch between "fast thinking" for quick responses and "slow thinking" for more analytical tasks.

Benchmark results have begun to surface regarding the performance of the Qwen3 models. According to Alibaba, the Qwen3 models can match and, in some cases, outperform the best models available from Google and OpenAI. Some Qwen3 models are being released open-source, a move intended to boost China's AI ecosystem by encouraging wider adoption and collaborative development.

Recommended read:

Top link: felloai.com
Permalink: More details

References :

felloai.com: Alibabaâ€™s Qwen3 AI is Here to Challenge Silicon Valley â€“ And Itâ€™s SCARY Good.
techcrunch.com: Alibaba unveils Qwen 3, a family of hybrid AI reasoning models
www.marktechpost.com: ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search
www.techradar.com: Alibaba 'ZeroSearch' can reduce AI for search training cost by 88%, company claims

@felloai.com //

Alibaba's Qwen3 Challenges Silicon Valley with Advanced AI - Alibaba’s Qwen series, including Qwen3, demonstrates competitive AI performance with full front-end code generation and ZeroSearch for improved LLM retrieval.

References: felloai.com , MarkTechPost , www.marktechpost.com ...

Alibaba has launched Qwen3, a new family of large language models (LLMs), posing a significant challenge to Silicon Valley's AI dominance. Qwen3 is not just an incremental update but a leap forward, demonstrating capabilities that rival leading models from OpenAI, Google, and Meta. This advancement signals China’s growing prowess in AI and its potential to redefine the global tech landscape. Qwen3's strengths lie in reasoning, coding, and multilingual understanding, marking a pivotal moment in China's AI development.

The Qwen3 family includes models of varying sizes to cater to diverse applications. Key features include complex reasoning, mathematical problem-solving, and code generation. The models support 119 languages and are trained on a massive dataset of over 36 trillion tokens. Another innovation is Qwen3’s “hybrid reasoning” approach, enabling models to switch between "fast thinking" for quick responses and "slow thinking" for deeper analysis, enhancing versatility and efficiency. Alibaba has also emphasized the open-source nature of some Qwen3 models, fostering wider adoption and collaborative development in China's AI ecosystem.

Alibaba also introduced ZeroSearch, a method that uses reinforcement learning and simulated documents to teach LLMs retrieval without real-time search. It addresses the challenge of LLMs relying on static datasets, which can become outdated. By training the models to retrieve and incorporate external information, ZeroSearch aims to improve the reliability of LLMs in real-world applications like news, research, and product reviews. This method mitigates the high costs associated with large-scale interactions with live APIs, making it more accessible for academic research and commercial deployment.

Recommended read:

Top link: felloai.com
Permalink: More details

References :

felloai.com: Reports Alibabaâ€™s Qwen3 AI is Here to Challenge Silicon Valley
MarkTechPost: Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search
techcrunch.com: Alibaba unveils Qwen 3, a family of hybrid AI reasoning models.
www.marktechpost.com: ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach LLMs Retrieval Without Real-Time Search
THE DECODER: Report on Alibaba's "Web Dev" tool in Qwen which generates full front-end code from just a prompt.
Towards AI: Qwen-3 Fine Tuning Made Easy: Create Custom AI Models with Python and Unsloth
the-decoder.com: Web Dev in Qwen generates full front-end code from just a prompt
www.techradar.com: Alibaba says AI-generating search results could not only reduce reliance on Google's APIs, but cut costs by up to 88%.
Fello AI: Just when you thought Silicon Valley had the AI game locked down, Alibaba has unleashed Qwen3, a new generation of AI models so powerful theyâ€™re making US tech giants sweat.

Alexey Shabanov@TestingCatalog //

Alibaba's Qwen3 LLMs Impress with Robust Open Source Performance - Alibaba’s Qwen team released Qwen3, a family of open-source large language models (LLMs) ranging from 0.6B to 235B parameters, supporting 119 languages and featuring improved agentic capabilities and reasoning optimizations.

References: pub.towardsai.net , gradientflow.com , TestingCatalog ...

Alibaba's Qwen team has launched Qwen3, a new family of open-source large language models (LLMs) designed to compete with leading AI systems. The Qwen3 series includes eight models ranging from 0.6B to 235B parameters, with the larger models employing a Mixture-of-Experts (MoE) architecture for enhanced performance. This comprehensive suite offers options for developers with varied computational resources and application requirements. All the models are released under the Apache 2.0 license, making them suitable for commercial use.

The Qwen3 models boast improved agentic capabilities for tool use and support for 119 languages. The models also feature a unique "hybrid thinking mode" that allows users to dynamically adjust the balance between deep reasoning and faster responses. This is particularly valuable for developers as it facilitates efficient use of computational resources based on task complexity. Training involved a large dataset of 36 trillion tokens and was optimized for reasoning, similar to the Deepseek R1 model.

Benchmarks indicate that Qwen3 rivals top competitors like Deepseek R1 and Gemini Pro in areas like coding, mathematics, and general knowledge. Notably, the smaller Qwen3–30B-A3B MoE model achieves performance comparable to the Qwen3–32B dense model while activating significantly fewer parameters. These models are available on platforms like Hugging Face, ModelScope, and Kaggle, along with support for deployment through frameworks like SGLang and vLLM, and local execution via tools like Ollama and llama.cpp.

Recommended read:

Top link: TestingCatalog
Permalink: More details

References :

pub.towardsai.net: TAI #150: Qwen3 Impresses as a Robust Open-Source Contender
gradientflow.com: Table of Contents Model Architecture and Capabilities What is Qwen 3 and what models are available in the lineup? What are the â€œHybrid Thinking Modesâ€ in Qwen 3, and why are they valuable for developers?
THE DECODER: An article about Qwen3 series from Alibaba debuts with benchmark results matching top competitors
TestingCatalog: Reporting on Alibaba Cloud debuting 235B-parameter Qwen 3 to challenge US model dominance
Towards AI: TAI #150: Qwen3 Impresses as a Robust Open-Source Contender
www.analyticsvidhya.com: Qwen3 Models: How to Access, Performance, Features, and Applications
RunPod Blog: Qwen3 Released: How Does It Stack Up?
bdtechtalks.com: Alibaba’s Qwen3: Open-weight LLMs with hybrid thinking | BDTechTalks
AI News | VentureBeat: Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1
the-decoder.com: Qwen3 series from Alibaba debuts with benchmark results matching top competitors

Alexey Shabanov@TestingCatalog //

Alibaba Cloud's Qwen 3 Challenges US Model Dominance - Alibaba Cloud has launched Qwen 3, a new generation of large language models (LLMs) with 235B parameters, challenging US-based models with its reasoning and multilingual proficiency.

References: Gradient Flow , AI News | VentureBeat , MarkTechPost ...

Alibaba Cloud has unveiled Qwen 3, a new generation of large language models (LLMs) boasting 235 billion parameters, poised to challenge the dominance of US-based models. This open-weight family of models includes both dense and Mixture-of-Experts (MoE) architectures, offering developers a range of choices to suit their specific application needs and hardware constraints. The flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, and general knowledge, positioning it as one of the most powerful publicly available models.

Qwen 3 introduces a unique "thinking mode" that can be toggled for step-by-step reasoning or rapid direct answers. This hybrid reasoning approach, similar to OpenAI's "o" series, allows users to engage a more intensive process for complex queries in fields like science, math, and engineering. The models are trained on a massive dataset of 36 trillion tokens spanning 119 languages, twice the corpus of Qwen 2.5 and enriched with synthetic math and code data. This extensive training equips Qwen 3 with enhanced reasoning, multilingual proficiency, and computational efficiency.

The release of Qwen 3 includes two MoE models and six dense variants, all licensed under Apache-2.0 and downloadable from platforms like Hugging Face, ModelScope, and Kaggle. Deployment guidance points to vLLM and SGLang for servers and to Ollama or llama.cpp for local setups, signaling support for both cloud and edge developers. Community feedback has been positive, with analysts noting that earlier Qwen announcements briefly lifted Alibaba shares, underscoring the strategic weight the company places on open models.

Recommended read:

Top link: TestingCatalog
Permalink: More details

References :

Gradient Flow: Qwen 3: What You Need to Know
AI News | VentureBeat: Alibaba launches open source Qwen3 model that surpasses OpenAI o1 and DeepSeek R1
TestingCatalog: Alibaba Cloud debuts 235B-parameter Qwen 3 to challenge US model dominance
MarkTechPost: Alibaba Qwen Team Just Released Qwen3
Analytics Vidhya: Qwen3 Models: How to Access, Performance, Features, and Applications
www.analyticsvidhya.com: Qwen3 Models: How to Access, Performance, Features, and Applications
THE DECODER: Qwen3 series from Alibaba debuts with benchmark results matching top competitors
www.tomsguide.com: Alibaba is launching its own AI reasoning models to compete with DeepSeek
the-decoder.com: Qwen3 series from Alibaba debuts with benchmark results matching top competitors
pub.towardsai.net: TAI #150: Qwen3 Impresses as a Robust Open-Source Contender
Pandaily: The Mind Behind Qwen3: An Inclusive Interview with Alibaba's Zhou Jingren
Towards AI: TAI #150: Qwen3 Impresses as a Robust Open-Source Contender
gradientflow.com: Table of Contents Model Architecture and Capabilities What is Qwen 3 and what models are available in the lineup? What are the â€œHybrid Thinking Modesâ€� in Qwen 3, and why are they valuable for developers? How does Qwen 3 compare to previous versions and other leading models? What are the advantages of Qwen 3â€™s Mixture-of-Experts ...
bdtechtalks.com: Alibaba's Qwen3 open-weight LLMs combine direct response and chain-of-thought reasoning in a single architecture, and compete withe leading models. The post first appeared on .
bdtechtalks.com: Alibaba's Qwen3 open-weight LLMs combine direct response and chain-of-thought reasoning in a single architecture, and compete withe leading models. The post first appeared on .
RunPod Blog: Qwen3 Released: How Does It Stack Up?
www.computerworld.com: The Qwen3 models, which feature a new hybrid reasoning approach, underscore Alibaba's commitment to open-source AI development.
Last Week in AI: OpenAI undoes its glaze-heavy ChatGPT update, Alibaba unveils Qwen 3, a family of â€˜hybridâ€™ AI reasoning models , Baidu ERNIE X1 and 4.5 Turbo boast high performance at low cost

@bdtechtalks.com //

Alibaba's QwQ-32B Outperforms Larger AI Models in Reasoning - Alibaba’s QwQ-32B model exhibits strong performance in math and reasoning tasks, outperforming larger models such as DeepSeek-R1 and OpenAI’s o1-mini, despite having significantly fewer parameters and is available as open-source.

References: Groq , Analytics Vidhya , bdtechtalks.com ...

Alibaba's Qwen team has unveiled QwQ-32B, a 32-billion-parameter reasoning model that rivals much larger AI models in problem-solving capabilities. This development highlights the potential of reinforcement learning (RL) in enhancing AI performance. QwQ-32B excels in mathematics, coding, and scientific reasoning tasks, outperforming models like DeepSeek-R1 (671B parameters) and OpenAI's o1-mini, despite its significantly smaller size. Its effectiveness lies in a multi-stage RL training approach, demonstrating the ability of smaller models with scaled reinforcement learning to match or surpass the performance of giant models.

The QwQ-32B is not only competitive in performance but also offers practical advantages. It is available as open-weight under an Apache 2.0 license, allowing businesses to customize and deploy it without restrictions. Additionally, QwQ-32B requires significantly less computational power, running on a single high-end GPU compared to the multi-GPU setups needed for larger models like DeepSeek-R1. This combination of performance, accessibility, and efficiency positions QwQ-32B as a valuable resource for the AI community and enterprises seeking to leverage advanced reasoning capabilities.

Recommended read:

Top link: bdtechtalks.com
Permalink: More details

References :

Groq: A Guide to Reasoning with Qwen QwQ 32B
Analytics Vidhya: Qwenâ€™s QwQ-32B: Small Model with Huge Potential
Maginative: Alibaba's Latest AI Model, QwQ-32B, Beats Larger Rivals in Math and Reasoning
bdtechtalks.com: Alibabaâ€™s QwQ-32B reasoning model matches DeepSeek-R1, outperforms OpenAI o1-mini
Last Week in AI: LWiAI Podcast #202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors

News from the AI & ML world

DeeperML - #alibaba

Alibabaâ€™s Qwen3 Challenges Silicon Valley in AI - Alibaba has launched Qwen3, a new generation of AI models designed to compete with Silicon Valley's leading AI technologies.

Alibaba's Qwen3 Challenges Silicon Valley with Advanced AI - Alibaba’s Qwen series, including Qwen3, demonstrates competitive AI performance with full front-end code generation and ZeroSearch for improved LLM retrieval.

Alibaba's Qwen3 LLMs Impress with Robust Open Source Performance - Alibaba’s Qwen team released Qwen3, a family of open-source large language models (LLMs) ranging from 0.6B to 235B parameters, supporting 119 languages and featuring improved agentic capabilities and reasoning optimizations.

Alibaba Cloud's Qwen 3 Challenges US Model Dominance - Alibaba Cloud has launched Qwen 3, a new generation of large language models (LLMs) with 235B parameters, challenging US-based models with its reasoning and multilingual proficiency.

Alibaba's QwQ-32B Outperforms Larger AI Models in Reasoning - Alibaba’s QwQ-32B model exhibits strong performance in math and reasoning tasks, outperforming larger models such as DeepSeek-R1 and OpenAI’s o1-mini, despite having significantly fewer parameters and is available as open-source.

Benchmarks

Blogs

Research Tools