@syncedreview.com
//
DeepSeek AI has unveiled DeepSeek-Prover-V2, a new open-source large language model (LLM) designed for formal theorem proving in the Lean 4 environment. The model advances neural theorem proving through a recursive theorem-proving pipeline and leverages DeepSeek-V3 to generate high-quality initialization data. DeepSeek-Prover-V2 achieves top results on the MiniF2F benchmark, demonstrating state-of-the-art performance in formal mathematical reasoning. The release also includes ProverBench, a new benchmark for evaluating mathematical reasoning capabilities.
DeepSeek-Prover-V2 features a distinctive cold-start training procedure. The process begins with DeepSeek-V3 decomposing complex mathematical theorems into a series of more manageable subgoals, while also formalizing those high-level proof steps in Lean 4 to produce a structured sequence of sub-problems. To keep the computationally intensive proof search tractable, a smaller 7B-parameter prover model searches for a proof of each subgoal. Once every decomposed step of a challenging problem has been proven, the complete step-by-step formal proof is paired with DeepSeek-V3's corresponding chain-of-thought reasoning. The resulting synthesized dataset integrates informal, high-level mathematical reasoning with rigorous formal proofs, providing a strong cold start for subsequent reinforcement learning.
Building on this synthetic cold-start data, the DeepSeek team curated challenging problems that the 7B prover could not solve end-to-end but whose subgoals had all been successfully proven. By combining the formal proofs of those subgoals, a complete proof of the original problem is assembled; that proof is then linked with DeepSeek-V3's chain-of-thought outlining the lemma decomposition, yielding a unified training example of informal reasoning followed by formalization.
DeepSeek is also challenging the long-held assumption, voiced by many tech CEOs, that exponential AI improvements require ever-increasing computing power. The company claims to have produced models comparable to OpenAI's at significantly lower compute and cost, calling into question whether massive scale is necessary for AI advancement.
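To make the subgoal decomposition described above concrete, here is a minimal, hypothetical Lean 4 sketch: a theorem is split into `have` subgoals whose proofs are initially left as `sorry`, so each piece can be attacked independently before the fragments are recombined into a full proof. The statement and tactic steps are purely illustrative and are not drawn from DeepSeek-Prover-V2's training data.

```lean
-- Hypothetical example of decomposing a theorem into named subgoals.
-- Each `sorry` marks a subgoal to be proved separately (e.g. by a smaller
-- prover model); the final lines recombine the pieces into one proof.
theorem add_mod_two_eq_zero (a b : Nat)
    (ha : a % 2 = 0) (hb : b % 2 = 0) : (a + b) % 2 = 0 := by
  -- Subgoal 1: express the sum modulo 2 in terms of the summands.
  have h1 : (a + b) % 2 = (a % 2 + b % 2) % 2 := by
    sorry
  -- Subgoal 2: close that reduced form using the hypotheses on a and b.
  have h2 : (a % 2 + b % 2) % 2 = 0 := by
    sorry
  -- Recombine the proved subgoals into the original statement.
  rw [h1]
  exact h2
```

In the pipeline described above, once every such `sorry` has been discharged, the completed formal proof is paired with the natural-language reasoning that produced the decomposition, forming one cold-start training example.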
@the-decoder.com
//
DeepSeek's R1 model, released in January 2025, disrupted the AI industry by demonstrating top-tier capabilities on a limited budget and without relying on Nvidia's highest-end GPUs. The model, reportedly optimized for Huawei's Ascend 910B chips, showed that engineering ingenuity and talent can overcome hardware limitations, unsettling the Silicon Valley giants that had previously dominated the AI landscape. R1 performed competitively with GPT-4o on many benchmarks, particularly Chinese-language tasks, while being significantly cheaper to train and serve, signaling a potential collapse of price-performance curves in the AI market.
R1's success was attributed to DeepSeek's drive to innovate on efficiency, particularly KV-cache optimization, an area of lesser concern for larger players with abundant resources. By optimizing the Key-Value cache used in every attention layer of the LLM, DeepSeek cut GPU memory usage and serving cost. The company has published this work, and its results have been verified at smaller scale. The accomplishment underscores that squeezing more out of existing resources, rather than simply buying more expensive hardware, can deliver top-tier AI performance.
DeepSeek is now reportedly preparing to launch R2 in May 2025, with rumored specifications of 1.2 trillion parameters in a hybrid mixture-of-experts (MoE) setup and training on over 5.2 petabytes of data. Leaks suggest R2 could be 97% cheaper than GPT-4o while adding improved vision capabilities and potentially rivaling GPT-4.5 and Gemini 2.5. Combined with DeepSeek's push to reduce reliance on American silicon, these developments point to a major shift in the AI industry and an era of significantly lower AI pricing.
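As a rough illustration of what the Key-Value cache is, the NumPy sketch below implements a toy single-head attention layer that caches keys and values across decoding steps. This is generic textbook caching, not DeepSeek's specific optimization (which reduces what must be cached); it only shows why the cache grows with sequence length and layer count, and therefore why shrinking it saves GPU memory.

```python
# Toy single-head attention with a per-layer KV cache.
# Illustrative only: real models use many heads and layers, and DeepSeek's
# published optimization changes what is stored, which is not shown here.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class CachedAttentionLayer:
    def __init__(self, d_model: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.wq = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.wk = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.wv = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        self.k_cache = np.zeros((0, d_model))  # grows by one row per token
        self.v_cache = np.zeros((0, d_model))

    def decode_step(self, x: np.ndarray) -> np.ndarray:
        """Attend from the newest token to all previously cached tokens."""
        q = x @ self.wq
        # Only the new token's key/value are computed; past ones are reused.
        self.k_cache = np.vstack([self.k_cache, x @ self.wk])
        self.v_cache = np.vstack([self.v_cache, x @ self.wv])
        scores = softmax(q @ self.k_cache.T / np.sqrt(x.shape[-1]))
        return scores @ self.v_cache

d_model, n_layers, seq_len = 128, 32, 4096
layer = CachedAttentionLayer(d_model)
for _ in range(8):  # a few decoding steps
    _ = layer.decode_step(np.random.randn(1, d_model))

# Rough full-model cache size in fp16: two tensors (K and V) per layer.
cache_bytes = 2 * n_layers * seq_len * d_model * 2
print(f"KV cache for one sequence at fp16: {cache_bytes / 1e6:.1f} MB")
```

Every decoding step appends one row of keys and values per layer, so for long contexts and many concurrent requests the cache, rather than the model weights, can dominate GPU memory, which is why optimizing it translates directly into serving cost savings.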
Ben Dickson@AI News | VentureBeat
//
DeepSeek, a Chinese AI company, has achieved a breakthrough in AI reward modeling that promises to enhance the reasoning and responsiveness of AI systems. Collaborating with Tsinghua University researchers, DeepSeek developed a technique called "Inference-Time Scaling for Generalist Reward Modeling," demonstrating improved performance compared to existing methods and competitive results against established public reward models. This innovation aims to improve how AI systems learn from human preferences, a key factor in developing more useful and aligned artificial intelligence.
DeepSeek's new approach combines two methods: Generative Reward Modeling (GRM) and Self-Principled Critique Tuning (SPCT). GRM handles varied input types flexibly and can be scaled at inference time, representing rewards in language rather than as a single scalar, which gives a richer signal than earlier approaches. SPCT is a learning method that fosters scalable reward-generation behaviors in GRMs through online reinforcement learning. One of the paper's authors explained that the combination allows principles to be generated from the input query and responses, adaptively aligning the reward-generation process.
SPCT targets the main challenges in building generalist reward models that can handle broad tasks: input flexibility, accuracy, inference-time scalability, and learning scalable behaviors. By generating self-guiding critiques, it promises more scalable reward signals for enterprise LLMs, particularly in open-ended tasks and domains where current models struggle.
DeepSeek has also released models such as DeepSeek-V3 and DeepSeek-R1, which achieve performance close to, and sometimes exceeding, that of leading proprietary models while using fewer training resources. These results signal that cutting-edge AI is not solely the domain of closed labs, and they underline the importance of efficient model architecture, training algorithms, and hardware integration.
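A hedged sketch of the inference-time scaling idea follows: the reward model is sampled several times for the same query-response pair, each sample producing principles, a critique, and a score, and the samples are aggregated into a single, more stable reward. The `query_grm` function below is a hypothetical placeholder for a real generative reward model call, not DeepSeek's API; more elaborate aggregation, such as guided voting, is omitted here.

```python
# Minimal sketch of inference-time scaling for a generative reward model:
# sample k judgments, parse each score, and average them into one reward.
# `query_grm` is a stand-in stub, NOT a real model or DeepSeek's implementation.
import random
import re
import statistics
from typing import List

def query_grm(query: str, response: str, seed: int) -> str:
    """Placeholder for one sampled GRM generation (principles + critique + score)."""
    random.seed(hash((query, response, seed)))
    score = random.randint(6, 9)  # pretend the model judged the response
    return (f"Principles: correctness, clarity.\n"
            f"Critique: the response addresses the question directly.\n"
            f"Score: {score}/10")

def parse_score(judgment: str) -> float:
    """Pull the numeric score out of a generated judgment."""
    match = re.search(r"Score:\s*(\d+(?:\.\d+)?)", judgment)
    return float(match.group(1)) if match else 0.0

def inference_time_scaled_reward(query: str, response: str, k: int = 8) -> float:
    """Sample k independent judgments and aggregate them into one reward."""
    scores: List[float] = [parse_score(query_grm(query, response, seed=i))
                           for i in range(k)]
    # Averaging (or voting over discretized scores) makes the final reward
    # more stable as k grows -- the "scaling during inference" effect.
    return statistics.mean(scores)

if __name__ == "__main__":
    reward = inference_time_scaled_reward(
        query="Explain why the sky is blue.",
        response="Rayleigh scattering disperses shorter wavelengths more strongly.",
    )
    print(f"Aggregated reward over 8 samples: {reward:.2f}")
```

The design choice illustrated here is that extra compute is spent at evaluation time, by drawing more samples, rather than by training a larger reward model, which is what makes the reward signal "scalable" at inference.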