News from the AI & ML world

DeeperML - #r1

Harsh Mishra@Analytics Vidhya //
DeepSeek, a Chinese AI startup, continues to make waves in the open-source community. On February 28, 2025, the company launched the Fire-Flyer File System (3FS) and the Smallpond data processing framework, designed to improve data access and processing for AI training and inference. These high-performance tools aim to address challenges associated with handling large datasets and complex computations, providing a foundation for more efficient AI development.

DeepSeek is also focusing on optimizing matrix multiplications, a critical component in modern deep learning. To that end, DeepSeek AI has released DeepGEMM, an FP8 GEMM library supporting both Dense and MoE GEMMs. This library is tailored for NVIDIA Hopper tensor cores and uses runtime kernel compilation, making it easier to integrate into existing projects without lengthy compile-time processes. DeepGEMM employs fine-grained scaling and a two-level accumulation strategy to balance speed and numerical accuracy in FP8 operations.
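The idea behind fine-grained scaling and two-level accumulation can be sketched in plain NumPy. This is an illustrative emulation only, not DeepGEMM's API or kernel code: `fake_fp8`, `gemm_per_tensor`, `gemm_fine_grained`, the 128-wide tile size, and the exaggerated outlier value are all assumptions made for the example. It compares one scale per tensor against one scale per small tile, with per-tile partial products promoted into a higher-precision accumulator.

```python
import numpy as np

FP8_MAX = 448.0  # largest finite value in the FP8 E4M3 format
BLOCK = 128      # tile width assumed for this sketch


def fake_fp8(x):
    """Round onto an E4M3-like grid (4 significant bits, subnormal step 2**-9).

    A rough emulation for illustration, not a bit-exact FP8 conversion.
    """
    x = np.clip(x, -FP8_MAX, FP8_MAX)
    _, exp = np.frexp(x)
    exp = np.maximum(exp, -5)      # below 2**-6 the grid stops getting finer
    step = np.ldexp(1.0, exp - 4)  # spacing between neighbouring grid values
    return np.round(x / step) * step


def gemm_per_tensor(a, b):
    """Baseline: a single scale per matrix."""
    sa = np.abs(a).max() / FP8_MAX
    sb = np.abs(b).max() / FP8_MAX
    return (fake_fp8(a / sa) @ fake_fp8(b / sb)) * (sa * sb)


def gemm_fine_grained(a, b):
    """One scale per 1 x BLOCK tile of `a` and per BLOCK x BLOCK tile of `b`."""
    m, k = a.shape
    n = b.shape[1]
    acc = np.zeros((m, n), dtype=np.float64)  # second-level accumulator
    for k0 in range(0, k, BLOCK):
        a_blk = a[:, k0:k0 + BLOCK]
        sa = np.maximum(np.abs(a_blk).max(axis=1, keepdims=True), 1e-12) / FP8_MAX
        a_q = fake_fp8(a_blk / sa)
        for n0 in range(0, n, BLOCK):
            b_blk = b[k0:k0 + BLOCK, n0:n0 + BLOCK]
            sb = max(np.abs(b_blk).max(), 1e-12) / FP8_MAX
            # First level: low-precision partial product for a single K tile.
            partial = (a_q @ fake_fp8(b_blk / sb)).astype(np.float32)
            # Second level: rescale and add into the wider accumulator.
            acc[:, n0:n0 + BLOCK] += partial * (sa * sb)
    return acc


rng = np.random.default_rng(0)
a = rng.standard_normal((8, 256))
b = rng.standard_normal((256, 128))
a[0, 0] = 1e6  # deliberately exaggerated outlier
ref = a @ b


def mean_row_error(c):
    """Average relative error per output row, measured against float64."""
    return float(np.mean(np.linalg.norm(c - ref, axis=1) / np.linalg.norm(ref, axis=1)))


err_tensor = mean_row_error(gemm_per_tensor(a, b))
err_fine = mean_row_error(gemm_fine_grained(a, b))
print(f"per-tensor scaling error:   {err_tensor:.3f}")
print(f"fine-grained scaling error: {err_fine:.3f}")
```

With a single scale, one outlier pushes every other value toward the bottom of the FP8 range, where most of them round to zero. With per-tile scales, the damage stays inside the tile that contains the outlier.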

References :
  • Analytics Vidhya: On February 28, 2025, DeepSeek made significant strides in the open-source community by launching the Fire-Flyer File System (3FS) and the Smallpond data processing framework. These innovations are designed to enhance data access and processing capabilities, particularly for AI training and inference workloads.
  • MarkTechPost: Efficient matrix multiplications remain a critical component in modern deep learning and high-performance computing. As models become increasingly complex, conventional approaches to General Matrix Multiplication (GEMM) often face challenges related to memory bandwidth constraints, numerical precision, and suboptimal hardware utilization.
  • MarkTechPost: DeepSeek AI Releases Fire-Flyer File System (3FS): A High-Performance Distributed File System Designed to Address the Challenges of AI Training and Inference Workload
  • MarkTechPost: This article discusses DeepSeek’s release of DualPipe, a bidirectional pipeline parallelism algorithm for improving computation-communication overlap.
  • techxplore.com: When small Chinese artificial intelligence (AI) company DeepSeek released a family of extremely efficient and highly competitive AI models last month, it rocked the global tech community. The release revealed China's growing technological prowess. It also showcased a distinctly Chinese approach to AI advancement.
  • MarkTechPost: DeepSeek AI Releases Smallpond: A Lightweight Data Processing Framework Built on DuckDB and 3FS
  • Gradient Flow: DeepSeek Fire-Flyer: What You Need to Know
  • Unite.AI: DeepSeek and AI Power Shift: Key Insights for Investors and Entrepreneurs
  • Towards AI: DeepSeek has been busy open-sourcing a very impressive and valuable set of internal tools and code optimizations for training and inferencing LLMs.
  • Analytics Vidhya: QwQ-32B Vs DeepSeek-R1: Can a 32B Model Challenge a 671B Parameter Model?

@timesofindia.indiatimes.com //
Recent developments highlight both the expanding influence and the regulatory hurdles faced by the AI company DeepSeek. In South Korea, the government has halted downloads of DeepSeek's applications, citing concerns over data privacy. This action has removed the company's apps from both the Apple and Google mobile app marketplaces, though its website remains accessible.

Simultaneously, DeepSeek's AI technology is rapidly integrating into China's transportation sector, extending from electric vehicles (EVs) to e-scooters. Major automakers, including BYD, Geely, and Chery Automobile, are incorporating DeepSeek's AI into their vehicles, offering features like preliminary self-driving capabilities. E-scooter brands like Segway-Ninebot and Niu Technologies are also integrating DeepSeek for enhanced features such as AI-powered content creation, data analytics, and driver assistance systems, reflecting what some industry observers are calling "DeepSeek fever" due to its cost-effective AI integration.

Perplexity has released "1776," a modified version of DeepSeek-R1. This model addresses the original version's limitations by mitigating censorship on sensitive topics, particularly those related to Chinese history and geopolitics. The modifications were made using post-training techniques to produce more open and contextually accurate responses. The modified model is available on Perplexity's Sonar AI platform and GitHub.


@nsfocusglobal.com //
The launch of DeepSeek's R1 AI model has significantly impacted the AI industry and global markets, with reports indicating that more than $1 trillion was wiped off the U.S. stock market. The event sent shockwaves through the tech world, prompting AI companies to reassess their strategies. The model's emergence also underscored the intensifying competition between the U.S. and China in the AI sector, highlighting China's position as a frontrunner. Chinese companies across multiple sectors, including telecommunications, cloud computing, semiconductors, finance, automotive, and mobile technology, have already integrated DeepSeek into their operations.

Security concerns surrounding the model have also grown. NSFOCUS is using its Large Language Model Security Assessment System (NSFOCUS AI-Scan) to address potential security hazards. AI-Scan embeds security assurance throughout the AI application development process, covering data security, content security, adversarial security, application security, AI supply chain security, and model backdoor attack risks for large models.

References :
  • insideAI News: InsideAI News article about SambaNova achieving high speed and efficiency in running DeepSeek-R1 671B.
  • nsfocusglobal.com: NSFocusGlobal article discussing security concerns around the DeepSeek R1 AI model.

@the-decoder.com //
Perplexity AI has launched Deep Research, an AI-powered research tool aimed at competing with OpenAI and Google Gemini. Using DeepSeek-R1, Perplexity is offering comprehensive research reports at a much lower cost than OpenAI, with 500 queries per day for $20 per month compared to OpenAI's $200 per month for only 100 queries. The new service automatically conducts dozens of searches and analyzes hundreds of sources to produce detailed reports in one to two minutes.
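The pricing gap quoted above can be worked through as a quick calculation. This is a rough sketch under stated assumptions: a 30-day month (`DAYS_PER_MONTH` is not from the source), and OpenAI's 100 queries read as a monthly allowance. The variable and function names are made up for the example.

```python
DAYS_PER_MONTH = 30  # assumption: the source does not define a month length

perplexity_monthly_usd = 20
perplexity_queries_per_month = 500 * DAYS_PER_MONTH  # 500 queries per day

openai_monthly_usd = 200
openai_queries_per_month = 100  # assumption: read as a monthly allowance


def cost_per_query(monthly_usd, queries_per_month):
    """Effective price of one query if the full allowance is used."""
    return monthly_usd / queries_per_month


price_ratio = openai_monthly_usd / perplexity_monthly_usd
perplexity_cpq = cost_per_query(perplexity_monthly_usd, perplexity_queries_per_month)
openai_cpq = cost_per_query(openai_monthly_usd, openai_queries_per_month)

print(f"Subscription price ratio: {price_ratio:.0f}x")
print(f"Perplexity: ${perplexity_cpq:.4f} per query at full usage")
print(f"OpenAI:     ${openai_cpq:.2f} per query at full usage")
```

The subscription price differs by a factor of ten. The per-query gap is far larger, but only for a user who actually exhausts both allowances.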

Perplexity claims Deep Research performs 8 searches and consults 42 sources to generate a 1,300-word report in under 3 minutes. The company says the tool works particularly well for finance, marketing, and technology research. The service is launching first on web browsers, with iOS, Android, and Mac versions planned for later release. Perplexity CEO Aravind Srinivas said he wants to keep making it faster and cheaper in the interest of humanity.

References :
  • the-decoder.com: Perplexity uses Deepseek-R1 to offer Deep Research 10 times cheaper than OpenAI
  • www.analyticsvidhya.com: Enhancing Multimodal RAG with Deepseek Janus Pro
  • www.marktechpost.com: DeepSeek AI Introduces CODEI/O: A Novel Approach that Transforms Code-based Reasoning Patterns into Natural Language Formats to Enhance LLMs’ Reasoning Capabilities
  • venturebeat.com: Perplexity just made AI research crazy cheap—what that means for the industry
  • Analytics Vidhya: The landscape of AI-powered research just became even more competitive with the launch of Perplexity’s Deep Research. Previously, OpenAI and Google Gemini were leading the way in this space, and now Perplexity has joined the ranks.
  • iHLS: New York State Bans DeepSeek AI App Over Security Concerns
  • NextBigFuture.com: Does DeepSeek Impact the Future of AI Data Centers?
  • THE DECODER: Perplexity's Deep Research utilizes DeepSeek-R1 for generating comprehensive research reports.
  • www.ghacks.net: Perplexity AI has unveiled its latest feature, the 'Deep Research' tool, designed to enhance users' ability to conduct comprehensive research on complex topics.
  • PCMag Middle East ai: Perplexity Launches a Free 'Deep Research' AI Tool
  • bsky.app: Perplexity follows OpenAI with the release of its Deep Research.
  • techstrong.ai: Perplexity AI Launches a Deep Research Tool to Help Humans Research, Deeply
  • Data Phoenix: Perplexity has launched Deep Research, a free AI-powered research tool that can analyze hundreds of sources in minutes to create comprehensive reports across various domains, promising to save users significant research time.
  • eWEEK: Perplexity 1776 Model Fixes DeepSeek-R1’s “Refusal to Respond to Sensitive Topics”

David Gerard@Pivot to AI //
Recent reports have drawn scrutiny to DeepSeek, a popular chatbot developed in China, and raised safety concerns. US lawmakers are considering banning the AI model on government-issued devices due to potential data transfer to China Mobile, a state-owned telecommunications company already banned in the US. Security researchers have found that DeepSeek collects user data, including IP addresses and keystroke patterns, and stores it in China, where it is vulnerable to government requisition, raising alarms about national security implications.

The DeepSeek R1 model has been found to have easily bypassable safety guardrails, a vulnerability it shares with leading fine-tunable models from OpenAI, Anthropic, and Google. This discovery has led to fears that the AI could be exploited to generate instructions for harmful and illegal activities. Researchers have demonstrated how DeepSeek can be manipulated to provide detailed instructions for producing chemical weapons, pressuring coworkers into sex, and even planning terrorist attacks. This highlights the difficulties in balancing AI innovation with effective safety measures and the complexities of regulating AI technologies developed under different governance structures.

References :
  • cset.georgetown.edu: China’s ability to launch DeepSeek’s popular chatbot draws US government panel’s scrutiny
  • AI Alignment Forum: Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
  • AI News: DeepSeek ban? China data transfer boosts security concerns

Jibin Joseph@PCMag Middle East ai //
DeepSeek AI's R1 model, a reasoning model praised for its detailed thought process, is now available on platforms like AWS and NVIDIA NIM. This increased accessibility allows users to build and scale generative AI applications with minimal infrastructure investment. Benchmarks have also revealed surprising performance metrics, with AMD’s Radeon RX 7900 XTX outperforming the RTX 4090 in certain DeepSeek benchmarks. The rise of DeepSeek has put the spotlight on reasoning models, which break questions down into individual steps, much like humans do.

Concerns surrounding DeepSeek have also emerged. The U.S. government is investigating whether DeepSeek smuggled restricted NVIDIA GPUs via Singapore to bypass export restrictions. A NewsGuard audit found that DeepSeek’s chatbot often advances Chinese government positions in response to prompts about Chinese, Russian, and Iranian false claims. Furthermore, security researchers discovered a "completely open" DeepSeek database that exposed user data and chat histories, raising privacy concerns. These issues have led to proposed legislation, such as the "No DeepSeek on Government Devices Act," reflecting growing worries about data security and potential misuse of the AI model.

References :
  • aws.amazon.com: DeepSeek R1 models now available on AWS
  • www.pcguide.com: DeepSeek GPU benchmarks reveal AMD’s Radeon RX 7900 XTX outperforming the RTX 4090
  • www.tomshardware.com: U.S. investigates whether DeepSeek smuggled Nvidia AI GPUs via Singapore
  • www.wired.com: Article details challenges of testing and breaking DeepSeek's AI safety guardrails.
  • decodebuzzing.medium.com: Benchmarking ChatGPT, Qwen, and DeepSeek on Real-World AI Tasks
  • medium.com: The blog post emphasizes the use of DeepSeek-R1 in a Retrieval-Augmented Generation (RAG) chatbot. It underscores its comparability in performance to OpenAI's o1 model and its role in creating a chatbot capable of handling document uploads, information extraction, and generating context-aware responses.
  • www.aiwire.net: This article highlights the cost-effectiveness of DeepSeek's R1 model in training, noting its training on a significantly smaller cluster of older GPUs compared to leading models from OpenAI and others, which are known to have used far more extensive resources.
  • futurism.com: OpenAI CEO Sam Altman has since congratulated DeepSeek for its "impressive" R1 reasoning model and promised spooked investors to "deliver much better models."
  • AWS Machine Learning Blog: Protect your DeepSeek model deployments with Amazon Bedrock Guardrails
  • mobinetai.com: DeepSeek is a catastrophically broken model with non-existent, typical shoddy Chinese safety measures that take 60 seconds to dismantle.
  • AI Alignment Forum: Illusory Safety: Redteaming DeepSeek R1 and the Strongest Fine-Tunable Models of OpenAI, Anthropic, and Google
  • Pivot to AI: Of course DeepSeek lied about its training costs, as we had strongly suspected.
  • Unite.AI: Artificial Intelligence (AI) is no longer just a technological breakthrough but a battleground for global power, economic influence, and national security.
  • cset.georgetown.edu: China’s ability to launch DeepSeek’s popular chatbot draws US government panel’s scrutiny
  • neuralmagic.com: Enhancing DeepSeek Models with MLA and FP8 Optimizations in vLLM
  • www.unite.ai: Blog post about DeepSeek and the global power shift.
  • cset.georgetown.edu: This article discusses DeepSeek and its impact on the US-China AI race.

@www.cnbc.com //
Chinese AI company DeepSeek is facing a large-scale cyberattack that has led it to temporarily suspend new user registrations. The company made the announcement on Monday, stating that existing users could still log in as usual while it works to mitigate the attack. DeepSeek is known for its open-source projects and has recently released models like R1, a reasoning model, and Janus-Pro-7B, a multi-modal AI model capable of generating images. This incident highlights the security vulnerabilities that AI service providers face and the potential disruption these attacks can cause to the industry and its users.

The cyberattack comes as DeepSeek's technology has been gaining attention and challenging established AI companies. The company has also released an iOS app, DeepSeek – AI Assistant, which has become a top download. There are also reports that DeepSeek may have used OpenAI's models to train its own competing model, which has brought new focus on the competition between China and the US in the AI industry. The incident also raises questions about the security and stability of AI infrastructure, especially in light of geopolitical competition and the importance of AI in various sectors.

References :
  • Techmeme: DeepSeek on Monday said it would temporarily limit user registrations “due to large-scale malicious attacks” on its services, though existing users will be able to log in as usual.
  • www.cnbc.com: DeepSeek on Monday said it would temporarily limit user registrations “due to large-scale malicious attacks” on its services, though existing users will be able to log in as usual.
  • www.theguardian.com: TheGuardian post about DeepSeek cyberattack.
  • www.themirror.com: TheMirror post about DeepSeek censorship.
  • www.theregister.com: TheRegister post about DeepSeek suspending registrations.
  • Techmeme: Wiz: DeepSeek left one of its critical databases exposed, leaking more than 1M records including system logs, user prompt submissions, and users' API keys (Wired)
  • www.wired.com: Exposed DeepSeek Database Revealed Chat Prompts and Internal Data
  • Pyrzout: Guess who left a database wide open, exposing chat logs, API keys, and more? Yup, DeepSeek – Source: go.theregister.com
  • ciso2ciso.com: Guess who left a database wide open, exposing chat logs, API keys, and more? Yup, DeepSeek – Source: go.theregister.com
  • Wiz Blog | RSS feed: Wiz Research Uncovers Exposed DeepSeek Database Leaking Sensitive Information, Including Chat History | Wiz Blog
  • www.cnbc.com: The US Navy has instructed its members to avoid using DeepSeek "in any capacity" due to "potential security and ethical concerns"
  • heise online English: China's DeepSeek, which has rattled American AI makers, has limited new signups to its web-based interface amid what is said to be an ongoing cyberattack.
  • The Hacker News: DeepSeek AI Database Exposed: Over 1 Million Log Lines, Secret Keys Leaked
  • www.theverge.com: The Verge reports on DeepSeek's database exposing user data and chat histories.
  • www.infosecurity-magazine.com: Infosecurity Magazine reports on the DeepSeek database exposure and the types of sensitive data leaked.