News from the AI & ML world

DeeperML

SGLang Team@PyTorch Website //
Two recent announcements expand the tooling available to LLM developers: SGLang has joined the PyTorch ecosystem, and Microsoft Research has introduced KBLaM. SGLang is a community-supported framework for efficient, adaptable serving of large language models (LLMs). By co-designing the backend runtime and the frontend language, it aims to make model interactions faster and more controllable. Its core features are a fast backend runtime that uses RadixAttention for prefix caching, a flexible frontend language for programming LLM applications, and broad model support, including Llama, Gemma, and Mistral.
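
To make the idea of prefix caching concrete, here is a minimal sketch of how requests that share a prompt prefix can reuse cached work. It is not SGLang's implementation: it uses a plain token trie rather than a compressed radix tree, has no eviction, and stores placeholder strings where a real engine would keep KV tensors. The names `TrieNode`, `PrefixCache`, `match_prefix`, and `insert`, and the token ids, are hypothetical.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One cached token; `kv` stands in for that token's cached KV state."""

    def __init__(self, kv: Optional[str] = None) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.kv = kv


class PrefixCache:
    """Toy prefix cache: a plain token trie, not a compressed radix tree."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def match_prefix(self, tokens: List[int]) -> int:
        """Return how many leading tokens already have cached state."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens: List[int]) -> int:
        """Cache a sequence and return the number of tokens newly computed."""
        node, computed = self.root, 0
        for pos, tok in enumerate(tokens):
            if tok not in node.children:
                # Assumption: a real engine would run the model here
                # and store KV tensors instead of a placeholder string.
                node.children[tok] = TrieNode(kv=f"kv[{pos}]:{tok}")
                computed += 1
            node = node.children[tok]
        return computed


cache = PrefixCache()
system_prompt = [101, 7, 7, 42]  # hypothetical token ids shared by all requests

first = cache.insert(system_prompt + [5, 6])   # nothing cached yet
second = cache.insert(system_prompt + [9])     # shared prefix is reused
reused = cache.match_prefix(system_prompt + [9, 10])

print(first, second, reused)  # prints "6 1 5"
```

The second request computes only its one new token because the four-token system prompt is already cached, which is the saving that prefix caching targets.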

KBLaM, introduced by Microsoft Research, integrates structured knowledge into LLMs without retraining. It encodes knowledge as continuous key-value vector pairs and embeds them in the model's attention layers through a specialized rectangular attention mechanism. Because the external knowledge base is converted into a format the LLM can process directly, the knowledge can be updated dynamically, and the approach is presented as more efficient and scalable than traditional methods such as fine-tuning and Retrieval-Augmented Generation (RAG).
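
The following is a minimal sketch of what a rectangular attention step could look like, under stated assumptions rather than as KBLaM's actual code. It assumes plain scaled dot-product attention with a single softmax, in which each of the N prompt tokens attends to all M knowledge key-value pairs plus the prompt tokens up to itself, giving an N x (M + N) score matrix. The function `rectangular_attention` and all vectors are hypothetical, and the encoder that would produce the knowledge vectors is omitted.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def rectangular_attention(
    queries: List[Vector],
    prompt_keys: List[Vector],
    prompt_values: List[Vector],
    kb_keys: List[Vector],
    kb_values: List[Vector],
) -> Tuple[List[Vector], List[List[float]]]:
    """Prompt token i attends to every KB pair and to prompt tokens 0..i.

    Knowledge entries contribute keys and values only; they issue no
    queries, so the score matrix is rectangular, not square.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        keys = kb_keys + prompt_keys[: i + 1]        # causal over the prompt
        values = kb_values + prompt_values[: i + 1]
        weights = softmax([dot(q, k) * scale for k in keys])
        out = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional vectors: M = 2 knowledge pairs, N = 3 prompt tokens.
kb_keys = [[1.0, 0.0], [0.0, 1.0]]
kb_values = [[0.5, 0.5], [0.2, 0.8]]
prompt_q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prompt_k = [[0.3, 0.1], [0.1, 0.3], [0.2, 0.2]]
prompt_v = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

outputs, weights = rectangular_attention(
    prompt_q, prompt_k, prompt_v, kb_keys, kb_values
)
print([len(row) for row in weights])  # prints "[3, 4, 5]"
```

In this sketch, swapping in a different `kb_keys` and `kb_values` list changes what the prompt tokens can attend to without touching any model parameters, which mirrors the plug-and-play updating described above.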

References:
  • PyTorch Website: SGLang Joins PyTorch Ecosystem: Efficient LLM Serving Engine
  • Source: AI innovation requires AI security: Hear what’s new at Microsoft Secure
  • Microsoft Research: Introducing KBLaM: Bringing plug-and-play external knowledge to LLMs
Classification:
  • HashTags: #PyTorch #LLMserving #AISecurity
  • Company: Microsoft
  • Target: AI Developers
  • Product: PyTorch
  • Feature: LLM Serving Engine
  • Type: AI
  • Severity: Informative