News from the AI & ML world
@analyticsindiamag.com
Microsoft has announced BitNet b1.58 2B4T, a compact large language model (LLM) designed to run efficiently on CPUs. The model has 2 billion parameters but uses only 1.58 bits per weight, far fewer than the 16 or 32 bits typical of conventional AI models. As a result, BitNet operates with a dramatically smaller memory footprint of roughly 400MB, making it suitable for devices with limited resources and even allowing it to run on an Apple M2 chip.
The 1-bit LLM was trained on a massive dataset of 4 trillion tokens and has proven competitive with leading open-weight, full-precision LLMs of similar size, such as Meta's Llama 3.2 1B, Google's Gemma 3 1B, and Alibaba's Qwen 2.5 1.5B. BitNet matches or exceeds their performance on tasks such as language understanding, math, coding, and conversation, while significantly reducing memory use, energy consumption, and decoding latency.
The architecture is based on the standard Transformer, with key modifications: custom BitLinear layers quantize model weights to 1.58 bits during the forward pass, mapping them to the ternary values {-1, 0, +1} with an absolute-mean (absmean) quantization scheme, while activations are quantized to 8-bit integers. To facilitate adoption, Microsoft has released the model weights on Hugging Face, along with open-source code for running it, including bitnet.cpp, a dedicated inference tool optimized for CPU execution.
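To make the quantization scheme concrete, here is a minimal NumPy sketch of absmean ternary weight quantization and 8-bit activation quantization. This is an illustration of the general technique only, not Microsoft's implementation; the function names and the per-tensor (rather than per-channel) scaling are assumptions for simplicity.

```python
import numpy as np

def absmean_quantize(W, eps=1e-8):
    """Quantize a weight matrix to the ternary set {-1, 0, +1}
    using absolute-mean scaling: divide by the mean absolute
    weight, then round and clip to [-1, 1]."""
    gamma = np.abs(W).mean() + eps           # absmean scale factor
    W_q = np.clip(np.round(W / gamma), -1, 1)
    return W_q, gamma                        # gamma rescales outputs later

def int8_quantize(x, eps=1e-8):
    """Quantize activations to 8-bit integers with absmax scaling
    (a common choice; the exact scheme here is an assumption)."""
    scale = 127.0 / (np.abs(x).max() + eps)
    x_q = np.clip(np.round(x * scale), -128, 127)
    return x_q, scale

# Example: small weights round to 0, large ones saturate to +/-1.
W = np.array([[0.8, -0.05, -1.2],
              [0.3,  0.0,  -0.4]])
W_q, gamma = absmean_quantize(W)
print(W_q)  # ternary matrix of -1, 0, +1 values
```

Because every weight is one of three values, a matrix multiply reduces to additions, subtractions, and skips, which is what lets tools like bitnet.cpp run the model efficiently on CPUs without floating-point multiply hardware in the inner loop.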
Classification:
- HashTags: #AI #Microsoft #CPUs
- Company: Microsoft
- Target: AI Researchers, End Users
- Product: LLM
- Feature: 1-Bit LLM
- Type: AI
- Severity: Informative