News from the AI & ML world

DeeperML

Source: www.marktechpost.com
The Allen Institute for AI (Ai2) has launched OLMoTrace, an open-source tool that brings transparency to large language models (LLMs). OLMoTrace traces LLM outputs back to the original training data in real time, addressing a significant barrier to enterprise AI adoption: the difficulty of understanding how these systems arrive at their decisions. The tool is integrated into the Ai2 Playground, where users can experiment with the recently released OLMo 2 32B model and explore the connections between its outputs and the datasets it was trained on.

OLMoTrace differs from existing approaches such as retrieval-augmented generation (RAG) and confidence scores by linking directly to the source material used in training. Whereas RAG augments generation with external sources at inference time, OLMoTrace traces outputs back to what the model saw during training, offering a glimpse into how it learned specific information. The technology identifies long, unique text sequences in model outputs and matches them with specific documents from the training corpus, highlighting the relevant text and linking to the original source material.

The tool searches the training data for verbatim matches of word sequences in the output and uses token rarity to surface particularly specific passages. For each word sequence, it presents up to ten relevant documents and merges overlapping sequences for a clean display. The approach has already produced insights: for example, incorrect information about a model's knowledge cutoff was traced to examples in the fine-tuning data. With OLMoTrace, Ai2 aims to decode language model behavior, fostering trust and a deeper understanding of AI decision-making.
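
The matching steps described above can be illustrated with a toy Python sketch. It is not Ai2's implementation: the whitespace tokenizer, the brute-force scan, the `min_len` threshold, the summed negative log-probability used as a rarity score, and all function names are assumptions made for illustration. Only the overall shape (verbatim span matching, rarity-based ranking, merging of overlapping spans, and a cap of ten documents per span) comes from the description.

```python
import math
from collections import Counter

MAX_DOCS = 10  # the article mentions up to ten documents per sequence


def tokenize(text):
    # Lowercased whitespace split; a stand-in for a real LLM tokenizer (assumption).
    return text.lower().split()


def occurs_in(seq, doc):
    # True if seq appears as a contiguous run of tokens in doc.
    n = len(seq)
    return any(doc[k:k + n] == seq for k in range(len(doc) - n + 1))


def matching_docs(seq, docs):
    # Indices of all documents that contain seq verbatim.
    return [i for i, doc in enumerate(docs) if occurs_in(seq, doc)]


def maximal_spans(out, docs, min_len=3):
    # For each start position, extend the span while it still occurs verbatim
    # in some document; keep spans that are long enough and not nested in an
    # earlier kept span. Spans are half-open (start, end) token ranges.
    spans, prev_end = [], 0
    for start in range(len(out)):
        end = start
        while end < len(out) and matching_docs(out[start:end + 1], docs):
            end += 1
        if end - start >= min_len and end > prev_end:
            spans.append((start, end))
            prev_end = end
    return spans


def merge_overlapping(spans):
    # Standard interval merge over half-open ranges.
    merged = []
    for start, end in sorted(spans):
        if merged and start < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def trace(output, corpus, min_len=3):
    docs = [tokenize(text) for text in corpus]
    out = tokenize(output)
    counts = Counter(tok for doc in docs for tok in doc)
    total = sum(counts.values())

    spans = maximal_spans(out, docs, min_len)
    results = []
    for start, end in merge_overlapping(spans):
        # Union the documents of every matched span inside this merged region.
        doc_ids = sorted({d for s, e in spans if start <= s and e <= end
                          for d in matching_docs(out[s:e], docs)})
        # Rarity score (assumption): summed negative log unigram probability,
        # so long spans made of rare tokens rank highest.
        rarity = sum(-math.log(counts[tok] / total) for tok in out[start:end])
        results.append({"text": " ".join(out[start:end]),
                        "docs": doc_ids[:MAX_DOCS],
                        "rarity": rarity})
    return sorted(results, key=lambda r: r["rarity"], reverse=True)


# Hypothetical three-document "training corpus" and model output.
CORPUS = [
    "the space needle was built for the 1962 world's fair in seattle",
    "seattle is a city in washington state",
    "the cat sat on the mat",
]
OUTPUT = ("I read that the space needle was built for the 1962 world's fair "
          "and it is a city in washington")

for hit in trace(OUTPUT, CORPUS):
    print(hit["docs"], hit["text"])
```

A linear scan like this only works on a toy corpus. Matching against a full training set would need a purpose-built index, which this sketch does not attempt to model.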

References:
  • the-decoder.com: Everyone can now trace language model outputs back to their training data with OLMoTrace
  • AI News | VentureBeat: What’s inside the LLM? Ai2 OLMoTrace will ‘trace’ the source
  • www.marktechpost.com: Allen Institute for AI (Ai2) Launches OLMoTrace: Real-Time Tracing of LLM Outputs Back to Training Data
Classification:
  • HashTags: #AITransparency #OLMoTrace #LLMTracing
  • Company: Ai2
  • Target: AI Community and Enterprises
  • Product: OLMoTrace
  • Feature: Tracing AI Outputs, Transparency
  • Type: AI
  • Severity: Informative