News from the AI & ML world

DeeperML

Source: www.marktechpost.com
Meta AI has announced the release of V-JEPA 2, an open-source world model designed to enhance robots' ability to understand and interact with physical environments. V-JEPA 2 builds upon the Joint Embedding Predictive Architecture (JEPA) and leverages self-supervised learning from over one million hours of video and images. This approach allows the model to learn abstract concepts and predict future states, enabling robots to perform tasks in unfamiliar settings and improving their understanding of motion and appearance. The model could prove useful in manufacturing automation, surveillance analytics, in-building logistics, robotics, and other advanced use cases.

Meta researchers scaled JEPA pretraining by constructing a 22M-sample dataset (VideoMix22M) from public sources and expanding the encoder to over 1B parameters. They also adopted a progressive resolution strategy and extended pretraining to 252K iterations, reaching 64 frames at 384x384 resolution. V-JEPA 2 avoids the inefficiencies of pixel-level prediction by focusing on predictable scene dynamics while disregarding irrelevant noise. This abstraction makes the system both more efficient and more robust, needing roughly 16 seconds to plan each action when controlling a robot.
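The core idea described above, predicting in an abstract embedding space rather than at the pixel level, can be illustrated with a minimal NumPy sketch. Everything here (the linear encoder and predictor, the dimensions, the plain MSE loss) is a toy stand-in chosen for clarity, not Meta's actual architecture or training code:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(frames, W):
    # Map raw "frames" (flattened pixels) into an abstract embedding space.
    return frames @ W

def predictor(context_emb, P):
    # Predict the embedding of future frames from the context embedding.
    return context_emb @ P

# Toy dimensions: 64-dim "pixel" inputs, 16-dim embeddings.
D_PIX, D_EMB = 64, 16
W = rng.normal(scale=0.1, size=(D_PIX, D_EMB))  # shared encoder weights
P = rng.normal(scale=0.1, size=(D_EMB, D_EMB))  # predictor weights

context = rng.normal(size=(8, D_PIX))  # batch of observed context frames
future = rng.normal(size=(8, D_PIX))   # frames whose representation is predicted

# JEPA-style objective: predict future *embeddings*, not future pixels,
# so unpredictable pixel-level noise never enters the loss.
pred = predictor(encoder(context, W), P)
target = encoder(future, W)  # in V-JEPA the target encoder is a separate EMA copy
loss = np.mean((pred - target) ** 2)  # distance measured in embedding space
print(loss)
```

Training would then update `W` and `P` by gradient descent on this embedding-space loss; because the target lives in the learned representation, the model is free to ignore details of the scene that cannot be predicted.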

Meta's V-JEPA 2 represents a step toward achieving "advanced machine intelligence" by enabling robots to interact effectively in environments they have never encountered before. The model achieves state-of-the-art results on motion recognition and action prediction benchmarks and can control robots without additional training. By focusing on the essential and predictable aspects of a scene, V-JEPA 2 aims to provide AI agents with the intuitive physics needed for effective planning and reasoning in the real world, distinguishing itself from generative models that attempt to predict every detail.



References:
  • www.computerworld.com: Meta’s recent unveiling of V-JEPA 2 marks a quiet but significant shift in the evolution of AI vision systems, and one enterprise leaders can’t afford to overlook.
  • www.marktechpost.com: Meta AI Releases V-JEPA 2: Open-Source Self-Supervised World Models for Understanding, Prediction, and Planning
  • The Tech Portal: Social media company Meta has now introduced V-JEPA 2, a new open-source…
  • about.fb.com: Our New Model Helps AI Think Before it Acts
  • VentureBeat: Meta’s new world model lets robots manipulate objects in environments they’ve never encountered before
  • www.infoq.com: Meta Introduces V-JEPA 2, a Video-Based World Model for Physical Reasoning
  • eWEEK: Dubbed a “world model,” Meta’s new V-JEPA 2 AI model uses visual understanding and physical intuition to enhance reasoning in robotics and AI agents.