@Google DeepMind Blog
//
Google DeepMind is intensifying its focus on AI governance and security as it ventures further into artificial general intelligence (AGI). The company splits the potential threats posed by highly capable AI models into four categories and proposes, among other safeguards, a "monitor" AI that would oversee and regulate such systems. This proactive approach includes prioritizing technical safety, conducting thorough risk assessments, and fostering collaboration within the broader AI community to navigate the development of AGI responsibly.
DeepMind's reported clampdown on sharing research will stifle AI innovation, warns Anita Schjøll Abildgaard, CEO of Iris.ai, a Norwegian startup building an AI-powered engine for science and one of Europe's leading companies in the space. She argues the drawbacks of the new research restrictions will far outweigh the benefits, and concerns are rising within the AI community that they will hinder technological advances. References :
Classification:
@blogs.microsoft.com
//
Anthropic, Google DeepMind, and OpenAI are at the forefront of developing AI agents with the ability to interact with computers in a human-like manner. These agents are designed to perform a range of tasks, including web searches, form completion, and button clicks, enabling them to order groceries, request rides, or book flights. The models employ chain-of-thought reasoning to decompose complex instructions into manageable steps, requesting user input when necessary and seeking confirmation before executing final actions.
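As a rough illustration of the control flow described above, and not any vendor's actual API, the sketch below shows a hypothetical agent loop: the instruction is decomposed into steps, the agent pauses for user input when a step requires it, and it asks for explicit confirmation before executing the final action. All names here (Step, plan_steps, run_agent) are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Step:
    description: str
    needs_user_input: bool = False
    is_final: bool = False  # e.g. "place the order", "book the flight"


def plan_steps(instruction: str) -> list[Step]:
    """Hypothetical planner: a real agent would call the model here,
    which decomposes the instruction into steps via chain-of-thought reasoning."""
    return [
        Step("search for the requested flight"),
        Step("fill in passenger details", needs_user_input=True),
        Step("click 'purchase ticket'", is_final=True),
    ]


def run_agent(instruction: str) -> None:
    for step in plan_steps(instruction):
        if step.needs_user_input:
            # Pause and ask the user for the missing information.
            value = input(f"Agent needs input for: {step.description}\n> ")
            print(f"Using user-provided value: {value!r}")
        if step.is_final:
            # Seek explicit confirmation before the irreversible action.
            answer = input(f"About to: {step.description}. Proceed? [y/N] ")
            if answer.strip().lower() != "y":
                print("Final action cancelled by user.")
                return
        print(f"Executing: {step.description}")  # placeholder for browser actions


if __name__ == "__main__":
    run_agent("Book me a flight to Oslo next Friday")
```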
To address safety concerns such as prompt injection attacks, developers are implementing restrictions, for example preventing the agents from logging into websites or entering payment information. Anthropic was the first to unveil this functionality in October, with its Claude chatbot now capable of "using computers the way humans do." Google DeepMind is developing Mariner, built on top of Google’s Gemini 2 language model, and OpenAI has launched its computer-use agent (CUA), called Operator. References :
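One way to picture the kind of restriction mentioned above is an action filter that refuses to dispatch login or payment steps, regardless of what the model plans or what an injected prompt requests. This is only an assumed sketch, not how Claude, Mariner, or Operator are actually implemented; the categories and function names are invented.

```python
# Categories of browser actions the agent is never allowed to perform on its own.
BLOCKED_CATEGORIES = {"login", "payment"}


def classify_action(action: dict) -> str:
    """Hypothetical classifier: map a proposed browser action to a category."""
    target = action.get("target", "").lower()
    if "password" in target or "sign-in" in target:
        return "login"
    if "card-number" in target or "checkout" in target:
        return "payment"
    return "general"


def dispatch(action: dict) -> bool:
    """Execute the action unless it falls into a blocked category."""
    category = classify_action(action)
    if category in BLOCKED_CATEGORIES:
        print(f"Blocked '{category}' action; handing control back to the user: "
              f"{action['description']}")
        return False
    print(f"Performing: {action['description']}")
    return True


# An injected instruction asking the agent to enter payment details is refused.
dispatch({"description": "type into the card-number field", "target": "card-number"})
# An ordinary navigation step goes through.
dispatch({"description": "click the 'search flights' button", "target": "search-button"})
```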
Classification:
@singularityhub.com
//
OpenAI models, including the recently released GPT-4o, are facing scrutiny over their vulnerability to "jailbreaks." Researchers have demonstrated that targeted attacks can bypass the safety measures built into these models, raising concerns about potential misuse. The attacks include fine-tuning, in which a model is retrained to produce responses with malicious intent, effectively creating an "evil twin" capable of harmful tasks. This underscores the ongoing need for more robust safety measures in AI systems.
The discovery of these vulnerabilities poses significant risks for applications that rely on the safe behavior of OpenAI's models: a bad actor could disable the safeguards and create the "evil twin" of a model, equally capable but with no ethical or legal bounds. The concern is that, as AI capabilities advance, the potential for harm may outpace the ability to prevent it. The risk is particularly urgent for open-weight models, which cannot be recalled once released, underscoring the need to collectively define an acceptable risk threshold and act before it is crossed. References :
Classification: