Open Philanthropy Funds Technical AI Safety Research

jake_mendel@LessWrong //

Open Philanthropy Funds Technical AI Safety Research

Open Philanthropy is dedicating $40 million to fund technical AI safety research. The organization has launched a Request for Proposals (RFP) seeking projects across 21 research areas, aiming to develop robust safety techniques. This initiative focuses on mitigating potential risks from advanced AI systems before they are deployed in real-world scenarios.

The research areas are grouped into categories like adversarial machine learning, exploring sophisticated misbehavior of LLMs, and theoretical approaches to AI alignment. Specific areas of interest include jailbreaks, control evaluations, backdoor stress tests, robust unlearning, and alignment faking. Open Philanthropy is particularly interested in funding work related to jailbreaks and unintentional misalignment, control evaluations, and backdoor stress tests.

Open Philanthropy welcomes various types of grants, including research expenses, discrete research projects, academic start-up packages, support for existing nonprofits, and funding to start new organizations. The application process starts with a 300-word expression of interest, with applications open until April 15, 2025. The aim is to foster research that ensures AI systems adhere to safety specifications and reduce the probability of catastrophic failure.

Original img attribution: https://res.cloudinary.com/lesswrong-2-0/image/upload/v1654295382/new_mississippi_river_fjdmww.jpg

ImgSrc: res.cloudinary.

References :

AI Alignment Forum: Research directions Open Phil wants to fund in technical AI safety
LessWrong: Open Philanthropy Technical AI Safety RFP: $USD40M Available
LessWrong: We focus on threats from the misuse of models. A bad actor could disable safeguards and create the “evil twin” of a model.

Classification:

HashTags: #AISafety #OpenPhilanthropy #LLAignment
Company: Open Philanthropy
Target: AI Systems
Product: AI Alignment
Feature: AI Safety Research
Type: Research
Severity: Informative

News from the AI & ML world

DeeperML

Open Philanthropy Funds Technical AI Safety Research

Classification: