News from the AI & ML world
@www.marktechpost.com
OpenAI has announced the release of Reinforcement Fine-Tuning (RFT) for its o4-mini reasoning model, alongside supervised fine-tuning (SFT) for the GPT-4.1 nano model. RFT enables developers to customize a private version of the o4-mini model based on their enterprise's unique products, internal terminology, and goals. This allows for a more tailored AI experience, where the model can generate communications, answer questions about internal products and processes, and retrieve private, proprietary company knowledge with greater accuracy. RFT represents a move beyond traditional supervised fine-tuning, offering more flexible control for complex, domain-specific tasks.
The process applies a feedback loop during training: developers can initiate training sessions, upload datasets, and set up assessment logic through OpenAI's online developer platform. Instead of relying on fixed question-answer pairs, RFT uses a grader model to score multiple candidate responses per prompt, adjusting the model's weights to favor high-scoring outputs. This allows fine-tuning toward subtle requirements, such as a specific communication style, policy guidelines, or domain-specific expertise. Organizations with clearly defined problems and verifiable answers stand to benefit most from RFT, aligning models with nuanced objectives.
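As a rough illustration, launching such a job would go through OpenAI's fine-tuning API. The `fine_tuning.jobs.create` call below is the standard entry point in the official Python SDK, but the `"reinforcement"` method payload, grader fields, and file IDs are illustrative assumptions based on the description above, not a verified schema:

```python
from openai import OpenAI

client = OpenAI()

# Sketch of starting a Reinforcement Fine-Tuning (RFT) job on o4-mini.
# The grader scores each candidate response to a training prompt; the
# model's weights are then adjusted to favor high-scoring outputs.
# Field names inside "method" are assumptions for illustration.
job = client.fine_tuning.jobs.create(
    model="o4-mini",                  # reasoning model being customized
    training_file="file-XXXXXXXX",    # hypothetical ID of an uploaded prompt dataset
    method={
        "type": "reinforcement",
        "reinforcement": {
            # Assumed grader config: a separate model judges each candidate
            # answer against the organization's own assessment logic.
            "grader": {
                "type": "score_model",
                "model": "gpt-4o",    # hypothetical grader model choice
                "input": [
                    {
                        "role": "user",
                        "content": "Score the candidate response from 0 to 1 "
                                   "for compliance with our internal policy guide.",
                    }
                ],
            },
        },
    },
)
print(job.id, job.status)
```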
Several organizations have already leveraged RFT in closed previews, demonstrating its versatility across industries. Accordance AI improved the performance of a tax analysis model, while Ambience Healthcare increased the accuracy of medical coding. Other use cases include legal document analysis by Harvey, Stripe API code generation by Runloop, and content moderation by SafetyKit. OpenAI also announced that supervised fine-tuning is now supported for its GPT-4.1 nano model, the company's most affordable and fastest offering to date, opening customization to all paid API tiers. RFT is billed by active training time rather than per token processed, which OpenAI positions as a more transparent cost model.
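For comparison, supervised fine-tuning of GPT-4.1 nano follows the simpler, long-standing pattern of uploading a JSONL file of example conversations and referencing it when creating the job. A minimal sketch, where the dataset filename and model snapshot name are placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Minimal sketch of supervised fine-tuning (SFT) on GPT-4.1 nano:
# upload a JSONL file of {"messages": [...]} training examples,
# then create a fine-tuning job that references it.
training = client.files.create(
    file=open("company_examples.jsonl", "rb"),  # placeholder dataset path
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    model="gpt-4.1-nano",      # smallest, cheapest GPT-4.1 variant; exact snapshot name may differ
    training_file=training.id,
)
print(job.id, job.status)
```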
References:
- AI News | VentureBeat: You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini reasoning model with reinforcement learning
- Maginative: OpenAI Brings Reinforcement Fine-Tuning and GPT-4.1 Nano Fine-Tuning in the API
- www.marktechpost.com: OpenAI Releases Reinforcement Fine-Tuning (RFT) on o4-mini: A Step Forward in Custom Model Optimization
- Techzine Global: OpenAI opens the door to reinforcement fine-tuning for o4-mini
Classification:
- HashTags: #OpenAICustomization #FineTuning #ReinforcementFineTuning
- Company: OpenAI
- Target: Developers
- Product: OpenAI API (o4-mini, GPT-4.1 nano)
- Feature: Model Customization
- Type: ProductUpdate
- Severity: Informative