Emilia David@AI News | VentureBeat
//
References: Replicate's blog, AI News | VentureBeat
Black Forest Labs, founded by creators of the popular Stable Diffusion model, has launched FLUX.1 Kontext, a new image generation model that allows users to modify images with both text and reference images. Released on May 29, 2025, FLUX.1 Kontext enables in-context image generation for enterprise AI pipelines, letting users edit images multiple times without losing speed. The company also announced its new BFL Playground, where users can test use cases and play with the models before accessing the full BFL API.
FLUX.1 Kontext is available in two versions: FLUX.1 Kontext [pro] and FLUX.1 Kontext [max]. A third, open-weight version, FLUX.1 Kontext [dev], is coming soon and will be available in private beta. The Pro version is designed for fast, iterative editing, allowing users to input both text and reference images for local edits. The Max version offers maximum performance with improved prompt adherence, high-quality typography generation, and consistent edits without compromising speed. These models are currently live through the BFL API and partner services such as KreaAI, Freepik, Lightricks, Leonardo, Replicate, FAL, Runware, and Together. FLUX.1 Kontext distinguishes itself through character consistency, local editing, style reference, and minimal latency. Unlike traditional text-to-image models, Kontext understands both text and images as input, enabling true in-context generation and editing. Internal tests show the models achieve up to 8x lower latency than FLUX 1.1 Pro and top scores on the new KontextBench for text editing and character preservation. Builders using the Replicate endpoint report crisper edits at lower cost than OpenAI's gpt-image-1 and praise consistent subjects during multi-step workflows.
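For readers who want to try the Replicate endpoint mentioned above, the sketch below shows roughly what an in-context edit request could look like with Replicate's Python client. The model slug and the `input_image` field name are assumptions based on Replicate's usual conventions, not taken from Black Forest Labs' documentation, so treat them as placeholders to verify against the model page.

```python
# Hypothetical sketch of an in-context edit with FLUX.1 Kontext [pro] via
# Replicate. The model slug and input field names are assumptions.
import os


def build_kontext_input(prompt, reference_image_url):
    """Assemble the input payload: a text instruction plus a reference image,
    matching Kontext's text-and-image in-context editing interface."""
    return {
        "prompt": prompt,                    # text instruction for the edit
        "input_image": reference_image_url,  # reference image to edit in context
    }


if os.environ.get("REPLICATE_API_TOKEN"):
    import replicate  # requires: pip install replicate

    output = replicate.run(
        "black-forest-labs/flux-kontext-pro",  # assumed model slug
        input=build_kontext_input(
            "Change the jacket to red, keep the character unchanged",
            "https://example.com/photo.png",
        ),
    )
    print(output)
```

Because Kontext keeps the subject consistent across edits, the same payload shape can be reused for each step of a multi-step workflow, feeding the previous output back in as the new reference image.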
Matthias Bastian@THE DECODER
//
OpenAI has expanded access to its multimodal image generation model, GPT-Image-1, by making it available to developers through the API. This allows for the integration of high-quality image generation capabilities into various applications and platforms. Previously, GPT-Image-1 was primarily used within ChatGPT, where it gained popularity and generated over 700 million images for more than 130 million users within its first week. The move to offer it via API will likely increase these numbers as developers incorporate the technology into their projects. Leading platforms like Adobe and Figma are already integrating the model, showcasing its appeal and potential impact across different industries.
The GPT-Image-1 model is known for its faithful prompt following and versatility in creating images across diverse styles, including the rendering of text within images. The API provides developers with granular control over image creation, offering options to adjust quality settings, the number of images produced, background transparency, and output format. Notably, developers can also adjust moderation sensitivity, balancing flexibility with OpenAI's safety guidelines. This includes the implementation of C2PA metadata watermarking, which identifies images as AI-generated. The pricing model for the GPT-Image-1 API is based on tokens, with separate rates for text input tokens, image input tokens, and image output tokens. Text input tokens are priced at $5 per million, image input tokens at $10 per million, and image output tokens at $40 per million. In practical terms, the cost per generated image ranges from approximately $0.02 for a low-quality square image to $0.19 for a high-quality square image. The API accepts various image formats, including PNG, JPEG, WEBP, and non-animated GIF, with the model capable of interpreting visual content like objects, colors, shapes, and embedded text.
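The per-million-token rates above translate directly into a per-request cost estimate. The helper below is a back-of-the-envelope sketch using only the rates quoted in this article; the token counts in the example are hypothetical, since actual counts depend on prompt length, any input images, and the chosen quality and size.

```python
# Cost estimator for the GPT-Image-1 API, using the per-million-token rates
# quoted above: $5 text input, $10 image input, $40 image output.
TEXT_IN_PER_M = 5.00
IMAGE_IN_PER_M = 10.00
IMAGE_OUT_PER_M = 40.00


def estimate_cost(text_in_tokens, image_in_tokens, image_out_tokens):
    """Return the estimated USD cost of one generation request."""
    return (
        text_in_tokens / 1e6 * TEXT_IN_PER_M
        + image_in_tokens / 1e6 * IMAGE_IN_PER_M
        + image_out_tokens / 1e6 * IMAGE_OUT_PER_M
    )


# Hypothetical request: a short prompt, no input image, ~4,000 output tokens.
print(f"${estimate_cost(100, 0, 4_000):.4f}")  # → $0.1605
```

Since image output tokens dominate the bill at $40 per million, the output token count (which grows with quality and resolution) is what moves a generation from the ~$0.02 low-quality end toward the ~$0.19 high-quality end.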
@zdnet.com
//
Adobe Firefly has received a significant upgrade, integrating new AI-powered tools and third-party models and enhancing its capabilities for image, video, and vector generation. Announced at the MAX London event, the update introduces Firefly Image Model 4, aimed at generating high-definition and realistic images, with specialized options for quick idea generation and detailed projects. The update also brings the official release of the Firefly Video Model, previously in beta, which enables users to create short video clips from text or image prompts and supports resolutions up to 1080p. The integration of a Text to Vector module allows users to generate editable vector graphics, broadening the scope of creative possibilities within the platform.
Adobe has also expanded access to Firefly through a redesigned web platform and an upcoming mobile app for both iOS and Android devices. The mobile app will allow users to generate images and videos directly from their phones or tablets, with content designed for commercial safety and projects transferable to desktop via Creative Cloud integration. Furthermore, the Firefly web app has been overhauled to serve as a centralized platform for all of Adobe's AI models, including select third-party models, starting with OpenAI's GPT image generation capabilities. Since its launch less than two years ago, Firefly has been used to generate over 22 billion assets, reflecting its growing influence in the creative industry. The update also brings OpenAI's image generator, gpt-image-1, into the Firefly and Express apps, allowing designers to rapidly explore ideas and iterate visually: the model can create images across diverse styles, faithfully follow custom guidelines, leverage world knowledge, and accurately render text. Alongside the launch of Firefly Image Model 4 and the official Firefly Video Model, Adobe also announced a new project called Firefly Boards, a limitless digital canvas workspace that lets artists create mood boards, storyboards, or any form of creative planning with features such as Remix.
OODA OG@OODAloop
//
OpenAI's ChatGPT image generation feature has experienced explosive growth, with users generating over 700 million images since its launch. This surge in popularity underscores the increasing demand for AI-driven content creation and highlights the challenges of scaling AI infrastructure to meet user needs. The image generation feature, which includes capabilities like creating realistic Ghibli-style photos, has led to millions of new sign-ups for ChatGPT.
This rapid adoption has strained OpenAI's resources, resulting in product delays and temporary service degradation as the company works to expand its infrastructure. According to Brad Lightcap, who oversees day-to-day operations and global deployment at OpenAI, over 130 million users have generated more than 700 million images since the upgraded image generator launched in ChatGPT. India is now the fastest-growing ChatGPT market.
Ryan Daws@AI News
//
OpenAI is set to release its first open-weight language model since 2019, marking a strategic shift for the company. This move comes amidst growing competition in the AI landscape, with rivals like DeepSeek and Meta already offering open-source alternatives. Sam Altman, OpenAI's CEO, announced the upcoming model will feature reasoning capabilities and allow developers to run it on their own hardware, departing from OpenAI's traditional cloud-based approach.
This decision follows OpenAI securing a $40 billion funding round, reportedly comprising $30 billion from SoftBank and $10 billion from Microsoft and venture capital funds. Despite the fresh funding, OpenAI also faces scrutiny over its training data. A recent study by the AI Disclosures Project suggests that OpenAI's GPT-4o model demonstrates "strong recognition" of copyrighted data, potentially accessed without consent. This raises ethical questions about the sources used to train OpenAI's large language models.
Alexey Shabanov@TestingCatalog
//
References: Data Phoenix
OpenAI is integrating Google Drive into ChatGPT Team, allowing users to access real-time context from workplace tools, including Docs, Sheets, and Slides, directly within conversations. This new feature aims to provide more relevant and personalized responses by automatically incorporating context from these tools, respecting existing user permissions. This integration seeks to minimize workflow disruptions and enhance decision-making by enabling teams to access context-rich answers without manual searching.
Key features include semantic search across Google Docs and Slides, responses with relevant context and source links, and intelligent determination of when to use internal sources. Security is prioritized: only ChatGPT Team admins can connect Google Drive, and user permissions are fully respected. Separately, the rollout of the GPT-4o image generation feature for free users, initially planned for immediate access, has been delayed due to high demand.
@techxplore.com
//
ChatGPT's new image generation capabilities, powered by the GPT-4o model, have sparked a viral trend of transforming images into the distinct style of Studio Ghibli, the famed Japanese animation studio led by Hayao Miyazaki. Users have been uploading personal photos and popular memes, prompting the AI to render them in the style reminiscent of classics like "Spirited Away" and "My Neighbor Totoro." This has led to an influx of Ghibli-style images across social media platforms, particularly X, with users sharing their AI-generated creations.
The trend has ignited a debate surrounding the ethical implications of AI tools trained on copyrighted creative works. Miyazaki himself has voiced strong skepticism about AI's role in animation, and the widespread use of his studio's style raises questions about the future livelihoods of human artists. OpenAI, while acknowledging the potential for misuse, has implemented some restrictions, but users have found ways to circumvent these limitations. Demand has become so intense that some users on the free tier are experiencing delays due to the large influx of requests.
Maria Deutscher@SiliconANGLE
//
OpenAI has officially rolled out native image generation capabilities within ChatGPT, powered by its GPT-4o model. This significant upgrade replaces the previous DALL-E integration, aiming for more consistent results, fewer content restrictions and improved accuracy in interpreting user prompts. The new feature is available to all ChatGPT users, including those on the free tier, with API access for developers planned in the near future.
The integration of image generation into GPT-4o allows users to create detailed and lifelike visuals through natural conversation, making it easier to communicate effectively through visuals. GPT-4o can accurately render text within images, supports complex prompts with up to 20 different objects, and can generate images based on uploaded references. Users can refine their results conversationally, with the AI maintaining context across multiple exchanges, making it easier to iteratively perfect an image through dialogue. Early testing shows the system produces more consistent images than DALL-E 3.