ChatGPT Images 2.0: OpenAI Challenges Google with Advanced Image Model

Key Takeaways

1. OpenAI unveils ChatGPT Images 2.0, a model focused on instruction adherence and accurate detail.

2. Thinking Mode can query the web for up-to-date information and generate up to 8 distinct images that stay consistent with one another.

3. Free users have access to Instant Mode, while Thinking Mode is reserved for paying subscribers.

💡 Why it matters: OpenAI strengthens its position against Google in the image generation race by targeting professionals with advanced features.

ChatGPT Images 2.0: A Revolutionary Image Model by OpenAI

OpenAI has launched ChatGPT Images 2.0, an image generation model that stands out for following prompts accurately and handling complex elements. Designed primarily for professional use, the model focuses on two main areas: greater fidelity to detail and a new reasoning mode.

Significant Improvements for Professionals

ChatGPT Images 2.0 goes beyond traditional image generators by handling dense compositions, small text, icons, and interface elements more reliably. It also supports flexible aspect ratios, from 3:1 to 1:3, to suit banners, slides, posters, and mobile visuals. The maximum resolution available through the API is 2K.
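To make the ratio range concrete, here is a minimal sketch that checks whether a requested aspect ratio falls between 1:3 and 3:1 and derives pixel dimensions from it. It rests on assumptions: the article does not define "2K", so the code reads it as 2048 pixels on the long edge, and the function name `dimensions_for_ratio` is hypothetical. It is not part of any OpenAI SDK and says nothing about which sizes the API actually accepts.

```python
from fractions import Fraction

# Assumption: "2K" is read here as 2048 px on the long edge.
# The article does not give exact pixel dimensions.
MAX_LONG_EDGE = 2048

# Ratio range stated in the article (width:height).
MIN_RATIO = Fraction(1, 3)
MAX_RATIO = Fraction(3, 1)


def dimensions_for_ratio(width_part, height_part, long_edge=MAX_LONG_EDGE):
    """Return (width, height) in pixels for a width:height ratio.

    Hypothetical helper for illustration only.
    """
    if width_part <= 0 or height_part <= 0:
        raise ValueError("ratio parts must be positive")

    ratio = Fraction(width_part, height_part)
    if not (MIN_RATIO <= ratio <= MAX_RATIO):
        raise ValueError(f"ratio {width_part}:{height_part} is outside 1:3 to 3:1")

    if ratio >= 1:
        # Landscape or square: the width is the long edge.
        width = long_edge
        height = round(long_edge / ratio)
    else:
        # Portrait: the height is the long edge.
        height = long_edge
        width = round(long_edge * ratio)
    return width, height
```

Under these assumptions, a 16:9 slide would come out at 2048 by 1152 pixels, and a request such as 4:1 would be rejected before any call is made.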

On the style side, OpenAI reports notable progress in photography, cinematic rendering, pixel art, and manga. The model can also render text in non-Latin scripts, such as Japanese, Korean, and Chinese.

The Thinking Mode: A Major Innovation

One of the most striking new features of ChatGPT Images 2.0 is its Thinking Mode. In this mode, the model can reason, query the web for up-to-date information, plan the structure of the image before generating it, and check its own output. It can also generate up to 8 distinct images from a single prompt while keeping characters and objects consistent across them. This makes it possible to create manga sequences, poster series, or sets of social media visuals in a single request.

Access and Availability

The rollout follows a familiar pattern: the standard version is available to everyone, while reasoning capabilities are reserved for subscribers. Access to ChatGPT Images 2.0 breaks down as follows:

Instant Mode: available to all ChatGPT and Codex users, including those on the free plan.
Thinking Mode: reserved for subscribers of the ChatGPT Plus, Pro, and Business plans.
API: the gpt-image-2 model is priced per token, depending on quality and resolution.
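The access rules above can be sketched as a small lookup. This is an illustration only: the plan names come from the list above, while the `Mode` enum and the `modes_for_plan` function are hypothetical and do not correspond to any OpenAI API. Plans the article does not mention are treated as Instant-only, which is an assumption.

```python
from enum import Enum


class Mode(Enum):
    INSTANT = "instant"
    THINKING = "thinking"


# Plans the article lists as having Thinking Mode.
THINKING_PLANS = {"plus", "pro", "business"}


def modes_for_plan(plan):
    """Return the set of modes available to a ChatGPT plan.

    Hypothetical helper. Instant Mode is available to every user,
    including the free plan. Plans not named in the article fall
    back to Instant only (an assumption).
    """
    modes = {Mode.INSTANT}
    if plan.strip().lower() in THINKING_PLANS:
        modes.add(Mode.THINKING)
    return modes
```

For example, `modes_for_plan("free")` yields only Instant Mode, while `modes_for_plan("Plus")` yields both modes.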

A Direct Response to Google

In February, Google made waves with its Nano Banana 2 model, capable of producing highly realistic images. With ChatGPT Images 2.0, OpenAI answers with a model that not only competes on quality but goes further, thanks to its Thinking Mode and the ability to generate multiple images in a single request. OpenAI and Google are thus positioning themselves as the leaders in image generation, pulling ahead of their competitors.