A livestream claim that the new model's leap matches GPT‑3 → GPT‑5 was put to a hands‑on comparison against gpt-image-1, Gemini's Nano Banana, and Claude.
AI Quick Take
- Sam Altman framed gpt-image-2 as a large generational step on the launch livestream.
- Simon Willison tested the model with a Where’s‑Waldo style prompt and compared outputs from gpt-image-1, Gemini and Claude; results varied.
OpenAI released ChatGPT Images 2.0 and promoted it on a livestream where Sam Altman described the jump from gpt-image-1 as comparable to a GPT‑3 → GPT‑5 generational shift. Simon Willison ran a practical comparison using a Where's‑Waldo style prompt ("where is the raccoon holding a ham radio") and shared results from gpt-image-1, Gemini's Nano Banana variants, and Anthropic's Claude Opus 4.7.
Willison noted the OpenAI Python client hadn’t been updated to include gpt-image-2, but the client doesn’t validate model IDs, so he invoked the new model by passing its identifier directly. In his informal trial, Nano Banana 2 produced a clearly findable raccoon, Nano Banana Pro produced the weakest result, Claude acknowledged a raccoon was present but had trouble locating it, and the gpt-image-1 baseline left the creature hard to spot. Willison published the returned images and his observations rather than a formal benchmark.
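The workaround works because the model identifier is just a string forwarded in the request body; nothing in the client enumerates valid IDs. A minimal sketch of that request shape, assuming the standard `POST /v1/images/generations` body with `model` and `prompt` fields (the helper name here is hypothetical, not part of any SDK):

```python
import json

# Hedged sketch of why a lagging SDK isn't a blocker: the images endpoint
# takes the model as a free-form string, so any identifier can be passed
# through. build_image_request is an illustrative helper, not OpenAI code.
def build_image_request(model_id: str, prompt: str) -> dict:
    # Minimal body for POST /v1/images/generations. Nothing here validates
    # the model ID, which is what let Willison target gpt-image-2 before
    # the Python client listed it.
    return {"model": model_id, "prompt": prompt}

payload = build_image_request(
    "gpt-image-2", "where is the raccoon holding a ham radio"
)
print(json.dumps(payload, sort_keys=True))
```

The same pattern applies to the official client: passing the new identifier to its image-generation call is enough, since the string reaches the API unchanged.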
The episode matters operationally on two counts: it shows practitioners can probe new image models quickly even when SDKs lag, and it underlines that a single‑prompt demonstration doesn't establish broad superiority. Teams evaluating image generation should expect follow‑up: OpenAI documentation and client updates, independent benchmarks, and more diverse prompt testing to validate the livestream's generational claim and to pin down where gpt-image-2 actually changes production behavior.