OpenAI has rolled out its new GPT-4.1 and GPT-4.1 mini models in ChatGPT, expanding access beyond its API. The update, announced Wednesday, lets users access these models directly through ChatGPT’s “more models” dropdown.
OpenAI says the move comes in response to user requests.
GPT-4.1 is available to ChatGPT Plus, Pro, and Team subscribers, while Enterprise and Education users are expected to gain access “in the coming weeks.” Meanwhile, GPT-4.1 mini is replacing GPT-4o mini as the default model for all ChatGPT users, including those on the free plan.
A model built for developers
GPT-4.1 is designed to excel at coding and instruction-following tasks.
“We built it for developers, so it’s very good at coding and instruction following—give it a try!” said Kevin Weil, OpenAI’s chief product officer, in an X post.
The new model offers improved performance on software engineering benchmarks and shows stronger results when following detailed instructions. In internal testing, GPT-4.1 delivered a 21.4-point improvement over GPT-4o on the SWE-bench Verified benchmark for software tasks.
For developers who rely on ChatGPT to write, debug, or review code, this jump in performance could mean faster results, fewer errors, and less time spent reworking AI-generated solutions, potentially reducing bugs in production code.
Safety and transparency
OpenAI’s earlier launch of GPT-4.1 drew criticism from parts of the AI community for not publishing safety details immediately. With this new release, OpenAI is stepping up its transparency efforts.
The company launched a new Safety Evaluations Hub, where users can see performance metrics for its models. GPT-4.1 scored:
- 0.40 on SimpleQA, indicating moderate success in answering simple factual questions correctly.
- 0.63 on PersonQA. This test measures how well the model answers questions about people, such as public or historical figures.
- 0.99 on “not unsafe” prompts in refusal tests, showing the model is highly reliable at rejecting potentially harmful or unsafe requests.
- 0.96 in human-sourced jailbreak tests. It performed well when tested by real people trying to get around safety rules.
- 0.23 in the StrongReject academic jailbreak test. This lower score shows it was less effective at resisting advanced, research-level attempts to bypass safety systems compared to models like o3 and GPT-4o mini.
GPT-4.1 in ChatGPT currently supports context windows of up to 128,000 tokens for Pro users, 32,000 for Plus users, and 8,000 for free users. OpenAI has hinted that larger context support could come to ChatGPT in the future.
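As a rough illustration of what these tiers mean in practice, the sketch below checks whether a conversation of a given size would fit within each plan's reported window. The plan keys, the function names, and the idea of reserving tokens for the reply are assumptions made for illustration, not part of any OpenAI API. The token count is taken as an input and would need to come from a tokenizer.

```python
# Context-window limits for GPT-4.1 in ChatGPT, in tokens, as reported
# in this article. The plan keys are hypothetical labels for illustration.
CONTEXT_LIMITS = {"free": 8_000, "plus": 32_000, "pro": 128_000}


def remaining_tokens(plan: str, prompt_tokens: int) -> int:
    """Return how many tokens are left in the window (negative if over)."""
    if plan not in CONTEXT_LIMITS:
        raise ValueError(f"unknown plan: {plan!r}")
    return CONTEXT_LIMITS[plan] - prompt_tokens


def fits_in_context(plan: str, prompt_tokens: int, reply_budget: int = 0) -> bool:
    """Check that the prompt plus a reserved reply budget fits the window.

    reply_budget is an assumed allowance for the model's answer.
    """
    return remaining_tokens(plan, prompt_tokens) >= reply_budget


if __name__ == "__main__":
    # A 30,000-token conversation against each reported tier.
    for plan in CONTEXT_LIMITS:
        print(plan, fits_in_context(plan, 30_000, reply_budget=1_000))
```

Under these assumptions, a 30,000-token conversation with a 1,000-token reply allowance would fit the Plus and Pro windows but not the free one.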
This launch also comes amid reports that OpenAI is backtracking on its bid to become a for-profit entity. This follows pushback from co-founder Elon Musk and a coalition of former OpenAI employees, top academics, and Nobel laureates.