OpenAI's GPT-5.4 matches or beats professionals 83% of the time
OpenAI has released GPT-5.4, billed as “the most capable and efficient frontier model for complex professional work.” Within ChatGPT it appears as GPT-5.4 Thinking, and the model is rolling out across ChatGPT paid tiers, the API and Codex. OpenAI says GPT-5.4's responses are 18% less likely to contain errors, and its individual claims 33% less likely to be false, than GPT-5.2's on prompts where users had previously flagged mistakes.
The company also highlights improvements in coding, tool use and computer control. Performance was measured with GDPval, a new evaluation that tests economically valuable, real-world tasks across nine industries and 44 occupations. Tasks were developed with experienced professionals and graded blindly by human experts, with an automated grader built on those human judgments.
Gains across model generations have been rapid: GPT-5.1 scored 38.8% on GDPval, GPT-5.2 reached 70.9%, and GPT-5.4 now matches or exceeds human professionals 83% of the time.
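To make the headline figure concrete, here is a minimal sketch of how a "matches or exceeds" rate could be tallied from blind, per-task verdicts. It is an illustration only, not OpenAI's grading code: the function name `win_or_tie_rate`, the verdict labels `"win"`, `"tie"` and `"loss"`, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def win_or_tie_rate(verdicts):
    """Return the share of comparisons the model won or tied.

    `verdicts` is a list of strings, one per graded task, using the
    hypothetical labels "win", "tie" and "loss" (model vs. professional).
    """
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to score")
    # Ties count toward the score: "matches or exceeds".
    return (counts["win"] + counts["tie"]) / total


# Made-up verdicts for illustration; not real GDPval data.
sample = ["win", "tie", "loss", "win", "win", "loss"]
rate = win_or_tie_rate(sample)
print(f"{rate:.1%}")
```

Under this reading, a score of 83% would mean that in roughly five of every six graded comparisons, the expert grader judged the model's work to be at least as good as the professional's.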