
OpenAI Used Internal Model to Disprove Major Mathematics Conjecture at Low Cost

No Priors Podcast · Why Traditional Benchmarks Fail Modern AI Models with OpenAI Research Scientist Noam Brown · June 26, 2026
"We used an internal model at OpenAI a few weeks ago to disprove the Erdős unit distance conjecture. Now, I'm not a mathematician, but this seems like it was a pretty big deal in the math community. It was like the first problem that a lot of mathematicians had really spent a lot of time on, and the model was able to do something that they weren't able to do and do it in a way that was actually interesting and useful for mathematicians. Honestly, it did it at a budget that was dirt cheap."
Brown revealed OpenAI used an unreleased internal model to disprove the Erdős unit distance conjecture, a longstanding mathematical problem, at a minimal inference budget. He later noted that GPT-4.5 could likely achieve the same result with proper scaffolding at a cost of $1,000 to $100,000, suggesting significant latent capability in publicly available models that hasn't been explored.

About this episode

On this episode of No Priors, host Sarah Guo interviews Noam Brown, an AI researcher at OpenAI who pioneered inference-time scaling and reasoning approaches. Brown argues forcefully that the AI industry has a broken model evaluation system that fails to account for test-time compute, making benchmark comparisons misleading and safety evaluations inadequate. He reveals that modern models like GPT-4.5 can think productively for weeks before performance plateaus on complex tasks, unlike earlier models that peaked quickly, creating a fundamental problem: the capability of a model is now a function of how much money you spend on inference, from $10 to $10 million.

Brown discloses that OpenAI used an unreleased internal model to disprove the Erdős unit distance conjecture, a longstanding mathematics problem, at minimal cost, and that GPT-4.5 could likely do the same with proper scaffolding for $1,000 to $100,000. He warns that current AI safety frameworks from all major labs don't account for this scaling dynamic, meaning models could have dangerous latent capabilities that aren't being tested at sufficient compute budgets. Brown also reveals OpenAI deliberately discourages internal researchers from using advanced models to solve open scientific problems, preferring to focus on developing more capable models for public release.

He remains skeptical of overnight intelligence explosion scenarios, arguing time itself is the fundamental bottleneck because models require extended compute periods to reach peak capability. The conversation covers Brown's personal use of models for tasks from poker solver development to tax advice, his belief that models still lack research taste, and his call for the industry to abandon single-number benchmark grids in favor of performance plotted against inference budget.
