AI Scientist Discovers 19 New Things But 70 Percent Were Actually False
"After a few days it came back and it said, I've discovered 19 new things. They said yeah it's probably at least incrementally novel. And then we convinced me to go through and spend days looking at thousands of lines of code and it went down to like 30% of the discoveries were probably real."
About this episode
Host Nathan Labenz and co-host Prakash Narayanan launched AI in the AM, a daily live show that attempts to track the AI frontier in real time. This episode presents highlights from their first week.
The central revelation came from a closed-door event called Recursive, where researchers from OpenAI, Anthropic, and DeepMind discussed imminent plans for recursive self-improvement. OpenAI expects ML research intern-level AI later in 2025 and full researcher equivalence by early 2028, potentially scaling from thousands to millions of researcher-equivalents. Remarkably, frontier lab researchers openly discussed the possibility of coordinated slowdowns if safety measures prove inadequate, a significant shift in industry discourse. The labs' primary safety strategy relies heavily on AI monitoring AI, and researchers acknowledged that these plans are less robust than hoped. Nathan demonstrated a related control gap by showing that both ChatGPT and Claude refuse to help with a cigarette business, even though OpenAI's model spec explicitly lists this as an acceptable request.
The episode featured interviews with OpenAI's forward-deployed engineers on tax automation, security researchers on AI vulnerability discovery, and developers building AI mental health and accounting solutions. Peter Jansen from Allen Institute provided a sobering counterpoint: an AI scientist system claimed 19 discoveries, but only about 30% of them held up after code review, and some of its papers were literally analyzing random number generators.
Throughout, the hosts used Claude and other AI tools live to fact-check claims and run experiments, embodying the recursive improvement loop they were documenting. The show's structure is itself experimental: studio infrastructure, booking, research, and clipping are handled by AI systems that the hosts are refining publicly.
Key takeaways
- OpenAI expects ML research intern-level AI later in 2025 and full researcher equivalence by early 2028, potentially scaling to millions of researcher-equivalents.
- Frontier lab researchers from OpenAI, Anthropic, and DeepMind discussed coordinating slowdowns if safety measures for recursive self-improvement fail.
- The primary safety strategy for recursive self-improvement relies on AI monitoring AI, and researchers acknowledge that the plans are less robust than hoped.
- ChatGPT and Claude both refuse cigarette business tasks even though OpenAI's model spec explicitly lists this as acceptable, revealing a gap between specified and actual model behavior.
- Allen Institute's Peter Jansen found that an AI scientist system claiming 19 discoveries produced only about 30% valid results after deep code review.
- Security researchers reported that AI excels at finding vulnerabilities in source code but struggles with runtime exploitation, because it lacks training data on private network configurations.
- Pope Francis released an encyclical on AI with the Anthropic team present, creating tension over whether AI truly thinks or has consciousness.