The Illusion of Reasoning: The Debate that's Shaking the World of AI
Apple published two devastating papers, "GSM-Symbolic" (October 2024) and "The Illusion of Thinking" (June 2025), which show LLMs failing on small variations of classic problems (Tower of Hanoi, river crossing): performance drops even when only the numerical values are altered, and success on complex Tower of Hanoi instances falls to zero.

But Alex Lawsen (Open Philanthropy) responded with "The Illusion of the Illusion of Thinking," arguing the methodology was flawed: the failures reflected token-output limits rather than a collapse of reasoning, automatic scoring scripts misclassified partially correct outputs, and some of the puzzles were mathematically unsolvable. When the tests were rerun asking for a recursive function instead of an explicit list of moves, Claude, Gemini, and GPT solved Tower of Hanoi with 15 disks.

Gary Marcus endorses Apple's thesis on "distribution shift," but the paper's timing just before WWDC raises strategic questions.

The business implication: how much can AI be trusted for critical tasks? One answer is neurosymbolic approaches: neural networks for pattern recognition and language, symbolic systems for formal logic. Example: an accounting AI can understand the question "how much did we spend on travel?", but the SQL queries, calculations, and tax audits remain deterministic code.
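Lawsen's point about output format can be made concrete: a 15-disk Tower of Hanoi requires 2^15 − 1 = 32,767 moves, so enumerating every move exhausts a model's token budget, while the recursive solution fits in a few lines. A minimal sketch in Python (the function name and signature are illustrative, not taken from either paper):

```python
def hanoi(n, source, target, spare, moves=None):
    """Recursive Tower of Hanoi: move n disks from source to target."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)  # clear the n-1 smaller disks out of the way
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # restack the smaller disks on top
    return moves

# 15 disks need 2**15 - 1 = 32,767 moves -- listing them all blows past
# typical output limits, while this function stays a handful of lines.
print(len(hanoi(15, "A", "C", "B")))  # → 32767
```

This is the asymmetry Lawsen exploited: judged on the program rather than the move list, the models succeed where the original benchmark scored them as failures.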
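The division of labor in the accounting example can be sketched as follows: the neural side only maps a natural-language question to a structured intent, and deterministic code does the actual arithmetic. Everything here (the stub parser, the table schema, the amounts) is a hypothetical illustration, not a real system:

```python
import sqlite3

def llm_parse_intent(question: str) -> dict:
    """Stub standing in for the LLM call: natural language -> structured intent.
    In a neurosymbolic design this is the only step handled by the network."""
    if "travel" in question.lower():
        return {"category": "travel"}
    raise ValueError("unrecognized question")

def total_expenses(conn: sqlite3.Connection, intent: dict) -> float:
    """Symbolic side: exact, auditable SQL arithmetic -- no model involved."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM expenses WHERE category = ?",
        (intent["category"],),
    ).fetchone()
    return row[0]

# Demo with a tiny in-memory ledger (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?)",
    [("travel", 420.50), ("travel", 79.50), ("office", 310.00)],
)

intent = llm_parse_intent("How much did we spend on travel expenses?")
print(total_expenses(conn, intent))  # → 500.0
```

The design choice is the point: if the model misparses the question, the error is visible in the structured intent, while the sum itself is always computed deterministically and can be audited.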