
Dwarkesh Podcast

Terence Tao – Kepler, Newton, and the true nature of mathematical discovery

Mar 20, 2026 · 21 min read
Official episode page

Mathematician Terence Tao discusses the history of scientific breakthroughs and the growing role of artificial intelligence in discovery.

He explains why correct theories often face long delays in verification and how new tools are helping humans solve the most difficult problems in the universe.

Key takeaways

  • New, correct theories often appear less accurate than established, incorrect ones because older models have benefited from centuries of ad hoc adjustments to fit observations.
  • AI has reduced the cost of scientific idea generation to near zero, shifting the bottleneck of progress from creating theories to verifying and validating them.
  • Efficiency can hinder discovery because it removes the accidental findings that occur during slower, more manual processes like browsing a library.
  • We are currently experiencing a cognitive Copernican revolution that challenges the assumption that human intelligence is the unique center of the universe.
  • Any formal system for mathematical strategy must be robust because reinforcement learning is remarkably good at finding backdoors to achieve goals without performing the actual work.
  • Researchers can track typos in citations to determine if scientists are actually reading the papers they cite or just copying references.
  • AI excels at breadth while humans excel at depth, which allows AI to map out broad scientific fields and identify specific islands of difficulty for human experts to tackle.
  • Primes are pseudorandom, meaning they follow fixed rules but behave like random numbers. This statistical model allows mathematicians to predict properties they cannot yet prove.
  • The security of modern cryptography relies on the absence of hidden patterns in prime numbers. A failure of the Riemann Hypothesis would suggest a secret pattern exists, making current encryption vulnerable.
  • An incomprehensible AI-generated proof is not a dead end because other AI tools can refactor and simplify the logic until humans can understand it.
  • While AI success stories on social media look impressive, systematic studies show these tools currently have a success rate of only one to two percent on difficult problems.
  • Scientific progress has shifted from the classic method of testing a hypothesis to a data-driven approach where patterns are extracted from massive datasets.
  • AI may transform mathematics into an experimental science by enabling researchers to test problem-solving strategies across thousands of problems simultaneously.
  • Astronomy's limited data forced the field to become a leader in squeezing every possible insight from small pieces of information.
  • A certain level of distraction and randomness is essential for long-term inspiration. Without it, even the most focused research environments can become boring.
  • The barrier to entry for high-level research is lowering, potentially allowing high school students to contribute to the frontier of math using AI and formal verification tools.
  • True intelligence involves an adaptive and cumulative process where progress is built interactively through trial, failure, and modification.
  • Formalizing proofs in languages like Lean allows mathematicians to isolate and study specific steps, distinguishing novel insights from routine logic.
  • Darwin succeeded where earlier thinkers failed because he was a master communicator who used plain language to synthesize disparate facts into a persuasive vision.
  • AI excels at applying existing mathematical techniques to problems that have not received much attention, often finding solutions by combining obscure methods.


The evolution of scientific discovery from Kepler to big data

00:00 - 10:18

Johannes Kepler's discovery of planetary motion provides a powerful lesson in how science progresses through both imagination and rigorous data. Kepler began with a beautiful theory that the orbits of the six known planets were determined by the five Platonic solids. He believed this geometric perfection reflected a divine design. However, when he tested his theory against the precise observations of the Danish astronomer Tycho Brahe, it failed by about 10 percent. Instead of forcing the data to fit his theory, Kepler spent years performing painstaking, genius-level data analysis. He eventually realized that orbits were ellipses rather than perfect circles.

We celebrate Kepler, but we should also celebrate Brahe for his assiduous data collection, which was 10 times more precise than any previous observation. That extra decimal point of accuracy was actually essential for Kepler to get his results.

Kepler functioned much like a high-temperature AI model. He explored many random relationships, including ideas about musical harmonies and astrology. While most of these ideas were incorrect, he could verify them against Brahe's high-quality data set. This allowed him to eventually find the few empirical regularities that were actually true. In the past, science usually started with a hypothesis that was then tested against data. Today, the process is often reversed. We collect massive amounts of data first and then use tools like machine learning to find the patterns within them.
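The "high-temperature" analogy comes from how language models sample: a temperature parameter controls how exploratory the output distribution is. A minimal sketch of that knob (the logit values below are arbitrary, chosen only for illustration):

```python
import math

def softmax(logits, temperature):
    """Convert raw scores to probabilities; higher temperature flattens them."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Arbitrary scores for three candidate "ideas".
logits = [2.0, 1.0, 0.1]

cold = softmax(logits, temperature=0.5)  # sharply favors the top-scoring idea
hot = softmax(logits, temperature=2.0)   # spreads probability: more exploration
print(cold)
print(hot)
```

At low temperature the sampler almost always repeats its best guess; at high temperature it wanders across many mostly-wrong candidates, which is productive only if, like Kepler, you have good data to verify against.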

The ones we celebrate are these eureka genius moments of idea generation. But now, it is almost reversed. You collect big data first and then you try to get hypotheses from it.

This shift suggests that idea generation may no longer be the primary bottleneck in science. Historically, we have prioritized the prestige of the eureka moment. But the modern era is defined by the collection and analysis of massive datasets to deduce new laws. Progress now depends on the ability to cycle through many ideas and verify them against an objective reality.

The shift from idea generation to verification in science

10:19 - 15:43

Johannes Kepler was fortunate when he derived his third law of planetary motion because he only had six data points to work with. While his conclusion was correct, it was technically not enough data to be statistically reliable. Later, Johann Bode attempted a similar fit for a shifted geometric progression of planetary distances. This theory correctly predicted the location of Uranus and the asteroid Ceres, leading to widespread excitement. However, the discovery of Neptune eventually proved the theory was merely a numerical fluke. This historical example highlights the danger of over-interpreting patterns without enough data or a solid theoretical foundation.
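Bode's rule (the Titius–Bode law) is easy to reproduce and shows exactly where the fluke breaks. The formula below, a = 0.4 + 0.3 · 2^n AU, is the standard modern statement of the rule, and the observed distances are rounded textbook values:

```python
# Titius-Bode rule: a = 0.4 + 0.3 * 2**n AU (Mercury is the edge case n = None).
def bode_distance(n):
    return 0.4 if n is None else 0.4 + 0.3 * 2 ** n

# (name, rule index n, observed semi-major axis in AU, rounded)
bodies = [
    ("Mercury", None, 0.39), ("Venus", 0, 0.72), ("Earth", 1, 1.00),
    ("Mars", 2, 1.52), ("Ceres", 3, 2.77), ("Jupiter", 4, 5.20),
    ("Saturn", 5, 9.54), ("Uranus", 6, 19.19), ("Neptune", 7, 30.07),
]

for name, n, observed in bodies:
    predicted = bode_distance(n)
    error = 100 * (predicted - observed) / observed
    print(f"{name:8s} predicted {predicted:5.1f} AU, observed {observed:5.2f} AU ({error:+5.1f}%)")
```

Every body through Uranus lands within a few percent, which is why Ceres and Uranus looked like confirmations. Neptune's predicted 38.8 AU against the observed 30.1 AU misses by roughly 29 percent, exposing the pattern as numerology.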

Terence explains that AI is now driving the cost of idea generation down to nearly zero, much like the internet did for communication. While this sounds like an era of abundance, it creates a massive new bottleneck. We are shifting from a shortage of ideas to a crisis of verification.

AI has basically driven the cost of idea generation down to almost zero in a very similar way to how the Internet drove the cost of communication down to almost zero, which is an amazing thing, but it doesn't create abundance by itself. Now the bottleneck is different. We're now in a situation where suddenly people can generate thousands of theories for a given scientific problem. And now we have to verify them, evaluate them.

Current scientific structures, such as peer review, are already being overwhelmed by AI-generated submissions. Sorting through millions of papers to find a truly unifying concept, like the invention of the bit or the foundations of deep learning, is a challenge we are not yet equipped to handle at scale. Often, the only reliable filter is the test of time. Deep learning itself spent years as a niche, controversial field before its value was finally recognized. Many great ideas do not receive a warm reception when first proposed, and it is only when other scientists apply them to their own work that their true importance becomes clear.

The cognitive Copernican revolution and the inertia of scientific progress

15:44 - 21:35

Many standards in science and technology are driven by inertia rather than objective superiority. Systems like base 10 mathematics or the transformer architecture for large language models are widely used because they were adopted early and standardized. It is difficult to assess the value of a new idea in isolation. Its success depends heavily on future context and societal adoption. A system is often useful simply because everyone else uses it and the entire infrastructure has been built around it.

You can't look at any given scientific achievement purely in isolation and give it an objective grade without being aware of the context, both in the past and the future.

History shows that correct new theories often appear worse than the established models they aim to replace. Terence explains that Copernicus proposed a heliocentric model that was initially less accurate than the geocentric model of Ptolemy. The older system had a thousand years of ad hoc fixes and tweaks that made it fit observations better than the simpler but incomplete new theory. Science is a work in progress. A partial solution may look inferior to an incorrect theory that has been polished enough to answer all current questions.

True progress frequently involves deleting old assumptions rather than just adding new information. Moving from a stationary Earth to a moving one required abandoning the Aristotelian belief that objects naturally want to stay at rest. Similarly, Darwin's theory of evolution required moving past the idea that species are static and permanent. We are now experiencing a similar shift regarding our own minds.

Now we're going through a cognitive version of the Copernican revolution, where we used to think that human intelligence is the center of the universe. And now we're actually seeing that there's very different types of intelligence that are out there with very different strengths and weaknesses.

This shift in perspective is forcing a reordering of which tasks are considered to require intelligence. AI is demonstrating that human-like intelligence is not the only model for complex problem solving. This challenges the long-held assumption that human cognition is the standard against which all other intelligence must be measured.

The role of communication in scientific discovery

21:35 - 26:09

The gap between Isaac Newton's work and Charles Darwin's theory of evolution is striking. While Darwin's ideas seem conceptually simpler, they arrived nearly two centuries after Newton's mathematical breakthroughs. One reason for this delay is how easily a theory can be verified. Newton could use equations to predict the moon's orbit and get immediate feedback. Darwin, however, relied on cumulative evidence that was harder to prove through a single experiment.

Terence explains that science is not just about creating and validating a theory. It is also about the art of communication. Darwin was a gifted communicator who wrote in plain English and used a persuasive style to synthesize disparate facts. In contrast, Newton wrote in Latin and lived in an era of extreme academic secrecy. He often held back his best insights to prevent rivals from gaining an advantage.

The art of exposition and making a case and creating a narrative is also a very important part of science. If you have the data it helps, but people need to be convinced. Otherwise, they will not push it further or they will not take the initial investment to learn your theory and really explore it.

There is a deeply social side to scientific progress. Even when a theory has gaps, a strong narrative can bridge them. Darwin could not explain the mechanism of inheritance, but he built a case that convinced others that the evidence would eventually be found. This human element of storytelling remains essential because scientists must persuade their peers to take an interest in new ideas.

Extracting insights from limited data in astronomy

26:10 - 27:34

Astronomy was one of the first sciences to embrace deep data analysis. Because astronomical data is often difficult to collect, it remains a major bottleneck. This constraint forced astronomers to become experts at squeezing every possible drop of information from the evidence they have. They work like detectives who can draw vast conclusions from small traces. There is a deductive overhang where the right insight can reveal much more about the world than initially expected.

Astronomers are almost world class in extracting. Almost like Sherlock, extracting all kinds of conclusions from little traces of data.

Terence explains that this skill is highly valued in other fields. For example, quantitative hedge funds often prefer hiring astronomy PhDs. These professionals are trained to identify significant signals hidden within random bits of data. The ability to find patterns where others see noise is a direct result of working with limited astronomical information.

Measuring scientific progress through citation data

28:51 - 30:29

Scientific signals can reveal more than what is on the surface. Researchers once studied how often scientists actually read the papers they cite by tracking specific typos in references. When a typo from one paper appears in another, it suggests the author copied the citation without checking the original source. These kinds of footprints help measure how much attention people are paying to the literature.

So many citations have little typos like a number is wrong or punctuation symbol is wrong. And they measured how often a typo got copied from one reference to the next. And they could infer whether an author was actually just copying, cutting and pasting a reference without actually checking it.

Similar metrics could help assess whether a scientific development represents real progress. Sociology of science research might use data from citations and conference mentions to detect these patterns. Finding clever ways to extract extra information from existing data sets could provide a clearer picture of how science is actually moving forward.
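A toy sketch of the typo-tracking method described above (the paper names and reference strings here are invented for illustration): a citation that exactly reproduces a typo first seen in another paper's reference list is flagged as likely copied rather than read.

```python
# Hypothetical data: three papers citing the same source. paper_B introduces a
# typo (wrong final page number); paper_C reproduces that typo exactly.
canonical = "Smith, J. (1998). On signals. J. Data Sci. 12, 345-361."
citations = {
    "paper_A": "Smith, J. (1998). On signals. J. Data Sci. 12, 345-361.",
    "paper_B": "Smith, J. (1998). On signals. J. Data Sci. 12, 345-316.",
    "paper_C": "Smith, J. (1998). On signals. J. Data Sci. 12, 345-316.",
}

seen_typos = set()  # malformed reference strings observed so far
for paper, ref in citations.items():
    if ref == canonical:
        print(f"{paper}: matches the canonical reference")
    elif ref in seen_typos:
        print(f"{paper}: repeats a known typo -> likely copied, not checked")
    else:
        seen_typos.add(ref)
        print(f"{paper}: introduces a new typo")
```

The real studies are more careful (typos can be coincidental, and databases propagate errors too), but the core signal is this simple: shared errors trace lines of copying.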

AI and the shift toward experimental mathematics at scale

30:31 - 40:52

Artificial intelligence has recently solved about 50 of the 1,100 Erdős problems, but progress has reached a temporary plateau. The easy problems have been cleared away, and the remaining ones require a different approach. Current frontier models have tried to attack hundreds of these problems at once, but they have mostly found minor observations or things already known in the literature. This suggests that the current era of pure AI solutions might have reached its limit for now.

AI tools are like jumping machines that can jump two meters in the air, higher than any human. Sometimes they jump in the wrong direction and sometimes they crash, but sometimes they can reach the tops of the lowest walls that we couldn't reach before.

The main challenge is that AI is not yet good at making partial progress. Humans usually work by identifying intermediate stages and building on them. AI tends to either solve a problem completely or fail. Terence notes that while humans excel at depth, AI excels at breadth. This creates a complementary relationship. We can use AI to map out entire fields and identify difficult areas that require human expertise. This could shift math from a purely theoretical pursuit to a more experimental one. Instead of handcrafting every proof, we can use AI to test workflows across thousands of problems to see what works at scale.

The impact of AI on mathematical problem solving

40:52 - 46:43

Human mathematicians usually start a new problem by trying standard techniques. If these methods solve about 80% of the problem but leave a resistant gap, a new technique must be invented. This invention of new methods is what typically gets published in top journals. It is rare today for a solution to emerge without any reliance on past literature because math is a very mature field.

AI tools are becoming highly effective at the first stage of this process. They try all the standard techniques and often make fewer implementation mistakes than humans. Terence notes that when he tests these tools on small tasks, the AI is roughly as accurate as he is. However, AI still struggles with the next step. When standard methods fail, AI might suggest random ideas that often waste more time than they save.

The progress is simultaneously amazing and disappointing. It is a very strange feeling to see these tools in action, but we also acclimatize to them really quickly.

Some problems that were previously considered hard are falling to AI, especially those that have not received much attention. In a recent project involving Erdős problems, AI solved about 50 challenges where almost no previous literature existed. These solutions often involved combining an obscure technique with another result from the literature. This represents the current median level of what AI can achieve.

Despite these wins, systematic studies show that AI tools have a low success rate for any given difficult problem. The success looks more impressive on social media because only the winners are broadcast. To understand true progress, it is important to use standardized datasets rather than relying on companies to only share their positive results. We tend to adapt to these breakthroughs quickly. What would have been stunning a few years ago, like an AI solving college level math problems, is now taken for granted.

AI as a tool for richer mathematics

46:43 - 49:18

Terence Tao observes that measuring productivity gains from AI is not a simple calculation. While his style of doing mathematics is changing, the core process of solving the most difficult problems remains the same. He still relies on pen and paper for the hardest work. However, AI has significantly changed how he handles secondary tasks. He uses AI agents to reformat text, perform deeper literature searches, and generate code or visual plots. These tasks used to take hours but now take only minutes.

The core of what I do, actually solving the most difficult part of a math problem, hasn't changed too much. I still use pen and paper for that. But there's lots of silly things. I use an AI agent now to reformat. They've really sped up lots of secondary tasks. It's allowed me to add more things to my papers.

The primary benefit of AI is that it allows for richer and broader academic papers. Because it is so easy to generate numerics and visualizations, Tao includes more of them in his work rather than just describing concepts in words. If he had to write his current, feature-rich papers without AI assistance, it would likely take five times longer. While these tools make the work more comprehensive, they have not yet made the mathematical insights themselves deeper.

Distinguishing between artificial cleverness and true intelligence

49:19 - 52:53

Intelligence is often difficult to define, but it is recognizable during collaborative problem solving. When two people work on a math problem, they start without a solution and develop a strategy together. They test an idea, see it fail, and then modify it. This process involves adaptivity and a continual improvement of the strategy. Eventually, the partners map out what works and what does not work. This allows a path forward to evolve through the discussion.

Intelligence is one of these things that you kind of know it when you see it. When I talk to someone and we are trying to collaboratively solve a math problem together, neither of us knows how to solve the problem initially, but one of us has some idea and it looks promising. We test it and then it doesn't work, but then we modify it and there is some adaptivity and continual improvement of the idea over time.

Current AI models function differently. They rely more on brute force and repetition than on cumulative progress. While they can mimic the process of trial and error, they do not yet build up from partial progress in an interactive way. When a model works on a problem, its personal understanding of the subject does not actually progress. It might eventually be retrained on that data, but in a single session, it lacks the ability to attach new skills to build on related problems.

There isn't this cumulative process which is sort of built up interactively. It seems to be a lot more trial and error and just repetition, brute force, which it scales and it can work amazingly well in certain contexts. But this idea of building up cumulatively from partial progress is what's still not quite there yet.

The interpretability of AI-generated mathematical proofs

52:54 - 59:19

Some mathematical problems are solved through brute force instead of conceptual elegance. The four color theorem is a famous example where a proof exists but lacks a deep conceptual narrative. For major challenges like the Riemann Hypothesis, a solution likely requires creating a new type of mathematics or finding connections between unrelated fields. It does not feel like a problem that can be solved by simply checking cases. If a computer were to disprove it through a massive calculation, it would be a disappointment because no new understanding would be gained.

I think you'll get a lot more mileage out of the interplay between humans collaborating with these tools. I can see one of these problems being solved by some smart humans assisted by some extremely powerful AI tools.

Using formalizing tools like Lean makes it possible to study a proof piece by piece. Even if an AI generates a massive amount of code, a mathematician can look at each individual step to see if it is standard or if it contains a brand new idea. In the future, mathematicians might specialize in taking AI-generated proofs and using other AI tools to make them more elegant or understandable. This process of refactoring and summarizing turns a messy machine output into something humans can learn from.
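In Lean, every inference is an explicit, machine-checked object, which is what makes this piecewise inspection possible. A minimal Lean 4 sketch, using a standard library fact rather than anything AI-generated:

```lean
-- Each theorem is a checkable artifact: a reviewer (human or AI) can see
-- exactly which lemma a step rests on, here the library lemma Nat.add_comm.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Because a large machine-generated proof is just many such steps chained together, each one can be classified as routine (an existing lemma) or novel (a new construction) independently of the rest.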

The most expensive part of a mathematician's job used to be the time spent writing and refactoring papers. AI is changing this by allowing researchers to generate many different versions or summaries of their work quickly. Once a proof exists as a digital artifact, it can be analyzed and deconstructed. This means even if an AI finds a solution that seems incomprehensible at first, humans will have the tools to eventually interpret it.

The missing formal language for mathematical strategy

59:20 - 1:03:18

Mathematics has successfully formalized logic and proofs. This process took millennia, from Euclid to the early 20th century, resulting in standard axioms like ZFC. Today, tools like Lean allow for the automation of deductive proofs. However, there is still no formal way to describe mathematical strategies or how to assess the plausibility of a conjecture.

The bottleneck for using AI to create strategies and make conjectures is we have to rely on human experts and the test of time to validate whether something is plausible or not. If there was some semi-formal framework where this could be done semi-automatically in a way that is not easily hackable, it would be very powerful.

When a researcher tests a few examples and finds they work, their confidence in a theory grows. While Bayesian probability can model this, it requires subjective assumptions. Terence suggests that a new language is needed to capture how scientists use data and narrative to communicate. This is difficult because reinforcement learning is highly effective at finding exploits or backdoors in any formal system. Creating a framework that mimics how scientists talk to each other is a future problem that remains unsolved.
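The point about subjective assumptions can be made concrete with a toy Bayesian update (everything below, including the 0.9 pass rate assumed for a false conjecture, is an invented assumption, not anything from the conversation): each verified example multiplies the odds that the conjecture is true by a likelihood ratio, and the whole curve hinges on that subjective ratio.

```python
def posterior(prior, passing_checks, p_pass_if_false=0.9):
    """Odds-form Bayes: each passing test multiplies the odds that the
    conjecture is true by 1 / p_pass_if_false (a subjective choice)."""
    odds = prior / (1 - prior)
    odds *= (1 / p_pass_if_false) ** passing_checks
    return odds / (1 + odds)

# Starting from even odds, confidence creeps up with each verified case,
# but how fast depends entirely on the assumed p_pass_if_false.
for checks in (0, 5, 20, 100):
    print(checks, round(posterior(0.5, checks), 3))
```

This is exactly the weakness Tao identifies: change `p_pass_if_false` from 0.9 to 0.99 and the same evidence yields a much weaker conclusion, with no formal way to settle which assumption is right.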

Statistical patterns and the random nature of prime numbers

1:03:19 - 1:09:47

Gauss created one of the first mathematical data sets by computing 100,000 prime numbers. He discovered a statistical pattern showing that primes get sparser as numbers get larger: the density of primes near a number x falls off roughly as 1 / ln x. This discovery became the prime number theorem. It was a revolutionary way of thinking because it was statistical rather than providing an exact count. It started the field of analytic number theory by treating primes as a random set of numbers with a certain density.
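The density claim is easy to check numerically: by the prime number theorem, the count of primes up to x, written pi(x), is approximately x / ln x. A quick sieve-based sketch:

```python
import math

def prime_count(x):
    """Count primes up to x with a sieve of Eratosthenes."""
    is_prime = [True] * (x + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(x ** 0.5) + 1):
        if is_prime[p]:
            # Cross off every multiple of p starting from p*p.
            is_prime[p * p :: p] = [False] * len(is_prime[p * p :: p])
    return sum(is_prime)

for x in (10**3, 10**4, 10**5):
    pi_x = prime_count(x)
    approx = x / math.log(x)
    print(f"pi({x}) = {pi_x:>5},  x/ln x = {approx:7.0f},  ratio = {pi_x / approx:.3f}")
```

The ratio drifts slowly toward 1 as x grows, which is the prime number theorem seen at small scale: the approximation is statistical, not exact, exactly as described above.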

Thinking of primes as random has been very productive. They are actually pseudorandom because no random number generator creates them, but they behave as if a higher power is rolling dice. This random model makes mathematicians certain that the twin prime conjecture is true. There is a strong belief that pairs of primes differing by only two, such as 11 and 13, appear infinitely often, much like monkeys at a typewriter eventually produce specific words.
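The twin prime conjecture resists proof, yet it is trivially easy to probe empirically, which is why the heuristic model feels so convincing. A trial-division sketch counting twin pairs:

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Twin primes: pairs (p, p + 2) with both members prime.
twins = [(p, p + 2) for p in range(2, 1000) if is_prime(p) and is_prime(p + 2)]
print(len(twins), twins[:5])  # 35 pairs below 1000, starting (3, 5), (5, 7), ...
```

The random model predicts such pairs should keep appearing forever, just more sparsely, and every computation ever run agrees with it. Proving it is another matter entirely.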

We have over time developed this very accurate conceptual model of what the primes should behave like based on statistics and probability. But it's all mostly heuristic and non rigorous, but extremely accurate.

Our belief in the security of modern cryptography is based on this model. If the Riemann Hypothesis turned out to be false, it would reveal a secret pattern we do not understand. This would be a major blow to our current systems. We would likely abandon cryptography based on primes because hidden patterns often lead to exploits. We want to ensure that does not happen because it would be a massive shock to the consensus.

It is difficult to measure scientific progress because we only have one historical timeline. If we could observe a million different civilizations, we might understand the best strategies for discovery. One way to test this is by using small AI models as laboratories. Terence suggests we could learn a lot by evolving small artificial intelligences on simple problems. We could watch them solve basic arithmetic to see how they develop their own strategies for learning.

The importance of serendipity in mathematical discovery

1:09:48 - 1:17:04

Terence views himself as a fox in the world of mathematics, preferring to know a little bit about many things rather than focusing on a single area. He possesses an obsessive completionist streak that drives him to understand why certain mathematical tricks work. When he sees someone else use a method he does not yet grasp to prove something he wants to solve, it motivates him to find the secret behind their technique.

It bugs me that someone else can do something which I think I can do, but I can't. So I have always had that kind of obsessive completionist type streak.

Writing is a core part of how Terence retains knowledge. He maintains a blog to record cool tricks and arguments he learns through collaboration. In the past, he found it frustrating to understand a complex concept only to forget the logic six months later. Now, he uses blog writing as a creative escape from administrative drudgery.

Beyond structured learning, Terence places a high value on serendipity. Modern life often prioritizes optimization, which can eliminate the accidental discoveries that happen during inefficient processes. In the past, searching for a specific journal in a physical library allowed a researcher to stumble upon other interesting articles nearby. Today, search engines and AI provide instant results but remove those unexpected moments of inspiration.

You can get instantly what you want, but you do not get the accidental things that you might have gotten if you had done it more inefficiently.

Terence believes that a healthy level of distraction is necessary for creativity. While environments dedicated solely to research provide deep focus, they can eventually lead to a loss of inspiration. Some level of randomness in a daily schedule helps maintain a fresh perspective.

The evolution of mathematical work in the age of AI

1:17:04 - 1:23:43

Mathematics has always evolved by outsourcing labor-intensive tasks to new technologies. In the 19th century, mathematicians spent much of their time manually solving differential equations for physicists. Today, those same problems are solved in minutes using software like Mathematica or Wolfram Alpha. This shift did not kill the subject. Instead, it allowed mathematicians to move on to different, more complex types of problems. A similar transition happened in genetics, where sequencing a single genome once required an entire PhD, but now costs very little and happens almost instantly. This automation simply pushed the field to study whole ecosystems instead of individuals.

100 years ago, a lot of mathematicians were just solving differential equations. A lot of what a 19th century mathematician would do, you could make a call to Mathematica or Wolfram Alpha or an AI, and it will just solve the problem in a few minutes. But we moved on. We worked on different types of problems after that.
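The kind of routine labor being described fits in a few lines today. As a hedged illustration (the equation is chosen arbitrarily, not taken from the conversation), here is a minimal numerical solve of dy/dx = -2y with y(0) = 1, whose exact solution is y = e^(-2x):

```python
import math

def euler(f, y0, x_end, steps):
    """Fixed-step Euler integration of dy/dx = f(x, y) starting at x = 0."""
    h = x_end / steps
    x, y = 0.0, y0
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y

approx = euler(lambda x, y: -2.0 * y, y0=1.0, x_end=1.0, steps=100_000)
exact = math.exp(-2.0)
print(approx, exact)  # numerical vs closed-form value at x = 1
```

What once consumed careers is now a throwaway function call, which is precisely why the field moved on to problems that such calls cannot dispatch.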

While AI is currently very good at certain tasks, it remains unreliable in others. The future likely belongs to a hybrid model where humans and AI work together rather than a total replacement of human mathematicians. AI acts as a complementary tool that can accelerate science, though it carries the risk of potentially inhibiting certain types of serendipitous progress. We are living in an unpredictable era where long-standing methods of working are being revolutionized.

For those starting a career in mathematics, adaptability is the most important trait. The traditional path required years of education and a PhD before one could contribute to the frontier of research. Now, AI tools and formal verification languages like Lean are lowering the barrier to entry. It is becoming possible for high school students to make real contributions to mathematical projects. While traditional credentials still matter, young researchers should remain open to entirely new ways of doing science that do not even exist yet.

You previously had to basically go through years and years of education, be a math PhD before you could contribute to the frontier of math research. But now it's quite possible at the high school level that you could get involved in a math project and actually make a real contribution because of all these AI tools and Lean.