alt.hn

3/26/2026 at 9:41:44 AM

In math, rigor is vital, but are digitized proofs taking it too far?

https://www.quantamagazine.org/in-math-rigor-is-vital-but-are-digitized-proofs-taking-it-too-far-20260325/

by isaacfrond

3/30/2026 at 6:55:10 PM

Great quote from Hilbert, I think it’s also a useful thought for software development.

“The edifice of science is not raised like a dwelling, in which the foundations are first firmly laid and only then one proceeds to construct and to enlarge the rooms,” the great mathematician David Hilbert wrote in 1905. Rather, scientists should first find “comfortable spaces to wander around and only subsequently, when signs appear here and there that the loose foundations are not able to sustain the expansion of the rooms, [should they] support and fortify them.”

by WhitneyLand

3/30/2026 at 9:04:45 PM

Yeah, I see a lot of people (especially on HN) bemoaning any science that isn't a controlled double blind experiment with a large sample size. But exploratory science is just as important as the science that proves things. Otherwise we wouldn't know which hypotheses are useful/interesting to test.

by nicoburns

3/31/2026 at 12:13:53 AM

The problem is more about how it is reported to the public. Science is ugly, but when a discovery is announced to the public, a high level of confidence is expected, and journalists certainly act as if there is one. Kind of like how you are not supposed to ship untested development versions of software to customers.

But sometimes, some of the ugly science gets out of the lab a bit too soon, and it usually doesn't end well. People get their hopes up, and when it doesn't live up to the hype, they get confused.

It really stood out during the covid pandemic. We didn't have time to wait for the long trials we normally expect, waiting could mean thousands of deaths, and we had to make do with uncertainty. That's how we got all sorts of conflicting information and policies that changed all the time. The virus spreads by contact, no, it is airborne; masks, no masks; hydroxychloroquine, no, that's bullshit; that sort of thing. That's the kind of thing that usually doesn't get publicized outside of scientific papers, but the circumstances made it so that everyone got to see it, including science deniers, unfortunately.

Edit: Still, I really enjoyed the LK99 saga (the supposed room-temperature superconductor). It was overhyped, and it came to its expected conclusion (it isn't one); however, it sparked widespread interest in superconductors and plenty of replication attempts.

by GuB-42

3/31/2026 at 12:39:55 AM

  > The problem is more about how it is reported to the public. 
Yes and no.

There's a lot of slop coming from science communicators, and it's getting worse. Even places like Nature and Scientific American are making unacceptable mistakes (a famous one being the quantum machine learning black hole BS that Quanta published).

But I frequently see those HN comments on ArXiv links. That's not a science communication issue. Those are papers. That's researcher to researcher communication. It's open, but not written for the public. People will argue it should be, but then where does researcher to researcher communication happen? You really want that behind closed doors?

There is a certain arrogance that plays a role. Small sample size? There's a good chance it's a paper arguing for the community to study at a larger scale. You're not going to start out by recruiting a million people to figure out if an effect might even exist. Yet I see those papers routinely scoffed at. They're scientifically sound but laughing at them is as big of an error as treating them like absolute truth, just erring in the opposite direction.

People really do not understand how science works, and they get extremely upset if you suggest otherwise. As if not understanding something they haven't spent decades studying implies they're dumb. Scientists don't expect non-scientists to understand how science works. There's a reason you're only a junior scientist after getting an entire PhD. You can be smart and not understand tons of stuff. I got a PhD and I'll happily say I'll look like a bumbling idiot outside my niche, even in my own domain! I think we've just got to stop trying to prove how smart we are before we're all dumb as shit. We're just kinda not dumb at some things, and that's perfectly okay. Learning is the interesting part. And it's extra ironic that the Less Wrong crowd doesn't take those words to heart, because that's what it's all about. We're all wrong. It's not about being right, it's about being less wrong.

by godelski

3/30/2026 at 9:41:36 PM

Are they bemoaning that science is being done, or are they bemoaning that the experimental results have not yet reached high enough confidence to justify the conclusions being suggested?

by harshreality

3/30/2026 at 11:16:04 PM

> Are they bemoaning that science is being done

The reflexive "in mice" comments seem to be bemoaning how science is done.

by JumpCrisscross

3/31/2026 at 1:12:15 AM

As someone who has made several comments consisting entirely of “…in mice.”, let me assure you that the reflex only kicks in after reading the paper and noticing that the experimental subjects were exclusively mice.

The problem is not mice experiments on arxiv, the problem is posting those papers for broader dissemination to the public, with titles suggesting to the public that cancer has been cured, without prominently pointing out that the experiments were not about cancer in humans.

by cwillu

3/31/2026 at 1:31:41 AM

> problem is posting those papers for broader dissemination to the public, with titles suggesting to the public that cancer has been cured

Fair enough. I'm thinking of cases where a good study that isn't turned into PR slop is dismissed because it was done in mice. Which is fine for most people. But not great if we're treating real science that way.

by JumpCrisscross

3/31/2026 at 2:25:20 AM

Dismissing good science is entirely the correct decision when the good science isn't ready for broad dissemination to the audience to which it is being presented.

by cwillu

3/31/2026 at 12:34:42 AM

I disagree. I think people understand studies have to begin in mice. It’s what the GP said. You can’t release those studies because there’s not a high enough confidence rate in what most people are interested in, i.e. how it affects humans.

by dyauspitr

3/31/2026 at 1:06:46 AM

> You can’t release those studies because there’s not a high enough confidence rate in what most people are interested in, i.e. how it affects humans

This is science by ignoramus. It isn't how science works, at least not when it works at its best. Someone advocating for censoring science because it might be misread is not on the side of science.

by JumpCrisscross

3/31/2026 at 5:27:23 PM

I’m not advocating for censoring them. I’m advocating for less hype in science media reporting around mice studies because, let’s be frank, the vast majority of the population are ignoramuses who cannot make the distinctions themselves, and that has real political consequences through lack of trust in scientific organizations.

by dyauspitr

3/30/2026 at 10:57:09 PM

More Doctors Smoke Camels!™

by cap11235

3/31/2026 at 12:32:26 AM

It depends, especially in fields like psychology. You can prove anything with a small enough group. A lot of those studies just end up adding noise and reducing the reliability of the entire field. People end up getting conflicting information every other week, and then they just tune out.

by dyauspitr

3/31/2026 at 1:38:15 AM

Like anything else, it's easier to complain about the legitimacy of something and nitpick it to death than it is to do the actual thing.

Most people on HN aren't scientists, even if they fancy themselves as such.

by potsandpans

3/31/2026 at 6:16:46 PM

That’s similar to Neurath’s boat: “We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Where a beam is taken away a new one must at once be put there, and for this the rest of the ship is used as support. In this way, by using the old beams and driftwood the ship can be shaped entirely anew, but only by gradual reconstruction.”

by raincom

3/30/2026 at 8:39:33 PM

My only complaint with the article is that it doesn't seem to mention that digitized proofs can contain gaps, but that those gaps must be explicit, like Lean's `sorry` or added axioms.
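For anyone unfamiliar, here is a minimal sketch of what an explicit gap looks like in Lean 4 (the theorem name is made up for illustration):

```lean
-- `sorry` closes any goal, but Lean emits a warning for it, so the
-- gap is explicit and machine-trackable rather than hidden in prose.
theorem zero_is_right_identity (n : Nat) : n + 0 = n := by
  sorry
```

Anything that depends on a `sorry`-ed theorem is flagged too, so you can see exactly which results rest on unfinished foundations.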

by ratmice

3/30/2026 at 4:23:27 PM

The problem with this ambition is that it turns mathematics into software development. There’s absolutely nothing wrong with this per se, however what happens is that, as in software, certain ideas get ossified. That’s why, for example, every OS has a POSIX layer even though technically the process/namespace/security model could be radically reimagined possibly to create more easily engineered, correct software.

Mathematics is going through a huge, quiet, upheaval. The litmus test will be when, if ever, someone wins a Fields using a proof-assistant in an essential way.

by sxzygz

3/30/2026 at 5:16:04 PM

> what happens is that, as in software, certain ideas get ossified. That’s why, for example, every OS has a POSIX layer even though technically the process/namespace/security model could be radically reimagined possibly to create more easily engineered, correct software.

Total amateur here, but it strikes me that one important difference is that performance matters in software in a way that it doesn’t in mathematics—that is, all proofs are equally valid modulo elegance. That means that abstractions in software are leaky in a way that abstractions in mathematics aren’t.

In other words, in software, the same systems get reused in large part because they’ve been heavily refined, in terms of performance, unexpected corner-case behavior and performance pitfalls, documentation of the above, and general familiarity to and acceptance by the community. In math, if you lay new foundations, build some new abstraction, and prove that it’s at least as powerful as the old one, I’d think that you’d be “done” with replacing it. (Maybe downstream proofs would need some new import statements?)

Is this not the case? Where are people getting stuck that they shouldn’t be?

by msteffen

3/30/2026 at 8:30:39 PM

I know what you're saying but elegance is not simply an aesthetic concern.

The value of a proof is not only its conclusion but also the insight that it provides through its method.

The goal of mathematics is not to prove as many theorems as possible but rather to gain an ever deeper understanding of why certain statements are true. The way that something is proved can be more or less useful to advancing that goal.

As an example the elementary proof(s) of the prime number theorem are just about as famous as the original proof. Sometimes the second bite of the cherry is even juicier than the first.

by zarzavat

3/31/2026 at 1:14:04 AM

Exactly. The reason mathematicians and physicists care about elegance is because they care about understanding things. Elegance, like you said, isn't about aesthetics, even though people seem to think they're synonymous. But the elegance is that you've reduced things to simple components. That not only makes it easier for us humans to understand but it means you're closer to the minimal structure. Meaning you know what matters and more importantly, what doesn't.

Tbh, elegance is something programmers should strive for too. Elegant code is easier to build upon, easier to read/understand, easier to modify, easier to adapt. For all the same reasons mathematicians want elegance. Though it's true for many domains. People love to throw around the term "first principles" but that's not something you (usually) start at, that's something you derive. And it's usually not very easy to figure out

by godelski

3/30/2026 at 5:17:47 PM

Agreed; e.g. if you prove something about the real numbers, how R is constructed from your axiomatic system doesn't matter.

by dooglius

3/30/2026 at 9:18:37 PM

Some proofs have become extremely long, and the raw size has created worries about correctness. It's easy to make a mistake in hundreds of pages.

Ultimately, a proof is an argument that something is true. The simpler "more elegant" proof is generally going to be more convincing.

by nitwit005

3/30/2026 at 7:33:56 PM

I don't think proof irrelevance is accepted in constructivist settings. Those settings are, however, not that relevant to the recent wave of AI math, which uses Lean, whose type system includes classical mathematics.
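As a concrete sketch of what proof irrelevance means in Lean 4, where it holds definitionally for `Prop`:

```lean
-- Any two proofs of the same proposition are equal; `rfl` suffices
-- because proof irrelevance is part of Lean's definitional equality.
example (p : Prop) (h₁ h₂ : p) : h₁ = h₂ := rfl
```

A constructivist reading, by contrast, treats distinct proofs of the same statement as genuinely distinct objects, which is why the principle is contested there.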

by enricozb

3/30/2026 at 8:07:12 PM

>The litmus test will be when, if ever, someone wins a Fields using a proof-assistant in an essential way.

You're assuming that the point of interactive theorem provers is to discover new mathematics. While that's an interesting research area, it seems like the more practical application is verifying proofs one has already discovered through other means.

by fasterik

3/30/2026 at 9:03:57 PM

Exactly this. LLMs really aren't built for discovering new mathematics, especially _interesting_ new mathematics. They're built to try the most obvious patterns. When that works, it's pretty much by definition not interesting.

What LLMs are good at is organizing concepts, filling in detail, and remembering to check corner cases. So their use should help mathematicians to get a better handle on what's terra firma and what's still exploration. Which is great. Proof by it-convinced-other-mathematicians doesn't have a flawless track record. Sometimes major theorems turn out to be wrong or wrong-as-stated. Sometimes they're right, but there's never been a complete or completely correct proof in the literature. The latter case is actually quite common, and formal proof is just what's needed.

by auntienomen

3/30/2026 at 9:13:50 PM

LLMs and interactive theorem provers are vastly different. There are AI models that come up with workable formal proofs for ITPs but these aren't your usual frontier models, they're specifically trained for this task.

by zozbot234

3/30/2026 at 9:52:54 PM

ITPs are far older than LLMs in general, sure, but that's a pedantic distraction. What everyone is talking about here (both the comments, and the article) are ITPs enriched with LLMs to make the "smart" proof assistants. The LLMs used in ITPs are not vastly different from the usual chatbots and coding assistants. Just a different reinforcement learning problem, no fundamental change in their architecture.

by auntienomen

3/31/2026 at 12:32:04 AM

Of course, once LLMs are really good at that, they can be set loose on the entire historical math literature, all 3.5M papers worth. And then LLMs can be trained on these formalized results (the ones that turn out upon attempted formalization to have been correct.)

How good do you think AI will be at proving new results given that training set?

Math is going to change, and change massively. There's a lot of whistling past the graveyard going on from those who are frightened by this prospect.

by pfdietz

3/30/2026 at 7:39:53 PM

Haven't science and mathematics always worked like this? Models are built, they ossify, and eventually get replaced when they become limiting. Software just makes that process more explicit. Or at least I don't see how math turning into software development would selectively promote this effect.

by perching_aix

3/30/2026 at 9:10:46 PM

> however what happens is that, as in software, certain ideas get ossified. That’s why, for example, every OS has a POSIX layer

Refactoring formalized developments is vastly easier than refactoring software or informal math, since you get verified feedback as to whether the refactoring is correct.

by zozbot234

3/31/2026 at 2:40:46 AM

We already have had that ossified layer thing multiple times in mathematics. Formalisms change. For example, prior to vector calculus, writing out multidimensional PDEs was tedious. Vector calculus has serious issues in its own right, so you get people pushing geometric algebra. In more rarefied domains you have things like categories and sheaves replacing the previous "ossified" layer.

It'll keep going on and on.

by nimish

3/30/2026 at 10:25:50 PM

> That’s why, for example, every OS has a POSIX layer even though technically the process/namespace/security model could be radically reimagined possibly to create more easily engineered, correct software.

But that is because everyone has to switch to the new system. There is no shortage of experimental OSes that do things in different ways. They fail because of switching costs, not because making them is hard.

A machine-checked proof is valid if it happens once. You don't need the whole world to switch.

by bawolff

3/31/2026 at 12:37:37 AM

Maths doesn't need a litmus test, because it's not chemistry. You mentioned ideas being ossified, and that might be one of them.

by gerdesj

3/31/2026 at 12:55:30 AM

> someone wins a Fields using a proof-assistant in an essential way.

Terence Tao is actively using Lean and working with the Lean community to prove leading-edge mathematics.

by deterministic

3/30/2026 at 10:06:41 PM

> certain ideas get ossified.

That's fine in math. Math is true or it is not. People who overturn popular conjectures in math get fame, not opprobrium.

Being able to prove things in something like Lean means that stuff like Mochizuki's work on the abc conjecture could be verified or disproven in spite of its impenetrability. Or, at the very least, it could be tackled piecemeal by legions of students tackling a couple of pages every semester.

by bsder

3/31/2026 at 12:41:39 AM

[dead]

by sxzygz

3/30/2026 at 3:38:39 PM

Imagine a future where proofs are discovered autonomously and proved rigorously by machines, and the work of the human mathematician becomes to articulate the most compelling motivations, the clearest explanations, and the most useful maps between intuitions, theorems, and applications. Mathematicians as illuminators and bards of their craft.

by jl6

3/30/2026 at 3:55:40 PM

The question is whether the capabilities that would let AI take over the discovery part wouldn’t also let them take over the other parts.

by layer8

3/30/2026 at 3:53:39 PM

But in this future, why will “the most compelling motivations, the clearest explanations, and the most useful maps between intuitions, theorems, and applications” be necessary? Catering to hobbyists?

by tines

3/30/2026 at 8:10:43 PM

Most mathematicians don't understand the fields outside of their specialization (at a research level). Your assumption that intuition and applications are limited to hobbyists ignores the possibility of enabling mathematicians to work and collaborate more effectively at the cutting edge of multiple fields.

by fasterik

3/30/2026 at 4:13:28 PM

Very far in the future when AI runs everything, of course math will be a hobby (and it will be great! As a professional programmer I'm happy that I now have a research-level tutor/mentor for my math/physics hobby). In the nearer term, it seems apparent to me that people with stronger mental models of the world are able (without even trying!) to formulate better prompts and get better output from models. i.e. as long as people are asking the questions, they'll do better to have some idea of the nuance within the problem/solution spaces. Math can provide vocabulary to express such nuance.

by ndriscoll

3/30/2026 at 3:58:04 PM

Mapping theorems to applications is certainly necessary for mathematics to be useful.

by layer8

3/30/2026 at 4:03:37 PM

Sure, applications are necessary, but why will humans do that?

by tines

3/30/2026 at 9:55:22 PM

Proofs of what?

Proofs tend to get generated upstream of people trying to investigate something concrete about our models.

A computer might be able to autonomously prove that some function has some property, and this proof is entirely useless when nobody cares about that function!

Imagine if you had an autonomous SaaS generator. You'd end up with “flipping these pixels from red to blue as a service”, “adding 14 to numbers as a service”, “writing the word ‘dog’ into a database as a service”.

That is what autonomous proof discovery might end up being. A bunch of things that might be true but not many people around to care.

I do think there’s a loooot of value in the more restricted “testing the truthfulness of an idea with automation as a step 1”, and this is something that is happening a lot already by my understanding.

by rtpg

3/30/2026 at 9:59:19 PM

> Imagine a future where proofs are discovered autonomously and proved rigorously by machines, and the work of the human mathematician becomes to articulate the most compelling motivations

You've got the wrong idea of what mathematicians do now. There's not a proof shortage! We've had autonomously discovered proofs since at least Automated Mathematician, and we can have more whenever we want them - a basic result in logic is that you can enumerate valid proofs mechanically.

But we don't want them, because most proofs have no value. The work of a mathematician today is to determine what proofs would be interesting to have ("compelling motivations"), and try to prove them.
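The mechanical-enumeration point can be made concrete with a toy formal system. The sketch below (plain Python, function names my own) uses Hofstadter's MIU system rather than a real logic, but the same breadth-first strategy enumerates the theorems of any calculus with finitely many rules:

```python
from collections import deque

# Toy illustration of mechanically enumerating theorems: Hofstadter's
# MIU system, with axiom "MI" and four rewrite rules. Breadth-first
# search lists every derivable string, shortest derivations first.

def successors(s):
    out = set()
    if s.endswith("I"):            # Rule 1: xI  -> xIU
        out.add(s + "U")
    if s.startswith("M"):          # Rule 2: Mx  -> Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):    # Rule 3: III -> U
        if s[i:i+3] == "III":
            out.add(s[:i] + "U" + s[i+3:])
    for i in range(len(s) - 1):    # Rule 4: UU  -> (deleted)
        if s[i:i+2] == "UU":
            out.add(s[:i] + s[i+2:])
    return out

def enumerate_theorems(limit):
    seen, queue, theorems = {"MI"}, deque(["MI"]), []
    while queue and len(theorems) < limit:
        s = queue.popleft()
        theorems.append(s)
        for t in sorted(successors(s)):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return theorems

print(enumerate_theorems(6))
# → ['MI', 'MII', 'MIU', 'MIIII', 'MIIU', 'MIUIU']
```

Every derivable string eventually shows up, but nothing in the procedure says which theorems are worth having, which is exactly the point above.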

by thaumasiotes

3/30/2026 at 10:39:05 PM

It's easy to forget, as we all use digital tools in our day-to-day lives, that the world is fundamentally analog, and there's no way to escape that. Everyone trying to tell you otherwise is just selling snake oil, with one notable exception, which is mathematical rigor in proofs. It's understood now that a rigorous proof in math is exactly one that, in principle, can be digitized and checked automatically. Those are simply the same concept, so introducing a computer there is really a perfect fit of tool and purpose. If we can't use computers to automate the checking of mathematical proofs, then why have computers at all? It's the only serious thing people do that a computer can be literally perfect at!

To be clear, there's much more to math than writing down and checking proofs. Some of the most important contributions to math have been simply figuring out the right questions to ask, and also figuring out the useful abstractions. Those are both firmly on the "analog" side of math, and they are every bit as important as writing the proofs. But to say that we have this huge body of rigorous argumentation in math, and then to finally do the work of checking it formally is "taking it too far," is a really bewildering take to me.

No, I don't think formalizing proofs in Lean or other proof systems should dominate the practice of math, and no, I don't think every mathematician should have to write formal proofs. Is that really where we're heading, though? I highly doubt it. The article worries about monoculture. It's a legitimate concern, but probably less of one in math than in many other places, since in my experience math people are pretty independent thinkers, and I don't see that changing any time soon.

Anyway, the conclusion from all this is that the improved ability for mathematicians to rely on automated tools to verify mathematical reasoning would be a great asset. In my opinion the outcomes of that eventuality would be overwhelmingly good.

by WCSTombs

3/30/2026 at 10:55:09 PM

> that the world is fundamentally analog

Isn't that still an unresolved question? Wave-particle duality and all that.

by ralph84

3/30/2026 at 11:46:22 PM

I think there are some theories that the universe is fundamentally discrete at the lowest level below current capabilities of measurement, but to my knowledge none of those is widely accepted.

by WCSTombs

3/31/2026 at 12:57:34 AM

> that the world is fundamentally analog

Whether the world is discrete or analog is still an open problem in science. And it looks as if there is more and more evidence that the world is actually discrete at the quantum level.

by deterministic

3/31/2026 at 1:32:44 AM

There is no such evidence. The widely accepted models of physics are all continuous. If you see headlines like "physicists think our world might be discrete" please read them as "scientists cured cancer in mice".

by alde

3/30/2026 at 2:55:01 PM

With sufficient automation, there shouldn't really be a trade-off between rigor and anything else. The goal should be to automate as much as possible, so that whatever well-defined, useful things can come out of theory come out faster and more easily. Formal proofs make sense as part of this goal.

by umutisik

3/30/2026 at 3:30:38 PM

Let’s not forget that mathematics is a social construct as much as (and perhaps more than) a true science. It’s about techniques, stories, relationships between ideas, and ultimately, it’s a social endeavor that involves curiosity satisfaction for (somewhat pedantic) people. If we automate ‘all’ of mathematics, then we’ve removed the people from it.

There are things that need to be done by humans to make it meaningful and worthwhile. I’m not saying that automation won’t make us more able to satisfy our intellectual curiosity, but we can’t offload everything and have something of value that we could rightly call ‘mathematics’.

by _alternator_

3/30/2026 at 4:10:53 PM

> mathematics is a social construct

If you believe Wittgenstein then all of math is more and more complicated stories amounting to 1=1. Like a ribbon that we figure out how to tie in ever more beautiful knots. These stories are extremely valuable and useful, because we find equivalents of these knots in nature—but boiled down that is what we do when we do math

by justonceokay

3/30/2026 at 4:55:29 PM

I like the Kronecker quote, "Natural numbers were created by god, everything else is the work of men" (translated). I figure that (like programming) it turns out that putting our problems and solutions into precise reusable generalizable language helps us use and reuse them better, and that (like programming language evolution) we're always finding new ways to express problems precisely. Reusability of ideas and solutions is great, but sometimes the "language" gets in the way, whether that's a programming language or a particular shape of the formal expression of something.

by ianhorn

3/30/2026 at 4:33:42 PM

You don’t really have to believe Wittgenstein; any logician will tell you that if your proof is not logically equivalent to 1=1 then it’s not a proof.

by _alternator_

3/30/2026 at 4:37:17 PM

Sure, I just personally like his distinction between a “true” statement like “I am typing right now” and a “tautological” statement like “3+5=8”.

In other words, declarative statements relate to objects in the world, but mathematical statements categorize possible declarative statements and do not relate directly to the world.

by justonceokay

3/30/2026 at 5:49:07 PM

If you look from far enough, it becomes "Current world ⊨ I am typing right now" which becomes tautological again.

by IsTom

3/30/2026 at 8:14:13 PM

In my view mathematics builds tools that help solve problems in science.

by sesm

3/30/2026 at 10:07:31 PM

This is known as “applied mathematics”.

by _alternator_

3/31/2026 at 11:26:49 AM

Sounds lame and boring to me.

by nathan_compton

3/30/2026 at 5:52:48 PM

More like 1 = 0 + 1.

Read about Lisp, The Computational Beauty of Nature, the 64k Lisp from https://t3x.org, and how all numbers can be built by counting nested lists all the way down.

List of a single item:

     (cons '1 nil)
Nil is the empty atom, so this reads as:

[ 1 | nil ]

List of three items:

    (cons '1 (cons 2 (cons 3 nil)))
Which is the same as

    (list '1 '2 '3)
Internally, it's chained: imagine these are domino pieces, where the right part of each one points to the next.

[ 1 | ] -> [ 2 | ] -> [ 3 | nil ]

A function call is also a list; it applies the operation to the rest of the items:

     (+ '1 '2 '3)
Returns 6.

Which is like saying:

  (eval '(+ '1 '2 '3))
'(+ '1 '2 '3) is just a list of 4 items, not a function call.

Eval will just apply the '+' operation to the rest of the list, recursively.

Which is the default for every list written in parentheses without the leading '.

    (+ 1 (+ 2 3))
Will evaluate to 6, while

    (+ '1 '(+ '2 '3)) 
will give you an error, as you are adding a number and a list, and those are distinct kinds of objects.

How arithmetic is made from 'nothing':

https://t3x.org/lisp64k/numbers.html

Table of contents:

https://t3x.org/lisp64k/toc.html

Logic, too:

https://t3x.org/lisp64k/logic.html

by anthk

3/30/2026 at 5:36:15 PM

There is a bit about this in Greg Egan's Diaspora, where a parallel is drawn between maths and art. It is not difficult to automate art in the sense that you can enumerate all possible pictures, but it takes sentient input to find the beautiful areas in the problem space.

by adrianN

3/30/2026 at 7:43:53 PM

I do not think this parallel works, because I think you would struggle to find a discipline for which this is not the case. It is trivial to enumerate all the possible scientific or historical hypotheses, or all the possible building blueprints, or all the possible programs, or all the possible recipes, or legal arguments…

The fact that the domain of study is countable and computable is obvious because humans can’t really study uncountable or uncomputable things. The process of doing anything at all can always be thought of as narrowing down a large space, but this doesn’t provide more insight than the view that it’s building things up.

by SabrinaJewson

3/30/2026 at 3:33:24 PM

Automating proofs is like automating calculations: neither is what math is, they are just things in the way that need to be done in the process of doing math.

Mathematicians will just adopt the tools and use them to get even more math done.

by seanmcdirmid

3/30/2026 at 3:38:00 PM

I don't think that's true. Often, to come up with a proof of a particular theorem of interest, it's necessary to invent a whole new branch of mathematics that is interesting in its own right e.g. Galois theory for finding roots of polynomials. If the proof is automated then it might not be decomposed in a way that makes some new theory apparent. That's not true of a simple calculation.

by quietbritishjim

3/30/2026 at 11:45:20 PM

> I don't think that's true. Often, to come up with a proof of a particular theorem of interest, it's necessary to invent a whole new branch of mathematics that is interesting in its own right e.g. Galois theory for finding roots of polynomials. If the proof is automated then it might not be decomposed in a way that makes some new theory apparent. That's not true of a simple calculation.

Ya, so? Even if automation is only going to work well on the well understood stuff, mathematicians can still work on mysteries, they will simply have more time and resources to do so.

by seanmcdirmid

3/30/2026 at 4:05:35 PM

This is literally the same thing as having the model write well factored, readable code. You can tell it to do things like avoid mixing abstraction levels within a function/proof, create interfaces (definitions/axioms) for useful ideas, etc. You can also work with it interactively (this is how I work with programming), so you can ask it to factor things in the way you prefer on the fly.

by ndriscoll

3/30/2026 at 6:36:26 PM

>This is literally the same thing as

No.

>You can

Not right now, right? I don't think current AI automated proofs are smart enough to introduce nontrivial abstractions.

Anyway, I think you're missing the point of the parent's posts. Math is not proofs. Some time ago, the four color theorem's "proof" was very controversial, because it was a computer-assisted exhaustive check of every possibility, impossible to verify by a human. It didn't bring any insight.

In general, on some level, proofs are not that important to mathematicians. For example, Riemann hypothesis or P≠NP proofs would be groundbreaking not because anyone has doubts that P≠NP, but because we expect the proofs would be enlightening and would use some novel technique.

by integralid

3/30/2026 at 4:06:01 PM

There are areas of mathematics where the standard proofs are very interesting and require insight, often producing new statements, definitions, and theorems for their own sake, even though the theorems and definitions being proved are banal. For an extreme example, consider Fermat's Last Theorem.

Note, on the other hand, that proving standard properties of many computer programs is frequently just tedious and should be automated.

by jhanschoo

3/30/2026 at 11:46:30 PM

Yes, but >90% of the proof work to be done is not that interesting, insightful stuff. It is rather pattern matching from existing proofs to find what works for the proof you are currently working on.

If you've ever worked on a proof for formal verification, then it's... work... and the nature of the proof most probably is not going to be something new and interesting for other people to read about; it is just work that you have to do.

by seanmcdirmid

3/30/2026 at 3:58:17 PM

[dead]

by 3yr-i-frew-up

3/30/2026 at 5:52:16 PM

[flagged]

by anthk

3/30/2026 at 6:38:39 PM

First of all, I think your comment is against HN guidelines.

And I expect GP actually has a lot of experience in mathematics. They are exactly right, and this is how professional mathematicians see math (at least most of them, including the ones I interact with).

by integralid

3/30/2026 at 9:17:22 PM

Engineers, maybe. Not the case with Mathematicians.

by anthk

3/30/2026 at 3:28:35 PM

There are still many major oversimplifications in the core of math, which make its correspondence with the real world weird. For example, if you want to model human reasoning, you need to step away from binary logic with its "weird" material implication: a neat shortcut that allows math to be formalized but doesn't map well to actual reasoning. You might then find out that, e.g., medicine uses counterfactuals instead of material implication. Logics that tried to make implication more "reasonable", like relevance logic, are too weak to allow the formalization of math. So you either decide to treat material implication as correct (getting the incompleteness theorem in the end), making you sound autistic among other humans, or you can't really do rigorous math.

by storus

3/30/2026 at 8:27:52 PM

People keep getting hung up on material implication, but I cannot understand why. It's more than an encoding hack: falsity (i.e., the atomic logical statement equivalent to 0=1) indicates that a particular case is unreachable, and falsity elimination (aka "from falsity follows everything") expresses that you have reached such a case as part of the case distinctions happening in every proof.

Or, more poetically, "if my grandmother had wheels she would have been a bike"[1] is folk wisdom precisely because it makes so much sense.

1: https://www.youtube.com/watch?v=A-RfHC91Ewc
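A sketch of this in Lean 4 (illustrative, using only core Lean; the examples are mine, not from the comment):

```lean
-- "From falsity follows everything": a proof of False discharges any goal,
-- marking the current case as unreachable.
example (P : Prop) (h : False) : P :=
  h.elim

-- A vacuously true material implication: the hypothesis 0 = 1 can never
-- hold, so the implication is provable without ever "reaching" 1 = 2.
example : 0 = 1 → 1 = 2 := by
  intro h
  exact absurd h (by decide)
```

The second example is exactly the grandmother-with-wheels pattern: the implication holds because its antecedent is refutable.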

by jojomodding

3/30/2026 at 10:27:30 PM

Material implication was not the default implication historically; it came as a useful hack from people who hoped that, by enforcing it, they could formalize the whole of math and knowledge, have a sort of "single source of truth" for any statement, and evaluate all statements purely syntactically. This proved futile, as the incompleteness theorem showed, a result material implication directly enabled by admitting self-referential nonsense as valid statements. There were many attempts to reconcile this with different logics, but they all ended up weaker and unable to formalize all statements. We are now entering the next phase of this attempt, using hugely complex reasoning-function approximators as our "single source of truth" in the form of AI/LLMs.

I used to do a lot of proofs, working all the way up from Peano arithmetic, successor operators, and the first-order tableaux method.

by storus

3/30/2026 at 3:19:58 PM

The thing is, if something is proved by automatically checking a million different cases, it's hard to carry any lessons from it over to other proofs.

by YetAnotherNick

3/30/2026 at 5:35:52 PM

A few comments:

(1) Math journals are being flooded with AI slop papers loaded with errors. I can see a time when they will require papers to be accompanied by formal proofs of the results. This will enable much of the slop to be filtered out.

(2) Formalization enables AI to do extensive search while staying grounded.

(3) Formalization of the historical math literature (about 3.5M papers) will allow all those results to become available for training and mining, to a greater extent than if they're just given as plain text input to LLMs.

by pfdietz

3/30/2026 at 2:35:21 PM

I’m confused by the calculus example, and I’m hoping someone here can clarify why one can’t state the needed assumptions for a roughed-out theory that still need to be proven. That is, I’m curious whether the critical concern the article is highlighting is the requirement to “prove all assumptions before use”, or instead the idea that sometimes we can’t even identify the blind spots as assumptions in a theory before we use it.

by johnbender

3/30/2026 at 2:56:15 PM

In calculus the core issue is that the concept of a "function" was undefined but generally understood to be something like what we'd call today an "expression" in a programming language. So, for example, "x^2 + 1" was widely agreed to be a function, but "if x < 0 then x else 0" was controversial. What's nice about the "function as expression" idea is that generally speaking these functions are continuous, analytic [1], etc and the set of such functions is closed under differentiation and integration [2]. There's a good chance that if you took AP Calculus you basically learned this definition.

The formal definition of "function" is totally different! This is typically a big confusion in Calculus 2 or 3! Today, a function is defined as literally any input→output mapping, and the "rule" by which this mapping is defined is irrelevant. This definition is much worse for basic calculus—most mappings are not continuous or differentiable. But it has benefits for more advanced calculus; the initial application was Fourier series. And it is generally much easier to formalize because it is "canonical" in a certain sense, it doesn't depend on questions like "which exact expressions are allowed".
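To make the contrast concrete, a small Python sketch (the names and values here are illustrative):

```python
# "Function as expression": the rule is a formula.
def f(x):
    return x**2 + 1

# "Function as arbitrary mapping": under the modern definition, any
# input→output table counts as a function; no formula is required.
g = {0: 17, 1: -3, 2: 0}

# The once-controversial piecewise case is just another mapping.
def h(x):
    return x if x < 0 else 0
```

Under the modern definition, `g` is just as much a function on {0, 1, 2} as `f` is on the reals, even though it has no "rule" at all.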

This is exactly what the article is complaining about. The non-rigorous intuition preferred for basic calculus and the non-rigorous intuition required for more advanced calculus are different. If you formalize, you'll end up with one rigorous definition, which necessarily will have to incorporate a lot of complexity required for advanced calculus but confusing to beginners.

Programming languages are like this too. Compare C and Python. Some things must be written in C, but most things can be more easily written in Python. If the whole development must be one language, the more basic code will suffer. In programming we fix this by developing software as assemblages of different programs written in different languages, but mechanisms for this kind of modularity in formal systems are still under-studied and, today, come with significant untrusted pieces or annoying boilerplate, so this solution isn't yet available.

[1] Later it was discovered that in fact this set isn't analytic, but that wasn't known for a long time.

[2] I am being imprecise; integrating and solving various differential equations often yields functions that are nice but aren't defined by combinations of named functions. The solution at the time was to name these new discovered functions.

by pavpanchekha

3/30/2026 at 8:19:50 PM

> If you formalize, you'll end up with one rigorous definition

Can't you just formalize both definitions and pick the one to work with based on what you want to do? Surely the only obstacle here is the time and effort it takes to write the formalization?

Or, alternatively, just because you've formalized the advanced calculus version doesn't mean you need to use the formalization when teaching basic calculus. The way we've proven something and the way we teach that something don't have to be the same.

by lilyball

3/31/2026 at 7:26:05 AM

> the concept of a "function" was undefined but generally understood to be something like what we'd call today an "expression" in a programming language. So, for example, "x^2 + 1" was widely agreed to be a function, but "if x < 0 then x else 0" was controversial

Good answer, but not the best example. In many programming languages, the latter is easily written as an expression:

   (x - abs(x)) / 2
It boils down to what you call an elementary function (https://en.wikipedia.org/wiki/Elementary_function).
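A quick Python sketch checking that the expression above agrees with the piecewise definition (a spot check on a few values, not a proof):

```python
def piecewise(x):
    # "if x < 0 then x else 0"
    return x if x < 0 else 0

def as_expression(x):
    # the same function written as a single closed-form expression
    return (x - abs(x)) / 2

# spot-check agreement on negative, zero, and positive inputs
for x in [-5, -0.5, 0, 0.5, 5]:
    assert piecewise(x) == as_expression(x)
```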

Taking the absolute value of a number is generally not assumed to be in that set, but there is no generally accepted strict definition.

A step ‘up’ from elementary functions are special functions (https://en.wikipedia.org/wiki/Special_functions). Likewise, that is loosely defined.

For example https://en.wikipedia.org/wiki/List_of_eponyms_of_special_fun... mentions lots of polynomials, one of them https://en.wikipedia.org/wiki/Cyclotomic_polynomial, which definitely are elementary according to https://en.wikipedia.org/wiki/Elementary_function.

Wikipedia also contradicts itself in https://en.wikipedia.org/wiki/Closed-form_expression, where it says

“a closed form expression or formula is one that is formed with constants, variables, and a set of functions considered as basic and connected by arithmetic operations (+, −, ×, /, and integer powers) and function composition. Commonly, the basic functions that are allowed in closed forms are nth root, exponential function, logarithm, and trigonometric functions”

and

“For example, if one adds polynomial roots to the basic functions, the functions that have a closed form are called elementary functions”

That would put the trigonometric functions in the basic set allowed in elementary functions.

by Someone

3/30/2026 at 3:05:20 PM

That's very helpful and clear, thank you

by coldcity_again

3/30/2026 at 2:38:38 PM

I think the future of Lean as a tool is mathematicians using this or similar software and having it create corresponding Lean code. [1] is an LLM that outputs Lean code given a mathematical paper. It can also reason within Lean projects and enhance or fix Lean code.

[1] https://aristotle.harmonic.fun

by zitterbewegung

3/30/2026 at 11:24:03 PM

I like the story in the article, but I think it tries to create some drama where there isn't any.

I think it's great that a lot of work is done using proof assistants, because clearly it's working out for researchers; diversity of research and of methods is a great strength of science. I really can't see how you can "push it too far"; pen-and-paper proofs are not going anywhere. And as more researchers write machine-checked proofs, new techniques for automating these proofs are invented (which is what my research is about, hehe), which will only make it easier for more researchers to join in.

> Currently, mathematicians are hoping to formalize all of mathematics using a proof assistant called Lean.

_Some_ mathematicians are trying to formalize _some_ of mathematics using a proof assistant called Lean. It's not a new development, proof assistants have been used for decades. Lean 4 has definitely been gaining popularity recently compared to others, but other proof assistants are still very popular.

> a dedicated group of Lean users is responsible for determining which definitions should go into Lean’s library

The article makes it sound like there is a single, universal "The Lean Library" that everyone is restricted to. I assume it refers to mathlib? But at the end of the day it's just code and everyone is writing their own libraries, and they can make their own decisions.

by emih

3/30/2026 at 2:07:15 PM

Rigor was never vital to mathematics. ZFC was explicitly pushed as the foundation for mathematics because Type Theory was too rigorous and demanding. I think that mathematicians now coming around to TT is a bit of funny irony lost on many. Now we just need to restore Logicism...

by ux266478

3/30/2026 at 5:34:45 PM

Rigor was always vital to mathematics. That it wasn’t vital to mathematicians is exactly why we need automated proofs.

by moi2388

3/30/2026 at 7:48:17 PM

In the long run, creating a certificate that guarantees a certain probability of correctness will take much less energy. Right now we can run Miller–Rabin and show with 1 - 1/10^100 certainty that a number is/isn't prime. Similarly for hash collisions: after a certain point these can't happen in reality. If Anthropic can get their uptime from 1 nine to 9 nines (software isn't the bottleneck for 9 nines), then we don't need formally checked proofs.
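A minimal hand-rolled Miller–Rabin sketch in Python (the function name and default round count are my own; for a composite n, each random round passes with probability at most 1/4, so k rounds bound the error by 4^-k):

```python
import random

def miller_rabin(n, k=64):
    """Probabilistic primality test: False means definitely composite,
    True means prime with error probability below 4**-k."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # write n - 1 as d * 2^r with d odd
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)  # fast modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a witnesses that n is composite
    return True
```

With k around 170, the error bound drops below 1/10^100, the kind of certainty mentioned above.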

by casey2

3/30/2026 at 5:36:07 PM

There's no such thing as being too rigorous when you're talking about proofs in math. It either proves it or it doesn't. You get as rigorous as you need to.

by dbvn

3/30/2026 at 6:11:16 PM

LLMs are not reproducible. Common Lisp, Coq, and the like for sure are.

by anthk

3/30/2026 at 3:57:27 PM

Are digitized proofs another way of saying the equivalent of a calculator, back when calculators were new?

by j45

3/30/2026 at 2:33:47 PM

Rigor is the whole point of math. The moment you start asking if there is too much of it you are solving a different problem.

by riverforest

3/30/2026 at 2:40:28 PM

Rigor is not the whole point of math. Understanding is. Rigor is a tool for producing understanding. For a further articulation of this point, see

https://arxiv.org/abs/math/9404236

by woopwoop

3/30/2026 at 3:25:04 PM

This conflates rigor with proof. A proof is the argument you are making; rigor is how carefully and correctly the argument is made. You can understand something without rigor, but you cannot prove it.

by 1970-01-01

3/30/2026 at 4:18:16 PM

> You can understand something without rigor but you cannot prove it.

I think I disagree. There are formal proofs and informal proofs, there are rigorous proofs and less rigorous proofs. Of course, a rigorous proof requires rigor, but that’s close to tautological. What makes a proof is that it convinces other people that the consequent is true. Rigor isn’t a necessary condition for that.

by layer8

3/30/2026 at 2:54:26 PM

Rigor is one solution to mutual understanding that Bourbaki came up with, and it in turn made math inaccessible to most humans: it now takes regular mathematicians over 40 years to get to the bleeding edge, often outlasting their brain's capacity to come up with revolutionary insights. It's like math was forced to run on assembly language despite there being more high-level languages available, more apt for the job.

by storus

3/30/2026 at 3:16:11 PM

> It's like math was forced to run on assembly language despite there were more high-level languages available and more apt for the job.

I'm not a mathematician, but that doesn't sound right to me. Most of the math I did in school comprises concepts many, many layers of abstraction away from its foundations. What did you mean by this?

by cbdumas

3/30/2026 at 3:23:36 PM

My math classes were theorem, lemma, proof all day long, no conceptualization, no explanation; low-level formulas down to axioms. Sink or swim, figure it out on your own or fail.

by storus

3/30/2026 at 2:36:56 PM

If rigor is the whole point, why are we so focused on classical math (e.g. classical logic) and not the wider plurality?

by meroes

3/30/2026 at 8:18:05 PM

How does that relate at all? Classical logic is not any less rigorous than other kinds of logic.

by SabrinaJewson

3/31/2026 at 9:02:04 AM

Because if rigor were all we cared about, I'd think we'd spend more time on non-classical math.

by meroes

3/30/2026 at 3:08:55 PM

It seems you have never tried to prove anything using a proof assistant. It will demand proofs for things like x<y && y<z => x<z, and while it should have that built in for natural numbers, woe betide whoever defines a new data type.
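For instance, in Lean 4 (a sketch; `MyNum` is an illustrative type, not from the comment), transitivity is already available for Nat but must be restated and proven for a new type:

```lean
-- Built in: transitivity of < on the natural numbers.
example (x y z : Nat) (h1 : x < y) (h2 : y < z) : x < z :=
  Nat.lt_trans h1 h2

-- A freshly defined wrapper type knows nothing: we must define the
-- order and prove transitivity ourselves before anything else works.
structure MyNum where
  val : Nat

instance : LT MyNum where
  lt a b := a.val < b.val

theorem MyNum.lt_trans {x y z : MyNum} (h1 : x < y) (h2 : y < z) : x < z :=
  Nat.lt_trans h1 h2
```

Here the new proof happens to reduce to the Nat one, but every such lemma (antisymmetry, totality, and so on) has to be supplied by hand or via typeclass instances.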

by gzread