2/16/2026 at 2:48:01 AM
interesting that they call out success cases as rare, that's honestly the most useful part. are people here seeing a better hit rate from tighter decomposition + verifier loops, or mostly just from more compute? also curious where failures cluster most: search, formalization, or proof checking?
by umairnadeem123
2/17/2026 at 1:41:13 AM
I think Terence Tao gave them feedback on initial drafts to emphasize this piece. Yes, roughly 4-13 meaningful Erdős problems were solved (depending on what you count), but that's out of 700 problems run through the pipeline.
by The_Gray