alt.hn

4/6/2026 at 4:13:47 PM

Reducto releases Deep Extract

https://reducto.ai/blog/reducto-deep-extract-agent

by raunakchowdhuri

4/6/2026 at 8:21:15 PM

We used Reducto and it struggled with long documents. Since we process financial documents running 300+ pages, we switched to Gemini 3 Flash, which is producing high-accuracy extracts super fast.

by aleks5678

4/6/2026 at 11:55:42 PM

We've made a lot of changes in the past few months that make our standard extract much, much better, as well as Deep Extract for documents even longer than that. We'd love for you to give it a try!

by raunakchowdhuri

4/6/2026 at 6:08:47 PM

Any learnings from deploying agents at such massive scale?

by willwjack

4/7/2026 at 12:04:29 AM

The big one is that LLMs get lazy on repetitive tasks. They'll skip rows or consolidate entries instead of grinding through every last one. So you need verify-and-re-extract loops rather than single-pass processing. Breaking work into sub-agent chunks with explicit correctness criteria defined upfront (e.g., "line items must sum to the stated total") lets the system self-verify autonomously. At scale (28M+ fields), this approach actually outperformed expert human labelers!

by raunakchowdhuri

4/6/2026 at 5:21:54 PM

How does this compare to DataLab (https://www.datalab.to/)?

by skadamat

4/6/2026 at 5:59:49 PM

We're releasing an open dataset of challenging structured extraction tasks soon, as a starting point for people to do comparisons!

vikp and the Datalab team have done great work in this space, but their structured extraction product is closer to our baseline /extract API, since both are single-pass extractions.

Deep Extract is more accurate than any structured extraction product we've tried, but the approach comes with a clear cost/latency tradeoff over a single-pass extraction. We have free credits if you'd like to do a side-by-side comparison.

by adit_a

4/7/2026 at 12:44:42 PM

Irud

by nbnn