5/31/2026 at 11:23:51 PM
Author here. For context, I was the tech lead for the Postgres team at Cloudflare, and this came directly out of a challenge I kept hitting there: BI and dashboard teams needed to run long-running analytical queries, and the answer was always to spin up another bespoke read replica or stand up an ETL dump into an analytical database and query that.
So the question I started with was: what are the fewest components I could get away with? That led to the architecture here: Streambed connects to Postgres as a logical replication subscriber (same mechanism as a read replica) and streams WAL changes straight into Apache Iceberg on S3, queryable from psql via an embedded DuckDB. There are a lot of edge cases to handle, and it's very much early days.
Welcome any feedback.
by vira28
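To make the replication idea in the post above concrete, here is a minimal Python sketch of what applying a stream of row-level change events to a target table involves. It is an illustration under stated assumptions, not Streambed's implementation: the `Change` type, the `apply_changes` function, and the event shape are all hypothetical, and a real pipeline would decode the logical replication stream and write Iceberg data and delete files rather than update a dict.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical row-level change event (not a real wire format)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(
    table: Dict[int, Dict[str, Any]], changes: Iterable[Change]
) -> Dict[int, Dict[str, Any]]:
    """Apply changes in commit order to an in-memory table keyed by primary key."""
    for change in changes:
        if change.op in ("insert", "update"):
            if change.row is None:
                raise ValueError(f"{change.op} requires a row image")
            # Upsert: the latest row image for a key wins.
            table[change.key] = dict(change.row)
        elif change.op == "delete":
            # Tolerate replays of a delete that was already applied.
            table.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op}")
    return table


events = [
    Change("insert", 1, {"id": 1, "status": "new"}),
    Change("insert", 2, {"id": 2, "status": "new"}),
    Change("update", 1, {"id": 1, "status": "paid"}),
    Change("delete", 2),
]
state = apply_changes({}, events)
print(state)  # {1: {'id': 1, 'status': 'paid'}}
```

Keying the state by primary key keeps replays harmless in this sketch: reapplying the same insert, update, or delete leaves the table unchanged, which matters when a subscriber restarts and receives some changes twice.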
6/1/2026 at 1:08:48 AM
To me, being able to query over psql is secondary. I’m fine with any SQL. What is very important is being able to transform the data to better suit analytical queries. That is, define custom transformations, and define how the data is sectioned and which indices are available.
by kikimora
6/1/2026 at 12:16:08 PM
Hey vira28, thanks a lot for your work. This is a very promising project because other alternatives like supabase/etl, Kuvasz-streamer, and Sequin all have some subtle issues.
A few questions:
1) For a Supabase project, can we set up the replication slot on a replica instead of the primary? https://sequinstream.com/docs/reference/databases#using-sequ...
2) For a PlanetScale cluster, are the replication slots on the primary or the follower nodes?
I'm asking because isn't setting up slots on the primary riskier than setting them up on replicas/followers? If you have them on the primary, won't a WAL buildup take your primary down?
by saxenaabhi
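The WAL buildup concern above can be quantified. A replication slot retains WAL from its `restart_lsn` onward, and Postgres can report the retained amount with `pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)` over the `pg_replication_slots` view. The Python sketch below reproduces that arithmetic on LSN strings so a monitoring script could flag a lagging slot. The function names and the threshold are hypothetical, and the LSN values are made-up examples rather than output from a real server.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' into a byte position."""
    high, low = lsn.split("/")
    # An LSN is a 64-bit position printed as two 32-bit hexadecimal halves.
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL the server must keep for a slot at restart_lsn."""
    return parse_lsn(current_lsn) - parse_lsn(restart_lsn)


def slot_is_risky(current_lsn: str, restart_lsn: str, limit_bytes: int) -> bool:
    """True when the slot retains more WAL than the chosen limit (hypothetical policy)."""
    return retained_wal_bytes(current_lsn, restart_lsn) > limit_bytes


# Example values only: the slot is 16 MiB behind the current write position.
lag = retained_wal_bytes("1/00000000", "0/FF000000")
print(lag)  # 16777216
print(slot_is_risky("1/00000000", "0/FF000000", limit_bytes=8 * 1024 * 1024))  # True
```

On PostgreSQL 13 and later, the `max_slot_wal_keep_size` setting caps how much WAL a slot may retain, at the cost of invalidating a slot that falls too far behind.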
6/1/2026 at 5:02:54 AM
Thanks for releasing this! How do you handle DDL queries? Are table changes synchronized to the Iceberg table automatically?
Also, I recently started looking into olake[0] to serve the same purpose. What would you say differentiates Streambed?
by erikcw
6/1/2026 at 7:22:03 AM
[flagged]
by vira28
6/1/2026 at 7:36:45 AM
> streams WAL changes straight into Apache Iceberg on S3, queryable from psql via an embedded DuckDB
Why not use DuckLake instead of Apache Iceberg? Wouldn't that simplify the architecture substantially?
by kshri24
6/1/2026 at 12:18:05 AM
Just wanted to say thank you! Very relevant to our use cases. I'll report if I find any issues.
by ashtuchkin
6/1/2026 at 7:08:01 AM
You're welcome. I'd love to hear about your experience. Feel free to share it here or in the repo. It's fully open source.
by vira28
6/1/2026 at 5:18:16 AM
> queryable from psql via an embedded DuckDB.
Noob question here from someone who has only played a bit with Iceberg and Trino: what's the reason to still do the analytics inside Postgres? Is it so that you don't eat up the IOPS/bandwidth of the main PostgreSQL disks?
by raducu
6/1/2026 at 7:14:39 AM
[flagged]
by vira28
6/1/2026 at 7:48:43 AM
How does it compare to https://github.com/supabase/etl ?
by alex_hirner
6/1/2026 at 5:21:01 AM
Very cool! What would a solution for MySQL to Iceberg on S3 look like at 10,000 feet?
by iamcreasy
6/1/2026 at 7:07:08 AM
Should be fairly doable using a binlog-based producer: https://github.com/go-mysql-org/go-mysql
by vira28
6/1/2026 at 8:53:58 AM
Why are your queries slow?
by BodyCulture
6/1/2026 at 12:19:25 PM
[dead]
by keynha