alt.hn

3/4/2026 at 6:17:10 AM

Better JIT for Postgres

https://github.com/vladich/pg_jitter

by vladich

3/4/2026 at 7:48:50 AM

> However, standard LLVM-based JIT is notoriously slow at compilation. When it takes tens to hundreds of milliseconds, it may be suitable only for very heavy, OLAP-style queries, in some cases.

I don't know anything here, but this seems like a good case for ahead-of-time compilation? Or at least caching your JIT results? I can imagine that much of the time, you are getting more or less the same query again and again?

by eru

3/4/2026 at 8:17:31 AM

Yes.

Some years ago we ported some code from querying out the data and tallying in Python (how many are in each bucket) to using SQL to do that. It didn't speed up the execution. I was surprised by that, but I guess the Postgres interpreter is roughly the same speed as Python, which when you think about it perhaps isn't that surprising.

But Python is truly general purpose while the core query stuff in SQL is really specialized (we were not using stored procedures). So if Pypy can get 5x speedup, it seems to me that it should be possible to get the same kind of speed up in Postgres. I guess it needs funding and someone as smart as the Pypy people.

by olau

3/4/2026 at 10:58:11 PM

That's curious. I regularly get speedups when moving processing from Python to Postgres. At least when using indices properly and when the shift reduces the amount of data carried back and forth.

by eru

3/4/2026 at 9:33:42 AM

At some level the application needs to participate in the performance conversation too.

https://www.postgresql.org/docs/current/sql-prepare.html

by bob1029

3/4/2026 at 12:40:06 PM

Postgres’s PREPARE is per-connection so it’s pretty limited, and then connection poolers enter the fray and often can’t track SQL-level prepares.
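The pooler problem can be shown with a toy simulation (all names here are invented for illustration, not a real pooler API): a transaction-mode pooler routes each statement to whatever server connection is free, so an EXECUTE can land on a backend that never saw the PREPARE.

```python
class Backend:
    """A server connection with its own per-session prepared statements."""
    def __init__(self):
        self.prepared = {}

    def run(self, msg, name, sql=None):
        if msg == "PREPARE":
            self.prepared[name] = sql
            return "PREPARE"
        if msg == "EXECUTE":
            if name not in self.prepared:
                # The EXECUTE was routed to a backend that never saw
                # the PREPARE, so the statement doesn't exist there.
                return f'ERROR: prepared statement "{name}" does not exist'
            return "rows"

class TransactionPooler:
    """Toy transaction-mode pooler: each statement may be routed to a
    different server connection (round-robin here for simplicity)."""
    def __init__(self, size):
        self.backends = [Backend() for _ in range(size)]
        self.next = 0

    def send(self, msg, name, sql=None):
        backend = self.backends[self.next % len(self.backends)]
        self.next += 1
        return backend.run(msg, name, sql)

pool = TransactionPooler(2)
pool.send("PREPARE", "q1", "SELECT * FROM t WHERE id = $1")  # lands on backend 0
result = pool.send("EXECUTE", "q1")                          # lands on backend 1
```

Here `result` is the error, which is why poolers either have to track and replay prepares per backend or tell users to avoid SQL-level PREPARE in transaction mode.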

And then the issue is not dissimilar to Postgres’s planner issues.

by masklinn

3/4/2026 at 8:39:37 PM

Oracle's wasn't, but I haven't used it in a very long time so that may no longer be true.

The problem though was that it had a single shared pool for all queries, and it could only run a query if it was in the pool, which is how our DB machine would max out at 50% CPU and bandwidth. We had made some mistakes in our search code that I told the engineer not to make.

by hinkley

3/4/2026 at 10:45:54 AM

If you cache query plans like other RDBMSes do, then the need for the client to manually manage that goes away, and it's not limited to a single connection.

MS SQL still has prepared statements, but they really haven't been needed in 20 years, since it gained the ability to cache plans based on statement text.

by SigmundA

3/4/2026 at 9:54:25 AM

We have everything optimized, and yet somehow DB queries need to be "interpreted" at runtime. There's no reason for DB queries to not be precompiled.

by sourcegrift

3/4/2026 at 3:18:22 PM

The "byte-code" coming from the query planner typically only has a handful of steps in a linear sequence. Joins, filters, and such. But the individual steps can be very costly.

So there is not much to gain from JITing the query plan execution only.

JITing begins to make more sense when the individual query plan steps (join, filter, ...) can themselves be specialized/recompiled/improved/merged using knowledge of the surrounding query plan.
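The difference can be sketched with made-up operators (not Postgres executor code): interpreted execution applies each generic step in sequence, materializing an intermediate per step, while a plan-aware "compiled" version fuses filter and projection into one specialized pass.

```python
# Interpreted style: each plan step is a generic operator applied in
# sequence, producing a full intermediate result per step.
def run_interpreted(rows, steps):
    for step in steps:
        rows = step(rows)
    return rows

# "JIT" style: knowing the whole (tiny) plan up front lets us merge the
# filter and the projection into a single specialized loop, with no
# materialized intermediate.
def compile_fused(predicate, project):
    def fused(rows):
        return [project(r) for r in rows if predicate(r)]
    return fused

rows = list(range(10))
steps = [
    lambda rs: [r for r in rs if r % 2 == 0],  # filter step
    lambda rs: [r * r for r in rs],            # projection step
]
fused = compile_fused(lambda r: r % 2 == 0, lambda r: r * r)
```

Both produce the same result; the win is that the fused loop is one pass and can be specialized to the concrete types and predicates of this plan.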

by jpfr

3/4/2026 at 10:45:58 AM

This is a neat idea. I want to take it further and precompile the entire DBMS binary for a specific schema.

by catlifeonmars

3/4/2026 at 12:16:28 PM

How will you handle ALTER TABLE queries without downtime?

by WJW

3/4/2026 at 12:29:15 PM

That would definitely present a bit of a challenge, but:

- not all databases need migrations (or migrations without downtime)

- alternatively, ship the migrations as part of the binary

Ad-hoc modifications would still be more difficult, but tbh that's not necessarily a bug

by catlifeonmars

3/4/2026 at 12:16:19 PM

Many SQL engines have JIT compilers.

The problems related to PostgreSQL are pretty much all described here. It's very difficult to do low-latency queries if you cannot cache the compiled code and have to recompile it over and over again. And once your JIT is slow, you need logic to decide whether to interpret or compile.

I think it would be best to start interpreting the query and kick off compilation in another thread; once compilation is finished and the interpreter is still running, stop the interpreter and run the JIT-compiled code. This would give you the best latency, because there would be no waiting for the JIT compiler.
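A minimal sketch of that idea in plain Python (illustrative names; a real engine would switch at a safe point and hand interpreter state over to the compiled code):

```python
import threading

def run_tiered(interpret_batch, compile_query):
    """Interpret batch-by-batch while JIT compilation runs in a background
    thread; as soon as the compiled code is ready, hand it the rest of the
    work. interpret_batch(n) returns a batch of rows or None when done;
    compile_query() returns a function taking the next batch number."""
    ready = threading.Event()
    slot = {}

    def worker():
        slot["fn"] = compile_query()  # the expensive JIT compilation
        ready.set()

    threading.Thread(target=worker, daemon=True).start()

    results, batch = [], 0
    while True:
        if ready.is_set():
            results.extend(slot["fn"](batch))  # switch to compiled code
            return results, "jit"
        out = interpret_batch(batch)
        if out is None:                        # interpreter finished first
            return results, "interp"
        results.extend(out)
        batch += 1
```

A fast query finishes in the interpreter before compilation completes; a long scan gets taken over mid-query by the compiled code, so neither case waits on the compiler.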

by Asm2D

3/4/2026 at 1:05:55 PM

> It's very difficult to do low-latency queries if you cannot cache the compiled code

This is not too difficult, it just requires a different execution style. Salesforce's Hyper for example very heavily relies on JIT compilation, as does Umbra [1], which some people regard as one of the fastest databases right now. Umbra doesn't cache any IR or compiled code and still has an extremely low start-up latency; an interpreter exists but is practically never used.

Postgres is very robust and very powerful, but simply not designed for fast execution of queries.

Disclosure: I work in the group that develops Umbra.

[1]: https://umbra-db.com/

by aengelke

3/5/2026 at 7:29:33 AM

If I recall the research papers regarding Umbra correctly, it's also using AsmJit as a JIT backend, which means that theoretically the compilation times would be comparable if you only consider code-emitting overhead.

The problem will always be queries where the compilation is orders of magnitude more expensive than the query itself - an indexed lookup of one or a few entries, for instance. Accessing indexed entries like these is already very well optimized by SQL query engines, so it possibly makes no sense to JIT-optimize them.

by Asm2D

3/5/2026 at 8:08:00 AM

Interesting... AsmJit is pretty fast at compilation, but about 3x slower than sljit. The only way I can see to make it fast enough in theory (i.e. without slowing down point-lookup queries and such) would be to fuse planning with code generation - i.e. essentially a single-pass plan builder + compiler. Not sure if Umbra tries to do that, and AsmJit is not the best choice for it anyway, but with sljit it could be on par with the interpreter even for the fastest queries, I believe. Pretty hard (likely impossible) to implement though; planning is inherently a non-linear process...

by vladich

3/5/2026 at 1:00:20 PM

Because pg_jitter uses AsmJit's Compiler, which also allocates registers. That's much more work than using hardcoded physical registers, as in SLJIT's case. There is always a cost to such comfort.

I think AsmJit's strength is the completeness of its backends, as you can emit nice SIMD code with it (like AVX-512). But the performance could of course be better, and that's achievable - making it 2x faster would be possible.

by Asm2D

3/5/2026 at 6:16:52 PM

There are other issues with that auto-allocation. I tested all 3 backends on very large queries (hundreds of KBs of code per query). Performance of all of them (+LLVM, but -sljit) was abysmal - the compiler overhead was in seconds to tens(!) of seconds. They have some non-linear components in their optimization algorithms, while sljit was scaling linearly and running almost as fast as for smaller queries. So yes, auto-allocation gives higher run-time performance, but the cost of that performance grows non-linearly with code size and complexity, while you can still have good performance with manual allocation. I also don't believe you can make AsmJit 2x faster without sacrificing that auto-allocation algorithm.

by vladich

3/5/2026 at 8:44:23 PM

AsmJit has only one place where a lot of time is spent - bin-packing. It's the least optimized part, with quadratic complexity (at the moment), which starts to show when you have hundreds of thousands of virtual registers. There is even a benchmark in AsmJit called `asmjit_bench_regalloc`, which shows that a single function of 16MB of machine code, with 65k labels and 200k virtual registers, takes 2.2 seconds to generate (and 40ms of that is the time to just call `emit()`).

If this function is optimized, or switched to some other implementation when there are tens of thousands of virtual registers, you would get orders of magnitude faster compilation.

But realistically, which query requires tens of megabytes of machine code? These are pathological cases. For comparison, we are talking about 25ms for a single function with 1MB of machine code, and sub-ms times when you generate tens of KB of machine code.

So from my perspective, the ability to generate SIMD code that the CPU can execute fast in inner loops is much more valuable than anything else. Any workload that is CPU-bound just deserves this. The question is how CPU-bound the workload is. I would imagine databases like Postgres would be more memory-bound if you are processing huge rows and accessing only a very tiny part of each row - that's why columnar databases are so popular, though of course they have different problems.

I worked on one project that tried to deal with this by using buckets and hashing, such that there would be 16 buckets and each column would go into one of them, to bring the columns closer to each other, so the query engine only needs to load the buckets used in the query. But we are talking about gigabytes of raw throughput per core in this case.

by Asm2D

3/5/2026 at 8:53:06 PM

I have a test with a 200KB query that AsmJit takes 7 seconds to compile (and that's not too bad - both LLVM and MIR take ~20s), while sljit does it in 50ms. 200KB is a pathological case, but it's not unheard of in the area I'm working in. It's realistic, although rare. Over the last 10-15 years most OLTP workloads have become CPU-bound, because the active datasets of most real databases fully fit in memory. There are exceptions, of course.

by vladich

3/5/2026 at 9:55:34 PM

That's interesting - 200KB should not be a big deal for it. Maybe it uses something that I usually don't, like many function calls, an insane number of branches, etc... I would be interested in that case, but I'm not sure whether I would be able to blindly improve AsmJit without a comprehensive test.

Definitely good to know though. When it comes to low-latency compilation my personal goal is to make it even faster when generating small functions.

by Asm2D

3/5/2026 at 7:06:51 PM

SLJIT is a bit smarter than just using hardcoded registers. It's multi-platform anyway, so it uses registers when they are available on the target platform; if not, it will use memory. That's why performance can differ between Windows and Linux on x64, for example - a different number of available registers.

by vladich

3/5/2026 at 8:48:37 PM

Indeed, but this also means that you would get drastically different performance on platforms that have more physical registers vs platforms that have fewer. For example, x86_64 only has 16 GP registers, while AArch64 has 32 - if you use 25 registers without any analysis and just spill 10 of them to the stack, the difference could be huge.

But... I consider SLJIT to be for a different use-case than AsmJit. It's more portable, but its scope is much more limited.

by Asm2D

3/5/2026 at 9:00:51 PM

It's definitely different, and for Postgres specifically, they may complement each other. SLJIT can be used for low-latency queries where codegen time is more important than optimizations, and also for other platforms like s390x / PPC / SPARC, etc. AsmJit can be used for SIMD optimizations on x86_64 and ARM64. MIR is kinda in the middle - it auto-allocates registers and doesn't support SIMD, but it's also multiplatform. The only thing that doesn't fit well here is LLVM :). It has some advantages in some edge cases, but... it really needs a separate provider; the current one is bad. I'll probably create another LLVM backend for pg_jitter in the future to utilize it properly...

by vladich

3/5/2026 at 7:03:25 PM

Good point about SIMD opportunities though - it's something the other 2 JITs lack.

by vladich

3/4/2026 at 1:10:50 PM

> I think it would be the best to start interpreting the query and start compilation in another thread

This technique is known as a "tiered JIT". It's how production virtual machines operate for high-level languages like JavaScript.

There can be many tiers, like an interpreter, baseline compiler, optimizing compiler, etc. The runtime switches into the faster tier once it becomes ready.
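A toy version of tier promotion by hotness counter (the thresholds here are invented for illustration, not taken from any real VM):

```python
class TieredFunction:
    """Promote a function through tiers once call counts cross thresholds,
    the way JS engines move from interpreter to baseline to optimizing
    compiler. Threshold values are made up."""
    BASELINE_AT = 10
    OPTIMIZE_AT = 100

    def __init__(self):
        self.calls = 0

    def call(self):
        """Return the tier that would execute this call."""
        self.calls += 1
        if self.calls >= self.OPTIMIZE_AT:
            return "optimizing"
        if self.calls >= self.BASELINE_AT:
            return "baseline"
        return "interpreter"

f = TieredFunction()
tiers = [f.call() for _ in range(100)]
```

Real VMs add back-edge counters, profiling data, and deoptimization on top, but the counter-threshold skeleton is the same.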

More info for the interested:

https://ieeexplore.ieee.org/document/10444855

by chrisaycock

3/4/2026 at 8:46:56 PM

It's also common for JITs to sprout a tier and shed a tier over time, as the last and first tiers shift in cost/benefit. If the first tier works better, you delay the other tiers. If the last tier gets faster (in compile time or in the code it produces), you engage it sooner, or strip the middle tier entirely and hand half of its budget to the last tier.

by hinkley

3/5/2026 at 7:30:09 AM

I write JITs so I know, but I always try to write in a way that even non-JIT people can understand :)

by Asm2D

3/4/2026 at 1:23:47 PM

See prepared statements.

by levkk

3/4/2026 at 5:00:11 PM

DB queries do get pre compiled and cached if you use prepared statements. This is why you should always use prepared statements if you can.

by array_key_first

3/4/2026 at 10:52:06 AM

PostgreSQL uses a process-per-connection model and has no way to serialize a query plan into some form that can be shared between processes, so the time it takes to make the plan, including JIT, is very important.

Most other DBs cache query plans, including JITed code, so plans are basically precompiled from one request to the next with the same statement.
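The statement-text keying can be sketched like this (a toy cache, not any database's actual implementation):

```python
import hashlib

class PlanCache:
    """Toy statement-text-keyed plan cache: the same SQL text (with
    parameter placeholders) reuses the cached plan across any number of
    client sessions, with no explicit PREPARE from the client."""
    def __init__(self):
        self._plans = {}
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql_text):
        key = hashlib.sha256(sql_text.encode()).hexdigest()
        plan = self._plans.get(key)
        if plan is None:
            self.misses += 1
            plan = self._expensive_plan(sql_text)  # plan + JIT once
            self._plans[key] = plan
        else:
            self.hits += 1
        return plan

    def _expensive_plan(self, sql_text):
        return f"PLAN({sql_text})"  # stand-in for planning + codegen

cache = PlanCache()
for _ in range(3):  # three "sessions" issuing the same statement
    cache.get_plan("SELECT * FROM t WHERE id = $1")
```

Only the first execution pays planning/compilation cost; the other two are cache hits regardless of which connection issued them.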

by SigmundA

3/4/2026 at 11:25:52 AM

What do you mean? Because the obvious thing is a shared cache, and if there is one thing the writers of a DB know, it is locking.

by zaphirplane

3/4/2026 at 11:42:40 AM

Sharing executable code between processes is not as easy as sharing data. AFAIK, unless something's changed recently, PG shares nothing about plans between processes and can't even share a cached plan between sessions/connections.

by SigmundA

3/4/2026 at 1:03:06 PM

Write the binary to a file, call it `libquery-id1234.so`, and link that into whichever processes need it?

by _flux

3/4/2026 at 5:06:17 PM

Won't work well if you execute 20k+ queries per second. The filesystem will be a bottleneck, among other things.

by vladich

3/4/2026 at 8:56:50 PM

You can put more than one function in one file.

by _flux

3/5/2026 at 8:02:59 AM

Sure, but not more than one query per file

by vladich

3/5/2026 at 8:15:06 AM

Hm, what is preventing you from putting more than one query into the same file?

by _flux

3/5/2026 at 12:16:11 PM

The fact that you plan and execute query by query?

by vladich

3/4/2026 at 4:08:15 PM

Might want to take a look at some research like this [1] that goes over the issues:

"This obvious drawback of the current software architecture motivates our work: sharing JIT code caches across applications. During the exploration of this idea, we have encountered several challenges. First of all, most JIT compilers leverage both runtime context and profile information to generate optimized code. The compiled code may be embedded with runtime-specific pointers, simplified through unique class-hierarchy analysis, or inlined recursively. Each of these "improvements" can decrease the shareability of JIT compiled code."

Anything's doable here with enough dev time. It would be nice if PG could just serialize the query plan itself, maybe as a .so along with non-process-specific executable code that then gets dynamically linked again in other processes.

1. https://dl.acm.org/doi/10.1145/3276494

by SigmundA

3/4/2026 at 12:08:15 PM

Executable code is literally just data that you mark as executable. The database already did the JIT compilation, and the idea that it can't then share the result between processes is incomprehensible.

I was actually confused by this submission, as it puts so much emphasis on initial compilation time, when every DB (apparently except for pgsql) caches that result and shares/reuses it until invalidation. Invalidation can occur for a wide variety of reasons (data composition changing, age, etc.), but still, the idea of redoing it on every query, when most DBs see the same queries endlessly, is insane.

by llm_nerd

3/4/2026 at 4:01:22 PM

The emphasis on compilation time there is because the JIT provider that comes with Postgres (LLVM-based) is broken in that particular area. But you're right, JITed code can be cached if some conditions are met (it's position-independent, for one). Not all JIT providers allow that, but many do. Caching is on the table, but if your JIT compilation takes microseconds, caching could be more of a burden in many cases. Still useful for some cases.

by vladich

3/4/2026 at 3:54:27 PM

No, a lot of JITed code has pointers to addresses specific to that process, which make no sense in another process.

Making code shareable between processes takes effort and will have performance tradeoffs, since it is not specialized to the process.

If the query plan were at least serializable - it's more like an AST - then at least that part could be reused, and then maybe the JITed code could be cached in memory in each process, referenced from the plan by some key.

DBs like MSSQL avoid the problem because they run a single OS process with multiple threads instead. This is also why they can handle more connections easily, since each connection is not a whole process.

by SigmundA

3/5/2026 at 4:44:17 AM

What does "specialized to the process" mean? Lots of JIT tooling these days readily supports caching and precompilation. Invalidation is hard, but things like reloading global references are hardly intractable problems, especially for an org as large as pgsql.

by patagurbon

3/5/2026 at 12:10:03 PM

Pointers to process-specific memory addresses of functions and other data structures that only exist in that process. I didn't say it was intractable, only that it takes more effort; other databases do it.

The current PG query plan and JIT are designed around just being in memory in a single process; this would need to be extracted into something not process-specific and shared between all processes. The plan itself is just a bunch of C structs, I believe.

by SigmundA

3/4/2026 at 11:37:44 AM

> and it has no way to serialize a query plan to some form that can be shared between processes

https://www.postgresql.org/docs/current/parallel-query.html

"PostgreSQL can devise query plans that can leverage multiple CPUs in order to answer queries faster."

by hans_castorp

3/4/2026 at 11:40:52 AM

Nothing to do with plan caching; that's just talking about plan execution of parallel operations. Is that thread- or process-based in PG?

If process-based, then they can already send small parts of a plan across processes.

by SigmundA

3/4/2026 at 11:46:54 AM

Ah, didn't see the caching part.

Plans for prepared statements are cached though.

by hans_castorp

3/4/2026 at 4:09:13 PM

Yes, if the client manually prepares the statement, it will be cached for just that connection (because in PG a connection is a process), but it won't survive from one connection to the next, even in the same process.

Other databases like MSSQL have prepared statements, but they are rarely used nowadays, since plan caching based on query text was introduced decades ago.

by SigmundA

3/4/2026 at 2:47:16 PM

Only on a per-connection basis

by AlisdairO

3/4/2026 at 8:29:10 AM

The last time I looked into it my impression was that disabling the JIT in PostgreSQL was the better default choice. I had a massive slowdown in some queries, and that doesn't seem to be an entirely unusual experience. It does not seem worth it to me to add such a large variability to query performance by default. The JIT seemed like something that could be useful if you benchmark the effect on your actual queries, but not as a default for everyone.

by fabian2k

3/4/2026 at 8:57:36 AM

That is quite strange, given that the big-boy RDBMSs (Oracle, SQL Server, DB2, Informix, ...) have all had JIT capabilities for several decades now.

by pjmlp

3/4/2026 at 10:43:58 AM

The big boys all cache query plans, so the amount of time it takes to compile is not really a concern.

by SigmundA

3/4/2026 at 3:50:08 PM

Postgres caches query plans too; the problem is you can only cache what you can reuse, and if your planner works well, you can reuse very little - there can be a lot of unique plans even for the same query.

by vladich

3/4/2026 at 5:01:46 PM

No, it cannot cache query plans between processes (connections), and the only way it can cache within the same process/connection is by the client manually preparing the statement. This is how the big boys did it 30 years ago - not anymore.

It was common guidance back in the day to use stored procedures for all application access code because they were cached in MSSQL (which PG doesn't even do). Then around 2000 it started caching based on statement text, and that became much less important.

You would only use prepared statements if doing a bunch of inserts in a loop or something, and they have only a very small benefit nowadays, because you're not sending the same text over the network over and over and hashing it to look up the plan.

by SigmundA

3/4/2026 at 6:18:09 PM

I didn't say it can cache between processes. The problem is not caching between processes; it's that caching itself is not very useful, because the planner creates different plans for different input parameters of the same query in the general case. So you can reliably cache plans only for the same sets of parameters. Or you can cache generic plans, which Postgres already does as well (and sharing that cache wouldn't solve much of the problem either).

by vladich

3/4/2026 at 7:14:49 PM

Other databases cache plans and have for years because it's very useful; many (most?) apps run many of the same statements with differing parameters, so it's a big win. They do this without the client having to figure out the statement-matching logic that your various PG ORMs and connection poolers try to do.

They also do things like auto-parameterization if the statement doesn't have parameters, and parameter sniffing to make multiple different plans based on different values where it makes sense.

https://learn.microsoft.com/en-us/sql/relational-databases/q...

You can also add HINTs to control this behavior if you don't like it or it's causing a problem in production. Crazy, I know.

https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-tr...

PG is extremely primitive compared to these other systems in this area, and it has to be, since it doesn't cache anything unless specifically instructed to, and then only for a single connection.

by SigmundA

3/4/2026 at 7:55:57 PM

You make some unsubstantiated claims here. I assure you that it isn't as simple as you claim, and what Postgres does here is (mostly) the right thing; you can't do much better. You simply can't decide what plan to use based on the query and its parameters alone, unless you have already cached a plan for those parameters (and even in that case you need to watch out for possible dramatic changes in statistics). Prepared statements != cached execution plans.

by vladich

3/4/2026 at 8:10:52 PM

Ah yes, so Microsoft and Oracle do these things for no good reason. You are the one making unsubstantiated claims, such as "you can't do much better" and "You simply can't decide what plan you need to use based on the query and its parameters alone" - which is mostly what those systems do (along with statistics). If you bothered to read what I linked, you would see exactly how they do it.

I never said it was simple; in fact I said how primitive PG is compared to the "big boys", because they put huge effort into making their systems fast back in the TPS wars of the early 2000s, on much slower hardware.

>Prepared statements != cached execution plans

That's exactly what a prepared statement is:

https://en.wikipedia.org/wiki/Prepared_statement

by SigmundA

3/4/2026 at 8:33:54 PM

There are reasons for that; it's useful in a very narrow set of situations. Postgres's cached plans exist for the same reason. If you're claiming Oracle and MSSQL do _much_ better in this area - that's what I call unsubstantiated. From what you write further, it's pretty clear you don't have much understanding of what happens under the hood. And no, prepared statements are not what you read on Wikipedia - not in all databases, anyway. Go read it somewhere else.

by vladich

3/4/2026 at 9:27:06 PM

>There are reasons for that, it's useful in a very narrow set of situations.

So narrow it's enabled by default for all statements in the "big boy" commercial RDBMSs...

https://www.ibm.com/docs/en/i/7.4.0?topic=overview-plan-cach...

https://docs.oracle.com/en/database/oracle/oracle-database/1...

https://learn.microsoft.com/en-us/sql/relational-databases/p...

https://help.sap.com/docs/SAP_HANA_PLATFORM/6b94445c94ae495c...

>Postgres cached plans exist for the same reason.

PostgreSQL doesn't cache plans unless the client explicitly sends commands to do so. Applications cannot take advantage of this unless they keep connections open and reuse them in a pool, and they must manage this themselves. The plan has to be planned for every separate connection/process rather than having a single cached plan, increasing server memory costs, which are plan cache size X number of connections.

It has no "reason" to cache plans; the client must do this using its own "reasons".

>If you're claiming Oracle and MSSQL do _much_ better in this area - that's what I call unsubstantiated.

You are making all sorts of claims with nary a link to back them up. Are you suggesting PG does better at planning than MSSQL, Oracle and DB2 while being constrained to replan on every single statement? The PG planner is specifically kept simple so that it is fast at its job, not thorough, or it would adversely affect execution time more than it already does; this is well documented and always a concern when new features are proposed for it.

>From what you write further it's pretty clear you don't have a lot of understanding what happens under the hood.

Sticks and stones. Is that all you have? How about something substantial.

> And no, prepared statements are not what you read in Wikipedia. Not in all databases anyway.

Ok, Mr. Unsubstantiated, are we talking about PG or not? What does one use prepared statements for in PG, hmmm - you know, the thing you call the PG plan cache? How about something besides your claim that prepared statements are not in fact plan caches? Are you talking about completely different DB systems? How about you substantiate that?

by SigmundA

3/5/2026 at 3:43:12 AM

https://www.postgresql.org/docs/current/runtime-config-query...

and then

https://www.postgresql.org/docs/current/sql-prepare.html

Read carefully about "plan_cache_mode" and how it works (and its default setting). Sorry, that's my last message in this thread, and I'm still here just for educational purposes, because what you're talking about is in fact a common misconception. If you read it carefully, you'll see that generic plans do not require any "explicit commands": Postgres executes a query 5 times in custom mode, then tries a generic one, and if it worked (not much worse than the average of the 5 custom plans), the plan is cached. You can turn it off though. And I'd recommend turning it off for most cases, because it's a pretty bad heuristic. Nevertheless, for some (pretty narrow set of) cases it's useful.

So, Mr Big Boy, now we can get to what a prepared statement in Postgres is. Prepared statements are cached in a session, but if a statement was cached in custom mode, it won't contain a plan. When Postgres receives a prepared statement in custom mode, it will just skip parsing - that's it. The query will still be planned, because custom plans rely on input parameters. If we run it in generic mode, then the plan is cached.
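The heuristic described above can be modeled roughly like this (the costs are fake numbers, and the real "not much worse" comparison in Postgres is more nuanced than a plain average check):

```python
class PreparedStatement:
    """Toy model of Postgres's default plan_cache_mode=auto behavior:
    plan the first five executions with custom (parameter-aware) plans,
    then try a generic plan; if its estimated cost is not worse than the
    average custom cost, cache and reuse it from then on."""
    def __init__(self, custom_cost_fn, generic_cost):
        self.custom_cost_fn = custom_cost_fn  # cost of a custom plan per params
        self.generic_cost = generic_cost
        self.custom_costs = []
        self.cached_generic = False

    def execute(self, params):
        """Return which kind of plan this execution uses."""
        if self.cached_generic:
            return "generic"
        if len(self.custom_costs) < 5:
            self.custom_costs.append(self.custom_cost_fn(params))
            return "custom"
        avg = sum(self.custom_costs) / len(self.custom_costs)
        if self.generic_cost <= avg:  # simplified "not much worse" check
            self.cached_generic = True
            return "generic"
        return "custom"
```

With a cheap-looking generic plan the statement flips to generic on the sixth execution and stays there; with an expensive one it keeps planning custom plans forever.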

by vladich

3/5/2026 at 4:45:47 AM

> then tries a generic one, if it worked (not much worse than an average of 5 custom plans), the plan is cached

Seems like it's not great at detecting this in all cases [1]. That said, I note that this was reproduced on PG16; perhaps they've made improvements since, given the documentation explicitly mentions what you said.

[1]: https://www.michal-drozd.com/en/blog/postgresql-prepared-sta...

by magicalhippo

3/5/2026 at 6:45:01 AM

That's exactly what I said above - just turn this thing off. The reason is that even if your generic plan is better than the 5 custom plans before it, that doesn't guarantee much. With probability high enough to cause trouble, it's just a coincidence, and generic plans in general tend to be very bad (because they use some hardcoded constants instead of statistics for planning).

This behavior is often a source of random latency spikes, when your queries suddenly start misbehaving, and then suddenly stop doing it. If you don't have auto_explain on, it will look like mysterious glitches in production.

The few cases where they are useful are very simple ones, like single-table selects by index. Those are already fast, and with generic plans you can cut planning time completely. Which is kinda... not much. There are more complicated cases where they are useful, involving Postgres forks like AWS Aurora, which has a query plan management subsystem allowing plans to be stored directly. Then you can cut planning time for them. But that's a completely different story.
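Why generic plans go wrong on skewed data can be shown with a toy cost model (both the 0.005 default selectivity and the 1% index-scan threshold are invented for illustration):

```python
def choose_plan(selectivity):
    """An index scan only pays off for selective predicates."""
    return "index_scan" if selectivity < 0.01 else "seq_scan"

# Custom plan: the planner sees the actual parameter value and uses the
# column statistics for that value.
def custom_plan(value_selectivity):
    return choose_plan(value_selectivity)

# Generic plan: no parameter value is available, so a hardcoded default
# selectivity stands in for real statistics.
DEFAULT_SELECTIVITY = 0.005
def generic_plan():
    return choose_plan(DEFAULT_SELECTIVITY)
```

On a skewed column, a value matching 0.1% of the rows and one matching 60% get different custom plans, while the generic plan always picks the index scan and misbehaves badly on the common value.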

by vladich

3/5/2026 at 11:15:51 AM

I think you should read carefully: this only applies to prepared statements within the same session, which is exactly what I have been saying. There is no global cache, and if you reset the session, it's gone.

This controls whether prepared statements even use a cached plan at all. Other databases can do this with hints, and they can skip parsing by using stored procedures, which are basically globally named prepared statements that the client can call without preparing a temporary one. Or they can use prepared statements, but again this is typically a waste of time, because parsing just enough to match existing plans is fast (soft vs hard parse, in Oracle speak). They have many more options, with more powerful caching abilities that all clients can share across sessions.

The only time PG "automatically" caches a plan is when it implicitly prepares the plan within a PL/pgSQL statement, like doing an insert loop inside a function, and it's still only for the current session. This is just part of the planning process in other databases, which cache everything all the time, globally.

You don't seem to understand that most other commercial "big-boy" RDBMSs cache plans across sessions, that nothing has to be done for plans to be reused between completely different connections with differing parameters, and that they can still keep specialized versions based on those parameter values instead of a single generic plan.

At least now you admit prepared statements are in fact a plan cache, contradicting your other statements, and you seem to make a gotcha out of an option to disable that cache.

You can see various discussions on pgsql-hackers; here is one where the submitter confirms everything I have said and attempted to add the automatic part (but not tackle the much harder sharing-between-sessions part) and was shot down. I don't believe much has changed in PG around plan caching since this post, and it even has a guy who worked on DB2 talking about how they did it: https://www.postgresql.org/message-id/flat/8e76d8fc-8b8c-14b...

by SigmundA

3/5/2026 at 12:25:19 PM

Sure, but that's not the main issue. If you add a global cache, it will have only marginal value. There are Postgres extensions/forks with a global cache, and they are not wildly more efficient. The main issue you still do not understand is that for different parameters you _need_ different plans, and caching doesn't help with that. It can help with parsing, sure - but parsing is very fast, relative to planning. And you keep conflating "prepared" statements with plan caching. Ok.

by vladich

3/5/2026 at 1:28:45 PM

>If you add a global cache, it will have only a marginal value

Please substantiate this. Again, all other major commercial RDBMSs do this and have invested a lot of effort and money into these systems; they would not do something that has marginal value.

Again, I went through the era of needing to manually prepare queries in client code, when it was the only choice, as it is now in PG. It was not a marginal improvement when automatic global caching became available; it was objectively measurable via industry-standard benchmarks.

You can also find other posts complaining about prepared-statement cache memory usage, especially when libraries and poolers auto-prepare: the cache is repeated for every connection, so 100 connections equals 100x the cache size. Another advantage of a shared cache; this is obvious.

I will leave you with a quote from Bruce Momjian, you know, one of the founding members of the PG dev team, in the thread I linked, which you didn't seem to read, just like the other links I gave you:

"I think everyone agrees on the Desirability of the feature, but the Design is the tricky part."

>The main issue you still do not understand is for different parameters you _need_ different plans, and caching doesn't help with that.

You still don't seem to be grasping what other, more advanced systems do here, and again you don't seem to be reading any of the existing literature I am giving you. These systems will make different plans if they detect it's necessary; they keep MULTIPLE cached plans of the same statement, and you can examine their caches and see stats on their usage.

These systems also have hints that let you disable caching, force a single generic plan, or tell the optimizer how to evaluate specific parameters (unknown values, specific hard-coded values, etc.) if you want to override their default behavior, which uses statistics and heuristics to determine which plan to use.

Please, I beg you, read what a modern commercial DB can do here and stop saying it doesn't help or can't be done. Here is a direct link: https://learn.microsoft.com/en-us/sql/relational-databases/p...

>And you keep conflating "prepared" statements with plan caching.

Again, we are talking about PG, and the only way PG caches a plan is via prepared statements. In PG, prepared statements and plan caching are the same thing; there is no other choice.

From your own link trying to gotcha me on PG's plan-caching config, the first sentence of plan_cache_mode reads: "Prepared statements (either explicitly prepared or implicitly generated, for example by PL/pgSQL) can be executed using custom or generic plans."

The only other things a prepared statement does are skip parsing, which is another part of caching, and reduce network traffic from client to server. These things can be done with stored procedures in systems that have global caches shared across all connections, and those systems still support the now very rare case of actually using a prepared statement; it's almost vestigial nowadays.

Here is Microsoft's guidance on prepared statements in MSSQL nowadays:

"In SQL Server, the prepare/execute model has no significant performance advantage over direct execution, because of the way SQL Server reuses execution plans. SQL Server has efficient algorithms for matching current Transact-SQL statements with execution plans that are generated for prior executions of the same Transact-SQL statement. If an application executes a Transact-SQL statement with parameter markers multiple times, SQL Server will reuse the execution plan from the first execution for the second and subsequent executions (unless the plan ages from the plan cache)."

https://learn.microsoft.com/en-us/sql/relational-databases/q...

by SigmundA

3/5/2026 at 6:12:37 PM

If you think I'm trying to "gotcha" you, you're mistaken. I'm past the point of caring about that. It was simply an (apparently failed) education opportunity. Be well.

by vladich

3/5/2026 at 7:27:33 PM

>So, Mr Big Boy, now we can get to what a prepared statement in Postgres is.

Yeah, not a gotcha at all, mr teacher. I think you should stop posting low-effort responses and examine your own opportunities for education that may have been missed here.

Let's get this straight: prepared statements supposedly shouldn't be conflated with caching, yet the only way to cache a plan and avoid a full parse is to use a prepared statement. That is by far the biggest reason to use them, and it's why many poolers and libraries try to prepare statements.

Do you realize how ridiculous this is? Here is PG's own documentation on the purpose of preparing:

"Prepared statements potentially have the largest performance advantage when a single session is being used to execute a large number of similar statements. The performance difference will be particularly significant if the statements are complex to plan or rewrite"

"Although the main point of a prepared statement is to avoid repeated parse analysis and planning of the statement, PostgreSQL will force re-analysis and re-planning of the statement before using it whenever database objects used in the statement have undergone definitional (DDL) changes or their planner statistics have been updated since the previous use of the prepared statement."

The MAIN POINT of preparing is what I am conflating with it, yes...

If PG cached plans automatically and globally, then settings like constraint_exclusion and enable_partition_pruning would not need to exist, or could simply always be on, because the added overhead of these optimizations during planning would be meaningless.

Seriously, this whole thread is Brandolini's law in action. You obviously can't articulate how PG is better off without a global plan cache, and you act like I don't know how PG works? Get real, buddy.

Are you going to post another couple sentences with no content or are you done here?

by SigmundA

3/5/2026 at 7:46:24 PM

You can't get a plan cache without a prepared statement, but you can get a prepared statement without a plan cache. It's not the same thing, and in most cases in Postgres prepared statements _do_not_ give you plan caching, because they are executed with custom plans. "Custom plan" is a misnomer: having a "custom plan" means the query is replanned on each execution. It's a common misconception; even a sizeable portion of articles you can find on the internet miss this. But if you have good reading comprehension, you can read, and possibly understand, this:

> A prepared statement can be executed with either a generic plan or a custom plan. A generic plan is the same across all executions, while a custom plan is generated for a specific execution using the parameter values given in that call.

here https://www.postgresql.org/docs/current/sql-prepare.html

You're also mixing up parsing and planning for some reason. Query parsing costs something like 1/100 of planning; it's not nothing, but pretty close to it.
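The custom-vs-generic distinction described above is directly observable in psql; a sketch with a hypothetical table:

```sql
PREPARE q(int) AS SELECT * FROM accounts WHERE id = $1;

SET plan_cache_mode = force_custom_plan;
EXPLAIN EXECUTE q(7);   -- filter shows the literal 7: replanned for this call

SET plan_cache_mode = force_generic_plan;
EXPLAIN EXECUTE q(7);   -- filter shows $1: one reusable plan for all parameter values
```

When the filter condition prints the actual value, you got a custom (per-execution) plan; when it prints `$1`, you are looking at the cached generic plan.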

Even though you're just a rude nobody, this may still be useful for others who read this stupid conversation...

by vladich

3/5/2026 at 9:09:05 PM

>You can't get a plan cache without a prepared statement, but you can get a prepared statement without a plan cache.

What is the purpose of a prepared statement without a plan cache? I thought parsing was a non-issue? All that's left is a little extra network-traffic savings.

I will, for a second time, quote the PG documentation (which you linked, by the way) on what the MAIN POINT of a prepared statement is according to the maintainers. I am not sure why I have to repeat this again:

"Although the main point of a prepared statement is to avoid repeated parse analysis and planning of the statement, PostgreSQL will force re-analysis and re-planning of the statement before using it whenever database objects used in the statement have undergone definitional (DDL) changes or their planner statistics have been updated since the previous use of the prepared statement.”

I am not sure what point you are trying to make other than worming your way out of your previous statements. Prepared statements are in fact plan caches, and that is their MAIN purpose according to PG's own documentation. You haven't given any other purpose for their existence; I gave the other two, one of which you dismissed, and the third is not even listed in the PG docs and is also minor.

> It's not the same thing, and in most cases in Postgres prepared statements _do_not_ give you plan caching, because they are created for custom plans.

The default setting is auto, which will cache the plan if the generic plan's cost is similar to a custom one, based on the five-run heuristic. That is going to be most of the time for repeated simple statements, which make up the bulk of application queries, and it's why other databases do this all the time, globally, without anyone calling prepare. It is a large savings. I'm not sure why you think this would not occur regularly, and if you have any data to back that up I'm sure everyone would like to see it, since it would upset conventional thinking in the other major commercial RDBMSs, built on hard-won gains over many years.
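The five-run heuristic mentioned above can be watched in action: on PostgreSQL 14+ the `pg_prepared_statements` view exposes per-statement counters. A sketch (table name hypothetical):

```sql
PREPARE q(int) AS SELECT count(*) FROM orders WHERE customer_id = $1;

-- the first five executions are planned with the actual parameter value
EXECUTE q(1); EXECUTE q(2); EXECUTE q(3); EXECUTE q(4); EXECUTE q(5);
EXECUTE q(6);  -- from here, the planner may switch to the cached generic plan

SELECT name, generic_plans, custom_plans
FROM pg_prepared_statements;
```

If the generic plan's estimated cost is competitive, `generic_plans` starts climbing after the fifth execution; if not, `custom_plans` keeps growing and every call is replanned.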

>You're also mixing up parsing and planning for some reason.

No, I am not. You are obviously not comprehending what I said and cannot read the documentation I quoted, which I had to repeat a second time here. I am not sure why you think I am mixing them up; I was only trying to be gracious and include the other benefit of prepared statements, one of the two that are left if the plan isn't cached: avoiding parsing, which, yes, has a smaller impact. The third has even less.

Also, not everyone shares PG terminology. Oracle refers to what you call parsing as a soft parse (syntax check, semantic check) and to parsing plus planning as a hard parse (rewrite, optimizing, row source generation). You obviously have little experience outside of PG and seem to have a myopic view of what is possible in RDBMS systems and how these terms are used.

>Query parsing costs like 1/100 of planning, it's not nothing, but pretty close to it.

Again, what is the point of a prepared statement if skipping parsing is meaningless and planning is not THE MAIN POINT?

>Even though you're just a rude nobody, it still may be useful for others, who may read this stupid conversation…

Further ad hominem, and you call me rude? Who are you to say this? How about you step off your high horse and learn something, mr superior somebody. I was trying to debate in good faith and you insult me with zero substance. Yeah, this is a stupid conversation...

by SigmundA

3/4/2026 at 1:08:39 PM

That's not generally correct. Compile-time is a concern for several databases.

by aengelke

3/4/2026 at 3:05:59 PM

Most systems submit many of the same queries over and over again.

Ad-hoc one-off queries can usually accept a higher up-front compile cost, because the actual results usually take much longer to produce anyway, versus worrying about an extra 100 ms of compile time.

Maybe it was too strong to say it's not a concern at all, but it's nothing like PG, where every single request needs to replan and potentially JIT unless the client manually prepares and keeps the connection open.

by SigmundA

3/4/2026 at 8:51:13 PM

I'm always surprised to learn LLVM is so slow, given that fast compilation was one of the original motivations for developing it. I don't know whether that's down to feature creep or to its intrinsic complexity being higher than people presumed was the case for GCC.

by hinkley

3/5/2026 at 8:51:40 AM

It's a compiler backend for programming languages, not a runtime JIT compiler. Especially inside a DBMS, a lot of the assumptions it was built with don't hold. Some people in the DBMS world (mostly at TUM, with Umbra/CedarDB) have written their own, and others have tried multi-pass approaches: an interpreter first, then a more optimized LLVM pass later.

by Tanjreeve

3/5/2026 at 10:39:58 PM

It was intended to solve the problem of interactive coding sessions, such as with language servers, which GCC utterly fails at (because what we think of as modern IDEs did not exist in 1990).

An awful lot of people have tried to use it as a JIT now and had to backpedal. I'm not sure how the one led to the other, but here we are.

by hinkley

3/4/2026 at 11:29:44 AM

What sort of things are people doing in their SQL queries that make them CPU bound? Admittedly I'm a meat-and-potatoes guy, but I like mine I/O bound.

Really amazed to see not one but several generic JIT frameworks though, no idea that was a thing.

by the_biot

3/4/2026 at 3:53:55 PM

Most databases in practice are sub-terabyte, even sub-100 GB, so their active dataset is almost fully cached. For most databases I've worked with, the cache hit rate is above 95%, and for almost all of them it's above 90%. In that situation, most queries are CPU-bound. It's completely different from typical OLAP in this sense.
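The claimed hit rates are easy to check on your own database from the built-in statistics views; a minimal sketch:

```sql
-- buffer-cache hit rate for the current database, as a percentage
SELECT datname,
       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS cache_hit_pct
FROM pg_stat_database
WHERE datname = current_database();
```

Note this only counts Postgres's own shared buffers; a "read" may still be served from the OS page cache, so the true fraction of queries avoiding disk I/O is usually even higher than this figure.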

by vladich

3/4/2026 at 11:35:17 AM

Anything jsonb in my experience is quickly CPU bound...

by martinald

3/4/2026 at 4:26:17 PM

Definitely. If you're doing regular queries with filters on jsonb columns, having the index directly on the JSON paths is really powerful. If I have a jsonb filter in the codebase at all, it probably needs an index, unless I know the result set is already very small.
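A sketch of the kind of index meant here, using an expression index on a JSON path (table, column, and key names are hypothetical):

```sql
-- btree expression index on one extracted text field
CREATE INDEX idx_events_type ON events ((payload->>'type'));

-- this filter can now use the index instead of scanning and parsing every jsonb value
SELECT * FROM events WHERE payload->>'type' = 'signup';
```

For containment-style queries (`payload @> '{"type": "signup"}'`) a GIN index on the whole column is the usual alternative; the expression index above is cheaper when you only ever filter on a few known paths.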

by jjice

3/4/2026 at 5:26:11 PM

Yeah, the other problem is that I've really struggled to get Postgres to use multiple threads/cores on one query. It often maxes out one CPU thread while dozens go unused. I constantly have to fight loads of defaults to change this, and even then I never feel like I get it working quite right (probably operator error to some extent).

This compares to ClickHouse, which constantly uses the whole machine. Obviously it's easier to do that in a columnar database, but Postgres seems actively designed _not_ to saturate multiple cores, which may have been a good assumption in the past but definitely isn't one now, IMO.
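For anyone fighting the same defaults, these are the usual settings that gate parallel query; values here are illustrative, not recommendations:

```sql
-- per-session sketch; set in postgresql.conf to apply globally
SET max_parallel_workers_per_gather = 4;     -- default 2; 0 disables parallelism
SET parallel_setup_cost = 100;               -- default 1000; lower = parallelize sooner
SET parallel_tuple_cost = 0.01;              -- default 0.1
SET min_parallel_table_scan_size = '8MB';    -- default 8MB; smaller tables won't parallelize
```

Even with these lowered, the planner only parallelizes plan shapes it supports (scans, joins, some aggregates), and the session-wide `max_parallel_workers` / `max_worker_processes` pools still cap the total.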

by martinald

3/4/2026 at 7:08:28 PM

I've shaved 30 seconds off queries by transforming json columns into strings once the first CTE is done with them.

by d0100

3/4/2026 at 12:35:02 PM

I think read queries that are always served from cache are CPU-bound because they still involve locking the buffers etc., while no I/O is involved.

by wreath

3/4/2026 at 11:33:35 AM

PostgreSQL is Turing complete, so I guess they do what ever they want?

by throwaway140126

3/4/2026 at 10:23:58 AM

Have you tested this under high concurrency with lots of short OLTP queries? I’m curious whether the much faster compile time actually moves the point where JIT starts paying off, or if it’s still mostly useful for heavier queries.

by swaminarayan

3/5/2026 at 6:13:50 AM

It's not useful for sub-millisecond queries like point lookups, or other simple queries that process only a few records. The sljit option starts to pay off when you process (not necessarily return) hundreds of records; the more, the better. I'm still thinking about a caching option that would lift this requirement somewhat for cached plans. For non-cached ones it will stay.

by vladich

3/4/2026 at 12:44:38 PM

> By default, jit_above_cost parameter is set to a very high number (100'000). This makes sense for LLVM, but doesn't make sense for faster providers. It's recommended to set this parameter value to something from ~200 to low thousands for pg_jitter (depending on what specific backend you use and your specific workloads).
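Following the README's recommendation quoted above, the change is a one-line GUC tweak; a session-level sketch with an illustrative threshold (tune per workload, and note the query/table here are hypothetical):

```sql
SET jit = on;
SET jit_above_cost = 500;   -- default 100000 is sized for LLVM's slow compiles

-- the "JIT:" section at the bottom of the output shows whether (and how fast) JIT ran
EXPLAIN (ANALYZE) SELECT sum(amount) FROM payments WHERE status = 'settled';
```

Setting it in postgresql.conf (or via `ALTER SYSTEM`) applies it to all sessions instead of just the current one.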

by masklinn

3/4/2026 at 10:07:29 AM

sadly, no windows version yet AFAICT

by larodi

3/5/2026 at 6:15:14 AM

It will be added soon

by vladich

3/4/2026 at 7:50:02 AM

awesome! I wonder if it's possible to point AI at this problem and synthesize a bespoke compiler (per-architecture?) for postgresql expressions?

by asah

3/4/2026 at 7:56:23 AM

Two things are holding back current LLM-style AI from being of value here:

* Latency. LLM responses are measured on the order of 1000s of milliseconds, where this project targets 10s of milliseconds; that's off by almost two orders of magnitude.

* Determinism. LLMs are inherently non-deterministic. Even with temperature=0, slight variations of the input lead to major changes in output. You really don't want your DB to be non-deterministic, ever.

by kvdveer

3/4/2026 at 9:49:19 AM

> LLMs are inherently non-deterministic.

This isn't true, and certainly not inherently so.

Changes to input leading to changes in output does not violate determinism.

by qeternity

3/4/2026 at 11:35:34 AM

> This isn't true

From what I understand, in practice it often is true[1]:

Matrix multiplication should be “independent” along every element in the batch — neither the other elements in the batch nor how large the batch is should affect the computation results of a specific element in the batch. However, as we can observe empirically, this isn’t true.

In other words, the primary reason nearly all LLM inference endpoints are nondeterministic is that the load (and thus batch-size) nondeterministically varies! This nondeterminism is not unique to GPUs — LLM inference endpoints served from CPUs or TPUs will also have this source of nondeterminism.

[1]: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

by magicalhippo

3/4/2026 at 6:15:46 PM

Yes, lots of things can create nondeterminism. But none of it is inherent.

by qeternity

3/4/2026 at 11:37:58 AM

Quoting:

"But why aren’t LLM inference engines deterministic? One common hypothesis is that some combination of floating-point non-associativity and concurrent execution leads to nondeterminism based on which concurrent core finishes first."

From https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

by yomismoaqui

3/4/2026 at 6:15:52 PM

Yes, lots of things can create nondeterminism. But none of it is inherent.

by qeternity

3/4/2026 at 8:00:42 AM

> 1000s of milliseconds

Better known as "seconds"...

by simonask

3/4/2026 at 8:09:28 AM

The suggestion was not to use an LLM to compile the expression, but to use an LLM to build the compiler.

by olau