pgxpool and multi tenancy #2384

kevin-matthew · 2025-09-15T16:34:36Z

kevin-matthew
Sep 15, 2025

There has been some issues raised in the past with multi tenancy (where each tenant has their own schema) and its function with pgx pool. But none of them hit the core problems in the head, so I'm starting this discussion to index it.

Justification for multi tenancy

To put this into perspective, I like to avoid schema-based multi tenancy. So any problems that come with it I brush off as inconsequential. However in my latest project, I am faced with the requirement that the tenants should have unrestricted access to their own data (eg, they'll be writting their own sql statements) as well as frequent exports of entire tables. So I cannot used my preferred shared-table pattern (where each row has a tenant ID). This is also going to be a SaaS product so I have to be cost-aware when it comes to hardware. After much research, I reluctantly decided that schema-per-tenant was the best choice.

Problems with pgxpool and tenancy

When you execute a Postgres statement with a connection client, if its a novel sql statement that hasn't been previously been executed (eg select * from tenant1.mytable where abc=$1). Postgres has to compile that statement which takes time (See JIT section on postgres documentation). After compilation, subsequent calls to that sql is dramatically faster, even with differing arguments (eg $1).

However, JIT compilations are stored in the connection process, and JIT compilations treat tables of different schemas independently (even if they are the same structure). Thus executing select * from tenant1.mytable... requires a separate JIT compilation than select * from tenant2.mytable.... This is where pgxpool falters: if you attempt to use a pool of connections equally across a pool of (thousands) of schemas, each connection is forced to recompile the same sql. This leads me to ask: What if there was a way that the pool will first acquire a connection best suited for the tenant before it tries to acquire a connection that hasn't been used with that tenant?

Side note: JIT complications is what I'd consider the most important problem with multi-tenancy and connection pooling. I'm sure there's other things such as caching that I haven't done my research on.

Hash Pooling

There's already lots of rules when it comes to what connection is best to acquired. Most of those rules are dictated by puddle. Right now they try their best to make sure all connections are used evenly (eg using the genstack approach). My idea is that we extend, so that the following is considered:

The caller is wanting to acquire a connection configured for tenant XYZ.
The pool checks idle connections that are already configured for tenant XYZ. If one is available, return that.
The pool then checks if a new idle connection can be made with that configuration. If so, create a new configured connection and return that.
With the inability to create a new connection or acquire an already configured connection. The pool can now "reconfigure" an existing connection (found by genstack). This is assumed to be an expensive operation, so this is why we do it last.

Configuration/Reconfiguration will be callbacks. This will mean the caller can dictate what user/schema/search_path whatever they want.

This effectively means that a connection pool has a genstack control for our traditional acquires. But each idle connection has a hash (eg, schema name) that can easily be picked up before an arbitrary connection is used.

I'm working on a solution now. But was wanting to hear some other's thoughts.

jackc · 2025-09-29T12:42:28Z

jackc
Sep 29, 2025
Maintainer

I'm open to improvements in this area. Though we'd probably want a fair amount of discussion before actually doing any coding.

A few quick thoughts:

I would be concerned about prepared statements before the JIT. Presumably automatic prepared statement generation would need to be disabled.
I would investigate RLS as opposed to schema per tenant. That trades this problem for other problems, but depending on your use case it might be an improvement.
This might be doable without any puddle changes by using AcquireAllIdle.
Regardless of whether AcquireAllIdle is used or a new feature in puddle I would be concerned about increased contention as I would expect locks to be held longer.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

pgxpool and multi tenancy #2384

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

pgxpool and multi tenancy #2384

Uh oh!

kevin-matthew Sep 15, 2025

Justification for multi tenancy

Problems with pgxpool and tenancy

Hash Pooling

Replies: 1 comment

Uh oh!

jackc Sep 29, 2025 Maintainer

kevin-matthew
Sep 15, 2025

jackc
Sep 29, 2025
Maintainer