Three layers of pooling, three different jobs. We learned the hard way which to use when. Real numbers from an 8k-connection workload.
We run roughly 8,000 concurrent client connections against a Postgres primary that comfortably handles 300 backend connections. The difference is layers of pooling. Three of them, doing three different jobs. Here's what each layer is for, what we got wrong, and the configuration we ended up with.
```
Application code
  │
  ▼  (in-process driver pool)
PgBouncer in transaction mode  (~3,000 client conns ↓, ~300 backend conns ↑)
  │
  ▼
RDS Proxy  (handles failover, IAM auth)
  │
  ▼
Postgres primary
```
Each layer has a clear purpose:
| Layer | Job |
|---|---|
| Application driver pool | Reuse a small number of connections within a single instance |
| PgBouncer | Multiplex thousands of clients down to hundreds of backend conns |
| RDS Proxy | Failover, IAM, security |
We didn't start here. We started with just the app pool and a direct connection to RDS. The path to this architecture went through several outages.
About 18 months ago, on a Friday at 3pm, our Postgres primary started rejecting connections. max_connections was 500. We had ~80 application instances each running a pool of size 20 = 1,600 attempted connections. Postgres held the line until ~500 active and started returning FATAL: too many connections.
What we'd missed: most of those 1,600 connections were idle most of the time. The right thing was to multiplex — let many clients share fewer backend connections.
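The arithmetic behind that insight, as a sketch (the active fraction is a hypothetical number for illustration; the source only says most connections were idle):

```python
# Why multiplexing works: most held connections are idle at any instant.
instances = 80                  # app instances during the incident
pool_per_instance = 20          # driver pool size at the time
held_open = instances * pool_per_instance           # what Postgres saw attempted
assert held_open == 1600

# Hypothetical: if only ~10% of held connections run a query at any moment,
# the workload really needs on the order of 160 backends, not 1,600.
active_fraction = 0.10
needed_backends = int(held_open * active_fraction)
assert needed_backends == 160
```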
```ini
# pgbouncer.ini
[databases]
app = host=postgres-primary.example.com port=5432 dbname=app

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_user = pgbouncer
# critical; see below
pool_mode = transaction
# the number our app cluster can hold open
max_client_conn = 5000
# backend conns per database
default_pool_size = 200
reserve_pool_size = 25
reserve_pool_timeout = 3
server_idle_timeout = 60
```
`pool_mode = transaction` is the key. Three modes exist:

- **session**: each client owns a backend conn for the life of its session. Doesn't help.
- **transaction**: a backend conn is checked out per transaction and returned at COMMIT/ROLLBACK. Allows huge multiplexing.
- **statement**: checked out per statement. Even more aggressive, but breaks any code using transactions.

Transaction mode multiplied our effective connection capacity by ~8×: the same Postgres suddenly handled all 1,600 clients with ~200 backend conns.
Not everything works under transaction pooling. The most common surprises:
```sql
SET search_path TO myschema;       -- lost when the transaction ends
SET TIME ZONE 'America/New_York';  -- lost when the transaction ends
```

Anything you `SET` outside a transaction may not be there for your next query.
Fix: include the SET inside the same transaction, or use connection-level config in the database (ALTER ROLE app SET search_path = myschema).
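A pure-Python sketch of why this bites (the two-element list stands in for two pooled backends, each with its own session state; no real database involved):

```python
# Two backends, each with its own session-level state.
backends = [{}, {}]
checkouts = iter([0, 1])   # which backend each checkout happens to land on

# Client runs `SET search_path TO myschema` as its own implicit transaction:
b = next(checkouts)
backends[b]["search_path"] = "myschema"    # applied on backend 0 only

# The client's next transaction is checked out onto a different backend:
b = next(checkouts)
assert "search_path" not in backends[b]    # backend 1 never saw the SET
```

Running the `SET` and the query inside one transaction guarantees they land on the same backend, which is why the fix above works.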
Some drivers use server-side prepared statements that persist on the backend. After the transaction ends, your client thinks the prepared statement exists; the next backend doesn't have it.
Fix: disable server-side prepared statements in the driver — for the PostgreSQL JDBC driver, `prepareThreshold=0` in the connection string — or rely on PgBouncer 1.21+, which can track protocol-level prepared statements in transaction mode via `max_prepared_statements`. Most modern drivers have an equivalent setting; ancient ones don't.
## LISTEN/NOTIFY

Pub/sub via Postgres notifications doesn't work through transaction pooling. Your client sends `LISTEN`; the backend it ran on is returned to the pool; the notification arrives at a different backend; you never see it.
Fix: don't use LISTEN/NOTIFY in pooled apps. Use a dedicated direct connection if you need it.
Server-side cursors hold a backend connection. Either don't use them, or use session mode for that specific path.
We didn't remove the application-level pool when we added PgBouncer. We shrank it.
```yaml
# Before PgBouncer
hikari:
  maximumPoolSize: 20   # 20 conns per instance × 80 instances = 1,600
```

```yaml
# After PgBouncer
hikari:
  maximumPoolSize: 8    # 8 per instance × 80 = 640 toward PgBouncer
  connectionTimeout: 3000
  idleTimeout: 30000
```
The application pool's job changed. It used to size to peak DB capacity; now it sizes to peak per-instance concurrency. Each instance handles maybe 8 simultaneous queries, so we need 8 conns max — independent of the cluster size.
PgBouncer absorbs the actual connection thundering herd.
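One way to derive that per-instance number is Little's law, concurrency ≈ arrival rate × average service time (the rates and latencies below are hypothetical, not from the article):

```python
# Little's law: concurrent_queries ≈ requests_per_sec × avg_query_seconds
requests_per_sec = 120      # hypothetical per-instance peak throughput
avg_query_seconds = 0.05    # hypothetical mean query latency (50 ms)
concurrent_queries = requests_per_sec * avg_query_seconds   # ≈ 6

# Round up and leave headroom: a pool of ~8 per instance,
# independent of how many instances the cluster runs.
pool_size = 8
assert concurrent_queries <= pool_size
```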
Three months later, the PgBouncer instance crashed during a Postgres failover. Postgres had failed over correctly to a replica, but PgBouncer's connections to the old primary were stale, and PgBouncer didn't refresh them gracefully.
We added RDS Proxy as the layer between PgBouncer and Postgres specifically for failover handling.
| | PgBouncer alone | + RDS Proxy |
|---|---|---|
| Failover detection | none | automatic via RDS API |
| Reconnect strategy | restart-and-pray | seamless from app perspective |
| IAM auth | no | yes |
| Cost | self-hosted | small $/hr |
We also could have made PgBouncer aware of failover via DNS tricks or process-level health checks. We chose RDS Proxy because it handles edge cases (replica promoted, role changed, password rotated) we kept missing in our scripts.
Some teams skip RDS Proxy and run PgBouncer directly against the primary endpoint. That works if your failover SLA is loose. Ours wasn't.
For lower scale, you don't need three layers. A single app pool plus PgBouncer, talking directly to Postgres, works fine. We ran that way for years before the scale forced more layers.
## default_pool_size is harder than it looks

Your sweet spot is the largest pool size where Postgres doesn't see contention. Too small: clients wait for a backend conn. Too large: backends compete for buffer pool, planner cache, etc.
We landed at 200 per database (we have one main DB) by testing against our own workload and watching for contention. Anyone who tells you a magic number for pool size hasn't tested it on your hardware.
## server_idle_timeout matters more than expected

If backend conns idle out, PgBouncer reopens them on demand. Connection establishment costs about 5 ms in our setup; under burst load, that adds up.
We set server_idle_timeout = 60 to keep ~half the pool warm but not waste resources during quiet periods.
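A rough sketch of the cost of letting the pool go cold (the 5 ms figure is from our measurements above; the burst size is hypothetical):

```python
reconnect_ms = 5        # measured connection-establishment cost
cold_backends = 100     # hypothetical: half of a 200-conn pool idled out
burst_penalty_ms = reconnect_ms * cold_backends
# Half a second of reconnect work absorbed at the onset of the burst.
assert burst_penalty_ms == 500
```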
## max_client_conn should be 2× your app pool sum

A common mistake: setting max_client_conn too tight. If apps surge briefly above the cap, PgBouncer rejects new connections and apps fall over.
Rule we follow: max_client_conn = 2 × (sum of all app instance pool sizes). Headroom prevents thundering-herd rejection during incidents.
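Applied to the cluster numbers from earlier, the rule gives a floor, not a target (our deployed `max_client_conn = 5000` sits well above it, which is just extra headroom):

```python
instances = 80          # app instances
app_pool_size = 8       # per-instance driver pool (post-PgBouncer)
app_conn_sum = instances * app_pool_size   # 640 conns toward PgBouncer
floor = 2 * app_conn_sum                   # the 2× headroom rule
assert floor == 1280
```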
After deploying the full stack:
| Metric | Before (app pool only) | After (3-layer) |
|---|---|---|
| Postgres backend connections | 1,200–1,600 | 180–230 |
| Postgres CPU during peak | 70% | 35% |
| Connection-acquire p95 (app side) | 12ms | 1.4ms |
| Failover-induced errors | 100s of 5xx | typically 0 |
| Cost (PgBouncer + RDS Proxy) | n/a | ~$80/mo |
The cost is rounding error. The reduction in primary CPU alone deferred a vertical scale-up worth ~$1k/mo.
## Using session pool mode initially

We chose it because we were nervous about transaction-mode breakage, but session mode gave us almost no multiplexing benefit. The switch to transaction was a one-line change that took three weeks of paranoia and testing.
If you're starting today, start with transaction mode and fix what breaks.
Once PgBouncer was in, we left the app pools at their old size (20). The result: app instances chewed through PgBouncer's max_client_conn during traffic bursts.
Always shrink the app pool when adding PgBouncer. It's not redundant; it's a different layer with a different job.
A single PgBouncer instance is a single point of failure. We now run 3 PgBouncer instances behind a TCP load balancer. Cost: trivial. Resilience: enormous.
PgBouncer has `SHOW POOLS` and `SHOW STATS` admin commands. We weren't scraping them. When pools saturated, we found out from upstream symptoms instead of the cause.

We now scrape `SHOW STATS` every 15 seconds and alert on `cl_waiting > 5` for 1 minute (clients waiting for a backend conn).
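A sketch of the alert condition, assuming a `SHOW POOLS` row has already been parsed into a dict (the scraping itself is omitted; `cl_waiting` is PgBouncer's column for clients parked in the queue):

```python
def pool_saturated(pool_row, threshold=5):
    """True when clients are queued waiting for a backend connection."""
    return pool_row["cl_waiting"] > threshold

# Synthetic rows in the shape of parsed SHOW POOLS output:
healthy   = {"database": "app", "cl_active": 180, "cl_waiting": 0}
backed_up = {"database": "app", "cl_active": 200, "cl_waiting": 12}
assert not pool_saturated(healthy)
assert pool_saturated(backed_up)
```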
- Use `pool_mode = transaction`. Live with the constraints; the multiplexing is the entire point.
- Scrape `SHOW STATS` from PgBouncer. The visibility is invaluable.
- If your DB sees < 200 concurrent clients and runs comfortably at 30% CPU, just use the application driver pool. Three layers of pooling is solving a real scale problem, not a vanity one.
But if you're seeing connection-related errors, primary CPU pinned by parsing, or ugly failover behavior, the architecture above will likely buy you 12+ months of headroom for the cost of an afternoon's setup.
The right number of pooling layers is the smallest number that makes your problem go away. Most teams need one (the app pool). Many need two (+ PgBouncer). Some need three. The question to ask yourself: which of these layers is currently doing nothing for you? If a layer isn't pulling its weight, adding another won't help.