TL;DR
- Default mode for APIs: `pool_mode = transaction`
- Django: `CONN_MAX_AGE = 0`
- SQLAlchemy: usually `NullPool` with PgBouncer transaction mode
- Measure p95/p99, not only average latency
1) What PgBouncer is and why it matters
PgBouncer is a lightweight connection pooler for PostgreSQL. Your app talks to PgBouncer, and PgBouncer manages a limited number of real Postgres connections.
It helps most when:
- traffic spikes create connection storms;
- worker count is high;
- Postgres hits connection limits.
Important: PgBouncer does not make SQL itself faster. It improves connection-level stability and tail latency under load.
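Some back-of-the-envelope connection math shows why this matters. The worker counts below are hypothetical, but the multiplication is the point: without a pooler, every worker holds its own Postgres connection.

```python
# Rough connection math: without a pooler, every app worker can hold its
# own Postgres connection; with PgBouncer, clients share a small pool.
# All numbers here are hypothetical, for illustration only.

app_servers = 10          # pods/VMs running the API (assumed)
workers_per_server = 8    # gunicorn/uvicorn workers each (assumed)
conns_per_worker = 5      # connections a worker may hold at peak (assumed)

client_conns = app_servers * workers_per_server * conns_per_worker
print(f"Potential direct Postgres connections: {client_conns}")

# With PgBouncer in transaction mode, Postgres only ever sees the pool:
default_pool_size = 50
print(f"Server connections with PgBouncer: {default_pool_size}")
```

Those 400 potential client connections would either exhaust `max_connections` on Postgres or force it very high; funneled through PgBouncer, Postgres sees only the pool.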
2) Pooling modes: session / transaction / statement
`pool_mode = transaction` is usually best for APIs:
- high pool density;
- better scaling for short transactions;
- avoid session-state assumptions.
`pool_mode = session` is more compatible but less efficient under concurrency.
`statement` is uncommon and can break multi-statement transaction behavior.
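Why does transaction mode break session-state assumptions? A tiny pure-Python simulation (no real database, just a sketch of the scheduling) makes it concrete: each transaction grabs whichever server connection is free, so a `SET` issued in one transaction may not be visible in the next.

```python
# Minimal sketch of transaction pooling: every transaction is served by
# whichever server connection is free, so per-session state (SET, temp
# tables, prepared statements) does not reliably carry over.

class ServerConn:
    def __init__(self, name: str):
        self.name = name
        self.session_state: dict[str, str] = {}  # what SET would leave behind

class TransactionPool:
    def __init__(self, size: int):
        self.free = [ServerConn(f"srv{i}") for i in range(size)]

    def run_txn(self, statements: list[str]) -> ServerConn:
        conn = self.free.pop(0)          # any free server connection
        for stmt in statements:
            if stmt.startswith("SET "):
                key, val = stmt[4:].split(" = ")
                conn.session_state[key] = val
        self.free.append(conn)           # released back after COMMIT
        return conn

pool = TransactionPool(size=2)
first = pool.run_txn(["SET search_path = app"])
second = pool.run_txn(["SELECT 1"])  # lands on a different server connection
print(first.name, second.name)       # different connections
print(second.session_state)          # empty: the SET stayed on the first one
```

With a real PgBouncer, the usual workarounds are `SET LOCAL` inside the transaction or avoiding session state entirely.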
3) What performance gains to expect
Typical improvements:
- fewer `too many connections` errors and connect timeouts;
- smoother p95/p99 on peaks;
- better throughput at high concurrency.
4) PgBouncer config template
```ini
; /etc/pgbouncer/pgbouncer.ini
[databases]
appdb = host=POSTGRES_HOST port=5432 dbname=appdb user=appuser password=apppass

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 50
min_pool_size = 10
reserve_pool_size = 20
reserve_pool_timeout = 5
server_idle_timeout = 30
server_lifetime = 3600
ignore_startup_parameters = extra_float_digits,search_path
stats_period = 60
```
"appuser" "md5<md5passwordhash>"
5) Django + PgBouncer
```python
# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "appdb",
        "USER": "appuser",
        "PASSWORD": "apppass",
        "HOST": "pgbouncer",
        "PORT": "6432",
        "CONN_MAX_AGE": 0,
        "OPTIONS": {"connect_timeout": 5},
    }
}
```
6) FastAPI + SQLAlchemy + PgBouncer
```python
from fastapi import FastAPI
from sqlalchemy import text
from sqlalchemy.ext.asyncio import create_async_engine
from sqlalchemy.pool import NullPool

app = FastAPI()

engine = create_async_engine(
    "postgresql+asyncpg://appuser:apppass@pgbouncer:6432/appdb",
    poolclass=NullPool,       # PgBouncer does the pooling
    pool_pre_ping=True,
    # asyncpg's client-side prepared-statement cache can conflict with
    # PgBouncer transaction pooling; disabling it is the common workaround.
    connect_args={"statement_cache_size": 0},
)

@app.get("/ping-db")
async def ping_db():
    async with engine.connect() as conn:
        await conn.execute(text("SELECT 1"))
    return {"ok": True}
```
7) Flask + SQLAlchemy + PgBouncer
```python
# config.py
from sqlalchemy.pool import NullPool

SQLALCHEMY_DATABASE_URI = "postgresql+psycopg2://appuser:apppass@pgbouncer:6432/appdb"
SQLALCHEMY_ENGINE_OPTIONS = {
    "poolclass": NullPool,   # PgBouncer does the pooling
    "pool_pre_ping": True,
}
```
8) 5000-request benchmark script
```python
# bench_http.py
import asyncio
import statistics
import time

import httpx

TOTAL = 5000
CONCURRENCY = 200
URL = "http://127.0.0.1:8000/ping-db"

async def worker(client: httpx.AsyncClient, n: int, latencies: list[float]) -> None:
    for _ in range(n):
        t0 = time.perf_counter()
        r = await client.get(URL, timeout=10.0)
        r.raise_for_status()
        latencies.append((time.perf_counter() - t0) * 1000.0)

async def main() -> None:
    per_worker = TOTAL // CONCURRENCY
    remainder = TOTAL % CONCURRENCY
    latencies: list[float] = []
    start = time.perf_counter()
    limits = httpx.Limits(max_connections=CONCURRENCY, max_keepalive_connections=CONCURRENCY)
    async with httpx.AsyncClient(limits=limits) as client:
        tasks = []
        for i in range(CONCURRENCY):
            n = per_worker + (1 if i < remainder else 0)
            tasks.append(asyncio.create_task(worker(client, n, latencies)))
        await asyncio.gather(*tasks)
    elapsed = time.perf_counter() - start
    rps = TOTAL / elapsed
    latencies.sort()
    p50 = latencies[int(0.50 * len(latencies))]
    p95 = latencies[int(0.95 * len(latencies))]
    p99 = latencies[int(0.99 * len(latencies))]
    avg = statistics.mean(latencies)
    print(f"TOTAL={TOTAL} CONCURRENCY={CONCURRENCY}")
    print(f"Elapsed: {elapsed:.2f}s  RPS: {rps:.1f}")
    print(f"Latency ms: avg={avg:.2f} p50={p50:.2f} p95={p95:.2f} p99={p99:.2f}")

if __name__ == "__main__":
    asyncio.run(main())
```
Run it:

```shell
pip install httpx
python bench_http.py
```
For a fair comparison, only switch the DB endpoint (Postgres:5432 vs PgBouncer:6432) and keep the same load profile.
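The script above computes percentiles by indexing into the sorted list; as a sanity check, the stdlib's `statistics.quantiles` gives comparable values. A quick illustration on synthetic latencies:

```python
import statistics

# Synthetic latencies (ms) standing in for a benchmark run.
latencies = sorted(float(i) for i in range(1, 101))  # 1.0 .. 100.0

# Sorted-index percentile, same approach as the benchmark script:
p95_index = latencies[int(0.95 * len(latencies))]

# Cross-check: statistics.quantiles with n=100 returns 99 cut points,
# of which index 94 is the 95th percentile (interpolated).
p95_quantile = statistics.quantiles(latencies, n=100)[94]

print(p95_index, p95_quantile)  # close but not identical: different methods
```

The two methods interpolate differently, so expect small differences; either is fine as long as you compare runs with the same method.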
9) Practical production tips
- Track p95/p99, not only average.
- Keep transactions short with `pool_mode = transaction`.
- Django: keep `CONN_MAX_AGE = 0`.
- SQLAlchemy: usually `NullPool` to avoid double pooling.
- Tune `default_pool_size` so Postgres stays stable under peak load.
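One hedged way to start sizing `default_pool_size` is Little's law: concurrent in-flight queries roughly equal query rate times average query time. The numbers below are made up for illustration, not a recipe:

```python
# Little's law sketch for pool sizing (illustrative numbers only).
qps = 2000           # queries per second at peak (assumed)
avg_query_s = 0.010  # 10 ms average query time (assumed)

busy_connections = qps * avg_query_s   # ~concurrent queries in flight
headroom = 2.0                         # margin for spikes and slow queries
suggested_pool = round(busy_connections * headroom)

print(f"~{busy_connections:.0f} busy connections; "
      f"try default_pool_size around {suggested_pool}")
```

Treat the result as a starting point, then adjust against `SHOW POOLS;` output and Postgres load under a real benchmark.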
Quick checklist:
- PgBouncer metrics: `SHOW POOLS;`, `SHOW STATS;`
- Postgres metrics: `pg_stat_activity`, transaction duration, active connections
- Load test steps: 50 -> 200 -> 500 concurrency and compare p95/p99 tails
