# v0.25.0 — Scheduler Scalability and Pooler Performance

> **Full technical details:** [v0.25.0.md-full.md](v0.25.0.md-full.md)

**Status: ✅ Released** | **Scope: Large** (~8–9 weeks)

> Push the comfortable operating point from hundreds to thousands of stream
> tables, eliminate the cold-start latency tax in pooled-connection
> deployments, and harden the predictive cost model against outlier noise.

---

## What problem does this solve?

At hundreds of stream tables, the scheduler's per-tick catalog reload (scanning all stream tables to find which need refreshing) was consuming 20–200 milliseconds on every tick. Connection poolers like PgBouncer were paying a 30–45 millisecond cold-start cost per backend connection, because each new connection recompiled the refresh SQL templates from scratch. The predictive cost model was susceptible to outlier measurements that caused premature strategy switches.

---

## Shared-Memory Catalog Snapshot Cache

The full list of stream tables, their queries, and their schedules is now cached in **shared memory** (memory shared between all PostgreSQL processes on the server), keyed by a generation counter. The cache is invalidated only when a stream table is created, modified, or dropped — not on every tick. This reduces the per-tick catalog reload from O(n) SPI queries to a single shared-memory read. The win is largest at scale: at 1,000 stream tables, the scheduler tick drops from ~200 ms to under 20 ms.

---

## Batched Change Detection

The scheduler checks which stream tables have pending changes before deciding what to refresh. Previously, this was a separate `SELECT EXISTS(...)` query per source table. Batched change detection combines all of these checks into a single `UNION ALL` query per refresh group. At 10 source tables, this reduces change detection from 10 queries to 1 — a 90% reduction in round-trips to the database.
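The batching idea can be sketched as a query builder that folds one `EXISTS` probe per source table into a single statement. This is an illustrative sketch only: the `pending_changes` table and `source_oid` column are hypothetical names, not pg_trickle's actual catalog schema.

```python
def batched_change_check_sql(source_oids):
    """Build one UNION ALL probe covering every source table in a refresh group.

    Hypothetical sketch: table/column names are illustrative, not
    pg_trickle's real schema.
    """
    branches = [
        # One EXISTS probe per source table, all shipped in a single query.
        f"SELECT {oid} AS source_oid, "
        f"EXISTS (SELECT 1 FROM pending_changes WHERE source_oid = {oid})"
        f" AS has_changes"
        for oid in source_oids
    ]
    return "\nUNION ALL\n".join(branches)


# Three source tables -> one round-trip instead of three separate queries.
sql = batched_change_check_sql([16384, 16402, 16417])
```

The round-trip saving comes purely from shipping one statement instead of n; each branch is still the same cheap `EXISTS` probe it was before.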
---

## Shared L0 Template Cache (Pooler Latency Fix)

The refresh SQL templates (the differential SQL generated for each stream table) are now stored in a `dshash`-based shared-memory cache. All backend processes in the same database share one compiled template set. The first backend to connect compiles the templates; every subsequent backend — including new connections from PgBouncer — hits the shared cache immediately.

*In plain terms:* the 30–45 ms "first query" latency penalty that affected every new database connection in a PgBouncer deployment is eliminated.

---

## Persistent Worker Pool

`pg_trickle.worker_pool_size` (default 0) starts persistent background worker processes that loop on a shared work queue rather than being started and stopped for each refresh task. This saves ~2 ms of startup cost per worker per tick and eliminates the PostgreSQL background worker registration/deregistration overhead for high-frequency refresh workloads.

---

## Faster Row Hashing

The row identity hash used by CDC (change data capture) was switched from a two-step "concatenate all columns into a string, then hash" approach to a streaming `xxh3` algorithm that processes column values directly. This eliminates per-row heap allocations on the CDC hot path.

---

## Predictive Model Robustness

The cost model from v0.22.0 could be confused by outlier measurements — one very slow refresh causing it to switch all subsequent refreshes to FULL mode unnecessarily. Robustness improvements:

- Predictions are clamped to `[0.5×, 4×] last_full_ms` — no extreme outliers
- Median and median absolute deviation (MAD) replace mean and standard deviation — more resistant to outliers
- Predictions are ignored for the first 60 seconds after a stream table is created (warm-up period)

---

## Subscriber Lag Tracking

Downstream publications (from v0.22.0) now track the LSN position of each subscriber.
The change buffer is not truncated until all subscribers have acknowledged past the buffer's maximum LSN. A warning is emitted when a subscriber falls more than `pg_trickle.publication_lag_warn_lsn` bytes behind.

*In plain terms:* if a downstream consumer (Kafka, Debezium, etc.) falls behind, pg_trickle preserves the data it needs rather than discarding it.

---

## Scope

v0.25.0 pushes the practical scale limit from hundreds to thousands of stream tables, and eliminates the pooler cold-start penalty that was the most frequently reported performance issue in production deployments behind PgBouncer. The predictive model robustness improvements make AUTO mode more stable in production under variable workload patterns.
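The three robustness rules from the "Predictive Model Robustness" section above can be sketched together. This is a minimal illustration, not pg_trickle's actual estimator: treating the prediction as `median + MAD` is a hypothetical combination chosen only to show where the robust statistics and the clamp fit; the real model's inputs are more involved.

```python
import statistics


def robust_predicted_ms(samples_ms, last_full_ms, created_at_s, now_s):
    """Hedged sketch of the three robustness rules; not the real estimator."""
    # Rule 3: warm-up — ignore predictions for 60 s after table creation.
    if now_s - created_at_s < 60.0:
        return None
    # Rule 2: median and MAD replace mean and standard deviation, so one
    # pathological measurement barely moves either statistic.
    center = statistics.median(samples_ms)
    spread = statistics.median(abs(s - center) for s in samples_ms)
    prediction = center + spread  # hypothetical combination, for illustration
    # Rule 1: clamp to [0.5x, 4x] last_full_ms — no extreme outliers survive.
    return min(max(prediction, 0.5 * last_full_ms), 4.0 * last_full_ms)
```

With `last_full_ms = 20`, a run of implausibly slow samples is clamped to the 4× ceiling (80 ms) instead of flipping every subsequent refresh to FULL mode.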
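The lag-gated truncation rule from "Subscriber Lag Tracking" above reduces to two small computations: the buffer may only be truncated up to the slowest subscriber's acknowledged LSN, and anyone further behind than the warning threshold gets flagged. In this sketch LSNs are plain integer byte offsets and the subscriber map is hypothetical; it mirrors the `pg_trickle.publication_lag_warn_lsn` setting but is not the extension's real code.

```python
def truncation_safe_lsn(acked_lsns):
    """Highest LSN the change buffer may be truncated to: the minimum
    acknowledged position across subscribers. With no subscribers this
    sketch conservatively blocks truncation (returns 0)."""
    if not acked_lsns:
        return 0
    return min(acked_lsns.values())


def lagging_subscribers(acked_lsns, buffer_max_lsn, warn_bytes):
    """Subscribers more than warn_bytes behind the buffer head; these are
    the ones the warning is emitted for."""
    return [
        name
        for name, acked in acked_lsns.items()
        if buffer_max_lsn - acked > warn_bytes
    ]
```

For example, with subscribers acknowledged at 1000 and 400 bytes and a buffer head at 1200, only data below LSN 400 is truncatable, and a 500-byte warning threshold flags the slower subscriber.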