# v0.59.0 — Performance & Observability (Full Details)

> **Summary:** [v0.59.0.md](v0.59.0.md)
> **Assessment source:** [plans/PLAN_OVERALL_ASSESSMENT_12.md](../plans/PLAN_OVERALL_ASSESSMENT_12.md)
> **Findings addressed:** P-1, P-2, P-3, P-4, P-5, P-6, P-7, O-1, O-2, O-3, O-4, O-5, O-6

---

## Performance Improvements

### PERF-1: Batched Monitor Buffer-Growth Check

**Assessment finding:** P-1 (HIGH)
**File:** `src/monitor/health.rs`

**Before:** The per-stream-table health check loop called `SELECT count(*)::bigint FROM pgtrickle_changes.changes_{oid}` for each CDC-enabled source: one SPI round-trip per source per monitor tick. On a cluster with 100 sources polled at 10 Hz, this produced ~1,000 SPI calls/sec.

**After:** A single SPI query fans out across all CDC-enabled sources, pairing each entry in a `VALUES` list with a count subquery against its buffer table. The query returns one row per source with its pending-row count, and the monitor loop applies the threshold test in Rust without further SPI calls. Overhead drops from ~100 SPI calls per health-check cycle (one per source) to a single call.

### PERF-2: Query Hash Cached on Catalog Row

**Assessment finding:** P-2 (MEDIUM)
**Files:** `src/refresh/merge/mod.rs`, `src/catalog.rs`, `sql/pg_trickle--0.58.0--0.59.0.sql`

A new `defining_query_hash BIGINT NOT NULL DEFAULT 0` column is added to `pgtrickle.pgt_stream_tables`. The hash is computed during `CREATE STREAM TABLE` / `ALTER STREAM TABLE QUERY` and stored persistently; the per-refresh `DefaultHasher` computation over the full query string is removed.

The SQL upgrade script (`sql/pg_trickle--0.58.0--0.59.0.sql`) adds the column and backfills it for all existing rows:

```sql
ALTER TABLE pgtrickle.pgt_stream_tables
    ADD COLUMN IF NOT EXISTS defining_query_hash BIGINT NOT NULL DEFAULT 0;

UPDATE pgtrickle.pgt_stream_tables
   SET defining_query_hash = hashtextextended(defining_query, 0)::bigint;
```

### PERF-3: Arc Merge Template Strings

**Assessment finding:** P-3 (MEDIUM)
**File:** `src/refresh/merge/mod.rs`

`CachedMergeTemplate` holds eight SQL template strings. Under a 1,000 refresh/s workload, cloning the struct on each cache hit produced ~60 MB/s of string allocation. The eight template fields are changed from `String` to `Arc<str>`, so a clone of the struct becomes O(1) (eight atomic reference-count increments) regardless of template size.

### PERF-4: Single MERGE_TEMPLATE_CACHE Borrow

**Assessment finding:** P-4 (MEDIUM)
**File:** `src/refresh/merge/mod.rs`

Two separate `MERGE_TEMPLATE_CACHE.with(|cache| { cache.borrow().get(…) })` calls — one for `cached_non_monotonic`, one for `cached_is_deduplicated` — were each marking the same LRU entry as recently used. They are merged into a single borrow that extracts both flags at once:

```rust
let (non_monotonic, is_deduplicated) = MERGE_TEMPLATE_CACHE.with(|cache| {
    cache.borrow()
        .peek(&st.pgt_id)
        .map(|e| (e.is_all_algebraic, e.is_deduplicated))
        .unwrap_or((false, false))
});
```

`peek()` is used to avoid spurious LRU promotion during read-only inspection.

### PERF-5: WAL Decoder UPDATE Vec Pre-allocation

**Assessment finding:** P-5 (MEDIUM)
**File:** `src/wal_decoder.rs`

The UPDATE handler previously allocated five separate `Vec`s per row (column names for the DELETE part, column names for the INSERT part, placeholder lists, parameter arrays) plus multiple `to_string()` calls per column. After the fix:

- A single `Vec::with_capacity(num_columns)` is allocated for column names at the top of the function.
- Placeholder strings are written into a pre-allocated `String` with `write!(buf, "${}", n)`.
- The separate D+I parameter vectors share a `Vec::with_capacity(2 * num_columns)` allocation.

Estimated reduction: 5× fewer heap allocations per UPDATE row.
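The resulting allocation pattern looks roughly like the sketch below. It is illustrative only: `num_columns` and `column_names` are stand-ins for the decoder's real locals, and the shipped code fills names from the decoded tuple rather than synthesizing them.

```rust
use std::fmt::Write;

// Sketch of the post-fix allocation pattern (not the shipped decoder).
// One Vec and one String are sized up front, then reused while iterating
// the columns, instead of allocating fresh buffers per column.
fn build_placeholders(num_columns: usize) -> (Vec<String>, String) {
    let mut column_names: Vec<String> = Vec::with_capacity(num_columns);
    let mut placeholders = String::new();

    for n in 1..=num_columns {
        // Stand-in for the decoded column name.
        column_names.push(format!("col_{n}"));
        if n > 1 {
            placeholders.push_str(", ");
        }
        // write! appends into the existing buffer rather than allocating a
        // new String per placeholder.
        write!(placeholders, "${n}").expect("writing to a String cannot fail");
    }
    (column_names, placeholders)
}
```

The point is that every buffer is sized once at the top of the function; the per-column loop only appends.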
### PERF-6: Frontier Borrow in Source-Change Probe

**Assessment finding:** P-6 (MEDIUM)
**File:** `src/scheduler/mod.rs`

`has_stream_table_source_changes()` previously called `st.frontier.clone().unwrap_or_default()`, producing a full `Frontier` clone (a `HashMap`) for every upstream-dependency probe.

**After:**

```rust
let frontier = st.frontier.as_ref().unwrap_or(&Frontier::EMPTY);
```

A new `Frontier::EMPTY` static provides a zero-allocation default.

### PERF-7: Diamond Detection Lazy Intersection

**Assessment finding:** P-7 (LOW)
**File:** `src/dag.rs`

Branch ancestor intersection now uses `HashSet::intersection()` as a lazy iterator that stops at the first shared node, avoiding the `collect()` into a temporary `Vec` for every branch pair. Branches with no shared ancestors (the common case) exit without any allocation.

---

## Observability Improvements

### OBS-1: CDC Lag Percentile Metrics

**Assessment finding:** O-1 (MEDIUM)
**Files:** `src/shmem.rs`, `src/monitor/mod.rs`, `monitoring/prometheus/pg_trickle_queries.yml`

A rolling 15-minute reservoir sampler is added to shared memory, recording the age (in seconds) of each frontier advancement. The sampler uses a 256-slot fixed-size circular buffer per worker that is reduced to percentiles on metric scrape.

New Prometheus gauges:

- `pg_trickle_cdc_lag_p50_seconds`
- `pg_trickle_cdc_lag_p95_seconds`
- `pg_trickle_cdc_lag_p99_seconds`

### OBS-2: Parallel Worker Utilisation Metrics

**Assessment finding:** O-2 (MEDIUM)
**Files:** `src/scheduler/pool.rs`, `src/shmem.rs`

Two new shared-memory atomics:

- `PARALLEL_QUEUE_DEPTH` (`PgAtomic`) — incremented when a job is enqueued, decremented when a worker picks it up.
- `WORKER_IDLE_MS_TOTAL` (`PgAtomic`) — cumulative milliseconds workers have spent in the idle wait loop.

Both are exposed at the `/metrics` endpoint.

### OBS-3: WAL Decoder Pending-Record Metric

**Assessment finding:** O-3 (MEDIUM)
**File:** `src/wal_decoder.rs`

After each poll cycle, the WAL decoder writes the number of buffered-but-not-yet-written records to a new `WAL_DECODER_PENDING_RECORDS` shmem atomic. The Prometheus endpoint reads and resets this counter on each scrape, exposing it as `pg_trickle_wal_decoder_pending_records`.

An alerting recommendation is added to `monitoring/prometheus/pg_trickle_alerts.yml`: fire if `pg_trickle_wal_decoder_pending_records > 10000` for 5 minutes.

### OBS-4: Refresh-Mode Ratio Counters

**Assessment finding:** O-4 (MEDIUM)
**Files:** `src/refresh/orchestrator.rs`, `src/shmem.rs`

Two counter arrays (one entry per stream table, keyed by `pgt_id`) track the cumulative counts of DIFFERENTIAL and FULL refresh cycles. Both are exposed at the `/metrics` endpoint with a `pgt_id` label, and the `postgres_exporter`-compatible queries block in `monitoring/prometheus/pg_trickle_queries.yml` is updated accordingly.

### OBS-5: application_name in Background Workers

**Assessment finding:** O-5 (MEDIUM)
**Files:** `src/scheduler/mod.rs`, `src/scheduler/scheduler_loop.rs`, `src/scheduler/pool.rs`, `src/scheduler/dispatch.rs`, `src/wal_decoder.rs`

Each background-worker role now executes:

```sql
SET application_name = 'pg_trickle_<role>'
```

immediately after its first `Spi::connect()`.

Role names:

- `pg_trickle_launcher` — database discovery loop
- `pg_trickle_scheduler` — per-database scheduler
- `pg_trickle_pool_N` — parallel refresh workers (N = worker index)
- `pg_trickle_dispatcher` — parallel refresh DAG dispatcher
- `pg_trickle_wal_decoder` — WAL / logical replication poller
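The tagging step itself is a single statement. A minimal sketch using pgrx's `Spi::run` (the helper name is hypothetical, not part of the extension's API):

```rust
use pgrx::prelude::*;

// Hypothetical helper (sketch): tag the backend so it is identifiable in
// pg_stat_activity. Called once, immediately after the worker's first
// Spi::connect(). No quoting is needed because the role names are the
// static constants listed above, never user input.
fn set_worker_application_name(role: &str) -> Result<(), pgrx::spi::Error> {
    Spi::run(&format!("SET application_name = 'pg_trickle_{role}'"))
}
```

With this in place, `SELECT pid, application_name, state FROM pg_stat_activity WHERE application_name LIKE 'pg_trickle_%';` distinguishes the launcher, scheduler, pool workers, dispatcher, and WAL decoder at a glance.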
### OBS-6: Backup & Restore Documentation

**Assessment finding:** O-6 (LOW)
**File:** `INSTALL.md`

A new "Backup & Restore" section is added at the end of the installation guide. Key content:

```bash
# Include the pgtrickle and pgtrickle_changes schemas (alongside your data
# schemas) when dumping a pg_trickle-enabled database:
pg_dump --schema=public --schema=pgtrickle --schema=pgtrickle_changes \
    mydb > mydb_backup.sql

# After restore, validate catalog integrity:
psql mydb -c "SELECT * FROM pgtrickle.health_check();"
```

The section also notes that change-buffer tables (`pgtrickle_changes.*`) are named per source OID and that OID assignment may differ after restore, so `pg_trickle_repair_stream_table()` should be run on each stream table immediately after restore.

---

## Upgrade Notes

The SQL upgrade script `sql/pg_trickle--0.58.0--0.59.0.sql` adds one column to `pgtrickle.pgt_stream_tables`:

```sql
ALTER TABLE pgtrickle.pgt_stream_tables
    ADD COLUMN IF NOT EXISTS defining_query_hash BIGINT NOT NULL DEFAULT 0;
```

The column is backfilled automatically. No existing stream tables need to be recreated.
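As a quick post-upgrade sanity check, the backfill can be verified with a one-row query. The sketch below uses pgrx's `Spi`; the helper is hypothetical and not part of the extension:

```rust
use pgrx::prelude::*;

// Hypothetical post-upgrade check (sketch): count catalog rows still at the
// 0 default after the defining_query_hash backfill. A genuine hash of 0 is
// possible in principle but vanishingly unlikely.
fn defining_query_hash_backfill_complete() -> Result<bool, pgrx::spi::Error> {
    let unhashed = Spi::get_one::<i64>(
        "SELECT count(*) FROM pgtrickle.pgt_stream_tables \
         WHERE defining_query_hash = 0",
    )?
    .unwrap_or(0); // count(*) never returns NULL; this is belt-and-braces
    Ok(unhashed == 0)
}
```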