> **Plain-language companion:** [v0.18.0.md](v0.18.0.md)

## v0.18.0 — Hardening & Delta Performance

**Status: Released (2026-04-12).**

> **Release Theme**
> This release hardens pg_trickle for production at scale and delivers the
> biggest remaining performance win in the differential refresh path. The Z-set
> multi-source delta engine merges per-source delta branches into a single
> `GROUP BY + SUM(weight)` query, eliminating redundant join evaluation when
> multiple source tables change in the same cycle. Cross-source snapshot
> consistency guarantees that multi-source stream tables always read all
> upstream tables at the same transaction boundary — closing the last known
> correctness gap. Every production-path `.unwrap()` is replaced with graceful
> error propagation, another ~69 unsafe blocks are eliminated, and a populated
> TPC-H baseline turns the 22-query suite into a true regression canary.
> SQLancer fuzzing integration provides an external, assumption-free
> correctness oracle. Together, these changes build the confidence foundation
> for 1.0.

### Correctness

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| CORR-1 | Enforce cross-source snapshot consistency | L | P0 |
| CORR-2 | Populate TPC-H expected-output regression guard | XS | P0 |
| CORR-3 | NULL-safe GROUP BY elimination under deletes | S | P1 |
| CORR-4 | Z-set merged-delta weight accounting proof | M | P0 |
| CORR-5 | HAVING-filtered aggregate correction under group depletion | S | P1 |

**CORR-1 — Enforce cross-source snapshot consistency (CSS-3)**

> **In plain terms:** When a stream table reads from two different source
> tables, there is a window where it can see source A at a newer point in
> time than source B — for example, seeing a new order but the old
> inventory count. Phase 3 completes the tick-watermark enforcement so
> both sources are always read at the same consistent LSN before any
> refresh proceeds. Phases 1 and 2 are already complete.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| CSS-3-1 | LSN watermark enforcement in the scheduler — hold refresh until all upstream sources reach the same tick boundary | 4–6h | [PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md](plans/sql/PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md) §Phase 3 |
| CSS-3-2 | Catalog column `pgt_css_watermark_lsn` + GUC `pg_trickle.cross_source_consistency` (default `off`) | 2–3h | — |
| CSS-3-3 | E2E test: concurrent writes to two sources, assert stream table never sees a split snapshot | 2–3h | — |

> **CSS-3 subtotal: ~8–12 hours**
> Dependencies: None. Schema change: Yes.

**CORR-2 — Populate TPC-H expected-output regression guard (TPCH-BASE)**

> **In plain terms:** The TPC-H correctness tests run all 22 queries but
> the expected-output comparison guard was never populated — so the tests
> catch structural failures but not quiet result regressions. Populating
> the baseline turns the suite into a true correctness canary.

| Item | Description | Effort |
|------|-------------|--------|
| TPCH-BASE-1 | Run TPC-H suite once at known-good state; capture output | 30min |
| TPCH-BASE-2 | Populate comparison baseline in `e2e_tpch_tests.rs` line 89 (remove TODO); verify guard fires on a deliberate regression | 1h |

> **TPCH-BASE subtotal: ~1–2 hours**
> Dependencies: None. Schema change: No.

**CORR-3 — NULL-safe GROUP BY elimination under deletes**

> **In plain terms:** When all rows in a GROUP BY group are deleted and the
> grouping key contains NULLs, the differential engine must correctly remove
> the group. SQL's three-valued logic in `IS DISTINCT FROM` may cause delta
> weight miscounting for NULL keys.

Verify: E2E test with `GROUP BY nullable_col`, delete all group members,
assert zero rows remain in the stream table.
Dependencies: None. Schema change: No.

**CORR-4 — Z-set merged-delta weight accounting proof**

> **In plain terms:** Companion correctness gate for PERF-1 (B3-MERGE). The
> Z-set algebra requires that `SUM(weight)` across all merged branches for
> every primary key never produces a spurious net-positive or net-negative for
> a single join path.

Verify: property-based tests (proptest) asserting `merged_weights ==
individual_branch_sums` for randomly generated multi-source DAGs. All
existing B3-3 diamond-flow tests must pass unchanged.
Dependencies: PERF-1. Schema change: No.

**CORR-5 — HAVING-filtered aggregate correction under group depletion**

> **In plain terms:** When a HAVING-qualified group loses enough rows to no
> longer satisfy the predicate (e.g., `HAVING count(*) > 5` and 3 of 6 rows
> are deleted), the differential aggregate path must delete the stream table
> row rather than leaving a stale row matching the old HAVING predicate.

Verify: E2E test with HAVING + selective deletes crossing the threshold.
Dependencies: None. Schema change: No.

### Stability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| STAB-1 | Eliminate production-path .unwrap() calls | S | P0 |
| STAB-2 | unsafe block reduction Phase 1 | M | P1 |
| STAB-3 | Spill detection alerting | S | P1 |
| STAB-4 | Parallel worker orphaned resource cleanup | M | P1 |
| STAB-5 | Upgrade migration test (0.17→0.18) | S | P0 |
| STAB-6 | Error SQLSTATE coverage audit | S | P2 |

**STAB-1 — Eliminate production-path .unwrap() calls (SAFE-1)**

> **In plain terms:** A small number of SQL-parsing code paths in
> production (non-test) code call `.unwrap()` directly — if they encounter
> unexpected input they will panic the backend process and disconnect all
> clients. These should propagate errors gracefully instead.

| Item | Description | Effort |
|------|-------------|--------|
| SAFE-1-1 | `detect_and_strip_distinct()` call in `api.rs` (L8163) → propagate `PgTrickleError` | 1h |
| SAFE-1-2 | `find_top_level_keyword(sql, "FROM")` calls in `api.rs` (L8229–8258, 3×) → propagate error | 1h |
| SAFE-1-3 | `merge_sql[using_start.unwrap()..using_end.unwrap()]` in `refresh.rs` (L6236) → bounds-check | 1h |
| SAFE-1-4 | `entry.unwrap()` in delta computation loop in `refresh.rs` (L5992) → return `Err` | 1h |
| SAFE-1-5 | Chained `.unwrap().unwrap()` in `refresh.rs` (L6556–6557) → propagate | 1h |

> **SAFE-1 subtotal: ~4–6 hours**
> Dependencies: None. Schema change: No.

**STAB-2 — unsafe block reduction Phase 1 (UNSAFE-P1)**

> **In plain terms:** The DVM parser has 1,286 `unsafe` blocks — 98% of
> the total. Phase 1 introduces a single `pg_cstr_to_str()` safe helper
> that eliminates ~69 of the most mechanical ones: C-string-to-Rust
> conversions. No API or behavior change; pure safety improvement.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| UNSAFE-P1-1 | Implement `pg_cstr_to_str(ptr: *const c_char) -> &str` safe wrapper in `src/dvm/parser/mod.rs` | 1h | [PLAN_REDUCED_UNSAFE.md](plans/safety/PLAN_REDUCED_UNSAFE.md) §Phase 1 |
| UNSAFE-P1-2 | Replace ~69 `unsafe { CStr::from_ptr(...).to_str()... }` call-sites with the safe helper | 4–6h | — |
| UNSAFE-P1-3 | `unsafe_inventory.sh` baseline update + CI check | 1h | `scripts/unsafe_inventory.sh` |

> **UNSAFE-P1 subtotal: ~6–8 hours**
> Dependencies: None. Schema change: No.

**STAB-3 — Spill detection alerting (PH-E2)**

> **In plain terms:** The GUCs `pg_trickle.spill_threshold_blocks` and
> `pg_trickle.spill_consecutive_limit` already exist to configure spill
> budgets, but no alert fires when a refresh actually spills to disk. This
> adds an `AlertEvent::SpillThresholdExceeded` notification so operators
> know when large delta queries are hitting disk.

| Item | Description | Effort |
|------|-------------|--------|
| PH-E2-1 | Add `AlertEvent::SpillThresholdExceeded` variant to `src/monitor.rs` | 1h |
| PH-E2-2 | Detect spill after MERGE execution; emit alert when consecutive count exceeds limit | 2–3h |
| PH-E2-3 | E2E test: configure low spill threshold, trigger spill, assert alert fires | 1–2h |

> **PH-E2 subtotal: ~4–6 hours**
> Dependencies: None. Schema change: No.

**STAB-4 — Parallel worker orphaned resource cleanup**

> **In plain terms:** After a parallel worker panics mid-refresh, advisory
> locks, `__pgt_delta_*` temp tables, and partially-written change buffer
> rows may be left behind. The scheduler recovery path must clean these up.

Audit the recovery path to ensure: (a) advisory locks are released on next
scheduler tick, (b) temp tables are cleaned up, (c) change buffer rows are
not double-counted on retry. Verify: E2E test simulating worker crash via
`pg_terminate_backend()` followed by successful recovery.
Dependencies: None. Schema change: No.

**STAB-5 — Upgrade migration test (0.17→0.18)**
Extend the upgrade E2E test framework to cover the 0.17.0→0.18.0 migration
path and the three-version chain 0.16→0.17→0.18. Verify: catalog column
additions, new function signatures, existing stream tables survive, refresh
continues working post-upgrade.
Dependencies: All schema-changing items (CORR-1). Schema change: No.

**STAB-6 — Error SQLSTATE coverage audit**
Audit all `ereport!()` and `error!()` calls for SQLSTATE classification.
Ensure every user-facing error has a unique, documented SQLSTATE code that
connection poolers and application retry logic can pattern-match. Cross-
reference with `docs/ERRORS.md` for completeness.
Dependencies: None. Schema change: No.

### Performance

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| PERF-1 | Z-set multi-source delta engine | L | P0 |
| PERF-2 | Cost-based refresh strategy completion | L | P1 |
| PERF-3 | Zero-change source branch elision | M | P1 |
| PERF-4 | Columnar change tracking Phase 1 — CDC bitmask | L | P1 |
| PERF-5 | Index hint generation for MERGE target | S | P2 |

**PERF-1 — Z-set multi-source delta engine (B3-MERGE)**

> **In plain terms:** When a stream table joins multiple tables and more
> than one of those tables receives changes in the same scheduler cycle,
> the current engine generates one delta branch per source and stacks them
> in a `UNION ALL`. With this change those branches are merged into a
> single `GROUP BY + SUM(weight)` query using Z-set algebra, eliminating
> duplicate evaluation of shared join paths. B3-1 (branch pruning) and
> B3-3 (correctness proofs) are already done; this is the final payoff.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| B3-2-1 | Z-set merged-delta generation in `src/dvm/diff.rs` (`DiffEngine::diff_node()`) | 8–10h | [PLAN_MULTI_TABLE_DELTA_BATCHING.md](plans/performance/PLAN_MULTI_TABLE_DELTA_BATCHING.md) |
| B3-2-2 | Unit + property-based tests (existing B3-3 diamond-flow tests must pass unchanged) | 2–4h | — |
| B3-2-3 | Benchmark regression check against Part-8 baseline | 2h | — |

> **B3-MERGE subtotal: ~12–16 hours**
> Dependencies: CORR-4 (property tests must accompany). Schema change: No.

**PERF-2 — Cost-based refresh strategy completion (B-4 remainder)**

> **In plain terms:** *Deferred from v0.17.0.* The `refresh_strategy` GUC
> landed in the current cycle. The remaining work is the per-ST cost model:
> collect `delta_row_count`, `merge_duration_ms`, `full_refresh_duration_ms`
> from `pgt_refresh_history`; fit a simple linear cost model; cold-start
> heuristic (<10 refreshes) falls back to the fixed threshold.

Verify: mixed-workload benchmark showing the model picks the cheaper strategy
≥80% of the time.
Dependencies: B-4 Phase 1 (shipped). Schema change: No.

**PERF-3 — Zero-change source branch elision**

> **In plain terms:** When building a multi-source delta query, skip branches
> entirely for sources with empty change buffers. Currently all branches are
> generated and executed regardless of whether a source has changes.

Verify: benchmark showing latency reduction when 1-of-3 sources changes vs.
all 3 changing.
Dependencies: PERF-1 (applies to the merged delta builder). Schema change: No.

**PERF-4 — Columnar change tracking Phase 1 — CDC bitmask (A-2-COL-1)**

> **In plain terms:** *Deferred from v0.17.0.* Compute `changed_columns`
> bitmask (`old.col IS DISTINCT FROM new.col`) in the CDC trigger; store as
> `int8` or `bit(n)` alongside the change row. Phase 1 only: bitmask
> computation + storage. Phase 2 (delta-scan filtering using the bitmask)
> deferred to v0.22.0. Provides the foundation for 50–90% delta volume
> reduction on wide-table UPDATE workloads.

Gate behind `pg_trickle.columnar_tracking` GUC (default `off`).
Dependencies: None. Schema change: Yes (change buffer schema addition).

**PERF-5 — Index hint generation for MERGE target**

> **In plain terms:** When the stream table has a covering index on the
> MERGE join keys, bias the planner toward the index to avoid expensive
> sequential scans during delta application on large stream tables.

Emit `SET enable_seqscan = off` within the MERGE statement's session.
Verify: `EXPLAIN ANALYZE` shows index scan on MERGE for tables with PK index.
Dependencies: None. Schema change: No.

### Scalability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| SCAL-1 | Change buffer growth stress test at 10× write rate | M | P1 |
| SCAL-2 | Parallel worker utilization profiling at 200+ STs | M | P2 |
| SCAL-3 | Delta working-set memory cap | M | P2 |

**SCAL-1 — Change buffer growth stress test at 10× write rate**
Run a sustained write load at 10× normal throughput for 30+ minutes with
intentionally slow refresh intervals. Verify the `max_buffer_rows` cap
triggers correctly, FULL refresh clears the backlog, no disk exhaustion
occurs, and the extension recovers cleanly once write rate normalizes. This
validates the v0.16.0 buffer growth protection under extreme conditions.
Dependencies: None. Schema change: No.

**SCAL-2 — Parallel worker utilization profiling at 200+ STs**
Profile the scheduler with 200+ stream tables across
`pg_trickle.max_workers` = 4/8/16 settings. Measure: CPU utilization per
worker, scheduling queue depth, per-ST refresh latency P50/P99. Identify
whether the scheduling loop itself becomes a bottleneck before worker
saturation. Document findings as a scaling guide section.
Dependencies: None. Schema change: No.

**SCAL-3 — Delta working-set memory cap**
The current delta merge can allocate unbounded `work_mem` for hash joins. Add
a configurable cap (`pg_trickle.delta_work_mem_mb`, default: 256 MB) that
triggers FULL refresh fallback when the delta working set would exceed the
limit, preventing OOM on unexpectedly large deltas. Verify: E2E test with low
cap triggers fallback and logs a warning.
Dependencies: None. Schema change: No.

### Ease of Use

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| UX-1 | Template cache observability | S | P1 |
| UX-2 | Pre-built Grafana dashboard panels | M | P1 |
| UX-3 | Error message actionability audit | S | P1 |
| UX-4 | Single-endpoint health summary function | S | P2 |
| UX-5 | Prometheus metric completeness audit | XS | P2 |
| UX-6 | TUI surfaces for cache_stats and health_summary | XS | P2 |

**UX-1 — Template cache observability (CACHE-OBS)**

> **In plain terms:** The delta SQL template cache (`IVM_DELTA_CACHE`)
> saves regenerating delta queries on every refresh cycle, but its hit rate
> is invisible to operators. Adding `pgtrickle.cache_stats()` lets you see
> whether the cache is effective and tune `pg_trickle.ivm_cache_size`
> accordingly.

| Item | Description | Effort |
|------|-------------|--------|
| CACHE-OBS-1 | Add hit/miss/eviction counters to `IVM_DELTA_CACHE` | 1h |
| CACHE-OBS-2 | Expose via `pgtrickle.cache_stats()` returning `(hits BIGINT, misses BIGINT, evictions BIGINT, size INT)` | 1–2h |
| CACHE-OBS-3 | Documentation and E2E smoke test | 1h |

> **CACHE-OBS subtotal: ~3–4 hours**
> Dependencies: None. Schema change: No.

**UX-2 — Pre-built Grafana dashboard panels**
Extend `monitoring/grafana/` with import-ready JSON panels for: refresh
latency P50/P99 histogram, differential vs. FULL refresh ratio over time,
change buffer backlog per stream table, spill event count, template cache hit
rate, and worker utilization gauge. Document import instructions in
`monitoring/README.md`.
Dependencies: UX-1 (cache stats metric), STAB-3 (spill events). Schema change: No.

**UX-3 — Error message actionability audit**
Audit all `PgTrickleError` variants and `ereport!()`/`error!()` calls. Ensure
every user-facing error includes: the stream table name (when applicable), the
operation that failed, and a 1-sentence remediation hint. Cross-reference with
`docs/ERRORS.md`; add missing entries.
Dependencies: None. Schema change: No.

**UX-4 — Single-endpoint health summary function**
New `pgtrickle.health_summary()` function returning a single-row JSONB:
total STs, healthy/degraded/error counts, oldest un-refreshed age, largest
buffer backlog, fuse status, scheduler state. Useful for monitoring
integrations (Nagios, Datadog) without parsing multiple views.
Dependencies: None. Schema change: No.

**UX-5 — Prometheus metric completeness audit**
Verify every metric emitted by the extension matches the documented name in
`docs/CONFIGURATION.md` §Prometheus. Remove undocumented metrics or add
documentation. Ensure metric names follow Prometheus naming conventions
(`pgtrickle_*` prefix, snake_case, unit suffix).
Dependencies: None. Schema change: No.

**UX-6 — TUI surfaces for cache_stats and health_summary**

> **In plain terms:** The new `pgtrickle.cache_stats()` (UX-1) and
> `pgtrickle.health_summary()` (UX-4) functions are useful in isolation but
> are most discoverable when surfaced in the TUI. Even a read-only status
> panel showing total STs, healthy/degraded/error counts, cache hit rate, and
> scheduler state would make these endpoints visible to users who reach the
> extension through `pgtrickle-tui` rather than raw SQL. Audit `pgtrickle-tui/src/`
> to identify the lightest-weight integration point (likely a new "Health" tab
> or an expanded "Status" panel). If TUI changes are out of scope for this
> release, document the gap in `docs/TUI.md` so it is not silently deferred.

Verify: TUI displays non-zero cache stats and a valid health JSONB row after
at least one refresh cycle in the E2E playground environment.
Dependencies: UX-1, UX-4. Schema change: No.

### Test Coverage

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| TEST-1 | TPC-H regression baseline | XS | P0 |
| TEST-2 | SQLancer fuzzing — crash-test oracle | L | P1 |
| TEST-3 | CDC edge cases: NULL PKs, composite PKs, generated columns | M | P1 |
| TEST-4 | Property-based tests for Z-set merged delta | M | P0 |
| TEST-5 | Light E2E eligibility audit | S | P2 |
| TEST-6 | Three-version upgrade chain test (0.16→0.17→0.18) | S | P0 |
| TEST-7 | dbt integration regression coverage | S | P1 |

**TEST-1 — TPC-H regression baseline (TPCH-BASE)**
Same as CORR-2. Capture known-good outputs; verify guard fires on deliberate
regression.
Dependencies: None. Schema change: No.

**TEST-2 — SQLancer fuzzing — crash-test oracle**

> **In plain terms:** *Deferred from v0.17.0 (second time).* Scope reduced
> to crash-test oracle only for v0.18.0: SQLancer in Docker, configured to
> feed randomized SQL to the parser and DVM pipeline. Zero-panic guarantee —
> any input that crashes the extension is a bug. Equivalence oracle
> (DIFFERENTIAL ≡ FULL) and stateful DML fuzzing deferred to v0.22.0.

Verify: 10K+ fuzzed queries with zero panics.
Dependencies: None. Schema change: No.

**TEST-3 — CDC edge cases: NULL PKs, composite PKs, generated columns**
Create E2E tests covering: (a) tables with nullable PK columns in
differential mode, (b) composite PKs with 3+ columns, (c) `GENERATED ALWAYS
AS` stored columns as source columns, (d) domain-typed columns, (e)
array-typed columns referenced in defining queries.
Dependencies: None. Schema change: No.

**TEST-4 — Property-based tests for Z-set merged delta**
Required companion to PERF-1. proptest-based tests generating random
multi-source DAGs (2–5 sources, 1–3 join levels) with random DML sequences.
Assert merged delta produces identical stream table state as sequential
per-branch application. Detect weight-accounting bugs before they ship.
Dependencies: PERF-1. Schema change: No.

**TEST-5 — Light E2E eligibility audit**
Review all 10 full E2E test files (~90 tests). Identify tests that don't
require custom Docker image features (custom extensions, special
configurations) and can run on the stock `postgres:18.3` image. Migrate
eligible tests to reduce CI wall-clock time on PRs.
Dependencies: None. Schema change: No.

**TEST-6 — Three-version upgrade chain test (0.16→0.17→0.18)**
Extend upgrade E2E tests to cover: fresh install of 0.16.0, create stream
tables, upgrade to 0.17.0, verify survival, upgrade to 0.18.0, verify
survival + new features functional.
Dependencies: All schema-changing items. Schema change: No.

**TEST-7 — dbt integration regression coverage**

> **In plain terms:** The `dbt-pgtrickle` macro package is the primary
> adoption vector for teams using dbt, but the integration test suite in
> `dbt-pgtrickle/integration_tests/` currently verifies only happy-path macro
> expansion. Add regression tests covering: (a) `pgtrickle_stream_table` macro
> with all supported materialisation strategies (`differential`, `full`, `auto`),
> (b) incremental model compatibility, (c) `pgtrickle_status` test macro,
> (d) teardown and recreation idempotency (drop + re-run produces identical
> output). Run as part of `just test-dbt`.

Verify: `just test-dbt` passes all new cases; idempotency test confirms
identical stream table contents after a full `dbt run --full-refresh` cycle.
Dependencies: None. Schema change: No.

### Conflicts & Risks

1. **PERF-1 + CORR-4 + TEST-4 form a mandatory cluster.** The Z-set
   multi-source delta engine (B3-MERGE) is the highest-impact performance
   item but also touches the DVM engine core. Property-based tests (TEST-4)
   and the weight accounting proof (CORR-4) are *not optional* — they must
   ship alongside PERF-1 to prevent correctness regressions.

2. **Two schema changes.** CORR-1 (CSS-3) adds `pgt_css_watermark_lsn` to
   the catalog. PERF-4 (A-2-COL-1) adds `changed_columns` to change buffer
   tables. Both require upgrade migration scripts and freeze-risk
   coordination. Consider batching both into a single migration file.

3. **PERF-3 depends on PERF-1.** Zero-change branch elision modifies the
   same delta query builder as B3-MERGE. Sequence PERF-3 strictly after
   PERF-1 to avoid merge conflicts and compound risk.

4. **TEST-2 (SQLancer) is deferred for the second time.** Originally planned
   for v0.17.0, it remains unstarted. v0.18.0 scopes it to crash-test oracle
   only (L effort instead of XL), but there is a risk of perpetual deferral.
   If capacity is tight, prioritize the crash-test oracle as a standalone
   deliverable rather than deferring the full suite again.

5. **PERF-2 (cost model) requires production history data.** The per-ST cost
   model trains on `pgt_refresh_history`. Users upgrading from v0.17.0 will
   have a cold history cache. The cold-start heuristic (< 10 refreshes) is
   critical — test it explicitly.

6. **PERF-4 (columnar tracking) changes CDC trigger output.** The
   `changed_columns` bitmask adds overhead to every trigger invocation.
   Gate behind a GUC (default `off`) and benchmark the per-row overhead
   (`< 1μs` target) before enabling by default in a later release.

7. **B-4 and A-2-COL are carry-overs from v0.17.0.** Both were originally
   scoped for v0.17.0 but not started. They are re-proposed here with
   reduced scope (B-4 cost model only, A-2-COL Phase 1 bitmask only).
   If v0.17.0 ships B-4 partially, adjust PERF-2 scope accordingly.

> **v0.18.0 total: ~70–100 hours**

**Exit criteria:**
- [x] CORR-1: Split-snapshot E2E test passes under concurrent writes; `pgt_css_watermark_lsn` column added
- [x] CORR-2 / TEST-1: TPC-H baseline populated; deliberate regression detected by the guard
- [x] CORR-3: NULL-keyed GROUP BY group fully removed after all-row delete
- [x] CORR-4 / TEST-4: Property-based Z-set weight tests pass for randomly generated multi-source DAGs
- [x] CORR-5: HAVING-qualified group deleted from stream table when row count drops below threshold
- [x] STAB-1: All production-path `unwrap()` calls in `api.rs` and `refresh.rs` replaced with proper error propagation
- [x] STAB-2: `unsafe_inventory.sh` reports ≥69 fewer `unsafe` blocks; CI baseline updated
- [x] STAB-3: Spill alert fires in E2E test with artificially low threshold
- [x] STAB-4: Worker crash recovery E2E test cleans up advisory locks, temp tables, and buffer rows
- [x] STAB-5 / TEST-6: Three-version upgrade chain (0.16→0.17→0.18) passes
- [x] STAB-6: All user-facing errors have documented SQLSTATE codes in `docs/ERRORS.md`
- [x] PERF-1: Merged multi-source delta implemented; all B3-3 diamond-flow property tests pass unchanged
- [x] PERF-2: Cost model picks cheaper strategy ≥80% of the time on mixed workload benchmark
- [x] PERF-3: Zero-change branch elision shows measurable latency reduction in multi-source benchmark
- [x] PERF-4: `changed_columns` bitmask stored in change buffer; per-row overhead < 1μs
- [x] PERF-5: Index scan confirmed via EXPLAIN ANALYZE for MERGE on tables with PK covering index
- [x] SCAL-1: Buffer growth stress test at 10× rate completes without disk exhaustion or data loss
- [x] SCAL-2: Profiling report for 200+ STs documented
- [x] SCAL-3: Delta work_mem cap triggers FULL fallback in E2E test
- [x] UX-1: `pgtrickle.cache_stats()` returns correct counters in smoke test
- [x] UX-2: Grafana dashboard JSON importable; documents refresh latency, buffer backlog, spill events
- [x] UX-3: Error message audit complete; all errors include table name and remediation hint
- [x] UX-4: `pgtrickle.health_summary()` returns single-row JSONB with correct counts
- [x] UX-5: Prometheus metric names match documentation; no undocumented metrics
- [x] TEST-2: SQLancer crash-test oracle runs 10K+ fuzzed queries with zero panics
- [x] TEST-3: CDC edge case tests cover NULL PKs, composite PKs, generated columns, domain types, arrays
- [x] TEST-5: At least 10 tests migrated from full E2E to light E2E
- [x] TEST-7: dbt regression suite covers all macro strategies and teardown idempotency; `just test-dbt` passes
- [x] UX-6: TUI (or `docs/TUI.md` gap note) reflects `cache_stats()` and `health_summary()` availability
- [x] Extension upgrade path tested (`0.17.0 → 0.18.0`)
- [x] `just check-version-sync` passes


---