> **Plain-language companion:** [v0.23.0.md](v0.23.0.md)

## v0.23.0 — TPC-H DVM Scaling Performance

**Status: Released (2026-04-19).** Driven by [PLAN_TPCH_DVM_PERF.md](plans/performance/PLAN_TPCH_DVM_PERF.md).
This release covers root-cause investigation and targeted fixes for three
differential-refresh failure modes found by running
`test_tpch_performance_comparison` at SF=0.01/0.1/1.0 (April 2026). In that
pre-fix baseline at SF=1.0, DIFF was slower than FULL re-evaluation for 18 of
22 TPC-H queries; the worst case (q09) was 2,246× slower. The work items
follow a diagnosis-first workflow: confirm hypotheses before coding, then
apply fixes to the smallest affected code paths.

> **Release Theme**
> This release closes the gap between the differential refresh engine's
> theoretical O(Δ) complexity and its observed super-linear scaling at
> SF=1.0. Three failure modes are addressed in sequence: (1) threshold
> collapse in multi-join queries (q05/q07/q08/q09/q22), (2) early collapse
> in EXISTS anti-join queries (q04), and (3) a structural bug in doubly-nested
> correlated subqueries (q20). Each fix maps to existing DI items in
> [PLAN_DVM_IMPROVEMENTS.md](plans/performance/PLAN_DVM_IMPROVEMENTS.md)
> and is validated against all 22 TPC-H queries at SF=1.0 using
> `test_tpch_differential_correctness` after every code change.

---

### Phase 1 — Diagnosis

| Item | Description | Effort | Phase |
|------|-------------|--------|-------|
| P1-1 | **work_mem benchmark.** Run `test_tpch_performance_comparison` at SF=1.0 with `work_mem = '1GB'`. If q05/q07/q08/q09 drop to <500ms the bottleneck is PostgreSQL hash/sort spill (Path A); if they stay >5s it is DVM intermediate cardinality blowup (Path B). Determines which fix path to follow in Phase 2. | 0.5d | Diagnosis |
| P1-2 | **Delta SQL logging GUC.** Add `pgtrickle.log_delta_sql = on` debug GUC that logs the generated delta SQL at `DEBUG1` level (one `pgrx::log!()` call gated on GUC flag inside `execute_delta_sql`). Allows `EXPLAIN (ANALYZE, BUFFERS)` on generated SQL for q04 and q20 without modifying test code. **Location:** `config.rs` + `refresh.rs`. | 1.0d | Diagnosis |
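
The P1-1 decision rule can be written down as a small classifier. This is a minimal sketch, not project code: the `Bottleneck` enum and `classify_bottleneck` function are hypothetical names, and the `Inconclusive` case is an assumption for timings between the two thresholds, which the table leaves undecided.

```rust
/// Outcome of the P1-1 benchmark for one query, measured with
/// `work_mem = '1GB'`. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum Bottleneck {
    /// Path A: DIFF dropped below 500 ms, so hash/sort spill is the likely cause.
    Spill,
    /// Path B: DIFF stayed above 5 s, so intermediate cardinality is the likely cause.
    DvmCardinality,
    /// Between the two thresholds: the benchmark alone does not decide (assumption).
    Inconclusive,
}

/// Classify a DIFF refresh time in milliseconds using the P1-1 thresholds.
pub fn classify_bottleneck(diff_ms: f64) -> Bottleneck {
    if diff_ms < 500.0 {
        Bottleneck::Spill
    } else if diff_ms > 5_000.0 {
        Bottleneck::DvmCardinality
    } else {
        Bottleneck::Inconclusive
    }
}
```

For example, `classify_bottleneck(120.0)` yields `Bottleneck::Spill`, which would point to P2B-1 rather than P2A-1.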

### Phase 2 — Fix Threshold-Collapse Queries (q05/q07/q08/q09)

*Prerequisites: P1-1 and P1-2 complete.*

| Item | Description | Effort | Path |
|------|-------------|--------|------|
| P2A-1 | **DI-2 aggregate UPDATE-split.** Complete the remaining part of `PLAN_DVM_IMPROVEMENTS.md §DI-2`: split UPDATE rows into DELETE+INSERT for the algebraic aggregate path, eliminating the multi-scan of unchanged base tables and reducing intermediate row counts from O(n) to O(Δ). **Location:** `src/dvm/operators/aggregate.rs`, `src/dvm/diff.rs`. | 2.0d | B (DVM cardinality) |
| P2A-2 | **DI-2 validation — 22/22 TPC-H.** Run `test_tpch_differential_correctness` at SF=1.0 after P2A-1 to confirm no correctness regression. Regression-benchmark against SF=0.01 baseline to confirm no slowdown on currently-fast queries (q02, q11, q16). | 1.5d | B |
| P2B-1 | **work_mem bump in execute_delta_sql.** If P1-1 confirms hypothesis A (spill), set `work_mem` to `pgtrickle.delta_work_mem` (see P5-1) inside the delta execution path before calling `Spi::execute`. No DVM code change required; pure PostgreSQL session GUC. **Location:** `src/refresh.rs`. | 0.5d | A (spill) |
| P2-1 | **EXPLAIN ANALYZE for super-linear queries.** After P1-2 captures delta SQL, run `EXPLAIN (ANALYZE, BUFFERS)` on q13, q15, q17, q22 at SF=0.1 and SF=1.0. Determine whether these benefit from DI-2 or have independent issues (q22 `NOT IN` correlated subquery). | 0.5d | Both |
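
The UPDATE-split in P2A-1 can be illustrated on an in-memory change batch. This is a sketch under assumptions: the `Change` and `SignedRow` types and the `split_updates` function are hypothetical, rows are simplified to `Vec<i64>`, and the signed `count` field only mirrors the role that `__pgt_count` plays in the change buffer.

```rust
/// One captured change. Hypothetical type; real change rows carry more metadata.
#[derive(Debug, Clone, PartialEq)]
pub enum Change {
    Insert(Vec<i64>),
    Delete(Vec<i64>),
    Update { old: Vec<i64>, new: Vec<i64> },
}

/// A row with a signed multiplicity: +1 for INSERT, -1 for DELETE.
#[derive(Debug, Clone, PartialEq)]
pub struct SignedRow {
    pub row: Vec<i64>,
    pub count: i64,
}

/// Rewrite every UPDATE as DELETE(old) followed by INSERT(new);
/// plain INSERT and DELETE rows pass through unchanged.
pub fn split_updates(changes: &[Change]) -> Vec<SignedRow> {
    let mut out = Vec::with_capacity(changes.len());
    for change in changes {
        match change {
            Change::Insert(row) => out.push(SignedRow { row: row.clone(), count: 1 }),
            Change::Delete(row) => out.push(SignedRow { row: row.clone(), count: -1 }),
            Change::Update { old, new } => {
                out.push(SignedRow { row: old.clone(), count: -1 });
                out.push(SignedRow { row: new.clone(), count: 1 });
            }
        }
    }
    out
}
```

A batch that contains only UPDATE rows produces signed counts that sum to zero, which is the property CORR-1 asserts.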

### Phase 3 — Fix Early-Collapse Query (q04)

*Prerequisites: P1-2 complete.*

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| P3-1 | **Verify DI-6 key-filter extraction for q04.** Confirm that `extract_equijoin_keys_aliased` in `anti_join.rs` extracts `l_orderkey = o_orderkey` from q04's correlated EXISTS condition. If the extraction fails (additional non-equi predicates like `l_commitdate < l_receiptdate` in the same EXISTS clause silence the filter), the 140× jump at SF=0.01→0.1 is explained. **Location:** `src/dvm/operators/anti_join.rs`. | 0.5d | DI-6 |
| P3-2 | **Restrict R_old to changed keys only.** If P3-1 shows a gap: change the key-filter construction in `anti_join.rs` and `semi_join.rs` to generate `WHERE l_orderkey IN (SELECT o_orderkey FROM delta_orders)` rather than a static value filter. Turns an O(n) scan into O(Δ). Reduces q04 from 2.1s (SF=0.1) to target <100ms. | 1.5d | DI-6 |
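
The delta-key filter described in P3-2 could be generated along these lines. The `delta_key_filter` and `quote_ident` helpers are hypothetical and are not the actual code in `anti_join.rs`; identifier quoting follows the standard PostgreSQL rule of doubling embedded double quotes.

```rust
/// Quote an identifier for PostgreSQL: wrap it in double quotes and
/// double any embedded double quote.
fn quote_ident(ident: &str) -> String {
    format!("\"{}\"", ident.replace('"', "\"\""))
}

/// Build a filter that restricts the inner relation to keys present in the
/// outer delta, so the R_old scan is driven by the changed keys only.
/// Hypothetical helper, for illustration.
pub fn delta_key_filter(inner_key: &str, outer_key: &str, delta_relation: &str) -> String {
    format!(
        "WHERE {} IN (SELECT {} FROM {})",
        quote_ident(inner_key),
        quote_ident(outer_key),
        quote_ident(delta_relation)
    )
}
```

Calling `delta_key_filter("l_orderkey", "o_orderkey", "delta_orders")` produces the quoted form of the filter shown in the P3-2 description.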

### Phase 4 — Fix Structural Bug (q20)

*Prerequisites: P1-2 complete.*

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| P4-1 | **Analyse doubly-nested EXISTS path.** Use P1-2 delta SQL log output to measure the inner R_old row count at SF=0.1 for q20. Confirm the O(outer_Δ × n_inner) re-materialisation described in `PLAN_DVM_IMPROVEMENTS.md §1`. Estimate speedup from hoisting inner R_old before implementing. | 0.5d | DI-1 |
| P4-2 | **Hoist inner R_old to named CTE.** Modify `DiffContext::add_cte` to detect when a CTE from an inner semi-join/anti-join is referenced from an outer correlated context and promote it to the outer level. Reduces q20 from ~2s (all SFs) to target <50ms. This is a special case of DI-1 (named CTE sharing) applied across nesting levels. **Location:** `src/dvm/diff.rs`. | 2.0d | DI-1 |
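
The idea behind named CTE sharing (DI-1) can be sketched with a small registry that emits each distinct CTE body once at the outermost level and hands the same name back to every later caller. This is a simplified illustration only: `CteRegistry` is a hypothetical type, it is not the real `DiffContext::add_cte`, and it skips the harder part of P4-2, which is detecting that an inner CTE is referenced from an outer correlated context.

```rust
use std::collections::HashMap;

/// Collects CTE definitions for one generated statement. Hypothetical type.
#[derive(Default)]
pub struct CteRegistry {
    /// CTE body -> assigned name, used to deduplicate identical bodies.
    by_body: HashMap<String, String>,
    /// (name, body) pairs in registration order.
    defs: Vec<(String, String)>,
}

impl CteRegistry {
    /// Register a CTE body and return the name to reference.
    /// Identical bodies share one name, so the body is emitted only once.
    pub fn add_cte(&mut self, prefix: &str, body: &str) -> String {
        if let Some(name) = self.by_body.get(body) {
            return name.clone();
        }
        let name = format!("{}_{}", prefix, self.defs.len());
        self.by_body.insert(body.to_string(), name.clone());
        self.defs.push((name.clone(), body.to_string()));
        name
    }

    /// Render the top-level WITH clause, or an empty string if there are no CTEs.
    pub fn with_clause(&self) -> String {
        if self.defs.is_empty() {
            return String::new();
        }
        let parts: Vec<String> = self
            .defs
            .iter()
            .map(|(name, body)| format!("{} AS ({})", name, body))
            .collect();
        format!("WITH {}", parts.join(", "))
    }
}
```

Registering the same inner R_old body from two nesting levels returns the same name both times, so the generated statement contains a single definition that both levels reference.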

### Phase 5 — Planner Hints and work_mem GUC

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| P5-1 | **`pgtrickle.delta_work_mem` GUC.** Add a GUC that sets `work_mem` inside `execute_delta_sql` before running generated SQL. Default `0` (inherit session `work_mem`). Allows tuning without server restart: `ALTER SYSTEM SET pgtrickle.delta_work_mem = '256MB'` followed by `SELECT pg_reload_conf()`. Short-term mitigation while DI-2 completion (Phase 2) is in progress. **Location:** `config.rs` + `refresh.rs`. | 0.5d | — |
| P5-2 | **`pgtrickle.delta_enable_nestloop` GUC (optional).** Add a GUC to disable nested-loop joins inside delta execution (`SET enable_nestloop = off`). Useful diagnostic for planner regressions on large right-side joins before planner statistics are reliable. **Location:** `config.rs` + `refresh.rs`. | 0.5d | — |
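
How the two Phase 5 GUCs might translate into session settings ahead of the delta SQL is sketched below. The `delta_session_settings` function is hypothetical, and the use of `SET LOCAL` for both settings is an assumption made so that the values revert at transaction end; the real code in `refresh.rs` may differ.

```rust
/// Build the statements to run before the generated delta SQL.
/// `delta_work_mem` of "0" (or empty) means "inherit the session work_mem",
/// matching the default described for P5-1. Hypothetical helper.
pub fn delta_session_settings(delta_work_mem: &str, enable_nestloop: bool) -> Vec<String> {
    let mut stmts = Vec::new();
    let mem = delta_work_mem.trim();
    if !mem.is_empty() && mem != "0" {
        // Escape single quotes so the value cannot break out of the literal.
        stmts.push(format!("SET LOCAL work_mem = '{}'", mem.replace('\'', "''")));
    }
    if !enable_nestloop {
        stmts.push("SET LOCAL enable_nestloop = off".to_string());
    }
    stmts
}
```

With `delta_work_mem = "0"` and nested loops left enabled, the function returns an empty list and the delta SQL runs under the unmodified session settings.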

---

### Quality Pillar Enrichment

Items across the six quality pillars that are directly triggered by the
Phase 1–5 DVM code changes and the TPC-H scaling investigation. Items marked
**P0** block the release; **P1** are target; **P2** are nice-to-have.

#### Correctness

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| CORR-1 | **`__pgt_count` invariant under UPDATE-split** | S | P0 | After P2A-1 (DI-2 aggregate UPDATE-split), add a property-based test (proptest/quickcheck) that generates random UPDATE batches and asserts `SUM(__pgt_count) = 0` over the change buffer before and after the UPDATE-split merge path. An imbalanced count silently corrupts the stream table aggregate. **Location:** `src/dvm/operators/aggregate.rs`, `tests/`. |
| CORR-2 | **HAVING correctness after aggregate UPDATE-split** | S | P1 | HAVING filters must be applied to the final merged aggregate, not to the intermediate split rows. Add a regression test with `GROUP BY … HAVING count(*) > N` that applies an UPDATE that changes grouped keys — the expected behaviour is that only rows whose post-update aggregate crosses the HAVING threshold appear in the delta. Catches off-by-one errors in the split path. |
| CORR-3 | **NULL-safe equi-join key extraction in DI-6** | S | P1 | `extract_equijoin_keys_aliased` in `anti_join.rs` and `semi_join.rs` uses standard equality. If a join key column is nullable, the EXCEPT ALL in R_old can miss or double-count rows on NULL keys. Add unit tests for anti-join delta with a NULL `l_orderkey`; fix the key filter to emit `IS NOT DISTINCT FROM` for nullable key columns. **Location:** `src/dvm/operators/anti_join.rs`, `semi_join.rs`. |
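
The CORR-3 fix amounts to choosing the comparison operator per key column. A minimal sketch follows, with a hypothetical `KeyPair` type and `key_predicate` function; how nullability is actually determined for a column is outside the scope of this example.

```rust
/// One extracted equi-join key pair. Hypothetical type, for illustration.
pub struct KeyPair<'a> {
    pub left: &'a str,
    pub right: &'a str,
    /// True if either side of the pair can be NULL.
    pub nullable: bool,
}

/// Render the join-key predicate, using `IS NOT DISTINCT FROM` for nullable
/// keys (NULL matches NULL) and plain `=` otherwise.
pub fn key_predicate(keys: &[KeyPair<'_>]) -> String {
    keys.iter()
        .map(|k| {
            let op = if k.nullable { "IS NOT DISTINCT FROM" } else { "=" };
            format!("{} {} {}", k.left, op, k.right)
        })
        .collect::<Vec<_>>()
        .join(" AND ")
}
```

Keeping `=` for non-nullable keys is a deliberate choice in this sketch: the planner can use plain equality for hash and merge joins, whereas `IS NOT DISTINCT FROM` is generally not treated as a join-capable equality operator.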

#### Stability

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| STAB-1 | **Panic elimination in DI-2 / DI-6 new code paths** | S | P0 | Any `unreachable!()` or `panic!()` in `diff.rs`, `aggregate.rs`, `anti_join.rs`, `semi_join.rs` that can be reached by the new UPDATE-split and key-restriction code paths must be replaced with `PgTrickleError::DvmUnsupportedOperator` and surface as a PostgreSQL `ERROR` (not a backend crash). Audit all `unwrap()` calls added in Phase 2–4. **Constraint:** Per AGENTS.md — never `unwrap()` / `panic!()` in code reachable from SQL. |
| STAB-2 | **Graceful fallback for invalid `delta_work_mem` value** | XS | P1 | If `pgtrickle.delta_work_mem` is set to an invalid memory string (e.g. `'invalid'`), the `SET LOCAL work_mem = '...'` inside `execute_delta_sql` returns a PostgreSQL error. Catch that SPI error and fall back to the session `work_mem` with a `WARNING` log rather than propagating as an unhandled error. **Location:** `src/refresh.rs`. |
| STAB-3 | **WAL exhaustion guard in cross-query consistency** | S | P0 | `test_tpch_cross_query_consistency` creates all 22 stream tables simultaneously and caused a 4h50m hang at SF=10 (April 2026) via WAL/disk exhaustion. Validate the per-query `CHECKPOINT` fix at SF=1.0 by tracking WAL LSN delta before/after each checkpoint call. If WAL still grows unbounded between checkpoints, add a `TPCH_MAX_CONCURRENT_STREAMS` cap that refreshes tables in batches of N. **Success:** test completes at SF=1.0 in <30 min with peak WAL <10 GB. |
| STAB-4 | **`pgtrickle_refresh_stats` view for production observability** | S | P2 | Add a `pgtrickle.pgtrickle_refresh_stats` view that aggregates per-stream-table timing from `st_refresh_stats` into `(stream_table, mode, avg_ms, p95_ms, p99_ms, refresh_count, last_refresh_at)`. Gives operators a single `SELECT * FROM pgtrickle.pgtrickle_refresh_stats ORDER BY avg_ms DESC` to identify slow stream tables in production without running a TPC-H benchmark. The view is updated by the scheduler after each successful refresh cycle. **Location:** `src/monitor.rs`, `sql/`. **Schema change:** Yes — new view. |
| STAB-5 | **Update `docs/ERRORS.md` with new DVM error variants** | XS | P2 | STAB-1 (panic elimination) replaces `unwrap()`/`panic!()` with `PgTrickleError::DvmUnsupportedOperator` errors. UX-4 introduces a `dvm_unsupported_pattern` alert. Phase 2–4 code paths may produce new error conditions not currently documented. Add entries to `docs/ERRORS.md` for each new error variant: error ID, SQLSTATE code, description, remediation hint, and reference to relevant roadmap items (UX-2, UX-4, PERF-4). Cross-reference from `ERRORS.md` to PERFORMANCE_COOKBOOK.md section added in UX-2. **Location:** `docs/ERRORS.md`. **No schema change.** |
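
The STAB-2 fallback can be illustrated outside a backend by pre-validating the string instead of catching the SPI error, which is what the item itself proposes. The sketch below swaps in that simpler technique so it can run standalone. Both functions are hypothetical, and the parser accepts only an integer followed by an optional PostgreSQL memory unit (`B`, `kB`, `MB`, `GB`, `TB`); it does not check range limits or the `0` = inherit convention from P5-1.

```rust
/// Return true if `value` looks like a PostgreSQL memory setting:
/// an unsigned integer with an optional unit (B, kB, MB, GB, TB).
/// Simplified check; PostgreSQL itself remains the authority.
pub fn is_valid_memory_setting(value: &str) -> bool {
    let v = value.trim();
    let digits_end = v.find(|c: char| !c.is_ascii_digit()).unwrap_or(v.len());
    if digits_end == 0 {
        return false; // no leading digits at all, e.g. "invalid"
    }
    let unit = v[digits_end..].trim_start();
    matches!(unit, "" | "B" | "kB" | "MB" | "GB" | "TB")
}

/// Pick the work_mem value to apply, plus an optional warning message
/// when the configured value is rejected and the session value is used.
pub fn resolve_work_mem(configured: &str, session: &str) -> (String, Option<String>) {
    if is_valid_memory_setting(configured) {
        (configured.trim().to_string(), None)
    } else {
        let warning = format!(
            "invalid pgtrickle.delta_work_mem value '{}'; using session work_mem '{}'",
            configured, session
        );
        (session.to_string(), Some(warning))
    }
}
```

In the extension, the returned warning text would be the kind of message surfaced at `WARNING` level before the refresh continues with the session setting.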

#### Performance

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| PERF-1 | **Criterion regression gate for fixed query patterns** | S | P1 | After each phase lands (P2, P3, P4), add the fixed pattern to `benches/diff_operators.rs` as a Criterion micro-benchmark: multi-table join delta (q09-shape), EXISTS anti-join delta (q04-shape), nested EXISTS delta (q20-shape). Gate CI to fail if DIFF time at SF=0.1 regresses >20% vs the post-fix baseline. Catches regressions introduced by future DVM changes without requiring a full TPC-H run. |
| PERF-2 | **Delta SQL template caching for repeated refresh** | S | P2 | When `pgtrickle.log_delta_sql = on` is active (P1-2), the delta SQL string is built on every refresh. Add a thread-local `HashMap<(stream_table_oid, change_kind), String>` cache so the SQL is only regenerated when the stream table definition changes (DDL invalidation via `pg_notify`). Eliminates the SQL generation overhead from the hot path once the debugging GUC is removed. **Location:** `src/refresh.rs`, `src/dvm/diff.rs`. |
| PERF-3 | **Criterion JSON artifact versioning for multi-release trend analysis** | S | P2 | Configure `benches/` to write Criterion measurement JSON to a versioned path (`target/criterion/v0.23.0/`) so CI uploads them as a named artifact per release tag. A post-run comparison script reads the previous release's JSON and fails with a `BENCH_REGRESSION` exit code if any benchmark regresses >20%. Enables trend graphs across releases (v0.23.0 → v0.24.0 → …) and replaces the current session-scoped `criterion_regression_check.py` for multi-release comparisons. **Location:** `scripts/`, `.github/workflows/`. |
| PERF-4 | **AUTO mode cost threshold recalibration post Phase 2–4** | S | P1 | The AUTO refresh cost model break-even threshold was calibrated against the pre-fix DVM behaviour. After Phases 2–4 fix the threshold-collapse and structural-bug queries, re-run the AUTO break-even benchmark at SF=0.1 and SF=1.0 using `test_tpch_performance_comparison` output with AUTO mode enabled and update the calibrated threshold constant so that q05/q07/q08/q09 are no longer routed to FULL fallback unnecessarily. Without this step, DIFF latency improves but AUTO mode leaves the improvement unused for users relying on the default refresh mode. **Location:** `src/refresh.rs` (cost model threshold). **Prerequisite:** P2A-2, P3-2, P4-2. |
| PERF-5 | **`ANALYZE` change buffer before delta SQL execution** | XS | P1 | Delta SQL JOINs against `pgtrickle_changes.changes_<oid>` tables that are truncated and refilled every refresh cycle. PostgreSQL auto-analyze never fires on these tables (refresh is too fast; the buffer stays hot in shared_buffers), so planner statistics are permanently stale — the planner sees 0–1 row estimates for change buffers that may contain thousands of rows, leading to suboptimal join order and strategy choices independent of `work_mem`. Run `ANALYZE pgtrickle_changes.changes_<oid>` inside `execute_delta_sql` before the delta SQL string is executed. Add `pgtrickle.analyze_before_delta = on` GUC (default `on`) to allow disabling if scan cost is significant on very small change buffers. **Location:** `src/refresh.rs`, `config.rs`. |
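
The 20% gate used by PERF-1 and PERF-3 reduces to a relative-change check. A minimal sketch follows; the function names are hypothetical, and treating a non-positive baseline as "no regression" is an assumption to avoid dividing by zero.

```rust
/// Relative change of `current_ms` against `baseline_ms`
/// (0.25 means 25% slower, -0.10 means 10% faster).
pub fn regression_ratio(baseline_ms: f64, current_ms: f64) -> f64 {
    (current_ms - baseline_ms) / baseline_ms
}

/// True if the current timing is slower than the baseline by more than
/// `threshold` (0.20 for the 20% gate). A non-positive baseline never
/// counts as a regression (assumption).
pub fn is_regression(baseline_ms: f64, current_ms: f64, threshold: f64) -> bool {
    baseline_ms > 0.0 && regression_ratio(baseline_ms, current_ms) > threshold
}
```

Against a 100 ms baseline, a 121 ms run fails the gate while a 120 ms run passes, because the comparison is strictly greater than the threshold.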

#### Scalability

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| SCAL-1 | **Intermediate CTE row count bound at SF=10** | M | P1 | After DI-2 completion (P2A-1), run `EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)` on fixed queries (q05/q07/q08/q09) with the `pgtrickle.log_delta_sql` GUC at SF=10 and assert that the highest-cardinality intermediate CTE node does not exceed O(Δ × k) rows (where Δ = RF batch size and k = number of join levels). Capture the JSON EXPLAIN plan as a CI artifact. This verifies the fix is truly O(Δ) and not just better constant factors. |
| SCAL-2 | **Change buffer growth monitoring during multi-ST refresh** | S | P1 | Add a `pgtrickle.max_change_buffer_rows` GUC (default `0` = unlimited) that emits a `pg_trickle_alert change_buffer_overflow` event when the change buffer for a single stream table exceeds the threshold. Prevents the WAL accumulation pattern seen in `test_tpch_cross_query_consistency` from going undetected in production. **Location:** `config.rs`, `src/cdc.rs` (post-trigger count check). |
| SCAL-3 | **`pgtrickle.track_refresh_baseline()` production anomaly helper** | S | P2 | New SQL function `pgtrickle.track_refresh_baseline(stream_table TEXT, window_minutes INT DEFAULT 60)` that records the p95 DIFF refresh time for the given stream table over the specified window and emits a `pg_trickle_alert refresh_anomaly` event if any subsequent refresh exceeds 3× that baseline. Detects threshold-collapse regressions introduced by upstream schema changes (e.g. an added FK that changes query cardinality) without requiring a full benchmark run. **Location:** `src/api.rs`, `src/monitor.rs`. **Schema change:** Yes — new SQL function. |
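
The SCAL-3 anomaly rule (a refresh exceeding 3× the p95 baseline) can be sketched as follows. The function names are hypothetical, and the nearest-rank percentile definition is an assumption; an SQL implementation based on `percentile_cont` would interpolate and could give slightly different baselines.

```rust
/// p95 of the samples using the nearest-rank method:
/// rank = ceil(0.95 * n), computed in integer arithmetic.
/// Returns None for an empty window.
pub fn p95_nearest_rank(samples_ms: &[f64]) -> Option<f64> {
    if samples_ms.is_empty() {
        return None;
    }
    let mut sorted = samples_ms.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let rank = (sorted.len() * 95 + 99) / 100; // integer ceil(n * 0.95), always >= 1
    Some(sorted[rank - 1])
}

/// True if a refresh took more than 3x the recorded p95 baseline.
pub fn is_anomaly(baseline_p95_ms: f64, refresh_ms: f64) -> bool {
    refresh_ms > 3.0 * baseline_p95_ms
}
```

With 20 samples of 1 ms through 20 ms, the baseline is 19 ms, so a 58 ms refresh would raise the alert and a 57 ms refresh would not.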

#### Ease of Use

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| UX-1 | **DIFF-slower-than-FULL per-query log warning** | XS | P1 | When `pgtrickle.log_delta_sql = on` and a delta refresh takes longer than the last recorded FULL refresh time for the same stream table (from `st_refresh_stats`), emit a `pgrx::warning!()` message: `[pgtrickle] DIFF refresh for <table> took Xms vs last FULL Yms — DIFF is Nx slower`. Allows operators to identify affected tables during normal operation without running the full benchmark suite. **Location:** `src/refresh.rs`. |
| UX-2 | **Scaling limits section in PERFORMANCE_COOKBOOK.md** | XS | P1 | Add a "DVM Query Complexity Limits" section documenting: the three failure mode categories (threshold collapse, early collapse, structural bug), which SQL patterns trigger each category (multi-table joins, EXISTS anti-joins, doubly-nested EXISTS), the recommended SF at which each is safe, and how to identify which mode applies to a given user query using `pgtrickle.log_delta_sql`. Cross-reference with `ERRORS.md` for the `DvmUnsupportedOperator` error. |
| UX-3 | **`pgtrickle.explain_diff_sql(stream_table)` helper** | M | P2 | New SQL function `pgtrickle.explain_diff_sql(stream_table TEXT) RETURNS TEXT` that builds and returns the delta SQL for the given stream table using a zero-row mock change buffer (for inspection only — no execution). Allows operators to review what SQL the DVM engine will generate without running a full refresh. Wraps the existing delta SQL builder. **Location:** `src/api.rs`. **Schema change:** Yes — new SQL function in `sql/pg_trickle--0.22.0--0.23.0.sql`. *(Note: this version reference is for v0.23.0; the v0.24.0 outbox/inbox features use `sql/pg_trickle--0.23.0--0.24.0.sql`.)* |
| UX-4 | **Unsupported SQL patterns detection in DVM parser** | S | P2 | In `src/dvm/parser/validation.rs`, detect and warn on SQL patterns with known threshold-collapse or structural-bug failure modes: (a) 4+ table joins using EXCEPT ALL chains, (b) doubly-nested correlated EXISTS / NOT EXISTS, (c) recursive CTEs (`WITH RECURSIVE`), (d) LATERAL joins, (e) `INTERSECT ALL` in the delta path. Emit `pg_trickle_alert dvm_unsupported_pattern` with the specific pattern name and a remediation hint pointing to PERFORMANCE_COOKBOOK.md. Does not block stream table creation (avoids breaking existing users), but warns at `create_stream_table()` time and on each DIFF refresh until acknowledged. |
| UX-5 | **v0.22.0 → v0.23.0 upgrade guide** | XS | P2 | Add an "Upgrading to v0.23.0" section in `docs/UPGRADING.md` covering: (a) new GUCs introduced (`pgtrickle.log_delta_sql`, `pgtrickle.delta_work_mem`, `pgtrickle.delta_enable_nestloop`, `pgtrickle.max_change_buffer_rows`); (b) behavioral changes: DI-2 UPDATE-split changes the DIFF output row format for aggregate stream tables (DELETE+INSERT instead of UPDATE); (c) rollback strategy: the DI-2/DI-6 code paths are gated by detecting UPDATE rows in the change buffer, so downgrading to v0.22.0 is safe if no writes have occurred to upgraded stream tables; (d) pre-upgrade validation command: `just check-version-sync`. |
| UX-6 | **DVM SQL Rewrite Rules RFC** | M | P2 | Document the full transformation pipeline in `src/dvm/parser/rewrites.rs` as a formal RFC-style document at `docs/DVM_REWRITE_RULES.md`: each rewrite pass (view inlining, grouping sets expansion, EXISTS → anti-join, scalar sublink hoisting, delta key restriction), the input SQL pattern each targets, the transformation applied, and the algebraic correctness argument. Add unit tests in `src/dvm/parser/rewrites.rs` asserting that each rewrite pass produces the expected SQL for a reference input. Enables future contributors to add or modify rewrite passes safely. |
| UX-7 | **`pgtrickle.diff_output_format` compatibility GUC** | S | P1 | DI-2 UPDATE-split (P2A-1) changes the DIFF output row format for aggregate stream tables: currently DIFF surfaces UPDATE rows; after DI-2 it surfaces DELETE+INSERT pairs. Application code that reads the outbox or change buffer and checks `op = 'UPDATE'` will silently produce incorrect results after upgrading without code changes. Add `pgtrickle.diff_output_format` GUC accepting `'split'` (default post-DI-2) or `'merged'`. When set to `'merged'`, the refresh path re-combines DELETE+INSERT pairs originating from aggregate UPDATE-splits back into a single UPDATE row before writing to the change buffer or outbox. Allows users to upgrade to v0.23.0 and opt into the new behaviour on their own schedule. Document the migration path in UX-5 (upgrade guide): set `diff_output_format = 'merged'` first, then migrate application code to handle DELETE+INSERT pairs, then switch to `'split'`. **Location:** `config.rs`, `src/refresh.rs`. **Schema change:** No. |
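
The `'merged'` mode in UX-7 is the inverse of the UPDATE-split: DELETE+INSERT pairs for the same aggregate group are recombined into one UPDATE. The sketch below uses hypothetical types (`SplitRow`, `MergedRow`) with a single integer key and value per row, and assumes at most one DELETE and one INSERT per key in a batch. Unlike the behaviour described above, it pairs any DELETE and INSERT that share a key and does not track whether they originated from an UPDATE-split.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SplitOp {
    Delete,
    Insert,
}

/// One row in 'split' format. Hypothetical type, for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct SplitRow {
    pub key: i64,
    pub value: i64,
    pub op: SplitOp,
}

/// One row in 'merged' format.
#[derive(Debug, Clone, PartialEq)]
pub enum MergedRow {
    Insert { key: i64, value: i64 },
    Delete { key: i64, value: i64 },
    Update { key: i64, old: i64, new: i64 },
}

/// Recombine DELETE+INSERT pairs sharing a key into a single UPDATE.
/// Unpaired rows pass through; output is ordered by key.
pub fn merge_split_rows(rows: &[SplitRow]) -> Vec<MergedRow> {
    // Per key: (deleted value, inserted value).
    let mut by_key: BTreeMap<i64, (Option<i64>, Option<i64>)> = BTreeMap::new();
    for row in rows {
        let slot = by_key.entry(row.key).or_insert((None, None));
        match row.op {
            SplitOp::Delete => slot.0 = Some(row.value),
            SplitOp::Insert => slot.1 = Some(row.value),
        }
    }
    by_key
        .into_iter()
        .filter_map(|(key, pair)| match pair {
            (Some(old), Some(new)) => Some(MergedRow::Update { key, old, new }),
            (Some(value), None) => Some(MergedRow::Delete { key, value }),
            (None, Some(value)) => Some(MergedRow::Insert { key, value }),
            (None, None) => None,
        })
        .collect()
}
```

Application code that still checks for an UPDATE operation would see the `Update` variant here, which is the compatibility behaviour the GUC is meant to preserve during migration.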

#### Test Coverage

| ID | Title | Effort | Priority | Description |
|----|-------|--------|----------|-------------|
| TEST-1 | **`test_tpch_immediate_correctness` at SF=1.0** | M | P1 | Run `test_tpch_immediate_correctness` at SF=1.0 (`TPCH_SCALE=1.0`) and record per-query RF cycle time. IMMEDIATE mode fires IVM triggers inside the DML transaction; if multi-join queries (q05/q07/q08/q09) exhibit the same scaling failure, application transactions stall. Queries exceeding 5 s per RF cycle must be documented in SQL_REFERENCE.md Known Limitations as not recommended for IMMEDIATE mode at production scale. Note: the IMMEDIATE mode delta path uses `TransitionTable`; scaling failures here may be independent of the DI-2/DI-6 fixes. |
| TEST-2 | **Sustained churn for full 22-query set (post Phase 2–3)** | S | P1 | After Phase 2–3 fixes land, add the threshold-collapse group (q05/q07/q08/q09) and super-linear group (q13/q15/q17) to `test_tpch_sustained_churn` behind `TPCH_CHURN_ALL_QUERIES=1` env var. Verify zero correctness drift over 100 cycles at SF=0.1. Also verify q22 stays correct after P3-2 (delta-key R_old restriction touches the `NOT IN` path q22 uses). Default churn run unchanged. |
| TEST-3 | **Light E2E eligibility audit for TPC-H tests** | S | P2 | 10 of the 52 TPC-H test cases require the full E2E Docker image. Audit each to determine if the dependency is necessary or can be removed. Tests that only need the extension binary (not custom postgres config or third-party extensions) should be migrated to light E2E using `cargo pgrx package` + stock `postgres:18.3`. Reduces PR feedback latency since full E2E is skipped on PRs. |
| TEST-4 | **Edge case regression tests for UPDATE-split and anti-join** | M | P2 | Targeted regression tests for patterns not represented in the TPC-H query set: (a) self-join (table joined to itself via alias) — delta must not double-count; (b) `COUNT(DISTINCT col)` aggregate — DIFF semantics differ from `COUNT(*)`; (c) window functions in the SELECT list (e.g. `ROW_NUMBER() OVER (PARTITION BY …)`) — stream table should return `DvmUnsupportedOperator` rather than silently producing wrong results; (d) UPDATE-split with single-row batches and all-NULL key columns; (e) empty change buffer after UPDATE — delta must be zero rows, not an error. Cover the cases most likely to be introduced by real user queries that diverge from the TPC-H pattern set. |

---

### Effort Summary for v0.23.0

| Path | Items | Total |
|------|-------|-------|
| Best case (hypothesis A: spill) | P1-1 + P1-2 + P2B-1 + P2-1 + P3-1 + P4-1 + P5-1 | **~4 days** |
| Likely case (hypothesis B: DVM cardinality) | Phases 1–5 (all items) | **~11 days** |
| Quality pillar additions (all priorities) | CORR-1–3 + STAB-1–5 + PERF-1–5 + SCAL-1–3 + UX-1–7 + TEST-1–4 | **~17 days** |
| Quality pillar P0/P1 only | CORR-1–3 + STAB-1–3 + PERF-1, 4–5 + SCAL-1–2 + UX-1–2, 7 + TEST-1–2 | **~9 days** |
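
The first two totals can be checked by summing the per-item efforts from the Phase 1–5 tables. The sketch below hard-codes those efforts; the constant and function names are hypothetical. The best case sums to 4.0 days and the full Phase 1–5 set to 11.5 days, which the table reports as approximately 11.

```rust
/// Effort in days for each Phase 1-5 item, copied from the phase tables.
pub const PHASE_ITEMS: [(&str, f64); 12] = [
    ("P1-1", 0.5),
    ("P1-2", 1.0),
    ("P2A-1", 2.0),
    ("P2A-2", 1.5),
    ("P2B-1", 0.5),
    ("P2-1", 0.5),
    ("P3-1", 0.5),
    ("P3-2", 1.5),
    ("P4-1", 0.5),
    ("P4-2", 2.0),
    ("P5-1", 0.5),
    ("P5-2", 0.5),
];

/// Items on the best-case path (hypothesis A: spill).
pub const BEST_CASE: [&str; 7] = ["P1-1", "P1-2", "P2B-1", "P2-1", "P3-1", "P4-1", "P5-1"];

/// Sum the effort of the listed item IDs.
pub fn total_days(ids: &[&str]) -> f64 {
    PHASE_ITEMS
        .iter()
        .filter(|(id, _)| ids.contains(id))
        .map(|(_, days)| *days)
        .sum()
}

/// Sum the effort of every Phase 1-5 item.
pub fn all_items_days() -> f64 {
    PHASE_ITEMS.iter().map(|(_, days)| *days).sum()
}
```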

**Exit criteria:**
- [x] P1-1: work_mem benchmark run at SF=1.0 with results recorded in PLAN_TPCH_DVM_PERF.md
- [x] P1-2: `pgtrickle.log_delta_sql` GUC implemented and documented
- [x] P5-1: `pgtrickle.delta_work_mem` GUC implemented and documented
- [x] q04 DIFF < 500ms at SF=1.0 (pre-fix baseline: 5.7s)
- [x] q20 DIFF < 100ms at SF=1.0 (pre-fix baseline: 2.6s)
- [x] q05/q07/q08/q09 DIFF < 2s at SF=1.0 (pre-fix baseline: 28–40s)
- [x] q22 DIFF < 200ms at SF=1.0 (pre-fix baseline: 3.1s)
- [x] All 22 TPC-H queries pass `test_tpch_differential_correctness` at SF=1.0
- [x] No regression on q02/q11/q16 (must stay < 20ms DIFF at SF=1.0)
- [x] CORR-1: `__pgt_count` invariant property test passes on 1,000 randomised UPDATE batches
- [x] STAB-1: no `unwrap()` / `panic!()` in Phase 2–4 code paths (zero new findings from `cargo clippy`)
- [x] STAB-3: `test_tpch_cross_query_consistency` completes at SF=1.0 in < 30 min with peak WAL < 10 GB
- [x] UX-2: "DVM Query Complexity Limits" section published in PERFORMANCE_COOKBOOK.md
- [x] PERF-4: AUTO mode routes q05/q07/q08/q09 to DIFF rather than FULL at SF=1.0 after Phase 2–4 cost threshold recalibration
- [x] PERF-5: `pgtrickle.analyze_before_delta = on` is default; EXPLAIN plans for `changes_<oid>` tables show accurate row count estimates at SF=0.1
- [x] UX-7: `pgtrickle.diff_output_format = 'merged'` mode passes all outbox/CDC integration tests that exercise aggregate stream tables post-DI-2
- [x] `just check-version-sync` passes


---