> **Plain-language companion:** [v0.42.0.md](v0.42.0.md)

## v0.42.0 — Documentation Truthfulness, Test Quality & Fuzz Automation

**Status: Planned.** Derived from
[plans/PLAN_OVERALL_ASSESSMENT_9.md](../plans/PLAN_OVERALL_ASSESSMENT_9.md)
§Dimensions 1, 5, 6, 9, 10.

> **Release Theme**
> Fix documentation-implementation drift (repair function, catalog generator,
> SQL reference) and test-infrastructure drift (fixed sleeps, aggregate property
> tests, fuzz CI) in a single release. Both gaps share the same root cause and
> the same fix: make the surrounding infrastructure keep up with the code.

---

### Features

| ID | Title | Effort | Priority | Assessment ref |
|----|-------|--------|----------|----------------|
| A42-1 | Implement `pgtrickle.repair_stream_table(name)` | L | P0 | FEAT-01, OPS-05, DOC-04 |
| A42-2 | Rewrite catalog generator with Rust-aware extraction | L | P0 | DOC-01, DOC-02, CQ-03 |
| A42-3 | Update `create_stream_table` SQL reference signature | M | P0 | DOC-03, FEAT-02 |
| A42-4 | Stale-term audit and docs linter (`just docs-lint`) | M | P1 | CQ-04, DOC-06, CI-07 |
| A42-5 | Move deprecated GUCs to compatibility appendix | S | P1 | DOC-05 |
| A42-6 | Update ARCHITECTURE.md module diagrams | S | P1 | DOC-07 |
| A42-7 | RLS bypass prominence in setup and security docs | S | P2 | DOC-08 |
| A42-8 | Generated docs freshness CI gate | M | P0 | CI-06, TEST-05 |
| A42-9 | Replace fixed sleeps with state-polling helpers | L | P0 | TEST-01 |
| A42-10 | Differential SUM(CASE) E2E tests | M | P0 | TEST-02, COR-04 |
| A42-11 | SUM(CASE) AST-level detection in aggregate classification | M | P1 | COR-04 |
| A42-12 | FULL JOIN aggregate property tests (DIFF vs FULL) | L | P1 | COR-05 |
| A42-13 | WAL decoder SQL parameterization | M | P1 | COR-07 |
| A42-14 | Update stale EC-06 comments | S | P2 | COR-08 |
| A42-15 | Keyless multiset property test | S | P2 | TEST-09 |
| A42-16 | Fuzz smoke CI job with corpus replay | M | P1 | TEST-06, CI-05 |

**A42-1 — Implement `repair_stream_table`.** Add
`#[pg_extern(schema = "pgtrickle")] fn repair_stream_table(name: &str)`. The
function must: (1) acquire an advisory lock on the stream table, (2) verify the
stream table exists in the catalog, (3) reinitialize materialized storage if
missing or corrupted, (4) reset CDC frontiers to force a full refresh on next
cycle, (5) rebuild CDC triggers/change buffer tables if absent, (6) verify all
dependencies still exist, and (7) return a summary of actions taken. Add a
restore-drill E2E test that pg_dumps, drops, restores, calls repair, and
verifies correctness after a refresh cycle.

**A42-2 — Rewrite catalog generator.** Replace `scripts/gen_catalogs.py` regex
extraction with either: (a) a Rust-side `build.rs` that emits a JSON manifest of
`#[pg_extern]` functions and GUC registrations, or (b) a Python script using
`syn`-equivalent parsing (e.g. `tree-sitter-rust`). The generator must capture:
function name, schema, args with types and defaults, return type, volatility,
security (definer/invoker), and deprecation status. For GUCs: name, type,
default, min/max, context, description, and deprecated flag. Fail CI on any
`(registration pending ...)` rows or missing known functions.

**A42-3 — Update SQL reference.** Regenerate `docs/SQL_REFERENCE.md` function
signatures from the same source manifest. Update parameter tables for
`create_stream_table` to include all 16 current parameters. Update examples for
`output_distribution_column`, `temporal`, `storage_backend`, and bulk JSON keys.

**A42-4 — Stale-term audit and docs linter.** Add `just docs-lint` that greps
all `docs/**/*.md` for retired names: `pg_trickle.max_workers`,
`pg_trickle.max_parallel_refresh_workers`, `event_driven_wake` (as active),
`wake_debounce_ms` (as active). Fail on matches. Also check for references to
`repair_stream_table` in docs and verify the function exists.
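
The stale-term check in A42-4 could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the function names `find_stale_terms` and `lint_docs` are hypothetical, and treating "as active" as "anywhere outside a section whose heading contains `Deprecated/Compatibility GUCs`" is an assumption. The `repair_stream_table` existence check is not covered, and `#` lines inside fenced code blocks would be misread as headings.

```python
import re
from pathlib import Path

# Retired names taken from the A42-4 description.
RETIRED_TERMS = [
    "pg_trickle.max_workers",
    "pg_trickle.max_parallel_refresh_workers",
    "event_driven_wake",
    "wake_debounce_ms",
]

# Assumption: retired names are acceptable inside a section whose heading
# contains this text, because there they are documented as deprecated.
ALLOWED_SECTION = "Deprecated/Compatibility GUCs"


def find_stale_terms(text):
    """Return (line_number, term) pairs for retired names outside the appendix."""
    hits = []
    allowed_level = None  # heading depth of the appendix while we are inside it
    for line_no, line in enumerate(text.splitlines(), start=1):
        heading = re.match(r"(#+)\s", line)
        if heading:
            level = len(heading.group(1))
            if ALLOWED_SECTION in line:
                allowed_level = level
            elif allowed_level is not None and level <= allowed_level:
                # A heading at the same or a higher level closes the appendix.
                allowed_level = None
        if allowed_level is not None:
            continue
        for term in RETIRED_TERMS:
            # Boundary checks keep longer identifiers from matching.
            if re.search(rf"(?<!\w){re.escape(term)}(?!\w)", line):
                hits.append((line_no, term))
    return hits


def lint_docs(root="docs"):
    """Scan every Markdown file under root and return the number of hits."""
    failures = 0
    for path in sorted(Path(root).rglob("*.md")):
        text = path.read_text(encoding="utf-8")
        for line_no, term in find_stale_terms(text):
            print(f"{path}:{line_no}: retired name {term}")
            failures += 1
    return failures
```

A `just docs-lint` recipe could call `lint_docs()` and exit non-zero when the returned count is greater than zero.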

**A42-5 — Deprecated GUCs appendix.** Move `event_driven_wake`,
`wake_debounce_ms`, and any other deprecated GUCs from tuning tables in
`docs/CONFIGURATION.md` to a dedicated "Deprecated/Compatibility GUCs" appendix
section.

**A42-6 — Architecture module diagrams.** Update `docs/ARCHITECTURE.md` module
layout to reflect current directory structure: `src/api/mod.rs`,
`src/refresh/merge/mod.rs`, `src/scheduler/mod.rs`, `src/dvm/parser/*`
submodules, etc.

**A42-7 — RLS bypass prominence.** Add explicit RLS bypass warnings to
`docs/GETTING_STARTED.md`, `docs/PRE_DEPLOYMENT.md`, and the security model doc.

**A42-8 — Generated docs CI gate.** Add `.github/workflows/docs-drift.yml` (or
extend existing) to: regenerate catalogs, diff against committed versions, fail
on any change, fail on `(registration pending`, and assert known functions
appear.

**A42-9 — Replace fixed sleeps with state-polling helpers.** Create shared test
helpers in `tests/common/mod.rs`:

- `wait_for_refresh_history(client, st_name, min_count, timeout)` — polls
  refresh history until the expected row count is reached.
- `wait_for_cdc_mode(client, st_name, expected_mode, timeout)` — polls CDC mode
  in catalog.
- `wait_for_scheduler_tick(client, min_tick, timeout)` — polls scheduler
  watermark.
- `wait_for_job_status(client, job_id, expected_status, timeout)` — polls job.

Replace the highest-impact sleeps (WAL CDC, scheduler, cascade, PgBouncer,
quota, bgworker tests) with these helpers. Target: reduce the 116 fixed sleeps
by at least 80%.

**A42-10 — Differential SUM(CASE) E2E tests.** Add E2E tests in DIFF mode for
`SUM(CASE WHEN x > threshold THEN y ELSE 0 END)` with INSERT rows above/below
threshold, UPDATE rows crossing the threshold in both directions, DELETE
qualifying and non-qualifying rows, and multi-cycle sequences. After each
operation compare the stream table with a fresh full query.
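
The comparison that A42-10 asks for can be modeled outside the database to make the expected behavior concrete. The sketch below is only a Python model under stated assumptions: `DiffSum`, `contribution`, `full_sum`, and the threshold value are illustrative names and values, not part of the extension, and the model treats the empty sum as 0 whereas SQL `SUM` returns NULL over zero rows.

```python
THRESHOLD = 10  # illustrative threshold; not a value from the plan


def contribution(row):
    """Value of CASE WHEN x > threshold THEN y ELSE 0 END for one (x, y) row."""
    x, y = row
    return y if x > THRESHOLD else 0


def full_sum(rows):
    """Reference result: recompute the aggregate from scratch."""
    return sum(contribution(r) for r in rows.values())


class DiffSum:
    """Maintains the aggregate incrementally from per-row deltas."""

    def __init__(self):
        self.rows = {}   # id -> (x, y); stands in for the source table
        self.total = 0   # stands in for the stream table's aggregate value

    def insert(self, row_id, row):
        self.rows[row_id] = row
        self.total += contribution(row)

    def delete(self, row_id):
        self.total -= contribution(self.rows.pop(row_id))

    def update(self, row_id, new_row):
        # An update removes the old row image and adds the new one.
        self.total += contribution(new_row) - contribution(self.rows[row_id])
        self.rows[row_id] = new_row


def run_scenario():
    st = DiffSum()
    steps = [
        lambda: st.insert(1, (15, 100)),  # insert above the threshold
        lambda: st.insert(2, (5, 40)),    # insert below the threshold
        lambda: st.update(2, (12, 40)),   # update crossing upward
        lambda: st.update(1, (3, 100)),   # update crossing downward
        lambda: st.delete(2),             # delete a qualifying row
        lambda: st.delete(1),             # delete a non-qualifying row
    ]
    history = []
    for step in steps:
        step()
        # Compare the incremental value with a fresh full computation.
        assert st.total == full_sum(st.rows)
        history.append(st.total)
    return history
```

The E2E tests would follow the same pattern, with the stream table in DIFF mode in place of `DiffSum` and the defining query in place of `full_sum`.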

**A42-11 — SUM(CASE) AST-level detection.** In
`src/dvm/operators/aggregate.rs`, detect CASE expressions at the parsed AST
level rather than matching the trimmed string prefix. Handle CASE wrapped in
casts, functions, or type coercions. Normalize the expression tree before
classification.

**A42-12 — FULL JOIN aggregate property tests.** Add property-style E2E tests
comparing DIFF vs FULL for nested FULL JOIN + aggregate combinations across
multi-cycle insert/update/delete sequences, including NULL keys and both-side
changes in the same cycle.

**A42-13 — WAL decoder SQL parameterization.** In `src/wal_decoder.rs`,
parameterize every value in `write_decoded_change` INSERT SQL using SPI
parameters instead of manual escaping. Centralize slot-name construction with a
strict grammar assertion. Add tests with quotes, backslashes, unicode, large
text, nulls, and bytea-like payloads.

**A42-14 — Update stale EC-06 comments.** Replace comments in
`src/dvm/operators/scan.rs` that describe EC-06 as unfixed with current design
notes explaining keyless net-counting, non-unique row-id indexes, and the
existing test coverage.

**A42-15 — Keyless multiset property test.** Add a small property test that
generates random multisets of rows, applies random insert/delete/update
operations, and verifies that the keyless stream table maintains multiset
equivalence with the source query after each cycle.

**A42-16 — Fuzz smoke CI job.** Add `.github/workflows/fuzz-smoke.yml` that runs
on schedule (daily) and manual dispatch, executes each fuzz target with a short
time budget (30–60 s), replays the corpus on PRs, and uploads crashes/minimized
repros as artifacts.
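
For the keyless multiset property test described in A42-15, the invariant can be illustrated with a small Python model. This is a sketch under assumptions, not the test itself: `apply_delta` and `run_property` are hypothetical names, rows are small integer pairs so that duplicates occur often, and net-counting is modeled as one signed count per distinct row per cycle.

```python
import random
from collections import Counter


def apply_delta(stream, delta):
    """Apply net row counts to the stream multiset, dropping zero counts."""
    for row, change in delta.items():
        stream[row] += change
        if stream[row] == 0:
            del stream[row]


def run_property(seed, cycles=50, ops_per_cycle=5):
    """Return True if the stream matches the source multiset after every cycle."""
    rng = random.Random(seed)
    source = []          # keyless source table: duplicate rows are allowed
    stream = Counter()   # stream table modeled as row -> multiplicity
    for _ in range(cycles):
        delta = Counter()  # net change per distinct row within this cycle
        for _ in range(ops_per_cycle):
            op = rng.choice(["insert", "delete", "update"])
            if op == "insert" or not source:
                row = (rng.randint(0, 3), rng.randint(0, 3))
                source.append(row)
                delta[row] += 1
            elif op == "delete":
                row = source.pop(rng.randrange(len(source)))
                delta[row] -= 1
            else:
                # Update: remove one copy of the old row, add the new row.
                idx = rng.randrange(len(source))
                new_row = (rng.randint(0, 3), rng.randint(0, 3))
                delta[source[idx]] -= 1
                delta[new_row] += 1
                source[idx] = new_row
        apply_delta(stream, delta)
        # Multiset equivalence with a full recomputation of the source.
        if stream != Counter(source):
            return False
    return True
```

A fixed seed range keeps failures reproducible; the real test would replace `stream` with the keyless stream table and `Counter(source)` with the result of the source query.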

### Test Coverage

| ID | Title | Effort | Priority | Assessment ref |
|----|-------|--------|----------|----------------|
| T-A42-1 | Restore-drill E2E test for `repair_stream_table` | L | P0 | FEAT-01 |
| T-A42-2 | Catalog generator regression test | M | P0 | TEST-05 |
| T-A42-3 | Docs-lint CI integration | S | P1 | CI-07 |
| T-A42-4 | Polling helper coverage for all wait-state types | M | P0 | TEST-01 |
| T-A42-5 | SUM(CASE) differential multi-cycle test | M | P0 | TEST-02 |
| T-A42-6 | FULL JOIN nested aggregate DIFF-vs-FULL property test | L | P1 | COR-05 |
| T-A42-7 | WAL decoder payload edge-case tests | M | P1 | COR-07 |
| T-A42-8 | Fuzz corpus replay in CI | S | P1 | TEST-06 |

### Conflicts & Risks

- **A42-1** (`repair_stream_table`) interacts with CDC, storage, catalog, and
  DAG modules. Keep the function as a composition of existing primitives.
- **A42-2** generator rewrite may require `build.rs` changes; keep it pragmatic.
- **A42-9** (sleep replacement) touches many test files. Use generous timeouts
  (30 s) with configurable overrides to avoid new flakiness.
- **A42-11** (AST detection) changes aggregate classification; add golden-file
  tests for classification decisions.
- **A42-13** (WAL parameterization) is a correctness-critical path; run full WAL
  CDC E2E suite before and after.
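
The A42-9 mitigation above (a generous default timeout with a configurable override) might look like the following polling loop. The real helpers would live in Rust in `tests/common/mod.rs`; Python is used here only to keep the sketch short. The environment variable name `PGT_TEST_WAIT_TIMEOUT_S` and the functions `effective_timeout` and `wait_until` are assumptions made for illustration.

```python
import os
import time

DEFAULT_TIMEOUT_S = 30.0  # the generous default suggested in the risk note
# Hypothetical environment variable name for the configurable override.
TIMEOUT_ENV = "PGT_TEST_WAIT_TIMEOUT_S"


def effective_timeout(requested=None):
    """An explicit argument wins, then the environment override, then the default."""
    if requested is not None:
        return float(requested)
    return float(os.environ.get(TIMEOUT_ENV, DEFAULT_TIMEOUT_S))


def wait_until(predicate, timeout=None, interval=0.05):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns the truthy value. Raises TimeoutError otherwise, so that a test
    fails loudly instead of continuing with stale state.
    """
    deadline = time.monotonic() + effective_timeout(timeout)
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        time.sleep(interval)
```

Each `wait_for_*` helper would then reduce to a predicate that queries one piece of state, such as the refresh history row count or the CDC mode, passed to a loop of this shape.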

### Exit Criteria

- [ ] A42-1: `repair_stream_table` implemented, documented, and E2E tested
- [ ] A42-2: Catalog generator produces complete, accurate SQL API and GUC catalogs
- [ ] A42-3: SQL reference matches actual function signatures
- [ ] A42-4: `just docs-lint` passes on current docs
- [ ] A42-5: No deprecated GUC appears in active tuning tables
- [ ] A42-8: CI fails on stale generated docs
- [ ] A42-9: Fixed sleeps reduced by >= 80% (from 116 baseline)
- [ ] A42-10/11: SUM(CASE) differential tests pass; AST detection handles wrapped CASE
- [ ] A42-12: FULL JOIN aggregate property tests converge DIFF == FULL
- [ ] A42-13: WAL decoder tests pass with adversarial payloads
- [ ] A42-16: Fuzz smoke CI runs without crashes on current corpus
- [ ] Extension upgrade path tested (`0.41.0 → 0.42.0`)
- [ ] `just lint` passes with zero warnings
- [ ] `just test-all` passes

> **Release Theme**
> Eliminate documentation-implementation drift by implementing the missing
> repair function, fixing the catalog generator, updating all stale references,
> and gating documentation freshness in CI.

### Test Coverage

| ID | Title | Effort | Priority | Assessment ref |
|----|-------|--------|----------|----------------|
| T-A42-1 | Restore-drill E2E test for `repair_stream_table` | L | P0 | FEAT-01 |
| T-A42-2 | Catalog generator regression test | M | P0 | TEST-05 |
| T-A42-3 | Docs-lint CI integration | S | P1 | CI-07 |

**T-A42-1.** E2E test: create stream table, insert data, refresh, pg_dump the
database, drop the stream table's storage, restore from dump, call
`pgtrickle.repair_stream_table(name)`, refresh, and verify data matches expected
state.

**T-A42-2.** CI job that regenerates catalogs and asserts no diff. Also asserts
no `(registration pending` strings and that `create_stream_table`,
`repair_stream_table`, `refresh_stream_table`, and other known functions appear.

**T-A42-3.** `just docs-lint` runs the stale-term grep and exits non-zero on
matches.

### Conflicts & Risks

- **A42-1** (`repair_stream_table`) interacts with CDC, storage, catalog, and
  DAG modules. Keep the function as a composition of existing primitives rather
  than introducing new repair-specific internals.
- **A42-2** generator rewrite may require updating the `Cargo.toml` build if
  using `build.rs`. Keep the approach pragmatic — a `tree-sitter` based Python
  script may be simpler than a full build-time manifest.

### Exit Criteria

- [ ] A42-1: `repair_stream_table` implemented, documented, and E2E tested
- [ ] A42-2: Catalog generator produces complete, accurate SQL API and GUC catalogs
- [ ] A42-3: SQL reference matches actual function signatures
- [ ] A42-4: `just docs-lint` passes on current docs
- [ ] A42-5: No deprecated GUC appears in active tuning tables
- [ ] A42-6: Architecture docs match current module layout
- [ ] A42-8: CI fails on stale generated docs
- [ ] Extension upgrade path tested (`0.41.0 → 0.42.0`)
- [ ] `just lint` passes with zero warnings