# v0.35.0 — Correctness Sprint, Reactive Subscriptions & Zero-Downtime Operations

> **Full technical details:** [v0.35.0.md-full.md](v0.35.0.md-full.md)

**Status: Released** | **Scope: Large**

> A correctness-and-quality sprint that closes the three-assessment-old EC-01
> phantom-row bug and hardens Citus chaos tolerance, combined with two new
> user-facing capabilities: live push notifications and zero-downtime query
> changes.

---

## What is this?

v0.35.0 combines a mandatory quality sprint with two long-requested operational capabilities:

1. **EC-01 correctness closeout** — the phantom-row residue in multi-table joins that has been tracked since v0.21.0 is fixed unconditionally. Every join delta is now routed through PH-D1 cleanup, and a 50,000-iteration property test proves convergence.
2. **Citus chaos hardening** — a new multi-container Docker Compose Citus test rig covers worker kill, coordinator failover, lease expiry, and rebalance scenarios that have had zero test coverage since Citus landed.
3. **Push notifications** — applications can subscribe to changes in a stream table and receive instant notifications, enabling real-time dashboards, live UIs, and event-driven microservices.
4. **Zero-downtime query changes** — modifying the defining query of a large stream table no longer requires a multi-minute lock on the table.

---

## EC-01 correctness closeout

The phantom-row residue bug (`is_deduplicated: false` at `src/dvm/operators/join.rs:657-668`) has been flagged in every overall assessment since v0.21.0. The v0.24.0 fix addressed the hash function but not the downstream Z-set pipeline, leaving PH-D1 cross-cycle cleanup opt-in and only invoked when a residual is detected — which itself depends on a flag that is not set on every code path.

v0.35.0 closes this definitively in two steps:

1. **Immediate fix:** route every refresh cycle unconditionally through PH-D1 with a batch size of 1,024 rows. The anti-join cost against the freshly applied delta is negligible and removes the residual-detection coupling.
2. **Proper fix:** re-engineer Part 2 row-id derivation so that Part 1a and Part 1b emit convergent ids; flip `is_deduplicated` to `true` for INNER joins on stable PKs; gate behind a 50,000-iteration proptest corpus.

Stream tables built on multi-table joins can then use DIFFERENTIAL refresh with full correctness confidence.

---

## Citus chaos hardening

The `pgt_st_locks` distributed mutex and `ensure_worker_slot` / rebalance recovery logic added in v0.32–v0.34 have never been tested under adversarial conditions. v0.35.0 adds `tests/e2e_citus_chaos_tests.rs` backed by a Docker Compose rig (coordinator + 3 workers) that drives:

- Worker kill-and-restart during an active poll cycle
- Coordinator restart mid-lease acquisition
- `pg_dist_node` removal and re-add of a worker
- Sustained 1k-stream-table refresh under continuous node churn

A new `citus-tests.yml` GitHub Actions workflow runs this suite on every push to `main`.

Two additional Citus scalability gaps close alongside the chaos rig:

- **`dblink` vs streaming libpq benchmark** (`CITUS-BENCH`) — the per-worker slot polling path has never been benchmarked. A new `benches/bench_remote_slot_poll.rs` compares `dblink`-wrapped `pg_logical_slot_get_changes()` against native libpq streaming at 1, 4, and 9 workers. If streaming delivers ≥ 30% lower p99 latency or ≥ 20% higher throughput, the migration happens in the same PR; otherwise the `dblink` path is formally closed as the right choice.
- **Cross-shard join advisory** (`CITUS-XSHARD`) — when a distributed stream table is keyed on `__pgt_row_id` (a surrogate) rather than the source table's distribution column, any query joining the ST back to its source incurs a cross-shard re-partition join. pg_trickle now detects this at `create_stream_table()` time and emits a `NOTICE` suggesting the `output_distribution_column` parameter. The co-location status is recorded in `pgt_stream_tables.citus_colocated_with` and surfaced in the `citus_status` view.

---

## Reactive subscriptions

`pgtrickle.subscribe('my_stream_table', 'my_notification_channel')` registers a listener. After every successful refresh that produces at least one change, pg_trickle sends a PostgreSQL `NOTIFY` message to the named channel with a payload like:

```json
{"name": "my_stream_table", "inserted_count": 12, "deleted_count": 3}
```

Any application holding a standard PostgreSQL connection and listening on that channel receives this signal immediately, without polling. This powers real-time dashboards, event-driven microservices, and reactive frontends — using nothing but a standard PostgreSQL driver, with no Kafka, no Debezium, no Hasura required.

A configurable coalescence window prevents notification storms when a stream table refreshes at high frequency.
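A minimal end-to-end sketch, using the illustrative names from above (psql stands in for the listening client; any standard driver that surfaces asynchronous notifications works the same way):

```sql
-- Register the subscription: changes to my_stream_table are announced
-- on channel my_notification_channel (API as documented above).
SELECT pgtrickle.subscribe('my_stream_table', 'my_notification_channel');

-- In any client session (psql shown here), start listening:
LISTEN my_notification_channel;

-- After the next refresh that produces at least one change, the session
-- receives an asynchronous notification carrying the JSON payload, e.g.:
--   {"name": "my_stream_table", "inserted_count": 12, "deleted_count": 3}
```

In application code the same signal arrives through the driver's notification hook (for example `PQnotifies()` in libpq), so no extension-specific client library is required.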
---

## Shadow-ST: zero-downtime query evolution

By default, calling `alter_query()` on a large stream table triggers a full re-computation of the entire result set. For a stream table with millions of rows, this can lock the table for minutes — an unacceptable operation in production.

The new `shadow_build := true` parameter to `alter_query()` changes how this works:

1. A parallel "shadow" stream table is created from the new query, invisible to users.
2. The shadow table is refreshed to convergence in the background, with no lock on the live table. The live table continues to serve reads and accept writes normally throughout.
3. When the shadow table has caught up, the storage is swapped atomically.
4. The new query goes live at the next refresh cycle. The shadow table is dropped.

The live table is readable and writable from start to finish.
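As a sketch, a shadow-built query change might look like the following. The stream-table name and the new defining query are hypothetical, and the positional signature of `alter_query()` is assumed here (schema-qualified to match the `subscribe()` call above); only `shadow_build := true` is the documented new surface:

```sql
-- Hypothetical stream table and replacement query. shadow_build := true
-- opts into the background shadow build instead of an in-place,
-- lock-holding full re-computation.
SELECT pgtrickle.alter_query(
    'orders_by_region',                    -- existing stream table
    $$ SELECT region, count(*) AS order_count
       FROM orders
       GROUP BY region $$,                 -- new defining query
    shadow_build := true
);
```

Reads and writes against `orders_by_region` continue uninterrupted while the shadow converges and the swap completes.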
---

## Also in v0.35.0

Beyond the headlining items, this release closes a wide range of quality and operational gaps identified in the v7 overall assessment:

- `EXPLAIN STREAM TABLE` — see which DVM operators your query compiled to
- `pg_trickle.force_full_refresh` GUC for incident-response override
- `pg_trickle.enabled = false` now also gates CDC trigger writes
- History prune moved to a dedicated background worker with `LIMIT` batching
- SQLSTATE error classifier wired end-to-end (replaces English-text matching)
- Relay secret interpolation via `${ENV:VAR}` in connection strings
- Relay backpressure and reconnection backoff
- Lightweight SQLancer run added to every PR gate
- Grafana p50/p99 refresh latency panels and alert rules
- Citus tutorial, outbox→relay→Kafka tutorial, `pg_trickle_dump` runbook
- `NOTICE` emitted on every FULL fallback so operators can detect DVM limits
- Multi-architecture Docker images (arm64 + amd64)

---

## Scope

v0.35.0 is a large release. It is the single highest-priority release before v1.0 because it closes the EC-01 correctness gap that affects every multi-table join workload. All other 1.0-track features are downstream of this.

---

*Previous: [v0.34.0 — Citus: Automated Distributed CDC & Shard Recovery](v0.34.0.md)*
*Next: [v0.36.0 — Structural Hardening, Performance & Temporal IVM](v0.36.0.md)*