# v0.34.0 Full Technical Details — Citus: Automated Distributed CDC & Shard Recovery

**Status: Released (2026-04-26).**

## Goals

| ID | Goal | Priority |
|----|------|----------|
| COORD-10 | Wire scheduler to call `poll_worker_slot_changes()` for distributed sources | Required |
| COORD-11 | Wire scheduler to call `ensure_worker_slot()` on first tick and after topology changes | Required |
| COORD-12 | Acquire + extend + release `pgt_st_locks` lease automatically in refresh cycle | Required |
| COORD-13 | Detect `pg_dist_node` topology changes; auto-recover slots after rebalance | Required |
| COORD-14 | Handle worker unreachability gracefully (skip + retry + alert) | Required |
| COORD-15 | Add `pg_trickle.citus_worker_retry_ticks` GUC (default 5) | Required |
| COORD-16 | Extend `citus_status` view with lease health column and last-poll timestamp | Medium |
| COORD-17 | Unit tests: topology change detection logic | Required |
| COORD-18 | Unit tests: shard rebalance detection edge cases | Medium |

## Scheduler Changes

### `refresh_single_st` Citus extension point

```
if source_placement == 'distributed' && is_citus_loaded():
    1. acquire_st_lock(lock_key, holder, citus_st_lock_lease_ms)
    2. for each worker in worker_nodes():
           ensure_worker_slot(worker, slot_name)
           poll_worker_slot_changes(worker, slot_name, max_changes, src)
    3. extend_st_lock() if refresh duration > lease_ms / 2
    4. [existing local refresh logic]
    5. release_st_lock()
```

### Topology change detection

On each scheduler tick:

```sql
SELECT array_agg(nodename ORDER BY nodeid)
FROM pg_dist_node
WHERE isactive AND noderole = 'primary';
```

Compare the result against the `pgt_worker_slots.worker_name` set recorded for this stream table. If they differ: drop the stale `pgt_worker_slots` rows and set `needs_reinit = true` on the affected stream tables.
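The set comparison above can be sketched as a small standalone function. This is a minimal illustration, not pg_trickle's actual implementation: the type and function names (`TopologyDiff`, `diff_topology`) are hypothetical, and worker identity is modeled as a plain `host:port` string.

```rust
use std::collections::BTreeSet;

/// Outcome of comparing the recorded worker set against the live
/// `pg_dist_node` primaries. Illustrative type, not pg_trickle's API.
#[derive(Debug, PartialEq)]
struct TopologyDiff {
    added: Vec<String>,   // live workers with no pgt_worker_slots row yet
    removed: Vec<String>, // recorded workers no longer active primaries
}

impl TopologyDiff {
    fn changed(&self) -> bool {
        !self.added.is_empty() || !self.removed.is_empty()
    }
}

/// Set-based comparison, so row ordering is irrelevant. A port change
/// surfaces as one removal plus one addition when names encode host:port
/// (the `test_topology_port_change` case from the Testing section).
fn diff_topology(recorded: &[&str], live: &[&str]) -> TopologyDiff {
    let rec: BTreeSet<&str> = recorded.iter().copied().collect();
    let liv: BTreeSet<&str> = live.iter().copied().collect();
    TopologyDiff {
        added: liv.difference(&rec).map(|s| s.to_string()).collect(),
        removed: rec.difference(&liv).map(|s| s.to_string()).collect(),
    }
}

fn main() {
    // Worker added after a rebalance: slots must be reconciled.
    let d = diff_topology(&["w1:5432", "w2:5432"], &["w1:5432", "w2:5432", "w3:5432"]);
    assert!(d.changed());
    assert_eq!(d.added, vec!["w3:5432".to_string()]);

    // No change: the tick is a no-op for topology.
    assert!(!diff_topology(&["w1:5432"], &["w1:5432"]).changed());

    // Port change = remove old identity + add new one.
    let d = diff_topology(&["w1:5432"], &["w1:6432"]);
    assert_eq!(d.removed, vec!["w1:5432".to_string()]);
    assert_eq!(d.added, vec!["w1:6432".to_string()]);
}
```

The empty-vs-empty and empty-vs-recorded edge cases listed under Testing fall out of the same set difference for free.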
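The skip + retry + alert policy (COORD-14) paired with the retry-ticks GUC (COORD-15) amounts to a per-worker consecutive-failure counter. A minimal sketch, assuming a hypothetical `WorkerHealth` struct; `retry_ticks` stands in for the `pg_trickle.citus_worker_retry_ticks` GUC value:

```rust
/// Per-worker failure streak. Illustrative only; pg_trickle's internal
/// bookkeeping may differ.
#[derive(Default)]
struct WorkerHealth {
    consecutive_failures: u32,
}

impl WorkerHealth {
    /// Record one poll outcome. Returns true when the failure streak
    /// reaches the threshold, i.e. the worker should be flagged in
    /// `citus_status`; until then the scheduler just skips and retries.
    fn record_poll(&mut self, ok: bool, retry_ticks: u32) -> bool {
        if ok {
            self.consecutive_failures = 0; // any success resets the streak
            false
        } else {
            self.consecutive_failures += 1;
            self.consecutive_failures >= retry_ticks
        }
    }
}

fn main() {
    let mut h = WorkerHealth::default();
    // Four failures in a row: still below the default threshold of 5.
    for _ in 0..4 {
        assert!(!h.record_poll(false, 5));
    }
    // Fifth consecutive failure trips the alert flag.
    assert!(h.record_poll(false, 5));
    // A successful poll clears the streak again.
    assert!(!h.record_poll(true, 5));
    assert_eq!(h.consecutive_failures, 0);
}
```

Counting consecutive (rather than total) failures means a flapping worker that recovers between ticks never alerts, which matches the "skip + retry" framing.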
## New GUC

| GUC | Default | Description |
|-----|---------|-------------|
| `pg_trickle.citus_worker_retry_ticks` | `5` | Consecutive worker-poll failures before flagging in `citus_status` |

## Migration: 0.33.0 → 0.34.0

No schema changes. The new scheduler behavior activates automatically for stream tables with `source_placement = 'distributed'`. Operators should remove any manual `LISTEN + handle_vp_promoted()` application logic — it is now redundant (though harmless to leave in place).

## Testing

- Unit tests for topology change detection logic: `test_topology_no_change`, `test_topology_worker_added`, `test_topology_worker_removed`, `test_topology_empty_recorded_live_workers`, `test_topology_both_empty`, `test_topology_port_change` (all in `src/citus.rs`).
- COORD-17/18 full integration tests (3-node Citus Docker-Compose cluster) are planned for a future release when the CI Citus cluster is available.

## Implementation Status

| ID | Title | Status |
|----|-------|--------|
| COORD-10 | Wire scheduler `poll_worker_slot_changes()` | ✅ Done |
| COORD-11 | Wire scheduler `ensure_worker_slot()` | ✅ Done |
| COORD-12 | Automatic `pgt_st_locks` lease lifecycle | ✅ Done |
| COORD-13 | Topology change detection + slot reconciliation | ✅ Done |
| COORD-14 | Worker failure handling (skip + retry + alert) | ✅ Done |
| COORD-15 | `citus_worker_retry_ticks` GUC | ✅ Done |
| COORD-16 | Extended `citus_status` view | ✅ Done |
| COORD-17 | Unit tests: topology detection | ✅ Done |
| COORD-18 | Unit tests: rebalance edge cases | ✅ Done |

## Exit Criteria

- [x] All P0 items ✅ Done
- [x] `just test-unit` passes
- [x] `just check-version-sync` exits 0
- [x] CHANGELOG.md entry written
- [x] ROADMAP.md v0.34.0 row marked ✅ Released