# v0.81.0 — Observability, Self-Tuning & Quick Wins

> **Status:** Planned
> **Scope:** Large
> **Driven by:** [Assessment 16](../plans/PLAN_OVERALL_ASSESSMENT_16.md) — QW-1 through QW-10

## Theme

Immediate-value improvements that require minimal architectural change while
delivering significant observability, ergonomic, and performance gains to the
current single-node engine. Every item is backward-compatible and benefits all
deployment modes.

## Items

### QW-1: Commit-to-Visible Latency Metric

Track the wall-clock time between source transaction commit and stream table
visibility. Use `pg_xact_commit_timestamp()` (requires
`track_commit_timestamp=on`) to measure the true end-to-end latency. Expose as
`pg_trickle_commit_to_visible_ms` Prometheus histogram with per-ST labels.

### QW-2: Configuration Advisor Function

`SELECT * FROM pgtrickle.tune_recommendations()` returns a table of
`(guc_name, current_value, recommended_value, reason)` based on observed
workload patterns: refresh latency percentiles, memory usage, worker
utilization, CDC lag trends. Gives operators actionable tuning guidance without
requiring deep pg_trickle expertise.

### QW-3: Preview / Dry-Run Mode

`SELECT * FROM pgtrickle.preview_stream_table(query text)` returns:

- Detected source tables and their CDC mode
- Planned refresh strategy (FULL/DIFFERENTIAL/AUTO)
- OpTree complexity class
- Estimated delta SQL template size
- Any DVM support warnings (e.g., volatile functions, non-invertible aggregates)

No side effects — does not create the stream table.

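
As a rough illustration of the QW-1 measurement, the sketch below computes a commit-to-visible latency from two timestamps and records it in a cumulative, Prometheus-style histogram. The bucket bounds, the `LatencyHistogram` type, and the `commit_to_visible_ms` helper are assumptions made for this example; the plan does not specify them. In the extension itself the commit timestamp would come from `pg_xact_commit_timestamp()` rather than being passed in directly.

```rust
use std::time::SystemTime;

/// Upper bounds in milliseconds for the histogram buckets.
/// Hypothetical values chosen for illustration; the plan does not define them.
pub const BUCKETS_MS: [f64; 6] = [10.0, 50.0, 100.0, 500.0, 1000.0, 5000.0];

/// Cumulative histogram: `bucket_counts[i]` counts every observation
/// less than or equal to `BUCKETS_MS[i]`; `count` acts as the +Inf bucket.
#[derive(Debug, Default)]
pub struct LatencyHistogram {
    pub bucket_counts: [u64; 6],
    pub count: u64,
    pub sum_ms: f64,
}

impl LatencyHistogram {
    /// Record one latency sample in every bucket whose bound it fits under.
    pub fn observe(&mut self, latency_ms: f64) {
        for (i, bound) in BUCKETS_MS.iter().enumerate() {
            if latency_ms <= *bound {
                self.bucket_counts[i] += 1;
            }
        }
        self.count += 1;
        self.sum_ms += latency_ms;
    }
}

/// Milliseconds between the source commit and the moment the change became
/// visible in the stream table. Returns 0.0 if the visible timestamp is
/// earlier than the commit timestamp (for example, due to clock adjustment).
pub fn commit_to_visible_ms(commit_ts: SystemTime, visible_ts: SystemTime) -> f64 {
    visible_ts
        .duration_since(commit_ts)
        .map(|d| d.as_secs_f64() * 1000.0)
        .unwrap_or(0.0)
}
```

A per-ST label could be modeled by keeping one `LatencyHistogram` per stream table name, though how labels are stored is likewise left open here.
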
### QW-4: OpenTelemetry Trace Spans

Instrument the refresh hot path with OTel trace spans:

- `pg_trickle.scheduler_tick` — outer span per scheduler wake cycle
- `pg_trickle.refresh_cycle` — per-ST refresh (includes mode decision)
- `pg_trickle.delta_execute` — delta SQL execution time
- `pg_trickle.merge_apply` — MERGE statement execution time
- `pg_trickle.frontier_advance` — frontier update
- `pg_trickle.cleanup` — change buffer cleanup

Export via OTLP gRPC/HTTP to any collector (Jaeger, Tempo, Datadog). Controlled
by `pg_trickle.otel_endpoint` GUC (empty = disabled).

### QW-5: Bounded L0/L1 Template Cache

Add LRU eviction to the thread-local `DELTA_TEMPLATE_CACHE` and
`PLACEHOLDER_RESOLVER_CACHE`. New GUC `pg_trickle.template_cache_max_entries`
(default 256) caps per-session memory usage. In 10K-ST deployments, this
prevents unbounded growth in long-lived backend sessions.

### QW-6: DeltaOperator Trait

Define a `DeltaOperator` trait:

```rust
pub trait DeltaOperator {
    fn generate_delta(
        &self,
        ctx: &mut DiffContext,
        children: &[DiffResult],
    ) -> Result;

    fn supports_immediate_mode(&self) -> bool { false }
    fn is_monotone(&self) -> bool { false }
}
```

Migrate all 22 operator implementations to this trait. Enables future
plugin-style operator extension and cleaner dispatch in `operators/mod.rs`.

### QW-7: Split config.rs by Category

Move GUC declarations from the monolithic `src/config.rs` into focused
sub-modules:

- `src/config/scheduler.rs` — scheduler interval, workers, backoff
- `src/config/cdc.rs` — CDC mode, WAL transition, buffer thresholds
- `src/config/dvm.rs` — parse depth, CTE cap, template cache, algebraic drift
- `src/config/monitoring.rs` — alert thresholds, history retention, metrics

Re-export from `src/config/mod.rs` for backward-compatible imports.

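
The LRU bound described in QW-5 can be sketched with a small, standard-library-only cache. This is a minimal sketch under stated assumptions, not the extension's implementation: the `LruCache` type and its methods are hypothetical, the `thread_local!` wrapper around the real caches is omitted, and eviction uses a linear scan, which seems acceptable for a cap near the proposed default of 256 but is not a claim about the design the project will choose.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Minimal LRU cache: every entry remembers the logical time of its last use.
/// (Hypothetical type for illustration only.)
pub struct LruCache<K, V> {
    max_entries: usize,
    tick: u64,
    entries: HashMap<K, (V, u64)>,
}

impl<K: Hash + Eq + Clone, V> LruCache<K, V> {
    /// `max_entries` plays the role of `pg_trickle.template_cache_max_entries`.
    pub fn new(max_entries: usize) -> Self {
        assert!(max_entries > 0, "cache must hold at least one entry");
        Self { max_entries, tick: 0, entries: HashMap::new() }
    }

    /// Look up a key and mark it as the most recently used entry.
    pub fn get(&mut self, key: &K) -> Option<&V> {
        self.tick += 1;
        let tick = self.tick;
        match self.entries.get_mut(key) {
            Some(entry) => {
                entry.1 = tick;
                Some(&entry.0)
            }
            None => None,
        }
    }

    /// Insert or replace a value. If the key is new and the cache is full,
    /// the least recently used entry is evicted first.
    pub fn put(&mut self, key: K, value: V) {
        self.tick += 1;
        if !self.entries.contains_key(&key) && self.entries.len() >= self.max_entries {
            // O(n) scan for the entry with the smallest last-used tick.
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, last_used))| *last_used)
                .map(|(k, _)| k.clone());
            if let Some(k) = oldest {
                self.entries.remove(&k);
            }
        }
        self.entries.insert(key, (value, self.tick));
    }

    /// Number of entries currently cached.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

With a cap of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `a` was touched more recently.
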
### QW-8: Self-Healing Circuit Breaker

Extend the current `max_consecutive_errors` suspension with auto-remediation:

- **OOM detected** (SPI error containing "out of memory"): reduce
  `merge_work_mem_mb` by 25% for the affected ST and retry.
- **Lock timeout detected**: increase that ST's effective scheduler interval by
  2× (exponential backoff) until 3 consecutive successes.
- **Sustained lag** (>5× schedule interval): temporarily add +1 refresh worker
  if below `max_worker_processes` capacity.

All remediations are logged to `pgt_refresh_history` with reason codes and are
reversible (settings revert after N successful cycles).

### QW-9: Chunked MERGE for Large Deltas

When delta row count exceeds `pg_trickle.merge_batch_size` (default 50,000),
split the MERGE into batched statements:

1. Materialize delta into a temp table
2. Execute MERGE in chunks of `merge_batch_size` rows using row-number windows
3. Drop temp table

This reduces peak memory usage and lock hold time for large delta sets,
preventing long-running transactions from blocking other refreshes.

### QW-10: Stream Table Presets

Named configuration profiles that set multiple parameters at once:

```sql
SELECT pgtrickle.create_stream_table(
    'my_view',
    'SELECT ...',
    preset => 'real-time'
);
```

| Preset           | Schedule | Mode         | Workers | Memory |
|------------------|----------|--------------|---------|--------|
| `real-time`      | 1s       | DIFFERENTIAL | max     | 256MB  |
| `batch`          | 5m       | AUTO         | 1       | 64MB   |
| `cost-optimized` | 15m      | AUTO         | 1       | 32MB   |

Presets set defaults that can be individually overridden.
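
One way to read "defaults that can be individually overridden" is sketched below: a preset name resolves to a full set of settings, and any field the caller supplies replaces the preset's value. The preset values come from the table above; the type names (`Settings`, `Overrides`, `Workers`, `RefreshMode`) and the `preset` and `resolve` functions are hypothetical and only illustrate the merge rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RefreshMode {
    Differential,
    Auto,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Workers {
    /// "max" in the preset table.
    Max,
    Fixed(u32),
}

/// Fully resolved settings for one stream table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Settings {
    pub schedule_secs: u64,
    pub mode: RefreshMode,
    pub workers: Workers,
    pub memory_mb: u32,
}

/// Per-table overrides; `None` means "keep the preset's default".
#[derive(Debug, Default, Clone, Copy)]
pub struct Overrides {
    pub schedule_secs: Option<u64>,
    pub mode: Option<RefreshMode>,
    pub workers: Option<Workers>,
    pub memory_mb: Option<u32>,
}

/// Look up a preset by name, using the values from the preset table.
pub fn preset(name: &str) -> Option<Settings> {
    let (schedule_secs, mode, workers, memory_mb) = match name {
        "real-time" => (1, RefreshMode::Differential, Workers::Max, 256),
        "batch" => (300, RefreshMode::Auto, Workers::Fixed(1), 64),
        "cost-optimized" => (900, RefreshMode::Auto, Workers::Fixed(1), 32),
        _ => return None,
    };
    Some(Settings { schedule_secs, mode, workers, memory_mb })
}

/// Apply overrides field by field on top of the named preset.
/// Returns `None` for an unknown preset name.
pub fn resolve(name: &str, o: Overrides) -> Option<Settings> {
    let base = preset(name)?;
    Some(Settings {
        schedule_secs: o.schedule_secs.unwrap_or(base.schedule_secs),
        mode: o.mode.unwrap_or(base.mode),
        workers: o.workers.unwrap_or(base.workers),
        memory_mb: o.memory_mb.unwrap_or(base.memory_mb),
    })
}
```

For example, resolving `batch` with only `memory_mb` overridden to 128 keeps the 5-minute schedule, AUTO mode, and single worker while raising the memory setting.
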