# v0.82.0 — External Worker Foundation > **Status:** Planned > **Scope:** Large > **Driven by:** [Assessment 16](../plans/PLAN_OVERALL_ASSESSMENT_16.md) — MT-1, MT-2, MT-7, MT-9, MT-10 ## Theme Introduce the ability to run DVM computation outside the PostgreSQL backend process. A standalone `pg_trickle_worker` binary connects via tokio-postgres, claims work via advisory locks, and computes deltas externally. The in-process scheduler becomes a coordinator that dispatches to external workers when available, falling back to in-process execution when not. This is the foundational architectural change that enables horizontal scaling in later releases. The system remains fully backward-compatible: deployments without external workers continue operating exactly as before. ## Items ### MT-1: External Worker Binary (`pg_trickle_worker`) A standalone Rust binary (`pg_trickle_worker`) in a new `crates/worker/` workspace member: - Connects to PostgreSQL via `tokio-postgres` (connection string from env/config) - Reads stream table metadata from `pgtrickle.pgt_stream_tables` - Claims assigned work from `pgtrickle.pgt_worker_jobs` via advisory locks - Reads change buffers via `COPY ... TO STDOUT (FORMAT binary)` for efficient transfer - Computes delta SQL using the extracted DVM engine - Applies MERGE back to the stream table - Advances frontier and records refresh history - Stateless: all state lives in PostgreSQL catalog tables The worker is a single static binary suitable for Docker/Kubernetes deployment. ### MT-2: Worker Coordination Protocol Advisory-lock-based work assignment: - Coordinator (in-process scheduler) inserts jobs into `pgt_worker_jobs` table with `(pgt_id, assigned_at, heartbeat_at, worker_id)` - Workers claim jobs via `pg_try_advisory_xact_lock(pgt_id)` - Heartbeat: workers UPDATE `heartbeat_at` every 5s while processing - Timeout: coordinator reclaims jobs where `heartbeat_at < now() - 30s` - Failover: unclaimed/expired jobs fall back to in-process execution New catalog table: `pgtrickle.pgt_worker_registry`: ```sql CREATE TABLE pgtrickle.pgt_worker_registry ( worker_id UUID PRIMARY KEY, hostname TEXT NOT NULL, pid INTEGER NOT NULL, started_at TIMESTAMPTZ NOT NULL DEFAULT now(), last_heartbeat TIMESTAMPTZ NOT NULL DEFAULT now(), status TEXT NOT NULL DEFAULT 'active', capabilities JSONB DEFAULT '{}' ); ``` ### MT-7: DiffContext Decomposition Split the 15-field `DiffContext` struct into focused sub-contexts: - `CdcContext` — frontier LSNs, source OIDs, CDC columns, key columns - `CacheContext` — CTE delta cache, scan pushed predicates, bypass tables - `OptimizationContext` — depth tracking, CTE counters, max limits The top-level `DiffContext` composes these via delegation. This reduces cognitive load when working on individual subsystems and enables the external worker to construct only the contexts it needs. ### MT-9: `pg_stat_pgtrickle` System View A virtual system view (similar to `pg_stat_user_tables`) exposing per-ST cumulative statistics: ```sql CREATE VIEW pgtrickle.pg_stat_pgtrickle AS SELECT pgt_id, schema_name, table_name, total_refreshes, total_full_refreshes, total_diff_refreshes, total_delta_rows_processed, avg_refresh_duration_ms, p95_refresh_duration_ms, p99_refresh_duration_ms, last_refresh_at, current_lag_estimate_ms, last_error, last_error_at FROM pgtrickle.pgt_cost_model_summary JOIN pgtrickle.pgt_stream_tables USING (pgt_id); ``` Compatible with standard PostgreSQL monitoring tools (pgwatch, Datadog, pg_stat_monitor). ### MT-10: Adaptive Worker Pool Sizing Automatically adjust the effective number of refresh workers based on real-time metrics: - **CPU utilization** > 70%: reduce workers by 1 (avoid starving OLTP) - **Refresh queue depth** > 2× workers: add 1 worker (reduce lag) - **All workers idle** for 3 consecutive ticks: reduce to `min_workers` - **CDC lag** > 5× schedule interval: add workers up to `max_workers` New GUCs: - `pg_trickle.adaptive_workers = true` (enable/disable) - `pg_trickle.min_workers = 1` (floor) - `pg_trickle.max_workers = 8` (ceiling, bounded by max_worker_processes) The adaptive algorithm runs once per scheduler tick with hysteresis to prevent oscillation (changes require 3 consecutive measurements).