# v0.83.0 — Performance Pipeline & CDC Extraction > **Status:** Planned > **Scope:** Large > **Driven by:** [Assessment 16](../plans/PLAN_OVERALL_ASSESSMENT_16.md) — MT-3, MT-4, MT-5, MT-6 ## Theme Eliminate the two largest performance bottlenecks: synchronous write-path impact from trigger-based CDC, and materialization overhead from the full-delta-then-MERGE refresh pattern. This release introduces an external CDC consumer that removes trigger overhead entirely, shared-memory ring buffers for high-throughput sources, and pipelined refresh execution that overlaps delta computation with MERGE application. ## Items ### MT-3: Pipelined Refresh Execution Replace the current pattern (materialize full delta → MERGE entire set) with a streaming pipeline: 1. Execute delta SQL with a cursor/portal (no full materialization) 2. Stream result rows in batches of `pipeline_batch_size` (default 4096) 3. For each batch, generate and execute a partial MERGE statement 4. Commit incrementally (one sub-transaction per batch) Benefits: - Peak memory reduced from O(delta_size) to O(batch_size) - First rows visible sooner (progressive freshness) - Lock hold time reduced (sub-transaction per batch) - Long deltas don't block other ST refreshes New GUC: `pg_trickle.pipeline_batch_size = 4096` Implementation uses PostgreSQL's extended query protocol with named portals when executed via external workers (tokio-postgres), or SPI cursors for in-process mode. ### MT-4: External CDC Consumer Binary (`pg_trickle_cdc`) A standalone Rust binary in `crates/cdc-consumer/`: - Subscribes to PostgreSQL logical replication using `pgoutput` protocol - Decodes WAL events and writes to change buffer tables (same schema as trigger-based CDC) - Runs as a separate process — zero overhead on the PostgreSQL write path - Supports multiple publication subscriptions (one per source table or grouped) - Tracks consumed LSN via replication protocol feedback (no separate frontier table needed) Deployment modes: - **Sidecar:** Same host as PostgreSQL, writes to change buffers directly - **Remote:** Separate host, writes via `COPY ... FROM STDIN (FORMAT binary)` - **Log-based:** Writes to an intermediate log (Kafka/NATS) instead of PostgreSQL tables (foundation for v0.85.0) New GUC: `pg_trickle.cdc_mode = 'external'` (vs 'trigger' / 'wal' / 'auto') When `cdc_mode='external'`, the extension: - Does not create triggers on source tables - Does not create logical replication slots - Expects the external consumer to populate change buffers - Monitors buffer freshness and alerts if the consumer falls behind ### MT-5: Shared-Memory Change Buffer Ring For hot sources (>10K writes/s), replace the table-based change buffer with a fixed-size ring buffer in PostgreSQL shared memory: ``` ┌─────────────────────────────────────────────┐ │ Shared Memory Ring Buffer (per source OID) │ │ [slot 0][slot 1][slot 2]...[slot N-1] │ │ ^write_head ^read_head │ └─────────────────────────────────────────────┘ ``` Characteristics: - Fixed-size: `pg_trickle.ring_buffer_slots = 65536` (per source) - Lock-free SPSC (single producer = trigger/CDC, single consumer = scheduler) - Each slot: `{action: u8, lsn: u64, row_data: [u8; MAX_ROW_SIZE]}` - Overflow: when ring is full, spill to table-based path and log warning - Zero WAL overhead: shared memory is not WAL-logged - Cleared on crash recovery (change buffers are replayed from WAL position) New GUC: `pg_trickle.ring_buffer_sources` (comma-separated OIDs or `'auto'`) When `'auto'`, sources exceeding `ring_buffer_threshold_writes_per_sec` (default 10000) are automatically promoted to ring-buffer mode. ### MT-6: Auto-Schema-Evolution When `pg_trickle.auto_evolve = true` (default false), DDL event triggers detect schema changes on source tables and automatically propagate safe changes: **Additive changes (automatic):** - `ALTER TABLE ... ADD COLUMN` → rewrite defining query if `SELECT *`, ALTER stream table, mark for reinitialize - `ALTER TABLE ... ADD CONSTRAINT` → no-op (constraints don't affect ST) **Destructive changes (alert + suspend):** - `ALTER TABLE ... DROP COLUMN` used in defining query → suspend ST, alert via NOTIFY, log to health_check() - `ALTER TABLE ... ALTER COLUMN TYPE` on used column → suspend ST, require manual intervention **Transparent changes (no action):** - `CREATE INDEX` / `DROP INDEX` on source → no-op - `ALTER TABLE ... SET STATISTICS` → no-op - `COMMENT ON COLUMN` → no-op All automated actions are logged to `pgt_refresh_history` with reason code `AUTO_SCHEMA_EVOLUTION` and the specific DDL that triggered the change.