# v0.87.0 — Enterprise Federation & State Management > **Status:** Planned > **Scope:** Very Large > **Driven by:** [Assessment 16](../plans/PLAN_OVERALL_ASSESSMENT_16.md) — LT-5, LT-8 ## Theme Complete the distributed scaling arc with multi-cluster federation (stream tables spanning multiple PostgreSQL clusters), object-storage state checkpointing for disaster recovery, and the zero-config adaptive mode that detects deployment context and configures behavior automatically. This release establishes pg_trickle as a true enterprise-grade distributed IVM platform. ## Items ### LT-5: Multi-Cluster Federation Coordinate stream tables across multiple PostgreSQL clusters with a global DAG and cross-cluster frontier synchronization: **Architecture:** ``` ┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐ │ Cluster A │ │ Cluster B │ │ Cluster C │ │ • Source: orders │ │ • Source: users │ │ • ST: dashboard │ │ • ST: order_agg │────▶│ • ST: user_stats │────▶│ (joins A + B) │ │ (local) │ │ (local) │ │ (federated) │ └──────────────────┘ └──────────────────┘ └──────────────────┘ │ │ │ └────────────────────────┴────────────────────────┘ │ ┌───────────────┐ │ Federation │ │ Coordinator │ │ (global DAG) │ └───────────────┘ ``` **Components:** 1. **Federation Coordinator:** A standalone service that maintains the global DAG across clusters. Runs as a Kubernetes Deployment or standalone binary. 2. **Cross-cluster change forwarding:** When Cluster A's `order_agg` ST is refreshed, its change buffer events are forwarded to Cluster C via the change log backend (Kafka/NATS). 3. **Global frontier synchronization:** The coordinator tracks per-cluster frontiers and ensures downstream clusters only process events up to the consistent frontier of all upstream clusters. 4. **Cluster discovery:** Clusters register with the coordinator via `pgtrickle.register_federation(coordinator_url)`. **SQL API:** ```sql -- On Cluster C (consumer): SELECT pgtrickle.create_federated_stream_table( 'dashboard', $$SELECT o.category, u.region, SUM(o.amount) FROM cluster_a.order_agg o JOIN cluster_b.user_stats u ON o.user_id = u.id GROUP BY 1, 2$$, sources => ARRAY[ 'nats://coordinator/cluster_a/order_agg', 'nats://coordinator/cluster_b/user_stats' ] ); ``` **Consistency model:** - **Eventual consistency** (default): Each cluster refreshes independently; cross-cluster joins may see temporarily inconsistent snapshots. - **Causal consistency** (opt-in): The coordinator enforces that downstream clusters only see events that are causally consistent across all upstream clusters. Adds latency equal to the slowest upstream cluster. ### LT-8: State Checkpointing to Object Storage Periodically checkpoint DVM state to S3/GCS for disaster recovery independent of pg_basebackup: **What is checkpointed:** - Stream table metadata (defining queries, schedules, configurations) - Frontier positions (per-source, per-ST) - Template cache (delta SQL templates, Aho-Corasick automata serialized) - DAG topology (edges, SCC groupings, diamond consistency groups) - Cost model summaries (refresh history aggregates) - Worker registry state **Checkpoint format:** A single JSONL file per checkpoint: ``` s3://my-bucket/pgtrickle/checkpoints/2026-05-30T12:00:00Z.jsonl.zst ``` **Checkpoint lifecycle:** - Triggered every `pg_trickle.checkpoint_interval` (default 1h) - Also triggered before extension upgrade (`ALTER EXTENSION ... UPDATE`) - Retained for `pg_trickle.checkpoint_retention_days` (default 7) - Compressed with Zstandard (typically <1MB for 1000-ST deployments) **Recovery from checkpoint:** ```sql SELECT pgtrickle.restore_from_checkpoint( 's3://my-bucket/pgtrickle/checkpoints/2026-05-30T12:00:00Z.jsonl.zst' ); ``` This recreates all stream tables, dependencies, and configurations but marks all STs for `REINITIALIZE` (since the actual data may have diverged since the checkpoint). Useful for: - Restoring pg_trickle configuration after a database restore from a backup that doesn't include the `pgtrickle` schema - Migrating pg_trickle state to a new PostgreSQL cluster - Disaster recovery when the primary is lost and only a replica (without pg_trickle extension) is available ### Global Priority Queue & Resource Governor For multi-database PostgreSQL clusters (multi-tenant SaaS), implement a global resource governor: **Architecture:** - Shared-memory priority queue across all per-database schedulers - Priority based on: (1) SLA tier, (2) current lag, (3) last refresh age - Global worker budget: `pg_trickle.global_max_workers` across all databases - Per-database quotas: `pg_trickle.per_db_max_workers` (default unlimited) - Priority preemption: HIGH-SLA STs can preempt LOW-SLA refreshes in progress **GUCs:** - `pg_trickle.global_max_workers = 16` - `pg_trickle.per_db_max_workers = -1` (unlimited) - `pg_trickle.priority_preemption = false` ### Zero-Config Adaptive Mode The system detects its deployment context and configures behavior automatically without any GUC changes: | Context | Detection Method | Auto-Configuration | |---------|-----------------|-------------------| | **Developer laptop** | `max_connections < 20` AND no `pgt_worker_registry` entries | 1 worker, trigger CDC, 5s interval, 32MB memory | | **Small server** | `max_connections < 100` AND no external workers | 4 workers, trigger CDC, 1s interval, 64MB memory | | **Production server** | External workers in `pgt_worker_registry` | Coordinator mode, dispatch to external workers | | **Kubernetes** | `PG_TRICKLE_K8S_MODE=true` env var OR operator CRD detected | Full distributed: external CDC, external workers, log backend | **Override:** Any GUC explicitly set by the user takes precedence over adaptive detection. The adaptive mode only sets defaults for GUCs that are at their compile-time default values. **Reporting:** `SELECT pgtrickle.detected_mode()` returns the current adaptive mode and the reasoning for the detection.