# Plan: Architecture Decision Records

Date: 2026-02-24
Status: PROPOSED
Last Updated: 2026-02-25

---

## Overview

This plan proposes a comprehensive set of Architecture Decision Records (ADRs) for pg_trickle — covering both **decisions already made** during development and **forward-looking decisions** needed to achieve full PostgreSQL and SQL feature coverage. The goal is to eventually support all relevant PostgreSQL and SQL features in both FULL and DIFFERENTIAL refresh modes.

Each ADR documents the reasoning behind a significant technical choice, including alternatives considered and consequences.

### ADR Format

Each ADR follows a standard template:

```markdown
# ADR-NNN: <Title>

| Field | Value |
|---------------|-------------------------|
| **Status** | Accepted / Superseded / Proposed / Not Started |
| **Date** | YYYY-MM-DD |
| **Deciders** | pg_trickle core team |
| **Category** | <area> |

## Context
## Decision
## Options Considered
## Consequences
## References
```

### Numbering Convention

- **ADR-001–009**: Core architecture (CDC, IVM engine, storage)
- **ADR-010–019**: API & schema design
- **ADR-020–029**: Scheduling & runtime
- **ADR-030–039**: Tooling, testing, ecosystem
- **ADR-040–049**: Performance & optimization
- **ADR-050–059**: SQL feature coverage & operator design
- **ADR-060–069**: PostgreSQL integration & compatibility
- **ADR-070–079**: Correctness & safety guarantees

---

## Part 1: Decisions Already Made

These ADRs document technical choices that have been implemented. The decisions are settled; the ADR documents capture the rationale so future contributors understand the "why."
### ADR-001: Row-Level Triggers as Default CDC Mechanism

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | CDC |
| **Sources** | `plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md` |

**Decision:** Use row-level AFTER triggers as the default change-data-capture mechanism, avoiding the `pg_create_logical_replication_slot()` write-context restriction that prevents slot creation inside DDL transactions.

**Key points:**

- Triggers can be created in the same transaction as `CREATE TABLE` — atomic stream table creation
- No `wal_level = logical` requirement for basic operation
- Changes are visible in buffer tables as soon as the source transaction commits
- Trade-off: write-side overhead (~5-15% per DML on tracked tables)

---

### ADR-002: Hybrid CDC — Trigger Bootstrap with WAL Steady-State

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | CDC |
| **Sources** | `plans/sql/PLAN_HYBRID_CDC.md`, `plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md` |

**Decision:** After ADR-001 chose triggers as default, implement a hybrid approach: use triggers at creation time (zero-config, atomic), then transparently transition to logical replication for steady-state if `wal_level = logical`.
**Key points:**

- Three CDC states: TRIGGER → TRANSITIONING → WAL
- No-data-loss transition (trigger stays active until WAL catches up)
- Graceful fallback if slot creation fails or WAL decoder doesn't catch up within `pg_trickle.wal_transition_timeout`
- Same buffer table schema regardless of CDC mode
- `pg_trickle.cdc_mode` GUC for user control (auto/trigger/wal)

---

### ADR-003: Query Differentiation via Operator Tree (DVM Engine Design)

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | IVM Engine |
| **Sources** | `plans/PLAN.md` Phase 6, `docs/DVM_OPERATORS.md`, `docs/ARCHITECTURE.md` |

**Decision:** Implement incremental view maintenance by parsing the defining query into an operator tree (`OpTree`) and applying per-operator differentiation rules (analogous to automatic differentiation in calculus) to generate delta SQL.

**Alternatives considered:**

- Full recomputation only (simple but O(n) always)
- Log-based delta replay (simpler operators, less SQL coverage)
- DBSP-style Z-sets with explicit multiplicity tracking
- pg_ivm's approach (limited to single-table aggregates at the time)

**Key points:**

- 21 OpTree variants: Scan, Filter, Project, InnerJoin, LeftJoin, FullJoin, Aggregate, Distinct, UnionAll, Intersect, Except, Subquery, CteScan, RecursiveCte, Window, LateralFunction, LateralSubquery, SemiJoin, AntiJoin, ScalarSubquery (+ more planned)
- Delta SQL is generated as CTEs, not materialized intermediates
- Row identity via `__pgt_row_id` (xxHash) for diff-based delta application
- Theoretical basis: DBSP (Budiu et al. 2023), Gupta & Mumick (1995)

---

### ADR-004: xxHash Row IDs Instead of UUIDs

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Storage / IVM Engine |
| **Sources** | `plans/PLAN.md` Key Design Decisions, `src/hash.rs`, `src/dvm/row_id.rs` |

**Decision:** Use 64-bit xxHash of the primary key as the `__pgt_row_id` column (stored as `BIGINT`) rather than UUIDs or composite-key matching.

**Alternatives considered:**

- UUID v4 (128-bit, effectively collision-free, 16 bytes per row)
- Composite primary key matching (no extra column, but complex MERGE logic)
- MD5/SHA hash (cryptographically stronger but slower)

**Key points:**

- 8 bytes vs 16 bytes per row (significant at scale)
- Collision probability: ~2⁻⁶⁴ per pair of distinct keys (by the birthday bound, a collision only becomes likely around 2³² keys) — acceptable for practical datasets
- `pg_trickle_hash()` for single-column PKs, `pg_trickle_hash_multi()` for composites
- Visible to users via `SELECT *` — a known tradeoff

---

### ADR-005: Per-Table Change Buffer Tables Instead of In-Memory Queues

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | CDC / Storage |
| **Sources** | `plans/PLAN.md` Key Design Decisions, `src/cdc.rs` |

**Decision:** Store CDC changes in dedicated PostgreSQL tables (`pgtrickle_changes.changes_<oid>`) rather than in shared memory, message queues, or a single global changes table.

**Alternatives considered:**

- Shared memory ring buffer (fast, but limited size, not crash-safe)
- Single global changes table (simpler, but contention on high-write workloads)
- External message queue (Kafka, NATS — unnecessary dependency)

**Key points:**

- Crash-safe: survives backend/worker crashes
- Queryable for debugging and monitoring
- Per-table isolation avoids contention across independent source tables
- Aggressive cleanup after each refresh cycle
- Trade-off: extra I/O vs. durability and simplicity

---

### ADR-006: Explicit DML for User Triggers Instead of Always-MERGE

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Refresh Engine |
| **Sources** | `plans/sql/PLAN_USER_TRIGGERS_EXPLICIT_DML.md` |

**Decision:** When a stream table has user-defined triggers, decompose the MERGE into three explicit DML statements (DELETE, UPDATE, INSERT) so triggers fire with correct `TG_OP`, `OLD`, and `NEW`. When no user triggers exist, keep the fast single-MERGE path.

**Alternatives considered:**

- Always use explicit DML (simpler code, but ~10-30% slower for the common case)
- Always use MERGE + replay triggers after (complex, wrong `TG_OP` context)
- Disallow user triggers on stream tables entirely

**Key points:**

- `has_user_triggers()` detection at refresh time
- `CachedMergeTemplate` extended with explicit DML templates
- `pg_trickle.user_triggers` GUC (canonical `auto` / `off`, deprecated `on` alias)
- FULL refresh: triggers suppressed via `DISABLE TRIGGER USER` + `NOTIFY`

---

### ADR-007: Semi-Naive Evaluation for Recursive CTEs

| Field | Value |
|-------|-------|
| **Status** | Accepted (Updated) |
| **Category** | IVM Engine |
| **Sources** | `docs/DVM_OPERATORS.md`, `src/dvm/operators/recursive_cte.rs` |

**Decision:** Handle `WITH RECURSIVE` CTEs using three strategies in DIFFERENTIAL mode, selected automatically based on column compatibility and change type. FULL mode continues to execute the query as-is.

**Key points:**

- FULL mode: query executes as-is (PostgreSQL handles recursion natively)
- DIFFERENTIAL mode uses three strategies:
  1. **Semi-naive evaluation** — INSERT-only changes: differentiate the base case, then propagate new rows through the recursive term via a nested `WITH RECURSIVE`
  2. **Delete-and-Rederive (DRed)** — mixed INSERT/DELETE/UPDATE changes: insert propagation → over-deletion cascade → rederivation → combine
  3. **Recomputation fallback** — when CTE columns ⊃ ST storage columns (column mismatch), re-execute the full query and diff against storage
- Strategy selection is automatic: column match + INSERT-only → semi-naive; column match + mixed → DRed; column mismatch → recomputation
- Non-linear recursion (multiple self-references in the recursive term) is rejected — PostgreSQL restricts the recursive term to reference the CTE at most once

---

### ADR-008: Group-Rescan Strategy for Non-Algebraic Aggregates

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | IVM Engine |
| **Sources** | `docs/DVM_OPERATORS.md`, `src/dvm/operators/aggregate.rs` |

**Decision:** For aggregates that cannot be maintained algebraically (STRING_AGG, ARRAY_AGG, JSON_AGG, BOOL_AND/OR, statistical functions, etc.), use a NULL-sentinel approach: when any row in a group changes, return NULL for the aggregate value, triggering re-aggregation from source data.

**Key points:**

- Algebraic: COUNT, SUM, AVG (maintained via auxiliary counters — O(1) per change)
- Semi-algebraic: MIN, MAX (O(1) for non-extremum changes, rescan on extremum deletion)
- Group-rescan: 17+ aggregates (STRING_AGG, ARRAY_AGG, JSON_AGG, BOOL_AND/OR, BIT_AND/OR/XOR, STDDEV/VAR, MODE, PERCENTILE_CONT/DISC, etc.)
- Group-rescan is correct and handles arbitrary aggregates; trade-off is O(group) per affected group
- Unified pattern: adding new group-rescan aggregates is a copy-paste exercise

---

### ADR-010: SQL Functions Instead of DDL Syntax

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | API Design |
| **Sources** | `plans/PLAN.md` Key Design Decisions |

**Decision:** Expose the API as SQL functions (`pgtrickle.create_stream_table()`, etc.) rather than custom DDL syntax (`CREATE STREAM TABLE ...`).
**Alternatives considered:**

- Custom DDL via PostgreSQL parser hooks or grammar extension
- Foreign Data Wrapper interface
- Hook-based interception of `CREATE MATERIALIZED VIEW`

**Key points:**

- Works without PostgreSQL parser modifications
- Clean extension boundary — standard `CREATE EXTENSION` installation
- Idiomatic PostgreSQL extension pattern
- Trade-off: less "native" feel, no `\d`-style psql integration

---

### ADR-011: `pgtrickle` Schema with `pgt_` Prefix Convention

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | API Design / Naming |
| **Sources** | Code history (dt_ → st_ → pgt_ rename across 72 files) |

**Decision:** All internal catalog objects use the `pgtrickle` schema and `pgt_` column/table prefix. Change buffers live in a separate `pgtrickle_changes` schema.

**Key points:**

- Original naming used `dt_` (derived table), renamed to `st_` (stream table), then to `pgt_` (pg_trickle) for global uniqueness and consistency
- Two schemas: `pgtrickle` (API + catalog) and `pgtrickle_changes` (buffer tables)
- `pgt_` prefix avoids collisions with user objects

---

### ADR-012: PostgreSQL 18 as Sole Target

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | API Design / Platform |
| **Sources** | `plans/PLAN.md` Key Design Decisions |

**Decision:** Target PostgreSQL 18 exclusively. No backward compatibility with PG 16 or PG 17.
**Alternatives considered:**

- Multi-version support via conditional compilation (broader adoption, higher maintenance)
- Target PG 17 as minimum (more users, but miss PG 18 features)

**Key points:**

- PG 18 features used: custom cumulative statistics, improved logical replication, DSM improvements
- Narrows user base but simplifies development and testing
- pgrx 0.17.x provides PG 18 support

---

### ADR-020: Canonical Scheduling Periods (48·2ⁿ Seconds)

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Scheduling |
| **Sources** | `plans/PLAN.md` Key Design Decisions, `src/scheduler.rs` |

**Decision:** Use a discrete set of canonical refresh periods (48, 96, 192, ... seconds) rather than arbitrary user-specified intervals.

**Key points:**

- Guarantees `data_timestamp` alignment across stream tables with different schedules in the same DAG
- User-specified schedule is snapped to the nearest (smaller) canonical period
- NULL schedule = DOWNSTREAM (refresh only when triggered by a dependent)
- Advisory locks prevent concurrent refreshes of the same ST

---

### ADR-021: Single Background Worker Scheduler

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Scheduling / Runtime |
| **Sources** | `src/scheduler.rs`, `src/shmem.rs`, `docs/ARCHITECTURE.md` |

**Decision:** Use a single background worker for scheduling, with shared memory for inter-process communication (`PgLwLock<PgTrickleSharedState>` and `PgAtomic<AtomicU64>` DAG rebuild signal).
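
The atomic rebuild signal can be sketched with plain std atomics — a stand-in for the extension's `PgAtomic<AtomicU64>` in process-shared memory; `signal_dag_rebuild` and `dag_changed` are illustrative names, not the actual API:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Shared generation counter: bumped by any backend that changes the DAG
/// (create/drop of a stream table), read by the scheduler on each wake.
static DAG_GENERATION: AtomicU64 = AtomicU64::new(0);

/// Called from a backend after a catalog change that affects the DAG.
fn signal_dag_rebuild() {
    DAG_GENERATION.fetch_add(1, Ordering::Release);
}

/// Scheduler-side check: compare the shared counter against the last
/// generation a plan was built for. Lock-free; the LwLock is only taken
/// when a rebuild is actually needed.
fn dag_changed(last_seen: &mut u64) -> bool {
    let current = DAG_GENERATION.load(Ordering::Acquire);
    if current != *last_seen {
        *last_seen = current;
        true
    } else {
        false
    }
}

fn main() {
    let mut last_seen = 0u64;
    assert!(!dag_changed(&mut last_seen)); // nothing changed yet
    signal_dag_rebuild();                  // a backend created a stream table
    assert!(dag_changed(&mut last_seen));  // scheduler notices exactly once
    assert!(!dag_changed(&mut last_seen));
}
```

The point of the pattern is that the common case (no DAG change) costs one atomic load per wake cycle.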
**Key points:**

- Wakes at `pg_trickle.scheduler_interval_ms` intervals
- Detects DAG changes via atomic counter comparison (lock-free)
- Topological refresh ordering within each wake cycle
- `SIGTERM` graceful shutdown
- `pg_trickle.enabled` GUC to disable without unloading

---

### ADR-022: Replication Origin for Feedback Loop Prevention

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Refresh Engine |
| **Sources** | `plans/PLAN.md` Key Design Decisions, `src/refresh.rs` |

**Decision:** Use PostgreSQL's replication origin mechanism (`pg_trickle_refresh`) to tag refresh-generated writes, preventing CDC triggers from re-capturing changes made by the refresh itself (feedback loops).

**Key points:**

- Standard PostgreSQL mechanism (`pg_replication_origin_session_setup`)
- Reliable filtering in the trigger function
- No user-visible side effects

---

### ADR-023: Adaptive Full-Refresh Fallback

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Refresh Engine / Performance |
| **Sources** | `docs/ARCHITECTURE.md`, `src/refresh.rs` |

**Decision:** When the change ratio exceeds `pg_trickle.differential_max_change_ratio`, automatically downgrade a DIFFERENTIAL refresh to FULL, since delta processing becomes more expensive than full recomputation at high change rates.
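
The downgrade rule can be sketched as a pure decision function; the names are illustrative, and the real code would read the threshold from the GUC and per-table catalog settings rather than take it as an argument:

```rust
/// Refresh mode chosen for one cycle.
#[derive(Debug, PartialEq)]
enum RefreshMode {
    Differential,
    Full,
}

/// If the fraction of changed source rows exceeds the configured ratio,
/// fall back to FULL. `max_change_ratio` plays the role of
/// `pg_trickle.differential_max_change_ratio`.
fn choose_refresh_mode(changed_rows: u64, source_rows: u64, max_change_ratio: f64) -> RefreshMode {
    if source_rows == 0 {
        // Empty source: a full refresh is trivially cheap.
        return RefreshMode::Full;
    }
    let ratio = changed_rows as f64 / source_rows as f64;
    if ratio > max_change_ratio {
        RefreshMode::Full
    } else {
        RefreshMode::Differential
    }
}

fn main() {
    // 5% of rows changed, threshold 50%: stay differential.
    assert_eq!(choose_refresh_mode(50, 1000, 0.5), RefreshMode::Differential);
    // 80% changed: delta processing would cost more than recomputation.
    assert_eq!(choose_refresh_mode(800, 1000, 0.5), RefreshMode::Full);
}
```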
**Key points:**

- Benchmarks show DIFFERENTIAL is slower than FULL at ~50% change rate
- Automatic switching keeps the default experience fast
- Per-stream-table `auto_threshold` in catalog allows tuning
- `last_full_ms` tracks full-refresh cost for adaptive comparison

---

### ADR-030: dbt Integration via Macro Package (Not Custom Adapter)

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Ecosystem / Tooling |
| **Sources** | `plans/dbt/PLAN_DBT_MACRO.md`, `plans/dbt/PLAN_DBT_ADAPTER.md` |

**Decision:** Integrate with dbt via a Jinja macro package with a custom `stream_table` materialization, using the standard `dbt-postgres` adapter. Defer the full custom Python adapter as an upgrade path.

**Key points:**

- ~15 hours effort (vs ~54 for adapter)
- No Python code — pure Jinja SQL macros
- Works with dbt Core ≥ 1.6 (for `subdirectory` in `packages.yml`)
- Adapter plan exists as documented upgrade path in `plans/dbt/PLAN_DBT_ADAPTER.md`

---

### ADR-031: dbt Package In-Repo (Subdirectory) Instead of Separate Repository

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Ecosystem / Tooling |
| **Sources** | `plans/dbt/PLAN_DBT_MACRO.md`, `plans/ecosystem/PLAN_ECO_SYSTEM.md` |

**Decision:** Ship the dbt macro package as `dbt-pgtrickle/` inside the main pg_trickle repository, not in a separate repo.

**Key points:**

- SQL API changes validated against macros in the same PR (via CI)
- Simpler contributor workflow — one repo, one PR
- Users install via `git:` + `subdirectory:` in `packages.yml`
- Extractable to separate repo later if needed

---

### ADR-032: Testcontainers-Based Integration Testing

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | Testing |
| **Sources** | `AGENTS.md`, `plans/testing/STATUS_TESTING.md`, `tests/common/mod.rs` |

**Decision:** All integration and E2E tests use Docker containers via testcontainers-rs and a custom E2E Docker image. Tests never assume a local PostgreSQL installation.

**Key points:**

- Custom `Dockerfile.e2e` builds PG 18 + pg_trickle from source
- Deterministic, reproducible test environments
- Three-tier test pyramid: unit (no DB) → integration (testcontainers) → E2E (full extension Docker image)

---

### ADR-040: Aggregate Maintenance via Auxiliary Counter Columns

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | IVM Engine / Performance |
| **Sources** | `docs/DVM_OPERATORS.md`, `src/dvm/operators/aggregate.rs` |

**Decision:** Maintain algebraic aggregates incrementally by storing auxiliary counter columns alongside each aggregate result.

**Key points:**

- `COUNT(*)` maintained via `__pgt_count` counter
- `SUM(x)` maintained via `__pgt_sum_x` + `__pgt_count` for correctness when group shrinks to zero
- `AVG(x)` derived from SUM/COUNT at read time
- MIN/MAX uses semi-algebraic approach (CASE/LEAST/GREATEST with NULL sentinel for extremum deletion)
- Hidden auxiliary columns increase storage but enable O(1) aggregate updates

---

### ADR-041: LATERAL Diff via Row-Scoped Recomputation

| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Category** | IVM Engine |
| **Sources** | `plans/sql/PLAN_LATERAL_JOINS.md`, `src/dvm/operators/lateral_function.rs` |

**Decision:** Differentiate LATERAL subqueries (and SRFs in FROM) by **row-scoped recomputation**: when an outer row changes, re-execute the correlated subquery for that specific row only.

**Key points:**

- Handles both implicit LATERAL (comma-syntax) and explicit `LEFT JOIN LATERAL`
- Supports top-N per group, correlated aggregation, multi-column derived values
- Correctness relies on re-executing the subquery in the context of the changed outer row — not on incremental maintenance of the inner query

---

## Part 2: Forward-Looking ADRs — SQL Feature Coverage

These ADRs address decisions that **have not yet been made** but are needed to achieve comprehensive PostgreSQL and SQL support.
They cover features currently rejected, partially supported, or not yet considered.

### Current State Summary

- **49+ of 52 original SQL gaps resolved** (see `plans/sql/GAP_SQL_PHASE_4.md`)
- **Zero P0 (silent corruption) or P1 (incorrect semantics) issues remain**
- **25 aggregate functions** in DIFFERENTIAL mode; 17 recognized-but-rejected
- **All rejected constructs** have clear error messages with rewrite suggestions

### ADR-050: Non-Deterministic Function Handling Strategy

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Correctness |
| **Sources** | `plans/sql/PLAN_NON_DETERMINISM.md` |
| **Effort** | Medium (3-5 sessions) |

**Context:** Volatile functions (`random()`, `gen_random_uuid()`, `clock_timestamp()`, `now()`) break delta computation in DIFFERENTIAL mode because the DVM engine assumes expressions are deterministic. The same expression can produce different values across refreshes, causing phantom changes, missed changes, and broken row identity hashes.

**Decision needed:** How to handle volatile, stable, and immutable functions.

**Options:**

1. **Reject volatile functions in DIFFERENTIAL mode** (safest; clear error with suggestion to use FULL mode) — simplest, zero correctness risk
2. **Warn but allow** — user accepts phantom-change risk
3. **Snapshot volatile values at change-capture time** — store the computed value in the change buffer so it's stable across refreshes. Complex but correct.
4. **Auto-downgrade to FULL mode** when volatile functions detected
5. **Classify as stable-safe / volatile-unsafe** — allow `now()` (same within statement) but reject `random()`

**Recommendation:** Option 1 as default with Option 4 as a GUC-controlled override. Adds `lookup_function_volatility()` using `pg_catalog.pg_proc` and a recursive `Expr` tree scanner.
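
The recursive scan might look like the following sketch. The `Expr` enum here is a toy stand-in for the DVM expression type, and `lookup_function_volatility()` is hard-coded where the real version would query `pg_proc.provolatile` via SPI:

```rust
/// Function volatility classes, mirroring `pg_proc.provolatile`
/// ('i' = immutable, 's' = stable, 'v' = volatile). Variant order
/// gives Immutable < Stable < Volatile via PartialOrd.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
enum Volatility {
    Immutable,
    Stable,
    Volatile,
}

/// Toy expression tree standing in for the DVM `Expr` type.
enum Expr {
    Column(&'static str),
    Const(i64),
    FuncCall { name: &'static str, args: Vec<Expr> },
}

/// Stand-in for `lookup_function_volatility()`; the real version would
/// look up `pg_catalog.pg_proc.provolatile` for the resolved function.
fn lookup_function_volatility(name: &str) -> Volatility {
    match name {
        "random" | "clock_timestamp" | "gen_random_uuid" => Volatility::Volatile,
        "now" => Volatility::Stable,
        _ => Volatility::Immutable,
    }
}

/// Recursive scan returning the worst volatility anywhere in the tree.
fn worst_volatility(expr: &Expr) -> Volatility {
    match expr {
        Expr::Column(_) | Expr::Const(_) => Volatility::Immutable,
        Expr::FuncCall { name, args } => {
            let mut worst = lookup_function_volatility(name);
            for arg in args {
                let v = worst_volatility(arg);
                if v > worst {
                    worst = v;
                }
            }
            worst
        }
    }
}

fn main() {
    // round(random(), x) is volatile because of the nested random().
    let e = Expr::FuncCall {
        name: "round",
        args: vec![Expr::FuncCall { name: "random", args: vec![] }, Expr::Column("x")],
    };
    assert_eq!(worst_volatility(&e), Volatility::Volatile);
}
```

Validation at `create_stream_table()` time would then compare the result against the `volatile_function_policy` GUC.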
**Scope:**

- Volatility lookup infrastructure (SPI query to `pg_proc.provolatile`)
- Recursive expression scanner for `worst_volatility()` computation
- Integration into parser validation at `create_stream_table()` time
- GUC: `pg_trickle.volatile_function_policy` (reject/warn/allow)
- Handle overloaded functions (multiple `proname` entries with different volatility)

---

### ADR-051: GROUPING SETS / CUBE / ROLLUP Full Implementation

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Aggregation |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (item S1) |
| **Effort** | High (10-15 hours) |

**Context:** Currently rejected with a clear error suggesting separate stream tables + UNION ALL. GROUPING SETS produce multiple aggregation levels in a single query — each grouping set is essentially a separate GROUP BY.

**Decision needed:** Whether and how to implement in DIFFERENTIAL mode.

**Options:**

1. **Keep rejection** — the UNION ALL rewrite is a viable workaround and avoids significant complexity
2. **Expand to multiple Aggregate operators** — one per grouping set, combined with UNION ALL internally. Each grouping set maps to a separate auxiliary counter set in storage.
3. **Rewrite to UNION ALL at parse time** — transparently decompose the query into multiple GROUP BY queries combined with UNION ALL before building the OpTree

**Recommendation:** Option 3 — query rewrite at parse time is cleanest and reuses existing infrastructure. Option 1 is acceptable if demand is low.

---

### ADR-052: DISTINCT ON Full Implementation

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Deduplication |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (item S2) |
| **Effort** | Medium (6-8 hours) |

**Context:** `DISTINCT ON (expr)` is a PostgreSQL-specific extension that selects the first row per group (based on ORDER BY within the group).
Currently rejected with a suggestion to use `ROW_NUMBER() OVER (...) = 1`.

**Decision needed:** Whether to implement natively or via automatic rewrite.

**Options:**

1. **Keep rejection** — the ROW_NUMBER() rewrite works and is portable SQL
2. **Auto-rewrite to window function** — at parse time, transparently convert `DISTINCT ON (expr) ORDER BY expr, col` to a subquery with `ROW_NUMBER() OVER (PARTITION BY expr ORDER BY col) = 1`
3. **Native DISTINCT ON operator** — new OpTree variant tracking per-group "first row" across refreshes

**Recommendation:** Option 2 — automatic rewrite to window function is cleanest, reuses the existing Window operator, and requires minimal new code.

---

### ADR-053: Circular References in the Stream Table DAG

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / DAG Architecture |
| **Sources** | `plans/sql/PLAN_CIRCULAR_REFERENCES.md` |
| **Effort** | Very High (~20-30 hours) |

**Context:** The dependency graph currently enforces a strict DAG. Creating a stream table that would form a cycle is rejected. Some use cases naturally involve mutual dependencies (e.g., ST A references ST B and vice versa).

**Decision needed:** Whether and how to support cycles in the ST dependency graph.

**Options:**

1. **Keep DAG enforcement** — no cycles, users restructure their queries
2. **SCC-based fixed-point iteration** — decompose the graph into Strongly Connected Components (Tarjan's algorithm), create a condensation DAG, and iterate SCCs to fixed point
3. **Stratified evaluation** — partition cycles into monotone strata (safe to iterate) and non-monotone strata (rejected or user-opted-in with iteration limit)

**Recommendation:** Option 3 (stratified evaluation) — aligns with Datalog theory and DBSP. Only monotone cycles (JOINs, UNIONs, filters) are automatically iterable. Non-monotone cycles (aggregates, EXCEPT) warn and require explicit user opt-in.
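
The SCC decomposition step can be sketched with a textbook Tarjan's algorithm over a toy adjacency list (illustrative only — the real graph nodes would be stream table OIDs, and any SCC larger than one node, or with a self-loop, is a cycle requiring stratified handling):

```rust
/// Tarjan's algorithm; returns SCCs in reverse topological order of the
/// condensation DAG.
struct Tarjan<'a> {
    adj: &'a [Vec<usize>],
    index: Vec<Option<usize>>,
    low: Vec<usize>,
    on_stack: Vec<bool>,
    stack: Vec<usize>,
    next_index: usize,
    sccs: Vec<Vec<usize>>,
}

impl<'a> Tarjan<'a> {
    fn strongconnect(&mut self, v: usize) {
        self.index[v] = Some(self.next_index);
        self.low[v] = self.next_index;
        self.next_index += 1;
        self.stack.push(v);
        self.on_stack[v] = true;
        for i in 0..self.adj[v].len() {
            let w = self.adj[v][i];
            if self.index[w].is_none() {
                self.strongconnect(w);
                self.low[v] = self.low[v].min(self.low[w]);
            } else if self.on_stack[w] {
                // Back edge to a node still on the stack.
                self.low[v] = self.low[v].min(self.index[w].unwrap());
            }
        }
        if self.low[v] == self.index[v].unwrap() {
            // v is the root of an SCC: pop the component off the stack.
            let mut scc = Vec::new();
            loop {
                let w = self.stack.pop().unwrap();
                self.on_stack[w] = false;
                scc.push(w);
                if w == v { break; }
            }
            scc.sort();
            self.sccs.push(scc);
        }
    }
}

fn sccs(adj: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let n = adj.len();
    let mut t = Tarjan {
        adj,
        index: vec![None; n],
        low: vec![0; n],
        on_stack: vec![false; n],
        stack: Vec::new(),
        next_index: 0,
        sccs: Vec::new(),
    };
    for v in 0..n {
        if t.index[v].is_none() {
            t.strongconnect(v);
        }
    }
    t.sccs
}

fn main() {
    // ST graph: 0 → 1 → 2 → 0 form a cycle; 3 depends on the cycle.
    let adj = vec![vec![1], vec![2], vec![0], vec![2]];
    let mut result = sccs(&adj);
    result.sort();
    assert_eq!(result, vec![vec![0, 1, 2], vec![3]]);
}
```

In the sketch, `{0, 1, 2}` would be iterated to a fixed point (if monotone) while `{3}` refreshes once per cycle, in condensation-DAG order.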
**Key design points from existing plan:**

- Replace `check_for_cycles()` with SCC decomposition
- Replace `topological_order()` with condensation-DAG ordering
- Add `max_iterations` GUC per SCC (default 100)
- Static monotonicity analysis at `create_stream_table()` time
- Convergence guarantee for monotone-only SCCs

---

### ADR-054: NATURAL JOIN Support

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Joins |
| **Sources** | `plans/sql/GAP_SQL_OVERVIEW.md` (Gap 2.3) |
| **Effort** | Medium (6-8 hours) |

**Context:** NATURAL JOIN is currently rejected with a clear error suggesting explicit `JOIN ... ON`. PostgreSQL's raw parser does not resolve NATURAL JOIN column lists — the `quals` field is NULL, and resolution happens during analysis. The DVM parser would need catalog access to resolve common columns.

**Decision needed:** Whether to implement or continue rejecting.

**Options:**

1. **Keep rejection** — NATURAL JOIN is generally considered poor practice; explicit JOINs are clearer and less fragile to schema changes
2. **Catalog-resolved rewrite** — at parse time, query `pg_attribute` for both tables, find common column names, and synthesize an equi-join condition
3. **Query analysis pass** — use `pg_analyze_and_rewrite()` to get the resolved join quals, then extract the condition

**Recommendation:** Option 1 — rejection is appropriate. NATURAL JOIN is fragile (adding a column to either table silently changes the join condition). The error message already suggests the correct alternative.

---

### ADR-055: Remaining Aggregate Functions (Regression, Hypothetical-Set, XMLAGG)

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Aggregation |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (items A3, E5, E6) |
| **Effort** | Low-Medium (6-10 hours total) |

**Context:** 17 aggregate functions are recognized but rejected in DIFFERENTIAL mode.
All follow the proven group-rescan pattern — implementation is mechanical.

**Decision needed:** Priority and scope of remaining aggregate support.

**Aggregates to consider:**

- **Regression (11 functions):** CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY — all use group-rescan (~4-6 hours)
- **Hypothetical-set (4 functions):** RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST as aggregates — almost always used as window functions; rare as aggregates (~4-6 hours)
- **XML:** XMLAGG — very niche (~1-2 hours)

**Recommendation:** Implement regression aggregates on demand. Keep rejection for hypothetical-set and XMLAGG — extremely rare use cases.

---

### ADR-056: Mixed UNION / UNION ALL Support

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Set Operations |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (item S3) |
| **Effort** | Medium (4-6 hours) |

**Context:** Queries mixing `UNION` and `UNION ALL` in the same query are currently rejected. The DVM parser handles sequences of the same set operation but not mixed sequences.

**Decision needed:** How to handle mixed set operations.

**Options:**

1. **Keep rejection** — users rewrite to uniform set operations
2. **Per-arm dedup flag** — extend the OpTree set operation nodes with per-branch metadata indicating whether deduplication applies to each arm
3. **Parse-time rewrite** — decompose `A UNION B UNION ALL C` into `(A UNION B) UNION ALL C` by nesting set operation nodes

**Recommendation:** Option 3 — PostgreSQL's parser already produces a nested tree structure for mixed set ops; the DVM parser should respect this nesting rather than flattening.
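
The nesting can be illustrated with a minimal set-operation tree (the variant names are illustrative, not the actual OpTree variants). `UNION` and `UNION ALL` share precedence and associate left, so `A UNION B UNION ALL C` is `(A UNION B) UNION ALL C`:

```rust
/// Minimal set-operation tree, sketching how mixed UNION / UNION ALL
/// can be kept nested instead of flattened.
#[derive(Debug, PartialEq)]
enum SetOp {
    Relation(&'static str),
    Union(Box<SetOp>, Box<SetOp>),    // deduplicating
    UnionAll(Box<SetOp>, Box<SetOp>), // bag union
}

/// `A UNION B UNION ALL C` as the parser's left-associative nesting.
fn mixed_example() -> SetOp {
    SetOp::UnionAll(
        Box::new(SetOp::Union(
            Box::new(SetOp::Relation("A")),
            Box::new(SetOp::Relation("B")),
        )),
        Box::new(SetOp::Relation("C")),
    )
}

/// Render the tree back to SQL-ish text, making the nesting visible.
fn render(op: &SetOp) -> String {
    match op {
        SetOp::Relation(name) => name.to_string(),
        SetOp::Union(l, r) => format!("({} UNION {})", render(l), render(r)),
        SetOp::UnionAll(l, r) => format!("({} UNION ALL {})", render(l), render(r)),
    }
}

fn main() {
    assert_eq!(render(&mixed_example()), "((A UNION B) UNION ALL C)");
}
```

Differentiation then recurses per node: the `Union` node applies deduplication to its own arms only, and the outer `UnionAll` stays a plain bag union.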
---

### ADR-057: Multiple PARTITION BY Clauses in Window Functions

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Window Functions |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (item S4) |
| **Effort** | High (8-10 hours) |

**Context:** Queries with window functions using different `PARTITION BY` clauses are currently rejected in DIFFERENTIAL mode. The Window operator recomputes entire partitions when any row in the partition changes. Multiple partitioning schemes would require multiple recomputation passes.

**Decision needed:** How to handle queries with heterogeneous window partitions.

**Options:**

1. **Keep rejection** — users split into multiple stream tables
2. **Multi-pass recomputation** — for each distinct PARTITION BY, run a separate recomputation pass. The superset of affected partitions across all passes determines the final delta.
3. **Coarsest common partition** — find the coarsest partition common to all window specifications (the intersection of the PARTITION BY key sets) and recompute at that granularity
4. **Auto-rewrite to subqueries** — split each window function into a separate subquery with its own partitioning, then join the results

**Recommendation:** Option 2 — multi-pass is correct and bounded. Option 1 is acceptable until demand is demonstrated.

---

### ADR-058: Subquery Expressions in Complex Positions

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / Subqueries |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (items E1, E2, E3) |
| **Effort** | High (18-24 hours total for all 3) |

**Context:** Three subquery patterns are currently rejected in DIFFERENTIAL mode:

1. **Scalar subquery in WHERE** — `WHERE col > (SELECT avg(x) FROM t)` — requires value-change tracking per row
2. **SubLinks inside OR** — `WHERE EXISTS(...) OR col = 1` — requires OR-to-UNION rewrite for delta correctness
3. **ALL (subquery)** — `WHERE col > ALL(SELECT x FROM t)` — dual of ANY; anti-join with universal quantification

**Decision needed:** Priority and approach for each.

**Options:**

1. **Keep rejection with rewrite suggestions** — all three have documented workarounds (JOINs, CTEs, NOT EXISTS)
2. **Implement incrementally** — E3 (ALL subquery) is simplest (anti-join pattern); E1 (scalar in WHERE) is hardest (needs value-change tracking); E2 (OR + SubLinks) is architecturally complex (OR-to-UNION rewrite)
3. **Auto-rewrite at parse time** — transform these patterns into supported equivalents before building the OpTree

**Recommendation:** Implement E3 (ALL subquery) as it follows the existing AntiJoin pattern; defer E1 and E2 due to high complexity relative to benefit.

---

### ADR-059: ROWS FROM() with Multiple Set-Returning Functions

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | SQL Feature / LATERAL |
| **Sources** | `plans/sql/GAP_SQL_PHASE_4.md` (item S5) |
| **Effort** | Low (3-4 hours) |

**Context:** `ROWS FROM(func1(...), func2(...))` zips the output of multiple set-returning functions into a single result set. Currently rejected.

**Decision needed:** Whether to implement.

**Recommendation:** Keep rejection — extremely rare construct. Single SRF in FROM + LATERAL covers all practical use cases.

---

## Part 3: Forward-Looking ADRs — PostgreSQL Integration & Compatibility

### ADR-060: Citus Distributed Table Compatibility

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | PostgreSQL Integration |
| **Sources** | `plans/infra/PLAN_CITUS.md` |
| **Effort** | Very High (~6 months) |

**Context:** pg_trickle has zero multi-node awareness. Every core module assumes a single PostgreSQL instance with local OIDs, local WAL, local triggers, and a single background worker. Citus compatibility requires addressing 6 major incompatibilities.

**Decision needed:** Architecture for Citus support.
**Key incompatibilities:**

1. OID-based change buffer naming (OIDs not globally unique across nodes)
2. `pg_current_wal_lsn()` as change frontier (independent WAL per worker)
3. Triggers on distributed tables (DML goes to workers, bypassing coordinator triggers)
4. MERGE statement compatibility (limited Citus MERGE support)
5. Shared memory & background worker (coordinator-local)
6. System catalog & row estimates (coordinator shard is empty for distributed tables)

**Options:**

1. **Single-node only** — document incompatibility, no Citus support
2. **Reference tables only** — support Citus reference tables (triggers fire on coordinator) but not distributed tables
3. **Full Citus support** — 7-phase plan: stable naming, distributed sequence frontiers, worker-propagated triggers, INSERT ON CONFLICT instead of MERGE, coordinator-only scheduler, catalog-based locks, LISTEN/NOTIFY signaling

**Recommendation:** Option 2 as near-term (reference tables); Option 3 as a long-term roadmap item. Proceed with runtime auto-detection of Citus availability.

---

### ADR-061: Multi-Version PostgreSQL Support

| Field | Value |
|-------|-------|
| **Status** | Not Started |
| **Category** | PostgreSQL Integration |
| **Effort** | High (ongoing) |

**Context:** ADR-012 chose PostgreSQL 18 as the sole target. As PG 19 and future versions release, a strategy for multi-version support is needed.

**Decision needed:** How to support new PG versions while maintaining backward compatibility.

**Options:**

1. **Track latest only** — always target the newest PG version exclusively
2. **N-1 support** — support current and previous major version via conditional compilation (`#[cfg(feature = "pg18")]`)
3. **N-2 support** — broader compatibility at higher maintenance cost

**Recommendation:** Option 2 — support N and N-1 via pgrx's built-in conditional compilation. Drop the oldest when a new PG version releases.
--- ### ADR-062: Schema Evolution and DDL Propagation | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | PostgreSQL Integration | | **Sources** | `src/hooks.rs`, `docs/ARCHITECTURE.md` | | **Effort** | High (10-15 hours) | **Context:** The current DDL tracking (`_on_ddl_end`, `_on_sql_drop`) detects source table schema changes and marks affected stream tables for reinitialization. This is a coarse approach — any column change triggers a full reinitialization even if the changed column isn't used by the stream table. **Decision needed:** How to handle schema evolution more gracefully. **Options:** 1. **Keep full reinitialization** — correct but heavy-handed; any ALTER TABLE on a source table forces a full rebuild 2. **Column-level tracking** — only reinitialize if claimed columns are affected. The `columns_used` field in `pgt_dependencies` already tracks this; use it to filter DDL events. 3. **Transparent ALTER propagation** — when a source table gets a new column, automatically add it to the stream table if the defining query uses `SELECT *` 4. **Online schema migration** — apply schema changes to the storage table without full reinitialization using `ALTER TABLE ... ADD/DROP COLUMN` **Recommendation:** Option 2 as near-term improvement; Option 3 for `SELECT *` queries. Option 4 is complex and deferred. --- ### ADR-063: Extension Upgrade / Migration Strategy | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | PostgreSQL Integration | | **Effort** | Medium (5-8 hours) | **Context:** As the extension evolves, catalog schema changes, new operators, and behavioral changes need a migration strategy. PostgreSQL supports `ALTER EXTENSION ... UPDATE` with versioned migration SQL scripts. **Decision needed:** Versioning and migration approach. **Options:** 1. **pgrx-managed migrations** — rely on pgrx's SQL generation for each version 2. 
**Manual migration scripts** — hand-written `pg_trickle--1.0--1.1.sql` files with explicit `ALTER TABLE`, data migrations, etc. 3. **Hybrid** — pgrx for function signatures + manual scripts for catalog schema changes **Recommendation:** Option 3 — pgrx handles function registration; manual scripts handle catalog table changes, index additions, and data migrations. --- ## Part 4: Forward-Looking ADRs — Correctness & Safety ### ADR-070: TRUNCATE Capture in CDC | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | Correctness / CDC | | **Effort** | Medium (4-6 hours) | **Context:** `TRUNCATE` on a source table is not currently captured by the row-level AFTER trigger (PostgreSQL does not fire row-level triggers on TRUNCATE). If a source table is truncated, the stream table becomes stale with no automatic mechanism to detect or recover. **Decision needed:** How to detect and handle TRUNCATE on tracked tables. **Options:** 1. **DDL event trigger** — not viable: PostgreSQL event triggers fire only for DDL commands, and TRUNCATE is not among them 2. **TRUNCATE trigger** — PostgreSQL supports `BEFORE/AFTER TRUNCATE` triggers (statement-level only); fire a function that marks affected stream tables for reinitialization 3. **Row-count verification** — before each refresh, verify that the source table's row count hasn't unexpectedly dropped to 0 4. **Replication-based detection** — WAL-mode CDC naturally captures TRUNCATE via logical decoding messages **Recommendation:** Option 2 — an `AFTER TRUNCATE` statement-level trigger is the most direct and reliable solution for trigger-mode CDC. Option 4 handles WAL mode automatically. --- ### ADR-071: Type Coercion and Implicit Cast Handling | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | Correctness / Expressions | | **Effort** | Medium (4-6 hours) | **Context:** PostgreSQL performs implicit type coercions in many contexts (comparisons, function arguments, INSERT targets).

The DVM parser handles explicit `CAST(x AS type)` and `x::type` but may not preserve implicit coercions that PostgreSQL's analyzer adds. This could lead to type mismatches in generated delta SQL. **Decision needed:** Whether to re-analyze delta SQL or preserve coercions from the parse tree. **Options:** 1. **Rely on PostgreSQL's implicit coercion** in generated SQL — trust that the database engine will apply the same coercions when executing delta SQL 2. **Explicit coercion insertion** — when generating delta SQL, add explicit casts where the source query has implicit coercions 3. **Use analyzed (post-rewrite) parse tree** — parse with `pg_analyze_and_rewrite()` instead of `raw_parser()` to get a fully resolved tree **Recommendation:** Option 1 for now — PostgreSQL's implicit coercion in generated delta SQL matches the defining query's behavior. Monitor for edge cases and switch to Option 3 if type mismatches surface. --- ### ADR-072: Row Identity for Keyless Tables | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | Correctness / Storage | | **Effort** | Medium (6-8 hours) | **Context:** The current `__pgt_row_id` is computed from the primary key of source tables. For defining queries that involve aggregations, expressions, or joins, the row ID is derived from GROUP BY keys, join keys, or synthetic identifiers. But what if a source table has no primary key? **Decision needed:** How to identify rows when source tables lack primary keys. **Options:** 1. **Require primary keys** — reject `create_stream_table()` if any source table lacks a PK. Simple but restrictive. 2. **Use `ctid`** — PostgreSQL's physical row ID. Not stable across updates or `VACUUM FULL`, but usable within a single refresh window. 3. **Use all columns** — hash all column values to generate a row ID. Works, but exact-duplicate rows collapse to the same ID. 4. **Require REPLICA IDENTITY FULL** — for WAL mode, this provides all column values in the change record.
For trigger mode, the trigger already captures `to_jsonb(NEW)`. **Recommendation:** Option 1 as default (require PK). Support Option 3 via opt-in for tables where duplicates are acceptable or impossible. --- ### ADR-073: Consistent Snapshot Isolation for Multi-Source Refreshes | Field | Value | |-------|-------| | **Status** | Not Started | | **Category** | Correctness / Refresh Engine | | **Effort** | High (8-12 hours) | **Context:** When a stream table references multiple source tables, the delta query reads changes from each source's buffer. These changes may represent different transaction visibility windows. The current frontier system uses LSN ranges per source, but concurrent transactions may cause subtle inconsistencies if changes from one source are captured at a different snapshot boundary than another. **Decision needed:** Whether and how to enforce cross-source snapshot consistency during refresh. **Options:** 1. **Accept eventual consistency** — each source is independently tracked by LSN; minor transient inconsistencies self-correct on next refresh 2. **Transaction-ID-based windows** — use `xid` ranges instead of LSN ranges to ensure only committed transactions within the same window are processed 3. **Serializable refresh transactions** — run the delta query in a SERIALIZABLE transaction to enforce a consistent view **Recommendation:** Option 1 — the current frontier approach provides "eventual consistency within one refresh cycle" which is acceptable for the DVS (Delayed View Semantics) guarantee. 
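The eventual-consistency model of Option 1 can be sketched in miniature. All names and types below are assumptions for illustration; pg_trickle's real frontier bookkeeping lives in its catalog, not in process memory:

```rust
use std::collections::HashMap;

/// Per-source change window: changes with LSN in [applied, captured) are
/// pending. Each source's window is tracked independently (Option 1).
#[derive(Debug, Clone, Copy)]
struct Frontier {
    applied: u64,  // everything below this LSN is already in the stream table
    captured: u64, // everything below this LSN is sitting in the change buffer
}

/// One refresh cycle: consume every source's pending window in one step.
/// The windows were captured at independent moments, so a transaction that
/// touched two sources may straddle a cycle boundary; the resulting
/// inconsistency is transient and self-corrects on the next refresh.
/// Returns whether any source had pending changes.
fn refresh(frontiers: &mut HashMap<&'static str, Frontier>) -> bool {
    let had_changes = frontiers.values().any(|f| f.captured > f.applied);
    for f in frontiers.values_mut() {
        f.applied = f.captured;
    }
    had_changes
}

fn main() {
    let mut fs = HashMap::from([
        ("orders", Frontier { applied: 100, captured: 180 }),
        ("customers", Frontier { applied: 400, captured: 400 }),
    ]);
    assert!(refresh(&mut fs)); // "orders" had a pending window
    assert!(!refresh(&mut fs)); // second cycle: nothing pending
}
```

The key property the sketch shows is that every refresh leaves each source with `applied == captured`, so any cross-source skew lasts at most one cycle, which is exactly the DVS guarantee described above.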
--- ## Priority Order ### Tier 1 — High Priority (write first) | Priority | ADR | Rationale | |----------|-----|-----------| | 1 | ADR-003 | Core IVM engine — the heart of the extension | | 2 | ADR-001 | Foundational CDC decision | | 3 | ADR-002 | Hybrid CDC — major architectural evolution | | 4 | ADR-010 | SQL functions vs DDL — shapes user experience | | 5 | ADR-004 | xxHash row IDs — storage and correctness | | 6 | ADR-005 | Change buffer design — CDC pipeline foundation | | 7 | ADR-050 | Non-deterministic functions — open correctness gap | | 8 | ADR-070 | TRUNCATE capture — open correctness gap | ### Tier 2 — Medium Priority | Priority | ADR | Rationale | |----------|-----|-----------| | 9 | ADR-020 | Canonical scheduling — non-obvious design choice | | 10 | ADR-023 | Adaptive fallback — performance characteristics | | 11 | ADR-006 | User triggers — real-world usability | | 12 | ADR-007 | Recursive CTE strategy — non-trivial IVM decision | | 13 | ADR-008 | Group-rescan strategy — foundational aggregate pattern | | 14 | ADR-040 | Aggregate counters — performance detail | | 15 | ADR-053 | Circular references — major DAG architecture decision | | 16 | ADR-062 | Schema evolution — operational concern | ### Tier 3 — Lower Priority | Priority | ADR | Rationale | |----------|-----|-----------| | 17 | ADR-012 | PG 18 only — scoping decision | | 18 | ADR-021 | Single scheduler — straightforward | | 19 | ADR-022 | Replication origin — safety mechanism | | 20 | ADR-030 | dbt macro — ecosystem decision | | 21 | ADR-041 | LATERAL diff — specialized IVM detail | | 22 | ADR-051 | GROUPING SETS — structural enhancement | | 23 | ADR-052 | DISTINCT ON — auto-rewrite | | 24 | ADR-060 | Citus — long-term infrastructure | ### Tier 4 — Document When Relevant | Priority | ADR | Rationale | |----------|-----|-----------| | 25 | ADR-011 | Naming — historical | | 26 | ADR-031 | In-repo dbt — minor | | 27 | ADR-032 | Testcontainers — testing infra | | 28 | ADR-055 | Remaining aggregates 
— incremental | | 29 | ADR-056 | Mixed UNION — edge case | | 30 | ADR-057 | Multiple PARTITION BY — edge case | | 31 | ADR-058 | Complex subquery positions — edge case | | 32 | ADR-059 | ROWS FROM — very niche | | 33 | ADR-061 | Multi-PG-version — ongoing | | 34 | ADR-063 | Extension upgrades — operational | | 35 | ADR-071 | Type coercion — monitor for issues | | 36 | ADR-072 | Keyless tables — restrictive edge | | 37 | ADR-073 | Snapshot isolation — theoretical | | 38 | ADR-054 | NATURAL JOIN — keep rejection | --- ## Effort Estimate | Batch | ADRs | Scope | Estimated Effort | |-------|------|-------|------------------| | Batch 1 — Core (Accepted) | ADR-001 through 008, 010 | Document past decisions | ~5 hours | | Batch 2 — Runtime (Accepted) | ADR-011, 012, 020-023 | Document past decisions | ~3 hours | | Batch 3 — Ecosystem (Accepted) | ADR-030, 031, 032, 040, 041 | Document past decisions | ~2.5 hours | | Batch 4 — SQL Features (New) | ADR-050 through 059 | Propose new decisions | ~4 hours | | Batch 5 — PG Integration (New) | ADR-060 through 063 | Propose new decisions | ~2 hours | | Batch 6 — Correctness (New) | ADR-070 through 073 | Propose new decisions | ~2 hours | | **Total** | **38 ADRs** | 20 accepted + 18 forward-looking | **~18.5 hours** | --- ## File Naming Convention ``` plans/adrs/ ├── PLAN_ADRS.md ← this file │ │ ── Core Architecture (001-009) ── ├── adr-001-trigger-based-cdc.md ├── adr-002-hybrid-cdc.md ├── adr-003-dvm-operator-tree.md ├── adr-004-xxhash-row-ids.md ├── adr-005-per-table-change-buffers.md ├── adr-006-explicit-dml-user-triggers.md ├── adr-007-semi-naive-recursive-cte.md ├── adr-008-group-rescan-aggregates.md │ │ ── API & Schema Design (010-019) ── ├── adr-010-sql-functions-not-ddl.md ├── adr-011-pgtrickle-schema-naming.md ├── adr-012-postgresql-18-only.md │ │ ── Scheduling & Runtime (020-029) ── ├── adr-020-canonical-scheduling-periods.md ├── adr-021-single-background-worker.md ├── adr-022-replication-origin-feedback-prevention.md
├── adr-023-adaptive-full-refresh-fallback.md │ │ ── Tooling & Ecosystem (030-039) ── ├── adr-030-dbt-macro-package.md ├── adr-031-dbt-in-repo-subdirectory.md ├── adr-032-testcontainers-testing.md │ │ ── Performance & Optimization (040-049) ── ├── adr-040-aggregate-auxiliary-counters.md ├── adr-041-lateral-row-scoped-recomputation.md │ │ ── SQL Feature Coverage (050-059) ── ├── adr-050-non-deterministic-function-handling.md ├── adr-051-grouping-sets-implementation.md ├── adr-052-distinct-on-rewrite.md ├── adr-053-circular-references-scc.md ├── adr-054-natural-join-rejection.md ├── adr-055-remaining-aggregates.md ├── adr-056-mixed-union-support.md ├── adr-057-multiple-partition-by.md ├── adr-058-complex-subquery-positions.md ├── adr-059-rows-from-multi-srf.md │ │ ── PostgreSQL Integration (060-069) ── ├── adr-060-citus-compatibility.md ├── adr-061-multi-version-pg-support.md ├── adr-062-schema-evolution-ddl.md ├── adr-063-extension-upgrade-migration.md │ │ ── Correctness & Safety (070-079) ── ├── adr-070-truncate-capture.md ├── adr-071-type-coercion-handling.md ├── adr-072-keyless-table-row-identity.md └── adr-073-snapshot-isolation-multi-source.md ```