# Plan: FAQ Expansion

Date: 2026-03-03
Status: IMPLEMENTED

---

## Overview

This plan proposed expanding the FAQ (`docs/FAQ.md`) with additional questions
that users of pg_trickle are likely to ask. The original FAQ covered 63 questions
across 13 sections. After implementation, the FAQ now contains **158 questions**
across **24 sections** (~2066 lines).

**All 94 proposed questions have been implemented** (78 net-new + 16 overlapping
entries expanded/cross-linked within existing sections).

### Priority Key

| Priority | Meaning |
|----------|---------|
| **P1** | Must-have — core concepts every user needs |
| **P2** | Should-have — common operational questions |
| **P3** | Nice-to-have — advanced topics and edge cases |

### Cross-Reference: Existing FAQ Sections

The current FAQ already covers these sections (do NOT duplicate):

- General (5 questions)
- Installation & Setup (4 questions)
- Creating & Managing Stream Tables (10 questions)
- SQL Support (4 questions)
- Change Data Capture (7 questions)
- Performance & Tuning (5 questions)
- Interoperability (8 questions)
- Monitoring & Alerting (3 questions)
- Configuration Reference (GUC table)
- Troubleshooting (6 questions)
- Why Are These SQL Features Not Supported? (10 questions)
- Why Are These Stream Table Operations Restricted? (8 questions)

---

## Proposed New Questions

### 1. Getting Started / Conceptual

Questions that help users build an accurate mental model of how pg_trickle
works before they write any SQL.

| ID | Priority | Question |
|----|----------|----------|
| GS-01 | P1 | What is incremental view maintenance (IVM) and why does it matter? |
| GS-02 | P1 | What is the difference between a stream table and a regular materialized view, in practice? |
| GS-03 | P1 | What happens behind the scenes when I INSERT a row into a table tracked by a stream table? |
| GS-04 | P1 | What does "differential" mean in the context of pg_trickle? |
| GS-05 | P2 | What is a frontier, and why does pg_trickle track LSNs? |
| GS-06 | P2 | What is the `__pgt_row_id` column and why does it appear in my stream tables? |
| GS-07 | P2 | What is the auto-rewrite pipeline and how does it affect my queries? |
| GS-08 | P3 | How does pg_trickle compare to DBSP (the academic framework)? |
| GS-09 | P3 | How does pg_trickle compare to pg_ivm? |

### 2. Data Freshness & Consistency

The #1 conceptual hurdle for users coming from synchronous materialized views.

| ID | Priority | Question |
|----|----------|----------|
| DC-01 | P1 | How stale can a stream table be? |
| DC-02 | P1 | Can I read my own writes immediately after an INSERT? |
| DC-03 | P1 | What consistency guarantees does pg_trickle provide? |
| DC-04 | P2 | What are "Delayed View Semantics" (DVS)? |
| DC-05 | P2 | What happens if the scheduler is behind — does data get lost? |
| DC-06 | P3 | How does pg_trickle ensure deltas are applied in the right order across cascading stream tables? |

### 3. IMMEDIATE Mode (Transactional IVM)

This is a major v0.2.0 feature — users switching from pg_ivm need detailed
guidance.

| ID | Priority | Question |
|----|----------|----------|
| IM-01 | P1 | When should I use IMMEDIATE mode instead of DIFFERENTIAL? |
| IM-02 | P1 | What SQL features are NOT supported in IMMEDIATE mode? |
| IM-03 | P1 | What happens when I TRUNCATE a source table in IMMEDIATE mode? |
| IM-04 | P2 | Can I have cascading IMMEDIATE stream tables (ST A → ST B)? |
| IM-05 | P2 | What locking does IMMEDIATE mode use? |
| IM-06 | P2 | How do I switch an existing DIFFERENTIAL stream table to IMMEDIATE? |
| IM-07 | P2 | What happens to IMMEDIATE mode during a manual `refresh_stream_table()` call? |
| IM-08 | P3 | How much write-side overhead does IMMEDIATE mode add? |

### 4. CDC — Triggers vs. WAL

Users need to understand the trade-offs behind the hybrid CDC model.

| ID | Priority | Question |
|----|----------|----------|
| CDC-01 | P1 | Why does pg_trickle default to triggers instead of logical replication? |
| CDC-02 | P1 | What is the write-side overhead of CDC triggers? |
| CDC-03 | P2 | How does the trigger-to-WAL automatic transition work? |
| CDC-04 | P2 | What happens to CDC if I restore a database backup? |
| CDC-05 | P2 | Do CDC triggers fire for rows inserted via logical replication (subscribers)? |
| CDC-06 | P3 | Can I inspect the change buffer tables directly? |
| CDC-07 | P3 | How does pg_trickle prevent its own refresh writes from re-triggering CDC? |

### 5. Aggregates & Group-By

Aggregate handling is complex and a common source of user confusion.

| ID | Priority | Question |
|----|----------|----------|
| AG-01 | P1 | Which aggregates are fully incremental (O(1) per change) vs. group-rescan? |
| AG-02 | P1 | Why do some aggregates have hidden auxiliary columns (`__pgt_count`, `__pgt_sum`)? |
| AG-03 | P2 | How does HAVING work with incremental refresh? |
| AG-04 | P2 | What happens to a group when all its rows are deleted? |
| AG-05 | P3 | Why are `CORR`, `COVAR_*`, and `REGR_*` limited to FULL mode? |

### 6. Joins

Join delta semantics can be surprising.

| ID | Priority | Question |
|----|----------|----------|
| JN-01 | P1 | How does a DIFFERENTIAL refresh handle a join when both sides changed? |
| JN-02 | P2 | Does pg_trickle support FULL OUTER JOIN incrementally? |
| JN-03 | P2 | What happens when a join key is updated and the joined row is simultaneously deleted? |
| JN-04 | P3 | Why is NATURAL JOIN rejected? |

### 7. CTEs & Recursive Queries

Recursive CTE support is a differentiator — document it well.

| ID | Priority | Question |
|----|----------|----------|
| CTE-01 | P1 | Do recursive CTEs work in DIFFERENTIAL mode? |
| CTE-02 | P2 | What are the three strategies for recursive CTE maintenance (semi-naive / DRed / recomputation)? |
| CTE-03 | P2 | What triggers a fallback from semi-naive to recomputation? |
| CTE-04 | P3 | What happens when a CTE is referenced multiple times in the same query? |

### 8. Window Functions & LATERAL

| ID | Priority | Question |
|----|----------|----------|
| WL-01 | P2 | How are window functions maintained incrementally? |
| WL-02 | P2 | Why can't I use a window function inside a CASE or COALESCE expression? |
| WL-03 | P2 | What LATERAL constructs are supported (SRFs, subqueries, JSON_TABLE)? |
| WL-04 | P3 | What happens when a row moves between window partitions during a refresh? |

### 9. TopK (ORDER BY … LIMIT)

| ID | Priority | Question |
|----|----------|----------|
| TK-01 | P1 | How does `ORDER BY … LIMIT N` work in a stream table? |
| TK-02 | P2 | Why is OFFSET not supported with TopK? |
| TK-03 | P2 | What happens when a row below the top-N cutoff rises above it? |
| TK-04 | P3 | Can I use TopK with aggregates or joins? |

### 10. Tables Without Primary Keys (Keyless Sources)

| ID | Priority | Question |
|----|----------|----------|
| KL-01 | P1 | Do source tables need a primary key? |
| KL-02 | P2 | What are the risks of using tables without primary keys? |
| KL-03 | P3 | How does content-based row identity work for duplicate rows? |

### 11. Diamond Dependencies & DAG Scheduling

| ID | Priority | Question |
|----|----------|----------|
| DD-01 | P2 | What is a diamond dependency and why does it matter? |
| DD-02 | P2 | What does `diamond_consistency = 'atomic'` do? |
| DD-03 | P2 | What is the difference between `'fastest'` and `'slowest'` schedule policy? |
| DD-04 | P3 | What happens when an atomic diamond group partially fails? |
| DD-05 | P3 | How does pg_trickle determine topological refresh order? |

### 12. Schema Changes & DDL Events

| ID | Priority | Question |
|----|----------|----------|
| SC-01 | P1 | What happens when I add a column to a source table? |
| SC-02 | P1 | What happens when I drop a column used in a stream table's query? |
| SC-03 | P2 | What happens when I `CREATE OR REPLACE` a view used by a stream table? |
| SC-04 | P2 | What happens when I alter or drop a function used in a stream table's query? |
| SC-05 | P2 | What is reinitialize and when does it trigger? |
| SC-06 | P3 | Can I block DDL on tracked source tables? |

### 13. Performance & Sizing

| ID | Priority | Question |
|----|----------|----------|
| PF-01 | P1 | How much disk space do change buffer tables consume? |
| PF-02 | P1 | What determines whether DIFFERENTIAL or FULL is faster for a given workload? |
| PF-03 | P2 | What are the planner hints and when should I disable them? |
| PF-04 | P2 | How do prepared statements help refresh performance? |
| PF-05 | P2 | How does the adaptive FULL fallback threshold work in practice? |
| PF-06 | P3 | How many stream tables can a single PostgreSQL instance handle? |
| PF-07 | P3 | What is the TRUNCATE vs DELETE cleanup trade-off for change buffers? |

### 14. dbt Integration

| ID | Priority | Question |
|----|----------|----------|
| DBT-01 | P1 | How do I use pg_trickle with dbt? |
| DBT-02 | P1 | What dbt commands work with stream tables? |
| DBT-03 | P2 | How does `dbt run --full-refresh` work with stream tables? |
| DBT-04 | P2 | How do I check stream table freshness in dbt? |
| DBT-05 | P2 | What happens when the defining query changes in dbt? |
| DBT-06 | P3 | Can I use `dbt snapshot` with stream tables? |
| DBT-07 | P3 | What dbt versions are supported? |

### 15. Deployment & Operations

| ID | Priority | Question |
|----|----------|----------|
| OP-01 | P1 | How many background workers does pg_trickle use? |
| OP-02 | P1 | Does pg_trickle work with connection poolers (PgBouncer, pgpool)? |
| OP-03 | P2 | How do I upgrade pg_trickle to a new version? |
| OP-04 | P2 | What happens to stream tables during a PostgreSQL restart? |
| OP-05 | P2 | Can I use pg_trickle on a read replica / standby? |
| OP-06 | P2 | How does pg_trickle work with CloudNativePG / Kubernetes? |
| OP-07 | P3 | Does pg_trickle work with partitioned source tables? |
| OP-08 | P3 | Can I run pg_trickle in multiple databases on the same cluster? |

### 16. Error Recovery & Debugging

| ID | Priority | Question |
|----|----------|----------|
| ER-01 | P1 | What happens when a refresh fails repeatedly? |
| ER-02 | P1 | How do I resume a suspended stream table? |
| ER-03 | P2 | How do I see the delta SQL that pg_trickle generates for my query? |
| ER-04 | P2 | How do I interpret the refresh history? |
| ER-05 | P2 | How can I tell if the scheduler is running? |
| ER-06 | P3 | How do I debug a stream table that shows stale data? |
| ER-07 | P3 | What does the `needs_reinit` flag mean and how do I clear it? |

---

## Summary

| Category | Count | P1 | P2 | P3 | Status |
|----------|------:|---:|---:|---:|--------|
| Getting Started / Conceptual | 9 | 4 | 3 | 2 | ✅ Implemented |
| Data Freshness & Consistency | 6 | 3 | 2 | 1 | ✅ Implemented |
| IMMEDIATE Mode | 8 | 3 | 4 | 1 | ✅ Implemented |
| CDC — Triggers vs. WAL | 7 | 2 | 3 | 2 | ✅ Implemented |
| Aggregates & Group-By | 5 | 2 | 2 | 1 | ✅ Implemented |
| Joins | 4 | 1 | 2 | 1 | ✅ Implemented |
| CTEs & Recursive Queries | 4 | 1 | 2 | 1 | ✅ Implemented |
| Window Functions & LATERAL | 4 | 0 | 3 | 1 | ✅ Implemented |
| TopK (ORDER BY … LIMIT) | 4 | 1 | 2 | 1 | ✅ Implemented |
| Tables Without Primary Keys | 3 | 1 | 1 | 1 | ✅ Implemented |
| Diamond Dependencies & DAG | 5 | 0 | 3 | 2 | ✅ Implemented |
| Schema Changes & DDL Events | 6 | 2 | 3 | 1 | ✅ Implemented |
| Performance & Sizing | 7 | 2 | 3 | 2 | ✅ Implemented |
| dbt Integration | 7 | 2 | 3 | 2 | ✅ Implemented |
| Deployment & Operations | 8 | 2 | 4 | 2 | ✅ Implemented |
| Error Recovery & Debugging | 7 | 2 | 3 | 2 | ✅ Implemented |
| **Total** | **94** | **28** | **43** | **23** | **All done** |

---

## Overlap Analysis

Some proposed questions overlap with existing FAQ entries. These should either
be merged into the existing section or cross-linked rather than duplicated:

| Proposed ID | Overlaps with existing FAQ entry | Recommendation |
|-------------|----------------------------------|----------------|
| GS-06 | "What is `__pgt_row_id`?" (Performance & Tuning) | Expand existing |
| CDC-02 | "What is the overhead of CDC triggers?" (CDC section) | Expand existing |
| AG-05 | "Why are unsupported aggregates limited to FULL mode?" (SQL Features section) | Expand existing |
| JN-04 | "How does `NATURAL JOIN` work?" (SQL Features section) | Cross-link |
| ER-01 | "What happens when a stream table keeps failing?" (Monitoring) | Expand existing |
| ER-02 | "My stream table is stuck in INITIALIZING status" (Troubleshooting) | Cross-link |
| OP-02 | "Does pg_trickle work with PgBouncer?" (Interoperability) | Expand existing |
| DC-01 | Partially in "What schedule formats are supported?" | New section |
| IM-01 | Partially in "When should I use FULL vs. DIFFERENTIAL vs. IMMEDIATE?" | Expand existing |

After deduplication, **~78 net-new questions** remain.

---

## Implementation Notes

1. **Group placement** — New questions should be organized into clear sections
   in the FAQ. Some existing sections should be split or renamed:
   - Split "SQL Support" into per-feature subsections (Aggregates, Joins, CTEs,
     Windows, TopK) for discoverability.
   - Add a new "Data Freshness & Consistency" top-level section.
   - Add a new "IMMEDIATE Mode" section.
   - Add a new "dbt Integration" section.

2. **Answer depth** — P1 questions should have complete, self-contained answers
   with code examples. P2 questions can reference other docs. P3 questions
   can be brief with a link to the relevant architecture/plan document.

3. **Cross-links** — Every answer should link to the relevant section of
   SQL_REFERENCE.md, ARCHITECTURE.md, or CONFIGURATION.md for deeper reading.

4. **Code examples** — Prefer runnable SQL snippets. Use the employee/department
   schema from GETTING_STARTED.md for consistency.

5. **Ordering** — Within each section, P1 questions come first, then P2, then P3.