# Plan: FAQ Expansion Date: 2026-03-03 Status: IMPLEMENTED --- ## Overview This plan proposed expanding the FAQ (`docs/FAQ.md`) with additional questions that users of pg_trickle are likely to ask. The original FAQ covered 63 questions across 13 sections. After implementation, the FAQ now contains **158 questions** across **24 sections** (~2066 lines). **All 94 proposed questions have been implemented** (78 net-new + 16 overlapping entries expanded/cross-linked within existing sections). ### Priority Key | Priority | Meaning | |----------|---------| | **P1** | Must-have — core concepts every user needs | | **P2** | Should-have — common operational questions | | **P3** | Nice-to-have — advanced topics and edge cases | ### Cross-Reference: Existing FAQ Sections The current FAQ already covers these sections (do NOT duplicate): - General (5 questions) - Installation & Setup (4 questions) - Creating & Managing Stream Tables (10 questions) - SQL Support (4 questions) - Change Data Capture (7 questions) - Performance & Tuning (5 questions) - Interoperability (8 questions) - Monitoring & Alerting (3 questions) - Configuration Reference (GUC table) - Troubleshooting (6 questions) - Why Are These SQL Features Not Supported? (10 questions) - Why Are These Stream Table Operations Restricted? (8 questions) --- ## Proposed New Questions ### 1. Getting Started / Conceptual Questions that help users build an accurate mental model of how pg_trickle works before they write any SQL. | ID | Priority | Question | |----|----------|----------| | GS-01 | P1 | What is incremental view maintenance (IVM) and why does it matter? | | GS-02 | P1 | What is the difference between a stream table and a regular materialized view, in practice? | | GS-03 | P1 | What happens behind the scenes when I INSERT a row into a table tracked by a stream table? | | GS-04 | P1 | What does "differential" mean in the context of pg_trickle? | | GS-05 | P2 | What is a frontier, and why does pg_trickle track LSNs? | | GS-06 | P2 | What is the `__pgt_row_id` column and why does it appear in my stream tables? | | GS-07 | P2 | What is the auto-rewrite pipeline and how does it affect my queries? | | GS-08 | P3 | How does pg_trickle compare to DBSP (the academic framework)? | | GS-09 | P3 | How does pg_trickle compare to pg_ivm? | ### 2. Data Freshness & Consistency The #1 conceptual hurdle for users coming from synchronous materialized views. | ID | Priority | Question | |----|----------|----------| | DC-01 | P1 | How stale can a stream table be? | | DC-02 | P1 | Can I read my own writes immediately after an INSERT? | | DC-03 | P1 | What consistency guarantees does pg_trickle provide? | | DC-04 | P2 | What are "Delayed View Semantics" (DVS)? | | DC-05 | P2 | What happens if the scheduler is behind — does data get lost? | | DC-06 | P3 | How does pg_trickle ensure deltas are applied in the right order across cascading stream tables? | ### 3. IMMEDIATE Mode (Transactional IVM) This is a major v0.2.0 feature — users switching from pg_ivm need detailed guidance. | ID | Priority | Question | |----|----------|----------| | IM-01 | P1 | When should I use IMMEDIATE mode instead of DIFFERENTIAL? | | IM-02 | P1 | What SQL features are NOT supported in IMMEDIATE mode? | | IM-03 | P1 | What happens when I TRUNCATE a source table in IMMEDIATE mode? | | IM-04 | P2 | Can I have cascading IMMEDIATE stream tables (ST A → ST B)? | | IM-05 | P2 | What locking does IMMEDIATE mode use? | | IM-06 | P2 | How do I switch an existing DIFFERENTIAL stream table to IMMEDIATE? | | IM-07 | P2 | What happens to IMMEDIATE mode during a manual `refresh_stream_table()` call? | | IM-08 | P3 | How much write-side overhead does IMMEDIATE mode add? | ### 4. CDC — Triggers vs. WAL Users need to understand the trade-offs behind the hybrid CDC model. | ID | Priority | Question | |----|----------|----------| | CDC-01 | P1 | Why does pg_trickle default to triggers instead of logical replication? | | CDC-02 | P1 | What is the write-side overhead of CDC triggers? | | CDC-03 | P2 | How does the trigger-to-WAL automatic transition work? | | CDC-04 | P2 | What happens to CDC if I restore a database backup? | | CDC-05 | P2 | Do CDC triggers fire for rows inserted via logical replication (subscribers)? | | CDC-06 | P3 | Can I inspect the change buffer tables directly? | | CDC-07 | P3 | How does pg_trickle prevent its own refresh writes from re-triggering CDC? | ### 5. Aggregates & Group-By Aggregate handling is complex and a common source of user confusion. | ID | Priority | Question | |----|----------|----------| | AG-01 | P1 | Which aggregates are fully incremental (O(1) per change) vs. group-rescan? | | AG-02 | P1 | Why do some aggregates have hidden auxiliary columns (`__pgt_count`, `__pgt_sum`)? | | AG-03 | P2 | How does HAVING work with incremental refresh? | | AG-04 | P2 | What happens to a group when all its rows are deleted? | | AG-05 | P3 | Why are `CORR`, `COVAR_*`, and `REGR_*` limited to FULL mode? | ### 6. Joins Join delta semantics can be surprising. | ID | Priority | Question | |----|----------|----------| | JN-01 | P1 | How does a DIFFERENTIAL refresh handle a join when both sides changed? | | JN-02 | P2 | Does pg_trickle support FULL OUTER JOIN incrementally? | | JN-03 | P2 | What happens when a join key is updated and the joined row is simultaneously deleted? | | JN-04 | P3 | Why is NATURAL JOIN rejected? | ### 7. CTEs & Recursive Queries Recursive CTE support is a differentiator — document it well. | ID | Priority | Question | |----|----------|----------| | CTE-01 | P1 | Do recursive CTEs work in DIFFERENTIAL mode? | | CTE-02 | P2 | What are the three strategies for recursive CTE maintenance (semi-naive / DRed / recomputation)? | | CTE-03 | P2 | What triggers a fallback from semi-naive to recomputation? | | CTE-04 | P3 | What happens when a CTE is referenced multiple times in the same query? | ### 8. Window Functions & LATERAL | ID | Priority | Question | |----|----------|----------| | WL-01 | P2 | How are window functions maintained incrementally? | | WL-02 | P2 | Why can't I use a window function inside a CASE or COALESCE expression? | | WL-03 | P2 | What LATERAL constructs are supported (SRFs, subqueries, JSON_TABLE)? | | WL-04 | P3 | What happens when a row moves between window partitions during a refresh? | ### 9. TopK (ORDER BY … LIMIT) | ID | Priority | Question | |----|----------|----------| | TK-01 | P1 | How does `ORDER BY … LIMIT N` work in a stream table? | | TK-02 | P2 | Why is OFFSET not supported with TopK? | | TK-03 | P2 | What happens when a row below the top-N cutoff rises above it? | | TK-04 | P3 | Can I use TopK with aggregates or joins? | ### 10. Tables Without Primary Keys (Keyless Sources) | ID | Priority | Question | |----|----------|----------| | KL-01 | P1 | Do source tables need a primary key? | | KL-02 | P2 | What are the risks of using tables without primary keys? | | KL-03 | P3 | How does content-based row identity work for duplicate rows? | ### 11. Diamond Dependencies & DAG Scheduling | ID | Priority | Question | |----|----------|----------| | DD-01 | P2 | What is a diamond dependency and why does it matter? | | DD-02 | P2 | What does `diamond_consistency = 'atomic'` do? | | DD-03 | P2 | What is the difference between `'fastest'` and `'slowest'` schedule policy? | | DD-04 | P3 | What happens when an atomic diamond group partially fails? | | DD-05 | P3 | How does pg_trickle determine topological refresh order? | ### 12. Schema Changes & DDL Events | ID | Priority | Question | |----|----------|----------| | SC-01 | P1 | What happens when I add a column to a source table? | | SC-02 | P1 | What happens when I drop a column used in a stream table's query? | | SC-03 | P2 | What happens when I `CREATE OR REPLACE` a view used by a stream table? | | SC-04 | P2 | What happens when I alter or drop a function used in a stream table's query? | | SC-05 | P2 | What is reinitialize and when does it trigger? | | SC-06 | P3 | Can I block DDL on tracked source tables? | ### 13. Performance & Sizing | ID | Priority | Question | |----|----------|----------| | PF-01 | P1 | How much disk space do change buffer tables consume? | | PF-02 | P1 | What determines whether DIFFERENTIAL or FULL is faster for a given workload? | | PF-03 | P2 | What are the planner hints and when should I disable them? | | PF-04 | P2 | How do prepared statements help refresh performance? | | PF-05 | P2 | How does the adaptive FULL fallback threshold work in practice? | | PF-06 | P3 | How many stream tables can a single PostgreSQL instance handle? | | PF-07 | P3 | What is the TRUNCATE vs DELETE cleanup trade-off for change buffers? | ### 14. dbt Integration | ID | Priority | Question | |----|----------|----------| | DBT-01 | P1 | How do I use pg_trickle with dbt? | | DBT-02 | P1 | What dbt commands work with stream tables? | | DBT-03 | P2 | How does `dbt run --full-refresh` work with stream tables? | | DBT-04 | P2 | How do I check stream table freshness in dbt? | | DBT-05 | P2 | What happens when the defining query changes in dbt? | | DBT-06 | P3 | Can I use `dbt snapshot` with stream tables? | | DBT-07 | P3 | What dbt versions are supported? | ### 15. Deployment & Operations | ID | Priority | Question | |----|----------|----------| | OP-01 | P1 | How many background workers does pg_trickle use? | | OP-02 | P1 | Does pg_trickle work with connection poolers (PgBouncer, pgpool)? | | OP-03 | P2 | How do I upgrade pg_trickle to a new version? | | OP-04 | P2 | What happens to stream tables during a PostgreSQL restart? | | OP-05 | P2 | Can I use pg_trickle on a read replica / standby? | | OP-06 | P2 | How does pg_trickle work with CloudNativePG / Kubernetes? | | OP-07 | P3 | Does pg_trickle work with partitioned source tables? | | OP-08 | P3 | Can I run pg_trickle in multiple databases on the same cluster? | ### 16. Error Recovery & Debugging | ID | Priority | Question | |----|----------|----------| | ER-01 | P1 | What happens when a refresh fails repeatedly? | | ER-02 | P1 | How do I resume a suspended stream table? | | ER-03 | P2 | How do I see the delta SQL that pg_trickle generates for my query? | | ER-04 | P2 | How do I interpret the refresh history? | | ER-05 | P2 | How can I tell if the scheduler is running? | | ER-06 | P3 | How do I debug a stream table that shows stale data? | | ER-07 | P3 | What does the `needs_reinit` flag mean and how do I clear it? | --- ## Summary | Category | Count | P1 | P2 | P3 | Status | |----------|------:|---:|---:|---:|--------| | Getting Started / Conceptual | 9 | 4 | 3 | 2 | ✅ Implemented | | Data Freshness & Consistency | 6 | 3 | 2 | 1 | ✅ Implemented | | IMMEDIATE Mode | 8 | 3 | 4 | 1 | ✅ Implemented | | CDC — Triggers vs. WAL | 7 | 2 | 3 | 2 | ✅ Implemented | | Aggregates & Group-By | 5 | 2 | 2 | 1 | ✅ Implemented | | Joins | 4 | 1 | 2 | 1 | ✅ Implemented | | CTEs & Recursive Queries | 4 | 1 | 2 | 1 | ✅ Implemented | | Window Functions & LATERAL | 4 | 0 | 3 | 1 | ✅ Implemented | | TopK (ORDER BY … LIMIT) | 4 | 1 | 2 | 1 | ✅ Implemented | | Tables Without Primary Keys | 3 | 1 | 1 | 1 | ✅ Implemented | | Diamond Dependencies & DAG | 5 | 0 | 3 | 2 | ✅ Implemented | | Schema Changes & DDL Events | 6 | 2 | 3 | 1 | ✅ Implemented | | Performance & Sizing | 7 | 2 | 3 | 2 | ✅ Implemented | | dbt Integration | 7 | 2 | 3 | 2 | ✅ Implemented | | Deployment & Operations | 8 | 2 | 4 | 2 | ✅ Implemented | | Error Recovery & Debugging | 7 | 2 | 3 | 2 | ✅ Implemented | | **Total** | **94** | **28** | **43** | **23** | **All done** | --- ## Overlap Analysis Some proposed questions overlap with existing FAQ entries. These should either be merged into the existing section or cross-linked rather than duplicated: | Proposed ID | Overlaps with existing FAQ entry | Recommendation | |-------------|----------------------------------|----------------| | GS-06 | "What is `__pgt_row_id`?" (Performance & Tuning) | Expand existing | | CDC-02 | "What is the overhead of CDC triggers?" (CDC section) | Expand existing | | AG-05 | "Why are unsupported aggregates limited to FULL mode?" (SQL Features section) | Expand existing | | JN-04 | "How does `NATURAL JOIN` work?" (SQL Features section) | Cross-link | | ER-01 | "What happens when a stream table keeps failing?" (Monitoring) | Expand existing | | ER-02 | "My stream table is stuck in INITIALIZING status" (Troubleshooting) | Cross-link | | OP-02 | "Does pg_trickle work with PgBouncer?" (Interoperability) | Expand existing | | DC-01 | Partially in "What schedule formats are supported?" | New section | | IM-01 | Partially in "When should I use FULL vs. DIFFERENTIAL vs. IMMEDIATE?" | Expand existing | After deduplication, **~78 net-new questions** remain. --- ## Implementation Notes 1. **Group placement** — New questions should be organized into clear sections in the FAQ. Some existing sections should be split or renamed: - Split "SQL Support" into per-feature subsections (Aggregates, Joins, CTEs, Windows, TopK) for discoverability. - Add a new "Data Freshness & Consistency" top-level section. - Add a new "IMMEDIATE Mode" section. - Add a new "dbt Integration" section. 2. **Answer depth** — P1 questions should have complete, self-contained answers with code examples. P2 questions can reference other docs. P3 questions can be brief with a link to the relevant architecture/plan document. 3. **Cross-links** — Every answer should link to the relevant section of SQL_REFERENCE.md, ARCHITECTURE.md, or CONFIGURATION.md for deeper reading. 4. **Code examples** — Prefer runnable SQL snippets. Use the employee/department schema from GETTING_STARTED.md for consistency. 5. **Ordering** — Within each section, P1 questions come first, then P2, then P3.