# pg_trickle — Self-Refreshing Materialized Views for PostgreSQL

## What Is pg_trickle?

pg_trickle is a PostgreSQL extension that keeps your summary tables and
materialized views automatically up to date — without rebuilding them from
scratch every time something changes.

In plain PostgreSQL, refreshing a materialized view re-runs the entire query,
even if only one row changed out of millions. pg_trickle solves this by
tracking what changed and updating only the affected rows. One insert into a
million-row table? pg_trickle touches exactly one row's worth of computation.

No external services, no sidecars, no message queues. Just install the
extension and your views stay fresh.

## Why Use It?

**The problem:** Teams using PostgreSQL materialized views face a painful
trade-off — burn CPU on constant full refreshes, or live with stale data. Most
end up building custom refresh pipelines just to keep dashboards and reports
current.

**The pg_trickle approach:** You define a "stream table" with a SQL query and a
schedule. The extension captures changes to your source tables and, on each
refresh cycle, computes only what changed and merges the result. You get fresh
data with a fraction of the computation.

## How It Works (The Simple Version)

1. **You define a query and a schedule** — "keep this summary updated every
   30 seconds"
2. **pg_trickle watches your source tables** — lightweight change tracking
   captures inserts, updates, and deletes automatically
3. **Only changes are processed** — instead of re-running the full query,
   a smart "delta query" processes just the new/changed/deleted rows
4. **Your stream table stays current** — the results are merged into a real
   PostgreSQL table you can query like any other

## Key Capabilities

### Broad SQL Support

pg_trickle supports the SQL you already write:

- **JOINs** — inner, left, right, full outer, natural, nested multi-table
- **Aggregations** — COUNT, SUM, AVG, MIN, MAX, STRING_AGG, ARRAY_AGG,
  JSON_AGG, statistical aggregates, and more
- **Set operations** — UNION, INTERSECT, EXCEPT
- **Subqueries** — in FROM, WHERE, SELECT; EXISTS, IN, NOT IN, scalar subqueries
- **Common Table Expressions** — including `WITH RECURSIVE`
- **Window functions** — ROW_NUMBER, RANK, SUM OVER, etc.
- **LATERAL joins and JSON functions** — jsonb_array_elements, unnest, JSON_TABLE
- **TopK queries** — ORDER BY ... LIMIT N with efficient scoped recomputation
- **DISTINCT and DISTINCT ON**
- **GROUPING SETS, CUBE, ROLLUP**

### Four Refresh Modes

| Mode | How it works |
|------|--------------|
| **AUTO** (default) | Smart default — uses incremental delta when possible, falls back to full recomputation |
| **DIFFERENTIAL** | Processes only changed rows — the fastest option for supported queries |
| **FULL** | Complete recomputation each cycle — works with any query |
| **IMMEDIATE** | Updates the stream table *within the same transaction* as your INSERT/UPDATE/DELETE — read-your-writes consistency |

### Smart Scheduling

Stream tables can depend on each other, forming a pipeline. pg_trickle
understands these dependencies and refreshes them in the right order. You only
set an explicit schedule on the tables your application actually reads — the
system propagates timing requirements upward through the dependency chain
automatically.

Supports fixed intervals (e.g. every 30 seconds), cron expressions (e.g.
`@hourly`), and automatic demand-driven scheduling.

### Zero External Infrastructure

pg_trickle runs entirely inside PostgreSQL:

- **No WAL configuration required** — change tracking uses lightweight
  row-level triggers by default
- **No logical replication slots** — works out of the box on any PostgreSQL 18
  installation
- **Optional WAL-based capture** — when `wal_level = logical` is available,
  the system can automatically transition to WAL-based capture for even lower
  overhead. The transition is seamless and reversible.

### Built for Observability

- Monitoring views showing refresh history, timing, and row counts
- Health check function with OK/WARN/ERROR triage
- Dependency tree visualization
- Change buffer size monitoring
- NOTIFY-based alerting when refreshes complete or fail
- Staleness reporting

## Quick Start

```sql
-- 1. Enable the extension
CREATE EXTENSION pg_trickle;

-- 2. Create your source table
CREATE TABLE orders (
    id     INT PRIMARY KEY,
    region TEXT,
    amount NUMERIC
);

-- 3. Create a stream table — it stays up to date automatically
SELECT pgtrickle.create_stream_table(
    'regional_totals',
    'SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
     FROM orders GROUP BY region'
);

-- 4. Query it like any regular table
SELECT * FROM regional_totals;

-- 5. Insert data into your source table...
INSERT INTO orders VALUES (1, 'US', 100), (2, 'EU', 200);

-- 6. After the next refresh cycle, regional_totals reflects the changes
SELECT * FROM regional_totals;

-- For instant updates within the same transaction:
SELECT pgtrickle.create_stream_table(
    'regional_totals_live',
    'SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
     FROM orders GROUP BY region',
    refresh_mode => 'IMMEDIATE'
);
```

## Installation

### Prerequisites

- PostgreSQL 18.x
- Rust 1.82+ with [pgrx](https://github.com/pgcentralfoundation/pgrx) 0.17.x

### Build and Install

```bash
cargo pgrx install --release --pg-config $(pg_config --bindir)/pg_config
```

Add to `postgresql.conf`:

```ini
shared_preload_libraries = 'pg_trickle'
max_worker_processes = 8
```

Restart PostgreSQL, then run `CREATE EXTENSION pg_trickle;` in your database.

### Kubernetes (CloudNativePG)

pg_trickle is available as an OCI extension image for CloudNativePG:

```bash
docker pull ghcr.io/grove/pg_trickle-ext:0.10.0
```

### Docker Hub

Pre-built Docker images with PostgreSQL 18 and pg_trickle are available:

```bash
docker pull grove/pg_trickle:latest
```

## Additional Features

- **ALTER QUERY** — change the defining query of a stream table online without
  dropping and recreating it
- **Diamond dependency consistency** — pipelines with shared upstream sources
  are refreshed atomically to prevent inconsistent reads
- **Circular dependency support** — monotone cycles can be refreshed to a fixed
  point with convergence guardrails
- **Watermark gating** — external data loaders can publish watermarks so
  downstream refreshes wait until sources are aligned
- **PgBouncer compatible** — works behind connection poolers in transaction-pool
  mode (Supabase, Railway, Neon, and similar managed platforms)
- **Crash-safe** — row-level locks prevent concurrent refreshes; crash recovery
  resumes normal operation automatically
- **dbt integration** — the `dbt-pgtrickle` package lets you manage stream
  tables as dbt models

## Technical Foundation

pg_trickle is grounded in the
[DBSP](https://arxiv.org/abs/2203.16684) differential dataflow framework
(Budiu et al., 2022). Delta queries are derived automatically from the SQL
operator tree: joins produce the classic bilinear expansion, aggregates maintain
auxiliary counters, and linear operators like filters pass deltas through
unchanged.

Written in Rust using [pgrx](https://github.com/pgcentralfoundation/pgrx).
Apache 2.0 licensed. 1,100+ unit tests and 800+ end-to-end tests.

## Links

- **Source code:** https://github.com/grove/pg-trickle
- **Documentation:** https://grove.github.io/pg-trickle/
- **Issue tracker:** https://github.com/grove/pg-trickle/issues
- **Getting started tutorial:** https://github.com/grove/pg-trickle/blob/main/docs/GETTING_STARTED.md
- **SQL reference:** https://github.com/grove/pg-trickle/blob/main/docs/SQL_REFERENCE.md
- **Configuration guide:** https://github.com/grove/pg-trickle/blob/main/docs/CONFIGURATION.md
- **Roadmap:** https://github.com/grove/pg-trickle/blob/main/ROADMAP.md