> **Plain-language companion:** [v0.40.0.md](v0.40.0.md)

## v0.40.0 — Embedding API & Advanced RAG Patterns

**Status: Planned.** Derived from [plans/ecosystem/PLAN_PGVECTOR.md](plans/ecosystem/PLAN_PGVECTOR.md) §6.4 (VA-1 through VA-5).

> **Release Theme**
> v0.40.0 consolidates the pgvector integration programme (v0.37–v0.39) into
> a production-ready, ergonomic surface layer. The `embedding_stream_table()`
> API enables one-function RAG corpus setup; materialised k-NN graph research
> explores fixed-pivot retrieval patterns; per-tenant ANN indexing is
> documented as a production pattern; and co-marketing with pgvector/pgai
> positions pg_trickle as the default incremental-view solution for AI on
> PostgreSQL.

---

### Features

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| VA-1 | `embedding_stream_table()` ergonomic API: one-call RAG corpus setup | L | P1 |
| VA-2 | Materialised k-NN graph research: explore fixed-pivot retrieval *(parallel research spike — not a release gate)* | L | P3 |
| VA-3 | Per-tenant ANN indexing patterns: RLS-scoped embedding corpora | M | P2 |
| VA-4 | Outbox-emitted embedding events for downstream consumption | M | P2 |
| VA-5 | Joint case study / co-marketing with pgvector and pgai teams *(best-effort, not a release gate)* | — | P1 |

**VA-1 — `embedding_stream_table()` ergonomic API.**
Add a high-level function that auto-generates the full denormalization query,
creates the stream table, indexes, and monitoring:

```sql
SELECT pgtrickle.embedding_stream_table(
  name                => 'product_search_corpus',
  source_table        => 'products',
  embedding_column    => 'embedding',     -- single column name or ARRAY[...]
  joins               => ARRAY[
    ('product_tags', 'product_id', 'tag', 'name'),  -- auto-JOIN syntax
    ('product_metadata', 'product_id', 'metadata')
  ],
  aggregates          => ARRAY['category', 'price_bin'],  -- GROUP BY cols
  filters             => 'active = true AND seller_rating > 4',
  schedule            => '10 seconds',
  refresh_mode        => 'DIFFERENTIAL',
  index_type          => 'hnsw',  -- or 'ivfflat'
  post_refresh_action => 'reindex_if_drift',
  reindex_drift_threshold => 0.15,
  vector_agg          => 'vector_avg',  -- if aggregating over embeddings
  retention           => '90 days'       -- optional PITR window
);
```

Under the hood, this constructs:
```sql
SELECT
  p.id, p.name, p.description, p.embedding,
  array_agg(pt.tag) AS tags,
  pm.metadata,
  p.category, p.price_bin
FROM products p
LEFT JOIN product_tags pt ON pt.product_id = p.id
LEFT JOIN product_metadata pm ON pm.product_id = p.id
WHERE p.active AND p.seller_rating > 4
GROUP BY p.id, p.category, p.price_bin, pm.metadata
```

And then:
1. Creates the stream table with the inferred schema.
2. Creates an HNSW index on the embedding column.
3. Creates B-tree or GIN indexes on join-result columns as needed.
4. Configures post-refresh actions, monitoring, and retention.
5. Returns a summary JSON: `{created_at, st_name, embedding_dim, approx_rows, index_size_mb}`.

Benefits:
- 80% of RAG users don't need to write SQL — this covers their use case.
- Maintains SQL-first flexibility for power users (they use `create_stream_table`
  directly).
- Lowers the barrier to entry for AI/ML engineers unfamiliar with PostgreSQL
  query authoring.

Required design constraints:
- **Show generated SQL (P1):** The function must accept a `dry_run => true`
  parameter that returns the generated `CREATE STREAM TABLE` + `CREATE INDEX`
  SQL without executing it. This is essential for expert users who want to
  audit or customise the output.
- **Index-inference heuristics (P1):** When `index_type` is omitted, document
  the inference rule explicitly: HNSW is chosen for `vector` and `halfvec`
  columns; IVFFlat is chosen only when explicitly requested. The heuristic and
  its rationale must be documented in the function's SQL comment and in
  `docs/SQL_REFERENCE.md`.

**VA-2 — Materialised k-NN graph (research spike).**

> **Scope change (assessment 2026-04-27):** VA-2 has been reclassified from a
> v0.40.0 release item to a **parallel research spike**. The roadmap itself
> acknowledged it is "likely not production-ready in v0.40.0." Keeping it as
> a release gate on a Large release with uncertain scope creates risk. Open a
> dedicated research tracking issue; VA-2 findings may graduate to v0.41.0 or
> later. T-VA2 is optional/informational in v0.40.0.

Explore whether pre-computing a k-NN graph can accelerate approximate retrieval
for fixed query pivots:

- Given a set of N embedding vectors and a set of P pivot vectors, maintain a
  stream table of `(pivot_id, embedding_id, distance)` tuples representing the
  k-nearest embeddings to each pivot.
- Application queries: `SELECT * FROM knn_graph WHERE pivot_id = 42 ORDER BY
  distance LIMIT 10`.

Research questions:
- Storage/compute trade-off: is maintaining k·N rows cheaper than an HNSW
  index scan?
- Clustering lifecycle: when pivots change, how do we efficiently rebuild the
  graph?
- Heterogeneous k: can we keep the k-NN graph up-to-date when k varies per
  pivot?

Outcome: a research document (`docs/research/MATERIALISED_KNN.md`) and
optionally a proof-of-concept implementation if the trade-off is favorable.
Likely not production-ready in v0.40.0; may be promoted to v0.41.0 or later.

**VA-3 — Per-tenant ANN indexing patterns.**
Document and exemplify the pattern for multi-tenant RAG:

```sql
-- Create a tenant-scoped embedding stream table using RLS (v0.5)
SELECT pgtrickle.create_stream_table(
  'tenant_embeddings_scoped',
  $$
    SELECT e.id, e.embedding, e.tenant_id, e.chunk_text
    FROM embeddings e
    WHERE e.tenant_id = current_setting('app.current_tenant_id')::int
  $$,
  rls_context => 'app.current_tenant_id',
  schedule    => '5 seconds'
);

CREATE INDEX ON tenant_embeddings_scoped USING hnsw (embedding vector_cosine_ops);

-- Application queries automatically filtered by RLS:
SELECT id FROM tenant_embeddings_scoped
WHERE embedding <=> $1 < 0.1
LIMIT 20;
```

Pattern documentation: `docs/tutorials/MULTI_TENANT_RAG.md`.
Security audit checklist: verify that RLS policies are correctly applied to
stream tables, that cross-tenant data never leaks, and that the ST inherits
permissions from the source table.

**VA-4 — Outbox-emitted embedding events.**
Extend the outbox (v0.28) to emit embedding-change events:

```sql
SELECT pgtrickle.create_stream_table(
  'embedding_changes_outbox',
  $$
    SELECT doc_id, old_embedding, new_embedding,
           'EMBEDDING_UPDATED' AS event_type
    FROM docs_embedded_audit
  $$,
  outbox_name => 'embedding_outbox'
);
```

Downstream systems (AI agents, re-rankers, external embedding services) can
subscribe to the outbox and react to embedding changes. Enables
event-driven architecture for RAG applications.

**VA-5 — Joint case studies and co-marketing.**

> **Best-effort only (assessment 2026-04-27):** VA-5 depends on third-party
> coordination (pgvector and pgai team willingness, timing, publishing
> schedules). It must not be a blocking exit criterion for the v0.40.0 release.
> If outreach has not resulted in a commitment by feature-freeze, ship with
> an internal cookbook and a public solo blog post instead.

Partner with pgvector and pgai teams:

- **Blog post:** "PostgreSQL as a Vector Database: pg_trickle + pgvector +
  pgai" (cross-published on blogs).
- **Case study:** Real production system (anonymized as needed) maintaining a
  document-embedding corpus with pg_trickle + pgvector + pgai, serving
  semantic search + LLM context retrieval.
- **Benchmark:** TPC-H with hybrid-search queries (vector similarity + SQL
  filters + BM25 if using pg_search) on pg_trickle stream tables. Measure
  latency, throughput, and resource consumption.
- **Repository:** Create a public example repo (`pg-trickle-rag-starter`)
  with Docker Compose, sample data, and working RAG application code (Python +
  LangChain / LlamaIndex).

Strategic positioning: pg_trickle is the incremental-maintenance layer for
AI/RAG on PostgreSQL. Combined with pgvector (indexing) and pgai (embedding
generation), users get a complete, open-source, self-hosted RAG stack with
no external vector database.

### Test Coverage

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| T-VA1 | Integration test: `embedding_stream_table()` auto-generates correct ST | M | P1 |
| T-VA2 | k-NN graph proof of concept: verify throughput trade-off | L | P2 |
| T-VA3 | Multi-tenant security test: RLS policies applied to embedding STs | M | P1 |
| T-VA4 | Outbox event emission for embedding changes | M | P2 |

**T-VA1.**
`tests/e2e_embedding_api_tests.rs`:

1. Call `embedding_stream_table()` with a realistic joins + aggregates spec.
2. Verify the generated stream table exists with the expected schema.
3. Verify indexes are created correctly.
4. Insert new source rows; verify stream table refreshes correctly.
5. Query the stream table and assert results match the manually-written
   equivalent `create_stream_table()` with identical query.

**T-VA2.**
`tests/e2e_knn_graph_research_tests.rs` (may be optional / research-only):

1. Create embedding stream table with 100k rows.
2. Define a set of 10 pivot vectors (e.g., cluster centroids).
3. Maintain a k-NN graph ST listing the 50 nearest embeddings to each pivot.
4. Measure: row count, index size, refresh latency vs. direct HNSW index.
5. Benchmark: single-pivot query via k-NN graph vs. direct `LIMIT 50` on HNSW
   index.

**T-VA3.**
`tests/e2e_multi_tenant_embedding_tests.rs`:

1. Create two tenant contexts (tenant_id 1 and 2) with separate embedding data.
2. Create an embedding stream table with RLS policy: `WHERE tenant_id =
   current_setting('app.current_tenant_id')::int`.
3. Connect as tenant 1; query stream table; verify only tenant 1's embeddings
   returned.
4. Connect as tenant 2; query stream table; verify only tenant 2's embeddings
   returned.
5. Verify that an admin can see the full table (or is restricted appropriately
   per policy).

**T-VA4.**
`tests/e2e_embedding_outbox_tests.rs`:

1. Create an embedding stream table with `outbox_name` configured.
2. Insert source rows; verify outbox contains the corresponding embedding
   records.
3. Verify downstream subscriber receives the outbox events.

### Conflicts & Risks

- **VA-1** (embedding_stream_table API) is syntactic sugar; it hides complexity
  and may lead to suboptimal queries if users don't understand the generated
  SQL. Mitigation: `dry_run => true` shows the generated SQL (P1 requirement);
  index-inference heuristics are documented explicitly.
- **VA-2** (k-NN graph) is a parallel research spike; it does not block the
  release. See the scope-change note above.
- **VA-3** (multi-tenant RLS) must be audited carefully. RLS can be bypassed
  by a careless admin. Include security best practices in the documentation.
- **VA-5** (co-marketing) depends on willingness from pgvector/pgai teams.
  Outreach should begin during v0.38 timeframe. See the best-effort note above.

### Exit Criteria

- [ ] VA-1: `embedding_stream_table()` function implemented, generates correct SQL, creates indexes and monitoring
- [ ] VA-1: `dry_run => true` parameter returns generated SQL without executing it
- [ ] VA-1: Index-inference heuristic (HNSW default for vector/halfvec; IVFFlat only if explicit) documented in SQL comment and `docs/SQL_REFERENCE.md`
- [ ] VA-1: Function auto-generates query for 80%+ of common RAG patterns (single source + 2–3 joins + GROUP BY)
- [ ] VA-1: Calling `embedding_stream_table()` produces identical results to manually-written equivalent
- [ ] VA-2: Research spike tracking issue opened; `docs/research/MATERIALISED_KNN.md` published with trade-off analysis *(optional: T-VA2 PoC if favorable)*
- [ ] VA-3: `docs/tutorials/MULTI_TENANT_RAG.md` published with security checklist
- [ ] VA-3: `tests/e2e_multi_tenant_embedding_tests.rs` passes (RLS policies enforced)
- [ ] VA-4: Outbox events emitted correctly on embedding stream table inserts
- [ ] VA-5: *(best-effort)* Blog post published (co-published or pg_trickle solo); `pg-trickle-rag-starter` repo available on GitHub
- [ ] Extension upgrade path tested (`0.39.0 → 0.40.0`)
- [ ] `just check-version-sync` passes

---