---
title: Logical Replication Operational Guide
description: Monitor, troubleshoot, and safely operate ParadeDB as a permanent logical replica
canonical: https://docs.paradedb.com/deploy/logical-replication/operational-guide
---
This guide covers how to operate ParadeDB after logical replication has been
set up.
Use [Getting Started with Logical
Replication](/deploy/logical-replication/getting-started) to create
the publication and subscription first. This page focuses on what happens after
the link is established and ParadeDB is staying in sync continuously.
## Operating Model
When ParadeDB is used as a logical subscriber:
1. Your application writes to tables on the publisher
2. PostgreSQL logical replication applies those row changes to matching tables
on ParadeDB
3. ParadeDB maintains BM25 indexes locally on the subscriber
4. Search and analytics queries run against ParadeDB instead of the primary
This keeps the source database authoritative while isolating search traffic from
OLTP traffic.
Logical replication copies row changes into ParadeDB, but it does not copy
BM25 indexes from the publisher. For the deployment described in this guide,
build the BM25 indexes you plan to query on the ParadeDB subscriber.
## Baseline Workflow
### 1. Wait for the Initial Copy to Finish
Let PostgreSQL finish copying the base table data before you build BM25 indexes.
This avoids extra indexing work during the bootstrap phase.
On ParadeDB, you can check whether the initial copy is still running with:
```sql
SELECT
subname,
worker_type,
CASE WHEN relid = 0 THEN NULL ELSE relid::regclass END AS table_name,
latest_end_time
FROM pg_stat_subscription
ORDER BY 1, 2, 3;
```
The initial copy is complete when there are no remaining rows with
`worker_type = 'table synchronization'`. If you want a stricter per-table
check, run:
```sql
SELECT srrelid::regclass AS table_name, srsubstate
FROM pg_subscription_rel
ORDER BY 1;
```
The initial copy is complete when every replicated table is in state `r`
(`ready`).
### 2. Build BM25 Indexes on ParadeDB
Once the replicated tables are caught up, create BM25 indexes locally on
ParadeDB:
```sql
CREATE INDEX mock_items_bm25_idx ON public.mock_items
USING bm25 (id, description, category, rating)
WITH (key_field='id');
```
After this, ongoing replicated `INSERT`, `UPDATE`, and `DELETE` operations will
keep the BM25 index current automatically.
### 3. Query ParadeDB
Your application can now issue search queries to ParadeDB without adding search
indexes to the primary database:
```sql
SELECT id, description, pdb.score(id) AS score
FROM mock_items
WHERE description @@@ 'running shoes'
ORDER BY score DESC
LIMIT 10;
```
## Day-2 Operations
### Choose Publication and Subscription Boundaries
For large or high-churn production tables, use one publication and one
subscription per large table, or group only small related tables together.
This gives each subscription its own main apply worker and replication slot.
In normal steady-state replication, PostgreSQL does not parallelize ordinary
change application across tables within a single subscription, so one hot table
can delay other tables that share that apply worker. A publication per table
alone does not provide that isolation unless it also has its own subscription.
If you split replication this way, size the replication worker settings for the
number of subscriptions you plan to run:
- On the publisher, set `max_replication_slots` to at least the number of
subscriptions plus reserve for initial table synchronization workers. During
bootstrap, each active table synchronization worker can temporarily consume
its own replication slot on the publisher. With the default
`max_sync_workers_per_subscription = 2`, leave room for the main subscription
plus up to two extra sync slots per bootstrapping subscription, and set
`max_wal_senders` high enough to cover the same plus any physical replicas.
- On the subscriber, set `max_replication_slots` and
`max_logical_replication_workers` to at least the number of subscriptions plus
reserve for table synchronization workers. On PostgreSQL 18+,
`max_active_replication_origins` controls replication origin tracking
separately and should also be sized accordingly. `max_worker_processes` must
be high enough to accommodate those logical replication workers and any other
background workers used by the system.
- `max_sync_workers_per_subscription` controls initial-copy parallelism when a
subscription is created or refreshed. The default is `2`, so multi-table
publications normally copy at most two tables at a time unless you raise it.
### Add New Tables
When you want ParadeDB to index a new table:
1. Apply the new table DDL on the publisher
2. Apply the same DDL on ParadeDB
3. Make sure the publication includes the table
4. Refresh the subscription
5. Build a BM25 index on ParadeDB if the table should be searchable
Whether step 3 is manual depends on how the publication was defined. If the
publication uses `FOR ALL TABLES`, the new table is included automatically. If
it uses `FOR TABLES IN SCHEMA ...`, new tables in those schemas are included
automatically. If it was created from an explicit table list, add the table
manually. If you do not want the table on ParadeDB, do not include it in the
publication.
```sql
-- On the publisher
ALTER PUBLICATION app_search_pub ADD TABLE public.new_table;
-- On ParadeDB
ALTER SUBSCRIPTION app_search_sub REFRESH PUBLICATION;
```
### Change Indexed Columns
If you add or remove a column that is part of a BM25 index:
1. Apply the table change on both the publisher and ParadeDB
2. Let replication catch up again
3. Rebuild the BM25 index on ParadeDB
See [Reindexing](/documentation/indexing/reindexing) for the BM25 rebuild
workflow.
### Roll Out DDL Safely
PostgreSQL logical replication does not replicate schema changes. That means
the publisher and ParadeDB must be kept in sync manually.
In practice, most teams do this through their existing migration runner or
framework tooling, whether that is Rails migrations, Django migrations, Prisma
Migrate, or another migration system.
For additive changes such as `ADD COLUMN`, the safest rollout is usually:
1. Apply the additive DDL on ParadeDB first
2. Apply the same DDL on the publisher
3. Let replication continue normally
4. Rebuild any BM25 indexes whose indexed column list changed
This follows PostgreSQL's recommendation to apply additive schema changes on the
subscriber first whenever possible, which avoids intermittent apply failures.
Logical replication can tolerate extra columns on the subscriber, so adding a
column on ParadeDB first will not stop replication by itself. Those extra
subscriber-only columns use their local default value, or `NULL` if no default
is defined, until the publisher starts sending that column.
If the new column must be `NOT NULL`, give it a compatible default on both
sides or use a coordinated maintenance window. Otherwise replicated `INSERT`
operations can fail before the publisher-side change is in place.
If the change is not additive, such as a column rename, drop, or incompatible
type change, use a short maintenance window, pause writes to the affected
tables if possible, and coordinate both sides explicitly:
```sql
-- On Subscriber
ALTER SUBSCRIPTION marketplace_sub DISABLE;
ALTER TABLE mock_items RENAME COLUMN category TO product_category;
-- On Publisher
ALTER TABLE mock_items RENAME COLUMN category TO product_category;
-- Back on Subscriber
ALTER SUBSCRIPTION marketplace_sub ENABLE;
```
Do not leave a disabled subscription in place longer than necessary. The
logical slot on the publisher can continue retaining WAL while the subscriber
is disabled.
### Handle Tables Without Primary Keys
PostgreSQL needs a replica identity to replicate `UPDATE` and `DELETE`
operations. A primary key is best. Another suitable unique index can also be
used as the replica identity. If a table has no suitable key, you can use the
per-table fallback:
```sql
ALTER TABLE public.events REPLICA IDENTITY FULL;
```
Do not think of this as a server-wide setting. `REPLICA IDENTITY FULL` is set
per published table and should be treated as a fallback rather than the default
design.
PostgreSQL explicitly warns that subscriber-side `UPDATE` and `DELETE` can
become very inefficient under `FULL`, because the subscriber must locate the
matching row using the entire old row image rather than a compact key.
`FULL` also increases WAL volume and replication traffic on the publisher,
since every `UPDATE` and `DELETE` writes the full before-image of the row
into WAL instead of just the key columns.
### Monitor the Publisher
Permanent logical replication is operationally safe only if you watch the
publisher, not just the subscriber. The most important signal is how much WAL a
logical slot is retaining.
```sql
SELECT
slot_name,
active,
restart_lsn,
confirmed_flush_lsn,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal,
wal_status,
safe_wal_size,
inactive_since
FROM pg_replication_slots
WHERE slot_type = 'logical';
```
Watch for:
- `retained_wal` growing steadily because the subscriber is not acknowledging
WAL quickly enough
- `inactive_since` becoming non-`NULL` for longer than expected
- `wal_status` showing that the slot is under pressure
- Filesystem usage on the volume that contains `pg_wal`
To reduce blast radius, configure `max_slot_wal_keep_size` on the publisher.
This caps how much WAL a slot may retain, but it can also invalidate a lagging
subscriber, so it should be paired with alerting and a reseed plan.
### Monitor the Subscriber
Use the subscriber to confirm that apply workers are healthy and that errors are
not accumulating:
```sql
SELECT subname, worker_type, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;
SELECT subname, apply_error_count, sync_error_count
FROM pg_stat_subscription_stats;
```
If `latest_end_time` stops advancing or `apply_error_count` increases, inspect
the subscriber logs immediately.
### Troubleshoot Apply Failures
One common cause of apply-worker failures is schema drift between the publisher
and subscriber.
Two common log patterns for schema drift are:
```text
logical replication target relation "public.doctor" is missing replicated columns: "personnel_id", "role_function_id"
```
```text
logical replication apply worker for subscription "paradedb_subscription" has started
background worker "logical replication apply worker" (PID 2570238) exited with exit code 1
```
The first message is the root cause. The second means the apply worker crashed
after hitting that error and PostgreSQL will try to restart it.
When you see these messages:
1. Inspect the subscriber logs for the first schema-mismatch error, not just the
worker restart message
2. Compare the affected table definition on the publisher and ParadeDB
3. Apply the missing DDL on ParadeDB
4. Re-enable or refresh the subscription if needed
5. Rebuild any BM25 indexes affected by the schema change
Another common cause of apply-worker failures is a logical replication conflict.
For example, a duplicate key, a permissions failure on the target table, or
row-level security on the subscriber can stop replication even when the schemas
match.
```text
ERROR: duplicate key value violates unique constraint ...
CONTEXT: processing remote data during INSERT for replication target relation ...
```
When you suspect a replication conflict:
1. Inspect the subscriber logs for the first conflict error and note the finish
LSN and replication origin if PostgreSQL logged them
2. Resolve the underlying issue on the subscriber, such as conflicting local
data, missing privileges, or row-level security policy interference
3. Resume replication normally once the conflict is removed
4. Only if you intentionally want to discard that remote transaction, use
`ALTER SUBSCRIPTION ... SKIP` with care
Skipping a conflicting transaction can leave the subscriber inconsistent, so it
should be treated as a last resort rather than the default fix. For conflict
types and the PostgreSQL recovery workflow, see the [PostgreSQL logical
replication conflicts
documentation](https://www.postgresql.org/docs/current/logical-replication-conflicts.html).
### Emergency: WAL Keeps Accumulating on the Publisher
If the logical slot on the publisher is filling disk and ParadeDB cannot catch
up quickly enough, the priority is protecting the publisher.
1. First, fix the subscriber if the issue is simple and recent, such as a
schema mismatch or networking issue
2. If the publisher is running out of disk and the subscriber can be rebuilt,
remove the subscription or drop the logical slot so the publisher can recycle
WAL again
3. Recreate the subscription and reseed ParadeDB once the publisher is safe
Disabling the subscription is not an emergency fix for WAL buildup. A disabled
subscription still leaves the logical slot behind on the publisher, and that
slot can continue retaining WAL.
If the subscriber is reachable and healthy enough to cleanly tear down, dropping
the subscription is the cleanest path:
```sql
DROP SUBSCRIPTION paradedb_subscription;
```
To protect the publisher from continued `pg_wal` growth when you are
intentionally giving up the current replica state, drop the slot on the
publisher:
```sql
SELECT pg_drop_replication_slot('paradedb_subscription');
```
After either step, ParadeDB must be reinitialized from a fresh schema and data
copy before it can resume as a logical subscriber.
## Common Pitfalls
- Starting with pre-populated subscriber tables while using `copy_data = true`
- Applying DDL on only one side of the replication link
- Forgetting that new tables must be added to the publication and refreshed on
the subscription
- Writing directly to subscribed tables on ParadeDB, which can create conflicts
with incoming replicated changes
- Leaving a broken logical slot unattended on the publisher until `pg_wal`
fills disk
- Assuming `ALTER SUBSCRIPTION ... DISABLE` relieves publisher-side WAL pressure
For schema-change basics, see [Schema
Changes](/deploy/logical-replication/configuration). For multiple source
databases, see [Multi-Database Replication for
Microservices](/deploy/logical-replication/multi-database).