# PgClone Architecture

This document describes the internal architecture of pgclone, covering the codebase structure, key design decisions, and PostgreSQL version compatibility.

## Codebase Structure

```
pgclone/
├── src/
│   ├── pgclone.c          # Main extension (~3800 lines)
│   │                      # - Table, schema, database clone functions
│   │                      # - DDL generation (indexes, constraints, triggers, views)
│   │                      # - COPY protocol data transfer
│   │                      # - Selective column / WHERE filter logic
│   │                      # - _PG_init(), shmem hooks, version function
│   ├── pgclone_bgw.c      # Background worker (~800 lines)
│   │                      # - bgw_main entry point
│   │                      # - Async table/schema clone workers
│   │                      # - Parallel worker spawning
│   │                      # - Shared memory progress updates
│   └── pgclone_bgw.h      # Shared definitions
│                          # - Job state struct, status enums
│                          # - Shared memory layout (pgclone_state)
│                          # - MAX_JOBS, progress fields
├── sql/
│   └── pgclone--X.Y.Z.sql # SQL function definitions per version
├── test/
│   ├── fixtures/seed.sql  # Test data
│   ├── pgclone_test.sql   # 33 pgTAP tests
│   ├── run_tests.sh       # Test orchestrator
│   ├── run_all.sh         # Multi-version runner
│   ├── test_async.sh      # Async test suite
│   └── test_database_create.sh
├── Dockerfile             # Multi-version build container
├── docker-compose.yml     # Source + test containers (PG 14–18)
├── Makefile               # PGXS-based build
├── pgclone.control        # Extension metadata
└── META.json              # PGXN metadata
```

## Core Design Decisions

### Why libpq Instead of SPI?

pgclone uses **loopback libpq connections** to the local target database for all DDL operations instead of PostgreSQL's SPI (Server Programming Interface). The reason: SPI executes within the calling transaction's snapshot, so DDL statements like `CREATE TABLE` aren't visible to subsequent SPI calls within the same function invocation until the transaction commits. By connecting via libpq (even to `localhost`), each DDL statement executes in its own transaction and is immediately visible.

### Why C Instead of PL/pgSQL?

- Direct access to the COPY protocol via `PQgetCopyData` / `PQputCopyData` for high-throughput data transfer
- Background worker registration requires C (`RegisterDynamicBackgroundWorker`)
- Shared memory allocation for progress tracking requires C hooks
- Fine-grained error handling and resource cleanup with `PG_TRY` / `PG_CATCH`

### COPY Protocol Data Transfer

Data is transferred using PostgreSQL's COPY protocol, which is significantly faster than row-by-row `INSERT`:

1. Open a `COPY ... TO STDOUT` on the source connection
2. Open a `COPY ... FROM STDIN` on the target connection
3. Stream data between them in chunks via `PQgetCopyData` / `PQputCopyData`
4. Finalize with `PQputCopyEnd`

This avoids parsing and re-serializing individual rows.

---

## Shared Memory Architecture

Async operations use PostgreSQL shared memory to track job progress:

```c
typedef struct PgcloneJobState {
    int     job_id;
    int     status;                       // PENDING, RUNNING, COMPLETED, FAILED, CANCELLED
    char    schema_name[NAMEDATALEN];
    char    table_name[NAMEDATALEN];
    char    current_table[NAMEDATALEN];   // table currently being copied
    int     tables_total;
    int     tables_completed;
    int64   rows_copied;
    int64   start_time_ms;
    int64   elapsed_ms;
    char    error_message[256];
    // ... more fields
} PgcloneJobState;

typedef struct PgcloneSharedState {
    LWLock         *lock;                 // guards every field below
    int             num_jobs;
    PgcloneJobState jobs[MAX_JOBS];
} PgcloneSharedState;
```

- Allocated once during `_PG_init()` via shared memory hooks
- Protected by a lightweight lock (`LWLock`) for concurrent access
- Read by `pgclone_progress()`, `pgclone_jobs()`, and `pgclone_jobs_view`
- Written by background workers as they progress

---

## PostgreSQL Version Compatibility

pgclone uses C preprocessor guards to maintain compatibility across PG 14–18:

### Shared Memory Request (PG 15+)

PostgreSQL 15 introduced `shmem_request_hook`. From that version on, shared memory must be requested from this hook rather than directly in `_PG_init()`:

```c
#if PG_VERSION_NUM >= 150000
static shmem_request_hook_type prev_shmem_request_hook = NULL;

static void
pgclone_shmem_request(void)
{
    /* Chain to any hook installed before ours */
    if (prev_shmem_request_hook)
        prev_shmem_request_hook();

    RequestAddinShmemSpace(sizeof(PgcloneSharedState));
    RequestNamedLWLockTranche("pgclone", 1);
}
#endif
```

In `_PG_init()`:

```c
#if PG_VERSION_NUM >= 150000
    /* PG 15+: defer the request to the hook */
    prev_shmem_request_hook = shmem_request_hook;
    shmem_request_hook = pgclone_shmem_request;
#else
    /* PG 14: request directly from _PG_init() */
    RequestAddinShmemSpace(sizeof(PgcloneSharedState));
    RequestNamedLWLockTranche("pgclone", 1);
#endif
```

### Signal Handler (PG 17+)

On PostgreSQL 17 and later, pgclone installs `SignalHandlerForShutdownRequest` (declared in `postmaster/interrupt.h`) as the `SIGTERM` handler. Older versions use `die`:

```c
#if PG_VERSION_NUM >= 170000
#include "postmaster/interrupt.h"
    pqsignal(SIGTERM, SignalHandlerForShutdownRequest);
#else
    pqsignal(SIGTERM, die);
#endif
```

### Other Version-Specific Guards

- `d.adsrc` is unavailable because `adsrc` was removed from `pg_attrdef` in PG 12, so pgclone uses `pg_get_expr()` instead
- `strlcpy` vs `strncpy` for safe string copy across versions
- SQL return type consistency across version-specific `.sql` files

---

## Background Worker Lifecycle

1. **Registration:** `pgclone_table_async()` or `pgclone_schema_async()` allocates a job slot in shared memory, populates connection info and parameters, then calls `RegisterDynamicBackgroundWorker()`.

2. **Startup:** The worker process starts via `pgclone_bgw_main()`, which:
   - Sets up signal handlers
   - Connects to both source and target databases via libpq
   - Updates job status to RUNNING

3. **Execution:** The worker calls the same core clone functions used by sync operations, with periodic updates to shared memory (rows copied, current table, elapsed time).

4. **Parallel mode:** For `pgclone_schema_async` with `"parallel": N`, the parent worker:
   - Queries the source for the list of tables
   - Spawns N child workers (one per table)
   - Monitors child workers via shared memory
   - Updates aggregate progress

5. **Completion:** Worker sets status to COMPLETED or FAILED, disconnects from databases, and exits.

---

## Resource Management

pgclone follows these rules to avoid leaking connections, results, and memory:

- Every `PQconnectdb()` has a matching `PQfinish()` in all code paths (including error paths)
- Every `PQexec()` result is freed with `PQclear()`
- `PG_TRY` / `PG_CATCH` blocks ensure cleanup on errors
- Background workers disconnect from both source and target databases before exiting
- COPY pipeline errors consume remaining results to prevent connection state corruption

---

## Build System

pgclone uses PostgreSQL's PGXS build system:

```makefile
MODULES = pgclone
EXTENSION = pgclone
DATA = sql/pgclone--*.sql

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```

This integrates with `pg_config` to find the correct include paths, library directories, and installation locations for the target PostgreSQL version.
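
As a closing illustration of the job-slot bookkeeping described under Shared Memory Architecture and in the Registration step of the worker lifecycle, here is a minimal standalone sketch in plain C. It is not pgclone's actual code. `JobTable`, `JobSlot`, `job_alloc`, `job_fail`, the `JOB_FREE` marker, and the small `MAX_JOBS` value are all hypothetical names and values chosen for the example. The sketch also omits the `LWLock` entirely, so it only models single-process logic, and it uses `snprintf` as a portable stand-in for the bounded string copy mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_JOBS 4   /* assumption: the real limit is defined in pgclone_bgw.h */
#define NAME_LEN 64  /* stand-in for PostgreSQL's NAMEDATALEN */

/* JOB_FREE is a hypothetical "unused slot" marker; the document itself lists
 * only PENDING, RUNNING, COMPLETED, FAILED and CANCELLED. */
typedef enum {
    JOB_FREE = 0, JOB_PENDING, JOB_RUNNING,
    JOB_COMPLETED, JOB_FAILED, JOB_CANCELLED
} JobStatus;

/* Reduced version of the job state struct: only the fields used here. */
typedef struct {
    int  job_id;
    int  status;
    char schema_name[NAME_LEN];
    char table_name[NAME_LEN];
    char error_message[256];
} JobSlot;

typedef struct {
    int     next_job_id;   /* last id handed out */
    int     num_jobs;      /* slots claimed so far */
    JobSlot jobs[MAX_JOBS];
} JobTable;

/* Claim the first free slot. Returns the new job id, or -1 if the table is
 * full. In an extension the caller would hold the lock around this call. */
static int job_alloc(JobTable *t, const char *schema, const char *table)
{
    assert(t != NULL && schema != NULL && table != NULL);
    for (int i = 0; i < MAX_JOBS; i++) {
        JobSlot *slot = &t->jobs[i];
        if (slot->status != JOB_FREE)
            continue;
        memset(slot, 0, sizeof(*slot));
        slot->job_id = ++t->next_job_id;
        slot->status = JOB_PENDING;
        /* snprintf always NUL-terminates and truncates overlong names */
        snprintf(slot->schema_name, sizeof(slot->schema_name), "%s", schema);
        snprintf(slot->table_name, sizeof(slot->table_name), "%s", table);
        t->num_jobs++;
        return slot->job_id;
    }
    return -1;
}

/* Mark a job FAILED and store a bounded copy of the error text.
 * Returns 0 on success, -1 if no claimed slot has that id. */
static int job_fail(JobTable *t, int job_id, const char *msg)
{
    assert(t != NULL && msg != NULL);
    for (int i = 0; i < MAX_JOBS; i++) {
        JobSlot *slot = &t->jobs[i];
        if (slot->status == JOB_FREE || slot->job_id != job_id)
            continue;
        slot->status = JOB_FAILED;
        snprintf(slot->error_message, sizeof(slot->error_message), "%s", msg);
        return 0;
    }
    return -1;
}
```

A caller would zero-initialise a `JobTable`, call `job_alloc` once per requested clone, and treat a `-1` return as "no free slot". A fixed-size array keeps the layout compatible with a single up-front shared memory request, at the cost of a hard cap on concurrent jobs.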