# CHANGELOG

## 2.0.0

* **BREAKING:** minimum PostgreSQL version is now **15**. PostgreSQL 12, 13, and 14 are no longer supported. The last release compatible with those versions is `1.3.4`.

* **BREAKING:** branch `release/1.4.0` was renamed to `release/2.0.0` before merge. No functional changes relative to the `1.4.0` development line — the major version bump reflects the dropped PG 12–14 compatibility and the addition of PG 15 to the official support matrix.

* decision: **keep `storage_engine.enable_automatic_plan` (default: `on`)**

  Bench validation for the grouped `avg` shape on `bench_am_30m` (`country_code in ('BR','US')`, `GROUP BY country_code`, `ORDER BY avg(price)`) showed no stable downside for keeping automatic planning enabled.

  Observed behavior during repeated PG18 runs:
  - final plan shape remained stable between `on` and `off` for this query;
  - serial latency deltas were small and noisy (no consistent winner);
  - parallel runs tended to be slightly better with `on` on median and tail.

  Policy for 2.0.0:
  - keep `storage_engine.enable_automatic_plan = on` as the recommended default;
  - keep `off` as an operational/diagnostic escape hatch for targeted troubleshooting and benchmark comparisons.

* feature: **VectorGroupAgg parallel partial mode expanded and hardened**

  The `StorageEngineVectorGroupAgg` planner path now supports `AGGSPLIT_INITIAL_SERIAL` for low-cardinality `GROUP BY` plans with `count(*)`, `sum(...)`, `min(...)`, and `max(...)` targets, allowing the vectorized group aggregate node to run inside parallel workers and feed the native `Finalize GroupAggregate` combine step.

  This line also adds incremental `avg(int4)` support in both:
  - parallel partial path (`AGGSPLIT_INITIAL_SERIAL`), where the vectorized node emits the transition state expected by PostgreSQL finalize (`bigint[]` `[count,sum]`);
  - simple/serial path (`AGGSPLIT_SIMPLE`), where VecGroupAgg now emits numeric-compatible `avg(int4)` results directly.
  Unsupported `avg` shapes still fall back safely.

  Additional planner coverage in this cycle:
  - `avg(var::float8)` now vectorizes in `StorageEngineVectorGroupAgg` for supported base types (`int4`, `int8`, `float4`, `float8`) by tracking casted-input semantics per target and emitting the correct transition/output representation.

  In addition to the aggregate coverage expansion, this release includes correctness and stability fixes found during PG18 validation:
  - constant-key grouped plans (`numCols=0`, e.g. inferred `Var = Const`) are now handled end-to-end in planner + executor;
  - vector batch processing now respects `VectorTupleTableSlot.keep[]` so filtered-out rows are not accumulated;
  - textual group keys use the correct hash key layout (excluding pointer-like `Datum` bytes) to avoid duplicate-group mismatches;
  - planner fallback diagnostics were added via `storage_engine.debug_vectorized_groupagg_fallback` (`DEBUG1` reasons);
  - non-essential EXPLAIN telemetry was reduced while keeping debug logging.

  Validation status for this change set:
  - regression suite: **PG15 175/175, PG16–PG19 174/174 PASSED** (all five versions);
  - real plans on `bench_am_30m` show `Parallel Custom Scan (StorageEngineVectorGroupAgg)` for `count/sum/min/max` grouped shapes;
  - unsupported partial-state shapes still fall back safely to native PostgreSQL aggregate nodes.

* note: **compatibility policy update**

  `1.3.4` is now treated as the legacy line for older PostgreSQL releases. The active support matrix for the current code line remains PostgreSQL 16, 17, 18, and 19. PostgreSQL 15 is the next compatibility target under evaluation for a future major release, but it should not be advertised as officially supported until its build and regression matrix are green. PostgreSQL 12, 13, and 14 should be documented as historical compatibility via `1.3.4`, not as active support for `1.4.x` and later.
* fix: **stabilized vectorized aggregate fallback for mixed `numeric` + `money` plans**

  Plain aggregate lists that mix `numeric` and `money` inputs are now kept on PostgreSQL's native `Agg` executor instead of being rewritten to `StorageEngineVectorAgg`. The vectorized functions for `numeric` and `money` remain available and still run for supported non-mixed shapes, but this specific mixed plan shape was crashing the backend inside `numeric_avg_accum` on recent PostgreSQL builds.

  The fallback is intentionally planner-level rather than per-aggregate inside the custom node: an attempted `Aggref`-level scalar fallback still reproduced the same crash, so the safe boundary is to leave the whole plain aggregate on the native executor for now.

  Regression coverage was added for:
  - `EXPLAIN` on mixed `numeric` + `money` aggregates with vectorization enabled
  - result equality between `enable_vectorization=off` and `on`
  - successful execution of the mixed query without backend termination

  The updated regression suite completed with **PG15 175/175, PG16–PG19 174/174 PASSED** during final validation.

* feature: **VectorGroupAgg — vectorized GROUP BY aggregation for `colcompress` tables**

  `colcompress` tables can now execute `GROUP BY` queries entirely in vectorized mode without materializing individual rows. The planner hook replaces `HashAggregate → ColcompressScan` and `GroupAggregate → Sort → ColcompressScan` with a single `Custom Scan (StorageEngineVectorGroupAgg)` node that:
  - Pulls `VectorColumn` batches directly from `ColcompressScan`
  - Accumulates per-group results in an `HTAB` using native C integer/float accumulators — no PostgreSQL function call overhead per row
  - Emits one result tuple per group after a single full scan

  **Supported aggregates:** `count(*)`, `sum`, `min`, `max` over `int4`, `int8`, and `float8` columns; up to 8 aggregate targets per query.

  **Supported GROUP BY key types:** `int4`, `int8`, `float8`.

  **Supported PostgreSQL versions:** 16, 17, 18, 19.
  On PG 16, the planner generates `HashAggregate → Sort` for `GROUP BY … ORDER BY`. The hook strips the outer Sort node and enables sorted output internally (`sort_output=true`), eliminating the need for an external sort node. On PG 17–19, the planner generates `GroupAggregate (AGG_SORTED)` with an inner Sort; the hook strips the inner Sort and uses internal sorted emission.

  `sum(bigint)` returns `numeric` (OID 1700) as required by the SQL standard. The accumulator stores a native `int64` and converts to `numeric` via `DirectFunctionCall1(int8_numeric, …)` only at tuple emission time.

  `EXPLAIN ANALYZE` shows the node correctly:

  ```
  Custom Scan (StorageEngineVectorGroupAgg) (actual time=2.8..2.8 rows=5 loops=1)
    Engine Vectorized Group Aggregate: enabled
    Engine Groups Found: 5
    -> Custom Scan (ColcompressScan) on t_reg ...
  ```

* fix: **VectorGroupAgg — `EXPLAIN ANALYZE` crash on PG 16 (SIGSEGV in Sort ExplainNode)**

  When the PG 16 planner placed a Sort above HashAggregate (for ORDER BY), the mutation hook replaced HashAggregate with VecGroupAgg but left the outer Sort in place. `EXPLAIN ANALYZE` then called PG 16's `ExplainNode` on this Sort → VecGroupAgg plan, which accessed the Sort's targetlist with an out-of-range `varno` and crashed.

  Fixed by adding a `case T_Sort` handler in `PlanTreeMutator`: when the Sort's child becomes VecGroupAgg (single key, key position matches Sort key), the Sort is absorbed — VecGroupAgg takes over ORDER BY via internal sorted emission (`sort_output=true`).

* fix: **VectorGroupAgg — `EndVecGroupAgg` double `hash_seq_term` after scan exhaustion**

  When `hash_seq_search` returns `NULL` (scan exhausted, `hash_seq_term` called internally), `EndVecGroupAgg` would call `hash_seq_search` again on the already-terminated status, triggering `unregister_seq_scan: scan not found` inside PG's hash table code. The resulting `elog(ERROR)` inside cleanup code corrupted the exception stack and produced `____longjmp_chk` SIGSEGV.
  Fixed by setting `state->seq_started = false` in `ExecVecGroupAgg` immediately when `hash_seq_search` returns `NULL`.

* fix: **VectorGroupAgg — `sum(bigint)` SIGSEGV when plan has downstream Sort**

  `fill_and_store_slot` stored `int64` accumulators as raw `Int64GetDatum` even for `sum(bigint)` whose SQL return type is `numeric` (varlena, `typlen = -1`). Any downstream node that copied the tuple (Sort, Materialize) would call `VARSIZE_ANY` on the raw int64, treating it as a varlena pointer and dereferencing a garbage address.

  Fixed by tracking `result_typeoid` in `VecGroupAggTarget` and converting via `DirectFunctionCall1(int8_numeric, …)` when `result_typeoid == NUMERICOID`.

* fix: **cross-version vectorized aggregate correctness on PostgreSQL 16, 17, 18 and 19**

  The `StorageEngineVectorAgg` wrapper and its inner `Agg` node could diverge after plan mutation: the wrapper received the mutated `ColcompressScan` child, but `newAgg->plan.lefttree/righttree` could still point at the original child plan. Since execution initializes the inner `Agg`, some multi-column vectorized aggregates on PG 17 and PG 19 ran with mismatched flags and returned wrong results.

  Fixed by synchronizing the mutated child plan into both the wrapper node and the inner `Agg` plan.

* fix: **packed-slot qual evaluation and varattno mapping in vectorized scans**

  Vectorized `colcompress` scans were evaluating scalar quals against a packed slot layout while filter Vars still referenced the original table attnos. This could misread non-leading attributes and crash on varlena quals such as `LIKE` and array operators.

  Fixed by preserving full-width slot layout for qual evaluation, rebuilding packed aggregate slots only after the batch passes the filter, and resolving packed descriptors against the relation tuple descriptor instead of the projected scan slot.
* fix: **PostgreSQL API compatibility for PG16/17 bitmap scans and PG19 slot creation**

  PG 16/17 still expose exact-page TIDBitmap offsets via `TBMIterateResult->ntuples` and embedded `offsets[]`, while PG 18+ use `tbm_extract_page_tuple()`. PG 19 also required keeping the local `MakeSingleTupleTableSlot()` call site as a two-argument invocation. Version-specific guards now keep `colcompress` and `rowcompress` builds and runtime behavior aligned across PG 16, 17, 18 and 19.

* validation: **full correctness matrix is green on PG16–PG19**

  After clean per-version rebuilds, installs and cluster restarts, `python3 tests/test_suite.py` completed with `ALL 152 TESTS PASSED` on PostgreSQL 16, 17, 18 and 19.

* benchmark: **no-citus benchmark matrix rerun on PG16–PG19**

  Serial and parallel benchmark matrices were rerun without `citus` using the same workload on PostgreSQL 16, 17, 18 and 19. `colcompress` remained the best engine for most analytical queries, especially in parallel mode. Against the historical PG 18 no-citus baseline, the new PG 18 parallel results stayed close overall and improved `colcompress` on key grouped/GIN paths such as `Q7 JSONB key + GROUP BY` (`103.477 ms -> 72.506 ms`), while the serial PG 18 run regressed more broadly and should be treated as a known planner/per-version benchmark shift rather than a correctness issue.

---

## 1.3.5

* fix: **vectorized aggregates now work when `max_parallel_workers_per_gather > 0`**

  On PostgreSQL 14+ the planner hook now uses a two-pass strategy for queries with aggregates when `storage_engine.vectorization = on`:
  - **Pass 1** — plan with the original `max_parallel_workers_per_gather` setting (parallel plan, stored as fallback).
  - **Pass 2** — temporarily set `max_parallel_workers_per_gather = 0`, re-plan to get a serial `T_Agg` node with `AGGSPLIT_SIMPLE`, attempt `PlanTreeMutator` to inject `StorageEngineVectorAgg`. On success, return the vectorized serial plan.
    On failure (mixed/unsupported aggregates), `PG_CATCH` discards the serial plan and returns the Pass 1 parallel plan.

  Previously, any `max_parallel_workers_per_gather > 0` caused the planner to produce `AGGSPLIT_INITIAL_SERIAL`/`AGGSPLIT_FINAL_DESERIAL` split nodes instead of `AGGSPLIT_SIMPLE`, which the vectorization hook could not match — making `storage_engine.vectorization = on` a silent no-op for most server configurations.

  Non-aggregate queries (pure scans, index lookups) are unaffected and continue to use the Pass 1 parallel plan with zero extra planning overhead.

* fix: **server crash (SIGSEGV) on vectorized aggregate with WHERE clause**

  The two-pass planner called `standard_planner()` twice with the same `Query *` object. The first call modified the Query tree in-place; the second call (Pass 2, serial re-plan) operated on the already-consumed object, causing a segfault for any query with a WHERE clause or non-trivial qual pushdown.

  Fixed by calling `copyObject(parse)` before Pass 1 and using the fresh copy exclusively for the Pass 2 serial re-plan. Pass 1 continues to use the original `parse` pointer as before.

---

## 1.3.4

* feature: **`PROCEDURE engine.colcompress_repack(table_name regclass, min_fill_ratio float8 DEFAULT 0.9)`** — online stripe defragmentation for `colcompress` tables.

  `colcompress` uses an append-only storage model: `UPDATE` operations mark old rows as deleted (via `deleted_mask` bitmask) and append a new stripe. Over time, tables with frequent updates accumulate partially-dead stripes, wasting disk space and increasing scan cost.

  `colcompress_repack` iterates over each stripe in order and, for any stripe whose live-row ratio falls below `min_fill_ratio`, reads the live rows into a temporary table, deletes the old stripe range, and reinserts the rows. Each stripe is repacked inside its own transaction (`COMMIT` after each stripe), so the operation is crash-safe and does not hold a lock on the table between stripes.
  Only a brief `ShareUpdateExclusiveLock` is acquired per stripe cycle.

  Usage:

  ```sql
  -- default: repack stripes with < 90% live rows
  CALL engine.colcompress_repack('my_table');

  -- custom fill ratio
  CALL engine.colcompress_repack('my_table', 0.7);
  ```

* removed: **`FUNCTION engine.colcompress_repack(regclass)`** alias — previously an alias for `engine.colcompress_merge`. Removed because it caused an "is not unique" ambiguity error when calling the new 2-argument PROCEDURE with a single argument. Users who need full table rewrite with `ORDER BY` should call `engine.colcompress_merge()` directly.

  Upgrade scripts contain `DROP FUNCTION IF EXISTS engine.colcompress_repack(regclass)` to cleanly remove the alias on existing installations.

---

## 1.3.3

* fix: **`CREATE EXTENSION storage_engine` fails with "could not find function se_vfloat8pl"** — with `default_version = '1.3.2'` and no direct install script, PostgreSQL chained all upgrade scripts (1.0 → … → 1.3.2) on every fresh `CREATE EXTENSION`. At the 1.2.5→1.2.6 step the `float8`, `numeric`, and `money` vectorized-aggregate C functions were registered, but users who had an older compiled `.so` (pre-float8) saw the error above.

  Fix: added `storage_engine--1.3.3.sql` (and the backport `storage_engine--1.3.2.sql`), a self-contained direct installation script generated from the full upgrade chain. PostgreSQL now uses this single script for all fresh installs of v1.3.3, completely bypassing the chain.

  Existing users upgrading with `ALTER EXTENSION storage_engine UPDATE` are unaffected; the upgrade chain is still available for that path. No catalog changes.

---

## 1.3.2

* fix: **Stripe pruning now works for stable expressions** — predicates like `WHERE ts >= CURRENT_DATE - INTERVAL '7 days'` now trigger stripe-level min/max pruning on sorted `colcompress` tables, the same as a literal constant would.
  Previously, any non-`Const` expression on the filter's non-Var side was passed as-is to `predicate_refuted_by()`, which cannot evaluate run-time expressions and therefore never pruned any stripe.

  Fix: in `ExtractPushdownClause()`, after the volatile-function check, call `estimate_expression_value()` to fold `STABLE`/`IMMUTABLE` sub-expressions into a `Const` at plan time. If the folded `Const` type differs from the Var's opfamily input type (e.g., `timestamp` result for a `timestamptz` column), an implicit cast is inserted via `coerce_to_target_type()` and the operator is replaced with its same-type counterpart from the Var's btree opfamily so `predicate_refuted_by()` can reason about it correctly.

  Measured improvement on a 10M-row globally-sorted `colcompress` table:
  - 754/1000 stripes pruned (75%) for a 7-day window
  - Rows removed by filter per parallel worker: 943 328 → 203
  - JOIN query execution: 757 ms → 715 ms

---

## 1.3.1

* fix: **PG12 `IndexBuildCallback` second argument** — changed from `ItemPointer` to `HeapTuple` to match the PG12 API. On PG13+ the signature uses `ItemPointer`; guarded with `#if PG_VERSION_NUM < PG_VERSION_13`.

* fix: **`MemoryContextMemAllocated` availability guard** — the function exists from PG13 onward (not PG14 as previously guarded). Corrected the `#if PG_VERSION_NUM < PG_VERSION_13` guard in `pg_version_compat.h`.

* fix: **`commands/explain_format.h` include on PG < 18** — the header was split from `commands/explain.h` in PG18. Guarded the include in `engine_aggregator_node.c` with `#if PG_VERSION_NUM >= PG_VERSION_18`.

Together these three fixes restore full build compatibility for PostgreSQL 12 through 19 (all versions now compile without errors or warnings).
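The three 1.3.1 guards share one pattern: compare `PG_VERSION_NUM` against a named threshold and let the preprocessor pick the matching branch. The following is a minimal, self-contained sketch of that pattern, not the extension's code. `PG_VERSION_NUM` normally comes from PostgreSQL's headers and is hard-coded here for illustration; the threshold values are assumed to follow PostgreSQL's `major * 10000` numbering; `CallbackArgKind`, `index_build_callback_arg`, and `has_explain_format_header` are hypothetical names.

```c
/* Normally provided by PostgreSQL's pg_config.h; hard-coded for this sketch. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 120000
#endif

/* Assumed threshold values (major * 10000), named as in the changelog. */
#define PG_VERSION_13 130000
#define PG_VERSION_18 180000

/* Hypothetical stand-ins for the two possible callback argument types. */
typedef enum CallbackArgKind
{
	CALLBACK_ARG_HEAP_TUPLE,	/* PG12 API */
	CALLBACK_ARG_ITEM_POINTER	/* PG13 and later */
} CallbackArgKind;

/* Reports which branch of the IndexBuildCallback guard was compiled in. */
static CallbackArgKind
index_build_callback_arg(void)
{
#if PG_VERSION_NUM < PG_VERSION_13
	return CALLBACK_ARG_HEAP_TUPLE;
#else
	return CALLBACK_ARG_ITEM_POINTER;
#endif
}

/* Reports whether the separate explain_format.h include would be enabled. */
static int
has_explain_format_header(void)
{
#if PG_VERSION_NUM >= PG_VERSION_18
	return 1;
#else
	return 0;
#endif
}
```

Compiling the same file with `-DPG_VERSION_NUM=180000` selects the other branch in both functions, which is the property the guards rely on: the choice is made once at build time, per target server version.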
## 1.3.0

* fix: **PG14/PG15 compile error in `rowcompress_tableam.c`** — added explicit `#include "access/genam.h"` so that `SysScanDesc`, `systable_beginscan`, and `systable_getnext` are declared on PostgreSQL 14 and 15, where the header is not transitively included via `catalog/indexing.h`. On PG16–19 the build was unaffected. Reported by user after attempting a fresh build on PG14.22.

## 1.2.9

* fix: **SIGSEGV on JOIN queries with `max_parallel_workers_per_gather > 0`** (the default) against `colcompress` tables — use-after-free in `AddColumnarScanPathsRec`. The parallel path block was inside the recursive parameterized-path function, which is called once per JOIN combination. On the second call `add_partial_path()` freed the dominated path; the immediately following `create_sort_path()` then dereferenced the freed pointer → SIGSEGV during query planning. Fix: moved parallel path creation to the parent `AddColumnarScanPaths` (executes exactly once) and ensured `create_sort_path()` is called before `add_partial_path()`. Crash reproduced and verified fixed; no catalog changes required.

## 1.2.8

* feat: **`engine.uint8` — unsigned 64-bit integer type** — native fixed-size 8-byte type (`INTERNALLENGTH=8, PASSEDBYVALUE, ALIGNMENT=double`) with full unsigned semantics for storing values in the `[0, 2^64−1]` range. Motivated by ClickBench columns such as `WatchID` and `UserID` that overflow to negatives when stored as `bigint`.
  - **I/O & binary protocol**: `uint8in/out`, `uint8recv/send`
  - **Comparison operators** `<`, `<=`, `=`, `<>`, `>=`, `>` with unsigned semantics
  - **Btree operator class** (`engine.uint8_ops`) and **hash operator class** (`engine.uint8_hash_ops`) for indexing, sorting, and hashing
  - **Casts**: `uint8 ↔ bigint` (assignment), `uint8 ↔ numeric` (implicit to numeric; assignment from numeric), `uint8 ↔ text` (assignment) — `numeric → uint8` supports the full `[0, 2^64−1]` range via text path
  - **Standard aggregates** `engine.min`, `engine.max`, `engine.sum` (returns `numeric` to handle `sum > 2^63`)
  - **Vectorized aggregates** `engine.vmin`, `engine.vmax`, `engine.vsum` — dispatched automatically from `min/max/sum` on `engine.uint8` columns in `colcompress` tables via `GetVectorizedProcedureOid`
  - Sum accumulator uses `Int128AggState` (same as `bigint`) — overflow-safe for practical column sums

## 1.2.7

* fix: `vsum(smallint)` and `vsum(integer)` now return `NULL` for empty input, matching SQL standard `sum()` behaviour. Previously returned `0` due to `initcond = '0'` on the aggregate definitions. The C transition functions (`se_vint2sum`, `se_vint4sum`) also received a NULL-state guard.
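The `NULL`-for-empty-input rule from 1.2.7 can be illustrated with a small, self-contained C sketch. It is a simplified model under stated assumptions, not the extension's `se_vint4sum`: `SumState`, `sum_init`, `sum_accum`, and `sum_final` are hypothetical names, and SQL `NULL` is modelled with an `is_null` flag instead of PostgreSQL's function-call null handling. What it shows is why a zero initial condition is wrong: it makes an empty input indistinguishable from inputs that sum to 0, while a null initial state keeps the two cases apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transition state: starts "null" rather than at 0. */
typedef struct SumState
{
	bool	is_null;	/* true until the first non-NULL input arrives */
	int64_t	sum;		/* running total, wider than the int32 inputs */
} SumState;

static SumState
sum_init(void)
{
	SumState state = { true, 0 };
	return state;
}

/* Accumulate one batch; nulls may be NULL when the batch has no NULL inputs. */
static void
sum_accum(SumState *state, const int32_t *values, const bool *nulls, size_t n)
{
	assert(state != NULL);
	for (size_t i = 0; i < n; i++)
	{
		if (nulls != NULL && nulls[i])
			continue;			/* sum() ignores NULL inputs */
		state->is_null = false;
		state->sum += values[i];
	}
}

/* Returns false for SQL NULL (no non-NULL input seen); otherwise sets *result. */
static bool
sum_final(const SumState *state, int64_t *result)
{
	if (state->is_null)
		return false;
	*result = state->sum;
	return true;
}
```

With this shape, an empty batch leaves the state null, while the inputs `5, -5` produce a non-null result of 0.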
## 1.2.6

* feat: **Vectorized aggregates for `float8`, `numeric`, and `money`** — extends `StorageEngineVectorAgg` to cover three more numeric types:
  - **`float8`**: `vsum`, `vavg`, `vmin`, `vmax` (`se_vfloat8pl`, `se_vfloat8_accum`, `se_vfloat8larger`, `se_vfloat8smaller`)
  - **`numeric`**: `vsum`, `vavg`, `vmin`, `vmax` (`se_vnumericavg_accum`, `se_vnumericavg_final`, `se_vnumericsum_final`, `se_vnumericlarger`, `se_vnumericsmaller`)
  - **`money`**: `vsum`, `vmin`, `vmax` (`se_vcashpl`, `se_vcashsmaller`, `se_vcashlarger`) — no `avg` because PostgreSQL has no `avg(money)`

* fix: **Parallel aggregate correctness** — added `AGGSPLIT_SIMPLE` guard in the planner hook so the vectorized path is only substituted for non-split (non-partial) aggregates, preventing incorrect results in parallel queries.

## 1.2.5

* fix: **Restore continuous upgrade chain** — added missing upgrade scripts `1.2.1→1.2.2` and `1.2.2→1.2.3`, which caused `CREATE EXTENSION storage_engine` (and `ALTER EXTENSION ... UPDATE`) to fail with "no installation script nor update path for version" on systems that had installed any version from 1.2.1 onward. No catalog changes; upgrade from 1.2.4 is a no-op.

## 1.2.4

* feat: **Vectorized aggregates fully operational** — `vmin`, `vmax`, `vsum`, `vavg`, and `vcount` are now registered in the `engine` schema (18 C functions + 16 aggregate definitions). `SELECT min(col), max(col), sum(col), count(*) FROM colcompress_table` transparently uses `StorageEngineVectorAgg` when `storage_engine.enable_vectorization = on`, yielding ~**1.4× speedup** over standard heap-style evaluation on 1M-row tables.

* feat: **EXPLAIN ANALYZE shows VectorAgg node** — previously `IsExplainQuery` blocked vectorization for both plain `EXPLAIN` and `EXPLAIN ANALYZE`; now only plain `EXPLAIN` is blocked. `EXPLAIN ANALYZE` correctly shows `Custom Scan (StorageEngineVectorAgg)` with `Engine Vectorized Aggregate: enabled` annotation.
* feat: **Schema-qualified vectorized function lookup** — `GetVectorizedProcedureOid()` now searches `engine.vXXX` (schema-qualified) instead of unqualified names, preventing false positives with similarly-named functions in other schemas.

* fix: **NULL-safe vmin/vmax** — vectorized min/max transition functions now return `NULL` when scanning an empty result set (previously returned `INT_MIN`/`INT_MAX`). Affects `vint2smaller/larger`, `vint4smaller/larger`, `vint8smaller/larger`, `vdatesmaller/larger`.

* fix: **C function name mismatches** — `PG_FUNCTION_INFO_V1` declarations now match their `Datum` body names (`se_vXXX`) for all 9 affected functions in `aggregates.c`.

* fix: **Missing closing brace in `se_vint8smaller`** — caused a compilation error on GCC (`static declaration follows non-static declaration`) for date aggregate functions when building for PostgreSQL 18.

## 1.2.3

* fix: **`CREATE EXTENSION` failure on fresh install** — `default_version` in `storage_engine.control` was incorrectly bumped to `1.2.2` in the previous release, causing `ERROR: extension "storage_engine" has no installation script nor update path for version "1.2.2"`. The SQL extension version only needs to change when catalog objects (tables, functions, views in the `engine` schema) actually change. 1.2.2 was a C/build-only release — `default_version` has been corrected back to `1.2.1`. Reported by user after installing from PGXN.

## 1.2.2

* feat: **ZXC compression** (`compression='zxc'`) — adds support for the [ZXC asymmetric codec](https://github.com/hellobertrand/zxc) (BSD-3-Clause). Write-Once Read-Many design: encoder is slow; decoder is SIMD-maximized (NEON on ARMv8+, AVX2/AVX-512 on x86_64). Decompression throughput vs LZ4: Neoverse-V2 +24%, x86_64 AMD EPYC +18%, Apple M2 +46%. Not yet in apt — build from source. Auto-detected by `Makefile.global`.
* feat: **libdeflate compression** (`compression='deflate'`) — adds support for [libdeflate](https://github.com/ebiggers/libdeflate), a zlib-compatible codec with better throughput than the standard zlib. Available as `libdeflate-dev` on Ubuntu/Debian. Auto-detected by `Makefile.global`.

* build: **all compression libraries are now optional** — previously LZ4, ZSTD and libdeflate were hardcoded in `citus_config.h`, causing link failures on systems without those libraries. All four codecs (LZ4, ZSTD, Deflate, ZXC) are now detected dynamically at build time via `Makefile.global` header detection. The extension falls back to PostgreSQL's built-in `pglz` when no external library is present. Default precedence when available: `ZSTD > ZXC > LZ4 > Deflate > pglz`.

* bench: **aarch64 benchmark area** (`tests/bench/aarch64/`) — new directory with serial and parallel benchmark results on ARM Neoverse-N1 / Graviton2 (PostgreSQL 18.1, 1M rows). Includes results for all four compression codecs (ZSTD, LZ4, Deflate, ZXC) with comparison charts. Key finding: ZXC achieves fastest analytical read performance on aarch64 in 6/10 queries, beating even LZ4 despite slightly larger disk size (123 MB vs 118 MB), confirming its SIMD NEON advantage on ARM.

## 1.2.1

* fix: **GUC visibility** — `storage_engine.enable_vectorization`, `enable_parallel_execution`, `enable_dml`, and `enable_engine_index_scan` were registered with `GUC_NO_SHOW_ALL | GUC_NOT_IN_SAMPLE`, hiding them from `\dconfig` and psql tab-completion. Removed — all operational GUCs are now discoverable. Note: GUCs only take effect when the extension is listed in `shared_preload_libraries`.

* fix: **`-Wmissing-variable-declarations`** — `ColumnarScanPathMethods`, `ColumnarScanScanMethods`, and `ColumnarScanExecuteMethods` lacked `extern` declarations in `engine_customscan.h`, causing warnings (fatal with `-Werror`) under stricter compiler settings.
* fix: **`table_beginscan` 5-argument compile error on PG16–18** — The PG19 API added a 5th `flags` argument to `table_beginscan`. The call site in `RCScan_BeginCustomScan` is now guarded with `#if PG_VERSION_NUM >= PG_VERSION_19`. This error affected builds from the `v1.2.0` tag against PG16–18.

* fix: **`statement_timeout` cancels `engine.smart_update` / `engine.colcompress_bulk_update` mid-run** — `set_config('statement_timeout', '0', false)` was applied once before the stripe loop, but PostgreSQL resets session-level GUCs at each `COMMIT` inside a procedure. Both procedures now re-apply the timeout overrides at the top of every loop iteration.

* feat: **`engine.smart_update` parallel worker cap** — `max_parallel_workers_per_gather` is set to `max_parallel_workers / 2` at procedure start, preventing the maintenance procedure from consuming the full parallel worker pool. Integer division: 0→0 (serial), 1→0 (serial), 2→1, 4→2, 16→8.

## 1.2.0

* feat: **`index_scan` per-table option for `rowcompress`** — `rowcompress` now supports `index_scan` as a per-table flag, providing feature parity with `colcompress`. Default (`false`) keeps the analytical mode: range index paths are removed by the planner hook so queries use the batch-compressed sequential scan with batch-level min/max pruning. When set to `true`, index scans are allowed (OLTP / document-store mode). New 6th argument to `engine.alter_rowcompress_table_set()` and new 5th argument to `engine.alter_rowcompress_table_reset()`. The `engine.rowcompress_options` view now exposes the `index_scan` column. Upgrade via `ALTER EXTENSION storage_engine UPDATE TO '1.2'`.

## 1.1.5

* compat: **PostgreSQL 19 support** — `storage_engine.so` now compiles and runs on PostgreSQL 19 (devel). README compatibility table updated.

* fix: **META.json PGXN license field** — changed `license` value to the PGXN-recognized string `agpl_3`.
## 1.1.4

* fix: **`ORDER BY` silently dropped with parallel `ColcompressScan`** — When a query had `ORDER BY` and the planner chose a parallel `ColcompressScan`, PostgreSQL emitted `Gather(ColcompressScan)` without any `Sort` node above it, returning rows in arbitrary worker-completion order instead of the requested order. Root cause: `ColcompressScan` paths have `pathkeys = NIL` (columnar data has no inherent physical order), so `generate_useful_gather_paths()` found no pre-sorted partial paths and could not build `Gather Merge`. Fix: when `root->query_pathkeys != NIL`, a `Sort(ColcompressScan)` partial path is added to `partial_pathlist` alongside the unsorted one. The planner can now choose `Gather Merge(Sort(ColcompressScan))` and correctly satisfies `ORDER BY`.

* fix: **double `_PG_init()` when Citus is in `shared_preload_libraries`** — On PG15 the Citus APT package dynamically loads `citus_columnar.so` via `dlopen()` at load time, which re-entered `_PG_init()` for any co-loaded extension. This caused: `ERROR: attempt to redefine parameter "storage_engine.compression"` and `ERROR: extensible node type "ColumnarScan" already exists`. Fix: added `GetConfigOption()` early-return guard in `engine_guc_init()` and an `if (GetConfigOption(...) == NULL)` block guard in `engine_customscan_init()`, mirroring the `GetCustomScanMethods()` guard already in place for `RegisterCustomScanMethods`. The init functions are now idempotent.

## 1.1.3

* fix: **remove `citus_config.h` dependency from vendored safeclib** — `safeclib/safeclib_private.h` included `citus_config.h` (generated by Citus `./configure`), causing a fatal compile error on clean clones: `fatal error: citus_config.h: No such file or directory`. Replaced with inline `#define` macros for the standard POSIX feature flags it provided.
* fix: **suppress `-Wdeclaration-after-statement` warnings** — added `-Wno-declaration-after-statement` to `Makefile.global`; the codebase uses C99 mixed declarations which are valid for PostgreSQL extensions.

* cleanup: **remove unused static functions** — `IsIndexPath`, `RCFindBatchForRowNumber`, `rowcompress_estimate_rel_size`, and `rowcompress_relation_set_new_filenode_compat` were declared/defined but never called, producing `-Wunused-function` warnings.

## 1.1.2

* fix: **remove stray `#include "citus_version.h"` from source files** — `citus_version.h` is a file generated by the Citus `./configure` step and is not present in a clean clone. Its absence caused a fatal compile error: `fatal error: citus_version.h: No such file or directory`. Removed from all eight translation units that referenced it. The `HAVE_CITUS_LIBLZ4` macro (also defined in that header) was replaced with the standard PostgreSQL `HAVE_LIBLZ4` macro throughout.

## 1.1.1

* fix: **remove Citus autoconf build artifacts** — the root `Makefile` was the Citus 11.1devel toplevel Makefile and required `./configure` (a Citus-specific autoconf script) to be run before any build could proceed. This caused `configure: error: C compiler cannot create executables` and other Citus-specific probe failures for users with non-standard toolchains (ccache without a backing compiler, aarch64/ARM Linux, NixOS, etc.).

  The root `Makefile` is now a simple delegator to `src/backend/engine`. A portable, pre-generated `Makefile.global` is now tracked in the repository and uses `pg_config` from `PATH` — no `./configure` step is needed. The six Citus autoconf artifacts (`configure`, `configure.in`, `autogen.sh`, `aclocal.m4`, `Makefile.global.in`, `src/include/citus_config.h.in`) are removed from the repository.
  Build is now simply:

  ```bash
  sudo make -j$(nproc) install
  # or with an explicit pg_config:
  PG_CONFIG=/usr/lib/postgresql/17/bin/pg_config sudo make install
  ```

## 1.1.0

* feat: **`RowcompressScan` custom scan node with batch-level min/max pruning** — `rowcompress` tables now support a `pruning_column` parameter (`engine.alter_rowcompress_table_set(tbl, pruning_column := 'col')`). When set, `RowcompressScan` records the serialised min/max value of the pruning column per batch during `engine.rowcompress_repack()` or bulk inserts, storing them in `engine.row_batch.batch_min_value` / `batch_max_value`. At scan time, batches whose range does not intersect the query predicate are skipped entirely — no decompression, no I/O. The new GUC `storage_engine.enable_custom_scan` (default `on`) controls whether `RowcompressScan` is injected by the planner hook.

* feat: **`engine.rowcompress_repack(tbl)`** — utility function that rewrites all batches of a `rowcompress` table in sorted order by the `pruning_column`, maximising pruning efficiency for range queries (e.g. date, timestamp, bigint sequences).

* schema: **`engine.row_options.pruning_attnum`** — new nullable `int2` column; stores the 1-based attribute number of the pruning column.

* schema: **`engine.row_batch.batch_min_value` / `batch_max_value`** — new nullable `bytea` columns; store serialised type-agnostic min/max statistics per batch.

* upgrade: `ALTER EXTENSION storage_engine UPDATE TO '1.1'` applies the schema changes via `storage_engine--1.0--1.1.sql`.

## 1.0.10

* fix: **pg_search (ParadeDB) BM25 transparent compatibility** — `IsNotIndexPath` in `engine_customscan.c` now preserves `CustomPath` nodes whose `CustomName` equals `"ParadeDB Base Scan"`. Previously, `RemovePathsByPredicate(rel, IsNotIndexPath)` discarded pg_search's planner path, causing the `@@@` operator to fall through as a `Filter` inside `ColcompressScan`, which then failed with "Unsupported query shape".
  BM25 full-text search on colcompress tables now works **transparently** — no need for `SET storage_engine.enable_custom_scan = false`. `pdb.score()`, `pdb.snippet()`, `===`, and multi-field `AND @@@` all work correctly. `ColcompressScan` continues to handle all other query shapes (projection pushdown, stripe pruning, parallel scan) without change.

## 1.0.9

* docs: **pg_search 0.23 (ParadeDB) compatibility** — colcompress tables are fully compatible with pg_search BM25 full-text search. The BM25 index (`CREATE INDEX USING bm25`) works transparently via `index_fetch_tuple`; `@@@`, `===`, `pdb.score()`, and `pdb.snippet()` all function correctly. To avoid `ColcompressScan` intercepting the planner before pg_search's `ParadeDB Base Scan` path is selected, use `SET storage_engine.enable_custom_scan = false` for queries that use `@@@`. A future release will auto-detect the `@@@` operator in `ColumnarSetRelPathlistHook` and skip the hook transparently.
* docs: **native regex alternative to BM25 for analytics** — `~*` (POSIX case-insensitive regex) on colcompress tables uses `ColcompressScan` with full parallelism and stripe-level projection pushdown, achieving the same recall as BM25 at 3× lower latency (60 ms vs ~200 ms for 150k rows, 8 parallel workers). Prefer `~*` over `@@@` for counter/aggregation patterns; reserve BM25 for ranked retrieval and fuzzy matching.
* bench: updated serial and parallel benchmark results; added baseline CSV for regression tracking.

## 1.0.8

* fix: **`UPDATE` duplicate-key error on colcompress tables with unique indexes** — `engine_index_fetch_tuple` now consults the in-memory `RowMaskWriteStateMap` bitmask before falling back to `ColumnarReadRowByRowNumber` for flushed stripes.
  Previously, `engine_tuple_update()` marked the old row deleted (via `UpdateRowMask`) and immediately inserted the new version; the unique-constraint recheck via `index_fetch_tuple` read a stale pre-deletion snapshot from the B-tree entry's old TID and returned "tuple still alive", causing a spurious duplicate-key error on every `UPDATE`.
* fix: **deleted rows visible within same command** — `engine_tuple_satisfies_snapshot` now also consults `RowMaskWriteStateMap`, so rows deleted within the current transaction are correctly reported as invisible during the same command, preventing false positives in constraint checks.
* fix: **OOM crash in `engine_tuple_update` with large VARLENA columns** — `ColumnarWriteRowInternal` adds a memory-based flush guard: if the `stripeWriteContext` exceeds 256 MB (`SE_MAX_STRIPE_MEM_BYTES`), the current stripe is flushed before buffering the next row. This prevents OOM crashes when stripe row-count limits are generous but rows carry large VARLENA columns (XML, JSON, PDF).

## 1.0.7

* fix: **GIN `BitmapHeapScan` bypasses `ColcompressScan` with `random_page_cost=1.1`** — On NVMe-tuned servers (`random_page_cost=1.1`), the planner preferred a GIN `Bitmap Heap Scan` over `Custom Scan (ColcompressScan)` for analytical queries with JSONB `@>` or array `@>` predicates when `index_scan=false`. This caused a +195–237% regression in serial mode vs baseline (Q6 JSONB: 163ms→479ms, Q8 array: 123ms→414ms). Fixed by adding a `disable_cost` (1e10) penalty to every `BitmapHeapPath` in `CostColumnarPaths` when `index_scan=false`, symmetric with the existing penalty for `IndexPath`. Tables with `index_scan=true` are unaffected. Fix confirmed: serial Q6 175ms (-63%), Q8 141ms (-66%).
* fix: **`index_scan=false` gate missing in `engine_reader.c` chunk loader** — The single-chunk targeted loading optimisation (`ColumnarReadRowByRowNumber`) was activating unconditionally, including on analytics tables where `index_scan=false`.
  Added `indexScanEnabled` field to `ColumnarReadState`, populated from `ReadColumnarOptions` in `ColumnarBeginRead`, and gated the single-chunk optimisation on `readState->indexScanEnabled`.
* fix: **`BitmapHeapPath` penalty also applied to `partial_pathlist`** — parallel bitmap heap paths were not being penalised, allowing GIN scans via parallel workers to bypass `ColcompressScan` even with `index_scan=false`.
* fix: **infinite loop in index scan point lookup** — `ColumnarReadRowByRowNumber` could loop forever when the requested row number fell beyond the last stripe, producing a hang with no error output.
* fix: **index scan cost at chunk granularity** — `ColumnarIndexScanAdditionalCost` now computes `perChunkCost` instead of `perStripeCost`, eliminating the ~15× cost inflation that caused the planner to always reject `IndexScan` over `ColcompressScan` for selective point lookups on wide columnar tables.
* fix: **use projected column count in `ColumnarIndexScanAdditionalCost`** — replaced `RelationIdGetNumberOfAttributes` with `list_length(rel->reltarget->exprs)`, so wide tables with large blob columns (XML/JSON) no longer inflate index scan cost beyond the full-scan cost, restoring planner choice for `index_scan=true` tables.
* fix: **remove stray `randomAccessPenalty` from `ColumnarIndexScanAdditionalCost`** — the per-row penalty (`estimatedRows * cpu_tuple_cost * 100`) was dead code when `index_scan=false` (path already blocked by `disable_cost`) but was still evaluated when `index_scan=true`, causing the planner to always choose `SeqScan` over `IndexScan` regardless of selectivity. Removed entirely.

## 1.0.6

* fix: **`index_scan=false` bypassed by `Parallel Index Scan`** — `CostColumnarPaths` only iterated `rel->pathlist`, leaving `rel->partial_pathlist` (parallel paths) untouched. When a B-tree index existed on a colcompress table, the planner chose `Parallel Index Scan` even with `index_scan=false`, bypassing stripe pruning entirely.
  Fixed by iterating `rel->partial_pathlist` in `CostColumnarPaths` and applying `disable_cost` (1e10) to every `IndexPath` found there.
* fix: **`disable_cost` for `index_scan=false` serial paths** — replaced the proportional penalty (`estimatedRows * cpu_tuple_cost * 100.0`) with PostgreSQL's canonical `disable_cost` constant (1e10), matching the behaviour of `SET enable_indexscan = off`. The old penalty was smaller than the seq-scan cost for low-selectivity queries (~4% of rows), so the planner still preferred `IndexScan` over `ColcompressScan`.
* bench: updated serial and parallel benchmark results and charts (1M rows, PostgreSQL 18, 4 access methods).

## 1.0.5

* fix: **EXPLAIN + citus SIGSEGV** — `IsCreateTableAs(NULL)` called `strlen(NULL)` when citus passed `query_string=NULL` internally; added NULL guard. Added `IsExplainQuery` guard to skip `PlanTreeMutator` for EXPLAIN statements. Fixed `T_CustomScan` else branch to recurse into `custom_plans` instead of `elog(ERROR)`.
* fix: **stripe pruning bypassed by btree indexes** — when a btree index existed on a colcompress table, the planner chose `IndexScan` with `randomAccess=true`, which disabled stripe pruning entirely. Fixed by strengthening `ColumnarIndexScanAdditionalCost` with a per-row random-access penalty (`estimatedRows * cpu_tuple_cost * 100.0`), steering the planner back to seq scan.
* perf: **`ColumnarIndexScanAdditionalCost` per-row penalty** — discourages index scans on large colcompress tables where full-stripe pruning is more efficient.
* docs: **benchmark kit** — added `tests/bench/` with setup SQL, serial/parallel run scripts, chart generators, and result PNGs; added `BENCHMARKS.md` with full analysis.
* docs: **README** — citus load order note, btree/stripe-pruning Known Limitation, Benchmarks section, corrected install path.

## 1.0.4

* chore: bump version to 1.0.4 (PGXN meta).
* docs: benchmark results — heap vs colcompress vs rowcompress vs citus_columnar.
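
To illustrate why the switch in 1.0.6 from the proportional penalty to `disable_cost` matters, the sketch below compares the two penalties numerically. It is an arithmetic illustration in Python only, not code from the extension. The penalty formula and the `1e10` constant come from the entries above, and `cpu_tuple_cost = 0.01` is PostgreSQL's default. The row count, the base index cost, and the columnar scan cost are made-up placeholder values, and `cheapest` is a hypothetical helper.

```python
# Arithmetic sketch of the two index-scan penalties described in 1.0.5/1.0.6.
# All path costs below are hypothetical placeholders, not measured planner output.

CPU_TUPLE_COST = 0.01   # PostgreSQL default for cpu_tuple_cost
DISABLE_COST = 1.0e10   # constant quoted in the changelog entries


def proportional_penalty(estimated_rows: float,
                         cpu_tuple_cost: float = CPU_TUPLE_COST) -> float:
    """Old penalty: estimatedRows * cpu_tuple_cost * 100.0."""
    return estimated_rows * cpu_tuple_cost * 100.0


def cheapest(paths: dict[str, float]) -> str:
    """Return the name of the lowest-cost path (hypothetical helper)."""
    return min(paths, key=paths.get)


# Hypothetical query touching ~4% of a 1M-row table.
estimated_rows = 40_000
base_index_cost = 5_000.0      # assumed index path cost before any penalty
columnar_scan_cost = 60_000.0  # assumed ColcompressScan path cost

old = {
    "IndexScan": base_index_cost + proportional_penalty(estimated_rows),
    "ColcompressScan": columnar_scan_cost,
}
new = {
    "IndexScan": base_index_cost + DISABLE_COST,
    "ColcompressScan": columnar_scan_cost,
}

print(cheapest(old))  # IndexScan: the penalised index path is still cheaper
print(cheapest(new))  # ColcompressScan: disable_cost dwarfs any real path cost
```

With these placeholder numbers the proportional penalty adds only about 40,000 cost units, so it can be outweighed by an expensive full scan, whereas `1e10` cannot.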
## 1.0.3

* perf: **stripe-level min/max pruning for colcompress scans** — before reading any stripe, the scan aggregates the per-column min/max statistics from `engine.chunk` across all chunks of the stripe and tests the resulting stripe-wide ranges against the query's WHERE predicates using `predicate_refuted_by`. Any stripe whose range is provably disjoint from the predicate is skipped entirely — no decompression, no I/O. The pruned count is shown in `EXPLAIN`:

  ```
  Engine Stripes Removed by Pruning: N
  ```

  Pruning applies to both the serial scan path and the parallel DSM path (parallel workers only receive stripe IDs that survive the filter). Effectiveness scales directly with data sortedness; combine with `engine.colcompress_merge()` and the `orderby` table option to maximise it.

## 1.0.2

* fix: **index corruption during `COPY` into colcompress tables** — `engine_multi_insert` was calling `ExecInsertIndexTuples()` internally, while COPY's `CopyMultiInsertBufferFlush` also calls it after `table_multi_insert` returns. The double insertion corrupted every B-tree index on tables loaded via `COPY`. Fixed by removing all executor infrastructure from the per-tuple loop; index insertion is the caller's responsibility, matching `heap_multi_insert` semantics.
* fix: **index corruption when `orderby` and indexes coexist** — when sort-on-write is active, `ColumnarWriteRow()` buffers rows and returns `COLUMNAR_FIRST_ROW_NUMBER` (= 1) as a placeholder for every row. The executor then indexed all rows with TID `(0,1)`, making every index lookup return the first row. Fixed in `engine_init_write_state()`: sort-on-write is disabled when the target relation has `relhasindex = true`. Tables with indexes already have fast key access; sort ordering is redundant and was silently lethal.
* perf: fast `ANALYZE` via chunk-group stride sampling — samples at most `N / stride` chunk groups (`stride = max(1, nchunks / 300)`) instead of reading the entire table, so `ANALYZE` on large colcompress tables takes milliseconds instead of minutes.

> **Migration note (1.0.1 → 1.0.2):** any colcompress table that has indexes
> and was written with `COPY` or `colcompress_merge` using a prior version must
> be rebuilt: `REINDEX TABLE CONCURRENTLY ;`

## 1.0.1

* fix: `multi_insert` now sets `tts_tid` before opening indexes, and explicitly calls `ExecInsertIndexTuples()` — previously B-tree entries received garbage TIDs during `INSERT INTO ... SELECT`, causing index scans to return wrong rows. Tables populated before this fix require `REINDEX TABLE CONCURRENTLY`.
* fix: `orderby` syntax is now validated at `ALTER TABLE SET (orderby=...)` time instead of at merge time, giving an immediate error on bad input.
* fix: CustomScan node names renamed to avoid symbol collision with `columnar.so` when both extensions are loaded simultaneously.
* fix: corrected SQL function names for `se_alter_engine_table_set` / `se_alter_engine_table_reset` (C symbols were mismatched).
* fix: added `safeclib` symlink under `vendor/` so `memcpy_s` resolves correctly at link time.
* add: `META.json` for PGXN publication.

## 1.0.0

Initial release of **storage_engine** — a PostgreSQL table access method extension derived from [Hydra Columnar](https://github.com/hydradatabase/hydra) and extended with two independent access methods:

* **colcompress** — column-oriented storage with vectorized execution, parallel DSM scan, chunk pruning, and a MergeTree-style per-table sort key (`orderby`).
* **rowcompress** — row-compressed batch storage with parallel work-stealing scan and full DELETE/UPDATE support via a row-level mask.
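
The row-level mask idea can be pictured with a small sketch. This is a conceptual Python model only, not the extension's storage format. The `MaskedBatchStore` class and its methods are hypothetical names. It assumes, as the 1.0.8 notes describe for `engine_tuple_update`, that an update marks the old row deleted and then inserts a new version.

```python
# Conceptual model of DELETE/UPDATE over append-only storage using a row mask.
# MaskedBatchStore is a hypothetical name; the real extension keeps compressed
# batches/stripes on disk and tracks its mask separately.


class MaskedBatchStore:
    def __init__(self) -> None:
        self.rows: list[tuple] = []   # append-only row data, never rewritten
        self.deleted = bytearray()    # one flag byte per row: 1 = masked out

    def insert(self, row: tuple) -> int:
        """Append a row and return its row number."""
        self.rows.append(row)
        self.deleted.append(0)
        return len(self.rows) - 1

    def delete(self, row_number: int) -> None:
        """Mark the row deleted without touching the stored data."""
        self.deleted[row_number] = 1

    def update(self, row_number: int, new_row: tuple) -> int:
        """Mask the old version, then append the new one."""
        self.delete(row_number)
        return self.insert(new_row)

    def scan(self):
        """Yield only rows whose mask flag is clear."""
        for row, flag in zip(self.rows, self.deleted):
            if not flag:
                yield row


store = MaskedBatchStore()
a = store.insert((1, "alpha"))
b = store.insert((2, "beta"))
store.update(a, (1, "alpha-v2"))
store.delete(b)
print(list(store.scan()))  # [(1, 'alpha-v2')]
```

The stored rows are never modified in place: three row versions exist after these calls, and the mask alone decides which of them a scan returns.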
Additional features added beyond the upstream:

* per-table `index_scan` option (GUC `storage_engine.enable_index_scan`)
* full DELETE/UPDATE support for colcompress via row mask
* parallel columnar scan wired through DSM
* GUCs under the `storage_engine.*` namespace
* support for PostgreSQL 16, 17, and 18