# CHANGELOG

## 1.0.6

* fix: **`index_scan=false` bypassed by `Parallel Index Scan`** — `CostColumnarPaths`
  only iterated `rel->pathlist`, leaving `rel->partial_pathlist` (parallel paths)
  untouched. When a B-tree index existed on a colcompress table, the planner chose
  `Parallel Index Scan` even with `index_scan=false`, bypassing stripe pruning
  entirely. Fixed by iterating `rel->partial_pathlist` in `CostColumnarPaths` and
  applying `disable_cost` (1e10) to every `IndexPath` found there.
* fix: **`disable_cost` for `index_scan=false` serial paths** — replaced the
  proportional penalty (`estimatedRows * cpu_tuple_cost * 100.0`) with PostgreSQL's
  canonical `disable_cost` constant (1e10), matching the behaviour of
  `SET enable_indexscan = off`. The old penalty was too small for queries that
  match only a small fraction of rows (~4%): the penalized `IndexScan` still cost
  less than the seq scan, so the planner kept preferring it over `ColcompressScan`.
* bench: updated serial and parallel benchmark results and charts (1M rows,
  PostgreSQL 18, 4 access methods).
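
The two 1.0.6 fixes can be illustrated with a minimal sketch that models the planner as "pick the cheapest path". The penalty formula and the 1e10 constant come from the entries above, and 0.01 is PostgreSQL's default `cpu_tuple_cost`. Everything else is an assumption for illustration: the `Path` class, the helper names, and the base costs are hypothetical, and comparing partial paths directly against serial ones ignores the Gather step a real planner adds.

```python
from dataclasses import dataclass

DISABLE_COST = 1.0e10    # the disable_cost value quoted in the changelog
CPU_TUPLE_COST = 0.01    # PostgreSQL's default cpu_tuple_cost


@dataclass
class Path:
    kind: str            # "IndexPath" or "ColcompressScan" (labels for this sketch)
    total_cost: float


def proportional_penalty(estimated_rows):
    """Pre-1.0.6 penalty: estimatedRows * cpu_tuple_cost * 100.0."""
    return estimated_rows * CPU_TUPLE_COST * 100.0


def penalize_index_paths(pathlist, partial_pathlist, penalty):
    """Add `penalty` to every index path in both the serial and partial lists."""
    for paths in (pathlist, partial_pathlist):
        for path in paths:
            if path.kind == "IndexPath":
                path.total_cost += penalty


def cheapest(paths):
    """Stand-in for the planner: pick the lowest-cost path."""
    return min(paths, key=lambda p: p.total_cost)


def plan(penalty, include_partial):
    """Build hypothetical paths, penalize them, and return the winner's kind."""
    serial = [Path("IndexPath", 5_000.0), Path("ColcompressScan", 60_000.0)]
    partial = [Path("IndexPath", 2_000.0)]   # models a Parallel Index Scan
    penalize_index_paths(serial, partial if include_partial else [], penalty)
    return cheapest(serial + partial).kind


# Old proportional penalty for 40,000 estimated rows (4% of 1M): index still wins.
old_choice = plan(proportional_penalty(40_000), include_partial=True)
# disable_cost on the serial list only: the parallel index path slips through.
serial_only_choice = plan(DISABLE_COST, include_partial=False)
# disable_cost on both lists: the colcompress scan wins.
fixed_choice = plan(DISABLE_COST, include_partial=True)
```

With these invented costs, the first two calls still select an index path, and only the last one selects the colcompress scan, which mirrors the two bugs described above.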

## 1.0.5

* fix: **EXPLAIN + citus SIGSEGV** — `IsCreateTableAs(NULL)` called `strlen(NULL)` when
  citus passed `query_string=NULL` internally; added NULL guard. Added `IsExplainQuery`
  guard to skip `PlanTreeMutator` for EXPLAIN statements. Fixed `T_CustomScan` else
  branch to recurse into `custom_plans` instead of `elog(ERROR)`.
* fix: **stripe pruning bypassed by btree indexes** — when a btree index existed on a
  colcompress table, the planner chose `IndexScan` with `randomAccess=true`, which
  disabled stripe pruning entirely. Fixed by strengthening
  `ColumnarIndexScanAdditionalCost` with a per-row random-access penalty
  (`estimatedRows * cpu_tuple_cost * 100.0`), steering the planner back to seq scan.
* perf: **`ColumnarIndexScanAdditionalCost` per-row penalty** — discourages index scans
  on large colcompress tables where full-stripe pruning is more efficient.
* docs: **benchmark kit** — added `tests/bench/` with setup SQL, serial/parallel run
  scripts, chart generators, and result PNGs; added `BENCHMARKS.md` with full analysis.
* docs: **README** — citus load order note, btree/stripe-pruning Known Limitation,
  Benchmarks section, corrected install path.

## 1.0.4

* chore: bump version to 1.0.4 (PGXN meta).
* docs: benchmark results — heap vs colcompress vs rowcompress vs citus_columnar.

## 1.0.3

* perf: **stripe-level min/max pruning for colcompress scans** — before reading
  any stripe, the scan aggregates the per-column min/max statistics from
  `engine.chunk` across all chunks of the stripe and tests the resulting
  stripe-wide ranges against the query's WHERE predicates using
  `predicate_refuted_by`. Any stripe whose range is provably disjoint from the
  predicate is skipped entirely — no decompression, no I/O. The pruned count is
  shown in `EXPLAIN`:

  ```
  Engine Stripes Removed by Pruning: N
  ```

  Pruning applies to both the serial scan path and the parallel DSM path
  (parallel workers only receive stripe IDs that survive the filter).
  Effectiveness scales directly with data sortedness; combine with
  `engine.colcompress_merge()` and the `orderby` table option to maximise it.
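
The pruning idea can be sketched as follows. This is a simplified model, not the extension's code: it substitutes a plain interval-overlap test on a single column for `predicate_refuted_by`, and the function names and the stripe layout (a dict of stripe ID to per-chunk `(min, max)` pairs) are hypothetical.

```python
def stripe_range(chunks):
    """Aggregate per-chunk (min, max) pairs into one stripe-wide (min, max)."""
    return min(c[0] for c in chunks), max(c[1] for c in chunks)


def prune_stripes(stripes, lo, hi):
    """Apply the predicate `lo <= col <= hi` to each stripe's range.

    Returns (surviving_stripe_ids, pruned_count). A stripe is skipped only
    when its range is provably disjoint from [lo, hi].
    """
    surviving = []
    pruned = 0
    for stripe_id, chunks in stripes.items():
        smin, smax = stripe_range(chunks)
        if smax < lo or smin > hi:
            pruned += 1              # would be counted in the EXPLAIN line
        else:
            surviving.append(stripe_id)
    return surviving, pruned


# Sorted data: each stripe covers a narrow, non-overlapping range.
stripes = {
    1: [(0, 99), (100, 199)],
    2: [(200, 299), (300, 399)],
    3: [(400, 499)],
}
surviving, pruned = prune_stripes(stripes, 250, 260)
```

In this example only stripe 2 survives and two stripes are pruned. If the same values were scattered so that every stripe spanned 0 to 499, nothing could be pruned, which is why sortedness matters.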

## 1.0.2

* fix: **index corruption during `COPY` into colcompress tables** — `engine_multi_insert`
  was calling `ExecInsertIndexTuples()` internally, while COPY's
  `CopyMultiInsertBufferFlush` also calls it after `table_multi_insert` returns.
  The double insertion corrupted every B-tree index on tables loaded via `COPY`.
  Fixed by removing all executor infrastructure from the per-tuple loop; index
  insertion is the caller's responsibility, matching `heap_multi_insert` semantics.
* fix: **index corruption when `orderby` and indexes coexist** — when sort-on-write
  is active, `ColumnarWriteRow()` buffers rows and returns `COLUMNAR_FIRST_ROW_NUMBER`
  (= 1) as a placeholder for every row. The executor then indexed all rows with
  TID `(0,1)`, making every index lookup return the first row. Fixed in
  `engine_init_write_state()`: sort-on-write is disabled when the target relation
  has `relhasindex = true`. Tables with indexes already have fast key access;
  sort ordering is redundant there, and combining the two silently corrupted
  every index.
* perf: fast `ANALYZE` via chunk-group stride sampling — samples at most
  `nchunks / stride` chunk groups (`stride = max(1, nchunks / 300)`) instead of
  reading the entire table, so `ANALYZE` on large colcompress tables takes
  milliseconds instead of minutes.
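
The stride formula in the `ANALYZE` entry can be sketched as below. This assumes integer division and assumes sampling starts at chunk group 0 and takes every `stride`-th group; the function name and the `target` parameter are hypothetical.

```python
def sample_chunk_groups(nchunks, target=300):
    """Return the indexes of the chunk groups a stride sampler would visit.

    stride = max(1, nchunks // target), so small tables are read in full
    and large tables are capped at roughly `target` chunk groups.
    """
    stride = max(1, nchunks // target)
    return list(range(0, nchunks, stride))


small = sample_chunk_groups(100)      # stride 1: every chunk group is read
large = sample_chunk_groups(30_000)   # stride 100: 300 chunk groups are read
```

The number of sampled groups stays near 300 no matter how large the table grows, which is what keeps `ANALYZE` time roughly constant.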

> **Migration note (1.0.1 → 1.0.2):** any colcompress table that has indexes
> and was written with `COPY` or `colcompress_merge` by an earlier version must
> have its indexes rebuilt: `REINDEX TABLE CONCURRENTLY <table>;`

## 1.0.1

* fix: `multi_insert` now sets `tts_tid` before opening indexes, and explicitly
  calls `ExecInsertIndexTuples()` — previously B-tree entries received garbage
  TIDs during `INSERT INTO ... SELECT`, causing index scans to return wrong rows.
  Tables populated before this fix require `REINDEX TABLE CONCURRENTLY`.
* fix: `orderby` syntax is now validated at `ALTER TABLE SET (orderby=...)` time
  instead of at merge time, giving an immediate error on bad input.
* fix: CustomScan node names renamed to avoid symbol collision with `columnar.so`
  when both extensions are loaded simultaneously.
* fix: corrected SQL function names for `se_alter_engine_table_set` /
  `se_alter_engine_table_reset` (C symbols were mismatched).
* fix: added `safeclib` symlink under `vendor/` so `memcpy_s` resolves correctly
  at link time.
* add: `META.json` for PGXN publication.

## 1.0.0

Initial release of **storage_engine** — a PostgreSQL table access method extension
derived from [Hydra Columnar](https://github.com/hydradatabase/hydra) and extended
with two independent access methods:

* **colcompress** — column-oriented storage with vectorized execution, parallel
  DSM scan, chunk pruning, and a MergeTree-style per-table sort key (`orderby`).
* **rowcompress** — row-compressed batch storage with parallel work-stealing scan
  and full DELETE/UPDATE support via a row-level mask.

Features added beyond upstream:

* per-table `index_scan` option (GUC `storage_engine.enable_index_scan`)
* full DELETE/UPDATE support for colcompress via row mask
* parallel columnar scan wired through DSM
* GUCs under the `storage_engine.*` namespace
* support for PostgreSQL 16, 17, and 18