# BISCUIT: Bitmap Indexed Searching with Combinatorial Union and Intersection Techniques

## PostgreSQL Extension for High-Performance Wildcard Pattern Matching

### Version 1.0.0

## Table of Contents

- [Executive Summary](#executive-summary)
- [Technical Overview](#technical-overview)
- [Installation and Requirements](#installation-and-requirements)
- [Configuration and Setup](#configuration-and-setup)
- [Query Interface](#query-interface)
- [Pattern Matching Specification](#pattern-matching-specification)
- [CRUD Operations and Index Maintenance](#crud-operations-and-index-maintenance)
- [Architecture and Implementation](#architecture-and-implementation)
- [API Reference](#api-reference)
- [Monitoring and Diagnostics](#monitoring-and-diagnostics)
- [Optimization Guidelines](#optimization-guidelines)
- [Limitations and Constraints](#limitations-and-constraints)
- [Troubleshooting](#troubleshooting)
- [Examples and Use Cases](#examples-and-use-cases)

---

## Executive Summary

BISCUIT is a PostgreSQL C extension that provides specialized bitmap indexing for wildcard pattern matching operations. The extension addresses a fundamental limitation in traditional database indexes: efficient handling of leading wildcard queries (e.g., `LIKE '%pattern'`), which typically require full table scans.

### Key Capabilities

- **High-Performance Wildcard Matching**: Achieves O(1) to O(log n) query performance regardless of wildcard placement
- **Automatic Index Maintenance**: Transparent CRUD synchronization via PostgreSQL triggers
- **Memory-Efficient Storage**: Compressed bitmap structures using Roaring Bitmap algorithm
- **Lazy Deletion Strategy**: Deferred cleanup with intelligent batching for optimal throughput
- **Incremental Update Optimization**: Minimized reindexing overhead for similar string modifications
- **Universal Primary Key Support**: Compatible with any PostgreSQL data type
- **Production-Ready Design**: Comprehensive error handling, statistics tracking, and diagnostic capabilities

### Performance Characteristics

| Metric | Value |
|--------|-------|
| Query Latency (Simple Pattern) | O(1) to O(log n) |
| Query Latency (Complex Pattern) | O(k × m), k=parts, m=window |
| Insert Operation | O(L), L=string length |
| Update Operation (Incremental) | O(d), d=differences |
| Delete Operation | O(1) (lazy tombstone) |
| Memory Overhead | 10-30% of indexed data size |
| Supported String Length | 256 characters (indexed portion) |

---

## Technical Overview

### Problem Statement

Traditional database indexes (B-tree, GiST, GIN) exhibit suboptimal performance for wildcard pattern matching queries, particularly when wildcards appear at the beginning of patterns. A query such as `SELECT * FROM table WHERE column LIKE '%pattern%'` typically degrades to a sequential scan with O(n) complexity, where n represents the total number of rows.

### Solution Architecture

BISCUIT implements a multi-dimensional bitmap index structure that decomposes strings into positional character mappings. Each character at each position is tracked using compressed bitmaps (Roaring Bitmaps), enabling constant-time intersection operations for pattern matching.

#### Core Index Structures

1. **Positional Character Index**: Maps each character to a sorted array of position-bitmap pairs
   - Forward positions: `{0, 1, 2, ..., n}` from string start
   - Reverse positions: `{-1, -2, -3, ..., -n}` from string end
   
2. **Character Existence Cache**: Union of all positional bitmaps per character for O(1) containment queries

3. **Length Index**: Bitmap array indexed by string length for length-constrained queries

4. **Primary Key Hash Table**: O(1) lookup structure mapping primary keys to internal indices

5. **Lazy Deletion Layer**: Tombstone bitmap tracking deleted records with deferred cleanup

#### Algorithmic Approach

**Pattern Parsing**: Input patterns are decomposed into literal segments separated by wildcards (`%`). Each segment's position constraints (prefix, suffix, or floating) are analyzed.

**Query Optimization**: Pattern structure determines execution strategy:
- Single-character queries: Direct bitmap lookup
- Prefix patterns (`abc%`): Position-0 intersection with length filter
- Suffix patterns (`%xyz`): Negative-position intersection with length filter
- Containment patterns (`%abc%`): Character cache intersection
- Multi-part patterns: Windowed position matching with recursive refinement

**Bitmap Operations**: Compressed bitmaps enable efficient set operations:
- Intersection: AND operation for constraint combination
- Union: OR operation for alternative matches
- Difference: ANDNOT operation for tombstone filtering

### Data Flow

```
Query Pattern → Parse → Optimize → Bitmap Ops → Filter Tombstones → Result Set
                                                                          ↓
Insert/Update/Delete → Hash Lookup → Bitmap Update → Tombstone Management
```

---

## Installation and Requirements

### System Requirements

- **PostgreSQL Version**: 11.0 or higher
- **Operating System**: Linux, macOS, Windows (with MinGW)
- **Architecture**: x86_64, ARM64
- **Memory**: Minimum 512MB available RAM per index
- **Compiler**: GCC 4.8+, Clang 3.9+, or MSVC 2017+

### Optional Dependencies

- **CRoaring Library**: For optimized Roaring Bitmap operations (recommended)
  - If unavailable, extension falls back to built-in bitmap implementation
  - Installation: `apt-get install libroaring-dev` (Debian/Ubuntu)

### Installation Procedure

#### Method 1: PostgreSQL Extension System

```sql
CREATE EXTENSION biscuit;
```

This command must be executed by a superuser or a user with `CREATE EXTENSION` privileges.

#### Method 2: Manual Compilation

```bash
# Clone repository
git clone https://github.com/crystallinecore/biscuit.git
cd biscuit

# Compile
make

# Install (requires superuser privileges)
sudo make install

# Load extension
psql -d your_database -c "CREATE EXTENSION biscuit;"
```

#### Verification

```sql
SELECT biscuit_version();
-- Expected output: 1.0.0-Biscuit
```

---

## Configuration and Setup

### Initialization Options

#### Option 1: Automated Setup (Recommended)

The `biscuit_setup()` function provides a complete initialization workflow:

```sql
SELECT biscuit_setup(
    p_table_name TEXT,
    p_column_name TEXT,
    p_pk_column_name TEXT DEFAULT 'id'
);
```

**Parameters:**
- `p_table_name`: Target table name (required)
- `p_column_name`: Column to index (required, must be text-compatible)
- `p_pk_column_name`: Primary key column name (default: 'id')

**Operations Performed:**
1. Validates table and column existence
2. Constructs bitmap index structure
3. Generates strongly-typed wrapper functions
4. Installs AFTER triggers for INSERT, UPDATE, DELETE operations
5. Returns detailed confirmation message

**Example:**
```sql
SELECT biscuit_setup('customer_records', 'email_address', 'customer_id');
```

**Output:**
```
Biscuit index built successfully.
Created biscuit_match() and biscuit_match_rows() functions for table: customer_records
Columns: customer_id integer, email_address text
Successfully created trigger on table: customer_records
The index will now automatically update on INSERT, UPDATE, and DELETE operations.
```

#### Option 2: Manual Configuration (Advanced)

For fine-grained control, configure components individually:

**Step 1: Index Construction**
```sql
SELECT biscuit_build_index(
    table_name TEXT,
    column_name TEXT,
    pk_column_name TEXT DEFAULT 'id'
);
```

**Step 2: Function Generation**
```sql
SELECT biscuit_create_match_function();
```

This creates two query functions:
- `biscuit_match(pattern TEXT)`: Returns `SETOF table_type`
- `biscuit_match_rows(pattern TEXT)`: Returns `TABLE(pk type, value TEXT)`

**Step 3: Trigger Activation**
```sql
SELECT biscuit_enable_triggers();
```

Installs trigger named `biscuit_auto_update` on the indexed table.

### Primary Key Considerations

BISCUIT supports heterogeneous primary key types through automatic type conversion:

| PostgreSQL Type | Internal Representation | Notes |
|----------------|------------------------|-------|
| INT2, INT4, INT8 | String (optimized) | Direct conversion |
| TEXT, VARCHAR, CHAR | String (native) | No conversion overhead |
| UUID | Canonical string format | Standardized representation |
| Other types | Type output function | Uses PostgreSQL type system |

**Performance Note**: Integer primary keys receive optimized conversion paths. UUID and text types have minimal overhead. Custom types utilize PostgreSQL's type output functions.

### Index Scope and Limitations

- **Single Column Per Index**: Each index covers one text column
- **Multiple Indexes Supported**: Create separate indexes for different columns
- **Table Scope**: One active BISCUIT index per table
- **Character Limit**: Positions beyond 256 characters are not indexed (but stored)

**Example: Multiple Column Indexing**
```sql
-- Index email column
SELECT biscuit_setup('users', 'email', 'id');

-- To index another column, use a different approach
-- (current implementation: one index per table)
```

---

## Query Interface

### Function Signatures and Return Types

#### Primary Query Functions

**1. Full Tuple Retrieval (Strongly-Typed)**

```sql
biscuit_match(pattern TEXT) RETURNS SETOF table_type
```

Returns complete rows from the indexed table matching the specified pattern. Return type matches the exact schema of the indexed table.

**Characteristics:**
- Type-safe: Returns actual table type
- No column specification required
- Supports standard SQL operations (WHERE, ORDER BY, JOIN)
- Optimal for production queries

**Example:**
```sql
SELECT * FROM biscuit_match('%@example.com%');
SELECT id, email FROM biscuit_match('user%') WHERE created_at > '2024-01-01';
SELECT COUNT(*) FROM biscuit_match('%gmail%');
```

**2. Generic Tuple Retrieval**

```sql
biscuit_match_rows(pattern TEXT) RETURNS TABLE(pk primary_key_type, value TEXT)
```

Returns primary key and indexed column value pairs.

**Use Cases:**
- When only key-value pairs are needed
- Intermediate processing before joining
- Reduced data transfer overhead

**Example:**
```sql
SELECT pk, value FROM biscuit_match_rows('%pattern%');

-- Join back to original table if needed
SELECT t.*
FROM original_table t
JOIN biscuit_match_rows('%pattern%') b ON t.id = b.pk;
```

**3. Count Aggregation**

```sql
biscuit_match_count(pattern TEXT) RETURNS INTEGER
```

Returns the count of matching records without materializing result set.

**Performance:** This is the fastest query method, performing only bitmap operations without tuple retrieval.

**Example:**
```sql
SELECT biscuit_match_count('%search_term%');
```

**4. Key-Value Extraction**

```sql
biscuit_match_keys(pattern TEXT) RETURNS TABLE(pk TEXT, value TEXT)
```

Low-level function returning primary keys and indexed values as text. Used internally by higher-level functions.

**Note:** Primary keys are returned as text regardless of original type.

### Query Patterns and SQL Integration

#### Standard SQL Operations

All `biscuit_match()` results support complete SQL operations:

```sql
-- Filtering
SELECT * FROM biscuit_match('%pattern%')
WHERE status = 'active' AND created_at > NOW() - INTERVAL '30 days';

-- Ordering
SELECT * FROM biscuit_match('A%')
ORDER BY name DESC, created_at ASC
LIMIT 100;

-- Aggregation
SELECT category, COUNT(*), AVG(price)
FROM biscuit_match('%widget%')
GROUP BY category
HAVING COUNT(*) > 10;

-- Joins
SELECT p.*, c.category_name
FROM biscuit_match('%special%') p
JOIN categories c ON p.category_id = c.id;

-- Subqueries
SELECT *
FROM products
WHERE id IN (
    SELECT id FROM biscuit_match('%premium%')
);

-- Common Table Expressions
WITH matched_products AS (
    SELECT * FROM biscuit_match('%electronics%')
)
SELECT m.*, s.stock_level
FROM matched_products m
JOIN stock s ON m.id = s.product_id;
```

#### Performance Considerations

**Optimal Patterns:**
- Use bitmap index for pattern matching
- Apply additional filters after pattern match
- Leverage PostgreSQL's query planner for joins

**Query Plan Example:**
```sql
EXPLAIN ANALYZE
SELECT * FROM biscuit_match('%search%')
WHERE status = 'active';

-- Result: Bitmap operations complete in milliseconds
-- Additional filter applied to result set
```

---

## Pattern Matching Specification

### Wildcard Operators

BISCUIT implements SQL LIKE-style pattern matching with two wildcard operators:

| Operator | Semantics | Character Match | Examples |
|----------|-----------|----------------|----------|
| `%` | Zero or more characters | `[0, ∞)` | `'a%'` → "a", "ab", "abcdef" |
| `_` | Exactly one character | `1` | `'a_c'` → "abc", "axc" |

### Pattern Categories and Optimization Paths

#### 1. Exact Match Patterns

**Pattern Structure:** No wildcards
**Example:** `'exact_string'`
**Optimization:** Length bitmap intersection with position-0 character match
**Complexity:** O(1)

```sql
SELECT biscuit_match_count('hello');
-- Matches only "hello" exactly
```

#### 2. Prefix Patterns

**Pattern Structure:** Literal prefix followed by `%`
**Example:** `'prefix%'`
**Optimization:** Position-0 character sequence with length≥prefix_length filter
**Complexity:** O(log n)

```sql
SELECT * FROM biscuit_match('user_');
-- Matches: "user_admin", "user_john", etc.
```

#### 3. Suffix Patterns

**Pattern Structure:** `%` followed by literal suffix
**Example:** `'%suffix'`
**Optimization:** Negative-position character sequence with length≥suffix_length filter
**Complexity:** O(log n)

```sql
SELECT * FROM biscuit_match('%@gmail.com');
-- Matches: "john@gmail.com", "admin@gmail.com"
```

#### 4. Containment Patterns

**Pattern Structure:** `%literal%`
**Example:** `'%search%'`
**Optimization:** Character cache intersection
**Complexity:** O(1) to O(m), m=pattern_length

```sql
SELECT * FROM biscuit_match('%error%');
-- Matches any string containing "error"
```

#### 5. Complex Multi-Part Patterns

**Pattern Structure:** Multiple literal segments with wildcards
**Examples:**
- `'start%middle%end'`
- `'%part1%part2%'`
- `'a%b%c%d'`

**Optimization:** Windowed position matching with recursive refinement
**Complexity:** O(k × m), k=parts, m=search_window

```sql
SELECT * FROM biscuit_match('user_%@%.com');
-- Matches: "user_john@example.com", "user_admin@test.com"
```

#### 6. Single-Character Wildcard Patterns

**Pattern Structure:** Underscores with optional literals
**Examples:**
- `'___'` (exactly 3 characters)
- `'a_c'` (a, any char, c)
- `'___%'` (at least 3 characters)

**Optimization:** Length bitmap or position bitmap intersection
**Complexity:** O(1) to O(log n)

```sql
SELECT * FROM biscuit_match('___');
-- Matches: "abc", "xyz", "123" (any 3-character string)

SELECT * FROM biscuit_match('test_');
-- Matches: "test1", "testA", "test_" (test + any char)
```

### Pattern Syntax Reference

| Pattern | Description | Matches | Does Not Match |
|---------|-------------|---------|----------------|
| `'abc'` | Exact string | "abc" | "abcd", "xabc", "ABC" |
| `'a%'` | Starts with 'a' | "a", "abc", "a123" | "ba", "ABC" |
| `'%z'` | Ends with 'z' | "z", "xyz", "123z" | "za", "Z" |
| `'%abc%'` | Contains 'abc' | "abc", "xabcx", "123abc" | "ab", "ABC" |
| `'a%z'` | Starts 'a', ends 'z' | "az", "abcz", "a123xyz" | "za", "abc" |
| `'_'` | Single character | "a", "1", "_" | "", "ab" |
| `'___'` | Exactly 3 chars | "abc", "123", "___" | "ab", "abcd" |
| `'a_c'` | 'a' + any + 'c' | "abc", "axc", "a1c" | "ac", "abbc" |
| `'%a%b%'` | 'a' before 'b' | "ab", "axb", "xaby" | "ba", "bxa" |
| `'%'` | Any string | All non-NULL | NULL |
| `''` | Empty string | "" | Any non-empty |

### Special Cases

#### Empty Pattern
```sql
SELECT biscuit_match_count('');
-- Returns count of empty strings in indexed column
```

#### Match All
```sql
SELECT biscuit_match_count('%');
-- Returns count of all non-deleted records
```

#### Case Sensitivity

BISCUIT patterns are **case-sensitive** by default. For case-insensitive matching:

```sql
-- Option 1: Index lowercase values
CREATE TABLE users_normalized (
    id SERIAL PRIMARY KEY,
    email TEXT,
    email_lower TEXT GENERATED ALWAYS AS (LOWER(email)) STORED
);

SELECT biscuit_setup('users_normalized', 'email_lower', 'id');
SELECT * FROM biscuit_match(LOWER('%SEARCH%'));

-- Option 2: Use expression index (future enhancement)
```

#### NULL Handling

- NULL values are never indexed
- NULL column values are treated as empty strings during indexing
- Pattern matching never returns NULL values

```sql
-- NULL values are skipped
INSERT INTO table (id, text_col) VALUES (1, NULL);
SELECT biscuit_match_count('%'); 
-- Does not include row 1
```

---

## CRUD Operations and Index Maintenance

### Automatic Synchronization

When triggers are enabled via `biscuit_setup()` or `biscuit_enable_triggers()`, all DML operations automatically maintain index consistency with zero application code changes.

### INSERT Operations

**Behavior:** New records are immediately indexed and available for queries.

**Algorithm:**
1. Extract primary key and indexed column value from new tuple
2. Convert primary key to canonical string representation
3. Allocate index slot (reuse free slot if available, otherwise expand)
4. Generate positional bitmaps for each character
5. Update length index and character cache
6. Register primary key in hash table

**Complexity:** O(L), where L = string length (up to 256 characters)

**Example:**
```sql
INSERT INTO products (id, name, description)
VALUES (10001, 'Premium Widget', 'High-quality widget');

-- Index automatically updated
SELECT * FROM biscuit_match('%Widget%');
-- Returns the newly inserted row immediately
```

**Bulk Insert Performance:**
```sql
-- For large bulk inserts, consider disabling triggers
SELECT biscuit_disable_triggers();

INSERT INTO products (id, name)
SELECT generate_series(1, 1000000), 'Product ' || generate_series(1, 1000000);

-- Rebuild index after bulk operation
SELECT biscuit_build_index('products', 'name', 'id');
SELECT biscuit_enable_triggers();
```

### UPDATE Operations

**Behavior:** Updates are intelligently optimized based on string similarity.

#### Incremental Update Path

**Criteria for Incremental Update:**
- Same string length (before and after)
- Minimum length ≥ 3 characters
- Character differences < 20% of string length
- Maximum 3 changed characters

**Algorithm:**
1. Compute character-level diff between old and new values
2. Remove old characters from affected position bitmaps
3. Add new characters to affected position bitmaps
4. Update character cache for changed characters only

**Complexity:** O(d), where d = number of different characters

**Example (Incremental):**
```sql
-- Original: "john.doe@example.com"
UPDATE users SET email = 'john.doe@example.org' WHERE id = 123;
-- Only 3 characters changed (com→org), uses incremental update
```

#### Full Reindex Path

**Criteria:**
- Different string lengths
- Changes exceed incremental update threshold
- String < 3 characters

**Algorithm:**
1. Remove all position bitmaps for old value
2. Remove from length index and character cache
3. Re-index completely with new value
4. Update length index and character cache

**Complexity:** O(L), where L = new string length

**Example (Full Reindex):**
```sql
-- Original: "short"
UPDATE users SET email = 'much.longer.email@example.com' WHERE id = 123;
-- Different lengths, requires full reindex
```

**Statistics Tracking:**
```sql
SELECT biscuit_index_status();
-- Shows ratio of incremental vs full updates
-- Example: "Updates: 1000 (incr: 750, 75.0%)"
```

### DELETE Operations

**Behavior:** Implements lazy deletion with tombstone bitmaps for optimal performance.

**Algorithm (O(1) Lazy Deletion):**
1. Hash table lookup to find internal index (O(1))
2. Add index to tombstone bitmap (O(1))
3. Store index in deletion queue
4. Remove primary key from hash table
5. Add slot to free list for future reuse
6. **Defer** removal from position/length bitmaps

**Complexity:** O(1) - actual cleanup deferred

**Example:**
```sql
DELETE FROM products WHERE id = 123;
-- Returns immediately, cleanup happens later
```

#### Tombstone Cleanup Process

**Automatic Cleanup Triggers:**
- Tombstone count ≥ threshold (default: 1000)
- Manual invocation: `SELECT biscuit_cleanup();`

**Cleanup Algorithm:**
1. Batch remove tombstoned indices from all bitmaps
2. Free associated string memory
3. Clear tombstone bitmap
4. Reset deletion queue

**Cleanup Statistics:**
```sql
SELECT biscuit_index_status();
-- Pending tombstones: 856 (85.6% of deletes)
-- Total cleanups: 12
-- Items cleaned: 11442
```

**Performance Characteristics:**

| Scenario | Delete Latency | Query Overhead | Memory Overhead |
|----------|---------------|----------------|-----------------|
| No tombstones | O(1) | None | Minimal |
| < 1000 tombstones | O(1) | O(t) filter | ~t × 32 bytes |
| ≥ 1000 tombstones | O(1) + cleanup | Minimal | Minimal |

**Note:** Query operations automatically filter tombstones, so deleted records never appear in results despite deferred cleanup.

### Upsert Operations (INSERT ... ON CONFLICT)

**Behavior:** Automatically handled through INSERT and UPDATE trigger paths.

**Example:**
```sql
INSERT INTO products (id, name)
VALUES (123, 'New Product')
ON CONFLICT (id)
DO UPDATE SET name = EXCLUDED.name;

-- If row 123 exists: triggers UPDATE path
-- If row 123 doesn't exist: triggers INSERT path
```

### Transaction Semantics

**ACID Compliance:**
- **Atomicity**: Index updates commit/rollback with transaction
- **Consistency**: Index always reflects committed table state
- **Isolation**: Follows PostgreSQL's MVCC isolation levels
- **Durability**: Index persists across server restarts (rebuild required)

**Example:**
```sql
BEGIN;
    INSERT INTO products (id, name) VALUES (1, 'Product A');
    SELECT * FROM biscuit_match('%Product A%');  -- Visible in transaction
ROLLBACK;

SELECT * FROM biscuit_match('%Product A%');  -- Not visible, index reverted
```

**Current Limitation:** Index resides in shared memory and must be rebuilt after server restart.

### Trigger Management

#### Enable Triggers
```sql
SELECT biscuit_enable_triggers();
```

Creates trigger: `biscuit_auto_update` as `AFTER INSERT OR UPDATE OR DELETE FOR EACH ROW`.

#### Disable Triggers
```sql
SELECT biscuit_disable_triggers();
```

Removes triggers but preserves index structure. Use for:
- Bulk data operations
- Table truncation
- Data migration
- Testing scenarios

#### Check Trigger Status
```sql
SELECT biscuit_index_status();
-- Shows whether triggers are active
```

---

## Architecture and Implementation

### System Components

```
┌─────────────────────────────────────────────────────────────┐
│                     PostgreSQL Backend                       │
├─────────────────────────────────────────────────────────────┤
│  Query Interface  │  Trigger System  │  Memory Management   │
└─────────────────────────────────────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│                    BISCUIT Extension Layer                   │
├──────────────────────┬──────────────────┬───────────────────┤
│  Pattern Parser      │  Query Executor  │  CRUD Handler     │
│  - Wildcard analysis │  - Bitmap ops    │  - Insert/Update  │
│  - Optimization      │  - Result filter │  - Delete/Cleanup │
└──────────────────────┴──────────────────┴───────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│                     Index Data Structures                    │
├──────────────┬──────────────┬──────────────┬────────────────┤
│  Positional  │  Character   │   Length     │   PK Hash      │
│  Bitmaps     │  Cache       │   Index      │   Table        │
│  (256 pos)   │  (256 chars) │  (L bitmaps) │   (O(1))       │
└──────────────┴──────────────┴──────────────┴────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────────┐
│               Roaring Bitmap Implementation                  │
├──────────────────────────────────────────────────────────────┤
│  - Compressed bitmap storage                                 │
│  - Fast set operations (AND, OR, ANDNOT)                    │
│  - Memory-efficient sparse representation                    │
└──────────────────────────────────────────────────────────────┘
```

### Data Structure Details

#### 1. Positional Character Index

**Purpose:** Track character occurrences at specific positions

**Structure:**
```c
typedef struct {
    int pos;                  // Position in string
    RoaringBitmap *bitmap;    // Record IDs containing char at pos
} PosEntry;

typedef struct {
    PosEntry *entries;        // Sorted array of position entries
    int count;                // Number of positions tracked
    int capacity;             // Allocated capacity
} CharIndex;

CharIndex pos_idx[256];       // Forward positions (0, 1, 2, ...)
CharIndex neg_idx[256];       // Reverse positions (-1, -2, -3, ...)
```

**Example:**
For strings ["cat", "car", "dog"]:
```
pos_idx['c'][0] = {0, 1}     // 'c' at position 0 in records 0, 1
pos_idx['a'][1] = {0, 1}     // 'a' at position 1 in records 0, 1
pos_idx['t'][2] = {0}        // 't' at position 2 in record 0
neg_idx['t'][-1] = {0}       // 't' at last position in record 0
```

**Access Complexity:** Binary search over positions: O(log P), P ≤ 256

#### 2. Character Existence Cache

**Purpose:** O(1) lookup for "contains character" queries

**Structure:**
```c
RoaringBitmap *char_cache[256];  // Union of all position bitmaps per char
```

**Construction:** 
```
char_cache['a'] = pos_idx['a'][0] ∪ pos_idx['a'][1] ∪ ... ∪ pos_idx['a'][n]
```

**Use Case:** Pattern `%a%` directly queries `char_cache['a']`

#### 3. Length Index

**Purpose:** Filter records by string length

**Structure:**
```c
typedef struct {
    RoaringBitmap **length_bitmaps;  // Array indexed by length
    int max_length;                  // Maximum observed length
} LengthIndex;
```

**Example:**
```
length_bitmaps[3] = {0, 1, 2}    // Records 0, 1, 2 have length 3
length_bitmaps[5] = {3, 4}       // Records 3, 4 have length 5
```

**Use Case:** Pattern `___` queries `length_bitmaps[3]` directly

#### 4. Primary Key Hash Table

**Purpose:** O(1) mapping from primary key to internal index

**Implementation:**
```c
typedef struct {
    char pk_str[NAMEDATALEN];  // Fixed-size key (max 64 bytes)
    uint32_t idx;              // Internal record index
} PKMapEntry;

HTAB *pk_to_index;  // PostgreSQL hash table
```

**Operations:**
- Insert: `hash_search(HASH_ENTER)` - O(1)
- Lookup: `hash_search(HASH_FIND)` - O(1)
- Delete: `hash_search(HASH_REMOVE)` - O(1)

**Memory:** ~96 bytes per entry (key + value + overhead)

#### 5. Lazy Deletion System

**Purpose:** Defer expensive cleanup operations

**Structure:**
```c
typedef struct {
    RoaringBitmap *tombstones;      // Bitmap of deleted indices
    uint32_t *deleted_indices;       // Queue for cleanup
    int tombstone_count;             // Current tombstone count
    int tombstone_threshold;         // Cleanup trigger threshold
    int64 total_cleanups;            // Lifetime cleanup count
    int64 items_cleaned;             // Total items cleaned
} LazyDeletion;
```

**Workflow:**
1. **Delete**: Mark bit in tombstone bitmap (O(1))
2. **Query**: Filter results using `ANDNOT tombstones` (O(log n))
3. **Cleanup**: Batch remove when threshold reached (O(n × 256))

**Adaptive Threshold:** Adjusts based on query/delete ratio

### Memory Management

#### Allocation Strategy

**Memory Context:** Dedicated `AllocSetContext` for index structures
- Lifetime: Until explicit index rebuild or server restart
- Isolation: Independent of query contexts
- Cleanup: Automatic on context deletion

**Size Estimation:**
```
Total Memory = Base + Positional + Cache + Length + Hash + Strings

Base       = ~1 MB (structs and metadata)
Positional = ~256 chars × P positions × 64 bytes/bitmap ≈ 16P KB
Cache      = ~256 chars × 64 bytes/bitmap ≈ 16 KB
Length     = L × 64 bytes/bitmap (L = max length)
Hash       = N × 96 bytes (N = record count)
Strings    = N × avg_length × 2 (data + PK)

Example (1M records, avg 50 chars, max 100 positions):
= 1 MB + 1.6 MB + 16 KB + 6.4 KB + 96 MB + 100 MB ≈ 200 MB
```

**Compression Benefits:** Roaring Bitmaps achieve 10-50× compression for sparse data

#### Garbage Collection

**Automatic:**
- Tombstone cleanup when threshold reached
- Free list management for slot reuse

**Manual:**
```sql
SELECT biscuit_cleanup();  -- Force tombstone cleanup
```

**Memory Leak Prevention:**
- All allocations tracked in index context
- Single deallocation point on rebuild
- No external references retained

### Concurrency Model

**Index Access:** Read-optimized, single-writer model
- Queries: Concurrent read access (no locking)
- Modifications: Serialized through PostgreSQL trigger system
- Consistency: PostgreSQL's MVCC handles transaction isolation

**Limitation:** High-concurrency write workloads may experience contention at trigger level (PostgreSQL limitation, not BISCUIT-specific)

**Scalability:** Read queries scale linearly across cores

---

## Advanced Features

### Incremental Updates

For same-length strings with <20% character changes, BISCUIT uses incremental updates:

```sql
-- Original: "hello world"
-- Updated:  "hello wurld"  
-- Only updates changed positions (saves ~80% of work)
```

### Lazy Deletion with Smart Cleanup

Deletes are instant (O(1)) using tombstone bitmaps. Cleanup triggers automatically when:
- Tombstone count reaches threshold (default: 1000)
- Query/Delete ratio suggests overhead
- Manual trigger: `SELECT biscuit_cleanup();`

### Pattern Optimization

BISCUIT automatically detects and optimizes:
- Wildcard-only patterns: `%%%`, `___`
- Single-character positioned: `%a%`, `a%`, `%a`
- Prefix patterns: `abc%`
- Suffix patterns: `%xyz`
- Exact length: `___` (3 chars)

---

## API Reference

### Index Management Functions

#### biscuit_setup()

**Complete initialization function combining index build, function creation, and trigger setup.**

```sql
biscuit_setup(
    p_table_name TEXT,
    p_column_name TEXT,
    p_pk_column_name TEXT DEFAULT 'id'
) RETURNS TEXT
```

**Parameters:**
- `p_table_name`: Name of table to index (required)
- `p_column_name`: Column containing text data to index (required)
- `p_pk_column_name`: Primary key column name (default: 'id')

**Returns:** Status message with setup details

**Example:**
```sql
SELECT biscuit_setup('customers', 'email', 'customer_id');
```

**Output:**
```
Biscuit index built successfully.
Created biscuit_match() and biscuit_match_rows() functions for table: customers
Columns: customer_id integer, email text
Successfully created trigger on table: customers
The index will now automatically update on INSERT, UPDATE, and DELETE operations.
```

**Errors:**
- Primary key column not found
- Indexed column not found
- Insufficient permissions
- Table does not exist

---

#### biscuit_build_index()

**Constructs bitmap index structure from table data.**

```sql
biscuit_build_index(
    table_name TEXT,
    column_name TEXT,
    pk_column_name TEXT DEFAULT 'id'
) RETURNS BOOLEAN
```

**Parameters:**
- `table_name`: Target table name
- `column_name`: Column to index
- `pk_column_name`: Primary key column (default: 'id')

**Returns:** TRUE on success

**Behavior:**
- Drops existing index if present
- Scans entire table sequentially
- Builds all bitmap structures
- Allocates shared memory context

**Performance:** O(n × L), n=rows, L=avg string length

**Example:**
```sql
SELECT biscuit_build_index('products', 'name', 'id');
```

---

#### biscuit_create_match_function()

**Generates strongly-typed wrapper functions for queries.**

```sql
biscuit_create_match_function() RETURNS TEXT
```

**Returns:** Confirmation message with function signatures

**Generated Functions:**
1. `biscuit_match(pattern TEXT) RETURNS SETOF table_type`
2. `biscuit_match_rows(pattern TEXT) RETURNS TABLE(pk type, value TEXT)`

**Prerequisite:** Index must be built first

**Example:**
```sql
SELECT biscuit_create_match_function();
```

**Output:**
```
Created biscuit_match() and biscuit_match_rows() functions for table: products
Columns: id integer, name text, category text, price numeric
```

---

#### biscuit_enable_triggers()

**Activates automatic index maintenance triggers.**

```sql
biscuit_enable_triggers() RETURNS TEXT
```

**Returns:** Status message

**Behavior:**
- Creates AFTER INSERT OR UPDATE OR DELETE trigger
- Trigger name: `biscuit_auto_update`
- Fires FOR EACH ROW
- Calls `biscuit_trigger()` function

**Example:**
```sql
SELECT biscuit_enable_triggers();
```

---

#### biscuit_disable_triggers()

**Deactivates automatic index maintenance.**

```sql
biscuit_disable_triggers() RETURNS TEXT
```

**Returns:** Status message

**Use Cases:**
- Bulk data operations
- Table maintenance
- Performance testing
- Data migration

**Example:**
```sql
SELECT biscuit_disable_triggers();
-- Perform bulk operations
SELECT biscuit_enable_triggers();
```

---

### Query Functions

#### biscuit_match()

**Primary query interface returning complete table rows.**

```sql
biscuit_match(pattern TEXT) RETURNS SETOF table_type
```

**Parameters:**
- `pattern`: Wildcard pattern (SQL LIKE syntax)

**Returns:** Complete rows matching pattern

**Characteristics:**
- Strongly-typed return
- Full SQL compatibility
- Automatic tombstone filtering
- Optimal for production use

**Example:**
```sql
SELECT * FROM biscuit_match('%@example.com');
SELECT id, name FROM biscuit_match('Product%') WHERE price > 100;
```

---

#### biscuit_match_rows()

**Alternative query interface returning key-value pairs.**

```sql
biscuit_match_rows(pattern TEXT) RETURNS TABLE(pk primary_key_type, value TEXT)
```

**Parameters:**
- `pattern`: Wildcard pattern

**Returns:** Table of (primary_key, indexed_value) pairs

**Example:**
```sql
SELECT * FROM biscuit_match_rows('%error%');
```

---

#### biscuit_match_count()

**Efficient count-only query.**

```sql
biscuit_match_count(pattern TEXT) RETURNS INTEGER
```

**Parameters:**
- `pattern`: Wildcard pattern

**Returns:** Count of matching records

**Performance:** Fastest query method (no tuple materialization)

**Example:**
```sql
SELECT biscuit_match_count('%gmail.com%');
```

---

#### biscuit_match_keys()

**Low-level interface returning primary keys and values as text.**

```sql
biscuit_match_keys(pattern TEXT) RETURNS TABLE(pk TEXT, value TEXT)
```

**Parameters:**
- `pattern`: Wildcard pattern

**Returns:** Table of (pk_text, value_text) pairs

**Note:** Used internally by higher-level functions

**Example:**
```sql
SELECT pk::INTEGER, value FROM biscuit_match_keys('%pattern%');
```

---

### Status and Diagnostic Functions

#### biscuit_index_status()

**Comprehensive index status report.**

```sql
biscuit_index_status() RETURNS TEXT
```

**Returns:** Formatted status report including:
- Table and column information
- Record counts (active, total, free, tombstoned)
- Memory usage
- CRUD statistics
- Lazy deletion metrics
- Optimization indicators

**Example:**
```sql
SELECT biscuit_index_status();
```

**Sample Output:**
```
========================================
Biscuit Index v3 - FIXED
========================================
Table: products
Column: name
Primary Key: id
Active Records: 998567
Total Slots: 1000000
Free Slots: 1433
Tombstoned Slots: 856
Max length: 128
Memory: 18.3 MB
----------------------------------------
CRUD Statistics:
  Inserts: 1000000
  Deletes: 1433
  Updates: 125678 (incr: 89234, 71.0%)
  Queries: 45892
----------------------------------------
Lazy Deletion Status:
  Pending tombstones: 856 (59.7% of deletes)
  Cleanup threshold: 1000
  Total cleanups: 1
  Items cleaned: 577
  Query/Delete ratio: 32.02
----------------------------------------
```

---

#### biscuit_get_active_count()

**Returns active record count.**

```sql
biscuit_get_active_count() RETURNS INTEGER
```

**Returns:** Count of non-deleted records

**Example:**
```sql
SELECT biscuit_get_active_count();
-- Output: 998567
```

---

#### biscuit_get_free_slots()

**Returns available slot count for reuse.**

```sql
biscuit_get_free_slots() RETURNS INTEGER
```

**Returns:** Number of slots available for new inserts

**Example:**
```sql
SELECT biscuit_get_free_slots();
-- Output: 1433
```

---

#### biscuit_get_tombstone_count()

**Returns pending tombstone count.**

```sql
biscuit_get_tombstone_count() RETURNS INTEGER
```

**Returns:** Number of deleted records awaiting cleanup

**Use Case:** Monitor cleanup necessity

**Example:**
```sql
SELECT biscuit_get_tombstone_count();
-- Output: 856
```

---

#### biscuit_version()

**Returns extension version.**

```sql
biscuit_version() RETURNS TEXT
```

**Returns:** Version string

**Example:**
```sql
SELECT biscuit_version();
-- Output: 1.0.0-Biscuit
```

---

### Maintenance Functions

#### biscuit_cleanup()

**Manual tombstone cleanup.**

```sql
biscuit_cleanup() RETURNS TEXT
```

**Returns:** Cleanup summary

**Behavior:**
- Removes tombstoned indices from all bitmaps
- Frees associated memory
- Resets tombstone counter
- Updates cleanup statistics

**Performance:** O(n × 256), n=tombstone_count

**Example:**
```sql
SELECT biscuit_cleanup();
```

**Output:**
```
Tombstone cleanup complete:
  Cleaned: 856 tombstones
  Remaining: 0
  Total cleanups: 2
```

**When to Use:**
- Before performance-critical operations
- After bulk deletions
- When tombstone count exceeds threshold
- During maintenance windows

---

### Internal Functions (Advanced)

#### biscuit_trigger()

**Internal trigger function (do not call directly).**

```sql
biscuit_trigger() RETURNS TRIGGER
```

**Purpose:** Maintains index consistency on DML operations

**Called By:** PostgreSQL trigger system

**Handles:**
- INSERT: Adds new records to index
- UPDATE: Applies incremental or full updates
- DELETE: Marks tombstones

---

#### biscuit_match_tuples()

**Generic tuple retrieval (legacy).**

```sql
biscuit_match_tuples(pattern TEXT) RETURNS SETOF RECORD
```

**Note:** Use `biscuit_match()` instead for type-safe queries

---

### Function Error Handling

All functions implement comprehensive error handling:

**Common Errors:**
- `ERROR: Index not built` - Call `biscuit_build_index()` first
- `ERROR: NULL primary key` - Primary key cannot be NULL
- `ERROR: Column not found` - Verify table/column names
- `ERROR: Insufficient permissions` - Requires table owner or superuser

**Example Error:**
```sql
SELECT * FROM biscuit_match('%pattern%');
-- ERROR: Index not built. Call biscuit_build_index() first.
```

---

### Return Type Specifications

| Function | Return Type | Nullable | Notes |
|----------|-------------|----------|-------|
| biscuit_match() | SETOF table | No | Type matches indexed table |
| biscuit_match_rows() | TABLE(pk, TEXT) | No | PK type preserved |
| biscuit_match_count() | INTEGER | No | Always returns count ≥ 0 |
| biscuit_match_keys() | TABLE(TEXT, TEXT) | No | Both fields as text |
| biscuit_index_status() | TEXT | No | Always returns report |
| biscuit_get_*_count() | INTEGER | No | Returns 0 if no index |
| biscuit_version() | TEXT | No | Version string |

---

## Monitoring and Diagnostics

### Key Performance Indicators

#### 1. Index Health Metrics

**Active Record Count**
```sql
SELECT biscuit_get_active_count();
```
**Interpretation:**
- Should match table row count (excluding deleted)
- Discrepancy indicates synchronization issues
- Monitor trend for capacity planning

**Free Slot Count**
```sql
SELECT biscuit_get_free_slots();
```
**Interpretation:**
- Available slots for insert reuse
- High count after bulk deletes is normal
- Prevents capacity expansion overhead

**Tombstone Count**
```sql
SELECT biscuit_get_tombstone_count();
```
**Interpretation:**
- Pending deleted records
- **Critical Threshold:** >1000 (default cleanup trigger)
- **Warning Threshold:** >5000 (performance degradation)
- **Action Required:** >10000 (manual cleanup recommended)

#### 2. Operational Statistics

**Comprehensive Status**
```sql
SELECT biscuit_index_status();
```

**Key Sections:**

**A. Capacity Metrics**
```
Active Records: 998567     ← Currently queryable records
Total Slots: 1000000       ← Allocated capacity
Free Slots: 1433          ← Available for reuse
Tombstoned Slots: 856     ← Pending cleanup
```

**B. CRUD Performance**
```
Inserts: 1000000
Deletes: 1433
Updates: 125678 (incr: 89234, 71.0%)  ← Incremental update rate
Queries: 45892
```

**Incremental Update Rate:**
- **>70%**: Excellent (similar updates)
- **50-70%**: Good (moderate similarity)
- **<50%**: Normal (diverse updates)
- **<20%**: Review update patterns

**C. Lazy Deletion Health**
```
Pending tombstones: 856 (59.7% of deletes)
Cleanup threshold: 1000
Total cleanups: 1
Items cleaned: 577
Query/Delete ratio: 32.02
```

**Query/Delete Ratio:**
- **>20**: Healthy (reads dominate, lazy deletion beneficial)
- **10-20**: Balanced (cleanup overhead acceptable)
- **<10**: Write-heavy (consider more aggressive cleanup)

#### 3. Memory Utilization

```sql
SELECT 
    (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+) MB'))[1]::NUMERIC AS memory_mb;
```

**Guidelines:**
```
Memory Growth = f(record_count, avg_length, char_diversity)

Expected Ranges:
  100K records:   2-5 MB
  1M records:     15-25 MB
  10M records:    150-250 MB
  100M records:   1.5-2.5 GB

Anomalies:
  >30% of expected: Check for memory leaks or fragmentation
  <50% of expected: Excellent compression (sparse data)
```

### Monitoring Queries

#### Real-Time Health Check

```sql
-- Comprehensive health dashboard
WITH metrics AS (
    SELECT 
        biscuit_get_active_count() AS active,
        biscuit_get_free_slots() AS free,
        biscuit_get_tombstone_count() AS tombstones,
        (regexp_matches(biscuit_index_status(), 'Total Slots: ([0-9]+)'))[1]::INTEGER AS capacity,
        (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC AS memory_mb
)
SELECT 
    active,
    free,
    tombstones,
    capacity,
    memory_mb,
    ROUND(100.0 * active / capacity, 2) AS utilization_pct,
    ROUND(100.0 * tombstones / NULLIF(capacity - active, 0), 2) AS tombstone_ratio,
    CASE 
        WHEN tombstones > 10000 THEN 'CRITICAL - Cleanup Required'
        WHEN tombstones > 5000 THEN 'WARNING - Monitor Closely'
        WHEN tombstones > 1000 THEN 'NOTICE - Cleanup Soon'
        ELSE 'HEALTHY'
    END AS health_status
FROM metrics;
```

**Sample Output:**
```
 active  | free | tombstones | capacity | memory_mb | utilization_pct | tombstone_ratio | health_status
---------+------+------------+----------+-----------+-----------------+-----------------+---------------
 998567  | 1433 |        856 |  1000000 |     18.30 |           99.86 |           59.70 | HEALTHY
```

#### Performance Degradation Detection

```sql
-- Detect query performance issues
WITH baseline AS (
    SELECT 2.1 AS expected_ms  -- Baseline query time
),
current AS (
    SELECT 
        biscuit_get_tombstone_count() AS tombstones,
        -- Simulate query time (in production, measure actual queries)
        2.1 + (biscuit_get_tombstone_count()::NUMERIC / 1000) * 0.2 AS estimated_ms
)
SELECT 
    tombstones,
    estimated_ms,
    ROUND(((estimated_ms - expected_ms) / expected_ms) * 100, 2) AS degradation_pct,
    CASE 
        WHEN estimated_ms > expected_ms * 2 THEN 'ACTION REQUIRED'
        WHEN estimated_ms > expected_ms * 1.5 THEN 'DEGRADED'
        WHEN estimated_ms > expected_ms * 1.2 THEN 'ELEVATED'
        ELSE 'NORMAL'
    END AS performance_status
FROM current, baseline;
```

#### Capacity Planning

```sql
-- Project capacity needs
WITH stats AS (
    SELECT 
        biscuit_get_active_count() AS current_active,
        (regexp_matches(biscuit_index_status(), 'Total Slots: ([0-9]+)'))[1]::INTEGER AS capacity,
        (regexp_matches(biscuit_index_status(), 'Inserts: ([0-9]+)'))[1]::BIGINT AS total_inserts,
        (regexp_matches(biscuit_index_status(), 'Deletes: ([0-9]+)'))[1]::BIGINT AS total_deletes
),
growth AS (
    SELECT 
        current_active,
        capacity,
        total_inserts - total_deletes AS net_growth,
        CASE 
            WHEN total_deletes > 0 THEN total_inserts::NUMERIC / total_deletes
            ELSE NULL
        END AS insert_delete_ratio
    FROM stats
)
SELECT 
    current_active,
    capacity,
    capacity - current_active AS available,
    ROUND(100.0 * current_active / capacity, 2) AS utilization_pct,
    net_growth,
    insert_delete_ratio,
    CASE 
        WHEN current_active::NUMERIC / capacity > 0.9 THEN 'EXPAND SOON'
        WHEN current_active::NUMERIC / capacity > 0.8 THEN 'MONITOR'
        ELSE 'ADEQUATE'
    END AS capacity_status
FROM growth;
```

### Alerting Thresholds

#### Critical Alerts

```sql
-- Immediate action required
DO $
DECLARE
    tombstone_count INTEGER;
    active_count INTEGER;
    capacity INTEGER;
BEGIN
    tombstone_count := biscuit_get_tombstone_count();
    active_count := biscuit_get_active_count();
    capacity := (regexp_matches(biscuit_index_status(), 'Total Slots: ([0-9]+)'))[1]::INTEGER;
    
    IF tombstone_count > 10000 THEN
        RAISE WARNING 'CRITICAL: Tombstone count % exceeds 10,000. Run biscuit_cleanup() immediately.', tombstone_count;
    END IF;
    
    IF active_count::NUMERIC / capacity > 0.95 THEN
        RAISE WARNING 'CRITICAL: Index utilization %.2f%% exceeds 95%%. Capacity expansion imminent.', 
                      100.0 * active_count / capacity;
    END IF;
END $;
```

#### Warning Alerts

```sql
-- Monitor closely, plan maintenance
DO $
DECLARE
    tombstone_count INTEGER;
    memory_mb NUMERIC;
BEGIN
    tombstone_count := biscuit_get_tombstone_count();
    memory_mb := (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC;
    
    IF tombstone_count BETWEEN 5000 AND 10000 THEN
        RAISE NOTICE 'WARNING: Tombstone count % in elevated range. Schedule cleanup.', tombstone_count;
    END IF;
    
    -- Assuming 1M records should use ~20MB
    IF memory_mb > 30 THEN
        RAISE NOTICE 'WARNING: Memory usage %.2f MB higher than expected. Check for leaks.', memory_mb;
    END IF;
END $;
```

### Diagnostic Procedures

#### Issue: Slow Query Performance

**Diagnosis:**
```sql
-- Check tombstone overhead
SELECT biscuit_get_tombstone_count();

-- Check memory usage
SELECT (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC;

-- Review pattern complexity
EXPLAIN ANALYZE SELECT * FROM biscuit_match('%complex%pattern%');
```

**Resolution:**
```sql
-- If tombstones > 5000
SELECT biscuit_cleanup();

-- If memory excessive
SELECT biscuit_build_index('table', 'column', 'pk');  -- Rebuild
```

#### Issue: Index Out of Sync

**Diagnosis:**
```sql
-- Compare counts
SELECT 
    COUNT(*) AS table_count,
    biscuit_get_active_count() AS index_count
FROM your_table;

-- Check trigger status
SELECT tgname, tgenabled 
FROM pg_trigger 
WHERE tgname = 'biscuit_auto_update';
```

**Resolution:**
```sql
-- Rebuild index
SELECT biscuit_build_index('table', 'column', 'pk');
SELECT biscuit_enable_triggers();
```

#### Issue: High Memory Usage

**Diagnosis:**
```sql
-- Detailed memory breakdown
SELECT biscuit_index_status();

-- Check for long strings
SELECT MAX(LENGTH(column)), AVG(LENGTH(column)) FROM table;
```

**Resolution:**
```sql
-- Rebuild to compact
SELECT biscuit_build_index('table', 'column', 'pk');

-- Consider limiting indexed length (future enhancement)
```

### Logging and Audit Trail

#### Enable PostgreSQL Logging

```sql
-- Log slow queries
SET log_min_duration_statement = 100;  -- Log queries >100ms

-- Log BISCUIT operations
SET client_min_messages = NOTICE;

-- View logs
SELECT * FROM pg_stat_statements 
WHERE query LIKE '%biscuit%' 
ORDER BY mean_exec_time DESC;
```

#### Custom Monitoring Table

```sql
-- Create monitoring log
CREATE TABLE biscuit_metrics (
    logged_at TIMESTAMPTZ DEFAULT NOW(),
    active_count INTEGER,
    free_slots INTEGER,
    tombstone_count INTEGER,
    memory_mb NUMERIC,
    query_count BIGINT,
    insert_count BIGINT,
    update_count BIGINT,
    delete_count BIGINT
);

-- Periodic logging (via cron or pg_cron)
INSERT INTO biscuit_metrics
SELECT 
    NOW(),
    biscuit_get_active_count(),
    biscuit_get_free_slots(),
    biscuit_get_tombstone_count(),
    (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC,
    (regexp_matches(biscuit_index_status(), 'Queries: ([0-9]+)'))[1]::BIGINT,
    (regexp_matches(biscuit_index_status(), 'Inserts: ([0-9]+)'))[1]::BIGINT,
    (regexp_matches(biscuit_index_status(), 'Updates: ([0-9]+)'))[1]::BIGINT,
    (regexp_matches(biscuit_index_status(), 'Deletes: ([0-9]+)'))[1]::BIGINT;

-- Analyze trends
SELECT 
    DATE_TRUNC('hour', logged_at) AS hour,
    AVG(tombstone_count) AS avg_tombstones,
    MAX(tombstone_count) AS max_tombstones,
    AVG(memory_mb) AS avg_memory
FROM biscuit_metrics
WHERE logged_at > NOW() - INTERVAL '7 days'
GROUP BY 1
ORDER BY 1 DESC;
```

---

## Optimization Guidelines

### Query Optimization

#### 1. Pattern Design Best Practices

**Optimal Patterns:**
```sql
-- ✓ GOOD: Specific prefix
SELECT * FROM biscuit_match('user_admin%');

-- ✓ GOOD: Specific suffix  
SELECT * FROM biscuit_match('%@company.com');

-- ✓ GOOD: Single containment
SELECT * FROM biscuit_match('%error%');

-- ⚠ ACCEPTABLE: Two-part pattern
SELECT * FROM biscuit_match('%user%admin%');

-- ⚠ SLOWER: Three-part pattern
SELECT * FROM biscuit_match('%a%b%c%');

-- ✗ AVOID: Excessive parts (4+)
SELECT * FROM biscuit_match('%a%b%c%d%e%');
```

**Pattern Complexity Impact:**
```
Parts    Query Time    Recommendation
-----    ----------    --------------
1        ~2ms          Optimal
2        ~5ms          Excellent
3        ~12ms         Good
4        ~25ms         Acceptable
5+       ~50ms+        Avoid if possible
```

#### 2. Query Structure Optimization

**Use Count for Existence Checks:**
```sql
-- ✗ AVOID: Materializing entire result set
SELECT COUNT(*) FROM biscuit_match('%pattern%');

-- ✓ BETTER: Direct count
SELECT biscuit_match_count('%pattern%');
```

**Apply Filters After Pattern Match:**
```sql
-- ✓ OPTIMAL: Pattern match first, then filter
SELECT * FROM biscuit_match('%@gmail.com%')
WHERE created_at > '2024-01-01' 
AND status = 'active';

-- ✗ SUBOPTIMAL: Complex joins before pattern match
SELECT * 
FROM complex_view v
WHERE v.email LIKE '%@gmail.com%';  -- Falls back to sequential scan
```

**Leverage Result Reuse:**
```sql
-- ✓ OPTIMAL: Use CTE for multiple operations on same pattern
WITH matches AS (
    SELECT * FROM biscuit_match('%search%')
)
SELECT 
    (SELECT COUNT(*) FROM matches) AS total,
    (SELECT COUNT(*) FROM matches WHERE category = 'A') AS category_a,
    (SELECT AVG(price) FROM matches) AS avg_price;
```

#### 3. Index Selectivity Considerations

**High Selectivity (Few Matches):**
```sql
-- Excellent: ~0.1% selectivity
SELECT * FROM biscuit_match('unique_prefix_%');
```

**Low Selectivity (Many Matches):**
```sql
-- Consider adding filters or reformulating query
SELECT * FROM biscuit_match('%')  -- Returns everything
WHERE additional_criteria;
```

**Selectivity Guidelines:**
```
Match %       Query Performance    Recommendation
-------       -----------------    --------------
<1%           Excellent            Ideal use case
1-10%         Very Good            Well-suited
10-30%        Good                 Acceptable
30-50%        Fair                 Add filters
>50%          Poor                 Reconsider approach
```

### CRUD Optimization

#### 1. Bulk Insert Strategy

**Small Batches (<1000 rows):**
```sql
-- Keep triggers enabled
INSERT INTO products (id, name) VALUES
    (1, 'Product 1'),
    (2, 'Product 2'),
    ...
    (1000, 'Product 1000');
-- Index updates happen automatically
```

**Large Batches (>10,000 rows):**
```sql
-- Disable triggers, bulk insert, rebuild
SELECT biscuit_disable_triggers();

COPY products FROM '/data/products.csv' CSV;
-- or
INSERT INTO products SELECT * FROM staging_table;

SELECT biscuit_build_index('products', 'name', 'id');
SELECT biscuit_enable_triggers();
```

**Performance Comparison:**
```
10,000 records:
  With triggers:    ~2 seconds
  Disable/rebuild:  ~1 second (50% faster)

100,000 records:
  With triggers:    ~20 seconds  
  Disable/rebuild:  ~8 seconds (60% faster)

1,000,000 records:
  With triggers:    ~220 seconds
  Disable/rebuild:  ~85 seconds (61% faster)
```

#### 2. Update Optimization

**Maximize Incremental Updates:**
```sql
-- ✓ GOOD: Same-length updates (incremental)
UPDATE users SET email = 'newemail@example.com'  -- 21 chars
WHERE email = 'oldemail@example.com';             -- 21 chars

-- ⚠ ACCEPTABLE: Different-length (full reindex)
UPDATE users SET email = 'verylongemail@example.com'  -- 27 chars
WHERE email = 'short@ex.co';                          -- 11 chars
```

**Batch Updates:**
```sql
-- For large update batches
SELECT biscuit_disable_triggers();

UPDATE products SET name = UPPER(name);

SELECT biscuit_build_index('products', 'name', 'id');
SELECT biscuit_enable_triggers();
```

#### 3. Delete Strategy

**Individual Deletes:**
```sql
-- Lazy deletion is O(1), no optimization needed
DELETE FROM products WHERE id = 123;
```

**Bulk Deletes:**
```sql
-- Option 1: Let lazy deletion handle (recommended for <10K deletes)
DELETE FROM products WHERE category = 'obsolete';

-- Option 2: Disable/rebuild (for >100K deletes)
SELECT biscuit_disable_triggers();
DELETE FROM products WHERE created_at < '2020-01-01';
SELECT biscuit_build_index('products', 'name', 'id');
SELECT biscuit_enable_triggers();
```

**Cleanup Scheduling:**
```sql
-- Schedule periodic cleanup during low-traffic hours
-- Example: Every night at 2 AM via pg_cron
SELECT cron.schedule('biscuit-cleanup', '0 2 * * *', 
    'SELECT biscuit_cleanup()');
```

### Memory Optimization

#### 1. Monitor Memory Growth

```sql
-- Track memory over time
SELECT 
    NOW() AS checked_at,
    biscuit_get_active_count() AS records,
    (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC AS mb,
    ROUND((regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC 
          / NULLIF(biscuit_get_active_count(), 0) * 1024, 2) AS kb_per_record;
```

#### 2. Memory Reduction Strategies

**Rebuild Compaction:**
```sql
-- Periodic rebuild to compact fragmentation
SELECT biscuit_build_index('table', 'column', 'pk');
```

**Character Diversity Reduction (Application Level):**
```sql
-- Normalize data before indexing
ALTER TABLE users ADD COLUMN email_normalized TEXT;
UPDATE users SET email_normalized = LOWER(email);
SELECT biscuit_setup('users', 'email_normalized', 'id');
```

### Maintenance Optimization

#### 1. Tombstone Management

**Adaptive Threshold Tuning:**
```sql
-- Current threshold: 1000 (default)
-- Adjust based on workload:

-- Read-heavy (Q/D ratio > 20): Increase threshold
-- Write-heavy (Q/D ratio < 10): Decrease threshold
-- Balanced: Keep default

-- Manual threshold adjustment (via C code recompilation)
#define TOMBSTONE_CLEANUP_THRESHOLD 5000  // Adjust value
```

**Proactive Cleanup:**
```sql
-- Before performance-critical operations
SELECT biscuit_cleanup();

-- Before maintenance windows
SELECT biscuit_cleanup();
```

#### 2. Index Rebuild Strategy

**When to Rebuild:**
- After bulk operations (>10% of table size)
- Memory usage exceeds expectations by >50%
- After schema changes
- Periodic maintenance (monthly/quarterly)

**Rebuild Procedure:**
```sql
BEGIN;
    SELECT biscuit_disable_triggers();
    -- Perform any table maintenance
    SELECT biscuit_build_index('table', 'column', 'pk');
    SELECT biscuit_enable_triggers();
COMMIT;
```

### Concurrency Optimization

#### 1. Read Concurrency

**Queries are lock-free:**
```sql
-- Multiple concurrent queries: No contention
SELECT * FROM biscuit_match('%pattern1%');  -- Session 1
SELECT * FROM biscuit_match('%pattern2%');  -- Session 2
SELECT * FROM biscuit_match('%pattern3%');  -- Session 3
-- All execute concurrently without blocking
```

#### 2. Write Concurrency

**Minimize trigger overhead:**
```sql
-- ✓ OPTIMAL: Batch small inserts
INSERT INTO products SELECT * FROM new_products_batch;

-- ⚠ SUBOPTIMAL: Individual inserts in loop
-- Each insert triggers index update
```

**Transaction Batching:**
```sql
-- Group related operations
BEGIN;
    INSERT INTO products VALUES (...);
    UPDATE products SET name = ... WHERE id = ...;
    DELETE FROM products WHERE id = ...;
COMMIT;
-- All index updates happen at commit
```

### Performance Testing

#### Benchmark Query Performance

```sql
-- Timing template
\timing on
SELECT * FROM biscuit_match('%pattern%');
\timing off

-- With EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS) 
SELECT * FROM biscuit_match('%pattern%');
```

#### Compare with Traditional Methods

```sql
-- BISCUIT
\timing on
SELECT COUNT(*) FROM biscuit_match('%search%');
\timing off

-- Traditional LIKE
\timing on
SELECT COUNT(*) FROM table WHERE column LIKE '%search%';
\timing off

-- GIN Index
\timing on  
SELECT COUNT(*) FROM table WHERE column ILIKE '%search%';
\timing off
```


```sql
-- ✗ NOT SUPPORTED
SELECT * FROM biscuit_match('[0-9]+');
SELECT * FROM biscuit_match('(abc|xyz)');

-- ✓ EQUIVALENT WORKAROUNDS
SELECT * FROM biscuit_match('abc%') 
UNION
SELECT * FROM biscuit_match('xyz%');
```


### Architectural Constraints

#### 1. Memory Residency

**Index Location:** Shared memory (non-persistent)

**Implications:**
- Index lost on server restart
- Must rebuild after restart
- No automatic persistence

**Startup Procedure:**
```sql
-- Add to database initialization script
\c your_database
SELECT biscuit_setup('table1', 'column1', 'pk1');
SELECT biscuit_setup('table2', 'column2', 'pk2');
```

#### 2. Single Index Instance

**Limitation:** Only one BISCUIT index per PostgreSQL instance

**Implications:**
```sql
-- ✗ Cannot maintain multiple indexes simultaneously
SELECT biscuit_setup('table1', 'col1', 'id');
SELECT biscuit_setup('table2', 'col2', 'id');  -- Replaces table1 index

-- ✓ Workaround: Separate databases
CREATE DATABASE db1;
\c db1
SELECT biscuit_setup('table', 'column', 'id');

CREATE DATABASE db2;
\c db2
SELECT biscuit_setup('table', 'column', 'id');
```

#### 3. Transaction Isolation

**Current Behavior:** Index updates commit with transaction

**Limitation:** No snapshot isolation for queries

```sql
BEGIN;
    INSERT INTO products VALUES (1, 'New Product');
    -- Index updated in current transaction
    SELECT * FROM biscuit_match('%New%');  -- Sees new product
ROLLBACK;

-- Index reverted (product removed)
SELECT * FROM biscuit_match('%New%');  -- No longer sees product
```
**Note:** This is expected behavior and consistent with PostgreSQL's MVCC model at the trigger level.


####  4. Single Column Per Index

**Current Design:** One index per table, one column per index

```sql
-- ✗ Cannot index multiple columns simultaneously
-- Must choose one column

-- ✓ Workaround: Concatenated column
ALTER TABLE products ADD COLUMN search_text TEXT
    GENERATED ALWAYS AS (name || ' ' || description || ' ' || sku) STORED;
SELECT biscuit_setup('products', 'search_text', 'id');
```

#### 5. NULL Handling

**Behavior:** NULL values treated as empty strings

```sql
INSERT INTO products (id, name) VALUES (1, NULL);

-- NULL treated as ''
SELECT * FROM biscuit_match('%');     -- Includes row 1
SELECT * FROM biscuit_match('%%');    -- Includes row 1
SELECT biscuit_match_count('');       -- Counts row 1 if name is NULL
```


### Performance Limitations

#### 1. Pattern Complexity Scaling

**Complexity:** O(k × m), k = pattern parts, m = search window

```sql
-- Linear growth in query time with pattern parts
SELECT biscuit_match('%a%');              -- ~2ms
SELECT biscuit_match('%a%b%');            -- ~5ms
SELECT biscuit_match('%a%b%c%');          -- ~12ms
SELECT biscuit_match('%a%b%c%d%');        -- ~25ms
SELECT biscuit_match('%a%b%c%d%e%');      -- ~52ms
```

**Recommendation:** Limit patterns to 3-4 parts for optimal performance

#### 2. Write Concurrency

**Limitation:** Trigger-based updates serialize at row level

**Impact:**
```
High-concurrency writes (>10K/sec):
  May experience contention
  Trigger overhead accumulates
  
Solution:
  Batch writes
  Disable triggers for bulk operations
  Consider application-level queue
```

#### 3. Memory Growth

**Behavior:** Memory grows with:
- Record count
- String length diversity
- Character diversity
- Tombstone accumulation

```sql
-- Memory bounds
Minimum: 10% of indexed data size
Maximum: 50% of indexed data size
Typical: 20-25% of indexed data size
```

**Mitigation:**
```sql
-- Periodic rebuild for compaction
SELECT biscuit_build_index('table', 'column', 'pk');

-- Cleanup tombstones
SELECT biscuit_cleanup();
```

### Compatibility Constraints

#### 1. PostgreSQL Version

**Minimum:** PostgreSQL 11.0

**Tested Versions:**
- PostgreSQL 11.x: ✓ Supported
- PostgreSQL 12.x: ✓ Supported
- PostgreSQL 13.x: ✓ Supported
- PostgreSQL 14.x: ✓ Supported
- PostgreSQL 15.x: ✓ Supported
- PostgreSQL 16.x: ✓ Supported (expected)

#### 2. Data Type Support

**Primary Key Types:**
- Fully Supported: INT2, INT4, INT8, TEXT, VARCHAR, CHAR, UUID
- Generic Support: All types with output functions
- Not Supported: Composite types, arrays (without custom casting)

**Indexed Column Types:**
- Required: TEXT, VARCHAR, CHAR, or castable to TEXT
- Not Supported: Binary types (BYTEA), JSON, arrays without casting

#### 3. Extension Dependencies

**Required:**
- PostgreSQL server-side C API
- Standard PostgreSQL libraries

**Optional:**
- CRoaring library (for optimized bitmaps)
- Falls back to built-in implementation if unavailable

### Security Considerations

#### 1. Permissions

**Required Privileges:**
```sql
-- Index creation requires
GRANT CREATE ON DATABASE TO user;        -- For extension
GRANT ALL ON TABLE target_table TO user; -- For triggers

-- Query execution requires
GRANT SELECT ON TABLE target_table TO user;
```

#### 2. SQL Injection

**Safe Query Patterns:**
```sql
-- ✓ SAFE: Parameterized queries
PREPARE search_stmt AS 
    SELECT * FROM biscuit_match($1);
EXECUTE search_stmt('%user_input%');

-- ✓ SAFE: Application sanitization
pattern = sanitize_input(user_input)
query = f"SELECT * FROM biscuit_match('{pattern}')"
```

**Unsafe Patterns:**
```sql
-- ✗ UNSAFE: Direct string concatenation
SELECT * FROM biscuit_match('%' || user_input || '%');
```

#### 3. Resource Exhaustion

**Potential Vectors:**
- Complex patterns (>5 parts): High CPU usage
- Very long patterns: Memory allocation
- Rapid index rebuilds: Memory fragmentation

**Mitigation:**
```sql
-- Rate limit index rebuilds
-- Validate pattern complexity before execution
-- Set statement timeout
SET statement_timeout = '5s';
```

### Known Issues

#### 1. Index Persistence

**Issue:** Index not persistent across restarts

**Status:** Design limitation (in-memory structure)

**Workaround:** Rebuild on startup via initialization script

#### 2. Multiple Index Support

**Issue:** Only one index per PostgreSQL instance

**Status:** Architectural limitation

**Workaround:** Use multiple databases or implement queuing

#### 3. Incremental Update Threshold

**Issue:** Updates with >20% character changes trigger full reindex

**Status:** Configurable threshold (requires C code modification)

**Workaround:** Design application to minimize large updates

### Future Enhancements (Roadmap)

**Planned:**
- Persistent index storage
- Multiple concurrent indexes
- Case-insensitive mode option
- Configurable position limit (>256 chars)
- Pattern query optimizer
- Index statistics collection

**Under Consideration:**
- Fuzzy matching support
- Phonetic matching
- Unicode normalization
- Multi-column composite indexes
- Parallel query execution
- Incremental checkpoint/restore

---

## Troubleshooting

### Common Issues and Solutions

#### Issue 1: "Index not built" Error

**Symptom:**
```sql
SELECT * FROM biscuit_match('%pattern%');
ERROR: Index not built. Call biscuit_build_index() first.
```

**Cause:** Index not initialized or server restarted

**Solution:**
```sql
-- Option 1: Quick setup
SELECT biscuit_setup('table_name', 'column_name', 'pk_column');

-- Option 2: Manual build
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
SELECT biscuit_create_match_function();
SELECT biscuit_enable_triggers();
```

**Prevention:**
```sql
-- Add to database startup script
-- File: /etc/postgresql/14/main/init_biscuit.sql
\c your_database
SELECT biscuit_setup('table1', 'column1', 'pk1');
```

---

#### Issue 2: Query Returns No Results

**Symptom:**
```sql
SELECT biscuit_match_count('%known_value%');
-- Returns: 0 (expected >0)
```

**Diagnosis:**
```sql
-- Check if value exists in table
SELECT COUNT(*) FROM table_name WHERE column_name LIKE '%known_value%';

-- Check index status
SELECT biscuit_index_status();

-- Check trigger status
SELECT tgname, tgenabled FROM pg_trigger WHERE tgname = 'biscuit_auto_update';
```

**Possible Causes and Solutions:**

**A. Index Out of Sync**
```sql
-- Rebuild index
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
```

**B. Case Sensitivity**
```sql
-- Pattern is case-sensitive
SELECT * FROM biscuit_match('%VALUE%');  -- Won't match "value"

-- Solution: Normalize
SELECT * FROM biscuit_match(LOWER('%VALUE%'));
```

**C. Pattern Beyond 256 Characters**
```sql
-- Match occurs after position 256
SELECT LENGTH(column_name) FROM table_name WHERE id = 123;
-- Returns: 280

-- Solution: Pattern must be within first 256 chars
```

---

#### Issue 3: Slow Query Performance

**Symptom:**
```sql
\timing on
SELECT * FROM biscuit_match('%pattern%');
Time: 850.234 ms
-- Expected: <10ms
```

**Diagnosis:**
```sql
-- Check tombstone count
SELECT biscuit_get_tombstone_count();
-- If >5000: Cleanup needed

-- Check pattern complexity
-- Count % symbols
SELECT regexp_count('%a%b%c%d%', '%');
-- If >4: Pattern too complex

-- Check memory usage
SELECT biscuit_index_status();
```

**Solutions:**

**A. High Tombstone Count**
```sql
SELECT biscuit_cleanup();
-- Should restore normal performance
```

**B. Complex Pattern**
```sql
-- Simplify pattern
-- From: SELECT * FROM biscuit_match('%a%b%c%d%e%');
-- To:   SELECT * FROM biscuit_match('%abc%de%');
```

**C. Large Result Set**
```sql
-- Check selectivity
SELECT biscuit_match_count('%pattern%');
-- If >50% of table: Not ideal for index

-- Solution: Add additional filters
SELECT * FROM biscuit_match('%pattern%')
WHERE created_at > '2024-01-01';
```

---

#### Issue 4: High Memory Usage

**Symptom:**
```sql
SELECT biscuit_index_status();
-- Memory: 500 MB (expected ~20 MB for dataset)
```

**Diagnosis:**
```sql
-- Check record count
SELECT biscuit_get_active_count();

-- Check string characteristics
SELECT 
    COUNT(*) AS records,
    AVG(LENGTH(column_name)) AS avg_length,
    MAX(LENGTH(column_name)) AS max_length,
    COUNT(DISTINCT column_name) AS unique_values
FROM table_name;

-- Check tombstone accumulation
SELECT biscuit_get_tombstone_count();
```

**Solutions:**

**A. Tombstone Accumulation**
```sql
SELECT biscuit_cleanup();
-- Then check memory again
```

**B. Memory Fragmentation**
```sql
-- Rebuild to compact
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
```

**C. Unexpected Data Characteristics**
```sql
-- Very long strings or high diversity
-- Consider normalizing data
UPDATE table_name SET column_name = LEFT(column_name, 200);
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
```

---

#### Issue 5: Trigger Not Firing

**Symptom:**
```sql
INSERT INTO table_name (id, column_name) VALUES (999, 'test');
SELECT * FROM biscuit_match('%test%');
-- Returns: no rows (expected 1 row)
```

**Diagnosis:**
```sql
-- Check trigger existence
SELECT tgname, tgenabled, tgisinternal
FROM pg_trigger
WHERE tgrelid = 'table_name'::regclass
AND tgname = 'biscuit_auto_update';

-- Check for errors in logs
SHOW log_destination;
-- Review PostgreSQL logs for trigger errors
```

**Solutions:**

**A. Trigger Disabled**
```sql
SELECT biscuit_enable_triggers();
```

**B. Trigger Missing**
```sql
SELECT biscuit_setup('table_name', 'column_name', 'pk_column');
-- Or manually
SELECT biscuit_enable_triggers();
```

**C. Trigger Error**
```sql
-- Check PostgreSQL logs
-- Common causes:
-- - NULL primary key
-- - Type conversion error
-- - Memory allocation failure

-- Rebuild to reset
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
SELECT biscuit_enable_triggers();
```

---

#### Issue 6: "NULL primary key" Error

**Symptom:**
```sql
INSERT INTO table_name (column_name) VALUES ('test');
ERROR: NULL primary key in INSERT
```

**Cause:** Primary key column contains NULL

**Solution:**

**A. Use Serial/Identity Column**
```sql
ALTER TABLE table_name ALTER COLUMN id ADD GENERATED ALWAYS AS IDENTITY;
```

**B. Set Default Value**
```sql
ALTER TABLE table_name ALTER COLUMN id SET DEFAULT nextval('table_name_id_seq');
```

**C. Ensure Non-NULL Constraint**
```sql
ALTER TABLE table_name ALTER COLUMN id SET NOT NULL;
```

---

#### Issue 7: Out of Memory

**Symptom:**
```sql
SELECT biscuit_build_index('large_table', 'text_column', 'id');
ERROR: out of memory
```

**Diagnosis:**
```sql
-- Check table size
SELECT 
    pg_size_pretty(pg_total_relation_size('large_table')) AS table_size,
    COUNT(*) AS row_count,
    AVG(LENGTH(text_column)) AS avg_length
FROM large_table;

-- Check available memory
SHOW shared_buffers;
SHOW work_mem;
```

**Solutions:**

**A. Increase Memory Allocation**
```sql
-- In postgresql.conf
shared_buffers = 2GB       -- Increase from default
work_mem = 256MB           -- Increase from default

-- Restart PostgreSQL
sudo systemctl restart postgresql
```

**B. Partition Table**
```sql
-- Create partitioned table
CREATE TABLE large_table_partitioned (
    id SERIAL PRIMARY KEY,
    text_column TEXT,
    created_at DATE
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE large_table_2023 PARTITION OF large_table_partitioned
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

-- Index each partition separately
SELECT biscuit_setup('large_table_2023', 'text_column', 'id');
```

**C. Reduce String Diversity**
```sql
-- Normalize before indexing
UPDATE large_table SET text_column = LOWER(text_column);
SELECT biscuit_build_index('large_table', 'text_column', 'id');
```

---

#### Issue 8: Index-Table Mismatch

**Symptom:**
```sql
SELECT COUNT(*) FROM table_name;
-- Returns: 100000

SELECT biscuit_get_active_count();
-- Returns: 95000 (mismatch!)
```

**Diagnosis:**
```sql
-- Check trigger status
SELECT tgenabled FROM pg_trigger WHERE tgname = 'biscuit_auto_update';

-- Check for errors
SELECT biscuit_index_status();

-- Manual comparison
WITH table_data AS (
    SELECT COUNT(*) AS cnt FROM table_name
),
index_data AS (
    SELECT biscuit_get_active_count() AS cnt
)
SELECT 
    t.cnt AS table_count,
    i.cnt AS index_count,
    t.cnt - i.cnt AS discrepancy
FROM table_data t, index_data i;
```

**Solutions:**

**A. Rebuild Index**
```sql
SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
SELECT biscuit_enable_triggers();
```

**B. Check for Concurrent Modifications**
```sql
-- Ensure exclusive access during rebuild
BEGIN;
    LOCK TABLE table_name IN EXCLUSIVE MODE;
    SELECT biscuit_build_index('table_name', 'column_name', 'pk_column');
COMMIT;
```

---

### Diagnostic Queries

#### Comprehensive Health Check

```sql
DO $
DECLARE
    table_count BIGINT;
    index_count INTEGER;
    tombstone_count INTEGER;
    free_slots INTEGER;
    memory_mb NUMERIC;
BEGIN
    EXECUTE 'SELECT COUNT(*) FROM your_table' INTO table_count;
    index_count := biscuit_get_active_count();
    tombstone_count := biscuit_get_tombstone_count();
    free_slots := biscuit_get_free_slots();
    memory_mb := (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC;
    
    RAISE NOTICE 'BISCUIT Health Check:';
    RAISE NOTICE '  Table rows: %', table_count;
    RAISE NOTICE '  Index active: %', index_count;
    RAISE NOTICE '  Discrepancy: %', table_count - index_count;
    RAISE NOTICE '  Tombstones: %', tombstone_count;
    RAISE NOTICE '  Free slots: %', free_slots;
    RAISE NOTICE '  Memory: % MB', memory_mb;
    
    IF table_count != index_count THEN
        RAISE WARNING 'Index out of sync! Rebuild recommended.';
    END IF;
    
    IF tombstone_count > 5000 THEN
        RAISE WARNING 'High tombstone count! Cleanup recommended.';
    END IF;
    
    IF memory_mb > (index_count::NUMERIC / 1000000 * 30) THEN
        RAISE WARNING 'High memory usage! Check for issues.';
    END IF;
END $;
```

#### Performance Baseline

```sql
-- Establish performance baseline
\timing on
SELECT biscuit_match_count('%');              -- Baseline: all records
SELECT biscuit_match_count('%common_char%');  -- Baseline: common pattern
SELECT biscuit_match_count('prefix%');        -- Baseline: prefix
SELECT biscuit_match_count('%suffix');        -- Baseline: suffix
\timing off

-- Save baselines and compare periodically
```

#### Memory Leak Detection

```sql
-- Monitor memory over operations
SELECT (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC AS mb_before;

-- Perform operations
INSERT INTO table_name SELECT generate_series(1, 10000), 'test' || generate_series(1, 10000);
DELETE FROM table_name WHERE id > 50000;
SELECT biscuit_cleanup();

-- Check memory after
SELECT (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC AS mb_after;

-- Memory should be stable or decrease after cleanup
```

### Error Code Reference

| Error Message | Cause | Solution |
|--------------|-------|----------|
| Index not built | No index initialized | Run `biscuit_setup()` |
| NULL primary key | PK value is NULL | Ensure PK constraint |
| Column not found | Invalid column name | Verify table schema |
| Insufficient permissions | User lacks privileges | Grant table permissions |
| Out of memory | Insufficient RAM | Increase shared_buffers |
| Table not found | Invalid table name | Check table existence |
| Index already exists | Duplicate index | Drop and rebuild |
| Trigger already exists | Duplicate trigger | Use `biscuit_disable_triggers()` first |

### Getting Help

**Before Seeking Support:**
1. Run comprehensive health check (above)
2. Review PostgreSQL logs for errors
3. Check extension version: `SELECT biscuit_version()`
4. Verify PostgreSQL version: `SELECT version()`
5. Document reproduction steps

**Support Channels:**
- GitHub Issues: [Project Repository]
- Mailing List: [Contact Information]
- Documentation: [Online Docs URL]

**Information to Provide:**
- Extension version
- PostgreSQL version
- Operating system
- Table schema
- Reproduction steps
- Error messages
- Output of `SELECT biscuit_index_status()`

---

## Examples and Use Cases

### Example 1: E-Commerce Product Search

**Scenario:** Product catalog with 5M products, need fast SKU and name searches

**Schema:**
```sql
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    sku VARCHAR(50) NOT NULL,
    name TEXT NOT NULL,
    description TEXT,
    category VARCHAR(100),
    price NUMERIC(10, 2),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Sample data
INSERT INTO products (sku, name, category, price) VALUES
    ('ELEC-LAP-001', 'Dell XPS 15 Laptop', 'Electronics', 1299.99),
    ('ELEC-LAP-002', 'MacBook Pro 16"', 'Electronics', 2499.99),
    ('HOME-FURN-001', 'Modern Office Chair', 'Furniture', 349.99),
    ('HOME-FURN-002', 'Standing Desk', 'Furniture', 599.99);
```

**BISCUIT Setup:**
```sql
-- Index product names
SELECT biscuit_setup('products', 'name', 'id');
```

**Query Examples:**
```sql
-- Search by product name
SELECT * FROM biscuit_match('%Laptop%');

-- Case-insensitive search (with normalized column)
ALTER TABLE products ADD COLUMN name_lower TEXT 
    GENERATED ALWAYS AS (LOWER(name)) STORED;
SELECT biscuit_setup('products', 'name_lower', 'id');
SELECT * FROM biscuit_match('%laptop%');

-- Complex search
SELECT * FROM biscuit_match('%office%chair%')
WHERE price BETWEEN 200 AND 500;

-- Count matches
SELECT biscuit_match_count('%macbook%');

-- Top sellers matching pattern
SELECT p.*, s.sales_count
FROM biscuit_match('%laptop%') p
JOIN sales_summary s ON p.id = s.product_id
ORDER BY s.sales_count DESC
LIMIT 10;
```

**Performance Comparison:**
```sql
-- Traditional LIKE (sequential scan)
\timing on
SELECT COUNT(*) FROM products WHERE name LIKE '%Laptop%';
Time: 1850 ms

-- BISCUIT
SELECT biscuit_match_count('%Laptop%');
Time: 3.2 ms

-- Speedup: 578×
```

---

### Example 2: Customer Email Domain Analysis

**Scenario:** Analyze customer email patterns, identify providers, find similar domains

**Schema:**
```sql
CREATE TABLE customers (
    customer_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email TEXT NOT NULL,
    first_name TEXT,
    last_name TEXT,
    signup_date DATE DEFAULT CURRENT_DATE,
    status VARCHAR(20) DEFAULT 'active'
);

-- Sample data
INSERT INTO customers (email, first_name, last_name) VALUES
    ('john.doe@gmail.com', 'John', 'Doe'),
    ('jane.smith@yahoo.com', 'Jane', 'Smith'),
    ('bob.wilson@company.com', 'Bob', 'Wilson'),
    ('alice.brown@gmail.com', 'Alice', 'Brown');
```

**BISCUIT Setup:**
```sql
-- Note: UUID primary key fully supported
SELECT biscuit_setup('customers', 'email', 'customer_id');
```

**Query Examples:**
```sql
-- Find all Gmail users
SELECT * FROM biscuit_match('%@gmail.com');

-- Find all corporate emails (non-freemail)
SELECT * 
FROM biscuit_match('%@%.com')
WHERE email NOT LIKE '%@gmail.com'
  AND email NOT LIKE '%@yahoo.com'
  AND email NOT LIKE '%@hotmail.com';

-- Domain distribution analysis
WITH email_domains AS (
    SELECT 
        customer_id,
        email,
        SUBSTRING(email FROM '@(.*)) AS domain
    FROM biscuit_match('%')
)
SELECT 
    domain,
    COUNT(*) AS customer_count,
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS percentage
FROM email_domains
GROUP BY domain
ORDER BY customer_count DESC
LIMIT 10;

-- Find emails with specific patterns
SELECT * FROM biscuit_match('%.doe@%');      -- All .doe emails
SELECT * FROM biscuit_match('admin%@%');     -- Starts with admin
SELECT * FROM biscuit_match('%+%@%');        -- Contains plus addressing
```

**Business Intelligence Query:**
```sql
-- Signup trends by email provider
WITH gmail_users AS (
    SELECT signup_date 
    FROM biscuit_match('%@gmail.com')
),
yahoo_users AS (
    SELECT signup_date 
    FROM biscuit_match('%@yahoo.com')
)
SELECT 
    DATE_TRUNC('month', signup_date) AS month,
    'Gmail' AS provider,
    COUNT(*) AS signups
FROM gmail_users
GROUP BY 1

UNION ALL

SELECT 
    DATE_TRUNC('month', signup_date) AS month,
    'Yahoo' AS provider,
    COUNT(*) AS signups
FROM yahoo_users
GROUP BY 1

ORDER BY month DESC, provider;
```

---

### Example 3: Log Analysis and Error Detection

**Scenario:** 50M log entries, need to quickly find error patterns and anomalies

**Schema:**
```sql
CREATE TABLE application_logs (
    log_id BIGSERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ DEFAULT NOW(),
    level VARCHAR(10),
    message TEXT,
    service VARCHAR(50),
    user_id INTEGER
);

-- Sample data
INSERT INTO application_logs (level, message, service, user_id) VALUES
    ('INFO', 'User login successful', 'auth', 12345),
    ('ERROR', 'Database connection timeout', 'api', 67890),
    ('WARN', 'Rate limit approaching', 'api', 12345),
    ('ERROR', 'Failed to connect to payment gateway', 'payments', 11111);
```

**BISCUIT Setup:**
```sql
SELECT biscuit_setup('application_logs', 'message', 'log_id');
```

**Query Examples:**
```sql
-- Find all connection errors
SELECT * FROM biscuit_match('%connection%')
WHERE level = 'ERROR'
ORDER BY timestamp DESC
LIMIT 100;

-- Find timeout issues across all services
SELECT 
    service,
    COUNT(*) AS timeout_count,
    MIN(timestamp) AS first_seen,
    MAX(timestamp) AS last_seen
FROM biscuit_match('%timeout%')
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY service
ORDER BY timeout_count DESC;

-- Complex error pattern
SELECT * FROM biscuit_match('%failed%connect%')
WHERE level IN ('ERROR', 'CRITICAL')
AND timestamp > NOW() - INTERVAL '1 hour'
ORDER BY timestamp DESC;

-- Error correlation
WITH error_patterns AS (
    SELECT 
        log_id,
        message,
        timestamp,
        CASE 
            WHEN message LIKE '%timeout%' THEN 'timeout'
            WHEN message LIKE '%connection%' THEN 'connection'
            WHEN message LIKE '%memory%' THEN 'memory'
            ELSE 'other'
        END AS error_category
    FROM biscuit_match('%error%')
    WHERE timestamp > NOW() - INTERVAL '24 hours'
)
SELECT 
    error_category,
    COUNT(*) AS occurrences,
    COUNT(*) * 100.0 / SUM(COUNT(*)) OVER () AS percentage
FROM error_patterns
GROUP BY error_category
ORDER BY occurrences DESC;
```

**Real-Time Monitoring:**
```sql
-- Create monitoring view
CREATE OR REPLACE VIEW recent_errors AS
SELECT 
    log_id,
    timestamp,
    level,
    message,
    service,
    user_id
FROM biscuit_match('%error%')
WHERE timestamp > NOW() - INTERVAL '5 minutes'
ORDER BY timestamp DESC;

-- Query view
SELECT * FROM recent_errors;
```

---

### Example 4: Genetic Sequence Matching

**Scenario:** Bioinformatics database with DNA sequences, need pattern matching

**Schema:**
```sql
CREATE TABLE dna_sequences (
    sequence_id VARCHAR(50) PRIMARY KEY,
    organism TEXT,
    sequence TEXT,  -- ATCG sequence
    length INTEGER,
    discovered_date DATE
);

-- Sample data
INSERT INTO dna_sequences VALUES
    ('SEQ001', 'E. coli', 'ATCGATCGATCG', 12, '2024-01-15'),
    ('SEQ002', 'Human', 'GCTAGCTAGCTA', 12, '2024-01-16'),
    ('SEQ003', 'E. coli', 'ATCGGGCCCATCG', 13, '2024-01-17');
```

**BISCUIT Setup:**
```sql
SELECT biscuit_setup('dna_sequences', 'sequence', 'sequence_id');
```

**Query Examples:**
```sql
-- Find sequences with specific motif
SELECT * FROM biscuit_match('%ATCG%');

-- Find palindromic sequences (conceptual)
SELECT * FROM biscuit_match('%GATC%');  -- Start codon

-- Complex pattern matching
SELECT * FROM biscuit_match('%ATC%GGG%')
WHERE organism = 'E. coli';

-- Count sequences by pattern
SELECT 
    'Contains ATCG' AS pattern,
    biscuit_match_count('%ATCG%') AS count
UNION ALL
SELECT 
    'Starts with GC',
    biscuit_match_count('GC%')
UNION ALL
SELECT 
    'Ends with TA',
    biscuit_match_count('%TA');
```

---

### Example 5: Social Media Username Search

**Scenario:** Social platform with 100M users, need fast username search

**Schema:**
```sql
CREATE TABLE users (
    user_id BIGSERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    display_name TEXT,
    bio TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    follower_count INTEGER DEFAULT 0
);

-- Sample data
INSERT INTO users (username, display_name, bio, follower_count) VALUES
    ('john_doe_123', 'John Doe', 'Software engineer', 1250),
    ('jane_smith', 'Jane Smith', 'Data scientist', 3400),
    ('tech_guru_99', 'Tech Guru', 'Tech enthusiast', 15000);
```

**BISCUIT Setup:**
```sql
-- Index lowercase usernames for case-insensitive search
ALTER TABLE users ADD COLUMN username_lower TEXT 
    GENERATED ALWAYS AS (LOWER(username)) STORED;

SELECT biscuit_setup('users', 'username_lower', 'user_id');
```

**Query Examples:**
```sql
-- Search usernames
SELECT user_id, username, display_name, follower_count
FROM biscuit_match('%tech%')
ORDER BY follower_count DESC;

-- Find available similar usernames
WITH desired_username AS (
    SELECT 'john_doe' AS base
),
existing_matches AS (
    SELECT username 
    FROM biscuit_match('john_doe%')
)
SELECT 
    base || '_' || generate_series(1, 10) AS suggestion
FROM desired_username
WHERE base || '_' || generate_series(1, 10) NOT IN (
    SELECT username FROM existing_matches
)
LIMIT 5;

-- Find influencers matching pattern
SELECT username, display_name, follower_count
FROM biscuit_match('%guru%')
WHERE follower_count > 10000
ORDER BY follower_count DESC;

-- Username analytics
WITH username_patterns AS (
    SELECT 
        CASE 
            WHEN username LIKE '%\_\_\_%' THEN 'has_underscores'
            WHEN username LIKE '%[0-9]%' THEN 'has_numbers'
            ELSE 'alphanumeric'
        END AS pattern_type,
        follower_count
    FROM biscuit_match('%')
)
SELECT 
    pattern_type,
    COUNT(*) AS user_count,
    AVG(follower_count) AS avg_followers
FROM username_patterns
GROUP BY pattern_type;
```

---

### Example 6: Document Management System

**Scenario:** Legal document repository, need full-text-like search on titles

**Schema:**
```sql
CREATE TABLE documents (
    doc_id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    document_type VARCHAR(50),
    file_path TEXT,
    uploaded_by INTEGER,
    uploaded_at TIMESTAMPTZ DEFAULT NOW(),
    case_number VARCHAR(50)
);

-- Sample data
INSERT INTO documents (title, document_type, case_number) VALUES
    ('Contract Agreement - Smith vs Johnson', 'Contract', 'CASE-2024-001'),
    ('Patent Application for AI Technology', 'Patent', 'PAT-2024-042'),
    ('Lease Agreement - Commercial Property', 'Lease', 'CASE-2024-002');
```

**BISCUIT Setup:**
```sql
-- Create searchable column with normalized text
ALTER TABLE documents ADD COLUMN title_search TEXT 
    GENERATED ALWAYS AS (LOWER(REGEXP_REPLACE(title, '[^a-z0-9 ]', '', 'gi'))) STORED;

SELECT biscuit_setup('documents', 'title_search', 'doc_id');
```

**Query Examples:**
```sql
-- Find contract documents
SELECT * FROM biscuit_match('%contract%');

-- Multi-term search (all terms must appear)
SELECT * 
FROM biscuit_match('%lease%commercial%')
WHERE document_type = 'Lease';

-- Case document search
SELECT doc_id, title, case_number, uploaded_at
FROM biscuit_match('%smith%johnson%')
ORDER BY uploaded_at DESC;

-- Document type distribution
WITH doc_matches AS (
    SELECT document_type, COUNT(*) AS cnt
    FROM biscuit_match('%agreement%')
    GROUP BY document_type
)
SELECT 
    document_type,
    cnt,
    ROUND(100.0 * cnt / SUM(cnt) OVER (), 2) AS percentage
FROM doc_matches
ORDER BY cnt DESC;
```

---

### Example 7: Bulk Data Migration with BISCUIT

**Scenario:** Migrating 10M records from legacy system

**Migration Script:**
```sql
-- Step 1: Create target table
CREATE TABLE products_new (
    id INTEGER PRIMARY KEY,
    sku TEXT,
    name TEXT,
    description TEXT
);

-- Step 2: Prepare BISCUIT (without triggers for bulk)
SELECT biscuit_build_index('products_new', 'name', 'id');

-- Step 3: Bulk load data
\timing on
SELECT biscuit_disable_triggers();

COPY products_new FROM '/data/products.csv' CSV HEADER;
-- Alternative: INSERT SELECT from staging

Time: 45,000 ms

-- Step 4: Rebuild index
SELECT biscuit_build_index('products_new', 'name', 'id');
Time: 38,000 ms

-- Step 5: Enable triggers for future updates
SELECT biscuit_enable_triggers();
\timing off

-- Step 6: Verify
SELECT 
    COUNT(*) AS table_count,
    biscuit_get_active_count() AS index_count,
    biscuit_get_free_slots() AS free_slots
FROM products_new;

-- Step 7: Test queries
SELECT biscuit_match_count('%');  -- Should match table count
```

---

### Example 8: Multi-Pattern Search Interface

**Scenario:** Build API endpoint that accepts multiple search patterns

**Application Code (Python + psycopg2):**
```python
import psycopg2
from typing import List, Dict

def multi_pattern_search(patterns: List[str], table: str) -> List[Dict]:
    """
    Search for records matching any of the provided patterns
    """
    conn = psycopg2.connect("dbname=mydb user=myuser")
    cur = conn.cursor()
    
    # Build UNION query for multiple patterns
    union_queries = []
    for pattern in patterns:
        union_queries.append(
            f"SELECT * FROM biscuit_match(%s)"
        )
    
    query = " UNION ".join(union_queries)
    
    cur.execute(query, patterns)
    results = cur.fetchall()
    
    cur.close()
    conn.close()
    
    return results

# Usage
results = multi_pattern_search(
    patterns=['%laptop%', '%macbook%', '%computer%'],
    table='products'
)
```

**SQL Equivalent:**
```sql
-- Search multiple patterns
SELECT * FROM biscuit_match('%laptop%')
UNION
SELECT * FROM biscuit_match('%macbook%')
UNION
SELECT * FROM biscuit_match('%computer%')
ORDER BY id;

-- Or with CTE for clarity
WITH 
pattern1 AS (SELECT * FROM biscuit_match('%laptop%')),
pattern2 AS (SELECT * FROM biscuit_match('%macbook%')),
pattern3 AS (SELECT * FROM biscuit_match('%computer%'))
SELECT * FROM pattern1
UNION
SELECT * FROM pattern2
UNION
SELECT * FROM pattern3;
```

---

### Example 9: Performance Monitoring Dashboard

**Scenario:** Real-time monitoring of BISCUIT performance

**Monitoring Queries:**
```sql
-- Create monitoring function
CREATE OR REPLACE FUNCTION biscuit_performance_metrics()
RETURNS TABLE (
    metric TEXT,
    value NUMERIC,
    unit TEXT,
    status TEXT
) AS $
DECLARE
    active_ct INTEGER;
    tombstone_ct INTEGER;
    free_ct INTEGER;
    memory_val NUMERIC;
    query_ct BIGINT;
    insert_ct BIGINT;
    update_ct BIGINT;
    delete_ct BIGINT;
BEGIN
    active_ct := biscuit_get_active_count();
    tombstone_ct := biscuit_get_tombstone_count();
    free_ct := biscuit_get_free_slots();
    memory_val := (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC;
    query_ct := (regexp_matches(biscuit_index_status(), 'Queries: ([0-9]+)'))[1]::BIGINT;
    insert_ct := (regexp_matches(biscuit_index_status(), 'Inserts: ([0-9]+)'))[1]::BIGINT;
    update_ct := (regexp_matches(biscuit_index_status(), 'Updates: ([0-9]+)'))[1]::BIGINT;
    delete_ct := (regexp_matches(biscuit_index_status(), 'Deletes: ([0-9]+)'))[1]::BIGINT;
    
    RETURN QUERY SELECT 'Active Records'::TEXT, active_ct::NUMERIC, 'records'::TEXT, 
        CASE WHEN active_ct > 0 THEN 'OK' ELSE 'WARNING' END;
    RETURN QUERY SELECT 'Tombstones', tombstone_ct::NUMERIC, 'records', 
        CASE WHEN tombstone_ct < 5000 THEN 'OK' WHEN tombstone_ct < 10000 THEN 'WARNING' ELSE 'CRITICAL' END;
    RETURN QUERY SELECT 'Free Slots', free_ct::NUMERIC, 'slots', 'INFO';
    RETURN QUERY SELECT 'Memory Usage', memory_val, 'MB', 
        CASE WHEN memory_val < 100 THEN 'OK' WHEN memory_val < 500 THEN 'WARNING' ELSE 'HIGH' END;
    RETURN QUERY SELECT 'Total Queries', query_ct::NUMERIC, 'operations', 'INFO';
    RETURN QUERY SELECT 'Total Inserts', insert_ct::NUMERIC, 'operations', 'INFO';
    RETURN QUERY SELECT 'Total Updates', update_ct::NUMERIC, 'operations', 'INFO';
    RETURN QUERY SELECT 'Total Deletes', delete_ct::NUMERIC, 'operations', 'INFO';
END;
$ LANGUAGE plpgsql;

-- Use in dashboard
SELECT * FROM biscuit_performance_metrics();
```

**Output:**
```
       metric       | value | unit       | status
--------------------+-------+------------+--------
 Active Records     | 998567| records    | OK
 Tombstones         | 856   | records    | OK
 Free Slots         | 1433  | slots      | INFO
 Memory Usage       | 18.3  | MB         | OK
 Total Queries      | 45892 | operations | INFO
 Total Inserts      | 1000000| operations| INFO
 Total Updates      | 125678| operations | INFO
 Total Deletes      | 1433  | operations | INFO
```

---

### Complete Production Example

**Full Setup for Production Environment:**

```sql
-- 1. Create schema
CREATE SCHEMA production;
SET search_path TO production;

-- 2. Create table
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    sku VARCHAR(100) UNIQUE NOT NULL,
    name TEXT NOT NULL,
    description TEXT,
    category VARCHAR(100),
    price NUMERIC(12, 2),
    stock_level INTEGER DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- 3. Add indexes
CREATE INDEX idx_products_category ON products(category);
CREATE INDEX idx_products_price ON products(price);

-- 4. Setup BISCUIT
SELECT biscuit_setup('products', 'name', 'id');

-- 5. Verify setup
SELECT biscuit_index_status();

-- 6. Insert sample data
INSERT INTO products (sku, name, category, price, stock_level)
SELECT 
    'SKU-' || generate_series(1, 100000),
    'Product ' || generate_series(1, 100000),
    (ARRAY['Electronics', 'Furniture', 'Clothing'])[floor(random() * 3 + 1)],
    (random() * 1000)::NUMERIC(12, 2),
    floor(random() * 100)::INTEGER;

-- 7. Test queries
\timing on
SELECT * FROM biscuit_match('%Product 1234%');
SELECT biscuit_match_count('%Electronics%');
\timing off

-- 8. Setup monitoring
CREATE TABLE biscuit_metrics_log (
    logged_at TIMESTAMPTZ DEFAULT NOW(),
    active_count INTEGER,
    tombstone_count INTEGER,
    memory_mb NUMERIC,
    query_count BIGINT
);

-- 9. Schedule monitoring (via pg_cron)
SELECT cron.schedule(
    'biscuit-metrics',
    '*/5 * * * *',  -- Every 5 minutes
    $INSERT INTO production.biscuit_metrics_log 
      SELECT NOW(), biscuit_get_active_count(), biscuit_get_tombstone_count(),
             (regexp_matches(biscuit_index_status(), 'Memory: ([0-9.]+)'))[1]::NUMERIC,
             (regexp_matches(biscuit_index_status(), 'Queries: ([0-9]+)'))[1]::BIGINT$
);

-- 10. Create API views
CREATE VIEW product_search AS
SELECT id, sku, name, category, price
FROM products
WHERE id IN (SELECT id FROM biscuit_match('%'));  -- Template view
```

---

## License

PostgreSQL License (similar to MIT/BSD)

Copyright (c) 2024 BISCUIT Contributors

Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.

---

## Version History

**v1.0.0** (Current)
- Initial release
- Roaring Bitmap integration
- Lazy deletion with tombstones
- Incremental updates
- O(1) hash table PK lookup
- Automatic trigger-based CRUD
- Comprehensive monitoring

---

## Contributors

BISCUIT is developed and maintained by [Sivaprasad Murali](https://linkedin.com/in/sivaprasad-murali) .

---

## Support and Contact


**Issues:** https://github.com/crystallinecore/biscuit/issues

**Discussions:** https://github.com/crystallinecore/biscuit/discussions


---

**When pg_trgm feels half-baked, grab a BISCUIT 🍪**

---