# SQL API Reference This page documents the SQL surface implemented by the current `graph` crate. All functions live in the `graph` schema. ## Function Index | Category | Functions | |---|---| | Availability | `test_enabled`, `status`, `sync_health`, `reset` | | Registration | `add_table`, `add_edge`, `add_filter_column`, `registered_tables`, `registered_edges`, `remove_table`, `remove_edge`, `estimate` | | Discovery | `auto_discover`, `auto_discover_tables` | | Build jobs | `build`, `build_status` | | Sync and maintenance | `enable_sync`, `apply_sync`, `run_scheduled_maintenance`, `vacuum`, `maintenance`, `maintenance_status` | | Search | `search`, `search_nodes` | | Traversal | `traverse`, `traverse_search`, `node_ref`, `node_ref_string`, `format_path` | | Filters | `equals`, `eq`, `not_equals`, `neq`, `greater_than`, `gt`, `at_least`, `gte`, `less_than`, `lt`, `at_most`, `lte`, `between`, `on_node`, `all` | | Workflow | `find`, `expand`, `find_related`, `path`, `connection`, `neighborhood` | | Paths | `shortest_path`, `weighted_shortest_path` | | Aggregation | `path_count_estimate`, `aggregate` | | Components | `connected_components`, `component_stats`, `components`, `largest_component`, `component`, `isolated_nodes` | ## Availability ### `graph.test_enabled()` ```sql graph.test_enabled() RETURNS boolean ``` Returns the current `graph.enabled` value. ### `graph.status()` ```sql graph.status() RETURNS TABLE ( node_count int, edge_count int, memory_used_mb double precision, memory_limit_mb int, sync_mode text, sync_status text, last_build timestamptz, last_vacuum timestamptz, edge_types text[], edge_buffer_used int, has_unidirectional_edges boolean, schema_status text, sync_lag bigint, pending_edge_deltas int, needs_vacuum boolean, needs_rebuild boolean, applied_sync_id bigint, pending_sync_rows bigint, invalid_reason text, disabled_trigger_count int, read_only boolean, read_only_reason text ) ``` Reports backend-local engine state and current catalog/sync validity. When `read_only` is true, `read_only_reason` is `memory_limit`, `edge_buffer_full`, or another typed reason. ### `graph.sync_health()` ```sql graph.sync_health() RETURNS TABLE ( sync_mode text, query_freshness text, sync_batch_size int, applied_sync_id bigint, max_sync_log_id bigint, pending_sync_rows bigint, disabled_trigger_count int, edge_buffer_used int, edge_buffer_size int, needs_vacuum boolean, needs_rebuild boolean, read_only boolean, read_only_reason text, apply_sync_recommended boolean, maintenance_recommended boolean ) ``` Reports the sync and compaction signals most useful to schedulers and external monitors. Use `apply_sync_recommended` to decide when a lightweight replay is useful for a writable backend-local engine, and `maintenance_recommended` to decide when overlay/read-only/rebuild state should be folded back into the base graph. ### `graph.reset()` ```sql graph.reset() RETURNS void ``` Admin-only. Clears the backend-local engine and removes persisted graph files. ## Registration ### `graph.add_table()` ```sql graph.add_table( table_name regclass, id_column text, columns text[] DEFAULT NULL, tenant_column text DEFAULT NULL ) RETURNS void ``` ```sql graph.add_table( table_name regclass, id_columns text[], columns text[] DEFAULT NULL, tenant_column text DEFAULT NULL ) RETURNS void ``` Admin-only. Registers a source table as graph nodes. The identifier columns must match a primary key or unique `NOT NULL` index. ### `graph.add_edge()` ```sql graph.add_edge( from_table regclass, from_column text, to_table regclass, to_column text, label text, bidirectional boolean DEFAULT true, weight_column text DEFAULT NULL, label_column text DEFAULT NULL ) RETURNS void ``` Admin-only. Registers a relationship. `weight_column` must be numeric. `label_column`, when supplied, provides per-row dynamic edge labels. ### `graph.add_filter_column()` ```sql graph.add_filter_column( table_name regclass, column_name text, column_type text DEFAULT 'numeric' ) RETURNS void ``` Admin-only. Supported types are `numeric`, `text`, `boolean`, `date`, `timestamptz`, and `uuid`. ### `graph.registered_tables()` ```sql graph.registered_tables() RETURNS TABLE ( table_name text, id_columns text[], columns text[], tenant_column text ) ``` ### `graph.registered_edges()` ```sql graph.registered_edges() RETURNS TABLE ( from_table text, from_column text, to_table text, to_column text, label text, bidirectional boolean, weight_column text, label_column text ) ``` ### `graph.remove_table()` ```sql graph.remove_table(table_name regclass) RETURNS void ``` Admin-only. Removes table registration, associated filter columns, and edges where the table is either endpoint. Rebuild after calling. ### `graph.remove_edge()` ```sql graph.remove_edge(label text) RETURNS void ``` Admin-only. Removes edge registrations with the given label. Rebuild after calling. ### `graph.estimate()` ```sql graph.estimate() RETURNS TABLE ( estimated_nodes bigint, estimated_edges bigint, estimated_memory_mb double precision, memory_limit_mb int, fits_in_memory boolean ) ``` Estimates graph size from `pg_class.reltuples`. ## Discovery ### `graph.auto_discover()` ```sql graph.auto_discover(schema_name text DEFAULT 'public') RETURNS TABLE ( item_type text, item_name text, details text ) ``` Admin-only. Discovers tables and FK edges in a schema, registers them, then calls `graph.build()`. ### `graph.auto_discover_tables()` ```sql graph.auto_discover_tables( tables regclass[], tenant_column text DEFAULT NULL ) RETURNS TABLE ( item_type text, item_name text, details text ) ``` Admin-only. Discovers selected tables and FK edges only among those tables, then calls `graph.build()`. ## Build Jobs ### `graph.build()` ```sql graph.build() RETURNS TABLE ( nodes_loaded bigint, edges_loaded bigint, build_time_ms double precision, memory_used_mb double precision, sync_mode text ) ``` Admin-only. Builds the graph synchronously. ```sql graph.build(concurrently boolean) RETURNS TABLE ( build_id text, status text, nodes_loaded bigint, edges_loaded bigint, build_time_ms double precision, memory_used_mb double precision, sync_mode text ) ``` Admin-only. `false` runs the compatibility synchronous path. `true` creates a build job and attempts to launch a dynamic background worker. ### `graph.build_status()` ```sql graph.build_status(build_id text) RETURNS TABLE ( build_id text, status text, nodes_loaded bigint, edges_loaded bigint, build_time_ms double precision, memory_used_mb double precision, progress_phase text, progress_message text, started_at timestamp with time zone, finished_at timestamp with time zone, error text ) ``` Reads durable build job state, or returns `completed`/`not_found` for the special synchronous zero UUID path. ## Sync And Maintenance ### `graph.enable_sync()` ```sql graph.enable_sync() RETURNS void ``` Admin-only. Creates durable sync tables if needed and installs graph triggers on registered tables. ### `graph.apply_sync()` ```sql graph.apply_sync() RETURNS TABLE ( inserts_applied bigint, updates_applied bigint, deletes_applied bigint ) ``` Admin-only. Applies pending sync rows to the backend-local engine. ### `graph.run_scheduled_maintenance()` ```sql graph.run_scheduled_maintenance() RETURNS TABLE ( applied_sync boolean, maintenance_started boolean, maintenance_job_id text, pending_sync_rows bigint, edge_buffer_used int, message text ) ``` Admin-only. Applies pending sync when `graph.sync_health()` says it is safe, then starts `graph.maintenance(concurrently := true)` when refreshed health still recommends compaction or rebuild work. pgGraph does not run an internal scheduler; call this function from `pg_cron` or an external scheduler such as a Kubernetes CronJob, systemd timer, Docker init schedule, or application scheduler. ### `graph.vacuum()` ```sql graph.vacuum() RETURNS TABLE ( nodes_before bigint, nodes_after bigint, tombstones_removed bigint, edges_rebuilt bigint, vacuum_time_ms double precision ) ``` Admin-only. Rebuilds from source tables and swaps in the rebuilt backend-local engine; persistence writes the rebuilt base graph when requested by the caller. ### `graph.maintenance()` ```sql graph.maintenance(concurrently boolean DEFAULT false) RETURNS TABLE ( job_id text, status text, sync_rows_applied bigint, nodes_after bigint, edges_after bigint, vacuum_time_ms double precision, error text ) ``` Admin-only. Foreground path applies maintenance rebuild immediately. Background path creates a durable maintenance job and launches a dynamic worker. ### `graph.maintenance_status()` ```sql graph.maintenance_status(job_id text DEFAULT NULL) RETURNS TABLE ( job_id text, status text, sync_rows_applied bigint, nodes_after bigint, edges_after bigint, vacuum_time_ms double precision, progress_phase text, progress_message text, started_at timestamp with time zone, finished_at timestamp with time zone, error text ) ``` With `NULL`, returns the latest 50 jobs. With a job ID, returns that job or a single `not_found` row. ## Search ### `graph.search()` ```sql graph.search( property_key text, property_value text, table_filter regclass DEFAULT NULL, mode text DEFAULT 'contains', case_sensitive boolean DEFAULT false, max_rows int DEFAULT 100, row_offset int DEFAULT 0, tenant text DEFAULT NULL, hydrate boolean DEFAULT true ) RETURNS TABLE ( node_table oid, node_id text, match_type text, score real, verified boolean, node jsonb, node_table_name text ) ``` Searches registered source-table columns. Results are deduplicated by `(node_table, node_id)`, ordered by table OID then node ID, and paginated after the global merge. Per-table source scans are bounded to `row_offset + max_rows` candidates. ### `graph.search_nodes()` ```sql graph.search_nodes( property_key text, property_value text, table_filter regclass DEFAULT NULL, mode text DEFAULT 'contains', case_sensitive boolean DEFAULT false, max_rows int DEFAULT 100, row_offset int DEFAULT 0, tenant text DEFAULT NULL ) RETURNS TABLE ( node_table oid, node_id text, match_type text, score real, verified boolean, node_table_name text ) ``` Coordinate-only search. ## Traversal ### `graph.node_ref` ```sql CREATE TYPE graph.node_ref AS ( node_table regclass, node_id text ); graph.node_ref(node_table regclass, node_id text) RETURNS graph.node_ref ``` ### `graph.node_ref_string()` ```sql graph.node_ref_string(table_name regclass, node_id text) RETURNS text ``` Returns canonical JSON text for strict traversal specs. ### `graph.traverse()` single-start ```sql graph.traverse( seed_table regclass, seed_id text, max_depth int DEFAULT current_setting('graph.default_max_depth')::int, edge_types text[] DEFAULT NULL, direction text DEFAULT 'any', node_tables oid[] DEFAULT NULL, filter jsonb DEFAULT NULL, tenant text DEFAULT NULL, strategy text DEFAULT 'bfs', uniqueness text DEFAULT 'node_global', include_start boolean DEFAULT true, hydrate boolean DEFAULT true, max_rows int DEFAULT 1000, row_offset int DEFAULT 0, max_nodes int DEFAULT current_setting('graph.max_nodes')::int, max_frontier int DEFAULT current_setting('graph.max_frontier')::int ) RETURNS TABLE ( root_table oid, root_id text, node_table oid, node_id text, depth int, path jsonb, edge_path jsonb, node jsonb, root_table_name text, node_table_name text ) ``` ### `graph.traverse()` multi-start arrays ```sql graph.traverse( start_tables regclass[], start_ids text[], ... ) ``` `start_tables[i]` pairs with `start_ids[i]`. With `uniqueness := 'node_global'`, duplicate reached nodes are removed across the merged multi-start result before final pagination. With `uniqueness := 'node_per_root'`, a reached node may appear once per root. ### `graph.traverse()` node-ref array ```sql graph.traverse( starts graph.node_ref[], max_depth int DEFAULT current_setting('graph.default_max_depth')::int, ... ) ``` SQL wrapper over the parallel-array overload. ### `graph.traverse_search()` ```sql graph.traverse_search( property_key text, property_value text, table_filter regclass DEFAULT NULL, search_mode text DEFAULT current_setting('graph.default_search_mode'), case_sensitive boolean DEFAULT current_setting('graph.default_case_sensitive')::boolean, search_max_rows int DEFAULT 100, search_row_offset int DEFAULT 0, max_depth int DEFAULT current_setting('graph.default_max_depth')::int, edge_types text[] DEFAULT NULL, direction text DEFAULT 'any', node_tables oid[] DEFAULT NULL, filter jsonb DEFAULT NULL, tenant text DEFAULT NULL, strategy text DEFAULT 'bfs', uniqueness text DEFAULT 'node_per_root', include_start boolean DEFAULT true, hydrate boolean DEFAULT current_setting('graph.default_hydrate')::boolean, max_rows int DEFAULT 1000, row_offset int DEFAULT 0 ) RETURNS TABLE ( root_table oid, root_id text, node_table oid, node_id text, depth int, path jsonb, edge_path jsonb, node jsonb, root_table_name text, node_table_name text ) ``` Searches start nodes, then traverses from verified matches. `uniqueness` has the same merged-result behavior as multi-start `graph.traverse()`. ### `graph.format_path()` ```sql graph.format_path( path jsonb, edge_path jsonb, separator text DEFAULT ' | ' ) RETURNS text ``` Formats a path and edge path into readable hop text. ## Filter Constructors ```sql graph.equals(column_name text, value jsonb) RETURNS jsonb graph.equals(column_name text, value text) RETURNS jsonb graph.equals(column_name text, value bigint) RETURNS jsonb graph.eq(...) RETURNS jsonb graph.not_equals(column_name text, value jsonb) RETURNS jsonb graph.neq(column_name text, value jsonb) RETURNS jsonb graph.neq(column_name text, value text) RETURNS jsonb graph.neq(column_name text, value bigint) RETURNS jsonb graph.in(column_name text, values jsonb) RETURNS jsonb graph.not_in(column_name text, values jsonb) RETURNS jsonb graph.contains_text(column_name text, value text) RETURNS jsonb graph.prefix_text(column_name text, value text) RETURNS jsonb graph.is_null(column_name text) RETURNS jsonb graph.is_not_null(column_name text) RETURNS jsonb graph.greater_than(column_name text, value jsonb) RETURNS jsonb graph.greater_than(column_name text, value bigint) RETURNS jsonb graph.gt(column_name text, value jsonb) RETURNS jsonb graph.gt(column_name text, value bigint) RETURNS jsonb graph.at_least(column_name text, value jsonb) RETURNS jsonb graph.gte(column_name text, value jsonb) RETURNS jsonb graph.gte(column_name text, value bigint) RETURNS jsonb graph.less_than(column_name text, value jsonb) RETURNS jsonb graph.lt(column_name text, value jsonb) RETURNS jsonb graph.lt(column_name text, value bigint) RETURNS jsonb graph.at_most(column_name text, value jsonb) RETURNS jsonb graph.lte(column_name text, value jsonb) RETURNS jsonb graph.lte(column_name text, value bigint) RETURNS jsonb graph.between(column_name text, lower jsonb, upper jsonb) RETURNS jsonb graph.on_node(filter jsonb) RETURNS jsonb graph.all(filters jsonb[]) RETURNS jsonb ``` ## Workflow API ### `graph.find()` ```sql graph.find( property_key text, property_value text, table_name regclass DEFAULT NULL, mode text DEFAULT 'contains', case_sensitive boolean DEFAULT false, max_rows int DEFAULT 20, row_offset int DEFAULT 0, tenant text DEFAULT NULL ) RETURNS TABLE ( node_table oid, node_table_name text, node_id text, match_type text, score real, verified boolean, rank int, node jsonb ) ``` ### `graph.expand()` ```sql graph.expand( seed_table regclass, seed_id text, max_depth int DEFAULT current_setting('graph.default_max_depth')::int, edge_types text[] DEFAULT NULL, direction text DEFAULT 'any', target_table regclass DEFAULT NULL, target_tables regclass[] DEFAULT NULL, where_node jsonb DEFAULT NULL, tenant text DEFAULT NULL, max_rows int DEFAULT 50, row_offset int DEFAULT 0, include_start boolean DEFAULT false ) RETURNS TABLE ( root_table oid, root_id text, root_table_name text, node_table oid, node_table_name text, node_id text, depth int, rank int, path jsonb, edge_path jsonb, readable_path text, node jsonb, truncated boolean ) ``` ### `graph.find_related()` ```sql graph.find_related( property_key text, property_value text, source_table regclass DEFAULT NULL, search_mode text DEFAULT current_setting('graph.default_search_mode'), case_sensitive boolean DEFAULT current_setting('graph.default_case_sensitive')::boolean, search_max_rows int DEFAULT 1, search_row_offset int DEFAULT 0, max_depth int DEFAULT current_setting('graph.default_max_depth')::int, edge_types text[] DEFAULT NULL, direction text DEFAULT 'any', target_table regclass DEFAULT NULL, target_tables regclass[] DEFAULT NULL, where_node jsonb DEFAULT NULL, tenant text DEFAULT NULL, max_rows int DEFAULT 20, row_offset int DEFAULT 0, include_counts boolean DEFAULT false, candidate_limit int DEFAULT 10000, include_start boolean DEFAULT false ) RETURNS TABLE ( root_table oid, root_id text, root_table_name text, node_table oid, node_table_name text, node_id text, depth int, score real, rank int, path jsonb, edge_path jsonb, readable_path text, node jsonb, candidate_count bigint, filtered_count bigint, truncated boolean ) ``` ### `graph.path()` ```sql graph.path( source_table regclass, source_id text, target_table regclass, target_id text, max_depth int DEFAULT 20 ) RETURNS TABLE ( step int, node_table oid, node_table_name text, node_id text, edge_label text, readable_path text, node jsonb ) ``` ### `graph.connection()` ```sql graph.connection( source_key text, source_value text, target_key text, target_value text, source_table regclass DEFAULT NULL, target_table regclass DEFAULT NULL, source_k int DEFAULT 3, target_k int DEFAULT 3, search_mode text DEFAULT 'contains', max_depth int DEFAULT 6 ) RETURNS TABLE ( source_table_name text, source_id text, target_table_name text, target_id text, hop_count int, step int, node_table oid, node_table_name text, node_id text, edge_label text, readable_path text, node jsonb ) ``` ### `graph.neighborhood()` ```sql graph.neighborhood( property_key text, property_value text, source_table regclass DEFAULT NULL, search_mode text DEFAULT 'contains', search_max_rows int DEFAULT 1, max_depth int DEFAULT 4, edge_types text[] DEFAULT NULL, direction text DEFAULT 'any', tenant text DEFAULT NULL, sample_k int DEFAULT 5, node_limit int DEFAULT 10000 ) RETURNS TABLE ( depth int, node_table oid, node_table_name text, node_count bigint, sample_nodes jsonb, truncated boolean ) ``` ## Paths ### `graph.shortest_path()` ```sql graph.shortest_path( source_table regclass, source_id text, target_table regclass, target_id text, max_depth int DEFAULT 20, hydrate boolean DEFAULT true ) RETURNS TABLE ( step int, node_table oid, node_id text, edge_label text, node jsonb, node_table_name text ) ``` Returns no rows when no path is found. ### `graph.weighted_shortest_path()` ```sql graph.weighted_shortest_path( source_table regclass, source_id text, target_table regclass, target_id text ) RETURNS TABLE ( step int, node_table oid, node_table_name text, node_id text, edge_label text, edge_weight bigint, step_cost bigint, total_cost bigint ) ``` Uses registered edge weights and returns one row per path step. Step `0` is the source node and has NULL `edge_label` and `edge_weight`. `step_cost` is the cumulative cost at that step, and `total_cost` is repeated on every returned row. Returns no rows when no weighted path exists or the graph has no weight data. ## Aggregation ### `graph.path_count_estimate()` ```sql graph.path_count_estimate(traversal jsonb) RETURNS TABLE ( estimated_paths bigint, exact boolean, capped boolean ) ``` ### `graph.aggregate()` ```sql graph.aggregate( traversal jsonb, aggregations jsonb, scope text DEFAULT 'returned_nodes', path_limit int DEFAULT current_setting('graph.max_exact_path_count')::int ) RETURNS jsonb ``` Supported `scope` values: `returned_nodes`, `chosen_parent_path`, `all_possible_paths`. ## Components ### `graph.connected_components()` ```sql graph.connected_components() RETURNS TABLE ( node_table oid, node_id text, component_id bigint, component_size int ) ``` Admin-only. Returns one row per active node. ### `graph.component_stats()` ```sql graph.component_stats() RETURNS TABLE ( num_components int, largest_component int, num_isolated_nodes int, total_active_nodes int ) ``` Admin-only. ### `graph.components()` ```sql graph.components( max_rows int DEFAULT 100, row_offset int DEFAULT 0 ) RETURNS TABLE ( component_id bigint, component_size bigint, rank int ) ``` Admin-only. Ordered by descending component size. ### `graph.largest_component()` ```sql graph.largest_component( max_rows int DEFAULT 100, row_offset int DEFAULT 0, hydrate boolean DEFAULT true ) RETURNS TABLE ( component_id bigint, node_table oid, node_id text, node jsonb ) ``` Admin-only. ### `graph.component()` ```sql graph.component( component_id bigint, max_rows int DEFAULT 100, row_offset int DEFAULT 0, hydrate boolean DEFAULT true ) RETURNS TABLE ( component_id bigint, node_table oid, node_id text, node jsonb ) ``` Admin-only. ### `graph.isolated_nodes()` ```sql graph.isolated_nodes( max_rows int DEFAULT 100, row_offset int DEFAULT 0, hydrate boolean DEFAULT true ) RETURNS TABLE ( component_id bigint, node_table oid, node_id text, node jsonb ) ``` Admin-only.