# Architecture
This page is the contributor-level map of the `graph` crate. It explains the
crate graph, data flow, ownership model, lifetimes, and the design choices that
shape the implementation.
For SQL behavior, start with the [User Guide](/user_guide). This page assumes
you are changing Rust code or reviewing how the extension works inside a
PostgreSQL backend.
## System Shape
## Runtime Ownership
Each PostgreSQL backend owns its own `Engine` in thread-local storage:
```rust
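use std::cell::RefCell;

thread_local! {
    // One `Engine` per backend process; never shared across connections.
    static ENGINE: RefCell<Engine> = RefCell::new(Engine::new());
}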
```
As a result, connections share no Rust heap state. Sharing happens only through
PostgreSQL storage, the filesystem, and mmap pages backed by the OS page
cache.
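
The access pattern around a thread-local engine can be sketched as follows. This is a minimal illustration, not the crate's code: the `Engine` stand-in with a single `node_count` field and the helpers `with_engine`, `with_engine_mut`, and `reset_engine` are hypothetical names chosen for the example.

```rust
use std::cell::RefCell;

// Hypothetical stand-in for the real `Engine`; it only tracks a node count.
#[derive(Default)]
struct Engine {
    node_count: usize,
}

impl Engine {
    fn new() -> Self {
        Self::default()
    }
}

thread_local! {
    // One engine per thread, which in a PostgreSQL backend means one per process.
    static ENGINE: RefCell<Engine> = RefCell::new(Engine::new());
}

/// Run `f` with shared access to this backend's engine.
fn with_engine<R>(f: impl FnOnce(&Engine) -> R) -> R {
    ENGINE.with(|cell| f(&cell.borrow()))
}

/// Run `f` with exclusive access to this backend's engine.
fn with_engine_mut<R>(f: impl FnOnce(&mut Engine) -> R) -> R {
    ENGINE.with(|cell| f(&mut cell.borrow_mut()))
}

/// Replace the engine with a fresh one (assumed to resemble a reset).
fn reset_engine() {
    ENGINE.with(|cell| *cell.borrow_mut() = Engine::new());
}
```

Funnelling access through closures keeps each `RefCell` borrow short, so a nested call cannot accidentally hold a `borrow_mut()` across another engine access without panicking close to the mistake.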
| Object | Owner | Lifetime |
|---|---|---|
| `Engine` | One PostgreSQL backend process | Backend lifetime or until `graph.reset()` |
| `NodeStore` owned mode | `Engine` heap | Mutable build/sync lifetime |
| `NodeStore` mmap mode | Raw pointers into `Engine._mmap` | Valid while `_mmap` remains owned by the engine |
| Forward `EdgeStore` mmap mode | Raw pointers into `Engine._mmap` | Valid while `_mmap` remains owned by the engine |
| `reverse_edge_store` | `Engine` heap | Rebuilt per backend from forward CSR |
| `FilterIndex` after load | `Engine` heap | Deserialized per backend from bincode |
| `edge_type_registry` after load | `Engine` heap | Deserialized per backend from bincode |
| `ResolutionIndex` mmap mode | Byte slice inside `Engine._mmap` | Valid while `_mmap` remains owned by the engine |
| `edge_buffer` and `resolution_delta` | `Engine` heap | Backend-local sync overlay state |
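
The "valid while `_mmap` remains owned by the engine" rows describe a store that holds raw pointers into a buffer owned by the same struct. The sketch below illustrates that shape under loud assumptions: a `Vec<u8>` stands in for the real mmap, and `MappedBytes`, `node_active`, `from_bytes`, and `active_flags` are hypothetical names.

```rust
// Hypothetical raw view into the engine-owned buffer.
struct MappedBytes {
    ptr: *const u8,
    len: usize,
}

struct Engine {
    // View into `_mmap`. It carries no lifetime, so correctness depends on
    // the engine keeping `_mmap` alive and unmodified.
    node_active: MappedBytes,
    // Owning buffer; a `Vec<u8>` plays the role of the mmap in this sketch.
    _mmap: Vec<u8>,
}

impl Engine {
    fn from_bytes(bytes: Vec<u8>) -> Self {
        let node_active = MappedBytes {
            ptr: bytes.as_ptr(),
            len: bytes.len(),
        };
        // Moving the Vec into the struct does not move its heap allocation.
        Engine { node_active, _mmap: bytes }
    }

    /// Borrow the mapped bytes for no longer than the engine itself.
    fn active_flags(&self) -> &[u8] {
        // SAFETY: `ptr` and `len` were taken from `_mmap`, which this engine
        // still owns and never mutates or reallocates after construction.
        unsafe { std::slice::from_raw_parts(self.node_active.ptr, self.node_active.len) }
    }
}
```

Returning a slice tied to `&self` is what turns the table's lifetime rule into something the borrow checker enforces for callers.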
## Data Flow: Build
A build populates the following structures:
- NodeStore rows
- ResolutionIndexBuilder entries
- FilterIndex values
- tenant membership bitmaps
- forward CSR EdgeStore
- reverse CSR EdgeStore
The build path is allowed to allocate and sort. Query paths are not.
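
To make the "allocate and sort" step concrete, here is a minimal sketch of building a forward CSR from an edge list. The `Csr` type and `build_csr` function are hypothetical and omit edge types and weights; the real builder is assumed to be more involved.

```rust
/// Minimal CSR: `targets[offsets[n]..offsets[n + 1]]` holds the neighbors of node `n`.
#[derive(Debug, PartialEq)]
struct Csr {
    offsets: Vec<u32>,
    targets: Vec<u32>,
}

/// Build a CSR from `(source, target)` pairs; node indexes must be below `node_count`.
fn build_csr(node_count: usize, edges: &[(u32, u32)]) -> Csr {
    // Sorting by (source, target) groups each node's edges and orders its neighbors.
    let mut sorted = edges.to_vec();
    sorted.sort_unstable();

    // Count the out-degree of each node, shifted by one slot.
    let mut offsets = vec![0u32; node_count + 1];
    for &(src, _) in &sorted {
        offsets[src as usize + 1] += 1;
    }
    // A prefix sum turns the counts into slice boundaries.
    for i in 0..node_count {
        offsets[i + 1] += offsets[i];
    }

    let targets = sorted.iter().map(|&(_, dst)| dst).collect();
    Csr { offsets, targets }
}

impl Csr {
    /// Neighbor slice for `node`; no allocation on the read side.
    fn neighbors(&self, node: u32) -> &[u32] {
        let start = self.offsets[node as usize] as usize;
        let end = self.offsets[node as usize + 1] as usize;
        &self.targets[start..end]
    }
}
```

All allocation and sorting happens in `build_csr`; `neighbors` only slices existing arrays, which is the split the paragraph above describes.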
## Data Flow: Load
`load_graph_file()` validates the file before constructing any mmap-backed
store. A successful load produces:
- mmap-backed NodeStore arrays
- mmap-backed forward EdgeStore arrays
- mmap-backed ResolutionIndex section
- heap FilterIndex from bincode
- heap edge_type_registry from bincode
- heap reverse_edge_store from forward CSR
The forward graph arrays and resolution section are mmap-backed. The reverse
CSR and bincode sections are backend-local heap allocations today.
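
Deriving the reverse CSR from the forward arrays is a transposition. The sketch below shows one common way to do it with a counting pass and a scatter pass; `reverse_csr` is a hypothetical name, and the crate's actual rebuild may differ.

```rust
/// Transpose a forward CSR into a reverse CSR.
/// Assumes `offsets` has `node_count + 1` entries and every target is a valid node index.
fn reverse_csr(offsets: &[u32], targets: &[u32]) -> (Vec<u32>, Vec<u32>) {
    let node_count = offsets.len() - 1;

    // Pass 1: count in-degrees, then prefix-sum them into reverse offsets.
    let mut rev_offsets = vec![0u32; node_count + 1];
    for &dst in targets {
        rev_offsets[dst as usize + 1] += 1;
    }
    for i in 0..node_count {
        rev_offsets[i + 1] += rev_offsets[i];
    }

    // Pass 2: scatter each source into the next free slot of its target.
    let mut cursor = rev_offsets[..node_count].to_vec();
    let mut rev_targets = vec![0u32; targets.len()];
    for src in 0..node_count {
        let start = offsets[src] as usize;
        let end = offsets[src + 1] as usize;
        for &dst in &targets[start..end] {
            let slot = &mut cursor[dst as usize];
            rev_targets[*slot as usize] = src as u32;
            *slot += 1;
        }
    }
    (rev_offsets, rev_targets)
}
```

Because sources are visited in increasing order, each node's inbound list comes out sorted without a separate sort.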
## Data Flow: Query
A SQL-facing query call passes through these stages:
- check graph.enabled
- auto-load persisted graph if needed
- validate call options
- ACL/admin check where required
- resolve seed table+PK to node_idx
- select forward or reverse CSR
- run bounded traversal/path/search algorithm
- apply filters, tenants, overlays, pagination
Graph algorithms operate on compact node indexes. SQL-facing functions translate
between PostgreSQL coordinates and those internal indexes.
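
The translate, traverse, translate-back shape can be sketched as a bounded breadth-first search. Everything here is illustrative: `Graph`, `TraverseError`, the `HashMap`-based resolution, and the `max_nodes` limit are hypothetical names, and the sketch allocates a visited vector and a queue per call for readability, which a real query path may avoid.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical in-memory graph: forward CSR plus a two-way resolution map.
struct Graph {
    offsets: Vec<u32>,
    targets: Vec<u32>,
    /// (table OID, primary key text) -> node index.
    resolve: HashMap<(u32, String), u32>,
    /// node index -> (table OID, primary key text).
    coords: Vec<(u32, String)>,
}

#[derive(Debug, PartialEq)]
enum TraverseError {
    UnknownSeed,
    NodeLimitExceeded,
}

impl Graph {
    /// Breadth-first traversal from a SQL-style seed, bounded by depth and result size.
    fn traverse(
        &self,
        table_oid: u32,
        pk: &str,
        max_depth: u32,
        max_nodes: usize,
    ) -> Result<Vec<(u32, String, u32)>, TraverseError> {
        // Translate PostgreSQL coordinates into a compact node index.
        let seed = *self
            .resolve
            .get(&(table_oid, pk.to_string()))
            .ok_or(TraverseError::UnknownSeed)?;

        let mut visited = vec![false; self.coords.len()];
        visited[seed as usize] = true;
        let mut queue = VecDeque::from([(seed, 0u32)]);
        let mut rows = Vec::new();

        while let Some((node, depth)) = queue.pop_front() {
            // Circuit breaker: fail rather than grow the result without bound.
            if rows.len() >= max_nodes {
                return Err(TraverseError::NodeLimitExceeded);
            }
            // Translate the node index back into SQL-facing coordinates.
            let (oid, key) = &self.coords[node as usize];
            rows.push((*oid, key.clone(), depth));
            if depth == max_depth {
                continue;
            }
            let start = self.offsets[node as usize] as usize;
            let end = self.offsets[node as usize + 1] as usize;
            for &next in &self.targets[start..end] {
                if !visited[next as usize] {
                    visited[next as usize] = true;
                    queue.push_back((next, depth + 1));
                }
            }
        }
        Ok(rows)
    }
}
```

The algorithm itself only touches `u32` indexes; table OIDs and key text appear only at the entry and exit of `traverse`.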
## Sync And Maintenance Flow
Trigger sync is deliberately explicit. Query functions do not hide sync
catch-up work.
Node inserts and tombstones can update backend-local state. Edge mutations use
overlay buffers until maintenance or vacuum rebuilds the base CSR.
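
One way to picture an edge overlay is a read path that merges pending changes over an immutable base CSR, plus a rebuild step that folds them in. The sketch below is an assumption-laden illustration: `OverlayGraph`, its `added` and `removed` buffers, and `rebuild` are hypothetical and do not describe the crate's actual `edge_buffer` layout.

```rust
use std::collections::HashSet;

/// Hypothetical overlay: an immutable base CSR plus pending edge changes.
struct OverlayGraph {
    offsets: Vec<u32>,
    targets: Vec<u32>,
    /// Edges inserted since the last rebuild.
    added: Vec<(u32, u32)>,
    /// Base edges deleted since the last rebuild.
    removed: HashSet<(u32, u32)>,
}

impl OverlayGraph {
    /// Neighbors as a query would see them: base minus tombstones, plus additions.
    fn neighbors(&self, node: u32) -> Vec<u32> {
        let start = self.offsets[node as usize] as usize;
        let end = self.offsets[node as usize + 1] as usize;
        let base = self.targets[start..end]
            .iter()
            .copied()
            .filter(|&dst| !self.removed.contains(&(node, dst)));
        let extra = self
            .added
            .iter()
            .filter(|&&(src, _)| src == node)
            .map(|&(_, dst)| dst);
        base.chain(extra).collect()
    }

    /// Fold the overlay into a fresh base CSR and clear the buffers.
    fn rebuild(&mut self) {
        let node_count = self.offsets.len() - 1;
        let mut edges = Vec::new();
        for src in 0..node_count as u32 {
            edges.extend(self.neighbors(src).into_iter().map(|dst| (src, dst)));
        }
        edges.sort_unstable();

        // Same count-then-prefix-sum construction as an initial build.
        let mut offsets = vec![0u32; node_count + 1];
        for &(src, _) in &edges {
            offsets[src as usize + 1] += 1;
        }
        for i in 0..node_count {
            offsets[i + 1] += offsets[i];
        }
        self.targets = edges.iter().map(|&(_, dst)| dst).collect();
        self.offsets = offsets;
        self.added.clear();
        self.removed.clear();
    }
}
```

Until `rebuild` runs, every read pays for scanning the overlay, which is one reason to keep rebuild work explicit and visible rather than hidden inside queries.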
## Design Decisions
| Decision | Why it exists |
|---|---|
| SQL is the public API | Keeps application integration inside PostgreSQL and avoids a new query language. |
| Source tables stay authoritative | pgGraph is an acceleration layer, not a second source of truth. |
| Backend-local `Engine` | Matches PostgreSQL process isolation and avoids shared mutable Rust state. |
| CSR for topology | Compact adjacency slices make traversal cache-friendly and predictable. |
| Reverse CSR is materialized | Inbound traversal stays O(degree) instead of scanning all forward edges. |
| Read-only mmap for persisted forward arrays | Later backends can start quickly and share immutable derived artifact pages through the OS page cache without replacing PostgreSQL's buffer pool. |
| Bincode metadata is heap-loaded | Filter and registry structures are variable-size Rust data that are easier to validate and use as owned values. |
| Explicit maintenance | Expensive rebuild work is visible and controllable from SQL. |
| Circuit breakers everywhere | The extension runs in PostgreSQL backends and must bound memory and traversal work. |
## Safety Boundaries
Unsafe code exists for performance and PostgreSQL integration, not as a general
escape hatch. The core unsafe boundary is mmap-backed store construction:
| Boundary | Required invariant |
|---|---|
| `MmapNodeArrays` | Active bytes, OID array, PK offsets, and PK byte ranges are present, aligned, and bounded by the mmap. |
| `MmapEdgeArrays` | CSR offsets, targets, type IDs, and optional weights are present, aligned, and bounded by the mmap. |
| `Engine._mmap` | The mmap outlives every NodeStore, forward EdgeStore, and ResolutionIndex lookup that borrows from it. |
| `raise_graph_error()` | PostgreSQL error FFI is called with stable strings and is treated as non-returning at the SQL boundary. |
Detailed rules live in [Safety And Security](./safety-security). Keep rustdoc
`# Safety` sections and local `// SAFETY:` comments current when touching any
unsafe area.
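
The "present, aligned, and bounded" invariant can be illustrated with a small checked constructor for one section. This is a sketch under assumptions: `u32_section` and `SectionError` are hypothetical names, and a plain byte slice stands in for the mmap.

```rust
use std::mem::{align_of, size_of};

#[derive(Debug, PartialEq)]
enum SectionError {
    Overflow,
    OutOfBounds,
    Misaligned,
}

/// View `count` `u32` values starting at `byte_offset` inside `bytes`.
fn u32_section(bytes: &[u8], byte_offset: usize, count: usize) -> Result<&[u32], SectionError> {
    // Reject sizes that would wrap before comparing against the buffer length.
    let byte_len = count
        .checked_mul(size_of::<u32>())
        .ok_or(SectionError::Overflow)?;
    let end = byte_offset
        .checked_add(byte_len)
        .ok_or(SectionError::Overflow)?;
    if end > bytes.len() {
        return Err(SectionError::OutOfBounds);
    }

    let ptr = bytes[byte_offset..end].as_ptr();
    if ptr as usize % align_of::<u32>() != 0 {
        return Err(SectionError::Misaligned);
    }

    // SAFETY: the range lies inside `bytes`, the pointer is aligned for `u32`,
    // every bit pattern is a valid `u32`, and the returned slice borrows `bytes`,
    // so it cannot outlive the underlying buffer.
    Ok(unsafe { std::slice::from_raw_parts(ptr.cast::<u32>(), count) })
}
```

Doing all checks before the single `unsafe` line keeps the `// SAFETY:` comment short enough to audit against the table above.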
## Where To Make Changes
| Change | Start here |
|---|---|
| SQL function shape or return columns | `src/sql_facade/*`, then `docs/user_guide/api-reference.mdx` |
| Registration validation | `src/catalog/validate.rs` |
| Build ingestion | `src/builder.rs` |
| Traversal behavior | `src/bfs.rs`, `src/engine.rs`, `src/sql_facade/traversal.rs` |
| Shortest path behavior | `src/path_finder.rs`, `src/engine.rs` |
| Persistence format | `src/persistence.rs`, `docs/contributor_guide/persistence-format.mdx` |
| Mmap-backed stores | `src/node_store.rs`, `src/edge_store.rs`, `src/persistence.rs` |
| Sync behavior | `src/sync.rs`, `src/sql_sync.rs`, `src/sql_facade/admin.rs` |
| SQLSTATE or error semantics | `src/safety.rs`, `docs/user_guide/troubleshooting.mdx` |