# Playground The pgGraph sandbox playground is a one-command Streamlit inspector for trying pgGraph against PostgreSQL without manually preparing a database. It starts the repository Docker image, prepares the Panama Papers dataset, loads pgGraph, warms the PostgreSQL connection, and opens a dark-mode SQL workspace where you can run pgGraph functions directly. ## Start It From the repository root: ```bash sandbox/start_playground.sh ``` The script creates a Python virtualenv under `sandbox/playground/.venv` when needed and installs the playground dependencies there. It also starts or reuses the `pggraph-sandbox` PostgreSQL container. By default, the playground starts in `csr_readonly` mode. To start the same dataset with mutable GQL writes enabled: ```bash PGGRAPH_PLAYGROUND_MODE=mutable sandbox/start_playground.sh ``` The quickstart wrapper accepts the same mode as an argument: ```bash scripts/quickstart.sh playground panama csr scripts/quickstart.sh playground panama mutable ``` The first run may take time because it downloads and prepares the Panama Papers dataset: - compressed archive: 73 MB; - extracted source data: 626 MB. ## Requirements You need: - Docker or Docker Desktop; - Python 3.10 or newer; - network access for the first dataset download. If Docker is missing or the daemon is not running, the script prints direct installation links for macOS, Windows, and Linux. ## What The Playground Loads The playground currently loads Panama Papers / ICIJ Offshore Leaks data. It is intended as a quick inspector and SQL demo environment, not a full benchmark runner. For LDBC and benchmark reports, use: ```bash sandbox/run_benchmarks.sh ldbc ``` ## Using The SQL Workspace The playground opens with a question-oriented SQL editor. The default examples show: - `graph.status()`; - registered tables and edges; - search over Panama entities; - traversal from a high-degree node; - connected component functions; - admin and operator views such as `graph.sync_health()`, `graph.run_scheduled_maintenance()`, build status, and maintenance status. You can edit the SQL and run any query against the Docker-backed PostgreSQL 17 database with pgGraph installed. Useful starting points: ```sql SELECT * FROM graph.status(); SELECT * FROM graph.sync_health(); SELECT * FROM graph.registered_tables(); SELECT * FROM graph.registered_edges(); ``` ```sql SELECT * FROM graph.search('name', 'Mossack', mode := 'contains', max_results := 25); ``` ```sql SELECT * FROM graph.component_stats(); ``` The app warms the graph before reading status so node and edge counts should reflect the loaded dataset. If the graph is still building, the app waits for the build to complete or reports the build error. ## Built-In Query Catalog The sidebar query catalog is shared with the playground release gate. It currently validates 40 CSR examples and 41 mutable examples: | Section | Queries | |---|---| | Core Functions | Status + Catalog; Search Mossack; Find Mossack; Traverse Neighborhood; Expand Neighborhood; Shortest Path; Component Stats; Largest Component | | GQL Examples | GQL Parameterized Match; GQL Scalar Projection; GQL One-Hop Relationships; GQL Relationship Projection; GQL Inbound Relationships; GQL Undirected Relationships; GQL Distinct Labels; GQL Aggregated Neighbors; GQL Aggregate By Label; GQL Collect Neighbor Labels; GQL Variable-Length Paths; GQL Path Functions; GQL Hydration Off; GQL Explain | | Mutable GQL Writes | Mutable GQL Merge Node; Mutable GQL Merge Update | | Sample Workflows | Table Sizes; Relationship Label Counts; Top Connected Officers; Top Connected Entities; Entity Direct Relationships; Officer Context Packet; Search Entity Then Expand; Relationship Filtered Walk; Capped 3-Hop Investigation | | Admin | Build Graph; Build Graph Concurrently in CSR mode; Build Status; Sync Health; Apply Sync; Scheduled Maintenance; Vacuum Graph; Maintenance; Maintenance Status | The GQL examples use the registered Panama `nodes` label and dynamic relationship labels from `panama.edges.rel_type`. You can edit those examples or run other supported GQL subset queries through `graph.gql()` and `graph.gql_explain()`. Supported GQL in the playground includes: - node-only `MATCH`; - directed, inbound, undirected, and bounded variable-length relationship `MATCH` patterns; - top-level single-relationship `OPTIONAL MATCH`; - property predicates, inline property maps, JSONB dotted property paths, `$parameters`, null checks, membership, and boolean combinations; - `RETURN`, `RETURN DISTINCT`, `WITH DISTINCT`, aliases, `ORDER BY`, `SKIP`, and `LIMIT`; - aggregates `count`, `sum`, `avg`, `min`, `max`, and `collect`; - named relationship path values plus `nodes(path)`, `relationships(path)`, and `length(path)`; - mutable-overlay writes when the graph is built in mutable mode: single-node `CREATE`, single-node `MERGE`, single-property `SET` and `REMOVE`, directed relationship `DELETE` backed by a registered edge row table, and single-node `DETACH DELETE`. The CSR sidebar focuses on read queries and admin/operator checks. The mutable sidebar adds idempotent MERGE examples and rebuilds the graph with `mutable_overlay`. ## Release Gate The playground SQL examples are part of the heavy release gate. The gate imports the same query catalog used by the Streamlit sidebar and checks each example against fixed result summaries for the Panama dataset: ```bash cd graph ./tests/heavy/playground_release_gate.sh PGGRAPH_PLAYGROUND_MODE=mutable ./tests/heavy/playground_release_gate.sh ``` If a playground query is added, removed, or changed, update the gate expectations deliberately after reviewing the new result shape. ## Cleanup To remove generated playground and benchmark artifacts: ```bash sandbox/cleanup.sh --dry-run sandbox/cleanup.sh ``` Cleanup removes downloaded datasets, normalized CSVs, generated benchmark reports, local virtualenvs, and the sandbox Docker container. Add `--remove-image` to also remove the built PostgreSQL image.