{ "nbformat": 4, "nbformat_minor": 5, "metadata": { "kernelspec": { "name": "provsql-studio", "display_name": "ProvSQL (SQL)", "language": "sql" }, "language_info": { "name": "sql" }, "provsql": { "scheme": "semiring", "database": "cs1", "generated_from": "doc/source/user/casestudy1.rst" } }, "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Case Study: Intelligence Agency\n", "\n", "This case study, largely extending the scenario introduced when ProvSQL was first presented, demonstrates ProvSQL’s custom semiring capability, where-provenance, probability computation with multiple algorithms, and circuit export through a security-classification scenario.\n", "\n", "## The Scenario\n", "\n", "An intelligence agency maintains a database of seven employees spread across three cities. Every employee holds a security clearance ranging from *unclassified* to *top secret*. Your tasks:\n", "\n", "- identify which cities are served by more than one agent,\n", "- determine the minimum clearance level needed to infer each result,\n", "- find cities with exactly one agent (sensitive: if the city leaks, the sole agent is exposed),\n", "- track where in the database each output value originated,\n", "- compute the probability that a city remains a single-agent post after accounting for possible-world uncertainty.\n", "\n", "## Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*The following cells set up the database with all the content this notebook requires; run them first, ideally on a fresh database.*" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "DROP TYPE IF EXISTS classification_level CASCADE;\n", "CREATE TYPE classification_level AS ENUM (\n", " 'unclassified', 'restricted', 'confidential', 'secret', 'top_secret',\n", " 'unavailable'\n", ");" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "DROP TABLE IF EXISTS personnel CASCADE;\n", "CREATE TABLE personnel (\n", " id SERIAL PRIMARY KEY,\n", " name TEXT NOT NULL,\n", " position TEXT NOT NULL,\n", " city TEXT NOT NULL,\n", " classification classification_level NOT NULL\n", ");" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "TRUNCATE personnel;\n", "INSERT INTO personnel (name, position, city, classification) VALUES\n", " ('Juma', 'Director', 'Nairobi', 'unclassified'),\n", " ('Paul', 'Janitor', 'Nairobi', 'restricted'),\n", " ('David', 'Analyst', 'Paris', 'confidential'),\n", " ('Ellen', 'Field agent', 'Beijing', 'secret'),\n", " ('Aaheli', 'Double agent', 'Paris', 'top_secret'),\n", " ('Nancy', 'HR', 'Paris', 'restricted'),\n", " ('Jing', 'Analyst', 'Beijing', 'secret');" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This creates:\n", "\n", "- `classification_level` – an ordered ENUM (`unclassified` \\< `restricted` \\< `confidential` \\< `secret` \\< `top_secret` \\< `unavailable`) where `unavailable` is a sentinel representing the semiring 𝟘 (no derivation possible)\n", "- `personnel` – 7 agents with name, position, city, and clearance level\n", "\n", "## Step 1: Explore the Database\n", "\n", "Inspect the `personnel` table:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT * FROM personnel ORDER BY id;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You should see seven rows: Juma (Director, Nairobi), Paul (Janitor, Nairobi), David (Analyst, Paris), Ellen (Field agent, Beijing), Aaheli (Double agent, Paris), Nancy (HR, Paris), and Jing (Analyst, Beijing).\n", "\n", "## Step 2: Enable Provenance and Create a Name Mapping\n", "\n", "Enable provenance tracking on `personnel` and create a mapping so that provenance tokens can be labelled with agent names:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT add_provenance('personnel');\n", "DROP TABLE IF EXISTS personnel_name;\n", "SELECT create_provenance_mapping('personnel_name', 'personnel', 'name');" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After [`add_provenance`](https://provsql.org/doxygen-sql/html/group__table__management.html#ga33ff696dabb05d813f2c1f914cb97d9a), every row of `personnel` has a unique UUID token in its hidden `provsql` column. The mapping `personnel_name` associates each token with the corresponding agent’s name.\n", "\n", "## Step 3: Cities Shared by Multiple Agents\n", "\n", "Which cities have at least two agents? Use a self-join with an `id` inequality to generate all unordered pairs, then apply [`sr_formula`](https://provsql.org/doxygen-sql/html/group__compiled__semirings.html#ga3aad4775805b92307f3ca9e1b9fbd4c5) to see which agents contribute to each city:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT p1.city,\n", " sr_formula(provenance(), 'personnel_name') AS formula\n", "FROM personnel p1\n", "JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", "GROUP BY p1.city\n", "ORDER BY p1.city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The formula for Nairobi is `Juma ⊗ Paul`: both agents must be present for Nairobi to appear. Beijing similarly shows `Ellen ⊗ Jing`. Paris, with three agents, shows all three pairwise products joined by `⊕`.\n", "\n", "## Step 4: Minimum Security Clearance (sr_minmax)\n", "\n", "For each shared city, what is the *minimum clearance level* required to have inferred that the city has multiple agents? An analyst who knows the city only needs to see the lowest-cleared agent there.\n", "\n", "This is the security shape of the min-max m-semiring, computed by the compiled function [`sr_minmax`](https://provsql.org/doxygen-sql/html/group__compiled__semirings.html#ga21bc3ee408f01de06e82ff21e7a01458) over the `classification_level` enum:\n", "\n", "- `⊕` (OR combination) = `min`: to infer *either* agent was involved, you only need clearance for the less-classified one (one witness suffices to establish the disjunction).\n", "- `⊗` (AND/join) = `max`: to confirm *both* agents are present, you need clearance for the more-classified one (you must be able to access both records to establish the join).\n", "\n", "The third argument is a sample value of the carrier enum (used only for type inference); its value is ignored:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "DROP TABLE IF EXISTS personnel_level;\n", "SELECT create_provenance_mapping('personnel_level',\n", " 'personnel', 'classification');\n", "\n", "SELECT p1.city,\n", " sr_minmax(provenance(), 'personnel_level',\n", " 'unclassified'::classification_level) AS min_clearance\n", "FROM personnel p1\n", "JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", "GROUP BY p1.city\n", "ORDER BY p1.city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Results: Nairobi requires `restricted` (Paul is the more-classified of the two agents, and both must be accessed to confirm the pair). Beijing requires `secret` (both Ellen and Jing hold the same level). Paris requires `confidential`: the pair David–Nancy has MAX clearance `confidential`, which is the lowest maximum among all Paris pairs, so `confidential` clearance suffices to confirm at least one pair.\n", "\n", "## Step 5: Cities with Exactly One Agent (EXCEPT / Monus)\n", "\n", "A city with a single agent is sensitive: knowing the city immediately identifies the agent. Find cities where *all* agents are alone using `EXCEPT`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " sr_formula(provenance(), 'personnel_name') AS formula\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "ORDER BY city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:**\n", "\n", "> ProvSQL’s `EXCEPT` uses the *monus* operator `⊖` of the provenance semiring rather than plain set difference. Every city appears in the result with a provenance formula; the formula evaluates to `𝟘` for cities that are definitely shared and to a non-trivial expression for cities that *could* be single-agent in some possible world. Nairobi’s formula `(Juma ⊕ Paul) ⊖ (Juma ⊗ Paul)` reads: “at least one of Juma or Paul is present, minus the event where both are.”\n", "\n", "## Step 6: Where-Provenance\n", "\n", "Where-provenance tracks *which column of which input row* produced each output value. Enable it and re-run the shared-city query:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SET provsql.provenance = 'where';\n", "\n", "SELECT p1.city,\n", " where_provenance(provenance()) AS source\n", "FROM personnel p1\n", "JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", "GROUP BY p1.city\n", "ORDER BY p1.city;\n", "\n", "SET provsql.provenance = 'semiring';" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each city output value is traced back to the `city` column (column 4) of the `personnel` table for every agent in that city. The notation `[personnel:〈token〉:4;...]` shows the token and column index of each contributing row; the trailing `[]` is the untracked `source` column itself.\n", "\n", "## Step 7: Assign Probabilities\n", "\n", "Suppose the existence of each agent in the database is uncertain. Assign each agent a probability equal to `id / 10.0`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "ALTER TABLE personnel ADD COLUMN IF NOT EXISTS probability DOUBLE PRECISION;\n", "UPDATE personnel SET probability = id / 10.0;\n", "\n", "DO $$ BEGIN\n", " PERFORM set_prob(provenance(), probability) FROM personnel;\n", "END $$;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now Juma has probability 0.1, Paul 0.2, …, Jing 0.7.\n", "\n", "## Step 8: Probability – Exact\n", "\n", "Compute the exact probability that each city is a single-agent city:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " ROUND(probability_evaluate(provenance())::numeric, 4) AS prob\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "ORDER BY city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Nairobi (agents with probabilities 0.1 and 0.2) has probability `0.1 × 0.8 + 0.9 × 0.2 = 0.26` of having exactly one agent. Beijing (0.4 and 0.7) scores `0.54`. Paris (0.3, 0.5, 0.6) gives `0.41`.\n", "\n", "## Step 9: Probability – Monte Carlo\n", "\n", "For larger circuits, exact evaluation can be expensive. Monte Carlo sampling gives an approximate answer:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " probability_evaluate(\n", " provenance(), 'monte-carlo', '10000') AS prob\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "ORDER BY city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With 10 000 samples the result is accurate to roughly ±0.01 (see [Margin of error](https://en.wikipedia.org/wiki/Margin_of_error)). Results will vary slightly between runs due to sampling. Compare the runtime against the exact method (`\\timing` in psql, or the per-query timing Studio displays).\n", "\n", "## Step 10: Probability – Knowledge Compiler\n", "\n", "**Note:**\n", "\n", "> This step uses an external knowledge compiler such as `d4` or `dsharp`, which must be on your `PATH` (or in a directory listed in the `provsql.tool_search_path` GUC, see [the configuration chapter](https://provsql.org/docs/user/configuration.html)). The query below guards the call with [`tool_available`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga7be235a742f0afc6ddb56ee2b1c025c1), so where no such compiler is installed – in particular in the Playground, which bundles none – it reports that instead of failing, and you can read on.\n", "\n", "A knowledge compiler converts the provenance circuit to a *d-DNNF* representation, which enables efficient exact probability evaluation on large circuits of specific forms:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " CASE WHEN tool_available('d4')\n", " THEN ROUND(probability_evaluate(\n", " provenance(), 'compilation', 'd4')::numeric, 4)::text\n", " ELSE '(needs external compiler d4; unavailable here)'\n", " END AS prob\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "ORDER BY city;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare the runtime (`\\timing` in psql, or Studio’s per-query timing) against the possible-worlds and Monte Carlo methods. On this small example, the external knowledge compiler will be slower than the other methods: invoking an external process and compiling the circuit carries significant overhead that only pays off on much larger circuits.\n", "\n", "## Step 11: Visualise a Provenance Circuit\n", "\n", "[`view_circuit`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga9c312a8b72b968e3669432ab5ae49928) renders the provenance circuit as an ASCII box-art diagram using [graph-easy](https://metacpan.org/dist/Graph-Easy) (which must be on your `PATH`). The query guards the call with [`tool_available`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga7be235a742f0afc6ddb56ee2b1c025c1), so without `graph-easy` – in particular in the Playground, where Studio’s interactive [circuit mode](https://provsql.org/docs/user/studio.html) is the better view anyway – it says so rather than failing:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " CASE WHEN tool_available('graph-easy')\n", " THEN view_circuit(provenance(), 'personnel_name')\n", " ELSE '(needs the graph-easy renderer; use Studio circuit mode)'\n", " END AS circuit\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "WHERE city = 'Nairobi';" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result shows the monus gate at the top, with `Juma` and `Paul` as leaf inputs, rendered in box-art notation in the terminal.\n", "\n", "## Step 12: Export to XML\n", "\n", "[`to_provxml`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga60e8ffbac337ba6cb9e32b64672b802f) serialises the provenance circuit to [PROV-XML](https://www.w3.org/TR/prov-xml/), the W3C standard for provenance interchange, which can be processed by any standard XML tool:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city,\n", " to_provxml(provenance(), 'personnel_name') AS provxml\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "WHERE city = 'Nairobi';" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step 13: Large Circuit Benchmark\n", "\n", "To compare the three probability algorithms at scale, create a synthetic 100 × 100 random-probability matrix and enable provenance tracking on it:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "DROP TABLE IF EXISTS matrix CASCADE;\n", "CREATE TABLE matrix AS\n", "SELECT ones.n + 10 * tens.n AS x,\n", " other.n + 10 * tens2.n AS y,\n", " random() AS prob\n", "FROM (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) ones(n),\n", " (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) tens(n),\n", " (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) other(n),\n", " (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) tens2(n);\n", "\n", "SELECT add_provenance('matrix');\n", "DO $$ BEGIN\n", " PERFORM set_prob(provenance(), prob) FROM matrix;\n", "END $$;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now run the same path query with each method in turn, timing each (`\\timing` in psql, or Studio’s per-query timing):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "-- Default method (independent evaluation, tree decomposition, or d4)\n", "SELECT m1.x, m2.y,\n", " probability_evaluate(provenance()) AS prob\n", "FROM matrix m1, matrix m2\n", "WHERE m2.x = m1.y AND m1.x > 90 AND m2.x > 90 AND m2.y > 90\n", "GROUP BY m1.x, m2.y\n", "ORDER BY m1.x, m2.y;" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "-- Exact enumeration over all possible worlds\n", "SELECT m1.x, m2.y,\n", " probability_evaluate(provenance(), 'possible-worlds') AS prob\n", "FROM matrix m1, matrix m2\n", "WHERE m2.x = m1.y AND m1.x > 90 AND m2.x > 90 AND m2.y > 90\n", "GROUP BY m1.x, m2.y\n", "ORDER BY m1.x, m2.y;" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "-- Monte Carlo sampling (9604 samples ≈ 1 % error at 95 % confidence)\n", "SELECT m1.x, m2.y,\n", " probability_evaluate(provenance(), 'monte-carlo', '9604') AS prob\n", "FROM matrix m1, matrix m2\n", "WHERE m2.x = m1.y AND m1.x > 90 AND m2.x > 90 AND m2.y > 90\n", "GROUP BY m1.x, m2.y\n", "ORDER BY m1.x, m2.y;" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "-- Tree-decomposition-based exact compilation (built-in, no external tool)\n", "SELECT m1.x, m2.y,\n", " probability_evaluate(provenance(), 'tree-decomposition') AS prob\n", "FROM matrix m1, matrix m2\n", "WHERE m2.x = m1.y AND m1.x > 90 AND m2.x > 90 AND m2.y > 90\n", "GROUP BY m1.x, m2.y\n", "ORDER BY m1.x, m2.y;" ], "outputs": [] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "-- Knowledge compilation via external tool d4 (guarded with\n", "-- tool_available, so it reports rather than fails where d4 is\n", "-- absent, e.g. in the Playground)\n", "SELECT m1.x, m2.y,\n", " CASE WHEN tool_available('d4')\n", " THEN probability_evaluate(provenance(), 'compilation', 'd4')::text\n", " ELSE '(d4 unavailable here)'\n", " END AS prob\n", "FROM matrix m1, matrix m2\n", "WHERE m2.x = m1.y AND m1.x > 90 AND m2.x > 90 AND m2.y > 90\n", "GROUP BY m1.x, m2.y\n", "ORDER BY m1.x, m2.y;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Monte Carlo query uses `9604` samples, which gives roughly 1 % additive error with 95 % confidence (by the formula $n = z^2 / (4\\varepsilon^2)$ with $z = 1.96$, $\\varepsilon = 0.01$; see [Margin of error](https://en.wikipedia.org/wiki/Margin_of_error)).\n", "\n", "The `'tree-decomposition'` method is exact and built into ProvSQL (no external binary required). It is often the fastest exact method on simple queries, but it fails on circuits with high treewidth – when that happens, fall back to `'compilation'` or one of the other methods.\n", "\n", "## Step 14: The Boolean Expression Behind a Token\n", "\n", "[`sr_boolexpr`](https://provsql.org/doxygen-sql/html/group__compiled__semirings.html#gaf221001c4d22d033906fdcaa82864f7a) returns the abstract Boolean formula of a provenance circuit. Without a mapping it uses internal variable names `x0`, `x1`…; with an optional second argument naming a provenance mapping table the leaves are labelled by the mapping’s `value` column instead. This is the same expression ProvSQL hands to its d-DNNF compilers internally to compute probabilities." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT city, sr_boolexpr(provenance()) AS boolexpr\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "WHERE city = 'Nairobi';" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For Nairobi, the result is the circuit `(Juma ⊕ Paul) ⊖ (Juma ⊗ Paul)` from Step 5, interpreted in the Boolean function semiring – every provenance gate is mapped to its Boolean counterpart (`⊕` to `∨`, `⊗` to `∧`, `⊖` to `∧¬`) – and the resulting Boolean function rendered as a formula over anonymous variables. Unlike [`sr_formula`](https://provsql.org/doxygen-sql/html/group__compiled__semirings.html#ga3aad4775805b92307f3ca9e1b9fbd4c5), the provenance mapping is optional: the expression captures the circuit’s logical structure independently of any naming, and a mapping can be supplied later if you want the leaves labelled.\n", "\n", "## Step 15: Programmatic Circuit Inspection\n", "\n", "What [`view_circuit`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga9c312a8b72b968e3669432ab5ae49928) renders, you can also walk programmatically with the low-level circuit API. Capture Nairobi’s monus token first:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "CREATE TEMP TABLE nairobi_token AS\n", "SELECT provenance() AS prov\n", "FROM (\n", " SELECT DISTINCT city FROM personnel\n", " EXCEPT\n", " SELECT p1.city\n", " FROM personnel p1\n", " JOIN personnel p2 ON p1.city = p2.city AND p1.id < p2.id\n", " GROUP BY p1.city\n", ") t\n", "WHERE city = 'Nairobi';" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[`get_nb_gates`](https://provsql.org/doxygen-sql/html/group__gate__manipulation.html#gacc16e47ab21d01de696c6323453d3515) reports how many gates have been materialized in the current database’s circuit:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT get_nb_gates();" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[`get_gate_type`](https://provsql.org/doxygen-sql/html/group__gate__manipulation.html#ga1731181eef2acf0b0590edd23b3ccb49) and [`get_children`](https://provsql.org/doxygen-sql/html/group__gate__manipulation.html#gad21ff35d95b5d782bf594eec99c1ee03) give a single-step view of the gate structure: they return the operator and direct children of a gate." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT get_gate_type(prov) AS root_type,\n", " get_children(prov) AS root_children\n", "FROM nairobi_token;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For Nairobi, the root is a `monus` gate with two children: the ⊕ sub-circuit (`Juma ⊕ Paul`) and the ⊗ sub-circuit (`Juma ⊗ Paul`). Recurse to inspect the children:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT (get_children(prov))[1] AS plus_token,\n", " get_gate_type((get_children(prov))[1]) AS plus_type,\n", " get_children((get_children(prov))[1]) AS plus_children\n", "FROM nairobi_token;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The leaves of the circuit are input gates that originate from the `personnel` table. [`identify_token`](https://provsql.org/doxygen-sql/html/group__circuit__introspection.html#ga5d3eca5ba1f2078973291c96cccae7e5) performs a reverse lookup, returning the table and column count for an input token:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "provsql": {} }, "source": [ "SELECT identify_token(child) AS source\n", "FROM nairobi_token, unnest(get_children((get_children(prov))[1])) AS child;" ], "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both leaves resolve to `(personnel, 6)` – the `personnel` table with its six non-provenance columns (`id`, `name`, `position`, `city`, `classification`, and the `probability` column added in Step 7). This is exactly the traversal [`view_circuit`](https://provsql.org/doxygen-sql/html/group__provenance__output.html#ga9c312a8b72b968e3669432ab5ae49928) performs to render the box-art diagram." ] } ] }