Provenance Export
==================
ProvSQL can export provenance circuits in standard or visual formats for
use in external tools or for inspection.
Symbolic Representation
------------------------
:sqlfunc:`sr_formula` returns a symbolic representation of the provenance
as a human-readable formula. Each leaf token is replaced by its
mapped value, and the circuit operations are rendered using ``⊕``
and ``⊗``:
.. code-block:: postgresql
SELECT name, sr_formula(provenance(), 'witness_mapping')
FROM suspects;
This is the simplest way to inspect provenance interactively. See
:doc:`semirings` for the full description of semiring evaluation functions.
PROV-XML Export
----------------
`PROV-XML `_ is a W3C standard for
representing provenance information. ProvSQL can serialise provenance
circuits to this format using :sqlfunc:`to_provxml`:
.. code-block:: postgresql
SELECT provsql.to_provxml(provenance())
FROM mytable;
The function returns an XML document conforming to the PROV-XML schema,
representing the provenance circuit rooted at the given token.
Example output (abbreviated):
.. code-block:: xml
The same export is reachable from Studio's
:ref:`evaluation strip `: it lists
PROV-XML under the *Other* optgroup and serialises the currently
rendered circuit on demand.
Circuit Visualisation
----------------------
:sqlfunc:`view_circuit` renders a provenance circuit as an ASCII box-art
diagram. Internally it writes the circuit in GraphViz DOT format to a
temporary file and runs ``graph-easy --as=boxart`` on it, returning the
result as a text value. The ``graph-easy`` executable must be installed and
accessible in the PostgreSQL server's PATH (or in a directory listed in the
``provsql.tool_search_path`` GUC; see :doc:`configuration`).
.. code-block:: postgresql
SELECT provsql.view_circuit(provenance(), 'my_mapping')
FROM mytable
LIMIT 1;
To visualise the circuits for multiple rows simultaneously:
.. code-block:: postgresql
SELECT city, view_circuit(provenance(), 'my_mapping')
FROM (SELECT DISTINCT city FROM personnel) t;
For an interactive alternative, see Studio's
:ref:`Circuit mode `. It renders the same DAG
in the browser: click a UUID cell in a query result to display its
circuit, hover to highlight a subtree, expand the frontier on
demand, and read every gate's metadata (including stored
probability) in the side inspector. It does not require
``graph-easy`` and is unaffected by ``provsql.tool_search_path``.
Verbosity
----------
The ``provsql.verbose_level`` GUC variable controls diagnostic output.
Setting it to ``20`` is useful when debugging circuit export: intermediate
DOT and circuit files are kept on disk instead of being deleted, and the
d-DNNF method and gate count are reported. For example:
.. code-block:: postgresql
SET provsql.verbose_level = 20;
The default is ``0`` (silent). See :doc:`configuration` for the full
description of all thresholds.
Subcircuit Introspection
-------------------------
For programmatic exploration of a circuit (rather than a flat formula or
ASCII diagram), :sqlfunc:`circuit_subgraph` returns a BFS expansion of
the DAG rooted at a token, capped at a configurable depth. Each row
describes one ``(parent, node)`` edge — gate type, ``info1`` / ``info2``
payload and BFS depth come along on the same row. The root is reported
once with ``parent`` and ``child_pos`` ``NULL``; a node with several
parents within the bound is reported once per incoming edge (callers
that need a one-row-per-node view should deduplicate on ``node``):
.. code-block:: postgresql
SELECT * FROM provsql.circuit_subgraph(
(SELECT provenance() FROM personnel WHERE name = 'John'),
4);
``depth`` is the node's shortest-path distance from the root, so an edge
``(parent, child)`` always satisfies ``parent.depth + 1 >= child.depth``;
equality holds on shortest-path edges, strict inequality on
"shortcut" edges into a multi-parent child. For a node at
``depth = max_depth``, the caller can compare :sqlfunc:`get_children`
against the edges reported to detect a frontier node and request
another layer.
To resolve an ``input``-gate UUID back to the row that produced it,
:sqlfunc:`resolve_input` searches every provenance-tracked relation and
returns the row encoded as ``JSONB``:
.. code-block:: postgresql
SELECT relation, row_data
FROM provsql.resolve_input(
(SELECT provsql FROM personnel WHERE name = 'John'));
It returns zero rows for non-input gates and for tokens that don't
match any tracked row, without raising.
Circuit Structure
-----------------
All the representations above describe the same underlying DAG. Gate
types that appear:
* ``input`` – input (variable) gate; corresponds to a base tuple
* ``plus`` – semiring addition; generated by UNION and disjunctive conditions
* ``times`` – semiring multiplication; generated by joins and conjunctive conditions
* ``monus`` – m-semiring monus; generated by EXCEPT
* ``zero``, ``one`` – semiring additive and multiplicative identity elements
* ``project``, ``eq`` – where-provenance gates (column projection and equijoin)
* ``agg`` – aggregation gate (for aggregate provenance)
* ``semimod`` – semimodule scalar multiplication (for aggregate provenance)
* ``value`` – scalar value (for aggregate provenance)
* ``mulinput`` – multivalued input (for Boolean provenance)
* ``delta`` – δ-semiring operator :cite:`DBLP:conf/pods/AmsterdamerDT11`
* ``update`` – update operation gate