%entities; ]>
Karl Pinc O. The Meme Factory, Inc. Isok -- A query centered monitoring tool for PostgreSQL
Introduction to Isok Isok is a &postgresql; &extension; for monitoring anything that can be reported with an SQL query. Its expected usage targets data integrity maintenance and data cleanup. One expected use-case is addressing those corner cases where business logic is "fuzzy" and database content is monitored for unusual but not prohibited content. There is also a potential use-case for monitoring for errors in data, although it can be best to use constraints and triggers for this purpose because these prevent erroneous data from getting into the database in the first place. Isok lets you run SQL to produce reports alerting you of suspicious or erroneous conditions, with features to suppress previously reported alerts. Unlike simply running a query, which reports the existence of questionable data patterns, this method produces reports alerting you of changes to questionable data patterns, so that only new problems need be reviewed. It is useful when periodically probing for unusual but allowed activity, such as the purchase of more than 1,000 shoes by one person. Approved excessive shoe purchases can be individually flagged so they do not appear in future reports. To avoid being overwhelmed by numerous legitimate alerts and to allow time to resolve issues, specific rows in the reports can be deferred so they do not reappear before a designated date. Unlike triggers and constraints, Isok does nothing until executed. This is done by SELECTing FROM a function, which runs some or all of the saved queries to check the state of the database and report the results. Report content is archived and can be queried. Reported issues are classified as either errors or warnings. Errors are always reported when Isok is run. After execution, the warnings reported by the user-supplied queries may be manually sorted by the Isok user into one of the following categories: unclassified (the default), labeled resolved, or deferred until a later date. When Isok is run, unclassified warnings are reported, resolved warnings are not reported, and deferred warnings are not reported until the current date reaches the deferral date. PostgreSQL supports a high degree of introspection. Isok can therefore monitor PostgreSQL itself, both the database engine's operational metrics and database schema design. In the former case system performance or usage might be monitored. In the latter, monitoring might look for things like violations of column naming conventions. However, while there may be legitimate uses of Isok for these sorts of purposes, other tools may be a better fit. Regardless of how Isok is used, we believe some monitoring and some error checking is better than no monitoring and no error checking. Isok makes monitoring and error checking easy. If introducing triggers into your processing or running a complete performance monitoring solution is just not feasible, Isok provides a simple way to move the ball at least a little bit closer to the goal.
Installation There are two steps to installation, first, getting and preparing the code and, when installing as an &extension;, installing into the OS, and, second, loading into one or more databases. Installing Isok into the OS, which is necessary when installing as an &extension;, typically requires elevated OS-level privileges, such as root privileges. The examples given do not include the assumption of elevated privileges, or show the use of any particulars, such as the sudo command, needed to assume such privileges. Similarly, the examples do not include the connection parameters (usernames, passwords, hosts, database names, etc.) which may be needed to connect to a database. Regardless of how Isok is installed, we recommend you install it in a dedicated &schema;. Dedicating a schema to Isok has a number of benefits, not the least of which is simplified access control to mitigate security concerns. The downside to installing in a schema is that when writing SQL you must either set your &search_path; or qualify names, by prepending the object name with the schema name and a period. For example, without setting a search path, if you installed into a schema named isok, instead of SELECT * FROM run_isok_queries();, you would have to write SELECT * FROM isok.run_isok_queries();. When a schema is created, only the owner can access its content. This is sufficient protection, assuming care is taken using the . and . columns. (Or, if these features are disabled.)
Requirements Isok installs on &postgresql; version 10 or later, although &postgresql; versions no longer supported by The PostgreSQL Global Development Group may not get support.
Quick-Start The simplest way to get and prepare Isok is to use pgxnclient. Your operating system probably has a pgxnclient package available. The pgxnclient package must be installed on the machine running your &postgresql; server. If you're running a managed instance of &postgresql;, in the cloud or otherwise, and don't have access to the machine running the &postgresql; cluster, you'll need to use another installation method. After installing pgxnclient, the command: pgxn install pg_isok makes the Isok extension available to &postgresql;. Then, executing SQL like: CREATE SCHEMA isok; CREATE EXTENSION pg_isok SCHEMA isok; loads Isok into your database and makes it available for use.
Preparing, While Logged-In to Un*x Installing as an &extension; requires that the installation be done while logged into the &postgresql; server's machine. Or, at minimum, while the current working directory is within the server's filesystem. Installing from SQL, as is necessary when the &postgresql;'s server's filesystem is unavailable, must be done from a machine able to work as a &postgresql; client. The recommended download is the Isok zip file distribution from PGXN.org. It is "pre-built", and so does not require installation of any build tooling. If you have this, after unzipping, you can skip over the next sections, which cover disabling features, and cloud installation, and skip straight to . It is also possible to clone the Isok git repository, but be forewarned. Working from the git repository requires the installation of considerable tooling. Any rebuild of Isok requires the installation of the m4 macro pre-processor. Possibly, the GNU m4 implementation is required. This is what &postgresql; requires, and alternatives have not been tested. Your operating system almost surely makes available an m4 package. Only the "pre-built" PGXN distribution can be installed without the use of m4.
Re-Building to Disable Features If desired, some potentially dangerous features of Isok can be disabled at build time. These are the make variables that control the build options: DISABLE_ROLE Disable the ability to SET ROLE from . DISABLE_SEARCH_PATH Disable the ability to SET the &search_path;. To use these variables, set them to any value when running make. For example, to disable all optional features run: make DISABLE_ROLE=y DISABLE_SEARCH_PATH=y The build configuration is documented in the doc/pg_isok--${VERSION}.config file, and installed with the rest of the documentation.
Building for and Installing in The Cloud (Installing With SQL) If you are running in the cloud, or some other managed instance where you do not have permissions on the host running &postgresql;, you will not be able to install Isok as an &extension;. In these cases you can still install Isok, but you must first build its SQL and then manually execute it. Of course, this installation method can always be used, as there is always a way to execute SQL. To build a cloud version of Isok, suitable for installation by SQL execution, you would type something like: make TARGET_SCHEMA=isok pg_isok_cloud--$(cat VERSION).sql The resulting sql file is in the sql/ directory. To customize the build, any of the above variables may also be set. The TARGET_SCHEMA variable must be set; the objects produced by the generated SQL must be located within a designated &schema;. It is highly recommended that the TARGET_SCHEMA be lower-case and otherwise be a &postgresql; name which does not require quoting. To install, first create the &schema; and then execute the sql. The command line interaction, if you use the psql command line client interface, would look something like: $ psql psql (15.13 (Debian 15.13-0+deb12u1)) Type "help" for help. me=> CREATE SCHEMA isok; -- The TARGET_SCHEMA used to build the sql CREATE SCHEMA me=> \i sql/pg_isok_cloud--1.0.0.sql <lots of output redacted> me=> \q $ You must re-build different SQL, with a different TARGET_SCHEMA, in order to install into a different schema.
Installing in the &postgresql; Server's OS With appropriate OS-level permissions, run: make install With this step complete, you are ready to install the Isok extension into any schema of any database in the cluster.
Running Regression Tests Once an extension has been installed in the OS, regression tests can be run to test whether Isok is operating correctly. Running the regression tests when Isok is installed by SQL execution is unsupported. The same build variables must be set when running the regression tests as when the system was built. (The PGXN distribution sets no variables, the default.) Should you set a different collection of variables than when building, some tests will fail and others may fail to run at all. The following example runs the default set of regression tests: make installcheck
Loading Into PostgreSQL The CREATE EXTENSION command is used to install Isok, as in the following example: CREATE SCHEMA isok; CREATE EXTENSION pg_isok SCHEMA isok;
Uninstalling
Uninstalling From the OS Uninstalling from the OS does the opposite of installing. It removes the extension from the &postgresql; server's filesystem. To uninstall with pgxnclient, run: pgxn uninstall pg_isok To uninstall using make, run: make uninstall Because Isok is pure SQL, uninstalling it from the OS does not remove any functionality from existing instances installed with CREATE EXTENSION. Uninstalling does, however, remove the ability to use the CREATE EXTENSION to install Isok in a schema.
Uninstalling From &postgresql; Running: DROP EXTENSION pg_isok; removes the extension from all schemas in all &postgresql; clusters. To remove an installation of Isok from an individual schema, drop the schema with DROP SCHEMA schemaname CASCADE;.
An Overview of the Isok Tables This section provides an overview of Isok's tables.
The Isok Main Tables In the table descriptions below, each table has it's own section, with sub-sections for the table's columns. All timestamps (date plus time values) have a one second precision. Fractions of a second are not recorded. All timestamps track the time zone.
ISOK_QUERIES The ISOK_QUERIES table contains one row for every query used to search for database integrity issues. The value cannot be before the value. Use &postgresql;'s &dollar_quoting; when inserting queries into ISOK_QUERIES using INSERT statements. This avoids problems that would otherwise arise involving the use of quote characters inside quoted strings. Inserting a query into ISOK_QUERIES using dollar quoting -- Report a warning when there's a birth date before 1950 INSERT INTO isok_queries (iqname, error, type, keep, query, comment) VALUES('mycheck', false, 'bdate', false , $$SELECT 'Bad birth date: ' || mytable.id || ', ' || mytable.birthdate AS id , 'Id (' || mytable.id || ') has a birthdate (' || mytable.birthdate || ') before 1950' AS msg FROM mytable WHERE mytable.birthdate < '1950-01-01'$$ , $$Report a warning when there's a birthdate before 1950$$ );
IQName (Isok Query Name) A TEXT value. A unique name for the query. The IQName value cannot be changed. &emptytext; ¬null; &nospaces; &caseunique;
Error A BOOLEAN value. &true; when the query finds conditions that are errors, &false; when the query finds conditions that are warnings. See (and the ) for more on warnings and errors. ¬null;
Type A TEXT value. Code classifying the query. The legal values for this column are defined by the support table. ¬null;
First_Run A timestamp. Date and time the query was first run by Isok. The value of this column is &null; if the query has never been run.
Last_Run A timestamp. Date and time the query was most recently run by Isok. The value of this column is &null; if the query has never been run.
Keep (Keep old results) A BOOLEAN value. This column controls the value placed in the . column when inserts new rows in . When this column is &true;, each row returned by the query is stored in with a value of infinity. This prevents from deleting the query result row when run, when the query no longer returns the result row. When this column is &false;, the . value of any new rows that inserts is &null;. ¬null;
Role A &postgresql; name value. The &postgresql; role to use to run the query. Because different roles have differing access to database content, it can be useful to run queries with different roles in effect. Setting the role may have security implications. This column is not validated against existing roles. &name_type_description; When this column is &null;, the effective role is not changed.
Search_Path A TEXT value. The &postgresql; &schema; &search_path; to have in effect when the query is run. The syntax of the search path is that used by SET search_path ... and returned by SHOW search_path;. Because queries may not always contain schema names to qualify database objects, a single query can return different results depending on the search_path in effect. So it can be useful to run different queries when different schema search paths are in effect. Setting the search_path may have security implications. Care must be taken when setting the search path because the search path can be set to anything, regardless of which schemas exist or are available to the user. It is quite easy to set a search path that searches no schemas. &postgresql; will not produce any warnings or errors should you do so. When this column is &null;, the schema search path is not changed.
Query A TEXT value. A query which checks for database integrity violations. The query need not end in a semi-colon. The query must return 3 columns. Although these columns are referred to by name below, the names the query gives to the columns does not matter.
The first returned column, the ID column The first column is used as an id. It must contain a unique value. (Unique per results returned by the given query). The value must also be constant; repeated runs of the query which find the same problem must return a consistent value. The system cannot enforce the requirement that the first column be consistent over repeated runs of the query. If the query does not satisfy this requirement Isok will generate duplicates of previously reported problems. The value of the first column may not be &null; or the empty string. Guidelines for the value of the first column are that it should be human readable and relatively short. It should probably contain id values in order to ensure uniqueness, but only those that will not change over time. The value of this first column may need to be typed in or otherwise referenced by a person in order to make notes regarding the problem or to change the problem's status.
The second returned column, the Msg column The second column contains a message describing the discovered database integrity problem. It should contain a complete description of the problem and may be as verbose as necessary. The value of the second column may not be &null; or the empty string.
The third returned column, the Extra JSON column The third column contains JSON data. The purpose of this column is to hold additional data on the reported condition that may need to be tracked, or queried. &postgresql; is able to efficiently query JSONB data, which is how this column is stored. At the time of this writing, in practice, returning a third column is optional. But this behavior should not be relied upon. Best practice is to return a &null; value for the third column when you do not wish to store any JSON with the query result. When only 2 columns are returned, the effect is the same as returning a &null; value in the third column. The third column is optional, in practice, because a portion of of the &postgresql; PL/pgSQL language is unspecified. The unspecified PL/pgSQL behavior being, that the target in a PL/pgSQL statement of the form FOR target IN EXECUTE text_expression [ USING expression [, ... ] ] LOOP is allowed to contain more variables than the test_expression returns columns, in which case the extra variables are assigned the &null; value. Because this behavior is undocumented, it is subject to change. Should this behavior change, returning a third column will be required, not optional. Isok cannot feasibly use the text of the Query column to determine how many columns the query returns. So it cannot prevent the query from being written to return only two columns. And, when this is the case, the present PL/pgSQL implementation allows the Query to return two columns instead of three. Return a &null; value in the third column when there is no JSON data.
Comment A TEXT value. A comment on the query. This may be as verbose as necessary. ¬null; ¬onlyspaces;.
ISOK_RESULTS (Isok query Results) The ISOK_RESULTS table contains one row for every database integrity problem discovered by the queries in . That is to say, one row for every row returned by executed queries. The table's purpose is twofold. It provides an efficient way to list data integrity problems, without having to execute the potentially complex queries which discover the problems. But it's main purpose is to allow warnings, i.e. those problems discovered by the queries saved in rows having a &false; value, to be resolved -- permanently marked as acceptable conditions. Resolved warnings can be safely ignored thereafter, and since Isok automatically ignores resolved warnings those responsible for maintaining database integrity need not repeatedly concern themselves with resolved conditions. To resolve a warning place a timestamp in the column. Data integrity errors can not be resolved, the erroneous data condition must be fixed -- ISOK_RESULTS rows must have a &null; value when the row has a related to an row having a &true; value. The value, the value, and the value cannot be before the value. A resolved warning cannot be deferred -- either or , or both, must be &null;. To remove an . value and add a . value without raising an error either update both values in the same UPDATE statement or first set the value to &null; and then the value to something non-&null;. The query result id generated by the stored query must be unique per query -- the combination of ISOK_RESULTS. and ISOK_RESULTS. must be unique.
IRID (Integrity Results Identifier) An BIGINT value This column uniquely identifies the row containing the result of a database integrity query. The IRID value cannot be changed and is automatically generated with a &postgresql; sequence.
IQName (Integrity Query Name) A TEXT value. The . value identifying the query which produced the result.
First_Seen A timestamp value. Date and time the query result was first produced by Isok. ¬null;
Last_Seen A timestamp value. Date and time the query result was most recently produced by Isok. ¬null;
Last_Role A &postgresql; name value. The role (user) which was the current role when the query was last executed. &name_type_description; This column is not validated against existing roles. ¬null;
Last_Schemas An array of &postgresql; name values. All schemas that were, implicitly or not, in the &search_path;, and also available to the , when the result was returned. For more information, see the documentation of the current_schemas() function. &name_type_description; ¬null;
Resolved (Date and Time Resolved) A timestamp value. Date and time the query result was resolved; that is, marked not a concern. The Isok system does not display resolved results, although of course the ISOK_RESULTS table can always be manually queried. The value of this column may be &null;. This occurs both when the query result is a data integrity error and when it is a data integrity warning that has not yet been resolved.
Deferred_To A timestamp value. Isok suppresses display of the result when the current time is before this time. Use of this column allows resolution of data integrity problems to be deferred, and hence not clutter up the output of Isok with noise that might hide other problems. When this column is &null; Isok displays the query result.
Category A TEXT value. Code classifying the query result. The legal values for this column are defined by the support table. This column may be &null; when the query result is unclassified.
Keep_Until A timestamp value. This column controls whether or not deletes the row when the . is re-run and the query does not return the row's . A query result that the query no longer returns is kept until the given time is reached, when the value of this column is not &null;. When the value of this column is &null;, a query result row that is no longer returned is is always deleted. For further detail see the section of the documentation. Using the special TIMESTAMP value of infinity entirely prevents deletion.
QR_ID (Query Result IDentifier) A TEXT value. This is a unique, unique per query that is, identifier for the query result. It is the first column produced by the related .. ¬null;
QR_Message (Query Result Message) A TEXT value. This is the message, the second column, produced by the most recent execution of the ..
QR_Extra (Query Result Extra JSON data) A JSONB value. The value of the third, optional, column returned by most recent execution of the query. This may contain any JSON deemed useful. This column serves as a catch-all container for any additional data that needs to be tracked regarding a reported condition. The value of this column may be &null;. This is the default when the . does not return a third column. See &postgresql;'s documentation on the JSON data types for information on how to access, index, and efficiently query the JSONB data type.
Notes A TEXT value. Any notes regarding this particular query result. ¬null; ¬onlyspaces;
Isok Support Tables Support tables are used to control the values used in other tables. Each support table has a key, with an appropriate column name, and a column named Description. Both of these columns are of type TEXT. The keys of the support table are foreign keys of a column which has a controlled vocabulary, a limited number of terms which are allowed to be used. An administrator can add or remove rows from the support tables to dynamically control the allowed vocabulary. The support table Description columns must be unique when the comparison is made in a case-insensitive manner.
IQ_TYPES (Integrity Query Types) IQ_TYPES contains one row for every code used to classify database integrity queries. Classification may be by the type of data integrity problem the related queries are designed to uncover, by who is responsible for resolving the discovered problems, or any other desired classification scheme.
Key: IQType The IQ_TYPES table is keyed by the IQType column. &nospaces; &caseunique;
IR_TYPES (Isok Result Types) IR_TYPES contains one row for every code used to classify or explain sets of database integrity problems, problems discovered by Isok's queries. Codes may be used as needed, whether to organize reported problems pending resolution, to describe the circumstances which resolve an issue, or to serve other purposes.
Key: IRType The IR_TYPES table is keyed by the IRType column. &nospaces; &caseunique;
The Isok Functions (Activating Isok) Isok is run by using one of its functions. Of course the table may always be queried manually, but this does not discover any new problems. All of the Isok functions are designed to be used in the FROM clause of SELECT statements, as if they were tables. Indeed the functions look like tables to the SELECT statement, tables that look exactly like -- except that the column is missing. The difference between querying on the table directly and querying using Isok's functions is that the functions update the content of the table by executing the the queries in table. Also, the functions never return rows where the underlying row has a non-&null; value or a time and date that has not yet been reached. All timestamps, date plus time values, which Isok updates in the and tables are set to the date and time at which program execution started. So when, say, , is run, all of the new timestamp values in the and rows touched by the execution are identical. Various Isok functions (or versions of the same function) are supplied to allow easy selection of which queries in which rows are to be executed, whether all or only some. As with a regular table, the order in which rows are returned by Isok's functions is unspecified. If you wish to ensure a specific ordering an ORDER BY clause must be used. run_isok_queries run_isok_queries execute one or more of the queries stored in the table TABLE (irid, iqname, first_seen, last_seen, category, qr_id, qr_message, notes) run_isok_queries TABLE (irid, iqname, first_seen, last_seen, category, qr_id, qr_message, notes) run_isok_queries TEXT iqname_query Input iqname_query The text of an SQL query. The query must return a single column of . values. Description A function which runs the queries stored in the table, returns the output of the stored queries, and stores the results in the table. Because the function returns rows and columns it must be invoked in the FROM clause of a SELECT statement. (See the Examples below.) The function may be called in one of two ways. When called with no arguments all of the queries in are run. When called with the text of an SQL query, a query which returns a single column containing . values, the function runs only those queries. Use &postgresql;'s &dollar_quoting; when supplying a query to run_isok_queries(). The function returns a set of columns with multiple rows, a table. So it is expected to be used in the FROM clause of a SELECT statement. The columns returned by the function are the columns of the table, excepting the column. The rows returned by the function are those of the newly updated table, excepting those rows with a non-&null; column or those rows with a value that is in the future. Only those rows that are related to the executed queries (in ) are returned. So, when called with no arguments the function returns all warnings that have not been resolved and all errors. When called with a query that selects specific to execute, only the unresolved warnings and errors discovered by the executed are returned. The Record of Query Execution Running an . does more than add new rows to the table. Updates are made to existing rows to record and track the query execution's results. The . value is updated. On , the rows to update are found by matching the . value with the . of the executed query, while also matching the value with the value returned in the first column of the executed query. The columns updated are: , , , , and . Because the record of the results produced by Isok queries are updated, a query may be refined over time to produce enough information to resolve the reported issues. Even though the execution of run_isok_queries() does not return rows that are resolved, all rows returned by an executed query have all the aforementioned columns updated to new values. Whether a row is returned or not does not matter, the update occurs anyway. Deletion of Old Results If an existing row matches the value of the executed query and there is no corresponding value returned by the executed query, and the value of . is either &null; or CURRENT_TIMESTAMP The time the current transaction started, which, if a transaction was not explicitly started, is the time the database engine received the current SQL statement from the client and began execution. is not earlier than . then the row is deleted. This empties the table of errors and warnings that no longer apply to the current state of the database. If the query returns warnings, this deletion behavior does not depend upon whether or not the warning is resolved. Examples The following example runs all the queries in , displays all the errors and all the unresolved warnings (unless the error or warning has been deferred), ordered first by the name of the query, within that showing newer problems first, and within that ordered by warning id. Executing all ISOK_QUERIES SELECT * FROM run_isok_queries() AS problems ORDER BY problems.iqname , problems.first_seen DESC , problems.qr_id; The following example runs a single saved query with an . of mycheck and displays any of these sorts of problems found, ordered as in the previous example. This example also demonstrates how to use &dollar_quoting; to give a query to run_isok_queries and thereby avoid problems having to do with trying to nest regular quotes. Executing a single ISOK_QUERIES.Query SELECT * FROM run_isok_queries($$SELECT 'mycheck'$$) AS problems ORDER BY problems.iqname , problems.first_seen DESC , problems.qr_id; The following example runs multiple specific queries, those with an . of mycheck, yourcheck, and theircheck, and displays any of these sorts of problems found, ordered as in the previous example. As before, &dollar_quoting; is used to quote the query which produces the s. Executing many specific ISOK_QUERIES.Query-s SELECT * FROM run_isok_queries($$VALUES ('mycheck') , ('yourcheck') , ('theircheck')$$) AS problems ORDER BY problems.iqname , problems.first_seen DESC , problems.qr_id; The following example runs all the queries of the bdate type and displays any of these sorts of problems found, ordered as in the previous example. Again, &dollar_quoting; is used. Executing ISOK_QUERIES of the <quote>bdate</quote> type SELECT * FROM run_isok_queries( $$SELECT isok_queries.iqname FROM isok_queries WHERE isok_queries.type = 'bdate'$$ ) AS problems ORDER BY problems.iqname , problems.first_seen DESC , problems.qr_id;
Security Considerations The security concerns surrounding Isok are many, and can be complex. Fundamentally, this is because Isok executes arbitrary SQL. If the wrong SQL is executed, in the wrong context, anything might happen to your data. This appendix identifies pertinent issues, and how to minimize risk. Ultimately, these are the same issues that arise in any application that executes SQL. The big difference between Isok and other applications is that most applications execute a more-or-less limited number of SQL queries that are carefully crafted to suit a specific purpose. The queries executed by Isok can have much more variation, and be subject to less review. In the end, the recommendations here come down to following generally accepted security best-practices, in particular, the principle of least privilege.
Limiting Access Limiting access to Isok is a clear first-step. Installing Isok into a dedicated &schema; goes a long way toward helping with this. When a schema is created, only the owner has access. Yes, this is true of all objects. Only the owner has access to any newly-created object. But having a single point of access, the schema dedicated to Isok, that grants access to all of Isok, provides a very useful point of control that serves as an easily audited gateway to Isok's functionality. So, unless GRANTs are issued, access is limited by default. Remember also, the table contains query output that may contain sensitive information to which access should be restricted. And, even if this is not true today, it may become true when additional queries are added to . Even the queries in could, possibly, contain sensitive information.
What Queries Access Matters The executed queries, the .s, can be any SQL statement. Obviously, what executes matters. Less obviously, the ownership of and permissions granted on every object referenced by every query also matters. Really, when multiple schemas are in the &search_path;, it is the ownership of and permissions granted on every object that might be referenced by every query that matters. The ownership and permissions of referenced objects matter because these factors ultimately control what any given query actually does. If a user has, for example, permission to alter a view with some given name, or replace a table having that name with a view that has the same name, then the user can change what happens when that name is used in a query. The user can write a view that does anything. Or at least anything that the role which runs is allowed to do. Imagine, the new view could call a function, say, in place of a table that was referenced, and that function could do anything at all. Even while still returning the replaced table's rows, so as to produce a results identical to that produced before the system was altered. That is the issue. The user executing the saved query is dependent upon the goodwill of all the users who have enough access to alter any of the objects involved when the query is executed.
The Search Path The . column allows setting of the &search_path; on a per-query basis. The security implications of changing the search_path may be the hardest to reason through. The crux of the problem is that different users may have different permissions on the search path's schemas, and on the objects the schemas contain. This opens up the possibility that a malicious user may create an object, say, a view or a function, in a schema which appears earlier in the search path than the schema holding the object the query expects to find. If this is the case, the query will use the malicious object instead of the expected object. The &postgresql; documentation contains an analysis of this situation, in the context of writing SECURITY DEFINER functions. However, the analysis in the &postgresql; documentation is not entirely applicable to Isok. In the case of Isok, even when Isok changes the effective role, the position in the search path of the temporary table schema, pg_temp, is less relevant. Because temporary tables are not shared between connections, the creation of a malicious object in the temporary schema must be done in the current connection. And so the issue is no different from that which occurs when any other malicious object is created in the current connection. In either case, there is a security lapse that occurs dynamically, at some point in the current connection. Having said that, moving pg_temp to the end of the search path does make it harder to mask an existing object with a malicious object. Because all roles have permission to create objects in pg_temp, a malicious actor would not be able to mask an existing object with an object in pg_temp if pg_temp is at the end of the search path. For this reason it may make sense to always put pg_temp at the end of the search path whenever Isok is used. The &postgresql; documentation's observation remains valid: Malicious users with the ability to change objects in the search path may inject malicious objects.
Roles The role in effect does have security implications. But changing a role for the duration of a query's execution, with ., has fewer security implications than it might seem. Changing the current role does open up the possibility that database objects to which the new role has access may be changed. But this door is already open. A new role cannot be assumed without some chain of SET option grants from the session_user [definition here(-ish)] to the current role. So a malicious actor always has access to the same set of roles, regardless of whether Isok is involved or not. What might be surprising is that, even though a role may SET ROLE to another, perhaps with less privileges, it is always possible to use RESET ROLE (or SET ROLE NONE) and reset the current role to the session_role. There is no sandboxing. If the session sets a role before running , there is the possibility that a malicious actor might undo the assumption of the role. This could then affect the role used to execute any queries that has not yet executed. Don't expect that a SET ROLE to a role of lesser privileges makes running any safer.
Mitigation Strategies There is no one-size-fits-all solution. Even disabling Isok's ability to dynamically alter the current search path and the current role does not address the fundamental issues. Even more so because, to be useful, may need an expansive set of permissions to do its job. One possible strategy is to always supply values in the . . columns. At least that way the context of each query's execution is always known. Another possible strategy is to install Isok in multiple schemas, each schema dedicated to a different purpose and assigned different permissions, intended to be used by different users.
Creating an Audit Trail To better respond to a suspected security problem it is always very useful to have an audit trail to examine. One way to have such a trail is to install a temporal extension. These extensions track the history of database content over time. The Isok tables could be temporally tracked, to audit what queries were changed when, as well as what query results were produced or deleted when. A conceivable, although entirely untested on our part, idea is to use a temporal extension to track changes made to the postgres database. Otherwise known as the system catalog, pg_catalog, this database contains the definitions of all objects in all databases. Tracking the catalog provides an audit trail should a malicious object be created, although this would not help if pg_temp was involved. Some installations may even want to temporally track all their tables, although this may not be feasible for a whole host of reasons.
Local Copies of the Documentation When Isok is installed as an &extension;, local copies of the documentation are installed. The pg_config &postgresql; client command provides an easy way to find the documentation. Finding the Documentation of Locally Installed Extensions $ printf '\nExtension documentation is located in:\n%s\n\n' $(pg_config --docdir)/extension/ Extension documentation is located in: /usr/share/doc/postgresql-doc-15/extension/ $ ls $(pg_config --docdir)/extension pg_isok--1.0.0.config pg_isok_html pg_isok_usletter.pdf pg_isok_a4.pdf pg_isok.txt $ printf '\nThe URL used to read the local HTML documentation is:\nfile://%s\n\n' \ $(pg_config --docdir)/extension/pg_isok_html/html_paginated/index.html The URL used to read the local HTML documentation is: file:///usr/share/doc/postgresql-doc-15/extension/pg_isok_html/html_paginated/index.html Periodic Execution A monitoring system must periodically execute and deliver reports if it is to monitor and provide actionable alerts on an ongoing basis. Isok does not include a periodic job scheduler. Tools like the Unix cron command, the systemd timer system, or the PostgreSQL pg_cron extension are useful to automate, and make periodic, Isok's monitoring. There are plenty of job schedulers available and one of these must be used to schedule the production of Isok's reports. Typically, something must deliver the reports Isok produces, because push-notifications remind people to act. Although Isok does archive the reports it produces, it does not include a report delivery mechanism. Email, or other push-based delivery mechanisms (perhaps email-to-SMS text gateways), are the expected delivery mechanisms for Isok's reports. Isok itself can report to standard out when run from psql. Depending on your job scheduler, some amount of scripting may be required to route Isok's reports to a push delivery service.
Example Periodic Reporting via Email Using systemd The files shown below deliver an Isok report, if there is something to report, by email every Tuesday morning. The system on which they are installed must have a mail transfer agent installed, like Postfix, to begin the email delivery process. The system also must have GNU mailutils installed, or an equivalent mail command, like BSD mailx, to send the email. Most operating systems will have packages available to install these services, and a way to configure simple defaults. However, it is non-trivial to reliably deliver email from your system directly to the rest of the Internet. The recommended approach is to send the email from your local system to a mail relay provided by your local IT professionals. (Or, your Internet Service Provider. Or, if you are hosted in the cloud, contact your hosting company.) These professionals will usually be able to supply you with what you need to know to have mail sent from your system to a system able to send email to the Internet at-large. If not, there are companies that provide this service for a nominal fee. The service you would ask for is usually called an email relay service. It is usually a good idea to ask your local IT professionals to help with the selection of a mail transfer agent. This example is expected to run, as is, on most systems that have the default &postgresql; install. The example assumes that pg_hba.conf contains: local all postgres peer This line is typically present, but this is not guaranteed. Sample <literal>/usr/local/bin/pg_isok_report</literal> File ${EMPTY_FILE} 2>&1 PAGER= ${PSQL} --command="SELECT * FROM ${ISOK_SCHEMA}.run_isok_queries();" \ > ${OUTPUT} 2>&1 cmp --quiet ${EMPTY_FILE} ${OUTPUT} \ || { /usr/bin/mail -s 'Isok output' ${MAIL_RECIPIENT} \ < ${OUTPUT} ; }]]> Sample <literal>/etc/systemd/system/isok_report.timer</literal> File Sample <literal>/etc/systemd/system/isok_report.service</literal> File After installation, don't forget to run: systemctl daemon-reload systemctl enable pg_isok_report.timer systemctl start pg_isok_report.timer
Techniques For Making Local Extensions to Isok Should you find yourself wishing that Isok did more, here are some suggested techniques for extending the functionality of your Isok instance. There is overlap, more than one technique may facilitate reaching any given goal.
Wrap <xref linkend="run_isok_queries" /> To perform actions before or after execution of , write a new function that takes 's arguments and returns 's results. And does what you wish before or afterward. So, for example, to ensure a safe, consistent, value for &search_path;, you could write a function that executes SET search_path ...;, before itself calling and returning the result.
Extend Issue Classification If you would like additional ways to classify the issues your queries discover, the table may be extended. Create your own table to do this, called, say, IR_TYPE_CLASSES. The key of this table is that of the table; may as well call it IRType. It is a foreign key, referencing . So your new table has a one-to-one relationship with . Add as many columns as you like to your new table, a column for each (orthogonal) sub-category by which you would like to classify reported issues. Boolean columns behave as a tag, toggling classification. Other kinds of columns, possibly containing foreign keys to control the vocabulary used, allow richer classification schemes.
Fully Utilize <xref linkend="ISOK_RESULTS" />.<xref linkend="Isok_Results-QR_Extra" /> Indexing the JSONB . column improves performance. If you know your JSONB keys, you can make a VIEW that exposes the value of those keys as the view's columns. Users of this view would not have to be familiar with querying JSONB. More complex schemes involve putting a row-level BEFORE trigger on to distribute the various values appearing in into other tables. But doing so surely takes you past the point of diminishing returns. It is easier to modify the SQL that Isok installs. And doing so is probably also less of a long-term maintenance burden, which matters.
Modify Isok's Generated SQL Should you want to make a modification like allowing the queries in . to return additional columns, you can do so by modifying the SQL that Isok loads. Isok is pure SQL, so the SQL can be generated and then modified, in any way you like, before being loaded into a database's schema.
Developing Isok We consider Isok to be feature complete. That said, there's always room for improvement and contributions are welcome. Never the less, if you would like your changes added to Isok, before doing a lot of work we recommend communicating with us. You are, of course, free to make changes to your local Isok. Development should be done by cloning the git repository.
Tool Requirements Isok uses the PGXN.org tools for building and distribution, which in turn uses parts of the PostgreSQL GNU make-based build system. So GNU make is required. In addition, the Isok documentation is done with the DocBook, as is &postgresql;'s, so the tooling required to build the documentation is the same as PostgreSQL, although Isok currently generates XHTML so may require a slightly different set of DTDs. Aside from these requirements, the following additional tools are needed: DBLatex The DocBook to LaTeX to PDF, etc., converter gawk The GNU awk implementation Gnu m4 The macro pre-processor used by GNU autoconf, etc. links The command-line web browser xmllint The XML linter zip The archive and compression tool DBLatex also requires the installation of various TeX and LaTeX tooling, which your O/S's package manager is likely to install as a dependency. When working with DocBook, the book DocBook XSL: The Complete Guide from Sagehill.net may also be useful.
Building and Distributing Run make help for help on the Makefile targets. Almost all the generated files are included in the distribution. This is so that the user, or the PGXN tooling, can use the Makefile for installation, and uninstallation, without having to have all the tooling required for development installed.
Acknowledgments and History Isok was first developed as "The Warning System" for the Gombe Mother Infant Database Project. It was later incorporated into Babase, part of The Amboseli Baboon Research Project, and enhanced to take advantage of the features in &postgresql; 9.1. Further enhancement, including release as a &postgresql; extension, was done for The SokweDB Project, developed by The Jane Goodall Institute. We would like to thank these projects, and their funding sources, for enabling the development and release of Isok. The acknowledgments included in the above projects' documentation are reproduced below. (Verbatim, excepting some updated contact information.) It is not clear how applicable the entirety of the acknowledgments are, but we would rather be overly generous in our thanks than be stingy. The following acknowledgments do not include all the people who have enabled and assisted Isok development. You know who you are. Thank you. And thanks to the larger Open Source community. Without their support, and hard work, none of this would have happened.
The Gombe Mother-Infant Project Acknowledgments The Gombe-MI Development Group Karl O. Pinc The Meme Factory, Inc. kop@karlpinc.com Book Author, System Design Lead, Implementation Carson M. Murray PhD. George Washington University cmmurray@gwu.edu http://departments.columbian.gwu.edu/anthropology/people/317 Project Co-Leader, System Design Core Member Elizabeth V. Lonsdorf PhD. Franklin & Marshall College elizabeth.lonsdorf@fandm.edu http://www.fandm.edu/elizabeth-lonsdorf Project Co-Leader, System Design Core Member Karen Anderson System Design Core Member, Copy Review, System Testing Lead A. Catherine Markham PhD. Stony Brook University http://www.stonybrook.edu/commcms/anthropology/faculty/cmarkham.html System Design Participant Margaret A. Stanton PhD. George Washington University http://cashp.columbian.gwu.edu/margaret-stanton System Design Participant, System Testing Core Member Edward Wilkerson Jr. Lincoln Park Zoo http://www.lpzoo.org/conservation-science/resources/staff-bios/edward-wilkerson System Design Participant Funding and Support We gratefully acknowledge the support of The National Institutes of Health grant R00HD057992 for the development of this system. We are also very grateful for the support given by The Leo S. Guthman Foundation, the Lincoln Park Zoo, Franklin & Marshall College, and The George Washington University. Other Thanks We would like to thank the myriad Free and Open Source communities, including those of PostgreSQL, the GNU Project, the Debian Project, Ubuntu, PhpPgAdmin, the Pyramid web framework, TeX and LaTeX, DBLatex, DocBook, Babase, and many others unmentioned, for giving, gratis, billions of dollars See: Estimating the Total Development Cost of a Linux Distribution. of work to the world, without which the Gombe-MI software and this book would not exist. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Institutes of Health, The Leo S. Guthman Foundation, the Lincoln Park Zoo, Franklin & Marshall College, The George Washington University, or any other organization which has supplied support for this work.
The SokweDB Acknowledgements At the time of this writing, there is no formal set of acknowledgments for SokweDB. However, Microsoft provided funding for SokweDB and we would like to acknowledge and thank them for their support. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of Microsoft.
Isok Licensing Terms -- Licensed Under The AGPL v3.0+ (Examples Excepted) Isok, otherwise known as pg_isok, is licensed under the GNU Affero General Public License version 3 (AGPL 3.0+), or (at your option) any later version, with the exception of all sample program code, sample commands, and sample configuration file components contained in the documentation, whether explicitly labeled as an example or not. These samples of program code, commands, and configuration file components are licensed under the CC0 1.0 Universal license. The deed for the CC0 1.0 Universal license explains the license in plain language. The deed is reproduced in . The No Copyright section captures the essence. CC0 1.0 Universal Deed
Other Information In no way are the patent or trademark rights of any person affected by CC0, nor are the rights that other persons may have in the work or in how the work is used, such as publicity or privacy rights. publicity or privacy — The use of a work free of known copyright restrictions may be otherwise regulated or limited. The work or its use may be subject to personal data protection laws, publicity, image, or privacy rights that allow a person to control how their voice, image or likeness is used, or other restrictions or limitations under applicable law. Unless expressly stated otherwise, the person who associated a work with this deed makes no warranties about the work, and disclaims liability for all uses of the work, to the fullest extent permitted by applicable law. When using or citing the work, you should not imply endorsement by the author or the affirmer. endorsement — In some jurisdictions, wrongfully implying that an author, publisher or anyone else endorses your use of a work may be unlawful.
Notice The Commons Deed is not a legal instrument. It is simply a handy reference for understanding the CC0 Legal Code, a human-readable expression of some of its key terms. Think of it as the user-friendly interface to the CC0 Legal Code beneath. This Deed itself has no legal value, and its contents do not appear in CC0. Creative Commons is not a law firm and does not provide legal services. Distributing, displaying, or linking to this Commons Deed does not create an attorney-client relationship. Creative Commons has not verified the copyright status of any work to which CC0 has been applied. CC makes no warranties about any work or its copyright status in any jurisdiction, and disclaims all liability for all uses of any work.