{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pseudonymous Access To Encrypted Table"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook shows how to use [pgsodium]() and [postgresql-anonymizer]() to provide pseudonymous access to encrypted table data. `pgsodium` is used to encrypt the data, and the `anon` extension is used to anonymize it, and to make some fake test data as you'll see. First we'll load the sql extension for jupyter and connect to the database running with this container. The source code can be found [here]()."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres@/\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"1 rows affected.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n",
"Done.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%load_ext sql\n",
"%sql postgresql://postgres@/\n",
"%sql -f pgsodium_anonymizer.sql"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Demo Data\n",
"\n",
"Now make some demo data with the provided function. This will create N number of encrypted demo records."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres@/\n",
"1 rows affected.\n",
" * postgresql://postgres@/\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/html": [
"
\n",
" \n",
" count | \n",
"
\n",
" \n",
" 10 | \n",
"
\n",
"
"
],
"text/plain": [
"[(10,)]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select demo_data(10)\n",
"%sql select count(*) from encrypted_record"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Access Decrypted Data\n",
"\n",
"While still the `postgres` superuser role, we can see the raw decrypted data. This is the data we want to protect from other roles in the system like `staff`."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres@/\n",
"10 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" id | \n",
" first_name | \n",
" last_name | \n",
" age | \n",
" secret | \n",
" signup_date | \n",
"
\n",
" \n",
" 1 | \n",
" yahya | \n",
" Chiotti | \n",
" 95 | \n",
" Kālābāgh | \n",
" 2018-03-02 17:07:05.124475+00:00 | \n",
"
\n",
" \n",
" 2 | \n",
" daley | \n",
" Chiotti | \n",
" 53 | \n",
" Mount Lebanon | \n",
" 2013-03-08 13:57:39.824305+00:00 | \n",
"
\n",
" \n",
" 3 | \n",
" pierina | \n",
" Horenstein | \n",
" 14 | \n",
" Narangba | \n",
" 2014-10-25 03:53:20.076964+00:00 | \n",
"
\n",
" \n",
" 4 | \n",
" jouko | \n",
" Defiore | \n",
" 94 | \n",
" Kalocsa | \n",
" 2013-03-07 04:15:50.335154+00:00 | \n",
"
\n",
" \n",
" 5 | \n",
" ilana | \n",
" Bagley | \n",
" 26 | \n",
" Mitras Poniente | \n",
" 2015-07-06 16:24:43.602278+00:00 | \n",
"
\n",
" \n",
" 6 | \n",
" anatoli | \n",
" Wesolowski | \n",
" 70 | \n",
" Talca | \n",
" 2016-02-22 05:12:03.492212+00:00 | \n",
"
\n",
" \n",
" 7 | \n",
" zayd | \n",
" Lemarr | \n",
" 87 | \n",
" Almansa | \n",
" 2014-05-28 13:57:14.294830+00:00 | \n",
"
\n",
" \n",
" 8 | \n",
" sanna | \n",
" Touchette | \n",
" 74 | \n",
" Novokhovrino | \n",
" 2013-03-03 09:12:07.639588+00:00 | \n",
"
\n",
" \n",
" 9 | \n",
" aaliyah | \n",
" Feldmann | \n",
" 40 | \n",
" Monte Azul Paulista | \n",
" 2014-09-21 08:15:34.225063+00:00 | \n",
"
\n",
" \n",
" 10 | \n",
" della | \n",
" Calderone | \n",
" 54 | \n",
" Teshi Old Town | \n",
" 2012-09-09 21:35:57.563589+00:00 | \n",
"
\n",
"
"
],
"text/plain": [
"[(1, 'yahya', 'Chiotti', 95, 'Kālābāgh', datetime.datetime(2018, 3, 2, 17, 7, 5, 124475, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (2, 'daley', 'Chiotti', 53, 'Mount Lebanon', datetime.datetime(2013, 3, 8, 13, 57, 39, 824305, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (3, 'pierina', 'Horenstein', 14, 'Narangba', datetime.datetime(2014, 10, 25, 3, 53, 20, 76964, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (4, 'jouko', 'Defiore', 94, 'Kalocsa', datetime.datetime(2013, 3, 7, 4, 15, 50, 335154, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (5, 'ilana', 'Bagley', 26, 'Mitras Poniente', datetime.datetime(2015, 7, 6, 16, 24, 43, 602278, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (6, 'anatoli', 'Wesolowski', 70, 'Talca', datetime.datetime(2016, 2, 22, 5, 12, 3, 492212, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (7, 'zayd', 'Lemarr', 87, 'Almansa', datetime.datetime(2014, 5, 28, 13, 57, 14, 294830, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (8, 'sanna', 'Touchette', 74, 'Novokhovrino', datetime.datetime(2013, 3, 3, 9, 12, 7, 639588, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (9, 'aaliyah', 'Feldmann', 40, 'Monte Azul Paulista', datetime.datetime(2014, 9, 21, 8, 15, 34, 225063, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (10, 'della', 'Calderone', 54, 'Teshi Old Town', datetime.datetime(2012, 9, 9, 21, 35, 57, 563589, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)))]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select rec.* from encrypted_record e left join lateral decrypt_record(e.id) rec on true limit 10;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Restrict Access by Role\n",
"\n",
"After switching roles with `SET ROLE` the `staff` role can only call the `pseudo_record()` function, it cannot access the `encrypted_record` table or the `decrypt_record` function at all."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres@/\n",
"Done.\n",
" * postgresql://postgres@/\n",
"(psycopg2.errors.InsufficientPrivilege) permission denied for table encrypted_record\n",
"\n",
"[SQL: select * from encrypted_record]\n",
"(Background on this error at: http://sqlalche.me/e/13/f405)\n",
" * postgresql://postgres@/\n",
"(psycopg2.errors.InsufficientPrivilege) permission denied for function decrypt_record\n",
"\n",
"[SQL: select decrypt_record(1)]\n",
"(Background on this error at: http://sqlalche.me/e/13/f405)\n"
]
}
],
"source": [
"%sql SET ROLE staff\n",
"%sql select * from encrypted_record\n",
"%sql select decrypt_record(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pseudonymous Access\n",
"\n",
"Now as the `staff` role, you only access pseudonymous data. Notice how the names are different, the age is a range, the secret is obscured, and the signup date is shown as a month."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" * postgresql://postgres@/\n",
"10 rows affected.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" id | \n",
" first_name | \n",
" last_name | \n",
" age_range | \n",
" secret | \n",
" signup_month | \n",
"
\n",
" \n",
" 1 | \n",
" gioachino | \n",
" Lauderman | \n",
" [95, 100) | \n",
" Kāxxxxxxxxxgh | \n",
" 2018-03-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 2 | \n",
" kanti | \n",
" Lauderman | \n",
" [50, 55) | \n",
" Moxxxxxxxxxon | \n",
" 2013-03-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 3 | \n",
" pleasance | \n",
" Warehime | \n",
" [10, 15) | \n",
" Naxxxxxxxxxba | \n",
" 2014-10-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 4 | \n",
" dalal | \n",
" Pettway | \n",
" [90, 95) | \n",
" Kaxxxxxxxxxsa | \n",
" 2013-03-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 5 | \n",
" serafima | \n",
" Fitzen | \n",
" [25, 30) | \n",
" Mixxxxxxxxxte | \n",
" 2015-07-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 6 | \n",
" cináed | \n",
" Lydon | \n",
" [70, 75) | \n",
" Taxxxxxxxxxca | \n",
" 2016-02-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 7 | \n",
" grania | \n",
" Mcmanigal | \n",
" [85, 90) | \n",
" Alxxxxxxxxxsa | \n",
" 2014-05-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 8 | \n",
" ingrid | \n",
" Nastasia | \n",
" [70, 75) | \n",
" Noxxxxxxxxxno | \n",
" 2013-03-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 9 | \n",
" laurelle | \n",
" Malsom | \n",
" [40, 45) | \n",
" Moxxxxxxxxxta | \n",
" 2014-09-01 00:00:00+00:00 | \n",
"
\n",
" \n",
" 10 | \n",
" wilton | \n",
" Hoscheit | \n",
" [50, 55) | \n",
" Texxxxxxxxxwn | \n",
" 2012-09-01 00:00:00+00:00 | \n",
"
\n",
"
"
],
"text/plain": [
"[(1, 'gioachino', 'Lauderman', NumericRange(95, 100, '[)'), 'Kāxxxxxxxxxgh', datetime.datetime(2018, 3, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (2, 'kanti', 'Lauderman', NumericRange(50, 55, '[)'), 'Moxxxxxxxxxon', datetime.datetime(2013, 3, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (3, 'pleasance', 'Warehime', NumericRange(10, 15, '[)'), 'Naxxxxxxxxxba', datetime.datetime(2014, 10, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (4, 'dalal', 'Pettway', NumericRange(90, 95, '[)'), 'Kaxxxxxxxxxsa', datetime.datetime(2013, 3, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (5, 'serafima', 'Fitzen', NumericRange(25, 30, '[)'), 'Mixxxxxxxxxte', datetime.datetime(2015, 7, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (6, 'cináed', 'Lydon', NumericRange(70, 75, '[)'), 'Taxxxxxxxxxca', datetime.datetime(2016, 2, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (7, 'grania', 'Mcmanigal', NumericRange(85, 90, '[)'), 'Alxxxxxxxxxsa', datetime.datetime(2014, 5, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (8, 'ingrid', 'Nastasia', NumericRange(70, 75, '[)'), 'Noxxxxxxxxxno', datetime.datetime(2013, 3, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (9, 'laurelle', 'Malsom', NumericRange(40, 45, '[)'), 'Moxxxxxxxxxta', datetime.datetime(2014, 9, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None))),\n",
" (10, 'wilton', 'Hoscheit', NumericRange(50, 55, '[)'), 'Texxxxxxxxxwn', datetime.datetime(2012, 9, 1, 0, 0, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)))]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select rec.* from encrypted_record e left join lateral pseudo_record(e.id) rec on true limit 10;"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}