Replicating Your First Database
-------------------------------

In this example, we will be replicating a brand new pgbench database. The
mechanics of replicating an existing database are covered here; however, we
recommend that you learn how Slony-I functions by using a fresh,
non-production database.

Note that pgbench is a "benchmark" tool that is in the PostgreSQL set of
"contrib" tools. If you build PostgreSQL from source, you can readily head to
contrib/pgbench and do a "make install" to build and install it; it is not
always included in packaged binary PostgreSQL installations.

The Slony-I replication engine is trigger-based, allowing us to replicate
databases (or portions thereof) running under the same postmaster. More
commonly, databases are replicated from one host to another. This example
shows how to replicate a master database, created with pgbench, to a slave
database. The master and slave can both be on the same cluster or on
separate hosts.

We make a couple of assumptions about your PostgreSQL configuration:

 1. You have tcpip_socket=true in your postgresql.conf, and
 2. You have enabled access in your cluster(s) via pg_hba.conf.

The MASTERDBA (and SLAVEDBA) users need to be PostgreSQL superusers. This is
typically postgres or pgsql.

You should also set the following shell variables:

CLUSTERNAME=slony_example
MASTERDBNAME=pgbench
SLAVEDBNAME=pgbenchslave
MASTERHOST=localhost
SLAVEHOST=localhost
MASTERPORT=5432
SLAVEPORT=5432
MASTERDBA=postgres
SLAVEDBA=postgres
PGBENCHUSER=pgbench

Here are a couple of examples of setting variables in common shells:

bash/sh:  export CLUSTERNAME=slony_example
(t)csh:   setenv CLUSTERNAME slony_example

Creating the pgbench user

createuser -A -D -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $PGBENCHUSER

# And if you're replicating to a separate cluster, create the user there...
createuser -A -D -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $PGBENCHUSER

Preparing the databases

createdb -O $PGBENCHUSER -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME
createdb -O $PGBENCHUSER -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME

pgbench -i -s 1 -U $PGBENCHUSER -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME

Because Slony-I depends on the databases having the pl/pgSQL procedural
language installed, we should install it now. It is possible that you have
installed pl/pgSQL into the template1 database, in which case you can skip
this step because it will already be installed in $MASTERDBNAME.

createlang -h $MASTERHOST -p $MASTERPORT plpgsql $MASTERDBNAME

Slony-I does not yet automatically copy table definitions from a master when
a slave subscribes to it, so we need to import this data. We do this with
pg_dump.

pg_dump -s -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME | psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME

To illustrate how Slony-I allows for on-the-fly replication subscription,
let's start up pgbench. If you run the pgbench application in the foreground
of a separate terminal window, you can stop and restart it with different
parameters at any time. You'll need to re-export the variables in that
session so they are available there as well.

The typical command to run pgbench would look like:

pgbench -s 1 -c 5 -t 10 -U $PGBENCHUSER -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME

This will run pgbench with 5 concurrent clients, each processing 10
transactions against the pgbench database running on localhost as the
pgbench user.
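Before moving on, it can be worth a quick sanity check that the schema-only
dump actually reached the slave. This is only a sketch of one way to do it;
the table names come from the standard pgbench schema created above:

# Optional check: the four pgbench tables should now exist on the slave,
# but they should be empty because we copied only the schema.
psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME -c '\dt'
psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME -c 'select count(*) from accounts;'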
Configuring the Database for Replication

Creating the configuration tables, stored procedures, and triggers is all
done through the slonik tool. It is a specialized scripting aid that mostly
calls stored procedures in the master/slave (node) databases. The script to
create the initial configuration for the simple master-slave setup of our
pgbench database looks like this:

#!/bin/sh

slonik <<_EOF_
#--
# define the namespace the replication system uses; in our example it is
# slony_example
#--
cluster name = $CLUSTERNAME;

#--
# admin conninfo's are used by slonik to connect to the nodes, one for each
# node on each side of the cluster. The syntax is that of PQconnectdb in
# the C-API
#--
node 1 admin conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST port=$MASTERPORT user=$MASTERDBA';
node 2 admin conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST port=$SLAVEPORT user=$SLAVEDBA';

#--
# init the first node. Its id MUST be 1. This creates the schema
# _$CLUSTERNAME containing all replication system specific database
# objects.
#--
init cluster ( id=1, comment = 'Master Node');

#--
# Because the history table does not have a primary key or other unique
# constraint that could be used to identify a row, we need to add one.
# The following command adds a bigint column named
# _Slony-I_$CLUSTERNAME_rowID to the table. It will have a default value
# of nextval('_$CLUSTERNAME.s1_rowid_seq'), and have UNIQUE and NOT NULL
# constraints applied. All existing rows will be initialized with a
# number.
#--
table add key (node id = 1, fully qualified name = 'public.history');

#--
# Slony-I organizes tables into sets. The smallest unit a node can
# subscribe to is a set. The following commands create one set containing
# all 4 pgbench tables. The master or origin of the set is node 1.
# You need a set add table () for each table you wish to replicate.
#--
create set (id=1, origin=1, comment='All pgbench tables');
set add table (set id=1, origin=1, id=1, fully qualified name = 'public.accounts', comment='accounts table');
set add table (set id=1, origin=1, id=2, fully qualified name = 'public.branches', comment='branches table');
set add table (set id=1, origin=1, id=3, fully qualified name = 'public.tellers', comment='tellers table');
set add table (set id=1, origin=1, id=4, fully qualified name = 'public.history', comment='history table', key = serial);

#--
# Create the second node (the slave), tell the 2 nodes how to connect to
# each other and how they should listen for events.
#--
store node (id=2, comment = 'Slave node');
store path (server = 1, client = 2, conninfo='dbname=$MASTERDBNAME host=$MASTERHOST port=$MASTERPORT user=$MASTERDBA');
store path (server = 2, client = 1, conninfo='dbname=$SLAVEDBNAME host=$SLAVEHOST port=$SLAVEPORT user=$SLAVEDBA');
store listen (origin=1, provider = 1, receiver = 2);
store listen (origin=2, provider = 2, receiver = 1);
_EOF_
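If the script ran without errors, the Slony-I schema now exists on the
master. As a quick, optional check, you can look at a couple of its catalog
tables. This is only a sketch; the exact layout of the sl_node and sl_table
catalogs may differ between Slony-I versions:

# Optional check: list the nodes and replicated tables that the slonik
# script registered in the Slony-I catalog on the master.
psql -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME <<_EOF_
select no_id, no_comment from _$CLUSTERNAME.sl_node;
select tab_id, tab_set, tab_comment from _$CLUSTERNAME.sl_table;
_EOF_

If everything worked, you should see two nodes and four tables listed.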
Is pgbench still running? If not, start it again.

At this point we have two databases that are fully prepared. One is the
master database, in which pgbench is busy accessing and changing rows. It's
now time to start the replication daemons.

On $MASTERHOST the command to start the replication engine is:

slon $CLUSTERNAME "dbname=$MASTERDBNAME user=$MASTERDBA host=$MASTERHOST port=$MASTERPORT"

Likewise, we start the replication system on node 2 (the slave):

# And if you're replicating to a separate host, you might want to run
# this on the slave host, but it is not necessary for this example ...
slon $CLUSTERNAME "dbname=$SLAVEDBNAME user=$SLAVEDBA host=$SLAVEHOST port=$SLAVEPORT"

Even though we have slon running on both the master and the slave, and both
are spitting out diagnostics and other messages, we aren't replicating any
data yet. The notices you are seeing are the synchronization of cluster
configurations between the two slon processes.

To start replicating the 4 pgbench tables (set 1) from the master (node id 1)
to the slave (node id 2), execute the following script:

#!/bin/sh

slonik <<_EOF_
# ----
# This defines which namespace the replication system uses
# ----
cluster name = $CLUSTERNAME;

# ----
# Admin conninfo's are used by the slonik program to connect
# to the node databases. So these are the PQconnectdb arguments
# that connect from the administrator's workstation (where
# slonik is executed).
# ----
node 1 admin conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST port=$MASTERPORT user=$MASTERDBA';
node 2 admin conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST port=$SLAVEPORT user=$SLAVEDBA';

# ----
# Node 2 subscribes set 1
# ----
subscribe set ( id = 1, provider = 1, receiver = 2, forward = no);
_EOF_

Any second now, the replication daemon on $SLAVEHOST will start to copy the
current content of all 4 replicated tables. While doing so, of course, the
pgbench application will continue to modify the database. When the copy
process is finished, the replication daemon on $SLAVEHOST will start to
catch up by applying the accumulated replication log. It will do this in
little steps, ten seconds' worth of application work at a time. Depending on
the performance of the two systems involved, the sizing of the two
databases, the actual transaction load, and how well the two databases are
tuned and maintained, this catch-up process can be a matter of minutes,
hours, or eons.

You have now successfully set up your first basic master/slave replication
system, and, once the slave has caught up, the two databases will contain
identical data. That's the theory. In practice, it's good to check that the
datasets are in fact the same.

The following script will create ordered dumps of the two databases and
compare them. Make sure that pgbench has completed its testing, and that
your slon sessions have caught up.

#!/bin/sh

echo -n "**** comparing databases ... "

psql -U $MASTERDBA -h $MASTERHOST -p $MASTERPORT $MASTERDBNAME >dump.tmp.1.$$ <<_EOF_
select 'accounts:'::text, aid, bid, abalance, filler from accounts order by aid;
select 'branches:'::text, bid, bbalance, filler from branches order by bid;
select 'tellers:'::text, tid, bid, tbalance, filler from tellers order by tid;
select 'history:'::text, tid, bid, aid, delta, mtime, filler,
    "_Slony-I_${CLUSTERNAME}_rowID" from history order by "_Slony-I_${CLUSTERNAME}_rowID";
_EOF_

psql -U $SLAVEDBA -h $SLAVEHOST -p $SLAVEPORT $SLAVEDBNAME >dump.tmp.2.$$ <<_EOF_
select 'accounts:'::text, aid, bid, abalance, filler from accounts order by aid;
select 'branches:'::text, bid, bbalance, filler from branches order by bid;
select 'tellers:'::text, tid, bid, tbalance, filler from tellers order by tid;
select 'history:'::text, tid, bid, aid, delta, mtime, filler,
    "_Slony-I_${CLUSTERNAME}_rowID" from history order by "_Slony-I_${CLUSTERNAME}_rowID";
_EOF_

if diff dump.tmp.1.$$ dump.tmp.2.$$ >$CLUSTERNAME.diff ; then
    echo "success - databases are equal."
    rm dump.tmp.?.$$
    rm $CLUSTERNAME.diff
else
    echo "FAILED - see $CLUSTERNAME.diff for database differences"
fi

If this script returns "FAILED", please contact the developers at
http://slony.info/.
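When you are finished experimenting, you may want to remove the replication
configuration from both databases again. One way to do this is with slonik's
"uninstall node" command; the following is only a sketch, and assumes you
stop both slon daemons first:

#!/bin/sh

# Sketch of a teardown script: stop both slon daemons before running this.
# "uninstall node" drops the _$CLUSTERNAME schema from the node and restores
# the tables to their original state.
slonik <<_EOF_
cluster name = $CLUSTERNAME;
node 1 admin conninfo = 'dbname=$MASTERDBNAME host=$MASTERHOST port=$MASTERPORT user=$MASTERDBA';
node 2 admin conninfo = 'dbname=$SLAVEDBNAME host=$SLAVEHOST port=$SLAVEPORT user=$SLAVEDBA';
uninstall node (id = 2);
uninstall node (id = 1);
_EOF_

After that, the two pgbench databases are ordinary, unreplicated databases
again and can simply be dropped if you no longer need them.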