ToDo List for Slony-I
-----------------------------------------

Short Term Items
---------------------------

- Improve the script that tries to run UPDATE FUNCTIONS across
  versions, to verify that upgrades work properly.

- Clone Node - use pg_dump/PITR to populate a new subscriber node.
  (Jan is working on this.)

- UPDATE FUNCTIONS needs to be able to reload version-specific
  functions in v2.0, so that if we do an upgrade via:

      pg_dump -p $OLDVERPORT dbname | psql -p $NEWVERPORT -d dbname

  we may then run UPDATE FUNCTIONS to tell the instance about the new
  PostgreSQL version.  This probably involves refactoring the code
  that loads the version-specific SQL into a function that is called
  by both STORE NODE and UPDATE FUNCTIONS.  (A hedged sketch of such
  a shared loader appears in the appendix at the end of this file.)

- Need to draw some "ducttape" tests into the NG tests.

- Need to add a MERGE SET test; it should give this feature a pretty
  mean torture test!

- Duplicate ducttape test #6 - create 6 nodes:
  - #2 and #3 subscribe to #1
  - #4 subscribes to #3
  - #5 and #6 subscribe to #4

- Have a test that runs a bunch of subtransactions.

- Need an upgrade path.

Longer Term Items
---------------------------

- Windows-compatible version of tools/slony1_dump.sh

- Consider pulling the lexer from psql:
  http://developer.postgresql.org/cvsweb.cgi/pgsql/src/bin/psql/psqlscan.l?rev=1.21;content-type=text%2Fx-cvsweb-markup

Wishful Thinking
----------------------------

SYNC pipelining

  The notion here is to open two connections to the source database
  and to start running the queries that generate the next LOG cursor
  while the previous request is still pushing INSERT/UPDATE/DELETE
  requests to the subscriber.  (A sketch of the overlap appears in
  the appendix.)

COPY pipelining

  The notion here is to parallelize the data load at SUBSCRIBE time.
  Suppose we decide we can process 4 tables at a time; we set up 4
  threads, and then iterate thus:

  For each table:
    - acquire a thread (waiting as needed)
    - submit COPY TO stdout on the provider, and feed it to COPY FROM
      stdin on the subscriber
    - submit the REINDEX request on the subscriber

  Even with a fairly small number of threads, we should be able to
  process the whole subscription in roughly the time it takes to
  process the single largest table.  (A per-table sketch appears in
  the appendix.)

  This introduces a risk of locking problems that does not exist
  today: at present (alas) the subscription process demands exclusive
  locks on all tables up front, which is no longer possible once the
  work is split across multiple connections and transactions.  In
  addition, the updates will COMMIT over some period of time on the
  subscriber rather than appearing at a single instant.  The timing
  improvement is probably still worthwhile.

  http://lists.slony.info/pipermail/slony1-hackers/2007-April/000000.html

Slonik ALTER TABLE event

  This would permit passing through changes targeted at a single
  table, and would require much less extensive locking than the
  traditional EXECUTE SCRIPT.

Compress DELETE/UPDATE/INSERT requests

  Some performance could be gained by compressing a run of DELETEs
  against the same table into a single DELETE statement, e.g. turning
  "DELETE FROM tab WHERE id = 1;" through "DELETE FROM tab WHERE
  id = 500;" into "DELETE FROM tab WHERE id IN (1, ..., 500);".
  This doesn't help the time it takes to fire triggers on the origin,
  but it can speed up "mass" deletion of records on subscribers.  (A
  small illustration appears in the appendix.)  Unfortunately, this
  would complicate the application code, which people agreed would be
  a net loss...

Data Transformations on Subscriber

  Have an alternative "logtrigger()" scheme that permits creating a
  custom logtrigger function which can read both OLD.* and NEW.* and,
  variously:
  - omit columns on a subscriber
  - omit tuples

SL-Set

  Could it carry some policy as to preferred failover targets?
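
Appendix: Illustrative Sketches
-------------------------------

These sketches are hedged illustrations only - none of this is actual
Slony-I source code, and the function names, file names, and queries
they assume are placeholders invented for discussion.

Sketch: a shared loader for version-specific functions

  In the spirit of the refactoring described under the UPDATE
  FUNCTIONS item above, both the STORE NODE and UPDATE FUNCTIONS code
  paths could call one helper that picks the SQL file matching the
  backend's version.  The "slony1_base.v%d.sql" naming scheme and the
  load_sql_file() helper are assumptions for illustration only.

    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    /* Read a SQL file and execute it on the given connection.
     * (Assumed helper; a real loader would report errors in detail.) */
    static int
    load_sql_file(PGconn *conn, const char *path)
    {
        FILE       *fp;
        char       *sql;
        long        size;
        PGresult   *res;
        int         ok;

        if ((fp = fopen(path, "r")) == NULL)
            return -1;
        fseek(fp, 0L, SEEK_END);
        size = ftell(fp);
        rewind(fp);
        sql = malloc(size + 1);
        if (sql == NULL || fread(sql, 1, size, fp) != (size_t) size)
        {
            fclose(fp);
            free(sql);
            return -1;
        }
        sql[size] = '\0';
        fclose(fp);

        res = PQexec(conn, sql);
        ok = (PQresultStatus(res) == PGRES_COMMAND_OK ||
              PQresultStatus(res) == PGRES_TUPLES_OK) ? 0 : -1;
        PQclear(res);
        free(sql);
        return ok;
    }

    /* Shared entry point for STORE NODE and UPDATE FUNCTIONS: pick
     * the SQL file matching the backend's version and load it.
     * PQserverVersion() returns e.g. 80300 for PostgreSQL 8.3.0, so
     * dividing by 100 yields 803.  The file naming is a placeholder. */
    static int
    load_versioned_functions(PGconn *conn, const char *share_dir)
    {
        char        path[1024];

        snprintf(path, sizeof(path), "%s/slony1_base.v%d.sql",
                 share_dir, PQserverVersion(conn) / 100);
        return load_sql_file(conn, path);
    }

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("dbname=testdb");

        if (PQstatus(conn) == CONNECTION_OK &&
            load_versioned_functions(conn, "/usr/local/share/slony1") == 0)
            printf("version-specific functions loaded\n");
        PQfinish(conn);
        return 0;
    }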
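
Sketch: SYNC pipelining

  The overlap comes from PQsendQuery(), which returns without
  waiting, so the selection for the next SYNC runs on a second
  provider connection while the current batch is applied to the
  subscriber.  For brevity this pretends each sl_log_1 row is a
  ready-to-run statement (slon actually reconstructs statements from
  log_cmdtype and log_cmddata) and omits the per-SYNC bounds from the
  query; with real bounds the loop ends when a SYNC yields no rows.

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Placeholder query; the real per-SYNC bounds are omitted. */
    #define LOG_QUERY \
        "SELECT log_cmddata FROM sl_log_1 ORDER BY log_actionseq"

    /* Apply one batch of fetched rows to the subscriber, pretending
     * each row is an executable statement (illustration only). */
    static void
    apply_batch(PGconn *subscriber, PGresult *batch)
    {
        int         i;

        for (i = 0; i < PQntuples(batch); i++)
        {
            PGresult   *r = PQexec(subscriber, PQgetvalue(batch, i, 0));

            PQclear(r);
        }
    }

    int
    main(void)
    {
        PGconn     *provider_a = PQconnectdb("dbname=origin");
        PGconn     *provider_b = PQconnectdb("dbname=origin");
        PGconn     *subscriber = PQconnectdb("dbname=subscriber");
        PGresult   *batch;
        PGresult   *rest;

        /* The first batch is fetched synchronously. */
        batch = PQexec(provider_a, LOG_QUERY);

        while (batch != NULL && PQntuples(batch) > 0)
        {
            /* Start generating the next batch; this does not block. */
            PQsendQuery(provider_b, LOG_QUERY);

            /* Meanwhile, push the batch we already have. */
            apply_batch(subscriber, batch);
            PQclear(batch);

            /* Collect the overlapped result (waits only if it is
             * still running), then drain the NULL ending the query. */
            batch = PQgetResult(provider_b);
            while ((rest = PQgetResult(provider_b)) != NULL)
                PQclear(rest);
        }
        if (batch != NULL)
            PQclear(batch);
        PQfinish(provider_a);
        PQfinish(provider_b);
        PQfinish(subscriber);
        return 0;
    }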
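
Sketch: the per-table unit of COPY pipelining

  Each worker thread described above would take the next remaining
  table and run something like the hypothetical copy_one_table():
  stream the COPY data straight from the provider into the
  subscriber, then rebuild the indexes.  The serial loop in main()
  stands in for the thread pool.

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Relay one table: COPY TO STDOUT on the provider is fed directly
     * into COPY FROM STDIN on the subscriber, then REINDEX it. */
    static int
    copy_one_table(PGconn *provider, PGconn *subscriber, const char *table)
    {
        char        query[1024];
        PGresult   *res;
        char       *buf;
        int         len;

        snprintf(query, sizeof(query), "COPY %s TO STDOUT", table);
        res = PQexec(provider, query);
        if (PQresultStatus(res) != PGRES_COPY_OUT)
        {
            PQclear(res);
            return -1;
        }
        PQclear(res);

        snprintf(query, sizeof(query), "COPY %s FROM STDIN", table);
        res = PQexec(subscriber, query);
        if (PQresultStatus(res) != PGRES_COPY_IN)
        {
            PQclear(res);
            return -1;
        }
        PQclear(res);

        /* Pump rows from one connection to the other. */
        while ((len = PQgetCopyData(provider, &buf, 0)) > 0)
        {
            int         rc = PQputCopyData(subscriber, buf, len);

            PQfreemem(buf);
            if (rc != 1)
                return -1;
        }
        if (len != -1)              /* -1 is the normal end of COPY OUT */
            return -1;
        res = PQgetResult(provider);        /* collect COPY OUT status */
        PQclear(res);

        if (PQputCopyEnd(subscriber, NULL) != 1)
            return -1;
        res = PQgetResult(subscriber);      /* collect COPY IN status */
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            PQclear(res);
            return -1;
        }
        PQclear(res);

        /* Rebuild indexes on the freshly loaded table. */
        snprintf(query, sizeof(query), "REINDEX TABLE %s", table);
        res = PQexec(subscriber, query);
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            PQclear(res);
            return -1;
        }
        PQclear(res);
        return 0;
    }

    int
    main(void)
    {
        const char *tables[] = {"public.accounts", "public.history"};
        PGconn     *provider = PQconnectdb("dbname=origin");
        PGconn     *subscriber = PQconnectdb("dbname=subscriber");
        int         i;

        /* A thread pool would hand each table to a worker; this loop
         * just shows the per-table work serially. */
        for (i = 0; i < 2; i++)
            if (copy_one_table(provider, subscriber, tables[i]) != 0)
                fprintf(stderr, "copy of %s failed\n", tables[i]);
        PQfinish(provider);
        PQfinish(subscriber);
        return 0;
    }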
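
Sketch: compressing a run of single-row DELETEs

  This only makes the compression idea concrete; key quoting,
  batching limits, and runs of mixed statement types are all glossed
  over, and nothing here is actual replication code.

    #include <stdio.h>

    /* Fold  DELETE FROM tab WHERE key = k  (repeated nkeys times)
     * into one  DELETE FROM tab WHERE key IN (k1, k2, ...)  statement. */
    static void
    compress_deletes(const char *table, const char *keycol,
                     const char **keys, int nkeys, char *out, size_t outlen)
    {
        size_t      used;
        int         i;

        used = snprintf(out, outlen, "DELETE FROM %s WHERE %s IN (",
                        table, keycol);
        for (i = 0; i < nkeys && used < outlen; i++)
            used += snprintf(out + used, outlen - used, "%s%s",
                             i ? ", " : "", keys[i]);
        if (used < outlen)
            snprintf(out + used, outlen - used, ");");
    }

    int
    main(void)
    {
        const char *keys[] = {"101", "102", "103"};
        char        sql[256];

        compress_deletes("public.orders", "id", keys, 3, sql, sizeof(sql));
        puts(sql);  /* DELETE FROM public.orders WHERE id IN (101, 102, 103); */
        return 0;
    }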