JGuiGen Replication System - Version 14

JGuiGen Replication System - Version 15 - 6/29/2006

Written by G. Hale Pringle Ed. D.

For the purposes of this document, Data Replication is defined as synchronizing selected tables on two or more database servers. The JGuiGen Replication system gets new and changed rows from one database and sends local new and changed rows to another server. At the end of the process the two tables are the same with the exception of a few system fields that should be different. The objective is to take a snapshot of each system and update the other system to match this snapshot.

Beyond Simple Replication

When you have remote sites that must stand alone (satellite offices, road warriors, sites with intermittent Internet connections, etc.) the usual solution is to replicate the data between sites. This may be as crude as retyping data or as complex as Oracle’s Replication System which requires experienced Data Base Administrators at all of the sites.

JGuiGen has a built-in replication system that is much better than retyping data, but is not as “strong” as the systems developed by major database venders.

Having remote sites and evolving software means that you must take care of five major concerns.

1. Initial Distribution of software and data – We will be using Sun’s JAWS, but there are other options (such as the software distribution system developed by the Eclipse project.) With JAWS, a user attaches to the Internet and automatically gets a copy of the software. After the first time they will run with the existing copy unless they find a newer copy when they link to the Internet. In that case the new copy is automatically installed.

2. Updating the software – JAWS does this very elegantly.

3. Updating the schema of the remote databases to include changes required by “new” versions of the software. This is not part of the JAWS system.

4. Running one-time procedures required by updated software and modified data schemas. Again this is critical, but is not covered by JAWS or other systems that focus on distributing “versions” of software.

5. Verifying the accuracy of replicated data.

The JGuiGen Replication system addresses all five of these concerns.

INDEX

I. Background
II. Type of replication
III. Objectives
IV. Overview - dummy walk through in English
V. Criteria for Evaluating Replication Systems - Notes about how JGuiGen stands up.
.... Architectural Requirements
....Functional Requirements
....Administrative Criteria
....Resource Management
....Change Management
....Recovery
....Vendor Health
VI. Minimal JGuiGen Replication
VII. System Details
....Assumptions
....Security
....JGReplicate Table
....Over-Booking the JGReplicate Table
VIII. Rules - Requirements that must be met.
IX. Walk Through - following the process step-by-step
XX. Avoiding the IsRemoved Field Requirement
XXI. Moving a site to a new primary server
XXII. Special Notes about the JGReplicate Table - fields used by various types of records
XXIII. Reports that will be available in the JGuiGen Replication system.
XXIV. General Notes
XXV. Discussion of the "PROBLEM WINDOW" during replication. XXVI. Planned JUnit Tests
XXVII. Code Steps - coding that is underway.
Back to Top

I. BACKGROUND

I have designed and worked on homegrown data replication systems since the mid 1990s. This document is my latest and gr8t3%t (nope - just couldn't say it) thinking and it builds on the existing work. There are some differences but this is basically what I spent those years developing and running. The systems I developed connected several dozen sites in 22 different states.

Desktop applications (not web-based) that work with databases fall into three rough categories.

Applications that work solely on one PC using a local database.
Applications that work using some type of client/server system where one or more desktop machines share one database. (This category escalates up into the massive clustered systems.)
Applications that are hybrids of category 1 and 2. These applications have at least one user who works with a stand-alone system and needs to share at least some data with the users on another system. The answer is always some type of replication service, even if this means someone typing the data into the other system. (Oh yes! This happens. More than we want to think.)

The advantage of the hybrid is that users can work off-line. They do not need to be connected to the Internet or to some central database server. The major disadvantage is latency. Some of the data is out-of-date and there is some effort required to move correct data to and from the remote sites when they do come on-line. Regardless. there are times when the hybrid is the only viable solution.

JGuiGen can create applications that work in any of the three modes described above. It handles stand-alone systems and large multi-user client server systems. With the addition of the replication system is can synchronize these two groups. (Note: It has always been able to connect a user to a remote database via the Internet. This replication system just allows that user to work when they are not connected. They can then connect and bring everything up-to-date.)

II. JGuiGen REPLICATION - WHAT TYPE OF REPLICATION IS IT?

There are several different types of replication. Some are download only. Some require a full dump of one site's information and others involve real-time or near real-time synchronization. Using definitions from Marie Barretta's book "Data Replication" JGuiGen uses a modified master/slave asynchronous multiple central site replication scheme. The master/slave designation indicates that data from one site is considered to "own" a record. They are the only ones that are allowed to make changes to that record. (Note: the ownership can be passed around so that someone else can edit the record, but only one site owns a given record at any given time.) The "modified" in this definition come from the fact that JGuiGen allows the non-master users to fill in empty text data in fields that someone else owns. The asynchronous part indicates that the data is not updated in real-time or near real-time. Users can run independently for long periods of time (they probably shouldn't, but they could) and then synchronize their data with the primary site. The "multiple central sites" refers to a system where some users can replicate with a central site and that central site synchronizes with another central site that is "higher" up the chain. All data is eventually replicated, it just may take a couple of days.

Note: At present JGuiGen Replication is designed to work at the table level. A while table is owned by one master site. Since JGuiGen already includes a data dictionary table that contains information about individual columns within each table, it could be expanded to do column level master/slave replication without requiring a major overhaul. The previous systems I designed had column level controls, but it doesn’t seem necessary in this first pass.

There are two main reasons for including Data Replication with JGuiGen.

JGuiGen ships with the standalone database HSQL. In order to use JGuiGen with one of the other backend databases (e.g. mySQL, PostgreSQL, Oracle), the data_dict, userlogdata, and jguigen_i18n tables need to be created and the data from the HSQL database needs to be moved into them. This replication system can be used to do this.
The other major reason for data replication is to support business applications that want to deploy stand-alone applications via JAWS, or the Eclipse program update system, or something like these and then have users able to run the application in stand-alone mode. The users usually need to be able to synchronize their data with a home/primary site. This replication system also can be used to do this.

III. OBJECTIVES

When I started this project in the mid 1990's I had a few objectives. As the project went on I discovered a number of other things that needed to be covered. Here is a fairly complete list.

Accurate - All rows synchronized, all columns correct
Secure - Resistant to attacks from outside and inside the system.
Easy to use and easy to automate. It should handle (as well as possible) users that don't care about the whole thing. (I view apathy as an "attack" on the integrity of the data. We need to know when a site is not sending and receiving the current data that other sites are using to run our business.)
Reliable - it should just work
Reliable - It should be tolerant of interrupted replication sessions
Reliable - It should be amenable to warm servers to minimize the impact of single point of failure problems.
If there is a problem, the replication system should not make it worse.
Fast - minimal redundant transfer of data (Note: Since whole rows are sent, there is transmission of data that wasn't changed, but this is a tradeoff with accuracy and complexity. Tracking changes at the individual row/column level is very complex or requires logging which is hard to add to legacy system.)
Flexible - system should allow for clusters of users when necessary.
Low impact on users - minimal down time. (This system does not require that users be off the system during replication.)
Low impact on users - There should be no impact on local users when the Internet is unavailable.
Low requirements on existing applications and tables. (The only major one is requiring that rows never be actually deleted. This can be waived if a stored procedure updates a table called JG_Killed_Rows when a row is deleted in a table that will be replicated.)
Time zone and server clock differences tolerant
Scalable - Through a system of secondary primary sites build a system that can get very large. (Obviously this causes a longer lag in the time required for everyone to get new/changed data.)
Manageable - Management should be aware of what is going on.
Manageable - On special note is Management-by-Exception. The system should actively tell administrators when there are problems.
Manageable - Management should be able to override data ownership when necessary.
Manageable - Management should be able to suspend replication of selected tables when necessary.
Can be used to propagate table schema changes across sites.
Can be used to fire one-time-methods or classes that do work needed to populate tables that have had their schema modified. For example if a legacy system had a column called "NAME" which contained first and last names and a schema change created a new "FIRST_NAME" column and a "LAST_NAME" column, you might want to run a method that tried to split the existing data in the NAME column and placed it into the FIRST_NAME and LAST_NAME columns.
Should be able to restore sites that come on-line using a backup that is days or weeks old
Should be able to populate a new site completely with current data
Should be able to work with applications distributed with JAWS.
Should be able to act as an general purpose table replication system.
Cross-platform capable - should be able to handle situations where different sites are using different back-end databases (this is a tough one)
Real-world aware - Features like the ability of any site to enter empty data should be encouraged but optional.
NOT DONE - system should be transaction safe. It is mostly safe but does not guarantee all or nothing transaction delivery. This would be very difficult to add to the system as it now exists.
NOT DONE YET- the system should allow sites to get "selected" data so that they just get what they need.
NOT DONE YET- the system should have a "compare" feature which allows one site to be compared with another to verify tha accuracy of a site's data.
NOT DONE YET - the system should access rules for transforming data from disparate system and creating enhanced (summarized) data as part of the replication process.
NOT DONE YET- systems might have the ability to reliably update summary data or totals or balances as part of the replication.
NOT DONE YET – column level control of data ownership.

IV. OVERVIEW

How the system works is repeated several times below (from different points 0f view), but I would like to start with an English conversation that shows the concept behind the system. Note that in this conversation Sites 3 and 17 doe all of the talking. Primary is very dumb, it just sends rows, updates records and inserts records when Sites 3 and 17 ask it to.

Buzz Click - someone just attached to the Primary site's database system.
Secondary site - Hello Primary. I'm new. Here is a query to see if you are a primary and if you are active and can give me a new site number.
Primary - sends data.
Secondary Site - I see that you are a Primary and have a site number you can give me. Here is an update to add a Log record to show that I've talked to you, but I don't have a site number yet.
Primary - processes insert
Secondary Site - Here is a query for the next available site number.
Primary - sends site 17 as a response.
Site 17 - Hi! I'm in. Here is an update for my log entry to say that Site 17 asked for a new site number.
Primary - handles update
Site 17 - Here is a query for all previous Schema updates. I shouldn't need them but I'll put them in my JGReplicate table.
Primary Site - sends U records.
Buzz Click - someone else has attached to the Primary site's database system.
Site 3 - Hello Primary. Here is a query to see if you are a primary and active today.
Primary site - responds to request to Site 3.
Site 3 - Hello there - I am sending you an Error record that indicates that my last update did not complete. I'll also send an email to system administrators.
Primary site - handles insert.
Site 3 - Hi there again. I see that you are a primary site and I have your type of database (you are SQL-SERVER), along with your site number and name. I'll update any of your JG_Unique_id's or JG_Owner fields that I find empty. I also see that you have remote management-by-exception reporting turned off so I won't do those functions. Here is an update to update my Site record ("S" record) to indicate I started a new replication and to bump my "count-of-unsuccessful-replications.
Primary site - handles update
Site 3 – Here is a query for the time I used when I last finished a replication. If your version and mine don’t match I will need authorization from you before I can continue. (I know that this stops cloned machines with my site number from replicating and screwing things up.)
Primary site – returns data
Site 3 – Got it! It matches what I have. Do you have any schema updates for me ("U" records) that have an update number greater than 22?
Primary Site - Sends data if it exists.
Site 3 - Okay I put that change in my JGReplicate table, and made the change to my local system. Here is a query for the list of tables that you are actively replicating today.
Primary site - responds to request to Site 3.
Site 17 - Hello Primary this is Site 17. Here is an insert to update your JGReplicate table to create a new Site record ("S" record) that describes me. It indicates that I came and started this conversation.
Primary - handles insert
Site 17 - Here is an update to change the Log record I started when didn't have a site number to reflect that site number 17 was the number I was given.
Primary - handles update
Site 17 - Let me start over. Are you an active primary site, what is your site number, name and database type and should I do remote management-by-exception reports?
Primary site - responds to request to Site 17.
******* I'm going to quit reporting Site 17's conversations here.
Site 3 - Well here I am again. I see you added a new table to the list. I've added that to my JGReplicate table. Can we compare the schema in my tables against yours? Here is a query for the structure of the first table.
Primary site - Sends the schema for the data_dict table Site 3 asked for.
Site 3 - Okay that matched. Here is a query for the schema of the JGuiGen_I18N table.
*********several back and forth discussions about the schema of other tables.
Site 3 - Hello again Primary. It is now Tuesday at 10:17 PM. I last finished a replication with you on Sunday at 11:04 PM my time. (I tried on Monday but we got disconnected.) The highest date/time I got from you when we started our last replication was Sunday at 9:15 PM your time. I'm going to update a record on your site and determine the current time where you are. Here is an update and a query.
Primary -updates the "S" record in the JGReplicate table sends "Tuesday at 5:30 pm."
Site 3 - Okay I got it. I'm going to call this time my PRIMARY-MARK time. I will use it for all the tables I request data on. Here is a query for all rows in the data_dict table that were updated since Sunday at 9:15 PM your time (my PRIMARY-MARK time the last time I replicated) and less than or equal to my current PRIMARY-MARK time. Don't send me any rows that I sent you where the date/time I sent them (JG_DateWhenSent) matches the lastChangedDate on the row. These are rows that I sent you and I know about those changes.
Primary site - sends the requested rows from the data_dict table.
Site 3 - Well I've processed those. I'll ask for things newer than Tuesday at 5:30 PM the next time we talk. I updated the lastChangedDate and the JG_DateWhenSent to my current time and changed the JG_SentBy column to your site number. When we talked Monday the most recent data I had was from Friday at 4:27 PM my time (nobody worked over the weekend.) Just in case, I'm going to update your tables with all of the changes on my system since that time on Friday. (I'm not going to include the stuff you just sent me even though the time stamp is newer than the rest of what I am sending. I'll know that it was you because the lastChangedDate will be the same as the JG_DateWhenSent and the JG_SentBy column will have your site number. If the two dates are different someone has changed this record - probably in the last minute or two - and I need to send you that change.) I'm going to use your date/time to mark lastChangedDate and JG_DateWhenSent columns in the rows I'm updating on your system. I've just checked my system and the highest date/time I am going to send is a time I will call MARK. By the time I finish there may be rows that are more current than this value, but I'll send them to you next time.
Site 3 - Here are some Table Inserts for my new rows since my last update and my local MARK time.
Primary site - handles the inserts.
Site 3 - Here are some Table updates for my changed rows since my last update and my local MARK time.
Primary site - handles the updates
Site 3 - I'm updating our logs and storing the local MARK time as the time to start with during out next replication. When you're ready here is a query for all of the new data from the JGuiGen_I18N table. (Note that I'm using the smae local MARK time in this query.)
Primary site - sends the requested data.
**********similar back and forth discussions about other tables.
Site 3 - Okay I'm done. I updated our logs to indicate that I had finished. In particular I changed the column that contains the number of replications I have started that didn't finish from 2 back to zero (remember I had an error when I last replicated.) You don't talk much do you? Oh well, talk to you again soon!
click - Site 3 disconnects.
Note: During this conversation with Site 3, Site 17 could have been carrying on similar conversations.

V. EVALUATING REPLICATION SYSTEM and JGuiGen Replication
One section of Ms. Buretta's book listed "General Requirements for Creating and Evaluation Replication Systems." I am going to go down this list and talk about the JGuiGen replication system. I will mark her statements as "Replication Criteria: ....." This gives you a good starting point to make your own evaluation about the suitability of JGuiGen Replication for your own uses.

NOTE: As of 6/27/2006, this document is being written as if this system is operational. It is under active development, but is not fully functional. The code that it is based on exists and in most cases the development consists of recreating the functionality in java. I will update this document as development proceeds.

NOTE 2 This section is written in light of setting up a full-blown, secure, multi sever system. For test purposes and non-critical situations, JGuiGen Replication is much simpler. See the note at the end of the Evaluation section.

ARCHITECTURAL REQUIREMENTS

Replication Criteria: The system should be architected as middleware whose purpose to provide data replication services. This allows the system to be plugged and unplugged with minimal impact on the rest of the system. - JGuiGen Replication meets this criterion. The replication system does not require JGuiGen or its applications to run. It does require an XML file (or LDAP system) with system settings. A data_dict table with some information about the tables to be replicated and a JGReplicate table to store replication information. There is an options JG_Killed_Rows table if your system actually deletes rows.
Replication Criteria: The system should support location/distribution transparency. A database or process should not need to know where objects are. - This is not truly applicable to a database only replication system like JGuiGen Replication. Each site needs to know how to connect to its own database and how to connect to one remote site. JGuiGen Replication does handle time zone differences between primary and target sites.
Replication Criteria: The system should support a high degree of database and process transparency. - JGuiGen meets this criterion for data replication. It will move data from and to any of the databases that JGuiGen supports and can add other database types as long as they have JDBC support.

Replication Criteria: Question about database replication systems. How transparent is the system to the resource managers involved in the replication process? - This is talking about the databases and the existing applications. JGuiGen Replication is very transparent. With the addition of four fields to each table to be replicated, two or three tables for the replication process and at most two stored procedures that are triggered on updating a row and deleting a row any database system can work with JGuiGen Replication.
Replication Criteria: Question about database replication systems. Do resource managers need to be altered to participate in the replication process? If so how hard is it to do and undo? - Here is the whole list of changes required by the JGuiGen Replication system.

Four columns need to be added to each table: 1) an int JG_Owner column, 2) a long JG_Unique_id column, 3) an int JG_SentBy column and 4) a date JG_DateWhenSent column.
There needs to be an integer row_id column with a unique value in each table to be replicated.
There needs to be a lastChangedDate column in each table to be replicated.
If the current software does not already update the lastChangedDate column when a row is inserted or modified, a stored procedure will need to be installed that updates this column.
There either needs to be an isRemoved column to mark tables as removed or a JG_Killed_Rows table will need to be added and a stored procedure installed that fills the JG_Killed_Rows table when a row is deleted. There is a section below that describes this.
An index should be created on the JG_Unique_id column.
Two tables need to be added to the database - JGReplicate and Data_dict.
JGuiGen needs to be run once to populate the Data_dict table and someone needs to indicate which columns handle the row_id, lastChangedDate and isRemoved functions. Right now the JG_SentBy, JG_DateWhenSent, JG_Unique_id and JG_Owner columns must be named exactly. This could easily be changed.
The system hosting the database should subscribe to a time service to keep its date/time accurate.
A small program/daemon will need to be running at all times that connects to the remote sites via SSL on a port other than the standard database port.
this port will need to be opened through any firewalls.
Each site will need a JGuiGenIni.xml file or an LDAP system that contains the JDBC Driver name and location for connecting to the remote site.
Each site will need to install the JGuiGen Replicate software.
An optional task involves running a small application at each site once that creates encrypted files containing the information needed to connect to the local and/or remote database.
SSL certificates for remote sites will need to be generated.

JGuiGen applications already have almost all of these items in place from the beginning.

NOTE JGuiGen Replication is currently under development and the levels described here are all in place but have not been fully implemented. The system can be run in a much looser fashion when simplicity is more important than data security.

Replication Criteria: The system should be industry standard compliant - for example it should use the Open Source Foundation - Distributed Computing Environment (DCE) standards and handles XA and non-XA data.. - JGuiGen Replication meets many of these standards. It uses the industry standard of JDBC to connect to disparate database systems. It uses industry standard SSL connections to insure that data transmission is secure. It can store passwords and user names in local LDAP systems. It uses industry standard encryption algorithms such as MD5 hashing, CBCBlockCipher, Thriple DES, PassWordBasedENtryption (PBEWithMD5AndDES). JGuiGen Replication does not work with the XA transaction manager protocol at all.

Replication Criteria: Question about system security. How robust are the security implementations? - JGuiGen is very secure against internal attacks and relies on SSL to secure transmission of data. It would be easy to add a process called Tumbling (a form or port knocking) if more security was needed.

Replication Criteria: Question about system security. Who can use the functionality of the Replications system? - JGuiGen replications requires that a site be set up to establish an SSL link using SSL Data Tunneling software. Anyone who has this setup can get to the database. In its most secure form JGuiGen Replication uses encrypted files to store the user name and password needed to attach to the database.
Replication Criteria: Question about system security. What Ids and passwords are stored? - JGuiGen Replication uses three id's and passwords. A local user name and password, the local database username and password and the remote username and password. These can all be stored in encrypted files at the local site or in a local LDAP system. The next step for our system is to store the remote database username and password in a way that causes the local site to request it using an SSL Link. This would make it easier to shift from one primary site to another that could be located anywhere in the world.
Replication Criteria: Question about system security. Where are they stored and are the encrypted? - On the local file system in encrypted files or in a local LDAP system.
Replication Criteria: Question about system security. What usernames and passwords are used on remote resource managers (databases) to alter data? - As noted JGuiGen Replication requires a database username and password. This would often be a special user name dedicated to the replication process. It would be set up so that its access to tables in the database was just as tight as possible. For examples tables that are only downloaded to remote sites could be marked "read-only" to this special user.
Replication Criteria: Question about system security. How is network security integrated into the replication system? - As mentioned, JGuiGen Replication uses a special daemon/process that sets up an SSL link with a remote site on an arbitrary port. We have the code (called the Tumbler on Sourceforge) that requires the remote site send and encrypted password to even open this port.

Replication Criteria: The system should be fail-safe (i.e. fail in a manner that protects other components in the environment) - JGuiGen Replication should meet this criterion quite well. It is simply a database link to a database.
Replication Criteria: There should be no inherent bottlenecks in the solution. - Since JGuiGen Replication can be scaled to as many clusters as are needed this should not be a problem. There are no calls to other sites required at any time except when a full replication is being done.
Replication Criteria: I would add that the system should come with a set of reliability tests that demonstrate most or all of its facilities. - These are under development as part of the JGuiGen Replication coding. There are JUnit test and a system for passing information relevant to these tests back and forth. A critical feature involves creating a "compare" facility that compares the values in a table with its counterpart on another system. This facility needs to ignore the row_id and lastChangedDate which are unique to the local system and the JG_SentBy and JG_DateWhenSent columns that are different on different systems.

FUNCTIONAL REQUIREMENTS

Replication Criteria: The replication system should be scalable. - This system is.

Replication Criteria: Question about scalability. What resource managers (databases) can be used as primary sources? - Any of the over six that JGuiGen supports. Others that have JDBC drivers could be added in very short order.
Replication Criteria: Question about scalability. What resource managers (databases) can be used as targets? - Any of the over six that JGuiGen supports. Others that have JDBC drivers could be added in very short order.
Replication Criteria: Question about scalability. Given the available resources, are there limits on the amount of data that can be transmitted through the system? - This is an individual issue that I can't answer. Given that JGuiGen does not handle blobs at this point, this should not be a problem.

Replication Criteria: The system should use or interface with data transformation and mapping. JGuiGen Replication handles minimal transformation. It will convert Boolean fields to integers given that some databases use Booleans and others do not. It does not support (at this time) full scale data transformation (such as a situation where one site stores Male and Female as 1 and 0 and another stores the same information as "M" and "F".)

Replication Criteria: Question about transformations. Does the system alter data types between systems? - As described above - yes in limited situations.
Replication Criteria: Question about transformation. Does the system insure that all target sites represent the data identically? - This is the 1/0 versus M/F question. The answer is no except for Boolean transformation.
Replication Criteria: Question about transformations. Does the system enhance the data? - This means can it creates summaries or other major changes as data is being delivered? - No
NOTE: It should be noted that JGuiGen Replication does not have the facility to do more that replicate tables. If you need a system that has failsafe ability to update totals or make other cumulative changes at the primary site, JGuiGen Replication is not appropriate. For example, if you need a system that will send new financial transactions and then update a "balance" field on the primary site, JGuiGen Replication is not appropriate. The system will insert the new rows and in the case of failure it would finish inserting the rows. There is no way to guarantee (without modifying the existing code) that the balance field would not be updated multiple times or not updated at all if all the records were inserted and then a failure happened.

Replication Criteria: There should be a scheduling mechanism. JGuiGen Replication depends upon native systems to fire off the replication executable at a regular time. Road warriors can fire off their systems at any time.
Replication Criteria: There should be a friendly system that describes the data or objects being transmitted. JGuiGen Replication lists the tables it is working on in a command window while it works.
Replication Criteria: There should be a friendly system to subscribe to the data/objects that are available for replication. - JGuiGen Replication is an all or nothing system at present. Once you attach to a primary site it replicates the tables that have been set up for replication.
Replication Criteria: The initialization process for targets should be simple and straight forward.- As noted the hardest part of getting started (in secure systems) is setting up the SSL certificates on the remote sites. After that the remote site starts with an empty JGReplicate table and attaches to a primary system. The table is populated and replication begins.
Replication Criteria: The replication system should be well behaved and provide the quality of services specified by the business environment - Since this is basically a database connection with updates and inserts it is well behaved. The issue on whether or not the "quality of service" is acceptable depends upon the group wishing to use JGuiGen Replication.

ADMINISTRATIVE CRITERIA

Replication Criteria: Configuration - The configuration process should be simple and straightforward. - I believe that this criterion is met. For full security there is one daemon/process that must be installed on primary servers and then the changes listed earlier. For target sites the software can be delivered via JAWS. Existing databases will need to have two columns added and two or three tables added. The SSL certificates will need to be created and a small executable run that creates the encrypted files for the target system.
Replication Criteria: Configuration - It should support dynamic reconfiguration - It does.
Replication Criteria: Configuration - Some sort of configuration history is a plus - JGuiGen Replication logs virtually all of its activities. This provides a fairly full history.
Replication Criteria: Monitoring - There should be a GUI monitoring tool for the entire replication system - This does not exist yet. It could easily be added when someone needs it.
Replication Criteria: Monitoring - The monitoring system should integrate with the existing monitoring system for the firm - JGuiGen currently documents errors in the JGReplicate table and sends emails to a system administrator when errors occur. This could easily be modified to handle other reporting requirements.
Replication Criteria: Performance Monitoring - JGuiGen Replication has a GUI for the JGReplicate table. This includes several standard reports that can be run at any time to monitor the activity in the system. Other ways of reporting this data could be added very easily.
Replication Criteria: Performance Monitoring - The system should efficiently sue existing resource. JGuiGen Replication requires internet band width and database server machine cycles. Thats about it.
Replication Criteria: Error Management - There should be an error log - This is handled in the JGReplicate table as described in this document.
Replication Criteria: Control of the replication process should be controllable from either the target sites or the central sites. - yes.
Replication Criteria: The system should handle emergency shutdowns - yes
Replication Criteria: The system should automatically restart after emergency shutdown. - yes

RESOURCE MANAGEMENT

Replication Criteria: Resource Management - the queuing system should efficiently handle resources. - This is currently done by manually scheduling when remote sites will replicate.
Replication Criteria: Resource Management - CPU - the subscription service should be well behaved. - As noted, after a target site is configured it automatically joins the system when it first connects.
Replication Criteria: Resource Management - Network - the system should be well behaved on the network.- As noted this system uses standard SSL for connection and security. The amount of band-width or extra bandwidth needed would depend upon the amount of data being moved. In most cases the replication will occur during off-hours and normal bandwidth that is under-used in those time periods will be sufficient.
Replication Criteria: Resource Management - Access management should be done with standard protocols - SSL and JDBC are extremely standard.

CHANGE MANAGEMENT

Replication Criteria: Change Management - It should be easy to upgrade the system. - JGuiGen Replication is being built with systems that change table structures at remote sites and run special methods that modify existing data after a data base schema changes has been made. Target sites can automatically update they software via JAWS, which will automatically install updated replication software when the site starts the replication process. Since they have to be connected to the Internet before they can replicate this imposes no additional burden.
Replication Criteria: Change Management - Migration should be straight -forward and should permit release level coexistence. - Yes in most cases.

Replication Criteria: Change Management - Must all sites migrate to the new system at the same time? - Since the migration is performed via the Internet during the replication process, this will usually happen automatically.
Replication Criteria: Change Management - can multiple levels exist in the system at the same time? This will happen rarely given the automatic nature of the upgrade system. JGuiGen Replication is not designed to handle some sites that have tables with different database schemas than others. (Except in certain cases where target sites have fields that are never replicated to the primary sites.)

CHANGE MANAGEMENT OF DATABASE SYSTEMS

Replication Criteria: Change Management - There should be tools for altering database objects - Yes

Replication Criteria: Change Management - What tasks must be done when the DDL (Data Definition Language) changes? - As described later, a "U" record must be inserted in the JGReplicate table to tell target sites how to modify their tables. Software can be distributed at the same time that uses the update number to trigger special methods that fill new columns with appropriate data or move data from one column to another.
Replication Criteria: Change Management - Are tools provided to assist with this change process at the primary sties? as the target sites? - JGuiGen provides as GUI for accessing the JGReplicate table. The system for automatically inserting the update records ("U" records) is under development.
Replication Criteria: Change Management - Is there a repository that indicates which targets have received the changes? - Yes. The JGReplicate table contains this information and there can easily be a report to show the sites that have received updates.

RECOVERY

Replication Criteria: Recovery - There should be automatic recovery procedures to recover from specified types of failure - Most will be handled when the primary database server is rebooted, a warm back server is activated or the primary server is regenerated from backups. A primary could easily be repopulated from another primary or even a site that is normally a target site.
Replication Criteria: Recovery - The system should fail safely - yes
Replication Criteria: Recovery - There should be a data reconciliation facility to com[are one site with another - under development

VENDOR HEALTH

There were several questions about the financial health of the vendor. This is Open Source software. A user has the full source code and can modify it or take over its development at any time. The "vendor" health issues are not particularly relevant.

Given that this is a small open source project developed by one person, it would seem that JGuiGen Replication does a fairly decent job of meeting this standard set of evaluation criteria.

****END OF EVALUTAION SECTION

VI. MINIMAL JGuiGen REPLICATION

As mentioned in a note above, JGuiGen Replication does not require all of these bells and whistles. For a test drive or a non-critical small application all you need is the items in the following list.

The tables to be replicated need to exist with the appropriate schema at the target site. (They obviously must exist as the primary site, but I assume you have something that already exists that you want to replicate.)
The seven variables in each table to be replicated need to exist.
The JGReplicate table needs to exist at the primary site and the target site.
Minimal information needs to be installed in an "M" record on the primary site. (Site number =1, Enter a name, Process = "Y", Next-available-site = 2, Next-available-site- lock = "N") JGuiGen supplies a GUI for entering this data.
Table records ("T" records) in JGReplicate on the primary site. These need a table_name, Process = "Y". Again, JGuiGen supplies a GUI for entering this data.
A primary site with a port open to the Internet or an intranet.
Settings in the JGuiGenIni.xml file to attach to a local database and a remote database.
The JGuiGen application with it's menu option - Maintenance --> Replicate

This is how we have users who want to use a larger database system than HSQL populate their new database from the HSQL records that ship with JGuiGen.

VII. SYSTEM DETAILS

Replication involves databases running at multiple sites. For each database system using JGuiGen replication there is one master primary site. There can be secondary primary sites. For example a remote office could call into the main office, but all sites local to the remote office would get their replications from the remote office. All sites connect to a primary and update it. Secondary primaries connect to the master primary. The remote sites also get new and updated rows from their primary site. Each site has a unique site number across the whole system.

Every table that will be replicated by JGuiGen must have seven variables in it These are used by this replication system.

row_id - integer - a unique id number for the row (regardless of what it is called) This is almost always the primary key for the table.
lastChangedDate - timestamp of when a row was last updated.
isRemoved - A Flag column used to indicate that a row should be considered as "deleted"
JG_Owner int - this contains the site number that OWNS a row. Owner wins if there are differences. This is the "master" in the master/slave relationship that is part of this type of replication.
JG_Unique_id -long - this is the unique row number for this row across all sites. It consists of the local site-number times 10,000,000,000 plus the local site's row_id times 10. For example if site 5 has a row with a row_id of 57, it will generate a JG_Unique_id of 500,000,000,570 ((5 * 10,000,000,000) + 57*10). Note: the reason the row_id is multiplied by 10 is that this leaves us room to temporarily insert a row that ends with a 5 and sorts between existing rows. This will be used when comparing a row from another site with a row from the current site.
JG_SentBy - int - the site number that sent the record.
JG_DateWhenSent - timestamp - a date/time that matches lastChangedDate when a record arrives at a site. (See the note about the "Problem Windows" below.)
Note: Right now all of the fields that start with "JG_" MUST be named exactly. JGuiGen queries the data_dict table to determine the actual column names used for the other functions.

Assumptions

Every site has a unique site id. There are NO duplicates. New sites get their site number from a primary site. (There is a built-in test for violations of this assumption.)
There is one master primary site. It hands out the site numbers.
If there are secondary - primary sites, they each can get five new site numbers they can give away. Once they have given them away they can't help a new site join the system until they get the new set of site numbers to give away from the master primary site. The secondary master replenishes any numbers it has given away when it next replicates with the Master Primary site. (A stored procedure could be created to get new site ids to give away once current set has been doled out.)
Each table being replicated has a row_id column that contains a unique value within the table. This row_id is unique at the site. Any row_id may be duplicated at different sites.
No site has a local row_id that is greater than 10,000,000,000.
Each table has a variable lastChangedDate (regardless of what it is called) that is updated with the current date/time when a row is added or changed.
There will be no rows in any of the tables being replicated that have a column that is constantly being updated by some batch process. Since we allow users and other processes to use the system during replication, it is possible that the row with this kind of column would miss the replication cycle much or all of the time.
All tables to be replicated have a variable isRemoved (regardless of what it is called) that marks a row to be considered as deleted. Rows cannot be really deleted. This requirement can be waived with a stored procedure and a JG_Killed_Rows table. There is a section below that describes this process.
Each table to be replicated has the seven variables (regardless of their names) listed above. The replication system will fill in the JG_Owner, JG_SentBy, JG_Unique_id and the JG_DateWhenSent columns along with the lastChangedDate column when a row is replicated. The isRemoved and lastChangedDate columns need to be updated by applications or with triggers in the SQL database during other changes.
The JGReplicate table must be available for the JGuiGen Replication system to use.
The JGuiGenIni.xml file must be available. It must have the email address of the system administrator, connection information about the local and the primary site this site will connect to. It should also have an English site name for each site. (Site 17 is nice, but "Jim Smith's Laptop" is easier to work with.) This could also be stored in a local LDAP system.
The JGuiGen Data Dictionary table must be available and must contain data on the tables to be replicated. The basic data is done when you mark the table to be processed in the JGuiGen Create system. Someone needs to indicate which columns in the table are the primary key field, the lastChangedDate columns and the isRemoved column.
There needs to be a JDBC driver for the database.
The database at the primary site must support the JDBC function CURDATE() which returns the current date and time on the database server doing the update.
Primary servers should subscribe to a time server. Their date/time stamps are critical.

Security

We will be using SSL tunneling software (already available) between local sites and the primary to insure that the data transfer is secure.
We could also institute the use for the Sourceforge Tumbler. This is code that requires a remote site to send an encrypted password in order to unlock a port.
When you connect to a database you need a user name and password. JGuiGen can obtain these from encrypted files (or an LDAP system) that are unique to each site.
We could do a - "here is my password can I work?" from the remote to the primary. For example the "Tumbler" project on Sourceforge is java code that requires a secret password before it will open a port. (Similar to port knocking.)
We could require that some sites come from a specific IP address.
I would suggest having the replication software use a special account to access the local database. This account should be the only one with read/write rights to the JGReplicate table. This helps to insure that no one accidentally (or deliberately) meddles with the JGReplicate table during replications. (When I find a way to issue a JDBC Table-lock, I will implement that as well.)

JGReplicate Table

There is a special table at each site called JGReplicate. It contains one row for each table to be updated and log entries that track replications attempted and completed.
Note: In the following rows the [] notations indicate which record types use the field.

The schema of the JGReplicate table is

replicate_id - int - an id number of the row.
table_name {T]- char(35) - the table name of the table to be replicated.
database_type [MS] - char[20] - Oracle, MySQL, etc.
JGuiGen_specific_table [T]- char(1) - Y/N - Y for userlogdata, JGuiGen_I18N and data_dict. JGReplicate is not replicated.
Type-of-replication [T]- (Used for "T" rows) Empty or "F" for full replication, "D" for download only (primary to target) or "U" for upload only (target to primary).
process [ETU]- char(1) - "Y" for yes and "N" if the replication of this table should be turned off.
done [ETU]- char(1) - Y for table was complete and N or blank if not.
Allow-blank-field-filling [T]- char(1) - defaults to "Y". Set to "N" if sites that do not own a row should not be allowed to insert data into empty fields.
Current-local-high-date-time [T]- timestamp - This is an arbitrary MARK time established when the current site started to send inserts and updates to the primary site for any table. It sends everything that was inserted between the previous MARK and the current MARK (At the end it gets moved to previous- local-high-date-time.)
Previous-local-high-date-time [T]- timestamp - Used in the Table record. This was an arbitrary MARK time when the site starts to send inserts and updates on any table. The site sends everything new up to this time during the previous update. The MARK time is used since the system has to send either inserts and have them processed and then updates and have them processed or vice versa. We don't want to use a different time for inserts than we use for updates. And we really don want to use different times for different tables.

NOTE: When a site tries to replicate and their Previous-local-high-date-time does NOT match the Previous-local-high-date-time at the primary server an error record is written and an email is sent to the System administrators. This usually means that two different locations are trying to use the same site number. The only exception is when the One-time-mismatch-of-times-allowed flag is set to “Y”.

Current-primary-high-date-time [T]- timestamp - The system updates the "S" record on the primary server and then queries for the lastChangedDate for that row. This becomes the PRIMARY-MARK. We will query all tables for inserts and changes made between the last PRIMARY-MARK and the current PRIMARY-MARK.
Previous-primary-high-date-time [T]- timestamp - The PRIMARY-MARK time determined during the last replication. We will ask for all changes between this date/time and the new PRIMARY-MARK.
One-time-mismatch-of-times-allowed – char(1) Usually “N”. When this is “Y” the remote site is allowed to replicate even though their times do not match the times stored on the primary server. Requiring the times to match stops two sites from using the same site number. Allowing a one-time relaxation of this rules allows a site to recreate its database from a backup and then bring itself up-to-date.
rep-count-started [SMT]- int - Number of replications started.
rep-count-finished [SMT]- int - Number of replications completed.
out_of_date_after_minutes - [SM]int - How many minutes (days) can pass before a site is considered out-of-date. This number is used by local software to notify local users that they need to replicate.
out_of_date_after_minutes_error [SM]- int - How many minutes (days) can pass before a site is considered out-of-date and an error is logged and an email is sent to the system administrators. This is used when the local users did not respond to their system prompts that they were out-of-date. Each replicating site sends an email about other sites who have a error-1-email-sent" = "N" and haven't replicated in this many minutes (days). (The sender ignores the site if the error-1-email-sent field is already set to "Y". It sets the error-1-email-sent field to "Y" once this message has been sent.)
Flag [all]- Char(1) - This column indicates which "type" of row you are looking at.

M row - There is one row marked "M" for master. This row contains the site number for this site and other key information.
S row - Each target site that attaches to a primary site creates a Site row. This row will contain counts and dates that show a sites replication status.
T for Table entries (using the columns described above)
L for Log entries using the next set of columns.
U rows -Update rows are marked with a "U". These are used to pass table schema changes to target sites.
E rows - these indicate Error conditions. Error messages are left in these rows.
J rows - (under development) special rows used by JUnit tests to pass information between replicating sites.

Site_number [all]- int - site number doing the replication (locally this will always be the local site's number. At the primary site this will indicate which site started/finished a table replication.)
Site_name [all]- Char(100) - English name for the site.
Date_rep_started [LSM]- timestamp - when a table replication was started
Date_rep_ended [LSM]-timestamp - when the table replication ended. (If this is empty the replication aborted somehow.)
record_created_date [MSEUL]- timestamp - used by all facilities to indicate when a record was originally placed in the system. (e.g. when a site record was first created.)
Notes [EMLS]- Text - a field used by the JUnit tests or by administrators to help with debugging problems. A log of all replications is kept in the field in the “S” record.
JUnit-Work [J]- Char(60) - JUnit test can use this column to pass information between replicating sites.
Primary-site [M]- Char(1) - "M" for the master primary, "S" for secondary primaries, blank for others.
Number_of_rows_inserted_by_primary [L]- int - used in "L" records.
Number_of_rows_inserted_by_target [L]- int - used in "L" records.
Number_of_rows_updated_by_primary [L]- int - used in "L" records.
Number_of_rows_updated_by_target [L]- int - used in "L" records.
Next_Available_Site_number-1 [M]- int. - next site number to dole out
Next_Available_Site_number-2 [M]- int. - only used by secondary masters.
Next_Available_Site_number-3 [M]- int. - only used by secondary masters.
Next_Available_Site_number-4 [M]- int. - only used by secondary masters.
Next_Available_Site_number-5 [M]- int. - only used by secondary masters.
next_available_Site_number_locked [M]- char(1) - Y/N used to make sure that two sites don't get the same site number.
Send_email_on_completion [MES]- char(1) - Y/N - should each site send the administrator an email when they successfully complete a replication. (Used for debugging and early in deployment.) This can be turned on for just one "problem" site.
process_mgmt_by_exception [M]- char(1) ="R" = remotes do it (default), "L" = local do it (requires a local process to be fired off) - else do not do them at all
email-bad_rep_days [M]- int - each replicating site sends an email about other sites whose last completed replication is more than this number of days old. (The sender ignores this site if the "error-2-email-sent" field is already set to "Y". It sets the "error-2-email-sent" field to "Y" once this message is sent.)
email-bad-rep-count[m] = int - each replication site sends an email when they start to replicate and their count of unsuccessful replications exceeds this.
update_number [US]- int - a sequential number that identifies "U" records. Each schema update should be one higher than the last. On the "S" record, this field contains the highest update the site completed.
update_decripton [U]- char(50) - short description of schema change
update_spec [U]- char(150) - this is the schema change that should be made ("ADD" var_name, var_type, default value, length,, decimals, nullable).
error-1-email-sent [S]- char(1) - "N" if last replication date is too old message has not been sent. "Y" if it has. (These are set to "N" when a site completes a replication.)
error-2-email-sent [S]- char(1) - "N" if the too-many-consecutive-unsuccessful-replications email message has not been sent. "Y" if it has. (These are set to "N" when a site completes a replication.)
error-3-email-sent [S]- char(1) - extra
error-4-email-sent [S]- char(1) - extra

Over-Booking the JGReplicate Table

This table is "over-booked". It has at least nine different things that it takes care of.

It holds the site number, name, and type of database in the "M" record
It holds the list of tables being updated and the critical times needed to replicate in the "T" records. On the primary sites these rows indicate if a Table is currently being replicated. It also indicates what kind of replication should be done - full replication, download only, or upload only.
On the primary site it holds a record for each site that replicates. The "S" record has the site number and name, the number of times replication started, the number of times it completed successfully, the date and time for the last replication (The Note: this column holds a list of all replications in the “S” record)
It holds a list of all table schema update that need to be applied in the "U" records
On the primary site it hold the next-available-site-number in the "M" record
It contains a table by table log of replication dates and times in the "L" records
It is used by the JUnit-tests to pass info between sites for the test ("M" record)
There can be optional "J" records used by JUnit test to pass information.
There are "E" records that contain error messages.

There are two main values from the previous replication kept in the "Table" row for each table in the JGReplicate table

The first major value is the date/time of the last replication. When a table is being replicated, one of the first steps is to ask the primary site for all rows at the primary site that have been inserted or changed since this date. This date/time will be exactly the same for each table even though the system processes tables one at a time. We basically want a snapshot at the time we are calling MARK. "Send me everything new up on tine the MARK time."
Note: To originally fill a site with data this date/time is set to a date/time before the system existed. This will cause all of the rows from the primary site to be sent.
Note: The JGuiGen replication system is not transaction safe. It is possible with this system if two sites are replicating at the same time to get part of one site's new data, but not all of it. For example if site 2 asks for all updates on a table and gets lots of rows to process, site 3 could then process the same table and another table before site 2 finishes. This would result in site 2 not having site 3's data for the first table and having site 3's data for the second table. The next replication takes care of this.
The second important value is the date/time of the most recently updated record in the local table as of the last replication. Any rows with a lastChangedDate date/time stamp higher than this date/time will have been inserted or changed locally since the last replication.

VIII. RULES-

Each site gets a site number - Primary site is usually 1, others are 2 through ?. Sites -99 is special.
Each row to be replicated has an Owner (This is the site number of the site that originated the record.)
Owner's data always wins if two sites are different
Who the Owner is can be changed from one site to another. If the primary site changes who owns a row this change is honored by all other sites. This allows the local administrators to "take control" of a record if a site goes down or a time critical change must be made and the other site is not available for any reason.
Note: The system must use "Removed rows flags" They cannot really DELETE a row. If a remote site DELETES a row, it won't exist when they try to tell the base site about the changes they made and the row won't get deleted at the primary site. There is a way to waive this restriction. It is described below and involves a JG_Killed_Rows table and a stored procedure fired off by a trigger when a row is deleted.
Any site can fill in blank text fields (real data wins over empty data - if the owner doesn't like the data they must put something like "missing" in the field and their version will overwrite the data they don't like - Owner wins.)
The ability of sites to fill in empty text fields in rows they do not own can be turned off in the JGReplicate table. The local GUI application is responsible for not allowing users to edit the field, but the replication system will not pass these changes to the primary site. It will also change the columns back to blank since in this case the Owner ALWAYS wins, even when the owner's field is blank.
Fields in GUI should be (at the most) "Display Only" if they have data and another site owns the row. Empty fields can be editable.
Primary ID fields and lastChangedDate fields are not replicated.
Both sites can have active users during their updates.
Replications are initiated by the remote sites - this is a pull technology as opposed to a push technology. (I suppose we could push as well, but I'm not going there now.)
The system described here takes care of time differences between the primary site and remote sites.
Sites with site number -99 will get downloads from the base site but they won't attempt to send data back. This is good for standby servers and building a new primary server after a disaster.

NOTE: Need a way to display a message to a local user to indicate that their site has gone too long without replication. This can be added to JGuiGen itself, but a small process is needed that checks every x minutes and pops-up a message or checks when a machine is booted up.

Note: Check into DCE Security requirements more

IX. WALK THROUGH - rows are inserted/changed

Row is added at the primary site.

lastChangedDate is set to current date/time.
JG_Owner is set to 0. If it is left as zero the next replication will fix it.
JG_Unique_id is to 0. The next site to replicate from the primary will update the value to site# * 10,000,000,000 + row_id*10 for the new row.
JG_SentBy is set to 0.
JG_DateWhenSent is set to null.

Row is changed on the primary site.

lastChangedDate is updated to current date/time.
JG_Owner is already set to the local site number unless no one has replicated since this row was created.
JG_Unique_id was set zero and they updated the first time row was created. It will still be zero if it has never been replicated.
JG_SentBy is left alone.
JG_DateWhenSent is left alone.

Row is added at a remote site

lastChangedDate is set to current date/time.
JG_Owner is set to 0. The replication system will update this with the site number.
JG_Unique_id is set to 0. The replication system will change this to the site number * 10,000,000,000 + the row_id*10 for this row.
JG_SentBy is set to 0.
JG_DateWhenSent is set to null.

Row is changed at a remote site.

lastChangedDate is set to current date/time.
JG_Owner has already been set (unless this row has never been replicated).
JG_Unique_id is zero if this row hasn't ever been sent to the primary site or has an id number that was inserted by the Replication system when the row was sent to the primary site.
JG_SentBy is left alone.
JG_DateWhenSent is left alone.

Note: As long as the row_id is automatically set and the lastChangedDate column is automatically updated, JGuiGen can replicate tables that were created for applications other than JGuiGen. See the section of JG_Killed_Rows.

Row is deleted anywhere - this is the same as a change since the "isRemoved" flag is a column in the table and it is changed. lastChangedDate field is updated to current date/time. See the section of JG_Killed_Rows.

A site attaches to the base database.

The system checks with the local JGReplicate table to get the local site number
The system checks in the JGuiGenIni.xml file (or LDAP system) for the email address of the system administrator.
The system checks in the JGuiGenIni.xml file for the addresses of the local database and the primary site that it will be connecting to. In more secure environments, it gets this information from encrypted files on the local system or a local LDAP system.
Note: JGuiGen can obtain the database user name and database password from encrypted files. If it does not find an account name it should use to connect to the databases in the JGuiGenIni.xml file, it looks for them in encrypted files. If it can't find them it creates an error record ("E" record) and emails the system administrator, puts up an error message and quits.
It connects to both databases.
It sends any error records ("E" records) that exist locally and have not been sent to the primary (Process = "Y" and Done = "N") Change Done to "Y".
It asks the primary site for it's site number, name, and database type. It stores this away. It will use this to update JG_Owner and JG_Unique_id number on the primary site if it finds that they are empty.
If it does not have a site number it requests one from the primary site and then creates a Site record (“S” record) on the primary site. It is presumed that the "new" site got its table schema from JAWS or something else current. It requests all of the "update schema" records from the primary system and stores them. It does not process them. (We might include a way to "check" if the change already exists.)
It updates its personal Site record on the primary site ("S" record), bumps the replication started count, Enter a date/time that replication started and bumps the count of how many times it started a replication and didn't finish.
It updates the M record at the local site. The updates indicate that a replication started. They will be updated when the replication finishes.
If remote management-by-exception is turned on, it checks the primary site for all sites that have error-1-email-sent = "N" and has "S" records that indicate they have not replicated during the last X days. It gets value of X from the primary site "M" record. If any offending sites are found, tt creates an error record, sends an email to system administrators and marks the error-1-email-sent column to "Y" so the email won't get sent again.
If remote management-by-exception is turned on, it checks the primary site for all sites that have error-2-email-sent = "N" and have "S" records that indicate they have more than Y consecutive replication failures. They get the value of Y from the primary server. After the error record is created and the email is, the error-2-email-sent column is set to "Y" in the S record.
Each replication site sends an email when they start to replicate and their own count of unsuccessful replications exceeds a number supplied by the primary site..
It finds the highest "update" record it has in the local JGReplicate file and asks the primary site for any updates that are higher than this. ("U" records)
It tries to apply these updates. If it fails it emails the administrator, creates an error record ("E" record), shows an error message and quits.
When it finishes, it updates the remote "S" record and the local "M" with the update number it just completed. It emails if it can't do the update.
It asks the primary site for a list of tables to be replicated and marks these tables in the local JGReplicate table. It adds new rows to the local JGReplicate table if the primary site sends a table name that is currently not in the local JGReplicate table. These rows are designated with a "T" in the flag column. These new rows are set with very old dates so that the first replication will populate the table with all existing records.
The next thing that happens when JGuiGen starts to process is that the schema of each table to be replicated is compared with the schema at the primary server. The schema for the primary site and the remote site must match. The columns do not need to be in exactly the same order, but each column on the primary site must exist on the remote site and vice versa. The column type and size must be the same for each column. (Note: There can be minor discrepancies here. For example some systems have a Boolean data type and other use an integer for the same purpose. JGuiGen tries to takes these differences into account.) If a schema does not match the system creates an error record ("E" record), sends an email to the system administrator, and displays a message describing the difference. The notes field in the JGReplicate field is updated with an error message and the replication is aborted.
Once the schema check passes, the local Done field is set to "Y" in the Update record ("U" record).
The system now checks if any of the Table ("T" records) rows in JGReplicate are marked as DONE ="Y". If so, the system did NOT complete the last replication (Done is set to "N" after all tables have been processed). The system then starts with the first table that was not done last time. This avoids doing several tables over and over and then always crashing on a particular table. If the system always started with the first table, some tables (those updated before the bad table) would contain newer and newer data and the data in the others would never change.

NOTE: Should we try to finish the previous replication first? Probably not since the later tables will now have “new data” and if there are dependencies we have ordered the tables so that the earlier tables should be done first. Even processing the one table first could cause a problem.

If the first "not done" table replicates, then either the table has been repaired or the last failure was due to a lost connection. All of the rows are marked as NOT DONE and the replication starts from the table listed in the first "T" record. (This avoids having the last tables being several days more current than the first tables.)
When the replication system starts working on a table it updates the count in the rep_count_started field for that table. ("T" records)
The system creates an error record ("E" record) and sends an email to the database administrators that the replication stared in an unclean environment. (Any of the tables are marked with a Done = "Y".) The Notes column for that table in the JGReplicate table would be updated with an error message.
It loops through the rows in the JGReplicate table with a flag of "T". For each table listed there it uses the following overall process:

A "Log" row is added to the JGReplicate table with a Flag of "L" and the table_name of the table being replicated, the date/time the replication was started and the site number. The row_id of this new row is saved so that it can be updated when the system finishes the table.
A log row is added to the JGReplicate table on the primary machine with the same information that was stored in the local database.
It processes the data in the row using procedures documented below.
After the data in a table is processed

The row for this table ("T" record locally) in the JGReplicate table is marked as DONE = "Y".
The current MARK date/time that was determined during processing is updated into the previous-local-high-date-time field. (See the Note on what I call the "Problem Window". This window starts when this site starts updating the primary server and ends when it finishes these update. The primary server has rows that this site already knows everything about. On the other hand other users or other replication sites may have inserted or updated rows during this window of time. The replication system must determine which rows that have a lastChangedDate in this window should be ignored and which should be sent.)
During the update process the replication software established a date/time on the primary server that I call the PRIMARY-MARK (this uses the Primary server's time zone) and we query for any record that was inserted or updated since the previous PRIMARY-MARK and less than or equal to the current PRIMARY-MARK. This is written to the previous-primary-high-date-time field. This is the value we will use to get rows from the primary site when we next replicate. (Note: See the "Problem Window" section.)
The rep_count_finished is bumped by one in the "Table" row of JGReplicate when a table is finished
The date_rep_ended in the "Log" record in the local JGReplicate table is updated.
The date_rep_ended in the "Log" record in the primary JGReplicate table is updated.
The Number_of_rows_inserted_by_primary is updated the Log record.
The Number_of_rows_inserted_by_target is updated the Log record.
The Number_of_rows_updated_by_primary is updated the Log record.
The Number_of_rows_updated_by_target is updated the Log record.

At the end of the process all table ("T" in the flag column) rows are marked as DONE ="N".

Processing an individual table uses the following steps.

1. For each table the system gets the previous-primary-high-date/time using the time zone on the primary machine and the previous-high-local-date/rime (using the local time zone) on the local machine.

2. The system is connected to the primary server. It establishes the current PRIMARY-MARK time by updating the "S" row in the JGReplicate table and then querying for the lastChangedDate column.

3. It then queries for all rows at the primary site for one table where the date/time stamp is higher than the previous-primary-high-date-time (previous PRIMARY-MARK) and less than or equal to the current PRIMARY-MARK.

4. The current PRIMARY-MARK is the current-primary-high-date/time and it is updated in the local JGReplicate table. It will become the previous-primary-high-date/time when the replication is finished. The next time the site asks for data on this table it will want everything newer than this date/time and less than or equal to a new PRIMARY-MARK time it will establish then. Note: This time is from the primary site and may be several time zones away from the local site time.

5. The system updates a local record and gets the lastChangedDate for that update. This becomes the current MARK time. This time is used to update the current-local-high-date-time in the local JGReplicate table.

6. Once the system has the date/times and a record set of updated rows from the primary server it loops through the rows. If it finds an empty JG_Owner or JG_Unique_id column it uses the site number from the primary site, calculates what these should be and updates the primary site. Next it checks the local database for a row with the JG_Unique_id equal to the JG_Unique_id value in the row sent from the primary site.

1. If the row does NOT exist locally it is a new row.

add the row.
set the lastChangedDate and the JG_DateWhenSent to the local date/time. Set the JG_SentBy column to the site number for the primary server.
Later when we ask for changed local records ignore record that came from the primary server and the two dates are still the same.

2. If the row exists and someone else owns the row then it is a changed row (JG_Unique_id exists on local machine).

Update the site row with all the new values unless local site is not empty and the base's version is empty. (If the primary site indicated that secondary sites were not allowed to fill empty columns then all of the owner's values are used.)
If the local site has non-empty values and they are allowed to overwrite blank value- set the lastChangedDate to the current date/time but do not set the JG_DateWhenSent to the same time. If these conditions were not met set the lastChangedDate to the current date/time and set the JG_DateWhenSent column to the same time. In either case set the JG_SentBy column to the primary Site's site number. Note: The reason for having different dates in the lastChangedDate field and the JG_DateWhenSent fields is that shortly we will update the primary server with all local rows that have dates between the PREVIOUS-MARK and the CURRENT-MARK. The only rows we don't send are those that came from the primary site and the two dates are the same. Since we want the row to be picked up by this query we make sure the two dates are different.

3. If the row exists and the local site owns the row, then someone else changed the row.

Update local data where local site column is empty and theirs is not (unless this has been disallowed).
Change lastChangedDate and JG_DateWhenSent to current date/time. Change the JG_SentBy column to the primary Site's site number.
If there is data in the downloaded row that conflicts with data in the local row (it was blank and someone edited, but someone locally also edited it with different info), set the lastChangedDate to the current date/time and the JG_DateWhenSent to some other time. See the note in the previous paragraph to see why we do this and why it results in our row being sent to the primary server with our data in it.

At this point we have all of the changes the Primary Site knows about.

7. The local site selects all local rows that have a JG_Unique_id = 0 (New Rows) OR where the lastChangedDate is greater than the previous-high-local-date/time and less than or equal to the MARK value determined earlier. (Again, note that there will be local rows that were just added or changed from the primary server that have date/time stamps that are included in this query. We will get rid of them by comparing where they came from and if the lastChangedDate equals the JG_DateWhenSent.)

8. The local site sends inserts of all of the rows with a JG_Unique_id = 0 (New Rows). It fills in the JG_Unique_id, the JG_Owner, the lastChangedDate, the JG_DateWhenSent and the JG_SentBy columns.

9. The local site sends updates for all of the rows with a lastChangedDate greater the previous-local-high-date-time and the MARK time. It fills in the JG_Unique_id, the JG_Owner, the lastChangedDate, the JG_DateWhenSent and the JG_SentBy columns.

Loop through rows (Be sure to include rows where IsRemoved Flag is set with the value that shows the row should be treated as if it was deleted.) Query the JG_Killed_Rows tables for deleted rows in the right time frame for this table if this variation is being used.

0. If the JG_Unique_id is zero - this is a NEW row

Update the JG_Unique_id to the local site# * 10,000,000,000 + row_id*10 in the local table.
If the JG_Owner field is empty, update it to the local site number.
Insert the row on main site - the lastChangedDate on the primary machine is set using the JDBC function {fn CURDATE()}. This will set the lastChangedDate to the current date/time on the primary database machine. Use the same function to update the JG_DateWhenSent column. Fill JG_SentBy and JG_Owner with the local site's site number.
The lastChangedDate locally is left with its current value.

1. The other rows are rows the local site changed.

1. *** If local site owns the row,

The row should already contain any changes made by some other site.
Send an update of all non-empty columns and set the lastChangedDate and the JG_DateWhenSent on the primary server using the JDBC CURDATE() function. This will set the date/time stamp to the current date/time on the primary server. If the primary indicates the no one but the owner can enter data then send the empty text fields as well. Update the JG_SentBy column to this site's site number.
Leave the lastChangedDate on local machine alone.

2. *** If local site does not own the row.
Request a copy from the primary - We need this row to be sure about which fields are empty on the primary server so that we can update them.

Update local row with their data if the two diff. (Except where ours is not blank and the corresponding column from the primary site is empty)
Build an update where ours is not empty and theirs is
Update the primary site and set the lastChangedDate and JG_DateWhenSent to the current date/time using the CURDATE() function on the primary site. Update the JG_SentBy column to the local site's site number.
Leave the lastChangedDate column alone at the local site unless values were changed and set it to the current local date/time.

10. The local machine uses the same MARK time for each of the tables being processed and saves that as the previous-high-local-date/time. This will the highest change we just applied from the primary server. During the next replication anything higher than this will have been changed locally since the replication that was just finished.

The local system saves the current-primary-high-date/time as the previous-primary-high-date/time. The next time the site replicates and asks the primary site for new/changed records it wants records that are higher than this value. NOTE: See the section called "Problem Window."

The DONE field is marked and the rep-count-finished is bumped by one. The log rows are updated in the local JGReplicate table and the JGReplicate table at the primary site.

The system loops through another table.
When all are done - mark all of the DONE columns as "N" (not done). The local system is "clean" and ready for another update. The steps used to complete a table were listed above.
Updates its personal Site record on the primary site ("S" record), bumps the replication finished count, Enter a date/time that replication ended and sets the count of how many times it started a replication and didn't finish to zero. Appends a line to the Notes text field that shows the date/time of this replication. The error-1-email-sent through error-4-email-sent columns are set to "N". The site has successfully completed and these columns won't be used until it is either late replicating or tries and fails several times. Then the columns need to be "N" so the management-by-exception routine will create error rows and send emails.

To change the owner of a row.

Local sites can change the owner from their site to another site. This works.
Local sites cannot change the owner if their site is not the owner
primary site can change the owner and that always wins.

XX. AVOIDING ISREMOVED FIELD REQUIREMENT

There are notes on How JGuiGen Replication handles situations where existing application actually delete rows from a table that is part of the replication system.

A new table must be added to the local database. The table is called JG_Killed_Rows and it has the following structure:

kill_id - int - primary key for the table
table_name - char(35) - name of the table where a row was deleted.
row_id - int- the id of the row that was deleted.
Site_id – int – the site number that originated the deletion.
User_name – char(30) – (if this is being used by other applications) – user who deleted the row.
JG_Unique_ID - the replication id of the row that was deleted
date_time_killed - timestamp - when the row was deleted
lastChangeddate - timestamp - date/time this row was added to JG_Killed_Rows

A stored procedure must be installed that is activated by a trigger when the row is actually deleted. This stored procedure must insert a new row into the JG_Killed_Rows table.
When a replication occurs, the JGuiGen replication software will query this table for rows with a date_time_killed that match the time period being replicated. If it finds a row, it will attempt to delete the row with that JG_Unique_id on the primary system.
The primary system will need the same type of stored procedure and a JG_Killed_Rows table if it is not the master primary site. It will need to propagate the delete up the chain and down to other replicating sites.

XXI. MOVING A SITE TO A NEW PRIMARY SERVER

Here are the steps needed to do this.

Change JGuiGenIni.xml file or LDAP references to point to the new server.
Install new SSL certificates.
Change database username and database passwords (In encrypted files or LDAP)
Adjust the Table ("T" records) in JGReplicate. The previous-primary-high-date-time may need to be adjusted to reflect a new time zone.

SPECIAL NOTE ABOUT SECONDARY PRIMARY SERVERS

I would think that it would be prudent to stop users from replicating to a secondary server while it updates from its own primary server. It shouldn't really be a problem, but I feel better if source for other site's data runs in a fairly pure envrionment while it gets its down data.

These servers will need to set the Process field in the Master record ("M" record to "N") and wait for replicating sites to finish before they start to replicate. They cannot have active users or remote sites replicating when they themselves connect to a remote server and replicate from it.

XXII. SPECIAL NOTES ABOUT THE JGREPLICATE TABLE

This list.needs to be fleshed out from notes that are scattered through this document.

Notes on getting a new site number

If a site needs a site number

It sends an update to the primary to change the Next-site-number-lock field to "Y" in the master site record ("M" record) and the next-site-number-lock field = "N".
If the update succeeds it queries for the next-site-number.
If the primary site is the master primary site it sends an update to bump the Next-site-number-1 and set the next-site-number-lock to "N"
If the primary site is a secondary primary it checks it's five possible site number for one that is not zero. It makes the same update except it sets the Next-site-number it used to zero. (The secondary site will get replacement numbers when it replicates from the master primary server.)
If the original update fails, it waits 10 seconds and tries again. After three tries it aborts, creates an Error record ("E" record)c and sends an email to the administrators.
If the next-site-number fields are all zero, it aborts, creates an error record ("E" record), and sends an email to the administrators

Notes about the Master Record ("M" records) in JGReplicate

There is one master record at each site. This contains overall information about the current site.

Contains the site English Name
Contains the site's official site number
A variable indicates if the site is a master primary, secondary master or a target site
Replications started
Replications finished
Send_email_on_completion
Process_mgmt_by_exception remotely, locally or not at all
email_days - each replicating site sends an email about other sites who have a "Process = Y" and haven't replicated in this many days.
email-bad_rep_days - each replicating site sends an email about other sites who have uncompleted replications that are more than this number of days old.
email-bad-rep-count each replication site sends an email when they start to replicate and their count of unsuccessful replications exceeds this.
Next-available-site-number 1, 2, 3 4, 5
Next-available-site-lock
....

Notes in Site Records ("S" records) in JGReplicate

Note: There will be no Site records at target/remote sites. These only exist on primary sites and contain information about the sites that are replicating from this primary site.

When a site attaches to a primary site, they update a field with the date/time they stared and bump a column that counts how many times they started.
They also bump a field that counts how many times they started but didn't finish
When a site finishes a table schema update, they update a column that contains the highest update they have finished.
When the site finishes it updates a field with the date/time finished. This also bumps a column that counts how many times they finished.
When they finish they set the counter for how many times they tried and did not succeed back to zero.
The site record contains a site number, an English name, and type of database for the site.
There is a field that contains the date/time the site record was created.
Update_number - the highest update this site has completed.

Notes about the Table Record ("T" records) in JGReplicate

Contains the name of a Table to be replicated
Indicates the type of replication (Full, download to target only, or upload from target only.)
A flag to indicate if processing of this table is currently activated
Count to times a replication of this table started
Count of times a replication of this table ended
....

Notes about the Update Records ("U" records) in JGReplicate

Contains a unique update number that gets sequentially larger
Contains a short description of the update.
Contains the specification for the update
Contains the date/time the update became available
at the local site there are fields that indicate an update should be processed and one that indicates that the update has been processed.
...

Notes about the Error Records ("E" records) in JGReplicate

Contain the site number that reported the error
Contains the site where the error occurred
Contains the text of the error message.
Contains the date/time the error record was inserted
Contains the date/time the error occurred if possible
Indicates if an email was sent
At target sites there is a flag that indicates the record should be sent to the primary during the next replication and one that indicates it was done.
....

Notes about the Log Records ("L" records) in JGReplicate

Contain the site number that created Log
Contains the table name being logged
Contains the start date/time
Contains the ending date/time
Contains a Number_of_rows_inserted_by_primary
Contains a Number_of_rows_inserted_by_target
Contains a Number_of_rows_updated_by_primary
Contains a Number_of_rows_updated_by_target
Indicates if there were any issues
....

Notes about the JUnit Records ("J" records) in JGReplicate

Contain the site number that created record
Contains the site number that should use the record
Indicates if the record should be processed.
Indicates if the record has been processed.
Contains the start date/time
Contains information that a JUnit test uses to see if updates that should have been replicated actually have been.
....

XXIV. REPORTS

I was going to list the reports we will be creating in the JGuiGen application's report generator, but I realized that there will basically be one report for each of the sections just listed. The fields will be those listed as relevant for each type of record (M, S, L, E, etc.)

XXIV. GENERAL NOTES

Email to System administrators tell about errors. Note there should always be error records created locally and on the primary where an email is sent.

when a replication starts after a previous replication failed to finish
A difference in the schema was found.
When sites haven't replicated
Lost connections
Too long without a replication
Too many tries that did not complete
When sites are having problems finishing their replications
Attempt to update a table schema failed
An update failed (due to dependency issue or other wise)
Message - "Tried to find the real column names in table X for the functions "lastChangedDate, IsRemoved, and Row_ID" Could not find.
Programmatic error when trying to process the method triggered by an update.
Other errors - to be determined

Any site can take over as the primary site. Mark the "M" record as "Y" in the "primary_site" field. You will need to update the JGuiGenIni.xml files to show where the connection is and update the SSL certificates. and then have remote PC's attach to the new primary site. See the longer description about this process available elsewhere.
Any site can redo scrambled data by setting the previous-primary-high-date/time values in the JGReplicate table back
If a site crashes and restores from a backup that is several days/weeks old, the previous-primary-high-date/time will be from before their crash and they will get all of their own records back as well as new records from other sites. (Note: There is a field in the site record that must be set to allow them to do this. The field allows a one time replication where the date at the primary site does not match the date at the target site.)
If there are dependencies and triggers in the databases (payment records that require the invoice being paid exists). then the order of the tables in the JGReplicate table should be set very carefully. We are not doing transaction level processing so it is possible for a remote site that lost its link during replication to have part of another site's data but not all of it. An email is sent to the system administrator if a failure occurs due to this issue.
The owner of a row can be passed around. A salesperson might enter a proposed bid and transfer ownership to a higher level authority. This person could then make changes that the salesperson could not override. Until the salesperson finishes with the bid the higher authority could see the proposed bid and fill in empty spaces, but they couldn't change what the salesman had entered until the salesperson was finished with it and transferred ownership.
Special note on JGuiGen-Specific tables. The three tables marked as "JGuiGen-specific" are treated a little differently. Since we plan to let sites use our Sourceforge table to update their primary sites table schema these tables are special. Local sites can add columns and these will be ignored. The remote sites MUST have all of the columns the primary site has. Table replication will ignore data sent for columns that do not exist on the primary server.

XXV. DISCUSSION OF THE "PROBLEM WINDOW" DURING REPLCICATION

We are letting users access the database being replicated while the replication process is going on. This is done to minimize the impact on users. (Think of road warriors who can't use their laptop while the replication is going on and it is 3:00 AM. Replication won't happen and we all know it.) This causes some problems. The most glaring is that changes made by the users can have approximately the same date/time stamps and records being updated by the replication service. I call the period between the time that the replication service started actually changing rows and when it stops changes rows in a replicated table the "problem window." Here is how JGuiGen Replication handles this.

Every row that is changed has its lastChangedDate set to the local date/time. This is done automatically or by the JGuiGen Replication software.
When the JGuiGen Replication software changes a row with data from a remote site (a primary or a target site) it changes the lastChangedDate. It set another column called JG_DateWhenSent to the same date/time. It also sets a column called JG_SentBy to the remote site number.
All rows updated by the replication system will have two date fields that are equal. Rows updated by a local user will not change the JG_DateWhenSent column so the two will be different. If there is a value in the JG_DateWhenSent column it will now be older than the lastChangedDate column.
This allows us to distinguish between records changed by the replication process and those changed by other users.
Assume we get 10 rows from a remote server. They will all have current date/time stamps and therefore are much more recent than the replication that completed yesterday.
Also assume that a user happened to jump on one of these new records and changed it. The lastChangedDate will be changed, but the JG_DateWhenSent will not be changed.
The replication system updates a row in the local JGReplicate table and then queries to determine the date/time stamp of that change. It calls this the MARK time and starts to send inserts and changes for all records with lastChangedDates between the time of the replication yesterday and the MARK time. This MARK time is the same for all tables even though it takes time to work through the various tables. The system specifically excludes records where the lastChangedDate is equal to the JG_DateWhenSent and the JG_SentBy is the remote site it is talking to. All of the local changes (including the one record just changed) will be sent and the 9 other records we just got will not be sent.
If our industrious local user changes another of the 9 records, but it is after the MARK time this change will not be replicated until next time, even if rows are still being sent to the remote server.

There are two major concerns with letting users use the system during replication.

The first involves situations where a local transaction takes place that involves several tables and it starts before the MARK time and ends after it. There could be very small discrepancies once the SQL system executes a commit. Rows in one table could have a slightly different time stamp from the rows in another table. If the MARK time occurred at just the right moment the replication process would send some of the changes and not the rest. If this is a problem on your system, then users (or other processes) will have to be suspended while the replication is going on.
The second problem occurs when a column is being constantly updated by some automatic process. By the time the replication process tries to update a row the date on the row being updated all the time is already higher than MARK. When we do it again and establish a new MARK, the offending row is updated to a date/time higher again. This row conceivably would never be replicated. Your site needs to determine if any of these situations exist and determine how you wnat to handle it.
There is one more concern that needs to be mentioned here. Your system might have conflicting dependencies that JGuiGen Replication cannot handle. Here is the situation. One transaction inserts data into Table 1 and then data into Table 2. The data going into Table 2 is dependent upon the data in Table 2 and dependency checks will cause the transaction to fail if the data is not correct. Normally you would just insure that Table 1 is earlier in the update cycle than Table 2. The problem arises if there is another type of transaction that does something to Table 2 first and then makes changes to Table 1 that are dependent on the changes to Table 2. We have no way to resolve this.
Note the procedures described in this document establish a local MARK time and then use this time for all tables so that all table are queries for updates for example the same time periods. While this can cause problems with transactions that fall over a boundary, it reduces problems cause by getting updates from one table that are 10 or 15 minutes later than those in another table (Assuming a very slow link and update process.
There is a similar PRIMARY-MARK time established for the remote server.
If you are wondering where the "MARK" came from, I think of it as looking at my watch and saying the time boundary starts when I say mark..... MARK.

XXVI. PLANNED JUNIT TESTS

Setup

Primary site - add a record
Primary site - changes a record

Setup

Site 2 - add a record
Site 2 - changes a record it owns
Site 2 - changes a record it does not own
Site 2 = Create an error record with Process = "Y" and Done = "N"
Site 2 Replicate

Tests

Check that added row has correct JG_Owner and JG_Unique_id fields
Check that the changed row has same JG_Owner and JG_Unique_id it had before
Check JGReplicate for correct values in data_dict table row
Check JGReplicate for correct Log entry
Check JGReplicate on primary site for correct Log Entry
Make sure the dates used on the primary are its date/time not site 2s.
Check primary for added record. - Check lastChangedDate field,
Check that the row added on the primary site now has a JG_Owner, JG_Unique_id, JG_SentBy and JG_DateWhenSent.
Check the the lastChangedBy and JG_DateWhenSent are equal.
Check that the row changed on the primary site now has a JG_Owner, JG_Unique_id, JG_SentBy and JG_DateWhenSent.
Check the the lastChangedBy and JG_DateWhenSent are equal.
Check that site 2 has the row added by primary site and that the values are the same
Check that site 2 has the row changed by primary site and that the values are the same
Check that the Error record was added to the primary's JGReplicate table. JG_Unique_id and JG_Owner.

Setup

Site 3 - adds a record
Site 3 - changes a record they own
Site 3 - changes a record they don't own
Site 3 - replicates

Tests

Check that added row has correct JG_Owner and JG_Unique_id fields
Check that the changed row has same JG_Owner and JG_Unique_id it had before
Check that the row added by site 2 has arrived and has the same JG_Unique_id and JG_Owner it had on Site 2.
Check that lastChangedDate and JG_DateWhenSent are equal and different than when they were sent.
Check that the row changed by site 2 has arrived and has the same JG_Unique_id and JG_Owner it had on Site 2
Check that lastChangedDate and JG_DateWhenSent are equal and different than when they were sent.
Check JGReplicate for correct values in data_dict table row
Check JGReplicate for correct Log entry
Check JGReplicate on primary site for correct Log Entry
Check that site 3 has the row added by primary site and the values are the same
Check that site 3 has the row changed by primary site and the values are the same
Setup
Site 2 replicates

Tests

See that the error record created at the beginning has Done = "Y" now
see if Site 2 got the records it sent again and if it ignored them.
Check that nothing sent by the primary during the last replication is being sent back.

XXVII. CODE STEPS [square brackets represent specific info on updated HSQL to MySQL]

Create JGReplicate Table
Create a JGuiGen Application to edit it
Get email address for administrator and site name from JGuiGenIni.xml (If site number is empty - use machine name or IP address)
Get connection information from JGuiGenIni.xml
Connect to local database (E record and msg about failure) [MySQL site set to -99]
Connect to remote database (E record and msg about failure) [HSQL site set to -100]
Get site number from local. Create an "M" record and populate it if necessary
If there was no local site number - get it from the primary. Get all update schema rows ("U" rows) from the remote and add them to the local JGReplicate table. Do not process them. A new copy of the program should match the current tables schema at the primary site.
Start a log record locally - save row_id
Start a log record on the primary - save the row_id
Update the "S" record on the primary - bump count started
Get is primary, site number, site name etc. from primary [HSQL is set ]
Msg and quit if error
Find the local highest update number
Request any higher at the primary site
Show a message about what the schema change should be if any exist and quit. We will do changes manually for now.
Mark all local "T" records with "N" in the process column
Ask the primary which tables ("T" records) should be marked with a Process="Y"
Mark them. Add any new "T" records and set timestamps for 1/1/1980
Check local "T" rows for DONE = "Y" - Show MSG - try first unmarked table If it succeeds, mark all "T" records as NOT DONE and start from the beginning.
Get real names for the lastChangedDate, isRemoved and row_id columns from data_dict Msg and quit if missing.
Loop through tables [Data_dict, JGuiGen_I18N, userlogdata]

Get previous-primary-high-date-time from "T" record
get MARK time by updating a local record and getting the lastChandedDate
set the current-date-time-at-start in the "T" record
update the "S" record for this site in the primary Server's JGReplicate table and determine the lastChangedDate for this update. This become the PRIMARY-MARK time.
query primary for all records with a date higher than the previous-primary-high-date/time and less than or equal to the PRIMARY-MARK time.
Set the current-primary-high-date-time in the "T" record to the PRIMARY-MARK time
Process rows
Update "T" logs
Loop

BIG DEFER - processing the local records and sending them back
update "S" record and quit

Steps deferred

Email about Connect to local database failure
Email about Connect to remote database failure
Email about error in Get is primary, site number, site name etc. from primary
Management by exception late replication reporting
1. Get all active sites that have replicated
2. get all done in last x days
3. Send email about those in list 1 but not in list 2
Management by exception problems with replication reporting
1. Get all who have not completed in the last Y days
2. Send email about those in the list
Processing schema changes
Send email about starting with "T" records that have DONE = "Y"
Send email that getting real variable names from data_dict failed.