Functional Specification for Derby Replication

Revision	Description	Date	Author
10.0	Important changes: Added 'slave crashes' failure scenario. slaveHost and slavePort options are now camelcase for uniformity. Default replication port changed to 4851 (port registered at IANA). System privilege for replication not included in 10.4.	March 10, 2008	Jørgen Løland
9.0	Important changes: In 10.4, databases must be file-system copied to slave location before starting replication. Failover performed on master shuts down the master database and reports an exception (like shutdown database does). Start slave does not return a valid connection but an exception. Shutdown of replicated databases is refused (exception: system shutdown).	February 18, 2008	Jørgen Løland
8.0	Important changes: Clearly specified how security is enforced. Behavior change: Shutdown database is now specified as allowed when replication is active. Before stopping the replication behavior, the other replication peer will be notified about the decision.	December 21, 2007	Jørgen Løland
7.0	Incorporate feedback from Dan. Important changes: Made the replication commands options to the connection url. State more clearly what happens when the commands are run.	November 12, 2007	Jørgen Løland
6.0	Added section on failure handling and a new command - "stopslave".	November 1, 2007	Jørgen Løland
5.0	Incorporate feedback from Ole. Important changes: Changes to "Interacting with..." table. Now with clearer pre-requirements	September 4, 2007	Jørgen Løland
4.0	Incorporate feedback from Dag and Oystein. Important changes include: Modified the new commands added to NetworkServerControl to match the existing pattern. Added pre-conditions to each of the NetworkServerControl commands.	August 18, 2007	V. Narayanan
3.0	Incorporate feedback from Rick Hillegas and Narayanan. Important changes include: Authentication should follow the same strategy as DERBY-2109. Handshake is required so that only one master will connect to a slave. How to stop replication, and where the NetworkServerControl commands must be issued from.	July 10, 2007	Jørgen Løland
2.0	Incorporate feedback from Rick Hillegas: How users turn on and off replication. Who has permission to turn replication on/off. What happens when a master or slave goes down.	July 6, 2007	Jørgen Løland
1.0	Initial version	June 26, 2007	Jørgen Løland

Overview of Characteristics
Terminology
Starting and running Replication
Stopping Replication
Forcing a Failover
Handling Failure Scenarios
Interacting with the replication feature
Authorization and Authentication
Additional Security
Testing
Documentation

Overview of Characteristics

This document describes a replication feature for Derby. The replication feature will have the following characteristics:

One master, one slave - A replicated database will reside in two locations and be managed by two different Derby instances. One of these Derby instances will have the master role for this database, and the other will have the slave role [1]. Typically, the master and slave will run on different nodes, but this is not a requirement. Together, the master and its associated slave represent a replication pair.
Roll-Forward Shipped Log - Replication will be based on shipping the Derby transaction log from the master to the slave, and then rolling forward the operations described in the log on the slave database.
Asymmetrical - Only the master will process transactions. Not even read operations will be processed by the slave.
Asynchronous - Transactions are committed on the master without waiting for the slave. The shipping of log to the slave is performed regularly, and is completely decoupled from the transaction execution at the master. This may lead to a few lost transactions if the master crashes before the most recent log has been shipped.
Shared nothing - Apart from the network line, no hardware is assumed to be shared.
Replication Granularity - The granularity for replication will be exactly one database. However, one Derby instance may have different roles for different databases. Hence, one Derby instance may, e.g., have the master role for one database D1 replicated to one node, the slave role for a database D2 replicated from another node, and the normal, non-replicated, role for a database D3 at the same time.

Replication will build on Derby's capability to recover from a crash by starting with a backup and rolling forward Derby's transaction log files. The master will send log records to the slave using a network connection. The slave will then write these log records to its local log and redo them.

If the master fails, the slave will complete the recovery by redoing the log that has not already been processed. The state of the slave after this recovery will be close to the state the master had when it crashed. However, some of the last transactions performed on the master may not have been sent to the slave and may therefore not be reflected. When the slave has completed the recovery work, it is transformed into a normal Derby instance which is ready to process transactions.

Terminology

Master of database D (short: master) - denotes the Derby instance that has the master role for the replicated database D. Note that a single Derby instance may have different roles for different databases.
Slave of database D (short: slave) - denotes the Derby instance that has the slave role for the replicated database D.

Starting and running replication

Each replicated database will be replicated from a master to a slave version of that database. Initially there is no replication; a master database must be created before it can be replicated. The database may, of course, be empty when replication starts. On the other hand, replication does not need to be specified when the database is created; it can be initiated at any time after the database is created.

How to start replication - First implementation (10.4)

Before replication is started, the master database must be booted. The database is then copied to the slave location (using file system copy), before starting replication by issuing the start slave and start master commands. To ensure that the master database is not modified between the start of the file-system copy and the point where replication has started, the master database should be frozen. The following sequence of commands should always be used to start replication:

Boot database 'x' on master.
When ready to start replication: freeze the master database. This ensures that all data pages and log are written to disk, and blocks modifications to 'x' until replication has successfully started.
'CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE();'
Copy database 'x' to slave location.
Start slave replication mode on Derby instance acting as slave for 'x'.
Start master replication mode on Derby instance acting as master for 'x'. A successful start master command will also unfreeze the database.
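
The master-side part of the sequence above (freezing the database and copying it) might be sketched in Java as follows. This is only an illustration under stated assumptions: the class name StartReplicationSketch and its method names are hypothetical, the slave location is assumed to be reachable as a file-system path (for example a network mount), and the target directory is assumed not to contain the database yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.stream.Stream;

// Hypothetical helper; not part of Derby.
final class StartReplicationSketch {

    // Freeze the master database: flushes data pages and log to disk
    // and blocks modifications until the database is unfrozen.
    static void freeze(Connection masterConn) throws SQLException {
        try (Statement s = masterConn.createStatement()) {
            s.execute("CALL SYSCS_UTIL.SYSCS_FREEZE_DATABASE()");
        }
    }

    // Plain file-system copy of the database directory to the slave location.
    // Files.walk visits a directory before its contents, so parent
    // directories are created before the files inside them are copied.
    static void copyDatabase(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(p -> {
                try {
                    Path dest = target.resolve(source.relativize(p));
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(p, dest); // fails if dest already exists
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

The remaining steps (start slave, then start master) are issued through the connection url on the respective Derby instances. Because a successful start master unfreezes the database, the sketch deliberately contains no unfreeze call.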

How to start replication - Future enhancement (10.5)

When replication is started, the slave checks that the database does not already exist, and that the user is authorized to create a database. The slave then starts to listen for a master on the specified host and port. The master connects to the slave and performs a handshake to ensure that only one master connects to a slave. The master then sends the whole database, including the active log files, to the slave via this connection. Hence, the database is only booted, not created, on the slave.

After replication has started

Regardless of the method used to start replication (the 10.4 or 10.5 way), the slave is now ready to receive logged operations from the master. The master can now continue to process transactions. From this point on, the master will forward all logged operations to the slave in chunks. The slave will repeat these operations by redoing the Derby transaction log, but will not process any other operations. Connection attempts to the slave database will be refused. With this design, the slave will be able to recover to the state the master was in at the time the last chunk of log was sent.

While replication is running, neither the slave nor the master database may be shut down without first stopping replication. An exception to this is when the entire system is shut down. In this case, the peer that is shut down notifies the other replication peer that replication is stopped.

Stopping replication

Replication of a database can be stopped by issuing the stop command to the Derby instance having the master role. The master will send the remaining log records that await shipment, and then send a stop replication command to the slave. The slave will then write all log to disk and shut down the database.

For now, resuming replication after it has first been stopped is not considered. If the database is to be replicated again, replication has to be restarted as described in Section Starting and running replication. Resuming replication is a good candidate for extending the functionality at a later stage.

Forcing a Failover

The Derby instance having the slave role may be transformed into a normal Derby instance that can process transactions at any point in time. This transformation from being a slave to becoming an active Derby database is called failover. During failover, the slave redoes the part of the log (the tail) that has not yet been processed. Operations belonging to uncommitted transactions are then undone, resulting in a transaction-consistent state that includes all transactions whose commit log record has been sent to the slave.

Failover is performed on explicit request by running the failover command. For now, there will be no automatic failover or restart of replication after one of the instances has failed. If automatic failover is required later, this should be added as a separate Jira issue.
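
The state reached after failover can be illustrated with a small Java sketch. It is not Derby's recovery code: the LogRecord type, the two record kinds and the class name FailoverSketch are hypothetical simplifications, used only to show which transactions survive (those whose commit record reached the slave) and which are undone.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the log received by the slave; not Derby's log format.
final class FailoverSketch {

    enum Kind { UPDATE, COMMIT }

    static final class LogRecord {
        final int txId;
        final Kind kind;

        LogRecord(int txId, Kind kind) {
            this.txId = txId;
            this.kind = kind;
        }
    }

    // Transactions whose commit record was received: reflected after failover.
    static Set<Integer> committed(List<LogRecord> received) {
        Set<Integer> result = new LinkedHashSet<>();
        for (LogRecord r : received) {
            if (r.kind == Kind.COMMIT) {
                result.add(r.txId);
            }
        }
        return result;
    }

    // Transactions with received operations but no commit record: undone.
    static Set<Integer> undone(List<LogRecord> received) {
        Set<Integer> done = committed(received);
        Set<Integer> result = new LinkedHashSet<>();
        for (LogRecord r : received) {
            if (!done.contains(r.txId)) {
                result.add(r.txId);
            }
        }
        return result;
    }
}
```

For a received log containing updates for transactions 1, 2 and 3 but a commit record only for transaction 1, the sketch reports transaction 1 as committed and transactions 2 and 3 as undone, even if 2 or 3 had already committed on the master.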

Handling Failure Scenarios

Replication can encounter numerous failure scenarios. This section aims to identify the different failure scenarios and define the actions that will be taken.

Failure Scenario	Action Taken
Master loses connection with slave	Transactions are allowed to continue processing while the master tries to reconnect with the slave. Log records generated while the connection is down are buffered in main memory. If the log buffer reaches its size limit before the connection can be reestablished, the master replication functionality is stopped. The size limit of the buffer will be configurable.
Slave loses connection with master	The slave tries to reestablish the connection with the master by listening on the specified host and port. It will not give up until it is explicitly requested to do so by either the failover or stopslave command. If a failover is requested, the slave will apply all received log records and boot the database as described in Section Forcing a Failover. If a stopslave is requested, the slave database will be shut down without further actions.
Two different masters of database D try to replicate to the same slave	The slave will only accept the connection from the first master attempting to connect. Note that authentication is required to start both the slave and the master, as described in Section Authorization and Authentication.
The master and slave Derby instances are not at the same Derby version	An exception is raised and replication will not start.
The master Derby instance crashes. It restarts and wants to continue replication from where it stopped	Replication will have to be restarted, as described in Section Starting and running Replication. Later improvement patches may allow replication to continue without resending the whole database to the slave.
The master Derby instance is not able to send log to the slave at the same pace as log is generated. The main memory log buffer gradually fills up and eventually becomes full.	The master will notice that the main memory log buffer is filling up. It will first try to increase the speed of the log shipment to keep the amount of log in the buffer at bay. If that is not enough to keep the buffer from getting full, the response time of transactions may increase for as long as log shipment has trouble keeping up with the amount of generated log records.
The slave Derby instance crashes.	The master will see this as a lost connection to the slave. The master will try to reestablish the connection until the replication log buffer is full. Replication will then be stopped on the master. Replication will have to be restarted, as described in Section Starting and running Replication. Later improvement patches may allow replication to continue without resending the whole database to the slave.
An unexpected failure is encountered	Replication is stopped. The other Derby instance of the replication pair is notified of the decision if the network connection is still alive.
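
The lost-connection behavior on the master can be sketched as a bounded in-memory buffer. This is a minimal model, not Derby's implementation: the class MasterLogBufferSketch, its byte-based accounting and the rule "stop when the next record would not fit" are assumptions made for illustration.

```java
// Hypothetical model of the master's main-memory log buffer.
final class MasterLogBufferSketch {
    private final long limitBytes;      // configurable size limit
    private long bufferedBytes = 0;
    private boolean replicationStopped = false;

    MasterLogBufferSketch(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Called for each log record generated while the slave is unreachable.
    // Transactions are never blocked here; only replication may be given up.
    void append(int recordBytes) {
        if (replicationStopped) {
            return; // replication already stopped, nothing is buffered
        }
        if (bufferedBytes + recordBytes > limitBytes) {
            // Assumption: the limit counts as reached when the record does not fit.
            replicationStopped = true;
            bufferedBytes = 0;
            return;
        }
        bufferedBytes += recordBytes;
    }

    // Called when the connection is reestablished: ship everything buffered.
    long drain() {
        long shipped = bufferedBytes;
        bufferedBytes = 0;
        return shipped;
    }

    boolean isReplicationStopped() {
        return replicationStopped;
    }

    long bufferedBytes() {
        return bufferedBytes;
    }
}
```

With a limit of 100 bytes, records of 60 and 30 bytes are buffered; a further 20-byte record arriving before reconnection stops replication. If the connection returns after the first record, drain() ships the 60 buffered bytes and buffering starts again from zero.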

Interacting with the replication feature

There will be two ways to interact with the replication functionality: through the connection url or through a command line interface. When running in embedded mode, the connection url method must be used. In client/server mode, both alternatives are applicable. Note that only the connection url method will be available in Derby 10.4, whereas the command line interface will be added later.

The following commands are recognized (see below for the syntax of the two alternatives):

Command	Options	Effects	Preconditions
Start master	database name (Required) slaveHost (Required) slavePort (Optional. Default: 4851)	Starts replication master mode for a database, as described in Starting and running Replication.	The database must already exist on the master host. The database can not be shut down while replication is active.
Stop master	database name (Required)	Sends a stop slave message to the slave if the network connection is working. Then shuts down all replication related functionality without shutting down the database.	The Derby instance where this command is issued must be the replication master for the specified database.
Start slave	database name (Required) slaveHost (Optional. Default: localhost) slavePort (Optional. Default: 4851)	Partially boots the specified database. Starts to listen on the specified port and accepts a connection from the master. Start slave hangs until the master has connected to it before reporting the startup status (started/not started because...) to the caller. Will receive the database from the master (10.5 only), and chunks of transaction log after that. Will continuously redo the received log to the slave database. If replication was started successfully, an exception with error code XRE08 will be thrown. Hence, the start slave command will not return a valid connection.	10.4: The database must have been file-system copied to the slave host. 10.5: The Derby instance where this command is issued must not already be serving a database with the same name.
Stop slave	database name (Required)	All transaction log that has been received from the master is written to disk. The slave replication functionality and the database are then shut down.	The Derby instance where this command is issued must be serving the named database in replication slave mode. The command is only accepted from the master for as long as the network connection is working. If the network connection is down, the command will also be accepted from the connection url/command line interface.
Failover	database name (Required)	If issued on the master: sends the remaining log records to the slave instance and then sends the failover command. The replication functionality and the database are then shut down. If failover was successful, an exception with error code XRE20 will be thrown. Hence, when issued on the master, the failover command will not return a valid connection. If issued on the slave (only if master connection is down): All transaction log that has been received from the master is written to disk. The slave replication functionality is shut down, and the boot process of the database is allowed to complete. The database will now be in a transaction-consistent state, reflecting all transactions whose commit log records were received from the master. When issued on the slave, the failover command returns a valid connection.	The Derby instance where this command is issued must be serving the named database in replication mode. The command is only accepted on the slave if the network connection with the master is down.
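
Because a successful start slave, and a successful failover issued on the master, are reported through an exception rather than a connection, calling code has to tell these "success" exceptions apart from real failures. The sketch below shows one way to do that. It assumes the codes XRE08 and XRE20 are exposed as the SQLState of the thrown java.sql.SQLException, which this specification does not state; the class and method names are hypothetical.

```java
import java.sql.SQLException;

// Hypothetical helper for interpreting replication command outcomes.
final class ReplicationOutcomeSketch {
    static final String SLAVE_STARTED = "XRE08";        // start slave succeeded
    static final String MASTER_FAILOVER_DONE = "XRE20"; // failover on master succeeded

    static boolean slaveStarted(SQLException e) {
        return hasState(e, SLAVE_STARTED);
    }

    static boolean masterFailoverSucceeded(SQLException e) {
        return hasState(e, MASTER_FAILOVER_DONE);
    }

    // Walks the chain of linked exceptions, since the relevant state
    // is not necessarily on the outermost exception.
    private static boolean hasState(SQLException e, String state) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if (state.equals(cur.getSQLState())) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would wrap the connection attempt in a try block and treat the command as failed only if the caught exception matches neither check.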

The commands have the following syntax:

Command	Connection url (or in connection Properties)	Command line (not included in 10.4)
Start Master	'jdbc:derby:<dbname>;startMaster=true;slaveHost=<slavehost>;slavePort=<slaveport>;'	java <package>.ReplicationControl startmaster -db <dbname> -slaveHost <slavehost> -slavePort <slaveport> -h <networkserverhost> -p <networkserverport>
Stop Master	'jdbc:derby:<dbname>;stopMaster=true;'	java <package>.ReplicationControl stopmaster -db <dbname> -h <networkserverhost> -p <networkserverport>
Start Slave	'jdbc:derby:<dbname>;startSlave=true;slaveHost=<slavehost>;slavePort=<slaveport>;'	java <package>.ReplicationControl startslave -db <dbname> -slaveHost <slavehost> -slavePort <slaveport> -h <networkserverhost> -p <networkserverport>
Stop Slave	'jdbc:derby:<dbname>;stopSlave=true;'	java <package>.ReplicationControl stopslave -db <dbname> -h <networkserverhost> -p <networkserverport>
Failover	'jdbc:derby:<dbname>;failover=true;'	java <package>.ReplicationControl failover -db <dbname> -h <networkserverhost> -p <networkserverport>

(In the table above, the connection url dbname includes the host and port of the network server.)
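
The connection url forms in the table can be assembled with a small helper such as the following. The class ReplicationUrlSketch and its method names are hypothetical. Optional attributes are simply left out when not supplied, so that the documented defaults apply, and the trailing semicolon shown in the table is omitted on the assumption that it is optional.

```java
// Hypothetical url builder for the replication connection attributes.
final class ReplicationUrlSketch {
    private static final String PREFIX = "jdbc:derby:";

    // slavePort omitted: the documented default (4851) applies.
    static String startMaster(String db, String slaveHost) {
        return PREFIX + db + ";startMaster=true;slaveHost=" + slaveHost;
    }

    static String startMaster(String db, String slaveHost, int slavePort) {
        return startMaster(db, slaveHost) + ";slavePort=" + slavePort;
    }

    // slaveHost and slavePort omitted: defaults are localhost and 4851.
    static String startSlave(String db) {
        return PREFIX + db + ";startSlave=true";
    }

    static String startSlave(String db, String slaveHost, int slavePort) {
        return startSlave(db) + ";slaveHost=" + slaveHost + ";slavePort=" + slavePort;
    }

    static String stopMaster(String db) {
        return PREFIX + db + ";stopMaster=true";
    }

    static String stopSlave(String db) {
        return PREFIX + db + ";stopSlave=true";
    }

    static String failover(String db) {
        return PREFIX + db + ";failover=true";
    }
}
```

In client/server mode the db argument carries the network server location, so failover("//localhost:1527/x") yields jdbc:derby://localhost:1527/x;failover=true.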

Authorization and Authentication

The commands above apply only to the database specified. Depending on the security mode Derby is running under, the following measures will be enforced:

No security - Anyone may execute the replication commands.
Authentication is turned on - Normal Derby connection policy. The user and password must be valid.
Authorization is turned on - The user/password must be valid, and the user must be the database owner of the replicated database.
(Not included in 10.4) System privileges (DERBY-2109) are turned on - The user/password must be valid, and the user must have the "replication" system privilege.

There are two exceptions to the rules above: the "failover" and "stop slave" commands. Since the slave database is blocked in the boot process, the authorization service for the database is not available. Hence, we can not check if the user is authorized to stop replication at the slave side. For as long as the network connection between master and slave is up, the slave will therefore only accept these commands from the master. Authorization can then be properly checked on the master Derby instance. If the network connection is lost, however, the commands will also be accepted on the slave Derby instance without checking authorization. Note that although we are not able to check authorization, system level authentication and system privileges (not in 10.4) will still be enforced since these are not database specific.
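
The rules in this section can be summarized as two small decision functions. This is a sketch of the policy as described here, not Derby's security code; the enum SecurityMode, the class name and the boolean parameters are hypothetical.

```java
// Hypothetical summary of the replication security rules.
final class ReplicationAuthSketch {

    enum SecurityMode { NONE, AUTHENTICATION, AUTHORIZATION, SYSTEM_PRIVILEGES }

    // General rule for replication commands under each security mode.
    static boolean mayRunCommand(SecurityMode mode,
                                 boolean validCredentials,
                                 boolean isDatabaseOwner,
                                 boolean hasReplicationPrivilege) {
        switch (mode) {
            case NONE:
                return true;
            case AUTHENTICATION:
                return validCredentials;
            case AUTHORIZATION:
                return validCredentials && isDatabaseOwner;
            case SYSTEM_PRIVILEGES: // not included in 10.4
                return validCredentials && hasReplicationPrivilege;
            default:
                return false;
        }
    }

    // Slave-side exception for "failover" and "stop slave": while the
    // connection to the master is up, only the master may issue them;
    // once it is down, they are also accepted directly on the slave.
    static boolean slaveAccepts(boolean fromMaster, boolean connectionToMasterUp) {
        return fromMaster || !connectionToMasterUp;
    }
}
```

The second function makes the trade-off explicit: while the connection to the master is up, the authorization check is delegated to the master, and once the connection is lost the commands are accepted on the slave without it.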

Additional Security

Apart from authentication, authorization and system privileges (not in 10.4), security issues have not been considered. Security issues that may have to be addressed later include support for encrypted databases, SSL on the network communication and measures for avoiding man-in-the-middle attacks.

Testing

Testing is handled by JIRA issue DERBY-3161.

Documentation

Documentation is handled by JIRA issue DERBY-3169. The NetworkServerControl commands will be added to the reference guide, and a replication chapter will be added to the administrator guide.
