Master-Slave Snapshot Replication
You have to design a replication solution for these requirements:
A replication set is to be copied from a single source to a target, and possibly to more than one target. The replication set consists of entire rows, not just changes that have occurred to rows since the last replication.
Any changes made to the replication data at the target that may have occurred since the last transmission will be overwritten by a new transmission. Hence, the snapshot replication is a master-slave replication.
How do you move an entire replication set from source to target so that it is consistent at a given point in time?
All of the forces described in the Master-Slave Replication pattern and its parent patterns apply in this context, and there are additional forces that apply. The relevant Master-Slave Replication forces are repeated here for convenience.
Any one of the following compelling forces justifies using the solution described in this pattern:
Simplicity. You have no reason to use a more complex solution, and you avoid any potential referential integrity problems.
High ratio of changes. The ratio of replication set changes to overall volume of the replication set content is substantial. On every transmission, most of the replication set has changed. Therefore, it makes more sense to transmit the whole replication set, rather than just parts of it. In this case, operational resources are saved by performing a snapshot replication.
The following enabling forces facilitate the move to the solution, and their absence may hinder such a move:
Tolerance of overwrites. If some data has been updated by an application on the target, it might eventually be replaced by data from the source.
Stability. The target applications require data that is stable over a predictable period, and may only change at defined points in time.
Initial database load. You need to perform an initial population of a target database; therefore, the whole source replication set must be transmitted. For this reason, it makes sense to use a snapshot replication before using other types of replication.
Make a copy of the source replication set at a specific time (this is known as a snapshot), replicate it to the target, and overwrite the target data. In this way, any changes that may have occurred to the target replication set are replaced by the new source replication set.
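The solution can be sketched in a few lines. The following is a minimal illustration, not a definitive implementation: it assumes two in-memory SQLite databases and a hypothetical `customers` table, neither of which comes from the pattern itself. It shows that the full set of rows is acquired and that changes made at the target are overwritten.

```python
import sqlite3

def make_db(rows):
    """Create an in-memory database with a hypothetical 'customers' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn

def replicate_snapshot(source, target):
    """Copy every row of the replication set and overwrite the target copy."""
    # Acquire: read complete rows, not just the changes since the last run.
    snapshot = source.execute(
        "SELECT id, name FROM customers ORDER BY id"
    ).fetchall()
    # Write: replace the target content without checking for target updates.
    with target:  # commits on success, rolls back on error
        target.execute("DELETE FROM customers")
        target.executemany("INSERT INTO customers VALUES (?, ?)", snapshot)
    return len(snapshot)

source = make_db([(1, "Ada"), (2, "Grace")])
# The target has drifted: one row was changed locally and one was added.
target = make_db([(1, "Ada"), (2, "changed locally"), (3, "target only")])

copied = replicate_snapshot(source, target)
result = target.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(copied, result)
```

After the call, the target holds exactly the source rows; the local change and the target-only row are gone, which is the overwrite behavior the pattern requires you to tolerate.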
Master-Slave Snapshot Replication uses a single replication building block, consisting of the source, the replication link with the Acquire, Manipulate, and Write services, and the target, as Figure 1 shows.
Figure 1: Snapshot replication
The source database contains the replication set to be transmitted.
In the Master-Slave Snapshot Replication pattern, the replication set that is acquired consists of the full set of complete rows that are to be copied to the target (not just the changes that were made to the data at the source).
The range of possible data manipulations in a Master-Slave Snapshot Replication environment depends on the implementation.
Hint: The snapshot replication between databases can be performed either by a direct connection with SQL statements or by the use of an extract tool. If the snapshot replication uses a tool, a separate manipulation service needs data in an accessible structure to read and change the replication set. Thus, manipulations can be done if the format of the extracted file is documented (for example, extracts to text files). Binary files cannot be used by this service.
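As a rough illustration of the extract-file route, the sketch below dumps a replication set to CSV text and lets a separate manipulation step read and change it. The `customers` table, its columns, and the transformations (trimming names, upper-casing country codes) are assumptions chosen for the example; a real extract tool defines its own documented format.

```python
import csv
import io
import sqlite3

def extract_to_text(conn):
    """Acquire: dump the whole replication set to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "name", "country"])
    writer.writerows(
        conn.execute("SELECT id, name, country FROM customers ORDER BY id")
    )
    return buf.getvalue()

def manipulate(text):
    """Manipulate: possible only because the text format is documented."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (int(row["id"]), row["name"].strip(), row["country"].upper())
        for row in reader
    ]

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, " Ada ", "uk"), (2, "Grace", "us")],
)

text = extract_to_text(source)
rows = manipulate(text)
print(rows)
```

Because the intermediate file is plain text with a known header, the manipulation step needs no access to the source database at all.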
The replication set is written to the target by the Write service without checking for updates to the target. To write the replication set, the service can use one of the following options, depending upon the circumstances at the target:
The relevant target tables are dropped and new ones are created before inserting the replication set.
All data in the existing tables is overwritten. The tables are truncated before the replication set is written.
The service deletes the old replication set on the target and writes the new replication set.
The first and the second option can be used if the target mirrors the source. The last option is appropriate if data exists in addition to the replication set, and this data must not be affected by the replication.
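The three Write options above might look as follows. This is a sketch under stated assumptions: it uses SQLite through Python's `sqlite3` module, a hypothetical `orders` table, and a replication set defined as the rows of one region. SQLite has no `TRUNCATE TABLE` statement, so an unqualified `DELETE` stands in for truncation here.

```python
import sqlite3

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
INSERT = "INSERT INTO orders VALUES (?, ?, ?)"

def write_drop_and_recreate(target, snapshot):
    """Option 1: drop the table, create a new one, insert the replication set."""
    with target:
        target.execute("DROP TABLE IF EXISTS orders")
        target.execute(SCHEMA)
        target.executemany(INSERT, snapshot)

def write_truncate(target, snapshot):
    """Option 2: empty the existing table, then insert the replication set."""
    with target:
        target.execute("DELETE FROM orders")  # stand-in for TRUNCATE TABLE
        target.executemany(INSERT, snapshot)

def write_replace_set(target, snapshot, region):
    """Option 3: delete only the old replication set, keep other target data."""
    with target:
        target.execute("DELETE FROM orders WHERE region = ?", (region,))
        target.executemany(INSERT, snapshot)

def fresh_target():
    """Target with two replicated rows (one stale) and one local-only row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(INSERT, [(1, "EU", 10.0), (2, "EU", 99.0), (100, "LOCAL", 5.0)])
    conn.commit()
    return conn

def contents(conn):
    return conn.execute("SELECT id, region, amount FROM orders ORDER BY id").fetchall()

snapshot = [(1, "EU", 10.0), (2, "EU", 20.0)]

dropped = fresh_target()
write_drop_and_recreate(dropped, snapshot)

truncated = fresh_target()
write_truncate(truncated, snapshot)

partial = fresh_target()
write_replace_set(partial, snapshot, "EU")

print(contents(dropped))
print(contents(truncated))
print(contents(partial))
```

The first two targets end up as mirrors of the snapshot, while the third keeps its local-only row alongside the refreshed replication set.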
Hint: The Master-Slave Snapshot Replication does not depend on a network connection between the source and the target. If no such connection exists, or its characteristics, such as bandwidth, are not sufficient, the acquired snapshot can be written to removable media. Removable media, such as compact discs, are then transported to the target, where the snapshot is written to the target.
The target is the database where the transmitted replication set is to be written.
A common use of Master-Slave Snapshot Replication is to establish a starting point for other replications, for example, incremental replication or synchronization. In this case, an identical state of the content of the replication set is needed on the source and the target to ensure data consistency when transmitting changes.
The pattern inherits the benefits and liabilities of the Master-Slave Replication pattern, and it has the following additional benefits and liabilities:
Provides data at a well-defined point in time. All data in the replication set is assigned to an exact point in time. This information can be used for historical or analytical purposes. In these cases, a snapshot replication provides you with a consistent, nonvolatile data basis.
Low resource consumption during operational work. Master-Slave Snapshot Replication does not check for updates to the data. Hence, during the Write, this type of replication requires fewer resources than incremental replication.
Independence of network connections. If you have to replicate a large data volume, and the infrastructure provides only a low-speed communication link, snapshot replication using removable media allows you to bypass these restrictions.
High resource consumption during transmission. Because the whole replication set is transmitted even if only small parts of the source replication set changed during the replication interval, a higher network load occurs and more working disk space is needed. This problem grows with the volume of replicated data.
Need a time frame to apply the snapshot. Especially if you replicate a large volume of data, it takes a certain time to acquire the snapshot from the source. To keep the replication set on the source consistent, it must not be changed by running applications during this time period. You can achieve this either by avoiding write operations from the application or by locking the data so that the replication set cannot be updated during the snapshot transmission. Otherwise, the data consistency cannot be guaranteed.
Snapshot data can reside in places other than the source and target data. Snapshot data requires security standards that are as high as the security standards used for the source and the target.
For more information, see the following related patterns:
Patterns That May Have Led You Here
Move Copy of Data. This pattern is the root pattern of this cluster. It presents the fundamental data movement building block, which consists of source, data movement set, data movement link, and target. Transmissions in this kind of data movement building block are done asynchronously, sometime after the update of the source. Thus, the target applications must tolerate a certain amount of latency until changes are delivered.
Data Replication. This pattern presents the architecture of a replication, which is a specific type of data copy movement.
Master-Slave Replication. This pattern describes, at a higher level than Master-Slave Snapshot Replication, design considerations for transmitting data from the source to the target by overwriting potential changes in the target.
Patterns That You Can Use Next
Implementing Master-Slave Snapshot Replication Using SQL Server. This pattern shows how to implement Master-Slave Snapshot Replication by using Microsoft SQL Server.
Other Patterns of Interest
Master-Master Replication. This pattern presents the design of a replication between a source and a target, where the common replication set is updateable at either end.
Master-Slave Transactional Incremental Replication. This pattern also describes a master-slave replication at a design level. It differs from the Master-Slave Snapshot Replication in that it transmits only changes of data from the source to the target using transactions.