OpenDJ uses advanced data replication with automated conflict resolution to help ensure your directory services remain available in the event that a server crashes or a network goes down, and also as you back up or upgrade your directory service. You can configure data replication as part of OpenDJ installation, and in many cases let replication do its work in the background.
You can set up replication during installation by choosing to configure replication through the setup wizard.
In the Topology Options screen for the first server you set up, select This server will be part of a replication topology. If you also choose Configure as Secure, then replication traffic is protected by SSL.
In the Topology Options screen for subsequent servers, also select There is already a server in the topology, providing the Host Name, Administration Connector Port number, Admin User, and Admin Password for the first replica you set up.
You also set up a global administrator account, stored under
cn=admin data across replicas, used to manage replication
in the topology.
You further set up what to replicate.
Once replication is set up, it works for all the replicas. You can monitor the replication connection and status through the OpenDJ Control Panel.
Before you go beyond the replication configuration that the setup wizard provides, read this section to learn more about how OpenDJ replication works.
Replication is the process of copying updates between OpenDJ directory servers such that all servers converge on identical copies of directory data. Replication is designed to let convergence happen over time by default. [1] Letting convergence happen over time means that different replicas can be momentarily out of sync, but it also means that if you lose an individual server or even an entire data center, your directory service can keep on running, and then get back in sync when the servers are restarted or the network is repaired.
Replication is specific to the OpenDJ directory service. Replication uses a specific protocol that replays update operations quickly, storing enough historical information about the updates to resolve most conflicts automatically. For example, if two client applications separately update a user entry to change the phone number, replication can work out which was the latest change, and apply that change across servers. The historical information needed to resolve these issues is periodically purged to keep it from growing indefinitely. As a directory administrator, you must ensure that you do not purge the historical information more often than you back up your directory data.
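As a rough illustration of the phone number example, the following sketch resolves concurrent updates to a single-valued attribute by keeping the change with the latest timestamp. The Change record, the resolve function, and the tie-break on a replica ID are assumptions made for this illustration. They are not OpenDJ's actual data structures or algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    timestamp: int   # time of the update in milliseconds (illustrative)
    replica_id: int  # hypothetical ID of the replica that took the update
    value: str       # new attribute value


def resolve(changes):
    """Return the value of the latest change, whatever order changes arrive in."""
    # Ties on timestamp are broken by replica ID so every replica picks the same winner.
    latest = max(changes, key=lambda c: (c.timestamp, c.replica_id))
    return latest.value


a = Change(timestamp=1000, replica_id=1, value="+1 408 555 1212")
b = Change(timestamp=1005, replica_id=2, value="+1 408 555 9999")

# Both orders give the same result, so replicas converge on the later change.
print(resolve([a, b]))
print(resolve([b, a]))
```

The point of the sketch is that the result does not depend on the order in which a replica receives the changes, which is what lets replicas converge over time.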
The primary unit of replication is the suffix, specified by a
base DN such as dc=example,dc=com. [2] Replication also depends on the directory schema, defined on
cn=schema, and the cn=admin data
suffix with administrative identities and certificates for protecting
communications. Thus that content gets replicated as well.
The set of replicas sharing data in a given suffix is called
a replication topology. You can have more than one replication topology.
For example, one topology could be devoted to
dc=example,dc=com, and another to
dc=example,dc=org. Directory servers are capable of
serving more than one suffix. They are also capable of participating in
more than one replication topology.
Keep server clocks synchronized for your topology. You can use NTP for example. Keeping server clocks synchronized helps prevent issues with SSL connections and with replication itself. Keeping server clocks synchronized also makes it easier to compare timestamps from multiple servers.
This section shows how to configure replication with command-line tools.
You can start the replication process by using the dsreplication enable command.
To enable secure connections for replication use the
--secureReplication1 and
--secureReplication2 options, which are equivalent to
selecting Configure as Secure in the replication topology options screen of
the setup wizard.
As you see in the command output, replication is set up to function once enabled. You must however initialize replication in order to start the process.
When scripting the configuration to set up multiple replicas in quick
succession, use the same initial replication server each time you run the
command. In other words, pass the same --host1,
--port1, --bindDN1,
--bindPassword1, and --replicationPort1
options for each of the other replicas that you set up in your
script.
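One way to script this is sketched below in Python. It builds one dsreplication enable argument list per additional replica, always passing the same first server. The host names, ports, and passwords are placeholders. The options not named in the text above (the matching options for the second server, --adminUID, --adminPassword, --baseDN, --trustAll, and --no-prompt) are assumptions to check against dsreplication enable --help for your version. The sketch only prints the commands; it does not run them.

```python
import shlex


def server_options(server, n):
    """Return the numbered connection options for server 1 or server 2."""
    return [
        f"--host{n}", server["host"],
        f"--port{n}", str(server["port"]),
        f"--bindDN{n}", server["bind_dn"],
        f"--bindPassword{n}", server["bind_password"],
        f"--replicationPort{n}", str(server["replication_port"]),
    ]


def enable_commands(first, others, base_dn, admin_uid, admin_password):
    """Build one argument list per additional replica, reusing the first server."""
    commands = []
    for other in others:
        commands.append(
            ["dsreplication", "enable",
             "--adminUID", admin_uid,
             "--adminPassword", admin_password,
             "--baseDN", base_dn]
            + server_options(first, 1)   # same initial replication server every time
            + server_options(other, 2)   # the replica being added
            + ["--trustAll", "--no-prompt"]
        )
    return commands


def replica(host):
    """Placeholder connection settings for a hypothetical replica."""
    return {"host": host, "port": 4444, "bind_dn": "cn=Directory Manager",
            "bind_password": "password", "replication_port": 8989}


first = replica("opendj.example.com")
others = [replica("opendj2.example.com"), replica("opendj3.example.com")]

for command in enable_commands(first, others, "dc=example,dc=com", "admin", "password"):
    print(shlex.join(command))
```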
If you need to add another OpenDJ directory server to participate in replication, use the dsreplication enable command with the new server as the second server.
Although you can enable replication before you have user data, you must initialize each replica to activate the replication process.
You can perform initialization in any of the following ways: over the replication protocol; by importing the same LDIF data on all servers before performing initialization when starting out; by importing data from LDIF that you exported from another replica when adding a server to the topology; or by restoring a backup from an existing replica onto a new server.
Online initialization is straightforward, and works well if your network bandwidth is large compared to the amount of data to replicate.
Make sure you have enabled servers you want to participate in replication.
Start replication with the dsreplication initialize-all command.
Follow these steps to prepare a replication topology starting from directory data in LDIF.
Depending on the size of the data and your network bandwidth, you might find it quicker to initialize all replicas as described in Procedure 8.1, “To Initialize Online”, and then import the LDIF on a single replica.
Import the same LDIF on all servers you want to participate in replication.
Make sure you have enabled servers you want to participate in replication.
Start replication with the dsreplication initialize-all command.
You can create a new replica from a backup of a server in the existing topology. The dsreplication commands you use differ slightly from the other cases, as you must reset the generation ID on the new replica, such that replication can proceed from the proper starting point. Follow these steps to add another server to the topology.
Install a new server to serve as the new replica.
Back up the database to replicate from an existing server.
Enable replication on the new replica.
Prepare the new replica for initialization.
On the new server, restore the database from the backup archive.
Initialize replication on the new replica.
How you stop replication depends on whether the change is meant to be temporary or permanent.
If you need to stop a server from replicating temporarily, you can do so using the dsconfig command.
Do not allow modifications on the replica for which replication is disabled, as no record of such changes is kept, and the changes cause replication to diverge.
Disable the multimaster synchronization provider.
When you are ready to resume replication, enable the multimaster synchronization provider.
If you need to stop a server from replicating permanently, for example in preparation to remove a server, you can do so with the dsreplication disable command.
Stop replication using the dsreplication disable command.
The dsreplication disable command as shown completely removes the replication configuration information from the server.
If you want to restart replication for the server, you need to run the dsreplication enable and dsreplication initialize commands again.
Replication in OpenDJ is designed to be both easy to implement in environments with a few servers and scalable in environments with many servers. You can enable the replication service on each OpenDJ directory server in your deployment, for example, to limit the number of servers you deploy. Yet in a large deployment, you can use stand-alone replication servers, which are OpenDJ servers that do nothing but relay replication messages, to configure (and troubleshoot) the replication service separately from the directory service. You only need a few stand-alone replication servers publishing changes to serve many directory servers subscribed to the changes. Furthermore, replication is designed such that you need only connect a directory server to the nearest replication server for the directory server to replicate with all others in your topology. Yet only the stand-alone replication servers participate in fully meshed replication.
All replication servers in a topology are connected to all other replication servers. Directory servers are connected only to one replication server at a time, and their connections should be to replication servers on the same LAN. Therefore the total number of replication connections, Totalconn, is expressed as follows.
Totalconn = (NRS * (NRS - 1))/2 + NDS
Here, NRS is the number of replication servers, and NDS is the number of stand-alone directory servers. In other words, if you have only three servers, then Totalconn is three with no stand-alone servers. However, if you have two data centers, and need 12 directory servers, then with no stand-alone directory servers Totalconn is (12 * 11)/2 or 66. Yet, with four stand-alone replication servers, and 12 stand-alone directory servers, Totalconn is (4 * 3)/2 + 12, or 18, with only four of those connections needing to go over the WAN. (By running four directory servers that also run replication servers and eight stand-alone directory servers, you reduce the number of replication connections to 14 for 12 replicas.)
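The connection counts in these examples can be checked with a few lines of Python. The function name total_connections is chosen for this illustration. It counts one connection per pair of replication servers, plus one connection per stand-alone directory server.

```python
def total_connections(n_rs, n_ds):
    """Totalconn for n_rs replication servers and n_ds stand-alone directory servers."""
    # Fully meshed replication servers: one connection per pair.
    # Each stand-alone directory server: one connection to a replication server.
    return n_rs * (n_rs - 1) // 2 + n_ds


print(total_connections(3, 0))    # 3: three servers, none stand-alone
print(total_connections(12, 0))   # 66: twelve servers, none stand-alone
print(total_connections(4, 12))   # 18: four stand-alone replication servers, twelve directory servers
print(total_connections(4, 8))    # 14: four combined servers, eight stand-alone directory servers
```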
If you set up an OpenDJ directory server to replicate by using the Quick Setup wizard, then the wizard activated the replication service for that server. You can turn off the replication service on an OpenDJ directory server, and then configure the server to work with a separate, stand-alone replication server instead. Start by using the dsreplication disable --disableReplicationServer command to turn off the replication service on the server.
This example sets up a stand-alone replication server to handle the replication traffic between two directory servers that do not handle replication themselves.
Here the replication server has admin port 6444. The directory servers have admin ports 4444 and 5444.
In a real deployment, you would have more replication servers to avoid a single point of failure.
Set up the replication server as a directory server that has no database.
Set up the directory servers as stand-alone directory servers.
Enable replication with the appropriate
--noReplicationServer and
--onlyReplicationServer options.
Initialize replication from one of the directory servers.
Replication lets you define groups so that replicas communicate first with replication servers in the group before going to replication servers outside the group. Groups are identified with unique numeric group IDs.
Replication groups are designed for deployments across multiple data centers, where you aim to focus replication traffic on the LAN rather than the WAN. In multi-data center deployments, group nearby servers together.
For each group, set the appropriate group ID for the topology on both the replication servers and the directory servers.
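The effect of group IDs can be pictured with the following simplified model, in which a directory server prefers a reachable replication server from its own group and falls back to another group only when none is available. The function, the dictionary keys, and the host names are hypothetical, and the server's real selection logic may weigh other factors.

```python
def pick_replication_server(my_group_id, servers):
    """Prefer a reachable replication server in the same group; otherwise fall back."""
    reachable = [s for s in servers if s["reachable"]]
    same_group = [s for s in reachable if s["group_id"] == my_group_id]
    # Use another group's replication server only when no local one is reachable.
    candidates = same_group or reachable
    return candidates[0]["host"] if candidates else None


servers = [
    {"host": "rs1.example.com", "group_id": 1, "reachable": True},
    {"host": "rs2.example.com", "group_id": 2, "reachable": True},
]

print(pick_replication_server(2, servers))  # a group 2 server stays in group 2
servers[1]["reachable"] = False
print(pick_replication_server(2, servers))  # falls back to the group 1 server
```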
The example commands in this procedure set up two replication groups, each with a replication server and a directory server. The directory servers have admin ports 4444 and 5444. The replication servers have admin ports 6444 and 7444. In a full-scale deployment, you would have multiple servers of each type in each group, such as all the replicas and replication servers in each data center being in the same group.
Pick a group ID for each group.
The default group ID is 1.
Set the group ID for each group by replication domain on the directory servers.
Set the group ID for each group on the replication servers.
By default all directory servers in a replication topology are read-write. You can however choose to make replicas take updates only from the replication protocol, and refuse updates from client applications.
In standard replication, when a client requests an update operation, the directory server performs the update and, if the update is successful, sends information about the update to the replication service, and sends a result code to the client application right away. As a result, the client application can conclude that the update was successful, but only on the replica that handled the update.
Assured replication lets you force the replica performing the initial update to wait for confirmation that the update has been received elsewhere in the topology before sending a result code to the client application. You can configure assured replication either to wait for one or more replication servers to acknowledge having received the update, or to wait for all directory servers to have replayed the update.
As you might imagine, assured replication is theoretically safer than standard replication, yet it is also slower, potentially waiting for a timeout before failing when the network or other servers are down.
Safe data mode requires the update be sent to
assured-sd-level replication servers before
acknowledgement is returned to the client application.
For each directory server, set safe data mode for the replication domain, and also set the safe data level.
Safe read mode requires the update be replayed on all directory servers before acknowledgement is returned to the client application.
For each directory server, set safe read mode for the replication domain.
When working with assured replication, the replication server property
degraded-status-threshold (default: 5000) sets the
number of operations allowed to build up in the replication queue before
the server is assigned degraded status. When a replication server has
degraded status, assured replication ceases to have an effect.
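The difference between the two assured modes comes down to what the replica waits for before answering the client. The following sketch models only that decision. The function name, its parameters, and the mode labels are chosen for this illustration, and timeouts and degraded status are left out.

```python
def may_acknowledge(mode, rs_acks=0, sd_level=1, ds_replayed=0, ds_total=0):
    """Decide whether the client can be sent its result code yet."""
    if mode == "standard":
        # Standard replication answers right away.
        return True
    if mode == "safe-data":
        # Wait until assured-sd-level replication servers have received the update.
        return rs_acks >= sd_level
    if mode == "safe-read":
        # Wait until all the other directory servers have replayed the update.
        return ds_replayed >= ds_total
    raise ValueError(f"unknown mode: {mode}")


print(may_acknowledge("standard"))
print(may_acknowledge("safe-data", rs_acks=1, sd_level=2))
print(may_acknowledge("safe-data", rs_acks=2, sd_level=2))
print(may_acknowledge("safe-read", ds_replayed=3, ds_total=4))
```

In this model, safe data mode with a level of 2 keeps the client waiting after the first acknowledgement, and safe read mode keeps the client waiting as long as any directory server has yet to replay the update.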
OpenDJ can perform subtree replication, for example replicating
ou=People,dc=example,dc=com, but not the rest of
dc=example,dc=com, by putting the subtree in a separate
backend from the rest of the suffix.
For example, in this case you might have a userRoot
backend containing everything in dc=example,dc=com
except ou=People,dc=example,dc=com, and a separate
peopleRoot backend for
ou=People,dc=example,dc=com. Then you replicate
ou=People,dc=example,dc=com in its own topology.
OpenDJ can perform fractional replication, whereby you specify the attributes to include in or to exclude from the replication process.
You set fractional replication configuration as
fractional-include or
fractional-exclude properties for a replication
domain. When you include attributes, the attributes that are required on
the relevant object classes are also included, whether you specify them
or not. When you exclude attributes, the excluded attributes must be
optional attributes for the relevant object classes. Fractional
replicas still respect schema definitions.
Fractional replication works by filtering objects at the replication server. Initialize replication as you would normally. Of course you cannot create a full replica from a replica with only a subset of the data. If you must prevent data from being replicated across a national boundary, split the replication server handling the updates from the directory servers receiving the updates as described in Procedure 8.6, “To Set Up a Stand-alone Replication Server”.
For example, you might configure an externally facing
fractional replica to include only some inetOrgPerson
attributes.
As another example, you might exclude a custom attribute called
sessionToken from being replicated.
This last example only works if you first define a
sessionToken attribute in the directory server
schema.
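The include and exclude rules can be modeled as a filter over an entry's attributes, as in the sketch below. The function is hypothetical, the sample entry is made up, and the list of required attributes is passed in by hand, whereas the server derives it from the schema. The sketch treats include and exclude as alternatives.

```python
def fractional_filter(entry, include=None, exclude=None, required=()):
    """Return the attributes of entry that a fractional replica would keep."""
    if include is not None:
        # Included attributes plus those required by the object classes.
        keep = {a.lower() for a in include} | {a.lower() for a in required}
        return {k: v for k, v in entry.items() if k.lower() in keep}
    if exclude is not None:
        drop = {a.lower() for a in exclude}
        return {k: v for k, v in entry.items() if k.lower() not in drop}
    return dict(entry)


entry = {
    "objectClass": ["top", "person", "organizationalPerson", "inetOrgPerson"],
    "cn": ["Example User"],
    "sn": ["User"],
    "mail": ["user@example.com"],
    "description": ["Internal note"],
    "sessionToken": ["abc123"],
}

# Include only mail; the required attributes come along anyway.
print(sorted(fractional_filter(entry, include=["mail"],
                               required=["objectClass", "cn", "sn"])))

# Exclude the custom sessionToken attribute and keep everything else.
print(sorted(fractional_filter(entry, exclude=["sessionToken"])))
```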
Some applications require notification when directory data updates occur. For example, an application might need to sync directory data with another database, or the application might need to kick off other processing when certain updates occur.
In addition to supporting persistent search operations, OpenDJ provides an external change log mechanism to allow applications to be notified of changes to directory data.
OpenDJ directory servers without replication cannot expose an external change log. The OpenDJ server that exposes the change log must function both as a directory server, and also as a replication server for the suffix whose changes you want logged.
Enable replication without using the
--noReplicationServer or
--onlyReplicationServer options.
With replication enabled, the changelog data can be accessed under
cn=changelog. For example, the following search shows
the publicly visible data available before any changes have been
made.
You read the external change log over LDAP. In addition, when you poll the change log periodically, you can get the list of updates that happened since your last request.
The external change log mechanism uses an LDAP control with
OID 1.3.6.1.4.1.26027.1.5.4 to allow the exchange
of cookies for the client application to bookmark the last changes seen,
and then start reading the next set of changes from where it left off on
the previous request.
This procedure shows the client reading the change log as
cn=Directory Manager. Make sure your client application
reads the changes with sufficient access to view all the changes it
needs to see.
Send an initial search request using the LDAP control with no cookie value.
Notice the value of the changeLogCookie attribute
for the last of the two changes.
In this example, two new users were added to another replica before the change log request was made.
Here the changes are base64 encoded, so you can decode them using the base64 command.
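A client application written in Python can decode the value with the standard base64 module instead. In the sketch below, the sample value is made up and encoded on the spot to keep the example self-contained. It is not output from a real server.

```python
import base64


def decode_changes(value):
    """Decode a base64-encoded changes value into readable text."""
    return base64.b64decode(value).decode("utf-8")


# Made-up sample standing in for the base64 value read from a change log entry.
sample = base64.b64encode(
    b"objectClass: person\ncn: Example User\nsn: User\n"
).decode("ascii")

print(decode_changes(sample))
```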
For the next search, provide the cookie to start reading where you left off last time.
In this example, a description was added to Babs Jensen's entry.
If we base64-decode the changes, we see the following.
If for some reason you lose the cookie, you can start over from the earliest available change by sending a search request with no value for the cookie.
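The bookmarking behavior in this procedure can be modeled with an in-memory stand-in for the change log, as sketched below. The FakeChangeLog class and the change descriptions are invented for this illustration. Real cookies are opaque values returned by the server, and a real client would send LDAP search requests with the control rather than call a local object.

```python
class FakeChangeLog:
    """In-memory stand-in for cn=changelog, for illustration only."""

    def __init__(self):
        self.changes = []

    def search(self, cookie=None):
        """Return the changes after the bookmark, and a new cookie."""
        # No cookie (or an empty one) means start from the earliest change.
        start = int(cookie) if cookie else 0
        return self.changes[start:], str(len(self.changes))


log = FakeChangeLog()
log.changes += ["add first new user", "add second new user"]

first, cookie = log.search()             # initial request with no cookie value
log.changes.append("add description to an existing entry")
second, cookie = log.search(cookie)      # resume where the last request left off
restart, cookie = log.search()           # cookie lost: start over from the beginning

print(first)
print(second)
print(len(restart))
```

Keeping the latest cookie after every request is what lets the client see each change once. Losing the cookie means reading all available changes again.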
As shown above, the changes returned from a search on the external
change log include only what was actually changed. If you have applications
that need additional attributes published with every change log entry,
regardless of whether or not the attribute itself has changed, then specify
those using ecl-include and
ecl-include-for-deletes.
Set the attributes to include for all update operations with
ecl-include.
Set the attributes to include for deletes with
ecl-include-for-deletes.
You can limit external change log content by disabling the domain
for a base DN. By default, cn=schema and
cn=admin data are not enabled.
Prevent OpenDJ from logging changes by disabling the domain.
The external change log can also work for applications that follow the Internet-Draft: Definition of an Object Class to Hold LDAP Change Records. Nothing special is required to get the objects specified for this legacy format. Such applications cannot, however, use the change log cookies. The cookies are shared across the replication topology, and therefore can continue to be used after failover to another replica in a multi-master replication environment.
[1] Assured replication can require, however, that the convergence happen before the client application is notified that the operation was successful.
[2] When you configure partial and fractional replication, however, you can replicate only part of a suffix, or only certain attributes on entries. Also, if you split your suffix across multiple backends, then you need to set up replication separately for each part of the suffix in a different backend.