Delta Syncrepl Replication Configuration
OpenLDAP’s syncrepl replication is an object-based replication mechanism. When any attribute value in a replicated object is changed on the master, each replica fetches and processes the complete changed object (both changed and unchanged attribute values) during replication. This works well, but has drawbacks in some situations. For example, suppose you have a database consisting of 100,000 objects of 1 KB each. Further, suppose you routinely run a batch job that changes a single two-byte attribute value in each of the 100,000 objects on the master. Not counting LDAP and TCP/IP protocol overhead, each time you run this job each replica will transfer and process about 100 MB of data to apply 200 KB of changes! About 99.8% of the data transmitted and processed in a case like this is redundant, since it represents values that did not change. This wastes valuable transmission and processing bandwidth and can cause an unacceptable replication backlog to develop. While this situation is extreme, it demonstrates a very real problem that is encountered in some LDAP deployments.
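To make the scale of the waste concrete, the short Python calculation below reruns the arithmetic for this example. It is only a back-of-the-envelope sketch: it assumes that 1 KB means 1,000 bytes and, like the text, it ignores LDAP and TCP/IP overhead.

```python
# Sizes from the example; 1 KB is taken as 1,000 bytes (an assumption
# made here for round numbers).
OBJECTS = 100_000
OBJECT_SIZE_BYTES = 1_000
CHANGED_BYTES_PER_OBJECT = 2

# Plain syncrepl resends every changed object in full.
full_transfer = OBJECTS * OBJECT_SIZE_BYTES
# Only two bytes per object actually changed.
actual_change = OBJECTS * CHANGED_BYTES_PER_OBJECT
redundant_pct = 100 * (1 - actual_change / full_transfer)

print(f"transferred: {full_transfer / 1e6:.0f} MB")
print(f"changed:     {actual_change / 1e3:.0f} KB")
print(f"redundant:   {redundant_pct:.1f}%")
```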
The Solution
Delta-syncrepl, a changelog-based variant of syncrepl, is designed to address situations like the one described above. Delta-syncrepl works by maintaining a changelog of a selectable depth on the master. The replication consumer on each replica checks the changelog for the changes it needs and, as long as the changelog contains the needed changes, the delta-syncrepl consumer fetches them from the changelog and applies them to its database. If, however, a replica is too far out of sync (or completely empty), conventional syncrepl is used to bring it up to date and replication then switches to the delta-syncrepl mode.
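The fallback behaviour described above can be illustrated with a toy model. The Python sketch below is not OpenLDAP code: the `Provider` and `Consumer` classes are hypothetical, and a plain integer sequence number stands in for the real change sequence numbers. It only shows the decision a consumer makes between replaying the changelog and doing a full refresh.

```python
from collections import deque


class Provider:
    """Toy master: a dict of entries plus a changelog of bounded depth."""

    def __init__(self, depth):
        self.entries = {}
        self.seq = 0  # hypothetical stand-in for a change sequence number
        self.changelog = deque(maxlen=depth)  # oldest changes fall off

    def modify(self, dn, attr, value):
        self.seq += 1
        self.entries.setdefault(dn, {})[attr] = value
        self.changelog.append((self.seq, dn, attr, value))


class Consumer:
    """Toy replica that prefers deltas and falls back to a full copy."""

    def __init__(self):
        self.entries = {}
        self.seq = 0

    def sync(self, provider):
        if self.seq == provider.seq:
            return "none"
        oldest = provider.changelog[0][0] if provider.changelog else None
        # The changelog is usable only if it still holds the change that
        # immediately follows the last one this consumer applied.
        if oldest is not None and oldest <= self.seq + 1:
            for seq, dn, attr, value in provider.changelog:
                if seq > self.seq:
                    self.entries.setdefault(dn, {})[attr] = value
                    self.seq = seq
            return "delta"
        # Too far behind (or empty): copy every complete entry instead.
        self.entries = {dn: dict(a) for dn, a in provider.entries.items()}
        self.seq = provider.seq
        return "full"


master = Provider(depth=3)
replica = Consumer()
for i in range(5):
    master.modify(f"uid=user{i},dc=symas,dc=com", "description", "v1")
first = replica.sync(master)   # changelog no longer reaches back far enough
master.modify("uid=user0,dc=symas,dc=com", "description", "v2")
second = replica.sync(master)  # one recent change, served from the changelog
print(first, second)
```

In this model the empty replica is first brought up to date with a full copy, and the later single change is applied as a delta, which mirrors the switch from conventional syncrepl to delta-syncrepl mode.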
Setting up delta-syncrepl requires configuration changes on both the master and replica servers:
Master configuration
# Give the replica DN unlimited read access. This ACL may need to be
# merged with other ACL statements.
access to *
    by dn.base="cn=replicator,dc=symas,dc=com" read
    by * break

# Set the module path location
modulepath /opt/symas/lib/openldap

# Load the hdb backend
moduleload back_hdb.la

# Load the accesslog overlay
moduleload accesslog.la

# Load the syncprov overlay
moduleload syncprov.la

# Accesslog database definitions
database hdb
suffix cn=accesslog
directory /db/accesslog
rootdn cn=accesslog
index default eq
index entryCSN,objectClass,reqEnd,reqResult,reqStart

overlay syncprov
syncprov-nopresent TRUE
syncprov-reloadhint TRUE

# Let the replica DN have limitless searches
limits dn.exact="cn=replicator,dc=symas,dc=com" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited

# Primary database definitions
database hdb
suffix "dc=symas,dc=com"
rootdn "cn=manager,dc=symas,dc=com"

## Whatever other configuration options are desired

# syncprov specific indexing
index entryCSN eq
index entryUUID eq

# syncrepl Provider for primary db
overlay syncprov
syncprov-checkpoint 1000 60

# accesslog overlay definitions for primary db
overlay accesslog
logdb cn=accesslog
logops writes
logsuccess TRUE
# scan the accesslog DB every day, and purge entries older than 7 days
logpurge 07+00:00 01+00:00

# Let the replica DN have limitless searches
limits dn.exact="cn=replicator,dc=symas,dc=com" time.soft=unlimited time.hard=unlimited size.soft=unlimited size.hard=unlimited
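The two arguments to logpurge in the master configuration are time spans: the first is the maximum age of accesslog entries and the second is how often the purge scan runs. The Python sketch below decodes them, assuming the `[ddd+]hh:mm[:ss]` format given in the slapo-accesslog manual page. The `parse_span` helper and its regular expression are illustrative only and are not part of OpenLDAP.

```python
import re
from datetime import timedelta

# Assumed span format: optional "days+" prefix, then hh:mm, then optional :ss.
_SPAN = re.compile(r"^(?:(\d+)\+)?(\d{2}):(\d{2})(?::(\d{2}))?$")


def parse_span(text):
    """Convert a logpurge-style time span into a timedelta."""
    match = _SPAN.match(text)
    if match is None:
        raise ValueError(f"not a [ddd+]hh:mm[:ss] span: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


# The values used in the master configuration.
age, interval = (parse_span(part) for part in "07+00:00 01+00:00".split())
print(f"purge entries older than {age.days} days, scanning every {interval.days} day(s)")
```

Read this way, `07+00:00 01+00:00` matches the comment in the configuration: a daily scan that removes entries older than seven days.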
Replica configuration
# Primary replica database configuration
database hdb
suffix "dc=symas,dc=com"
rootdn "cn=manager,dc=symas,dc=com"

## Whatever other configuration bits for the replica, like indexing
## that you want

# syncrepl specific indices
index entryUUID eq

# syncrepl directives
syncrepl rid=0
    provider=ldap://ldapmaster.symas.com:389
    bindmethod=simple
    binddn="cn=replicator,dc=symas,dc=com"
    credentials=secret
    searchbase="dc=symas,dc=com"
    logbase="cn=accesslog"
    logfilter="(&(objectClass=auditWriteObject)(reqResult=0))"
    schemachecking=on
    type=refreshAndPersist
    retry="60 +"
    syncdata=accesslog

# Refer updates to the master
updateref ldap://ldapmaster.symas.com
The above configuration assumes that your database defines a replicator identity that can be used to bind to the master. In addition, all of the databases (primary master, primary replica, and the accesslog storage database) should have properly tuned DB_CONFIG files that meet your needs.
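The retry setting in the replica's syncrepl stanza is a list of "interval count" pairs, where a count of + means retry indefinitely, so "60 +" asks the replica to try again every 60 seconds until it reconnects. The Python sketch below expands such a value into the sequence of waits it implies. The `retry_delays` helper is hypothetical and is shown only to illustrate how the value is read; it is not how slapd implements retries.

```python
from itertools import islice, repeat


def retry_delays(spec):
    """Yield the wait, in seconds, before each successive retry."""
    parts = spec.split()
    if not parts or len(parts) % 2:
        raise ValueError("retry needs '<interval> <count>' pairs")
    for interval, count in zip(parts[0::2], parts[1::2]):
        seconds = int(interval)
        if count == "+":
            # '+' means keep retrying at this interval indefinitely.
            yield from repeat(seconds)
            return
        yield from repeat(seconds, int(count))


# The value from the replica configuration: every 60 seconds, forever.
print(list(islice(retry_delays("60 +"), 3)))
# A staged variant: two tries at 60 seconds, then one at 300 seconds.
print(list(retry_delays("60 2 300 1")))
```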
NOTE: Gavin Henry made the following point on November 15, 2006:
If you slapadd an export back into your Master, after adding all the above to your config, then the accesslog contextCSN, won’t be the same as the primary DB or replica, so just make a trivial edit on the master, and then the accesslog db contextCSN will then be back in sync with the primary db.
Caught us out a while ago.
P.S. it will show as “stale syncrepl cookie” or similar in your logs.
Gavin.
