Howard Chu Shares What to Expect with OpenLDAP 2.5
Updated: Aug 13, 2021
Howard Chu is the Chief Technology Officer at Symas, the Chief Architect of the OpenLDAP Project, and an overall amazingly entertaining fiddle player (Google it.)
This spring Howard spoke at FLOSS UK about the upcoming release of OpenLDAP 2.5. Tom Yates compiled a nice article summarizing these highlights, which we’ve included below.
Wish to take these features for a spin without the pain of compiling? Download our free Silver version of Symas OpenLDAP. No registration is necessary.
Written by Tom Yates
If pressed, I will admit to thinking that, if NIS was good enough for Charles Babbage, it’s good enough for me. I am therefore not a huge fan of LDAP; I feel I can detect in it the heavy hand of the ITU, which seems to wish to apply X.500 to everything. Nevertheless, for secure, distributed, multi-platform identity management it’s quite hard to beat. If you decide to run an LDAP server on Unix, one of the major free implementations is slapd, the core engine of the OpenLDAP project. Howard Chu is the chief architect of the project, and spoke at FLOSS 2018 about the upcoming 2.5 release. Any rumors that he might have passed the time while the room filled up by giving a short but nicely rendered fiddle recital are completely true.
OpenLDAP, which will be twenty years old this August, is produced by a core team of three members, and a “random number” of additional contributors. Development has perhaps slowed down a little recently, but they still manage a feature release every 12-18 months, with maintenance releases as needed. OpenLDAP version 2.4, which was first released in 2007, is still the production release; it is theoretically feature-frozen, having had only three releases in the past two years, but the commit rate is still fairly high and fixes, particularly in documentation, continue. Chu noted that despite it being feature-frozen, 2.4.47 will have some minor new features, but this is definitely the last time this will happen and 2.4 is now “absolutely, for-sure, frozen”. Probably.
The big milestone coming up is the production release of version 2.5. New features in 2.5, which were the meat of Chu’s talk, fall into two camps: those that have been merged for the 2.5 release for some time and have matured, and those which are still scattered through various development branches and have yet to be pulled back into the main tree for release. Mature features coming in 2.5 include multiple thread pool queues, streamlined write waiters, offline slapmodify and slapdelete, and support for LDAP transactions in all the primary database backends.
CURRENTLY-MERGED FEATURES FOR 2.5
In all versions through 2.4, there is a single thread pool that allocates worker threads to every operation. Because this allocation is done through a single queue with a single lock, it gets bogged down pretty heavily under large workloads, and it doesn’t scale well to multiple cores. So in 2.5 a configurable number of queues is permitted. In testing, this has produced considerable benefits: a 25% boost in searches per second with the back-mdb backend on a four-core test system.
When Oracle invited the OpenLDAP developers to Oracle’s Dublin office in July 2017, multiple queues were further tested on an M8 system, with 2048 virtual CPUs and 1.5TB of RAM, running a pre-release Solaris 11.3. Initially, with a test database of a million Distinguished Names (DNs, unique entities within LDAP), and a hundred clients each with ten connections, they managed 180,000 searches per second. After tuning, which included increasing the number of thread queues, they hit 930,000 searches per second, at which point they established the Solaris kernel was the new bottleneck. For multi-core servers, said Chu, multiple thread pool queues is a huge feature; his advice is to have one queue per CPU, and if necessary to further increase the number of queues so that you don’t exceed 16 threads per queue.
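Chu's sizing advice reduces to a small piece of arithmetic. The sketch below is only an illustration of that rule of thumb; the function name and the example thread counts are made up for this article, and the result would still need to be entered into slapd's own configuration by hand.

```python
import math

# Ceiling quoted in the talk: no more than 16 threads per queue.
MAX_THREADS_PER_QUEUE = 16


def suggested_queues(cpu_count: int, worker_threads: int) -> int:
    """Start from one queue per CPU, then add queues if that would
    leave any queue serving more than MAX_THREADS_PER_QUEUE threads."""
    if cpu_count < 1 or worker_threads < 1:
        raise ValueError("cpu_count and worker_threads must be positive")
    needed_for_cap = math.ceil(worker_threads / MAX_THREADS_PER_QUEUE)
    return max(cpu_count, needed_for_cap)


# Hypothetical machine sizes, chosen only to exercise both branches.
for cpus, threads in [(4, 16), (4, 128), (32, 256)]:
    print(cpus, "CPUs,", threads, "threads ->", suggested_queues(cpus, threads), "queues")
```

On the four-CPU, 128-thread case the per-queue cap is what drives the answer; on the 32-CPU case the CPU count does.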
Also prior to 2.5, there was a single, central thread that was responsible for calling select() on all socket descriptors, for both reading and writing on the network. That thread becomes a bottleneck in high-throughput situations, with a lot of synchronization overhead. So, in 2.5, each worker thread is responsible for sending messages to its own clients, leaving the central thread to deal with receiving messages from all clients. This eliminates much of that overhead, and improves throughput in environments which mix busy clients with slow ones.
The best place to keep the configuration for a database-driven tool is inside the database, and OpenLDAP does this under the Common Name cn=config. This can, however, give rise to a chicken-and-egg situation: the database won’t start because it needs a configuration change, and you can’t change the configuration because the database is down. The new tools slapmodify and slapdelete allow these changes to be made by direct operations on a down database, and complement the extant slapcat and slapadd.
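As a rough idea of what an offline fix might look like, the sketch below builds an LDIF change record of the kind slapmodify consumes. The DN, attribute value, file name, and configuration directory are hypothetical, and the command line shown is an assumption modelled on slapadd's options; check slapmodify(8) on your own system before relying on it. The helper also assumes plain ASCII values that need no base64 encoding.

```python
def modify_record(dn: str, attribute: str, value: str) -> str:
    """Build an LDIF change record that replaces one attribute value.
    Assumes dn and value are LDIF-safe ASCII (no base64 handling)."""
    return (
        f"dn: {dn}\n"
        "changetype: modify\n"
        f"replace: {attribute}\n"
        f"{attribute}: {value}\n"
    )


# Hypothetical example: raise the size limit of an MDB database
# that refuses to start with its current configuration.
ldif = modify_record("olcDatabase={1}mdb,cn=config", "olcDbMaxSize", "1073741824")

# Assumed invocation (paths and flags are illustrative, not verified here):
# database 0 is conventionally the cn=config database.
command = ["slapmodify", "-F", "/etc/openldap/slapd.d", "-n", "0", "-l", "fix.ldif"]

print(ldif)
print(" ".join(command))
```

The script only prints the record and the command; nothing is executed against a real server.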
RFC 5805 for transaction support in LDAP has been around for some time now. Transaction support in OpenLDAP is complete for the three primary database backends: BDB, HDB, and MDB. There is also an LDAP backend, which essentially turns OpenLDAP into an LDAP proxy server; the project looked at adding transaction support to that backend as well, but Chu said it exposes a shortcoming in the RFC, and that the specification really needs to support two-phase commit if transactions distributed across multiple servers are to be possible.
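For readers who have not met two-phase commit, the toy coordinator below shows the general textbook shape of the protocol: every participant must first vote that it can commit, and only then is anyone told to commit. It is not taken from OpenLDAP or from any proposed extension to RFC 5805; the class and function names are invented for illustration.

```python
from typing import List


class Participant:
    """Stand-in for one server taking part in a distributed transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:
        # Phase 1: promise to commit later, or refuse.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        # Phase 2 (success path): make the change permanent.
        self.state = "committed"

    def rollback(self) -> None:
        # Phase 2 (failure path): discard the prepared change.
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


servers = [Participant("a"), Participant("b", can_commit=False)]
print(two_phase_commit(servers), [s.state for s in servers])
```

Without the separate prepare step, a proxy could find that one server had committed while another had refused, with no way to undo the first; that is the gap Chu described.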
STILL IN THE PIPELINE
As Chu said earlier, there are also new features coming in 2.5 that have not had quite so much testing as those listed above. OpenLDAP’s synchronization and replication engine, syncrepl, does two or more full transactional writes for each incoming modification, and this puts it at a bit of a disadvantage when keeping up with a busy, bursty replication provider. In 2.5, syncrepl does “lazy commit”, where those writes are queued for later injection to the underlying database, which helps it keep up with such a provider.
STARTTLS and SASL interactive bind have been supported by libldap for some time, but they’ve been synchronous functions; as of 2.5, they are supported asynchronously, which Chu expects to make your life a little easier “if you’re using libldap in some other external event loop”. Elliptic-curve cryptography is now supported by OpenLDAP.
There are two new database backends, one called Wired Tiger (the name being a play on the BDB backend engine’s long-term maintainer, Sleepycat Software), the other called asyncmeta, which is an asynchronous version of the back-meta backend. The lload LDAP load-balancer code is being merged into the slapd code base. Many new modules are provided, including ones to support TOTP OATH and Authzid, and overlays such as adremap and usn to support Microsoft schemas.
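The lazy-commit idea mentioned above for syncrepl can be pictured as a small write queue that defers the expensive commit. The sketch below is a generic batching pattern under that assumption, not a description of how syncrepl is implemented; the class name, the callback, and the batch size are all invented for illustration.

```python
from typing import Callable, List


class LazyCommitter:
    """Queue incoming changes and hand them to the store in batches."""

    def __init__(self, commit_batch: Callable[[List[str]], None], max_pending: int = 100) -> None:
        self._commit_batch = commit_batch  # performs one real commit per batch
        self._max_pending = max_pending
        self._pending: List[str] = []

    def write(self, change: str) -> None:
        # Accepting a change is cheap: it is only appended to the queue.
        self._pending.append(change)
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self) -> None:
        # Commit whatever has accumulated; do nothing if the queue is empty.
        if self._pending:
            batch, self._pending = self._pending, []
            self._commit_batch(batch)


committed: List[List[str]] = []
lazy = LazyCommitter(committed.append, max_pending=2)
for change in ["a", "b", "c", "d", "e"]:
    lazy.write(change)
lazy.flush()  # drain the final partial batch
print(committed)
```

The trade-off in any scheme like this is that a burst of changes costs fewer commits, while changes still in the queue are not yet durable.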
Yet another addition for 2.5 is the autoca overlay, which is an automatic generator for both certificates and certification authorities (CAs). When slapd is started with autoca configured, it will look to see if there is a CA and a server certificate configured; if not, they will be automatically generated with appropriate contents. Then, for any user who comes along and does an LDAP search for their own DN, that is for their own user certificate, the certificate will be generated and supplied to them on the fly. Chu is “pretty happy about that one”.
Getting further down into the nuts and bolts, indexing in slapd is currently based on a 32-bit hash; in larger databases Chu is starting to see excessive hash collisions. 2.5 uses 64-bit hashes, which will make false index collisions much less likely.
OpenLDAP has previously had an LDIF parsing library, called libldif, which Chu found many distributions didn’t package (perhaps, he speculated, because they didn’t know it existed). This functionality has now been moved into libldap.
Timestamps have been supported, but only with single-second granularity; as of 2.5, timestamps have microsecond resolution, and time spent in queue and time for execution are both logged. For TLS, under 2.4 the filesystem locations of the keys and certificates were stored in cn=config; as of 2.5, the keys and certificates themselves can be stored inside the database.
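To get a feel for why the move from 32-bit to 64-bit index hashes matters, the standard birthday approximation is enough. The sketch below assumes an idealised, uniformly distributed hash, which is a simplification and says nothing about slapd's actual hash function; the value counts are arbitrary examples.

```python
def expected_colliding_pairs(n_values: int, hash_bits: int) -> float:
    """Birthday approximation: there are n*(n-1)/2 pairs of values, and
    each pair collides with probability 2**-hash_bits under a uniform hash."""
    return n_values * (n_values - 1) / 2 / 2 ** hash_bits


# Arbitrary example sizes for the number of distinct indexed values.
for n in (1_000_000, 10_000_000, 100_000_000):
    print(n, expected_colliding_pairs(n, 32), expected_colliding_pairs(n, 64))
```

Under these assumptions the expected number of colliding pairs grows with the square of the number of values, and widening the hash by 32 bits divides it by 2**32.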
Not everything that was being worked on is ready to ship in 2.5, but it’s useful to know what might be coming shortly after. Logging continues to be a bottleneck for OpenLDAP; Chu describes the glibc syslog() code as “some of the worst I’ve ever read”. The OpenLDAP test server manages about 200,000 queries per second with no logging, but only 21,000 queries per second with stats logging enabled. Chu tried writing a streamlined syslog(), but this only raised throughput to 26,000 queries per second; it was clear that some form of binary logging was needed. Initially, Chu considered writing the stats logs as BER-encoded LDAP packets, but realized he’d have to write another tool to parse those binary packets and make them human-readable. So his current thinking is to write them out in the pcap format and let people use Wireshark to read the log, which is definitely an interesting approach to logging.
Chu intends that the project will grasp the nettle of two-phase commits, which he accepts will mean extending RFC 5805. There is, he feels, no alternative if it is going to support transactions across back-ldap, back-meta, and the like.
As for timescales, Chu suggested in response to an audience question that we should expect a pre-alpha quality 2.5.0 in a couple of months’ time. 2.4 is, as he said, over ten years old; 2.5 is badly needed, and it’ll be good when it gets here.