Reloading signed zones into BIND

Written by Rick van Rein in category: Procedures, Resilience, Technical, Timing


In our signer, we use OpenDNSSEC to construct signatures and BIND as a hidden primary to reveal the outcome to the public authoritative name servers. We found a few interesting problems with this setup that we needed to work around.

As described under idempotence, we regularly upload lists of zones that need signing. These lists may vary over time, so we needed a way of telling BIND about the altered list of zones to publish. OpenDNSSEC 1.1.1 does not support this; varying zone lists will only be supported from version 1.2 on. In 1.1.1 there is, however, a configurable command to reload zones into BIND, and we used this hook to wrap our own script around the plain rndc reload for BIND.

The reason we wanted to respond to OpenDNSSEC’s notifications at the moment a zone has changed is that a BIND configuration that refers to not-yet-existing .signed versions of zones makes BIND unstable: after being halted, it could not be started again until all .signed zones existed, which could have jeopardised the continuous availability of previously signed zones.

Now, when we receive the notification from OpenDNSSEC (that is, when our notification script is run) we scan over all .signed files, write a BIND configuration entry for each of them into a generated zone list, and run rndc reload. A further modification was needed to remove zones; we did that by removing .signed files as soon as their zones disappeared from our uploaded zone lists. All fairly straightforward scripting.
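The zone list generation can be sketched roughly as follows. This is a minimal illustration rather than our actual script: the function name build_zone_list, the file naming scheme (zone name plus ".signed") and the single-line "type master" stanza are assumptions made for the example.

```python
from pathlib import Path


def build_zone_list(signed_dir: Path, zone_list: Path) -> list[str]:
    """Write one BIND zone stanza per *.signed file and return the zone names.

    Hypothetical helper: assumes each signed zone is stored as
    <signed_dir>/<zone name>.signed.
    """
    suffix = ".signed"
    zones = sorted(p.name[: -len(suffix)] for p in signed_dir.glob("*" + suffix))

    lines = []
    for zone in zones:
        path = signed_dir / (zone + suffix)
        lines.append(f'zone "{zone}" {{ type master; file "{path}"; }};\n')

    # Write to a temporary file first and rename it into place, so that
    # BIND never reads a half-written zone list.
    tmp = zone_list.with_name(zone_list.name + ".tmp")
    tmp.write_text("".join(lines))
    tmp.replace(zone_list)
    return zones
```

Because the list is derived only from the .signed files that exist at that moment, BIND is never pointed at a zone file that has not been produced yet, and a zone whose .signed file has been removed simply drops out of the next generated list.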

A more interesting problem occurred when we noticed that BIND would not pick up signed zones in all situations. As it turned out, two consecutive rndc reload commands might lead to the second being ignored. The cause is almost certainly that BIND uses stat() internally to see if a file has changed, by checking its last-modification timestamp. A second change within the same clock second would not be noticed.
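The suspected mechanism is easy to reproduce outside BIND. The sketch below is not BIND's code; seems_changed is a hypothetical stand-in for any change check that compares modification times at whole-second granularity, and the timestamps are set by hand with os.utime so that the outcome does not depend on the wall clock.

```python
import os
import tempfile


def seems_changed(path: str, last_seen: int) -> bool:
    """Change check at whole-second granularity (hypothetical, not BIND's code)."""
    return int(os.stat(path).st_mtime) != last_seen


with tempfile.NamedTemporaryFile(suffix=".signed") as zone:
    os.utime(zone.name, (1000.1, 1000.1))    # first signing run
    last_seen = int(os.stat(zone.name).st_mtime)

    os.utime(zone.name, (1000.9, 1000.9))    # second run, same clock second
    missed = not seems_changed(zone.name, last_seen)

    os.utime(zone.name, (1002.9, 1002.9))    # a run two seconds later
    noticed = seems_changed(zone.name, last_seen)
```

Here missed ends up true for the second write and noticed is true only for the write two seconds later, which matches the behaviour we observed with two reloads in quick succession.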

The solution to this was straightforward once we had found the problem. We surrounded our script with a lock and a waiting time, in such a way that a subsequent zone upload has to wait for 2 seconds (just to be sure) before commencing:

  1. Acquire an exclusive lock for this notification script (wait if needed)
  2. Recreate the zone list for BIND
  3. Notify BIND with rndc reload
  4. Wait for 2 seconds
  5. Release the lock for this notification script
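The five steps above could look roughly like this in Python. It is a sketch under assumptions rather than our production script: locked_reload, its parameters and the lock file path are made up for the example, the lock is an advisory fcntl.flock lock (Unix only), and rebuild stands for whatever function recreates the zone list.

```python
import fcntl
import subprocess
import time


def locked_reload(lock_path, rebuild, reload_cmd=("rndc", "reload"), settle=2.0):
    """Serialise notifier runs: lock, rebuild, reload, wait, unlock."""
    with open(lock_path, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)       # 1. blocks until the lock is free
        try:
            rebuild()                           # 2. recreate the zone list
            subprocess.run(list(reload_cmd), check=True)  # 3. notify BIND
            time.sleep(settle)                  # 4. keep clear of the same clock second
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)    # 5. release for the next caller
```

Holding the lock during the wait is the essential part: a second caller blocks in step 1 until the 2 seconds have passed, so its reload cannot land in the same clock second as the previous one. If the rebuild or the reload fails, the wait is skipped and the lock is still released.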

With this installed, we have not run into any more of these timing problems. Even if future versions of OpenDNSSEC support such facilities around the notification command, the lock will still be useful in cases where multiple sources can run the notifier script; for instance, we run it from the script that takes in the zone list as well as from OpenDNSSEC.