HSM backup considerations

Written by Rick van Rein in category: Architecture, Resilience, Technical, Timing

When you start to support DNSSEC, you are suddenly supposed to manage the keys used to sign the domain. This is a typical task for a security officer. The main concerns are to conceal the private keys from outside-world prying eyes, and to avoid losing keys for as long as the outside world needs them to trust your domain.

The market offers quite a range of technical solutions to manage keys securely, as this is a general cryptographic concern; the most common solutions are:

  • You can store keys on disk on a physically secure machine, possibly with password-based encryption
  • You can store keys on a cryptographic smart card, which is designed to conceal private keys
  • You can store keys on a Hardware Security Module (or HSM), which is a protected machine designed for secret key protection

These solutions vary in price and performance as well as in their level of attained security. Since SURFnet is not just responsible for its own keys but also for its connected institutions’, and because DNSSEC key management can have a direct effect on domain uptime, we have chosen to work with a full-blown HSM. More accurately, we use a pair of HSMs that act as one virtual HSM device in high-availability mode, so if one HSM fails we can replace it while the other takes over all duties.

Cryptographic hardware (as well as software simulations such as SoftHSM, which is developed alongside OpenDNSSEC) is usually accessed through the industry-standard PKCS #11 API. With a redundant HSM solution, the high-availability issues are best resolved beneath that API, so that we see neither the replication mechanisms nor the failure of a single HSM. In a picture:

High-Availability pair of HSMs accessible as one PKCS #11 instance

Image components by OpenClipArt.org

The hidden high-availability facilities mean that we can follow the HSM manufacturer’s instructions for any HSM-related emergency procedures, which saves us a lot of work.
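To make the idea of "one virtual HSM" more concrete, here is a minimal Python sketch of failover hidden behind a single interface. It is a simulation only: the classes `SimulatedHsm` and `HighAvailabilityPair` are hypothetical names invented for this illustration, HMAC stands in for a real signature algorithm, and an actual deployment would rely on the vendor's PKCS #11 library rather than on anything like this.

```python
import hashlib
import hmac


class HsmUnavailable(Exception):
    """Raised when a (simulated) HSM cannot be reached."""


class SimulatedHsm:
    """Hypothetical stand-in for one physical HSM."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self._keys = {}  # label -> secret bytes, never handed out

    def store_key(self, label, secret):
        if not self.online:
            raise HsmUnavailable(self.name)
        self._keys[label] = secret

    def sign(self, label, data):
        if not self.online:
            raise HsmUnavailable(self.name)
        # Assumption: HMAC-SHA256 stands in for a real signature algorithm.
        return hmac.new(self._keys[label], data, hashlib.sha256).hexdigest()


class HighAvailabilityPair:
    """Presents two replicated HSMs as one virtual device."""

    def __init__(self, first, second):
        self._members = [first, second]

    def store_key(self, label, secret):
        # Replicate to every member so both hold the same values.
        for hsm in self._members:
            hsm.store_key(label, secret)

    def sign(self, label, data):
        # Try each member in turn; the caller never sees a single failure.
        for hsm in self._members:
            try:
                return hsm.sign(label, data)
            except HsmUnavailable:
                continue
        raise HsmUnavailable("all members of the pair are down")


hsm_a, hsm_b = SimulatedHsm("hsm-a"), SimulatedHsm("hsm-b")
virtual_hsm = HighAvailabilityPair(hsm_a, hsm_b)
virtual_hsm.store_key("zsk-example", b"not-a-real-key")

record = b"example.org. A 192.0.2.1"
before = virtual_hsm.sign("zsk-example", record)
hsm_a.online = False  # simulate a hardware failure of one HSM
after = virtual_hsm.sign("zsk-example", record)
print(before == after)  # the caller gets the same answer either way
```

The caller only ever talks to `virtual_hsm`, which mirrors how the signing software sees a single PKCS #11 instance regardless of which device answers.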

We have opted for one more extension: a backup made from one of the HSMs. The instant replication between the HSMs mainly covers hardware failure; backups add the ability to recover from operational failures. In the normal situation both HSMs store the same values, so making backups in both locations hardly helps with data safety. However, if keys are backed up before they are first published, there is always a chance of recovering the vital material that makes DNSSEC tick. This can be a great asset when trying to protect the secure chain that DNSSEC builds. The complete picture now becomes:

One of the identical pair of HSMs will be backed up regularly

Image components by OpenClipArt.org
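The rule that keys are backed up before they are first published can be enforced mechanically. The following Python sketch shows one way such a gate might look; `BackupVault` and `KeyPublisher` are hypothetical names made up for this example, and the "wrapped key" is just placeholder bytes rather than the output of any real HSM backup procedure.

```python
class BackupVault:
    """Hypothetical store for key backups taken from one HSM."""

    def __init__(self):
        self._blobs = {}

    def save(self, label, wrapped_key):
        self._blobs[label] = wrapped_key

    def contains(self, label):
        return label in self._blobs


class KeyPublisher:
    """Publishes a key only if a backup of it already exists."""

    def __init__(self, vault):
        self._vault = vault
        self.published = []

    def publish(self, label):
        if not self._vault.contains(label):
            raise RuntimeError(f"refusing to publish {label!r}: no backup yet")
        self.published.append(label)


vault = BackupVault()
publisher = KeyPublisher(vault)

try:
    publisher.publish("ksk-example")  # rejected: nothing backed up yet
except RuntimeError as error:
    print(error)

vault.save("ksk-example", b"placeholder-wrapped-key")
publisher.publish("ksk-example")  # accepted now that a backup exists
print(publisher.published)
```

Ordering the steps this way means that any key the outside world has come to rely on can, in principle, be recovered after an operational mistake.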