DNSSEC at Tilburg University

Written by Roland van Rijswijk in category: Users

This is the English translation of a blog posting by Casper Gielen on the SURFnet innovation blog

Tilburg University activates DNSSEC

Tilburg University (TiU) completed the introduction of DNSSEC in August, making it the first university in the Netherlands to use DNSSEC on a large scale. In February 2011, we began validating incoming DNSSEC information, and we now also publish that information ourselves for our own zones.

Architecture

The system at TiU is based on OpenDNSSEC, NSD, and Unbound. NSD and Unbound were already being used, as the authoritative DNS server and the DNS resolver respectively. An effort has been made to minimise changes to the existing infrastructure and interfaces.

Our administrators edit the zone file on their own workstations, copy it to the DNSSEC server, and signal OpenDNSSEC. OpenDNSSEC signs the zone, after which the signed zone is loaded into NSD. This NSD instance runs on the DNSSEC machine and acts as a hidden master; the real masters are updated via NOTIFY/AXFR. The advantage of this setup is that the front-end DNS servers are relieved, as far as possible, from having to read and process enormous zone files and can focus entirely on delivering DNS responses as fast as possible.
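The copy-and-signal step described above could be scripted roughly as follows. This is only a sketch, not the script used at TiU: the host name, the file names, and the directory for unsigned zones are assumptions (the directory depends on how OpenDNSSEC was installed), and the code only builds the commands rather than running them. `ods-signer sign <zone>` is the OpenDNSSEC command that asks the signer to pick up and sign a zone.

```python
import shlex
from pathlib import PurePosixPath


def build_publish_commands(zone, local_zone_file, signer_host,
                           unsigned_dir="/var/opendnssec/unsigned"):
    """Return the two commands an administrator would run to publish a zone.

    unsigned_dir is an assumed location; check the OpenDNSSEC zone list
    on the signer for the real input path of each zone.
    """
    remote_path = PurePosixPath(unsigned_dir) / zone
    return [
        # 1. Copy the edited zone file to the DNSSEC (signer) machine.
        ["scp", local_zone_file, f"{signer_host}:{remote_path}"],
        # 2. Signal OpenDNSSEC that the zone has to be signed again.
        ["ssh", signer_host, "ods-signer", "sign", zone],
    ]


if __name__ == "__main__":
    # Hypothetical zone and host names, for illustration only.
    for command in build_publish_commands(
            "example.nl", "example.nl.zone", "signer.example.org"):
        print(shlex.join(command))
```

Loading the signed zone into NSD and notifying the real masters happens after signing and is left out of this sketch.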

OpenDNSSEC

Although DNSSEC is itself already a reasonably complex protocol, the real challenge is in management. Steps have to be taken with great regularity to refresh keys and signatures. If you don’t keep track of this for a few days, your domain may disappear off the Internet. And if you have several hundred domains, you can no longer keep track manually.

OpenDNSSEC automates this process and looks after the creation and timely refreshing of keys and signatures.

[Figure: Architecture at Tilburg University]

HSM

The keys needed for DNSSEC must be stored securely. Dedicated hardware is available for this, namely “Hardware Security Modules” (HSMs), which come in all price classes. OpenDNSSEC communicates directly with the HSM. TiU has decided not to purchase an HSM. Instead, we are using a software implementation of an HSM. This provides enough security for us, and greater speed, flexibility, and convenience.

Monitoring

DNSSEC is not something you set up once, in a single place, and then forget. Because a lot of things can go wrong, you need to take measurements at various points to see whether everything is still going well. For that purpose, we have various Nagios checks to determine, for example, whether a zone has been signed correctly and whether the signatures are at risk of expiring. Something that still works this week may have expired next week; this built-in expiry is perhaps the main difference from the traditional DNS system.
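As an illustration of the kind of check mentioned above, the following sketch decides whether signatures are at risk of expiring. It is not one of the TiU checks. It assumes the RRSIG expiration fields have already been collected (for example from the output of a DNS query) in the usual YYYYMMDDHHMMSS text form, and the warning and critical thresholds are arbitrary example values. The return codes 0, 1, and 2 are the standard Nagios plugin codes for OK, WARNING, and CRITICAL.

```python
from datetime import datetime, timedelta, timezone

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def parse_rrsig_time(stamp):
    """Parse the YYYYMMDDHHMMSS form of an RRSIG timestamp (always UTC)."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def check_expiry(expirations, now,
                 warn=timedelta(days=5), crit=timedelta(days=2)):
    """Return (exit_code, message) based on the signature that expires first."""
    soonest = min(parse_rrsig_time(stamp) for stamp in expirations)
    remaining = soonest - now
    if remaining <= crit:
        code, label = CRITICAL, "CRITICAL"
    elif remaining <= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    return code, f"{label}: earliest signature expires in {remaining}"


if __name__ == "__main__":
    # Made-up expiration values, for illustration only.
    now = datetime(2011, 9, 1, tzinfo=timezone.utc)
    code, message = check_expiry(["20110915000000", "20110904000000"], now)
    print(code, message)
```

A real check would also verify that the signatures validate, since a signature can be unexpired and still wrong.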

Day-to-day management

DNS has become significantly more complicated: where two text files and a simple daemon used to be enough, you now need to set up a whole chain. Adding and removing zones has become quite a bit more complicated because there are a number of steps that need to be carried out at a number of different places and in the right order. The NSD configuration can only be altered, for example, after the zone has been signed. This is not a fundamental problem and it can be solved with a bit of script work, but it is something that we regularly need to deal with. Proper monitoring is indispensable.

You need to have constant access to all your registrars because you regularly need to upload a new DS record. We discovered that there were a lot of domains that we couldn’t access (or couldn’t access any longer). There are also domains that have been acquired by third parties without us – the central IT department – being informed. The DNSSEC project has led to more attention being paid to this, and we now have a clearer view of these domains.

Malfunctions and problems

Solving DNSSEC problems takes a lot of time at first. You should set the validity of the signatures so that you have enough time to respond to problems. You also need to take account of weekends and holidays.
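The arithmetic behind this advice can be sketched as follows. The model is a simplifying assumption: the signer is taken to replace every signature that has less than a `refresh` period left, so that if the signer stops working, the oldest signature still has roughly `refresh` of validity remaining. All names and numbers are illustrative, not TiU's actual settings.

```python
from datetime import timedelta


def response_window(refresh, detection_delay):
    """Time left to repair a stalled signer once the problem has been noticed."""
    return refresh - detection_delay


def covers_absence(refresh, detection_delay, longest_absence, repair_time):
    """True if signatures outlive the longest unattended period plus the repair."""
    return response_window(refresh, detection_delay) >= longest_absence + repair_time


if __name__ == "__main__":
    long_weekend = timedelta(days=4)   # e.g. a weekend plus public holidays
    detection = timedelta(hours=1)     # monitoring interval
    repair = timedelta(hours=4)        # time needed to fix the signer
    for refresh in (timedelta(days=3), timedelta(days=7)):
        ok = covers_absence(refresh, detection, long_weekend, repair)
        print(f"refresh={refresh.days} days: {'enough' if ok else 'too short'}")
```

With these example numbers, a three-day margin does not survive a four-day holiday weekend, while a seven-day margin does.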

Most DNS administrators have hardly any experience of DNSSEC and if there are any problems it can be hard to convince them that something is wrong. In that situation, a diagram with a big red arrow indicating the problem can be a powerful argument! http://www.dnsviz.net can produce diagrams for you (for example http://dnsviz.net/d/www.dnssec-failed.org/dnssec/).

If DNSSEC goes wrong, it’s impossible for most people to determine what the problem is. You don’t get any error message other than that certain domains cannot be found. As an end-user, it’s therefore no easy matter to work around the problem (as you can do with incorrect SSL certificates). Don’t wait too long to familiarise yourself with DNSSEC; at the moment, you can still make mistakes without it having any immediate dramatic consequences.

DNS zones are becoming a lot larger (typically 20 times larger) and are being altered more frequently (several times a week). This significantly increases the load on the DNS servers, which in turn exposes new bugs.

The future

Uploading the DS records is still work that has to be done manually. There are ways to automate this, for example with the EPP protocol, but these differ from one provider to another. From the security perspective, there’s in fact something to be said for keeping human control part of the process. At the moment, we have precisely 200 domains that support DNSSEC and 65 domains that don’t do so for various technical reasons. There are still a lot of TLDs and registrars that don’t support DNSSEC. As soon as this becomes possible, we will also provide these domains with DNSSEC.
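To make concrete what is being uploaded: a DS record is a digest of the zone's key-signing DNSKEY, published in the parent zone. The sketch below derives a SHA-256 DS record (digest type 2) and the key tag as described in RFC 4034, using only the Python standard library. The zone name and the public key are dummy values, not a real key, and in practice the DS record would normally be exported by the signing software rather than computed by hand.

```python
import base64
import hashlib
import struct


def name_to_wire(name):
    """Encode a domain name in lowercase DNS wire format."""
    wire = b""
    for label in name.rstrip(".").lower().split("."):
        if label:
            raw = label.encode("ascii")
            wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def dnskey_rdata(flags, protocol, algorithm, public_key_b64):
    """Build DNSKEY RDATA: flags, protocol, algorithm, then the public key."""
    return struct.pack("!HBB", flags, protocol, algorithm) + base64.b64decode(public_key_b64)


def key_tag(rdata):
    """Key tag calculation from RFC 4034, Appendix B."""
    total = 0
    for index, byte in enumerate(rdata):
        total += byte if index & 1 else byte << 8
    total += (total >> 16) & 0xFFFF
    return total & 0xFFFF


def ds_record(owner, flags, protocol, algorithm, public_key_b64):
    """Return a DS record in text form, using SHA-256 (digest type 2)."""
    rdata = dnskey_rdata(flags, protocol, algorithm, public_key_b64)
    digest = hashlib.sha256(name_to_wire(owner) + rdata).hexdigest().upper()
    return f"{owner} IN DS {key_tag(rdata)} {algorithm} 2 {digest}"


if __name__ == "__main__":
    # Dummy key material ("AQAB"), for illustration only.
    print(ds_record("example.nl.", 257, 3, 8, "AQAB"))
```

The text line this produces is what has to be handed to the registrar, which explains why every key rollover requires access to the registrar.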

Besides the central DNS resolvers, there are also a number of departments and affiliated organisations that run their own DNS resolvers and that will also be configured as DNSSEC validators in the coming period.

At the moment, we have a single OpenDNSSEC server in production. If it were to go down, we could, under normal circumstances, quickly construct a new one and restore our backups. If there were a bigger disaster, for example a fire, we would probably not have the time. We therefore want to have a reserve machine standing by at our alternative location.

Casper Gielen is ICT Manager for Library and IT Services UNIX at Tilburg University.
