Of monocultures and bottlenecks
by Andy Oram
Related link: http://news.com.com/2100-1029-5138447.html
There's some kind of lesson for the computer industry lurking in this
story, but darned if I can figure it out.
Essentially, a bottleneck or single point of failure at one major site
(VeriSign) triggered a second bottleneck elsewhere (the millions of
Norton anti-virus products installed on people's computers) and led to
a massive denial of service.
There seems to be an issue with monocultures (the popularity of
VeriSign, although I don't think they should be blamed for their own
popularity) and with the centralized architecture of certificate
authorities as a technology. I don't suppose better caching would
work, because you can't cache verification. You need to be verified by
an authoritative site.
Can't really agree that this is a monoculture issue
Tempting as it is to draw a parallel to monoculture here, I don't think the parallel is valid. Norton AntiVirus has some function that depends on validating a certificate. Either it connects to a single Symantec server that has a VeriSign certificate, or it connects to many different servers, many of which happen to have VeriSign certificates. Yes, VeriSign is the most popular CA, but the expired certificate could have come from any CA. There are also many anti-virus products out there, so Norton, though very popular, isn't a monoculture either. It seems to me that any widely distributed software package that makes secure connections and is (properly) designed to check the CA's revocation list could overload any given CA's infrastructure. The fault appears to be VeriSign's lack of capacity for a peak load generated by a predictable event, but would any other CA's infrastructure really have been more robust?
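To put rough numbers on the peak-load point, here is a minimal back-of-envelope sketch. The client count and time windows are hypothetical figures chosen only for illustration, not measurements from this incident, and `request_rate` is a made-up helper name. It compares the average request rate a revocation-list server would see when the same number of clients check in over a full day versus within a few minutes of a shared trigger such as a certificate expiring.

```python
def request_rate(clients: int, checks_per_client: int, window_seconds: float) -> float:
    """Average requests per second if every client issues its checks
    somewhere inside the same time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return clients * checks_per_client / window_seconds


# Hypothetical installed base; not a real figure for any product.
CLIENTS = 1_000_000

# Same total work, spread evenly over a day ...
spread_over_a_day = request_rate(CLIENTS, 1, 24 * 60 * 60)

# ... versus squeezed into ten minutes after a shared trigger.
burst_in_ten_minutes = request_rate(CLIENTS, 1, 10 * 60)

print(f"spread over a day:    {spread_over_a_day:.1f} req/s")
print(f"burst in ten minutes: {burst_in_ten_minutes:.1f} req/s")
print(f"ratio:                {burst_in_ten_minutes / spread_over_a_day:.0f}x")
```

The total number of requests is identical in both cases. Only the window changes, which is why a server sized for ordinary traffic can be swamped by an event that synchronizes its clients.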