Have you ever built a system so secure that you worry about not being able to access it in case something breaks? Or have you ever decided to use a little less security to make maintenance a little easier? We’ve faced this problem and want to share part of our solution.
Two years ago we set out to create a secure computing and storage environment where we would be able to handle credit and debit card data in accordance with the highest levels required by the payment card industry (PCI). Now, creating such an environment requires several person-years of work, including help from outside experts. Maintaining it, on the other hand, should be a lot simpler. If built well, the environment should require absolutely minimal care once it is up and running. And since the needed care is minimal, we can restrict the powers of the administrators (the only people with access) so that they can only maintain the system but not dismantle it or, say, download all the credit card numbers.
We realise that we cannot foresee everything, so there is also a super user account with full privileges. It will only be used if we need to raise the privileges of the day-to-day admins (hopefully never) or when we want to do a major overhaul (in a few years, perhaps). Only one person, Hermione, has the keys to this account. Hermione is a technical co-founder, which means she has a larger-than-average incentive not to abuse her access. She also has limited knowledge of what is inside the environment, so she would not immediately know how to abuse it. She is still tech savvy enough to be of help in an emergency.
So we have a sensitive environment with limited access for those who maintain it and a backup plan for full access. That’s great, right? Well, what if Hermione disappears for a longer or shorter period (or forever!), voluntarily or involuntarily? That brings us to the age-old concept of bus number:
How many people on your project can be hit by a bus before the project fails?
Any project manager worth their salt needs to know the answer to this question, and they need to keep that number high. Is there only one person who can update the database schema? Bad project manager. Are there only two people working on the central algorithm, and are they friends (i.e., one might be recruited away and bring the other one along)? Bad project manager. The concept can be extended to companies too: How many people in your company can be hit by a bus before the company fails? Are there only two people who know how to add new companies to your multi-tenant system? Bad COO. Is there only one person who can open up the secret portal to allow further work on a vital system? Bad CTO.
So what do we do? Let’s work something out.
The first, intuitive option is to let several people share the super user account. But our general security policy states that all accounts should be personal and that all accounts should use multi-factor authentication (typically a normal password plus a temporary code generated by an app on the user’s smartphone), which by design is hard to share. So we reject this option (but will reconsider if it turns out that the other solutions have bigger problems).
The second option is to create another super user account for our colleague Severus. Hermione says no. She is prepared to carry the master key to the system, along with the obvious risk of being accused of wrongdoing if our credit card data leaks, but only if that risk is negligible. Giving Severus the same powers means that he might steal data and leave Hermione with the blame. And if he is smart, he might set things up so that more traces point towards Hermione than towards himself. Hermione thinks the risk is minute, but it’s a risk nonetheless, and she is not interested in taking it. (Severus might react the same way if asked, of course.)
For us as a company, this is a delicate situation. We could offer to pay Hermione to take this extra risk, but then both we and she must put a price on a colleague’s honour, and that may damage office relations for a long time. We would of course need to offer Severus a similar deal, and we would need to find a way to negotiate these separate contracts. On top of that, a key bearer might leave the company or just want to drop this extra responsibility, at which point we would have to restart the process. There ought to be a better solution.
Seeing that super user accounts are good but equal access is bad, we come up with a third option: We’ll create another super user account and split the password between Severus, Luna, Albus and Minerva. To access the account, they will need to line up alphabetically and enter 8 characters each for a full password of 32 characters.
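To make the mechanics of this third option concrete, here is a minimal sketch in Python. The function names `split_password` and `join_password` and the randomly generated alphanumeric password are assumptions made for illustration only. The text only says that each person holds 8 characters and that the pieces are entered in alphabetical order.

```python
import secrets
import string

KEEPERS = ["Severus", "Luna", "Albus", "Minerva"]
CHUNK = 8  # characters per person, as described in the text


def split_password(password, keepers):
    """Hand out consecutive 8-character pieces in alphabetical order of name."""
    if len(password) != CHUNK * len(keepers):
        raise ValueError("password length must be 8 characters per keeper")
    ordered = sorted(keepers)
    return {
        name: password[i * CHUNK:(i + 1) * CHUNK]
        for i, name in enumerate(ordered)
    }


def join_password(pieces, keepers):
    """Rebuild the password; fails if even one keeper's piece is missing."""
    ordered = sorted(keepers)
    missing = [name for name in ordered if name not in pieces]
    if missing:
        raise KeyError("missing pieces from: " + ", ".join(missing))
    return "".join(pieces[name] for name in ordered)


# Hypothetical 32-character password, generated here just for the demo.
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(CHUNK * len(KEEPERS)))
pieces = split_password(password, KEEPERS)
```

Note that `join_password` needs every single piece: with one keeper out of reach, the other three hold 24 of the 32 characters but cannot log in.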
This makes Hermione happy again, but we involve 4 more people and only manage to raise the bus number by 1: since all four pieces are needed, losing any one of the four disables the backup account. Also, if Hermione is unavailable and we need immediate access to the super powers, we need to wake 4 people in the middle of the night. Can we do better?
It turns out that we can, but it’s probably not something we’d have found through more iterations. It was pure luck that I had run across a neat tool earlier: Shamir’s Secret Sharing. This little piece of mathematics lets us split a secret between n persons such that any k of them (k ≤ n) can recover it, while fewer than k learn nothing about it. I’m not going to explain how it works (but do read the Wikipedia article!), only note that with n = 4 and k = 2, the scheme fails only if Hermione and 3 out of the 4 secondary confidants are all unavailable. Hence the Harry Potter-esque title of this article: Bus Number 1¾.
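Without going into the mathematics, the mechanics can at least be sketched in a few lines of Python. This is an illustration under stated assumptions, not necessarily what our final solution looks like. The secret is treated as an integer smaller than a fixed prime (2^127 − 1, picked here only because it is a known prime), and the function names `make_shares` and `recover` are made up for this example. It relies on `pow(x, -1, p)` for modular inverses, which needs Python 3.8 or later. For anything real, use a vetted library rather than hand-rolled code.

```python
import secrets

PRIME = 2**127 - 1  # assumed field size; the secret must be smaller than this


def make_shares(secret, k, n):
    """Split `secret` into n shares so that any k of them can recover it."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            acc = (acc * x + c) % PRIME
        return acc

    # Each share is a point (x, f(x)) with a distinct non-zero x.
    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# The n = 4, k = 2 example from the text, with a random stand-in secret.
secret = secrets.randbelow(PRIME)
names = ["Albus", "Luna", "Minerva", "Severus"]
shares = dict(zip(names, make_shares(secret, k=2, n=4)))

# Any two confidants are enough, whichever two happen to be reachable.
assert recover([shares["Luna"], shares["Severus"]]) == secret
```

With k = 2 the polynomial is a straight line: two points pin it down, while a single point is consistent with every possible secret. That is why one share on its own gives nothing away.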
To sum up: We now have a scheme that gives us full emergency access using one primary key keeper, with a group of secondary key keepers as a backup. We have found values of k and n that work for the company as well as for Hermione (incidentally, not 2 and 4). In the next blog post we are going to look at how we implemented it, and I hope you’ll find that the final solution is surprisingly simple.