TLDR: In a large organization, modern key management is shaped mostly by bureaucratic procedures and compliance requirements, because financial liability is at stake. No single person holds all the keys required for a task. To minimize the need for trust in day-to-day operations, the problem is partially solved with three basic principles - separation of duties, dual control and split knowledge. This is a blog post about the people, procedures and devices, not about commercial offerings or concrete software solutions.
What is cryptographic key management
When two parties decide to rely on secure encrypted communication, they have to solve the problem of key distribution. Multiple solutions to this problem are currently in use; what they all have in common is a trust-building initialization phase. Every time secure communication is bootstrapped for the first time, whether between a payment processing center and a POS terminal in a mall or between an operating system and a software update server, it is this initial bootstrapping that makes mutual trust possible. As probably everybody reading this already knows, there are two basic types of cryptography: symmetric, and asymmetric, so-called public key cryptography.
To exchange symmetric keys over a wire securely without using public key cryptography, go use Merkle's Puzzles. After the first audit, come back and read this till the end :) (just a joke...)
In the case of asymmetric keys, Public-Key Infrastructure (PKI) is one possible mechanism for solving key distribution. It has its limitations, but it can help us solve the trust establishment problem. PKI does not say anything about how to actually handle the secret keys, and this article is for curious people who want to know how keys are generated, how they should be stored, how encryption and decryption are done, and how cryptographic keys are decommissioned.
The life cycle of cryptographic key material is referred to as key management. Key management covers all the technical and social actions that make creating and using cryptographic keys secure. As usual, such a system consists of people, processes, software and hardware appliances.
Key management is routinely practiced by payment transaction processing entities, certification authorities and the military.
Personnel involved in day-to-day operations
In practice, you can divide the people in key management into two categories. The first works with the key infrastructure daily: these are the people responsible for operations, the key managers, and they are probably part of the security team. The second category is the key custodian, a person responsible for protecting key shares. I would say that people in the second category usually do not know what they are doing and have nearly zero knowledge of cryptography. They are guided by the key managers, even if the key management procedures say otherwise. Probably the only exception is full-time key custodians.
Every key custodian is assigned to a group, and each group is responsible for one allocated key component. Members of the same group can back each other up.
There are some prerequisites when choosing key custodians. Most importantly, a custodian:
- must be a full-time employee
- must not have a direct reporting relationship with a member of another custodian group
- must not be, now or in the past, a member of a different group that shared the key components for the same cryptographic key
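As a rough illustration, the selection rules above can be expressed as a small check over the custodian groups for one key. This is only a sketch in Python: the Employee record, the custodian_violations function and the group labels are hypothetical names of mine, and membership history (the "in the past" part of the last rule) is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Employee:
    name: str
    full_time: bool
    manager: Optional[str] = None  # name of the direct manager, if any


def custodian_violations(groups: Dict[str, List[Employee]]) -> List[str]:
    """Return rule violations for the custodian groups of a single key.

    `groups` maps a key component label to the custodians of that component.
    """
    problems = []
    group_of = {}  # employee name -> first component group they were seen in
    for label, members in groups.items():
        for person in members:
            if not person.full_time:
                problems.append(f"{person.name} is not a full-time employee")
            if person.name in group_of and group_of[person.name] != label:
                problems.append(
                    f"{person.name} is in both {group_of[person.name]} and {label}"
                )
            group_of.setdefault(person.name, label)
    # Second pass: a direct reporting line must not cross group boundaries.
    for label, members in groups.items():
        for person in members:
            boss_group = group_of.get(person.manager)
            if boss_group is not None and boss_group != label:
                problems.append(
                    f"{person.name} reports to {person.manager} in {boss_group}"
                )
    return problems
```

An empty result means the proposed groups pass these three checks; it says nothing about the other vetting a real organization would do.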
The same selection requirements apply to the key managers.
In addition:
- they must not have full administration privileges over the Hardware Security Modules they operate
- separation of duties and dual control must be applied to all operations that involve keys and cryptographic hardware
The day-to-day activities of a key custodian include participating in the generation and loading of cryptographic keys, storing and retrieving key components from a safe, and sending key material to other communicating parties. Key custodians and key managers must both support annual and ad-hoc audits.
Some organizations have full-time key custodians who are able to do most of the work themselves; other companies just use arbitrary employees from multiple departments (because of compliance requirements). Often, the actual work besides HSM smart card authorization, key loading or component handling is done by the key management team, which tells the custodians what to do.
Three rules of key management
Separation of Duties
The concept of separation of duties is a fundamental access control principle in cryptographic key management. Separation of duties prevents knowledgeable insiders from committing fraud, e.g. in finance, from stealing cryptographic keys and reissuing credit cards, sending valid transactions using data stolen from the company, or deriving PINs with stolen key material. Separation of duties is an invisible border between the people who develop systems that process confidential information, the people who use those systems, and the people who take care of the system security itself. The developer can't decrypt the data processed by her application. A third party responsible for routing point-of-sale transactions is not able to create fraudulent transactions. The cryptographic key management team does not have access to the assets it is protecting, and usually its members are not able to steal the keys they work with, because they deal solely with cryptograms - already encrypted keys - and have neither knowledge of nor access to the rest of the infrastructure or the clear keys.
Dual Control
In the payment sector, and I believe in other industries as well, there is a strict requirement to access the high security area (HSA) under dual control. Besides the internal procedures that should require this, it must be enforced by physical security. A high security mantrap ensures that only a single authorized person can pass through at a time after authentication, and dual access to the room is enforced because the physical access system counts the people in the room. An alarm is triggered when the (at least) dual control requirement is not met. There must be two different locks installed on a rack in the data center, so that two people have to bring the different keys in their custody in order to open it. Electronic key control systems ("keywatchers") are used for storing the physical keys when they are not in use.
Dual control should also be enforced by the technology itself, e.g. the computer connected to the Hardware Security Module (HSM) can only be accessed by two people together. When you want to manipulate cryptographic material, you have to use at least two smart cards to put the HSM into the authorized state. Only when you are authorized against the internal symmetric Key Encryption Key (KEK) are you able to work with the data protected by it. If you want to change settings on payment industry HSM devices, you also need two physical keys. Not to mention that most of the time, when you manipulate settings or firmware, the internal storage is erased and you have to load the Local Master Key (LMK) again. And when you want to log in to the Windows machine with PuTTY installed in order to be able to issue commands to the HSM? Guess how split-password authentication is done in practice on Windows machines...
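To make the "at least two smart cards" idea concrete, here is a toy model in Python. It is not how any real HSM is implemented; DualControlSession, present_card and the command string are hypothetical names used only to show that one custodian alone, even presenting a card twice, cannot reach the authorized state.

```python
class DualControlSession:
    """Toy model of dual control: commands run only after enough distinct cards."""

    def __init__(self, required_cards: int = 2) -> None:
        self.required_cards = required_cards
        self._card_holders = set()  # distinct custodians who presented a card

    def present_card(self, custodian: str) -> None:
        # A set is used so the same custodian never counts twice.
        self._card_holders.add(custodian)

    @property
    def authorized(self) -> bool:
        return len(self._card_holders) >= self.required_cards

    def run(self, command: str) -> str:
        if not self.authorized:
            raise PermissionError(
                f"{len(self._card_holders)} of {self.required_cards} cards presented"
            )
        return f"executed: {command}"


session = DualControlSession()
session.present_card("custodian-a")
session.present_card("custodian-a")  # same person again, still one card holder
print(session.authorized)            # False
session.present_card("custodian-b")
print(session.run("generate-key"))
```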
Split Knowledge
From the cryptographic key management perspective, split knowledge means that no single person knows the complete value of an encryption key.
Split knowledge usually comes into play when the Key Encryption Keys (KEK) are generated. KEKs are used as master keys, protecting the other keys used by the system that implements key management. We want to have KEKs stored securely under dual control, but at the same time we must be able to recover the key at any time. A key should therefore exist only in one of the following forms:
- as a cryptogram encrypted by another key (e.g. by a KEK),
- in clear form loaded inside a tamper-resistant security module (e.g. an HSM device, smart card or secure token),
- divided into multiple clear components, distributed and stored under split knowledge control (multiple key custodians with safe boxes).
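The third form, clear components, is commonly built with XOR: every component looks like random noise on its own, and only all of them together reproduce the key. Below is a minimal Python sketch of that idea. The function names and the 16-byte key length are my assumptions, and in a real deployment this happens inside the HSM, never in a script.

```python
import secrets
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("components must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, count: int = 3) -> List[bytes]:
    """Split `key` into `count` components whose XOR equals the key."""
    if count < 2:
        raise ValueError("split knowledge needs at least two components")
    # count - 1 components are pure randomness ...
    randoms = [secrets.token_bytes(len(key)) for _ in range(count - 1)]
    # ... and the last one is the key XORed with all of them.
    last = reduce(xor_bytes, randoms, key)
    return randoms + [last]


def combine_components(components: List[bytes]) -> bytes:
    """Recover the key; every single component is required."""
    return reduce(xor_bytes, components)


key = secrets.token_bytes(16)
components = split_key(key)
assert combine_components(components) == key
```

Note that this is an all-or-nothing scheme: losing one component loses the key, which is why each custodian group has members who can back each other up.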
For example, if you want to initialize an encrypted connection between two parties and exchange card-related keys that can later be used for encrypting cardholder data in the banking sector, you will usually come across two keys with completely different split knowledge life cycles. The first is the Local Master Key (LMK), which resides in the HSM device and is split across multiple smart cards delegated to the key custodians. This key is a master KEK used to protect data passing through the HSM, usually other keys. The second key is the Zone Control Master Key (ZCMK). The ZCMK is also a key encryption key, but its purpose is transit protection, or better, protection of data transmitted over the wire. As the ZCMK is a symmetric key, both sides of a communication channel must have the same key. I will describe later how the ZCMK is actually exchanged without using public key cryptography.
Hardware Security Modules
None of the above-mentioned access control principles would make things significantly more secure if the actual private keys ended up in unprotected computer memory or on an ordinary hard drive. That is where hardware security modules (HSMs) come to the rescue.
The HSM is a physically secure cryptographic device which provides the cryptographic functions required to implement key management - true random number generation, real-time encryption and decryption of data, and key import, export and wrapping. The HSM is made physically secure with locks, electronic switches and tamper-detection circuits, which can erase the internal HSM memory that holds the keys during normal operation. The main purposes of the HSM are cryptographic algorithm acceleration and key protection. FIPS 140-2, a computer security standard issued by the U.S. National Institute of Standards and Technology (NIST), is used to evaluate and approve cryptographic modules that require certification for compliance reasons. There are lots of devices on the module validation waiting list.
The most expensive FIPS 140-2 Level 4 HSMs can detect heating, drilling, and probably other activities that can lead to key extraction or fault injection attacks. The internals of those HSMs are filled with non-transparent epoxy to make tampering even harder.
Thales PayShield HSM internals protected by epoxy
Thales nShield Connect general purpose HSM with the smart card inserted
We can distinguish between two types of HSMs, based on the functionality they provide:
General purpose HSMs - These are the PKCS #11 enabled devices that provide you with the "Cryptoki" API. After some initialization, you can use software like p11tool, the OpenSSL pkcs11 engine or Microsoft CAPI to work with the keys. Among the enterprise devices I have seen in the wild, the most used were the Thales nShield Connect, the SafeNet Luna family, the IBM PCIe Crypto Card, the Utimaco SecurityServer and others.
Special purpose HSMs - This category contains Payment Card Industry (PCI) certified HSMs like the Thales PayShield series, SafeNet Payment HSM and Utimaco PaymentServer, and special purpose devices like the A98 ATM Key Management System by Trusted Security Solutions, used for automated remote key loading, among others. Special purpose devices offer specific functionality for a given industry, whether it is finance, the military or others. For example, Thales PayShield functionality includes PIN block translation from encryption under one key to encryption under another, so the PIN block can flow quickly between multiple organizations - bank, Visa, MasterCard - until it reaches the HSM of the issuer, where the PIN is decrypted and verified to allow or deny the transaction.
Of course, this separation of functionality is mainly forced by the vendors. The secure cryptoprocessor could be general purpose in both cases, but the enterprise license and the OS make some functionality inaccessible. Until you buy an additional license.
If you want to read an excellent, though maybe outdated, critique of HSMs, definitely read Why I don't like smartcards, HSMs, YubiKeys, etc. by Hugo Landau.
If you are a fan of the Mr. Robot series, you probably saw the great eps3.4_runtime-err0r.r00 episode, where Angela copied the E-Corp signing keys with the ultimate help of the Dark Army. The device Angela made a backup of is actually a SafeNet Luna general purpose HSM, and the steps she took were really plausible. The only thing I would argue with in a real-world scenario is her knowledge of the PIN for the red USB smart card she used for authorization. It would be a nontrivial thing to obtain, but security engineers are just humans, right?
Actually, the technical consultant for Mr. Robot, Ryan Kazanciyan, explained the whole background of this scene in a blog post.
Portia Doubleday typing the PIN to the Safenet Luna PIN Entry Device
Safenet Luna PIN Entry Device (PED)
Each enterprise has its own security zone, defined by the master key installed within the HSM. This master key is called the Local Master Key (LMK) and is used as the primary key encryption key, protecting all the other cryptograms and data at rest. The LMK is stored on multiple smart cards, classic or USB-enabled. At minimum, three key custodians are required during the LMK generation ceremony, as well as two operators with access to the HSM physical keys.
If you are interested in secure key ceremonies, I strongly recommend checking out IANA's DNSSEC Root Key Signing Key Ceremony. It publishes the exact agenda, audit logs, operating system images, complete HSM lifecycle procedures, and even ceremony camera footage.
An example is better than a precept, so I will explain how a key exchange is done in the financial industry between two parties that want to exchange payment data, such as credit card encryption keys, cardholder data and transactions. If two organizations want to establish an encrypted channel to exchange payment-related data, they have to agree on a PCI compliant key exchange procedure. The transport key between those two parties, bridging two distinct security zones, is called the Zone Control Master Key (ZCMK).
Before the key exchange takes place, the organizations need to exchange information about the personnel involved in the exchange procedure. The form they exchange is called the key custodian list. It consists of the names, addresses, emails and phone numbers of at least 3 employees - the key custodians. Once the organizations have exchanged key custodian lists, they have to negotiate key component delivery dates and also which custodian will send a component to whom. After this, the key ceremony is scheduled.
Transport Key Generation
Three key custodians and two key managers come to the high security area where the HSM rack is situated. The rack is opened with two rack keys, which are usually held by the key managers. The key managers then inspect the cables and devices, looking for evidence of tampering. If everything seems OK, they log in to the computer's operating system using a split passphrase. The software used for communicating with the HSM, e.g. PuTTY or some commercial software for sending text commands over a serial / TCP port, is started and the commands required for generating and printing key components are issued. Three different key components are printed into blinded paper envelopes - PIN mailers.
Each custodian takes one key component and stores it in a tamper evident envelope (TEE). The ZCMK is not only printed as multiple components; it is also formed inside the HSM, usually using XOR, and the result is displayed as a cryptogram encrypted under the LMK. A key manager or custodian can copy the cryptogram and store it on a tracked, encrypted USB flash disk. The ZCMK encrypted under the LMK is just a cryptogram and cannot be used outside of the current security zone. After the ceremony, all the generated key material is stored split across multiple safe boxes.
At the end, each key custodian sends their key component to a designated counterpart. A different courier service, a different recipient and sometimes also a different sending day are used for each component.
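The "everything different" rule for shipping is easy to check mechanically before the envelopes leave the building. The following Python sketch is only an illustration; the Shipment record, the field names and the distinct_days switch are hypothetical, not part of any standard form.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Shipment:
    component: str
    sender: str
    recipient: str
    courier: str
    ship_date: date


def shipment_plan_issues(plan: List[Shipment], distinct_days: bool = False) -> List[str]:
    """List every value that two shipments share but should not."""
    fields = ["component", "sender", "recipient", "courier"]
    if distinct_days:  # different sending days are only sometimes required
        fields.append("ship_date")
    issues = []
    for field in fields:
        seen = set()
        for shipment in plan:
            value = getattr(shipment, field)
            if value in seen:
                issues.append(f"{field} {value} is used by more than one shipment")
            seen.add(value)
    return issues
```

An empty list means that no two components travel with the same courier, to the same recipient, or from the same sender.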
The recipients of the key components are notified in advance, usually via email. They are given information such as the TEE number, package number, key component name and the key check value. If border control or someone else opens the TEE, it should be obvious that the envelope is damaged, and nobody who receives it in the other organization is going to use that component.
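On the receiving side, the advance notice works as a checklist. A sketch of that comparison in Python could look like this; ComponentNotice and receipt_problems are hypothetical names, and I assume the observed key check value is the one shown by the HSM when the component is entered.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ComponentNotice:
    """What the sender announced in advance (e.g. by email)."""
    tee_number: str
    package_number: str
    component_name: str
    kcv: str


def receipt_problems(
    notice: ComponentNotice,
    tee_number_seen: str,
    package_number_seen: str,
    envelope_intact: bool,
    kcv_seen: str,
) -> List[str]:
    """Compare a delivered component with its notice; empty list means accept."""
    problems = []
    if not envelope_intact:
        problems.append("envelope shows signs of tampering")
    if tee_number_seen != notice.tee_number:
        problems.append("TEE number does not match the notice")
    if package_number_seen != notice.package_number:
        problems.append("package number does not match the notice")
    # KCVs are hex strings, so compare them case-insensitively.
    if kcv_seen.strip().upper() != notice.kcv.strip().upper():
        problems.append("key check value does not match the notice")
    return problems
```

Any non-empty result means the component is rejected and a new exchange has to be arranged.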
Tamper Evident Envelope
Key Check Values
It can be tricky to find out which key you are actually using if you are dealing only with HSM-protected keys. No easy plaintext comparison can be applied. This is usually solved with key check values (KCV). A KCV is basically a block of zeros encrypted with the key under a symmetric algorithm, e.g. AES in Electronic Code Book (ECB) mode, with the result truncated to its first few bytes (the first 3 bytes, i.e. 6 hex digits, is a common choice). The short KCV hex string is then used as a fingerprint for verifying that we are dealing with the right key or key component.
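Here is a minimal Python sketch of the zero-block KCV described above, assuming AES-128 and a default truncation to 3 bytes (both are my assumptions; the length is a parameter). To stay free of dependencies it carries a tiny, slow, encrypt-only AES-128 written purely for illustration. Real KCVs are computed inside the HSM, and this code should not be used to protect anything.

```python
def xtime(a: int) -> int:
    """Multiply a byte by 2 in GF(2^8) modulo the AES polynomial 0x11B."""
    a <<= 1
    return a ^ 0x11B if a & 0x100 else a


def gmul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def _rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF


def _build_sbox() -> list:
    """AES S-box: multiplicative inverse followed by the affine transform."""
    sbox = []
    for value in range(256):
        inv = next((c for c in range(1, 256) if gmul(value, c) == 1), 0)
        sbox.append(inv ^ _rotl8(inv, 1) ^ _rotl8(inv, 2)
                    ^ _rotl8(inv, 3) ^ _rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = _build_sbox()


def _expand_key(key: bytes) -> list:
    """Expand a 16-byte key into eleven 16-byte round keys."""
    words = [list(key[i:i + 4]) for i in range(0, 16, 4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]  # RotWord, SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]


def aes128_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Encrypt one 16-byte block with AES-128 (a single ECB block)."""
    if len(key) != 16 or len(block) != 16:
        raise ValueError("AES-128 needs a 16-byte key and a 16-byte block")
    round_keys = _expand_key(key)
    state = [b ^ k for b, k in zip(block, round_keys[0])]
    for rnd in range(1, 11):
        state = [SBOX[b] for b in state]  # SubBytes
        # ShiftRows: byte index is row + 4 * column, row r rotates left by r.
        state = [state[(i + 4 * (i % 4)) % 16] for i in range(16)]
        if rnd < 10:  # MixColumns is skipped in the final round
            mixed = []
            for c in range(0, 16, 4):
                a = state[c:c + 4]
                for r in range(4):
                    mixed.append(gmul(a[r], 2) ^ gmul(a[(r + 1) % 4], 3)
                                 ^ a[(r + 2) % 4] ^ a[(r + 3) % 4])
            state = mixed
        state = [b ^ k for b, k in zip(state, round_keys[rnd])]  # AddRoundKey
    return bytes(state)


def key_check_value(key: bytes, length: int = 3) -> str:
    """Encrypt a block of zeros and keep the first `length` bytes as hex."""
    return aes128_encrypt_block(key, bytes(16))[:length].hex().upper()


print(key_check_value(bytes.fromhex("000102030405060708090a0b0c0d0e0f")))
```

Both sides of an exchange can compute this value independently and compare the short hex string over an open channel, because the truncated ciphertext of a zero block does not reveal the key.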
If all three components arrive untouched, the key forming ceremony is arranged. The process is similar to the key generation ceremony: the key custodians enter all three components one by one into the terminal using a different HSM command, and the HSM then XORs the key components and shows the resulting cryptogram encrypted under the organization's local master key. All remaining paper components are immediately cross-shredded. If the KCVs match, both sides now have the same shared secret key, which is used for all subsequent secure transport.
Business as usual
As I have already written in the key exchange part, key management personnel routinely communicate with their counterpart teams in other organizations, either when negotiating a key exchange procedure or when actually exchanging cryptograms and data over the established secure channel.
Cryptographic material is always handled under dual control and split knowledge. This means that a key custodian can access his own smart card in the safe, but the smart card alone is not enough to use in any part of a process. A custodian is also able to access printed plaintext key components, but as each component is printed into a blinded paper envelope and protected by a numbered TEE, nobody should be able to compromise the component. Every custodian is responsible for handling it carefully and for not sharing any sensitive material with other component holders.
In finance-related key management, most of the work done by a key management team actually relates to audit findings. Changing rack locations, even rebuilding whole rooms because of a wrong security area "architecture", is common. Audits are a never-ending story, carried out at least annually. I believe that the same applies to other industries as well. Certification Authorities and similar institutions make a living from selling trust, so everybody with skin in the game makes sure they are trustworthy and serve customers responsibly.