Cryptography on Windows Part 4 - Symmetric cryptography I

Published 2018-04-01, updated 2018-04-01

In a prior post, I described the use of Windows CryptoAPI to compute message digests and message authentication codes which ensure integrity of messages. We now move on to the use of symmetric algorithms in cryptography, which can be used for both message confidentiality and integrity. Asymmetric algorithms will be covered in future posts. Refer to the introductory post in this series for the difference between the two.

The use of cryptography requires choosing an algorithm and a key with which to encrypt data. Key management in particular is critical for the security of a cryptographic system. So we will start by looking at these in this post and then cover their use in the next.

As a reminder, the code snippets here assume you have loaded the TWAPI package and set up the namespace path to include its namespace. Also the convenience procedure hex is defined to dump binary data in hexadecimal format.

package require twapi
namespace path twapi
proc hex bin {  binary encode hex $bin }

Algorithms and algorithm identifiers

Any operation involving symmetric cryptography will require the application to choose the algorithm to be used. Internally, the Windows CryptoAPI identifies algorithms using integer values. For example, the SHA1 message digest algorithm that we saw in our previous post is defined as the value 32772. The numeric identifier for each algorithm is listed in the Windows SDK.

Numeric identifiers are inconvenient, so just as the SDK provides C macros as an alternate way of specifying algorithms, the twapi_crypto package also permits the use of mnemonic identifiers such as des, sha1 etc. Refer to TWAPI documentation for the full list.

Finally, algorithms can also be specified using ASN.1 object identifiers (OIDs), for example 1.3.14.3.2.26, which is the OID for SHA1. (Ignore this if you do not know about ASN.1 or OIDs.)

The command capi_algid will return the numeric identifier for the algorithm when passed any of the above forms.

% capi_algid sha1
0x00008004
% capi_algid 1.3.14.3.2.26
0x00008004
% capi_algid oid_oiwsec_sha1
0x00008004

Note that 0x00008004 is the hex representation of the decimal value 32772 that identifies SHA1.

Most twapi commands will do the above translation internally so there is usually no need to explicitly call capi_algid.
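
The relationship between the hexadecimal form printed by capi_algid and the decimal value listed in the SDK can be checked in plain Tcl, without TWAPI. The sketch below also splits an identifier into its class, type and sub-identifier fields, assuming the bit layout used by the GET_ALG_CLASS, GET_ALG_TYPE and GET_ALG_SID macros in the SDK's wincrypt.h header. The procedure names algid_hex and algid_fields are made up for this illustration and are not part of TWAPI.

```tcl
# Pure Tcl, no TWAPI needed.

# Format an integer the way capi_algid displays it.
proc algid_hex {algid} {
    return [format 0x%08x $algid]
}

# Split an ALG_ID into its fields (assumed layout from wincrypt.h:
# 3-bit class, 4-bit type, 9-bit sub-identifier).
proc algid_fields {algid} {
    return [dict create \
        class [expr {($algid >> 13) & 0x7}] \
        type  [expr {($algid >> 9) & 0xf}] \
        sid   [expr {$algid & 0x1ff}]]
}

puts [algid_hex 32772]           ;# prints 0x00008004
puts [expr {0x00008004}]         ;# prints 32772
puts [algid_fields 0x00008004]   ;# prints class 4 type 0 sid 4
```

Class 4 is the hash class in that header, which is consistent with SHA1 being a message digest algorithm.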

Key management

Secure management of keys is a critical aspect of cryptography. Flaws in the process can result in inadvertent exposure of keys or in keys with low strength, either of which compromises the system no matter how sound the underlying cryptographic algorithms are in theory.

So along with the use of keys for their primary cryptographic purpose we also have to concern ourselves with various peripheral operations on the keys themselves including, in roughly temporal order,

Key generation including choice of algorithm, key lengths etc.
Key establishment and distribution which involves making keys available to authorized parties and no others
Secure storage of keys protecting against inadvertent leakage and unauthorized access
Safe disposal, backup and recovery of keys

We will discuss all these aspects as we move along in our discussion of symmetric cryptography. Note that, although there are some differences, most of the discussion applies to keys used in asymmetric algorithms as well.

Keys and cryptographic contexts

Keys are always associated with a cryptographic context, or to be more precise, a key container (possibly the default one) within a context.

All cryptographic operations involving keys require the use of a handle that represents the key of interest. This handle is obtained from the cryptographic context at the time of key generation or at a later time by specifying the appropriate attributes for the key. When the handle is no longer required, it has to be freed using the twapi::crypt_key_free command.

Note that the actual key itself is normally not needed for the application to carry out operations using the key. This helps protect against inadvertent key leakage through a badly coded application.

Generating keys

The very first requirement for the use of any key is of course the existence of the key! So let us start with how keys are generated.

A key may be created based on some randomly generated values or derived from some known data such as a pass phrase. In the former case, it is crucial that the generator use sufficient sources of entropy to make the generated key non-predictable. In the latter case, the data from which the key is derived must itself be kept secret.

The choice between randomly generating a key and deriving it generally depends on the application. In a protocol such as SSL/TLS, symmetric encryption keys are randomly generated for each communication session. Random generation is not suitable, however, for an application that protects files. There, a key derived from a passphrase that the user has to provide (and remember) is more appropriate. Both methods are discussed below.

NOTE: Remember we are only discussing symmetric keys in this post.

Generating random symmetric keys

Generating a random key is straightforward given a cryptographic context. In the simplest case, only the algorithm needs to be specified along with the cryptographic context under which the key is to be generated.

% set hcrypt [crypt_acquire]
2014535112768 HCRYPTPROV
% set hdes [crypt_generate_key $hcrypt des]
2014502074224 HCRYPTKEY

The above generates a DES key and returns a handle to it. We could add other keys as well provided they are for different algorithms.

% set hrc4 [crypt_generate_key $hcrypt rc4]
2331196109376 HCRYPTKEY

Warning: Generating another key for an algorithm for which a key already exists will result in the previous key being overwritten and unusable.

The returned key handles can then be used for cryptographic operations.

% hex [capi_encrypt_string abc $hdes]
ae2430212e72f043

We will have more to say on the cryptographic operations later on but for now we focus on the keys themselves.

Finally, the key handles must be freed with twapi::crypt_key_free. For example,

crypt_key_free $hdes
crypt_key_free $hrc4
crypt_free $hcrypt

We will be using these keys however, so don't do that just yet.

Note that for symmetric keys, once all handles for a key are freed, the key itself is no longer accessible and there is no means of obtaining a new handle to it. The reason for this is that randomly generated symmetric keys are intended to be used as short-lived session keys and should be destroyed at the end of a session.
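
One way to make sure handles are always released, even when an operation in between raises an error, is to wrap their use in Tcl's try ... finally. The following is only a sketch built from the commands already shown in this post (the key handle is freed with capi_key_free, the name used in the session transcripts in this post). It requires Tcl 8.6 and is not a pattern that TWAPI itself mandates.

```tcl
package require twapi
namespace path twapi

# Acquire a context and a key, use the key, and release both handles
# even if the encryption step raises an error.
set hcrypt [crypt_acquire]
try {
    set hkey [crypt_generate_key $hcrypt des]
    try {
        set ciphertext [capi_encrypt_string abc $hkey]
        puts [binary encode hex $ciphertext]
    } finally {
        capi_key_free $hkey   ;# release the key handle first ...
    }
} finally {
    crypt_free $hcrypt        ;# ... then the cryptographic context
}
```

Because the key is random, the hex string printed will differ on every run.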

You might be wondering at this point about how the other authorized party gets hold of this key that was randomly generated by the first party. Obviously, the other party cannot randomly generate the same key! This will be covered when we discuss key distribution.

Generating symmetric keys from a pass phrase

For some use cases, generation of a random key for encryption is not appropriate. For example, you may want to encrypt the contents of a file to protect it from unauthorized viewing and only permit users that know the secret to decrypt and view the file. For such cases, a commonly used method is to generate the secret key based on a pass phrase supplied by the user encrypting the file.

The pass phrase cannot be directly used, for example by just using the ASCII encoding of the characters, because the symmetric key lengths are generally a fixed size. More important, such a key would not be cryptographically strong. Instead, algorithms are used to transform the pass phrase into a more cryptographically secure key.

From within Tcl, you can use the crypt_derive_key command to generate a key from a pass phrase. Like the crypt_generate_key command we saw earlier, this will return a handle to a key that can be used for cryptographic operations. Unlike that command, however, the created key is not random but is a function of the pass phrase. Moreover, passing the same pass phrase and options to the command will result in the same key being generated every time. You can therefore use this mechanism to implement scenarios similar to the file protection example above.

In its simplest form, the command takes the form

crypt_derive_key _HCRYPT_ _ALGID_ _PASSPHRASE_

where HCRYPT is the cryptographic context, ALGID is the algorithm identifier for the algorithm with which the key is to be used and PASSPHRASE is the pass phrase from which the key is to be derived.

So for example, the following would generate a key to be used for encryption using 3DES, do some cryptographic operations with it and then as always free the key handle.

% set sender_key [crypt_derive_key $hcrypt 3des [conceal "My passphrase"]]
2331196108928 HCRYPTKEY
...do something with the key...
% hex [set ciphertext [capi_encrypt_string abc $sender_key]]
cafbb1ee3dc31292
% capi_key_free $sender_key

Then on the receiver side, we can get back the plain text using the same passphrase to generate the symmetric key.

% set receiver_key [crypt_derive_key $hcrypt 3des [conceal "My passphrase"]]
2331196109600 HCRYPTKEY
% capi_decrypt_string $ciphertext $receiver_key
abc
% capi_key_free $receiver_key

NOTE: The conceal command above is used because the crypt_derive_key command expects the pass phrase to be in a protected form. We will elaborate on this in a future post.

Naturally, to ensure the same exact key is derived from a pass phrase every time, the same exact sequence of operations needs to be executed. Given that the derivation may happen on multiple systems and contexts, applications making use of pass phrase based key derivation have to take care of the following aspects.

There are many key derivation algorithms and the same algorithm must be used every time. This is obvious but easy to overlook when defaults are used and these defaults differ between systems and even between libraries on the same system.
Moreover, the algorithms are parameterized, so different keys can be derived even when the basic key derivation algorithm is the same. For example, algorithms that operate by repeatedly applying a pseudo random function (PRF) to some transform of the pass phrase require the same PRF to be used every time, and different platforms may default to different PRFs if one is not explicitly specified. The number of iterations over which the PRF is applied is another such parameter, and it is up to the application to ensure the same iteration count is used every time. Yet another is the salt value, which is used to seed the key derivation.
For algorithms that do not have a fixed key size, the key size that is used as a default if one is not explicitly specified is system dependent. It can depend on the platform, the operating system version and even the targeted country (due to export rules related to encryption). Again, this should be explicitly specified.

For these reasons, it is recommended that an application not use defaults and explicitly control the various parameters that are input into the key derivation process as we now discuss.
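
One way to avoid depending on defaults is to record the derivation parameters next to the encrypted data so that the decrypting side can reproduce them exactly. The sketch below is plain Tcl and does not call TWAPI. The dictionary keys, the procedure names kdf_params_pack and kdf_params_unpack, and the key size of 128 are all hypothetical, chosen only for illustration. None of the recorded values is secret; only the pass phrase is.

```tcl
# Record every input to key derivation explicitly so that nothing
# depends on platform defaults. The salt is binary, so it is stored as hex.
proc kdf_params_pack {method prf iterations salt keybits} {
    return [dict create \
        method     $method \
        prf        $prf \
        iterations $iterations \
        salt       [binary encode hex $salt] \
        keybits    $keybits]
}

# Recover the parameters, converting the salt back to binary.
proc kdf_params_unpack {params} {
    dict set params salt [binary decode hex [dict get $params salt]]
    return $params
}

set salt     [binary format H* 0102030405060708]   ;# fixed only for this demo
set packed   [kdf_params_pack pbkdf2 sha_256 10000 $salt 128]
set unpacked [kdf_params_unpack $packed]
puts [dict get $packed salt]                        ;# prints 0102030405060708
puts [expr {[dict get $unpacked salt] eq $salt}]    ;# prints 1
```

In a real application the salt would be random rather than fixed, and the packed dictionary would be stored or transmitted along with the ciphertext.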

Selecting the key derivation algorithm

The crypt_derive_key command supports two algorithms for deriving the symmetric key from the passphrase. The algorithm can be selected with the -method option to the command. This can take one of two values: native and pbkdf2.

Specifying native, which is the default, results in key derivation using the CryptDeriveKey Win32 API function. Assuming other relevant parameters described below are the same, the keys derived on any Windows system will have the same value. However, this algorithm is Windows specific and other platforms generally do not have an equivalent. You can however use the specification described in the CryptDeriveKey documentation to implement equivalent functionality.

The PBKDF2 algorithm, selected by specifying pbkdf2, on the other hand is an industry standard defined in IETF RFC 2898. Most platforms provide an implementation of this algorithm and they will all produce the same key value, again assuming the same parameters are supplied.

In general, the PBKDF2 algorithm is recommended (by experts, not me!), not just because it is standardised but also because it is considered more secure. It is however several orders of magnitude slower by design (for resilience against brute-force password guessing attacks).

When deriving a key from a passphrase using PBKDF2, there are additional options, shown in the example below, that control the derivation.

% set hpbkdf [crypt_derive_key $hcrypt des [conceal "My passphrase"] -method pbkdf2 -salt [random_bytes 8] -iterations 10000 -prf sha_256]
1793702295392 HCRYPTKEY
% capi_key_free $hpbkdf

The PBKDF2 algorithm has three controlling parameters:

The underlying pseudo-random function (PRF) used, as specified by the -prf option. This may be sha1 or sha_256.
An iteration count, specified by the option -iterations. This is the number of iterations used to generate the derived key, a higher iteration count being more resilient to an attack at the cost of making the key derivation correspondingly slower. As processors become faster over the years, a higher iteration count is suggested. See NIST Special Publication 800-63B Section 5.1.1 for some recommendations.
A mandatory salt value, specified through the -salt option. My introductory post described the purpose of salts in cryptographic algorithms. Here we use the TWAPI random_bytes command, which is a call into the cryptographically secure RtlGenRandom Windows API, to generate a 64-bit random value.

Obviously, all parties that make use of the passphrase must use the same option values in order for the derived keys to be identical. Refer to RFC 2898 for the effect of each as well as how to choose their values for a specific application.
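
To make the point about shared option values concrete, here is a hedged sketch of a sender and a receiver deriving the same key with PBKDF2. It uses only the commands and options described in this post, with -method pbkdf2 given explicitly as described earlier. The variable names are illustrative, and running both sides in one script is a simplification; in practice the salt, iteration count and PRF name would travel with the ciphertext, since they are not secret.

```tcl
package require twapi
namespace path twapi

set hcrypt [crypt_acquire]

# Sender: pick a fresh random salt and derive the key with explicit options.
set salt       [random_bytes 8]
set iterations 10000
set sender_key [crypt_derive_key $hcrypt des [conceal "My passphrase"] \
    -method pbkdf2 -salt $salt -iterations $iterations -prf sha_256]
set ciphertext [capi_encrypt_string abc $sender_key]
capi_key_free $sender_key

# Receiver: same pass phrase, plus the salt, iteration count and PRF
# received from the sender, so the derived key should be identical.
set receiver_key [crypt_derive_key $hcrypt des [conceal "My passphrase"] \
    -method pbkdf2 -salt $salt -iterations $iterations -prf sha_256]
set plaintext [capi_decrypt_string $ciphertext $receiver_key]
capi_key_free $receiver_key
crypt_free $hcrypt

puts $plaintext
```

If any one of the salt, iteration count or PRF differs on the receiving side, a different key is derived and the original text will not be recovered.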

NOTE: TWAPI also provides a separate command pbkdf2 that directly returns the generated key using the PBKDF2 algorithm.

Key properties

Once we have a key in hand, we can examine its properties through various commands. For example,

% capi_key_blocklen $hdes
64
% capi_key_blocklen $hrc4
0

tells us that the data encrypted with DES will be encrypted in blocks of 64 bits while RC4 being a stream cipher does not have a block size. Some properties associated with the key, like block length, are inherent to the algorithm and cannot be changed. Others, like the padding type, can be configured as we will see when we cover the encryption operations in detail.
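
The block length also determines how much ciphertext a block cipher produces. As a rough worked example in plain Tcl, the hypothetical helper padded_size below computes the output size, assuming PKCS #5 style padding (which always adds between one byte and one full block) and treating a block length of 0, as reported for RC4, as meaning no padding at all. The padding scheme is an assumption here, since padding is covered only in a later post.

```tcl
# Size in bytes of the ciphertext for nbytes of plaintext, given the block
# length in bits as reported by capi_key_blocklen.
# Assumes PKCS #5 padding for block ciphers; stream ciphers add nothing.
proc padded_size {nbytes blockbits} {
    if {$blockbits == 0} {
        return $nbytes
    }
    set blockbytes [expr {$blockbits / 8}]
    # Integer division rounds down, then one more block is always added
    return [expr {($nbytes / $blockbytes + 1) * $blockbytes}]
}

puts [padded_size 3 64]   ;# prints 8
puts [padded_size 8 64]   ;# prints 16
puts [padded_size 3 0]    ;# prints 3
```

The first result is consistent with the 16 hex digits (8 bytes) of DES ciphertext shown earlier for the three-character string abc.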

Key distribution for symmetric keys

A symmetric key needs to be known to both the sending and receiving parties and thus a safe mechanism for sharing it between the two without revealing it to other parties is required.

Distributing keys derived from a passphrase

When symmetric keys are derived from a passphrase, this is easy because the problem is just punted to some out-of-band solution through which the communicating parties share the passphrase. In the encrypted disk file scenario, the encrypting and decrypting parties are the same person, so sharing is a no-op. When the two parties are different, a whispered word in an abandoned warehouse would do the trick!

Distributing randomly generated keys

Randomly generated symmetric keys are a whole different issue. By definition, the two parties couldn't possibly generate the same random key! Thus the key somehow has to be securely communicated by the generating party to the other side. In practice, randomly generated keys are used as session keys in cryptographic systems and are transferred over channels that are already secured by other means. This may be a secure channel (possibly via a trusted third party) that is already established (yes, chicken and egg), or a public-private key protocol where asymmetric cryptography is used to protect the shared symmetric keys. I will delay discussion of the former to my next post, which covers the actual process of encrypting communication using symmetric algorithms, while the latter will have to wait for our posts on asymmetric algorithms.

Coming up next

This post dealt primarily with how symmetric keys are generated. The next post will delve into how they are used, both for confidentiality as well as integrity.