What I’ve been up to recently

As part of my degree requirements, I’ve got to complete a large project in a group of four people. The project goal is self-selected, but must be useful and related to software engineering. I’ll be working with Michael Chang, Zameer Manji, and Alvin Tran until early 2014 to add integrity protection to eCryptfs, a cryptographic file system that can be used on Linux. In this post, I’ll present some concepts related to the project along with some details about the project itself.

Confidentiality and Integrity

Two important concepts in computer security are those of confidentiality and integrity. There is also usually a third concept mentioned alongside these, that of availability, but I’m only mentioning it here for completeness.

Confidentiality protection attempts to prevent certain parties from reading information while allowing other parties to access it. In the case of a cryptographic file system like eCryptfs, this is done by encrypting the files before writing them to disk, and decrypting the files when they are needed later. This could be done manually, but it is much easier and less error-prone to have the file system handle this sort of thing than to try to encrypt all sensitive information by hand.

Integrity protection attempts to ensure that information has not been changed without authorization, whether by accident or by an attacker. This might entail actually trying to prevent modifications to the information, or it may simply indicate when the information has been changed. Cryptographically, this can be done using a Message Authentication Code (MAC), which is a short binary string that can be easily calculated with a file and a key, but cannot feasibly be calculated without both. Additionally, if the file changes then the calculated MAC will, with overwhelming probability, be different. Anyone knowing the key and having access to the file can calculate the MAC and compare it to one that was calculated and stored earlier, and if the two are different, then the file must have been changed.
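As a rough illustration of the compute-store-compare idea, here is a minimal sketch using HMAC-SHA256 from Python's standard library. HMAC is just one possible MAC construction that I picked for the example; it is not what eCryptfs uses. The key, the file contents, and the helper names `compute_mac` and `verify_mac` are all made up for illustration.

```python
import hashlib
import hmac


def compute_mac(key: bytes, data: bytes) -> bytes:
    """Calculate a MAC over data with the given key (HMAC-SHA256 here)."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_mac(key: bytes, data: bytes, stored_mac: bytes) -> bool:
    """Recalculate the MAC and compare it to the stored one in constant time."""
    return hmac.compare_digest(compute_mac(key, data), stored_mac)


# Hypothetical key and file contents, for illustration only.
key = b"hypothetical secret key"
original = b"contents of some file"

# Calculated once and stored alongside the file.
stored = compute_mac(key, original)

# Later: the unchanged file verifies, a modified file does not,
# and someone without the right key cannot produce a matching MAC.
unchanged_ok = verify_mac(key, original, stored)
tampered_ok = verify_mac(key, b"contents of some f1le", stored)
wrong_key_ok = verify_mac(b"some other key", original, stored)
```

Note that this only detects the change. It does nothing to prevent it or to repair the file.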

Current state of eCryptfs

The eCryptfs file system is a stacking file system, which means that it relies on a lower file system to handle stuff like I/O and buffering, and just manages file encryption and decryption. Currently, that is all it manages, as it does not include any integrity protection. The contents of files are made unreadable to anyone without the correct key, but it is still possible to modify those files in partly predictable ways, as presented below.

There is already a wide user base for eCryptfs: Ubuntu and its derivatives use it to provide the encrypted home directory feature, and Google’s ChromeOS uses it as well.

Attack against CBC mode

Cipher Block Chaining (CBC) is one of the most common modes of operation for block ciphers, and is used currently by eCryptfs. In this mode of operation, each block of plaintext is XORed with the previous ciphertext block before encryption. This ensures that the same block of plaintext won’t encrypt to the same ciphertext, unless the previous ciphertext block is the same as well. A one-block initialization vector stands in for the previous ciphertext block during the first encryption. CBC decryption just reverses the process, first decrypting the ciphertext block, then XORing it with the previous ciphertext block.

The operations can be expressed as follows (taken from Wikipedia):

Encryption: $C_i = E_K(P_i \oplus C_{i-1}), C_0 = IV$

Decryption: $P_i = D_K(C_i) \oplus C_{i-1}, C_0 = IV$

Now let’s perform the attack. Let’s say we want to change a certain plaintext block $P_n$ into a different plaintext ${P_n}'$ by flipping some bits. We’ll denote this change as $\Delta$. That is, ${P_n}' = P_n \oplus \Delta$

It turns out that if we don’t care what happens to the previous plaintext block, $P_{n-1}$, all we have to do is replace $C_{n-1}$ with ${C_{n-1}}' = C_{n-1} \oplus \Delta$

We can substitute this into the decryption formula above to see what will happen.

${P_n}' = D_K(C_n) \oplus {C_{n-1}}'$

${P_n}' = D_K(C_n) \oplus {C_{n-1}} \oplus \Delta$

${P_n}' = P_n \oplus \Delta$
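The price of the attack is paid one block earlier. As a worked step using the same decryption formula (writing ${P_{n-1}}'$, my own notation, for what block $n-1$ now decrypts to):

${P_{n-1}}' = D_K({C_{n-1}}') \oplus C_{n-2}$

${P_{n-1}}' = D_K(C_{n-1} \oplus \Delta) \oplus C_{n-2}$

For a good block cipher, $D_K(C_{n-1} \oplus \Delta)$ has no useful relationship to $D_K(C_{n-1})$, so ${P_{n-1}}'$ comes out as unpredictable garbage. Every block other than $P_{n-1}$ and $P_n$ decrypts exactly as before, because its decryption involves neither ${C_{n-1}}'$ nor $\Delta$.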

This is an integrity issue, as an attacker can now modify files without ever knowing the key used to encrypt them. It’s also not guaranteed that this modification is detectable, depending on whether the previous block can be checked for validity. If it can be checked, great, but that’s just another form of integrity protection, and the project I’m working on aims to implement integrity protection regardless of the data stored. If it can’t be checked for correctness, or is ignored (maybe it’s a different record in a database) then the modification will go unnoticed.
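To make the attack concrete, here is a small runnable sketch. Python's standard library has no AES, so I've assumed a toy Feistel block cipher built from SHA-256 as a stand-in. It is not secure and is not what eCryptfs uses, but the attack only depends on the CBC chaining, not on the cipher. The function names and the example plaintext are hypothetical.

```python
import hashlib
import os

BLOCK = 16          # block size in bytes
HALF = BLOCK // 2


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function for the toy Feistel cipher (an assumption, NOT secure).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # C_i = E_K(P_i xor C_{i-1}), C_0 = IV
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # P_i = D_K(C_i) xor C_{i-1}, C_0 = IV
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


def flip(ciphertext: bytes, n: int, delta: bytes) -> bytes:
    """XOR delta into C_{n-1} to change P_n (blocks numbered from 1, n >= 2)."""
    start = (n - 2) * BLOCK
    tampered = xor(ciphertext[start:start + BLOCK], delta)
    return ciphertext[:start] + tampered + ciphertext[start + BLOCK:]


# The legitimate user encrypts three 16-byte blocks.
key, iv = os.urandom(16), os.urandom(BLOCK)
plaintext = b"block 1: header." + b"pay alice $0100." + b"block 3: footer."
ciphertext = cbc_encrypt(key, iv, plaintext)

# The attacker guesses P_2, picks a replacement, and never touches the key.
delta = xor(b"pay alice $0100.", b"pay alice $9100.")
tampered = flip(ciphertext, 2, delta)

# The legitimate user decrypts the tampered ciphertext.
recovered = cbc_decrypt(key, iv, tampered)
print(recovered[BLOCK:2 * BLOCK])
```

In `recovered`, the second block holds the attacker's chosen text, the third block is untouched, and the first block is garbage. Whether anyone notices depends entirely on whether something checks that first block.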

Galois Counter Mode

Galois Counter Mode (GCM) is another mode of operation for block ciphers, but in addition to encrypting the data, it also produces a piece of data known as an authentication tag. This tag acts as a MAC taken over the data that was encrypted. An attacker could still modify the ciphertext, but now the modification will invalidate the tag, making it detectable. The attacker cannot modify the tag so that it validates the new data, because calculating the tag requires the cryptographic key that was used to encrypt the data, and the attacker does not know this key.

Another benefit of GCM is speed. It’s true that the same effect on security could be had by encrypting the data and calculating a MAC separately, but that requires two passes over the file, one for each operation. GCM does both in one pass over the file, speeding things up. This is important in a file system, as you’d rather have access to your files quickly.

The project aims to implement GCM as the mode of operation for eCryptfs, thus providing both integrity and confidentiality protection. Integrity protection was something the original developers wanted to have from the beginning, but didn’t have the time to implement. I’m proud to be helping to create the first widely used integrity-protected cryptographic file system.