Implementing the Signal X3DH protocol with WebCrypto

Play with the demo and view the source code on GitHub

In a previous post, I explored the low-level cryptography behind Diffie-Hellman Key Exchange, a method that allows two parties to secretly agree on an encryption key for their communications.

Diffie-Hellman is delightfully clever, but it has a few limitations if used by itself:

  • It’s prone to man-in-the-middle attacks. When Alice and Bob are transmitting their public keys to one another, a third party could be intercepting both of their keys. This attacker could complete the key exchange with both Alice and Bob and then be able to read whatever they encrypt with their shared key. The attacker would probably proxy messages between Alice and Bob so that neither of them is aware of their man-in-the-middle.
  • It doesn’t authenticate either party. Authentication is a very important part of cryptography. In addition to encrypting messages, you need to be able to prove that you’re talking to the right person! Because Alice and Bob both compute their Diffie-Hellman keys ephemerally, there’s no way for them to trust that the other person’s key was actually made by the right person.
  • It doesn’t work asynchronously. Exchanging keys is easy when both parties are on-line; each one only has to store their private keys for a few moments while the key exchange happens. But in contexts like messaging and email, somebody is very likely to send you a message when you’re not around. How can they encrypt their initial message to you, even when you’re asleep, or your phone is off?
  • It’s only for key exchange, not generic encryption. Symmetric encryption algorithms like AES have tools for checking the integrity of their contents, and are generally more efficient at encrypting lots of content, like images and video.

These limitations don’t mean that Diffie-Hellman is completely bad; it just needs a little help!

X3DH: More of a good thing

In 2016, Signal published the spec for Extended Triple Diffie-Hellman (X3DH). X3DH builds on top of plain Diffie-Hellman to create a key exchange protocol that also provides mutual authentication, forward secrecy, and deniability. X3DH also works well asynchronously, as in an instant messaging setting.

In my demo of X3DH, I set out to implement X3DH using only the WebCrypto APIs available in any modern web browser. In doing so, I learned a lot about how X3DH works along the way!

Before the key exchange: publishing prekeys

X3DH begins by having every user generate a "prekey bundle". This bundle contains several keys with a variety of characteristics:

KeyPurposeUsage
Identity KeyUniquely identifies the user who generated the key.Generated once, reused frequently, never changes.
Signed PrekeyProvides material that the owner can sign to prove that they own the keys.Generated periodically, reused frequently.
One-time PrekeysProvides key material to begin X3DH sessions with forward secrecy.Generated frequently, never reused.

When every user joins Signal, they generate a prekey bundle and upload the public portions of each key to Signal’s servers. They keep the private portions on their device.

Preparing to send a new message

When you want to send a message to somebody on Signal, you look them up by their username (or before February 2024, their phone number). When you do this, Signal’s servers send you the user’s public identity key, their current signed prekey, their computed signature of the signed prekey, and one of their one-time prekeys. Signal then deletes that one-time prekey and will never use it again.

Sometimes a user can run out of one-time prekeys. When this happens, you can still do an X3DH exchange, just without the one-time prekey.

You’re now in possession of three (maybe two) of your recipient’s public keys. Before moving on, you must verify that the user’s signed prekey was in fact signed by the user’s identity key. This proves that the same person generated both keys. If the signature isn’t valid, run away!

Generating the initial symmetric key

Signal’s diagram of the X3DH process

  1. Generate an ephemeral key (EKA). Similar to your recipient’s one-time prekey, this key is only ever used once and helps ensure forward secrecy.
  2. Calculate a Diffie-Hellman key using your identity key (IKA) and your recipient’s signed prekey (SPKB). This helps your recipient authenticate you, because they can look up your identity key from Signal’s servers.
  3. Calculate a Diffie-Hellman key using your ephemeral key (EKA) and your recipient’s identity key (IKB). This helps you authenticate your recipient, because you retrieved their identity key and validated it with the signed prekey.
  4. Calculate a Diffie-Hellman key using your ephemeral key (EKA) and your recipient’s signed prekey (SPKB).
  5. Calculate a Diffie-Hellman key using your ephemeral key (EKA) and your recipient’s one-time prekey (OPKB).

You now have four separate Diffie-Hellman keys, but you’re going to combine them all using HKDF to create one symmetric key (SK).

Sending the initial message

Use the symmetric key you generated to encrypt your first message. Then, send a message to your recipient indicating which keys you used to generate a symmetric encryption key. This bundle includes:

  • Your identity key (IKA).
  • Your ephemeral key (EKA).
  • The recipient’s one-time prekey (OPKB), or some identifier letting them know which one you used.
  • Your initial, encrypted message.

Remember, your recipient might be offline for quite a long time, but Signal’s servers can safely hold onto this bundle until the recipient returns because all of its contents are either publicly-known or encrypted.

Receiving the message

Your recipient will receive a notification that you want to send them a message, and they might choose to dismiss it if you’re trying to market a cryptocurrency scam or something. If they decide to accept your message, they’ll start doing all the same Diffie-Hellman calculations that you did:

  1. Calculate a Diffie-Hellman key using their signed prekey (SPKB) and your identity key (IKA).
  2. Calculate a Diffie-Hellman key using their identity key (IKB) and your ephemeral key (EKA).
  3. Calculate a Diffie-Hellman key using their signed prekey (SPKB) and your ephemeral key (EKA).
  4. Calculate a Diffie-Hellman key using the one-time prekey (OPKB) you told them you used, and your ephemeral key (EKA).

They will then use HKDF to generate a symmetric key from the four Diffie-Hellman keys, and their symmetric key will be exactly the same as the one you generated, meaning they can decrypt your message!

After the initial message

My WebCrypto X3DH demo ends here, but the rest of the Signal protocol after this is even more interesting!

After using X3DH to generate the initial encryption key for a message, that encryption key is never used on its own again. Instead, both parties use that key to begin the Double Ratchet algorithm, which is a series of key-derivation functions that produces brand-new encryption keys for each and every message. This is an essential part of Signal’s forward secrecy guarantee, becase if any individual key is compromised it can only decrypt a single message within a thread; the other messages each use different keys that are still secret.

WebCrypto and next steps

To bring this all back to browsers and WebCrypto, the basic cryptography primitives to build a system as secure as Signal are clearly there; the problems lie in the fact that web browsers are software islands that don’t have secure, private storage mechanisms and are separated from hardware-based security capabilities in a way that native applications aren’t. When modern browsers get reliable access to biometric authentication and truly secret storage, we’ll be able to build some really great applications that put users’ security first.

← All Posts