Three (thousand) may keep a secret

Karthikeyan Bhargavan
July 31, 2023

“Three may keep a secret, if two of them are dead.” - Benjamin Franklin (1735)

However skeptical we may be of our human ability to keep secrets, we still routinely participate in group conversations that we would like to keep away from prying eyes. We exchange confidential work emails through corporate mail servers, discuss project internals on private Slacks, and exchange deeply personal information with family and friends on WhatsApp groups. The loss of this private data to malicious outsiders can result in public embarrassment, financial loss, and for vulnerable persons like journalists or activists, even threats to life and liberty.

Secure group messaging is now an essential component of modern Internet communication, and it deserves an open, transparent, interoperable standard that does not require placing complete trust in any single service provider. Enter Messaging Layer Security (MLS), published as RFC 9420, the first secure group messaging protocol standardized by the Internet Engineering Task Force (IETF). MLS aims to provide end-to-end encrypted, authenticated, asynchronous messaging for groups ranging from two to thousands of members, with state-of-the-art guarantees such as forward secrecy and post-compromise security, which protect against compromised or coerced clients and servers.

The MLS working group was created in 2018, with the explicit “intent to follow the pattern of TLS 1.3, with specification, implementation, and verification proceeding in parallel.” My research group at Inria has been involved in MLS from the beginning, contributing key design ideas and formal analyses along the way. Cryspen actively contributes to OpenMLS, an open-source implementation of MLS, and offers consulting and software services around MLS.

As the first protocol standard of its kind, MLS has no precedent to base itself on and needs to be self-contained. Consequently, the published standard clocks in at 132 pages, with detailed descriptions of novel cryptographic mechanisms, which can seem forbidding to newcomers.

In this post, we will unpack the key goals and design constraints of MLS and break it down into its essential sub-protocols, to provide the interested reader an entry point to the standard. In future posts, we will dig deeper into the technical details of these sub-protocols.

Dynamic Group Membership

The goal of MLS is to enable devices, called clients in MLS, to form groups within which they can securely send and receive messages over an extended period of time. Each client is identified by an authentication credential, issued by some mechanism external to the protocol, and holds some cryptographic state, including public encryption keys. Any client can create a new group, by creating a fresh group identifier, setting cryptographic parameters like the ciphersuite, and initializing the group’s membership with the creator as the only member. Over time, members can be added to or removed from the group, and they can also update their credentials and cryptographic state. Deciding who is authorized to add or remove group members is left to the application; the MLS protocol treats all clients as equal.

When a group is first created, we say that it is in epoch 0. Subsequently, each change to the group increments the epoch number. In each epoch, the current membership of the group can be seen as an array, where some entries may be blank, and each non-blank entry contains the credential and public keys of a group member.

By convention, the creator of the group is at index 0. When a new member is added, its data is placed at the first empty index or, if all indexes are occupied, appended to the end of the array. Removing a member blanks its index. Updating a member’s credential or key changes the value stored at its index.

The membership array represents a snapshot of the group membership at a particular epoch and can be used to understand the security goals of MLS. In each epoch, we expect that all group members agree on the current membership array, and can send and receive group messages that will be only visible to the current members.
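The bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative model, not the MLS wire format or any library's API: the names `Member`, `Group`, `add`, `remove`, and `update` are hypothetical, and credentials and keys are stood in for by plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    credential: str  # placeholder for an externally issued credential
    public_key: str  # placeholder for the member's public encryption key


@dataclass
class Group:
    group_id: str
    epoch: int = 0
    members: List[Optional[Member]] = field(default_factory=list)

    @classmethod
    def create(cls, group_id: str, creator: Member) -> "Group":
        # Epoch 0: the creator is the only member and sits at index 0.
        return cls(group_id=group_id, members=[creator])

    def add(self, member: Member) -> int:
        # Reuse the first blank index; extend the array only if none is blank.
        if None in self.members:
            index = self.members.index(None)
            self.members[index] = member
        else:
            index = len(self.members)
            self.members.append(member)
        self.epoch += 1
        return index

    def remove(self, index: int) -> None:
        if self.members[index] is None:
            raise ValueError("index is already blank")
        self.members[index] = None  # removal blanks the entry
        self.epoch += 1

    def update(self, index: int, member: Member) -> None:
        if self.members[index] is None:
            raise ValueError("cannot update a blank index")
        self.members[index] = member  # new credential and/or key
        self.epoch += 1


if __name__ == "__main__":
    group = Group.create("demo", Member("alice", "pk-alice"))
    group.add(Member("bob", "pk-bob"))      # placed at index 1
    group.add(Member("carol", "pk-carol"))  # placed at index 2
    group.remove(1)                         # index 1 becomes blank
    group.add(Member("dave", "pk-dave"))    # reuses index 1
    print(group.epoch, [m.credential if m else None for m in group.members])
```

Every change increments the epoch, so after the four changes in the usage example the group is in epoch 4, with dave occupying the index that bob vacated.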

Security Goals

The main threats to messages sent over MLS are from network attackers, from malicious servers, and from compromised group members whose cryptographic state and keys have been obtained by the adversary. By default, we assume that the authentication server (that issues credentials) is honest, but that the network and all other servers are untrusted and can collude with each other and with compromised group members.

The MLS working group charter lists the desired security goals for MLS, and we specify the key security guarantees below using the informal notions we have defined so far.

  • Message Confidentiality: If a client C sends a message M in epoch E of group G, and C believes that the membership of G in E is C0,…,Cn, then M is kept secret from the adversary as long as none of these members is compromised.
  • Forward Secrecy: If a client C sends (or receives) a message M in epoch E of group G, then any compromise of C after this point does not affect the confidentiality of M.
  • Message Authentication: If a client C accepts a message M in epoch E of group G, and if C believes that the membership of G in E is C0,…,Cn, and if none of these members are compromised at the time of reception, then M must have been sent by one of these group members for the group G in epoch E.
  • Sender Authentication: If a client C accepts a message M seemingly sent by a client C’ in epoch E of group G, and if C’ is uncompromised at the time of reception, then M must indeed have been sent by C’ in epoch E of group G.
  • Membership Agreement: If a client C accepts a message M from a client C’ in epoch E of group G, then C and C’ must agree on the membership of G at E.
  • Post-Remove Security: If a client C was a member of group G in epoch E, and was no longer a member in epoch E+1, then even if C was compromised in epochs <= E, it does not affect the confidentiality of messages sent in epochs >= E+1.
  • Post-Update Security: If a client C was a member of group G in epoch E, and has updated its cryptographic keys in epoch E+1, then even if the previous state of C in epochs <= E was compromised, it does not affect the confidentiality of messages sent in epochs >= E+1.

The first two confidentiality goals will be familiar to anyone who has some experience of two-party secure channel protocols. A message is confidential as long as all the current members of the group remain uncompromised until the message has been received by everyone. After this point, forward secrecy states that later compromises do not matter. Note that this definition of forward secrecy is not restricted to long-term key compromise; it allows for full state compromise of any client.

The next two authentication goals also feel natural. Only group members can send messages to the group, and the identity of the sender is authenticated. Note that in two-party protocols, these properties are often conflated since there is only one peer, but in groups, it may be important to know which member sent a particular message.

The first truly novel security goal in MLS is membership agreement, which guarantees that group members agree with each other on the current membership. In practice, MLS requires an even stronger agreement on the entire membership history of the group and its cryptographic states. Note that currently deployed group protocols like Signal Sender Keys do not provide membership agreement.
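One way to picture agreement on the whole membership history, and not just on the latest snapshot, is a hash chain over successive membership arrays. The sketch below is purely illustrative and is not the construction MLS uses: the encoding, the function names, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from typing import List, Optional, Sequence, Tuple

Entry = Optional[Tuple[bytes, bytes]]  # (credential, public key), or None if blank


def encode_snapshot(members: Sequence[Entry]) -> bytes:
    # Length-prefixed encoding so that distinct arrays never encode identically.
    out = [len(members).to_bytes(4, "big")]
    for entry in members:
        if entry is None:
            out.append(b"\x00")  # blank index
        else:
            credential, key = entry
            out.append(b"\x01")
            out.append(len(credential).to_bytes(2, "big") + credential)
            out.append(len(key).to_bytes(2, "big") + key)
    return b"".join(out)


def extend(previous_digest: bytes, members: Sequence[Entry]) -> bytes:
    # Each epoch's digest commits to the new snapshot and to everything before it.
    return hashlib.sha256(previous_digest + encode_snapshot(members)).digest()


def history_digest(snapshots: List[Sequence[Entry]]) -> bytes:
    digest = bytes(32)  # fixed starting value before epoch 0
    for members in snapshots:
        digest = extend(digest, members)
    return digest
```

Under this model, two clients that hold the same current membership array but disagree about an earlier epoch end up with different digests, so comparing a single 32-byte value is enough to detect the divergence.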

The final two goals unpack the notion of Post-Compromise Security (PCS) for group messaging. Unlike two-party messaging, where we need to explain PCS in terms of hypothetical situations like devices that were temporarily stolen, groups offer a more straightforward notion of recovery from compromise: removal. Perhaps the most important security goal for MLS is that once a member has been removed from the group, it can no longer read or write messages. In addition, MLS provides post-update security, just like two-party protocols such as Signal. It is again worth noting that most deployed group messaging protocols, including Signal Sender Keys, do not provide either of these two properties.

Performance Constraints

A key requirement of MLS is that it must support asynchronous operation. That is, members must be able to send messages and make group changes without requiring any other member to be online at the same time. This means that most classic group key exchange protocols from the cryptographic literature are not suitable for MLS. Still, designing a naive asynchronous protocol that meets the security goals outlined above is not hard, and such protocols are already deployed in Signal, WhatsApp, Matrix, etc.

The main design constraint is scalability. MLS is meant to work for groups with thousands of users, so protocols that require heavy computation at senders or recipients or large messages become infeasible as group sizes grow. Most currently deployed group messaging protocols scale linearly (and sometimes quadratically) with the number of users and can only support groups of between 256 and 1024 members. The main bottleneck is the number of public-key operations required when members are added or updated. The stated requirement of the MLS charter is that resource requirements should scale linearly or sub-linearly in the group size.

The MLS Approach: TreeSync, TreeKEM, TreeDEM

The MLS protocol achieves its performance and security goals by using binary trees to represent the group data structure and to efficiently establish group keys. In particular, the entries of the membership array described above become the leaves of a binary tree, where each internal node represents the sub-group consisting of the members below it.
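To make the idea of sub-groups concrete, here is a small sketch that places a membership array at the leaves of a complete binary tree and lists the members below any node. It assumes a simple heap-style numbering (root is node 1, children of node i are 2i and 2i+1) with the leaf count padded to a power of two; this layout and the function names are choices made for the example and differ from the node indexing in RFC 9420.

```python
from typing import List, Optional, Sequence


def padded_size(count: int) -> int:
    # Smallest power of two that can hold `count` leaves.
    size = 1
    while size < count:
        size *= 2
    return size


def leaf_node(index: int, size: int) -> int:
    # Leaves occupy nodes size .. 2*size - 1 in heap numbering.
    return size + index


def members_below(node: int, members: Sequence[Optional[str]]) -> List[str]:
    size = padded_size(len(members))
    # Descend to the leftmost and rightmost leaves covered by `node`.
    lo = hi = node
    while lo < size:
        lo, hi = 2 * lo, 2 * hi + 1
    padded = list(members) + [None] * (size - len(members))
    return [m for m in padded[lo - size : hi - size + 1] if m is not None]


def ancestors(index: int, size: int) -> List[int]:
    # Internal nodes on the path from a leaf up to the root.
    path = []
    node = leaf_node(index, size) // 2
    while node >= 1:
        path.append(node)
        node //= 2
    return path


if __name__ == "__main__":
    members = ["A", "B", None, "D", "E"]  # index 2 is blank
    size = padded_size(len(members))      # 8 leaves
    print(members_below(1, members))      # ['A', 'B', 'D', 'E'] (the whole group)
    print(members_below(2, members))      # ['A', 'B', 'D']
    print(ancestors(3, size))             # [5, 2, 1]
```

Member D at index 3 has only three ancestors in a tree of eight leaves, which is the property the rest of the protocol exploits: a change at one leaf touches only the nodes on its path to the root.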

The tree structure allows for an efficient update and removal procedure that scales logarithmically in the size of the group. The full details of the MLS tree data structure and how it works in the protocol will be discussed in future posts.
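As a rough back-of-the-envelope comparison, the snippet below contrasts a design in which an updating member encrypts fresh key material separately to every other member with an idealized tree-based design that needs about one public-key encryption per tree level. Both cost functions are simplifications assumed for illustration, not measurements of any deployed protocol or exact counts from the standard.

```python
def pairwise_update_cost(group_size: int) -> int:
    # Simplified model: one public-key encryption per other member.
    return max(group_size - 1, 0)


def tree_update_cost(group_size: int) -> int:
    # Simplified model: one encryption per level of a balanced binary tree,
    # i.e. ceil(log2(group_size)), computed exactly with integer arithmetic.
    return (group_size - 1).bit_length() if group_size > 1 else 0


if __name__ == "__main__":
    for n in (2, 256, 1024, 5000):
        print(n, pairwise_update_cost(n), tree_update_cost(n))
    # 2 1 1
    # 256 255 8
    # 1024 1023 10
    # 5000 4999 13
```

The gap widens quickly: at 5000 members the pairwise model needs thousands of encryptions per update, while the idealized tree model needs 13.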

At a high level, MLS can be divided into three sub-protocols that populate, synchronize, and use this tree data structure to establish shared keys for the group and to use them for secure messaging. We call these three sub-protocols TreeSync, TreeKEM, and TreeDEM. This decomposition of MLS is not explicitly specified in the standard. It was first identified in our paper on TreeSync, and we believe it can help the reader better understand how the different mechanisms in MLS fit together.

  • TreeSync: Authenticated Group Management. The TreeSync sub-protocol ensures that all group members have a consistent, authenticated view of the group state, including the membership array and the keys stored throughout the MLS tree data structure. TreeSync defines all group management operations and uses multiple tree hashing techniques (not dissimilar to Merkle Trees) and signatures to guarantee membership agreement and group state integrity. This serves as an essential precondition for key establishment.
  • TreeKEM: Efficient Group Key Establishment. The TreeKEM sub-protocol (first proposed in this short paper) uses the tree data structure to generate sub-group keys for each internal node in the tree, including a group key for the root that is shared between all the members of the group in the current epoch. Whenever the group membership changes, TreeKEM generates a fresh group key and efficiently conveys it to all other members. As long as all members actively contribute to the group, all TreeKEM operations have a cost proportional to the height of the tree, i.e. logarithmic in the size of the group. However, if only a few members actively contribute, the cost of each operation can grow to be linear in the group size.
  • TreeDEM: Forward Secure Group Messaging. The TreeDEM sub-protocol uses the group keys established by TreeKEM to encrypt and authenticate application messages sent within each epoch. TreeDEM guarantees forward security for application messages, using the tree data structure to decide which keys are derived and when they are deleted.
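The "derive, use, delete" pattern behind forward-secure messaging can be illustrated with a simple hash ratchet. The sketch below is a generic ratchet written for this post, not the key schedule defined in the standard: the class name `SenderRatchet`, the labels, and the use of HMAC-SHA256 as the derivation function are all assumptions.

```python
import hashlib
import hmac


def derive(secret: bytes, label: bytes) -> bytes:
    # Illustrative one-way derivation step (HMAC-SHA256 keyed with the secret).
    return hmac.new(secret, label, hashlib.sha256).digest()


class SenderRatchet:
    """Per-sender chain of message keys within one epoch (toy model)."""

    def __init__(self, epoch_secret: bytes, sender_index: int):
        # Each sender gets its own chain, derived from the shared epoch secret.
        label = b"sender" + sender_index.to_bytes(4, "big")
        self._chain = derive(epoch_secret, label)
        self.generation = 0

    def next_message_key(self) -> bytes:
        key = derive(self._chain, b"key")
        # Advance the chain and drop the old value. Because `derive` is one-way,
        # the new chain value does not reveal earlier message keys.
        self._chain = derive(self._chain, b"chain")
        self.generation += 1
        return key
```

A sender and a receiver that start from the same epoch secret and sender index derive the same sequence of keys. Once a key has been used and the chain has advanced, a later compromise of the stored state exposes only keys for future messages in the chain. Note that Python gives no control over when overwritten secrets are actually erased from memory, so a real implementation would need a language and library that support secure deletion.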

In subsequent posts, we will go deeper into these three protocols and understand why they are secure, what their performance limitations are, and how one can best use them in different scenarios. In the meantime, we leave the reader with some reading material:

MLS @ Cryspen

At Cryspen, all of our co-founders are experts on the MLS protocol and its implementations. We have contributed to the published standard, we have published papers on the protocol, and we help maintain one of the main implementations. We provide consulting and software services to help companies understand and deploy MLS, and we help customize MLS for specific user needs, maximizing real-world performance while preserving its security guarantees.

With the advent of messaging interoperability regulations like the EU’s Digital Markets Act, we believe that implementing MLS now offers companies the most future-proof way of both improving their security and meeting compliance obligations.

We believe that MLS provides a framework for secure group collaboration that goes beyond messaging, and are working on using the components of MLS for varied use cases including group key management in groups of thousands of users. We are also working on incorporating post-quantum secure cryptography into MLS. If you have questions about any of these projects, or just want to understand what MLS can do for you, do contact us.