In a recent blog post, we described Cryspen's new Rust implementation of ML-KEM and talked about how our high-assurance development methodology helped us find a new timing bug in several other Kyber implementations.

In this post, we will go into some more detail on how we use the hax verification toolchain to formally prove that our code is panic free and functionally correct.

## Background on High-Assurance Cryptography

Cryptographic libraries lie at the heart of the trusted computing base of all networked applications and, consequently, programming bugs in these libraries are always treated as security vulnerabilities, sometimes with high-profile acronyms and news articles. In the past decade, a number of companies and research groups have made a concerted attempt to improve confidence in cryptographic libraries by using techniques from formal verification. This research area, called computer-aided cryptography or high-assurance cryptography, is surveyed in this paper and even has a dedicated workshop.

There are several notable successes from this community effort. Projects like HACL* and Fiat-Crypto generate C code from formally verified implementations written in proof-oriented languages like F* and Coq, and this C code is now widely used in mainstream software including Google Chrome, Mozilla Firefox, Linux Kernel, WireGuard VPN, mbed TLS, Python, ElectionGuard, and Tezos. Other projects like Vale and libjade generate verified assembly code that is as fast or faster than the most optimized implementations in mainstream crypto libraries. Still other projects like CryptoLine seek to verify existing code in important libraries like OpenSSL.

Despite these landmark achievements, many open questions remain. Cryptographic verification tools are largely focused on software written in research languages, and verifying new cryptographic algorithms requires Ph.D.-level skills. Our verification toolchain, hax, offers a new way forward by targeting implementations written by crypto developers in Rust.

## Running the hax toolchain

We use the hax toolchain to extract F* code from Rust code. More details on how hax works and its history are in this blog post. The hax toolchain supports multiple backend provers, but for the purposes of this blog post, we focus on the F* backend.

We run hax on the ML-KEM implementation in libcrux, and for each module `M` in the implementation, hax produces an F* implementation file `M.fst` and a corresponding interface file `M.fsti`. Each function and constant in Rust is compiled to an equivalent function and constant in F*. Each Rust trait turns into an F* typeclass, and each Rust type is translated to the corresponding type in F*.

### Example: Barrett Reduction

As a first example, consider the following function for Barrett reduction, taken from the `arithmetic.rs` file in the libcrux ML-KEM implementation:

```rust
pub(crate) fn barrett_reduce(input: i32) -> i32 {
    let t = (i64::from(input) * BARRETT_MULTIPLIER) + (BARRETT_R >> 1);
    let quotient = (t >> BARRETT_SHIFT) as i32;
    let remainder = input - (quotient * FIELD_MODULUS);
    remainder
}
```

This function implements the classic Barrett reduction algorithm to compute the modulus of a 32-bit signed integer input with respect to the ML-KEM prime `q = 3329`. The algorithm uses the constant `BARRETT_R = 2.pow(26)` with `BARRETT_SHIFT = 26`.

In the first line, we multiply the `input` by a pre-computed constant `20159` (equal to `⌈BARRETT_R/q⌋`), and add it to `BARRETT_R/2` to obtain `t`. The next line divides `t` by `BARRETT_R` to find the quotient, and the third line computes and returns the remainder.
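Before turning to the formal proof, the intended behavior can be spot-checked with a quick test harness. The sketch below is ours, written for illustration with the constants inlined; it is not part of the libcrux sources:

```rust
// Illustrative check (not from the libcrux sources): barrett_reduce should
// return a value congruent to the input mod 3329, with small absolute value.
const BARRETT_MULTIPLIER: i64 = 20159;
const BARRETT_R: i64 = 1 << 26;
const BARRETT_SHIFT: i64 = 26;
const FIELD_MODULUS: i32 = 3329;

pub fn barrett_reduce(input: i32) -> i32 {
    let t = (i64::from(input) * BARRETT_MULTIPLIER) + (BARRETT_R >> 1);
    let quotient = (t >> BARRETT_SHIFT) as i32;
    input - (quotient * FIELD_MODULUS)
}

fn main() {
    // Sweep inputs across the full claimed range -BARRETT_R..=BARRETT_R.
    for input in (-(BARRETT_R as i32)..=BARRETT_R as i32).step_by(4099) {
        let r = barrett_reduce(input);
        // Congruence: the result equals the input modulo 3329.
        assert_eq!(
            i64::from(r).rem_euclid(3329),
            i64::from(input).rem_euclid(3329)
        );
        // Range: the result stays within +/- 3328.
        assert!(r.abs() <= 3328);
    }
    println!("all checks passed");
}
```

Such a sweep is no substitute for a proof, of course; it only samples the input space, whereas the F* development below covers every input.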

When we run hax, this function is translated to the following function in F* (taken from the module `Libcrux.Kem.Kyber.Arithmetic.fst`):

```
let barrett_reduce (input: i32) =
  let t: i64 =
    ((Core.Convert.f_from input <: i64) *! v_BARRETT_MULTIPLIER <: i64) +!
    (v_BARRETT_R >>! 1l <: i64)
  in
  let quotient: i32 = cast (t >>! v_BARRETT_SHIFT <: i64) <: i32 in
  let remainder: i32 =
    input -!
    (quotient *! Libcrux.Kem.Kyber.Constants.v_FIELD_MODULUS <: i32)
  in
  remainder
```

As we can see, hax has translated each Rust arithmetic operation (e.g. `+`) to the corresponding operation in F* (e.g. `+!`), and every constant (e.g. `1l`), variable (e.g. `t`), and function (e.g. `f_from`) is fully annotated with its complete path and type.

This F* code uses a set of hand-written, annotated F* libraries (e.g. `Core.Convert`) that model the behavior of the Rust core libraries and operators. These libraries are trusted, but must be carefully audited to ensure that they correctly reflect the operational semantics of Rust.

## Why Bother with Barrett Reduction?

One may ask why we use a complicated algorithm like Barrett reduction for a single `i32`. Isn't this algorithm meant only for large integers stored as bignums? Why can't we just calculate the modulus as `input % 3329`?

The reason we cannot use `%` is that `input` is a secret here and division is not a constant-time operation, as we discussed in our previous blog post. In F*, we treat `i32` as a secret integer type (a public integer must be explicitly labeled as `pub_i32`), and operations like `%` are not defined for secret integers. Consequently, the F* typechecker would immediately reject `input % 3329` as ill-typed.
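The same discipline can be mimicked in plain Rust with a newtype wrapper. This is a hypothetical sketch of ours for illustration (libcrux's actual secrecy tracking happens in the F* libraries): a secret wrapper that implements addition but deliberately not `%`, so the offending expression fails to compile.

```rust
use std::ops::Add;

// Hypothetical sketch of a secret-integer wrapper: arithmetic is allowed,
// but the remainder operator is simply not implemented for it.
#[derive(Clone, Copy, Debug)]
struct SecretI32(i32);

impl Add for SecretI32 {
    type Output = SecretI32;
    fn add(self, rhs: SecretI32) -> SecretI32 {
        // Wrapping addition: same cost on every input, no data-dependent branch.
        SecretI32(self.0.wrapping_add(rhs.0))
    }
}

fn main() {
    let a = SecretI32(5);
    let b = SecretI32(7);
    let c = a + b;
    assert_eq!(c.0, 12);
    // let r = a % 3329; // would not compile: `Rem` is not implemented for SecretI32
}
```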

We could also use a different modular reduction algorithm, and indeed our code also uses Montgomery reduction, but Barrett Reduction is well-suited for certain computations in Kyber.

### Expressiveness and limitations

We note that the F* type system is much more powerful than Rust's, allowing the programmer to attach invariants to types (*refinement types*), pre- and post-conditions to functions (*dependent function types*), etc., but the hax translation only uses simple F* types by default. As we start doing proofs over this code in F*, we use this expressiveness to extend the generated types with logical annotations.

We also note that although F* has an expressive effect system to handle stateful code, we only use the purely functional (`Tot`) effect. This gives us a significant advantage: we can prove functional correctness without worrying about stateful side-effects.

The hax toolchain does not support the full Rust language (at least at present), so, for example, if you write a function that relies on delicate pointer manipulation and aliasing in doubly-linked lists, it will likely be rejected by hax. However, cryptographic code like ML-KEM primarily performs numeric computations using simple data structures like arrays and structs, with some generalization using generic types, const generics, and traits, and all of these features are supported by hax.

## Proving Panic Freedom

Once we have generated F* code from Rust using hax, the verification work begins. Looking again at the Barrett reduction function, we observe that it uses the following arithmetic operations:

- shift right (`>>`)
- addition (`+`)
- multiplication (`*`)
- subtraction (`-`)

All of these operations can panic if their arguments are not of the right size. For example, `i32` addition requires that the sum of the two (signed) arguments does not overflow or underflow the `i32` type; otherwise the operation will trigger a `panic` in Debug mode and produce a (possibly) unexpected result in Release mode.
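This difference between build modes is easy to observe with the standard checked and wrapping helpers, which make both outcomes explicit regardless of mode (our illustration, not libcrux code):

```rust
fn main() {
    // Bare `i32::MAX + 1` would panic in Debug mode and silently wrap in Release.
    // The standard helpers expose both behaviors explicitly:
    assert_eq!(i32::MAX.checked_add(1), None); // overflow detected
    assert_eq!(i32::MAX.wrapping_add(1), i32::MIN); // two's-complement wrap-around
    assert_eq!(i32::MAX.checked_add(0), Some(i32::MAX)); // no overflow
}
```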

In our methodology, we first require that all the code be proved to be panic-free. To do this, we annotate the F* libraries containing these operations with pre-conditions that say that their inputs have to be in the right range. We also provide specifications and lemmas for each operation that model its input-output behavior. For example, right-shift behaves like division by a power of two, addition maps to arithmetic addition, etc.

Then, we run the F* typechecker on the generated code, which tries to prove these library pre-conditions. If F* fails to prove a pre-condition, it produces a type error, indicating either that the function may panic, or that the prover requires more annotations or proof hints.

When we run F* on the `barrett_reduce` function, it uses the library definitions of the operators to prove, in sequence, that:

1. `input` fits in an `i64` (trivial since `input : i32`)
2. `input * 20159` fits in an `i64` (trivial since both inputs are `i32`)
3. `t = input * 20159 + 2.pow(25)` fits in an `i64` (requires some math reasoning)
4. `quotient = t / 2.pow(26)` fits in an `i32` (requires some math reasoning)
5. `input - (quotient * 3329)` fits in an `i32` (requires some math reasoning)

Using the annotations and lemmas in our library, F* is able to prove all of the above automatically using the Z3 SMT solver, and hence it proves that the `barrett_reduce` function is panic free.
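The math reasoning behind these obligations can be spot-checked numerically. The sketch below is ours, for illustration: it bounds the intermediate values for the worst-case input `|input| = BARRETT_R` and confirms that nothing approaches the integer-type limits.

```rust
fn main() {
    let barrett_r: i64 = 1 << 26;
    // Worst case for |t|: |input| = BARRETT_R = 2^26.
    let max_t: i64 = barrett_r * 20159 + (1 << 25);
    // t fits comfortably in an i64 (i64::MAX is about 9.2e18).
    assert!(max_t < i64::MAX / 1000);
    // quotient = t >> 26 fits in an i32.
    let max_quotient = max_t >> 26;
    assert!(max_quotient <= i64::from(i32::MAX));
    // The final subtraction input - quotient * 3329 also stays within i32:
    // a (very) conservative bound on its magnitude is BARRETT_R + quotient * 3329.
    assert!(barrett_r + max_quotient * 3329 <= i64::from(i32::MAX));
    println!("bounds hold");
}
```

These are one-off numeric checks of the worst case, whereas F* proves the bounds symbolically for every input in range.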

As we can see, panic-freedom already requires some mathematical and logical reasoning, but it is only the first step towards functional correctness. Often, panic freedom can be proved automatically by F*, but functions with loops, as we see below, require more annotations.

## Proving Functional Correctness

To prove that a function like `barrett_reduce` is correct, we first need to formally specify its expected input-output behavior. If the `input` is a natural number, the Barrett reduction algorithm is meant to compute `input % 3329`. However, in our code, `input` is a signed integer and may be negative, in which case the remainder is also a negative value. Furthermore, given the way we use this function in Kyber, we know that the input is always in the range `-BARRETT_R <= input <= BARRETT_R`.

So, we annotate the type of `barrett_reduce` (in the interface file `Libcrux.Kem.Kyber.Arithmetic.fsti`) with the types and post-condition shown below:

```
val barrett_reduce (input: i32_b (v v_BARRETT_R))
  : Pure (i32_b 3328)
    (requires True)
    (ensures fun result ->
      v result % v Libcrux.Kem.Kyber.Constants.v_FIELD_MODULUS
      = v input % v Libcrux.Kem.Kyber.Constants.v_FIELD_MODULUS)
```

The refinement type `i32_b N` is defined as the type of 32-bit signed integers `x:i32` such that `-N <= x <= N`. So the type annotation on `input` is a pre-condition that says that the input to this function must be in the range `-BARRETT_R <= input <= BARRETT_R`. The annotation on the result says that it must be in the range `-3328 <= result <= 3328`, and the post-condition in the `ensures` clause says that the result is the same as the input modulo the ML-KEM prime (`3329`).

Proving the functional correctness of `barrett_reduce` requires more work from the proof engineer, since we need to use properties of division and modulus in ways that F* cannot infer automatically. There are many ways of helping F* complete this proof. One such proof is shown below.

## A Proof of Barrett Reduction in F*

The proof is interleaved with the code of the `barrett_reduce` function in F*:

```
let barrett_reduce input =
  let t: i64 =
    ((Core.Convert.f_from input <: i64) *! v_BARRETT_MULTIPLIER <: i64) +!
    (v_BARRETT_R >>! 1l <: i64)
  in
  assert_norm (v v_BARRETT_MULTIPLIER == (pow2 27 + 3329) / (2*3329));
  assert (v t = v input * v v_BARRETT_MULTIPLIER + pow2 25);
  let quotient: i32 = cast (t >>! v_BARRETT_SHIFT <: i64) <: i32 in
  assert (v quotient = v t / pow2 26);
  let remainder: i32 =
    input -! (quotient *! Libcrux.Kem.Kyber.Constants.v_FIELD_MODULUS <: i32)
  in
  calc (==) {
    v remainder % 3329;
    (==) { }
    (v input - (v quotient * 3329)) % 3329;
    (==) { Math.Lemmas.lemma_mod_sub_distr (v input) (v quotient * 3329) 3329 }
    (v input - (v quotient * 3329) % 3329) % 3329;
    (==) { Math.Lemmas.cancel_mul_mod (v quotient) 3329 }
    (v input - 0) % 3329;
    (==) { }
    (v input) % 3329;
  };
  remainder
```

This proof uses assertions (`assert`) to tell F* to prove some useful logical property, normalizing assertions (`assert_norm`) to ask F* to verify a calculation, and a calculational tactic (`calc`) to prove the post-condition using lemmas from the F* math library (`Math.Lemmas`). This is a fairly simple explicit proof in F*; it could be further optimized and made more elegant, but it works for the specification we have here and is representative of other proofs we do for our ML-KEM code.

## Verifying Properties about Loops

Barrett reduction is a good example of our methodology since it is self-contained and has a clean mathematical specification. However, large parts of the code in ML-KEM (and other cryptographic algorithms) implement algorithms over arrays, vectors, and matrices of values. In these functions, we need to reason about loop invariants to prove that they are panic free and functionally correct.

As an example of this class of code, we consider another function from `arithmetic.rs`, which adds two ML-KEM polynomial ring elements:

```rust
pub(crate) fn add_to_ring_element(
    mut lhs: PolynomialRingElement,
    rhs: &PolynomialRingElement,
) -> PolynomialRingElement {
    for i in 0..lhs.coefficients.len() {
        lhs.coefficients[i] += rhs.coefficients[i];
    }
    lhs
}
```

This function takes two polynomial ring elements (arrays of 256 `i32` coefficients) called `lhs` and `rhs` and returns the sum of these elements in `lhs`. For simplicity, we return `lhs` here, but of course we could just mutate `lhs` as well.
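For readers who want to experiment, here is a self-contained, runnable version of the function, with `PolynomialRingElement` reduced to the one field needed here (the real libcrux type carries more structure):

```rust
// Minimal stand-in for the libcrux type: a ring element is an array of
// 256 i32 coefficients.
#[derive(Clone, Copy)]
struct PolynomialRingElement {
    coefficients: [i32; 256],
}

fn add_to_ring_element(
    mut lhs: PolynomialRingElement,
    rhs: &PolynomialRingElement,
) -> PolynomialRingElement {
    // Point-wise addition of coefficients; can overflow for large inputs,
    // which is exactly the panic-freedom obligation discussed below.
    for i in 0..lhs.coefficients.len() {
        lhs.coefficients[i] += rhs.coefficients[i];
    }
    lhs
}

fn main() {
    let a = PolynomialRingElement { coefficients: [1; 256] };
    let b = PolynomialRingElement { coefficients: [2; 256] };
    let sum = add_to_ring_element(a, &b);
    assert!(sum.coefficients.iter().all(|&c| c == 3));
}
```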

To prove that this function is panic free, we require two properties. First, the function must never access `lhs` or `rhs` out of bounds. Since we only access these arrays at indices `i` in the right range, F* can automatically prove these accesses to be safe.

Second, the addition operation `+` within the loop should never overflow. F* cannot prove this property, since it indeed does not hold for all inputs. Consequently, we need to add a pre-condition to the function in F* that limits the range of inputs it can be called with, and we further need to define a loop invariant in F* to ensure that this property is preserved as the loop executes.

To specify functional correctness, we add a post-condition to the function that says that the output coefficients are the sum of the input-coefficients. Proving this property also requires us to extend the loop invariant in F*. Details of the proof are below.

## Specifying and Verifying `add_to_ring_element`

The specification for the `add_to_ring_element`

function is:

```
val add_to_ring_element (#b1: nat) (#b2: nat{b1 + b2 < pow2_31})
    (lhs: t_PolynomialRingElement_b b1) (rhs: t_PolynomialRingElement_b b2)
  : Pure (t_PolynomialRingElement_b (b1 + b2))
    (requires True)
    (ensures fun result ->
      (forall i. v result.f_coefficients.[i]
        == v lhs.f_coefficients.[i] + v rhs.f_coefficients.[i]))
```

It requires that every coefficient `x` in the input `lhs` must be bound by some `b1` (`-b1 <= x <= b1`) and every coefficient `y` in `rhs` must be bound by some `b2` (`-b2 <= y <= b2`) such that `b1 + b2 < pow2 31`. If this pre-condition holds, then the function produces an output ring element where every coefficient `z` is bound by `b1+b2` (`-(b1+b2) <= z <= (b1+b2)`). Furthermore, the `ensures` clause states a post-condition that each coefficient in the output is a point-wise sum of the corresponding coefficients in the two inputs.

The F* code and proof of this function are too verbose to include here, but the key element is the loop invariant defined below:

```
let inv = fun (acc: t_PolynomialRingElement_b (b1 + b2)) (i: usize) ->
  (forall j. j <. i ==>
    acc.f_coefficients.[j]
      == lhs.f_coefficients.[j] +! rhs.f_coefficients.[j]) /\
  (forall j. j >=. i ==>
    acc.f_coefficients.[j]
      == orig_lhs.f_coefficients.[j])
```

It says that after `i` iterations, the coefficient at every index `j < i` of the result is the sum of the corresponding values in `lhs` and `rhs`, and the coefficient at every index `j >= i` of the result is the same as the corresponding coefficient in the original `lhs`. Once we state this invariant, F* is able to prove the rest of the properties automatically.
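The invariant can also be checked dynamically on a toy example. The sketch below is ours, using a hypothetical 8-element array rather than the full 256-coefficient ring element; it asserts both conjuncts of the invariant at the top of every loop iteration:

```rust
// Dynamically checks the loop invariant from the F* proof on a small slice:
// after i iterations, indices j < i hold lhs[j] + rhs[j], while indices
// j >= i still hold the original lhs[j].
fn add_checking_invariant(orig_lhs: &[i32], rhs: &[i32]) -> Vec<i32> {
    let mut acc = orig_lhs.to_vec();
    for i in 0..acc.len() {
        // First conjunct: already-processed entries are point-wise sums.
        assert!((0..i).all(|j| acc[j] == orig_lhs[j] + rhs[j]));
        // Second conjunct: not-yet-processed entries are unchanged.
        assert!((i..acc.len()).all(|j| acc[j] == orig_lhs[j]));
        acc[i] += rhs[i];
    }
    acc
}

fn main() {
    let lhs = [10, 20, 30, 40, 50, 60, 70, 80];
    let rhs = [1, 2, 3, 4, 5, 6, 7, 8];
    assert_eq!(
        add_checking_invariant(&lhs, &rhs),
        vec![11, 22, 33, 44, 55, 66, 77, 88]
    );
}
```

Where the runtime assertions only check one execution, the F* invariant is proved to hold on every iteration for every valid input.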

## Lessons and Benefits of Formal Verification

In this post, we have tried to provide a taste of using hax and F* to verify our ML-KEM Rust implementation. The full implementation is quite large and we are still cleaning up and polishing off the last of our proofs before a final release.

Our experience with ML-KEM shows off several benefits of formal verification:

- **Finding bugs and attacks**: Our formal analysis resulted in us finding a timing bug in our code, which then pointed the way towards bugs in other implementations of ML-KEM. Although the bug itself does not require deep verification in F*, it does rely on a model of integer arithmetic, and an analysis of how secret information flows through the ML-KEM implementation.
- **Code understanding**: Our Rust code was comprehensively tested, so we did not expect to find many bugs, but even so, there were several instances where formal verification found flaws in our assumptions. For example, in some cases, the calculations were much closer to the boundaries of an integer type than we thought.
- **Justifying optimizations**: Formal verification allowed us to have confidence in the low-level optimizations that we used to speed up our code. For example, in many places in the code, we postpone modular reduction because we can prove that we will not overflow the integer type, and this yields significant performance gains.

In summary, this project shows that formal verification has advantages beyond finding an attack or constructing a proof of a snapshot of the code; it can be used as a constructive part of high-assurance, high-performance cryptographic software development.

To learn more about our methodology, results, or how we can help your project, contact us.
