The strengths and limits of formal verification

Karthikeyan Bhargavan
February 12, 2026

I have learned that it is important to be precise when documenting the guarantees offered by formally verified code, and to resist the temptation to oversimplify in trying to reach a larger audience. In my talks, to my students, and to our users and customers, we always clearly describe our coding, testing, and verification methodologies, explain what we verify and against what specifications, spell out our trust assumptions, and detail the limitations of our approach.

This week, we began revamping the documentation on our website and in our code repositories to reflect these technical nuances more clearly. We are also clarifying our vulnerability reporting and security advisory policies for all our projects.

The great strength of formal verification is that it can eliminate entire classes of potential bugs from large parts of a codebase, allowing testers and reviewers to focus their efforts on the unverified bits. For example, most of the bugs in cryptographic libraries like OpenSSL can be classified as memory safety bugs (e.g. buffer overflow or use-after-free), functional correctness bugs (e.g. missed carry propagation), or side-channel vulnerabilities (e.g. branching on secrets). Formal verification frameworks, like F* and Rocq, can be used to rule out memory safety and correctness bugs in C code, and assembly analysis tools, like Jasmin and Vale, can also provide low-level guarantees against certain classes of side channels.
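To make the functional correctness class concrete, here is a toy example (in Rust, purely illustrative, not code from any of the libraries discussed here) of the kind of carry propagation that correctness proofs check. Forgetting to add the low-limb carry into the high limb is exactly the sort of bug that passes casual testing but is ruled out by a proof against a mathematical specification.

```rust
// Toy 2-limb (128-bit) big-integer addition, little-endian limb order.
// Returns the sum limbs and the outgoing carry.
fn add_128(a: [u64; 2], b: [u64; 2]) -> ([u64; 2], u64) {
    // Low limb: may overflow, producing a carry.
    let (lo, carry0) = a[0].overflowing_add(b[0]);
    // High limb: must absorb the carry from the low limb.
    // Dropping this step is the classic "missed carry propagation" bug.
    let (hi, carry1) = a[1].overflowing_add(b[1]);
    let (hi, carry2) = hi.overflowing_add(carry0 as u64);
    ([lo, hi], (carry1 as u64) | (carry2 as u64))
}

fn main() {
    // (2^64 - 1) + 1 must carry into the high limb: result is 2^64.
    let (sum, carry_out) = add_128([u64::MAX, 0], [1, 0]);
    assert_eq!(sum, [0, 1]);
    assert_eq!(carry_out, 0);
    println!("ok");
}
```

A functional correctness proof would show that, for all inputs, the limbs and carry together equal the mathematical sum of the two 128-bit integers.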

Just because a library or application uses formal verification does not mean that it has no bugs! In the past decade, across multiple projects, we have repeatedly seen bugs appear in unverified components - platform stubs, API wrappers, interface code - while large verified components have remained bug-free. These bugs were often found by developers and users during manual review and testing, which is as essential for verified libraries as it is for unverified ones.

Let me detail these experiences a bit further in the context of two crypto libraries that I have been closely involved in, HACL* and libcrux.

HACL*: high-assurance cryptography in F*

The HACL* project was developed as part of Project Everest, by a collaboration between my previous team at Inria, Microsoft Research, and CMU. The majority of the codebase is written in the Low* subset of the F* programming language, and then compiled via the KaRaMeL compiler to C code that can further be compiled via GCC, Clang, or CompCert. Some algorithms also have verified assembly implementations written in Vale and compiled to Intel x64 assembly. The generated C and assembly code has been deployed in Firefox and Linux and hence has benefited from code reviews and comprehensive testing from mainstream open-source projects across a variety of platforms.

The HACL* library includes optimized implementations of most modern cryptographic algorithms, including elliptic curve Diffie-Hellman and signatures (Curve25519, P-256), AEAD algorithms (AES-GCM, Chacha20-Poly1305), hashing (SHA-2, SHA-3, Blake2), etc. The source Low* code for these algorithms was written by researchers and PhD students over many years and formally verified using the F* verification framework for memory safety and functional correctness against high-level mathematical specifications for the algorithms.

The HACL* library generates 135K lines of C and assembly code from 245K lines of verified F* code. In addition, the generated code includes 1800 lines of C stubs produced by the KaRaMeL compiler. In the nine years it has been in use (2017-2026), the project has had several bugs. Twice, we found bugs in the high-level specifications we were verifying the code against and had to change the code and proofs. More commonly, users would find bugs in the C stubs, say in the platform-specific code for handling endianness, when testing the code on platforms the HACL* CI did not support. For small projects, such bug reports are invaluable contributions.
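A hypothetical sketch (in Rust, for readability; the actual stubs are C) of where such endianness bugs hide: a hand-rolled load that silently assumes the host byte order works on the little-endian platforms a CI tests, but mis-reads data on big-endian platforms, whereas an explicit byte-order conversion is correct everywhere.

```rust
// Cryptographic specs fix the byte order of inputs; the implementation
// must honor it regardless of the host's native endianness.
fn load32_le(bytes: &[u8; 4]) -> u32 {
    // Explicit little-endian load: correct on every platform. A buggy stub
    // might instead reinterpret the bytes in host order, which only happens
    // to agree with the spec on little-endian machines.
    u32::from_le_bytes(*bytes)
}

fn store32_le(x: u32) -> [u8; 4] {
    x.to_le_bytes()
}

fn main() {
    // 0x04030201 encoded little-endian, byte by byte.
    assert_eq!(load32_le(&[0x01, 0x02, 0x03, 0x04]), 0x0403_0201);
    assert_eq!(store32_le(0x0403_0201), [0x01, 0x02, 0x03, 0x04]);
    println!("ok");
}
```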

Formal verification allows HACL* users and reviewers to focus on a much smaller bug surface without worrying about subtle mathematical bugs inside the algorithm implementations. Still, it is important to emphasize that a proof about the F* code does not translate to an unconditional mathematical proof about the deployed executable. We need to trust the F* verification framework, the KaRaMeL and C compilers, and any system libraries or unverified code we depend on.

Libcrux: high-assurance cryptography in Rust

At Cryspen, we still use HACL* code within our libcrux Rust library. The code from HACL* is compiled to Rust using a new research tool and then extended with hand-written (unverified) Rust APIs.

For new algorithms, such as those for post-quantum cryptography, we write the code directly in Rust and use the hax toolchain to verify its runtime safety and functional correctness. This approach allows us to involve Rust developers more closely in the verification process, instead of simply handing them generated code.
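A minimal sketch (not actual libcrux code) of what verification-friendly Rust looks like: a function carrying an explicit precondition and postcondition, written here as runtime assertions for illustration. With a toolchain like hax, contracts of this shape are attached to the function and discharged statically by a prover such as F*, so nothing needs to be checked at runtime. The modulus 3329 is ML-KEM's prime q; the function itself is a simplified stand-in.

```rust
// ML-KEM's prime modulus, used here only for illustration.
const Q: u16 = 3329;

/// Modular addition of two canonical field elements.
fn add_mod(a: u16, b: u16) -> u16 {
    debug_assert!(a < Q && b < Q); // precondition: canonical inputs
    let sum = a + b; // cannot overflow u16, since a, b < 3329
    let r = if sum >= Q { sum - Q } else { sum };
    debug_assert!(r < Q); // postcondition: canonical output
    r
}

fn main() {
    assert_eq!(add_mod(3328, 2), 1); // wraps around the modulus
    assert_eq!(add_mod(1, 2), 3);
    println!("ok");
}
```

The point of the pattern is that the contracts live next to the Rust code the developer maintains, rather than in a separate verified artifact they never see.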

Using this methodology, we have written Rust implementations of ML-KEM and ML-DSA and verified many of their core modules using hax and F*. Not all the code in libcrux is formally verified yet. For each crate, the code repository carefully documents what modules and properties we currently prove, and the proofs keep improving and changing as the code evolves. We are now experimenting with using Lean for some of these proofs instead of F*, and with using specifications written in Rust rather than in F*.

The libcrux code relies on a number of system libraries, including compiler intrinsics to implement platform-specific SIMD optimizations for Intel AVX2 and Arm64 Neon. These core libraries, provided by the Rust compiler, are not verified, although Cryspen contributes to the testing and formal specification of these libraries, sometimes finding and fixing bugs. Within libcrux, we implement verification-friendly wrappers around some system libraries (like SIMD intrinsics), and we implement uniform Rust APIs around the HACL* code and the verified Rust algorithms.
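A hypothetical sketch of the wrapper pattern described above (names and structure are illustrative, not libcrux's actual API): a small, verification-friendly function fronting a platform intrinsic, with a portable scalar fallback that doubles as its specification. As discussed below, the fallback is also the path most easily left untested on mainstream hardware.

```rust
/// Add two vectors of four 32-bit lanes, wrapping on overflow.
fn add_lanes(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    // A real wrapper would use cfg(target_arch) or runtime feature
    // detection to dispatch to an AVX2 or Neon intrinsic (via core::arch);
    // here only the portable fallback is shown. Verification targets this
    // simple code, and the intrinsic path is trusted to match it.
    add_lanes_scalar(a, b)
}

/// Portable fallback: simple enough to verify, and the code path that
/// must be explicitly tested, since CI machines with SIMD never run it.
fn add_lanes_scalar(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

fn main() {
    assert_eq!(add_lanes([1, 2, 3, 4], [10, 20, 30, 40]), [11, 22, 33, 44]);
    println!("ok");
}
```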

As with HACL*, the verification guarantees for libcrux extend only to the verified modules. In addition, we need to carefully review and test all the unverified code, the system libraries, the Rust compiler, the hax toolchain, and the F* framework. We use standard test vectors, including those from the Wycheproof project, as part of our CI for the algorithms in libcrux. We currently test only on a few mainstream platforms like Intel x64 and Arm64, and testing on unsupported platforms remains best effort.

The libcrux library is still young and in pre-release (its crates are versioned below 0.1), and several bugs have recently been found in its codebase. One was in a fallback implementation for a platform-specific intrinsic (mea culpa, a bug I introduced, in a code path that was not exercised on the modern Arm64 machines I tested the code on) and one in an unverified wrapper around HACL* code. So far, no bugs have been found in the verified code, although we would be very interested to know if any of our code generation tools inadvertently introduced a bug, and even more interested if the bug should have been caught by formal verification.

In addition, libcrux suffers from a problem many researchers have identified in high-level cryptographic libraries: code that is written to be constant-time can be turned into branching code by clever compilers. This is a systematic problem that we, like other crypto libraries, try to mitigate by enforcing careful coding practices, but we do not prove formal guarantees of side-channel resistance for the compiled executable. If side-channel attacks are a crucial part of your threat model, talk to us before using libcrux. Going forward, we plan to collaborate with researchers working on binary analysis to verify side-channel resistance for compiled versions of libcrux.
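A minimal sketch of the constant-time coding practice in question: selecting between two values with a bitmask instead of an `if`, so the source contains no secret-dependent branch. As noted above, this is a source-level mitigation; an optimizing compiler may still recognize the pattern and reintroduce a branch, which is exactly why source-level discipline alone proves nothing about the executable.

```rust
/// Select `a` if `flag == 1`, else `b`, without branching on `flag`.
/// Caller must ensure `flag` is 0 or 1.
fn ct_select(flag: u64, a: u64, b: u64) -> u64 {
    // Expand the bit into an all-zeros or all-ones mask:
    // 0 -> 0x0000_0000_0000_0000, 1 -> 0xFFFF_FFFF_FFFF_FFFF.
    let mask = flag.wrapping_neg();
    (a & mask) | (b & !mask)
}

fn main() {
    assert_eq!(ct_select(1, 7, 9), 7);
    assert_eq!(ct_select(0, 7, 9), 9);
    println!("ok");
}
```

Contrast this with `if flag == 1 { a } else { b }`, whose timing and branch-predictor behavior can depend on the secret `flag`.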

Open-Source Formally-Verified Cryptography

At Cryspen, we contribute to and help maintain several open-source cryptographic libraries, including HACL*, libcrux, OpenMLS, and Bertie, each with different audiences, goals, verification status, and production readiness. We try to document their precise verification guarantees in their open-source repositories, and we ask that people talk to us if they have any questions.

It is important to remember that, like the code itself, formal proofs are living artifacts. Code that is verified may lose its guarantees when it is modified, so we run formal verification as part of our CI to catch and fix proof regressions. Any PR to the repository must reverify all the code before being merged. Each addition to the repository must also document its verification status. Our goal in all the projects we work on is to start small and grow the verified base until it covers as many code components as possible. Consequently, before using a young and fast-moving library like libcrux that contains both verified and unverified code, it is important to communicate with our developers who can advise you on whether the library is mature enough for your use-case and what guarantees you can currently expect from it.

At Cryspen, we are a small team that strongly believes in open-source formally-verified crypto, and we invite public review, bug reports, and contributions. We believe the combination of open-source and formal verification can allow a level of assurance not achievable by either on their own. We do ask that reviewers work with us constructively to improve the code and have patience while we investigate any issues they have found. If you think you have found a security vulnerability, we recommend that you email security-reports and we can work together on the fix.

If you want to learn more about or discuss our code or methodology, contact us, or find us at Real World Crypto 2026 or the Open Source Crypto Workshop, and we would be happy to chat.