Project Eleven - Blog - PQ Implementation Vulnerabilities Volume 1: a RustCrypto side channel CVE

This is the first post in a Project Eleven series we are calling "PQ Implementation Vulnerabilities".

Post-quantum cryptography (PQC) is now standardized, but standardized does not mean it is safe in your production build. The hard part has moved into implementations, compilers, and the sometimes unglamorous details that decide whether keys leak.

In this series, we will track implementation bugs, CVEs, and issues as they come up, and break them down clearly.

This post is about a timing side-channel in RustCrypto’s ML-DSA implementation (GHSA-hcp2-x6j4-29j7, CVE-2026-22705)

What is "constant-time" and why does it matter?

When cryptographers say "constant time" they mean that nothing about how long the computation takes, which branches it takes, or which memory it touches, should depend on a secret.

In practice, that means you avoid things like:

Branches whose condition depends on a secret (for example if secret_bit == 1 { ... }).
Memory accesses whose index depends on a secret (for example table[secret_idx]).
Instructions whose runtime can vary with their inputs, even if the code looks branchless. (hardware division is a classic example on some CPUs).

Timing leaks matter when an attacker can measure time precisely enough and collect enough samples to average out noise. This isn’t always possible and whether this is exploitable depends on the threat model: local attacker vs remote, ability to collect many timings, noise in the environment, and how much the secret influences the timing signal.

Here are three toy examples of what "non-constant-time" looks like in practice.

1) Secret-dependent branch (control flow leak)

fn sign(secret_key, message):
    bit = get_bit(secret_key, 0)
    if bit == 1:
        x = do_more_work(message)
    else:
        x = do_less_work(message)
    return x

If the branch depends on a secret, runtime and CPU behavior can differ in ways an attacker can measure.

2) Secret-dependent memory access (cache and memory-system leak)

fn sign(secret_key, message):
    idx = get_byte(secret_key ^ message, 0) # 0..255
    x = table[idx]                          # touches secret-dependent address
    return x

Even if the code path is the same, different indices touch different cache lines, which can leak via timing.

3) Variable-time instruction on secret-influenced data (instruction-level leak)

fn do_something(x):
		x / SOME_CONST

fn sign(secret_key, message):
    do_something(secret_key ^ message) == 7

Even with no branches or table lookups, some instructions (division is a common one) can take different amounts of time depending on operand values.

Division and constant time

This specific bug in question is about a division, so it is worth being clear about why division is treated differently in constant-time code.

Intuitively, we all expect some divisions to be more difficult and others to be easier. That intuition is not wrong, and CPUs have the same problem. Hardware division is not one fixed sequence of simple operations like add or xor. It is typically implemented as an algorithm inside the CPU, and the amount of internal work can vary depending on the operands. The awkward cases are not always the biggest numbers, they are often the ones whose bit patterns make the internal steps take longer, or prevent the hardware from taking shortcuts.

At the machine-code level, a high-level / often becomes a dedicated divide instruction (you will see mnemonics like UDIV or SDIV, meaning unsigned or signed division respectively in assembly). On many architectures that instruction can take a variable number of cycles depending on the numerator value, even when the divisor is a constant. That is enough to violate constant-time if the numerator is influenced by secret key material.

The practical point is simple; in constant-time cryptographic code you avoid division on secret-influenced values because it can be variable.

`Decompose` and `r`

This section does not require an in-depth understanding of ML-DSA. If you do want a full walkthrough, read our Lattice City blog.

For this bug, you only need to understand one aspect: ML-DSA signing repeatedly splits intermediate values into high bits and low bits using a function called Decompose, and emits a small hint so the verifier can recover the same high bits when it recomputes the check from the public key, message, and signature.

At a high level, Decompose takes an input, r, and returns two values (r1, r0) relative to a known public constant called γ2 (you will see this as TwoGamma2 in RustCrypto), such that:

r ≡ r1 * (2 * γ2) + r0 (mod q)

The exact detail here isn’t that important, but you can think of r1 as the coarse part of r and r0 as the remainder.

Here are the simplified formulae for computing r1 and r0 :

r ≡ r1 * (2 * γ2) + r0 (mod q)

What should jump out here is the need for a divisor to compute r1.

Decompose is not called directly in most of the spec. Instead, the spec defines two tiny wrappers that just call Decompose and return one half of the result:

HighBits(r) calls Decompose(r) and returns r1
LowBits(r) calls Decompose(r) and returns r0

Importantly, during signing these wrappers are fed values that are influenced by long-term secret key material:

LowBits(w - c*s2): where s2 is a secret key component (part of the private signing key) so Decompose runs on an input that is directly influenced by s2.
HighBits(...): later in signing, the code generates what is called a hint and, in that path, it calls HighBits() on values derived from t0. t0 is another secret key component (also part of the private signing key), so again Decompose, via HighBits, also runs on inputs influenced by secret information.

It should now be clear that the division inside Decompose ends up running on values derived from long-term secrets, which is exactly the precondition for a timing side-channel.

The RustCrypto bug

We can now look at the exact implementaion of Decompose in RustCrypto:

fn decompose<TwoGamma2: Unsigned>(self) -> (Elem, Elem) {
    // ...
    let mut r1 = r_plus - r0;
    r1.0 /= TwoGamma2::U32;  // Variable-time division on secret-derived data
    (r1, r0)
}

This is the entire problem. The function performs a normal integer division (/=) to compute r1.0 / TwoGamma2::U32. The analyzer flagged the resulting machine code because it included hardware division instructions (UDIV/SDIV) with data-dependent timing behavior.

From a constant-time perspective, the issue should now be clear:

The numerator going into that division is derived from signing state.
That signing state is influenced by long-term secret key material.
Hardware division can take different time for different numerator values.
Signing executes this kind of operation repeatedly across many coefficients/intermediate values.

So the signer is potentially emitting a small timing difference that is correlated with secret-influenced values. An attacker with precise timing measurements could, in principle, extract information about the signing key by observing timing variations in that division. This is the CVE.

The fix

The fix is not to try make division constant-time. You generally cannot, because the variable behavior is deep inside the CPU’s divide unit. Rather, the fix is to not let the compiler emit a hardware division instruction on secret-influenced values.

Here the divisor is a public constant (2*γ2). For division by a constant, you can compute the quotient using arithmetic that is easier to keep constant-time: multiplication and shifts, plus a small constant-time correction if needed, this is often described as "constant-time Barrett reduction style division-by-constant". The idea is the same as what compilers sometimes do for optimisation (replace divide-by-constant with multiply-by-constant and shift), except in cryptographic code you do it deliberately and then verify the generated machine code.

At a high level the replacement looks something like this:

# Divide x by a public constant D without emitting a hardware divide.

q = (x * MULT) >> SHIFT

# where MULT is some value precomputed using D
# optional correction to make q exact for the allowed input range, done without branches

The proposed patch implementation avoids the UDIV/SDIV instruction entirely in this path, so the variable-time behavior of hardware division is removed from the signing loop. What remains are instructions like multiply, shift, add, and subtract, which constant-time tooling can analyse more reliably, and which are far less likely to have operand-dependent early termination behavior. The proposed patch in RustCrypto replaces the /= in decompose with exactly these kind of constant-time operations. This is patched in >= 0.1.0-rc.3.

A new failure mode

This is a good example of, what we think at Project Eleven, will become the new normal. ML-DSA is not broken. A specific implementation had a constant-time violation in a small helper, and that helper sits on a secret-influenced signing path.

That is what "PQ Implementation Vulnerabilities" is about. Standards give us interoperable primitives. The next hard challenge is making sure implementations keep their security properties after the compiler and the CPU have had their say. If you are looking to implement post-quantum cryptography into your stack, or have questions on post-quantum security for your software, reach out to us.

PQ Implementation Vulnerabilities Volume 1: a RustCrypto side channel CVE