PlaidCTF 2021: Leaky Block Cipher

PlaidCTF this year had very insane challenges. Although I have spent a lot of time working on those crypto challenges, I was only able to solve leaky block cipher.

Challenge Summary⌗

This completely legitimate™ block cipher looks a bit like GCM, but my computer plumber keeps complaining about water residue. Can you help me spot the leak?

The challenge has a hand-crafted AEAD scheme. We are required to go through 20 rounds of challenges, and this is how each round proceeds:

The service sends you an IV that is used in this round.
You send a plaintext (with >= 7 blocks) for encryption.
The service encrypts your plaintext and returns the tag and 15.5 bytes of the encrypted IV ($\mathcal{C}_\text{IV}$).
If you are able to provide the remaining 0.5 bytes of $\mathcal{C}_\text{IV}$, you won this round.

Solution⌗

Part I: Understanding the scheme⌗

To begin, let's have a look on the LeakyBlockCipher class, where most of the things are happening.

class LeakyBlockCipher:
    def __init__(self, key = None):
        if key is None:
            key = secrets.token_bytes(16)
        self.key = key
        self.aes = AES.new(key, AES.MODE_ECB)
        self.H = self.aes.encrypt(bytes(16))
    def encrypt(self, iv, data):
        assert len(iv) == 16
        assert len(data) % 16 == 0
        ivi = int.from_bytes(iv, "big")
        cip = bytes()
        tag = bytes(16)
        for i in range(0,len(data),16):
            cntr = ((ivi + i // 16 + 1) % 2**128).to_bytes(16, byteorder="big")
            block = data[i:i+16]
            enced = self.aes.encrypt(xor(cntr, block))
            cip += enced
            tag = xor(tag, enced)
            tag = gf128(tag, self.H) # multiplication under GF(2^128)
        tag = xor(tag, self.aes.encrypt(iv))
        return cip, tag

As mentioned in the summary, we are only given the tag for the plaintext we send. This is how the tag, $T$, for a $n$-block ciphertext $(c_1, c_2, ..., c_n)$ defined:

\[T := c_1 {\mathcal{C}_0}^n + c_2 {\mathcal{C}_0}^{n-1} + ... + c_n {\mathcal{C}_0} + {\mathcal{C}_\text{IV}},\]

hereby ${\mathcal{C}_0}$ is the encrypted block of null bytes and ${\mathcal{C}_\text{IV}}$ is the encrypted IV.

From above, here are two ciphertext blocks we are interested: ${\mathcal{C}_0}$ and ${\mathcal{C}_\text{IV}}$. We should pick those $c_i$'s such that $c_i \in \{{\mathcal{C}_0}, {\mathcal{C}_\text{IV}}\}$, or otherwise we are actually introducing more unknowns, which is not desired.

Part II: Testing around and observing patterns⌗

One interesting way to set $c_i$'s is to set all of them as ${\mathcal{C}_\text{IV}}$. In that way, the tag we have is

\[T := {\mathcal{C}_\text{IV}}({\mathcal{C}_0}^n + {\mathcal{C}_0}^{n-1} + ... + 1).\]

It is very easy for us to exhaust ${\mathcal{C}_\text{IV}}$ since we already have 15.5 bytes of it, where there are only 16 possible cases. With that, ${\mathcal{C}_0}$ is the only unknown and its terms form an geometric series:

\[{\mathcal{C}_0}^n + {\mathcal{C}_0}^{n-1} + ... + 1 = \frac{T}{\mathcal{C}_\text{IV}}.\]

I was expecting that the above equation has roots in ${\mathcal{C}_0}$ only if ${\mathcal{C}_\text{IV}}$ is correct. I was wrong. In average, there are ~12 out of 16 candidate ${\mathcal{C}_\text{IV}}$'s has roots for ${\mathcal{C}_0}$. Although I am able to filter a few candidates away, I am never lucky to picking one in twelve correctly for 20 times in a row.

While I was messing around with $n$, I found the number of ${\mathcal{C}_\text{IV}}$'s that the equation has roots is dropped suddenly when $n = 15$. There are only one or two candidates in most of the cases. I've executed an experiment with the below code snippet to estimate the number of possible ${\mathcal{C}_\text{IV}}$'s.

from tqdm import tqdm

R.<x> = GF(2)[]
F = GF(2^128, name='a', modulus=x^128+x^7+x^2+x+1)

counts = [0 for _ in range(17)]
for _ in tqdm(range(100)):
    k0 = getrandbits(124)
    k1 = getrandbits(4)

    K = F.fetch_int(16*k0 + k1)
    H = F.fetch_int(getrandbits(128))
    T = K * sum(H^i for i in range(15+1))

    c = 0
    f = False

    for i in range(16):
        K = F.fetch_int(16*k0+i)

        rs = (K * sum(x^i for i in range(15+1)) + T).roots()
        for r, _ in rs: # H
            if r == H: f = True
        if len(rs) > 0: c += 1
    assert f

    counts[c] += 1

for c in range(17):
    print(c, counts[c])

From the results, there are one or two ${\mathcal{C}_\text{IV}}$'s in 80% of the cases. There is a 70% chance to solve a round of the challenge. Hence, we are able to get the flag in around 1250 attempts (with winning probability 0.7²⁰). Although this is decent, I would like to investigate why is the equation $x^{15} + x^{14} + ... + 1 = K$ (where $K$ is a constant) is less likely to have roots compared with others.

Part III: We need to go deeper⌗

…or not. It is still possible for someone to solve the challenge from the above results. I am just curious. Anyway, we will be speaking mathematics now!

Suppose that $g \in \text{GF}(2^{128})$ is a generator. Then the order of $g$ is $2^{128}-1$. Since 15 is a factor of $2^{128}-1$, $g^{15}$ has an order $(2^{128}-1)/15$. This said, there is $1/15$ chance for $x^{15} = h$ to have solutions if we pick $h \in \text{GF}(2^{128})$ randomly. On the other hand, we have an interesting fact that happens on $\text{GF}(2^{128})$:

\[x^{15} + x^{14} + ... + 1 = (x + 1)^{15}.\]

The probability for $x^n + x^{n-1} + ... + 1 = h$, where $h$ is randomly picked from $\text{GF}(2^{128})$, to be solvable is $1/n$ if $n \in \mathbb{N}$ satifies:

$x^n + x^{n-1} + ... + 1 = (x + 1)^n$, and
$2^{128}-1$ is divisible by $n$.

We want to find a larger $n$ to reduce the above probability, since it is correlated to the number of unwanted candidates for ${\mathcal{C}_\text{IV}}$. However $n$ should not be too large because this is the number of plaintext blocks we are sending to the server. For the first condition, it suffices to find $n$ such that $n \choose r$ are all odd for $r = 0, 1, ..., n$. This happens if and only if $n = 2^k - 1$ for some $k \in \mathbb{Z}_{\geq 0}$. In particular, $n = 255$ sounds a good pick - the probability is small enough, yet it is still reasonable to send 255 plaintext blocks for interaction. On the other hand, instead of finding the roots for $(x + 1)^{255} = K$, we can simply check if $K^{(2^{128}-1)/255}$ equals one. To summarize, this is how we solve one round of the challenge:

Send $m_1, m_2, ..., m_{255}$ to the server, where $m_1 = m_2 = ... = m_{255} = (\text{IV}+k) \oplus \text{IV}$. We then have $c_1 = c_2 = ... = c_k = \mathcal{C}_\text{IV}$.
Receive $T$ and 15.5 bytes of $\mathcal{C}_\text{IV}$.
For each candidate $C'$ of $\mathcal{C}_\text{IV}$ (there are only sixteen),
- Compute $A := T/C'$.
- Compute $B := A^{(2^{128}-1)/255}$.
- If $B = 1$, assume that $\mathcal{C}_\text{IV} = C'$.

After solving 20 rounds of challenge from the server, we have the flag: PCTF{you_found_the_residues!_db2f6ade22d73a90b15e8d1b06167393}.