HITCON CTF 2024 Writeup

The crypto challenge authors in HITCON 2024, @maple3142 and @_bronson113 prepared a set of exciting and difficult challenges.

I collaborated with @thehackerscrew1 as a guest player this time. In this blog post, we will cover three challenges: ZKPoF, PCBC Revenge and Hyper512.

ZKPoF⌗

Challenge Summary⌗

This challenge implements the proof-of-knowledge protocol for factoring specified in Poupard and Stern’s paper¹, where the server and us take turn being the verifier.

In the challenge, we have $A = 2^{1000}$ and $B = 2^{80}$. We are the verifier and the server is the prover on the first part of the connection. We can ask for up to 311 proofs:

The roles are swapped on the second part. We are asked to construct 13 proofs:

The objective is to successfully construct proofs when we are the prover.

Solution⌗

Part I: Yes Mystiz’s trick⌗

🤔 What Mystiz’s trick? I made a crypto challenge in Bauhinia CTF where the intended solution is to abuse negative numbers. Later in Balsn CTF 2023, @Utaha wrote a challenge and threw the “No Mystiz’s trick” exception when the players send negative numbers to the server.

The first thing I see is $e$ can be negative when we are the verifiers… Apparently @maple3142 is not as careful as @Utaha!

def zkpof(z, n, phi):
    # I act as the prover
    r = getRandomRange(0, A)
    x = pow(z, r, n)
    e = int(input("e = "))
    if e >= B: # 👈 WARNING: This can be negative
        raise ValueError("e too large")
    y = r + (n - phi) * e
    transcript = {"x": x, "e": e, "y": y}
    return json.dumps(transcript)

I immediately assumed that we can send $e = -A$, and we can obtain $y = r - (n - \phi) \cdot A$ from the verifier. Since $r \in [0, A)$, we can recover $\phi$ effectively. Let’s go!

Wait no. It runs assert zkpof_verify(z, t, n) before we are given the transcript. Since the $y$ we have is negative, we did not fulfil the condition $0 \le y \lt A$:

def zkpof_verify(z, t, n):
    transcript = json.loads(t)
    x, e, y = [transcript[k] for k in ("x", "e", "y")]
    return 0 <= y < A and pow(z, y - n * e, n) == x

Fortunately, this is not the end of the world. Notably, the server is running Python 3.12, where we will be condemned when a large number like $10^{4300}$ is constructed².

Let’s look again at code snippet that we act as the verifiers:

z = rand.randrange(2, n)
t = zkpof(z, n, phi)
assert zkpof_verify(z, t, n)

There are multiple reasons that the above snippet will lead to an exception, where #1 and #2 come from zkpof, and #3 comes from zkpof_verify:

Error: e too large if $e \ge B$.
Error: Exceeds the limit (4300 digits)... if $|y| \ge 10^{4300}$.
Error: if $y < 0$ or $y \ge A$ or $z^{y - ne} \not\equiv x \ (\text{mod}\ n)$.

Now we can know whether $y < -10^{4300}$ from the server’s response. For instance, we can send $e = -\lceil 10^{4300} / 2^{512} \rceil$ to check whether $\phi \le n - 2^{512}$. We will can leak one bit of $\phi$ using binary search by choosing appropriate $e$s. Since we have 311 oracle calls, we can leak 311 bits of $n - \phi$. Since $n - \phi = p + q - 1 < 2^{513}$, we only have around 200 bits that remained unknown.

Part II: Lattice for the win⌗

Now we have an approximation of $p + q = n - \phi + 1$ (we have 311/513 bits of it), we can estimate $p$ by first computing $p - q = \sqrt{(p + q)^2 - 4n}$ (assuming $p > q$) then obtain

$$ p = \frac{1}{2}(p + q) + \frac{1}{2}(p - q). $$

Now we have an approximated $p$ (namely $\tilde p$), which the error is around 200 bits. Since we have a large portion of $p$, we can recover the rest using LLL. We can use the polynomial $\text{f}(x) = \tilde p + x$ and look for small roots over modulo $n$. With $p$ recovered, we can now construct valid proofs as a prover and convince the server for the flag:

hitcon{the_error_is_leaking_some_knowledge}

PCBC Revenge⌗

Challenge Summary⌗

We are given the $\text{Encrypt}$ function, which is based on a custom mode of operation with three keys, $k_0$, $k_1$ and $k_2$. The implementation is given in the Python snippet below:

def encrypt(message):
    cipher = b""
    prev_block = iv # 👈 "iv" is a constant per connection
    counter1 = counter(nonce1)
    counter2 = counter(nonce2)
    for block in get_blocks(pad(message, BLOCK_SIZE)):
        enc1 = AES_ECB_enc(key_ctr1, next(counter1))
        enc2 = AES_ECB_enc(key_cbc, xor(block, prev_block, enc1))
        enc3 = AES_ECB_enc(key_ctr2, next(counter2))
        # 👆 "key_cbc", "key_ctr1" and "key_ctr2" will be labeled as k0, k1 and k2
        enc4 = xor(enc3, enc2)
        prev_block = xor(block, enc4)
        cipher += enc4

    return iv + cipher

📋 Notation. For simplicity, I will use $\mathcal{E}_j$ (resp. $\mathcal{D}_j$) as the function that encrypts (resp. decrypts) one block using the key $k_j$, for $j = 0, 1, 2$.

The below figure shows one round of the given mode of operation for $i = 1, 2, …$:

Here $m_0 = \texttt{00 00 … 00}$. Also, for $i = 0, 1, …$, we have

$$r_i = \mathcal{E}_1(\text{nonce}_1 + i) \quad \text{and} \quad t_i = \mathcal{E}_2(\text{nonce}_2 + 1).$$

When connected to the server, we are given the encrypted flag $c_\text{flag}$. We are also given the two functions below, and we can call them as much as we want:

[Generate a plaintext-ciphertext pair] We send $0 \le l < 4096$ to the server. The server generates a message $m$ of length $l$ and computes $c = \text{Encrypt}(m)$. We are then given $(m, c)$. Note that the IV remains unchanged throughout the connection.
[Padding oracle] We send $c = (c_0, c_1, …, c_n)$ to the server. The server decrypts it and checks if the flag is a substring of $m$, or there is an exception. Note that $c_0$ should be unchanged.

🔥 This challenge is a revenge. The challenge author also referred us to Counter Block Chaining he wrote for b01lers CTF 2024, where (1) we don’t have the first function, and (2) $c_0$ can be arbitrary for the second function.

Solution⌗

Part I: Why it is easy when IV can be controlled?⌗

Suppose that a ciphertext for $(m_1, m_2, …, m_n)$ is $(c_0, c_1, …, c_n)$. In the Counter Block Chaining challenge, we were able to control the initialization vector. By changing the IV from $c_0$ to $c'_0$, we have

$$ m_1 \oplus m'_1 = m_2 \oplus m'_2 = … = m_n \oplus m'_n = c_0 \oplus c'_0. $$

With this, we can attack the padding oracle in the same way we do for the CBC mode. For instance, if we want to recover the last byte of $m_n$, we can make the last byte of $c'_0$ to be $\texttt{00}$, $\texttt{01}$, …, $\texttt{FF}$ until $(c_0', c_1, …, c_n)$ corresponds to a plaintext that has a valid PKCS7 padding. In that case, the trailing byte of $m_n'$ will be $\texttt{01}$, thus we can deduce the trailing byte of $m_n$ using $m_n = m'_n \oplus c_0 \oplus c'_0$.

Part II: We cannot control the IV!⌗

In this challenge, we can no longer flip $c_0$ to generate an arbitrary $m_n$ for the padding oracle. To make matters worse, for $i = 1, 2, …, n$, $m_i$ is dependent of $c_i$. We do not have a simple relation like the above.

If we change the $i$-th block of the ciphertext from $c_i$ to $c'_i$, we have

$$ \begin{aligned} & m_{i+1} \oplus m'_{i+1} = m_{i+2} \oplus m'_{i+2} = … = m_n \oplus m'_n \\ &\qquad = \mathcal{D}_0(c_i \oplus t_{i-1}) \oplus \mathcal{D}_0(c_i' \oplus t_{i-1}) \oplus c_i \oplus c_i' \\ &\qquad = m_i \oplus m_i' \oplus c_i \oplus c_i'. \end{aligned}$$

Now we want to recover the plaintext $(m_{0,1}, …, m_{0,n})$ from its ciphertext $(c_{0,0}, …, c_{0,n})$. We need two known plaintext-ciphertext pairs that is 129 blocks longer, namely,

$$\begin{aligned} & (m_{1,1}, …, m_{1, n+129}) \xrightarrow{\text{Encrypt}} (c_{1,0}, …, c_{1, n+129}) \\ & (m_{2,1}, …, m_{2, n+129}) \xrightarrow{\text{Encrypt}} (c_{2,0}, …, c_{2, n+129}) \end{aligned}$$

Denote $\Delta_{ij} := m_{1, j} \oplus m_{i, 1} \oplus c_{1, j} \oplus c_{i, j}$ for $i = 1, 2$ and $j = 0, 1, …, n+128$. Additionally, we would want that for any $u \in [0, 2^{128})$, there exists $i_1, …, i_{128} \in \{0, 1\}$ such that

$$\Delta_{i_1, n+1} \oplus \Delta_{i_2, n+2} \oplus … \oplus \Delta_{i_{128}, n+128} = u.$$

We can generate two such plaintext-ciphertext pairs using the first function. Now suppose that we want to recover the last byte of $m_{0, 1}$. We will use

$$(c_{1, 0}, c_{0, 1}, c_{1, 2}, …, c_{1, n}, c_{i_1, n+1}, …, c_{i_{128}, n+128}, c_{1, n+129}).$$

😕 What is this mess? The first $n+1$ ciphertext blocks is the ciphertext for $c_1$, except for the $c_{1, 1}$. We swapped that with $c_{0, 1}$ because we want to “leak” its corresponding plaintext, $m_{0, 1}$. On the other hand, we will add another 129 blocks either from $c_1$ or $c_2$. We will control them to leak the plaintext block byte-by-byte using the padding oracle.

For the ciphertext, its last message block, $\tilde{m}_{n+129}$, would be

$$\begin{aligned} \tilde{m}_{n+129} &= m_{n+129} \oplus \Delta_{0, 1} \oplus \Delta_{i_1, n+1} \oplus … \oplus \Delta_{i_{128}, n+128} \\ &= m_{n+129} \oplus (m_{1, 1} \oplus m_{0, 1} \oplus c_{1, 1} \oplus c_{0, 1}) \oplus \Delta_{i_1, n+1} \oplus … \oplus \Delta_{i_{128}, n+128}. \end{aligned}$$

Rearranging, we have

$$m_{1, 1} = \tilde{m}_{n+129} \oplus m_{n+129} \oplus m_{0, 1} \oplus c_{1, 1} \oplus c_{0, 1} \oplus \Delta_{i_1, n+1} \oplus … \oplus \Delta_{i_{128}, n+128}.$$

The only unknown variables would be $\tilde{m}_{n+129}$ and $m_{0, 1}$. However, if we are sending this ciphertext to the second function, we will be informed if $\tilde{m}_{n+129}$ has a valid PKCS7 padding. Thus we can control $\Delta_{i_1, n+1} \oplus … \oplus \Delta_{i_{128}, n+128}$ by solving a linear system over $\text{GF}(2)$. If $\tilde{m}_{n+129}$ ends with 01 when $\Delta_{i_1, n+1} \oplus … \oplus \Delta_{i_{128}, n+128} = u$, then the padding is valid and we can recover the last byte of $m_{1, 1}$ by looking at the last byte of the above equality.

Repeating the process, we are able to recover the entire flag:

hitcon{just_a_normal_padding_oracle_with_some_linear_algebra..._horray!_1f4614e8211bddb81b05b}

Hyper512⌗

Challenge Summary⌗

We are given the below stream cipher which makes use of four 128-bit LFSRs:

class Cipher:
    def __init__(self, key: int):
        self.lfsr1 = LFSR(128, key, MASK1)
        key >>= 128
        self.lfsr2 = LFSR(128, key, MASK2)
        key >>= 128
        self.lfsr3 = LFSR(128, key, MASK3)
        key >>= 128
        self.lfsr4 = LFSR(128, key, MASK4)

    def bit(self):
        x = self.lfsr1() ^ self.lfsr1() ^ self.lfsr1()
        y = self.lfsr2()
        z = self.lfsr3() ^ self.lfsr3() ^ self.lfsr3() ^ self.lfsr3()
        w = self.lfsr4() ^ self.lfsr4()
        return (
            sha256(str((3 * x + 1 * y + 4 * z + 2 * w + 3142)).encode()).digest()[0] & 1
        )

    def stream(self):
        while True:
            b = 0
            for i in reversed(range(8)):
                b |= self.bit() << i
            yield b

    def encrypt(self, pt: bytes):
        return bytes([x ^ y for x, y in zip(pt, self.stream())])

    def decrypt(self, ct: bytes):
        return self.encrypt(ct)

We are given a ciphertext encrypted with the above Cipher, where its plaintext is 4096 null bytes followed by the flag. The objective is to recover the flag.

Solution⌗

A truth table of the output bits based on the LFSRs.

Let $x_i, y_i, z_i, w_i, \text{output}_i$ be the $i$-th bit from lfsr1, lfsr2, lfsr3, lfsr4 and the output. @KLPP observed that if $\text{output}_i = 1$, then $y_i = 1$ or $z_i = 1$. This is equivalent to $(y_i - 1) (z_i - 1) = 0$.

Additionally, we can express $y_i$ (resp. $z_i$) in terms of $y_1, …, y_{128}$ (resp. $z_1, …, z_{128}$) given the masks. Let’s write $y_i = a_{i,1} y_1 + … + a_{i,128} y_{128}$ and $z_i = b_{i,1} z_1 + … + b_{i,128} z_{128}$, and we know those $a_{ij}$ and $b_{ij}$’s.

To reiterate, for each $i$ such that $\text{output}_i = 1$, we have

$$(a_{i,1} y_1 + … + a_{i,128} y_{128} - 1)(b_{i,1} z_1 + … + b_{i,128} z_{128} - 1) = 0.$$

We can expand that and obtain the long equation below:

$$\begin{aligned} & (a_{i,1}b_{i,1} \cdot {\color{red}{y_1z_1}} + … + a_{i,128}b_{i,128} \cdot {\color{red}{y_{128}z_{128}}}) \\ & \qquad + (a_{i,1} \cdot {\color{red}{y_1}} + … + a_{i,128} \cdot {\color{red}y_{128}}) + (b_{i,1} \cdot {\color{red}{z_1}} + … + b_{i,128} \cdot {\color{red}{z_{128}}}) = 1 \end{aligned}$$

We can try to linearize the equation since there will be 16368 such equations given in output.txt (one equation upon $\text{output}_1 = 1$):

$$\begin{aligned} & (a_{i,1}b_{i,1} \cdot {\color{red}{u_1}} + … + a_{i,128}b_{i,128} \cdot {\color{red}{u_{16384}}}) \\ & \qquad + (a_{i,1} \cdot {\color{red}{u_{16385}}} + … + a_{i,128} \cdot {\color{red}u_{16512}}) + (b_{i,1} \cdot {\color{red}{u_{16513}}} + … + b_{i,128} \cdot {\color{red}{u_{16640}}}) = 1 \end{aligned}$$

Now we have 16640 unknowns, we have 272 degrees of freedom. Fortunately, we can pick three $x_i$’s and assume that they are zeroes. This makes all the $u_j$’s depending on the $x_i$’s are zeroes, too. We only need to guess 8 times on average to obtain the correct $u_j$’s (thus $y_1, …, y_{128}$ and $z_1, …, z_{128}$).

Now we have $y_i$ and $z_i$’s. To proceed, we see from the truth table that the only case when $(y_i, z_i, \text{output}_i) = (0, 1, 0)$ is $(x_i, w_i) = (1, 1)$. We can collect 256 such values to recover $x_1, …, x_{128}$ and $w_1, …, w_{128}$. Since the key is fully recovered, we can proceed to retrieve the flag:

hitcon{larger_states_is_still_no_match_of_fast_correlation_attacks!}

🙅🏽‍♂️ Unintended! As mentioned in the flag, the intended solution is the fast correlation attack. @maple3142 mentioned that his solver took less than 10s.

G. Poupard, J. Stern (2000). “Short Proofs of Knowledge for Factoring”
https://www.di.ens.fr/~stern/data/St84.pdf ↩︎
Python Software Foundation (2022). “Notable security feature in 3.10.7”
https://docs.python.org/3/whatsnew/3.10.html#notable-security-feature-in-3-10-7 ↩︎