I wrote two challenges for Google CTF qualifier - Underhanded and Merkurated. There were 17 and 37 solves during the contest time. I will cover both in this blog post.

Underhanded

Challenge Summary

Proudly sharing our Python implementation of AES. By the way, we sneakily hid a backdoor. Can you see sharp and see what went wrong?

Attachments: underhanded.zip

We are given yet another Python implementation of AES.

A 16-byte key is generated when we connect to the server. We are able to request five ciphertexts of our choice. The goal is to recover the key.

🎯 To be accurate… The players are asked to recover 16 keys in 300 seconds.

Solution

The “backdoor”

At the first glance, the encrypt function seems working perfectly. There are test cases that compares the output from pycryptodome, and the custom implementation is working properly.

However, the encrypt function only works properly when it encrypts a single block (i.e., 16 bytes). If there are more than 16 bytes, sub_bytes and add_round_key will be applied on the entire input, while shift_rows and mix_columns do not.

def sub_bytes(self, m: bytearray) -> bytearray:
    return bytearray(sbox[mc] for mc in m)

def shift_rows(self, m: bytearray) -> bytearray:
    m[+0], m[+4], m[+8], m[12] = m[+0], m[+4], m[-8], m[12]
    m[+1], m[+5], m[+9], m[13] = m[+5], m[+9], m[13], m[+1]
    m[+2], m[+6], m[10], m[14] = m[10], m[14], m[+2], m[+6]
    m[+3], m[+7], m[11], m[15] = m[15], m[+3], m[+7], m[11]
    return m

def mix_columns(self, m: bytearray) -> bytearray:
    for i in range(0, 16, 4):
        t = m[i+0] ^ m[i+1] ^ m[i+2] ^ m[i+3]
        u = m[i+0]
        m[i+0] ^= t ^ xtime(m[i+0] ^ m[i+1])
        m[i+1] ^= t ^ xtime(m[i+1] ^ m[i+2])
        m[i+2] ^= t ^ xtime(m[i+2] ^ m[i+3])
        m[i+3] ^= t ^ xtime(m[i+3] ^ u)
    return m

def add_round_key(self, m: bytearray, r: int) -> bytearray:
    # Note that the `xor` function comes from `pwnlib.util.fiddling`, where the
    # shorter string will be used to computed the result.
    return bytearray(xor(m, self.round_keys[r]))

It is natural to assume that the first 16 bytes of a ciphertext is encrypted in the normal AES setting, because shift_rows and mix_columns are applied only to the first 16 bytes. Surprisingly it does not. For instance, let

$$ \begin{aligned} & m_1 = \texttt{00000000000000000000000000000000} \\ & m_2 = \texttt{00000000000000000000000000000000 01}, \end{aligned} $$

the ciphertexts under $k = \texttt{0000…00}$ (16 $\texttt{00}$’s) are

$$ \begin{aligned} & c_1 = \texttt{66e94bd4ef8a2c3b884cfa59ca342b2e} \\ & c_2 = \texttt{71edd5d9b112b425a3dba470ca874fe8 92}. \end{aligned} $$

When we look deeper in the shift_rows, the value of m[+8] is taken from m[-8].

What does that mean? Let $(s_0, s_1, …, s_{n-1})$ and $(t_0, t_1, …, t_{n-1})$ respectively be the input and output for the last shift_rows operation. In that case, we have

$$ \begin{array}{llll} t_0 := s_0, && t_4 := s_4, && {\color{red}t_8 := s_{n-8},} && t_{12} := s_{12} \\ t_1 := s_5, && t_5 := s_9, && t_9 := s_{13}, && t_{13} := s_1, \\ t_2 := s_{10}, && t_6 := s_{14}, && t_{10} := s_2, && t_{14} := s_6, \\ t_3 := s_{15}, && t_7 := s_3, && t_{11} := s_7, && t_{15} := s_{11}, \\ \end{array} $$

and $t_i := s_i$ for $i = 16, 17, …, {n-1}$.

Additionally, the only operation after the last shift_rows would be the tenth add_round_key. Denote the resulting ciphertext be $(c_0, c_1, …, c_{n-1})$, we have

$$ \begin{aligned} c_8 &= t_8 \oplus k_{10,8} = s_{n-8} \oplus k_{10,8} \\ c_{n-8} &= t_{n-8} \oplus k_{10, r} = s_{n-8} \oplus k_{10, r}, \end{aligned} $$

where $r = (n-8)\ \text{mod}\ 16$ and $k_{ij}$ is the $j$-th byte of the $i$-th subkey. Thus we have

$$c_8 \oplus c_{n-8} = k_{10, 8} \oplus k_{10, r}.$$

Since we can request five ciphertexts of our choice, we can obtain six $k_{ij}$’s if we exhaust $k_{10, 8}$. The question is, which bytes from the subkey do we want to leak?

Turns out we can leak $k_{10,0}, k_{10,3}, k_{10,4}, k_{10,7}, k_{10,12}$ (and $k_{10,8}$). Below is a set of possible plaintexts (in hex) to leak the above bytes. In short, they are of lengths 24, 27, 28, 31 and 36 bytes.

m1 = 00000000000000000000000000000000 0000000000000000
m2 = 00000000000000000000000000000000 0000000000000000000000
m3 = 00000000000000000000000000000000 000000000000000000000000
m4 = 00000000000000000000000000000000 000000000000000000000000000000
m5 = 00000000000000000000000000000000 00000000000000000000000000000000 00000000

In particular, below are the list of $k_{ij}$’s (marked with @s) that we can recover with the six leaked bytes:

i \ j |  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
------|------------------------------------------------
    0 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    1 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    2 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    3 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    4 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    5 |  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .
    6 |  .  .  .  .  .  .  .  @  .  .  .  .  .  .  .  .
    7 |  .  .  .  @  .  .  .  @  .  .  .  .  @  .  .  .
    8 |  .  .  .  @  .  .  .  @  @  .  .  .  @  .  .  .
    9 |  .  .  .  @  @  .  .  @  @  .  .  .  @  .  .  .
   10 |  @  .  .  @  @  .  .  @  @  .  .  .  @  .  .  .

We can compute $k_{6,7}, k_{7,7}, k_{8,7}$ and $k_{9,7}$ from the $k_{10,j}$’s above. But why are they useful?

🙏🏼 Thanks Thr! @Thr noticed that the table I generated was incorrect. I was using a different set of $k_{10,j}$’s…

💦 Alternative bytes to leak. Since we want to perform “the thing” on the next section, we would like to know $k_{6, i}, k_{7, i}, k_{8, i}, k_{9, i}$ and $k_{10, i}$ for some $i$. Here are the combinations of $k_{10, j}$’s (that includes $k_{10, 8}$) that made it possible:

  1. $k_{10, 0}, k_{10, 3}, k_{10, 4}, k_{10, 7}, k_{10, 8}, k_{10, 12}$,
  2. $k_{10, 0}, k_{10, 4}, k_{10, 5}, k_{10, 8}, k_{10, 9}, k_{10, 13}$,
  3. $k_{10, 0}, k_{10, 4}, k_{10, 8}, k_{10, 9}, k_{10, 12}, k_{10, 13}$,
  4. $k_{10, 3}, k_{10, 4}, k_{10, 7}, k_{10, 8}, k_{10, 11}, k_{10, 12}$,
  5. $k_{10, 3}, k_{10, 7}, k_{10, 8}, k_{10, 11}, k_{10, 12}, k_{10, 15}$

The 24+16 bits meet-in-the-middle

We can leak more than one byte from the chosen ciphertexts. When $i \geq 16$, the $i$-th byte of ciphertext would only affected by add_round_key and sub_bytes. Thus

$$c_i = \texttt{S}\left[…\texttt{S}\left[\texttt{S}\left[m_i \oplus k_{0,j}\right] \oplus k_{1,j}\right] \oplus …\right] \oplus k_{10,j},$$

where $j = i\ \text{mod}\ 16$. Now we have $k_{6, 7}, …, k_{10, 7}$, we can obtain $u_i$ such that

$$u_i = \texttt{S}\left[…\texttt{S}\left[\texttt{S}\left[m_i \oplus k_{0,j}\right] \oplus k_{1,j}\right] \oplus …\right] \oplus k_{5,j}.$$

Let $i = 24, 40, …$, we have $j = 8$. We can rearrange the terms to “balance” out the number of unknowns ($k_{0, j}, …, k_{5, j}$):

$$\texttt{S}\left[\texttt{S}\left[\texttt{S}\left[m_i \oplus k_{0,7}\right] \oplus k_{1,7}\right] \oplus k_{2,7}\right] \oplus k_{3,7} = \texttt{S}^{-1}\left[\texttt{S}^{-1}\left[u_i \oplus k_{5, 7}\right] \oplus k_{4, 7}\right].$$

We can further remove $k_{3,7}$ from the equation by combining two relations (for example, xoring the relations when $i = 24$ and $i = 40$):

$$ \begin{aligned} & \texttt{S}\left[\texttt{S}\left[\texttt{S}\left[m_{23} \oplus k_{0,7}\right] \oplus k_{1,7}\right] \oplus k_{2,7}\right] \oplus \texttt{S}\left[\texttt{S}\left[\texttt{S}\left[m_{40} \oplus k_{0,7}\right] \oplus k_{1,7}\right] \oplus k_{2,7}\right] \\ &\qquad = \texttt{S}^{-1}\left[\texttt{S}^{-1}\left[u_{24} \oplus k_{5, 7}\right] \oplus k_{4, 7}\right] \oplus \texttt{S}^{-1}\left[\texttt{S}^{-1}\left[u_{40} \oplus k_{5, 7}\right] \oplus k_{4, 7}\right]. \end{aligned} $$

From the above equation, we have five unknowns ($k_{0, 7}, k_{1, 7}, k_{2, 7}, k_{4, 7}$ and $k_{5, 7}$). We can precompute the left hand side and compute the right hand side when we have $k_{6, 7}, k_{7, 7}$, $k_{8, 7}, k_{9, 7}$ and $k_{10, 7}$.

We can send the new m5 to get the relations above:

m5 = 00000000000000000000000000000000 0000...00 0101...01 0202...02 ... 00000000

In this case, we have $m_{23} = \texttt{00}$, $m_{39} = \texttt{01}$, $m_{55} = \texttt{02}$ and so on. We can compute the corresponding $u_i$’s and recover $k_{i, 7}$’s. We can further deduce even more subkey bytes and eventually are able to obtain the key.

Merkurated

Challenge Summary

Can you rugpull my crypto?

Attachments: merkurated.zip

The server in this challenge hosts a Merkle tree, where users are able to perform the below three operations:

  1. deposit(amount, public_key): Increases $\texttt{tree}[\texttt{public\_key}]$ by $\texttt{amount} \in (0, M]$.
  2. withdraw(amount, signature, proof): Decreases $\texttt{tree}[\texttt{public\_key}]$ by $\texttt{amount} \in (0, N]$. $\texttt{public\_key}$ is the public key recovered by the signature and $\texttt{proof}$ is a Merkle proof that the $\texttt{tree}[\texttt{public\_key}] = N$.
  3. flag(): Returns the flag when $M \geq 10^{18}$.

Here $M$ is the amount that the user has locally. Initially, $M = 10^9$. The goal is to earn enough to call flag().

Solution

For the Merkle tree, the formula for the node hash is defined as below:

$$\texttt{bcrypt}_N(\texttt{left\_subtree\_hash} \ || \ \texttt{":::"} \ || \ \texttt{right\_subtree\_hash} || \ \texttt{":::"} \ || \ \texttt{bcrypt}_V(\texttt{value})).$$

$\texttt{bcrypt}_N$ and $\texttt{bcrypt}_V$ are the hash $\texttt{bcrypt}$ respectively with the salt SALT_FOR_NODE and SALT_FOR_VALUE. Note that the salts will be given upon connection.

As an example, below are the Merkle tree with six nodes and the salts:

digraph { graph [bgcolor="transparent"] node [color="#ffe4e1", fontcolor="#ffe4e1", fillcolor="#33333c", style="filled"] edge [color="#ffe4e1", fontcolor="#ffe4e1"] node[shape="box", style="rounded, filled", color="#ffe4e1", fontcolor="#ffe4e1", fillcolor="#33333c"] { rank = same; "x1"; "y1"; "x2"; } x0[label="🌲 0", width=0.9] x1[label="🌲 1", width=0.9] y1[style=invis] x2[label="🌲 2", width=0.9] x3[label=<🌿 3<BR /><font point-size="11">Value: 1337</font>>, width=0.9] x4[label=<🌿 4<BR /><font point-size="11">Value: 1338</font>>, width=0.9] x5[style=invis] x6[label=<🌿 5<BR /><font point-size="11">Value: 1339</font>>, width=0.9] x0->x1 x0->y1[style=invis] x0->x2 x1->x3 x1->x4 x2->x5[style=invis] x2->x6 }
SALT_FOR_NODE  = b'$2b$04$NODENODENODENODENODEOO'
SALT_FOR_VALUE = b'$2b$04$VALUEVALUEVALUEVALUEOO'

We can compute the root node hash (hash for node 0) bottom-up:

EMPTY_NODE_HASH  = hash(b'', SALT_FOR_NODE) # == b'MQ2LoT4.AHVICmhkTdC2USXZDB0z7cm'
EMPTY_VALUE_HASH = hash(b'', SALT_FOR_VALUE) # == b'M2nEWQrn9QFIADu16Da03Xr5ezenIX.'

NODE_3_VALUE = hash(bytes.fromhex('00000000 00000539'), SALT_FOR_VALUE) # 1337
NODE_4_VALUE = hash(bytes.fromhex('00000000 0000053a'), SALT_FOR_VALUE) # 1338
NODE_5_VALUE = hash(bytes.fromhex('00000000 0000053b'), SALT_FOR_VALUE) # 1339

NODE_3_HASH = hash(b':::'.join([EMPTY_NODE_HASH, EMPTY_NODE_HASH, NODE_3_VALUE]),
                   SALT_FOR_NODE) # == b'XsDcVafPK/jlPTL0ZSle/dtEJkee/Au'
NODE_4_HASH = hash(b':::'.join([EMPTY_NODE_HASH, EMPTY_NODE_HASH, NODE_4_VALUE]),
                   SALT_FOR_NODE) # == b'IqRxjujNv9y6FmmOvHenWEo/pLRTAUu'
NODE_5_HASH = hash(b':::'.join([EMPTY_NODE_HASH, EMPTY_NODE_HASH, NODE_5_VALUE]),
                   SALT_FOR_NODE) # == b'vVHq1POwhT3YIIfK7IgtaN40H09Zfnq'

NODE_1_HASH = hash(b':::'.join([NODE_3_HASH, NODE_4_HASH, EMPTY_VALUE_HASH]),
                   SALT_FOR_NODE) # == b'sOuDjB/GqERr5J/BaoOKovGz.UjWxPC'
NODE_2_HASH = hash(b':::'.join([EMPTY_NODE_HASH, NODE_5_HASH, EMPTY_VALUE_HASH]),
                   SALT_FOR_NODE) # == b'k4sMGJnOJvAvvhgZfdDjxmGki2avy7W'

NODE_0_HASH = hash(b':::'.join([NODE_1_HASH, NODE_2_HASH, EMPTY_VALUE_HASH]),
                   SALT_FOR_NODE) # == b'.B8SqLK6AYf88PZe93HzUKwH0OJ4Pwe'

Notably, the hash function trims the salt (the first 29 bytes) away. Only the last 31 bytes from bcrypt will be taken.

It is important that bcrypt would only take the first 72 bytes of the preimage to compute its hash. When we look at the preimage of NODE_0_HASH, only the first four bytes of the VALUE_HASH will be taken into account.

123456789012345678901234567890123456789012345678901234567890123456789012
`bcrypt` would only take the up to the 72th byte for the digest        ↓
LEFT_SUBTREE_HASH______________:::RIGHT_SUBTREE_HASH_____________:::VALUE_HASH_____________________

Since the output of hash satisfy the regular expression [./0-9A-Za-z]{31}, its character set is of size 64. It is expected to generate $\approx 64^{4/2} = 2^{12}$ preimages to come up with a collision pair. In this challenge, what we need is a pair of leaf nodes such that they have the same hash, and

  1. the first leaf node has value $x \in [0, 10^9]$, and
  2. the second leaf node has value $y \in [10^{18}, 2^{64})$.

If the first leaf node is replaced with the second leaf node, the root node hash remains unchanged. After all, the below is the heuristic to make $M \geq 10^{18}$:

  1. Generate a random ECDSA key.
  2. Generate two leaf nodes with the above properties.
  3. Call deposit(x, public_key) to create the first node in the tree.
  4. Generate a proof that show that $\texttt{tree}[\texttt{public\_key}] = y$.
  5. Sign the proof with the ECDSA private key.
  6. Call withdraw(y, signature, proof) to withdraw $y$ from the server.

We have $10^9 - x + y \geq 10^{18}$ now, and we can call flag() to win.