HKCERT CTF 2024 Quals Writeup (II): Other Crypto Challenges

In the second part, I will cover the remaining cryptography challenges, including Almost DSA, mAEStro and Mask-mask-RSA.

Almost DSA⌗

Challenge Summary⌗

According to an external auditor, my code implementing the data signature algorithm (DSA) has a one-byte security fix on a critical issue.

Well, I am not bothered. Convince me by giving me the flag!

Attachment: almost-dsa_148b8b2cd8c78df02bbdc24bd7fa3f56.zip

We are given $(p, q, g)$, the public parameters of the data signature algorithm (DSA). We are also given $y$, the public key of the signer we want to impersonate.

The goal is to forge a signature for the signer.

Additionally, the signature verification algorithm is given below:

def verify(params, y, m, sig):
    p, q, g = params
    r, s = sig

    assert 0 < r < p
    assert 0 < s < p

    w = inverse(s, q)
    u1 = hash(m) * w % q
    u2 = r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q
    assert v == r

Solution⌗

We would like to check whether $0 < r < q$ and $0 < s < q$. However, the upper bounds were mistakenly taken as $p$ in the challenge. This introduces a vulnerability whereby we can set $r = q$.

If we send $(r, s) = (1, q)$, we then have

$w = \texttt{inverse}(s, q) = q^{q-2} \ \text{mod}\ q = 0$
$u_1 = \texttt{hash}(m) \cdot w \ \text{mod}\ q = 0$
$u_2 = r \cdot w \ \text{mod}\ q = 0$
$v = g^{u_1} \cdot y^{u_2} \ \text{mod}\ p \ \text{mod}\ q = g^0 \cdot y^0 \ \text{mod}\ p \ \text{mod} \ q = 1 = r$

Here $v = r$ and thus passes the check. Additionally, $(r, s) = (1, q)$ should be a valid signature for all messages if the assertions are made incorrect.

mAEStro (1): Sample⌗

Challenge Summary⌗

Yet another AES challenge in 2024? Well, it is not AES – The internal operations are randomly selected.

Attachment: maestro-1_595679a660ee70b8346961cc6270ec95.zip

In this challenge, we have an implementation of AES. However, the sequence of the steps is sampled from the original list of steps:

$\texttt{AddRoundKey}(0)$, $\texttt{AddRoundKey}(1)$, …, $\texttt{AddRoundKey}(14)$
$\texttt{SubBytes}$, $\texttt{SubBytes}$, …, $\texttt{SubBytes}$ (14 times)
$\texttt{ShiftRows}$, $\texttt{ShiftRows}$, …, $\texttt{ShiftRows}$ (14 times)
$\texttt{MixColumns}$, $\texttt{MixColumns}$, …, $\texttt{MixColumns}$ (13 times)

As the player, we can provide the seed that generates the sequence of steps. We are then given the token encrypted with the backdoored AES.

After that, we are given the oracle to encrypt messages. The server will return the corresponding ciphertexts. The goal is to find the token.

Solution⌗

Since $\texttt{AddRoundKey}$ is the only operation that mixes in the secret key, we would want to avoid that as much as possible.

It is possible to generate a seed such that the sampled operations do not contain any of the $\texttt{AddRoundKey}$, as its probability being $(41/56)^{56}$, which is approximately 1 in $4 \times 10^7$.

For example, the seed 05032cde613198edd26c43a61f0978da (in hex) would generate the below sequence of operations (here SB, SR and MC means $\texttt{SubBytes}$, $\texttt{ShiftRows}$ and $\texttt{MixColumns}$ respectively):

MC SR SB SR MC SB MC SR MC SR MC SR SR SB
MC MC MC SB MC SB SB MC MC SB SR SR MC SB
MC MC MC SR SR MC SB SR MC SB MC MC SR MC
SR SB SB SR SR SR SB SR MC SR SB SB MC SR

Since the secret key is not involved during encryption, one could easily reverse all the operations to “decrypt” the token.

A faster solution is to generate a sequence with exactly one $\texttt{AddRoundKey}$, in which the probability would be $56 \cdot (15/56) \cdot (41/56)^{55}$ (around 1 in $2 \times 10^6$). With that, we can encrypt an arbitrary message $m_1$ and obtain $c_1$. With $\text{A} \rightarrow \texttt{AddRoundKey}(j) \rightarrow \text{B}$ and $\text{A}, \text{B}$ not involving $\texttt{AddRoundKey}$, we have

$$ m_1 \xrightarrow{\text{Operation set A}} s \xrightarrow{\texttt{AddRoundKey}(j)} t \xrightarrow{\text{Operation set B}} c_1. $$

Since operations in A and B do not involve $\texttt{AddRoundKey}$, we can obtain $s$ and $t$. Mathematically, $s \xrightarrow{\texttt{AddRoundKey}(j)} t$ can be written as $s \oplus k_j = t$ ($k_j$ is the $j$-th subkey), we can obtain $k_j$. Since there are no more unknowns after $k_j$ has been recovered, we can also recover $m_0$ from $c_0$ and retrieve the flag.

As a concrete example, the seed 7d54e4b3d03a5e0fa99c4b356e940de2 yields exactly one $\texttt{AddRoundKey}$:

SB MC MC SB MC MC SB SR MC SB SR SR SR ARK2 SR
SB MC MC SB SB MC MC SR MC MC SR SR SR SB SR
SB SB MC SR SB SB MC SR SB SB SR SB SB SB SR
SR SB MC MC MC SB MC SR SB SR SB SR SR MC SB

Let $m_1$ be 16 null bytes and $c_1$ be the corresponding ciphertext. We can compute $s$ and $t$ from $(m_1, c_1)$ by

$$ m_1 \xrightarrow{\texttt{SubBytes}} \circ \xrightarrow{\texttt{MixColumns}} \dots \xrightarrow{\texttt{ShiftRows}} \circ \xrightarrow{\texttt{ShiftRows}} s \\ c_1 \xrightarrow{\texttt{SubBytes}^{-1}} \circ \xrightarrow{\texttt{MixColumns}^{-1}} \dots \xrightarrow{\texttt{SubBytes}^{-1}} \circ \xrightarrow{\texttt{ShiftRows}^{-1}} t $$

The below is a sample test vector, here k is the second round key, $k_2$:

m1 = bytes.fromhex('00000000000000000000000000000000')
c1 = bytes.fromhex('2745ca25395b74d42125e51ef535973b')

s  = bytes.fromhex('76767676767676767676767676767676')
t  = bytes.fromhex('913539a268b7940ca066c048b8247e8c')
k  = bytes.fromhex('e7434fd41ec1e27ad610b63ece5208fa')

mAEStro (2): Shuffle⌗

Challenge Summary⌗

Yet another AES challenge in 2024? Well, it is not AES – The internal operations are randomly shuffled.

Attachment: maestro-2_b57368ee01e70c226bcb364cd8fb87a6.zip

This challenge is similar to the first part, except that the steps are shuffled, not sampled. With that said, there will be 15 $\texttt{AddRoundKey}$, 14 $\texttt{SubBytes}$, 14 $\texttt{ShiftRows}$ and 13 $\texttt{MixColumns}$ operations in the final sequence.

The goal is to find the token.

Solution⌗

We want to find a seed that breaks the sequence of operations into three groups:

Operation set A [no $\texttt{AddRoundKey}$]
Operation set B [no $\texttt{MixColumns}$]
Operation set C [no $\texttt{AddRoundKey}$]

The probability of finding such a seed is $1/2674440$. As an example, the seed 4978c0cb65bb881a3c95ec16d6b3ad1a yields the below sequence of operations:

MC    MC    SB    MC    SB    MC    MC    MC    MC    SR    MC    MC    MC    SR
MC    ARK9  SR    ARK2  ARK11 ARK13 ARK4  ARK6  ARK12 SB    ARK8  SB    SR    ARK5  
SR    SR    SR    SB    ARK1  ARK10 SB    ARK0  SB    ARK7  SB    SB    ARK3  SB
SB    SR    SR    SB    SR    SR    SR    SB    ARK14 SR    SB    MC    MC    SR

Here the operation sets A, B and C are respectively the below sequences:

$\texttt{MixColumns}, \texttt{MixColumns}, \texttt{SubBytes}, …, \texttt{MixColumns}, \texttt{MixColumns}$,
$\texttt{AddRoundKey}(9), \texttt{ShiftRows}, \texttt{AddRoundKey}(2), …, \texttt{SubBytes}, \texttt{AddRoundKey}(14)$
$\texttt{ShiftRows}, \texttt{SubBytes}, \texttt{MixColumns}, \texttt{MixColumns}, \texttt{ShiftRows}$

There are some properties of the sequence:

Operation sets A and C do not contain $\texttt{AddRoundKey}$, thus set B is the only place that contains the secret.
Operation set B does not contain $\texttt{MixColumns}$, thus one byte in $s$ would only affect one byte in $t$.

$$ m \xrightarrow{\text{Operations A}} s \xrightarrow{\text{Operations B}} t \xrightarrow{\text{Operations C}} c $$

Now we want to know the one-to-one mapping of the characters between $s$ and $t$.

To start with, we can set $s = \texttt{0000…00}$ and compute $m$ by reverting the operations in A. We will retrieve $c$ from the oracle. After that, we can compute $t$ by reverting the operations in B.

We can identify which byte of $s$ would map to which byte of $t$ by looking at the $\texttt{ShiftRows}$ operation. The below table shows that the $j$-th byte of $t$ depends only on the $i$-th byte of $s$ in our example seed:

i	0	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
j	0	13	10	7	4	1	14	11	8	5	2	3	12	9	6	7

With that, we can set the value of $s$ to be $\texttt{0000…00}$, $\texttt{0101…01}$, …, $\texttt{ffff…ff}$ and send the corresponding $m$ to the oracle for the mapping between $s[i]$ and $t[j]$.

Mask-mask-RSA⌗

Challenge Summary⌗

Since 2020, Mystiz is well-known plagiarizing his own challenges.

This situation gets worse in 2024. He is now copying challenges from the other CTFs. What’s worse, the challenge author participates in this CTF!

Disclaimer: The challenge author didn’t know I was going to copy his challenge.

Attachment: mask-mask-rsa_d11a2e5d3f444a9a854327b5c7bfe22f.zip

When the player connects to the server, the server generates two 513-bit primes, $p$ and $q$ and $n := pq$, $e := 65537$ and $c := m^e \ \text{mod} \ n$ ($m$ is the flag).

We are only given $c$. Additionally, we are asked to send up to 6 expressions that are

composed from p, q, +, -, *, / and %, and
of lengths not exceeding four.

The server would evaluate the expression. Assuming that it returns $x$, it would return every second digit from $x^e \ \text{mod} \ n$.

The objective is to recover the flag, $m$.

📢 Disclaimer. The challenge is highly referred from aa.crypto’s challenge, Mask-RSA from CyberSpace CTF 2024. Despite that aa.crypto plays HKCERT CTF 2024, he does not know that I am referring to his challenge. In the original challenge, we have $e := 3$ and can send up to 20 expressions.

Solution⌗

Part I: What could the oracle return?⌗

By exhausting all the possible patterns from the character set, there are 114 valid expressions that we can send to the server. They will be simplified to the 28 expressions on the left:

Simplified Expr.	Input Expressions
$p$	`p`, `+p`, `++p`, `--p`, `+++p`, `+--p`, `-+-p`, `--+p`
$q$	`q`, `+q`, `++q`, `--q`, `+++q`, `+--q`, `-+-q`, `--+q`
$-p$	`-p`, `+-p`, `-+p`, `++-p`, `+-+p`, `-++p`, `---p`
$-q$	`-q`, `+-q`, `-+q`, `++-q`, `+-+q`, `-++q`, `---q`
$2p$	`p+p`, `p++p`, `p--p`, `+p+p`
$2q$	`q+q`, `q++q`, `q--q`, `+q+q`
$-2p$	`-p-p`
$-2q$	`-q-q`
$-p-q$	`-p-q`, `-q-p`
$p+q$	`p+q`, `q+p`, `p++q`, `p--q`, `q++p`, `q--p`, `+p+q`, `+q+p`
$p-q$	`p-q`, `p+-q`, `p-+q`, `+p-q`, `-q+p`
$q-p$	`q-p`, `q+-p`, `q-+p`, `+q-p`, `-p+q`
$0$	`p-p`, `p%p`, `q-q`, `q%q`, `p+-p`, `p-+p`, `p%+p`, `p%-p`, `q+-q`, `q-+q`, `q%+q`, `q%-q`, `+p-p`, `+p%p`, `+q-q`, `+q%q`, `-p+p`, `-p%p`, `-q+q`, `-q%q`
$1$	`p//p`, `q//q`
$\lfloor{p/q\rfloor}$	`p//q`
$\lfloor{q/p\rfloor}$	`q//p`
$p^2$	`pp`, `p+p`, `+p*p`
$q^2$	`qq`, `+qq`
$-p^2$	`p-p`, `-pp`
$-q^2$	`q-q`, `-qq`
$n$	`pq`, `qp`, `p+q`, `q+p`, `+pq`, `+qp`
$-n$	`p-q`, `-pq`, `-q*p`
$p \ \text{mod}\ q$	`p%q`, `p%+q`, `+p%q`
$q \ \text{mod}\ p$	`q%p`, `q%+p`, `+q%p`
$p \ \text{mod}\ -q$	`p%-q`
$q \ \text{mod}\ -p$	`q%-p`
$-p \ \text{mod}\ q$	`-p%q`
$-q \ \text{mod}\ p$	`-q%p`

Although there are 28 distinct expressions, some of them (for instance $0$ and $1$) are already known. It should also be noted that the returned value is not the evaluated expression, but is also processed by str(pow(result, e, n))[::2].

Instead of recovering $n$, our goal is more ambitious - to recover $p$ and $q$.

Part II: Recovering $p^e \ \text{mod} \ n$ and $q^e \ \text{mod} \ n$⌗

To start with, we can send p, q, p+q to recover half of the digits of $a := p^e \ \text{mod} \ n$, $b := q^e \ \text{mod} \ n$, $s := (p+q)^e \ \text{mod}\ n$.

Notably, by the binomial theorem,

$$ s \equiv (p + q)^e \equiv p^e + e \cdot p^{e-1} q + {e \choose 2} \cdot p^{e-2} q^2 + \dots + q^e \equiv p^e + q^e \equiv a + b \ (\text{mod}\ n). $$

Let’s assume that we are lucky enough so that $a$ has 308 digits and $b$ has 309 digits. We can observe this because the output sizes are different (154 digits vs 155 digits), and we can further send p-q to obtain $d := (p-q)^e\ \text{mod}\ n$ (This also works when $a$ has 309 digits and $b$ has 308, where you can send q-p).

🎲 How likely? In a simulation of 100,000 runs, ~7% of them satisfies all the conditions:

$a$ and $b$ have 308 and 309 digits (resp. $a$ and $b$ have 309 and 308 digits),
$a + b < n$ and,
$d := b - a$ has 308 digits.

If both $a + b < n$ and $d$ has 308 digits, then we can simply relate $a, b, s$ and $d$ without the modulus $n$:

$$ \left\{\begin{aligned} & a + b = s \\ & a - b = d \end{aligned}\right. $$

Then we can recover $a$ and $b$ using a bottom-up approach. How? Let’s imagine a baby case that $p$ and $q$ are 7 bits, and

$$ \begin{array}{ll} \texttt{oracle}(p) = 162, & \texttt{oracle}(q) = 10, \\ \texttt{oracle}(p-q) = 87, & \texttt{oracle}(p+q) = 157 \end{array} $$

a = 1_6_2
b =  1_0_
s = 1_5_7
d =  8_7_

Let $a = 10^4 \cdot a_4 + 10^3 \cdot a_3 + \dots + a_0$ (similar for $b$, $s$ and $d$).

We can first take modulo 10 on the equation $s = a + b$ to get $b_0 = 5$. After that, we can use $d = a - b$ to get $d_0 = 7$:

$$ \begin{aligned} & 10^4 \cdot s_4 + 10^3 \cdot s_3 + \dots + s_0 = s \\ & \qquad = a + b = (10^4 \cdot a_4 + 10^3 \cdot a_3 + \dots + a_0) + (10^4 \cdot b_4 + 10^3 \cdot b_3 + \dots + b_0) \end{aligned} $$

We then take modulo $10^2$ on $d = a - b$ to get $a_1 = 8$, then on $s = a + b$ to get $s_1 = 8$.

By repeating the process, we can get

a = 10682
b =  1905
s = 12587
d =  8777

Part III: Building yet another linear equation with two unknowns⌗

Having $b := q^e \ \text{mod}\ n$ is useful because it is a multiple of $q$. If we are able to find another multiple of $q$ (denote it by $u$), we can recover $q$ by computing $\text{gcd}(b, u)$.

Without loss of generality, we can let $a > b$. We send p+p and -q%p. Under the assumption that $q > p$, and since $p, q \in [2^{512}, 2^{513})$, we have $p < q < 2p$. Thus -q%p would be $2p - q$.

We will obtain $u := 2^e \cdot p^e\ \text{mod}\ n$ and $v := (2p-q)^e \ \text{mod}\ n$. Using the binomial theorem again, $v \equiv 2^e \cdot p^e - q^e \equiv u - a\ (\text{mod}\ n)$.

We would need to be lucky again, as we want $u$ to have 309 digits and $v$ to have 308 digits. In this case, we would be sure that $v = u - a$ (equal, not only congruent modulo $n$), and thus we can recover $u$ and $v$ using a similar approach in the above section.

🎲 How likely now? The probability decreases to 0.9% (~1 in 110) with the conditions $q < p$ and $u$, $v$ have 309 and 308 digits.

After all, below are the “assumptions” taken:

#	Assumption	How to check from the oracle?
1	$a := p^e\ \text{mod}\ n$ has 308 digits	$\texttt{oracle}(p)$ has 154 digits^†
2	$b := q^e\ \text{mod}\ n$ has 309 digits	$\texttt{oracle}(q)$ has 155 digits
3	$a + b < n$	$\texttt{oracle}(p + q)$ has 155 digits (and under assumptions 1 and 2)
4	$d := b - a$ has 308 digits	$\texttt{oracle}(q - p)$ has 154 digits
5	$q > p$	unknown
6	$u := (2p)^e\ \text{mod}\ n$ has 309 digits	$\texttt{oracle}(p+p)$ has 155 digits
7	$v := (2p-q)^e\ \text{mod}\ n$ has 308 digits	$\texttt{oracle}(-q\ \text{mod}\ p)$ has 154 digits

📌 How to handle the case when $p < q$? Just rerun.

📝 Note [†]. It is possible that $\texttt{oracle}(m)$ has 154 digits, but $m$ only has 307 digits. In this unfortunate case, rerun.

Part IV: The final touches⌗

After all, $b$ and $u$ are two multiples of $q$ and thus able to recover $q$. To recover $p$, we first denote $d_q = e^{-1}\ \text{mod}\ (q-1)$. Because $a \equiv p^e \ (\text{mod}\ q)$, we have

$$ p \equiv a^{d_q} \ (\text{mod}\ q). $$

Thus $p = (a^{d_q} \ \text{mod}\ q) + q$. Now we have everything and we can decrypt the flag:

hkcert24{p0w3rs_0f_p_p1us_q_m0du1o_n_4r3_4ls0_fr3shm4n5_dr34m}