Credits. Thumbnail credit goes to Quasar! 🎉

vss is an interesting crypto challenge in BalsnCTF, which ended up having 9 solves. I took around 2.5 hours to solve the challenge. This challenge reminds me the yet another PRNG challenge from pbctf 2021 (challenge description, writeup written by @maple3142 and @rkm0959), but with a setting which looked harder. I was pretty surprised that LLL worked, too.

lfsr is another crypto challenge in BalsnCTF with 6 solvers. In the challenge, the output bits are computed nonlinearly from the LFSR states. Given that I knew almost nothing about LFSR, I just came up with the attack by myself… Well, I am not quite a paper guy and I couldn’t read.

vss

Challenge Summary

When connected to the server, two 512-bit numbers x,yx, y will be generated. We are allowed to get up to around 180 sets of (ai,bi,ci,pi,gi,vi,qi)(a_i, b_i, c_i, p_i, g_i, v_i, q_i) such that:

  1. ui=(ai+bix+ciy) mod piu_i = (a_i + b_ix + c_iy)\ \text{mod}\ p_i, and
  2. vi=giui mod qiv_i = g_i^{u_i}\ \text{mod}\ q_i.

The goal is to recover xx and yy.

Why 180 sets of data? Although not explicitly mentioned in the challenge, there is a 30-second timeout for the remote connection. You can only get around 180 sets of (ai,bi,ci,pi,gi,vi,qi)(a_i, b_i, c_i, p_i, g_i, v_i, q_i) within the timeout.

Solution

TL;DR. My approach, like 95% of the crypto challenges I solved, uses LLL.

Part I: Discrete log for the win?

After understanding the challenge statement, I immediately think that we can generate until qi1q_i-1 is smooth. In that way, we can easily recover uiu_i from vi,giv_i, g_i and qiq_i using Pohlig-Hellman algorithm. Unfortunately, generating such primes is hard. I generated around 20K sets and none of the qi1q_i-1 is 2242^{24}-smooth (i.e., all the prime factors are not greater than 2242^{24}).

With that said, finding one such qiq_i is difficult. What makes the situation worse is that we actually need two such qiq_i’s to recover xx and yy.

Part II: Taking a step backwards

We are forced to give up recovering a full uiu_i because discrete log is difficult. Good that we can still recover part of it. Assume that we know

q1=r1k1r2k2rmkmsq-1 = {r_1}^{k_1} {r_2}^{k_2} … {r_m}^{k_m} s

with r1,r2,,rmr_1, r_2, …, r_m are small primes (for instance, they are all smaller than 2242^{24}). Denote γ=r1k1r2k2rmkm\gamma = {r_1}^{k_1} {r_2}^{k_2} … {r_m}^{k_m} and we can recover v:=u mod γv := u\ \text{mod}\ \gamma using Pohlig-Hellman. Let’s formulate what we have for now. We have multiple tuples (ai,bi,ci,vi,pi,γi)(a_i, b_i, c_i, v_i, p_i, \gamma_i) such that

vi=ai+bix+ciy mod pi mod γi.v_i = a_i + b_i{\color{red}x} + c_i{\color{red}y}\ \text{mod}\ p_i\ \text{mod}\ \gamma_i.

If we intend not writing the whole thing using modulo, we will introduce two more unknown, integral variables (si,ti)(s_i, t_i):

vi=ai+bix+ciypisiγiti,v_i = a_i + b_i{\color{red}x} + c_i{\color{red}y} - p_i{\color{red}s_i} - \gamma_i{\color{red}t_i},

with 0ti<pi/γi0 \leq t_i < p_i / \gamma_i. If we look at the modular congruence modulo pip_i, then we have

aivi+bix+ciyγiti0 (mod pi).a_i - v_i + b_i{\color{red}x} + c_i{\color{red}y} - \gamma_i{\color{red}t_i} \equiv 0 \ (\text{mod}\ p_i).

Suppose that we have 40 sets of (ai,bi,ci,vi,pi,γi)(a_i, b_i, c_i, v_i, p_i, \gamma_i) such that γi>232\gamma_i > 2^{32}. Assuming each congruence carries 512 bits of information, then we have 512×40=20480512 \times 40 = 20480 bits of information. We also have 512×2+480×40=20224512 \times 2 + 480 \times 40 = 20224 bits of unknown because x,yx, y is of 512 bits and each of the tit_i’s is of 480 bits long. Since we have more information than the unknown, we can apply LLL to solve the system - below is the lattice:

A=[a1v1a2v2a40v401b1b2b401c1b2c401r11r21r401p1p2p40].A = \left[\begin{array}{cccc|ccc|cccc} a_1 - v_1 & a_2 - v_2 & \dots & a_{40} - v_{40} & 1 & & & & & & \\ b_1 & b_2 & \dots & b_{40} & & 1 & & & & & \\ c_1 & b_2 & \dots & c_{40} & & & 1 & & & & \\ \hline -r_1 & & & & & & & 1 & & & \\ & -r_2 & & & & & & & 1 & & \\ & & \ddots & & & & & & & \ddots & \\ & & & -r_{40} & & & & & & & 1 \\ \hline -p_1 & & & & & & & & & & \\ & -p_2 & & & & & & & & & \\ & & \ddots & & & & & & & & \\ & & & -p_{40} & & & & & & & \end{array}\right].

The lattice spans the vector v=[0001xyl1l2l40]\mathbf{v} = \left[\begin{array}{cccc|ccc|c}0 & 0 & \dots & 0 & 1 & x & y & l_1 & l_2 & \dots & l_{40} \end{array}\right] because

[0001xyl1l2l40]=[1xyl1l2l40t1t2t40]A\begin{aligned} & \left[\begin{array}{cccc|ccc|cccc}0 & 0 & \dots & 0 & 1 & x & y & l_1 & l_2 & \dots & l_{40} \end{array}\right] \\ & \qquad = \left[\begin{array}{ccc|cccc|cccc}1 & x & y & l_1 & l_2 & \dots & l_{40} & t_1 & t_2 & \dots & t_{40} \end{array}\right] \cdot A \end{aligned}

v\mathbf{v} can be found applying LLL if we set appropriate weights. LLL for the win! We can recover xx and yy and thus the flag: BALSN{commitments_leak_too_much_QwQ}.

lfsr

Challenge Summary

Note. In the challenge, the operations are done in GF(2)\text{GF}(2). That said, ++ is actually the \oplus (XOR) operation and ×\times is the AND operation.

Let L\mathcal{L} be a LFSR with key (s0,s1,,s127)(s_0, s_1, …, s_{127}), and for i0i \geq 0,

si+128=si+si+1+si+2+si+7.s_{i+128} = s_{i} + s_{i+1} + s_{i+2} + s_{i+7}.

Let the output bits y0,y1,y_0, y_1, … defined by

yi=si+si+16+si+32+si+120+sisi+8+sisi+32+si+8si+64+si+16si+32+sisi+8si+64+si+16si+32si+64+sisi+8si+16si+64si+120+si+8si+16si+32si+64si+120.\begin{aligned} y_i &= s_i + s_{i+16} + s_{i+32} + s_{i+120} + s_i s_{i+8} + s_i s_{i+32} + s_{i+8} s_{i+64} + s_{i+16} s_{i+32} \\ &\qquad + s_i s_{i+8} s_{i+64} + s_{i+16} s_{i+32} s_{i+64} + s_i s_{i+8} s_{i+16} s_{i+64} s_{i+120} + s_{i+8} s_{i+16} s_{i+32} s_{i+64} s_{i+120}. \end{aligned}

We are given the first 8192 bits of the output (i.e., y0,y1,,y8191y_0, y_1, …, y_{8191}). The goal is to recover the key (i.e., s0,s1,,s127s_0, s_1, …, s_{127}).

Solution

Part I: Why multiple of eight?

Let yi:=b(si,si+8,si+16,si+32,si+120)y_i := b(s_i, s_{i+8}, s_{i+16}, s_{i+32}, s_{i+120}) for simplicity.

We can see that, for instance, y0y_0, depends on s0,s8,s16,s32,s64s_0, s_8, s_{16}, s_{32}, s_{64} and s120s_{120}. The indexes of the states are all multiples of eight. Furthermore, this happens on y8iy_{8i} every every integer ii:

y8i=b(s8i,s8(i+1),s8(i+2),s8(i+4),s8(i+15)).\begin{aligned} y_{8i} &= b(s_{8i}, s_{8(i+1)}, s_{8(i+2)}, s_{8(i+4)}, s_{8(i+15)}). \end{aligned}

We have y0,y8,,y8×1023y_0, y_8, …, y_{8 \times 1023}. Combining above, we have 1024 equations with 1039 unknowns (the unknowns being s0,s8,,s8×1038s_0, s_8, …, s_{8 \times 1038}). Therefore, it is expected to get around 2152^{15} possible sets of s0,s8,,s8304s_0, s_8, …, s_{8304} from y0,y8,,y8184y_0, y_8, …, y_{8184}.

Similarly, we can use y1,y9,,y8185y_1, y_9, …, y_{8185} to recover s1,s9,,s8305s_1, s_9, …, s_{8305}, use y2,y10,,y8186y_2, y_{10}, …, y_{8186} to recover s2,s10,,s8306s_2, s_{10}, …, s_{8306} and so on.

2152^{15} is not an accurate estimate! In average, there are far more than 2152^{15} such sis_i’s from those yiy_i’s when the number of terms increases. Fortunately the number of sis_i’s does not affect a lot on the solution.

Now back to the topic. How do we get s0,s8,,s8(l1)s_0, s_8, …, s_{8(l-1)} from y0,y8,,y8(l16)y_0, y_8, …, y_{8(l-16)}? We can perform a search (breadth-first search here):

  1. Initialize the solution list S\mathcal{S} with the 2152^{15} elements: (0,0,0,,0)(0, 0, 0, …, 0), (0,0,0,,1)(0, 0, 0, …, 1), …, and (1,1,1,,1)(1, 1, 1, …, 1).
  2. Let s\mathbf{s} be the first element in the list.
    If s\mathbf{s} has ll entries, return S\mathcal{S}.
  3. Remove s=(s~0,s~8,,s~8(k+14))\mathbf{s} = (\tilde s_0, \tilde s_8, …, \tilde s_{8(k+14)}) from S\mathcal{S}.
    For each s~8(k+15){0,1}\tilde s_{8(k+15)} \in \{0, 1\}, if y8k=b(s~8k,s~8(k+1),s~8(k+2),s~8(k+4),s~8(k+15))y_{8k} = b(\tilde s_{8k}, \tilde s_{8(k+1)}, \tilde s_{8(k+2)}, \tilde s_{8(k+4)}, \tilde s_{8(k+15)}), append (s~0,s~8,,s~8(k+14),s~8(k+15))(\tilde s_0, \tilde s_8, …, \tilde s_{8(k+14)}, \tilde s_{8(k+15)}) to S\mathcal{S}.
  4. Go back to step 2.

We know (s0,s8,,s8(l1))S(s_0, s_8, …, s_{8(l-1)}) \in \mathcal{S} because all the below equations hold:

{y0=b(s0,s8,s16,s32,s120)y8=b(s8,s16,s24,s40,s128)y8(l16)=b(s8(l16),s8(l15),s8(l14),s8(l12),s8(l1)).\left\{\begin{aligned} y_0 &= b(s_0, s_8, s_{16}, s_{32}, s_{120}) \\ y_8 &= b(s_8, s_{16}, s_{24}, s_{40}, s_{128}) \\ & \qquad \vdots \\ y_{8(l-16)} &= b(s_{8(l-16)}, s_{8(l-15)}, s_{8(l-14)}, s_{8(l-12)}, s_{8(l-1)}). \end{aligned}\right.

Regarding the notation. I use s~i\tilde s_i’s instead of sis_i’s because they are not necessary the correct ones.

Part II: How to get the key from undecillions of sis_i’s?

Why undecillions? Suppose that we have 2152^{15} sets of s0,s8,s_0, s_8, …, 2152^{15} sets of s1,s9,s_1, s_9, … and so on. There are 215×810362^{15 \times 8} \approx 10^{36} sets of s0,s1,s2,s_0, s_1, s_2, … in total.

Let’s use the above algorithm to find these four sets of numbers:

  1. (s~0,s~8,,s~504,s~512,,s~632)S0(\tilde s_0, \tilde s_8, …, \tilde s_{504}, \tilde s_{512}, …, \tilde s_{632}) \in \mathcal{S}_0 (80 terms),
  2. (s~1,s~9,,s~505)S1(\tilde s_1, \tilde s_9, …, \tilde s_{505}) \in \mathcal{S}_1 (64 terms),
  3. (s~2,s~10,,s~506)S2(\tilde s_2, \tilde s_{10}, …, \tilde s_{506}) \in \mathcal{S}_2 (64 terms), and
  4. (s~7,s~15,,s~511)S7(\tilde s_7, \tilde s_{15}, …, \tilde s_{511}) \in \mathcal{S}_7 (64 terms).
8192 bits? I don’t need that. The challenge can be solved 513 consecutive bits, ideally, with my approach.

Recall the state transition: si+128=si+si+1+si+2+si+7s_{i+128} = s_i + s_{i+1} + s_{i+2} + s_{i+7} for all integer i0i \geq 0. Let’s define a set T0\mathcal{T}_0 (transformed from S0\mathcal{S}_0) by:

T0={(s~0+s~128,s~8+s~136,,s~504+s~632){0,1}64  (s~0,s~8,,s~632S0)}\mathcal{T}_0 = \{(\tilde s_0 + \tilde s_{128}, \tilde s_8 + \tilde s_{136}, …, \tilde s_{504} + \tilde s_{632}) \in \{0, 1\}^{64} \ | \ (\tilde s_0, \tilde s_8, …, \tilde s_{632} \in \mathcal{S}_0)\}

We can see that

{t0:=(s0+s128,s8+s136,,s504+s632)T0s1:=(s1,s9,,s505)S1s2:=(s2,s10,,s506)S2s7:=(s7,s15,,s511)S7,\left\{\begin{aligned} & \mathbf{t}_0 := (s_0 + s_{128}, s_8 + s_{136}, …, s_{504} + s_{632}) \in \mathcal{T}_0 \\ & \mathbf{s}_1 := (s_1, s_9, …, s_{505}) \in \mathcal{S}_1 \\ & \mathbf{s}_2 := (s_2, s_{10}, …, s_{506}) \in \mathcal{S}_2 \\ & \mathbf{s}_7 := (s_7, s_{15}, …, s_{511}) \in \mathcal{S}_7, \end{aligned}\right.

and eventually t0+s1+s2+s7=(0,0,,0)\mathbf{t}_0 + \mathbf{s}_1 + \mathbf{s}_2 + \mathbf{s}_7 = (0, 0, …, 0).

We can precompute every t~0,i+s~1,j{0,1}64\tilde \mathbf{t}_{0,i} + \tilde \mathbf{s}_{1,j} \in \{0, 1\}^{64}, where t~0,iT0\tilde \mathbf{t}_{0,i} \in \mathcal{T}_0 and s~1,jS1\tilde \mathbf{s}_{1,j} \in \mathcal{S}_1, and store all of them in the set L\mathcal{L}. After that, we can check if there exists i,ji, j such that

t~0,i+s~1,j=s~2,k+s~7,l,\tilde \mathbf{t}_{0,i} + \tilde \mathbf{s}_{1,j} = \tilde \mathbf{s}_{2,k} + \tilde \mathbf{s}_{7,l},

for all s~2,kS2\tilde \mathbf{s}_{2,k} \in \mathcal{S}_2 and s~7,lS7\tilde \mathbf{s}_{7,l} \in \mathcal{S}_7. If that is the case (which is very rare!), we can try to recover s0,s1,,s127s_0, s_1, …, s_{127} and see if it generates the same output bits as given. After the key is recovered, then we can recover the flag: BALSN{almost_linear_too_easy}.