This is a proposal for a simple factoring algorithm which, for sufficiently close primes, actually performs better than the usual attacks based on lattice-type algorithms. Although more efficient algorithms exist, this one is very simple to implement, while the more efficient ones are not. Do not quote me on this, though ;-D
Finding a good approximation
The first step is to compute a good approximation of φ = (p − 1)(q − 1) = n − (p + q) + 1. Denote this approximation φ_approx. One such good approximation is

φ_approx = n − 2√n + 1.

Let us see why. First, assuming that p < q, note that p < √n < q. Let us assume there exists a small Δ such that p = m − Δ and q = m + Δ, where m = (p + q)/2. So, we have that

n = pq = (m − Δ)(m + Δ) = m² − Δ².

Now, let E = φ_approx − φ denote the approximation error. Then,

E = (p + q) − 2√n = 2m − 2√(m² − Δ²).

Since Δ is assumed to be small (Δ ≪ √n), we can approximate it as

E ≈ Δ²/m ≈ Δ²/√n.
Great! So the running time is bounded by O(Δ²/√n) if we use pure brute force over the error E. Now, we are going to be just a little more clever than that, aren’t we?
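To get a feel for how good the approximation is in practice, here is a small sanity check with toy primes of my own choosing (hypothetical values, not from the post):

```python
from math import isqrt

# hypothetical close primes (toy example)
p, q = 1000003, 1000033
n = p * q
phi = (p - 1) * (q - 1)

# the approximation: phi_approx = n - 2*floor(sqrt(n)) + 1
phi_approx = n - 2 * isqrt(n) + 1

# the error E = phi_approx - phi is tiny: roughly delta^2 / sqrt(n),
# plus at most a couple of units from truncating the square root
E = phi_approx - phi
print(E)
```

Here Δ = (q − p)/2 = 15 while √n ≈ 10⁶, so the error is only a couple of units and a brute-force search downwards from φ_approx would find φ almost immediately.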
Searching efficiently
We will now show how to search for a solution faster than brute force. First, we pick an arbitrary invertible element g ∈ Z_n* and compute all powers of g up to a certain parameter b, i.e., the set {g⁰, g¹, …, g^b mod n}. Obviously, this requires O(b) time and memory. A reference implementation in Python (with g = 2) would look something like:
# create a look-up table
look_up = {}
z = 1
for i in range(0, b + 1):
    look_up[z] = i
    z = (z * 2) % n
Now, compute μ = (g^φ_approx)⁻¹ mod n. Note that the exponent of μ is −φ_approx = −(φ + E), so by Euler’s theorem μ ≡ g^(−E) (mod n).
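As a quick sanity check: since 2^φ ≡ 1 (mod n), the quantity μ = (2^φ_approx)⁻¹ mod n is congruent to 2^(−E), where E = φ_approx − φ. A toy verification (my own numbers, g = 2; the modular inverse uses Python 3.8+’s pow(x, -1, n)):

```python
from math import isqrt

p, q = 1000003, 1000033          # hypothetical close primes
n = p * q
phi = (p - 1) * (q - 1)
phi_approx = n - 2 * isqrt(n) + 1
E = phi_approx - phi

# mu = (2^phi_approx)^(-1) mod n
mu = pow(pow(2, phi_approx, n), -1, n)

# since 2^phi = 1 (mod n), mu is congruent to 2^(-E) (mod n)
assert (mu * pow(2, E, n)) % n == 1
```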
We now do a BSGS-type of algorithm (as noted by hellman in the comments, we can use Pollard’s kangaroo algorithm to avoid exhausting precious memory without sacrificing computational effort, up to polynomial factors). Generate all powers g^(jb) for j = 1, 2, 3, … and multiply with μ, forming μ · g^(jb). Again, we get an exponent which is jb − E. For each such product computed, check against the look-up table if it exists there. If it does, say at index i, then jb − E = i, and we have found E and thereby φ = φ_approx − E.
# check the table
mu = gmpy.invert(pow(2, phi_approx, n), n)
fac = pow(2, b, n)
j = 0
while True:
    mu = (mu * fac) % n
    j += b
    if mu in look_up:
        phi = phi_approx + (look_up[mu] - j)
        break
    if j > b * b:
        return
This is in O(b) time. Once φ has been found, we can trivially compute the factors in negligible time as the roots of x² − mx + n, where m = p + q = n − φ + 1:
# compute roots
m = n - phi + 1
roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) // 2, \
        (m + gmpy.sqrt(m ** 2 - 4 * n)) // 2
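For instance, with toy primes of my own choosing, the quadratic immediately gives back the factors:

```python
from math import isqrt

p, q = 1000003, 1000033          # hypothetical close primes
n = p * q
phi = (p - 1) * (q - 1)          # pretend this was recovered by the search

# p and q are the roots of x^2 - m*x + n, where m = p + q = n - phi + 1
m = n - phi + 1
d = isqrt(m * m - 4 * n)         # the discriminant (q - p)^2 is a perfect square
roots = (m - d) // 2, (m + d) // 2
assert roots == (p, q)
```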
Cool! Since we require that E ≤ b² for a solution to be found, the running time is O(b) = O(Δ/n^(1/4)), with an equal amount of memory required. Note that this is a square-root speed-up compared to the O(b²) brute-force search.
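Putting the table and the search together on toy numbers (my own example, with g = 2 and a deliberately tiny b, since the error E here is only a couple of units):

```python
from math import isqrt

p, q = 1000003, 1000033          # hypothetical close primes
n = p * q
phi_approx = n - 2 * isqrt(n) + 1
b = 10                           # must satisfy E <= b^2

# baby steps: table of 2^i mod n for i = 0..b
look_up = {}
z = 1
for i in range(b + 1):
    look_up[z] = i
    z = (z * 2) % n

# giant steps: mu * 2^(jb) has exponent jb - E
mu = pow(pow(2, phi_approx, n), -1, n)
fac = pow(2, b, n)
j = 0
phi = None
while j <= b * b:
    mu = (mu * fac) % n
    j += b
    if mu in look_up:
        phi = phi_approx + (look_up[mu] - j)
        break

assert phi == (p - 1) * (q - 1)
```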
Note that we can further reduce the space required by a factor of two. Since we know that φ is even, only even exponents can ever match, so the look-up table needs to store only even powers and the search can do increments skipping odd values.

# create a look-up table of even powers only
look_up = {}
z = 1
for i in range(0, 2 * b + 1, 2):
    look_up[z] = i
    z = (z * 4) % n

# check the table
phi_even = phi_approx // 2 * 2
mu = gmpy.invert(pow(2, phi_even, n), n)
fac = pow(4, b, n)
j = 0
while True:
    mu = (mu * fac) % n
    j += 2 * b
    if mu in look_up:
        phi = phi_even + (look_up[mu] - j)
        break
    if j > 2 * b * b:
        return
A real-world example (sort of)
This little finding inspired me to create a CTF challenge. During the SEC-T CTF 2017, one of the challenges (qproximity) could be solved using this method.
Description
We are given a public key and an encrypted message. The key is
-----BEGIN PUBLIC KEY-----
MIGeMA0GCSqGSIb3DQEBAQUAA4GMADCBiAKBgAOBxiQviVpL4G5d0TmVmjDn51zu
iravDlD4vUlVk9XK79/fwptVzYsjimO42+ZW5VmHF2AUXaPhDC3jBaoNIoa78CXO
ft030bR1S0hGcffcDFMm/tZxwu2/AAXCHoLdjHSwL7gxtXulFxbWoWOdSq+qxtak
zBSZ7R1QlDmbnpwdAgMDEzc=
-----END PUBLIC KEY-----
Extracting the modulus n from the key, the factors are found as follows:
import gmpy

def close_factor(n, b):
    # approximate phi
    phi_approx = n - 2 * gmpy.sqrt(n) + 1

    # create a look-up table
    look_up = {}
    z = 1
    for i in range(0, b + 1):
        look_up[z] = i
        z = (z * 2) % n

    # check the table
    mu = gmpy.invert(pow(2, phi_approx, n), n)
    fac = pow(2, b, n)
    j = 0
    while True:
        mu = (mu * fac) % n
        j += b
        if mu in look_up:
            phi = phi_approx + (look_up[mu] - j)
            break
        if j > b * b:
            return

    # compute the roots of x^2 - m*x + n
    m = n - phi + 1
    roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) // 2, \
            (m + gmpy.sqrt(m ** 2 - 4 * n)) // 2
    return roots

n = 2462649746477364143454082050368305440553491900304388646893610847386194301369924385009730987303651345060031438478297733694036327257723431468649220444397635127530301992505638291521092898714917678389314956983918603221732358628680256253537449204312287724750669208856634711056863315465220853759428826555838536733
b = 10000000
close_factor(n, b)
The code runs in a matter of seconds and gives the two prime factors.
Using these, we construct the flag
SECT{w3ll_th4t_wasnt_2_h4rd?}
Edit: I needed a threaded memory-less implementation. It can be found here.
Interesting! What are the “more efficient” algorithms that you mention?
There are algorithms which use bivariate Coppersmith to achieve this (e.g. https://hsbp.org/tiki-download_wiki_attachment.php?attId=174). Basically, it tolerates a weaker approximation of the factors, so that n still is factorable in polynomial time.
I think instead of BSGS you can use Pollard’s Kangaroo (lambda), no memory used. Though slightly more code 😉
Definitely! I will add it to the text!
What are the other, “more efficient” methods you mention?
Hey there, just a short question concerning the sqrt…. Do you use the results as floats or rounded / truncated?
Cheers