# Finding close-prime factorizations

This is a proposal for a simple factoring algorithm which actually performs better than regular attacks based on lattice-type algorithms. Although more efficient algorithms exists, this one is very simple to implement, while the more efficient ones are not. Do not quote me on this, though ;-D

## Finding a good approximation

The first step is to compute an good approximation of $\phi(n)$. Denote this $\phi'$. One such good approximation is $\phi' = n - 2\sqrt{n} + 1$. Let us see why. First, assuming that $n = pq$, note that

$\phi(n) = (p-1)(q-1) = n - (p+q) + 1$

Let us assume there exists a small $\gamma$ such that $p = a + \gamma$ and $q = a - \gamma$. So, we have that

$\delta := \phi' - \phi(n) = 2\sqrt{a^2 - \gamma^2} - 2a = 2a\sqrt{(1 - \gamma^2/a^2)} - 2a.$

Now, let $\beta = \gamma^2/a^2$. Then,

$\delta =2a\left(\sqrt{1 - \beta} - 1\right) = 2a \cdot \left[\beta/2 + \mathcal{O}(\beta^2)\right]$.

Since $\beta$ is assumed to be small ($a >> \gamma$), we can approximate it as

$\delta \approx \gamma^2/a.$

Great! So the running time is bounded by $\mathcal{O}( \gamma^2/a)$ if we use pure brute force. Now, we are going to be just a little more clever than that, aren’t we?

## Searching efficiently

We will now show how to search for a solution faster than brute force. First, we pick an arbitrary invertible element $y$ and compute all powers of $y \bmod n$ up to a certain parameter $b$, i.e.,

$\mathcal B := \{y^0 \bmod n, y^1 \bmod n, \dots , y^b \bmod n\}$

Obviously, this requires $\mathcal O(b)$ time and memory. A reference implementation in Python (with $y = 2$) would look something like.

# create a look-up table
look_up = {}
z = 1
for i in range(0, b + 1):
look_up[z] = i
z = (z * 2) % n


Now, compute

$\mu := (y^{\phi'})^{-1} \bmod n.$

Note that the exponent of $\mu$ is

$-\phi' \bmod \phi(n)$.

We now do a BSGS-type of algorithm (as noted by hellman in the comments, we can use Pollard’s kangaroo algorithm to avoid exhausting precious memory without sacrificing computational effort, up to polynomial factors). Generate all powers $y^{bi} \bmod n$ for  $0 \leq i \leq b$ and multiply with $\mu$ forming $c:= y^{bi} \cdot \mu$. Again, we get an exponent which is

$-\phi' + bi \bmod \phi(n).$

For each such computed, check against $\mathcal{B}$ is it exists. If so, we have found $\phi(n)$.

# check the table
mu = gmpy.invert(pow(2, phi_approx, n), n)
fac = pow(2, b, n)
j = 0

while True:
mu = (mu * fac) % n
j += b
if mu in look_up:
phi = phi_approx + (look_up[mu] - j)
break
if j > b * b:
return


This is $\mathcal O(b)$ in time. Once $\phi(n)$ has been found, we can trivially compute the factors in negligible time as

# compute roots
m = n - phi + 1
roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) / 2, \
(m + gmpy.sqrt(m ** 2 - 4 * n)) / 2


Cool! Since we require that $\gamma^2/a \leq b^2$ for a solution to be found, the running time is $\mathcal O(\gamma/a^{1/2})$ and with equal amount of memory required.

## A real-world example (sort of)

This little finding inspired me to create a CTF challenge. During the SEC-T CTF 2017, one of the challenges (qproximity) could be solved using this method.

Description

We are given a public key and an encrypted message. The key is

-----BEGIN PUBLIC KEY-----
iravDlD4vUlVk9XK79/fwptVzYsjimO42+ZW5VmHF2AUXaPhDC3jBaoNIoa78CXO
ft030bR1S0hGcffcDFMm/tZxwu2/AAXCHoLdjHSwL7gxtXulFxbWoWOdSq+qxtak
zBSZ7R1QlDmbnpwdAgMDEzc=
-----END PUBLIC KEY-----


Extracting the modulus, we find that

$\begin{array}{rl} n = &246264974647736414345408205036830544055349190030438864689361084738619 \\ &430136992438500973098730365134506003143847829773369403632725772343146 \\ &864922044439763512753030199250563829152109289871491767838931495698391 \\ &860322173235862868025625353744920431228772475066920885663471105686331 \\ &5465220853759428826555838536733 \end{array}$

The factors are found as follows

def close_factor(n, b):

# approximate phi
phi_approx = n - 2 * gmpy.sqrt(n) + 1

# create a look-up table
look_up = {}
z = 1
for i in range(0, b + 1):
look_up[z] = i
z = (z * 2) % n

# check the table
mu = gmpy.invert(pow(2, phi_approx, n), n)
fac = pow(2, b, n)
j = 0

while True:
mu = (mu * fac) % n
j += b
if mu in look_up:
phi = phi_approx + (look_up[mu] - j)
break
if j > b * b:
return

m = n - phi + 1
roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) / 2, \
(m + gmpy.sqrt(m ** 2 - 4 * n)) / 2

return roots

n = 2462649746477364143454082050368305440553491900304388646893610847386194301369924385009730987303651345060031438478297733694036327257723431468649220444397635127530301992505638291521092898714917678389314956983918603221732358628680256253537449204312287724750669208856634711056863315465220853759428826555838536733
b = 10000000

close_factor(n, b)


The code runs in matter of seconds and gives the factors

$\begin{array}{rl} p = &156928319511723700336966188419841178779822856669047426731288203806365 \\ &781043486977181195772183437407585036370841822568787188893070958887580 \\ &0968904667752571 \end{array}$

and

$\begin{array}{rl} q = &156928319511723700336966188419841178779822856669047426731288203806365 \\ &838576177312234882203162849666809217683250766146317773098495921211843 \\ &5829588059735623 \end{array}$

Using these, we construct the flag

SECT{w3ll_th4t_wasnt_2_h4rd?}

## 6 thoughts on “Finding close-prime factorizations”

1. Interesting! What are the “more efficient” algorithms that you mention?

1. There are algorithms which use bivariate Coppersmith to achieve this (e.g. https://hsbp.org/tiki-download_wiki_attachment.php?attId=174). Basically, it allows a smaller $\alpha$ such that $|p-q| < n^{1/\alpha}$, so that $n=pq$ still is factorable in polynomial time.

2. hellman says:

I think instead of BSGS you can use Pollard’s Kangaroo (lambda), no memory used. Though slightly more code 😉

3. hellman says:

What are the other, “more efficient” methods you mention?

4. Ronald says:

Hey there, just a short question concerning the sqrt…. Do you use the results as floats or rounded / truncated?

Cheers