Finding close-prime factorizations

This is a proposal for a simple factoring algorithm for moduli whose prime factors are close to each other, which in this setting can outperform the usual lattice-based attacks. More efficient algorithms exist, but this one is very simple to implement, while the more efficient ones are not. Do not quote me on this, though ;-D

Finding a good approximation

The first step is to compute a good approximation of \phi(n), which we denote \phi'. One such approximation is \phi' = n - 2\sqrt{n} + 1. Let us see why. First, assuming that n = pq, note that

\phi(n) =  (p-1)(q-1) = n - (p+q) + 1

Let us assume there exists a small \gamma such that p = a + \gamma and q = a - \gamma; then n = a^2 - \gamma^2 and p + q = 2a. So, we have that

\delta := \phi' - \phi(n) = 2a - 2\sqrt{a^2 - \gamma^2} = 2a - 2a\sqrt{1 - \gamma^2/a^2}.

Now, let \beta = \gamma^2/a^2. Then,

\delta = 2a\left(1 - \sqrt{1 - \beta}\right) = 2a \cdot \left[\beta/2 + \mathcal{O}(\beta^2)\right].

Since \beta is assumed to be small (a \gg \gamma), we can approximate this as

\delta \approx \gamma^2/a.

Great! So the running time is bounded by \mathcal{O}( \gamma^2/a) if we use pure brute force. Now, we are going to be just a little more clever than that, aren’t we?
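
Before doing so, here is what the pure brute-force baseline looks like as a minimal sketch (an illustration only, with a helper name of my own): walk downwards from \phi' and test each candidate by checking whether it yields integer factors.

import gmpy

def brute_force_close_factor(n, max_delta):
    # try phi' - delta for delta = 0, 1, 2, ... as a candidate for phi(n)
    phi_approx = n - 2 * gmpy.sqrt(n) + 1       # gmpy.sqrt is the integer square root
    delta = 0
    while delta <= max_delta:
        m = n - (phi_approx - delta) + 1        # candidate p + q
        disc = m ** 2 - 4 * n                   # discriminant of x^2 - m*x + n
        if disc >= 0 and gmpy.sqrt(disc) ** 2 == disc:
            r = gmpy.sqrt(disc)
            return (m - r) // 2, (m + r) // 2   # integer roots, i.e. the factors
        delta += 1
    return None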

Searching efficiently

We will now show how to search for a solution faster than brute force. First, we pick an arbitrary invertible element y and compute all powers of y \bmod n up to a certain parameter b, i.e.,

\mathcal B := \{y^0 \bmod n, y^1 \bmod n, \dots , y^b \bmod n\}

Obviously, this requires \mathcal O(b) time and memory. A reference implementation in Python (with y = 2) would look something like this:

# create a look-up table
look_up = {}
z = 1
for i in range(0, b + 1):
    look_up[z] = i
    z = (z * 2) % n

Now, compute

\mu := (y^{\phi'})^{-1} \bmod n.

Note that the exponent of \mu is

-\phi' \bmod \phi(n).

We now do a BSGS-type algorithm (as noted by hellman in the comments, we can instead use Pollard’s kangaroo algorithm to avoid exhausting precious memory without sacrificing computational effort, up to polynomial factors; a sketch of that variant is given at the end of this section). Generate the powers y^{bi} \bmod n for 0 \leq i \leq b and multiply each with \mu, forming c := y^{bi} \cdot \mu \bmod n. Again, we get an exponent which is

-\phi' + bi \bmod \phi(n).

For each such c computed, check whether it exists in \mathcal{B}. If it does, say c = y^k \bmod n, then \phi' - bi + k is a multiple of the order of y; since it differs from \phi(n) by far less than that order, it must equal \phi(n), and we are done.

# check the table: after k multiplications by 2^b, we have
# mu = 2^(b*k - phi_approx) mod n, with j = b*k
mu = gmpy.invert(pow(2, phi_approx, n), n)
fac = pow(2, b, n)
j = 0

while True:
    mu = (mu * fac) % n
    j += b
    if mu in look_up:
        # match: 2^(j - phi_approx) == 2^i mod n, so phi(n) = phi_approx + (i - j)
        phi = phi_approx + (look_up[mu] - j)
        break
    if j > b * b:
        # no match within the search window; b was chosen too small
        return

This is \mathcal O(b) in time. Once \phi(n) has been found, we can trivially compute the factors in negligible time: with m := p + q = n - \phi(n) + 1, the factors are the roots of x^2 - mx + n, i.e., (m \pm \sqrt{m^2 - 4n})/2.

# compute roots
m = n - phi + 1
roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) / 2, \
        (m + gmpy.sqrt(m ** 2 - 4 * n)) / 2

Cool! Since we require that \gamma^2/a \leq b^2 for a solution to be found, it suffices to pick b \approx \gamma/\sqrt{a}, so the running time is \mathcal O(\gamma/a^{1/2}), with an equal amount of memory required.
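
For completeness, here is a rough sketch of the memory-light variant hellman suggested. It treats \delta = \phi' - \phi(n) as a discrete logarithm in the interval [0, b^2] (note that y^{\phi'} \equiv y^{\delta} \bmod n) and hunts for it with Pollard’s kangaroo method. The helper name and parameter choices below are my own, and since the method is probabilistic, an unlucky run may need a retry with a fresh jump set.

import gmpy
import random

def kangaroo_phi(n, b, y=2):
    # find delta = phi' - phi(n) in [0, b*b] with y^delta == y^phi' (mod n),
    # using Pollard's kangaroo method instead of a look-up table
    W = b * b
    phi_approx = n - 2 * gmpy.sqrt(n) + 1
    target = pow(y, phi_approx, n)               # equals y^delta mod n

    # jump distances with mean about b/2, so the total work stays around O(b)
    k = 32
    jumps = [random.randrange(1, b) for _ in range(k)]
    yjumps = [pow(y, s, n) for s in jumps]

    # tame kangaroo: known exponent, sets a trap at its final position
    tame_pos, tame_exp = pow(y, W, n), W
    for _ in range(2 * b):
        idx = int(tame_pos % k)
        tame_exp += jumps[idx]
        tame_pos = (tame_pos * yjumps[idx]) % n

    # wild kangaroo: starts at the unknown exponent delta; if it ever lands on
    # a tame footprint, it follows the same deterministic path into the trap
    wild_pos, wild_dist = target, 0
    while wild_dist <= tame_exp:
        if wild_pos == tame_pos:
            delta = tame_exp - wild_dist
            return phi_approx - delta            # = phi(n)
        idx = int(wild_pos % k)
        wild_dist += jumps[idx]
        wild_pos = (wild_pos * yjumps[idx]) % n

    return None                                  # unlucky run; retry with new jumps

If it succeeds, the factors follow from the returned \phi(n) exactly as above.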

A real-world example (sort of)

This little finding inspired me to create a CTF challenge. During the SEC-T CTF 2017, one of the challenges (qproximity) could be solved using this method.

Description

We are given a public key and an encrypted message. The key is

-----BEGIN PUBLIC KEY-----
MIGeMA0GCSqGSIb3DQEBAQUAA4GMADCBiAKBgAOBxiQviVpL4G5d0TmVmjDn51zu
iravDlD4vUlVk9XK79/fwptVzYsjimO42+ZW5VmHF2AUXaPhDC3jBaoNIoa78CXO
ft030bR1S0hGcffcDFMm/tZxwu2/AAXCHoLdjHSwL7gxtXulFxbWoWOdSq+qxtak
zBSZ7R1QlDmbnpwdAgMDEzc=
-----END PUBLIC KEY-----

Extracting the modulus, we find that

\begin{array}{rl} n = &246264974647736414345408205036830544055349190030438864689361084738619 \\ &430136992438500973098730365134506003143847829773369403632725772343146 \\ &864922044439763512753030199250563829152109289871491767838931495698391 \\ &860322173235862868025625353744920431228772475066920885663471105686331 \\ &5465220853759428826555838536733 \end{array}
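
How the modulus is obtained is incidental to the attack; for reference, a minimal parsing sketch using PyCryptodome (an assumption on my part; any ASN.1/RSA parser, or openssl rsa -pubin -noout -text, works just as well):

# hypothetical key-parsing step; assumes the key above was saved as pubkey.pem
from Crypto.PublicKey import RSA

with open('pubkey.pem') as f:
    key = RSA.import_key(f.read())

print(key.n)   # the modulus n used below
print(key.e)   # the public exponent, needed later for decryption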

The factors are found as follows:

import gmpy

def close_factor(n, b):

    # approximate phi
    phi_approx = n - 2 * gmpy.sqrt(n) + 1

    # create a look-up table
    look_up = {}
    z = 1
    for i in range(0, b + 1):
        look_up[z] = i
        z = (z * 2) % n

    # check the table
    mu = gmpy.invert(pow(2, phi_approx, n), n)
    fac = pow(2, b, n)
    j = 0

    while True:
        mu = (mu * fac) % n
        j += b
        if mu in look_up:
            phi = phi_approx + (look_up[mu] - j)
            break
        if j > b * b:
            return

    m = n - phi + 1
    roots = (m - gmpy.sqrt(m ** 2 - 4 * n)) / 2, \
            (m + gmpy.sqrt(m ** 2 - 4 * n)) / 2

    return roots

n = 2462649746477364143454082050368305440553491900304388646893610847386194301369924385009730987303651345060031438478297733694036327257723431468649220444397635127530301992505638291521092898714917678389314956983918603221732358628680256253537449204312287724750669208856634711056863315465220853759428826555838536733
b = 10000000

print(close_factor(n, b))

The code runs in a matter of seconds and gives the factors

\begin{array}{rl} p = &156928319511723700336966188419841178779822856669047426731288203806365 \\ &781043486977181195772183437407585036370841822568787188893070958887580 \\ &0968904667752571 \end{array}

and

\begin{array}{rl} q = &156928319511723700336966188419841178779822856669047426731288203806365 \\ &838576177312234882203162849666809217683250766146317773098495921211843 \\ &5829588059735623 \end{array}

Using these, we recover the private key and decrypt the message to obtain the flag
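
The decryption itself is textbook RSA; a quick sketch, assuming the ciphertext has been loaded as the integer c and e taken from the public key (the variable names are my own):

import gmpy

# p, q and n as found above; e from the public key; c is the ciphertext integer
phi = (p - 1) * (q - 1)
d = gmpy.invert(e, phi)      # private exponent
m = pow(c, d, n)             # textbook RSA decryption (ignoring any padding details)
print(hex(int(m)))           # hex-decode the plaintext to read the flag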

SECT{w3ll_th4t_wasnt_2_h4rd?}