# On cycles in hash functions

Assume there is a black box, that given a cryptographic hash function $h$, finds some $x$ and $r$ in polynomial time, such that applying $h()$, $r$ times gives $x$: $h(h(...(h(x)))) = x$

The question is if an oracle, denoted $\mathcal{O}$ in the text, can be used to prove if this function is not collision resistant or pre-image resistant.

We claim the following, without proof (for now):

Theorem 1: Given a random hash function $h(x)$ and an oracle $\mathcal{O}$ (according to the above), then a collision $h(x) = h(y), x \neq y$ can be found in $\textnormal{lcm}(1,2,\ldots,n)$ time if $h(x)$ has a cycle of length $\leq n$.

We will spend the following sections to prove this. Let us first consider the simplest construction for which the above claim holds. We also state the following corollary.

Corollary 1: If the function $h$ in Theorem 1 does not have a cycle, then there exists an polynomial-time probabilistic algorithm which randomizes the cycle structure and solves the problem with probability 1.

By Corollary 1, an adversary cannot craft a special function $h$ that averts the result of Theorem 1.

## Fixed points

We start off by looking how the oracle can be used to derive fixed points, i.e. elements in $h$ such that $h(x) = x$. If the oracle $\mathcal{O}$ was to return the shortest cycle, it would immediately tells us if there is a fixpoint $h(x) = x$ or not.

Let us assume the oracle instead picks a cycle at random (which is a reasonable assumption, as this is how cycle-finding algoritm works). We create an auxilliary function $h'$ as follows $h'(x) = \begin{cases} x & \quad \text{if } h(x) = x,\\ x+1 & \quad \text{if } h(x) \neq x. \end{cases}$

Or (equivalently) in Python:

def h_prime(x):
if h(x) == x: # is fixpoint
return x
else:
return x + 1


Once our new modified function $h'$ is fed to the oracle, the oracle in turn will give a fix point for both $h$ and $h'$ if one such exists in $h$. More specifically, upon calling the oracle with $h'$, it will return $r = 1$ and $h'(x) = x$ for some $x$. Obviously, no other point can be a fixpoint nor in a cycle (unless the value of $x$ wraps around and no fixpoints exist), just leading up to a fixpoint. So, the oracle can only return fixpoints and will do so if one exists.

Assuming that the function $h$ behaves randomly, then the expected number of fixpoints is $1.$ As a result, we can, with high probability find at least one element that is a fixpoint. We will see that this can in fact be fixed by using randomization of the cycle structure, by composition with a random permutation.

A preimage as also easy to find. Assume that we want to find $h(x) = A$. We then construct $g_A(x) = \begin{cases} x & \quad \text{if } h(x) = A,\\ x+1 & \quad \text{otherwise}. \end{cases}$

Any value leading to $A$ will be a 1-cycle (or fixpoint) in the $g_A$. No other cycles will exist.

## Random oracles

Clearly, the above idea can be used to find fixpoints in any function $h$ acting as a random oracle (or any cycle of length $r$ that we see fit). We will now see how to extend this idea further.

If a point $y$ leads to a fixpoint $x$, then necessarily we have a collision $h(y) = h(x) = x$. We call $y$ a second-order fixpoint (c.f. $x$, which is a first-order fixpoint).

To include the second-order points in the result from the oracle, we simply remove the first-order fixpoints and turn the second-order fixpoints into first-order fixpoints.

def h_prime(x):
if h(x) == x: # is fixpoint
return x+1
elif h(x) == h(h(x)):
# leads to fixpoint, so x != h(x)
return x
else:
return x+1


Now, we can use this to determinstically find a collision. Consider an algorithm which calls the oracle with the modified function $h'$. Assume that $\mathcal{O}(h')$ returns $y$. If $y$ is of second order, then we have a collision.

Obviously a first-order fixpoint is a 1-cycle. But what about $m$-cycles? Consider the following (efficiently computable function) $\hat h_m(x) = \begin{cases} x+1 & \quad \text{if } h^m(x) = x,\\ x & \quad \text{else if } h^{m+1}(x) = h(x),\\ x+1 & \quad \text{otherwise}. \end{cases}$

How does this work? Every element that is in a $m$-cycle will be turned into elements leading to a 1-cycle, and every element that jumps into a cycle (but is not part of it) is turned into a 1-cycle.

By setting $m = \textnormal{lcm}(1,2,\ldots,n)$ and calling the oracle with the function $\hat h_m(x)$, we will ultimately find a collision if $h$ admits a cycle of length up to $n$. If the cost of computing $\hat h_m(x)$ and $h^m(x)$ can be neglected, then the complexity is constant. Evidently, the collision is between the elements $h^m(x)$ and $y$. This concludes the proof.

If no cycle exists of length $\leq m$ (let us assume the adversary created one with special structure to thwart this attack), then it is possible to do a re-randomization of $h$. By composition with a random permutation $\pi$, we create $h' = \pi \circ h$. This will alter the cycle structure (and can be done multiple times). After finding a collision $x,y$ in $h'$, we can find one in $h$ as $\pi(x), \pi(y)$.

Algorithm 1:

Input: function $h$
Output: Colliding pair $x,y$

1. Pick a random (efficiently computable) permutation $\pi$. Let $\hat{h} = \pi \circ h$.
2. Construct $h'(x) = \begin{cases} x+1 & \quad \text{if } \hat{h}(x) = x,\\ x & \quad \text{else if } \hat{h}(\hat{h}(x)) = \hat{h}(x),\\ x+1 & \quad \text{otherwise}. \end{cases}$
3. Ask oracle $\mathcal{O}$ to find a cycle in $h'$, call this $y$. If the returned length is not 1, to go 1 (first step).
4. Return collision $\pi(h'(y)), \pi(y)$

As a final note: If the output should have a finite image in, say, $k$ bits, then the $x+1$ must be taken $\mod 2^{k}$. This is no problem, as this will not constitute a cycle if there exists at least one 1-cycle.

## A simple example

What would a result like this be without a real-world example? Here is a toy example, which assumes $n = 3$. Hence $m = \textnormal{lcm}(1,2,3) = 6$.

from random import randint
from hashlib import sha256

def h(x):
return int(
sha256(str(x)).hexdigest()[2:6], 16
)

def h_prime(x):
# x = h^6(x)
if x == h(h(h(h(h(h(x)))))):
return x + 1
elif h(x) == h(h(h(h(h(h(h(x))))))):
return x
else:
return x + 1

def oracle(h):
x0 = randint(0, 256**2)
tortoise = x0
hare = h(x0)

while True:
tortoise = h(tortoise)
y = h(hare)
hare = h(y)
if tortoise == y or tortoise == hare:
break

return hare

y = oracle(h_prime)
x = h(h(h(h(h(h(y))))))

assert(y != x)
assert(h(x) == h(y))


## The decision oracle flavour

Worth noting, we can solve this problem with a weaker form of the oracle. If we have a fixed-point (or $m$-cycle) oracle that can be asked if the function $h$ has this property and replies with yes or no, then this can be turned into the original oracle. By construction, we can keep half of the function $h$ to remain the same, while the other part is crafted so that it has no cycles. Then, by using binary search, it is possible to determine the fixed point (or $m$-cycle).