The Hough transform is a useful tool in image analysis and in particular feature extraction. The technique uses a voting procedure to detect patterns such as lines, circles or other well-defined geometrical structures. First, the image is transformed into parameter space. In constrast to e.g. the Fourier transform, the Hough transform is not an involution. Any two points lying on the same line in image space will project into the same point in transform space. As a result, candidates of the sought geometric form appear as local maximas in transform space.
Let us illustrate this by a simple example. Consider the image below, depicting an octagon .
For each pixel in the figure, we check whether its value is within a certain threshold. In this case, if it is zero. Essentially, this means that the point lies on the octagon. In transform space, this point corresponds to a sinusoidal curve given by
.
If two points and
appear on the same line, we have that the two curves will intersect in the transform space. It must necessarily hold that
.
A slight re-arrangement gives
,
which describes the slope of the line. Hence, the two points lie on the same line.
Embodied in Python, it can look something like the code below.
from PIL import Image import numpy import math def line_transform(matrix): transformPlane = numpy.zeros(((matrix.shape[1]+matrix.shape[0])*2,180)) delta = matrix.shape[0] + matrix.shape[1] for y in xrange(matrix.shape[1]): for x in xrange(matrix.shape[0]): if A[x][y] == 0: for theta in xrange(0,180): phi = (y * math.sin(math.radians(theta)) + x * math.cos(math.radians(theta))) transformPlane[phi+delta][theta] += 1 for theta in xrange(0,180): for phi in xrange((matrix.shape[1]+matrix.shape[0])*2): if transformPlane[phi][theta] > 110: k = -1 / math.tan(math.radians(theta)) b = (phi-delta) / math.sin(math.radians(theta)) for x in xrange(matrix.shape[0]): y = int(k*x + b) if y < matrix.shape[1] and y > -1: A[x][y] = 100 Image.fromarray(A, 'L').show() img = Image.open("octagon.gif") A = numpy.array(img.getdata(), numpy.uint8).reshape(img.size[1], img.size[0]) line_transform(A)
The code returns a picture as given below, where the significant lines have been detected.