The Hough Transform (HT) (Hough, 1962) is a technique that locates shapes in
images. In particular, it has been used to extract lines, circles and ellipses (or conic
sections). In the case of lines, its mathematical definition is equivalent to the Radon
transform (Deans, 1981). The HT was introduced by Hough (Hough, 1962) and was first used
to find bubble tracks rather than shapes in images. The HT implementation defines a
mapping from the image points into an accumulator space (Hough space). The mapping
is achieved in a computationally efficient manner, based on the function that describes
the target shape. This mapping requires far fewer computational resources than template
matching. Indeed, the fact that the HT is equivalent to template matching, yet much
cheaper to compute, has given sufficient impetus for the technique to be amongst the most
popular of all existing shape extraction techniques.
Many shapes are far more complex than lines, circles or ellipses and hence cannot
be defined by a single analytic equation. It is often possible to partition a complex shape into
several geometric primitives, but this can lead to a highly complex data structure. In general it is
more convenient to extract the whole shape, in other words, to characterize the shape as a
whole. This has motivated the development of techniques that can find arbitrary shapes using the
evidence-gathering procedure of the HT. These techniques again give results equivalent
to those delivered by matched template filtering, but with the computational advantage of
the evidence-gathering approach. An early approach (Merlin, 1975) offered only limited
capability for arbitrary shapes. The full mapping is called the Generalized Hough
Transform (GHT) (Ballard, 1981) and can be used to locate arbitrary shapes with
unknown position, size and orientation.
A model shape can be represented by a set of boundary points υ(θ) = (x(θ), y(θ)),
parameterized by θ. Under a similarity transformation, the corresponding point in the
image is

    ω(θ, b, λ, ρ) = b + λ R(ρ) υ(θ)                (1)

where b = (x0, y0) is the translation vector, λ is a scale factor and R(ρ) is a rotation
matrix. Here we have included explicitly the parameters of the transformation as
arguments. In order to define a mapping for the HT, the location of the shape is given by
    b = ω(θ) − λ R(ρ) υ(θ)                (2)
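As a concrete sketch of the mapping in Equation (2), the following Python fragment recovers the reference point b from an image point, a model point and assumed transformation parameters. The function name and the numeric example are ours, not from the source:

```python
import numpy as np

def reference_point(omega, upsilon, lam, rho):
    """Equation (2): b = omega - lam * R(rho) @ upsilon.

    omega   -- edge point in the image, (x, y)
    upsilon -- corresponding point on the model shape, (x, y)
    lam     -- scale factor lambda
    rho     -- rotation angle in radians
    """
    R = np.array([[np.cos(rho), -np.sin(rho)],
                  [np.sin(rho),  np.cos(rho)]])
    return np.asarray(omega) - lam * R @ np.asarray(upsilon)

# A model point at (1, 0), scaled by 2, rotated by 90 degrees and translated
# to b = (5, 5) appears in the image at omega = (5, 7); the mapping recovers b.
b = reference_point(omega=(5.0, 7.0), upsilon=(1.0, 0.0), lam=2.0, rho=np.pi / 2)
```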
Given a shape υ(θ) and a set of parameters b, λ and ρ, this equation defines the
location of the shape. However, we do not know the parameters (they are precisely what
we are looking for); we only have points on a curve in the image. In order to find
the solution we can gather evidence by using a four-dimensional accumulator space over
(x0, y0, λ, ρ). In the GHT the gathering process is performed by adding an extra constraint
to the system that allows us to match points in the image with points in the model shape. This
constraint is based on gradient direction information: φ(θ), the gradient direction at a point in
the arbitrary model. Differentiating Equation (1) with respect to θ,
    ω′(θ) = λ R(ρ) υ′(θ)                (3)
Equation (3) is true only if the gradient direction at a point in the image matches the
rotated gradient direction at a point in the (rotated) model, that is,
    φ′(θ) = φ(θ) + ρ                (4)
where φ(θ) is the gradient angle at the model point υ(θ) and φ′(θ) is the gradient angle
at the corresponding image point. Note that, according to this equation, gradient
direction is independent of scale (in theory at least) and it changes by the same amount
as the rotation. We can constrain Equation (2) to consider only the points for which
    φ′(θ) − ρ = φ(θ)                (5)
That is, a point spread function for a given edge point ω_i is obtained by selecting
the subset of points in υ(θ) such that the edge direction at the image point, rotated back by ρ,
equals the gradient direction at the model point. For each image point ω_i and each selected
point in υ(θ), the point spread function is defined by the HT mapping in Equation (2).
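The evidence-gathering process just described can be sketched as follows. This is a minimal illustration rather than an optimized implementation; the function name, the binning scheme and the angular tolerance are our own choices:

```python
import numpy as np

def ght_accumulate(image_edges, model_points, scales, rotations,
                   x_bins, y_bins, angle_tol=0.05):
    """Evidence gathering for Equation (2) under the gradient constraint
    of Equation (5), in a 4D accumulator over (x0, y0, lambda, rho)."""
    acc = np.zeros((len(x_bins), len(y_bins), len(scales), len(rotations)))
    for omega, phi_img in image_edges:              # image point + gradient angle
        for k, rho in enumerate(rotations):
            R = np.array([[np.cos(rho), -np.sin(rho)],
                          [np.sin(rho),  np.cos(rho)]])
            for upsilon, phi_mod in model_points:   # model point + gradient angle
                # Equation (5): keep only model points whose gradient direction
                # matches the image gradient direction rotated back by rho.
                diff = (phi_img - rho - phi_mod + np.pi) % (2 * np.pi) - np.pi
                if abs(diff) > angle_tol:
                    continue
                for j, lam in enumerate(scales):
                    b = np.asarray(omega) - lam * R @ np.asarray(upsilon)  # Eq. (2)
                    ix = np.searchsorted(x_bins, b[0])
                    iy = np.searchsorted(y_bins, b[1])
                    if ix < len(x_bins) and iy < len(y_bins):
                        acc[ix, iy, j, k] += 1
    return acc

# Synthetic check: two edges of a model translated by (5, 5), lambda = 1, rho = 0.
acc = ght_accumulate(
    image_edges=[((6.0, 5.0), 0.0), ((5.0, 6.0), np.pi / 2)],
    model_points=[((1.0, 0.0), 0.0), ((0.0, 1.0), np.pi / 2)],
    scales=[1.0], rotations=[0.0],
    x_bins=np.arange(10.0), y_bins=np.arange(10.0))
```

The accumulator peak falls at the cell of the reference point b = (5, 5), and only gradient-compatible model points cast votes.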
The above equations outline how arbitrary shapes can be detected. The geometry
of these equations is shown in Figure 1. Given an image point ω_i, we have to find a
displacement vector γ = −λ R(ρ) υ(θ) (a combination of rotation and scale). When the
vector γ is placed at ω_i, its end is at the point b; in GHT jargon, this point is called the
reference point. The vector can easily be obtained as γ = −λ R(ρ) υ(θ) or, alternatively,
as γ = b − ω_i. However, in order to evaluate these equations, we need to know the model point υ(θ).
Figure 1. Geometry of the GHT mapping.
Invariant GHT
The problem with the GHT (and other extensions of the HT) is that it is very
general. That is, the HT gathers evidence for a single point in the image, and a
point on its own provides little information. Thus, it is necessary to consider a large
parameter space to cover all the potential shapes defined by a given image point. The
GHT improves evidence gathering by considering a point together with its gradient direction.
However, since gradient direction changes with rotation, the evidence gathering is
improved in terms of noise handling, but little is done about computational complexity.
The computational complexity can be reduced by gathering evidence with features that
are invariant to the transformation, that is, features Q satisfying

    Q(ω(θ, b, λ, ρ)) = Q(υ(θ))                (6)
The function Q is said to be invariant and it computes a feature at the point. This
feature can be, for example, the color at the point, or any other property that does not
change between the model and the image. By considering Equation (6), we have that
Equation (5) is redefined as
    Q(ω_i) = Q(υ(θ))                (7)
That is, instead of searching for a point with the same gradient direction, we
search for the point with the same invariant feature. The advantage is that this feature will
not change with rotation or scale, so we only require a 2D accumulator space to locate the
shape. The definition of Q depends on the application and the type of transformation. The most
general invariant properties can be obtained by considering geometric definitions. In the
case of rotation and scale changes (i.e. similarity transformations), the fundamental
invariant property is given by the concept of angle. An angle is defined by three points
and its value remains unchanged when it is rotated and scaled. Thus, if we associate with
each edge point υ(θ) two further points υ(θ1) and υ(θ2), then we can compute a
geometric feature that is invariant to similarity transformations. That is,
    α(θ) = tan⁻¹( (y(θ1) − y(θ)) / (x(θ1) − x(θ)) ) − tan⁻¹( (y(θ2) − y(θ)) / (x(θ2) − x(θ)) )                (8)

and, since angles are preserved by similarity transformations, the same angle measured
at the corresponding points in the image satisfies

    α′(θ) = α(θ)                (9)
We can replace the gradient angle in the R-table (previous approach) by the angle α. The
form of the new invariant table is shown in Figure 2(c). Since the angle α does not
change with rotation or change of scale, we do not need to re-index the table for each
potential rotation and scale. However, the displacement vector does change according to
rotation and scale. Thus, if we want an invariant formulation, then we must also change
the definition of the position vector.
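The invariance claimed by Equations (8) and (9) can be checked numerically: the angle at a point subtended by its two companion points is unchanged by an arbitrary rotation, scaling and translation. The helper names and the sample points below are ours:

```python
import numpy as np

def three_point_angle(p, p1, p2):
    """Angle at p subtended by p1 and p2 (Equation 8), wrapped to [-pi, pi)."""
    v1 = np.asarray(p1) - np.asarray(p)
    v2 = np.asarray(p2) - np.asarray(p)
    a = np.arctan2(v1[1], v1[0]) - np.arctan2(v2[1], v2[0])
    return (a + np.pi) % (2 * np.pi) - np.pi

def similarity(p, lam, rho, b):
    """Apply Equation (1): b + lam * R(rho) @ p."""
    R = np.array([[np.cos(rho), -np.sin(rho)],
                  [np.sin(rho),  np.cos(rho)]])
    return np.asarray(b) + lam * R @ np.asarray(p)

# The angle is unchanged by an arbitrary similarity transformation (Equation 9).
pts = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
before = three_point_angle(*pts)
after = three_point_angle(*(similarity(p, 1.7, 0.6, (4.0, -2.0)) for p in pts))
```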
Figure 2.
Implementation –
From an implementation point of view, the following steps are performed:
1. Take a model image (containing only the shape for which matching will be
performed in other images).
2. Perform edge detection on the model image.
3. Select a point on the edge image and search for a second point at a fixed angle
(to be decided prior to coding).
4. Draw the tangents at these two points and calculate the angle between them; call
this angle α.
5. Draw the line from the first point through the centre of gravity of the shape and
calculate the angle between this line and the tangent; call this angle β.
6. We now have two parameters (α, β) for the edge point.
7. Repeat the process for each edge point. Every edge point for which a second
point was found yields a pair of parameters; otherwise the point contributes none.
8. Build the R-table from these parameters by segmenting or dividing α over the
range 0 to 2π and storing the corresponding values of β.
9. Obtain the edge image of the image in which shape matching is to be performed.
10. Select an edge point, calculate α in the same way as during creation of the R-
Table, and look up α in the R-table. If α is present, the line at the stored angle β is
drawn from that edge point into a two-dimensional accumulator array.
11. Repeat the above step for each edge point in the image.
12. Finally, the accumulator array contains the shape for which matching is being
performed. This shape will be brightest, while the rest of the array contains only
scattered lines and points.
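Steps 1 to 8 above can be sketched as follows. This is a hedged illustration under several assumptions of ours: the partner point is chosen by a fixed index offset (standing in for the "fixed angle" of step 3), tangent directions are assumed given rather than estimated from edge detection, and all names are hypothetical:

```python
import numpy as np
from collections import defaultdict

def build_r_table(points, tangents, centroid, n_bins=64):
    """Steps 1-8: index each edge point by the invariant angle alpha and
    store the angle beta between its tangent and the line to the centroid."""
    table = defaultdict(list)
    n = len(points)
    for i in range(n):
        j = (i + n // 4) % n                                   # partner point
        alpha = (tangents[j] - tangents[i]) % (2 * np.pi)      # step 4
        to_centre = np.arctan2(centroid[1] - points[i][1],
                               centroid[0] - points[i][0])
        beta = (to_centre - tangents[i]) % (2 * np.pi)         # step 5
        table[int(alpha / (2 * np.pi) * n_bins)].append(beta)  # step 8
    return table

# Sketch on a circle of 16 boundary points: the tangent at parameter t is
# t + pi/2, and the centre is always seen at a right angle to the tangent,
# so every stored beta is pi/2.
n = 16
ts = [2 * np.pi * i / n for i in range(n)]
points = [(np.cos(t), np.sin(t)) for t in ts]
tangents = [t + np.pi / 2 for t in ts]
table = build_r_table(points, tangents, centroid=(0.0, 0.0))
```

Detection (steps 9 to 12) would then compute α for each edge point of the target image, look it up in this table, and accumulate votes along the direction β in a 2D array.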
Output of Program
This is an image containing a shifted, rotated and scaled version of the bean-shaped
entity which has to be detected, based on the R-table created previously using the
model image shown above.
Accumulator Array after Executing Program
This image shows the accumulator array after execution of the program for
detection of the shape using the Invariant Generalized Hough Transform.
NOTE: The actual size of all the images shown above is 1024 × 768; they have been
reduced so as to give a clear presentation in this document.