
ICGST-GVIP Journal, Volume 5, Issue3, March 2005

A Vertex Chain Code Approach for Image Recognition


Abdel-Badeeh M. Salem, Adel A. Sewisy, Usama A. Elyan Faculty of Computer and Information Sciences, Assiut University, Assiut, Egypt, usama471@yahoo.com, http://www.aun.eun.eg

Abstract
Shape representation has always been an important topic in image processing and pattern recognition. This work deals with the representation of shape based on a new boundary chain code and uses this chain code to recognize objects. Chain code techniques are widely used to represent an object because they preserve information and allow considerable data reduction. In this paper the vertex chain code (VCC) is presented; this chain code is based on the work of E. Bribiesca for shapes composed of regular cells. The paper also discusses the capabilities of the VCC in recognizing objects. The results show that, with the string edit distance as the similarity measure, the VCC recognizes images better than the classical (Freeman) chain code.
Keywords: Image processing, Pattern recognition, Computer vision, String edit distance.

Introduction

The first approach for representing digital curves using chain code was introduced by Freeman in 1961 [4]. Classical methods for processing chains are surveyed in [5]. Freeman [5] states that, in general, a coding scheme for line structures must satisfy three objectives: (1) it must faithfully preserve the information of interest; (2) it must permit compact storage and be convenient for display; and (3) it must facilitate any required processing. The VCC also complies with these three objectives, and has some important differences. E. Bribiesca [1] states some important characteristics of the VCC: (1) The VCC is invariant under translation and rotation, and optionally may be invariant under starting point and mirroring transformation. (2) Using the VCC it is possible to represent shapes composed of triangular, rectangular, and hexagonal cells. (3) The chain elements represent real values, not symbols as in other chain codes; they are part of the shape, indicate the number of cell vertices of the contour nodes, and may be operated on to extract interesting shape properties. (4) Using the VCC it is possible to obtain relations between the bounding contour and the interior of the shape.

In this paper we compute the chain code of an image using the classical method and the VCC method. The images considered here are binary images with an outer contour only; in other words, there are no holes in the objects. We also show how these chain codes can be used to recognize the object. This paper is organized as follows. Section 2 presents a new method for extracting the contour of a binary image. Section 3 gives a simple example of computing the classical and the vertex chain code. Section 4 shows how the string edit distance can be used to compare chains and so measure the similarity between images. In Section 5 we use the $\chi^2$ method to address the scaling problem. Finally, Section 6 gives some conclusions.

Boundary extraction

The first step in the construction of the chain code is to extract the boundary of the image. Chains can represent the boundaries or contours of any discrete shape composed of regular cells. In the context of this work, the length l of each side of a cell is considered equal to one. These chains represent closed boundaries; thus, all chains are closed. Extracting the contour depends on the connectivity. In this paper we use pixels with four-connectivity. The simplest contour following algorithms were presented by Papert [10] and Duda and Hart [9]. Using these algorithms it is possible to represent shape contours by only two states: left turn (represented by "1") and right turn (represented by "0"). This process produces a chain composed of only binary elements. Figure (1) illustrates contour following on an image composed of pixels. This contour was obtained according to the following algorithm:


Figure 4: Three simple binary images.

Figure 1: Example of contour following on a digital figure.

Figure 2: Directions of the neighbors: (a) 4-connected; (b) 8-connected.

Scan the picture until a figure cell is encountered. Then:
If you are in a figure cell, turn left and take a step.
If you are in a ground cell, turn right and take a step.
Terminate when you are within one cell of the starting point.

In this paper we propose a new algorithm to find the contour of a binary image and use this contour to obtain the chain code. Since we use pixels with 4-connectivity, the four neighbors of any point can be represented by directions as illustrated in figure (2a). To find the contour of a binary image we apply the following algorithm:
Step 1. For all pixels with value 0 (black) in the image, set the neighboring pixel in direction 2 (4-connected) to 0.
Step 2. In the new image (i.e., the image obtained from Step 1), again for all pixels with value 0, set the neighboring pixel in direction 1 (4-connected) to 0.
Step 3. Remove the old pixels (those of the original image) whose 8-connected neighbors, as shown in figure (2b), do not satisfy the conditions shown in figure (3).

Example: if we apply the previous algorithm to the three images shown in figure (4), we obtain the contours of the images as depicted in figure (5). Figure (5a) shows Step 1 of the algorithm, figure (5b) shows Step 2, and figure (5c) shows Step 3.

Figure 5: The three steps for finding the contour of the binary images shown in figure (4).

Note that we can apply this algorithm to real images to obtain their contours. Figure (6) shows a tree leaf (a) and its contour (b). The algorithm can also be applied to binary images that contain holes. Figure (7) shows the contour of an image with two holes.
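The turn-left/turn-right rule of Papert and of Duda and Hart quoted above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function name `trace_contour` is hypothetical, figure cells are encoded as 1 and ground cells as 0, cells outside the picture count as ground, and the walk stops when it re-enters the start cell with its initial heading (a common variant of the "within one cell of the starting point" criterion).

```python
def trace_contour(image):
    """Follow the contour of the first figure region met by a raster scan.

    image: list of rows; 1 = figure cell, 0 = ground cell (assumption of this sketch).
    Returns (contour_cells, turns); turns uses "1" for a left turn, "0" for a right turn.
    """
    rows, cols = len(image), len(image[0])

    def is_figure(r, c):
        # Cells outside the picture are treated as ground.
        return 0 <= r < rows and 0 <= c < cols and image[r][c] == 1

    start = next(((r, c) for r in range(rows) for c in range(cols)
                  if image[r][c] == 1), None)
    if start is None:
        return [], ""

    start_heading = (0, 1)  # the raster scan reaches the start cell moving right
    pos, heading = start, start_heading
    contour, turns = [], []
    # The walker stays within one cell of the figure, so the number of
    # (cell, heading) states is a safe upper bound on the number of steps.
    for _ in range(4 * (rows + 2) * (cols + 2)):
        if is_figure(*pos):
            if not contour or contour[-1] != pos:
                contour.append(pos)              # cells may repeat on thin parts
            heading = (-heading[1], heading[0])  # turn left
            turns.append("1")
        else:
            heading = (heading[1], -heading[0])  # turn right
            turns.append("0")
        pos = (pos[0] + heading[0], pos[1] + heading[1])
        if pos == start and heading == start_heading:
            break
    return contour, "".join(turns)


square = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
contour, turns = trace_contour(square)
print(contour)  # figure cells in the order they are visited
print(turns)    # binary chain of left ("1") and right ("0") turns
```

For the 2 x 2 square every figure cell is followed by two ground cells, so the binary chain repeats the pattern left, right, right.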

Figure 6: A tree leaf: (a) the original image; (b) contour of the image.


Figure 3: Four conditions to remove the old pixels.

Figure 7: Image with two holes: (a) original image; (b) its contour.

Figure 8: The classical and the vertex chain code: (a) a shape composed of pixels; (b) the same shape; (c) the elements of the classical chain code; (d) the elements of the VCC.

The classical and the vertex chain code

Chains can represent the boundaries or contours of any image. The classical (Freeman) chain is defined as the sequence of directions of the object's contour from a starting point [8], while an element of the VCC indicates the number of cell vertices that are in touch with the bounding contour of the shape at that element position [1]. Figure (8a) presents a shape composed of pixels with its directions. Figure (8b) shows the same shape with its VCC elements. Figure (8c) shows the classical chain code of the shape. Figure (8d) shows the VCC of the shape. Note that when we use pixels, the VCC has only three different numbers of cell vertices for the bounding contour: 1, 2, and 3 [1]. The classical and the vertex chain code can be made invariant under starting point and rotation by using the concept of the shape number. The shape number of the classical chain is derived from the chain code by taking the differences of successive elements of the chain code in the counterclockwise direction to obtain the difference code, and then rotating the digits of the difference code until the number is minimal. The shape number of the VCC is obtained directly by rotating the digits of the chain until the number is minimal. The difference code of the classical chain code shown in figure (8c) is 10101131, and its shape number is 01011311. Also, the shape number of the VCC shown in figure (8d) is 11212113. Note that the shape number is invariant under rotation: if the object is rotated by $k\pi/2$, where k is an integer, the shape number is the same.
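The difference code and shape number described above can be sketched as follows. The function names are hypothetical, and the difference is taken circularly as (c[i] - c[i-1]) mod 4, which is one of two equivalent conventions (the other is a rotation of it and gives the same shape number). The chain 00112323 is the one the paper lists for figure (12a); its difference code reproduces the value 10101131 quoted for figure (8c), which suggests, as an assumption here, that both figures show the same shape.

```python
def difference_code(chain, directions=4):
    """Circular first difference: d[i] = (c[i] - c[i-1]) mod directions."""
    digits = [int(ch) for ch in chain]
    return "".join(str((digits[i] - digits[i - 1]) % directions)
                   for i in range(len(digits)))


def min_rotation(code):
    """Rotate the digit string until it forms the smallest number."""
    if not code:
        return code
    return min(code[i:] + code[:i] for i in range(len(code)))


def classical_shape_number(chain):
    """Shape number of a classical 4-direction chain code."""
    return min_rotation(difference_code(chain))


def vcc_shape_number(vcc):
    """Shape number of a vertex chain code (no differencing needed)."""
    return min_rotation(vcc)


chain = "00112323"
print(difference_code(chain))         # 10101131
print(classical_shape_number(chain))  # 01011311
print(vcc_shape_number("11212113"))   # 11212113

# Rotating the shape by 90 degrees adds 1 (mod 4) to every element;
# changing the starting point rotates the string. Neither changes the result.
rotated = "".join(str((int(ch) + 1) % 4) for ch in chain)
shifted = chain[3:] + chain[:3]
print(classical_shape_number(rotated), classical_shape_number(shifted))
```

Because all rotations of a digit string have the same length, comparing them as strings is equivalent to comparing them as numbers.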

String edit distance and shape comparison

String matching is one of the basic techniques in structural pattern recognition. In particular, string edit distance can be used to measure the similarity of objects that are represented as strings. String edit distance is based on a set of edit operations, for example the insertion, deletion, and substitution of individual symbols in a string. Often a cost is assigned to each edit operation to model its likelihood of occurrence. Given a set of edit operations together with their costs, the edit distance $d(x, y)$ of two strings $x$ and $y$ is defined as the minimum cost taken over all sequences of edit operations that transform $x$ into $y$. String edit distance has been successfully applied to a number of problems in pattern recognition, for example two-dimensional shape recognition [3, 11, 2] and symmetry analysis [6].

Let $A$ be a finite alphabet of symbols, and let $\varepsilon$ denote the empty symbol. An edit operation is any of the following: $a \to b$, $a \to \varepsilon$, and $\varepsilon \to a$, where $a, b \in A$. We call $a \to b$ a substitution, $a \to \varepsilon$ a deletion, and $\varepsilon \to a$ an insertion. If $a = b$ then $a \to b$ is called an identical substitution; otherwise it is termed a non-identical substitution. A cost function is a function that assigns a non-negative real number to each edit operation. We write $c(a \to b)$, $c(a \to \varepsilon)$, and $c(\varepsilon \to a)$ to denote the cost of the substitution $a \to b$, the deletion $a \to \varepsilon$, and the insertion $\varepsilon \to a$, respectively.

The standard algorithm for computing $d(x, y)$, where $x = x_1 \ldots x_n$ and $y = y_1 \ldots y_m$, is based on dynamic programming [7]. It computes the elements of a two-dimensional edit matrix $D(i, j)$ of dimension $(n + 1) \times (m + 1)$ using the following simple algorithm:
Step 1. $D(0, 0) = 0$.
Step 2. For $j = 1, \ldots, m$: $D(0, j) = D(0, j-1) + c(\varepsilon \to y_j)$.
Step 3. For $i = 1, \ldots, n$: $D(i, 0) = D(i-1, 0) + c(x_i \to \varepsilon)$.
Step 4. For $i = 1, \ldots, n$ and $j = 1, \ldots, m$:
\[
D(i, j) = \min \begin{cases} D(i-1, j-1) + c(x_i \to y_j), \\ D(i-1, j) + c(x_i \to \varepsilon), \\ D(i, j-1) + c(\varepsilon \to y_j). \end{cases}
\]
It has been shown that $d(x, y) = D(n, m)$. Furthermore, the sequence of edit operations that transforms $x$ into $y$ with minimum cost can be recovered from the edit matrix.

In order to illustrate the capabilities of the VCC, we present some results of shape comparison. Figure (9) shows three simple shapes composed of pixels. The chains of these shapes were obtained using the classical and the vertex chain code. The difference code and the shape number of each shape were also obtained. Table 1 presents the classical and the vertex chains of the shapes shown in figure (9). How can these shape numbers (strings) be compared to measure the similarity of the three images? We use the string edit distance. Let $d_c(A, B)$, $d_c(A, C)$, and $d_c(B, C)$ be the string edit distances between the classical shape numbers of images A and B, A and C, and B and C, respectively. Moreover, let $d_v(A, B)$, $d_v(A, C)$, and $d_v(B, C)$ denote the string edit distances between the VCC shape numbers of the same shapes in figure (9).
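Steps 1 to 4 above translate directly into a small dynamic program. The sketch below is illustrative only: the function names are hypothetical, the cost values are those the paper states for its experiments (substitution cost 2 when |a - b| = 2 and 1 otherwise, insertion and deletion cost 1), and the zero cost for an identical substitution is an assumption, since the paper does not state it explicitly. The three strings are the VCC shape numbers of shapes A, B, and C from Table 1.

```python
def substitution_cost(a, b):
    """Cost c(a -> b) for chain symbols given as digit characters."""
    if a == b:
        return 0  # assumption: identical substitution is free
    return 2 if abs(int(a) - int(b)) == 2 else 1


def edit_distance(x, y, sub=substitution_cost, delete=1, insert=1):
    """Weighted string edit distance d(x, y) = D(n, m)."""
    n, m = len(x), len(y)
    D = [[0] * (m + 1) for _ in range(n + 1)]      # Step 1: D(0, 0) = 0
    for j in range(1, m + 1):                      # Step 2: first row, insertions
        D[0][j] = D[0][j - 1] + insert
    for i in range(1, n + 1):                      # Step 3: first column, deletions
        D[i][0] = D[i - 1][0] + delete
    for i in range(1, n + 1):                      # Step 4: fill the matrix
        for j in range(1, m + 1):
            D[i][j] = min(
                D[i - 1][j - 1] + sub(x[i - 1], y[j - 1]),  # substitution
                D[i - 1][j] + delete,                       # deletion of x_i
                D[i][j - 1] + insert,                       # insertion of y_j
            )
    return D[n][m]


vcc_a = "112321213212131133211313"
vcc_b = "1123212132121223211313"
vcc_c = "112321213212122222"
# The paper reports d_v(A, B) = 4, d_v(A, C) = 10, d_v(B, C) = 6.
print(edit_distance(vcc_a, vcc_b), edit_distance(vcc_a, vcc_c), edit_distance(vcc_b, vcc_c))
```

Keeping the costs as parameters makes it easy to try other cost functions without touching the recurrence.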
Now, if we apply the previous algorithm to compute these distances, we obtain
\[
\begin{aligned}
d_c(A, B) &= 10, & d_c(A, C) &= 12, & d_c(B, C) &= 6, \\
d_v(A, B) &= 4, & d_v(A, C) &= 10, & d_v(B, C) &= 6,
\end{aligned}
\tag{1}
\]
where we assume that
\[
c(a \to b) = \begin{cases} 2, & \text{if } |a - b| = 2, \\ 1, & \text{otherwise.} \end{cases}
\tag{2}
\]
The cost of deletion is $c(a \to \varepsilon) = 1$, and the cost of insertion is $c(\varepsilon \to a) = 1$.

Image A is more similar to image B than it is to image C, so the string edit distance between the shape numbers of images A and B should be the smallest of the three. From Eq. (1), we note that the VCC shows that images A and B do form the most similar pair, since $d_v(A, B)$ is the smallest distance. Also, images B and C are more similar than images A and C. The classical chain code, on the other hand, ranks B and C as the closest pair ($d_c(B, C) = 6$) and therefore cannot be used to discriminate between the images according to their visual similarity.

Now we use the string edit distance to solve a practical problem, the recognition of tree leaves. Figure (10a) shows the stored leaf images and figure (10b) presents the test image. We note that image F (the test image) is more similar to image A than it is to images B, C, D, and E.
\[
\begin{aligned}
d_c(A, F) &= 140, & d_c(B, F) &= 996, & d_c(C, F) &= 138, & d_c(D, F) &= 142, & d_c(E, F) &= 162, \\
d_v(A, F) &= 27, & d_v(B, F) &= 996, & d_v(C, F) &= 166, & d_v(D, F) &= 162, & d_v(E, F) &= 165.
\end{aligned}
\tag{3}
\]
The calculated distances for the tree leaves are shown in Eq. (3). Using the VCC, image F is clearly closest to image A ($d_v(A, F) = 27$), whereas with the classical chain code image C (138) is marginally closer than image A (140).

The third example compares the classical and the vertex chain codes using images of airplanes. Figure (11a) shows the stored airplane images and figure (11b) shows a test image. We note that image D (the test image) is more similar to image A than it is to images B and C. Also, the test image is more similar to image B than to image C.
\[
\begin{aligned}
d_c(A, D) &= 30, & d_c(B, D) &= 170, & d_c(C, D) &= 146, \\
d_v(A, D) &= 30, & d_v(B, D) &= 72, & d_v(C, D) &= 98.
\end{aligned}
\tag{4}
\]
The calculated distances for the airplanes are shown in Eq. (4). Using the VCC, we notice that image D is far from images B and C, and that B is closer to D than C is, in agreement with the visual ranking; the classical chain code places C closer than B. As can be seen, the VCC produces more realistic results.

Scaling problem

The chain code is translation invariant but it is not scale invariant. Figure (12) shows how the chain code varies when the shape is scaled. The shape number is not scale invariant either. If each element of the chain code is repeated n times, the object is enlarged k times, where k depends on n. The shape number does not have this property. On the other hand, if the chain code is transformed back from the shape number, the result is the original chain code. That is, if you have several different chain codes of the same object, transforming each of them into a shape number and back yields the same chain.
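The effect of scaling on the chain code can be illustrated with the chains the paper lists for figure (12). The helper names below are hypothetical, and the sketch assumes that enlarging a shape by an integer factor simply repeats every chain element that many times, which matches the two chains given for figures (12a) and (12b).

```python
def scale_chain(chain, factor):
    """Repeat every chain element `factor` times (integer enlargement)."""
    return "".join(ch * factor for ch in chain)


def shape_number(chain, directions=4):
    """Minimal rotation of the circular first difference of the chain."""
    digits = [int(ch) for ch in chain]
    diff = "".join(str((digits[i] - digits[i - 1]) % directions)
                   for i in range(len(digits)))
    return min(diff[i:] + diff[:i] for i in range(len(diff)))


original = "00112323"                 # chain of figure (12a)
scaled = scale_chain(original, 2)     # chain of figure (12b)
print(scaled)
print(shape_number(original), shape_number(scaled))
```

The two shape numbers have different lengths, so they cannot coincide, which is why a separate test is needed to decide whether one chain is a scaled copy of the other.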


Figure 9: Three simple images.

Table 1: The classical and the vertex chains for the shapes in figure (9).

                          A                         B                       C
classical chain code      011001121122323032230303  011001121122333223030   011001121122333333
difference code           103010130101311330113131  1030101301010030113131  103010130101000001
shape number (classical)  010130101311330113131103  0030113131103010130101  000001103010130101
VCC                       112321213212131133211313  1123212132121223211313  112321213212122222
shape number (VCC)        112321213212131133211313  1123212132121223211313  112321213212122222
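The rows of Table 1 can be cross-checked against each other. The sketch below rests on an assumption that is not stated in the paper: the classical chain is read as a counterclockwise walk along cell edges with directions 0, 1, 2, 3 taken counterclockwise, so that a left turn marks a convex corner (one cell at the vertex, VCC element 1), no turn marks a straight edge (two cells, element 2), and a right turn marks a concave corner (three cells, element 3). The function names are hypothetical. Shape B is left out because its printed chain code has 21 symbols while its difference code has 22.

```python
TURN_TO_VCC = {0: "2", 1: "1", 3: "3"}  # straight, left turn, right turn


def vcc_from_chain(chain):
    """Derive VCC elements from a closed 4-direction chain (assumed convention)."""
    digits = [int(ch) for ch in chain]
    vcc = []
    for i in range(len(digits)):
        turn = (digits[i] - digits[i - 1]) % 4
        if turn == 2:
            raise ValueError("chain reverses direction; not a simple contour")
        vcc.append(TURN_TO_VCC[turn])
    return "".join(vcc)


def min_rotation(code):
    """Rotate the digit string until it forms the smallest number."""
    return min(code[i:] + code[:i] for i in range(len(code)))


chain_a = "011001121122323032230303"
chain_c = "011001121122333333"
print(min_rotation(vcc_from_chain(chain_a)))
print(min_rotation(vcc_from_chain(chain_c)))
```

Under this convention the derived strings agree with the VCC shape numbers printed in Table 1 for shapes A and C; with the opposite traversal direction the roles of 1 and 3 would be swapped.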

Figure 10: Tree leaves images: (a) the stored images; (b) test image.


Figure 11: Airplane images: (a) the stored images; (b) test image.

Now the problem can be defined. Let $f(x, y)$ and $g(x, y)$ be two objects without holes. The classical chain code of image $f$ is $c_f$ and the classical chain code of image $g$ is $c_g$. The question is the following: can the binary object $f(x, y)$ be matched to the object $g(x, y)$ under some scaling or a rotation by $k\pi/2$, where k is an integer?

5.1 The $\chi^2$ method

The $\chi^2$ test is a common method to verify whether two given sets of data belong to the same or to different distributions. Compute the original chain codes $c_f$ and $c_g$. Then regard these codes as two random variables and the elements of the codes as samples of these random variables. Denote the random variable associated with $c_f$ by $\xi$ and let the sample be $(\xi_1, \ldots, \xi_m)$ with m elements. Likewise, denote the random variable associated with $c_g$ by $\eta$ and let the sample be $(\eta_1, \ldots, \eta_n)$ with n elements. Examine the homogeneity of these variables with the $\chi^2$ test: if one of the chain codes differs from the other only in scaling, then they can be considered as two samples of one population. The set of values of $\xi$ and $\eta$ is {0, 1, 2, 3}, because these are classical chain codes. Divide the values of $\xi$ and $\eta$ into four sets. Let $v_0$, $v_1$, $v_2$, and $v_3$ be the number of "0", "1", "2", and "3" in $\xi$, and let $u_0$, $u_1$, $u_2$, and $u_3$ be the number of "0", "1", "2", and "3" in $\eta$, respectively. Then the $\chi^2$ is given by
\[
\chi^2 = m \, n \sum_{i=0}^{3} R_i,
\tag{5}
\]
where
\[
R_i = \begin{cases} \dfrac{\left( \dfrac{v_i}{m} - \dfrac{u_i}{n} \right)^2}{v_i + u_i}, & \text{if } v_i + u_i \neq 0, \\ 0, & \text{otherwise.} \end{cases}
\tag{6}
\]
It can be proven that if m and n are large enough and the codes can be matched, then the distribution of the test statistic $\chi^2$ is approximately a $\chi^2$ distribution with 4 degrees of freedom.

Figure 12: Scaling problem: (a) a simple shape; (b) the shape after scaling; (c) the classical chain of the image in (a); (d) the classical chain of the image in (b).

Example: the original chain code of the image in figure (12a) is 0 0 1 1 2 3 2 3 and the original chain code of the image in figure (12b) is 0 0 0 0 1 1 1 1 2 2 3 3 2 2 3 3. Now, let $\xi = (0, 0, 1, 1, 2, 3, 2, 3)$ and $\eta = (0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 2, 2, 3, 3)$. Then $v_0 = v_1 = v_2 = v_3 = 2$, $u_0 = u_1 = u_2 = u_3 = 4$, and $R_0 = R_1 = R_2 = R_3 = 0$. So the $\chi^2$ is
\[
\chi^2 = 0.
\tag{7}
\]
From Eq. (7), we note that the value is zero, i.e., figure (12b) is a scaled copy of figure (12a). This example shows that the $\chi^2$ test can be used to solve the scaling problem. We can also compute the $\chi^2$ between any two VCCs, where the $\chi^2$ is given by
\[
\chi^2 = m \, n \sum_{i=1}^{3} R_i
\tag{8}
\]
and the other notations ($R_i$, $v_i$, and $u_i$) are the same. The VCC of figure (12a) is 1 1 2 1 2 1 1 3, and the VCC of figure (12b) is 1 2 1 2 2 2 1 2 2 2 1 2 1 2 3 2. If we compute the $\chi^2$, we obtain
\[
\chi^2 = 3,
\tag{9}
\]
where $v_1 = 5$, $v_2 = 2$, $v_3 = 1$, $u_1 = 5$, $u_2 = 10$, $u_3 = 1$, and $R_1 = 0.0097656$, $R_2 = 0.011719$, $R_3 = 0.0019531$. From Eq. (9), we note that the $\chi^2$ of the VCCs of figure (12) is large compared with the value of zero obtained for the classical chain codes, which means that the classical chain code gives better results in the scaling problem.

Figure 13: $\chi^2$ of the fig leaf: (a) the fig leaf; (b) the fig leaf after scaling.

Figure (13) shows a practical example of the scaling problem, using a fig leaf. The $\chi^2$ between the classical chain codes of the fig leaf is
\[
\chi^2 = 0.033715,
\tag{10}
\]
where $m = 431$, $n = 1008$, $v_0 = 121$, $v_1 = 94$, $v_2 = 121$, $v_3 = 94$, $u_0 = 281$, $u_1 = 223$, $u_2 = 218$, $u_3 = 223$, and $R_0 = 1.7147 \times 10^{-8}$, $R_1 = 2.1745 \times 10^{-8}$, $R_2 = 1.7147 \times 10^{-8}$, $R_3 = 2.1745 \times 10^{-8}$. But the $\chi^2$ between the VCCs is
\[
\chi^2 = 124.04,
\tag{11}
\]
where $m = 431$, $n = 1008$, $v_1 = 125$, $v_2 = 185$, $v_3 = 121$, $u_1 = 135$, $u_2 = 742$, $u_3 = 131$, and $R_1 = 9.3714 \times 10^{-5}$, $R_2 = 10.159 \times 10^{-5}$, $R_3 = 9.0219 \times 10^{-5}$. From Eqs. (10) and (11), we note that the classical chain code gives better results than the VCC.
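The statistic of Eqs. (5), (6), and (8) is straightforward to compute from symbol counts. The sketch below is illustrative: the function name `chi_square` and its return format are hypothetical, and the formula for $R_i$ is the one reconstructed from the worked values in the text, $R_i = (v_i/m - u_i/n)^2 / (v_i + u_i)$ with $R_i = 0$ when $v_i + u_i = 0$. The inputs are the classical chains and the VCCs the paper gives for figure (12).

```python
from collections import Counter


def chi_square(xi, eta, symbols):
    """Two-sample chi-square statistic over the given chain symbols.

    Returns (chi2, terms) where terms maps each symbol to its R_i.
    """
    m, n = len(xi), len(eta)
    v, u = Counter(xi), Counter(eta)
    terms = {}
    for s in symbols:
        if v[s] + u[s] == 0:
            terms[s] = 0.0  # symbol absent from both chains
        else:
            terms[s] = (v[s] / m - u[s] / n) ** 2 / (v[s] + u[s])
    return m * n * sum(terms.values()), terms


# Classical chain codes of figures (12a) and (12b): symbols 0..3.
chi2_classical, _ = chi_square("00112323", "0000111122332233", "0123")
# Vertex chain codes of the same two shapes: symbols 1..3.
chi2_vcc, r_vcc = chi_square("11212113", "1212221222121232", "123")

print(chi2_classical)  # the paper reports 0 for the classical chains
print(chi2_vcc)        # the paper reports 3 for the VCCs
print(r_vcc)
```

Since only the relative frequencies of the symbols enter the statistic, any chain obtained by repeating every element the same number of times gives a value of zero against the original.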

Conclusion


In this paper, a new method for extracting the contour of binary images is presented. The new method can be applied to any binary image, with or without holes. The extracted contour is used to derive the chain code of the image. The classical (Freeman) chain code and a new chain code for shapes composed of a finite number of cells are defined. The definition of the new chain code (termed the vertex chain code, VCC) is valid for shapes composed of triangular, rectangular, and hexagonal cells. The VCC preserves information and allows considerable data reduction. This chain code is invariant under translation and rotation and, optionally, under starting point and mirroring transformation. To illustrate the capabilities of the VCC, we present examples of shape comparison and image recognition for binary images of tree leaves and airplanes, using the string edit distance to measure the similarity between the chain codes. The results demonstrate that the VCC recognizes the shapes better than the classical chain code. Finally, since the chain code is not scale invariant, we use the $\chi^2$ method to address the scaling problem. In the scaling problem the classical chain code gives better results.

References
[1] E. Bribiesca. A new chain code. Pattern Recognition, 32:235-251, 1999.
[2] H. Bunke and M. Zumbühl. Acquisition of 2D shape models from scenes with overlapping objects using string matching. Pattern Analysis and Applications, 2:2-9, 1999.
[3] H. Bunke and U. Bühler. Applications of approximate string matching to 2D shape recognition. Pattern Recognition, 26:1797-1812, 1993.
[4] H. Freeman. On the encoding of arbitrary geometric configurations. IRE Trans. Electron. Comput., EC-10:260-268, 1961.
[5] H. Freeman. Computer processing of line drawing images. ACM Comput. Surveys, 6:57-97, 1974.
[6] J. Llados, H. Bunke, and E. Marti. Finding rotational symmetries by cyclic string matching. Pattern Recognition Letters, 18:1435-1442, 1997.
[7] R. Wagner and M. Fischer. The string-to-string correction problem. Journal of the ACM, 21:168-173, 1974.
[8] R.C. Gonzalez and R.E. Woods. Digital image processing, second edition. Addison-Wesley, 2002.
[9] R.O. Duda and P.E. Hart. Pattern classification and scene analysis. Wiley, New York, 1973.
[10] S. Papert. Uses of technology to enhance education. Technical Report 298, AI Lab, MIT, 1973.
[11] S.W. Chen, S.T. Tung, C.Y. Fang, S. Cherng, and A. Jain. Extended attributed string matching for shape recognition. Computer Vision and Image Understanding, 70:36-50, 1998.
