
Principles of Intelligent Character Recognition

Copyright 2001 Top Image Systems Ltd. All rights reserved.

Table of Contents
What is Intelligent Character Recognition?
The Statistical Approach
The Semantic Approach
Hybrid Methods and Voting Algorithms
About Top Image Systems

What is Intelligent Character Recognition?

The term Intelligent Character Recognition (ICR) encompasses various technologies aimed at the analysis and recognition of handwritten characters from electronic images. The problem can be stated as follows: given a digitized image, how should an ICR algorithm analyze its content, recognize the identity of any character(s) contained in the image, and return this information? For these purposes, image content could consist of an alpha character (a, b, c...), a numeric character (0, 1, 2...), a special character ($, %, &...), or a reject, meaning that the algorithm was unable to identify the image as a particular character.

A digitized image is, after all, just a collection of numbers. For binary images, every point or pixel is assigned a value of either 0 or 1; for gray-level images, pixel values range from 0 to 255; and for color images, pixel values usually consist of three numbers, each in the range of 0 to 255. While the ensuing discussion is valid for any type of image, for the sake of simplicity only binary images will be addressed.

Recognition technologies may be classified as statistical, semantic and hybrid. In the following sections, these methodologies are reviewed and their advantages and weaknesses compared. Throughout the discussion, only handwritten numeric characters (digits) are considered, so the character recognition algorithms return only the values 0, 1, 2... 9 or reject.

The Statistical Approach

Since every electronic image of a digit consists of pixel values that are represented by a spatial configuration of 0s and 1s, a statistical approach to character recognition looks for a typical spatial distribution of the pixel values that characterizes each digit. In general, one is searching for the statistical characteristics of the various digits. These characteristics could be very simple, like the ratio of black pixels to white pixels, or more complex, like higher-order statistical parameters such as the third moments of the image. For example, an image of the digit 1 will typically have fewer black pixels than an image of the digit 8. In the following illustration, there are almost twice as many black pixels in the 8 image as in the 1, though both are drawn to the same scale.
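The black-to-white pixel ratio mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular engine; the function name and the two small 7x5 bitmaps are made up for the example, and black pixels are assumed to be stored as 1.

```python
def black_to_white_ratio(image):
    """Return the ratio of black (1) pixels to white (0) pixels.

    `image` is a list of rows, each row a list of 0/1 pixel values.
    """
    black = sum(sum(row) for row in image)
    total = sum(len(row) for row in image)
    white = total - black
    if white == 0:
        raise ValueError("image has no white pixels")
    return black / white


# Hypothetical 7x5 bitmaps, drawn to the same scale.
ONE = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
]
EIGHT = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

print(black_to_white_ratio(ONE))    # 10 black / 25 white = 0.4
print(black_to_white_ratio(EIGHT))  # 17 black / 18 white, roughly 0.94
```

In these toy bitmaps the 8 has 17 black pixels against 10 for the 1, which is in line with the "almost twice as many" observation.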

Continuing the same approach, cursory analysis shows that the ratio of height to width for the digit 0 is less than the same ratio for the digit 6:
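A height-to-width ratio can be computed from the bounding box of the black pixels. The sketch below assumes black pixels are stored as 1 and that blank margins should be ignored; the function name and the sample bitmap are illustrative only.

```python
def height_to_width_ratio(image):
    """Return height / width of the bounding box around all black pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("image contains no black pixels")
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return height / width


# Hypothetical bitmap: a 4-high, 2-wide bar surrounded by a blank margin.
BAR = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

print(height_to_width_ratio(BAR))  # 2.0
```

Using the bounding box rather than the full image size keeps the ratio independent of how much empty space surrounds the digit.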

More advanced algorithms are usually based on the one-dimensional histograms that can be extracted from digitized images. Such an approach produces histograms that show the number of black pixels in each row and in each column. By projecting the black pixel count horizontally and vertically, it is possible to differentiate between many typical cases of digits. Such a projection is demonstrated in the following figure.
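The horizontal and vertical projections described above amount to counting black pixels per row and per column. A minimal Python sketch, assuming black pixels are stored as 1 (the function name and the sample bitmap are hypothetical):

```python
def projection_histograms(image):
    """Return (row_counts, column_counts) of black pixels in a binary image."""
    row_counts = [sum(row) for row in image]
    column_counts = [sum(column) for column in zip(*image)]
    return row_counts, column_counts


# Hypothetical 5x4 bitmap resembling the digit 7.
SEVEN = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

rows, columns = projection_histograms(SEVEN)
print(rows)     # [4, 1, 1, 1, 1]
print(columns)  # [1, 3, 2, 2]
```

The heavy first row followed by thin rows is the kind of profile that separates a 7 from, say, a 1, whose row counts stay nearly constant.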

In short, by careful analysis of the histograms of various digits, it is possible to differentiate between them. Thus, the general flow of statistics-based character recognition algorithms is as follows:

1. Compute the relevant statistics for a digitized image.
2. Compare the statistics to those from a predefined database.
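The compute-then-compare flow can be sketched as a nearest-match lookup with a reject option. Everything here is an assumption made for illustration: the feature vector (row and column projections), the Manhattan distance, the `max_distance` threshold, and the two-entry database are not taken from any specific engine, and all images are assumed to have the same dimensions.

```python
def features(image):
    """Feature vector: black-pixel counts per row followed by counts per column."""
    rows = [sum(row) for row in image]
    columns = [sum(column) for column in zip(*image)]
    return rows + columns


def classify(image, database, max_distance):
    """Return the label of the closest reference vector, or "reject"."""
    vector = features(image)
    best_label, best_distance = None, None
    for label, reference in database.items():
        # Manhattan distance between the two feature vectors.
        distance = sum(abs(a - b) for a, b in zip(vector, reference))
        if best_distance is None or distance < best_distance:
            best_label, best_distance = label, distance
    if best_distance is None or best_distance > max_distance:
        return "reject"
    return best_label


# Hypothetical 5x4 reference bitmaps for the predefined database.
ONE_REF = [[0, 0, 1, 0], [0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]
SEVEN_REF = [[1, 1, 1, 1], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [0, 1, 0, 0]]
DATABASE = {"1": features(ONE_REF), "7": features(SEVEN_REF)}

plain_one = [[0, 0, 1, 0]] * 5   # a 1 written without the top serif
solid_block = [[1, 1, 1, 1]] * 5  # not a digit at all

print(classify(plain_one, DATABASE, max_distance=4))    # 1
print(classify(solid_block, DATABASE, max_distance=4))  # reject
```

The threshold is what produces the reject outcome: an input that is far from every reference is reported as unrecognized rather than forced onto the nearest digit.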

In general, most statistical methods of character recognition work well for digits that do not vary much from an ideal or predefined digit. Unfortunately, in reality, handwritten images demonstrate a large variance. Thus, some additional approaches are required to solve the character recognition problem.

The Semantic Approach

Digitized images of handwritten characters indeed consist of pixels. However, a fact that most statistical methods ignore is that the pixels also form lines and contours. This is the essential point of the semantic approaches to character recognition: first recognize the way in which the contours of the digits are reflected in the pixels that represent them and then try to find typical characteristics or relationships for each digit. As is seen in the following examples, this is also the main advantage of semantic methods versus statistical ones. Consider the following case:

The steps of a semantic-based classifier for character recognition are as follows:

1. Find the starting point of a contour.
2. Start tracing the contour.
3. Identify the characteristics of the contour while tracing it: up, down, diagonal up, arc, loop, etc.
4. Search the database for a description similar to the one obtained. Technically, this would be executed by representing the descriptions as a logic tree (graph) and then matching the graph against the graphs contained in the database.

The following illustration consists of several exemplars of the digit 2. Though the images of the digits exhibit substantial differences on the basis of a pixel-by-pixel comparison, the semantic description of the two leftmost 2s, for example, is identical.
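The describe-and-look-up steps above can be sketched in Python under some strong simplifying assumptions: the contour has already been traced into an ordered list of (x, y) points, with y growing downward; only straight-line characteristics are labeled (no arcs or loops); and the database lookup is an exact match on the label sequence rather than the graph matching the text describes. All names and database entries are hypothetical.

```python
def direction(dx, dy):
    """Label one step of the traced contour (image coordinates, y grows downward)."""
    if dx == 0 and dy == 0:
        return None
    if dy == 0:
        return "right" if dx > 0 else "left"
    if dx == 0:
        return "down" if dy > 0 else "up"
    return "diagonal down" if dy > 0 else "diagonal up"


def describe(points):
    """Turn an ordered contour into a sequence of labels, merging repeated labels."""
    description = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        label = direction(x1 - x0, y1 - y0)
        if label and (not description or description[-1] != label):
            description.append(label)
    return tuple(description)


def classify(points, database):
    """Exact lookup of the description; anything unknown is a reject."""
    return database.get(describe(points), "reject")


# Hypothetical database of semantic descriptions.
DATABASE = {
    ("down",): "1",
    ("right", "diagonal down"): "7",
}

small_seven = [(0, 0), (1, 0), (2, 0), (1, 1), (0, 2)]
large_seven = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (3, 1), (2, 2), (1, 3), (0, 4)]

print(describe(small_seven))            # ('right', 'diagonal down')
print(classify(large_seven, DATABASE))  # 7
print(classify([(0, 1), (0, 0)], DATABASE))  # reject
```

The two 7s differ in size and in every pixel position, yet they reduce to the same description, which is the property the exemplars of the digit 2 are meant to show.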

Since the number of distinct descriptions of each digit, written in the manner described and demonstrated above, is not excessive, it is possible to prepare a database that includes several hundred descriptions and encompasses the vast majority of possible cases. It is clear, thus, that the main problem of statistical methods, specifically the large variance in shapes and sizes of handwritten data, can be effectively overcome by semantic methods. Nevertheless, the main problem with semantic methods is their reliance on correct extraction of character contours from the electronic images. For instance, if a character image is broken, a semantic method may fail to trace the character's contour correctly. A statistical approach, on the other hand, could still capture the statistics of the broken character image with sufficient accuracy to enable correct identification.

Hybrid Methods and Voting Algorithms

It is clear that statistical and semantic approaches to character recognition each have specific advantages and disadvantages. The obvious question is whether it is possible to combine the best of both methods. The answer is yes, to a certain extent: it is possible to develop algorithms that are part statistical and part semantic in an effort to leverage the advantages of both. Top Image Systems' proprietary T.i.S. ICR/OCR reflects such a hybrid approach and, in many cases, overcomes the problems associated with the statistical and semantic methods when utilized independently.

There is another step that can be taken, given extant technologies and methodologies, to obtain the best of all available recognition algorithms. Today, there are a substantial number of good ICR recognition engines available. Each of these engines has its own specific strengths and weaknesses: on one type of image or document an engine performs better than its peers, and on another, worse. Recognizing this flaw common to all ICR engines, T.i.S. analyses the relative strengths and weaknesses of the different recognition engines. This knowledge allows for the creation of unique voting algorithms that draw on the strengths of the various engines to optimize recognition results.

[Figure: voting example]
OCR Type A: *7521
OCR Type B: 97*21
OCR Type C: 9*5*1
VOTING: 97521
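The voting example above can be reproduced with a simple per-character majority vote. This is only a sketch of the general idea, not the T.i.S. voting algorithm, which the text says takes engine strengths into account: here every engine has equal weight, `*` is assumed to mark a rejected character, and ties are treated as rejects. The function name is hypothetical.

```python
from collections import Counter

REJECT = "*"  # assumed marker for a character an engine could not recognize


def vote(readings):
    """Combine equal-length readings by unweighted majority vote per position."""
    if len({len(reading) for reading in readings}) != 1:
        raise ValueError("need at least one reading, all of the same length")
    result = []
    for characters in zip(*readings):
        counts = Counter(c for c in characters if c != REJECT)
        if not counts:
            result.append(REJECT)  # every engine rejected this position
            continue
        (top, top_count), *others = counts.most_common(2)
        if others and others[0][1] == top_count:
            result.append(REJECT)  # tie between two candidates
        else:
            result.append(top)
    return "".join(result)


print(vote(["*7521", "97*21", "9*5*1"]))  # 97521
```

A weighted variant would replace the plain counts with per-engine confidence scores, so that an engine known to be strong on a given form type outvotes weaker ones.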

Specifically, within AFPS Pro, T.i.S.'s proprietary automated form processing software, numerous combinations of recognition engines may be selected and subsequently defined as a specific Recognition Class. By adjusting the various parameters that are controlled via AFPS Pro's advanced yet simple-to-use graphic user interface, a customized set of guidelines and voting algorithms can be defined easily for each new form or image type, optimizing recognition results.

T.i.S. is committed to maintaining its position as a world leader in forms processing and in image pattern recognition technologies. A major R&D effort in recognition optimization is now underway at T.i.S. and is expected to yield significant improvements in real-world recognition applications in the near future.

About Top Image Systems


Top Image Systems is a leading innovator of enterprise solutions for managing and validating the flow of information between an enterprise and its customers and employees. Whether originating from mobile, electronic, paper or other sources, TiS solutions deliver the content to applications that drive the organization. TiS' eFLOW Unified Content Platform is a common platform for the Company's products: Integra, Freedom and Mobili.

Our mission is to enable enterprises to integrate external information, improving business processes, in order to profitably deliver products and services to customers and employees. Top Image Systems provides enterprises with an integrated software platform that enables the collaborative processes between the enterprise and its customers, business partners and employees. The collaborative process is fueled by information, content and data entering the enterprise, which must be validated, processed and delivered to the right business applications in order to provide the products and services that make companies profitable.

TiS markets its products in more than 30 countries through a multi-tier network of distributors, system integrators and value added resellers, as well as strategic partners. Top Image Systems Ltd. is a publicly traded company on the NASDAQ exchange under the symbol TISA.
