Lisa Wong, Waleed Abdulla, Stephan Hussmann
Dept. of Electrical & Computer Engineering, School of Engineering, The University of Auckland
Private Bag 92019, Auckland 1, New Zealand
lwon051@ec.auckland.ac.nz, w.abdulla@auckland.ac.nz, s.hussmann@auckland.ac.nz
Abstract
Braille is a tactile format of written communication used by sight-impaired people worldwide. This paper proposes a software solution prototype to optically recognise single-sided embossed Braille documents using a simple image processing algorithm and a probabilistic neural network. The output is a Braille text file formatted to preserve the layout of the original document, which can be sent to an electronic embosser for reproduction. Preliminary experiments have shown an excellent recognition rate, with transcription accuracy reaching 99%.
1. Introduction
The Braille format used by sight-impaired people worldwide today was invented by Louis Braille in 1829. It uses a character set made up of different combinations of raised dots in a 3-by-2 arrangement (3 rows and 2 columns) to represent different characters or sequences of characters [1]. In Grade II Standard English Braille, each Braille character can represent a single character, a string of characters, or a common word, depending on the context. Because the writing is tactile, only a small area can be read at a time. Other information such as font size and shape is also impossible to convey due to the nature of the format. This makes the formatting of the page very important, so that the reader can easily locate information such as the page number and chapter name while reading.

To reproduce a Braille document, especially an older document that is starting to wear out, the current common method is to melt a piece of thermal-sensitive material onto the document to be reproduced. While this method produces an accurate replica of the original document, the quality and resolution of the Braille dots degrade as a result of the process. A more effective approach would be to automatically convert the tactile Braille document into a computer file, which can then
be sent to an electronic embosser. To perform such a conversion, the Braille characters on the document need to be recognised.

Attempts have been made to optically recognise embossed Braille using various methods. In 1988, Dubus and his team designed an algorithm called Lectobraille which translates relief Braille into an equivalent printed version on paper [3]. Since then, research has built on image processing techniques towards the goal of Braille-to-text translation. In 1993, Mennens and his team designed an optical recognition system which recognised Braille writing scanned with a commercially available scanner [4]. The results were satisfactory for reasonably well-formed Braille embossing; however, the system cannot handle deformation in the dot grid alignment. In 1999, Ng and his team approached the problem using boundary detection techniques to translate Braille into English or Chinese [8]. The recognition rates were good; however, no mention was made of grid-deformed input, nor of the system's efficiency. In 2001, Murray and Dias designed a handheld device which handles the scanning as well as the translation [6, 7]. Since the user controls the scanning orientation, and only a small segment is scanned at each instance, grid deformation is not a major concern, and a simpler algorithm was used to yield efficient, real-time translation of Braille characters. In 2003, Morgavi and Morando published a paper describing a hybrid system that uses a neural network to solve the recognition problem [5]. Their paper also provides a means of measuring accuracy in Braille recognition, and the results show that the system can handle a larger degree of image degradation than algorithms using more conventional and rigid image processing techniques.

While most of the systems developed so far have concentrated on translation, the formatting of the document is critical when reproduction is involved.
In the system presented by Mennens and his team [4], the focus is on transcribing the scanned image into a computer file rather than translating it. However, no mention was made of how accurately the formatting was conserved.
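As an aside on the 3-by-2 cell described above: the six dot positions yield 2^6 − 1 = 63 non-empty patterns. A minimal sketch (not part of the proposed system) of mapping a set of raised dots, using the standard numbering of dots 1–3 down the left column and 4–6 down the right, to its Unicode Braille pattern character:

```python
def dots_to_unicode(dots):
    """Map a set of raised-dot numbers (1-6, standard Braille numbering)
    to the corresponding Unicode Braille pattern character.
    Dot n sets bit n-1 of an offset from U+2800."""
    offset = 0
    for d in dots:
        if not 1 <= d <= 6:
            raise ValueError(f"invalid dot number: {d}")
        offset |= 1 << (d - 1)
    return chr(0x2800 + offset)

# Dots 1, 3, 4 and 5 form the letter 'n' in English Braille.
print(dots_to_unicode({1, 3, 4, 5}))  # -> ⠝ (U+281D)
```

The empty set maps to the blank cell U+2800, giving 64 patterns in total, 63 of them non-empty.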
[Figure: the half-character detection and half-character recognition stages. Diagram not recoverable from the source.]
of processing required. The row of pixels is converted into a two-tone image to distinguish the pixels that are likely to be part of the shadows of the dots from the others. When a row of pixels is found that contains one or more characters, it is buffered until all the pixels from that row of characters are stored in the buffer. The pseudocode for this algorithm is presented in the appendix. At this stage, the columns of the buffered image are processed in a similar manner to determine the positions of the half-characters. The result of this process is a set of two-tone images and the positions of the characters. Fig. 2 and 3 show a sample of the original image and the area detected from it. Since the image is handled one row at a time, rather than as a 2D array of pixels, the computational cost is much lower. Compared with the conventional method of using a mask to determine the positions of the dots [5], thresholding and buffering do not give the exact locations of the dots, but they can find the area where processing is needed to determine which character is present.
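The row-by-row thresholding and buffering described above can be sketched as follows. This is a minimal illustration assuming a grayscale image stored as a NumPy array; the threshold and gap values are illustrative, not taken from the paper:

```python
import numpy as np

def detect_character_rows(image, threshold=0.35, min_gap=3):
    """Scan the image one pixel row at a time. Rows whose thresholded
    (two-tone) version contains dot-shadow pixels are accumulated into
    a band; a run of blank rows of length min_gap closes the band.
    Returns the (start, end) row bands and the two-tone image."""
    binary = (image < threshold).astype(np.uint8)  # dark shadows -> 1
    bands, start, gap = [], None, 0
    for y in range(binary.shape[0]):
        if binary[y].any():           # row touches at least one dot shadow
            if start is None:
                start = y
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:        # blank gap: character row complete
                bands.append((start, y - gap + 1))
                start, gap = None, 0
    if start is not None:             # band still open at the bottom edge
        bands.append((start, binary.shape[0]))
    return bands, binary
```

The same routine, applied to the columns of a buffered band, would locate the half-characters within it.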
This paper presents a method of Braille recognition that aims to preserve the formatting of the Braille page while maintaining the efficiency and accuracy of the recognition algorithm. Section 2 presents the proposed system and the algorithms involved. Section 3 presents sets of test results, which are discussed in Section 4.
as one of seven possible arrangements. The seven arrangements come from the three possible dot positions in each half-character, not counting the empty half-character. This processing is done using a probabilistic neural network, which has a hidden layer of seven radial basis function (RBF) neurons and a competitive learning network at the output layer [2]. One hidden neuron is used for each possible combination; since there are only seven, the computational cost is not excessive. The output of each neuron in the hidden layer is an indication of the Euclidean distance between the input vector and the half-character template that the RBF neuron represents. The output layer is a competitive learning layer: it gathers the outputs from the hidden layer to determine the class to which the half-character, represented by the input vector, belongs.
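A minimal sketch of such a classifier, assuming the input has already been reduced to a three-element vector of dot confidences (this input representation and the spread value are assumptions for illustration, not details from the paper):

```python
import numpy as np

# The seven non-empty half-character templates: each marks which of
# the three dot positions in the half-character is raised.
TEMPLATES = np.array([
    [0, 0, 1], [0, 1, 0], [0, 1, 1],
    [1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1],
], dtype=float)

def classify_half_character(x, spread=0.5):
    """Each hidden RBF neuron responds according to the Euclidean
    distance between the input vector and its template; the
    competitive output layer picks the strongest activation."""
    dists = np.linalg.norm(TEMPLATES - x, axis=1)   # hidden-layer distances
    activations = np.exp(-(dists / spread) ** 2)    # radial basis responses
    return int(np.argmax(activations))              # winner-take-all class

noisy = np.array([0.9, 0.1, 0.8])   # noisy observation of pattern [1, 0, 1]
print(classify_half_character(noisy))  # -> 4, the index of [1, 0, 1]
```

Because the radial basis response decreases monotonically with distance, the winner is always the template nearest the input in the Euclidean sense.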
(Canon CanoScan FB320P) at a resolution of 300 dpi. Sections of the images were recognised using the proposed method. The processing was performed on a PC with an AMD 2000+ CPU and 256 MB of RAM, using a Matlab implementation of the algorithm. The CPU time taken to process a single-line-spaced page averaged around 32.6 s. The half-character detection rate for the experiments was 100%, while the half-character recognition rate was 99.5% (2 errors out of 372 half-characters). The grids of the documents were determined correctly in all of the experiments.
4. Discussion
The results of the experiments have been very promising. Because the recognition algorithm does not involve convolution with a mask of any kind, the time required to pre-process the image before the recognition phase is shortened. The characters are processed as half-characters, which means there are fewer possible classes in the classification phase than if whole characters were recognised, which would involve 63 possible classes. This simplifies the processing needed for the recognition stage.

The half-character recognition rate has room for improvement. One way to improve it is to implement a syntactic spell-check algorithm: a misrecognised symbol can be detected because the word and sentence to which the character belongs would no longer make sense. This could also be extended to translate the document into plain English.

It is worth noting that all the samples in the experiments were reasonably well formed and computer embossed. A possible future enhancement of this project would be to use a wider range of training data. This may include adding algorithms to dynamically adjust various thresholds for different parts of the system, to account for variations such as the colour of the paper. Since there are strict rules about the size of the dots and the spacing between them, little or no adjustment should be needed for the half-character recognition itself.
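The class-count argument above can be made concrete. With 0 denoting the empty half-character and 1–7 the seven dot combinations encoded as 3-bit numbers (an illustrative encoding, not necessarily the paper's), two recognised half-character classes combine into one of the 63 non-empty Braille cells:

```python
def combine_halves(left, right):
    """Combine two half-character classes (0 = empty, 1-7 = the seven
    dot combinations as 3-bit numbers) into a full 6-bit Braille cell
    code. Codes 1-63 are the 63 non-empty cells."""
    if not (0 <= left <= 7 and 0 <= right <= 7):
        raise ValueError("half-character class must be 0-7")
    return left | (right << 3)

# Left half dots encoded as binary 101 (= 5) with an empty right half:
assert combine_halves(5, 0) == 5
# All six dots raised gives the highest cell code:
assert combine_halves(7, 7) == 63
```

Each half thus needs only 7 RBF neurons (plus the trivially detected empty case), rather than one neuron per full 63-class cell.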
5. Conclusions
Figure 4. Transcript for the sample shown in Figure 2

This paper proposed an algorithm to optically recognise and transcribe embossed Braille documents into a computer file. The algorithm processes the image one row at a time and uses thresholding to pre-process it. The processed sections that contain the Braille characters are then recognised using a probabilistic neural network. The results were promising in experiments performed on single-sided computer-embossed documents, with more than 99% accuracy achieved.
3. Results
Samples of single-sided embossed Braille documents were scanned using a commercially available scanner
6. Acknowledgments
The authors would like to thank Ms Janet Reynolds from the New Zealand Royal Foundation for the Blind for providing embossed Braille document samples for the development of this project, as well as providing information about Braille reproduction and valuable suggestions at the early stages of this project.
References
[1] English Braille: American Edition 1994. Braille Authority of North America, 1994. [Online]. Available: http://www.brl.org/ebae/
[2] Radial basis networks (Neural Network Toolbox): Probabilistic neural networks. The MathWorks, Inc., © 1994–2003. [Online]. Available: http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/radial10.shtml#2412
[3] J. Dubus, M. Benjelloun, V. Devlaminck, F. Wauquier, and P. Altmayer. Image processing techniques to perform an autonomous system to translate relief Braille into black-ink, called: Lectobraille. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 1584–1585, 1988.
[4] J. Mennens, L. V. Tichelen, G. Francois, and J. Engelen. Optical recognition of Braille writing using standard equipment. IEEE Transactions on Rehabilitation Engineering, 2(4):207–212, December 1994.
[5] G. Morgavi and M. Morando. A neural network hybrid model for an optical Braille recognitor. In International Conference on Signal, Speech and Image Processing 2002 (ICOSSIP 2002), 2002. CD-ROM.
[6] I. Murray and T. Dias. A portable device for optically recognizing Braille - Part I: hardware development. In The Seventh Australian and New Zealand Intelligent Information Systems Conference 2001, pages 129–134, 2001.
[7] I. Murray and T. Dias. A portable device for optically recognizing Braille - Part II: software development. In The Seventh Australian and New Zealand Intelligent Information Systems Conference 2001, pages 141–146, 2001.
[8] C. Ng, V. Ng, and Y. Lau. Regular feature extraction for recognition of Braille. In Third International Conference on Computational Intelligence and Multimedia Applications (ICCIMA '99), Proceedings, pages 302–306, 1999.