Optical Character Recognition

The Final Presentation
4/11/12
Curtain Raiser
OCR is a software tool of great utility.
It lets the user create a meaningful text document from an image without actually typing it.
It provides functions for better image processing.
It allows the user to edit the recognized result as required.
Idea Behind Developing the Tool

In our country, where technology has taken a long time to reach ordinary homes, records have always been maintained on paper.
With the advancement of technology, a need has arisen for a software tool that can ease the tedious task of converting written documents into electronic or digital format.
The applications of this technique range from document digitizing and preservation to handwritten text recognition.
Practical applications
One can write a document in a place where no computer is available and rest assured that it can be digitized without any further investment of time.
All the paperwork in offices could easily be converted to digital format without a manual operator typing the whole text.
Existing software system?

The existing software systems rely on older methods of artificial intelligence and are inefficient.
Many of the tools do not allow the user to give more than a single character as input, which prevents the user from recognizing running text.
Some hardware-dependent systems are dedicated to the task of Optical Character Recognition. They require the user to buy the complete hardware.
We present a better tool!!!

The tool we bring is far more capable software that lets the user carry out the mammoth task of digitizing a paper-based document in a friendly and easy manner.
The tool also has a dedicated module for handwriting recognition, which makes it even more desirable when users want a digitized document of their own handwritten text.
Construction Process
1. Information Gathering
2. Technology used
Information gathering
For this project, the information we collected came mainly from the internet.
We searched the internet for existing software.
We discussed the project scope with regular users of word processors and with people involved in programming, especially in the field of artificial neural networks.
We discussed the problems we would encounter while developing the project.
Technologies used
We developed the tool mainly using:
Core Java concepts and Java Swing.
NetBeans as the IDE for Java programming.
MySQL for back-end database connectivity.
System Analysis
Information flow representation

1. Use case diagram
2. Activity diagram
3. Class diagram
4. Sequence diagram
Use case diagram
Activity diagram
Activity diagram for handwritten users
Class diagram
Sequence diagram
Sequence diagram (continued)
Architecture Design
1. Architecture Behavioral Diagram
2. Modular Approach
3. Algorithm design for operation
Architectural behavior of the software
Description of the diagram

First, the image is loaded in the initial module, from where it reaches the pixel extractor module in its original form, i.e. in image format.
The pixel extractor module then produces the equivalent array form of the image pixels. The image is converted into a grey-scaled version of the input image, and a corresponding array of the grey-scaled image is built.
This array is passed to the segmentation module, where the system evaluates each pixel of the input image and separates the pixels forming the text from the background.
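The pixel extraction and text/background separation described above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the grey level is taken as a plain average of the red, green and blue channels, and the text pixels are picked with a single fixed threshold. The presentation does not state which formula or threshold the tool actually uses.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch, not the tool's actual code.
final class PixelExtractorSketch {

    /** Converts an image into a 2D array of grey levels (0 = black, 255 = white). */
    static int[][] toGreyArray(BufferedImage image) {
        int height = image.getHeight();
        int width = image.getWidth();
        int[][] grey = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = image.getRGB(x, y);   // packed ARGB value
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                grey[y][x] = (r + g + b) / 3;   // assumption: simple channel average
            }
        }
        return grey;
    }

    /** Marks each pixel as text (true) or background (false) using a fixed threshold. */
    static boolean[][] separateText(int[][] grey, int threshold) {
        boolean[][] isText = new boolean[grey.length][];
        for (int y = 0; y < grey.length; y++) {
            isText[y] = new boolean[grey[y].length];
            for (int x = 0; x < grey[y].length; x++) {
                // assumption: dark ink on a light page, so darker pixels are text
                isText[y][x] = grey[y][x] < threshold;
            }
        }
        return isText;
    }
}
```

A threshold of 128 is a common starting point for dark text on a light page; an unevenly lit scan would probably need an adaptive threshold instead.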
Continued
Each of the separate arrays so formed is fed to the neural network, where each pixel value forms an input node and the output nodes are those obtained from the database.
The SOM then identifies the character and selects the winning neuron from the output nodes.
This output neuron, called the winning neuron, signifies a character and is sent to the text editor.
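Selecting the winning neuron can be sketched as below. The sketch assumes that each output node stores a weight vector of the same length as the input and that the winner is the node with the smallest squared Euclidean distance to the input, which is the usual choice for a self-organizing map. The presentation does not confirm the distance measure, and the names used here are hypothetical.

```java
// Hypothetical sketch of winner selection in a self-organizing map.
final class WinningNeuronSketch {

    /**
     * Returns the index of the output node whose weight vector is closest
     * to the input vector, or -1 if there are no output nodes.
     */
    static int findWinner(double[] input, double[][] weights) {
        int winner = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int node = 0; node < weights.length; node++) {
            double distance = 0.0;
            for (int i = 0; i < input.length; i++) {
                double diff = input[i] - weights[node][i];
                distance += diff * diff;   // squared Euclidean distance (assumption)
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                winner = node;
            }
        }
        return winner;
    }

    /** Maps the winning node to the character it stands for. */
    static char recognize(double[] input, double[][] weights, char[] labels) {
        return labels[findWinner(input, weights)];
    }
}
```

Here `weights` and `labels` stand in for the trained output nodes loaded from the database; `labels[i]` is the character that node `i` represents.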
Modular approach
Modules used
a) Image loading/processing module
b) Pixel extractor
c) Segmentation module
d) Scanning
e) Self Organizing Map
f) Conversion to text
g) Spell checker
h) Saving
Testing
1. Purpose of Testing
2. Test Cases
Purpose of Testing
Software testing is an unavoidable step in software quality assurance and quality control.
Testing is the process of executing a program with the intent of finding errors and eliminating them, to produce software that meets its specification.
Its objective is to identify faults as quickly as possible after they occur and to identify their cause so that remedial steps can be taken.
It is important for making the project more robust.
Some test cases

Test case no. 1.
The expected outcome is the same characters in text format, i.e.
The quick brown fox jumps over the lazy dog.

What we achieved was
The quick bl-n lx jumps oVer the la_ dog
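
One way to put a number on the gap between the expected and the achieved text is a character error rate based on edit distance. The sketch below is an illustration only and not part of the presented tool. It assumes the Levenshtein distance (insertions, deletions and substitutions each cost 1) divided by the length of the expected text. The class and method names are hypothetical.

```java
// Hypothetical sketch for scoring OCR output against the expected text.
final class OcrAccuracySketch {

    /** Levenshtein distance between the expected and the recognized text. */
    static int editDistance(String expected, String actual) {
        int[] previous = new int[actual.length() + 1];
        int[] current = new int[actual.length() + 1];
        for (int j = 0; j <= actual.length(); j++) {
            previous[j] = j;   // turning "" into the first j characters takes j insertions
        }
        for (int i = 1; i <= expected.length(); i++) {
            current[0] = i;    // turning the first i characters into "" takes i deletions
            for (int j = 1; j <= actual.length(); j++) {
                int cost = expected.charAt(i - 1) == actual.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[actual.length()];
    }

    /** Edit distance divided by the expected length; 0.0 means a perfect match. */
    static double characterErrorRate(String expected, String actual) {
        if (expected.isEmpty()) {
            return actual.isEmpty() ? 0.0 : 1.0;
        }
        return (double) editDistance(expected, actual) / expected.length();
    }
}
```

Calling `OcrAccuracySketch.characterErrorRate("The quick brown fox jumps over the lazy dog.", "The quick bl-n lx jumps oVer the la_ dog")` would give the error rate for test case 1.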

Test cases continued

Test case no. 2 is for the login for handwriting recognition.
As input we gave a username and the corresponding password.
Expected outcome: the user gets entry if he or she holds a valid account and is denied entry otherwise.
Output: an authenticated user is taken to the picture upload module; else, if the person is not an authenticated user, ...

Test case no. 3 is for the spell check module.
The input given in the text pane was a misspelled word such as "brwn".
The expected outcome was that the word is flagged as wrong and that all words containing the letters b, r, w, n are suggested.

The output found was: (screenshot not preserved)
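
The suggestion rule in test case 3 can be sketched as follows. This is a guess at the behaviour and not the tool's actual spell checker: it assumes that a dictionary word is suggested when the letters of the misspelled word appear in it in the same order, and the word list in the usage note is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the spell checker's suggestion rule.
final class SpellSuggestSketch {

    /** True if every letter of typed appears in word, in the same order. */
    static boolean containsLettersInOrder(String word, String typed) {
        int matched = 0;
        for (int i = 0; i < word.length() && matched < typed.length(); i++) {
            if (word.charAt(i) == typed.charAt(matched)) {
                matched++;
            }
        }
        return matched == typed.length();
    }

    /** Returns the dictionary words that contain the typed letters in order. */
    static List<String> suggest(String typed, List<String> dictionary) {
        List<String> suggestions = new ArrayList<>();
        for (String word : dictionary) {
            if (containsLettersInOrder(word, typed)) {
                suggestions.add(word);
            }
        }
        return suggestions;
    }
}
```

With a sample dictionary of "brown", "brawn", "bran", "crown" and "barn", the input "brwn" yields "brown" and "brawn", since only those two contain b, r, w, n in that order.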

Test case no. 4 is for opening the image.
Validation: unless an image has been given as input, no button should work.
Expected: a message should be displayed asking the user to load an image first.
Output: an alert appears asking the user to open an image first.
Limitations of the software
The image should be of identifiable quality.
The image should be in a valid image file format, i.e. only formats like .jpg, .bmp and .png are usable.
A hand-written document should be readable.
For handwriting scanning, the system first needs to be trained with images of the user's handwritten documents. Only then would the system be able to recognize the input image.
The image should not contain text in cursive handwriting.
The input image should be aligned upright, not upside down.
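
The file format limitation can be checked before any processing starts. The sketch below is a minimal illustration with hypothetical names; it only looks at the file extension and assumes that exactly the three formats named above are accepted.

```java
import java.util.Locale;

// Hypothetical sketch of an input check for the supported image formats.
final class ImageFormatCheckSketch {

    /** True if the file name ends in .jpg, .bmp or .png (case-insensitive). */
    static boolean isSupportedImage(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".jpg") || lower.endsWith(".bmp") || lower.endsWith(".png");
    }
}
```

A check like this only inspects the name; a file with a valid extension but damaged content would still need to be rejected when the image is actually read.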
Future Scope
Things that could be added later to enhance the functionality of the project:

The application can be enhanced in the future.
The application will scan characters in a more time-efficient manner.
The text editor will become more capable, so that it automatically detects wrongly guessed characters and corrects them.
We hope to include a technology that can recognize cursive handwriting as well.
