SEMINAR ON

Universal Real-Time Language Independent Optical Character Recognition


Guided By: Mrs. Jagtap V. V.
Co-Guide: Miss Jadhav Sujata
Presented By: Aher Abhijeet S., Jamdade Sandeep S., Jadhav Vishal S.

CONTENTS

Introduction
Need of system
Block diagrams
Modules
Breakdown structure
Algorithms Used
Challenges of the system
UML Diagrams
Process Model
Time Line Chart
Risk Management
Requirements
Advantages
Conclusion
References

PROBLEM DEFINITION

Optical character recognition, usually abbreviated to OCR, converts images of text into editable text.

The system is capable of recognizing printed text in multiple languages.

The technology improves recognition performance on degraded data.

The system is capable of processing static as well as dynamic inputs.

NEED OF SYSTEM

To reduce manual data entry work.

No system with dynamic input currently exists in the market.

To improve degraded data (images/text).

To reduce overall cost & time.

BLOCK DIAGRAM

MODULES

Preprocessing & Scanning
Training & Recognition
Verification
Feature Extraction

BREAKDOWN STRUCTURE

ALGORITHMS USED

Grayscale Conversion
Thresholding
Erosion & Dilation
Sharpen & Blur
Image Segmentation

GRAYSCALE CONVERSION

A color image is converted into a grayscale image (shades of gray, not yet a pure black & white image). Each pixel carries only intensity information, so the value of each pixel is a single sample.
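A minimal Java sketch of this step follows. The slides do not name a conversion formula, so the weighted sum below (the common BT.601 luminance weights) is an assumption, and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

final class GrayscaleSketch {
    /** Returns one intensity sample (0-255) for a packed RGB pixel. */
    static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        // Assumed weights (BT.601); the slides do not specify a formula.
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Converts a color image into a 2-D array of intensity samples. */
    static int[][] toGrayscale(BufferedImage image) {
        int[][] gray = new int[image.getHeight()][image.getWidth()];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                gray[y][x] = luminance(image.getRGB(x, y));
            }
        }
        return gray;
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(2, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xFFFFFF); // white pixel
        img.setRGB(1, 0, 0xFF0000); // pure red pixel
        int[][] gray = toGrayscale(img);
        System.out.println(gray[0][0] + " " + gray[0][1]);
    }
}
```

The result is kept as a plain int array so that later steps such as thresholding can work on single samples per pixel.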

THRESHOLDING

Figure: original image, and an example of a threshold effect used on that image.

Thresholding is the simplest method of image segmentation.

From a grayscale image, thresholding can be used to create binary images.

During the thresholding process, individual pixels in an image are marked as "object" pixels if their value is greater than some threshold value (assuming the object is brighter than the background) and as "background" pixels otherwise.
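The rule above can be sketched in Java as follows. The fixed global threshold and the names used here are assumptions for illustration; the slides do not say how the threshold value is chosen.

```java
final class ThresholdSketch {
    /**
     * Marks a pixel as "object" (true) when its intensity is greater than
     * the threshold, and as "background" (false) otherwise.
     */
    static boolean[][] threshold(int[][] gray, int thresholdValue) {
        boolean[][] binary = new boolean[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            binary[y] = new boolean[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                binary[y][x] = gray[y][x] > thresholdValue;
            }
        }
        return binary;
    }

    public static void main(String[] args) {
        int[][] gray = { {10, 200}, {128, 129} };
        boolean[][] binary = threshold(gray, 128); // 128 is an arbitrary example value
        System.out.println(binary[0][1] + " " + binary[1][0]);
    }
}
```

For dark ink on light paper the comparison would likely be reversed (`<` instead of `>`), so that the text rather than the page becomes the object.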

EROSION & DILATION

Figure: original image, dilated image, and eroded image.


These operations increase or decrease the size of objects in an image.

Dilation causes objects to grow in size.

Erosion causes objects to shrink in size.
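A small Java sketch of both operations on a binary image is given below. It assumes a 3x3 square structuring element and treats pixels outside the image as background; neither choice is stated in the slides, and the names are hypothetical.

```java
final class MorphologySketch {
    /** Dilation: a pixel becomes object if ANY pixel in its 3x3 neighborhood is object. */
    static boolean[][] dilate(boolean[][] img) {
        return apply(img, true);
    }

    /** Erosion: a pixel stays object only if ALL pixels in its 3x3 neighborhood are object. */
    static boolean[][] erode(boolean[][] img) {
        return apply(img, false);
    }

    private static boolean[][] apply(boolean[][] img, boolean dilate) {
        int h = img.length, w = img[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean any = false, all = true;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int ny = y + dy, nx = x + dx;
                        // Pixels outside the image count as background (assumption).
                        boolean v = ny >= 0 && ny < h && nx >= 0 && nx < w && img[ny][nx];
                        any |= v;
                        all &= v;
                    }
                }
                out[y][x] = dilate ? any : all;
            }
        }
        return out;
    }

    /** Counts object pixels, which makes growth and shrinkage easy to observe. */
    static int count(boolean[][] img) {
        int n = 0;
        for (boolean[] row : img) for (boolean p : row) if (p) n++;
        return n;
    }

    public static void main(String[] args) {
        boolean[][] img = new boolean[5][5];
        for (int y = 1; y <= 3; y++) for (int x = 1; x <= 3; x++) img[y][x] = true;
        System.out.println(count(img) + " " + count(dilate(img)) + " " + count(erode(img)));
    }
}
```

Comparing the object-pixel count before and after each call shows the object growing under dilation and shrinking under erosion.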

SHARPEN & BLUR

Sharpening reduces the blur effect & helps repair the image.

Reduced-quality or degraded images are sharpened.
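Sharpening and blurring are commonly implemented as a 3x3 convolution over the grayscale samples. The sketch below assumes that approach, with a standard sharpen kernel and a box-blur kernel; the slides do not state which filters the system uses, and the names are hypothetical.

```java
final class ConvolutionSketch {
    // Assumed kernels: a common sharpen kernel and a 3x3 box blur.
    static final double[][] SHARPEN = { {0, -1, 0}, {-1, 5, -1}, {0, -1, 0} };
    static final double[][] BOX_BLUR = {
        {1 / 9.0, 1 / 9.0, 1 / 9.0},
        {1 / 9.0, 1 / 9.0, 1 / 9.0},
        {1 / 9.0, 1 / 9.0, 1 / 9.0}
    };

    /** Applies a 3x3 kernel; border pixels reuse the nearest edge sample. */
    static int[][] convolve(int[][] gray, double[][] kernel) {
        int h = gray.length, w = gray[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double sum = 0;
                for (int ky = 0; ky < 3; ky++) {
                    for (int kx = 0; kx < 3; kx++) {
                        int sy = Math.min(h - 1, Math.max(0, y + ky - 1));
                        int sx = Math.min(w - 1, Math.max(0, x + kx - 1));
                        sum += kernel[ky][kx] * gray[sy][sx];
                    }
                }
                // Clamp the result back into the valid 0-255 intensity range.
                out[y][x] = (int) Math.max(0, Math.min(255, Math.round(sum)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] gray = { {0, 0, 0}, {0, 90, 0}, {0, 0, 0} };
        System.out.println(convolve(gray, SHARPEN)[1][1] + " " + convolve(gray, BOX_BLUR)[1][1]);
    }
}
```

The sharpen kernel exaggerates the difference between a pixel and its neighbors, while the box blur averages them, which is why the two are often applied in sequence to suppress noise and then restore edges.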

IMAGE SEGMENTATION

Segmentation refers to the process of partitioning a digital image into multiple segments.

It is typically used to locate objects and boundaries in images.
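One simple way to locate text boundaries is to scan the binary image row by row and group consecutive rows that contain object pixels into text lines. The Java sketch below assumes that reading of the "scan line" approach named in the activity diagram; the slides give no details, so the method and its names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

final class ScanLineSegmentationSketch {
    /**
     * Scans the image one row at a time and returns {firstRow, lastRow}
     * for every run of consecutive rows that contain object pixels.
     */
    static List<int[]> findTextLines(boolean[][] binary) {
        List<int[]> lines = new ArrayList<int[]>();
        int start = -1; // -1 means "not currently inside a text line"
        for (int y = 0; y < binary.length; y++) {
            boolean hasInk = false;
            for (boolean pixel : binary[y]) {
                if (pixel) { hasInk = true; break; }
            }
            if (hasInk && start < 0) {
                start = y;                         // a new segment begins
            } else if (!hasInk && start >= 0) {
                lines.add(new int[] {start, y - 1}); // the segment just ended
                start = -1;
            }
        }
        if (start >= 0) {
            lines.add(new int[] {start, binary.length - 1}); // segment touches the bottom edge
        }
        return lines;
    }

    public static void main(String[] args) {
        boolean[][] binary = {
            {false, false}, {true, false}, {true, true}, {false, false}, {false, true}
        };
        for (int[] line : findTextLines(binary)) {
            System.out.println(line[0] + ".." + line[1]);
        }
    }
}
```

Running the same scan over columns inside each detected line would separate individual characters, which is the "separation of characters" problem listed among the challenges.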

CHALLENGES OF THE SYSTEM

Recognition of low-quality, low-resolution text.

Inpainting text recognition.

Separation of characters.

Challenges still exist in reading handwritten text.

LEVEL 0 DFD

Handwritten Text via Pad or Mouse -> Optical Character Recognition System -> Text in Editable Format

ACTIVITY DIAGRAM

Activities shown in the diagram:
User Login & Authentication
Authentication check (with a NO branch)
Access Application
Load User Handwritten Profile
Enter Text Using Pad/Mouse
Save This Text As Image
Apply Segmentation Using Scan Line Algorithm
Apply Scaling Using Bilinear Interpolation
Match Templates
Generate Output
Save Output Image
Logout
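The scaling step named in the diagram can be sketched in Java as below. It assumes grayscale samples and a corner-aligned mapping between source and target pixels; the slides do not describe the exact variant, and the names are hypothetical.

```java
final class BilinearScalingSketch {
    /** Resizes a grayscale image by blending the four nearest source samples. */
    static int[][] scale(int[][] src, int newWidth, int newHeight) {
        int h = src.length, w = src[0].length;
        int[][] dst = new int[newHeight][newWidth];
        // Corner-aligned mapping (assumption): first and last pixels map onto each other.
        double xRatio = newWidth > 1 ? (double) (w - 1) / (newWidth - 1) : 0;
        double yRatio = newHeight > 1 ? (double) (h - 1) / (newHeight - 1) : 0;
        for (int y = 0; y < newHeight; y++) {
            double fy = y * yRatio;
            int y0 = Math.min((int) fy, h - 1);
            int y1 = Math.min(y0 + 1, h - 1);
            double wy = fy - y0; // vertical blend weight
            for (int x = 0; x < newWidth; x++) {
                double fx = x * xRatio;
                int x0 = Math.min((int) fx, w - 1);
                int x1 = Math.min(x0 + 1, w - 1);
                double wx = fx - x0; // horizontal blend weight
                double top = src[y0][x0] * (1 - wx) + src[y0][x1] * wx;
                double bottom = src[y1][x0] * (1 - wx) + src[y1][x1] * wx;
                dst[y][x] = (int) Math.round(top * (1 - wy) + bottom * wy);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[][] src = { {0, 100}, {100, 200} };
        int[][] dst = scale(src, 3, 3);
        System.out.println(dst[0][1] + " " + dst[1][1] + " " + dst[2][2]);
    }
}
```

Scaling every segmented character to one fixed size before template matching is a plausible reason for this step, since templates can then be compared pixel by pixel.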

CLASS DIAGRAM

USE CASE DIAGRAM

INCREMENTAL PROCESS MODEL


System Engineering
Increment 1: Analysis, Design, Implementation, Testing
Increment 2: Analysis, Design, Implementation, Testing
Increment 3: Analysis, Design, Implementation, Testing

TIME LINE CHART


Time Line Chart for Sem-1
Dates: 12/07, 06/08, 13/08, 19/08, 25/08, 30/08, 07/09, 13/09, 04/10, 14/10
Tasks: T1 to T10
Milestones: M1, M2

Time Line Chart for Sem-2
Dates: 04/12, 14/12, 03/02, 09/02, 12/03, 26/03, 02/04
Tasks: T9 to T21
Milestones: M1, M2, M3

RISK MANAGEMENT
Risk identification
Product size: overall size of the product/software
Business: constraints imposed by management
Customer: ability to communicate with the customer
Process: software process model used
Development: availability and quality of tools
Technical: technology used
Staff size: technical and project experience

REQUIREMENTS
Software req. (min)
Operating system: Windows XP & onwards.

Hardware req.
Processor: Pentium 4 & onwards.
RAM: 128 MB & greater.
HDD: 4 GB.

Technology To Be Used:
Java Platform (JDK 6.0.1)
SQL Server

ADVANTAGES
White page detection.
Corrects skew/slant errors.
Normalizes character size.
Reduced manpower.
Separation of text from background.
Improves degraded data.
Reduces cost & time.

APPLICATIONS
Real-time recognition of handwritten scripts.

Handwritten sticky notes.

Maintaining any handwritten notes.

FUTURE SCOPE
More languages can be included.

The system will be made available online.

The system can be implemented for mobile devices.

CONCLUSION
The OCR system is easily trainable on new sets of data and portable to new scripts. With training, the system could achieve robust performance on degraded data. Because no such system currently exists in the market, it will be very useful & efficient. Language independence will also help distinguish our system from existing systems.

REFERENCES
K. Aas and L. Eikvil, "Text page recognition using grey-level features and hidden Markov models," Pattern Recognition 29, 977-985, 1996.

M. Allam, "Segmentation versus segmentation-free for recognizing text," Proc. SPIE, Vol. 2422, 228-235, 1995.

I. Bazzi, C. Lapre, R. Schwartz, and J. Makhoul, "Omnifont and unlimited vocabulary OCR system for English

THANK YOU
