
Text – to – Speech Converter

Submitted in partial fulfillment of the requirements for the award of degree of

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE & ENGINEERING

Submitted to:

Mr. Gurjeetpal Bawa

Submitted By:

Yashpreet Kaur Raon

16BCS1313

Nitesh Kumar

16BCS1253

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

Chandigarh University, Gharuan


Dec 2018
ACKNOWLEDGEMENT
I have taken efforts in this project. However, it would not have been possible without the kind
support and help of many individuals and organizations. I would like to extend my sincere
thanks to all of them.

I am highly indebted to Ms. Manpreet Kaur for her guidance and constant supervision as well
as for providing necessary information regarding the project & also for her support in
completing the project.

I would like to express my gratitude towards my parents & members of Chandigarh
University for their kind co-operation and encouragement, which helped me in the completion of
this project.

I would like to express my special gratitude and thanks to my classmates and teachers for
giving me such attention and time.

My thanks and appreciation also go to my mentor in developing the project and to the people who
have willingly helped me out with their abilities.

Yashpreet Kaur (16BCS1313)

Nitesh Kumar

III Year, V Semester

Computer Science Engineering


ABSTRACT

The text-to-speech conversion software project is a Windows-based application that reads a text
file to the user. The software reads a text file together with the associated pronunciations held in
its temporary database. The program then reads the text aloud to the user, one whole word at a
time. The software can be used to read a text document aloud so that the user does not constantly
need to look at the screen and read the entire document.

The text-to-speech converter is a recent software project that allows even visually challenged
users to read and understand various documents. Blind users cannot read a document, so this
software can act as an assistant that reads those documents out for them. It can
also be a great help for those who cannot speak. The person can simply type what he/she
wants to say, and the software gives them a voice by speaking it aloud.
So, this software is not just a step towards future development but also a
boon for those who cannot speak or see.

A text-to-speech system (or "engine") is composed of two parts:[3] a front-end and a back-
end. The front-end has two major tasks. First, it converts raw text containing symbols like
numbers and abbreviations into the equivalent of written-out words. This process is often
called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic
transcriptions to each word, and divides and marks the text into prosodic units,
like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to
words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions
and prosody information together make up the symbolic linguistic representation that is
output by the front-end. The back-end—often referred to as the synthesizer—then converts
the symbolic linguistic representation into sound. In certain systems, this part includes the
computation of the target prosody (pitch contour, phoneme durations), which is then imposed
on the output speech.
List of Figures

Figure No.   Title

1.   Block Diagram of Iterative Waterfall Model
2.   Block Diagram of Text to Speech Converter
3.   Flow Diagram of Text to Speech Converter
4.   Screenshots from the Project
1 Introduction

1.1 About Project

The text-to-speech conversion software project is a Windows-based application that reads text
aloud to the user. The software reads a text file, entered text, or the text of a selected image,
together with the associated pronunciations held in its temporary database. The program then
reads the text aloud, one whole word at a time. The software can be used to read text files, PDF
documents, image text, or entered text aloud so that the user does not constantly need to look at
the screen and read the entire document, image, or text.

The text-to-speech converter is a recent software project that allows even visually challenged
users to read and understand various documents. Blind users cannot read a document, so this
software can act as an assistant that reads those documents out for them. It can
also be a great help for those who cannot speak. The person can simply type what he/she
wants to say, and the software gives them a voice by speaking it aloud.
The user just has to select the Interactive mode, write what he/she wants to say in
the text area, and click the Convert button. So, this software is not just a step towards
future development but also a boon for those who cannot speak or see. This technology can
also be utilized for various purposes, e.g. car navigation, announcements in railway stations,
response services in telecommunications, and e-mail reading. With further innovation, more
applications can be found for it.

TTS works with nearly every personal digital device, including computers, smartphones and
tablets. All kinds of text files can be read aloud, including Word and Pages documents. Even
online web pages can be read aloud. The voice in TTS is computer-generated, and reading
speed can usually be sped up or slowed down. Voice quality varies, but some voices sound
human, which gives a more natural feel to the speech. There are even computer-generated
voices that sound like children speaking. The software designed here uses a computerized
female voice. Many TTS tools highlight words as they are read aloud. This allows kids to see
text and hear it at the same time. Some TTS tools also have a technology called optical
character recognition (OCR). OCR allows TTS tools to read text aloud from images. For
example, your child could take a photo of a street sign and have the words on the sign turned
into audio. The designed software provides this feature through an option for converting image
text to speech. Different files can also be converted using this software: text, document, or
PDF files can easily be read aloud.

A text-to-speech system (or "engine") is composed of two parts:[3] a front-end and a back-end.
The front-end has two major tasks. First, it converts raw text containing symbols like
numbers and abbreviations into the equivalent of written-out words. This process is often
called text normalization, pre-processing, or tokenization. The front-end then
assigns phonetic transcriptions to each word, and divides and marks the text into prosodic
units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to
words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions
and prosody information together make up the symbolic linguistic representation that is
output by the front-end. The back-end—often referred to as the synthesizer—then converts
the symbolic linguistic representation into sound. In certain systems, this part includes the
computation of the target prosody (pitch contour, phoneme durations),[4] which is then
imposed on the output speech.
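
The text-normalization task of the front-end can be illustrated with a minimal Python sketch. The abbreviation table, the function name normalize and the digit-by-digit reading of numbers are assumptions made purely for illustration; they are not taken from the project's code, and a real engine would use far richer rules.

```python
import re

# Hypothetical abbreviation table; a real front-end would use a much larger one.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "st.": "street"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]


def normalize(text):
    """Expand known abbreviations and spell out numbers digit by digit."""
    words = []
    for token in text.split():
        lowered = token.lower()
        if lowered in ABBREVIATIONS:
            words.append(ABBREVIATIONS[lowered])
        elif re.fullmatch(r"\d+", token):
            # Simplification: "42" is read as "four two", not "forty-two".
            words.extend(DIGITS[int(d)] for d in token)
        else:
            words.append(token)
    return " ".join(words)


print(normalize("Dr. Smith lives at 42 Elm St."))
# prints: doctor Smith lives at four two Elm street
```

The output of such a step is plain written-out words, which is the form that the grapheme-to-phoneme stage described above expects.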
1.2 About Language

The language used for the text-to-speech conversion project is Python. Python is a high-level,
interpreted, interactive and object-oriented scripting language. Python is designed to
be highly readable. It frequently uses English keywords where other languages use
punctuation, and it has fewer syntactical constructions than other languages.

• Python is Interpreted − Python is processed at runtime by the interpreter. You do not
need to compile your program before executing it. This is similar to Perl and PHP.
• Python is Interactive − You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
• Python is Object-Oriented − Python supports the object-oriented style or technique of
programming that encapsulates code within objects.
• Python is a Beginner's Language − Python is a great language for beginner-level
programmers and supports the development of a wide range of applications from
simple text processing to WWW browsers to games.

History of Python

Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.

Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.

Python is copyrighted. Its source code is available under the Python Software Foundation
License, an open-source license that is compatible with the GNU General Public License (GPL).

Python is now maintained by a core development team, although Guido van
Rossum still holds a vital role in directing its progress.

Python Version History

Currently, the Python Software Foundation (PSF) supports two versions, Python 2.x and
Python 3.x. Python 2.0 was released in October 2000 and included a large number of new
features. The PSF continues to support Python 2 because a large body of existing code could
not be forward-ported to Python 3. So, they will support Python 2 until 2020.
Python 3.0 was released on December 3rd, 2008. It was designed to rectify certain flaws in
earlier versions. This version is not completely backward-compatible with previous versions.
However, many of its major features have since been back-ported to the Python 2.6.x and
2.7.x version series. Releases of Python 3 include the 2to3 utility, which automates much of
the translation of Python 2 code to Python 3.

Python Features

Python's features include −

• Easy-to-learn − Python has few keywords, a simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
• Easy-to-read − Python code is more clearly defined and visible to the eyes.
• Easy-to-maintain − Python's source code is fairly easy to maintain.
• A broad standard library − The bulk of Python's library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.
• Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
• Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
• Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
• Databases − Python provides interfaces to all major commercial databases.
• GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and window systems, such as Windows MFC,
Macintosh, and the X Window system of Unix.
• Scalable − Python provides a better structure and support for large programs than
shell scripting.

Apart from the above-mentioned features, Python has a long list of other good features; a few
are listed below −

• It supports functional and structured programming methods as well as OOP.
• It can be used as a scripting language or can be compiled to byte-code for building
large applications.
• It provides very high-level dynamic data types and supports dynamic type checking.
• It supports automatic garbage collection.
• It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

Python Advantages

• Python provides enhanced readability. For that purpose, uniform indents are used to
delimit blocks of statements instead of the curly brackets used in many languages such as
C, C++ and Java.
• Python is free and distributed as open-source software. A large programming
community is actively involved in the development and support of Python libraries for
various applications such as web frameworks, mathematical computing and data
science.
• Python is a cross-platform language. It works equally well on different OS platforms like
Windows, Linux, Mac OS X etc. Hence Python applications can be easily ported
across OS platforms.
• Python supports multiple programming paradigms including imperative, procedural,
object-oriented and functional programming styles.
• Python is an extensible language. Additional functionality (other than what is
provided in the core language) can be made available through modules and packages
written in other languages (C, C++, Java etc.).
• A standard DB-API for database connectivity has been defined in Python. It can be
enabled using any data source (Oracle, MySQL, SQLite etc.) as a backend to the
Python program for storage, retrieval and processing of data.
• The standard distribution of Python contains Tkinter, a Python interface to the popular
Tcl/Tk GUI toolkit. An attractive GUI can be constructed using Tkinter. Many other GUI
libraries like Qt, GTK, wxWidgets etc. also have Python bindings.
• Python can be integrated with other popular programming technologies like C, C++,
Java, ActiveX and CORBA.

Python Application Types

Even though Python started as a general-purpose programming language with no particular
application as its focus, over the last few years it has emerged as the language of choice for
developers in some application areas. Some important applications of Python are summarized
below:

• Data Science - Python has recently risen in the popularity charts mainly
because of its data science libraries. A huge amount of data is being generated today by
web applications, mobile applications and other devices. Companies need business
insights from this data.

Today Python has become the language of choice for data scientists. Python libraries
like NumPy, Pandas and Matplotlib are extensively used in the process of data
analysis, including the collection, processing and cleansing of data sets, applying
mathematical algorithms and generating visualizations for the benefit of users.
Commercial and community Python distributions by third-parties such
as Anaconda and ActiveState provide all the essential libraries required for data
science.

• Machine Learning - This is another key application area of Python. Python libraries
such as Scikit-learn, TensorFlow and NLTK are widely used for the prediction of
trends like customer satisfaction, projected values of stocks etc. Some of the real-world
applications of machine learning include medical diagnosis, statistical
arbitrage, basket analysis, sales prediction etc.
• Web Development - This is another application area in which Python is becoming
popular. Web application framework libraries like Django, Pyramid, Flask etc. make it
very easy to develop and deploy simple as well as complex web applications. These
frameworks are used extensively by various IT companies. Dropbox, for example, uses
Python extensively in the backend that stores and synchronizes local folders.

Most of the web servers today are compatible with WSGI (Web Server Gateway
Interface) - a specification for the universal interface between Python web
frameworks and web servers. All leading web servers such as Apache, IIS, Nginx, etc.
can now host Python web applications. Google's App Engine hosts web applications
built with almost all Python web frameworks.

• Image Processing - The OpenCV library is commonly used for face detection and
gesture recognition. OpenCV is a C++ library, but it has Python bindings. Because
of the rapid development of this feature, Python is a very popular choice for image
processing.
• Game Development - Python is a popular choice for game developers.
The PyGame library is extensively used for building games for desktop as well as for
mobile platforms. PyGame applications can be installed on Android too.
• Embedded Systems and IoT - Another important area of Python application is in
embedded systems. Raspberry Pi is a very popular yet low-cost single-board
computer. It is being extensively used in automation products, robotics, IoT, and
kiosk applications. Popular microcontrollers like Arduino are used in many IoT
products and are being programmed with Python. A lightweight version of Python
called MicroPython has been developed especially for microcontrollers. A special
MicroPython-compatible controller called PyBoard has also been developed.
• Android Apps - Although Android apps are predominantly developed using the Android
SDK, which is Java-based, Python can also be used to develop Android apps.
Python's Kivy library has all the functionalities required for a mobile application.
• Automated Jobs - Python is extremely useful and widely used for automating cron
jobs. Certain tasks like backups, defined in Python scripts, can
be scheduled to be invoked automatically by the operating system scheduler and
executed at predefined times.

Python is embedded as a scripting language in many popular software products. This
is similar to VBA used for writing macros in Excel, PowerPoint, etc. The Python API is
integrated with Maya, PaintShop Pro, etc.

• Rapid Development Tool - The standard distribution of Python, as developed by van Rossum
and maintained by the Python Software Foundation, is called CPython and is the
reference implementation. Its alternative implementations, Jython (the JRE
implementation of Python) and IronPython (the .NET implementation), interact
seamlessly with Java and C#, respectively. For example, Jython can use all Java
libraries such as Swing. So the development time can be minimized by using
simpler Python syntax and Java libraries for prototyping the software product.

Running Python
There are three different ways to start Python −
Interactive Interpreter
You can start Python from Unix, DOS, or any other system that provides you a command-
line interpreter or shell window.

Enter python at the command line.

Start coding right away in the interactive interpreter.

$ python        # Unix/Linux

or

% python        # Unix/Linux

or

C:> python      # Windows/DOS

Here is the list of all the available command line options −

Sr.No.   Option   Description

1        -d       Provides debug output.
2        -O       Generates optimized bytecode (resulting in .pyo files).
3        -S       Does not run import site to look for Python paths on startup.
4        -v       Verbose output (detailed trace on import statements).
5        -X       Disables class-based built-in exceptions (just use strings); obsolete
                  starting with version 1.6.
6        -c cmd   Runs the Python script sent in as the cmd string.
7        file     Runs the Python script from the given file.

Script from the Command-line


A Python script can be executed at the command line by invoking the interpreter on your
application, as in the following −

$ python script.py        # Unix/Linux

or

% python script.py        # Unix/Linux

or

C:> python script.py      # Windows/DOS

Integrated Development Environment


You can run Python from a Graphical User Interface (GUI) environment as well, if you have
a GUI application on your system that supports Python.

• Unix − IDLE is the very first Unix IDE for Python.

• Windows − PythonWin is the first Windows interface for Python and is an IDE with
a GUI.

• Macintosh − The Macintosh version of Python along with the IDLE IDE is available
from the main website, downloadable as either MacBinary or BinHex'd files.

If you are not able to set up the environment properly, then you can take help from your
system admin. Make sure the Python environment is properly set up and working perfectly
fine.
1.4 Feasibility Study

• Economic Feasibility
The text-to-speech conversion software is very affordable, as it requires
only Python. There is no need for a recorder or any other gadget or
equipment that would add to the cost of this software.
• Technical Feasibility
This software only requires Python, which is already widespread and widely used.
YouTube, which is used by billions of users, has some parts implemented in
Python. So the only technology required in this project is already available and
familiar. Hence, this software is technically feasible.
• Operational Feasibility
The software is very easy to use and is designed to help people,
especially those who cannot speak and those who are visually challenged.
This software would be a great help to them in living a normal life.
2 SRS

2.1 Introduction

2.1.1 Purpose

Among the many definitions that could be given of text-to-speech, the one that
describes it as a way of having a computer audibly communicate information to
the user is probably the most relevant within the context of this statement. In
situations where visual feedback is inadequate or even impossible, audible
feedback may be an essential feature; in many situations it may just add extra
value to a product. Generally, text-to-speech provides a very valuable and
flexible alternative to digital audio recordings where:
• Recordings are too expensive.
• Disk storage is insufficient to store the recordings.
• The application does not know ahead of time what it will need to
speak.
• The information varies too much to record and store all the
alternatives.
2.1.2 Document Conventions
This SRS is set in Times New Roman throughout. Headings are in bold at
font size 16, sub-headings are in bold at font size 14, and the body text
is at font size 12. Important points are set in italics.
2.1.3 Intended Audience and Reading Suggestions
This SRS can be read by all the developers. The rest of the SRS
describes the benefits of our project, how to use it, how it
was developed, and the major things we have taken into consideration.
2.1.4 Project Scope
The term “Text-to-Speech”, or TTS for short, refers to the process by which
plain text is converted into digital audio and then “spoken”. This speaking can
take the form of actually sending the audio through a computer’s speakers (or
other capable device), or simply saving the audio for later playback.
For the most part, TTS conversion engines can be grouped by the three
methods used to convert phonemes (the smallest phonetic unit in a language
that is capable of conveying a distinction in meaning, such as the m of mat and
the b of bat in English) into audible sound. The supplied Microsoft Speech
engines use the second method. The three methods are described in the
following paragraphs.

2.2 Overall Description


2.2.1 Product Perspective
The product is a text-to-speech program that lets you type in any English or Spanish text
and then plays it as an audio stream.
It instantly converts the desired text to audio.
It converts files into speech. Files can be txt or pdf.
It converts text from an image to speech.
Supported language: English.
2.2.2 Product Feature and developer application
Different implementations of text-to-speech systems exist. This section
discusses some of the concepts on which these systems are built. Generally, a
text-to-speech system can be broken down into three parts: a linguistic, a
phonetic and an acoustic part. First, ordinary text is input to the system. A
linguistic module converts this text into a phonetic representation. From this
representation, the phonetic processing module calculates the speech
parameters. Finally, an acoustic module uses these parameters to generate a
synthetic speech signal.
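
The three-part breakdown above can be sketched as a toy Python pipeline. Everything in it is an assumption made for illustration: the two-word lexicon, the fixed 80 ms duration, the falling pitch values and the use of plain sine tones in place of real speech sounds. It only shows how data could flow from one module to the next, not how the project or any real engine synthesizes speech.

```python
import math

# Toy pronunciation lexicon (hypothetical); real systems use large
# dictionaries plus letter-to-sound rules.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def linguistic_module(text):
    """Text -> phoneme list. Unknown words are spelled letter by letter."""
    phonemes = []
    for word in text.lower().split():
        word = word.strip(".,!?")
        phonemes.extend(LEXICON.get(word, list(word.upper())))
    return phonemes


def phonetic_module(phonemes):
    """Phonemes -> (phoneme, duration in ms, pitch in Hz) speech parameters."""
    # Made-up prosody: fixed duration and a gently falling pitch contour.
    return [(p, 80, 120.0 - 2.0 * i) for i, p in enumerate(phonemes)]


def acoustic_module(params, sample_rate=8000):
    """Speech parameters -> audio samples (one sine tone per phoneme)."""
    samples = []
    for _, duration_ms, pitch in params:
        n = sample_rate * duration_ms // 1000
        samples.extend(
            math.sin(2 * math.pi * pitch * t / sample_rate) for t in range(n)
        )
    return samples


phonemes = linguistic_module("Hello, world.")
signal = acoustic_module(phonetic_module(phonemes))
print(phonemes, len(signal))
```

Each function consumes only the output of the previous one, which mirrors the linguistic, phonetic and acoustic separation described in this section.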
2.2.3 User Classes and Characteristics
Using this product, the user can listen to entered or selected text. He or she
can also listen to the text of an input file, which can be in txt or pdf format.
The user can listen to the entered text in the interactive mode.
The user can also listen to the text written in an image.
2.2.4 The Operating Environment
The software requirements are Windows XP or any later edition, and
Python.
The hardware requirements are a P4 processor, 512 MB of main memory
(RAM), a 40 GB hard disk and base memory.
2.2.5 Design and Implementation Constraints
Design constraints for developers: all modules are coded thoroughly based on the
requirements. The software is designed in such a way that the user can easily
interact with the screen, and so that it can be
extended into real-time business use.

2.3 System Features


In this project, we have 3 modules:
• Entered Text or Selected Text to Speech conversion.
• Text File and PDF file conversion Module.
• Image Text to Speech Conversion Module.

Entered Text or Selected Text to Speech Conversion

In this module the user enters some text and can listen to it as speech by clicking the
Convert button at the bottom. The user can listen to selected text or entered text. For
this module we designed a GUI that provides a text area for entering text. The
module opens when Interactive Mode is clicked in the main menu.
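
A minimal sketch of such a window is given here, assuming Tkinter for the text area and Convert button and pyttsx3 for speech, as listed under Software Interfaces. The function names speak and build_interactive_window are hypothetical and are not taken from the project's source code.

```python
def speak(text, engine=None):
    """Speak `text` aloud. Returns False when there is nothing to say."""
    text = text.strip()
    if not text:
        return False
    if engine is None:
        import pyttsx3  # third-party package, assumed installed
        engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()  # blocks until the speech has finished
    return True


def build_interactive_window():
    """Create a window with a text area and a Convert button."""
    import tkinter as tk  # imported here so speak() works without a display

    root = tk.Tk()
    root.title("Interactive Mode")
    text_area = tk.Text(root, height=10, width=50)
    text_area.pack(padx=10, pady=10)

    def on_convert():
        try:
            # Prefer the highlighted selection when there is one.
            text = text_area.get(tk.SEL_FIRST, tk.SEL_LAST)
        except tk.TclError:
            # No selection: fall back to everything in the text area.
            text = text_area.get("1.0", tk.END)
        speak(text)

    tk.Button(root, text="Convert", command=on_convert).pack(pady=(0, 10))
    return root
```

Calling build_interactive_window().mainloop() would open the window. Because runAndWait blocks, the window in this sketch stays unresponsive while the text is being spoken.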

Text File Conversion Module

In this module, the user can supply a text file as input for converting text into speech. The
functionalities of this module are:

• Getting the path of the input file.
• Opening the file using the path.
• Reading the file.
• Passing the read text to the speech method.
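
The steps above can be sketched as follows. The function names, the UTF-8 encoding and the use of pyttsx3 for the speech step are assumptions made for illustration rather than the project's actual code.

```python
from pathlib import Path


def read_text_file(path):
    """Open the file at `path` and return its contents as a string."""
    # Assumes UTF-8; undecodable bytes are replaced instead of raising.
    return Path(path).read_text(encoding="utf-8", errors="replace")


def speak_text_file(path, engine=None):
    """Read a text file and pass its text to the speech engine."""
    text = read_text_file(path)
    if engine is None:
        import pyttsx3  # third-party package, assumed installed
        engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
    return text
```

In a Tkinter GUI, the path itself could be obtained with tkinter.filedialog.askopenfilename before calling speak_text_file.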

PDF File Conversion Module

In this module, the user can supply a PDF file as input for converting text into speech. The
functionalities of this module are:
• Getting the path of the input file.
• Opening the file.
• Reading the file.
• Passing the read text to the speech module.
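
A possible sketch of the PDF steps is shown here. The report does not name the PDF library that was used, so the pypdf package is an assumption, as are the function names join_pages and read_pdf_text.

```python
def join_pages(page_texts):
    """Combine per-page text into one string, skipping empty pages."""
    return "\n".join(t.strip() for t in page_texts if t and t.strip())


def read_pdf_text(path):
    """Extract the text of every page of the PDF at `path`."""
    from pypdf import PdfReader  # third-party package, assumed installed

    reader = PdfReader(path)
    return join_pages(page.extract_text() for page in reader.pages)
```

The returned string can then be handed to the speech module in the same way as the contents of a text file. Scanned PDFs that contain only page images yield no text this way and would need the OCR route instead.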

Image Text to Speech Conversion Module

This module first takes an image as input, which involves selecting
or browsing for an image. Clicking the Convert button then passes the text acquired from the
image to the speech module, which produces the audio form of the text in the image.
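
The image-to-text step can be sketched with Pytesseract, which is listed under Software Interfaces. The function names and the whitespace clean-up are assumptions made for illustration; Pytesseract also needs the Tesseract OCR program installed on the machine.

```python
def clean_ocr_text(raw):
    """Collapse the line breaks and extra spaces that OCR output often contains."""
    return " ".join(raw.split())


def read_image_text(image_path):
    """Run OCR on the image at `image_path` and return the recognized text."""
    from PIL import Image   # third-party package (Pillow), assumed installed
    import pytesseract      # third-party wrapper around the Tesseract program

    raw = pytesseract.image_to_string(Image.open(image_path))
    return clean_ocr_text(raw)
```

Cleaning the OCR output first keeps stray line breaks in the recognized text from being passed on to the speech module.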

2.4 External Interface Requirements


2.4.1 User Interfaces
This application includes GUI standards or product family style guides that are
to be followed, screen layout constraints, standard buttons and functions that
will appear on every screen, error message display standards, and so on.
2.4.2 Hardware Interfaces
Processor : Intel i3 and above
RAM : 128 MB and above
Hard disk : 5 GB and above
Monitor : CRT OR LCD monitor
Keyboard : Normal or Multimedia
Mouse : Compatible mouse

2.4.3 Software Interfaces


Python IDLE - Python’s Integrated Development and Learning Environment
for writing programs. IDLE has two main window types, the Shell window
and the Editor window. It is possible to have multiple editor windows
simultaneously. The shell is used for the construction of the project.
Tkinter library – Tkinter is a Python binding to the Tk GUI toolkit. It is the
standard Python interface to the Tk GUI toolkit, and is Python's de
facto standard GUI.
The operating system used is Windows 10.
pyttsx3 library - pyttsx is a text-to-speech conversion library for Python,
but it was written only for Python 2. pyttsx3 is a version of the library
that is compatible with Python 3.
Pytesseract - Python-tesseract, or Pytesseract, is an Optical Character
Recognition (OCR) tool for Python. That is, it will recognize and “read” the text
embedded in images.
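
As an illustration of how pyttsx3 can be configured, the sketch below picks a voice by name and sets the speaking rate. The helper names, the default keyword "zira" (the name of a female voice shipped with Windows 10) and the rate of 150 are assumptions; the voices that are actually available depend on the machine.

```python
def pick_voice(voices, keyword):
    """Return the id of the first voice whose name contains `keyword`, else None."""
    for voice in voices:
        if keyword.lower() in (voice.name or "").lower():
            return voice.id
    return None


def configure_engine(engine, keyword="zira", rate=150):
    """Select a voice by name (if present) and set the speaking rate."""
    voice_id = pick_voice(engine.getProperty("voices"), keyword)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    engine.setProperty("rate", rate)
    return voice_id
```

With pyttsx3 installed, this would be used as engine = pyttsx3.init() followed by configure_engine(engine), before any call to engine.say.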
3 SOFTWARE DEVELOPMENT LIFE CYCLE USED

The software development lifecycle (SDLC) used for this project was the iterative
waterfall model. In a practical software development project, the classical waterfall model is
hard to use. The iterative waterfall model can be thought of as incorporating the necessary
changes to the classical waterfall model to make it usable in practical software development
projects. It is almost the same as the classical waterfall model, except that some changes are
made to increase the efficiency of software development. This gave us the flexibility we
needed to make changes.

The iterative waterfall model provides feedback paths from every phase to its preceding
phases, which is the main difference from the classical waterfall model.

Feedback paths introduced by the iterative waterfall model are shown in the figure below.
When errors are detected at some later phase, these feedback paths allow correcting errors
committed by programmers during an earlier phase. The feedback paths allowed us to rework
the phase in which errors were committed, and these changes were reflected in the later
phases. But there is no feedback path to the feasibility study stage, because once a project
has been taken up, it is not given up easily.
It is good to detect errors in the same phase in which they are committed. It reduces the effort
and time required to correct the errors.

Phases in Iterative Waterfall model are –

• Feasibility Study – A feasibility study is an assessment of the practicality of a
proposed project or system. A feasibility study aims to objectively and rationally
uncover the strengths and weaknesses of an existing business or proposed venture,
opportunities and threats present in the natural environment, the resources required
to carry through, and ultimately the prospects for success. In its simplest terms, the
two criteria to judge feasibility are cost required and value to be attained. A well-designed
feasibility study should provide a historical background of the business or
project, a description of the product or service, accounting statements, details of
the operations and management, marketing research and policies, financial data,
legal requirements and tax obligations. Generally, feasibility studies precede
technical development and project implementation. A feasibility study evaluates the
project's potential for success; therefore, perceived objectivity is an important factor
in the credibility of the study for potential investors and lending institutions. It must
therefore be conducted with an objective, unbiased approach to provide information
upon which decisions can be based.

• Requirement analysis and specification − This phase involves understanding
what you need to design and what its function and purpose are. Unless you know what
you want to design, you cannot proceed with the project. Even a small piece of code, such as
one adding two integers, needs to be written with the output in mind. In
this stage, the requirements which the software is going to satisfy are listed and
detailed. These requirements are then presented to the team of programmers. If this
phase is completed successfully, it ensures the smooth working of the remaining
phases, as the programmer is not burdened with making changes at later stages because of
changes in requirements. The software and hardware needed
for the proper completion of the project are also analyzed in this phase, according to the
requirements. Everything from the programming language to be used for designing the
software to the database system that will support its smooth functioning is decided at this
stage.

• Design − The algorithm or flowchart of the program or software code to be
written in the next stage is created now. It is a very important stage, which relies on
the previous two stages for its proper implementation. A proper design at this
stage ensures smooth execution in the next stage. If during the design phase it is noticed
that there are some more requirements for designing the code, the analysis phase is
revisited and the design phase is carried out according to the new set of resources.

• Coding and Unit Testing − With inputs from the system design, the system is first
developed in small programs called units, which are integrated in the next phase.
Each unit is developed and tested for its functionality, which is referred to as Unit
Testing. Once the coding of the application is complete, the testing
of the written code comes into the picture. Testing checks whether there are any flaws in
the designed software and whether the software has been designed as per the listed
specifications. Proper execution of this stage ensures that the client interested in
the created software will be satisfied with the finished product. If there are any
flaws, the software development process must step back to the design phase. In the
design phase, changes are implemented and then the succeeding stages of coding and
testing are carried out again.

 Integration and System Testing − All the units developed in the implementation
phase are integrated into a system after each unit has been tested. This is an important
step, because errors that are not visible in individual units may appear when the units
are integrated, and these errors need to be corrected. After integration, the entire
system is tested for any faults and failures.

 Deployment of system − Once the functional and non-functional testing is done, the
product is deployed in the customer environment or released into the market.

 Maintenance − Some issues come up only in the client environment, once the system
is running in production. To fix those issues, patches are released. To enhance the
product, improved versions are also released. Maintenance is done to deliver these
changes to the customer environment. It is a never-ending phase: problems arise from
time to time after deployment and need to be solved, hence this phase is referred to
as maintenance.

Phase Containment of Errors: The principle of detecting errors as close to their points of
commitment as possible is known as Phase containment of errors.

The choice of a software development lifecycle is generally made on the basis of its
advantages and disadvantages. For the text-to-speech converter, the SDLC (Software
Development Lifecycle) chosen was the Iterative Waterfall Model, because its
advantages outweighed its disadvantages, making it the more suitable model for
this project.
Advantages of Iterative Waterfall Model
 Feedback Path: In the classical waterfall model, there are no feedback paths, so there
is no mechanism for error correction. In the iterative waterfall model, the feedback path
from each phase to its preceding phase allows errors to be corrected once they are
found, and these corrections are reflected in the later phases.
 Simple: The iterative waterfall model is very simple to understand and use. That is why it is
one of the most widely used software development models.
Drawbacks of Iterative Waterfall Model
 Difficult to incorporate change requests: The major drawback of the iterative
waterfall model is that all the requirements must be clearly stated before the
development phase starts. The customer may change requirements after some time, but the
iterative waterfall model leaves no scope to incorporate change requests that
are made after the development phase starts.
 Incremental delivery not supported: In the iterative waterfall model, the full software
is completely developed and tested before delivery to the customer. There is no scope
for any intermediate delivery, so customers have to wait a long time for the software.
 Overlapping of phases not supported: The iterative waterfall model assumes that one
phase can start only after the previous phase is complete. In real projects, however, phases may
overlap to reduce the effort and time needed to complete the project.
 Risk handling not supported: Projects may suffer from various types of risks, but the
iterative waterfall model has no mechanism for risk handling.
 Limited customer interactions: Customer interaction occurs at the start of the project,
during requirement gathering, and at project completion, during software
delivery. Having so few interactions with the customer may lead to problems, because the
finished software may differ from the customer's actual requirements.

3.1 Methodology of Work

Different libraries used in the project are:

Tkinter - Tkinter is a Python binding to the Tk GUI toolkit. It is the standard Python
interface to the Tk GUI toolkit, and is Python's de facto standard GUI. Tkinter is
included with standard Linux, Microsoft Windows and Mac OS X installs of
Python. The name Tkinter comes from "Tk interface". Tkinter was written by Steen
Lumholt and Guido van Rossum and later revised by Fredrik Lundh. Tkinter is free
software released under a Python license. As with most other
modern Tk bindings, Tkinter is implemented as a Python wrapper around a
complete Tcl interpreter embedded in the Python interpreter. Tkinter calls are
translated into Tcl commands which are fed to this embedded interpreter, thus making
it possible to mix Python and Tcl in a single application. Python 2.7 and Python 3.1
incorporate the "themed Tk" ("ttk") functionality of Tk 8.5. This allows Tk widgets to
be easily themed to look like the native desktop environment in which the application
is running, thereby addressing a long-standing criticism of Tk (and hence of Tkinter).
There are several popular GUI library alternatives available, such
as wxPython, PyQt (PySide), Pygame, Pyglet, and PyGTK.

Creating a GUI application using Tkinter is straightforward. It involves
the following steps −

 Import the Tkinter module.
 Create the GUI application main window.
 Add one or more widgets (such as labels, buttons and text entries) to the GUI
application.
 Enter the main event loop to take action against each event triggered by
the user.
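
The four steps above can be sketched as follows. This is a minimal illustration and not the project's actual interface: the window title, the widgets, and the names character_summary, build_window and main are assumptions made for this example. The tkinter import is placed inside build_window so that the small helper can be checked on a machine without a display.

```python
def character_summary(text):
    """Pure helper used by the button callback (hypothetical, for illustration)."""
    cleaned = text.strip()
    return f"{len(cleaned)} characters entered"


def build_window():
    # Step 1: import the Tkinter module (named tkinter in Python 3).
    import tkinter as tk

    # Step 2: create the GUI application main window.
    root = tk.Tk()
    root.title("Tkinter steps demo")

    # Step 3: add widgets to the window.
    entry = tk.Entry(root, width=40)
    entry.pack(padx=10, pady=5)
    result = tk.Label(root, text="")
    result.pack(padx=10, pady=5)

    def on_click():
        # Runs each time the button is pressed.
        result.config(text=character_summary(entry.get()))

    tk.Button(root, text="Count", command=on_click).pack(pady=5)
    return root


def main():
    # Step 4: enter the main event loop; it returns when the window is closed.
    build_window().mainloop()
```

Calling main() opens the window. Typing into the entry box and pressing the button triggers on_click, which is how the event loop dispatches user actions to the program.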

Pytesseract – Python-tesseract, or Pytesseract, is an Optical Character Recognition
(OCR) tool for Python. That is, it recognizes and "reads" the text embedded in images.
Python-tesseract is a wrapper for Google's Tesseract-OCR Engine. It is also useful as
a stand-alone invocation script to tesseract, as it can read all image types supported by
the Python Imaging Library, including jpeg, png, gif, bmp, tiff, and others, whereas
tesseract-ocr by default only supports tiff and bmp. Additionally, if used as a script,
Python-tesseract will print the recognized text instead of writing it to a file.

Functions

 get_tesseract_version: Returns the Tesseract version installed in the
system.
 image_to_string: Returns the result of a Tesseract OCR run on the image
as a string.
 image_to_boxes: Returns a result containing the recognized characters and
their box boundaries.
 image_to_data: Returns a result containing box boundaries, confidences,
and other information. Requires Tesseract 3.05+. For more information,
please check the Tesseract TSV documentation.
 image_to_osd: Returns a result containing information about orientation
and script detection.
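
As an illustration of image_to_string, the sketch below reads the text from one image. It assumes that the Tesseract engine, the pytesseract package and Pillow are installed. The function names read_image_text and tidy_ocr_text are hypothetical helpers introduced for this example, not part of pytesseract.

```python
def tidy_ocr_text(raw):
    """Collapse the line breaks and stray whitespace that OCR output often contains."""
    return " ".join(raw.split())


def read_image_text(image_path):
    # Third-party imports are kept local so tidy_ocr_text works without them.
    from PIL import Image
    import pytesseract

    # Open the image with Pillow and pass it to Tesseract through the wrapper.
    with Image.open(image_path) as image:
        raw = pytesseract.image_to_string(image)
    return tidy_ocr_text(raw)
```

A call such as read_image_text("sign.png"), where the file name is a placeholder, would return the recognized text as a single line that can then be passed to a speech engine.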

Pyttsx3 – Pyttsx is a text-to-speech conversion library for Python, but it was
originally written only for Python 2, and a TTS library compatible with Python 3
was hard to find.

One alternative, gTTS, works well in Python 3 but needs an
internet connection, since it relies on Google to get the audio data. Pyttsx3, by contrast, is
completely offline, works seamlessly, and supports multiple TTS engines. Its code
is a slightly modified version of the pyttsx module for Python 2.x and is a
clone of westonpace's repo. The purpose of the pyttsx3 repo is to help those who
want an offline TTS library for Python 3 and do not want to port it from Python 2 to
Python 3 themselves.

Usage

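import pyttsx3

engine = pyttsx3.init()  # start the speech driver for this platform

engine.say("I will speak this text")  # queue the text to be spoken

engine.runAndWait()  # block until the queued speech has finished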
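
The multiple-engine support mentioned above can be sketched as follows. The driver names sapi5, nsss and espeak are the ones listed in the pyttsx3 documentation. The helper names pick_driver and speak, and the default rate of 150 words per minute, are assumptions made for this example.

```python
import platform

# Driver names documented by pyttsx3: SAPI5 on Windows, NSSpeechSynthesizer
# on Mac OS X, and eSpeak on other platforms.
DRIVERS = {"Windows": "sapi5", "Darwin": "nsss"}


def pick_driver(system_name):
    """Return the pyttsx3 driver name for a platform.system() value."""
    return DRIVERS.get(system_name, "espeak")


def speak(text, words_per_minute=150):
    import pyttsx3  # third-party; imported only when speech is requested

    engine = pyttsx3.init(driverName=pick_driver(platform.system()))
    engine.setProperty("rate", words_per_minute)  # speaking rate
    engine.say(text)
    engine.runAndWait()
```

Calling pyttsx3.init() with no arguments selects a suitable driver automatically, so naming the driver explicitly is only needed when a particular engine is wanted.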

Software development is not an easy or one-day task. It requires considerable time
and discussion, during which the real need for the software is considered and analysed. The
software was first assessed for feasibility, then the requirements were specified and analysed.
Design followed, and then coding and testing. The Iterative Waterfall
Model was used so that feedback could be given and necessary changes made even after
a module was completed. The steps followed during the
development of the project are described below:

 The first step in developing a project is to establish the need for it,
so we first considered the reason for making this project. As
mentioned earlier, the main goal behind choosing the text-to-speech converter is
to help people. The software helps visually challenged
people to read and understand various documents, and it gives a voice to those who
cannot speak. This technology can also be used for various other purposes, e.g. car
navigation, announcements in railway stations, response services in
telecommunications, and e-mail reading.
 Requirement analysis came next, where we analysed what the project
required and which technology would suit it best. We chose Python for the
project because it is widely used nowadays, so acceptance of the technology
would not be a problem. Moreover, the project would be a great help for teaching,
and for those who are blind or cannot speak, by giving them a voice and an eye for
reading. All of this gave us the green signal to move ahead with the development of
the project. We first thought of a system that would speak what a user types, which
would be a great help for those who cannot speak. Blind people cannot read
documents, and lessons sometimes need to be dictated in a class, which gave us the
need for reading from documents. The text on signs and images also needs
to be read, which gave us the need for reading from images. These requirements
led to a text-to-speech converter with three
modules: first, Interactive Mode, which converts the input typed by a user; second,
Convert from File, which converts text from text and PDF files; and third,
Convert from Image, which converts the text in an image.
 A design was then prepared to show how the project
would look and how the modules would be
presented to the user. This was the most time-consuming part of the project's
development, as the user interface is an essential element of any software. If
software is not convenient for its users, it is not considered good. So a
proper GUI had to be created, one that would be simple to use and would provide
the required output efficiently.
 The design was then corrected and modified many times according to the
suggestions of our friends and mentor. All the changes were meant to make the
system more efficient and more attractive. Colors were modified,
functionalities were added, and the interface was simplified to make it
easier to use.
 After that, the implementation, that is, the coding of the project, was done in
Python. The project was divided into modules, as mentioned earlier: the first
module converts typed text into speech; the second module converts the text of files
into speech and is further divided into the conversion of text files and PDF files; and the
third and last module converts the text in an image to speech. Each module was tested
after it was coded. The modules were then integrated and tested together, and any errors or
unexpected results found during integration were corrected in the code.
New additions were also made at this point. The overall system was then tested again for
correct functioning. Thus, we used Unit Testing, Integration Testing, and System
Testing in the project's development so that the correctness of each individual module and
of the overall system could be verified.
 After the implementation and testing of the software, we had it tested by our
friends to get their reviews. Their suggestions were welcomed, the
required modifications were made, and the software was tested again.
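
The three-module structure described above can be sketched as a small dispatcher. This is an illustrative outline under stated assumptions, not the project's actual code: the function names, the READERS table and the UTF-8 encoding are hypothetical, and PDF reading is left out because the report does not name the PDF library that was used.

```python
from pathlib import Path


def read_typed(text):
    # Interactive Mode: the text is whatever the user typed or pasted.
    return text.strip()


def read_text_file(path):
    # Convert from File: plain-text files only in this sketch.
    return Path(path).read_text(encoding="utf-8").strip()


def read_image(path):
    # Convert from Image: OCR through pytesseract (third-party, needs Tesseract).
    from PIL import Image
    import pytesseract

    with Image.open(path) as image:
        return pytesseract.image_to_string(image).strip()


READERS = {"interactive": read_typed, "file": read_text_file, "image": read_image}


def get_text(mode, source):
    """Return the text for one of the three modes; reject unknown modes."""
    if mode not in READERS:
        raise ValueError(f"unknown mode: {mode!r}")
    return READERS[mode](source)


def convert(mode, source):
    """Fetch the text for the chosen mode and speak it with pyttsx3."""
    import pyttsx3  # third-party, works offline

    text = get_text(mode, source)
    if text:
        engine = pyttsx3.init()
        engine.say(text)
        engine.runAndWait()
    return text
```

Keeping text extraction (get_text) separate from speech output (convert) lets each reader be unit tested without audio hardware, which matches the unit-then-integration testing order described in the steps above.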
4 DESIGN

4.1 Block Diagram


4.2 Flow Diagram
4.3 Process Diagram
4.4 Screenshots

The above screenshot displays the home screen of the text-to-speech converter. This page
provides three options for converting text to speech: first, Interactive Mode, which
converts copied or typed text; second, Convert from File, which converts text and PDF files;
and third, Convert from Image, which converts the text written on an image to speech.

After Interactive Mode is selected, the above frame opens, where the user can paste copied
text or type text in the text area provided. This text is converted to speech
when the user clicks the Convert button.

The above frame appears when Convert from File is chosen in the main menu. The screen
provides the option of converting either a text file or a PDF file to speech. The contents of the file
are displayed in the text area, and the Convert button then gives the audio output.

The above screen appears when a file is to be chosen for conversion to speech.

The above frame opens when Convert from File is clicked in the main menu. Here a text file has been
chosen and its text appears in the text area. After clicking Convert, we can listen to the
audio.

The above screenshot was taken when a PDF file was chosen for conversion to speech. Here the
Convert button converts the text from the PDF file to speech, and the Back button takes
us to the home page.

The above screen is displayed again when we press the Back button at the top right
corner. The Back button lets us jump back from a module to the home page.

The above screen appears when we click the Convert from Image button on the home page.
It gives us the option of browsing for an image by clicking
Upload Image. The Convert button converts the text in the image to audio.

The above screenshot displays the screen after an image has been selected for its text to be
converted to speech. The selected image appears in the space in the middle, as shown in the
figure. Clicking the Convert button then gives the desired output.
5 CONCLUSION AND FUTURE SCOPE

The text-to-speech conversion software is a Windows-based application that reads text
aloud to the user. The software reads a text file, entered text, or a selected image and
looks up the associated pronunciations in its temporary database. The program then reads each word
aloud to the user. The software can be used effectively to read out text and PDF documents, image
text, or entered text, so that the user does not constantly need to look at the screen
and read the entire document, image, or text.

The text-to-speech converter is a recent software project that allows even the visually challenged
to read and understand various documents. Blind people cannot read a document, so this
software can act as an assistant that reads documents out for them. It can
also be a great help for those who cannot speak. A person can simply type what he or she
wants to say, and the software gives them a voice by speaking it aloud.
The user just has to select Interactive Mode, write the message in
the text area, and click the
Convert button. So, this software is not just a step towards future development
but also a boon for those who cannot speak or see. This technology can also be used for
various other purposes, e.g. car navigation, announcements in railway stations, response services in
telecommunications, and e-mail reading. Thus, if we think more innovatively, we can easily
find more applications for it.
