
ARM BASED IMPLEMENTATION OF TEXT-TO-SPEECH (TTS) FOR REAL-TIME

ECE(ES)

CHAPTER 1
INTRODUCTION
1.1 Objective of the project
The objective of this project is to convert written text into speech so that a system can read out any text file in audible form. An embedded Raspberry Pi processor, which supports operating-system porting, is used for this purpose, with an open-source speech synthesis algorithm deployed on it for text-to-speech conversion. This helps not only sighted people who want to listen to audio but especially visually impaired people, since almost all of their information is obtained from speech. Apart from this, the system can be used in railway stations, companies, bus stations, shopping malls and similar places: whatever is entered in the text file is converted into speech, reducing the burden on human announcers.
The objectives of the project include:
1. Converting a given written text file into speech.

1.2 MOTIVATION
Nowadays, listening to audio books on mobile devices is quite common, and in the future we will obtain ever-increasing amounts of information through speech instead of conventional printed material. Instead of reading a book, the user can listen to it through headphones or speakers, and whatever the user writes can likewise be read out as speech. In this project an open-source speech synthesis algorithm is used to convert files from text to speech. People read books at various levels of detail, from close reading to skimming; converting whatever is entered in a text file into a computer-generated voice opens up many opportunities for delivering information in spoken form.


1.3 ORGANIZATION OF DOCUMENTATION


CHAPTER 1: Gives a brief overview of the project, the objective of the project and, finally, the organization of the dissertation.
CHAPTER 2: Brief description of the literature survey for the dissertation.
CHAPTER 3: Explains the block diagram, its functional description and the functionality of each block.
CHAPTER 4: Focuses on the hardware description of the ARM11 processor: features, address space allocation, power supply connection, GPIO configuration, the Raspberry Pi board and the S3C2440A 32-bit RISC microprocessor.
CHAPTER 5: Description of text-to-speech conversion.
CHAPTER 6: Description of the ADC.
CHAPTER 7: Description of the other hardware modules, such as the speakers and power supply.
CHAPTER 8: Detailed explanation of the flowchart and schematics of the various sections of the project.
CHAPTER 9: Software description and source code of the project.
CHAPTER 10: Describes the advantages and applications of the project.
CHAPTER 11: Snapshots of the project.
CHAPTER 12: Hardware component list of the project.
CHAPTER 13: Conclusion and future scope of the project.
CHAPTER 14: Bibliography of the project.


CHAPTER 2
LITERATURE SURVEY
2.1 Introduction
The aim is to design a low-cost embedded text-to-speech device. We also describe an adaptive speech-rate control technology for ultrafast listening that is the audio equivalent of skimming. Nowadays, listening to audio books on mobile devices is quite common, so in the future we will obtain ever-increasing amounts of information through speech instead of conventional printed material. People read books at various levels of detail, from close reading to skimming. Although a similar capability is required to efficiently obtain information from audio sources, there is no tool equivalent to skimming for audio playback.

2.2 Existing System


The existing work presents only a theoretical procedure, without any hardware implementation, which is insufficient for practical exposure; the method used is also complicated and hard to enhance further. The Informedia system used in the existing concept appears ineffective for skimming the speech content of an audio book. Therefore, this method is unsuitable for achieving the aim of listening to speech as quickly and accurately as possible.

2.3 Proposed System


In the proposed system we provide a practical implementation using an embedded device with an ARM architecture processor. With the help of an open-source speech synthesis algorithm, an enhanced version of text-to-speech was developed that lets the user listen to the exact words given in a text file.

The text file is first passed to text analysis, which loads the written text and generates an utterance composed of words. These words are handed to phrasing, which divides them into groups; intonation then scans the phrased text character by character, and duration counts the timing of the characters produced by intonation. Linguistic analysis adapts the processing to the language of the text file and generates an utterance composed of phonemes. Finally, the waveform generator built into the Flite speech synthesizer reads out the phonemes through the audio port.

Using the Qtopia GUI, the user loads a file with the select button; on clicking the play button, the text is passed to the open-source speech synthesis algorithm and speech is generated as output.
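The stage sequence above (text analysis, phrasing, intonation, duration) can be sketched in plain Python. This is purely illustrative: the function names and the per-character timing constant are invented for the example and do not correspond to Flite's actual API.

```python
# Illustrative sketch of the synthesis pipeline described above.
# All names and constants here are hypothetical, not Flite internals.

def text_analysis(text):
    """Split the raw text into an utterance composed of words."""
    return text.split()

def phrasing(words, max_len=5):
    """Group the words into short phrases."""
    return [words[i:i + max_len] for i in range(0, len(words), max_len)]

def intonation(phrases):
    """Walk each phrase character by character (placeholder for pitch marking)."""
    return [" ".join(p) for p in phrases]

def duration(phrases):
    """Estimate a per-phrase duration from character count (purely illustrative)."""
    return [(p, len(p) * 0.08) for p in phrases]  # assume ~80 ms per character

def synthesize(text):
    words = text_analysis(text)
    phrases = phrasing(words)
    marked = intonation(phrases)
    return duration(marked)

if __name__ == "__main__":
    for phrase, secs in synthesize("Welcome to the railway station announcement system"):
        print(f"{phrase!r} -> {secs:.2f} s")
```

The real waveform-generation stage would then turn each timed phoneme sequence into audio samples; that step is what Flite performs internally.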


CHAPTER 3

BLOCK DIAGRAM AND DESCRIPTION

FIGURE 3.1 BLOCK DIAGRAM OF RASPBERRY PI

3.2 Description of the Block diagram


In this project we provide a practical implementation using an embedded device with an ARM architecture processor. With the help of an open-source speech synthesis algorithm, an enhanced version of text-to-speech was developed that lets the user listen to the exact words given in a text file. The Qtopia GUI provides the interface for loading the file.

The ADC is connected through the HDMI port to the LCD.
The speaker output is provided on the Raspberry Pi board.
For the user interface, a mouse and keyboard are connected to the Raspberry Pi board.

3.2.2 MICRO CONTROLLER


In this project the microcontroller plays a major role. Microcontrollers were originally used as components in complicated process-control systems. However, because of their small size and low price, they are now also used in regulators for individual control loops, and in several areas they now outperform their analog counterparts while being cheaper as well.

3.2.3 POWER SUPPLY

The input to the circuit is taken from a regulated power supply. The 230 V AC mains input is stepped down by a transformer to 12 V and fed to a rectifier. The output obtained from the rectifier is a pulsating DC voltage, so it is fed to a filter to remove any AC components that remain after rectification. This voltage is then given to a voltage regulator to obtain a constant DC voltage.

3.2.4 ANALOG-TO-DIGITAL CONVERTER (ADC)


An analog-to-digital converter (ADC, A/D, or A to D) is a device that converts a continuous
physical quantity (usually voltage) to a digital number that represents the quantity's
amplitude.
The conversion involves quantization of the input, so it necessarily introduces a small amount of error. Instead of performing a single conversion, an ADC often performs conversions ("samples" the input) periodically. The result is a sequence of digital values converted from a continuous-time, continuous-amplitude analog signal into a discrete-time, discrete-amplitude digital signal.
3.2.5 SPEAKER
A system's speaker is the component that takes the electronic signal stored on things like
CDs, tapes and DVDs and turns it back into actual sound that we can hear.

3.2.6 TFT Monitor LCD


The LCD controller has a dedicated DMA that fetches image data from a video buffer located in system memory. Its features include:
Dedicated interrupt functions (INT_FrSyn and INT_FiCnt)
System memory used as the display memory
Multiple virtual display screens (hardware horizontal/vertical scrolling)
Programmable timing control for different display panels
Little- and big-endian byte ordering, as well as WinCE data formats
Two types of SEC TFT LCD panel (Samsung 3.5" portrait, 256K color, reflective and transflective a-Si TFT LCD)

CHAPTER 4
RASPBERRY PI PROCESSOR
Released in July 2014, the Model B+ is an updated revision of the Model B. It increases the number of USB ports to 4 and the number of pins on the GPIO header to 40. In addition, it has improved power circuitry, which allows higher-powered USB devices to be attached and hot-plugged. The full-size composite video connector has been removed and its functionality moved to the 3.5 mm audio/video jack, and the full-size SD card slot has been replaced with a much more robust microSD slot.


The following list details some of the improvements over the Model B:
Current monitors on the USB ports mean the B+ now supports hot-plugging.
A current limiter on the 5 V line for HDMI means HDMI-cable-powered VGA converters will now all work.
14 more GPIO pins.
EEPROM readout support for the new HAT expansion boards.
Higher drive capacity for analog audio out, from a separate regulator, giving better audio DAC quality.
No more back-powering problems, due to the USB current limiters, which also inhibit back-flow, together with the "ideal power diode".
Composite output moved to the 3.5 mm jack.
Connectors now moved to two sides of the board rather than the four of the original device.
Ethernet LEDs moved onto the Ethernet connector.
4 squarely positioned mounting holes for more rigid attachment to cases.
The power circuit changes also mean a reduction in power requirements of between 0.5 W and 1 W.

Product Description:
The Raspberry Pi Model B+ incorporates a number of enhancements and new features. Improved power consumption, increased connectivity and greater I/O are among the improvements to this powerful, small and lightweight ARM-based computer.
Specifications
Chip: Broadcom BCM2835 SoC
Core architecture: ARM11
CPU: 700 MHz Low Power ARM1176JZF-S Applications Processor
GPU: Dual Core VideoCore IV Multimedia Co-Processor
Memory: 512 MB SDRAM
Operating System: Boots from a microSD card, running a version of Linux
Dimensions: 85 x 56 x 17 mm
Power: Micro USB socket, 5 V, 2 A
Ethernet: 10/100 BaseT Ethernet socket
Video Output: HDMI (rev 1.3 & 1.4), Composite RCA (PAL and NTSC)
Audio Output: 3.5 mm jack, HDMI
USB: 4 x USB 2.0 connectors
GPIO Connector: 40-pin 2.54 mm (100 mil) expansion header (2x20 strip), providing 27 GPIO pins as well as +3.3 V, +5 V and GND supply lines
Camera Connector: 15-pin MIPI Camera Serial Interface (CSI-2)
JTAG: Not populated
Display Connector: Display Serial Interface (DSI), 15-way flat flex cable connector with two data lanes and a clock lane
Memory Card Slot: Micro SDIO

REVISIONS:
There have been a number of revision changes over the lifetime of the Model B. The B+, despite its dramatic improvements over the B, is simply a new revision, and is expected to be the final one using the BCM2835; it is in effect revision 3 of the board.
Revision 1 is the revision as of the initial launch, whilst revision 2 improved the power and USB circuitry to increase reliability, and also included 2 registration holes that could be used for mounting the device. There have also been minor revision changes during the lifetime of the board to help with manufacture, testing, and production-line BOM (bill of materials) transitions.

4.5.7 THE COMPUTE MODULE


The Compute Module is intended for industrial applications. It is a cut-down device that simply includes the BCM2835, 512 MB of SDRAM and 4 GB of eMMC flash memory in a small form factor. It connects to a base board using a repurposed 200-pin DDR2 SODIMM connector; note that the device is NOT SODIMM compatible, it merely reuses the connector. All the BCM2835 features are exposed via the SODIMM connector, including twin camera and LCD ports, whereas the Model A and B/B+ have only one of each.
The Compute Module is expected to be used by companies wishing to shortcut the development process of a new product: only a base board with appropriate peripherals needs to be developed, with the Compute Module providing the CPU, memory and storage along with tested and reliable software.

4.5.8 Schematics for ModelB+:


Figure 4: Schematics for Model B+

4.5.9 BCM2835:
The Broadcom chip used in the Raspberry Pi Model A, B and B+
The BCM2835 is a cost-optimized, full HD, multimedia applications processor for
advanced mobile and embedded applications that require the highest levels of
multimedia performance. Designed and optimized for power efficiency, BCM2835
uses Broadcom's VideoCore IV technology to enable applications in media
playback, imaging, camcorder, streaming media, graphics and 3D gaming.
Features:
Low Power ARM1176JZ-F Applications Processor

Dual Core VideoCore IV Multimedia Co-Processor


1080p30 Full HD HP H.264 Video Encode/Decode
Advanced Image Sensor Pipeline (ISP) for up to 20-megapixel cameras operating
at up to 220 megapixels per second
Low power, high performance OpenGL-ES 1.1/2.0 VideoCore GPU. 1 Gigapixel
per second fill rate.
High performance display outputs. Simultaneous high resolution LCD and HDMI
with HDCP at 1080p60
Overview
BCM2835 contains the following peripherals which may safely be accessed by
the ARM:
Timers
Interrupt controller
GPIO
USB
PCM / I2S
DMA controller
I2C master
I2C / SPI slave
SPI0, SPI1, SPI2
PWM
UART0, UART1
The purpose of this datasheet is to provide documentation for these peripherals
in sufficient detail to allow a developer to port an operating system to BCM2835.


There are a number of peripherals which are intended to be controlled by the GPU. These are omitted from this datasheet. Accessing these peripherals from the ARM is not recommended.

4.5.10 General Purpose Input/Output pins on the Raspberry Pi:


This section expands on the technical features of the GPIO pins available on the BCM2835 in general. For usage examples, see the GPIO usage section. When reading this section, reference should be made to the BCM2835 ARM Peripherals Datasheet, section 6.


GPIO pins can be configured as either general-purpose input, general-purpose output or as one of up to 6 special alternate settings, the functions of which are pin-dependent.
There are 3 GPIO banks on the BCM2835. Each of the 3 banks has its own VDD input pin; on the Raspberry Pi, all GPIO banks are supplied at 3.3 V. Connecting a GPIO to a voltage higher than 3.3 V will likely destroy the GPIO block within the SoC.
A selection of pins from Bank 0 is available on the P1 header on the Raspberry Pi.
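As an illustration of the function-select mechanism described in the datasheet, the sketch below computes which GPFSEL register and bit field configure a given pin: each 32-bit GPFSELn register holds ten pins at three bits per pin. This is a pure in-memory model, not hardware access; the peripheral base address mentioned in the comment is the documented BCM2835 value.

```python
# Model of BCM2835 GPIO function selection (GPFSEL registers):
# ten pins per 32-bit register, three bits per pin.

FSEL_INPUT, FSEL_OUTPUT = 0b000, 0b001

def gpfsel_location(pin):
    """Return (register index, bit shift) of a pin's 3-bit function field."""
    return pin // 10, (pin % 10) * 3

def set_function(registers, pin, fsel):
    """Update the in-memory register image; on real hardware this word
    would be written to GPFSELn in the GPIO block at 0x20200000."""
    reg, shift = gpfsel_location(pin)
    registers[reg] &= ~(0b111 << shift)   # clear the 3-bit field
    registers[reg] |= fsel << shift
    return registers

regs = [0] * 6                       # GPFSEL0..GPFSEL5 cover the 54 GPIOs
set_function(regs, 17, FSEL_OUTPUT)  # configure GPIO17 as an output
print(hex(regs[1]))                  # pin 17's field lives in GPFSEL1 -> 0x200000
```

On the Pi itself this bookkeeping is normally hidden behind a library, but the register arithmetic is exactly what such libraries perform.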

4.5.11 GPIO PADS


The GPIO connections on the BCM2835 package are sometimes referred to in the peripherals datasheet as "pads", a semiconductor design term meaning "chip connection to the outside world".
The pads are configurable CMOS push-pull output drivers/input buffers. Register-based control settings are available for:
Internal pull-up / pull-down enable/disable
Output drive strength
Input Schmitt-trigger filtering


4.5.12 POWER-ON STATES


All GPIOs revert to general-purpose inputs on power-on reset. The default pull
states are also applied, which are detailed in the alternate function table in the
ARM peripherals datasheet. Most GPIOs have a default pull applied.

4.5.13 INTERRUPTS
Each GPIO pin, when configured as a general-purpose input, can be configured as
an interrupt source to the ARM. Several interrupt generation sources are
configurable:
Level-sensitive (high/low)
Rising/falling edge
Asynchronous rising/falling edge
Level interrupts maintain the interrupt status until the level has been cleared by
system software (e.g. by servicing the attached peripheral generating the
interrupt).
The normal rising/falling edge detection has a small amount of synchronisation built into the detection. At the system clock frequency, the pin is sampled, with the criterion for generation of an interrupt being a stable transition within a 3-cycle window, i.e. a record of "1 0 0" or "0 1 1". Asynchronous detection bypasses this synchronisation to enable the detection of very narrow events.
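The 3-cycle window above can be modelled in a few lines. The sketch below is an illustrative software model of the synchronised detector, not driver code: an event is flagged only when the sampled history is "0 1 1" (rising) or "1 0 0" (falling), so a single-sample glitch produces no interrupt.

```python
# Software model of the synchronised GPIO edge detector described above.

def detect_edges(samples):
    """Return (index, kind) for each stable transition in a 3-sample window."""
    events = []
    for i in range(2, len(samples)):
        window = tuple(samples[i - 2:i + 1])
        if window == (0, 1, 1):
            events.append((i, "rising"))
        elif window == (1, 0, 0):
            events.append((i, "falling"))
    return events

# The lone 1 at index 2 is a glitch and is ignored; the held levels register.
print(detect_edges([0, 0, 1, 0, 0, 1, 1, 1]))  # -> [(4, 'falling'), (6, 'rising')]
```

Asynchronous detection, by contrast, would react to the glitch as well, which is why it exists for very narrow events.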

4.5.14 ALTERNATIVE FUNCTIONS


Almost all of the GPIO pins have alternative functions. Peripheral blocks internal to the BCM2835 can be selected to appear on one or more of a set of GPIO pins; for example, the I2C buses can be routed to at least 3 separate locations. Pad control, such as drive strength or Schmitt filtering, still applies when the pin is configured as an alternate function.


For more detailed information see the low-level peripherals page on the eLinux wiki. There are 54 general-purpose I/O (GPIO) lines split into two banks. All GPIO pins have at least two alternative functions within the BCM2835. The alternate functions are usually peripheral I/O, and a single peripheral may appear in each bank to allow flexibility in the choice of I/O voltage. Details of the alternative functions are given in section 6.2, Alternative Function Assignments.
The block diagram for an individual GPIO pin is given below:

Figure 5: GPIO Block Diagram

The GPIO peripheral has three dedicated interrupt lines. These lines are triggered by the setting of bits in the event detect status register. Each bank has its own interrupt line, with the third line shared between all bits.
The alternate function table also gives the pull state (pull-up/pull-down) which is applied after a power-down.

4.5.15 UART

The BCM2835 device has two UARTs: a mini UART and the PL011 UART. This section describes the PL011 UART; for details of the mini UART see section 2.2, Mini UART. The PL011 UART is a Universal Asynchronous Receiver/Transmitter, the ARM UART (PL011) implementation. The UART performs serial-to-parallel conversion on data characters received from an external peripheral device or modem, and parallel-to-serial conversion on data characters received from the Advanced Peripheral Bus (APB).
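A quick back-of-the-envelope for the serial-to-parallel conversion described above: with a common 8N1 frame (1 start bit, 8 data bits, no parity, 1 stop bit), each character costs 10 bit times on the wire, which fixes the character throughput at a given baud rate. The helper below is illustrative only.

```python
# Character throughput of an asynchronous serial link, given its frame format.

def chars_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Bits per frame = start bit + data + parity + stop bits."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits
    return baud / frame_bits

print(chars_per_second(115200))  # 8N1 at 115200 baud -> 11520.0 chars/s
```

Adding a parity bit (8E1 or 8O1) makes each frame 11 bits and lowers the throughput accordingly.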

4.13.1 S3C2440A - 32-BIT RISC MICROPROCESSOR


Architecture:
Integrated system for hand-held devices and general embedded applications.
16/32-bit RISC architecture and a powerful instruction set with the ARM920T CPU core.
Enhanced ARM architecture MMU to support WinCE, EPOC 32 and Linux.
Instruction cache, data cache, write buffer and physical-address TAG RAM to reduce the effect of main memory bandwidth and latency on performance.
The ARM920T CPU core supports the ARM debug architecture.
Internal Advanced Microcontroller Bus Architecture (AMBA).


CHAPTER 5

TEXT TO SPEECH CONVERSION


5.1 Speech synthesis flow
The text file is first passed to text analysis, which loads the written text and generates an utterance composed of words. These words are handed to phrasing, which divides them into groups; intonation then scans the phrased text character by character, and duration counts the timing of the characters produced by intonation. Linguistic analysis adapts the processing to the language of the text file and generates an utterance composed of phonemes. Finally, the waveform generator built into the Flite speech synthesizer reads out the phonemes through the audio port.
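In practice the whole chain above is driven by invoking the Flite engine on the text file. The sketch below shows one hedged way to do that from Python via the `flite` command-line binary, assuming it is installed (e.g. `sudo apt-get install flite`) and assuming its common `-f` (input file) and `-o` (output WAV) options; the file names are illustrative.

```python
# Hedged sketch: rendering a text file to speech with the Flite CLI.
import os
import shutil
import subprocess

def speak_file(text_path, wav_path="/tmp/tts_out.wav"):
    """Build (and, when the binary and input file exist, run) a Flite
    command that renders the text file to a WAV file."""
    cmd = ["flite", "-f", text_path, "-o", wav_path]
    if shutil.which("flite") and os.path.exists(text_path):
        subprocess.run(cmd, check=True)   # actually synthesise
    return cmd

print(speak_file("story.txt"))
```

The resulting WAV file can then be played through the board's audio port with any player.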
[Figure: text analysis → phrasing → intonation → duration → linguistic analysis → waveform generation; text enters as an utterance composed of words, becomes an utterance composed of phonemes, and leaves as speech]

FIGURE 5.1 SPEECH SYNTHESIS FLOW

CHAPTER 6

ADC & MONITOR INTERFACE


6.1 OVERVIEW
The 10-bit CMOS ADC (analog-to-digital converter) is a recycling-type device with 8-channel analog inputs. It converts the analog input signal into 10-bit binary digital codes at a maximum conversion rate of 500 KSPS with a 2.5 MHz A/D converter clock. The A/D converter operates with an on-chip sample-and-hold function, and a power-down mode is supported. The monitor interface can control/select the pads (XP, XM, YP, YM) of the monitor for X/Y position conversion; it contains the monitor pad control logic and the ADC interface logic with interrupt generation logic.

6.2 FEATURES

Resolution: 10-bit
Differential Linearity Error: 1.0 LSB
Integral Linearity Error: 2.0 LSB
Maximum Conversion Rate: 500 KSPS
Low Power Consumption
Power Supply Voltage: 3.3V
Analog Input Range: 0 ~ 3.3V
On-chip sample-and-hold function
Normal Conversion Mode
Separate X/Y position conversion Mode
Auto(Sequential) X/Y Position Conversion Mode
Waiting for Interrupt Mode
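A worked example for the converter specified above: a 10-bit ADC with a 0 to 3.3 V input range maps the input onto one of 2^10 = 1024 codes, so one LSB is about 3.22 mV. The ideal transfer function can be written directly:

```python
# Ideal transfer function of the 10-bit, 3.3 V ADC described above.

VREF, BITS = 3.3, 10
LEVELS = 2 ** BITS            # 1024 codes, numbered 0..1023

def adc_code(vin):
    """Quantise an input voltage to the nearest output code."""
    vin = min(max(vin, 0.0), VREF)            # clamp to the input range
    return min(int(vin / VREF * LEVELS), LEVELS - 1)

lsb_mv = VREF / LEVELS * 1000  # size of one code step in millivolts
print(adc_code(1.65), round(lsb_mv, 2))       # -> 512 3.22
```

Mid-scale input (1.65 V) lands on code 512, and the 3.22 mV step size is the quantization granularity behind the "small amount of error" discussed below.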


An analog-to-digital converter (ADC, A/D, or A to D) is a device that converts a continuous


physical quantity (usually voltage) to a digital number that represents the quantity's
amplitude.
The conversion involves quantization of the input, so it necessarily introduces a small amount of error. Instead of performing a single conversion, an ADC often performs conversions ("samples" the input) periodically. The result is a sequence of digital values converted from a continuous-time, continuous-amplitude analog signal into a discrete-time, discrete-amplitude digital signal.
An ADC works by sampling the value of the input at discrete intervals in time. Provided that the input is sampled above the Nyquist rate, defined as twice the highest frequency of interest, all frequencies in the signal can be reconstructed. If frequencies above half the sampling rate are present, they are incorrectly detected as lower frequencies, a process referred to as aliasing. Aliasing occurs because sampling a signal at two or fewer times per cycle results in missed cycles, and therefore the appearance of an incorrectly lower frequency. For example, a 2 kHz sine wave sampled at 1.5 kHz would be reconstructed as a 500 Hz sine wave.
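The 2 kHz example can be checked numerically: at a 1.5 kHz sample rate, the sample instants of a 2 kHz sine coincide exactly with those of a 500 Hz sine (2000 − 1500 = 500), so the two are indistinguishable after sampling.

```python
# Numeric check of the aliasing example above: a 2 kHz sine sampled at
# 1.5 kHz produces the same sample values as a 500 Hz sine.
import math

fs, f_true, f_alias = 1500.0, 2000.0, 500.0
for n in range(6):
    t = n / fs
    s_true = math.sin(2 * math.pi * f_true * t)
    s_alias = math.sin(2 * math.pi * f_alias * t)
    assert abs(s_true - s_alias) < 1e-9       # identical at every sample
print("2 kHz at fs = 1.5 kHz is indistinguishable from 500 Hz")
```

This is exactly why the anti-aliasing filter discussed next must remove content above half the sampling rate before conversion.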
To avoid aliasing, the input to an ADC must be low-pass filtered to remove frequencies above
half the sampling rate. This filter is called an anti-aliasing filter, and is essential for a
practical ADC system that is applied to analog signals with higher frequency content. In
applications where protection against aliasing is essential, oversampling may be used to
greatly reduce or even eliminate it.
An ADC is defined by its bandwidth (the range of frequencies it can measure) and its signal
to noise ratio (how accurately it can measure a signal relative to the noise it introduces). The
actual bandwidth of an ADC is characterized primarily by its sampling rate, and to a lesser
extent by how it handles errors such as aliasing. The dynamic range of an ADC is influenced
by many factors, including the resolution (the number of output levels it can quantize a signal
to), linearity and accuracy (how well the quantization levels match the true analog signal)
and jitter (small timing errors that introduce additional noise). The dynamic range of an ADC
is often summarized in terms of its effective number of bits (ENOB), the number of bits of
each measure it returns that are on average not noise. An ideal ADC has an ENOB equal to its
resolution. ADCs are chosen to match the bandwidth and required signal-to-noise ratio of the signal to be quantized. If an ADC operates at a sampling rate greater than twice the bandwidth of the signal, then perfect reconstruction is possible given an ideal ADC and neglecting quantization error. The presence of quantization error limits the dynamic range of even an ideal ADC; however, if the dynamic range of the ADC exceeds that of the input signal, its effects may be neglected, resulting in an essentially perfect digital representation of the input signal.
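The ENOB figure mentioned above is conventionally computed from the measured signal-to-noise-and-distortion ratio (SINAD, in dB) as ENOB = (SINAD − 1.76) / 6.02; an ideal N-bit converter has SINAD = 6.02·N + 1.76 dB, so its ENOB equals its resolution.

```python
# ENOB from measured SINAD, per the standard formula quoted above.

def enob(sinad_db):
    return (sinad_db - 1.76) / 6.02

ideal_10bit_sinad = 6.02 * 10 + 1.76      # 61.96 dB for an ideal 10-bit ADC
print(round(enob(ideal_10bit_sinad), 2))  # -> 10.0
```

A real converter measuring, say, 56 dB SINAD would deliver only about 9 effective bits despite its 10-bit resolution.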
An ADC may also provide an isolated measurement such as an electronic device that converts
an input analog voltage or current to a digital number proportional to the magnitude of the
voltage or current. However, some non-electronic or only partially electronic devices, such
as rotary encoders, can also be considered ADCs. The digital output may use different coding
schemes. Typically the digital output will be a two's complement binary number that is
proportional to the input, but there are other possibilities. An encoder, for example, might
output a Gray code.


CHAPTER 7

HARDWARE FUNCTIONAL MODULES


7.1 SPEAKERS
A system's speaker is the component that takes the electronic signal stored on things like
CDs, tapes and DVDs and turns it back into actual sound that we can hear.
To understand how speakers work, one needs to understand how sound works. Inside the human ear there is a very thin piece of skin called the eardrum. When the eardrum vibrates, the brain interprets the vibrations as sound; rapid changes in air pressure are the most common cause of eardrum vibration.

An object produces sound when it vibrates in air (sound can also travel through liquids and
solids, but air is the transmission medium when we listen to speakers). When something
vibrates, it moves the air particles around it. Those air particles in turn move the air particles
around them, carrying the pulse of the vibration through the air as a travelling disturbance.
To see how this works, a simple vibrating object, a bell, is considered. When a bell is rung,
the metal vibrates, flexes in and out, rapidly. When it flexes out on one side, it pushes out on
the surrounding air particles on that side. These air particles then collide with the particles in
front of them, which collide with the particles in front of them and so on.
When the bell flexes away, it pulls in on these surrounding air particles, creating a drop in
pressure that pulls in on more surrounding air particles, which creates another drop in
pressure that pulls in particles that are even farther out and so on. This decreasing of pressure
is called rarefaction. In this way, a vibrating object sends a wave of pressure fluctuation
through the atmosphere. When the fluctuation wave reaches the human ear, it vibrates the
eardrum back and forth. Thus, the brain interprets this motion as sound.

7.1.1 MONO TYPE SPEAKER


Monotype Speakers use similar drivers arranged at the small end of a cone-type structure.
The most traditional type of speaker is the horn. The driver is attached to a wave-guide
structure. This type of speaker offers a high degree of sensitivity and transmits sound
efficiently in large areas.
7.1.2 DYNAMIC SPEAKER
The most common type of speaker, these devices are typically passive. They generally have one or more woofer drivers to produce low-frequency sound, also known as bass, and one or more tweeter drivers to produce high-frequency sound, or treble. Professional audio dynamic speakers that offer higher performance may also have drivers on the rear of the speaker enclosure to further reinforce the sound.
7.1.3 SUBWOOFER SPEAKER
Subwoofers are one-driver dynamic loudspeakers with a single woofer driver. The speaker enclosure typically includes a bass port to increase low-frequency performance. These speakers transmit bass, or low-frequency sound, and are also used to enhance the bass from the main speakers in a multi-speaker system, delivering bass enhancement without compromising other sound at higher frequencies.
7.2 POWER SUPPLY

The input to the circuit is taken from a regulated power supply. The 230 V AC mains input is stepped down by a transformer to 12 V and fed to a rectifier. The output obtained from the rectifier is a pulsating DC voltage, so it is fed to a filter to remove any AC components that remain after rectification. This voltage is then given to a voltage regulator to obtain a constant DC voltage.

Figure 7.2 POWER SUPPLY

7.2.1 Transformer:

Usually, DC voltages of 5 V, 9 V or 12 V are required to operate electronic equipment, but these voltages cannot be obtained directly from the mains. The 230 V AC mains input must therefore be brought down to the required voltage level, which is done by a transformer. Thus, a step-down transformer is employed to decrease the voltage to the required level.

7.2.2 Rectifier:
The output from the transformer is fed to the rectifier, which converts AC into pulsating DC. The rectifier may be a half-wave or a full-wave type; in this project, a bridge rectifier is used because of its merits such as good stability and full-wave rectification.

7.2.3 Filter:
A capacitive filter is used in this project. It removes the ripple from the rectifier output and smooths the DC. The output received from this filter is constant as long as the mains voltage and load are held constant; if either varies, the DC voltage at this point changes. Therefore a regulator is applied at the output stage.
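For a full-wave rectifier on 50 Hz mains, the residual ripple after the capacitive filter is commonly approximated as V_r ≈ I / (2·f·C). The sketch below applies this rule of thumb; the 0.5 A load and 1000 µF capacitor are illustrative values, not taken from this design.

```python
# Rule-of-thumb ripple estimate for the capacitive filter described above:
# full-wave rectification charges the capacitor at twice the mains frequency.

def ripple_voltage(load_current, capacitance, mains_hz=50.0):
    """Approximate peak-to-peak ripple V_r = I / (2 * f * C)."""
    return load_current / (2 * mains_hz * capacitance)

# Example: a 0.5 A load on a 1000 uF capacitor
print(round(ripple_voltage(0.5, 1000e-6), 2))  # -> 5.0 (volts of ripple)
```

A 5 V ripple on a 12 V rail would be far too large, which is exactly why the regulator stage that follows is needed (or a larger capacitor chosen).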

7.2.4 Voltage regulator:


As the name implies, a voltage regulator regulates the input applied to it: it is an electrical device designed to automatically maintain a constant voltage level. This project requires supplies of 5 V and 12 V; to obtain these levels, 7805 and 7812 voltage regulators are used. The prefix 78 indicates a positive supply, and the suffixes 05 and 12 indicate the required output voltage levels.


CHAPTER 8

FLOW CHART
Start
Raspberry Pi processor
HDMI cable to TFT LCD monitor and speaker
Qtopia TTS (text to speech)
Select the desired file from the desktop
Play / Stop
End

CHAPTER 9

SOFTWARE DESCRIPTION

The procedure has to be carried out on an Ubuntu platform.

1) sudo apt-get install build-essential automake autoconf libtool
2) Using the Synaptic Package Manager, install the following libraries on your system:
git-core
ia32-libs

gcc-arm-linux-gnueabi
aptitude
libncurses5-dev
autoconf
libtool
libqt4-dev
qt4-dev-tools
libusb-dev

3) sudo apt-get install dh-autoreconf realpath

SETTING UP AND CONFIGURING RASPBERRY PI FRIENDLYARM ON UBUNTU

The first software we need on our Ubuntu system is minicom. Minicom is a serial
terminal, much like HyperTerminal. Install minicom first:
sudo apt-get install minicom
Now connect the board to our system: two connections, one from the serial port and the next
from the Type B USB plug, must be made to the system. On the FriendlyARM board the
NAND/NOR switch must be placed in the NOR position.
Now invoke minicom from the terminal using the command
sudo minicom
Usually you will get something like this on your terminal:


The problem is that minicom still needs to be configured with the correct port. Use the
command dmesg; it lists several other messages as well, so you have to observe closely which
port the board is connected to. Here a serial-to-USB converter is used, so the port detected in
my case is ttyUSB0; if you connect to the serial port itself, it will be different.
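As a sketch of reading the dmesg output, the device name can be pulled out of a captured line like this (the log line below is a made-up example, not real output from this board):

```shell
# a captured dmesg line (hypothetical example text)
line="usb 1-1: FTDI USB Serial Device converter now attached to ttyUSB0"
# extract the tty device name and build the full device path
port=$(printf '%s\n' "$line" | grep -o 'tty[A-Za-z0-9]*')
echo "/dev/$port"
# -> /dev/ttyUSB0
```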

Kill minicom using the command

sudo pkill minicom
then invoke minicom using
sudo minicom -s
It will give a screen like this:


Select 'Serial port setup' and hit Enter.

There, change the serial device to the port detected; in my case it is /dev/ttyUSB0 (the
number may vary). We can edit that by selecting option A. Also double-check that both
software and hardware flow control are set to No.

Press Enter, then select 'Save setup as dfl'.

After that, exit from minicom and try our first command:
sudo minicom

If everything went correctly, you will get a boot loader prompt like this. This is the
SuperVivi boot loader preinstalled in memory.

SETTING UP SUPERVIVI, KERNEL & QTOPIA IN RASPBERRY PI

Download usbpush - http://www.friendlyarm.net/dl.php?file=usbpush.zip


The remaining files to be loaded into the board can be found in the provided CD, in the
images folder. Extract usbpush; this utility helps us push the image files from our system to
the target. When we extract it, we can see one more usbpush folder; inside that folder
there will be a usbpush binary. Give executable permission to that binary:
sudo chmod +x usbpush/usbpush/usbpush

Get to the boot loader using the command

sudo minicom

Select x and format the NAND flash.

Now the next step is to load SuperVivi, the boot loader, onto the target board using
usbpush.
First select V.

Only after this should you push the file from the host system. Push the file like this:
sudo ./usbpush/usbpush/usbpush supervivi_20100818/supervivi-128M


If the push is successful, the minicom prompt will look like this and will return to the
boot loader.

And the usbpush prompt will look like this:

So SuperVivi is successfully installed. Next is zImage. Kernel images generated by the kernel
build process are either uncompressed Image files or compressed zImage files. Here we
select zImage_W35 because the RASPBERRY PI is used with a 3.5" display.
Now select the K option to download the Linux kernel and push the extracted zImage:
sudo ./usbpush/usbpush/usbpush linux-zImage_20110421/zImage_W35
After this step the basic Linux kernel image will be loaded into memory.
Now we need to download the root YAFFS image. YAFFS (Yet Another Flash File System) is
now in its second generation and provides a fast, robust file system for NAND and NOR flash.
Select Y to download the YAFFS image and push the Qtopia image file.


It will take some time.

Now that we have completed all the steps, select b to boot the system.
After this, select boot. The board will then boot and we will get the FriendlyARM prompt.


Now calibrate the screen by clicking on the crosshairs shown, and you will be taken to the
Qtopia screen. By default Qtopia will be in Chinese; in the second tab there is a flag symbol.
Select it to change the language, select English, and we are done.

SETTING UP THE CROSS-COMPILATION TOOLCHAIN

First we need some packages installed on our system:

sudo apt-get install build-essential make ncurses gcc g++ libncurses5 libncurses5-dev

After that we need to download the package, or it can be found in the provided CD, in the
Linux folder. Download the file provided by us and follow the commands below.


Configuring ARM gcc

sudo cp toolchain.tgz /usr/local/
cd /usr/local/
sudo tar xvf toolchain.tgz
sudo chmod -R ugo+rw toolchain

Then come to your home folder and there type

sudo gedit .bashrc
At the bottom of this file, paste the path of the toolchain's bin folder (obtain it by running
pwd in the folder where you stored the toolchain); for example:
export PATH=/usr/local/toolschain/4.4.3/bin/:$PATH
export CROSS_COMPILE=arm-none-linux-gnueabi-

After this, restart the system, then open a terminal and type arm-linux-gcc -v; you should
receive the output of the gcc version. If so, you have successfully installed ARM gcc.
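A quick way to confirm that the toolchain's bin directory actually made it onto PATH (the directory below is the one exported above; the check itself is our addition):

```shell
# check whether a directory is on PATH before expecting arm-linux-gcc to resolve
dir=/usr/local/toolschain/4.4.3/bin
PATH="$dir:$PATH"
case ":$PATH:" in
  *":$dir:"*) on_path=yes ;;
  *)          on_path=no  ;;
esac
echo "toolchain on PATH: $on_path"
```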

RUNNING A C PROGRAM ON RASPBERRY PI

We create a simple Hello World program in C. This is the program:

#include <stdio.h>
int main() {
    printf("Hello World\n");
    return 0;
}
Save it as hello.c. Now compile it for our target, the Raspberry pi. Open a terminal, change
directory to the folder where the C program is present, and issue this command:
arm-linux-gcc hello.c -o hello
The arm-linux-gcc command is available because we set up the toolchain earlier. Now a
binary named hello will be present in the folder. Use the file command to check the type of
the binary; it will be suitable for ARM.

Move the binary to the Raspberry pi using a pen drive or SD card. To see the output, execute
the binary like this:
./hello

You can directly download the files provided by us, save them in /usr/local/, and set the
permissions; the details below are for your presentation purposes.

Install Qt4.6.3 in Raspberry pi



The way of installing Qt4.6.3 in Raspberry pi is listed below.


At first, you should have a general view of the work:
1. tslib compilation (tslib is the library that makes the touch screen work)
2. Qt 4.6.3 compilation
3. Copy the tslib and Qt4.6.3 libraries onto the Raspberry pi board.
4. Configure the environment on the Raspberry pi board.
5. Run a Qt example program.

1. tslib compilation
$sudo mkdir /usr/local/tslib
$git clone http://github.com/kergoth/tslib.git
$export CROSS_COMPILE=arm-none-linux-gnueabi-
$export CC=${CROSS_COMPILE}gcc
$export CFLAGS=-march=armv4t
$export CXX=${CROSS_COMPILE}"g++"
$export AR=${CROSS_COMPILE}"ar"
$export AS=${CROSS_COMPILE}"as"
$export RANLIB=${CROSS_COMPILE}"ranlib"
$export LD=${CROSS_COMPILE}"ld"
$export STRIP=${CROSS_COMPILE}"strip"
$export ac_cv_func_malloc_0_nonnull=yes
$cd tslib
$./autogen-clean.sh
$./autogen.sh
$./configure --host=arm-linux --prefix=/usr/local/tslib --enable-shared=yes --enable-static=yes
$make
$sudo make install
#after successful tslib compilation
root@laptop:/usr/local/tslib

root@laptop: ls
bin etc include lib
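All of the tool variables exported above hang off one prefix; a minimal sketch of the pattern (variable names as in the tslib steps, no compiler actually required):

```shell
# every cross tool name is the prefix plus the familiar native tool name
export CROSS_COMPILE=arm-none-linux-gnueabi-
export CC=${CROSS_COMPILE}gcc
export CXX=${CROSS_COMPILE}g++
export STRIP=${CROSS_COMPILE}strip
echo "$CC / $CXX / $STRIP"
```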
2. Qt4.6.3 compilation
#get Qt4.6.3 from the link http://download.qt-project.org/archive/qt/4.6/
#copy to home folder

$tar -zxvf qt-everywhere-opensource-src-4.6.3.tar.gz


$cd qt-everywhere-opensource-src-4.6.3
$cd mkspecs/common/
$gedit g++.conf
#change
QMAKE_CFLAGS_RELEASE += -O2
#into
QMAKE_CFLAGS_RELEASE += -O0
#save the file


$cd /usr/local/qt-everywhere-opensource-src-4.6.3/mkspecs/qws/linux-arm-g++/
$gedit qmake.conf

/************************************************************
Change the file as below. Note that the path /usr/local/arm/4.3.2/ is the path where you
installed the toolchain, i.e. go to /usr/local/, run pwd in the toolchain directory, copy the
path, and replace the path mentioned above.
************************************************************/
#
# qmake configuration for building with arm-linux-g++
#

include(../../common/g++.conf)
include(../../common/linux.conf)

include(../../common/qws.conf)

# modifications to g++.conf
QMAKE_CC = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-gcc -msoft-float -D__GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_CXX = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-g++ -msoft-float -D__GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_LINK = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-g++ -msoft-float -D__GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts
QMAKE_LINK_SHLIB = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-g++ -msoft-float -D__GCC_FLOAT_NOT_NEEDED -march=armv4t -mtune=arm920t -O0 -lts

# modifications to linux.conf
QMAKE_AR = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-ar cqs
QMAKE_OBJCOPY = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-objcopy
QMAKE_STRIP = /usr/local/arm/4.3.2/bin/arm-none-linux-gnueabi-strip

QMAKE_INCDIR += /home/tslib/include/
QMAKE_LIBDIR += /home/tslib/lib/

QMAKE_CFLAGS_RELEASE += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_DEBUG += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_MT += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_MT_DBG += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_MT_DLL += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_MT_DLLDBG += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_SHLIB += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_THREAD += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_WARN_OFF += -march=armv4t -mtune=arm920t
QMAKE_CFLAGS_WARN_ON += -march=armv4t -mtune=arm920t

QMAKE_CXXFLAGS_DEBUG += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_MT += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_MT_DBG += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_MT_DLL += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_MT_DLLDBG += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_RELEASE += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_SHLIB += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_THREAD += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_WARN_OFF += -march=armv4t -mtune=arm920t
QMAKE_CXXFLAGS_WARN_ON += -march=armv4t -mtune=arm920t

load(qt_config)
#then, save the file. Continue on the console:
$sudo mkdir /usr/local/Qt
$cd qt-everywhere-opensource-src-4.6.3

$./configure -embedded arm -xplatform qws/linux-arm-g++ -prefix /usr/local/Qt -qt-mouse-tslib -little-endian -no-webkit -no-qt3support -no-cups -no-largefile -optimized-qmake -no-openssl -nomake tools -qt-sql-sqlite -no-3dnow -system-zlib -qt-gif -qt-libtiff
-qt-libpng -qt-libmng -qt-libjpeg -no-opengl -gtkstyle -no-openvg -no-xshape -no-xsync
-no-xrandr -qt-freetype -qt-gfx-linuxfb -qt-kbd-tty -qt-kbd-linuxinput
-qt-mouse-linuxinput
#choose 'o' for the Open Source Edition
#choose 'yes' to accept the license offer
#then wait for about 5 minutes
$make

#it will take nearly one hour on my dual-core 1.7 GHz laptop, too long!
$make install
#Ok, if the compilation is done, you will see the result files in /usr/local/Qt.
3. Copy the tslib and Qt4.6.3 libraries onto the Raspberry pi board.

#I use an SD card or pen drive to copy files from Ubuntu to the Raspberry pi board. The
tslib library, the Qt libraries, and the Qt example program that will run on the board
should all be copied.

#copy the Qt libraries and example programs

$cd /usr/local/Qt/lib
$cp *4.6.3 /media/pen drive name/
$cp -r fonts/ /media/pen drive name/
$cd /usr/local/Qt
$cp -r demos/ /media/pen drive name/
#copy tslib into the pen drive
$mkdir -p /media/pen drive name/tslib/lib/
$cd /home/tslib/lib/
$cp -r * /media/pen drive name/tslib/lib/

#place your SD card or pen drive into the Raspberry pi board and copy all files to the board.
#in the Raspberry pi console
$mkdir -p /usr/local/Qt/lib/
$cp /sdcard/*4.6.3 /usr/local/Qt/lib/ (or) cp /udisk/*4.6.3 /usr/local/Qt/lib/
#rename all *.4.6.3 in /usr/local/Qt/lib into *.4, for example
$mv libQtCore.so.4.6.3 libQtCore.so.4
$cp -r /sdcard/fonts/ /usr/local/Qt/lib/ (or) cp -r /udisk/fonts/ /usr/local/Qt/lib/
$cp -r /sdcard/demos/ /mnt/ (or) cp -r /udisk/demos/ /mnt/
$cp -r /sdcard/tslib/ /usr/local/ (or) cp -r /udisk/tslib/ /usr/local/
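The per-file rename described above can be done in one loop; a sketch, demonstrated here on throwaway files (the /tmp directory and dummy filenames are stand-ins for /usr/local/Qt/lib on the board):

```shell
# create dummy library files and bulk-rename *.so.4.6.3 to *.so.4
mkdir -p /tmp/qtlib && cd /tmp/qtlib
touch libQtCore.so.4.6.3 libQtGui.so.4.6.3
for f in lib*.so.4.6.3; do
  mv "$f" "${f%.4.6.3}.4"   # strip the .4.6.3 suffix, append .4
done
ls
# -> libQtCore.so.4  libQtGui.so.4
```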

4. Configure the environment on the Raspberry pi board.


The instructions below stop the Raspberry pi board from loading its desktop, so the board
boots only to the black screen with the Linux logo and a few log lines. We stop the
board's desktop so that the Qt example programs do not conflict with it for the screen.
$cd /etc/init.d/
$vi rcS
#comment out the last three lines:

#bin/qtopia&
#echo" "> /dev/tty1
#echo "Starting Qtopia, please wait..." > /dev/tty1


#Configure the environment in Raspberry pi:
$cd /etc/
$vi profile
#add those lines at the end of file
export LD_LIBRARY_PATH=/usr/local/tslib/lib
export QTDIR=/usr/local/Qt
export QWS_MOUSE_PROTO=tslib:/dev/input/event0
export TSLIB_CALIBFILE=/etc/pointercal
export TSLIB_CONFFILE=/usr/local/etc/ts.conf
export TSLIB_CONSOLEDEVICE=none
export TSLIB_FBDEVICE=/dev/fb0
export TSLIB_PLUGINDIR=/usr/local/tslib/lib/ts
export TSLIB_TSDEVICE=/usr/local/tslib/lib/ts
export TSLIB_TSEVENTTYPE=INPUT
export QWS_DISPLAY=LinuxFB:mmWidth=105:mmHeight=140
#save it.


SOURCE CODE OF THE PROJECT


Main.cpp
#include "mainwindow.h"
#include <QApplication>
int main(int argc, char *argv[])
{
QApplication a(argc, argv);
MainWindow w;
w.show();
return a.exec();
}

Mainwindow.cpp
#include "mainwindow.h"
#include "ui_mainwindow.h"
#include<festival.h>
#include<QtSpeech.h>
#include<QtSpeech>
#include<QtSpeech_unx.h>
#include <QApplication>
#include<signal.h>
#include<sys/types.h>
#include<stdlib.h>
//QtSpeech voice;
QString path;
MainWindow::MainWindow(QWidget *parent) :
QMainWindow(parent),
ui(new Ui::MainWindow)
{
ui->setupUi(this);

QString sPath = "/";


dirmodel=new QFileSystemModel(this);
dirmodel->setRootPath(sPath);
ui->treeView->setModel(dirmodel);
//festival_say_text("hello world");
// voice.say("hello world");
}
MainWindow::~MainWindow()
{
delete ui;
}
void MainWindow::on_treeView_clicked(const QModelIndex &index)
{
QString sPath=dirmodel->fileInfo(index).absoluteFilePath();
ui->textEdit->setText(sPath);
path=sPath;
}
void MainWindow::on_pushButton_clicked()
{
QtSpeech voice;
voice.say(path);
//voice.tell(path, &a, SLOT(quit()));
}

void MainWindow::on_pushButton_2_clicked()
{
// close = new QtSpeech;

// close->~QtSpeech();
// close->thread();
// close->say(path);
//QApplication::quit();
system("/home/t1/./a.out&");
}

Mainwindow.h
#ifndef MAINWINDOW_H
#define MAINWINDOW_H
#include <QMainWindow>
#include <QMainWindow>
#include <QDialog>
#include <QtCore>
#include <QtGui>
#include <QString>
#include <QDebug>
#include <QtSpeech>
#include <QtCore>
#include <QtSpeech>
#include <QtSpeech_unx.h>
#include <festival.h>
#include <QPushButton>

namespace Ui
{
class MainWindow;
}
class MainWindow : public QMainWindow

{
Q_OBJECT
public:
explicit MainWindow(QWidget *parent = 0);
~MainWindow();
private slots:
void on_treeView_clicked(const QModelIndex &index);
void on_pushButton_clicked();
void on_pushButton_2_clicked();
private:
Ui::MainWindow *ui;
QFileSystemModel *dirmodel;
QApplication *a;
QtSpeech *close;
};
#endif // MAINWINDOW_H

QtSpeech_unx.cpp
#include <QtCore>
#include <QtSpeech>
#include <QtSpeech_unx.h>
#include <festival.h>
namespace QtSpeech_v1 { // API v1.0
// some defines for throwing exceptions
#define Where QString("%1:%2:").arg(__FILE__).arg(__LINE__)
#define SysCall(x,e) {\

int ok = x;\
if (!ok) {\
QString msg = #e;\
msg += ":"+QString(__FILE__);\
msg += ":"+QString::number(__LINE__)+":"+#x;\
throw e(msg);\
}\
}
// qobject for speech thread
bool QtSpeech_th::init = false;
void QtSpeech_th::say(QString text) {
try {
if (!init) {
int heap_size = FESTIVAL_HEAP_SIZE;
festival_initialize(true,heap_size);
init = true;
}
has_error = false;
EST_String est_text(text.toUtf8());
SysCall(festival_say_file(est_text), QtSpeech::LogicError);
}
catch(QtSpeech::LogicError e) {
has_error = true;
err = e;
}
emit finished();
}
// internal data
class QtSpeech::Private {
public:
Private()

:onFinishSlot(0L) {}
VoiceName name;
static const QString VoiceId;
const char * onFinishSlot;
QPointer<QObject> onFinishObj;
static QPointer<QThread> speechThread;
};
QPointer<QThread> QtSpeech::Private::speechThread = 0L;
const QString QtSpeech::Private::VoiceId = QString("festival:%1");
// implementation
QtSpeech::QtSpeech(QObject * parent)
:QObject(parent), d(new Private)
{
VoiceName n = {Private::VoiceId.arg("english"), "English"};
if (n.id.isEmpty())
throw InitError(Where+"No default voice in system");
d->name = n;
}
QtSpeech::QtSpeech(VoiceName n, QObject * parent)
:QObject(parent), d(new Private)
{
if (n.id.isEmpty()) {
VoiceName def = {Private::VoiceId.arg("english"), "English"};
n = def;
}
if (n.id.isEmpty())
throw InitError(Where+"No default voice in system");


d->name = n;
}
QtSpeech::~QtSpeech()
{
//if ()
delete d;
}
const QtSpeech::VoiceName & QtSpeech::name() const {
return d->name;
}
QtSpeech::VoiceNames QtSpeech::voices()
{
VoiceNames vs;
VoiceName n = {Private::VoiceId.arg("english"), "English"};
vs << n;
return vs;
}
void QtSpeech::tell(QString text) const {
tell(text, 0L,0L);
}
void QtSpeech::tell(QString text, QObject * obj, const char * slot) const
{
if (!d->speechThread) {
d->speechThread = new QThread;
d->speechThread->start();
}
d->onFinishObj = obj;
d->onFinishSlot = slot;

if (obj && slot)


connect(const_cast<QtSpeech *>(this), SIGNAL(finished()), obj, slot);
QtSpeech_th * th = new QtSpeech_th;
th->moveToThread(d->speechThread);
connect(th, SIGNAL(finished()), this, SIGNAL(finished()), Qt::QueuedConnection);
connect(th, SIGNAL(finished()), th, SLOT(deleteLater()), Qt::QueuedConnection);
QMetaObject::invokeMethod(th, "say", Qt::QueuedConnection, Q_ARG(QString,text));
}
void QtSpeech::say(QString text) const
{
if (!d->speechThread) {
d->speechThread = new QThread;
d->speechThread->start();
}
QEventLoop el;
QtSpeech_th th;
th.moveToThread(d->speechThread);
connect(&th, SIGNAL(finished()), &el, SLOT(quit()), Qt::QueuedConnection);
QMetaObject::invokeMethod(&th, "say", Qt::QueuedConnection, Q_ARG(QString,text));
el.exec();
if (th.has_error)
throw th.err;
}
void QtSpeech::timerEvent(QTimerEvent * te)
{
QObject::timerEvent(te);
}
} // namespace QtSpeech_v1

CHAPTER 10
ADVANTAGES AND APPLICATIONS OF THE PROJECT
10.1 ADVANTAGES OF THE PROJECT

Converts a written text file to speech.
Allows sighted people to listen to books as audio.
Helps visually impaired people access written information.
Supports reading at various levels of detail, from close reading to skimming.

10.2 LIMITATIONS OF THE PROJECT

Cannot train our own voice.
Cannot read PDF files because of empty spaces.

10.3 APPLICATIONS

Railway stations
Companies
Hospitals
Shopping malls
Schools and colleges

CHAPTER 11

SNAPSHOTS OF THE PROJECT



FIGURE 11.1 RASPBERRY PI BOARD


FIGURE 11.2 RASPBERRY PI BOARD INTERFACED WITH THE LCD MONITOR


FIGURE 11.3 TEXT TO SPEECH MAIN WINDOW


FIGURE 11.4 HOME DIRECTORY IN THE MAIN WINDOW


FIGURE 11.5 TEXT FILE IN THE HOME DIRECTORY


CHAPTER 12

HARDWARE COMPONENT LIST


S.NO    COMPONENT NAME    QUANTITY
1       Speaker
2       MSBI Port
3       USB Port
4       Monitor LCD


CHAPTER 13
CONCLUSION & FUTURE SCOPE

13.1 CONCLUSION
We have implemented an efficient speech synthesis system, which can be further improved
in the future by using more advanced processors.

13.2 FUTURE SCOPE

We can train our own voice using the speech synthesis algorithm instead of the
computer-generated voice.


CHAPTER 14

BIBLIOGRAPHY
1. www.wikipedia.com
2. www.allaboutcircuits.com
3. www.microchip.com
4. www.howstuffworks.com

Books referred:
1. Raj Kamal, Microcontrollers: Architecture, Programming, Interfacing and System Design.
2. Mazidi and Mazidi.
3. PCB Design Tutorial, David L. Jones.
4. Embedded C, Michael J. Pont.

IEEE papers referred:
1. Liu Zhing-xuan et al., "Research on Remote Wireless Monitoring System Based on
GPRS and MCU", ICCP 2010, pp. 392-394.
2. Yunfang Hao, "GPRS signaling network structure of the interface", Xi'an Institute of
Posts and Telecommunications, 2002, 7(1):46-49.
3. Dai Jia et al., C-51 Microcontroller Application Design, Electronic Industry Press, 2007.
4. Zhangdui Zhong, GPRS: General Packet Radio Service, Beijing: Posts & Telecom
Press, 2001.
5. Srisuresh, P., Holdrege, M., "IP Network Address Translator (NAT) Terminology and
Considerations", RFC 2663, 1999.
6. Gleeson, B., Lin, A., Heinanen, J., Armitage, G., Malis, A., "A Framework for IP Based."
