
2011 4th International Conference on Mechatronics (ICOM), 17-19 May 2011, Kuala Lumpur, Malaysia

Voice Recognition Based Wireless Home Automation System


Humaid AlShueili, Gourab Sen Gupta, Subhas Mukhopadhyay
School of Engineering and Advanced Technology Massey University, Turitea Campus, Palmerston North, New Zealand humaid.shueili@gmail.com, g.sengupta@massey.ac.nz, S.C.Mukhopadhyay@massey.ac.nz

Abstract: The home automation industry is growing rapidly, fuelled by the need to provide supporting systems for the elderly and the disabled, especially those who live alone. At the same time, the world population is ageing. Home automation systems must comply with household standards and be convenient to use. This paper details the overall design of a wireless home automation system (WHAS) which has been built and implemented. The automation centres on recognition of voice commands and uses low-power RF ZigBee wireless communication modules, which are relatively cheap. The system is intended to control all lights and electrical appliances in a home or office using voice commands. The system has been tested and verified. The verification tests included a voice recognition response test, an indoor ZigBee communication test, and compression and decompression tests of DPCM (Differential Pulse Code Modulation) speech signals. The tests involved a mix of 35 male and female subjects with different English accents. Each person sent 35 different voice commands, for a total of 1225 commands, of which 79.8% were recognised correctly.

Keywords: Home automation, ZigBee transceivers, voice streaming, ADC, Differential Pulse Code Modulation (DPCM), voice recognition.

Figure 1: uControl Home Security, Monitoring and Automation (SMA) [3].

I. INTRODUCTION
The demography of the world population shows that the elderly population worldwide is increasing rapidly as a result of the increase in average life expectancy [1]. Caring for and supporting this growing population is a concern for governments and nations around the globe [2]. Home automation is one of the major growing industries that can change the way people live. Some home automation systems target those seeking luxury and sophisticated home automation platforms; others target those with special needs, such as the elderly and the disabled. The aim of the reported Wireless Home Automation System (WHAS) is to provide those with special needs with a system that can respond to voice commands and control the on/off status of electrical devices, such as lamps, fans and televisions, in the home. The system should be reasonably cheap, easy to configure, and easy to run.

There have been several commercial and research projects on smart homes and voice recognition systems. Figure 1 shows an integrated platform for home security, monitoring and automation (SMA) from uControl [3]. The system is a 7-inch touch screen that can be connected wirelessly to security alarms and other home appliances. Home automation through this system requires holding and interacting with a large panel, which constrains the physical movements of the user [4]. Another popular commercially available system for home automation is from Home Automated Living (HAL) [5]. HAL software taps the power of an existing PC to control the home and provides a speech command interface. A big advantage of this system is that it can send commands all over the house using the existing electrical wiring inside the home's walls. Because no new wires are needed, HAL is easy and inexpensive to install. However, most of the products sold in the market are heavily priced and often require a significant home makeover.

The rest of the paper is organised as follows: Section II provides a system overview. The hardware design is detailed in Section III, while the software design is detailed in Section IV. The experimental results are discussed in Section V. The paper concludes by looking at the future research and development work required to make the system more versatile.

978-1-61284-437-4/11/$26.00 ©2011 IEEE

II. SYSTEM OVERVIEW
The Wireless Home Automation System (WHAS) is an integrated system that provides elderly and disabled people with an easy-to-use home automation system that can be fully operated by speech commands. The system is constructed in a way that is easy to install, configure, run, and maintain. The functional blocks of the overall system are shown in Figure 2.

Figure 3: Functional block diagram of the Wireless Home Automation System (WHAS). Legend: A: Analogue, D: Digital

The system consists of three modules:

- Handheld Microphone Module, which incorporates a microphone with an RF module (ZigBee protocol).
- Central Controller Module (PC based).
- Appliance Control Modules.

Figure 2: Sequence of activities in the Wireless Home Automation System

Figure 3 illustrates the sequence of activities in the WHAS. The voice is captured using a microphone, sampled, filtered and converted to digital data using an analogue-to-digital converter. The data are then compressed and sent serially as packets of binary data. At the receiving end (Central Controller Module), the binary data are converted to analogue, filtered and passed to the computer through the sound card. A Visual Basic application program running on the PC uses the Microsoft Speech API library for voice recognition. Upon recognition of a command, control characters are sent wirelessly to the specified appliance address. Consequently, appliances can be turned ON or OFF depending on the control characters received.

III. HARDWARE DESIGN
In this section we present the hardware descriptions of the three modules that constitute the WHAS.

A. Handheld Microphone Module (MM)
The components of the microphone module are shown in Figure 4. The system captures human voice using a sampling rate ($f_s$) of 8 kHz. Although the highest frequency component of the human voice is 20 kHz, the most significant part of the information is encoded in frequencies between 6 Hz and 3.5 kHz [6]. To meet the Nyquist sampling criterion, an anti-aliasing filter is used to block all frequencies above the Nyquist frequency ($F_n$), where

$f_s = 2 F_n$    (1)
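As a quick check of equation (1) against the figures quoted above, the substitution below is a worked example only; the 0.5 kHz margin is derived here and is not stated in the text.

```latex
F_n = \frac{f_s}{2} = \frac{8\,\text{kHz}}{2} = 4\,\text{kHz},
\qquad
F_n - 3.5\,\text{kHz} = 0.5\,\text{kHz}
```

The speech band of interest therefore ends just below the Nyquist frequency, leaving roughly 0.5 kHz of transition band for the anti-aliasing filter.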

Figure 6: Block diagram of the Central Controller Module.

Figure 4: Block diagram of the handheld Microphone Module.

The incoming speech wave goes through a low-pass filter (Figure 5). A 3-pole Butterworth low-pass filter is used as the anti-aliasing filter [7]. The signal is then amplified in order to utilise the full range of the ADC. A voltage divider and a DC blocking capacitor provide a voltage translation from the filters to the ADC. In the microcontroller, the signal is first converted to digital format using the built-in ADC and then compressed using the Differential Pulse Code Modulation (DPCM) algorithm, which reduces each sample from 12 bits to 6 bits. The data are sent serially from the microcontroller to the ZigBee RF module at a baud rate of 115200 bits/s. This is the maximum configurable baud rate provided by ZigBee [8].
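To give a feel for how much a 3-pole Butterworth filter attenuates around the Nyquist frequency, the sketch below evaluates the standard Butterworth magnitude response, |H(f)| = 1 / sqrt(1 + (f/fc)^(2n)) with n = 3. The cutoff frequency is not given in the text, so `ASSUMED_CUTOFF_HZ` is a placeholder set at the 3.5 kHz band edge, and the function names are illustrative only.

```python
import math


def butterworth_gain(freq_hz, cutoff_hz, order=3):
    """Magnitude response of an ideal Butterworth low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


def gain_db(gain):
    """Convert a linear voltage gain to decibels."""
    return 20.0 * math.log10(gain)


ASSUMED_CUTOFF_HZ = 3500.0   # assumption: the paper does not state the cutoff
NYQUIST_HZ = 8000.0 / 2      # half of the 8 kHz sampling rate

# Print the attenuation in the passband, at the cutoff, at Nyquist and at fs.
for freq in (1000.0, ASSUMED_CUTOFF_HZ, NYQUIST_HZ, 8000.0):
    attenuation = gain_db(butterworth_gain(freq, ASSUMED_CUTOFF_HZ))
    print(f"{freq:7.0f} Hz: {attenuation:7.2f} dB")
```

Under this assumed cutoff, the attenuation at the Nyquist frequency is only a few decibels greater than at the cutoff itself, which is typical for a low-order filter with a narrow transition band.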
Figure 5: Portable microphone circuit.

B. Central Controller Module
The functional blocks of the central controller module are shown in Figure 6. When data are received at the central controller module (coordinator), the received bytes are decompressed using the DPCM algorithm [9]. The decompressed data are passed to the digital-to-analogue converter (DAC). The analogue output of the DAC is filtered and fed to the PC as an analogue signal through its sound card. The filter and amplifier circuit is shown in Figure 7.

Figure 7: Filtering and amplification circuit of the received audio.

C. Appliance Control Module
Once the speech commands are recognised, control characters are sent to the specified appliance address through the ZigBee communication protocol. Each appliance to be controlled has a relay control circuit, shown in Figure 8.

Figure 8: Circuit schematic for appliance control module

IV. SOFTWARE DESIGN
Software design includes ADC sampling and compression/decompression algorithms, transmission and receiving, and voice recognition.

A. ADC sampling and data compression/decompression
The portable microphone module implements a DPCM compression scheme. This compression algorithm is inherently lossy because of the quantisation error it introduces. The algorithm compresses each 12-bit ADC sample down to a 6-bit code. This code represents the difference between the actual sample and the predicted value of the sample. The predicted sample is obtained from the result of the previous iteration. The difference between the sample and the predicted value is then quantised. The 6-bit codes are then packed into bytes in order to send them serially. To calculate the new predicted value, the compression algorithm decodes the quantised difference and adds it to the current predicted value.
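The scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the firmware used in the paper: the 64-entry `STEP_TABLE`, the mid-scale starting predictor of 2048, and the layout that packs four 6-bit codes into three bytes are all hypothetical choices made for this example.

```python
import math

# Assumed 64-entry table of quantised differences, indexed by the 6-bit code.
# Codes 0..31 are non-negative steps, codes 32..63 are negative steps.
STEP_TABLE = [2 * i * i for i in range(32)] + [-2 * i * i for i in range(1, 33)]
ADC_MAX = 4095  # largest 12-bit sample value


def quantise(diff):
    """Return the 6-bit code whose table entry is closest to diff."""
    return min(range(64), key=lambda code: abs(STEP_TABLE[code] - diff))


def clamp(value):
    """Keep the predictor inside the 12-bit ADC range."""
    return max(0, min(ADC_MAX, value))


def dpcm_encode(samples, start=2048):
    """Encode 12-bit samples as 6-bit codes of quantised prediction errors."""
    predicted, codes = start, []
    for sample in samples:
        code = quantise(sample - predicted)
        codes.append(code)
        # Mirror the decoder: update the predictor with the decoded difference.
        predicted = clamp(predicted + STEP_TABLE[code])
    return codes


def dpcm_decode(codes, start=2048):
    """Rebuild approximate samples by accumulating the decoded differences."""
    predicted, samples = start, []
    for code in codes:
        predicted = clamp(predicted + STEP_TABLE[code])
        samples.append(predicted)
    return samples


def pack_codes(codes):
    """Pack four 6-bit codes into three bytes, padding the tail with zero codes."""
    padded = list(codes) + [0] * (-len(codes) % 4)
    packed = bytearray()
    for i in range(0, len(padded), 4):
        a, b, c, d = padded[i:i + 4]
        packed += ((a << 18) | (b << 12) | (c << 6) | d).to_bytes(3, "big")
    return bytes(packed)


# 100 Hz test tone sampled at 8 kHz, centred on mid-scale.
tone = [round(2048 + 1000 * math.sin(2 * math.pi * 100 * n / 8000)) for n in range(160)]
codes = dpcm_encode(tone)
decoded = dpcm_decode(codes)
worst_error = max(abs(a - b) for a, b in zip(tone, decoded))
print(len(tone), "samples ->", len(pack_codes(codes)), "bytes, worst error", worst_error)
```

Because the encoder updates its predictor with the decoded difference rather than the true one, the encoder and decoder stay in step and the quantisation error does not accumulate from sample to sample.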

Figure 9: DPCM compression algorithm.

Figure 9 shows the DPCM compression algorithm. At the receiving end, the data are decompressed back to their original 12-bit form using the DPCM decompression algorithm. Figure 10 shows the decoding algorithm, which matches the received code with the quantised difference and adds this difference to the predictor [10].

Figure 10: DPCM decompression algorithm.

B. Voice Recognition Application
The voice recognition application uses the Microsoft Speech API. The application compares incoming speech with a predefined dictionary. The Microsoft Speech API run-time environment relies on two main engines: Automatic Speech Recognition (ASR engine) and Text To Speech (TTS engine), as shown in Figure 11. ASR implements the Fast Fourier Transform (FFT) to compute the spectrum of the fingerprint data [4]. Comparing the fingerprint with an existing database returns a string of the text being spoken. This string is represented by a control character that is sent to the corresponding appliance's address.

Figure 11: Voice recognition application hierarchy.

The graphical user interface (GUI) lets the user select the desired serial communication port and provides a record of all the commands that have been recognised and executed. The application implements the hierarchy shown in Figure 11 and the flow chart shown in Figure 12. Making the GUI user friendly was a high priority in its design, since the target users need to avoid any possible complications in the system. A screenshot of the GUI is shown in Figure 13. Control characters corresponding to the recognised commands are then sent serially from the central controller module to the appliance control modules that are connected to the home appliances.

Figure 12: Flow chart of the voice recognition application.

Figure 13: Voice recognition GUI

Figure 14: Microphone circuit board with ZigBee module
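The mapping from a recognised text string to a control character and appliance address, as described in this section, might look like the sketch below. The phrases, addresses, control characters, and the uppercase-means-ON convention are all hypothetical; the paper does not list them, and the actual application is written in Visual Basic rather than Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlMessage:
    address: int       # hypothetical address of an appliance control module
    control_char: str  # single character sent to that module


# Hypothetical command table: recognised phrase -> message to transmit.
COMMAND_TABLE = {
    "lamp on": ControlMessage(0x0001, "A"),
    "lamp off": ControlMessage(0x0001, "a"),
    "fan on": ControlMessage(0x0002, "B"),
    "fan off": ControlMessage(0x0002, "b"),
}


def lookup_command(recognised_text: str) -> Optional[ControlMessage]:
    """Return the message for a recognised phrase, or None if it is unknown."""
    return COMMAND_TABLE.get(recognised_text.strip().lower())


def relay_state(control_char: str) -> bool:
    """Appliance side (assumed convention): uppercase means ON, lowercase OFF."""
    return control_char.isupper()


for phrase in ("Lamp ON", "fan off", "open door"):
    message = lookup_command(phrase)
    if message is None:
        print(f"{phrase!r}: not in dictionary, nothing transmitted")
    else:
        state = "ON" if relay_state(message.control_char) else "OFF"
        print(f"{phrase!r}: send {message.control_char!r} to {message.address:#06x} ({state})")
```

Returning `None` for unknown phrases keeps the transmit path simple: the caller sends nothing unless a dictionary entry matched.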

C. ZigBee RF communication
ZigBee is the communication protocol used in this system. ZigBee offers a maximum data rate of 250 kbps; however, 115200 bps was used for sending and receiving, as this was the highest speed at which the UART of the microcontroller could be programmed to operate. Each byte transmitted is framed by a start bit and a stop bit, so only 8 of every 10 bits carry data. Hence the actual data rate (bits/s) is:

$115200 \times \frac{8}{10} = 92160$

The amount of data (bits/s) produced by the ADC, which takes 8000 samples per second at 12 bits each, is:

$8000 \times 12 = 96000$

Figure 15: Fabricated relay control unit

Streaming is therefore not possible without compressing the voice data [11]. After compression to 6 bits per sample, the total resultant data rate (bits/s) is:

$8000 \times 6 = 48000$
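The data-rate budget described in this section can be checked with a few lines of arithmetic. The sketch assumes 8 data bits per UART frame with no parity bit, which the text implies by mentioning only a start and a stop bit; the constant and function names are illustrative.

```python
UART_BAUD = 115200       # bits/s on the serial link (from the text)
FRAME_BITS = 10          # start bit + 8 data bits + stop bit (assumed 8N1 framing)
DATA_BITS = 8
SAMPLE_RATE_HZ = 8000    # ADC sampling rate (from the text)
ADC_BITS = 12            # bits per raw sample
DPCM_BITS = 6            # bits per compressed sample


def payload_rate(baud, data_bits=DATA_BITS, frame_bits=FRAME_BITS):
    """Bits per second of actual data once start and stop bits are removed."""
    return baud * data_bits // frame_bits


link_rate = payload_rate(UART_BAUD)
raw_rate = SAMPLE_RATE_HZ * ADC_BITS
compressed_rate = SAMPLE_RATE_HZ * DPCM_BITS
headroom = link_rate - compressed_rate

print("link payload rate:", link_rate, "bits/s")
print("raw ADC rate:     ", raw_rate, "bits/s, fits:", raw_rate <= link_rate)
print("compressed rate:  ", compressed_rate, "bits/s, fits:", compressed_rate <= link_rate)
print("headroom:         ", headroom, "bits/s")
```

The raw stream exceeds the link payload rate, while the compressed stream leaves spare capacity on the link.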

This allows a window for error checking and resending data if necessary.

V. EXPERIMENTAL RESULTS AND DISCUSSIONS
The prototype of the system has been fabricated and tested. Figure 14 shows the microphone module and Figure 15 shows the appliance control module.

Figure 16: Results of voice recognition experiments showing percentage of correct recognition for different ethnicity/accent

The graph in Figure 16 and the data in Table I show the response of the speech recognition application to spoken commands. The tests involved 35 subjects, a mix of male and female, with different English accents. Each person sent 35 different voice commands, so the test involved a total of 1225 commands, of which 79.8% were recognised correctly. When a command is not recognised correctly, the software ignores it and does not transmit any signal to the appliance control modules. The accuracy of the recognition can be affected by background noise, the speed of the speaker, and the clarity of the spoken accent. These factors need to be studied in more detail by conducting more tests.

The system was tested in an apartment and performed well up to 40 m. With clear line-of-sight transmission (such as in a wide open gymnasium), the reception was accurate up to 80 m. Additional tests are being planned involving a bigger variety of commands.
TABLE I: RESULTS OF VOICE COMMAND RECOGNITION TESTS: PERCENTAGE OF COMMANDS CORRECTLY RECOGNISED

Category        Person 1  Person 2  Person 3  Person 4  Person 5  Average
Kiwi            68        86        57.1      90        85        77.3
Arab            85.7      80        88        77        60        78.1
Filipino        88.6      85        80        90        77        84.1
Pacific Island  70.3      77        80        67        90        78.7
Japan           67.7      70        74        78.6      76        73.3
Thai            75.8      81.8      85        82        68        78.5
African         96.7      88        85        90        82        88.3
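The headline figures can be cross-checked against Table I. The paper does not state how the overall 79.8% was computed; one reading consistent with the reported numbers, taken here as an assumption, is the unweighted mean of the seven category averages.

```python
# Category averages exactly as reported in the "Average" column of Table I.
REPORTED_AVERAGES = {
    "Kiwi": 77.3,
    "Arab": 78.1,
    "Filipino": 84.1,
    "Pacific Island": 78.7,
    "Japan": 73.3,
    "Thai": 78.5,
    "African": 88.3,
}

SUBJECTS = 35              # test subjects (from the text)
COMMANDS_PER_SUBJECT = 35  # voice commands sent by each subject (from the text)

total_commands = SUBJECTS * COMMANDS_PER_SUBJECT
# Assumed method: unweighted mean of the category averages.
overall_rate = sum(REPORTED_AVERAGES.values()) / len(REPORTED_AVERAGES)
best = max(REPORTED_AVERAGES, key=REPORTED_AVERAGES.get)
worst = min(REPORTED_AVERAGES, key=REPORTED_AVERAGES.get)

print("total commands:", total_commands)
print("overall recognition rate: %.1f%%" % overall_rate)
print("highest category average:", best)
print("lowest category average: ", worst)
```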

VI. CONCLUSIONS AND FUTURE WORK
A home automation system based on voice recognition was built and implemented. The system is targeted at elderly and disabled people. The prototype developed can control electrical devices in a home or office. The system implements Automatic Speech Recognition through the Microsoft Speech API. The wireless network is implemented using ZigBee RF modules, chosen for their efficiency and low power consumption. Multimedia streaming through the network was implemented with the help of the Differential Pulse Code Modulation (DPCM) compression algorithm, which compresses the speech data to half of its original size. The preliminary test results are promising.

Future work will entail:

- Adding confirmation commands to the voice recognition system.
- Integrating variable control functions to improve the system's versatility, such as control commands other than ON/OFF, for example "Increase Temperature" or "Dim Lights".
- Integration of GSM or a mobile server to operate the system from a distance.
- Design and integration of an online home control panel.

REFERENCES
[1] T. Birtley, (2010) Japan debates care for elderly. [Cited 21/09/2010]. Available: http://www.youtube.com/watch?v=C0UTqfigSec
[2] Population Division, DESA, United Nations. (2009). World Population Ageing: Annual report 2009. [Cited 29/07/2010]. Available: http://www.un.org/esa/population/publications/WPA2009/WPA2009_WorkingPaper.pdf
[3] (2010) uControl Home security system website. [Cited 2010 14th Oct]. Available: http://www.itechnews.net/2008/05/20/ucontrol-home-security-system/
[4] R. Gadalla, Voice Recognition System for Massey University Smarthouse, M. Eng thesis, Massey University, Auckland, New Zealand, 2006.
[5] (2010) Home Automated Living website. [Cited 2010 14th Oct]. Available: http://www.homeautomatedliving.com/default.htm
[6] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, New Jersey, US: Prentice Hall Inc, 1978.
[7] XBee-2.5-Manual, ZigBee RF communication protocol. (2008). Minnetonka: Digi International Inc.
[8] B. Yuksekkaya, A. A. Kayalar, M. B. Tosun, M. K. Ozcan, and A. Z. Alkar, A GSM, Internet and Speech Controlled Wireless Interactive Home Automation System, IEEE Transactions on Consumer Electronics, vol. 52, pp. 837-843, August 2006.
[9] F. J. Owens, Signal Processing of Speech, New York, US: McGraw-Hill Inc, 1993.
[10] Voice Recorder Reference Design (AN 278), Silicon Laboratories, 2006.
[11] D. Brunelli, M. Maggiorotti, L. Benini, and F. L. Bellifemine, Analysis of Audio Streaming Capability of Zigbee Networks, in EWSN 2008, 2008, LNCS 4913, pp. 189-204.
