
Communications

in Computer and Information Science

193

Ajith Abraham Jaime Lloret Mauri


John F. Buford Junichi Suzuki
Sabu M. Thampi (Eds.)

Advances in Computing
and Communications
First International Conference, ACC 2011
Kochi, India, July 22-24, 2011
Proceedings, Part IV

Volume Editors
Ajith Abraham
Machine Intelligence Research Labs (MIR Labs)
Auburn, WA, USA
E-mail: ajith.abraham@ieee.org
Jaime Lloret Mauri
Polytechnic University of Valencia
Valencia, Spain
E-mail: jlloret@dcom.upv.es
John F. Buford
Avaya Labs Research
Basking Ridge, NJ, USA
E-mail: john.buford@gmail.com
Junichi Suzuki
University of Massachusetts
Boston, MA, USA
E-mail: jxs@acm.org
Sabu M. Thampi
Rajagiri School of Engineering and Technology
Kochi, India
E-mail: smthampi@acm.org

ISSN 1865-0929
e-ISSN 1865-0937
e-ISBN 978-3-642-22726-4
ISBN 978-3-642-22725-7
DOI 10.1007/978-3-642-22726-4
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: Applied for
CR Subject Classification (1998): C.2, H.4, I.2, H.3, D.2, J.1, K.6.5

© Springer-Verlag Berlin Heidelberg 2011


This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply,
even in the absence of a specific statement, that such names are exempt from the relevant protective laws
and regulations and therefore free for general use.
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The First International Conference on Advances in Computing and Communications (ACC 2011) was held in Kochi during July 22-24, 2011. ACC 2011 was organized by Rajagiri School of Engineering & Technology (RSET) in association with the Association for Computing Machinery (ACM) - SIGWEB, Machine Intelligence Research Labs (MIR Labs), International Society for Computers and Their Applications, Inc. (ISCA), All India Council for Technical Education (AICTE), Indira Gandhi National Open University (IGNOU), Kerala State Council for Science, Technology and Environment (KSCSTE), Computer Society of India (CSI) - Div IV and Cochin Chapter, The Institution of Electronics and Telecommunication Engineers (IETE), The Institution of Engineers (India) and Project Management Institute (PMI), Trivandrum, Kerala Chapter. Established in 2001, RSET is a premier professional institution striving for holistic
excellence in education to mould young, vibrant engineers.
ACC 2011 was a three-day conference which provided an opportunity to
bring together students, researchers and practitioners from both academia and
industry. ACC 2011 was focused on advances in computing and communications
and it attracted many local and international delegates, presenting a balanced
mixture of intellects from the East and from the West. ACC 2011 received 592 research papers from 38 countries including Albania, Algeria, Bangladesh, Brazil,
Canada, Colombia, Cyprus, Czech Republic, Denmark, Ecuador, Egypt, France,
Germany, India, Indonesia, Iran, Ireland, Italy, Korea, Kuwait, Malaysia, Morocco, New Zealand, P.R. China, Pakistan, Rwanda, Saudi Arabia, Singapore,
South Africa, Spain, Sri Lanka, Sweden, Taiwan, The Netherlands, Tunisia, UK,
and USA. This clearly reflects the truly international stature of ACC 2011. All
papers were rigorously reviewed internationally by an expert technical review
committee comprising more than 300 members. The conference had a peer-reviewed program of technical sessions, workshops, tutorials, and demonstration
sessions.
There were several people that deserve appreciation and gratitude for helping
in the realization of this conference. We would like to thank the Program Committee members and additional reviewers for their hard work in reviewing papers
carefully and rigorously. After careful discussions, the Program Committee selected 234 papers (acceptance rate: 39.53%) for presentation at the conference.
We would also like to thank the authors for having revised their papers to address
the comments and suggestions by the referees.
The conference program was enriched by the outstanding invited talks by
Ajith Abraham, Subir Saha, Narayan C. Debnath, Abhijit Mitra, K. Chandra
Sekaran, K. Subramanian, Sudip Misra, K.R. Srivathsan, Jaydip Sen, Joyati
Debnath and Junichi Suzuki. We believe that ACC 2011 delivered a high-quality,
stimulating and enlightening technical program. The tutorials covered topics of
great interest to the cyber forensics and cloud computing communities. The tutorial by Avinash Srinivasan provided an overview of the forensically important
artifacts left behind on a Mac computer. In his tutorial on Network Forensics, Bhadran provided an introduction to network forensics, packet capture
and analysis techniques, and a discussion on various RNA tools. The tutorial on
Next-Generation Cloud Computing by Pethuru Raj focused on enabling technologies in cloud computing.
The ACC 2011 conference program also included five workshops: International Workshop on Multimedia Streaming (MultiStreams 2011), Second International Workshop on Trust Management in P2P Systems (IWTMP2PS 2011),
International Workshop on Cloud Computing: Architecture, Algorithms and
Applications (CloudComp 2011), International Workshop on Identity: Security,
Management and Applications (ID 2011) and International Workshop on Applications of Signal Processing (I-WASP 2011). We thank all the workshop organizers as well as the Workshop Chair, El-Sayed El-Alfy, for their efforts in bringing out successful workshops. We would like to express our gratitude to
the Tutorial Chairs Patrick Seeling, Jaydeep Sen, K.S. Mathew, and Roksana
Boreli and Demo Chairs Amitava Mukherjee, Bhadran V.K., and Janardhanan
P.S. for their timely expertise in reviewing the proposals. Moreover, we thank
Publication Chairs Pruet Boonma, Sajid Hussain and Hiroshi Wada for their
kind help in editing the proceedings. The large participation in ACC 2011 would
not have been possible without the Publicity Co-chairs Victor Govindaswamy,
Arun Saha and Biju Paul.
The proceedings of ACC 2011 are organized into four volumes. We hope
that you will find these proceedings to be a valuable resource in your professional, research, and educational activities whether you are a student, academic,
researcher, or a practicing professional.
July 2011

Ajith Abraham
Jaime Lloret Mauri
John F. Buford
Junichi Suzuki
Sabu M. Thampi

Organization

ACC 2011 was jointly organized by the Department of Computer Science and Engineering and Department of Information Technology, Rajagiri School of Engineering and Technology (RSET), Kochi, India, in cooperation with ACM/SIGWEB.

Organizing Committee
Chief Patrons
Fr. Jose Alex CMI
Fr. Antony Kariyil CMI

Manager, RSET
Director, RSET

Patron
J. Isaac, Principal

RSET

Advisory Committee
A. Krishna Menon
A.C. Mathai
Fr. Varghese Panthalookaran
Karthikeyan Chittayil
Vinod Kumar, P.B.
Biju Abraham
Narayamparambil
Kuttyamma A.J.
Asha Panicker
K. Rajendra Varmah
P.R. Madhava Panicker
Liza Annie Joseph
Varkey Philip
Fr. Joel George Pullolil
R. Ajayakumar Varma
K. Poulose Jacob
H.R. Mohan, Chairman
Soman S.P., Chairman
S. Radhakrishnan, Chairman

RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
RSET
KSCSTE
Cochin University of Science & Technology
Div IV, Computer Society of India (CSI)
Computer Society of India (CSI), Cochin
Chapter
Kerala State Centre, The Institution of
Engineers (India)


Steering Committee
John F. Buford
Rajkumar Buyya
Mukesh Singhal
John Strassner
Junichi Suzuki
Ramakrishna Kappagantu
Achuthsankar S. Nair

Avaya Labs Research, USA


University of Melbourne, Australia
University of Kentucky, USA
Pohang University of Science and Technology,
Republic of Korea
University of Massachusetts, Boston, USA
IEEE India Council
Centre for Bioinformatics, Trivandrum, India

Conference Chair
Sabu M. Thampi

Rajagiri School of Engineering and Technology,


India

ACC 2011 Program Committee Chairs


General Co-chairs
Ajith Abraham
Chandra Sekaran K.
Waleed W. Smari

Machine Intelligence Research Labs, Europe


National Institute of Technology Karnataka,
India
University of Dayton, Ohio, USA

Program Co-chairs
Jaime Lloret Mauri
Thorsten Strufe
Gregorio Martinez

Polytechnic University of Valencia, Spain


Darmstadt University of Technology, Germany
University of Murcia, Spain

Special Sessions and Workshops Co-chairs


El-Sayed El-Alfy
Silvio Bortoleto
Tutorial Co-chairs
Patrick Seeling
Jaydeep Sen
K.S. Mathew
Roksana Boreli

King Fahd University of Petroleum and


Minerals, Saudi Arabia
Positivo University, Brazil

University of Wisconsin - Stevens Point, USA


Tata Consultancy Services, Calcutta, India
Rajagiri School of Engineering and Technology,
India
National ICT Australia Ltd., Australia

Demo Co-chairs
Amitava Mukherjee
Bhadran V.K.
Janardhanan P.S.

IBM Global Business Services, India


Centre for Development of Advanced
Computing, Trivandrum, India
Rajagiri School of Engineering and Technology,
India

Publicity Co-chairs
Victor Govindaswamy
Arun Saha
Biju Paul

Publication Co-chairs
Pruet Boonma
Sajid Hussain
Hiroshi Wada

Texas A&M University, USA


Fujitsu Network Communications, USA
Rajagiri School of Engineering and Technology,
India

Chiang Mai University, Thailand


Fisk University, USA
University of New South Wales, Australia

ACC 2011 Technical Program Committee


A. Hafid
Abdallah Shami
Abdelhafid Abouaissa
Abdelmalik Bachir
Abdelouahid Derhab
Abhijit Mitra
Adão Silva
Adel Ali
Ahmed Mehaoua
Ai-Chun Pang
Ajay Gupta
Alberto Dainotti
Alessandro Leonardi
Alex Galis
Alexey Vinel
Ali Abedi
Alicia Triviño Cabrera
Alireza Behbahani
Alois Ferscha
Al-Sakib Khan Pathan
Amar Prakash Azad
Amirhossein Alimohammad
Amit Agarwal

Network Research Lab, University of Montreal,


Canada
The University of Western Ontario, Canada
University of Haute Alsace, France
Imperial College London, UK
CERIST, Algeria
Indian Institute of Technology Guwahati, India
University of Aveiro, Portugal
University Technology Malaysia
University of Paris Descartes, France
National Taiwan University, Taiwan
Western Michigan University, USA
University of Naples Federico II, Italy
University of Catania, Italy
University College London, UK
Saint Petersburg Institute, Russia
University of Maine, USA
Universidad de Málaga, Spain
University of California, Irvine, USA
University of Linz, Austria
International Islamic University, Malaysia
INRIA, France
University of Alberta, Canada
Indian Institute of Technology, Roorkee, India

Amitava Mukherjee
Anand Prasad
Andreas Maeder
Ankur Gupta
Antonio Coronato
Antonio Pescape
António Rodrigues
Anura P. Jayasumana
Arnab Bhattacharya
Arun Saha
Arvind Swaminathan
Ashley Thomas
Ashraf Elnagar
Ashraf Mahmoud
Ashwani Singh
Athanasios Vasilakos
Atilio Gameiro
Aydin Sezgin
Ayman Assra
Aytac Azgin
B. Sundar Rajan
Babu A.V.
Babu B.V.
Babu Raj E.
Balagangadhar G. Bathula
Borhanuddin Mohd. Ali
Brijendra Kumar Joshi
Bruno Crispo
C.-F. Cheng
Chang Wu Yu
Charalampos Tsimenidis
Chih-Cheng Tseng
Chi-Hsiang Yeh
Chitra Babu
Chittaranjan Hota
Chonho Lee
Christian Callegari
Christos Chrysoulas
Chuan-Ching Sue
Chung Shue Chen

IBM Global Business Services, India


NEC Corporation, Japan
NEC Laboratories Europe, Germany
Model Institute of Engineering and Technology,
India
ICAR-CNR, Naples, Italy
University of Naples Federico II, Italy
IT / Instituto Superior Tecnico, Portugal
Colorado State University, USA
Indian Institute of Technology, Kanpur, India
Fujitsu Network Communications, USA
Qualcomm, USA
Secureworks Inc., USA
Sharjah University, UAE
KFUPM, Saudi Arabia
Navtel Systems, France
University of Western Macedonia, Greece
Telecommunications Institute/Aveiro
University, Portugal
Ulm University, Germany
McGill University, Canada
Georgia Institute of Technology, USA
Indian Institute of Science, India
National Institute of Technology, Calicut, India
BITS-Pilani, Rajasthan, India
Sun College of Engineering and Technology,
India
Columbia University, USA
Universiti Putra Malaysia
Military College, Indore, India
Università di Trento, Italy
National Chiao Tung University, Taiwan
Chung Hua University, Taiwan
Newcastle University, UK
National Ilan University, Taiwan
Queen's University, Canada
SSN College of Engineering, Chennai, India
BITS Hyderabad Campus, India
Nanyang Technological University, Singapore
University of Pisa, Italy
Technological Educational Institute, Greece
National Cheng Kung University, Taiwan
TREC, INRIA, France

Chun-I. Fan
Chutima Prommak
Dali Wei
Danda B. Rawat
Daniele Tarchi
Davide Adami
Deepak Garg
Demin Wang
Dennis Pfisterer
Deyun Gao
Dharma Agrawal
Dhiman Barman
Di Jin
Dimitrios Katsaros
Dimitrios Vergados
Dirk Pesch
Djamel Sadok
Eduardo Cerqueira
Eduardo Souto
Edward Au
Egemen Cetinkaya
Elizabeth Sherly
El-Sayed El-Alfy
Emad A. Felemban
Eric Renault
Errol Lloyd
Ertan Onur
Faouzi Bader
Faouzi Kamoun
Fernando Velez
Filipe Cardoso
Florian Doetzer
Francesco Quaglia
Francine Krief
Frank Yeong-Sung Lin
Gianluigi Ferrari
Giuseppe Ruggeri
Grzegorz Danilewicz
Guang-Hua Yang
Guo Bin

National Sun Yat-sen University, Taiwan


Suranaree University of Technology, Thailand
Jiangsu Tianze Infoindustry Company Ltd,
P.R. China
Old Dominion University, USA
University of Bologna, Italy
CNIT Pisa Research Unit, University of Pisa,
Italy
Thapar University, India
Microsoft Inc., USA
University of Lübeck, Germany
Beijing Jiaotong University, P.R. China
University of Cincinnati, USA
Juniper Networks, USA
General Motors, USA
University of Thessaly, Greece
National Technical University of Athens,
Greece
Cork Institute of Technology, Ireland
Federal University of Pernambuco, Brazil
Federal University of Para (UFPA), Brazil
Federal University of Amazonas, Brazil
Huawei Technologies, P.R. China
University of Kansas, USA
IIITM-Kerala, India
King Fahd University, Saudi Arabia
Umm Al Qura University, Saudi Arabia
TELECOM & Management SudParis, France
University of Delaware, USA
Delft University of Technology,
The Netherlands
CTTC, Spain
WTS, UAE
University of Beira Interior, Portugal
ESTSetubal/Polytechnic Institute of Setubal,
Portugal
ASKON ConsultingGroup, Germany
Sapienza Università di Roma, Italy
University of Bordeaux, France
National Taiwan University, Taiwan
University of Parma, Italy
University Mediterranea of Reggio Calabria,
Italy
Poznan University of Technology, Poland
The University of Hong Kong, Hong Kong
Institut Telecom SudParis, France


Hadi Otrok
Hamid Mcheick
Harry Skianis
Hicham Khalife
Himal Suraweera
Hiroshi Wada
Hong-Hsu Yen
Hongli Xu
Houcine Hassan
Hsuan-Jung Su
Huaiyu Dai
Huey-Ing Liu
Hung-Keng Pung
Hung-Yu Wei
Ian Glover
Ian Wells
Ibrahim Develi
Ibrahim El rube
Ibrahim Habib
Ibrahim Korpeoglu
Ilja Radusch
Ilka Miloucheva
Imad Elhajj
Ivan Ganchev
Iwan Adhicandra
Jalel Ben-othman
Jane-Hwa Huang
Jaydeep Sen
Jiankun Hu
Jie Yang
Jiping Xiong
Jose de Souza
Jose Moreira
Ju Wang
Juan-Carlos Cano
Judith Kelner
Julien Laganier
Jussi Haapola
K. Komathy
Ka Lok Hung
Ka Lok Man
Kaddar Lamia
Kainam Thomas

Khalifa University, UAE


Université du Québec à Chicoutimi, Canada
University of the Aegean, Greece
ENSEIRB-LaBRI, France
Singapore University of Technology and Design,
Singapore
University of New South Wales, Australia
Shih-Hsin University, Taiwan
University of Science and Technology of China,
P.R. China
Technical University of Valencia, Spain
National Taiwan University, Taiwan
NC State University, USA
Fu-Jen Catholic University, Taiwan
National University of Singapore
NTU, Taiwan
University of Strathclyde, UK
Swansea Metropolitan University, UK
Erciyes University, Turkey
AAST, Egypt
City University of New York, USA
Bilkent University, Turkey
Technische Universitat Berlin, Germany
Media Technology Research, Germany
American University of Beirut, Lebanon
University of Limerick, Ireland
The University of Pisa, Italy
University of Versailles, France
National Chi Nan University, Taiwan
Tata Consultancy Services, Calcutta, India
RMIT University, Australia
Cisco Systems, USA
Zhejiang Normal University of China
Federal University of Ceará, Brazil
IBM T.J. Watson Research Center, USA
Virginia State University, USA
Technical University of Valencia, Spain
Federal University of Pernambuco, Brazil
Juniper Networks Inc., USA
University of Oulu, Finland
Easwari Engineering College, Chennai, India
The Hong Kong University, Hong Kong
Xi'an Jiaotong-Liverpool University, China
University of Versailles Saint Quentin, France
Hong Kong Polytechnic University

Kais Mnif
Kang Yong Lee
Katia Bortoleto
Kejie Lu
Kemal Tepe
Khalifa Hettak
Khushboo Shah
Kotecha K.
Kpatcha Bayarou
Kumar Padmanabh
Kyriakos Manousakis
Kyung Sup Kwak
Li Zhao
Li-Chun Wang
Lin Du
Liza A. Latiff
Luca Scalia
M Ayoub Khan
Maaruf Ali
Madhu Kumar S.D.
Madhu Nair
Madhumita Chatterjee
Mahamod Ismail
Mahmoud Al-Qutayri
Manimaran Govindarasu
Marcelo Segatto
Maria Ganzha
Marilia Curado
Mario Fanelli
Mariofanna Milanova
Mariusz Glabowski
Mariusz Zal
Masato Saito
Massimiliano Comisso
Massimiliano Laddomada
Matthias R. Brust
Mehrzad Biguesh
Michael Alexander
Michael Hempel
Michael Lauer
Ming Xia
Ming Xiao
Mohamed Ali Kaafar

High Institute of Electronics and


Communications of Sfax, Tunisia
ETRI, Korea
Positivo University, Brazil
University of Puerto Rico at Mayaguez, USA
University of Windsor, Canada
Communications Research Centre (CRC),
Canada
Altusystems Corp, USA
Institute of Technology, Nirma University, India
Fraunhofer Institute, Germany
General Motors, India
Telcordia Technologies, USA
Inha University, Korea
Microsoft Corporation, USA
National Chiao Tung University, Taiwan
Technicolor Research and Innovation Beijing,
P.R. China
University Technology Malaysia
University of Palermo, Italy
C-DAC, Noida, India
Oxford Brookes University, UK
National Institute of Technology, Calicut, India
University of Kerala, India
Indian Institute of Technology Bombay, India
Universiti Kebangsaan Malaysia
Khalifa University, UAE
Iowa State University, USA
Federal University of Espírito Santo, Brazil
University of Gdansk, Poland
University of Coimbra, Portugal
DEIS, University of Bologna,Italy
University of Arkansas at Little Rock, USA
Poznan University of Technology, Poland
Poznan University of Technology, Poland
University of the Ryukyus, Japan
University of Trieste, Italy
Texas A&M University-Texarkana, USA
University of Central Florida, USA
Queen's University, Canada
Scaledinfra Technologies GmbH, Austria
University of Nebraska - Lincoln, USA
Vanille-Media, Germany
NICT, Japan
Royal Institute of Technology, Sweden
INRIA, France


Mohamed Cheriet
Mohamed Eltoweissy
Mohamed Hamdi
Mohamed Moustafa
Mohammad Banat
Mohammad Hayajneh
Mohammed Misbahuddin
Mustafa Badaroglu
Naceur Malouch
Nakjung Choi, Alcatel-Lucent
Namje Park
Natarajan Meghanathan
Neeli Prasad
Nen-Fu Huang
Nikola Zogovic
Nikolaos Pantazis
Nilanjan Banerjee
Niloy Ganguly
Pablo Corral Gonzalez
Patrick Seeling
Paulo R.L. Gondim
Peter Bertok
Phan Cong-Vinh
Pingyi Fan
Piotr Zwierzykowski
Pascal Lorenz
Pruet Boonma
Punam Bedi
Qinghai Gao
Rahul Khanna
Rajendra Akerkar
Raul Santos
Ravishankar Iyer
Regina Araujo
Renjie Huang
Ricardo Lent
Rio G.L. D'Souza
Roberto Pagliari
Roberto Verdone
Roksana Boreli

École de Technologie Supérieure, Canada


Pacific Northwest National Laboratory, USA
Carthage University, Tunisia
Akhbar El Yom Academy, Egypt
Jordan University of Science and Technology,
Jordan
UAEU, UAE
C-DAC, India
IMEC, Belgium
Université Pierre et Marie Curie, France
Bell-Labs, Seoul, Korea
Jeju University, South Korea
Jackson State University, USA
Center for TeleInFrastructure (CTIF),
Denmark
National Tsing Hua University, Taiwan
University of Belgrade, Serbia
Technological Educational Institution of
Athens, Greece
IBM Research, India
Indian Institute of Technology, Kharagpur,
India
University Miguel Hernández, Spain
University of Wisconsin - Stevens Point, USA
University of Brasília, Brazil
Royal Melbourne Institute of Technology
(RMIT), Australia
London South Bank University, UK
Tsinghua University, P.R. China
Poznan University of Technology, Poland
University of Haute Alsace, France
Chiang Mai University, Thailand
University of Delhi, India
Atheros Communications Inc., USA
Intel, USA
Western Norway Research Institute, Norway
University of Colima, Mexico
Intel Corp, USA
Federal University of Sao Carlos, Brazil
Washington State University, USA
Imperial College London, UK
St. Joseph Engineering College, Mangalore,
India
University of California, Irvine, USA
WiLab, University of Bologna, Italy
National ICT Australia Ltd., Australia

Ronny Yongho Kim


Ruay-Shiung Chang
Ruidong Li
S. Ali Ghorashi
Sahar Ghazal
Said Soulhi
Sajid Hussain
Salah Bourennane
Salman Abdul Moiz
Sameh Elnikety
Sanjay H.A.
Sathish Rajasekhar
Sergey Andreev
Seshan Srirangarajan
Seyed (Reza) Zekavat
Sghaier Guizani
Shancang Li
Shi Xiao
Siby Abraham
Silvio Bortoleto
Simon Pietro Romano
Somayajulu D. V. L. N.
Song Guo
Song Lin
Soumya Sen
Stefano Ferretti
Stefano Giordano
Stefano Pesic
Stefano Tomasin
Stefanos Gritzalis
Steven Gordon
Suat Ozdemir
Subir Saha
Subramanian K.
Sudarshan T.S.B.
Sugam Sharma
Surekha Mariam Varghese
T. Aaron Gulliver
Tao Jiang
Tarek Bejaoui
Tarun Joshi
Theodore Stergiou

Kyungil University, Korea


National Dong Hwa University, Taiwan
NICT, Japan
Shahid Beheshti University, Iran
University of Versailles, France
Ericsson, Sweden
Fisk University, USA
École Centrale Marseille, France
CDAC, Bangalore, India
Microsoft Research, USA
Nitte Meenakshi Institute, Bangalore, India
RMIT University, Australia
Tampere University of Technology, Finland
Nanyang Technological University, Singapore
Michigan Technological University, USA
UAE University, UAE
School of Engineering, Swansea University, UK
Nanyang Technological University, Singapore
University of Mumbai, India
Positivo University, Brazil
University of Naples Federico II, Italy
National Institute of Technology Warangal,
India
The University of British Columbia, Canada
University of California, Riverside, USA
University of Pennsylvania, USA
University of Bologna, Italy
University of Pisa, Italy
Cisco Systems, Italy
University of Padova, Italy
University of the Aegean, Greece
Thammasat University, Thailand
Gazi University, Turkey
Nokia Siemens Networks, India
Advanced Center for Informatics and
Innovative Learning, IGNOU, India
Amrita Vishwa Vidyapeetham, Bangalore,
India
Iowa State University, USA
M.A. College of Engineering, India
University of Victoria, Canada
Huazhong University of Science and
Technology, P.R. China
Mediatron Lab., Carthage University, Tunisia
University of Cincinnati, USA
Intracom Telecom, UK


Thienne Johnson
Thomas Chen
Tsern-Huei Lee
Usman Javaid
Vamsi Paruchuri
Vana Kalogeraki
Vehbi Cagri Gungor
Velmurugan Ayyadurai
Vicent Cholvi
Victor Govindaswamy
Vijaya Kumar B.P.
Viji E Chenthamarakshan
Vino D.S. Kingston
Vinod Chandra S.S.
Vivek Jain
Vivek Singh
Vladimir Kropotov
Wael M El-Medany
Waslon Lopes
Wei Yu
Wei-Chieh Ke
Wendong Xiao
Xiang-Gen Xia
Xiaodong Wang
Xiaoguang Niu
Xiaoqi Jia
Xinbing Wang
Xu Shao
Xueping Wang
Yacine Atif
Yali Liu
Yang Li
Yassine Bouslimani
Ye Zhu
Yi Zhou
Yifan Yu
Yong Wang
Youngseok Lee
Youssef SAID
Yuan-Cheng Lai
Yuh-Ren Tsai

University of Arizona, USA


Swansea University, UK
National Chiao Tung University, Taiwan
Vodafone Group, UK
University of Central Arkansas, USA
University of California, Riverside, USA
Bahcesehir University, Turkey
University of Surrey, UK
Universitat Jaume I, Spain
Texas A&M University, USA
Reva Institute of Technology and Management,
Bangalore, India
IBM T.J. Watson Research Center in
New York, USA
Hewlett-Packard, USA
College of Engineering Thiruvananthapuram,
India
Robert Bosch LLC, USA
Banaras Hindu University, India
D-Link Russia, Russia
University of Bahrain, Kingdom of Bahrain
UFCG - Federal University of Campina Grande,
Brazil
Towson University, USA
National Tsing Hua University, Taiwan
Institute for Infocomm Research, Singapore
University of Delaware, USA
Qualcomm, USA
Wuhan University, P.R. China
Institute of Software, Chinese Academy of
Sciences, P.R. China
Shanghai Jiaotong University, P.R. China
Institute for Infocomm Research, Singapore
Fudan University, P.R. China
UAE University, UAE
University of California, Davis, USA
Chinese Academy of Sciences, P.R. China
University of Moncton, Canada
Cleveland State University, USA
Texas A&M University, USA
France Telecom R&D Beijing, P.R. China
University of Nebraska-Lincoln, USA
Chungnam National University, Korea
Tunisie Telecom/SysCom Lab,ENIT, Tunisia
Information Management, NTUST, Taiwan
National Tsing Hua University, Taiwan

Yu-Kai Huang
Yusuf Ozturk
Zaher Aghbari
Zbigniew Dziong
Zhang Jin
Zhenghao Zhang
Zhenzhen Ye
Zhihua Cui
Zhili Sun
Zhong Zhou
Zia Saquib

Quanta Research Institute, Taiwan


San Diego State University, USA
University of Sharjah, UAE
University of Quebec, Canada
Beijing Normal University, P.R. China
Florida State University, USA
iBasis, Inc., USA
Taiyuan University of Science and Technology,
China
University of Surrey, UK
University of Connecticut, USA
C-DAC, Mumbai, India

ACC 2011 Additional Reviewers


Akshay Vashist
Alessandro Testa
Amitava
Ammar Rashid
Anand
Bjoern W. Schuller
Chi-Ming Wong
Danish Faizan
Fatos Xhafa
Hooman Tahayori
John Jose
Jyoti Singh
Koushik
Long Zheng
Manpreet Singh
Maria Striki
Mohamad Zoinol Abidin
Mohamed Dahmane
Mohd Helmy Abd Wahab
Mohd Riduan Bin Ahmad
Mohd Sadiq
Mudhakar Srivatsa
Nan Yang
Nurulnadwan Aziz Aziz

Telcordia Telchnologies, USA


University of Naples Federico II, Italy
Academy of Technology, India
Auckland University of Technology,
New Zealand
MITS, India
Technical University, Germany
Jinwen University of Science and Technology,
Taiwan
NIC-INDIA, India
UPC, Barcelona Tech, Spain
Ryerson University, Canada
IIT Madras, India
Academy of Technology, India
West Bengal University of Technology, India
University of Aizu, Japan
M.M. Engineering College, India
Telcordia Technologies, Piscataway, USA
Universiti Teknikal Malaysia Melaka, Malaysia
University of Montreal, Canada
Universiti Tun Hussein Onn Malaysia, Malaysia
Universiti Teknikal Malaysia Melaka, Malaysia
Jamia Millia Islamia, India
IBM T.J. Watson Research Center, USA
CSIRO, Australia
Universiti Teknologi MARA, Malaysia


Pooya Taheri
R.C. Wang
Roman Yampolskiy
Shuang Tian
Syed Abbas Ali
Velayutham
Yeong-Luh Ueng

University of Alberta, Canada


NTTU, Taiwan
University of Louisville, USA
The University of Sydney, Australia
Ajman University of Science & Technology,
UAE
Adhiparasakthi Engineering College,
Melmaruvathur, India
National Tsing Hua University, Taiwan

International Workshop on Identity: Security, Management and Applications (ID 2011)

General Chairs
Paul Rodrigues
(CTO, WSS, India)
H.R. Vishwakarma
(Secretary, Computer
Society of India)

Hindustan University, India

VIT University, India

Program Chairs
P. Krishna Reddy
Sundar K.S.
Srinivasa Ragavan
S. Venkatachalam

IIIT, Hyderabad, India


Education & Research, Infosys Technologies
Limited, India
Intel Inc, USA
Jawaharlal Nehru Technological University,
India

Organizing Chair
Madhan Kumar Srinivasan

Education & Research, Infosys Technologies


Limited, India

Organizing Co-chairs
Abhi Saran
Anireddy Niranjan Reddy
Revathy Madhan Kumar

London South Bank University, UK


University of Glamorgan, UK
Education & Research, Infosys Technologies
Limited, India

Technical Program Committee


Arjan Durresi
Arun Sivanandham
Avinash Srinivasan
Bezawada Bruhadeshwar
Bhaskara Reddy AV
Bipin Indurkhya

Indiana University Purdue University


Indianapolis, USA
Infosys Technologies Limited, India
Bloomsburg University, USA
IIIT, Hyderabad, India
Infosys Technologies Limited, India
IIIT, Hyderabad, India


C. Sunil Kumar
Chandrabali Karmakar
Farooq Anjum
Gudipati Kalyan Kumar
Hamid Sharif
Hui Chen
Jie Li
Kalaiselvam
Lau Lung
Lukas Ruf
Manik Lal Das

Manimaran Govindarasu
Narendra Ahuja
Omar
Pradeep Kumar T.S.
Pradeepa
Rajiv Tripathi
Rakesh Chithuluri
Sanjay Chaudhary

Santosh Pasuladi
Satheesh Kumar Varma
Saurabh Barjatiya
Sreekumar Vobugari
Suthershan Vairavel
Tarun Rao
Thomas Little
Tim Strayer
V. Balamurugan
Vasudeva Varma
Vinod Babu
Yonghe Liu

Jawaharlal Nehru Technological University,


India
Infosys Technologies Limited, India
On-Ramp Wireless, USA
Excellence India, India
University of Nebraska-Lincoln, USA
Virginia State University, USA
University of Tsukuba, Japan
Infineon Technologies, Germany
UFSC, Brazil
Consecom AG, Switzerland
Dhirubhai Ambani Institute of Information and
Communication Technology (DA-IICT),
India
Iowa State University, USA
University of Illinois, USA
University of Jordan, Jordan
Infosys Technologies Limited, India
Wipro Technologies, India
NIT, Allahabad, India
Oracle, India
Dhirubhai Ambani Institute of Information and
Communication Technology (DA-IICT),
India
Jawaharlal Nehru Technological University,
India
IIIT, Pune, India
IIIT, Hyderabad, India
Education & Research, Infosys Technologies
Limited, India
CTS, India
Infosys Technologies Limited, India
Boston University, USA
BBN Technologies, USA
IBM, India
IIIT, Hyderabad, India
Giesecke & Devrient, Germany
UT Arlington, USA

International Workshop on Applications of Signal Processing (I-WASP 2011)

Workshop Organizers
Jaison Jacob
Sreeraj K.P.
Rithu James

Rajagiri School of Engineering and Technology,


India
Rajagiri School of Engineering and Technology,
India
Rajagiri School of Engineering and Technology,
India

Technical Program Committee


A. Vinod
Aggelos Katsaggelos
Bing Li
Carlos Gonzalez
Damon Chandler
Egon L. van den Broek
Feng Wu
Hakan Johansson
Joaquim Filipe
Lotfi Senhadji
Reyer Zwiggelaar
Xianghua Xie
Yoshikazu Miyanaga

NTU, Singapore
Northwestern University, USA
University of Virginia, USA
University of Castilla-La Mancha, Spain
Oklahoma State University, USA
University of Twente, The Netherlands
Microsoft Research Asia, P.R. China
University of Linkoping, Sweden
EST-Setubal, Portugal
Université de Rennes 1, France
Aberystwyth University, UK
Swansea University, UK
Hokkaido University, Japan

International Workshop on Cloud Computing: Architecture, Algorithms and Applications (CloudComp 2011)

Workshop Organizers
Binu A.
Biju Paul
Sabu M. Thampi

Cochin University of Science and Technology,


India
Rajagiri School of Engineering and Technology,
India
Rajagiri School of Engineering and Technology,
India

Technical Program Committee


Antonio Puliafito
Bob Callaway
Chee Shin Yeo
Chin-Sean Sum
Ching-Hsien Hsu
Drissa Houatra
Deepak Unnikrishnan
Jie Song
Salah Sharieh
Francesco Longo
Fabienne Anhalt
Gaurav Somani
Haibing Guan
Hongbo Jiang
Hongkai Xiong
Hui Zhang
Itai Zilbershtein
Jens Nimis
Jie Song

University of Messina, Italy


IBM, USA
Institute of High-Performance Computing,
Singapore
National Institute of Information and
Communications Technology, Japan
Chung Hua University, Taiwan
Orange Labs, France
University of Massachusetts, USA
Northeastern University, P.R. China
McMaster University, Canada
Università di Messina, Italy
École Normale Supérieure de Lyon - INRIA, France
LNMIIT, Jaipur, India
Shanghai Jiao Tong University, P.R. China
Huazhong University of Science and
Technology, P.R. China
Shanghai Jiao Tong University, P.R China
NEC Laboratories America, USA
Avaya, Israel
University of Applied Sciences, Germany
Software College, Northeastern University,
China


Jorge Carapinha
Junyi Wang
K. Chandra Sekaran
Kai Zheng
Krishna Sankar
Laurent Amanton
Luca Caviglione
Lukas Ruf
Massimiliano Rak
Pallab Datta
Pascale Vicat-Blanc Primet
Prabu Dorairaj
Shivani Sud
Shuicheng Yan
Siani Pearson
Simon Koo
Srikumar Venugopal
Stephan Kopf
Thomas Sandholm
Umberto Villano
Vipin Chaudhary
Yaozu Dong
Zhou Lan

PT Inovacao S.A. Telecom Group, Portugal


National Institute of Information and
Communications Technology, Japan
NITK, India
IBM China Research Lab, P.R. China
Cisco Systems, USA
Havre University, France
National Research Council (CNR), Italy
Consecom AG, Switzerland
Second University of Naples, Italy
IBM Almaden Research Center, USA
INRIA, France
NetApp Inc, India
Intel Labs, USA
National University of Singapore, Singapore
HP Labs, UK
University of San Diego, USA
UNSW, Australia
University of Mannheim, Germany
Hewlett-Packard Laboratories, USA
University of Sannio, Italy
University at Buffalo, USA
Intel Corporation, P.R. China
National Institute of Information and
Communications Technology, Japan

International Workshop on Multimedia Streaming (MultiStreams 2011)

Program Chairs
Pascal Lorenz
Fan Ye
Trung Q. Duong

University of Haute Alsace, France


IBM T.J. Watson Research Center, USA
Blekinge Institute of Technology, Sweden

Technical Program Committee


Guangjie Han
Alex Canovas
Brent Lagesse
Chung Shue Chen
Debasis Giri
Mario Montagud
Doreen Miriam
Duduku V. Viswacheda
Elsa Macías López
Eugenia Bernardino
Fernando Boronat
Jen-Wen Ding
Joel Rodrigues IT
Jo-Yew Tham
Marcelo Atenas
Jorge Bernabe
Bao Vo Nguyen
Hans-Juergen Zepernick
Jose Maria Alcaraz Calero
Juan Marin Perez
Lei Shu
Lexing Xie
Marc Gilg
Miguel Garcia
Mohd Riduan Bin Ahmad

Hohai University, P.R. China


Polytechnic University of Valencia, Spain
Oak Ridge National Laboratory, USA
INRIA-ENS, France
Haldia Institute of Technology, India
Universidad Politecnica de Valencia, Spain
Anna University, India
University Malaysia Sabah, Malaysia
University of Las Palmas de Gran Canaria,
Spain
Polytechnic Institute of Leiria, Portugal
Instituto de Investigación para la Gestión Integrada de Zonas Costeras, Spain
National Kaohsiung University of Applied
Sciences, Taiwan
University of Beira Interior, Portugal
A*STAR Institute for Infocomm Research,
Singapore
Universidad Politecnica de Valencia, Spain
University of Murcia, Spain
Posts and Telecommunications Institute of
Technology, Vietnam
Blekinge Institute of Technology, Sweden
University of Murcia, Spain
University of Murcia, Spain
Osaka University, Japan
The Australian National University, Australia
University of Haute-Alsace, France
Polytechnic University of Valencia, Spain
Universiti Teknikal Malaysia, Malaysia


Phan Cong-Vinh
Álvaro Suárez-Sarmiento
Song Guo
Tin-Yu Wu
Zhangbing Zhou
Zuqing Zhu
Juan M. Sánchez
Choong Seon Hong

London South Bank University, UK


University of Las Palmas de Gran Canaria,
Spain
University of British Columbia, Canada
Tamkang University, Taiwan
Institut Telecom & Management SudParis,
France
Cisco System, USA
University of Extremadura, Spain
Kyung Hee University, Korea

Second International Workshop on Trust Management in P2P Systems (IWTMP2PS 2011)

Program Chairs
Visvasuresh Victor
Govindaswamy
Jack Hu
Sabu M. Thampi

Texas A&M University-Texarkana, USA


Microsoft, USA
Rajagiri School of Engineering and Technology,
India

Technical Program Committee


Haiguang
Ioannis Anagnostopoulos
Farag Azzedin
Roksana Boreli
Yann Busnel
Juan-Carlos Cano
Phan Cong-Vinh
Jianguo Ding
Markus Fiedler
Deepak Garg
Félix Gómez Mármol
Paulo Gondim
Steven Gordon
Ankur Gupta
Houcine Hassan
Yifeng He
Michael Hempel
Salman Abdul Moiz
Guimin Huang
Renjie Huang
Benoit Hudzia
Helge Janicke

Fudan University, P.R. China


University of the Aegean, Greece
King Fahd University of Petroleum & Minerals,
Saudi Arabia
National ICT Australia, Australia
University of Nantes, France
Universidad Politecnica de Valencia, Spain
London South Bank University, UK
University of Luxembourg, Luxembourg
Blekinge Institute of Technology, Sweden
Thapar University, Patiala, India
University of Murcia, Spain
Universidade de Brasilia, Brazil
Thammasat University, Thailand
Model Institute of Engineering and Technology,
India
Universidad Politecnica de Valencia, Spain
Ryerson University, Canada
University of Nebraska-Lincoln, USA
CDAC, India
Guilin University of Electronic Technology,
P.R. China
Washington State University, USA
SAP Research, UK
De Montfort University, UK


Mohamed Ali Kaafar


Eleni Koutrouli
Stefan Kraxberger
Jonathan Loo
Marjan Naderan
Lourdes Penalver
Elvira Popescu
Guangzhi Qu
Aneel Rahim
Yonglin Ren
Andreas Riener
Samir Saklikar
Thomas Schmidt
Fangyang Shen
Thorsten Strufe
Sudarshan T.S.B.
Demin Wang
Fatos Xhafa
Jiping Xiong
Chang Wu Yu

INRIA, France
National University of Athens, Greece
Graz University of Technology, Austria
Middlesex University, UK
Amirkabir University of Technology, Iran
Valencia Polytechnic University, Spain
UCV, Romania
Oakland University, USA
COMSATS Institute of Information
Technology, Pakistan
SITE, University of Ottawa, Canada
University of Linz, Austria
RSA, Security Division of EMC, India
HAW Hamburg (DE), Germany
Northern New Mexico College, USA
TU Darmstadt, Germany
Amrita School of Engineering, India
Microsoft, USA
UPC, Barcelona, Spain
Zhejiang Normal University, P.R. China
Chung Hua University, Taiwan

Table of Contents Part IV

Position Papers

Impact of Node Density on Node Connectivity in MANET Routing Protocols . . . . . . 1
G. Jisha and Philip Samuel

Survey and Comparison of Frameworks in Software Architecture . . . . . . 9
S. Roselin Mary and Paul Rodrigues

Two Layered Hierarchical Model for Cognitive Wireless Sensor Networks . . . . . . 19
K. Vinod Kumar, G. Lakshmi Phani, K. Venkat Sayeesh, Aparna Chaganty, and G. Rama Murthy

3D-CGIN: A 3 Disjoint Paths CGIN with Alternate Source . . . . . . 25
Meenal A. Borkar and Nitin

Architecture for Running Multiple Applications on a Single Wireless Sensor Network: A Proposal . . . . . . 37
Sonam Tobgay, Rasmus L. Olsen, and Ramjee Prasad

Feature Based Image Retrieval Algorithm . . . . . . 46
P.U. Nimi and C. Tripti

Exploiting ILP in a SIMD Type Vector Processor . . . . . . 56
Abel Palaty, Mohammad Suaib, and Kumar Sambhav Pandey

An Extension to Global Value Numbering . . . . . . 63
Saranya D. Krishnan and Shimmi Asokan

Data Privacy for Grid Systems . . . . . . 70
N. Sandeep Chaitanya, S. Ramachandram, B. Padmavathi, S. Shiva Skandha, and G. Ravi Kumar

Towards Multimodal Capture, Annotation and Semantic Retrieval from Performing Arts . . . . . . 79
Rajkumar Kannan, Frederic Andres, Fernando Ferri, and Patrizia Grifoni

A New Indian Model for Human Intelligence . . . . . . 89
Jai Prakash Singh

Stepping Up Internet Banking Security Using Dynamic Pattern Based Image Steganography . . . . . . 98
P. Thiyagarajan, G. Aghila, and V. Prasanna Venkatesan

A Combinatorial Multi-objective Particle Swarm Optimization Based Algorithm for Task Allocation in Distributed Computing Systems . . . . . . 113
Rahul Roy, Madhabananda Das, and Satchidananda Dehuri

Enhancement of BARTERCAST Using Reinforcement Learning to Effectively Manage Freeriders . . . . . . 126
G. Sreenu, P.M. Dhanya, and Sabu M. Thampi

A Novel Approach to Represent Detected Point Mutation . . . . . . 137
Dhanya Sudarsan, P.R. Mahalingam, and G. Jisha

Anonymous and Secured Communication Using OLSR in MANET . . . . . . 145
A.A. Arifa Azeez, Elizabeth Isaac, and Sabu M. Thampi

Bilingual Translation System for Weather Report (For English and Tamil) . . . . . . 155
S. Saraswathi, M. Anusiya, P. Kanivadhana, and S. Sathiya

Design of QRS Detection and Heart Rate Estimation System on FPGA . . . . . . 165
Sudheer Kurakula, A.S.D.P. Sudhansh, Roy Paily, and S. Dandapat

Multi-document Text Summarization in E-Learning System for Operating System Domain . . . . . . 175
S. Saraswathi, M. Hemamalini, S. Janani, and V. Priyadharshini

Improving Hadoop Performance in Handling Small Files . . . . . . 187
Neethu Mohandas and Sabu M. Thampi

Studies of Management for Dynamic Circuit Networks . . . . . . 195
Ana Elisa Ferreira, Anilton Salles Garcia, and Carlos Alberto Malcher Bastos

International Workshop on Identity: Security, Management and Applications (ID 2011)

Game Theoretic Approach to Resolve Energy Conflicts in Ad-Hoc Networks . . . . . . 205
Juhi Gupta, Ishan Kumar, and Anil Kacholiya

Software Secureness for Users: Significance in Public ICT Applications . . . . . . 211
C.K. Raju and P.B.S. Bhadoria

Vector Space Access Structure and ID Based Distributed DRM Key Management . . . . . . 223
Ratna Dutta, Dheerendra Mishra, and Sourav Mukhopadhyay

Multiple Secrets Sharing with Meaningful Shares . . . . . . 233
Jaya and Anjali Sardana

On Estimating Strength of a DDoS Attack Using Polynomial Regression Model . . . . . . 244
B.B. Gupta, P.K. Agrawal, A. Mishra, and M.K. Pattanshetti

Finding New Solutions for Services in Federated Open Systems Interconnection . . . . . . 250
Zubair Ahmad Khattak, Jamalul-lail Ab Manan, and Suziah Sulaiman

Duplicate File Names - A Novel Steganographic Data Hiding Technique . . . . . . 260
Avinash Srinivasan and Jie Wu

A Framework for Securing Web Services by Formulating a Collaborative Security Standard among Prevailing WS-* Security Standards . . . . . . 269
M. Priyadharshini, R. Baskaran, Madhan Kumar Srinivasan, and Paul Rodrigues

Improved Web Search Engine by New Similarity Measures . . . . . . 284
Vijayalaxmi Kakulapati, Ramakrishna Kolikipogu, P. Revathy, and D. Karunanithi

International Workshop on Applications of Signal Processing (I-WASP 2011)

Recognition of Subsampled Speech Using a Modified Mel Filter Bank . . . . . . 293
Kiran Kumar Bhuvanagiri and Sunil Kumar Kopparapu

Tumor Detection in Brain Magnetic Resonance Images Using Modified Thresholding Techniques . . . . . . 300
C.L. Biji, D. Selvathi, and Asha Panicker

Generate Vision in Blind People Using Suitable Neuroprosthesis Implant of BIOMEMS in Brain . . . . . . 309
B. Vivekavardhana Reddy, Y.S. Kumara Swamy, and N. Usha

Undecimated Wavelet Packet for Blind Speech Separation Using Independent Component Analysis . . . . . . 318
Ibrahim Missaoui and Zied Lachiri

A Robust Framework for Multi-object Tracking . . . . . . 329
Anand Singh Jalal and Vrijendra Singh

SVM Based Classification of Traffic Signs for Realtime Embedded Platform . . . . . . 339
Rajeev Kumaraswamy, Lekhesh V. Prabhu, K. Suchithra, and P.S. Sreejith Pai

A Real Time Video Stabilization Algorithm . . . . . . 349
Tarun Kancharla and Sanjyot Gindi

Object Classification Using Encoded Edge Based Structural Information . . . . . . 358
Aditya R. Kanitkar, Brijendra K. Bharti, and Umesh N. Hivarkar

Real Time Vehicle Detection for Rear and Forward Collision Warning Systems . . . . . . 368
Gaurav Kumar Yadav, Tarun Kancharla, and Smita Nair

PIN Generation Using Single Channel EEG Biometric . . . . . . 378
Ramaswamy Palaniappan, Jenish Gosalia, Kenneth Revett, and Andrews Samraj

International Workshop on Cloud Computing: Architecture, Algorithms and Applications (CloudComp 2011)

A Framework for Intrusion Tolerance in Cloud Computing . . . . . . 386
Vishal M. Karande and Alwyn R. Pais

Application of Parallel K-Means Clustering Algorithm for Prediction of Optimal Path in Self Aware Mobile Ad-Hoc Networks with Link Stability . . . . . . 396
Likewin Thomas and B. Annappa

Clouds Infrastructure Taxonomy, Properties, and Management Services . . . . . . 406
Imad M. Abbadi

A Deduced SaaS Lifecycle Model Based on Roles and Activities . . . . . . 421
Jie Song, Tiantian Li, Lulu Jia, and Zhiliang Zhu

Towards Achieving Accountability, Auditability and Trust in Cloud Computing . . . . . . 432
Ryan K.L. Ko, Bu Sung Lee, and Siani Pearson

Cloud Computing Security Issues and Challenges: A Survey . . . . . . 445
Amandeep Verma and Sakshi Kaushal

A Deadline and Budget Constrained Cost and Time Optimization Algorithm for Cloud Computing . . . . . . 455
Venkatarami Reddy Chintapalli

International Workshop on Multimedia Streaming (MultiStreams 2011)

A Bit Modification Technique for Watermarking Images and Streaming Video . . . . . . 463
Kaliappan Gopalan

Efficient Video Copy Detection Using Simple and Effective Extraction of Color Features . . . . . . 473
R. Roopalakshmi and G. Ram Mohana Reddy

Mobile Video Service Disruptions Control in Android Using JADE . . . . . . 481
Tatiana Gualotuña, Diego Marcillo, Elsa Macías López, and Álvaro Suárez-Sarmiento

Performance Analysis of Video Protocols over IP Transition Mechanisms . . . . . . 491
Hira Sathu and Mohib A. Shah

Performance Comparison of Video Protocols Using Dual-Stack and Tunnelling Mechanisms . . . . . . 501
Hira Sathu, Mohib A. Shah, and Kathiravelu Ganeshan

IPTV End-to-End Performance Monitoring . . . . . . 512
Priya Gupta, Priyadarshini Londhe, and Arvind Bhosale

A Color Image Encryption Technique Based on a Substitution-Permutation Network . . . . . . 524
J. Mohamedmoideen Kader Mastan, G.A. Sathishkumar, and K. Bhoopathy Bagan

Second International Workshop on Trust Management in P2P Systems (IWTMP2PS 2011)

Comment on the Improvement of an Efficient ID-Based RSA Multisignature . . . . . . 534
Chenglian Liu, Marjan Kuchaki Rafsanjani, and Liyun Zheng

A Secure Routing Protocol to Combat Byzantine and Black Hole Attacks for MANETs . . . . . . 541
Jayashree Padmanabhan, Tamil Selvan Raman Subramaniam, Kumaresh Prakasam, and Vigneswaran Ponpandiyan

A Convertible Designated Verible Blind Multi-signcryption Scheme . . . . . . 549
Subhalaxmi Das, Sujata Mohanty, and Bansidhar Majhi

Middleware Services at Cloud Application Layer . . . . . . 557
Imad M. Abbadi

Attribute Based Anonymity for Preserving Privacy . . . . . . 572
Sri Krishna Adusumalli and V. Valli Kumari

An Anonymous Authentication and Communication Protocol for Wireless Mesh Networks . . . . . . 580
Jaydip Sen

Data Dissemination and Power Management in Wireless Sensor Networks . . . . . . 593
M. Guerroumi, N. Badache, and S. Moussaoui

Performance Evaluation of ID Assignment Schemes for Wireless Sensor Networks . . . . . . 608
Rama Krishna Challa and Rakesh Sambyal

Author Index . . . . . . 617

Impact of Node Density on Node Connectivity in MANET Routing Protocols
G. Jisha¹ and Philip Samuel²
¹ Department of Information Technology, Rajagiri School of Engineering and Technology, Kochi, India-682039
² Information Technology, School of Engineering, Cochin University of Science and Technology, Kochi, India-682022
philips@cusat.ac.in, jishag@rajagiritech.ac.in

Abstract. The functioning of routing protocols in Mobile Ad-hoc Networks
depends on factors like node mobility, node failure, broken paths, node
connectivity and node density. These factors make the network dynamic. Due to
changes in node connectivity, the availability of links for data transfer may vary.
This paper discusses a Mobile Ad-Hoc environment with varying node density
and its effect on node connectivity among MANET routing protocols. The
performance of two routing protocols, DSDV from the proactive routing
protocols and AODV from the reactive routing protocols, is analyzed and
compared. Quantitative metrics like normalized overhead, packet delivery ratio
and number of control packets are evaluated using the Network Simulator NS-2.
This paper helps in identifying the impact of varying node densities on node
connectivity in Mobile Ad-Hoc networks. The result of the performance
comparison can also be helpful in the design of new routing protocols based on
topological characteristics.
Keywords: MANET, DSDV, AODV, Node Connectivity, Node Density.

1 Introduction
A Mobile Ad-hoc network is a temporary, short-lived dynamic network used in
battlefields, conferences, rescue operations and multimedia games. These networks
comprise a group of wireless mobile nodes which communicate with each other
without any fixed infrastructure. Routing in MANET is a challenging task as the
topology of such networks keeps changing due to various factors like node mobility,
change in node status and change in node density. Here the nodes act as both host
and router, forwarding packets to other mobile hosts. An individual node has limited
processing capacity but is capable of supporting a distributed approach through
coordination effort in the network [12]. Initially a node will not have prior knowledge
of its neighboring nodes or of the topology of the entire network. The nodes send
beacons to neighboring nodes, and listen to the broadcast messages from
neighboring nodes to find the list of current neighbors. This process continues until the
node knows about all other nodes, and it is repeated whenever a change in the topology
of the network is detected. Thus, through these neighbors, a node can communicate with
nodes outside its coverage area and maintain node connectivity in the network [3].
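
As a concrete illustration of the beacon-based neighbor discovery described above, the following minimal Python sketch (our own illustration, not code from the paper) maintains a neighbor table from periodically overheard beacons and expires entries that are no longer refreshed; the beacon interval and expiry threshold are assumed values chosen only for the example.

BEACON_INTERVAL = 1.0                     # assumed beacon period (seconds)
NEIGHBOR_EXPIRY = 3 * BEACON_INTERVAL     # drop neighbors not heard from recently

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = {}               # neighbor id -> time its last beacon was heard

    def on_beacon(self, sender_id, now):
        # Record (or refresh) a neighbor whenever its beacon is overheard.
        self.neighbors[sender_id] = now

    def purge_stale(self, now):
        # Remove neighbors whose beacons have stopped (node moved away or failed).
        self.neighbors = {n: t for n, t in self.neighbors.items()
                          if now - t <= NEIGHBOR_EXPIRY}

    def current_neighbors(self, now):
        self.purge_stale(now)
        return sorted(self.neighbors)

# Toy usage: node 1 hears beacons from nodes 2 and 3; node 3 then moves out of range.
n1 = Node(1)
n1.on_beacon(2, now=0.0)
n1.on_beacon(3, now=0.0)
n1.on_beacon(2, now=2.5)                  # node 2 keeps beaconing, node 3 does not
print(n1.current_neighbors(now=4.0))      # -> [2]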
Knowledge of topological characteristics like connectivity, coverage, maximum
inter-node distance and node degree helps in the design of new distributed protocols
and also in evaluating the performance of existing routing protocols [12]. The factors
that complicate the analysis of topological characteristics are node mobility and node
density [8], [9]. Paolo Santi and Douglas M. Blough have discussed the conditions
needed to ensure that a deployed network is connected initially and remains connected
as nodes migrate [8]. Certain routing protocols are found to perform better in densely
connected networks than in sparse networks [1]. A lot of work has been done on
evaluating the performance of MANET routing protocols under different topological
characteristics [1], [12], [2], [8], [9]. In this paper we evaluate the performance of
MANET routing protocols under varying node density and examine how it affects
node connectivity.
This paper analyzes the performance of routing protocols designed for MANET
under different node densities and the impact on node connectivity. The second
section of this paper discusses the routing protocols designed for MANET and the
performance metrics used. The third section discusses the impact of node density on
proactive and reactive routing protocols with a comparative evaluation. The fourth
section presents the NS-2 simulation environment used for comparing the effect of
node density on these two protocols. The fifth section discusses the simulation results
and the last section concludes the paper.

2 Routing Protocols and Performance Metrics


A Mobile Ad-hoc routing protocol is a rule or standard that tells the mobile nodes in
the network how to route packets. Effective routing protocols are needed to handle
dynamic topology, a major problem for MANET routing. Various routing protocols
have been proposed in the past, differing in the routing information, updating
mechanism and temporal information used for routing, the routing topology and the
utilization of specific resources [4]. Protocols designed for MANET have to deal with
high power consumption, low bandwidth and high error rates, which are the typical
limitations of these networks [5].
The two main classifications considered are Proactive Routing Protocols and Reactive
Routing Protocols. Proactive routing protocols store information about the nodes in the
network in order to transfer data between different nodes. These protocols constantly
update node information and may react to a change in the network topology even if no
traffic is affected by the topology modification, which can create unnecessary overhead.
Reactive routing protocols establish routes between nodes only when there is a request
to route data packets. They do not update every possible route in the network; instead
they focus on routes that are being used or being set up.
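
To illustrate the on-demand behaviour of reactive protocols, the sketch below (an AODV-style simplification of our own, not taken from the paper) floods a route request through the current neighbor graph only when a destination has to be reached and returns the first path discovered; in a real protocol the reply would travel back along the reverse path and the route would be cached.

from collections import deque

def discover_route(neighbors, source, destination):
    # Flood a route request (RREQ) hop by hop and return the first path found.
    # `neighbors` maps a node id to the set of nodes currently within radio range,
    # i.e. the connectivity induced by the current node density.
    visited = {source}
    queue = deque([[source]])              # each entry is the path taken by one RREQ copy
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == destination:            # destination would answer with a route reply (RREP)
            return path
        for nxt in neighbors.get(node, ()):
            if nxt not in visited:         # each intermediate node rebroadcasts the RREQ once
                visited.add(nxt)
                queue.append(path + [nxt])
    return None                            # partitioned network: destination unreachable

# Toy topology: a sparse network where node 4 is reachable only through node 3.
topology = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(discover_route(topology, 1, 4))      # -> [1, 3, 4]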
Both qualitative and quantitative metrics are needed to evaluate the performance of
a routing protocol [7]. Qualitative properties of a MANET routing protocol include
distributed operation, loop freedom, demand-based operation, proactive operation,
security, sleep-period operation and unidirectional link support. Quantitative metrics
used are end-to-end data throughput and delay, route acquisition time, percentage of
out-of-order delivery and efficiency. Other parameters that should be varied are
network size, network connectivity, topological rate of change, link capacity, fraction
of unidirectional links, traffic patterns, mobility, and the fraction and frequency of
sleeping nodes [6], [7].
This paper evaluates two routing protocols: DSDV from the proactive class and
AODV from the reactive class. Qualitative properties considered are distributed
operation, demand-based operation, proactive operation, sleep-period operation, and
unidirectional link support. Quantitative measures used are the number of routing packets,
normalized overhead, and packet delivery ratio. The two main parameters varied
to check the performance are network connectivity and node density. All
quantitative measures are evaluated at different node densities.

3 Impact of Node Density on Routing Protocols – Comparative Evaluation
A variety of performance metrics are discussed to determine the performance and
robustness of MANET routing protocols [7]. The goal is to compare the routing
protocols on the basis of qualitative and quantitative measures. Topological
parameters like node density and node connectivity are also considered for comparing
MANET routing protocols.
3.1 Impact of Node Density on Proactive Routing Protocol
In proactive routing protocols every node maintains routing information through
periodic route updates. Proactive routing protocols maintain up-to-date routing
information between every pair of nodes in the network. A dynamic change in topology
results in a change in the routing tables. A proactive protocol updates its routing tables
whenever the topology changes due to a change in node density. A change in the number of
nodes therefore produces more overhead for such networks, because the nodes must update
their routing tables when new nodes are added or existing nodes are disconnected.
Destination Sequenced Distance Vector (DSDV) is the protocol considered under the
proactive category.
Destination Sequenced Distance Vector (DSDV)
Destination Sequenced Distance Vector is a table-driven proactive protocol based on
the Bellman-Ford algorithm. DSDV is one of the most widely accepted proactive routing
protocols, developed by C. Perkins and P. Bhagwat in 1994 [5]. The main contribution
of this algorithm was to solve the routing loop problem. Every mobile node in the
network maintains a routing table in which all the possible destinations within the
network and the number of hops to each destination are recorded. Each entry in the
routing table contains a sequence number; the sequence number is even if a link is
present, otherwise an odd number is used. The number is generated by the destination,
and the sender needs to send out the next update with this number [4]. DSDV
periodically sends routing control packets to its neighbors for updating the routing
tables [12].
Selection of routes: If a node receives new information, the route with the latest
sequence number is used. If the sequence number is the same as the one in the table,
the route with the better metric is used. Stale entries, and the routes that use nodes
with stale entries as next hops, are deleted.
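To make this route-selection rule concrete, the following minimal Python sketch applies it to a single advertised route. The RouteEntry fields and the update_route helper are our own illustrative names, not part of the DSDV specification or of this paper.

from dataclasses import dataclass

@dataclass
class RouteEntry:
    dest: int        # destination node id
    next_hop: int    # neighbour used to reach dest
    metric: int      # hop count to dest
    seq_no: int      # destination-generated sequence number (even = link up)

# routing table: destination id -> RouteEntry
routing_table = {}

def update_route(advert: RouteEntry) -> None:
    """Apply the DSDV selection rule to an advertised route."""
    current = routing_table.get(advert.dest)
    if current is None or advert.seq_no > current.seq_no:
        # newer information from the destination always wins
        routing_table[advert.dest] = advert
    elif advert.seq_no == current.seq_no and advert.metric < current.metric:
        # same freshness: keep the route with the better (smaller) metric
        routing_table[advert.dest] = advert
    # older sequence numbers are ignored; stale entries are purged elsewhere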
Advantages: DSDV is a hop-by-hop distance vector routing protocol suitable for
ad-hoc networks with small node density. It selects the shortest path based on the
number of hops to the destination. Routing loops are eliminated. It uses a simple route
update protocol, and a routing table is maintained at each node. Sequence numbers are
used for making routing decisions. Network connectivity is found to increase because
multiple paths exist between the nodes; the failure of an intermediate node may not
affect connectivity if the node density is high.
Disadvantages: DSDV consumes battery power and some bandwidth even when the network is
idle, due to the regular updating of its routing tables. The periodic update
transmissions limit the number of nodes that can be connected to the network. It is not
suitable for highly dynamic networks, as a new sequence number is necessary
whenever the topology changes, for example when the number of nodes increases. Routing
overhead is directly related to the number of nodes in the network.
3.2 Impact of Node Density in Reactive Routing Protocols
Reactive routing protocols do not maintain network topology information. The
necessary path is obtained by a connection establishment process, and routing
information is not exchanged periodically. When node density increases, the
overhead for route calculation in reactive routing protocols is lower than in
proactive routing protocols.
Ad-Hoc On Demand Distance Vector Routing Protocol (AODV)
AODV, an extension of DSDV, is a reactive routing protocol designed for mobile
ad-hoc networks. AODV combines ideas from DSR, a reactive routing protocol, and
DSDV, a proactive routing protocol: it has the basic on-demand mechanisms of route
discovery and route maintenance from DSR, and the use of hop-by-hop routing,
sequence numbers and periodic beacons from DSDV. When a source node wants to
send information to a destination node and does not have a route to the destination, it starts
the route-finding process. It generates a RREQ and broadcasts it to its neighbors.
The route request is forwarded by intermediate nodes, and each node creates a
reverse path towards the source. When the request reaches a node with a route to the
destination, that node generates a RREP containing the number of hops required to
reach the destination. The RREP is routed back along the reverse path. Each node
maintains its own sequence number and broadcast id. To maintain routes, the nodes
monitor the link status of their next-hop neighbors in active routes. If the destination
or some intermediate node moves, steps are taken to update the routing tables of all
neighbors [4], [11].
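A minimal Python sketch of the route discovery just described is given below. It models only the RREQ flooding and the reverse-path construction, omits RREP propagation and sequence-number handling, and all class and method names (Node, on_rreq, etc.) are our own illustrative choices rather than part of the AODV specification.

from dataclasses import dataclass

@dataclass
class RouteEntry:
    next_hop: int   # neighbour to forward through
    hops: int       # distance in hops

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.neighbors = {}   # node_id -> Node
        self.routes = {}      # destination id -> RouteEntry
        self.seen = set()     # (origin, broadcast_id) pairs already processed

    def connect(self, other):
        self.neighbors[other.id] = other
        other.neighbors[self.id] = self

    def start_discovery(self, dest, broadcast_id):
        # source floods a RREQ to all of its neighbours
        for nbr in self.neighbors.values():
            nbr.on_rreq(origin=self.id, dest=dest,
                        broadcast_id=broadcast_id, hops=1, prev=self.id)

    def on_rreq(self, origin, dest, broadcast_id, hops, prev):
        if (origin, broadcast_id) in self.seen:
            return                                    # drop duplicate requests
        self.seen.add((origin, broadcast_id))
        # reverse path: remember how to reach the RREQ originator
        self.routes[origin] = RouteEntry(next_hop=prev, hops=hops)
        if self.id == dest:
            print(f"node {self.id}: destination reached, RREP goes back via {prev}")
        else:
            for nbr in self.neighbors.values():
                if nbr.id != prev:
                    nbr.on_rreq(origin, dest, broadcast_id, hops + 1, prev=self.id)

# tiny example topology: 0 - 1 - 2
a, b, c = Node(0), Node(1), Node(2)
a.connect(b); b.connect(c)
a.start_discovery(dest=2, broadcast_id=1)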
Features:
Combines the features of both DSR and DSDV
Route discovery and route maintenance from DSR
Hop-by-hop routing, sequence numbers and periodic beacons from DSDV
Advantages: Efficient utilization of bandwidth and a simple design. Each node acts as a
router and maintains a small routing table with effective and current routing
information. The protocol is loop free, copes with dynamic topology and broken links,
and is highly scalable compared with DSDV.
Disadvantages: No reuse of routing information, vulnerability to misuse, high route
discovery latency, and bandwidth overhead. When the number of nodes increases,
throughput initially increases because a large number of routes become available; after
a certain limit the throughput becomes stable [4].
3.3 Discussion on MANET Routing Protocols Using Various Performance
Metrics
Having discussed the two MANET routing protocols, DSDV and AODV, a
comparative discussion using various performance metrics is now made to judge the
performance and suitability of these routing protocols. Both qualitative and
quantitative metrics are used.

Distributed Operation: DSDV maintains routes between every pair of nodes in the
network, while AODV finds a path between nodes only when a route is required.
Broadcasting: In DSDV, routing tables are broadcast periodically to maintain routing
updates. In AODV, only hello messages are broadcast to neighbors to maintain node
connectivity.
Node Density: When node density varies, DSDV is affected more than AODV, as
DSDV has to maintain connectivity between every pair of nodes.
Bandwidth: The periodic updating of routing tables at each node results in wasted
bandwidth in DSDV. Bandwidth is used more effectively by AODV, which propagates
only hello messages to its neighbors, while RREQ and RREP messages are broadcast
only on demand.
Route Maintenance: For sending data to a particular destination, DSDV does not need
to find a route first, since it maintains all routes in the routing table of each node,
while AODV has to find a route before sending data.
Routing Overhead: Overhead in DSDV is higher when the network is large, and it
becomes hard to maintain the routing tables at every node. In AODV the overhead is
lower, as it maintains small tables for local connectivity.
Node Mobility: DSDV cannot handle mobility at high speeds: due to the lack of
alternative routes, the routes in the routing table become stale. AODV is not affected
as much, as it finds routes on demand.

4 Simulation Result and Observations


We evaluate the performance of the MANET protocols using measurements obtained
through both simulation and implementation. Simulations help to measure the
effectiveness of routing protocols under varying node densities.
Simulation is performed using the Network Simulator NS-2 [10]. NS-2 is an
object-oriented simulator developed as part of the VINT project at the University of
California, Berkeley. The simulator is event-driven and runs in a non-real-time
fashion. The main purpose of NS-2 is to evaluate the performance of existing networks
or of newly designed components.
Simulation Design: The basic configuration is a square area of 500 × 500 with the
number of nodes ranging from 5 to 50. The traffic source used is CBR (Constant
Bit Rate) with 512-byte data packets and a sending rate of 4 packets/second; the
radio propagation model used is TwoRayGround, the MAC type is 802.11, and the
ad-hoc routing protocols tested are DSDV and AODV.
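To illustrate how the varying node count translates into node density and connectivity in this field, the short Python sketch below estimates the average node degree for a uniform random placement in the 500 × 500 area. The transmission range of 250 is an assumption made purely for illustration; the paper does not state the radio range used.

import random
import math

SIDE = 500.0        # simulation field is a 500 x 500 square (as in this section)
TX_RANGE = 250.0    # assumed radio range; not specified in the paper

def average_degree(num_nodes, trials=20):
    """Estimate the mean number of neighbours per node for a random uniform placement."""
    total = 0.0
    for _ in range(trials):
        pts = [(random.uniform(0, SIDE), random.uniform(0, SIDE)) for _ in range(num_nodes)]
        degree = 0
        for i, (xi, yi) in enumerate(pts):
            for j, (xj, yj) in enumerate(pts):
                if i != j and math.hypot(xi - xj, yi - yj) <= TX_RANGE:
                    degree += 1
        total += degree / num_nodes
    return total / trials

for n in range(5, 55, 5):
    print(f"{n:2d} nodes -> average degree {average_degree(n):.2f}")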
Simulation Results: The number of nodes is the varying parameter, as it plays an
important role in performance. Figures 1, 2 and 3 show the various performance
parameters versus the number of nodes.
a) Packet delivery ratio: the number of packets successfully received by the
destination nodes over the total number of packets sent throughout the simulation.
With varying node densities, the packet delivery ratio of both DSDV and AODV
increases with an increasing number of nodes (Fig. 1).

Fig. 1. Number of Nodes Vs Packet delivery ratio


Fig. 2. Number of Nodes Vs Normalized Overhead

b) Normalized overhead: the number of routing packets over the number of data
packets successfully received at the destination (Fig. 2).
c) Number of routing packets: routing packets (RP) refers to the routing-related
packets, such as route request, route reply and route error, that are received by the
various nodes. The number of RPs received differs from the number of packets sent,
because nodes that receive such packets rebroadcast them to their neighboring nodes.
Here the number of routing packets is compared with the number of nodes to measure
node connectivity (Fig. 3).
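The two ratio metrics above can be computed directly from packet counts extracted from the simulation traces. The following minimal Python sketch shows the formulas, assuming the counts have already been parsed from the NS-2 trace file; the example values are illustrative only and are not taken from the paper's results.

def packet_delivery_ratio(data_received: int, data_sent: int) -> float:
    """Fraction of CBR data packets that reached their destinations."""
    return data_received / data_sent if data_sent else 0.0

def normalized_overhead(routing_packets: int, data_received: int) -> float:
    """Routing packets transmitted per data packet successfully delivered."""
    return routing_packets / data_received if data_received else float("inf")

# example values (illustrative only)
print(packet_delivery_ratio(data_received=940, data_sent=1000))    # 0.94
print(normalized_overhead(routing_packets=470, data_received=940)) # 0.5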

Fig. 3. Number of Nodes Vs Number of Routing Packets

5 Conclusion
We have presented a detailed performance comparison of important routing protocols
for mobile ad-hoc networks. DSDV and AODV are the two protocols taken for
comparison. The routing protocols were studied in detail and their features, advantages
and disadvantages were discussed. A comparative study on the basis of quantitative
and qualitative measures, including the parameters node connectivity and node density,
was then carried out. Both the reactive and the proactive protocol perform well in
terms of packet delivery when the number of nodes in the network increases, which
reflects node connectivity. For normalized overhead, both DSDV and AODV show an
increase in overhead with an increasing number of nodes. The number of routing
packets shows a decrease with increasing node density. AODV is more affected by a
change in node density than DSDV.
Future work may evaluate more routing protocols with different metrics, which will
be helpful for those trying to design a new or improved version of an existing routing
protocol. Further work can address issues related to the topological characteristics of
mobile ad-hoc networks.

References
1. Schult, N., Mirhakkak, M., LaRocca, D.: Routing in Mobile Ad Hoc Networks. IEEE, Los
Alamitos (1999)
2. Goel, A., Sharma, A.: Performance Analysis of Mobile Ad-hoc Network Using AODV
Protocol. International Journal of Computer Science and Security (IJCSS) 3(5), 334
(1999)
3. Adam, N., Ismail, M.Y., Abdullah, J.: Effect of Node Density on Performance of Three
MANET Routing Protocols. In: 2010 International Conference on Electronic Devices,
Systems and Applications (ICEDSA 2010). IEEE, Los Alamitos (2010)
4. Siva Ram Murthy, C., Manoj, B.S.: Ad Hoc Wireless Networks: Architectures and
Protocols, 2nd edn., pp. 321–347. Pearson Education, London
5. Royer, E.M., Toh, C.-K.: A Review of Current Routing Protocols for Ad Hoc Mobile
Wireless Networks. IEEE Personal Communications, 46 (1999)
6. Arun Kumar, B.R., Reddy, L.C., Hiremath, P.S.: Performance Comparison of Wireless
Mobile Ad-Hoc Network Routing Protocols. International Journal of Computer Science
and Network Security 8(6), 337 (2008)
7. Corson, S.: Mobile Ad hoc Networking (MANET): Routing Protocol Performance Issues
and Evaluation Considerations, RFC2501
8. Santi, P., Blough, D.M.: An Evaluation of Connectivity in Mobile Wireless Ad Hoc
Networks. In: Proceedings of the International Conference on Dependable Systems and
Networks (DSN 2002). IEEE, Los Alamitos (2002)
9. Deepa, S., Kadhal Nawaz, D.M.: A Study on the Behavior of MANET Routing Protocols
with Varying Densities and Dynamic Mobility Patterns. IJCA Special Issue on Mobile
Ad-hoc Networks, MANET 2010, 124 (2010)
10. Greis, M.: Tutorial for the UCB/LBNL/VINT Network Simulator ns
11. Perkins, C., Royer, E.M.: Ad hoc On demand distance vector (AODV) routing (Internet
draft) (August 1998)
12. Bagchi, S., Cabuk, S., Lin, L., Malhotra, N., Shroff, N.: Analysis of Topological
Characteristics of Unreliable Mobile Wireless Ad Hoc Networks (1999)

Survey and Comparison of Frameworks in Software Architecture
S. Roselin Mary1 and Paul Rodrigues2
1 Department of Computer Science and Engineering, Anand Institute of Higher Technology,
Chennai 603103, India
2 Department of Information Technology, Hindustan University, Chennai 603103, India
jesuroselin@gmail.com, deanit@hindustanuniv.ac.in

Abstract. The development of various architectural frameworks and models in
the field of software architecture shows the importance of such a governing
structure for growing and developed organizations. To create or choose the
right and suitable architecture framework for an organization, a comparative
study of all the frameworks and models must be carried out. This paper
technically analyzes various well-known frameworks based on their
views/perspectives, the kind of architecture they deal with, characteristics, system
development methodology, system modeling technique and business modeling
technique, and also explains their advantages and weaknesses. The frameworks
considered are the Zachman framework, TEAF, FEAF, TOGAF, DODAF and
ISO/RM-ODP.
Keywords: Framework, Software Architecture, Views.

1 Introduction
The complexity of any system can be understood with the help of the architecture of that
system. Planning is required when a system becomes more complex. Architecture is
the combination of the process and the product of planning, designing and constructing space
that reflects functional, social and aesthetic considerations [1]. It also encompasses
project planning, cost estimation and construction administration. In civil engineering,
architecture deals with the relationship between complexity and planning for buildings and cities. Customers and builders may have different views and perspectives reflecting their own interests [2].
Similarly, the same concept can be applied to software, where it is called software architecture. Building a very complex, critical and highly distributed system requires
interconnected components as basic building blocks, together with the views of the end user, designer, developer and tester. The research work of Dijkstra in 1968 and of David
Parnas in the early 1970s first identified the concept of software architecture.
Software architecture concerns the design and implementation of the high-level
structure of the software. It is the connection of architectural elements in well-chosen forms to achieve the major functionality and performance requirements of the
system, as well as non-functional requirements such as reliability, scalability,
portability and availability [4]. Software frameworks indicate the locations in the
architecture where application programmers may make adaptations for a specific
functionality [5]. A software framework is an abstraction in which common code providing generic functionality can be selectively overridden or specialized by user code.
Instead of concentrating on the low-level details of a working system, the designers
and programmers can concentrate only on the software requirements, thereby reducing
overall development time [6].
Even though software architecture is a relatively new field, its basic principles have been applied since the mid-1980s. Tracing the evolution of
software architecture from the algorithm era clearly shows the various stages
it has passed through and the concepts it has borrowed from other fields to take its present shape. The
following sections briefly describe the evolution and evaluation of software architecture. Section 2 describes the evolution from the algorithm to Zachman's framework and
from the Zachman framework to Service Oriented Architecture. Section 3 describes the classification of frameworks and lists the comparison criteria. The frameworks are evaluated in Section 4 using the criteria listed in Section 3.

2 Evolution of Software Architectural Frameworks


2.1 Evolution from 1920s to 1995
In 1928, the algorithm was partially formulated as a means of solving a problem by a finite sequence of instructions. To plan computer programs with a visual representation
of the instruction flow, von Neumann developed the flow chart in 1947. He took the
idea from the flow process chart (1921) and the multi-flow chart (1944), which were used
mostly in the area of electrical engineering. Later, the Control Flow Diagram (CFD)
was developed in the 1950s to describe the control flow of a business process or program. The representation of the flow of control was not enough to view complex systems, and it was not easy to design one for them; it did not give a high-level view
of the work or immediate access to particular points. The block diagram was therefore developed in the late 1950s to understand the complete system by dividing it into smaller
sections or blocks: each block performs a particular function, and the connections between the blocks are shown in the diagram.
In the meantime, the historical development of abstraction in the field of computer
science and programming marked a turning point in the field of software architecture.
The introduction of abstract data types in the late 1960s paved the way for grouping data
structures that have similar behavior, and for grouping certain data types and modules of
one or more programming languages that have similar semantics. The notion of abstract data types led to a software design technique called modular programming,
which introduced the concept of modules in software. Modules represent a separation of
concerns and maintain the logical boundaries between components. This concept
was introduced in 1968.
In 1977, the adoption of layered architecture based on modular programming led
to the Three Schema Approach, which builds information systems using three different
views in systems development. By breaking an application up into tiers, developers
had to modify only a specific layer rather than rewrite the entire application, which
helped to create flexible and reusable applications. By evolving the three-schema model into layers of six perspectives, John Zachman developed the Zachman
Framework in 1987. It still plays an important role in the era of Enterprise Architecture and influenced the frameworks DODAF, TOGAF, TEAF and FEAF. A modified
version of the Zachman Framework with more views was released in 1993. In
1995, the 4+1 view model was developed by Kruchten.
The purpose of the views used in these models was to analyze complex systems
and to list the elements of the problem and the solution around the domains of expertise. A view of a system is the representation of the system from the perspective of
a viewpoint. A viewpoint on a system focuses on specific concerns of the system; it
provides a simplified model with the elements related to those concerns
and hides other details [2], [4]. This section has discussed how frameworks and viewpoints
evolved from the algorithm through several stages. It clearly shows that the introduction of abstract data types and the layered approach paved the way towards the framework era. The
next subsection discusses how the various standard architectural frameworks evolved.
2.2 Evolution from 1995 to 2010
The need for frameworks in defense applications and the U.S. government's encouragement of
new architectures led to the C4ISR Architecture Framework in 1996. The C4ISR
framework ver. 2.0 was later restructured and released as the Department of Defense
Architecture Framework (DODAF) in 2003 [7], [8]. The Open Group Architecture
Framework (TOGAF) was developed by the members of the open architecture forums in
1995; recently, in 2009, TOGAF Version 9 was released [9].
The Federal Enterprise Architecture Framework (FEAF) was developed in 1999
by the Federal Government to integrate its myriad agencies and functions under a single
common enterprise architecture [10]. The Treasury Enterprise Architecture
Framework (TEAF) was developed by the US Department of the Treasury and published
in July 2000; it was used to support the Treasury's business processes in terms of
products [11].
Based on developments in distributed processing and using the concepts of abstraction, composition and emergence, the reference model RM-ODP was developed by
Andrew Herbert in 1984. A set of UML profiles was included in ODP, and
UML4ODP was introduced in 2004 [12]. In 2001, aspect-oriented programming emerged,
building on the principles of OOP; it led to aspect-oriented software development in 2002. In contrast to distributed processing and modular
programming, Service Oriented Architecture (SOA) emerged, and IBM announced the Service Oriented Modeling Architecture (SOMA) as the first publicly
announced SOA-related methodology in 2004. Based on this concept, SOMF ver. 1.1
was released by Michael Bell to provide tactical and strategic solutions to enterprise problems [13], [14].


This section has portrayed the independent development of frameworks based on
the Zachman framework, the OOP concept and the event-driven concept. The application of
UML to RM-ODP derived a new framework. Analyzing the concept and structure
of various frameworks and combining them appropriately with existing technology can yield a better framework. The frameworks dealt with in the forthcoming sections are
the ones most widely used by commercial and government departments, so it is
necessary to classify and compare them. The next section deals with the classification and
comparison of frameworks based on a few parameters.

3 Classification and Comparison Criteria


3.1 Classification of Frameworks
Frameworks were developed according to the interests of people in different fields and for various purposes, and they evolved from different base concepts in different
directions. To establish a new organization with an architectural
framework, or to introduce a framework into an existing organization for streamlining
its tasks, it is necessary to look at the environment in which these frameworks were
developed and used and to adapt them to the new environment. It is therefore necessary to classify them according to whether they were developed by standards
bodies, by individuals or by private agencies. The frameworks developed by
standards bodies fall under the standard category, and the others under the non-standard
category. They are also subcategorized according to their use for commercial or
government purposes.
Frameworks developed and used for government departments and for defense
applications are classified as government frameworks, while frameworks used
for commercial purposes are classified as commercial frameworks.
The open distributed model ISO RM-ODP falls under the standard, commercial frameworks. DODAF, FEAF and TEAF, which were developed for U.S. government agencies, fall under the standard, government frameworks. The well-accepted
and most widely used frameworks, TOGAF and the Zachman framework, are
used by both commercial and government agencies.
Even though TOGAF and the Zachman framework fall under the non-standard
category, mapping these frameworks to DODAF, FEAF and other standard frameworks has yielded good products in industry. The classification described in this section will be very useful for
customers to quickly choose the framework suited to their organization based on the nature of the job. The next subsection deals with the comparison
parameters that can be used by the customer to choose an appropriate tool.
3.2 Comparison Criteria
In this paper, we have surveyed a few of the most widely used frameworks. The
parameters used for comparison in existing surveys are not suitable for a customer
choosing a tool. Therefore, the methodologies, techniques and tools used in these
frameworks are considered for the comparison. The parameters used for comparison in
this paper are listed below.
1. Views / Viewpoints: the total number of views defined in the framework.
2. Domain: the domain of applications and services the particular framework
focuses on.
3. Origin: for whom the framework was developed and in which area it is best
suited.
4. Focus: the focus of the framework, i.e. business, cost, quality and so on.
5. Phase of SDLC: the stage of the software life cycle in which the particular
framework is most widely used.
6. System development methodology: a system development methodology is a
framework used to structure, plan and control the process of developing an information
system. Many such frameworks have emerged over the years, each with its own
recognized strengths and weaknesses. It is not mandatory to use one system development methodology for all projects; based on technical, organizational,
project and team considerations, each of the available methodologies may be
followed for specific kinds of projects. The most widely used methodologies are the Rational
Unified Process (RUP), Dynamic Systems Development Method (DSDM), Rapid
Application Development (RAD), Iterative Application Development (IAD),
Linear Application Development (LAD) and Extreme Programming (XP).
7. System modeling technique: the working principle of the system is revealed in
system modeling. These techniques help us examine how the different components
of a system work together to produce a particular result. The tools used for system
modeling are UML, flow charts, OMG Model Driven Architecture, Interface Definition Language and object-oriented programming.
8. Business modeling technique: a business model explains the functions of the
process being modeled. The nature of the process can be visualized, defined, understood and validated by representing its activities and flows. Available techniques are the flow chart, functional flow block diagram, control flow diagram, Gantt
chart, PERT diagram and Integration Definition (IDEF). Recently evolved
methods are the Unified Modeling Language (UML) and Business Process Modeling
Notation (BPMN).
9. Advantages: the benefits of using the particular framework.
10. Weaknesses: the drawbacks of the framework.
The following section discusses the well-known frameworks in terms of these comparison criteria.

4 Evaluation of Various Frameworks


4.1 Zachman Framework
The Zachman Framework describes a complex thing in different ways using different
types of descriptions. It provides thirty-six categories to describe anything completely. It has six different views (the Planner's View (Scope), Owner's View (Enterprise or
Business Model), Designer's View (Information Systems Model), Builder's View,
Subcontractor's View, and the actual system view) to allow each player to view the system
in their own particular way. The domain of this framework is mainly the categorization of
deliverables, and it is well suited to manufacturing industries. It focuses mainly on the
business process, and it can be used in the planning or design stage of the SDLC [15].
An organization's own system development methodology can be followed when
applying this framework. A system modeling technique such as OMG Model Driven Architecture, or the organization's own technique, can be followed; BPML is used as the
business modeling technique for this framework. It provides improved professional
communication within the community and an understanding of the reasons for, and risks of, not
developing any one architectural representation, and it supports a variety of tools and/or
methodologies [26]. However, it has a few weak points as well. It may lead to excessive documentation in some cases and may encourage a process-heavy approach to development. It is not well accepted by all developers; at first sight it appears to developers as
a top-down approach, and it is biased towards traditional and data-centric
techniques.
4.2 NATO Architecture Framework/C4ISR/DODAF
The Department of Defense Architecture Framework (DoDAF) organizes the enterprise architecture (EA) into consistent views. It is well suited to large,
complicated systems and interoperability challenges. DoDAF provides multiple
views, each of which describes various aspects of the architecture: the overarching All View (AV), Operational View (OV), Systems View (SV), and Technical
Standards View (TV). The "operational views" used here deal with the external customer's
operating domain. The framework focuses mainly on architecture data and business processes, and it is
used in the process or planning stage of the SDLC. The framework does not prescribe the
use of any one system development methodology; this depends on the organization's
decision. If the system to be developed is large, then UML tools are likely to be the
best choice for system modeling and the IDEF family for business modeling. DoDAF defines a
common approach for describing, presenting and comparing DoD enterprise architectures. Common principles, assumptions and terminology are used, and architecture
descriptions can be compared across organizational boundaries. It reduces
deployment costs and the reinvention of the same system [7]. The weaknesses of DoDAF are that there is no
common ontology of architecture elements in the framework, baseline (current) and
objective (target) architectures and business financial plans are not addressed, and the use
of architectures to measure effectiveness is not dealt with [23].
4.3 TOGAF
The Open Group Architecture Framework (TOGAF) provides a comprehensive approach to the design, planning, implementation and governance of enterprise information architecture. TOGAF identifies many views to be modeled in an architecture
development process, including Business Architecture views, Information Systems
Architecture views, Technology Architecture views and Composite views. The domain of this framework mainly covers business, data and applications. The
framework was motivated by the defense-side frameworks. It focuses
mainly on business processes, data, applications and technology, and it is used in the
process or planning stage of the SDLC. The Rational Unified Process (RUP) is used as the
system development methodology; UML and BPMN are widely used for system modeling and IDEF is used for business modeling. It offers increased transparency of accountability and provides controlled risk, protection of assets, proactive control and value
creation [21]. However, it is weak on information architecture, planning methods and the governance framework [9], it requires a great deal of detail, and it can lead start-up efforts into too
much too soon [22].
4.4 TEAF
The Treasury Enterprise Architecture Framework (TEAF) was developed by the US Department of the Treasury and published in July 2000 to support the Treasury's business
processes in terms of products. This framework guides the development and redesign
of the business processes for the various bureaus. It is based on the Zachman Framework
and provides four different views: the Functional View, Information View, Organizational View and Infrastructure View. Its domain is business processes, and it focuses mainly on the business process. It is used in the communication or planning stage
of the SDLC [11], [15]. It does not prescribe any specific system development methodology; this depends on the organization's decision [20]. Flow charts and UML can be used as
the system modeling technique, and IDEF and ERD can be used as business modeling
techniques. It provides guidance to the Treasury bureaus and offices in satisfying
OMB and other federal requirements, supports Treasury bureaus and offices according
to their individual priorities and strategic plans, and leads to Treasury-wide interoperability and reusability [11]. The TEAF does not, however, contain a detailed description of how
to generate the specification documents (work products) that are suggested for each
cell of the TEAF matrix [19].
4.5 FEAF
The Federal Enterprise Architecture (FEA) was developed for the Federal Government to
provide a common methodology for information technology (IT) acquisition, use and
disposal across the various government enterprises. It was built to develop a common taxonomy and ontology for describing IT resources. The FEAF provides for documenting architecture descriptions of high-priority areas and guides the description of architectures for
functional segments across the multiple organizations of the Federal Government.
Like the Zachman framework, FEAF has five different views (the Planner's View, Owner's View, Designer's View, Builder's View and Subcontractor's View). Its domain is the provision of services [15]. This framework is well suited to
enterprise architecture planning. It focuses mainly on business processes, data, applications and technology, and is used in the communication or planning stage of the SDLC
[15]. The Rational Unified Process (RUP) is used as the system development methodology; UML is widely used for system modeling and BPML for business modeling. It serves customer needs better, faster and more cost-effectively, promotes federal
interoperability and agency resource sharing, reduces costs for the Federal Government
and the agencies, improves the ability to share information, and supports capital IT investment planning at the federal and agency level [10]. The weakness of FEAF is that the Federal
Government risks allocating too much time and resources to an enterprise architecture description effort that yields potentially little return at significant cost. The Federal
Enterprise Architecture program requires technical and acquisition expertise, and the
federal IT community must keep its eyes on the basic principles rather than near-term
objectives and achievements. The Federal Government has to pay up-front for
the right to exercise options in the future. Concerns over territoriality and loss of
autonomy may impede the Federal Enterprise Architecture effort, because of the long-term
realignment of agency functions and responsibilities. It is also hard to establish common,
cross-agency models and standards to ensure interoperability [10].
4.6 ISO RM-ODP
The ISO Reference Model for Open Distributed Processing provides a framework
standard to support distributed processing on heterogeneous platforms. An object
modeling approach is used to describe systems in a distributed environment. The
five viewpoints described by RM-ODP are the enterprise viewpoint, information viewpoint, computational viewpoint, engineering viewpoint and technology viewpoint. Its
domain is information sharing in distributed environments, and the framework is
well suited to major computing and telecommunication companies. It focuses mainly
on business processes, technical functionality and solutions. It is used in the processing and communication stages of the SDLC. The object-oriented method and IAD can be used
as the system development methodology; UML and OMG techniques are widely used for system
modeling and BPML for business modeling. It provides a great deal of detail for the
analysis phases of application development, a platform to integrate requirements from different languages consistently, a set of established reasoning patterns to identify the fundamental entities of the system and the
relations among them, appropriate degrees of abstraction and precision
for building useful system specifications, and a set of mechanisms and common services for building robust, efficient and competitive applications that are interoperable
with other systems [17]. RM-ODP, however, has the problem of inter-view consistency:
a number of cross-view checks have to be done to maintain consistency, yet these
checks do not guarantee it [16].

6 Conclusion
This paper summarizes the frameworks based on the important criteria used in industry- and business-side applications, and it discusses the benefits and drawbacks of
each framework. These points will help users choose a suitable framework
for their industry, organization or business based on their requirements. Users can
easily identify the supporting tools available for the frameworks of their choice. From
the comparison criterion "Focus", we can conclude that all the frameworks
developed mainly focus on business and IT solutions. In future, the frameworks can be
enhanced to focus on quality through effective mapping between frameworks; ancient
Indian architecture styles and patterns could be mapped onto the familiar frameworks to
yield new frameworks that focus on quality.


References
1. Conely, W.: About Architecture (2009),
http://www.ehow.com/about_4565949_architecture.html
2. Sessions, R.: A Comparison of Top Four Enterprise Architecture Methodologies. ObjectWatch, Inc. (May 2007),
http://www.objectwatch.com/white_papers.htm
3. Bass, L., Clements, P., Kazman, R.: What is software Architecture? In: Software Architecture in Practice, ch.2, 2nd edn., pp. 19-45. Addison Wesley, Reading (2003)
4. Kruchten, P.: Architectural Blueprints – The 4+1 View Model of Software Architecture.
IEEE Softw. 12, 42–50 (1995)
5. Shan, T.C.: Taxonomy of Java Web Application Frameworks. In: Conf. Rec. 2006 IEEE
Int. Conf. e-Business Engg., pp. 378–385 (2006)
6. HighBeam Research: Software Framework (2008),
http://www.reference.com/browse/Software_framework
7. U.S. Dept. of Defense: DoD Architecture Framework Version 1.5. (April 23, 2007),
http://www.cio-nii.defense.gov/docs/DoDAF_Volume_II.pdf
8. Kobryn, C., Sibbald, C.: Modeling DODAF Compliant Architectures (October 25, 2004),
http://www.uml-forum.com/dots/White_Paper_Modeling_DoDAF_UML2.pdf
9. The Open Group: Module 2 TOGAF9 Components (2009),
http://www.opengroup.org/togaf/
10. U.S. Chief Information officers (CIO) Council: Federal Enterprise Architecture Framework Version 1.1 (September 1999),
http://www.cio.gov/documents/fedarch1.pdf
11. U.S. Treasury Chief Information officer Council: Treasury Enterprise Architecture Framework Version 1 (July 2000), http://www.treas.gov/cio
12. Ignacio, J.: UML4ODP PLUGIN User guide Version 0.9., Atenea Research Group,
Spain (2009), http://issuu.com/i72jamaj/docs/uml4odp_plugin
13. Bell, Michael: Introduction to Service-Oriented Modeling. In: Service-Oriented Modeling:
Service Analysis, Design, and Architecture. Wiley & Sons, Chichester (2009)
14. Buckalew, P. M.: Service Oriented Architecture (2009),
http://www.pmbuckalew.com/soa.htm
15. Schekkerman, J.: A Comparative Survey of Enterprise Architecture Frameworks. Institute
for Enterprise Architecture Developments, Capgemini (2003), http://www.enterprise-architecture.info
16. Maier, M., Rechtin, E.: Architecture Frameworks. In: The Art of Systems Architecting,
2nd edn., pp. 229–250. CRC Press, Florida (2000)
17. Vallecillo, A.: RM-ODP: The ISO Reference Model for Open Distributed Processing. ETSI Informática, Universidad de Málaga,
http://www.enterprise-architecture.info/Images/Documents/RM-ODP.pdf
18. Liimatainen, K., Hoffmann, M., Heikkilä, J.: Overview of Enterprise Architecture Work in
15 Countries. FEAR Research Project, Ministry of Finance, Finland (2007),
http://www.vm.fi/julkaisut
19. Leist, S., Zellner, G.: Evaluation of Current Architecture Frameworks. University of
Regensburg, Germany (2006),
http://www.dcc.uchile.cl/~vramiro/d/p1546-leist.pdf


20. Treasury Enterprise Architecture Framework,


http://www.en.wikipedia.org/.../Treasury_Enterprise_
Architecture_Framework
21. What is TOGAF?
http://www.articlebase.com/information-technologyarticles/what-is-togaf-626259.html
22. Westbrock, T.: Do Frameworks Really Matter?, EADirections (October 24, 2007),
http://www.eadirections.com/.../
EAdirections%20Frameworks%20Breakout%20updated.pdf
23. Mosto, A.: DoD Architecture Framework Overview (May 2004),
http://www.enterprise-architecture.info/Images/.../DODAF.ppt
24. Jim: Applicability of DODAF in Documenting Business Enterprise Architectures
(August 9, 2008), http://www.thario.net/2008/08/applicability-ofdodaf-in-documenting.html
25. Ambler, S.: Extending the RUP with the Zachman Framework (2007),
http://www.enterpriseunifiedprocess.com/essays/
ZachmanFramework.html
26. Zachman, J.A.: A Framework for Information Systems Architecture. IBM Syst. J. 26(3),
276–292 (1987)
27. Gulla, J., Legum, A.: Enterprise Architecture Tools project (2006),
http://www.enterprise-architecture.info/EA_Tools.htm
28. May, N.: A survey of Software Architecture Viewpoint models (2005),
http://mercuryit.swin.edu.au/ctg/AWSA05/Papers/may.pdf

Two Layered Hierarchical Model for Cognitive Wireless Sensor Networks
K. Vinod Kumar1, G. Lakshmi Phani2, K. Venkat Sayeesh3, Aparna Chaganty4,
and G. Rama Murthy5
1,2,3 National Institute of Technology, Warangal, India
4 Indian Institute of Information Technology, Design & Manufacturing, Jabalpur, India
5 Communication Research Centre, IIIT Hyderabad, India
{vinodreddy.nitw, phani.l.gadde, sayeesh.nitw, aparna.214}@gmail.com,
rammurthy@iiit.ac.in

Abstract. In recent years, we have seen tremendous growth in the applications
of wireless sensor networks (WSNs) operating in unlicensed spectrum bands.
However, there is evidence that the existing unlicensed spectrum is becoming
overcrowded. On the other hand, with recent advances in cognitive radio
technology, it is possible to apply the dynamic spectrum access model in
WSNs to gain access to less congested spectrum with better propagation
characteristics. One of the predominant problems in cognitive-aided sensor
networks is spectrum management. In this paper we propose an effective way
of cooperative spectrum management in a large environment. The key idea is
to localize the sensor field by forming clusters such that each cluster is, in a way,
independent of the others. Intra-cluster communication takes place within the
locally detected spectrum holes and inter-cluster communication takes place
between cluster heads in a common spectrum hole, thus forming a two-layered
cognitive-aided sensor network.
Keywords: Wireless Sensor Networks, Cluster, Cluster heads, Cognitive
Radio, Co-operative Spectrum detection.

1 Introduction
Recent technological advances have made the development of small, low-power,
low-cost, multifunctional, distributed devices capable of wireless communication a
reality. Such nodes, which have the ability to do local processing, are called sensor
nodes (motes); only a limited amount of processing is possible in a sensor node.
Wireless sensor networks are the key to gathering the information needed by
smart environments, whether in buildings, utilities, industry, homes, automation,
transportation systems, shipboard or elsewhere. Recent guerilla warfare countermeasures
need a distributed network of sensors that can be deployed using, for example, an
aircraft. In such applications, cabling or running wires is generally impractical: a
sensor network is required that is easy to install and fast to maintain. A key feature
of current WSN solutions is operation in unlicensed frequency bands, for instance
the worldwide available 2.4 GHz band. However, the same band is shared by other
very successful wireless applications, such as Wi-Fi and Bluetooth, as well as other
proprietary technologies, and therefore the unlicensed spectrum is becoming
overcrowded. As a result, coexistence issues in unlicensed bands have been the subject of
extensive research. In addition, a large portion of the assigned spectrum is used only
sporadically: spectrum usage is concentrated in certain portions of the spectrum
while a significant amount of spectrum remains unutilized.
The limited available spectrum and the inefficiency of spectrum usage
necessitate a new communication paradigm that exploits the existing wireless spectrum
opportunistically. Dynamic Spectrum Access has been proposed to solve these
spectrum-inefficiency problems. DARPA's approach to Dynamic Spectrum Access
networks, the so-called NeXt Generation (xG) program, aims to implement policy-based
intelligent radios known as cognitive radios. Defined for the first time by J. Mitola
in 1999, cognitive radios are promising solutions for improving the utilization of the
radio spectrum. The central idea of cognitive radio is to periodically monitor the radio
spectrum, intelligently detect occupancy in the spectrum and then opportunistically
communicate over spectrum holes with minimal interference to active licensed users.
Similar to existing WSNs, a Cognitive Wireless Sensor Network (CWSN)
consists of many tiny and inexpensive sensors, where each node operates on limited
battery energy. In a WSN, each node either sends and receives data or is in an idle state.
In a CWSN, however, there is another state, called the sensing state, where the
sensor nodes sense the spectrum to find spectrum opportunities or spectrum holes.
Adding cognition to a WSN provides many advantages. Sensor nodes in a CWSN
can measure and provide accurate information at various locations within the
network, and measurements made within the network provide the diversity needed to
cope with multi-path fading. In addition, a CWSN can provide access not only to
new spectrum (rather than the worldwide available 2.4 GHz band), but also to
spectrum with better propagation characteristics. In this paper we propose a novel
two-layered approach for cooperative spectrum sensing. The rest of this paper is
organized as follows: Section 2 reviews related work, Section 3 presents the motivation,
Section 4 describes our proposed approach, and Sections 5 and 6 give the conclusions
and future work.

2 Related Work
Nodes are grouped into distinct and non-overlapping clusters, and one of the nodes in
each cluster is made the cluster head. The cluster head collects sensor data from the other
nodes in the cluster and transfers the aggregated data to the base station. Since data
transfer to the base station dissipates much energy, these cluster heads have some
extra energy and a longer transmission range, i.e. they are different from normal nodes.
Since sensor nodes have limited power, power consumption is considered one of
the most important issues, and care has to be taken to design protocols that
consume less power. Hence in a CWSN, the more sensors that participate in sensing,
the more energy is consumed; thus, we tend to limit the sensing task to some sensors
only. Spectrum sensing is a key task of a cognitive radio: it allows the identification of
spectrum holes and helps in exploiting them efficiently. The most effective way of
detecting spectrum holes is to detect the primary receivers in the range of the secondary
users (sensors). In practice, however, it is difficult for a cognitive radio to measure
directly the channel between the primary receiver and transmitter, so most of
the research nowadays focuses on primary transmitter detection based on
observations by the secondary users. In general, spectrum sensing
techniques can be classified as transmitter detection, cooperative detection and interference
detection.
A cognitive radio should be able to differentiate between used and unused spectrum bands,
so it should be able to determine whether a signal from a primary transmitter is present in the
spectrum. The transmitter detection approach is based on the detection of weak signals
from the primary transmitter. Transmitter detection relies on the assumption that the
locations of the primary receivers are unknown, due to the absence of interaction
between the primary and secondary users; moreover, the transmitter detection
model cannot prevent the hidden terminal problem, and sometimes the secondary
user may not be able to detect the transmitter because of shadowing. In the
cooperative detection model, on the other hand, information from multiple secondary users is
used for primary user detection. Cooperative detection can be done in two ways:
centralized and distributed. In the centralized method, a secondary base station collects all
the sensing information from its users and detects the spectrum holes, while in the
distributed method the secondary users exchange observations. The cooperative spectrum
sensing task in the CWSN sensing state can thus be performed either by a distributed or
by a centralized scheme; we use a centralized scheme for the reasons explained
earlier. In a centralized scheme, spectrum opportunities are detected by a single entity
called the network coordinator. The network coordinator broadcasts a channel-switch
command to indicate an alternative available channel. The alternative channel could be
another licensed channel or an unlicensed channel in the ISM band. The broadcast
message can be retransmitted by multiple nodes to deliver it reliably.
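As an illustration of the centralized scheme, the short Python sketch below fuses per-user hard decisions at the base station with a simple OR rule (a channel is declared busy if any secondary user reports a primary signal on it). The OR rule and all names here are our own assumptions made for illustration; the paper does not specify a particular fusion rule.

def detect_holes(reports, num_channels):
    """Centralized hard-decision fusion (OR rule).

    reports: one list of binary decisions per secondary user, where
             reports[u][c] == 1 means user u detected a primary on channel c.
    Returns the set of channels considered free (spectrum holes).
    """
    holes = set()
    for c in range(num_channels):
        if not any(user[c] for user in reports):   # no user saw a primary -> hole
            holes.add(c)
    return holes

# example: three secondary users, four channels
reports = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 0],
]
print(detect_holes(reports, 4))   # {1, 3}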
Typically, two traffic load configurations exist in a CWSN:
Regular status reports: each sensor sends regular status updates to the coordinator.
The information in such status updates depends on the particular application.
Control commands: control messages are sent by the coordinator. For example, in
a heat control application, the coordinator sends commands to switch the heaters
on or off.
In most cases, primary user detection is poor due to multi-path fading and correlated
shadowing. Cooperative sensing is a method in which xG users cooperate with
each other instead of competing. Cooperative detection among unlicensed users is
theoretically more accurate, since the uncertainty in a single user's detection can be
minimized. Cooperative detection schemes mitigate multi-path fading and
shadowing effects, which improves the detection probability in a heavily shadowed
environment.

3 Motivation
As discussed in the previous section, distributed sensing is a very useful technique for
spectrum sensing. Consider a very large network where primary users and
secondary users coexist, sharing the spectrum. The secondary users sense the spectrum
at regular intervals of time; this time interval (dt) is decided by the permissible latency
for the primary users. The secondary users sense the spectrum, detect the spectrum
holes and use them without causing any harmful interference to the primary users. Since the
secondary network in this scenario is a wireless sensor network, it needs a single band of
spectrum to communicate. The best suitable contiguous spectrum available to the
whole network will have a very small bandwidth, whereas when we divide the network into
smaller parts (clusters), the best available band for each cluster will be relatively
large. Consider the figure, where we see six regions in a big network: the frequency
band that is free in all six areas is very small compared with the
spectrum that is locally free within each region.

4 Our Contribution
Our idea is to use the locally free spectrum for communication within a cluster. The
whole network is divided into clusters, and each cluster is headed by a coordinator
node that has extra capabilities. The coordinator node communicates with all
the nodes in that particular cluster and with the base station. All data that is to be routed to
the secondary base station is sent first to the coordinator node, and the coordinator node
then communicates with the adjacent coordinator. A coordinator maintains two channels
with every node and with the neighbouring coordinator nodes:
(1) a control channel, and
(2) a data channel.
The control channel operates in the unlicensed band (2.4 GHz) and carries only those
packets related to the spectrum sensing activity. The data channel carries the data to be
routed to the base station via the sensors; it operates in the free spectrum bands that are
decided centrally by the base station.
Procedure:
1) The coordinator node senses the spectrum at regular intervals to detect the
spectrum holes in its cluster and sends this sensing information via the control
channel to the adjacent coordinator node. Eventually the base station receives
all the spectrum sensing information.
2) Based on this information, the base station decides the communication frequency
in which each cluster should communicate in order to avoid harmful
interference to the primary users. This information is also sent via the control
channel.
3) Once the coordinator node gets the information about the communication
frequency band, it notifies all the sensor nodes (secondary users)
within the cluster.
4) All the sensors then start sending their sensed data to the coordinator on the
data channel, which operates in the locally free bands specified by the coordinator.


Fig. 1. A sample network with primary and secondary users

Fig. 2. Control and Data links between nodes

5) The coordinator also forwards this data to the neighbouring coordinator;
finally, all the data reaches the base station.
6) The nodes route in the best available band (which changes from time to
time) without causing interference to the primary users.
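The message flow of steps 1–6 can be illustrated with the small, self-contained Python sketch below. It is our own illustration rather than an implementation from the paper; the class names (Coordinator, BaseStation) and the representation of spectrum holes as sets of channel indices are assumptions made purely for clarity.

class Coordinator:
    """Cluster head: senses local spectrum holes and relays them upstream."""
    def __init__(self, cluster_id, local_holes):
        self.cluster_id = cluster_id
        self.local_holes = set(local_holes)   # locally free channel indices
        self.data_channel = None              # assigned later by the base station

    def report_holes(self):
        # step 1: sensing report sent over the 2.4 GHz control channel
        return self.cluster_id, self.local_holes

    def assign_channel(self, channel):
        # step 3: notify the cluster members of the chosen data channel
        self.data_channel = channel
        print(f"cluster {self.cluster_id}: data channel set to {channel}")


class BaseStation:
    """Central entity: picks one locally free channel per cluster (step 2)."""
    def choose_channels(self, reports):
        assignment = {}
        for cluster_id, holes in reports:
            # any locally free channel avoids interference to primary users;
            # here we simply pick the lowest-numbered hole
            assignment[cluster_id] = min(holes) if holes else None
        return assignment


# example: three clusters with different locally detected spectrum holes
coords = [Coordinator(1, {5, 9, 12}), Coordinator(2, {3, 9}), Coordinator(3, {7, 8, 9})]
bs = BaseStation()
for cid, ch in bs.choose_channels(c.report_holes() for c in coords).items():
    next(c for c in coords if c.cluster_id == cid).assign_channel(ch)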


5 Conclusion
In this paper, we proposed a conceptual model of a two-layered architecture for
cognitive-aided WSNs. Considering the challenges raised by wireless sensor
networks, the use of cognitive radio appears crucial to achieving satisfactory
results in terms of efficient use of the available spectrum and limited interference with the
licensed users. As described in this paper, the development of cognitive-radio-aided
sensor network technology requires the involvement and interaction of many
advanced techniques such as cooperative sensing, interference management and cognitive
radio reconfiguration management. Energy constraints are the main limitation of
cooperative sensing; they can be overcome by placing some coordinator nodes with
extra power. By doing so, the network lifetime of the WSN will increase to a great extent,
and the unutilized spectrum can be used more efficiently with good QoS. In addition, each node
maintains two channels, which is an added advantage as the data and control channels
are separated.

6 Future Work
In this paper, we have presented a two-layered hierarchy for cooperative spectrum
sensing. In the future we would like to enhance the performance of this wireless-sensor-based
cooperative sensing by implementing various sensing methods, and also to study it
under problems such as shadowing and fading. As future work we would like to
implement this model and obtain real-time results for it.

References
[1] Akyildiz, I.F., Lee, W.-Y., Vuran, M.C., Mohanty, S.: NeXt generation/dynamic spectrum
access/cognitive radio wireless networks: A survey. Computer Networks 50, 2127–2159
(2006)
[2] Ganesan, G., Li, Y.G.: Cooperative spectrum sensing in cognitive radio, Part I: two user
networks. IEEE Trans. Wireless Commun. 6, 2204–2213 (2007)
[3] Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: A Survey on Sensor
Networks. IEEE Communications Magazine (August 2002)
[4] Mitola III, J., Maguire Jr., G.: Cognitive Radio: Making Software Radios More Personal.
IEEE Personal Communications (see also IEEE Wireless Communications) 6(4), 13–18
(1999)
[5] Haykin, S.: Cognitive radio: brain-empowered wireless communications. IEEE Journal on
Selected Areas in Communications 23(2), 201–220 (2005)

3D-CGIN: A 3 Disjoint Paths CGIN with Alternate Source

Meenal A. Borkar (1) and Nitin (2)
(1) Department of Computer Science and Engineering, Amrapali Institute of Technology and Sciences, Shiksha Nagar, Haldwani-263139, Uttarakhand, India
meenal.borkar@gmail.com
(2) College of Information Science and Technology, The Peter Kiewit Institute, University of Nebraska at Omaha, Nebraska-68182-0116, United States of America
fnunitin@mail.unomaha.edu

Abstract. The performance of multiprocessor systems is greatly dependent on
interconnections and their fault tolerance ability. Handling faults becomes very
important to ensure steady and robust working. This paper introduces a new
CGIN with at least 3 disjoint paths. It uses an alternate source at the initial
stage, due to which at least 3 disjoint paths can be ensured. The network
provides multiple paths between any source and destination pair. The alternate
source guarantees delivery of packets to the intended destination even if two
switches or links fail. The alternate source proves quite helpful in case of any
fault at the initial stage, or when the source is busy. In such cases, the packets can be
retransmitted through the alternate source to avoid delayed delivery or
starvation, a facility not available in the original CGIN. This network also
provides dynamic re-routing to tolerate faults. The paper further presents
two very simple routing strategies: the first for routing in a fault-free environment
and the second for routing in a faulty environment.
Keywords: Gamma Interconnection Network, CGIN, disjoint paths, fault
tolerance, Distance tag routing, Destination tag routing, re-routing.

1 Introduction
In a multiprocessor system, many processors and memory modules are tightly
coupled together with an interconnection network. A properly designed
interconnection network certainly improves the performance of such a multiprocessor
system. Multistage Interconnection Networks (MIN) are highly suitable for
communication among tightly coupled nodes. For ensuring high reliability in complex
systems, fault tolerance is an important issue. The Gamma Interconnection Network
(GIN) is a class of MIN that is popularly used in many multiprocessor systems.
In a gamma interconnection network, there are multiple paths between any source
and destination pair except when the source and destination are the same. To overcome this

drawback, many new techniques have been introduced. These techniques have also
improved the fault tolerance capability of the GIN. These networks are the Extra Stage
Gamma Network, Monogamma Network, B-network, REGIN, CGIN, Balanced-GIN,
PCGIN, FCGIN and 3DGIN. These network architectures use additional stages,
backward links and alterations in connecting patterns to tolerate faults. These
networks also suggest techniques to route packets towards the destination in case of
the occurrence of a fault(s).
In this paper, we propose a new network, namely 3D-CGIN, a 3 Disjoint Paths CGIN
with an alternate source. This network is capable of tolerating two switch or link faults
by providing an alternate source. Its hardware complexity is approximately equal to
that of PCGIN. We further propose a simple routing algorithm for packet delivery. This
paper is organized as follows: Section 2 covers the background and motivation for this
architecture, Section 3 introduces 3D-CGIN and its topology, Section 4 focuses on
routing in a fault-free environment and re-routing techniques to tolerate faults, and Section 5
provides a comparison of 3D-CGIN with other networks. Concluding remarks are in
Section 6, followed by acknowledgments and references.

2 Background and Motivation


2.1 The Gamma Interconnection Network (GIN)
The Gamma Interconnection Network [1] is an interconnection network connecting N
= 2^n inputs to N outputs. It consists of log2(N) + 1 stages with N switches per stage.
These switches are connected with each other using crossbar switches: the input
stage uses 1 x 3 crossbars, the output stage uses 3 x 1 crossbars and all the intermediate
stages use 3 x 3 crossbars. A typical Gamma Network is shown in Fig. 1. The stages
are linked together using power-of-two [2] and identity connections such that
redundant paths exist. The path between any source and destination is represented using
any one of the redundant forms of the difference between source and destination.
These redundant forms are generated using a fully redundant binary number system.
A number system gives a method to express numeric values. In a radix-r fully
redundant number system, each digit has (2r - 1) representations, ranging over
{-(r-1), ..., -1, 0, 1, 2, ..., (r-1)}. These number systems are redundant in the sense
that some values have multiple representations; all non-zero values have
multiple representations. In the binary case, a digit can take any of the three values: 1, 0 and -1.
In a Gamma Network, a packet visits n routing points before reaching its
destination. There are three possible connections at stage i: the packet from node j
takes a straight path to node j at stage i+1; or reaches node (j - 2^i) mod N by
taking an upward link; or reaches node (j + 2^i) mod N by taking a downward link. A
routing tag is attached to each packet, which guides the packet through the network.
This routing tag is made up of n digits, where n = log2(N). This tag denotes the modulo-N
difference between destination and source. For this difference, we generate the redundant
representations. Each representation denotes a connection pattern from source to
destination.


In this network, three different packets can arrive at a switching element at the
same time. A node j at stage i+1 can receive input packets from the following three nodes
at stage i: j, (j - 2^i) and (j + 2^i). These three packets will have 0, 1 and -1, respectively, as the ith digit of their
routing tags. This network provides multiple paths between any source and destination
pair using the redundant number representation. However, it provides a unique path
when the source is the same as the destination.
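As a small illustration of the redundant number representation discussed above, the sketch below enumerates every n-digit routing tag (digits in {-1, 0, 1}, digit i weighted by 2^i) that represents a given modulo-N distance; each such tag corresponds to one connection pattern from source to destination. The brute-force enumeration and function name are our own illustration, assuming a plain Gamma Network of size N = 2^n.

from itertools import product

def gamma_tags(source, dest, n):
    """Enumerate all n-digit tags (digits in {-1, 0, 1}, weight 2^i for
    digit i) whose weighted sum equals (dest - source) mod 2^n."""
    N = 1 << n
    distance = (dest - source) % N
    tags = []
    for digits in product((-1, 0, 1), repeat=n):
        if sum(d << i for i, d in enumerate(digits)) % N == distance:
            tags.append(digits)
    return tags

# For N = 8 (n = 3): every non-zero distance has more than one representation,
# while distance 0 has only the all-zero tag (the unique-path case S = D).
print(gamma_tags(0, 3, 3))        # several redundant tags for distance 3
print(gamma_tags(0, 0, 3))        # [(0, 0, 0)] only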

Fig. 1. The Gamma Interconnection Network

2.2 Cyclic Gamma Interconnection Network (CGIN)


In an n-stage Gamma Network, the connection patterns can be represented as 2^0, 2^1, 2^2,
..., 2^(n-1). In CGIN [8] the connection patterns were altered by repeating one stage pattern at the
initial and end connections. This alteration provided multiple disjoint paths,
guaranteeing tolerance of one arbitrary fault. It also guaranteed multiple paths between
any pair of source and destination. The resulting network has a cyclic nature.
A CGIN_n^α is a CGIN with n stages whose connecting patterns between the first two
stages and the last two stages are 2^α, where 0 <= α <= (n-2). So, the connecting patterns
between stages can be ordered as
2^α, 2^(α+1), 2^(α+2), ..., 2^(n-3), 2^(n-2), 2^0, 2^1, 2^2, ..., 2^(α-1), 2^α


The stages of CGIN_n^α are numbered 0 to n and the connecting patterns are based on
plus-minus 2^((α+i) mod (n-1)) functions. Fig. 2 shows a typical CGIN_3^0.
Each request in CGIN_n^α carries a routing tag of n digits. The weight of each digit
is determined by the connecting pattern: for a tag digit d_i, the weight is
2^((α+i) mod (n-1)) if d_i is ±1. The routing complexity of CGIN is the same as
that of GIN. CGIN reduces the pin count, as it uses a 2^α connecting pattern instead of
2^(n-1). It reduces the total layout area as well, thus achieving a reduction in cost.
CGIN uses destination tag routing and re-routing to tolerate any arbitrary single
fault. It does not provide strong re-routability; strong re-routability implies that the
packet can be re-routed at every stage. CGIN provides at least 2 disjoint paths
between any pair of source and destination. However, packet delivery can fail if the
source is faulty.

Fig. 2. CGIN_3^0

2.3 Review of Fault Tolerance Techniques Used in GIN


In order to provide reliable interconnection between each source and destination, one
needs to look into the availability of switching nodes and links. Generally, a switching
element is said to be faulty if it is down due to non-functionality or is busy with


transfer of packets. There are two possible faults in the Gamma Network: either an SE,
i.e. a switching element, is faulty, or a link connecting two SEs is faulty. When an SE is
faulty, either the source or the SE at the previous stage should take a decision about
retransmission of the packet or re-routing it through some intermediate SE. In
case of a link failure, the node connected with it should be able to choose an alternate
path to the destination. In the following section we discuss various techniques used to
tolerate these faults.
Fault Tolerance Techniques. We try to focus on the major attempts made to tolerate
the faults and improve the performance as well as terminal reliability of the Gamma
Network. The majority of the work is done by providing additional hardware or
altering the connection patterns.
Adding an Extra Stage. Adding an extra stage to the Gamma Network eliminates the following
two problems: first, the unique path between a source and destination when S = D, and
second, that the number of paths for even tag values is less than the number of paths for
odd tag values. To provide multiple paths for S = D, an extra stage is added to the
Gamma Network. The connection pattern for this extra stage can be that of any stage of the
gamma network. The routing tag is again made up of three possible values: 1, 0 and
-1. By using an additional digit for the extra stage, one can generate the multiple paths
from source to destination. The routing tags are generated in a similar manner as for
the Gamma Network. The routing algorithm is a simple extension of routing in the
Gamma Network. The Extra Stage Gamma Network [3] uses this concept to provide
multiple paths, which can be followed to handle faults.
Providing Back Links. In any multiprocessor system, the memory requests from
processing elements are generated randomly, hence path or memory conflicts are
inevitable. By increasing switch sizes the path conflicts may be reduced, but memory
conflicts are still unavoidable. Providing extra buffer space will certainly reduce the
memory conflicts, but the implementation becomes very costly. Therefore, some
networks use backward links to provide multiple paths, to cope with path / memory
conflicts. The B-network [4] is a network using this particular fault tolerance
technique. In this technique, the requests blocked due to a path / memory conflict are
simply sent back one stage and from there a new path is selected for the packet. In
this approach, the packet may follow any number of back links, and then may get
forwarded to the destination. The following are certain features observed with back links:
1) the backward links act as implicit buffers, 2) the backward links at the very last
stage can handle the memory contention, which cannot be done by crossbars.
Providing an Extra Link. Some network architectures use an additional link that may
connect to some additional SE in the next stage. The Balanced Gamma Network [5] uses
this approach; it uses distance tag routing. Two more modified GINs, namely
PCGIN [6] and FCGIN [6], make use of additional links at the 0th stage. In PCGIN, all
the nodes are connected to each other, forming a chain from 0 to N. Using this layout,
it ensures at least 2 disjoint paths between any source and destination pair. It uses
backtracking to tolerate faults. On the other side, FCGIN uses a fully chained
approach at each stage to avoid backtracking. Due to chaining at every stage, it
provides distributed control and dynamic rerouting, hence better fault tolerance is
provided. These networks are 1-fault tolerant.


Changing the Interconnection Patterns. Due to the 2^i interconnection patterns, GIN
provides multiple paths for many source-destination pairs. However, for certain
sources and destinations it provides unique paths, which proves risky if a node fails:
for certain pairs the communication then becomes impossible. One can provide
multiple disjoint paths for all source-destination pairs by altering the
interconnections between any stage i and i+1. The Reliable Gamma Network (RGIN)
[7] uses altered interconnection patterns to provide multiple paths and disjoint
paths. Another network, called the Monogamma Network [8], also uses altered
interconnections to provide multiple paths between any source-destination pair, but
they are not disjoint in nature. In the Cyclic Gamma Interconnection Network (CGIN),
any interconnection pattern between any two stages can be repeated to provide
multiple disjoint paths without increasing the hardware cost. As it uses the repetition
of a connection pattern, the pin count reduces. The Balanced modified GIN [9] reverses the
connection patterns for the upper links as compared with GIN, while keeping the
connection patterns intact for the lower and straight links. This approach balances the
distances between any communicating pair.
By Combining the Switching Elements. 3DGIN [10] is a network which combines
switches to provide 3 disjoint paths. Here, the switches at the initial stage are combined
together. It ensures less hardware cost, 3 disjoint paths and tolerance of two faults
without one-step lookahead.
We propose a new network, which is an alteration of CGIN, with an alternate link
at stage 0. This network provides 3 disjoint paths in CGIN with the use of the alternate
link.

3 3D-CGIN: A 3 Disjoint Paths CGIN


3.1 Network Architecture of 3D-CGIN
3D-CGIN is a cyclic Gamma Network connecting N = 2^n inputs to N outputs. It
consists of log2(N) + 1 stages with N switching elements per stage. The input nodes of
3D-CGIN are divided into two parts, and an alternate link is used to connect the
respective input nodes to each other. This means that the first node in the first
part is connected with the first node in the second part by an alternate link, and so on. The 0th
stage switches are 2 x 3 crossbars, the 1st and 2nd stage switches are 3 x 3 crossbars, and the
output stage switches are 3 x 1 crossbars. The connecting patterns between different
stages follow the CGIN concept: the connections between stages 0-1 and 2-3 follow
the 2^0 pattern, whereas the 2^1 pattern is used for the connection between stages 1-2.
Fig. 3 shows the topology of 3D-CGIN for N = 8.
In 3D-CGIN, a packet visits n switches before reaching the destination. The stages
are numbered 0 to n, from left to right. The connecting pattern between stages is given
by plus-minus 2^((α+i) mod (n-1)) functions. The jth switch at stage i, 0 <= i < n, is connected
with three switches at stage i+1 using three functions:
f_straight(j) = j
f_up(j) = (j - 2^((α+i) mod (n-1))) mod N
f_down(j) = (j + 2^((α+i) mod (n-1))) mod N


The function f_straight defines the switch to be visited if a straight link is chosen.
The functions f_up and f_down denote the switches visited if we choose the up and down
links respectively. Each request in 3D-CGIN also carries a routing tag of n digits.
Each digit in the tag can take any of the following three values: 0, 1 and -1. We can use
both the distance tag routing and destination tag routing methods to route a packet to
its intended destination. By distance we mean Distance = (D - S) mod N, where D is
the destination and S is the source. The following formula is used to generate all
possible routing tags representing the distance between source and destination:

RT_Distance = d_0*2^0 + d_1*2^1 + d_2*2^0, with each digit d_i in {-1, 0, 1}.    (1)

The alternate source / link at stage 0 is used in the following cases: 1) the source S is
faulty / non-operational; 2) the source S is busy with packets and the current request
needs urgent processing; 3) the buffer of source S is full, due to which the request would be
required to wait. The routing algorithm should make this decision. Whenever
the packet is transferred to the alternate source, routing needs one extra hop of
processing.
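A minimal sketch of the connection functions and the alternate-source link for N = 8 (n = 3, α = 0) is given below; the helper names are ours, and the mapping S -> S + N/2 follows the pairing of the two halves of the input nodes described above.

N, n, alpha = 8, 3, 0

def weight(i):
    # Connecting pattern between stage i and stage i+1: 2^((alpha+i) mod (n-1))
    return 2 ** ((alpha + i) % (n - 1))

def f_straight(j, i):
    return j

def f_up(j, i):
    return (j - weight(i)) % N

def f_down(j, i):
    return (j + weight(i)) % N

def alternate_source(s):
    # The alternate link at stage 0 pairs node s with its partner in the
    # other half of the input nodes, used when s is faulty or busy.
    return (s + N // 2) % N

# Stage patterns 2^0, 2^1, 2^0 between stages 0-1, 1-2 and 2-3:
print([weight(i) for i in range(n)])                 # [1, 2, 1]
# Neighbours of switch 5 at stage 1 (straight, up, down):
print(f_straight(5, 1), f_up(5, 1), f_down(5, 1))    # 5 3 7
print(alternate_source(2))                           # 6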

Fig. 3. Topology of 3D-CGIN with N = 8


3.2 Multiple Disjoint Paths


We can observe that the destination D at stage n is connected to three switches at
stage n-1: (D + 2^α) mod N, D and (D - 2^α) mod N, respectively. Therefore, a path can
reach D through one of them. So, the total number of alternative paths between
source S and destination D is the sum of all possible paths from S to these three
switches. We can estimate these numbers by using the recurrence relation used in
CGIN. If the alternate source is also used for transmission, then the paths
from it are additional to the original paths generated from S to D. It can be
observed that multiple paths are always present between every pair (S, D), and we
can get at least 3 disjoint paths considering the alternate link.
Theorem: There exist at least 3 disjoint paths in 3D-CGIN.
Proof: In 3D-CGIN, any source S at stage 0 is connected to three switches at stage 1:
(S - 1) mod 8, S and (S + 1) mod 8, i.e. the distance covered at this stage is in the
range -1 to +1 (mod 8). The (S ± 1) mod 8 switches at stage 1 will be
connected with the (S ± 1) mod 8 and ((S ± 1) ± 2) mod 8 switches at stage 2.
Therefore, we can say that the source S at stage 0 is connected with the (S ± 1) mod 8
and ((S ± 1) ± 2) mod 8 switches at stage 2, except for a switch where (D - S) mod
8 = 4. That means any source S will be connected with switches (S, S ± 1, S ± 2, S ±
3) mod 8 at stage n-1. A switch at stage n is reachable through 3 links from stage n-1.
The links +1 and -1 will certainly lead to disjoint paths towards destination D. Source S
is connected with destination D by two such paths if (D - S) mod 8 = ±1, and there is exactly
one such path from S to D if (D - S) mod 8 is not ±1, except where (D - S) mod 8 = 4. The
same logic is applicable when we use the alternate source. We can further show that,
between the two disjoint paths from the alternate source, at least one path will always
exist that never uses the switches used on the paths from source S to D. Hence, we can
guarantee at least 3 disjoint paths from any source S to destination D, considering the
paths from the alternate source.

4 Routing in 3D-CGIN
4.1 Routing Algorithm Considering Fault at Initial Stage
In this section, we present an algorithm / strategy for selecting the proper link at every
stage. This algorithm considers a fault at stage 0; in case of a faulty switch it
forwards the packet to the alternate source using the alternate link. The algorithm does not
assume any fault in further stages.
Algorithm
1. If the source S is faulty at the 0th stage, then forward the packet to the alternate source for that node, i.e. the node S = S + 4.
2. If S = D then always follow a straight link till you reach the destination.


3. The routing will be done as follows:
   a. Let S be the source and D be the destination; first calculate Difference = D - S.
   b. Repeat the following steps till we reach the destination:
      1. If -1 >= Difference >= -4 then follow the uplink; you will reach an intermediate switch, make it the new source and again compute Difference = D - S.
      2. If 1 <= Difference <= 4 then follow the downlink; you will reach an intermediate switch, make it the new source and again compute Difference = D - S.
      3. If S = D then follow the straight link.
      4. If Difference is in the range 5 to 7 (positive or negative) then calculate
         diff = 8 mod Difference
         Difference = -diff
Examples
1) S = 0, D = 4. We assume that S is not faulty, so step 2 is out of the picture and step 3
is used. The Difference = 4, so step 3(b)2 is used and a downlink is chosen. The next
switch is 1, which becomes the new source; the difference is calculated again with
S = 1 and D = 4. Difference = 3, so again a downlink is chosen and we reach node 3.
Then, taking a downlink once more, we reach the destination.
2) S = 2, D = 3. We assume here that node 2 is faulty, so the packet is transferred by
node 6, which is the alternate source for 2. Therefore S = 6, D = 3, Difference = D - S =
3 - 6 = -3; by following step 3(b)1 we take an uplink and reach 5. So S = 5, D = 3,
Difference = 3 - 5 = -2; by following step 3(b)1 we take an uplink and reach 3. Now S = 3,
D = 3, step 3(b)3 is satisfied, a straight link is selected and the destination is reached.
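The sketch below is one runnable interpretation of the routing strategy above for N = 8. The step taken at each stage follows that stage's connecting pattern (2^0, 2^1, 2^0), which the algorithm text leaves implicit, and distances of magnitude 5 to 7 are wrapped modulo 8 as in step 3(b)4; the function names and the normalisation of the difference are our own choices, not the authors' code.

N, n, alpha = 8, 3, 0

def weight(stage):
    return 2 ** ((alpha + stage) % (n - 1))   # patterns 1, 2, 1 for stages 0-1, 1-2, 2-3

def route(S, D, faulty=()):
    """Return the switches visited from source S to destination D."""
    # Step 1: use the alternate source S + 4 if S is faulty at stage 0.
    if S in faulty:
        S = (S + N // 2) % N
    path = [S]
    for stage in range(n):                     # one hop per stage
        diff = (D - S) % N
        if diff > 4:                           # step 3(b)4: magnitudes 5..7 wrap to -3..-1
            diff -= N
        if diff > 0:
            S = (S + weight(stage)) % N        # downlink
        elif diff < 0:
            S = (S - weight(stage)) % N        # uplink
        # diff == 0: straight link, switch number unchanged
        path.append(S)
    return path

print(route(0, 4))              # [0, 1, 3, 4]   (example 1)
print(route(2, 3, faulty={2}))  # [6, 5, 3, 3]   (example 2, via alternate source 6)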
4.2 Re-routing Using Destination Tag
As 3D-CGIN uses the concepts of CGIN, any arbitrary fault can be tolerated by using
destination tag routing instead of distance tag routing. 3D-CGIN should check the n-bit
number of the switch reached against the n-bit carried tag. If every bit is the same except bit
n-1 at stage i, the path should be discontinued, as the difference between the tag and
the switch number will be 2^(n-1), making the destination unreachable. Then backtrack
to stage i-1 to take the other non-straight link. If during this process we reach the
input stage, then another tag should be generated. This process ensures the handling of an
arbitrary fault.


5 Comparison of 3DCGIN with Other Interconnection Networks


The fault tolerance ability of a network is directly dependent on the number of paths
(multiple / disjoint, as per the routing strategy). 3DCGIN provides multiple paths for
every tag used in the network, as it uses an alternate source. It is obvious that 3DCGIN
provides a greater number of disjoint paths. Table 1 shows the number of alternative paths
available in different networks. We can see that the total number of paths in GIN and
CGIN is 27, whereas in 3DCGIN it is 54, i.e. twice the number of alternative paths
available in CGIN. This is the major achievement of this architecture over the other
networks. These paths ensure more apt handling of faults.
Table 1. Total number of alternative paths for every tag value in a network of size 8 (for GIN, CGIN and 3DCGIN)

The total number of paths in GIN and CGIN is 27. In 3DCGIN, the total number of
paths is 54, i.e. twice the paths in GIN or CGIN. 3DCGIN uses disjoint paths for routing the
packets. The alternate source ensures at least 3 disjoint paths between any source and
destination. In case a packet, after rerouting in a faulty environment, reaches back to the
source, the third disjoint path from the alternate source is used. Table 2 shows the number
of disjoint paths in various networks.
Table 2. The maximum number of disjoint paths for every tag value in a network of size 8 (for GIN, CGIN_3^0 and 3DCGIN)

3DCGIN ensures 3 disjoint paths for every tag value. The 3-Disjoint GIN is a network
ensuring at least three disjoint paths, but the algorithms used are complicated. In
comparison to it, 3DCGIN provides a very simple routing strategy, along with the
ability to tolerate a single fault at the input stage. Table 3 shows a comparison of various
GINs and CGINs.


Table 3. Comparison of GIN, PCGIN, FCGIN, 3DGIN, CGIN and 3DCGIN

Network | Fault Tolerance Method | Fault Tolerance Ability | Routing Method
GIN     | Multiple Paths         | Faults Robust           | Distance Tag
CGIN    | Disjoint Paths         | 1 Fault-tolerant        | Distance Tag, Destination Tag with Re-routing
PCGIN   | Disjoint Paths         | 1 Fault-tolerant        | Destination Tag
FCGIN   | Multiple Paths         | 1 Fault-tolerant        | Destination Tag
3DGIN   | Disjoint Paths         | 2 Fault-tolerant        | Distance Tag and Re-routing Tags
3DCGIN  | Disjoint Paths         | 2 Fault-tolerant        | Distance Tag, Destination Tag with Re-routing

6 Conclusion
In this paper, a new concept, 3D-CGIN, is introduced. This network provides an alternate
source at the initial stage, which guarantees packet delivery in case of a busy / faulty
source. The remaining stages follow CGIN-type connection patterns. Due to the alternate
source, this network ensures at least 3 disjoint paths between any source and
destination pair. The alternate source also doubles the redundant paths between any
source and destination pair. The paper proposes a simple routing algorithm, which
uses the concepts of Distance Tag routing. In order to provide dynamic re-routing, our
network is compatible with the Destination Tag routing technique. Being an at-least-3-
disjoint-path network, it guarantees tolerance of at least 2 switch or link faults. Though
this network provides a dynamic re-routing facility, it is not a strongly re-routable
network. Much work remains in exploring the capabilities of this network, making it
strongly re-routable, and checking its terminal reliability. We are working in this direction.
Acknowledgments. We wish to thank Mr. Rakesh Pandey for his apt help in
analysing the proposed network.

References
1. Parker, D.S., Raghavendra, C.S.: The Gamma Network: A Multiprocessor Interconnection
Network With Redundant Paths. IEEE, Los Alamitos (1982)
2. Parker, D.S., Raghavendra, C.S.: The Gamma Network. IEEE, Los Alamitos (1984)
3. Lee, K.Y., Hegazy, W.: The Extra Stage Gamma Network. IEEE, Los Alamitos (1988)
4. Lee, K.Y., Yoon, H.: The B-Network: A Multistage Interconnection Network With
Backward Links. IEEE, Los Alamitos (1990)
5. Venkatesan, R., Mouftah, H.T.: Balanced Gamma Network - A New Candidate For
Broadband Packet Switch Architectures. IEEE, Los Alamitos (1992)


6. Chen, C.W., Lu, N.P., Chen, T.F., Chung, C.P.: Fault Tolerant Gamma Interconnection
Networks By Chaining. IEEE Proceedings (2000)
7. Tzeng, N.F., Chuang, P.J., Wu, C.H.: Creating Disjoint Paths In Gamma Interconnection
Networks. IEEE, Los Alamitos (1993)
8. Chuang, P.J.: CGIN: A Modified Gamma Interconnection Network with Multiple Disjoint
Paths. IEEE, Los Alamitos (1994)
9. Chuang, P.J.: Creating a Highly Reliable Modified Gamma Interconnection Network
Using a Balance Approach. IEEE Proceedings (1998)
10. Chen, C.W., Lu, N.P., Chung, C.P.: 3-Disjoint Gamma Interconnection Network. The
Journal of Systems and Software (2003)

Architecture for Running Multiple Applications on a Single Wireless Sensor Network: A Proposal
Sonam Tobgay*, Rasmus L. Olsen, and Ramjee Prasad
Department of Electronic Systems, Aalborg University, Aalborg, Denmark
{in_st,rlo,prasad}@es.aau.dk

Abstract. Wireless sensor networks have gained much attention from researchers
as well as from industry in the recent past. Usually, wireless
sensor networks are deployed on a per-application basis. This limitation has
restricted their deployment in commercial applications, which require running
multiple applications on a single wireless sensor network infrastructure. In this
paper, we propose a simple architecture for running multiple applications on a
single wireless sensor network. Our proposed system is based on the
middleware concept, in which an application manager module controls the
switching among different applications with the help of a mobile agent.
Keywords: Wireless Sensor Network, middleware, multiple applications,
sensor.

1 Introduction
A wireless sensor network (WSN) is a special kind of ad hoc network that consists of
a number of low-cost, low-power wireless sensor nodes with sensing, wireless
communication and computation capabilities [1], [2], [3]. These sensor nodes
communicate over a short range via a wireless medium and collaborate to accomplish
a common task, like environmental monitoring, military surveillance, and industrial
process control [3]. Wireless sensor networks have opened up new opportunities to
observe and interact with the physical environment around us. They enable us to
collect and gather data that was difficult or impossible to obtain before [4]. With the
advancement of wireless sensor network technology, fuelled by the dropping cost of sensor
nodes, it is expected that the future world will be very much dependent on this
wireless technology. It is expected that wireless sensor networks will find wide
applicability and increasing deployment in the future.
However, most of the wireless sensor network deployments so far are
application specific, due to resource constraints like the limited amount of memory,
computation power and energy of the sensor nodes [5]. Due to these limitations,
most of the previous works on wireless sensor networks have been aimed at how to
decrease the energy consumption, thereby improving the lifetime of the network. This
single-application nature of wireless sensor networks has tremendously
limited their commercial deployment in many real-life applications where
* Corresponding author.



support for multiple applications by a single wireless sensor network is required. Although the
energy consumption contributes to the total cost of the wireless sensor network,
there are other aspects, such as application development, maintenance and return on
investment, which equally contribute to the total cost of the wireless sensor networks
[6]. In this research, we argue that a single wireless sensor network should support
multiple applications, so that it increases the return on investment and the usefulness of
the system.
The rest of the paper is organized as follows: Section 2 brings out some real-life
scenarios which motivate this research, Section 3 explores some similar research done
previously on the same topic, Section 4 describes our proposed system with
algorithms and a flowchart, and finally we conclude in Section 5, describing the
implementation plan of our proposed system.

2 Motivation Scenarios
In this section, we bring out two scenarios where and in which situation the proposed
system could be used in real life deployment of wireless sensor networks.
Scenario A
Consider a corporate industry that employs a wide range of sensors for monitoring
different applications. Since wireless sensor networks are usually application
specific, running one application per wireless sensor network, the minimum
requirement for setting up a single wireless sensor network would be one sink node,
several sensing nodes, cluster heads if heterogeneous nodes are deployed, and the
system which manages the network. Let us look at the scenarios below:
1. The security department wants to set up a wireless sensor network based intrusion detection system using surveillance cameras in the corporate premises.
2. The maintenance department wants to set up a wireless sensor network to monitor the temperature in the premises in order to prevent fire.
3. The top level management wants to monitor the air quality (pollution level) in the premises.

For the above three applications, the organization requires three different wireless
sensor networks to provide three different services. The total investment cost will be
reduced if all these three applications can be made to run simultaneously on a single
wireless sensor network infrastructure.
Scenario B
Let us assume that the department of forests has set up a wireless sensor network to
detect and monitor forest fires in the region of interest. After a few years, the
department also wants to monitor illegal timber extraction from that same forest.
In this situation, the cost of setting up a wireless sensor network to monitor the
illegal timber extraction could be reduced if we can make use of the existing wireless
sensor network infrastructure which monitors forest fires.
If we observe the above two scenarios, we find different applications which
require different treatments. We can group these applications into three categories, as


inspired by [7]: environment data collection, security monitoring and mobility
management. So, when a single sensor network infrastructure runs multiple
applications, we will be faced with different challenges, because the requirements and
characteristics of each application are different and need to be satisfied individually.
Another important issue is how to share the common resources among multiple
applications without compromising the performance of any individual application.
Closely observing the requirements to be fulfilled for the individual applications, we
have arrived at two main research questions, as shown below:
i. How to design a system or an architecture which supports multiple applications running simultaneously, while at the same time fulfilling the individual applications' requirements?
ii. How to collect and disseminate specific data from/to a particular application, which needs a distributed and scalable routing protocol?

In this paper, we mainly focus on the first research question: how to design a system
which supports multiple applications satisfying their individual requirements.

3 Related Works
In this section, we describe some of the similar works previously done. Running
multiple applications on a single wireless sensor network can be possible in two ways.
One way is to run all the applications simultaneously and the other way is to run
applications in a predefined sequence. The application concurrency in wireless sensor
networks can be divided into two categories [8]:
- Application concurrency at the node level
- Application concurrency at the network level

Concurrency at the node level involves executing multiple applications on the processor of
the node. As an example, a system could check a sensor to decide whether data is
available and process the data right away, and at the same time check the
transceiver for data packet availability and immediately process the packet [5]. On
the other hand, application concurrency at the network level can take either of the
following forms: running multiple applications simultaneously on a single network
infrastructure, or running different applications in a predefined sequence. The sensor
nodes can be preprogrammed for each application, or the application code can be
distributed to the nodes by using mobile agents after deployment, during run
time.
In [5] and [8], running different applications on a single network in a predefined
sequence is proposed. A mobile agent is used to switch the running application ON and
OFF, and a configuration agent takes care of updates and reconfigurations of
nodes. This system is aimed at scenarios like forest fire detection and monitoring of
fire fighters after the fire is detected. In this type of scenario, it is possible to have a
sequential execution of the services, where one group of sensors is made to sleep
while the other group is running. Although switching a group of sensor nodes to
sleep mode increases the lifetime of the network, the main drawback of the
system is that an event happening during the sleep mode may not be detected. As an


example, a fire may not be detected if it occurs while the group of sensors which is
supposed to detect fire is in sleep mode. The proposed system tries to
overcome this drawback by making all the applications run simultaneously.
In [6], an architecture based on a scoping technique is proposed. A scope is defined
as a group of nodes that are specified through a membership condition. The proposed
architecture stresses that the sensor nodes are not addressed individually by some
addresses, but by their properties or context. Based on the scope, the system creates
subsets of nodes within the network. The membership conditions are specified at
different levels, e.g. properties of nodes. In this way, scoping suggests how to
separate different tasks both at the node level and the network level.
A multiple-service-support wireless sensor network based on a routing overlay is
proposed in [9]. It mainly focuses on the routing issues by deploying different sink
nodes for different applications. A similar approach is also discussed in [10]. In [9],
the nodes are made to register with the applications with the help of an application-join
message advertised by the applications. In this way, different groups of nodes are formed
for each application. An intermediate node which receives messages for an application to
which it is not registered acts as a relay and forwards the message to the nodes nearer to
the gateway. It is assumed that the nodes which are nearer to the gateway are made to
act as relays, whereas the nodes which are far away from the gateways are made to act as
sensing nodes, in order to increase the lifetime of the network.
Our research proposal is inspired by Agilla [11], a mobile agent-based system in which programs are composed of mobile agents that move across the
nodes in the network during run time. However, in contrast to Agilla, our mobile
agent will not be moving continuously across the nodes. We use the mobile agent to
distribute the application codes only when we set up a new application. A more detailed
description of the proposed system is given in the following sections.

4 Proposed Architecture
In this section, we give an overview and the system design aspects of our proposed
architecture for running multiple applications simultaneously on a single wireless
sensor network infrastructure.
4.1 System Overview
Usually, wireless sensor nodes are preprogrammed to accomplish a specific
application due to their limited resources [8]. These limitations have restricted
wireless sensor networks from being deployed commercially where multiple applications
are required to run on a single wireless sensor network infrastructure. After
deployment of the wireless sensor nodes, it would be very difficult to collect and re-program the nodes to suit the changing requirements. It is also observed that the cost
of sensors will keep decreasing as per the current trend and the rate of advancement in
the field of electronics. With this technological advancement, it is anticipated that
many commercial organizations and industries will deploy wireless sensor
networks and run concurrent applications on a single wireless sensor
network in order to reduce the management cost and increase the utilization of the
system.


In view of the above anticipation, we propose to design a wireless sensor network
architecture which can run multiple applications simultaneously. In our system, we
propose to deploy sensor nodes which are not preprogrammed, meaning that they are
not application specific. The system is composed of four phases: i) the deployment of
nodes and formation of clusters phase, ii) the application registration phase, iii) the
information dissemination phase and iv) the data collection phase. These different
phases are described in the following section.
4.2 System Design
The proposed system is inspired and motivated by the middleware concept of
Agilla [11]. However, in our proposed system we use a mobile agent in two ways: one to
distribute the application code when the particular application is deployed for the first
time, and the other to enable the next application after one application completes its
task or when an abnormal event occurs in the network. The system considers an
application manager module which controls and coordinates the running of multiple
applications and the distribution of application codes with the help of a mobile agent. In
contrast to the proposal in [5], where running multiple applications is solely based on a
predefined sequence, our system uses a partially predefined sequence. Instead of
switching the nodes between active and sleeping mode, we switch between active and
passive mode. This helps us to monitor all the environments irrespective of the
modes of the nodes. A node is in passive mode when it is sensing but not
transmitting, whereas in active mode it can transmit as well as sense. Which application
will be in active mode is decided by the sequence in the application manager and by the
abnormal events detected in the field, with priority given to the former. Before describing
the working of the system, we would like to bring out some of the assumptions we make
in the proposed system:
- The sensor nodes can be programmed for different applications.
- The number of sensing and cluster head nodes required is predefined.
- The nodes which act as cluster heads are more powerful than the sensing nodes.
- Only one sink or base station (PC) node is available.
Only one sink or base station (PC) node is available.

The proposed system has four phases:


1. Deployment of Nodes and Cluster Formation
In this phase, we deploy sensing and cluster head nodes in the region of interest. The
formation of clusters needs clustering algorithms which scale with a huge number
of nodes, as we may have to deploy a large number of nodes to have equal coverage of
each application within a cluster. There are two main types of clustering in wireless
sensor networks: homogeneous and heterogeneous clustering of sensor nodes
[12]. A homogeneous sensor network consists of sensor nodes with identical
hardware configurations and capabilities, whereas a heterogeneous sensor network
consists of two types of sensor nodes: one with low power and low processing
capabilities, mostly used as sensing nodes, and another type which is more powerful
in terms of resources and processing power, usually used as cluster heads [5].


Fig. 1. Flowchart depicting an operation scenario where application A1 is active, while applications A2 and A3 are in passive mode

2. Application Registration Phase
This phase is mainly concerned with the assignment of application tasks to the sensing
nodes. The sink node (base station) advertises the application identifiers of the
available applications to the cluster heads. The cluster heads, upon receiving the


application identifiers, assign them to their member nodes, ensuring that equal
representation of all the applications is achieved within each cluster.
3. Information Dissemination Phase
In the third phase, the sink node invokes the application manager, which in turn sets
one of the applications active and checks whether that application is being made active for the
first time. If so, it sends the application code as well as the information via a mobile agent
to the cluster heads. If the application is not new, which means its code has already been
distributed, only the information to switch to active mode is sent. With this, we
make one of the applications active at any one instant while the rest are in passive mode.
Here, we assume that each application code has a critical section, where we define
threshold values which, when exceeded, make the application go to active mode
automatically even if it is in passive mode, thereby informing the cluster head by
transmitting the sensed data. This we call an abnormal event or abnormal operation.
4. Data Collection Phase
Once one of the applications is switched to active mode, the sensing nodes
start sending data to the base station via the cluster heads. If an abnormal event occurs
in the process, the currently active application is suspended, and the
cluster head informs the base station, which in turn invokes the application
manager to handle the situation. Under normal conditions, the running application
continues till its task is completed or till its allocated time expires, after which the next
application is switched to active mode, and this process continues.
The flowchart depicting the scenario where application A1 is active while the other
two applications A2 and A3 are in passive mode is shown in Fig. 1. For simplicity,
we have considered three applications A1, A2 and A3, but the scheme can be used for any
number of applications.
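A toy sketch of the application manager behaviour described in the four phases is given below. The application names, the code-distribution flag and the pre-emption on an abnormal event are illustrative assumptions made here; this is not the authors' implementation.

class ApplicationManager:
    def __init__(self, applications):
        self.applications = applications   # predefined sequence, e.g. ["A1", "A2", "A3"]
        self.distributed = set()           # application codes already pushed to cluster heads
        self.active = None

    def activate_next(self):
        # Information dissemination phase: pick the next application in sequence.
        if self.active is None:
            nxt = self.applications[0]
        else:
            i = self.applications.index(self.active)
            nxt = self.applications[(i + 1) % len(self.applications)]
        self.activate(nxt)

    def activate(self, app):
        if app not in self.distributed:
            # First activation: a mobile agent carries the application code.
            print(f"mobile agent: distributing code for {app}")
            self.distributed.add(app)
        print(f"{app} -> active, others -> passive")
        self.active = app

    def abnormal_event(self, app):
        # Data collection phase: a passive application crossed its threshold,
        # so the currently active application is suspended in its favour.
        print(f"abnormal event reported by {app}; suspending {self.active}")
        self.activate(app)

mgr = ApplicationManager(["A1", "A2", "A3"])
mgr.activate_next()        # A1 becomes active (code distributed the first time)
mgr.abnormal_event("A2")   # A2 pre-empts A1 on an abnormal event
mgr.activate_next()        # the normal sequence then resumes with A3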

Fig. 2. Normal operation, A1 active while A2 and A3 are in passive mode


Fig. 3. Abnormal operation, while A1 is active, A2 detects an abnormal event

5 Conclusion and Future Works


It is observed that most of the work done on sensor networks to date is targeted
at a single application, user or query using the network at one time. This not
only increases the cost of investment, by deploying duplicate infrastructure when an
organization needs to monitor multiple applications, users or queries, but also leaves the
resources not fully utilized. In this paper, we have presented a simple sensor
network system architecture which aims at supporting multiple applications on a
single wireless sensor network infrastructure. This will not only increase the
utilization of the infrastructure but will also minimize the running and investment cost
of the whole system. It is based on the middleware concept, where a module called the
application manager coordinates and controls the running of multiple applications on a
single wireless sensor network. In the future, we plan to develop a test bed to
demonstrate the concept of our proposed system and verify the system for scalability and
reliability with the help of simulation.

References
1. Akyildiz, I.F., Su, W., Sankarasubramaniam, Y., Cayirci, E.: Wireless sensor networks: a
survey. Computer Networks 38(4), 393-422 (2002)
2. Kemal, A., Mohamed, Y.M.: A Survey on routing protocols for wireless sensor networks.
Ad Hoc Networks 3, 249-325 (2005)
3. Singh, S.K., Singh, M.P., Singh, D.K.: Routing protocols in Wireless Sensor Networks-A
Survey. International Journal of Computer Science & Engineering Survey (IJCSES) 1(2)
(2010)
4. Chu, D., Deshpande, A., Hellerstein, J.M., Hong, W.: Approximate Data Collection in
Sensor Networks using Probabilistic Models. In: Proceedings of the 22nd International
Conference on Data Engineering, ICDE 2006 (2006)


5. Akbar, M., Tanveer, A.Z.: Multi-Set Architecture for Multiple-Applications Running on
Wireless Sensor Networks. In: 24th IEEE International Conference on Advanced
Information Networking and Applications Workshops, pp. 299-304. IEEE Computer
Society, Los Alamitos (2010)
6. Jonathan, S., Ludger, F., Mariano, C., Alejandro, B.: Towards Multi-Purpose Wireless
Sensor Networks. In: Proceedings of the 2005 Systems Communications (ICW 2005).
IEEE Computer Society, Los Alamitos (2005)
7. System Architecture for Wireless Sensor Networks. PhD Thesis, University of
California, Berkeley (2003)
8. Akbar, M., Tanveer, A.Z.: Running Multi-Sequence Applications in Wireless Sensor
Networks. In: Proceedings of the 7th ACM International Conference on Advances in Mobile
Computing and Multimedia (MoMM 2009), Kuala Lumpur, pp. 365-369 (2009)
9. Ali, H.A., Ahmad, A.I., Shafique, A.C., Chaudhar, S.H., Ki-Hyung, K.: A Routing
Overlay for Wireless Sensor Networks with Multiple Services Support. In: The Journal of
Korean Institute of Next Generation Computer Conference (KINGPC), Seoul, pp. 171-165 (2005)
10. Anwar, A., Ali, H.A.: Carrefour Cast: A New Routing Protocol to Support Multiple
Applications in Multiple Gateway Environments of Wireless Sensor Networks. In: World
Congress on Science, Engineering and Technology, Singapore (2009)
11. Fok, C.L., Roman, G.C., Lu, C.: A Mobile Agent Middleware for Sensor Networks: An
Application Case Study. In: Fourth International Symposium on Information Processing
in Sensor Networks (IPSN 2005), pp. 382-387 (2005)
12. Li, Y., Thai, M., Wu, W.: Wireless Sensor Networks and Applications. Signals and
Communication Technology, 331-347 (2008)

Feature Based Image Retrieval Algorithm


P.U. Nimi and C. Tripti
Department of Computer Science,
Rajagiri School of Engineering and Technology, Cochin, India
nimi2ni@gmail.com, triptic@rajagiritech.ac.in

Abstract. The developments in the field of the internet allow users in almost all professional areas to exploit the opportunities offered
by the ability to access and manipulate remotely-stored images. The
large multimedia database has to be processed within a small fraction
of a second for many real-time applications. This demand for using
the technique of content based image retrieval (CBIR) as a scheme for
searching a large database for image retrieval has raised some of the
issues that need to be solved to have an efficient system. The paper
focuses on the issues of image retrieval and also suggests a method to
get an accurate result by using a hybrid search methodology. The paper
works in two phases: in the first phase it works with a genetic algorithm to
get a local optimal result, and in the second phase it works with a neural
network to get a global optimal result.
Keywords: Content based Image Retrieval (CBIR), Genetic Algorithm
(GA), and Neural Network.

Introduction

The recent advances in digital technologies have created a great demand
for organising the available digital images for easy retrieval [1]. The retrieval
of similar images based on a query from a large digital image database is a
challenging task. A content based image retrieval system, commonly known as
an image search engine, is used for the retrieval of relevant, query-similar images
from a large digital image database. The term has been widely used to describe
the process of retrieving desired images from a large collection on the basis of
features that can be automatically extracted from the images themselves. The
features used for retrieval can be either primitive or semantic, but the extraction
process must be predominantly automatic [2]. The applications of image retrieval
systems include areas such as medical imaging, criminal investigation, computer
aided design etc. The CBIR technique is an emerging technology that attracts
more and more people from different fields such as computer vision, information
retrieval, database systems and machine learning [3]. But there are some problems
which are becoming widely recognised, such as the semantic gap between low-level
visual content and higher-level concepts, and the high computational time taken
for image analysis, image indexing and image searching. This work proposes a
solution for the second issue, which will make the system more efficient with less



computational time for image analysis, searching and image retrieval. In this
proposed work, an optimal image retrieval scheme is aimed at by implementing
a genetic algorithm.
The content based image retrieval system works by extracting several textural
features, and these features are used for analysing and retrieving the optimal
image results for the query from the user. A genetic algorithm is used as the
optimisation technique for getting a local optimal result, and this result is trained
to an optimal result by using a neural network. There are many issues like
selecting good features, large computational time, large storage space etc.,
as addressed in [3][4][5].
The paper is organised as follows: Section two presents the related works,
which includes a review of the retrieval techniques developed so far and an overview
of genetic algorithms and neural networks. Section three describes the block diagram and section
four discusses the algorithm of the proposed solution. Section five presents the proposed
implementation methodology and section six gives the conclusion and remarks.

Related Works

Neural networks and genetic algorithms are used for learning and optimization.
Recently there has been an attempt to combine these two technologies. The paper [6]
proposed a hybrid model for content based mammogram retrieval which is
demonstrated in two stages: in the first stage a Self Organising Map (SOM) neural
network is used for clustering images, and in the second stage a genetic algorithm
search is used. In this paper, by contrast, the first-phase genetic algorithm is used for a
local optimal result and this result is trained to an optimal result by using a neural
network.
A brief overview of Content Based Image Retrieval (CBIR), Genetic Algorithms
and Neural Networks follows:
2.1 Overview of CBIR

The phrase "content based" means that retrieval of images takes place based on the
content rather than on metadata like keywords, tags, descriptions etc. Retrieval of images
is based on the content, which refers to colours, shapes, textures or any other
information that can be derived from the image itself. The demand for using
large multimedia databases for various real-time applications has raised
many issues in the field of content based image retrieval. They can be grouped
mainly as:
- the image storage problem
- the image retrieval problem
Having humans manually enter keywords for image retrieval in a large database can be
inefficient, and it is difficult to capture every keyword that describes the image,
which was the process in the traditional search methodology. CBIR helps to
filter images based on the content and return more accurate results.


In content based image retrieval, the images' visual contents are extracted and described by
multidimensional feature vectors, which form a feature database. When the user gives an
image query or a sketched figure, the system converts the image into a feature vector.
The system then checks the similarities with the feature database, and retrieval is
performed with an indexing method which provides an efficient way to search the image
database. After the retrieval of images, the users can give feedback to modify the
retrieval process for more accurate results [4].
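The following minimal sketch mirrors that pipeline: images are mapped to feature vectors, stored in a feature database, and ranked against the query's feature vector. The grey-level histogram feature and Euclidean ranking used here are stand-ins chosen only for illustration; the paper's system relies on the colour, texture and shape features discussed below.

import numpy as np

def feature_vector(image, bins=3):
    # Describe an image by a normalised grey-level histogram.
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def retrieve(query, database, top_k=2):
    # Rank database images by Euclidean distance in feature space.
    q = feature_vector(query)
    scored = [(np.linalg.norm(q - feature_vector(img)), name)
              for name, img in database.items()]
    return [name for _, name in sorted(scored)[:top_k]]

rng = np.random.default_rng(0)
db = {f"img{i}": rng.integers(0, 256, size=(8, 8)) for i in range(5)}
query = rng.integers(0, 256, size=(8, 8))
print(retrieve(query, db))   # names of the top-ranked images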
Query types can be classified into three levels [7].
Level 1 comprises primitive features such as colour, texture, shape or the
spatial location of image elements. For example, a query might be "find images
containing yellow stars arranged in a ring", which is both objective and directly
derivable from the images themselves, without the need to refer to any external
knowledge base. Its use is largely limited to specialist applications such as trademark
registration, identification of drawings in a design archive or colour matching of
fashion accessories [7].
Level 2 comprises retrieval by derived (logical) features, involving some degree of
logical inference about the identity of the object depicted in the image. It can usefully
be divided further into: i) retrieval of objects of a given type, e.g. "find pictures of a
double-decker bus"; ii) retrieval of individual objects or persons, e.g. "find a picture of
the Eiffel Tower" [7].
Level 3 comprises retrieval by abstract attributes, involving a significant amount of
high-level reasoning about the meaning and purpose of the objects or scene depicted.
Again, this level of retrieval can be subdivided into: i) retrieval of named events or
types of activity, e.g. "find pictures of Scottish folk dancing"; ii) retrieval of pictures
with emotional or religious significance, e.g. "find a picture depicting suffering" [7].
The CBIR technology can be of mainly four types [1][7]:
1) Retrieval based on colour features: Histograms are generally used for describing the colour feature of images [8]; a histogram shows the proportion of pixels
of each colour within the image. The colour histogram of each image is added
to the database. The user can either specify the colour or give an image
from which the colour histogram is calculated. A matching technique such as
histogram intersection is used to retrieve images based on the colour
feature (a minimal histogram-intersection sketch is given after this list).
2) Retrieval based on textural features: Another important property of an image
is texture. A variety of techniques has been used for measuring texture similarity.
Texture representation schemes can be broadly classified as structural
and statistical. Texture queries can be formulated in a similar manner to
colour queries, by selecting examples of desired textures from a palette, or
by supplying an example query image. The system then retrieves images
with texture measures most similar in value to the query.


3) Retrieval based on shape features: Shape is another important feature that
is basically used to describe image content. Two main types of shape features
are commonly used - global features such as aspect ratio, circularity and
moment invariants, and local features such as sets of consecutive boundary
segments.
4) Retrieval by other types of primitive feature: One of the oldest-established
means of accessing pictorial data is retrieval by position within an image.
Most of these techniques rely on complex transformations of pixel intensities which
have no obvious counterpart in any human description of an image. Most of
the techniques aim to extract features which reflect some aspect of image
similarity which a human subject can perceive, even if he or she finds it
difficult to describe [1][8].
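As referred to in point 1 above, a minimal sketch of histogram-intersection matching for colour-based retrieval is given below; the 4-bin normalised histograms are illustrative values, not data from the paper.

import numpy as np

def histogram_intersection(h_query, h_image):
    # Sum of bin-wise minima; with normalised histograms the score lies in [0, 1]
    # and a higher score means more similar colour content.
    return np.minimum(h_query, h_image).sum()

h_query = np.array([0.50, 0.25, 0.15, 0.10])
h_image = np.array([0.40, 0.30, 0.20, 0.10])
print(histogram_intersection(h_query, h_image))   # 0.9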
With the progress of research and the development of accurate and fast image retrieval
systems, the performance of such systems needs to be improved. The
performance can be improved by removing the irrelevant and redundant features
from consideration [9]. Many optimization techniques can be
used to improve the performance of such image retrieval systems. This paper
works in two main phases. In the first phase it works with a genetic algorithm
(GA), which performs a local optimization to generate a small set of images, depending on a fitness
function value. The fitness function depends on the feature values of an object in the image.
In the second phase, to get an optimal result, a neural network is used to train the newly
generated set of images from the first phase with the user query.
2.2 Overview of Genetic Algorithm
Genetic algorithms are used as nature-inspired adaptive algorithms for solving real-time practical problems. Genetic algorithms are search algorithms based on the mechanisms of natural selection and natural genetics, survival of the fittest and randomized information exchange. The GA was first introduced by John Holland for the formal investigation of the mechanisms of natural adaptation, but the algorithms have since been modified to solve computational search problems.
The genetic algorithm is a probabilistic search algorithm that iteratively transforms a set (called a population) of mathematical objects (typically fixed-length binary character strings), each with an associated fitness value, into a new population of offspring objects, using the Darwinian principle of natural selection and operations that are patterned after naturally occurring genetic operations, such as crossover and mutation [10].
The genetic algorithm works with operators like:
Selection - GA selection operators perform the equivalent role to natural selection. The overall effect is to bias the gene set in following generations towards those genes which belong to the most fit individuals in the current generation [11].

Crossover - Members of the newly reproduced strings in the mating pool are mated at random, and each pair of strings undergoes crossing over; crossover is the process by which good schemata get combined.
Mutation - Mutation enables the GA to maintain diversity whilst also introducing some random search behaviour.
Inversion - This operator aims to mimic the property from nature that, in general, the function of a gene is independent of its location on the chromosome.
The working of GA can be represented using the following diagram.

Fig. 1. GA Cycle

A genetic algorithm works with a population of individuals, also known as chromosomes, which represent the possible solutions to a given problem. In nature, fitness describes the ability of an organism to survive and reproduce, but in the case of a genetic algorithm it is given by the result of the objective function. In the population, the organisms with a better fitness score (value of the fitness function) are more likely to be selected into the mating pool. This promotes the genes with more beneficial characteristics to propagate through generations [12].
For each problem to be solved, one has to supply a fitness function, and indeed its choice is crucial to the good performance of the GA. For a chromosome or an individual, the fitness function returns a numerical value which represents the utility of that individual. This value is used both in the parent selection process and in the survival selection process for the next generations, so that the fittest individuals are chosen [13][5].

2.3 Overview of Neural Network
A neural network is a bio-inspired methodology. Bio-inspired techniques often involve specifying a set of simple rules, a set of simple organisms which adhere to those rules, and a method of iteratively applying those rules. After several generations of rule application it is usually the case that some forms of complex behaviour arise [14]. There is a biologically inspired way of training recurrent neural networks that takes its methodology from another field of bio-inspired computing - genetic algorithms. The set of weights for a neural net can be seen as a genome. By introducing a fitness function that differentiates networks from each other according to how well they perform a task, the genomes of the more successful networks can be used as parents for future generations. The genomes are combined according to some algorithm, commonly taking half of one parent's genome and half of the other's. Mutations are introduced to make sure the genetic algorithms have more complete coverage of the solution space [15].
Neural network learning can be done in three modes: supervised learning, unsupervised learning and reinforced learning. In a supervised learning scheme, the inputs are trained to get the user-defined output. The cost function is the difference between the mapping and the data. This scheme has prior knowledge about the problem domain. The average squared error between the network's output and the target value over the user-defined data is minimized. The algorithms commonly used for training are the back propagation algorithm [16] and the least mean square error convergence (LMS) algorithm [17]. The most commonly seen applications of supervised learning are pattern recognition and regression. This type of learning scheme is analogous to a teacher in the form of a function that provides continuous feedback on the quality of the solutions obtained so far.
In unsupervised learning, a cost function dependent on the task is to be minimized. A set of data is given as the input, and a set of assumptions and parameters is considered to solve the problem. Problems like clustering come under this scheme. It is as if there is no teacher to present the desired patterns, and hence the system learns on its own, discovering and adapting to structural features in the input patterns [12].
In reinforced learning, a teacher, though available, does not present the expected answer but only indicates whether the computed output is correct or incorrect. The information provided helps the network in its learning process. A reward is given for a correct answer computed and a penalty for a wrong answer [12].

Block Diagram

The paper works in two phases. The first phase is done using a genetic algorithm as a local search optimisation technique. Features are extracted from the query image and compared with the images in the image database. Based on a fitness function, the relevant features are given more priority and they are shown as the result of the first phase. The result of the first phase is given as the input to the second phase.

Fig. 2. Block Diagram of CBIR System

In the second phase, the result of the first phase is trained with the input image. In this phase a neural network is used to train the first-stage result, so that the difference between the input image and the local optimal result is minimised, thereby arriving at a global optimal result.
In the first phase, an initial set of possible solutions is evaluated based on a fitness function. The population contains chromosomes, and these chromosomes contain genes. Each gene represents an image segment. For each segment the various features of the object are studied and coded into a gene. Operators like selection, crossover and mutation are applied to obtain the fit members of the next generation.
The fitness function is used to evaluate the quality of an individual or chromosome in the population. The fitness function (F) of the image query (q) and chromosome (c) is

(1)
where w is the weight of a feature, h is the feature of the image and x is the image region. The selection operator selects the members for the mating pool by a probabilistic approach. It calculates the probability [13] of an individual being selected from the population using equation (2).

(2)
where N is the number of individuals in the population and the fitness value associated with each individual determines its selection probability. A roulette wheel scheme is commonly used. In the mating pool, exchange of chromosome material happens during the crossover operation: two individuals mate with each other to form two new individuals. The next operator is mutation. A mutation is basically any change in the environment creating a significant change in the genetic make-up of an organism [18]. The probability of mutation is kept very low. The genetic algorithm is an iterative process [19]. The search process using a GA is controlled by certain control parameters like crossover probability, mutation probability, population size, etc.
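The roulette wheel selection scheme referred to above can be sketched in C as follows. This is a minimal illustration of the standard scheme, in which an individual's chance of selection is its fitness divided by the total fitness of the population; the population size and fitness values used here are made up for the example and are not taken from the paper.

#include <stdio.h>
#include <stdlib.h>

#define POP_SIZE 4                     /* assumed population size */

/* Roulette wheel selection: pick index i with probability
 * fitness[i] / (sum of all fitness values). */
int roulette_select(const double fitness[POP_SIZE])
{
    double total = 0.0;
    for (int i = 0; i < POP_SIZE; i++)
        total += fitness[i];

    /* Spin the wheel: a uniform random point in [0, total). */
    double spin = ((double)rand() / RAND_MAX) * total;

    double running = 0.0;
    for (int i = 0; i < POP_SIZE; i++) {
        running += fitness[i];
        if (spin < running)
            return i;
    }
    return POP_SIZE - 1;               /* guard against rounding at the edge */
}

int main(void)
{
    double fitness[POP_SIZE] = {0.1, 0.4, 0.3, 0.2};
    int counts[POP_SIZE] = {0};

    for (int trial = 0; trial < 10000; trial++)
        counts[roulette_select(fitness)]++;

    for (int i = 0; i < POP_SIZE; i++)
        printf("individual %d selected %d times\n", i, counts[i]);
    return 0;
}

Over many trials, individual 1 (fitness 0.4) is selected about four times as often as individual 0 (fitness 0.1), which is exactly the bias towards fitter individuals that the selection operator is meant to produce.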
The output of the first phase is the local optimal result, which serves as the input to the next stage. This output is trained using the least mean square convergence method of the neural network. The LMS algorithm is an adaptive algorithm based on a gradient approach [12]. LMS incorporates an iterative method of updating the weight vector based on the difference calculated between the feedback input image query and the set of local optimal results obtained from the first phase. The procedure leads to a minimum mean square error. The result thus obtained after the second phase is a global optimal result.
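The iterative LMS weight update mentioned above can be sketched as below. This is a generic least mean square update on a weight vector, not the authors' implementation; the feature dimension, the learning rate and the toy training pair are assumptions made for illustration.

#include <stdio.h>

#define DIM 3            /* assumed feature-vector dimension */
#define MU  0.1          /* assumed learning rate */

/* One LMS step: predict with the current weights, compute the error
 * against the desired output, and adjust the weights along the negative
 * gradient of the squared error.  Returns the error. */
double lms_step(double w[DIM], const double x[DIM], double desired)
{
    double y = 0.0;
    for (int i = 0; i < DIM; i++)
        y += w[i] * x[i];

    double e = desired - y;
    for (int i = 0; i < DIM; i++)
        w[i] += 2.0 * MU * e * x[i];
    return e;
}

int main(void)
{
    double w[DIM] = {0.0, 0.0, 0.0};
    double x[DIM] = {0.5, 0.2, 0.1};   /* toy input that should map to 1.0 */

    for (int iter = 0; iter < 20; iter++) {
        double e = lms_step(w, x, 1.0);
        printf("iter %2d  error = %+.4f\n", iter, e);
    }
    return 0;
}

The error shrinks on every iteration, which is the minimum mean square error behaviour the second phase relies on when fitting the locally optimal image set to the user's query.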

Algorithm

Input: The images in the database and a query image
Output: Optimized search image
Step 1: Start
Step 2: Initialization: collect many images and prepare a database of those images
Step 3: Select a query image for the CBIR system
Step 4: Extract the features of the query image
Step 5: Calculate the similarity measurement with the features of colour, shape, intensity and edge
Step 6: Calculate the fitness function and check it against the threshold; if the image score is greater than the threshold, display the image as the local optimal result
Step 7: Based on the user's feedback, train the obtained output set using the neural network
Step 8: Get the optimal result
Step 9: Stop

Proposed Implementation Methodology

The proposed implementation methodology includes an image database having more than 100 images, so we consider the implementation on an object-relational database management system (ORDBMS) [20]. A simulation environment for feature-based retrieval of images from the database will be built with MATLAB, which is an efficient program for vector and matrix data processing [21].

Conclusion

The proposed algorithm for efficient retrieval of an image from a large database using a hybrid search methodology was described. The algorithm combines a genetic algorithm and a neural network. The content based image retrieval system works by extracting several textural features, and these features are used for analysing and retrieving the optimal image results for the user's query. The procedure described in this paper works in two phases, and the two-phase model can offer more accuracy and speed than a single phase. In the first phase a genetic algorithm is used as the optimization technique for getting a local optimal result, and in the second phase the result of the first phase is used to train the neural network for getting the best results.

References
[1] Yang, H., Zhou, X.: Research of Content based Image Retrieval Technology. In: Proceedings of the Third International Symposium on Electronic Commerce and Security Workshops (ISECS 2010), Guangzhou, P.R. China, July 29-31, pp. 314-316 (2010)
[2] Konstantinidis, K., Andreadis, I.: On the use of color histograms for content based image retrieval in various color spaces. In: ICCMSE 2003 - Proceedings of the International Conference on Computational Methods in Sciences and Engineering. ISBN 981-238-595-9
[3] Deb, S., Zhang, Y.: An Overview of Content-based Image Retrieval Techniques. In: IEEE Proceedings of the 18th International Conference on Advanced Information Networking and Applications, AINA 2004 (2004)
[4] Fundamentals of content-based image retrieval, www.cse.iitd.ernet.in/~pkalra/siv864/Projects/ch01_Long_v40_proof.pdf
[5] Melanie, M.: An Introduction to Genetic Algorithms
[6] Jose, T.J., Mythili: Neural Network and Genetic Algorithm based Hybrid model for content based mammogram Image Retrieval. Journal of Applied Sciences 9(19), 3531-3538 (2009) ISSN 1812-5654, Asian Network for Scientific Information
[7] Content-based Image Retrieval - JISC, http://www.jisc.ac.uk/uploaded_documents/jtap-039.doc
[8] Eakins, J., Graham, M.: Content-Based Image Retrieval. University of Northumbria at Newcastle, JTAP Report 39 (October 1999), http://www.cse.iitd.ernet.in/~pkalra/siv864/Projects/ch01_Long_v40_proof.pdf
[9] Varghese, T.A.: Performance Enhanced Optimization based Image Retrieval System. IJCA Special Issue on Evolutionary Computation for Optimization Techniques, ECOT, 31-34 (2010)
[10] Rezapour, O.M., Shui, L.T., Dehghani, A.A.: Review of Genetic Algorithm Model for Suspended Sediment Estimation. Australian Journal of Basic and Applied Sciences 4(8), 3354-3359 (2010) ISSN 1991-8178

[11] Introduction to Genetic Algorithms and GAUL, http://gaul.sourceforge.net/intro.html
[12] Rajasekharan, S., Vijayalakshmi Pai, G.A.: Neural Networks, Fuzzy Logic, and Genetic Algorithms: Synthesis and Applications. Eastern Economy Edition
[13] da Silva, S.F., Batista, M.A., Barcelos, C.A.Z.: Adaptive Image Retrieval through the use of a Genetic Algorithm. In: 19th IEEE International Conference on Tools with Artificial Intelligence, pp. 557-564 (2007)
[14] Bio-inspired Computing, http://en.wikipedia.org/wiki/Bio-inspired_computing
[15] Bryden, J.: Biologically Inspired Computing: The Neural Network
[16] Otair, M.A., Salameh, W.A.: Speeding Up Back-Propagation Neural Networks. In: Proceedings of the 2005 Informing Science and IT Education Joint Conference, Flagstaff, Arizona, USA, June 16-19 (2005)
[17] Least mean square algorithm, http://etd.lib.fsu.edu/theses/available/etd-04092004-143712/unrestricted/Ch 6lms.pdf
[18] Karaboga, N., Cetinkaya, B.: Design of Minimum Phase Digital IIR Filters by Using Genetic Algorithm. In: Proceedings of the 6th Nordic Signal Processing Symposium - NORSIG 2004, Espoo, Finland, June 9-11 (2004)
[19] Sharpe, P.K., Greenwood, A., Chalmers, A.G.: Genetic Algorithms for Generating Minimum Path Configurations
[20] Ignatova, T., Heuer, A.: Model-Driven Development of Content-Based Image Retrieval Systems. Journal of Digital Information Management
[21] Kerminen, P., Gabbouj, M.: Prototyping Color-based Image Retrieval with MATLAB

Exploiting ILP in a SIMD Type Vector Processor


Abel Palaty1, Mohammad Suaib2, and Kumar Sambhav Pandey3
1,2 Research Scholar, Computer Science and Engineering Department, National Institute of Technology, Hamirpur, India
3 Associate Professor, Computer Science and Engineering Department, National Institute of Technology, Hamirpur, India
{palatyabel,suaibcs09}@gmail.com, kumar@nitham.ac.in
Abstract. In this paper we exploit instruction level parallelism through compiler optimization techniques like loop unrolling and loop peeling for a SIMD type vector processor. The SIMD type vector processor is a high performance computational model which exploits the computational capabilities of both SIMD and vector architectures. It works on short vector instructions of vector length four and has four processing units, which enables execution of four vector operands simultaneously. To implement the proposed work we need a common estimation platform; we use the MachSUIF intermediate representation for the proposed approach. MachSUIF is provided with many inbuilt passes which give us different levels of intermediate representation. We have created a control data flow graph (CDFG) to do vectorization according to the SIMD type vector architecture. We have made a custom pass in MachSUIF in which we do unrolling and peeling according to the architecture, i.e., we unroll the loop to size four. We show that in ideal conditions we get a speedup factor of 4 in a SIMD type vector processor.
Keywords: SIMD type vector processor, MachSUIF, loop unrolling, loop peeling, vectorization.

1 Introduction
The goal of a compiler and processor designer is to achieve as much parallelism among instructions as possible. To utilize the parallelism available in a loop, we use the SIMD type vector architecture. Ordinary programs are typically written under a sequential execution model where instructions execute one after the other and in the order specified by the programmer. ILP allows the compiler and the processor to overlap the execution of multiple instructions or even to change the order in which instructions are executed. To achieve the maximum parallelism in a loop we apply loop unrolling and loop peeling [1],[5]. Here we unroll to a size of 4, as our architecture supports short vectors of size 4. Peeling is done to make the number of iterations in a loop a multiple of our vector size 4, and vectorization [6] is then done accordingly.
MachineSUIF is a flexible and extensible infrastructure for constructing compiler back ends [8]. MachSUIF works upon a working compiler based on the Stanford SUIF
compiler infrastructure (version 2.1) [3]. Using MachSUIF we can build new optimizations [2] that are parameterizable with respect to the target machine and portable across compiler environments, and add support for new target architectures. MachSUIF provides libraries such as the control flow and data flow libraries, which provide abstractions that aid in coding certain kinds of optimizations.
1.1 SIMD-Vector Architecture
The performance of a vector processor can also be enhanced by introducing parallelism in vector processing. This parallelism can be implemented by adding superscalar issue to the vector processor. In a vector architecture all the vector instructions are executed in sequence. The SIMD type vector processor takes into account the advantages of both the vector processor and the superscalar processor. In this processor architecture we work on short vectors of vector length four. The SIMD vector processor issues one vector instruction at a time, which has four sets of vector operands. All four operations are executed simultaneously, so for this simultaneous execution we need four independent processing units. This increases the complexity of the hardware, but the throughput of the system is increased. Our SIMD vector processor issues instructions in order, executes them out of order, and delivers the results in order. Out-of-order execution provides maximum throughput. All the vector instructions are issued and buffered into the instruction cache. After checking the dependencies among the instructions, independent instructions are sent to the execution units, where out-of-order execution takes place. When a vector instruction is decoded, it is sent to the execution unit. Since the vector length is four, all four operations are sent to four different ALUs. The SIMD vector architecture has no scalar execution unit. It is also possible that a vector instruction has fewer than four words; to handle this situation the remaining vector words are hardwired to zero. In traditional vector processors chaining is used to improve performance, but in the SIMD architecture chaining [4] is not used.
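The behaviour described above - one short vector instruction operating on four lanes at once, with unused lanes hardwired to zero - can be mimicked in plain C as a software model. The sketch below is only illustrative; the lane count and the element-wise addition are assumptions for the example and do not describe the processor implementation itself.

#include <stdio.h>

#define VLEN 4           /* short vector length of the modelled architecture */

/* Model of one SIMD vector add: all VLEN lanes are computed together.
 * If the source vectors hold fewer than VLEN valid elements, the
 * remaining lanes are padded with zero, mirroring the hardwired-zero
 * handling of short vectors. */
void vec_add(const int *a, const int *b, int *out, int valid)
{
    int lane_a[VLEN], lane_b[VLEN];

    for (int lane = 0; lane < VLEN; lane++) {
        lane_a[lane] = (lane < valid) ? a[lane] : 0;
        lane_b[lane] = (lane < valid) ? b[lane] : 0;
    }
    for (int lane = 0; lane < VLEN; lane++)   /* conceptually one cycle */
        out[lane] = lane_a[lane] + lane_b[lane];
}

int main(void)
{
    int a[3] = {1, 2, 3}, b[3] = {10, 20, 30}, out[VLEN];

    vec_add(a, b, out, 3);   /* only 3 valid elements: lane 3 stays zero */
    for (int lane = 0; lane < VLEN; lane++)
        printf("out[%d] = %d\n", lane, out[lane]);
    return 0;
}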

2 Proposed Approach for the Vectorization of Instructions for a SIMD Type Vector Processor
SIMD instructions are very effective in executing the same instruction again and again, normally in a loop, and vector instructions are effective when there is an instruction which has to be executed repeatedly. So we do loop unrolling and loop peeling to find out whether the code is vectorizable and to improve performance. Here we take a loop, and it is unrolled and peeled according to the proposed architecture to bring out its effectiveness. The loop unrolling and peeling is done by SUIF2 passes. To implement the proposed work we have used MachSUIF. In MachSUIF, the IR (Intermediate Representation) uses SUIFvm (the SUIF virtual machine architecture), which assumes that the underlying architecture is a generic RISC not biased towards any existing architecture.

Fig. 1. Flow graph for vectorization

The program code is decomposed into its IR consisting of operations of minimal complexity, i.e., primitive or atomic instructions. This IR description of the program code is then organized into a control data flow graph, with primitive instructions as its nodes and edges denoting control and data dependencies [7]. Before the CDFG is created we do loop unrolling and peeling. After the CDFG is computed we do vectorization according to the SIMD type vector processor architecture. The flow graph of the proposed work is shown in Fig. 1. The shaded blocks in the figure show
the available passes of the MachSUIF infrastructure. The c2suif pass converts the input ANSI C code to the SUIF front end, i.e., the code is preprocessed and its SUIF representation is emitted. After c2suif, loop unrolling and peeling are done by a custom-made pass which performs them according to our architecture. In this pass, the SUIF file is first converted back to ANSI C code and the loop unrolling and peeling are done on this C code. The peeling is done in such a manner that the number of iterations becomes a multiple of 4, and unrolling is done for a size of 4. After the unrolling and peeling, the pass converts the C code back to SUIF IR. The third step is the do_lower pass, or an equivalent pass with all necessary transformations, which is provided with SUIF. In this step several machine-independent transformations are done, such as dismantling of loops and conditional statements into low-level operations. The do_lower pass translates the higher SUIF representation into the lower SUIF representation. To convert the lower SUIF representation to the SUIFvm representation, the s2m compiler pass available in MachSUIF is used. After s2m, architecture-independent optimizations are done on the IR and a CFG (Control Flow Graph) is created by the il2cfg pass. The dagconstruct pass then parses each node of the CFG and constructs a corresponding CDFG. This gives a CDFG for each node, and the vectorization pass performs vectorization according to the data dependencies between the instructions.
The C code segment to be unrolled is Kernel 5 (tri-diagonal elimination) of the Livermore Loops coded in C [9]:

for ( l=1 ; l<=loop ; l++ ) {
    for ( i=1 ; i<n ; i++ ) {
        x[i] = z[i]*( y[i] - x[i-1] );
    }
}
argument = 5;
TEST( &argument );

Fig. 2. DFG for the inner loop of kernel 5 of Livermore Loops before unrolling


In Figure 2 we can notice that the parallelism is not as readily achievable as in Figure 3. For the proposed architecture we have taken the vector length to be 4. After unrolling the inner loop of the kernel, the result is:
for(l = 1; l <= loop; l += 1) {
    for(i = 1; i < (n - 4); i += (1 * 4)) {
        (((x))[i])=(((((z))[i])*((((y))[i])-(((x))[(i-1)]))));
        (((x))[(i+(1*1))])=(((((z))[(i+(1*1))])*((((y))[(i+(1*1))])-(((x))[((i+(1*1))-1)]))));
        (((x))[(i+(1*2))])=(((((z))[(i+(1*2))])*((((y))[(i+(1*2))])-(((x))[((i+(1*2))-1)]))));
        (((x))[(i+(1*3))])=(((((z))[(i+(1*3))])*((((y))[(i+(1*3))])-(((x))[((i+(1*3))-1)]))));
    }
    for(; i < n; i += 1) {
        (((x))[i])=(((((z))[i])*((((y))[i])-(((x))[(i-1)]))));
    }
}
The DFG of the above code is shown in Figure 3. Comparing the DFGs of the two versions (Fig. 2 and Fig. 3), the parallelism is much more prominent in Fig. 3. In Fig. 3 there are 4 separate, similar arrangements of blocks which do not have any data dependency between them. As there is no dependency, we are able to process them simultaneously. We get 4 separate blocks because we have unrolled to size 4, which was done because our hardware has a vector size of 4.

Fig. 3. DFG for the inner loop of kernel 5 of Livermore Loops after unrolling
If the number of iterations is not a multiple of the vector length, then loop peeling is done. In the SIMD vector architecture there is no scalar processing, so by peeling, in the code below, after vectorization we would get a vector with 3 data elements at first, i.e., the 4th element would be hardwired to zero. The number of instructions to be peeled is found as the number of iterations modulo the vector length. For example:
for(i=0; i<99; i++)
{
    x[i]=y[i]+z[i];
}
Then the number of instructions to be peeled is 3, so the peeled and unrolled loop code is:
x[0]=y[0]+z[0];
x[1]=y[1]+z[1];
x[2]=y[2]+z[2];
for(i=3; i<99; i+=4)
{
    x[i]=y[i]+z[i];
    x[i+1]=y[i+1]+z[i+1];
    x[i+2]=y[i+2]+z[i+2];
    x[i+3]=y[i+3]+z[i+3];
}
After this, the code is transformed into vector instructions, and in the SIMD vector processor it is executed with much better efficiency than with scalar or plain vector instructions. The advantage of vectorization in the SIMD architecture is that the instruction needs to be decoded only once, while in other architectures each instruction would have to be decoded every time. With the SIMD architecture all four instructions are executed at once, which is not possible in a vector architecture. So here we can see that we have saved the time to decode 3 instructions and all 4 instructions are executed at once. The number of iterations has also decreased from 100 to 25.
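The same peel-then-unroll pattern can be captured as one generic routine, sketched below. The helper name and the fixed vector length of 4 are illustrative assumptions and are not part of the authors' tool chain; the routine merely restates the transformation shown in the two fragments above.

#include <stdio.h>

#define VLEN 4   /* vector length of the target architecture */

/* Element-wise add x[i] = y[i] + z[i] for n elements.  The first
 * (n % VLEN) iterations are peeled off so that the remaining trip count
 * is a multiple of VLEN, and the main loop is unrolled by VLEN. */
void add_peeled_unrolled(int *x, const int *y, const int *z, int n)
{
    int peel = n % VLEN;        /* iterations peeled before the vector loop */
    int i;

    for (i = 0; i < peel; i++)          /* peeled scalar iterations */
        x[i] = y[i] + z[i];

    for (; i < n; i += VLEN) {          /* 4-way unrolled, vectorizable body */
        x[i]     = y[i]     + z[i];
        x[i + 1] = y[i + 1] + z[i + 1];
        x[i + 2] = y[i + 2] + z[i + 2];
        x[i + 3] = y[i + 3] + z[i + 3];
    }
}

int main(void)
{
    enum { N = 99 };
    int x[N], y[N], z[N];
    for (int i = 0; i < N; i++) { y[i] = i; z[i] = 2 * i; }

    add_peeled_unrolled(x, y, z, N);
    printf("x[98] = %d\n", x[98]);      /* expect 98 + 196 = 294 */
    return 0;
}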

3 Conclusions
In this paper we have presented a SIMD vector architecture which takes the benefits of both the superscalar processor and the vector processor. As the vector processor uses deeply pipelined functional units, the operations on the elements of a vector are performed concurrently. We have taken as benchmark Kernel 5 (tri-diagonal elimination) of the Livermore Loops coded in C. The SIMD vector processor implements parallelism on short vectors having four words. The operations on these words are performed simultaneously, i.e., in one cycle. This reduces the clock cycles per instruction (CPI). The parallelism in vector processing requires superscalar issue of vector instructions. By vectorizing, we decrease the number of iterations from 100 to 25, if the number of iterations in the original loop is taken as 100. Assuming each instruction takes one clock cycle to execute, in the SIMD vector architecture the inner loop of the code is executed in 35*25 = 875 cycles, while in a scalar architecture it takes 35*99+27 = 3492 cycles. Here the performance is directly dependent on the data dependency: if the data dependency is high, then the parallelism

will be greatly decreased, i.e., the achievable parallelism falls as the data dependency rises. So, ideally, by using the SIMD vector architecture we gain a speedup factor of four, i.e., the SIMD type vector processor executes about 4 times faster than a scalar processor.
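Working out the speedup implied by the cycle counts above (the counts themselves are the authors'; only the division is added here):

\[
\text{speedup} = \frac{\text{scalar cycles}}{\text{SIMD vector cycles}}
              = \frac{35 \times 99 + 27}{35 \times 25}
              = \frac{3492}{875} \approx 3.99 \approx 4 .
\]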

References
1. Muchnick, S.S.: Advanced Compiler Design and Implementation. Morgan Kaufmann, San Francisco (1997)
2. DeVries, D., Lee, C.G.: A Vectorizing SUIF Compiler. In: Proceedings of the First SUIF Compiler Workshop, pp. 59-67 (January 1995)
3. Wilson, R.P., French, R.S., Wilson, C.S., Amarasinghe, S.P., Anderson, J.M., Tjiang, S.W.K., Liao, S.W., Tseng, C.W., Hall, M.W., Lam, M.S., Hennessy, J.L.: SUIF: An Infrastructure for Research on Parallelizing and Optimizing Compilers. ACM SIGPLAN Notices 29(12), 31-37 (December 1994)
4. Asanovic, K.: The Torrent Architecture Manual. University of California, Berkeley (1994)
5. Bacon, D.F., Graham, S.L., Sharp, O.J.: Compiler Transformations for High-Performance Computing. ACM Computing Surveys 26(4) (December 1994)
6. Nuzman, D., Zaks, A.: Outer-Loop Vectorization - Revisited for Short SIMD Architectures. In: PACT 2008, Toronto, Ontario, Canada, October 25-29 (2008)
7. Kavvadias, N., Nikolaidis, S.: Application Analysis with Integrated Identification of Complex Instructions for Configurable Processors. In: Proc. of the 14th Intl. Workshop on Power and Timing Modeling, Optimization and Simulation, Santorini, Greece, September 15-17, pp. 633-642 (2004)
8. Smith, M.D., Holloway, G.: An introduction to Machine SUIF and its portable libraries for analysis and optimization. Tech. Rpt., Division of Eng. and Applied Sciences, Harvard University, 2.02.07.15 edition (2002)
9. McMahon, F.H.: The Livermore Fortran Kernels: A computer test of the numerical performance range. Lawrence Livermore National Laboratory, Livermore (December 1986)

An Extension to Global Value Numbering


Saranya D. Krishnan and Shimmi Asokan
Rajagiri School of Engineering and Technology, Kochi, India
saranyadk@gmail.com, shimmi_a@rajagiritech.ac.in

Abstract. Optimizing compilers play a crucial role in making a computer program efficient. Many optimization techniques are used to improve the performance of a program written in a high level language. The different types of optimization techniques include data-flow optimization, control flow optimization, SSA-based optimization, loop optimization, code generator optimization, functional language optimization, etc. Some optimizations are done early during the compilation or optimization phase since they help or improve further optimizations. Some of the early optimizations are value numbering, constant folding, constant propagation, copy propagation, scalar replacement of aggregates, etc. We propose here an extension to the existing early optimization technique of Global Value Numbering. The value graph method used for global value numbering is utilized. The idea is to include some algebraic simplification and error detection during the value numbering phase itself. This is beneficial since we can improve an existing technique without incurring any additional cost, and it gives scope to increase the efficiency of other optimizations.
Keywords: early optimization, global value numbering, value graph, algebraic simplification, error detection.

Introduction
Compilers in general transform a piece of code from a source language to a target language. Optimizing compilers do the same, but also make the code more efficient without changing its meaning. Efficiency is achieved in terms of time or space. This means that the time taken to execute the program or the memory occupied by the program has to be minimized. Optimizing compilers are judged by the quality of code they produce. The criterion is that any optimized program should be semantically equivalent to the original program. The average time of program execution should be reduced without modifying the algorithm. In short, optimization should be worth the effort.
There are a variety of optimization techniques, each applied at different stages of compilation. The optimizations on intermediate code rearrange or compress the code. This in effect reduces the number of computations, i.e., the number of three-address instructions, or makes the abstract syntax tree smaller. Optimizations applied during the code generation phase work with memory registers and the like. Some other optimizations may be done after final code generation; they attempt to change the assembly code into a more efficient form.
The majority of optimizations are in some way inter-related. One optimization may affect or improve another. Thus determining the order of performing different optimizations is of much importance. Phase ordering is not an easy task. Many optimizations could be done in any order, but their interaction will determine the effectiveness of many other optimizations [7]. In general, the optimizations that improve or help most of the other optimizations are performed early during the compilation¹ process or optimization² phase.
The value numbering technique considered here is a data flow dependent³ early optimization⁴ [4]. It can be applied to basic blocks or to the entire procedure.

Global Value Numbering
Value numbering applied to the entire procedure is called global value numbering. It is a method of determining equivalent computations and eliminating repetitions. Equivalence detection in global value numbering gains importance since it is used by compilers in many other important optimizations. These include common sub-expression elimination, branch elimination, loop invariant code motion, copy propagation, constant propagation [5], etc. Other important applications include use in tools for plagiarism detection and translation validation⁵ [3].
There are several algorithms developed for performing global value numbering. These algorithms follow either an optimistic approach, a pessimistic approach, or a combination of both in determining congruence of variables. One of the earliest algorithms is Kildall's algorithm [3]. Alpern, Wegman, and Zadeck developed an algorithm based on partition refinement which uses the value graph [1]. The value numbering algorithm of Briggs, Cooper and Simpson combines the hashing technique of basic block value numbering and the global value numbering approach [2]. Gulwani and Necula have proposed a random interpretation based algorithm and a polynomial time algorithm for global value numbering [3,6].
Our proposal for extending global value numbering is based on the Alpern, Wegman, and Zadeck (AWZ) algorithm, which uses the value graph. The AWZ algorithm requires the procedure to be converted to SSA form⁶. At the join points of the flow graph, additional uninterpreted functions which act as selection functions are used. The value graph is constructed from the flow graph resulting from the SSA form of the procedure. The nodes in the value graph are partitioned into congruence classes to identify congruence of variables. Two variables are considered equivalent at a particular point in the program if they are found to be congruent and their defining assignments dominate that program point [4].
¹ On source code - programming language dependent optimizations.
² On intermediate code - programming language independent optimizations.
³ Dependent on the flow of data within the program.
⁴ Optimizations usually applied early in the compilation or optimization process.
⁵ Checking the correctness of the optimizer by comparing the optimized code with the original code before optimization.
⁶ Static Single Assignment form.

Fig. 1. Code Snippet to generate Value Graph
Fig. 2. Value Graph to Identify Congruence
A small sample code and its corresponding value graph for equivalence detection are shown in Fig. 1 and Fig. 2.
The value graph generated from the flow graph consists of nodes and directed edges, similar to a DAG⁷. Nodes represent either operators or operands; an operator can also be a function symbol. Operands are variables, constants or, in the case of non-leaf nodes, other operators. The node name is the corresponding variable name in SSA form, and the node label can be a constant, function symbol or operator. If only a single assignment to a variable occurs, no subscripts are used in node names. Arbitrary names are given to nodes in case there is no SSA form variable attached to them. A directed edge from an operator or function to an operand is labelled with the position of the operand.
Congruence of two variables in the value graph is identified by three rules: nodes that are the same or have the same name are congruent; constant-labelled nodes with the same content are congruent; and nodes labelled with the same operator having congruent operands are also congruent. These conditions are checked using congruence class partitioning, after which equivalence is identified.
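To make the idea of detecting equivalent computations concrete, the sketch below gives a simple table-based local value numbering routine: two expressions with the same operator and same-numbered operands receive the same value number, so the second occurrence can reuse the first. This illustrates the equivalence-detection idea in the hashing flavour mentioned for the Briggs-Cooper-Simpson approach rather than the full AWZ partition refinement; the table size and the integer encoding of value numbers are assumptions, and bounds checking is omitted.

#include <stdio.h>

#define MAX_EXPRS 64
#define FIRST_EXPR_VN 100   /* value numbers below this are reserved for leaves */

/* One recorded expression: an operator applied to two value numbers. */
struct expr { char op; int left_vn; int right_vn; };

static struct expr table[MAX_EXPRS];
static int num_exprs = 0;

/* Return the value number for (op, left_vn, right_vn).  If an identical
 * expression was numbered before, the same value number is returned, so
 * the caller knows the new computation is redundant. */
int value_number(char op, int left_vn, int right_vn)
{
    for (int i = 0; i < num_exprs; i++)
        if (table[i].op == op &&
            table[i].left_vn == left_vn &&
            table[i].right_vn == right_vn)
            return FIRST_EXPR_VN + i;        /* equivalent computation found */

    table[num_exprs].op = op;
    table[num_exprs].left_vn = left_vn;
    table[num_exprs].right_vn = right_vn;
    return FIRST_EXPR_VN + num_exprs++;
}

int main(void)
{
    int a = 0, b = 1;                        /* leaves a and b pre-numbered   */
    int x = value_number('+', a, b);         /* x = a + b                     */
    int y = value_number('+', a, b);         /* y = a + b: same value number  */
    int z = value_number('-', a, b);         /* z = a - b: new value number   */

    printf("vn(x)=%d vn(y)=%d vn(z)=%d\n", x, y, z);
    return 0;
}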
Similar to the reuse of nodes in a DAG to avoid inserting new nodes, computations in the value graph can be replaced by previous equivalent computations; recomputation is thus avoided by the reuse of expressions.

⁷ Directed Acyclic Graph.

Proposed Solution
We propose here an extension to Global Value Numbering. The proposed technique performs some algebraic simplifications and error detection along with global value numbering. The value graph method of finding equivalent computations is used to perform these additional jobs. In addition to identifying equivalent computations and eliminating redundancy, the value graph is used to simplify computations by applying algebraic laws after identifying the operator, and to detect errors.
Algebraic simplifications are mainly based on the properties of existence of an identity and existence of an inverse. The algebraic properties applied to the value graph are listed below.

a + 0 = 0 + a = a
a - 0 = a
0 - a = -a
a - a = 0
a * 1 = 1 * a = a / 1 = a
a * 0 = 0 * a = 0
a / a = 1
Similar simplifications can be applied to Boolean-valued expressions also. The Boolean-valued constant simplifications are listed below.
b ∨ true = true ∨ b = true
b ∨ false = false ∨ b = b
b ∧ true = true ∧ b = b
b ∧ false = false ∧ b = false

Using this extension to GVN will help to improve the eciency of optimizations
like constant folding, dead code elimination etc. Also error detection during
optimization phase is possible.
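A sketch of how such rules could be applied during value numbering is given below: before an operator node is entered into the value graph, a simplification routine checks the identity, zero and inverse cases and flags division by zero. This is an illustrative routine written for this summary, not the authors' implementation; representing operands as plain integers is an assumption made to keep the example self-contained.

#include <stdio.h>

/* Result of trying to simplify (left op right) before building a
 * value-graph node for it. */
enum kind { NO_SIMPLIFICATION, CONSTANT_RESULT, OPERAND_RESULT, ERROR_DIV_ZERO };

struct simplified { enum kind kind; int value; };

/* Apply the identity/inverse rules listed above to an expression whose
 * operands are known.  Returns what the expression reduces to, or flags
 * an error such as division by zero. */
struct simplified simplify(char op, int left, int right)
{
    struct simplified r = { NO_SIMPLIFICATION, 0 };

    switch (op) {
    case '+':
        if (right == 0)      { r.kind = OPERAND_RESULT;  r.value = left;  }
        else if (left == 0)  { r.kind = OPERAND_RESULT;  r.value = right; }
        break;
    case '-':
        if (right == 0)        { r.kind = OPERAND_RESULT;  r.value = left; }
        else if (left == right){ r.kind = CONSTANT_RESULT; r.value = 0;    }
        break;
    case '*':
        if (left == 0 || right == 0) { r.kind = CONSTANT_RESULT; r.value = 0; }
        else if (right == 1)         { r.kind = OPERAND_RESULT;  r.value = left;  }
        else if (left == 1)          { r.kind = OPERAND_RESULT;  r.value = right; }
        break;
    case '/':
        if (right == 0)        { r.kind = ERROR_DIV_ZERO; }   /* y/0 or 0/0 */
        else if (right == 1)   { r.kind = OPERAND_RESULT;  r.value = left; }
        else if (left == right){ r.kind = CONSTANT_RESULT; r.value = 1;    }
        break;
    }
    return r;
}

int main(void)
{
    struct simplified a = simplify('*', 7, 0);   /* x * 0 -> 0            */
    struct simplified b = simplify('/', 7, 0);   /* division by zero flag */
    printf("x*0 kind=%d value=%d, x/0 kind=%d\n", a.kind, a.value, b.kind);
    return 0;
}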
Consider the code snippet in Fig. 3, which illustrates the effectiveness of applying the proposed optimization technique.

Fig. 3. Code to apply proposed method

To apply the proposed method, the value graph constructed for GVN is used. The value graph corresponding to the code above is shown in Fig. 4.
Fig. 4. Value graph for given code
Fig. 5. (a) Value graph after simplification (b) Final result

The value graph is used to determine the equivalence of operands and to interpret the operations. In the value graph we can replace an expression with a single value (variable/constant), which is the result of the computation, by applying the proposed simplifications. After applying the proposed technique, the resulting value graph will be as shown in Fig. 5.
Providing this extension to Global Value Numbering improves further optimizations like constant folding, algebraic simplification, dead code elimination, if simplification, etc. Some examples that improve other optimizations are illustrated below.
Constant folding
x / x → 1
x * 0 → 0
Algebraic simplification
x / 1 → x
x * 1 → x
Dead code elimination
if ( y > y ) → if ( false )
If simplification
if ( z >= z ) → if ( true )
It also helps to detect errors at compile time: arithmetic errors are detected during the optimization phase, which is otherwise done at run time. The examples given below illustrate the errors detected by this method.
Division by zero
y / 0
Undefined operation
0 / 0
The sequence of optimizations plays a key role in determining how far this technique becomes effective.

Conclusion
The proposed technique is not a replacement for any existing optimization technique. It is an extension to Global Value Numbering using the value graph. It further improves later optimizations like constant folding, dead code elimination, if simplification, etc. It also helps in detecting certain arithmetic errors in the optimization phase itself. No additional cost is involved since it uses the tools of the existing technique.
Deciding the sequence of optimizations is the key factor in determining the effectiveness of this technique. A sequence which could provide reasonable efficiency would be to apply it as an early optimization, after copy propagation, along with Global Value Numbering and before constant folding & propagation and control flow optimizations.
References
1. Alpern, B., Wegman, M.N., Zadeck, F.K.: Detecting equality of variables in programs. In: 15th Annual ACM Symposium on Principles of Programming Languages, pp. 1-11 (1988)
2. Briggs, P., Cooper, K.D., Simpson, L.T.: Value numbering. Software-Practice and Experience (1997)
3. Gulwani, S., Necula, G.C.: A Polynomial-Time Algorithm for Global Value Numbering. In: Static Analysis Symposium (2004)

4. Muchnick, S.S.: Early Optimizations. In: Advanced Compiler Design and Implementation, pp. 329-360, 580-586. Morgan Kaufmann, San Francisco (2000)
5. Wegman, M.N., Zadeck, F.K.: Constant propagation with conditional branches. ACM Transactions on Programming Languages and Systems (1991)
6. Gulwani, S., Necula, G.C.: Global Value Numbering using Random Interpretation. In: Proceedings of the Principles of Programming Languages (2004)
7. Kulkarni, P.A., Whalley, D.B., Tyson, G.S., Davidson, J.W.: Exhaustive Optimization Phase Order Space Exploration. In: Proceedings of the International Symposium on Code Generation and Optimization (2006)

Data Privacy for Grid Systems


N. Sandeep Chaitanya, S. Ramachandram, B. Padmavathi, S. Shiva Skandha,
and G. Ravi Kumar
1 Dept of CSE, College of Engineering, Osmania University, Hyd
2 Sreekavitha Engineering College, Karepalli, Khammam, AP, India
3 Dept of CSE, CMR College of Engineering & Technology, Hyd, AP, India
4 Dept of CSE, CMR College of Engineering & Technology, Hyd, AP, India
{n.sandeepchaitanya,s.ramachandramou,padmavathi.sahu,sivaskandha,g.ravikumnar06}@gmail.com

Abstract. Grid computing can provide users with dynamically scalable, shared resources over the internet, but users usually fear security threats and loss of control of data and systems. This paper presents a practical architecture to protect the data confidentiality of guest virtual machines. With this solution, even the grid computing service providers cannot access the private data of their clients. This is very important and attractive for grid clients. In our work, we utilize virtualization technology and trusted computing technology to provide a secure and robust virtualization platform. On this platform, we customize the guest virtual machine operating system and strengthen the isolation between virtual machines, and therefore greatly improve the data privacy of grid services. With our solution, the grid service provider can compromise the availability, but not the confidentiality, of the guest virtual machines.
Keywords: grid computing, grid security, data privacy in grid, virtual machine.

1 Introduction
Grid computing is derived from several technologies: virtualization, distributed application design and IT management. It can provide dynamically scalable, shared resources over the Internet and avoids large upfront costs. Grid computing is getting more and more attention and promises to change the future of computing. Security requirements usually fall into four categories: confidentiality, integrity, recoverability, and availability. For most grid computing clients, data confidentiality is the primary property that must be guaranteed. In general, people or companies will never risk letting their private data be freely accessed by grid computing providers, so grid computing providers need to promise this data privacy property to attract more clients. In practice, grid services can be grouped into three categories: software as a service (SaaS), platform as a service (PaaS), and infrastructure as a service (IaaS). IaaS is an evolution of virtual private server offerings; it depends heavily on the underlying infrastructure: virtualization. Virtualization software is usually called a hypervisor; it allows a single physical server to host many guest virtual machines (VMs), and the guest VMs are provided to clients as the grid computing

service. The guest VMs enable individual developers and companies to deploy their own operating systems and applications without the cost and complexity of buying servers and setting them up. After that, clients may send their data to the VM and provide their dedicated services. Commercial examples of IaaS include Amazon's EC2 [4], etc. This paper aims to propose a virtualization solution that achieves data confidentiality for guest virtual machines. With this solution, even the IaaS grid computing service providers cannot access the private data of their clients.
This is very important and attractive for the clients. The solution only emphasizes data confidentiality; the service provider can still easily break down the availability and integrity of the clients' services and data. But such misbehaviour can be detected easily and will finally impair the service provider's commercial credibility. The basic idea of our solution is to combine machine virtualization technology with trusted computing technology to achieve the privacy of the virtual machines, and, by running a modified OS inside the VM, enhance the clients' data confidentiality against the service provider.
Our work is based on a type-1 hypervisor, such as Xen [5], which runs on bare hardware and hosts a privileged domain (called dom0) and several guest virtual machines (called guest VMs). An application named Qemu [6] running in dom0 provides the virtual platform and devices for guest VMs. We do not trust dom0 and Qemu, because they are both controlled by the grid computing service provider. Based on trusted computing technologies, we can verify the boot sequence of the target machine and make sure the trusted hypervisor is running on it. With the hypervisor, we can strongly isolate the client's memory from dom0's, and mediate the I/O accesses between Qemu and guest VMs to prevent guest VM data from being stolen. Finally, our solution needs to modify the guest kernel to remove all device drivers other than hard disk and network card drivers, and to disable ACPI and kernel BIOS calls, because dom0 may embed Trojan horses in this code.
Section 1.1 introduces the virtualization and trusted computing technologies. Section 2 presents the detailed design of the presented solution to show how the data privacy is achieved. In Section 3, we evaluate the solution and state that it may prevent all kinds of attacks from an untrusted dom0. We discuss related work in Section 4 and conclude in Section 5.
1.1 Background
1.1.1 Virtualization
Virtualization refers to the abstraction of computer resources; it is a key feature of grid computing. Many virtualization technologies have been proposed and implemented, such as Xen and VMware. VMware is commercial software that implements full virtualization. The Xen hypervisor is an open-source project developed at the University of Cambridge. We focus our work on Xen because it is open source and well accepted. The Xen hypervisor has been used in many commercial virtualization products; it acts as the engine of the Amazon Elastic Compute Cloud. A Xen-based system consists of several items that work together: the hypervisor, dom0, user-space tools, and domU (guest VMs). The Xen hypervisor abstracts the hardware for the virtual machines and controls the execution of the virtual machines as they share the common processing environment. Dom0 is a privileged VM; it runs a full-fledged

operating system and is always booted by the hypervisor. Dom0 is used for platform management. Xen supports two kinds of virtualization: paravirtualization and full virtualization. Full virtualization needs Intel VT or AMD-V hardware support; it can provide better isolation between VMs without the need to modify the guest operating system. In our work, we use fully virtualized Xen VMs. Every fully virtualized VM requires its own Qemu daemon, which exists in dom0.
In the existing Xen architecture, dom0 takes full control of all virtual machines running on the same host. When evaluating the trustworthiness of the guest VM, dom0 has to be included in the Trusted Computing Base (TCB); this implies that the system administrator must be trusted, which impairs the usefulness of Xen in grid computing.
1.1.2 Trusted Computing
Trusted Computing is a category of technology developed and promoted by the Trusted Computing Group. It is usually based on a TPM integrated on the motherboard. It includes technologies such as remote attestation, sealed storage and authenticated booting. The TPM specification prescribes a way of building up a chain of trust in the platform, so that when interacting with a particular application on a platform, a report can be obtained on the software stack that was executed on the platform.
This report is a list of Platform Configuration Register (PCR) values signed and certified by the TPM. To sign its PCRs, the TPM uses the private portion of an Attestation Identity Key (AIK) pair. The verifier uses the public AIK to validate the signature and then checks the PCR values. In this paper, we leverage the above virtualization technology and trusted computing technology to construct a secure and robust virtualization platform.

2 Architecture
2.1 Design Goal
(1) Provide VM platforms to grid computing users; users can then integrate an OS, middleware and application software on the platform at their own discretion.
(2) Make it impossible for the administrator and other users of the virtualization system to hack into the target VM. Only the valid user can boot the target VM.
(3) The mechanisms used to protect data privacy can be easily verified.
2.2 Overview of Our Architecture
Our implementation is very simple in theory: protect the data privacy of guest VMs with the help of a trusted Xen hypervisor. In practice we need to thoroughly analyze the virtualization system and make a clear separation between the correlated components. Figure 1 gives an overview of the implementation. We divide our Xen-based system into three parts: the trusted part, the untrusted part and the protected part (not including the BIOS and GRUB). This division is based on two technologies: (1) trusted computing, which provides a trustable hardware platform that can act as a core root of trust, and (2) virtualization, which provides isolation between different VMs.

Fig. 1. Overview of the architecture

The system is separated into three parts: box 1 contains the trusted part, box 2 the untrusted part, and box 3 the protected part.
2.3 Trusted Part
This part includes the hardware platform, a trusted bootloader and a trusted Xen hypervisor. The hardware should support the following:
TPM: The TPM provides a secure environment for authenticated booting, secure storage and secure I/O. It is the security base of our architecture.
IOMMU: With an IOMMU, we can prevent malicious dom0 device drivers from compromising the system address spaces through DMA.
VT-d or SVM: With these, Xen can provide better isolation between VMs without the need to modify the guest operating system.
2.4 Untrusted Part
We do not trust dom0 and the Qemu running in it, because they are both controlled by the grid computing provider. However, our guest VM still needs the virtual firmware, virtual platform and devices produced by Qemu. So we need a further analysis of Qemu, which is discussed in detail in Section 3.6.
2.5 Protected Part
As illustrated in Figure 1, this part includes the whole guest VM environment. As mentioned before, we focus our work on how to protect the data privacy of the guest VMs. The service providers can still compromise the availability and integrity of the clients' services and data, but they cannot steal the clients' secrets. It is worth noting that we exclude the BIOS and GRUB from the protected part. This is because we do not care about the load process of the kernel and initrd image; we only care about the load result, that is, whether the kernel is at the right address and in a correct state when we boot the guest operating system.

2.6 The Relations between the Three Parts
The trusted part is built upon the hardware root of trust. The physical platform configuration is measured by the trusted bootloader, which loads and measures itself and the hypervisor, and stores the measurements in the hardware TPM. Users can use remote attestation to verify the boot process and make sure that the trusted Xen hypervisor is running on the physical machine. We use a modified Xen hypervisor, adapted to support verification of the guest VM kernel image. As shown in Figure 1, the hypervisor is responsible for CPU scheduling and memory partitioning among VMs. It has no knowledge of networking, external storage devices, video, or any other common I/O functions found on a computing system. The guest VMs in the protected part rely on the hypervisor to provide scheduling and memory protection, and they depend on the Qemu in dom0 to provide virtual devices and services, such as network, storage, video and virtual firmware. Qemu is responsible for booting the guest VMs. Figure 2 illustrates the Qemu and guest VM relations in our architecture.
We only activate virtual network devices and virtual disk devices. To prevent dom0 from hacking into the guest VM, we modify the guest VM kernel as follows:
a) Disable mouse, VGA console, frame buffer, keyboard, serial and sound in the kernel configuration file. These are not necessary for remote users, and they may disclose critical information to the grid service administrator.
b) Disable the ACPI configuration option. We do not trust the code in the ACPI tables and BIOS, because they are provided by Qemu and may include Trojan horses. Without ACPI, the kernel can still work in legacy mode to initialize devices. With the help of the hypervisor, BIOS calls can be avoided during the kernel boot process.
c) Disable the kernel debug options to prevent a malicious administrator from tapping the kernel.
d) Enable the dm-crypt option, which provides transparent encryption of block devices using the kernel crypto API. We rely on it to protect the guest VM's virtual disks.
e) Only include the necessary device drivers.

Fig. 2. Qemu and guest VM in our architecture

2.7 Guest VM Boot Process
We must guarantee that only the valid user can boot the guest VM and that the boot process is absolutely safe. Before booting the guest VM, the user needs to prepare the boot disk image and the root disk image. Firstly, create an encrypted disk image, install the root file system on it, and store the password of this disk image (call it PASS-FILE) in the initrd. Secondly, compile the kernel, encrypt the kernel and initrd with a password (call it PASS-BOOT), and then wrap the encrypted kernel into a pack. The wrapping code includes a hypercall which asks the hypervisor to decrypt the kernel and initrd. Figure 3 shows the kernel pack.

Fig. 3. The wrapped kernel pack

Finally, create a boot disk image, install GRUB on the disk image, and put the kernel and initrd on it. Send these two disk images to the dom0 on the grid server. To protect the data privacy in the boot process, we separate the boot process into the following seven steps (as shown in Figure 4).

Fig. 4. The boot process of the guest VM kernel and initrd image

(1) After remote attestation, the remote user sends a boot request to dom0.
(2) Dom0 uses Qemu to launch the specific guest VM. In the guest VM, GRUB loads the encrypted kernel and initrd images to the appointed addresses.
(3) The kernel wrapping code executes a hypercall, asking the hypervisor to decrypt the images.
(4) The hypervisor challenges the remote user.

(5) The remote user supplies the PASS-BOOT encrypted with the hypervisor's public key, and the hypervisor decrypts the guest VM kernel and initrd.
(6) The hypervisor transfers control to the guest VM; the guest VM kernel continues its work and performs several checks to make sure it is placed at the right address and is in a correct state.
(7) After the kernel comes up, the PASS-FILE is used to mount the encrypted file systems. Now a new secure guest system starts to work.
In the above process, dom0 can only see the encrypted password and has no chance to hack into the boot process. Only the valid user can boot the guest VM.

3 Evaluation
In this work, we aim to improve the data privacy of guest VMs and prevent potential attacks coming from two kinds of sources: the owner of the hardware machine and the grid service administrator (in our case, dom0). We evaluate the confidentiality of our system in the following aspects.
3.1 Hardware Platform
In our implementation, we use a TPM-based trusted loader, OSLO, to boot the physical server. It leverages the Dynamic Root of Trust for Measurement (DRTM) to secure the boot process. With the help of the loader, our trusted hypervisor finally takes over the machine. Users can use TPM-based attestation to verify the software stack running on the physical machine. If a malicious program alters part of the boot loader or operating system, the grid customer can detect the change quickly and reliably. So users can be assured that the hardware platform is trustworthy.
3.2 Memory Isolation
The Xen hypervisor allows multiple VMs to run at the same time. A VM may
only manipulate its own page table to include pages to which it has been granted
explicit access. Even dom0, confined by the IOMMU, cannot access the memory of
other VMs. Therefore, the grid service administrator cannot undermine the
confidentiality of the memory space.
3.3 Storage
A virtual block device (VBD) in a guest VM may exist as a file in dom0, or as a
physical disk or partition. All disk I/O in VMs needs the help of dom0, so dom0 may
inspect the data blocks and tamper with their contents. In our work, we protect the
virtual disks using dm-crypt. Dm-crypt is a transparent disk encryption subsystem;
it is implemented as a device-mapper target and can encrypt whole virtual disks.
Dom0 can only see the data in encrypted form. Data secrecy, integrity, ordering and
freshness are protected up to the strength of the cryptography used. If dom0 or other
hostile code tries to modify the encrypted data, the guest VM will just terminate.
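As an illustration of how such an encrypted virtual disk can be prepared, the sketch
below drives the standard cryptsetup/dm-crypt tooling from Python; the image path,
size, key file and mapping name are assumptions chosen for the example, the commands
need root privileges, and recent cryptsetup versions attach a loop device automatically
when given a regular file.

```python
# Minimal sketch: create a dm-crypt (LUKS) protected disk image for a guest VM,
# so that dom0 only ever sees the ciphertext stored in the image file.
# Requires root and the cryptsetup / e2fsprogs tools; names below are examples.
import subprocess

def make_encrypted_vbd(image_path="guest-root.img", size_mb=1024,
                       keyfile="pass-file", mapping="guestroot"):
    def run(*cmd):
        subprocess.run(list(cmd), check=True)
    # 1. Allocate the backing file that dom0 will expose as the VBD.
    run("truncate", "-s", f"{size_mb}M", image_path)
    # 2. Format it as a LUKS volume keyed by PASS-FILE (no interactive prompt).
    run("cryptsetup", "luksFormat", "--batch-mode", image_path, keyfile)
    # 3. Open the volume and create a file system inside the plaintext mapping.
    run("cryptsetup", "open", image_path, mapping, "--type", "luks",
        "--key-file", keyfile)
    run("mkfs.ext4", f"/dev/mapper/{mapping}")
    run("cryptsetup", "close", mapping)

# make_encrypted_vbd()  # the resulting image can then be handed to dom0
```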


3.4 Network
The virtual network driver is implemented as a virtual split device that has a front end
in the guest VM and a back end in Qemu. Dom0 serves just like a router. We make
no effort to protect network I/O, as this is addressed by existing technologies such as
SSL. Users can also use Virtual Private Networking (VPN) to protect the
confidentiality and integrity of the network.
3.5 Guest VM Boot Process
The kernel and initrd images are encrypted by the user. Dom0 cannot get the
plain-text password; therefore, it cannot decrypt the kernel image. If dom0
modifies the kernel or initrd image, the hypervisor will fail when decrypting the
images; the failure will be logged in a temporary buffer, and the user can use
remote attestation to acquire this information.
3.6 Virtual Devices
Qemu connects the untrusted part and the protected part, so we must treat it very
carefully. It provides all virtual devices for guest VMs: network interface card (NIC),
disk, VGA, mouse, keyboard, serial and sound. Qemu also provides other virtual platform
chips such as the Programmable Interval Timer (PIT), the Programmable Interrupt
Controller (PIC), bus controllers (e.g., the PCI bus controller) and virtual firmware.
In our work, we modify the guest kernel and only allow several specific NIC drivers and
the IDE driver to remain in the kernel. ACPI and BIOS code is not allowed to execute in
the kernel initialization process, so dom0 cannot compromise the guest system by
embedding Trojan horses in this code. A few BIOS calls, such as the E820 call, are moved
into the kernel wrapping code; they are executed before kernel initialization and the
results are saved at appointed addresses. In rare cases, the guest kernel needs to read
some system information from the BIOS data area, which may be tampered with by
dom0. We examined these situations and found that such malicious modifications do
affect kernel behavior to some extent, but they cannot lead to privacy leakage. In all the
above situations, the machine owner and system administrator can break down the
data availability, integrity and recoverability of the guest VMs, but they cannot break
down the data confidentiality. It should be noted that we only provide a mechanism to
prevent the grid provider from hacking into the guest VM; the guest OS still needs
effective measures to protect against attacks from the Internet.

4 Related Work
The European Network and Information Security Agency (ENISA) assesses the risks
and benefits of grid economies from a security point of view. ENISA lists the top
security risks, including loss of governance, isolation failure, malicious
insiders, etc. Kelton Research conducted a survey of grid computing in 2009
and analyzed the status quo of grid computing. Many endeavors have been
made to improve data privacy. The Terra architecture proposes moving the entire
application into a separate VM with its own application-specific software stack
tailored to its assurance needs, but Terra's security infrastructure depends heavily on


the privileged management VM. Microsoft's Next-Generation Secure Computing
Base (NGSCB) is also aiming to provide a system-wide solution that includes
hardware, operating system kernel and an execution environment. Derek G. Murray
uses disaggregation to shrink the TCB of a virtual machine in a Xen-based system; in
their work the VM-building functionality is moved into a separate, trusted VM
which runs alongside dom0. A major limitation of this approach is that the dom0
kernel must be included in the TCB. Nizza's work presents an architecture that
allows applications to be built with a much smaller TCB. It is based on a kernelized
architecture and on the reuse of legacy software using trusted wrappers. Flicker
leverages new commodity processors from AMD and Intel to establish a mechanism
that can support secure execution even when the surrounding operating system is
completely compromised.

5 Conclusion
In our work, we focus on improving data privacy in grid computing and try to
find a solution to prevent potential attacks coming from the grid service provider (in our
case, dom0). By using Xen-based virtualization technology and TPM-based
trusted computing technology, we construct a secure and robust virtualization
platform. Based on this platform, we divide the whole system into three parts:
a trusted part, an untrusted part and a protected part. We place dom0 and the Qemu
application in the untrusted part, because they are controlled by the grid service
provider. To achieve better isolation between VMs, we customize the guest VM
operating system, disable all unnecessary virtual devices, and disallow code from the
untrusted part to be executed in the guest VM. These modifications greatly improve
the data privacy of the guest VM. Finally, we evaluate our system's confidentiality
in the following aspects: hardware platform, memory isolation, storage, network,
guest VM boot process, and Qemu virtual devices. The evaluation shows that our
architecture provides a good solution to protect the confidentiality of grid
clients. In future research, we will concentrate on control and data flow analysis
and policy analysis of the Qemu virtual driver models in greater detail.

References
1. Kelton Research: 2009 Global Survey of Grid Computing (January 2009)
2. Sun Microsystems: Introduction to Grid Computing Architecture, white paper (June 2009)
3. Wikipedia, http://en.wikipedia.org/wiki/Grid_computing
4. Amazon Web Service, http://aws.amazon.com/
5. Barham, P., et al.: Xen and the art of virtualization. In: Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, pp. 164-177. ACM Press, New York (2003)
6. Qemu, http://wiki.qemu.org/
7. TCG: Trusted Computing Group, https://www.trustedcomputinggroup.org
8. Kauer, B.: OSLO: Improving the Security of Trusted Computing. In: Proceedings of the 16th USENIX Security Symposium. USENIX Association (2007)
9. ENISA: Grid Computing: Benefits, Risks and Recommendations for Information Security (November 2009)

Towards Multimodal Capture, Annotation and Semantic Retrieval from Performing Arts

Rajkumar Kannan1, Frederic Andres2, Fernando Ferri3, and Patrizia Grifoni3
1 Bishop Heber College (Autonomous), Tiruchirappalli, India
2 National Institute of Informatics, Tokyo, Japan
3 IRPPS-CNR, Rome, Italy
rajkumar@bhc.edu.in, andres@nii.ac.jp,
{fernando.ferri,patrizia.grifoni}@irpps.cnr.it

Abstract. A well-annotated dance media is an essential part of a nation's
identity, transcending cultural and language barriers. Many dance video
archives suffer from tremendous problems concerning authoring and access,
because of the multimodal nature of human communication and the complex
spatio-temporal relationships that exist between dancers. A multimodal dance
document consists of video of dancers in space and time, their dance steps
through gestures and emotions, and the accompanying song and music. This work
presents the architecture of an annotation system capturing information directly
through the use of sensors, comparing and interpreting them using a context and
a user's model in order to annotate, index and access multimodal documents.
Keywords: Multimodal data, Semantic retrieval, Sensors, Multimedia indexing.

1 Introduction
Until recently, the only way for dance experts to pass on their knowledge from
generation to generation was orally. Although dance media has immensely benefited
from advances in recording and storage of digital information, which have enabled
rapid production of dance videos, dance media annotation and querying are still major
challenges, and the gap between media features and the capabilities of the existing
media tools is wide. What is needed is a dance video annotation system that takes into
consideration dance notations.
A dance notation, defined by Encyclopedia Britannica, is the recording of dance
movements through the use of written symbols. Such a notation is a symbolic form of
representing movements of dancers. It is used to document, analyze, and choreograph
dance pieces. A dance movement represents a basic pose, gesture or action done by a
dancer. A dance piece is a set of dance movements. The collection of dance pieces
denotes a dance performed by dancer(s).
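The movement / piece / dance hierarchy just described maps naturally onto a small
data model; the sketch below is one illustrative way to hold it for annotation purposes
(the field names are our own, not a prescribed schema from the paper).

```python
# Illustrative data model for the hierarchy described above: a dance is a
# collection of dance pieces, a piece is a set of movements, and a movement
# is a basic pose, gesture or action performed by a dancer.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Movement:
    dancer: str          # actor performing the movement
    body_part: str       # agent, e.g. "left leg"
    action: str          # e.g. "raise to hip level"
    start: float         # seconds into the video
    end: float

@dataclass
class Piece:
    label: str           # e.g. "actions representing royalty"
    movements: List[Movement] = field(default_factory=list)

@dataclass
class Dance:
    title: str
    pieces: List[Piece] = field(default_factory=list)
```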
Two popular dance notations are Labanotation and Benesh notation. Labanotation
[1] is a standardized system for analyzing and recording any human movement. The
original inventor is Rudolf von Laban (1879-1958), an important figure in European
modern dance. He published this notation in 1928 as Kinetographie in the first issue

of Schrifttanz. Several people continued the development of this notation. In
Labanotation, it is possible to record every kind of human motion. Labanotation is not
connected to a specific style of dance. Its basis is natural human motion, and every
change from natural human motion (e.g. turned-out legs) has to be specifically written
down in the notation, as shown in Figs. 1-3.
Benesh [2] movement notation invented by Joan and Rudolf Benesh is particularly
prominent in ballet, and it shows the whereabouts of a dancer on stage, the direction she
or he faces, the positions of the limbs, and the details of the head, hands and feet. J&R
Benesh also notates movements by recording the paths traced by the limbs. Figure 4
shows an example of Benesh notation for the human avatar shown in Fig. 3. The human
avatar shows a dance movement where the hands are raised to shoulder level.
A fundamental problem with Labanotation and Benesh notations is that very few
people, typically only dance experts, understand them. The annotation process
involves a dance expert watching a dancer and recording their movements in
notational form and interpreting the annotations to dance learners. Dance annotations
are hand-written documents. Consequently, learners may have trouble visualizing a
dance described in such a form.

Fig. 1. Staff

Fig. 2. Direction Symbols

Fig. 3. Laban score with avatar

Fig. 4. Benesh Notation for Human Avatar

The video annotation tools IBM VideoAnnEx [17] and Vannotator provide only
high-level annotation facilities like object, time, location and events. These features
have to be complemented with finer level annotation capabilities for dance features
such as dance movements, emotion, story and spatio-temporal characteristics of
dancers (i.e. actors) and their body parts (i.e. agents). However, these annotation tools
are either application dependent or provide few authoring facilities.
The multimodal dance information system will help dance students learn various
dance movements along with emotion expressed by dancers. It also helps choreographers to design new dance sequences. Besides, a cultural expert can study the


evolution of dance over the years and preserve it for future generations. The
architecture on which the system is based makes it possible to directly annotate modal
and multimodal data after they are recognised and interpreted, involving inputs from
different sensors.
The rest of the paper is organized as follows. Section 2 gives an overview of dance video
annotation models and systems. In Section 3, a brief description of our earlier MPEG-7
based dance video annotation and retrieval system is given. The need for
automatic annotation of multimodal inputs from a dance presentation is illustrated in
Section 4. Section 5 presents our proposed architecture for the multimodal dance
information system in detail, and Section 6 concludes the paper, mentioning the road
ahead in our research.

2 Related Work
Current research on the performing art of dance can be broadly divided into two
categories: dance composition and visualization and dance analysis and retrieval.
Studies have been conducted on dance composition using notations, especially
Labanotation and Benesh [3, 4, 5, 6]. Hachimura's system [7] uses markers placed on
dancers' bodies and limbs and records 3D motion data. Hattori et al. [8] developed
key-frame-based dance animation software. In his approach, certain key poses are
identified and are coded in Labanotation.
Several graphical editors have been developed, including LabanEditor [7],
LabanWriter for Mac OS, Calaban [9] for Windows, NUNTIUS [10], and LED and
LINTER [11] for UNIX systems. In addition, there are graphical editors for Benesh
notation, such as MacBenesh and Benesh Notation Editor [12]. MacBenesh is a
Macintosh application that lets the user create high-quality single dancer from Benesh
movement notation scores that can be saved as a document. Benesh Notation Editor is
a Windows-based application for writing Benesh movement notation scores. It
resembles a word processor; thereby, the Benesh scores can be saved as a file. Several
commercial software applications such as LifeForms and DanceForms [13] are
available for interactively composing dances. These tools use virtual reality features
to model the movements of dancers. We surveyed the various ideas these studies and
applications have on how to use notations for documenting dance semantics. Besides
such interactive editing, our approach provides features for semantic annotation for
efficient learning.
Dance analysis and retrieval systems perform semantic interpretation of dance
movements in the context of culture, action, gestures and emotions. A system that
directly calculates the similarities among body motion data is described in [7]. It
considers the issue of how the similarity of identical body motions should be
defined. This paper gave us insight on how to represent dance movements with
multiple granularities (i.e. on actor and agent levels).
Kalajdziski and Davcev [14] developed a system for annotating Macedonian dance
videos with keywords from a controlled vocabulary. The system consists of modules
for segmentation, annotation, 3D dancer animation, and Laban score generation. We
adopted their idea of high-level action as MPEG-7 descriptions. However, their system
offers only a limited vocabulary for annotation and lacks MPEG7 query processing.


Forouzan et al. [15] designed a multimedia information system for Macedonian
dance annotation and analysis of dance features. This system has visual, 3D motion,
and audio tools for processing video features, motion features and sound respectively.
COSMOS-7 [16] is a multimedia annotation and retrieval system based on MPEG-7 MDS. It includes a multimedia model that integrates low-level and high-level video
features by using multimedia frames. It provides a conversion tool that translates M-frames into corresponding MPEG-7 descriptions. We took the idea of representing
dance movements as events performed by actors and agents from this work. A
number of MPEG-7 authoring systems have been developed to describe the segments
[18], news videos [19] and motion trajectory in sports videos [20]. These systems are
domain dependent, so we need a specific system for dance videos.
Our earlier system, DanVideo, is an authoring system for annotating and
generating macro (metadata about the dance) and micro (metadata about dance-specific
movements) features of generic dance videos, and for generating MPEG-7
instances semi-automatically from the generated annotations. Users can also submit
queries (direct and approximate) to DanVideo and retrieve results from it.

3 MPEG7 Based Dance Semantics Retrieval


Generally, dance media is multimedia in nature; it is visual (dance steps, posture,
dancers etc), auditory (music, tempo, instruments, rhythm, intonation etc), and textual
(lyrics of songs). DanVideo must provide facilities to annotate these semantics. Dance
video can also be characterized on multiple video granularity levels, i.e., shot, scene,
and clip. Here, a shot denotes a group of dance movements (such as raising the left leg
to hip level), a scene denotes a dance piece (such as actions representing royalty) and a
clip denotes a dance (e.g., a 30-minute dance show).

Fig. 5. DanVideo System Architecture


DanVideo provides an easy metadata authoring environment via a graphical
user interface. The architecture of the DanVideo system is depicted in Figure 5.
DanVideo is a system for semi-automatic semantic authoring using MPEG7 and
retrieval of media documents based on tree embedding applied to the domain of
dance. The system modules are Video Annotator, MPEG7 Instance Generator,
Parser, Visualizer, MPEG-7 Query Generator, MPEG-7 Tree Generator, and Query
Processor. The Video Annotator has two parts, namely, macro annotator and micro
annotator, to annotate the features of the dance steps and the accompanying song. The
Video Annotator takes the raw video as input and stores the generated text annotations
in its feature base. The feature base consists of a collection of hash tables and vectors
for indexing. The MPEG-7 Instance Generator receives video annotations from the
Video Annotator and generates appropriate MPEG-7 DS elements as XML tags.
The Parser checks the validity of the metadata by contacting the NIST validation
service. The MPEG-7 metadata validated by the Parser can be viewed on the
Visualizer and stored in the Metabase for future querying and reuse by
choreographers and students. The MPEG-7 Query Generator takes the query from the
user and translates it into MPEG-7 query form. Then, the MPEG-7 Tree Generator
produces the equivalent MPEG-7 query tree for the MPEG-7 query. Finally, the
Query Processor retrieves the result for the query tree, if one exists, by accessing the
Metabase. The DanVideo implementation, written in Java using JMF, is shown in Figure 6.
Further details of the DanVideo system can be found in [21].

Fig. 6. Screenshot of micro annotation. The video panel renders the dance of the song of the
movie, Hai Mera Dil. The dance expert annotates events, actors, agents, and concepts by using
free text and/or controlled vocabulary. The textual annotations are interactively updated in the
tree view.
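To make the role of the MPEG-7 Instance Generator concrete, the following sketch
serializes one micro annotation (an event with its actor and agent) into MPEG-7-style
XML; the element names below are a simplified illustration, not the exact MPEG-7 DS
schema used by DanVideo.

```python
# Simplified sketch of turning a textual micro annotation into MPEG-7-style XML.
# Element and attribute names are illustrative, not the full MPEG-7 MDS schema.
import xml.etree.ElementTree as ET

def annotation_to_xml(event, actor, agent, start, end):
    root = ET.Element("Mpeg7")
    seg = ET.SubElement(root, "VideoSegment")
    ET.SubElement(seg, "MediaTime", start=str(start), end=str(end))
    ev = ET.SubElement(seg, "Event", name=event)   # e.g. "raise left leg"
    ET.SubElement(ev, "Actor", name=actor)          # the dancer
    ET.SubElement(ev, "Agent", name=agent)          # the body part involved
    return ET.tostring(root, encoding="unicode")

print(annotation_to_xml("raise left leg to hip level", "dancer1", "left leg", 12.0, 14.5))
```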

4 Need for Automatic Multimodal Annotation


Dance expresses emotion and sometimes makes an audience feel specific emotions
[22-24]. But how is it possible for intentional movements to express emotions such as


joy or sadness? What is the relationship between the cognitive dimension of dance
and how it stimulates the emotions? Do the emotions aroused by the dance tell
us something? And about what? Space? Time? Relationships between dancers? [25]
Because we dance with the body, it is sometimes said that the emotions in the
dance are unsophisticated and mostly physical. But would we say that rabbits are dancing? And
would one say that their movements express love, or a quest for eternal salvation? For
many philosophers, the question of emotions is primarily related to the traditional
problem of the mind-body connection, and is now called the "philosophy of mind" [26-28].
We certainly need to apply the concepts in this part of philosophy and
metaphysics to dance. But conversely, the dance can teach us something about
voluntary movements, for example. Wittgenstein says: "The human body is the best
picture of the human soul." The serious philosophical study of dance might confirm or
refute this observation [29].
The dance video systems should be able to manage the high level semantics of
dance videos and their semantic annotations to make available a range of applications
such as search and filtering. Moreover, these systems should provide authoring
environments enabling dancers and choreographers to make effortless annotations and
automatic authoring capabilities. In addition, systems that provide retrievals for
semantic dance queries for different dance semantics are needed. To be more precise,
dance learners and viewers would like to search within the system for understanding
dance movements that

exhibit mood, feeling, emotion and affect expressed by the dancers
incorporate history of the dance
highlight location where it was recorded
exhibit culture of the country of its origin
allow the accompanying song
express semantic concepts like king, friend, hero, heroine, and leader
facilitate search based on costume of the dancers, spatio-temporal relationships between dancers, and so on

5 Architecture for Multimodal Dance Information System


The semi-automatic annotation process can be powered through the combined and
synergic use of multiple input modalities. In this section, an architecture is proposed
to capture different information to facilitate semi-automatic annotation, access and
presentation of multimodal documents related to dance. In particular, the
architecture is focused on:

annotating and indexing multimodal documents (audio, video, etc.)
automatically adding to the document all information characterizing the class of dances the dance belongs to;
access to multimodal documents (video, images, texts, ...), and information on possible performers and socio-cultural characteristics of the dance concerned.

When multimodal contents are involved, it can be opportune to use a multimodal
interaction too, as multimodal interaction can be natural and efficient. Basic concepts


on multimodal interaction were given by [30], its naturalness and ease of use when
conceived as a multimodal language are discussed in [31], and the need to manage
issues such as ambiguity connected with naturalness is addressed in [32]
and [33]. Multimodal interaction is potentially very similar to that between people,
but people may adopt different behaviors in the communication process according
to different contexts. For this reason, the user profile and context need to be modeled
to make the interaction process actually natural; we use the term context to refer to the
socio-cultural information and knowledge, the environmental features in which the dance is
performed, and the characteristics of the devices used for the multimodal input. The user
profile and context provide some features for improving the indexing and access
processes. This architecture (Fig. 7) makes it possible to directly annotate modal and multimodal
data after they are recognized and interpreted.
The architecture prefigures different contexts of use to capture information, such as
sensors, streaming video and audio, sketches, voice and so on. In particular, wireless
wearable sensors can be used for the 3-D motion capture and real-time analysis of a
dancer. They make it possible to acquire features such as the synchronisation between
different body parts (e.g., leg and foot movements). Cameras opportunely located on the
scene provide redundant information on the body of the dancer and on how it is
related to other dancers and to the scene.
The body motion of a dancer is accompanied by music and rhythms, which
constitute the features of a modal audio input (containing complementary information
from environment and temporal relations of the movements of the human body).
For the purpose of describing the proposed architecture of the multimodal system,
consider the scenario where the dancer's performance characteristics are captured by a
set of sensors distributed in the environment in which the performance takes place
and/or sensors worn by the dancer. In particular, we consider the audio signal sent as
input to a music recognizer, and the gesture signal sent to the gesture recognizer using
motion sensors applied to the dancer's body. Sensors placed on the scene can
provide information such as information about the music in input and about the position
of the dancer on the scene.
The modal recognition modules carry out the recognition process by comparing the
inputs with the contents of libraries that, e.g. for gesture, contain the set of coded
human motions or their features.

Fig. 7. The architectural schema for multimodal annotation


Auditory input involves music, time, instruments, rhythm and intonation features.
The information that can be obtained by the gesture recognizer is related to body
motion segmented according to different granularity levels, in a manner similar to the
video granularity levels (i.e., shot, scene, and clip) introduced in a previous section.
At the beginning, only sequences of simple body motions are recognised for
gesture, without connections to the semantics related to the style of dance of a specific
dancer, to her/his particular climate, or to the semantics of the social and cultural message
contained in the dance. The information obtained is used by an annotator (which we
refer to as the first-level annotator) in the annotation process of the multimodal input.
Music and gesture inputs, if combined, can provide more accurate information; e.g.,
by combining temporal features for music and gesture, information on their
synchronization can also be obtained. For this reason the architecture contains a fusion
module. When the fusion process is completed, the interpreter compares the
sequences of gesture, music, and their combination with the sequences contained in
appropriate libraries, using the knowledge on the context and the knowledge on the dancer
(user). In this manner, information is obtained on:

mood, feeling, emotion and affect expressed by the dancers,
culture of the country of the dance origin,
accompanying song,
semantic concepts connected with the dance,
identity of the dancer, etc.

This information is used to define the second-level annotation (see Figure 7). The
automatic annotation using information obtained from multimodal inputs provides a
systematic annotation method that is improved by the context and the user knowledge.
All this information is used in the indexing process, facilitating information access.
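A very small sketch of the fusion step described above: gesture segments and music
segments, each carrying time stamps, are matched by temporal overlap so that
synchronization information can be attached to the combined annotation (the segment
layout and labels are our assumptions for illustration, not the system's actual data
format).

```python
# Toy fusion of recognized gesture and music segments by temporal overlap,
# as a stand-in for the fusion module of Fig. 7. Segment format assumed:
# (label, start_seconds, end_seconds).
def fuse(gesture_segments, music_segments):
    fused = []
    for g_label, g_start, g_end in gesture_segments:
        for m_label, m_start, m_end in music_segments:
            overlap = min(g_end, m_end) - max(g_start, m_start)
            if overlap > 0:
                fused.append({
                    "gesture": g_label,
                    "music": m_label,
                    "overlap_seconds": overlap,   # crude synchronization cue
                })
    return fused

gestures = [("arm raise", 0.0, 2.0), ("turn", 2.0, 4.0)]
music = [("slow rhythm", 0.0, 3.0), ("fast rhythm", 3.0, 6.0)]
print(fuse(gestures, music))
```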

6 Conclusion
The artistic and cultural expression of individuals contains elements of a shared
collective language. Dance is no exception. It is a language in which the dancer
communicates using his or her own body, and hence the multimodal nature of human
communication. Like all languages, dance has evolved and summarizes
characteristics of the reference culture and the characteristics with which each dancer
interprets it (e.g., emotional ones). For this reason, this work presents the architecture of an
annotation system capturing information directly through the use of sensors,
comparing and interpreting them using a context and a user's model in order to
annotate, index and access multimodal documents.

References
1. Hutchinson Guest, A.: Dance Notation: The Process of Recording Movement. Dance Books, London (1984)
2. Chitra, D., Manthe, A., Nack, F., Rutledge, L., Sikora, T., Zettl, H.: Media Semantics: Who needs it and why? In: Proceedings of ACM Multimedia, pp. 580-583 (2002)
3. Herbison-Evans, D.: Dance, Video, Notation and Computers. Leonardo 21(1), 45-50 (1988)
4. George, P.: Computers and Dance: A bibliography. Leonardo 23(1), 87-90 (1990)
5. Calvert, T.W., Chapman, J.: Notation of movement with computer assistance. In: Proceedings of ACM Annual Conference, pp. 731-736 (1978)
6. Hatol, J., Kumar, V.: Semantic representation and interaction of dance objects. In: Proceedings of LORNET Conference, Poster (2005)
7. Hachimura, K.: Digital archiving of dancing. Review of the National Center for Digitization 8, 51-66 (2006)
8. Hattori, M., Takamori, T.: The description of human movement in computer based on movement score. In: Proceedings of 41st SICE, pp. 2370-2371 (2002)
9. Calaban (2002), http://www.bham.ac.uk/calaban/frame.htm
10. Bimas, U., Simon, W., Peter, R.: NUNTIUS: A computer system for the interactive composition and analysis of music and dance. Leonardo 25(1), 59-68 (1992)
11. LED & LINTER: An X-Windows Editor/Interpreter for Labanotation (2006), http://wwwstaff.it.uts.edu.au/don/pubs/led.html
12. MacBenesh: Benesh notation editor for Apple Macintosh (2004), http://members.rogers.com/dancewrite/macbenesh/macbenesh.htm
13. Ilene, F.: Documentation Technology for the 21st Century. In: Proceedings of World Dance Academic Conference, pp. 137-142 (2000)
14. Kalajdziski, S., Davcev, D.: Augmented reality system interface for dance analysis and presentation based on MPEG-7. In: Proceedings of IASTED Conference on Visualization, Imaging, and Image Processing, pp. 725-730 (2004)
15. Forouzan, G., Pegge, V., Park, Y.C.: A multimedia information repository for cross cultural dance studies. Multimedia Tools and Applications 24, 89-103 (2004)
16. Athanasios, C., Gkoritsas, Marios, C.A.: COSMOS-7: A video content modeling framework for MPEG-7. In: Proceedings of IEEE Multimedia Modeling, pp. 123-130 (2005)
17. IBM VideoAnnEx (2002), http://www.alphaworks.ibm.com/tech/videoannex
18. Tran-Thuong, T., Roisin, C.: Multimedia modeling using MPEG-7 for authoring multimedia integration. In: Proceedings of ACM Multimedia Information Retrieval, pp. 171-178 (2003)
19. Ryu, J., Sohn, J., Kim, M.: MPEG-7 metadata authoring tool. In: Proceedings of ACM Multimedia, pp. 267-270 (2002)
20. Haoran, Y.I., Rajan, D., Liang-Tien, C.: Automatic generation of MPEG-7 compliant XML document for motion trajectory description in sports video. Multimedia Tools and Applications 26(2), 191-206 (2005)
21. Rajkumar, K., Andres, F., Guetl, C.: DanVideo: An MPEG-7 Authoring and Retrieval System for Dance Videos. Multimedia Tools and Applications 46(2), 545-572 (2009)
22. Devillers, L., Vidrascu, L., Lamel, L.: Challenges in real-life emotion annotation and machine learning based detection. Neural Networks 18, 407-422 (2005)
23. Popescu-Belis, A.: Managing Multimodal Data, Metadata and Annotations: Challenges and Solutions. In: Thiran, J.-P., Marques, F., Bourlard, H. (eds.) Multimodal Signal Processing for Human-Computer Interaction, pp. 183-203. Elsevier/Academic Press (2009)
24. Callejas, Z., López-Cózar, R.: Influence of contextual information in emotion annotation for spoken dialogue systems. Speech Communication (2008), doi:10.1016/j.specom.2008.01.001
25. Yu, C., Zhou, J., Riekki, J.: Expression and Analysis of Emotions: Survey and Experiment. In: Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, UIC-ATC, pp. 428-433 (2009)
26. Harada, I., Tadenuma, M., Nakai, T., Suzuki, R., Hikawa, N., Makino, M., Inoue, M.: An Interactive and Concerted Dance System: Emotion Extraction and Support for Emotional Concert. In: Fifth International Conference on Information Visualisation (IV 2001), vol. iv, p. 0303 (2001)
27. Glowinski, D., Camurri, A., Volpe, G., Dael, N., Scherer, K.: Technique for automatic emotion recognition by body gesture analysis. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPRW, pp. 1-6 (2008)
28. Grassi, M.: Developing HEO human emotions ontology. In: Fierrez, J., Ortega-Garcia, J., Esposito, A., Drygajlo, A., Faundez-Zanuy, M. (eds.) BioID MultiComm 2009. LNCS, vol. 5707, pp. 244-251. Springer, Heidelberg (2009)
29. Sorci, M., Antonini, G., Cruz, J., Robin, T., Bierlaire, M., Thiran, J.: Modelling human perception of static facial expressions. Image and Vision Computing 28(5), 790-806 (2010), doi:10.1016/j.imavis.2009.10.003
30. Oviatt, S., Cohen, P.: Perceptual user interfaces: multimodal interfaces that process what comes naturally. Communications of the ACM 43, 45-53 (2000)
31. D'Ulizia, A., Ferri, F., Grifoni, P.: Generating Multimodal Grammars for Multimodal Dialogue Processing. IEEE Transactions on Systems, Man, and Cybernetics, Part A 40(6), 1130-1145 (2010)
32. Mankoff, J., Abowd, G.D., Hudson, S.E.: OOPS: a toolkit supporting mediation techniques for resolving ambiguity in recognition-based interfaces. Computers & Graphics 24(6), 819-834 (2000)
33. Caschera, M.C., Ferri, F., Grifoni, P.: Ambiguity detection in multimodal systems. In: Proc. AVI 2008, pp. 331-334 (2008)

A New Indian Model for Human Intelligence


Jai Prakash Singh
Instruments R&D Establishment (IRDE - DRDO), Dehradun, India
jpsingh@irde.drdo.in, jpsingh1972@yahoo.co.in

Abstract. This paper first reviews the existing models for human intelligence.
Then it discusses the nineteen types of human thought processes, Kriya-Pratikriya,
Indriya, Aatmsaat, Smaran, Samajh, Soch, Vichar, Vimarsh,
Kalpana, Swapna, Anubhava, Anubhooti (or Aatm-Prerana), Tark, Bhav,
Dhyan, Gyan, Vivek, Siddhi and Darshan, which are commonly present in
Indian books regarding human thinking. These processes, when joined together,
form a new Indian model for human intelligence. Though many aspects of
human thinking have now been understood, the core reason behind superior
capabilities of human brain as compared to present Artificial Intelligence or
neural network based machines or other living beings has not come out clearly.
This paper relates the biological research with human thinking modes to explain
some aspects of this superior intelligence.
Keywords: Human Intelligence, Artificial Intelligence, Neural Networks,
Indian concepts.

1 Introduction
Superiority of human intelligence, both over present machine intelligence and over other
living beings, puts forward many questions that are yet unanswered. Though the monkey shares
a similar genetic and physical structure, including the brain, it is far behind humans
in evolution. One difference found in the literature is that while most monkey brain
neurons are hardwired at the time of birth itself, the human brain has most of its neurons
free. So, a monkey child can run, jump and balance on trees quite
early, while humans take much more time and practice to achieve the same. But in
return, humans can think and create many more things. Among humans as well, different
people have different qualities. An athlete or gymnast can run faster or do gymnastics
better than others, but many people may think faster than him. So it all depends on the
direction in which the neurons make their connections.

2 Methodology
A survey of concepts about human thinking in Indian literature revealed different types
of human thought processes. These different types, put together, resulted in a new model
of human intelligence which explains some aspects of the superiority of human
intelligence over other living beings and present-day machine intelligence. Only the
concepts contributing to superior intelligence were selected. So, Bhaya (Fear) and

Abhivyakti (Expression) were left out, as these functions suppress the thought or bring it
out rather than contributing to intelligence, though they may add to intelligence when
joined with others. Further, the biological basis of these different processes was also
studied.

3 Existing Models for Human Intelligence


Apriority and Adaptivity are the two A's around which most of the discussions
regarding human intelligence have been happening. The following models are relevant
here:
1. Plato's Concepts Model: Around 2300 years ago, Plato said that the ability to
think is founded in the a priori knowledge of concepts. Concepts are abstract ideas
(Eide) known to us a priori (from god/nature).
2. Aristotle's Adaptivity Model: Plato's Concepts model was criticized by his own
pupil Aristotle on the grounds that the a priori concepts model does not take into
account an important aspect of intelligence: the ability to learn or adapt to a
changing world.
3. Grand Philosophical System of Realism of Ideas: Throughout the ancient and
middle ages, the concepts of Plato and Aristotle were unified into a grand philosophical
system based on the realism of ideas: the ways in which intelligence combines
apriority and adaptivity.
4. Occam's Nominalism: In the fourteenth century, Occam rejected the realism of
Plato and Aristotle. He propagated the view of Nominalism, claiming that ideas are
just names (nomina) for classes or collections of similar empirical facts. Nominalism
emphasized the ability of the mind to learn from experience. Only particular
experiences have real existence, and general concepts are just names for similar types
of experiences, devoid of any real existence. Thus, Occam developed empiricism, a
rational understanding of intelligence away from spiritualism, and provided a
scientific method.
5. McCulloch and Co-workers' Neural Structures Model: In 1940, McCulloch
stated that Plato's a priori ideas were encoded in the complicated neural structures of
the brain. In 1943, McCulloch and Pitts presented a mathematical model of the neural
organization of the brain, taking a few important properties of biological neurons into
account.
6. Hebb's Adaptation Model: In 1949, Hebb supplemented the McCulloch and Pitts
model with an adaptation mechanism. The first artificial neural networks were created
on this basis. In 1951, Minsky and Edmonds built the first neural network using formal
neurons, which modeled the food-searching maze behavior of rats.
7. Widrow's Adaline: In the 1950s, many researchers and groups developed neural
networks based on formal neurons. In 1959, the Wiener filter's capability of fast learning
in linear signal filtering problems was utilized by Widrow in developing Adaline.


8. Rosenblatt's Perceptrons: In 1958, Rosenblatt created perceptrons capable of
learning linear classification rules, or concepts, from empirical training data. Thus it
appeared that a large number of adaptive neurons connected into a network would be
able to learn complex cognitive and behavioral concepts on their own, so a priori
knowledge was not needed.
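The perceptron rule referred to here is small enough to state directly; the following
minimal sketch (using numpy, with a made-up two-feature data set) learns a linear
classification rule from training data.

```python
# Minimal perceptron learning rule (Rosenblatt, 1958): adjust the weights only
# when a training example is misclassified. The data set below is illustrative.
import numpy as np

X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # inputs
y = np.array([-1, -1, -1, 1])                                    # AND-like labels
w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(20):
    for xi, target in zip(X, y):
        prediction = 1 if (np.dot(w, xi) + b) >= 0 else -1
        if prediction != target:            # learn only from mistakes
            w += lr * target * xi
            b += lr * target

print("weights:", w, "bias:", b)
```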
9. Plato-Minsky's Rule-Based AI or Expert Systems: In the 1960s, Minsky suggested a
concept of artificial intelligence similar to Plato's principle of the apriority of ideas. In
Minsky's system, a system of logical rules is put into computer memory a priori.
This Plato-Minsky method became the foundation of AI, widely used from factory
floors to space shuttles.
10. Chomsky's Linguistics Model: Chomsky proposed to build a self-learning
system that could learn a language similarly to humans, using a symbolic mathematics
of rule systems. Chomsky brought in the genetic concept that certain knowledge is
encoded a priori in the brain through genes.
11. Fuzzy Logic Model: All the previous models, when applied to certain physical
situations, could not deal with combinatorial explosion. Zadeh's Fuzzy Logic could
deal with this problem by accounting for the inherent approximate nature of concepts
and thoughts.
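As a small illustration of the approximate concepts Zadeh's fuzzy logic works with, the
sketch below defines a graded membership function and combines two fuzzy truth values
with the usual min/max operators (the "tall" example and its thresholds are our own,
chosen only for illustration).

```python
# Tiny fuzzy-logic illustration: graded membership instead of crisp true/false,
# combined with the standard min (AND) and max (OR) operators.
def tall(height_cm):
    # Membership rises linearly between 160 cm and 190 cm (illustrative shape).
    return max(0.0, min(1.0, (height_cm - 160.0) / 30.0))

def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

is_tall = tall(175)            # 0.5: partly a member of the concept "tall"
is_fast_thinker = 0.8          # assumed membership value for another concept
print(fuzzy_and(is_tall, is_fast_thinker), fuzzy_or(is_tall, is_fast_thinker))
```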
12. Grossberg's Perception-Cognition Model: In the 1980s, Grossberg presented a
model based on efferent and afferent signals. Signals coming from within the mind
interact with signals coming from the outside world and create a complete
representation. This was a fundamental departure both from early neural networks, which
emphasized learning from data (signals from outside), and from rule-based AI,
which stressed signals coming from within the mind.
13. Modeling Field Theory Model: This model adds adaptive fuzzy logic to
Grossberg's Perception-Cognition model and represents the present status of human
intelligence models. It was developed in the 2000s.

4 The Different Types of Human Thought Processes Described in Indian Texts and Philosophy
Indian philosophy describes human thought processes in many different types out of
which the following nineteen types are common:
1. Kriya-Pratikriya: This means action, mainly monitoring actions and reflex actions.
Reflex actions have been implemented in most neural network or fuzzy
models. This is the lowest level of thought process described in philosophy.
Monitoring actions are performed by the brain unconsciously. Most primitive
living organisms are limited to this functionality.
2. Indriya: This means the senses, which are commonly five: eyes for vision, ears for
sound, nose for smell, skin for touch and tongue for taste. These are the interfaces
through which the human brain senses the world. The brain has separate areas
allocated to process the signals received from these five sensors. Present machine


intelligence is mainly involved in implementing Indriya in machines and has been
successful with some variations.
3. Aatmsaat: This is internalizing or memorizing. In this process, the facts and data
sensed through Indriyas (reading/ seeing through eyes, listening through ears, sensing
through touch, smelling through nose or tasting through tongue) are brought inside the
long term memory from short term memory. The machine intelligence equivalent for
this process is RAM and Hard Disk concept. Data is manipulated in RAM (a Short
Term Memory) in real time and then stored on Hard disk (a Long Term Memory).
However, a better analogy is taking CCD image first on charge couples (Short Term
Memory) and then on memory (Long Term Memory).
4. Smaran: This stands for retrieval from memory. Similar data is fed to the memory
and the full data comes out, due to the content-based retrieval scheme of human
memory. Our present-day computers implement address-based memories, but
neural-network-based content-addressable memory retrieval is also emerging.
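One classical neural-network form of such content-based retrieval is a Hopfield-style
associative memory (our choice of illustration, not named in the text): a partial or noisy
pattern is fed in, and the stored pattern closest to it comes back out.

```python
# Hopfield-style associative memory sketch: store +/-1 patterns with a Hebbian
# outer-product rule, then recall a full pattern from a partial or noisy cue.
import numpy as np

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
n = patterns.shape[1]
W = np.zeros((n, n))
for p in patterns:                      # Hebbian learning
    W += np.outer(p, p)
np.fill_diagonal(W, 0)

def recall(cue, steps=5):
    state = cue.copy()
    for _ in range(steps):              # synchronous update for simplicity
        state = np.where(W @ state >= 0, 1, -1)
    return state

noisy = np.array([1, -1, -1, -1, 1, -1])   # pattern 0 with one bit flipped
print(recall(noisy))                        # recovers the stored pattern
```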
5. Samajh: This means understanding. Here data is taken to the brain and correlated
with previous experiences, and meaning is extracted from the data without much
further processing. Can machines understand their surroundings, or understand by reading a
book? This question is difficult to answer at present, but many efforts on
semantic algorithms are able to acquire a similar capability. Understanding looks
basically like classification, but at a quite high and sophisticated level.
6. Soch: This is similar to goal-directed creative thought. Sentences like "I thought of a
scheme" refer to Soch. Most intelligent machines are trying to imitate this
process. New implementations of maze or path-finding processes are able to bring
an elementary Soch capability to machines. Quite often Soch and Samajh are used
together in phrases like "Soch-Samajh kar", meaning well thought out.
7. Vichar: This word has come from vicharana, which means random walk in Hindi.
Hence, Vichar means random thinking without a clear goal. Many people take Vichar
as the actual thought process. Vichar clearly provides a new capability to humans,
which has also been implemented in a primitive way through Monte-Carlo methods in
machine intelligence. Vichar is also akin to spontaneous thought. Soch and
Vichar are also used together as "Soch-Vichar kar", meaning creative thoughts well
adjusted with current understanding.
8. Vimarsh: This is collaborative or group thinking. Humans use this process to
acquire something like ant or swarm intelligence though this process brings out a
more refined thought rather than a new capability which ant/ swarm intelligence
brings. Quite often Vichar and Vimarsh words are used together for a well thought out
consensus among people as a phrase Vichar-Vimarsh.
9. Kalpana: This means imagination. Next milestone to be achieved for machine
intelligence is imagination. This may be projection inside the human brain or
extrapolation and sometimes intrapolation as well. The most important aspect of
Kalpana is creativity. People who are creative are said to have better imagination
power. Though extrapolation or intrapolation can be implemented in machine


intelligence, creativity like that of humans has not yet been possible. Neural nets are playing a
key role in bringing creativity in machine intelligence.
10. Swapna: This stands for night dreams. Swapna means self-generated. Though
night dreams seem to be the thoughts generated by the brain itself without any input,
they also can be a result of reciprocal effect of other sensors on human vision system.
In day time, human eyes are open and are the main sensors which keep making
pictures of the world around. In night, when eyes are closed, the ears take up the job
of main sensor and in collaboration with other sensors (smell, touch and taste), try to
create a picture of the world around. The picture created is sent to human vision
system so that any danger can be sensed in sleep also. This may be one aspect of
dreams. Other aspect more prevalent in psychology is the relaxation or normalization
of strong nodes formed in the brain so that the brain may function properly. Dreams
might also be acting as a calibration process where human vision system might be
calibrating itself against other sensors. Though it is quite unlikely that machine
intelligence will use night dreams, it will surely need calibration and also
relaxation or reboot to keep functioning properly.
11. Anubhav: This means experience or feel. Any thought is said to be ready once it
is validated through actual experience like Anubhav. The human thought process
provides good weightage to this process and adjusts weights as per results of
Anubhav. In machines, this is similar to online or off-line learning.
12. Anubhooti or Aatm-Prerana: This stands for self-intuition. It may be related to
the internal layers of the neural network within the human brain providing entirely new
outputs, or it may also be a result of genetic evolution, as it is found that people in a
certain family have self-intuition in certain directions. This aspect is too difficult to
incorporate in machine intelligence as of today, but genetic evolution is a factor
being considered seriously in the form of genetic algorithms. Even output from the internal
layers of neural nets can be used, but its format is yet to be evolved.
13. Tark: This means logic. Though the human brain uses this in daily life, Tark is
also used to refine established theories and thoughts. The evolution of mathematics and
science has been through this Tark process. A process called Shastrarth was
adopted by Indians to discuss and establish theories, something like conferences and
workshops. Tark has mostly been implemented in machines through AI rules.
14. Bhav: This has got two parallel meanings. Bhav means the essence or abstract.
But a word Bhavuk means emotional. Another word Bhavana means the intention.
This aspect has not yet been implemented in machine intelligence. Emotions are
related with the flow of certain chemicals (enzymes) in the human brain which
enhance its functions in some directions and suppress in other directions.
15. Dhyan: The literal meaning of this word is concentration, though this word is
also commonly used for meditation. In the concentration meaning, the brain switches
off most of the other thoughts and brings only a particular thought to mind to provide
it complete focus. In meditation mode, the brain switches off all the thoughts and makes
our neurons relax. It is like a conscious sleep. In the first meaning, machines are
already able to switch off other programs to provide full focus to one task.
But it is difficult to say whether machines will also do meditation or whether they will need it.
However, the shut-down and restart function can be a way to do it, but that is more similar
to sleep and awakening.
16. Gyan: This means knowledge. After Dhyan, humans achieve knowledge.
Incorporating knowledge in machine intelligence is a theme present day researchers
are working on.
17. Vivek: This means morality (the ability to distinguish between right and wrong).
While Gyan can be useful or harmful as well (depending on the use), Vivek helps
humans to use knowledge for the benefit of humans and nature only. Whether improved
machine intelligence will be able to acquire Vivek along with its thinking power is
a big question. Till Vivek is incorporated in machine intelligence, it is advised that
improved human-like machine intelligence or super-intelligence shall not be used in
robotics or in machines which are physically more powerful than humans. Till then,
machine intelligence shall only advise humans, who can accept or reject the advice. A
wavelet-transform type of thinking process, where machines can analyze the minute
details and also see the big picture and take decisions on the basis of a combination
of both, may help.
18. Siddhi: After a lot of Vichar, Tark, Bhav, and Dhyan, humans achieve Siddhi
which means an eternal established fact or skill. If machines can find Siddhi, that will
be a milestone in machine intelligence and will help humans a lot.
19. Darshan: This is the highest level of human thought (though Moksha is higher,
it is not well defined and is more related to the soul than to the brain). Darshan
literally means philosophy. Machine intelligence will need many more evolutions to
reach this level.

5 The Human Brain vs Machine Intelligence


The following capabilities of the human brain seem to make it superior to machines:
1. Massive Parallelism: Billions of neurons with so many synapses create a huge
infrastructure for the human brain to operate. This feature has been well described in
present literature of neural research.
2. Contextual Thinking: This is the most striking capability of the human brain. It
can segregate the facts so well that things fall in place quite nicely. Machines are
finding great difficulties in getting an equivalent ability through semantic algorithms.


Huge number of neurons possess a high level of classification ability and may be a
reason behind contextual thinking.
3. Imagination & Creative Thinking: Human brain can imagine new things and can
create a new world inside it. Machine Intelligence is yet to fully understand and
incorporate this feature.
4. Spontaneous Thinking: Human brain does not stop till it dies. It keeps on tackling
problems in conscious or unconscious modes in goal directed or goal less manner.
This aspect is now coming up in literature.
5. Evolution through generations: Not only does the human brain keep thinking throughout
its life, it also keeps passing its evolved capabilities to the next generations through
genes. This natural process seems to be the key towards vast human capabilities.
6. Collaborative Thinking: Vast human knowledge has emerged through
collaborative thinking. Generations of humans have accumulated knowledge and
passed on to the next generations through documents, books etc.
7. Quest for eternal truth: Humans have got a curiosity and motivation to find out
the eternal truth. This quest has made them evolve newer and newer facts and
principles.
On the other hand, machine intelligence has also got its own strengths over human
brain:
1. Massive Data Crunching ability: Machines have surpassed humans in data
crunching. This is the main capability where machines are able to help humans at
present.
2. Vast Memory: Machines can have huge memories in future and thus can store any
amount of information. This ability may surpass humans. Even today, machines do
store so many images & data that humans find them suitable for taking help.
3. Consistency & Stability: Machines can be more consistent and stable. Emotions
bring instability in human performance. AI is more consistent than neural nets and
hence a combination of both can aid.

6 The Biological Basis of These Different Thought Processes


Though Indian texts do not reveal much about how different organs in the human
brain contribute to these different types of human thought processes, modern day
biological research has advanced enough to understand the biological basis of some of
these processes. These can be summarized in the following table:

Table 1. Biological basis

Process in Indian Texts | English Description | Main probable brain organs involved
1. Kriya-Pratikriya | Monitoring & Reflex Actions | Central Nervous System, Hypothalamus
2. Indriya | Senses | LGN, Primary Visual Cortex, Auditory Cortex, Somato-sensory cortex
3. Aatmsaat | Internalizing | Neo & Old Cortex, Hippocampus
4. Smaran | Memory retrieval | Neural Axons
5. Samajh | Understanding | Neural Organization
6. Soch | Creative Thinking |
7. Vichar | Spontaneous Thinking | Limbic System
8. Vimarsh | Collaborative Thinking | Broca's area, Limbic System
9. Kalpana | Imagination | Projection Cells
10. Swapna | Dream | Reticular Formation
11. Anubhav | Experience | Neural Axons & Synapses
12. Anubhooti | Self-Intuition | Internal neural layers and genes
13. Tark | Logic | Hebb's synapses (nearby neurons getting excited)
14. Bhav | Emotions | Empathy or Mirror Neurons, Chemicals
15. Dhyan | Concentration | Ascending Reticular Activating System (ARAS)
16. Gyan | Knowledge | Neural Organization
17. Vivek | Moral Values | Mirror Neurons
18. Siddhi | Established Fact/Skill |
19. Darshan | Eternal Philosophy |
(ARAS)

7 A New Model for Human Intelligence


Though the different human thinking types or modes have been mentioned in
different contexts in Indian literature, collectively they suggest a new model
for human intelligence. The following observations can be made directly:
7.1 Human Intelligence vs Other Species' Intelligence: The first four thinking modes,
i.e. Kriya-Pratikriya, Indriya, Aatmsaat and Smaran, are found in most living
beings or mammals also. In some living beings, Bhav (emotions) is also found. But
the other modes are mostly specific to humans.
7.2 Human vs Human: Even humans differ among themselves on the scale of these
thinking modes. Some people are more emotional than others while some are more
logical. It can be said that these different types of human thinking modes classify
human intelligence. Different humans have their brains functioning predominantly in
different modes, and that describes their capabilities. For example, a person with more
Bhav, Kalpana and Swapna will be poetic or creative, while a person with more Tark
and Soch will be more business-minded. A person with Samajh, Vichar and Vimarsh will be a
good administrator. Most inventors and mathematical prodigies have Aatm-Prerana, which
other humans lack. This model suggests a new direction of research
towards superior machine intelligence to be developed on the lines of these thinking
modes. If a machine or living being can acquire all these modes, it may behave like
humans or even surpass them. Lack of any or more of these modes in a human being, due
to one reason or another, may result in psychological problems.

8 Conclusions and Future Directions


This paper has brought out a new model for human intelligence based on Indian
concepts of human thinking. This model predicts the capabilities of creativity,
generalization, logical thinking and intuition to be the keys behind superior human
intelligence. This model needs to be further studied and explored on its biological
and mathematical bases. Further comparisons with neural network, Fuzzy Logic,
Evidential reasoning based systems can bring more clarity.
Acknowledgments. The author acknowledges the directors of IRDE, Dehradun and GFAST, Delhi for their kind support towards the studies performed to write this paper.

References
1. Antonov, A.A.: Human-computer super-intelligence. American Journal of Scientific and Industrial Research 1(2), 96-104 (2010), http://www.scihub.org/AJSIR, ISSN: 2153-649X, doi:10.5251/ajsir.2010.1.2.96.104
2. Arbib, M.A. (ed.): The Handbook of Brain Theory and Neural Networks. The MIT Press, USA (2006)
3. Polk, T.A., Seifert, C.M. (eds.): Cognitive Modeling. The MIT Press, USA (2002)
4. Fogel, D.B.: Evolutionary Computation. Prentice Hall of India Pvt. Ltd., Englewood Cliffs (2004)
5. Haykin, S.: Neural Networks: A Comprehensive Foundation. Pearson Education, London (1999)
6. Many texts in Indian literature
7. Penrose, R.: Shadows of the Mind: A Search for the Missing Science of Consciousness. Vintage Books, London (2005)
8. McGaugh, J.L., Weinberger, N.M., Lynch, G. (eds.): Brain and Memory: Modulation and Mediation of Neuroplasticity. Oxford University Press, Oxford (1995)
9. Gleick, J.: Chaos: The Amazing Science of the Unpredictable. Vintage Books (1998)
10. Zurada, J.M.: Introduction to Artificial Neural Systems. West Publishing Company (1999)
11. Darwin, C.: The Origin of Species. Goyal Saab Publishers, Delhi
12. Norden, J.: Understanding the Brain. TTC Video Course
13. Perlovsky, L.I.: Neural Networks and Intellect: Using Model-Based Concepts. Oxford University Press, New York (2001)

Stepping Up Internet Banking Security Using Dynamic Pattern Based Image Steganography

P. Thiyagarajan, G. Aghila, and V. Prasanna Venkatesan
CDBR-SSE Lab, Department of Computer Science,
Pondicherry University, Puducherry 605 014
thiyagu.phd@gmail.com, aghilaa@gmail.com, prasanna_v@yahoo.com

Abstract. In the world of e-commerce, internet banking is one of the indispensable applications. Security issues must be addressed critically in internet banking applications, as they directly influence user confidence. Even though existing mechanisms ensure security, hackers succeed in breaking them. To step up internet banking security, a new layer called the stego layer is introduced, which in turn uses the Dynamic Pattern Based Image Steganography (DPIS) algorithm. The stego layer is present in both client and server for embedding and extracting the message. The proposed method is compared with other popular encryption algorithms in practice.

Keywords: Image Steganography, Session Hijacking, AES, Internet Banking Security, Dynamic Key Management, Pixel Intensity.

1 Introduction
"Information is wealth" is a well-known statement, and it holds inherently in all aspects of business. With information serving a critical role in an organisation, preserving it becomes the most challenging activity. This paper presents a method to enhance the security of data transmitted between client and server. It deals with the step-wise transition of data and suggests a mechanism to conceal the information from intruders. Security is guaranteed by the inclusion of a stego layer on the client and server side. The functionality of the layer is that the information to be exchanged between the banking parties is hidden in an image before being transmitted.

2 Internet Banking
Internet banking, otherwise called anywhere-anytime banking, has become an indispensable tool in the modern banking arena. With its help, one can access any information regarding one's account and transactions at any time of day, and one can regularly monitor the account as well as keep track of financial transactions, which can be of immense help in detecting fraudulent transactions. On the internet, money transactions between accounts take place in fractions of a second. The main issue of internet banking identified in the survey conducted by the online banking association in the year 2002 [9] is security. Security is a crucial requirement of an e-commerce system [6] because the sensitive financial information that these systems transmit travels over un-trusted networks, where it is essentially fair game for anyone with local or even remote access to fetch the confidential data at any point of the path followed.

3 Threats in Internet Banking

The internet has become part of everyday life in one way or another, and individuals can hardly do without it. With almost all processes automated, processing time has become almost negligible, which directly contributes to the efficiency of the system as a whole. Dimitriadis [5], in analysing the security of internet banking, classified the attacks broadly into four categories:

Phishing
Injection of commands
User credentials guessing
Use of a known authenticated session by the attacker

The following section explains sample attacks which are relevant to our work and fall into the above categories.
A. Man-in-the-middle attack
The man-in-the-middle attack falls under the injection-of-commands category [5]. In this attack, attackers intrude into an existing connection to intercept the exchanged data and inject false information. It involves eavesdropping on a connection, intruding into a connection, intercepting messages, and selectively modifying data. Such attacks are usually mounted by hackers against public-key cryptosystems. Quite often in such cases, the victim parties are made to believe that they remain safe in communicating with each other.
B. Session hijacking
Session hijacking falls under the use-of-a-known-authenticated-session category [5]. It is the act of taking control of a user session after successfully obtaining or generating an authentication session ID. In session hijacking, the attacker seizes control of a legitimate user's web application session using brute-forced or reverse-engineered session IDs while that session is in progress.
C. Man-in-the-browser attack
The man-in-the-browser attack falls under the user-credentials-compromise category [5]. This attack takes place only in computer memory, before Secure Socket Layer (SSL) encoding. When a user's PC is infected, the malicious code is triggered as the user visits an online bank website. The attack retrieves authentication information, such as logins and passwords, entered on a legitimate bank site. The retrieved personal data is sent directly to an FTP site where it is stored.

4 Existing Techniques for Handling Attacks in Internet Banking

Financial institutions offering internet banking should have reliable and secure methods for transactions. In this section some of the security mechanisms from the literature are discussed.
Plössl et al. [10] address authentication and phishing issues by proposing a visual cryptography mechanism. According to their proposal, the client is supplied with a challenge-response list. When the client carries out a transaction, he has to scan the list to find the response that corresponds to the challenge provided by the bank.
Hiltgen et al. [8] target the man-in-the-middle attack using short-time password solutions based on a password-generating hardware token, available from manufacturers such as RSA Security, ActivCard or VeriSign. RSA's SecurID solution is the most prominent example. It consists of a small device including an LCD display and one button the user can press to initiate the calculation of the next short-time password.
Geeta et al. [7] enhance the security level of mobile banking using steganography. In this method, pixels are chosen according to the generated key and secret message bits are embedded at a constant rate in the chosen pixels. The method does not ensure that significant colour channels are spared from data embedding, and bits are embedded sequentially in all selected pixels, which may pave the way for a steganalyst to easily crack the method.
AES is the most commonly used encryption algorithm [11] for high-end security applications. Recently it has been reported by cryptographers that AES is breakable [1]. This survey clearly points out the need for a technique which ensures secure transactions in internet banking. In this work the stego-layer method has been evaluated and compared with the Advanced Encryption Standard (AES) against important security parameters. The proposed method also provides a solution for the man-in-the-middle attack and session hijacking.

5 Proposed Method Using the Dynamic Pattern Based Image Steganography Algorithm

In the proposed method a new layer called the stego layer is introduced on both the client and server side. Any critical data passing between the client and the server passes through the stego layer. The stego layer uses the Dynamic Pattern Based Image Steganography (DPIS) algorithm for embedding and extracting the message, which is explained in this section.
Steganography is the art of hiding secret information in media such as image, audio and video [4]. The purpose of steganography is to conceal the existence of the secret information in the given medium. It is known that internet banking is based on a client-server architecture. The proposed stego layer is introduced on both the client and server sides for the embedding and extracting processes. Figure 1 shows the architecture of the proposed stego-layer method.
In the stego layer, embedding and extracting are done by the Dynamic Pattern Based Image Steganography (DPIS) algorithm. The idea behind the DPIS technique is that significant colour channels should not suffer from data embedding, while an insignificant colour channel can be used for data embedding. Figure 2 and Figure 3 depict the embedding and extracting processes of the DPIS algorithm.

[Figure: browser -> client stego layer (embedding) -> Internet -> server stego layer (extracting) -> bank server -> bank database]

Fig. 1. Architecture of the proposed stego-layer method

If the chosen indicator is the lowest colour channel, the pixel is exempted from data embedding; otherwise, the lowest-value channel apart from the indicator channel is chosen for data embedding.


Embedding Part
  Generate an indicator sequence of any length
  Get the cover image
  Get the secret message to be embedded
  For 1 to last_row
    For 1 to last_col
      Fix the indicator channel
      If (indicator channel is lowest)
        Skip
      Else
        Find the lowest channel
        Embed the secret message bits
        Mark the bits embedded in the 3rd channel
      End if
    End For
  End For

Extracting Part
  Get the indicator sequence from the embedding part
  Get the stego image
  For 1 to last_row
    For 1 to last_col
      While (entire bits not extracted = true)
        Find the indicator channel
        If (indicator channel is lowest)
          Skip
        Else
          Find the data channel
          Extract the bits embedded
        End if
      End While
    End For
  End For
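To make the flow above concrete, the following is a minimal sketch of the embedding loop in Python. It is an illustration only, not the authors' VB/MATLAB implementation: the cover image is assumed to be an H x W x 3 NumPy array, the indicator sequence is assumed to be a list of channel indices applied cyclically over the pixels, and only one bit is embedded per selected pixel (the run-time choice of the number of bits and the marking of that count in the third channel are omitted here).

import numpy as np

def dpis_embed(cover, indicator_seq, bits):
    # cover: H x W x 3 uint8 array; indicator_seq: channel indices (0=R, 1=G, 2=B); bits: list of 0/1
    stego = cover.copy()
    h, w, _ = stego.shape
    bit_idx = 0
    for row in range(h):
        for col in range(w):
            if bit_idx >= len(bits):
                return stego
            pixel = stego[row, col]
            indicator = indicator_seq[(row * w + col) % len(indicator_seq)]
            lowest = int(np.argmin(pixel))
            if indicator == lowest:
                continue                      # pixel exempted from data embedding
            # lowest-valued channel other than the indicator channel carries the data
            data_ch = min((c for c in range(3) if c != indicator), key=lambda c: pixel[c])
            stego[row, col, data_ch] = (int(pixel[data_ch]) & 0xFE) | bits[bit_idx]
            bit_idx += 1
    return stego

Extraction mirrors this loop: the receiver regenerates the same pixel selection from the shared indicator sequence and reads the bits back from the corresponding data channels.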

Fig. 2. Flow chart for steps followed in the client stego layer

The above steps completely describe the embedding and extracting process.
Criteria for choosing the number of bits to be embedded in the data channel: Experiments were conducted to find how many least significant bits can be changed in a pixel without the colour visibly deviating from the original, and the results were used in the DPIS technique. The DPIS technique has been tested on different image categories such as portraits, flowers, nature, toys etc. Figure 4 and Figure 5 show the cover and stego images generated through the DPIS technique.

Fig. 3. Flow chart for Steps followed in Server Stego Layer


Image size: 323 x 429, number of pixels: 138567

Fig. 4. Cover Image

Pixels used for embedding: 4714, secret message size: 2113

Fig. 5. Stego-Image

The proposed method is designed to ensure the preservation of data between the destinations, which comprise a bank and a customer, or vice versa. This work deals with the step-wise transition of data from the internet to the bank server and suggests a mechanism to conceal the information from intruders. This high degree of security is guaranteed by the inclusion of a stego layer in the network. When the client visits the banking website, he is made to enter his customer id. This customer id is validated and a key (indicator sequence) for the transaction is issued by the bank server to the client. The client has to use this key for any further transaction with the server. The key for a particular customer is changed by the bank server at regular intervals of time to enhance system security. The functionality of the layer is to hide the information sent between the communicating parties in an image before it is transferred.
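One possible shape of this key issuance and rotation on the server side is sketched below; the lifetime, sequence length, data structures and function names are assumptions made for illustration, not details given in the paper.

import secrets
import time

KEY_LIFETIME_SECONDS = 15 * 60   # assumed rotation interval
KEY_LENGTH = 20                  # the paper uses indicator sequences of length >= 20

_keys = {}                       # customer_id -> (indicator_sequence, issued_at)

def issue_indicator_sequence(customer_id, length=KEY_LENGTH):
    # the indicator sequence is modelled here as a list of colour-channel indices (an assumption)
    seq = [secrets.randbelow(3) for _ in range(length)]
    _keys[customer_id] = (seq, time.time())
    return seq

def get_indicator_sequence(customer_id):
    # return the current key, rotating it once its lifetime has expired
    entry = _keys.get(customer_id)
    if entry is None or time.time() - entry[1] > KEY_LIFETIME_SECONDS:
        return issue_indicator_sequence(customer_id)
    return entry[0]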


The prototype implementation of the proposed method was done in Visual Basic 6 (VB6). The DPIS algorithm was implemented in MATLAB and its executable is invoked from VB. Sample implementation screenshots are shown in Figure 9, Figure 10 and Figure 11, which depict the internet banking login screen and the embedding and extracting of user credentials into an image. Once the user submits his username and password, the data is passed to the stego layer. In the stego layer, the DPIS embedding algorithm is invoked and it embeds the user credentials into the image, called the stego image, using the key allocated for that particular user. Thus each customer has a dynamic key, and the allocated key is changed after a certain period of time to strengthen security.
The stego image is passed to the bank server through the internet. Once it reaches the server stego layer, the embedded message is extracted by the DPIS algorithm using the symmetric key. The extracted user credentials are written to a file and validated at the bank server.

6 Behavior of Common Attacks against the Stego-Layer Method

Many attacks on internet applications are reported in the literature [5]. In this section the behaviour of common attacks against the proposed stego-layer method is analysed experimentally. The most common attacks are:
  Brute force
  Extracting data from all pixels
  Extracting the same number of bits from the data channels
Brute force attack on the indicator sequence: A brute force attack on the stego-layer method [3][5] involves trying all possible keys until the valid key is found. In the dynamic pattern based image steganography technique, the indicator channel carries the information about where the data are stored. An intruder may try a brute force attack on the indicator sequence until a meaningful message is traced. In all experiments the length of the indicator sequence is greater than or equal to 20, so the number of distinct patterns generated is very high. For example, if the indicator length is 20, the number of distinct patterns generated is 7,748,40,978, and it is difficult to break by brute force.

Fig. 6. Stego-image generated by DPIS algorithm


The secret message shown in Table 1 was embedded in the cover medium by the DPIS technique, and the obtained stego image is shown in Figure 6. A brute force attack was applied on the indicator sequence to extract the message from the stego image.
Table 1. Embedded message and extracted message with a wrong pixel indicator

  Embedded secret message in the cover medium:
    PondicherryUniversity Computer Science Dept
  Secret message obtained with a wrong indicator sequence:
    QibP(97u2q ]< < 4]nD |:8* ~v&N-F KKe W;'

The above experiment shows that even if one value in the indicator sequence is incorrect, the embedded secret message cannot be extracted.
Extracting data from all the pixels sequentially: In the stego-layer method, data are not embedded in all the pixels sequentially. Some pixels in the sequence are skipped in order to strengthen the algorithm. Table 2 shows the result of extracting data from all the pixels of the stego image shown in Figure 7.

Fig. 7. Stego-image generated by DPIS algorithm

The embedded message in the image cannot be extracted without the key. Since the key is known only to the communicating parties, the stego-layer method prevents the man-in-the-middle attack and session hijacking.


Table 2. Embedded message and message extracted from all the pixels in the stego image

  Embedded secret message in the cover medium:
    PondicherryUniversity Computer Science Dept
  Secret message obtained by extracting bits from all pixels in the stego image:
    T+*m* -&#tKK`w0 -xy

Extracting the same number of bits from all pixels: In the stego-layer technique the number of bits embedded in each pixel varies and is decided at run time. Experiments were conducted in which the same number of bits was extracted from all the pixels of the stego image generated by the DPIS technique. Table 3 shows the result of extracting the same number of bits from all the pixels of the stego image shown in Figure 8.

Fig. 8. Stego-image generated by DPIS algorithm


Table 3. Embedded message and message extracted with a uniform number of bits from the pixels of the stego image

  Embedded secret message in the cover medium:
    Pondicherry University Computer Science Dept
  Secret message obtained by extracting 2 bits from all data channels:
    Sj Lc;G;? Cq chT =Y { w@SO` =o `4' h | *Hf#


6.1 Detection of Tampered Stego Images

In the stego-layer method, provision is made to check whether the stego image has been modified by an intruder [2][5]. Before the embedded message is extracted, the stego image is checked for any change made by an intruder; if the stego image has not been tampered with, the extraction part is executed.
This is achieved in the DPIS technique through the user id. Since the user id is known to both client and server, a hash value of the user id is stored in a predetermined pixel of the image. Before extraction, that particular pixel containing the hash value of the user id is checked in order to make sure that the image has not been modified by an intruder. Thus the man-in-the-browser attack can be detected using the stego-layer method.
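A minimal sketch of this tamper check follows; the predetermined pixel location, the 3-byte truncation and the use of SHA-256 are assumptions made for illustration, not the paper's exact construction.

import hashlib
import numpy as np

CHECK_PIXEL = (0, 0)   # predetermined pixel agreed by client and server (assumed)

def user_id_digest(user_id):
    # truncate the hash to 3 bytes so it fits one RGB pixel
    return hashlib.sha256(user_id.encode()).digest()[:3]

def stamp_user_id(stego, user_id):
    stego[CHECK_PIXEL] = np.frombuffer(user_id_digest(user_id), dtype=np.uint8)
    return stego

def is_untampered(stego, user_id):
    stored = bytes(stego[CHECK_PIXEL].tolist())
    return stored == user_id_digest(user_id)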

7 Comparison of the Stego-Layer Method with an Existing Standard Cryptographic Algorithm

Numerous mechanisms for strengthening internet banking are reported in the literature. For each efficient method proposed for security, an equally intelligent method for breaking it has been developed by hackers. In Table 4 the proposed stego-layer method is compared with the Advanced Encryption Standard (AES) algorithm against functional and non-functional parameters.
Table 4. Comparison of the stego-layer method with the Advanced Encryption Standard algorithm (+++ Good, ++++ Very Good)

  Type                        Parameter                                  AES algorithm                           Stego-layer method
  Non-functional parameters   Performance                                +++                                     ++++
                              Efficiency                                 +++                                     +++
                              Time needed to decipher a message          +++                                     ++++
  Functional parameters       Key length                                 Static, with 3 different key lengths    Dynamic
                              Identification of tampered message at      No                                      Yes
                              receiver's side
                              Covertness in message transmission         No                                      Yes


Fig. 9. Bank login Screen

Fig. 10. Embedding user credentials into an image at the client side


Fig. 11. Extracting user credentials from stego-image at server side

8 Conclusion
In the stego-layer method a new security mechanism was introduced for internet banking using the Dynamic Pattern Based Image Steganography algorithm. The proposed method was compared with an existing algorithm against functional and non-functional parameters. Common attacks were tried against the proposed method, and the results show that these attacks are in vain. Further work will focus on choosing an appropriate image size, so that it does not become an overload on the network, and on comparing the proposed method with other algorithms for efficiency.

Acknowledgement
This research, CDBR Smart and Secure Environment, was sponsored by the National Technical Research Organisation (NTRO) and their support is gratefully acknowledged. I would also like to thank Mr. M.R. Parandama and Ms. S. Deepa for their support in the implementation.

References
[1] Biryukov, A., Dunkelman, O., Keller, N., Khovratovich, D., Shamir, A.: Key Recovery Attacks of Practical Complexity on AES Variants With Up To 10 Rounds (2009)
[2] Anderson, R.J., Petitcolas, F.A.P.: On the limits of steganography. IEEE Journal on Selected Areas in Communications (May 1998)
[3] Westfeld, A., Pfitzmann, A.: Attacks on Steganographic Systems. In: Proceedings of the Third International Workshop on Information Hiding, September 29-October 01, pp. 61-76 (1999)
[4] Bailey, K., Curran, K.: An Evaluation of Image Based Steganography Methods. Multimedia Tools & Applications 30(1), 55-88 (2006)
[5] Dimitriadis, C.K.: Analyzing the Security of Internet Banking Authentication Mechanisms. Information Systems Control Journal 3 (2007)
[6] Oghenerukeyb, E.A., et al.: Customers' Perception of Security Indicators in Online Banking Sites in Nigeria. Journal of Internet Banking and Commerce (April 2009)
[7] Navale, G.S., Joshi, S.S., Deshmukh, A.A.: M-banking Security - a futuristic improved Security approach. International Journal of Computer Science Issues 7(1,2) (January 2010)
[8] Hiltgen, A., Kramp, T., Weigold, T.: Secure Internet Banking Authentication. IEEE Security and Privacy 4(2) (2006)
[9] Mishra, A.K.: Internet Banking in India - Part I, http://www.banknetindia.com/banking/ibkg.htm
[10] Plössl, K., Federrath, H., Nowey, T.: Protection Mechanisms Against Phishing Attacks. In: Katsikas, S.K., López, J., Pernul, G. (eds.) TrustBus 2005. LNCS, vol. 3592, pp. 20-29. Springer, Heidelberg (2005)
[11] Seleborg, S.: About AES - Advanced Encryption Standard (2007), http://www.axantum.com/axcrypt/etc/About-AES.pdf

A Combinatorial Multi-objective Particle Swarm Optimization Based Algorithm for Task Allocation in Distributed Computing Systems

Rahul Roy 1, Madhabananda Das 1, and Satchidananda Dehuri 2
1 Department of Computer Science Engineering, KIIT University, Bhubaneswar, Odisha
  {link2rahul,mndas12}@gmail.com
2 Department of Information and Communication Technology, Fakir Mohan University, Balasore-756019, Odisha
  satchi.lapa@gmail.com

Abstract. In a distributed computing system (DCS), the scheduling of tasks comprises two phases: task allocation and task scheduling. The allocation of tasks to different processors is required to maximize the processors' synergism in order to achieve various objectives, such as system throughput, reliability maximization and cost minimization. The task allocation also needs to satisfy a set of system constraints related to memory and link capacity. This problem has been shown to be NP-hard. Most meta-heuristic algorithms treat the task allocation problem as single-objective or transform the multiple objectives into a single objective. This paper presents a combinatorial multi-objective particle swarm optimization based (CMOPSO) algorithm to deal with the multiple conflicting objectives of the task allocation problem simultaneously, without transforming them into a single objective. The performance of the algorithm is compared with an NSGA-II based task allocation algorithm; the results show that the algorithm performs well under different problem scales and task interaction densities.

Keywords: Task allocation, MOPSO, Combinatorial optimization problem, Distributed computing system.

1 Introduction

A distributed computing system (DCS) configuration involves a set of cooperating processors communicating over communication links. To increase system throughput, the modules of a distributed task must be allocated to different processors according to some objectives, such as minimization of execution and communication cost [7][10], maximization of system reliability and safety [16][6][13], and maximization of the fault tolerance of the system using software and hardware redundancy [4]. Moreover, the system components (processors and communication links) are capacitated by limited resources, which creates constraints on the task allocation problem. For instance, for the successful accomplishment of a longer task, we need the distributed system to be reliable (i.e., the processor and the communication link are less prone to failure). This incurs a heavier system cost (i.e., communication cost and execution cost). Thus, there is a trade-off between reliability and system cost. This task allocation problem, with the two objectives of minimizing the system cost and maximizing the system reliability, is shown to be NP-hard in [8].
Numerous methodologies for solving the task allocation problem are described in the literature. They can be broadly classified as: (1) Mathematical programming [3], such as linear programming, graph matching, state-space search algorithms, branch-and-bound, etc. All these techniques seek the exact solution and are prohibitive if the problem space is large. (2) Customized algorithms [7][13], which take into account a specific network configuration and can provide exact or approximate solutions under certain scenarios. However, these algorithms are very much dependent on the network configuration. (3) Meta-heuristic algorithms [16][1][14], such as genetic algorithms, tabu search, particle swarm optimization, etc. The successful application of meta-heuristic algorithms in diverse domains makes them strong candidates for solving the problem. Page et al. [11] proposed a genetic algorithm based dynamic task allocation algorithm; they optimized the makespan of the task allocation schedule using a genetic algorithm along with 8 other heuristic strategies. Most of the meta-heuristic algorithms applied to the problem are single-objective in nature. Yin et al. [15] considered the problem as a multi-objective problem, but they applied a hybrid PSO by transforming it into a single objective, taking the fitness function as the weighted sum of the two objectives. There is little or no literature that treats task allocation as a multi-objective problem. This motivated us to design a combinatorial multi-objective particle swarm optimization (CMOPSO) algorithm to solve the problem considering multiple objectives.
In this paper, we present a CMOPSO based task allocation algorithm that designs a task allocation schedule by simultaneously optimizing both system reliability and system cost of a distributed computing system. The experimental results show that CMOPSO provides a quality solution in less time for different problem scales and task interaction densities.
The remainder of the paper is organized as follows: Section 2 describes the formulation of the objectives of the multi-objective task allocation problem (MOTAP). Section 3 describes the CMOPSO algorithm. Section 4 provides the experimental environment, results and analysis. Finally, conclusions and future research directions are provided in Section 5.

2 Multi-objective Task Allocation Problem

2.1 Nomenclature

  x_{ik}       decision variable, where i is the index of the module and k is the index of the processor: x_{ik} = 1 if module i is allocated to processor k, else x_{ik} = 0
  n            number of processors
  r            number of modules
  p_k          processor k
  l_{kb}       communication link between processors k and b
  \mu_k        execution cost of processor k per unit time
  \mu_{kb}     communication cost of the link between processors k and b per unit time
  \lambda_k    failure rate of processor k
  \lambda_{kb} failure rate of the link between processors k and b
  e_{ik}       incurred accumulative execution time (AET) if module i is executed on processor k
  c_{ij}       incurred intermodule communication (IMC) load between modules i and j (in some data unit quantity)
  w_{kb}       transmission rate of the communication link l_{kb}
  m_i          memory requirement of module i from its execution processor
  M_k          memory allocated to processor k
  s_i          computation resource requirement of module i from its execution processor
  S_k          amount of computation resource capacitated to processor k

2.2 Problem Statement

We make the following assumptions for the task allocation problem:
The processors involved in the DCS are heterogeneous. Hence the processors may be capacitated with different amounts of memory and computation resources, and their processing speeds and failure rates may differ. The communication links may also have different bandwidths and failure rates.
The modules are non-preemptive in nature and have different communication times depending on the data to be communicated and the communication link. The execution time may also vary based on the speed of the processors.
The execution of a module consumes a specific amount of computation and memory resources from its assigned processor.
Failure events of the processors are statistically independent.
The network topology of the processors is rendered by a processor interaction graph (PIG), denoted by G1(P, L), where P = {p_i, i = 1, 2, ..., n} and L = {l_{kb}, 1 <= k < b <= n}. The PIG for a linear topology is shown in Figure 1. Here, the nodes represent the processors and the edges represent the communication links. The intermodule communication among the modules, to be executed by different processors, is depicted by a task interaction graph (TIG). We denote the TIG by G2(V, E), where V = {v_i} is a set of r nodes indicating the r modules and E = {c_{ij}} is the set of edges representing the IMC among these modules.

Fig. 1. Network topologies


An example of a TIG is shown in Figure 2. The complexity of a task interaction graph can be measured by the task interaction density d given by Equation (1):

  d = |E| / (r(r-1)/2),   (1)

where |E| counts the channels of requested IMC demands in the TIG and r(r-1)/2 is the maximum number of possible channels among the modules. The task interaction density can serve as a key factor in deciding the complexity of the problem.

[Figure: TIG with 8 modules V1-V8 connected by the IMC edges c14, c15, c45, c58, c28, c26, c67, c38 and c37]

Fig. 2. Example of a task interaction graph with r = 8
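As a quick illustration (a sketch, not part of the paper), the density of the TIG of Figure 2 follows directly from its edge list; with the nine IMC edges listed above and r = 8, d = 9/28, i.e., roughly the d = 0.3 setting used later in the experiments.

# Task interaction density d = |E| / (r(r-1)/2) for the TIG of Figure 2
edges = {(1, 4), (1, 5), (4, 5), (5, 8), (2, 8), (2, 6), (6, 7), (3, 8), (3, 7)}
r = 8
d = len(edges) / (r * (r - 1) / 2)
print(f"d = {len(edges)}/{r * (r - 1) // 2} = {d:.2f}")   # d = 9/28 = 0.32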

System Cost. The system cost [15] is a combination of the execution and communication costs incurred in the successful completion of the task. We assume that the execution and communication costs are time dependent, i.e., longer execution and communication of the task incur a heavier cost on the involved processors and communication links. Given a task allocation X = {x_{ik}}, 1 <= i <= r, 1 <= k <= n, the execution cost of processor p_k during an accumulative execution time (AET) interval t is \mu_k t. Since the total elapsed time t of processor p_k is \sum_{i=1}^{r} x_{ik} e_{ik}, the execution cost over all processors is \sum_{k=1}^{n} \sum_{i=1}^{r} \mu_k x_{ik} e_{ik}.
Similarly, the total elapsed time for handling the IMC over the link l_{kb} is \sum_{i=1}^{r} \sum_{j \neq i} x_{ik} x_{jb} (c_{ij}/w_{kb}), so the communication cost incurred over the link l_{kb} is \sum_{i=1}^{r} \sum_{j \neq i} \mu_{kb} x_{ik} x_{jb} (c_{ij}/w_{kb}), and the communication cost of the whole system is \sum_{k=1}^{n-1} \sum_{b>k} \sum_{i=1}^{r} \sum_{j \neq i} \mu_{kb} x_{ik} x_{jb} (c_{ij}/w_{kb}).
Summing both costs together, the system cost is defined by Equation (2):

  C(X) = \sum_{k=1}^{n} \sum_{i=1}^{r} \mu_k x_{ik} e_{ik} + \sum_{k=1}^{n-1} \sum_{b>k} \sum_{i=1}^{r} \sum_{j \neq i} \mu_{kb} x_{ik} x_{jb} (c_{ij}/w_{kb})   (2)

System Reliability. The distributed system reliability (DSR) is formulated using Shatz's [12] formulation, in which the probability that all involved components are operational during the execution of the task is computed. Like the system cost, the system reliability is time dependent. Given a task allocation X = {x_{ik}}, 1 <= i <= r, 1 <= k <= n, the reliability of processor p_k during the AET spent executing the modules assigned to it follows a Poisson distribution, \exp(-\lambda_k \sum_{i=1}^{r} x_{ik} e_{ik}). Similarly, the reliability of a communication link l_{kb} during the total IMC transmission time is \exp(-\lambda_{kb} \sum_{i=1}^{r} \sum_{j \neq i} x_{ik} x_{jb} (c_{ij}/w_{kb})). Thus the total system reliability over the involved operational components is given by Equation (3):

  R(X) = \prod_{k=1}^{n} \exp(-\lambda_k \sum_{i=1}^{r} x_{ik} e_{ik}) \prod_{k=1}^{n-1} \prod_{b>k} \exp(-\lambda_{kb} \sum_{i=1}^{r} \sum_{j \neq i} x_{ik} x_{jb} (c_{ij}/w_{kb}))   (3)
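A small sketch of how these two objectives can be evaluated for a candidate allocation is given below. It is an illustration under the notation above, not the authors' code: the allocation is encoded as in Section 3.1 (alloc[i] is the processor of module i, 0-based here), the TIG is assumed undirected with a symmetric c, and links are keyed by the ordered processor pair (min, max).

import math

def evaluate(alloc, mu_p, mu_l, lam_p, lam_l, e, c, w):
    # mu_p[k], lam_p[k]: cost rate and failure rate of processor k
    # mu_l[(k,b)], lam_l[(k,b)], w[(k,b)]: cost rate, failure rate and transmission rate of link (k,b)
    # e[i][k]: AET of module i on processor k; c[i][j]: IMC load between modules i and j
    r = len(alloc)
    cost = sum(mu_p[alloc[i]] * e[i][alloc[i]] for i in range(r))
    failure = sum(lam_p[alloc[i]] * e[i][alloc[i]] for i in range(r))
    for i in range(r):
        for j in range(i + 1, r):
            if c[i][j] == 0 or alloc[i] == alloc[j]:
                continue                        # no link traffic for this pair of modules
            link = (min(alloc[i], alloc[j]), max(alloc[i], alloc[j]))
            comm_time = c[i][j] / w[link]
            cost += mu_l[link] * comm_time      # contribution to C(X)
            failure += lam_l[link] * comm_time  # contribution to the exponent of R(X)
    return cost, math.exp(-failure)             # (C(X), R(X))

The constraints of Equations (5)-(8) and the penalty of Equation (9) introduced in Section 3 would be applied on top of this evaluation.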

Multi-objective Formulation. The multi-objective formulation of the MOTAP problem is given as follows:

  Minimize C(X),  Maximize R(X)   (4)

subject to

  \sum_{k=1}^{n} x_{ik} = 1,  i = 1, 2, ..., r   (5)

  \sum_{i=1}^{r} m_i x_{ik} \le M_k,  k = 1, 2, ..., n   (6)

  \sum_{i=1}^{r} s_i x_{ik} \le S_k,  k = 1, 2, ..., n   (7)

  x_{ik} \in \{0, 1\}  for all i, k   (8)

The multiple objective functions are given in Equation (4). Equation (5) enforces the constraint that each module is allocated to a single processor. Equations (6) and (7) enforce the resource constraints: the memory and computation resource capacity of each processor should be no less than the total resource requirement of all of its assigned modules. Constraint (8) guarantees that the x_{ik} are binary variables.


3 Combinatorial MOPSO Algorithm for MOTAP

To design the CMOPSO algorithm, we adopt the particle update strategies of Jarboui et al. [5]. However, here the particle is not mapped to {-1, 0, 1}; rather, we use a symmetric function to map the integer values to a continuous domain. This is discussed in detail in the following subsections.
3.1 Particle Representation

The particle for the MOTAP would need to be encoded with 0/1 entries in a matrix of size n x r, but this creates a sparse array which would take a long time to update. So we use a vector of size 1 x r, denoted by P_i = {p_{i1}, p_{i2}, ..., p_{ir}}, where p_{ij} represents the index of the allocated processor and j represents the index of the task. Here p_{ij} = k implies x_{jk} = 1 and x_{jm} = 0 for all m != k. The representation is shown in Figure 3.
[Figure: a 1 x r vector whose j-th entry holds the processor index assigned to module j, e.g., (1, 3, 5, 1, 6, 7, ...)]

Fig. 3. Particle representation

This representation is compact, and thus updating each particle requires O(r) steps, compared with O(nr) steps for the matrix representation.
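The mapping from this vector encoding back to the x_{ik} decision variables is straightforward; a small illustrative sketch follows.

def decode(particle, n):
    # particle[j] = processor index (1-based) assigned to module j; returns the x matrix
    r = len(particle)
    x = [[0] * n for _ in range(r)]
    for j, proc in enumerate(particle):
        x[j][proc - 1] = 1
    return x

# Example: the particle of Figure 3 with n = 7 processors
x = decode([1, 3, 5, 1, 6, 7], n=7)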
3.2

Fitness Evaluation

We use the MOTAP formulation dened in subsection 2.2. In this function the
constraints (5) and 8 are implicitly satised due to the compact particle representation. The constraints (6) and (7) are redened by combining them together
in a function J(x) which is dened in Equation (9).
 r
 r


n
n




max 0,
mi xik Mk + 2
max 0,
si xik Sk
J(x) = 1
k=1

i=1

k=1

i=1

(9)
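A direct transcription of this penalty is sketched below; the weight values beta1 and beta2 are assumptions, since their settings are not specified here.

def penalty(x, m, M, s, S, beta1=1.0, beta2=1.0):
    # x[i][k]: allocation matrix; m, s: module memory/computation demands; M, S: processor capacities
    n, r = len(M), len(x)
    j = 0.0
    for k in range(n):
        mem_load = sum(m[i] * x[i][k] for i in range(r))
        cpu_load = sum(s[i] * x[i][k] for i in range(r))
        j += beta1 * max(0, mem_load - M[k]) + beta2 * max(0, cpu_load - S[k])
    return j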
3.3 Velocity and Position Updation

Every particle is associated with a unique vector of velocities v_i = (v_{i1}, v_{i2}, ..., v_{in}). Before updating the particle vector to generate a new position, it is mapped from the combinatorial state to a continuous state using Equation (10):

  y_i = f(x_i) if x_i = P_i^g;  y_i = -f(x_i) if x_i = P_i^t;  y_i = f(x_i) or -f(x_i) if x_i = P_i^t = P_i^g;  y_i = 0 otherwise   (10)

where f(x) = |x^2 - b| and b is a prime number with b >> x.


The use of the symmetric function helps map the particles to a higher range of values, which provides a clear distinction between particle positions in the continuous state. The velocity of the particle is updated using Equation (11):

  v_{ij}(t+1) = w v_{ij}(t) + \rho_1 (PB_{ij} - p_{ij}(t)) + \rho_2 (PG_j - p_{ij}(t)),   (11)

When x_i = P_i^g, the particle is impelled to fly in the positive sense. When x_i = P_i^t, it is impelled to fly in the negative sense. When x_i = P_i^g = P_i^t, the particle flies in the direction opposite to that of y_i^t. When x_i equals neither P_i^g nor P_i^t, the direction of the particle's flight is determined by r_1, r_2, c_1 and c_2. The position of the particle is calculated using Equations (12)-(14):

  \lambda_{ij}^{t+1} = y_{ij}^{t} + v_{ij}^{t+1}   (12)

The value of y_{ij}^{t+1} is adjusted using Equation (13):

  y_{ij}^{t+1} = f(x) if \lambda_{ij}^{t+1} > \alpha f(x);  -f(x) if \lambda_{ij}^{t+1} < -\alpha f(x);  0 otherwise   (13)

where \alpha is known as the intensification (or diversification) parameter. A smaller value of \alpha leads to diversification of the Pareto front, and a larger value of \alpha results in intensification of the Pareto front.
The demapping of y_{ij}^{t+1} with Equation (14) generates the new set of particle positions in the search space:

  x_{ij}^{(t+1)} = P_{ij}^{g} if y_{ij}^{t+1} = f(x);  P_{ij}^{t} if y_{ij}^{t+1} = -f(x);  a random number otherwise   (14)

3.4 Selection of P_{ij}^t and P_i^G

At the beginning, each particle position is generated randomly and is considered the pbest (p_{ij}^t) of the particle. After every generation, the pbest for each particle is selected using the sigma method proposed in [9]. The selection of pbest is done from an external local memory which stores the non-dominated solutions over the generations. As there is more than one globally optimal solution, we need an external archive for maintaining the non-dominated list of P_i^G. The guides are stored in this external repository and, for each particle, a guide is selected using the guide selection strategy proposed in [2]. In this method, the objective space is divided into adaptive hypercubes. Each hypercube is assigned a fitness value calculated as a large number divided by the number of particles in that hypercube. A guide is selected for each particle from the hypercube which has the highest fitness value, to guide the particle through the search space.

3.5 Repository Updation

As we saw in Subsection 3.4, we need to maintain two archives to store the global best positions and the personal best positions, so we need a strategy to maintain the archive sizes while they retain a subset of the true Pareto front.
For maintaining the local memory, we use the non-domination test. After every generation, elements are stored in the local memory if they are not dominated by any member of the archive. Also, those members of the archive which are dominated by the new members about to enter the local memory are removed from the archive.
The external repository, storing the global guides, is also maintained using the non-domination test. Along with it, there is a secondary strategy: we use the crowding-sort technique to maintain the elements of the repository when its size grows beyond a fixed limit. We sort the members of the repository in descending order of their crowding distance values, which are calculated from the objective values, and the elements with the least crowding distance are removed from the archive (a sketch of this pruning step is given after Algorithm 1). The algorithm for CMOPSO is given in Algorithm 1.
ALGORITHM 1: CMOPSO algorithm
for j = 1 : Max_Swarm do
  /* Max_Swarm is the maximum size of the swarm */
  Initialize SWARM[j];
end for
Fitness-particle(Swarm)
/* evaluate the fitness */
for j = 1 : Max_Swarm do
  PB[j] = SWARM[j];
  /* initialize the swarm local memory */
end for
pbest_i^t = getParticle_pbest(Swarm, PB[j])
/* select pbest for each particle */
while (I < I_Max) do
  /* I_Max is the maximum number of iterations */
  P_g^t = getSwarm_gbest(Swarm, EXARCHIVE)
  /* select the gbest for each particle */
  Map the particle to the continuous state using Equation (10)
  Update the velocity of the particle with Equation (11)
  Update the position of the particle using Equations (12)-(14)
  Fitness-particle(Swarm)
  PB = Update_local_memory(Swarm, PB)
  /* update the pbest repository */
  pbest_i^t = getParticle_pbest(Swarm, PB[j])
  EXARCHIVE = P_Update_repository(Swarm, EXARCHIVE)
  /* primary update strategy for the external archive */
  if (|EXARCHIVE| > Max_Archive_size) then
    EXARCHIVE = S_Update_repository(Swarm, EXARCHIVE)
    /* secondary update strategy for the external archive */
  end if
end while
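The secondary update strategy for the external archive (the crowding-distance based pruning of Section 3.5) can be sketched as follows. This is an illustrative implementation of the standard crowding-distance measure, not code from the paper; archive members are tuples of objective values, e.g. (cost, -reliability).

def crowding_distances(archive):
    n, m = len(archive), len(archive[0])
    dist = [0.0] * n
    for obj in range(m):
        order = sorted(range(n), key=lambda i: archive[i][obj])
        dist[order[0]] = dist[order[-1]] = float("inf")     # always keep boundary points
        span = (archive[order[-1]][obj] - archive[order[0]][obj]) or 1.0
        for pos in range(1, n - 1):
            i = order[pos]
            dist[i] += (archive[order[pos + 1]][obj] - archive[order[pos - 1]][obj]) / span
    return dist

def prune(archive, max_size):
    # repeatedly drop the most crowded (least spread) member until the size limit holds
    while len(archive) > max_size:
        d = crowding_distances(archive)
        archive.pop(d.index(min(d)))
    return archive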

4 Experimental Study

4.1 Dataset

The dataset for the MOTAP is generated randomly. For the specified PIG, we set the number of processors (n) and modules (r) to (6,8), (6,10), (7,9) and (7,11) respectively, in order to test the problem with two different categories of problem scale. We also consider three different TIGs with task interaction densities d equal to 0.3, 0.5 and 0.8. The values of the other system parameters are generated randomly using uniform distributions over the following ranges: the module accumulative execution time (AET) is between 15 and 25, the intermodule communication (IMC) load is between 15 and 25, the failure rates are generated in the ranges (0.0005-0.0010) and (0.00015-0.00030), the memory and computation capacities of each processor vary from 100 to 300, and the memory and computation requirements of each module are generated in the range 1 to 60.
4.2 Experimental Environment

The experiments were performed on a Core2Duo processor with 1 GB RAM using Matlab 2009b. The parameter settings for the experiments are shown in Table 1; the parameters were derived empirically during the initial phases of the experiments.

Table 1. Parameter setting

  Algorithm   Swarm/population size   Iterations   c1    c2    W     pc    pm
  CMOPSO      200                     50           1     1.8   1.4   -     0.10
  NSGA-II     200                     50           -     -     -     0.8   0.05

4.3 Results and Analysis

We apply the CMOPSO algorithm to solve all instances of the problem. The Pareto optimal solutions obtained for one configuration of the MOTAP (r=8, p=6) are shown in Figure 4. For the other configurations, the best, worst and median values of the Pareto optimal set for the two objectives (i.e., maximization of system reliability and minimization of system cost) are shown in Table 2.
[Figure: Pareto fronts (system cost vs. system reliability) for d = 0.3, 0.5 and 0.8]

Fig. 4. Pareto front of the configuration p=6, r=8

Table 2. Comparative results of CMOPSO and NSGA-II for the two objectives

  Algorithm  P  r   d    System cost               System reliability
                         best    median  worst     best     worst    median
  CMOPSO     6  8   0.3  50.31   65      72.78     0.9953   0.995497 0.9952
             6  8   0.5  51.99   62.37   76.9      0.9954   0.99503  0.9952
             6  8   0.8  52.06   63      76.93     0.9952   0.9949   0.9951
             6  10  0.3  56.71   63.61   68.58     0.99532  0.9949   0.9954
             6  10  0.5  57.72   64.3    69.6      0.9954   0.99495  0.9952
             6  10  0.8  63.76   64.2    69.6      0.9954   0.9950   0.9952
             7  9   0.3  47.09   70.58   120.16    0.99420  0.9929   0.9941
             7  9   0.5  47.51   83.98   128.3     0.9942   0.9936   0.9941
             7  9   0.8  47.09   83.98   107.6     0.9942   0.9936   0.9941
             7  11  0.3  56.19   71.69   97.6      0.9942   0.9921   0.9936
             7  11  0.5  57.21   71.90   100.29    0.9941   0.9926   0.9935
             7  11  0.8  58.91   72.97   101.17    0.9941   0.992    0.9935
  NSGA-II    6  8   0.3  112.2   112.35  114       0.9541   0.9540   0.95437
             6  8   0.5  113     113.5   115.8     0.9650   0.9649   0.9648
             6  8   0.8  113.6   113.9   116.3     0.96872  0.9681   0.96832
             6  10  0.3  106.2   108.1   116.8     0.9693   0.9654   0.9549
             6  10  0.5  117.7   119.2   121.1     0.9554   0.9553   0.95504
             6  10  0.8  121.6   124.3   125       0.9559   0.95532  0.9550
             7  9   0.3  146.43  146.97  148       0.9821   0.9814   0.9808
             7  9   0.5  149.5   149.7   149.9     0.9810   0.9806   0.9802
             7  9   0.8  156.7   163.8   167.9     0.9810   0.9809   0.9800
             7  11  0.3  243.2   247.43  261.65    0.9809   0.9802   0.9782
             7  11  0.5  261.8   264.5   267.4     0.9843   0.9839   0.9832
             7  11  0.8  274     278.8   281.6     0.9862   0.9857   0.9850

  P is the number of processors, r the number of modules and d the task interaction density.

Fig. 5. Effect of task interaction density on CPU time

Fig. 6. Worst case analysis of the system cost (a) and system reliability (b) over the number of iterations

The corresponding values of the two objectives for the non-dominated sorting genetic algorithm (NSGA-II) based task allocation algorithm are also tabulated in Table 2. The results confirm the better performance of CMOPSO over NSGA-II for both objective values under different configurations. We also find that scaling the numbers of processors and modules deteriorates the performance of both algorithms to some extent; however, this deterioration is within a tolerable range for CMOPSO compared with the NSGA-II based task allocation algorithm. We also study the effect of the task interaction density (d) on the CPU time and find that increasing d increases the CPU time. The result is shown in Figure 5.
To ensure quality of service for applications running on the DCS, it is essential to study the worst-case behaviour of the system reliability and system cost. We repeated the experiment on all instances, varying the number of iterations from 1 to 300 over 30 runs. We plot system reliability vs. iterations and system cost vs. iterations in Figure 6. The results clearly show that the task allocation solutions generated by CMOPSO have a system reliability > 0.9940 and a system cost < 150 for all system configurations in the 30 runs. Thus CMOPSO generates a quality solution with a very high probability, equivalent to 1 - 30/300 = 90%.

5 Conclusion

In this paper, we proposed a CMOPSO algorithm for solving the MOTAP that minimizes the system cost and maximizes the system reliability. We showed that the results obtained from CMOPSO are very promising for task allocation. We also studied the effects of problem scaling and of the task interaction density, and we see that increases in these factors have a tolerable effect on the objectives and cause only a minor increase in CPU time. However, we did not consider the impact of the topologies on the performance of the system; this can be future research. We also intend to compare the algorithm with other meta-heuristics.

References
1. Attiya, G., Hamam, Y.: Task allocation for maximizing reliability of distributed systems: A simulated annealing approach. Journal of Parallel and Distributed Computing 66(10), 1259-1266 (2006)
2. Coello Coello, C.A., Lechuga, M.S.: MOPSO: a proposal for multiple objective particle swarm optimization. In: Proceedings of the 2002 Congress on Evolutionary Computation, CEC 2002, pp. 1051-1056. IEEE Computer Society, Washington, DC, USA (2002)
3. Ernst, A., Hiang, H., Krishnamoorthy, M.: Mathematical programming approaches for solving task allocation problems. In: Proceedings of the 16th National Conference of the Australian Society of Operations Research, Australia (2001)
4. Hsieh, C.-C.: Optimal task allocation and hardware redundancy policies in distributed computing systems. European Journal of Operational Research 147(2), 430-447 (2003)
5. Jarboui, B., Ibrahim, S., Siarry, P., Rebai, A.: A combinatorial particle swarm optimisation for solving permutation flowshop problems. Computers and Industrial Engineering 54(3), 526-538 (2008)
6. Kartik, S., Siva Ram Murthy, C.: Task allocation algorithms for maximizing reliability of distributed computing systems. IEEE Transactions on Computers 46, 719-724 (1997)
7. Lee, C.-H., Shin, K.G.: Optimal task assignment in homogeneous networks. IEEE Transactions on Parallel and Distributed Systems 8, 119-129 (1997)
8. Lin, M.-S., Chen, D.-J.: The computational complexity of the reliability problem on distributed systems. Information Processing Letters 64(3), 143-147 (1997)
9. Mostaghim, S., Teich, J.: Strategies for finding good local guides in multi-objective particle swarm optimization (MOPSO). In: Proceedings of the IEEE Swarm Intelligence Symposium, pp. 26-33 (2003)
10. Ajith Tom, P., Siva Ram Murthy, C.: Optimal task allocation in distributed systems by graph matching and state space search. Journal of Systems and Software 46(1), 59-75 (1999)
11. Page, A.J., Keane, T.M., Naughton, T.J.: A multi-heuristic dynamic task allocation using genetic algorithms in a heterogeneous distributed environment. International Journal of Parallel Distributed Computing 70, 758-766 (2010)
12. Shatz, S.M., Wang, J.-P., Goto, M.: Task allocation for maximizing reliability of distributed computer systems. IEEE Transactions on Computers 41, 1156-1168 (1992)
13. Srinivasan, S., Jha, N.K.: Safety and reliability driven task allocation in distributed systems. IEEE Transactions on Parallel and Distributed Systems 10, 238-251 (1999)
14. Tripathi, A.K., Sarker, B.K., Kumar, N.: A GA based multiple task allocation considering load. International Journal of High Speed Computing 11(4), 214-230 (2000)
15. Yin, P.-Y., Yu, S.-S., Wang, P.-P., Wang, Y.-T.: Multi-objective task allocation in distributed computing systems by hybrid particle swarm optimization. Applied Mathematics and Computation 184(2), 407-420 (2007)
16. Yin, P.-Y., Yu, S.-S., Wang, P.-P., Wang, Y.-T.: Task allocation for maximizing reliability of a distributed system using hybrid particle swarm optimization. Journal of Systems and Software 80, 724-735 (2007)

Enhancement of BARTERCAST Using Reinforcement Learning to Effectively Manage Freeriders

G. Sreenu, P.M. Dhanya, and Sabu M. Thampi
Rajagiri School of Engineering and Technology, Cochin, India

Abstract. Efficient searching and quality services are offered by the prevailing infrastructure of Peer-to-Peer (P2P) networks, and P2P applications are spreading more and more widely. Though the advantages remain, P2P systems are vulnerable to some security issues. One of the important issues that threatens the subsistence of a P2P system is freeriding. Freeriders are peers (nodes) which only utilize the system but do not contribute anything to it, and they affect the system drastically: they mainly download content without uploading anything, so content becomes concentrated in a few peers, which increases congestion, reduces the quality of the system and thereby its popularity. This paper compares different approaches for managing freeriders, and finally a solution is suggested which is an extension of the existing protocol known as BARTERCAST; the enhancement is done through Q-learning. Applying a reinforcement learning approach in BARTERCAST yields more accurate results.

Keywords: Maxflow, Q-learning, Q-table, Reward.

1 Introduction

The application areas of P2P systems include file sharing, distributed computing, and information storage and exchange. The main appealing factor is the existence of the system without a centralized authority. The main objective of a P2P system is to enhance the quality of services and, as a result, increase its popularity. Anonymous users are allowed to join, and users are not restricted by the quantity they have to upload. The freedom which the system provides may lead to more system vulnerabilities.
The main objective of this paper is to scrutinize one such vulnerability, known as freeriding. Freeriders damage the system by driving away users and increasing congestion among peers. This paper highlights diverse approaches towards freerider detection and prevention, together with a suggested solution.
BARTERCAST [1] is a lightweight protocol implemented with the Ford-Fulkerson maxflow algorithm. The method includes an assessment equation for finding the behaviour of a peer. The existing method can be enhanced with Q-learning [2] to get more accurate results. This paper highlights the precision that is observable in categorizing peer behaviour.



2 Existing Solutions

The freerider problem can be detected and avoided by different solutions. Some of the existing solutions are:
1. Bidding in P2P Content Distribution Networks using the Lightweight Currency Paradigm [3]: This is a currency exchange mechanism that requires a centralized server for managing currency transactions. For each download, the peer should have enough currency, which can be earned through upload activity. The main disadvantage is that the transaction cost can be higher than the transferred value, and the centralized server leads to a central point of failure.
2. A Fair Micro-Payment Scheme for Profit Sharing in P2P Networks [4]: This method suggests a profit-sharing mechanism in which the author's rights are managed, achieved through coin transferability. The mechanism needs a lot of message exchange; in addition, delegation of accountability is a complicated process.
3. Free Riders under Control through Service Differentiation in Peer-to-Peer Systems [5]: This method considers the contribution behaviour of each peer. Normally a peer will upload some content and then go to sleep; to avoid that, the contribution is considered. The contribution of each peer can be calculated through peer involvement and availability.
4. Towards a Cluster Based Incentive Mechanism for P2P Networks [6]: This approach consists of the formation of clusters. Each cluster consists of service exchange rings. The nodes in a ring are arranged in such a way that each node receives service from its predecessor and provides service to its successor. If any node is not willing to provide service, the entire ring collapses, and such nodes are identified and noted. The clusters are formed through query processing, and information about each failed node is passed through the query. A suspect table stores this information, nodes exhibiting freerider behaviour are blacklisted, and other nodes avoid forming a ring with them.
5. TARC: A Novel Topology Adaptation Algorithm based on Reciprocal Contribution in Unstructured P2P Networks [7]: This method considers the RCC (Reciprocal Contribution Capacity) value. Peers with high RCC values are placed closer together and others are placed at the network edge. The RCC value is based on content provision capacity, content transmission capacity and node-locating capacity. The disadvantage is that reallocating nodes between topologies depending on dynamic reputation is difficult and inefficient.
6. A Distributed and Monitoring-based Mechanism for Discouraging Free Riding in P2P Networks [8]: A utility function is evaluated. This method divides the system into a basic level, statistics level, decision level and executive level; based on the result of the utility function, the request is handled by the different levels. The method does not consider some security issues, and the initialization of the average utility value is left for future consideration.


7. BARTERCAST [1]: A fully distributed method for freerider prevention. It applies the Ford-Fulkerson maxflow algorithm and uses the result to find the reputation among different peers. Different security aspects are also considered by this method, and no centralized server is needed to coordinate the process.
In addition to the above solutions, reputation-based approaches for establishing trust among peers also play an important role in freerider prevention. Local reputation values generated among peers as a result of individual feedback are aggregated to acquire a global value. Existing reputation models which have been validated through simulation experiments are briefly described here.
1. XRep [9]: The protocol is developed for generating reputation values for both resources and peers. Reputation values are calculated on the basis of votes: peers are asked to vote their opinion about resources and about the peers which provide the resources. The votes are evaluated and finally the peer having the highest votes is selected for downloading the content.
2. TrustMe [10]: It gives importance to the anonymous nature of peers. The trust value of each peer is stored at a THA (Trust Holding Agent) peer. The requestor can initiate a broadcast query asking for the trust value of the peer holding the resource (say peer A); the answer is given by the corresponding THA peer. After getting a satisfactory trust value, the requestor chooses peer A for interaction. Finally, the requestor updates the trust value of peer A by submitting a report to the corresponding THA peer. If the network consists of a large number of peers, report distribution takes a long time and affects the reputation calculation time.
3. NICE [11]: Used for the implementation of cooperative applications. As a result of each transaction, a signed certificate is sent from the requestor to the resource possessor. Using the certificates, the possessor conducts a search. The final trust value calculation is based on the suggestions obtained from the searches and the signed certificates.
4. EigenTrust [12]: This method is mainly used for identifying malicious peers. Each peer calculates local trust values, which are normalized and aggregated; the aggregated value represents the global trust value. The basic assumption of this method, that if the content presented by a peer is genuine then the trust values reported by that peer are also reliable, is not always correct.
5. PeerTrust [13]: This reputation model is implemented on a structured P2P network. The trust value calculation depends on five factors, and the trust values are stored at hashed locations. Though it provides an acceptable level of freerider prevention, it is not easy to implement in a large-scale P2P network.
6. PowerTrust [14]: All calculations are based on a Trust Overlay Network built above the peers. Local trust values are aggregated by a regular random walk module and frequently updated by a look-ahead random walk (LRW) module. The LRW is associated with a distributed ranking module for finding power nodes. Power nodes are considered the most reputable nodes, and anything that happens to the power nodes affects the entire network.


7. GossipTrust [15]: It is used for aggregating local trust values in unstructured P2P networks. Aggregation is performed using step and cycle methods, where a cycle is composed of different steps and the value aggregated in one cycle is added to the next cycle. The main problem in GossipTrust is that the method considers all existing peers in the network.
8. GRAT [16]: The problem in GossipTrust is solved here by forming groups. The method includes the creation of groups, the creation of links between sub-leaders, and finally the calculation of the global score.

Comparison of Existing Solutions

We have used the NS2 network simulator. A simulation environment consisting of 100 peers is created, and the data transfer through each peer is analyzed. Peers are characterized as follows: peers with a high request processing rate are considered good peers, and peers having a request processing rate below a fixed threshold level are freeriders.
The above discussed solutions are compared, and the analysis shows that all of them are able to discourage freeriders and allow a high download rate for sharers. The result, with freerider prevention on the X-axis and time on the Y-axis, is shown in the graph for all seven methods discussed initially.
M1 stands for the Lightweight Currency Paradigm, M2 for Profit Sharing, M3 for Service Differentiation, M4 for the Cluster Based Incentive Mechanism, M5 for TARC, M6 for DMM, and M7 for BARTERCAST.

Fig. 1. Comparison Graph

The difference among the seven methods lies in their complexity and implementation, as shown in the table. From the above comparison, BARTERCAST shows the highest performance when all factors are considered.



Table 1. Comparison Table

Method                    Centralized/Decentralized   Complexity   Performance
Currency Method           Centralized                 High         Fair
Profit Sharing            Centralized                 High         Fair
TARC                      Distributed                 High         Fair
Service Differentiation   Partially Decentralized     Medium       Good
Cluster Based Mechanism   Distributed                 Medium       Good
DMM                       Partially Decentralized     Medium       Good
BARTERCAST                Distributed                 Low          Excellent

Proposed Solution

BARTERCAST is a protocol developed for a fully distributed (unstructured) P2P network. The protocol embraces a two-step procedure. The initial step implements the maxflow algorithm [1] for finding the maximum flow from a given source node to the target node. The algorithm is explained with the network shown in Fig. 2. Consider peer A as the source node and D as the destination node. The initial capacity of each link is given as AB=5, BC=3, CD=5, AE=4, AF=7, ED=8, EF=5, CE=7, FD=9.
Step 1: Initially a flow of 0 is passed through the network.
Step 2: Consider the path ABCD. The minimum capacity along ABCD is 3, so the initial flow through this path is 3 bytes and the maxflow becomes 0+3, that is, 3 bytes. The values remaining on the links become AB=2, BC=0, CD=2.
Step 3: Consider the path AFD. The minimum along AFD is 7, so a flow of 7 bytes occurs and the maxflow becomes 3+7, that is, 10 bytes. After that the remaining values are AF=0, FD=2.
Step 4: Consider AED. The minimum along AED is 4 and the maxflow becomes 14. The remaining values are AE=0 and ED=4.

Fig. 2. Input for maxflow algorithm


No other augmenting path remains, so the maximum flow between the source and the destination is 14.
The second step makes use of the maximum flow value for finding the reputation of the other peers associated with it.
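To make the first step concrete, the following is a minimal sketch (not the BARTERCAST implementation; the function and variable names are our own) of an augmenting-path max-flow computation on the edge capacities listed above. Running it reproduces the maximum flow of 14 from A to D.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly push flow along shortest augmenting paths."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():          # add reverse edges with capacity 0
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {source: None}               # BFS for an augmenting path
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow                        # no augmenting path left
        path, v = [], sink                     # reconstruct the path found
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:                      # update the residual graph
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# Edge capacities of the example network in Fig. 2
edges = {"A": {"B": 5, "E": 4, "F": 7}, "B": {"C": 3}, "C": {"D": 5, "E": 7},
         "E": {"D": 8, "F": 5}, "F": {"D": 9}, "D": {}}
print(max_flow(edges, "A", "D"))               # prints 14
```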
4.1 Q-Learning

Q-learning is one of the methods for realizing a learning course of action. It helps in achieving a perfect outcome as a result of repeated actions. The Q-value can be measured by analyzing the past performance of each peer. Q-Feed [17] maintains a Q-table for each node; the Q-table holds the Q-value for each node. Based on the Q-value the behavior of each node is labeled into three states. The opening state of every node is regular; a decline in performance results in the floating state and finally leads to the sleeping state.
4.2 Q-Learning Applied in BARTERCAST

BARTERCAST is implemented in a fully unstructured network. The maxflow algorithm explained above gets its input from peers as message exchanges. BARTERCAST has already proved that it can work in a situation where the majority of peers send false information. Though it produces a valuable outcome, it can be made more accurate by applying Q-learning. The formula for finding the reputation matrix in BARTERCAST is defined for a path with maximum length 2; it can be modified by considering more than 2 nodes in an unstructured network. The method used in Q-Feed [17] for finding the Q-value can be applied here.
Q(i,t+1) = Q(i,t) + α(γ - Q(i,t))    (1)

Each peer Qi is considered as an agent. Every action of peer Qi results in a state change. The initial Q-value is set based on the number of bytes transmitted by the peer. The history of past performances of a peer is used to decide the peer's behavior. BARTERCAST analyzes the reputation through only a single calculation, whereas the learning process comes to a decision after considering a number of past performances. The learning rate constant α is preset as 0.2. γ is the reward for each action. The maximum flow value resulting from the maxflow algorithm is used for finding the Q-value.
The custom-made formula for finding the Q-value is (1), where

γ = arctan(maxflow(j,i) - maxflow(i,j)) / (π/2)    (2)

Q-values for each node based on previous performances are stored in the Q-table. The Q-table values decide whether the peer belongs to the regular state, the floating state or the sleeping state. The reward formula calculates the reward between neighboring peers. Values from other peers are exchanged as messages among the peers.


Algorithm for Learning Approach in BARTERCAST

Step 1: Create a Q-table for each node in the network.
Step 2: Initialize the Q-table based on the node capacity.
Step 3: Apply the maxflow algorithm and analyze the maximum flow between peers.
Step 4: Use the Q-learning equation for finding Q-values.
Step 5: The γ value is substituted using equation (2).
Step 6: The calculation is repeated for each action and the Q-values are modified.
Step 7: The modified Q-values are stored in the Q-table.
Step 8: The Q-table entry of each peer is analyzed to decide the peer's character.
Step 9: Based on this decision, the peer is accepted or rejected by the other peers in the network.
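A compact sketch of how these steps fit together is given below. It assumes the reward of equation (2) and the update of equation (1) with α = 0.2; the thresholds separating the regular, floating and sleeping states are placeholders, since the paper does not fix them numerically.

```python
import math

ALPHA = 0.2                                    # learning rate used in the paper

def reward(maxflow_ji, maxflow_ij):
    """Equation (2): reward in (-1, 1) from the two directed max-flow values."""
    return math.atan(maxflow_ji - maxflow_ij) / (math.pi / 2)

def update_q(q_table, peer, gamma):
    """Equation (1): move the stored Q-value towards the latest reward."""
    q_table[peer] = q_table[peer] + ALPHA * (gamma - q_table[peer])

def classify(q_table):
    """Steps 8-9: label peers relative to the average Q-value (placeholder rule)."""
    avg = sum(q_table.values()) / len(q_table)
    states = {}
    for peer, q in q_table.items():
        if q >= avg:
            states[peer] = "regular"           # good contributor
        elif q >= 0.5 * avg:                   # assumed floating threshold
            states[peer] = "floating"
        else:
            states[peer] = "sleeping"          # treated as a freerider
    return states

# Steps 1-2: Q-table of peer A, initialised from upload capacity (cf. Table 2)
q_table = {"B": 160.0, "C": 90.0, "D": 45.0, "E": 200.0, "F": 80.0}
# Steps 3-7: after each transaction, compute the reward and update the table
update_q(q_table, "B", reward(maxflow_ji=2, maxflow_ij=5))   # example flow values
print(classify(q_table))
```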

Illustrating the Algorithm

The entire procedure can be summarized into three sections. The initial step is Q-table creation; the second and third steps deal with value calculation and behavior analysis.
6.1 Q-Table Creation and Initialization

Each node maintains a Q-table. A Q-table entry contains the Q-values of neighboring peers. Consider Fig. 2 as the input network. Node A sends a request, and the nodes which give a response are entered into the Q-table. The initial Q-values for these nodes are set based on their uploading capacity. The Q-values then get modified for each action performed by the peer.
Here the Q-table entries of peer A are considered. The nodes which respond to the query sent by A are entered in the Q-table; here nodes B, C, D, E and F are the Q-table entries. Based on their uploading bandwidth, reported as part of their response to node A's request, initial Q-values are set for each node. Table 2 shows the Q-table of node A.
Table 2. Initial Q-Table of peer A

B     C    D    E     F
160   90   45   200   80

6.2 Q-Value Updation and Reward Calculation

Q-value calculation at each step gives a reward to the agent (peer). Here the reward is represented as γ, and the values of γ vary between -1 and 1. Consider the values to be -0.5, -0.3, 0.4, 0.66 and 0.7 for peers B, C, D, E and F respectively. The Q-values of each peer calculated using the Q-learning equation for these values are shown in the following Q-table: the initial values shown in Table 2 are modified in Table 3.


Table 3. Modified Q-Table of peer A

B       C       D       E         F
127.9   71.94   36.08   160.132   64.14
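As a quick check, applying equation (1) with α = 0.2 to the initial values of Table 2 and the rewards listed above reproduces the entries of Table 3 (a small illustrative computation, not code from the paper):

```python
alpha = 0.2
initial = {"B": 160, "C": 90, "D": 45, "E": 200, "F": 80}        # Table 2
rewards = {"B": -0.5, "C": -0.3, "D": 0.4, "E": 0.66, "F": 0.7}  # gamma values

updated = {p: q + alpha * (rewards[p] - q) for p, q in initial.items()}
# matches Table 3 up to float rounding: 127.9, 71.94, 36.08, 160.132, 64.14
print(updated)
```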

6.3 Behavior Analysis

Based on whether its Q-value is greater than or equal to the average Q-value, a peer enters different states. A regular peer is a good contributor peer. A peer with a Q-value less than that of a regular peer comes into the floating state, and peers with Q-values less than those of regular and floating peers go to the sleeping state.
Finally each node analyzes its Q-table and, based on the Q-values of its neighbors, decides whether to accept or reject them.

Result Analysis

This section elucidates the simulation results. The results show the effectiveness of the proposed method in freerider prevention. Simulations are done for both BARTERCAST and the enhanced method, and finally the results are compared to show the accuracy of the enhanced method over the existing BARTERCAST.
7.1 Simulation Scheme

NS2 is used as the simulator. The simulation is done for 1000 peers and the file size is taken as 70M. 20% of the peers are simulated as freeriders and 80% as good peers.

Fig. 3. Analysis of BARTERCAST


Fig. 4. Analysis of Enhanced BARTERCAST using Q-learning

Fig. 5. Comparison of above two results

Fig. 3, Fig. 4 and Fig. 5 show the corresponding simulation results of the BARTERCAST protocol, the enhanced BARTERCAST using Q-learning, and the comparison of the two methods. The X-axis shows the time values in hours and the Y-axis represents the total upload rate of the network.
The gradual increase in upload rate shows the detection and avoidance of freeriders. The comparison results show the efficiency of the proposed method over


the existing one. For the proposed method, as time increases the upload rate of the peers increases in an accurate manner. Since the enhanced method is based on Q-learning, the accuracy increases with each transaction based on past experience.

Conclusion

The application of the reinforcement learning approach results in highly accurate performance. The accuracy of the result is interrelated with the reputation of peers. BARTERCAST is an undemanding mechanism with high reputation values. Freerider detection and prevention is done on the basis of the Q-values stored in the Q-table. The Q-value is calculated in two separate procedures: initially the Ford-Fulkerson maxflow algorithm is used for finding the maxflow, and after finding the maxflow the Q-value is found using the Q-learning equation. The values in the Q-table are examined to fix the behavior of each peer. Acceptance of a peer depends on this behavior: a peer with sleeping behavior is considered a freerider and is rejected by the network. Finally, the efficiency of the proposed method is proved against the existing BARTERCAST protocol with the help of simulation results.

References
1. Meulpolder, M., Pouwelse, J.A., Epema, D.H.J., Sips, H.J.: BARTERCAST: A practical approach to prevent lazy free riding in P2P networks. In: Proceedings of IPDPS 2009, pp. 1-8 (2009)
2. Watkins, C.J.C.H., Dayan, P.: Technical Note: Q-Learning. Machine Learning 8(3-4) (May 1992), doi:10.1007/BF00992698
3. Elrufaie, E., Turner, D.A.: Bidding in P2P Content Distribution Networks using the Lightweight Currency Paradigm. In: International Conference on Information Technology: Coding and Computing (ITCC 2004), vol. 2, p. 129 (2004)
4. Catalano, D., Ruffo, G.: A Fair Micro-Payment Scheme for Profit Sharing in P2P Networks. In: Proceedings of the 2004 International Workshop on Hot Topics in Peer-to-Peer Systems (HOT-P2P 2004). IEEE, Los Alamitos (2004)
5. Mekouar, L., Iraqi, Y., Boutaba, R.: Free riders under control through service differentiation in peer-to-peer systems. In: International Conference on Collaborative Computing: Networking, Applications and Worksharing (2005)
6. Zhang, K., Antonopoulos, N.: Towards a Cluster Based Incentive Mechanism for P2P Networks. In: CCGRID 2009: Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (2009)
7. Chen, C., Su, S., Shuang, K., Yang, F.: TARC: A Novel Topology Adaptation Algorithm based on Reciprocal Contribution in Unstructured P2P Networks. In: ICPP Workshops 2009, pp. 437-442 (2009)
8. Tian, J., Yang, L., Li, J., Liu, Z.: A Distributed and Monitoring-based Mechanism for Discouraging Free Riding in P2P Network. In: 2009 Computation World: Future Computing, Service Computation, Cognitive, Adaptive, Content, Patterns, pp. 379-384 (2009)


9. Damiani, E., Vimercati, S., Paraboschi, S., Samarati, P., Violante, F.: A Reputation-based Approach for Choosing Reliable Resources in Peer-to-Peer Networks. In: ACM Symposium on Computer Communication Security, pp. 207-216 (2002)
10. Singh, A., Liu, L.: TrustMe: Anonymous Management of Trust Relationships in Decentralized P2P Systems. In: Third IEEE International Conference on Peer-to-Peer Computing, pp. 142-149 (September 2003)
11. Lee, S., Sherwood, R., Bhattacharjee, B.: Cooperative peer groups in NICE (2003)
12. Kamvar, S., Schlosser, M., Garcia-Molina, H.: The EigenTrust algorithm for reputation management in P2P networks. In: Proceedings of the Twelfth International World-Wide Web Conference (WWW 2003), pp. 446-458 (2003)
13. Xiong, L., Liu, L.: PeerTrust: Supporting reputation-based trust for peer-to-peer electronic communities. IEEE Transactions on Knowledge and Data Engineering 16(7), 843-857 (2004)
14. Zhou, R., Hwang, K.: PowerTrust: A Robust and Scalable Reputation System for Trusted Peer-to-Peer Computing. IEEE Trans. Parallel and Distributed Systems 18(4), 460-473 (2006)
15. Zhou, R., Hwang, K., Cai, M.: GossipTrust for fast reputation aggregation in peer-to-peer networks. IEEE Trans. on Knowledge and Data Engineering, 1282-1295 (February 11, 2008)
16. Yasutomi, M., Mashimo, Y., Shigeno, H.: GRAT: Group Reputation Aggregation Trust for Unstructured Peer-to-Peer Network. In: ICDCSW 2010: Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems Workshops (2010)
17. Thampi, S.M., Chandra Sekaran, K.: Q-Feed - An Effective Solution for the Free-riding Problem in Unstructured P2P Networks. International Journal of Digital Multimedia Broadcasting 2010, Article ID 793591, doi:10.1155/2010/793591, ISSN: 1687-7578, e-ISSN: 1687-7586

A Novel Approach to Represent Detected Point Mutation
Dhanya Sudarsan, P.R. Mahalingam, and G. Jisha
Rajagiri School of Engineering and Technology, Cochin, India
{dhanyasudarsan127,prmahalingam}@gmail.com,jishag@rajagiritech.ac.in

Abstract. Research in point mutation is ubiquitous in the field of bioinformatics, since it is critical for evolutionary studies and disease identification. With the exponential growth of gene bank size, the need to intelligibly capture, manage and analyse the ever-increasing amount of publicly available genomic data has become one of the major challenges faced by bioinformaticians today. The paper proposes a new method to represent point mutation by effectively reclassifying the DNA sequences on the basis of the occurrence of point mutation to form a mutation hierarchy, which considerably reduces the memory space requirement for storage and heavily reduces the complexity of data mining.
Keywords: Point mutation, Data warehousing, Data mining.

Introduction

A gene mutation is a permanent change in the DNA sequence that makes up a gene. Mutations range in size from a single DNA building block (DNA base) to a large segment of a chromosome. Gene mutation can be either small scale or large scale. Point mutation is a small-scale mutation resulting in a single nucleotide base change in DNA. A point mutation may be due to the loss of a nucleotide resulting in a shorter sequence, the insertion of an additional nucleotide increasing the sequence, or the substitution of one nucleotide for another. There are many methods to detect point mutations, such as:
1. denaturing gradient gel electrophoresis (dgge)
2. temperature gradient gel electrophoresis (tgge)
3. heteroduplex analysis (het)
The new approach to represent point mutation presented in this paper tries to re-classify the DNA sequences based on the obtained point mutation data and to create a hierarchy for the mutations. This can aid in effective representation of the sequences, tracking mutation chains and even DNA regeneration.

Motivation

The Genbank size is increasing exponentially day by day. Accumulation of information into Genbank was heavily boosted by the introduction of shotgun



technology for sequencing DNA [2]. Thus the memory requirement for storage has increased drastically and the retrieval of needed information from this huge volume has become a complicated task.

Fig. 1. Graphical representation of growth of Genbank (Source: Wikipedia)

All these situations motivated us to think of a new method to represent mutation which can save memory as well as reduce the complexity of data mining. Initially, point mutation in the DNA sequence is taken into consideration.

Existing Methods

Point mutations are represented in a variety of ways. The most frequent representation consists of three distinct parts: a nucleotide, a sequence position, and a mutant. A typical representation of a point mutation is A113T, denoting a change from adenine to thymine at position 113 of a DNA sequence. Variations on this shorthand form include A123T, A(123)T, and A-123-T. Three-letter abbreviations are also used, for example Ala123Thr, Ala118Thr, Ala(118)Thr, and Ala-118-Thr. Aside from this, point mutations are also represented in a sentence

form, such as "position 132 was mutated from an alanine to a threonine" or "positions 101-110 were mutated to proline" [9].
The problem is that none of these representation methods indicates the type of mutation that happened, so from these representations the actual sequence regeneration is not possible, and therefore the mutated gene has to be stored in some other location, which causes considerable memory wastage.

Proposed Solution

The proposal is a new representation method for point mutation which is simple and dynamically reduces the space requirement for storage. This new representation method makes it possible to retrieve information from the database with less complexity.
The three possible point mutations have to be taken into consideration. The representation method consists of a tuple with three variables followed by a memory address:
<V, X, Y>address
where
V is the nucleotide,
X is the position of the mutation, and
Y is the type of mutation that happened.
Here we are considering only DNA mutation, so the possible values of V can be A, T, C or G. X can take any integer value depending on the position of the nucleotide at which the mutation happened. Y is a single character value which can be either s, i or d, where s stands for the substitution of one nucleotide for another, i for the insertion of an additional nucleotide, and d for the loss of a nucleotide. The address that follows the tuple is the address location of the gene from which the new gene is mutated; a pointer is made to point to that memory address. So in this method of representation it is not necessary to store the mutated gene; instead, only the changes that happened need to be stored. Since this representation gives us the changes and the address of the old gene, it is easily possible to trace out the new gene.
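A minimal sketch of how such a record might be held and used to regenerate the mutated gene is shown below; the class and function names are illustrative assumptions, not part of the proposed system's specification.

```python
from dataclasses import dataclass

@dataclass
class PointMutation:
    v: str        # nucleotide involved ('A', 'T', 'C', 'G', or '' for a deletion)
    x: int        # 1-based position of the mutation in the sequence
    y: str        # type: 's' substitution, 'i' insertion, 'd' deletion
    parent: int   # address (here: key) of the gene the new gene mutated from

# the parent gene is stored once; the mutated gene is kept only as a delta
gene_store = {0: "AUGAAACUUCGCAGGAUGAUGAUG"}
delta = PointMutation(v="", x=6, y="d", parent=0)   # a deletion at position 6

def rebuild(mutation, store):
    """Regenerate the mutated sequence from the parent gene and the delta."""
    seq = store[mutation.parent]
    if mutation.y == "d":
        return seq[:mutation.x - 1] + seq[mutation.x:]
    if mutation.y == "i":
        return seq[:mutation.x - 1] + mutation.v + seq[mutation.x - 1:]
    return seq[:mutation.x - 1] + mutation.v + seq[mutation.x:]   # substitution

print(rebuild(delta, gene_store))   # AUGAACUUCGCAGGAUGAUGAUG
```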

5 Method of Implementation

5.1 Computation of Input to the Representation Algorithm

A one-to-one matching is done on the sequence, and if a mismatch occurs, the nucleotide at which the mismatch occurred and its position are saved to temporary variables. Then the next two nucleotides are compared. If they are the same, the previous one is considered a substitution, so to the temporarily stored variables, s (s stands for substitution) is added as the type of mutation. Else, if the comparison result is again a mismatch, the nucleotide is substituted


according to the nucleotide in the corresponding position of the gene to which it is compared. Again the position is incremented and a match is checked. If it is a match, the type of mutation is stored as i (i stands for insertion). Else, if the comparison result is again a mismatch, the nucleotide position is decremented by one, the nucleotide in the current position is deleted, and the type of mutation is stored as d (d stands for deletion).
Here we use a matrix to compute the degree of mutation between two sequences. Every element of the matrix is of a composite type consisting of the changed nucleotide, the position and the type of mutation. We maintain a hash index of the set of sequences, which is again used as the matrix index. So, corresponding to any two sequences in the hash table, there will be a matrix entry that contains the data required for keeping track of the mutation.
The steps followed are as follows:
1. Read the two sequences and call the match function.
2. If the function returns false, the modified nucleotide and its position are stored in the corresponding matrix cell.
3. The position is incremented and the match function is called.
4. If the function returns true, the type of mutation is saved as substitution.
5. Else, if the match function returns false, the current position is decremented and the nucleotide in the first sequence is substituted into the current position of the second sequence.
6. The position is then incremented and again the match function is called.
7. If the function returns true, the type of mutation is stored as insertion.
8. Else, if the match function returns false, the position is decremented, the nucleotide in the current position of the second sequence is deleted, and the type of mutation is saved as deletion.
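A rough sketch of this comparison procedure is given below, assuming plain strings and a single point mutation per pair; the function name and the simplified realignment checks are our own, paraphrasing steps 1-8 rather than reproducing the authors' code.

```python
def classify_point_mutation(ref, mut):
    """Detect a single substitution, insertion or deletion.
    Returns (nucleotide, 1-based position, type) or None."""
    i = 0
    while i < min(len(ref), len(mut)) and ref[i] == mut[i]:
        i += 1                                        # scan until a mismatch
    if len(ref) == len(mut):
        if i == len(ref):
            return None                               # identical sequences
        if ref[i + 1:] == mut[i + 1:]:
            return (mut[i], i + 1, "s")               # substitution
    elif len(mut) == len(ref) + 1 and ref[i:] == mut[i + 1:]:
        return (mut[i], i + 1, "i")                   # insertion in the new gene
    elif len(ref) == len(mut) + 1 and ref[i + 1:] == mut[i:]:
        return (ref[i], i + 1, "d")                   # deletion in the new gene
    return None                                       # not a single point mutation

print(classify_point_mutation("AUGATC", "ACGATC"))                  # ('C', 2, 's')
print(classify_point_mutation("AUGAAACUUCGC", "AUGAACUUCGC"))       # ('A', 6, 'd')
```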
5.2 Sample Representation

Consider the case of Sickle Cell Anaemia:
Normal sequence: AUG AAA CUU CGC AGG AUG AUG AUG
Mutated sequence: AUG AAC UUC GCA GGA UGA UGA UG
Here, the point mutation will be represented as:
<-, 6, d>
Here, the first field of the triple is left blank since there is no nucleotide that replaces the current one; instead, a nucleotide is simply deleted.
5.3 Graph-Based Data Mining

As we have seen, there is a matrix that keeps track of the mutations happening
between any two sequences in the hashed database. Thinking in terms of data
mining opportunities, we can logically represent the same matrix as a graph
(of which this matrix forms an Adjacency Matrix). In that graph, the hashed


sequences will be the vertices and the edges correspond to mutations as per the matrix.
This method enables us to visualize the mutations by jumping from one vertex to the next, each of which is in turn a sequence in itself. So, by keeping track of the mutations encountered (from the graph edges), we can easily retrieve the new sequence by simple graph traversal algorithms. This matrix therefore acts as a chain that keeps the mutation process in itself. Any new sequence simply needs to be hashed into the matrix, and we have to fill the corresponding cells. Once that is done, it is an integral part of the whole graph, and gets its right position in the mutation chain.
5.4 Performance Enhancement through Parallelism

Due to the inherent parallel nature of this method, the speed of processing can be increased by parallelizing the computation. The most suitable model suggested here for parallelisation is the Dispatcher-Worker model.

Fig. 2. Dispatcher-Worker Model [7]

In the Dispatcher-Worker model the server process is comprised of a single dispatcher thread and multiple worker threads [7]. Using a divide-and-conquer algorithm, the server can divide the matrix into different sub-matrices that can be processed separately. A separate client thread may be used to handle each mutation edge tracking. The dispatcher allocates an idle worker thread to each subprocess. The result produced by each client is sent back to the server, and the server combines the results.
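A minimal sketch of this dispatcher-worker split using a thread pool is shown below; the way the matrix cells are chunked and combined is an illustrative placeholder, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def process_submatrix(cells):
    """Worker: examine one block of matrix cells for mutation edges."""
    return [cell for cell in cells if cell is not None]     # placeholder filter

def dispatch(matrix_cells, n_workers=4):
    """Dispatcher: split the cells into blocks, farm them out, combine results."""
    chunk = max(1, len(matrix_cells) // n_workers)
    blocks = [matrix_cells[i:i + chunk] for i in range(0, len(matrix_cells), chunk)]
    results = []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(process_submatrix, blocks):
            results.extend(partial)                          # server combines results
    return results

print(dispatch([("C", 2, "s"), None, ("A", 4, "s"), None]))
```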
According to Amdahl's law, "the speed-up of a program from parallelization is limited by how much of the program can be parallelized" [11].


S = 1 / (1 - P)
where S is the speed-up of the program and P is the fraction of it that is parallelizable. For instance, if 90% of the computation can be parallelized (P = 0.9), the achievable speed-up is bounded by 1 / (1 - 0.9) = 10. So, as we subdivide the matrix further, we are able to parallelize the process by a huge margin.

Result

Here we consider basic sequences, limited to only two nucleotides. Then we have a set of 16 possibilities.
The hash table is designed such that it maps a hexadecimal value to each sequence, ranging from 0 to F. The matrix, for the time being, is indexed based on the sequence itself, not on the hash value (for better readability).
Fig. 3. Outcome

The algorithm proposed above produces the following result. Consider the following sequences:
AUG ATC
ACG ATC
ACG TTC
Consider the sequences to be hashed as:
AUG ATC = 20
ACG ATC = 27
ACG TTC = 12


Once the graph is constructed, it will have 3 nodes: 20, 27, 12.
There will be the following entries in the matrix (corresponding to the edges):
matrix[20][27] = {C,2,S}
matrix[27][20] = {U,2,S}
matrix[27][12] = {T,4,S}
matrix[12][27] = {A,4,S}
So, by using suitable graph traversal algorithms, we can trace the mutations that have occurred between any two such sequences, as long as they have been mapped into the matrix.
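A small sketch of the mutation graph built from the matrix entries above, and of tracing a mutation chain by traversal, is given below (the helper names are illustrative):

```python
# adjacency structure equivalent to the matrix entries listed above
mutations = {
    (20, 27): ("C", 2, "s"), (27, 20): ("U", 2, "s"),
    (27, 12): ("T", 4, "s"), (12, 27): ("A", 4, "s"),
}

def trace_chain(path):
    """Collect the mutations met while walking a path of hashed sequences."""
    return [mutations[(a, b)] for a, b in zip(path, path[1:])]

# walking 20 -> 27 -> 12 recovers the two substitutions in order
print(trace_chain([20, 27, 12]))   # [('C', 2, 's'), ('T', 4, 's')]
```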

Conclusion

The new approach to represent point mutation is simple and reduces the storage requirement considerably. It also provides an easy means of data mining. The approach to parallelize the method makes it possible to perform the computation on a cluster of computers rather than going for a supercomputer, and the process is expected to speed up the computation considerably. This representation method is suitable for evolutionary studies, disease recognition, strength identification, etc. Even though the paper takes into consideration DNA mutations, other mutations such as protein (amino acid sequence) mutations can be represented using the same method.

Future Work

The representation of point mutation can be extended to chromosomal mutation and copy number variation. Also, future work can be done on less time-consuming methods for mutation detection.

References
1. Nollau, P., Wagener, C.: Methods for detection of point mutations: Performance and quality assessment. Clinical Chemistry 43(7) (1997)
2. GBParsy: A GenBank flatfile parser library with high speed (2008), http://www.biomedcentral.com/1471-2105/9/321
3. Sheng, C., Hsu, W., Lee, M.L., Tong, J.C., Ng, S.-K.: Mining mutation chains in biological sequences. In: 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010), pp. 473-484 (2010)
4. Ji, M., Tang, H., Guo, J.: A single-point mutation evolutionary programming. Information Processing Letters 90(6) (June 30, 2004)
5. Binder, A.: Methods of Detection of Single Point Mutations (1997), http://www.kfunigraz.ac.at/~binder/thesis/node63.html
6. Akhurst, T.J.: The Role of Parallel Computing in Bioinformatics. Research Report (January 2005)


7. Sinha, P.K.: Distributed Operating Systems, pp. 398-414. PHI Learning Private Limited (2009)
8. Sickle Cell: Sickle Cell Anemia: Example of a Point Mutation (2010), http://www.tamu.edu/faculty/magill/gene603/PDF%20versions/Sickle%20Cell.pdf
9. Lee, L.C., Horn, F., Cohen, F.E.: Automatic Extraction of Protein Point Mutations Using a Graph Bigram Association. PLoS Comput. Biol. 3(2), e16 (2007), doi:10.1371/journal.pcbi.0030016
10. Wu, L., Ling, Y., Yang, A., Wang, S.: Detection DNA point mutation with rolling-circle amplification chip. IEEE, Los Alamitos (2010)
11. Amdahl's Law (1992), http://home.wlu.edu/~whaleyt/classes/parallel/topics/amdahl.html

Anonymous and Secured Communication Using OLSR in MANET
A.A. Arifa Azeez, Elizabeth Isaac, and Sabu M. Thampi
Rajagiri School of Engineering and Technology, Cochin, India

Abstract. A mobile ad hoc network, termed MANET, is an ad hoc network of self-configuring mobile devices connected by wireless links. Currently, MANET has a great impact on secured communication as it is a part of the ubiquitous network. Each node in the MANET acts as a router itself. Routing in wireless ad hoc networks is vulnerable to traffic analysis, link spoofing, wormhole attacks and denial of service attacks, as the network is infrastructure-less and has a highly dynamic topology. Anonymity mechanisms are used to protect the nodes against these attacks by concealing identification information of the nodes, links, traffic flows, network topology information, etc. For secured communication, secured routing and anonymity of the nodes are essential in ad hoc networks. MANET security issues include the provisions and policies adopted to prevent and monitor unauthorized access to the network and to the data. Several efficient protocols have been proposed specifically for MANET; among these, the Optimized Link State Routing protocol is suited for large and dense networks. The current OLSR scheme assumes the nodes to be trusted, and anonymity is not yet achieved through OLSR. The proposed solution achieves anonymity and security in MANET by implementing four-way handshaking between two nodes using the Host Identity Protocol and integrating it with OLSR for secured routing. The technique is expected to have less message overhead compared to classical flooding mechanisms and to increase the security level with a preferable bit rate. Overall, the technique provides anonymity and security in the MANET environment.
Keywords: MANET Routing, OLSR, MD5, HIP.

Introduction

Along with the rapid use of mobile devices, MANET is achieving great attention in the field of secured communication networks. As the environment is infrastructure-less, MANET is attractive for applications such as emergency operation, disaster recovery and so on. The nodes in a MANET can move in any direction and can join or leave the network at any time; the transfer medium is the electromagnetic spectrum. Common protocols used in wired networks are inefficient for MANETs, so dedicated protocols have been developed. Primarily two types of routing protocols are used in MANET: proactive and reactive. Proactive protocols such as the Optimized Link State Routing (OLSR) protocol



proactively maintain the routes between nodes and the route information in the routing table by propagating route updates throughout the network. Reactive or on-demand routing protocols establish routes on request; the Ad hoc On-demand Distance Vector (AODV) and Dynamic Source Routing (DSR) protocols are some examples of this architecture. MANET routing protocols enable nodes to discover routes, proactively or reactively, to the nodes they wish to communicate with in the network. Additionally, as the network is infrastructure-less, no central administration exists, and the security issues are different from those of conventional networks. MANETs need energy-efficient operation because all the nodes depend on battery power, which is a highly limited resource [3]. These features indicate the need for more secure operation in the MANET. Current routing protocols do not focus much on security aspects such as confidentiality, integrity, authenticity, availability, anonymity and non-repudiation.
Mobile ad hoc networks are highly vulnerable to eavesdroppers. Eavesdropping is a mechanism of overhearing the confidential information exchanged between two nodes, and the observer node can act as a malicious node, so mechanisms are needed to protect the communication. One of these mechanisms is the use of anonymous communications, which conceal the identities and information about the transactions between the source and destination nodes. Anonymity includes data anonymity and connection anonymity: data anonymity is hiding all the information about the data, while connection anonymity deals with hiding all the connection links and details about the source and destination.
Some works on anonymous communications have been proposed for MANET, including ANODR, ARM, MASK, AnonDSR and ASR. One authenticity protocol widely used in wireless networks is the Host Identity Protocol (HIP), which hides information about the source and destination nodes and prevents the tracking of trusted links between them. The nodes in a network can be identified by their IP address, MAC address, etc. For achieving anonymity, the information about these identities needs to be kept hidden. There arises the need for a cryptographic identifier for each node, and for each a security parameter index is added to notify a secured session between the two communicating parties involved in the conversation. Even though HIP will help to achieve anonymity, we cannot be sure about providing secure communication with minimum overhead in the network traffic. A trusted environment of greater anonymity can be achieved with the help of these exchanges in HIP. A host identity tag is attached to each node in the MANET for achieving authenticity and availability in the network [2].
Cryptographic techniques are an efficient way of securing the data packets in the network, but some of the efficient cryptographic algorithms are not suited for ad hoc networks as they may create much delay in the network and require more processing power. The proposed solution uses suitable cryptographic techniques and the hashed data using MD5, whose source code is provided by www.olsr.org. The use of the OLSR technique considerably reduces the processing time and delay on the network. As the encrypted HIP packet is exchanged only after the Diffie-Hellman key exchange between the parties is done, confidentiality is also achieved.

The concept of implementing HIP and integrating it with OLSR will help to achieve both anonymous and secure communication over the whole network, provided a compromise is made on the extra encryption needed. As OLSR is supported, it is expected to have a minimum packet overhead in the MANET environment.
The structure of this paper is the following. Section 2 refers to the existing approaches in MANET security and various proposals, Section 3 describes the necessity for anonymity and security in MANET, Section 4 describes the security architecture, and Section 5 deals with implementing HIP in the proposed solution and gives a brief overview of OLSR. Section 6 presents expected results. Conclusion and future work are added in Section 7.

Existing Approaches

There are different approaches to MANET security based on anonymous communication and secured routing using OLSR. In this section we discuss the main contributions on OLSR security.
1. A timestamp-based approach against replay attacks in the MANET is proposed by Rao.
2. SA-OLSR describes a scheme of using a trust table for 2-hop neighbours and comparing the delay to protect against wormhole attacks in the MANET; the solution is simulated with the ns2 simulator [10].
3. Panaousis proposed a solution by taking advantage of the strength of the Security Architecture for the Internet Protocol (IPSec) for implementing a secured OLSR in MANET [6].
4. A test-bed implementation of OLSR in MANET was done, and it is noted that OLSR had a good performance when nodes are in a stationary state [12].
5. Another solution for securing OLSR, proposed by Clausen, provides optimal routes and is suitable for large and dense networks, and the implemented result is also simulated [15].
Many anonymous routing protocols have also been proposed specifically for MANET.
1. Anonymous routing schemes like Anonymous On Demand Routing (ANODR), Anonymous Dynamic Source Routing (AnonDSR), MASK and the Secure Distributed Anonymous Routing Protocol (SDAR) have been proposed [3].
2. ANODR uses pseudonyms instead of real identities for route discovery and achieves anonymity by hiding the identities of the intermediate nodes in the route [5].
3. AnonDSR employs anonymous dynamic source routing using an onion between the source and destination, and each intermediate node owns a shared session key with the source and destination nodes.
4. MASK can establish multiple routes for data transmission by indicating the real identity of the destination node in the route request packet [9].


Many approaches already exist to deal with how to protect the MANET from wormhole attacks, colluding misrelay attacks and denial of service attacks, and the need for achieving anonymity is also a concern. Hence the objective of the proposed idea, anonymous and secured communication using OLSR in MANET, is to provide untraceable communication links using hidden IP addresses and secured routing by OLSR.

Necessity for Achieving Anonymity and Security

Each node in a mobile ad hoc network is free to move in any direction. These nodes can be laptops, mobile phones, personal digital assistants, MP3 players or PCs, which can be located in a car or anywhere, or carried by people having small electronic devices, as there is no centralised control over the network traffic [2]. Routing is one of the critical issues in MANET, and hence it is necessary to focus more on the performance analysis of routing protocols. The delay, throughput and workload in the network measure the efficiency of the routing protocols. Among the routing protocols proposed specifically for MANET, AODV shows the best performance in low and medium node density, while DSR is suited for higher delivery ratio and throughput [14]. OLSR performs in a suitable manner for video streaming in dense networks.
There are many kinds of anonymity in a network, such as sender anonymity, receiver anonymity, connection anonymity, data anonymity and localization anonymity; of these, relation anonymity such as connection anonymity is the most concerned. As already mentioned, mobile ad hoc networks are more vulnerable to security problems than wired conventional networks, so the various issues in MANET should be discussed, as there is no predefined boundary, no centralised control like a base station, a limited energy resource of battery power, etc.
Also, different security criteria are explored, such as availability (whether the nodes are available in the network), authenticity (to prove the identities of the parties), integrity (to be achieved so that no modification is done to the message), non-repudiation, and confidentiality (so that an unauthorised person cannot view the message) [8].
Malicious nodes make routing services their target because routing is one of the important services in the MANET. There are two flavours of routing attack: one is to attack the routing table and the other is to attack the node links and thus the packet delivery mechanisms. So the information about the connection and the data should be kept private in the network. The first is aimed at blocking the packets at some nodes without routing them. The second case is attacking the packets to be delivered, which includes both passive and active attacks. Various routing attacks exist, such as the wormhole attack, in which a node in the network itself acts as a malicious node; eavesdropping, in which a trusted node may observe the other nodes and may attack confidential information like location, private key, session key, etc.; DoS attacks due to low battery power; colluding misrelay attacks; etc.


An intrusion detection system in the MANET includes a distributed framework to detect the presence of attackers in the network. Every node in the MANET participates in the process of intrusion detection and thus detects signs of intrusion locally and independently. A cluster-based intrusion detection system groups the nodes in the same radio range into clusters, and using these clusters the intruder in a cluster is detected. Every node should be a member of at least one cluster.
There are two types of anonymity: data anonymity, so that information about the data is hidden, and connection anonymity, so that information about the links and paths between the neighbours is hidden. Security involves achieving confidentiality, integrity and non-repudiation. Thus, for providing secured and anonymous communication, all the information about the identities should be kept private and the messages should be protected against routing attacks.

Security Architecture of the Proposed System

Security architecture of the proposed system is as shown below.

Fig. 1. Proposed system architecture

The Host Identity Protocol acts as a filter above the network layer in the ISO/OSI architecture. Even though the IP addresses of the nodes are used to forward the data packets in the network, for achieving anonymity we need to hide this identity, and instead we use a host identity tag. From the transport layer onwards we refer to the nodes with the HIT tag to hide the identity of the nodes. One advantage of this HIT tag is that hosts can name each other with unique identities, and a host can change its IP address without disturbing the transport layer connection security levels [16]. By using the HIP architecture, the nodes do not need to be aware of certificate authorities on the data. Here, in the network layer, the data packets are exchanged by applying a hash function on the data, and an integrity check value (ICV) is added to the data. Through the IP address, the ICV is given to the HIP layer, which is above the network layer. The host identity tag corresponding to the IP address is generated at this level. Now, above

150

A.A. Arifa Azeez, E. Isaac, and S.M. Thampi

the hashed message the IP address is placed, and above that the host identity tag. This HIT is encrypted with the destination node's public key and transferred through the port in the transport layer.

Four Way Handshaking Using HIP

With HIP, the security architecture is based on four-way handshaking between two nodes, an initiator and a responder, as shown below. The HIP base exchange (BEX) consists of four-way handshaking between initiator and responder. Before a base exchange can be established, the initiator must know the address of the responder, obtained with the help of DNS or LDAP. The initiator in HIP need not be the sender itself.
As shown in Fig. 2, a trigger message is sent from the initiator to the responder for initiating the authentication checking with the responder. The responder sends back a challenge request, say a puzzle, with an encrypted session id, timestamp and lifetime for the request. As it is encrypted with the public key of the initiator, and the session id used here is a unique identifier for the particular session, confidentiality can be achieved. The initiator sends back a response to the challenge request, say the solution to the puzzle, and exchanges an encrypted session key with the timestamp and lifetime of the packet. The session key is encrypted with the public key of the responder. The responder sends back the IP address with the timestamp, encrypted with the session key which is known only to the initiator and responder. This is termed the host identity tag. The security levels are shown in the figure below.
When the HIT has reached the initiator, the initiator starts to forward the data packet, which is encrypted with the destination node's public key. The integrity check value is added to the message by applying a hashing technique on the message. The data packet to the destination is as shown below.

Fig. 2. Four-way handshaking


Fig. 3. Security levels

Fig. 4. Data to destination
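The four messages described above can be modelled roughly as follows. This is only an illustrative sketch of the exchange as the paper describes it, with placeholder cryptographic helpers (encrypt and solve_puzzle are assumed names); it is not the actual HIP base exchange implementation.

```python
import os, time

def encrypt(key, payload):   return ("enc", key, payload)   # placeholder crypto
def solve_puzzle(puzzle):    return ("solution", puzzle)    # placeholder puzzle solver

def four_way_handshake(initiator_pub, responder_pub, session_key, ip_addr):
    # 1. I -> R : trigger message to start the authentication check
    trigger = {"type": "trigger"}
    # 2. R -> I : puzzle with encrypted session id, timestamp and lifetime
    session_id = os.urandom(8).hex()
    challenge = encrypt(initiator_pub, {"puzzle": "p", "sid": session_id,
                                        "ts": time.time(), "lifetime": 60})
    # 3. I -> R : puzzle solution plus an encrypted session key
    response = encrypt(responder_pub, {"solution": solve_puzzle("p"),
                                       "key": session_key, "ts": time.time()})
    # 4. R -> I : IP address and timestamp protected with the session key;
    #    this acts as the host identity tag (HIT) used afterwards
    hit = encrypt(session_key, {"ip": ip_addr, "ts": time.time()})
    return trigger, challenge, response, hit
```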

Thus the process of hiding the identities of the endpoints in the ad hoc network is achieved by using the Host Identity Protocol, while protection against routing attacks cannot be guaranteed.

OLSR

The Optimized Link State Routing Protocol (OLSR) operates as a table-driven, proactive protocol. Each node selects a set of its one-hop neighbor nodes as multipoint relay (MPR) nodes so that it covers all strict 2-hop neighbours, and these are responsible for forwarding control traffic in the entire network. HELLO packets are periodically exchanged for link sensing in the network. A node's HELLO message contains its own address, a list of its 1-hop neighbors and a list of its MPR set. Topology Control (TC) messages are used for calculating the routing table. Each node which is selected as an MPR node periodically generates TC messages containing its MPR selectors, and only its MPR nodes are allowed to forward TC messages. Upon receiving TC messages from all MPR nodes in the network, each node learns all nodes' MPR sets and hence obtains knowledge of the whole network topology. Based on this topology, nodes are able to calculate the routing table.
The basic layout of the OLSR packet is shown as follows.
The Packet Sequence Number must be incremented by one each time a new OLSR packet is transmitted. The Message Type indicates the type of message. Vtime indicates the validity time. Message Size indicates the size of the message, counted in bytes. The Originator Address contains the address of the node. Time to Live indicates the maximum number of hops over which a message will be transmitted. Hop Count contains the number of hops a message has attained. The Message Sequence Number is assigned by the originator as a unique identification number for each message.
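The fields listed above can be summarised in a small structure; this is a sketch only, with field names taken from the description above and no claim about exact bit widths.

```python
from dataclasses import dataclass

@dataclass
class OlsrMessageHeader:
    message_type: int        # kind of message (e.g. HELLO, TC)
    vtime: float             # validity time of the carried information
    message_size: int        # size of the message in bytes
    originator_address: str  # address of the node that generated the message
    time_to_live: int        # maximum number of hops the message may travel
    hop_count: int           # number of hops the message has already made
    message_seq_number: int  # unique identification number set by the originator

@dataclass
class OlsrPacket:
    packet_seq_number: int   # incremented by one for every new OLSR packet sent
    messages: list           # one or more message headers with their payloads
```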
Efficient integration of HIP with OLSR allows optimizing the flooding process by taking advantage of the minimum spanning tree approach defined by the multipoint relay nodes in OLSR, thus reducing the broadcasting overhead in the network.


Fig. 5. OLSR packet format

Fig. 6. TC packet header

Fig. 7. TC packet

The Topology Control message header and its contents are shown in Fig. 6 and Fig. 7. OLSR constantly maintains the delay in the routing table, and the proposed solution expects optimal packet transmission by broadcasting only to the nodes found on the minimum spanning tree path selected by the MPRs.
6.1 Advantages of the Proposed Solution

The proposed architecture addresses the issues of anonymity and security in mobile ad hoc networks. Through this approach, authenticity can be achieved by using the host identity tag, which is a concealed host identifier. OLSR may provide a minimum packet overhead, as packets are not flooded to all the nodes in the network. Thus secured routing and anonymous communication can be achieved. The message digest generated by the hash function can easily be created using the open source code provided by www.olsr.org. Even though an extra effort of encryption is needed in the case of the Host Identity Protocol, the system is expected not to have much delay, as OLSR is suitable for large and dense networks.


Conclusion and Future Scope

In conclusion, we have seen how to provide secured routing and anonymous communication in a mobile ad hoc network. Many security issues and the need for anonymity and security have been discussed. A brief idea is also given about how to prevent internal and external attacks against routing services, such as wormhole attacks, eavesdropping, denial of service attacks, colluding misrelay attacks, etc. The proposed solution provides both anonymity and security in the MANET environment, so that the performance is increased with higher throughput in the MANET traffic. Protection against routing attacks is guaranteed. Authenticity is achieved by using HIP, and as the OLSR approach uses optimal routes, the packet flooding overhead will be minimal. The proposed solution is best suited for military applications, which need a higher degree of authenticity and confidentiality. The future scope includes extending the advantages of the proposed solution to an unaltered host identity tag, so that anyone can change the IP address in a MANET environment and it will be easy to check the availability of the nodes.

References
1. Chandee, R.S.M.S., Mishra, D.K.: Security Issues in MANET: Overview. In: Proceedings of the Seventh International Conference on Wireless and Optical Communication Networks, pp. 1-4 (September 2010)
2. Khurri, A., Kuptsov, D., Gurtov, A.: On Application of Host Identity Protocol in Wireless Sensor Networks. In: Proceedings of the Seventh International Conference on Mobile Ad hoc and Network Systems, pp. 345-358 (May 2010)
3. Kumari, E.H.J., Kannammal, A.: Privacy and security on anonymous routing protocols in MANET. In: 2nd International Conference on Computer and Electrical Engineering, Dubai, pp. 433-435 (December 2009)
4. Hu, Y.-C., Perrig, A.: A survey of secure wireless ad hoc routing. IEEE Security and Privacy 2(3), 28-39 (2004)
5. Nácher, M., Calafate, C.T., Cano, J.-C., Manzoni, P.: Anonymous routing protocols: impact on performance in MANETs. In: Proceedings of the IEEE International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 1-3
6. Panaousis, E.A., Drew, G., Millar, G.P., Ramrekha, T.A., Politis, C.: A Test-Bed Implementation for Securing OLSR in Mobile Ad-Hoc Networks. International Journal of Network Security and Its Applications (IJNSA) 2, 143 (2010)
7. Hiyama, M., Ikeda, M., Barolli, L., Kulla, E., Xhafa, F., Durresi, A.: Experimental Evaluation of a MANET Testbed in Indoor Stairs Scenarios. In: Proceedings of the International Conference on Broadband, Wireless Computing, Communication and Applications, p. 678 (2010). IEEE
8. Ali, S., Ali, A.: Performance Analysis of AODV, DSR and OLSR in MANET. In: Proceedings of the Seventh International Conference on Wireless Systems, p. 34 (2009). IEEE


9. Tehrani, A.H., Shahnasser, H.: Anonymous Communication in MANETs, Solutions and Challenges (2010). IEEE
10. Kannhavong, B., Nakayama, H., Nemoto, Y., Kato, N.: SA-OLSR: Security Aware Optimized Link State Routing for Mobile Ad Hoc Networks. In: Proceedings of the International Conference on Communication Security, vol. 1, pp. 1464-1468 (2008). IEEE
11. Naït-Abdesselam, F., Bensaou, B., Yoo, J.: Detecting and Avoiding Wormhole Attacks in Optimized Link State Routing Protocol. In: Proceedings of the International Conference on Dependable Systems and Networks, vol. 46, 127(4) (2007). IEEE
12. Tønnesen, A.: Implementing and extending the Optimized Link State Routing Protocol - a report. University Graduate Centre UniK (August 2004), http://olsr.org/docs/master-pres.pdf
13. Gorantala, K.: Routing Protocols in Mobile Ad-hoc Networks. Master's Thesis in Computing Science, Sweden (2006)
14. Clausen, T., Jacquet, P.: Optimised Link State Routing Protocol - report. Network Working Group, Project Hipercom, INRIA (October 2003)
15. Roost, L.J., Toft, P.N., Haraldsson, G.: The Host Identity Protocol - Experimental Evaluation. Communication Networks (2005)
16. Karvonen, K., Komu, M., Gurtov, A.: Usable Security Management with Host Identity Protocol. IEEE Transactions (2009)

Bilingual Translation System for Weather Report (For English and Tamil)

S. Saraswathi, M. Anusiya, P. Kanivadhana, and S. Sathiya
Department of Information Technology, Pondicherry Engineering College,
Puducherry-605004, India
swathi@pec.edu, anusiya90@gmail.com, kanivadhana@gmail.com,
thiyainfotech@gmail.com

Abstract. The paper aims at developing a Bilingual Translation System for English and Tamil using a hybrid approach. We use Rule Based Machine Translation (RBMT) and Knowledge Based Machine Translation (KBMT) techniques. Since it is a bilingual translation system, both English to Tamil and Tamil to English translation are possible. The simple sentences are translated using the rules in RBMT. The complex sentences are split or converted into simple sentences using KBMT, translated using RBMT, and then processed to get text in the target language. The system is restricted to the domain of Weather Report and can be expanded to other domains in future.
Keywords: Machine Translation, RBMT, KBMT, Tagger.

1 Introduction
Machine Translation [1] is one of the important applications of Natural Language Processing. Machine Translation helps people from different places to understand an unknown language without the aid of a human translator. The language to be translated is the Source Language (SL). The language into which the source language is translated is the Target Language (TL). The major machine translation techniques are Rule Based Machine Translation (RBMT), Statistical Machine Translation (SMT) and Example-Based Machine Translation (EBMT). India is a linguistically rich area: it has 18 constitutional languages, which are written in 10 different scripts. Tamil is the most commonly used language of the south. English is very widely used in the media, commerce, science and technology and education. Many of the states have their own regional language, which is either Tamil or one of the other constitutional languages. Only about 5% of the population speaks English.
The Tamil-English Cross Lingual Information Retrieval System for Agricultural Society [2] translates Tamil to English using a statistical machine translation system. It developed a Cross Lingual Information Retrieval (CLIR) system which helps users to pose a query in one language and retrieve documents in another language. They developed a CLIR system in the agriculture domain for the farmers of Tamil Nadu which helps them to specify their information need in Tamil and to


retrieve the documents in English. That paper addressed the issue of translating a given query in Tamil to English using a machine translation approach.
The Cross Lingual Information Retrieval System for English, Tamil and Hindi [3] is a system which developed a query engine to retrieve the solution for a given query from many other languages apart from the query language, using the concept of ontology on the domain of festivals.
Electronic Dictionary Research (EDR) [4], by the Japanese, is one of the most successful machine translation systems. This system has taken a knowledge-based approach in which the translation process is supported by several dictionaries and a huge corpus. While using the knowledge-based approach, EDR is governed by a process of statistical machine translation. As compared with other machine translation systems, EDR is more than a mere translation system and provides lots of related information.
AU-KBC developed a Tamil-English Cross Lingual Information Retrieval Track [5] for news articles taken from The Telegraph, an English news magazine in India. All these organizations have developed their CLIR systems using a word-by-word translation approach in the news domain.
An Efficient Interlingua Translation System for Multi-lingual Document Production [6] describes KANT, a system that reduces this requirement to produce practical, scalable, and accurate KBMT applications. First, the set of requirements is discussed, then the full KANT architecture is illustrated, and finally results from a fully implemented prototype are presented.
The drawbacks of the existing systems are discussed below. In Statistical Machine Translation, a large bilingual corpus is needed, which increases the space complexity of the entire system. Even a slight variation in the input sentence causes a new rule to be written into an existing Rule Based Machine Translation system. All the existing translation systems concentrate on the addition of new rules into the system, but none address code reusability. In the existing Rule Based Machine Translation systems, the number of rules written does not always cover the entire domain. All the existing machine translation projects concentrate only on the translation of simple sentences, and complex sentences are left unnoticed.

2 Proposed Work
The proposed system is a hybrid machine translation system. The paper uses Rule Based and Knowledge Based machine translation techniques. In the proposed system the source language sentence is given as input. It is then given as input to the morphological analyzer, which returns the part of speech of each word in the sentence.
If the input sentence is complex, it is given as input to KBMT, where it is split or converted into simple sentences which are then given as input to RBMT and translated to the target language. If the input sentence is simple, it is directly translated using RBMT.
First, the user is asked to choose between the languages Tamil and English. The user can select the language in which he/she is most comfortable. After getting the source language sentence from the user, the given input sentence is tagged using a tagger. A tagger is software that takes a sentence as input, separates the words in that sentence, tries


to identify the parts of the sentence like nouns, verbs, prepositions, etc., and returns every word in the sentence along with its type as output.
The sentence given by the user is fed as input to the appropriate tagger. The tagger tags the input sentence and returns the parts of speech of the sentence. From the output of the tagger, the verbs, nouns, tense and pattern of the input sentence are identified.
Sample Input Sentence:
It is raining.
Win Tree Tagger Output:
It        PP     it
is        VBZ    be
Raining   VVG    rain
.         SENT

Rule Based Machine Translation System


If the sentence is simple, it is given directly as input to RBMT. In the Rule Based
Machine Translation System, the tagged text is given to the sentence type analyzer.
The rule database consists of various sentence patterns.
Sentence Pattern:
PP VBZ VVG
Mapping to Target Language:
The individual words are then mapped to their equivalents in the target language, which
are stored in the Bilingual Dictionary. The Bilingual Dictionary consists of equivalent
words in the source and target languages.

In the Bilingual Dictionary, the verbs and nouns are stored in separate tables so as to
avoid ambiguity. For instance, in the sentence above, the root word "rain" exists both as
a noun and as a verb; once we store the verbs and nouns in separate tables, we can easily resolve
the conflict in searching for the words and can thus reduce ambiguity.
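As an illustration of this design choice, the sketch below (a minimal Python example; the table contents, sample entries and function name are hypothetical and not taken from the system) shows how keeping verbs and nouns in separate tables lets the POS tag produced by the tagger decide which table to consult, so a root word such as "rain", which exists both as a noun and as a verb, is resolved without ambiguity.

```python
# Hypothetical bilingual dictionary with separate noun and verb tables.
# The target-language strings are romanized placeholders, not real entries.
NOUN_TABLE = {"rain": "mazhai", "weather": "vaanilai"}
VERB_TABLE = {"rain": "peyyum", "be": "irukku"}

def lookup(root_word, pos_tag):
    """Choose the table based on the POS tag returned by the tagger."""
    if pos_tag.startswith("V"):      # VBZ, VVG, VVD, ... -> verb table
        table = VERB_TABLE
    else:                            # NN, NP, PP, ...    -> noun table
        table = NOUN_TABLE
    return table.get(root_word, root_word)   # fall back to the source word

# Tagger output for "It is raining": (word, tag, root word)
tagged = [("It", "PP", "it"), ("is", "VBZ", "be"), ("raining", "VVG", "rain")]
print([lookup(root, tag) for _, tag, root in tagged])
```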

For English, the WinTree Tagger is used to return the parts of speech of the
sentence. For Tamil, the Atcharam tagger is used, which splits the sentence into words
and returns the part of speech of every word.
Knowledge Based Machine Translation
If the sentence is complex, it is first given to the Knowledge Based
Machine Translation system. Here, the sentence is changed/split into its equivalent
simple sentence or sentences. The simple sentences are then translated to the target
language using the Rule Based Machine Translation system. The complexity of the
sentence is decided by the test cases. Possible test cases include sentences with
"and", "due to", "because of", "in the case of", etc.

Fig. 1. Overall architecture diagram of the proposed system

In the Knowledge Based Machine Translation System, the source language text is
first subjected to the process of tokenization. Tokenization involves converting a text
from a single string to a list of tokens. Each token can refer either to an individual
occurrence of a word or to an abstract vocabulary item.


These words are then tagged so as to find the part of speech of each word in the input
sentence. The tagged words are then given as input to the lemmatization process.
In lemmatization, the meaning of the sentence is analyzed and the sentence is
split/converted into simple sentences based on its syntactic meaning.
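A rough sketch of this complex-to-simple conversion is given below (Python; the keyword list comes from the test cases mentioned earlier, while the simple partition-based splitting rule is an illustrative assumption rather than the authors' lemmatization procedure).

```python
# Keywords that mark a sentence as complex, taken from the test cases above.
COMPLEX_MARKERS = ["and", "due to", "because of", "in the case of"]

def is_complex(sentence):
    s = sentence.lower()
    return any(marker in s for marker in COMPLEX_MARKERS)

def split_simple(sentence):
    """Very rough split of a complex sentence into simple clauses.
    The real system analyzes the tagged words and the syntactic meaning."""
    s = sentence.rstrip(".")
    for marker in COMPLEX_MARKERS:
        if marker in s.lower():
            left, _, right = s.lower().partition(marker)
            return [left.strip().capitalize() + ".", right.strip().capitalize() + "."]
    return [sentence]

sentence = "It is raining and the wind is strong."
clauses = split_simple(sentence) if is_complex(sentence) else [sentence]
print(clauses)   # each clause would then be translated by the RBMT module
```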

3 Parameters for Testing


The working of the system is tested by considering the following parameters.
MOS (Mean Opinion Score)
The mean of the opinion scores obtained from the users of the system.
MOS = (OS1 + OS2 + OS3 + ... + OSn) / n
where OS is the Opinion Score.
MOS and OS range from 0 to 10.
Precision
The number of sentences translated correctly divided by the number of sentences given as input.
Precision = (number of sentences translated correctly) / (total number of sentences given as input)
It ranges from 0 to 1.

4 Results
The following are the results of the proposed system.
Table 1. Performance measure for simple sentences for English to Tamil Translation System
SENTENCE TYPE | TOTAL NO. OF SENTENCES GIVEN | OUTPUT OBTAINED | PRECISION (%) | MEAN OPINION SCORE
PP VBD/VBZ VVG | 10 | 10 | 100 | 10
DT NN VHZ VVN | 13 | 12 | 92.30 | 9.5
DT NN MD VV | 11 | 10 | 90.9 | 9.2
DT NN VVD | 10 | 10 | 100 | 10
NN NN IN DT NP NN | 10 | - | 80 | 8.9
NN VHP VVN RB | 12 | 10 | 83.3 | 8.5
DT JJ NN NN VHP VVN RB | - | - | 80 | 8.2
DT NN VBD/VBZ JJ | 20 | 20 | 100 | 10
JJ NN NNS/VVZ CD NNS IN NP | - | - | 88.8 | 8.9
NP/NN VVD NN IN NP | - | - | 87.5 | 8.3
NP VVZ/VVD JJ NN | 20 | 20 | 100 | 10
JJ NN VVZ NN NN IN NP | - | - | 100 | 10
NP VVZ/VVD NN | 10 | 10 | 100 | 10
JJ NN VBZ/VBD VVN IN NP | 10 | 10 | 100 | 10
DT JJ NN VVN VVD CD NN | - | - | 83.33 | 8.5
NN VVN/VVD/VVZ VBZ CD NN | - | - | 87.5 | 8.8
JJ NN VVD DT VBD CD NN | 12 | 10 | 83.33 | 8.5
JJ NN VVD/VVZ DT NN VBD CD NN | - | - | 100 | 10

Precision of RBMT = 91.415.


Table 2. Performance measure for complex sentences for English to Tamil Translation System

SENTENCE TYPE | TOTAL NO. OF SENTENCES GIVEN | OUTPUT OBTAINED | PRECISION (%) | MEAN OPINION SCORE
Sentence containing the keyword "and" between two sentences | - | - | 80 | 8.5
Sentence containing the keyword "due to" | - | - | 83.33 | -
Sentence containing the keyword "and" between two parameters | - | - | 80 | 7.9
Sentence containing the keyword "Because of" | - | - | 83.33 | 8.5

Precision of KBMT = 81.65.



Table 3. Performance measure for Tamil to English Translation System

SENTENCE TYPE | TOTAL NO. OF SENTENCES GIVEN | OUTPUT OBTAINED | PRECISION (%) | MEAN OPINION SCORE
^<noun> ^<FV> | 15 | 14 | 93.33 | 9.5
^<noun> ^<Vvp>+s ^<FV> | 15 | 14 | 93.33 | -
^<adv> ^<noun> ^<FV> | 15 | 13 | 86.66 | 9.7
^<verb> <Nins> ^<adj> ^<noun> ^<FV> | 10 | - | 80 | -
^<noun> ^<Num> <entity> <entity> ^<conj> ^<FV>/<NN> | - | - | 83.33 | 8.5
<entity> ^<adj> ^<noun> ^<Vinf> ^<FV> | 10 | - | 90 | -
^<adj> ^<noun> ^<Vinf> ^<FV> | 10 | - | 90 | 9.3
<entity> <entity> ^<par> <entity> ^<adj> ^<noun> ^<vinf> ^<FV> | - | - | 83.33 | -

Precision of Tamil to English Translation System = 87.49.

Fig. 2. Performance measure for simple sentences


P1 PP VBD/VBZ VVG
P2 DT NN VHZ VVN
P3 DT NN MD VV
P4 DT NN VVD
P5 NN NN IN DT NP NN
P6 NN VHP VVN RB
P7 DT JJ NN NN VHP VVN RB
P8 DT NN VBD/VBZ JJ
P9 JJ NN NNS/VVZ CD NNS IN NP
P10 NP/NN VVD NN IN NP
P11 NP VVZ/VVD JJ NN
P12 JJ NN VVZ NN NN IN NP
P13 NP VVZ/VVD NN
P14 JJ NN VBZ/VBD VVN IN NP
P15 DT JJ NN VVN VVD CD NN
P16 NN VVN/VVD/VVZ VBZ CD NN
P17 JJ NN VVD DT VBD CD NN
P18 NN VVN VBZ CD NN IN NNS
P19 JJ NN VVD/VVZ DT NN VBD CD NN

Fig. 3. Performance measure for complex sentences

P1 Sentence containing a keyword and between two sentences


P2 Sentence containing a keyword due to
P3 Sentence containing a keyword and between two parameters
P4 Sentence containing a keyword Because of

Fig. 4. Performance measure for Tamil to English Translation System


P1 ^<noun> ^<FV>
P2 ^<noun> ^<Vvp>+s ^<FV>
P3 ^<adv> ^<noun> ^<FV>
P4 ^<verb> <Nins> ^<adj> ^<noun> ^<FV>
P5 ^<noun> ^<Num> <entity> <entity> ^<conj> ^<FV>/<NN>
P6 <entity> ^<adj> ^<noun> ^<Vinf> ^<FV>
P7 ^<adj> ^<noun> ^<Vinf> ^<FV>
P8 <entity> <entity> ^<par> <entity> ^<adj> ^<noun> ^<vinf> ^<FV>
Since Tamil is a free word order language, the morphological analyzer
does not always tag the sentence in the same way. Hence the precision of the Tamil to
English translation system is lower than that of the English to Tamil
translation system.

5 Conclusion
We presented a Bilingual Translation System which translates a given input sentence in the
source language into the target language using a hybrid approach. New rules can be
added to the proposed system in order to make the system more efficient. This work
can be extended to other domains with the addition of new rules.

References
1. http://en.wikipedia.org/wiki/Machinetranslation
2. Thenmozhi, D., Aravindan, C.: Tamil-English Cross Lingual Information Retrieval System for Agricultural Society. Department of Computer Science & Engineering, SSN College of Engineering, Chennai, India (2009)
3. Saraswathi, S., Asma Siddhiqaa, M., Kalaimagal, K., Kalaiarasi, M.: Cross Lingual Information Retrieval System for English, Tamil and Hindi. Department of Information Technology, Pondicherry Engineering College (2009)
4. Toshio, Y.: The EDR electronic dictionary. Communications of the ACM 38(11), 42-44 (1995)
5. Rao, P.R.K., Sobha, L.: AU-KBC FIRE2008 Submission - Cross Lingual Information Retrieval Track: Tamil-English. In: First Workshop of the Forum for Information Retrieval Evaluation (FIRE), Kolkata, pp. 1-5 (2008)
6. Mitamura, T., Nyberg, E.H. III, Carbonell, J.G.: An Efficient Interlingua Translation System for Multi-lingual Document Production. Center for Machine Translation, Carnegie Mellon University, Pittsburgh, PA 15213
7. Apertium Machine Translation system, http://www.apertium.org/
8. Gosling, J.: A brief history of the Green project. java.net, no date (ca. Q1/1988) (retrieved April 29, 2007)


9. http://www.softpedia.com/get/Others/Home-Education/Wintree.html
10. Open Source Java Technology Debuts In GNU/Linux Distributions. Sun Microsystems (2008), http://www.sun.com/aboutsun/pr/2008-04/sunflash.20080430.1.xml (retrieved May 2, 2008)
11. Systran Information and Translation Technologies, http://www.systransoft.com/About/
12. Hegde, J.J.: Machine Translation in India. NCST, Mumbai, http://kshitij.ncst.ernet.in/~jjh/mainpage_sections/Writings/mt4clir.txt
13. Bharathi, A., Chaitanya, V., Kulkarni, A.P., Sangal, R.: Anusaaraka: Overcoming the language barrier in India. In: Nair, R.B. (ed.) To appear in "Anuvad: Approaches to Translation". Sage, New Delhi (2001)

Design of QRS Detection and Heart Rate Estimation System on FPGA
Sudheer Kurakula, A.S.D.P. Sudhansh, Roy Paily, and S. Dandapat
Department of Electronics and Electrical Engineering,
Indian Institute of Technology Guwahati
{a.sudhansh,roypaily}@iitg.ernet.in

Abstract. Electrocardiogram (ECG) is a representative signal containing
information about the heart. The main tasks in ECG signal analysis are the
detection of the QRS complex & the estimation of the instantaneous heart rate by
measuring the time interval between two consecutive R-waves. An algorithm
based on wavelet transforms, which uses the quadrature spline wavelet for the
detection of the QRS complex, is developed & implemented on FPGA. The
proposed system can operate at a maximum throughput of 52.662
MSamples/sec in real time, while the sampling rate of the ECG signal is only
a few hundred samples/sec. Thus the system can work both online and offline
at the maximum possible throughput.
Keywords: Electrocardiograph (ECG), Wavelet Transform (WT), Field
Programmable Gate Array (FPGA), Algorithme à Trous, QRS complex, P and
T waves.

1 Introduction
Electrocardiograph (ECG) is a noninvasive recording of the electrical activity of the
heart and it represents the signal containing information of the heart. The main tasks
in ECG signal analysis are the detection of QRS complex & the estimation of
instantaneous heart rate by measuring the time interval between two consecutive R-waves. There are many hardware implementation approaches to ECG monitoring
systems. A low-cost microcontroller-based Holter recorder implemented with off-the-shelf components was reported in [1]. There are DSP based Medical Development
Kits which include a board, emulator and complete integrated development platform
for cases like Electrocardiogram (ECG), Pulse Oximeter (PO) and Digital Stethoscope
(DS) [2]. The DSP is a specialized microprocessor, typically programmed in C and is
well suited to extremely complex maths-intensive tasks. It is limited in performance
by the clock rate. In contrast, an FPGA is an uncommitted "sea of gates". The device
is programmed by connecting the gates together to form multipliers, registers, adders
and so forth. Many blocks can be very high level ranging from a single gate to an
FIR or FFT. Their performance is limited by the number of gates they have & the
clock rate. FPGA is more advantageous than DSP chip because of its low cost &
reconfigurable property. Recent FPGAs have included Multipliers especially for


performing signal processing tasks more efficiently. There are also customized
System on Chip (SoC) approaches for biomedical signal processing for portable brain-heart
monitor systems. Through SoC integration, the bulk associated with interfacing
circuitry can be reduced, allowing for the miniaturization & realization of portable,
power efficient brain-heart monitor systems [3]. An FPGA based ECG QRS
complex detection with programmable hardware was developed by Chio In Ieong
et al. [4].
This paper proposes a new architecture for the FPGA implementation of Wavelet
Transforms by exploiting the properties of quadrature spline wavelets. The Wavelet
Transform (WT) is very useful because of its excellent localization properties
in time and frequency analysis. The WT can characterize the local
regularity of signals and can be used to distinguish ECG waves from serious noise,
artifacts and baseline drift. An algorithm based on the WT for detecting the QRS
complex and the P & T waves is proposed in [5]. Either continuous or discrete WT
decomposes finite energy signals into a set of basis functions generated from shifted
and scaled versions of a prototype wavelet function. This approach can be conveniently
extended to denoising, analyzing and extracting ECG signals,
not only in heart illness diagnosis but also in ECG signal processing research.
Moreover, these tools can also be used in other biomedical signal processing
applications such as Magnetic Resonance Imaging (MRI) & Electroencephalography
(EEG). Based on the wavelet transform approach [6], [5] describes a methodology for
QRS detection & [7] addresses the detection of the QRS complex on a DSP chip. FPGA
implementation of the Discrete Wavelet Transform offers innumerable possibilities
for the use of these transforms in a variety of applications, such as the determination of
various characteristics like QRS complex detection & heart rate estimation. The FPGA
implementation can also be made capable of checking for various abnormalities in the
functioning of the heart and of compressing the signal for storage &
transmission. In addition, the FPGA architecture can utilize less VLSI
area and power and can use the hardware efficiently. This paper is organized as follows.
Section 1 gives the introduction and section 2 presents the basics of WT. Section 3
states the algorithm used & its implementation details. Results obtained are
summarized in section 4 and finally section 5 concludes the paper.
The architecture used for developing the Daubechies DWT [9] uses only a single
filter, which is the most hardware-expensive part; therefore, the spline dyadic
wavelet transform is implemented here. In [10] an algorithm was proposed for the
implementation of the fast dyadic wavelet transform, which is implemented
using filter banks. It also proposes an efficient algorithm for QRS
detection and for the P and Q waves, though the algorithm works only in a few cases. These
waves are not very generalized, and the algorithm fails to detect them
whenever two peaks lie in the search window or when the T wave merges with the P
wave. Also, the width of the individual waves is not taken into consideration
for the detection of these waves, which may lead to errors. A threshold based method is used,
which fails for T wave detection as the average amplitude of the T wave is not fixed.
LabVIEW and the signal processing-related toolkits can provide a robust and efficient
environment and tools for resolving ECG signal processing problems [11].


2 Basics of Wavelet Transform Approach


Since our implementation is based on the Wavelet Transform approach, in this section
a brief analysis on different types of transforms on wavelets is discussed.
2.1 Discrete Wavelet Transform
In the Discrete Wavelet Transform (DWT) the transform variables a and b are
discretized instead of the independent variable t. The scaling parameter a is
discretized as a = 2^j, which is called dyadic sampling. The expression for the DWT is given in (1).
Moving up in scale results in higher time resolution and poorer frequency
resolution, and moving down the scale gives the opposite. It is equivalent to decomposing the signal
into different frequency bands at different time intervals, by passing the signal
through a bank of constant-Q band pass filters. Multi Resolution Analysis (MRA) is
the design method of most of the practically relevant DWTs. The signal is decomposed
(scaled) level by level into an approximation part found using the scaling function and
a detail part found using the wavelet function. This is equivalent to passing the signal
through a series of low pass filters, each with a cut-off frequency lower than that of the previous
filter, and applying the wavelet function at each filter stage.
2.2 Fast Wavelet Transform
S. Mallat [9] proposed a method for implementing the DWT using pyramidal structures,
where the time dilation is accomplished by down sampling at every stage of the
wavelet decomposition, as shown in Figure 1. The signal is convolved with the
impulse response of the wavelet function (high pass filter). The scale parameter controls
the rate of decay of the mother wavelet; h[n] are the coefficients of the high pass filter,
whereas g[n] are the coefficients of the low pass filter. After each high pass and low
pass filter, down sampling is applied for the time dilation. The signals
obtained after passing through the jth level high pass & low pass filters are called the
detail and approximation coefficients, respectively, at the jth level. The main disadvantage of
this approach is its lack of translation invariance, caused by the sub-sampling. When a
pattern is translated, its numerical descriptors should be translated but not modified.
Indeed, a pattern search is particularly difficult if its representation depends on its
location. Therefore, we have used the Dyadic Wavelet Transform approach.

Fig. 1. Implementation of DWT using Mallat Algorithm [9]


2.3 Dyadic Wavelet Transform


If the scale parameter is restricted to the set of integral powers of 2, i.e., a = 2^j (j ∈ Z, where Z is the
set of integers), then the wavelet is called a dyadic wavelet. The dyadic wavelet transform is a
translation invariant representation that does not sample the translation factor b.
(2)

2.4 Spline Dyadic Wavelets


A box spline of degree m is a translation of m + 1 convolutions of 1[0,1] with itself. It is
centered at t = 1/2 if m is even and at t = 0 if m is odd. Its Fourier transform is given
as
(3)
where ε = 1 if m is even and ε = 0 if m is odd.
So,
(4)

And,
(5)
The Fourier transform of the resulting wavelet is
(6)

(7)

2.5 Quadratic Spline Function


Quadratic Spline Function is the case for the spline dyadic wavelet of order two
(m=2). The corresponding frequency characteristics and frequency response of the
filters are given as
(8)
Producing the coefficients


The filter coefficients are normalized to have the sum of their absolute values to 1.
Therefore the coefficients used for the filters are
H = [0.125 0.375 0.375 0.125] and G = [-0.5 0.5].
2.6 Algorithme à Trous
An algorithm was proposed for the implementation of the fast dyadic wavelet transform
[7]. The fast dyadic wavelet transform is implemented using filter banks. For a given
filter x with coefficients x[n], xj[n] denotes the filter obtained by inserting 2^j - 1 zeroes
between every pair of x coefficients (hence the French name "algorithme à trous"; trous
means "holes"). Its Fourier transform is x̂(2^j ω).
For any j > 0,
(9)
(10)

3 Algorithm and Implementation


The ECG signal is passed through the bank of filters as shown in Fig. 2. For the
detection of the QRS complex, the wavelet coefficients at scale 4 are used, since most of the
energy of the QRS complex is present in this frequency band. Thus only the filters in the
path of wavelet transform scale four are implemented. The coefficients of the filters
used in the calculation of the scale 4 wavelet coefficients are
H0 = [0.125 0.375 0.375 0.125 ],
H1 = [0.125 0 0.375 0 0.375 0 0.125],
H2 = [0.125 0 0 0 0.375 0 0 0 0.375 0 0 0 0.125 ] and G3 = [-0.5 0 0 0 0 0 0 0 0.5]

Fig. 2. Filters implemented in the pyramidal structure


The given ECG signal is passed through the series of these filters to obtain the scale 4
wavelet coefficients.
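A software model of this filter chain is sketched below (Python with NumPy; it is an illustrative reconstruction of the cascade using the coefficients listed above, not the FPGA description itself). Each smoothing stage convolves the running output with the next zero-padded filter, and the final detail filter G3 yields the scale-4 wavelet coefficients.

```python
import numpy as np

# Filter coefficients from the paper (quadratic spline wavelet, a trous scheme).
H0 = [0.125, 0.375, 0.375, 0.125]
H1 = [0.125, 0, 0.375, 0, 0.375, 0, 0.125]
H2 = [0.125, 0, 0, 0, 0.375, 0, 0, 0, 0.375, 0, 0, 0, 0.125]
G3 = [-0.5, 0, 0, 0, 0, 0, 0, 0, 0.5]

def scale4_coefficients(ecg):
    """Cascade the smoothing filters H0-H2, then apply the detail filter G3."""
    a = np.asarray(ecg, dtype=float)
    for h in (H0, H1, H2):
        a = np.convolve(a, h, mode="same")    # successive low-pass stages
    return np.convolve(a, G3, mode="same")    # scale-4 detail (wavelet) coefficients

if __name__ == "__main__":
    # Synthetic stand-in signal; real input would be sampled ECG data.
    t = np.linspace(0, 1, 360)
    synthetic = np.sin(2 * np.pi * 1.2 * t)
    print(scale4_coefficients(synthetic)[:5])
```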
3.1 Zero Crossing Detection
The first stage in the detection of QRS complex is to detect the zero crossing points in
the wavelet coefficients. A zero crossing point is detected wherever the wavelet
coefficient value is zero or has a different sign from the previous value.
3.2 Detection of R Peaks
Threshold based detection is used to determine the R peak. Detection is based on two
thresholds: a minimum threshold and a maximum threshold. If there exists a
coefficient with absolute value greater than the minimum threshold on both the
negative and positive sides of the zero crossing point, and one coefficient on either side of
the zero crossing has a value greater than the maximum threshold, then such a
point is detected as a valid R peak.
3.3 Rate Determination
The number of clock cycles between one R peak and the next is counted. Let it be
n and let the sampling rate be N; then the heart rate in bpm (beats per minute) is given as
(N/n) * 60.
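The three steps above can be modelled in software as follows (Python/NumPy; the threshold values are illustrative, and only the samples immediately neighbouring the zero crossing are examined here, whereas the hardware searches a window on each side of the crossing):

```python
import numpy as np

def detect_r_peaks(w4, min_thr, max_thr):
    """R peaks at zero crossings of the scale-4 coefficients that are flanked by
    values above min_thr on both sides, one of which also exceeds max_thr."""
    peaks = []
    for n in range(1, len(w4) - 1):
        zero_cross = w4[n] == 0 or (w4[n] * w4[n - 1] < 0)
        if not zero_cross:
            continue
        left, right = abs(w4[n - 1]), abs(w4[n + 1])
        if left > min_thr and right > min_thr and max(left, right) > max_thr:
            peaks.append(n)
    return peaks

def heart_rate_bpm(peaks, sampling_rate):
    """bpm = (N / n) * 60, with n samples between two consecutive R peaks."""
    if len(peaks) < 2:
        return None
    n = peaks[1] - peaks[0]
    return (sampling_rate / n) * 60.0

# Example use: w4 comes from the scale-4 filter bank; MIT-BIH records are 360 Hz.
# r_peaks = detect_r_peaks(w4, min_thr=0.1, max_thr=0.4)
# print(heart_rate_bpm(r_peaks, sampling_rate=360))
```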
3.4 Implementation
The block diagram of the implemented system is shown in Fig. 3. The filters in the path
of the scale 4 wavelet coefficient computation are implemented and named
H0, H1, H2 and G3. Instead of defining three different filters for H0, H1 and H2, only
one filter unit is used and the hardware is shared accordingly. Whenever there are
zero coefficients between two nonzero coefficients, a delay is used for every zero
in the incoming signal. Similarly, G3 is also implemented. After the wavelet
coefficients of scale 4 are obtained, they are passed to another unit for QRS peak
detection. Two thresholds, a minimum threshold and a maximum threshold, are used for QRS
detection. The detector circuit initially looks for the zero crossing points; when it
finds one, it searches on both sides of the zero crossing point for a
value on each side whose magnitude is greater than the minimum threshold, and
for a single value on either side that is greater than the maximum threshold, and if it finds

Fig. 3. Block Diagram of the Implemented System


such a zero crossing point, it is determined to be a valid R-peak and the signal indicating
the detection of an R-peak goes high for one clock duration. The R-peak signal is given
to the next block, which determines the heart rate in bpm (beats per minute). This unit
consists of a counter which counts the number of clock cycles between
consecutive R-peaks. The sampling frequency divided by the counted value gives
the heart rate per second, and multiplying by 60 gives the bpm value.

4 Results
ECG signals were taken from the MIT-BIH Database and figures 4 to 7 give the
corresponding graphs for the four records {100, 101, 103, 105}. Each plot shows
amplitude versus sample number, in which the first part of the graph corresponds to the ECG and
R peak points, and the second part corresponds to the D4 wavelet coefficients and R
peak points.

Fig. 4. Plot for Amplitude Vs. time of the record 100 of MIT-BIH Database

Fig. 5. Plot for Amplitude Vs. time of the record 101 of MIT-BIH Database


Fig. 6. Plot for Amplitude Vs. time of the record 103 of MIT-BIH Database

Fig. 7. Plot for Amplitude Vs. time of the record 105 of MIT-BIH Database

We analyzed the signal in different frequency bands using the wavelet transform
(WT) [12, 13]. This approach is a recently developed signal processing technique and
it appears to be well suited to this problem, for the following reasons.
1. It is flexible with respect to the frequency and shape of the analyzing wavelets.
2. It represents the signals over different scales, enabling an identification of
large-scale (low-frequency, long lasting) and small-scale (high-frequency,
short lasting) signal fragments in non-stationary signals.
3. It has good localization properties in both time and frequency domains.
It allows arbitrary concentration in one of these domains at different levels of
redundancy [13], and allows orthogonal signal decomposition [3].

In this paper the wavelet approach proposed by S. G. Mallat [9] was used. This method
decomposes the signal into an orthogonal set of coarse and fine components that
correspond in the spectral domain to sets of special low-pass and band-pass filters.
The implementation was carried out on the Spartan-3E FPGA board, and the signals
obtained at the output are Reset, Clock, heart rate and R peak detection. The
output of the ECG signal is obtained after 30 clock cycles and the implementation has the following
features: maximum frequency = 52.43 MHz, minimum time period = 19.071 ns, number
of slices = 796 out of 4656, number of slice flip flops = 533 out of 9312.
Table 1. Implementation results using records of MIT-BIH
Record no | Total Beats | False Positive | False Negative | % Accuracy
100 | 2272 | 0 | 0 | 100
101 | 1864 | 0 | 2 | 99.89
103 | 2090 | 1 | 1 | 99.99
105 | 1556 | 27 | 1 | 98.24

Records taken from the MIT-BIH database are used and the results obtained are
tabulated in Table 1. A false positive is an extra beat that is not present in the data but is detected
by the device. A false negative is a missed beat that is present in the ECG but is not detected by the
device. Accuracy is given by formula (11).
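A common form of such an accuracy measure, consistent with the false positive and false negative counts defined above (stated here as an assumption rather than the authors' exact expression), is

Accuracy (%) = ((Total beats - FP - FN) / Total beats) * 100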
The variation in accuracy across the records may be due to interference
from other biomedical signals or muscle artifacts, or it might also be
because the records were taken from different individuals. The accuracy is
reduced by various noises such as motion artifacts, respiration and muscle contraction.
Sometimes the failure of the system to detect abnormal beats may also be a reason.

5 Conclusions
An algorithm based on wavelet transforms which uses the quadrature spline wavelet
for the detection of QRS complex was implemented on FPGA. Records taken from
the MIT-BIH database were used for analysis. The work so far done covers the QRS
detection and heart rate determination. Future work includes the detection of P and T
waves and of the various abnormalities based on the data.

Acknowledgement
This work was carried out using VLSI design software and FPGA boards provided by the SMDP-II
project at IIT Guwahati.

References
1. Segura-Juárez, J.J., Cuesta-Frau, D., Samblas-Pena, L., Aboy, M.: A microcontroller-based portable electrocardiograph recorder. IEEE Transactions on Biomedical Engineering 51(9) (September 2004)
2. http://focus.ti.com/docs/toolsw/folders/print/tmdxmdkek1258.html
3. Fang, W.-C., Chen, C.-K., Chua, E., Fu, C.-C., Tseng, S.-Y., Kang, S.: A Low Power Biomedical Signal Processing System-on-Chip Design for Portable Brain-Heart Monitoring Systems. In: International Conference on Green Circuits and Systems (ICGCS), Shanghai, pp. 18-23 (2010)
4. Ieong, C.I., Vai, M.I., Mak, P.U.: FPGA based ECG QRS Complex Detection with Programmable Hardware. In: 30th Annual International IEEE EMBS Conference, Vancouver, British Columbia, Canada, August 20-24, pp. 2920-2923 (2008)
5. Li, C., Zheng, C., Tai, C.: Detection of ECG characteristic points using wavelet transforms. IEEE Trans. Biomed. Eng. 42, 21-28 (1995)
6. Yang, Z.R.: A method of QRS detection based on wavelet transforms. Master Thesis, Dept. Mech. & Electromech. Eng., National Sun Yat-Sen University, pp. 17-29
7. Sahambi, J.S., Tandon, S.N., Bhatt, R.K.P.: Using wavelet transform for ECG characterization. IEEE Eng. in Med. and Biol. 16(1), 77-83 (1997)
8. Pan, J., Tompkins, W.J.: A Real Time QRS Detection Algorithm. IEEE Trans. Biomed. Eng. 32(3), 230-236 (1985)
9. Mallat, S.G.: A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans. Pattern Analysis and Machine Intelligence 11(7), 674-693 (1989)
10. Holschneider, M., Kronland-Martinet, R., Morlet, J., Tchamitchian, P.: A Real-Time Algorithm for Signal Analysis with the Help of the Wavelet Transform. In: Wavelets, Time-Frequency Methods and Phase Space, pp. 289-297. Springer, Berlin (1989)
11. LabVIEW for ECG Signal Processing, http://zone.ni.com/devzone/cda/tut/p/id/6349
12. Rioul, O., Vetterli, M.: Wavelets and Signal Processing. IEEE Signal Processing Magazine, 14-38 (October 1991)
13. Kronland-Martinet, R., Morlet, J., Grossmann, A.: Analysis of sound patterns through wavelet transforms. Int. J. Pattern Rec. Artificial Intell. 1(2), 273-302 (1987)
14. Ieong, C.I., Vai, M.I., Mak, P.U.: QRS recognition with programmable hardware. In: The 2nd Int. Conf. Bioinformatics and Biomedical Eng. (iCBBE 2008), Shanghai, China (2008)
15. Vishwanath, M.: Discrete Wavelet Transform in VLSI. In: Proc. IEEE Int. Conf. Appl. Specific Array Processors, pp. 218-229 (1992)
16. Mallat, S.: Zero-crossings of a wavelet transform. IEEE Trans. Inform. Theory 37, 1019-1033 (1991)
17. Knowles, G.: VLSI architecture for the discrete wavelet transform. Electronics Letters 26(15), 1184-1185 (1990)

Multi-document Text Summarization in E-learning System for Operating System Domain
S. Saraswathi1, M. Hemamalini2, S. Janani2, and V. Priyadharshini2
Department of Information Technology, Pondicherry Engineering College,
Puducherry-605004, India
swathi@pec.edu,
hmalini89@gmail.com,
jananisekar@pec.edu,
priyadharshini138@pec.edu

Abstract. Query answering in E-learning systems generally means retrieving a
relevant answer for the user query. In general, conventional E-learning
systems retrieve answers from their inbuilt knowledge base. This leads to the
limitation that the system cannot work out of its bounds, i.e., it does not answer
a query whose contents are not in the knowledge base. The proposed system
overcomes this limitation by passing the query online and carrying out multi-document
summarization on online documents. The proposed system is a
complete E-learning system for the Operating Systems domain. The system
avoids the need to maintain the knowledge base, thus reducing the space
complexity. A similarity check followed by multi-document summarization
leads to a non-redundant answer. The queries are classified into simple and
complex types. Brief answers are retrieved for simple queries, whereas detailed
answers are retrieved for complex queries.
Keywords: Multi-document summarization, Information retrieval, Query
answering system, Ontology tree, POS tagger.

1 Introduction
E-learning can be defined as technology-based learning in which learning material is
delivered electronically to remote learners via a computer network. With the advent of
the information technology, loads of information is available on the World Wide Web
for the user to browse. But with more information than required, users often
leave the web with dissatisfaction rather than contentment. Auto-summarization is a
technique used to generate summaries of electronic documents. This has some
applications like summarizing the search-engine results, providing briefs of big
documents that do not have an abstract etc. There are two categories of summarizers,
linguistic and statistical. Linguistic summarizers use knowledge about the language to
summarize a document. Statistical ones operate by finding the important sentences
using statistical methods (like frequency of a particular word etc). Statistical
summarizers normally do not use any linguistic information.


All existing E-learning systems retrieve answers from a knowledge base which
is collected offline. This leads to the limitation that the system cannot think outside its
domain. The knowledge base is static in all such E-learning systems. The proposed E-learning
system, in contrast, collects documents online and tries to produce more accurate answers
by performing multi-document summarization on the documents retrieved online.
In a single-document summarizer the most important sentences are extracted from a
single source document. Multiple documents summarization consists of a cluster of
documents concerning the same topic. Several new problems arise here. For example,
because the documents are about the same topic, they can contain similar sentences.
We have to ensure that the summary does not contain any type of redundancy.
The proposed system is made a complete E-learning system by providing
an authentication facility for each user, implemented with a symmetric key encryption
algorithm [1], and by providing a tutorial section for the users on the Operating Systems
domain. The profile information of each user is encrypted and stored in
the database. On every login, the user's login information is encrypted and
checked against that in the database.
In the query answering system, the input query is processed by a Parts Of Speech
tagger [2] which detects the keywords for deciding the type of search. This leads to
concept-wise search for complex queries and keyword search for simple queries,
based on the wh-keywords obtained [3].
Locality based similarity heuristic method [4] is used to extract the answers in
which every word location in each document is scored. The quality of this approach
depends on the location of the keyword. Document retrieval based on a query
answering system [5] focuses on solving the major problems in processing the natural
language query: approaches to syntax analysis and the syntax model, the semantic model,
and the transformation mechanism from the semantic model into database queries.
Semantic summarization is performed using clustering algorithms [6] such as K-Means, in which the cluster centers become the summarized set. Semantic summarization
tends to summarize the dataset such that summarization ratio is maximized but the
error due to information loss is minimized. In Discovery Net [7], each distributed
dataset is locally summarized by the K-Means algorithm. Then, the summarized sets
are sent to a central site for global clustering. The quality of this approach is largely
dependent on the performance of K-means.
A scalable clustering algorithm [8] is proposed to deal with very large datasets. In
this approach, the datasets are divided into several equally sized and disjoint
segments. Then, the hard K-Means or Fuzzy K-Means algorithm is used to summarize
each data segment. Similar to Discovery Net, a clustering algorithm is then run on the
union of summarized sets.
A database system consists of millions of data items that could be picked out
according to their priorities by a proper approach. First, all the data is classified into different
categories. Each category is assigned a predefined priority; the higher the
priority, the more likely the information is to be chosen. Secondly, each data item can only
be visited once. In addition, writing a program to perform the task can be very
straightforward [9]. However, it is not very easy to design an algorithm that is most
efficient for all scenarios.
By searching the web based on the keywords in the given query, new pages called
composed pages [10], which contain the query words, are generated. The composed
pages are generated by extracting and binding together the relevant information from
hyper-linked web pages. By taking into account the hyperlink structure of the original
pages and the association between the keywords within each page, the authors rank
the composed pages.
The Naive algorithm is used for identifying the keywords in the document, and a page
ranking algorithm [11] is used for ranking the retrieved documents obtained online.
An Ontology tree is used for extracting the concept words under each topic. The
ontology tree is built in advance by exploring all the topics under Operating Systems. The
tree is built systematically and the sub-topics are placed under their respective topics.
For concept-wise search, the keywords from the query are passed to the ontology tree
to obtain concept words, which are then added to the keywords and are helpful in
retrieving more relevant documents online.

2 Proposed Work
The major three modules in the proposed system are User authentication, Tutorial and
Query Answering system. The major work is in the query answering module. This
module is again classified into two types: simple query and complex query. For simple
queries, documents are collected by passing the keywords in the user query, whereas for
complex queries, the concept words are passed to the search engine in addition to the
keywords in the query to obtain more relevant documents.

Fig. 1. Overall architecture diagram of the proposed system


Module description and design. The entire system is classified such that it consists
of the following major modules: User authentication, Tutorial section and Query
Answering system.
2.1 User Authentication
User authentication to the system is provided by implementing a symmetric key
encryption algorithm [1]. The profile information is encrypted and stored in the
database during the user's registration to the system. During each entry to the system,
the login information is encrypted and matched with the database. Only
authorized users are allowed to use the system.
SYMMETRIC KEY ENCRYPTION ALGORITHM
Step 1: Convert each plain character into its corresponding ASCII equivalent.
Step 2: Convert the ASCII values to their binary representation in 8 bits.
Step 3: Reverse the bits.
Step 4: Divide these bits by the secret key.
Step 5: Find the quotient and the remainder.
Step 6: Represent the remainder in the first four bits and the quotient in the last five bits of
the 9-bit cipher text.
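A direct transcription of these steps into Python is sketched below (the secret key value is an illustrative assumption; it is chosen so that the remainder fits in four bits and the quotient in five bits, as step 6 requires):

```python
def encrypt(plaintext, secret_key):
    """Per-character encryption following steps 1-6 above."""
    cipher = []
    for ch in plaintext:
        bits = format(ord(ch), "08b")        # steps 1-2: ASCII -> 8-bit binary
        reversed_bits = bits[::-1]           # step 3: reverse the bits
        value = int(reversed_bits, 2)
        quotient, remainder = divmod(value, secret_key)   # steps 4-5
        # step 6: remainder in the first four bits, quotient in the last five
        cipher.append(format(remainder, "04b") + format(quotient, "05b"))
    return "".join(cipher)

print(encrypt("pec", secret_key=11))   # 9 bits of cipher text per character
```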
2.2 Tutorial
This module provides a tutorial section for user on various topics under Operating
systems. By choosing a particular topic the total information under the topic is
provided to the user. The information is collected offline, organized to form a
complete tutorial under the domain. The topics covered are Process management,
Memory management and Storage management of Operating Systems. All the
information under these topics are consolidated to form a complete tutorial on
Operating Systems.
2.3 Query Answering System
The major part of work comes from this module. The module does not contain any
knowledge base. The system makes use of online search engines to retrieve
documents which are then processed to retrieve answers.
Input query. The input query is entered by the user using the user interface. The
input query can be on any topic under operating systems, for example "what is
operating system" or "explain memory management".
Question type classification. The input query decides the type of the query. The
question identifying keyword in the query decides which method to choose for answer
extraction.
Simple queries. Simple queries are identified by processing the user query. Queries
with words like "what", "when" and "how" are examples of simple queries. These queries are again
classified into 6 types based on the question identifying words. The 6 identifiers are "what",
"define", "what are the different types of", "how", "when" and "what are the necessary
conditions for". A different approach is followed for each type of question to
extract the answer.


Keyword extraction. Part-of-speech tagging (POS tagging or POST), also called
grammatical tagging or word-category disambiguation, is the process of marking up
the words in a text (corpus) as corresponding to a particular part of speech, based on
both its definition and its context, i.e., its relationship with adjacent and related
words in a phrase, sentence, or paragraph. Dynamic programming algorithms are
applied for tagging in the POS tagger [17]. The input query from the user is operated on
by the POS (Parts of Speech) Tagger in order to obtain the nouns, verbs and adjectives.
These words are kept as primary keywords.
Document extraction. The keywords obtained from the POS Tagger are passed to the
web browser in order to obtain the web results for the relevant query. The
obtained results are converted to text format using an HTML to text converter.
Information retrieval. The document with the highest rank obtained as the
result of the text converter is matched against the primary keywords using the
Naive Algorithm [16]. The naive algorithm searches every location of the
string t for the pattern p. This way the most relevant answer is found based on the
weightage of the passages. Different approaches are followed for the different
question types. For the question identifying words "what" and "define", the highest ranked
document is evaluated with the primary keywords to find the relevant passage. For
queries with the "how", "when" and "what are the types of" identifiers, the concept words are
extracted from the Ontology tree. The passages are evaluated with both primary and
concept keywords and the highest ranked passage is extracted as the result.
Complex queries. Queries with "Explain", "Discuss", "Describe", "List out" and
"Account on" are complex question type identifiers. These kinds of queries expect a
detailed answer from the system.
Examples of complex queries: Explain memory management, Discuss paging
algorithms.
Concept words extraction. The input query is passed to the POST to obtain primary
keywords. Each keyword identified from the Tagger output is matched with the entry in every
node of the Ontology tree. The exact location of the keyword in the tree is identified. The
sub-nodes under the keyword give the concept keywords. The primary keywords together
with the concept keywords are used to rank the documents and extract the result.

Fig. 2. Ontology tree
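The concept word lookup against such a tree can be pictured with a small sketch (Python; the tree fragment shown is a toy example of operating-system topics, not the ontology actually built for the system):

```python
# Toy fragment of an operating-systems ontology tree (illustrative only).
ONTOLOGY = {
    "operating system": {
        "process management": {"scheduling": {}, "deadlock": {}},
        "memory management": {"paging": {}, "segmentation": {}},
        "storage management": {"file systems": {}},
    }
}

def concept_words(keyword, tree=ONTOLOGY):
    """Return the sub-node names of the node matching the keyword, if any."""
    for node, children in tree.items():
        if node == keyword:
            return list(children)                 # direct sub-topics = concept words
        found = concept_words(keyword, children)  # search deeper in the tree
        if found:
            return found
    return []

print(concept_words("memory management"))   # ['paging', 'segmentation']
```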


Document extraction for complex queries. The keywords and the concept words are
passed on to the online search engine and the needed documents are extracted. The
extracted documents are converted into text documents using Html to Text convertor.
Documents ranking. The text documents extracted as a result of the Html to
Text convertor are ranked using the tf-idf algorithm. The primary keywords and the
concept keywords are applied to the individual documents and the Naive algorithm is
used to match the keywords with the documents. The tf-idf weight (term frequency-inverse
document frequency) is a weight often used in information retrieval and text
mining. This weight is a statistical measure used to evaluate how important a word is
to a document in a collection or corpus. The importance increases proportionally to
the number of times a word appears in the document but is offset by the frequency of
the word in the corpus. Variations of the tf-idf weighting scheme are often used
by search engines as a central tool in scoring and ranking a document's relevance
given a user query. In this way the term frequencies for individual documents are
obtained.
We assign to each term in a document a weight for that term, that depends on the
number of occurrences of the term in the document. We compute a score between a
query term t and a document d, based on the weight of t in d. This weighting scheme
is referred to as term frequency ( tft,d), with the subscripts denoting the term and the
document in order.
To assess relevance to a query, all terms could be given equal importance. However,
certain terms have little or no discriminating power in determining relevance. A
mechanism to attenuate the effect of terms that occur too often in the collection to be
meaningful for relevance determination is therefore considered. For this purpose the document
frequency (dft), which measures the number of documents in the collection that contain
a term t, is used.
Denoting the total number of documents in a collection by N, we define the inverse
document frequency of a term t as follows:
idft = log (N / dft)                (1)
The tf-idf weighting for the term t and document d is given as follows:
tf-idft,d = tft,d x idft            (2)
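As a concrete illustration of this weighting, the sketch below (Python; the sample documents and the use of a base-10 logarithm are illustrative assumptions) scores each retrieved text document against the combined primary and concept keywords and returns the documents ranked by their accumulated tf-idf weight.

```python
import math
from collections import Counter

def tfidf_rank(documents, keywords):
    """Rank documents by the sum of tf-idf weights of the query keywords."""
    n_docs = len(documents)
    term_counts = [Counter(doc.lower().split()) for doc in documents]
    scores = []
    for counts in term_counts:
        score = 0.0
        for term in keywords:
            tf = counts[term]                                  # term frequency
            df = sum(1 for c in term_counts if term in c)      # document frequency
            if tf and df:
                score += tf * math.log10(n_docs / df)          # tf-idf(t, d) = tf * idf
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)

docs = ["paging is a memory management scheme",
        "a process is a program in execution",
        "paging avoids external fragmentation of memory"]
print(tfidf_rank(docs, keywords=["paging", "memory"]))   # indices, best first
```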

Multi-document summarization. The documents are preprocessed to filter the


unwanted data. The preprocessed documents are summarized individually based on the
keywords available in the query. The passages in each document are ranked
individually and the highest ranked passages are retained as they are. The number of
passages retrieved after summarization is dynamic; it depends on the total number of
passages in the document.
The documents are summarized using the Weighted Means algorithm. Each text
document is divided into components based on the strategy described in the
weight based algorithm. In the weight based algorithm, a binomial distribution
function is applied over the set of lines or the chosen component, and the weights of
the components are checked and compared for extraction. The weight assigned to the
keywords plays the decisive role in extracting the components. The algorithm is
applied over the document and the required component is extracted from it.
The same procedure is applied over the remaining documents in order to
obtain the components related to the query.
WEIGHTED MEANS ALGORITHM
Step 1: Fetch one of the resultant documents in the text format.
Step 2: Divide the entire document into components. Each passage can be considered
as a component.
Step 3: Fetch the first line of the document and check for all the keywords from the
keyword array.
Step 4: If there is a match then increment the keywords counter.
Step 5: Repeat step 3 and 4 until all the lines in the document are exhausted.
Step 6: Evaluate the weight of the component and store them in the result array.
Step 7: Repeat from step 3 to 6 for all the rest of the components in the document.
Step 8: Compare the values of the result array of weights and fetch the components as
per the result array sort.
Step 9: Repeat step 2 to 8 for all the other extracted documents.
Step 10: Store all the extracted components from all the documents to the resultant
document for the aggregation.
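The core of this extraction can be modelled as follows (Python; the simple keyword-count weighting stands in for the binomial-distribution weighting described above, so it should be read as a simplified sketch rather than the exact scoring):

```python
def split_components(document):
    """Treat each blank-line-separated passage as one component (step 2)."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]

def component_weight(component, keywords):
    """Weight of a component = number of keyword occurrences in it (steps 3-6)."""
    words = component.lower().split()
    return sum(words.count(k) for k in keywords)

def summarize(document, keywords, top_n=2):
    """Keep the top_n highest-weighted components of one document (steps 7-8)."""
    components = split_components(document)
    ranked = sorted(components, key=lambda c: component_weight(c, keywords),
                    reverse=True)
    return ranked[:top_n]

doc = ("Paging is a memory management scheme.\n\n"
       "The kernel also schedules processes.\n\n"
       "Paging divides memory into fixed size frames.")
print(summarize(doc, keywords=["paging", "memory"], top_n=2))
```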
Similarity check and summarized answer. Aggregation of the answer is carried out
such that the redundancies of the information in the various extracted components are
weighed against each other and redundant data is removed. For aggregating the
result, a cosine similarity check is used.
ALGORITHM FOR COSINE SIMILARITY
Step 1: Fetch the first passage from the first extracted document.
Step 2: Compute the weight of the passage by finding its tf*idf weights.
Step 3: Fetch the first passage from the second document the compute its weight.
Step 4: Check for the cosine similarity between the two passages based on the
formula below:
Sim(A,B) = ( Σ Wi,A x Wj,B ) / ( sqrt(Σ Wi,A²) x sqrt(Σ Wj,B²) )        (3)
where i stands for the number of the passage in document A,
j stands for the number of the passage in document B,
Wi,A denotes the summation of the total weight of the ith passage in document A,
Wj,B denotes the summation of the total weight of the jth passage in document B,
Sim(A,B) stands for the similarity between document A and document B.
Step 5: If the similarity value is lesser than 0.5 then the two passages are similar so
only the first passage is considered.

182

S. Saraswathi et al.

Step 6: Else if the similarity value is greater than 0.5 then two passages are totally
different and both the passages are considered.
Step 7: Repeat the step 3 to 6 and carry out the process for all the passages in second
document then continue the same with the rest of the documents.
Step 8: Repeat the step 1 to 7 for all the passages in first document.
Step 9: Store the extracted passages from the first document and store them in a
document.
Step 10: Now, considering the extracted passages from the second document which
are totally different from those of the first, carry out the same process from step 1 to 8.
Step 11: Append the extracted passages from second document to that of the first.
Step 12: Repeat the process for all the passages in all the documents.
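A compact model of this aggregation step is shown below (Python; the 0.5 threshold is taken from the steps above, while the raw term-frequency weight vectors and the convention that a passage is dropped when its similarity to an already kept passage exceeds the threshold are simplifying assumptions):

```python
import math
from collections import Counter

def cosine_similarity(passage_a, passage_b):
    """Cosine similarity of two passages using raw term-frequency weights."""
    wa, wb = Counter(passage_a.lower().split()), Counter(passage_b.lower().split())
    dot = sum(wa[t] * wb[t] for t in set(wa) & set(wb))
    norm = (math.sqrt(sum(v * v for v in wa.values()))
            * math.sqrt(sum(v * v for v in wb.values())))
    return dot / norm if norm else 0.0

def aggregate(passages, threshold=0.5):
    """Keep a passage only if it is not too similar to one already kept."""
    kept = []
    for p in passages:
        if all(cosine_similarity(p, q) < threshold for q in kept):
            kept.append(p)
    return kept

passages = ["Paging divides memory into frames.",
            "Paging divides the memory into frames.",
            "A deadlock occurs when processes wait forever."]
print(aggregate(passages))   # the near-duplicate second passage is dropped
```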

3 Parameters for Testing


The working of the system is tested by considering the following parameters.
Precision. Precision is defined as the number of relevant documents retrieved by a
search divided by the total number of documents retrieved by that search.
Precision = |{relevant documents} ∩ {retrieved documents}| / |{retrieved documents}|        (4)
Recall. Recall is defined as the number of relevant documents retrieved by a search
divided by the total number of existing relevant documents. Recall in Information
Retrieval is the fraction of the documents relevant to the query that are
successfully retrieved. It can be viewed as the probability that a relevant document is
retrieved by the query.
Recall = |{relevant documents} ∩ {retrieved documents}| / |{relevant documents}|        (5)

F-measure. A measure that combines Precision and Recall is the harmonic mean of
precision and recall, the traditional F-measure or balanced F-score. It considers both
the precision p and the recall r of the test to compute the score: p is the number of
correct results divided by the number of all returned results and r is the number of
correct results divided by the number of results that should have been returned.
F = 2 x p x r / (p + r)        (6)
MOS (Mean Opinion Score). The MOS is generated by averaging the results of a set
of standard, subjective tests where a number of listeners rate the retrieved answer.
MOS ranges between 0 and 10.
MOS = (OS1 + OS2 + ... + OSn) / n        (7)

Multi-document Text Summarization in E-learning System

183

Summarization ratio. It is the ratio of the size of the Summarized text to the size of
the original document.
Summarization ratio = (size of summarized text) / (size of original document)        (8)
This value lies between 0 and 1.
Effective summarization ratio is calculated for multiple documents by taking the
average of individual documents.

4 Results
The following are the results of the proposed system.
Documents relevant to the query keywords were retrieved from the Google web
site. The first best ten documents were summarized to retrieve the results relevant to
the complex queries. The maximum size of the documents retrieved from web pages
after text conversion was around 126KB. The summarization algorithm resulted in a
summarization ratio of 0.1726. For all types of queries the summarization ratio values
lie between 0.1 and 0.2. The answers for different types of queries are extracted from
this summarized text.
The Precision, Recall and F-Measure for simple queries and complex queries
are tabulated below, where P refers to Precision, R refers to Recall and F refers to
F-measure.
Table 1. Precision, Recall, F-measure for simple queries in Operating System

Query Type | P | R | F
Simple what | 0.724 | 0.784 | 0.868
Complex what | 0.437 | 0.613 | 0.745
Define | 0.745 | 0.793 | 0.812
When | 0.565 | 0.846 | 0.893
How | 0.486 | 0.749 | 0.871
Different types | 0.789 | 0.812 | 0.853

The following graph shows the MOS for simple and complex queries. A total of
229 queries have been posed by a total of 30 users at various levels to the system and
the results obtained are sketched into a graph taking the mean opinion score into
consideration. The graph shows that the system works in the range of 7 to 8.
From the graphs it is observed that the mean opinion score for simple queries
ranges between 7 and 8.5, whereas for complex queries it ranges between 6.5 and 8.5. The

Table 2. Precision, Recall, F-measure for complex queries in Operating System

Query Type | P | R | F
Explain | 0.696 | 0.784 | 0.813
Describe | 0.642 | 0.846 | 0.768
Give an account | 0.563 | 0.767 | 0.810
Detail | 0.532 | 0.801 | 0.673

[Bar chart: Mean Opinion Score (scale 0-10) for the query types Simple what, Descriptive what, Define, When, How, Different types and Complex]
Fig. 3. Mean Opinion Score for simple and complex queries in Operating System

fluctuations in the range between different types of simple queries imply that different
techniques are followed to retrieve answers. For questions with "how" and "when",
concept words are considered in addition to the keywords, and intensive passage
ranking is carried out taking the concept words together with the keywords into account. In
some cases, two levels of ranking are carried out on the passages first obtained with
keywords alone in order to get relevant results. This can be a possible reason
for the change in MOS values.

5 Conclusion
Multi-document summarization is important in an e-learning system, as it can be
utilized for improving the effectiveness of retrieval and the accessibility of learning
objects in e-learning. The proposed e-learning system aims at providing this
solution by summarizing multiple documents and uses an Ontology tree for
concept word extraction in order to produce more relevant answers. Also, user
authentication is provided for the user profile information in order to protect the
system from intruders. The tutorial section helps users who need complete
information on the domain. Thus summarization in an e-learning system for the operating
system domain is achieved by this system.

References
1. Sarker, M.Z.H., Parvez, M.S.: A Cost Effective Symmetric Key Cryptographic Algorithm for Small Amount of Data. In: 9th International Multitopic Conference, pp. 1-6. IEEE INMIC, Los Alamitos (2005)
2. Charniak, E.: Statistical Techniques for Natural Language Parsing. AI Magazine 18(4), 33-44 (2007)
3. van Halteren, H., Zavrel, J., Daelemans, W.: Improving Accuracy in NLP Through Combination of Machine Learning Systems. Computational Linguistics 27(2), 199-229 (2004)
4. Kumar, P., Kashyap, S., Mittal, A., Gupta, S.: A Query Answering System for E-Learning Hindi Documents. In: South Asian Language Review, vol. XIII(1&2) (January-June 2003)
5. Dang, N.T., Tuyen, D.T.T.: Document Retrieval Based on Question Answering System. In: Second International Conference on Information and Computing Science. IEEE, Los Alamitos (2009)
6. Ha-Thuc, V., Nguyen, D.-C., Srinivasan, P.: A Quality-Threshold Data Summarization Algorithm. In: IEEE International Conference on Research, Innovation and Vision for the Future in Computing & Communication Technologies (RIVF 2008), Ho Chi Minh City, Vietnam, July 13-17. IEEE, Los Alamitos (2008)
7. Wendel, P., Ghanem, M., Guo, Y.: Scalable clustering on the data grid. In: Proceedings of 5th IEEE International Symposium on Cluster Computing and the Grid, CCGrid (2005)
8. Hore, P., Hall, L.O.: Scalable clustering: a distributed approach. In: Proceedings of IEEE International Conference on Fuzzy Systems, FUZZ-IEEE (2004)
9. Cai, P., He, L.: Weighted Information Retrieval Algorithms for Onsite Object Service. In: Proceedings of the International Multi-Conference on Computing in the Global Information Technology, ICCGI 2007 (2007)
10. Varadarajan, R., Hristidis, V.: A system for query-specific document summarization. In: CIKM 2006: Proceedings of the ACM Conference on Information and Knowledge Management, pp. 622-631 (2006)
11. Saraswathi, S., Asma Siddhiqaa, Kalaimagal, Kalaiyarasi: Bilingual Information Retrieval System for English and Tamil. Journal of Computing 2(4) (2010)
12. Satheesh Kumar, R., Pradeep, E., Naveen, K., Gunasekaran, R.: Enhanced Cost Effective Symmetric Key Cryptographic Algorithm for Small Amount of Data. In: International Conference on Signal Acquisition and Processing. IEEE, Los Alamitos (2010)
13. Gilberg, R., Forouzan, B.: Data Structures: A Pseudocode Approach With C++. Brooks/Cole, Pacific Grove, CA (2005) ISBN 0-534-95216-X
14. Heger, D.A.: A Disquisition on the Performance Behavior of Binary Search Tree Data Structures. European Journal for the Informatics Professional 5(5) (2004)
15. Aragon, C.R., Seidel, R.G.: Randomized search trees. In: Proc. 30th IEEE FOCS, pp. 540-545 (2000)
16. Wikipedia, http://en.wikipedia.org/wiki/String_searching_algorithm#Na.C3.AFve_string_search
17. Young, J.S.: Markov random field based English part-of-speech tagging system. In: Proceedings of the 16th Conference on Computational Linguistics, vol. 1, pp. 451-457 (2006)
18. Glenisson, P., Antal, P., Mathys, J., Moreau, Y., De Moor, B.: Evaluation of the Vector Space Representation in Text-Based Gene Clustering. In: Pacific Symposium on Biocomputing, vol. 8, pp. 391-402 (2003)

Improving Hadoop Performance in Handling Small Files
Neethu Mohandas and Sabu M. Thampi
Rajagiri School of Engineering and Technology, Cochin, India

Abstract. Hadoop, created by Doug Cutting, is a top-level Apache


project that supports distributed applications which involves thousands
of nodes and huge amount of data. It is a software framework under a
free license, inspired by Googles MapReduce and Google File System papers. It is being developed by a global community of contributors, using
Java. Hadoop is used world-wide by organizations for research as well as
production.Hadoop includes Hadoop Common,Hadoop Distributed File
System(HDFS) and MapReduce as its subprojects. Hadoop Common
consists of the common utilities that support the other Hadoop subprojects. HDFS is a distributed le system which adds to the high performance of Hadoop by giving high througput access to application data.
It also improves reliability by replication of data, and maintains data integrity as well.MapReduce is a software framework based on MapReduce
algorithm to perform distributed computation involving huge amount of
data on clusters. Although Hadoop is widely used, its full potential is
not yet put to use because of some issues, the small files problem being
one of them.Hadoop Archives was introduced as a solution for the small
les problem for the Hadoop Version 0.18.0 onwards. Sequence les are
also used as an alternative solution.Both has their respective merits and
demerits. We propose a solution which is expected to derive their merits
while ensuring a better performance of Hadoop.
Keywords: Hadoop, Hadoop Distributed File System (HDFS), MapReduce, small files problem, Hadoop archives, sequence files.

1 Introduction

In this era of distributed computing, the development of Hadoop has further improved the performance of applications in which computations involving terabytes and petabytes of data are processed efficiently and quickly. This has been made possible by the underlying software framework, named MapReduce, and the Hadoop Distributed File System. MapReduce, just as its name indicates, is a software framework based on two basic steps, Map and Reduce, supporting massive computations. The concept of the Map and Reduce steps is derived from functional programming languages. In OSDI 2004, Google presented a paper on MapReduce, which kickstarted the implementation of the concept. Hadoop is the Java implementation of MapReduce, based on the concept that a huge unmanageable computation can be split into smaller manageable chunks. HDFS,

on the other hand, was inspired by the Google File System paper. It supports the high performance of Hadoop in performing large computations through its reliable data storage, high data integrity, and, most importantly, high-throughput access to application data. As such, Hadoop is widely favoured in the web, search, finance, and scientific market segments.

2 Background

2.1 MapReduce

Programmers benefit from using this framework because they can avoid the headache of the complexities of distributed applications. This is possible because the tasks of splitting the input data, assigning the computations among a set of nodes in a cluster, managing system failures, and handling inter-node communication are taken care of by the run-time system. Programmers can, very conveniently, program even if they don't have much experience dealing with distributed computing frameworks, which makes Hadoop a favourite among them.
The basic programming model can be described as a combination of Map tasks and Reduce tasks [1]. To perform the computation, initially a set of key/value pairs is provided as input. Then the computation is done, finally producing a set of key/value pairs as output. In the context of the MapReduce library, the computation can be viewed as two functions, Map and Reduce. Both the Map and Reduce functions are written by the user. The Map function accepts the input key/value pairs and gives a set of intermediate key/value pairs as output. The MapReduce library then groups together the intermediate values for a particular key and passes them along to the Reduce function, possibly iteratively, as the list of values might be too large to fit in memory.
It is the task of the Reduce function to merge the values belonging to a particular key into a smaller set of values. If the user wants an even smaller set of output values, he/she can avoid manual computation by giving this output as input to another MapReduce computation, thus resulting in a nested MapReduce invocation.
As a simple example, we can count the access frequency of a set of URLs if we give logs of web page requests as input to the MapReduce computation. The Map function produces <URL, 1> pairs. The Reduce function sums up the values for the same URL and produces a <URL, total count> pair, thus giving the URL access frequency.
In Figure 1, we can see the Map and Reduce tasks being assigned to the nodes by a master node, and the partitioned input given to the nodes assigned Map tasks, which produce the intermediate values. The master node is informed about the location of the intermediate values produced by each node. On acquiring this information, the master node passes it to the nodes assigned Reduce tasks to finally perform the merging task, producing the output files.


Fig. 1. Execution overview [1]

2.2 Hadoop Distributed File System

The Hadoop Distributed File System is the file system used by Hadoop. It is similar to the UNIX file system, and it is developed to support Hadoop in data-intensive distributed computations. In a cluster where Hadoop is implemented, according to the present design at Yahoo!, the largest contributor to the Apache Hadoop project, one node per cluster acts as the NameNode, which stores all the metadata of files. The application data is stored in DataNodes [2].
If a client wants to perform a read operation, the request is processed by the NameNode, which provides the locations of the data blocks that constitute the file, and the client performs the read operation from the closest DataNode. For a write operation, the NameNode selects a set of DataNodes (three by default) to host the replicas of each block of the file, and the client writes the file blocks into those DataNodes in a pipelined fashion.
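As an illustration of the client side of this interaction (a sketch of ours, not from the paper), a read through the HDFS Java API looks roughly as follows; the NameNode URI and the file path are assumed values.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // The NameNode resolves the block locations; the client then streams the
    // data from the closest DataNodes holding the replicas.
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000/"), conf);
    FSDataInputStream in = fs.open(new Path("/user/demo/sample.txt"));
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    String line;
    while ((line = reader.readLine()) != null) {
      System.out.println(line);
    }
    reader.close();
    fs.close();
  }
}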
In a cluster, during start-up a DataNode performs a handshake with the NameNode, which helps in maintaining data integrity. During the handshake, the namespace ID and software version of the DataNode are checked. Only a DataNode with the same namespace ID and a supported software version is allowed in the cluster.
A block report is sent periodically to the NameNode by each DataNode to provide information about the block replicas it holds, which helps the NameNode collect information about the location of the replicas of each block of a file, thus maintaining consistent metadata. Also, DataNodes send heartbeats every three seconds to inform the NameNode about the availability of the node as well as the file block replicas it holds. If the NameNode does not receive the heartbeat of a DataNode within a particular time, say ten minutes, it considers the DataNode as well as the block replicas it hosts as unavailable, and takes charge of creating new replicas of those blocks on other available DataNodes in the cluster.
The NameNode allocates space for the metadata and balances the load among the DataNodes in its cluster using the information contained in the heartbeats.
Figure 2 gives a pictorial view of the HDFS architecture, as well as the read and write operations.

Fig. 2. HDFS architecture [6]

3 The Small Files Problem

Let us discuss the impact of this problem on the Hadoop Distributed File System as well as on MapReduce, the two major components of Hadoop discussed before [7]. Scientific application environments, such as climatology and astronomy, contain huge numbers of small files.
3.1 Impact on HDFS

The HDFS block size is 64 MB by default. Any file smaller than this is considered a small file. We know that the NameNode holds the metadata of each file held by the DataNodes in its cluster. If each DataNode holds unmanageable amounts of small files, the NameNode will obviously find it difficult to manage the large amount of metadata. It also results in an inefficient data access pattern, as it requires a large number of seek operations from DataNode to DataNode to search for and retrieve a requested file block.
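To put rough numbers on this (an illustration of ours, using the rule of thumb of roughly 150 bytes of NameNode memory per file, directory or block object mentioned in [7]): ten million files of 100 KB each, about 1 TB of data in total, need at least twenty million NameNode objects, one file object and one block object per file, i.e. on the order of 3 GB of NameNode heap, whereas the same terabyte stored in 64 MB blocks needs only tens of thousands of objects.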

3.2 Impact on MapReduce

The large number of small files creates extra overhead for MapReduce, since a map task usually takes one block of input at a time; each map task therefore processes only a small amount of data, resulting in a large number of map tasks.
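As a simple illustration (ours, not from the paper): 1 GB of input stored as sixteen 64 MB blocks is processed by roughly sixteen map tasks, whereas the same 1 GB stored as 10,000 files of about 100 KB each spawns roughly 10,000 map tasks, each doing only a tiny amount of useful work against a comparatively large task start-up cost.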
3.3 Why Are the Small Files Produced?

Either the files are pieces of a larger file or they are small by nature. One or both of these cases can be seen in most of the environments facing the small files problem.

Fig. 3. Small files problem [3]

3.4 Significance of the Small Files Problem

One reason for the importance of this problem is that the NameNode has to manage a huge amount of metadata in its memory. Another reason involves the time that each DataNode takes during start-up to scan its file system to obtain data about the files it is holding, which is needed for the block report to be sent to the NameNode. The larger the number of small files, the longer this takes. In a cluster, the administrator is provided with two choices for putting user quotas on directories [3]:
1. Maximum number of files per directory
2. Maximum file space per directory


In Figure 3, the maximum number of files allowed is seven, and the maximum file space allowed for a user directory is 7 GB. User 1 has exceeded neither the maximum number of files limit nor the maximum file space limit, so an incoming request for a new file is processed right away. For user 2, since he has reached the number of files limit, an incoming request for a new file can't be processed although a lot of allowed file space remains. User n, on the other hand, has reached the file space limit, and an incoming request for a new file can't be processed although he hasn't exceeded the first criterion, the maximum number of files limit.

4 Existing Solutions

If the small files are part of a larger file, the problem may be avoided by writing a program to concatenate the small files into a large file that is at least as big as the default block size. But if the files are inherently small, they need to be grouped in some way. Some existing solutions are as follows.
4.1 Hadoop Archives

Hadoop, in its later versions, introduced archiving as a solution to the small files problem. A Hadoop archive, always with a *.har extension, contains metadata and the data files. The metadata is organized as index and master-index files, and the data files are stored as part-* files. The names of the archived files and their locations within the data files are stored in the index file [5]. The modifications done to the file system for archiving are invisible to the user, and yet the increased system performance is quite obvious to the user. Also, the number of files in HDFS is reduced, resulting in better NameNode performance. This solution is available only for Hadoop versions 0.18.0 onwards, while former versions are still widely used. Moreover, if a quota is exceeded while a MapReduce task is being processed, the task is aborted by the scheduler, no matter how critical the task is or how close to completion it is. Although files are archived, compression of files is not possible with this method. Read operations can still be slow, since each file access needs two index file reads and a data file read.

Fig. 4. Hadoop archive format [3]
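As a rough sketch (ours, not from the paper) of how an existing archive is accessed, files inside a *.har can be opened through Hadoop's har:// file system scheme; the archive name and the inner path below are assumed for the example, and each such open still pays the two index reads described above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HarReadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // A path inside an existing archive; resolving it consults the archive's
    // master-index and index files before the part-* data file is read.
    Path inArchive = new Path("har:///user/demo/logs.har/2011-07-22/access.log");
    FileSystem fs = inArchive.getFileSystem(conf);
    FSDataInputStream in = fs.open(inArchive);
    byte[] buffer = new byte[4096];
    int read = in.read(buffer);
    System.out.println("Read " + read + " bytes from the archived file");
    in.close();
  }
}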

4.2 Sequence Files

In this method, the existing data is converted to sequence files. That is, the small files are put into a sequence file, which can then be processed in a streaming way. Sequence files allow compression too, unlike Hadoop Archives. Also, sequence files can be split into smaller chunks, and MapReduce can operate on each piece independently. Conversion to sequence files might take time, and this method is mostly dependent on Java, i.e., it is not available in a cross-platform manner.
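A minimal sketch of this conversion (ours, not the paper's), using Hadoop's SequenceFile API with the original file name as key and the raw file contents as value; the input and output paths are assumed for the example, and block compression could equally be requested when creating the writer.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SmallFilesToSequenceFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path inputDir = new Path("/user/demo/small-files");
    Path output = new Path("/user/demo/packed.seq");

    // One sequence file collects many small files:
    // key = original file name, value = raw file contents.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, output, Text.class, BytesWritable.class);
    try {
      for (FileStatus status : fs.listStatus(inputDir)) {
        byte[] contents = new byte[(int) status.getLen()];
        FSDataInputStream in = fs.open(status.getPath());
        try {
          in.readFully(contents);
        } finally {
          in.close();
        }
        writer.append(new Text(status.getPath().getName()),
                      new BytesWritable(contents));
      }
    } finally {
      IOUtils.closeStream(writer);
    }
  }
}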

5 Proposed Solution

The proposed solution makes use of the merits of the existing solutions listed in the previous section and tries to avoid their demerits. While Hadoop Archives succeed in grouping the small files, the read operation can still be slow, as it requires reading the two index files and finally the data file for a single read. On the other hand, sequence files are efficient in data processing but platform dependent.
We propose a method which automatically analyses the input data block size. If it is less than the default HDFS block size, the method automatically reduces the number of reduce tasks to an optimum number. This is based on the fact that Hadoop outputs one file for each reduce task, regardless of whether the reduce task produces any data. In addition, compression is allowed. The method is proposed to be implemented in a platform-independent manner. While performing the MapReduce tasks, it keeps track of the memory space left to ensure that the minimum amount of space needed to uncompress a file is available, since updates require the files in their original, uncompressed format.
Since archiving is not done in this method, the read operation can be done in the normal way, although writes require uncompressing the requested file. Instead of converting the data to sequence files, which might take more time than necessary, the method efficiently analyses the input task at hand, determines the block size, and sets the number of reduce tasks accordingly. This method, proposed to be implemented as a tool, can also be used in earlier versions of Hadoop where archiving is not available. A study conducted on social networks revealed that former versions of Hadoop are still being used, which implies that this method can help enhance the performance of those versions as well.
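As a rough sketch of the reduce-task sizing idea (ours, not the paper's implementation), the number of reduce tasks could be derived from the total input size and the HDFS block size before the job is submitted; the helper name and the one-reducer-per-full-block heuristic are assumptions made for illustration.

import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public class ReduceTaskSizer {
  // Hypothetical helper: shrink the number of reduce tasks when the input is
  // made up of files far smaller than one HDFS block.
  public static void configureReduces(JobConf conf, Path inputDir,
                                      int requestedReduces) throws Exception {
    FileSystem fs = inputDir.getFileSystem(conf);
    ContentSummary summary = fs.getContentSummary(inputDir);
    long totalBytes = summary.getLength();
    long blockSize = fs.getDefaultBlockSize();   // 64 MB by default

    // Heuristic used here: no more reducers than full blocks of input, and
    // never fewer than one, so tiny inputs do not fan out into many
    // near-empty output files.
    long fullBlocks = Math.max(1, totalBytes / blockSize);
    int reduces = (int) Math.min(requestedReduces, fullBlocks);
    conf.setNumReduceTasks(reduces);
  }
}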

6 Conclusion

The role of Hadoop in performing massive computations efficiently is evident. Hadoop has a lot of potential because of its unique way of distributing the computation and finally merging the result, along with an efficient file system tailored to ensure the usage of its full potential. We have discussed one of


the drawbacks faced by Hadoop in certain environments, where a large number of small files degrades Hadoop's performance. Two of the existing solutions were explained in brief, along with their respective advantages and disadvantages. Finally, we proposed a solution combining the merits of the existing solutions while filtering out their demerits. The proposed solution, after successful implementation, is expected to enhance the performance of Hadoop in scenarios where the so-called small files problem causes performance degradation.

7 Future Work

The next milestone in our work will be the successful implementation of the proposed solution, which focuses on improving Hadoop performance when the input files are inherently small. An extension of this can be a method which can efficiently manage small files, whether inherently small or otherwise, in a way that is a bit more convenient for programmers, sparing them the complexities of the distributed framework.

References
1. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. In: Proceedings of the 6th Symposium on Operating Systems Design and Implementation, San Francisco, CA (December 2004)
2. Shvachko, K., Kuang, H., Radia, S., Chansler, R.: The Hadoop Distributed File
System. In: Proceedings of the 26th IEEE Symposium on Massive Storage Systems
and Technologies (May 2010)
3. Mackey, G., Sehrish, S., Wang, J.: Improving Metadata Management for Small Files in HDFS. In: Proceedings of IEEE International Conference on Cluster Computing and Workshops, pp. 1–4 (August 2009)
4. Satyanarayanan, M.: A Survey of Distributed File Systems. Technical Report CMU-CS-89-116, Department of Computer Science, Carnegie Mellon University (1989)
5. Hadoop Archives: Archives Guide (2010),
http://hadoop.apache.org/core/docs/r0.20.0/hadoop_archives.html
6. Hadoop Distributed File System: HDFS Architecture (2010),
http://hadoop.apache.org/common/docs/r0.20.1/hdfsdesign.html
7. The major issues identied: The small les problem (2010),
http://www.cloudera.com/blog/2009/02/02/the-small-files-problem
8. Introduction: What is Hadoop (2010),
http://www.cloudera.com/blog/what-is-hadoop
9. Hadoop Distributed File System: Welcome to Hadoop Distributed File System!
(2010), http://hadoop.apache.org/hdfs
10. MapReduce: Welcome to Hadoop MapReduce! (2010),
http://hadoop.apache.org/mapreduce

Studies of Management for Dynamic Circuit Networks


Ana Elisa Ferreira1, Anilton Salles Garcia2, and Carlos Alberto Malcher Bastos3
1 Universidade Federal do Espírito Santo, Departamento de Engenharia Elétrica
Av. Fernando Ferrari, s/n - Campus Universitário
29060-900 Vitória, ES - Brasil
anaelisa@telecom.uff.br
2 Universidade Federal do Espírito Santo, Departamento de Informática
Av. Fernando Ferrari, s/n - Campus Universitário
29060-900 Vitória, ES - Brasil
anilton@inf.uff.br
3 Universidade Federal Fluminense, Departamento de Engenharia de Telecomunicações
Rua Passo da Pátria, 156 - Boa Viagem
24210-240 - Niterói, RJ - Brasil
cmbastos@telecom.uff.br

Abstract. This article presents the need for a new model for the Internet architecture and briefly shows the international panorama of research on developing it. One of the most prominent proposals is the Dynamic Circuit Network (DCN), which allows providing dynamic hybrid packet and circuit services within the same network infrastructure. However, this brings several important challenges to the control and management planes. The use of smart management agents and self-management techniques seems to be one approach able to deal with the features of the new types of networks.
Keywords: DCN, Control Plane, GMPLS, Management.

1 Introduction
The Internet, as it is known today, is a network supported by technologies more than 30 years old, which evolved from a research network interconnecting a few institutions to a global network that is the backbone of modern society and economy. Despite this history of success, scientific applications such as high energy physics, astronomy, bioinformatics, telemedicine, remote visualization, grid computing and nanodatacenters, among others, are leading the Internet to its technological limit.
These applications are typically distributed and/or require strict guarantees of quality, including high-capacity interconnection. Improvements have been developed in the original Internet protocols to meet these requirements, adding facilities for measurement, management, traffic engineering, control and network security. Different studies have been made both to address the limitations of the current model

of the Internet and to develop an architecture with new capabilities from a visionary perspective [1]. Among these studies, the following projects are being conducted:
perspective [1]. Among these studies, the following projects are being conducted:
Internet2 (Internet2) [2] - Internet2 is a consortium led by the American NREN community since 1996. Internet2 promotes the mission of its members by providing network-edge capabilities and offering the opportunity for unique partnerships that facilitate the development, implementation and use of revolutionary technologies for the Internet.
Global Environment for Network Innovations (GENI) [3] - This project is a partnership with the National Science Foundation and aims to support experiments ranging from new research on the design of network infrastructure and distributed systems to theoretical aspects of the underlying social, economic and technological value of networks.
Future Internet Research and Experimentation (FIRE) [4] - Similar to GENI, this project has the support of the Information and Communication Technologies (ICT) programme of the European Commission. FIRE deals with the new expectations that are emerging for the Internet, providing a research environment for investigation and experimental validation of highly innovative and revolutionary ideas. The FIRE project has two interrelated dimensions:
FIRE Facility: An experimental network based on the principle of federated testbeds, i.e., testbeds which retain their individual authority but work in a coordinated way.
FIRE experimentally-driven research: Visionary multi-disciplinary research, defining challenges and using the facilities provided by the FIRE project. It consists of interactive cycles of study, design and large-scale testing of new and innovative paradigms, network architectures and services for the future Internet.
Gigabit European Academic Network (GEANT) [5] - GÉANT is a high-capacity network that includes more than three hundred research and education institutions in 32 countries through 34 national and regional networks for education and research. In operation since December 2001, its main goal is to continue and improve the earlier version of the pan-European research network, TEN-155. Currently the GÉANT project is focused on developing and implementing tools and services so that the research and education community can achieve the best possible network performance. GÉANT is managed by DANTE (Delivery of Advanced Network Technology to Europe).
Canarie (CANARIE) [6] - CANARIE is the network for advanced research and innovation in Canada. Established in 1993, the nonprofit organization serves about 50,000 researchers at nearly 200 Canadian universities and colleges, government laboratories, research institutions, hospitals and other organizations in the public and private sectors, connecting them with innovators who are nearby, distributed across the country or around the world. Since the


government of Canada is its largest donor, CANARIE provides advanced networking capabilities that enable scientists to manage, analyze and exchange large data volumes, allowing important discoveries. CANARIE also enables researchers and their partners to develop new tools to explore the potential of the network.
Advanced Network Test bed for R&D (JGN2plus) [7] - The National Institute of Information and Communications Technology (NICT) of Japan launched the JGN2plus project to operate new laboratory networks that will conduct research in conjunction with academia, government and industry to develop projects for next-generation networks.
This work presents a study of the main topics related to the control and management of a new network architecture model, the DCN (Dynamic Circuit Network). Section 2 gives a brief description of its architecture and of the solutions adopted by Internet2, GEANT and Canarie. Section 3 then describes the control plane, and Section 4 some key issues of management. Management of this new network model is still at an early stage of research and development, so here we present our view of the main differences between the traditional approach and the challenges brought by the DCN. Concluding the article, we also touch on the topic of traffic engineering, which is increasingly relevant and necessary to provide services that meet the requirements of new applications.

2 Dynamic Circuit Network


New applications under development will shape a new profile of Internet use. This new usage profile demands increasingly stringent guarantees of quality and level of service. Recently, new architectural models have been proposed in order to support these and future demands, changing the current paradigm and presenting new solutions for the network infrastructure.
One of these proposals is the Dynamic Circuit Network (DCN), already used by Internet2, GEANT and Canarie. The DCN is a new network architecture which enables the deployment of hybrid dynamic networks, with simultaneous packet and circuit switching. This convergence promises to reduce operational costs and complexity and to simplify the network architecture when compared with the current model of segregated networks. The idea is to define a set of common control functions and interconnection mechanisms that allow unified communication, routing and control across different types of underlying transport technologies, such as IP, ATM, SONET/SDH and DWDM. Traditionally, each specific technology has its own control protocols, and as a result each set of control protocols does not communicate directly with the others on a peer-to-peer level. Instead, networks are layered one on top of the other, creating overlays at each layer to collectively provide end-user services. These processes require knowledge of each technology domain, provisioning of each layer, and separate management of per-domain operations or functions [8].


Beyond the convergence, using the DCN it is possible to provide a dynamic virtual circuit switching service with a specific duration between users that require dedicated bandwidth for periods ranging from minutes to days [9]. A challenge in this area concerns the automatic and dynamic management and control of these networks. It is still necessary to develop solutions to important issues in the control and management planes. The main proposal for the control plane of hybrid networks is Generalized Multiprotocol Label Switching (GMPLS).
The solution used in Internet2 is Dynamic Resource Allocation in GMPLS Optical Networks (DRAGON). The DRAGON project has developed the technology used to build a network infrastructure that enables the dynamic provisioning of resources to establish deterministic paths in a packet network, thus meeting the requirements of the various types of end users [10]. Similar to DRAGON, Automated Bandwidth Allocation across Heterogeneous Networks (AutoBAHN) is a solution developed by GEANT to allow the dynamic allocation of channels. AutoBAHN is the result of research on automating the establishment of inter-domain end-to-end circuits with guaranteed capacity [11]. These two solutions use a GMPLS control plane. Another solution is User Controlled Lightpaths (UCLP) [12], proposed by Canarie. Using Web Services technology and the IaaS (Infrastructure as a Service) framework, UCLP allows the establishment of intra- and inter-domain circuits. Unlike the other two projects, UCLP uses the management plane, via the TL1 protocol, instead of the control plane to interact with the network equipment.

3 The GMPLS Control Plane


The implementation of new architectures depends largely on control and management
plans that can deal with the new models and their characteristics. One of the main
proposals for the control plane is the GMPLS, used in the DCN. The GMPLS extends
the MPLS signaling and routing functions for devices that switch packet, time,
wavelength and fiber optics. This unified control plane promises to turn the network
operation easier, faster and more efficient by automating the end to end provisioning,
managing network resources and providing the quality of service required by new
applications. The main challenge is the establishment, maintenance and management
of dynamic end to end paths. The user data can cross different networks, implemented
using different technologies and possibly belonging to different administrative
domains. The evolution of MPLS to GMPLS extended signaling protocols, RSVP-TE
(Resource Reservation Protocol-Traffic Engineering) and CR-LDP (ConstraintRouting Label Distribution Protocol) and routing protocols, OSPF-TE (Open Shortest
Path First - Traffic Engineering) and IS-IS-TE (Intermediate System to Intermediate
System - Traffic Engineering), to accommodate the characteristics of optical,
Ethernet, SDH/SONET, and WDM networks. A new protocol, the LMP (Link
Management Protocol), was introduced to manage the smooth functioning of the
control plan between two nodes. The table 1 summarizes the protocols and extensions
to the GMPLS.


Table 1. Protocols and extensions for GMPLS

Routing - OSPF-TE, IS-IS-TE: Routing protocols to discover the network topology and advertise the availability of resources (e.g., bandwidth or protection). The main developments are: link protection announcement (1+1, 1:1, unprotected, extra traffic); FA-LSP (forwarding adjacency) implementation to improve scalability; announcement and reception of information on a non-IP link using a link ID; discovery of an alternative route diverse from the primary path (shared-risk link group).

Signaling - RSVP-TE, CR-LDP: Signaling protocols to provision user and traffic engineering LSPs. The main developments are: generic labels, allowing the joint use of packet switching and other technologies in the same network; bidirectional LSP establishment; signaling to establish backup paths (protection information); fast label association via suggested labels; support for waveband switching, an aggregate of wavelengths which are switched together.

Link Management - LMP: Control channel management, established through negotiation of link parameters (e.g., keep-alive messages) and used to ensure the health of the link (hello); link connectivity check, ensuring the physical connectivity between neighbours using a test message similar to ping; correlation of link properties, identifying the link properties between adjacent nodes (e.g., protection mechanisms); fault isolation, isolating single or multiple faults inside the optical domain.

4 The Management Plane


One activity of great importance in network operation is its management. The basic
concepts of OAM and the functional rules of behavior monitoring and diagnosis of
telecommunication networks should be reviewed and adapted to the new paradigms of
networks.
The Telecommunication Management Network (TMN) model, defined by ITU-T
in the M3000 series of recommendations [13], provides a framework to enable
the interconnection and communication between telecommunication networks and

200

A.E. Ferreira, A.S. Garcia, and C.A.M. Bastos

operations support systems (OSS) via management plan. TMN also defines functional
areas and logical layers that can be applied to the management of new network
architectures, but its implementation should be reviewed and extended to meet the
needs of new types of networks. Figure 1 shows the layers of TMN model , while
Table 2 shows its functional distribution.

Fig. 1. TMN logical layers


Table 2. TMN functional layers

Performance Management: Evaluate the proper functioning of the network elements and the operational state of the network and its components.
Fault Management: Detect, recognize, isolate, correlate and correct failure events, which indicate anomalous operation of networks, equipment or systems.
Configuration Management: Perform the control and identification functions, as well as set and collect the configuration data of the network elements.
Accounting Management: Measure the usage of services and network resources and send the collected data to the billing systems.
Security Management: Allow the prevention, detection and control of improper use of network resources and systems.

OAM functions are traditionally designed for networks built using a single technology; the DCN, however, provides hybrid, multi-layer networks. The new proposals for the management plane should therefore not be limited to the traditional functions of management. The five functional areas shown in Table 2, known as FCAPS (Fault, Configuration, Accounting, Performance, Security), should be extended and implemented so as to permit, for example, that users with different profiles act on different network layers and at different levels of coverage.
The management of hybrid, multi-layer networks whose resources use different switching granularities, virtual or otherwise, requires new forms of monitoring. Equipment failures and performance and security problems become more critical and more difficult to solve, since they impact the various layers and network services. Effective management requires monitoring, interpretation and control of distributed resources [14]. From the standpoint of monitoring, the main problem is the multi-dimensionality and variability of the network. New ways must be found to obtain basic information about the load and the use of individual resources in this type of network, whose structure changes randomly according to the requests of users [15].
Distributed management systems exploiting the advantages of mobile agents, software that moves in the network between its various entities, are one way to deal with the management of dynamic networks [16]. The use of mobile agents allows the management system to dynamically adapt to changing networks, like DCN, SAC, DTN and others. However, the exclusive use of mobile agents can lead to an unnecessary increase in management traffic, overloading the management plane. One approach under study, combining centralization for trivial tasks with decentralization through mobile agents for complex tasks, allows a more realistic solution [17].
The management of these networks must also be able to deal with several, possibly different, media and their limitations: for example, radio segments have a high error rate, while optical segments have a limited ability to dynamically change the path of a route.
Other issues to consider:
- Flexibility: The management policies must evolve with the strategy of operation and network utilization, QoS, quick allocation of resources, protection and restoration of resources and so on. This implies interaction with the control plane (signaling and routing).
- Reliability: There is a need to react smoothly in case of fault detection, isolation and repair/protection. The network must be autonomous, i.e., perform self-diagnosis, self-repair and self-protection.
- Mobility of terminals and partial and/or temporary availability of resources: These questions are raised by new uses and architectures of the network, with ubiquitous wireless access, autonomous reconfiguration, and delay and interruption tolerance. Management functions should be able to deal with reservations and scheduled use of resources [18].

The operation and interconnection of hybrid and multi-layer networks depends on the existence of a well-defined management plane that allows the measurement and monitoring of intra- and inter-domain traffic, as well as dynamic circuit provisioning, fault correlation and recovery, AAA (Authentication, Authorization, and Accounting), traffic engineering and other features. As with the control plane, there are important challenges for the management of the new network models.


5 Traffic Engineering
The main objective in the operation of any network is to optimize the use of its resources while satisfying the users' demands and ensuring that the agreed availability and quality are met. In this way it is possible to maximize the investments made in its implementation. Traffic engineering is the method by which network performance is optimized, dynamically analyzing, predicting and controlling the behavior of traffic through the network. The use of traffic engineering methods allows the various traffic flows to be adapted according to network conditions, in order to ensure the joint goal of network performance and efficient use of resources. Three main steps are therefore required: measurement, modeling and control.
Initially, the traffic and the network must be measured. The measuring and monitoring tool must report:
- the network topology,
- the operational status of links and network equipment, including configuration and performance parameters,
- the traffic flows and their characteristics, including, in our case, the scheduled usage.

These data are also useful for other activities such as capacity planning, billing and network visualization. The changes to be made to the network should then be planned considering the heuristics and policies for the networks, physical or overlaid, and for the services they provide. The efficiency of this step depends on an accurate and current view of the network state, i.e., on the frequency and accuracy of the measurements. Finally, an operator or an automated system applies the configurations.
Traffic engineering is applied both during normal operation and in the event of failure. In case of failure, the goal is to preserve the performance of the flows as much as possible and to restore the operational state of the network. During normal operation, the goal is to improve running operations and perform preventive actions to optimize the network, for example allowing increased flows or the improvement of quality-related parameters.
These issues can be extrapolated to the case of new network architectures, such as DCNs, and to inter-domain paths involving traffic crossing different networks that may be technologically and administratively separate. There are still several outstanding issues, starting with the first step: measurement. End-to-end metrics that remain valid across technologically different networks are not yet defined. Parameters of Wi-Fi, Ethernet, IP, MPLS, SDH and DWDM technologies, to cover only the most common ones a circuit might cross, should be correlated to ensure the quality demanded by the user. Since there is no intuitive correlation between the metrics of transport technologies, like DWDM and SONET/SDH, and those of packet switching, IP/MPLS and Ethernet for example, it is necessary to transform and combine information from network monitoring and measurement to obtain quality end-to-end services.
In addition, the establishment of circuits and the computation of paths should be done using the measuring and monitoring information, so that it is possible to achieve the goals of traffic engineering. Once the metrics are defined and mapped to the parameters of the various technologies, it is necessary to identify the heuristics that will


result in good inputs for the path computation algorithms. All this information is essential to build a better network model. The model should also consider the use of VPNs and networks with multi-layer services, which are another challenge brought by the new network architecture paradigms. There has been research on the use of Virtual Topology Design (VTD) and hierarchical management, which allows users to view, model and interact only with the devices and links (physical or virtual) used for their own traffic, and with different levels of authorization [19]. Since there is packet and circuit switching with several priority levels, traffic engineering can be used to assure that the basic services won't starve while the premium services achieve their goals of QoS, circuit protection, fast restoration, etc. With specific queues for each class of service, it is possible to limit the bandwidth available to any of the provided services, isolating them and preventing all the resources from being allocated only to the high-priority services.
The interaction with the network also raises other issues. How can users, managers or applications interact with the management and control planes? What can be offered to management and configuration systems? Traffic engineering is then necessary to ensure that the best alternatives are chosen, both for creating circuits and for the restoration of the network. A special situation is presented by inter-domain services. In this case, the restrictions on collecting network information should be considered: measurements and topology discovery are usually limited. Traditionally, interconnection agreements and SLAs are made, but the internal information of each domain is hardly available, for reasons of both security and scalability. An alternative is federations, with rules that facilitate inter-operation and the establishment, control and management of end-to-end services.

6 Conclusion
This article is a compilation presenting some of the main topics on the control and management of DCNs, which change the paradigm of IP traffic routing. This new network model, dynamic, hybrid and multi-layer, is being designed to meet the requirements of extremely demanding new applications such as e-science, business, peer-to-peer, social networks and so on.
Our future work will focus on developing a management model for the specific needs of these networks. The operation and interconnection of hybrid and dynamic networks depends on, among other requirements, a management plane that allows traffic measurement and monitoring, keeping the traffic within the required quality limits through the use of traffic engineering. The interconnection of these networks will in turn require the standardization and development of their control and management planes.

References
1. Jesdanun, A.: Internet pioneer will oversee GENI redesign (2007),
http://www.usatoday.com
2. http://www.internet2.edu
3. http://www.geni.net

4. http://cordis.europa.eu/fp7/fire/
5. http://www.geant2.net
6. http://www.canarie.ca
7. http://www.jgn.nict.go.jp/english/index.html
8. Nadeau, T., Rakotoranto, H.: GMPLS Operations and Management: Today's Challenges and Solutions for Tomorrow. IEEE Communications Magazine (July 2005)
9. http://www.internet2.edu/pubs/200710-IS-DCN.pdf
10. Lehman, J.T., Sobieski, J., Jabbari, B.: DRAGON: A Framework for Service Provisioning in Heterogeneous Grid Networks. IEEE Communications Magazine (2006)
11. Sevasti, A.: AutoBAHN - Dynamic circuits across heterogeneous R&E networks. In: RNP X - National Education and Research Network Workshop (2009)
12. http://www.uclp.ca/
13. http://www.itu.int/rec/ - T-REC-M.3000, T-REC-M.3010, T-REC-M.3013 and T-REC-M.3016
14. Meyer, K., et al.: Decentralizing Control and Intelligence in Network Management. In: The Fourth International Symposium on Integrated Network Management (1995)
15. Ji, N., et al.: Monitoring of overlay networks with virtual resources (2010)
16. Garcia, A.: TIAMHAT - Dynamic Provisioning of Connections in Hybrid Networks. Project Future RNP (December 2008)
17. Damianos, G., et al.: A Hybrid Centralised-Distributed Network Management Architecture. In: The Fourth IEEE Symposium on Computers and Communications (1999)
18. Dini, P., et al.: IP/MPLS OAM: Challenges and Directions - A multi-technology, proactive, and autonomic management view. In: Proceedings of the IEEE Workshop on IP Operations and Management (2004)
19. Garcia, A.: TIAMHAT - Dynamic Provisioning of Connections in Hybrid Networks. Project Future RNP (December 2008)

Game Theoretic Approach to Resolve Energy Conflicts in Ad-Hoc Networks
Juhi Gupta, Ishan Kumar, and Anil Kacholiya
Dept. of Electronics and Communication
Jaypee Institute of Information Technology Noida, India
juhi@jiit.ac.in, {ishankumarjiit,anil.jiitn}@gmail.com

Abstract. This paper provides a game theoretic approach to optimize the transmission energy in a wireless ad hoc network and to maintain the connectivity of the network. As nodes have limited power, they may act in a selfish manner in order to minimize their power (energy) consumption. We also suggest a novel method to identify the selfish nodes present in the topology.
Keywords: game theory, Nash equilibrium, payoff.

1 Introduction

1.1 MANET
Mobile ad hoc networks (MANETs) have the ability to provide temporary and instant wireless networking solutions in situations where cellular infrastructures are lacking and are expensive or infeasible to deploy. Due to their inherently distributed nature, MANETs are more robust than their cellular counterparts against single-point failures, and have the flexibility to reroute around congested nodes. Furthermore, MANETs can conserve battery energy by delivering a packet over a multi-hop path that consists of short hop-by-hop links. While wide-scale deployment of MANETs is yet to come, several efforts are currently underway to standardize protocols for the operation and management of such networks. Each device in a MANET is free to move independently in any direction, and will therefore change its links to other devices frequently. Each must forward traffic unrelated to its own use, and therefore be a router. The primary challenge in building a MANET is equipping each device to continuously maintain the information required to properly route traffic. Power optimization is another major issue in MANETs. When more nodes participate in a network, the lifetime of the network increases [1], as more alternative paths become available to forward the data. Nodes in a network may act in a selfish manner by using the resources of the network while not participating in the routing. Identification of such nodes is essential for the proper functioning of the network and for maintaining its connectivity.

1.2 Game Theory Approach

Game theory is concerned with finding the best actions for individual decision makers in such situations and with recognizing stable outcomes. For a game there must be at least two players; each player has a number of possible strategies, and the strategies chosen by the players determine the outcome of the game.
1) Strategic games: In strategic games, the players first make their decisions and then the outcome of the game is determined. The outcome can be either deterministic or contain uncertainties. The actions of the players may take place over a long time period, but the decisions are made without knowledge of the decisions of the other players. A strategic game consists of
- a finite set N (the set of players),
- for each player i ∈ N a nonempty set Ai (the set of actions available to player i),
- for each player i ∈ N a utility function Ui on A = ×j∈N Aj.
The players can choose their actions either from discrete alternatives or from a continuous set. The solution of a strategic game is a Nash equilibrium, a point from which no single player wants to deviate unilaterally.
A Nash equilibrium of a strategic game <N, (Ai), (Ui)> is a profile a* = (a*1, ..., a*N) ∈ A of actions with the property that for every player i ∈ N we have Ui(a*) ≥ Ui(a*1, ..., a*i-1, ai, a*i+1, ..., a*N) for all ai ∈ Ai. Associated with each possible outcome of the game is a collection of numerical payoffs [2][3]; these payoffs represent the value of the outcome to the different players.
2) Non-atomic games: These games model large populations with infinitely many players. A payoff could be defined per strategy profile, but what matters is the fraction of the population that picks each strategy; a model with the following ingredients has been developed for selfish routing:
- a finite number of player types [(Si, Di) pairs],
- finite strategy sets [(Si, Di) paths],
- for each distribution (fraction of the population using each strategy), payoffs corresponding to each strategy: given the traffic pattern, the cost of each path.
Equilibria of non-atomic games are like Nash equilibria, though note that individual deviations do not affect payoffs: the strategies used by a player type have equal cost.

2 Topology
A directed graph is represented by G(v, e), where v is the set of nodes (players) and e ⊆ v × v is the set of edges connecting those nodes. Another set (Si, Di) contains the pairs of source and destination nodes. Pi is the set of paths from Si to Di, and Pe is the power required to support an edge (hop) e. For a transmission, let P ∈ Pi be the path chosen and Pt the power used for the transmission from Si to Di; clearly Pt is the sum of Pe over all edges e of P. So our Nash equilibrium will be the point where the total power is minimized.


Total power = sum of Pt over all pairs (Si, Di).
This power can be minimized if all nodes participate without any selfishness. Selfishness can be defined as nodes not participating in routing, or transmitting packets to nodes with more traffic or that are highly congested [4][5], which leads to the over-utilization of those nodes and may reduce their lifespan. Nodes will participate if they gain more from being in the network than from transmitting directly. Let Pd be the power required for direct transmission and Pr the power used by a node when routing through the network; then, taking into account the cost a node incurs to be part of the network, the node participates if the saving Pd - Pr exceeds that cost.

3 Algorithm
3.1 Setting Up a Dedicated Path
Here, we consider a fixed six-node topology of an ad hoc network. Fig. 1 shows one source node Si, one destination node Di and four intermediate nodes A, B, C, D.

Fig. 1. Six-node topology with Si and Di as source and destination nodes and four intermediate nodes

A strategy profile for each node has been generated as a 3x3 random matrix. For any action (i, j) of a node, say A, the value of the matrix a(i, j) is the payoff to node A for that action. Here, an action represents the power with which the node transmits packets. To establish a communication link, these matrices are generated for every participating node, and equilibrium occurs when the maximum of one node occurs at a position (i, j) where the maxima of the other nodes also occur. The strategy matrix is crucial for keeping track of selfish nodes. The following algorithm is proposed:
a) Each node that will be participating in the transmission generates a 2-D matrix which contains the set of actions. Selection of a particular action decides the outcome for the node. A payoff value is assigned corresponding to each action of the node.



Table 1. Payoff Matrix

                 Node B
    Node A    a11    a12
              a21    a22

b) Each node will cross-verify the matrices of the rest of the nodes for the best action. This value will be considered as the virtual currency [2]. Considering the intermediate nodes between source and destination, the condition can be checked in MATLAB using
if ((A(j,i) == maxA(i)) && (B(j,i) == maxB(i)) && (C(j,i) == maxC(i)) && (D(j,i) == maxD(i)))
where A, B, C, D are the corresponding strategy matrices and maxA = max(A), maxB = max(B), maxC = max(C), maxD = max(D) contain the maximum values of the corresponding strategies. It may occur that a particular node does not participate with its best possible action; such a node is given a certain time to improve its behavior. The payoff values of all the nodes involved in the transmission are obtained from the user. On the basis of the outcome, one value from the various payoff values is selected. Corresponding to this payoff value, there is an energy with which the node will forward transmissions. After all nodes agree to participate at their corresponding energy levels, a dedicated path is formed, and the transmission then starts taking place.
3.2 Identifying the Cheating Node
A cheating node can be identified by maintaining a table for all (Si, Di) pairs, which will be available at every node. Transmission by a node is not counted when it acts as a source.
Table 2.

From node   To node   Power to support edge   Number of times node participated
A           B         10                      0
A           D         9                       4
B           A         10                      0
B           C         8                       0
C           B         8                       0
C           D         12                      5
D           C         12                      4
D           A         9                       6


The first three columns show the topology, whereas the fourth column shows the number of times a node has participated in routing. It can be concluded from the table that when a node does not participate in routing, the value of the fourth column for that particular node stays zero. Based on the involvement or co-operation of the nodes in the network, payoffs or other incentives are given to them. Table 2 is evaluated at every periodic interval, and when the value for a node repeatedly comes out as zero, it means that particular node is not participating in routing; therefore, no payoff is given to it. This is how the path from the source to the destination which consumes the least energy is determined. In our simulation, node B is not participating. This table can also help in minimizing the over-utilization of a node: when the value of any node in the fourth column becomes large, it means that the node is being over-utilized.
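A minimal Java sketch of such a participation table (ours, not the paper's implementation; class and field names are assumed) keeps, for every directed edge, the power needed to support it and a counter that is incremented whenever the node forwards a packet and inspected at each periodic evaluation.

import java.util.HashMap;
import java.util.Map;

public class ParticipationTable {
  // One row of Table 2: the power to support the edge and how many times the
  // "from" node has forwarded traffic over it (transmissions as source excluded).
  static class Row {
    final double powerToSupportEdge;
    int timesParticipated;
    Row(double powerToSupportEdge) { this.powerToSupportEdge = powerToSupportEdge; }
  }

  private final Map<String, Row> rows = new HashMap<>();

  public void addEdge(String from, String to, double power) {
    rows.put(from + "->" + to, new Row(power));
  }

  // Called whenever a node forwards a packet on behalf of some (Si, Di) pair.
  public void recordForwarding(String from, String to) {
    Row row = rows.get(from + "->" + to);
    if (row != null) {
      row.timesParticipated++;
    }
  }

  // Periodic evaluation: a node whose outgoing edges were never used is treated
  // as selfish and earns no payoff; a very large count signals over-utilization.
  public boolean looksSelfish(String node) {
    int total = 0;
    for (Map.Entry<String, Row> e : rows.entrySet()) {
      if (e.getKey().startsWith(node + "->")) {
        total += e.getValue().timesParticipated;
      }
    }
    return total == 0;
  }
}

With the values of Table 2, looksSelfish("B") would return true, matching the simulation in which node B does not participate.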

4 Implementation
The implementation of this novel algorithm is done by using the rand function to generate the strategy matrices. The algorithm has been implemented for 50 games/transmissions. If the
Fig. 2. Graph representing Nash Equilibrium of source node

Fig. 3. Graph representing Nash Equilibrium of destination node


energy of a particular node in the topology gets exhausted, then there will be no more participation of that node in the game/transmission. Fig. 2 and Fig. 3 show the Nash equilibrium points for the source and destination nodes, respectively. Similarly, the Nash equilibria of all the other nodes can be plotted.

5 Conclusion
Game theory is used in many situations where conflict and cooperation exist. In this paper, we propose a game model that can be used to optimize the total energy of the network and to analyze any selfish behavior of the nodes. Using this approach, the route/path which requires the least energy with maximum co-operation among the nodes is determined. If the same node participates again and again to forward packets, then all the paths that go through that particular node will soon be diminished due to the over-utilization of the node in terms of energy; the algorithm described above takes this problem into account. The nodes which participate are provided some payoffs or incentives, and the others which do not co-operate are not allowed to transmit their own packets. Nash equilibrium is used to determine the path which consumes less energy to reach the destination, after taking decisions from the payoff matrices. The game-theoretic strategy can further be applied to determine network parameters like throughput and delay.

References
1. Komali, R.S., MacKenzie, A.B.: Distributed Topology Control in Ad-Hoc Networks:
A Game Theoretic Perspective. In: Proc. IEEE CCNC (2006)
2. Leino, J.: Applications of Game Theory in Ad Hoc Networks. Master's thesis, Helsinki University (October 2003)
3. Xiao, Y., Shan, X., Yongen, Tsinghua University: Game Theory Models for IEEE 802.11
DCF in Wireless Ad Hoc Networks, IEEE Radio Communications (March 2005)
4. Roughgarden, T.: Selfish routing and price of anarchy. Lecture Notes. Stanford University,
Stanford
5. Srinivasan, V., Nuggehalli, P., Chiasserini, C.F., Ramesh, R.R.: Cooperation in Wireless
Ad Hoc Networks. In: IEEE INFOCOM (2003)
6. Narahari, Y.: Game Theory. Lecture Notes, Bangalore, India

Software Secureness for Users: Significance in Public ICT Applications
C.K. Raju and P.B.S. Bhadoria
Indian Institute of Technology, Kharagpur, India
{ckraju,pbsb}@agfe.iitkgp.ernet.in

Abstract. Software secureness as experienced by a user has connotations that imply better control over the information that is getting encoded. It also implies adherence to established protocols by the software and a provision to inspect software sources for coding errors. The significance of some of these issues is evident in some reference manuals on software quality. Software secureness could be treated as a significant constituent of software quality which can be enhanced by altering the properties of the software applications, the software environment and the software implementation of protocols or data standards that are deployed in the software projects. Traditional approaches to software quality often provide a privileged position to developers of software projects, by giving them the freedom to fix the prerequisites and conditions that determine quality. In situations where software serves public interests or needs, software secureness should not differ among the communities that use, develop, test or maintain the same software project. For most services in the public domain, the user community is the one which is constitutionally the most empowered. Utilities that serve public needs may also involve processing of information of the user communities. Therefore, software secureness must be evaluated from the viewpoint of the community of its users, even if it happens to be the least privileged in setting the prerequisites or conditions for software quality. A shift of this nature is necessary because a proprietary software environment may be completely transparent to its developer community, even while remaining opaque or insecure to its user community.

1 Introduction

Software quality is a widely discussed and debated issue, especially in the context of software engineering practices. Software processes differ from most other manufacturing processes in that the products and processes are capable of getting modified, tested or developed by communities of users, maintainers or developers. If one is aware of the constituents that create software quality, then it becomes easier to enhance the quality. This article examines the issue, taking software secureness as a parameter for achieving software quality. It attempts to define and view secureness through the correctness of software sources, the fair implementation of protocols, and the nature of the data formats that software projects use. Initial reviews of software quality measures on software products had prompted

detailed studies on issues like as-is utility, portability and maintainability [4].
The significance of portability and maintainability has also been stressed in another software product quality model suggested by Dromey [6]. This situation presupposed the availability of software as sources and details of protocols, as without access to such sources, maintainability could be ruled out and portability becomes impossible.
Manuals in software quality make adequate references to adherence to use of
data standards, fair implementation of protocols and transparency in coding of
their implementation [2], [3], [1]. These manuals, while laying out specifications for achieving software quality, do not, however, address the dichotomies that arise when quality conditions are applied to the user and developer domains. While developers working within proprietary software establishments can claim software quality by adhering to these manuals, by virtue of having access to software design and development processes, the majority of users, who are outside the purview of the development establishment and who are the major consumers of software, are not in a position to guarantee software quality on their own. This contradictory situation, insofar as software users are concerned, has even prompted the suggestion that open standards without an insistence on open source software would render the whole claim of software quality inadequate [14].
Software sources often undergo changes while catering to evolving demands.
Their management, too, becomes complex when the communities of users, developers or maintainers are allowed varying degrees of permissions or restrictions in their access to software sources. Adherence to established protocols also requires that software is programmed to do what it claims to do. Even more importantly, software should not be doing what it is not supposed to do. The task of validating secureness, therefore, needs a fair amount of programming expertise on the part of
the inspecting agency. It is known that errors or deviations from norms in software can be brought out if more people are allowed to inspect the sources [13].
Therefore, access to software sources is a critical factor in establishing secureness
of software, especially for those software applications serving public interests or
needs.
The properties and nature of data formats used in public applications also
need to be scrutinized for their linkages with software secureness. Data formats
could be either standards or open formats, where ownership has been relinquished. Data formats could also be proprietary, and these may have owners. When ownership of proprietary formats is in private possession, encoded information risks coming under perpetual control, especially if the private owners shun efforts to convert them into legitimate standards by relinquishing ownership.
In draft legislation introduced in the Republic of Peru [15], a few of these issues were raised. It was argued that public agencies have a natural obligation to guarantee the permanent availability of information encoded, processed and stored while engaging with software applications. To ensure this, proprietary formats were considered undesirable for public systems. The significance, as per the bill, was due to the fact that private information of citizens gets processed in such software systems,

and the state as a legitimate custodian of such information has an obligation to safeguard its availability. Hence the usage of data standards or open data formats which do not have owners needs to be a mandatory part of any public software initiative. Software that has its data encoded in open data formats or data standards will have enhanced secureness, as the data could be retrieved or made available at any time in the future.
Accessibility to software sources allows inspection of the sources for fair implementation of software protocols and coding practices. The draft bill promoted the use of Free Software in public institutions [15]. A study [14] on the effectiveness of open standards pointed out that unless the implementation of such standards is carried out with open source projects, a precarious situation
involving vendor lock-in might follow.
In a case taken up for study here, the twin requirements that deal with accessibility to software sources and adherence to established data standards or open
data formats were scrutinized and their suitability examined towards enhancing
software secureness. The public software application is one that monitors a rural
employment guarantee scheme introduced on a national scale in India.
1.1 National Rural Employment Guarantee Scheme (NREGS)

A government initiative to guarantee 100 days of employment on an annual basis to all rural households in India was legislated [7] in 2005. Commissioned as
National Rural Employment Guarantee Scheme (NREGS), the programme was
open to all households whose members were willing to offer unskilled labour.
Though initially the programme was implemented in selected areas, it later got
extended to all rural areas of India. The programme, rechristened as Mahatma
Gandhi National Rural Employment Guarantee Scheme (MGNREGS), continues
to be executed through all local self-government institutions in the Panchayat
Raj System which predominantly addresses rural population. The enactment
was subsequently amended to place all information about the scheme in public
domain through a website. It later became a mandatory requirement [8] for the
purpose of introducing transparency in all the transactions within the system.
This monitoring scheme which has already commenced is planned to be in operation at over 240,000 rural self-government institutions in India. The software
that fulfils this requirement has been developed by the National Informatics Centre
(NIC) and is made available to the rural local self-government institutions. Here,
the data processed at rural local self-government institutions spread across the
country will be received and stored at a central database repository.
NREGASoft, the software developed for monitoring these activities, is capable of operating in online mode as well as in offline mode [12]. In the online mode
of operation, a dedicated internet connection needs to be established between
the local self-government institution and the Ministry of Rural Development
(Govt. of India) which hosts the central server. In the online mode, details of
all activities are updated on a daily basis with the help of a browser application
at the nodes. However, due to the enormity of data, the data-entry operations
which even include marking of the attendance of the workers at the various work sites, are carried out in the offline mode. In the offline mode, data related to MGNREGS are entered by local self-government institutions and updated in a local database repository. Later, at a convenient time or from an alternate location, the incremental updates to the local database are synchronized with the remote central repository, which is housed in the premises of the Ministry of Rural Development, Government of India. NIC has developed a web-server application integrated with a hypertext scripting engine with the central database server [8], which allows the online mode of operation. According to its principal developer [11], the first major award bagged by the project was the Microsoft e-Governance Award 2006.

2 State of Software Sources and Data Formats

On analysis of NREGASoft it was observed that the central server which received
information from rural local bodies was configured using proprietary software. The information received was stored in a database in a proprietary format. The minimum essential configuration for becoming a client of the monitoring network, as per the manual [12], is listed in Table 1.
Table 1. Ownership of Client Software Sources (Offline)

Software      Nomenclature          Owned by
OS            Windows XP SP-2       Microsoft Inc
Web Server    IIS Server            Microsoft Inc
Database      MS SQL Server 2000    Microsoft Inc
Application   NREGASoft             NIC

It can be seen that a single software firm has exclusive ownership over the software environment which embeds the application software developed by NIC during execution. Users of this rural software application do not have access to the software sources that are owned by this software firm, and hence the software secureness of the environment for the users is reduced. For both the offline mode and the online mode of operation, the server configuration is listed in Table 2.
The secureness of the scripts that make up NREGASoft is dependent on access
to its sources. NIC owns and maintains the scripts of NREGASoft. Since these
scripts are made available to local self-government institutions, secureness of
NREGASoft will be dependent on the extent of access to the scripts that is made
available to the users. However, when a software application is embedded inside
an insecure software environment, the software project will become insecure for
its users. In a study carried out by Jones, it was pointed out that at the user end,
it is almost impossible to build meaningful software metrics, even for identifying their inadequacies or highlighting their worthiness as good, bad or missing [9]. The study even went so far as to claim that a metric is hazardous if it is unrelated to


Table 2. Ownership of Server Software Sources

Software      Nomenclature          Owned by
OS            MS Windows Server     Microsoft Inc
Web Server    IIS Server            Microsoft Inc
Database      MS SQL Server 2000    Microsoft Inc
Application   NREGASoft             NIC

real economic productivity. Therefore, for any software project to be completely secure to its users, it should be operated only in an environment that can extend
secureness of any software that is in execution.
From the database description used in the application, it is evident that information related to the public is encoded in a proprietary data format and is opaque to
its users. Deprived of the neutrality that is required in data standards or open
data formats and transparency in implementation of its encoding, the secureness
of data diminishes.
2.1 Secureness through Access to Sources

In NREGASoft, the community of users mostly comprises those from the rural local bodies in India, belonging to different states and union territories.
The developers and maintainers of the application of NREGASoft happen to
be from National Informatics Center (NIC), which is a public agency under the
administrative control of Government of India. The developers of the software
environment of NREGASoft happen to be from a private software firm. In this proprietary software project, it can be seen that the communities of users, developers and maintainers are not the same.
NIC has some definite control over the sources (server scripts) it develops
and maintains. The communities of users, which happen to be the members in
local self government institutions, do not enjoy the same privileges for access to
the sources as that of the maintainers. A proprietary developer of the kernel and similar operating system services may have complete control over
the entire project. This is because user-level software applications get embedded
inside a proprietary operating environment, which can oversee any aspect of its
functioning. A recent study suggested that exposure to software sources would
help in reducing the number of faults, which can be taken as an important factor while creating process metrics [10], but the dilemma of software secureness would continue so long as sources are not made available to the user community.
Secureness of software is directly related to access and control over source
code of the software by the users. The software project may be secure enough to Microsoft Inc., which has access to all the code it develops. NIC's sense of secureness, however, is limited to its control over the sources NIC has developed. A still lesser sense of secureness will prevail among the programmers and other users in
rural local self-government institutions, who may have access to some portions


of the program developed by NIC. For the common rural citizens in whose service the application is created, however, the application can never be declared
secure. This is because there are no legal provisions that facilitate rural citizens
to inspect, test or debug the code or entrust such inspection to third-parties
as brought out in the draft bill introduced in Peru [15]. In a democracy, where the state serves its people, excluding people from accessing software sources is akin to excluding the masters of the state. The secureness of software vis-à-vis ordinary
citizens, whose information is getting processed, is therefore not prominent in
NREGASoft.
2.2 Secureness through Adherence to Data Standards

The software scenario is replete with instances of multiple choices of data formats available for the purposes of storage or processing in certain application domains. Wherever data formats have been declared as data standards or open data formats, it can be presumed that issues over ownership of such data standards
too have been settled. This is primarily because data standards or open data
formats are devoid of owners claiming exclusive rights over such formats. Data
standards or open data formats play a vital role in ensuring interoperability of
encoded data between systems as they become neutral to applications that use
them. Retention of ownership or rights over some or all parts of standards would
dent this neutrality, in the process rendering it a non-standard. Its status then
would be as a proprietary data format.
The scope of discussion on proprietary formats in which data are encoded and
other related protocols used in NREGASoft is limited, as their implementation
details are not available for inspection by any user, other than the firm that
developed it. Additionally, there cannot be a fool-proof mechanism for validating any claims of adherence to protocols, as these are available only in binaries,
mostly in a non-decodable format whose ownership entirely lies with a single
agency. The licensing conditions, under which these utilities are made available
to users, strictly prohibit any attempts to reverse-engineer or decode. Thus, the
existing state of the art is severely limited in its scope for evaluation or scrutiny,
from a technological perspective. The data encoded cannot be guaranteed to
be available permanently [15]. Secureness of the system, therefore, is further
compromised through the usage of proprietary formats and non-verifiable protocols.
Operations from the client side have been categorized into two modes. In the offline mode, a local database is created and updated, from where data is synchronized with the central database server. Most of the software utilities are available only in binary formats. The state of the client in offline mode is almost the same as that of the server. The secureness of the client, therefore, is as poor as the secureness of the server. In the online mode, a web application is used to update the remote database. Here too, the encoding of data for storage in the remote database is carried out in proprietary formats.


The tendency of software secureness to vary can be gauged from the interest shown by the owner of a proprietary format in having it converted into a legitimate standard, relinquishing any kind of ownership. Absence of ownership over any part of the format, if published as a standard and made available for public use, would naturally mean that everyone has an equal share of ownership, enforcing neutrality. In the event of non-neutrality of the encoding process, the format may
need alteration to become a standard. In the case of NREGASoft, Microsoft
Inc currently holds the ownership of proprietary data formats used in its systems. Hence, software secureness is severely restricted with regard to encoding
of information.
2.3 A Framework That Indicates Secureness of Software

In a similar description, one can find a range associated with the accessibility of software. At one end of the spectrum is making software source code available to the community that uses it, with the freedom to inspect, modify and publish. At the other end of the spectrum lies software extended as binaries, with two different variants. One variant is software binaries with access to their respective sources, with varying degrees of freedom over their usage to inspect, modify, alter, distribute or publish, as is the case with Free Software or other Open Source projects. The other is the extension of mere binaries of software with no access to their sources, denying the user community the ability to build the binaries from their respective sources. Thus, inspection, modification, alteration etc. are
impossible. A framework that adequately represents this model of arrangement
is produced in Figure 1.
The first and fourth quadrants deal with sources of software code, with a difference. While sources and standards are available in the public domain in the case of the first quadrant, the sources and data formats used in the fourth quadrant are available only within the proprietary establishments that develop them. The secureness in the first quadrant is the highest, and is enjoyed by users, developers, maintainers and testers. The secureness of software in the fourth quadrant is, however, enjoyed only by developers and testers of the proprietary software. Since users of proprietary software deal only with software binaries and
proprietary formats, secureness of software is absent for users of proprietary
software.
Cases that deal with standards and binaries (with no access to source code)
as well as cases that deal with proprietary formats and binaries (with access to
source code) are both deviations from the usual norm, and hence their representation is not considered important in this framework. Nor are they testable, by virtue of having binaries or proprietary formats, often legally protected from any detailed scrutiny. NREGASoft as a product, independent of its environment, lies in the third quadrant, and the software environment that facilitates its functioning lies in the fourth quadrant. It is pertinent to note that users of NREGASoft
have their software secureness seriously compromised.
This analysis sets off an inquiry into whether it is possible to elevate the secureness of software, and if so, what the conditions are that favour this transition.


Fig. 1. A framework highlighting differing environments for Users, Developers and Maintainers

3 Software Monitoring Application with Enhanced Secureness

A scalable prototype for a local database management system that captures and
stores information pertaining to MGNREGS was developed using Free Software
applications during late 2009 and early 2010. The following software components
described in Table 3 were deployed.
Table 3. Alternate Software Specifications

Software                     Nomenclature
Operating System             GNU/Linux Ubuntu 9.10
Web Server                   Apache 1.3.42
Database                     MySQL 5.1.31
Webserver Scripts            PHP 5.2.12
Content Managing Software    Drupal 6.15

A scalable prototype was developed with Drupal and the essential functions of
a work-activity were captured and made available as reports. Assessment of work
requirements and its processing was carried out at Panskura-I, a block panchayat
in East Medinipore district, West Bengal. Information generated through the reports validated the functional aspects of the prototype at the developers'


level. In a rural application developed with Free Software, the transparency of the solution would be highest if the rural citizens are allowed to inspect the code that processes their information. This meant that merely making Free Software packages available over an operating system built from Free Software is inadequate. The entire sources of the database that created the application also need to be
made transparent.
The new conditions made publishing the Structured Query Language (SQL) database dump of the application under the GNU General Public License (GPLv3) imperative. A mere replication of a database, too, is inadequate if inspections are to be carried out. The metadata design pertaining to the database that processed all work activities of MGNREGS was made part of the original
design. This meant that the application displayed, as part of its features, the
database design too, with relations, entity relationship diagrams and detailed
description of every attribute in all relations, and the forms that invoke these
relations. Moreover, all future transactions that are committed on this database
would also retain the same openness and transparency, when copied for distribution.
An SQL dump would then make not only the data captured through the application available for inspection, but also the semantics of its usage. Since
access privileges are controlled by the MySQL database which is separated from
the database meant for storing information related to MGNREGS, unauthorized
intrusions are blocked. Releasing the SQL dump under a GNU General Public
License would ensure that every amendment incorporated would need to be
published if the solution is made available to another party. These measures
would in no way affect the operational capabilities of the monitoring software and would enhance the relationship between software design quality, development effort and governance in open source projects, as examined in a study [5].
Rather, it would reinforce the requirements for transparency in all its operational
details, which had been a condition for setting up an information processing and
monitoring system [8].
The new way for replication was, thus, to install all the packages mentioned
in Table 3 above, superimpose the SQL dump of backed up Drupal application
database and install the MGNREGS database in MySQL. The entire application would be recreated, one that would not only have the application but also contain the design aspects of the database. By implementing such a design, the secureness of software was enhanced, with ease of reproduction for the purpose of analysis or modification. The authentication codes were the only elements that
were not part of the transparent package, for obvious reasons. For developers,
updating of software tools as and when new releases are distributed is essential,
as most Free Software projects evolve continuously. A new challenge, therefore,
would be to make available newer releases of software to all the nodes. A version control system would ensure seamless integration of the application to any
versions of the software environment.
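As an illustration of this replication procedure, the sketch below restores a published GPLv3-licensed SQL dump onto a locally installed MySQL server using the standard mysql command-line client. The database name, dump file name and credentials are hypothetical, and the Drupal and PHP installation steps from Table 3 are assumed to have been completed separately.

# Illustrative only: superimposing the published SQL dump (hypothetical file
# name) onto a local MySQL instance, as described in the replication procedure.
import subprocess

DB_NAME = "mgnregs"                       # hypothetical database name
DUMP_FILE = "drupal_mgnregs_dump.sql"     # hypothetical GPLv3-licensed dump

# Create the database, then load the dump into it (password prompted by mysql).
subprocess.run(["mysql", "-u", "root", "-p", "-e",
                f"CREATE DATABASE IF NOT EXISTS {DB_NAME}"], check=True)
with open(DUMP_FILE, "rb") as dump:
    subprocess.run(["mysql", "-u", "root", "-p", DB_NAME],
                   stdin=dump, check=True)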
Software projects may involve user, developer, tester and maintainer communities. Here, one can find that the privileges of all the communities are almost


Fig. 2. Varying software secureness in different software environments

the same with regard to access to software sources, which is crucial to ascertain
adherence to established protocols and adherence to open data formats or data
standards. The privileges of user communities, here, are better than those in
NREGASoft. For the user community, secureness of software has been enhanced
in the Free Software application when compared to the secureness of NREGASoft.
To enhance the software secureness of NREGASoft, therefore, the conditions
require that the solution be re-designed in a Free Software environment. Additionally, the proprietary formats in which encoding of public information is
currently being carried out are to be abandoned in favour of open data formats or data standards, devoid of any ownership. To ensure that the application
scripts too can never be closed for inspection, they too should be released under an open public license that prevents their closure in the future. By having the software secureness of NREGASoft enhanced considerably for the user community, it can be safely presumed that software quality too would be improved, as depicted in Fig. 2.
The authors would like to point out that while this software development work
merely validates the claim that secureness of software with respect to the user
community can be enhanced, the study does not claim that such development
work is beyond the capabilities of private software development companies. On
the contrary, the authors may even recommend entrusting such development
work to leading software developers in the private sector in India to make use
of their vast experience and access to human resources. This study, however,
accords priority to the licenses under which the transfer of rights to software and sources ought to take place, which would reveal the extent of secureness of software to its users.


4 Conclusion

As a principle, software quality is associated with adherence to the use of data standards, fair implementation of protocols and transparency in the coding of their implementation. Software that adheres to these criteria extends secureness to the users, developers and maintainers of the software. In many software projects, especially those that process information related to public citizens, the communities of developers, maintainers and users could be different. There exist possibilities wherein software which may appear secure to the developer community could become insecure to the user community. Software which is released only as binaries cannot be verified for its adherence to data standards, protocols or the rules associated with its implementation.
It is therefore vital to ensure that any software that is to be assured for its quality adheres to established data standards or published open formats (after
relinquishing ownership, so that these could be taken up for converting into a
standard). Additionally, releasing the software sources would ensure that implementation details of software are transparent and do not violate any existing
protocols. Rigorous methods of control have been suggested in a software quality management system adopted from Standards Australia [2], which insisted on
review of code and documents to assure their compliance with design criteria.
Additionally, in a constitutional setup under which such public software services
are developed, operated and maintained, the user community is the one which
is constitutionally the most empowered. Therefore in cases like these, software
secureness should be evaluated from the viewpoint of users to ascertain software
quality.
NREGASoft, a software implementation for monitoring information processed in the employment guarantee scheme (MGNREGS), is found wanting in areas
related to data standards and transparency in implementation as the current
environment and software platforms are proprietary in nature. In order to enable the government to extend the necessary guarantees over the processing of information related to the public, adherence to published protocols and its encoding, NREGASoft should be re-designed to be implemented with Free Software
using published data standards. This variation in design and implementation
would eventually enhance the software secureness to the user community of the
software, thereby accomplishing better software quality. The experiment carried
out with Free Software as a case study by the authors further exemplifies that by resolving to release the database dump under a GNU General Public License (GPLv3), the legal mechanisms would help in retaining the transparency of implementation in the future too.

References
1. IEEE Guide for Software Quality Assurance Planning. ANSI/IEEE Std 983-1986, 1–31 (1986)
2. IEEE standard for Software Quality Assurance Plans. IEEE Std 730.1-1989, 01
(1989)


3. Software Quality Management System. Part 1: Requirements. Adopted from Standards Australia. IEEE Std. 1298-1992; AS 3563.1-1991, 01 (1993)
4. Boehm, B., Brown, J., Lipow, M.: Quantitative evaluation of software quality. In: Proceedings of the 2nd International Conference on Software Engineering, pp. 592–605. IEEE Computer Society, Los Alamitos (1976)
5. Capra, E., Francalanci, C., Merlo, F.: An empirical study on the relationship between software design quality, development effort and governance in open source projects. IEEE Transactions on Software Engineering 34(6), 765–782 (2008)
6. Dromey, R.G.: A model for software product quality. IEEE Transactions on Software Engineering 21, 146–162 (1995)
7. Government of India: The National Rural Employment Guarantee Act NREGA 2005. Government Gazette, India (2005)
8. Government of India: Government Notification on Transparency in NREGA. Government Gazette, India, p. 9 (2008)
9. Jones, C.: Software Metrics: Good, Bad and Missing. IEEE Computer 27(9), 98–100 (1994) ISSN 0018-9162
10. Khoshgoftaar, T.M., Liu, Y., Seliya, N.: A multiobjective module-order model for software quality enhancement. IEEE Transactions on Evolutionary Computation 8(6), 593–608 (2004)
11. Madhuri, S., Mishra, D.: Strengthening National Rural Employment Guarantee Scheme (NREGS) through E-Governance. In: E-Governance in Practice (2008)
12. NIC, Government of India: User manual of NREGA. MIS for National Rural Employment Guarantee Act (NREGA) 2005 (2007)
13. Raymond, E.S.: The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. O'Reilly, Sebastopol (2001)
14. Tiemann, M.: An objective definition of open standards. Computer Standards and Interfaces 28(5), 495–507 (2006) ISSN 0920-5489
15. Villanueva, E.: Use of Free Software in Public Agencies. Bill No 1609, Republic of Peru (2001)

Vector Space Access Structure and ID Based Distributed DRM Key Management

Ratna Dutta, Dheerendra Mishra, and Sourav Mukhopadhyay
Department of Mathematics
Indian Institute of Technology
Kharagpur 721302, India
{ratna,dheerendra,sourav}@maths.iitkgp.ernet.in

Abstract. We present an effective DRM architecture with multiple distributors that facilitates client mobility, and propose a family of flexible key management mechanisms for this system coupling Identity-Based Encryption (IBE) with vector space secret sharing. Our proposed DRM architecture provides scalability of the business model and allows proper business strategies to be devised for different regions and cultures. The encrypted digital content sent by a package server can only be decrypted by the DRM client and is protected from attacks by other parties/servers in the system. Our key management protects the key used to encrypt a digital content during its delivery from the package server to the DRM client, not only from purchasers but also from the distribution servers and the license server. The IBE enables efficiency gains in computation time and storage over the existing certificate-based Public Key Infrastructure (PKI) based approaches, as no certificate management and verification is needed by the entities in the system.
Keywords: DRM, key management, content protection, security, vector
space secret sharing, IBE.

1 Introduction

The widespread use of the Internet has greatly facilitated the distribution and
exchange of information. Immediate access to content with low-cost delivery
is one of the new benefits Internet-based distribution brings. However, digital
content by nature is highly vulnerable to unauthorized distribution and use.
This raises issues regarding intellectual property and copyright. Once content has been delivered, no further protection is applied to it. While these new
technologies have the potential to open up new markets, the risk of abuse makes
copyright owners reluctant to use them.
Digital Rights Management (DRM) technologies ensure the protection of digital content after distribution, providing ways to exercise usage control on that
content. The goal of DRM technology is to distribute digital contents in a manner that can protect and manage the rights of all parties involved. The core
concept in DRM is the use of digital licenses. The consumer purchases a digital license granting certain rights to him instead of buying the digital content.

The content access is regulated with the help of a license that contains permissions, constraints and a content decryption key. Permissions are privileges or
actions that a principal can be granted to exercise against some object under
some constraints. Examples of permissions include printing, playing, copying,
and embedding the content into other content items. Constraints are restrictions and conditions under which permissions are executed. Constraints may
include expiration date, available regional zone, software security requirements,
hardware security requirements, and watermarking requirements. A set of constraints can also include another set of constraints recursively, which means that
the included set of constraints must also be satisfied.
Current Digital Rights Management (DRM) systems support only two-party
systems, involving the package server and purchaser [10], [2], [13], [7], [3]. However, DRM systems need to be sufficiently flexible to support existing business
models and extensible to adapt to future models. The DRM architecture in
multi-party multi-level setups has been used [8], [11], [14], [15] as an alternative
to the traditional two-party DRM architecture.
Our Contribution: In this paper, we design a DRM system which is suitable for more innovative and scalable business models, considering a network with multiple distributors instead of a single distributor. A local distributor can better explore potentially unknown markets for the owner (package server) and make strategies according to the market. In addition, the distributors can also help in handling different pricing structures of media in different countries, and share with the owner any information on price or demand fluctuation. In our DRM system, the DRM client has the flexibility of choosing a distributor based on his own preference. The DRM client may be mobile and roam from one region to another. The DRM client may contact the distributor who is nearest to him by location or who offers promotions/discounts on the price or offers
more commissions.
We provide a secure and efficient key management scheme in our proposed DRM system using IBE [17] instead of certificate-based Public Key Infrastructure (PKI), coupling it with a vector space secret sharing scheme. IBE has the property that a user's public key is an easily calculated function of his identity, such as his email address, while a user's private key can be calculated for him by a trusted authority, called the Private Key Generator (PKG). The identity-based public key cryptosystem needs verification of a user's identity only at the private key extraction phase. Consequently, identity-based public key cryptography simplifies certificate management and verification and is an alternative to certificate-based PKI, especially when efficient key management and security are required. We obtain efficiency gains in computation time and storage over the existing certificate-based PKI approaches as no certificate management and verification are needed by the entities in our DRM system. Moreover, our construction is general as it uses a general monotone access structure and vector space secret sharing. This facilitates constructing a family of flexible key
distribution schemes.


In our key management mechanism, the package server does not trust distribution servers or license server. The symmetric decryption key used to encrypt
a digital content is delivered from the package server to the DRM client in a
secure manner and is protected from its generation to consumption. Unlike current DRM systems which have focused on content protection from purchasers,
our scheme protects the key not only from purchasers, but also from other principals such as the distribution servers and the license server. Consequently, the
encrypted digital content sent by a package server can only be decrypted by the
DRM client who has a valid license and no one else.

2 Preliminaries

2.1 Common Components in DRM System

Despite different DRM vendors having different DRM implementations, names and ways to specify the content usage rules, the basic DRM process is the same.
The entities involved in a DRM system are a package server, distribution server,
license server and DRM client [12], [9]. In this model, a purchaser is not a service
provider, he simply pays a fee to the DRM client and watches a movie or listens
to a song.
2.2 Certificate-Based vs. Identity-Based Cryptography

Certificate-based protocols work by assuming that each entity has a static (long term) public/private key pair, and each entity knows the public key of each other entity. The static public keys are authenticated via certificates issued by a certifying authority (CA) by binding users' identities to static keys. When
two entities wish to establish a session key, a pair of ephemeral (short term)
public keys are exchanged between them. The ephemeral and static keys are
then combined in a way so as to obtain the agreed session key. The authenticity
of the static keys, provided by the signature of the CA, assures that only the entities who possess the static keys are able to compute the session key. Thus the problem of
authenticating the session key is replaced by the problem of authenticating the
static public keys which is solved by using CA, a traditional approach based on
a Public Key Infrastructure (PKI).
However, in a certificate-based system, the participants must first verify the certificate of a user before using that user's public key. Consequently, the
system requires a large amount of computing time and storage.
In identity-based public key encryption, the public key distribution problem is
eliminated by making each user's public key derivable from some known aspect of his identity, such as his email address. When Alice wants to send a message to Bob, she simply encrypts her message using Bob's public key, which she derives from Bob's identifying information. Bob, after receiving the encrypted message, obtains his private key from a third party called a Private Key Generator (PKG), after authenticating himself to the PKG, and can then decrypt the message. The private key that the PKG generates on Bob's query is a function of its master key and Bob's identity.


Shamir [17] introduced the concept of an identity-based cryptosystem to simplify key management procedures in certificate-based public key infrastructure. The first pairing-based IBE scheme was proposed by Boneh and Franklin in 2001. Shortly after this, many identity-based cryptographic protocols were developed based on pairings (see [4] for a survey), and this is currently a very active area of research. The identity-based public key cryptosystem can be an alternative to certificate-based PKI, especially when efficient key management and moderate security are required.
The advantages of ID-based encryption are significant. It makes maintaining authenticated public key directories unnecessary. Instead, a directory for authenticated public parameters of PKGs is required, which is less burdensome than maintaining a public key directory since there are substantially fewer PKGs than total users. In particular, if everyone uses a single PKG, then everyone in the system can communicate securely and users need not perform on-line lookup
of public keys or public parameters.
In an ID-based encryption scheme there are four algorithms (i) Setup:
Creates system parameters and master key, (ii) Extract: Uses master key to
generate the private key corresponding to an arbitrary public key string ID,
(iii) Encrypt: Encrypts messages using the public key ID, and (iv) Decrypt:
Decrypts the message using the corresponding private key of ID.
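For concreteness, the four algorithms can be expressed as the following interface skeleton. This is an illustrative sketch only; it does not implement any particular IBE construction (such as Boneh-Franklin), and all names are assumptions introduced here.

# Illustrative interface only: the four algorithms of an IBE scheme expressed
# as an abstract Python class. No concrete pairing-based construction is given.
from abc import ABC, abstractmethod

class IdentityBasedEncryption(ABC):
    @abstractmethod
    def setup(self):
        """Create the public system parameters and the PKG's master key."""

    @abstractmethod
    def extract(self, master_key, identity: str):
        """Derive the private key corresponding to a public identity string."""

    @abstractmethod
    def encrypt(self, params, identity: str, message: bytes) -> bytes:
        """Encrypt a message under the receiver's public identity."""

    @abstractmethod
    def decrypt(self, params, private_key, ciphertext: bytes) -> bytes:
        """Decrypt a ciphertext with the private key extracted for that identity."""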
2.3 Secret Sharing Schemes

Definition 2.1: (Access Structure) Let U = {U1, . . . , Un} be a set of participants and let D ∉ U be the dealer or group manager. A collection Γ ⊆ 2^U is monotone increasing if B ∈ Γ and B ⊆ C ⊆ U imply C ∈ Γ. An access structure is a monotone increasing collection Γ of non-empty subsets of U, i.e., Γ ⊆ 2^U \ {∅}. The sets in Γ are called the authorized sets. A set B is called a minimal set of Γ if B ∈ Γ and, for every C ⊆ B with C ≠ B, it holds that C ∉ Γ. The set of minimal authorized subsets of Γ is denoted by Γ0 and is called the basis of Γ. Since Γ consists of all subsets of U that are supersets of a subset in the basis Γ0, Γ is determined uniquely as a function of Γ0. More formally, we have Γ = {C ⊆ U : ∃B ⊆ C, B ∈ Γ0}. We say that Γ is the closure of Γ0 and write Γ = cl(Γ0). The family of non-authorized subsets Γ̄ = 2^U \ Γ is monotone decreasing, that is, if C ∈ Γ̄ and B ⊆ C ⊆ U, then B ∈ Γ̄. The family of non-authorized subsets is determined by the collection of maximal non-authorized subsets Γ̄0.
Example. In the case of a (t, n)-threshold access structure, the basis consists of all subsets of exactly t participants, i.e., Γ = {B ⊆ U : |B| ≥ t} and Γ0 = {B ⊆ U : |B| = t}.
Definition 2.2: (Vector Space Access Structure) Suppose Γ is an access structure, and let (Zq)^l denote the vector space of all l-tuples over Zq, where q is prime and l ≥ 2. Suppose there exists a function π : U ∪ {D} → (Zq)^l which satisfies the property: B ∈ Γ if and only if the vector π(D) can be expressed as a linear combination of the vectors in the set {π(Ui) : Ui ∈ B}. An access structure Γ is said to be a vector space access structure if it can be defined in the above way.
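As an illustration (not taken from the paper), the (t, n)-threshold structure of the example above is a vector space access structure under the standard assignment sketched below; the prime q and the evaluation points are arbitrary choices.

# Illustrative sketch: the (t, n)-threshold structure as a vector space access
# structure. pi(D) = (1, 0, ..., 0) and pi(U_i) = (1, x_i, x_i^2, ..., x_i^{t-1})
# for distinct non-zero x_i in GF(q). Any t of the pi(U_i) form a Vandermonde
# system and hence span pi(D); fewer than t do not, so B is authorized iff |B| >= t.
q = 2**61 - 1          # an arbitrary prime defining GF(q)
t = 3                  # threshold

def pi_dealer():
    return [1] + [0] * (t - 1)

def pi_participant(x_i):
    return [pow(x_i, j, q) for j in range(t)]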

3 Protocol

3.1 Overview of the Proposed DRM Architecture

Entities involved in our DRM model are: package server P , n distribution servers
D1 , . . . , Dn , license server L, DRM client C. The package server P appoints n
distribution servers D1, . . . , Dn in different regions to facilitate the distribution
process. The DRM client C is mobile and moves from one region to another. C
can download encrypted contents from its preferred distributor, say Di , which
might be location wise nearest to C. The owner of the package server P has raw
content and wants to protect it. None of the principals except P and the DRM
client with a valid licence should know how to decrypt the content.
3.2 Overview of the Proposed Key Distribution

The commonly used cryptographic primitives in DRM systems are symmetric and public key encryption, digital signatures, one-way hash functions, digital certificates etc. A high-level description of our proposed key distribution scheme and the implementation specifications are provided below.
Symmetric key algorithm: Symmetric encryption is used to encrypt raw digital
content with a content key by the package server P to prevent illegal copying of
digital content. We can make use of any existing symmetric key algorithm (such
as DES-CBC).
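A minimal sketch of this packaging step is shown below. The paper names DES-CBC only as one possibility; the sketch instead uses AES-CBC from the third-party `cryptography` package purely for illustration, and the key handling shown is hypothetical.

# Illustrative only: packaging raw content M under a symmetric content key.
# AES-CBC with PKCS7 padding is used here for illustration; the content key
# must be 16, 24 or 32 bytes.
import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def package_content(content_key: bytes, raw_content: bytes) -> bytes:
    iv = os.urandom(16)                                    # fresh IV per content
    padder = padding.PKCS7(128).padder()
    padded = padder.update(raw_content) + padder.finalize()
    encryptor = Cipher(algorithms.AES(content_key), modes.CBC(iv)).encryptor()
    return iv + encryptor.update(padded) + encryptor.finalize()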
Public key algorithm: We split the content key into several partial content keys.
These partial content keys are delivered using public key encryption to the license
server L and the distribution servers D1 , . . . , Dn in such a way that neither the
distribution servers nor the license server can generate the content key. Public key
algorithm is also used to encrypt the digital license containing the partial content
keys using the public key of the receiver, thereby enabling only the party holding
the matching private key to extract the partial content keys. The party then can
reassemble these partial content keys to compute the original content key and
get access to the digital content. The components of our proposed DRM system
which have a content decryption key are the package server P and the DRM client
C with a valid license. It is very difficult to authenticate a purchaser. Purchasers are concerned about their privacy and anonymity. They simply need to pay a fee to watch a movie. Instead, the DRM client C is a service provider to the purchaser and should be authenticated by the owner of the package server P. RSA-2048 is a widely used public key encryption algorithm. To mitigate the bandwidth overhead, among the several public key cryptosystems one may adopt Elliptic Curve Cryptography (ECC) [5], [1] due to its acceptable overhead. In our public key distribution, we use the setup of Identity-Based Encryption (IBE) instead of a certificate-based setup to simplify certificate management and verification. A
trusted PKG generates the private key of a server upon receiving its public
identity (which may be some known aspect of its identity, such as its e-mail
address). We use the private/public key pair thus generated for each entity in
the system as the respective signing/verification key pair of the corresponding
entity.


Digital signatures: We use digital signatures for non-repudiable rights issuing. The license server digitally signs licenses of the digital content. Consequently, the play application on the DRM client's device can verify the correctness of the usage rights and keep the signature as a proof of rights purchase. The signature scheme ECC-192 provides a higher security level than RSA-1024, while the length
of its signature is 48 bytes compared to 128 bytes of RSA-1024 [5].
3.3 Secure Delivery of Content Key

We now describe in detail our proposed key distribution scheme.


1. Setup:
1.1) The principals of the package server P, the distribution servers Di, 1 ≤ i ≤ n, and the license server L submit their public identities to the PKG and obtain the corresponding private keys SP, SDi, 1 ≤ i ≤ n, and SL respectively through a secure communication channel. The PKG uses its master key MK to generate the principals' private keys after verifying the validity of the principals' public identities submitted to the PKG.
1.2) The principal of the DRM client C submits its public identity IDC to the principal of the package server P and obtains the corresponding private key SC through a secure communication channel. P uses its own private key SP issued by the PKG to generate the private key of C after verifying the validity of C's public identity IDC submitted to P.
2. Key Delivery when Packaging the Content:
The package server P creates the content key K to encrypt a raw digital content M using symmetric encryption while packaging M. P splits the content key K and distributes a different part of K to each of the license server L and the distribution servers Di, 1 ≤ i ≤ n. These servers in turn keep their respective partial content keys secret. We describe below the generation of the content key, the procedure of splitting the content key and the delivery of the partial content keys to the different servers.
2.1) Let U be a set of N servers with P ∉ U. Select n < N and take a subset {D1, . . . , Dn} of distribution servers from U. All the operations take place in a finite field GF(q), where q is a large prime number (q > N). We consider a vector space secret sharing scheme realizing some access structure Γ over the set U. Suppose there exists a public function π : U ∪ {P} → GF(q)^l satisfying the property π(P) ∈ ⟨π(Ui) : Ui ∈ B⟩ ⇔ B ∈ Γ, where l is a positive integer. In other words, π(P) can be expressed as a linear combination of the vectors in the set {π(Ui) : Ui ∈ B} if and only if B is an authorized subset. Then π defines Γ as a vector space access structure.
2.2) The package server P first chooses uniformly at random a vector v ∈ GF(q)^l and computes the content key K = v·π(P).
2.3) For 1 ≤ i ≤ n, P computes YDi = EncIDDi(v·π(Di)) using Di's public identity IDDi, generates the signature Y′Di = SigSP(YDi) using P's own private key SP and sends YDi|Y′Di to Di.
2.4) P chooses randomly a subset W ⊆ U \ {D1, . . . , Dn} such that W ∈ Γ̄0, i.e. W is a maximal non-authorized subset with respect to Γ with minimal cardinality. P generates the set S = {(Uk, v·π(Uk)) : Uk ∈ W}. P computes
YL = EncIDL(S) using L's public identity IDL and the signature Y′L = SigSP(YL) using P's own private key SP, and sends YL|Y′L to L.
2.5) For 1 ≤ i ≤ n, Di, on receiving YDi|Y′Di, verifies the signature Y′Di on YDi using P's public identity IDP. If verification succeeds, i.e. VerIDP(YDi, Y′Di) = true, then Di decrypts YDi using its private key SDi, recovers v·π(Di) = DecSDi(YDi) and stores v·π(Di) in its secure database.
2.6) L, upon receiving YL|Y′L, verifies the signature Y′L on YL using P's public identity IDP. If verification succeeds, i.e. VerIDP(YL, Y′L) = true, then L decrypts YL using its private key SL and recovers S = DecSL(YL), where S is the set given by S = {(Uk, v·π(Uk)) : Uk ∈ W}. L stores S in its secure database.
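A minimal sketch of this packaging-time splitting, using the threshold instance of the mapping π described earlier, is given below; the field size, threshold and server labels are hypothetical and not prescribed by the scheme.

# Illustrative sketch of step 2 using the (t, n)-threshold instance of the
# vector space access structure: pi(P) = (1,0,...,0), pi(U) = (1, x, ..., x^{t-1}).
# Server labels, field size and evaluation points are hypothetical.
import secrets

q = 2**61 - 1                 # a prime defining GF(q)
t = 3                         # authorized sets need t shares
x = {"D1": 1, "D2": 2, "D3": 3, "W1": 4, "W2": 5}   # one point per server

v = [secrets.randbelow(q) for _ in range(t)]        # random vector v in GF(q)^t
K = v[0]                                            # K = v . pi(P) = v[0]
share = {s: sum(v[j] * pow(xi, j, q) for j in range(t)) % q
         for s, xi in x.items()}                    # v . pi(U_i) for each server
# Each distributor D_i keeps share["Di"]; the license server stores the shares
# of the maximal non-authorized set W = {W1, W2} (t - 1 = 2 of them).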
3. Key Delivery when Content Service is Provided:
Suppose a DRM client C requests the content service for encrypted content
M from a distribution server, say Di , which is within nearest reach to C. The
following steps are executed.
3.1) Di computes YC = EncIDC(v·π(Di)) using C's public identity IDC and the signature Y′C = SigSDi(YC) using Di's private key SDi, and sends YC|Y′C to L.
3.2) L, on receiving YC|Y′C, verifies the signature Y′C on YC using Di's public identity IDDi. If verification succeeds, i.e. VerIDDi(YC, Y′C) = true, L computes YL = EncIDC(S) using C's public identity IDC and the signature (YC|YL)′ = SigSL(YC|YL) using L's own private key SL, and issues the license that contains YC|YL|(YC|YL)′ together with rights, the content URL, and other related information.
3.3) The DRM client C analyzes the licence issued by L and verifies (YC|YL)′ on YC|YL using L's public identity IDL. If verification succeeds, C decrypts YC and YL using its own private key SC, and extracts the partial content keys v·π(Di) = DecSC(YC) and S = DecSC(YL), where S = {(Uk, v·π(Uk)) : Uk ∈ W}. C then reassembles these partial content keys and extracts the original content key as follows. Since W ∈ Γ̄0 and Di ∉ W, the set B = W ∪ {Di} ∈ Γ. Thus B is an authorized subset and one can write π(P) = Σ{k:Uk∈B} λk π(Uk) for some λk ∈ GF(q). Hence C knows λk and v·π(Uk) for all Uk ∈ B and consequently can compute Σ{k:Uk∈B} λk (v·π(Uk)) = v·(Σ{k:Uk∈B} λk π(Uk)) = v·π(P) = K. Finally, C decrypts the encrypted content using the recovered content key K and can view (playback) M.
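The client-side reassembly of step 3.3 can be sketched for the same threshold instance as follows; the share values shown are hypothetical placeholders for the decrypted partial content keys, and the coefficients λk reduce to Lagrange coefficients at 0 in this instance.

# Illustrative sketch of the client's reconstruction in step 3.3 for the
# threshold instance: with B = W ∪ {Di}, the coefficients lambda_k are the
# Lagrange coefficients at 0, so K = sum_k lambda_k * (v . pi(U_k)) mod q.
# The numeric share values below are hypothetical stand-ins.
q = 2**61 - 1
B = {1: 11111, 4: 44444, 5: 55555}   # {x_k: v . pi(U_k)} for U_k in W ∪ {D1}

K = 0
for xk, sk in B.items():
    lam = 1
    for xj in B:
        if xj != xk:
            lam = lam * xj % q * pow(xj - xk, -1, q) % q   # Lagrange coeff. at 0
    K = (K + lam * sk) % q
# K now equals v . pi(P), i.e. the original content key, whenever the shares
# are consistent with a single random vector v chosen by the package server.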

4 Security Analysis

We design our key management scheme keeping in mind the following specific
security objectives.
1. Preventing insider attacks: Raw content should not be exposed to unintended parties with the help of an insider.
2. Minimizing attacks by outsiders: Unauthorized outsiders should not illegally
obtain the content keys.
3. Protecting distribution channels for content key/license: The security of the
following two distribution channels should be ensured.
- the distribution channel between the distribution servers and the license
  server to transport the content key


- the distribution channel between the DRM client, the distribution servers
  and the license server to transport the license.
An attack on the (n + 1) partial content keys of the original content key
K (which is used in symmetric key encryption for content protection by the
package server) during delivery from the package server P to the distribution
servers D1 , . . . , Dn and the license server L is prevented, because each piece of
the (n+1) partial content keys of K is encrypted under a public key and delivered
to a server who owns the matching private key. The (n + 1) partial content keys
of K are separated and stored at different servers in such a way that neither any of the distribution servers D1, . . . , Dn nor the license server L has a sufficient number of partial content keys to generate the original content key K by itself. The content key K is protected from an attack on the distribution servers or the license server, since the (n + 1) partial content keys of K are stored at different servers so that each server knows an insufficient number of partial content keys to
extract the original content key K.
Moreover, since a distribution server encrypts its partial content key of K
with the DRM client's public key and sends it to the license server, the license server cannot decrypt it and consequently cannot generate the original content key K. The license server also encrypts its partial content key of K using the DRM client's public key. Thus the partial content keys of K can only be decrypted by the DRM client who has the matching private key and no one else. The DRM client gets sufficient partial content keys after decryption and combines them to
recover the original content key K.
In summary, we achieve the following.
1. By splitting the content key, each of the distribution servers has a distinct
partial content key. Thus if an insider attack on a server is successful, the partial
content key obtained in the attack is insufficient to decrypt the DRM-enabled
content.
2. For an outside attack to succeed, the attacker must break into the license
server and any distribution server to obtain sufficient partial content keys. Thus
the proposed scheme achieves multi-party security.
3. We use IBE and digital signature schemes to protect the content key/license
distribution channel from impersonation attacks, replay attacks, man-in-the-middle attacks etc. Therefore, the security of the content key/license distribution channel depends on the security of the mechanisms, IBE and digital signatures, used for the key management.
4. Note that the content keys in the license file are transmitted to the client module under encryption with the client module's public key. Consequently, entities other than the client module cannot retrieve the content key even when they have obtained the license file.

5 Performance Analysis

The process of authentication or verification of the identities of the parties is necessary in a DRM system to ensure that the packaged digital content is from
the genuine authorized content distributor. In our design, digital certificates are not used to authenticate or verify the identity of the parties involved in the system, unlike certificate-based public key infrastructure, thus saving a large amount of computing time and storage. Instead, we use IBE, which simplifies our
key management mechanism.
Our key management scheme enables the symmetric content key to be protected from the principals who manage the distribution servers and the license server. The digital content can thus be protected from attacks during content distribution, since the encrypted digital content is sent by the package server and only the DRM client can decrypt the digital content. Besides, we use IBE and digital signatures instead of digital certificates. This simplifies the process of authentication or verification of the identities in the system.
Our key management makes use of a general monotone access structure and
vector space secret sharing, which leads to a family of flexible key distribution schemes with effective performance. Our construction is general in the sense that it depends on a particular public mapping π, and for different choices of π we obtain different key distribution schemes.
The license server performs a range of tasks such as service monitoring, payment processing and license management, and much information passes through it. License issuance and content key management involve time-consuming operations such as digital signatures and public key encryption. Thus the license server could potentially become a bottleneck. However, the license server may consist of many subsystems arranged in a modular design that allows them to run independently to overcome this bottleneck. We have not addressed all these issues in this article and refer to Hwang et al. [8]. In our design, we mainly focus on
ensuring security in content key management.

6 Conclusion

For a scalable business model of transacting digital assets, a multi-party DRM system is often necessary, which involves more than one distributor, who can promote and distribute the content in regions unknown to the package server. We propose a key management scheme for a DRM system that involves more than one distributor, with the DRM client having the flexibility of choosing a distributor according to his own preference. In our scheme, the package server does not trust the distribution servers or the license server. The encrypted digital content sent by a package server can only be decrypted by the DRM client who has a valid license, and is protected from attacks by other parties/servers in the system. A general monotone access structure is used in our key distribution, which leads to more flexible performance. Moreover, we use IBE, which incurs less computation cost and storage, as certificate management is not necessary and certificate verifications are no longer needed. These features make our DRM system suitable for more effective business models/applications, with the flexibility of deciding a wide range of business strategies as compared to the existing works.


References
1. ANSI X9.62, Public Key Cryptography for the Financial Services Industry. The Elliptic Curve Digital Signature Algorithm (1999)
2. Camp, L.J.: First Principles of Copyright for DRM Design. IEEE Internet Computing 7, 59-65 (2003)
3. Cohen, J.E.: DRM and Privacy. Communications of the ACM 46(4) (April 2003)
4. Dutta, R., Barua, R., Sarkar, P.: Pairing Based Cryptographic Protocols: A Survey. Manuscript (2004), http://eprint.iacr.org/2004/064
5. BlueKrypt: Cryptographic Key Length Recommendation, http://www.keylength.com/en/3/
6. Grimen, G., Monch, C., Midtstraum, R.: Building Secure Software-based DRM Systems. In: NIK 2006 (2006)
7. Hartung, F., Ramme, F.: Digital Rights Management and Watermarking of Multimedia Content for M-Commerce Applications. IEEE Comm. 38, 78-84 (2000)
8. Hwang, S.O., Yoon, K.S., Jun, K.P., Lee, K.H.: Modeling and implementation of digital rights. Journal of Systems and Software 73(3), 533-549 (2004)
9. Jeong, Y., Yoon, K., Ryou, J.: A Trusted Key Management Scheme for Digital Rights Management. ETRI Journal 27(1), 114-117 (2005)
10. Lee, J., Hwang, S., Jeong, S., Yoon, K., Park, C., Ryou, J.: A DRM Framework for Distribution Digital Contents through the Internet. ETRI Journal 25, 423-436 (2003)
11. Liu, X., Huang, T., Huo, L.: A DRM Architecture for Manageable P2P Based IPTV System. In: IEEE Conference on Multimedia and Expo., pp. 899-902 (July 2007)
12. Liu, Q., Safavi-Naini, R., Sheppard, N.P.: Digital Rights Management for Content Distribution. In: Proceedings of Australasian Information Security Workshop Conference on ACSW Frontiers 2003, vol. 21 (January 2003)
13. Mulligan, D.K., Han, J., Burstein, A.J.: How DRM-Based Content Delivery Systems Disrupt Expectations of Personal Use. In: Proc. 2003 ACM Works. Digital Rights Management, pp. 77-88 (October 2003)
14. Rosset, V., Filippin, C.V., Westphall, C.M.: A DRM Architecture to Distribute and Protect Digital Content Using Digital Licenses. Telecommunication, 422-427 (July 2005)
15. Sachan, A., Emmanuel, S., Das, A., Kankanhalli, M.S.: Privacy Preserving Multiparty Multilevel DRM Architecture. In: IEEE Consumer Communications and Networking Conference (CCNC) (January 2009)
16. Shamir, A.: How to Share a Secret. Communications of the ACM 22(11), 612-613 (1979)
17. Shamir, A.: Identity-Based Cryptosystems and Signature Schemes. In: Blakely, G.R., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 47-53. Springer, Heidelberg (1985)

Multiple Secrets Sharing with Meaningful Shares


Jaya and Anjali Sardana
Department of Electronics and Computer Engineering
Indian Institute of Technology Roorkee
Roorkee 247667, Uttarakhand, India
jaya.im.k@gmail.com, anjlsfec@iitr.ernet.in

Abstract. The traditional Visual Cryptography method produces random shares,
which are susceptible to attackers. Some methods have been proposed to
generate innocent-looking shares so that an attacker does not become suspicious on seeing
the random pattern of a share. Such shares look like a valid image, and an
adversary cannot judge whether an image is a secret share or not. However, many of
these methods use additional data structures and take much time to encode the secret. In
this paper, we propose a method that produces meaningful shares for color
images without the need for any additional data structure and takes less time for
encoding. The share size does not vary with the number of colors present in the
secret image. The method is further extended to share multiple secrets together
to reduce the overhead of keeping too many shares.
Keywords: Visual cryptography schemes (VCS), share, pixel expansion,
contrast, stacking.

1 Introduction
In 1994, Naor and Shamir [1] proposed a new cryptographic area called visual
cryptography, based on the concept of secret sharing. It divides an image into a
collection of shares and requires a threshold number of shares to retrieve the original
image. Initially the model could be used only for black-and-white images, but it was
further extended to support grey-level and color images. There are some interesting
extensions of the original model. One of them is to generate innocent-looking shares
so that an attacker does not become suspicious on seeing the random pattern of a share.
Another extension is to encode multiple secret images together so that the overhead of
keeping too many shares can be reduced.
An attacker can become suspicious on seeing the random-looking shares and can
guess that a secret message has been encoded. To remove this problem, Naor and
Shamir [1] proposed a method to produce innocent-looking shares to conceal the
secret message. Chang et al. [3] proposed a method to generate two shares for
hiding a secret two-tone image. Shares are embedded into two gray-level cover
images by the proposed embedding scheme. Chang et al. [4] suggested a scheme for
color image hiding using a color index table. Chang and Yu [5] came up with an
approach to provide a more efficient way to hide a gray image (256 colors) in
different shares. The size of the shares does not change with the number of colors
appearing in the secret image. Wu et al. [6] formulated a method in which the size of
each embedding image is about 1/k of that of the secret image (k - threshold),
avoiding the need for much storage space and transmission time. The qualities of both
the recovered secret image and the embedding images that contain the hidden
shadows are acceptable. Tsai et al. [7] developed a method to support true-color
secret images with a size constraint on shares.
To share multiple secret images, many shares need to be generated, and this takes a
lot of time during transmission. This is not efficient, and hence some methods have
been proposed to hide more secret images in two share images. Droste [8] proposed
a scheme to share more than one secret among a set of shares. Wu and Chang [9]
developed a VCS to share two secret images together using two circle shares. The first
secret can be obtained by stacking the two shares, and the second secret by rotating share
1 by a rotation angle and then stacking it with share 2. The scheme was extended by
Shyu et al. [10] so that multiple secrets can be shared: share 2 is rotated by n
different angles and stacked with share 1 to get n secret images. Feng et al. [11]
proposed another scheme to hide n secrets and to reveal the secrets by stacking the
share images at n aliquot angles. This scheme is more general than the two previously
discussed schemes.
the proposed method. Section 3 presents the proposed method. Section 4 analyzes the
performance of the method. Section 5 contains the experimental results for the
verification of the scheme.

2 Related Work
2.1 (2,2) Visual Cryptography Scheme
A (2,2)-VCS divides the original image into 2 shares, and the secret image is
recreated by stacking both shares. The secret image is viewed as a collection of white
and black pixels. Each share contains collections of m black and white subpixels,
where each collection represents a particular original pixel. The resulting picture can
be thought of as an [n x m] Boolean matrix S = [si,j], where
si,j = 1 if the j-th subpixel in the i-th share is black, and
si,j = 0 if the j-th subpixel in the i-th share is white.
The algorithm in Figure 1 describes how to encode a single pixel. One of the two
subpixels in P is black and the other is white in both shares. The possibilities
"black-white" and "white-black" are equally likely, independent of the corresponding
pixel in the secret image. So the shares do not provide any information as to whether the
original pixel to be encoded is black or white, which proves the security of the
scheme.
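To make the pixel-level encoding concrete, the following is a minimal Python sketch of the (2,2) construction with two subpixels per secret pixel, in the spirit of Figure 1; the function names and the choice of two horizontal subpixels are illustrative assumptions rather than part of the original description.

import random

def encode_pixel(secret_bit):
    """Split one secret pixel (0 = white, 1 = black) into two 2-subpixel blocks."""
    pattern = random.choice([[0, 1], [1, 0]])      # random half-black pattern
    complement = [1 - b for b in pattern]
    if secret_bit == 0:                            # white: both shares get the same block
        return pattern, pattern[:]
    return pattern, complement                     # black: complementary blocks

def stack(block1, block2):
    """Physical stacking of transparencies corresponds to bitwise OR."""
    return [a | b for a, b in zip(block1, block2)]

# A white pixel stacks to one black subpixel (grey), a black pixel to two (full black).
s1, s2 = encode_pixel(0)
print(stack(s1, s2))   # e.g. [0, 1] or [1, 0]
s1, s2 = encode_pixel(1)
print(stack(s1, s2))   # [1, 1]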


Fig. 1. Encoding and stacking of a single pixel

2.2 Hou's Scheme

Hou [2] proposed three VCS for color images. In all three schemes, the secret image
is first decomposed into three primitive color images cyan, magenta and yellow, then
halftoning of those three images is done, and finally the encryption is performed. The first
scheme, the four-share color VCS, randomly generates one black mask as the fourth
share, which contains as many 2x2 blocks as the number of pixels in the secret image.
The second method expands each pixel of a halftone image into a 2x2 block on two
sharing images and fills it with cyan, magenta, yellow and transparent, respectively.
Using these four colors, two shares can generate various colors with different
permutations. To reduce the trouble of having four shares as in method 1, and to have
a better image quality than in method 2, the third scheme was developed, which applies
a basic (2,2) VCS on each of the C, M and Y images to produce six intermediate shares
and combines C1, M1, Y1 to get share 1 and C2, M2, Y2 to get share 2.

3 The Proposed Scheme


3.1 Encryption
In method 2 proposed by Chang [3], one bit of the pixel byte of the cover image is replaced
by the pixel value of the share image. For security, the pixels in the cover image
are first mixed using total automorphism. This method was developed for black-and-white
images, but it can be extended to color images. First, 2 shares are produced for
the color image using Hou's third algorithm. To share a W x H secret image, we take a
cover image of size 2W x 2H so that the cover image size and share size are the same.
Each pixel of the share image then has 3 bits of color information, one for each
primitive color. Each pixel of the cover image has 8-bit (256-level) color information for
each primitive color. Now, for share 1, one bit of the cover pixel for a primitive color is
chosen and is XORed with that primitive color bit of the share image. The process is
repeated for all the pixels of share 1. For share 2, the same cover image is taken
and the same procedure is reiterated. Here XOR is used instead of total automorphisms as
proposed by Chang [3]. The reason is that XOR is easy to perform and takes much less
time than total automorphisms. Further, the secret bit cannot be revealed from the cover
image, as it is the XOR result of the cover-pixel bit and the share bit.

Fig. 2. Bit pattern

The bit to be replaced should be one of the four lower order bits so that the pixel
value of the cover image does not change much.
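The per-channel embedding step described above can be sketched as follows in Python; it assumes each primitive-color channel of the cover pixel is an 8-bit value and that the share contributes one bit per channel, and the function names and the chosen bit position (2, one of the four lower-order bits) are illustrative.

def embed_bit(cover_value, share_bit, bit_pos):
    """Replace bit `bit_pos` of an 8-bit cover channel value with
    (original cover bit XOR share bit)."""
    cover_bit = (cover_value >> bit_pos) & 1
    new_bit = cover_bit ^ share_bit
    return (cover_value & ~(1 << bit_pos)) | (new_bit << bit_pos)

def extract_bit(mod1_value, mod2_value, bit_pos):
    """Recover S1 XOR S2 for one channel: the identical cover bits cancel out."""
    b1 = (mod1_value >> bit_pos) & 1
    b2 = (mod2_value >> bit_pos) & 1
    return b1 ^ b2

# Example for one cyan channel value of one pixel:
cover = 200          # the same cover value is used for both modified shares
s1, s2 = 1, 0        # cyan bits of share 1 and share 2 at this pixel
m1 = embed_bit(cover, s1, 2)
m2 = embed_bit(cover, s2, 2)
assert extract_bit(m1, m2, 2) == (s1 ^ s2)   # reconstructed halftone bit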
3.2 Data Extraction
For decoding purposes, the specified bit of modified share 1 and modified share 2 are
XORed and the bit value of the original halftoned image is achieved.

Table 1. Pixel reconstruction using proposed scheme (columns: Share 1, Share 2, Cover Image, Modified Share 1, Modified Share 2, Final XOR)

Figure 3 shows one of the possible combinations of share 1 and share 2 using the basic (2,2) VC
proposed by Naor and Shamir [1] for a white pixel, and the reconstructed image using
the XOR operation. Figure 4 shows the scheme for a black pixel. The output image quality
is better using this scheme, as the XOR operation allows for perfect reconstruction. It
means that there is no loss of contrast.
The decryption can be understood as follows. Suppose we have the secret image S.
We create 2 shares for it, S1 and S2, using Hou's method [2]. Then we XOR them with
the cover image, S1 with C1 and S2 with C2, during encryption to produce innocent-looking
shares. For the decoding process, we XOR the predefined bit of C1 and C2 to
generate the secret:

(S1 ⊕ C1) ⊕ (S2 ⊕ C2) = (S1 ⊕ S2) ⊕ (C1 ⊕ C2) = S1 ⊕ S2

C1 and C2 are the same images, as they are just 2 copies of the same cover image.
Hence the result of C1 ⊕ C2 becomes 0, and this effectively results in S1 ⊕ S2, which
reconstructs the final image.

Fig. 3. White pixel reconstruction (panels: Share 1, Share 2, XORed Result)

Fig. 4. Black pixel reconstruction (panels: Share 1, Share 2, XORed Result)

3.3 Method Extension


The proposed method can be extended to share multiple secrets in the same cover image.
For this, 2 shares are created for each secret to be shared. While embedding the shares
in the cover image, the same bit position is used for both shares belonging to the same
secret. Thus the extension results in 2 innocent-looking cover shares which
contain multiple secrets. The method can be used to share up to 8 secret images
together, but in that case the shares produced will be random. To keep the shares
meaningful, an optimum number of shares should be embedded. If 4 secret images are
shared together in the same cover image, then the pixel value can change by at most
15. If 5 images are to be shared, then the original pixel value of the
cover image can change by up to 31. These changes are not very large, and the eye will not
be able to recognize a difference. The optimum number of shares depends on
the application where the method is to be used, such that it does not change the pixel
values of the cover image much and the shares remain innocent-looking.


The method can be summarized as:


Step 1: Create 2 shares for each secret using Hou's third method.
Step 2: Choose a cover image. Make 2 copies of it.
Step 3: Select one bit in the cover image and XOR it with the corresponding pixel of share 1.
Step 4: Repeat the process for share 2.
Step 5: Repeat steps 1-4 for each of the secrets to be shared.
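A minimal sketch of the multi-secret extension for one cover-channel value is given below, assuming (purely for illustration) that secret i is assigned bit position i; with four secrets only the four lowest bits are touched, matching the bound of 15 mentioned above.

def embed_secrets(cover_value, share_bits):
    """Embed one bit from each secret's share into distinct low-order bit
    positions (bit i for secret i) of a single 8-bit cover channel value."""
    out = cover_value
    for pos, bit in enumerate(share_bits):          # one bit position per secret
        cover_bit = (out >> pos) & 1
        out = (out & ~(1 << pos)) | ((cover_bit ^ bit) << pos)
    return out

# Four secrets alter at most the four lowest bits, i.e. the value changes by at most 15.
print(embed_secrets(200, [1, 0, 1, 1]))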

4 Analysis
4.1 Security Analysis
The proposed scheme achieves its security enhancement with a much cheaper operation,
XOR, than the permutation method used by Chang [3]. The method XORs the original
bit value of the cover image with the pixel value of the created color share. The
result becomes the bit value of the corresponding pixel of the modified cover
image. If the bit value in the cover image is 0 and the pixel value of the color share is
0, this gives a bit value of 0 after the modification done by XOR. But this result can
also be produced if the values are 1 and 1. So each possible bit value in the modified
cover image has two possibilities, and both of them are equally likely. This proves the
security of the scheme.
Table 2. Security analysis of the scheme (columns: cover image bit-value, pixel-value of color share, modified cover image bit-value)

4.2 Performance Analysis


The methods discussed in [5, 6, 7] need an additional data structure, so for the decoding
process that data structure must also be provided along with the shares. This adds an
extra overhead. Further, the security enhancement is done by the XOR operation and
not by permutation as in [3, 6, 7], which takes less encoding time. The proposed
method supports true-color images, as opposed to [3, 5]. Finally, decoding again needs
only the XOR operation on the pixels, and so it also takes less time than the methods
which need an additional data structure to decode the image. The image quality is also
good, as XOR allows for perfect reconstruction of the pixels.
Table 3. Comparison of various VCS for producing meaningful shares

Authors           Year   True-color   (n,n)-scheme   Security       Additional data
                         support      supported      enhancement    structure needed
Chang-Yu [5]      2002   No           No             No             Yes
Wu et al. [6]     2004   NA           Yes            Permutation    Yes
Tsai et al. [7]   2009   Yes          Yes            Permutation    Yes
Proposed method          Yes          No             XOR            No

Table 4. Comparison of various VCS for sharing multiple secrets

Author             Year   No. of secret images                    Pixel expansion   Share type
Wu and Chang [9]   2005   -                                       -                 Circle
Shyu et al. [10]   2007   n >= 2                                  2n                Circle
Feng et al. [11]   2008   n >= 2                                  -                 Rectangular
Proposed method    -      Up to 8 (up to 4 for better security)   -                 Rectangular


5 Experimental Results
Figure 5 shows 4 secret images to be encoded. The size of all the secret images is 200
x 200. Figure 6 is chosen as the cover image, which is of size 400 x 400. 2 copies of
the same cover image are taken. Then random-looking shares for the secret images are
created using Hou's method. The shares are then embedded in the cover images. Thus the
innocent-looking shares shown in Figure 7 are achieved. These shares are decoded to
generate the secret images, shown in Figure 8. The reconstructed images are 4 times
the original images, as the pixel expansion in Hou's method is 4.
We can see that the created meaningful shares do not differ much from the original
cover image. As we increase the number of secret images to be shared, the shares start
to differ more from the original cover image. This method provides an efficient way
to share up to 4 or 5 secret images together with innocent-looking shares. One
limitation of the scheme is that it cannot be used as an (n, n)-scheme.

Fig. 5. (a) Secret image Lena (b) Secret image Baboon (c) Secret image Ball (d) Secret image Toy

Fig. 6. Cover image

Fig. 7. (a) Innocent share 1 (b) Innocent share 2

Fig. 8. (a) Recovered image Lena (b) Recovered image Baboon (c) Recovered image Ball (d) Recovered image Toy

6 Conclusions
In this paper, we have proposed a multiple-secret sharing scheme producing innocent-looking
shares. When the two shares are XORed, the original embedded information can be
recovered. The scheme takes two copies of a single cover image for producing two
shares. We can share multiple secrets together with enhanced security. The
advantages of the proposed method are good image quality, no additional data
structure and less encoding time. The size of the reconstructed images does not vary with
the number of colors present in the secret images. The scheme is well suited
for real-life applications which require fast computation and little storage and which are exposed to
attackers.

References
1. Naor, M., Shamir, A.: Visual cryptography. In: De Santis, A. (ed.) EUROCRYPT 1994. LNCS, vol. 950, pp. 1-12. Springer, Heidelberg (1995)
2. Hou, Y.C.: Visual cryptography for color images. Pattern Recognition 36, 1619-1629 (2003)
3. Chang, C.-C., Chuang, J.-C., Lin, P.-Y.: Sharing A Secret Two-Tone Image In Two Gray-Level Images. In: Proceedings of the 11th International Conference on Parallel and Distributed Systems, ICPADS 2005 (2005)
4. Chang, C., Tsai, C., Chen, T.: A New Scheme For Sharing Secret Color Images In Computer Network. In: Proceedings of International Conference on Parallel and Distributed Systems, pp. 21-27 (2000)
5. Chang, C.-C., Yu, T.-X.: Sharing A Secret Gray Image In Multiple Images. In: First International Symposium on Cyber Worlds, CW 2002 (2002)
6. Wu, Y.S., Thien, C.C., Lin, J.C.: Sharing and hiding secret images with size constraint. Pattern Recognition 37, 137-138 (2004)
7. Tsai, D.-S., Horng, G., Chen, T.-H., Huang, Y.-T.: A Novel Secret Image Sharing Scheme For True-Color Images With Size Constraint. Information Sciences 179, 324-325 (2009)
8. Droste, S.: New results on visual cryptography. In: Koblitz, N. (ed.) CRYPTO 1996. LNCS, vol. 1109, pp. 401-415. Springer, Heidelberg (1996)
9. Wu, H.-C., Chang, C.-C.: Sharing visual multi-secrets using circle shares. Computer Standards & Interfaces 28, 123-135 (2005)
10. Shyu, S.J., Huang, S.-Y., Lee, Y.-K., Wang, R.-Z., Chen, K.: Sharing multiple secrets in visual cryptography. Pattern Recognition 40, 3633-3651 (2007)
11. Feng, J.-B., Wu, H.-C., Tsai, C.-S., Chang, Y.-F., Chu, Y.-P.: Visual secret sharing for multiple secrets. Pattern Recognition 41, 3572-3581 (2008)

On Estimating Strength of a DDoS Attack Using


Polynomial Regression Model
B.B. Gupta1,2, P.K. Agrawal3, A. Mishra1, and M.K. Pattanshetti1
1

Department of Computer Science, Graphic Era University, Dehradun, India


gupta.brij@gmail.com
2 Department of Electronics and Computer Engineering, Indian Institute of Technology
Roorkee, Roorkee, India
3 Department of Computer Science, NSIT, New Delhi, India

Abstract. This paper presents a novel scheme to estimate the strength of a DDoS
attack using a polynomial regression model. To estimate the strength of an attack, a relationship is established between the strength of the attack and the observed deviation in
sample entropy. Various statistical performance measures are used to evaluate
the performance of the polynomial regression models. The NS-2 network simulator
on the Linux platform is used as the simulation test bed for launching DDoS attacks
with varied attack strength. The simulation results are promising, as we are able
to estimate the strength of a DDoS attack efficiently.

1 Introduction
DDoS attacks compromise the availability of the information system through various
means [1,2]. One of the major challenges in defending against DDoS attacks is to
accurately detect their occurrences in the first place. Anomaly-based DDoS detection
systems construct a profile of the traffic normally seen in the network, and identify
anomalies whenever traffic deviates from the normal profile beyond a threshold [3,4]. This
extent of deviation is normally not utilized. We use a polynomial regression [5,6]
based approach that utilizes this extent of deviation from the detection threshold to estimate the strength of a DDoS attack.
In order to estimate the strength of a DDoS attack, a polynomial regression model is
used. To measure the performance of the proposed approach, we have calculated
various statistical performance measures, i.e. R2, CC, SSE, MSE, RMSE, NMSE, ,
MAE and residual error [12]. Internet-type topologies used for simulation are generated using the Transit-Stub model of the GT-ITM topology generator [7]. The NS-2 network simulator [8] on the Linux platform is used as the simulation test bed for launching DDoS attacks
with varied attack strength.
The remainder of the paper is organized as follows. Section 2 contains an overview of
the polynomial regression model. The detection scheme is described in Section 3. Section 4
describes the experimental setup and performance analysis in detail. Model development
is presented in Section 5. Section 6 contains simulation results and discussion. Finally,
Section 7 concludes the paper.

2 Polynomial Regression Model


In its simplest form, regression analysis [9,10] involves finding the best straight-line
relationship to explain how the variation in an outcome variable, Y, depends on the
variation in a predictor variable, X. When there is only one explanatory variable the
regression model is called a simple regression, whereas if there is more than one
explanatory variable the regression model is called multiple regression.
Polynomial regression [5,6] is a form of regression in which the relationship between the independent variable X and the dependent variable Y is modeled as an nth-order
polynomial. The general form of this regression model is as follows:

Yi = Ŷi + εi,  where  Ŷi = β0 + β1 X + β2 X^2 + ... + βn X^n        (1)

Input and Output: In the polynomial regression model, a relationship is developed between the strength of a DDoS attack Y (output) and the observed deviation in sample entropy
X (input). Here X is equal to (Hc - Hn). Our proposed regression based approach utilizes this deviation in sample entropy X to estimate the strength of a DDoS attack.

3 Detection of Attacks
An entropy [11] based DDoS scheme is used to construct a profile of the traffic normally
seen in the network, and identify anomalies whenever traffic goes out of profile. A
metric that captures the degree of dispersal or concentration of a distribution is sample
entropy. Sample entropy H(X) is

H(X) = - Σ_{i=1}^{N} pi log2(pi)        (2)

where pi is ni/S. Here ni represents the total number of byte arrivals for a flow i in the
monitoring window {t - Δ, t}, and S = Σ_{i=1}^{N} ni, i = 1, 2, ..., N. The value of sample entropy lies in the range 0 to log2 N.
To detect the attack, the value of Hc(X) is calculated in each time window continuously;
whenever there is an appreciable deviation from Hn(X), various types of DDoS attacks are detected.
Hc(X) and Hn(X) give the entropy at the time of detection of the attack and the entropy value for the normal profile, respectively.
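A small Python sketch of the sample entropy computation in Eq. (2) is given below; the per-flow byte counts are illustrative values for one monitoring window, not data from the paper.

import math

def sample_entropy(byte_counts):
    """H(X) = -sum(p_i * log2(p_i)) over flows, with p_i = n_i / S."""
    total = sum(byte_counts)
    h = 0.0
    for n in byte_counts:
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h

# Per-flow byte counts observed in one monitoring window (illustrative values):
window_counts = [5000, 1200, 800, 300]
print(sample_entropy(window_counts))   # lies between 0 and log2(N)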

4 Experimental Setup and Performance Analysis


Real-world Internet-type topologies generated using the Transit-Stub model of the GT-ITM
topology generator [7] are used to test our proposed scheme, where transit domains
are treated as different Internet Service Provider (ISP) networks, i.e. Autonomous
Systems (AS). For simulations, we use an ISP-level topology, which contains four transit
domains, with each domain containing twelve transit nodes, i.e. transit routers. All
four transit domains have two peer links at transit nodes with adjacent transit domains. The remaining ten transit nodes are connected to ten stub domains, one stub domain per transit node. Stub domains are used to connect transit domains with customer domains, as each stub domain contains a customer domain with ten legitimate
client machines. So a total of four hundred legitimate client machines are used to generate background traffic.
The legitimate clients are TCP agents that request files of size 1 Mbps with request
inter-arrival times drawn from a Poisson distribution. The attackers are modeled by
UDP agents. A UDP connection is used instead of a TCP one because in a practical
attack flow, the attacker would normally never follow the basic rules of TCP, i.e.
waiting for ACK packets before the next window of outstanding packets can be sent,
etc. In our experiments, the monitoring time window was set to 200 ms. Total false
positive alarms are at a minimum with a high detection rate using this value of the monitoring
window.

5 Model Development
In order to estimate the strength of a DDoS attack (Y) from the deviation (Hc - Hn) in entropy value, simulation experiments are done at varying attack strengths from 10 Mbps
to 100 Mbps and at a fixed total number of zombies, i.e. 100. Table 1 represents the deviation in entropy with the actual strength of the DDoS attack.

Table 1. Deviation in entropy with actual strength of DDoS attack

Actual strength of DDoS attack (Y)    Deviation in entropy (X)
10M      0.149
15M      0.169
20M      0.184
25M      0.192
30M      0.199
35M      0.197
40M      0.195
45M      0.195
50M      0.208
55M      0.212
60M      0.233
65M      0.241
70M      0.244
75M      0.253
80M      0.279
85M      0.280
90M      0.299
95M      0.296
100M     0.319
A polynomial regression model is developed using the strength of attack (Y) and the deviation (Hc - Hn) in entropy value as given in Table 1 to fit the regression equation.
Figure 1 shows the regression equation and coefficient of determination for the polynomial regression model.

Fig. 1. Regression equation and coefficient of determination for the polynomial regression model: y = -1284.9x^2 + 1176.4x - 144, R^2 = 0.9603 (x-axis: deviation in entropy (X), 0.10-0.34; y-axis: strength of attack (Mbps), 0-120)
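For illustration, the second-order fit of Fig. 1 can be reproduced from Table 1 with NumPy's polyfit as sketched below; this is not the authors' code, and the recovered coefficients should only approximately match the rounded equation shown in the figure.

import numpy as np

# Deviation in entropy (X) and actual attack strength in Mbps (Y) from Table 1.
X = np.array([0.149, 0.169, 0.184, 0.192, 0.199, 0.197, 0.195, 0.195, 0.208,
              0.212, 0.233, 0.241, 0.244, 0.253, 0.279, 0.280, 0.299, 0.296, 0.319])
Y = np.arange(10, 101, 5)              # 10, 15, ..., 100 Mbps

coeffs = np.polyfit(X, Y, deg=2)       # second-order polynomial, as in Fig. 1
model = np.poly1d(coeffs)

print(coeffs)                          # should be close to [-1284.9, 1176.4, -144]
print(model(0.25))                     # predicted strength for a new entropy deviation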

6 Results and Discussion


We have developed the polynomial regression model as discussed in Section 5. Various
performance measures are used to check the accuracy of this model.
Fig. 2. Comparison between actual strength of a DDoS attack and predicted strength of a DDoS attack using polynomial regression model M2 (x-axis: deviation in entropy; y-axis: strength of attack; series: actual DDoS attack strength vs. predicted DDoS attack strength using model M2)


The predicted strength of attack can be computed and compared with the actual strength of
attack using the proposed regression model. The comparison between the actual strength of
attack and the predicted strength of attack using the polynomial regression model is depicted
in Figure 2.
Table 2 contains values of various statistical measures for the polynomial regression
model. It can be inferred from Table 2 that for the polynomial regression model, the values of
R2, CC, SSE, MSE, RMSE, NMSE, , MAE are 0.96, 0.98, 566.31, 29.81, 5.46, 1.06,
0.96 and 0.81, respectively. Hence the estimated strength of a DDoS attack using the polynomial model is close to the actual strength of a DDoS attack.
Table 2. Values of various performance measures

R2       0.96
CC       0.98
SSE      566.31
MSE      29.81
RMSE     5.46
NMSE     1.06
         0.96
MAE      0.81

7 Conclusion and Future Work


This paper investigates how a polynomial regression model can be used to estimate the
strength of a DDoS attack from the deviation in sample entropy. For this, the model is developed and various statistical performance measures are calculated. After careful investigation, we can conclude that the estimated strength of a DDoS attack using the polynomial
regression model is very close to the actual strength of a DDoS attack. Hence, the polynomial
regression model is a very useful method for estimating the strength of an attack.

References
1. Gupta, B.B., Misra, M., Joshi, R.C.: An ISP level Solution to Combat DDoS attacks using Combined Statistical Based Approach. International Journal of Information Assurance and Security (JIAS) 3(2), 102-110 (2008)
2. Gupta, B.B., Joshi, R.C., Misra, M.: Defending against Distributed Denial of Service Attacks: Issues and Challenges. Information Security Journal: A Global Perspective 18(5), 224-247 (2009)
3. Gupta, B.B., Joshi, R.C., Misra, M.: Dynamic and Auto Responsive Solution for Distributed Denial-of-Service Attacks Detection in ISP Network. International Journal of Computer Theory and Engineering (IJCTE) 1(1), 71-80 (2009)
4. Mirkovic, J., Reiher, P.: A Taxonomy of DDoS Attack and DDoS defense Mechanisms. ACM SIGCOMM Computer Communications Review 34(2), 39-53 (2004)
5. Stigler, S.M.: Optimal Experimental Design for Polynomial Regression. Journal of American Statistical Association 66(334), 311-318 (1971)
6. Anderson, T.W.: The Choice of the Degree of a Polynomial Regression as a Multiple Decision Problem. The Annals of Mathematical Statistics 33(1), 255-265 (1962)
7. GT-ITM Traffic Generator Documentation and tool, http://www.cc.gatech.edu/fac/EllenLegura/graphs.html
8. NS Documentation, http://www.isi.edu/nsnam/ns
9. Lindley, D.V.: Regression and correlation analysis. New Palgrave: A Dictionary of Economics 4, 120-123 (1987)
10. Freedman, D.A.: Statistical Models: Theory and Practice. Cambridge University Press, Cambridge (2005)
11. Shannon, C.E.: A mathematical theory of communication. ACM SIGMOBILE Mobile Computing and Communication Review 5(1), 3-55 (2001)
12. Gupta, B.B., Joshi, R.C., Misra, M.: ANN Based Scheme to Predict Number of Zombies in DDoS Attack. International Journal of Network Security 13(3), 216-225 (2011)

Finding New Solutions for Services in Federated Open


Systems Interconnection
Zubair Ahmad Khattak1,2, Jamalul-lail Ab Manan2, and Suziah Sulaiman1
1

Universiti Teknologi PETRONAS, Department of Computer and Information Sciences,


Tronoh 31750, Perak, Malaysia
zubair_g00953@utp.edu.my, suziah@petronas.com.my
2
MIMOS Berhad, Advanced Information Security Cluster, Technology Park Malaysia,
57000, Kuala Lumpur, Malaysia
jamalul.lail@mimos.my

Abstract. Federated environment applications running on cost-effective
federated identity management systems have been widely adopted, and
would potentially attract more organizations to adopt and invest if enhanced
with security and trust mechanisms. Traditional certificate-based
authentication raises various issues: firstly, if the public portion
of the key pair can be guessed or calculated by an attacker, it can further be
used to masquerade for resource access, and secondly, the private key stored on
the user system can be compromised by viruses, Trojan horses, etc.
Also, current computer platforms are lacking in platform trust establishment,
which makes it hard to trust remote platforms. In this paper, we discuss
concerns related to federated services user authentication, authorization, and
trust establishment in Federated Open Systems Interconnection, and propose
trusted platform module protected storage to protect private keys, and platform
attestation mechanisms to establish inter-platform (and hence inter-system) trust
among interacting systems in an open environment, to overcome these issues. To
assess our work we compare the trusted platform module with existing
authentication types and show that the trusted platform module provides better
tamper-resistant protection against attacks such as replay, Trojan horses, and
fake anti-virus attacks.
Keywords: federated identity management system, authentication, trust
establishment, trusted computing, trusted platform module.

1 Introduction
A Federated Environment (FE) can be defined as a collaborative environment between several organizations for sharing resources or
services between groups. Two well-known FE application examples are Centers of Excellence (COE) and Federated
Identity Management (FIM) [1]. The latter allows users to use their authentication
(AuthN) credentials with their home organization (the Identity Provider (IDP)) to access
services (from Service Providers (SP)) within the federation. The Single-Sign-On
(SSO) [2] facility plays a major role in reducing the number of user accounts by
reducing too many and repeated AuthN to various sites. The three major entities
involved are: (1) User - an entity that accesses a service or multiple services, (2) IDP - an entity that performs user AuthN, and (3) SP - an entity that offers services to the users
[3]. In open-environment SSO schemes, user identification is achieved via diverse
AuthN methods ranging from a single factor to multiple factors. In the worst case, the
user identification process is based on a weak method, such as a user name
and password. In this scenario, once user credentials are compromised, a security breach hole
immediately opens. As two examples, the data leakage reports
in [4, 5] showed that the loss of personal data in an open environment can bring disaster
to whom it belongs and holds.
For access to Internet Web-based services or resources, identification or AuthN of the
end-user is mandatory to ensure they are who they say they are. In certificate-based
AuthN, the user first obtains a private key certificate from a certificate authority and installs
it on the client PC. The main problem in this case is how to protect the private key.
The threat model related to dispersed identities presented in [9] shows
how concerns arise due to attacks such as the man-in-the-middle attack, replay
attack, fake software, etc., on unprotected entities on the user system (client). The man-in-the-middle attacker can also impersonate the IDP and SP to obtain users' credentials,
intercept and/or tamper with system or user messages, or install Trojan horses or fake
anti-viruses [10]. With no trust infrastructure, and the lack of trust between any two
interacting parties, many existing solutions, such as trust establishment based on
traditional methods such as Public Key Infrastructure (PKI) or the sharing of secret
keys, would have many challenges. Later in this paper, we present Trusted
Platform Module (TPM) AuthN (certificate based) and an attestation mechanism (a
platform integrity check) as a suggested solution to enhance client-side security and
the existing weak trust between platforms with a hardware TPM.
Trusted Computing Group (TCG) [6], a not-for-profit organization, replaces the Trusted Computing Platform
Alliance (TCPA) [7]. Its main objectives include
developing, defining, and promoting open, vendor-neutral industry standards for
Trusted Computing (TC) building blocks and software interfaces across multiple
platforms [6]. For more details about TCG, interested readers are referred to [6, 7, 8].
Our paper is structured as follows. In Section 2, we present federated services
challenges. Section 3 presents a trusted computing tamper-resistant chip based solution.
In Section 4, we present an assessment of the work, and we conclude with Section 5.

2 Federated Services Challenges


In this section we present issues such as AuthN, authorization (AuthR) [11, 12, 13] and
trust establishment [14, 15] that identify the potential and critical areas of improvement.
2.1 User Authentication and Authorization
The risk associated with authentication (SSO) in a federated system is more difficult
to control than in a centralized system. Therefore, a weak AuthN mechanism
such as a username and password might produce major vulnerabilities for the subject, and
would increase the risk associated with the phishing, pharming and password attacks
mentioned in [16]. In the open systems interconnection environment (Internet) these
threats would eventually lead to widely damaged user trust and organization reputation
due to poor and weak AuthN mechanism implementations. A study performed
by Panda Security [17] in 2009 found that Trojans maliciously
designed to steal personally identifiable or financial information had led to identity
fraud, which rose by a staggering 800% from the first half to the second half of 2008.
In addition, researchers forecast, based on the previous 14 months' analysis, that this rate
would increase by up to 336% (p/m) throughout 2009. The two important challenges
related to AuthN in FIMS are (1) measurement and determination of identity
information accuracy and validity, and (2) trusted services that enhance confidence.
In a typical federated system, each domain has its own AuthR policies. In a federation,
it is difficult for a user in domain (X), for example, to access a service (P) or resource (R) in
another domain (Y) without facing issues of compromised
identity and loss of personal identity information. Hence, proper AuthR mechanisms
are highly needed, i.e. when communicating with endpoints across multiple hops [18].
2.2 Trust Establishment
By analogy, in a real physical interaction between two persons coming from two
different organizations, they must undergo a certain trust establishment before any
serious engagement takes place between them. Federated systems are based on the
concept of logging into the services (government services, education services, e-mail services, etc.) only once, with a username/password or any other mechanism, and
then accessing many services without re-login. The most common open federated
services systems are Shibboleth [24], Liberty Alliance [25], and OpenID [26]. The basic
architecture of these systems is nearly the same; however, the request and response
messages in these systems vary from one to another. The three common entities
involved in such systems are a user, an IDP, and an SP. In a federated services scenario, a
user requests a resource or service from the SP. Let us assume no prior trust
relationship exists between the user and the SP, and the service provider depends on the
AuthN information to make the access decision. The user trusts the IDP, and is associated with one
or more IDPs, which AuthN the user and provide credentials associated with the user
to the SP. The SP, on the basis of these credentials and its own policies, allows or denies
access to the requested resource or service.
Therefore, the federated services approaches mentioned above solve the dilemma of AuthN and AuthR:
the user is AuthN and trusted not to misuse the provided services,
but his/her platform might not be in a trustworthy state. Therefore, before
transferring credentials from the IDP to the SP, assessing the trustworthiness of the user platform or
the IDP's and SP's platforms is mandatory.

3 Trusted Computing
Trusted Computing is a response to the rising challenges and possible costs of
network and data security breaches. Practically, TC covers a range of technologies
and standards intended to make computers safer, more reliable, and less prone to viruses,
malware and spam. This technology can also help to make network security management
more effective and efficient. In the early 2000s, TCPA [7], now known as TCG
[6], launched the notion of the trusted platform. This platform contains a hardware-based
subsystem, called the TPM [21], devoted to maintaining trust and security between
communicating machines (clients, servers, H/P, etc.).
The TPM is a unique paradigm to bring and establish trust among computing/mobile
platforms. The TPM by definition is a small co-processor chip that can securely store
and report information to provide a hardware root-of-trust. The TPM has shielded
locations, Platform Configuration Registers (PCR), that store cryptographic hashes
of the software loaded for execution and thereby record the platform configuration. These PCRs
can only be manipulated by a mechanism called TPM-Extend. The hashes stored in
the PCRs are used to report the platform configuration to the challenging party in a secure
and trusted manner. The mechanism for establishment of trust that reports the
platform configuration to the challenging party is known as Remote Attestation (RA). The
RA enables a remote party (a validation service in our case) to verify the integrity of
the remote platform through trust tokens submitted by the TPM on the target platform
(a client or server in our case).
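The TPM-Extend hash chain can be illustrated with the short Python sketch below (TPM 1.2 PCRs are 20-byte SHA-1 registers); the measured component names are illustrative, and a real measurement chain is driven by the CRTM, boot loader and, e.g., IMA rather than by application code.

import hashlib

def pcr_extend(pcr_value, measurement):
    """TPM-Extend: PCR_new = SHA-1(PCR_old || SHA-1(measured component))."""
    digest = hashlib.sha1(measurement).digest()
    return hashlib.sha1(pcr_value + digest).digest()

# Boot-time measurement chain (illustrative components):
pcr = b"\x00" * 20                      # PCRs start as twenty zero bytes
for component in [b"bootloader", b"kernel", b"ima_policy"]:
    pcr = pcr_extend(pcr, component)
print(pcr.hex())                        # value later reported to a challenger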
3.1 AuthN with Private Key Protection
The four entities given in Figure 1 below are a Bank service, an ISP playing the role of
Private-Certificate Authority (P-CA), a User agent and a TPM. The TPM and User agent are both
part of the client system. In typical certificate-based AuthN (e.g. a public key
certificate), the user obtains a certificate from a Certificate Authority (CA) and stores it on
the client system. In such a process the public portion is passed on to the CA and
the private part is stored on the client system.
Fig. 1. The flow diagram of setup and authentication phases (participants: TPM, User agent, ISP (P-CA), Bank Service (Server); setup steps: AIK certificate request (TPM_MakeIdentity, TSS_CollateIdentityRequest), ISP (P-CA) signing certificate, AIK certificate activation (TPM_ActivateIdentity), key creation and certification (TPM_CreateWrapKey, TPM_CertifyKey2), identity certificate (user) signing (TPM_LoadKey, TPM_Sign); authentication steps: service request, client certificate request, private key and certificate verification, then username/password over a secure (SSL) channel)

The storing of the private key certificate on the client system raises many issues.
Therefore, to overcome the above problem two precautionary steps must be taken:
firstly, secure transfer of the public key, and secondly, private key protection. Here
we present only user AuthN to a single service via two factors, i.e. the TPM provides
protection to the private key, which involves (1) certificate and private key corroboration,
and (2) username and password confirmation. The AIK certificate request, ISP (P-CA)
signing certificate, AIK certificate activation, and identity certificate (user) signing are
important steps to be performed during the setup phase (Figure 1 above, left side).
The complete process is given in Figure 1 above, with a detailed description in the Appendix
(Table 2).
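As a concrete illustration of the two-factor check on the server side, the following is a minimal Python sketch (using the pyca/cryptography package) in which the client signs a server challenge with its certified private key and the server verifies the signature before checking the username/password; all names here are illustrative, and in the real scheme the signing happens inside the TPM via the TSS calls of Figure 1 rather than with a key held in application memory.

import hashlib, os
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Client side: key pair E; in the real scheme the private half never leaves the TPM.
client_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
certified_public_key = client_key.public_key()      # certified via the AIK in the setup phase

# Factor 1: sign the bank server's fresh challenge with the protected private key.
challenge = os.urandom(20)
signature = client_key.sign(challenge, padding.PKCS1v15(), hashes.SHA256())

# Factor 2: username/password sent over the SSL channel (stored hashed at the server).
stored_pwd_hash = hashlib.sha256(b"correct horse").hexdigest()

def authenticate(challenge, signature, username, password):
    """Server-side check: certificate/private-key corroboration, then password."""
    try:
        certified_public_key.verify(signature, challenge,
                                    padding.PKCS1v15(), hashes.SHA256())
    except InvalidSignature:
        return False
    return username == "alice" and hashlib.sha256(password).hexdigest() == stored_pwd_hash

print(authenticate(challenge, signature, "alice", b"correct horse"))   # True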
3.2 Attestation (Platform Authentication)
Attestation (platform authentication) is a mechanism defined in the TCG
specifications, whereby integrity measurements of the client or host platform are
performed and stored in the PCR registers of the TPM chip. During the attestation process
the TPM signs over the values of the PCRs and external 20-byte (160-bit) data (a nonce)
using an RSA private key. The confidentiality of these signed PCRs and the nonce is
protected by the TPM. The unique aspect of the attestation is that it proves the identity,
integrity and state of the platform to the attestation requestor. The Root of Trust for
Measurement (RTM), for instance the Core Root of Trust for Measurement (CRTM), is
considered trustworthy and reliably measures the integrity of other entities.
Secondly, the Root of Trust for Reporting (RTR) proves to the challenger that the local PC
is embedded with a genuine TPM and reliably measures and reports its configuration.
Thirdly, for the Root of Trust for Storage (RTS), due to the TPM memory constraint the
external keys are secured by the Storage Root Key (SRK), which is in turn secured by the RTS.
The remote attestation technique Integrity Measurement Architecture (IMA) [19]
extended the TCG attestation mechanism and is based on load-time measurements.
Because of space limitations we refer the interested reader to [19].
Using the attestation process a TPM-enabled device (such as a PC, laptop, PDA, etc.)
assures the remote device of its trustworthy status. The TPM holds many keys,
such as the Endorsement Key (EK), Attestation Identity Key (AIK), and Binding and Sealing keys.
The EK is a manufactured built-in key representing the identity of each TPM-enabled
platform. Using the EK private part, the TPM signs assertions about the trusted computer
states. The remote device can verify that those assertions are signed by a genuine
TPM. The EK public part is certified by a CA (P-CA) to indicate that the EK public part
belongs to a particular TPM. There are several benefits of using an AIK over the EK: (i) the AIK is not directly linked with the hardware TPM, (ii) it protects against EK
cryptanalysis, and (iii) it reduces the load on the TPM, because the AIK is used by the CPU while the EK
is used by the TPM. The root of trust plays an important role in trust chain establishment.
For a federated web services system, from Attestation Models (AM) we can build
various trust models, such as Direct Trust, i.e. the Direct Attestation Model (DAM), and
Indirect Trust, i.e. the Delegated Attestation Model (DeAM). DAM exists either as
uni-directional or mutual-directional. In the uni-directional case, only the attestation requestor
(e.g. a server) challenges the attested platform (e.g. a client or target), while in the mutual-directional
case the challenger (server) and attester (client) exchange their positions after
each integrity measurement request and response. In an example of the mutual-directional
case, a server (challenger) sends an integrity measurement request to the client,
and if the validation of the returned measurement is successful, then the client sends
an integrity measurement request to the server and performs the validation. If both
measurement results are successfully validated against each other, then they are mutually
attested. In DAM two main disadvantages exist: (i) the attested platforms (e.g.
a client) need to disclose their integrity measurement information to the challenger
(e.g. a server), which leads to a violation of integrity privacy through disclosure to the
attestation challengers, and (ii) in both the uni-directional and mutual-directional cases the
attestation challenger needs to be capable of validating the attestation response. For
a detailed overview of the requests and responses among the different entities, see Figure 2.
For more details interested readers are referred to [20].
Fig. 2. Practical delegation based trust and remote platform attestation (authentication) architecture (entities: Client and Service-1, each with a TPM, SML and PCRs exchanging integrity requests/responses; an Attestation Challenger (Corroboration Service) with a validation repository holding SML, PCR and certificate data)

In DeAM, by contrast, a Corroboration Service (CS) performs the role of a challenger
and validates the integrity measurement, or performs attestation of the attested platform
(either a client or a server), on behalf of the requestor and forwards the validation
result in the form of a credential. The DeAM approach helps to overcome the
concerns pointed out for DAM, under the pre-condition that the CS behaves properly and is
trusted by both the challenger and attester platforms. Next, we show an example
of DeAM. The corroboration entity shown in Figure 2 above plays the role of a trust
validation service on behalf of the entities (client and server), provided that both are
equipped with TPMs.
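The following Python sketch illustrates, under simplifying assumptions, the check a Corroboration Service could perform in DeAM: replay the stored measurement log (SML) to recompute the expected PCR value, verify the quote and its freshness via the challenger's nonce, and compare the log against a whitelist of known-good measurements. The AIK signature over (PCR, nonce) is stood in for by an HMAC so the example stays self-contained; a real TPM quote is an RSA signature verified against the AIK credential.

import hashlib, hmac, os

def expected_pcr(measurement_log):
    """Replay the stored measurement log (SML) to recompute the PCR value."""
    pcr = b"\x00" * 20
    for entry in measurement_log:
        pcr = hashlib.sha1(pcr + hashlib.sha1(entry).digest()).digest()
    return pcr

def validate_quote(quote, aik_key, nonce, measurement_log, whitelist):
    """Corroboration-service check: authentic and fresh quote, known-good SML."""
    mac = hmac.new(aik_key, quote["pcr"] + nonce, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, quote["sig"]):
        return False                                   # quote not from the expected TPM/AIK
    if quote["pcr"] != expected_pcr(measurement_log):
        return False                                   # SML does not match the reported PCR
    return all(entry in whitelist for entry in measurement_log)

# Challenger side: send a fresh nonce, then validate the attester's response.
aik_key = os.urandom(32)                               # stands in for the AIK key pair
log = [b"bootloader", b"kernel", b"ima_policy"]
pcr = expected_pcr(log)
nonce = os.urandom(20)
quote = {"pcr": pcr, "sig": hmac.new(aik_key, pcr + nonce, hashlib.sha256).digest()}
print(validate_quote(quote, aik_key, nonce, log, set(log)))   # True for a trustworthy platform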

4 Assessment: Trusted Platform Module vs. Existing AuthN Types


Federated system approaches mostly support third-party AuthN, for instance an IDP that
authenticates the end user, while in a second approach the TPM plays the role of the IDP or ASP to
achieve SSO among dispersed SPs. Both approaches can adopt any single or
two-factor combination of AuthN mechanisms amongst Knowledge-Based
Authentication (KBA) such as a password, Object-Based Authentication (OBA) for
instance hardware tokens or the TPM, and ID-Based Authentication (ID-BA) such as
biometrics.
In the typical scenario the user obtains a public key certificate from a CA. The certificate is
stored on the client system. The public part of the key is passed to the CA and the private part
of the key is stored on the client system. The storing of this key on the user system raises many
concerns. However, storing this key on a smart card can bring great security to the
private key. The TPM provides strong security against all software-based attacks, but
the TPM is still vulnerable to hardware-based attacks. An example of such an attack would be a
cold-boot attack, where a user does not let the computer shut down completely. This
attack relies on data remaining in the RAM after power has been removed [22]. The (X)
entries indicate that the TPM strengthens the computer system security against essentially all
software-based attacks. Table 1 below presents some potential attacks,
vulnerable AuthN types, and examples, adapted to our requirements
from [23].
Table 1. Some potential attacks, vulnerable AuthN mechanisms with examples

Attack type: User system attack
  Password  - by guessing or exhaustive searching
  Token     - by exhaustive searching
  Biometric - by false matching
  TPM       - (X)
Attack type: Theft, copying & eavesdropping
  Password  - by shoulder surfing
  Token     - by counterfeiting hardware, theft
  Biometric - by spoofing (copying biometrics)
  TPM       - (X)
Attack type: Replay
  Password  - by replaying a stolen password response
  Token     - by replaying a stolen pass code response
  Biometric - by replaying a stolen biometric-template response
  TPM       - (X)
Attack type: Trojan horses
  Password, Token, Biometric - by installing a rogue client or capture device
  TPM       - (X)
Attack type: Fake antivirus
  Password, Token, Biometric - by installing malicious software and capturing secret info via taking control of the client system
  TPM       - (X)
Attack type: Phishing, pharming, man-in-the-middle
  Password, Token, Biometric - by using social engineering techniques, exploiting the poor usability of current web service technologies
  TPM       - (X)

5 Conclusion
In this paper, we discussed concerns related to federated services that involve user
AuthN, AuthR, and trust establishment in Federated Open Systems Interconnection.
We argued that traditional certificate-based AuthN raises a number of issues:
firstly, the public portion of the key pair can be guessed or
calculated by the attacker, and secondly, the private key stored on the user
system can be compromised by viruses, Trojan horses, etc. In addition, current
computer platforms lack a means to establish platform trust, which makes it harder to
decide whether remote platforms are trustworthy or untrustworthy. Therefore, for access in a distributed
environment, a TPM-based trust establishment mechanism, remote attestation,
would boost the end user's trust that nobody can invade his computing/mobile
platform to run or install malicious software. From the service or resource provider's
perspective, only an authentic TPM is allowed to make a request for a resource or
a service.
We also discussed how TCG can potentially provide a solution for these issues using
both protected storage to protect private keys and platform attestation mechanisms to
establish inter-platform (and hence inter-system) trust among interacting systems, and
can help to overcome identity theft issues in an open environment. Our assessment of a
range of the most common AuthN types and the TPM shows that the TPM provides stronger
security against a range of attacks in an open environment.
Currently we are in the process of creating an IMA-based prototype to demonstrate the
remote attestation mechanism. In this demo we will show how the requesting and
responding platforms attest to each other whether they are trustworthy or not, and guarantee
that no malicious software or code is running on either platform.
Acknowledgments. This work was funded by the Universiti Teknologi PETRONAS
Postgraduate Assistantship Scheme and MIMOS Berhad, Malaysia.

References
1. Chadwick, D.W.: Federated Identity Management. In: Aldini, A., Barthe, G., Gorrieri, R. (eds.) FOSAD 2007. LNCS, vol. 5705, pp. 96-120. Springer, Heidelberg (2009)
2. Pashalidis, A., Mitchell, C.J.: Taxonomy of Single Sign-On Systems. In: Safavi-Naini, R., Seberry, J. (eds.) ACISP 2003. LNCS, vol. 2727, pp. 249-264. Springer, Heidelberg (2003)
3. Lutz, D.: Federation Payments using SAML Tokens with Trusted Platform Modules. In: Proceedings of the IEEE Symposium on Computers and Communications, pp. 363-368 (2007)
4. Vijayan, J.: Wells fargo discloses another data breach. Computer World (2006), http://www.computerworld.com/s/article/9002944/Wells_Fargodisclo_nother_data_breach
5. Lemos, R.: Reported data leaks reach high in 2007. Security Focus (2007), http://www.securityfocus.com/brief/652
6. Trusted Computing, http://www.trustedcomputinggroup.org/
7. Trusted Computing Platform Alliance (TCPA), http://mako.cc/talks/20030416politics_and_tech_of_control/trustedcomputing.html
8. Balacheff, B., Chen, L., Pearson, S., Plaquin, D., Proudler, G.: Trusted Computing Platforms: TCPA Technology in Context. Prentice-Hall, Englewood Cliffs (2003)
9. Khattak, Z.A., Sulaiman, S., Manan, J.A.: A Study on Threat Model for Federated Identities in Federated Identity Management System. In: Proceeding 4th International Symposium on Information Technology of IEEE Symposium, pp. 618-623 (2010)
10. Ahn, G.-J., Shin, D., Hong, S.-P.: Information Assurance in Federated Identity Management: Experimentations and Issues. In: Zhou, X., Su, S., Papazoglou, M.P., Orlowska, M.E., Jeffery, K. (eds.) WISE 2004. LNCS, vol. 3306, pp. 78-89. Springer, Heidelberg (2004)
11. Stephenson, P.: Ensuring Consistent Security Implementation within a Distributed and Federated Environment, pp. 12-14 (2006)
12. Hommel, W., Reiser, H.: Federated Identity Management: Shortcomings of Existing Standards. In: Proceedings of 9th IFIP/IEEE International Symposium on Integrated Management (2005)
13. Smedinghoff, T.J.: Federated Identity Management: Balancing Privacy Rights, Liability Risks, and the Duty to Authenticate (2009)
14. Jøsang, A., Fabre, J., Hay, B., Dalziel, J., Pope, S.: Trust Requirements in Identity Management. In: Australasian Information Security Workshop (2005)
15. Maler, E., Reed, D.: The Venn of Identity: Options and Issues in Federated Identity Management. IEEE Security and Privacy 6(2), 16-23 (2008)
16. Madsen, P., Koga, Y., Takahashi, K.: Federated Identity Management For Protecting Users from ID Theft. In: Proceedings of the 2005 ACM Workshop on Digital Identity Management, pp. 77-83. ACM Press, New York (2005)
17. Mills, E.: Report: ID fraud malware infecting PCs at increasing rates, Security (2009), http://news.cnet.com/8301-1009_3-1019302583.html?tag=mncol;title
18. Shin, D., Ahn, G.-J., Shenoy, P.: Ensuring Information Assurance in Federated Identity Management. In: Proceedings of the 23rd IEEE International Performance Computing and Communications Conference, pp. 821-826 (2004)
19. Sailer, R., Zhang, X., Jaeger, T., van Doorn, L.: Design and Implementation of a TCG-based Integrity Measurement Architecture. In: Proceedings of the 13th USENIX Security Symposium Conference, Berkeley, CA, USA, pp. 223-238 (2004)
20. Khattak, Z.A., Manan, J.A., Sulaiman, S.: Analysis of Open Environment Sign-in Schemes-Privacy Enhanced & Trustworthy Approach. J. Adv. in Info. Tech. 2(2), 109-121 (2011), doi:10.4304/jait.2.2.109-121
21. Trusted Computing Group, Trusted Computing Group Specification Architecture Overview v1.2. Technical Report. Portland, Oregon, USA (2003)
22. Bakhsh, S.: Protecting your data with on-disk encryption, Business Intelligence Solutions, http://www.trustyourtechnologist.com/index.php/2010/07/07/protecting-your-data-with-on-disk-encryption/
23. O'Gorman, L.: Comparing passwords, tokens, and biometrics for user authentication. Proceedings of the IEEE 91(12), 2021-2040 (2003)
24. Shibboleth, http://shibboleth.internet2.edu/
25. Liberty Alliance, http://projectliberty.org/
26. OpenID, http://openid.net/

Appendix: Table 2. Process Steps Detailed Description

User agent call: Tspi_TPM_CollateIdentityRequest. TPM performs: TPM_MakeIdentity, TSS_CollateIdentityRequest.
Process: The user agent performs Tspi_TPM_CollateIdentityRequest, requesting the TPM to create an AIK key and to set up a certificate request to the ISP, IdP or AS, which plays the role of a P-CA. The TPM_MakeIdentity execution creates a new AIK and uses its private key to sign the TPM_IDENTITY_CONTENTS structure; this structure includes the public key, the hashing result and the identity label. The user agent then performs TSS_CollateIdentityRequest, which assembles the data required by the ISP (P-CA) and sends the IdentityRequest (to attest the newly created TPM identity), TPM_IDENTITY_PROOF, to the ISP (P-CA). This message includes the Identity-Binding signature over the TPM_IDENTITY_CONTENTS structure, and in addition the endorsement, conformance and platform credentials. The IdentityRequest message is symmetrically encrypted with a session key, and the session key is asymmetrically encrypted with the public key of the ISP (P-CA). The user agent forwards the Identity-Request (TPM_IDENTITY_REQ) message to the ISP.
On receiving TPM_IDENTITY_REQ, the ISP (P-CA) uses its private key to decrypt the session key and then decrypts the message with the session key. It verifies that the Identity-Request message was generated by a genuine TPM. The ISP responds by sending an ISPResponse (TPM_SYM_CA_ATTESTATION structure) message. This message includes an encrypted version of the identity credential, the TPM_IDENTITY_CREDENTIAL structure, symmetrically encrypted with a session key; the session key is asymmetrically encrypted with the public part of the TPM Endorsement Key (EK).

User agent call: Tspi_TPM_ActivateIdentity. TPM performs: TPM_ActivateIdentity.
Process: The user agent performs Tspi_TPM_ActivateIdentity to receive the AIK credential from the ISP (P-CA) and activate it. For this the TPM performs TPM_ActivateIdentity to obtain the session key used to encrypt the identity credential; only the TPM EK private part can decrypt the session key that was encrypted with the TPM EK public part. The user agent then performs TSS_RecoverTPMIdentity to decrypt the AIK certificate (TPM_IDENTITY_CREDENTIAL) using the session key.

User agent calls: Tspi_Key_CreateKey, Tspi_Key_CertifyKey. TPM performs: TPM_CreateWrapKey, TPM_CertifyKey2.
Process: The certified AIK private part cannot be used to sign data external to the TPM. Therefore the user agent should create another non-migratable key pair (D) by calling Tspi_Key_CreateKey (which performs the command TPM_CreateWrapKey) and then sign the newly created key with the AIK private part by calling Tspi_Key_CertifyKey (which performs the command TPM_CertifyKey2).

User agent calls: Tspi_Key_CreateKey, Tspi_Key_CertifyKey. TPM performs: TPM_CreateWrapKey or TPM_CMK_CreateKey, TPM_CertifyKey2.
Process: The user agent should create a non-migratable key pair or a certified migratable key pair (E) using Tspi_Key_CreateKey. TPM_CMK_CreateKey can be used if the user wants to migrate the key to another TPM platform, while TPM_CreateWrapKey creates a non-migratable key pair. TPM_CertifyKey2 is used to sign the new key pair (E) with the private portion of the AIK.

User agent call: Tspi_Hash_Sign. TPM performs: TPM_Sign.
Process: The user agent performs Tspi_Hash_Sign. The TPM performs the command TPM_Sign to sign the E public key (conforming to the X.509 v3 format). The public key certificate works as the user identity certificate for authenticating the client to the Bank Server.

Duplicate File Names - A Novel Steganographic Data Hiding Technique

Avinash Srinivasan1 and Jie Wu2

1 PA Center for Digital Forensics, Bloomsburg University, Bloomsburg, PA 17815
2 Center for Networked Computing, Temple University, Philadelphia, PA 19122
Abstract. Data hiding has been an integral part of human society from the very early days dating back to BC. It has played its role for both good and bad purposes. The first instances of data hiding date back to 440 B.C. and have been cited in several works as one of the first known and recorded uses of steganography. Several complicated steganographic techniques have been proposed in the past decade to deceive detection mechanisms. Steganalysis has also been one of the cornerstones of research in the recent past to thwart such attempts of the adversary to subterfuge detection. In this paper we present a novel, simple, and easy to implement data hiding technique for hiding files with duplicate names. The proposed file hiding technique, Duplicate File Names, uses an innocuous file as the cover medium, exploiting its name and reputation as a good file. This vulnerability was first discovered on a Windows 98 machine with DOS 6.1. We have tested this vulnerability on several different file systems to confirm that the vulnerability exists across file systems and is not specific to older Windows file systems. Finally, we have discussed using this method for legitimate data hiding as well as detecting it when employed for illegitimate data hiding.

Keywords: Digital forensics, duplicate file name, file hiding, identity and data theft, steganography.

Introduction

Steganography has been a great challenge to the digital forensic community from the very beginning. However, one has to be unbiased and recognize the good side of steganography, such as digital copyrighting and watermarking. Several techniques have been developed to detect information hiding accomplished by various steganographic tools employing a limited number of steganographic algorithms. However, the adversary has been consistently successful in developing new techniques to achieve the same. In this paper we expose a potentially serious vulnerability which was first discovered on a Windows 98 machine with DOS 6.1.
The problem was identified while recovering deleted files on a FAT12 formatted floppy disk using DiskEdit. Norton DiskEdit is a hex editor for logical and physical disk drives on all Windows file systems. It is an undocumented utility that comes along with the standard Norton Utilities package for Windows. The


aforementioned vulnerability persists across the FAT file system family - FAT12, FAT16, and FAT32. The vulnerability can be formally stated as follows: A malicious file can be renamed, using a simple hex editor tool, to bear the same name as that of a known good file on the media, in order to evade simple detection schemes including visual examination.
This vulnerability is as powerful as it appears simple. An average computer user with knowledge of the underlying file system's structure and layout can easily traffic important files in and out of a room, building, or the country. To accomplish this, all he needs is a simple hex editor tool such as DiskEdit or HxD. Such files can range anywhere from simple and not so critical data, like a coworker's salary and bonus package, to important business data like design and development blueprints and intellectual property. From a national security perspective, this could be a document with classified information or a terrorist plot. Nonetheless, these files can also be potentially dangerous viruses, malware, or child porn image and video files.
The question that many of us want an answer to is: Is this the most sophisticated data hiding technique? The simple answer is no. However, the answer neither mitigates the risk nor eliminates the threat from such a simple data hiding technique.
In this paper we will discuss the structure of a simple FAT file system - FAT12. We then discuss the steps by which malicious files can be hidden in plain sight, thereby easily evading the detection and visual inspection techniques employed. Simple and routine inspections are commonly deployed at the periphery of an organization, for example by a security guard who can be directed to inspect the files carried out by employees working in certain restricted areas. The idea of this research work is to develop a simple and easy to use tool that can be used to detect and thwart such simple information theft, which can potentially cause irreversible business losses and jeopardize national security. We then discuss in detail the reverse engineering process of extracting such files.
The remainder of this paper is organized as follows. In Sec. 2 we review some of the important works in the field of steganography relevant to our work. We then present two application scenarios discussing in detail the presented data hiding technique in Sec. 4. In Sec. 3, we discuss the requirements for this method of data hiding to work, the categorization of storage devices, and related issues. We also present in this section the various areas on a disk where data can be hidden that usually would not hold user data otherwise. Later, in Sec. 5, we present some fundamental information on file systems, with FAT12 as an example because of its simplicity. In Sec. 6 we discuss the details of how files can be hidden in plain sight by exploiting the vulnerability presented in this paper. We present a detailed detection and recovery process in Sec. 7. Finally, in Sec. 8, we conclude our work with directions for future research.

Related Work

Steganography can be used to insert plain or encrypted data in a cover file to


avoid detection. The sole purpose of steganography is to conceal the very fact


that something exists as opposed to cryptography which aims at rendering the


contents uninterpretable.
McDonald and Kuhn's StegFS [MK2000] hides encrypted data in the unused blocks of a Linux ext2 file system. Consequently, it makes the data look like a partition in which unused blocks have recently been overwritten. Furthermore, the proposed method of overwriting with random bytes mimics a disk wiping tool.
The Metasploit Anti-Forensics Project [MetaSploit] seeks to develop tools and techniques for removing forensic evidence from computer systems. This project includes a number of tools, including Timestomp, Slacker, and SAM Juicer, many of which have been integrated in the Metasploit Framework. Metasploit's Slacker hides data within the slack space of the FAT or NTFS file system.
FragFS [TM2006] hides data within the NTFS Master File Table. It scans the MFT table for suitable MFT entries that have not been modified within the last year. It then calculates how much free space is available and divides it into 16-byte chunks for hiding data.
RuneFS [G2005] stores files on blocks it assigns to the bad blocks inode, which happens to be inode 1 in ext2. Forensic programs are not specifically designed to look at the bad blocks inode. Newer versions of RuneFS also encrypt files before hiding them, making detection a two-fold problem.

Hiding Information on Storage Devices

In this section we will list the requirements for successful data hiding and the various areas on the storage volume where data can be hidden.
3.1 Requirements

For successful data hiding using the Duplicate File Names method, the following requirements have to be met.
1. The cover file should always have a lower starting cluster number compared to the file to be hidden. This is because the OS, when a file is accessed, will always open the file with the lower starting cluster number. This is true and has been verified on all three FAT file systems.
2. The cover file and the hidden file have to be at the same hierarchical level in the directory structure. In light of this point, we have to ask the following question:
Is it possible to have two files with the same name but different contents at the same hierarchical level, i.e., on the same drive, inside the same partition, and inside the same folder?
The answer to this question is no. Trivially, there are two ways of attempting to create two files with the same name:
1. Renaming an existing file - Two files already exist inside a folder with different names. Try to rename one of them to have the same name as

the other, by either right-clicking or by opening the file and using the "save as" option under the file menu. An error message will pop up.
2. Creating a new file - A file already exists. Try to create a new file and save it in the same folder as the existing one, with the same name. This is the same as opening an existing file and using the "save as" option. Once again you will see an error message pop up.
In summary, one cannot save two files with the same name inside the same directory without overwriting. Once overwritten, the original file content will be lost forever, although parts of it may be recovered from slack space. Nonetheless, creating multiple files with duplicate names can be easily accomplished with the use of any freely available hex editor. This requires some knowledge of the underlying file system and the associated OS. With the help of a hex editor, the adversary can rename multiple files with a single name. Since, with a hex editor, we work below the file system, the OS will not complain about the file already existing. Neither does the OS overwrite the contents of the original file. This way, there can be several files with the same name inside the same directory. This has been illustrated in Fig. 1.

Fig. 1. Screenshot of a diskette storing two files with exactly the same name and extension at the same hierarchical level

There are several common areas on the disk that are either unused or reserved and can serve the purpose of hiding data without interfering with the intended primary operations of the storage partition. Below is a list of areas common to both OS partitions and non-OS partitions.


Slack Space - RAM and File Slack
Boot Sector of a non-bootable partition
Unallocated Space
Volume Slack

Application Scenario

In this section we present two application scenarios in different domains to emphasize the potential threat that the Duplicate File Name data hiding technique can pose.
1. Scenario 1: Child Pornography: A child pornographer can hide child porn images and/or videos using the same name as that of an innocuous looking image and/or video file, respectively. The child pornographer can be doing this at his work place or at home. Since the two files have the same name, clicking on either will always open the known good cover file.
2. Scenario 2: Information Theft: A company employee can easily steal confidential and proprietary data. He can save it onto his system with the name of a file he has privilege to access, then copy both the original file and the file he is stealing with the duplicate name, and walk out. Even if there is any security screening, nobody would immediately wonder how two files with the same name can be copied to the same directory.
The following two situations have to be clearly differentiated. Duplicate files can have the same name or different names. If they have the same name and are inside the same volume on a drive, then there will be only one root directory entry for all copies of the file with the same name. However, if duplicate copies have different names, then there will be a separate root directory entry for each copy with a different name, irrespective of the hierarchy they reside at. In the former situation, as long as duplicate copies are inside the same volume, copies with the same name will have consistent data as long as they are duplicates. However, in the latter situation, modifying a file will not update the duplicate copies with different file names.
As already mentioned, in this paper we are trying to resolve the first scenario. There are commercially available tools to handle the second and third scenarios. The fourth scenario is benign and poses no threat as such.

Hiding On Floppy Disk

For simplicity, we consider the example of hiding a malicious file on a floppy disk formatted with the FAT12 file system. Additionally, to enable the reader to appreciate and understand the file hiding technique presented in this paper, we will briefly discuss the layout of a floppy disk formatted with the FAT12 file system and its important data structures, as shown in Fig. 2.
The entire floppy disk can be divided into two main regions.


1. System Region
2. Data Region
The system region consists of important system areas and data structures as follows:
1. Boot Sector
2. File Allocation Table
(a) Primary FAT
(b) Secondary FAT
3. Root Directory
For file recovery, the two most critical regions are the File Allocation Table and the Root Directory. The standard, default size of a root directory entry is 32 bytes and is consistent across the three FAT file systems - 12, 16 and 32. In this paper we will restrict our discussions to FAT file systems for simplicity of conveying the idea. The 32-byte directory entry of a file stored on a FAT formatted volume holds some critical information, listed below, that can be useful in detecting different files with duplicate names.
1. File Name
2. File Extension
3. File Attribute(s)
4. Create Date
5. Created Time
6. Last Accessed Date
7. Modified Date
8. Modified Time
9. Start Cluster Number
10. File Size

In particular, for files that have different content but the same name and extension, the start cluster numbers have to be unique. The file size, in almost all cases, should be different as well; however, it cannot serve as evidence to trigger suspicion nor serve as a confirmatory litmus test.
The same vulnerability can be seen from another perspective of having positive applications, including hiding password files in plain sight. Such files can be accessed and opened on the fly by the methods presented later in this paper.
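To make these directory-entry fields concrete, the minimal Python sketch below walks the root directory of a FAT12 image and prints the fields most relevant to duplicate-name detection. It is only an illustration and not part of the authors' tool; the BIOS Parameter Block and directory-entry offsets used are the standard FAT12 ones, and the image file name is hypothetical.

import struct

def root_dir_entries(image_path):
    """Yield (name, ext, attr, start_cluster, size) for every root-directory
    entry of a FAT12-formatted image such as a 1.44 MB floppy image."""
    with open(image_path, "rb") as f:
        boot = f.read(512)
        # Standard BIOS Parameter Block fields (little-endian).
        bytes_per_sector = struct.unpack_from("<H", boot, 11)[0]
        reserved_sectors = struct.unpack_from("<H", boot, 14)[0]
        num_fats = boot[16]
        root_entries = struct.unpack_from("<H", boot, 17)[0]
        sectors_per_fat = struct.unpack_from("<H", boot, 22)[0]
        # The root directory follows the reserved area and both FAT copies.
        root_offset = (reserved_sectors + num_fats * sectors_per_fat) * bytes_per_sector
        f.seek(root_offset)
        data = f.read(root_entries * 32)
    for i in range(0, len(data), 32):
        entry = data[i:i + 32]
        if entry[0] == 0x00:                       # no more entries
            break
        if entry[0] == 0xE5 or entry[11] == 0x0F:  # deleted or long-name entry
            continue
        name = entry[0:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        attr = entry[11]
        start_cluster = struct.unpack_from("<H", entry, 26)[0]
        size = struct.unpack_from("<I", entry, 28)[0]
        yield name, ext, attr, start_cluster, size

if __name__ == "__main__":
    for name, ext, attr, cluster, size in root_dir_entries("floppy.img"):
        print(name, ext, hex(attr), cluster, size)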

The Process of Hiding

In this section, we will discuss the method of hiding files with a duplicate name using a hex editor tool. The tool uses size as the key requirement for choosing a cover file. Extension is not a key concern when choosing the cover file, since the extension can easily be modified for the malicious file to match that of the cover file.
Without loss of generality, we will use "Good File" to refer to the cover file being used, whose name will not cause any suspicion or raise flags, and "Bad


File" to refer to the file being hidden, which can be proprietary information of a corporation, a child pornography image or video, etc.

Fig. 2. The two main regions of a FAT12 formatted floppy disk, and the regions and data structures within the system region of a FAT12 formatted floppy disk
The tool scans the entire root directory and returns the top five files whose size and attributes very closely match the given file. Then the user can choose a file whose name and extension will be used as the cover for hiding the malicious file. Once the user makes his choice, the rest is very simple.
1. The user initially saves the file to be hidden on the storage device.
2. The user then loads the storage device into a hex editor and opens it.
3. The user locates the entry in the root directory for the file to be hidden.
4. The user overwrites the name and extension of the file to be hidden with the name and extension of the cover file.
5. The user saves the changes made to the storage device.
6. Now, when the storage device is opened on any system, you can see two files with the exact same name and extension at the same hierarchical level.
A minimal sketch of these steps is given below.
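The sketch, a minimal illustration assuming a standard FAT12 image such as the one in Fig. 2, automates steps 3-5: it locates the root-directory entry of the file to be hidden and overwrites its 11-byte name/extension field with that of the cover file. It is intended only for the education and training purpose noted in the conclusion; the file names and image path are hypothetical.

import struct

def duplicate_name(image_path, hide_name, cover_name):
    """Rename the root-directory entry of hide_name (8.3 form, e.g. 'SECRET.TXT')
    so that it matches cover_name, leaving its start cluster and size untouched.
    For study on scratch images only."""
    def to_83(n):                          # 'README.TXT' -> b'README  TXT'
        base, _, ext = n.upper().partition(".")
        return base.ljust(8)[:8].encode() + ext.ljust(3)[:3].encode()

    with open(image_path, "r+b") as f:
        boot = f.read(512)
        bps = struct.unpack_from("<H", boot, 11)[0]
        reserved = struct.unpack_from("<H", boot, 14)[0]
        fats = boot[16]
        entries = struct.unpack_from("<H", boot, 17)[0]
        spf = struct.unpack_from("<H", boot, 22)[0]
        root = (reserved + fats * spf) * bps   # byte offset of the root directory
        target = to_83(hide_name)
        for i in range(entries):
            f.seek(root + i * 32)
            entry = f.read(32)
            if entry[0] == 0x00:               # reached the end of the directory
                break
            if entry[0:11] == target:
                f.seek(root + i * 32)
                f.write(to_83(cover_name))     # steps 4-5: overwrite name + extension
                return True
    return False

# Example with hypothetical names:
# duplicate_name("floppy.img", "SECRET.TXT", "NOTES.TXT")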

The Process of Detection and Recovery

Detecting files with duplicate names but different content can be performed in two different ways. Both these methods are described in detail below. Once two


or more files are detected to have the same name but different content using the methods below, they have to be recovered without losing data, for their potential evidentiary value.
7.1 Renaming Method of Detection

1. Open the disk in a hex editor tool.
2. Scan the root directory entries on the entire disk, including subdirectories, for duplicate file names including the extension. If there is more than one file with the same name and extension, then cross-check their start cluster numbers and logical file sizes.
3. Two files with the same name and extension, if they are exactly the same in content, should have the exact same start cluster number and logical size.
4. If the result of this test confirms that the files under scrutiny have the same start cluster number, then they can be ignored, since they represent duplicate files.
5. If the result of this test confirms that the files with duplicate names have different start cluster numbers, then they are clearly different.
6. The logical size cannot be used as a confirmatory test, since two files with the same name but different contents can have the same size.
7. Both these files can now be retrieved to a different location, such that the original content is not altered, and renamed as DIRTY-1.EXT and DIRTY-2.EXT.
Now open both files. Having named them differently, the malicious file will no longer be protected, since accessing it now will reveal the actual content.
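The core of this check (steps 2-6) can be automated. The minimal sketch below groups already-parsed root-directory entries by name and extension and flags groups whose start clusters differ; the sample entries are hypothetical.

from collections import defaultdict

def find_duplicate_names(entries):
    """entries: iterable of (name, ext, start_cluster, size) tuples.
    Returns the groups that share a name and extension but point to
    different start clusters, i.e. genuinely different files."""
    groups = defaultdict(list)
    for name, ext, cluster, size in entries:
        groups[(name, ext)].append((cluster, size))
    suspicious = {}
    for key, items in groups.items():
        clusters = {cluster for cluster, _ in items}
        if len(clusters) > 1:        # same name and extension, different content locations
            suspicious[key] = items  # size alone is not a confirmatory test (step 6)
    return suspicious

# Hypothetical listing: two PHOTO.JPG entries pointing at different clusters.
sample = [
    ("PHOTO", "JPG", 3, 51200),
    ("PHOTO", "JPG", 120, 51200),
    ("NOTES", "TXT", 7, 1024),
]
for (name, ext), items in find_duplicate_names(sample).items():
    print(name + "." + ext, "start clusters:", [cluster for cluster, _ in items])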

Conclusion and Future Work

In this paper, we have exposed a subtle yet important vulnerability in file systems, specifically FAT, that can be exploited to hide files in plain sight and evade detection. We have also proposed simple solutions to overcome such data hiding techniques and detect hidden files. We will continue to investigate along these lines to uncover any such data hiding techniques that have been either unknown or dismissed as too trivial. We have shown strong reasons, through example application scenarios, where such simple techniques can have a big payoff for the adversary with minimum risk. In the second phase of this project we will be developing a tool that can be used to hide information in plain sight exploiting the same vulnerability. The tool will be primarily targeted at education and training purposes.
As part of our future work we will be investigating anti-forensics techniques - techniques that are specifically designed to hinder or thwart forensic detection of criminal activities involving digital equipment and data. Also on our agenda of future research is Denial-of-Service attacks exploiting file system knowledge.

References
[HBW2006] Huebner, E., Bem, D., Wee, C.K.: Data hiding in the NTFS file system. Digital Investigation 3(4), 211-226 (2006)


[AB1992] Abramson, N., Bender, W.: Context-Sensitive Multimedia. In: Proceedings of the International Society for Optical Engineering. SPIE, Washington, DC, September 10-11, vol. 1785, pp. 122-132 (1992)
[BGM1995] Bender, W., Gruhl, D., Morimoto, N.: Data Hiding Techniques. In: Proceedings of SPIE 2420 (1995)
[BGML1996] Bender, W., Gruhl, D., Morimoto, N., Lu, A.: Techniques for Data Hiding. IBM Systems Journal 35(3&4) (1996)
[BL2006] Buskirk, Liu: Digital Evidence: Challenging the Presumption of Reliability. Journal of Digital Forensic Practice 1, 19-26 (2006), doi:10.1080/15567280500541421
[GM2005] Garfinkel, Malan: One Big File is Not Enough: A Critical Evaluation of the Dominant Free-Space Sanitization Technique. In: The 6th Workshop on Privacy Enhancing Technologies, June 28-30. Robinson College, Cambridge
[LB2006] Liu, Brown: Bleeding-Edge Anti-Forensics. In: Infosec World Conference & Expo. MIS Training Institute
[MK2000] McDonald, A., Kuhn, M.: StegFS: A Steganographic File System for Linux. In: Pfitzmann, A. (ed.) IH 1999. LNCS, vol. 1768, pp. 463-477. Springer, Heidelberg (2000)
[TM2006] Thompson, I., Monroe, M.: FragFS: An Advanced Data Hiding Technique. In: BlackHat Federal (2006)
[MetaSploit] Metasploit Anti-Forensics Project, http://www.metasploit.com/research/projects/antiforensics/
[G2005] Grugq: The Art of Defiling, Black Hat (2005), http://www.blackhat.com/presentations/bh-usa-05/bh-us-05-grugq.pdf

A Framework for Securing Web Services by Formulating a Collaborative Security Standard among Prevailing WS-* Security Standards

M. Priyadharshini1, R. Baskaran2, Madhan Kumar Srinivasan3, and Paul Rodrigues4

1,2 Computer Science Department, Anna University, Chennai, India
mpriya1977@gmail.com, baaski@annauniv.edu
3 Education & Research, Infosys Technologies, Mysore, India
madhan_srinivasan@infosys.com
4 Department of IT, Hindustan University, Chennai, India
deanit@hindustanuniv.ac.in

Abstract. Web Services enable communication between applications with little concern for the underlying mechanics of communication. This paper provides a brief introduction to security concepts and describes in detail the various specifications related to security in the WS-* family and the associations among those specifications. The Web Service standards available do not completely address security for web services. In this paper we propose a framework consisting of components which can secure web service interactions and facilitate interoperability between various WS-* security standards, by devising a collaborative security standard based on the associability of the WS-* security standards; it can furthermore be customized by optimizing the selection and projection functions of the standard list and parameter list. The parameter list is in turn formulated from a clear understanding of the associations of the WS-* security standards.
Keywords: WS-* family, collaborative security standard, interoperability, web
services.

1 Introduction
Today's enterprises take advantage of the benefits of loosely coupled web services and have made them an integral part of their business processes. Therefore, the need for security in business processes raises the level of security needed in web services as well. The loose coupling is possible in web services due to the extensive usage of XML (Extensible Mark-up Language). XML is used in web services for describing, requesting, responding and so on, which drives us to secure XML messages if web services need to be secured. The chapter just following briefs about the Web Service Model, Chapter III about the various security issues that need to be addressed in web services, and Chapter IV describes the formulation of the collaborative security standard and the proposed framework, which provides an interoperable and secure gateway for web service usage. Chapter V briefs about the various WS-* security standards along with the


issues addressed by those specifications, followed by Chapter VI about the associations that exist between the standards, which serve as the basis for formulating the collaborative security standard. In Chapter VII selection criteria based on scenarios are presented with a few scenarios, and finally Chapter VIII gives how the evaluation process can be done for evaluating the security provision of the framework.

2 Web Service Model


The Web service model is one of the approaches for building SOA (Service Oriented Architecture). The service provider creates a web service and its service definition and publishes it in the service registry. The service requestor finds the service in the registry and obtains the WSDL description and the URL of the service itself. The service requestor, with the help of the information obtained, binds to the service and invokes it. Figure 1 shows the web services model as an interaction between service requestor and service provider through the UDDI registry, which is the same as that of the Service Oriented Architecture.

Fig. 1. Web Services Model

The core technologies which form the foundation of Web services are SOAP,
WSDL and UDDI.
2.1 SOAP
Simple Object Access Protocol (SOAP) is used as a standard to exchange messages between client applications and services that run on servers over the Internet infrastructure. The method invocation is made as a SOAP request and the result is passed as a SOAP response. SOAP messages are in the form of XML, encapsulating <Soap:Header> as an optional element and <Soap:Body> as a mandatory element inside a <Soap:Envelope> [1]. The SOAP header holds the information needed by the SOAP node


to process the SOAP message, such as authentication, routing, etc. The SOAP body contains the information to be sent to the SOAP message receiver. The format of a SOAP request and response is as follows [7]:
Table 1. SOAP Request invokes OrdItem() method from http://www.Tanishq.com/Order and
SOAP Response passes order number generated on processing the order to the client
SOAP Request to Process Order

<Soap:Envelope
  xmlns:Soap="http://schemas.xmlsoap.org/soap/envelope/"
  Soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <Soap:Body>
    <Ord:OrdItem xmlns:Ord="urn:Order">
      <CID>70010</CID>
      <ItNum>105057</ItNum>
      <ItNme>WGRWRD</ItNme>
      <ItDesc>WhiteGoldRingWithRoundDiamond</ItDesc>
      <ItPrice>8332</ItPrice>
      <OrdDateTime>2010-02-10 0:10:56</OrdDateTime>
    </Ord:OrdItem>
  </Soap:Body>
</Soap:Envelope>

SOAP Response on Processing Order

<Soap:Envelope
  xmlns:Soap="http://schemas.xmlsoap.org/soap/envelope/"
  Soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <Soap:Body>
    <Ord:OrdItemResponse xmlns:Ord="urn:Order">
      <OrdNum>20014</OrdNum>
    </Ord:OrdItemResponse>
  </Soap:Body>
</Soap:Envelope>

Table 2. Sample WSDL for Placing Order is specified


OrderItem.WSDL

<?xml version="1.0" encoding="UTF-8"?>
<definitions name="OrdService" targetNamespace="urn:Order" xmlns:tns="urn:Order"
    xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <message name="OrdItem">
    <part name="CID" type="xsd:int"/>
    <part name="ItNum" type="xsd:int"/>
    <part name="ItNme" type="xsd:string"/>
    <part name="ItDesc" type="xsd:string"/>
    <part name="ItPrice" type="xsd:double"/>
    <part name="OrdDateTime" type="xsd:string"/>
  </message>
  <message name="OrdItemResponse">
    <part name="OrdNum" type="xsd:int"/>
  </message>
  <portType name="OrdItemPort">
    <operation name="OrdItem" parameterOrder="CID ItNum ItNme ItDesc ItPrice OrdDateTime">
      <input message="tns:OrdItem"/>
      <output message="tns:OrdItemResponse"/>
    </operation>
  </portType>
  <binding name="OrdItemBinding" type="tns:OrdItemPort">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http" style="rpc"/>
    <operation name="OrdItem">
      <soap:operation soapAction=""/>
      <input>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            use="encoded" namespace="urn:Order"/>
      </input>
      <output>
        <soap:body encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            use="encoded" namespace="urn:Order"/>
      </output>
    </operation>
  </binding>
  <service name="OrderService">
    <port name="Order" binding="tns:OrdItemBinding">
      <soap:address location="http://www.Tanishq.com/Order"/>
    </port>
  </service>
</definitions>

2.2 WSDL
WSDL is an XML document which is the web service interface published by the
service providers. Service requestors who wish to access the service can read and
interpret the WSDL file. The information in the WSDL file is as follows:

Location of the service
Operations performed by the service
Communication protocol supported by the service
Message format for sending and receiving requests and responses

2.3 UDDI
UDDI (Universal Description, Discovery and Integration) is the directory which holds the list of web service interfaces provided by various businesses. The interfaces are represented using WSDL, which is rendered when businesses find the interfaces suitable for their search. UDDI registries are public or private platform-independent frameworks driven by service providers like Dell, IBM, Microsoft, Oracle, SAP, and Sun, as well as a few e-business leaders.
Web services are a powerful technology for distributed application development and integration. Today most e-commerce applications are based on the involvement of web services, which makes web services an essential element in the current scenario. The next chapter elaborates on security issues in web services.

3 Security Issues
As stated earlier, Web Services rely on the Internet infrastructure, and hence the security issues encountered in networks are encountered in web services as well.
3.1 Confidentiality
Confidentiality specifies that the content of the message should be accessed only by
the sender and receiver. This is achieved by appropriate encryption and decryption

A Framework for Securing Web Services

273

algorithms applied to the entire message or to parts of the message. SSL using HTTPS can provide point-to-point data privacy, i.e. security at the transport level. At the application level, sensitive data fields can be protected with encryption mechanisms. Sniffing or eavesdropping is an attack with respect to confidentiality.
3.2 Authentication
Authentication is the establishment of proof of identities among the entities involved in the system. A username and password are used for authenticating the user at the platform level. At the message level, to provide authenticity, SOAP headers [5] are extended with a user name and password, assigned tickets, and certificates such as Kerberos and X.509 certificates. At the application level, custom methods can be included for authentication. Single Sign-On or a trust relationship needs to be incorporated in routing to provide authentication between multiple services.
3.3 Authorization
One entity may be authorised to do certain operations and access certain information whereas others may not be. In web services, access control mechanisms need to be provided in the form of XML (XACML and SAML). Access control may be based on Role (RBAC), Context (CBAC), Policy (PBAC), Attribute (ABAC) and so on [2].
3.4 Non-Repudiation
Repudiation is disclaiming the sending or receiving of a message, or the time of sending and receiving it; non-repudiation prevents such disclaiming. For critical and secure service access, non-repudiation is one of the major issues. A central arbiter, a Trusted Third Party (TTP) [1], should be introduced along with XML Signature to provide security in these cases.
3.5 Availability
Availability means that authorized resources and services are available at all times. Denial of Service (DoS) is the most commonly encountered problem related to availability.
3.6 Integrity
A change of message content during transit leads to loss of integrity. It is mainly concerned with the web service description (WSDL) file. If this file is tampered with and changed, the intended service may not get bound to the requestor, and problems may even arise in case of composition. A proper hashing algorithm or XML Signature may overcome this issue.

4 Proposed Security Framework Including Formulation of Collaborative Standards
4.1 Security Framework
The Proposed Security Framework consists of components such as Security Manager,
Static Analyser and Dynamic Analyser. Figure 2 depicts the Security Framework


which serves as a gateway to ensure the security of web service access from various distributed client applications. The Web Service model involves the process of publishing and invoking services. The proposed Security Framework includes security list formation and a corresponding parameter list, which is devised as a collaborative security standard.
The Static Analyser is the component invoked during registration of a service; it guides the service provider or publisher to customise and hence record the security standard values for the Standard List as well as the corresponding Parameter List.

Fig. 2. Security Framework

The Dynamic Analyser component is invoked during discovery and execution of the service; it checks the correctness of the security needs specified in the standard at various levels, such as the message and transport levels.
The Security Manager is the component in the framework which manages the proper execution of the framework, maintaining the logs made during static and dynamic analysis.
SM = {<SL, PL(SL), Slog> | af(<SL, PL(SL), Slog>)}    (1)

Where
SL - Standard List
PL(SL) - Parameter List of a Standard List item
Slog - Service Log
af - registering or access function
4.2 Collaborative Security Standard
The collaborative security standard consists of a Standard List and a Parameter List. A Standard List is selected based on a precise understanding of the security needs and of the WS-* security standards and their associations which could address those needs.


The Standard List (SL) is formulated with the WS-* standards (all XML security standards) pertaining to security as input:

SL = {I | I ∈ WS}
WS = {t | t ∈ sf(WS-*)}    (2)

Where
I - Standard List item
sf - selection function selecting among WS-* standard items with security as the objective

The Parameter List (PL) is formed for each Standard List (SL) item found to be suitable for inclusion:

PL = {P | P ∈ pf(SL)}    (3)

Where
pf - projection function to list out only mutually exclusive parameters
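As an illustration of how the selection function sf of Eq. (2) and the projection function pf of Eq. (3) could be realized, the Python sketch below maps security objectives to WS-* standards and standards to parameters. The two mappings shown are illustrative assumptions for the example, not a normative catalogue of the standards.

# Illustrative mapping of security objectives to WS-* standards (assumed data).
OBJECTIVE_TO_STANDARDS = {
    "confidentiality": {"WS-Security"},
    "authorisation": {"WS-Trust"},
    "authentication": {"WS-Security"},
    "non-repudiation": {"WS-SecureConversation"},
    "integrity": {"WS-Security", "WS-SecurityPolicy"},
}

# Illustrative parameters offered by each standard (assumed data).
STANDARD_TO_PARAMETERS = {
    "WS-Security": {"XMLEncryption", "XMLSignature", "Username Token Profile"},
    "WS-Trust": {"SAML Assertion", "XACML Assertion", "STS Token"},
    "WS-SecureConversation": {"STS Token", "X.509 Token", "Kerberos Token"},
    "WS-SecurityPolicy": {"XMLSignature"},
}

def sf(objectives):
    """Selection function of Eq. (2): WS-* standards covering the given objectives."""
    selected = set()
    for objective in objectives:
        selected |= OBJECTIVE_TO_STANDARDS.get(objective, set())
    return selected

def pf(standard_list):
    """Projection function of Eq. (3): parameters of the selected standards;
    the set union keeps each (mutually exclusive) parameter only once."""
    parameters = set()
    for standard in standard_list:
        parameters |= STANDARD_TO_PARAMETERS.get(standard, set())
    return parameters

SL = sf({"confidentiality", "integrity"})   # e.g. objectives similar to Scenario 2 in Sect. 7
PL = pf(SL)
print("SL =", sorted(SL))
print("PL =", sorted(PL))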

5 WS-* Security Standards


The Organization for the Advancement of Structured Information Standards (OASIS) and the World Wide Web Consortium (W3C) devised many WS standards which are used for providing security, reliability, and transaction abilities in web services. These WS-* standard specifications help to enhance the interoperability between different industry platforms, most notably Microsoft's .NET platform and IBM's WebSphere software. This chapter will discuss the standards which concentrate on security.
5.1 WS-Policy
WS-Policy [9] is a general-purpose framework model that describes web service related policies. A policy can specify properties, requirements and capabilities of web services. While a service request is being sent, this policy is used to validate and accept the request. For example, a policy can mandate the web service for NEFT (National Electronic Fund Transfer) to provide service between 10:00 AM and 5:00 PM from Monday to Friday and between 10:00 AM and 2:00 PM on Saturdays, or that a request should be signed using X.509.
The policies defined in WS-Policy can be attached to service endpoints or XML data using WS-PolicyAttachment. The policies can be retrieved from a SOAP node using WS-MetadataExchange. Specific policy assertions related to text encoding, SOAP protocol version, and predicates that enforce the header combinations existing between SOAP messages are defined using WS-PolicyAssertions.
5.2 WS-SecurityPolicy
WS-SecurityPolicy [8] consists of security related assertions such as the Security Token assertion, which tells the requestor which security token needs to be used while calling a given web service. The other assertions include assertions specifying Integrity, Confidentiality, and Visibility, which are used to specify the message parts that need to be protected and the parts that need to remain unencrypted. Message expiry can be prompted using the Message Age assertion. For instance, the XPath-based SignedElements assertion is applied to arbitrary message elements that need integrity protection. The RequiredParts and RequiredElements assertions, using QNames and XPath, are used to specify the header elements the message should contain.
WS-SecurityPolicy also contains assertions related to cryptographic algorithms, transport binding, and the order of applying cryptographic algorithms.
5.3 WS-Security
The WS-Security standard addresses confidentiality and integrity of XML messages transferred as requests and responses. The header <wsse:Security> [12] is used to attach security related information. The WS-Security standard defines cryptographic processing rules and methods to associate security tokens. Since SOAP messages are processed and modified by SOAP intermediaries, mechanisms such as SSL/TLS are insufficient to provide end-to-end security of SOAP messages, and hence WS-Security gains importance.
WS-Security specifies that a signature confirmation attribute is included in the digital signature of the request and included back in the response message, as a signed receipt, in order to ensure that the request or response is tied to the corresponding response or request.
WS-Security defines a mechanism to associate a security token by including it in the <wsse:Security> header, and a reference mechanism to refer to tokens in binary and XML formats. The Username Token Profile adds a literal plaintext password, hashed password, nonce (time variant parameter), and creation timestamp to the already available Username Token. The Kerberos token profile defines the way in which Kerberos tickets are embedded into SOAP messages. The other profiles include the WS-Security X.509 Certificate token profile, the SAML token profile and the Rights Expression Language token profile.
5.4 WS-SecureConversation
WS-SecureConversation [10] defines a way to establish security contexts, identified by a URI, which permit an existing SSL/TLS connection to be shared by subsequent requests to a web server at the transport level. When the overheads related to key management rise due to the introduction of message level security, and as a result scalability becomes a problem, this standard proves to be a better solution.
There are three different ways to establish security contexts. First, SCT (Security Context Token) retrieval using WS-Trust, i.e. the SCT is retrieved from a security token

service trusted by the web service. Second, the SCT is created by the requestor, which carries the threat of being rejected by the web service. Third, a security context is mutually agreed by the requestor as well as the provider using a challenge-response process. This SCT is then used to derive the session key, which is used for subsequent encryption and authentication codes. When the security context time exceeds the communication session it will be cancelled, but if it expires it has to be renewed.
5.5 WS-Trust
The WS-Trust [11] standard introduces the Security Token Service (STS), which is a web service that issues, renews and validates security tokens. While multiple trust domains are involved, one security token can be converted into another by brokering trust. When a requestor wants to access a web service and does not hold the right security token specified in the policy, the requestor may state the available token and ask the STS for the needed token, or the requestor may delegate the responsibility of finding the right token to the STS itself, state only the available token, and just ask for the right token.
When the requestor includes time variant parameters as entropy while requesting a token, the STS will return secret key material which is called proof-of-possession. In this case the token may be a certificate, whereas the proof-of-possession is the associated private key. A requestor who needs an authorisation token for a colleague, which needs to be valid only until a particular time, can get such a token from WS-Trust.
5.6 WS-Federation
Federation means two or more security domains interacting with each other, letting users access services from the other security domain. Each domain has its own security token service, and each of them has its own security policies.
There are a few XML standards used along with the WS-* security standards discussed above, which help those standards in addressing the security issues. They include XMLSignature, XMLEncryption, SAML (Security Assertion Mark-up Language), XACML (Extensible Access Control Mark-up Language), XKMS (XML Key Management Specification) and so on.
XMLSignature. XMLSignature is the protocol which describes the signing of digital content as a whole or in parts. This provides data integrity and is also important for authentication and non-repudiation of web services. It may also be used to maintain the integrity and non-repudiation of WSDL files, to enable the definition of a web service to be published and later trusted.
XMLEncryption. XMLEncryption ensures confidentiality and hence provides secure exchange of structured data [3]. XMLEncryption can be applied to parts of documents and even to documents in persistent storage, in contrast to SSL or VPN. Algorithms such as RSA and Triple DES are used for encryption; combinations of these algorithms also prove to increase security during message exchange.


SAML. SAML [4] is an XML standard for asserting authentication and authorisation information. Single sign-on (SSO) between different systems and platforms is realised using SAML. SAML does not establish or guarantee the trust between participants; instead it assumes and requires trust between them. Also, SAML does not guarantee confidentiality, integrity or non-repudiability of the assertions in transit. These can only be provided by XMLEncryption and XMLSignature or any other mechanisms supported by the underlying communication protocol and platform.
XACML. The Extensible Access Control Mark-up Language expresses access control rules and policies used to derive access decisions for sets of subjects and attributes. In case of multiple rules and policies, encoding rules, bundling rules into policies, and defining selection and combination algorithms are done by XACML.
An access control list in XACML consists of four-tuples (a small illustrative example follows the list):
Subject - user IDs, groups or role names
Target Object - single document element
Permitted action - read, write, execute or delete (not domain specific)
Provision - executed on rule activation, e.g. initiating log-in, requesting additional credentials, etc.
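A small illustration of such a four-tuple rule, expressed as plain data rather than in XACML's XML syntax (the values are hypothetical):

# One access-control rule as a (subject, target, action, provision) four-tuple.
rule = {
    "subject": {"role": "Doctor"},
    "target": "LabResult/Report",
    "action": "read",
    "provision": "log the access and request an additional credential",
}

def matches(rule, role, target, action):
    """Return True if the request matches the rule; the caller then carries out
    the rule's provision (e.g. logging) before granting access."""
    return (rule["subject"]["role"] == role
            and rule["target"] == target
            and rule["action"] == action)

print(matches(rule, "Doctor", "LabResult/Report", "read"))   # True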

XML Key Management Specification. XKMS is the web service interface which provides a public key management environment for use with XMLSignature and XMLEncryption. It consists of two sub-protocols, the XML Key Information Service Specification and the XML Key Registration Service Specification. The former is used for locating and retrieving public keys from a key server. The latter defines service interfaces to register, revoke and recover escrowed keys with a key server.
So far we have discussed the various WS-* standards that are related to security. Other than these standards, there are also standards which ensure factors such as reliability, transactions and routing in web services, such as WS-ReliableMessaging, WS-Transaction, WS-Routing, WS-Discovery, etc.

6 Collaboration of WS-* Security Standards


None of the standards discussed above provides an entire solution on its own; they need to be used along with other specifications to finally arrive at an end-to-end security standard. For example, WS-Security does not provide session management, and that is done by WS-SecureConversation. The security solution can be tailored by the solution providers according to the specific need. In order to tailor the security solution, it becomes necessary for the service providers and researchers involved in providing such a solution to have a clear insight into the association of these standards in detail, which is as follows.


WS-SecurityPolicy provides assertions specific to security; it extends WS-Policy, which provides all generic assertions. Hence WS-SecurityPolicy fits into WS-Policy. The security assertions specified in WS-SecurityPolicy are utilized by WS-Trust, WS-Security and WS-SecureConversation. Security assertions are represented using SAML.
WS-Trust utilizes WS-Security for signing and encrypting SOAP messages with the help of XMLSignature and XMLEncryption [6]. WS-Trust utilizes WS-Policy/WS-SecurityPolicy for expressing security tokens and to determine which particular security token may be consumed by a given web service.
Fig. 3. Collaboration of WS-* Security Standards


Table 3. Summarises the WS-* Security Standards, their purpose and how they collaborate

Standard | Purpose | Related Standards
WS-Policy | Define assertions for web services | WS-SecurityPolicy
WS-SecurityPolicy | Define security assertions | WS-Trust, WS-Federation, WS-Security
WS-Security | Provide message security | WS-SecureConversation, WS-Federation
WS-SecureConversation | Establish security context | WS-Security
WS-Trust | Security token management | WS-Security, WS-SecurityPolicy
WS-Federation | Enable cross domain access | WS-Security, WS-SecurityPolicy, WS-Trust


WS-Security uses session keys generated by WS-SecureConversation for subsequent encryption and decryption of messages. WS-Federation uses WS-Security, WS-SecurityPolicy and WS-Trust to specify the scenarios in which requestors from one domain can get access to services in the other domain.

7 Scenarios of Security Challenges and Technological Solutions


The proposed system provides us with an environment which can handle different scenarios of security challenges in different ways, but in an integrated and exhaustive manner, ensuring that the whole set of security challenges is covered as per the requirements.
Table 4. Selection criteria for choosing the WS-* Standards based on security challenges

Security Challenge | Standard List | Parameter List
Confidentiality | WS-Security | XMLEncryption
Authorisation | WS-Trust | SAML Assertion, XACML Assertion
Authentication | WS-Security | Username Token Profile, Kerberos token profile, Certificate token profile, SAML token profile, Rights Expression Language token profile
Non-repudiation | WS-SecureConversation | STS, X.509, Kerberos
Availability | WS-Security | XMLSignature
Integrity | WS-Security, WS-SecurityPolicy | XMLSignature, Username Token Profile, Kerberos token profile, Certificate token profile, SAML token profile, Rights Expression Language token profile

SCENARIO #1. A project proposal is formulated aiming at accessing lab results from various laboratories, which are specialists in performing diagnosis for various diseases. This project is intended to be used by all hospital management applications. This could give doctors a clear picture of the status of a patient. In the system requirements, the identified significant non-functional requirement is security.
SCENARIO #2. A renowned bank plans to provide a Tax Payment facility, which accesses a tax calculator service, followed by payment through their payment gateway. The service implementation needs to be incorporated so as to secure the profile of user details and the tax amount as well.


Table 5. We provide a listing for the above said scenarios, which gives the list of security objectives, the possible Standard List and the corresponding Parameter List, which could be the inputs from the Static Analyser to our system and used by the Dynamic Analyser during the discovery and binding process

Scenario #1 - Provider: Diagnostic Laboratories; Requestor: Doctors
Security Challenges: Confidentiality, Authorisation, Authentication, Non-repudiation
Standard List:
SL = {WS-Security, WS-Trust, WS-SecureConversation}
WS = {WS-Security, WS-Trust, WS-SecureConversation | sf(WS-Security, WS-Trust, WS-SecureConversation, WS-SecurityPolicy)}
Parameter List:
PL = {pf(XMLEncryption, SAML Assertion, XACML Assertion, Username Token Profile, Kerberos token profile, Certificate token profile, SAML token profile, Rights Expression Language token profile, STS Token, X.509 Token, Kerberos Token)}
PL = {XMLEncryption, SAML Assertion, XACML Assertion, (Username Token Profile || Kerberos token profile || Certificate token profile || SAML token profile || Rights Expression Language token profile), (STS Token || X.509 Token || Kerberos Token)}

Scenario #2 - Provider: Accounting Offices; Requestor: Bank
Security Challenges: Confidentiality, Integrity
Standard List:
SL = {WS-Security, WS-SecurityPolicy}
WS = {WS-Security, WS-SecurityPolicy | sf(WS-Security, WS-Trust, WS-SecureConversation, WS-SecurityPolicy)}
Parameter List:
PL = {pf(XMLEncryption, XMLSignature, Username Token Profile, Kerberos token profile, Certificate token profile, SAML token profile, Rights Expression Language token profile)}
PL = {XMLEncryption, XMLSignature, (Username Token Profile || Kerberos token profile || Certificate token profile || SAML token profile || Rights Expression Language token profile)}

8 Evaluation Process
The formulation of collaborative security standard done by the framework can be
justified by performing combinations of testing appropriate to the security objectives.
Inputs for this testing are taken from Slog managed by Security Manager.


Security Metric sm = Σ(i=1..n) (Nai - Nfi) / Nai    (4)

Where
n is the number of security objectives,
Nai is the total number of times a client requests the service with security objective i,
Nfi is the number of times the program fails to access the service with security objective i.

A larger value of the security metric (sm) denotes better achievement of the security objectives. The individual values of Nai and Nfi, as well as the security metric (sm), are updated for each discovery and binding in the Slog, which can be used for further optimisations.
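A small sketch of how the metric of Eq. (4) could be computed from the Slog is given below; the log structure (per-objective attempt and failure counts) is an assumption made for illustration.

def security_metric(slog):
    """Compute sm = sum over i of (Nai - Nfi) / Nai for the security
    objectives recorded in the service log (Eq. 4).
    slog: mapping objective -> {"attempts": Nai, "failures": Nfi}"""
    sm = 0.0
    for objective, counts in slog.items():
        na, nf = counts["attempts"], counts["failures"]
        if na == 0:
            continue                      # objective never exercised yet
        sm += (na - nf) / na
    return sm

# Hypothetical log kept by the Security Manager after several bindings.
slog = {
    "confidentiality": {"attempts": 40, "failures": 2},
    "integrity": {"attempts": 25, "failures": 1},
}
print(round(security_metric(slog), 3))    # 1.91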

9 Conclusion
To provide better interoperability it is not enough to have a good level of understanding of these WS-* security standards; the collaboration of these standards also needs to be clearly known, without any discrepancies. The Web Services Interoperability Organization (WS-I) provides security profiles which specify the best combinations of these standards, yet it is difficult to devise a customized collaborative security standard and a framework to implement it, which is what is proposed in this paper. The optimization of the customization process can be performed using the logs maintained by the Security Manager component, which will be taken care of during the implementation of the proposed framework.

References
1. Sinha, S., Sinha, S.K., Purkayastha, B.S.: Security Issues in Web Services: A Review and Development Approach of Research Agenda. AUJST: Physical Sciences and Technology 5(II) (2010)
2. Zhang, Y., Sun, C., Yang, J., Wang, Y.: Web Services Security Policy. In: International Conference on Multimedia Information Networking and Security (2010)
3. Liu, W.-j., Li, Y.: Research and Implementation Based on Web Services Security Model. In: International Conference on Innovative Communication and Asia-Pacific Conference on Information Technology and Ocean Engineering (2010)
4. Nordbotten, N.A.: XML and Web Services Security Standards. IEEE Communications Surveys & Tutorials, 3 (Third Quarter 2009)
5. Kadry, S., Smaili, K.: A Solution for Authentication of Web Services Users. Information Technology Journal 6(7), 987-995 (2007)
6. Geuer-Pollmann, C., Claessens, J.: Web Services & Web Services Security Standards. Information Security Technical Report 10, 15-24. Elsevier (2005)
7. WSDL Binding for SOAP 1.2, http://schemas.xmlsoap.org/wsdl/soap12/soap12WSDL.htm


8. WS-SecurityPolicy 1.2 (July 1, 2007), http://docs.oasis-open.org/ws-sx/ws-securitypolicy/200702/ws-securitypolicy-1.2-spec-os.html
9. Web Services Policy 1.5 - Framework, W3C Recommendation (September 4, 2007), http://www.w3.org/TR/2007/REC-ws-policy-20070904
10. WS-SecureConversation 1.3, OASIS Standard (March 1, 2007), http://docs.oasis-open.org/ws-sx/ws-secureconversation/200512/ws-secureconversation-1.3-os.html
11. WS-Trust 1.3, OASIS Standard (March 19, 2007), http://docs.oasis-open.org/ws-sx/ws-trust/200512/ws-trust-1.3-os.html
12. Web Services Security: SOAP Message Security 1.0 (WS-Security 2004), OASIS Standard (March 01, 2004), http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-soap-message-security-1.0
13. Chakhar, S., Haddad, S., Mokdad, L., Mousseau, V., Youcef, S.: Multicriteria Evaluation-Based Conceptual Framework for Composite Web Service Selection, http://basepub.dauphine.fr/bitstream/handle/123456789/5283/multicriteria_mokdad.PDF?sequence=2

Improved Web Search Engine by New Similarity Measures

Vijayalaxmi Kakulapati1, Ramakrishna Kolikipogu2, P. Revathy3, and D. Karunanithi4

1,2 Computer Science Department, JNT University, Hyderabad, India
vldms@yahoo.com, krkrishna.csit@gmail.com
3 Education & Research, Infosys Technologies Limited, Mysore, India
revathy_madhan@infosys.com
4 Information Technology Department, Hindustan University, Chennai, India
karunanithid@gmail.com

Abstract. Information retrieval is the process of managing the information a user needs. An IR system dynamically captures crawled items that are stored and indexed into repositories; this dynamic process facilitates retrieval of the needed information by a search process and customized presentation to the visualization space. Search engines play a major role in finding the relevant items from huge repositories, where different methods are used to find the items to be retrieved. Surveys on search engines show that naive users are not satisfied with current search results; one of the reasons for this problem is the machine's failure to capture the intention of the user. Artificial intelligence is an emerging area that addresses these problems and trains the search engine to understand the user's interest by inputting a training data set. In this paper we attack this problem with a novel approach using new similarity measures. The learning function which we use maximizes the user's preferred information in the searching process. The proposed function utilizes the query log by considering the similarity between the ranked item set and the user's preferred ranking. The similarity measure facilitates risk minimization and is also feasible for a large set of queries. We demonstrate the framework based on a comparison of the performance of the algorithm, particularly on the identification of clusters using a replicated clustering approach. In addition, we provide an investigation of how clustering performance is affected by different sequence representations, different distance measures, the number of actual web user clusters, the number of web pages, the similarity between clusters, the minimum session length, the number of user sessions, and the number of clusters to form.
Keywords: Search engines, ranking, clustering, similarity measure, Information
retrieval, click through data.

1 Introduction
Web users demand accurate search results. Most naive users are unfamiliar with expert terminology and therefore fail to build the right query for the search engine.
This is one reason why search engines are limited in their capability to provide accurate results. Google, Yahoo, Bing, Ask and other search engines are still at a nascent stage, and they continue to do research to give better results to end users through one click. Query expansion is one dimension of the search engine problem: it adds new terms to the base query to form a new query that the search engine can understand better. We surveyed query expansion techniques in our previous work [7]. We also found it difficult to improve search results by adopting WordNet for term selection in query reformulation [8]. With this experience [7] [8] we propose a novel technique to improve search results.
One basic idea is to record the user's interaction with the search engine. This information can be used as feedback on the base results; such information is known as click-through data. It helps to learn the similarity between or among the query keywords, and such firsthand information is always needed to decide the relevance of the search results. A similarity measure is a function that computes the degree of similarity between two vectors [6]. Different similarity measures are designed so that the function output increases as the item becomes more similar to the query.
Query-term-based query expansion refers to measuring the similarity between the terms of a query using the similarity propagated from the web pages being clicked [9], while document-term-based query expansion refers to measuring the similarity between or among document terms and search queries, based primarily on the search engine's query log data [9]. The idea behind this is that web pages are similar if they are visited by users issuing related queries, and these queries are considered similar if the corresponding users visit related pages. The problem of web personalization has become very popular and critical with the rapid growth in the number of WWW users. The process of customizing the web to meet the needs of specific users is called web personalization [10]. Web customization meets the needs of users with the aid of knowledge obtained from the behaviour of user navigation. User visits are essentially sequential in nature and therefore need efficient clustering techniques; the sequential data set similarity measure (S3M) is able to capture both the order in which visits occur and the content of the web pages. We discuss how click-through information is used in Section 3, and we explore the importance of similarity measures in Section 2 as related work.

2 Related Work
Similarity measures (SM) are used to calculate the similarity between documents (or web items) and a search query pattern. An SM helps to rank the resulting items in the search process and provides the flexibility to present the more relevant retrieved items in the desired order. SMs are used for item clustering and term clustering; statistical indexing and similarity measures are discussed in [11].


2.1 Similarity Measure as Inner Dot Product

The similarity measure SM between item I_j and query Q is measured as the inner (dot) product of their term vectors:

SM(I_j, Q) = \sum_{i=1}^{t} W_{ij} W_{iq}

where W_{ij} is the weight of term i in item j and W_{iq} is the weight of term i in query q, with

W_{ij} = TF_{ij} / TOTF_j ,   TF_{ij} = F_{ij} / \max_k(F_{kj})

where TF_{ij} is the (normalized) occurrence frequency of term i in item j and TOTF_j is the total term frequency of the term in all items of the database. Sometimes less frequent terms in an item may have more importance than more frequent terms; in this case the Inverse Item Frequency (IIF) is taken into consideration, i.e. TF-IIF weighting:

W_{ij} = TF_{ij} \cdot IIF_i = TF_{ij} \cdot \log(N / IF_i)

where N is the total number of items and IF_i is the item frequency of term i. The vectors may be binary or weighted: for binary vectors the inner dot product is the number of query terms matched in the item, and for weighted vectors it is the sum of the products of the weights of the matched terms. The same measure is also used for clustering similar items:

SM(I_i, I_j) = \sum ( Term_i \cdot Term_j )

The inner dot product is unbounded, which favours larger items with more unique terms. A drawback of this technique is that it counts how many query terms match item terms but not how many do not match; when inverse similarity is used for relevance calculation it therefore fails to provide good results.
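As an illustration only (not the authors' implementation), the following Python sketch computes TF-IIF weights and the inner dot product similarity described above; the term lists and the small item collection are hypothetical.

```python
import math
from collections import Counter

def tf_iif_weights(items):
    """items: list of term lists. Returns one TF-IIF weight dict per item."""
    n = len(items)
    # Item frequency IF_i: number of items containing term i.
    item_freq = Counter(t for terms in items for t in set(terms))
    weights = []
    for terms in items:
        counts = Counter(terms)
        max_f = max(counts.values())
        w = {}
        for term, f in counts.items():
            tf = f / max_f                           # normalized term frequency
            iif = math.log(n / item_freq[term])      # inverse item frequency
            w[term] = tf * iif
        weights.append(w)
    return weights

def inner_product(w_item, w_query):
    """SM(I_j, Q) = sum over shared terms of W_ij * W_iq."""
    return sum(w * w_query[t] for t, w in w_item.items() if t in w_query)

# Hypothetical example
items = [["cse", "faculty", "experience"], ["faculty", "teaching", "teaching"]]
w_items = tf_iif_weights(items)
w_query = {"cse": 1.0, "faculty": 1.0}
print([inner_product(w, w_query) for w in w_items])
```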
2.2 Cosine Similarity Measure

The inner dot product similarity measure is normalized by the cosine of the angle between the two vectors. The Cosine Similarity Measure (CSM) is defined as

CSM(I_j, Q) = \frac{I_j \cdot Q}{|I_j| \, |Q|} = \frac{\sum_{i=1}^{t} W_{ij} W_{iq}}{\sqrt{\sum_{i=1}^{t} W_{ij}^2} \, \sqrt{\sum_{i=1}^{t} W_{iq}^2}}


Fig. 1. Cosine Similarity Measure

Fig. 1 describes the similarity between the query Q and the items I1 and I2 in terms of the angles θ1 and θ2, respectively. If the item term vector and the query term vector coincide, i.e. the angular distance is zero, the two vectors are similar [12]. Like the measures above, many similarity measures are used to match the terms of a user search to the repository item set. We use the same similarity measures for comparison, but the comparison information is taken not only from the base search pattern: we extend the initial search pattern with the user's personalized information and other sources of information to match items, which improves the search results.
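The cosine normalization can be sketched as follows (an illustrative fragment, not the authors' code); it operates on sparse term-weight dictionaries of the kind produced by the TF-IIF sketch above.

```python
import math

def cosine_similarity(w_item, w_query):
    """CSM(I_j, Q) = (I_j . Q) / (|I_j| |Q|) for sparse weight dictionaries."""
    dot = sum(w * w_query[t] for t, w in w_item.items() if t in w_query)
    norm_i = math.sqrt(sum(w * w for w in w_item.values()))
    norm_q = math.sqrt(sum(w * w for w in w_query.values()))
    if norm_i == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_i * norm_q)

# Hypothetical example: vectors pointing in the same direction give similarity 1.0
print(cosine_similarity({"cse": 2.0, "faculty": 2.0}, {"cse": 1.0, "faculty": 1.0}))
```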
2.3 User Personalization

To improve the relevance of user queries, user query logs and profiles are maintained. User personalization can be achieved by adapting the user interface or by adapting the content delivered to a specific user. Users have no common mechanism for judging the relevance of search results. Ordering the ranking by user interest gives a better understanding of query results for future analysis. In domain-specific search tools the relevance is closer to the ranking order and easier to judge. Ranking-quality measures have been used to capture user behaviour for future prediction [13]. Using implicit feedback, whether a user was satisfied or not is predicted through learning, by finding indicative features such as the way a search session terminates and the time spent on result pages [14]. The behaviour of the engine is observed by measuring the quality of the ranking functions and by observing natural user interactions with the search engine [15].

3 Click-through Data

The similarity of search queries can be measured by mining the increasing amount of click-through data recorded by Web search engines, which maintain logs of


the interactions between users and the search engines [16]. The quality of training data judged by humans has a major impact on the performance of learning-to-rank algorithms [17]. Employing human experts to judge the relevance of documents is the traditional way of generating training examples, but in real settings it is difficult, time-consuming and costly. From several observations [6] [7] [8] [11] [12] [14] [15], simple relevance judgment and plain personalization of user queries do not have much effect in improving search results. In this paper we propose a novel approach that selects an alternative source of user behavioural information, namely click-through data. Click-through data helps to capture similar features from past user navigations and to search for alternative items to retrieve. This approach carries significant information for deciding whether the user's relevance-feedback option improves search results or not. We use different similarity measures for matching the click-through data added to the personalized query logs or simple query logs.
3.1 Click-through Data Structure

We used a manually collected dataset for the implementation setup. Our document collection consists of 200 faculty profiles with standardized attributes given as good meta-data. We begin by ranking the document set using a coarse-grain ranking algorithm. Coarse-grain ranking works well when the items contain the required query terms. This algorithm scores each document by computing a sum of the matches between the query and the following document attributes: name of faculty, department or branch, qualification summary, experience track, subjects handled, publication details, references and other details. When we gave a query to the user interface it returned the following results:

Query: CSE Faculty with minimum of 5 years experience
Table 1. Ranking order of retrieved results for the above query

1. Dr. B. Padmaja Rani, 16 years of teaching experience. http://www.jntuh.ac.in
2. Satya K., CSE Faculty. http://www.cmrcet.ac.in
3. Ramakrishna Kolikipogu, CSE Faculty, 5 years experience in teaching. http://www.jntuh.ac.in
4. Indiravathi, having 20 years experience, not working in CSE Department
5. Prof. K. Vijayalaxmi, Faculty of CSE. http://www.jntu.ac.in
6. Megana Deepthi Sharma, studying CSE, having 5 years experience in computer operation

From the profile document set taken to experiment with the model, we got the above result on the first attempt of the query CSE Faculty with minimum of 5 years experience. We found that results 1, 3 and 5 are relevant to the query and 2, 4 and 6 are not. Due to the blind similarity measure the results are not fruitful, so we need the user's judgment to decide the relevance of the search results.


The user clicks are preserved for the future search process. If the user clicks the 3rd result first, it should receive the first rank in the relevance list. To capture such click-through data, we built a click-through data structure as a triplet <a, b, c>: the query a, the ranking b presented to the user, and the set c of links that the user clicks in every navigation.
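A minimal sketch of such a triplet record is shown below; it is illustrative only, and the field names are assumptions rather than the authors' schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ClickThroughRecord:
    """Click-through triplet <a, b, c>: query, presented ranking, clicked links."""
    query: str                                            # a: the submitted query
    ranking: List[str] = field(default_factory=list)      # b: URLs in presented order
    clicks: List[str] = field(default_factory=list)       # c: URLs the user clicked

# Hypothetical example for the query of Table 1
rec = ClickThroughRecord(
    query="CSE Faculty with minimum of 5 years experience",
    ranking=["http://www.jntuh.ac.in/rani", "http://www.cmrcet.ac.in/satya"],
)
rec.clicks.append("http://www.jntuh.ac.in/rani")
```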
3.2 Capturing and Storing Click-through Data

Click-through data can be captured with little overhead and without compromising the functionality and usefulness of the search engine. In particular, it adds no burden for the user compared with explicit feedback. The query q and the returned ranking r are recorded easily when the ranking (result list) is displayed to the user, and a simple system can be used to keep a log of clicks. The following system was used for the experiments in this paper. We recorded the queries submitted as well as the clicks on search results. Each record included the experimental condition, the time, IP address, browser, a session identifier and a query identifier. We define a session as a sequence of navigations (clicks or queries) between a user and the search engine in which less than 10 minutes passes between subsequent interactions. When an attribute in the query results is clicked, we record only clicks occurring within the same session as the query; this is important to eliminate clicks that appear to come from stored or re-retrieved search results. If the user continues searching for more than 10 minutes, the system is built so that it continues the recording process. To capture the click-through data we used a middle (proxy) server, which records the user's click information and adds no overhead for the user during search. To give faster results we need to reduce the processing time (overhead); in general, recording increases the overhead, but in our approach recording the click-through data and ranking information has no effect on the operational cost. The click-through data is stored in a triplet data structure <q, r, s>. The query q and the rank order r are recorded when the search engine returns the initial results to the user. To record clicks, the middle server maintains a data store of the log file. User queries are given unique IDs; during search these IDs are stored in the log file along with the query terms and the ranking information r. The user need not store the links displayed on the results page; the links direct the user to the proxy server and encode the IDs of the queries and the URLs of the items being suggested. The query, ranking order and URL address are recorded automatically through the proxy server whenever a user clicks a link, and the server then redirects the user to the clicked URL over HTTP. All of this is done with little extra operating cost, so the search engine can present the results to the user without much extra time.
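The 10-minute session rule described above can be sketched as follows; this is a simplified illustration, since the real logging runs in the proxy server, and the record format used here is an assumption.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=10)

def split_into_sessions(events):
    """events: list of (timestamp, interaction) tuples sorted by time.
    A new session starts when more than 10 minutes passes between interactions."""
    sessions, current = [], []
    last_time = None
    for ts, interaction in events:
        if last_time is not None and ts - last_time > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append((ts, interaction))
        last_time = ts
    if current:
        sessions.append(current)
    return sessions

# Hypothetical example: the third event starts a new session
events = [
    (datetime(2011, 7, 22, 10, 0), "query: CSE Faculty"),
    (datetime(2011, 7, 22, 10, 4), "click: result 3"),
    (datetime(2011, 7, 22, 10, 20), "query: Teaching Faculty"),
]
print(len(split_into_sessions(events)))  # 2
```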

4 Ranking and Re-ranking

The ranking rule sets the rank score of an item equal to the number of times the same item has been selected in the past. Starting from the initial ranking, we propose a new ranking algorithm to redefine the user's choice list. We use a probabilistic similarity measure and the cosine similarity measure for item selection and ranking in the base search.


1. Algorithm: Ranking (Relevant Item Set RIS)

   Input: Relevant Item Set RIS.
   Output: Ordered item list with ranking r.
   Repeat
       if (Rel_i > Rel_j) then
           Swap(I_i, I_j)
       else
           Return item set I with ranking order
   Until (no more items in RIS)

2. Algorithm: Re-ranking (Ranked Item Set S)

   Input: Ranked Item Set S.
   Output: Ordered item list with re-ranking r.
   CTD <- GetClick_ThroughData(q, r, S)
   Repeat
       if (CTD = True && Rel_i > Rel_j) then
           Swap(I_i, I_j)
       else
           Return item set I with re-ranking order
   Until (no more items in S)
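The two algorithms above amount to sorting the item set by a relevance score, with the re-ranking pass promoting items that the click-through data table marks as previously clicked. The following Python sketch is one possible reading of that pseudocode; the item fields and the click-through lookup are assumptions made for illustration.

```python
def rank(items):
    """Algorithm 1: order the relevant item set by descending relevance score."""
    return sorted(items, key=lambda item: item["relevance"], reverse=True)

def re_rank(ranked_items, clickthrough):
    """Algorithm 2: promote items recorded in the click-through data table.
    clickthrough maps item id -> number of past clicks for this query."""
    return sorted(
        ranked_items,
        key=lambda item: (clickthrough.get(item["id"], 0), item["relevance"]),
        reverse=True,
    )

# Hypothetical example
items = [
    {"id": "rani", "relevance": 0.9},
    {"id": "satya", "relevance": 0.8},
    {"id": "kolikipogu", "relevance": 0.7},
]
clicks = {"kolikipogu": 5}          # users clicked this result first in the past
print([i["id"] for i in re_rank(rank(items), clicks)])
```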

5 Experimental Setup

We implemented the above concept in Java. We took a 200 faculty-profile item set S and created a click-through data set in a table. Whenever the user clicks a choice among the retrieved items in the visual space, we record the click-through data in the click-through data table. Using base Algorithm 1 we rank the items in the initial search process. We ran the search tool more than 100 times and built a click-through data table. To experiment with Algorithm 2, we ran the search process again a number of times and observed that the results are more accurate than the initial search. This process has a number of advantages: it is effortless to execute while covering a large collection of items, and the underlying search engines provide a foundation for comparison. The Striver meta-search engine works in the following way. The user types a query into the Striver interface. The query is then forwarded to MSN Search, Google, Excite, AltaVista and Hotbot. The results returned by these search engines are analyzed and the top 50 suggested links are extracted. For every link, the system displays the name of the page along with its uniform resource locator (URL). The results of our experiment are shown in Table 2.


Table 2. Experimental Results

| Q. No | Query | Average Relevance | Average Improvement | Recommended Query from Click-through Data Table (Personalized Query) |
|---|---|---|---|---|
| 1 | CSE Faculty | 50.00% | 82.00% | CSE Faculty + 5 years experience |
| 2 | Faculty with 5 years experience | 25.00% | 98.00% | CSE Faculty with min. of 5 years experience |
| 3 | Experience Faculty in Computers | 60.00% | 79.00% | Experienced CSE Faculty |
| 4 | Experience | 10.00% | 18.00% | Minimum Experience |
| 5 | Computer Science Engineering | 15.00% | 50.00% | Computer Science Engineering Faculty |
| 6 | Teaching Faculty | 40.00% | 66.00% | Teaching Faculty for CSE |
| 7 | CSE | 20.00% | 50.00% | CSE Faculty |
| 8 | Faculty | 12.00% | 50.00% | CSE Faculty |
| 9 | CSE Faculty with good experience | 80.00% | 50.00% | CSE Faculty |

6 Conclusion and Future Work

With our proposed model we measure the similarity of query terms against the click-through data log table instead of directly comparing the whole data set. This new similarity gave positive results and improved recall along with precision. Query suggestion based on user click-through logs required less computational cost to implement. Re-ranking and suggesting items for the user to judge the results are enhanced through this work. Moreover, the algorithm does not rely on the particular terms appearing in the query and the item set. Our experiments show that click-through data gives more closely related suggested queries, as seen in Table 2; query numbers 2 > 8 > 5 > 1 > 7 > 6 > 3 > 4 > 9 is the order of improved relevance. We also observe that if the feedback is not judged correctly it can even give negative results, as we experienced with Q. No. 9 in Table 2. In order to overcome such negative impact from the click-through history, we plan to enhance this base model more carefully by appending a semantic network and ontology as our future research direction.

References
1. Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Amsterdam (1999)
2. Beitzel, S.M., Jensen, E.C., Chowdhury, A., Grossman, D., Frieder, O.: Hourly analysis of a very large topically categorized Web query log. In: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 321–328 (2004)
3. Shen, X., Dumais, S., Horvitz, E.: Analysis of topic dynamics in Web search. In: Proceedings of the International Conference on World Wide Web, pp. 1102–1103 (2005)
4. Kumar, P., Bapi, R., Krishna, P.: SeqPAM: A Sequence Clustering Algorithm for Web Personalization. Institute for Development and Research in Banking Technology, India
5. Cohen, W., Schapire, R., Singer, Y.: Learning to order things. Journal of Artificial Intelligence Research
6. Shen, H.-z., Zhao, J.-d., Yang, Z.-z.: A Web Mining Model for Real-time Webpage Personalization. ACM, New York (2006)
7. Kolikipogu, R., Padmaja Rani, B., Kakulapati, V.: Information Retrieval in Indian Languages: Query Expansion Model for Telugu Language as a Case Study. In: IITA-IEEE, China, vol. 4(1) (November 2010)
8. Kolikipogu, R.: WordNet Based Term Selection for PRF Query Expansion Model. In: ICCMS 2011, vol. 1 (January 2011)
9. Vojnović, M., Cruise, J., Gunawardena, D., Marbach, P.: Ranking and Suggesting Popular Items. IEEE Journal 21 (2009)
10. Eirinaki, M., Vazirgiannis, M.: Web Mining for Web Personalization. ACM Transactions on Internet Technology 3(1), 1–27 (2003)
11. Robertson, S.E., Sparck Jones, K.: Relevance Weighting of Search Terms. J. American Society for Information Science 27(3) (1976)
12. Salton, G., Fox, E.A., Wu, H.: Extended Boolean Information Retrieval. Communications of the ACM 26(12), 1022–1036 (1983)
13. Kelly, D., Teevan, J.: Implicit feedback for inferring user preference: A bibliography. ACM SIGIR Forum 37(2), 18–28 (2003)
14. Fox, S., Karnawat, K., Mydland, M., Dumais, S., White, T.: Evaluating implicit measures to improve web search. ACM Transactions on Information Systems (TOIS) 23(2), 147–168 (2005)
15. Radlinski, F., Kurup, M.: How Does Clickthrough Data Reflect Retrieval Quality? In: CIKM 2008, Napa Valley, California, USA, October 26-30 (2008)
16. Zhao, Q., Hoi, S.C.H., Liu, T.-Y.: Time-dependent semantic similarity measure of queries using historical click-through data. In: International Conference on WWW. ACM, New York (2006)
17. Xu, X.F.: Improving quality of training data for learning to rank using click-through data. In: ACM Proceedings of WSDM 2010 (2010)

Recognition of Subsampled Speech Using a Modified Mel Filter Bank

Kiran Kumar Bhuvanagiri and Sunil Kumar Kopparapu

TCS Innovation Labs - Mumbai,
Tata Consultancy Services, Pokhran Road 2, Thane (West),
Maharashtra 400 601, India
{kirankumar.bhuvanagiri,sunilkumar.kopparapu}@tcs.com

Abstract. Several speech recognition applications use Mel Frequency Cepstral Coefficients (MFCCs). In general, these features are used to model speech in the form of HMMs. However, the features depend on the sampling frequency of the speech, and consequently features extracted at a certain rate cannot be used to recognize speech sampled at a different sampling frequency [5]. In this paper, we first propose a modified Mel filter bank so that the features extracted at different sampling frequencies are correlated. We show experimentally that models built with speech sampled at one frequency can be used to recognize subsampled speech with high accuracy.

Keywords: MFCC, speech recognition, subsampled speech recognition.

1 Introduction

Mel Frequency Cepstral Coefficients (MFCC) are commonly used features in speech signal processing. They have been in use for a long time [3] and have proved to be one of the most successful features in speech recognition tasks [8]. For a typical speech recognition process (see Fig. 1), acoustic models are built using speech recorded at some sampling frequency during the training phase (boxed blue -.-. in Fig. 1). In the testing (boxed red - - - in Fig. 1) or recognition phase, these acoustic models are used along with a pronunciation lexicon and a language model to recognize speech at the same sampling frequency. If the speech to be recognized is at a sampling frequency other than that of the speech used during training, then one of two things needs to be done: (a) retrain the acoustic models with speech samples of the desired sampling frequency, or (b) change the sampling rate of the speech to be recognized (test speech) to match the sampling frequency of the speech used for training. In this paper, we address the problem of using models built for a certain sampling frequency to enable recognition of speech at a different sampling frequency. We particularly concentrate on Mel-frequency cepstral coefficients (MFCC) as features [9], [4] because of their frequent use in speech signal processing. Kopparapu et al. [5] proposed six filter bank constructs to enable calculation of MFCCs of subsampled speech; the Pearson correlation coefficient was used to compare the MFCC of the subsampled speech with the MFCC of the original speech.



Fig. 1. Speech recognition, showing the training and test stages

In this paper, we construct a Mel filter bank that is able to extract MFCCs of the subsampled speech which are significantly better correlated with the MFCCs of the original speech than the Mel filter banks discussed in [5]. This is experimentally verified in two ways: (a) through the Pearson correlation coefficient and (b) through speech recognition experiments on the AN4 speech database [1] using an open source ASR engine [2]. Experimental results show that the recognition accuracy on subsampled speech, using models developed on the original speech, is as good as the recognition accuracy on the original speech and, as expected, degrades with excessive subsampling.

One of the prime applications of this work is to enable the use of acoustic models created for desktop speech (usually 16 kHz) with telephone speech (usually 8 kHz), especially when there is access only to the acoustic models and not to the speech corpus, as in Sphinx. The rest of the paper is organized as follows. In Section 2, largely based on our previous work [5], the procedure to compute MFCC features and the relationship between the MFCC parameters of the original and the subsampled speech are discussed. In Section 2.1 the new filter bank is proposed. Section 3 gives the details of the experiments conducted to substantiate the advantage of the proposed modified filter bank, and we conclude in Section 4.

2 Computing MFCC of Subsampled Speech

As shown in [5], let x[n] be a speech signal with sampling frequency f_s, divided into P frames each of length N samples with an overlap of N/2 samples, say {x_1, x_2, ..., x_p, ..., x_P}, where x_p denotes the p-th frame of the speech signal x[n], i.e. x_p = \{ x[p(N/2) - 1 + i] \}_{i=0}^{N-1}. Computing the MFCC of the p-th frame involves the following steps.


 
1. Multiply x_p with a Hamming window w[n] = 0.54 - 0.46 \cos(2\pi n / N).
2. Compute the discrete Fourier transform (DFT) [7]. Note that k corresponds to the frequency lf(k) = k f_s / N:

   X_p(k) = \sum_{n=0}^{N-1} x_p[n] w[n] e^{-j 2\pi k n / N},  for k = 0, 1, ..., N-1.

3. Extract the magnitude spectrum |X_p(k)|.
4. Construct a Mel filter bank M(m, k), typically a series of overlapping triangular filters defined by their center frequencies lf_c(m). The parameters that define a Mel filter bank are (a) the number of Mel filters F, (b) the minimum frequency lf_min and (c) the maximum frequency lf_max. So m = 1, 2, ..., F in M(m, k).
5. Segment the magnitude spectrum |X_p(k)| into F critical bands by means of the Mel filter bank.
6. The logarithm of the filter bank outputs is the Mel filter bank output

   L_p(m) = \ln \left( \sum_{k=0}^{N-1} M(m, k) |X_p(k)| \right)   (1)

   where m = 1, 2, ..., F and p = 1, 2, ..., P.
7. Compute the DCT of L_p(m) to get the MFCC parameters:

   \Phi_p^r\{x[n]\} = \sum_{m=1}^{F} L_p(m) \cos \left( \frac{r(2m-1)\pi}{2F} \right)   (2)

   where r = 1, 2, ..., F and \Phi_p^r\{x[n]\} represents the r-th MFCC of the p-th frame of the speech signal x[n].
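A compact NumPy sketch of steps 1-7 for a single frame is given below; it is an illustration of the procedure rather than the exact implementation used in the paper, and the triangular Mel filter bank construction follows the usual textbook recipe with the parameters F, lf_min and lf_max.

```python
import numpy as np

def mel(f):      # Hz -> Mel
    return 2595.0 * np.log10(1.0 + f / 700.0)

def inv_mel(m):  # Mel -> Hz
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(F, N, fs, lf_min, lf_max):
    """M(m, k): F overlapping triangular filters over N DFT bins."""
    centers = inv_mel(np.linspace(mel(lf_min), mel(lf_max), F + 2))  # edge/center freqs
    bins = np.floor(centers * N / fs).astype(int)
    M = np.zeros((F, N))
    for m in range(1, F + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):
            M[m - 1, k] = (k - lo) / max(c - lo, 1)   # rising edge
        for k in range(c, hi):
            M[m - 1, k] = (hi - k) / max(hi - c, 1)   # falling edge
    return M

def mfcc_frame(x_p, M):
    """Steps 1-7: window, DFT magnitude, Mel filtering, log, DCT."""
    N = len(x_p)
    F = M.shape[0]
    w = 0.54 - 0.46 * np.cos(2 * np.pi * np.arange(N) / N)   # Hamming window
    X = np.abs(np.fft.fft(x_p * w))                           # |X_p(k)|
    L = np.log(M @ X + 1e-12)                                  # L_p(m), Eq. (1)
    r = np.arange(1, F + 1)[:, None]
    m = np.arange(1, F + 1)[None, :]
    D = np.cos(r * (2 * m - 1) * np.pi / (2 * F))              # DCT basis, Eq. (2)
    return D @ L

# Hypothetical usage: one 32 ms frame of 16 kHz speech
fs, N = 16000, 512
M = mel_filter_bank(F=30, N=N, fs=fs, lf_min=130, lf_max=7300)
frame = np.random.randn(N)
print(mfcc_frame(frame, M).shape)  # (30,)
```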
The sampling of the speech signal in time affects the computation of the MFCC parameters. Let y[s] denote the resampled speech signal such that y[s] = x[αn], where α = u/v and u and v are integers. Note that α > 1 denotes downsampling while α < 1 denotes upsampling; for the purposes of analysis we will assume that α is an integer. Let y_p[s] = x_p[αn] denote the p-th frame of the time-scaled speech, where s = 0, 1, ..., S-1, S being the number of samples in the time-scaled speech frame, given by S = N/α. The DFT of the windowed y_p[s] is calculated from the DFT of x_p[n]. Using the scaling property of the DFT we have

   Y_p(k') = \frac{1}{α} \sum_{l=0}^{α-1} X_p(k' + lS),  where k' = 1, 2, ..., S.

The MFCC of the subsampled speech is given by

   \Phi_p^r\{y[s]\} = \sum_{m=1}^{F} L'_p(m) \cos \left( \frac{r(2m-1)\pi}{2F} \right)   (3)

where r = 1, 2, ..., F and

   L'_p(m) = \ln \left( \sum_{k'=0}^{S-1} M'(m, k') \left| \frac{1}{α} \sum_{l=0}^{α-1} X_p(k' + lS) \right| \right)   (4)


Note that L'_p and M' are the log Mel spectrum and the Mel filter bank of the subsampled speech. A good choice of M'(m, k') is one which gives (a) the best Pearson correlation with the MFCC of the original speech and (b) the best speech recognition accuracies when the models are trained on the original speech and decoded on the subsampled speech. Kopparapu et al. [5] chose different constructs of M'(m, k').
2.1 Proposed Filter Bank

We propose a Mel filter bank M_new(m, k') for subsampled speech as

   M_new(m, k') = M(m, k')   for lf(k') \le (1/α)(f_s/2)
   M_new(m, k') = 0          for lf(k') > (1/α)(f_s/2)

where k' ranges from 1 to N/α. Notice that the modified filter bank is the subsampled version of the original filter bank, with the bands above (1/α)(f_s/2) set to 0. Clearly, the number of Mel filter bands is less than the original number of Mel filter bands. Let β be the number of filter bands whose center frequency lf_c is below f_s/(2α). Consequently, L'_p(m), the Mel filter bank output, is 0 for m > β. In order to retain the same total number of filter bank outputs as for the original speech we construct

   L'_p(m) = (0.9)^{m-β} L'_p(β)   for β < m \le F   (5)

Equation (5) is based on the observation that the Mel filter outputs for m > β seem to decay exponentially.
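The construction can be sketched as follows, under the reconstruction of (5) given above: the original filter bank is truncated at f_s/(2α) and the missing log filter bank outputs are extrapolated with the 0.9 decay. The function and variable names, and the example values of α and β, are illustrative assumptions.

```python
import numpy as np

def modified_filter_bank(M, alpha, fs):
    """M_new(m, k'): keep the original Mel filters only up to fs/(2*alpha)."""
    F, N = M.shape
    S = N // alpha                       # bins available for the subsampled frame
    lf = np.arange(S) * fs / N           # lf(k') = k' * fs / N
    M_new = M[:, :S].copy()
    M_new[:, lf > fs / (2 * alpha)] = 0.0
    return M_new

def extend_log_outputs(L_sub, beta, F):
    """Eq. (5): L'_p(m) = 0.9**(m - beta) * L'_p(beta) for beta < m <= F."""
    L_full = np.zeros(F)
    L_full[:beta] = L_sub[:beta]
    for m in range(beta + 1, F + 1):                # 1-based filter index
        L_full[m - 1] = (0.9 ** (m - beta)) * L_sub[beta - 1]
    return L_full

# Hypothetical usage with alpha = 2 (16 kHz models, 8 kHz test speech)
F, N, fs, alpha = 30, 512, 16000, 2
M = np.random.rand(F, N)                 # stands in for the original Mel filter bank
M_new = modified_filter_bank(M, alpha, fs)
beta = 24                                 # assumed number of filters with lf_c below fs/(2*alpha)
L_sub = np.random.rand(F)                 # stands in for L'_p computed with M_new
print(extend_log_outputs(L_sub, beta, F).shape)
```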

3 Experimental Results

We conducted experiments on the AN4 [1] audio database. It consists of 948 training and 130 test audio files containing a total of 773 spoken words or phrases; the recognition results are based on these 773 words and phrases. All the speech files in the AN4 database are sampled at 16 kHz. The Mel filter bank has F = 30 bands with lf_min = 130 Hz and lf_max = 7300 Hz, and the frame size is set to 32 ms. The MFCC parameters are computed for the 16 kHz speech signal x[n] and also for the subsampled speech signal y[s]. The MFCC parameters of y[s] are calculated using the proposed Mel filter bank (5), while the MFCC of x[n] are calculated using (2).

We conducted two types of experiments to evaluate the performance of the proposed Mel filter bank construction on subsampled speech. In the first set of experiments, we used the Pearson correlation coefficient (r) to compare the MFCC of the subsampled speech with the MFCC of the original speech, along the lines of [5]. In the second set of experiments we used speech recognition accuracies to evaluate the appropriateness of the Mel filter bank for subsampled speech. We compared our Mel filter bank with the best Mel filter bank (Type C) proposed in [5].

3.1 Comparison Using Pearson Correlation Coefficient

We computed the framewise correlation r between the MFCCs of the subsampled speech and the original speech. The means and variances of r over all frames are shown in Table 1. Clearly, the Mel filter bank construction proposed in this paper performs better than the best method suggested in [5] for all values of α. For α = 16/4 = 4 the mean-variance pair for the proposed Mel filter bank is (0.85609, 0.04176), compared with (0.67837, 0.14535) for the best construction in [5].
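The framewise comparison of Table 1 can be reproduced in outline as follows; this is an illustrative fragment, with the MFCC matrices assumed to be frames x coefficients arrays.

```python
import numpy as np

def framewise_pearson(mfcc_orig, mfcc_sub):
    """Mean and variance of the per-frame Pearson correlation coefficient r."""
    r = np.array([np.corrcoef(a, b)[0, 1] for a, b in zip(mfcc_orig, mfcc_sub)])
    return r.mean(), r.var()

# Hypothetical usage
orig = np.random.randn(100, 30)               # 100 frames, 30 MFCCs (original speech)
sub = orig + 0.1 * np.random.randn(100, 30)   # stands in for subsampled-speech MFCCs
print(framewise_pearson(orig, sub))
```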
Table 1. Pearson correlation coefficient (r) between the MFCCs of the original and the subsampled speech

| Sampling factor | Proposed mean | Proposed variance | Best in [5] mean | Best in [5] variance |
|---|---|---|---|---|
| α = 16/4  | 0.85609 | 0.04176 | 0.67837 | 0.14535 |
| α = 16/5  | 0.90588 | 0.02338 | 0.70064 | 0.1280 |
| α = 16/6  | 0.9284  | 0.01198 | 0.7201  | 0.1182 |
| α = 16/7  | 0.94368 | 0.00633 | 0.7321  | 0.1010 |
| α = 16/8  | 0.96188 | 0.00005 | 0.7465  | 0.0846 |
| α = 16/10 | 0.98591 | 0.00037 | 0.8030  | 0.0448 |
| α = 16/12 | 0.989   | 0.00025 | 0.8731  | 0.0188 |
| α = 16/14 | 0.99451 | 0.00006 | 0.9503  | 0.0029 |
| α = 16/16 | 1       | 0       | 1       | 0      |

Table 2. Recognition accuracies (percentage)

| Sampling factor | Case A (30 MFCCs): proposed | Case A: best in [5] | Case B (39 features): proposed | Case B: best in [5] |
|---|---|---|---|---|
| α = 16/4  | 9.83  | 2.07  | 30.36 | 3.36  |
| α = 16/5  | 20.18 | 1.68  | 58.86 | 2.85  |
| α = 16/6  | 27.30 | 2.07  | 68.95 | 3.62  |
| α = 16/7  | 31.44 | 2.33  | 73.22 | 5.30  |
| α = 16/8  | 37    | 3.88  | 77.23 | 11.77 |
| α = 16/10 | 40.36 | 7.12  | 80.50 | 34.15 |
| α = 16/12 | 41.01 | 16.19 | 81.11 | 65.85 |
| α = 16/14 | 42.56 | 34.80 | 82.54 | 77.10 |
| α = 16/16 | 43.21 | 43.21 | 81.11 | 81.11 |

3.2 Speech Recognition Experiments

We used the 948 training speech samples of the AN4 database to build acoustic models using SphinxTrain. Training is done using MFCCs calculated on the 16 kHz (original) speech files. Recognition results are based on the 130 test speech samples. In Case A we used 30 MFCCs, while in Case B we used 13 MFCCs concatenated with 13 velocity and 13 acceleration coefficients to form a 39-dimensional feature vector.


Fig. 2. Comparing ASR accuracies of both methods for different values of the sampling factor (α)

Fig. 3. Sample log filter bank outputs of the original speech, and of the subsampled speech using the proposed Mel filter bank and the best Mel filter bank in [5]

Recognition accuracies on the 773 words in the 130 test speech files are shown in Table 2. It can be observed that the word recognition accuracies using the proposed Mel filter bank on subsampled speech are better than those obtained with the filter bank proposed in [5], for all values of α and for both Case A and Case B. We also observe from Fig. 2 that the proposed method is more robust: while the accuracies fall rapidly for the best method in [5], they decrease gradually in our case.


The better performance of the proposed Mel filter bank in terms of recognition accuracies can be explained by looking at the sample filter bank outputs shown in Fig. 3. The output of the proposed Mel filter bank construct (red line, +) closely follows the Mel filter bank output of the original speech (blue line, x), while even the best reported filter bank in [5] (black line, o) shows a shift in the filter bank outputs.

4 Conclusion

The importance of this Mel filter bank design for extracting the MFCC of subsampled speech is apparent when trained models are available for speech at one sampling frequency and recognition has to be performed on subsampled speech without explicitly creating acoustic models for the subsampled speech. As a particular example, the work reported here can be used to recognize subsampled speech using acoustic (HMM or GMM) models generated from desktop speech (usually 16 kHz). We proposed a modified Mel filter bank which enables extraction of MFCCs from subsampled speech that correlate very well with the MFCCs of the original speech. We experimentally showed that the use of the modified Mel filter bank construct in MFCC computation of subsampled speech outperforms the Mel filter banks developed in [5]. This was demonstrated at two levels, namely in terms of a correlation measure with the MFCC of the original speech and also through word recognition accuracies. Speech recognition accuracies for larger values of α can be improved by better approximating the missing Mel filter outputs using bandwidth expansion techniques [6], which we will address in future work.

References
1. CMU: AN4 database, http://www.speech.cs.cmu.edu/databases/an4/
2. CMU: Sphinx, http://www.speech.cs.cmu.edu/
3. Davis, S.B., Mermelstein, P.: Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Processing 28(4), 357–366 (1980)
4. Jun, Z., Kwong, S., Gang, W., Hong, Q.: Using Mel-frequency cepstral coefficients in missing data technique. EURASIP Journal on Applied Signal Processing 2004(3), 340–346 (2004)
5. Kopparapu, S., Laxminarayana, M.: Choice of Mel filter bank in computing MFCC of a resampled speech. In: 10th International Conference on Information Sciences, Signal Processing and their Applications (ISSPA), pp. 121–124 (May 2010)
6. Kornagel, U.: Techniques for artificial bandwidth extension of telephone speech. Signal Processing 86(6) (June 2006)
7. Oppenheim, A.V., Schafer, R.W.: Discrete-Time Signal Processing. Prentice-Hall, Englewood Cliffs (1989)
8. Quatieri, T.F.: Discrete-Time Speech Signal Processing: Principles and Practice, vol. II, pp. 686–713. Pearson Education, London (1989)
9. Reynolds, D.A., Rose, R.C.: Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE Transactions on Speech and Audio Processing 3(1) (January 1995)

Tumor Detection in Brain Magnetic Resonance Images Using Modified Thresholding Techniques

C.L. Biji1, D. Selvathi2, and Asha Panicker3

1 ECE Dept, Rajagiri School of Engineering & Technology, Kochi, India
  clbiji@yahoo.com
2 ECE Dept, Mepco Schlenk Engineering College, Sivakasi, India
  selvathi_d@yahoo.com
3 ECE Dept, Rajagiri School of Engineering & Technology, Kochi, India
  pnckr_sh@yahoo.co.in

Abstract. Automated computerized image segmentation is very important for clinical research and diagnosis. This paper deals with two segmentation schemes, namely modified fuzzy thresholding and modified minimum error thresholding. The method includes the extraction of the tumor along with the suspected tumorized region, followed by a morphological operation to remove unwanted tissues. The performance of the segmentation schemes is comparatively analyzed based on segmentation efficiency and correspondence ratio. The automated method for segmentation of brain tumor tissue provides accuracy comparable to that of manual segmentation.

Keywords: Segmentation; Magnetic resonance imaging; Thresholding.

1 Introduction

In medical image analysis, segmentation is an indispensable processing step. Image segmentation is the process of partitioning an image into meaningful sub-regions or objects with the same attributes [6]. Brain tumor segmentation in magnetic resonance images (MRI) is a difficult task that involves image analysis based on intensity and shape [1, 4]. Due to the characteristics of the imaging modalities, segmentation becomes a difficult but important problem in biomedical applications. Manual segmentation is more difficult, time-consuming, and costlier than automated processing by a computer system. Hence, a medical image segmentation scheme should possess some preferred properties such as fast computation and accurate, robust segmentation results [2, 3].

The proposed framework employs two segmentation schemes, viz. (i) modified fuzzy thresholding and (ii) modified minimum error thresholding. The method includes two stages: initially the tumor along with the suspected tumorized region is extracted using the segmentation scheme, and this is followed by a morphological operation to remove unwanted tissues. Moreover, the segmentation schemes are comparatively analyzed based on performance measures.


The proposed automatic segmentation proves to be effective through its high segmentation efficiency and correspondence ratio.

1.1 Materials

In this work, brain MR images acquired on a 0.2 Tesla Siemens Magnetom CONCERTO MR scanner (Siemens AG Medical Solutions, Erlangen, Germany) are used. Axial, 2D, 5 mm thick slice images with a slice gap of 2 mm were acquired with a 246 x 512 acquisition matrix and a field of view ranging from 220 mm to 250 mm. In this work 30 abnormal patients are considered. For each patient a set of slices was selected by experts to provide a representative selection. Each brain slice consists of T1, T2 and T2 FLAIR weighted images. All T1 weighted images (TR/TE of 325-550/8-12 ms) were acquired using a Spin Echo (SE) sequence, while T2 (TR/TE of 3500-10000/97-162 ms) and T2-FLAIR (TR/TE of 3500-10000/89-162 ms) weighted images were collected using Turbo Spin Echo (TSE) sequences.

2 Methodology

Automated computerized image segmentation [20] is very important for clinical research and diagnosis. A wide variety of approaches [5, 7] have been proposed for brain MR image segmentation, relying mainly on a prior definition of the tumor boundary. This paper aims to present new methods of automatically selecting a threshold value for segmentation. The general block diagram describing the proposed work is shown in Fig. 1: the input MR image is passed to the segmentation scheme and then to post processing, giving the output MR image.

Fig. 1. General block diagram for tumor extraction in brain MR images

The approximate tumor tissues in the brain MR images can be extracted by performing segmentation. The segmentation schemes employed in this work are (i) modified fuzzy thresholding and (ii) modified minimum error thresholding. After segmentation, the unwanted tissues are removed by performing a morphological operation [19]. The post-processing step thus merges the connected regions while removing some isolated non-tumor regions.

2.1 Modified Fuzzy Thresholding

Fuzzy c-means thresholding provides an efficient segmentation by giving every pixel a membership value. In the classical fuzzy c-means thresholding [12] the output threshold depends on an input threshold which has been fixed arbitrarily; to overcome this difficulty a proper initial threshold is used.


The parameter selected to carry the fuzziness is the gray-level intensity value. The block diagram of the proposed method is shown in Fig. 2: the input image is passed through initial threshold selection and fuzzy c-means thresholding, followed by post processing, giving the output image.

Fig. 2. Modified fuzzy c-means thresholding

The initial input threshold is calculated using the formula

   T = max gray value / 2   (1)

The extraction of the tumor is carried out by performing the fuzzy c-means thresholding algorithm [14, 16] followed by a hardening scheme. Fuzzy thresholding is an extension of fuzzy clustering to segmentation, considering the gray values alone as the feature. For fuzzy c-means thresholding the objective function to be minimized is

   J = \sum_{i=1}^{2} \sum_{j=0}^{L-1} \mu_i^{\eta}(j) \, h_j \, d^2(j, v_i)   (2)

where h_j is the number of pixels with gray level j, v_i is the prototype (mean) of class i, \mu_i(j) is the membership of gray level j in class i and \eta is the fuzziness index. The objective function (2) is iteratively minimized by computing the class means with equation (3) and updating the memberships with equation (4):

   v_i = \frac{\sum_{j=0}^{L-1} \mu_i^{\eta}(j) \, h_j \, j}{\sum_{j=0}^{L-1} \mu_i^{\eta}(j) \, h_j}   (3)

   \mu_1(j) = \frac{1}{1 + \left( d(j, v_1) / d(j, v_2) \right)^{2/(\eta-1)}}   (4)

In order to remove the unwanted tissues, morphological erosion and dilation are carried out as post processing.
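A histogram-based sketch of the modified fuzzy thresholding described by (1)-(4) is given below; it is an illustration under the reconstruction above, with two classes and an assumed fuzziness index η = 2, not the exact code used in the paper.

```python
import numpy as np

def modified_fuzzy_threshold(image, eta=2.0, n_iter=50):
    """Two-class fuzzy c-means on the gray-level histogram, seeded by Eq. (1)."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    levels = np.arange(256, dtype=float)
    T0 = image.max() / 2.0                        # Eq. (1): initial threshold
    v = np.array([T0 / 2.0, (T0 + 255.0) / 2.0])  # initial class means on either side
    for _ in range(n_iter):
        d = np.abs(levels[None, :] - v[:, None]) + 1e-9              # d(j, v_i)
        mu = 1.0 / (1.0 + (d / d[::-1]) ** (2.0 / (eta - 1.0)))      # Eq. (4)
        num = (mu ** eta) * hist * levels
        den = (mu ** eta) * hist
        v = num.sum(axis=1) / (den.sum(axis=1) + 1e-9)               # Eq. (3)
    # Hardening: take the gray level where the two memberships cross
    threshold = int(levels[np.argmin(np.abs(mu[0] - mu[1]))])
    return threshold, (image > threshold)

# Hypothetical usage on an 8-bit MR slice
img = np.random.randint(0, 256, size=(320, 512)).astype(np.uint8)
t, mask = modified_fuzzy_threshold(img)
print(t)
```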
2.2 Modified Minimum Error Thresholding

The block diagram of the proposed framework is given in Fig. 3: the input MR image is passed through pre-processing, valley point removal and minimum error thresholding, followed by post processing, giving the output image.

Fig. 3. Modified minimum error thresholding


The preprocessing step provides a measure of the local variation of intensity, computed over a square neighbourhood of size W x W (W = 2M + 1):

   \sigma^2(i, j) = \frac{1}{W^2} \sum_{k=i-M}^{i+M} \sum_{l=j-M}^{j+M} \left[ I(k, l) - \frac{1}{W^2} \sum_{k=i-M}^{i+M} \sum_{l=j-M}^{j+M} I(k, l) \right]^2   (5)

As an internal local minimum can adversely affect the threshold selection, the first valley point, corresponding to the background, has to be suppressed [18]. The histogram provides a narrow peak corresponding to the background. Initially, two local maxima (peaks) y_j and y_k are computed. The valley point is then obtained using

   V_t = ( y_j + y_k ) / 2   (6)

After the valley point removal, threshold selection is carried out using the minimum error thresholding algorithm [14, 15]. Using this optimal threshold the abnormal tissues are extracted. In order to remove the unwanted tissues after segmentation, morphological erosion and dilation are performed to improve the efficiency.
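The following sketch illustrates the valley-point suppression of Eq. (6) followed by minimum error thresholding in the sense of Kittler and Illingworth [14, 15]; it is a schematic illustration under assumed histogram handling, not the implementation used in the paper.

```python
import numpy as np

def suppress_first_valley(hist):
    """Eq. (6): zero the histogram below V_t = (y_j + y_k) / 2, where y_j and y_k
    are the two most prominent local maxima (the first peak is the background)."""
    peaks = [i for i in range(1, len(hist) - 1)
             if hist[i] >= hist[i - 1] and hist[i] >= hist[i + 1]]
    y_j, y_k = sorted(sorted(peaks, key=lambda i: hist[i], reverse=True)[:2])
    v_t = (y_j + y_k) // 2
    h = hist.copy()
    h[:v_t] = 0
    return h

def minimum_error_threshold(hist):
    """Kittler-Illingworth criterion J(T), minimized over all candidate thresholds."""
    levels = np.arange(len(hist), dtype=float)
    best_t, best_j = 0, np.inf
    total = hist.sum()
    for t in range(1, len(hist) - 1):
        p1, p2 = hist[:t].sum(), hist[t:].sum()
        if p1 == 0 or p2 == 0:
            continue
        m1 = (hist[:t] * levels[:t]).sum() / p1
        m2 = (hist[t:] * levels[t:]).sum() / p2
        s1 = (hist[:t] * (levels[:t] - m1) ** 2).sum() / p1
        s2 = (hist[t:] * (levels[t:] - m2) ** 2).sum() / p2
        if s1 <= 0 or s2 <= 0:
            continue
        P1, P2 = p1 / total, p2 / total
        j = 1 + 2 * (P1 * np.log(np.sqrt(s1)) + P2 * np.log(np.sqrt(s2))) \
              - 2 * (P1 * np.log(P1) + P2 * np.log(P2))
        if j < best_j:
            best_t, best_j = t, j
    return best_t

# Hypothetical usage
img = np.random.randint(0, 256, size=(320, 512))
hist = np.bincount(img.ravel(), minlength=256).astype(float)
print(minimum_error_threshold(suppress_first_valley(hist)))
```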

3 Results and Discussion

The proposed methodology is applied to real datasets representing tumor images with different intensities, shapes, locations and sizes. The effectiveness of the proposed methodology is analyzed on the average of the T2 weighted and T2 FLAIR MR images, of size 512 x 320. The algorithms are implemented on the Matlab 6.5 platform and run on a 3.0 GHz, 512 MB RAM Pentium IV personal computer under the Microsoft Windows operating system. The analysis is carried out for 30 different datasets. Defining the abnormal region can be useful for surgical planning and even for radiation therapy. The proposed methodology automatically extracts the tumor region and hence largely assists the physicians. In this paper a comparative analysis of the two thresholding schemes is tabulated.

3.1 Modified Fuzzy Thresholding

The tumor extraction results obtained for the four types of data set are shown in Fig. 4. The fuzzy threshold obtained through the algorithm is 122 for the Astrocytomas data set, 77 for the Glioma data set, 103 for the Metastas data set and 95 for the Meningiomas data set.

3.2 Modified Minimum Error Thresholding

For the Astrocytomas data set, in order to get a proper segmentation the first valley point is removed; the first valley point obtained is 19.


The minimum error threshold value obtained through the algorithm is 89, and for further improvement of the result erosion and dilation are performed. For the Gliomas data set the first valley point removed is 28 and the minimum error threshold obtained through the algorithm is 144; erosion and dilation are again performed for further improvement. For the Metastas data set the first valley point removed is 34 and the minimum error threshold obtained is 110. For the Meningiomas data set the first valley point removed is 31 and the minimum error threshold obtained is 162. In each case erosion and dilation are performed for further improvement of the result. The results obtained are shown in Fig. 5.
obtained are shown in fig 5.
Fig. 4. Results obtained with modified fuzzy thresholding (T2 weighted, T2 FLAIR, average image, fuzzy threshold image and post-processed image for the Astrocytomas, Glioma, Metastas and Meningiomas data sets)


3.3 Performance Measure

Quantitative measurement of the segmentation is performed based on the segmentation efficiency (SE) and the correspondence ratio (CR) [17]. Segmentation efficiency and correspondence ratio are calculated in terms of the true positives (TP), false positives (FP), ground truth area (GT) and false negatives (FN) [18]. SE and CR are defined as follows; the quantitative measures of both schemes are given in Tables 1 and 2.

   SE = \frac{TP}{GT} \times 100   (7)

   CR = \frac{TP - 0.5 \cdot FP}{GT} \times 100   (8)

Fig. 5. Resultant images for modified minimum error thresholding (T2 weighted, T2 FLAIR, average image, minimum error threshold image and post-processed image for the Astrocytomas, Glioma, Metastas and Meningiomas data sets)


Table 1. Performance measure of modified fuzzy thresholding

| Tumor Type | Name | Tumor Isolated | TP | FP | GT | FN | SE | CR |
|---|---|---|---|---|---|---|---|---|
| Astrocytomas | Slice 1 | 11779 | 7984 | 3795 | 8048 | 64 | 99.20 | 0.75 |
| | Slice 2 | 14349 | 11385 | 2964 | 11464 | 79 | 99.31 | 0.86 |
| | Slice 3 | 16709 | 13125 | 3584 | 13139 | 14 | 99.89 | 0.86 |
| | Slice 4 | 16295 | 13313 | 2982 | 13660 | 347 | 97.45 | 0.86 |
| | Slice 5 | 15957 | 13291 | 2666 | 13994 | 703 | 94.97 | 0.85 |
| | Avg | | | | | | 98.16 | 0.84 |
| Glioma | Slice 1 | 12053 | 4913 | 7140 | 4937 | 24 | 99.51 | 0.27 |
| | Slice 2 | 11775 | 7201 | 4574 | 7229 | 28 | 99.61 | 0.67 |
| | Slice 3 | 7698 | 6914 | 784 | 6923 | 9 | 99.87 | 0.94 |
| | Slice 4 | 12804 | 5924 | 6880 | 5949 | 25 | 99.57 | 0.41 |
| | Slice 5 | 15069 | 4276 | 10793 | 4319 | 43 | 99.00 | -0.25 |
| | Avg | | | | | | 99.51 | 0.41 |
| Metastas | Slice 1 | 15135 | 13307 | 1828 | 13518 | 211 | 98.43 | 0.91 |
| | Slice 2 | 17470 | 13576 | 3894 | 13751 | 175 | 98.72 | 0.84 |
| | Slice 3 | 20001 | 14777 | 5224 | 14924 | 147 | 99.01 | 0.81 |
| | Slice 4 | 20284 | 17709 | 2575 | 18153 | 444 | 97.55 | 0.90 |
| | Slice 5 | 17119 | 15555 | 1564 | 16466 | 911 | 94.46 | 0.89 |
| | Avg | | | | | | 97.64 | 0.87 |
| Meningiomas | Slice 1 | 43325 | 12017 | 31308 | 12031 | 14 | 99.88 | -0.30 |
| | Slice 2 | 46048 | 9806 | 36242 | 9806 | 0 | 100 | -0.84 |
| | Slice 3 | 43375 | 8780 | 34595 | 8798 | 18 | 99.79 | -0.96 |
| | Slice 4 | 36850 | 13807 | 23043 | 14157 | 350 | 97.52 | 0.16 |
| | Slice 5 | 49307 | 10632 | 38675 | 10697 | 65 | 99.39 | -0.81 |
| | Avg | | | | | | 99.31 | -0.55 |


Table 2. Performance measure of modified minimum error thresholding

| Tumor Type | Name | Tumor Isolated | TP | FP | GT | FN | SE | CR |
|---|---|---|---|---|---|---|---|---|
| Astrocytomas | Slice 1 | 9222 | 7848 | 1374 | 8048 | 200 | 97.51 | 0.88 |
| | Slice 2 | 12726 | 11329 | 1397 | 11464 | 135 | 98.82 | 0.92 |
| | Slice 3 | 14647 | 13099 | 1548 | 13139 | 40 | 99.69 | 0.93 |
| | Slice 4 | 18494 | 13602 | 4892 | 13660 | 58 | 99.57 | 0.81 |
| | Slice 5 | 16118 | 13604 | 2514 | 13994 | 390 | 97.21 | 0.88 |
| | Avg | | | | | | 98.56 | 0.89 |
| Glioma | Slice 1 | 5239 | 4804 | 435 | 4937 | 133 | 97.30 | 0.92 |
| | Slice 2 | 8143 | 7212 | 931 | 7229 | 17 | 99.76 | 0.93 |
| | Slice 3 | 7873 | 6918 | 955 | 6923 | 5 | 99.92 | 0.93 |
| | Slice 4 | 6454 | 5860 | 594 | 5949 | 89 | 98.50 | 0.93 |
| | Slice 5 | 4831 | 4177 | 654 | 4319 | 142 | 96.71 | 0.89 |
| | Avg | | | | | | 98.44 | 0.92 |
| Metastas | Slice 1 | 5485 | 3712 | 1773 | 3988 | 276 | 93.07 | 0.70 |
| | Slice 2 | 10353 | 8907 | 1446 | 9616 | 709 | 92.62 | 0.85 |
| | Slice 3 | 14685 | 13150 | 1535 | 13518 | 368 | 97.27 | 0.91 |
| | Slice 4 | 15044 | 12968 | 2076 | 13751 | 783 | 94.30 | 0.86 |
| | Slice 5 | 19633 | 14726 | 4907 | 14924 | 198 | 98.67 | 0.82 |
| | Avg | | | | | | 95.19 | 0.83 |
| Meningiomas | Slice 1 | 3536 | 3016 | 520 | 12031 | 9015 | 25.068 | 0.22 |
| | Slice 2 | 2632 | 2383 | 249 | 9806 | 7423 | 24.30 | 0.23 |
| | Slice 3 | 1673 | 1498 | 175 | 8798 | 7300 | 17.02 | 0.160 |
| | Slice 4 | 2997 | 2746 | 251 | 14157 | 11411 | 19.39 | 0.18 |
| | Slice 5 | 3279 | 3129 | 150 | 10697 | 7568 | 29.25 | 0.28 |
| | Avg | | | | | | 23.01 | 0.218 |
From the quantitative analysis, it is observed that both methods give comparable results.

4 Conclusion

This paper presents two new approaches for automatic segmentation of tumors from MR images. The approaches show promise in effectively segmenting different tumors with high segmentation efficiency and correspondence ratio. A potential issue that is not handled by the proposed method is the extraction of the tumor when its intensity level is low. The method can be further extended through clustering methodologies, which should be suitable even for Meningiomas-type tumors.

Acknowledgement
The authors would like to thank S. Alagappan, Chief Consultant Radiologist, Devaki MRI & CT Scans, Madurai, India, for supplying all MR images.


References
1. Macovski, A., Meyer, C.H., Noll, D.C., Nishimura, D.G., Pauly, J.M.: A homogeneity correction method for magnetic resonance imaging with time-varying gradients. IEEE Trans. Med. Imaging 10(4), 629–637 (1991)
2. Clark, M.C., Goldgof, D.B., Hall, L.O., Murtagh, F.R., Sibiger, M.S., Velthuizen, R.: Automated tumor segmentation using knowledge based techniques. IEEE Trans. on Medical Imaging 17(2), 238–251 (1998)
3. Levine, M., Shaheen, S.: A modular computer vision system for image segmentation. IEEE Trans. on Pattern Analysis and Machine Intelligence 3(5), 540–557 (1981)
4. Kichenassamy, S., Kumar, A., Oliver, P.J., Tannenbaum, A., Yezzi, A.: A geometric snake model for segmentation of medical imagery. IEEE Trans. on Medical Image Analysis 1(2), 91–108 (1996)
5. Sahoo, P.K., Soltani, S., Wong, A.K.C.: A survey of thresholding techniques. Computer Vision, Graphics, and Image Processing 41, 233–260 (1988)
6. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 2nd edn. Pearson Education, London (2002)
7. Illingworth, J., Kittler, J.: Threshold selection based on a simple image statistic. Computer Vision, Graphics, and Image Processing 30, 125–147 (1985)
8. Yan, H., Zhu, Y.: Computerized tumor boundary detection using a Hopfield neural network. IEEE Transactions on Medical Imaging 16(1) (1997)
9. Gauthier, D., Wu, K., Levine, M.D.: Live cell image segmentation. IEEE Transactions on Biomedical Engineering 42(1) (January 1995)
10. Calvard, S., Ridler, T.: Picture thresholding using an iterative selection method. IEEE Trans. Systems Man Cybernet. SMC-8, 630–632 (November 1978)
11. Biswas, P.K., Jawahar, C.V., Ray, A.K.: Investigations on fuzzy thresholding based on fuzzy clustering. Pattern Recognition 30(10), 1605–1613 (1997)
12. Otsu, N.: A threshold selection method from gray level histograms. IEEE Trans. Systems Man Cybernet. SMC-8, 62–66 (1978)
13. Kapur, J.N., Sahoo, P.K., Wong, A.K.C.: A new method for gray-level picture thresholding using the entropy of the histogram. Computer Vision Graphics Image Process. 29, 273–285 (1985)
14. Kittler, J., Illingworth, J.: Minimum error thresholding. Pattern Recognition 19(1), 41–47 (1986)
15. Danielsson, P.E., Ye, Q.Z.: On minimum error thresholding and its implementations. Pattern Recognition Letters 7, 201–206 (1988)
16. Cheng, H.-D., Freimanis, R.I., Lui, Y.M.: A novel approach to microcalcification detection using fuzzy logic technique. IEEE Transactions on Medical Imaging 17(3), 442–450 (1998)
17. Mendelsohn, M.L., Prewitt, J.M.S.: The analysis of cell images. Ann. N. Y. Acad. Sci. 128, 1035–1053 (1966)
18. Goldgof, D.B., Hall, L.O., Fletcher-Heath, L.M., Murtagh, F.R.: Automatic segmentation of non-enhancing brain tumors in magnetic resonance images. Artificial Intelligence in Medicine 21, 43–63 (2001)
19. Middleton, I., Damper, R.I.: Segmentation of MR images using a combination of neural networks and active contour models. Medical Engineering & Physics 26, 71–76 (2004)
20. Pradhan, N., Sinha, A.K.: Development of a composite feature vector for the detection of pathological and healthy tissues in FLAIR MR images of brain. ICGST-BIME Journal 10(1) (December 2010)

Generate Vision in Blind People Using Suitable Neuroprosthesis Implant of BIOMEMS in Brain

B. Vivekavardhana Reddy1, Y.S. Kumara Swamy2, and N. Usha3

1 City Engineering College, Bangalore
2 Department of MCA, DYSCE, Bangalore
3 Department of CSE, BGSIT, Bangalore

Abstract. In human beings, image processing occurs in the occipital lobe of the brain. The brain signals generated for image processing are universal across humans. Generally, visually impaired people lose sight because of severe damage to the eyes (the natural photoreceptors) only, while the occipital lobe is still working. In this paper, we discuss a technique for generating partial vision for the blind by utilizing electrical photoreceptors to capture the image, processing the image using an edge and motion detection adaptive VLSI network that works on the principle of the bug fly's visual system, converting it into digital data and wirelessly transmitting it to a BioMEMS device implanted in the occipital lobe of the brain.

1 Introduction

Since visually impaired people only have damaged eyes, their loss of sight is mainly because their natural photoreceptors (eyes) are unable to generate the signals that excite the neurons in the occipital lobe of the brain. The temporal lobe of the human brain is responsible for visual sensation. It has been shown that the neurons of the occipital lobe in a blind patient are healthy and have the potential to create visual sensation if the required signals are fired to the neurons in that region. We therefore discuss a technique for transmitting visual data digitally into the occipital lobe of the brain by wireless means; a BioMEMS device implanted in the brain receives this wireless digital data. The visual data transmitted from outside into the brain is received by a patch antenna present on the BioMEMS device. The digital data tapped by the patch antenna is then converted into an analog signal using a resistor-controlled Wien bridge oscillator. The analog signal obtained from the Wien bridge oscillator is equivalent to the signals required by the occipital lobe neurons to create visual sensation in human beings.

Visual sensation occurs in the temporal lobe, but image processing in human beings is done in the occipital lobe of the brain. Our main agenda is to generate the same image processing signals in a blind person's mind. The brain signals, also referred to as Visual Evoked Potentials (VEP), are obtained from EEG tests of normal people [3]. The whole process carried out in the EEG test is given in Fig. 1a. The EEG signals obtained from normal people serve as a reference for designing our system. An adaptive VLSI network is used to recognize the edges and motion, and based on these a suitable decision to identify the edges is made. Fig. 1b shows the block diagram of our system.


Fig. 1a. EEG electrode configuration

Fig. 1b. Block diagram of the system

2 Image Acquisition and Processing

A 2-dimensional array of electrical photoreceptor cells is used in the system; this acts as the artificial eye that captures visual data. The image is focused onto the photoreceptor cell array. Each pixel is mapped onto the corresponding cell aligned in the straight line of sight. Each cell has a concave lens which admits only the light rays from the straight-line-of-sight pixel; thus the pixel data is obtained on the corresponding photoreceptor cell. The correlation between adjacent cells is used to extract the edges. When an image is focused onto the photoreceptor cell array, the current induced by the intensity variation is converted into a voltage and then sampled and held for further analysis. The refresh time is made equal to the perception time of humans.

Inspired by biological transduction, we assume that the primary goal of phototransduction is to compute image contrast invariant to absolute illumination. We can think of the total intensity as a sum Igb + i of a steady-state background component Igb and a signal component i. The contrast of the signal is the ratio i/Igb, and the receptor response should be proportional to this ratio independent of Igb, at least for small ratios. The rationale for this assumption is that objects reflect a fixed fraction of the light that hits them. The simplest logarithmic receptor that produces a useful voltage range is shown in Figure 3.


Fig. 2. Basic cell of a pixel

Fig. 3. MOS transistor photoreceptor and log response

range is shown in Figure 3. It consists of a MOS transistor in which the source of the
transistor forms the photodiode and the channel forms the barrier, resulting in a logarithmic
response to intensity.
This forms the basic photoreceptor cell. The number of electrons excited
depends on the light intensity and, through E = nhv, the number of electrons emitted
into the conduction band depends on the wavelength of the light and hence on its colour; thus
there is a small change in current/voltage when an edge is detected. The correlation
between adjacent cells is extracted to detect edges; Figure 4 shows a simplified 2 x 2
photoreceptor array and the correlation between the adjacent cells.
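The following sketch illustrates, in software, the kind of adjacent-cell correlation described above: differences between neighbouring photoreceptor outputs are thresholded to flag edge locations. The array contents, the threshold value and the function name are illustrative assumptions, not part of the hardware design described in the paper.

```python
import numpy as np

def edge_map_from_photoreceptors(cells, threshold=0.1):
    """Flag edges where adjacent photoreceptor outputs differ strongly.

    cells: 2-D array of (logarithmic) photoreceptor voltages.
    threshold: illustrative difference threshold (assumed value).
    """
    # Differences between horizontally and vertically adjacent cells
    dx = np.abs(np.diff(cells, axis=1))   # shape (H, W-1)
    dy = np.abs(np.diff(cells, axis=0))   # shape (H-1, W)

    edges = np.zeros_like(cells, dtype=bool)
    edges[:, :-1] |= dx > threshold
    edges[:-1, :] |= dy > threshold
    return edges

# Example: a simulated 2 x 2 photoreceptor array with one bright column
cells = np.array([[0.2, 0.9],
                  [0.2, 0.9]])
print(edge_map_from_photoreceptors(cells))
```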
Usually spatial-temporal model is used for motion detection, but since we have to
have a real time motion detection system, this model needs prediction of speed to
induce delays, so this creates a problem. Hence so called Template Model proposed
by G.A.Horridge [4] is used for motion detection.


Fig. 4. Correlation technique used for edge detection

3 Adaptive Network Using VLSI


The adaptive network is used to generate signals equivalent to the VEP, using the
photoreceptor cell outputs. The adaptive network is used in two modes:
1. Training Mode
2. Recognition Mode
In training mode the training image is given to set the weights of the network; these
weights are the optimum weights. In recognition mode, random images are given, and
the network adapts to recognize the new image. When the circuit is in training mode,
the photoreceptor output is given to the adaptive network; the output from each cell is
given to a capacitor so that the desired response for a particular edge is obtained. This
desired response is sampled and then compared with the output of the network when it is
operated in recognition mode. In recognition mode the current obtained from the
photoreceptor correlation is given to the input capacitor. It is converted to a voltage Vgs
peak, which then produces an AC current Ids peak proportional to the
transconductance gain gm1. The outputs from the n amplifiers are added to get the
weighted sum of the input, which is then subtracted from the desired response to get
the error for the particular network. There will be four to five networks for the basic
edges, say circular, square, triangular, etc.; similarly, for motion there will be two to
three networks. The error functions of these networks determine the edge or motion.
Fig. 5 shows the circuit implementation using VLSI.
The Network Adaptation Process: If the error is positive, the current pump circuit in
figure 5 is selected and a large current flows through the input capacitor. If the error is
negative, the current sink circuit in figure 5 is selected and the error voltage
controls the amount of current to sink. Thus, the Ids of the transconductance amplifier is
varied depending upon the error. After two cycles of weight adjustments, the
adaptation stops. Depending upon the errors of all the networks, the object/edge is
recognized. The transconductance gain of the amplifier acts as the weight: gm = B(Vgs - Vl).
The selection of the particular digital value for the corresponding object based
on the errors of the adaptive network is shown in figure 6. After two cycles of the
adaptation process, whichever edge network and motion network has the minimum
error is dominant, and the brain signal equivalent to that edge/motion is generated. Figure
7 shows a method to select the minimum of the errors. In the circuit, the comparator values
give the minimum error function of the four networks Ek1, Ek2, Ek3 and Ek4.
Depending on these, the digital values obtained from the ADC of the VEP of the
particular edge/motion are selected.
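As a software analogue of the adaptation and minimum-error selection described above, the sketch below adjusts each network's weights in the direction that reduces its error for a fixed number of cycles and then picks the network with the smallest error. The learning rate, the two-cycle limit and the data values are assumptions made only for illustration, not parameters from the VLSI circuit.

```python
import numpy as np

def adapt_and_select(inputs, desired, n_networks=4, cycles=2, lr=0.1, seed=0):
    """Adapt several single-layer 'edge networks' and return the winner.

    inputs:  1-D array of photoreceptor correlation values.
    desired: list of desired (training) responses, one per network.
    """
    rng = np.random.default_rng(seed)
    weights = [rng.normal(size=inputs.size) for _ in range(n_networks)]
    errors = np.zeros(n_networks)

    for k in range(n_networks):
        for _ in range(cycles):                      # adaptation stops after 'cycles'
            response = weights[k] @ inputs           # weighted sum of the input
            error = desired[k] - response            # error for this network
            weights[k] += lr * error * inputs        # pump/sink analogue: sign of error
        errors[k] = abs(desired[k] - weights[k] @ inputs)

    return int(np.argmin(errors)), errors            # dominant (minimum-error) network

x = np.array([0.3, 0.7, 0.1, 0.5])
winner, errs = adapt_and_select(x, desired=[1.0, 0.2, 0.6, 0.9])
print(winner, errs)
```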

Fig. 5. Adaptive Network designed in VLSI

Fig. 6. Selection of digital value for corresponding error


Fig. 7. Circuit to select minimum of errors

Fig. 8. Comparator values for all possibilities

4 Visual Evoked Potential


The EEG test has shown that, for any visual evoked potential, the response is of the
order of microseconds and is divided into four frequency regions as shown in
figure 9. The gain of the Wien bridge oscillator, which has the same frequency as the
VEP, is altered in the four frequency regions to generate the particular brain signal.
The digital values of the particular brain signal are stored in a six-transistor Pseudo
SRAM cell shown in figure 10; then, depending on the digital values transmitted, the
gain of the oscillator in the given frequency region is varied.

Fig. 9. EEG test signal with 4 frequency regions f1, f2, f3 and f4


Wireless Digital Data Communication: The digital data from the Pseudo SRAM cell
now has to be transmitted to the BioMEMS implanted inside the patient's
brain. To establish this communication link we use wireless technology. The
data from the SRAM cell is transmitted using a wireless patch antenna operated at a
frequency of 300 MHz. Another patch antenna, meant only for
receiving data, is embedded on the surface of the BioMEMS device. This patch
antenna is tuned to operate in the band around 300 MHz.
The digital data has to be encoded, because the resistance values must have
different resonant frequencies so that the particular resistance can be selected. This is
achieved with a Voltage Controlled Oscillator (VCO) [8], in which the frequency
depends on the magnitude of the applied voltage.

Fig. 10. Pseudo SRAM Memory cell to store and transmit data

BioMEMS: The BioMEMS [9] is implanted into the blind person's occipital lobe. It
contains four parts, namely: 1. a patch antenna receiver, 2. a resistor-controlled Schmitt
trigger and double integrator, 3. a demultiplexing circuit and 4. a 4 x 4 silicon-platinum
electrode array.
The patch antenna receiver receives the encoded digital data wirelessly. The gain-controlled
Schmitt trigger generates signals depending upon the encoded digital data
received from the antenna. The resistors in the circuit shown in figure 11
are implemented using UJTs, and an RLC circuit is used to tune each resistor to a
particular frequency and hence control the selection of the resistor of the double
integrator circuit shown in figure 11; the output voltage of the oscillator is controlled
by this resistor network. Thus only the signal corresponding to the transmitted digital
data is generated. As explained above, the VEP is the sum of the potentials of the neuron
firings. Hence the signal generated by the Wien bridge oscillator has to be
demultiplexed before the voltage signals are applied to the neurons. Figure 14 shows the
demultiplexer circuit used to demultiplex the signals and apply them to the
electrode array.


Fig. 11. Simulated circuit of Schmitt trigger and dual integrator Gain Controller Circuit that
should be incorporated on BioMEMS

Fig. 12. Simulation results

Thus, the demultiplexer is used to drive the voltages of the electrodes that are placed
on the neurons. Silicon is used to create the 4 x 4 electrode array; we
used this material because of the biocompatibility of silicon for BioMEMS
applications [10]. The simulated results are shown in figure 12. The output of the first
integrator is a triangular wave and the output of the second integrator is a sine wave. Since the UJT
and microstrip antenna are not available in Multisim, the resistors are controlled using
switching MOSFETs. This is also shown in figure 12.

5 Conclusion
A technology to enable partial vision in visually impaired people is discussed here.
Since the majority of blind people have a healthy occipital lobe, we use new
technologies to artificially excite brain neurons, such as a BioMEMS 4 x 4
electrode array that precisely fires the neurons with the required brain signals. The
brain signals are generated using VLSI circuits; for this purpose the VLSI circuit
processes an image captured by electrical photoreceptors.
The EEG signal is known to be the summation of the individual neuron firings. So
the output generated from the gain control circuit is given to the demultiplexer, whose
clock frequency is twice the frequency of the output. The demultiplexed
output is given to the respective MEMS electrode. This information is obtained from the
EEG electrode configuration.

References
1. Yakovleff, A.J.S., Moini, A.: Motion Perception using Analog VLSI. Analog Integrated
Circuits & Signal Processing 15(2), 183–200 (1998) ISSN:0925-1030
2. Mojarradi, M.: Miniaturized Neuroprosthesis Suitable for Implantation into Brain. IEEE
Transactions on Neural Systems & Rehabilitation Engineering (March 2003)
3. Rangayanan, R.M.: Visual Evoked Potential. In: Biomedical Signal Processing and Analysis:
A Case Study Approach. IEEE Press, Los Alamitos
4. Sobey, P.J., Horridge, G.A.: Implementation of the Template Model for Vision. Proc. R. Soc.
Lond. B 240(1298), 211–229 (1990), doi:10.1098/rspb.1990.0035
5. Nguyen, C.T.-C.: MEMS Technology for Timing and Frequency Control. Dept. of
Electrical Engineering and Computer Science
6. Schmidt, S., Horch, K., Normann, R.: Biocompatibility of silicon-based electrode arrays
implanted in feline cortical tissue. Journal of Biomedical Materials Research (November
1993)

Undecimated Wavelet Packet for Blind Speech Separation Using Independent Component Analysis

Ibrahim Missaoui1 and Zied Lachiri1,2

1 National School of Engineers of Tunis,
BP. 37 Le Belvédère, 1002 Tunis, Tunisia
brahim.missaoui@enit.rnu.tn
2 National Institute of Applied Science and Technology,
INSAT, BP 676 Centre Urbain Cedex, Tunis, Tunisia
zied.lachiri@enit.rnu.tn

Abstract. This paper addresses the problem of multi-channel blind speech separation
in the instantaneous mixture case. We propose a new blind speech separation system
which combines the independent component analysis approach and the undecimated
wavelet packet decomposition. The idea behind employing the undecimated wavelet as a
preprocessing step is to improve the non-Gaussianity of the distribution of the
independent components, which is a pre-requirement for ICA, and to increase their
independence. The two observed signals are transformed using the undecimated
wavelet and a Shannon entropy criterion into an adequate representation, from which a
preliminary separation is performed. Finally, the separation task is done in the time
domain. The obtained results show that the proposed method gives a considerable
improvement when compared with FastICA and other techniques.

Keywords: Undecimated wavelet packet decomposition, independent component
analysis, blind speech separation.

1 Introduction

The human auditory system has a remarkable ability to separate target sounds
emitted from different sources. However, it is very difficult to replicate this
functionality in machine counterparts. This challenge, known as the cocktail-party
problem, has been investigated and studied by many researchers over the last decades [20].
Blind source separation (BSS) is a technique for recovering a set of source signals
from their mixture signals without exploiting any knowledge about the source signals
or the mixing channel. Among the solutions to the BSS problem, the independent
component analysis (ICA) approach is one of the most popular. ICA is a statistical and
computational technique in which the goal is to find a linear projection of the data
where the source signals or components are statistically independent or as independent
as possible [17]. For instantaneous blind separation, many algorithms have been
developed using this approach [19], such as ICA based on mutual information
minimization [2,27], maximization of non-Gaussianity [1,6,4] and maximization of
likelihood [3,12]. To perform the blind separation task, the ICA approach can use
second- or higher-order statistics. For instance, SOBI [13] is a second-order blind
identification algorithm which extracts the estimated signals by applying a joint
diagonalization of a set of covariance matrices. Similarly, the Jade algorithm introduced
in [10] is based on higher-order statistics and uses a Jacobi technique in order to perform
a joint diagonalization of the cumulant matrices.
Some approaches combine the ICA algorithm with another technique. For instance,
geometric information [29] and subband decomposition [24] can be used in combination
with ICA. In [5], the mixture is decomposed using the discrete wavelet transform and the
separation step is then performed in each subband. The approach proposed in [22,25]
employs the wavelet transform as a preprocessing step, and the separation task is then
done in the time domain. In this paper, we propose a blind separation system to extract
the speech signals from their observed signals in the instantaneous case. The proposed
system uses the undecimated wavelet packet decomposition [9] to transform the two
mixture signals into an adequate representation that emphasizes the non-Gaussian nature
of the mixture signals, which is a pre-requirement for ICA, and then performs a
preliminary separation [22,25,23]. Finally, the separation task is carried out in the time
domain.
The rest of the paper is organized as follows. After the introduction, Section 2
introduces the blind speech separation problem and describes the FastICA algorithm
used in the proposed method. In Section 3, the undecimated wavelet packet
decomposition is presented. Then, in Section 4, the proposed method is described.
Section 5 exposes the experimental results. Finally, Section 6 concludes and gives
perspectives of our work.

2 Blind Speech Separation

In this section, we formulate and describe the problem of blind speech separation
by focusing mainly on the ICA approach.
2.1 Problem Formulation

The main task of the blind speech separation problem is to extract the original
speech signals from their observed mixtures, without reference to any prior information
on the source signals or the observed signals, under the assumption that the source
signals are statistically independent. The observed signals contain different
combinations of the source signals. This mixing model can be represented, in the
instantaneous case where the number of mixture signals equals that of the source
signals, by:

X(t) = A S(t) .    (1)
Where X(t) = [x1(t), ..., xn(t)]^T is the vector of mixture signals, S(t) = [s1(t), ...,
sn(t)]^T is the unknown vector of source signals and A is the unknown mixing matrix
of dimension (n x n).
Independent component analysis is a statistical BSS method which attempts to solve
this problem by exploiting the assumption of independence of the source signals. The
method consists in finding the separating matrix, known as the unmixing matrix
W = A^{-1}, which is used to recover the original independent components as Y = W X.
Its principle is depicted in Figure 1. The key idea is to maximize the non-Gaussianity in
order to make the estimated sources as statistically independent as possible, under some
fundamental assumptions and certain restrictions [17]: the components si(t) of S(t)
(i.e. the sources) are assumed to be statistically independent with non-Gaussian
distributions.
In order to measure non-Gaussianity or independence, the ICA approach exploits
higher-order statistics and information-theoretic criteria such as the kurtosis or the
differential entropy called negentropy [17]. The FastICA algorithm [6,17], which is
based on negentropy, is one of the most popular algorithms performing independent
component analysis.

Fig. 1. Principle of ICA

2.2 FastICA Algorithm

FastICA is an efficient algorithm which performs the ICA approach.
It realizes the blind separation task by using a fixed-point iteration scheme in order to
find the maximum of the non-Gaussianity of the projected component. The non-Gaussianity
can be measured through the value of the negentropy, which is defined from the
differential entropy as:

J(y) = H(y_{gauss}) - H(y) .    (2)

Where H(y) represents the differential entropy of y and is computed as follows:

H(y) = -\int f(y) \log(f(y)) \, dy .    (3)
The negentropy can be considered as an optimal measure of non-Gaussianity.
However, it is difficult to estimate the true negentropy. Thus, several approximations
have been developed, such as the one proposed by Aapo Hyvarinen et al. [6,17]:

J(y) = \sum_{i=1}^{p} k_i \, (E[g_i(y)] - E[g_i(v)])^2 .    (4)

where k_i, g_i and v are, respectively, positive constants, non-quadratic functions and a
Gaussian random variable. The fundamental fixed-point iteration is performed using the
following expression:

W_i(k) \leftarrow E\{ X \, g(W_i^T X) \} - E\{ g'(W_i^T X) \} \, W_i .    (5)

where g is the contrast function and g' is its derivative.
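A compact numerical sketch of this negentropy-based fixed-point update is given below for a single unit. Whitening of the mixtures, the choice g(u) = tanh(u) and the convergence tolerance are standard FastICA ingredients but are assumptions here, not details taken from the paper.

```python
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-6, seed=0):
    """One-unit FastICA on whitened mixtures X (shape: n_channels x n_samples)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    w = rng.normal(size=n)
    w /= np.linalg.norm(w)

    g = np.tanh                                   # contrast function g
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2     # its derivative g'

    for _ in range(max_iter):
        wx = w @ X                                # projected component W^T X
        # Fixed-point update of Eq. (5): E{X g(w^T X)} - E{g'(w^T X)} w
        w_new = (X * g(wx)).mean(axis=1) - g_prime(wx).mean() * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < tol
        w = w_new
        if converged:
            break
    return w
```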

3 Undecimated Wavelet Packet

3.1 Wavelet Transform

The wavelet transform (WT) was introduced as a time-frequency analysis approach
which leads to a powerful new linear representation of the signal [14,21]. In addition,
the discrete wavelet transform (DWT) has been developed; it decomposes signals into
approximations and details, with decimators after each filtering step. However, the DWT
is not translation invariant [21].
In order to provide a denser approximation and overcome this drawback, the
undecimated wavelet transform (UWT) has been introduced, in which no decimation is
performed after each filtering step, so that both the approximation and detail signals
have the same size, equal to that of the analyzed signal. The UWT was invented several
times under different names, such as the algorithme à trous (algorithm with holes) [8],
the shift-invariant DWT [26] and the redundant wavelet transform [7]. The UWPT is
developed in the same way and computed in a similar manner as the wavelet packet
transform, except that the downsampling operation after each filtering step is suppressed.
3.2 The Undecimated Wavelet Packet

In our BSS system, we use the undecimated wavelet packet decomposition with the
Daubechies-4 (db4) wavelet on an 8 kHz speech signal. This decomposition tree structure
consists of five levels and is adjusted in order to accord with the critical band
characteristics. The sampling rate of the speech signal used in this work is 8 kHz, which
leads to a bandwidth of 4 kHz. Therefore, the audible frequency range can be
approximated with 17 critical bands (Barks) as shown in Table 1. The tree structure of the
undecimated wavelet packet decomposition is obtained according to these critical
bandwidths [9]. It is depicted in figure 2. The frequency bandwidth of each node of the
UWPD tree is computed by the following equation:
cbw(i, j) = 2^{-j} (F_s - 1) .    (6)

Where i = (0, 1, ..., 5) and j = (0, ..., 2^i - 1) are, respectively, the level and the position
of the node, and F_s is the sampling frequency.
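To make the "no downsampling" idea behind the UWPD concrete, the sketch below performs one undecimated (à trous) filtering stage: the analysis filters are dilated by inserting zeros at each level instead of decimating the signal, so every sub-band keeps the length of the input. The Haar-like filter pair is only a placeholder; the paper's tree uses db4 filters.

```python
import numpy as np

def undecimated_stage(signal, level, h_low, h_high):
    """One stage of an undecimated (a trous) decomposition.

    The filters are dilated by inserting 2**level - 1 zeros between taps,
    and no downsampling is applied, so outputs keep the input length.
    """
    def upsample(h, level):
        up = np.zeros((len(h) - 1) * 2 ** level + 1)
        up[:: 2 ** level] = h
        return up

    lo = np.convolve(signal, upsample(h_low, level), mode="same")
    hi = np.convolve(signal, upsample(h_high, level), mode="same")
    return lo, hi   # approximation and detail, same size as 'signal'

# Placeholder Haar-like filters (the paper's tree uses db4)
h_low = np.array([1.0, 1.0]) / np.sqrt(2.0)
h_high = np.array([1.0, -1.0]) / np.sqrt(2.0)

x = np.sin(2 * np.pi * 50 * np.arange(0, 0.1, 1 / 8000.0))
a0, d0 = undecimated_stage(x, level=0, h_low=h_low, h_high=h_high)
a1, d1 = undecimated_stage(a0, level=1, h_low=h_low, h_high=h_high)
print(len(x), len(a1), len(d1))   # all equal: no decimation
```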



Table 1. Critical-band characteristics

Critical band (Bark)   Center frequency (Hz)   Critical bandwidth (CBW) (Hz)
 1                        50                     100
 2                       150                     100
 3                       250                     100
 4                       350                     100
 5                       450                     110
 6                       570                     120
 7                       700                     140
 8                       840                     150
 9                      1000                     160
10                      1170                     190
11                      1370                     210
12                      1600                     240
13                      1850                     280
14                      2150                     320
15                      2500                     380
16                      2900                     450
17                      3400                     550

4 The Proposed Method

The idea behind employing the wavelet transform as a preprocessing step is to improve
the non-Gaussianity of the distribution of the independent components, which is a
pre-requirement for ICA, and to increase their independence [22,25]. Inspired by this
idea, we propose a new blind separation system, in the instantaneous mixture case, to
extract the speech signals of two speakers from two speech mixtures.
The proposed system uses the undecimated wavelet packet decomposition to transform
the two mixture signals into an adequate representation that emphasizes their
non-Gaussian nature. The UWPD tree is chosen according to the critical bands of the
psycho-acoustic model of the human auditory system. The resulting signals are used to
estimate the unmixing matrix W using the FastICA algorithm [6]. The separation task is
then done in the time domain. Our speech separation system, shown in figure 3, contains
two modules shown in dotted boxes. The first module (Preprocessing Module) extracts
appropriate signals from the observed signals to improve the source separation task. The
second module (Separation Module) performs the source separation using the FastICA
algorithm [6]. The description of each module is given below.
4.1 Preprocessing Module

The first module corresponds to the preprocessing step, which decomposes the
observed signals using a perceptual filter bank. This filter bank is designed by


Fig. 2. The CB-UWPD tree and its corresponding frequency bandwidths (perceptual filterbank)

adjusting the undecimated wavelet packet decomposition tree according to the critical
band characteristics of the psycho-acoustic model [9]. Each resulting set of coefficients
of the two mixtures can be viewed as an appropriate signal. Thus, we have many
possibilities in the choice of the best coefficients. In order to increase the
non-Gaussianity of the signals, which is a pre-requirement for ICA, we need to find the
coefficients which best improve the source separation task. The coefficient selection is
done using the Shannon entropy criterion [22,25,15]. The following steps summarize the
selection algorithm:
Step 1: Decompose each mixture signal into undecimated wavelet packets.
Step 2: Calculate the entropy of each node Cj,k of the UWPD tree.
Step 3: Select the node which has the lowest entropy.
The Shannon entropy is computed for each node (j, k) as follows:

H(j, k) = - \sum_i p_i \log(p_i) .    (7)

Where

p_i = C_{j,k}(i) / \| X(k) \|^2 .    (8)


Fig. 3. The framework of proposed speech separation system

where C_{j,k} are the UWPD coefficients and X is the mixture signal.
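The node-selection rule of Steps 1-3 and Eqs. (7)-(8) can be sketched as follows. The dictionary of node coefficients is only a stand-in for the actual UWPD tree, and taking absolute values of the coefficients (to keep the probabilities nonnegative) and the small epsilon guard against log(0) are implementation assumptions.

```python
import numpy as np

def shannon_entropy(coeffs, mixture, eps=1e-12):
    """Entropy of one UWPD node, in the spirit of Eqs. (7)-(8)."""
    # Absolute values keep p_i nonnegative (assumption; Eq. (8) as printed omits this)
    p = np.abs(coeffs) / (np.sum(mixture ** 2) + eps)
    p = p[p > eps]
    return -np.sum(p * np.log(p))

def select_best_node(nodes, mixture):
    """Return the key of the node with the lowest Shannon entropy (Step 3)."""
    entropies = {name: shannon_entropy(c, mixture) for name, c in nodes.items()}
    return min(entropies, key=entropies.get), entropies

# Illustrative stand-in for a UWPD tree of one mixture signal
x = np.random.default_rng(0).normal(size=1024)
nodes = {"node_0": x * 0.5, "node_1": np.sign(x), "node_2": x ** 2}
best, ent = select_best_node(nodes, x)
print(best, ent)
```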
4.2 Separation Module

In this module, the separation task is done; it can be divided into two steps. The first
one consists in generating the unmixing matrix W using the FastICA algorithm [6]. This
step uses the resulting signals of the previous module as the new inputs of FastICA;
the two input signals correspond to the UWPD coefficients having the lowest entropy.
The second step consists in extracting the estimated speech signals using the matrix W,
taking into account the original mixture signals.

5 Results and Evaluation

In this section, we illustrate the performance of the system described in the previous
section. We use the TIMIT database, which is formed by a total of 6300 speech signals.
These signals consist of 10 sentences spoken by each of 630 speakers, chosen from 8
major dialect regions of the United States [28]. The speech signals taken are resampled
to 8 kHz. We consider in this work the instantaneous case with two mixture signals
composed of two speech signals. The observed signals are generated artificially using
the following mixing matrix:


A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} .    (9)
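As an illustration of Eq. (1) with this mixing matrix, the snippet below builds two artificial instantaneous mixtures from two source signals; the sinusoidal "sources" are placeholders for the TIMIT utterances used in the paper.

```python
import numpy as np

fs = 8000                                   # 8 kHz, as in the paper
t = np.arange(0, 1.0, 1.0 / fs)

# Placeholder sources (the paper uses TIMIT speech signals)
s1 = np.sin(2 * np.pi * 220 * t)
s2 = np.sign(np.sin(2 * np.pi * 313 * t))
S = np.vstack([s1, s2])                     # S(t), shape (2, n_samples)

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])                  # mixing matrix of Eq. (9)
X = A @ S                                   # observed mixtures X(t) = A S(t)
print(X.shape)
```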


To evaluate our system, we use different performance metrics, such as the blind
separation performance measures introduced in BSS EVAL [11,30], which include
various numerical measures of BSS performance. We exploit in this work the Signal to
Interference Ratio (SIR) and the Signal to Distortion Ratio (SDR). To compute these
measures, the estimated signals si(n) must be decomposed into the following sum of
components:
s_i(n) = s_{target}(n) + e_{interf}(n) + e_{artefact}(n) .    (10)

where s_{target}(n), e_{interf}(n) and e_{artefact}(n) are, respectively, an allowed
deformation of the target source s_i(n), an allowed deformation of the sources which
accounts for the interference of the unwanted sources, and an artifact term which
represents the artifacts produced by the separation algorithm. Then the SIR and SDR
ratios are computed from this decomposition as:
SIR = 20 \log \frac{\| s_{target}(n) \|^2}{\| e_{interf}(n) \|^2} .    (11)

SDR = 20 \log \frac{\| s_{target}(n) \|^2}{\| e_{interf}(n) \|^2 + \| e_{artefact}(n) \|^2} .    (12)
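A direct numerical transcription of Eqs. (11)-(12) is given below; it assumes the target, interference and artifact components have already been obtained (e.g., from a BSS_EVAL-style decomposition) and simply evaluates the two ratios as written in the paper, with a base-10 logarithm assumed for dB values.

```python
import numpy as np

def sir_sdr(s_target, e_interf, e_artefact):
    """Evaluate Eqs. (11)-(12): ratios of squared norms on a 20*log scale."""
    p_target = np.sum(s_target ** 2)
    p_interf = np.sum(e_interf ** 2)
    p_artef = np.sum(e_artefact ** 2)

    sir = 20.0 * np.log10(p_target / p_interf)
    sdr = 20.0 * np.log10(p_target / (p_interf + p_artef))
    return sir, sdr

# Toy example with assumed component signals
rng = np.random.default_rng(1)
s_t = np.sin(2 * np.pi * 440 * np.arange(0, 0.5, 1 / 8000.0))
print(sir_sdr(s_t, 0.01 * rng.normal(size=s_t.size), 0.005 * rng.normal(size=s_t.size)))
```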

In order to evaluate the quality of the estimated speech signals, the segmental and
overall Signal to Noise Ratios are used. In addition, the speech quality is assessed using
the perceptual evaluation of speech quality (PESQ), an objective method defined in the
ITU-T P.862 standard [16]. The PESQ measure is a score between 0.5 and 5 and is
equivalent to the subjective Mean Opinion Score.
The experimental results of our proposed system have been compared to those of the
FastICA algorithm [6], described in Section 2, and of two well-known algorithms,
SOBI [13] and Jade [10].
The obtained results are summarized in four tables. Table 2 presents the BSS
evaluation, including the SIR and SDR ratios, obtained after the separation task by the
proposed method, SOBI, Jade and FastICA. We observe that the SIR and SDR values
for the proposed method are improved compared to FastICA. The average SIR is
55.93 dB for the proposed method, 48.03 dB for FastICA, 50.17 dB for Jade and
26.60 dB for SOBI.
Tables 3 and 4 illustrate the segmental SNR and the overall SNR. We can see that
the estimated signals obtained by our method have better values than those of the other
methods. For instance, we obtained an overall SNR improvement of 9 dB compared
with FastICA.
To measure the speech quality of the estimated signals, the evaluation is also reported
in terms of PESQ. As depicted in Table 5, the proposed


Table 2. Comparison of SIR and SDR using SOBI, Jade, FastICA and the proposed Method (PM)

                 SOBI   Jade   FastICA   PM
SIR (Signal 1)  26.92  54.72    44.39   51.11
SIR (Signal 2)  26.29  45.63    51.68   60.75
SDR (Signal 1)  26.92  54.72    44.39   51.11
SDR (Signal 2)  26.29  45.63    51.68   60.75
Average         26.60  50.17    48.03   55.93

Table 3. Comparison of segmental SNR using Sobi, Jade, FastICA and proposed
Method (PM)
SOBI Jade FastICA PM
Seg SNR (Signal 1) 22.58 33.56 30.79 32.79
Seg SNR (Signal 2) 20.47 29.40 31.15 33.03
Table 4. Comparison of Overall SNR using Sobi, Jade, FastICA and proposed Method
(PM)
SOBI Jade FastICA PM
Overall SNR (Signal 1) 26.92 54.72 44.39 51.11
Overall SNR (Signal 2) 26.29 45.63 51.68 60.75
Table 5. Comparison PESQ using SOBI, Jade, FastICA and proposed Method (PM)
SOBI Jade FastICA PM
PESQ (Signal 1) 2.58 3.29 3.25 3.29
PESQ (Signal 2) 3.45 4.14 4.27 4.38

method is still more effective in terms of perceptual quality than FastICA and
the other techniques.

6 Conclusion

In this work, we have proposed a novel blind speech separation approach to separate
two sources in the instantaneous case. This approach is based on the undecimated
wavelet packet transform and the ICA algorithm. We employed the undecimated wavelet
packet and used the Shannon entropy criterion in order to increase the non-Gaussianity
of the observed signals. The resulting signals are used as new inputs of the ICA
algorithm to estimate the unmixing matrix, which is then employed to separate the
speech signals in the time domain. The experimental results of this hybrid undecimated
wavelet packet-ICA approach show that it yields a better separation performance
compared to similar techniques.


References
1. Comon, P.: Independent component analysis: A new concept? Signal Processing 36(3), 287314 (1994)
2. Bell, A.J., Sejnowski, T.J.: An information maximization approach to blind separation and blind deconvolution. Neural Computation 7, 10041034 (1995)
3. Cardoso, J.F.: Infomax and maximum likelihood for blind separation. IEEE Signal
Processing Letters 4, 112114 (1997)
4. Wang, F.S., Li, H.W., Li, R.: Novel NonGaussianity Measure Based BSS Algorithm
for Dependent Signals. In: Dong, G., Lin, X., Wang, W., Yang, Y., Yu, J.X. (eds.)
APWeb/WAIM 2007. LNCS, vol. 4505, pp. 837844. Springer, Heidelberg (2007)
5. Xiao, W., Jingjing, H., Shijiu, J., Antao, X., Weikui, W.: Blind separation of speech
signals based on wavelet transform and independent component analysis. Transactions of Tianjin University 16(2), 123128 (2010)
6. Hyvärinen, A.: Fast and robust fixed-point algorithms for independent component
analysis. IEEE Transactions on Neural Networks 10(3), 626–634 (1999)
7. Fowler, J.: The redundant discrete wavelet transform and additive noise. IEEE
Signal Processing Letters 12(9), 629632 (2005)
8. Shensa, M.: The discrete wavelet transform: Wedding the `
a trous and Mallat algorithms. IEEE Trans. Signal Processing 40(10), 24642482 (1992)
9. Tasmaz, H., Ercelebi, E.: Speech enhancement based on undecimated wavelet
packet-perceptual lterbanks and MMSE-STSA estimation in various noise environments. Digital Signal Processing 18(5), 797812 (2008)
10. Cardoso, J.F.: Higher-order contrasts for independent component analysis. Neural
Computation 11, 157192 (1999)
11. Vincent, E., Gribonval, R., Fevotte, C.: Performance Measurement in Blind Audio
Source Separation. IEEE Transactions on Audio, Speech, and Language Processing 14(4), 14621469 (2006)
12. Chien, J.T., Chen, B.C.: A New Independent Component Analysis for Speech
Recognition and Separation. IEEE Transactions on Audio, Speech and Language
Processing 14(4), 12451254 (2006)
13. Belouchrani, A., Abed-Meraim, K., Cardoso, J.F., Moulines, E.: A blind source separation technique using second order statistics. IEEE Trans. Signal Processing 45,
434444 (1997)
14. Gargour, C., Abrea, M., Ramachandran, V., Lina, J.M.: A short introduction to
wavelets and their applications. IEEE Circuits and Systems Magazine 9(2), 5758
(2009)
15. Coifman, R., Wickerhausser, M.: Entropy-based algorithms for best-basis selection.
IEEE Transactions on Information Theory 38, 713718 (1992)
16. ITU-T P.862, Perceptual evaluation of speech quality (PESQ), an objective method
for end-to-end speech quality assessment of narrow-band telephone networks and
speech codecs. International Telecommunication Union, Geneva (2001)
17. Hyvärinen, A., Karhunen, J., Oja, E.: Independent Component Analysis. Wiley-Interscience,
New York (2001)
18. Wang, L., Brown, G.J.: Computational Auditory Scene Analysis: Principles, Algorithms, and Applications. Wiley/IEEE Press, Hoboken, NJ (2006)
19. Haykin, S.: Neural Networks and Learning Machines, 3rd edn. Prentice-Hall, Englewood Clis (2008)
20. Cichocki, A., Amari, S.: Adaptive Blind Signal and Adaptive Blind Signal and
Image Processing. John Wiley and Sons, New York (2002)


21. Mallat: A Wavelet Tour of Signal Processing: The Sparse Way, 3rd edn. Academic
Press, London (2008)
22. Moussaoui, R., Rouat, J., Lefebvre, R.: Wavelet Based Independent Component
Analysis for Multi-Channel Source Separation. In: IEEE International Conference
on Acoustics, Speech and Signal Processing, pp. 645648 (2006)
23. Usman, K., Juzoji, H., Nakajima, I., Sadiq, M.A.: A study of increasing the speed
of the independent component analysis (lCA) using wavelet technique. In: Proc.
International Workshop on Enterprise Networking and Computing in Healthcare
Industry, pp. 7375 (2004)
24. Tanaka, T., Cichocki, A.: Subband decomposition independent component analysis
and new performance criteria. In: IEEE International Conference on Acoustics,
Speech and Signal Processing, pp. 541544 (2004)
25. Mirarab, M.R., Sobhani, M.A., Nasiri, A.A.: A New Wavelet Based Blind Audio Source Separation Using Kurtosis. In: International Conference on Advanced
Computer Theory and Engineering (2010)
26. Walden, A.T., Contreras, C.: The phase-corrected undecimated discrete wavelet
packet transform and its application to interpreting the timing of events. Proceedings of the Royal Society of London, 22432266 (1998)
27. Chien, J.T., Hsieh, H.L., Furui, S.: A new mutual information measure for independent component alalysis. In: IEEE International Conference on Acoustics, Speech
and Signal Processing, pp. 18171820 (2008)
28. Fisher, W., Dodington, G., Goudie-Marshall, K.: The TIMIT-DARPA speech
recognition research database: Specication and status. In: DARPA Workshop on
Speech Recognition (1986)
29. Zhang, W., Rao, B.D.: Combining Independent Component Analysis with Geometric Information and its Application to Speech Processing. In: IEEE International
Conference on Acoustics, Speech, and Signal Processing (2009)
30. Fevotte, C., Gribonval, R., Vincent, E.: BSS EVAL toolbox user guide, IRISA,
Rennes, France, Technical Report 1706 (2005)

A Robust Framework for Multi-object Tracking


Anand Singh Jalal and Vrijendra Singh
Indian Institute of Information Technology, Allahabad, India
anandsinghjalal@gmail.com, vrij@iiita.ac.in

Abstract. Tracking multiple objects in a scenario exhibiting complex interactions is
very challenging. In this work, we propose a framework for multi-object
tracking in the complex wavelet domain to resolve the challenges that occur due to
incidents of occlusion and split. A scheme exploiting spatial and appearance
information is used to detect and correct the occlusion and split states.
Experimental results illustrate the effectiveness and robustness of the proposed
framework in ambiguous situations in several indoor and outdoor video
sequences.

1 Introduction
There are a sheer number of applications where visual object tracking becomes an
essential component. These applications include surveillance systems that detect
suspicious activity, sports video analysis to extract highlights, traffic monitoring, and
human-computer interaction to assist visually challenged people. Even the
performance of high-level event analysis depends highly on the accuracy of the
object tracking method.
Multi-object tracking is one of the most challenging problems in computer vision.
The challenges are due to changes in the appearance of the objects, occlusion of objects
and splitting of objects. Occlusion occurs either because one object is occluded by
another object or because an object is occluded by some component of the background.
Split may occur due to merged objects or because of errors in the segmentation method.
An error in the split may mislead the tracker. A good multi-object tracking method
should be able to detect changing numbers of objects in the scene, handle objects being
added and removed, and also handle both occlusion and split events.
Kalman filtering is an efficient solution for tracking multiple objects [1]. However,
mistakes become more frequent and are difficult to correct as the number of objects
increases. The problem can be addressed using particle filtering by exploiting
multiple hypotheses [2]. In [3], the authors formulate multi-object tracking as a
Bayesian network inference problem and explore this approach to track multiple
players. In [4], the authors propose a probabilistic framework based on HMMs to
describe multiple-object trajectory tracking; the framework is able to track an unknown
number of objects. The association problem has been represented as a bipartite graph
in [5]. A method was proposed to maintain hypotheses for multiple associations; it also
resolves the problem of objects entering and exiting, and handles the errors due to
merging and splitting objects. However, particle filter-based tracking algorithms, not
having enough samples at statistically significant
modes, face difficulty in tracking multiple objects. They are only capable of handling
partial, short-duration occlusions. In recent years a number of approaches have been
proposed in the literature to resolve the issues of multi-object tracking [6,7,8]. However,
these methods fail when objects suddenly disappear or change direction, or in the case
of similarly colored/textured objects.
This paper describes a multi-resolution tracking framework using the Daubechies
complex wavelet transform. Due to its approximately shift-invariant and noise-resilient
nature, a Daubechies complex wavelet transform based method provides efficiency and
robustness to the tracking system in varying real-life environments and even in the
presence of noise [9]. The wavelet transform also has an inherent multi-resolution
nature that provides a rich and robust representation of an object. A multi-resolution
approach offers the opportunity to perform tracking at high resolution when we require
accurate estimation of the object state, e.g. in confusion due to occlusion, while tracking
at lower spatial resolution at other times. Therefore, in the proposed approach we exploit
the high resolution to give more discriminative power to the object model, whereas all
other tasks are performed at low resolution.
The proposed algorithm exploits a correspondence establishment approach similar
to that presented in [6], but with a different distance measure and a different appearance
model based on Daubechies complex wavelet coefficients. The proposed approach
encompasses the principle of object permanence to handle occlusion caused by a
background object such as an electric pole or a tree. Object permanence is
defined as the ability of an infant to understand the existence of a hidden moving
object [10].
The remaining part of the paper is organized as follows. Section 2 gives an
overview of the proposed framework. Section 3 presents the proposed multi-object
tracking approach and discusses how to handle the occlusion and split problems in a
multi-object scenario. Section 4 contains results on real-world video sequences, and
finally, Section 5 concludes and discusses open issues for future research.

2 Proposed Framework
Wavelet domain provides a framework to view and process images at multiple
resolutions [11]. We have used the Daubechies complex wavelet transform (CxWT), as it
is approximately shift-invariant and has better directional information with respect to the
real DWT. The details of the CxWT can be found in [12]. The proposed framework is
broadly subdivided into two components: 1) moving object extraction using background
subtraction, and 2) multi-object tracking using occlusion reasoning. Fig. 2 illustrates the
block diagram of the proposed framework. From the block diagram it is clear that all the
tasks are performed in the complex wavelet domain.
The first component of the proposed framework consists of a simple and effective
background modelling and subtraction method, in which a background model is
extracted initially in a training stage. Background subtraction is then performed to
extract moving foreground pixels in the current frame using a single Gaussian method
in the wavelet domain [13]. An area thresholding method is incorporated to remove the
false detections. The background is updated using feedback from the classification
results of the extracted moving objects. Morphological operators are then applied for
further smoothing of the moving objects.
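The single-Gaussian background subtraction step can be sketched as below: each pixel (here, each coefficient) keeps a running mean and variance, and values deviating by more than a few standard deviations are marked as foreground. The learning rate, the threshold k, the initial variance and the use of plain image intensities instead of complex wavelet coefficients are simplifying assumptions of this sketch.

```python
import numpy as np

class SingleGaussianBackground:
    """Per-pixel single-Gaussian background model with a simple update rule."""

    def __init__(self, first_frame, alpha=0.05, k=2.5):
        self.mean = first_frame.astype(float)
        self.var = np.full_like(self.mean, 15.0 ** 2)  # assumed initial variance
        self.alpha = alpha                             # learning rate (assumption)
        self.k = k                                     # threshold in std deviations

    def apply(self, frame):
        frame = frame.astype(float)
        diff = np.abs(frame - self.mean)
        foreground = diff > self.k * np.sqrt(self.var)

        # Update the model only where the pixel is classified as background
        bg = ~foreground
        self.mean[bg] += self.alpha * (frame[bg] - self.mean[bg])
        self.var[bg] += self.alpha * ((frame[bg] - self.mean[bg]) ** 2 - self.var[bg])
        return foreground

# Usage with synthetic frames
frames = np.random.default_rng(0).normal(100, 5, size=(10, 48, 64))
model = SingleGaussianBackground(frames[0])
masks = [model.apply(f) for f in frames[1:]]
print(masks[-1].mean())
```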
The second component is the tracking module, which uses a correspondence
process to associate each foreground object with one of the existing tracked objects.
However, this task becomes challenging even in the simple case of partial occlusions.
Therefore, in the proposed method we exploit the spatial and appearance information of
the objects to handle the object occlusion and object split problems. In the proposed
framework we compute the appearance model at a high resolution (lower level) to gain
more discriminative power for the object model, whereas other tasks such as background
modelling and subtraction, shadow suppression and correspondence establishment are
performed at a low resolution (higher level). By performing most of the tasks at low
resolution, we attain high immunity to noise and also reduce the computations.

[Block-diagram residue removed; the diagram comprises the blocks: Frame Capture, Complex Wavelet Transform, Background Modelling, Background Subtraction, Morphological Analysis, Extract Moving Objects, Compute Feature Vector for Candidate Objects, Correspondence Establishment, Occlusion Analysis, Multi-Object Tracking, Update Objects Database, Last Frame?, Finish Tracking.]

Fig. 2. The proposed multi-object tracking framework

3 Multi-object Tracking
Multi-object tracking algorithms should be able to establish unique correspondences
between objects in each frame of a video. The first component of the proposed
framework detects foreground pixels and forms isolated regions of connected foreground
pixels, which are called blobs. The next task is to establish a correspondence between
object instances over frames. The proposed framework is capable of tracking any number
of objects without any prior information for the object modelling.

332

A.S. Jalal and V. Singh

3.1 Object Appearance Modelling


Object modelling can be defined as finding an appropriate visual description that
makes the object distinguishable from other objects and from the background. In the
proposed method, each object is characterised by its spatial layout and appearance. A
histogram of wavelet coefficients is used to model the object appearance. Since, in the
tracking process, the boundary points of an object are usually prone to error due to
occlusion or interference from the background, we use the Epanechnikov kernel
function to smooth the probability density function, which assigns lower weights to
pixels farther from the centre [14]. This improves the robustness of the object model
by diminishing the influence of boundary clutter.
The histogram of a target can be computed as:

p = \{ p_k ;\; k = 1, \ldots, K \}

Where K represents the number of histogram bins. For each bin the discrete
probabilities are formulated as:

p_k = C_N \sum_{a=1}^{M} g_E( \| x_a \|^2 ) \, \delta( b(x_a) - k )

Where C_N is a normalization constant required to ensure that \sum_{k=1}^{K} p_k = 1,
\delta is the Kronecker delta, \{ x_a ;\; a = 1, \ldots, M \} are the pixel locations, M is the
number of target pixels, and b(x_a) is a function that maps the given pixel value to its
corresponding histogram bin. The symbol g_E(x) represents an isotropic kernel having
a convex and monotonic profile.
In order to make the appearance model more discriminative, we use complex
wavelet coefficients at a higher resolution to compute the weighted histogram. The
Bhattacharyya coefficient is used as a measure of similarity between two
appearance models [14].
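A simplified software sketch of this appearance model is shown below: a kernel-weighted histogram is built from (stand-in) coefficient magnitudes, with an Epanechnikov profile down-weighting pixels far from the patch centre, and two such histograms are compared with the Bhattacharyya coefficient. The bin count, patch shape and the use of raw magnitudes instead of complex wavelet coefficients are assumptions of the sketch.

```python
import numpy as np

def kernel_weighted_histogram(patch, n_bins=16):
    """Epanechnikov-weighted histogram of a square patch of coefficient magnitudes."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Normalized squared distance from the patch centre
    r2 = ((ys - (h - 1) / 2) / (h / 2)) ** 2 + ((xs - (w - 1) / 2) / (w / 2)) ** 2
    weights = np.clip(1.0 - r2, 0.0, None)          # Epanechnikov profile

    bins = np.minimum((patch / (patch.max() + 1e-12) * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=weights.ravel(), minlength=n_bins)
    return hist / hist.sum()                        # normalize so the bins sum to 1

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return float(np.sum(np.sqrt(p * q)))

a = np.abs(np.random.default_rng(0).normal(size=(32, 32)))
b = np.abs(np.random.default_rng(1).normal(size=(32, 32)))
print(bhattacharyya(kernel_weighted_histogram(a), kernel_weighted_histogram(b)))
```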
3.2 Detection of Object Correspondence

The aim of the Correspondence Establishment module is to associate foreground blobs
with objects that are already being tracked. In ideal cases a single object is mapped to
one connected blob. However, due to occlusions and splits, two or more different
objects may be assigned to a single connected blob, or a single object may appear as more
than one connected blob. To resolve the correspondence problem, we use a
correspondence matrix Cm showing the association between the foreground regions
extracted in the current frame and the objects successfully tracked in the previous frame.
In the correspondence matrix (Cm), the rows correspond to existing tracks in the
previous frame and the columns to foreground blobs in the current frame.
In the proposed approach we maintain a data structure named the object database
(DBobj) to keep track of information about the tracked objects. The stored information
is: identity (ID), area (A), centroid (C), minimum bounding box (MBB), appearance
model (OM), merge list (MList) and status (S). The status can be A (Active),
P (Passive), E (Exit) or M (Merge). MList consists of the IDs of the objects that are
involved in merging.
Suppose O_i^{t-1} represents the i-th tracked object in the (t-1)-th frame and B_j^t
represents the j-th blob in the t-th frame, where i = 1, 2, ..., M and j = 1, 2, ..., N. M
represents the number of objects already being tracked in the previous frame and
N represents the number of foreground blobs in the current frame.
The distance between blob B_j^t and object O_i^{t-1} can be defined as:

D_x = | C_x^{O_i^{t-1}} - C_x^{B_j^t} |  and  D_y = | C_y^{O_i^{t-1}} - C_y^{B_j^t} | .    (1)

Where C_x and C_y represent the X and Y components of the respective centroids.


The size of the correspondence matrix (Cm) is M x N and its entries can be defined as:

C_m[i, j] = 1,  if D_x < W_{O_i^{t-1}} + W_{B_j^t}  and  D_y < H_{O_i^{t-1}} + H_{B_j^t} ;
C_m[i, j] = 0,  otherwise.    (2)

Where W and H represent the width and height of the respective object and blob.
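The construction of Cm from Eqs. (1)-(2) can be expressed compactly as below; objects and blobs are represented only by their centroids and bounding-box sizes, and the data values are illustrative.

```python
import numpy as np

def correspondence_matrix(tracks, blobs):
    """Binary Cm following Eqs. (1)-(2).

    tracks, blobs: lists of dicts with keys 'cx', 'cy', 'w', 'h'.
    Cm[i, j] = 1 when blob j lies within the combined extents of object i.
    """
    cm = np.zeros((len(tracks), len(blobs)), dtype=int)
    for i, o in enumerate(tracks):
        for j, b in enumerate(blobs):
            dx = abs(o["cx"] - b["cx"])          # Eq. (1), x component
            dy = abs(o["cy"] - b["cy"])          # Eq. (1), y component
            if dx < o["w"] + b["w"] and dy < o["h"] + b["h"]:   # Eq. (2)
                cm[i, j] = 1
    return cm

tracks = [{"cx": 50, "cy": 60, "w": 20, "h": 40},
          {"cx": 200, "cy": 80, "w": 25, "h": 50}]
blobs = [{"cx": 55, "cy": 62, "w": 22, "h": 41},
         {"cx": 300, "cy": 90, "w": 18, "h": 35}]
print(correspondence_matrix(tracks, blobs))
# A row of zeros -> object exited/disappeared; a column of zeros -> new object.
```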
Thus correspondence matrix contains binary values. An entry 1 in the
correspondence matrix shows that there is an association between the corresponding
object ( O ) and blob ( B ). The analysis of correspondence matrix produces following
association events:
Active Track: A single blob B in current frame is associated to a single object O in
the previous frame, if the blob is isolated and not occluded. In such condition the
corresponding column and row in Cm have only one non-zero element. As soon as a
blob is declared as active track, the corresponding information in the DBobj is

updated.
Appearing or Reappearing: If a column in Cm has all zero elements then it shows
that the corresponding blob B cannot be explained by any of the existing object
hypotheses. Thus, B has to be a new region which is either caused by the entry of a
new object or the reappearance of one of the existing object. The existing object may
disappear from the scene for some time due to occlusion occurred by a background
object such as a pole or a tree. If the entry (appearance) of the region is from the
boundary of the image then it is treated as a new object, otherwise it might be an
existing object. If it is a case of existing object then the appearance feature of such
blob B is matched against the objects having a Passive status in DBobj . If a match

is found, the Passive status of corresponding object is replaced by an Active status


and object details are updated in DBobj . However, if no match is found, the blob is
treated as a new object. If a blob is detected as a new object, then its details are
added to DBobj and an Active status is assigned to it.
Exit or Disappear: If a row in Cm has all zero elements then it implies that the
hypothesis of corresponding object O is not supported by any of the foreground
blobs. Thus, O is either exited from the scene or disappeared for some time due to
occlusion occurred by a background object. If the O was near the boundary then it is
assumed to be an exit status, otherwise it is assumed that the O is disappeared for
some time. If blob is detected as an exit object then its status is updated as Exit in
DBobj . If it is the case of disappearing then the status is updated as Passive.
Merging: If a column in Cm has more than one non-zero entries. It implies that
multiple objects compete for a single foreground blob.
Splitting: If a row in Cm has more than one non-zero entries. It implies that a merged
object is splitted into its corresponding components.
3.2.1 Detecting and Correcting Occlusion
If a column in Cm has more than one non-zero entry, there are two possible causes:
a) multiple objects have merged together and form a single foreground blob; b) two or
more objects are in close proximity and satisfy Eq. 2. Merging is very common when
objects cross each other, stand together, etc. The first condition (merging or occlusion)
occurs when two or more objects come into close proximity of each other, i.e., the
minimum bounding boxes of the objects physically overlap in the frame. Thus, the
merging gives rise to a single foreground blob having an area significantly larger than
that of the corresponding objects. Suppose two objects OA and OB in
previous frame t-1 are occluded in the current frame t and give rise to a single blob
BM . In the proposed approach this BM is tracked as a new object and assumed to be a

mother object ( OM ) having two child objects OA and OB . This mother object is
added to DBobj and the ID of OA and OB are inserted in the MList of OM . The
status of OA and OB are also updated as M. This OM will be tracked in the
subsequent frames as an active track until it splits.
In the case of the second condition, where objects appear to be merged due to close
proximity, the blob is mapped to the object having the maximum similarity. This similarity
is based on appearance features, using the object model at high resolution.
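The merge bookkeeping described in this subsection (creating a mother object whose MList records its children, and marking the children as Merged) might be sketched as below; the dictionary-based object records and the ID scheme are illustrative, not the paper's actual data structures.

```python
def handle_merge(db, child_ids, blob, next_id):
    """Create a mother object for occluding children and update their status.

    db: dict mapping object ID -> record with 'status' and 'mlist'.
    child_ids: IDs of the objects competing for the same foreground blob.
    blob: the merged foreground blob (kept as the mother object's region).
    """
    mother_id = next_id
    db[mother_id] = {"status": "Active", "mlist": list(child_ids), "region": blob}
    for cid in child_ids:
        db[cid]["status"] = "M"        # children are marked as Merged
    return mother_id

def handle_split(db, mother_id):
    """On split, reactivate the children and release the mother object."""
    children = db[mother_id]["mlist"]
    for cid in children:
        db[cid]["status"] = "Active"   # identity re-established via appearance matching
    del db[mother_id]
    return children

db = {1: {"status": "Active", "mlist": []}, 2: {"status": "Active", "mlist": []}}
m = handle_merge(db, [1, 2], blob="merged_blob", next_id=3)
print(db)
print(handle_split(db, m), db)
```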
3.2.2 Detecting and Correcting Splits
A merged object can be split into several blobs during the segmentation process. There
are two possibilities for the merging of objects: a) the occlusion of two or more objects
during tracking, and b) two or more objects entering the scene in a group. If the merging
is due to occlusion, such as in the above example OM, then splitting produces the
corresponding child objects (OA and OB) and their identity is re-established
using the appearance features of objects. These child objects are then tracked as
existing objects. The objects details are updated in DBobj and the status is changed as
Active.
If merging is due to the group entry then splitting produces new objects. The
details of these new objects are added to the DBobj and an Active status is assigned.
After splitting the merge object ( OM ) is released from the DBobj .

4 Experimental Results
A qualitative evaluation of the proposed approach is carried out on four video
sequences. The first two sequences are from the PETS-2009 dataset, which provides a
challenging multi-object scenario. The third sequence, from the Hall Monitoring video,
presents the problem of noise due to indoor illumination; since several lighting sources
are present in the Hall Monitoring scenario, the target and background appearance is
significantly affected. The last video is recorded in an outdoor environment of our
institute campus. The image resolution is 760 x 576, 352 x 240 and 720 x 480 for the
PETS video, Hall Monitoring and Campus video, respectively.
The experiments start with the first set of PETS-2009 image sequences. The ID of the
object is labeled at the middle of the bounding box. A green label shows a new or
reappeared object, a white label shows an actively tracked object, whereas a yellow label
with a red bounding box shows a merged object. The top image in fig. 3 demonstrates
the trajectories of the objects on the image plane. Fig. 3(a) shows the start of the scenario
with seven objects. Fig. 3(c) illustrates that the object with ID#5 is occluded behind the
electric pole and disappears from the scene for a few frames. This object reappears in
fig. 3(d) and is correctly tracked by the proposed approach using the concept of object
permanence. In the meantime, the objects with ID#6 and ID#7 come very close and form
a single merged

Fig. 3. Snapshots from tracking results of the PETS 2009 image sequence 1


Fig. 4. Snapshots from tracking results of the PETS 2009 image sequence 2

Fig. 5. Snapshots from tracking results on Hall monitoring image sequence

object. Figs. 3(d-f) show that the proposed algorithm enables the tracking of objects
ID#6 and ID#7 during the merge period. Figs. 3(e-f) also show the partial occlusion of
an object due to a background component.
Another experiment was performed on the second set of PETS-2009 image
sequences, in which a crosswalk scene is analysed. Fig. 4(a) shows the tracked objects at
the start of the scenario. In fig. 4(b) occlusion takes place between object ID#2 and
object ID#3. Fig. 4(b) and fig. 4(d) illustrate that the proposed scheme detects and
corrects occlusion effectively even in the presence of the heavy occlusion shown in
fig. 4(c). Figs. 4(e-f) again show the effectiveness of the proposed scheme during heavy
occlusion.
The next experiment was performed on the Hall Monitoring image sequences. In
this scenario, the background colour distribution is similar to that of the trousers of the
first object, and this image sequence also suffers from noise caused by variations in the
illumination. This causes a single object to break into multiple segments during the
background subtraction process. However, the proposed framework shows its efficiency
in correcting these split segments, as shown in fig. 5. Also, since we perform most of
the tasks such as background subtraction and shadow suppression at a lower resolution,
the noise is greatly attenuated due to the effect of lowpass filtering.
The last experiment was performed on the Campus video, in which the objects are
moving away from the camera. Figs. 3-6 illustrate that our method handles the entry of
objects from any direction.

Fig. 6. Snapshots from tracking results on Campus image sequence

From the experimental results, we conclude that our method obtains satisfactory
results in tracking multiple objects and is successful in coping with the problems of split
and occlusion. However, it has not been validated in crowded scenes.

5 Conclusion
In this paper we have presented a Daubechies complex wavelet transform based
framework for tracking multiple objects, aiming to resolve the problems that occur due
to the presence of noise, occlusion and split errors. The appearance features of objects at
multiple resolution levels are used to resolve the problems of occlusion and split errors.
The experimental results obtained on four video sequences show that our approach can
successfully cope with interactions, occlusions and splits in challenging situations.

References
1. Mittal, A., Davis, L.: M2tracker: A Multi-view Approach to Segmenting and Tracking
People in a Cluttered Scene. International Journal of Computer Vision 51(3), 189203
(2003)
2. Smith, K., Gatica-Perez, D., Odobez, J.-M.: Using Particles to Track Varying Numbers of
Interacting People. In: Proceedings of the Conference on Computer Vision and Pattern
Recognition (2005)


3. Nillius, P., Sullivan, J., Carlsson, S.: Multi-target Tracking - Linking Identities using
Bayesian Network Inference. In: Proceedings of the Conference on Computer Vision and
Pattern Recognition, vol. 2, pp. 21872194 (2006)
4. Han, M., Xu, W., Tao, H., Gong, Y.: Multi-object Trajectory Tracking. Machine Vision
and Applications 18(3), 221232 (2007)
5. Joo, S.W., Chellappa, R.: Multiple-Hypothesis Approach for Multi-object Visual
Tracking. IEEE Transactions on Image Processing 16, 28492854 (2007)
6. Senior, A., Hampapur, A., Tian, Y.-L., Brown, L., Pankanti, S., Bolle, R.: Appearance
Models for Occlusion Handling. Journal of Image and Vision Computing 24(11), 1233
1243 (2006)
7. Rad, R., Jamzad, M.: Real Time Classification and Tracking of Multiple Vehicles in
Highways. Pattern Recognition Letters 26(10), 15971607 (2005)
8. Amer, A.: Voting-based Simultaneous Tracking of Multiple Video Objects. IEEE
Transactions on Circuits and Systems for Video Technology 15, 14481462 (2005)
9. Jalal, A.S., Tiwary, U.S.: A Robust Object Tracking Method Using Structural Similarity in
Daubechies Complex Wavelet Domain. In: Chaudhury, S., Mitra, S., Murthy, C.A.,
Sastry, P.S., Pal, S.K. (eds.) PReMI 2009. LNCS, vol. 5909, pp. 315320. Springer,
Heidelberg (2009)
10. Huang, Y., Essa, I.: Tracking Multiple Objects through Occlusions. In: Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, pp. 10511058 (2005)
11. Wang, Y., Doherty, J.F., Duck, R.E.V.: Moving Object Tracking in Video. In:
Proceedings of 29th IEEE Intl Conference on Applied Imagery Pattern Recognition
Workshop, pp. 95101 (2000)
12. Lina, J.-M.: Image Processing with Complex Daubechies Wavelets. Journal of
Mathematical Imaging and Vision 7(3), 211223 (1997)
13. Ugur, B., Enis, A., Aksay, A., Bilgay, M.A.: Moving object detection in wavelet
compressed video. Signal Processing: Image Communication 20, 255264 (2005)
14. Comaniciu, D., Ramesh, V., Meer, P.: Kernel-based Object Tracking. IEEE Transactions
on Pattern Analysis and Machine Intelligence 25(5), 564575 (2003)

SVM Based Classification of Traffic Signs for Realtime Embedded Platform
Rajeev Kumaraswamy, Lekhesh V. Prabhu, K. Suchithra, and P.S. Sreejith Pai
Network Systems & Technologies Pvt Ltd, Technopark, Trivandrum, India
{rajeev.k,lekhesh.prabhu,suchithra.k,sreejith.pai}@nestgroup.net

Abstract. A vision based traffic sign recognition system collects information about
road signs and helps the driver to make timely decisions, making driving safer and
easier. This paper deals with the real-time detection and recognition of traffic signs from
video sequences using colour information. Support vector machine based classification
is employed for the detection and recognition of traffic signs. The implemented
algorithms are tested in a real-time embedded environment. The algorithms are trainable
to detect and recognize important prohibitory and warning signs from video captured
in real time.
Keywords: traffic sign recognition, support vector machine, pattern classification,
realtime embedded system.

1 Introduction

Driver Assistance Systems (DAS) that help drivers to react to changing road conditions can potentially improve safety [1,2,3]. Computer vision based methods, which have the advantage of high resolution, can be employed to recognize road signs and detect lane markings, road borders and obstacles. The input is usually a video captured from a camera fixed on the vehicle. Automatic recognition of traffic signs is an important task for DAS. Traffic signs are standardized by different regulatory bodies and are designed to stand out in the environment. Moreover, signs are rigidly positioned and are set up in clear sight of the driver. These factors reduce the difficulty in designing recognition algorithms. Nevertheless, a number of challenges remain for successful recognition. Weather and lighting conditions can vary significantly in traffic environments. Additionally, as the camera is moving, motion blur and abrupt contrast changes occur frequently. The sign installation and surface material can physically change over time, influenced by accidents, vandalism and weather, resulting in rotated and degenerated signs. Another problem is occlusion by other objects such as trees.
Traffic sign detection algorithms commonly rely on the shape and colour of the traffic signs [3,4,5,6,7]. Shape based methods detect the signs using a set of predefined templates and hence are sensitive to total or partial occlusion and target rotation. Colour based methods detect signs in a scene using the pixel intensity in RGB or HSI colour spaces. Very few works reported in the literature deal with actual real-time embedded implementations. Goedeme

[13] has proposed some algorithms intended for real-time implementation on embedded platforms, but a real-time implementation has not been attempted. Souki et al. [14] propose an embedded implementation based on shape and colour, but the processing time is greater than 17 seconds. The only real-time embedded implementation reported so far is by Muller et al. [15], realized on a Virtex-4 LX100 FPGA. They classify 11 signs, out of which 7 are speed limit signs. They have adopted the classification algorithm from a previous work of the current authors [9].
This paper describes a general framework for the realtime embedded realization of detection and classification of red speed limit and warning signs from video sequences. We followed a model based approach, developing and validating the reference model in the MATLAB/Simulink environment. The real-time C/C++ implementation was developed based on autocode generated from the reference model. The algorithms are evolved from our previous work [9]. We describe the adaptations that were required for meeting the real-time constraints and limited resources of the OMAP processor. The paper is organized as follows: Section 2 focuses on the system overview. Experimental results are described in Section 3. Conclusions are drawn in Section 4.

2 System Overview

In this paper, we present a system for detection and recognition of traffic signs. The block level representation of the traffic sign detection and recognition system is shown in Figure 1.
The traffic sign detection and recognition system consists of three stages.
1. ROI Selection: In this stage the candidate blobs are segmented from the input frames by thresholding in RGB colour space. The extracted blobs are rotation corrected, cropped and resized to a size of 64x64.
2. Shape Classification: Blobs obtained from the colour segmentation process are classified according to their shape using a multiclass SVM.
3. Pattern Recognition: Blobs classified as circle or triangle are sent to the pattern recognition stage. This stage involves pattern segmentation, feature extraction and SVM based pattern classification.
2.1 ROI Selection

The first task involved is the segmentation of the traffic sign. Road signs are designed to stand out from the environment, so colour is the natural choice for segmentation. Different colour spaces can be employed for segmenting traffic signs. Hue in the HSI colour model is a very good representative for colour. We found that Hue and Saturation based schemes have better illumination invariance, but conversion from RGB to HSI is computationally expensive. So for the real-time implementation on the embedded platform, we have used the RGB colour space.


Fig. 1. Block diagram of the Traffic Sign Detection and Recognition system

Once the colour based segmentation is done, the image pixels are grouped together as connected components. As we expect multiple signs in some frames, we perform connected component labelling. Blobs not having a minimum size and aspect ratio are discarded. This eliminates most of the unwanted and noisy blobs. The limits for blob size and aspect ratio were empirically derived using standard road signs. A minimum blob size of 400 pixels and an aspect ratio between 0.6 and 2.5 have been used for selecting the candidate blobs.
The candidate blobs obtained may not be aligned with the horizontal axis. The rotation angle is calculated from the bottom Distance to Border (DtB) vectors [10] and the blobs are reoriented in a reference position. Once the rotation correction is done, the candidate blobs are cropped and resized to a size of 64x64.

2.2 Shape Classification Using SVM

The blobs obtained from the segmentation stage are classified in this stage according to their shape. In order to perform shape classification, a nonlinear multi-class SVM is employed.
1) Shape Feature Extraction: The first step in shape classification is to form feature vectors as input to the non-linear multi-class SVM. Many methods have been proposed for extraction of feature vectors [8,9,10]. We use DtB as the vectors for training the SVM. DtB is the distance from the external edge of the blob to its bounding box. Thus for a segmented blob we have four DtB vectors: left, right, top and bottom. Each DtB vector has a length of 64. The main advantage of this method is its robustness to several factors such as rotation and scale. This feature is invariant to rotations, because all blobs have been previously oriented in a reference position using the DtB vectors. The DtB vectors for left, right, top and bottom are concatenated and subsampled to a length of 64. Figure 2 shows the resampled DtB vectors for segmented triangular, circular and yield signs.
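A minimal sketch of how the DtB vectors described above could be computed for a 64x64 binary blob is given below; the function name and looping strategy are illustrative choices, not taken from the paper.

```python
# Sketch of Distance-to-Border (DtB) feature extraction for a 64x64 binary blob.
import numpy as np

def dtb_features(blob):
    """blob: 64x64 binary array (1 inside the sign, 0 outside)."""
    h, w = blob.shape
    left, right, top, bottom = [], [], [], []
    for row in blob:                         # distance from the bounding box to the blob
        cols = np.flatnonzero(row)
        left.append(cols[0] if cols.size else w)
        right.append(w - 1 - cols[-1] if cols.size else w)
    for col in blob.T:
        rows = np.flatnonzero(col)
        top.append(rows[0] if rows.size else h)
        bottom.append(h - 1 - rows[-1] if rows.size else h)
    dtb = np.concatenate([left, right, top, bottom]).astype(float)  # length 256
    return dtb[::4]                          # subsample to length 64, as in the paper
```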
2) Shape Classification: In this work three shapes, viz. circle, triangle and inverted triangle, are considered for classification. The non-linear SVM trained using the distance to border features enables the classification of a traffic sign. A detailed description of the training aspects of the non-linear SVM is given in Section 2.4.
2.3 Pattern Classification

Once the shape classification process is completed, the candidate blobs belonging to the circle or triangle class are sent to the pattern recognition stage. The inverted triangle traffic signs obtained in shape classification are considered directly as candidates for the YIELD sign: if the number of white pixels is above a preassigned threshold, the blob is classified as a YIELD sign. In order to perform pattern recognition of circular and triangular traffic signs, non-linear multi-class SVMs are employed.
1) Pattern Segmentation: The pattern is extracted from a virtual masked region within the segmented blob. This masked region is obtained from the left and right DtBs used in the shape detection. The top and bottom limits are manually chosen from prior knowledge of the region in which the pattern resides. The pattern is then obtained by thresholding the black region of the segmented intensity blob.
2) Feature Extraction: Projections and DtB features are used in the recognition of triangular signs, whereas DtB alone is used for recognizing circular signs. For triangular signs, the projection of the cropped pattern is found along the x axis and y axis. The x and y projections are both resampled to a length of 32 each and then concatenated to form the projection feature vector of length 64.

Fig. 2. (a) Segmented Circular Blob (b) Segmented Triangular Blob (c) Segmented Blob for Yield sign (d) Distance to Border for Circular Blob (e) Distance to Border for Triangular Blob (f) Distance to Border for Yield sign

The left and right DtBs are each resampled to 32 samples and concatenated to form the DtB feature vector. For triangles, the full feature vector is formed by concatenating the projection and DtB vectors. For the blobs classified as circular, a red area check is performed. If the total red area inside the blob is greater than a threshold, the blob is considered to be either a STOP sign or a DO NOT ENTER sign. In order to distinguish between the STOP sign and the DO NOT ENTER sign, we search for a continuous pattern of white pixels. If such a pattern exists, the blob is classified as DO NOT ENTER; otherwise it is classified as a STOP sign.
For circular signs other than STOP and DO NOT ENTER, the DtBs resampled to a length of 64 form the final feature vector. In the case of circular speed limit signs, the first digit alone is cropped and used in the feature extraction. Figure 3 shows the segmented blobs and the feature vectors used for training the multiclass non-linear SVM for pattern recognition.
3) Pattern Recognition: In the recognition stage, multi-class SVM classifiers with an RBF kernel are used. We have used two SVM classifiers: one for the circular signs and the other for the triangular signs. For the current real-time implementation, we have restricted the classification to 6 circular signs and 8 triangular signs. It is possible to include more signs without changing the present classifier structure. By extending this hierarchical classification, we can include signs with other colours also.

Fig. 3. (a) Red triangular segmented blobs and the corresponding extracted patterns (b) Red circular segmented blobs and the corresponding extracted patterns (c) Extracted features for red triangular blobs (d) Extracted features for red circular blobs

2.4 Training Non-linear SVM for Shape Classification and Pattern Recognition

We summarize here the details of the SVM classifiers and the training strategy.
1) Support Vector Machine: A Support Vector Machine is a machine learning algorithm which can classify data into several groups. It is based on the concept of decision planes, where the training data are mapped to a higher dimensional space and separated by a plane defining the two or more classes of data. An extensive introduction to SVMs can be found in [11]. The formulation of SVMs deals with structural risk minimization (SRM). SRM minimizes an upper bound on the Vapnik-Chervonenkis dimension, and it clearly differs from empirical risk minimization, which minimizes the error on the training data. For the training of SVMs, we have used the library LIBSVM [12].


2) Cross validation and Grid search: The accuracy of an SVM model is largely dependent upon the selection of the model parameters. There are two parameters, c and g, when using an RBF kernel in SVM classification: g is the kernel parameter gamma and c is the cost parameter. The value of c controls the tradeoff between allowing training errors and forcing rigid margins. Increasing the value of c reduces the misclassifications on the training data but tends to overfit the model. Hence an optimal value should be chosen for the parameter c. The cross validation procedure can prevent the overfitting problem. In v-fold cross-validation, the training set is first divided into v subsets of equal size. Sequentially, one subset is tested using the classifier trained on the remaining v-1 subsets. Thus, each instance of the whole training set is predicted once [12]. Grid search tries values of (c,g) across a specified search range using geometric steps and picks the values with the best cross validation accuracy. For shape classification, the (c,g) values used are (2, 0.00781). For pattern recognition, the (c,g) values used are (2, 0.25) for triangular signs and (2, 2) for circular signs.

3 Experimental Results

The algorithms developed were tested in real time on an embedded platform. The algorithms are trainable to detect and recognize important prohibitory and warning signs from video captured in real time. The test setup used for the real-time embedded application, shown in Figure 4, comprises:

Fig. 4. Test setup for Real-time Embedded application

1. Hardware: The application runs on the BeagleBoard (www.beagleboard.org), which is a low-cost, community supported development board from TI. It has an OMAP3530 processor. The traffic sign recognition (TSR) software runs on the ARM (Cortex-A8) core of the OMAP.


Fig. 5. Signs currently recognized by the embedded application

Fig. 6. Results obtained from the embedded platform: (a) Test input frame (b) Thresholded image (c) Segmented blobs of interest (d) Extracted patterns (e) GUI showing classification results

2. System software: The board runs the Angstrom Linux distribution (www.angstrom-distribution.org), which is built using the OpenEmbedded build system (www.openembedded.org). V4L2 (Video for Linux) is used for capturing the webcam video, and display uses the OMAP frame buffer (Video/OSD).
3. TSR application: This includes the algorithms for detection and recognition of traffic signs. The algorithms were modelled using MATLAB/Simulink. Autocode was generated using MATLAB Real-Time Workshop and then ported to the embedded platform. Platform specific code generation and code optimization have not been done for this application. Even without these optimizations, the application runs at near real time.
The input to the application is a video stream captured live from a camera connected to the board. The video from the webcam is captured at VGA resolution (640x480 pixels). These frames are processed by the TSR engine and at the same time displayed on screen. If a sign is detected, an icon image corresponding to the recognized sign is displayed on screen. Currently the processing speed of the TSR application is 5 fps. The application currently recognizes 17 different signs compliant with the Vienna Convention on Road Signs and Signals. The set of currently recognizable signs is shown in Figure 5. Some of the test results displayed by the embedded application are shown in Figure 6.
In the absence of extensive video footage for testing, we conducted testing on real-time video of printed signs held at various distances and orientations in front of the camera. The sizes of the printed signs were chosen to match the actual viewing angles in real road situations. This controlled testing environment allowed for flexible generation of test sets. The system tolerated in-plane rotations up to 20° away from horizontal and much larger rotations, up to 40°, away from the frontal pose. As an improvement over many other reported solutions, the recognition performance did not deteriorate all the way down to a 20 x 20 pixel size for the signs. The misclassification error is less than 3% over a test data set of 1700 images.

4 Conclusion

We have proposed a new hierarchical scheme for real-time detection and classification of traffic signs on an embedded platform. We have introduced low complexity algorithms for detection and feature extraction, suitable for real-time implementation. The algorithms were developed in the MATLAB/Simulink environment and the automatically generated C code was ported to the ARM core of the OMAP and tested with real-time video input. Without any further code optimization, a performance of 5 frames per second was achieved. Considering the fact that processing is not usually required for every frame, this frame rate is already nearly real time. The proposed scheme is a very good candidate for real-time realization of multiclass traffic sign recognition within the limited computing resources of embedded processors. With some modifications, the scheme is expected to be extensible to traffic signs following conventions other than the Vienna Convention.


References
1. de la Escalera, A., Moreno, L.E., Salichs, M.A., Armingol, J.M.: Road Traffic Sign Detection and Classification. IEEE Transactions on Industrial Electronics 44(6), 848–859 (1997)
2. de la Escalera, A., Armingol, J.M., Mata, M.: Traffic Sign Recognition and Analysis for Intelligent Vehicles. Image and Vision Computing 21, 247–258 (2003)
3. Fang, C., Chen, S., Fuh, C.: Road Sign Detection and Tracking. IEEE Transactions on Vehicular Technology 52(5), 1329–1341 (2003)
4. Miura, J., Itoh, M., Shirai, Y.: Towards Vision Based Intelligent Navigator: Its Concept and Prototype. IEEE Transactions on Intelligent Transportation Systems 3(2), 136–146 (2002)
5. Bascon, S.M., et al.: Road Sign Detection and Recognition Based on Support Vector Machines. IEEE Transactions on Intelligent Transportation Systems 8(2) (June 2007)
6. de la Escalera, A., Armingol, J.M., Pastor, J.M., Rodriguez, F.J.: Visual Sign Information Extraction and Identification by Deformable Models for Intelligent Vehicles. IEEE Transactions on Intelligent Transportation Systems 5(2), 57–68 (2004)
7. Liu, H., Liu, D., Xin, J.: Real Time Recognition of Road Traffic Sign in Motion Image Based on Genetic Algorithm. In: Proceedings 1st Int. Conf. Mach. Learn. Cybern., pp. 83–86 (November 2002)
8. Kiran, C.G., Prabhu, L.V., Abdu Rahiman, V., Kumaraswamy, R., Sreekumar, A.: Support Vector Machine Learning based Traffic Sign Detection and Shape Classification using Distance to Borders and Distance from Center Features. In: IEEE Region 10 Conference, TENCON 2008, November 18-21. University of Hyderabad (2008)
9. Kiran, C.G., Prabhu, L.V., Abdu Rahiman, V., Kumaraswamy, R.: Traffic Sign Detection and Pattern Recognition using Support Vector Machine. In: The Seventh International Conference on Advances in Pattern Recognition (ICAPR 2009), February 4-6. Indian Statistical Institute, Kolkata (2009)
10. Lafuente Arroyo, S., Gil Jimenez, P., Maldonado Bascon, R., Lopez Ferreras, F., Maldonado Bascon, S.: Traffic Sign Shape Classification Evaluation I: SVM using Distance to Borders. In: Proceedings of IEEE Intelligent Vehicles Symposium, Las Vegas, pp. 557–562 (June 2005)
11. Abe, S.: Support Vector Machines for Pattern Classification. Springer-Verlag London Limited, Heidelberg (2005)
12. Chang, C., Lin, C.: LIBSVM: A Library for Support Vector Machines (2001), http://www.csie.ntu.edu.tw/~cjlin/libsvm
13. Goedeme, T.: Towards Traffic Sign Recognition on an Embedded System. In: Proceedings of European Conference on the Use of Modern Electronics in ICT, ECUMICT 2008, Ghent, Belgium, March 13-14 (2008)
14. Souki, M.A., Boussaid, L., Abid, M.: An Embedded System for Real-Time Traffic Sign Recognizing. In: 3rd International Design and Test Workshop, IDT 2008 (December 2008)
15. Muller, M., Braun, A., Gerlach, J., Rosenstiel, W., Nienhuser, D., Zollner, J.M., Bringmann, O.: Design of an automotive traffic sign recognition system targeting a multi-core SoC implementation. In: Proceedings of Design, Automation and Test in Europe, Dresden, Germany, March 8-12 (2010)

A Real Time Video Stabilization Algorithm


Tarun Kancharla and Sanjyot Gindi
CREST, KPIT Cummins Infosystems Ltd.
Pune, India
{Tarun.Kancharla,Sanjyot.Gindi}@kpitcummins.com

Abstract. Jitter, or unintentional motion during image capture, poses a critical problem for any image processing application. Video stabilization is a technique used to correct images against unintentional camera motion. We propose a simple and fast video stabilization algorithm that can be used for real-time pre-processing of images, which is especially useful in automotive vision applications. Corner and edge based features have been used for the proposed stabilization method. An affine model is used to estimate the motion parameters using these features. A scheme to validate the features and a variant of the iterative least squares algorithm to eliminate outliers are also proposed. The motion parameters obtained are smoothed using a moving average filter, which eliminates the higher frequency jitter due to unintentional motion. The algorithm can be used to correct translational and rotational distortions arising in the video due to jitter.
Keywords: Video Stabilization, Corner detection, Affine Transform, Moving
average filter, Dolly motion.

1 Introduction
For vision based driver safety applications, a camera is mounted on the vehicle to capture continuous, real-time video. The uneven surface of the roads and mechanical vibrations of the vehicle during capture affect the quality of these videos. The distortions arising from such jitter make the videos unpleasant for viewing. Such motion of the camera also makes it difficult to process and extract important information from the images. Hence, the video needs to be corrected and stabilized against any unintentional movement of the camera.
Video can be stabilized using either hardware sensors or software techniques. Hardware sensors are usually expensive and have a limited range of correction, hence they are less preferred. Software techniques use image processing methods to estimate and compensate for the unintentional motion.
Over the past decade, a number of methods have been proposed to stabilize video
using image based methods. Any image based method used for stabilization consists
of 2 main steps: motion estimation and motion compensation. Different kinds of
feature extraction and matching methods have been used to obtain a match between
frames, for example, Block Matching [1], [5], SIFT [2] etc. Motion estimation is done
by comparing the features across subsequent frames and obtaining the parameters for
the motion models like translation or affine. The motion vectors which are obtained


due to local motion in the image are eliminated using RANSAC [3] or iterative least
squares algorithms [10]. IIR filters were used in [4] and Kalman filter and its
variations were used in [6], [9] to estimate the intentional motion; optical flow
techniques [7] have also been used to smoothen the motion vectors. The final step of
motion correction uses geometrical transformations to compensate for the
unintentional motion estimated by the motion estimation module. Fig. 1 illustrates the
generalized block diagram of a video stabilization algorithm.
While the most accurate methods in the literature use complex techniques that are also time consuming, the proposed algorithm uses simple techniques and is fast enough to be used for real-time applications. The smooth motion quality in the output video is favourable for viewing and for further processing in applications such as object segmentation and detection. The paper is organised as follows: we describe the feature extraction and feature matching scheme used for this algorithm in Section 2. In Section 3, motion estimation and compensation are discussed. Section 4 contains the results of experiments performed on different videos, and we present the conclusions in Section 5.

Fig. 1. Generalized block diagram for video stabilization

2 Feature Extraction and Matching


In order to compensate for the unintentional motion in a video, it needs to be estimated
using motion vectors. The motion vectors are calculated by matching features


between successive image frames in the video. Features are extracted in one frame
and compared with those in the preceding frame to obtain the correspondence. The jitter
or unintentional motion can be in the form of translational, rotational and scale
variations. Hence the features selected should be robust to all these variations. Image
frames also need to be enhanced, to enable better extraction of the selected features.
2.1 Corner Features
In this method, we use the Harris corner detector [8] to detect features in the image. A slight modification of the Harris corner detection technique is used to ensure a uniform distribution of corners across the image. Using Harris corner detection in its original form gives a very high number of corners in images or areas of an image containing details like multiple objects, people, trees etc., and very few corners in images or areas of an image containing plain regions like sky or water. In order to obtain a sufficient number and a uniform distribution of features across the image in any scenario, pre-processing is done to enhance the corners and they are detected in all four quadrants using an adaptive threshold.
A combination of cross correlation and the confidence value of the corner is used to match the corner features. In template matching, an image block of size N x N centered about the corner point in the present frame is used as a template to be matched with similarly obtained image blocks about the feature points in the previous frame. The best match is obtained by comparing the normalized cross-correlation values obtained by template matching against each image block, and a minimal threshold value is set to reject false matches. The following equation gives the value of the normalized cross-correlation and the minimum threshold condition:

    ρ = ( Σ_{i,j} x_{ij} · y_{ij} ) / ( sqrt(Σ_{i,j} x_{ij}²) · sqrt(Σ_{i,j} y_{ij}²) ) ≥ 0.8        (1)

In the above equation, ρ is the normalized cross-correlation value, N is the size of the image block, and x and y are the pixel values of the image blocks from the present and previous frames respectively. Templates are usually sensitive to rotation, scaling and illumination changes. However, it is safe to assume that such variations in subsequent frames are very small when the image block size is chosen carefully, hence template matching usually gives a proper match.
As the strength of the corner pixel does not change drastically in subsequent
frames, the matched image block is further validated by comparing the pixel intensity
(confidence) value of the feature extracted (corner).
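The sketch below illustrates matching a corner by normalized cross-correlation with the 0.8 threshold of Eq. (1). The block size, the search range and the use of OpenCV's matchTemplate are assumptions made for this example, not the authors' code; it also assumes the corner lies far enough from the image border.

```python
# Rough sketch of corner matching by normalized cross-correlation (threshold 0.8).
import cv2

def match_corner(prev_gray, curr_gray, corner_xy, N=15, search=20):
    x, y = corner_xy
    half = N // 2                         # assumes the corner is away from the border
    template = curr_gray[y - half:y + half + 1, x - half:x + half + 1]
    sx, sy = max(x - half - search, 0), max(y - half - search, 0)
    window = prev_gray[sy:sy + N + 2 * search, sx:sx + N + 2 * search]
    scores = cv2.matchTemplate(window, template, cv2.TM_CCORR_NORMED)
    _, best, _, loc = cv2.minMaxLoc(scores)
    if best < 0.8:                        # reject weak matches, as in Eq. (1)
        return None
    return (sx + loc[0] + half, sy + loc[1] + half)   # matched corner in previous frame
```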
2.2 Edge Based Template Matching
The extraction of corner points is time consuming as it gives a large number of features, and template matching then needs to be done for all the obtained features. In order to decrease the computation, edges are used instead of corners, as extracting edges requires far less computation than extracting corners. The image is smoothed as a pre-processing step to eliminate the edges that arise due to noise.


Canny edge detection is performed on the smoothed image to obtain the connected edges. The image is uniformly divided into image blocks (IB) of fixed size and each block is checked for the edges present. The equation to obtain the fraction of edge content in an IB is given below:

    e = ( Σ_{(i,j) ∈ IB} b(i,j) ) / |IB|,   where b(i,j) = 1 if (i,j) is an edge pixel, 0 otherwise        (2)

e is the fraction of edge pixels in the image block IB. The blocks are discarded if the fraction e is less than a minimum threshold. The centre point of the IB can be used as the feature point.
Since only a few matching features are sufficient for estimating the motion vectors, we can choose the required number of image blocks and perform template matching to identify corresponding image blocks in the previous frame. For each selected IB in the present frame, a corresponding search window enlarged by 2·d_x and 2·d_y pixels is selected from the previous frame, where d_x and d_y are distances in the x and y directions which depend on the maximum translation possible in the respective directions. The best match is obtained by comparing the normalized cross-correlation values.
The computation time for this method is minimal because template matching is
done only for the selected blocks, as compared to corner features where the template
matching is done for all the obtained feature points.
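A rough sketch of this edge-based block selection, assuming a Canny edge map, a fixed block size and an example edge-fraction threshold (none of these values are given exactly in the paper):

```python
# Sketch of edge-based block selection: smooth, Canny edges, then Eq. (2) per block.
import cv2
import numpy as np

def select_edge_blocks(gray, block=16, min_edge_fraction=0.2):
    smoothed = cv2.GaussianBlur(gray, (5, 5), 0)      # suppress noise edges
    edges = cv2.Canny(smoothed, 50, 150)
    centres = []
    h, w = edges.shape
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ib = edges[y:y + block, x:x + block]
            e = np.count_nonzero(ib) / float(block * block)   # fraction of edge pixels
            if e >= min_edge_fraction:
                centres.append((x + block // 2, y + block // 2))  # block centre as feature
    return centres
```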

3 Motion Estimation and Compensation


Once the features are extracted and matched, the co-ordinates of the matched features
from present and previous frames are obtained. A simple affine motion model is used
to estimate the motion between the frames. It is essential to eliminate the outliers
before estimating the motion vectors. There are two types of outliers: the ones due to
incorrect matching of the features and those due to local motion. By choosing an
appropriate template size and matching the selected blocks as mentioned in Section 2, the first type of outlier is avoided. To eliminate the second type of outliers, we use
a variant of iterative least squares [10]. We compute the mean of the error between the estimated positions (x'_k, y'_k) (obtained using the motion model from the previous frame) and the actual positions (x_k, y_k) of the feature points in the present frame:

    mean_error = (1/P) Σ_{k=1}^{P} || (x'_k, y'_k) - (x_k, y_k) ||        (3)

P is the total number of feature points selected. If the difference between the error of a feature point and the mean_error is greater than a threshold, then that feature point is discarded. This process successfully eliminates the outliers due to local motion.
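A minimal sketch of this outlier rejection, assuming the previous frame's affine model is available as a 3x3 matrix; the threshold value is an arbitrary example.

```python
# Sketch of outlier rejection: predict positions with the previous affine model,
# then drop points whose error deviates too much from the mean error of Eq. (3).
import numpy as np

def reject_local_motion(prev_pts, curr_pts, prev_affine, thresh=3.0):
    """prev_pts, curr_pts: (P, 2) arrays of matched coordinates; prev_affine: 3x3."""
    homog = np.hstack([prev_pts, np.ones((len(prev_pts), 1))])
    predicted = (prev_affine @ homog.T).T[:, :2]
    errors = np.linalg.norm(predicted - curr_pts, axis=1)
    mean_error = errors.mean()
    keep = np.abs(errors - mean_error) <= thresh      # discard deviating feature points
    return prev_pts[keep], curr_pts[keep]
```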
3.1 Motion Estimation
The relationship between the co-ordinates of the present and previous frames using the affine model is shown in the following equation:

    [ x_p ]     [ a11  a12  tx ]   [ x_{p-1} ]
    [ y_p ]  =  [ a21  a22  ty ] · [ y_{p-1} ]        (4)
    [  1  ]     [  0    0    1 ]   [    1    ]

x_p, y_p are the co-ordinates in the present frame and x_{p-1}, y_{p-1} are the co-ordinates in the previous frame. The parameters a11, a12, a21, a22 are responsible for rotation and scaling in the image; tx and ty account for translation in the x and y directions respectively. The rotation and scaling parameters are calculated using the following equations:

    S_x = sqrt(a11² + a21²),   S_y = sqrt(a12² + a22²)        (5)

    θ = arctan(a21 / a11)        (6)

S_x and S_y are the scaling in the x and y directions respectively and θ is the angle of rotation. The matched co-ordinates from successive frames are pooled to form a matrix:

    T = A · S        (7)

    A = T · S^T · (S · S^T)^(-1)        (8)

S represents the co-ordinates from the previous frame and T represents the co-ordinates from the present frame, each stacked as columns of the form [x, y, 1]^T. A is the affine matrix which contains the affine parameters. Six co-ordinates are sufficient to estimate the affine parameters; however, to account for any minor mismatches, multiple co-ordinates are used. The affine parameters are estimated using the pseudo-inverse method of Eq. (8). Since the pseudo-inverse involves only the inversion of a 3x3 matrix, the amount of computation is minimal.
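The pooled least-squares estimate of Eqs. (7)-(8) can be written in a few lines of NumPy; the sketch below also recovers the scale and rotation as in Eqs. (5)-(6). It is an illustration of the pseudo-inverse formulation, not the authors' code.

```python
# Least-squares affine estimation from pooled matched coordinates (pseudo-inverse).
import numpy as np

def estimate_affine(prev_pts, curr_pts):
    """prev_pts, curr_pts: (P, 2) arrays of matched points; returns (A, S_x, theta)."""
    P = len(prev_pts)
    S = np.vstack([prev_pts.T, np.ones((1, P))])   # 3 x P, previous-frame coordinates
    T = np.vstack([curr_pts.T, np.ones((1, P))])   # 3 x P, present-frame coordinates
    A = T @ S.T @ np.linalg.inv(S @ S.T)           # Eq. (8): only a 3x3 matrix is inverted
    sx = np.hypot(A[0, 0], A[1, 0])                # scaling in x, cf. Eq. (5)
    theta = np.arctan2(A[1, 0], A[0, 0])           # rotation angle, cf. Eq. (6)
    return A, sx, theta
```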
3.2 Motion Compensation
For a camera mounted on vehicle, the total motion obtained is a summation of the
intentional motion and motion due to jitter. Usually jitter is a high frequency
component compared to intentional motion and can be eliminated by filtering out the
high frequency components in the estimated motion vectors. We use a moving
average filter of length 30 (usually 1 sec assuming camera frame rate is 30fps) to filter
out the jitter and obtain an estimate of the intentional motion. The following equation
shows the calculation of the estimated motion in the horizontal direction using the moving average filter for frame p; this can similarly be applied to rotation and scaling.

    T_p = (1/k) Σ_{i=0}^{k-1} t_{p-i}        (9)

where T_p is the estimate of the intentional motion in the present frame, t_{p-i} is the motion vector obtained in the motion estimation step, the subscripts p and (p-i) denote the respective frame numbers, and k is the length of the moving average filter. The estimated intentional motion (T_p) is subtracted from the motion vector obtained using motion estimation to obtain the motion due to jitter. The compensation is done for the
jittery motion to obtain a stabilized video. The intentional motion estimated may not
be equal to the actual intentional motion, but the aim of video stabilization is to obtain
a video that is free from jittery motion and pleasing to the eye, rather than to exactly
track the intentional motion.
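A small sketch of the moving average smoothing of Eq. (9) and the jitter computation described above, assuming the per-frame motion parameter is available as a 1-D array. The resulting jitter values for translation, rotation and scale can then be passed to a geometric warp of each frame to produce the stabilized sequence.

```python
# Sketch of jitter removal: smooth the estimated motion with a moving average filter
# (Eq. (9)) and subtract it to obtain the high-frequency jitter component.
import numpy as np

def smooth_and_compensate(motion, k=30):
    """motion: per-frame values of one motion parameter (e.g. horizontal translation)."""
    motion = np.asarray(motion, dtype=float)
    intentional = np.empty_like(motion)
    for p in range(len(motion)):
        start = max(0, p - k + 1)
        intentional[p] = motion[start:p + 1].mean()   # average over the last k frames
    jitter = motion - intentional                     # component to be corrected
    return intentional, jitter
```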

4 Experimental Results
The algorithm has been tested by performing an informal subjective evaluation of the output corresponding to 20 videos taken under different test conditions. The test scenarios considered are jittery videos with dolly motion, with stationary and moving objects, videos having variations in illumination conditions, etc. Compared to corner based template matching, edge based template matching for video stabilization is much faster. It takes 40 to 50 ms to correct the unintentional motion in each frame, which corresponds to 20 to 25 fps (frames per second) on a 3.0 GHz Intel Pentium 4 processor using OpenCV software. The performance can be significantly improved if implemented on embedded boards, which are commonly used for vision based applications.

Fig. 2. Filtered motion vectors obtained using the moving average filter: (a) along the horizontal direction (b) along the vertical direction

In Fig. 2, we see that the motion vectors obtained after application of the moving average filter (thicker line in the figures) have fewer jitters and are much smoother than the estimated motion vectors. The filtered motion vector is free from jitter and the video thus obtained is smooth.
Fig. 3 shows the comparison of stabilized and original image sequences at different instances. The highlighted areas along the perimeter in the right-hand side of the images indicate the unintentional motion that is compensated for in that particular image frame. The highlighted area also gives an idea of the type of distortions, whether translational, rotational or scale, that are caused by jitter with respect to the previous frame. The sequence on the left side of Fig. 3(a) and 3(b) is the original sequence and the sequence on the right side is the stabilized sequence. The videos in Fig. 3(a) and 3(b) are taken in low light and bright light conditions respectively.

Fig. 3. Stabilized sequence of a video which undergoes dolly motion. The original sequence is on the left side and the stabilized sequence is on the right side of the images.

The images in Fig. 4(a) and 5(a) are obtained by overlapping 20 consecutive frames of the original video sequence, and the images in Fig. 4(b) and 5(b) are obtained similarly for the stabilized video sequence. Since there is dolly motion present in the video, we expect a motion blur when successive frames in the sequence are overlapped. However, the edge information should not vary drastically. The original sequence is affected by jitter along the x and y directions and due to rotation. Notice the highlighted portions in the images: it is difficult to identify the objects in the original image sequences of Fig. 4(a) and 5(a) due to excessive blurring, but they can clearly be identified in the stabilized image sequences of Fig. 4(b) and 5(b). Further confirmation can be obtained by comparing the edge maps of the original and stabilized image sequences. The edge maps in 4(d) and 5(d) are much more detailed than the edge maps in Figures 4(c) and 5(c).
The proposed algorithm does not give the expected results in scenarios where the number of features detected in the background is much smaller than in the foreground. Consider the example of a boat moving on water: the background consists of sky and water, there are very few corner points in the background, and the corner points obtained are mainly due to the boat. The motion estimated using these vectors gives the local motion of the boat and not the motion of the camera.


Fig. 4. The figures are obtained by overlapping 20 consecutive frames. (a) Original image sequence (b) Stabilized image sequence (c), (d) Corresponding edge maps.

Fig. 5. The figures are obtained by overlapping 20 consecutive frames. (a) Original image sequence (b) Stabilized image sequence (c), (d) Corresponding edge maps.


5 Conclusions
In this paper, we have presented a simple and computationally efficient video stabilization algorithm that is robust to distortions in translation and rotation. We estimate the global motion vectors and filter them to obtain a stabilized sequence. The accuracy of other methods known in the literature relies heavily on the complexity of the features used for matching, and as such, they give poor performance with respect to time and computation.
The speed and performance of this algorithm for stationary videos are excellent and suitable for use in real-time applications. The speed of the algorithm reduces slightly for videos containing intentional motion; however, it is acceptable for any practical case. When used as a pre-processing step of an object detection scheme, the detection accuracy can improve due to the stabilization. Also, the quality of the output is smooth and pleasing to view.

References
1. Vella, F., Castorina, A., Mancuso, M., Messina, G.: Digital image stabilization by adaptive block motion vector filtering. IEEE Transactions on Consumer Electronics 48(3) (August 2002)
2. Lowe, D.: Distinctive image features from scale-invariant key points. International Journal of Computer Vision 60(2), 91–110 (2004)
3. Fischler, M.A., Bolles, R.C.: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Comm. of the ACM 24, 381–395 (1981)
4. Jin, J.S., Zhu, Z., Xu, G.: A Stable Vision System for Moving Vehicles. IEEE Transactions on Intelligent Transportation Systems 1(1), 32–39 (2000)
5. Ko, S.J., Lee, S.H., Lee, K.H.: Digital image stabilizing algorithms based on bit-plane matching. IEEE Transactions on Consumer Electronics 44(3), 617–622 (1998)
6. Litvin, A., Konrad, J., Karl, W.C.: Probabilistic video stabilization using Kalman filtering and mosaicking. In: Proc. of SPIE Electronic Imaging, vol. 5022, pp. 663–674 (2003)
7. Chang, J., Hu, W., Cheng, M., Chang, B.: Digital image translational and rotational motion stabilization using optical flow technique. IEEE Transactions on Consumer Electronics 48(1), 108–115 (2002)
8. Harris, C., Stephens, M.: A combined corner and edge detection. In: Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 287–293 (May 2002)
9. Tico, M., Vehvilainen, M.: Robust Method of Videos Stabilization. In: EUSIPCO (September 2007)
10. Chang, H.C., Lai, S.H., Lu, K.R.: A robust and efficient video stabilization algorithm. In: ICME 2004: International Conference on Multimedia and Expo., vol. 1, pp. 29–32. IEEE, Los Alamitos (2004)

Object Classification Using Encoded Edge Based Structural Information

Aditya R. Kanitkar, Brijendra K. Bharti, and Umesh N. Hivarkar
KPIT CUMMINS INFOSYSTEMS LIMITED, Pune 411057, India
{Aditya.Kanitkar,Brijendra.Bharti,
Umesh.Hivarkar}@kpitcummins.com

Abstract. Gaining an understanding of the objects present in the surrounding environment is necessary to perform many fundamental tasks. Human vision systems utilize the contour information of objects to perform identification of objects and use prior learning for their classification. However, computer vision systems still face many limitations in object analysis and classification. The crux of the problem in computer vision systems is identifying and grouping edges which correspond to the object contour and rejecting those which correspond to finer details.
The approach proposed in this work aims to eliminate this edge selection and analysis and instead generate run length codes which correspond to different contour patterns. These codes are then useful to classify the various objects identified. The approach has been successfully applied for daytime vehicle detection.
Keywords: Object Classification, Discrete Haar Wavelet Transform, Contour
Pattern Detection, Run length Codes.

1 Introduction
The basic task required for any such vision system is recognition of the objects present in the surrounding environment. Human vision is a highly sophisticated system which has evolved over millions of years and handles this task with ease. But computer vision and hardware are still at a comparatively nascent stage. Hence suitable logic needs to be developed to compensate for the lack of sophisticated hardware.
From detailed studies and research performed, it has been observed that the strongest cue for identification and recognition of objects is the boundary, i.e., the contour of the object. The human vision system has highly developed cells and complex mechanisms for detection and grouping of bars, gratings and edges observed into object contours. Computer vision systems perform this task in an analogous manner. They identify possible edges of objects and use developed logic for identifying regions in the image which correspond to real world objects.
There is an inherent non-trivial task of filtering out edges corresponding to finer details and grouping of boundary edges. A major challenge faced in this task is preparing data structures and logical algorithms for automatic object detection for a variety of environmental scenes, poses and appearances [1].


In this work, it is proposed that instead of detecting edges and then grouping them, the structure of the objects can be inferred from a suitable distribution of edges. Hence filtering out weak edges, grouping strong edges and connecting them automatically can be avoided. Representation of the object structure for automatic parsing is handled by encoding the structure in run length binary codes. This method is tested for daytime vehicle detection in complex urban scenes.

2 Vehicle Detection in Daytime


Vision based Driver Assistance Systems (DAS) require a thorough understanding of
the surrounding environment to perform various functions efficiently. Hence, it is
necessary to perform detection of other vehicles present in the environment.
To detect a vehicle in the image is a complex process. This is mainly because the
vehicle class consists of a large number of different elements having various types of
shape, size, color and texture. Shadows, tires, lamp systems, are few to name.
Vehicle detection however can be considered as a two class problem with vehicle
versus non vehicle classes. There are many regions which exhibit similar features as
vehicles. So it is needed to classify vehicles in the selected regions.
Various other methods based on features and classifiers are presented in the available literature till date.
Matthews et al. [2] analyzed the Region of Interest (ROI) using Principal Components Analysis (PCA). The training images were scaled (20x20) and then subdivided into windows (25 of 4x4 size). Each window was then analyzed to extract PCA features. These features were then classified by Neural Networks (NN).
Goerick et al. [3] used the Local Orientation Code (LOC) to extract edge information of the ROI. The histogram of this LOC was then classified by Neural Networks.
Line and edge information have also been used as features via Gabor filters. The sets of values for different orientations and scales in the Gabor filter were then used for classification by learning algorithms [4].
Papageorgiou et al. [5] considered an over-complete set of Haar wavelet features and used Support Vector Machines (SVM) as classifiers. The key observation made was the improved performance compared to PCA or Gabor filter features.
Sun et al. [6], [7], [8] demonstrated that the magnitude of the Haar features was redundant and that quantized features had superior performance compared to the previous Haar features.
According to Wen et al. [9], however, changing illumination and surrounding conditions can affect the sign of the wavelet coefficients and cause intra-class variance. The authors proposed an improvement to the quantized features by using unsigned features.
Thus it is seen that the state-of-the-art approaches for vehicle classification involve features which are based on the edge information and structure of the vehicle. In these approaches, learning algorithms and training databases are an important factor in the classification. However, the deterministic nature of the edge structure provides strong visual cues for classification of vehicles. It is proposed that this deterministic


distribution can be used for classification, reducing the complex and time-consuming learning process.

3 Edge Feature Analysis Using DHWT


Edges can be defined as discontinuities in the acquired image. They are usually represented as the change in gradient of the pixel values.
But this discontinuity is also characteristic of the noise added to the image. The randomness of the discontinuity is used to separate noise and edges. Thus recognizing strong object edges while filtering out noise-like features as well as very weak edges due to illumination and material texture is a major challenge in edge detection [10].
3.1 Representation of Object Edge Properties Using DHWT
Wavelets are a well-known class of functions which represent signals by scaling and translating a base function. In mathematics, the Haar wavelet consists of square-shaped functions which together form a wavelet family or basis. The Haar wavelet basis functions are not continuous and hence are not optimal for representation of the original grayscale images. They can, however, be used to detect transitions in the signal, i.e., ultimately edges, with accuracy. So edge detection can be performed accurately with Haar wavelets [11].
In image processing, Haar wavelets are usually applied in the form of the Discrete Haar Wavelet Transform (DHWT). As this transform is calculated in the spatial domain, it has low time complexity (O(n^2)) compared to the 2D Fourier transform (O(n^2 log n)).

Fig. 1. An urban road image and its Haar Wavelet Decomposition

The transform can be represented as weighted sums and differences over the entire image. These weighted sums are termed the coefficients of the Haar wavelet basis functions. Haar wavelet coefficients are of high value where there are distinct edge features.
Haar wavelets are utilized for vehicle detection because they offer many advantages:

They are simple to implement with algebraic operations of averaging and differencing, with low time complexity (a single decomposition level is sketched below).
They encode edge information from multiple scales inherently. They also form an orthogonal basis and thus provide a non-redundant, compact representation of the image structure.

4 Contour Suppression
One of the problems with contemporary edge detectors is that they do not make a distinction between the contours of objects and edges originating from textured regions. Detection of edges is very much dependent on filtering out noise as well as unimportant edge features. So a biologically motivated technique is used to improve the edges detected using the Haar transform.
In human vision, the classical receptive field (CRF) is the region for which neural cells give maximum response for edges of specific orientation and size [12]. A cell stimulated in the CRF is also affected by the stimulus given to cells outside the defined CRF. The effect is inhibitive in nature and is referred to as non-classical receptive field (non-CRF) inhibition.
Using this inhibition, texture regions can be handled by edge detectors: the detector exhibits a strong response for contours and a weaker one for texture edges. So edges belonging to the boundary of a region can be separated from those belonging to its texture.
A centre-surround model is used for the suppression of edges. This model is based on neurons present in the retina. Essentially it consists of positive scaling factors at the centre which decrease towards negative values at the edges [13].

Fig. 2. 3d Surface for Centre Surround Inhibitor

Fig. 3. Suppressed Edges for Contour Detection


Thus we can observe that the kernel values are selected such that the centre of the
region is affected by a large positive scaling while the boundaries of the regions are
affected by a negative scaling. This negative scaling is the inhibitive effect observed.
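One common way to build a kernel with this centre-surround profile is a difference of Gaussians; the sketch below is only an assumed illustration (the sizes and the inhibition weight are arbitrary example values), not the kernel actually used by the authors.

```python
# A centre-surround kernel built as a difference of Gaussians: positive at the centre,
# decreasing towards negative values at the border (illustrative parameters only).
import cv2
import numpy as np

def centre_surround_kernel(size=15, sigma_c=1.5, sigma_s=4.0, inhibition=1.2):
    gc = cv2.getGaussianKernel(size, sigma_c)
    gs = cv2.getGaussianKernel(size, sigma_s)
    kernel = gc @ gc.T - inhibition * (gs @ gs.T)    # centre excitation minus surround
    return kernel / np.abs(kernel).sum()

# Convolving the Haar edge map with such a kernel attenuates responses in densely
# textured areas while keeping isolated object contours.
```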

5 Run Length Encoding


As we observe, acquiring an ideal contour is extremely difficult with current edge detectors. Various specific techniques for noise removal and edge detection have to be applied in order to generate the true contour. However, obtaining the exact contour is not of much importance compared to obtaining the structure of the object for tasks like object classification and recognition.
Generally, prominent edges are converted to a binary pattern with selected points marked as 1 and the rest as 0. This pattern of 1s and 0s represents the structure of the scene. In ideal cases various measures can be utilized to identify the binary patterns. These can be texture based, transform based and so on.
Run length encoding forms a good feature as it encodes the structure of the object in just a few numeric values. Long lines and edges are represented by higher values, while noise and distortions are encoded as short transitions.
So the binary edge image obtained is converted to a 1D string using concatenated column or row values. This string is processed using run length encoding with ones and zeros represented as unique weights. Each run of zeros is expressed as a negative value and each run of ones as a positive value. This code represents the final region.
The advantages of run length encoding are that:
It remains unchanged even if the image is scaled down.
It is the same for images translated in the horizontal as well as vertical directions.
It can also be smoothed to reduce distortions from the image binarization.
It is faster compared to texture representation.
For example, 010011000111 = 1Z 1O 2Z 2O 3Z 3O, with Z as zeros and O as ones, so the run length code = [-1, 1, -2, 2, -3, 3].
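A direct sketch of this signed run length encoding, reproducing the example above (the helper name is illustrative):

```python
# Run length encoding of a binary string into signed runs
# (runs of ones positive, runs of zeros negative).
from itertools import groupby

def run_length_code(bits):
    """bits: iterable of 0/1 values, e.g. a flattened binary edge image."""
    return [len(list(g)) if key == 1 else -len(list(g)) for key, g in groupby(bits)]

# run_length_code([0,1,0,0,1,1,0,0,0,1,1,1]) -> [-1, 1, -2, 2, -3, 3]
```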

6 Structure Encoding in Run length Code


Image reshaping is done by taking all the pixel elements in a column and concatenating them into the one-dimensional array being formed; this is repeated for all columns. The one-dimensional array thus formed is then converted to a run length code.
For use in classification, the code is shifted by the maximum run of zeros and normalized so that the data plot lies in the [0, 1] range.

    [1 0 0]
    [0 1 0]  =  { [1, 0, 1], [0, 1, 0], [0, 0, 1] }
    [1 0 1]

To illustrate, plots for some common Euclidean structures are shown in Fig. 4. The bars indicate the normalised run value for each element of the code.


Fig. 4. Run length Code Plots for Regular Euclidean Structures

Thus, the run length code for regular shapes follows a deterministic trend. This fact is used in detecting vehicles in daytime, where removing false detections after segmentation is one of the challenging tasks. The run length code for potential vehicle regions follows a deterministic pattern, unlike that of false vehicle regions.
The following sections elaborate on the approach of using this variance value for classification in daytime vehicle detection.

7 Classification
The edges detected in the image carry the structural information of the objects. Contour suppression logic is added in the approach to improve the accuracy of the detected edges. Then the edge pixels are scaled to a binary format by thresholding.
After this edge detection, the pixel coefficients are used as features for learning algorithms. In the approaches outlined earlier, an extensive training database and algorithms such as SVM and NN are used for classification.
It is hypothesized that it is possible to represent the edge pixels as structural information using some features and to separate the classes by simple linear thresholding. To verify this hypothesis, a series of experiments with different approaches was performed as follows.
7.1 Profile Based Classification
The structural information itself can also act as a feature for classification by using the row-wise dimensional profile of the edge pixels. This profile is generated by summing the pixel values row-wise. The deterministic nature of this profile can be used to separate the classes. This nature can be quantified in the form of the variance of the profile.
7.2 Run Length Based Classification
It is observed that discontinuities in the rows at the column level are lost in dimensional profiling. So the pixels are represented as runs of 1s in a particular row. This was done to represent the structure more accurately. The deterministic nature of the run length code is quantified in the form of the scatter obtained in the data points of the code.
The forward difference of the run length code is obtained to remove the scaling in intensity levels as well as to obtain the transitions in runs. The forward difference is a 1-D array of length equal to that of the obtained run length code. This array is taken as the data set and the variance of the data set is calculated. It is observed that this variance value proves capable of differentiating non-vehicles and vehicles efficiently.
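A compact sketch of this run-length-variance classifier is given below. The threshold is a parameter to be learned from training data, and the direction of the comparison (vehicles assumed to give the lower scatter) is an assumption of the example, not a detail stated in the paper.

```python
# Sketch of the run-length-based classifier: column-wise flattening, signed run length
# code, forward difference, variance, then a linear threshold.
import numpy as np
from itertools import groupby

def classify_candidate(edge_binary, threshold):
    """edge_binary: 2-D array of 0/1 edge pixels for a candidate region."""
    bits = edge_binary.T.flatten()                        # column-wise concatenation
    code = np.array([len(list(g)) if k == 1 else -len(list(g))
                     for k, g in groupby(bits)], dtype=float)
    diff = np.diff(code)                                  # forward difference of the code
    # Assumption: regular (vehicle-like) structure gives the lower scatter of the two classes.
    return "vehicle" if diff.var() <= threshold else "non-vehicle"
```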
7.3 Comparative Analysis
To evaluate the proposed approach against the previous state-of-the-art approach, a comparative analysis was done between the run length code approach and the approach using SVM and level scaling (3rd approach) as proposed by Wen et al. [9].
Here, the comparison parameters were the accuracy tradeoff and the reduction in time complexity and effort.
7.4 Performance Evaluation
Classification of true and false vehicles is usually performed over a test database. Training is done using a small subset to select various parameters such as thresholds. The performance measures considered for classification are the false classification ratio and the true classification ratio for vehicle candidates.

    CR_t = N_t / N_i+        (1)

    CR_f = N_f / N_i-        (2)

where CR is the classification ratio, N is the number of labels identified as belonging to a particular class, and N_i is the number of labels predefined as belonging to a particular class.

8 Experimental Results
The training set consists of 20 vehicles and 20 non-vehicles. A test dataset of 100 vehicles and 100 non-vehicles was collected in varying daytime conditions. This dataset was used for classification of vehicles and non-vehicles.
8.1 Profile Based Classification
The randomness of the dimensional profile obtained as described earlier is used to classify the vehicle regions. This randomness can be found by using the spread of the profile data points and the corresponding variance values. A suitable threshold is selected for separation of the two classes using the training data. This threshold is the mean value of the range of thresholds available.


Table 1. Performance Analysis of Profile based Classification

Class          N     Ni    CR    Accuracy in %
Vehicle        100   74    74    74
Non-Vehicle    100   77    77    77

8.2 Run Length Code Based Classification

As described earlier, the difference curve of the run length code is used as a feature. The variance of the data points is used as the measure, and a suitable threshold is selected for separation of the two classes using the training data. The threshold is set at the mean of the data set obtained.
Table 2. Performance Analysis of Run length Code based Classification

Class          Ni    N     CR    Accuracy in %
Vehicle        100   89    89    89
Non-Vehicle    100   72    72    72

Thus it is observed that if the value of the threshold is set to the median of the range of available linear threshold values, there is an increase in the vehicle detection accuracy of almost 15%. This is observed, however, for a small test data set.
8.3 Comparative Analysis
The approach followed by Wen et al. using SVM is compared with regard to the improvement in time complexity. It is observed that training and testing the same database with the SVM increases the time complexity compared to our approach.
The SVM comparison is done using the SVM and Kernel Methods Matlab Toolbox, which is implemented entirely in Matlab. The simulation was done on an Intel Core 2 Duo processor with 2.19 GHz processing speed and 3 GB RAM. The time measured is on the basis of the standard CPU time scale.
Table 3. Comparative Analysis based on SVM

Classification Approach    Time Required for Training + Testing
Profile based              0.0030 seconds
Run length based           2.6186 seconds
SVM based                  8.1191 seconds

The accuracy of the proposed approach is lower than that of the SVM, with the SVM over-fitting at 100% accuracy on the small data set.


8.4 Parametric Analysis

The approaches outlined above depend on the coefficients obtained from edge detection. While converting to the binary format, a suitable threshold has to be selected, so the features extracted, and subsequently the classification, depend on this threshold.

Fig. 7. Classifier characteristic

9 Conclusions and Summary


Classification of Objects is needed for performing various tasks in computer vision
systems. Current state of the art methods utilize SVM, ADABoost and other learning
algorithms. Substantial efforts need to be taken for feature extraction and selection
using approaches like Dimensionality reduction, PCA and ICA. Efforts are also
needed for creating data sets and training using neural networks.
The proposed method eliminates this need for training as it is a mathematics-based approach. It is observed that structural encoding will be beneficial for fast real-time systems as it virtually eliminates the grouping of edges and contour formation. The structural information is deterministic in nature and can also be used as features for classification using dimensional profiles.
It is also proposed that run-length coding of the structural information will increase the efficiency of the classification as it represents the structure more accurately. Features extracted as run-length code are virtually invariant, as they are robust to changes in scale and linear translation as well as to illumination conditions, being binary in nature.
The approach is tested for daytime vehicle detection in a vision-based Driver Assistance System, and it shows promising results, improving vehicle detection by 15% compared to profile-based approaches through the use of run-length coding.
One of the limitations is that the approach depends on the thresholding performed and the creation of the binary string pattern. Robustness is increased in the run-length code as compared to other texture-based feature extraction methods.
Future scope of development in the approach includes the addition of adaptive binary thresholding logic as well as increased efficiency of edge detection.


References
1. Basu, M.: Gaussian-based edge-detection methods: A survey. IEEE SMC-C 32, 252–260 (2002)
2. Matthews, N.D., An, P.E., Charnley, D., Harris, C.J.: Vehicle detection and recognition in greyscale imagery. Control Eng. Practice 4(4), 473–479 (1996)
3. Goerick, C., Detlev, N., Werner, M.: Artificial neural networks in real-time car detection and tracking application. Pattern Recognition Letters 17, 335–343 (1996)
4. Sun, Z., Bebis, G., Miller, R.: On-road vehicle detection using Gabor filters and support vector machines. Digital Signal Processing, 1019–1022 (2002)
5. Papageorgiou, C., Poggio, T.: A trainable system for object detection. International Journal of Computer Vision 4(4), 15–33 (2000)
6. Sun, Z., Bebis, G., Miller, R.: Quantized wavelet features and support vector machines for on-road vehicle detection. In: 7th International Conference on Control, Automation, Robotics and Vision, vol. 3, pp. 1641–1646 (2002)
7. Sun, Z., Bebis, G., Miller, R.: On-road vehicle detection using optical sensors: a review. In: IEEE International Conference on Intelligent Transportation Systems, pp. 585–590. IEEE Press, Washington, DC (2004)
8. Sun, Z., Bebis, G., Miller, R.: Monocular precrash vehicle detection: features and classifiers. IEEE Transactions on Image Processing (2006)
9. Wen, X., Yuan, H., Yang, C., Song, C., Duan, B., Zhao, H.: Improved Haar wavelet feature extraction approaches for vehicle detection. In: Proceedings of the 2007 IEEE Intelligent Transportation Systems Conference, Seattle, WA, USA, September 30-October 3 (2007)
10. Canny, J.F.: A computational approach to edge detection. IEEE PAMI 8(6), 679–698 (1986)
11. Mallat, S.: A Wavelet Tour of Signal Processing
12. Grigorescu, C., Petkov, N., Westenberg, M.A.: Contour detection based on non-classical receptive field inhibition. IEEE Trans. on Image Processing, 729–739 (2003)
13. Papari, G., Campisi, P., Petkov, N., Neri, A.: A multiscale approach to contour detection by texture suppression. In: SPIE Image Proc.: Alg. and Syst., San Jose, CA, vol. 6064A (2006)
14. Canu, S., Grandvalet, Y., Guigue, V., Rakotomamonjy, A.: SVM and Kernel Methods Matlab Toolbox. In: Perception Systèmes et Information, INSA de Rouen, Rouen (2005)

Real Time Vehicle Detection for Rear and Forward Collision Warning Systems
Gaurav Kumar Yadav, Tarun Kancharla, and Smita Nair
CREST, KPIT Cummins Info systems Ltd.
Pune, India
{Gaurav.Yadav,Tarun.Kancharla,Smita.Nair}@kpitcummins.com

Abstract. Vehicle detection module is an important application within most of


the driver assistance systems. This paper presents a real-time vision based
method for detecting vehicles in both rear and forward collision warning
systems. The system setup consists of a pair of cameras mounted on each lateral
mirror for monitoring rear collisions, whereas camera for forward monitoring is
placed on the dashboard. The proposed algorithm selects ROI based on the road
lane marking. Two separate modules are functional, one for detecting vehicles
in the forward path and the other for passing-by vehicles. Profiling and edge
detection techniques are used to localize forward path objects. The passing
vehicles are detected by temporal differencing. The detected vehicles are
tracked in the subsequent frames using mean-shift based tracking. Experiments
performed on different road scenarios show that the proposed method is robust
and has real-time performance.
Keywords: Rear Collision, Forward Collision, Profiling, Vehicle geometry,
Vehicle Detection.

1 Introduction
The major challenge in road transportation is to increase the safety of the passengers.
A survey on the vehicle accidents statistics [3] predicts 10 million injuries each year.
Amongst these, rear-end collisions and forward collisions are most common types of
road accidents, wherein the major threat to the driver is due to other vehicles. Vehicle
detection and tracking find a major application in all collision avoidance systems.
Vehicle detection can be accomplished either by hardware sensors like radar or
laser, or by vision based software methods. Hardware sensors such as laser and radars
are very expensive and cannot be used in low-end vehicles. Ultrasound sensors are
cost effective but their application is restricted due to the limited detection range.
A number of vision-based techniques have been used over the past few years to
detect vehicles in various road scenarios. Vision based methods used for vehicle
detection can be categorized based on hypothesis generation or hypothesis
verification [1]. K. Lim, L. Ang, K. Seng and S. Chin present a comparative study of a few vehicle detection techniques in [1]. The study shows that some methods are symmetry based, but symmetry estimation is sensitive to noise. Shadow based vehicle


detection does not provide a systematic way to choose proper threshold and could be
affected due to illumination variations. Other methods based on texture, motion,
entropy analysis, stereo vision etc. are computationally expensive. However, the
presence of over bridge, flyover roadways, and signboards may decrease the
performance of the above-mentioned techniques. A couple of methods use profiling, optical flow and edge detection for detecting vehicles [4], [2].
N. Matthews, P. An, D. Charnley, and C. Harris [6] used edge detection to find strong vertical edges to localize the left and right position of a vehicle. The left and right position of a vehicle is estimated by finding the local maximum peaks of the vertical profile.
Most of the mentioned methods use classifiers after vehicle segmentation, which
increases the computation time and sometimes classifies vehicle as non-vehicle.
In the proposed work, videos are captured from a moving car for both rear and
forward collisions. The captured videos are analysed for detecting forward path and
passing-by vehicles. The ROI is selected based on lane detections and using the
concept of vanishing point. The vehicle regions are localized using profiling. Once the probable regions are detected, further processing based on vehicle geometry such as vehicle base, aspect ratio etc. removes false detections. The method is robust in detecting vehicles under normal daylight highway conditions. Since classifiers are not used, it provides real-time performance.
The proposed work is presented as follows. Section 2 provides the algorithm
details for detecting forward vehicles. Section 3 provides algorithm details for
detecting passing vehicles followed by tracking module in section 4. The experiments
and results are summarized in section 5 followed by conclusion in section 6.
Fig. 1 illustrates the block diagram for the proposed vehicle detection algorithm.

Fig. 1. Block diagram for vehicle detection


2 Algorithm Description - Forward Vehicles


The proposed vehicle detection algorithm can be used for both rear and forward collision warning systems. The algorithm is explained with reference to forward vehicle detection; the same algorithm can be used for rear-collision detection with a little change in the ROI (Region of Interest) selection. Further sections provide insight into the developed algorithm.

Fig. 2. (a) Original image, (b) Region of interest

2.1 ROI Selection

For forward vehicle detection case, it is assumed that vehicles are present only
between the end lanes of the road and below the vanishing point, original image and
region of interest is shown in Fig. 2(a) and 2(b) respectively. The lanes are detected
using Hough transform [5], applied on the canny image. Hough transform provides
multiple lanes and needs further analysis to extract the required region. The outer
most lanes are selected based on the lane slope. Analysis showed that if the slope is
selected varying from 5 to 175 degrees, the required lanes can be extracted. The result
of the extracted lane is presented in Fig. 3(a). Based on the extracted lanes, the
vanishing point is computed and the required ROI selected. The selected ROI is as
shown in Fig. 3(b). In the case of rear-end systems, the lane detection is done for only one side using the above-mentioned procedure.
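A minimal OpenCV-based sketch of this lane-extraction step is shown below; the Canny and Hough parameter values are assumptions for illustration, not the authors' settings.

```python
import cv2
import numpy as np

# Illustrative sketch: probabilistic Hough transform on the Canny edge image,
# keeping only lines whose slope lies between 5 and 175 degrees.
def detect_lane_lines(gray_frame):
    edges = cv2.Canny(gray_frame, 100, 200)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                            minLineLength=40, maxLineGap=10)
    lanes = []
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            angle = np.degrees(np.arctan2(y2 - y1, x2 - x1)) % 180
            if 5.0 <= angle <= 175.0:      # discard near-horizontal lines
                lanes.append((x1, y1, x2, y2))
    return lanes
```

The outer-most of the retained lines can then be intersected to estimate the vanishing point and crop the ROI below it.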


Fig. 3. (a) Lane detection, (b) outside lane region removed


2.2 Profiling
Edge based profiling is performed on the selected ROI. Foremost, the horizontal and
vertical edge detection is performed on the ROI region using sobel operator. The
obtained edge image consists of edges due to vehicles and some noise edges due to
lanes and irregularities on the road. The false edges are discarded based on their
lengths using morphological opening functions. A threshold edge map is created for
prominent edges. To compute the edge profile, we sum up the edges row-wise in the horizontal edge image and column-wise in the vertical edge image using Eq. 1 and Eq. 2 respectively, where h and v are the horizontal and vertical projection vectors. A large value of v_i indicates pronounced vertical edges along V(x_i, y, t), and a large value of h_j indicates pronounced horizontal edges along H(x, y_j, t). A threshold is set for


selecting large projection values in each direction. Combined horizontal and vertical
edge image after profiling is shown in Fig. 4. Notice that the lane markings are not
obtained in Fig. 4, as they do not satisfy the threshold condition for profiling in both
horizontal and vertical directions.
h = (h_1, h_2, ..., h_n), where h_j = Σ_{i=1..m} H(x_i, y_j, t) .    (1)

v = (v_1, v_2, ..., v_m), where v_i = Σ_{j=1..n} V(x_i, y_j, t) .    (2)

for an ROI with m columns x_1, ..., x_m and n rows y_1, ..., y_n.

Fig. 4. Vertical and Horizontal edges after profiling
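A compact sketch of this profiling step is given below; the Sobel kernel size, the binarisation rule and the projection threshold are assumptions for illustration only.

```python
import numpy as np
import cv2

# Illustrative sketch: Sobel edge maps on the ROI followed by the row/column
# projection profiles of Eqs. (1)-(2).
def edge_profiles(roi_gray, profile_thresh=0.5):
    h_edges = np.abs(cv2.Sobel(roi_gray, cv2.CV_64F, 0, 1, ksize=3))
    v_edges = np.abs(cv2.Sobel(roi_gray, cv2.CV_64F, 1, 0, ksize=3))
    h_bin = (h_edges > h_edges.mean()).astype(np.uint8)
    v_bin = (v_edges > v_edges.mean()).astype(np.uint8)

    h = h_bin.sum(axis=1)   # Eq. (1): one value per row  y_j
    v = v_bin.sum(axis=0)   # Eq. (2): one value per column x_i

    # Keep only pronounced projections (fractional threshold is an assumption).
    strong_rows = np.where(h > profile_thresh * h.max())[0]
    strong_cols = np.where(v > profile_thresh * v.max())[0]
    return strong_rows, strong_cols
```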

2.3 Grouping of Edges


The output of profiling gives edges belonging to the vehicle; as can be seen from Fig. 4, there are multiple edges that belong to the same car. The edges are grouped to form image blocks which contain vehicles. In some cases, the top of the vehicle may be cut off due to the ROI limitation. To obtain the complete vehicle, the length of the box is extended based on the width of the horizontal and vertical edges obtained. The obtained output is as shown in Fig. 5.
2.4 False Detections
To remove false detections due to poles, signboard, markers on the road, and due to
the misdetection of lanes, once again we perform edge detection on each separate box


Fig. 5. Detected Vehicle

detected. Horizontal edge detection is used to obtain the edge due to the base of the vehicle, as shown in Fig. 6. A square window is considered around the horizontal edge and the number of non-zero pixels for the same image block is checked in the Canny edge image; if the percentage of non-zero pixels is more than a predetermined threshold, the block is retained, else discarded. This procedure helps to eliminate false detections due to other objects like poles or signboards. The detected objects are retained on the basis of their aspect ratio and maximum/minimum areas. This further reduces false detections.
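The sketch below illustrates this verification step; the density and aspect-ratio thresholds are assumed values, not those used by the authors.

```python
import numpy as np
import cv2

# Illustrative sketch: keep a candidate box only if the Canny edge density
# around its base exceeds a threshold and its aspect ratio is plausible.
def is_valid_vehicle_box(gray, box, density_thresh=0.15,
                         min_aspect=0.6, max_aspect=1.6):
    x, y, w, h = box                      # candidate region (pixels)
    roi = gray[y:y + h, x:x + w]
    canny = cv2.Canny(roi, 100, 200)

    # Square window over the lower part of the box (vehicle base).
    side = min(w, h)
    base = canny[h - side:h, :side]
    density = np.count_nonzero(base) / float(base.size)

    aspect = w / float(h)
    return density > density_thresh and min_aspect <= aspect <= max_aspect
```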

Fig. 6. Horizontal edge detection to remove false detection

3 Algorithm Description - Passing Vehicles


In case of forward detection, it is not possible to detect passing vehicles using the
above mentioned method since the vehicle geometry of the passing vehicle would be
different from the in-path vehicle. Assuming that the passing vehicle moves with
certain velocity, the temporal difference of images provides significant information. In the obtained video, consider the temporal difference of images, i.e., differencing the current frame j from an earlier frame k using Eq. 3, where R is the region of interest and the resulting sum is compared against a fixed threshold. This gives the brightness difference in that region if there is a passing vehicle; if there is no passing vehicle, the brightness difference is zero or
very less. Both left and right side ROIs are selected and subtracted from the previous
frame. The sum of the difference region is compared against a threshold value and a
rectangular box is drawn in that region as shown in Fig. 7. Due to overlap of certain
regions in both algorithms, multiple boxes would be visible for the same passing
vehicles. In that case, the box having more overlap ratio is retained.


Fig. 7. Passing Vehicle Detection

Σ_{(x, y) ∈ R} | I_j(x, y) − I_k(x, y) | .    (3)
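A minimal sketch of this temporal-differencing test is shown below; the threshold value is an assumption used only for illustration.

```python
import numpy as np

# Illustrative sketch: flag a passing vehicle when the summed absolute frame
# difference inside the side ROI (Eq. 3) exceeds a fixed threshold.
def passing_vehicle_present(frame_j, frame_k, roi, thresh=5e4):
    """frame_j, frame_k: grayscale frames as 2-D arrays; roi: (x, y, w, h)."""
    x, y, w, h = roi
    cur = frame_j[y:y + h, x:x + w].astype(np.int32)
    prev = frame_k[y:y + h, x:x + w].astype(np.int32)
    diff_sum = np.abs(cur - prev).sum()
    return diff_sum > thresh
```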

4 Tracking
In order to further improve the real time performance of the system, tracking module
is introduced after the detection block. The method uses histogram based tracking

Fig. 8. Tracking results: (a) frame no. 210, (b) frame no. 215, (c) frame no. 218


module using the mean-shift algorithm [7]. The detected vehicle is represented using a rectangular region with centre position co-ordinates (c_x, c_y) and width and height dimensions (h_x, h_y). The features of the target vehicles are represented using

intensity histogram within the rectangular region. The target vehicle in the current
image is located and tracked in subsequent frames. The tracking module tracks all
detected vehicles for next N frames (N=10). The tracking module also provides
consistency in drawing boxes round the vehicles and removing misdetections in
adjacent frames. Tracking results are presented in Fig. 8.
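The sketch below illustrates one possible realisation of this tracking step with OpenCV's mean-shift; the histogram bin count, termination criteria and the assumption of grayscale uint8 frames are illustrative choices, not the authors' implementation.

```python
import cv2

# Illustrative sketch: track a detected vehicle box over the next frames with
# mean-shift, using an intensity histogram of the detection as the target model.
def track_vehicle(frames, init_box, n_frames=10):
    x, y, w, h = init_box
    roi = frames[0][y:y + h, x:x + w]
    hist = cv2.calcHist([roi], [0], None, [32], [0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

    term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    window = (x, y, w, h)
    tracked = []
    for frame in frames[1:n_frames + 1]:
        back_proj = cv2.calcBackProject([frame], [0], hist, [0, 256], 1)
        _, window = cv2.meanShift(back_proj, window, term_crit)
        tracked.append(window)
    return tracked
```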

5 Experiments
The algorithm is tested on multiple videos for both rear and forward scenarios. The
forward scenarios are considered for highway conditions whereas the rear scenario is
considered for city conditions. The accuracy of the algorithm is presented in Table.1.
The results include data taken on highways and different Indian road scenarios,
described as follows.
Table 1. Result obtained
Condition                       Total Frames            Detected Vehicles   Accuracy   False Positive Rate
Forward Collision               2000 (3000 vehicles)    2824                94.13%     0.19
Rear-end collision (Bright)     500 (700 vehicles)      647                 92.45%     0.2
Rear-end collision (Rainy)      2000 (1800 vehicles)    919                 51.9%      0.4

5.1 Highway Road Condition


Real-time video for forward vehicle detection was captured during daytime in a highway environment. The total number of vehicles in N=2000 frames is about 3000, of which 2824 vehicles were detected correctly. The results of vehicle detection are presented in Fig. 9. The vehicles that are very far from the host vehicle are not detected.
5.2 Normal City Condition
Real time video for rear-end vehicle detection was captured in two scenarios a) bright
condition and b) rainy condition.


a) For bright conditions, the total number of vehicles was 700 in 500 frames, of which 560 vehicles were detected correctly. The output is shown in Fig. 10.
b) For rainy conditions, the total number of vehicles was 1800 in 1000 frames, of which 919 vehicles were detected correctly. The output is shown in Fig. 11.
As presented in the results, the algorithm achieves its best performance in highway road scenarios with an accuracy of about 95%. It is observed that the vehicles in the same lane as the host vehicle (in-path) are always detected by the mentioned technique, and the misdetections are generally for the side and passing-by vehicles. The performance of the algorithm deteriorates for city-type conditions, and the results are poorest in rainy weather. As shown, in rainy conditions, accuracy is poor due to reflections from vehicles and other objects on the road. The processing speed of the proposed algorithm is 15 fps and it can be used in real-time applications.


Fig. 9. Highway road conditions for forward vehicle detections. Algorithm analysed for
N=2000 frames.


Fig. 10. Normal bright city condition for rear vehicle scenario

Fig. 11. Rainy condition for rear vehicle scenario


6 Conclusions
This paper presents a simple, robust real-time application for detecting vehicles in rear and forward collision regions for daytime scenarios. Results from experimental video sequences demonstrate the high performance of the system and a low false positive rate under ideal road scenarios. The algorithm has a very high accuracy at detecting in-path vehicles. The performance is degraded under rainy weather scenarios because of improper segmentation arising from multiple edges due to reflections from various objects. The algorithm finds applications in collision warning systems, where a warning is provided to the host vehicle in case of a possible collision. The algorithm is more effective for highway-type scenarios in normal daylight conditions. Future work includes developing robust techniques for detecting vehicles under various weather conditions and for NIR videos.

References
1. Lim, K.H., Ang, L.M., Seng, K.P., Chin, S.W.: Lane-vehicle detection and tracking. In: International Multi Conference of Engineers and Scientists, March 18-20, vol. 2 (2009)
2. Betke, M., Haritaoglu, E., Davis, L.S.: Real-time multiple vehicle detection and tracking from a moving vehicle. Machine Vision and Applications 12, 69–83 (2000)
3. Sun, Z., Bebis, G., Miller, R.: On-road vehicle detection: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(5) (May 2006)
4. Sotelo, M.A., Barriga, J.: Rear-end collision detection using vision for automotive application. Journal of Zhejiang University Science A 9(10), 1369–1372 (2008)
5. Galambos, C., Kittler, J., Matas, J.: Progressive probabilistic Hough transform for line detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, p. 1554 (1999)
6. Matthews, N., An, P., Charnley, D., Harris, C.: Vehicle detection and recognition in greyscale imagery. Control Eng. Practice 4, 473–479 (1996)
7. Comaniciu, D., Ramesh, V., Meer, P.: Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(5) (May 2005)

PIN Generation Using Single Channel EEG Biometric


Ramaswamy Palaniappan (1), Jenish Gosalia (2), Kenneth Revett (3), and Andrews Samraj (4)

(1) School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
palani@essex.ac.uk
(2) Toumaz Technology Limited, Abingdon, UK
jenish.gosalia@toumaz.com
(3) Faculty of Informatics and Computer Science, British University of Egypt, Cairo, Egypt
ken.revett@bue.edu.eg
(4) Vellore Institute of Technology University, Chennai Campus, Chennai, 600 048, India
holyant@yahoo.com

Abstract. This paper investigates a method to generate personal identification


number (PIN) using brain activity recorded from a single active
electroencephalogram (EEG) channel. EEG based biometric to generate PIN is
less prone to fraud and the method is based on the recent developments in
brain-computer interface (BCI) technology, specifically P300 based BCI
designs. Our perfect classification accuracies from three subjects indicate
promise for generating PIN using thought activity measured from a single
channel.
Keywords: Biometrics, Brain computer interface, Electroencephalogram,
Information transfer rate, Neural networks.

1 Introduction
Biometric technologies can be roughly divided into those that identify a person and those that authenticate a person's identity [1]. A personal identification number (PIN) is one commonly used confidential sequence of numerals to authenticate a person's identity, as employed in automated teller machines (ATMs) to withdraw cash or perform other functions. In recent years, PINs have been used to authenticate debit and credit cards in lieu of signatures. In this paper, we investigate a method to generate a PIN using only the brain's electrical activity (i.e. the electroencephalogram (EEG)). The advantage is obviously that it is less prone to fraud, such as the shoulder-surfing problem in the conventional method of keying in the numbers.
The method follows the recent developments in brain-computer interface (BCI)
technology [2]. BCI designs were initially developed to assist the disabled to
communicate with their external surroundings as they circumvent the peripheral
nerves and muscles to create a link between the brain and computers/devices. In
recent years, BCI designs have been explored for other purposes such as biometrics
[3, 4], games design [5], virtual reality [6] and robotics [7].


There are many BCI paradigms, the most common being the non-invasive EEG
based. EEG based BCI designs could be further divided into those based on transient
evoked potential, motor imagery, slow cortical potential, mental task and steady state
evoked potential. Transient evoked potential method, more commonly known as the
P300 method as it is based on a potential that is generated about 300-600 ms after the
stimulus onset, is probably the method chosen by many BCI researchers due to its
simplicity and ease of use by the subjects. The thought based PIN generation
investigated here is based on this P300 based BCI.

2 Methodology
Three right handed male subjects aged 24 participated in this study. The objective of
the experiment and the description of the experiment were given to the subjects before
they signed a voluntary consent. The experiment was approved by the University of
Essexs Ethics Committee. The subjects were seated in a room with computer screen
projected about 30 cm from their eyes. The subjects had no uncorrected visual
problems. The visual stimulus paradigm is as shown in Figure 1.

Fig. 1. Visual stimulus paradigm

The numbers on the screen were flashed randomly with each flash lasting 100 ms
with 75 ms inter-stimulus interval (ISI). These timings were chosen from a previous
study [7]. The subjects were asked to concentrate on a given target number and to
keep a mental count of the target flashes (this is to avoid lapses of concentration).
When a target number is flashed, a positive potential about 300-600 ms after stimulus onset is evoked and shows up in the recorded EEG signal. A total of five trials were
conducted in each session where a trial consisted of ten random flashes of each
number. A short break of 2.5 s was given between each session. A second session was
conducted on a separate week. EEG data from 32 electrodes as shown in Figure 2 was
collected using Biosemi Active Two system. The sampling rate used was 256 Hz. One
second EEG data after stimulus onset from each flash was extracted for further
processing.


Fig. 2. Used electrode locations

2.1 Pre-processing
The data was bandpass filtered using a Butterworth IIR filter with order 6. Two
commonly used bandpass ranges of 1-8 Hz [9] and 1-12 Hz [10] were used. Next, the
data was downsampled to 32 samples. Windsorising as suggested in [10] was applied
to remove outlier data beyond 10th and 90th percentiles. A single hidden layer feedforward neural network classifier trained by the backpropagation algorithm was used
to train and test the performance of the processed EEG data.
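A minimal sketch of this pre-processing chain is given below; details such as zero-phase filtering and FFT resampling are assumptions, not necessarily the authors' exact implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample

# Illustrative sketch: 6th-order Butterworth band-pass, downsampling to 32
# samples, and windsorising beyond the 10th and 90th percentiles.
def preprocess_epoch(eeg_epoch, fs=256, band=(1.0, 8.0), n_samples=32):
    b, a = butter(6, band, btype='bandpass', fs=fs)
    filtered = filtfilt(b, a, eeg_epoch)          # filter the 1-s epoch
    downsampled = resample(filtered, n_samples)   # reduce to 32 samples
    lo, hi = np.percentile(downsampled, [10, 90])
    return np.clip(downsampled, lo, hi)           # windsorise outliers
```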
2.2 Classification
Instead of treating the classification as a ten class problem, the classifier was trained
with only two outputs, one for target and another for non-target. Our preliminary
simulations show that the results are much improved following this strategy. Data
from one session was used to train the neural network while the remaining data from
the other session was used to test the performance of the classifier. To avoid
overtraining the neural network with more non-target instances as compared to target
instances, all 50 target instances (ten numbers x five flashes) were used with 50
randomly chosen non-target instances rather than the total 450 non-target instances.
The training was conducted until mean square error fell below 0.0001 or a maximum
iteration number of 1000 was reached. The hidden layer size was fixed to be similar
to the number of inputs. For example, when 32 channels were used, the size was
1024.
The two outputs of the classifier were added incrementally after each trial. As the
neural network could predict more than a single target for each trial, the maximal
output after considering all the ten flashes in a trial was taken as the predicted target.
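The decision rule described above can be sketched as follows; the score layout is an assumption used only to illustrate the incremental summation and the final argmax.

```python
import numpy as np

# Illustrative sketch: accumulate the classifier's "target" output over trials
# and pick the digit with the maximal summed score as the predicted PIN digit.
def predict_digit(target_scores):
    """target_scores: array of shape (n_trials, 10) with the network's
    target-class output for each digit flash in each trial."""
    summed = np.cumsum(target_scores, axis=0)   # incremental sum over trials
    return int(np.argmax(summed[-1]))           # digit with maximal evidence

# Example with random scores for 5 trials of 10 digit flashes:
rng = np.random.default_rng(0)
print(predict_digit(rng.random((5, 10))))
```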


The classification step was repeated ten times (to reduce effects of different neural
network weight connections) and also cross validated with the training and testing
datasets swapped and performances from these 20 runs were averaged. All the
computer simulations were conducted with MATLAB.

3 Results
Figure 3 shows the grand averaged 1-8 Hz bandpass filtered EEG response from 50
target and 50 non-target EEG signals for a subject. The occurrence of P300
component around 300-600 ms for the target flash (shown in red) as compared to nontarget (shown in blue) is evident from the figure.
[Figure: target vs. non-target P300 EEG; x-axis: Time (s), y-axis: Amplitude (arbitrary units)]
Fig. 3. Grand averaged EEG response for a subject


[Figure: classification accuracy vs. number of trials for the 1-8 Hz and 1-12 Hz passbands]

Fig. 4. Passband range comparison for subject 1

[Figure: classification accuracy vs. number of trials for the 1-8 Hz and 1-12 Hz passbands]

Fig. 5. Passband range comparison for subject 2

Figures 4-6 show the results from the subjects using all 32 channels with passband ranges of 1-8 Hz and 1-12 Hz. The passband range of 1-8 Hz gave improved performance (statistically significant, p<0.1) compared to the passband range of 1-12 Hz for subjects 1 and 3 when considering the first two trials. For subject 2, the 1-8 Hz range gave improved performance for all the trials. In the figures, an accuracy value of 1.00 indicates perfect classification (i.e. 100%).
Tables 1-3 show the results comparing the performance using 1 channel, 4 channels, 8 channels, 16 channels and all 32 channels with a passband range of 1-8 Hz (the 1-12 Hz passband was dropped from further analysis due to its poorer performance for all the subjects). The locations of the multi-channel configurations were obtained from [10] and are shown in Table 4, while our own preliminary simulations indicated location Cz to be the most favourable single channel.
The results indicate that perfect classification was obtained after five trials for all the subjects and all the channel configurations. Hence, using the single channel Cz would be sufficient if five trials were considered.
[Figure: classification accuracy vs. number of trials for the 1-8 Hz and 1-12 Hz passbands]

Fig. 6. Passband range comparison for subject 3


Table 1. Channel wise accuracy for subject 1


Channel/trial    1      2      3      4      5
1                0.56   0.67   0.84   0.95   1.00
4                0.66   0.74   0.83   0.96   1.00
8                0.71   0.71   0.93   1.00   1.00
16               0.74   0.81   0.97   1.00   1.00
32               0.74   0.79   0.97   1.00   1.00

Table 2. Channel wise accuracy for subject 2


Channel/trial    1      2      3      4      5
1                0.34   0.58   0.73   0.88   1.00
4                0.46   0.66   0.85   0.85   1.00
8                0.49   0.63   0.82   0.90   1.00
16               0.50   0.74   0.82   0.88   1.00
32               0.57   0.82   0.96   0.93   1.00

Table 3. Channel wise accuracy for subject 3


Channel/trial    1      2      3      4      5
1                0.58   0.77   0.73   0.85   1.00
4                0.55   0.63   0.79   0.93   1.00
8                0.62   0.73   0.92   1.00   1.00
16               0.67   0.80   0.97   1.00   1.00
32               0.65   0.81   0.99   0.99   1.00

Table 4. Channel locations


4 channels     Fz, Cz, Pz, Oz
8 channels     Fz, Cz, Pz, Oz, P7, P3, P4, P8
16 channels    Fz, Cz, Pz, Oz, P7, P3, P4, P8, O1, O2, CP1, CP2, C3, C4, FC1, FC2

Information transfer rate (ITR), which gives a measure of the performance based on the accuracy and the number of targets, in bits/min, was computed using [11]:

ITR = log2(N) + P log2(P) + (1 − P) log2((1 − P) / (N − 1)) .    (1)
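A small sketch of Eq. (1) is given below; the function name is an illustrative assumption. Multiplying the returned bits-per-selection value by the number of selections per minute gives the ITR in bits/min.

```python
import math

# Illustrative sketch: bits conveyed per selection for an N-choice task with
# classification accuracy P, following Eq. (1).
def bits_per_selection(n_targets, p):
    # Guard the endpoints where p*log2(p) would be numerically undefined.
    if p == 1.0:
        return math.log2(n_targets)
    if p == 0.0:
        return math.log2(n_targets) + math.log2(1.0 / (n_targets - 1))
    return (math.log2(n_targets) + p * math.log2(p)
            + (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1)))

# Example: 10 digits at 95% accuracy.
print(bits_per_selection(10, 0.95))
```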

[Figure: bits/min vs. number of trials for 1, 4, 8, 16 and 32 channels]
Fig. 7. ITR for subject 1

[Figure: bits/min vs. number of trials for 1, 4, 8, 16 and 32 channels]
Fig. 8. ITR for subject 2

[Figure: bits/min vs. number of trials for 1, 4, 8, 16 and 32 channels]
Fig. 9. ITR for subject 3


The ITR for each subject is shown in Figures 7-9 with passband range of 1-8 Hz.
The best ITR of 57.29 bpm was obtained for subject 1 for 32 channels, which is much
higher than reported in [10]. For single channel, this was 34.60 bpm for subject 3.

4 Conclusion
A method to generate a PIN based on EEG signals has been investigated here. A major obstacle with EEG-based biometric work is the cumbersome usage of many electrodes, but our results indicate that the single channel Cz with a 1-8 Hz passband is appropriate for the investigated objective, assuming a minimum of five trials. Furthermore, a reduction in the number of channels will reduce the cost, computational time and complexity. The perfect accuracy that is obtained after five trials shows the promise behind the method for fraud-resistant PIN generation. The design of new capacitive electrodes will further remove the obstacle of having to use wet EEG electrodes, thereby bringing this method closer to deployment in the real world.

References
1. Wayman, J., Jain, A., Maltoni, D., Maio, D. (eds.): Biometric Systems: Technology, Design and Performance Evaluation. Springer, Heidelberg (2004)
2. Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G., Vaughan, T.M.: Brain-computer interfaces for communication and control. Clinical Neurophysiology 113(6), 767–791 (2002)
3. Palaniappan, R., Mandic, D.P.: Biometric from the brain electrical activity: A machine learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(4), 738–742 (2007)
4. Ravi, K.V.R., Palaniappan, R.: Improving visual evoked potential feature classification for person recognition using PCA and normalization. Pattern Recognition Letters 27(7), 726–733 (2006)
5. Neurosky, http://www.neurosky.com
6. Cho, H.-s., Goo, J.J., Suh, D., Park, K.S., Hahn, M.: The virtual reality brain-computer interface system for ubiquitous home control. In: Sattar, A., Kang, B.-h. (eds.) AI 2006. LNCS (LNAI), vol. 4304, pp. 992–996. Springer, Heidelberg (2006)
7. Geng, T., Gan, J.Q., Hu, H.: A self-paced online BCI for mobile robot control. International Journal of Advanced Mechatronic Systems 2(1-2), 28–35 (2010)
8. Krusienski, D.J., Sellers, E.W., Cabestaing, F., Bayoudh, S., McFarland, D.J., Vaughan, T.M., Wolpaw, J.R.: A comparison of classification techniques for the P300 speller. Journal of Neural Engineering 3, 299–305 (2006)
9. Gupta, C.N., Palaniappan, R.: Enhanced detection of visual evoked potentials in brain-computer interface using genetic algorithm and cyclostationary analysis. Computational Intelligence and Neuroscience (Special Issue on Brain-Computer Interfaces: Towards Practical Implementations and Potential Applications) 2007, Article ID 28692, 12 pages (2007), doi:10.1155/2007/28692
10. Hoffmann, U., Vesin, J.M., Ebrahimi, T., Diserens, K.: An efficient P300-based brain-computer interface for disabled subjects. Journal of Neuroscience Methods 167(1), 115–125 (2007)
11. Obermaier, B., Neuper, C., Guger, C., Pfurtscheller, G.: Information transfer rate in a five-classes brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering 9(3), 283–288 (2001)

A Framework for Intrusion Tolerance in Cloud Computing
Vishal M. Karande and Alwyn R. Pais
Information Security Lab, Dept. of Computer Science and Engineering,
National Institute of Technology Karnataka, Surathkal, India - 575025
{vishalmkarande,alwyn.pais}@gmail.com

Abstract. Cloud Computing has been envisioned as the next generation


architecture and one of the fastest growing segments of the IT enterprises.
No matter how much investment is made in cloud intrusion detection and
prevention, cloud infrastructure remains vulnerable to attacks. Intrusion
Tolerance in Cloud Computing is a fault-tolerant design approach to defend cloud infrastructure against malicious attacks. Thus, to ensure dependability, we present a framework obtained by mapping the available Malicious and Accidental Fault Tolerance for Internet Applications (MAFTIA) intrusion tolerance framework, covering dependability attributes such as availability, authenticity, reliability, integrity, maintainability and safety, onto the new Cloud Computing environment. The proposed framework has been validated
by integrating Intrusion Tolerance via Threshold Cryptography (ITTC)
mechanism in the simulated cloud environment. Performance analysis of
the proposed framework is also done.
Keywords: Cloud Computing, Framework, Intrusion Tolerance, Security, and Threshold Cryptography.

1 Introduction

Experience shows that attacks may never be completely prevented or detected


accurately and on time. Thus Intrusion Tolerance combining the aspects of
protection, detection and reaction is currently considered to be the optimal
way to address information security challenges [1]. However, the architecture of
intrusion-tolerant systems, integrating multiple layers of defenses, redundancy
and diversity, is often considered to be costly and heavyweight to provision dynamically. At the same time, the information technology landscape has been evolving continuously with the introduction of a new software technology: Cloud Computing.
Cloud computing provides simple, on-demand access to pools of highly elastic
computing resources. Cloud Computing delivers software, platform and infrastructure as subscription-based services to its user in a pay-as-you-go model.
These services are referred to as Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) wherein resources are provided
as a service over a network. Corporations and individuals are concerned about



how security and compliance integrity can be maintained in this new rapidly
evolving cloud computing environment. Even more concerning, though, are the corporations that are jumping to cloud computing while being oblivious to the implications of putting critical applications and data in the cloud. So the cloud computing environment should be secure enough to maintain cloud users' trust, as a small intrusion can cause a huge loss to both cloud users and cloud service executives [10]. Cloud computing being new and rapidly evolving, intrusions causing damage to its functional and operational units should be taken care of in the early stages of development.
In this paper we present a framework for intrusion tolerance in the cloud computing environment which summarizes how a number of defenses and security techniques, especially those providing availability, integrity and confidentiality, can possibly be integrated in the cloud or within its services. We have studied the MAFTIA intrusion tolerance framework. This existing framework for intrusion tolerance does not account for essential characteristics of cloud computing, such as scalability, elasticity, ubiquitous access, computer virtualization, relative consistency, commodity and reliability. The new framework is obtained by mapping the available intrusion tolerance framework, covering dependability attributes such as availability, authenticity, reliability, integrity, maintainability and safety, onto the new cloud computing environment, wherein for each component we provide the requirement, the design description (architecture, specification), and reasoning and evidence (why the description meets the requirement under the stated assumptions). The framework serves as an excellent platform for making cloud services intrusion tolerant. To test the feasibility of the proposed framework, a Cloud Computing environment is simulated using the CloudSim [12] toolkit, and using the Intrusion Tolerance via Threshold Cryptography (ITTC) [7] mechanism the cloud's Infrastructure as a Service (IaaS) is made intrusion tolerant. Performance of the new simulated service model is measured using various performance metrics such as total execution time, intrusion detection time, recovery time, number of cloudlets, etc.
The rest of the paper is structured as follows. Section 2 provides a brief summary of the related work in this area. In Section 3, we propose our framework. Section 4 gives the validation of our proposed framework, and the paper concludes in Section 5.

2 Related Work

A dependable system is defined as one that is able to deliver a service that can justifiably be trusted [1]. Attributes of dependability include availability (readiness for correct service), reliability (continuity of correct service), confidentiality (prevention of unauthorized disclosure of information), and integrity (the absence of improper system state alterations). An intrusion-tolerant system is a system that is capable of self-diagnosis, repair, and reconfiguration while continuing to provide a correct service to legitimate users in the presence of intrusions.


The MAFTIA Project, funded by the European Union, systematically investigated the tolerance paradigm for security in order to propose an integrated
architecture built on this paradigm and to realize a concrete design that can be
used to support the dependability of many applications [4]. MAFTIA was the
first project that uniformly applied the tolerance paradigm to the dependability
of complete large-scale applications in a hostile environment and not just for
single components of such systems. Its major innovation was a comprehensive
approach for tolerating both accidental faults and malicious attacks in large-scale
distributed systems, including attacks by external hackers and by corrupt insiders. The framework proposed is strongly inspired by the MAFTIA framework,
but we have applied it to an emerging Cloud Computing environment.
A Component Based Framework for Intrusion Tolerance (CoBFIT) [5] provides a platform for building and testing a variety of Intrusion tolerant
distributed systems. The CoBFIT framework, by virtue of its design and implementation principles, can serve as a convenient base for building components
that implement intrusion-tolerant protocols and for combining these components in an efficient manner to provide a number of services for dependability.
This framework is studied to identify the possible components in the proposed
framework.
The Intrusion Tolerance by Unpredictable Adaptation (ITUA) Project proposes to develop a middleware-based intrusion-tolerant solution that helps applications survive certain kinds of attacks. The main goal of ITUA is to add
intrusion tolerance to CORBA architecture by modifying the middleware itself and an existing crash tolerant group communication system (C Ensemble).
These projects do not directly address the specific problem of intrusion tolerance in the cloud environment, but they include the notions of replication and reconfiguration that also belong to our framework.

3 The Framework

3.1 Overview of Framework

Fig. 1 shows the intrusion tolerance framework based on the layered design of
cloud computing architecture. In layered design, physical cloud resources along
with core middleware capabilities form the basis for delivering IaaS and PaaS.
The user-level middleware aims at providing SaaS capabilities. The top layer
focuses on application services (SaaS) by making use of services provided by
the lower layer services. PaaS/SaaS services are often developed and provided
by third party service providers, who are different from the IaaS providers. In
these service layers, framework components implement the structure of intrusion tolerance in the form of abstractions, primitives, and supporting software
mechanisms that are commonly needed for the creation of intrusion-tolerant services. The framework also shows the components which are to be managed by
the Cloud Security Administration System to make Cloud services intrusion tolerant. It is important to note that implementing any of the cloud computing
service in the proposed framework will not make the service intrusion-tolerant.


Fig. 1. Intrusion Tolerance Framework based on Layered Design of Cloud Computing Architecture

The service will be intrusion tolerant only if the protocol or the algorithm upon
which the service is based is intrusion tolerant by design.
3.2 Framework Components

Layered Design
1. User Level: This layer includes applications that are directly available to
end-users. We define end-users as the active entity that utilizes the SaaS
applications over the Internet. These applications may be supplied by the
Cloud provider (SaaS providers) and accessed by end-users either via a subscription model or a pay-per-use basis. Alternatively, in this layer, users
deploy their own applications.
2. Middleware: Cloud computing services rely on several layers of middleware
services that must be able to withstand intrusions and attacks from a very
wide range of players. For an intrusion tolerant service by design, its protocol
or algorithm should be implemented in middleware. It is composed of User
Level Middleware and Core Middleware.


(a) User Level Middleware provides those programming environments and


composition tools that ease the creation, deployment, and execution of
applications in clouds.
(b) Core Middleware implements the platform level services that provide
runtime environment for hosting and managing User-Level application
services. Core services at this layer include Dynamic SLA Management,
accounting, billing, execution monitoring and management, and pricing.
3. System Level: The computing power in Cloud environments is supplied
by a collection of Datacenters that are typically installed with hundreds to
thousands of hosts. At the System Level layer there exist massive physical
resources (storage servers and application servers) that power the data centers. At system level the cloud resources are reconfigured to support intrusion
tolerance.
Attack and Vulnerability Prevention. The Intrusion Prevention is the combined application of attack and vulnerability prevention, as well as attack and vulnerability removal. This component consists of the introduction of mechanisms such as authentication, authorization and firewalls, which prevent attacks in that they push back the attacks to the level of the additional barriers
these mechanisms introduce.
Error Processing
1. Event Analysis: Event Analysis provides Cloud Security Administrator with
an effective mechanism to update Security Plans, Security Assessment Reports, and Plans of Action and Milestones. The sensor is the component of the system collecting raw data (e.g., a sniffer or an audit log). Event Analysis involves,
(a) Security impact analyses on proposed or actual changes to computing systems and environments of operation.
(b) Assessment of selected security controls (including system-specific, hybrid, and common controls) based on the defined continuous monitoring strategy.
(c) Security status reporting to appropriate officials.
2. Error Detection: We distinguish two basic generic component types, Intrusion Intolerant components and Intrusion Tolerant components. At both
middleware and system levels error can be detected in any of the above
two components. However, only intrusion tolerant components are capable
of acting autonomously to implement error recovery.
3. Fault Model: According to basic fault model a fault leads to error and then
to failure of the system. Thus for both middleware and system level faults
should be identified by Cloud Security Administration as an auditing part
of event analysis. In Cloud Computing environment faults can be physical,
design, interaction, accidental/intentional, transient/intermittent, internal
external. It is necessary to distinguish the internal detectable impairment
(error) from the causing impairment (fault) since there may be multiple


causes that could give rise to the same detectable impairment. Also it is
necessary to distinguish the internal detectable impairment (error) from the
external impairment (i.e., failure in the service delivered to a user) that
intrusion tolerance techniques aim to prevent [4].
Fault Treatment. At the middleware and system levels of cloud computing, Cloud Security Administration is responsible for fault handling.
1. Fault Diagnosis: Fault diagnosis is concerned with identifying the type and
locations of faults that need to be isolated before carrying out system reconfiguration or initiating corrective maintenance. It involves,
(a) Intrusion diagnosis, i.e., trying to assess the degree of success of an intruder in terms of system corruption.
(b) Vulnerability diagnosis, i.e., trying to understand the channels through
which the intrusion took place so that corrective maintenance can be
carried out.
(c) Attack diagnosis, i.e., finding out who or what organization is responsible
for the attack in order that appropriate litigation or retaliation may be
initiated.
2. Fault Isolation: In Cloud Computing environment fault isolation is needed
to make sure that the source of the detected error(s) is prevented from
producing further error(s). It involves,
(a) Blocking cloud service request from an intrusion containment region that
is diagnosed as corrupt.
(b) Removing a corrupted host from the datacenter or, with reference to the
root vulnerability/attack causes.
(c) Uninstalling software versions with newly-found vulnerabilities
(d) Arresting and taking legal action on an attacker.
3. System Reconfiguration: All the protocols and algorithms required for cloud services provisioning are implemented at the middleware level. Depending on the damage level caused due to intrusion in the system, reconfiguration at both the middleware and system levels is required to be carried out by the Cloud Security Administrator. In an intrusion-tolerant Cloud environment possible reconfiguration actions include,
(a) Virtualization software downgrades or upgrades (provided appropriate versions are available on-line for this to be done automatically)
(b) Changing a voting threshold (say from 5-out-of-9 voting to 6-out-of-9 voting) after two corrupt servers have been isolated, so that a further intrusion can be masked
Cloud Security Administration System. Cloud Security Administration
System is responsible for handling and treating security issues in Cloud environment. Standards that are relevant to security management practices in
the cloud are Information Technology Infrastructure Library (ITIL), ISO/IEC
27001/27002 and Open Virtualization Format (OVF) [6]. Information Technology Infrastructure Library is a set of best practices and guidelines that define


an integrated, process-based approach for managing information technology services. Open Virtualization Format (OVF) enables efficient, flexible, and secure
distribution of enterprise software, facilitating the mobility of virtual machines
and giving customers vendor and platform independence.

4 Framework Validation

4.1 Simulation Environment

To test the feasibility of the proposed framework a Cloud Computing environment as shown in Fig. 2 was simulated using CloudSim [12] toolkit.

Fig. 2. Cloud Computing Simulation Environment

In Cloud Computing, the user submits a cloudlet (cloud service request) to the U-Broker, who is responsible for finding a suitable cloud for servicing the user. The Cloud Exchange keeps information about various clouds such as currently available resources. Upon accepting cloudlets, the Cloud Coordinator sends them to the D-Broker, who is responsible for creating Virtual Machines on the Host Machines constituting
a Datacenter. All the cloudlets are scheduled and executed on these Virtual
Machines. The results are updated and sent back to the user.
Intrusion Tolerance via Threshold Cryptography. The simulation toolkit
is extended to add Intrusion Tolerance capability by adding new classes into it. In
this environment, Cloud Coordinator can execute cloud request (cloudlets) only
if the hosts running inside the datacenter are legitimate. For this the datacenter
authentication key is distributed among the hosts using Shamir Secret Sharing
algorithm [2].


Fig. 3. Total Execution Time Vs Number of Cloudlets

Fig. 4. Total Execution Time Vs Number of Hosts in a Datacenter

Fig. 5. Total Execution Time Vs Number of Hosts Failed


1. Key Management: For n number of hosts in a datacenter, its authentication


key S is shared among k hosts as shares s0, s1, s2, ..., sk in such a way that,
(a) Knowledge of k or more shares makes the secret S computable, and
(b) Knowledge of k-1 or fewer shares leaves the secret S completely undetermined.
Such a scheme is called a (k, n) threshold scheme (a minimal sketch of such a scheme is given after this list). Secret shares s0, s1, ..., sk are distributed to hosts h1, h2, ..., hk respectively. A robust key management for the (k, n) scheme can be obtained with n = 2k-1. Every time a new Host is added to or deleted from the datacenter, the secret share values are regenerated and distributed to the hosts by the Key Management module.
2. Intrusion Detection Module: A Sensor module continuously tests all possible
combinations of hosts for valid secret generation and detects compromised
hosts. Thus, for the (k, n) threshold scheme, nCk combinations are tested by generating the secret key for every combination, and negative results are used to
detect compromised hosts. Sensor module is capable of detecting intrusions
in all n hosts. Sensor module also generates alert and initiates recovery
module when intrusion is detected.
3. Recovery Module: In recovery process, the reconfiguration module reallocates
all the virtual machines running on the penetrated host(s), and the fault
isolation module removes compromised host(s) machine from the Host group
constituting a datacenter. Recovery module then invokes key management
module for generating and redistributing new secret shares.
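The minimal sketch below illustrates a (k, n) Shamir threshold scheme over a prime field; the prime, helper names and example values are assumptions for illustration, not the ITTC implementation used by the authors. A sensor like the one described above could evaluate the reconstruction over different k-subsets of shares and flag subsets that fail to reproduce the known key.

```python
import random

# Illustrative (k, n) Shamir threshold scheme over a prime field.
PRIME = 2**127 - 1  # a Mersenne prime large enough for a demo secret

def make_shares(secret, k, n):
    """Split `secret` into n shares, any k of which reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    poly = lambda x: sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, poly(x)) for x in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % PRIME
                den = (den * (xj - xm)) % PRIME
        total = (total + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

# Example: 5-out-of-9 sharing of a datacenter authentication key.
shares = make_shares(123456789, k=5, n=9)
assert recover_secret(shares[:5]) == 123456789
```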
4.2 Simulation Results

The performance overhead of incorporating Intrusion Tolerance is measured under different scenarios, with a varying threshold of secret sharing, i.e. k (Fig. 3), number of hosts in a datacenter, i.e. n (Fig. 4), and number of hosts failed in a datacenter (Fig. 5). Fig. 3 shows that the performance overhead (measured with varying
cloudlets) is maximum for n=2k-1 i.e. 5 out of 9 hosts sharing a secret. Fig. 4
shows performance overhead increases with increase in the number of Hosts in a
datacenter keeping the number of cloudlets constant. In case of intrusions, the
total execution cost involves intrusion detection cost and system recovery cost.
Fig. 5 shows datacenter performance in case of failure of Hosts. Total execution
time increases with the number of Hosts failed.

5 Conclusion and Future Work

In this paper, we have proposed a framework for intrusion tolerance based on


the layered design of Cloud Computing architecture. For the validation of the framework, we have simulated an Intrusion Tolerant Cloud environment with the security controls and techniques required for intrusion tolerance. We have used the Intrusion Tolerance via Threshold Cryptography mechanism for validation. It is observed that our framework is capable of detecting and recovering from intrusions in


the Cloud Computing environment. Performance analysis of framework shows


that the overhead of integrating intrusion detection and recovery mechanism in
Cloud Computing environment increases with the number of hosts in a datacenter for the given application. The framework components were designed to be
generic. Future work includes (1) rening and extending the implementation of
framework components, and (2) exploring additional supporting mechanisms for
intrusion tolerance that can be added to the proposed framework.

References
1. Avizienis, A., Laprie, J.C., Randell, B., Landwehr, C.: Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. Dependable and Secure Computing 1(1), 11–33 (2004)
2. Shamir, A.: How to share a secret. Comm. of the ACM 22, 612–613 (1979)
3. Saidane, A., Nicomette, V., Deswarte, Y.: The design of a generic intrusion-tolerant architecture for web servers. IEEE Trans. 6, 45–58 (2009)
4. Powell, D., Stroud, R.: Malicious- and Accidental-Fault Tolerance for Internet Applications: Conceptual Model and Architecture. Technical Report 03011, Project IST-1999-11583 MAFTIA, Deliverable D21, LAAS-CNRS (January 2003)
5. Ramasamy, H.V., Agbaria, A., Sanders, W.H.: CoBFIT: A component-based framework for intrusion tolerance. In: 30th EUROMICRO Conference (EUROMICRO 2004), pp. 591–600 (2004)
6. Information Technology Infrastructure Library, http://www.itil-officialsite.com/home/
7. Intrusion Tolerance via Threshold Cryptography, http://crypto.stanford.edu/~dabo/ITTC/
8. Reynolds, J.C., Just, J., Clough, L., Maglich, R.: On-line intrusion detection and attack prevention using diversity, generate-and-test, and generalization. In: HICSS 2003, Track 9, vol. 9 (2003)
9. Pal, P., Schantz, R., Atighetchi, M., Loyall, J.: What Next in Intrusion Tolerance. BBN Technologies, Cambridge
10. Popovic, K., Hocenski, Z.: Cloud computing security issues and challenges. In: MIPRO 2010, Proceedings of the 33rd International Convention, pp. 344–349 (May 2010)
11. Proposed Security Assessment and Authorization for U.S. Government Cloud Computing (November 2010), http://www.govinfosecurity.com/
12. Buyya, R., Ranjan, R., Calheiros, R.N.: Modeling and simulation of scalable Cloud Computing environments and the CloudSim toolkit: Challenges and opportunities. University of Melbourne, Australia (July 2009)

Application of Parallel K-Means Clustering Algorithm for Prediction of Optimal Path in Self Aware Mobile Ad-Hoc Networks with Link Stability
Likewin Thomas and B. Annappa
Department of Computer Science and Engineering
Centre for Wireless Sensor Networks
National Institute of Technology Karnataka, Surathkal, Mangalore, India
likewinthomas@gmail.com, annappa@ieee.org

Abstract. Providing Quality of Service (QoS) in terms of bandwidth, delay,


jitter, throughput etc. for a Mobile Ad-hoc Network (MANET), which is an autonomous collection of nodes, is a challenging issue because of node mobility and the shared medium. This work predicts the optimal link based on link stability, i.e. the number of contacts between a pair of nodes, which can be effectively applied for the prediction of the optimal effective path to the destination while taking QoS parameters into account. The K-Means clustering algorithm is applied for automatically discovering clusters from large data repositories and is parallelized using the Map-Reduce technique in order to improve the computational efficiency, thereby predicting the optimal effective path from source to sink. The work optimizes our previous result by first finding the best stable links in the MANET and then exploring the path only over those stable links; by doing so we are able to predict the optimal path in a more time-efficient way.
Keywords: K-Means, Map-Reduce, MANET, QoS, Time of Contact.

1 Introduction
Mobile Ad-hoc networks have become popular and have found advanced applications even though they have neither a fixed infrastructure nor administrative support, whereas a conventional wireless network requires both a fixed infrastructure and centralized administration for its operation [1]. For such a complex network, providing Quality of Service (QoS) is a critical and challenging issue. Traditional MANET routing protocols focused on finding a feasible route from a source to a destination, without any consideration for optimizing the utilization of network resources or for supporting application-specific QoS requirements [8-10]; the main concern was simply to find the shortest path among all existing paths from source to destination. Hence, to support QoS, the essential problem is to find a route with sufficient available resources, such as the lowest-cost or most stable route that meets the QoS constraints. By a stable route we mean a path whose time of contact is high in a given duration of time. Here the simulation was run to check the number of contacts the nodes make with each other in a given simulation time; if the contacts made by the nodes with each other are stable, we call such a link a stable link. Such links are highly durable and can be trusted when predicting the optimal path from source to destination.
1.1 Previous Work
In our previous work, "Application of Parallel K-Means Clustering Algorithm for Prediction of Optimal Path in Self Aware Mobile Ad-Hoc Networks", we found the optimal path from source to sink using the K-Means clustering algorithm, one of the popular clustering techniques, which minimizes the total distance between a group's members and its corresponding centroid (the representative of the group) by finding the best division of n entities into k groups. K-Means mines the large datasets obtained by running the simulation for the nodes in the network in order to find the cluster centroids; once the centroids were found, each pattern was assigned to the cluster it belonged to. We were thus able to identify the best, good and bad clusters and determine to which cluster each available path belongs.
The rest of the paper is organized as follows: Section 2 provides an overview of the Map-Reduce technique; Section 3 describes the previous result and its analysis together with the proposed link stability technique; Section 4 concludes the paper.

2 Map-Reduce for Determining Clusters and K-Means Clustering Algorithm

Figure 1 explains the working principle of the Map-Reduce model along with its utilization in determining clusters. Map tasks are groups of independent tasks assigned to each worker for further processing, utilizing the information collected from software agents [3]. The master distributes the tasks among the workers, based on the information from the software agents, either in round-robin or in serial fashion. Each worker performs the K-Means algorithm on the information given to it, thereby determining the clusters Best, Good and Bad.

Fig. 1. Map-Reduce model


2.1 K-Means Clustering Algorithm

Clustering can be considered the most important unsupervised learning problem; it finds a structure in a collection of unlabeled data, in other words, clustering is the process of grouping similar data [11]. Most of the clustering algorithms in the literature use some form of distance measure, of which the Euclidean distance is the most common. Care must be taken in choosing the distance measure, since it determines the correctness of the clustering to a large extent. K-Means has, in this work, been effectively used to determine the clusters with best throughput, good throughput and bad throughput based on the information collected from the software agents. Hence, during the Map phase of the K-Means algorithm, clusters are identified based on the threshold value that has been set depending upon the user's requirement [6]. An illustrative sketch of this clustering step is given below.
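As an illustration of the clustering step described above, the following minimal sketch groups per-window QoS samples into three clusters (Best, Good, Bad) using the Euclidean distance on two of the features (packet delivery ratio and throughput). The sample values are taken from Table 1, but the use of plain Python, the helper names and the choice of only two features are assumptions for illustration; they are not the paper's actual MPI-based implementation.

```python
import random

def euclidean(a, b):
    # Euclidean distance between two equal-length feature vectors
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def kmeans(samples, k=3, iterations=20, seed=0):
    # samples: list of feature vectors, e.g. (packet delivery ratio, throughput)
    random.seed(seed)
    centroids = random.sample(samples, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in samples:
            # assign each sample to its nearest centroid
            idx = min(range(k), key=lambda i: euclidean(s, centroids[i]))
            clusters[idx].append(s)
        for i, members in enumerate(clusters):
            if members:  # recompute each centroid as the mean of its members
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters

# Per-window samples for nodes 0 and 1: (packet delivery ratio, throughput in Kbps)
samples = [(98, 1000.89), (95, 998.56), (97, 1054.93), (98, 980.89),
           (94, 1000.89), (93, 1001.01), (90, 968.89), (93, 997.89), (85, 890.09)]
centroids, clusters = kmeans(samples)
# Rank clusters by throughput: highest centroid = Best, lowest = Bad
order = sorted(range(len(centroids)), key=lambda i: centroids[i][1], reverse=True)
for label, i in zip(("Best", "Good", "Bad"), order):
    print(label, "centroid:", centroids[i], "members:", len(clusters[i]))
```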

3 Previous Result and Analysis


3.1 Simulation Result
For parallelization, the Message Passing Interface (MPI) is used (MPI is a standard for distributed-memory, message-passing parallel/distributed computing). MPI is well known for its simplicity, modularity and portability, which give the programmer complete control over the parallelism. For the network simulation, the NS-2 simulator is used. For the test environment, a network topology with 6 nodes (one source and 5 sinks) is taken, where the topology keeps changing because of node mobility. The experiment was conducted using 5 machines for parallelization, each with an Intel Core i5, 2.67 GHz, 4 GB RAM and 64-bit Linux. OpenMPI is used for communicating information between the master and the slaves: the master assigns data to the slaves registered with it, each slave performs the assigned task and returns the result to the master, and in this way the complete task is accomplished. The experiment involves the following steps (a master-worker sketch is given after the list):
1. Collect information about the current status of different QoS parameters (packet delivery ratio, throughput in Kbps, delay in ms, number of packets dropped, number of packets sent and number of packets received) through software agents which interact with the different layers of the OSI protocol stack.
2. Gather the information every t seconds (in our experiment 60 seconds, which we call a window) and store it in a file so that the master can use it for further processing. The simulation was run for ten minutes and the scenario consists of random movements of nodes within a 1000m x 1000m area with the specified maximum speed.
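The master-worker distribution described above could be sketched as follows with mpi4py, an assumption made for illustration (the original implementation uses OpenMPI directly; the file names and the cluster_file helper are hypothetical). The master hands per-node-pair trace files to the registered workers in round-robin fashion; each worker runs the K-Means map task on its files and returns the result.

```python
from mpi4py import MPI  # assumes an MPI environment, e.g. launched with: mpirun -n 5 python script.py

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Hypothetical per-node-pair trace files produced by the software agents
pair_files = ["pair_0_1.txt", "pair_0_2.txt", "pair_0_3.txt", "pair_0_4.txt", "pair_0_5.txt"]

def cluster_file(path):
    # Placeholder for the K-Means map task run on one node pair's windows
    return {"file": path, "clusters": "Best/Good/Bad centroids"}

if rank == 0:
    # Master: assign files to workers 1..size-1 in round-robin order
    for i, path in enumerate(pair_files):
        comm.send(path, dest=1 + i % (size - 1), tag=1)
    for dest in range(1, size):
        comm.send(None, dest=dest, tag=0)      # tag 0 = no more work
    results = [comm.recv(source=MPI.ANY_SOURCE, tag=2) for _ in pair_files]
    print("map-phase results:", results)
else:
    # Worker: process files until the master signals completion
    while True:
        status = MPI.Status()
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == 0:
            break
        comm.send(cluster_file(task), dest=0, tag=2)
```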
A. Simulation Results
Results corresponding to the information obtained by the software agents for the pair of nodes 0 and 1 are shown in Table 1, where we find that Packets Received is not equal to Packets Sent + Packets Lost, since some of the packets generated in the current window will only be received in the next window. Information corresponding to other pairs obtained from the software agents is not given, since the results were almost similar to Table 1. The results obtained by the Map phase, where clusters are created, are shown in Table 2. Once the clustering is done, the master assigns the Reduce phase to all workers, where each worker compares each entry of Table 1 with the cluster average values to decide where it falls, i.e., in the Best, Good or Bad cluster. Results of the Reduce phase are shown in Table 3.
Table 1. Results obtained from software agents for nodes 0 and 1

Packet delivery ratio | Throughput (Kbps) | Delay (ms) | Packets lost | Packets sent | Packets received
98 | 1000.89 | 65 | 12 | 668 | 650
95 |  998.56 | 56 | 10 | 670 | 655
97 | 1054.93 | 65 |  6 | 640 | 629
98 |  980.89 | 75 | 24 | 658 | 650
94 | 1000.89 | 35 | 20 | 629 | 640
93 | 1001.01 | 55 | 22 | 668 | 650
90 |  968.89 | 85 | 06 | 635 | 630
93 |  997.89 | 35 | 14 | 647 | 650
85 |  890.09 | 55 | 23 | 640 | 630

Table 2. Clusters obtained using K-Means for nodes 0 to 1

Cluster | Packet delivery ratio | Throughput (Kbps) | Delay (ms) | Packets lost | Packets sent | Packets received
Cluster 0 (Best) | 97 | 1054.93 | 65 | – | 640 | 629
Cluster 1 (Good) | 95 | 989.355 | 70.16 | 14.16 | 661.1 | 647.5
Cluster 2 (Bad) | 90.66 | 962.9 | 41.66 | 19 | 638 | 630

This is the intermediate result obtained from the Map phase. During the Map phase the master divides the work. For example, if there are 5 nodes and only 2 workers registered with the master, the master first assigns (0-1, 0-2) to worker 1, where 0-1 is the information obtained when the simulation was run between nodes 0 and 1; similarly (0-3, 0-4) is assigned to worker 2. The master then checks whether any files remain; since path 0-5 is yet to be assigned, it is given to worker 1 again, in round-robin fashion. Each worker that receives such information performs the K-Means algorithm and creates the clusters shown in Table 2. From the Map-phase results the master assigns the Reduce phase to all workers, which decide to which cluster each hop belongs; a sketch of this Reduce step is given below. The Reduce-phase results are shown in Table 3.
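A minimal sketch of the Reduce step just described, assuming the centroids of Table 2 and the per-window samples of Table 1 are available as plain tuples: each window sample of a hop is assigned to the nearest centroid, the votes per cluster are counted, and the majority gives the remark for that hop. Only two features are used here for brevity, so the vote counts can differ from Table 3 even though the remark agrees; the helper names are illustrative assumptions.

```python
def nearest_cluster(sample, centroids):
    # Index of the centroid closest to the sample (Euclidean distance)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(range(len(centroids)), key=lambda i: dist(sample, centroids[i]))

def reduce_phase(samples, centroids, labels=("BEST", "GOOD", "BAD")):
    # Count how many window samples of a hop fall into each cluster
    if not samples:
        return [0] * len(centroids), "NOT REACHABLE"
    votes = [0] * len(centroids)
    for s in samples:
        votes[nearest_cluster(s, centroids)] += 1
    return votes, labels[votes.index(max(votes))]

# Centroids for (packet delivery ratio, throughput) taken from Table 2
centroids = [(97, 1054.93), (95, 989.355), (90.66, 962.9)]
# Per-window samples for hop 0 to 1 taken from Table 1
hop_0_1 = [(98, 1000.89), (95, 998.56), (97, 1054.93), (98, 980.89),
           (94, 1000.89), (93, 1001.01), (90, 968.89), (93, 997.89), (85, 890.09)]
print(reduce_phase(hop_0_1, centroids))   # majority vote gives 'GOOD' for this hop
print(reduce_phase([], centroids))        # no samples -> 'NOT REACHABLE'
```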



Table 3. Reduce phase of K-Means for node 0 to all nodes

HOPS | CLUSTER 0 (BEST) | CLUSTER 1 (GOOD) | CLUSTER 2 (BAD) | REMARKS
0 to 1 | 3 | 5 | 2 | GOOD
0 to 2 | 5 | 1 | 4 | BEST
0 to 3 | 3 | 2 | 5 | BAD
0 to 4 | 8 | 1 | 1 | BEST
0 to 5 | 0 | 0 | 0 | NOT REACHABLE

From the above readings we are able find that the hop 0 to 2 and 0 to 4 are optimal
available hops between 0 to all nodes and node 5 is not reachable from 0. Hence
by running the K-Means for rest of all available path obtained by running the
simulation following clusters can be obtained for the paths available; hence the Table
IV shows the result of both map and reduce phase. Following figure 2 shows how the
above stimulation is done to find the effective optimal available path from source 0 to
destination 5.

Fig. 2. Optimal Path from Source to Destination

3.2 Identifying Link Stability


Quality of Service (QoS) based routing is defined as a "routing mechanism under which paths for flows are determined based on some knowledge of resource availability in the network as well as the QoS requirements of flows." The main objectives of QoS-based routing are: dynamic determination of feasible paths accommodating the QoS of a given flow under policy constraints such as path cost and provider selection; optimal utilization of resources for improving total network throughput; and graceful performance degradation during overload conditions, still giving better throughput. QoS routing strategies are classified as source routing, distributed routing and hierarchical routing. QoS-based routing becomes challenging in MANETs, as nodes should keep up-to-date information about link status. Also, due to the dynamic nature of MANETs, maintaining precise link state information is very difficult. Finally, the reserved resources may not be guaranteed because of path breakage caused by mobility or by power depletion of the mobile hosts. QoS routing should rapidly find a feasible new route to recover the service.
Due to the maturity and improving performance of ad-hoc and wireless networks, their demand and applications are growing at a very rapid pace, which makes the maintenance of service a challenging task. The key challenge is to cope with the frequently changing network topology. However, many applications require stable connections to guarantee a certain degree of QoS. In access networks, access point handovers may disrupt the data transfer. In addition, service contexts may need to be transferred to the new access points, introducing additional overhead and delays to the connection.
Link stability helps in establishing stable paths between connection peers from one end to the other. Re-routing is especially costly in these infrastructure-less networks, since it usually results in (at least partly) flooding the network. The stability of a link is given by its probability to persist for a certain time span, which is not necessarily linked with its probability to reach a very high age. Little work has been published so far on this topic. The related concept of signal stability, well known from cellular networks, has been used to find the right time for a handover. Variations in the received signal strength may hint at the movement pattern of the connection peers and thus allow an estimation of a probable connection loss. However, received signal strength is largely dependent on actual radio conditions, and due to fading effects those measurements are subject to large fluctuations.
Here we are able to eliminate those pairs of nodes which hardly ever contact each other, by which the total time taken for prediction of the optimal path from source to destination can be decreased considerably. When the simulation was run we obtained Table 4, where a sample simulation result is shown for every one-second reading to check whether a link exists between a pair of nodes; if it exists in that time period it is marked as Yes, otherwise as No:
Table 4. Simulation for finding link stability

Time (s) | 1 to 4 | 2 to 6 | 3 to 4 | 3 to 6 | 4 to 3 | 4 to 5 | 4 to 6 | 5 to 6
0 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
1 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
1 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
2 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
3 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
4 | Yes | Yes | No | Yes | Yes | Yes | No | Yes
5 | Yes | Yes | No | Yes | Yes | Yes | No | Yes

From the above remarks we notice that, out of the 8 exemplary node pairs, the time of contact of two pairs, (2 to 6) and (4 to 6), is very poor, hence their link stability is very poor, whereas the link stability of the remaining pairs is identified as good. The network simulation therefore needs to be run only for those pairs whose link stability is good; by doing so we reduce the time required to extract the information (a sketch of this filtering step is given after Table 5).



Table 5. Table of Contact

PATHS | Yes (interval) | No (interval) | Time of Contact (Yes) | Time of Contact (No) | Remark
1 to 4 | 0 to 29 | 30 to 31 | 30 | 01 | YES
2 to 6 | 0 to 12 | 13 to 31 | 13 | 18 | NO
3 to 4 | 0 to 27 | 28 to 31 | 28 | 03 | YES
3 to 6 | 0 to 24 | 25 to 31 | 25 | 06 | YES
4 to 3 | 0 to 31 | NIL | 32 | 00 | YES
4 to 5 | 0 to 31 | NIL | 32 | 00 | YES
4 to 6 | 31 | 0 to 30 | 01 | 31 | NO
5 to 6 | 0 to 31 | NIL | 32 | 00 | YES
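A minimal sketch of how the time of contact and the stability remark of Table 5 could be derived from the per-second contact log of Table 4, assuming the log is available as a list of booleans per node pair. The threshold (half of the observation window) and the helper names are illustrative assumptions, not the paper's exact criterion.

```python
def time_of_contact(log):
    # log: per-second contact samples for one node pair, True = link present
    yes = sum(1 for present in log if present)
    return yes, len(log) - yes

def stable_pairs(contact_logs, threshold_ratio=0.5):
    # Keep only pairs whose contact time covers at least threshold_ratio of the window
    stable = []
    for pair, log in contact_logs.items():
        yes, no = time_of_contact(log)
        if log and yes / len(log) >= threshold_ratio:
            stable.append(pair)
    return stable

# Hypothetical 32-second contact logs, shaped like the rows of Table 5
contact_logs = {
    "1 to 4": [True] * 30 + [False] * 2,
    "2 to 6": [True] * 13 + [False] * 19,
    "4 to 6": [False] * 31 + [True] * 1,
    "4 to 3": [True] * 32,
}
print(stable_pairs(contact_logs))
# ['1 to 4', '4 to 3'] -> only these pairs are passed on to the K-Means stage
```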

3.3 Comparison of the Results

Figures 5 to 8 show the comparative readings obtained when the algorithm was run for paths from source 0 to all the nodes in the cluster using 2 to 5 machines, respectively. Here we can observe that the proposed approach takes less time to generate the optimal path than the previous one. Figure 5 shows the comparative study with and without the link stability technique for the readings obtained when the algorithm was run for hops from source 0 to all the nodes in the cluster using 2 machines. We can observe that paths 0 to 1 and 0 to 2 have taken more time, since the points in those clusters are badly distributed.

Fig. 5. Result of the work on 2 machines

Figure 6 shows the comparative readings obtained when the algorithm runs on 3 machines, with and without the link stability technique. Figure 7 shows the comparative readings obtained when the work is run using 4 workers.


Fig. 6. Result of the work on 3 machines

Fig. 7. Result of the work on 4 machines

Figure 8 was obtained when the algorithm was run on 5 machines. Here we notice the same pattern as in the above results, but the total time taken with 5 workers is much less than in the previous results, and the algorithm becomes even more efficient when it is run with the link stability technique.


Fig. 8. Result of the work on 5 machines

4 Conclusion
This paper demonstrates the advantage of including link stability prediction when predicting the optimal path from source to destination, extending our previous work, which applied the Map-Reduce technique to parallelize the K-Means algorithm for finding the optimal effective path by creating clusters in a self-aware MANET, where each hop along the path is chosen based on whether it falls in the Best, Good or Bad cluster. This paper also attempts to add self-awareness to the MANET through software agents that interact with the layers of the protocol stack in order to find the status of the different QoS parameters.

References
1. Chakrabarti, S., Mishra, A.: QoS Issues in Ad Hoc Wireless Networks. IEEE Communications Magazine 39(2) (February 2001)
2. Manimaran, G., Siva Ram Murthy, C.: An Efficient Dynamic Scheduling Algorithm for Multiprocessor Real-Time Systems. IEEE Transactions on Parallel and Distributed Systems 9(3), 312–319 (1998)
3. Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. In: OSDI 2004 (2004)
4. Nevison, C.H.: Parallel Computing in the Undergraduate Curriculum. Colgate University, IEEE (December 1995)
5. Meira Jr., W., Zaki, M.: Fundamentals of Data Mining Algorithms
6. Hartigan, J.A.: Clustering Algorithms. John Wiley & Sons, Inc., New York (1975)
7. Lawniczak, A.T., Di Stefano, B.N.: Computational Intelligence Based Architecture for Cognitive Agents. In: International Conference on Computational Science, ICCS 2010. Procedia Computer Science, vol. 1, pp. 2227–2235. Elsevier, Amsterdam (2010)
8. Gelenbe, E.: Steps towards Self Aware Networks. Communications of the ACM (7) (2009)
9. Asokan, R.: A Review of Quality of Service (QoS) Routing Protocols for Mobile Ad Hoc Networks. In: ICWCSC 2010. IEEE, Los Alamitos (2010)
10. Chen, L., Heinzelman, W.B.: A Survey of Routing Protocols that Support QoS in Mobile Ad Hoc Networks. IEEE Network (November/December 2007)

Clouds Infrastructure Taxonomy, Properties, and Management Services

Imad M. Abbadi
Department of Computer Science
University of Oxford
imad.abbadi@cs.ox.ac.uk

Abstract. Moving the current Clouds' infrastructure to a trustworthy Internet-scale critical infrastructure requires supporting the infrastructure with automated management services. Thereby, the infrastructure provides, as described by NIST, minimal management effort or service provider interaction [11]. The initial step in this direction requires understanding how experts in the domain manage Clouds' infrastructure, and how the infrastructural components are interlinked with each other. These are the main contributions of this paper; i.e., we propose a Cloud taxonomy focusing on infrastructure components' interaction and management, provide a real-life scenario of a critical application architecture using the proposed taxonomy, and then derive the management services using the provided scenario. The public Cloud model supports very limited features in comparison with other models, e.g., the community Cloud. In this paper we analyze the management services at a community Cloud to identify the ones which require automation in order to be adopted at a public Cloud.
Keywords: Cloud taxonomy (3-D view), Cloud infrastructure management, infrastructure properties, self-managed services.

1 Introduction

Cloud computing is a relatively new term in IT (it started in 2006 with Amazon EC2 [3]), which has emerged from commercial requirements and applications [9]. Cloud supports three main service types: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS) [11]. The Cloud infrastructure is complex and heterogeneous in nature; various Cloud components provided by different vendors need to communicate in an organized and well-managed way. Cloud infrastructure management is mainly provided by internal employees and contractors. There are different tools which help employees to manage the Cloud infrastructure, and which require human intervention for supporting the infrastructure. One of the main potential Cloud features is the provision of fully automated services (we refer to such services as self-managed services), which provide the Cloud infrastructure with exceptional capabilities enabling it to automatically manage the infrastructure and take appropriate actions in emergencies [5,9]. Achieving self-managed services is not an easy task considering the Clouds' infrastructure complexity and heterogeneity. This would require careful understanding of how experts in the domain manage the infrastructure, and also requires analyzing Clouds' infrastructure management tools and components' interaction.
Cloud has different deployment models, e.g. private, community, and public Cloud [9]. A public Cloud (e.g. Amazon EC2 [3]), as its name indicates, can be used by anyone without a prior relation with the Cloud provider. On the other hand, a private Cloud is mainly used by a specific organization, and a community Cloud is used by collaborating organizations which share a common mission, goals, etc. The public Cloud model has many more customers than the other models. Therefore, public Cloud services should be automated to hide the complexity of the infrastructure and to increase users' service availability and reliability. Fully automated management services are not yet available, at the time of writing, for many Cloud services which are required by different types of applications, including but not limited to critical applications [2]. Such lack of automated management for many services forces public Cloud providers to mainly support basic functions which can be automated at the virtual layer. These cover the needs of casual users, small businesses, and uncritical applications. The other Cloud models, on the other hand, which have a limited number of users, support a wide range of services. Such services are customized for the needs of a group of organizations, and require much more human intervention in comparison with the ones provided by a public Cloud. One of the objectives of this paper is to identify the services which require automation. Automating such services is important for the potential Cloud.
1.1 Cloud Evolution

Prior to the virtualization era, customers used to provide their application requirements to enterprise architects. Enterprise architects would then provide an architecture which was typically designed for a specific customer's application needs and requirements. This caused a huge waste of resources, e.g. computational resources and power consumption. Virtualization technology, which is the foundation of the Cloud infrastructure, brings tremendous advantages in terms of consolidating resources; however, it is also associated with other problems, e.g. security and privacy problems [2]. The virtualization era changed the mentality of enterprise architects, as the relation between users and their physical resources is no longer one-to-one. This raises a big challenge of how such a consolidated architecture can satisfy users' dynamic requirements and unique application nature. Enterprise architects addressed this by studying the environment they had inherited prior to the virtualization era, and they found that different architectures have many similarities. Such similarities enable enterprise architects to split the infrastructure into groups. Each set of groups can be architected and associated with certain properties, which enable such a group to address the common requirements of certain categories of applications. For example, a group can be allocated for applications: i) that can tolerate single points of failure; ii) that require full resilience with no single point of failure; iii) that are highly computational; or iv) a group can be allocated for archiving systems, etc.


The second challenging question is how such grouping, which is associated with almost static properties, can fit the dynamic users' requirements and their application nature. Enterprise architects realized that virtualization can be fine-tuned and architected to support the dynamic properties which are not already provided by the physical groups' static properties. In other words, the combination of physical properties and the virtual layer's dynamism is used to support customers' expectations. Enterprise architects found that using the virtual layer can even provide many automated features that cannot easily be provided at the physical layer. Automated management features at the virtual layer are the main key factor behind the evolution of Cloud computing. However, we are still at an early stage of providing fully automated services, owing to many limitations which we partially discuss in [1,2]. The lack of full automation restricts public Cloud providers from supporting many features already provided, manually, by community and private Cloud providers, as discussed in this paper.
1.2 Related Work

In this paper we continue our previous work in [2], which discusses the misconceptions about Cloud computing, introduces the Cloud layering concept, and derives the main security challenges in the Cloud. In this paper we start by proposing a Cloud taxonomy, and then derive management services and the factors affecting their actions. The factors include both infrastructure properties and user properties. We have previously defined self-managed services and the security challenges for providing such services in an extended abstract [1]; however, the foundations of our previous work are clarified in this paper.
There is little related work which analyzes the Cloud environment (see, for example, [6,21]). These works mainly focus on analyzing Cloud properties, benefits, and services from the user's perspective. However, they do not discuss the Cloud infrastructure taxonomy, and do not discuss management services and the properties they require when managing Cloud infrastructure. Our proposed taxonomy does not contradict or even replace previously proposed ones, which mainly focus on a different angle from ours. It is rather the opposite, as our taxonomy completes the picture of such work, which considers the physical layer as a black box and does not discuss the management of Cloud infrastructure. Autonomic computing [8] is not related to our work, as it is mainly concerned with the management of physical resources.
1.3 Organization of the Paper

This paper is organized as follows. Section 2 proposes a Clouds' infrastructure taxonomy; in this section we derive the key properties that are required by management services. Section 3 provides a real-life scenario for an application that is currently deployed at a community Cloud provider; it then motivates the need for automated management services and derives the required services. Finally, we provide a conclusion and a research agenda in Section 4.


2 Taxonomy of the Cloud

In this section we propose a taxonomy of the Cloud focusing on the relationships and interactions amongst Cloud components. In Section 3.1 we illustrate the taxonomy in the context of a scenario. We use the taxonomy to derive Cloud infrastructural properties, which are one of the key factors when providing automated management services.
2.1 Cloud Infrastructure Taxonomy

A Cloud infrastructure is analogous to a 3-D cylinder, which can be sliced horizontally and/or vertically (see Figure 1). We refer to each slice using the keyword layer. A layer represents Cloud components that share common characteristics. The layering concept helps in understanding the relations and interactions amongst Cloud components. We use the nature of the component (i.e. physical, virtual, or application) as the key characteristic for horizontal slicing of the Cloud. For vertical slicing, on the other hand, we use the function of the component (i.e. server, network, or storage) as the key characteristic.

Fig. 1. Cloud Taxonomy: 3-D View


As illustrated in Figure 1, the Vertical Layer consists of three layers: the Storage Layer, the Server Layer, and the Network Layer. Each layer is organized into sub-layers; i.e. we have network sub-layers, storage sub-layers, and server sub-layers. Each sub-layer provides specific properties to serve the needs of the wide range of Cloud user requirements. Server, network and storage sub-layers are organized into multiple collaborating sub-layers. Sub-layers within each collaborating sub-layer, and their components, are carefully selected, interconnected, and even physically positioned to support the overall collaborating sub-layer properties.
Virtual resources are then created and grouped based on users' application requirements. Multiple related groups join a collaborating group based on user requirements and application nature (e.g. dependency amongst application components). Sub-layers and groups are associated with properties and policies, which are of most importance for managing the infrastructure (and especially for the provision of automated self-managed services). Sub-layer properties and policies are infrastructure related, while group properties and policies are related to users' application requirements. Each group is hosted at a collaborating sub-layer whose physical properties best match the user properties.
Figure 1 also illustrates the relation between horizontal and vertical layers. We identify a Horizontal Layer to be the parent of the physical, virtual and application layers. Each Horizontal Layer contains Domains; i.e. we have Physical Domains, Virtual Domains, and Application Domains. A Domain represents related resources which enforce a Domain-defined policy. Physical Domains are related to the Cloud infrastructure and are, naturally, associated with infrastructure properties and policies. A Physical Domain in the Horizontal Layer is equivalent to a Collaborating Sub-Layer (in Vertical Layer terms).
An Application Domain is composed of the components of a single application. A Virtual Domain is then created to serve the needs of an Application Domain. Each Virtual Domain is composed of groups of virtual resources. A group would typically run and manage a specific component within an Application Domain. Each Virtual Domain group is associated with user properties which are related to the application component to be served by the group. Such properties help in directing the management services when providing automated self-managed services for the group. For example, such user properties help management services to decide on: i) the minimum and maximum resources allocated to a virtual machine within a group (vertical scalability); ii) the minimum and maximum number of virtual machines that can be allocated and deallocated within each group based on load/incidents (horizontal scalability); iii) the right Physical Domain that can serve the needs of the application; and iv) how to react based on user requirements during incidents. A Virtual Domain in the Horizontal Layer is equivalent to a Collaborating Group (in Vertical Layer terms).
In this paper we mainly focus on the vertical slicing of the physical and the virtual layers, as our interest is in the IaaS Cloud type. Also, our discussion below follows the vertical slicing, as we are mainly interested in deriving infrastructure properties; a minimal data-model sketch of the taxonomy follows.
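The following minimal sketch is one possible way to represent the taxonomy just described as a data model: horizontal layers contain domains, vertical layers are split into collaborating sub-layers carrying static infrastructure properties, and virtual domain groups carry user properties. All class and field names here are illustrative assumptions, not part of the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class CollaboratingSubLayer:
    # Vertical-layer unit: server/network/storage sub-layers with static properties
    name: str
    function: str                     # "server", "network" or "storage"
    properties: Dict[str, str] = field(default_factory=dict)

@dataclass
class Group:
    # Virtual-domain unit: virtual resources serving one application component
    name: str
    user_properties: Dict[str, str] = field(default_factory=dict)
    hosted_on: CollaboratingSubLayer = None

@dataclass
class Domain:
    # Horizontal-layer unit: physical, virtual or application domain
    name: str
    layer: str                        # "physical", "virtual" or "application"
    groups: List[Group] = field(default_factory=list)

# Example: a DBMS group hosted on a sub-layer with no single point of failure
dbms_sublayer = CollaboratingSubLayer("DBMS (primary)", "server",
                                      {"redundancy": "no single point of failure"})
dbms_group = Group("editorial DBMS group",
                   {"min_vms": "2", "scalability": "horizontal"}, dbms_sublayer)
virtual_domain = Domain("editorial virtual domain", "virtual", [dbms_group])
print(virtual_domain)
```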


Network Layer. A network layer is the backbone that provides the communication medium between the Cloud's components. The communication medium can be either public or private. By public we mean communication occurs over the Cloud's local or wide area network. Private, on the other hand, means communication occurs over a physically dedicated network, which is isolated from the public network. Such a private network is especially set up between a set of components to perform a specific function, e.g. (a.) connecting a server to dedicated storage, as in the case of a Storage Area Network (SAN) [20], and (b.) software clustering, as in the case of Real Application Clusters (RAC), which requires the servers that are members of the RAC to have a private network [16].
At an abstract level the communication amongst Cloud components is organized within defined boundaries that follow a process workflow. We refer to such communication as horizontal and vertical communication, which are described as follows (see Figure 2).
Horizontal Communication. In this type Cloud entities communicate as peers either inside a sub-layer or across sub-layers. This type of communication does not span outside layer boundaries. We illustrate what we mean by horizontal communication with the following examples: (a.) horizontal communication can be realized when storage systems are self-replicated in such a way that one storage entity regularly copies changes of its physical blocks to a standby storage entity; and (b.) when Virtual Machines (VMs) within a sub-layer collaborate in a RAC [16] and need to exchange messages to synchronize shared memory (e.g. memory fusion [16]), this is also a form of horizontal communication between VMs.
Vertical Communication. In this type Cloud entities communicate with other Cloud entities in the same or a different layer following a process workflow in either the up-down or the down-up direction. This typically works as follows: an upper sub-layer component runs a process which generates sub-processes that should run at a lower sub-layer, following a process workflow. The lower sub-layer could be in the same or a different layer from the upper sub-layer. The lower sub-layer executes the sub-processes and then sends the result back to the upper sub-layer. We provide the following example: in a multi-tier application the front-end in the Cloud represents a load-balancing component that receives users' requests and distributes them across the middle-tier sub-layer. The middle-tier sub-layer, which runs the application logic, processes the request and generates sub-requests that are sent to the backend sub-layer. The backend sub-layer, which runs the DB instance, processes the sub-request, generates sub-sub-requests and sends them to the storage sub-layer. These steps represent the up-down communication channel. Each layer in turn sends its response back in the opposite direction, which represents the down-up communication channel.
There are many other important network properties, which we do not discuss in this section for space limitations, e.g. network speed between components, network nature, any restrictions affecting information flow (as in the case of a firewall stopping certain types of traffic), network topology, etc.
There are many other important network properties, which we do not discuss
in this section for space limitations, e.g. network speed between components,
network nature, any restrictions aecting information ow as in the case of a
rewall stopping certain type of trac, network topology, etc.


Fig. 2. Conceptual Models of Cloud Layers

Storage Layer. A storage layer is composed of sub-layers, which consist of storage components. A storage component is the basic component¹ that stores Cloud data and/or provides file system services. Storage can be either local storage or network storage. Local storage is connected directly to server(s) via a private network (e.g. SAN), while network storage means servers are connected to the storage over a public network (e.g. Network-Attached Storage (NAS) [19]). Network storage can provide Cloud users with storage as a service (e.g. Amazon S3 [4]); however, local storage does not communicate directly with users and requires a Server Layer. Each type of storage (i.e. local and network storage) has many different categories, which are outside the scope of this paper.
An enterprise architect decides on storage specifications by considering many factors (e.g. the purpose of the storage, i.e. file system or block storage, and the storage usage, e.g. a cluster of servers). Storage components communicate horizontally, for example when replicating data at the physical level, i.e. storage-to-storage. The storage layer/sub-layer communicates vertically with other layers. Part of Figure 2 provides a conceptual model of the Storage Layer.
There are many important properties of the storage layer, which include: size, speed, protection measures (e.g. hardware RAID), reliability, connectivity with the servers and its speed (private or public network), physical distance from the attached servers (for a private network), etc. Again, we do not discuss these here for space limitations (we outline some of them in Section 3.1).
Server Layer. The server layer is composed of multiple sub-layers. Each sub-layer is composed of a set of physical servers. Physical servers provide computational resources to Cloud users (e.g. CPU, memory, network, and storage). Each server's hardware resources are managed by a hypervisor (a minimized operating system providing the minimum components which enable the hypervisor to virtualize hardware resources for guest operating systems [13]). Part of Figure 2 provides a conceptual model of the Server Layer.
The hypervisor runs (or is sometimes the same as) the Virtual Machine Manager (VMM). The VMM manages the Virtual Machines (VMs) running on the physical server [10,13]. A VM provides an abstraction of CPU, memory, network and storage resources to Cloud users in such a way that a VM appears to a user as an independent physical machine. Each VM runs its own Operating System (OS), which is referred to as the guest OS. The guest OS runs the VM-specific applications. VMs running on the same physical platform are independent, share the platform resources in a controlled manner, and should not be aware of each other; i.e. a VM can be shut down, restarted, cloned, and migrated without affecting other VMs running on the same physical platform.

¹ By basic component we mean an integrated component (e.g. EMC storage products [7]) and not a simple hard disk or physical block.
2.2 Virtual Control Center

In this section we identify the component that can take the role of providing automated management services. The Cloud infrastructure is composed of an enormous number of components, which are not easy to manage manually. There are different tools which help Cloud employees to manage the Cloud infrastructure. These cover virtual resource management, physical resource management, network management, server management, etc. In this paper we are mainly concerned with virtual resource management tools, which manage virtual resources and their interaction with physical resources. There are many tools for managing virtual resources, provided by different manufacturers (e.g. VMware's tool is referred to as vCenter [18], and Microsoft's tool is referred to as System Center [12]). Many open source tools have also been developed recently (e.g. OpenStack [15] and OpenNebula [14]), which support additional services. In this paper, for convenience, we refer to such tools, which are used to manage virtual resources, as the Virtual Control Centre (VCC). In our previous work ([2]) we outlined VCC, which helped us to derive the Cloud's unique security challenges. In this paper we discuss it considering the provided taxonomy and identify the factors that affect its operation.
VCC establishes communication channels with physical servers to manage the Cloud's Virtual Machines (VMs). VCC establishes such channels by communicating with the VMM running on each server. VCC and VMM regularly exchange heartbeat signals ensuring they are up and running. The VMM regularly communicates VM-related status (failure, shutdown, etc.) to VCC, enabling the latter to report the status to system administrators. Such management helps in maintaining the agreed Service Level Agreements (SLAs) and Quality of Service (QoS) with customers. In addition, and probably most importantly, VCC provides system administrators with easy-to-use tools to manage virtual resources across the Cloud infrastructure. This is very important considering the Cloud's complex and heterogeneous nature. For example, if a physical machine fails (e.g. due to hardware failure), where should the VMs running on top of the failed physical machine move? Also, once the failed physical machine is recovered, should the VMs return to their original hosting server or should they stay at the guest hosting server? Such decisions are managed by VCC based on policies predefined by enterprise architects and managed by system administrators using VCC. A minimal sketch of such a heartbeat-driven failover decision follows.
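As an illustration of the heartbeat-driven failover decision just described, the following minimal sketch monitors VMM heartbeats and, once a host is declared failed, moves its VMs to another host according to a simple predefined policy. The heartbeat timeout, the least-loaded-host policy, and all names are illustrative assumptions; they do not describe the behaviour of any specific VCC product.

```python
import time

HEARTBEAT_TIMEOUT = 15.0   # seconds without a heartbeat before a host is declared failed

hosts = {
    # host name -> last heartbeat time and the VMs it currently runs
    "host-a": {"last_heartbeat": time.time(), "vms": ["vm1", "vm2"]},
    "host-b": {"last_heartbeat": time.time(), "vms": ["vm3"]},
}

def record_heartbeat(host):
    # Called whenever a VMM heartbeat message arrives for a host
    hosts[host]["last_heartbeat"] = time.time()

def failed_hosts(now=None):
    now = now or time.time()
    return [h for h, s in hosts.items()
            if now - s["last_heartbeat"] > HEARTBEAT_TIMEOUT]

def failover(failed):
    # Policy: move VMs of a failed host to the surviving host with the fewest VMs
    survivors = [h for h in hosts if h not in failed]
    for h in failed:
        for vm in hosts[h]["vms"]:
            target = min(survivors, key=lambda s: len(hosts[s]["vms"]))
            hosts[target]["vms"].append(vm)
            print(f"restarting {vm} from {h} on {target}")
        hosts[h]["vms"] = []

# Example: host-a stops sending heartbeats, so its VMs are restarted elsewhere
hosts["host-a"]["last_heartbeat"] -= 60
failover(failed_hosts())
```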

2.3 Factors Affecting Management Services

We believe that VCC will play a major role in managing self-managed services. We now identify the factors which would affect decisions made by self-managed services.
Infrastructure Properties (Static Properties). As discussed earlier, the Cloud's physical infrastructure is very well organized and managed by multiple parties, e.g. enterprise architects, system administrators, and security administrators. These parties build the infrastructure to provide certain services and are aware of the Cloud taxonomy, which we described earlier in this section. Therefore, they define the physical infrastructure properties for each infrastructural component, sub-layer, and layer. Providing such properties to VCC is a foundation step for supporting automated management services.
User Properties (Dynamic Properties). A Cloud user interacts with the Cloud provider via the Cloud webpage and supplied APIs. This enables users to define user properties, which should cover the following for the potential Cloud:
a.) Technical Requirements. IaaS Cloud users would typically be organizations which have the expertise to provide enough information about their technical requirements in terms of VMs, storage, and network. For example, they provide the properties of applications to be hosted on VMs, e.g. DBMS instances that require high availability with no single point of failure, middle-tier web servers that can tolerate failures, highly computational applications, etc. This enables the Cloud provider to identify the infrastructural resources that best fit the user requirements.
b.) Service Level Agreement (SLA) Requirements. These specify quality control factors and other legal and operational issues on user services; for example, they define system availability, reliability, scalability (in upper/lower bound limits), and performance metrics.
c.) User-Centric Security and Privacy Requirements. Examples of these include: (i.) users need stringent assurance that their data is not being abused or leaked; (ii.) users need to be assured that the Cloud provider properly isolates VMs that run on the same physical platform from each other (i.e. the multi-tenant architecture [17]); and (iii.) users need to identify the location of data distribution and processing (which could be for legal reasons). Current Cloud providers have full control over all hosted services in their infrastructure; e.g. the Cloud provider controls who can access VMs (e.g. internal Cloud employees, contractors, etc.) and where user data can be hosted (e.g. server type and location). The user has very limited control over the deployment of his services, has no control over the exact location of the provided services, and has no option but to trust the Cloud provider to uphold the guarantees provided in the SLA.
Infrastructure Policy. Policies should be defined by authorized Cloud employees and associated with layers and sub-layers to control the behaviour of self-managed services.
Changes and Incidents. These represent changes in: user properties (e.g. security/privacy settings), infrastructure properties (e.g. component reliability, component distribution across the infrastructure, redundancy type), infrastructure policy, and other changes (increase/decrease of system load, component failure, network failure, etc.).
Management services should automatically manage the Cloud environment by finding the best match of user properties with infrastructure properties while considering the infrastructure policy. For example, a sub-layer would be associated with a set of infrastructure properties defining many important factors related to the sub-layer itself and how it is related to other sub-layers. Also, groups hosted at each sub-layer are associated with user properties. These enable automated management services to take proper actions in emergencies, as such services would be provided with the architectural factors and users' requirements. A minimal sketch of such property matching follows.
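The following minimal sketch illustrates one way such a best-match decision could look: each collaborating sub-layer advertises static infrastructure properties, a group carries dynamic user properties, and the management service scores candidate sub-layers by how many required properties they satisfy. The property names and the scoring rule are illustrative assumptions only.

```python
def match_score(user_props, infra_props):
    # Count how many user-required properties the sub-layer satisfies
    return sum(1 for key, required in user_props.items()
               if infra_props.get(key) == required)

def place_group(user_props, sublayers):
    # Pick the collaborating sub-layer with the highest score; None if nothing matches
    best = max(sublayers, key=lambda name: match_score(user_props, sublayers[name]))
    return best if match_score(user_props, sublayers[best]) > 0 else None

# Hypothetical static infrastructure properties per collaborating sub-layer
sublayers = {
    "DBMS (primary)":        {"redundancy": "no-spof", "raid": "1+0", "location": "primary"},
    "middle-tier (primary)": {"redundancy": "tolerates-failure", "location": "primary"},
}
# Dynamic user properties of an editorial DBMS group (write-heavy, no single point of failure)
user_props = {"redundancy": "no-spof", "raid": "1+0"}
print(place_group(user_props, sublayers))   # -> 'DBMS (primary)'
```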

3 Deriving Self-managed Services

The first subsection provides a real-life scenario for an application that is currently deployed at a community Cloud provider; in it we map the scenario onto the provided taxonomy. In the second subsection we use the scenario to identify and motivate the need for automated self-managed services.
3.1 Multi-tier Application Scenario at the Cloud

We have architected and deployed the scenario provided in this section for a production environment supporting an editorial workflow. The editorial workflow depends on a weather forecast application. For simplicity we assume both the editorial and the weather applications have similar architectural requirements. The system is architected as a multi-tier application, which is deployed across the community Cloud infrastructure (primary and secondary locations), to achieve the following user properties: high system availability and reliability, disaster recovery (DR) to support business continuity, high resilience with no single point of failure, a transaction profile that is more write than read, system scalability (i.e. minimum/maximum resources that can be allocated/deallocated when the load increases/decreases), and security properties. We next provide a simplified architecture based on the user properties and the discussed Cloud taxonomy (Figure 3 illustrates the overall architecture).
Application Middle-tier Groups. These virtual layer groups run the application business logic functions, which provide services to end-users. We require two groups, as illustrated in Figure 3: the first we refer to as the weather middle-tier group, which runs the weather middle-tier application component, and the second we refer to as the editorial middle-tier group, which runs the editorial middle-tier application component. Both groups should be hosted using an appropriate collaborating sub-layer, as discussed later. Also, the number of VMs and their specifications within each group would depend on the expected load and user requirements, which we do not discuss in this example for simplicity. But each group should have at least two VMs, as the user requires no single point of failure: having one VM means that if it fails the system will be down while the VM gets restarted.


Fig. 3. Cloud Taxonomy Multi-Tier architecture at Community Cloud Provider

Database Management System (DBMS) Groups. The DBMS groups at the virtual layer manage the application data (e.g. storage, retrieval, and indexing). We require two groups, as illustrated in Figure 3: the first we refer to as the weather DBMS group, which hosts the weather DBMS, and the second we refer to as the editorial DBMS group, which hosts the editorial DBMS. Both groups should be hosted using an appropriate collaborating sub-layer, as discussed later. Also, the number of VMs and their specifications within each group would depend on the expected load and user requirements, which we do not provide in this example for simplicity. But, as in the case of the application middle-tier groups, each DBMS group should have at least two VMs to support no single point of failure.
Collaborating Sub-Layers. Each collaborating sub-layer at the physical layer is composed of three sub-layers: a storage sub-layer, a network sub-layer, and a server sub-layer. The server sub-layer has special properties enabling it to host the indicated type of application. The storage and network sub-layers are associated with special properties enabling them to collaborate to support the server sub-layer properties, which can address a wide range of common user requirements related to a specific category of application (e.g. DBMS with no single point of failure). The system architect should provide a resilient architecture based on both the user-supplied requirements and the Cloud's infrastructure properties. Figure 3 provides four collaborating sub-layers: (a.) collaborating sub-layer middle-tier (primary), which has properties enabling it to host middle-tier application groups and is physically located at the Cloud primary location; (b.) collaborating sub-layer middle-tier (secondary), which has properties enabling it to host middle-tier application groups and is physically located at the Cloud secondary location; (c.) collaborating sub-layer DBMS (primary), which has properties enabling it to host DBMS groups and is physically located at the Cloud primary location; and (d.) collaborating sub-layer DBMS (secondary), which has properties enabling it to host DBMS groups and is physically located at the Cloud secondary location. Collaborating sub-layers at the primary location host the groups at the primary location and act as a backup (i.e. DR) for the groups located at the secondary location. Similarly, collaborating sub-layers at the secondary location host the groups at the secondary location and act as a backup for the groups located at the primary location. We now discuss some of the properties of the individual sub-layers.
Storage layer. The system architect should use storage sub-layers that satisfy the user requirements. For example, one of the user requirements indicates that the system activity is more write than read. For performance reasons this would require RAID 1+0 for the DBMS sub-layers rather than RAID 5. In addition, the user requires no single point of failure, which implies that the integrated storage component should be fully redundant inside and outside (e.g. dual communication channels, multiple processor cards). It also implies replicating data from the community Cloud primary location to its secondary location. Replicating data can be done at different levels: (a.) the storage sub-layers or (b.) the DBMS server sub-layers.
Server layer. The scenario requires four groups, as discussed above. The system architect should decide on the server sub-layers that can host each group. The system architect should also associate with each group a set of properties enabling it to satisfy the consumer requirements. Understanding the nature of the hosted application enables the system architect to even provide enhanced features in terms of using the right hardware configuration (i.e. server sub-layer) that best suits the generic nature of the application; e.g. a DBMS application, highly computational systems, etc.
In our scenario, the system architect should: (a.) associate with all groups a dependency property requiring all groups to always run in the same physical location at one time, and, in emergencies, all groups to fail over to predefined sub-layers located at the DR location (such a condition ensures that all dependent components run in the same location, e.g. it avoids the case where a DBMS group is hosted in a different location from its corresponding middle-tier group); (b.) host the editorial and weather DBMS groups at a server sub-layer which has the properties for hosting DBMS with no single point of failure; and (c.) host the editorial and weather middle-tier groups at a server sub-layer which has the properties for hosting middle-tier applications. It is beyond the scope of this paper to discuss further architectural reasons, but of course all groups could be hosted using a single server sub-layer or multiple sub-layers. This is based on the user properties and the infrastructure properties.
Network Layer. The above sub-layer components (i.e. the server sub-layer components and the storage sub-layer components) must be connected using at least two network channels. Also, related server sub-layers and storage sub-layers should be connected using redundant channels. For example, a DBMS server sub-layer should be connected using multiple channels to the related storage sub-layer. In addition, the storage sub-layer itself should provide full resilience, which is outside the scope of this paper to discuss.

3.2 Identifying Management Services

Current public Cloud providers do not support the kind of architecture provided in the scenario above, as it requires human intervention. In this section we aim to derive the main services, which are mostly (at the time of writing) provided by private and community Cloud internal employees. We also aim to show the importance of automating such services. The potential future public Cloud, which is expected to host critical applications, should be capable of managing the Cloud environment automatically and without human intervention [11].
The first two services we identify are system architect and resilient design. The Cloud provider should provide automated application architecture (what we refer to as system architect as a service), which should result in a resilient design. It should also automatically deploy the resilient design (what we refer to as resilience as a service). As we described earlier, the deployment of the architecture should consider the infrastructure properties and the user requirements. In our scenario a fundamental user requirement, which is especially required by critical applications, is providing a resilient architectural design with no single point of failure. Current public Cloud providers only support very limited features in this direction in comparison with the ones supported by private and community Cloud providers. This is because fully automated management services do not exist, and public Cloud providers can only support limited features that can be managed automatically.
The other important user expectation from a Cloud provider is to automatically adapt to failures, to changes in user properties, and to changes in infrastructure properties and policies, without affecting user applications. This is what we refer to as adaptability as a service. This requirement is critical for the potential Cloud infrastructure. For example, when users change their requirements, the virtual layer resources should automatically adapt to such changes, and when the infrastructure's physical resources change, the virtual layer resources should also automatically adapt without compromising users' requirements. All these changes should not compromise user requirements or security and privacy properties.
Elasticity is one of the Cloud's essential properties. In peak periods the virtual layer resources should automatically scale up, and in off-peak periods the resources should automatically scale down. Such scaling is based on the demand and the customer's pre-agreed SLA, and it should not compromise user requirements or security and privacy properties. We refer to this as scalability as a service. Public Cloud providers at the time of writing only support vertical scalability, but do not provide horizontal scalability; a minimal sketch of a bounded horizontal-scaling decision is given below. The Cloud provider should also provide availability as a service, which is related to utilizing all redundant resources. Also, the Cloud provider should provide reliability as a service, which assures end-to-end service integrity.
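As an illustration of the bounded horizontal scaling just described, the following sketch decides how many VMs a group should run from the current load, clamped to the minimum and maximum agreed in the SLA. The per-VM capacity figure and the group name are assumptions for illustration only.

```python
def desired_vm_count(current_load, per_vm_capacity, sla_min, sla_max):
    # Number of VMs needed for the load, clamped to the SLA's horizontal bounds
    needed = -(-current_load // per_vm_capacity)   # ceiling division
    return max(sla_min, min(sla_max, needed))

def scaling_action(group, running_vms, current_load, per_vm_capacity, sla_min, sla_max):
    target = desired_vm_count(current_load, per_vm_capacity, sla_min, sla_max)
    if target > running_vms:
        return f"{group}: scale up, add {target - running_vms} VM(s)"
    if target < running_vms:
        return f"{group}: scale down, remove {running_vms - target} VM(s)"
    return f"{group}: no change"

# Example: the editorial middle-tier group, SLA bounds of 2..6 VMs, 100 requests/s per VM
print(scaling_action("editorial middle-tier", running_vms=2,
                     current_load=450, per_vm_capacity=100, sla_min=2, sla_max=6))
# -> 'editorial middle-tier: scale up, add 3 VM(s)'
```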
The combination of the above services would result in higher availability and reliability properties. Full reliance on human beings requires a longer time to architect and deploy solutions, requires a longer time to discover and resolve problems, is error prone, is subject to insider threats by Cloud employees, and does not provide a reliable way of measuring the level of trust in the Cloud's operations. This raises the need for self-managed services that can manage the Cloud infrastructure automatically and with minimal human intervention. Automated self-managed services provide Cloud computing with exceptional capabilities and new features, for example scale per use, hiding the complexity of the infrastructure, and automated higher reliability, availability, scalability, dependability, and resilience that consider users' security and privacy requirements by design. Automated self-managed services should help in providing trustworthy, resilient Cloud computing, and should result in cost reduction. More details about these services can be found in our extended abstract [1].

Conclusion and Research Agenda

We start this paper by proposing a novel Cloud taxonomy. Unlike previously


proposed ones which consider Cloud physical layer as a black-box, the proposed
taxonomy focuses on Cloud infrastructure and management; i.e. our taxonomy
does not replace previous ones but covers additional areas not discussed before.
In this paper we demonstrate the organization and grouping of Cloud hardware
resources based on infrastructure properties.
The proposed taxonomy helps to extract the infrastructure properties, which demonstrate how management services can take automated actions. The taxonomy also helps us to realize the complexities of providing self-managed services. Cloud infrastructure in use these days does not provide the full potential of automated self-managed services; these services are currently supported by Cloud employees. We believe that VCC will play a major role in supporting self-managed services, but VCC is still under continuous development and only provides very limited services. We provide a real-life scenario illustrating the use of the provided Cloud taxonomy and how experts in the domain manage Cloud infrastructure. Based on this, we identify and discuss the management services which require automation.
In our previous work [1,2] we discussed Cloud security challenges. Part of the identified challenges is related to providing secure and reliable automated management of a trustworthy Cloud's infrastructure. Considering this and the identified infrastructure properties, our next objective is to propose a framework that can help in establishing trust in self-managed services. This covers a fundamental part of our research in building a trustworthy Cloud infrastructure.

Acknowledgment
This research has been supported by the TClouds project², which is funded by the EU's Seventh Framework Programme (FP7/2007-2013) under grant agreement number ICT-257243. The author would like to thank Andrew Martin for his discussion and valuable comments.

² http://www.tClouds-project.eu

References
1. Abbadi, I.M.: Self-Managed Services Conceptual Model in Trustworthy Clouds Infrastructure. In: Workshop on Cryptography and Security in Clouds. IBM, Zurich (March 2011), http://www.zurich.ibm.com/~cca/csc2011/program.html
2. Abbadi, I.M.: Toward Trustworthy Clouds Internet Scale Critical Infrastructure. In: Bao, F., Weng, J. (eds.) ISPEC 2011. LNCS, vol. 6672, pp. 71-82. Springer, Heidelberg (2011)
3. Amazon: Amazon Elastic Compute Cloud, Amazon EC2 (2010), http://aws.amazon.com/ec2/
4. Amazon: Amazon Simple Storage Server, Amazon S3 (2010), https://s3.amazonaws.com/
5. Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., Katz, R.H., Konwinski, A., Lee, G., Patterson, D.A., Rabkin, A., Stoica, I., Zaharia, M.: Above the Clouds: A Berkeley View of Cloud Computing (2009), http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
6. Cloud Computing Use Case Discussion Group: Cloud computing use cases (2010), http://cloudusecases.org/Cloud_Computing_Use_Cases_Whitepaper4_0.odt
7. EMC: EMC (2011), http://www.emc.com/products/category/storage.htm
8. IBM: Autonomic computing (2001), http://www.research.ibm.com/autonomic/
9. Jeffery, K., Neidecker-Lutz, B.: The Future of Cloud Computing: Opportunities for European Cloud Computing Beyond 2010 (2010)
10. McCune, J.M., Li, Y., Qu, N., Zhou, Z., Datta, A., Gligor, V.D., Perrig, A.: TrustVisor: Efficient TCB reduction and attestation. In: IEEE Symposium on Security and Privacy, pp. 143-158 (2010)
11. Mell, P., Grance, T.: The NIST Definition of Cloud Computing
12. Microsoft: Microsoft System Center IT Infrastructure Server Management Solutions (2010), http://www.microsoft.com/systemcenter/
13. Murray, D.G., Milos, G., Hand, S.: Improving Xen security through disaggregation. In: Proceedings of the Fourth ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE 2008, pp. 151-160. ACM, New York (2008)
14. OpenSource: OpenNebula (2010), http://www.opennebula.org/
15. OpenSource: OpenStack (2010), http://www.openstack.org/
16. Oracle: Oracle Real Application Clusters, RAC (2011), http://www.oracle.com/technetwork/database/clustering/overview/index.html
17. Ristenpart, T., Tromer, E., Shacham, H., Savage, S.: Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds. In: Proceedings of the 16th ACM Conference on Computer and Communications Security, CCS 2009, pp. 199-212. ACM, New York (2009)
18. VMware: VMware vCenter Server (2010), http://www.vmware.com/products/vcenter-server/
19. Wikipedia: Network-Attached Storage, NAS (2010), http://en.wikipedia.org/wiki/Network-attached_storage
20. Wikipedia: Storage Area Network, SAN (2010), http://en.wikipedia.org/wiki/Storage_area_network
21. Youseff, L., Butrico, M., Da Silva, D.: Toward a unified ontology of cloud computing. In: Proceedings of Grid Computing Environments Workshop, pp. 1-10. IEEE, Los Alamitos (2008)

A Deduced SaaS Lifecycle Model Based on Roles and Activities*

Jie Song, Tiantian Li, Lulu Jia, and Zhiliang Zhu

College of Software, Northeastern University, Shenyang, P.R. China
songjie@mail.neu.edu.cn, ltt.ruanyi0701@gmail.com, jialulu1988@126.com, zzl@mail.neu.edu.cn

Abstract. In recent years, SaaS (Software-as-a-Service) software has become more and more popular, and it may gradually take the place of traditional software. Therefore, having an overview of the SaaS software lifecycle is critical for corporations to make strategic choices that keep them competitive in the software industry. However, existing software lifecycle models are mostly designed for traditional software, and little research has been done to date on the SaaS software lifecycle. In this paper, we propose a SaaS lifecycle model based on roles and activities. It is deduced through a clustering algorithm applied to the data produced by a simulation system which adopts intelligent agent technology.
Keywords: SaaS, Lifecycle Model, Roles and Activities.

1 Introduction
SaaS is a software delivery model in which the software is delivered and used in the form of a service through the Internet. As an emerging software delivery model, SaaS has proved its salient merits and promising future through its service-oriented, Internet-based, pay-as-you-go and multi-tenant characteristics. It is precisely because of these features that SaaS has received great attention in both academic and application areas. There are many SaaS corporations today, such as Salesforce, IBM, Microsoft, Google, NetSuite, 800APP, Alisoft, etc. However, the success rate of SaaS applications is still relatively low.
Having an overview of the SaaS software lifecycle is helpful for increasing the success rate. The study of the software lifecycle will enable software vendors to develop software that is adaptable to new business models and new markets. The incentive behind defining, modeling, and monitoring the software lifecycle is to increase quality and decrease costs, to strengthen the competitiveness of corporations in the software industry, and to make strategic choices that help them to thrive in the software ecosystem.
Unfortunately, there has not been a ready-made lifecycle model for SaaS. On the
one hand, although the software lifecycle models have evolved from sequential
* This work is supported by the Shenyang Technical Foundation, China (No. 1091176-1-00).

models such as the Waterfall Model and V-Model towards more iterative ones such as the Incremental Model and Spiral Model, in order to be more responsive to changes of requirements and to reduce rework to a great extent, they are not suitable for SaaS software. That is because, in SaaS mode, software vendors no longer function as independent units. They have become networked, i.e., software vendors depend on other software vendors, outsourcers, value-added resellers, and so on. SaaS software also involves more roles, and the interactions between them are much more complex than in traditional software. On the other hand, even though SaaS has been around for quite some time, there is little research about its lifecycle model. The modeling challenge lies in the fact that SaaS is an emerging technology, and all of the existing SaaS software is at an infant stage, so we cannot deduce or validate the SaaS lifecycle model through case studies of it.
In this paper, we propose a SaaS lifecycle model by the following four steps. Firstly, we study the SaaS ecosystem and its composition, drawing the conclusion that the lifecycle model can be deduced by analyzing the roles and activities appearing in the SaaS ecosystem. Secondly, we simulate the lifecycle process with an agent-based simulation system, the details of which are abbreviated in this paper. Thirdly, we design an algorithm to cluster the interactions into lifecycle phases based on the data produced by the simulation system. Finally, we deduce the lifecycle model from the clustering results. The deduced lifecycle model mainly includes the following five phases: Requirement Definition, Development, Deployment, Operation and Retirement.
The rest of this paper is organized as follows. Following the introduction, we
briefly introduce the related works in section 2. Then, we illustrate the SaaS software
ecosystem in section 3. Section 4 elaborates on our lifecycle model deducing
approach. Finally, we summarize our work and present our future works in section 5.

2 Related Works
So far, there have been some traditional software lifecycle models, such as Waterfall
Model (originally defined by Royce in 1970), V-Model, Prototyping Model,
Incremental Model and Spiral Model (developed by Boehm in 1986). However, none
of them is suitable for SaaS software because of its new features. In addition,
researchers have made some efforts in the SaaS software related area. [1] specified a
generalized service lifecycle including the following six phases: Service Analysis,
Design, Implementation, Publishing, Operation and Retirement. [2] presented a three-perspective model of SECO (software ecosystem) by using the software supply network modeling technique, and a few definitions of SECO are available in both [3] and [4]. [5] proposed a unified lifecycle template in which the software lifecycle is composed of roles, activities, artifacts and supports. Unfortunately, the above research is scattered across software lifecycles, services and SECOs, and is not specific to SaaS. Therefore, based on the above research, we propose a deduced SaaS lifecycle model based on roles and activities. It synthesizes the research methods mentioned in those papers. However, we meet the following challenge during the deducing process: we could not deduce or validate the SaaS lifecycle model through case studies of the existing SaaS software, because SaaS is an emerging technology and all of the existing SaaS software is at an infant stage. Fortunately, we found a way, which adopts intelligent agent technology, to handle the challenge proposed above. An intelligent agent is an

entity that perceives the environment and takes actions to change the environment to
reach the desired environmental state [6], and multi-agent systems can be used to solve
problems which are difficult or impossible for an individual agent to solve.

3 SaaS Software Ecosystem


Before we elaborate on the lifecycle deducing approach in Section 4, we first illustrate the SaaS ecosystem perspectives in Fig. 1, based on the definition of software ecosystem in [2]: a software ecosystem is a set of businesses functioning as a unit and interacting with a shared market for software and services, together with the relationships among them.

Fig. 1. SaaS Ecosystem Perspectives

SaaS corporations thrive in the software ecosystem where their services are used
by others (external or internal service customers) and they themselves utilize services
from others (internal or external service providers). As shown in Fig. 1, SaaSISV is
supplied with components from Outsourcer and OtherISV. When the SaaSISV has
developed a service, it registers the service in the service pool provided by the
Operator, and the Tenant can search the service pool for the services they need.
Finally, they can sign a rental contract with the SaaSISV.
In our approach, the SaaS ecosystem is composed of Role, Activity, Artifact, and Support, which are organized around SaaSLifecycle. As shown in Fig. 2, SaaSLifecycle has a one-to-many relationship with the other four objects, that is, one SaaSLifecycle has multiple roles, activities, artifacts, and supports (corresponding to "1..*" shown in Fig. 2). Role is a type of actor responsible for the activity. A role may be responsible for multiple activities under multiple supports. Activity is mainly responsible for the workflows of the software lifecycle. An activity may be a higher level activity as a parent activity or a lower level activity as a child activity. It is noteworthy that a parent activity here corresponds to a phase, while a child activity refers to an atomic task. An activity may have multiple artifacts under multiple supports. Artifact is the output of an activity, which in turn can also be used as input of other activities. Support mainly provides support for the software development.

Fig. 2. SaaS Ecosystem Components

If necessary, it can be subdivided into several different types such as tools, templates, checklists, standards, guidelines, instantiation guidance and so on. Both artifact and support have a corresponding relationship with other objects like role and activity.
Based on the research about the SaaS ecosystem and our knowledge of software lifecycles, we make our point as follows. Interactions between roles are supposed to show regional characteristics in the timeline, and each lifecycle phase can be characterized by a series of specific interactions involving specific roles. Therefore, we think that if we record all of the interactions that occur between roles during the lifecycle, and then cluster them using an algorithm, we can finally get the partitioned lifecycle phases based only on roles and activities. In the next section, we will elaborate on our deducing approach in four steps.

4 Deducing Approach
According to our previous research, we have decided to deduce the lifecycle model based on roles and activities through a clustering algorithm applied to the data produced by a SaaS lifecycle simulation system which adopts intelligent agent technology. We take the following four steps:
Step 1: Extract the roles and activities and define the data structure we need;
Step 2: Simulate the lifecycle process by an agent-based simulation system;
Step 3: Design a clustering algorithm and carry it out based on the data acquired from the above simulation system;
Step 4: Analyze the clustering results and define the phases.
SaaS is a new technology, and the existing SaaS software is at an infant stage, so we cannot deduce or validate the SaaS lifecycle model through case studies of it. In order to deal with this problem, we adopt intelligent agent technology to simulate the whole process of the lifecycle. The details of the agent-based simulation system are abbreviated in this paper; we only describe Steps 1, 3 and 4.

4.1 Roles and Activities


In this section, we first give the definition of role, and then list the roles and activities
extracted from the lifecycle. In addition, we also define some related objects and the
data structure used in the following simulating system.
Definition 1. Role: A role corresponds to a character appearing in the lifecycle. It perceives the environment and then takes corresponding actions. Let r be a role and R be the role set; then they can be described as follows:
r = <name, description, knowledge>
R = { r_i | 1 ≤ i ≤ N_R }, where N_R is the number of roles in R
Note: Name, as its literal meaning suggests, is the name of the role; Description describes the responsibilities of the role; Knowledge stores the "if-then" rules according to which the role takes its actions.
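As an illustrative, purely hypothetical rendering of Definition 1 (the class name, field names and sample rules below are our own), a role with its name, description and "if-then" knowledge could be represented as:

# Hypothetical sketch of Definition 1: a role with a name, a description and
# "if-then" knowledge rules mapping a perceived situation to an action.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Role:
    name: str
    description: str
    # knowledge: condition name -> action name ("if-then" rules)
    knowledge: Dict[str, str] = field(default_factory=dict)

    def act(self, situation: str) -> str:
        """Return the action the role takes for a perceived situation."""
        return self.knowledge.get(situation, "idle")

researcher = Role(
    name="Researcher",
    description="Defines the requirement by market researching",
    knowledge={"new_market_need": "Research", "feedback_received": "Refine requirement"},
)
print(researcher.act("new_market_need"))  # -> Research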
We list the roles and activities identified in the whole lifecycle in Table 1. As
shown in Table 1, we extract 12 roles. Among them, Integrator is responsible for
integrating the developing system for Developer; IaaS ISV is responsible for
providing the basic resources such as hardware resources for Developer; PaaS ISV is
responsible for providing the software development tools and middleware platforms
for Developer; Promoter is responsible for the market promotion.
Table 1. Roles and Activities
Role | Activity
Researcher | Research: Define the requirement by market researching
Original User | Feedback: Give the requirement feedbacks to the researcher
Developer | Develop: Develop the software according to the requirement specification
Integrator | Integrate: Integrate the developing system for developer
IaaS ISV | Provide: Provide the basic resources for developer
PaaS ISV | Provide: Provide the developing tools and middleware platforms for developer
Deployer | Deploy: Deploy the software on the operating platform
Operator | Operate: Provide the software for tenants to subscribe
Tenant | Subscribe: Subscribe the functions they need; Add Users: Authorize a user to use the software
End-user | Use: Do their business using the software
Maintainer | Maintain: Solve the problems appeared during the operation
Promoter | Promote: Promote the software market

Definitions 2-3, TimeLine and Interaction, are the basis of Definitions 4-6.
Definition 2. TimeLine: TimeLine is the temporal range of the lifecycle, and it is composed of a sequence of TimeUnits which stand for measures of time, such as weeks, days or hours. Let t be a TimeUnit and T be a TimeLine; then the TimeLine can be described as follows:
T = { t_i | 1 ≤ i ≤ N_T }, where N_T is the number of TimeUnits in T
∀ t_i, t_j ∈ T, if i < j, then t_i < t_j

Definition 3. Interaction: An interaction is an activity occurring between two roles. Here, the interaction may be a one-way activity; for example, there is no feedback from developer back to researcher, from deployer back to developer, etc. Let a be an interaction and A be the interaction set; then they can be described as follows:
a = <id, r_s, r_t>, in which r_s ≠ r_t
A = { a_i | 1 ≤ i ≤ N_A }, where N_A is the number of interactions in A
Note: id is the unique identification of an interaction; r_s represents the active role of the interaction, while r_t represents the passive role of the interaction.
In order to give an intuitive description of the interactions, we illustrate them in Fig. 3. As shown in Fig. 3, a directed line represents an interaction; the beginning side of the arrow represents r_s, while the end side of the arrow represents r_t. There are thirteen possible interactions identified in total.

Fig. 3. Interactions between Roles

Definition 4. Event: An event records the interactions that happened in the same TimeUnit. Let e be an event and E be the event set; then they can be described as follows:
e = <t_e, A_e>, in which t_e ∈ T, A_e ⊆ A
E = { e_i | 1 ≤ i ≤ N_E }, where N_E is the number of events in E
E = {T_E, A_E}, in which T_E = ∪_{i=1..N_E} {t_{e_i}} and A_E = ∪_{i=1..N_E} A_{e_i}
Note: t_e is the TimeUnit in which the interactions happened; A_e is the set of interactions which happened in the same t_e.
Without loss of generality, we assume that "day" is the TimeUnit of the TimeLine for lifecycle phases; the occurrence of a particular interaction can then be measured by "date". Therefore, an event is treated as a "daily event" as well.
Definition 5. Phase and Lifecycle: A phase exists to accomplish some specific tasks in the TimeLine, and the lifecycle is composed of continuous phases. Let p_n be the n-th phase; then it can be defined as the cluster of events (E_n) in a certain duration (T_{E_n}):
Lifecycle = {p_n} = {E_n}, in which 1 ≤ n ≤ N and T = ∪_{n=1..N} T_{E_n}
p_n = E_n, in which T_{E_n} ⊆ T_E satisfies:
if ∀ i, k ∈ [1, N_T], t_{e_i}, t_{e_k} ∈ T_{E_n}, then ∀ j ∈ [i, k], t_{e_j} ∈ T_{E_n}

From Definition 5, a conclusion can be drawn that the process of deducing the lifecycle equals the process of clustering events (or partitioning E) into N continuous phases (E_1, E_2, ..., E_N). So in the next definition, we define the similarity between events.
Definition 6. Event Similarity: Event similarity is the formula used for calculating the similarity of two events, for further clustering. ∀ e_i, e_j ∈ E, i, j ≤ N_E:
Sim(e_i, e_j) = K(M + N - K)^(-1)
where K = sizeof(e_i ∩ e_j), M = sizeof(e_i), N = sizeof(e_j)
Note: If Sim(e_i, e_j) ≥ θ, then e_i and e_j are similar and will be merged into one cluster. θ is the similarity threshold. The value of θ has a great impact on the clustering effect; in the experiment, we study its impact in order to improve the clustering result, and it is then fixed.
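Definition 6 is a Jaccard-style coefficient over the interaction sets of two events. A short illustrative rendering (the function name and set representation are our own) is:

# Event similarity from Definition 6: Sim = K / (M + N - K), where K is the
# number of interactions shared by the two events.
from typing import Set

def sim(ei: Set[int], ej: Set[int]) -> float:
    k = len(ei & ej)          # K = sizeof(ei ∩ ej)
    m, n = len(ei), len(ej)   # M, N
    denom = m + n - k
    return k / denom if denom else 0.0

print(sim({1, 2, 3}, {2, 3, 4}))  # -> 0.5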
4.2 Clustering Algorithm

In this section, we give the algorithm designed for clustering the phases based on the data produced by the agent-based simulation system; details of the simulation system are abbreviated here. Traditional clustering algorithms can be divided into five categories. Among them, hierarchical clustering is simple, fast and able to handle large data sets effectively. Therefore, we choose the hierarchical method in our approach. According to how the hierarchy is formed, hierarchical methods can be divided into agglomerative and divisive methods.

Fig. 4. Clustering Process

In our approach, we take two steps to execute the clustering. Firstly, we divide the
lifecycle into several rough phases through the first cluster using the divisive method
according to the calculated interval. Secondly, we divide each rough phase into more
adequate phases through the second cluster using the agglomerative method. We
schematize the clustering process in Fig. 4.
The similarity threshold θ used in the second cluster is set to 0.6. The clustering
algorithm is as follows:

Algorithm 1. Cluster for Lifecycle Phases
Input: Events
Output: Phases

First Cluster:
// Get the suitable interval: interval = (interval_1 + interval_2 + ... + interval_n) / n
sum = num = temp = 0
For each e_i in E
    If sizeof(e_i) == 0 Then
        sum++
    End If
    If temp != sum Then
        num++
        temp = sum
    End If
End For
interval = sum / num

// First cluster by the interval
start = end = temp = count = 0
For each e_i in E
    If sizeof(e_i) == 0 Then
        count++
    Else
        If count > interval Then
            start = temp; temp = i; end = i - count - 1
            v1.add(new Phase(start, end))
        End If
        count = 0
    End If
End For
v1.add(new Phase(temp, i - 1))

Second Cluster:
// Second cluster for each phase acquired from the first cluster; get the critical points
For each e_i in E
    flag = 0                                  // reset for each event
    For each e_j in E with t_{e_j} < t_{e_i}
        If Sim(e_i, e_j) >= θ Then            // similar to an earlier event
            flag = 1
            remove(t_{e_i}, t_{e_j})
            Break
        End If
    End For
    If flag == 0 Then                         // e_i starts a new sub-phase
        v.add(i)
    End If
End For

// Add the new phases divided by the second cluster
For each element in v
    end = element - 1
    If end >= start Then
        v2.add(new Phase(start, end))
    End If
    start = element
End For

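The sketch below is one possible runnable reading of Algorithm 1, not the authors' implementation: it assumes events are given as a list of daily interaction-id sets, takes the interval to be the average length of runs of empty days, and fixes the similarity threshold θ at 0.6 as stated above.

# Hedged Python sketch of Algorithm 1 (one possible reading, illustrative data only).
from typing import List, Set, Tuple

Phase = Tuple[int, int]  # (start day, end day), inclusive indices into the TimeLine

def sim(ei: Set[int], ej: Set[int]) -> float:
    """Event similarity from Definition 6 (Jaccard coefficient)."""
    k = len(ei & ej)
    denom = len(ei) + len(ej) - k
    return k / denom if denom else 0.0

def first_cluster(events: List[Set[int]]) -> List[Phase]:
    """Divisive step: cut the TimeLine wherever a run of empty days is longer
    than the average run of empty days (the 'interval')."""
    runs, run = [], 0
    for e in events:
        if not e:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    interval = sum(runs) / len(runs) if runs else 0
    phases, start, gap = [], 0, 0
    for i, e in enumerate(events):
        if not e:
            gap += 1
        else:
            if gap > interval and i - gap - 1 >= start:
                phases.append((start, i - gap - 1))
                start = i
            gap = 0
    phases.append((start, len(events) - 1))
    return phases

def second_cluster(events: List[Set[int]], phase: Phase, theta: float = 0.6) -> List[Phase]:
    """Agglomerative step: within a rough phase, a new sub-phase starts at every
    non-empty day that is not similar (>= theta) to any earlier day of the phase."""
    start, end = phase
    cuts = [i for i in range(start + 1, end + 1)
            if events[i] and not any(sim(events[i], events[j]) >= theta
                                     for j in range(start, i) if events[j])]
    bounds = [start] + cuts + [end + 1]
    return [(bounds[k], bounds[k + 1] - 1) for k in range(len(bounds) - 1)]

if __name__ == "__main__":
    # Ten made-up daily events: interaction ids per day; empty sets are idle days.
    demo = [{1}, {1, 2}, set(), {3}, set(), set(), set(), {4, 5}, {4, 5, 6}, {7}]
    rough = first_cluster(demo)                               # -> [(0, 3), (7, 9)]
    fine = [p for r in rough for p in second_cluster(demo, r)]
    print(rough, fine)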
4.3 Deducing Results

In this section, we display the clustering results in Fig. 5 and define the phases
acquired from it.
In Fig. 5, the X coordinate expresses the date in the TimeLine and the Y coordinate expresses the id of the interaction. Fig. 5-a gives the interactions that happened during the whole lifecycle. Through the first cluster, we divide the lifecycle into three rough phases, which are shown in Fig. 5-b. Then we carry out the second cluster in each phase created by the first cluster, and the result is shown in Fig. 5-c. Finally, we integrate the phases according to our common knowledge of the lifecycle. The result is shown in Fig. 5-d.

Fig. 5. Cluster Results: (a) Initial Data, (b) First Cluster, (c) Second Cluster, (d) Final Integration

Now, we get the deduced SaaS lifecycle model. It includes the following five phases: Requirement Definition, Development, Deployment, Operation and Retirement. The definitions of the phases are as follows:
• Requirement Definition
This phase focuses on the service requirement definition which forms the input
of development phase. The requirement is captured through the interaction
between Researcher and the Original User. It comes to an end when the
Researcher submits the requirement specification to the Developer.
• Development
This phase is responsible for developing the SaaS software according to the
requirement specification. The Developer first asks the System Integrator to
integrate the developing system before they begin the development. And the
Integrator has to get the hardware and software resources from the IaaS ISV and
PaaS ISV.

• Deployment
Once the service is built, it will be deployed on the operating platform and
registered in the service pool. This requires the interaction between the Deployer
and the Operator.
• Operation
After deployment, operation starts. In this phase, the service is in operation and actively consumed by Users; Users can submit feedback and improvement proposals. The number of Users may increase, be stable or decrease (see Fig. 6-d). The service will typically undergo revisions, extensions, or promotions. Here, the Maintainer is the decision-maker, deciding which measure should be taken by analyzing the Users' feedback and the running logs. For example, when the number of Tenants is decreasing, the Maintainer may assign the Promoter for market promotion or assign the Researcher to develop new features.
• Retirement
When the service is not used anymore, the retirement phase arrives. The service
will be taken out of the service pool in this phase.

5 Conclusions and Future Works


In this paper, a SaaS lifecycle model based on roles and activities is proposed. It
includes the following five phases: Requirement Definition, Development,
Deployment, Operation and Retirement. The primary work of this research can be
divided into the following four aspects:
• Illustrate the SaaS ecosystem perspectives as well as its components.
• Extract the roles and activities appearing in the SaaS lifecycle process and define the related objects and the data structure needed.
• Design the algorithm to cluster for lifecycle phases.
• Analyze the clustering results and describe each lifecycle phase.

All of these works contribute to our research on the SaaS lifecycle. We believe that the proposed lifecycle model will provide guidelines for developing, operating and maintaining SaaS software. Future work of this research includes: further improving the agent-based simulation system and providing much more detailed information on the simulation, such as the internal business logic employed by roles when determining what interaction to initiate next, how bugs/errors/miscalculations factor into revision requests, and what the statistical model for such bugs is; and enriching the lifecycle model with much more detailed phases and verifying it by case study.

References
1. Kohlborn, T., Korthaus, A., Rosemann, M.: Business and Software Service Lifecycle
Management. In: IEEE International Enterprise Distributed Object Computing Conference
(2009)

2. Jansen, S., Finkelstein, A., Brinkkemper, S.: A Sense of Community: A Research Agenda
for Software Ecosystems. In: ICSE 2009, Vancouver, Canada, May 16-24 (2009)
3. Bosch, J.: From Software Product Lines to Software Ecosystems. In: International Software
Product Line Conference (SPLC 2009), USA, August 24-28 (2009)
4. Kittlaus, H.-B., Clough, P.: Software Product Management and Pricing. Key Success
Factors for Software Organizations. Springer, Heidelberg (2009)
5. He, R., Wang, H., Lin, Z.: A Software Process Tailoring Approach Using a Unified
Lifecycle Template. IEEE, Los Alamitos (2009); Smith, T.F., Waterman, M.S.: Identification of Common Molecular Subsequences. J. Mol. Biol. 147, 195-197 (1981)
6. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, New
Jersey (1995)

Towards Achieving Accountability, Auditability and Trust in Cloud Computing

Ryan K.L. Ko1, Bu Sung Lee1, and Siani Pearson2

1 Cloud and Security Lab, HP Labs, Fusionopolis, Singapore
2 Cloud and Security Lab, HP Labs, Bristol, United Kingdom
{ryan.ko,francis.lee,siani.pearson}@hp.com

Abstract. The lack of confidence in entrusting sensitive information to cloud computing service providers (CSPs) is one of the primary obstacles to widespread adoption of cloud computing, as reported by a number of surveys. From the CSPs' perspective, their long-term return on investment in cloud infrastructure hinges on overcoming this obstacle. Encryption and privacy protection techniques only solve part of this problem: in addition, research is needed to increase the accountability and auditability of CSPs. However, achieving cloud accountability is a complex challenge, as we now have to consider large-scale virtual and physical distributed server environments to achieve (1) real-time tracing of source and duplicate file locations, (2) logging of a file's life cycle, and (3) logging of content modification and access history. This position paper considers related research challenges and lays a foundation towards addressing these via three main abstraction layers of cloud accountability and a Cloud Accountability Life Cycle.
Keywords: Accountable cloud computing, trusted computing platform,
accountability, logging, continuous auditing, audit trails.

1 Introduction
In a recent survey by Fujitsu Research Institute [1], it was revealed that 88% of potential cloud consumers surveyed are worried about who has access to their data within the cloud, and would like to have more awareness of what goes on in the cloud's back-end physical servers. Such surveys have not only identified trust as the key barrier to cloud computing uptake, but have also heightened the urgency for researchers to quickly address key obstacles to trust [1-3].
From a system design perspective, trust can be increased by reducing risk when using the cloud. While risk can be greatly mitigated via privacy protection and security measures such as encryption, they are not enough, particularly as full encryption of data in the cloud is at present not a practical solution.
There is a need to complement such preventative controls with equally important
detective controls that promote transparency, governance and accountability of the
service providers. This paper focuses on the detective controls of tracing data and file
movements in the cloud.
Despite accountability being a crucial component of improving trust and


confidence [4, 5], current prominent providers (e.g. Amazon EC2/S3 [6, 7],
Microsoft Azure [8]) are still not providing full transparency or capabilities for the
tracking and auditing of the file access history and data provenance [9] of both the
physical and virtual servers utilized [1]. Currently, users can at best monitor the
virtual hardware performance metrics and system event logs of the services in which
they engage. The cloud computing research community, particularly the Cloud
Security Alliance, has recognized this. In its Top Threats to Cloud Computing Report
[10], it listed seven top threats to cloud computing:
1. Abuse and nefarious use of cloud computing
2. Insecure application programming interfaces
3. Malicious insiders
4. Shared technology vulnerabilities
5. Data loss or leakages
6. Account, service and traffic hijacking
7. Unknown risk profile.

Methods increasing the accountability and auditability of cloud service providers, such as tracing of file access histories, will allow service providers and users to reduce five of the above seven threats: 1, 2, 3, 5 and 7.
In this position paper, we (1) identify accountability and auditability as urgent research areas for the promotion of trust in cloud computing, (2) discuss the complexities of achieving accountability as a result of cloud computing's promise of elastic resources, (3) propose a conceptual foundation that promotes system designs which fully address the different cloud accountability phases and abstraction layers, and (4) discuss related and further work.

2 A Trust-Related Scenario
Figure 1 shows a typical trust-related scenario which many potential cloud customers
fear [1]. A customer stores some sensitive data in a file (see Fig. 1 top-left; red icon)
within a virtual machine (VM) hosted by a provider s/he has subscribed to. Upon
uploading the data, failsafe mechanisms within the cloud will typically back it up, and
perform load balancing by creating redundancies across several virtual servers and
physical servers in the service provider's trusted domain. From the file's creation to the backup processes, large numbers of data transfers occur across virtual and physical servers (black solid-line arcs; Fig. 1), and several memory read/write transactions to both virtual and physical memories are involved (blue dotted-line arcs; Fig. 1). If all such transactions and the creation of new duplicate files are logged, monitored and accounted for, we would be able to trace the file history and log the access history and content modifications, i.e., achieving cloud accountability and auditability.
Even if a malicious insider of the CSP attempts to transfer the sensitive file/data to
a target outside the cloud (e.g. in Fig. 1, via email), we will be well-equipped to
know when, where, how and what was being leaked, and by whom. This empowers
both the CSP and the consumers, as problematic processes and even insider jobs may
be investigated. This also removes some barriers to confidence in the cloud.

Fig. 1. An example scenario in cloud computing, showing the importance of accountability and auditability

3 Complexities Introduced by the Cloud's Elasticity

With cloud computing's feature of elasticity, empowered by virtualization [6, 11], come several new complexities related to accountability.

3.1 Challenges Introduced by Virtualisation
3.1.1 Tracking of Virtual-to-Physical Mapping and Vice Versa
With the introduction of virtualization, server resources are utilized more efficiently. However, the addition of virtualized layers means that not only the events in each individual virtual server need to be tracked, but also those in the physical servers [11]. Currently, there are only tools (e.g. HyTrust [12]) which capture virtual-level logs, and system health monitoring tools for virtual machines. There is still a lack of transparency regarding (1) the linkages between the virtual and physical operating systems, (2) the relationships between virtual locations and physical static server locations, and (3) how files are written into both virtual and physical memory addresses. This information is currently disparate and not available as a single point of view for cloud users.
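As a rough sketch of what such a single point of view could look like (the log fields, identifiers and the mapping below are invented for illustration), virtual-level log entries can be joined with a VM-to-physical-host mapping:

# Illustrative only: join virtual-level log entries with a VM-to-physical-host
# mapping to produce one consolidated, file-centric view. Field names are invented.
vm_to_host = {"vm-17": "rack3-host09", "vm-22": "rack1-host02"}

virtual_logs = [
    {"time": "2011-03-01T10:02:11Z", "vm": "vm-17", "event": "file_write", "file": "payroll.db"},
    {"time": "2011-03-01T10:05:40Z", "vm": "vm-22", "event": "file_copy",  "file": "payroll.db"},
]

def consolidated_view(logs, mapping):
    """Annotate each virtual-level entry with the physical host it ran on."""
    return [dict(entry, physical_host=mapping.get(entry["vm"], "unknown"))
            for entry in logs]

for row in consolidated_view(virtual_logs, vm_to_host):
    print(row["time"], row["file"], row["event"], row["vm"], "->", row["physical_host"])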

3.1.2 Multiple Operating System Environments


With the ease of choosing myriad operating systems for virtual machines comes the complexity of managing logging for a very large variety of operating systems (OSs) within the cloud. Enforcing a homogeneous OS for all virtual machines would solve this issue, but would make the provider less competitive. This means that we cannot rely only on system health logging and existing OS-based logging tools [13], but need a new perspective for logging, as explained in the following section.
3.2 Logging from Operating System Perspective vs. Logging from File-Centric
Perspective
Current tools focus on OSs and system health monitoring (e.g. cloudstatus.com [14], etc.), but few emphasize the file-centric perspective. By this, we mean that we need to trace data and files from the time of creation to the time of destruction. When we log from a file-centric perspective, we view data and information independently of environmental constraints. This reflects the elastic nature of cloud computing. With the transfer of control of data to CSPs, the latter should ease the minds of consumers by providing them with the capabilities to track their data (just like those shown in Figure 1).
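A small illustrative sketch of what file-centric (rather than OS-centric) logging implies is given below; the entries and field names are invented. Records are keyed by a file identifier and ordered in time so that each file's life cycle can be replayed:

# Illustrative file-centric view: group log entries by file id and order them in
# time, so a file's life cycle (create .. destroy) can be replayed. Data is made up.
from collections import defaultdict

entries = [
    {"file_id": "f-001", "time": 3, "action": "duplicate", "location": "vm-22"},
    {"file_id": "f-001", "time": 1, "action": "create",    "location": "vm-17"},
    {"file_id": "f-001", "time": 2, "action": "modify",    "location": "vm-17"},
    {"file_id": "f-002", "time": 1, "action": "create",    "location": "vm-09"},
]

def life_cycles(log):
    trails = defaultdict(list)
    for e in log:
        trails[e["file_id"]].append(e)
    return {fid: sorted(es, key=lambda e: e["time"]) for fid, es in trails.items()}

for fid, trail in life_cycles(entries).items():
    print(fid, [(e["action"], e["location"]) for e in trail])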
3.3 Live and Dynamic Systems
While there are proposals for adoption of provenance-aware mechanisms (that allow
tracing back the source or creator of data) in cloud computing, such proposals are
unable to address all challenges in clouds, as cloud systems are live and dynamic in
nature. Provenance techniques propose reports (e.g. audit trails) as the key to forensic
investigations. However, in reality, a snapshot of a running, or live, system, such as the VMs running within a cloud, can only be reproduced up to a specific instant and cannot be reproduced in a later time-frame. As a result, with a live system, data from a probe at one instant will be different from data from another probe taken, say, 15 minutes later [15]. This means cloud accountability demands complex real-time accountability, where key suspected events are captured almost instantaneously.
3.4 Scale, Scope and Size of Logging
The elasticity concept also increases the need for efficient logging techniques and a proper definition of the scope and scale of logging. By efficient, we mean that the impending exponential increase in log size has to be manageable and must not quickly exhaust the memory of the servers hosting the cloud logging features. By scale and scope, we mean policies that can help to clearly define the areas in which loggers are assigned to log. For example, a CSP may label its own network as a safe zone, its suppliers' or mirror sites' networks as trusted zones, and any other network outside of these as unsafe zones. Zonal planning will greatly reduce the complexities of network data transfer tracing within a cloud. Another way of reducing complexity will be the classification of the level of data abstraction, e.g. crude data, documents, and, on a higher level, workflows. These are discussed further in Section 5.

4 Achieving an Accountable Cloud


4.1 Phases of Cloud Accountability
The discussions in Section 3 and the scenario in Figure 1 have not only revealed the
scale and urgency of the problem but also exposed the need for reduction of
complexity. Having an awareness of the key accountability phases will not only
simplify the problem, but also allow tool makers to gauge the comprehensiveness of
their tool (i.e. if there are any phases not covered by their product). Phases can help
researchers focus on specific research sub-problems of the large cloud accountability
problem. Consumers can also understand if the cloud accountability tool has a real
coverage of all phases of cloud accountability. These phases are collectively known as
the Cloud Accountability Life Cycle (CALC). We propose CALC as the following
seven phases (see Figure 2):
1) Policy Planning
In the beginning, CSPs have to decide what information to log and which events to log on-the-fly. It is not the focus of this paper to claim or provide an exhaustive list of recommended data to be logged. However, in our observation, there are generally four important groups of data that must be logged: (1) Event Data: a sequence of activities and relevant information, (2) Actor Data: the person or computer component (e.g. a worm) which triggers the event, (3) Timestamp Data: the time and date the event took place, and (4) Location Data: both the virtual and physical (network, memory, etc.) server addresses at which the event took place. A minimal sketch of such a record is given after this list of phases.
2) Sense and Trace
The main aim of this phase is to act as a sensor and to trigger logging whenever an expected phenomenon occurs in the CSP's cloud (in real time). Accountability tools

Fig. 2. The Cloud Accountability Life Cycle

need to be able to track from the lowest-level system read/write calls all the way to
the irregularities of high-level workflows hosted in virtual machines in disparate
physical servers and locations. Also, there is a need to trace the routes of the network
packets within the cloud.
3) Logging
File-centric perspective logging is performed on both virtual and physical layers in
the cloud. Considerations include the lifespan of the logs within the cloud, the detail
of data to be logged and the location of storage of the logs.
4) Safe-keeping of Logs
After logging is done, we need to protect the integrity of the logs: prevent
unauthorized access and ensure that they are tamper-free. Encryption may be applied
to protect the logs. There should also be mechanisms to ensure proper backing up of
logs and prevent loss or corruption of logs. Pseudonymisation of sensitive data within
the logs may in some cases be appropriate.
5) Reporting and Replaying
Reporting tools generate, from the logs, file-centric summaries and reports of the audit
trails, access history of files and the life cycle of files in the cloud. Suspected
irregularities are also flagged to the end-user. Reports cover a large scope: virtual and
physical server histories within the cloud; from OS-level read/write operations of
sensitive data to high-level workflow audit trails.
6) Auditing
Logs and reports are checked and potential fraud-causing loopholes highlighted.
The checking can be performed by auditors or stakeholders. If automated, the process
of auditing will become enforcement. Automated enforcement is very feasible for
the massive cloud environment, enabling cloud system administrators and end-users
to detect irregularities more efficiently.
7) Optimising and Rectifying
Problem areas and security loopholes in the cloud are removed or rectified and
control and governance of the cloud processes are improved.
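Tying together the four data groups listed under Policy Planning and the tamper-freedom requirement of the Safe-keeping phase, the following minimal sketch is our own illustration (the field names and the hash-chaining scheme are assumptions, not a prescribed format):

# Illustrative sketch only: a log record carrying the four data groups from the
# Policy Planning phase, appended to a hash chain so tampering becomes detectable
# (Safe-keeping of Logs). Field names and the chaining scheme are assumptions.
import hashlib, json

def append_record(chain, event, actor, timestamp, location):
    prev_digest = chain[-1]["digest"] if chain else "0" * 64
    record = {"event": event, "actor": actor, "timestamp": timestamp,
              "location": location, "prev": prev_digest}
    payload = json.dumps(record, sort_keys=True).encode()
    record["digest"] = hashlib.sha256(payload).hexdigest()
    chain.append(record)
    return record

def verify(chain):
    """Recompute every digest; any in-place edit breaks the chain."""
    prev = "0" * 64
    for record in chain:
        body = {k: v for k, v in record.items() if k != "digest"}
        if body["prev"] != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["digest"]:
            return False
        prev = record["digest"]
    return True

log = []
append_record(log, "file_copy", "svc-backup", "2011-03-01T10:05:40Z",
              {"virtual": "vm-22", "physical": "rack1-host02"})
print(verify(log))            # -> True
log[0]["actor"] = "mallory"   # tamper with the log
print(verify(log))            # -> False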
4.2 Cloud Accountability Abstraction Layers
Next we address the important question: what data to log? The answer ranges from a
system-level log to a workflow-level audit trail transactional log. Such a range shows
that there are many abstraction layers of data, and a framework is needed to reduce
this kind of ambiguity and increase research focus and impact. As such, we propose
the following layers of accountability in a cloud:

Workflow Layer
Data Layer
System Layer
Fig. 3. Abstraction Layers of Accountability in Cloud Computing

Figure 3 shows the abstraction layers for the type of logs needed for an accountable
cloud. It is important to note that the focus is on the abstraction layers of logs and not
on architectural layers. Hence, it is independent of virtual or physical environments.
The data and workflow abstraction layers are derived from related works in data and
workflow provenance [9, 16, 17], and the system layer is derived from related works
in trusted computing platforms [18, 19] and system logging literature [20, 21].
Such explicit definition of layers in Figure 3 allows us to efficiently identify the
areas of their application and their focus areas. At a glance, the three layers look
deceptively simple, but the problem is more complex than it looks. Each layer has a
slightly different set of sub-components for each different context. Our model
simplifies the problem and makes accountability more achievable. The usefulness of
layers is also analogous to OSI [22] and TCP/IP [23] networking layers. Let us now
discuss the scope and scale of each layer:
4.2.1 System Layer
At the lowest level lie the system layer logs. The system layer consists of logging
within the following components:
1) Operating System (OS)
OS system and event logs are the most common type of logs associated with cloud
computing at the moment. However, these logs are not the main contributing factor to
accountability of data in the cloud, but a supporting factor. This is because in
traditional physical server environments housed within companies, the emphasis was on health, feedback on system status and ensuring uptime, as server resources were limited and expensive to maintain. In cloud computing, resources like servers and
memory are elastic, and are no longer limited or expensive [6, 11]. Hence, OS logs,
while important, are no longer the top concern of customers. Instead, the customers
are more concerned about the integrity, security and management of their data stored
in the cloud [1, 24].
2) File System
Even though the file system is technically part of the OS, we explicitly include it as
a major component in this system layer. This is because, in order to know, trace and
record the exact file life cycle and history, we often have to track system read/write
calls to the file system. From the system read/write calls, we can also extract the
virtual and physical memory locations of the file, providing more information for
further forensic investigations. The file-centric perspective [25] is also the area which
is less emphasized by current tools. Cloud computing needs to have more emphasis
on file-centric logging, and the tracing and logging of a files life cycle (i.e. creation,
modification, duplication, destruction).
3) Network Logs
As clouds are vast networks of physical and virtual servers over a large number of
locations, we need to also monitor network logs within the cloud. Network logs [26,
27] are logs specific to data being sent and received over the network.

4.2.2 Data Layer


This layer contains the logging of data transactions and the life cycle of data.
The difference from the system layer is that the system layer's file system logs track the life cycle of files, whereas the data layer actually tracks the life cycle of data and the contents of files. The same file can contain drastically different sets of data over time. Some examples of the data layer are: (1) data provenance, which records the so-called chains of custody [9, 16, 17] (e.g. the history of owners and authorized users) of the data found in the cloud, and (2) database logs [28, 29] (i.e. histories of updates and actions executed by a database management system on the database).
4.2.3 Workflow Layer
This layer primarily contains logs which reveal the robustness or weaknesses of
the governance and controls of a workflow or business process [30, 31] in an
organization. It correlates with an organization's strategic and management levels [32]. It is the key layer audited by most IT auditors and internal audits. Examples include: (1) audit trails from transactions in business process and workflow management systems, (2) audit trails from information systems of customer organizations (e.g. ERP systems, Human Resource systems, etc.), and (3) continuous auditing and monitoring tools.
4.3 Foreseeable Research Challenges
4.3.1 Growth of Log Size
Without a doubt, this will be an obstacle to efficient auditability and will most likely
be the major problem in cloud accountability. Some of the main areas of research will
be on the optimal period of storage of logs, how to shrink logs over time, and how and
where to store logs efficiently.
4.3.2 Security and Integrity of Logs (No Tampering)
With an increased awareness (from customers and also hackers) of the cloud's accountability capabilities, there will be attempts to tamper with logs to escape detection of fraudulent activities. One of the ways to minimize compromise of the security of logs is to make the logging silent within the cloud. The integrity and security of logs must never be compromised, or there will be leakage of user access history and the possibility of competitors studying customers' usage behavior.
4.3.3 Privacy vs. Accountability
With logs available in the cloud, hackers and corporate spies can benefit from studying these logs. Key competitive advantages may be lost, and it is of the utmost importance that CSPs place a strong focus on the privacy and secrecy of logs.

5 Technical Approaches to Increasing Accountability


With the definition of CALC and the abstraction layers of the type of data to log, we
are primed to create tools and software which will achieve cloud accountability.
Currently, we envision three possible technical approaches:

1) Central Watchdog/Manager Service
In this approach, a watchdog service manages a certain set of nodes, watches over the physical and virtual logs of all layers and stores the logs centrally. While this is more economical and easier to maintain, such a watchdog service would undoubtedly be vulnerable to network routing problems, interference or use of false identities.
2) Local File Tracking Embedment
In this approach, we envision that a file is designed to dedicate some of its memory to the storage of bite-sized local logs and provenance data. Currently, this is very difficult to achieve in current file formats as they are usually predefined without much consideration of local logging.
3) Domain Segregation
Accountability in cloud computing will be more achievable if there is a clear design of different domains from the perspective of CSPs or customers. Internal Zones can depict the CSP's own network, with Trusted Zones for its collaborators, and External Zones for networks outside these two zones. If data leaves the authorized zones, the event will be flagged.
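A small illustrative sketch of such zonal flagging follows; the zone map and address ranges are invented examples, not a real CSP topology:

# Illustrative domain segregation: classify destinations into internal, trusted
# and external zones and flag transfers that leave the authorized zones.
import ipaddress

ZONES = {
    "internal": [ipaddress.ip_network("10.0.0.0/8")],        # CSP's own network
    "trusted":  [ipaddress.ip_network("192.168.100.0/24")],  # collaborators / mirrors
}

def zone_of(address: str) -> str:
    ip = ipaddress.ip_address(address)
    for zone, nets in ZONES.items():
        if any(ip in net for net in nets):
            return zone
    return "external"

def check_transfer(src: str, dst: str) -> str:
    """Flag any transfer whose destination lies outside the authorized zones."""
    if zone_of(dst) == "external":
        return f"FLAG: data leaving authorized zones ({src} -> {dst})"
    return "ok"

print(check_transfer("10.1.2.3", "192.168.100.7"))  # ok (trusted zone)
print(check_transfer("10.1.2.3", "203.0.113.5"))    # flagged (external)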

6 Policy Approaches to Increasing Accountability


Primarily, the aim is to improve trust. While technical approaches may be efficient, we foresee that they must be accompanied by the following policy approaches.
1) Certification of Trusted Clouds
Cloud service providers have to conform to, and be audited against, a set of rules and guidelines to be labeled as a trusted cloud. A recent movement by the CSA related to this concept is the Trusted Cloud Initiative [33].
2) Cloud Trust Track Record
Cloud service providers are subject to customer ratings, and each record of a breach of trust is tabulated in a central, neutral database, which advises consumers on the particular strength and security standards of the ranked clouds. This is likened to airline safety track records and serves as a standard for cloud service providers to uphold.
3) Legislation
Governmental bodies must control and impose penalties on breaches of trust in the cloud, much like antitrust and privacy protection laws.

7 Related Research
Cloud accountability and auditing are growing areas of active research. We
summarize some key elements below:
7.1 Governance, Risk Management and Compliance (GRC) Stack of the Cloud
Security Alliance (CSA) [34]
CSA is a non-profit organization formed to promote the use of best practices for
providing security assurance within Cloud Computing, and provide education on


Cloud Computing uses [35]. Two projects from the CSA's GRC Stack [35] are very
relevant to our paper:

CloudAudit [36]: an ongoing API project hosted on Google Code, which aims
to provide the technical foundation to enable transparency and trust in private and
public cloud systems.

Trusted Cloud Initiative [33]: an initiative which aims to promote


education, research and certification of secure and interoperable identity in the cloud.
Most significant and related to our paper will be their movement towards the
certification of trusted clouds.
7.2 CSC and National Institute of Standards and Technology
In mid-2010, at the 6th Annual Information Technology Security Automation Conference
(ITSAC) hosted by the National Institute of Standards and Technology (NIST), a
representative from the technology provider CSC presented the CloudTrust Protocol
(CTP) with Security Content Automation Protocol (SCAP) [37]. The CTP with
SCAP was claimed to offer a simple way to request and receive the fundamental
information needed to address cloud transparency. At the time of writing, there is no
public release of the proposed tool.
7.3 HP Labs Cloud and Security Lab
Pearson [4, 5, 38, 39] and Mowbray [38, 40] were two of the first researchers to aim
to promote privacy protection via procedural and technical solutions encouraging the
increase of accountability in the cloud [5, 39]. Their work on cloud privacy has
addressed the high levels of the accountability layers, and this paper aims to
complement their work with the inclusion of the lower system layers identified in
Section 4.
7.4 University of Pennsylvania
Haeberlen et al. were among the first researchers to call for awareness of an accountable cloud [41]. In [41], they assumed a primitive AUDIT with considerations of agreement, service and timestamps. However, AUDIT did not have a clear explanation of the scope, scale, phases and abstraction layers of accountability. It is our aim to complement their work. Their team has also proposed an approach for accountable virtual machines [42], and discussed a case study on its application to detecting cheats in the online multi-player game Counterstrike. In our opinion, the scenario of a non-cloud-based game was not a practical business scenario for cloud accountability.
7.5 HyTrust Appliance [12]
Recently in industry, HyTrust, a startup focusing on cloud auditing and accountability,
has released a hypervisor consolidated log report and policy enforcement tool (i.e.
HyTrust Appliance) for VM accountability management in clouds. HyTrust Appliance

addresses the System layer of accountability in the cloud. Despite this, it focuses only on virtual layers and does not address virtual-to-physical complexities.
7.6 Data and Workflow Provenance Research [17]
From the field of databases, data and workflow provenance research focuses on
recording histories of derivation of final outputs of data at different levels of
abstraction within databases. Provenance research may offer clues to recording logs in
the workflow and data layers of cloud accountability.

8 Concluding Remarks
We highlighted accountability and auditability as an important perspective towards
increasing trust in cloud computing. Several complexities introduced by the cloud's elastic nature were discussed. Some examples include (1) tracking of virtual-to-physical mapping and vice versa, (2) multiple operating system environments, (3) logging from a file-centric perspective, (4) live and dynamic systems, and (5) the scale, scope and size of logging.
Achieving accountability and auditability in cloud computing will also empower:
automated monitoring and enforcement; Sarbanes-Oxley (SOX) audits in Clouds;
cloud security forensics; learning and analytics of usage behavior.
To simplify and enable efficient scoping of this complex problem, we proposed the
Cloud Accountability Life Cycle (CALC) and three abstraction layers. With these
conceptual foundations, researchers and practitioners can design tools and approaches
which address all areas of cloud accountability. This paper also discussed imminent
roadblocks to achieving accountability. In addition to related work discussions,
technical and policy approaches were suggested.
Moving forward, we are developing the different modules in the CALC, e.g.
logging and mapping of virtual machines to physical machines. We believe that with
CALC, we would have a model that enables us to have a Trusted Cloud environment
where there is accountability and auditability.
Acknowledgments. The authors would like to thank their HP Labs colleagues Peter
Jagadpramana and Miranda Mowbray for their input.

References
1. Fujitsu Research Institute: Personal data in the cloud: A global survey of consumer
attitudes (2010)
2. Gross, G.: Microsoft presses for cloud computing transparency (2010),
http://www.infoworld.com/d/cloud-computing/microsoftpresses-cloud-computing-transparency-799
3. Strukhoff, R.: Cloud Computing Vendors Need More Transparency (2010),
http://cloudcomputing.sys-con.com/node/1308929

4. Pearson, S., Benameur, A.: Privacy, Security and Trust Issues Arising from Cloud
Computing. In: The 2nd International Conference on Cloud Computing. IEEE, Indiana
(2010)
5. Pearson, S., Charlesworth, A.: Accountability as a way forward for privacy protection in
the cloud. In: Cloud Computing 2009, pp. 131-144 (2009)
6. Armbrust, M., et al.: A view of cloud computing. Communications of the ACM 53(4), 50-58 (2010)
7. Garfinkel, S.: An Evaluation of Amazon's Grid Computing Services: EC2, S3, and SQS (2007)
8. Chappell, D.: Introducing windows azure. Microsoft (2009)
9. Buneman, P., Khanna, S., Tan, W.: Data provenance: Some basic issues. In: Foundations of Software Technology and Theoretical Computer Science, pp. 87-93 (2000)
10. Cloud Security Alliance: Top Threats to Cloud Computing Report, Ver. 1.0 (2010)
11. Baldwin, A., Shiu, S., Beres, Y.: Auditing in shared distributed virtualized environments.
HP Technical Reports (2008)
12. HyTrust. HyTrust Appliance (2010),
http://www.hytrust.com/product/overview/
13. Silberschatz, A., Galvin, P., Gagne, G.: Operating system concepts. Addison-Wesley, New
York (1991)
14. Hyperic: CloudStatus (2010), http://www.cloudstatus.com/
15. Shende, J.: Live Forensics and the Cloud - Part 1. Cloud Computing Journal (2010),
http://cloudcomputing.sys-con.com/node/1547944
16. Buneman, P., Khanna, S., Wang-Chiew, T.: Why and where: A characterization of data provenance. In: International Conference on Database Theory (ICDT 2001), pp. 316–330 (2001)
17. Tan, W.: Provenance in databases: Past, current, and future. Data Engineering 2007, 3
(2007)
18. Pearson, S., Balacheff, B.: Trusted computing platforms: TCPA technology in context.
Prentice Hall PTR, Upper Saddle River (2003)
19. Proudler, G.: Concepts of trusted computing. In: Mitchell, C.J. (ed.) Trusted Computing. IEE Professional Applications of Computing Series, vol. 6, pp. 11–27. The Institute of Electrical Engineers (IEE), London (2005)
20. Hansen, S., Atkins, E.: Automated system monitoring and notification with swatch. In: USENIX Association's Proceedings of the Seventh Systems Administration (LISA VII) Conference (1993)
21. Roesch, M.: Snort-lightweight intrusion detection for networks. In: Proceedings of the
13th USENIX Conference on System Administration, LISA 1999, Seattle, Washington
(1999)
22. Zimmermann, H.: OSI reference model – The ISO model of architecture for open systems interconnection. IEEE Transactions on Communications 28(4), 425–432 (2002)
23. Stevens, W.: TCP/IP Illustrated: The Protocols, vol. I. Pearson Education, India (2004)
24. Chow, R., et al.: Controlling data in the cloud: outsourcing computation without
outsourcing control. In CCSW 2009: Proceedings of the 2009 ACM Workshop on Cloud
Computing Security. ACM, New York (2009)
25. Rosenblum, M., Ousterhout, J.: The design and implementation of a log-structured file system. ACM Transactions on Computer Systems (TOCS) 10(1), 26–52 (1992)
26. Slagell, A., Wang, J., Yurcik, W.: Network Log Anonymization: Application of CryptoPAn to Cisco NetFlows. In: NSF/AFRL Workshop on Secure Knowledge Management
(SKM 2004), Buffalo, NY (2004)


27. Slagell, A., Yurcik, W.: Sharing computer network logs for security and privacy: A
motivation for new methodologies of anonymization. In: Proceedings of SECOVAL: The
Workshop on the Value of Security Through Collaboration (August 2005)
28. Gray, J., Reuter, A.: Transaction processing: concepts and techniques. Morgan Kaufmann,
San Francisco (1993)
29. Peters, T.: The history and development of transaction log analysis. Library Hi Tech. 11(2), 41–66 (1993)
30. Ko, R.: A computer scientist's introductory guide to business process management (BPM). ACM Crossroads 15(4), 11–18 (2009)
31. Ko, R., Lee, S., Lee, E.: Business process management (BPM) standards: a survey. Business Process Management Journal 15(5), 744–791 (2009)
32. Anthony, R.: Planning and control systems: a framework for analysis. Division of
Research, Graduate School of Business Administration, Harvard University (1965)
33. Cloud Security Alliance: Trusted Cloud Initiative (2010),
http://www.cloudsecurityalliance.org/trustedcloud.html
34. Cloud Security Alliance: Cloud Security Alliance Governance, Risk Management and
Compliance (GRC) Stack (2010),
http://www.cloudsecurityalliance.org/grcstack.html
35. Cloud Security Alliance (2010), http://www.cloudsecurityalliance.org/
36. Cloud Security Alliance: CloudAudit (A6 - The Automated Audit, Assertion, Assessment,
and Assurance API) (2010), http://cloudaudit.org/
37. Knode, R.: CloudTrust 2.0 (2010), http://scap.nist.gov/events/2010/itsac/presentations/day2/Security_Automation_for_Cloud_Computing-CloudTrust_2.0.pdf
38. Mowbray, M., Pearson, S., Shen, Y.: Enhancing privacy in cloud computing via policy-based obfuscation. The Journal of Supercomputing, 1–25 (2010)
39. Pearson, S.: Taking account of privacy when designing cloud computing services. In:
Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud
Computing. IEEE, Los Alamitos (2009)
40. Mowbray, M., Pearson, S.: A client-based privacy manager for cloud computing. In:
Proceedings of the Fourth International ICST Conference on COMmunication System
softWAre and middlewaRE, COMSWARE 2009. ACM, New York (2009)
41. Haeberlen, A.: A case for the accountable cloud. ACM SIGOPS Operating Systems
Review 44(2), 5257 (2010)
42. Haeberlen, A., et al.: Accountable virtual machines. In: Proceedings of the 9th USENIX
Symposium on Operating Systems Design and Implementation, OSDI 2010 (2010)

Cloud Computing Security Issues and Challenges:


A Survey
Amandeep Verma and Sakshi Kaushal
U.I.E.T, Panjab University, Chandigarh, India
verma_aman81@yahoo.com, sakshi@pu.ac.in

Abstract. Cloud Computing has become another buzzword after Web 2.0. The phrase "cloud computing" originated from the cloud symbol commonly used in diagrams to represent the Internet. Cloud computing is not a completely new concept; it has an intricate connection to the grid computing paradigm and other relevant technologies such as utility computing, cluster computing, and distributed systems in general. With the development of cloud computing, a set of security problems appears. Security issues present a strong barrier for users to adopt Cloud Computing systems. Several surveys of potential cloud adopters indicate that security is the primary concern hindering its adoption. This paper introduces the background and service model of cloud computing. Along with this, a few security issues and challenges are also highlighted.
Keywords: Cloud computing, Grid computing, Security.

1 Introduction
Cloud computing is a new computing model that provides uniform access to wide-area distributed resources on demand. The emergence of cloud computing has made a tremendous impact on the Information Technology (IT) industry over the past few years, where large companies such as Google, Amazon and Microsoft strive to provide more powerful, reliable and cost-efficient cloud platforms, and business enterprises seek to reshape their business models to gain benefit from this new paradigm [1]. However, there still exist many problems in cloud computing today. A recent survey by the Cloud Security Alliance (CSA) [2] shows that security has become the primary concern for people considering a shift to cloud computing.
In this paper, we investigate the security concerns of current Cloud Computing systems. As Cloud Computing refers to both the applications delivered as services over the Internet and the infrastructures (i.e., the hardware and systems software in the data centers) that provide those services [3], we present the security concerns in terms of the diverse applications and infrastructures. More concerns on security issues, such as availability, confidentiality, integrity control, authorization and so on, should be taken into account.
The rest of the paper is organized as follows: Section 2 highlights the basic cloud computing definitions and architecture. Sections 3 and 4 present the security issues and challenges. The paper is concluded in Section 5.

2 Cloud Computing Definition and Features


2.1 Definition
A number of computing researchers and practitioners have attempted to define Clouds
in various ways. Here are some definitions:
NIST [4] definition of cloud computing: Cloud computing is a model for
enabling convenient, on-demand network access to a shared pool of configurable
computing resources (e.g., networks, servers, storage, applications, and services) that
can be rapidly provisioned and released with minimal management effort or service
provider interaction.
Buyya [5] defined Cloud as follows:
"A Cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resource(s) based on service-level agreements established through negotiation between the service provider and consumers."
To understand the importance of cloud computing and its adoption, one must understand its principal characteristics and its delivery and deployment models.
2.2 Characteristics
The five key characteristics of cloud computing defined by NIST include [4]:

- On-demand self-service: A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed without requiring human interaction with each service provider.
- Ubiquitous network access: Capabilities are accessed through standard mechanisms on heterogeneous thin and thick clients. Both high bandwidth and low latency are expected.
- Location-independent resource pooling: The provider's computing resources are pooled to serve all consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.
- Rapid elasticity: Resources can be quickly scaled up (or down).
- Measured service: This characteristic is primarily derived from business model properties and indicates that cloud service providers control and optimize the use of computing resources through automated resource allocation, load balancing, and metering tools.

2.3 Cloud Computing Service Model


Three cloud computing delivery service models are [5]:

i) Software as a Service (SaaS): In SaaS, the business application software is delivered to customers/clients as an on-demand service. Because clients acquire and use software components from different providers, the main issue here is that the information handled by these composed services must be well protected. Examples of SaaS providers are Salesforce, Google Apps, etc.
ii) Platform as a Service (PaaS): PaaS provides an application or development platform on which users can create their own applications that will run on the cloud. Microsoft Azure, Manjrasoft Aneka and Google App Engine are examples of PaaS providers.
iii) Infrastructure as a Service (IaaS): IaaS is the delivery of computer hardware such as servers, networking technology, storage, and data center space as a service. It may also include the delivery of operating systems and virtualization technology to manage the resources. Examples are Amazon S3, EC2, and OpenNebula.

Fig. 1 shows the general cloud computing architecture.


2.4 Cloud Deployment Models
There are four cloud deployment models:
i) Public cloud: In a public cloud, the resources are dynamically provisioned on a fine-grained, self-service basis over the Internet, via web applications/web services. The customers can quickly access these resources and pay only for the resources they use. As multiple customers share the resources, the major dangers to the public cloud are security, regulatory compliance and Quality of Service (QoS) [6, 7, 8].
ii) Private cloud: In the private cloud, computing resources are used and controlled by
a private enterprise. In private cloud, resource access is limited to the customers that
belong to the organization that owns the cloud. The main advantage of this model is
that the security and privacy of data is increased as compliance and QoS are under the
control of the enterprises [6, 7].
iii) Hybrid cloud: A third type is the hybrid cloud, which is typically a combination of public and private clouds. Through this environment, an organization can provide and manage certain resources in-house and have others provided through external resources.
iv) Community cloud: The cloud infrastructure is shared among a number of
organizations with similar interests and requirements. This may help limit the capital
expenditure costs for its establishment as the costs are shared among the
organizations. The cloud infrastructure could be hosted by a third-party vendor or
within one of the organizations in the community.
Although cloud computing is becoming a well-known buzzword nowadays, security issues present a strong barrier for users to adopt Cloud Computing systems. According to an IDC survey in August 2008, security is regarded as the top challenge among nine [9]. Fig. 2 shows the nine challenges in detail.


Fig. 1. Cloud computing architecture

3 Security Issues Associated with the Cloud


There are a number of security issues associated with cloud computing. These issues are categorized as security issues faced by cloud providers and security issues faced by their customers. In most cases, the provider must ensure that its infrastructure is secure and that its clients' data and applications are protected, while the customer must ensure that the provider has taken the proper security measures to protect their information.
The following list contains several security issues highlighted by Gartner [10]:
Privileged access: Who has specialized/privileged access to data? Who decides
about the hiring and management of such administrators?
Data location: Does the cloud vendor allow for any control over the location of
data?
Data segregation: Is encryption available at all stages, and were these encryption
schemes designed and tested by experienced professionals?
Data availability: Can the cloud vendor move their entire clients' data onto a
different environment should the existing environment become compromised or
unavailable?
Regulatory compliance: Is the cloud vendor willing to undergo external audits
and/or security certifications?
Recovery: What happens to data in the case of a disaster, and does the vendor
offer complete restoration, and, if so, how long does that process take?
Investigative Support: Does the vendor have the ability to investigate any
inappropriate or illegal activity?
Long-term viability: What happens to data if the cloud vendor goes out of business? Is clients' data returned, and in what format?


Fig. 2. Rate the challenges/issues ascribed to cloud on-demand model

3.1 Security Issues Based on the Delivery and Deployment Model of Cloud
In SaaS, providers are more responsible for security; the clients have to depend on providers for security measures. As a public cloud is less secure than a private cloud, stronger security measures are required in the public cloud. Also, in SaaS it becomes difficult for the user to ensure whether proper security is maintained. Private clouds could also demand more extensibility to accommodate customized requirements. The following key security elements [11] should be carefully considered as an integral part of the SaaS application development and deployment process:
i) Data security
ii) Data locality
iii) Data integrity
iv) Data segregation
v) Data access
vi) Data confidentiality
vii) Network security
viii) Authentication and authorization
ix) Availability
x) Identity management and sign-on process

In PaaS, customers are able to build their own applications on top of the platforms provided. Thus it is the responsibility of the customers to protect their applications, as providers are only responsible for isolating the customers' applications and workspaces from one another [6]. So, maintaining the integrity of applications and enforcing authentication checks are the fundamental security requirements in PaaS.


IaaS is mainly used as a delivery model. The major security concern in IaaS is to maintain control over the customers' data stored in the provider's hardware. The consumers are responsible for securing the operating systems, applications, and content. The cloud provider must provide low-level data protection capabilities [6].
Based upon the deployment model, public clouds are less secure than the other cloud models, as they allow users to access data across a wide area network. In a public cloud, additional security measures such as trust are required to ensure that all applications and data accessed on the public cloud are not subjected to malicious attacks [12]. Utilization of the private cloud can be much more secure than that of the public cloud because it is dedicated to a particular organization. A hybrid cloud is a private cloud linked to one or more public clouds. Hybrid clouds provide more secure control of the data and applications as everything is centrally managed [12].
Fig. 3 illustrates the information security requirements coupled with the Cloud computing deployment and delivery models [12, 13]. In Fig. 3 [12], an 'X' denotes a mandatory requirement and an asterisk (*) denotes an optional requirement. Each of the security requirements is highlighted below in the context of cloud computing:
A. Authorization
Authorization is an important information security requirement in Cloud computing to ensure that referential integrity is maintained. It follows on in exerting control and privileges over process flows within cloud computing. In the case of a public cloud, multiple customers share the computing resources provided by a single service provider, so proper authorization is required irrespective of the delivery model used. In a private cloud, authorization is maintained by the system administrator.
B. Identification & authentication
As the major concerns in public and private clouds include internal and external threats, data collection, privacy and compliance, the cloud service provider must have a secure infrastructure to protect customer data and guard against unauthorized access. We need an identification and authentication process for verifying and validating individual cloud users based upon their credentials before they access any data over the cloud. That is why identification and authentication is a mandatory security requirement in both public and private clouds.
C. Integrity
The integrity requirement lies in applying due diligence within the cloud domain, mainly when accessing data. Therefore, the ACID (atomicity, consistency, isolation and durability) properties of the cloud's data should without a doubt be robustly imposed across all Cloud computing delivery models.
D. Confidentiality
In Cloud computing, confidentiality plays a major part, especially in maintaining control over organizations' data situated across multiple distributed databases. Asserting the confidentiality of users' profiles and protecting their data, which is accessed virtually, allows information security protocols to be enforced at various layers of cloud applications.


Data confidentiality is one of the most difficult things to guarantee in a public cloud computing environment. There are several reasons for that: First, as public clouds grow, the number of people working for the cloud provider who actually have access to customer data (whether they are entitled to it or not) grows as well, thereby multiplying the number of potential sources for a confidentiality breach. Second, the needs for elasticity, performance, and fault-tolerance lead to massive data duplication and require aggressive data caching, which in turn multiply the number of targets a data thief can go after. Third, end-to-end data encryption is not yet available. So, data confidentiality will be maximized by using a large number of private clouds managed by trusted parties.
E. Availability
Availability is one of the most critical information security requirements in Cloud
computing because it is a key decision factor when deciding among private, public or
hybrid cloud vendors as well as in the delivery models. The service level agreement is
the most important document which highlights the trepidation of availability in cloud
services and resources between the cloud provider and client.
The goal of availability for Cloud Computing systems (including applications and
its infrastructures) is to ensure its users can use them at any time, at any place. Many
Cloud Computing system vendors provide Cloud infrastructures and platforms based
on virtual machines. Availability is therefore a mandatory security requirement for IaaS and PaaS, whether a public or a private cloud is used. Since in a private cloud all services are internal to the enterprise, availability is also required when SaaS is used.
F. Non-repudiation
Non-repudiation in cloud computing can be obtained by applying the traditional e-commerce security protocols and token provisioning to data transmission within cloud
applications such as digital signatures, timestamps and confirmation receipts services
(digital receipting of messages confirming data sent/received).
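As a minimal, non-authoritative illustration of the kind of mechanism referred to above, the following Python sketch signs a timestamped transaction record with an RSA key and verifies it with the corresponding public key, using the third-party cryptography package; the record fields, key size and choice of RSA-PSS are illustrative assumptions rather than part of any surveyed cloud system.

import json, time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The sender's private key; the matching public key is shared with verifiers.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# A timestamped record of what was sent (hypothetical field names).
record = json.dumps({"user": "alice", "action": "upload report.pdf",
                     "timestamp": time.time()}).encode()

signature = private_key.sign(
    record,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# The receiver verifies with the sender's public key; this raises an
# exception if the record or signature has been tampered with.
private_key.public_key().verify(
    signature, record,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

Because only the holder of the private key could have produced the signature, the sender cannot later deny having sent the record, which is the essence of non-repudiation.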

Fig. 3. Cloud computing security requirements


4 Security Challenges
Cloud computing environments are multidomain environments in which each domain can use different security, privacy, and trust requirements and potentially employ various mechanisms, interfaces, and semantics [6]. The main security challenges in cloud computing and their solutions are discussed below:
4.1 Service Level Agreement
A Service Level Agreement (SLA) [14] is a part of a service contract between the consumer and provider that formally defines the level of service. It is used to identify and define the customer's needs and to reduce areas of conflict such as: services to be delivered; performance, tracking and reporting; problem management; legal compliance and resolution of disputes; customer duties and responsibilities; security; IPR and confidential information; and termination.
4.2 Authentication and Identity Management

By using the cloud services, the user can access information from various places over the Internet, so we need an Identity Management (IDM) [6] mechanism to authenticate users and provide services to them based on credentials and characteristics. An IDM system should be able to protect private and sensitive information related to users and processes. Every enterprise will have its own identity management system to control access to information and computing resources.
4.3 Data-Centric Security and Protection
In cloud computing, a number of customers can share, save and access data over the cloud. So data from one customer must be properly segregated from that of another, and it must be able to move securely from one location to another [6]. Cloud providers must implement proper security measures to prevent data leaks or access by unauthorized third parties. The cloud provider should carefully assign privileges to
the customers and also ensure that assigned duties cannot be defeated, even by
privileged users at the cloud provider. Access control policies should be properly
implemented. When someone wants to access data, the system should check its policy
rules and reveal it only if the policies are satisfied. Existing cryptographic techniques
can be used for data security.
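One concrete reading of "existing cryptographic techniques can be used for data security" is client-side encryption, where data is encrypted before it ever reaches the provider. The sketch below is a minimal example assuming the Python cryptography package and hypothetical data; the key stays with the data owner, so the cloud store only ever holds ciphertext.

from cryptography.fernet import Fernet

key = Fernet.generate_key()        # kept by the data owner, never uploaded
cipher = Fernet(key)

plaintext = b"customer-id=42; balance=1200.50"
ciphertext = cipher.encrypt(plaintext)      # this is what the cloud stores

# Only a client holding the key can recover the data later.
assert cipher.decrypt(ciphertext) == plaintext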
4.4 Trust Management
In cloud computing environments, the customer depends on the provider for various services. For many services, the customer has to store his confidential data on the provider's side. Thus, a trust framework should be developed to allow for efficiently capturing a generic set of parameters required for establishing trust and to manage evolving trust and interaction/sharing requirements.


4.5 Access Control and Accounting


Due to heterogeneity and diversity in cloud computing services, fine-grained access control policies should be enforced. Access control services should be flexible enough to capture dynamic, attribute- or credential-based access requirements. The access control models should also be able to capture relevant aspects of SLAs. As the cloud computing model is a pay-per-use model, proper accounting records for users are required for billing purposes. In clouds, service providers usually do not know their users in advance, so it is difficult to assign roles to users directly. Therefore, credential- or attribute-based policies can be used to enhance this capability. Security
Assertion Markup Language (SAML), Extensible Access Control Markup Language
(XACML), and Web services standards can be used to specify the secure access
control policies. Among the many methods proposed so far, Role-Based Access
Control (RBAC) [6] has been widely accepted because of its simplicity, flexibility in
capturing dynamic requirements, and support for the principle of least privilege and
efficient privilege management.
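To make the idea of role- and attribute-based policies more concrete, the following toy Python sketch combines an RBAC permission table with a tenant-attribute check; it is an illustration only, not SAML or XACML, and all names are hypothetical.

# Toy policy decision point combining RBAC with an attribute rule.
ROLE_PERMISSIONS = {
    "auditor": {"read_logs"},
    "tenant_user": {"read_data", "write_data"},
}

def is_permitted(user_roles, user_attrs, action, resource_attrs):
    # Allow an action if some role grants it and the tenant attributes match.
    role_ok = any(action in ROLE_PERMISSIONS.get(r, set()) for r in user_roles)
    # Attribute rule: users may only touch resources of their own tenant.
    tenant_ok = user_attrs.get("tenant") == resource_attrs.get("tenant")
    return role_ok and tenant_ok

print(is_permitted(["tenant_user"], {"tenant": "acme"},
                   "write_data", {"tenant": "acme"}))   # True
print(is_permitted(["auditor"], {"tenant": "acme"},
                   "write_data", {"tenant": "acme"}))   # False

In a real deployment, such rules would be expressed in standards such as XACML and evaluated by the provider's access control service rather than in application code.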

5 Conclusion
In this paper, key security considerations and challenges currently faced in Cloud computing are highlighted. Many enhancements to existing solutions, as well as more mature and newer solutions, are urgently needed to ensure that cloud computing benefits are fully realized as its adoption accelerates. Cloud computing is still in its infancy, and how the security and privacy landscape changes will impact its successful, widespread adoption.

References
1. Zhang, Q., Cheng, L., Boutaba, R.: Cloud computing: state-of-the-art and research challenges. Journal of Internet Services and Applications 1(1), 7–18 (2010)
2. Cloud Security Alliance, http://www.cloudsecurityalliance.org
3. Zhou, M., Zhang, R., Xie, W., Qian, W., Zhou, A.: Security and privacy in cloud computing: a survey. In: The Proceedings of IEEE 6th International Conference on Semantics, Knowledge and Grids, pp. 105–111 (2010)
4. Mell, P., Grance, T.: The NIST definition of Cloud Computing, version 15. National
Institute of Standards and Technology (NIST), Information Technology Laboratory
(October 7, 2009), http://www.csrc.nist.gov
5. Buyya, R., Yeo, C.S., Venugopal, S., Broberg, J., Brandic, I.: Cloud Computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems 25(6), 599–616 (2009)
6. Takabi, H., Joshi, J.B.D.: Security and privacy challenges in cloud computing
environment. IEEE Journal on Security and Privacy 8(6) (November 2010)
7. Yang, J., Chen, Z.: Cloud computing research and security issues. In: The Proceeding of
IEEE International Conference on Computational Intelligence and Software Engineering,
pp. 1–3 (2010)
8. Kaur, P., Kaushal, S.: Security concerns in cloud computing. In: Accepted For
International Conference on High Performance Architecture And Grid Computing-2011.
Chitkara University, Rajpura (2011)


9. Gens, F.: New IDC IT Cloud Services Survey: Top Benefits and Challenges. In: IDC
eXchange (2009), http://blogs.idc.com/ie/?p=730
10. Brodkin, J.: Gartner: Seven cloud-computing security risks. In: Infoworld 2008 (2008),
http://www.infoworld.com/d/security-central/gartner-sevencloudcomputing-security-risks-53?page=0,1
11. Subashini, S., Kavitha, V.: A survey on security issues in service delivery models of cloud
computing. Journal of Network and Computer Applications, 1–11 (2010)
12. Ramgovind, S., Eloff, M.M., Smith, E.: The management of security in cloud computing.
In: The Proceedings of IEEE Conference on Information Security for South Africa-2010
(2010)
13. Dlamini, M.T., Eloff, M.M., Eloff, J.H.P.: Internet of People, Things and Services The
Convergence of Security, Trust and Privacy. In: The Proceeding of 3rd Annual
CompanionAble Consortium Workshop-IoPTs, Brussel (December 2009)
14. Kandukuri, B.R., Paturi, R., Rakshit, A.: Cloud Security Issues. In: The Proceedings of
IEEE International Conference on Services Computing, pp. 517–520 (2009)

A Deadline and Budget Constrained Cost and Time


Optimization Algorithm for Cloud Computing
Venkatarami Reddy Chintapalli
IBM India Pvt ltd, Hyderabad
ramireddy556@gmail.com

Abstract. Cloud computing is a rapidly developing area, and resource allocation is an important enabling technology for cloud computing environments. Users submit service requests to clouds for computation. Along with the service requests, users may give constraints such as deadline, budget, reliability, and trust/security. In this paper, we consider two constraints: deadline and budget. To improve resource utilization and QoS, we use the concept of the RAINBOW service computing framework. We propose a cost and time optimization algorithm for allocating resources to service requests, considering multiple clouds and using the RAINBOW framework, in such a way that the user's requirements are met with minimum cost. In cloud computing there is a need for sharing resources like storage, processing time, memory, network bandwidth, etc. In this paper, we consider processing time as the main resource, which is to be allocated to competing clients.
Keywords: Cloud computing, resource allocation, RAINBOW framework.

1 Introduction
Cloud computing [1] is Internet-based system development in which large, scalable computing resources are provided as services over the Internet to users. The services that can be provided from the cloud include Software as a Service (SaaS), Platform as a Service (PaaS) and Infrastructure as a Service (IaaS). Cloud computing has recently become more and more popular in large-scale computing and data storage because it enables the sharing of computing resources that are distributed all over the world and allows enterprises to obtain as much computation and storage as they require, while paying only for the precise amount that they use. Customers pay for the computational services that they receive, just as we pay for Internet services, electricity and gas. In an open cloud computing framework, scheduling service requests while guaranteeing QoS constraints presents a challenging technical problem [3].
Clouds aim to power the next-generation data centers by exposing them as a network of virtual services (hardware, database, user interface, application logic) so that users are able to access and deploy applications from anywhere in the world on demand at competitive costs depending on users' QoS (Quality of Service) requirements [6]. In cloud computing there are resources like storage, processing,
memory, network bandwidth, etc. We consider processing as the main resource. In a cloud, different types of resources, such as supercomputer nodes, cluster nodes, workstation nodes and desktop PCs, have different speeds. Each resource has a fixed price according to its capacity. The consumers submit requests to the cloud for computation. Along with service requests, users may give many constraints such as cost, time, reliability, and trust/security, but we consider deadline and budget as our constraints. We use the concept of the RAINBOW framework [2] to improve resource utilization. The main issue is achieving the user-level requirements by allocating the currently available resources at the cloud side.
The remainder of the paper is structured as follows. Section 2 presents the RAINBOW framework. Section 3 gives motivations and related work. Section 4 presents the design of resource allocation in Clouds and the proposed algorithm. Section 5 describes the experimental framework and the time complexity of the proposed algorithm. Section 6 concludes the paper.

2 Novel Service Computing Framework-RAINBOW


It is a new trend for enterprise data centers, for example those of Google and Amazon, to provide heterogeneous services concurrently. Google provides services consisting of Google search, Google office, and YouTube. In the past, those services were provided by different platforms. In such a case, guaranteeing the QoS of services whose capacity demands (including computing, storage, and communication) vary over time as a result of request arrival distributions led to over-provisioning each service. Such data centers were often underutilized. One approach to increase resource utilization is consolidating services in a shared infrastructure, i.e., utility computing [12]. In such a shared platform, isolation among the hosted services is crucial. As virtualization becomes increasingly popular, utility computing is incorporating virtualization technology such as virtual machines (VMs) with effective isolation among services. Many companies envisioning this popular trend have devoted themselves to developing new utility computing infrastructures based on virtualization technologies such as VMware [14] and XenSource [13]. VM-based utility computing has some obvious advantages like consolidation, isolation, and flexible resource management. We use the novel service computing framework RAINBOW (illustrated in Fig. 2) to improve resource utilization and QoS. Different from the traditional service computing framework (illustrated in Fig. 1), in which one service runs on a set of dedicated servers, RAINBOW uses virtualization to isolate concurrent services in a shared physical infrastructure [4].
In order to minimize the interaction among the hosted services due to their competition for resources, services with the same resource-bound should be distributed onto different physical servers. In RAINBOW, a set of VMs serving a particular service is called a group. The key principle is that VMs belonging to a single group are spread across multiple servers, while each server hosts VMs belonging to different groups. This principle aims to reduce the competition for resources by the hosted services in a server [12].


Fig. 1. Traditional service computing framework using dedicated resources [2]

Fig. 2. RAINBOW-VM based service computing framework for utility computing [2]
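As a small illustration of the placement principle described above (a sketch under assumed service and server names, not the authors' implementation), the snippet below spreads each service's VM group across the server pool in a round-robin fashion, so that VMs of the same group land on different servers and each server hosts a mix of services.

from collections import defaultdict

servers = ["srv1", "srv2", "srv3"]
vm_groups = {"search": 3, "office": 2, "video": 3}   # VMs needed per service

placement = defaultdict(list)
for offset, (service, count) in enumerate(vm_groups.items()):
    for i in range(count):
        # Offset each group so its VMs fall on different servers.
        server = servers[(offset + i) % len(servers)]
        placement[server].append(f"{service}-vm{i}")

for server, vms in placement.items():
    print(server, vms)   # each server hosts VMs of different groups

As long as no group needs more VMs than there are servers, no server receives two VMs of the same group, which is exactly the isolation the framework aims for.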

3 Motivations and Related Work


Development of QoS and resource management middleware has been studied in
various research areas, such as real-time systems, grid computing and cloud
computing. A workflow-based computational resource broker [10, 11] has been presented for the grid computing environment. The main function of the resource broker is to monitor the available resources over multiple administrative domains. The resource broker provides a uniform interface for accessing the available system resources of a computing
A market-based autonomic resource management approach in cloud computing
environment was developed [6], in which a Service-Level Agreement Resource Allocator provides the interfaces between the cloud service provider and external users/brokers for 1) monitoring users' service requests and QoS requirements, 2) monitoring the availability of system resources, 3) examining service requests to determine whether to accept the requests according to resource availability and processing workload, 4) allocating system resources to VMs in the cloud, 5) pricing the usage of resources and
prioritizing the resource allocation, and 6) keeping track of the execution progress of the service requests and maintaining the actual usage of resources. This approach
supports negotiation of QoS between users and providers to establish service-level
agreement (SLA) and allocation of system resources to meet the SLAs.
In grid computing, auction systems and reservation systems are two existing resource allocation policies. In an auction/bid resource allocation system [7, 8], only one winner gets a particular resource; all losers have to bid for other resources elsewhere, and may fail many times before they get a chance to run their jobs. Those who have less money may never get a chance to run their jobs, so it is neither efficient nor fair. On the other hand, once a bid is accepted, it is fixed; if a user or the system wants to change the resource allocation, that will be difficult. In a reservation system [9], the advantage is that resources are guaranteed in advance. Reservation systems reduce the risk of jobs, but the reservation may not be accurate due to inaccuracy of job length estimation. For example, if a job reserves one hour but actually needs 1.5 hours, the job may be abandoned after 1 hour. But if a user reserves 2 hours to make sure the job is completed, then 0.5 hour is wasted, since no other jobs can use it. Although reservations allow low risk and low latency, the efficiency is also low because some tasks do not use their entire reservations.
In contrast, our proposed algorithm considers multiple clouds by introducing three entities and uses the RAINBOW framework to improve resource utilization. The major distinction between our resource allocation approach and the other approaches discussed in the next section is that our approach can maximize the utilization of limited system resources to reach maximum throughput according to throughput requirements and resource status, rather than just serving users' service requests in a first-come-first-served manner by matching available system resources upon the arrival of user requests and requirements.

4 Resource Allocation in Clouds


Current Cloud Computing providers have several data centers at different geographical locations over the Internet in order to optimally serve consumers' needs around the world. However, existing systems do not support mechanisms and policies for dynamically coordinating load-shedding among different data centers in order to determine the optimal location for hosting application services to achieve reasonable service satisfaction levels. For that, we include three entities called users, brokers and resource allocators. Fig. 3 shows the coordination among these three entities in multiple clouds. We consider multiple cloud providers for allocating the service requests to resources.
Users: Users, acting on their own behalf, submit service requests from anywhere in the world to the cloud to be processed.
Brokers: Brokers collect the service requests from the different users and send these service requests to the resource allocators in different clouds. The brokers get the estimated cost and time of the service requests from the resource allocators of the different cloud providers and decide whether to accept or reject a request by checking the QoS requirements. After checking the QoS requirements, the broker assigns the job to the service provider with minimum cost.

Fig. 3. Coordination among users, brokers, and resource allocators in multiple clouds
Resource Allocator: The resource allocator is an interface between the cloud provider and the brokers. The resource allocator in every cloud takes the service requests from the brokers and, by applying the proposed algorithm, obtains the estimated cost and time. These values are sent to the broker from which the request was received.
4.1 Cost and Time Optimization Algorithm
Assumptions:
(1) For each service request, the time spent on each available resource can be
known. Many techniques are available to achieve this [5].
(2) Each resource has a fixed price according to its capacity.
Users submit the service requests to a broker from anywhere in the world. Along with the service requests, the users give the budget and deadline constraints. After collecting the service requests from different users, the brokers send these requests to multiple clouds for calculating the estimated costs and times. After getting a request from a broker, every cloud finds the estimated cost and time for that request using the given algorithm and sends this information to the broker from which it got the request. In any cloud, if the resources are not free, then by adding the request to the resource queue it finds the estimated cost and time. After collecting the estimated costs and times from multiple clouds, the broker submits the request to the cloud which completes the request with minimum cost and time while satisfying the user requirements. The pre-requirement of our algorithm is that, for each service request, the time spent on each available resource is known.


Algorithm
Input: Requests with deadline (Di) and budget (Bi) constraints.
Output: Either allocate the request to the cloud which takes minimum cost to complete it, or reject the request if it does not satisfy the constraints.
Begin
Steps:
1: For all the service requests i
2: For all cloud j
Begin inner for loop
3: Get the available resource information in the cloud (from Resource
Allocator).
4: For each available resource, find out how much time it will take to complete the request and at what cost on that resource (let the time be saved in CalculatedETi and the cost in CalculatedCi).
5: SORT the resources in increasing order of cost. If two or more resources have the same cost, order them such that the more powerful ones are preferred first (those which take less time to complete the request).
6: For all sorted resources
Do
7: If (CalculatedETi < Di and CalculatedCi < Bi)
1. Send this information to the broker from which the request came.
2. Come out of the inner for loop.
End for
8: If there is no resource satisfying the constraints
/* that means it is not possible to allocate the request to the currently available resources */
9: Then, by adding the request to the resource queue, estimate the cost and time for all the resources in the cloud.
10: Among these, select the one satisfying the given constraints with minimum cost and return the estimated cost and time to the broker from which the request came.
End inner for loop
11: If it is not possible to allocate the request to a resource in any cloud, then send a message to the broker that allocation is not possible.
End
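A compact, non-authoritative Python sketch of the selection logic above is given below. It assumes that each cloud can report, for a given request, the (cost, time) estimate of running it on each currently free resource; the queue-based re-estimation of steps 8-10 is omitted, and all names and data structures are illustrative.

def best_offer_in_cloud(resources, deadline, budget):
    # resources: list of (cost, time) estimates for one request in one cloud.
    # Returns the cheapest feasible (cost, time), preferring faster on ties.
    feasible = [(c, t) for (c, t) in resources if t < deadline and c < budget]
    return min(feasible, default=None)    # tuple order: cost first, then time

def broker_allocate(clouds, deadline, budget):
    # clouds: {cloud_name: [(cost, time), ...]} of free-resource estimates.
    # Picks the cloud whose best offer is cheapest, ties broken by time.
    offers = {name: best_offer_in_cloud(res, deadline, budget)
              for name, res in clouds.items()}
    offers = {name: o for name, o in offers.items() if o is not None}
    if not offers:
        return None                       # "allocation is not possible"
    return min(offers.items(), key=lambda kv: kv[1])

clouds = {"cloudA": [(8, 40), (6, 90)], "cloudB": [(7, 60), (12, 20)]}
print(broker_allocate(clouds, deadline=80, budget=9))  # ('cloudB', (7, 60))

In this toy run, cloudB wins because its feasible offer is cheaper than cloudA's, even though cloudA's offer would finish earlier, matching the minimum-cost criterion of the algorithm.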

5 Experimental Framework
In this section we provide some details about the implementation of the proposed algorithm. We have developed a simulation setup for testing this algorithm in Java. We consider multiple users, multiple brokers and multiple cloud providers. Every user is associated with one broker and each broker is connected to different cloud providers. Here, we generate the required number of cloud providers and the required number of free servers in every cloud dynamically, and on each server we generate random data. Every user has a separate user ID, every broker has a related broker ID, and the service request from every user has a separate request ID.
User Modeling: The users give the type of service and request number to the corresponding broker along with the two constraints, deadline and budget. Every request has a separate request ID.
Broker Modeling: The broker gets the service requests from the different connected users. The broker saves this information for further processing and sends every request to the connected cloud providers. After finding the estimated cost and time for the given service requests, the resource allocators in all clouds send this information to the broker from which they got the requests. The broker selects the cloud provider which takes the least cost by using quick sort. If more than one cloud provides the same cost, then the cloud provider which computes in less time is considered.
Resource Allocator Modeling: Every cloud has a resource allocator. It gets the service requests from the different brokers along with the given constraints. It has the currently available resource information, so it finds the estimated cost and time on those resources. Among these, it selects the resource which takes the least cost; if resources complete with the same cost, then time is considered. It finds the estimated cost and time for all requests it has received. If no resource in that cloud satisfies the given constraints, it simply sends a message saying that the request cannot be satisfied. After getting the confirmation of resource allocation from the broker, the cloud computes the request and sends the results to the particular user based on the user ID.
5.1 Time Complexity
The time complexity of the proposed algorithm depends on the number of clouds and
the availability of resources in every cloud. Let us assume that n is the number of
cloud providers and m is the maximum number of available resources among all
clouds. The estimated cost for the given service request will be calculated for each
cloud and the cloud which takes minimum cost will be selected. So, the time
complexity of the proposed algorithm is O(nm log m + n log n).

6 Conclusion and Future Work


Cloud computing is a new and promising paradigm delivering IT services as
computing utilities. As Clouds are designed to provide services to external users,
providers need to be compensated for sharing their resources and capabilities. In this
paper, we have proposed an algorithm for allocating resources to the service requests with minimum cost while satisfying the given constraints, deadline and budget. Here we consider multiple cloud providers for allocating these service requests. We implemented this algorithm, and based on the results we can conclude that it runs in linear time.
Tools can be used for the implementation to obtain a more realistic environment. Currently, we give the same priority to all services. Different priorities can be given to different services, and varying resource costs can also be considered based on time and demand.


References
1. Wang, L., Tao, J., Kunze, M., Canales Castellanos, A., Kramer, D., Karl, W.: Cloud
Computing: Early Definition and Experience. In: 10th IEEE International Conference on
High Performance Computing and Communications (2008)
2. Song, Y., Li, Y., Wang, H., Zhang, Y., Feng, B., Zang, H., Sun, Y.: A service-oriented
priority-based resource scheduling scheme for virtualized utility computing. In:
Sadayappan, P., Parashar, M., Badrinath, R., Prasanna, V.K. (eds.) HiPC 2008. LNCS,
vol. 5374, pp. 220–231. Springer, Heidelberg (2008)
3. Guiyi, W., Athanasios, V., Yao, Z., Naixue, X.: A game-theoretic method of fair resource
allocation for cloud computing services. Springer Science+Business Media, LLC (2009)
4. Song, Y., Wang, H., Li., Y., Feng, B., Sun, Y.: Multi-Tiered On-Demand Resource
Scheduling for VM-Based Data Center. In: 9th IEEE/ACM International Symposium on
Cluster Computing and the Grid, pp. 148–155 (2009)
5. Dogan, A., Ozguner, F.: Scheduling independent tasks with QoS requirements in grid
computing with time-varying resource prices. In: CCGRID, pp. 58–69 (2002)
6. Buyya, R., Yeo, C.S., Venugopal, S.: Market-oriented cloud computing: Vision, hype, and
reality for delivering IT services as computing utilities. In: Proceedings of the 10th IEEE
International Conference on High Performance Computing and Communications (2008)
7. Buyya, R., Abramson, D., Giddy, J., Stockinger, H.: Economic models for resource
management and scheduling in grid computing. In: The Journal of Concurrency and Computation: Practice and Experience (CCPE), May (2002)
8. Lawson, B., Smirni, E.: Multiple-queue backfilling scheduling with priorities and
reservations for parallel systems. In: 8th Workshop on Job Scheduling Strategies for
Parallel Processing (2002)
9. Lai, K., Rasmusson, L., Adar, E., Sorkin, S., Zhang, L., Huberman, B.A.: Tycoon: a
distributed market-based resource allocation system. Technical report, Hewlett-Packard
Laboratories, Palo Alto, CA (2004)
10. Yang, C., Lin, C., Chen, S.: A Workflow-based Computational Resource Broker with
Information Monitoring in Grids. In: 5th International Conf. Grid and Cooperative
Computing, pp. 105206 (2006)
11. Venugopal, S., Chu, X., Buyya, R.: A negotiation Mechanism for Advance Resource
Reservation using the Alternate Offers Protocol. In: 16th International Workshop on
Quality of Service (2008)
12. Hewlett-Packard : HP Utility Data Centre Technical White Paper (October 2001),
http://www.hp.com
13. Barham, P., Dragovic, B., et al.: Xen and the art of virtualization. In: SOSP, pp. 164–177
(2003)
14. VMware Infrastructure: Resource Management with VMware DRS by VMware, Inc. 3145
Porter Drive Palo Alto (2006)

A Bit Modification Technique for Watermarking Images


and Streaming Video
Kaliappan Gopalan
Department of Electrical and Computer Engineering
Purdue University Calumet
Hammond, IN 46321, U.S.A.
gopalan@purduecal.edu

Abstract. This paper demonstrates a bit modification technique for embedding watermarks of known strings of data applied to a color image. For
watermarking of images and video frames, low payload information can be
embedded indiscernibly and with quick and easy manipulation of pixel bits.
Robustness of the hidden watermark with added noise is shown as a tradeoff
with visibility, as is the case with the size of the watermark. By retrieving the
hidden watermark in an oblivious manner, the technique can be used in
copyright and authentication applications. The simplicity of embedding and
blind recovery with low level noise-resistance render the proposed technique
suitable for imperceptible watermarking of still images and streaming video
over public networks.
Keywords: Watermarking, Data embedding, Oblivious retrieval, Visual
perception, Bit modification.

1 Introduction
Multimedia watermarking is concerned with imperceptibly embedding a short amount
of information in an audio, image or video frame. With the rapid growth of Internet
technologies in all walks of life, copyright protection, authentication of ownership,
and validation of multimedia sources are of paramount importance. By indiscernibly
hiding a watermark on a stock picture, for instance, any alteration, or claims of false
ownership, can be detected. Other applications of watermarking include automatic
retrieval of patient records such as radiographic images, invisible marking to avoid
tampering of surveillance video tapes with location, time, etc., and fingerprinting.
In this paper we describe a technique for embedding predetermined information as
watermark on a set of pixels in a compressed color image using pixel modification at
a selected bit index. The proposed technique ensures imperceptibility of the
watermarked image while maintaining robustness to additive noise simulating
attacks at low levels. In the next section a brief review of some of the existing
image watermarking methods including the bit modification technique, and issues in
watermarking in general are presented. Following this, we describe the proposed bit modification of pixels with experimental results. Discussion of the robustness of the technique and variations to improve imperceptibility, and extensions to video frames, conclude the paper.

2 Image Watermarking
Digital watermarking can be considered a subset of steganography or data hiding on a
medium for the purpose of establishing ownership or integrity of the media signal.
While the goal of steganography is to transmit a significantly large payload of hidden
information employing an innocuous signal, watermarking embeds a short amount of
information on a specific image, video or audio signal to ascertain the author and/or
authenticity of the signal. Both cases rely on the imperfections of the human visual
and/or auditory system in perceiving changes made to a multimedia signal.
Generally, it is preferable to carry out embedding and retrieval of embedded
information with the use of a strong key to thwart illegal attempts to extract or destroy
the information.
For easy retrieval of the embedded information, a comparison between the original
(host) media signal and the embedded stego signal can be carried out in spatial or
spectral domain, for example with the difference yielding the hidden information.
While this escrow technique is quite simple, it requires the original signal, which
renders it unsuitable for watermarking or for covert communication if the same host is
used more than once. A better alternative is to employ oblivious extraction of the
hidden information using a key and/or location associated with the embedding
process. Oblivious or blind extraction, clearly, is preferred for watermarking for
copyright and authentication applications.
In the case of watermarking or information hiding on a host image or video frame,
embedding techniques generally exploit the psychovisual masking phenomenon due
to the low sensitivity of the human visual system (HVS) to small changes in
luminance, masking effect of the edges, varying sensitivity to contrasts as a function
of spatial frequency, and low sensitivity to very low spatial frequencies as in
continuous changes in brightness in an image.
Conceptually, a set of visually masked two-dimensional spectral points can be
determined for a given host image, and pixels may be modified at some or all of these
points in the spatial or frequency domain in accordance with the data for
imperceptible hiding [1 - 4]. Masked spectral points can be obtained using
psychovisual contrast or pattern masking frequencies from the discrete cosine
transform (DCT) of each block of an image. Difficulty of and the number of steps
involved in evaluating the masked points, however, has led to the development of
techniques that take advantage of the HVS limitation in an indirect manner.
Manipulation of the DCT coefficients of an image by rounding or otherwise changing
their values, for example, can result in an indiscernible image [2].
Image embedding techniques for covert communications typically use spread
spectrum steganography in which a narrow band pseudorandom noise is spread over a
varying wide band carrier [5 - 7]. The low density of the noise spread over the carrier
renders the embedding imperceptible. Although it is a highly robust, multiple key-based technique, the complexity of the encoding and decoding processes limits the use of spread spectrum image steganography to secure military communications.


Bit modification, by inserting a small amount of data in the least significant bit (LSB) position of pixels in particular, is a simple and common technique for high-payload embedding. By altering the LSB values of selected pixels based on their locations or intensity levels, a large volume of information can be hidden with little noticeable change in the image. Pixel bit alteration in accordance with the hidden data may be carried out, and extracted from the stego image, with or without a key [2].
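For reference, the basic pixel-bit manipulation behind LSB (or higher-index) embedding can be written in a couple of lines; this is a generic illustration in Python rather than code from the cited works.

def embed_bit(p, bit, k=0):
    # Write `bit` at bit index k of an 8-bit pixel value p (k = 0 is the LSB).
    return (p & ~(1 << k)) | ((bit & 1) << k)

def extract_bit(p, k=0):
    # Read the bit back out of a stego pixel.
    return (p >> k) & 1

assert extract_bit(embed_bit(200, 1, k=5), k=5) == 1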
Extending the technique with a key for audio embedding in secure communication
applications has received attention recently [8-10]. In particular, altering other than
the LSB for data robustness has been shown to be a viable technique for reasonably
high payload audio steganography [8, 10]. This technique, using a key of any desired
size, is also highly useful in covert transmission due to its oblivious extraction of the
hidden data.
The next section describes the proposed pixel embedding procedure at a selected
bit index for watermarking an image with a known set of data.

3 Bit Modification Image Watermarking at High Bit Index


The algorithm for watermarking and oblivious retrieval of the watermark data in a
color image proceeds as follows. The watermark signature data of size N bits and the
bit index k to modify in accordance with the data in a given image are first
determined. To minimize perceptual changes in the watermarked image, the image is
scanned to obtain a set of N pixels in a particular color that alternate in their bit values
at the kth index. For simplicity, these N pixels may occur in a sequence, for example.
The premise is that with alternating bit values in N consecutive pixels of the original
image, changes due to the watermark data are less likely to be perceived than if all N
bits are simultaneously flipped. (It is assumed that the watermark data are not also
generally alternating; otherwise, a total of N pixels will be modified in the worst case
causing perceptual change.) If N is small, clearly, it is possible to obtain a column or
row of pixels in a given image. For a large N, a group of pixels which alternate in bit
values at the kth index in every 2nd, 3rd, or mth pixel may be selected. (Additionally,
pixels with alternating bit values in every mth pixel may be needed if the index k is
large, again to avoid visible change.)
For determining the presence of the watermark and extracting it, the watermark data may be slid over the image as a stencil at the kth index of all pixels for the case of m = 1 to find a match with the watermark. This oblivious pattern matching can work
even if some rows and/or columns at edges have been removed without affecting the
location of the watermark. For m > 1, synchronization of the first embedded bit may
cause a problem if some rows or columns of the watermarked image are cropped;
otherwise, the sliding technique can still extract the watermark in an oblivious
manner. The key for embedding and extraction thus consists of the set {k, m, N} and
the watermark itself. To prevent the presence of watermark data in a random
sequence of pixels, the starting row and column indices of the watermark may be
added to the key set. This addition, of course, cannot help in extracting the
watermark if there is a loss of rows or columns.
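To make the embedding and the oblivious, template-sliding extraction concrete, the following minimal Python sketch is offered (an illustration, not the author's code; the NumPy array layout, the function names and the restriction to the m = 1 case are assumptions). It writes the watermark bits into bit index k of N consecutive pixels of one color channel and recovers them by sliding the known watermark as a stencil along a row.

```python
import numpy as np

def embed_at_bit_index(channel, row, col, bits, k):
    """Write the watermark bits into bit index k of consecutive pixels of a
    single 8-bit color channel, starting at (row, col). Returns a copy."""
    out = channel.copy()
    for i, b in enumerate(bits):
        p = int(out[row, col + i])
        out[row, col + i] = (p & ~(1 << k)) | (int(b) << k)  # clear bit k, then set it to b
    return out

def extract_at_bit_index(channel, row, col, n, k):
    """Read n bits back from bit index k, starting at (row, col)."""
    return [(int(channel[row, col + i]) >> k) & 1 for i in range(n)]

def slide_match(channel, row, template, k):
    """Oblivious detection for m = 1: slide the known watermark template along
    one row and report every starting column where all bits match."""
    n = len(template)
    return [c for c in range(channel.shape[1] - n + 1)
            if extract_at_bit_index(channel, row, c, n, k) == list(template)]

# Example (hypothetical data): hide the 8 bits of the ASCII character 'C'
# at bit index 5 of a red-channel array standing in for kid.jpg.
red = np.random.randint(0, 256, (200, 289), dtype=np.uint8)
bits = [(ord('C') >> (7 - i)) & 1 for i in range(8)]
marked = embed_at_bit_index(red, 40, 6, bits, k=5)
assert 6 in slide_match(marked, 40, bits, k=5)
```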

If the number of bits N in the watermark data is large, it may not be possible to find
N pixels that alternate in bit values at the kth index. In such a case it is essential to
determine an area of rows or columns, consecutive or otherwise, where the original
pixel intensity varies, so that a modification in any color is likely to result in
little or no discernible change.
The next section describes the implementation of the above watermarking
procedures and discusses the experimental results.

4 Experimental Results and Discussion


A JPEG-encoded color (RGB) image kid.jpg of size 289x200x3 with 8 bit pixels was
chosen for testing the two cases of watermarking. In the first case, the watermark
consisted of the 56 bits corresponding to the 8-bit ASCII characters for the word,
CALUMET. With only 8 bits (one ASCII character) to be embedded in a row, it was
possible to find many rows of pixels with alternating bit values, after excluding the
first and last 6 columns. (Skipping the edge columns and rows helps in extracting the
hidden data even if these columns and rows were cropped due to tampering, or
otherwise lost in transmission.) To avoid modifying adjacent rows and thereby
causing visible changes, a row for hiding a character was chosen if it was at least 20
rows away from the previous row selected for watermarking. With this restriction,
rows of modified pixels were located far from each other, and the number of pixels
modified at the selected bit index for a character ranged from 0 (if the alternating ASCII data for the character is
the same as the alternating pixel bit values) to 8 (if the data and pixel bit values
are complementary). Perceptibility of the hidden watermark, therefore, depended on
the original pixel values at the selected index and the ASCII character that replaced
them. Fig.1 shows the original image on the left and the watermarked image on the
right with bit index 5 carrying the 56-bit ASCII data in the red color.
As the bit index is increased for higher robustness of the hidden data, the criterion
of 8 adjacent pixels with alternating bits at the selected bit index (for embedding one
ASCII character) was not satisfied. This is particularly a limitation if the characters
are to be hidden in rows far removed from each other. Because of this problem,
watermarking at the 6th bit was possible with rows no farther apart than 5. Fig. 2
shows the watermarked image at bit index 6 in red. As with index 5, this figure also
depicts no visible change due to the hidden watermark.
Embedded watermark in the above cases was extracted in an oblivious manner
using the key consisting of the color and the bit index in which bit modification was
carried out, and the watermark itself. With the watermark characters used as a
template, the starting row and the number of rows skipped between embedded
characters are, generally, not required for extraction. As the template is slid along
each row (assuming each ASCII character is hidden in a row of 8 consecutive pixels),
however, a sequence of 8 bits at the modified bit index may have the same pattern as
the embedded character, in addition to the intentionally modified set of pixels. This
can cause erroneous detection of a character even if the intended pixels are tampered
with. To avoid this problem, the starting row and the minimum number of rows
skipped between embedded characters may be added to the key. The augmented key
works well for extraction if no row or column is removed or missing in transmission.

Fig. 1. Original Image (left), and the Image Watermarked in Red at Bit Index 5 with 56 Bits
(right)

Fig. 2. Original Image in Fig. 1 (left) Watermarked in Red at Bit Index 6 with 56 Bits

If an image is such that the bit values at the selected bit index in a sequence of N
pixels do not alternate, or if the transmission environment is likely to add noise to the
watermarked image, watermark locations may be selected a priori based on intensity
variation, and the row (or, column) location/s may be added to the key. As an
example, each of the 7 ASCII characters in CALUMET used above was appended
with 8 bits each of header (= [0 0 0 0 1 1 1 1]) and footer (= [1 1 1 1 0 0 0 0]) data for
a total of 7x24 = 168 bits in the watermark. For this N = 168, an arbitrary set of rows
and columns were used to study the effect of modifying pixel bits. Fig. 3 (left) shows
the result of hiding the watermark in the 4th bit position of 24 consecutive columns in
7 rows, both selected arbitrarily, in the original image shown in Fig. 1 (left). For the
row, column pairs (77, 29), (96, 49), (128, 71), (151, 91), (180, 111), (208, 129),

(219, 157), watermarking in the red does not appear to indicate any visible change; if
the red intensity alone is mapped, however, the effect of bit modification is evident in
certain rows (above the right eye and near the nose, for example) as seen on the right
in Fig. 3.
Fig. 3. Longer Watermark (168 Bits) Embedded in Red in Bit Index 4 (left); Grayscale Image
of the Watermarked Red Color (right)

Using the same pairs of rows and columns for hiding the 168-bit watermark in
green in bit index 4, on the other hand, causes less noticeable change in the image, as
evidenced in Fig. 4. This is due to the fact that the intensity variations in green are
possibly in the same sequence at the 4th bit level at the selected pixels as the
watermark data; hence, modifying the pixels in the green color does not cause any
change in intensity. Thus, a careful choice of the areas for watermarking is imperative
in the bit modification procedure, as with any other hiding technique, for
imperceptibility.
4.2 Robustness of Watermark to Noise
Because of the high bit index used for modification, it is reasonable to expect the
watermark to remain intact when noise at low levels is added to the watermarked
image. This was verified for salt and pepper noise of up to a density of 0.005, on
average, added to the entire image or to the watermarked color. As an example, the
watermarked image at bit index 4 in green (Fig. 4) is shown in Fig. 5 with salt and
pepper noise added at a density of 0.005. Although the image is noisy, no error
resulted from the extraction of the hidden data at the specified bit index in the rows
and columns used for pixel modification. In fact, noise at even higher levels in many
cases showed little effect on the modified bits at index 4. Correct retrieval of the
watermark was, similarly, observed for low levels of Gaussian and speckle noise
added to the watermarked image.
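The noise-robustness check described above can be reproduced with a short sketch (illustrative only: the noise-density definition follows the common convention of corrupting roughly a fraction `density` of the pixels, and all names are assumptions; the watermarked channel and embedding coordinates are hypothetical).

```python
import numpy as np

def add_salt_pepper(channel, density=0.005, rng=None):
    """Corrupt roughly `density` of the pixels: half to 255 (salt), half to 0 (pepper)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = channel.copy()
    hit = rng.random(channel.shape) < density
    salt = rng.random(channel.shape) < 0.5
    noisy[hit & salt] = 255
    noisy[hit & ~salt] = 0
    return noisy

def bit_error_count(clean, noisy, row, col, n, k):
    """Compare the n bits read at bit index k before and after adding noise."""
    read = lambda ch, i: (int(ch[row, col + i]) >> k) & 1
    return sum(read(clean, i) != read(noisy, i) for i in range(n))

# e.g., for a hypothetical watermarked channel `marked` embedded at (40, 6), bit index 4:
# noisy = add_salt_pepper(marked, density=0.005)
# errors = bit_error_count(marked, noisy, row=40, col=6, n=8, k=4)
```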

Fig. 4. Imperceptibility of Watermark Depends on Intensity Variations: Watermark (168 Bits)
Embedded in Green in Bit Index 4 (left); Grayscale Image of the Watermarked Green Color
(right)

Fig. 5. Watermarked Image (Fig. 4) with salt and pepper noise added at a density of 0.005

As a second example, the image shown in Fig. 6 was embedded with the same
watermark of 168 bits in the blue color. The watermarked image, with its embedded
blue color shown in Fig. 7, retained the hidden data imperceptibly. Also, oblivious
extraction with and without added noise resulted in correct retrieval of the watermark.
Noise immunity in retrieving the watermark was also similar to that for the
previous image at low levels of additive noise such as salt and pepper noise.

Fig. 6. Original Image, Brandyrose
Fig. 7. Original (left) and Watermarked (right) Image of Fig. 6 shown with the Watermarked
Color alone

5 Discussion
As the results in the previous section indicate, modifying pixel values at a selected bit
index is a viable technique for embedding watermarks on images. The selection of
rows (or, columns), as stated, is important in achieving an imperceptible embedding
of the watermark in any image. Although explicit evaluation of visually masked
regions and employing these regions for hiding information is necessary for covert
communication, the complexity of the evaluation, clearly, is not warranted for

watermarking an image for authentication and proof of ownership, for example.


While the choice of alternating bits in a row of pixels at the selected index may limit
the size of hidden information in a particular color, payload can be increased by
employing more than one color. Choosing a higher bit index, undoubtedly, results in
more robust embedding with noise. Detection of tampering, on the other hand,
depends on fragile watermarking so that a corrupted watermark can be used to
indicate alteration of security images, for example. In such applications, a low bit
index can be used for watermarking with minimal perceptual change, though it is less
impervious to noise. To further improve tamper detection, a small-payload watermark
may be embedded by modifying a low bit index over a large number of pixels that are
spread over the entire host image.
For further security, a K-bit key may be used along with the data to be hidden. The
N data bits and the K key bits may be combined in an exclusive OR operation, for
example [8], to obtain the bits for embedding. At the cost of increased key size, the
enhanced key can be used for secure hiding and oblivious retrieval of information.
Robustness of the information, still, depends on the bit index employed for
modification.
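A minimal sketch of the exclusive-OR combination mentioned above (illustrative only; repeating a short key cyclically over the N data bits is an assumption, and any keyed bit-scrambling could be substituted):

```python
def combine_with_key(data_bits, key_bits):
    """XOR each data bit with a key bit (key repeated cyclically). Applying the
    same operation again with the same key recovers the original data bits."""
    return [d ^ key_bits[i % len(key_bits)] for i, d in enumerate(data_bits)]

scrambled = combine_with_key([1, 0, 1, 1, 0, 0, 1, 0], key_bits=[1, 1, 0, 1])
original  = combine_with_key(scrambled,               key_bits=[1, 1, 0, 1])
```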

6 Conclusion
A simple method for watermarking images and video frames using pixel modification
at a selected bit index has been proposed. The technique results in indiscernible
hiding of a small payload of data that can be retrieved without the use of the original,
unembedded host image or video frame. By selecting a high bit index, the hidden data
or watermark can be rendered robust to low levels of additive noise.
The proposed bit modification can be extended to video frame watermarking by
selecting the color, locations and frames. Additionally, a watermark may be split into
several parts and hidden in a sequence of frames for detecting tampering and/or
missing frames. This is useful in the transmission of security videos over the Internet,
for example.

References
1. Anderson, R.J., Petitcolas, F.A.P.: On the limits of steganography. IEEE J. Selected Areas
in Communications 16(4), 474-481 (1998)
2. Bender, W., Gruhl, D., Morimoto, N., Lu, A.: Techniques for data hiding. IBM Systems
Journal 35(3&4), 313-336 (1996)
3. Wu, M., Liu, B.: Data hiding in image and video. I. Fundamental issues and solutions.
IEEE Transactions on Image Processing 12, 685-695 (2003)
4. Wu, M., Yu, H., Liu, B.: Data hiding in image and video. II. Fundamental issues and
solutions. IEEE Transactions on Image Processing 12, 696-705 (2003)
5. Swanson, M.D., Kobayashi, M., Tewfik, A.H.: Multimedia data-embedding and
watermarking technologies. Proc. IEEE 86, 1064-1087 (1998)
6. Cox, I.J., Kilian, J., Leighton, F.T., Shamoon, T.: Secure spread spectrum watermarking
for multimedia. IEEE Trans. Image Proc. 6, 1673-1687 (1997)

7. Marvel, L.M., Boncelet Jr., C.G., Retter, C.T.: Spread spectrum image steganography.
IEEE Trans. Image Proc. 8(8), 1075-1083 (1999)
8. Gopalan, K.: Audio Steganography Using Bit Modification. In: Proc. of the IEEE 2003
International Conference on Multimedia and Exposition, ICME 2003 (July 2003)
9. Cvejic, N., Seppanen, T.: Increasing robustness of LSB audio steganography using a novel
embedding method. In: Proc. of the International Conference on Information Technology:
Coding and Computing (ITCC 2004), vol. 2(5-7) (April 2004)
10. Gopalan, K., Shi, Q.: Audio Steganography using Bit Modification - A Tradeoff on
Perceptibility and Data Robustness for Large Payload Audio Embedding. In: Proc. of the
19th International Conference on Computer Communications and Networks (ICCCN
2010) Workshop on Multimedia Computing and Communications, Zurich, Switzerland
(August 2010)

Efficient Video Copy Detection Using Simple
and Effective Extraction of Color Features
R. Roopalakshmi and G. Ram Mohana Reddy
Information Technology Department,
National Institute of Technology Karnataka(NITK),
Surathkal, Mangalore, Karnataka -575025, India
{roopanagendran2002,profgrmreddy}@gmail.com
http://www.nitk.ac.in

Abstract. In the present multimedia era, the exponential growth of
illegal videos and huge piracy issues have increased the importance of Content Based video Copy Detection (CBCD) techniques. CBCD systems
require compact and computationally efficient descriptors for detecting
video copies. In this paper, we propose a simple and efficient video signature scheme using Dominant Color Descriptors of the MPEG-7 standard
in order to implement the proposed CBCD task. Experimental results
show that the proposed approach yields better detection rates when compared to those of existing approaches, against common transformations
like contrast change, noise addition, rotation, zooming, blurring, etc.
Further, evaluation results also prove that our scheme is computationally
efficient, supporting a substantial reduction in the total computational
cost up to the extent of 65% when compared to that of existing schemes.
Keywords: Content-Based Video Copy Detection, MPEG-7, Dominant
Color Descriptor.

1 Introduction

The massive media consumption in terms of media streaming has increased the
presence of enormous amount of video copies, which leads to huge piracy issues.
Controlling the copyright of the huge number of videos uploaded everyday is a
critical challenge for the owner of the popular video web servers. For example,
latest survey says that users upload 65,000 new videos each day on video sharing
websites like YouTube and also on an average, a viewer watches more than 70
videos online in a month [1] and the number is expected to keep growing.
In general, a video copy is defined as a transformed video sequence which is
visually less similar to, and does not contain any new and important information
compared to, the source video. There are two general approaches for detecting copies of digital media: digital watermarking and Content Based video
Copy Detection (CBCD). The primary idea of the CBCD technique is detecting
video copies using the media itself which contains enough unique information.
The purpose of any CBCD system is, when a query video is given, to find out



the original video from which the query is taken, even if the query is modified
by means of various transformations. CBCD techniques can be classified into two
major categories: Global descriptor and Local descriptor techniques. Global descriptors like Ordinal measure [2] and Color histograms [3] are compact and easy to
extract, but they are less robust against region-based attacks. Local descriptors
like SIFT [4], SURF [5], PCA-SIFT [6], etc., use local interest points for feature extraction. The main drawback of local descriptors is the generation of several
hundreds of features for a single video frame, resulting in high computational
cost.
Since color is one of the dominant and distinguishing visual features of an image, in this paper we employed a color descriptor of the MPEG-7 standard [7], called
the Dominant Color Descriptor (DCD), which extracts the representative colors
of an image. The Generalized Lloyd Algorithm (GLA) is the most extensively
used algorithm to extract the dominant colors of an image [8]. However, GLA
suffers from the following drawbacks: 1) it incurs expensive computational cost, 2)
it is time consuming, and 3) it depends strongly on initial specifications such as
distance, number of clusters, centroid, etc. In most CBCD systems, the major
challenging problem is the computational cost of feature extraction and matching,
because huge video databases need to be checked. So, the main focus of this
paper is to provide easily extractable and compact feature descriptors with low
computational cost. The main contributions of this paper are as follows:
1. We use a new DCD extraction technique, which is easy to extract and compact (on average 12 to 20 numbers), when compared with existing color
clustering techniques.
2. We present an adaptive video signature pruning method, by which the total
number of video signatures of a given video is reduced to a great extent
(up to 58%).
The rest of the paper is organized as follows: Section 2 introduces the framework
of the proposed scheme along with signature extraction and matching techniques;
Section 3 shows the experimental results of the proposed scheme, followed by the
conclusion in Section 4.

2 Proposed Scheme

Figure 1 describes the framework of the proposed scheme, in which key frames
are extracted from the master video using the sampling method. Then for each key
frame, the Frequency image [9], representing the distribution of same-feature pixels,
is calculated. Selecting the R, G and B colors as three features of an image, for each
pixel the frequency of the same color pixels is calculated. Then, the Dominant color
descriptors of each frame are calculated by making use of the frequency images. By
applying a simple pruning strategy to the extracted feature vectors, the final set of
representative dominant colors of an image is calculated and stored in the feature
database of video files. Whenever a user presents a query video, frequency image generation and DCD extraction are performed. Finally the feature vectors are
compared and the result of the copy detection task is reported.

Fig. 1. The Framework of Proposed Scheme

2.1 Fingerprint Extraction

Using a uniform sampling method with a rate of 5 frames per second, key frames
are obtained from the master video files. Since the DCD captures the dominant or representative colors in a given image, it is referred to as the dominant color descriptor.
The dominant color descriptor consists of the representative colors and their relative distribution in a given image or region. The dominant color descriptor (DCD)
replaces the whole image color information with a small number of representative colors. The dominant color descriptor of the MPEG-7 standard is defined as [10],
F = {{c_i, p_i, v_i}, s},  i = 1, 2, 3, ..., N,    (1)

where N is the total number of dominant colors for an image, c_i is a 3-D dominant
color vector, p_i is the percentage for each dominant color, such that the p_i values are
normalized to 1. The color variance v_i and spatial coherency s are optional parameters. The color variance v_i describes the variation of the color values of the
pixels in a cluster around the corresponding representative color and the spatial
coherency s represents the overall spatial homogeneity of the dominant colors in
the image. In order to extract DCD, we used frequency image of frames, in which
each pixel represents the frequency of the same color pixels.



Table 1. Comparison of Total No. of Extracted Feature Descriptors

S.No  Duration (in minutes)  Total No. of Feature Descriptors                 Reduction (in %)
                             Baseline Method   Pruning-Based Adaptive Method
1     1                      247               144                            58
2     3                      1150              352                            31
3     5                      1445              407                            29

In our scheme, we used the RGB color space. Using the frequency image of frames, the corresponding
DCD features are extracted. Consecutive images in a video sequence have very
similar color statistics [11]; hence, we developed a new video signature pruning
method, which reduces the total number of descriptors required to characterize
the given image. In order to validate our method, we performed two sets of experiments for signature extraction. In the first, baseline method, the DCDs extracted
from frequency images are considered as the signatures of the corresponding video files.
In the second, pruning-based adaptive method, we compare the DCD of
each frame with that of the previous frame, and if the similarity between the DCDs exceeds the threshold, then the latter DCD is considered as a new representative
color of the given video file. In our experiments we have considered 35 as the threshold
value. Table 1 shows the details of the extracted feature descriptors, using both the baseline and the pruning-based adaptive methods for 1, 3 and 5 minute videos respectively.
From the Table 1 data, it is observed that the proposed pruning-based adaptive extraction method reduces the total number of feature descriptors by 58%,
31% and 29% respectively. So, based on the above facts, we have adopted the
pruning-based adaptive extraction method in order to perform this CBCD task.
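A minimal Python sketch of this pruning idea follows (an illustration, not the authors' implementation: the quantization-based descriptor below is only a crude stand-in for the frequency-image-based DCD extraction, the dissimilarity function is a placeholder, and the threshold value 35 quoted in the text is assumed to apply to whatever dissimilarity measure is plugged in).

```python
import numpy as np

def simple_dcd(frame, n_colors=8, shift=5):
    """Crude stand-in for the dominant color descriptor of a uint8 RGB frame:
    quantize RGB by dropping low bits, then keep the n_colors most frequent
    colors together with their normalized percentages."""
    q = (frame >> shift).reshape(-1, 3)
    colors, counts = np.unique(q, axis=0, return_counts=True)
    top = np.argsort(counts)[::-1][:n_colors]
    return (colors[top].astype(np.float64) * (1 << shift),
            counts[top] / counts.sum())

def prune_signatures(frame_dcds, dissimilarity, threshold=35.0):
    """Adaptive pruning: keep a frame's DCD only when it differs enough from
    the previously kept signature."""
    kept = []
    for dcd in frame_dcds:
        if not kept or dissimilarity(kept[-1], dcd) > threshold:
            kept.append(dcd)
    return kept

# Hypothetical usage, with `frames` a sequence of key frames and
# `my_dcd_dissimilarity` any descriptor dissimilarity function:
# kept = prune_signatures((simple_dcd(f) for f in frames), my_dcd_dissimilarity)
```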
2.2 Fingerprint Matching

In the feature extraction of the DCD, we observe that each image can be effectively
represented using 3 to 5 dominant colors, but more than 1 dominant color is
necessary for an image. Since the number of representative colors is small, the
feature descriptors are indexed based upon their dominant color values. Fingerprint matching in our scheme involves searching the database for color
distributions similar to that of the input query, which includes searching for each of the
dominant colors separately. If F1 and F2 are two dominant color descriptors
such that
F1 = {{c_i, p_i}, i = 1, 2, ..., N1},    (2)
F2 = {{b_j, q_j}, j = 1, 2, ..., N2},    (3)

then the distance between F1 and F2 is given by [7],

D^2(F1, F2) = \sum_{i=1}^{N1} p_i^2 + \sum_{j=1}^{N2} q_j^2 - \sum_{i=1}^{N1} \sum_{j=1}^{N2} 2 a_{i,j} p_i q_j    (4)

where a_{i,j} is the similarity coefficient between colors c_i and b_j. The similarity
coefficient a_{i,j} is given by,

a_{i,j} = 1 - d_{i,j}/d_max,   if d_{i,j} <= Td
a_{i,j} = 0,                   if d_{i,j} > Td      (5)

where d_{i,j} is the Euclidean distance between the two colors c_i and b_j, and the threshold
Td is the maximum distance used to judge whether two color features are similar
or not. The distance d_max = alpha * Td, where alpha and Td are set as 1.2 and 25 in our
experiments.
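A direct transcription of Eqs. (4)-(5) into Python is sketched below (illustrative only; the descriptor is assumed to be a pair of arrays (colors, percentages), as produced by any DCD extractor, and the function name is an assumption).

```python
import numpy as np

def dcd_distance(F1, F2, Td=25.0, alpha=1.2):
    """Squared dissimilarity between two dominant color descriptors following
    Eqs. (4)-(5). F1 = (c, p) and F2 = (b, q), one row/entry per dominant color."""
    c, p = F1
    b, q = F2
    d_max = alpha * Td
    dist2 = float(np.sum(np.square(p)) + np.sum(np.square(q)))
    for i in range(len(p)):
        for j in range(len(q)):
            d_ij = float(np.linalg.norm(np.asarray(c[i], float) - np.asarray(b[j], float)))
            a_ij = 1.0 - d_ij / d_max if d_ij <= Td else 0.0   # Eq. (5)
            dist2 -= 2.0 * a_ij * p[i] * q[j]                  # cross term of Eq. (4)
    return dist2
```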

Fig. 2. Comparison of PR Curves for Different Transformations: (a) Blurring (b) Zooming-in
(c) Zooming-out (d) Contrast Change

3 Experimental Results

To evaluate the performance of our approach, we used a video database which
contains 101 video sequences collected from the Open Video Project [12]. The video
database contains approximately 305297 frames. The video content includes
news, documents, education, movies, natural scenes, landscapes, etc. The format of the original video data used is MPEG-1 with 352x240 pixels and 30
fps. We designed two sets of experiments to evaluate the detection accuracy

Fig. 3. Comparison of PR Curves for Different Transformations: (a) Rotation (b) Image
Ratio (c) Noise Addition (d) Resolution Change

and detection efficiency of our approach, respectively. From the video database,
we randomly selected 15 videos, ranging from 5 to 8 seconds. Different kinds
of transformations, namely 1) Blurring, 2) Zooming-in, 3) Zooming-out, 4) Contrast Change, 5) Rotation, 6) Random Noise Addition, 7) Image
Ratio and 8) Resolution Change, were applied to those 15 videos to generate 120
video copies. Then, the selected 15 videos were used as the query videos to search the
database. To evaluate the efficiency, the computational cost of single video
copy detection is discussed.
3.1 Experiment 1: Detection Accuracy

To measure the detection accuracy of our scheme, we used the standard Precision
and Recall metrics. We consider a detection result as correct if there is any
overlap with the region from which the query was extracted. The metrics of
Precision and Recall used for the accuracy evaluation are given by,
Precision = TP / (TP + FP),    (6)

Recall = TP / (TP + FN),    (7)


True Positives (TP) are positive examples correctly labeled as positives. False
Positives (FP) refer to negative examples incorrectly labeled as positives. False
Negatives (FN) refer to positive examples incorrectly labeled as negatives.
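As a small illustration of Eqs. (6)-(7) (the function and argument names are assumptions):

```python
def precision_recall(tp, fp, fn):
    """Precision and Recall from counts of true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```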
Figure 2 shows the comparison of the precision and recall values of our approach with those of algorithm (1), which denotes the approach of [13], with respect
to the blurring, zooming-in, zooming-out and contrast change transformations.
In algorithm (1), the authors have used the Ordinal measure for extracting the features of
frames. The experimental results show that our scheme produces better detection results compared to the reference method. From Figure 2, we can observe
that for recall values of 0.8 and above, our scheme gives good precision values,
almost equal to 1, whereas the precision values of the reference method vary
from 0.7 to 0.8. Figure 3 shows the results in terms of the precision and recall values
of the proposed and reference methods for various types of image transformations, including rotation, image ratio, noise addition and resolution change.
These results show that our scheme produces better precision values, such as 0.98, 0.97,
etc., when compared with those of the reference method.
3.2 Experiment 2: Detection Efficiency

In most CBCD systems, the major challenge is the total computation time
required to implement the copy detection task. In order to evaluate the efficiency of
our approach, we have compared the computational cost of our approach with
that of Kim's approach [14]. In [14], the authors have used the luminance of frames as
feature descriptors for their CBCD task. The experiments were conducted on a
standard PC with a 3.2 GHz CPU and 2 GB RAM. Table 2 gives the computational cost details of both the proposed and reference methods. The results from
Table 2 demonstrate that our scheme is more efficient when compared to Kim's
approach, reducing the total computational cost by up to 65%.
Table 2. Computational Cost Comparison of Kim's and Proposed Methods

Task                      Kim's Method (in secs)        Proposed Method (in secs)
                          1 Min    3 Min    5 Min       1 Min    3 Min    5 Min
Feature Extraction        16.000   51.000   97.000      13.986   34.849   52.561
Feature Comparison         6.500   18.700   27.800       0.648    1.140    2.689
Total Computation Time    22.500   69.700   124.800     14.634   35.989   55.250

4 Conclusion

In this paper, we presented a simple and efficient video signature method using Dominant
Color Descriptors of the MPEG-7 standard. Experimental results show that our approach provides good performance in terms of detection accuracy rates and also


reduces the computational cost, when compared with the existing approaches.
Further, our future work will be targeted at the following:
1. A multi-feature CBCD system, in which audio signatures are also incorporated
with the existing approach.
2. To increase the robustness of the existing system against various transforms like
cropping, camcording, encoding, gamma correction, etc.
Acknowledgments. We would like to thank the anonymous reviewers for their
valuable comments and suggestions.

References
1. Wu, X., Ngo, C.-W., Hauptmann, A.G., Tan, H.-K.: Real Time Near Duplicate
Elimination for Web Video Search with Content and Context. IEEE Transactions
on Multimedia 11(2) (2009)
2. Bhat, D., Nayar, S.: Ordinal Measures for Image Correspondence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 415-423 (1998)
3. Shen, H.T., Zhou, X., Huang, Z., Shao, J.: UQLIPS: A real-time near-duplicate
video clip detection system. In: VLDB (2007)
4. Lowe, D.G.: Distinctive Image Features from Scale-Invariant Key Points. Journal
of Computer Vision, 91-110 (2004)
5. Bay, H., Tuytelaars, T., Van Gool, L.: SURF: Speeded Up Robust Features. Computer Vision and Image Understanding, 346-359 (2008)
6. Ke, Y., Sukthankar, R.: PCA-SIFT: A More Distinctive Representation for Local
Image Descriptors. In: Computer Vision and Pattern Recognition (CVPR), pp.
506-513 (2004)
7. Manjunath, B.S., Salembier, P., Sikora, T.: Introduction to MPEG-7 - Multimedia
Content Description Interface. John Wiley and Sons, West Sussex (2002)
8. Lloyd, S.P.: Least Squares Quantization in PCM. IEEE Transactions on Information
Theory 28, 129-137 (1982)
9. Kashiwagi, T., Oe, S.: Introduction of Frequency Image and applications. In: SICE
Annual Conference 2007, Japan (2007)
10. Deng, Y., Manjunath, B.S., Kenney, C., Moore, M.S., Shin, H.: An efficient color
representation for image retrieval. IEEE Transactions on Image Processing 10,
140-147 (2001)
11. Roytman, E., Gotsman, C.: Dynamic Color Quantization of Video Sequences. IEEE
Transactions on Visualization and Computer Graphics 1(3) (1995)
12. Open Video Project, http://www.open-video.org
13. Cho, H.-J., Lee, Y.-S., Sohn, C.-B., Chung, K.-S., Oh, S.-J.: A Novel Video Copy
Detection Method Based on Statistical Analysis. In: International Conference on
Multimedia & Expo (2009)
14. Kim, J., Nam, J.: Content-based Video Copy Detection using Spatio-Temporal
Compact Feature. In: International Conference on Advanced Communication Technology, ICACT 2009 (2009)

Mobile Video Service Disruptions Control in Android
Using JADE

Tatiana Gualotuña1, Diego Marcillo1, Elsa Macías López2, and Álvaro Suárez-Sarmiento2

1 Grupo de Aplicaciones Móviles y Realidad Virtual, Departamento de Ciencias de la
Computación, Escuela Politécnica del Ejército, Ecuador
{tatiana.gualotunia,dmmarcillo}@espe.edu.ec
2 Grupo de Arquitectura y Concurrencia, Departamento de Ingeniería Telemática,
Universidad de Las Palmas de G.C., Spain
{emacias,asuarez}@dit.ulpgc.es

Abstract. The rapid evolution of wireless communications has contributed to
the success of the mobile Video Streaming service today. However, the
streaming technique is necessary for receiving video efficiently because
mobile devices still have limited resources. The unpredictable behavior of wireless
channels can produce service disruptions and rejection by the user. In a
previous work the last two authors worked on solving this problem using a
new architecture based on software agents and proxies. This paper extends that
previous work by applying it to the new mobile phones based on Android. We
have developed new capabilities: the use of free servers (VLC and YouTube) and the use of
standard multimedia compression formats. The comparison of our results with the
previous ones for the other platforms shows that our mechanism is very
efficient and provides a high quality service for the user.
Keywords: Video Streaming, Android, Mobile Telephone, JADE, Service
Disruptions.

1 Introduction
The globalization of the Internet has led to increased distribution of multimedia files,
generating digital mechanisms that can communicate high quality information in
compressed form and deploy them in real-time to the mobile user. Advances in video
compression and communication allow multimedia information to be displayed on
mobile devices. But due to limited resources, especially memory, it is necessary to
use particular techniques such as Video Streaming; this technique requires that real-time video
or large stored videos be divided into synchronized parts. These parts are communicated
independently but visualized synchronously on the mobile phone. The most important
feature of Video Streaming is that, concurrently, there are parts of the video leaving the
Server, travelling in the Network and being visualized on the phone.
Mobile and wireless communication has grown dramatically in recent years.
Worldwide there are many communication facilities based on standards such as:
Wireless Fidelity (Wi-Fi) [1], Worldwide Interoperability for Microwave Access
(WiMAX) [2], allowing communication at a very small economic cost to the user.
However, data transmission in wireless channels suffers from many errors and
frequent packet loss, and radio coverage is not always high, which can produce
frequent radio disconnections of mobile phones. These disconnections are totally
unpredictable and their adverse effects on communication can only be mitigated.
The most widely used protocol for Video Streaming is Real Time Streaming
Protocol (RTSP), described in Request For Comments (RFC) 2326. This protocol, or
its recent variants based on Hypertext Transfer Protocol (HTTP) [3] [4] or Real Time
Messaging Protocol (RTMP) [5], run over Real Time Protocol (RTP) and Real Time
Control Protocol (RTCP) for sending video frames and control of arrival to the
mobile phone. In special cases, such as the Nokia N95, an additional
protocol called Real Time Data Transfer (RDT) [6] can be used, which carries out additional
control over the state of the wireless network that connects the mobile phone (3G
networks only). The RTSP typically uses Transmission Control Protocol (TCP) for
signaling. This represents a new problem associated with wireless channel
disconnections, because neither RTP nor RTCP are appropriate protocols to
control (without additional middleware) mobile phone disconnections. In addition,
because it is necessary to carry out an RTSP session to download the video, when a
long-term disruption occurs (approximately 1 minute) the session is lost and must be
restarted, negotiating all the connection parameters again. This implies receiving the video
from the beginning, which causes enormous annoyance in the mobile phone user, who leaves the
RTSP session. This is not a minor problem; on the contrary, it is a significant issue for
several reasons: the operator or manager of multimedia content can lose money due to
user abandonment, the wireless channel can be congested with video frames that no one will
see (being out of radio coverage), and this can degrade the other services that are
using the wireless network at that time. Therefore, we think it is important to continue
working on this problem, especially for new platforms for smart mobile telephones.
Recent market research [7] [8] demonstrates that mobile phones with Android are
currently the most used for multimedia applications. Moreover, all these phones have
wireless communication interfaces, enabling them to receive Video Streaming at low
cost.
In this paper we present the porting of our mechanism, which controlled the
video communication and visualization on portable computers [9] [10] and mobile
devices like the Nokia N95 [11], to the new mobile phones with Android. The basic idea
was to prevent the user from having to re-initiate an RTSP session and to allow the user to receive
lost video frames (temporarily stored in a buffer until the mobile device comes back into the
coverage area). These video frames are visualized while the video server is sending
other video frames. As in previous cases we have used a software agent architecture
[12] [13] providing artificial intelligence methods. Our software mechanism uses the
Java Agent DEvelopment Framework (JADE) platform, in particular the version
JADE powered by the Lightweight Extensible Agent Platform (JADE-LEAP) [14] [15],
which is an open source multi-agent system that meets the standards of the Foundation for
Intelligent Physical Agents (FIPA) [16]. The important innovations in this paper are
as follows: a) we have used free open source video platforms such as VideoLAN (VLC)
[17] and the YouTube video service, which provide a high level of practical


application to our mechanism. b) We have used standard compression formats for


media information used worldwide.
The structure of this paper is as follows: Section 2 presents the benefits of using software
agents and the applicability of the JADE platform for programming agents. We
present the Video Streaming automatic resumption mechanism based on JADE for
mobile phones. In section 3 we discuss the effectiveness of the mechanism to perform
experimental tests in real scenarios and to evaluate the results in the Android
platform. Finally, we present conclusions and future work.

2 Basic Ideas of Our Agent Based Control Mechanism


Agent Oriented Programming (AOP) has emerged as an alternative for the design of
complex systems. The objectives and requirements of the problem to be solved are
implemented by a set of autonomous entities, capable of exhibiting intelligent
behavior. This naturally provides the decoupling and cohesion desirable in a system,
facilitating software construction and maintenance [18]. Software agents represent
autonomous computational units capable of learning from their environment and act
upon it to fulfill the purpose for which they were designed. They monitor the progress
of selected strategies and communicate by exchanging messages.
The AOP allows the development of multi-agent systems that are structured by
multiple software components that interact and cooperate to solve problems that are
beyond the scope of each component. We have shown AOP can be used to control the
quality of Video Streaming delivery to mobile devices.
There are several platforms for the development of software agents, free and
proprietary. For this study we selected JADE. It is a free software platform with
graphical tools for monitoring, documentation, support and is implemented in a
widely accepted multiplatform language, Java. JADE [14] is a middleware that
enables the deployment of distributed multi-agent systems and is consistent with the
FIPA specifications [16] (global institution that sets standards to ensure
interoperability between heterogeneous agents). JADE consists of one or more
containers that can be located in different hosts. JADE's agents can connect to
platforms implemented with other technologies and reside in distributed containers. The
set of all containers is called a platform and provides a homogeneous layer that covers the
diversity of the underlying layers [19]. In each platform there must be a principal
container, which is the first to run and with which the remaining containers register
when they start running. The Agent Management System (AMS) and Directory Facilitator
(DF) can only be placed in the principal container. JADE LEAP [15] is a modified
version of the JADE platform that runs on personal computers, servers, and especially on
mobile devices such as cell phones and PDAs. It allows the development and
implementation of JADE agents on mobile devices connected through wireless
networks taking into account that these devices have limitations on some resources
such as connectivity, memory and processing.
JADE-ANDROID [20] is a JADE complement that provides support for the use of
JADE-LEAP in the Android operating system. It is released as an add-on of JADE
and is available for download under the terms of the LGPL license. It allows the
implementation of agent-oriented applications based on the peer-to-peer paradigm.


2.1 The Service Disruption Problem and Our Solution


Video transmission uses the Streaming technique, which enables partial download of the
video in small fragments at the Client, improving usability by sending a
sequence of video frames with low delays to start the display and lower storage
requirements. During video communication, there can be breaks or service
disruptions, especially in wireless network communication. It is important to improve
the user experience; thus a mechanism must be provided in order to allow the user to
resume an RTSP session in case a service disruption is produced. This mechanism
must identify the point at which to resume the session transparently.
We consider the following hypotheses:
1. There is a video server (real time or on demand) that uses the Video Streaming
technique to communicate a large amount of video data to a mobile phone
whose user is moving.
2. Eventually, the mobile phone can be in an area where there is no radio
coverage, or the transport layer protocols or Internet application do not
respond during an extended time interval (from 10 seconds to 1 minute).
3. The wireless channel may be congested, causing intermittent disconnections
of low duration (less than 10 seconds).
4. As a result, the Video Streaming service experiences unpredictable disruptions.
Moreover, it is useless to have a statistical method to predict disruptions,
because in practice it is impossible to have a probability of disruption that is
reliable.
5. The control parameters in mobile devices (bandwidth, delay, error rate and
display quality) change due to client mobility and the unpredictable behavior of
wireless communications.

With these assumptions we are facing a mathematical optimization problem whose
input variables are the above control parameters (actually there are more parameters at
different levels of the network architecture that we do not consider in this paper). In theory,
with all these variables it is possible to construct a function whose result is a logical value:
data should be stored in a buffer when the mobile phone is out of coverage (for use
when it returns to coverage), or data must not be stored in that buffer. Let us note that,
in general, it must be decided when there is a high level of congestion, when many
packets are lost, etc. The randomness of the variables and the inability to effectively
predict their values make the problem very difficult to solve exactly.
We believe that the use of heuristics can alleviate the adverse effects of disruptions,
but it is more efficient to use an algorithmic solution based on cooperative software
agents, as outlined in general for optimization problems in [21]. In this sense, systems
like JADE can be used as an alternative to control the video service disruptions,
thereby increasing the usability of the video streaming system.
like JADE can be used as an alternative to control the video service disruptions,
thereby increasing the usability of the video streaming system.
JADE-LEAP uses the Multiagent System Management (MAS) service that allows
the design of the messaging and Remote Method Invocation (RMI). JADE supports
the management of intermittent disconnections through its Message Transport
Protocol (MTP) that performs the physical transport of messages between two agents
residing in distributed containers, providing failover services. When a communication
break occurs, the MTP performs constant monitoring and, when the reconnection


takes place, it warns the agents to restart the dialogue from the point where the
disconnection was produced [22].
Our mechanism for mobile phones identifies the following entities (Fig. 1): an
RTSP Video Streaming server, an Agent Proxy Server (APS), an Agent Proxy Client
(APC) and the Client device which displays the video frames.

Fig. 1. Proposed mechanism of Video Streaming on Android phones

The APS has two functions: the first is to receive messages from the APC and send them
to the server on the ports assigned in the negotiation, and the second is to
receive messages from the server and send them to the APC. The APC allows the
mobile device to receive and send RTSP, RTP and RTCP messages safely. This
APC implements storage and communication mechanisms, as well as filtering of the
RTSP negotiation so as to be as transparent as possible. It resides on the mobile device,
which ensures that the agent is never disconnected from the client. By placing agents in the
APS and APC, they cooperate to determine when the phone is out of coverage using MTP
signaling, taking care to resolve intermittent disconnections and automatically resume the
Video Streaming session. The FIPA Agent Communication Language
(ACL) messages that cannot be delivered to the mobile phone are stored in the buffer of the
APS, and they are sent once the reconnection is achieved.
One of the benefits of JADE-ANDROID is that it tries to reconnect the APS and
APC for some time. MTP defines the waiting time to reconnect the client (the default
value is one minute); this waiting time defines the maximum size of the buffer of the
APS. When the connection between the agents is restored, the APC reads the
sorted frames in the buffer and then sends them to the video player in the mobile
phone (Client). Thus, the Client retrieves the video from the break point.
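The store-and-forward role of the APS buffer can be illustrated with the following sketch (Python is used here purely for brevity; the actual mechanism is implemented with JADE-LEAP agents exchanging FIPA ACL messages, and the class, method and parameter names below are assumptions, not the authors' code).

```python
from collections import deque
import time

class ProxyServerBuffer:
    """Illustrative APS-side logic: forward packets while the client is
    reachable, buffer them during a disruption, and flush the ordered backlog
    when the client reconnects (up to a maximum waiting time)."""
    def __init__(self, max_wait_s=60.0):
        self.buffer = deque()
        self.max_wait_s = max_wait_s        # analogous to the MTP waiting time
        self.disconnected_at = None

    def on_packet(self, packet, client_reachable, send):
        if client_reachable:
            self.flush(send)                # deliver the backlog first, in order
            send(packet)
        else:
            if self.disconnected_at is None:
                self.disconnected_at = time.monotonic()
            if time.monotonic() - self.disconnected_at <= self.max_wait_s:
                self.buffer.append(packet)  # keep for later resumption
            # beyond max_wait_s the session is abandoned to avoid wasting resources

    def flush(self, send):
        while self.buffer:
            send(self.buffer.popleft())     # ordered delivery from the break point
        self.disconnected_at = None
```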


3 Experimental Results
In our previous work we used a video server built by us in Java that used
the Motion Joint Photographic Experts Group (MJPEG) [23] video format, which
basically compresses each video frame as a JPEG image. The MJPEG format was the
worst choice because it does not take advantage of temporal image compression.
That means the Server sends approximately the same amount of traffic in each
RTP packet, which causes a high waste of bandwidth; but it simplifies the
observation of service disruptions, because it is simple to observe the absence of
traffic in the Wi-Fi channel during the period of time d. In this paper we have used widely
deployed, freely distributed servers and more powerful video encoding formats. On the
one hand, this complicates the interaction between the proxies and the video
Server. On the other hand, we have improved the detection of the video packets that, when
disruptions occur, must subsequently be retrieved from the buffer. The reason
for doing this is to provide more practical realism to our mechanism for mitigating the
adverse effects of video disruptions. The improvements include:
1. A video server used for testing: VLC (one of the free-distribution servers
currently used worldwide). The other server we have used is YouTube, which
represents the most widely used multimedia service worldwide today. In both
cases we have tried to deliver video encoded in a format compatible with
Android.
2. The configuration of the video format used is as follows: for video, the
H.264/AVC codec, bit rate 9000 kbps, 25 fps, width 176, height 144; and for
audio, Advanced Audio Coding (AAC), bit rate 128 kbps, 1 channel and
a sampling rate of 44100 Hz.

We have used a wireless network with a Wi-Fi access point at 54 Mbps, a laptop with a
2.20 GHz AMD processor, 4 GB of RAM and an Atheros AR5009 IEEE
802.11 a/g/n wireless adapter, and a laptop with a Centrino Duo 1.66 GHz processor and 1 GB

Fig. 2. FIPA message Exchange on the JADE-ANDROID platform


of RAM and an Intel Pro/Wireless 3945ABG wireless card. The mobile
device is a Google Nexus One running Android, with a Qualcomm QSD 8250 processor
at 1 GHz and 512 MB of RAM.
We have carried out several tests considering the following cycle: in-out-in coverage,
in order to test the effectiveness of JADE-ANDROID for retrieving video frame
packets after a service disruption. We show the sequence of FIPA ACL messages
exchanged between the APS and APC in Fig. 2. As can be seen, the recovery of
the RTSP session is done correctly.
We present some results considering the communication of video and
audio separately, because we observed that there are problems when the packet size increases
(video) but no problem when the packet size is small (audio). In Fig. 3
and Fig. 4 we show the jitter and delay for audio and video separately, comparing the
cases with and without JADE-ANDROID.

Fig. 3. Variation in arrival of audio and video packets

Fig. 4. Delay of audio and video packets

Fig. 5 shows that practically all the packets sent by the Server (audio and
video) are received by the Client when JADE-ANDROID is used, which is not the case
without JADE-ANDROID. Moreover, with JADE-ANDROID the quality of visualization is
very high.


Fig. 5. Disconnection management

When the phone was out of coverage for about 30 seconds, the audio and video packets were
successfully recovered (no packet loss). A delay is produced when reconnecting,
because the packets stored in the buffer must be released and their delay recalculated. When the phone was out of coverage for 30 to 45 s, the audio and
video packets still arrived at the Client. The audio packets were presented with
100% quality, but the video frames were delayed and sometimes were not
visualized, because the buffer size was very small and the frame timing could not be recalculated.
This was due to the timestamp, so the PvPlayer decided to remove the packets with
long delay, causing the application not to respond.
We found limitations in JADE-ANDROID: it is limited to only one agent per
application, but in order to obtain a high-quality application we need three agents per
application. That is, one agent in charge of managing the RTSP and RTCP messages (a
shared channel can be used for these protocol messages), one agent that manages the
audio communication using a dedicated channel, and another agent for managing the
video communication.

4 Conclusions and Future Work


The implementation of the Video Streaming technique on mobile phones is not a
mature research area in which efficient practical solutions exist. This is because
users that experience service disruptions must start a new RTSP session. There are
some interesting proposals at a commercial level that timidly begin to propose
changes to HTTP and RTMP to support this type of terminal, but there is still no
efficient solution. Furthermore, the implementation of mobile Video Streaming
systems on a specific mobile architecture is often a proprietary solution: it is not multi-platform. We propose a multi-platform solution to mitigate service disruptions. In
previous works we applied our solution to Symbian and Windows Mobile, and this
paper extends it to the Android platform using JADE-ANDROID.
Moreover, in this paper we present an extension of our mechanism that applies to a
free open source video server (VLC) and to YouTube, using compression formats
more powerful than those used so far.
Our mechanism is necessary because protocols such as RTCP can send delay
statistics of packets between the transmitter and receiver, and the amount of lost
packets and other data used to make corrections in the transmission, but even so, this


protocol does not efficiently handle service disruptions and cannot be used to
implement corrective actions. As with previous platforms, for Android our
mechanism was tested in practice and experimental results show that it does not
cancel the RTSP session, and that the mobile phone user does not miss any of the
video frames that would be lost without our mechanism. It is necessary to clarify that
a maximum time for reconnection must be specified in order to avoid wasting
resources on the Server.
An important issue for future work is predictive mechanisms to manage mobility and
apply delivery prediction techniques. In this way the mechanism would anticipate a
possible interruption of service with a considerable degree of reliability (remember
that it is impossible to predict service disruptions in wireless networks exactly).
Another interesting issue is the generation of intelligent mechanisms to carry out the
analysis and selection of the video frames to be stored, based on user profiles, applying
artificial intelligence to create agents for other mobile devices that seek adaptability to
the actual conditions of the wireless channel. This would make the best use of the memory in the
APS, because it would store only those video frames that are strictly necessary and that the
user profile has indicated. A third interesting issue is the development of a multi-agent
system that allows dynamic creation of agents on the server for each client, which
would allow applying the Video Streaming technique to high-quality videos. This would
improve the performance of multimedia information communication when
multiple mobile devices connect to the server at the same time, because each JADE agent
works point to point.

References
1. Hernández, K., Pelayo, J., Aguirre, L.: Broadband Transmission to Rural Areas. In: Eighth
LACCEI 2010, pp. 25 (2010)
2. Gabriel, C.: WiMAX; The Critical Wireless Standard (March 2011), Download available
http://eyeforwireless.com/wimax_report.pdf
3. Deshpande, S.: Adaptive timeline aware client controlled HTTP streaming. In: Proc. of
SPIE, pp. 25 (2009)
4. Begen, C., Akgul, T., Baugher, M.: Watching video over the Web, part I: streaming
protocols. IEEE Internet Comput. (2011)
5. Real-Time Messaging Protocol (RTMP) Specification. Adobe Systems Inc. (March 2011),
Download available http://adobe.com/devnet/rtmp.html
6. Nokia N95, Nokia Inc. (March 2011), Download available
http://nokia.es/link?cid=PLAIN_TEXT_815211
7. Market Research (March 2011), Download available
http://altersem.com/blog/wpcontent/uploads/2010/09/EstudioDeMercadoMobileInternet.pdf
8. Trends in Mobile Operating Systems (March 2011), Download available
http://noticiasdot.com/wp2/2010/12/14/android-sera-elsistema-operativo-mas-popular-en-el-verano-del-2012/
9. Suarez, A., Macias, E.: Automatic Resumption of Streaming Sessions over Wi-Fi Using
JADE. IAENG International Journal of Computer Science, IJCS 33(1), 16

490

T. Gualotua et al.

10. Suarez, A., Macias, E., Martin, J.: Light Protocol and Buffer Management for
Automatically Recovering Streaming Sessions in Wi-Fi Mobile Telephones. In:
Proceedings of the IEEE Second International Conference on Mobile Ubiquitous
Computing, Systems, Services and Technologies, UBICOMM 2008, pp. 8076 (2008)
11. Suarez, A., Macias, E., Espino, F.J.: Automatic Resumption of RTSP Sessions in Mobile
Phones using JADE-LEAP. IEEE Latin America Transactions 7(3), 38 (2009)
12. Gao, L., Zhang, Z., Towsley, D.: Proxy-Assisted Techniques for Delivering Continuous
Multimedia Streams. IEEE/ACM Transactions on Networking 11(6), 884-894 (2003)
13. Bellavista, P., Corradi, A., Giannelli, C.: Mobile Proxies for Proactive Buffering in
Wireless Internet Multimedia Streaming. In: Proceedings of the IEEE International
Conference on Distributed Computing Systems Workshop (ICDCSW 2005), pp. 297-304
(2005)
14. Bellifemine, F., Caire, G., Poggi, A., Rimassa, G.: JADE, A White Paper. Journal of
Telecom Italia Lab 3(3), 6-19 (2003)
15. Caire, G., Piere, F.: LEAP User Guide (March 2011), Download available
http://jade.tilab.com/doc/tutorials/LEAPUserGuide.pdf
16. FIPA, The Foundation for Intelligent Physical Agents (March 2011), Download available
http://fipa.org/.
17. VideoLAN projects media player, free software under GPL licensed (March 2011),
Download available http://videolan.org/vlc/
18. Vallejo, D.: A multi-agent system for optimizing the rendering. Department of Computer
Science, pp. 823. University Castilla La Mancha (2006)
19. Caire, G.: JADE Tutorial. JADE Programming for Beginners (2011), Download available
http://jade.tilab.com/doc/tutorials/JADEProgrammingTutorial-for-beginners.pdf
20. Gotta, D., Trucco, T., Ughetti, M.: Jade Android Add-On Guide (March 2011), Download
available
http://jade.tilab.com/doc/tutorials/JADE_ANDROID_Guide.pdf
21. Shoham, Y., Leyton-Brown, K.: Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, pp. 330-381. Cambridge University Press,
Cambridge (2009)
22. Suarez, A., Macias, E., Espino, F.J.: Automatic Resumption of RTSP Sessions in Mobile
Phones using JADE-LEAP. IEEE/ACM Transactions on Networking 11(6), 884-894
(2003)
23. Muralles, W.: Analysis, Evaluation and Recommendation of Digital Video Formats
(March 2011), Download available
http://biblioteca.usac.edu.gt/tesis/08/08_7716.pdf

Performance Analysis of Video Protocols over IP
Transition Mechanisms
Hira Sathu and Mohib A. Shah
Unitec Institute of Technology, Auckland, New Zealand
hsathu@unitec.ac.nz, shahm09@wairaka.com

Abstract. In this paper, the performance of video protocols was tested on three
well-known IP transition mechanisms: IPv6to4, IPv6in4 and Dual-Stack.
The protocols involved in this experiment were MPEG-1, MPEG-2, and MP-4.
In this experiment two tunnelling mechanisms and a Dual-Stack mechanism were
configured and the impact of these mechanisms on video packets was observed.
The parameters measured using the above test-bed were throughput, impacted
throughput (due to IP transition mechanisms) and CPU utilization. The results
indicate that as the video packet size is increased, the impact of the IP transition
mechanism becomes significant. Observations for the Dual-Stack mechanism show
that it performed much better than the other two tunnelling mechanisms (IPv6to4 &
IPv6in4). The IPv6to4 tunnelling mechanism had less impact on video packets,
while IPv6in4 had the highest impact of the three mechanisms tested. Comparison
between the video protocols illustrates that MPEG-2 was highly impacted by the
tunnelling mechanisms, with almost the same amount of bandwidth wasted,
while MP4 was least impacted by the tunnelling mechanisms. More details of the results
are covered in this paper, including CPU utilization and impacted throughput.
Keywords: Video, performance analysis, protocols, IPv6to4, IPv6in4 & Dual-Stack mechanism, and Linux Ubuntu 10.10.

1 Introduction
A recent study [1] indicates that Video over IP is one of the most important and fastest-growing technologies in the digital world. A large number of users prefer to have Video over IP available on any of the computing devices they use, from any location. Thus, widespread use of Video over IP would require each device to have an IP address in order to communicate over the Internet. Video over IP also faces several other challenges, such as the size of video packets for smaller devices and the quality of video over NGN Internet infrastructures using various protocols. Video over IP is mostly used over the IPv4 infrastructure (Internet); however, forward-looking studies state that Video over IP will face greater challenges when it is used over IPv6 networks and has to integrate with both IPv4 and IPv6 networks. In this experimental research we have set up a network test-bed environment to investigate and clarify how video quality is impacted by IP transition mechanisms [2].
A. Abraham et al. (Eds.): ACC 2011, Part IV, CCIS 193, pp. 491–500, 2011.
© Springer-Verlag Berlin Heidelberg 2011

Moving Picture Experts Group (MPEG) is a working group of experts formed by ISO and IEC with a view to set standards for multimedia (MM) video and audio
compression and transmission. MPEG has membership of various universities,
industries, and research institutions. MPEG standardises the syntax and the protocol
which is used to multiplex/ combine video and audio. The standard defines the way
many such multimedia streams can be compressed and transported simultaneously
within the MPEG standards.
The most commonly used standards for transporting video are MPEG-1, MPEG-2
and MPEG-4. Their evolution was also in that order. MPEG-3 meant for High
Definition TV compression became redundant with its features merged with MPEG-2.
MPEG-1 was the first MM compression technique with speeds at about 1.5 Mbps
(ISO/IEC 11172). Considering the lower bit rate of 1.5Mbps for MM services, this
standard has lower sampling rate for the images as well as uses lower picture rates of
24-30 Hz. This resulted in lower picture quality. The popular digital audio encoding,
MP3 audio compression format is a part of MPEG-1 standard which was later
extended to cover MPEG-2 standard as well.
MPEG-2 standard is an improvement over the MPEG-1 standard and is capable of
broadcast quality TV transportation. The typical transmission rates for this standard
are higher than MPEG-1. MPEG-4 standard uses enhanced compression features over
MPEG-2 helping transport of computer graphic application level MM. In some
profiles the MPEG-4 decoder is capable of describing even three dimensional shapes
and surfaces for files with .MP4 file extension, that were also covered in this study.
The structure of this paper is as follows: next Section covers the background.
Section 3 mentions related works and contribution of this paper. Section 4 includes
the network test-bed setup of this research and Section 5 covers the traffic generating
& monitoring tools specification and Section 6 outlines experiment design. Section 7
presents the results in graphical form and describes the experiment. Finally section 8
covers the discussions and conclusions followed by the references section.

2 Background
The researchers considered the issues relating to the growth of IPv6, which provides a number of advantages on adoption, and its co-existence with the currently popular IPv4. However, there still remains the issue of IPv6 not being able to communicate directly with IPv4 networks. To resolve this problem, different IP transition mechanisms have been designed, such as the Dual-Stack, IPv6-to-4 and IPv6-in-4 mechanisms.
IPv6-to-4 and IPv6-in-4 are two major tunnelling mechanisms which were mainly designed for IPv6 users. They allow IPv6 based network users to communicate with other IPv6 based networks through the IPv4 cloud (Internet). These tunnelling mechanisms were structured to carry IPv6 based packets through the IPv4 cloud by encapsulating the IPv6 packets with an IPv4 header and sending them via the IPv4 cloud. The packets are then de-capsulated at the other end and delivered to their destination.
IPv6-to-4 is considered an automatic tunnel and requires prefixed IP addresses. It does not work with private IPv4 addresses and cannot use multicast addresses or the loop-back address as the embedded IPv4 address [3]. The IPv6-in-4 tunnel is considered a configured tunnel, which is manually configured between hosts; no tunnel brokers were used for the set-up. It does not require any prefixed IP addresses, unlike the 6to4 tunnel, and can operate with any IP address range. Each tunnel has a separate virtual interface and is configured differently. In Linux based operating systems the IPv6-to-4 tunnel is configured on an interface called tun6to4, while the IPv6-in-4 tunnel is established on an interface called IPv6-in-4. Dual-Stack is based on both versions of the IP protocol stack (IPv4 & IPv6) working simultaneously. It enables IPv4 based nodes to communicate with other IPv4 based nodes, and IPv6 based nodes to communicate with only IPv6 based nodes. However, IPv4 based nodes cannot communicate directly with IPv6 based nodes.
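To make the "prefixed IP addresses" requirement of the 6to4 mechanism concrete, the short sketch below derives the 2002::/16-based site prefix from a router's public IPv4 address. This is a minimal illustration of the standard 6to4 address mapping, not code from the experiment; the IPv4 address shown is a documentation example.

```python
# Minimal sketch: derive the 6to4 site prefix (2002::/16 + embedded IPv4 address)
# that a tun6to4 interface would use. The address below is a documentation
# example, not one used in the paper's test-bed.
import ipaddress

def six_to_four_prefix(public_ipv4: str) -> ipaddress.IPv6Network:
    """Embed the 32-bit IPv4 address after the 2002::/16 prefix, giving a /48."""
    v4 = ipaddress.IPv4Address(public_ipv4)
    prefix_int = (0x2002 << 112) | (int(v4) << 80)   # 16 prefix bits + 32 IPv4 bits
    return ipaddress.IPv6Network((prefix_int, 48))

print(six_to_four_prefix("192.0.2.1"))   # -> 2002:c000:201::/48
```

This also shows why 6to4 cannot use private or multicast IPv4 addresses: the embedded address must be globally routable for the derived prefix to be reachable.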
These IP transition mechanisms have provided a solution by enabling IPv6 based
networks to communicate with other IPv6 based networks via different
infrastructures. However, these IP transition mechanisms result in additional delay in
the transmission process because of encapsulation and de-capsulation that is essential.
This may reduce the quality of video communication or cause more bandwidth
wastage. Therefore the authors have carried out tests to identify and clarify the impact
of the two tunnelling and a Dual-Stack mechanism on video protocols. Three well
known IP transition mechanisms were selected, namely IPv6to4, IPv6in4 and Dual-Stack. The video protocols tested were MPEG-1, MPEG-2 and MP-4, and the Linux
Ubuntu 10.10 operating system was used. The main focus of this experimental
research is to capture and evaluate the impact caused by these IP transition
mechanisms on video packets and compare their results.

3 Related Works
This section discusses earlier studies undertaken in this and related areas. In [4] the
researchers have used a technique called Adaptive Significance Determination
Mechanism in Temporal and Spatial domains (ASDM-TS) for H.264 videos over IP
dual-stack network with the DiffServ model. The packet loss scenario was the main focus for various video protocols, as each video protocol has different error transmission characteristics. Using a fixed model for video traffic, which prioritises video packets in different sequences, is not successful and degrades the quality of video due to the loss of important video packets. However, with the new technique (ASDM-TS), simulated results show improved performance of video traffic over the IP dual-stack mechanism.
In another study [5], the authors have carried out an experiment on Dual-Stack
mechanism using four different types of traffic (Video, Internet, FTP & VoIP). NS-2
(Network Simulator 2) tool was used to identify the performance of multiple traffic
and parameters covered were Bandwidth, packet loss and delay. MPEG-4 protocol
was selected to transmit video traffic over Dual-Stack mechanism using different
packet sizes, and the results were compared. The final discussion covering the overall results mentioned that IPv6 is better than IPv4 for all four types of traffic tested. Moreover, IPv6 allows more bandwidth and adds less delay for large packet sizes, while IPv4 does not provide high bandwidth and is limited in regard to large-packet-size traffic [5].

In [6] video communication between two countries was established in order to evaluate the behavior of video transmission over an IPv6 infrastructure, compared
against IPv4 infrastructure. HDTV with/without compression was transmitted over
both networks (IPv6 & IPv4) using one-way and two-way transmission. The outcome
stated that 0.1% packet loss was observed using a one-way communication on IPv6
infrastructure, while two-way transmission resulted in 44% packet loss. IPv4, using both one-way and two-way video communication, did not produce any unusual results. However, it was concluded that the 44% packet loss over IPv6 was due to the devices that come into play over the complete path between the two countries, with some of these devices not being efficient for two-way video communication over IPv6.
In [7] an investigation was carried out to clarify packet loss in video transmission over IP networks with and without an error concealment process. The Lotus multi-view sequence was used, which allows 8 views, each with 500 frames. It was observed that 2% packet loss occurred without error concealment, and the quality of the video was seriously damaged; when the error concealment process was used, the quality of video over the IP networks was much better.
In [8] authors propose a solution for the IP-3DTV Network Management System
based on IPv4 and IPv6 networks. In another study authors have proposed a solution
for video traffic improvement using two schemes such as SBF-H and RBF-H. These
two techniques have the ability to select the best packet forwarder in bi-directional
multiple lanes. The tests were simulated and results compiled indicate that RBF-H can
provide better video quality than SBF-H in most traffic situations [9].
In the next paper [10] the authors have simulated voice and video traffic over
WLAN (Wireless Local Area Network) using various protocols. The results achieved
from the experiment indicate that it is possible to have three different standards of
Video channels with minimum packet loss. The authors believe that the results
identified in the local area network can be applied to the wide area network without
having low quality video services.
The contribution and motivation of this paper is to identify the impact caused by
different IP transition mechanisms on video protocols and compare the results. The work was based on a test-bed that emulates a real network. A two-way video conference between IPv6 based networks via an IPv4 cloud was set up, using two tunnelling mechanisms and a Dual-Stack mechanism to establish a connection between both IPv6 networks. Video traffic was generated using the MPEG-1, MPEG-2 and MP4 protocols over the IP transition mechanisms, and the impact of the IP transition mechanisms was measured. As of early 2011, no literature was observed that covered evaluation of video performance using the three well-known transition mechanisms IPv6to4, IPv6in4 and Dual-Stack.

4 Network Setup
The proposed network test-bed was set up based on three different configurations. As shown in Figure 1 below, there are three networks: two IPv6 based networks connected to each other via an IPv4 network. To establish a connection between the two IPv6 based networks via the IPv4 cloud, IP transition mechanisms were configured. Three different types of IP transition mechanisms were involved in these networks, namely IPv6to4, IPv6in4 and Dual-Stack. One by one, we configured each of these mechanisms to establish a connection between the IPv6 based networks. Throughout these
networks cat5e cables were used for physical connectivity.
As illustrated below, a client machine is connected to a router using IPv6 configuration, and that router is connected to another router using IPv4 configuration. The second router is connected to a client using IPv6 configuration. The IPv6to4 and IPv6in4 tunnelling mechanisms were configured on both router machines. For the Dual-Stack mechanism, all the machines had both versions of IP enabled (IPv4 and IPv6 at the same time). Linux (Ubuntu 10.10) operating system was installed on both routers and static
routing was used for both versions of IPs (IPv4 & IPv6).

Fig. 1. Network test-bed based on IP Transition & Tunnelling mechanisms
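The paper does not list the exact commands or addressing plan used on the routers, but the following hedged sketch shows the style of iproute2 configuration that creates the two tunnel types on a Linux (Ubuntu) router. The interface names and all IPv4/IPv6 addresses are illustrative placeholders, not values from the experiment.

```python
# Hedged configuration sketch for the two tunnel types (run as root on a router).
# All addresses and interface names are placeholders, not values from the paper.
import subprocess

LOCAL_V4, REMOTE_V4 = "203.0.113.1", "203.0.113.2"   # example router WAN addresses

def run(cmd: str) -> None:
    print("+", cmd)
    subprocess.run(cmd.split(), check=True)

def setup_6to4() -> None:
    # Automatic 6to4 tunnel: only the local endpoint is given; the IPv6 prefix
    # is derived from LOCAL_V4 (2002:cb00:7101::/48 for 203.0.113.1).
    run(f"ip tunnel add tun6to4 mode sit remote any local {LOCAL_V4} ttl 64")
    run("ip link set dev tun6to4 up")
    run("ip -6 addr add 2002:cb00:7101::1/16 dev tun6to4")

def setup_6in4() -> None:
    # Manually configured IPv6-in-IPv4 tunnel between the two routers.
    run(f"ip tunnel add sit1 mode sit remote {REMOTE_V4} local {LOCAL_V4} ttl 64")
    run("ip link set dev sit1 up")
    run("ip -6 addr add 2001:db8:ffff::1/64 dev sit1")
    run("ip -6 route add 2001:db8:2::/64 dev sit1")   # static route to the far IPv6 LAN
```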

In addition, we set up pure IPv4 based networks and pure IPv6 based networks and performed similar tests on these networks in order to compare and differentiate the results. The test-bed shown above is based on two IPv6 networks connected through an IPv4 cloud, with both IPv6 networks connected to each other using IP transition mechanisms (IPv6to4, IPv6in4 & Dual-Stack). All tests were conducted under the same circumstances using the same services on each workstation.
The hardware used in this experiment includes four workstations; two performed
as clients and the other two were configured as routers. Linux (Ubuntu 10.10) operating
system was installed on both router machines and three IP transition mechanisms
were implemented on those routers. Authors used a tool called CPU-Z to identify all
the components used. Following is a list of hardware components, which were
involved:

- An Intel Core 2 Duo E6300 1.86 GHz processor
- 4.00 GB RAM for efficient operation
- Broadcom NetXtreme Gigabit NIC cards
- A Western Digital hard-drive (160 GB) on each workstation
- Cat5e fast Ethernet cables were also used.

5 Traffic Generating and Monitoring Tools


VLC (Video LAN Client) [11] is a tool that was selected to broadcast (generate)
video traffic over the networks. We explicitly selected this tool as it supports both versions of internet protocols (IPv4 & IPv6) and works across a range of operating
systems including Linux, Windows and Mac. It also has the ability to broadcast live
audio, video and supports multiple voice and video protocols such as MPEG-1,
MPEG-2 and MP4.
Gnome (the GNOME System Monitor) is a traffic monitoring tool [12] that allows users to audit and measure the performance of a live network. It has the capability to capture and measure throughput, CPU utilization and RAM utilization. Gnome was explicitly selected as it could capture and monitor the traffic during the encapsulation and de-capsulation stages. Other tools have the ability to measure traffic performance over a network; however, they cannot obtain performance results during encapsulation and de-capsulation at the IP transition segments. Gnome has the special ability to monitor the traffic while it is being encapsulated or de-capsulated. This tool allowed us to capture the throughput and the impacted-throughput caused by the IP transition mechanisms.

6 Experimental Design
Two instances of VLC were installed on the client machines at both ends of the network, and Gnome was installed on a router. The first VLC player was used to stream a live video conference, which was received at the other end using a VLC player. In the same way, another VLC player was used to stream video back to the first client, making it a two-way video conference. The Gnome tool was then set up on the Router 1 machine, where encapsulation and de-capsulation are processed; hence all measurements were made at Router 1. In this experiment data was captured at 30-second intervals. The tests were repeated over 10 times to gain more accuracy in the results. The next section presents the test results obtained from this experiment.
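The measurements in this paper were taken with the GNOME System Monitor; as an illustration of the same 30-second sampling idea, the sketch below reads a router interface's byte counters from /proc/net/dev and prints throughput in KiB/s. The interface name is a placeholder.

```python
# Illustrative 30-second throughput sampler (not the tool used in the paper).
# It reads Linux byte counters from /proc/net/dev for one interface.
import time

def rx_tx_bytes(iface: str) -> tuple[int, int]:
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(iface + ":"):
                fields = line.split(":", 1)[1].split()
                return int(fields[0]), int(fields[8])   # rx_bytes, tx_bytes
    raise ValueError(f"interface {iface!r} not found")

def monitor(iface: str = "eth0", interval: int = 30) -> None:
    rx0, tx0 = rx_tx_bytes(iface)
    while True:
        time.sleep(interval)
        rx1, tx1 = rx_tx_bytes(iface)
        print(f"{iface}: rx {(rx1 - rx0) / 1024 / interval:6.1f} KiB/s  "
              f"tx {(tx1 - tx0) / 1024 / interval:6.1f} KiB/s")
        rx0, tx0 = rx1, tx1

if __name__ == "__main__":
    monitor("eth0", 30)
```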

7 Results
The metrics involved in this experiment are the pure throughput, the impacted-throughput (due to tunnelling) and CPU utilization. This section covers the performance of the three video protocols, namely MPEG-1, MPEG-2 and MP4, over the two pure IP versions followed by the transition mechanisms; their average results are presented in the graphs and Table 1.
Figure 2 below illustrates the MPEG-1 actual throughput and the additional impacted-throughput due to the encapsulation process. Throughput obtained using IPv4 was approximately 250 Kilobytes per second, while using IPv6 it slightly increased due to the bigger header size of IPv6 packets. Dual-Stack provided marginally more throughput than IPv6, as Dual-Stack works by enabling both the IPv4 & IPv6 protocol stacks at the same time, which may cause a slight impact on video packets. The result measured over the IPv6in4 tunnel was the highest, at approximately 367 Kilobytes per second, and IPv6to4 tunnelling was marginally close at approximately 364 Kilobytes per second. It is clear from the graph shown below that using the IPv6in4 tunnel requires at least 110 Kilobytes per second of extra bandwidth on top of the actual throughput. Due to the IPv6in4 tunnel, 110 Kilobytes per second is wasted, which is costly for users as a large amount of bandwidth is lost. The IPv6to4 tunnel has a smaller impact than the IPv6in4 tunnel, by approximately 3 Kilobytes per second.
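Using only the approximate figures quoted above, a quick calculation shows what fraction of the transmitted bandwidth the tunnel overhead represents; the 3 KiB/s difference for IPv6to4 is taken from the sentence above.

```python
# Back-of-the-envelope check based on the approximate MPEG-1 figures quoted above (KiB/s).
ipv6in4_total, ipv6in4_waste = 367, 110       # measured total and stated wastage
ipv6to4_total = 364
ipv6to4_waste = ipv6in4_waste - 3             # text: roughly 3 KiB/s less impact

for name, total, waste in [("IPv6in4", ipv6in4_total, ipv6in4_waste),
                           ("IPv6to4", ipv6to4_total, ipv6to4_waste)]:
    print(f"{name}: {waste} of {total} KiB/s is tunnelling overhead "
          f"({100 * waste / total:.0f}% of the transmitted bandwidth)")
```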

Fig. 2. Throughput of MPEG-1 and Impacted-Throughput by IP Transition mechanisms (KiB/s)
[Figure: bar chart of throughput (KiB/s) for IPv4, IPv6, Dual-Stack, IPv6to4 and IPv6in4]

Fig. 3. Throughput of MPEG-2 and Impacted-Throughput by IP Transition mechanisms (KiB/s)
[Figure: bar chart of throughput (KiB/s) for IPv4, IPv6, Dual-Stack, IPv6to4 and IPv6in4]

The results for the MPEG-2 protocol indicate that the IPv6in4 tunnel had the highest impact on bandwidth. Throughput measured over IPv4 shows that it takes 530 Kilobytes per second to run a two-way video conference, while over IPv6 it takes 536 Kilobytes per second. Observation over Dual-Stack indicates that it caused about 4 Kilobytes per second more bandwidth wastage than IPv6. The IPv6in4 tunnel reached approximately 883 Kilobytes per second, wasting 347 Kilobytes per second. IPv6to4 had a smaller impact on throughput than IPv6in4, by approximately 4 Kilobytes per second.
The throughput results for the MP-4 protocol are visible in Figure 4. They show that a two-way video conference over IPv4 requires approximately 110 Kilobytes per second of bandwidth, while over IPv6 it takes 125 Kilobytes per second. The Dual-Stack mechanism caused a slight impact on throughput, approximately 3 Kilobytes per second more than IPv6. The impact measured over the IPv6in4 tunnel was higher than over the IPv6to4 tunnel, by approximately 20 Kilobytes per second. Observation over IPv6in4 shows that it caused at least 71 Kilobytes per second of bandwidth wastage, while IPv6to4 caused at least 52 Kilobytes per second of wastage.

Fig. 4. Throughput of MP-4 and Impacted-Throughput by IP Transition mechanisms (KiB/s)
[Figure: bar chart of throughput (KiB/s) for IPv4, IPv6, Dual-Stack, IPv6to4 and IPv6in4]
Table 1. CPU Utilization tested over IP transition mechanisms (%)

Protocol    IPv4    IPv6    Dual-Stack    6to4    6in4
MPEG-1      29.6    31.0    30.03         26.6    27.4
MPEG-2      31.4    28.0    28.05         27.6    26.0
MP-4        30.9    31.5    33.4          33.3    31.7

Table 1 shows results of CPU utilization. These results were captured during the
performance test of each video protocol using the two IP versions and IP transition
mechanisms. The results for MPEG-1 and MPEG-2 had consistent CPU usage on
both IP versions and the three mechanisms. However, the results for MP-4 were marginally higher than those for MPEG-1 and MPEG-2. This was due to the more demanding compression scheme used by the MP-4 protocol.

8 Discussion and Conclusion


The results compiled from this experiment and presented in the section above clearly indicate that the impact caused by the transition mechanisms increases when protocols like MPEG-1 or MPEG-2 are transmitted. Dual-Stack had a reasonable impact on each protocol, while the IPv6to4 and IPv6in4 tunnelling mechanisms had a higher impact.
Migrating from IPv4 to IPv6 for pure IPv6 networks is beneficial, as the results show that IPv6 had only a slight impact on each video protocol, caused by the larger header size of IPv6 packets. Observations using Dual-Stack also showed that it is still reasonable to carry video traffic over Dual-Stack, as it caused only marginally more impact than a pure IPv6 network. However, the impact measured using IPv6in4 needs to be considered: as seen in the section above, IPv6in4 had the highest impact on each protocol tested. It caused 110 Kilobytes per second of bandwidth wastage on MPEG-1, 347 KiB/s on MPEG-2 and 71 KiB/s on MP-4.
Comparisons between tunnelling mechanisms showed that IPv6to4 tunnel
was marginally better than IPv6in4 tunnel as it had less impact on each protocol
tested.
CPU utilization measured for the three video protocols over the IP transition mechanisms showed no extra impact on the CPU. It was consistent and did not show any noticeable fluctuation. So it is clear from Table 1 above that the IP transition mechanisms had no additional impact on CPU utilization during the encapsulation and de-capsulation process.
Future work in this area should also include study and comparison of alternative
methods that could be used to forward IPv6 traffic on IPv4 core networks. Authors
are currently working on another test using more video protocols to identify the
impact of IP transition mechanisms on those protocols. Another area is to measure
packet loss of these video protocols with increased traffic loads to relate the
experiments to the realistic environments that are of practical interest.

Acknowledgments
We would like to acknowledge Unitec Institute of Technology for supporting the
research team and providing us this opportunity to complete this research.

References
1. Norton, W.B.: Internet Video: The Next Wave of Massive Disruption to US Peering Ecosystem. In: Presented at the Asia Pacific Regional Internet Conference on Operational Technologies (APRICOT), Bali, Indonesia (2007)
2. Tao, S., Apostolopoulos, J., Guerin, R.: Real-Time Monitoring of Video Quality in IP Networks. IEEE/ACM Transactions on Networking 16(5), 1052 (2008)
3. Stockebrand, B.: IPv6 in Practice. A Unixer's Guide to the Next Generation Internet. Springer, Heidelberg (2007)
4. Lee, C., Yu, Y., Chang, P.: Adaptable Packet Significance Determination Mechanism for H.264 Videos over IP Dual Stack Networks. In: IEEE 4th International Conference on Communications and Networking, pp. 1–5 (2009)
5. Sanguankotchakorn, T., Somrobru, M.: Performance Evaluation of IPv6/IPv4 Deployment over Dedicated Data Links. In: IEEE Conference on Information, Communications and Signal Processing, pp. 244–248 (2005)
6. Lee, J., Chon, K.: Compressed High Definition Television (HDTV) over IPv6. In: IEEE Conference on Applications and the Internet Workshops, p. 25 (2006)
7. Zhou, Y., Hou, C., Jin, Z., Yang, L., Yang, J., Guo, J.: Real-Time Transmission of High-Resolution Multi-View Stereo Video over IP Networks. In: IEEE Conference: The True Vision-Capture, Transmission and Display of 3D Video, p. 1 (2009)
8. Luo, Y., Jin, Z., Zhao, X.: The Network Management System of IP-3DTV Based on IPv4/IPv6. In: IEEE 6th Conference on Wireless Communications Networking and Mobile Computing, pp. 1–4 (2010)
9. Xie, F., Hua, K.A., Wang, W., Ho, Y.H.: Performance Study of Live Video Streaming over Highway Vehicular Ad hoc Networks. In: IEEE 66th Conference on Vehicular Technology, pp. 2121–2125 (2007)
10. Gidlund, M., Ekling, J.: VoIP and IPTV Distribution over Wireless Mesh Networks in Indoor Environment. IEEE Transactions on Consumer Electronics 54(4), 1665–1671 (2008)
11. Video LAN: Video LAN Organization: VLC Media Player (2011), http://www.videolan.org/vlc/
12. GNOME: GNOME Documentation Library: System Monitor Manual (2011), http://library.gnome.org/users/gnome-system-monitor/

Performance Comparison of Video Protocols Using Dual-Stack and Tunnelling Mechanisms
Hira Sathu, Mohib A. Shah, and Kathiravelu Ganeshan
Unitec Institute of Technology, Auckland, New Zealand
{Hsathu,kganeshan}@unitec.ac.nz, shahm09@wairaka.com

Abstract. This paper investigates the performance of video protocols over IPv6 and IP transition mechanisms. It mainly focuses on the impact caused by IP transition mechanisms on video packets and compares this with pure IPv6 based networks. The video protocols selected in this experiment were MPEG-1, MPEG-2, MPEG-4, MKV and FLV. In this experiment a Dual-Stack and two tunnelling mechanisms were established, and the impact of these mechanisms on the five video protocols was measured. The parameters measured were actual-throughput over a pure IPv6 network, impacted-throughput (due to IP transition mechanisms) and CPU utilization. The results indicate that video packets with a large size were impacted more than packets with a small size when using these IP transition mechanisms. The Dual-Stack mechanism performed much better than the two tunnelling mechanisms (IPv6to4 & IPv6in4) tested. The IPv6in4 tunnelling mechanism had more impact than the IPv6to4 tunnelling mechanism over all the video protocols tested, with IPv6to4 marginally close for all protocols. Performance comparison between the video protocols shows that the FLV protocol was the least impacted while MPEG-2 was the most impacted by the tunnelling mechanisms. Further detail is covered in this paper, including actual-throughput, impacted-throughput and CPU utilization.
Keywords: Video protocols, performance evaluation, actual-throughput, impacted-throughput, IPv6to4 tunnel, IPv6in4 tunnel & Dual-Stack mechanism.

1 Introduction
Recent studies [1], [2] and [3] indicate that Video over IP is an important technology, which is growing rapidly and has a vital role ahead. Forward-looking studies also indicate that the reliability and availability of Video over IP on all types of electronic devices will be in demand. Hence Video over IP would require more IP addresses in order to permit larger numbers of devices to be connected over the internet. Several other concerns are expected to arise, and Video over IP has to deal with the related issues in order to enhance its performance. Issues like video packet size for mobile devices and quality over next generation networks (NGN) are yet to be resolved. Currently Video over IP is mainly being transmitted over IPv4 networks (Internet). However, according to researchers, a greater challenge exists in transmitting Video over IP over IPv6 infrastructure. In this scenario we have implemented an
infrastructure based on NGN including IPv6 to identify the quality of Video over IP
using IP transition mechanisms [4].
MPEG (Moving Picture Experts Group) is a working group of specialists formed
by international organisations with a view to set standards for audio, video and
multimedia (MM) communications. MPEG has collaborative organisations and works
with a range of universities, industries, and research institutions. The MPEG standard defines multiple ways to broadcast audio and video, such as multimedia streams that are compressed and transmitted concurrently within the MPEG standards.
MPEG-1, MPEG-2 and MPEG-4 are the commonly used standards from the range of MPEG standards used for audio and video transmission. MPEG-3 was designed for High Definition TV compression and became redundant, with its features merged into MPEG-2. MPEG-1 was the first MM compression method, operating at approximately 1.5 Megabits per second (ISO/IEC 11172). Considering the low bit rate of 1.5 Mbps for MM services, this standard uses a lower sampling rate for the images and lower picture rates of 24-30 Hz. The final outcome is a lower picture quality.
The popular format known as MP3 is formed from the parts of MPEG-1 and
MPEG-2 standards. MPEG-2 provides broadcast quality video and is especially used for TV transportation. The typical broadcast rates for the MPEG-2 standard are higher than for MPEG-1, while the MPEG-4 standard uses enhanced compression techniques over MPEG-2. This aids the transport of application-level MM like computer graphics, animation and regular video files. In some cases
MPEG-4 decoder is capable of describing three dimensional pictures and surfaces for
files with .MP4 file extension.
Matroska Multimedia Container, MKV is an open standard free container file
format that can hold an unlimited number of video, audio, picture or subtitle tracks
inside a single file. Unlike other similar formats, such as MP4, AVI and ASF, MKV
has an open specification (open standard) and most of its code is open source. The
formats are .MKA for audio only, .MKS for subtitles only, .MKV for audio, video,
pictures and subtitles and .MK3D for stereoscopic/3D video. Matroska is also the
basis for .webm (WebM) files. Matroska is based on a binary derivative of XML,
called the Extensible Binary Meta Language (EBML) which bestows future format
extensibility, without breaking file support in old parsers.
Flash Video is viewable on most operating systems, using the Adobe Flash Player
and web browser plug-ins and is very popular for embedded video on the web and
used by YouTube, Google Video, metacafe, Reuters.com, and many other news
providers. FLV is a container file format used to deliver video over the Internet using
Adobe Flash Player (versions 6 to10). There are two different video file formats
known as Flash Video: FLV and F4V. FLV was originally developed by Macromedia.
The audio and video data within FLV files are encoded in the same way as they are
within SWF files; Flash Video content may also be embedded within SWF files. The F4V file format is based on the ISO base media file format. Flash Video FLV files usually contain material encoded with codecs following the Sorenson Spark or
VP6 video compression formats. The most recent public releases of Flash Player
(collaboration between Adobe Systems and MainConcept) also support H.264 video
and HE-AAC audio.

The contribution and motivation of this paper is to identify the actual-throughput and impacted-throughput using five different video protocols and compare their results. This experiment was conducted in a computer lab on a real network. A two-way video conference was established on IPv6 based networks, which were connected via an IPv4 cloud. To connect these networks, two tunnelling mechanisms and a Dual-Stack mechanism were established between both IPv6 networks. Video traffic was transmitted using the MPEG-1, MPEG-2, MPEG-4, MKV and FLV protocols over these mechanisms, and actual-throughput and impacted-throughput were monitored. As of early 2011, no literature was observed that covered evaluation of video protocols on these three well-known transition mechanisms, namely IPv6to4, IPv6in4 and Dual-Stack.

2 Background
To resolve the issue of shortage in IPv4 addresses for the future, IPv6 was introduced
to the computer world. In addition it also provides a number of other advantages on
adoption. However, IPv6 still has one major issue since it does not communicate
directly with IPv4 networks. To resolve this issue, researchers have designed various
IP transition mechanisms, known as IPv6 over 4, NAT-PT, Dual-Stack, IPv6to4 and IPv6-in-4, which allow IPv6 based networks to communicate with other
IPv6 based networks via the IPv4 cloud.
IPv6-to-4 and IPv6-in-4 are two vital tunnelling mechanisms which are available
on multiple operating systems including Windows and Linux OSs. The main purpose
of these tunnelling mechanisms was to enable IPv6 based networks to communicate
to other IPv6 based networks through IPv4 networks (the Internet). The tunnelling mechanisms were designed to carry IPv6 packets via IPv4 networks by applying an encapsulation process that wraps each IPv6 packet in an IPv4 header. A de-capsulation process is then executed at the other end, which removes the IPv4 header and delivers the pure IPv6 packets to their destinations.
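As a conceptual illustration of this encapsulation step (using scapy, which is not a tool from this paper), the sketch below wraps an IPv6 packet in an IPv4 header with protocol number 41, the carrier mechanism used by both 6to4 and 6in4 tunnels; all addresses are placeholders.

```python
# Conceptual sketch of 6in4/6to4 encapsulation: the IPv6 packet becomes the payload
# of an IPv4 packet (protocol 41), adding a 20-byte outer header. Addresses are
# placeholders; scapy is used only for illustration.
from scapy.all import IP, IPv6, UDP, Raw   # pip install scapy

inner = IPv6(src="2001:db8:1::10", dst="2001:db8:2::10") / UDP(dport=5004) / Raw(b"video payload")
outer = IP(src="203.0.113.1", dst="203.0.113.2", proto=41) / inner

print(len(inner), len(outer))   # the outer packet is exactly 20 bytes longer
```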
IPv6-to-4 tunnel operates as an automatic tunnel using prefixed IP addresses. A
special method is used to calculate prefixed IP addresses for both IPv4 and IPv6. It
also does not work with private IPv4 addresses and it cannot use multicast addresses
or the loop-back address as the embedded IPv4 address [5]. IPv6-in-4 tunnel is also
known as configured tunnel, which needs to be configured manually among hosts. It
has the capability to operate at any given IP address and does not require any prefixed
IP addresses, unlike IPv6to4 tunnel. Each of these tunnels has a special virtual
interface, which requires different setup configuration. IPv6to4 tunnel is created and
setup in an interface called tun6to4 whereas IPv6in4 tunnel is created and setup in an
interface called IPv6-in-4. The Dual-Stack mechanism is established by enabling both versions of the IP protocol (IPv4 & IPv6) concurrently, so that both operate simultaneously. It allows IPv4 based nodes to communicate only with IPv4 based nodes, while IPv6 based nodes communicate only with IPv6 based nodes; however, IPv6 nodes cannot communicate with IPv4 nodes.
IP transition mechanisms provide a solution by allowing IPv6 based
networks to communicate with other IPv6 based networks through IPv4
infrastructures. However, there are still major concerns that are noticeable with use of

IP transition mechanisms. Dual-Stack is limited in application until internet service providers and other networks on the internet enable the dual-stack mechanism. The tunnelling mechanisms cause additional delay in video transmission because of the encapsulation and de-capsulation process, which is a vital concern. This may impact the
quality of video transmission and may also lead to increased bandwidth wastage. Thus
the authors have setup a real network based environment to conduct tests to clarify the
impact of dual-stack and tunnelling mechanisms on five different video protocols.
The IP mechanisms selected include a dual-stack, IPv6to4 and IPv6in4 tunnelling
mechanisms. The video protocols investigated and tested were MPEG-1 MPEG-2,
MPEG-4, MKV & FLV over Linux Ubuntu 10.10 platform. The actual aim of this
experimental research is to identify the actual-throughput and clarify the impacted-throughput caused by these IP transition mechanisms on each video protocol and
compare their results.
The organization of this paper is as follows: Section 3 describes related works.
Section 4 presents the network setup and Section 5 discusses the traffic generating &
monitoring tools used for this study. Section 6 covers experiment design and Section
7 outlines the analysis and results. Last section presents the discussions and
conclusions followed by the references.

3 Related Works
This section covers related areas of research which was undertaken by other
researchers in past years. In [6] a method was designed and tested which proposed a solution to the packet loss issue in video transmission. The method used is called
Adaptive Significance Determination Mechanism in Temporal and Spatial domains
(ASDM-TS) for H.264 videos packets using IP dual-stack infrastructure with
DiffServ model. The video packet loss issue was undertaken in depth and multiple
video protocols were involved as each protocol is based on different characteristics
and experiences different errors during transmission. A model which used fixed
packets for video traffic and prioritised video packet progression differently is
ineffective and reduces the quality of video packets due to significant packet loss in
the process of transmission. However, using this new method (ASDM-TS) can
improve the packet loss in video transmission especially when it is broadcast over IP
dual-stack mechanism.
In this scenario different types of traffic including video was tested and analyzed
on dual-stack mechanism. In [7], authors conducted an experiment and performed
Video, Internet, FTP & VoIP traffic over dual-stack mechanism. The tool known as
NS-2 (Network Simulator 2) was selected to carry out the tests and metrics considered
were packet loss, bandwidth and delay. Video protocol involved was MPEG-4 and it
was transmitted over Dual-Stack mechanism using various packet sizes and outcome
was compared. It was concluded at the end, that usage of IPv6 is much better than
IPv4 no matter which traffic is transmitted. Furthermore IPv6 has the capacity to
transmit more bandwidth, and cause less delay for large packet sizes whereas IPv4 is
limited and provides less bandwidth for large packet sizes.

Communication between two countries was setup using IPv6 to identify the
behaviour of video traffic over a live network. In [8] authors observed video
transmission over pure IPv6 and results obtained were compared with IPv4 based
networks. The tests include HD (High Definition) video packets with and without
compression system on both networks (IPv4 & IPv6) and one-way and two-way
communication system was established between both countries. The traffic analysis
outlines that 0.1% packet loss was measured over one-way transmission on IPv6
based networks while two-way transmission added significant packet loss at
approximately 44%. The video transmission over IPv4 states that there is no major
concern while using one-way and two-way video communication and outcome is
stable for both. However, results for IPv6 indicates that using two-way transmission
has caused significant impact on packet loss (44%) due to the network devices.
Overall it was concluded that devices used in the infrastructure of IPv6 have caused
this major packet loss, as these devices are not compatible with each other in regard to
IPv6 traffic forwarding.
An investigation over packet loss was conducted using video traffic. In [9]
investigation was conducted to identify and compare packet loss occurrence in video
transmission due to the process of error concealment and without error concealment.
Lotus multi-view sequence was established that enables 8 views at a time and each
view provides 500 frames. The packet loss results show approximately 2% packet loss without the error concealment process, which caused significant damage to video quality. However, using error concealment produced much better results, and the quality of video over the IP infrastructure was good.
A new structure of carrying 3D traffic over IP networks was designed and a solution
was proposed for 3D IP-TV. In [10] authors designed a technique called IP-3DTV
Network Management System which was established on both versions of IP (IPv4 &
IPv6). Another study was carried out to enhance the performance of video over IP
networks using two techniques known as SBF-H and RBF-H. The techniques
mentioned above have the capability to select the appropriate packets during video
transmission and forward them in bi-directional multiple lanes. The outcome was
obtained based on simulated test environment. It outlines that having RBF-H technique
could enhance video traffic while SBF-H is appropriate in most conditions [11].
In this paper [12] the researchers setup a network for simulation environment and
performed voice and video packets over WLAN (Wireless Local Area Network) using
multiple protocols. The outcome obtained from the tests shows that three different
types of channels can be broadcasted concurrently without having significant packet
loss in video transmission. The authors concluded at the end that the outcome
achieved from these tests which was conducted in LAN (Local Area Network)
environment, can be applied over WAN (Wide Area Network) without causing any
impact on video quality.
In [13], another study was undertaken and real-time network was established to
observe the original packet loss on a live network. Impact of frame rate on real-time
transmission was also investigated in [14] and [15], the research in [16] takes it to the
next level by testing effects of video on next generation network (NGN) and future
architectures.

4 Network Setup
The proposed network test-bed was established using four different setups. The first setup was based on pure IPv6, and the second enabled the Dual-Stack mechanism. The third and fourth setups involved the two tunnelling mechanisms known as IPv6to4 and IPv6in4. There
are three networks in each setup as illustrated in Figure 1 below. Two networks at
both ends are based on IPv6 configurations while the cloud is based on IPv4
configuration. To establish a connection between two IPv6 based networks through
IPv4 cloud, two tunnelling and dual-stack mechanisms were configured. The two
tunnelling mechanisms included are IPv6to4 and IPv6in4. One by one each of these
mechanisms was configured to setup a connection between IPv6 based networks.
Throughout these networks Cat5e cables were used for physical connectivity.
As visible below a client workstation is connected to a router using IPv6
configuration and then a router is connected to another router using IPv4
configuration. Second router is connected to a client using IPv6 configuration.
IPv6to4 and IPv6in4 tunnelling mechanisms were configured on both router
machines. For Dual-Stack mechanism all the workstations and routers had both
versions of IPs enabled (IPv4 and IPv6 concurrently). Linux (Ubuntu 10.10) operating
system was installed on both routers and static routing was used for both versions of
IPs (IPv4 & IPv6).

Fig. 1. Network test-bed based on Tunnelling and Dual-Stack mechanisms

In addition pure IPv6 based networks were set up and similar tests performed on
these networks in order to compare the results. The test-bed shown above is based on
two IPv6 networks through IPv4 cloud and both IPv6 networks are connected to each
other using IP transition mechanisms (IPv6to4, IPv6in4 & Dual-Stack). All tests were
conducted under same circumstances using same services on each workstation.
The hardware used in this experiment contains four workstations; two machines
performed as clients and the other two acted as routers. Linux (Ubuntu 10.10) platform
was installed on both router machines and three IP mechanisms were established on
each of the two routers. The authors used a tool called CPU-Z to verify that all the components were identical. The hardware components are listed below:

- An Intel Core 2 Duo E6300 1.86 GHz processor
- 4.00 GB RAM for efficient operation
- Broadcom NetXtreme Gigabit NIC cards
- A Western Digital hard-drive (160 GB) on each workstation
- Cat5e fast Ethernet cables were also used.

5 Traffic Generating and Monitoring Tools


VLC (Video LAN Client) [17] is a tool that was selected to generate video packets
over the networks. This tool was selected as it supports both versions of internet
protocols (IPv4 & IPv6) and works across a range of operating systems including
Linux, Windows and Mac. It also has the ability to transmit live audio, video and
supports multiple voice and video protocols such as MPEG-1, MPEG-2, MPEG-4,
MKV and FLV.
Gnome (the GNOME System Monitor) is a network monitoring tool [18] that allows users to audit, capture and measure the status of a live network. It has the ability to capture and evaluate throughput, CPU utilization and RAM utilization. Gnome was particularly selected as it could capture and audit the video traffic during the process of encapsulation and de-capsulation. Other tools tested had the capability to evaluate the traffic status over a network; however, they could not observe the performance of the network during the encapsulation and de-capsulation process carried out by the IP transition mechanisms. Gnome has the special capacity to monitor the traffic while it is being encapsulated or de-capsulated. This tool enabled us to observe and capture the actual-throughput and the impacted-throughput caused by the IP transition mechanisms.

6 Experimental Design
Two instances of the VLC player were installed on the client workstations at both sides of the network, and Gnome was set up on a router machine. The first VLC application was set up to stream a live video conference using one of the video protocols, which was received at the other end of the network using another VLC application. In the same way, another VLC application was set up to stream live video back to the client, making it a two-way video conference. The Gnome tool was then configured on the router machine where encapsulation and de-capsulation were processed. In this experiment data was captured at 30-second intervals. The tests were repeated over 10 times to gain more accuracy in the results. The next section presents the test results obtained from this experiment.
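The exact VLC invocation used in the experiment is not given in the paper; the hedged example below shows one common way to start such a stream from the command line, with the file name, destination address and port as placeholders.

```python
# Hedged example of launching a VLC stream to an IPv6 receiver; file, address and
# port are placeholders, not the paper's actual settings.
import subprocess

SOURCE = "sample.mpg"                 # placeholder test clip
DEST, PORT = "2001:db8:2::10", 5004   # placeholder IPv6 receiver and RTP port

subprocess.run([
    "cvlc", SOURCE, "--loop",
    "--sout", f"#rtp{{dst=[{DEST}],port={PORT},mux=ts}}",
])
# The receiving workstation can then open the stream with:  vlc rtp://@:5004
```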

7 Results Analysis
The parameters covered in this experiment are actual-throughput, impacted-throughput and CPU utilization. This section presents the performance of five different video protocols, namely MPEG-1, MPEG-2, MPEG-4, MKV and FLV, over the two tunnelling mechanisms and the Dual-Stack mechanism; their average results are shown in the graphs and Table 1 below.
Actual-throughput: this is the original throughput of the two-way video conference, with no additional traffic impact due to encapsulation. The network simply carries the video packets and delivers them to their destinations with no addition to the packet size.

Fig. 2. Actual-Throughput with IPv6 and Impacted-Throughput by IPv6to4 tunnelling mechanism (KiB/s)
[Figure: bar chart of throughput (KiB/s) for MPEG-1, MPEG-2, MPEG-4, MKV and FLV, comparing IPv6 with IPv6to4]

Impacted-throughput: this is caused by the IP transition mechanisms and is added on top of the actual-throughput. The additional encapsulation in the network wastes a significant amount of bandwidth in order to deliver the actual-throughput.
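A simple way to see where impacted-throughput comes from is the 20-byte IPv4 header added to every encapsulated packet; the sketch below relates actual-throughput and average packet size to the extra bandwidth needed. The packet sizes are illustrative assumptions, and the 257 KiB/s figure is the MPEG-1 actual-throughput reported later in this section.

```python
# Minimal sketch relating impacted-throughput to per-packet encapsulation overhead.
# Packet sizes below are assumptions for illustration, not measurements from the paper.
IPV4_HEADER = 20   # bytes prepended to every packet carried through a 6to4/6in4 tunnel

def impacted_throughput(actual_kib_s: float, avg_packet_bytes: int) -> float:
    """Extra KiB/s needed to carry `actual_kib_s` of IPv6 traffic through the tunnel."""
    packets_per_second = actual_kib_s * 1024 / avg_packet_bytes
    return packets_per_second * IPV4_HEADER / 1024

for size in (200, 600, 1400):
    extra = impacted_throughput(257, size)
    print(f"average packet {size:4d} B: about +{extra:4.1f} KiB/s on a 257 KiB/s stream")
```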
Figure 2 illustrates the actual-throughput on IPv6 and the impacted-throughput due to the encapsulation process of the IPv6to4 tunnelling mechanism. The results were obtained using five different video protocols. The highest impacted-throughput was identified for MPEG-2, at approximately 242 kilobytes per second of additional impact on the actual-throughput. The second highest was calculated for MKV at approximately 136 kilobytes per second. The lowest impacted-throughput was observed for FLV at approximately 17 kilobytes per second, while the second lowest was measured for MPEG-4 at approximately 51 kilobytes per second. MPEG-1 had an actual-throughput of approximately 257 kilobytes per second, with an additional 106 kilobytes per second due to encapsulation. MKV and MPEG-1 had almost the same actual-throughput in kilobytes per second; however, the impacted-throughput shows that MKV was more affected than MPEG-1, by approximately 30 kilobytes per second. Impacted-throughput effectively wastes bandwidth in a real network, as more bandwidth is used to deliver the same amount of useful throughput, which is costly for users.
The results analysis using IPv6in4 tunnelling mechanism indicates that it has
significant impact on the protocols tested. The measurement of MPEG-2 reveals that
it was impacted the most, as it has the largest packet size, with impacted-throughput observed at approximately 246 Kilobytes per second. The least impacted-throughput was calculated for FLV at approximately 17 kilobytes per second, while the second lowest was measured for MPEG-4 at approximately 70 kilobytes per second. MPEG-1 had an actual-throughput for the two-way video conference of approximately 257 kilobytes per second, with 110 kilobytes per second of impacted-throughput added to it. The results for MKV show that it had almost the same actual-throughput as MPEG-1, while the impacted-throughput measurement shows it was impacted even more than MPEG-1, by approximately 16 kilobytes per second.
The results for the Dual-Stack mechanism are presented in Figure 4. They illustrate that Dual-Stack has very little impact on actual-throughput for all five protocols. The highest impact was measured for the MKV protocol, which produced about 15 kilobytes per second of impacted-throughput, while the second highest was calculated for MPEG-1 at approximately 4 kilobytes per second. The lowest impacted-throughput was observed for FLV at approximately 0.9 kilobytes per second. MPEG-2 and MPEG-4 both showed marginally close impacted-throughput, with less than 0.2 kilobytes per second difference. The reason Dual-Stack has less impact on these protocols is that it does not perform encapsulation and de-capsulation.

Fig. 3. Actual-Throughput with IPv6 and Impacted-Throughput by IPv6in4 tunnelling mechanism (KiB/s)
[Figure: bar chart of throughput (KiB/s) for MPEG-1, MPEG-2, MPEG-4, MKV and FLV, comparing IPv6 with IPv6in4]

Fig. 4. Actual-Throughput with IPv6 and Impacted-Throughput by Dual-Stack mechanism (KiB/s)
[Figure: bar chart of throughput (KiB/s) for MPEG-1, MPEG-2, MPEG-4, MKV and FLV, comparing IPv6 with Dual-Stack]
Table 1. CPU Utilization tested over Tunnelling and Dual-Stack mechanisms (%)

Mechanism     MPEG-1   MPEG-2   MPEG-4   MKV    FLV
Dual-Stack    30.0     28.1     33.4     29.1   26.3
IPv6to4       26.6     27.1     33.3     30.1   26.2
IPv6in4       27.4     25.1     31.7     28.9   26.4
Table 1 above shows the results for CPU utilization. These results were obtained during the performance test of each video protocol using the two tunnelling mechanisms and the Dual-Stack mechanism. The results for IPv6to4 and IPv6in4 did not show much inconsistency in CPU usage, while Dual-Stack was slightly higher than both of them; this is because both versions of IP operate concurrently. Comparison between the protocols also did not show much variation. However, the results for MPEG-4 were marginally higher than for the other four protocols (MPEG-1, MPEG-2, MKV & FLV), due to the more demanding compression method used to process the MPEG-4 protocol. The least CPU was used during the FLV tests; as can be seen from Table 1 above, FLV had about 26 percent CPU usage no matter which mechanism it was tested on. This is due to the packet size of this protocol, which requires less processing power.

8 Discussion and Conclusion


In this paper the actual-throughput and impacted-throughput were investigated, and the compiled results were presented in the section above. The results clarified that video packets with a larger size experience a significant impact, while video protocols with a smaller size are only slightly impacted. Dual-Stack had only a very slight impact on all the protocols tested, as it does not process encapsulation and de-capsulation. In contrast, the tunnelling mechanisms had a significant impact on all the protocols tested.
The performance of Dual-Stack was much better; however, it will take several years before websites and other internet service providers enable the Dual-Stack mechanism across the internet. In the coming years the IPv6to4 and IPv6in4 tunnelling mechanisms will enable IPv6 based networks to communicate; however, the impact caused by these two tunnelling mechanisms is significant and wastes a lot of bandwidth for video transmission. The IPv6to4 mechanism performed marginally better than the IPv6in4 mechanism for all the video protocols tested.
Comparison between video protocols indicates that FLV protocol was the
least impacted, and usage of this protocol will not cause much bandwidth wastage
while MPEG-4 was second best for use over tunnelling mechanisms.
CPU utilization measurement shows no additional impact on CPU usage
while the IP transition mechanisms were used. However, MPEG-4 was marginally higher than the other four video protocols tested on the two tunnelling and Dual-Stack mechanisms. Due to the packet size of FLV, it needs less CPU power. So it is clear from Table 1 above that the IP transition mechanisms had only a slight impact on CPU processing power, and even the encapsulation and de-capsulation process caused an insignificant impact on CPU utilization.
Future work in this area should also include study and comparison of alternative
methods that could be used to forward IPv6 traffic on IPv4 core networks. Another
area is to cover other metrics such as jitter and packet loss and increase traffic loads
of these protocols to relate the experiments to the realistic environments.

Acknowledgments
We would like to express our appreciation to Unitec Institute of Technology for supporting the research team and providing us this opportunity to complete this research.

Performance Comparison of Video Protocols

511

References
1. Atenas, M., Garcia, M., Canovas, A., Lloret, J.: MPEG-2/MPEG-4 Quantizer to Improve the Video Quality in IPTV Services. In: IEEE Sixth International Conference on Networking and Services, pp. 49–54 (2010)
2. Schierl, T., Gruneberg, K., Wiegand, T.: Scalable Video Coding Over RTP and MPEG-2 Transport Stream in Broadcast and IPTV Channels. IEEE Journals on Wireless Communications 16(5), 64–71 (2009)
3. Kim, S., Yongik, Y.: Video Customization System using Mpeg Standards. In: IEEE International Conference on Multimedia and Ubiquitous Engineering, pp. 475–480 (2008)
4. Tao, S., Apostolopoulos, J., Guerin, R.: Real-Time Monitoring of Video Quality in IP Networks. IEEE/ACM Transactions on Networking 16(5), 1052 (2008)
5. Stockebrand, B.: IPv6 in Practice. A Unixer's Guide to the Next Generation Internet. Springer, Heidelberg (2007)
6. Lee, C., Yu, Y., Chang, P.: Adaptable Packet Significance Determination Mechanism for H.264 Videos over IP Dual Stack Networks. In: IEEE 4th International Conference on Communications and Networking, pp. 1–5 (2009)
7. Sanguankotchakorn, T., Somrobru, M.: Performance Evaluation of IPv6/IPv4 Deployment over Dedicated Data Links. In: IEEE Conference on Information, Communications and Signal Processing, pp. 244–248 (2005)
8. Lee, L., Chon, K.: Compressed High Definition Television (HDTV) over IPv6. In: IEEE Conference on Applications and the Internet Workshops, p. 25 (2006)
9. Zhou, Y., Hou, C., Jin, Z., Yang, L., Yang, J., Guo, J.: Real-Time Transmission of High-Resolution Multi-View Stereo Video over IP Networks. In: IEEE Conference: The True Vision-Capture, Transmission and Display of 3D Video, p. 1 (2009)
10. Luo, Y., Jin, Z., Zhao, X.: The Network Management System of IP-3DTV Based on IPv4/IPv6. In: IEEE 6th Conference on Wireless Communications Networking and Mobile Computing, pp. 1–4 (2010)
11. Xie, F., Hua, K.A., Wang, W., Ho, Y.H.: Performance Study of Live Video Streaming over Highway Vehicular Ad hoc Networks. In: IEEE 66th Conference on Vehicular Technology, pp. 2121–2125 (2007)
12. Gidlund, M., Ekling, J.: VoIP and IPTV Distribution over Wireless Mesh Networks in Indoor Environment. IEEE Transactions on Consumer Electronics 54(4), 1665–1671 (2008)
13. Kukhmay, Y., Glasman, K., Peregudov, A., Logunov, A.: Video over IP Networks: Subjective Assessment of Packet Loss. In: Tenth IEEE Conference on Consumer Electronics, pp. 1–6 (2006)
14. Khalifa, N.E.-D.M., Elmahdy, H.N.: The Impact of Frame Rate on Securing Real Time Transmission of Video over IP Networks. In: IEEE Conference on Networking and Media Convergence, pp. 57–63 (2009)
15. Sims, P.J.: A Study on Video over IP and the Effects on FTTx Architectures. In: IEEE Conference on Globecom Workshops, pp. 1–4 (2007)
16. IlKwon, C., Okamura, K., MyungWon, S., YeongRo, L.: Analysis of Subscribers' Usages and Attitudes for Video IP Telephony Services over NGN. In: 11th IEEE Conference on Advanced Communication Technology, pp. 1549–1553 (2009)
17. Video LAN: Video LAN Organization: VLC Media Player (2011), http://www.videolan.org/vlc/
18. GNOME: GNOME Documentation Library: System Monitor Manual (2011), http://library.gnome.org/users/gnome-system-monitor/

IPTV End-to-End Performance Monitoring


Priya Gupta, Priyadarshini Londhe, and Arvind Bhosale
Tech Mahindra Pune, India
{pg0042955,londhepv,arvindb}@techmahindra.com

Abstract. Service providers are spending millions of dollars in rolling out Internet Protocol Television (IPTV) services. In order to deliver profitable services in this competitive market, service providers should focus on reliability and service quality. Customer experience is the key differentiator in this competitive market. To gauge end-user experience they need an end-to-end and proactive performance monitoring solution for IPTV. Monitoring should be done at various identified interfaces, servers and network elements for user-impacting impairments, from the Super-headend to the Set-Top-Box. Proactively addressing trouble alarms or predicting possible issues before they impact end-user experience will greatly enhance the customer experience. Following a proactive approach to monitoring will help the service provider in early detection and resolution of faults. The ability to do so will help in improving end-user experience and in turn increase service provider revenue.
Keywords: IPTV, Performance Monitoring, Quality of Experience (QoE),
Quality of Service (QoS).

1 Introduction
IPTV is delivery of entertainment quality video over managed IP network. It is not
just limited to delivery of broadcast television program but also extends to services
like Video on Demand (VOD) where video is unicast to customer on request. IPTV is
one facet of Triple Play (Voice over Internet protocol (VOIP), IPTV and Data
services) and Quadplay (also includes mobile services) services. It is a game changing
technology as it provides end-users a two-way communication in the delivery of
broadcast television. IPTV also offers interactive services like ordering and playing
VOD, controlling live TV (rewind and pause), Personal Video Recording (PVR), time
shifting etc. In this growing technology many telecom service providers are now
offering services like Triple/Quadplay in an attempt to gain greater market share.
In addition to the provision of IPTV over wireline access networks, delivery of IPTV over wireless cellular networks, hybrid satellite and terrestrial wireless systems, and cable TV is gaining a foothold.
IPTV service providers have to compete with their counterparts offering IPTV
services through wireless, cable & satellite TV. To be successful, they have to meet
and exceed todays high standards of a reliability and service quality. Service quality
is a primary reason for customer churn and dissatisfaction. Hence end-to-end
A. Abraham et al. (Eds.): ACC 2011, Part IV, CCIS 193, pp. 512-523, 2011.
© Springer-Verlag Berlin Heidelberg 2011


performance monitoring always plays a crucial role in any successful rollout and management of IPTV offerings.
This paper discusses the IPTV architecture stack, showing the components that contribute to delivering IPTV services, the various performance monitoring points, and the performance metrics that need to be captured for measuring and ensuring a good end-customer experience. The paper also suggests an approach for performance monitoring and a method to measure the QoE aspects of IPTV.

2 IPTV Architecture
For delivering IPTV services, content needs to traverse various components. Major
components deployed for delivering IPTV services are shown in Fig.1. These are:
1. Super Headend
2. Video Hub Office
3. Core Network
4. Access Network
5. Home Network

Fig. 1. IPTV Architecture Stack

2.1 Super Headend (SHE) and Video Hub Office (VHO)


SHE component is responsible for three major tasks - acquiring television
signals/videos at the national level from various sources, processing the content (encoding, trans-coding) and distributing (encapsulating, applying conditional access) the processed content to customers through the VHO. Fig. 1 shows how the acquired content flows from the SHE to the end-user. The content acquired here by satellite dish comes in a variety of formats and encryption protocols. This content is then converted into serial digital interface (SDI) form by an Integrated Receiver and Decoder (IRD). The signal is then further compressed and encrypted using encoders and Digital Rights Management (DRM) technology. Video is encoded into MPEG-4 AVC H.264 to transmit good-quality Standard/High Definition (SD/HD) content/channels to customers having bandwidth constraints in the access network. After all this processing, the content is stored on streaming servers to be streamed to end-users. The IPTV middleware system offers end-users functionalities like
user authentication and authorization, Electronic program guide (EPG) management,
subscriber management etc. IPTV network generally contains one SHO and multiple
VHOs, where national services are acquired at the SHO and regional, local services
and video on demand services are acquired and integrated at each of the VHOs. The
content distribution systems are located in the VHOs closest to the clients.
2.2 Core Network
The core network is the central network portion of a communication system. It
primarily provides interconnection and transfer between edge networks through
routers. It needs to be re-engineered to support carriage of large volume of video
content. Its bandwidth needs to be extended to meet growing video demand.
2.3 Access Network
An access network allows individual subscribers or devices to connect to the core
network. It can be Digital subscriber line (DSL), cable modem, wireless broadband,
optical lines, or powerline data lines. In order to deliver good-quality video to the end-user, the last mile must be capable of supporting the bandwidth required to carry video over IP. IPTV roughly requires 2 Mbps for an SD channel and 6-12 Mbps for an HD channel with MPEG-4 compression. With fiber, bandwidth in
local loop is not an issue. However, providing fiber to all customers is a complex
undertaking as it is costly and may involve redesign of existing infrastructure.
Currently, the broadband technologies ADSL2+ and VDSL (Asymmetric/Very High Bit Rate Digital Subscriber Line) seem to be the most economical means of deployment
of real time video services.
2.4 Home Network
It is a very critical area, as nearly 40% of video issues occur in the home. The bandwidth within the home will be a critical factor in assuring good delivery of the IPTV service. The service needs to be provided by utilizing the existing transmission media (power lines, phone lines, coax cables and wireless technologies) available at the subscriber premises, and with the same quality [12].


3 Need for Performance Monitoring and Its Benefits


While service providers are spending millions of dollars in rolling out the IPTV service, spending on proactive end-to-end monitoring of the IPTV stack (network, service delivery platform and video quality) cannot be overlooked.
The need for proactive monitoring of IPTV services is centered on delivering reliable and profitable services in this highly competitive market. Focus on the reliability and quality of IPTV services will be key to the growth and long-term survival of the service provider. Poor service quality is a primary cause of customer churn and dissatisfaction. Therefore service providers should understand and meet customers' expectations of service quality.
Most operators are using a monitoring solution in some form or other, but many are highly dissatisfied as they do not have end-to-end visibility of the IPTV stack and its performance, from the headend through the network to the customer premises. With the existing network monitoring solutions they are also not able to gauge the end-user quality of experience.
End-to-end monitoring of IPTV will give the service provider complete knowledge of an outage, its scope and its magnitude. This knowledge will allow them to act proactively rather than reactively to any outage. This will in turn help them in providing better service, increasing Average Revenue Per User (ARPU) and customer loyalty.

4 QoS for IPTV


Quality of Service (QoS) is the measure of the performance of the system from the network perspective. This measure ensures that all network elements, protocols and related services operate as expected. Quality of Experience (QoE), however, relates to the overall acceptability of a service as perceived subjectively by the end user. QoE is a measure of end-to-end performance levels from the user's perspective and an indicator of how well the system meets the user's needs [1].
End-users' perception of the quality of a service differs depending on the application requirements. The IPTV service, delivered over the protocol stack shown in Fig. 2, has mainly the below-mentioned aspects of quality:
Customer: QoE as expected and perceived by end user.
Services: Service specific expectation such as connection time, transaction time,
video/audio quality, download speed etc.
Network: QoS in terms of Signaling, Transport, Network/Service delivery
elements.
Data or Content: Quality of data received by the customer.
A system that is able to measure all these aspects can truly give the E2E service
quality measurement. Various standards bodies, such as ITU-T and the Broadband Forum, have issued recommendations for the IPTV service. Measurement of end-to-end QoS for IPTV services will also have to be based on the underlying IPTV architecture and the type of network. Standards like ITU-T Y.1541 [10] define QoS classes for IP networks and need to be referred to while measuring service performance over a wireline network.

IPTV Protocol Stack:

Application Layer: HTTP, RTSP
Transport Layer: TCP; RTP/RTCP and MPEG2-TS over UDP; DHCP, DNS, SNMP over UDP
Network Layer: IPv4/IPv6, IGMP, RRC, RLC
Data Link Layer: Ethernet, ATM, AAL2..5, PDCP
Physical Layer: ADSL, VDSL, PON, RF (WCDMA L1/MAC)

Fig. 2. IPTV Protocol Stack

5 Approach for Monitoring


In the established market of satellite and cable television, IPTV brought in interactive
digital television, increasing the user expectations of responsiveness of the system and
digital video quality. Delivering this desired high service availability with required
QoS performance in such a complex and dynamic system requires continuous
monitoring and performance analysis. Based on ITU-T and Broadband forum
recommendations and our internal study, following approach is recommended for
end-to-end performance monitoring. The flow of approach is shown in Fig.3.
5.1 Monitoring Points and Methodology
In order to have a complete unified view, IPTV stack should be monitored at all
interfaces and at all devices, servers and network elements that are present between the SHE and the subscriber. The suggested monitoring points (shown in Fig. 1 [2]) and interfaces are listed below:
M1: Between the egress of the Content Provider and the ingress of the SHE.
M2: Domain border between the SHO and the Core Network.
M3: Domain border between the Core Network and the Access Network.
M4: Domain border between the Access Network and the Home Network.
M5: Between the Set-Top-Box (STB) and the TV.
Servers: Content Acquisition and Distribution Servers (VOD assets, VOD metadata), Middleware Servers, VHO Servers, Advertisement Server, Conditional Access (CA)/DRM Servers, Database Servers, and servers integrating with the service provider's Operational Support System (OSS) and Business Support Systems (BSS).
Devices: Routers, Digital Subscriber Line Access Multiplexers (DSLAM), Encoders, etc.
Network: Access Network, Home Network, etc.
Active/Passive probes or agents on all monitoring points mentioned above needs to be
planted. Active/Passive probes with varying capabilities (Video Content analysis,
Video quality, video signaling, Protocol identification, and application server
performance measurement) will be required to capture performance metrics
mentioned in Section-6. Active probe simulates user behavior by injecting test or
signaling packets while Passive approach uses devices or sniffers to watch the traffic
as it passes. It analyzes the performance on the basis of transaction/protocol/transport
stream metrics. True end-to-end monitoring will have to include client side
performance metrics measurement that can be done by either installing an agent on
client equipment or an agent in client locality. These agents will capture performance
metrics to enable performance measurement. For measuring customer experience at the Customer Premises Equipment (CPE), mainly two approaches are used: transport-based methods (packet loss, bit rate, loss distance, error packets, etc.) and payload-based methods (full-reference/no-reference) that compare the quality of the payload at ingest with that of the packets received at the CPE.
5.2 System Integration
Data captured from probes and agents goes as input to various systems: the Element Management System (EMS), Network Management System (NMS), Fault Management System (FMS), Performance Management System (PMS), etc. An end-to-end system will essentially mean integrating the output of these various systems. Data obtained from these systems will need to be aggregated, correlated and mapped onto the IPTV service flow.
For all performance metrics that affect the end-customer experience, threshold values need to be configured. If any breach of a threshold value or some trend of deterioration in a performance metric is observed, an alarm should be raised. Having a robust and flexible integration strategy will be key to the success of a proactive end-to-end performance monitoring solution for IPTV. All alarms raised go as input to the correlation engine.
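As a rough illustration of this threshold-and-alarm step (not any specific EMS/NMS product), the Python sketch below checks one aggregated KPI sample against a few of the limits listed in Section 6; the metric names and the sample format are assumptions of the example.

# Minimal sketch: flag threshold breaches in one KPI sample collected at a
# monitoring point; metric names and limits are illustrative assumptions.
THRESHOLDS = {
    "latency_ms":       lambda v: v < 200,      # latency below 200 ms
    "jitter_ms":        lambda v: v < 50,       # jitter below 50 ms
    "packet_loss_rate": lambda v: v < 1.22e-6,  # IP video stream packet loss rate
}

def check_sample(monitoring_point, sample):
    """Return alarm records for every metric in the sample that breaches its limit."""
    alarms = []
    for metric, within_limit in THRESHOLDS.items():
        value = sample.get(metric)
        if value is not None and not within_limit(value):
            alarms.append({"point": monitoring_point, "metric": metric, "value": value})
    return alarms

# Example: a sample taken at monitoring point M4 (access/home network boundary).
print(check_sample("M4", {"latency_ms": 180, "jitter_ms": 75, "packet_loss_rate": 3e-6}))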
5.3 Correlation Engine
A topology-based correlation engine that uses the topological relationships between network elements can be used to do the following:
- Isolate and unify region-wise alarms or notifications received from network elements or probes
- Establish dependencies between alarms and notifications
- Identify existing or impending issues based on QoE dependencies
- Deduce cause-and-effect relationships between alarms


Correlation of different alarms will help in isolating the exact location of faults and
thus will help in earlier fault diagnosis and its resolution.
5.4 Unified Dashboard
A unified dashboard will greatly improve the Trouble-to-Resolve (T2R) cycle for service providers. Views of performance metrics from the different demarcated points suggested above, timely alarms proactively generated based on the priority and probability of an impending issue, and role-wise visibility of the state of the system (SOS) are something every provider will look for. A connected environment where all the stakeholders are updated on the SOS in a timely manner will definitely improve the customer experience. In addition to end-to-end visibility, it will enable them to react to any outage even before the customer complains.

Fig. 3. Performance Approach Flow

6 Performance Metrics
Fig. 4 shows the performance metrics and their place of measurement in the end-to-end platform. The major performance parameters that need to be monitored for a good end-customer experience in IPTV are as follows:

Fig. 4. IPTV Performance Metrics & Monitoring Points


Monitoring the source stream: At the SHO/VHO, content obtained from various external sources is aggregated. It is very necessary to measure the source quality of the audio/video streams received at the SHO/VHO. If the source quality is not good, the service provider will not be able to deliver good-quality video to the customers.
Coding Parameters for Audio/Video: - Media streams received at SHO/VHO are
processed i.e. encoded, transrated etc. Encoding has a significant impact on video
quality. Efficiency of encoding aids in bandwidth efficiency while preserving the
video quality. Due to measures taken to conserve bandwidth, certain impairments are
inevitably introduced. Various parameters that affect QoE due to digitization and compression are: the codec standard used, bit rate, frame rate, Group of Pictures (GoP) structure and its length, motion vector search range, video rate shaping, frame width and height, interlacing, slices per I-frame, video reference clock rate, etc. As recommended by Broadband Forum TR-126 [4], provisional video application layer performance thresholds are shown in Table 1.
Table 1. Bit-rate Thresholds

Bit Rate              | Threshold value
SDTV broadcast video  | 1.75 Mbps CBR
HDTV broadcast video  | 10 Mbps CBR
VOD SDTV              | 2.1 Mbps CBR

Monitoring Servers/Devices: We need to monitor all servers and devices present from the headend to the STB, e.g. servers like the content acquisition server and content delivery servers, and devices like routers, DSLAMs, Optical Network Terminals (ONT) and edge routers. Some of the parameters that need to be monitored here are:
- CPU and memory utilization of servers
- Router utilization and traffic metrics
- Throughput and availability
- Behavior under heavy load conditions
- Response delays of servers, request errors and blocked requests

IP Network performance parameters: - Video is transported from the core network


through the access network to customer. It has to pass through series of routers and
switches and over the transmission systems that interconnect them. At each router or
switch there are queues that can introduce variable delays and if queue buffer
becomes too full it can lead to packet loss. In general, transmission system between
routers and switches does not contribute to performance degradation as it has
extremely low error rate. Major contributors are Access network and the home
network. Once video has left the headend, major factors that may impair video
are Loss/Latency/Jitter of packets and impairment in Transport stream structure. If
left unattended, these issues will cause perceivable impairment of video. Major


parameters that are suggested by TR-126 [4] and ITU-T G.1080 [1] that need to be
monitored here are:
- Packets lost, repaired and discarded. The impact of loss depends on the type of impairment: losses from I and P frames produce different impairments than B-frame packet losses
- Burst loss rate, burst loss length
- Gap length, gap loss and gap count
- Loss period, loss distance
- Jitter, which should be monitored as suggested by RFC 3550 [8]
- Decoder concealment algorithm used, which can mitigate some perceptual impact of losses
- Bandwidth utilization
As suggested by TR-126 [4] and ITU-T Y.1540 [3], the threshold limits for these parameters (shown in Table 2) should not be breached.
Table 2. Network Performance Thresholds

Parameter                        | Threshold value
Latency                          | Less than 200 ms
Loss distance                    | 1 error event per 4 hours
Jitter                           | Less than 50 ms
Max duration of a single error   | Less than 16 ms
IP video stream packet loss rate | Less than 1.22E-06
Jitter buffer fill               | 150-200 ms
I-frame delay                    | 500 ms

MPEG Transport parameters: Video is often carried in MPEG transport streams. MPEG-TS contains time stamps, sequence numbers, and program associations for packetized video streams. ETSI TR 101 290 suggests that Level 1, 2 and 3 parameters should be monitored by service providers [6]. It recommends various
checks including Synchronization Errors (TS Sync, Sync byte), Table errors (PAT,
PSI), PMT, Missing PID, CRC, PTS, and PCR etc. These metrics provide information
on key error types that occur with MPEG transport protocols, and are useful for
identifying and resolving error conditions.
Service Availability & Response Time: Channel change time, which depends on factors like IGMP (Internet Group Management Protocol) delay, buffer delay and decoding delay, is one of the major factors affecting the user's experience. Response times for VOD trick-play functions (pause, rewind) and EPG (electronic program guide) navigation, the time taken for authorization and authentication of the user to check his validity, and connection time


Table 3. User Experience Performance Thresholds

Parameter                        | Threshold value
Latency                          | Less than 200 ms
Loss distance                    | 1 error event per 4 hours
Jitter                           | Less than 50 ms
Max duration of a single error   | Less than 16 ms
IP video stream packet loss rate | Less than 1.22E-06
Jitter buffer fill               | 150-200 ms
I-frame delay                    | 500 ms

etc. affect the customer experience. As recommended by TR-126 [4], the threshold values of the parameters listed in Table 3 should not be breached in order to have a good end-user experience.
STB-related parameters: The video quality delivered to the customer depends on the decoding efficiency, buffer size and error concealment algorithm implemented in the STB. Jitter, which can have a significant impact on video quality, can be neutralized to some extent by the decoder buffer. STB boot time also plays a significant role in terms of user experience.
Synchronization of audio and video (lip synchronization): Audio and video should be synchronized in order to have a good viewing experience. The maximum thresholds recommended by TR-126 [4] are audio leading video by at most 15 ms and audio lagging video by at most 45 ms.

7 QoE in IPTV
For services like IPTV, where user satisfaction is the ultimate metric of performance,
a method to accurately measure the QoE is required. QoE is a subjective term and
using subjective measurements on large scale is not practical as this method relies on
input from actual users watching a video. Though this method is accurate, it is
expensive and too time consuming. Therefore objective methods are used for
estimating QoE. Objective measurements infer video quality based on the video stream without direct input from the users. It is a very challenging task to have objective measurements that can incorporate human perception. Objective measurement can be done by three methods:
- Payload based: J.144 (full-reference model) and Peak Signal-to-Noise Ratio (PSNR) [11]
- Codec-aware packet based: MPQM [9]
- Codec-independent packet based: MDI (Media Delivery Index) [7]



Table 4. Comparison of MDI and complementary techniques

MDI | Other complementary techniques (MPQM, PSNR)
MDI relies on packet-level information. | They look at packet-level information along with codec information and video header information.
No codec information is taken into account. | Incorporating codec information makes them computationally intensive.
It does not require a lot of hardware support. | They require a lot of hardware support.
It poorly correlates to human perception. | Some of them correlate to some extent with human perception.
It is easier to isolate problems in video quality. | Some techniques return a single number in the range 1-5 that gives little indication of video quality (e.g. impairments due to encoding and due to the network cannot be distinguished).
It is highly scalable and can thus monitor thousands of video streams simultaneously. | Limited scalability may cause issues for real-time services, which require continuous monitoring of thousands of streams.
It is suitable for the IPTV service. | Not that suitable for IPTV services.

Based on the comparison shown in Table 4, MDI seems to be the most suitable choice for applications like IPTV.
Media Delivery Index: As described in RFC 4445 [7], it is a diagnostic tool or quality indicator for monitoring networks intended to deliver video. The MDI is expressed as a delay factor (DF) and a media loss rate (MLR).
Delay Factor: This component indicates how many milliseconds of data the buffers must contain in order to eliminate jitter.
Media Loss Rate (MLR): It is simply defined as the number of lost or out-of-order media packets per second.
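For illustration only, the following minimal Python sketch computes a DF and MLR in the spirit of RFC 4445 from a list of observed packets; the stream bit rate, payload size and record format are assumptions of this example, not values mandated by the RFC.

def media_delivery_index(packets, stream_rate_bps, payload_bits):
    """packets: list of (arrival_time_s, sequence_number) for one measurement interval."""
    start = packets[0][0]
    received_bits = 0.0
    deltas = []                                         # virtual-buffer levels over the interval
    for arrival, _seq in packets:
        drained = (arrival - start) * stream_rate_bps   # bits drained at the nominal stream rate
        deltas.append(received_bits - drained)          # level just before this packet
        received_bits += payload_bits
        deltas.append(received_bits - drained)          # level just after this packet
    delay_factor_ms = (max(deltas) - min(deltas)) / stream_rate_bps * 1000.0

    interval_s = packets[-1][0] - start or 1.0
    expected = packets[-1][1] - packets[0][1] + 1
    media_loss_rate = (expected - len(packets)) / interval_s   # lost/out-of-order packets per second
    return delay_factor_ms, media_loss_rate

# Example: 100 packets of 7 MPEG-TS cells each, nominally 10 ms apart, some arriving 4 ms late.
pkts = [(i * 0.01 + (0.004 if i % 7 == 0 else 0.0), i) for i in range(100)]
print(media_delivery_index(pkts, stream_rate_bps=1_052_800, payload_bits=7 * 188 * 8))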
7.1 QoE Estimation
QoE is dynamic and depends on many factors. QoE should be measured continuously.
It is a function of many factors having different weights as shown in equation-1.
Weights might be calculated by simulating the end-to-end scenario and calculating the
overall contribution of each parameter to QoE.
Quality of video is affected by impairments introduced during encoding, decoding
process and in playback of reconstructed video signals. It is inevitably introduced due
to measures taken to conserve bandwidth like codec quantization level, longer GoP
structure, lower frame rate etc. Various human factors that affect user experience are
their emotional state, previous experience or service billing. For example, customers who have been watching TV on satellite or cable may be annoyed by the channel change delay


in IPTV. Environmental factors that may affect the user experience include whether the video is viewed on a mobile device, HDTV or SDTV; a particular video will be rated differently for HDTV and SDTV.
QoE for IPTV = f(w1*MDI + w2*Transport-layer parameters + w3*Availability + w4*Environment factors + w5*Encoding and decoding efficiency + w6*Human factors + w7*Service response time),   (1)
where w1 to w7 are the weights of the corresponding parameters.
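A minimal sketch of how equation (1) might be evaluated is given below; the weight values and the normalization of each factor to a [0, 1] score are assumptions of the example and would in practice be calibrated by end-to-end simulation, as described above.

# Illustrative weighted QoE estimate per equation (1); the weights are assumed values.
WEIGHTS = {
    "mdi": 0.25, "transport": 0.20, "availability": 0.15, "environment": 0.05,
    "codec_efficiency": 0.15, "human_factors": 0.05, "service_response": 0.15,
}

def qoe_score(subscores):
    """subscores: dict mapping factor name to a normalized quality score in [0, 1]."""
    return sum(WEIGHTS[name] * subscores.get(name, 0.0) for name in WEIGHTS)

print(qoe_score({"mdi": 0.9, "transport": 0.8, "availability": 0.99, "environment": 0.7,
                 "codec_efficiency": 0.85, "human_factors": 0.6, "service_response": 0.75}))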

8 Conclusion
In this competitive IPTV market, successful service providers have to fulfill subscribers' expectation of almost zero tolerance for poor or spotty video and unavailability of service. Delivering this kind of quality service on an IP network requires monitoring the service at all times and at all locations. The quality of IP video can be affected by impairments introduced during encoding, transmission of packets, decoding, etc. Video is so sensitive that the loss of a few packets may lead to frozen frames, blank screens, etc. Any such impairment may bring dissatisfaction, leading to customer churn and thus loss of revenue. In this scenario, tools for end-to-end monitoring of IPTV become extremely critical. The service provider needs to monitor the whole IPTV stack at various interfaces, servers and network elements for the various performance-level parameters that impact the QoE of customers. Monitoring those parameters and reflecting their effect on the QoE will enable the provider to take proactive action to resolve even impending problems and will also help expedite fault resolution. Customer experience will improve manyfold by detecting and resolving problems even before the customer reports them, or by informing the customer proactively of possible issues. Delivering quality IPTV services will improve customer loyalty and service provider revenue.

References
[1] Recommendation ITU-T G.1080, Quality of experience requirements for IPTV services (2008)
[2] Recommendation ITU-T G.1081, Performance monitoring points for IPTV (2008)
[3] Recommendation ITU-T Y.1540, Internet protocol data communication service - IP packet transfer and availability performance parameters (2007)
[4] TR-126, Triple-Play Services Quality of Experience (QoE) Requirements, Broadband Forum Technical Report (December 13, 2006)
[5] TR-135, Data Model for a TR-069 Enabled STB (December 2007)
[6] ETSI TR 101 290 V1.2.1, Technical Report - Digital Video Broadcasting (DVB); Measurement guidelines for DVB systems (2001-05)
[7] RFC 4445, A Proposed Media Delivery Index (MDI)
[8] RFC 3550, RTP: A Transport Protocol for Real-Time Applications
[9] Branden Lambrecht, C.J., Verscheure, O.: Perceptual Quality Measure using a Spatio-Temporal Model of the Human Visual System. In: Proc. SPIE, vol. 2668, pp. 450-461 (March 1996)
[10] ITU-T Y.1541 (02/06), Network performance objectives for IP-based services
[11] T1.TR.74-2001, Objective Video Quality Measurement Using a Peak-Signal-to-Noise-Ratio (PSNR) Full Reference Technique
[12] TR-135, Data Model for a TR-069 enabled Set Top Box

A Color Image Encryption Technique Based on a Substitution-Permutation Network

J. Mohamedmoideen Kader Mastan1, G.A. Sathishkumar2, and K. Bhoopathy Bagan3

1,2 Department of Electronics and Communication Engineering, Sri Venkateswara College of Engineering, Sriperumbudur-602105, India
3 Department of Electronics and Communication Engineering, Madras Institute of Technology, Chennai-600044, India
mastan_j_20@yahoo.co.in, sathish@svce.ac.in, kbb_mit@annauniv.edu

Abstract. In this paper we have proposed and tested an image encryption


technique consisting of matrix transformation, pixel diffusion and a permutation
box. Since the matrix transformation, which produces both confusion and
diffusion, is linear, the pixel diffusion and permutation have been introduced so
as to make the technique nonlinear. This technique is specifically designed for
sensitive fields like medicine where misinterpretation could result in loss of life.
One apt application of this technique is its use in PACS for secure storage and
transfer of medical images. The uniqueness of our technique is that it is
ciphertext sensitive so that decryption doesn't yield a recognizable image in
case there is an error during transmission. The proposed technique gives good
parametric and sensitivity results proving itself an eligible candidate for image
encryption.
Keywords: color image encryption, matrix transformation, pixel diffusion,
permutation, ciphertext sensitivity.

1 Introduction
With the advent of the Internet and wireless communication, instantaneous transfer of
text, images and multimedia to any point on earth has become feasible. However it is
a package deal. The more one is free to go, the smaller the boundary becomes. The
easier it is to communicate, the more unsafe it is. At this point data security plays an
important role. Since digital images play a vital role in areas like medicine and
military, their confidentiality is extremely important. However securing or encrypting
images is different from doing text in terms of high amount of data and its
redundancy, high correlation between pixels and different dimensions such as gray
scale and color.
The last decade has witnessed image encryption techniques [1-6] which have tried
to disrupt the correlation and the redundancy in the ciphered images to the best
possible extent. In an attempt to exploit the nature of visual perception, encryption
techniques using only pixel permutation have also been proposed [7]. Though these
A. Abraham et al. (Eds.): ACC 2011, Part IV, CCIS 193, pp. 524-533, 2011.
© Springer-Verlag Berlin Heidelberg 2011


permutation-only techniques seem to serve the purpose, they are completely vulnerable to known-plaintext attacks [8]. Hence image encryption techniques too
should have the basic requirements namely, confusion, diffusion [9] and avalanche
effect.
In this paper we have proposed a color image encryption technique involving
matrix transformation, pixel diffusion and permutation. While the matrix
transformation, which produces good confusion and diffusion, is linear, the pixel
diffusion and permutation have been introduced so as to make the technique
nonlinear. This technique is specifically designed for sensitive fields like e-medicine
where misdiagnosis of diseases could result in loss of life. Our technique is unique
because it is ciphertext sensitive, so that decryption doesn't yield a recognizable image in case there is an error during transmission, thereby avoiding life-threatening conclusions otherwise drawn by the physician.
One good application of our technique is in PACS (Picture Archiving and
Communication System) through which medical images are archived and
communicated with confidentiality among physicians and related people. Inadequate
security of the medical records and their intentional or accidental leakage could cost
the medical center [10]. The rest of the paper is organized as follows. Initially, section
2 is about the encryption technique, followed by the experimental results in section 3.
Subsequently the key space and sensitivity analysis are done in section 4, while
section 5 constitutes the efficiency analysis. Finally conclusion remarks are drawn in
section 6.

2 Encryption Process
This technique differs significantly from our previous work [11] in three ways: we have strengthened the key schedule, strengthened the encryption process and, additionally, designed it for color images. Throughout this paper we have considered 24-bit
RGB images as test images.
2.1 Matrix Transformation
The matrix transformation (MT) used in our technique is the traditional Hill cipher [12], where a key matrix of size 8x8 with values from 0 to 255 is generated. If the generated matrix is not a valid key matrix (i.e. it does not have an odd determinant), only the main-diagonal elements are re-randomized until a valid key matrix is obtained. Randomizing the main diagonal gives a faster result.
Encryption: C = Mat * P mod 256   (1)
Decryption: P = Mat^(-1) * C mod 256   (2)
where P and C are the plaintext and ciphertext respectively, in the form of a 1x8 matrix,
and Mat is the key matrix. The key schedule generates 3 additional matrices by
swapping rows and columns of the key matrix as shown in Fig. 1. Each channel of the
color image is subjected to MT independently.
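As a rough sketch of the matrix transformation (not the authors' exact implementation), the following Python code generates a key matrix with an odd determinant by re-randomizing the main diagonal, and applies equation (1) to one 8-pixel block; the use of NumPy, the GF(2) parity test and the block orientation are assumptions of the example.

import numpy as np

rng = np.random.default_rng(42)

def det_is_odd(mat):
    """Determinant parity via Gaussian elimination over GF(2) (avoids float overflow)."""
    m = (np.array(mat) % 2).astype(np.uint8)
    n = m.shape[0]
    for col in range(n):
        pivots = np.nonzero(m[col:, col])[0]
        if pivots.size == 0:
            return False                      # determinant is even, not invertible mod 256
        piv = col + pivots[0]
        m[[col, piv]] = m[[piv, col]]
        for row in range(col + 1, n):
            if m[row, col]:
                m[row] ^= m[col]
    return True

def generate_key_matrix(size=8):
    mat = rng.integers(0, 256, size=(size, size), dtype=np.int64)
    while not det_is_odd(mat):                # re-randomize only the main diagonal
        np.fill_diagonal(mat, rng.integers(0, 256, size=size))
    return mat

def mt_encrypt_block(mat, block):             # equation (1): C = Mat * P mod 256
    return (mat @ np.asarray(block, dtype=np.int64)) % 256

key = generate_key_matrix()
print(mt_encrypt_block(key, [10, 20, 30, 40, 50, 60, 70, 80]))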


Generation of
key matrix
Mat1

Swapping
consecutive
rows to
generate Mat2

Swapping
consecutive
columns to
generate Mat3

Swapping
consecutive
rows to
generate Mat4

XOR values
in each row to
generate Key1

XOR values in
each row to
generate Key2

XOR values in
each column to
generate Key3

XOR values in
each row to
generate Key4

Fig. 1. Key schedule

2.2 Pixel Diffusion


Initially the rows or the columns of the generated matrices are bitwise XORed to get
key arrays of 8 elements. Here we have used linear indexing of image pixels where
a(1,1) = a(1), a(2,1) = a(2) and so on. The key schedule is clearly depicted in Fig. 1.
Each channel of the color image is subjected to Pixel Diffusion independently. Two
types of pixel diffusion are done which are,
Single Pixel Diffusion (SPD): Let k1 be the key array generated for SPD. Let a(i) denote the intensity of the pixel with index i, r be the number of rows in the image and p be the number of pixels in the image. Indices wrap around, i.e. index p+1 is treated as 1. The pseudocode of SPD is as follows:
Step 1: a(1:8) <- a(1:8) XOR k1
Step 2: i <- 1
Step 3: a(i+1) <- a(i) XOR a(i+1)
Step 4: i <- i+1
Step 5: Repeat Step 3 till i = 1
Step 6: a(i+1) <- (a(i) + a(i+1)) mod n
Step 7: i <- i+1
Step 8: Repeat Step 6 for r-1 times
Block Pixel Diffusion (BPD): Let k1 be the key array generated for BPD. The pseudocode of BPD is as follows:
Step 1: a(1:8) <- a(1:8) XOR k1
Step 2: i <- 1
Step 3: a(i+8:i+15) <- a(i+8:i+15) XOR a(i:i+7)
Step 4: i <- i+7
Step 5: Repeat Step 3 till i = 1
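A minimal Python sketch of the two diffusion passes is given below; it assumes 0-based indexing on a flattened channel, n = 256, wrap-around at the end of the array and, for simplicity, non-overlapping 8-pixel blocks in BPD, so it is an illustration of the idea rather than the authors' exact routine.

def single_pixel_diffusion(a, k1, rows):
    """a: flat list of pixel values (0..255); k1: 8-element key array; rows: image rows."""
    p = len(a)
    a[:8] = [x ^ k for x, k in zip(a[:8], k1)]         # Step 1: whiten the first 8 pixels
    for i in range(p):                                  # Steps 2-5: XOR chain, wrapping around
        a[(i + 1) % p] ^= a[i]
    for i in range(rows - 1):                           # Steps 6-8: additive chain, r-1 times
        a[(i + 1) % p] = (a[i] + a[(i + 1) % p]) % 256
    return a

def block_pixel_diffusion(a, k1):
    a[:8] = [x ^ k for x, k in zip(a[:8], k1)]
    for i in range(0, len(a) - 15, 8):                  # XOR each 8-pixel block into the next
        for j in range(8):
            a[i + 8 + j] ^= a[i + j]
    return a

pixels = list(range(32))
print(block_pixel_diffusion(single_pixel_diffusion(pixels, [7] * 8, rows=4), [3] * 8))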
2.3 Permutation
This is the stage which makes the red, green and blue channels of the image interdependent. Let Ri, Gi and Bi be the pixels of the red, green and blue channels


[Fig. 2 (block diagram): the plain image passes through a pipeline of matrix transformations (using Mat1-Mat4), pixel diffusion stages (BPD^-1 with Key1, SPD^-1 with Key2, SPD with Key3, BPD with Key4) and permutation boxes (PB) to produce the cipher image; the key matrices and key arrays are generated as in Fig. 1.]
N.B.: ' denotes matrix transpose; MT - Matrix Transformation; SPD - Single Pixel Diffusion; BPD - Block Pixel Diffusion; SPD^-1 - inverse of the SPD algorithm; BPD^-1 - inverse of the BPD algorithm; PB - Permutation Box.

Fig. 2. Encryption process

respectively at index i. The following permutation box (PB) is applied to sets of 4


pixels in all the channels.
Input block:        Output block:
R1 R2 R3 R4         G2 B1 B4 G3
G1 G2 G3 G4         R2 B2 B3 R3
B1 B2 B3 B4         G1 R1 R4 G4
The decryption is done in the reverse direction of the encryption process. However, the MT block employs the inverse of the corresponding key matrix used in the encryption process, and the inverses of the BPD^-1, SPD^-1, SPD, BPD and PB algorithms are used with the same corresponding keys.
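The following small Python function illustrates the permutation box on one group of four pixels per channel, following the input/output blocks shown above; the channel ordering and 0-based positions are assumptions of the example.

def permutation_box(r, g, b):
    """r, g, b: lists of 4 pixel values each; returns the permuted (R, G, B) group."""
    new_r = [g[1], b[0], b[3], g[2]]
    new_g = [r[1], b[1], b[2], r[2]]
    new_b = [g[0], r[0], r[3], g[3]]
    return new_r, new_g, new_b

print(permutation_box([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]))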

Fig. 3. The original image Mandrill.png (left), the encrypted image (middle) and the decrypted
image (right)


Fig. 4. Histogram of the red, green and blue channels of the Mandrill.png (top) and those of the
encrypted Mandrill.png (bottom)

3 Experimental Results
3.1 Visual Test
We have tested our technique over a variety of RGB images such as Barbara, fabric,
F16, Heart, Lena, Mandrill, Peppers. Without loss of generality, we have shown the
results of a typical natural image, Mandrill.png and a typical medical image, heart.jpg.
The encrypted images in Fig. 3 and Fig. 5 don't have any resemblance to the original images. Besides, the histograms at the bottom of Fig. 4 and Fig. 6 don't reveal any information about the original images and show equally probable intensity values.
3.2 Information Entropy Analysis
Entropy h is a cumulative measure of the frequency of the intensity levels in
an image. Due to the characteristic of the human eye of being insensitive to
high frequency components, an image of high entropy is not visually perceivable.
Moreover if the entropy of a signal is high the signal looks random. Entropy,

Fig. 5. Original heart.jpg (left), encrypted heart.jpg (middle) and the decrypted image (right)


Fig. 6. Histogram of red, green and blue channels of the original heart.jpg (on the top) and the
encrypted heart.jpg (on the bottom)

h = -SUM_i (pi * log2 pi), where pi is the frequency of intensity level i in the image. The maximum h an 8-bit image can attain is 8. The average of our results in Table 1 is 7.99975. Hence an entropy attack is difficult to launch.
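For illustration, a short Python/NumPy sketch of this entropy measure on one 8-bit channel is shown below; the random test image is an assumption of the example.

import numpy as np

def entropy(channel):
    """h = -sum(p_i * log2 p_i) over the intensity histogram of an 8-bit channel."""
    counts = np.bincount(channel.ravel(), minlength=256)
    p = counts[counts > 0] / channel.size
    return float(-(p * np.log2(p)).sum())

print(entropy(np.random.default_rng(0).integers(0, 256, (512, 512), dtype=np.uint8)))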
3.3 Cross-Correlation

The cross-correlation coefficient CAB between images A and B quantifies the level to which the image pixels are relatively randomized; the closer it is to zero, the better. Our technique produces an absolute cross-correlation of about 10^-4 in most of the cases, making a statistical attack tough.

CAB = SUM_{i,j} (Ai,j - mA)(Bi,j - mB) / sqrt( SUM_{i,j} (Ai,j - mA)^2 * SUM_{i,j} (Bi,j - mB)^2 ),   (3)

where Ai,j is the pixel in the i-th row and j-th column of A, r and c are the number of rows and columns in each channel of the image respectively, and the channel means are

mA = (1/(r*c)) SUM_{i,j} Ai,j,   mB = (1/(r*c)) SUM_{i,j} Bi,j.   (4)
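A small Python/NumPy sketch of equations (3)-(4) follows; the random test data are an assumption of the example.

import numpy as np

def cross_correlation(a, b):
    """C_AB between two equally sized channels, per equations (3)-(4)."""
    a = np.asarray(a, dtype=np.float64) - np.mean(a)
    b = np.asarray(b, dtype=np.float64) - np.mean(b)
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

rng = np.random.default_rng(1)
print(cross_correlation(rng.integers(0, 256, (64, 64)), rng.integers(0, 256, (64, 64))))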

Table 1. Entropy of original and encrypted test images

Image     | Barbara | Fabric | F16    | Heart.jpg | Lena.tif | Mandrill.png | Peppers.jpg
Original  | 7.6919  | 7.5632 | 6.6639 | 4.9830    | 7.7502   | 7.7624       | 7.7112
Encrypted | 7.9998  | 7.9998 | 7.9997 | 7.9995    | 7.9998   | 7.9998       | 7.9998

3.4 Net Pixel Change Rate


Net Pixel Change Rate (NPCR) is the measure of the number of pixels changed between two images A and A':

NPCR = ( SUM_{i,j} D(i,j) / (r*c) ) x 100%,  where D(i,j) = 0 if Ai,j = A'i,j and 1 if Ai,j differs from A'i,j.   (5)

The expected NPCR for a good encryption technique is (1 - 2^(-n)) x 100%, i.e. about 99.6094% for n = 8. Our results average to 99.60888%.
3.5 Unified Average Change in Intensity

The Unified Average Change in Intensity (UACI) is a measure of the degree to which the pixels vary between two images:

UACI = ( 1/(r*c) ) SUM_{i,j} ( |Ai,j - A'i,j| / 255 ) x 100%.   (6)

The expected UACI for a good encryption scheme is ((2^n + 1)/(3*2^n)) x 100%, i.e. about 33.4635% for n = 8. Our encryption scheme reaches an average UACI of 33.48185%.
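The two measures of equations (5)-(6) can be computed as in the short Python/NumPy sketch below; the random test channels are an assumption of the example (two independent uniform images give values close to the expected 99.61% and 33.46%).

import numpy as np

def npcr_uaci(a, b):
    """NPCR and UACI, per equations (5)-(6), for two 8-bit channels of equal size."""
    a = np.asarray(a, dtype=np.int16)
    b = np.asarray(b, dtype=np.int16)
    npcr = float(np.mean(a != b) * 100.0)
    uaci = float(np.mean(np.abs(a - b) / 255.0) * 100.0)
    return npcr, uaci

rng = np.random.default_rng(2)
print(npcr_uaci(rng.integers(0, 256, (256, 256)), rng.integers(0, 256, (256, 256))))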


Table 2. Parametric results of encryption of Mandrill.png, heart.jpg and Lena.tif

Mandrill.png (512x512x3):
        |CAB|     NPCR%    UACI%
R vs R  1x10^-3   99.6056  29.9482
R vs G  1x10^-3   99.5975  30.0023
R vs B  1x10^-3   99.6006  29.9843
G vs R  2x10^-4   99.5705  28.5434
G vs G  6x10^-4   99.6265  28.5987
G vs B  7x10^-4   99.6178  28.5877
B vs R  2x10^-5   99.6067  31.2287
B vs G  3x10^-5   99.6166  31.2768
B vs B  1x10^-3   99.6265  31.2747
Avg               99.6075  29.9383

Heart.jpg (360x360x3):
        |CAB|     NPCR%    UACI%
R vs R  7x10^-3   99.6134  41.2312
R vs G  2x10^-3   99.6173  41.1535
R vs B  3x10^-3   99.5941  41.0656
G vs R  6x10^-3   99.5049  43.2241
G vs G  1x10^-3   99.598   43.1686
G vs B  3x10^-3   99.5957  43.1288
B vs R  5x10^-3   99.6173  44.344
B vs G  8x10^-4   99.635   44.3225
B vs B  2x10^-3   99.6111  44.2663
Avg               99.5985  42.8782

Lena.tif (512x512x3):
        |CAB|     NPCR%    UACI%
R vs R  2x10^-3   99.6159  32.91182
R vs G  7x10^-4   99.6223  33.08813
R vs B  1x10^-3   99.6071  33.02256
G vs R  1x10^-3   99.5953  30.59989
G vs G  4x10^-4   99.6025  30.6331
G vs B  4x10^-4   99.6201  30.64442
B vs R  8x10^-4   99.6067  27.58898
B vs G  3x10^-4   99.5998  27.60079
B vs B  8x10^-4   99.5991  27.6221
Avg               99.6076  30.41131

Fig. 7. Image decrypted by a wrong key (left) and its consolidated histogram (right)


4 Key Space and Sensitivity Analysis


4.1 Key Space Analysis

Since the initial key generated is an 8x8 matrix with values from 0 to 255, the effective key space is calculated using the formula from [13] as

|K| = n^(m^2) * PROD_{qi | n} PROD_{k=1..m} (1 - qi^(-k)) = 3.887x10^153,   (7)

where qi runs over the prime factors of n = 256 and m = 8, which is sufficient to resist a brute-force attack.
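A quick check of equation (7) for n = 256 and m = 8 (i.e., counting the invertible 8x8 matrices over Z_256) can be done with exact rational arithmetic, as in the sketch below; the use of Python's Fraction type is simply a convenience of the example.

from fractions import Fraction

def key_space(n, m, prime_factors):
    """Number of invertible m x m matrices over Z_n, per equation (7)."""
    total = Fraction(n) ** (m * m)
    for q in prime_factors:
        for k in range(1, m + 1):
            total *= 1 - Fraction(1, q ** k)
    return total

print(float(key_space(256, 8, prime_factors=[2])))   # approximately 3.887e153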

4.2 Decryption Key Sensitivity


We have shown the sensitivity test results with respect to Mandrill.png. We tested our
technique by decrypting an encrypted image using the decryption key changed by one
bit and found that no trace of the original image is present in the decrypted image and
the histogram is flat (Fig. 7). This ensures that a partial decryption of the image is
infeasible.
4.3 Encryption Key Sensitivity
We have tested the encryption key sensitivity of our technique by comparing the
ciphered images obtained using 2 keys varying by 1 bit. The results average to an
NPCR of 99.6072% and a UACI of 33.4685%. The parametric results presented in
Table 3 confirm that it is difficult to analyze the encryption technique based on
similar keys.
Table 3. Sensitivity test results with Mandrill.png

Encryption key sensitivity:
        |CAB|     NPCR%    UACI%
R vs R  2x10^-3   99.5979  33.4923
R vs G  5x10^-4   99.5995  33.4775
R vs B  1x10^-3   99.6056  33.5062
G vs R  7x10^-4   99.614   33.514
G vs G  7x10^-4   99.6212  33.4644
G vs B  1x10^-3   99.612   33.4844
B vs R  3x10^-3   99.6086  33.4054
B vs G  2x10^-3   99.6056  33.4375
B vs B  1x10^-3   99.601   33.4356
Avg               99.6072  33.4685

Plaintext sensitivity:
        |CAB|     NPCR%    UACI%
R vs R  1x10^-3   99.6067  33.482
R vs G  1x10^-3   99.6033  33.4443
R vs B  4x10^-4   99.6033  33.49
G vs R  1x10^-3   99.6162  33.4938
G vs G  4x10^-3   99.5293  33.5271
G vs B  3x10^-4   99.5995  33.4658
B vs R  5x10^-5   99.5922  33.4635
B vs G  3x10^-4   99.6067  33.458
B vs B  2x10^-3   99.6124  33.5261
Avg               99.5966  33.4834

Ciphertext sensitivity:
        |CAB|     NPCR%    UACI%
R vs R  1x10^-3   99.6586  33.5538
R vs G  1x10^-3   99.6147  33.4318
R vs B  1x10^-3   99.6071  33.4821
G vs R  7x10^-4   99.6246  33.5148
G vs G  6x10^-4   99.704   33.5113
G vs B  6x10^-4   99.6159  33.4595
B vs R  1x10^-3   99.6063  33.5522
B vs G  2x10^-3   99.6048  33.4317
B vs B  1x10^-3   99.5689  33.505
Avg               99.6227  33.4935

4.4 Plaintext Sensitivity


As plaintext sensitivity is closely related to differential cryptanalysis, we have
analyzed the parametric results shown in Table 3 after encrypting two versions of Mandrill.png differing by 1 bit. The average of the NPCR values is 99.5966% and the UACI is
33.4834%. The results confirm that a differential attack is infeasible.


4.5 Ciphertext Sensitivity


Error tolerance is generally expected from cryptographic techniques. However, when
strong authentication techniques are used, error tolerance is useless. Moreover in
certain fields like medicine, decisions shouldn't be taken based upon corrupted
images. Hence we wanted our technique to be ciphertext sensitive so that a natural
error or an intentional change in the ciphertext should lead to a non-recognizable
decrypted image. The parametric results shown in Table.3 after decrypting encrypted
Mandrill.png varied by 1 bit bolster the fact that any change in the encrypted image
corrupts the entire image to a non-perceivable form. The average of the NPCR data is
99.6227% and that of UACI data is 33.4935%.

5 Efficiency Analysis
We have implemented the technique in MATLAB 7.10 on a PC equipped with an Intel Core 2 Duo T5550 @ 1.83 GHz, 2 GB RAM and 32-bit Windows 7 Ultimate OS. Theoretically, both the encryption and decryption algorithms have the same complexity. It can be seen from Table 4 that as the image dimensions increase, the bit rate increases due to the parallelism of the matrix transformation and the permutation box. Our technique is faster than the MATLAB implementation of AES [16], which takes at least 1742 seconds (in its fastest mode of encryption) to encrypt an 8-bit image of size 512x512x3.
Table 4. Efficiency analysis

Spatial resolution of the image | Time taken for encryption (s) | Time taken for decryption (s) | Average bit rate for encryption/decryption
360x360x3 | 1.878208 | 1.832749 | 1.6 Mbps
512x512x3 | 3.648261 | 3.652164 | 1.68 Mbps
640x480x3 | 4.214025 | 4.241128 | 2.27 Mbps

6 Conclusion
This paper presents a substitution-permutation network based encryption technique
for color images. The key space, parametric and sensitivity test results mentioned
show the cryptographic strength of the technique. The technique resists brute force,
entropy, statistical, known/chosen plaintext and differential attacks. This is the first
color image encryption technique which is ciphertext sensitive. Unlike other image
encryption techniques this technique has 0% error tolerance so that lethal decisions
are not taken based on corrupted images. The technique is faster than AES and can be
used in real time secure image transmission.

References
1. Sathish Kumar, G.A., Bhoopathy Bagan, K., Vivekanand, V.: A novel algorithm for image encryption by integrated pixel scrambling plus diffusion [IISPD] utilizing duo chaos mapping applicability in wireless systems. Procedia Computer Science 3, 378-387 (2011)
2. Mao, Y., Chen, G., Lian, S.: A novel fast image encryption scheme based on 3D chaotic Baker Maps. International Journal of Bifurcation and Chaos 14(10), 3613-3624 (2004)
3. Mu, X.-C., Song, E.-N.: A new color Image Encryption Algorithm Based on 3D Lorenz Chaos Sequences. In: First International Conference on Pervasive Computing, Signal Processing and Applications, pp. 269-272 (2010)
4. Liu, S., Sun, J., Xu, Z.: An improved image encryption algorithm based on chaotic system. Journal of Computers 4(11) (2009)
5. Fridrich, J.: Symmetric ciphers based on two-dimensional chaotic maps. Int. J. Bifurcation and Chaos 8, 1259-1284 (1997)
6. Socek, D., Magliveras, S.S., Furht, B.: Enhanced 1-D Chaotic Key-Based Algorithm for Image Encryption. In: First International Conference on Security and Privacy for Emerging Areas in Communications Networks, pp. 406-407 (2005)
7. Usman, K., Juzoji, H., Nakajima, I., Soegidjoko, Ramdhani, M., Hori, T., Igi, S.: Medical image encryption based on pixel arrangement and random permutation for transmission security. In: 9th International Conference on e-Health Networking, Application and Services, pp. 244-247 (2007)
8. Li, S., Li, C., Chen, G., Bourbakis, N.G., Lo, K.-T.: A general quantitative cryptanalysis of permutation-only multimedia ciphers against plaintext attacks. Signal Processing: Image Communication 23(3), 212-223 (2008)
9. Shannon, C.E.: Communication theory of secrecy systems. Bell Syst. Techn. J. 28, 656-715 (1949)
10. http://www.healthcareitnews.com/news/hhs-cracks-down-provider-pay-100000-hipaa-penalties-overlost-laptops
11. Mohamedmoideen Kader Mastan, J., Sathishkumar, G.A., Bagan, K.B.: Digital Image Security using Matrix and Non-Linear Pixel Transformation. In: International Conference on Computer, Communication, and Electrical Technology, vol. 1 (2011)
12. Hill, L.S.: Cryptography in an Algebraic Alphabet. The American Mathematical Monthly 36(6), 306-312 (1929)
13. Overbey, J., Traves, W., Wojdylo, J.: On the keyspace of the Hill cipher. Cryptologia 29(1), 59-72 (2005)
14. Forouzan, B.A.: Cryptography & Network Security. Tata McGraw-Hill, New York (2009), ISBN-13: 978-0-07-066046-5
15. Schneier, B.: Applied Cryptography: Protocols, Algorithms and Source Code in C, 2nd edn. Wiley, NY (1995)
16. Buchholz, J.J.: Matlab Implementation of the Advanced Encryption Standard (2001), http://buchholz.hs-bremen.de/aes/aes.htm

Comment on the Improvement of an Efficient ID-Based RSA Multisignature
Chenglian Liu1,3,*, Marjan Kuchaki Rafsanjani2, and Liyun Zheng1

1 Department of Maths and Computer Science, Fuqing Branch of Fujian Normal University
chenglian.liu@gmail.com
2 Department of Computer Science, Shahid Bahonar University of Kerman
kuchaki@mail.uk.ac.ir
3 Department of Mathematics, Royal Holloway, University of London

Abstract. In 2008, Harn and Ren proposed an efficient identity-based RSA multisignature scheme based on Shamir's identity-based signature. In 2010, Yang et al. pointed out two methods that they presumed make the Harn-Ren scheme insecure. This paper proves that Yang et al.'s first forgery attack is incorrect and that the Harn-Ren scheme is still secure.
Keywords: Multisignature, Identity-based signature, RSA Cryptosystem.

1 Introduction
With the growth of the Internet, the digital signature has become very important to electronic commerce; it provides the cryptographic services of authentication and data integrity where agreement between signer and verifier is required. In 1984, Shamir [3] proposed the concept of an identity-based signature (IBS) scheme based on the integer factorization problem. Harn and Ren [1] proposed an efficient identity-based multisignature based on Shamir's scheme. Each signer needs to register at a private key generator (PKG) and identify himself before being able to join the network. Once a signer is accepted, the PKG generates a secret key for that signer based on the signer's identity and related information. In 2010, Yang et al. [4] proposed two forgery attacks on the Harn-Ren scheme. They claimed their methods could be successful, and also suggested improvements. In this paper, we show that Yang et al.'s first claim is incorrect and that the Harn-Ren scheme is still secure.

2 Review of Harn-Ren Scheme


Harn and Ren proposed [1] an identity-based multisignature scheme in 2008. Their
description follows the model proposed in Micali et al. [2].


* Corresponding Author: Mr. Liu is with the Department of Mathematics and Computer Science, Fuqing Branch of Fujian Normal University, China.

A. Abraham et al. (Eds.): ACC 2011, Part IV, CCIS 193, pp. 534-540, 2011.
© Springer-Verlag Berlin Heidelberg 2011



2.1 PKG
The PKG chooses its public and private key pairs as follows:
1. Runs the probabilistic polynomial algorithm Krsa to generate two random large
primes, p and q.
2. Chooses a random public key e such that gcd(e, φ(n)) = 1 and computes the private key
d ≡ e^(-1) (mod φ(n)).   (1)
2.2 Multisignature Generation
Signer secret key generation. In this algorithm, the signer gets a copy of his secret
key from the PKG through a two-step process:
1. A signer submits his identity to the PKG.
2. The PKG, with its private key d and the corresponding public key e, signs the message digest of the identity, denoted as ij , by generating a secret key gj , such that
gj ≡ ij^d (mod n).   (2)

The gj is the signer ij's secret key. We will not distinguish between the identity and
its message digest.
Message signing. To generate an identity-based multisignature, each signer carries out
the followings steps:
1. Chooses a random integer rj and computes
tj ≡ rj^e (mod n).   (3)

2. Broadcasts rj to all the signers.


3. Upon receiving rj, j = 1, 2, ..., l, each signer computes

t ≡ PROD_{j=1..l} rj (mod n),   (4)

and

sj ≡ gj * rj^H(t,m) (mod n).   (5)

4. Broadcasts sj to all the signers.


5. After receiving sj, j = 1, 2, ..., l, the multisignature component s can be computed as

s ≡ PROD_{j=1..l} sj (mod n).   (6)

The multisignature for message m is σ = (t, s). From the above algorithm, it is clear that the signing phase of each individual signature is identical to the original IBS scheme. It is also clear that the length of the multisignature is the same as that of an individual IBS.

536

C. Liu, M.K. Rafsanjani, and L. Zheng

2.3 Multisignature Verification


To verify a multisignature σ = (t, s) on a message m by signers whose identities are i1, i2, ..., il, one verifies the following:

s^e ≡ (i1 * i2 * ... * il) * t^H(t,m) (mod n).   (7)

If it holds, the identity-based multisignature is valid; otherwise it is invalid.
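As an illustration only, the toy Python sketch below walks through the scheme end to end with small, insecure parameters and a stand-in hash; note that, consistent with the erratum discussed in Section 4.1, t is taken here as the product of the broadcast tj values so that equation (7) holds.

import hashlib, math

p, q = 1009, 1013                       # toy primes; never use such sizes in practice
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)                     # PKG private key (Python 3.8+)

def H(t, m):                            # stand-in for the hash H(t, m)
    return int.from_bytes(hashlib.sha256(f"{t}|{m}".encode()).digest(), "big") % n

identities = [123457, 234571, 345677]                 # signer identities i_j
secret_keys = [pow(i, d, n) for i in identities]      # g_j = i_j^d mod n, from the PKG

m = "joint statement"
r = [3, 5, 7]                                         # each signer's random r_j
t_vals = [pow(rj, e, n) for rj in r]                  # t_j = r_j^e mod n (broadcast)
t = math.prod(t_vals) % n                             # see Section 4.1: product of the t_j
s_vals = [(g * pow(rj, H(t, m), n)) % n for g, rj in zip(secret_keys, r)]
s = math.prod(s_vals) % n

lhs = pow(s, e, n)
rhs = (math.prod(identities) * pow(t, H(t, m), n)) % n
print(lhs == rhs)                                     # True: equation (7) holds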


[Fig. 1 (message flow between signers and receiver): each signer picks a random rj in Zn and computes tj ≡ rj^e (mod n); rj is broadcast to all the signers; each signer computes t ≡ PROD_{j=1..l} rj (mod n) and sj ≡ gj * rj^H(t,m) (mod n) and broadcasts sj; then s ≡ PROD_{j=1..l} sj (mod n) is formed and (σ, m) with σ = (t, s) is sent to the receiver, who checks s^e ≡ (i1 * i2 * ... * il) * t^H(t,m) (mod n).]

Fig. 1. Harn-Ren scheme

3 Yang et al. Attack and Improvement


Yang et al. [4] initiated two forgery attacks on the Harn-Ren scheme, and claimed that the attacker is able to steal the signer's secret key using their first method. The details are described below:
3.1 Yang et al.'s Forgery Attack I

Anyone using the broadcast data (rj, sj) is able to obtain the signer's secret key gj and the signature (σ, m). When the attacker intercepts rj and sj in the broadcasting process, the signer's secret key gj can be computed from the formula

sj ≡ gj * rj^H(t,m) (mod n),   (8)

which also can be expressed as

gj ≡ sj * (1 / rj^H(t,m)) (mod n).   (9)

In this formula,

1 / rj^H(t,m) (mod n)   (10)

is the inverse element of

rj^H(t,m) (mod n)   (11)

in the multiplicative group modulo n. Therefore, the Harn-Ren scheme does not protect the signer's secret key from being exposed.
[Fig. 2 (message flow): as in Fig. 1, but each signer computes sj ≡ gj^t * rj^H(t,m) (mod n), and the receiver checks s^e ≡ (i1 * i2 * ... * il)^t * t^H(t,m) (mod n).]

Fig. 2. Yang et al. scheme

3.2 Yang et al.'s Improvement

The PKG key phase and the signer secret key generation phase are the same as in the original scheme and need not be improved.

Signing phase. Suppose l signers, i1, i2, ..., il, plan to jointly sign a document m; then each signer proceeds as follows:

a) Signer ij chooses a random number rj and computes

tj ≡ rj^e (mod n).   (12)

b) Broadcast rj to all signers.


c) After receiving rj from all the signers, compute

t ≡ PROD_{j=1..l} rj (mod n),   (13)


and compute

sj ≡ gj^t * rj^H(t,m) (mod n)   (14)

using his own secret key.


d) Broadcast sj to all signers.
e) After receiving sj from the others, compute

s ≡ PROD_{j=1..l} sj (mod n);   (15)

the multisignature of the complete message m is σ = (t, s).


Verification Phase. When the receiver receives multisignature message (m, ), the
public key e of the PKG and the identities of all the signers i1 , i2 , , ij can be used
to verify the validity of the signature. The verification formula is as follow:
?

se (i1 , i2 , , ij )t tH(t,m)

(mod n).

(16)

If verification is successful, then the information has a legitimate signature. Otherwise,


it is an illegal signature. Figure 3 shows the detailed procedure for this signature and
verification process.

4 Our Comment
4.1 Erratum to Harn-Ren Scheme
In this section, we point out an erratum as follows. The centre should broadcast tj to all the signers, and each signer computes the parameter t, where

t ≡ PROD_{j=1..l} tj (mod n).   (17)

If the centre broadcasts rj to all the signers, then each signer computes

t ≡ PROD_{j=1..l} rj (mod n),   (18)

and sends it to the receiver. The signers then do not pass the verification phase, where

s^e ≡ (i1 * i2 * ... * il) * t^H(t,m) (mod n).   (19)

[Fig. 3 (message flow): it should be tj that is sent; each signer computes tj ≡ rj^e (mod n) and broadcasts tj to all the signers (otherwise an error arises in the verification phase); each signer computes t ≡ PROD_{j=1..l} tj (mod n) and sj ≡ gj * rj^H(t,m) (mod n) and broadcasts sj; then s ≡ PROD_{j=1..l} sj (mod n) is formed and (σ, m) with σ = (t, s) is sent to the receiver, who checks s^e ≡ (i1 * i2 * ... * il) * t^H(t,m) (mod n).]

Fig. 3. Erratum to Harn-Ren scheme

4.2 Security Analysis of the Improvement Scheme

Yang et al. proposed an improved multisignature scheme; we show that the improvement does not help when rj is known.

Proof. Assume (t, sj, H(t, m)) is known, and

sj ≡ gj^t * rj^H(t,m) (mod n).   (20)

Step 1. Check that gcd(t, e) = 1; if it is not, continue until a case with gcd(t, e) = 1 is found.
Step 2. Use the Extended Euclidean Algorithm to compute (u, v) such that

tu + ev = 1.   (21)

Step 3. Compute

sj^u ≡ gj^(tu) * rj^(uH(t,m)) (mod n),   (22)

and

ij^v ≡ gj^(ev) (mod n).   (23)

Step 4. Compute

sj^u * ij^v ≡ gj^(tu) * gj^(ev) * rj^(uH(t,m)) ≡ gj * rj^(uH(t,m)) (mod n),   (24)

and

gj ≡ sj^u * ij^v * rj^(-uH(t,m)) (mod n).   (25)

According to the above, although Yang et al. proposed an improvement of the multisignature scheme, their scheme does not increase the degree of security.
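The following toy Python sketch (insecure parameters, stand-in hash, a single signer for brevity) illustrates the argument above: once rj is broadcast and gcd(t, e) = 1, an observer can recover gj from sj in the improved scheme via the identity tu + ev = 1. It is an illustration under these assumptions, not a general implementation.

import hashlib, math

p, q = 1009, 1013
n, phi = p * q, (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)

def H(t, m):
    return int.from_bytes(hashlib.sha256(f"{t}|{m}".encode()).digest(), "big") % n

def ext_gcd(a, b):
    """Return (g, u, v) with a*u + b*v = g."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

i_j = 123457                            # signer identity (public)
g_j = pow(i_j, d, n)                    # signer secret key issued by the PKG
r_j = 5                                 # random value, broadcast in the clear
m = "document"

t = r_j % n                             # equation (13) with a single signer
s_j = (pow(g_j, t, n) * pow(r_j, H(t, m), n)) % n     # improved scheme, equation (14)

g, u, v = ext_gcd(t, e)
assert g == 1                           # Step 1: gcd(t, e) = 1
lhs = (pow(s_j, u, n) * pow(i_j, v, n)) % n           # equation (24): g_j * r_j^(u*H)
recovered = (lhs * pow(r_j, -u * H(t, m), n)) % n     # equation (25)
print(recovered == g_j)                 # True: the secret key g_j is exposed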


5 Conclusion
Owing to the original erratum in Harn and Ren's scheme, Yang et al.'s first forgery attack was derived from an error; the wrong result rests on an incorrect assumption. Since the Harn-Ren multisignature scheme has been shown to be secure against this attack, the Yang et al. scheme becomes unnecessary.

Acknowledgment
The authors would like to thank our anonymous reviewers for their valuable comments.
This research was supported in part by the Fuqing Branch of Fujian Normal University
of China under the contract number KY2010030.

References
1. Harn, L., Ren, J.: Efficient identity-based RSA multisignatures. Computers & Security 27(1-2), 12-15 (2008)
2. Micali, S., Ohta, K., Reyzin, L.: Accountable-subgroup multisignatures: extended abstract. In: CCS 2001: Proceedings of the 8th ACM Conference on Computer and Communications Security, pp. 245-254. ACM, New York (2001)
3. Shamir, A.: Identity-based cryptosystems and signature schemes. In: Blakely, G.R., Chaum, D. (eds.) CRYPTO 1984. LNCS, vol. 196, pp. 47-53. Springer, Heidelberg (1985)
4. Yang, F.Y., Lo, J.H., Liao, C.M.: Improvement of an efficient ID-based RSA multisignature. In: 2010 International Conference on Complex, Intelligent and Software Intensive Systems (CISIS), February 15-18, pp. 822-826 (2010)

A Secure Routing Protocol to Combat Byzantine and Black Hole Attacks for MANETs
Jayashree Padmanabhan, Tamil Selvan Raman Subramaniam, Kumaresh Prakasam,
and Vigneswaran Ponpandiyan
Department of Computer Technology, Anna University, MIT Campus, Chennai -600 044,
Tamil Nadu, India
pjshree@annauniv.edu,
{tamil.3758,kumareshpbe,vignesh.wrn65}@gmail.com

Abstract. The unique features of mobile ad hoc networks (potential node and link mobility) raise certain requirements for the security mechanism. A particularly challenging problem is how to feasibly detect and screen possible attacks on routing protocols, such as Byzantine and Black Hole attacks. This work focuses on detecting Black Hole and Byzantine routing attacks through security- and trust-based routing. A secure auto-configuration scheme is adapted and enhanced with secure public-key distribution to authorize the nodes joining the Mobile Ad hoc Network (MANET). Integrity of messages between source and destination is achieved via a public-key cryptographic mechanism and a keyed hash MAC over a shared secret key. The proposed schemes can be integrated with the existing routing protocols for MANETs, such as ad hoc on-demand distance vector routing (AODV) and dynamic source routing (DSR). Introducing security mechanisms over routing in MANETs might cause significant overhead and power consumption. Hence a security mechanism considering the tradeoff between security and energy consumption is proposed. A routing algorithm to establish parallel routes in order to build trust over paths and the nodes in those paths is devised, so that compromised nodes can be detected and paths involving those nodes are ignored. The proposed protocol, Secure Routing Protocol to combat Byzantine and Black Hole attacks (SRPBB), is implemented in ns2 for throughput analysis in the presence of attacks.
Keywords: Mobile Ad hoc networks, security, routing protocol, key
management, Byzantine attack, Black Hole attack.

1 Introduction
Wireless ad hoc networks are formed by devices that are able to communicate with
each other using a wireless physical medium without having a pre-existing network
infrastructure. Such networks are known as Mobile Ad hoc NETworks (MANETs).
MANETs can form stand-alone groups of wireless terminals, but some terminals
may also be connected to fixed networks. An inherent characteristic of nodes in ad hoc
networks is that they are able to auto configure themselves without the intervention of
centralized administration.

As in wired and infrastructure enabled wireless networks, MANETs are also


vulnerable to security threats. Attacks against routing in MANETs can be classified
into external attacks and internal attacks. An external attack originates from a node
(router) that is not involved in the routing process but masquerades as a trusted node
(router). The Black Hole attack is the most prominent external attack, in which the attack is
mounted by advertising false routes; the malicious entity can then absorb the packets
it receives, or fabricate them, instead of forwarding them. An internal attack
originates from a node or a group of nodes that participate in the routing process.
The most prominent internal attack is the Byzantine attack, in which a node or a group of
nodes within the network domain collude to disrupt the routing process by modifying,
fabricating, or misleading the packets they receive.
The level of security achieved through trust-based mechanisms is noticeably lower
than that of cryptography-based mechanisms. However, in MANETs the nodes are
resource constrained and mostly battery driven, so implementing cryptographic
mechanisms would considerably increase the overhead because of the computational
complexity involved. There is therefore a tradeoff between the level of security and
the computational overhead.
In this paper a novel mechanism is proposed for the secure auto-configuration of a
node newly joining a MANET. Further, an efficient attack detection and
screening mechanism is presented to secure MANET routing protocols against Byzantine and
Black Hole attacks. This paper is organized as follows: related work is analyzed
in Section 2. Mechanism for dynamic key management is proposed in Section 3. An
enhanced routing algorithm using Trust mechanism is stated in Section 4. Simulation
environment and results are presented in Section 5. Section 6 concludes the paper.

2 Related Work
An analysis of current secure routing protocols used for MANETs and of self-auto-configuration
schemes is carried out. Secure routing protocols can be classified into
two categories: 1) those that integrate a security mechanism with an existing routing protocol,
and 2) those that detect and defend against specific attacks. The common practice is to secure
on-demand routing protocols, such as AODV, DSR, and DSDV, by using end-to-end
authentication. This has resulted in secure routing protocols such as Secure Efficient
Ad hoc Routing (SEAD) and Authenticated Routing in Ad hoc Networks (ARAN).
SEAD [3] is an extension of DSDV. There is no delay in updates and no increment
in sequence number due to a broken link. It uses one-way hash chains to authenticate
hop counts. The security mechanism in SEAD is the shared secret key between each
pair of nodes. It incurs byte overhead and packet overhead. ARAN [4] is implemented
over both AODV and DSR. In ARAN, the environment is defined as open, managed
open, or managed hostile. In the open environment, random nodes establish connectivity
without any trusted third parties in common. The managed open environment differs
in that nodes wishing to exchange information may exchange the initial
parameters. In the managed hostile environment all nodes are deployed by a common
source. The weakness is that ARAN is satisfactory only for the managed open environment, as an open
environment would need a trusted certificate server and a managed hostile environment needs to expose the
entire topology.


The latter involves protecting the routing traffic against routing attacks. These
include On-Demand Secure Byzantine Routing (ODSBR), Rushing Attack Prevention
(RAP). ODSBR [5] includes three phases namely route discovery, byzantine fault
detection, link weight management. Route discovery process involves signed request
to destination and the destination verifies authenticity of the request and creates a
response with response sequence numbers. The source in turn selects the best route
using link weight management. An authenticated acknowledgement is sent for
every data packet received. Faulty nodes are identified using binary search. Route
discovery overhead and acknowledgement overhead occur in ODSBR. RAP [6] is
designed against rushing attacks. It involves secure neighbor detection, secure route
delegation and secure route discovery. It has to be integrated with another protocol, which
involves changing the route discovery process.
Current auto configuration schemes [7] include Self-authentication schemes,
challenge response scheme and trust model scheme. In Self authentication scheme [8],
a node generates its public/private key pair randomly and then uses the hash value of
the public key as the IP address. Here, certificate repository is not needed. The
relationship between public key and IP address in this scheme brings the following
problem: the public/private key pair per node is limited to one, whereas two key pairs
are needed, one for signing/verifying and one for encryption/decryption. Hence the scheme is
vulnerable to ciphertext attacks.
Challenge response scheme [9] is based on two steps, authentication and address
allocation. This scheme has two problems: only one-hop broadcast is used in the
announcement of the public key, and thus the public key is distributed only to the one-hop
neighbors; and the allocator might be a malicious node, and hence it can assign a non-disjoint
address pool to the new node, which will lead to address conflicts in the
current and subsequent address allocations. Two secure auto-configuration schemes
based on a trust model are analyzed. One method [10] is based on the trust values of the
neighbors; this method is vulnerable to Sybil attacks. The second method [11] is a
threshold-cryptography-based distributed certificate authority (DCA). The problems
with that scheme are that at least k preconfigured DCA server nodes must be present in the
MANET without auto-configuration, and that the scheme is also vulnerable to Sybil attacks.
To summarize, expensive authentication mechanisms are used in protocols that
detect routing attacks.

3 Dynamic Key Management Mechanism


This section deals with the secure auto-configuration mechanism adapted for secure
public-key distribution among nodes and with the proposed key management scheme. The proposed
key management scheme achieves integrity of transmitted messages, and the trust
estimator technique detects the compromised nodes in the network and screens them
to provide security and reliability.
3.1 Secure Auto Configuration Mechanism
Since MANETs lack a centralized administration like DHCP to configure nodes joining
them, new nodes should auto-configure themselves with the network. Auto-configuration is done


to announce the association of IP address and public key of a node to the network. A
mechanism proposed by Zhou et al. [7] for secure auto-configuration and
public-key distribution is adapted. Secure auto-configuration and public-key
distribution achieve two goals: uniqueness of address allocation and secure public-key
distribution. They involve the following procedures: Generation of Parameters, Broadcast
of Duplicate Address Detection (DAD) message, Receipt of Duplicate Address
Detection (DAD) message, Forwarding of DAD message, Forwarding of NACK
message, Receipt of NACK message, and Broadcast of Commit (CMT) message.
3.2 Key Management Scheme
There are two basic key management schemes; they are public and shared-key based
mechanisms. Public key based mechanism uses a public/private key pair and an
asymmetric key based algorithm like RSA to establish session and authenticate nodes.
In a secret key based scheme a shared symmetric key is used to verify the integrity
of data.
In the proposed key management scheme, whenever a node needs to initiate
route discovery, it constructs an RREQ and generates an SMSG. The SMSG consists of the
secret key to be shared between the source and the destination, together with a digital
signature of the same. The source node then forwards the RREQ along with the
SMSG. Once the destination receives the RREQ along with the SMSGs, it verifies the
digital signatures via polling. It chooses a shared secret key that has been proved to
be valid, and detects misbehavior if the digital signature sent via a path is
invalid. The destination reports the misbehavior to the source, and every
intermediate node records it for future trust factor calculation. Once the source
receives the RREP, it starts transmitting the data, protected via the keyed HMAC
algorithm using the secret key shared between the source and destination as the key.
Key Management Scheme:

While (initiate route discovery)
{
    Construct RREQ;
    Generate SMSG;
LRREQ:
    broadcast RREQ + SMSG;
    Wait (RREP || timer expiry);
    if (RREP)
        Exit;
    else
        Goto LRREQ;
}
Transmit;

where
    RREP - Route Reply
    RREQ - Route Request
    SMSG - Start Message (num + E(E(num, KR-PUB), KS-PRI))
    num  - Shared secret key
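As an illustration of the final Transmit step, the following is a minimal sketch (not from the paper) of protecting payloads with a keyed HMAC over the shared secret num; SHA-1 is used because the simulation in Section 5 mentions a SHA-1 keyed HMAC, and all function and variable names here are hypothetical.

import hmac
import hashlib

def protect(payload: bytes, shared_secret: bytes) -> bytes:
    # Append a keyed HMAC (SHA-1) of the payload, computed with the shared secret.
    tag = hmac.new(shared_secret, payload, hashlib.sha1).digest()
    return payload + tag

def verify(message: bytes, shared_secret: bytes) -> bool:
    # Split payload and tag, recompute the HMAC and compare in constant time.
    payload, tag = message[:-20], message[-20:]          # a SHA-1 digest is 20 bytes
    expected = hmac.new(shared_secret, payload, hashlib.sha1).digest()
    return hmac.compare_digest(tag, expected)

# Example: source and destination share the secret 'num' exchanged via the SMSG.
num = b"shared-secret-from-SMSG"
packet = protect(b"routing payload", num)
print(verify(packet, num))   # True; any tampering makes this False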

4 Trust Estimator Technique


To build a trust estimation technique, the trustworthiness of a node n as seen by another node x is
defined as the probability that node n will perform a particular action expected by x;
we denote it Tx(n). Trustworthiness is measured using a Trust Factor (TF), which
is calculated by accumulating the behavior of a node over a particular interval called
the TF updation cycle. The actions include route request, route reply, SMSG and
data transmission. Each node maintains a Trust Certificate Repository. Based on the
calculated Trust Factor, each node classifies all of its neighbors under three
categories: known, unknown and companion. Known refers to nodes that
have a high probability of being trusted. Unknown refers to nodes that
have a low probability of being trusted. Companion refers to nodes that
have a high probability of switching from unknown to known.
Let the link be active and correct with some probability, let mC be the number of transmitted
messages found to be correct, mS the number of successful transmissions, mT the total number of
messages transmitted by x to n which are not destined to n, and mA the total number of
attempted transmissions. Then the trustworthiness of node n as seen by node x is stated as in
equation (1):

    Tx(n) = (mC + mS) / (mT + mA)                            (1)

Let Tx(p;j) be the trustworthiness of path p as seen by node x in the jth TF updation cycle;
then path trust estimation can be stated as in equation (2):

    Tx(p;j) = ∏ Tx(n;j),  n ∈ p                              (2)

The trustworthiness parameter stated above accounts for reliability, whereas
availability of a path also plays an important role in ad hoc networks, which have the
inherent quality of link mobility. If H is the number of hops in the path, V is the
average relative speed, R is the transmission range where R = min(Rx), x ∈ path, and
λ0 is a constant of proportionality decided by node density and mobility scenarios, then
the path availability parameter λpath is defined as in equation (3):

    λpath = (1/λ0) · ((H · V)/R)                              (3)
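A minimal numeric sketch (not from the paper) of equations (1)-(3); the observation counts and the constant λ0 are hypothetical.

# Per-node trust, path trust and path availability parameter, following Eqs. (1)-(3).
def node_trust(m_correct, m_success, m_total, m_attempted):
    # Eq. (1): Tx(n) = (mC + mS) / (mT + mA)
    return (m_correct + m_success) / (m_total + m_attempted)

def path_trust(node_trusts):
    # Eq. (2): product of Tx(n;j) over all nodes n on path p
    t = 1.0
    for v in node_trusts:
        t *= v
    return t

def path_availability(hops, avg_rel_speed, tx_range, lambda0=1.0):
    # Eq. (3): lambda_path = (1/lambda0) * (H*V / R); lambda0 is a hypothetical constant here
    return (1.0 / lambda0) * (hops * avg_rel_speed) / tx_range

# Example: a 3-hop path whose nodes behaved correctly in 18/20, 19/20 and 15/20 observed actions.
trusts = [node_trust(9, 9, 10, 10), node_trust(10, 9, 10, 10), node_trust(8, 7, 10, 10)]
print(path_trust(trusts))                 # one low-trust node scales down the whole path trust
print(path_availability(3, 2.0, 250.0))   # a larger H*V/R implies a less stable path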

4.1 Enhanced Routing


1. During route discovery, a source node sends RREQ+SMSG packets to its
neighboring nodes.
2. Once an RREQ packet is received by an intermediate node, it either replies with
an RREP or rebroadcasts the RREQ in turn.


3. When the destination node receives the RREQ, it extracts the shared secret key
from the SMSG and sends the RREP message via the route with the highest Trust
Factor (TF), as illustrated in the sketch after this list.
4. Once the route is established, the intermediate nodes monitor the link status of
the next hops in the active routes. Those that do not meet the performance and
trustworthiness requirement will be eliminated from the route.
5. When a link breakage in an active route is detected, a route error (RERR)
packet is used to notify the other nodes that the loss of that link has occurred.
Some maintenance procedures are needed as in AODV.
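The route selection in step 3 can be sketched as follows (not from the paper; the route names and path Trust Factor values are hypothetical, with path TF computed as in equation (2)).

# Hypothetical sketch of step 3: among the reverse routes on which copies of the RREQ
# arrived, the destination replies along the one with the highest path Trust Factor.
candidate_routes = {"route-A": 0.684, "route-B": 0.594, "route-C": 0.619}   # path TF per candidate
best_route = max(candidate_routes, key=candidate_routes.get)
print(best_route)   # the RREP is sent back along this route; lower-trust routes are ignored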

5 Simulation Result and Observation


The proposed secure routing protocol (SRPBB) is implemented on ns2 version 2.34.
Considering the computational overhead involved RSA public key cryptography with
OAEP (256 Bits key length) and SHA1 keyed HMAC algorithms are used in dynamic
key management scheme. Throughput of the protocol is evaluated under various
mobility models, in the presence and absence of attacks. A throughput comparison
with the existing AODV protocol has been done in the presence of attacker nodes and is
presented in Figure 1. Though the protocol uses cryptographic mechanisms, the operations causing additional
processing overhead are limited to the following:
1. SMSG Start Message involves digital signature generation procedure at the
source end on creation and digital signature verification procedure on
receiving at the destination end.
2. Transmission involves just HMAC mechanism over the data to be transmitted.
3. DAD involves two HMAC operations one at the source end and another at the
destination end.
4. Key length of public key encryption algorithm can be changed or extended to
1024 bits based on the computational capability and applications.
Convergence of a routing protocol refers to the time taken by the protocol to establish
routes during route discovery phase and re-establishment of routes during route
maintenance phase in case of route error. The proposed SRPBB routing protocol
implements route maintenance mechanism of AODV but in the case of route
discovery it considers trust factor for path selection instead time of arrival of route
reply as in AODV. Since a path being chosen based on reliability and availability, the
probability of link or node failure is considerably low. So this confirms efficiency of
SRPBB over AODV in context of convergence.
Considering the overhead involved, we chose RSA with a 256-bit key and OAEP
padding. OAEP improves the strength of RSA several times over while incurring very
low overhead. This avoids the need for a large key size to improve the strength of RSA
at the cost of resource utilization. It is estimated that for an RSA key with a length of
Lk (in bits), the CPU cycles needed to perform one RSA operation is about (Lk + 2)
(Lk + 2 + 32) for a typical implementation, which is equal to 0.28 and 1.09 million for
Lk = 512 and 1024, respectively. It is also estimated that the generation of a signature
takes about 20 RSA operations, whereas the verification takes only one RSA
operation.
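As a quick numeric check of the cycle estimate quoted above (a back-of-the-envelope calculation, not from the paper):

# CPU-cycle estimate per RSA operation: (Lk + 2) * (Lk + 2 + 32)
for Lk in (512, 1024):
    cycles = (Lk + 2) * (Lk + 2 + 32)
    print(Lk, cycles, round(cycles / 1e6, 2))   # 512 -> ~0.28 million, 1024 -> ~1.09 million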


In the proposed scheme path trust evaluation parameter is given by the product of
trusts of the nodes in the path. Hence the node with minimal trust value will scale
down the entire path trust considerably. This makes the protocol converge faster than
other existing protocols. Beyond reliability, the trust estimator technique provides a
simple and feasible mechanism that takes the availability of paths into consideration. This
mechanism adds further efficiency to the protocol.
Fig. 1. Throughput comparison between AODV and SRPBB (x-axis: time in milliseconds; y-axis: total throughput)

6 Conclusion
It is evident from the performance evaluation that the devised routing protocol
outperforms existing unicast routing protocols in terms of efficiency and security.
The overhead in the implemented key management scheme is due to the public-key
cryptographic mechanism being used. Hence, considering the tradeoff between energy
and security, a new cryptographic mechanism can be devised in the future to support
resource-constrained MANET environments. Considering appropriate parameters other
than throughput, the efficiency of the protocol has to be assessed and suitable adjustments
have to be carried out. To conclude, an enhanced routing protocol that eliminates
Byzantine and Black Hole attacks in MANETs has been devised and implemented.
Making the protocol QoS-centric is a challenging issue to be taken up in
future work.


References
1. Yu, M., Zhou, M., Su, W.: A Secure Routing Protocol against Byzantine Attacks for
MANETs in Adversarial Environments. IEEE Transactions On Vehicular
Technology 58(1) (January 2009)
2. Bhalaji, N., Shanmugam, A.: Association between Nodes to Combat Blackhole Attack in
DSR based Manet. In: IEEE WOCN 2009 Conference Program Cairo, Cairo, Egypt (2009)
3. Hu, Y.-C., Johnson, D.B., Perrig, A.: SEAD: Secure efficient distance vector routing for
mobile wireless ad hoc networks. In: Proc. 4th IEEE Workshop Mobile Comput. Syst.
Appl., pp. 3-13 (June 2002)
4. Sanzgiri, K., LaFlamme, D., Dahill, B., Levine, B.N., Shields, C., Belding-Royer, E.M.:
Authenticated routing for ad hoc networks. IEEE J. Sel. Areas Commun. 23(3), 598-610
(2005)
5. Awerbuch, B., Curtmola, R., Holmer, D., Nita-Rotaru, C.: ODSBR: An On-Demand
Secure Byzantine Routing Protocol. JHU CS Tech. Rep.Ver.1 (October 15, 2003)
6. Hu, Y.-C., Perrig, A., Johnson, D.B.: Rushing Attacks and Defense in Wireless Ad Hoc
Network Routing Protocols. In: WiSe 2003, San Diego, California, USA (September 19, 2003)
7. Zhou, H., Mutka, M.W., Ni, L.M.: Secure Autoconfiguration and Public-key Distribution
for Mobile Ad-hoc Networks. In: IEEE 6th International Conference on Mobile Ad hoc
and Sensor Systems, MASS 2009 (2009)
8. Wang, P., Reeves, D.S., Ning, P.: Secure Address Autoconfiguration for Mobile Ad Hoc
Networks. In: Proceedings of the 2nd Annual International Conference on Mobile and
Ubiquitous Systems: Networking and Services (MobiQuitous 2005), San Diego, CA, pp.
519-521 (July 2005)
9. Cavalli, A., Orset, J.-M.: Secure Hosts Autoconfiguration in Mobile Ad Hoc Networks. In:
Proceedings of the 24th International Conference on Distributed Computing Systems
Workshops (ICDCSW 2004), Tokyo, Japan (March 2004)
10. Hu, S., Mitchell, C.J.: Improving IP address autoconfiguration security in MANETs using
trust modelling. In: Jia, X., Wu, J., He, Y. (eds.) MSN 2005. LNCS, vol. 3794, pp. 83-92.
Springer, Heidelberg (2005)
11. Nakayama, H., Kurosawa, S., Jamalipour, A., Nemoto, Y., Kato, N.: A Dynamic Anomaly
Detection Scheme for AODV-Based Mobile Ad Hoc Networks. IEEE Transactions On
Vehicular Technology 58(5) (June 2009)
12. Bai, F., Sadagopan, N., Helmy, A.: BRICS: A Building-block approach for analyzing
Routing protocols in ad hoc networks - a Case Study of reactive routing protocols. In:
IEEE International Conference on Communications (ICC) (June 2004)
13. Johnson, D.B., Maltz, D.A., Broch, J.: DSR: The Dynamic Source Routing Protocol for
Multi-Hop Wireless Ad Hoc Networks. RFC 4728 (February 2007)
14. Lu, S., Li, L., Lam, K.-Y., Jia, L.: SAODV: A MANET Routing Protocol that can
Withstand Black Hole Attack. In: IEEE International Conference on Computational
Intelligence and Security (2009)
15. Sadagopan, N., Bai, F., Krishnamachari, B., Helmy, A.: PATHS: Analysis of PATH
Duration Statistics and their Impact on Reactive MANET Routing Protocols. In: ACM
International Symposium on Mobile Ad Hoc Networking & Computing (2003)
16. Nakayama, H., Kurosawa, S., Jamalipour, A., Nemoto, Y., Kato, N.: A Dynamic Anomaly
Detection Scheme for AODV-Based Mobile Ad Hoc Networks. IEEE Transactions On
Vehicular Technology 58(5) (June 2009)

A Convertible Designated Verifiable Blind


Multi-signcryption Scheme
Subhalaxmi Das , Sujata Mohanty, and Bansidhar Majhi
Department of Computer Science and Engineering, NIT, Rourkela, Orissa

Abstract. This paper presents a convertible blind multi-signcryption


scheme without using any one-way hash function, based on the security
of three computationally hard problems, namely the Computational Diffie-Hellman problem, the Discrete Logarithm Problem, and the Integer Factorization
problem. Only a designated verifier can verify the signcrypted text, using
the signcrypters' public parameters. The size of the generated authenticated ciphertext is independent of the total number of participating signcrypters. The proposed scheme is convertible, as it can easily produce an
ordinary signcrypted text without cooperation from the signer. The
verification cost of the proposed scheme is low, so it can be applied in
real-life scenarios.
Keywords: Blind Multi-Signcryption, Blind Multi-signature, Convertible.

Introduction

Encryption and signature are fundamental tools of public-key cryptography for
confidentiality and authenticity, respectively [1]. Traditionally, these two main
building blocks have been considered as independent entities. However, these
two basic cryptographic techniques may be combined in various ways,
such as sign-then-encrypt and encrypt-then-sign, in many applications to ensure
privacy and authenticity simultaneously. To enhance efficiency, Zheng proposed a
novel concept named signcryption, which can fulfill the functions of both signature and encryption in a single logical step [3]. Compared with traditional methods,
signcryption has lower computation, communication and implementation complexity. Since the signcryption scheme has so many advantages and extensive
application prospects, it is also used in multi-user settings. In multi-user settings, messages are often signed by a group of members. To send messages to multiple
recipients, the base signcryption scheme could be run several times in the trivial
way; however, the trivial method is infeasible for security and performance reasons.
Thus, a new primitive called multi-signcryption is needed. In a multi-signcryption scheme a number of users can sign a message according to some rule, and
the message is sent to the verifier.






Blind signatures were first introduced by Chaum (1982) to protect an individual's right
to privacy. A blind signature allows a user to acquire a signature
without giving any information about the actual message or the resulting signature [4,5,6]. The properties of blind signatures are: the signer cannot
read the document during the signature generation process, and the signer cannot
correlate the signed document with the act of signing. In a secure blind
signature scheme, the signer is unable to link or trace a signed message to
the previous signing process instance. This property is usually referred to as
the unlinkability property. Due to the unlinkability (blindness) property, blind
signature techniques have been widely used in anonymous electronic cash
(e-cash) and anonymous voting systems [9].
In this paper, we propose a designated verifiable blind multi-signcryption
scheme, which is an organic combination of multi-signcryption and blind signature. This scheme is based on the security of three computationally hard problems,
namely the integer factorization (IF) problem, the discrete logarithm problem (DLP) and the computational Diffie-Hellman problem (CDHP). The proposed scheme has the following
advantages: (i) The size of the generated authenticated ciphertext is independent
of the total number of participating signcrypters. (ii) Except for the designated
verifier, no one can obtain the signcrypted message and verify its corresponding
signature. (iii) The multi-signcrypted text is cooperatively produced by a group
of signcrypters instead of a single signcrypter. (iv) In case of a later dispute on
repudiation, the recipient has the ability to convert the authenticated ciphertext
into an ordinary one for convincing anyone of the signcrypters' dishonesty. (v)
No signcrypter of the group can know the relationship between the blinded
and the unblinded message and signature parameters. (vi) Only the cooperation of
all signcrypters can generate a valid blind multi-signcrypted text for the designated
verifier. Other third parties or some (but not all) signcrypters cannot forge a valid
blind multi-signcrypted text. This scheme is more efficient for multi-party applications since the size of the generated authenticated ciphertext is independent
of the total number of participating signcrypters. In addition, the computation
cost for the verifier will not increase even if the signcrypter group is expanded.
The proposed blind multi-signcryption is useful in real-life scenarios such as
e-cash systems, e-bidding, online lottery systems and e-commerce applications.
The outline of this paper is as follows: The proposed scheme is presented in Section
2. Section 3 contains a discussion of the scheme. Security analysis is done in
Section 4, performance evaluation is discussed in Section 5, and finally we conclude
in Section 6.

The Proposed Scheme

The proposed multi-signcryption scheme involves a requester (A), a group of n signcrypters (SG), a trusted
party, and a verifier (B). The scheme consists of the following four phases: Setup, Blinding, Signcryption,
and Unblinding and Verification. The parameters used in the proposed scheme are given
in Table 1.


Table 1. Parameters used in the proposed Scheme


Parameter    Function
A            Requester
B            Verifier
SG           Group of n signcrypters
xi, yi       Private and public key of the requester
xv, yv       Private and public key of the verifier
E and D      Encryption and decryption algorithms
||           Concatenation operator
z, t         Private parameters chosen by a signcrypter
w            Private parameter chosen by the verifier
log_g(yi)    Logarithm of yi to the base g
+, - and *   Addition, subtraction and multiplication functions respectively

Setup
Step 1:
The trusted party chooses an integer n as the product of two large primes p and
q such that p = 2p1*q1 + 1 and q = 2p2*q2 + 1, where p1, q1, p2, q2 are all large primes
[8]. Then he chooses g as a generator of GF(n), and submits n and g to the
requester (A).
Step 2:
The requester A chooses his/her private key xi ∈ Zn and publishes the public
key using DLP [4]:

    yi = g^xi mod n                                          (1)
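The setup above can be illustrated with a minimal sketch (not from the paper). The toy key sizes, the use of sympy for primality testing, and the choice g = 2 are assumptions made purely for illustration; a real setup must also verify that g has the required order modulo n.

import random
from sympy import isprime, randprime

def safe_style_prime(bits=16):
    # Find a prime of the form 2*p1*q1 + 1 with p1, q1 prime (toy sizes).
    while True:
        p1 = randprime(2**(bits - 1), 2**bits)
        q1 = randprime(2**(bits - 1), 2**bits)
        cand = 2 * p1 * q1 + 1
        if isprime(cand):
            return cand

p = safe_style_prime()
q = safe_style_prime()
n = p * q                      # modulus published by the trusted party
g = 2                          # assumed base; the paper requires a generator of GF(n)

x_i = random.randrange(2, n)   # requester's private key
y_i = pow(g, x_i, n)           # Eq. (1): public key y_i = g^x_i mod n
print(n.bit_length(), y_i)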
Blinding
Step 1:
The requester chooses a private parameter w such that w = xi·yv mod n.
He then encrypts the message by multiplying the original message M by the value of w and adding the sender's private key to that product:

    M′ = xi + M·w mod n                                      (2)

After that the requester encrypts M′ using the public key of the verifier (B):

    M″ = yv·M′ mod n                                         (3)

Then he sends M″ to the signcrypters.


Signcryption
After receiving the message, the signcrypters signcrypt it blindly, without knowing its content. The signcryption proceeds as
follows:

Step 1
Each signcrypter randomly chooses integers z and t, and computes a key using
the formula

    K = z || (M″ ⊕ z) mod n                                  (4)

Then he finds the ciphertext by encrypting the message with the key:

    C = E(K, M″) mod n                                       (5)

After that each signcrypter computes three private elements u, r and v as follows:

    u = yv^t mod n                                           (6)

    r = K·u mod n                                            (7)

    v = t − log_g(yi)·r mod n                                (8)

After receiving the u, r and v values from all members, a clerk (who may be a
signcrypter of the group) computes U, R and V as follows:

    U = Σ_{i=1}^{N} u mod n                                  (9)

    R = Σ_{i=1}^{N} r mod n                                  (10)

    V = Σ_{i=1}^{N} v mod n                                  (11)

Then he sends this signcrypted text (U, R, V and C) to the requester, and the
requester sends it to the verifier (B).
Unblind and Verification
Step 1
After receiving (U, R, V and C), the verifier (B) first checks the authenticity
as follows:

    U yv V + R = R·(w + 1) mod n                             (12)

    w = g^((yi)+(xv)) mod n                                  (13)

If this equation holds, the verifier (B) proceeds to the next step; otherwise he
sends the message back. Then the verifier calculates the values of U and K as
follows:

    U = g^(xv·(V + R·log_g(yi))) mod n                       (14)

    K = R·U^(-1) mod n                                       (15)

Then he finds the encrypted message by decrypting the ciphertext C with the
key K:

    M″ = D(K, C) mod n                                       (16)

Then he calculates the value of M′ by decrypting M″ with his private key:

    M′ = D(xv, M″) mod n                                     (17)

After that he finds the original message by exclusive-oring M′ with his public
key:

    M = (M′ ⊕ g^(yi)) ⊕ g^(xv) mod n                         (18)

Correctness
As U = g^(xv·(V + R·log_g(yi))) mod n
     = yv^(t − log_g(yi)·R + R·log_g(yi)) mod n
     = yv^t mod n

Discussion

The difficulty of breaking the private key of the requester (A) is bounded by the complexity of solving the DLP. Also, the concept of safe primes is used in the key generation
process, which makes the scheme secure. The original message M is blinded by attaching it to the private key of the requester along with a random parameter
chosen by the requester. The signcrypters put a signature on the blinded message and send it back to the requester. The requester checks the authenticity of
the signature and extracts the signature. Then the requester sends the signature
to a designated verifier (B). The verifier de-signcrypts the signcrypted text and
recovers the original message M; it can be verified only by the intended
recipient of the signcryption. The accepted notion of security with respect to
non-repudiation is existential unforgeability (EUF) under adaptive chosen
message attack. This implies message authentication and integrity. The scheme
is convertible because, in case of a legal dispute, the verifier can verify the message without the help of the signcrypters. The layout of the proposed scheme is
shown in Figure 1.

Fig. 1. The layout of the proposed scheme

Security Analysis

First, it is shown how the proposed scheme resists attacks that attempt to recover
the secret key of the signcrypter. Then resistance to the parameter reduction attack
and the forgery attack, without using any one-way hash function, is discussed. Finally,
the security of the blind multi-signcryption scheme is discussed.
4.1 Attacks for Parameter Reduction

The message recovery equation, Eq. (18), can be written as follows:

    M = (D(xv, D(R·(g^(xv·(V + R·log_g(yi))))^(-1), C)) ⊕ g^(yi)) ⊕ g^(xv) mod n

Therefore, the parameters in Eq. (18) cannot be reduced further. Hence the proposed scheme is resistant against the parameter
reduction attack.
4.2 Forgery Attack

Given the message M, a forger has to solve both Eq. (5) and Eq. (6) in order to get
the triplet (C, R, V), as the scheme relies on the discrete logarithm problem. Also, the values of p
and q are very difficult to obtain, as they are generated using safe primes. Even if both
R and V are known, it is very difficult to find the values of M′ and M″, as they involve
the private key of the verifier.
Lemma 1: Neither the clerk nor a signcrypter can forge a valid signcrypted text.
Proof:
The requester sends the triplet (C, R, V) to the verifier, where C = E(K, M″) mod n
is each signcrypter's own value, but R and V are formed from the summation of all
signcrypters' r and v as follows:

    R = Σ_{i=1}^{n} r mod n
    V = Σ_{i=1}^{n} v mod n,

where r = K·u mod n and v = t − log_g(yi)·r mod n. As the signcrypted text is made from every signcrypter's
contribution, neither the clerk nor any single signcrypter can forge a valid signcrypted text.

A Convertible Designated Verible Blind Multi-signcryption Scheme

555

Lemma 2: Only a designated verifier can verify the signcrypted text
and recover the original message.
Proof:
Since the message is verified using M = (D(xv, D(R·(g^(xv·(V + R·log_g(yi))))^(-1), C)) ⊕ g^(yi)) ⊕ g^(xv) mod n,
no one except the verifier can find the key, as the message can only be recovered using the
private key of the verifier. So the scheme is secure against any external attack.

Performance Evaluation

The computational complexity of any cryptographic algorithm mainly depends
on four major operations, namely the number of inverse operations, the number of hash
function evaluations, the number of exponentiations and the number of multiplications. We ignore the
time for performing addition and subtraction operations. The following notations
are used to analyze the performance of the proposed scheme; we have ignored
the cost of the setup phase in the analysis.
1) TE is the time complexity of modular exponentiation.
2) TM is the time complexity of multiplication.
3) TH is the time complexity of a hash function.
4) TI is the time complexity of an inverse function.
From Table 2, it is clear that our scheme is devoid of any hash function, which implies that the computation cost for signcryption and verification is reduced considerably.
The signcryption and verification phases of the proposed scheme have a minimal number
of operations. Hence this scheme has low computational complexity and can be
useful in practical applications.
Table 2. Performance evaluation of the proposed scheme

Phase                        Proposed scheme
Blinding                     5TM
Signcryption                 3TM
Unblinding and verification  3TM + TE + 2TI

Conclusion

In this paper a new kind of multi-signcryption scheme is proposed, which allows
a group of signers to cooperatively produce a blind authenticated ciphertext and
preserves the characteristics of a signcryption scheme. Only a specific recipient
can recover the message and verify the signcrypted text. In case of a dispute, the
recipient has the ability to release an ordinary multi-signature and to convince
anyone of the signers' dishonesty. The proposed scheme would be a better alternative for some organizational operations, in which the security requirements
of integrity, confidentiality, authenticity, and non-repudiation can be simultaneously achieved with low computation and communication cost. It is proved and
analyzed that the proposed scheme can withstand the parameter reduction attack and the
forgery attack, and can recover the message from the signcrypted text itself. There
is no message redundancy feature used in this scheme, but it still resists the forgery
attack. The scheme supports a message recovery feature, as the message is recovered
from the signature and there is no need to send the message along with the signcrypted text. The proposed scheme is applicable to areas such as e-voting,
e-cash and e-commerce.

References
1. Shen, Y., Xie, H., Yang, L.: The Study and Application of Group Blind Signature
Scheme in E-commerce Security. IEEE, Los Alamitos (2009)
2. Chen, X., Zhang, F., Kim, K.: ID-Based Multi-Proxy Signature and Blind Multisignature from Bilinear Pairings. Information and Communications University(ICU),
305732
3. Mohammed, E., Emarah, A.E., El-Shennawy, K.: A blind signature scheme based on
Elgamal signature. In: Seventeenth National Radio Science Conference, pp. 22-24
(February 2009)
4. Liu, Y., Yin, X., Chen, J.: A Forward Secure Blind Signature Scheme. In: Congress
on Image and Signal Processing (2008)
5. López-García, L., Martínez-Ramos, L., Rodríguez-Henríquez, F.: A Comparative Performance Analysis of Several Blind Signature Schemes. In: International Conference
on Electrical Engineering, Computing Science and Automatic Control, pp. 310-315
(November 2008)
6. Fan, C.-I., Guan, D.J., Wang, C.-I., Lin, D.-R.: Cryptanalysis of Lee-Hwang-Yang
blind signature scheme. Computer Standards and Interfaces 31, 319-320 (2009)
7. Kang, B.: On the security of proxy blind multisignature scheme without a secure
channel. In: 2nd International Conference on Computer Engineering and Technology
(2009)
8. Wang, C.-H., Hwang, T., Lee, N.-Y.: Comments on two group signatures. Information Processing Letters 69, 95-97 (1999)
9. Tian, X.-X., Li, H.-J., Xu, J.-P., Wang, Y.: A Security Enforcement ID-based Partially Blind Signature Scheme. In: International Conference on Web Information
Systems and Mining (2009)

Middleware Services at Cloud Application Layer


Imad M. Abbadi
Department of Computer Science
University Of Oxford
imad.abbadi@cs.ox.ac.uk

Abstract. Cloud infrastructure is composed of enormous resources,


which need to be securely and reliably coordinated and managed to provide end-to-end trusted services in the Cloud. Such coordination and
management could be supported using a set of middleware. A middleware should provide a set of trustworthy automated management services. Such services would help in moving the current untrusted Cloud towards a
trustworthy, Internet-scale critical Cloud infrastructure. The main contribution of this paper is identifying Cloud middleware types, focusing on
application layer management services and their interdependencies. To
the best of our knowledge, our paper is the first to identify these middleware
services and their interdependencies. We demonstrate service interdependencies and interactions using a multi-tier application architecture in
a Cloud computing context. Finally, we discuss the advantages of middleware services for establishing trust in the Cloud and provide our research
agenda in this direction.

Introduction

Cloud is defined as an elastic execution environment of resources involving multiple stakeholders and providing a metered service at multiple granularities for a
specified level of quality [11]. Cloud supports three main deployment types: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as
a Service (IaaS) [12]. The technologies behind current Cloud infrastructure are
not new, as they have been used in enterprise infrastructure for many years [14].
The current understanding of Cloud computing became popular with Amazon EC2 in
2006 [5], and its infrastructure is built up of technologies and processes based
on in-house solutions. The two main characteristics of a potential Cloud critical
infrastructure, which differentiate it from traditional enterprise infrastructure,
are the pay-per-use payment model and automated management services [11]. Such
services provide Cloud computing with exceptional capabilities and new features, for example, scaling per use, hiding the complexity of the infrastructure, and automated higher reliability, availability, scalability, dependability, and resilience.
These should help in providing trustworthy, resilient Cloud computing, and
should result in cost reduction.
The main objective of this paper is to identify and analyze Cloud application
middleware automated management services and their interdependency. We also



discuss how such services help in establishing trust in the Cloud. This paper
is organized as follows. Section 2 defines the scope of the paper and related
work. Section 3 defines application layer middleware self-managed services and
their interdependencies. Section 4 presents a typical multi-tier architecture in a cloud
environment, and discusses how the conceptual models proposed in Section 3
support trustworthy and resilient multi-tier applications in the Cloud. Finally,
we conclude the paper and propose our research agenda in Section 5.

Scope of the Paper and Related Work

Cloud computing can conceptually be viewed from different angles. For the purpose of our paper, Cloud computing conceptually consists of multiple intersecting
layers, as follows (for a detailed description of the Cloud taxonomy see [1]).
1. Physical Layer This layer represents the main physical components and
their interactions, which constitute Cloud physical infrastructure. Example
of these include physical servers, storage, and network components. The
physical layer resources are consolidated to serve the Virtual Layer.
2. Virtual Layer This layer represents the virtual resources, which are hosted
by the Physical Layer. Cloud customers in the IaaS Cloud type interact directly
with the virtual layer, which hosts the Cloud customers' applications. This layer
consists of multiple sub-layers: Virtual Machine (VM), virtual network, and
virtual storage.
3. Application Layer This layer holds the Cloud customers' applications, which
are hosted using resources in the Virtual Layer.
Moving the current Cloud infrastructure to the envisaged trustworthy infrastructure requires a set of trustworthy middleware services. Middleware services glue
resources within Cloud layers together by providing a set of automated self-managed services that consider users' security and privacy requirements by design. These services should be transparent to Cloud customers and should
require minimal human intervention. The implementation of self-managed service functions in middleware mainly depends on the middleware location
within the Cloud layers. For example, a Virtual Layer Middleware is needed between the Physical Layer and the Virtual Layer to provide infrastructure-transparent
services to the virtual layer, and an Application Layer Middleware is needed between the Virtual Layer and the Application Layer to provide transparent management
services to applications. We have previously defined the Virtual Layer Middleware
self-managed services and the security challenges for providing such services in
[3]. In this paper, for clarity, we mainly focus on Application Layer Middleware
self-managed services.
In this paper we continue our previous work in [4], which discusses the misconceptions about Cloud computing, discusses Cloud structural components, and
derives the main security challenges in the Cloud. In this paper we mainly focus


on self-managed services at the application layer, factors affecting their actions, and


their interdependency.
We could not find related work going in the same direction as this paper. However, there is other work (see, for example, [8,19]) analyzing Cloud properties
from user perspectives, mainly focusing on analyzing Cloud-provided services
(IaaS, PaaS, and SaaS). However, these do not discuss application layer automated management services and their interdependency. The work on autonomic
computing [10] is not related to our work, as it is mainly concerned with physical layer management, which is very different from virtual and application layer
management. To the best of our knowledge, our work is the first to identify
middleware management services' interdependency at the application layer.

Middleware Services of Application Layer

Middleware Services of Application layer (which we also refer to as self-managed


services of the Application Layer) are about providing the Cloud Application Layer with
exceptional capabilities enabling it to automatically manage all applications running on the Cloud and their interdependencies, and to take appropriate actions in
emergencies. These should support application availability, reliability, scalability, resilience, and adaptability, while considering user requirements of security and
privacy by design. In this section we provide a set of conceptual models for these
services. We use these in subsequent sections to describe their interactions when
managing a multi-tier application architecture in the Cloud.
3.1

Adaptability

Adaptability is the ability to provide timely and efficient support of applications upon system changes and events. Adaptability should always ensure that the
overall system properties are preserved (e.g. security, resilience, availability and
reliability) when taking an action. The Adaptability service should automatically
decide on an action plan and then manage it by coordinating with other services
in the same layer or other layers.
Figure 1 provides a conceptual model of Adaptability services functions. This
Figure provides examples of Events and Changes, which Triggers the Adaptability
service. The Adaptability service in turn Performs Actions based on the Events
and Changes. The Actions also Triggers Cascaded Actions to other services in
both Application Layer and Virtual Layer. The Adaptability Service follows a set
of rules defined by cloud authorised employees defining Actions and Cascaded
Actions.
3.2

Resilience

Resilience in the application layer is the ability of the system to maintain an application's features (e.g. serviceability and security) despite a number of component


Fig. 1. Adaptability Service

Fig. 2. Resilience Service

failures. High resilience at application layer can be achieved by providing: high


resilience at the virtual layer and well-planned procedures, which we have discussed
in detail in [3]. High application layer resilience also requires application redundancy, which can be of two types: active/active or active/passive. Active/Passive
(also referred to as hot-standby) means the passive application can only
process requests once the Active application has failed. Active/Active, on the
other hand, means multiple copies of the same application process requests simultaneously. Resilient design helps in achieving higher availability and end-to-end
service Reliability, as its design approach focuses on tolerating and surviving
the inevitable failures rather than trying to reduce them. The Resilience service
communicates with other services to collaborate in providing end-to-end resilient
Cloud. Figure 2 provides a conceptual model for Resilience service functions that
should be provided to maintain the overall end-to-end application resilience.
This Figure provides examples of Single Point of Failure, which Triggers the Resilience service. As we see in Figure 2, the Adaptability Service first receives
a notification of Single Point of Failure events, and then it manages the events.
This management includes interacting with other services, of which we are
interested here in the Resilience Service.
The Resilience service in turn Performs Actions based on the Single Point of
Failure. If the Actions failed to guarantee resilience the Figure provides examples


Fig. 3. Scalability Service

of Cascaded Actions that are followed. Such Actions and Cascaded Actions follow
a set of rules defined by the Cloud's authorised employees.
3.3

Scalability

Scalability at the Application Layer means providing an application with the capability to quickly and efficiently adapt to the addition and removal of virtual
resources. For example, in peak periods the virtual layer scales resources up, and
similarly in off-peak periods the virtual layer should release unneeded resources.
These changes should be reflected at the application to support the addition and removal
of virtual resources. Also, they should not affect fundamental system properties
and should always respect user requirements (e.g. security and privacy). The
Adaptability service at the Virtual Layer (see [3] for a detailed description of Virtual Layer services), upon detecting a need for either adding resources (e.g. peak
period) or removing resources, instructs the virtual layer Scalability service to
do so. The virtual layer Scalability service should trigger the application layer
Adaptability service to adapt to changes in the Virtual Layer. The Adaptability service at the Application Layer then triggers the Scalability service at the
application layer to scale the application to adapt to such changes.
The scalability type at the virtual layer can be Horizontal Scalability, Vertical Scalability, or a combination of both. Horizontal Scalability is about the number of
instances that need to be added to or removed from a system to satisfy an increase
or decrease in demand. Vertical Scalability is about increasing or decreasing the
size of the instances themselves to handle an increase or decrease in demand. In this
regard, application layer scalability reacts differently to the two types of scalability. For example, Horizontal Scalability means the application will be replicated
at the newly created VMs, whereas Vertical Scalability means the application
needs to take advantage of the additionally allocated resources (e.g. increase memory usage, spawn additional child processes). Also, in both cases the Scalability
process needs to notify the Availability and Reliability services.
Figure 3 provides a conceptual model for application Scalability service. This
Figure provides the Actions from Adaptability service that triggers the Scalability
service. The Scalability service in turn Performs appropriate Actions.


3.4

Availability

Availability of a service represents the relative time a service provides its intended functions. High levels of availability are the result of excellent resilient
design.

Fig. 4. Availability Service

The Availability service at the application layer is in charge of distributing requests
coming to an application across all redundant application resources based on
their current load. If a resource is down or relatively overloaded, the Availability service should immediately stop diverting traffic to that resource, and
re-divert traffic to other active resources until the Adaptability service fixes the
problem or until the overloaded resource returns to normal processing capacity.
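The behaviour described above can be illustrated with a minimal sketch (not from the paper); the thresholds, resource names and health flags are hypothetical.

# Hypothetical illustration of the Availability service's request distribution:
# skip failed or overloaded resources and send the request to the least loaded one.
def pick_resource(resources, overload_threshold=0.8):
    # resources: list of dicts with 'name', 'up' (bool) and 'load' (0.0 - 1.0)
    candidates = [r for r in resources if r["up"] and r["load"] < overload_threshold]
    if not candidates:
        return None            # nothing available: Adaptability/Resilience must act
    return min(candidates, key=lambda r: r["load"])["name"]

replicas = [
    {"name": "app-vm-1", "up": True,  "load": 0.55},
    {"name": "app-vm-2", "up": False, "load": 0.10},   # failed: traffic no longer diverted here
    {"name": "app-vm-3", "up": True,  "load": 0.92},   # overloaded: skipped until load drops
]
print(pick_resource(replicas))   # -> app-vm-1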
Figure 4 provides a conceptual model for application Availability service. This
Figure provides examples of Events from Resilience and Changes from Scalability
service, which Triggers the Availability service. The Availability service in turn
Performs Actions based on the Events and Changes. The Actions also Trigger
Cascaded Actions to other services in both Application Layer and Virtual Layer.
3.5

Reliability

Reliability is related to the success with which a service functions [15]. High end-to-end service reliability implies that a service always provides correct results
and guarantees no data loss. Higher reliability of individual components, together
with an excellent architecture and well-defined management processes, helps in supporting higher resilience. This in turn increases end-to-end service reliability
and availability.
Reliability is of higher priority than the Availability service. Most importantly, it
ensures that end-to-end service integrity is maintained (i.e. no data loss and
correct service execution). If service integrity is affected in any way and cannot be immediately recovered, the Reliability service notifies the Availability
service to immediately bring the service, or part of the service, down. This is to ensure that data integrity is always protected. Simultaneously, the Adaptability and
Resilience services should automatically attempt to recover the system and notify system administrators in case a decision cannot be made automatically


Fig. 5. Reliability Service

(e.g. data corruption that requires manual intervention by an expert domain


administrator).
Figure 5 provides a conceptual model for application Reliability service. This
Figure provides examples of Events from Resilience, Events from Virtual Layer
Services, and Changes from Scalability service, which Triggers the Reliability
service. The Reliability service in turn Performs Actions and Cascaded Actions
based on the Events and Changes.
3.6

Security and Privacy

Security and Privacy at application layer is about ensuring Cloud user security
and privacy requirements are maintained by the environment surrounding the
application (it is important to re-stress that we are covering the middleware
services supporting the application and not the application itself). This for example includes (a.) protecting Cloud user data whilst in transit (transferred to
the Cloud and back to the client, and transferred between Cloud structure components), (b.) protecting the data whilst being processed by application, (c.)
protecting the data when transferred across Cloud services, (d.) protecting data
whilst in storage, and (e.) ensuring that the application runs at a pre-agreed
geographical location and also data stored at pre-agreed geographical location.
Security and privacy should be built into all other services as default option.
Figure 6 provides a conceptual model of Security and Privacy service at Application Layer. This Figure provides examples of Events and Application Services,
which trigger the Security and Privacy service. The Security and Privacy service
in turn takes Actions based on the Events or Application Services.
3.7

Summary of Services Interdependency

Figure 7 provides a summary for the interaction amongst Application Layer


middleware self-managed services, as we discuss throughout this section. This
Figure provides a high level overview and it is meant not to cover deep details


Fig. 6. Security and Privacy Service

Fig. 7. Application Layer Self Managed Services Interaction

for clarity. In this Figure Adaptability Service acts as the heart of self-managed
services. For example, it intercepts faults and changes in user requirements,
manages these by generating action plans, and delegates action plans to other
services. To be in a position to do this, the Adaptability Service communicates
with Resilience Service, Scalability Service, and Reliability Service.
The Resilience Service requires having redundant resources, which is represented by relation Maintains on Redundancy. Excellent resilient design results
in higher availability and reliability. This is indicated using Supports relation
between Resilience Service with Availability Service and Reliability Service.
Scalability Service (it starts based on Triggers received from Adaptability Service) instructs either Adapt to Vertical Scaling and/or Adapt to Horizontal Scaling processes. It also Notifies Availability Service and Reliability Service once
scaling is done.
The Reliability Service is linked with Integrity process using Must Provide
relation. The outcome of the Integrity process is fed to the Reliability Service.
If application integrity is affected in any way, the Reliability Service sends an
Integrity Failure message to both Availability Service and Adaptability Service.


Services Interaction for Multi-tier Application in the


Cloud

In this section we start by proposing a typical multi-tier application architecture in
the Cloud, and then discuss the required classes of middleware. We then describe
how it can be managed using the proposed services' conceptual models.
4.1

Application Architecture in the Cloud and Types of Middleware

Figure 8 illustrates the architecture of a multi-tier application in the Cloud. The
Application Layer in such a multi-tier architecture would typically be composed of the
following components.
1. Server Backend Application Is in charge of maintaining backend database
repository. The database repository runs in an appropriate container (e.g.
Oracle DBMS [17], Microsoft SQL Server [13], Derby [9]). The Server Backend Application would typically be hosted on a set of dedicated VMs, which
we refer to as Backend VMs.
2. Server Middle-tier Application Is in charge of running application business
logic functions that interact with the Client Frontend Application. The middle-tier application runs in an appropriate container (e.g. Apache/Tomcat [6], WebLogic [18], Oracle Application Server [16]), which would normally be
hosted and replicated across a set of VMs that we refer to as Middle-tier VMs.
Middle-tier VMs and backend VMs are usually separate and independent in
a production environment for several reasons (e.g. security, resource management, and resilience). These two sets of VMs could be combined and even
hosted on a single VM for development and test environment.
3. Client Frontend Application Client application could be combination of
HTML, JavaScript, Java Applets, or even a standalone application that
would need to communicate with the Cloud for special purposes (e.g. upload data on the Cloud for backup purposes, or be part of a supply chain
application). Client application could be stored at either Cloud customer
environment or inside Middle-tier VMs, based on the application nature.
The Cloud customer at run time (possibly downloads and) runs the Client
Frontend Application at client side.
For example, media organizations usually have editorial systems and online web systems. A media organization could move its online web systems to the Cloud and keep the editorial applications hosted on its local infrastructure. The organization's editorial employees use their local editorial applications when creating and editing stories. The organization's customers, on the other hand, access the online web systems from the Cloud. In this case the Client Frontend Application (for organization customers) is HTML/JavaScript, whereas the Client Frontend Application (for organization employees) is a standalone application, which transfers stories into the online web systems hosted in the Cloud.
Fig. 8. A Typical Multi-tier Application Architecture in the Cloud

Fig. 9. Middleware Types for a Multi-Tier Application in the Cloud
The proposed multi-tier application architecture requires a set of trustworthy middleware, as follows (see Figure 9).
1. Virtual Layer Middleware – This middleware intermediates the communication between the physical layer and the application layer. It should provide transparent infrastructure management services to the application layer via a set of self-managed services (see [3] for further details). The Application Layer Middleware requires these services to support trustworthy and resilient applications.
2. Application Layer Middleware – As discussed before, this middleware should provide transparent management services to server applications via a set of self-managed services. This middleware is conceptually composed of two parts: (a.) Server Middle-tier Middleware, which supports the Server Middle-tier Application, and (b.) Server Backend Middleware, which supports the Server Backend Application. These middleware should coordinate amongst each other to provide a trustworthy and resilient service between the Server Middle-tier Application and the Server Backend Application. They also need to coordinate with the other types of middleware to provide a trustworthy and resilient service between the Client Frontend Application and the Virtual Layer.
3. Client Frontend Middleware – This middleware should provide transparent management services for the Client Frontend Application via a set of self-managed services. The service functions should coordinate with the Server Middle-tier Middleware in order to provide a trustworthy service between the Client Frontend Middleware and the Server Middle-tier Middleware.
4.2 Middleware Services Interaction

In this section we use the conceptual models proposed in Section 3 to discuss middleware services interaction when managing the multi-tier architecture proposed earlier. Our discussion is based on providing several examples of the interaction amongst the Client Frontend Middleware, Server Middle-tier Middleware, and Server Backend Middleware to self-manage the overall application. For brevity, in this section we do not discuss the Virtual Layer Middleware except when absolutely necessary.
Client Frontend Middleware – Supporting the Client Frontend Application requires the following self-managed services (we do not discuss issues related to the customer environment's self-managed services, as they are outside the scope of this paper; for example, we do not discuss the Availability and Scalability services for this specific case).
1. Adaptability – This service is in charge of adapting the Client Frontend Application side to changes provided by the Cloud provider (i.e. the Server Middle-tier Middleware), e.g. changes in service location, degraded performance, and incidents. This enables the Adaptability service at the client side to take appropriate actions.
Examples of actions include (see Figure 1): (a.) on a change of service location, the Middle-tier Middleware's Adaptability service sends the new location to the Client Frontend Middleware's Adaptability service, and the client can then re-establish communication with the new location; (b.) on a change of performance due
to an emergency, the client could reduce its requests to the minimum or even do offline processing and then upload the result to the Cloud; and (c.) on security incidents, the client could temporarily follow an emergency plan. These are just sample examples, which would depend on the application nature. It is important to re-stress at this point that the application is not necessarily simple HTML, as it could be an interactive application that does processing at the Cloud customer's location and then communicates with the Cloud for follow-up processing.
2. Resilience – This service is about providing a resilient service at the client side when communicating with the Cloud (see Figure 2). The service, in this context, mainly attempts to re-establish failed communication with the Cloud (i.e. with the Server Middle-tier Middleware).
3. Reliability – This service is concerned with keeping the service reliable for the Client Frontend Application when communicating with the Cloud (see Figure 5). The service, in this context, ensures reliability when data is transferred/received to/from the Cloud, and ensures reliability when data is processed at the Client Frontend Application.
4. Security and Privacy – This is related to providing security measures at the Cloud customer side for the Client Frontend Application (see Figure 6). This, for example, includes (a.) protecting clients' data when retrieved from the Cloud and stored or processed at the client environment, and (b.) protecting data whilst being transferred to/from the Cloud.
The Server Middle-tier Middleware supports the Server Middle-tier Application and requires the following self-managed services.
1. Adaptability – This service is in charge of supporting changes and events that might affect the functions of the Server Middle-tier Application, as illustrated in Figure 1. Examples of these include: (a.) problems in the Cloud which require relocating the service to another location; the service communicates with the Client Frontend Middleware's Adaptability service to take an appropriate action; (b.) if the Server Middle-tier Application cannot be restarted because of hardware-related issues, the Adaptability service coordinates with the Adaptability service at all other dependent middleware (e.g. Virtual Layer Middleware and Client Frontend Middleware); and (c.) if the application cannot be restarted because of a dependency problem, the Adaptability service manages this by finding dependent applications and re-validating their availability.
2. Resilience – This service covers the following examples (see Figure 2): (a.) subject to the Client Frontend Application nature, the Resilience service re-establishes communication with the Client Frontend Middleware on failure; (b.) it re-establishes communication with the Server Backend Middleware on failure; (c.) it restarts the Server Middle-tier Application on failure; and (d.) if the application cannot be restarted because of an error (application, environment, or
others), the service follows an appropriate procedure based on the error nature (e.g. triggers the Adaptability service).
3. Scalability – This service is mainly concerned with the Server Middle-tier Application adaptability issues when the underlying hosting resources scale up/down. This covers (see Figure 3): (a.) scaling up the resources allocated to the VM hosting the Server Middle-tier Application, which requires the application to follow a set of processes, e.g. spawn further child processes; (b.) scaling up by adding a VM, which requires the application to follow a different process, e.g. notify the Availability service to redistribute the incoming load to the newly created VM and redistribute client sessions considering the new VM; and (c.) scaling down by removing the additional resources allocated in (a.) or removing the additional VM allocated in (b.), each of which requires following a somewhat reverse process and notifying the Availability service.
4. Availability – This service is in charge of distributing the load coming from the Client Frontend Application and Server Backend Application evenly across the Server Middle-tier Application's redundant resources. If a resource is down, the Availability process immediately stops diverting traffic to that resource, and re-diverts the traffic to other active resources until the Adaptability process fixes the problem. Also, when the hosting environment scales up/down, the Availability service reconsiders the distribution of incoming requests based on the nature of the scaling. These are illustrated in Figure 4.
5. Reliability – This service is concerned with keeping the service reliable for the Server Middle-tier Application when communicating with both the Server Backend Application and the Client Frontend Application. Examples of processes provided by this service include (see also Figure 5) the following: (a.) verifying reliability when data is transferred/received between applications, and (b.) verifying reliability whilst data is processed.
6. Security and Privacy – This is related to ensuring that the Cloud customer's security and privacy requirements are maintained by the environment surrounding the Server Middle-tier Application. This includes (see Figure 6) the following: (a.) protecting clients' data when retrieved from the Client Frontend Application, (b.) protecting data whilst being processed by the Server Middle-tier Application, (c.) protecting data when transferred to/from the Server Backend Application, (d.) protecting data on storage, and (e.) ensuring security and privacy are preserved for all other services (e.g. securing communication paths).
The Server Backend Middleware, which is required to support the Server Backend Application, requires the same services as the Server Middle-tier Middleware. The main difference is that this middleware does not communicate with the Client Frontend Middleware. It mainly protects the application that intermediates the communication between the Server Middle-tier Application and the backend storage, where data is eventually stored. This in turn means this middleware's services implementation would need to provide additional functions and security features for managing the database instance that interacts with the storage.
5 Discussion, Conclusion, and Research Direction

Cloud computing is complex and composed of enormous and heterogeneous resources that need to cooperate, exchange critical messages and coordinate amongst themselves. Such complexity of communication and coordination is error prone and subject to various security threats. This is especially the case as Cloud computing has only recently emerged into academic research from industry because of its promising potential as an Internet-scale computing infrastructure [7,11]. The lack of academic research that formally analyzes current Cloud infrastructure increases its vulnerabilities.
Cloud infrastructure is expected to support Internet-scale critical applications (e.g. hospital systems, smart grid systems). Critical infrastructures and even organizations will not outsource their critical resources to a public Cloud without strong assurance about its trustworthiness. Therefore, establishing a trustworthy Cloud infrastructure is the key factor for moving critical resources into the Cloud. In order to move in this direction for such a complex infrastructure, we virtually split the Cloud infrastructure into layers, as illustrated in Figure 8. Each layer relies on the services and resources provided by the layer directly underneath it, and each layer's services rely on messages communicated with both the layer directly underneath it and the layer above it. Each two adjacent layers have a specific middleware that provides self-managed services. These services' implementations are based on the layer they serve. Also, the different types of middleware services coordinate amongst themselves and exchange critical messages. Establishing trusted middleware services is paramount for providing a trustworthy Cloud infrastructure.
In our opinion, establishing trust in the Cloud requires two mutually dependent elements: (a.) supporting the Cloud infrastructure with proper mechanisms and tools helping Cloud providers to automate the process of managing, maintaining, and securing the infrastructure; and (b.) developing methods helping Cloud users and providers to establish trust in the infrastructure operation by continually assessing the operations of the Cloud infrastructure. In our previous work ([2]) we focused on point (b). We discussed point (a) in two papers: in this paper we mainly focus on application layer middleware services, and in our previous work ([3]) we outlined virtual layer services. We are planning to extend our work and build a trust model that clearly clarifies each middleware service's functional specifications in all discussed middleware types. The trust model will identify the interdependence across all types of middleware services. It should also clarify how the collaboration across middleware services would establish trust in the Cloud.

Acknowledgment
This research has been supported by the TCloud project (http://www.tClouds-project.eu), which is funded by the EU's Seventh Framework Programme ([FP7/2007-2013]) under grant agreement number ICT-257243. The author would like to thank Andrew Martin and Cornelius Namiluko for their discussion and valuable input. The author would also like to thank the IWTMP2PS 2011 anonymous reviewers for their comments.
References
1. Abbadi, I.M.: Clouds infrastructure taxonomy, properties, and management services. In: CloudComp 2011: To appear in Proceedings of the International Workshop on Cloud Computing – Architecture, Algorithms and Applications. LNCS. Springer, Berlin (2011)
2. Abbadi, I.M.: Operational trust in Clouds' environment. In: MOCS 2011: To appear in Proceedings of the Workshop on Management of Cloud Systems. IEEE Computer Society, Los Alamitos (2011)
3. Abbadi, I.M.: Self-Managed Services Conceptual Model in Trustworthy Clouds' Infrastructure. In: Workshop on Cryptography and Security in Clouds. IBM, Zurich (2011), http://www.zurich.ibm.com/~cca/csc2011/program.html
4. Abbadi, I.M.: Toward Trustworthy Clouds' Internet Scale Critical Infrastructure. In: Bao, F., Weng, J. (eds.) ISPEC 2011. LNCS, vol. 6672, pp. 71–82. Springer, Heidelberg (2011)
5. Amazon: Amazon Elastic Compute Cloud, Amazon EC2 (2010), http://aws.amazon.com/ec2/
6. Apache (2011), http://apache.org/
7. Armbrust, M., Fox, A., Griffith, R., Joseph, A.D., Katz, R.H., Konwinski, A., Lee, G., Patterson, D.A., Rabkin, A., Stoica, I., Zaharia, M.: Above the Clouds: A Berkeley View of Cloud Computing (2009), http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
8. Cloud Computing Use Case Discussion Group: Cloud computing use cases (2010), http://cloudusecases.org/Cloud_Computing_Use_Cases_Whitepaper4_0.odt
9. Derby (2011), http://db.apache.org/derby/
10. IBM: Autonomic computing (2001), http://www.research.ibm.com/autonomic/
11. Jeffery, K., Neidecker-Lutz, B.: The Future of Cloud Computing – Opportunities for European Cloud Computing Beyond 2010 (2010)
12. Mell, P., Grance, T.: The NIST Definition of Cloud Computing
13. Microsoft Corporation: Microsoft SQL Server (2008), http://www.microsoft.com/sqlserve
14. Sun Microsystems: Take Your Business to a Higher Level (2009)
15. Musa, J.D., Iannino, A., Okumoto, K.: Software Reliability: Measurement, Prediction, Application (professional ed.). McGraw-Hill, Inc., New York, USA (1990)
16. Oracle Application Server (2010), http://www.oracle.com/technetwork/middleware/ias/overview/index.html
17. Oracle DBMS (2011), http://www.oracle.com/us/products/database/index.html
18. Weblogic (2007), http://www.bea.com
19. Youseff, L., Butrico, M., Da Silva, D.: Toward a unified ontology of cloud computing. In: Proceedings of the Grid Computing Environments Workshop, pp. 1–10. IEEE, Los Alamitos (2008)
Attribute Based Anonymity for Preserving Privacy
Sri Krishna Adusumalli and V. Valli Kumari
Department of Computer Science and Systems Engineering, Andhra University
Visakhapatnam, Andhra Pradesh, India, 530 003
{srikrishna.au,vallikumari}@gmail.com

Abstract. Privacy preserving publication has become a major concern in this decade. Data holders simply publish datasets for mining and survey purposes with little knowledge of privacy issues. Current research has focused on statistical and Hippocratic databases to minimize the re-identification of data. Popular principles like k-anonymity, l-diversity, etc., were proposed in the literature to achieve privacy. There is a possibility that person-specific information may be exposed when the adversary ponders on different combinations of the attributes. In this paper, we analyse this problem and propose a method to publish the finest anonymized dataset that preserves both privacy and utility.
Keywords: Privacy, Attribute Generalization, Information Loss.

1 Introduction
The affordability of computation, memory and disk storage is enabling large volumes of person-specific data to be collected. Data holders with little knowledge about privacy are releasing the information and thus compromising privacy. On the other hand, the end users are also not aware of privacy issues, and several software giants like Google, Microsoft, etc., are tracking the search queries of individuals. In this regard, protecting data from re-identification has become the most challenging problem when important data like census, voter registration and medical information of patients is released by hospitals, financial institutions and government organizations for mining or survey purposes. Research towards protecting individuals' identity is being done extensively. In 2002, when medical data was linked with the voters' registration list, 87% of the USA population could be identified from the released data having gender, date of birth and zip code as attributes [1]. To avoid this breach, the data is anonymized by using generalization and suppression, which turned into a protection model named k-anonymity [1]. When the Netflix data set was de-anonymized, individuals' information was exposed [2]. AOL [3] removed their query logs immediately due to re-identification of person-specific information.
When data is published, the original data table (T) as shown in Table 1 is anonymized and the anonymized dataset (T') (Table 3) is released for mining purposes. The anonymized table does not contain any identifying attributes such as SID, Name, etc. Some attributes that might reveal information when linked with an external dataset are termed Quasi-Identifiers (QIDs), for example: zip code, age,
date of birth, etc. An elementary way of protecting the privacy of a dataset is to use k-anonymity. By definition, in the released anonymized data the quasi-identifier values of each record are similar to those of at least (k-1) other records. This is achieved with the help of generalization and suppression. The framework of k-anonymity is to generalize or suppress some values of the Quasi-Identifier attributes. Generalization [8] is achieved by replacing an attribute value with a more general (less specific) value. For example, if we consider the age attribute of a person, say Alice, to be 23, we transform it into the range [20-25], thereby preserving the semantic nature of the attribute value. Sometimes the generalization is achieved using a desired taxonomy tree, as shown in Fig. 1.
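A minimal sketch (Python) of the k-anonymity condition just described; the function and field names are ours and only illustrate the definition.

from collections import Counter

def is_k_anonymous(records, qids, k):
    """True if every combination of quasi-identifier values occurs in at
    least k records -- the k-anonymity condition described above."""
    groups = Counter(tuple(row[a] for a in qids) for row in records)
    return all(count >= k for count in groups.values())

# Toy usage with already-generalized values (e.g. Alice's age 23 -> "[20-25]"):
rows = [
    {"Age": "[20-25]", "Zipcode": "530**", "Gender": "Person"},
    {"Age": "[20-25]", "Zipcode": "530**", "Gender": "Person"},
]
print(is_k_anonymous(rows, ["Age", "Zipcode", "Gender"], k=2))  # True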
Table 1. Original Microdata Table

Name      Age  Zipcode  Gender  Disease
Alice     26   53053    Male    Cancer
Bob       28   53068    Male    HIV
Korth     20   53068    Male    Flu
Jane      24   53053    Female  HIV
Korth     51   54853    Male    HIV
Harry     58   54853    Female  HIV
Sandeep   44   54850    Male    Obesity
Jack      48   54850    Male    Flu
Mary      32   53053    Female  Flu
Patricia  38   53053    Female  Flu
Benny     35   53068    Female  HIV
Ally      33   53068    Female  Obesity

Table 2. Table without Identifiers

Age  Zipcode  Gender  Disease
26   53053    Male    Cancer
28   53068    Male    HIV
20   53068    Male    Flu
24   53053    Female  HIV
51   54853    Male    HIV
58   54853    Female  HIV
44   54850    Male    Obesity
48   54850    Male    Flu
32   53053    Female  Flu
38   53053    Female  Flu
35   53068    Female  HIV
33   53068    Female  Obesity

Fig. 1. Taxonomy Tree

In this paper we use generalization without any loss of generality. The main objective is that the anonymized dataset to be published should preserve both privacy and utility. Our proposed algorithm (Algorithm 1) preserves anonymity as well as good utility by measuring the information loss of the produced anonymized datasets, which are generated from combinations of attributes. To achieve this we adopted the k-anonymity principle for grouping the data. The remainder of this paper is organized as follows. Section 2 discusses the related work. The proposed work and information loss measurement are explained in Section 3, Section 4 analyses the complexity, and Section 5 concludes the paper.

2 Related Work
This section reviews some of the known works in the area. The statistical community addressed re-identification of person-specific data, but none of the approaches provided a better and efficient solution for providing anonymity. According to [4], statistical databases used for data mining and fraud detection were released to the miner after adding some noise to the data, but in doing so the integrity of the tuples deteriorated, leading to inappropriate use of the data. On the other side, some researchers introduced an aggregation technique where the data is classified into lower and higher types and then restricted in such a way that the higher type of the classified data cannot be inferred [5]. The drawbacks of the above mentioned methods were overcome to an extent by the k-anonymity model using generalization and suppression [1]. K-anonymity identifies the appropriate Quasi-Identifiers and then generalizes them to a higher level such that each anonymized group contains at least k tuples. Our proposed method adopts this principle, considers combinations of attributes, and produces different anonymized datasets, from which we select the finest by measuring the utility of the produced combinatorial datasets.

3 Proposed Work
One way of protecting privacy is to group the data such that person-specific data cannot be identified. In our proposed approach we adopted the k-anonymity principle [1]. Our work is twofold. We initially generate different possible anonymized datasets considering the taxonomy tree shown in Fig. 1. The anonymized dataset should provide privacy and utility for mining purposes. This goal is achieved in the second stage by calculating the information loss (IL) for all the anonymized datasets that were produced in the first stage. The T' table which has the lowest information loss will be published. This anonymized table is the finest dataset that provides both anonymization and utility.
3.1 Attribute Combination Based Generalization
By definition, k-anonymity applies the generalization and suppression techniques to the attribute values and then groups the data such that each anonymized group contains at least k similar tuples. Abiding by this principle, we initially consider the original table (DS) as shown in Table 1. Let DAi be a domain (attribute value set) of the dataset DS; we term the selected attributes Quasi-Identifiers.
Initially we give the original dataset and the k value to the algorithm (Algorithm 1). We apply it for every attribute domain DAi. First the selected Quasi-Identifier (QID) attribute, i.e., DA1, is chosen and the dataset is sorted in ascending order. After sorting, we calculate the count, i.e., the support (frequency) of each attribute value. If support(DAi[vj]) < k, we then generalize the attribute value to a higher level based on the taxonomy tree. We repeat this process (steps 4 to 11) for all the domain values until the support of every value is greater than or equal to k.
Table 3. Anonymized Dataset D1

Age      Zipcode        Gender  Disease
[20-28]  [53053-53068]  Person  HIV
[20-28]  [53053-53068]  Person  Flu
[20-28]  [53053-53068]  Person  HIV
[20-28]  [53053-53068]  Person  Cancer
[32-38]  [53053-53068]  Female  Flu
[32-38]  [53053-53068]  Female  Obesity
[32-38]  [53053-53068]  Female  HIV
[32-38]  [53053-53068]  Female  Flu
[44-58]  [54850-54858]  Person  Flu
[44-58]  [54850-54858]  Person  HIV
[44-58]  [54850-54858]  Person  Obesity
[44-58]  [54850-54858]  Person  HIV

Table 4. Anonymized Dataset D3

Zipcode        Age      Gender  Disease
53053          [20-38]  Person  Flu
53053          [20-38]  Person  HIV
53053          [20-38]  Person  Flu
53053          [20-38]  Person  Cancer
53068          [20-38]  Person  HIV
53068          [20-38]  Person  Obesity
53068          [20-38]  Person  HIV
53068          [20-38]  Person  Flu
[54850-54858]  [44-58]  Person  Flu
[54850-54858]  [44-58]  Person  HIV
[54850-54858]  [44-58]  Person  Obesity
[54850-54858]  [44-58]  Person  HIV

Once the selected Quasi-Identifier is generalized, we sort the dataset in ascending order based on the already-generalized Quasi-Identifier and the next Quasi-Identifier (step 15). For every tuple Tx in the dataset, if the values DAi[Vx] and DAi[Vx+1] are not equal, then for every tuple Ty to Tx, if the support count of DAj[vj] < k, we generalize the attribute value and repeat steps 18 to 23 of the algorithm (Algorithm 1) until the support of all the attribute values is >= k.
Table 5. Anonymized Dataset D5

Gender  Age      Zipcode        Disease
Female  [20-58]  [53053-54858]  HIV
Female  [20-58]  [53053-54858]  Obesity
Female  [20-58]  [53053-54858]  Flu
Female  [20-58]  [53053-54858]  HIV
Female  [20-58]  [53053-54858]  Flu
Female  [20-58]  [53053-54858]  HIV
Male    [20-58]  [53053-54858]  Obesity
Male    [20-58]  [53053-54858]  Flu
Male    [20-58]  [53053-54858]  Cancer
Male    [20-58]  [53053-54858]  HIV
Male    [20-58]  [53053-54858]  Flu
Male    [20-58]  [53053-54858]  HIV

Algorithm 1. Anonymized Dataset Constructor

Input: An original dataset DS and a value k
Output: Anonymized datasets [D1-DN]
Method: Constructing anonymized datasets based on attribute ordering
1.  Begin
2.    For each attribute domain DAi in DS[DA1, DA2, ..., DAn] do
3.      sort DS in ascending order based on DAi
4.      For each attribute value DAi[vj] of DAi in DS do
5.        count support(DAi[vj])   // frequency of each
6.                                 // attribute value
7.        if (support(DAi[vj]) < k) then
8.          generalize the attribute value one level up based on the taxonomy
9.          tree
10.       end if
11.     end for
12.     Repeat steps 4 to 11 until the support of every attribute value is greater
13.     than or equal to k
14.     For each DAj in [[DA1, DA2, ..., DAn] - [DAi]] in DS do
15.       sort DS in ascending order based on DAi and DAj
16.       For each tuple Tx in DS do
17.         if (DAi[Vx] != DAi[Vx+1]) then
18.           For each tuple Ty to Tx in DS do
19.             count support(DAj[vj])
20.             if (support(DAj[vj]) < k) then
21.               generalize the attribute value one level up based on the
22.               taxonomy tree
23.             end if
24.           end for
25.           Repeat steps 18 to 23 until the support of every attribute value is
26.           greater than or equal to k
27.         end if
28.       end for
29.     end for
30.   end for
31. end Begin
When the anonymized dataset constructor algorithm (Algorithm 1) is applied on the data of Table 2, different datasets based on attribute orderings are produced. The possible tables are D1 {Age, Zipcode, Gender}, D2 {Age, Gender, Zipcode}, D3 {Zipcode, Age, Gender}, D4 {Zipcode, Gender, Age}, D5 {Gender, Age, Zipcode}, and D6 {Gender, Zipcode, Age}. But datasets D1 & D2, D3 & D4, and D5 & D6 turn out to be similar when the different combinations are taken, and hence datasets D1, D3 & D5 are anonymized.
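A simplified Python sketch of the idea behind Algorithm 1 for a single attribute ordering is given below; it assumes a helper generalize(attr, value) that returns the parent of value in the attribute's taxonomy tree (e.g. 26 -> "[20-28]" -> "[20-38]"), and it omits the generation of all orderings. The names are ours, not the authors'.

from collections import Counter
from itertools import groupby

def _generalize_until_k(rows, attr, k, generalize):
    """Lift values of `attr` one level up the taxonomy until every value
    appearing in `rows` has support >= k (cf. steps 4-13 and 18-26)."""
    while True:
        support = Counter(r[attr] for r in rows)
        rare = {v for v, c in support.items() if c < k}
        if not rare:
            return
        changed = False
        for r in rows:
            if r[attr] in rare:
                new_value = generalize(attr, r[attr])
                changed = changed or new_value != r[attr]
                r[attr] = new_value
        if not changed:            # top of the taxonomy reached
            return

def anonymize(records, qids, k, generalize):
    """Sketch of Algorithm 1 for one ordering of the quasi-identifiers:
    the first QID is generalized over the whole table; every later QID is
    generalized inside each group formed by the preceding QIDs."""
    rows = sorted((dict(r) for r in records), key=lambda r: str(r[qids[0]]))
    _generalize_until_k(rows, qids[0], k, generalize)
    for j, attr in enumerate(qids[1:], start=1):
        rows.sort(key=lambda r: tuple(str(r[a]) for a in qids[: j + 1]))
        for _, group in groupby(rows, key=lambda r: tuple(r[a] for a in qids[:j])):
            _generalize_until_k(list(group), attr, k, generalize)
    return rows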
3.2 Information Loss
When the anonymized datasets are generated, the information loss for every anonymized dataset is calculated. In general the information loss is calculated based on the distance between the original Quasi-Identifiers and their corresponding anonymized ones using the generalized loss metric [6], which is a ratio-based metric. In this paper, since we use a taxonomy tree for each quasi-identifier attribute, the information loss for numerical and categorical attributes is defined as follows. We limit the information loss calculation to generalization only. When a numerical Quasi-Identifier attribute is considered, for every value the information loss ILnum is calculated as

ILnum = (UI - LI) / (Max - Min)                                    (1)

where [LI, UI] are the lower and upper limits of the interval of attribute I to which the value is generalized based on the taxonomy tree, and Min, Max are the smallest and largest values of attribute I. For any categorical attribute, if r and r' are the data values before and after generalization, V is the node corresponding to r', LV is the number of leaf nodes of the subtree rooted at V, and L is the total number of leaf nodes of the taxonomy tree, the information loss ILcat is given by

ILcat = (LV - 1) / (L - 1)                                         (2)
For instance, consider the dataset D1. According to equations (1) and (2), the information losses for age, zipcode and gender for the first tuple are (28-20)/(58-20) = 0.21, (53068-53053)/(54858-53053) = 0.0083 and (2-1)/(2-1) = 1, respectively. So the total
Table 6. Information Loss for Different Anonymized Datasets

Dataset/QIDs   Age    Zipcode  Gender  Total
D1             0.245  0.007    0.666   0.306
D2             0.245  0.007    0.666   0.306
D3             0.438  0.001    1       0.479
D4             0.438  0.001    1       0.479
D5             1      1        0       0.666
D6             1      1        0       0.666
Fig. 2. Information Loss

information loss for the first tuple is (0.21+0.0083+1)/3 = 0.406. This process is repeated for all the tuples of D1, and the total (average) information loss for D1 is 0.306. Table 6 shows the information losses for all the datasets, attribute-wise, together with the total information loss. The dataset with the least information loss is considered the finest dataset for publishing. A graph showing the information loss for the different anonymized datasets is shown in Fig. 2.
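A small Python sketch of equations (1) and (2) and of the per-tuple averaging used in the worked example; the function names are ours.

def il_numeric(lower, upper, attr_min, attr_max):
    # Eq. (1): width of the generalized interval over the attribute's range.
    return (upper - lower) / (attr_max - attr_min)

def il_categorical(leaves_under_value, total_leaves):
    # Eq. (2): share of taxonomy leaves covered by the generalized value.
    return (leaves_under_value - 1) / (total_leaves - 1)

# Worked example from the text (first tuple of dataset D1):
age = il_numeric(20, 28, 20, 58)                  # ~0.21
zipcode = il_numeric(53053, 53068, 53053, 54858)  # ~0.0083
gender = il_categorical(2, 2)                     # 1.0 ("Person" covers both leaves)
print(round((age + zipcode + gender) / 3, 3))     # ~0.406, as in the text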

4 Complexity
The complexity analysis of our approach is presented in this section. Let n be the number of tuples and m be the number of Quasi-Identifier attributes. Initially, the first attribute domain is sorted in ascending order using a simple merge sort algorithm with complexity O(n log n). Then the support count of each attribute value is calculated, so the complexity so far is n log n + n. Once this is done, we perform the above process (steps 14 to 28 of the algorithm (Algorithm 1)) for the remaining (m-1) attributes, whose complexity is (m-1)(n log n + n). Hence the overall complexity of the algorithm (Algorithm 1) for m quasi-identifier attributes is m((n log n + n) + (m-1)(n log n + n)) ≈ m²·n log n.

5 Conclusions and Future Work


In this paper, we proposed a privacy protection mechanism based on attribute generalization that considers combinations of the attributes. When the data is generalized attribute-wise, we protect the identity of the individuals by grouping the data. To achieve this we used the k-anonymity principle. The proposed algorithm (Algorithm 1) generates different anonymized datasets based on combinations of the quasi-identifiers. We then determine the information loss of the generated datasets and thereby decide the finest anonymized dataset for publishing based on the
utility. The experimentation showed that the approach is practically feasible and the
information loss can be found. We limited our work to Quasi Identifiers. In future we
would like to focus on l-diversity [7] i.e., sensitive attribute based grouping.
Acknowledgments. This work was supported by Grant SR/S3/EECE/0040/2009 from
Department of Science and Technology (DST), Government of India. We thank the
anonymous reviewers for their insightful comments.

References
1. Sweeney, L.: k-Anonymity: A model for protecting privacy. International Journal on Uncertainty, Fuzziness and Knowledge-based Systems 10(5), 557–570 (2002)
2. Narayanan, A., Shmatikov, V.: Robust De-anonymization of Large Datasets (February 5, 2008)
3. Hansell, S.: AOL removes search data on vast group of web users. New York Times (August 8, 2006)
4. Kim, J.: A method for limiting disclosure of microdata based on random noise and transformation. In: Section on Survey Research Methods of the American Statistical Association, pp. 328–387 (2001)
5. Denning, D., Lunt, T.: A multilevel relational data model, pp. 220–234. IEEE, Oakland (1987)
6. Iyengar, V.S.: Transforming data to satisfy privacy constraints. In: SIGKDD (Special Interest Group on Knowledge Discovery and Data Mining), pp. 279–288. ACM, New York (2002)
7. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: l-Diversity: Privacy beyond k-anonymity. In: Proc. 22nd Intl. Conf. Data Engg. (ICDE), p. 24 (2006)
8. Aggarwal, G., Feder, T., Kenthapadi, K., Motwani, R., Panigrahy, R., Thomas, D., Zhu, A.: k-Anonymity: Algorithms and hardness. Stanford University, Stanford (2004)
An Anonymous Authentication and Communication Protocol for Wireless Mesh Networks
Jaydip Sen
Innovation Lab, Tata Consultancy Services Ltd.,
Bengal Intelligent Park, Salt Lake Electronics Complex, Kolkata 700091, India
Jaydip.Sen@tcs.com

Abstract. Wireless mesh networks (WMNs) have emerged as a key technology for next-generation wireless broadband networks, showing rapid progress and inspiring numerous compelling applications. A WMN comprises a set of mesh routers (MRs) and mesh clients (MCs), where MRs are connected to the Internet backbone through the Internet gateways (IGWs). The MCs are wireless devices and communicate among themselves over possibly multi-hop paths, with or without the involvement of MRs. User privacy and security have been primary concerns in WMNs due to their peer-to-peer network topology, shared wireless medium, stringent resource constraints, and highly dynamic environment. Moreover, to support real-time applications, WMNs must also be equipped with robust, reliable and efficient communication protocols so as to minimize the end-to-end latency and packet drops. Design of a secure and efficient communication protocol for WMNs, therefore, is of paramount importance. In this paper, we propose a security and privacy protocol that provides security and user anonymity while maintaining communication efficiency in a WMN. The security protocol ensures secure authentication and encryption in the access and the backbone networks. User anonymity, authentication and data privacy are achieved by application of a protocol that is based on Rivest's ring signature scheme. Simulation results demonstrate that while the protocols have minimal storage and communication overhead, they are robust and provide a high level of security and privacy to the users of the network services.
Keywords: Wireless mesh network (WMN), user anonymity, security, authentication, key management, Rivest's ring signature scheme, privacy.

1 Introduction
Wireless mesh networking has emerged as a promising concept to meet the challenges of next-generation wireless networks, such as providing a flexible, adaptive, and reconfigurable architecture while offering cost-effective solutions to service providers. WMNs are multi-hop wireless networks formed by mesh routers (which form a wireless mesh backbone) and mesh clients. The mesh routers provide rich radio mesh connectivity which significantly reduces the up-front deployment cost of the network. Mesh routers are typically stationary and do not have power constraints. However, the clients are mobile and energy-constrained. Some mesh routers are designated as
gateway routers which are connected to the Internet through a wired backbone.
A gateway router provides access to conventional clients and interconnects ad hoc,
sensor, cellular, and other networks to the Internet. A mesh network can provide multi-hop communication paths between wireless clients, thereby serving as a community
network, or can provide multi-hop paths between the client and the gateway router,
thereby providing broadband Internet access to the clients.
As WMNs become an increasingly popular replacement technology for last-mile connectivity for home, community and neighborhood networking, it is imperative to design efficient resource management protocols for these networks. However, several vulnerabilities currently exist in various protocols for WMNs. These vulnerabilities can be exploited by attackers to degrade the performance of a network. The absence of a central point of administration makes WMN protocols vulnerable to various types of attacks. Security is, therefore, an issue of prime importance in WMNs [1]. Since in a WMN the traffic of an end user is relayed via multiple wireless mesh routers, preserving the privacy of the user data is also a critical requirement [2]. The majority of the current security and privacy protocols for WMNs are extensions of protocols originally designed for mobile ad hoc networks (MANETs), and therefore their performance is suboptimal.
Keeping this problem in mind, this paper presents a novel security protocol for node authentication and message confidentiality in WMNs. In addition, it also presents a user anonymization scheme that ensures secure authentication of the mesh clients (i.e., the user devices) while protecting their privacy.
The key contributions of the paper are as follows: (i) it proposes a novel security protocol for the mesh client nodes and the mesh routers; (ii) for protecting user privacy while providing a secure authentication framework for the mesh clients (user devices), it presents a novel anonymization scheme that utilizes the essential idea of Rivest's ring signature scheme [3].
The rest of this paper is organized as follows. Section 2 describes related work on security and privacy in WMNs. Section 3 presents the details of the architecture of a WMN and the assumptions made for the development of the proposed protocols. Sections 4 and 5 describe the proposed security and privacy protocols, respectively. Section 6 presents some performance results of the proposed scheme, and Section 7 highlights some future scope of work and concludes the paper.

2 Related Work
Since security and privacy are two extremely important issues in any communication network, researchers have worked on these two areas extensively. However, as compared to MANETs and wireless sensor networks (WSNs), WMNs have received very little attention in this regard. This section briefly discusses some of the existing mechanisms for ensuring security and privacy in communications in WMNs.
In [4], a standard mechanism has been proposed for client authentication and access control to guarantee a high level of flexibility and transparency to all users in a wireless network. The users can access the mesh network without requiring any change in their devices and software. However, client mobility can pose severe
problems to the security architecture, especially when real-time traffic is transmitted. To cope with this problem, proactive key distribution has been proposed [5, 6].
Providing security in the backbone network of WMNs is another important challenge. Mesh networks typically employ resource-constrained mobile clients, which are difficult to protect against removal, tampering, or replication. If a device can be remotely managed, a distant hacking into the device would work perfectly [7]. Accordingly, several research works have investigated the use of cryptographic techniques to achieve secure communication in WMNs. In [8], a security architecture has been proposed that is suitable for multi-hop WMNs employing PANA (Protocol for carrying Authentication for Network Access) [9]. In the scheme, the wireless clients are authenticated on production of the cryptographic credentials necessary to create an encrypted tunnel with the remote access router to which they are associated. Even though such a framework protects the confidentiality of the information exchanged, it cannot prevent adversaries from performing active attacks against the network itself. For instance, a malicious adversary can replicate, modify and forge the topology information exchanged among mesh devices in order to launch a denial of service attack. Moreover, PANA necessitates the existence of IP addresses in all the mesh nodes, which poses a serious constraint on the deployment of this protocol.
Authenticating transmitted data packets is an approach for preventing unauthorized nodes from accessing the resources of a WMN. A light-weight hop-by-hop access protocol (LHAP) has been proposed for authenticating mobile clients in wireless dynamic environments, preventing resource consumption attacks [10]. LHAP implements light-weight hop-by-hop authentication, where intermediate nodes authenticate all the packets they receive before forwarding them. LHAP employs a packet authentication technique based on the use of one-way hash chains. Moreover, LHAP uses the TESLA protocol [11] to reduce the number of public key operations for bootstrapping and maintaining trust between nodes.
In [12], a lightweight authentication, authorization and accounting (AAA) infrastructure is proposed for providing continuous, on-demand, end-to-end security in heterogeneous networks including WMNs. The notion of a security manager is used by employing an AAA broker. The broker acts as a settlement agent, providing security and a central point of contact for many service providers.
The issue of user privacy in WMNs has also attracted the attention of the research community. In [2], a light-weight privacy-preserving solution is presented to achieve a well-maintained balance between network performance and traffic privacy preservation. At the center of the solution is an information-theoretic metric called traffic entropy, which quantifies the amount of information required to describe the traffic pattern and characterizes the performance of traffic privacy preservation. The authors have also presented a penalty-based shortest-path routing algorithm that maximally preserves traffic privacy by minimizing the mutual information of traffic entropy observed at each individual relaying node, while controlling performance degradation within the acceptable region. An extensive simulation study proves the soundness of the solution and its resilience to cases in which two malicious observers collude. However, one of the major problems of the solution is that the algorithm is evaluated in a single-radio, single-channel WMN; the performance of the algorithm in a multi-radio, multi-channel scenario remains questionable. Moreover, the solution has a scalability problem.
In [13], a mechanism is proposed with the objective of hiding an active node that connects to a gateway router, where the active mesh node has to remain anonymous. A novel communication protocol is designed to protect the node's privacy using both cryptography and redundancy. This protocol uses the concept of onion routing [14]. A mobile user who requires anonymous communication sends a request to an onion router (OR). The OR acts as a proxy for the mobile user and constructs an onion route consisting of other ORs using the public keys of the routers. The onion is constructed such that the innermost part is the message for the intended destination, and the message is wrapped by being encrypted successively with the public keys of the ORs in the route. The mechanism protects the routing information from insider and outsider attacks. However, it has a high computation and communication overhead.
None of the above propositions, however, addresses all the security problems of a typical WMN. Most of the schemes handle security issues at a specific layer and therefore fail to protect against attacks that span multiple layers of the protocol stack of a WMN. This paper proposes a security and privacy framework that addresses issues both at the access and the backbone networks while not affecting the network performance.
3 WMN Security Architecture


In this section, we first present a standard architecture of a typical WMN for which we propose a security and privacy protocol. The architecture is a very generic one that represents the majority of real-world deployment scenarios for WMNs. The architecture of a hierarchical WMN consists of three layers, as shown in Fig. 1. At the top layer are the Internet gateways (IGWs) that are connected to the wired Internet. They form the backbone infrastructure for providing Internet connectivity to the elements at the second level. The entities at the second level are called wireless mesh routers (MRs); they eliminate the need for wired infrastructure at every MR and forward their traffic in a multi-hop fashion towards the IGWs. At the lowest level are the mesh clients (MCs), which are the wireless devices of the users. Internet connectivity and peer-to-peer communications inside the mesh are two important applications of a WMN. Therefore, the design of an efficient and low-overhead communication protocol which ensures the security and privacy of the users is a critical requirement that poses significant research challenges. For the design of the proposed protocol and to specify the WMN scenario, the following assumptions are made.
(1) Each MR which is authorized to join the wireless backbone (through the IGWs) has two certificates to prove its identity. One certificate is used during the authentication phase that occurs when a new node joins the network; EAP-TLS [15] for 802.1X authentication is used for this purpose since it is the strongest authentication method provided by EAP [15]. The second certificate is used for the authentication with the authentication server (AS).
(2) The certificates used for authentication with the RADIUS server and the AS are signed by the same certificate authority (CA). Only recognized MRs are authorized to join the backbone.
(3) Synchronization of all MRs is achieved by use of the Network Time Protocol (NTP) [16].
Fig. 1. The three-tier architecture of a wireless mesh network (WMN)

The proposed security protocol serves the dual purpose of providing security in the access network (i.e., between the MCs and the MRs) and in the backbone network (i.e., between the MRs and the IGWs). These are described in the following sub-sections.
3.1 Access Network Security
The access mechanism to the WMN is assumed to be the same as that of a local area network (LAN), where mobile devices authenticate themselves and connect to an access point (AP). This allows the users to access the services of the WMN exploiting the authentication and authorization mechanisms without installing any additional software. It is evident that such a security solution provides protection to the wireless links between the MCs and the MRs. A separate security infrastructure is needed for the links in the backbone network. This is discussed in Section 3.2.

Fig. 2. Secure information exchange among the MCs A and B through the MRs 1 and 2

Fig. 2 illustrates a scenario where users A and B are communicating in a secure way with MRs 1 and 2, respectively. If the wireless links are not protected, an intruder M will be able to eavesdrop on and possibly manipulate the information being exchanged over the network. This situation is prevented in the proposed security scheme, which encrypts all the traffic transmitted on the wireless link using a stream cipher at the data link layer of the protocol stack.
3.2 Backbone Network Security


To provide security for the traffic in the backbone network, a two-step approach is adopted. When a new MR joins the network, it first presents itself as an MC and completes the association formalities. It subsequently upgrades its association by successfully authenticating to the AS. In order to make such an authentication process efficient in a high-mobility scenario, the key management and distribution processes have been designed so as to minimize the effect of the authentication overhead on the network performance. An overview of the protocol is given as follows.

Fig. 3. Steps performed by a new MR (N) using backbone encrypted traffic to join the WMN

Fig. 3 shows the three phases of the authentication process that an MR (say N) undergoes. When N wants to join the network, it scans all the radio channels to detect any MR that is already connected to the wireless backbone. Once such an MR (say A) is detected, N requests A for access to network services, including authentication and key distribution. After connecting to A, N can perform the tasks prescribed in the IEEE 802.11i protocol to complete a mutual authentication with the network and establish a security association with the entity to which it is physically connected. This completes Phase I of the authentication process. Essentially, during this phase, a new MR performs all the steps that an MC has to perform to establish a secure channel with an MR for authentication and secure communication over the WMN.
During Phase II of the authentication process, the MRs use the TLS protocol. Only authorized MRs that have the requisite credentials can authenticate to the AS and obtain the cryptographic credentials needed to derive the key sequence used to protect the wireless backbone. In the proposed protocol, an end-to-end secure channel between the AS and the MR is established at the end of a successful authentication, through which the cryptographic credentials can be exchanged in a secure way.
To eliminate any possibility of the same key being used over a long time, two protocols are proposed for secure key management. These protocols are presented in Section 4. As mentioned earlier in this section, all the MRs are assumed to be synchronized with a central server using NTP.
Fig. 4. Autonomous configuration of the MRs in the proposed security scheme

Fig. 4 shows a collection of four MRs connected with each other by five wireless links. MR A is connected with the AS by a wired link. At the time of network bootstrapping, only node A can connect to the network as an MR, since it is the only node that can successfully authenticate to the AS. Nodes B and C, which are neighbors of A, then detect a wireless network to which they can connect and perform the authentication process following the IEEE 802.11i protocol. At this point, nodes B and C are successfully authenticated as MCs. After their authentication as MCs, nodes B and C are allowed to authenticate to the AS and request the information used by A to produce the cryptographic key currently used for communication in the network. After having derived such a key, both B and C will be able to communicate with each other, as well as with node A, using the ad hoc mode of communication in the WMN. At this stage, B and C both have full MR functionalities. They will be able to turn on their access interfaces to provide node D a connection to the AS for joining the network.

4 The Key Distribution Protocol


In this section, the details of the proposed key distribution and management protocol are presented. The protocol is essentially a server-initiated protocol [17] and provides the clients (MRs and MCs) with flexibility and autonomy during key generation.
4.1 Server Initiated Key Management Protocol
The proposed key management protocol delivers the keys to all the MRs from the AS
in a reactive manner. The keys are used subsequently by the MRs for a specific time
interval in their message communications to ensure integrity and confidentiality of the
messages. After the expiry of the time interval for validity of the keys, the existing
keys are revoked and new keys are generated by the AS. Fig. 5 depicts the message
exchanges between the MRs and the AS during the execution of the protocol.
A newly joined MR, after its successful mutual authentication with a central server,
sends its first request for key list (and its time of generation) currently being used by
other existing MRs in the wireless backbone. Let us denote the key list timestamp as
TSKL. Let us define a session as the maximum time interval for validity of the key list
currently being used by each node MR and MC). We also define the duration of a
session as the product of the cardinality of the key list (i.e., the number of the keys in
the key list) and the longest time interval of validity of a key (the parameter timeout in
Fig. 5). The validity of a key list is computed from the time instance when the list is
generated (i.e., TSKL) by the AS. An MR, based on the time instance at which it joins the backbone (tnow in Fig. 5), can find out the key (from the current list) being used by its peers (keyidx) and the remaining validity interval of that key (Ti) using (1) and (2) as follows:

keyidx = floor((tnow - TSKL) / timeout) + 1                        (1)

Ti = keyidx * timeout - (tnow - TSKL)                              (2)

In the proposed protocol, each WMN node requests the AS for the key list that will be used in the next session before the expiry of the current session. This feature is essential for nodes which are located multiple hops away from the AS, since responses from the AS take a longer time to reach these nodes. The responses may also get delayed due to fading or congestion in the wireless links. If the nodes send their requests for the key list to the AS just before the expiry of the current session, then, due to the limited time in hand, only the nodes which have good-quality links with the AS will receive the key list. Hence, the nodes which fail to receive responses from the server will not be able to communicate in the next session due to the non-availability of the current key list. This will lead to an undesirable situation of network partitioning.

Fig. 5. The message exchanges between an MR and the AS in the key management protocol

The key index value that triggers the request from the nodes to the server can be set equal to the difference between the cardinality of the list and a correction factor. The correction factor can be estimated based on parameters like the network load, the distance of the node from the AS and the time required for the previous response. In the proposed protocol, the correction factor is estimated based on the time taken to receive the response from the AS using (3), where ts is the time instance when the first key request was sent and tr is the time instance when the key response was received from
the AS, and timeout is the validity period of a key. Therefore, if a node fails to receive a response (i.e., the key list) from the AS within timeout, and the response takes a time tlast, it must send the next request to the AS before setting the last key.

c = (tlast - timeout) / timeout    if tlast >= timeout
c = 0                              if tlast < timeout           (3)
where tlast = tr - ts
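A small sketch (Python) of the correction factor in (3) as reconstructed above; the function name is ours.

def correction_factor(t_sent, t_received, timeout):
    """Eq. (3): how many key intervals the previous key response overran,
    used to decide how early the next key-list request must be sent."""
    t_last = t_received - t_sent
    if t_last < timeout:
        return 0
    return (t_last - timeout) / timeout

print(correction_factor(t_sent=10, t_received=25, timeout=60))   # 0
print(correction_factor(t_sent=10, t_received=160, timeout=60))  # 1.5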

The first request for the key list sent by the new node to the AS is forwarded by the peer to which it is connected as an MC through the wireless access network. However, subsequent requests are sent directly over the wireless backbone.

5 The Privacy and Anonymity Protocol


As mentioned in Section 1, to ensure the privacy of the users, the proposed security protocol is complemented with a privacy protocol so as to ensure user anonymity and privacy. The same authentication server (AS) used in the security protocol is used for managing the key distribution for preserving privacy. To enable user authentication and anonymity, a novel protocol has been designed extending the ring signature authentication scheme in [18]. It is assumed that a symmetric encryption algorithm E exists such that for any key k, the function Ek is a permutation over b-bit strings. We also assume the existence of a family of keyed combining functions Ck,v(y1, y2, ..., yn), and a publicly defined collision-resistant hash function H(.) that maps arbitrary inputs to strings of constant length which are used as keys for Ck,v(y1, y2, ..., yn) [3]. Every keyed combining function Ck,v(y1, y2, ..., yn) takes as input the key k, an initialization b-bit value v, and arbitrary values y1, y2, ..., yn. A user Ui who wants to generate a session key with the authentication server uses a ring of n logged-on users and performs the following steps.
Step 1: Ui chooses the following parameters: (i) a large prime pi such that it is hard
to compute discrete logarithms in GF(pi), (ii) another large prime qi such that qi | pi
1, and (iii) a generator gi in GF(pi) with order qi.
Step 2: Ui chooses x Ai Z qi as his private key, and computes the public
key y A = g ix Ai mod pi .
i
mod qi

Step 3: Ui defines a trap-door function fi(α, β) = α · y_Ai^(α mod qi) · gi^β mod pi. Its inverse function fi^(-1)(y) is defined as fi^(-1)(y) = (α, β), where α and β are computed as follows (K is a random integer in Z_qi):

α = y · gi^(-K·(gi^K mod pi) mod qi) mod pi                                     (4)

α* = α mod qi                                                                   (5)

β = K·(gi^K mod pi) - x_Ai·α* mod qi                                            (6)

Ui makes pi, qi, gi and y_Ai public, and keeps x_Ai secret.


The authentication server (AS) chooses: (i) a large prime p such that it is hard to compute discrete logarithms in GF(p), (ii) another large prime q such that q | p - 1, (iii) a generator g in GF(p) with order q, and (iv) a random integer xB from Zq as its private key. AS computes its public key yB = g^xB mod p and publishes (yB, p, q, g).
Anonymous authenticated key exchange: the key exchange is initiated by the user Ui and involves three rounds to compute a secret session key between Ui and the AS. The operations in these three rounds are as follows:
Round 1: when Ui wants to generate a session key on behalf of the n ring users U1, U2, ..., Un, where 1 ≤ i ≤ n, Ui does the following:
(i) Ui chooses two random integers x1, xa ∈ Zq* and computes the following:

R = g^x1 mod p, Q = (yB^x1 mod p) mod q, X = g^xa mod p and l = H(X, Q, V, yB, I).


(ii) Ui chooses a pair of values (αt, βt) for every other ring member Ut (1 ≤ t ≤ n, t ≠ i) in a pseudorandom way, and computes yt = ft(αt, βt) mod pt.
(iii) Ui randomly chooses a b-bit initialization value v, and finds the value of yi from the equation Ck,v(y1, y2, ..., yn) = v.


(iv) Ui computes (αi, βi) = fi^(-1)(yi) by using the trap-door information of fi. First, it chooses a random integer K ∈ Z_qi, computes αi using (4), and keeps K secret. It then computes αi* using (5) and finally computes βi using (6).
(v) σ = (U1, U2, ..., Un, v, V, R, (α1, β1), (α2, β2), ..., (αn, βn)) is the ring signature on X.


Finally, Ui sends σ and I to the server AS.
Round 2: AS does the following to recover and verify X from the signature σ.
(i) AS computes Q = (R^xB mod p) mod q, recovers X using X = V·g^(-Q) mod p, and hashes X, Q, V and yB to recover l, where l = H(X, Q, V, yB, I).
(ii) AS computes yt = ft(αt, βt) mod pt, for t = 1, 2, ..., n.
(iii) AS checks whether Ck,v(y1, y2, ..., yn) = v. If it is true, AS accepts X as valid; otherwise, AS rejects X. If X is valid, AS chooses a random integer xb from Zq*, and computes the following: Y = g^xb mod p, Ks = X^xb mod p and h = H(Ks, X, Y, I'). AS sends {h, Y, I'} to Ui.


Round 3: Ui verifies whether Ks' comes from the server AS. For this purpose, Ui computes Ks' = Y^xa mod p and hashes Ks', X, Y and I' to get h' using h' = H(Ks', X, Y, I'). If h' = h, Ui accepts Ks' as the session key.
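The Round-3 check can be illustrated with the following toy C sketch; the small parameters (p = 467, q = 233, g = 4), the stand-in hash H4 and the identifier value are assumptions made only to keep the example self-contained, and a real implementation would use a big-integer library and a cryptographic hash such as SHA-256.

#include <stdio.h>

typedef unsigned long long u64;

static u64 modexp(u64 b, u64 e, u64 m)        /* square-and-multiply */
{
    u64 r = 1;
    b %= m;
    while (e) {
        if (e & 1) r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

static u64 H4(u64 a, u64 b, u64 c, u64 d)     /* stand-in for H(.), not cryptographic */
{
    u64 v[4] = {a, b, c, d};
    u64 h = 1469598103934665603ULL;
    for (int i = 0; i < 4; i++) {
        h ^= v[i];
        h *= 1099511628211ULL;
    }
    return h;
}

int main(void)
{
    u64 p = 467, g = 4;                        /* g has prime order q = 233 in GF(p) */
    u64 xa = 57, xb = 91;                      /* ephemeral secrets of Ui and the AS  */
    u64 I = 12345;                             /* stand-in for the identifier I'      */

    u64 X  = modexp(g, xa, p);                 /* X = g^xa mod p (hidden in the ring signature) */
    u64 Y  = modexp(g, xb, p);                 /* sent by the AS in Round 2           */
    u64 Ks = modexp(X, xb, p);                 /* AS side: Ks = X^xb mod p            */
    u64 h  = H4(Ks, X, Y, I);                  /* AS side: h = H(Ks, X, Y, I')        */

    u64 Ks2 = modexp(Y, xa, p);                /* Round 3: Ks' = Y^xa mod p           */
    u64 h2  = H4(Ks2, X, Y, I);                /* h' = H(Ks', X, Y, I')               */
    printf("session key %llu accepted: %s\n", Ks2, h2 == h ? "yes" : "no");
    return 0;
}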


Security analysis: The key exchange scheme satisfies the following requirements.
User anonymity: For a given signature X, the server can only be convinced that the
ring signature is actually produced by at least one of the possible users. If the actual
user does not reveal the seed K, the server cannot determine the identity of the user.
The strength of the anonymity depends on the security of the pseudorandom number
generator. It is not possible to determine the identity of the actual user in a ring of
size n with a probability greater than 1/n. Since the values of k and v are fixed in a ring signature, there are (2^b)^(n-1) tuples (x1, x2, ..., xn) that satisfy the equation Ck,v(y1, y2, ..., yn) = v, and the probability of generating each (x1, x2, ..., xn) is the same. Therefore, the signature cannot leak the identity information of the user.
Mutual authentication: in the proposed scheme, not only does the server verify the users, but the users can also verify the server. Because of the hardness of inverting the trap-door function fi(.) without the private key, it is computationally infeasible for an attacker to determine (αi, βi), and hence it is infeasible for him to forge a signature. If the attacker wants to masquerade as the AS, he needs to compute h = H(Ks, X, Y, I'). He requires xB in order to recover X. However, xB is the private key of the AS, to which the attacker has no access.
Forward secrecy: the forward secrecy of a scheme refers to its ability to protect the keys of previous sessions when an attacker gets hold of the key of a particular session. The forward secrecy of a scheme enables it to prevent replay attacks. In the proposed scheme, since xa and xb are both selected randomly, the session key of each period has no relation to those of other periods. Therefore, if the session key generated in period j is leaked, the attacker cannot get any information about the session keys generated before period j. The proposed protocol is, therefore, resistant to replay attacks.

6 Performance Evaluation
The proposed security and privacy protocols have been implemented in the Qualnet
network simulator, version 4.5 [19]. The simulated network consists of 50 nodes randomly distributed in the simulation area forming a dense WMN. The WMN topology
is shown in Fig. 6, in which 5 nodes are MRs and the remaining 45 are MCs. Each MR has 9
MCs associated with it. To evaluate the performance of the security protocol, first the
network is set as a full-mesh topology, where each MR (and also MC) is directly
connected to two of its neighbors. In such a scenario, the throughput of a TCP
connection established over a wireless link is measured with the security protocol
activated in the nodes. The obtained results are then compared with the throughput
obtained on the same wireless link protected by a static key to encrypt the traffic.
After 10 simulation runs, the average throughput of a wireless link between a pair of MRs was found to be 30.6 Mbps when the link is protected by a static key. However, the average throughput for the same link was 28.4 Mbps when the link was protected by the proposed security protocol. The results confirm that the protocol does not cause any significant overhead on the performance of the wireless link, since the throughput of a link decreased on average by only about 7%.


Fig. 6. The simulated network topology in Qualnet Simulator

The impact of the security protocol for key generation and revocation on the packet drop rate in real-time applications was also studied in the simulation. For this purpose, a VoIP application was invoked between two MRs, which generated UDP traffic on the wireless link. The packet drop rates in the wireless links were studied when the links are protected with the security protocol and when they are protected with a static key. The transmission rate was set to 1 Mbps. The average packet drop rate over 10 simulation runs was found to be only 4%. The results clearly demonstrate that the proposed security scheme has no adverse impact on the packet drop rate even if several key switching (regeneration and revocation) operations are carried out.
The performance of the privacy protocol is also analyzed in terms of its storage and
communication overhead. Both storage and communication overhead were found to
increase linearly with the number of nodes in the network. In fact, it has been analytically shown that overhead due to cryptographic operation on each message is: 60n +
60 bytes, where n represents the number of public key pairs used to generate the ring
signature [20]. It is clear that the privacy protocol has a low overhead.

7 Conclusion and Future Work


WMNs have become an important focus area of research in recent years owing to
their great promise in realizing numerous next-generation wireless services. Driven by
the demand for rich and high-speed content access, recent research has focused on
developing high performance communication protocols, while security and privacy
issues have received relatively little attention. However, given the wireless and multihop nature of communication, WMNs are subject to a wide range of security and
privacy threats. This paper has presented a security and user-privacy preserving protocol for WMNs. The proposed security protocol ensures security in both the access
and the backbone networks, whereas the privacy protocol enables anonymous authentication of the users. Simulation results have shown the effectiveness of the protocol.
Future research issues include the study of a distributed and collaborative system
where the authentication service is provided by a dynamically selected set of MRs.
The integration with the current centralized scheme would increase the robustness of
the proposed protocol, maintaining a low overhead since MRs would use the distributed service only when the central server is not available.


References
1. Sen, J.: Secure Routing in Wireless Mesh Networks. In: Funabiki, N. (ed.) Wireless Mesh Networks. InTech (2011), http://www.intechopen.com/articles/show/title/securerouting-in-wireless-mesh-networks
2. Wu, T., Xue, Y., Cui, Y.: Preserving Traffic Privacy in Wireless Mesh Networks. In: Proc. of WoWMoM (2006)
3. Rivest, R., Shamir, A., Tauman, Y.: How to Leak a Secret. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248, pp. 552-565. Springer, Heidelberg (2001)
4. Mishra, A., Arbaugh, W.A.: An Initial Security Analysis of the IEEE 802.1X Standard. UM Computer Science Department Technical Report CS-TR-4328 (2002)
5. Kassab, M., Belghith, A., Bonnin, J.-M., Sassi, S.: Fast Pre-Authentication Based on Proactive Key Distribution for 802.11 Infrastructure Networks. In: Proc. of WMuNeP, pp. 46-53 (2005)
6. Prasad, A., Wang, H.: Roaming Key Based Fast Handover in WLANs. In: Proc. of IEEE WCNC, vol. 3, pp. 1570-1576 (2005)
7. Ben Salem, N., Hubaux, J.-P.: Securing Wireless Mesh Networks. IEEE Wireless Communications 13(2), 50-55 (2006)
8. Cheikhrouhou, O., Maknavicius, M., Chaouchi, H.: Security Architecture in a Multi-Hop Mesh Network. In: Proc. of SAR (2006)
9. Parthasarathy, M.: Protocol for Carrying Authentication and Network Access (PANA) Threat Analysis and Security Requirements. RFC 4016 (2005)
10. Zhu, S., Xu, S., Setia, S., Jajodia, S.: LHAP: A Lightweight Network Access Control Protocol for Ad Hoc Networks. Ad Hoc Networks 4(5), 567-585 (2006)
11. Perrig, A., Canetti, R., Song, D., Tygar, J.: Efficient and Secure Source Authentication for Multicast. In: Proc. of NDSS, pp. 35-46 (2001)
12. Prasad, N., Alam, M., Ruggieri, M.: Light-Weight AAA Infrastructure for Mobility Support across Heterogeneous Networks. Wireless Personal Communications 29 (2004)
13. Wu, X., Li, N.: Achieving Privacy in Mesh Networks. In: Proc. of SASN, pp. 13-22 (2006)
14. Reed, M., Syverson, P., Goldschlag, D.: Anonymous Connections and Onion Routing. IEEE Journal on Selected Areas in Communications 16, 482-494 (1998)
15. Aboba, B., Blunk, L., Vollbrecht, J., Carlson, J., Levkowetz, H.: Extensible Authentication Protocol (EAP). RFC 3748 (2005)
16. Mills, D.L.: Network Time Protocol. RFC 1305 (1992)
17. Martignon, F., Paris, S., Capone, A.: MobiSEC: A Novel Security Architecture for Wireless Mesh Networks. In: Proc. of Q2SWinet, pp. 35-42 (2008)
18. Cao, T., Lin, D., Xue, R.: Improved Ring Authenticated Encryption Scheme. In: Proc. of JICC, pp. 341-346 (2004)
19. Network Simulator QUALNET, http://www.scalable-networks.com
20. Xiong, H., Beznosov, K., Qin, Z., Ripeanu, M.: Efficient and Spontaneous Privacy-Preserving Protocol for Secure Vehicular Communication. In: Proc. of ICC, pp. 1-6 (2010)

Data Dissemination and Power Management in


Wireless Sensor Networks
M. Guerroumi1, N. Badache2, and S. Moussaoui3
USTHB, Faculty of Electronic and Computing Department,
Algiers, Algeria
mguerroumi@usthb.dz, Nbadache@wissal.dz,
moussaoui_samira@yahoo.fr

Abstract. Data dissemination in wireless sensor networks is an important mechanism and a very interesting research issue. In such an environment, data dissemination is generally performed from the sensor nodes to the sink. Thus, each sensor node can be implicitly provided with the direction in which to forward sensing data towards the sink. The inherent characteristics of sensor nodes, such as limited battery and limited computing capability, make classical forwarding mechanisms unsuitable for this kind of network. In this paper, we present a new energy-efficient data dissemination protocol called Data Dissemination and Power Management Protocol (DDPM). In this protocol, we propose a new energy management scheme using a dynamic power threshold. First, in the initialization phase, the sensor nodes are organized into clusters and a cluster head is selected for each cluster. Second, in the data dissemination phase, the cluster head collects and transmits the sensing data according to the data dissemination process. The simulation results show that the proposed protocol reduces the energy consumption and prolongs the network lifetime.
Keywords: wireless sensor networks, data dissemination, energy efficiency,
dynamic power threshold, power management.

1 Introduction
Wireless sensor networks contain a large number of sensor nodes that communicate wirelessly [8]. Each node is equipped with a radio transceiver or other wireless communications device, a small processing unit, and an energy source, usually a battery. These networks are characterized by their easy deployment and low maintenance cost. In computer science and telecommunications, wireless sensor networks are an active research area with numerous workshops and conferences arranged each year. Therefore, sensor networks are the focus of significant research efforts on account of their diverse applications, which include disaster recovery, military surveillance, health administration, environmental monitoring and complex physical systems [10].
The main task of a wireless sensor network is the monitoring of a large area. Usually, the end user wants to extract information from the sensor field; this information is gathered by the sensor nodes and disseminated to the sink.


Moreover, this information can be sensed and disseminated to the sink node and then forwarded to the end user without being requested by the latter. A possible solution to disseminate the data of interest is to let the sensor nodes use the flooding technique. Nevertheless, this technique produces a high traffic load and consumes a lot of energy resources.
Data dissemination is the process by which the sensing data is transmitted from the source sensor node to the sink. It consists of determining the optimal path on which the information will be disseminated. The characteristics of sensor networks, such as the high node density and the limited energy, require specific data dissemination protocols. The aim of our research is therefore to design and validate a data dissemination protocol that uses the concept of aggregation to minimize energy consumption.
The remaining parts of this paper are organized in the following way. Section 2 reviews a set of related data dissemination protocols and summarizes some recent works. Section 3 presents the parameters and the assumptions of our environment. Section 4 presents our proposal and gives the necessary description of the different concepts used in our design. Performance analysis and simulation results are presented in Section 5. Section 6 concludes this paper.

2 Related Works
Several data dissemination protocols for sensor networks have been proposed in the literature to address the data communication problem in these networks. The LEACH protocol proposed in [2] is one of the first hierarchical data dissemination approaches for sensor networks. LEACH has been considered an energy-efficient protocol that can extend the lifetime of the network compared with other protocols [2]. This protocol organizes the sensor nodes into clusters; the elected cluster heads collect the data from their sensor nodes, aggregate them and transmit them directly to the sink node, and these cluster heads are changed and re-elected periodically. TEEN is a data dissemination protocol based on the clustering technique, proposed by Manjeshwar et al. [4]. TEEN uses the same strategy as LEACH to create the clusters, but adopts a different approach during the data transmission phase. In this phase, TEEN uses two parameters called the hard threshold and the soft threshold to determine whether the collected data need to be transmitted. PEGASIS [13] is another data dissemination protocol designed for sensor networks which improves on LEACH. In this protocol, a sensor node communicates only with its closest neighbors and waits for its turn to transmit its data to the sink node. CODE [1] is a protocol based on a virtual grid structure, where each cell of the grid contains a node called the coordinator playing the role of an intermediate node; only these coordinator nodes take part in the data dissemination process. This protocol is principally inspired by previous works such as GAF [6], [7]. TTDD [14] considers sink mobility by constructing grid networks for each data source and selecting a grid node as the communication portal of mobile data sinks.
Another protocol, SPIN [3], considers end-to-end communications in sensor networks; it supposes that two sensor nodes can communicate between themselves without any interference with other nodes. This protocol also supposes that the energy consumption does not constitute a constraint, and that data are never lost.


In the directed diffusion protocol [15], [FHR04], data are inherently dispersed with the physical object and retrieved via queries transferred to the object through the network. It also envisions that querying and monitoring the physical space may rely on multicast mechanisms. The PDDD protocol [5] tries to overcome the disadvantage of the multicast mechanisms used in the directed diffusion protocol. It eliminates the gradient algorithm of directed diffusion and exploits the information of neighbor nodes.
According to user importance, SAFE [12] considers service differentiation between data sinks, allowing each data sink to specify its desired data update rate. This aspect entails the provision of multiple levels of data freshness.
Another protocol, MMSPEED [11], represents an evolution of the quality-of-service-oriented protocols. MMSPEED offers several transmission speeds and establishes more than one route from the source node to the destination. Each offered speed defines a level of temporal QoS, and each additional route helps to improve the quality of the traffic. These two mechanisms respectively make it possible to respect the degree of criticality of each application, to transmit the data within the required time, and to avoid frequently encountered problems such as congestion and packet loss.

3 System Model and Assumptions


We consider a large-scale sensor network with a large number of sensor nodes scattered randomly. Each node acts either as a source that senses information from the environment or as a router that forwards data through the sensor field to the interested users. In this paper, we consider only the detected events occurring in the sensor network; the end users frequently and randomly receive the newly detected events. Therefore, sensor nodes should be preconfigured to send a notification if a new event satisfies some parameters, for example if the temperature rises above or falls below a predefined value.
In this environment we assume that:
- Each sensor node is aware of its own geographic location using location services such as GPS [16].
- After having been deployed, sensor nodes remain stationary at their initial locations.
- The sensor nodes are stationary and homogeneous, and each sensor node has constrained battery energy.
- Sensor nodes communicate with sinks via multi-hop paths.

4 Data Dissemination and Power Management Protocol (DDPM)


The proposed protocol is based on an indexed virtual grid structure, the same as that used in [1], where each cell represents a cluster that contains a selected head. The selected head collects the data from all the nodes of its group, aggregates them and transmits them to the base station using multi-hop communication.


The proposed solution starts with an initialization phase, during which the virtual grid is constructed and a head is selected for each group. The data dissemination process then starts in the second phase.
During the second phase, the head of the group receives the data collected by the nodes of its group and transmits them to the interested sinks hop by hop; the next hop is defined by the indices of the next cell in the grid.
During the dissemination phase, each group head performs the necessary data aggregation to decrease the number of transmitted packets. When the head receives data from one of its members, it ignores all identical reports received within the next T seconds, where T is the time necessary for the transmission of data between the two farthest nodes in the group.

Fig. 1. Grid cell (cluster)

According to Figure 1 above:

R² = r² + r²  =>  R = r√2, so T = r√2 / v                                        (1)

where v is the radio speed and r is the cell size.


This operation reduces the number of transmitted packets. Moreover, when an event is observed by several nodes, only one message is transmitted to the sink and the other messages are ignored.
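A minimal C sketch of this duplicate-suppression rule at a cluster head is given below; the event-type identifiers, the clock values and the numeric parameters are illustrative assumptions, while the window T follows (1).

#include <math.h>
#include <stdio.h>

#define MAX_TYPES 16

static double last_seen[MAX_TYPES];          /* last forwarding time per event type */

/* Returns 1 if the report must be forwarded to the sink, 0 if it is ignored. */
int should_forward(int type, double now, double r, double v)
{
    double T = r * sqrt(2.0) / v;            /* Eq. (1): worst-case intra-cell delay */
    if (now - last_seen[type] < T)
        return 0;                            /* same event already forwarded         */
    last_seen[type] = now;
    return 1;
}

int main(void)
{
    double r = 50.0, v = 3.0e8;              /* assumed cell size (m) and radio speed (m/s) */
    printf("%d %d\n", should_forward(1, 1.0, r, v),
                      should_forward(1, 1.0 + 1e-9, r, v));   /* prints 1 then 0 */
    return 0;
}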
4.1 Initialization Phase
The initialization phase begins just after the deployment of the network; it aims to prepare the sensor nodes for the second phase, the data dissemination process. During this first phase, the virtual grid is constructed using the geographical positions of the sensor nodes and a head is selected for each cell.
- Geographical position: each sensor node calculates its geographical position using GPS [9]. This information is necessary for the virtual grid construction and the data dissemination procedure.
- Virtual grid: in the virtual grid, each cell is identified by its coordinates (Cx, Cy), and each sensor node calculates the coordinates of its cell using the following formula [1]:


(2)

where r is the cell size.


Fig. 2. Cell size (r)

The size r of a cell in the grid is based on the communication range R; the cell is a square of size r (Fig. 2).
To ensure the communication between all the nodes of neighboring cells, the distance between the two most distant nodes in two neighboring cells must be less than R.
For example, in Fig. 2, the most distant nodes are node 1 of cell A and node 2 of cell D; the distance between nodes 1 and 2 must be less than the communication range R in order for them to be able to communicate. This is translated mathematically into the following formula:

4r² + 4r² ≤ R²  =>  r ≤ R/√8  =>  r ≤ R/(2√2)                                    (3)
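A small C sketch of this grid geometry follows; the bound on r comes from (3), while the floor-based cell-coordinate computation and the numeric radio range are assumptions consistent with the indexed virtual grid of [1].

#include <math.h>
#include <stdio.h>

typedef struct { int cx, cy; } cell_t;

/* Largest admissible cell size for a radio range R, from Eq. (3). */
static double max_cell_size(double R) { return R / (2.0 * sqrt(2.0)); }

/* Assumed cell indexing: (Cx, Cy) = (floor(x/r), floor(y/r)). */
static cell_t cell_of(double x, double y, double r)
{
    cell_t c = { (int)floor(x / r), (int)floor(y / r) };
    return c;
}

int main(void)
{
    double R = 100.0;                         /* assumed radio range (m)        */
    double r = max_cell_size(R);              /* largest admissible cell size   */
    cell_t c = cell_of(120.0, 75.0, r);
    printf("r = %.2f m, node (120, 75) lies in cell [%d, %d]\n", r, c.cx, c.cy);
    return 0;
}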

- Selection of the cluster head: the selection of the cluster head depends on the energy availability of the sensor nodes. In each cluster, only the sensor node which has the highest available energy, greater than a predetermined threshold, can be selected as head. Initially, all the nodes have the same probability of becoming a cluster head.
- Sink table: each cluster head creates and maintains in its cache a Sink_table that contains SinkId, SinkPos and Data_type, where:
SinkId: the sink identifier.
SinkPos: the geographical position of the sink.
Data_type: the type of data in which the sink is interested.
After the construction of the virtual grid (Fig. 3) and the selection of the cluster heads, each sink advertises the above information, which permits the cluster heads to update their Sink_table.


Fig. 3. Virtual grid

The Sink_table corresponding to the head of cluster [0, 0] is:

Table 1. Sink table

SinkId    Pos      Data_type
Sink1     [1,0]    T1
Sink1     [1,0]    T2
Sink2     [0,1]    T1

4.2 Data Dissemination Process


In our solution, the data dissemination process is based on the virtual grid built in the first phase; the cluster head collects and transmits the observed data using the virtual grid. The data dissemination is therefore carried out at two levels: locally, between the source sensor node and its cluster head within the same cluster, and externally, between the cluster heads, from the head of the cluster in which the event has been sensed to the sink.
Local level
When a sensor node observes an event, it sends a data message containing the type of the observed phenomenon. This message is received locally by its cluster head, which is then responsible for transmitting it to the interested sink.
External level
At this level the data dissemination is based on the virtual grid, such that only the cluster heads participate and collaborate to deliver the sensing data to the


interested sink. Therefore, when a head receives the sensing data, it determines the next destination head; this next hop is selected as follows:
(X,Y) : Cluster coordinates of the current node.
(Xsink,Ysink) : Cluster coordinates of the sink node.
(Nexthop.X,Nexthop.Y) : Cluster coordinates of next head.
Select_Nexthop()
{
  if (Xsink = X) then
    { nexthop.X = X;
      Select_Y(); }
  else if (Xsink > X) then
    { nexthop.X = X + 1;
      Select_Y(); }
  else
    { nexthop.X = X - 1;
      Select_Y(); }
}

Select_Y()
{
  if (Ysink = Y) then nexthop.Y = Y;
  else if (Ysink > Y) then nexthop.Y = Y + 1;
  else nexthop.Y = Y - 1;
}
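For instance, if the current cluster is [0, 0] and the sink lies in cell [3, 2], successive applications of Select_Nexthop() route the data through the heads of cells [1, 1], [2, 2] and finally [3, 2].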
4.2.1 Dynamic Power Threshold and Cluster Head Management
As mentioned above, the cluster head is responsible for disseminating the sensing data from the source sensor node to the sink, which consumes more energy. Using a fixed energy threshold to determine the cluster head allows the selected node to act as head only once in its lifetime; it would be the first node to die, and in this way the nodes would die one by one, so the network performance would decrease.
In order to make the proposed protocol more energy efficient and prolong the lifetime of the sensor nodes, we define the following formula:

New_threshold = Old_threshold - (Old_threshold / k)                              (4)

where k is a positive integer.
We also define two fixed thresholds, threshold_max and threshold_min. The threshold_max is the initial threshold, and the threshold_min is the lowest energy that still permits the cluster head to advertise its energy exhaustion in order to select another head.


Initially, the available energy of the selected cluster head must be greater than threshold_max. After a certain time, the energy of the head decreases, and when it falls below threshold_min the head sends a Select_newHeader(Old_threshold) message in order to select a new head. The nodes which have a residual energy higher than the threshold sent by the head reply by sending their available energy (Ack_newLeader(Residual_Energy)).
The sensor node which has the highest residual energy is selected as the new cluster head. In the worst case, no sensor node replies to the new-head selection message, which means that all the sensor nodes have a residual energy lower than the specified threshold (Old_threshold). In this case, the head defines a new threshold using formula (4) above and selects a new cluster head according to it.
However, if the residual energy of the current cluster head is less than or equal to threshold_min, it sends a Death_Leader(Old_threshold) message to advertise its energy exhaustion, and the nodes cooperate among themselves in the same manner to choose as the new cluster head the node which has the highest residual energy.
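The following C sketch summarizes this head management logic; the numeric values of threshold_max, threshold_min and k are assumptions, the threshold relaxation implements (4), and the exact energy level at which Select_newHeader is issued follows our reading of the description above.

#include <stdio.h>

#define THRESHOLD_MAX 10.0     /* KW, initial threshold (assumed value)               */
#define THRESHOLD_MIN  0.5     /* KW, lowest energy allowing a hand-over (assumed)    */
#define K              4       /* positive integer of Eq. (4) (assumed value)         */

/* Called when no node answered Select_newHeader(old_threshold): Eq. (4). */
double relax_threshold(double old_threshold)
{
    return old_threshold - old_threshold / K;
}

/* Decision taken by the current head after its residual energy changed. */
const char *head_action(double residual, double threshold)
{
    if (residual <= THRESHOLD_MIN)
        return "send Death_Leader(threshold)";          /* energy exhausted           */
    if (residual < threshold)
        return "send Select_newHeader(threshold)";      /* ask for a replacement head */
    return "keep acting as cluster head";
}

int main(void)
{
    double threshold = THRESHOLD_MAX;
    printf("%s\n", head_action(12.0, threshold));
    printf("%s\n", head_action(6.0, threshold));
    printf("relaxed threshold: %.2f KW\n", relax_threshold(threshold));
    return 0;
}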
4.2.2 An Empty Cluster
Generally, in wireless sensor networks the nodes are deployed randomly. Therefore, the node density in the sensing field is variable. In our proposal, it is quite possible, after the construction of the virtual grid, to find some empty clusters (Fig. 4). Moreover, after a certain time, all the sensor nodes of a given cluster may die.
During the data dissemination process, the next hop can fall on an empty cluster; in this case the data dissemination cannot be carried out.
To solve this problem, we require the cluster head of the next hop to deliver an acknowledgement. The source cluster head must therefore wait for an Ack_recept message from the next selected hop. If no Ack_recept message has been received within the next period T, it selects another next head to which it disseminates the sensed event.


Fig. 4. An empty cluster


The period T is twice the time necessary for a message to traverse the distance between the two most distant nodes in two neighboring clusters (Fig. 5).

Fig. 5. Period t = t1+ t2

In Fig. 5 above, the most distant nodes are node 1 in cluster A and node 2 in cluster D; t is estimated mathematically as follows:

t = t1 + t2 = r√8/v + r√8/v = 2r√8/v  =>  t = 4r√2/v                             (5)
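A small C sketch of this fallback follows; the cell size, radio speed and timing values are illustrative assumptions, and the timeout implements (5).

#include <math.h>
#include <stdio.h>

/* Eq. (5): maximum time to wait for Ack_recept from the next-hop head. */
double ack_timeout(double r, double v) { return 4.0 * r * sqrt(2.0) / v; }

int main(void)
{
    double r = 50.0, v = 3.0e8;               /* assumed cell size (m), radio speed (m/s) */
    double sent_at = 0.0, now = 1.2e-6;       /* assumed send time and current time (s)   */
    if (now - sent_at > ack_timeout(r, v))
        printf("no Ack_recept: pick another neighbouring cluster head\n");
    else
        printf("keep waiting for Ack_recept\n");
    return 0;
}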

5 Performance Evaluation
This section discusses the performance evaluation results of our proposed protocol. The evaluation has been carried out by simulation using the GloMoSim simulator [16]. To simulate the sensing data, sensor nodes are randomly chosen to detect and send new sensing events during the simulation time. The sensing process follows a Poisson model with an average interval fixed at 60 seconds. This interval has been varied between 1 second and 60 seconds in order to simulate the network load and the energy consumption when many detected events occur. In this simulation, the energy consumption, the response time (latency) from the source node which detects the event to the sink node, and the traffic have been evaluated according to different metrics. Moreover, the proposed protocol has been compared with LEACH and CODE using the same parameters in the same simulator.
Table 2 below shows the parameters of our environment. The default dimension of the network is 1000 x 1000 m². In order to test the scalability, the number of nodes can reach 250 on a site of 5000 x 5000 m².
5.1 Energy Consumption
This parameter represents the average energy consumed by a node. It is calculated using the following formula:


Energy_consumption = NbMsg_Sent * Eng_TX + NbMsg_Received * Eng_RX + Standby_energy_consumption          (6)

NbMsg_Sent: number of sent messages.
NbMsg_Received: number of received messages.
Eng_TX: energy of a transmitted packet.
Eng_RX: energy of a received packet.
Standby_energy_consumption: energy consumed during the standby mode.
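The bookkeeping of (6) can be sketched in C as follows; the packet counts and per-packet energy costs are illustrative assumptions.

#include <stdio.h>

/* Per-node energy consumption according to Eq. (6). */
double energy_consumption(long sent, long received,
                          double eng_tx, double eng_rx, double standby)
{
    return sent * eng_tx + received * eng_rx + standby;
}

int main(void)
{
    double e = energy_consumption(1200, 3400, 0.002, 0.001, 0.5);
    printf("energy consumed: %.3f (same unit as the per-packet costs)\n", e);
    return 0;
}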
Table 2. Configuration parameters

Parameter                           Default value    Variation interval
Number of nodes                     100              100 - 700
Initial energy available (KW)       100
threshold_max (KW)
threshold_min (KW)
Bandwidth (Mbps)
New event detection period (S)      60               1 - 60
Simulation time duration (M)        15

Fig. 6. Energy consumption and power threshold (average energy consumption in KWh versus the threshold_max value, varied from 3 to 60 KW)


The figure above (Fig. 6) represents the evolution of the energy consumption according to the dynamic power threshold (threshold_max) in our proposal. The aim of this experiment is to determine the optimal value of the power threshold_max, which is then used for the remaining simulation tests.
This experiment clearly shows that the optimal threshold value is 10 KW. This value represents 10% of the initial power of 100 KW. When threshold_max increases, the energy consumption increases as well, because the cluster head swap procedure is executed much more often and many control packets are generated; consumption therefore increases.
Figure 7 below represents the evolution of the energy consumption according to the density of sensor nodes. It also permits us to evaluate the impact of scalability on energy consumption.
For LEACH, the energy consumption is very high compared with the two other protocols. It remains stable for densities between 100 and 500 nodes/km², and then increases when the density reaches 600 and 700 nodes/km². This is explained by the dynamic group creation procedure, which needs to send a high number of control messages. Concerning the CODE protocol and our proposal, the energy consumption increases with the density. This is caused by the flooding technique used by some nodes in the network to communicate their coordinates.

Fig. 7. Energy consumption and density (energy consumption in KWh versus node density in nodes/km², for CODE, DDPM and LEACH)

Comparing these protocols, CODE is more effective than LEACH, because the latter consumes more energy for the creation and re-creation of the dynamic groups, whereas CODE uses a static virtual grid. Moreover, our protocol appears more effective than CODE, because the initialization and request-transfer phases used in CODE have been eliminated.
The figure below (Fig. 8) shows the evolution of the energy consumption according to the detected event frequency.


Fig. 8. Energy consumption and detected event frequency (energy consumption in KWh versus detected event frequency, for CODE, DDPM and LEACH)

For the three protocols, the energy consumption is not strongly affected by the number of detected events, which means that most of the energy consumption comes from the protocol design itself.
5.2 Response Time
The response time is the average duration needed to disseminate the detected data from the source to the sink node. The average time is calculated as follows:

TpsAcc = ( Σ Resp_time_i ) / N,  i ∈ [1, N]                                      (7)

Resp_time = Received_time - Sent_time                                            (8)

Sent_time: the time at which the data was sent by the source node.
Received_time: the time at which the data was received by the sink node.
Resp_time: the time needed for a successful data delivery.
N: the number of successfully delivered data packets.
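A short C sketch of this computation follows; the sample send and receive times are assumptions.

#include <stdio.h>

int main(void)
{
    double sent[]     = {0.10, 0.40, 0.90};      /* Sent_time of each delivered packet (s) */
    double received[] = {0.12, 0.43, 0.92};      /* Received_time at the sink (s)          */
    int N = 3;
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        sum += received[i] - sent[i];            /* Eq. (8): per-packet Resp_time          */

    printf("average response time TpsAcc = %.3f s\n", sum / N);   /* Eq. (7) */
    return 0;
}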
The figure below (Fig. 9) represents the evolution of the response time according to the density of sensor nodes. The response time of the LEACH protocol is unstable. This allows us to say that the density does not have a direct influence on the response time and that this instability is related to the protocol design.
Concerning the two other protocols, the response time increases slowly in the beginning and then becomes stable. The highest response time usually corresponds to the LEACH protocol, as it uses the concept of transmission periods, which impacts the data dissemination process. Our protocol appears more effective than CODE and gives the best response time.

Fig. 9. Response time and density (response time in seconds versus node density in nodes/km², for CODE, DDPM and LEACH)

In Figure 10 below, we evaluate the behavior of the response time according to the detected event frequency. When the detection frequency increases, the network load increases as well; therefore the response time (latency) increases. Consequently, real-time applications are affected.
Fig. 10. Response time and detected event frequency (response time in seconds versus detected event frequency, for CODE, DDPM and LEACH)

During this experiment we notice that LEACH gives the highest response time, which increases with the detected event frequency. On the other hand, the response time of CODE is less affected and remains stable up to 0.75 requests/second, where it starts to increase. The best response time is the one given by our protocol, where it remains low and almost unaffected.


6 Conclusion
In this paper we have seen that the particular nature of sensor networks, such as the limited lifetime of the sensors as a consequence of their limited size, the multiplicity of the components and their performance, requires a specific mode of communication and represents considerable constraints.
From the studied related works we noticed that each protocol has advantages and disadvantages; this study allowed us to understand the mechanisms of data dissemination in wireless sensor networks, which helped us to propose a new solution that considers the requirements of sensor networks. In this solution, we mainly took into account the advantages and the disadvantages of the two protocols CODE and LEACH.
The proposed protocol is based on a virtual grid structure, where each cell in the grid contains a head responsible for the dissemination and the aggregation of the sensed data. This head is selected periodically according to the dynamic power threshold. In this paper we considered only the detected events, and the sensed data are disseminated from the source sensor node to the sink. User requests are not considered in this work and will be the object of our future work.

References
[1] Xuan, H.L., Lee, S.: A Coordination-Based Data Dissemination Protocol for Wireless Sensor Networks. In: Proceedings of the Sensor Networks and Information Processing Conference, pp. 13-18 (December 2004)
[2] Heinzelman, W., Chandrakasan, A., Balakrishnan, H.: Energy-Efficient Communication Protocol for Wireless Microsensor Networks. In: Proceedings of the 33rd Hawaii International Conference on System Sciences (HICSS 2000) (January 2000)
[3] Heinzelman, W.R., Kulik, J., Balakrishnan, H.: Adaptive Protocols for Information Dissemination in Wireless Sensor Networks. In: Proceedings of ACM MobiCom 1999, Seattle, Washington, pp. 174-185 (1999)
[4] Manjeshwar, D.P., Agrawal, A.: TEEN: A Routing Protocol for Enhanced Efficiency in Wireless Sensor Networks. In: Proceedings of the 15th International Parallel and Distributed Processing Symposium, pp. 2009-2015 (2001)
[5] Lee, M.-G., Lee, S.: Data Dissemination for Wireless Sensor Networks. In: Proceedings of the 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing, pp. 172-180. IEEE, Los Alamitos (2007)
[6] Akkaya, K., Younis, M.: An Energy-Aware QoS Routing Protocol for Wireless Sensor Networks. In: The Proceedings of the IEEE Workshop on Mobile and Wireless Networks (MWN 2003), Providence, Rhode Island (May 2003)
[7] Xu, Y., Heidemann, J., Estrin, D.: Geography-informed Energy Conservation for Ad Hoc Routing, Rome, Italy (2001)
[8] Akyildiz, I., Su, W., Sankarasubramanian, Y., Cayirci, E.: A Survey on Sensor Networks. IEEE Communications Magazine 40(8), 102-114 (2002)
[9] Bulusu, N., Heidemann, J., Estrin, D.: GPS-less Low Cost Outdoor Localization for Very Small Devices. IEEE Personal Communications Magazine 7(5), 28-34 (2000)
[10] Estrin, D., Govindan, R., Heidemann, J., Kumar, S.: Next Century Challenges: Scalable Coordination in Sensor Networks. In: Proceedings of the Fifth Annual International Conference on Mobile Computing and Networks (MobiCOM 1999), Seattle, Washington (August 1999)
[11] Felemban, E., Lee, C.-G., Ekici, E.: MMSPEED: Multipath Multi-SPEED Protocol for QoS Guarantee of Reliability and Timeliness in Wireless Sensor Networks. IEEE Transactions on Mobile Computing 5(6), 738-754 (2006)
[12] Kim, S., Son, S.H., Stankovic, J.A., Choi, Y.: Data Dissemination over Wireless Sensor Networks. IEEE Communications Letters 8(9), 561-563 (2004)
[13] Lindsey, S., Raghavendra, C.S.: PEGASIS: Power-Efficient Gathering in Sensor Information Systems. Proc. of the IEEE, 924-935 (2002)
[14] Ye, F., Luo, H., Lu, S., Zhang, L.: A Two-Tier Data Dissemination Model for Large-scale Wireless Sensor Networks. UCLA Computer Science Department, Los Angeles (2002)
[15] Intanagonwiwat, C., Govindan, R., Estrin, D., Heidemann, J., Silva, F.: Directed Diffusion for Wireless Sensor Networking. IEEE/ACM Transactions on Networking 11(1), 2-16 (2003)
[16] Bagrodia, R., Zeng, X., Gerla, M.: GloMoSim - A Library for Parallel Simulation of Large-scale Wireless Networks. Computer Science Department, University of California, Los Angeles (1999)

Performance Evaluation of ID Assignment Schemes for


Wireless Sensor Networks
Rama Krishna Challa and Rakesh Sambyal
NITTTR, Chandigarh, India
rakeshsambyal@rediffmail.com

Abstract. Wireless sensor networks have gained tremendous popularity, as evidenced by the increasing number of applications for these networks. The limitations of wireless sensor nodes, such as their finite energy and moderate processing capability, restrict the performance of wireless sensor networks. Efficient node addressing schemes with minimum communication cost are important for the optimal initialization of a wireless sensor network. Several energy-efficient solutions have been proposed for ID assignment in wireless sensor networks. In this paper, we explore and compare two categories of ID assignment schemes, namely proactive and reactive, in terms of energy consumption, communication overhead and packet size.
Keywords: Wireless sensor networks, ID assignment, proactive, reactive.

1 Introduction
A wireless sensor network consists of a large number of low-cost sensor nodes and one or more sink nodes which are deployed randomly to perform sensing tasks in a given environment. A typical sensor node has limited battery power, low computing capability and limited memory. These sensor nodes can be deployed in different types of environments to perform information-related tasks such as the gathering, processing and dissemination of information.
Any ID assignment algorithm should produce the shortest possible addresses because wireless sensor networks are energy-constrained. For wireless sensor networks, researchers have proposed attributes, instead of unique IDs, as network addresses, and steering routing directly based on these attributes [8]. Typical queries are not "the water level at node #5546", but rather "the water level in the north-west quadrant". The final destination is, therefore, identified by attributes such as "any node in the north-west quadrant" or "the nearest gateway". This method has several benefits, one of which is that more common attributes can be encoded in only a few bits, resulting in energy savings and an increase in the nodes' lifetime.

2 Related Work
Q. Zheng et al. [1] proposed a distributed scheme of energy efficient clustering with
self-organized ID assignment (EECSIA). This scheme can prolong network lifetime
in comparison with low-energy adaptive clustering hierarchy (LEACH).


H. Zhou et al. [2] proposed an efficient reactive ID assignment scheme for wireless
sensor networks. In this scheme the node address is required only when the data
communication is started. Therefore we can preserve more energy if the ID conflicts
are resolved during data communication. C. Schurgers et al. [3] proposed a distributed
algorithm which significantly reduces the size of MAC address. This scheme can
handle unidirectional links and is scalable in terms of assignment algorithm and
address representation. C. Schurgers et al. [4] proposed a dynamic MAC addressing
scheme to reduce the MAC address overhead in wireless sensor network. This scheme
scales well with the network size, making it suitable for wireless sensor networks with
thousands of nodes. E. O. Ahmed et al. [5] proposed a distributed algorithm that
assigns globally unique IDs to sensor nodes. In this scheme, the sensor nodes can join
the network during the execution of the algorithm or even after its termination.
J. H. Kang et al. [6] proposed a structure-based algorithm that assigns globally unique
IDs to sensor nodes. This scheme reduces the communication overhead during ID
assignment of sensor nodes and hence preserves energy and increases network
lifetime.
We categorize the ID assignment schemes as reactive, proactive and hybrid ID assignment schemes. This paper evaluates the performance of a reactive ID assignment scheme using directed diffusion [2] and a proactive ID assignment scheme using the distributed unique global ID assignment scheme [5].

3 Performance Evaluation
Efficient node addressing schemes are important for the optimal initialization of a wireless sensor network. Initialization can be viewed as the mechanism by which individual sensor nodes become part of a wireless sensor network. The aim of this paper is to compare the performance of ID assignment schemes in wireless sensor networks. To ensure a sufficient network lifetime, all ID assignment algorithms must be designed with a focus on energy efficiency. It is therefore important to minimize the communication overhead during the ID assignment of nodes in wireless sensor networks in order to save energy.
The following metrics are used to measure the performance of the ID assignment schemes for wireless sensor networks.
Energy consumption: the energy consumed by a node, which comprises the energy spent on sensing, transmitting, receiving, listening for packets, internal processing, discarding packets, and even in the sleep state.
Communication overhead: the number of control packets sent and received by sensor nodes during their ID initialization.
Packet size: the size of the packet is an important factor in prolonging the network operation lifetime of a wireless sensor network.


4 Simulation Results
The simulations are carried out using the NS-2 network simulator (version 2.33) [13] to compare the performance of the reactive and proactive ID assignment schemes. The simulation considers a wireless network of 10 nodes placed within a 670m x 670m area. The first node is the sink node and the last node is the source node. The address size is 4 bits.
4.1 Energy Consumption
In order to evaluate the energy consumption, we set the parameter values as shown
in Table 1.
Table 1. Important simulation parameters

Parameter            Value
Initial energy       1 Joule
Transmit power       0.06 Joule
Receive power        0.03 Joule
Packet size          512 bytes
Max. queue length    50
Simulation time      500 seconds
Simulation area      670m x 670m
Number of nodes      10

Fig. 1. Aggregate energy consumption of 10 nodes


Figure 1 shows the total energy consumption of all the nodes with respect to the execution time. In the proactive approach, each forwarding ID is kept in a table to maintain routing information. Maintaining routes at all times may cause high energy utilization. However, in the reactive ID assignment scheme, a node ID is not required if there is no data communication. This approach keeps the traffic low and hence preserves energy.
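As a rough illustration, the per-node cost of the control traffic can be estimated from the Table 1 parameters with the following C sketch; the control-packet counts are placeholders, not measured values.

#include <stdio.h>

int main(void)
{
    const double tx_cost = 0.06, rx_cost = 0.03;   /* Joule per packet (Table 1)          */
    long ctrl_sent = 4, ctrl_received = 9;         /* assumed control-packet counts       */

    double spent = ctrl_sent * tx_cost + ctrl_received * rx_cost;
    printf("energy spent on ID assignment: %.2f J of the 1 J budget\n", spent);
    return 0;
}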
Figure 2 shows the comparison of the total energy consumption of each individual node after its ID initialization. The simulation results show that the reactive ID assignment scheme saves an average of 12% energy per node as compared to the proactive ID assignment scheme.

Fig. 2. Energy consumption of individual nodes

Fig. 3. The number of control packets sent and received at each node

4.2 Communication Overhead


Figure 3 shows the number of control packets sent and received at each node when assigning the node ID. In the case of proactive ID assignment, each node broadcasts periodic HELLO messages including its neighbor table. This causes an extremely high


communication overhead. The reactive scheme broadcasts the HELLO message at the end of the simulation and hence causes a lower communication overhead than the proactive scheme.
The simulation results confirm that the reactive ID assignment scheme generates fewer control packets than the proactive ID assignment scheme and hence consumes less energy.
4.3 Impact of Packet Size on Energy Consumption
In this experiment, we will highlight the effect of the packet size on ID assignment
schemes in wireless sensor networks with a focus on energy consumption. Figure 4
and Figure 5 show the impact of packet size on energy consumption of nodes in
reactive ID assignment and proactive ID assignment schemes.

Fig. 4. Impact of packet size on energy consumption in reactive ID assignment scheme

Fig. 5. Impact of packet size on energy consumption in proactive ID assignment scheme


When the packet size is increased in the reactive ID assignment scheme, node 3 consumes all of its available energy and the energy level of node 2 drops to 0.003 Joule. Similarly, when the packet size is increased in the proactive ID assignment scheme, node 2 and node 3 consume all of their available energy. These results show that, in a wireless sensor network, the sink's neighbor nodes consume more energy than the others during their ID initialization.
The results also show that increasing the data packet size increases the energy consumption of individual nodes in wireless sensor networks for both ID assignment schemes. The simulation results show that the impact of packet size on energy consumption is 7% higher in the proactive ID assignment scheme than in the reactive ID assignment scheme.

Fig. 6. Control packets in reactive ID assignment scheme for varying packet size

Fig. 7. Control packets in proactive ID assignment scheme for varying packet size


4.4 Impact of Packet Size on Communication Overhead


Figure 6 and Figure 7 show the effect of the packet size on the control packets in the reactive and proactive ID assignment schemes. The simulation results show that when the packet size is increased, the number of control packets generated decreases in both ID assignment schemes. The results of this analysis reveal that a longer packet size in ID assignment schemes is favorable in wireless sensor networks when the size of the control packets is considered alone.

5 Conclusions
In this paper we presented the performance evaluation of reactive and proactive ID assignment schemes for wireless sensor networks. The simulation results reveal that, since the reactive ID assignment approach saves much more energy in the ID assignment of nodes as compared to the proactive approach, this scheme eventually increases the network operation lifetime. Moreover, the reactive ID assignment scheme causes a much lower communication overhead than the proactive scheme because it generates far fewer control packets. In summary, a longer network lifetime can be achieved with the reactive ID assignment scheme.
There are, however, still many challenges in ID assignment schemes. To make wireless sensor networks more practical, we need to develop effective ID assignment algorithms that meet several unique requirements such as optimal packet size, reduced startup cost in terms of execution time and energy, and asynchronous wake-up of nodes. Moreover, further research is needed in ID assignment algorithms to address the overhead of mobility in energy-constrained wireless sensor networks.

References
1. Zheng, Q., Liu, Z., Xue, L., Tan, Y., Chen, D., Guan, X.: An energy efficient clustering scheme with self-organized ID assignment for wireless sensor networks. In: Proceedings of IEEE International Conference on Parallel and Distributed Systems, Shanghai, China, December 8-10, pp. 635-639 (2010)
2. Zhou, H., Mutka, M.W., Ni, L.M.: Reactive ID assignment for wireless sensor networks. International Journal of Wireless Information Networks 13, 317-328 (2006)
3. Schurgers, C., Kulkarni, G., Srivastava, M.B.: Distributed on-demand address assignment in wireless sensor networks. IEEE Transactions on Parallel and Distributed Systems 13, 1056-1064 (2002)
4. Schurgers, C., Kulkarni, G., Srivastava, M.B.: Distributed assignment of encoded MAC addresses in wireless sensor networks. In: Proceedings of the 2nd ACM International Symposium on Mobile Ad hoc Networking & Computing, USA, pp. 295-298 (October 2001)
5. Ahmed, E.O., Blough, D.M., Heck, B.S., Riley, G.F.: Distributed unique global ID assignment for sensor networks. In: Proceedings of IEEE International Conference on Mobile Ad-hoc and Sensor Systems, vol. 7, pp. 123 (November 2005)
6. Kang, J.H., Park, M.: Structure-based ID assignment for sensor networks. International Journal of Computer Science and Network Security 6, 158-163 (2006)
7. Zhou, H., Mutka, M.W., Ni, L.M.: Reactive ID assignment for sensor networks. In: Proceedings of IEEE International Conference on Mobile Ad-Hoc and Sensor Systems, pp. 567-572 (November 2005)
8. Intanagonwiwat, C., Govindan, R., Estrin, D.: Directed diffusion: a scalable and robust communication paradigm for sensor networks. In: Proceedings of the Sixth Annual International Conference on Mobile Computing and Networking, pp. 56-67 (August 2000)
9. Ali, M., Uzmi, Z.A.: An energy efficient node address naming scheme for wireless sensor networks. In: International Networking and Communication Conference, vol. 11, pp. 25-30 (June 2004)
10. Jiang, P., Wen, Y., Wang, J., Shen, X., Xue, A.: A study of routing protocols in wireless sensor networks. In: Proceedings of the Sixth World Conference on Intelligent Control and Automation, vol. 1, pp. 266-270 (June 2006)
11. Qui, W., Cheng, Q., Skafidas, E.: A hybrid routing protocol for wireless sensor networks. In: International Symposium on Communications and Information Technologies, pp. 1383-1388 (October 2007)
12. Dai, S., Jing, X., Li, L.: Research and analysis on routing protocols for wireless sensor networks. In: Proceedings of IEEE International Conference on Communications, Circuits and Systems, vol. 1, pp. 407-411 (May 2005)
13. http://www.isi.edu/nsnam/ns

Author Index

Abbadi, Imad M. IV-406, IV-557


Abbas, Ash Mohammad II-307
Abraham, Anuj III-503
Abraham, John T. III-168
Abraham, Siby I-328
Achuthan, Krishnashree I-488, II-337
AdiSrikanth, III-570
Aditya, T. I-446
Adusumalli, Sri Krishna IV-572
Agarwal, Vikas II-595
Aghila, G. II-327, IV-98
Agrawal, P.K. IV-244
Agrawal, Rohit II-162
Agrawal, Shaishav III-452
Agushinta R., Dewi II-130, II-138,
II-146
Ahmed, Imran II-317
Ahn, Do-Seob II-595
Aishwarya, Nandakumar II-490, II-498,
III-269
Akhtar, Zahid II-604
Al-Sadi, Azzat A. II-535
Alam, Md. Mahabubul III-349
Alam Kotwal, Mohammed Rokibul
II-154
Ananthi, S. I-480
Andres, Frederic IV-79
Anisha, K.K. III-315
Anita, E.A. Mary I-111
Anju, S.S. II-490, II-498, III-269
Annappa, B. IV-396
Anto, P. Babu III-406
Anusiya, M. IV-155
Aradhya, V.N. Manjunath III-289,
III-297
Arifuzzaman, Md. III-349
Asif Naeem, M. II-30
Asokan, Shimmi IV-63
Athira, B. II-80
Awais, Muhammad II-374
Awasthi, Lalit Kr III-609
Azeem, Mukhtar II-525
Azeez, A.A. Arifa IV-145

Babu, Korra Sathya II-1


Babu, K. Suresh II-636
Babu, L.D. Dhinesh I-223
Babu, M. Rajasekhara I-182
Baburaj, E. I-172
Badache, N. IV-593
Bagan, K. Bhoopathy IV-524
Bajwa, Imran Sarwar II-30
Bakshi, Sambit III-178
Balasubramanian, Aswath I-411
Banati, Hema II-273
Banerjee, Indrajit III-68
Banerjee, Joydeep III-82
Banerjee, Pradipta K. II-480
Banerjee, Usha II-648
Bansal, Roli III-259
Banu, R.S.D. Wahida II-545
Baruah, P.K. I-446
Basak, Dibyajnan I-519
Basil Morris, Peter Joseph II-577
Baskaran, R. II-234, IV-269
Bastos, Carlos Alberto Malcher
IV-195
Batra, Neera I-572
Bedi, Punam II-273, III-259
Bedi, R.K. II-397
Behl, Abhishek II-273
Bhadoria, P.B.S. IV-211
Bhardwaj, Ved Prakash II-568
Bharti, Brijendra K. IV-358
Bhat, Veena H. III-522
Bhattacharyya, Abhijan I-242
Bhosale, Arvind IV-512
Bhuvanagiri, Kiran Kumar IV-293
Bhuvaneswary, A. II-327
Biji, C.L. IV-300
Binu, A. I-399
Biswas, G.P. II-628
Biswas, Subir III-54
Biswas, Suparna II-417
Biswas, Sushanta II-612, II-620
Biswash, Sanjay Kumar I-11
Boddu, Bhaskara Rao II-296
Borah, Samarjeet III-35

618

Author Index

Borkar, Meenal A.
Boutekkouk, Fateh

IV-25
II-40

Chaganty, Aparna IV-19


Chaitanya, N. Sandeep IV-70
Chakraborty, Suchetana II-585
Chakravorty, Debaditya III-35
Challa, Rama Krishna IV-608
Chanak, Prasenjit III-68
Chand, Narottam III-122, III-609
Chandra, Deka Ganesh II-210
Chandra, Jayanta K. II-480
Chandran, K.R. I-631
Chandrika, I-704
Chanijani, S.S. Mozaffari III-289
Chaoub, Abdelaali I-529
Chaudhary, Ankit III-488
Chauhan, Durg Singh I-21
Chawhan, Chandan III-35
Chawla, Suneeta II-430
Chia, Tsorng-Lin III-334
Chintapalli, Venkatarami Reddy IV-455
Chitrakala, S. III-415
Chittineni, Suresh III-543
Choudhary, Surendra Singh I-54
Chouhan, Madhu I-119
Chowdhury, Chandreyee I-129
Chowdhury, Roy Saikat II-577
Dadhich, Reena I-54
Dahiya, Ratna III-157
Dandapat, S. IV-165
Das, Madhabananda IV-113
Das, Satya Ranjan II-172
Das, Subhalaxmi IV-549
Datta, Asit K. II-480
Dawoud, Wesam I-431
Deb, Debasish II-577
Dedavath, Saritha I-34
Deepa, S.N. III-503
Dehalwar, Vasudev I-153
Dehuri, Satchidananda IV-113
Desai, Sharmishta II-397
Devakumari, D. II-358
Devani, Mahesh I-213
Dhanya, P.M. IV-126
Dhar, Pawan K. I-284
Dharanyadevi, P. II-234
Dhavachelvan, P. II-234
Dhivya, M. II-99

Dilna, K.T. III-185


Dimililer, Kamil III-357
Diwakar, Shyam II-337
Doke, Pankaj I-607, II-430
Dongardive, Jyotshna I-328
Donoso, Yezid II-386
Doraipandian, Manivannan III-111
Dorizzi, Bernadette III-20
Durga Bhavani, S. III-1
Dutta, Paramartha I-83
Dutta, Ratna IV-223
El Abbadi, Jamal I-529
El-Alfy, El Sayed M. II-535
Elhaj, Elhassane Ibn I-529
Elizabeth, Indu I-302
Elumalai, Ezhilarasi I-1
Ferreira, Ana Elisa IV-195
Ferri, Fernando IV-79
Gadia, Shashi II-191
Gaiti, Dominique II-471
Ganeshan, Kathiravelu IV-501
Garcia, Andres III-664
Garcia, Anilton Salles IV-195
Gaur, Manoj Singh I-44, I-162, I-562,
II-183, II-452, III-478, III-644
Gaur, Vibha II-284
Gautam, Gopal Chand I-421
Geetha, V. II-48
Geevar, C.Z. III-460
Ghosh, Pradipta III-82
Ghosh, Saswati II-620
Giluka, Mukesh Kumar I-153
Gindi, Sanjyot IV-349
Gireesh Kumar, T. II-506
Giuliani, Alessandro I-284
Godavarthi, Dinesh III-543
Gómez-Skarmeta, Antonio Fernando
III-664
Gondane, Sneha G. II-99
Gopakumar, G. I-320
Gopalan, Kaliappan IV-463
Gore, Kushal I-607
Gosain, Anjana I-691
Gosalia, Jenish IV-378
Govardhan, A. I-581
Govindan, Geetha I-294
Govindarajan, Karthik I-192

Grifoni, Patrizia IV-79
Grover, Jyoti III-644
Gualotuña, Tatiana IV-481
Guerroumi, M. IV-593
Gunaraj, G. I-192
Gunjan, Reena III-478
Gupta, Ankur I-501
Gupta, B.B. IV-244
Gupta, Deepika II-183
Gupta, J.P. I-260
Gupta, Juhi IV-205
Gupta, Priya IV-512
Habib, Sami J. II-349
Hafizul Islam, SK II-628
Harivinod, N. III-396
Harmya, P. II-490, II-498, III-269
Harshith, C. II-506
Hassan, Foyzul II-154, III-349
Hati, Sumanta III-580
Hazarika, Shyamanta M. II-109, II-119
Hazra, Sayantan III-601
Hemamalini, M. IV-175
Hivarkar, Umesh N. IV-358
Hsieh, Chaur-Heh III-334
Huang, Chin-Pan III-334
Huang, Ping S. III-334
Ibrahim, S.P. Syed I-631
Indira, K. I-639
Isaac, Elizabeth IV-145
Jaganathan, P. I-683
Jagdale, B.N. II-397
Jain, Jitendra III-326
Jain, Kavindra R. III-239
Jain, Kavita I-328
Jalal, Anand Singh II-516, IV-329
Jameson, Justy II-693
Janani, S. IV-175
Jaya, IV-233
Jayakumar, S.K.V. II-234
Jayaprakash, R. II-656
Jena, Sanjay Kumar II-1
Jia, Lulu IV-421
Jimenez, Gustavo II-386
Jisha, G. IV-1, IV-137
Joseph, Shijo M. III-406
Juluru, Tarun Kumar I-34, III-590


Kacholiya, Anil IV-205


Kahlon, K.S. II-58
Kakoty, Nayan M. II-119
Kakulapati, Vijayalaxmi IV-284
Kalaivaani, P.T. III-143
Kale, Sandeep II-604
Kanade, Sanjay Ganesh III-20
Kanavalli, Anita I-141
Kancharla, Tarun IV-349, IV-368
Kanitkar, Aditya R. IV-358
Kanivadhana, P. IV-155
Kankacharla, Anitha Sheela I-34,
III-590
Kanmani, S. I-639, II-69
Kannan, A. II-19
Kannan, Rajkumar IV-79
Kapoor, Lohit I-501
Karamoy, Jennifer Sabrina Karla II-138
Karande, Vishal M. IV-386
Karmakar, Sushanta II-585
Karthi, R. III-552
Karthik, S. I-480
Karunanithi, D. IV-284
Karunanithi, Priya III-624
Karuppanan, Komathy III-425, III-615,
III-624, III-634
Katiyar, Vivek III-122
Kaur, Rajbir I-44, I-162
Kaushal, Sakshi IV-445
Kavalcıoğlu, Cemal III-357
Kayarvizhy, N. II-69
Keromytis, Angelos D. III-44
Khajaria, Krishna II-9
Khalid, M. I-182
Khan, Majid Iqbal II-471, II-525
Khan, Srabani II-620
Khan Jehad, Abdur Rahman III-349
Khanna, Rajesh III-205
Khattak, Zubair Ahmad IV-250
Khilar, P.M. I-119
Kim, Pansoo II-595
Kimbahune, Sanjay I-607, II-430
Kiran, N. Chandra I-141
Kishore, J.K. II-460
Ko, Ryan K.L. IV-432
Kolikipogu, Ramakrishna IV-284
Kopparapu, Sunil Kumar II-317,
IV-293
Koschnicke, Sven I-371
Kothari, Nikhil I-213

Krishna, Gutha Jaya I-382


Krishna, P. Venkata I-182
Krishna, S. III-522
Krishnan, Saranya D. IV-63
Krishnan, Suraj III-374
Kopparapu, Sunil Kumar III-230
Kulkarni, Nandakishore J. III-570
Kumar, Chiranjeev I-11
Kumar, C. Sasi II-162
Kumar, G.H. III-289
Kumar, G. Ravi IV-70
Kumar, G. Santhosh I-399
Kumar, Ishan IV-205
Kumar, K.R. Ananda I-704
Kumar, K. Vinod IV-19
Kumar, Manish I-44
Kumar, Manoj II-9
Kumar, Naveen I-461
Kumar, Padam I-461
Kumar, Ravindra II-307
Kumar, Santosh I-619
Kumar, Santhosh G. III-93
Kumar, Saumesh I-461
Kumar, Sumit I-619
Kumaraswamy, Rajeev IV-339
Kumari, M. Sharmila III-396
Kumari, V. Valli IV-572
Kumar Pandey, Vinod III-230
Kumar Sarma, Kandarpa III-512
Kurakula, Sudheer IV-165
Kussmaul, Clifton III-533
Lachiri, Zied IV-318
Lal, Chhagan II-452
Latif, Md. Abdul II-154
Laxmi, V. II-183, II-452
Laxmi, Vijay I-44, I-162, I-562, III-478,
III-644
Lee, Bu Sung IV-432
Li, Tiantian IV-421
Limachia, Mitesh I-213
Lincoln Z.S., Ricky II-130
Linganagouda, K. III-444
Lingeshwaraa, C. II-19
Liu, Chenglian IV-534
Lobiyal, D.K. III-132, III-654
Londhe, Priyadarshini IV-512
López, Elsa Macías IV-481
Madheswari, A. Neela II-545
Madhusudhan, Mishra III-365

Mahalakshmi, T. I-310
Mahalingam, P.R. III-562, IV-137
Maheshwari, Saurabh III-478
Maiti, Santa II-172
Maity, G.K. III-249
Maity, Santi P. I-519, III-249, III-580
Maity, Seba I-519
Majhi, Banshidhar III-178
Majhi, Bansidhar IV-549
Maji, Sumit Kumar I-649
Malay, Nath III-365
Malaya, Dutta Borah II-210
Malik, Jyoti III-157
Mallya, Anita I-302
Manan, Jamalul-lail Ab IV-250
Mandava, Ajay K. I-351
Mannava, Vishnuvardhan I-250
Manomathi, M. III-415
Maralappanavar, Meena S. III-444
Marcillo, Diego IV-481
Marimuthu, Paulvanna N. II-349
Mary, S. Roselin IV-9
Masera, Guido II-374
Mastan, J. Mohamedmoideen Kader
IV-524
Mehrotra, Hunny III-178
Meinel, Christoph I-431
Mendiratta, Varun II-273
Menta, Sudhanshu III-205
Mishra, A. IV-244
Mishra, Ashok II-223
Mishra, Dheerendra IV-223
Mishra, Shivendu II-407
Misra, Rajiv I-101
Missaoui, Ibrahim IV-318
Mitra, Abhijit III-512, III-601
Mitra, Swarup Kumar III-82
Mittal, Puneet II-58
Modi, Chintan K. III-239
Mohammadi, M. III-289
Mohandas, Neethu IV-187
Mohandas, Radhesh II-685, III-10
Mohanty, Sujata IV-549
Mol, P.M. Ameera III-193
Moodgal, Darshan II-162
Moragón, Antonio III-664
More, Seema I-361
Moussaoui, S. IV-593
Mubarak, T. Mohamed III-102
Mukhopadhyay, Sourav IV-223

Mukkamala, R. I-446
Muniraj, N.J.R. I-270, III-168
Murthy, G. Rama IV-19
Nadarajan, R. II-366
Nadkarni, Tanusha S. II-685
Nag, Amitava II-612, II-620
Nagalakshmi, R. I-683
Nagaradjane, Prabagarane III-374
Nair, Achuthsankar S. I-284, I-294,
I-302, I-320
Nair, Bipin II-337
Nair, Madhu S. III-193, III-276
Nair, Smita IV-368
Nair, Vrinda V. I-302
Namboodiri, Saritha I-284
Namritha, R. III-634
Nandi, Sukumar I-619
Narayanan, Hari I-488
Nasiruddin, Mohammad II-154
Naskar, Mrinal Kanti III-82
Nataraj, R.V. I-631
Naveen, K. Venkat III-570, III-615
Naveena, C. III-297
Nazir, Arfan II-525
Neelamegam, P. III-111
Neogy, Sarmistha I-129, II-417
Nigam, Apurv II-430
Nimi, P.U. IV-46
Niranjan, S.K. III-297
Nirmala, M. I-223
Nirmala, S.R. III-365
Nitin, I-21, II-568, IV-25
Noopa, Jagadeesh II-490, II-498, III-269
Nurul Huda, Mohammad II-154, III-349
Oh, Deock-Gil II-595
Okab, Mustapha II-40
Oliya, Mohammad I-232
Olsen, Rasmus L. IV-37
Padmanabhan, Jayashree I-1, IV-541
Padmavathi, B. IV-70
Pai, P.S. Sreejith IV-339
Pai, Radhika M. II-460
Paily, Roy IV-165
Pais, Alwyn R. II-685, IV-386
Pais, Alwyn Roshan III-10
Pal, Arindarjit I-83
Palaniappan, Ramaswamy IV-378

Palaty, Abel IV-56


Pandey, Kumar Sambhav IV-56
Panicker, Asha IV-300
Panneerselvam, S. I-223
Pappas, Vasilis III-44
Parasuram, Harilal II-337
Parmar, Rohit R. III-239
Parthasarathy, Magesh Kannan I-192
Parvathy, B. I-204
PatilKulkarni, Sudarshan III-342
Patnaik, L.M. I-141, II-636, III-522
Patra, Prashanta Kumar I-649
Pattanshetti, M.K. IV-244
Paul, Anu II-201
Paul, Richu III-213
Paul, Varghese II-201
Paulsen, Niklas I-371
Pavithran, Vipin I-488
Pearson, Siani IV-432
Perumal, V. I-471
Petrovska-Delacretaz, Dijana III-20
Phani, G. Lakshmi IV-19
Ponpandiyan, Vigneswaran IV-541
Poornalatha, G. II-243
Povar, Digambar I-544
Prabha, S. Lakshmi I-192
Prabhu, Lekhesh V. IV-339
Pradeep, A.N.S. III-543
Pradeepa, J. I-471
Prajapati, Nitesh Kumar III-644
Prakasam, Kumaresh IV-541
Pramod, K. III-444
Prasad, Ramjee IV-37
Prasath, Rajendra II-555
Prasanna, S.R. Mahadeva III-326
Prasanth Kumar, M. Lakshmi I-11
Pratheepraj, E. III-503
Priya, K.H. I-471
Priyadharshini, M. IV-269
Priyadharshini, V. IV-175
Pung, Hung Keng I-232
Qadeer, Mohammed Abdul

II-442

Radhamani, A.S. I-172


Rafsanjani, Marjan Kuchaki IV-534
Raghavendra, Prakash S. II-243
Raghuvanshi, Rahul I-153
Rahaman, Hafizur III-68

Raheja, J.L. III-488


Raheja, Shekhar III-488
Rahiman, M. Abdul III-304
Rahman, Md. Mostazur II-154
Rai, Anjani Kumar II-407
Rai, Anuj Kumar III-111
Rai, Mahendra K. III-469
Raja, K.B. II-636
Rajapackiyam, Ezhilarasie III-111
Rajasekhar, Ch. I-78
Rajasree, M.S. III-304
Rajendran, C. III-552
Rajesh, R. III-497
Rajeswari, A. III-143
Rajimol, A. II-253
Rajkumar, K.K. III-435
Rajkumar, N. I-683
Raju, C.K. II-223, IV-211
Raju, G. I-671, II-253, III-435
Ramachandram, S. IV-70
Ramamohanreddy, A. I-581
Ramaraju, Chithra I-661
Ramasubbareddy, B. I-581
Ramaswamy, Aravindh I-411
Ramesh, Sunanda I-1
Ramesh, T. I-250
Rameshkumar, K. III-552
Rana, Sanjeev I-91
Rani, Prathuri Jhansi III-1
Rao, Appa III-102
Rao, Avani I-213
Rao, D. Srinivasa I-78
Rao, Prasanth G. III-522
Rastogi, Ravi I-21
Rathi, Manisha I-260
Rathore, Wilson Naik II-676
Razi, Muhammad II-146
Reddy, B. Vivekavardhana IV-309
Reddy, G. Ram Mohana IV-473
Reddy, P.V.G.D. Prasad III-543
Reddy, Sateesh II-460
Regentova, Emma E. I-351
Reji, J. III-276
Revathy, P. IV-284
Revett, Kenneth IV-378
Roberta, Kezia Velda II-146
Rodrigues, Paul IV-9, IV-269
Rokibul Alam Kotwal, Mohammed
III-349
Roopalakshmi, R. IV-473

Roy, J.N. III-249


Roy, Rahul IV-113
Sabu, M.K. I-671
Saha, Aritra III-35
Sahaya, Nuniek Nur II-138
Sahoo, Manmath Narayan I-119
Sahoo, Soyuj Kumar III-326
Saikia, Adity II-109, II-119
Sainarayanan, G. III-157
Sajeev, J. I-310
Sajitha, M. III-102
Saljooghinejad, Hamed II-676
Samad, Sumi A. III-93
Samanta, Debasis II-172
Sambyal, Rakesh IV-608
Samerendra, Dandapat III-365
Samraj, Andrews IV-378
Samuel, Philip II-80, IV-1
Sandhya, S. II-88
Santa, Jose III-664
Santhi, K. III-221
SanthoshKumar, G. II-263
Santhoshkumar, S. I-223
Saralaya, Vikram II-460
Sarangdevot, S.S. I-592
Saraswathi, S. IV-155, IV-175
Sardana, Anjali IV-233
Saritha, S. II-263
Sarkar, D. II-612, II-620
Sarkar, Partha Pratim II-612, II-620
Sarma, Monalisa II-172
Saruladha, K. II-327
Sasho, Ai I-340
Sasidharan, Satheesh Kumar I-552
Satapathy, Chandra Suresh III-543
Sathisha, N. II-636
Sathishkumar, G.A. IV-524
Sathiya, S. IV-155
Sathu, Hira IV-491, IV-501
Satria, Denny II-138
Sattar, Syed Abdul III-102
Savarimuthu, Nickolas I-661
Sayeesh, K. Venkat IV-19
Schatz, Florian I-371
Schimmler, Manfred I-371
Sebastian, Bhavya I-302
Sehgal, Priti III-259
Selvan, A. Muthamizh III-497

Selvathi, D. IV-300
Sen, Jaydip IV-580
Sendil, M. Sadish I-480
Senthilkumar, Radha II-19
Senthilkumar, T.D. III-185
Shah, Mohib A. IV-491, IV-501
Shahram, Latifi I-351
Shajan, P.X. III-168
Sharma, Amita I-592
Sharma, Dhirendra Kumar I-11
Sharma, Divya I-511
Sharma, H. Meena I-162
Sharma, Neeraj Kumar II-284
Sharma, Ritu I-511
Sharma, Sattvik II-506
Sharma, Sugam II-191
Sharma, Surbhi III-205
Sharma, T.P. I-421
Shekar, B.H. III-396
Shenoy, P. Deepa I-141, III-522
Shenoy, S.K. III-93
Sherly, K.K. II-693
Shringar Raw, Ram III-654
Shukla, Shailendra I-101
Shyam, D. II-99
Sikdar, Biplab Kumar III-68
Singal, Kunal III-488
Singh, Anurag III-609
Singh, Ashwani II-374
Singh, Jai Prakash IV-89
Singh, Jyoti Prakash I-83, II-612, II-620
Singh, Manpreet I-91, I-572
Singh, Puneet III-570
Singh, Rahul I-340
Singh, Sanjay II-460
Singh, Satwinder II-58
Singh, Vijander I-54
Singh, Vrijendra II-516, IV-329
Singh, Preety II-183
Sinha, Adwitiya III-132
Sivakumar, N. II-88
Skandha, S. Shiva IV-70
Smith, Patrick II-191
Sojan Lal, P. III-460
Song, Jie IV-421
Soni, Surender III-122
Sood, Manu I-511
Soumya, H.D. I-361
Sreenath, N. II-48
Sreenu, G. IV-126

Sreevathsan, R. II-506
Srikanth, M.V.V.N.S. II-506
Srinivasan, Avinash IV-260
Srinivasan, Madhan Kumar IV-269
Srivastava, Praveen Ranjan III-570
Srivastava, Shweta I-260
Starke, Christoph I-371
Suaib, Mohammad IV-56
Suárez-Sarmiento, Alvaro IV-481
Subramaniam, Tamil Selvan Raman
IV-541
Suchithra, K. IV-339
Sudarsan, Dhanya IV-137
Sudhansh, A.S.D.P. IV-165
Sujana, N. I-361
Sukumar, Abhinaya I-1
Sulaiman, Suziah IV-250
Sundararajan, Sudharsan I-488
Swaminathan, A. II-648
Swaminathan, Shriram III-374
Swamy, Y.S. Kumara IV-309
Tahir, Muhammad II-471
Takouna, Ibrahim I-431
Thakur, Garima I-691
Thampi, Sabu M. I-64, IV-126, IV-145,
IV-187
Thangavel, K. II-358
Thilagu, M. II-366
Thiyagarajan, P. IV-98
Thomas, Diya I-64
Thomas, K.L. I-544, I-552
Thomas, Likewin IV-396
Thomas, Lincy III-425
Thomas, Lisha III-221
Thukral, Anjali II-273
Tim, U.S. II-191
Tiwary, U.S. III-452, III-469
Tobgay, Sonam IV-37
Tolba, Zakaria II-40
Tripathi, Pramod Narayan II-407
Tripathi, Rajeev I-11
Tripathy, Animesh I-649
Tripti, C. IV-46
Tyagi, Neeraj I-11
Tyagi, Vipin II-568
Uma, V. II-656
Umber, Ashfa II-30
Unnikrishnan, C. III-562

Usha, N. IV-309
Utomo, Bima Shakti Ramadhan II-138

Vanaja, M. I-78
Varalakshmi, P. I-411, I-471
Varghese, Elizabeth B. III-383
Varshney, Abhishek II-442
Vasanthi, S. III-213
Vatsavayi, Valli Kumari II-296
Venkatachalapathy, V.S.K. II-234
Venkatesan, V. Prasanna IV-98
Venugopal, K.R. I-141, II-636, III-522
Verma, Amandeep IV-445
Verma, Chandra I-284
Verma, Gyanendra K. III-452, III-469
Verma, Rohit I-21
Vidya, M. I-361
Vidyadharan, Divya S. I-544
Vijay, K. I-78
Vijaykumar, Palaniappan I-411
VijayLakshmi, H.C. III-342
Vinod, P. I-562
Vipeesh, P. I-270

Vishnani, Kalpa III-10


Vivekanandan, K. II-88
Vorungati, Kaladhar I-488
Vykopal, Jan II-666
Wadhai, V.M. II-397
Wankar, Rajeev I-382
Wattal, Manisha I-501
William, II-130
Wilscy, M. III-315, III-383
Wirjono, Adityo Ashari II-130
Wisudawati, Lulu Mawaddah II-146
Wu, Jie IV-260
Xavier, Agnes I-328

Yadav, Gaurav Kumar IV-368
Yu, Fan III-54
Yuvaraj, V. III-503

Zaeri, Naser II-349


Zheng, Liyun IV-534
Zhu, Shenhaochen I-340
Zhu, Zhiliang IV-421
