A
Seminar Report On
Bachelor of Engineering
Computer Science & Engineering
Submitted By
2015-2016
August 2015
Pratap Institute of Management & Technology, Washim
Department of Computer Science & Engineering
Certificate
Prof. S. Jadhav
Guide
Prof. K. U. Chaware          Dr. M. S. Patil
Head of Department           Principal
PiMT, Washim                 PiMT, Washim
ACKNOWLEDGEMENT
I have great pleasure and a sense of satisfaction in presenting this seminar report on
Big Data: Issues and Challenges Moving Forward. This report would not have been
possible without help and support. I express my gratitude to my seminar guide,
Prof. S. Jadhav, for his constant help and valuable guidance, with a lot of
encouragement, throughout this seminar work, right from the selection of the topic up to
its completion. My sincere thanks go to the Head of the Department of Computer Science,
Prof. K. Chaware, who continuously motivated and guided me towards the completion of
this work. I am also thankful to Prof. A. Raipure for help with this report.
ABSTRACT
Big data refers to data volumes in the range of exabytes (10^18 bytes) and beyond.
Such volumes exceed the capacity of current online storage and processing
systems. Data, information, and knowledge are being created and collected at a rate
that is rapidly approaching the exabyte/year range, and their creation and aggregation
are accelerating and will approach the zettabyte/year range within a few years.
Volume is only one aspect of big data; other attributes are variety, velocity, value, and
complexity. Storage and data transport are technology issues that seem to be
solvable in the near term, but they represent long-term challenges that require research
and new paradigms. We analyze these issues and challenges as we begin a collaborative
research program into methodologies for big data analysis and design.
Keywords: Big data; Hadoop; Hadoop Distributed File System; MapReduce
CONTENTS
Page No.
Certificate
Acknowledgement
Abstract
Contents
List of Figures
1 Introduction 1-4
1.1 Importance of Big-data
1.2 Big-data Characteristics
1.3 Big-data- Where is it?
2 Literature Survey 5-6
3 Processing Big-data 7-13
3.1 Issues
3.1.1 Storage and Transport Issues
3.1.2 Management Issues
3.1.3 Processing Issues
3.2 Challenges
3.2.1 Data Input and Output Process
3.2.2 Quality versus Quantity
3.2.3 Data Growth versus Data Expansion
3.2.4 Speed versus Scale
3.2.5 Structured versus Unstructured Data
3.2.6 Data Ownership
3.2.7 Compliance and Security
4 Solution on Issues and Challenges of Big Data 14-17
4.1 Apache Hadoop Technology
4.1.1 Hadoop Distributed File System
4.1.2 Apache MapReduce
5 Discussion 18-19
6 Conclusion and Future Work 20
References 21
LIST OF FIGURES