
By: Hadeel Ahmed

BIG DATA IS BIG



Big data, as its name suggests, is very big
Datasets typically start at around 1 TB and can reach petabytes

Sources of data
Data from the internet
Data from military organizations
Hospital records
NASA mission data
And so on…
Types of data

• Unstructured data
Like images, videos, and social media content
• Semi-structured data
Like XML files
• Structured data
Like databases and SQL servers
What is the problem?!

• The problem is that with this unsorted, very large data we cannot analyze it, and moreover we cannot classify it; it becomes useless stored data

How to solve this?!


Say Hello to HADOOP ;)
HADOOP

• Hadoop is an open source project created by Doug Cutting, inspired by papers published by Google
• It provides very useful options for dealing with that data
• First, it is scalable: it can accept any size of data
• It analyzes this data at very high velocity; imagine being able to process 1 TB of data within 2.5 minutes
• One of the most important characteristics of Hadoop is that it can distribute any data as blocks across a number of files or servers
MapReduce

• MapReduce is a parallel programming model for writing distributed applications, devised at Google for efficient processing of large amounts of data (multi-terabyte datasets) on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner. MapReduce programs run on Hadoop, which is an Apache open-source framework.
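
To make the model concrete, below is the classic word-count job, a minimal sketch written against the standard Hadoop MapReduce Java API: the mapper emits a (word, 1) pair for every word in its input split, and the reducer sums the counts for each word. The input and output paths are placeholders supplied on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every word in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a jar, a job like this is typically launched with hadoop jar wordcount.jar WordCount <input> <output>, where the jar name and the paths are placeholders.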
HADOOP

• Hadoop is an Apache open source framework written in Java that allows distributed processing of large datasets across clusters of computers using simple programming models. A Hadoop framework application works in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.
Characteristics of HADOOP

• It doesn’t deal with your data as one whole block; instead, it splits the data into a number of blocks, each stored on a different server or file. That is why it is scalable and accepts any size of data
• It has one central node called the NAME NODE, which controls all the other data nodes; it holds the location info for every block (like a broker); see the sketch after this list
• MapReduce is responsible for connecting all the nodes together and distributing tasks among them: it first acquires block and node info from the name node, then distributes the task between the nodes, which decreases the time needed to process the data
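
To illustrate the name node’s role as the broker that knows where every block lives, here is a minimal sketch using the HDFS FileSystem Java API. It assumes a cluster reachable through the default configuration files on the classpath, and the file path is hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlocks {
  public static void main(String[] args) throws Exception {
    // Assumes core-site.xml / hdfs-site.xml on the classpath
    // point at a reachable HDFS cluster.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical file path, used only for illustration.
    Path file = new Path("/data/input/sample.txt");
    FileStatus status = fs.getFileStatus(file);

    // Ask the name node where each block of the file is stored.
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.printf("offset=%d length=%d hosts=%s%n",
          block.getOffset(), block.getLength(),
          String.join(",", block.getHosts()));
    }
    fs.close();
  }
}

Each line of output maps one block of the file to the data nodes holding a copy of it, which is exactly the metadata MapReduce consults before scheduling tasks close to the data.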

Characteristics of HADOOP CONT.


• Fault tolerance
• Hadoop ensures there is a backup for every block: more than one copy of each block is kept across the node cluster (three replicas by default); see the sketch after this list
• It provides a write-once, read-many model for data
• PIG, HIVE, ZOOKEEPER
• These are ready-made projects dedicated to particular kinds of jobs
For example, Pig provides a high-level scripting language for analyzing large datasets, Hive offers SQL-like querying, and ZooKeeper handles coordination between nodes
• It supports writing your own MapReduce jobs in almost any language, not just Java (via Hadoop Streaming)
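
As a small illustration of the fault-tolerance bullet above, the sketch below queries and raises the replication factor of one file through the same FileSystem API; the path is hypothetical, and the name node performs the actual re-replication in the background.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical path, used only for illustration.
    Path file = new Path("/data/input/sample.txt");

    // How many copies of each block does HDFS currently keep?
    short current = fs.getFileStatus(file).getReplication();
    System.out.println("current replication factor: " + current);

    // Request one extra replica per block; the name node schedules
    // the additional copies on other data nodes in the background.
    fs.setReplication(file, (short) (current + 1));
    fs.close();
  }
}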
