
BIG DATA MANAGEMENT

Organization, administration and governance of large volumes of
both structured and unstructured data
Goal is to ensure a high level of data quality and accessibility for
business intelligence and big data analytics applications
Effective big data management helps companies locate valuable
information in large sets of data from a variety of sources (e.g. call
detail records, system logs and social media sites)
Challenges involve scale, speed, and diverse data types


DEALING WITH DATA

Process is quite different from handling traditional data


Big Data is handled at different stages:
a. Collection
b. Storage
c. Organizing
d. Data Analysis
e. Data Visualization
f. Actions or Results

Motive is to decipher insights that are useful for making business
decisions

COLLECTION
Involves collection of data from several types of data sources, data
marts and data warehouses
Data sources can be internal or external
a. Internal - Customer Relationship Management (CRM), enterprise resource planning (ERP),
customer details, product and sales data, and operational data
b. External - Data collected from business partners, the internet, government and market
research organizations

Commonly used data for insights is collected from 3 major sources:
Social Data, Machine Data and Transactional Data
a. Social Data - Data generated from Facebook, Twitter, Google+, LinkedIn
b. Machine Data - Data generated from RFID chip readings, Global Positioning System (GPS)
results
c. Transactional Data - Data generated from eBay, Amazon, Walmart, IKEA

STORAGE
Involves storing the data into distributed database systems and
servers
Key requirements
a. Can handle very large amounts of data and keep scaling to keep up with growth
b. Can provide the input/output operations per second (IOPS) necessary to deliver
data to analytics tools

Hyperscale computing environment
a. Systems capable of rapid, efficient expansion to handle massive quantities of data
from databases, high-performance computing and other especially busy
applications
b. Comprises vast numbers of commodity servers with direct-attached storage (DAS)
c. Largest practitioners include Google, Facebook and Apple

Scaled-out or clustered NAS
a. Used when organizations need to handle relatively large data sets fairly quickly,
but do not need response times on the order of milliseconds

Object Storage
a. Similar to scaled-out/clustered NAS, but a less mature technology
b. Works by giving each file a unique identifier and indexing the data and its
location. More like the way DNS works on the internet than a conventional file
system
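
The identifier-and-index idea can be illustrated with a toy in-memory store. This is only a sketch: the `ObjectStore` class, its method names, and the `node-N` location strings are hypothetical and do not correspond to any real object storage product.

```python
import uuid


class ObjectStore:
    """Toy object store: a flat namespace where an index maps IDs to locations."""

    def __init__(self, node_count=3):
        self._node_count = node_count
        self._index = {}  # object ID -> (location, metadata)
        self._blobs = {}  # location -> raw bytes (stands in for storage nodes)

    def put(self, data, **metadata):
        """Store data, record where it lives, and return its unique identifier."""
        object_id = uuid.uuid4().hex
        # Hypothetical placement rule: spread objects round-robin across nodes.
        location = f"node-{len(self._blobs) % self._node_count}/{object_id}"
        self._blobs[location] = data
        self._index[object_id] = (location, metadata)
        return object_id

    def locate(self, object_id):
        """Resolve an identifier to a location, much as DNS resolves a name."""
        return self._index[object_id][0]

    def get(self, object_id):
        """Fetch the data by identifier; the caller never supplies a path."""
        return self._blobs[self.locate(object_id)]


store = ObjectStore()
oid = store.put(b"2024-01-01 12:00:00 login ok", source="system log")
print(store.locate(oid))
print(store.get(oid))
```

The caller only ever holds the identifier; where the bytes actually live is the index's concern, which is what distinguishes this from a hierarchical file path.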

ORGANIZING
Involves categorizing and arranging the data as structured,
semi-structured or unstructured so that it is easy
to access and analyze
Data can be accessed using big data technologies such as NoSQL
databases and the Hadoop Distributed File System (HDFS)
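
As a rough illustration of the categorizing step, the sketch below sorts incoming records into structured, semi-structured and unstructured buckets. The `SCHEMA` columns and the classification rules are assumptions made for this example only; real pipelines use far richer criteria.

```python
import json
from collections import defaultdict

# Hypothetical fixed schema standing in for a relational table's columns.
SCHEMA = {"customer_id", "product", "amount"}


def classify(record):
    """Return 'structured', 'semi-structured' or 'unstructured' for one record."""
    if isinstance(record, dict):
        # A record matching the fixed schema is treated as structured;
        # any other key/value record is self-describing, so semi-structured.
        return "structured" if set(record) == SCHEMA else "semi-structured"
    if isinstance(record, str):
        try:
            parsed = json.loads(record)
        except ValueError:
            return "unstructured"  # free text that is not valid JSON
        if isinstance(parsed, (dict, list)):
            return "semi-structured"  # JSON document without a fixed schema
    return "unstructured"


def organize(records):
    """Group records by category so each group can be stored and queried suitably."""
    buckets = defaultdict(list)
    for record in records:
        buckets[classify(record)].append(record)
    return dict(buckets)


records = [
    {"customer_id": 1, "product": "lamp", "amount": 9.5},
    '{"user": "anna", "tags": ["sale", "mobile"]}',
    "Great service, will buy again",
]
for category, items in organize(records).items():
    print(category, len(items))
```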

DATA ANALYSIS
Involves extracting the data and applying statistical and business
analytics concepts to uncover hidden insights that
help decision making

DATA VISUALIZATION
Involves using tools such as D3.js, QlikView and Tableau to
visualize the data

ACTIONS OR RESULTS

THANKS
