
AN INTRODUCTION BY

PARAGYTE TECHNOLOGIES

AUTHOR: PARAGYTE TECHNOLOGIES


PARAGYTE'S END-TO-END DIGITAL SOLUTIONS.
Date: 07-07-17

2017 Paragyte Technologies, All Rights Reserved.


AGENDA

WHAT IS BIG DATA

WHY BIG DATA

FRAMEWORKS OF BIG DATA


APACHE MAHOUT
APACHE PIG
APACHE SPARK
APACHE HIVE
APACHE SOLR

IMPACT OF BIG DATA ON INFORMATION TECHNOLOGY

WHAT WE OFFER



WHAT IS BIG DATA
An Explanation


According to Wikipedia, big data is a term for data sets that are so large or
complex that traditional data processing application software is inadequate
to deal with them.

In other words, big data is a collection of huge sets of structured and
unstructured data that traditional database management tools are
ill-equipped to deal with.


To better explain big data, let's study data, or as it has now become, small data:

Data of a single individual initially comprised just:
Name, residential address, residential telephone number, work address, and work telephone
number. After a few years, these were added: mobile number, personal email address, work
email address.

Now, the data of that same person has amplified to:
Name, home telephone number, work telephone number, home address, work address, mobile
number, work email, personal email, personal and work email attachments, social media user IDs
and passwords, information required to retrieve all those passwords, social media posts including
uploads/downloads of photos and videos, mobile device data, mobile device sensor data, personal
videos and photos from mobile device.

And this is just to start with.


Big data is described by the following characteristics:

VOLUME: The quantity of generated and stored data.

VARIETY: The type and nature of data.

VELOCITY: The speed at which data is generated and processed.

VARIABILITY: Consistency of the data set.

VERACITY: Quality of the captured data.


Let's look at it from a different perspective:

[Image comparison: "THIS IS DATA" versus "THIS IS BIG DATA"]

Judging by the size of it, if organizations are not prepared, it could even spell disaster for them.



WHY BIG DATA
An Analysis


Before we begin answering this question, let's try to digest the following facts:

Data is growing at such a fast pace that it is predicted that by 2020, ~2 MB of data will be generated
every second for every individual.

Google alone receives nearly 40,000 search queries every second, which makes nearly 1.2 trillion searches
per year.

In 2008, Google processed ~20,000 TB (i.e., 20 petabytes) in a single day.

Facebook recently announced that it has reached 2 billion monthly users worldwide.

Organizations/Brands on Facebook receive ~35,000 Likes every minute.

YouTube receives ~300 hours of video content every minute.
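
The per-second and per-minute figures above are easier to compare once scaled to a day or a year. The sketch below only redoes that arithmetic from the numbers quoted on this slide; it assumes a 365-day year and decimal units (1 PB = 1,000 TB, 1 GB = 1,000 MB), and the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds
DAYS_PER_YEAR = 365             # assumption: non-leap year

# Google: ~40,000 search queries per second, scaled to one year.
searches_per_year = 40_000 * SECONDS_PER_DAY * DAYS_PER_YEAR

# Google, 2008: ~20,000 TB per day, expressed in petabytes (decimal units).
petabytes_per_day = 20_000 / 1_000

# Predicted ~2 MB per second per person, scaled to one day in gigabytes.
gb_per_person_per_day = 2 * SECONDS_PER_DAY / 1_000

# YouTube: ~300 hours of video per minute, scaled to one day.
video_hours_per_day = 300 * 60 * 24

print(f"{searches_per_year:,} searches per year")
print(f"{petabytes_per_day:g} PB processed per day")
print(f"{gb_per_person_per_day:g} GB per person per day")
print(f"{video_hours_per_day:,} hours of video per day")
```

The first result, about 1.26 trillion, agrees with the "nearly 1.2 trillion searches per year" figure quoted above.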


This image is pretty self-explanatory of the impact big data is having on the internet
right now.

It indicates digital events happening around the world this very minute.


Here's why organizations need to move to big data:

Due to the incremental shift from analog to digital, the need for increased data storage
has multiplied manifold.

Since all data is warehoused in one single location, it is easier for organizations to take
minimal-risk, well-educated, and well-calculated decisions at the right time.

Unlike other DBMS tools, big data management tools and technologies such as NoSQL,
MapReduce, etc., provide the ability to retrieve information without changing the
underlying data structures in a big data database.

Big data forms a major chunk of the $65 billion database and analytics market.


Big data comprises structured and unstructured data sets. This eliminates the
organization's need to invest in other specialized DBMS tools.

Big data has the ability to locate, extract, modify, visualize, and calculate the data
stored in its databases using a variety of specialized tools and technologies.

Allows for management of data in-house or in the cloud using Hadoop.

Due to its flexibility, scalability, mobility, and speed, it can efficiently tackle the
challenges thrown by complex transactions of the modern day.


Fraudulent transactions can be detected and prevented at an early stage to help in
reducing or eliminating financial loss.

By using industry-standard practices, organizations can predict their competitors'
actions. This can help organizations stay a step ahead of their competitors.

Since all customer data is centrally located and easily accessed, organizations can serve
their customers better by proactively sending discounts or specialized offers based on
their browsing patterns.

Big data services are built around secure infrastructures that protect an organization's
valuable data from cyber criminals.



BIG DATA AND ITS FRAMEWORKS
An Explanation



BIG DATA FRAMEWORKS

The following frameworks are commonly used to work with big data:

APACHE MAHOUT
APACHE PIG
APACHE SPARK
APACHE HIVE
APACHE SOLR


All about Apache Mahout:

It is a library that uses the MapReduce paradigm, implemented on top of Hadoop

Algorithms used in the library focus on filtering, clustering, and classification
of big data

Also provides Java libraries/primitive Java collections for statistical and
algebraic operations

Helps in creating scalable, performance-oriented machine learning
applications on Hadoop
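
The MapReduce paradigm mentioned above can be illustrated with a small word count. This is a single-process sketch in plain Python, not Mahout or Hadoop code; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values of each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over all documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical input documents.
documents = ["big data is big", "data is growing"]
counts = word_count(documents)
print(counts)
```

On a real cluster the map and reduce steps would run in parallel on many machines, with the framework handling the shuffle between them.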


Benefits of using Apache Mahout:

Better user targeting based on predictions of audience interests

Improved advert targeting and content filtering

Filtering of members, products, services, and locations for better ROI


Cluster creation, detection of duplicate documents, and
cluster classification


All about Apache Pig:

It is a high-level platform to create programs that run on Hadoop

The programs target large datasets by executing MapReduce jobs

Uses a high-level language called Pig Latin, which resembles SQL with some
differences
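
A Pig Latin script is typically written as a step-by-step data flow built from operators such as LOAD, FILTER, GROUP, and FOREACH ... GENERATE. The sketch below mirrors that flow in plain Python so the shape of such a program is visible; it is not Pig Latin and does not run on Hadoop, and the transaction records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (user, amount) records standing in for a LOAD-ed dataset.
transactions = [
    ("alice", 120.0),
    ("bob", 40.0),
    ("alice", 300.0),
    ("carol", 150.0),
    ("bob", 500.0),
]


def filter_large(records, threshold):
    """FILTER-like step: keep records whose amount exceeds the threshold."""
    return [record for record in records if record[1] > threshold]


def group_by_user(records):
    """GROUP-like step: collect the amounts under each user key."""
    groups = defaultdict(list)
    for user, amount in records:
        groups[user].append(amount)
    return groups


def total_per_user(groups):
    """FOREACH ... GENERATE-like step: produce one aggregated row per group."""
    return {user: sum(amounts) for user, amounts in groups.items()}


large = filter_large(transactions, threshold=100.0)
totals = total_per_user(group_by_user(large))
print(totals)
```

Each step takes the previous step's output as its input, which is the main difference from a single declarative SQL statement.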


Benefits of using Apache Pig:

Prevent data fraud through detailed transaction analysis

Quick exploration of large datasets

Analyze user engagement on the web

Produce KPIs for critical web services


All about Apache Spark:

It is a general-purpose, large-scale processing engine

It is a fast, easy-to-use, and advanced option for development

Considered to be the most suited processing engine for performing
advanced analytics in large-scale data processing


Benefits of using Apache Spark:

Valuable tool for diverse computing needs

Runs applications faster than disk-based MapReduce by keeping working data in memory


All about Apache Hive:

It is an open-source data warehouse project built on top of Hadoop

Uses an SQL-like interface to facilitate summarization, query, and analysis of
data stored in various databases and file systems

Provides a mechanism to structure organizational data through HiveQL

Brings about better management and querying for large datasets
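
The kind of SQL-like summarization described above can be sketched with a GROUP BY query. The example below uses Python's built-in `sqlite3` module purely as a local stand-in so that it runs anywhere; it is not Hive, and the `page_views` table and its rows are hypothetical. A comparable aggregate query could be written in HiveQL against data stored in Hadoop.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, page TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("IN", "home", 120),
        ("US", "home", 200),
        ("IN", "pricing", 30),
        ("US", "pricing", 50),
    ],
)

# Summarization: total views per country, largest first.
rows = conn.execute(
    "SELECT country, SUM(views) AS total_views "
    "FROM page_views GROUP BY country ORDER BY total_views DESC"
).fetchall()
conn.close()

print(rows)
```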


Benefits of using Apache Hive:

Higher ROI

Reduced time for semantic checks

Ad-hoc style querying


All about Apache Solr:

It is a powerful open-source enterprise search tool

Provides text search, faceted search, real-time indexing, dynamic clustering,
rich document handling, and ease of integration

Used by popular sites such as Reddit, Netflix, Instagram
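
Text search and faceted search, two of the features listed above, can be illustrated with a tiny inverted index. This is a plain-Python sketch of the general idea, not Solr's API or its internal implementation; the documents, the `category` facet field, and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical documents with one facet field ("category").
documents = [
    {"id": 1, "text": "big data on hadoop", "category": "tutorial"},
    {"id": 2, "text": "search big data with facets", "category": "guide"},
    {"id": 3, "text": "hadoop cluster setup", "category": "tutorial"},
    {"id": 4, "text": "big data search basics", "category": "tutorial"},
]


def build_index(docs):
    """Build an inverted index mapping each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        for term in doc["text"].lower().split():
            index[term].add(doc["id"])
    return index


def search(index, docs, query):
    """Return ids of documents containing every query term, plus facet counts."""
    terms = query.lower().split()
    if not terms:
        return [], Counter()
    matching = set.intersection(*(index.get(term, set()) for term in terms))
    facets = Counter(doc["category"] for doc in docs if doc["id"] in matching)
    return sorted(matching), facets


index = build_index(documents)
hits, facets = search(index, documents, "big data")
print(hits, dict(facets))
```

The facet counts tell the user how many matching documents fall into each category, which is what lets a search page offer "narrow by category" links.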


Benefits of using Apache Solr:

It supports indexing and searching through multiple sites and indexing
attachments

Designed for scalability and fault tolerance



IMPACT OF BIG DATA ON
INFORMATION TECHNOLOGY
An Investigation



IMPACT OF BIG DATA ON IT

To start with, let's look at a few commercial establishments/political factions where big
data made a major impact:

Walmart handles more than 1 million customer transactions every hour, which are imported into databases
estimated to contain more than 2.5 petabytes of data.

The largest AT&T database boasts titles including the largest volume of data in one unique database (312
terabytes) and the second largest number of rows in a unique database (1.9 trillion), which comprises
AT&T's extensive calling records.

The White House has already invested more than $200 million in big data projects.

Big data analysis played a large role in Barack Obama's successful 2012 re-election campaign.

Big data analysis was tried out by a major Indian political faction in its bid to win the 2014 Indian general election.


Now, let's discuss further the impact big data will have on information
technology:

Employment boom for specialists and IT professionals

Shortage of IT workers in the US with the specific skills to handle large pools of data

A growing need for employer-sponsored training programs

Call for the government to issue visas to foreign workers in the US (this remains up for debate)

More data-reliant companies in the marketplace as technology evolves

New specialty job positions emerging in the healthcare IT sector

Special higher education programs being developed to meet future demand in Healthcare Informatics



WHAT PARAGYTE OFFERS
A Proposal



PARAGYTE TECHNOLOGIES

Paragyte's big data solutions team comprises certified and experienced data analytics
experts. We offer cross-platform expertise to create custom big data solutions and
develop your data analytics capabilities on various technologies and database
platforms.

Paragyte offers managed services that are designed to enable complex data analysis features for
facilitating business decisions.

Paragyte offers data mining, data aggregation, data integration, migration, and maintenance along with
data optimization.

Paragyte's certified big data experts can create highly scalable solutions that facilitate organic data search,
real-time and batch processing, and dynamic data links.


Paragyte Technologies offers the following big data services:

ARCHITECTURE STRATEGY AND DESIGN
We help you establish scalable and flexible big data architecture that enables
development of holistic business strategies. Paragyte's structured and business-focused
approach makes big data utilization ROI-driven for your organization.

BIG DATA IMPLEMENTATION
By putting a robust big data architecture in place, Paragyte facilitates seamless
big data implementation for your organizational needs. We augment your existing
database management capabilities and enable agile testing and complex data
management.

BIG DATA CONSULTING
Paragyte's big data analysts offer insightful consulting that helps you adopt big data
for improving your business efficiencies. Your big data abilities enable achievement of
business objectives and faster realization of ROI.


BUSINESS ANALYTICS & INTELLIGENCE
Ensure smooth data collection and accumulation from different business processes
while conducting quantitative & qualitative analysis of data. Paragyte's big data
solutions help you integrate data mining, digital dashboards & OLAP to generate
business intelligence (BI).

DATA MINING & DATA AGGREGATION
Augment your database management system with faster and smoother data mining and
data aggregation capabilities. Paragyte's big data services facilitate intelligent search
and access of data, which can in turn be statistically analyzed on an integrated platform.

BIG DATA OPTIMIZATION
Rise beyond the mathematical solutions of big data optimization and develop
solutions for your data analytics and business intelligence needs using Paragyte's
expert big data services.


BIG DATA INTEGRATION
To improve your customer experience and develop business-focused strategies,
Paragyte's big data solutions capture and accumulate data from varied sources.
Systematic and structured data integration helps you understand customer usage
trends & productivity enhancement factors.

DATA MIGRATION & MAINTENANCE
As your organizational activities generate complex data at high speed, there is an
undeniable need to migrate and maintain this data. With our big data services, you will
be able to access and share your data anywhere and anytime.

HADOOP & MONGODB DEVELOPMENT
Paragyte's Hadoop expertise helps you design custom big data management solutions,
thereby generating intelligent insights from large quantities of data. Upgrade your
database management capabilities by partnering with Paragyte for tailor-made
MongoDB development services.



Thank You!
contact@paragyte.com +1 917 909 8646 www.paragyte.com

