Data is any set of characters that has been gathered and translated for some
purpose, usually analysis. It can take any form, including text, numbers,
pictures, sound, or video. Data that is not put into context means nothing to a
human or a computer.
TYPES OF DATA
WHAT IS BIG DATA?
Big data is a concept that offers an opportunity to find new insights in your existing data, as well as
guidelines for capturing and analyzing your future data. It makes a business more agile and robust so it can
adapt and overcome business challenges.
Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand
database management tools or traditional data processing applications.
‘Big Data’ is similar to ‘small data’, but bigger in size.
Big Data generates value from the storage and processing of very large quantities of digital information
that cannot be analyzed with traditional computing techniques.
3 V’S OF BIG DATA
VOLUME, VELOCITY, VARIETY
VELOCITY
• Big Data Velocity deals with the pace at which data flows in from sources like business
processes, machines, networks and human interaction with things like social media sites, mobile
devices, etc.
• The flow of data is massive and continuous.
• This real-time data can help researchers and businesses make valuable decisions that provide
strategic competitive advantages and ROI, provided they are able to handle the velocity.
• Inderpal suggests that sampling data can help deal with issues like volume and velocity.
• High-velocity data is characteristic of Big Data.
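The sampling idea in the bullets above can be illustrated with reservoir sampling, one common way to keep a fixed-size random sample from a stream whose length is not known in advance. This is only a minimal sketch: the function name `reservoir_sample` and the simulated stream are assumptions made for illustration, and the source does not say which sampling method is meant.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Keep a uniform random sample of k items from a stream of unknown length.

    Hypothetical helper for illustration (classic "Algorithm R").
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace an existing item with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Simulated high-velocity stream: one million event ids, sampled down to 5.
sample = reservoir_sample(range(1_000_000), k=5, seed=42)
print(sample)
```

Because the reservoir never grows beyond k items, memory use stays constant no matter how fast or how long the data keeps arriving.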
VARIETY
• Variety refers to the many sources and types of data, both structured and unstructured.
• Data can be stored in multiple formats.
• In the past, data was stored in sources like spreadsheets and databases.
• Today, data is stored in the form of emails, photos, videos, monitoring devices, PDFs,
audio, etc.
• This variety of unstructured data creates problems for storing, mining and analyzing data.
• This variety of data is characteristic of Big Data.
Big Data Storage
Software design
Flash Storage
Storage
Big Data Processing
Batch Processing
Why Big Data?
First, let’s start with the definition of big data. Big data
means large sets of structured and unstructured data.
These sets are so large and complex that regular data
processing techniques cannot handle them.
Understand the market conditions:
Analyzing big data helps in understanding current market conditions. For example, by analyzing customers’
purchasing behavior, a business can find out which products sell the most and plan its future
products according to this trend. As a result, it can get ahead of its competitors.
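As a small illustration of the purchasing-behavior example above, the sketch below counts units sold per product and lists the best sellers. The purchase records and the helper name `top_products` are made up for illustration; a real data set would be far larger and would typically be processed with a distributed tool.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_products(purchases: Iterable[Tuple[str, int]], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n products with the highest total quantity sold.

    Each purchase is a (product_name, quantity) pair. Hypothetical helper.
    """
    totals: Counter = Counter()
    for product, quantity in purchases:
        totals[product] += quantity
    return totals.most_common(n)


# Invented sample records, not real sales data.
purchases = [
    ("laptop", 2),
    ("phone", 5),
    ("tablet", 1),
    ("phone", 3),
    ("laptop", 1),
]

for product, units in top_products(purchases, n=2):
    print(f"{product}: {units} units")
```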
Cost Savings:
Implementing big data tools may be expensive at the beginning, but it can eventually save you a lot of
money.
Big Data v/s Small data
Sources for BIG DATA
Sensor Data:
Companies that use devices equipped with sensors and network connectivity can leverage the data
those devices generate.
Social interactions:
This is data produced by human interactions through a network, such as the Internet. The most common is
the data produced in social networks.
Business transactions:
Data produced as a result of business activities can be recorded in structured or unstructured databases.
Archived data:
This refers to unstructured documents, statically or dynamically produced, which are stored or published
as electronic files, like Internet pages, videos, audio, PDF files, etc.
1. Apache Hadoop:
Hadoop has become synonymous with big data and is one of the most widely used distributed data
processing frameworks. This powerful system is known for its ease of use and its ability to process extremely
large data sets in both structured and unstructured formats.
2. Lumify:
Lumify is a relatively new open source Big Data fusion, analysis and visualization platform that is
sometimes presented as an alternative to Hadoop. It can rapidly sort through large quantities of data of
different sizes, sources and formats.
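Hadoop's classic processing model is MapReduce: a map step turns each input record into key/value pairs, the pairs are grouped by key, and a reduce step combines each group. The sketch below imitates that pattern for a word count in plain, single-machine Python. It does not use any Hadoop API, and all function names are hypothetical; it only shows the shape of the computation that Hadoop spreads across many machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def word_count(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["big data", "Big Data tools"]))
```

Each record is mapped independently and each key is reduced independently, which is what allows a framework to distribute the work over commodity hardware.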
Data Security
Data privacy/security
Bad Analytics
Bad Data
According to NASSCOM, the big data sector is expected to grow at a CAGR of 26% over the next 5 years
and to reach a value of $16 billion by 2025.
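To make the CAGR figure concrete, the short calculation below shows what a 26% compound annual growth rate implies over five years. It assumes annual compounding. The implied starting value further assumes that the five-year window ends in 2025, which the source does not state, so treat that number as illustrative only.

```python
def growth_factor(cagr: float, years: int) -> float:
    """Total growth multiple after compounding `cagr` once a year for `years` years."""
    return (1.0 + cagr) ** years


def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Starting value that would grow to `end_value` at the given CAGR."""
    return end_value / growth_factor(cagr, years)


factor = growth_factor(0.26, 5)
print(f"Growth multiple over 5 years at 26% CAGR: {factor:.2f}")

# ASSUMPTION: the 5-year window ends in 2025 at $16 billion (not stated in the source).
start = implied_start_value(16.0, 0.26, 5)
print(f"Implied starting value: ${start:.2f} billion")
```

At this rate the sector would roughly triple in size over the five-year period.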
Realizing the great importance of big data, many analytical organizations are moving beyond process
improvements to find hidden information buried in big data and make the best use of it.
Big data technologies have addressed the problems related to this new big data revolution through the use of
commodity hardware and distributed processing.
Companies like Google, Yahoo!, Microsoft, Facebook and Amazon are investing heavily in Big Data research and
projects.
By leveraging Big Data technologies effectively, organizations can be more efficient and more competitive.
Big Data technologies are changing the world: everything from the Internet of Things to the gathering of
more qualitative and more quantitative data will lead to better decision-making and insight.