Every day we produce 2.5 quintillion bytes of data; in 2012 an estimated 2.5 zettabytes were
generated.

There are two big reasons why Big Data has grown exponentially, to the point where we can use it
today. According to McKinsey, Big Data is a growing torrent. Why? First, nearly everyone now has a
phone; those phones have apps, those apps store information, and that information is one of the
sources of what we call Big Data. Every time we post, like, play or even run a search, we add
information to that growing torrent. The second reason is cloud computing, which means computing
anywhere and at any time. We provide data all day, even when we are not actively doing so, or at
least think we are not. Then, once we accept that everything we generate must be stored, we need
to integrate the data coming from organizations, from sensors in cities and elsewhere, and from
other people.

With those amounts of information, we can build models that improve our precision in everyday
tasks, such as making decisions or making sales; in other words, “personalized marketing”.

What does this mean? We can find out how customers feel (“sentiment analysis”) in order to give
them what they are asking for. This relies on natural language processing, a set of techniques
that lets us capture customers' opinions without reading every one of them.
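As a rough illustration of how opinions can be summarized without reading each one, here is a minimal word-counting sketch. The word lists and the sample reviews are invented for this example, and the function names `sentiment` and `summarize` are hypothetical; real sentiment analysis normally uses much larger lexicons or trained models.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, invented for illustration only.
POSITIVE = {"good", "great", "love", "fast", "excellent", "happy"}
NEGATIVE = {"bad", "slow", "hate", "broken", "terrible", "late"}


def sentiment(text: str) -> str:
    """Label one opinion by counting positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(opinions):
    """Count how many opinions fall under each label."""
    return Counter(sentiment(o) for o in opinions)


# Made-up customer reviews.
reviews = [
    "Great product, I love it",
    "Delivery was slow and the box arrived broken",
    "It is a phone",
]
summary = summarize(reviews)
print(summary)
```

The summary gives the overall mood of the customers at a glance, which is the point of the technique: the counts stand in for reading every review.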

We collect data from:

• Machines (the largest source of Big Data).
• Organizations.
• People (for example, on social media).

The data that a specific organization collects, combined with open data and analytics, lets us
make better predictions.

However, what is Big Data? It is typically described in terms of the 4 V's: velocity, variety,
volume and veracity. By supporting and integrating these four, we create Big Data. However, large,
even massive, amounts of data do not by themselves amount to information, let alone real
knowledge, and certainly not wisdom.

Walmart collects data from online clicks, Twitter data, local events, local weather and in-store
purchases in order to launch new products, run predictive analytics and customize its marketing.

Most Big Data is analysed with Hadoop, a framework built specifically for that purpose, as well as
with Spark and Apache Storm.

One of the challenges of Big Data is where to store all that information, and what to do with it.
For that we have Data Science: we analyse the data and find patterns to answer questions.

So, what is Data Science?

We need a purpose and many people with different knowledge of maths, business, programming, etc.
We need a process, for example collecting data, cleaning data, processing, analysing and
reporting results. We also need platforms and programmability.
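The process described above (collect, clean, process, analyse, report) can be sketched as a chain of small functions. This is only a toy illustration under stated assumptions: the store names, the sales figures and all function names are hypothetical, and a real project would read from actual sources and use a data platform rather than in-memory lists.

```python
from statistics import mean


def collect():
    """Collecting: return raw records (store, sales amount as text). Values are invented."""
    return [("north", "120"), ("south", " 80 "), ("north", ""),
            ("south", "100"), ("east", "n/a")]


def clean(rows):
    """Cleaning: strip whitespace and drop records whose amount is not a whole number."""
    cleaned = []
    for store, amount in rows:
        amount = amount.strip()
        if amount.isdigit():
            cleaned.append((store, int(amount)))
    return cleaned


def process(rows):
    """Processing: group the amounts by store."""
    grouped = {}
    for store, amount in rows:
        grouped.setdefault(store, []).append(amount)
    return grouped


def analyse(grouped):
    """Analysing: compute the average sale per store."""
    return {store: mean(amounts) for store, amounts in grouped.items()}


def report(averages):
    """Results: format one readable line per store, sorted by name."""
    return [f"{store}: {avg:.1f}" for store, avg in sorted(averages.items())]


lines = report(analyse(process(clean(collect()))))
print("\n".join(lines))
```

Keeping each step in its own function mirrors the idea that Data Science is a process: any stage can be inspected or replaced without touching the others.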
