What is BIG DATA?

Big Data is not just about lots of data. It is a concept that offers an opportunity to find
new insight in your existing data, as well as guidelines for capturing and analyzing your future data. It
can make a business more agile and robust so that it can adapt to and overcome business challenges.
Big data is a popular term used to describe the exponential growth and availability of data, both
structured and unstructured. Big data may be as important to business and society as the
Internet has become. Why? More data may lead to more accurate analyses. More accurate
analyses may lead to more confident decision making. And better decisions can mean greater
operational efficiencies, cost reductions and reduced risk.

Gartner 3 Vs of BIG DATA
Volume
Data storage is growing exponentially because data is now much more than text.
We find data in the form of videos, music and large images on our social media channels.
It is very common for enterprises to have storage systems holding terabytes or petabytes. As the
database grows, the applications and architecture built to support the data need to be
re-evaluated quite often.
Sometimes the same data is re-evaluated from multiple angles, and even though the original data
is the same, the newly found intelligence creates an explosion of data. This large volume
represents Big Data.
Velocity
The growth of data and the explosion of social media have changed how we look at data.
There was a time when we believed that yesterday's data was recent. As a matter of
fact, newspapers still follow that logic. However, news channels and radio have changed
how fast we receive the news.
Today, people rely on social media to keep them updated on the latest happenings. On social media,
a message (a tweet, a status update, etc.) that is only a few seconds old may no longer interest
users. They often discard old messages and pay attention to recent updates.
Data now moves almost in real time, and the update window has shrunk to fractions of
a second. This high-velocity data represents Big Data.
Variety
Data can be stored in multiple formats, for example in a database, Excel, CSV or Access, or for
that matter in a simple text file.
Sometimes the data is not even in a traditional format: it may be in the form of
video, SMS, PDF or something we have not yet thought about. The
organization needs to arrange it and make it meaningful.
That would be easy if all the data were in the same format, but this is rarely the case.
The real world has data in many different formats, and that is the challenge we need to
overcome with Big Data. This variety of data represents Big Data.
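As a small illustration of the variety problem, the sketch below reads the same kind of record from two different formats and arranges both into one common shape. The sample data, field names and function names are hypothetical and chosen only for this example; real sources would be files, databases or message feeds.

```python
import csv
import io
import json

# Hypothetical sample data: the same kind of record in two formats.
CSV_TEXT = "name,city\nAsha,Pune\n"
JSON_TEXT = '[{"name": "Ravi", "city": "Delhi"}]'


def read_csv_records(text):
    """Turn CSV text into a list of dictionaries, one per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_json_records(text):
    """Turn a JSON array of objects into a list of dictionaries."""
    return list(json.loads(text))


# Once both sources share one shape, they can be processed together.
records = read_csv_records(CSV_TEXT) + read_json_records(JSON_TEXT)
for record in records:
    print(record["name"], "lives in", record["city"])
```

Formats such as video or PDF need far more work than this, which is why variety is treated as a challenge in its own right.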
Emerging Dimensions
Two additional dimensions are emerging when thinking about big data:
Variability
In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent
with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered
peak data loads can be challenging to manage. Even more so with unstructured data involved.
Complexity
Today's data comes from multiple sources. And it is still an undertaking to link, match, cleanse
and transform data across systems. However, it is necessary to connect and correlate
relationships, hierarchies and multiple data linkages or your data can quickly spiral out of control.
What is Enterprise Computing?
The term enterprise commonly describes a business or venture of any size. Here, the term
enterprise refers to large multinational corporations, universities, hospitals, research
laboratories, and government organizations.
Enterprise computing involves the use of computers in networks, such as LANs and WANs, or a
series of interconnected networks that encompass a variety of different operating systems,
A typical enterprise consists of corporate headquarters, remote offices, international offices,
and hundreds of individual operating entities, called functional units, including departments,
centers, and divisions. Often, organizations within the enterprise may have similar
responsibilities within the divisions to which they belong.
Types of Enterprise
Retail enterprises own a large number of stores in a wide geographical area and use their size to
obtain discounts on the goods they purchase; they then seek to sell the goods at a lower price
than smaller retailers.
Manufacturing enterprises create goods on a large scale and then distribute and sell the goods
to consumers or other organizations.
Service enterprises typically do not create or sell goods, but provide services for consumers or
other organizations. Examples include companies in the insurance, restaurant, and financial industries.
Wholesale enterprises seek to purchase and then sell large quantities of goods to other
organizations, usually at a lower cost than retail.
Government enterprises include large city governments, state governments, and the
departments and agencies of the federal government.
Educational enterprises include large universities or schools that include executives, instructors,
and other service personnel and whose reach extends throughout a county, a state, or the entire country.
Transportation enterprises include airlines, regional transportation authorities, freight and
passenger railroads, and trucking firms. These enterprises often include a mix of such types of
transportation and have a local or an international reach.
Organization Structure of an Enterprise
Most traditional enterprises are organized in a hierarchical manner. The Figure shows an
example of an organization chart of a large manufacturing company.
Each enterprise has its own special needs, and the organizational structure of every enterprise
varies. Organizations may include all or some of the managers and departments. Organizations also
may include additional departments or combine some of those shown.
A decentralized approach to information technology exists when departments and divisions maintain
their own information systems. Sometimes, enterprises use outsourcing in a decentralized approach
so that the company can better focus on its core skills.
Some organizations maintain central computers, supported by a central information technology
department, which is referred to as a centralized approach to information technology.
Organizations decide whether to support a centralized or decentralized approach based on a number
of factors, including cost, efficiency, and the interdependence of departments.
A centralized approach to information systems usually reduces costs of maintenance and increases
A decentralized approach allows for greater flexibility, allowing each functional unit or department to
customize information systems to its particular needs.
Levels of Users in the Enterprise
How Managers Use Information?
All employees, including managers, in a company need accurate information to perform their
jobs effectively. Managers are responsible for coordinating and controlling an organization's
resources. Resources include people, money, materials, and information.
Managers coordinate these resources by performing four activities: planning, organizing,
leading, and controlling.
Planning involves establishing goals and objectives. It also includes deciding on the strategies
and tactics needed to meet these goals and objectives.
Organizing includes identifying and combining resources, such as money and people, so that the
company can reach its goals and objectives. Organizing also involves determining the
management structure of a company, such as the departments and reporting relationships.
Leading, sometimes referred to as directing, involves communicating instructions and
authorizing others to perform the necessary work.
Controlling involves measuring performance and, if necessary, taking corrective action.
Business Intelligence

Business intelligence (BI) includes several types of applications and technologies for acquiring,
storing, analyzing, and providing access to information to help users make more sound business decisions.
BI applications include decision support systems, query and reporting, online analytical
processing (OLAP), statistical analysis, and data mining.
Business Process Management

Business process management (BPM) includes a set of activities that enterprises perform to
optimize their business processes, such as accounting and finance, hiring employees, and
purchasing goods and services.
BPM almost always is aided by specialized software designed to assist in these activities.
Business Process Automation

Business process automation (BPA) provides easy exchange of information among business
applications, reduces the need for human intervention in processes, and uses software to
automate processes wherever possible.
BPA offers greater efficiency and reduces risks by making processes more predictable.
How Enterprises Measure the Quality of Their Operations?

Enterprises measure the quality of their operations in a number of ways.
Often, systems have specific requirements for availability, the capability to grow (scalability), and
interoperability.
One of the goals of an enterprise's hardware is to maintain a high level of availability to end users.
The availability of hardware to users is a measure of how often it is online. Highly available
hardware is accessible 24 hours a day, 365 days a year.
High Availability System
A high-availability system continues running and performing tasks for at least 99 percent of the time.
Some users demand that high-availability systems be available for 99.9 percent or 99.99 percent
of the time.
Uptime is a measurement of availability. A system that has an uptime of 99.99 percent is
nonfunctional for less than one hour per year. That one hour, called downtime, includes any
time that the computer crashes, needs repairs, or requires installation of replacement or
upgrade parts.
A system with 99.9 percent availability is said to have three nines of availability, and a system
with 99.99 percent availability is said to have four nines of availability.
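The relationship between uptime and downtime described above is simple arithmetic. The short sketch below works it out for the three availability levels mentioned in the text. The function name is hypothetical, and a year is assumed to have 365 days of 24 hours.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the maximum yearly downtime, in hours, for a given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return HOURS_PER_YEAR * unavailable_fraction


for percent in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(percent)
    print(f"{percent}% uptime allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99 percent availability allows roughly 87.6 hours of downtime per year, three nines allow roughly 8.76 hours, and four nines allow roughly 0.88 hours (about 53 minutes), which matches the statement that such a system is nonfunctional for less than one hour per year.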
As an enterprise grows, its information systems either must grow with it or must be replaced.
Scalability is a measure of how well computer hardware, software, or an information system can
grow to meet increasing performance demands.
A system that is designed, built, or purchased when the company is small may be inadequate
when the company doubles in size. When making decisions for computing solutions, managers
must be careful to consider the growth plans of the company.
Enterprises typically build and buy a diverse set of information systems.
An information system often must share information, or have interoperability, with other
information systems within the enterprise.
Information systems that more easily share information with other information systems are said
to be open.
Information systems that are more difficult to interoperate with other information systems are
said to be closed, or proprietary.
Recent open systems employ XML and Web services to allow a greater level of interoperability.
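To make the idea of XML-based interoperability concrete, here is a minimal sketch in which one system exports a record as XML and another system reads it back. The order record, its field names and both function names are hypothetical. Only Python's standard xml.etree.ElementTree module is used, and no particular Web service standard is implied.

```python
import xml.etree.ElementTree as ET


def export_order(order_id: str, customer: str, total: str) -> str:
    """Serialize a hypothetical order record to an XML string."""
    order = ET.Element("order", attrib={"id": order_id})
    ET.SubElement(order, "customer").text = customer
    ET.SubElement(order, "total").text = total
    return ET.tostring(order, encoding="unicode")


def import_order(xml_text: str) -> dict:
    """Parse the XML produced by export_order back into a plain dictionary."""
    order = ET.fromstring(xml_text)
    return {
        "id": order.get("id"),
        "customer": order.findtext("customer"),
        "total": order.findtext("total"),
    }


# The sending and receiving systems only need to agree on the XML layout.
message = export_order("1001", "Acme Ltd", "249.90")
print(message)
print(import_order(message))
```

Because the two sides share only a text format, they can be written in different languages and run on different operating systems.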
Backup Procedures
Business and home users can perform four types of backup: full, differential, incremental, or
selective. A fifth type, continuous data protection, typically is used only by large enterprises.
A full backup, sometimes called an archival backup, copies all of the files in the computer. A full
backup provides the best protection against data loss because it copies all program and data
files. Performing a full backup can be time-consuming. Users often combine full backups with
differential and incremental backups.
A differential backup copies only the files that have changed since the last full backup.
An incremental backup copies only the files that have changed since the last full or last
incremental backup.
A selective backup, sometimes called a partial backup, allows the user to choose specific files to
back up, regardless of whether or not the files have changed since the last incremental backup.
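The difference between the four backup types comes down to which files each one selects. The sketch below illustrates this under simplifying assumptions: every file carries a single modification time, times are plain integers, and all names and sample files are hypothetical. Real backup software tracks changes in more elaborate ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    name: str
    modified_at: int  # hypothetical timestamp, e.g. hours since a start point


def full_backup(files):
    """A full backup copies every file."""
    return [f.name for f in files]


def differential_backup(files, last_full_at):
    """Copy files changed since the last full backup."""
    return [f.name for f in files if f.modified_at > last_full_at]


def incremental_backup(files, last_backup_at):
    """Copy files changed since the most recent full or incremental backup."""
    return [f.name for f in files if f.modified_at > last_backup_at]


def selective_backup(files, chosen_names):
    """Copy only the files the user picked, changed or not."""
    return [f.name for f in files if f.name in chosen_names]


files = [
    FileInfo("payroll.db", 5),
    FileInfo("report.doc", 30),
    FileInfo("notes.txt", 50),
]
LAST_FULL_AT = 10         # time of the last full backup
LAST_INCREMENTAL_AT = 40  # time of the last incremental backup

print(full_backup(files))
print(differential_backup(files, LAST_FULL_AT))
print(incremental_backup(files, LAST_INCREMENTAL_AT))
print(selective_backup(files, {"payroll.db"}))
```

With this sample data the differential backup picks up both files changed after the full backup, while the incremental backup picks up only the file changed after the previous incremental run.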
Continuous data protection (CDP), or continuous backup, is a backup plan in which all data is
backed up whenever a change is made. Because CDP is costly, few organizations have
implemented continuous data protection, but its popularity is growing quickly as the cost for the
technology falls. CDP requires little or no maintenance when compared to other backup
methods. Many experts believe that CDP will replace all other types of backups in the future.
Disaster Recovery Plan

A disaster recovery plan is a written plan describing the steps a company would take to restore
computer operations in the event of a disaster.
Every company and each department or division within an enterprise usually has its own
disaster recovery plan.
A disaster recovery plan contains four major components: the emergency plan, the backup plan,
the recovery plan, and the test plan.
An emergency plan specifies the steps to be taken immediately after a disaster strikes. The
emergency plan usually is organized by type of disaster, such as fire, flood, or earthquake.
Once the procedures in the emergency plan have been executed, the next step is to follow the
backup plan. The backup plan specifies how an organization uses backup files and equipment to
resume information processing. The backup plan should specify the location of an alternate
computer facility in the event the organization's normal location is destroyed or unusable.
The recovery plan specifies the actions to be taken to restore full information processing
operations. As with the emergency plan, the recovery plan differs for each type of disaster. To
prepare for disaster recovery, an organization should establish planning committees, with each
one responsible for different forms of recovery.
To provide assurance that the disaster plan is complete, it should be tested. A disaster recovery
test plan contains information for simulating various levels of disasters and recording an
organization's ability to recover.