
Big Data: Opportunities, Strategies and Challenges

Executive Summary
Gregg Barrett
Acknowledgment

This report draws extensively on, and focuses on, the work and viewpoints of industry participants
including:

Diversity Limited
Economist Intelligence Unit
Gartner
HBR
Hortonworks
IBM
ITG
Intel
McKinsey
Ordnance Survey
John Standish Consulting
Christopher Bienko @ IBM
Dirk deRoos @ IBM
John Choi @ IBM
Marc Andrews @ IBM
Paul Zikopoulos @ IBM
Rick Buglio @ IBM
Strategy Meets Action

References are included in-text as well as in the References section at the end of the report.

Challenges facing the industry

Difficult and uncertain economic conditions, low interest rates, decreasing underwriting profitability,
higher combined ratios and low investment returns are placing insurers under stress. Insurers also
have to confront commoditisation of the business, more informed consumers, high customer churn
rates, new distribution channels and strong competition. As if this were not enough, natural perils,
increased regulatory intervention and greater demands for transparency by regulators, together
with ever-increasing compliance requirements, are placing immense strain on the capabilities of
insurers.

According to IBM (2013), to thrive in this environment insurers must gain a specific set of capabilities
that will allow them to:

- Build a customer-centric business model
- Find profitable ways to sustain growth
- Develop new, competitively priced products
- Increase claims efficiency and effectiveness
- Improve capital management and investment decisions
- Improve risk management and regulatory reporting

(IBM, 2013, pg. 2)

Insurers are turning to analytics

The business of insurance is based on analysing data to understand and evaluate risks. Two important
insurance professions, actuarial and underwriting, emerged at the beginning of the modern insurance
era in the 17th century. These both revolve around and are dependent upon the analysis of data.

(Strategy Meets Action, 2012, pg. 3)

While the insurance industry has long been recognized for analysing data, the new news involves the
overwhelming amount of data that is now available for analysis and the sophistication of the
technology tools that can be used to perform the analysis. The opportunities for advanced analysis
are many and the potential business impact is enormous.

(Strategy Meets Action, 2013, pg. 3)

The Concept of Big Data

In simple terms Big Data refers to a data environment that cannot be handled by traditional
technologies.

Big Data is often described in terms of the three Vs, and if you are at IBM, it is likely to be the four Vs.
Figure 1 below illustrates the IBM four V representation of Big Data:

Figure 1: Big Data in dimensions

Figure 1. Four dimensions of big data. Copyright 2012 by IBM. Reprinted with permission.

Volume refers to the quantity (gigabytes, terabytes, petabytes etc.) of data that organizations are
trying to harness. Importantly there is no specific measure of volume that defines Big Data, as what
constitutes truly high volume varies by industry and even geography. What is clear is that data
volumes continue to rise.

Variety refers to the different types (forms) of data and data sources. Data types include numeric,
text, image, audio, web and log files, among others, whether structured or unstructured. The
growth of data sources such as social media, smart devices, sensors and the Internet of Things has
increased not only the volume of data but also the number of data types.

Velocity refers to the speed at which data is created, processed and analysed. Velocity impacts latency,
which is the lag time between when data is created or captured, and when it is processed into an
output form for decision making purposes. Importantly, certain types of data must be analysed in
real-time to be of value to the business, a task that places impossible demands on traditional systems,
where the ability to capture, store and analyse data in real-time is severely limited.

Veracity refers to the level of reliability associated with certain types of data. According to IBM some
data is inherently uncertain, for example: sentiment and truthfulness in humans; GPS sensors
bouncing among the skyscrapers of Manhattan; weather conditions; economic factors; and the future.
When dealing with these types of data, no amount of data cleansing can correct for it. Yet despite
uncertainty, the data still contains valuable information. The need to acknowledge and embrace this
uncertainty is a hallmark of Big Data. (IBM, 2012, pg. 5)

The Big Data Impact

According to McKinsey (2011), Big Data creates value in several ways:

- Creating transparency
- Enabling experimentation to discover needs, expose variability, and improve performance
- Segmenting populations to customize actions
- Replacing/supporting human decision making with automated algorithms
- Innovating new business models, products, and services

To understand the impact at an organisational level, Erik Brynjolfsson with a team at MIT, working in
partnership with McKinsey, Lorin Hitt at Wharton and the MIT doctoral student Heekyung Kim,
conducted structured interviews with executives at 330 public North American companies about their
organizational and technology management practices, and gathered performance data from their
annual reports and independent sources.

Based on the analyses they conducted, one relationship stood out: The more companies characterized
themselves as data-driven, the better they performed on objective measures of financial and
operational results. In particular, companies in the top third of their industry in the use of data-driven
decision making were, on average, 5% more productive and 6% more profitable than their
competitors. This performance difference remained robust after accounting for the contributions of
labour, capital, purchased services, and traditional IT investment. (HBR, 2012)

Further, an IBM study based on survey responses from more than 1,000 business and IT executives in
more than 60 countries revealed four transformative shifts in the use of Big Data:

1. A solid majority of organizations are now realizing a return on their Big Data investments
within a year.
2. Customer centricity still dominates analytics activities, but organizations are increasingly
solving operational challenges using Big Data.
3. Integrating digital capabilities into business processes is transforming organizations.
4. The value driver for Big Data has shifted from volume to velocity.

(IBM, 2014, pg. 1)

While Big Data has created significant opportunity, it has also brought new challenges. According
to Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014), some challenges include:

- Greater volumes of data than ever before: placing more demands on the organisation's security
plan.

- The experimental and analytical usage of the data: democratizing data within the organisation
requires building trust into the Big Data platform. A data governance framework covering lineage,
ownership etc. is required for any successful Big Data project.

- The nature and characteristics of Big Data: the data consists of more sensitive personal details
than ever before, raising governance, risk and compliance concerns.

- The adoption of technologies that are still maturing: Big Data technologies like Hadoop (and much
of the NoSQL world) do not have all of the enterprise hardening from a security perspective that's
needed, and there's no doubt compromises are being made.

A look at Big Data in Insurance

Exploration and discovery

Big Data necessitates an approach of exploration and discovery. As articulated by Gartner (2013),
business analysts have typically worked to a requirements-based model, answering clearly-defined
business questions. Big Data, however, demands a different approach, using opportunistic analytics
and exploring answers to ill-formed or non-existent questions. (Gartner, 2013, pg. 1)

Figure 2: Culture change - Discovery versus control

Figure 2. A better assessment of the data around and connected to a single piece of information enables a more complete,
in-context understanding. Copyright 2013 by IBM. Reprinted with permission.

Moving to a data driven culture

Gartner (2014) has found that many insurance IT departments lack a consistent, enterprise-wide
business intelligence and data management strategy because of siloed, line-of-business-centric IT
systems. (Gartner, 2014, pg. 6)

In embracing the Big Data paradigm the Economist Intelligence Unit (2013) suggests moving towards
what they call a data driven culture. According to the report, in promoting a data driven culture
organisations should consider:

- Data-driven companies place a high value on sharing. Companies own data, not employees. Data
are a resource that can power growth, not something to be hoarded.

- Shared data should be utilised by as many employees as possible, which in practice means rolling
out training wherever it is needed.

- Data collection needs to be a primary activity across departments

- Perhaps most importantly, implementing a data driven culture requires buy-in from the top;
without that, little will change.

(Economist Intelligence Unit, 2013, pg. 11)

Emerging techniques in Big Data on the insurance front

According to Ordnance Survey (2013) the following are some of the emerging techniques being
deployed by insurers:

- Predictive modelling: already well used by insurance companies, this works even better when
more data is fed into the model.

- Data-clustering: automated grouping of similar data points can provide new insights into
apparently familiar situations. Livehoods.org is an example of how social media and machine
learning can reveal previously-unseen patterns.

- Sentiment analysis: textual keyword analysis can help analyse the mood of Twitter chatter on a
given topic or brand.

- Web crawling: sophisticated programmes that can identify an individual's web footprint as a
result of posting on social media websites, blogs and photo-sharing services. Using data-matching,
this can be linked to public records and data from other third parties to build a multi-dimensional
profile of an individual.

(Ordnance Survey, 2013, pg. 22)
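
As a rough illustration of the keyword-based sentiment analysis described above, the following Python sketch scores short posts against two small word lists. It is a minimal sketch only: the word lists, function names and sample posts are hypothetical and do not come from the Ordnance Survey report, and a production system would rely on a much larger lexicon or a trained model.

```python
import re

# Hypothetical keyword lists; a real system would use a far larger lexicon.
POSITIVE = {"great", "fast", "helpful", "love"}
NEGATIVE = {"slow", "rude", "denied", "hate"}


def sentiment_score(text):
    """Return the number of positive keyword hits minus negative hits."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def mood(posts):
    """Average score over a collection of posts; above zero leans positive."""
    scores = [sentiment_score(post) for post in posts]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    sample_posts = [
        "Great service, fast payout!",
        "Claim denied and staff were rude",
    ]
    print(mood(sample_posts))
```

Counting keywords ignores negation and sarcasm, which is one reason the approach is best treated as a coarse indicator of mood rather than a precise measure.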

Data protection, a lurking risk

In addition to the transformative shifts in the use of Big Data mentioned earlier, the same IBM report
found that respondents rated data protection lowest on the list of data priorities; only 11 percent of
respondents identified it as a top-three priority. Given the proliferation of large-scale data breaches in
recent years, organizations that do not take adequate precautions to safeguard data risk losing the
confidence of customers and business partners, as well as incurring legal and remediation fees.
Moreover, business leaders should thoughtfully consider how their organizations use data, to minimize
any potential backlash over perceived privacy infringement. (IBM, 2014, pg. 9)

Skills gap

The Big Data environment requires a skill set that is new to most organisations: people with
deep expertise in statistics and machine learning, as well as managers and analysts who know how to
operate companies by using insights from Big Data.

According to McKinsey (2011), the United States alone faces a shortage of 140,000 to 190,000 people
with deep analytical skills as well as 1.5 million managers and analysts to analyse Big Data and make
decisions based on their findings.

In addressing the skills gap, IBM (2014) suggests organisations should consider the following:

Learn from the best within your organization.
- Tap into the pockets of talent within the organization - those few using predictive or
prescriptive analytics - to expand the skills of others.
- Create a strong internal professional program to arm analysts and executives who already
understand the organization's business fundamentals with analytics. Sharing resources and
knowledge is a cost-effective way to build skills and helps limit the need to seek talent
elsewhere.

Externally supplement skills based on business case.
Not all organizations need a data scientist full time; the same is true for niche analytics skills that may
be used only to solve specific challenges.
- Organizations should invest in the talent and skills they need to solve the majority of their
analytics demands.
- Consider vendors to supplement critical niche skills that are hard to find and expensive to
employ.

(IBM, 2014, pg. 15)

Big Data technologies

Apache Hadoop is the starting point for most organizations wanting to take the plunge into Big Data
analysis.

The Hadoop ecosystem

In their book, Big Data Beyond the Hype, Zikopoulos, deRoos, Bienko, Buglio and Andrews (2014)
classify Hadoop as an ecosystem of software packages that provides a computing framework. These
include MapReduce, which leverages a K/V (key/value) processing framework (don't confuse that with
a K/V database); a file system (HDFS); and many other software packages that support everything from
importing and exporting data (Sqoop) to storing transactional data (HBase), orchestration (Avro and
ZooKeeper), and more.

When you hear that someone is running a Hadoop cluster, it's likely to mean MapReduce (or some
other framework like Spark) running on HDFS, but others will be using HBase (which also runs on
HDFS). Vendors in this space include IBM (with BigInsights for Hadoop), Cloudera, Hortonworks, MapR,
and Pivotal. On the other hand, NoSQL refers to non-RDBMS SQL database solutions such as HBase,
Cassandra, MongoDB, Riak, and CouchDB, among others.

(Zikopoulos, deRoos, Bienko, Buglio, Andrews, 2014, pg. 38)

Key components of many Big Data environments:

MapReduce
MapReduce is a system for parallel processing of large data sets.

IBM (2015) offers an analogy: you can think of map and reduce tasks as the way a census
was conducted in Roman times, where the census bureau would dispatch its people to each city in the
empire. Each census taker in each city would be tasked to count the number of people in that city and
then return their results to the capital city. At the capital, the results from each city would be reduced
to a single count (sum of all cities) to determine the overall population of the empire. This mapping of
people to cities, in parallel, and then combining the results (reducing) is much more efficient than
sending a single person to count every person in the empire in a serial fashion. (IBM, 2015)
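
The census analogy can be sketched in a few lines of Python. This is a single-machine illustration of the map and reduce idea, not Hadoop code: the city names and household sizes are made up, and a thread pool stands in for the census takers working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical census records: for each city, the number of people per household.
CITIES = {
    "Roma": [4, 3, 5, 2],
    "Ostia": [2, 2, 6],
    "Capua": [3, 1],
}


def map_city(item):
    """Map step: one census taker counts the people in one city."""
    city, households = item
    return city, sum(households)


def reduce_counts(total, city_count):
    """Reduce step: the capital adds one city's count to the running total."""
    _city, count = city_count
    return total + count


def census(cities):
    # The map tasks are independent of each other, so they can run in parallel.
    with ThreadPoolExecutor() as pool:
        per_city = list(pool.map(map_city, cities.items()))
    # The reduce step combines the per-city results into a single figure.
    return reduce(reduce_counts, per_city, 0)


if __name__ == "__main__":
    print(census(CITIES))  # 28
```

The point of the split is that `map_city` needs only one city's data, so the work can be spread across as many workers as there are cities, while `reduce_counts` only ever handles the small per-city summaries.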

Hadoop
MapReduce is the heart of Hadoop. Hadoop is an open source software stack that runs on a cluster of
machines. Hadoop provides distributed storage and distributed processing for very large data sets.

NoSQL
NoSQL is a database environment. Using the definition from Planet Cassandra (2015), a NoSQL
database environment is, simply put, a non-relational and largely distributed database system that
enables rapid, ad-hoc organization and analysis of extremely high-volume, disparate data types.
NoSQL databases were developed in response to the sheer volume of data being generated, stored
and analyzed by modern users (user-generated data) and their applications (machine-generated data).
(Planet Cassandra, 2015)

Spark
What is Spark and what does it mean for Hadoop?

IBM (2014) refers to Spark as an open source engine for fast, large-scale data processing that can be
used with Hadoop, boasting speeds up to 100 times faster than Hadoop MapReduce in memory, or 10
times faster on disk. As with the early enthusiasm around Hadoop, Spark should not be thought of as
a singular platform for analytics, as it can be used with existing investments for the widest variety of
data types and analytics workloads. (IBM, 2014)

Figure 3: Example of a Big Data environment

Figure 3. Application Enrichment with Hadoop. Copyright 2013 by Hortonworks Inc. Reprinted with permission.

The impact of Hadoop

According to IBM (2015), Hadoop changes the economics and dynamics of large-scale computing by
enabling a solution that is:

- Scalable: Add new nodes as needed without changing data formats, how data is loaded, how
jobs are written or the applications on top.
- Cost-effective: Hadoop brings massively parallel computing to commodity servers. The result
is a significant decrease in the cost per terabyte of storage, which in turn makes it affordable
to model all your data.
- Flexible: Hadoop is schema-less, and can absorb any type of data, structured or not, from a
number of sources. Data from multiple sources can be joined and aggregated in arbitrary
ways, enabling deeper analyses than any one system can provide by itself.
- Fault-tolerant: When you lose a node, the system redirects work to another location of the
data and continues processing without missing a beat.

(IBM, 2015, pg. 2)

Hadoop challenges

Hadoop is not without its own set of challenges. According to IBM (2014), four key areas of
Hadoop need to mature in order to drive wider adoption:

1) Performance
2) Reduction of the skills required
3) Data governance
4) Deep integration with existing technologies

(IBM, 2014)

Along similar lines, TDWI Research (2015) found in a recent survey that respondents were struggling
with the following barriers to Hadoop implementation:

Barriers to Hadoop:
- Skills gap
- Weak business support
- Security concerns
- Data management hurdles
- Tool deficiencies
- Containing costs

(TDWI Research, 2015)

According to a study by the International Technology Group, organisations need to be particularly
mindful of the highly skilled programming that most Hadoop environments demand. The study
notes that:

Although the field of players has since expanded to include hundreds of venture capital-funded
start-ups, along with established systems and services vendors and large end users, social media
businesses continue to control Hadoop. Most of the more than one billion lines of code (more than
90 percent, according to some estimates) in the Apache Hadoop stack has to date been contributed by these.

The priorities of this group have inevitably influenced Hadoop evolution. There tends to be an
assumption that Hadoop developers are highly skilled, capable of working with raw open source
code and configuring software components on a case-by-case basis as needs change. Manual coding
is the norm.

Decades of experience have shown that, regardless of which technologies are employed, manual
coding offers lower developer productivity and greater potential for errors than more sophisticated
techniques.

(ITG, 2013, pg. 2)

Big Data in the context of traditional technologies

The Big Data environment has been brought about by the advancement in technology enabling the
processing and storage of the volume, variety, velocity and veracity of data, which is beyond the
capabilities of traditional technology.

Big Data supplements traditional systems

As illustrated in Figure 3, the Big Data environment supports traditional technology, extending
capabilities into areas previously unsupported.

Gartner (2013) suggests that Big Data doesn't replace traditional data and analytics:

...big data technologies are not really replacing incumbents such as business intelligence, relational
database management systems and enterprise data warehouses. Instead, they supplement traditional
information management and analytics. (Gartner, 2013, pg. 13)

Examples of three insurance use cases with Big Data

According to Gartner (2013), Big Data and the associated technology have been shown to provide the
following benefits:

- Detection and prevention of fraud or other security violations
- High ROI
- Little operational disruption

(Gartner, 2013, pg. 5)

Big Data to fight fraud

According to John Standish Consulting (2013), mobilizing Big Data is gaining wider attention in anti-
fraud circles. Insurers are sitting on troves of data, hard and soft. Much is never accessed for fraud-
fighting. Insurers can dramatically increase their anti-fraud assertiveness by insightfully accessing,
analyzing and mobilizing their large volumes of untapped data.

Marshaling analytics and big data with current rules and indicators into a seamless and unified anti-
fraud effort creates an expansive world of possibilities.

- Imagine the ability to search a billion rows of data and derive incisive answers to complex
questions in seconds.
- Imagine being able to comb through huge numbers of claim files quickly.
- Imagine more-quickly linking numerous ring members and entities acting in well-disguised
concert. These suspects likely could not be detected with sole or even primary reliance on
basic methods such as fraud indicators.
- Ultimately, imagine analyzing entire caseloads faster and more completely, thus addressing
the largest fraud problems and cost drivers in any of an insurer's coverage territories.

(Standish, 2013)
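
The idea of linking ring members who act in concert can be illustrated with a small Python sketch that groups claims sharing any identifier, such as a phone number, address or repair garage. This is an assumption-laden illustration rather than the method described by Standish or used by any vendor: the claim records, identifier formats and function name are hypothetical, and the grouping uses a standard union-find (disjoint set) structure.

```python
from collections import defaultdict

# Hypothetical claims: claim id -> identifiers that appear on the claim.
CLAIMS = {
    "C1": {"phone:555-0101", "garage:AutoFix"},
    "C2": {"phone:555-0101", "address:12 Elm St"},
    "C3": {"address:12 Elm St"},
    "C4": {"phone:555-0199"},
}


def link_claims(claims):
    """Group claims that are connected through any chain of shared identifiers."""
    parent = {claim: claim for claim in claims}

    def find(claim):
        # Follow parent links to the group representative, shortening the path.
        while parent[claim] != claim:
            parent[claim] = parent[parent[claim]]
            claim = parent[claim]
        return claim

    first_seen = {}
    for claim, identifiers in claims.items():
        for identifier in identifiers:
            if identifier in first_seen:
                # Same identifier on two claims: merge their groups.
                parent[find(claim)] = find(first_seen[identifier])
            else:
                first_seen[identifier] = claim

    groups = defaultdict(set)
    for claim in claims:
        groups[find(claim)].add(claim)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    print(link_claims(CLAIMS))  # [['C1', 'C2', 'C3'], ['C4']]
```

Claims C1 and C3 share no identifier directly, yet they end up in the same group through C2. Indirect links of this kind are hard to spot with per-claim fraud indicators alone.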

Case study: Fraud at IBC

The Insurance Bureau of Canada (IBC) is the national insurance industry association representing
Canada's home, car and business insurers. Because investigation of cases of suspected automobile
insurance fraud often took several years, the company's investigative services division wanted to
accelerate its process. The IBC worked with IBM to conduct a proof of concept (POC) in Ontario,
Canada that explored new ways to increase the efficiency of fraud identification. The POC showed
how IBM solutions for big data can help identify suspect individuals and flag suspicious claims. IBM
solutions also help users visualize relationships and linkages to increase the accuracy and speed of
discovering potential fraud. In the POC, more than 233,000 claims from six years were analyzed. The
IBM solutions identified more than 2,000 suspected fraudulent claims with a value of CAD41 million.
IBM and the IBC estimate that these solutions could save the Ontario automobile insurance industry
approximately CAD200 million per year.

(IBM, 2012)

Big Data for customer segmentation

Case study: Customer segmentation at Progressive


In July 2012, Progressive Insurance released new findings from an analysis of five billion real-time
driving miles, confirming that driving behaviour has more than twice the predictive power of any other
insurance rating factor. Loss costs for drivers with the highest-risk driving behaviour are approximately
two-and-a-half times the costs for drivers with the lowest-risk behaviour. These results suggest that
car insurance rates could be far more personalized than they are today.

Progressive has also found that 70% of drivers who have signed up for its Snapshot UBI program pay
less for their insurance. The program involves installing a small monitoring device in the car (900,000
drivers have already done this) and driving normally. After the device has collected enough data,
customers receive a personalized rate for their insurance. Progressive is currently expanding access to
Snapshot to all drivers - not just Progressive customers - who can take a free test drive of the
technology and after 30 days find out whether their own driving behaviour can lower the price they
pay for insurance.

The problem with today's less granular systems of customer classification in the property and casualty
insurance market is that the majority of drivers who present a lower risk subsidize the minority of
higher-risk drivers.

(Gartner, 2013, pg. 5)

Big Data for underwriting

Case study: Improving underwriting decisions


A large global property casualty insurance company wanted to accelerate catastrophe risk modelling
in order to improve underwriting decisions and determine when to cap exposures in its portfolio. The
current modelling environment was too slow and unable to handle the large-scale data volumes that
the company wanted to analyze. The goal was to run multiple scenarios and model losses in hours,
but the current environment required up to 16 weeks. As a result, the company conducted analysis
only three or four times per year. A proof of concept demonstrated that the company could improve
performance by 100 times, accelerating query execution from three minutes to less than three
seconds.

The company decided to implement IBM solutions for big data, and can now run multiple catastrophe
risk models every month instead of only three or four times per year. Once data is refreshed, the
company can create what-if scenarios in hours rather than weeks. With a better and faster
understanding of exposures and probable maximum losses, the company can take action sooner to
change loss reserves and optimize its portfolio.

(IBM, 2013, pg. 7)
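
The two performance figures quoted for the proof of concept can be checked against each other with simple arithmetic. The short Python sketch below is only a consistency check on the numbers in the case study; the function names are illustrative.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


def time_after_speedup(old_seconds, factor):
    """Run time that results from applying a given speedup factor."""
    return old_seconds / factor


if __name__ == "__main__":
    three_minutes = 3 * 60  # 180 seconds

    # A 100-fold improvement on a three-minute query gives 1.8 seconds,
    # which is consistent with "less than three seconds".
    print(time_after_speedup(three_minutes, 100))

    # A query taking exactly three seconds would correspond to a 60-fold speedup.
    print(speedup(three_minutes, 3))
```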

Costs associated with typical Big Data implementations

Although a Big Data environment such as that illustrated in Figure 3 can be constructed from open
source software, such as Hadoop and a NoSQL database such as MongoDB, there are still substantial
costs involved. These include:

1) Hardware costs
2) IT and operational costs in setting up a machine cluster and supporting it
3) Cost of personnel to work on the ecosystem

These costs are NOT trivial for the following reasons:

- Dealing with cutting edge technology and finding people who know the technology is
challenging
- The technology introduces a different programming paradigm, frequently requiring additional
training of existing engineering teams
- These technologies are new and still evolving and are not yet mature in the enterprise
ecosystem
- The hardware is server grade and large clusters require resources including network
administration, security administration, system administration etc., as well as data centre
operational costs including electricity, cooling etc.

Infrastructure as a Service (IaaS)

One consideration that can mitigate the cost implications of hardware and support personnel is the
use of a cloud offering. As pointed out by Intel (2015), clouds are already deployed on pools of
server, storage, and networking resources and can scale up or down as needed. Cloud computing
offers a cost-effective way to support Big Data technologies and the advanced analytics applications
that can drive business value.

Diversity Limited (2010) defines Infrastructure as a Service (IaaS) as a way of delivering Cloud
Computing infrastructure (servers, storage, network and operating systems) as an on-demand
service. Rather than purchasing servers, software, datacenter space or network equipment,
organisations instead buy those resources as a fully outsourced service on demand.

Recommended course for Big Data

IBM (2015) recommends that organisations consider the following when embarking on the Big Data
journey:

1. Choose projects with a high potential return on investment, for which data sources are
readily accessible and already in electronic form, and establish clear goals and quantifiable
metrics. There should be a strong business need for making the resulting data easily
accessible to broad user communities.

2. The data architecture should be extensible to allow addition of other data sources, including
streaming data, as needed.

3. As the project continues, create a feedback loop to inform other departments of insights
derived about products, marketing and sales. This helps promote the value of analytics,
builds a culture that focuses on deriving even better information from analytics, and instils a
high level of trust in the data's veracity and completeness.

4. Surround Hadoop with a strong ecosystem of Big Data tools and analytics capabilities. The
richer the portfolio of capabilities in the selected Hadoop solution, the more freedom teams
have to solve problems and advance the organization's insights.

(IBM, 2015, pg. 4)

Recommended Big Data platform

- Utilise an IaaS offering
- Explore the MapR and the IBM BigInsights offerings further.

IBM BigInsights example:

IBM BigInsights is based on 100 percent open source Hadoop. It extends Hadoop with enterprise-
grade technology including administration and integration capabilities, visualization and discovery
tools as well as security, audit history and performance management.

According to IBM, the BigInsights platform offers:

- Increased performance: An average 4 times performance gain over open source Hadoop.
- Usability: BigInsights is optimized for a wide range of roles, including integration developers,
administrators, data scientists, analysts and line-of-business contacts.
- Integrated with IBM Watson Foundations big data platform: BigInsights comes bundled with
search and streaming analytics capabilities.
- Analytics: Built-in Hadoop analytics capabilities for machine data, social data, text and Big R
enable you to locate actionable insights from data in the Hadoop cluster rather than having
to move the data around.

Figure 4: Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications - Averages for All Installations

Figure 4. Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications. Copyright 2013 by the International Technology Group. Reprinted with permission.

Conclusion

Big Data is having a substantive impact on the P&C insurance industry. Insurers are combining Big Data
and analytics to overcome many of the challenges confronting the industry, and to support new
capabilities. Although implementing a Big Data platform is not without its challenges, through careful
consideration the organisation should be able to generate an appreciable return on its Big Data and
analytics initiative. The availability of IaaS platforms for Big Data reduces many of the initial risks that
would traditionally be associated with such projects. In addition, the Big Data offerings from MapR
Technologies and IBM appear, based on initial research, to be strong candidates for evaluation.

References

Diversity Limited. (2010). Moving your infrastructure to the cloud. [pdf]. Retrieved from
http://diversity.net.nz/wp-content/uploads/2011/01/Moving-to-the-Clouds.pdf

Economist Intelligence Unit. (2013). Fostering a data-driven culture. [pdf].
Retrieved from
http://www.economistinsights.com/search/node/sites%20default%20files%20downloads%20Tableau%20DataCulture%20130219%20pdf

Gartner. (2013). Characteristics of the traditional versus the big data approach. [Table]. Retrieved from Gartner. (2013).
Big data business benefits are hampered by 'culture clash'. [pdf]. Retrieved from
https://www.gartner.com/doc/2588415

Gartner. (2013). Use big data to solve fraud and security problems. [pdf]. Retrieved from
https://www.gartner.com/doc/2397715

Gartner. (2013). How IT should deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116

Gartner. (2013). Consistent view of the customer for big data. [Diagram]. Retrieved from Gartner. (2013). How IT should
deepen big data analysis to support customer-centricity. [pdf].
Retrieved from https://www.gartner.com/doc/2531116

Gartner. (2014). Agenda overview for p&c and life insurance. [pdf].
Retrieved from https://www.gartner.com/doc/2643327

HBR. (2012). Big Data: The management revolution. [pdf].
Retrieved from https://hbr.org/2012/10/big-data-the-management-revolution/ar

Hortonworks. (2013). Application enrichment with hadoop. [Diagram]. Retrieved from Hortonworks. (2013).
Apache Hadoop patterns of use. [pdf]. Retrieved from
http://hortonworks.com/blog/apache-hadoop-patterns-of-use-refine-enrich-and-explore/

IBM. (2012). Four dimensions of big data. [Diagram]. Retrieved from IBM. (2012). Analytics: the real-world use of big
data. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF

IBM. (2012). Analytics: the real-world use of big data. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/gbe03519usen/GBE03519USEN.PDF

IBM. (2012). Insurance bureau of Canada. [pdf]. Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=AB&infotype=PM&appname=SWGE_IM_IM_USEN&htmlfid=IMC14775USEN&attachment=IMC14775USEN.PDF

IBM. (2013). A better assessment of the data around and connected to a single piece of information enables a more
complete, in-context understanding. [Diagram]. Retrieved from IBM. (2013). The future of insurance. [pdf].
Retrieved from http://public.dhe.ibm.com/common/ssi/ecm/en/imw14671usen/IMW14671USEN.PDF

IBM. (2013). Harnessing the power of big data and analytics for insurance. [pdf]. Retrieved from
http://public.dhe.ibm.com/common/ssi/ecm/en/imw14672usen/IMW14672USEN.PDF

IBM. (2014). Analytics: The speed advantage. [pdf].
Retrieved from http://www-935.ibm.com/services/us/gbs/thoughtleadership/2014analytics/

IBM. (2014). IBM expands hadoop commitment with support for spark. [blog].
Retrieved from http://www.ibmbigdatahub.com/blog/ibm-expands-hadoop-commitment-support-spark

IBM. (2015). Analytics: What is mapreduce. [web page].
Retrieved from http://www-01.ibm.com/software/data/infosphere/hadoop/mapreduce/

IBM. (2015). BigInsights for apache hadoop quick start edition. [pdf].
Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BR&htmlfid=IMB14164USEN#loaded

IBM. (2015). Making the case for hadoop and big data in the enterprise. [pdf].
Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=PM&subtype=BK&htmlfid=IMM14161USEN#loaded

ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww

ITG. (2013). Three-year Costs for Use of IBM InfoSphere BigInsights and Open Source Apache
Hadoop for Major Applications. [Diagram].
Retrieved from ITG. (2013). Business case for enterprise big data deployments. [pdf].
Retrieved from
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=IME14028USEN&appname=skmwww

Intel. (2015). Big data cloud technology. [pdf].
Retrieved from
http://www.intel.co.za/content/dam/www/public/us/en/documents/product-briefs/big-data-cloud-technologies-brief.pdf

McKinsey. (2011). Big data: The next frontier for innovation, competition, and productivity. [pdf].
Retrieved from
http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation

Ordnance Survey. (2013). The big data rush: how data analytics can yield underwriting gold. [pdf].
Retrieved from http://events.marketforce.eu.com/big-data-underwriting-report-email

Planet Cassandra. (2015). Nosql databases defined and explained. [web page].
Retrieved from http://www.planetcassandra.org/what-is-nosql/

Standish, J. (2013). Speed to detection - strategically leveraging advanced analytics for insurance fraud. [blog].
Retrieved from
http://www.johnstandishconsultinggroup.com/JohnStandishConsultingGroup.com/Blog/Entries/2013/8/9_Speed_to_Detection_-_Strategically_Leveraging_Advanced_Analytics_for_Insurance_Fraud.html

Strategy Meets Action. (2012). Data and analytics in insurance. [pdf].
Retrieved from https://www.acord.org/library/Documents/2012_SMA_Data_Analytics.pdf

Strategy Meets Action. (2013). Data and analytics in insurance: p&c plans and priorities for 2013 and beyond. [pdf].
Retrieved from
https://strategymeetsaction.com/data-and-analytics-in-insurance-p-and-c-plans-and-priorities-for-2013-and-beyond/

Zikopoulos, P., deRoos, D., Bienko, C., Buglio, R., Andrews, M. (2014). Big data beyond the hype. [pdf].
Retrieved from
https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/big_data_beyond_the_hype_a_guide_to_conversations_for_today_s_data_center?lang=en
