
Big Data

Assurance Services Unit GTM


June 29, 2012

Copyright 2012 Tata Consultancy Services Limited

Data Data Data Data Data Everywhere

Source: Kelly Hodgkins

What is Big Data?

Big Data is about the growing challenge that organizations face as they deal with large and fast-growing sources of data or information that also present a complex range of analysis and use problems. These can include:
Having a computing infrastructure that can ingest, validate, and analyze high volumes (size and/or rate) of data
Assessing mixed data (structured and unstructured) from multiple sources
Dealing with unpredictable content with no apparent structure
Enabling real-time or near-real-time collection, analysis, and answers

Understanding Big Data

US Retail Bank
Analyzing billions of records to get a better understanding of the impact of credit and operational risk of products across different lines of business like home loans, insurance and online banking

US Investment Bank

Consolidating 30,000 databases and 15,000 applications to create a common data platform to bring all its customer data together in one place

US Investment Bank

Implementing a recommendation engine for financial advisors to increase revenue and assets while helping clients build a profitable portfolio

US Credit Card Firm

Working on Fraud and Risk Analysis of Social Market Data, Sentiment Analysis, Call Center KPI Analysis and Application Server Log Analysis

Swiss Bank

Surveillance and anti-money laundering


Big Data Market Size

Source: IDC

Total 2011 Big Data Revenue by Vendor

Vendor                      Big Data Revenue ($ Mn)   Total Revenue ($ Mn)   Big Data Revenue as % of Total
IBM                         $1,100                    $106,000               1%
CSC                         $160                      $16,200                1%
Accenture                   $155                      $21,900                0%
Capgemini                   $111                      $12,100                1%
Atos S.A.                   $75                       $7,400                 1%
Tata Consultancy Services
Logica
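The last column is simply Big Data revenue divided by total revenue. A minimal sketch of that arithmetic follows, assuming the five figures above pair with the first five vendor names in the order listed (the pairing is inferred from position, not stated in the source). The function name `big_data_share` is illustrative only.

```python
# Figures in $ Mn, copied from the vendor figures above.
# ASSUMPTION: each figure is paired with a vendor by its position in the list.
revenue = [
    ("IBM", 1100, 106000),
    ("CSC", 160, 16200),
    ("Accenture", 155, 21900),
    ("Capgemini", 111, 12100),
    ("Atos S.A.", 75, 7400),
]


def big_data_share(big_data_mn, total_mn):
    """Return Big Data revenue as a percentage of total revenue."""
    return 100.0 * big_data_mn / total_mn


# Map each vendor to its share, keeping the order of the list above.
shares = {vendor: big_data_share(bd, total) for vendor, bd, total in revenue}

for vendor, share in shares.items():
    # Two decimal places show differences that whole-percent figures hide.
    print(f"{vendor:<12} {share:.2f}%")
```

Under that pairing, every share falls between roughly 0.7% and 1.04%, so the whole-percent figures on the slide depend heavily on how each value was rounded.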







2011 Big Data Pure-Play Market Share

[Pie chart: market share by vendor. Vendors shown: Vertica, Aster Data, Splunk, Greenplum, Cloudera, Think Big Analytics, Digital Reasoning, Datameer, Hortonworks, HPCC Systems, Karmasphere, Other]

Source: Wikibon

Big Data Market Segments

Hardware: Storage, Servers, Networking
Vendors include: Dell, HP, Arista, IBM, Cisco, EMC, NetApp

Big Data Distributions: Open source Hadoop distributions, Enterprise Hadoop distributions, Non-Hadoop Big Data frameworks
Vendors include: Apache, Cloudera, Hortonworks, IBM, EMC, MapR, LexisNexis

Data Management Components: Distributed file stores, NoSQL databases, Hadoop-optimized data warehousing, Data integration, Data quality and governance
Vendors include: Apache, DataStax, Pervasive Software, Couchbase, IBM, Oracle, Informatica, Syncsort, Talend

Analytics Layer: Analytics application development platforms, Advanced analytics applications
Vendors include: Apache, Karmasphere, Hadapt, Attivio, 1010data, EMC, SAS Institute, Digital Reasoning, Revolution Analytics

Applications Layer: Data visualization tools, Business intelligence applications
Vendors include: Datameer, ClickFox, Platfora, Tableau Software, Tresata, IBM, SAP, MicroStrategy, Pentaho, QlikTech, Jaspersoft

Services: Consulting, Training, Technical support, Software maintenance, Hardware maintenance, Hosting/Big Data as a Service/Cloud
Vendors include: Tresata, Tidemark, Think Big Analytics, Amazon Web Services, Accenture, Cloudera, Hortonworks

Scope of Testing Opportunities in Big Data

Database Transformation Testing, Streaming and CEP Engine Testing, Analytical Sandbox Testing, Cloud Testing

Big Data Distribution: Reliability Testing, Compatibility Testing, Data Integration Testing, Capacity Testing

Data Management Component: Data Quality/Validation Testing, Performance Testing, Cloud Testing, Functional Testing

Industry Usage
Life Science: Genome Analysis, Develop drug models

Health Care: Patient behavior study to treat chronic diseases, Adverse drug effect analysis

Banking & Financial Services

Risk modeling, Location Intelligence, Catastrophe Modeling, Claims Fraud Detection

Retail: Contextual/targeted ad marketing, Point of Sale analysis, Supply chain optimization, Customer churn analysis

Stock Exchange: Processing & surveillance of trade data

Credit Card: Fraud Detection

Predictions of High-Energy Physics, Real-time demand forecasting


Business Drivers
What are your organization's drivers for using big data technologies and approaches?


Challenges and Opportunities

What is your organization's biggest challenge when it comes to Big Data?


Big Data Platforms & TCS Case Studies

Hadoop and MapReduce, EMC (Greenplum), IBM, HP Autonomy, HP Vertica, Oracle

TCS Case Studies

Passport e-Seva, CBA Australia, Super Valu Retail Chain, JPMC, HP Vertica, Oracle


Five Keys to Succeeding With Big Data

1) Understand the possibilities
Combining hundreds of data elements and terabytes of data doesn't automatically produce results. Experts must understand how data can be used and how to map it effectively.

2) Tap into IT systems that manage data and provide deep analysis
There's typically no single tool or approach that addresses an organization's needs. Enterprises require a portfolio of products, tools and capabilities from different vendors.

3) Build workflows and policies that facilitate the use of big data
An organization must define ownership and how different groups can access and use data. It must also develop ways to transform findings and results into actual programs.

4) Focus on security and privacy concerns
Big data creates remarkable opportunities, but it also creates risks. As different groups within an organization use and analyze data, there's an urgent need to ensure that the proper protections are in place.

5) Find the needed talent to put big data to work
McKinsey & Co. states that a talent shortage is looming in the data-analytics field that could reach 190,000 people by 2018.


Thank You
For any feedback/comment/ clarification, please contact,