
For Enterprise Architecture Professionals

The Forrester Wave: Big Data Fabric, Q4 2016


A Critical Platform For Enterprises To Succeed With Big Data Initiatives
by Noel Yuhanna
November 22, 2016

Why Read This Report

Big data initiatives are on the rise as organizations focus on rolling out actionable insights.
Big data fabric offers enterprise architecture (EA) pros a platform that helps them discover,
prepare, curate, orchestrate, and integrate data across sources by leveraging big data
technologies in an automated manner. Forrester's 26-criteria evaluation of 11 big data fabric
solutions will help EA pros understand the available choices and recommend the best for their
organization.

Key Takeaways

Informatica, IBM, Oracle, And Talend Lead The Pack
Forrester's research uncovered a market in which Informatica, IBM, Oracle, and Talend lead the
pack. Denodo Technologies, Global IDs, Paxata, SAP, Syncsort, and Trifacta offer competitive
options. Waterline Data lags behind. This report details our findings about how each vendor
fulfills our criteria and where they stand in relation to each other to help EA pros.

EA Pros Look At Big Data Fabric Solutions
This market is growing largely because EA pros see big data fabric as a strategic platform to
support their next-generation applications and insights. When selecting a solution, enterprises
should look for scale-out architecture, security, automation, and cost as the key factors.

Tooling And Services Can Be A Dealmaker
While all of the evaluated vendors offer compelling value and features, some offer a broader
range of tooling and services that can accelerate deployments.

forrester.com

For Enterprise Architecture Professionals

The Forrester Wave: Big Data Fabric, Q4 2016


A Critical Platform For Enterprises To Succeed With Big Data Initiatives
by Noel Yuhanna
with Gene Leganza and Shreyas Warrier
November 22, 2016

Table Of Contents
The Big Data Fabric Market Is Immature But Will Grow Rapidly
Big Data Fabric Evaluation Overview
  Evaluation Criteria: Current Offering, Strategy, And Market Presence
  Forrester's Evaluation Assesses The Capabilities Of 11 Big Data Fabric Vendor Offerings
Larger Providers Have An Edge With A Broader Range Of Functionality
Vendor Profiles
  Leaders
  Strong Performers
  Contenders
Supplemental Material

Notes & Resources
Forrester conducted product evaluations and interviews with 11 vendor companies: Denodo
Technologies, Global IDs, IBM, Informatica, Oracle, Paxata, SAP, Syncsort, Talend, Trifacta,
and Waterline Data.

Related Research Documents
Big Data Fabric Drives Innovation And Growth
The Forrester Wave: Big Data NoSQL, Q3 2016
TechRadar: Big Data, Q1 2016

Forrester Research, Inc., 60 Acorn Park Drive, Cambridge, MA 02140 USA


+1 617-613-6000 | Fax: +1 617-613-5000 | forrester.com
© 2016 Forrester Research, Inc. Opinions reflect judgment at the time and are subject to change. Forrester,
Technographics, Forrester Wave, RoleView, TechRadar, and Total Economic Impact are trademarks of Forrester
Research, Inc. All other trademarks are the property of their respective companies. Unauthorized copying or
distributing is a violation of copyright law. Citations@forrester.com or +1 866-367-7378

The Big Data Fabric Market Is Immature But Will Grow Rapidly
Big data is not an option; it has become a necessity for supporting next-generation insights.
Enterprises of all types and sizes are embracing big data, but the gap between business expectations
and the challenges of supporting big data technology (such as Hadoop) has become the primary
motivation to innovate with big data fabric. The collection of technologies enables enterprise architects
to integrate, secure, and govern various data sources through automation, simplification, and self-service capabilities. It reduces complexity and hides heterogeneity by embodying an abstracted
model of the data processing pipeline that reflects business requirements rather than the complexity of
the underlying systems.
Today, big data fabric is accelerating the delivery of insights by automating key processes for increased
agility while giving business users more autonomy in the data preparation process. Enterprises use it
to support many use cases, such as enabling 360-degree and multidimensional views of the customer,
internet-of-things (IoT) and real-time analytics, offloading data warehouses, fraud detection, integrated
analytics, and risk analytics. Enterprises are using big data fabric primarily because it:
Delivers new actionable insights with minimal effort. Big data fabric offers the ability to
aggregate, transform, cleanse, and integrate data from multiple big data sources, which can
then be presented in dashboards, reporting tools, and web applications. It leverages advanced
technologies such as machine learning, Apache Spark, Hadoop, Kafka, Storm, Ranger, and others
to deliver insights with zero to minimal coding.
Secures big data end-to-end. Big data fabric enables centralized data access and control, and
it enforces a stricter level of data-at-rest and data-in-motion security measures than traditional
approaches. It can remediate security risks with masking, auditing, and encryption across the
fabric. Today, large banks and insurance companies rely on big data fabric to ensure the protection
of critical siloed data.
Enables real-time integrated data across the business. Big data fabric enables data and
metadata sharing between peers, employees, partners, and customers. It allows any application,
process, dashboard, tool, or user to access any integrated data, regardless of where the data is
physically or logically located and regardless of the data format. Big data fabric offers consistent,
timely, and trusted data for internal and external users, creating a go-to place for integrated data
like Google does for searches.
Delivers a self-service data platform for business users. Until recently, data platforms were
mostly used by developers, architects, and data scientists, largely because of the platforms'
complexity and limited use cases. Big data fabric emphasizes self-service data preparation,
curation, orchestration, and integration services that nontechnical personnel can leverage. It
enables business users to blend, wrangle, and mash up their own data sets and share them among
peers and other groups for improved decision making.

Big Data Fabric Evaluation Overview


To assess the state of the market and see how the vendors stack up against each other, Forrester
evaluated the strengths and weaknesses of 11 top commercial big data fabric vendors: Denodo
Technologies, Global IDs, IBM, Informatica, Oracle, Paxata, SAP, Syncsort, Talend, Trifacta, and
Waterline Data.
Evaluation Criteria: Current Offering, Strategy, And Market Presence
After examining past research, user requirements, and vendor interviews, we developed a
comprehensive set of 26 evaluation criteria, which we grouped into three high-level buckets:
Current offering. We evaluated each product's application development, streaming, loading,
data consistency, transactional support, data security, big data support, multimodel, deployment
architecture, scalability, performance, in-memory, high availability, and other features and
functionality to establish the capabilities of the vendor's current offering. All products evaluated
must have been publicly available by August 1, 2016.
Strategy. We reviewed each vendor's strategy to assess its ability to compete and grow in
the commercial big data fabric market. Key criteria include Forrester's level of confidence in
the vendor's ability to execute on its stated strategy as well as support for current and future
customers. Forrester also reviewed each vendor's product road map to assess how it will affect the
vendor's competitive position compared with the other vendors in this evaluation.
Market presence. To determine each vendor's market presence, we evaluated overall big data
fabric product revenue, install base, market awareness, partnerships, and reach.
Forrester's Evaluation Assesses The Capabilities Of 11 Big Data Fabric Vendor Offerings
Each of the 11 vendors (Denodo Technologies, Global IDs, IBM, Informatica, Oracle, Paxata, SAP,
Syncsort, Talend, Trifacta, and Waterline Data) that Forrester included in this evaluation has (see Figure 1):
A comprehensive big data fabric offering. The vendors included in this evaluation must provide
big data fabric functions as defined in the Forrester report Big Data Fabric Drives Innovation
And Growth, published in March 2016.1 These include functions such as access, discovery,
transformation, integration, security, governance, lineage, and orchestration of big data sources to
support big data workloads and use cases. The solution must be able to process and curate large
amounts of structured, semistructured, and unstructured data stored in big data platforms such as
Apache Hadoop, MPP EDWs, NoSQL, Apache Spark, in-memory technologies, and other related
commercial and open source platforms, including Apache projects.2 In addition, it must leverage
big data technologies such as Spark, Hadoop, and in-memory as a compute and storage layer to
assist the big data fabric with aggregation, transformation, and curation processing.

A standalone big data fabric solution. The vendors included in this evaluation provide a software
solution that organizations can implement independent of Hadoop distribution and the analytics/
visualization tool. The solution should not be technologically tied or bundled to any particular
application, product, or solution. The vendor must market the big data fabric as a standalone
product or solution. The solution can run on cloud and/or on-premises platforms.
Big data use cases. The solution must be able to support big data use cases such as customer
churn, the IoT, 360-degree views of customers and the business, advanced analytics, real-time
analytics, and others.
A referenceable install base. There should be 10 or more unique enterprise paying customers
using the big data fabric product that span more than one major geographical region. Each vendor
also provided at least two customer references who Forrester interviewed.
A publicly available product. The participating vendors must have actively marketed a big data
fabric product as of August 1, 2016.
Customer interest. Forrester included only those vendors that customers mentioned during
Forrester inquiry calls during the past 12 months related to big data fabric topics.
Client inquiries and/or technologies that put the vendor on Forrester's radar. Forrester
clients often discuss the vendors and products through inquiries and interviews; alternatively, the
vendor may, in Forrester's judgment, warrant inclusion or exclusion in this evaluation because of
technology trends and market presence.

FIGURE 1 Evaluated Vendors: Product Information

Vendor                Product evaluated                                      Product version evaluated
Denodo Technologies   Denodo Platform                                        6.0
Global IDs            Data Ecosystem Management Suite                        9
IBM                   IBM InfoSphere Information Server Enterprise Edition   11.5
Informatica           Informatica Platform                                   10.1
Oracle                GoldenGate for Big Data                                12.2.0.1.1
                      Oracle Data Integrator for Big Data                    12.2.1.2.0
                      Oracle Big Data Preparation Cloud Service
                      Oracle Big Data Discovery Cloud Service
                      Oracle Stream Analytics                                12.2.1.1
                      Oracle Enterprise Metadata Management                  12.2.1.1
                      Big Data Cloud Service
                      Big Data SQL Cloud Service
Paxata                Adaptive Information Platform
SAP                   SAP Hana Vora                                          1.3
                      SAP Hana                                               SP 12
                      SAP BW                                                 7
                      SAP IQ                                                 16
                      SAP EIM (multiple products)
Syncsort              DMX-h and Ironstream                                   9 and 1.4
Talend                Talend Data Fabric                                     6.2
Trifacta              Trifacta Wrangler Enterprise                           Fall '16 release
Waterline Data        Waterline Data Catalog

FIGURE 1 Evaluated Vendors: Product Information (Cont.)

Vendor inclusion criteria
Forrester included providers that met the inclusion criteria listed above: a comprehensive big
data fabric offering, a standalone big data fabric solution, big data use cases, a referenceable
install base, a publicly available product, customer interest, and client inquiries and/or
technologies that put the vendor on Forrester's radar.
Forrester reserves the right to include or exclude any vendor.

Larger Providers Have An Edge With A Broader Range Of Functionality


Forrester's evaluation of big data fabric vendors uncovered a market with four Leaders, six Strong
Performers, and one Contender (see Figure 2):

Informatica, IBM, Oracle, and Talend are Leaders. These vendors offer more comprehensive,
scalable platforms with broader use-case support. Each has a sweet spot enabling it to compete
vigorously in the market. They have had strong offerings in the traditional data integration space
and have been quick to expand their platform to leverage big data technologies. EA pros often
shortlist Informatica for its integration capabilities, but over the past two years it has extended
its platform to support a broader big data fabric that appeals to many enterprises. IBM's strong
data and information management offering, including its broad range of database, Hadoop, and
integration services, helps deliver the big data fabric. Oracle offers a scalable fabric software
and appliance. It continues to expand its existing data platform to support big data use cases,
leveraging its high-performance Hadoop loader, open source integration, and big data appliance.
Talend offers a big data fabric that delivers high scale and performance and supports various big
data use cases.
Denodo, Global IDs, Paxata, SAP, Syncsort, and Trifacta are Strong Performers. Strong
Performers can still be a strong choice, especially if price/performance, broader big-data-as-a-service,
integration-as-a-service, and big data appliances are important. Denodo's mature data
virtualization technology broadens its coverage to support big data fabric use cases. Global
IDs leverages its core expertise in data discovery, governance, metadata, and data quality to
support various use cases. Paxata's platform has been expanding. It is built on Apache Spark
and optimized to run in Hadoop, leveraging distributed computing and machine learning. SAP's
Hana Vora supports big data initiatives by combining in-memory, Spark, Hadoop, and integration
services in a unique platform. Syncsort's solution supports new big data use cases by leveraging
technologies to collect, integrate, sort, and distribute data. Trifacta's data prep software continues
to expand to support big data fabric, leveraging machine learning, sophisticated transformations,
discovery, and enrichment.
Waterline Data is a Contender. Waterline provides a niche solution focused on the enterprise data
catalog space, but it is not a complete data fabric solution. Customers often use Waterline Data
with other vendor solutions, such as data prep software to support big data fabric deployments.

FIGURE 2 Forrester Wave: Big Data Fabric, Q4 '16

[Chart: the 11 vendors plotted across the Challengers, Contenders, Strong Performers, and Leaders
bands. Axes: Current offering (weak to strong) and Strategy (weak to strong). Marker size: Market
presence. Legend: Full vendor participation.]

Go to Forrester.com to download the Forrester Wave tool for more detailed product evaluations,
feature comparisons, and customizable rankings.

FIGURE 2 Forrester Wave: Big Data Fabric, Q4 '16 (Cont.)

Criterion               Forrester's weighting  Denodo Technologies  Global IDs  IBM   Informatica  Oracle  Paxata  SAP   Syncsort  Talend  Trifacta  Waterline Data
Current offering        50%                    3.54                 2.38        4.08  4.53         3.57    4.00    2.93  3.19      4.13    3.72      2.15
  Data ingestion        10%                    3.00                 2.00        3.00  4.00         3.00    3.00    3.00  4.00      4.00    2.00      1.00
  Data orchestration    15%                    4.00                 2.00        3.00  4.00         4.00    4.00    3.00  3.00      4.00    4.00      0.00
  Data discovery        15%                    4.00                 2.00        3.00  5.00         3.00    4.00    4.00  2.00      3.00    4.00      2.50
  Data management       20%                    4.20                 2.60        5.00  5.00         4.60    4.60    3.00  3.80      5.00    4.20      3.00
  Fabric data access    20%                    3.00                 2.80        4.40  4.40         3.00    4.40    2.40  2.40      4.40    4.40      3.60
  Fabric management     20%                    3.00                 2.50        5.00  4.50         3.50    3.50    2.50  4.00      4.00    3.00      1.75
Strategy                50%                    3.30                 3.00        4.05  4.05         3.70    3.00    3.60  3.00      3.65    3.00      2.60
  Ability to execute    35%                    3.00                 3.00        4.00  4.00         4.00    3.00    3.00  3.00      4.00    3.00      2.00
  Road map              30%                    3.00                 3.00        4.00  4.00         3.00    3.00    4.00  3.00      4.00    3.00      3.00
  Vision                30%                    4.00                 3.00        4.00  4.00         4.00    3.00    4.00  3.00      3.00    3.00      3.00
  Professional services 5%                     3.00                 3.00        5.00  5.00         4.00    3.00    3.00  3.00      3.00    3.00      2.00
Market presence         0%                     2.50                 1.65        4.00  4.45         3.65    2.40    3.00  2.70      3.65    2.85      1.65
  Product revenue       35%                    2.00                 1.00        4.00  4.00         4.00    2.00    3.00  2.00      3.00    2.00      1.00
  Customer base         30%                    3.00                 2.00        4.00  5.00         4.00    2.00    3.00  3.00      5.00    3.00      2.00
  Market awareness      20%                    3.00                 2.00        4.00  4.00         3.00    4.00    3.00  4.00      4.00    4.00      2.00
  Partner ecosystem     15%                    2.00                 2.00        4.00  5.00         3.00    2.00    3.00  2.00      2.00    3.00      2.00

All scores are based on a scale of 0 (weak) to 5 (strong).
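
The category rows in the scorecard appear to be weighted sums of the subcriteria beneath them. The short Python sketch below illustrates that arithmetic for the Current offering category and three vendors. It assumes the score columns follow the alphabetical vendor order used elsewhere in this report; the names `weighted_score`, `WEIGHTS`, `SUBSCORES`, and `PUBLISHED` are illustrative and are not part of the Forrester Wave tool.

```python
# Weights of the six "Current offering" subcriteria, as printed in Figure 2.
WEIGHTS = {
    "Data ingestion": 0.10,
    "Data orchestration": 0.15,
    "Data discovery": 0.15,
    "Data management": 0.20,
    "Fabric data access": 0.20,
    "Fabric management": 0.20,
}

# Subcriterion scores for three vendors, in the same order as WEIGHTS.
# Assumption: the score columns follow the alphabetical vendor order.
SUBSCORES = {
    "Denodo Technologies": [3.00, 4.00, 4.00, 4.20, 3.00, 3.00],
    "Informatica": [4.00, 4.00, 5.00, 5.00, 4.40, 4.50],
    "Waterline Data": [1.00, 0.00, 2.50, 3.00, 3.60, 1.75],
}

# Category scores printed in the "Current offering" row for the same vendors.
PUBLISHED = {
    "Denodo Technologies": 3.54,
    "Informatica": 4.53,
    "Waterline Data": 2.15,
}


def weighted_score(scores, weights):
    """Return the weighted sum of scores; weights are expected to sum to 1."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(s * w for s, w in zip(scores, weights))


for vendor, scores in SUBSCORES.items():
    computed = weighted_score(scores, list(WEIGHTS.values()))
    print(f"{vendor}: computed {computed:.3f}, published {PUBLISHED[vendor]:.2f}")
```

Under these assumptions, each computed value agrees with the published category score to within rounding.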

Vendor Profiles
Whether they are a Leader, Strong Performer, or Contender, every big data fabric vendor in this
Forrester Wave offers a credible solution to support new and emerging use cases. This evaluation of
the big data fabric market is intended to be a starting point only. We encourage clients to view the
detailed product evaluations and adapt the criteria weightings to fit their individual needs through
the Forrester Wave Excel-based vendor comparison tool. Clients can also schedule an inquiry to
have a conversation about the market and specific vendor products to discuss specific business and
technology requirements.
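
As a rough illustration of adapting the weightings, the Python sketch below re-ranks vendors from the three category scores in Figure 2 under a chosen set of weights. It is not the Forrester Wave tool: collapsing the categories into a single combined score is a simplifying assumption, the function name `rank_vendors` and the example custom weights are hypothetical, and the vendor-to-column mapping assumes alphabetical order.

```python
# Category scores from Figure 2: (current offering, strategy, market presence).
# Assumption: the score columns follow the alphabetical vendor order.
CATEGORY_SCORES = {
    "Denodo Technologies": (3.54, 3.30, 2.50),
    "Global IDs": (2.38, 3.00, 1.65),
    "IBM": (4.08, 4.05, 4.00),
    "Informatica": (4.53, 4.05, 4.45),
    "Oracle": (3.57, 3.70, 3.65),
    "Paxata": (4.00, 3.00, 2.40),
    "SAP": (2.93, 3.60, 3.00),
    "Syncsort": (3.19, 3.00, 2.70),
    "Talend": (4.13, 3.65, 3.65),
    "Trifacta": (3.72, 3.00, 2.85),
    "Waterline Data": (2.15, 2.60, 1.65),
}

# Category weightings printed in Figure 2: 50% / 50% / 0%.
PUBLISHED_WEIGHTS = (0.5, 0.5, 0.0)


def rank_vendors(weights):
    """Return (vendor, combined score) pairs sorted from highest to lowest."""
    if len(weights) != 3 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("expected three weights that sum to 1")
    combined = {
        vendor: sum(w * s for w, s in zip(weights, scores))
        for vendor, scores in CATEGORY_SCORES.items()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical custom weighting that also counts market presence.
custom_weights = (0.4, 0.3, 0.3)

for label, weights in (("published", PUBLISHED_WEIGHTS), ("custom", custom_weights)):
    print(label)
    for vendor, score in rank_vendors(weights):
        print(f"  {vendor}: {score:.2f}")
```

Changing `custom_weights` shows how sensitive the ordering is to the priorities an enterprise assigns to each category.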
Leaders
IBM differentiates with its broad information management capabilities. IBM is known for
its strong data and information management offering, and now the company is extending it to
support big data fabric deployments. Unlike other big data fabric vendors, IBM provides its own
Hadoop distribution, yet it also provides connectors to support connectivity to Hadoop and Spark
ecosystems. IBM's key strengths lie in high-end scalability, support for complex data issues, end-to-end big data governance, integrated metadata, and granular security and privacy controls. In
addition, several reference customers mentioned that IBM Global Business Services helped them
implement a big data fabric quicker through customized models, access patterns, and integration
with existing analytical tooling. IBM is a good fit for enterprises that have complex legacy data,
have multiple data lakes, require tight security controls, and want to leverage a hybrid platform.
Informatica provides a big data fabric with all the trimmings. With more than 7,000 firms
using Informatica for their information management initiatives, its technology is proven and
mature. Informatica's strength lies in increasing developer productivity via its intuitive visual and
metadata-driven development environment, which developers can leverage for big data sources
and prebuilt parsers, transformers, and connectors that help parse, integrate, cleanse, mask, and
match data natively on Hadoop. It also supports the reuse of workflow pipelines to support other
infrastructures. Informatica provides an enterprise information catalog, which catalogs data assets
across the enterprise using an inferred understanding of the data as well as crowdsourced input
from business analysts, stewards, and architects. Enterprises use Informatica's big data fabric
solutions to deliver enterprise data lakes for real-time analytics, IoT, integrated analytics, and real-time operational intelligence like fraud detection and proactive customer engagement.
Talend offers a compelling, flexibly priced big data fabric solution. The Talend big data fabric
combines several technologies to deliver a common set of easy-to-use tools for real-time, batch,
or dynamic integration running in on-premises, cloud, or hybrid environments. Talend Platform
for Big Data simplifies the process of working with Hadoop and Spark distributions, requiring no
coding to perform various activities. In the Eclipse-based Talend user interface, you can drag, drop,
and configure graphical components representing Hadoop-related data transformation and data
quality operations and natively connect to applications, databases, NoSQL, and the IoT. Talend
automatically generates the corresponding native Spark or MapReduce code for transforming data
using the Hadoop cluster. However, data preparation, discovery, and self-service are still emerging
functionality compared with leading big data fabric vendors.

Oracle offers a viable and scalable big data fabric solution. Oracle's GoldenGate replication
solution provides real-time capabilities, integrating with Oracle Data Integrator tools to deliver a
unified development experience. It also supports real-time big data integration to dynamically
push data into the HDFS, HBase, Hive, Flume, Storm, and Kafka big data frameworks. Oracle
Big Data SQL provides data federation with Hadoop; Oracle Big Data Connectors deliver a
high-performance Hadoop-to-Oracle Database loader and enable optimized analysis using Oracle's
distribution of open source R directly on Hadoop data. Oracle's key strengths lie in its security and
governance capabilities, highly scalable data movement and transformations, and tight integration
with Oracle Big Data Appliance. Its customers use big data fabric to support various use cases,
including real-time analytics across disparate data sources (such as data lakes), customer
intelligence, IoT applications, and other big data applications and insights.
Strong Performers
Denodo Technologies extends its platform to support big data fabric. Unlike other large
software vendors in this evaluation, Denodo is a pure-play data virtualization vendor now extending
the platform to support big data initiatives. Today, several enterprises are leveraging Denodo to
support big data fabric deployments such as virtual big data marts, big data analytics, real-time
analytics, and IoT data processing in various vertical industries. Denodo's key strength
is delivering a unified and centralized data services fabric with security and real-time integration
across multiple traditional and big data sources, including Hadoop, NoSQL, cloud, and
software-as-a-service (SaaS). Customers like its easy-to-use, simple yet sophisticated data modeling
capabilities, search, and support for various big data sources.
Global IDs offers a viable big data fabric solution for all enterprises. Global IDs has been
providing data management solutions to retailers, financial services, telcos, pharmaceuticals,
and healthcare companies for more than 15 years. It addresses the data ecosystem problem
by leveraging its core expertise in data discovery, governance, profiling, lineage, and quality.
Enterprises can deploy the product in on-premises, cloud, and hybrid environments, and it
is optimized for performance on the Hadoop ecosystem. Business analysts can contribute
business terms and metadata within the product and focus on technology-management-business
collaboration. Global IDs provides extensive metadata functionality in its products to support
end-to-end big data fabric deployments. Enterprises with complex big data platforms that need
powerful metadata management and lineage should look at Global IDs.
Paxata offers an easy-to-use big data fabric focusing on self-service. Paxata's information
platform provides an interactive, analyst-centric data preparation solution that is powered by a
unified set of technologies designed to support data integration, quality, governance, collaboration,
and enrichment. Machine learning algorithms help business analysts easily understand, categorize,
integrate, and connect data more quickly. The platform is built on Apache Spark and optimized to
run in the Hadoop environment, leveraging distributed computing, machine learning, and visual
workspace. Paxata focuses on delivering an easy-to-use solution that eliminates the need for
coding, scripting, and sampling. Enterprises are using Paxata to support ad hoc, operational,
predictive, and real-time analytics. However, customers report that Paxata's integration with a few
traditional and legacy data sources is not optimized.
SAP Hana Vora extends the SAP platform to support big data fabric. SAP offers a
comprehensive data management framework to support data access, data movement, data quality,
transformation, and integration. And with SAP Hana Vora, it extends the platform to support big
data initiatives, including those for Hadoop, Spark, NoSQL, and in-memory computing fabrics.
SAP Hana Vora couples tightly with Apache Spark to expose Vora data and processing to Spark.
Enterprises can deploy machine learning algorithms in Hana directly or to Spark. In addition,
organizations can distribute data preparation operations such as sorting, joining, and aggregation
across Hana and Spark clusters. Enterprises use SAP's big data fabric to support various use
cases, including a 360-degree view of the customer, fraud detection, IoT, and real-time insights.
Syncsort offers a scalable big data fabric solution. Syncsort provides a big data fabric solution
that focuses on simplifying the process of collecting, integrating, sorting, and distributing enterprise
data to deliver actionable insights, while requiring fewer resources. Syncsort's top use cases for
big data fabric include leveraging data from mainframes and other traditional systems in Hadoop,
while ensuring data lineage, security, and efficiency. Syncsort allows enterprises to deploy a
full-featured ETL environment on-premises and on AWS EC2, Amazon Elastic MapReduce, and Google
Cloud Platform, with forthcoming support for Microsoft Azure. Data transformations are defined in
a visual, wizard-style GUI, and the same jobs can be executed natively in MapReduce, Spark, or
stand-alone servers, without any changes. Although DMX-h does not ship with built-in machine
learning capabilities, they can be included as task extensions and custom functions as part of the
data flows. Syncsort is still expanding its self-service capabilities.
Trifacta's solution makes self-service big data fabric easy to deploy. Trifacta's self-service data
preparation software enables enterprises to easily explore, transform, and join together raw and
diverse data sources into clean and structured outputs for a variety of analytic purposes. Trifacta
leverages machine learning algorithms to automate and simplify the interaction with data, making
data wrangling a self-service process for analysts and business users. The vendor supports batch
and on-demand ingestion natively and supports continuous ingestion through integrations with partners StreamSets
and Google Dataflow. It has extensive metadata management directly within the application and
through integrations with partners such as Cloudera Navigator, Apache Atlas, Waterline Data, and
Alation. Trifacta visually tracks and presents the lineage of data transformation steps for specific
data sets and across multi-data-set wrangling workflows. However, enterprises report that
Trifacta lacks high-end, scalable big data fabric deployments.


Contenders
Waterline Data focuses on delivering a Smart Data Catalog for big data environments.
Waterline Data accelerates data discovery, governance, and time-to-value through its Smart Data
Catalog, which automates the cataloging of all data lake assets. It empowers business analysts
and data scientists to find, understand, and provision trusted data to extract insights and make
accurate business decisions without coding or manual exploration. In addition to automated
discovery, it also enables business analyst communities to crowdsource tagging and annotations
and allows data stewards to curate the data catalog using an agile approach. Waterline ensures
that the catalog is up to date by detecting changes and automatically cataloging new and updated
data assets, including curated business metadata and data lineage. While Waterline supports
on-premises and cloud deployments, hybrid support is planned for a future release.

Supplemental Material
Online Resource
The online version of Figure 2 is an Excel-based vendor comparison tool that provides detailed product
evaluations and customizable rankings.
Data Sources Used In This Forrester Wave
Forrester used a combination of 32 data sources to assess the strengths and weaknesses of each solution:
Vendor surveys. Forrester surveyed vendors on their capabilities as they relate to the evaluation
criteria. Once we analyzed the completed vendor surveys, we conducted vendor calls where
necessary to gather details of vendor qualifications.
Product briefings and demos. We asked vendors to conduct briefings and demonstrations of
their products' functionality. We used findings from these product briefings and demos to validate
details of each vendor's product capabilities.
Customer reference calls. To validate product and vendor qualifications, Forrester also conducted
reference calls or surveys with at least one of each vendor's current customers.
The Forrester Wave Methodology
We conduct primary research to develop a list of vendors that meet our criteria to be evaluated in this
market. From that initial pool of vendors, we then narrow our final list. We choose these vendors based
on: 1) product fit; 2) customer success; and 3) Forrester client demand. We eliminate vendors that have
limited customer references and products that don't fit the scope of our evaluation.
After examining past research, user need assessments, and vendor and expert interviews, we develop
the initial evaluation criteria. To evaluate the vendors and their products against our set of criteria,
we gather details of product qualifications through a combination of lab evaluations, questionnaires,
demos, and/or discussions with client references. We send evaluations to the vendors for their review,
and we adjust the evaluations to provide the most accurate view of vendor offerings and strategies.
We set default weightings to reflect our analysis of the needs of large user companies and/or other
scenarios as outlined in the Forrester Wave document and then score the vendors based on a
clearly defined scale. These default weightings are intended only as a starting point, and we encourage
readers to adapt the weightings to fit their individual needs through the Excel-based tool. The final
scores generate the graphical depiction of the market based on current offering, strategy, and market
presence. Forrester intends to update vendor evaluations regularly as product capabilities and vendor
strategies evolve. For more information on the methodology that every Forrester Wave follows, go to
http://www.forrester.com/marketing/policies/forrester-wave-methodology.html.
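
As a rough illustration of how adapting the default weightings can change the outcome, the sketch below computes a weighted average score per vendor and ranks the results. The criterion names, weights, scores, and vendor names are all hypothetical and are not taken from this evaluation or from the Excel-based tool; the weighted-average formula is itself an assumption about how such a scorecard might combine criteria.

```python
def weighted_score(scores, weights):
    """Return the weighted average of criterion scores.

    Weights are normalized, so they do not need to sum to 1.
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in scores) / total


def rank(vendor_scores, weights):
    """Return (vendor, score) pairs sorted from highest to lowest score."""
    scored = [(name, weighted_score(s, weights)) for name, s in vendor_scores.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical criteria, weights, and scores, for illustration only.
default_weights = {"architecture": 0.5, "security": 0.25, "automation": 0.25}
security_first = {"architecture": 1, "security": 2, "automation": 1}

vendor_scores = {
    "Vendor A": {"architecture": 4.0, "security": 3.0, "automation": 5.0},
    "Vendor B": {"architecture": 3.0, "security": 5.0, "automation": 4.0},
}

for label, weights in [("default", default_weights), ("security-first", security_first)]:
    print(label, rank(vendor_scores, weights))
```

With these made-up numbers, the hypothetical Vendor A ranks first under the default weights, while doubling the relative weight of security moves Vendor B to the top, which is why the weightings are worth adjusting to an organization's own priorities.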

Integrity Policy
All of Forrester's research, including Forrester Wave evaluations, is conducted according to our
Integrity Policy. For more information, go to http://www.forrester.com/marketing/policies/integritypolicy.html.

Endnotes
Increasing data volume is creating new challenges in integration, security, curation, administration, and governance.
Business users want real-time trusted data to make accurate business decisions, while technology management
wants to simplify administration and lower costs. Closing the big data platform gap is the goal of the emerging
collection of technologies that Forrester calls big data fabric. Enterprise architects should look at big data fabric to
accelerate their big data initiatives, monetize big data sources, and respond more quickly to business needs and
competitive threats. See the Forrester report "Big Data Fabric Drives Innovation And Growth."

MPP: massively parallel processing; EDW: enterprise data warehouse.


Client support
For information on hard-copy or electronic reprints, please contact Client Support at
+1 866-367-7378, +1 617-613-5730, or clientsupport@forrester.com. We offer quantity
discounts and special pricing for academic and nonprofit institutions.

Forrester Research (Nasdaq: FORR) is one of the most influential research and advisory firms in the world. We work with
business and technology leaders to develop customer-obsessed strategies that drive growth. Through proprietary
research, data, custom consulting, exclusive executive peer groups, and events, the Forrester experience is about a
singular and powerful purpose: to challenge the thinking of our clients to help them lead change in their organizations.
132141
For more information, visit forrester.com.
