Apigee Insights

Data as Currency and Catalyst in the App Economy

Anant Jhingran

Contents
Introduction
Data as Currency and Catalyst
Big, Noisy…and Useful
The Power of Context
Relief from the ETL and EDW Legacy
Broad Data and Signal Amplification
Increasing Signal to Noise
Correlating Data Across Channels
Considerations for Getting the Most From Your API Data
Conclusion
About the Author

2 2 ©2012 Apigee. Confidential – All Rights Reserved. ©2013 Apigee. All Rights Reserved.

Introduction
Businesses are expanding their digital ecosystems into the app economy. The app
economy is changing the way users consume digital information and customers
interact with enterprises. As enterprises increasingly interact with partners,
developers, and internal resources through mobile, social media, and app experiences,
the number of sources of useful data keeps expanding. Data becomes the currency
of the digital ecosystem and enterprises are rushing to make sense of the new world
of data as their highly structured internal systems meet the ocean of unstructured
and semi-structured data in the big data world. This eBook introduces you to broad
data, the bridge between big data and the world characterized by legacy ETL and
EDW technologies and internal enterprise systems.

Digital ecosystems enable an enterprise to broaden the reach of its digital assets,
increase revenue, accelerate and expand partnerships, and create innovation. In the
app economy, digital ecosystems are enabled quickly, securely, and at scale through
APIs. APIs are like the valves and the arteries that connect the heart of a company,
its digital assets, to the rest of the ecosystem. Just as assessing the health of a
person requires blood tests as well as information about their lifestyle, assessing and
optimizing the health of a digital ecosystem requires a breadth of data. It requires
that context about users, developers and partners from internal and external sources
be combined with API data to create a 360-degree view from which you can draw
actionable business insights.

Broad data starts with a focus on digital ecosystem data. That’s internal data at
the core of enterprise commerce as well as external data from interactions with the
enterprise. Broad data combines the relevant contextual signals from outside of the
system to deliver insights and help optimize the health of your business.

In order to participate and be relevant in the app economy, and to derive the most
valuable business insights, a whole new paradigm of data analysis is needed.

In this eBook, we explore how broad data and the power of context inform this
paradigm shift.

We describe why existing internal enterprise technologies and machine data analytics
tools don’t work in the broad data world. We explore how data transport systems need
to change. The traditional approach of putting large datasets on disks and shipping
them where they need to go is just not going to work in the world of broad data.

As it becomes impossible to move all the relevant external data to repositories inside
a business, companies will need to create new infrastructures, build new types of
data analysis systems, and forge new kinds of business relationships.

The challenge for companies is to gain access to and gain value from diverse datasets
that they don’t own and that they cannot move. In short, businesses need to be able
to access data wherever it resides, to reduce the noise, and to harvest insights from
the signals in the data.

Data as Currency and Catalyst


In this new world of apps and APIs, data is the currency that flows through the
system and powers the app economy. It is constantly being exchanged, added,
and subtracted. If the data can be harnessed, it can improve the app economy for
all participants.

For example, when an app interacts with a retailer’s catalog, it facilitates an
exchange of data: the retailer tells the app about what’s available, and the user
(through the app) tells the retailer what she’s interested in. That’s the “currency”
part of the equation.

But even more importantly, this data, when harnessed, is a catalyst to improve
retailer interactions with the app, the developer, and ultimately the user. The
analysis of this data immediately helps improve the retailer’s business. Big data at
the edge of the enterprise is like water—it flows readily, but if harnessed using new
techniques, it also helps power the business and change it dramatically.

Big, Noisy…and Useful


Big data immediately brings to mind zettabytes, petabytes, and exabytes: words
that express more zeros than the mind can comprehend. But what’s “big” about big
data is not just a matter of quantity. It is also qualitatively different.

The world of connected devices and communications data has created a universe
of data that looks very different from the structured tables and rows at work in our
transactional systems of record, such as ecommerce platforms and databases.
There has been a great deal of attention paid to “unstructured” data, which can be
misleading, because what’s most important about big data is not its volume and
lack of structure, but the fact that there are important signals buried in it that can
be extracted and exploited.

The most important data is semi-structured, not unstructured. API traffic,
developer interactions, app traffic, Twitter streams, ecommerce shopping carts—
these types of data are measured in terabytes, rather than petabytes or exabytes
and all have some detectable organizing principles. As a consequence, we don’t
need massive supercomputers to extract the right signals. Truly unstructured,
massive data consumes too many resources in the process of signal extraction and
is difficult to cost-justify outside of scientific or military research. But the data that
lives around APIs—just outside the front doors of our businesses, and just behind
the front doors of our business partners—is semi-structured data, and we can start
using it today.

Rather than worry about “speeds and feeds” and the obligation to process raw
volume, the right questions for the enterprise to ask are: What is the right data
source to explore? What data sources provide valuable signals for us?

The Power of Context


One of the most important facets of big data is its potential to provide context
around an interaction with your business. Context is the 360-degree view of your
business. It encompasses interactions between a business and its customers
across multiple devices, apps, social networks, business networks, and cloud
services. Note the use of the word interaction: that can mean a transaction, but
it also includes visits to your online presence that do not result in a purchase, as
well as commentary about your business that happens well before or after an
interaction.

Context = 360 view

Here are two examples where context lends powerful value to data analysis,
especially when cross-referenced:

›› The transactional context of app developers and users. How many times did
a developer access your API? How many purchases has an app user made in the
last month through your ecommerce app? How many times has the app user
phoned your support line? Much of this data is stored in internal systems.
›› The social context of both app developers and users, which can lead to
insights for a business on its developer innovation programs. What kind of
comments are the developers and users making in developer forums and
social media about your API or your brand? This data lies almost entirely
outside of the organization.

The context of user behavior, combined with developer behavior and compiled from
both internal and external, structured and semi-structured data, holds valuable
predictive clues.

While Twitter and Facebook garner a lot of attention from a business-to-consumer
(B2C) social perspective, there are many other social “hangouts” that inform us
about digital ecosystems and allow us to measure the success of an enterprise’s
business strategy. Developers are the key players in the new app economy—they
build apps. What do developers’ social interactions tell us about an individual
developer, and about developers in the aggregate?

Individually, a developer’s profile on an enterprise’s developer channel, his GitHub
account, and his Stack Overflow Q&A activity reveal much about the developer’s
influence. This helps the enterprise learn whether it is targeting the right
developers. Collectively, the interactions of all the developers, stitched across
multiple data sources, tell an enterprise about trends such as the popularity of
API structures, programming language popularity, and general impressions of the
enterprise’s specific APIs.

Here’s an example: A developer is building an app around a telecom provider’s
payments API. The developer is active on GitHub, has expressed interest in certain
programming languages, and has a set of followers. Likewise, he follows other
developers. He interacts with others on Stack Overflow, where he comments on the
usefulness of the telecom provider’s APIs.

The app that he builds stores information on user behavior outside the telecom
API. The user of the app is a known customer of the telecom, which has records
of her payment and phone-activation history in its traditional systems of record.
With all of this contextual information, the telecom provider can now determine
how engaged the developer is, how persuasive his app is to the user, and how the
user is interacting with the telecom provider through the app and other channels.
This 360-degree view, when captured, is far more effective for engaging the user
and the developer than capturing only the traffic to and from the app through an API.

Relief from the ETL and EDW Legacy


The world of internal enterprise data is highly structured. Enterprise applications
such as enterprise resource planning (ERP), human resources (HR), and customer
relationship management (CRM) systems are built with the assumption that they
will consume structured data and be customized to some degree. At the same
time, these applications are often isolated in information silos with their own closed
databases. In order to correlate information well enough to get a 360-degree view
of the business, data has to be extracted from these closed systems, transformed
into a digestible common format, and loaded into an enterprise data warehouse
(EDW), where it sits and waits to be analyzed by business intelligence (BI)
applications. This process is known as Extract, Transform, and Load (ETL), and it is
notoriously slow and expensive.
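
The three ETL stages described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any particular ETL product: the two source lists, their field names, and the `customer_summary` table are all hypothetical, and an in-memory SQLite database stands in for the EDW.

```python
import sqlite3

# Hypothetical rows from two siloed systems that use different field names.
crm_rows = [{"cust_id": "C1", "full_name": "Ada Lopez", "channel": "web"}]
billing_rows = [
    {"customer": "C1", "amount_usd": "19.99"},
    {"customer": "C1", "amount_usd": "5.01"},
]


def extract():
    """Extract: pull raw records out of each source system."""
    return crm_rows, billing_rows


def transform(crm, billing):
    """Transform: map both sources onto one common, typed format."""
    totals = {}
    for row in billing:
        key = row["customer"]
        totals[key] = totals.get(key, 0.0) + float(row["amount_usd"])
    return [
        (r["cust_id"], r["full_name"], r["channel"],
         round(totals.get(r["cust_id"], 0.0), 2))
        for r in crm
    ]


def load(rows, conn):
    """Load: write the normalized rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_summary "
        "(customer_id TEXT, name TEXT, channel TEXT, total_spend REAL)"
    )
    conn.executemany("INSERT INTO customer_summary VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for an EDW (assumption)
load(transform(*extract()), warehouse)
result = warehouse.execute(
    "SELECT customer_id, total_spend FROM customer_summary"
).fetchall()
```

Note that the transform step hard-codes the field names of both sources. A source that renames a field or adds a new structure forces a change to this code, which is one reason the process is slow to adapt.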

The good news is that when the data is highly structured—even if it is siloed—it
requires less logic to normalize and analyze. EDW and ETL were great for
answering known questions derived from known data sources. If an enterprise
wanted to know the churn rate of a channel, it could build a model with a high
degree of confidence, then fill its EDW with carefully selected data, filtered by
ETL. Once this pipeline was set up, the apparatus could repeatedly answer the
same question efficiently.

But today, new apps and APIs are constantly being developed and deployed,
sometimes with a “half life” of only weeks. Not only is there a lot of semi-structured
and unstructured data in the world of big data, there are a lot of unknown
structures, and new ones are added to the mix every day. The world of big data is
almost the opposite of the controlled world of the internal enterprise. The ETL and
EDW paradigm was not designed for such a dynamic world.

When data arrives in a previously unknown format, or changes structure frequently,
or when a new kind of question is asked for which there is no pre-existing model,
traditionally modeled processes and systems such as ETL and EDW break down.
ETL and EDW were made to answer questions like, “What is my churn rate?” They
were not made to answer questions like, “What is the right engagement metric to
measure the commitment level of iOS users on my photo-printing app?”

In the past, an enterprise collected all the data that it thought might be useful for
optimizing its business and centralized it for analysis. But in the new world, it can
no longer do so. While this might appear to be a problem, it is not. Everywhere data
is being collected, it is also being exposed through APIs. The action is at the edge
of the enterprise, where the interactions happen, not in a central processing facility.
APIs form the bridge for an enterprise wanting to access external data that it does
not collect and store, such as Twitter feeds or GitHub posts, and correlate this with
data that is buried deep behind its own firewall. It becomes much easier to access
data that is outside of the enterprise’s direct control, without the need to ingest,
store, and process it—while its value can still be determined and put to use.

Broad Data and Signal Amplification


At the edge of the enterprise, things are changing much more rapidly than at the
core of the enterprise. Traditional analytical systems can’t accommodate these
changes easily. It is possible to build a large, inexpensive distributed cluster
of storage in Hadoop, but linking the different sources of data and performing
analysis on the ocean of data requires enormous amounts of logic. It is possible
to do it, but it is like throwing the baby out with the bath water. The “bath water”
is the constraints of the elaborate modeling required by traditional ETL and EDW.
But the “baby” is the time to value derived from having ETL and EDW do a lot of the
analytical heavy lifting.

Enterprises need a system that delivers the best of both worlds—fast. That is
broad data.

The world at the edge of the enterprise has some key patterns—apps talk to
APIs, which produce data; likewise APIs expose data. Developers build apps, and
users interact with apps built by developers. Broad data starts with the business
questions that an enterprise is looking to answer and builds upon the interactions
at the edge, around the APIs and apps. Then it lays out a loose and expandable
structure around the core concepts of entities, events, and analytical needs that
can help normalize data, point it towards the business value sought, and deliver
time to value without old-world constraints.

In the structured-data/EDW world, the “entity” refers to a party (human or
computer) that conducts some kind of interaction with the database, which is
called an “event.” The most basic example is a customer, an “entity” defined by
name, age, address, and so on. That customer making a purchase of product X at
store Y at time Z is an “event.” These definitions have changed little over time.

In the broad data/API world, the definition of “entity” is much more fluid, as the
properties of apps and their developers change all the time and the “events”
change too. It’s critical that the system you use for analysis has the flexibility to
recognize when events and entities change, to accommodate those changes,
and to produce a useful signal for your business right away. You need a loose
framework that allows you to ask new business questions all the time.
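
One way to picture such a loose framework is a pair of record types whose core fields are fixed while everything else stays open-ended. The sketch below is an assumption made for illustration, not a description of how any Apigee product models data; the `Entity` and `Event` classes, their field names, and all sample values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Entity:
    """A party at the edge: a developer, an app, or a user."""
    entity_id: str
    kind: str
    # Open-ended attributes, so a new property needs no schema change.
    attrs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Event:
    """An interaction by an entity, carrying a free-form payload."""
    entity_id: str
    event_type: str
    timestamp: str
    payload: Dict[str, Any] = field(default_factory=dict)


dev = Entity("dev-42", "developer", {"language": "Python"})
events = [
    Event("dev-42", "api_call", "2013-01-05T10:00:00Z", {"api": "payments"}),
    Event("dev-42", "forum_post", "2013-01-06T09:30:00Z", {"topic": "payments"}),
]

# Later, a new property and a new kind of event appear; nothing is redefined.
dev.attrs["followers"] = 120
events.append(
    Event("dev-42", "app_published", "2013-02-01T12:00:00Z", {"platform": "iOS"})
)

# A new business question ("what kinds of activity do we see?") is one line.
events_by_type = Counter(e.event_type for e in events)
```

The trade-off in this design is that the open dictionaries give up the validation a fixed schema provides, so any analysis reading `attrs` or `payload` has to tolerate missing keys.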

The data seen by such a system is not just big; it is broad, in the sense that
hundreds of data sources combine to produce some strong signals—as opposed to
the handful of signals collected by focusing only on data that was generated within
the enterprise.

This broad data platform encapsulates the common entities and events at the
edge, captures the common analytical needs, and allows simple extensions for
data, domains, and business problems. This is the future for insights at the edge of
the enterprise.

The new broad data platform needs some new constructs

This is also the essence of signal amplification: Understand what each data source
is telling you, stitch it together with other data sources, and produce a more
reliable signal than any one data source can give you.

Increasing Signal to Noise


Generally, filtering out noise is achieved by amplification of the signal. The best
technique for amplification of the signal is to use multiple sources and stitch them
together. In the 2012 US presidential election, The New York Times’ Nate Silver
looked at “noisy” polls, their confidence intervals, the questions they asked, their
historic biases, and so on, and combined them to produce stunning predictions.
Silver correctly predicted the presidential election results in all 50 states. People
who looked only at individual polls almost always got something wrong; sometimes
they got a lot of things spectacularly wrong.
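
To make the idea of combining noisy sources concrete, here is a minimal sketch using inverse-variance weighting, a standard statistical technique. It is not a reconstruction of Silver's model, which also accounted for factors such as historic biases. The function name and the sample readings are hypothetical, and the sketch assumes the sources are independent and unbiased.

```python
import math


def combine_estimates(estimates):
    """Inverse-variance weighted mean of (value, standard_error) pairs.

    Assumes independent, unbiased sources. More precise sources
    (smaller standard error) receive proportionally more weight.
    """
    weights = [1.0 / (se ** 2) for _, se in estimates]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


# Hypothetical readings of one quantity from three noisy sources:
# (estimate, standard error).
polls = [(52.0, 3.0), (49.0, 4.0), (51.0, 2.0)]
combined_value, combined_error = combine_estimates(polls)
```

Under these assumptions the combined standard error is always smaller than that of the best single source, which is the sense in which stitching sources together amplifies the signal.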

Correlating Data Across Channels


A user of an app built against a retail API might start her shopping at the retailer’s
website. However, she does not complete her shopping on the website. Instead,
she uses an app on her iPhone to connect back to the enterprise, expecting
to continue her shopping where she left off, but also expecting a different user
experience. Individual behaviors might vary, but the interactions across the two
channels are likely to be different. A website supports a more exploratory approach
while mobile apps need to be more streamlined. The essence of the broad data
challenge in this case is how to correlate two behavior patterns of one shopper and
how to meaningfully differentiate the behaviors of an aggregate population across
the channels so that we can build appropriate experiences.
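
The first half of that challenge, correlating one shopper's behavior across channels, can be sketched as a grouping of events by user. The event logs, field names, and the presence of a shared `user_id` on both channels are assumptions made for illustration; in practice, establishing that shared identity is often the hard part.

```python
from collections import defaultdict

# Hypothetical event logs; a shared user_id on both channels is assumed.
web_events = [
    {"user_id": "u1", "action": "view", "sku": "A"},
    {"user_id": "u1", "action": "add_to_cart", "sku": "A"},
    {"user_id": "u2", "action": "view", "sku": "B"},
]
app_events = [
    {"user_id": "u1", "action": "checkout", "sku": "A"},
]


def journeys_by_user(**channels):
    """Group events from every channel under the user who produced them."""
    journeys = defaultdict(list)
    for channel, events in channels.items():
        for event in events:
            journeys[event["user_id"]].append((channel, event["action"]))
    return dict(journeys)


journeys = journeys_by_user(web=web_events, app=app_events)

# Shoppers who appear on both the website and the mobile app.
cross_channel = [
    user for user, steps in journeys.items()
    if {channel for channel, _ in steps} == {"web", "app"}
]
```

Once journeys are grouped this way, per-channel aggregates (for example, how many browsing steps precede a checkout on each channel) can be computed over the same structure.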

Considerations for Getting the Most From Your API Data


Now that we’ve explored the potential of API data and context in the world of
business, here are a few pieces of advice for getting started on gathering signals
and gaining insights with a broad data system.

›› Don’t think of analytics as an afterthought. Your analytical approach should
be considered on day 1. Would you build a website today without enabling
Google Analytics or Omniture? Why would you do something different for your
API?
›› Don’t overplan your APIs, and track usage. Developers are a straightforward
bunch. They will work with tools that are useful and ignore those that
are not. There’s no sense in designing a complex API on a predetermined
model because as soon as it is deployed, it will have to be changed. Conducting
continual analytics helps you understand the Darwinist dynamics of your API. As
you observe “what’s hot and what’s not,” you can make changes, either automatically,
through self-learning systems, or manually. You can deliver something
better for your developers every day.
›› Build out a business strategy for understanding the value of different data
sources. Couple that business strategy with one or two smart data scientists
on staff who can quickly evaluate the signal-to-noise ratio of the data source.
Every data source will have an API, so this investigation should be relatively
easy. Follow up with some real business deals, internally or with partners, to get
control of the data that is most likely to yield dividends.

›› Always ensure that you have key business drivers in mind. These might
be information accounting, developer adoption, or user behavior. Convert the
business problem into bite-sized tasks that answer the questions:
What data do I need?
What entities and events should I monitor?
What analysis is appropriate?
What is the connection to the business problem?
Solve the problem from both ends—“problem down” and “technology up.”

Conclusion
In order to profitably participate in the app economy, a new paradigm of analysis
is needed—one that understands that the action is at the edge of the enterprise.
By analyzing not only the transactions but also the interactions and contextual
activity around an API, businesses can begin extracting the valuable, actionable
signals from the broad ocean of noise that big and broad data have delivered to
our doorsteps. Once we start extracting the signals and exploring the breadth of
data around our APIs and stop worrying so much about ingesting and processing
massive volumes of data, a rich context develops that provides new clues about
how to survive and thrive in the app economy.

About the Author


Dr. Anant Jhingran (PhD Berkeley) joined Apigee from IBM where he was VP and
CTO for IBM’s Information Management Division and Co-Chair of the IBM-wide
Cloud Computing Architecture Board. He was responsible for the technical strategy
for databases, information integration, analytics, and big data, and helped deliver
IBM’s PaaS capabilities. Anant has received several honors, including IBM Fellow,
the IIT Delhi Distinguished Alumnus Award, the President’s Gold Medal at IIT Delhi,
and the IBM Academy of Technology, and has authored over a dozen patents and over
20 technical papers.

For further inquiry


If you’d like more information on Apigee’s Insights, please email data@apigee.com
or send a message via Twitter to @apigee.

About Apigee
Apigee is the leading provider of API technology and services for enterprises and
developers. Hundreds of companies including AT&T, Bechtel, eBay, Korea Telecom,
Telefonica and Walgreens, as well as tens of thousands of developers use Apigee to
simplify the delivery, management and analysis of APIs and apps. Apigee’s global
headquarters are in Palo Alto, California, and it also has offices in Bangalore, India;
London; and Austin, Texas. To learn more, go to apigee.com.

Find Best Practices to Accelerate your API Strategy

Scale, Control and Secure your Enterprise

Build Cutting-Edge Apps and Intuitive APIs
