
Bango Technical Whitepaper

Bango Analytics Architecture


Version 1.0 (updated December 4th 2008)

Introduction
Reliable and accurate data is an essential foundation for analysis, business decisions and building the strongest
customer relationships. Bango Analytics uses a combination of conventional and proprietary Bango techniques
for data collection. In addition data enrichment techniques are used to constantly add further meaning to the
collected data.
Raw data is stored in an on-line data warehouse and can either be analyzed using the tools provided online at
Bango.com or exported for analysis with other tools. Export can be in real-time using the Web-services interface
or through download as records in a number of structured file formats.
This document is focused on explaining the composition of the Event Log stored in the data warehouse. The
collection of accurate and maximally enriched data is the key focus of Bango Analytics.

Core components
Bango Analytics products are built around four key building blocks:
1. Data collection: These systems collect information about user interactions with a website through their mobile
phone browsers or other routes. Each item of data collected is known as an event.
2. Data warehouse: This database stores the data collected by data collection systems. The data warehouse
stores the Event Log which is at the heart of Bango Analytics.
3. Data enrichment: This takes event information within the data warehouse and applies more meaning and
links with associated information. This is based on:
a. Live information within the Bango system
b. Previous information stored within the Bango Data Warehouse
c. Information from partners and other 3rd party services, including mobile network operators (MNOs).
This happens either in real-time or at a later stage as the information becomes available.
4. Data analysis and reporting: A set of web based reporting, analysis and graphing tools that enable Bango
customers to get actionable intelligence from the pool of data they collect using Bango Analytics.
Appendices in this document provide more detail on mobile specific aspects involved in Bango Analytics.

2008 Bango Ltd

Commercial in Confidence


The Bango Event Log


The Bango Data Warehouse stores billions of events collected across thousands of websites, visited by tens of
millions of users. Each Bango customer runs one or more mobile websites and for convenience can organize
collection of data into sites or groups of sites using Bango packages.
Each Bango package owns the event log collected from page tracking, campaign tracking, payment and other
tools provided. Only authorized customers with password protected access to the package can access or export
the events recorded.
All records in the event log have the following elements: owner, time, type, user ID, route, location and web
specific. These are described in more detail in the following sections:
1. Owner: Who owns the record? To ensure privacy of information, each record in the event log is assigned to a
package where it was originally created.
2. Time: When did the event occur? This is a UTC time, linked to a centralized atomic clock, to the nearest
millisecond when the event took place. This event time may be mapped into a local time zone by viewing and
reporting tools. Note that UTC time does not have missing or superfluous hours. Reliable time stamping is
vital in understanding cause and effect and in sequencing activities.
3. Type: What sort of event was generated? Events may be generated in a number of different ways and Bango
expects to add additional types of events in the future, for example by recording packet sniffing findings, SMS
delivery reports, payment clearances, events in other systems or simply annotations.
Possible event types and the associated data include:
a. Page tracking events record a visit by a browser to a specific web page (a page view). This event type
includes a Page Title, a Bango Number reference and can also record referral information.

b. Link tracking events, recorded using URL redirects, are used to track specific campaign clicks or links to
other sites rather than arrivals at a destination or page view. This event type includes a Redirect or
Campaign title and a Bango Number reference.

c. Payment events are used to record financial transactions using the Bango Payment products. This event
type includes a Content title, a price, a currency and a yield value.
d. Identification events are used to determine operator, consumer identity or other factors. This event type
is the same as a redirect type but passes back information in real-time. Identification can be done at any
time or as part of tracking a campaign or link.

Different data capture techniques are recommended for different situations; the diagram below shows some
suggested uses. In all cases a range of parameters may be specified to pass additional information as required
(see the Appendix or http://bango.com for more details).
[Diagram: suggested use of data capture techniques. Labels shown: 3rd Party Referral, Browse In, Mobile Search,
Ad Campaign, Text Campaign, PC Web Promotion, Your site, Home Page, Landing Page, Premium Download Store,
Premium Download Page, Download Page, Form Input, Partner Page.
Link tracking: track entries from campaigns and exits to 3rd party sites.
Page tracking: track page views as visitors navigate your site.
Payment links: optionally process and track financial transactions.]

4. User ID: Who generated the event? A reliable and persistent user identity is crucial for accurately
understanding your customers' behaviour and for building the strongest ongoing relationship with them.
Some event types may not have a specific user associated with them. For example, in the future a recorded
event may indicate the start of an ad campaign or a change in pricing.
Bango has an extensive range of technologies and partnerships in place to enable precise user identity (see
Appendix A for more details). Whether identity is linked to the customer phone number, an operator generated
identity, deduced from headers, IP addresses, by leaving cookies or by other means, Bango gives each user a
unique identity that persists for as long as possible. In most cases, since this is linked to the user's phone
number, it lasts for several years, allowing you to build a strong relationship with each customer.
A Bango web service can be used to determine more about how that user was identified, and to get more
identity information from that user if it is available (for example MSISDN, email address, etc.). See Appendix C
for information on Bango web services.
5. Route: How did the user connect to the website? The device, browser and internet connectivity used by a
consumer to access a website are important to understand. The method used can affect the user experience
and can also provide demographic or psychographic input to assist with understanding.
The Bango system architecture allows the method of connection to be de-coupled from user ID. For example, a user
could visit a few times on their mobile through a mobile operator's network and then visit through a WiFi network,
bypassing the mobile operator. A common identity can be maintained through pairing of these routes.
Several service providers, for example Vodafone, are developing universal identities to glue together
information about their subscribers' usage of mobile, home broadband and WiFi hotspot channels. Bango can
exploit these identities.
a. The Device capabilities. There is a very wide range of mobile devices, with even the most popular
having single digit percentage market shares. Therefore understanding which devices are used by
consumers is important, along with the capabilities of those devices.
The device is normally deduced by reading information from the User-Agent (UA) string in the browser
headers. Sometimes, for example when a mobile operator has deployed a transcoder, other methods
are needed, which vary from operator to operator. For example, Vodafone UK's transcoder sends
a Mozilla-compatible UA string in the User-Agent header whatever the device. It also adds the mobile
User-Agent string from the phone browser as the value of a new secondary header, X-Device-UserAgent.
i. Device name can be used with a Bango web service to extract detailed device specific info.
ii. Device capabilities include music and video format, screen size, Java capability, file upload and
download capability, and operating system.

Bango works with the dotMobi organization as part of the Device Atlas project ensuring that the richest
range of information is available to Bango Analytics users.
b. Connection to the internet. How a device (user) is connected to the web is important to know. For
example, a user connecting through a mobile operator may have access to operator payment services or
there may be a transcoder or content filter in place that affects the user experience. Many devices, for
example, the iPhone or Nokia N95 can connect by WiFi as well as by operator network. Knowing the
country that visitors originate from helps with targeting of marketing, adverts or user services.
i. Country provides the consumer's home country, typically the location where the phone is
connected to the Internet. Country can frequently be determined by the IP address or the mobile
operator identity. Sometimes it needs further work.
For example, most US BlackBerry devices appear to be in Canada as the internet gateway they
use is via RIM HQ in Toronto. However, device specific information can be used to deduce that a
device is (for example) provided by Verizon and therefore that the home country of the device is
USA. Another example is that some Scandinavian mobile operators have one gateway for several
countries, but the home country can be distinguished by phone language (for example Swedish or
Danish).

ii. Mobile operator or ISP provides access to the internet through gateway servers. From the IP
address of the server, Bango can deduce the mobile operator, although this does not always work.
For example, sometimes a consumer connects using a Mobile Virtual Network Operator (MVNO),
which shares the connection / gateway of a parent operator (for example BT Mobile on Vodafone or
Boost on Sprint). In these instances other operator specific techniques are needed to assign the true
operator.
In addition, some operators pass traffic through a transcoding partner. Vodafone UK traffic
frequently appears to come from a hosting facility in London due to the transcoding role of a
company called Novarra.
Also, proxies such as Opera Mini replace the operator IP address with their own, for example in
Oslo, Norway. Fortunately Opera provides extra information to Bango that identifies the originating
operator.

6. Location: The location of the mobile device or user will become increasingly valuable for analytics and
marketing in the coming years. Location might be determined by operator roaming knowledge (to country
level), cell id (to within a local area), GPS from the phone itself (increasingly this will be visible from the
browser) and also through other systems like knowing a phone is near a specific Bluetooth beacon or QR
code attached to a physical object.
Several operators are able to provide Bango with location data for mobile device transactions. In the future
Bango will connect to the various location methods being deployed in mobile and fixed devices and will use
WiFi hotspot location data to provide more precision to users connecting via WiFi nodes. We will also be
connecting to Google Gears and Yahoo Fire Eagle.
The location is stored as floating point values for longitude, latitude, and altitude (in that order). Longitude and
latitude values are in degrees, where longitude is >= -180 and <= 180 and latitude is >= -90 and <= 90; altitude is in
meters above sea level (negative for below). Location method is also stored in each transaction, with a code indicating any of the
available methods such as GPS via browser, deduced from NFC swipe, Yahoo Fire Eagle, user entered zip
code, interpolation based on time from last location and/or location change frequency.


7. Web specific: Some web browsers provide additional information to the web server, which is collected and
stored by Bango as part of the event recorded. This includes:
a. Referrer. Some mobile browsers pass a referrer string which indicates the page the browser visited
before visiting the Bango page tracking code or redirect. Bango stores this string for further analysis. For
example, the referrer string http://www.google.com/search?q=curved+yellow+fruit indicates that the user
came from a google.com search for certain keywords.
Referrer is available for download. Releases of Bango.com early in 2009 will provide tools for
detailed analysis and reporting on referrer and referrer semantics (such as keywords, search source etc.).
b. Query String. Some advertisers prefer or are required to track campaigns by providing parameters with
their landing page destination to indicate the campaign source. These parameters are then recovered for
use later on, for example: campaign=videosites&channel=youtube. Bango stores a portion of the
query string to allow extraction and reporting on this data at a later stage.
Note: Query String recording will become available at the end of 2008 and reporting on query string
values will follow shortly after.
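
As a rough illustration of how some of the header-derived values above might be decoded, the sketch below picks the handset User-Agent (preferring the X-Device-UserAgent header that a transcoder may add), extracts search keywords from a referrer, and splits a campaign query string. The function names are hypothetical and only the Python standard library is used; this is not Bango's implementation.

```python
from urllib.parse import urlsplit, parse_qs


def effective_user_agent(headers):
    """Prefer the handset UA forwarded by a transcoder, if present."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-device-useragent") or lowered.get("user-agent", "")


def search_keywords(referrer):
    """Return the search terms from a referrer URL that carries a 'q' parameter."""
    params = parse_qs(urlsplit(referrer).query)
    terms = params.get("q", [])
    return terms[0].split() if terms else []


def campaign_parameters(query_string):
    """Turn a landing page query string into a flat dict (first value wins)."""
    return {key: values[0] for key, values in parse_qs(query_string).items()}
```

For instance, search_keywords applied to the Google referrer quoted above yields the three words curved, yellow and fruit, and campaign_parameters applied to the example query string yields a campaign of videosites and a channel of youtube.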

Interpreting the data


Bango provides a comprehensive set of online tools and export facilities that deliver mobile website and campaign
analysis according to the Web Analytics Association standards (see Appendix D for details). Bango is a Premier
Corporate Member of the WAA.
Events recorded allow us to analyze which visitors triggered which event at which time and determine things like
the time on site or specific pages, the path people took and more.

Unique visitors
Precise details about your unique visitors are the foundation of any business or marketing campaign and are
fundamental to success. Bango Analytics directly uses the unique User ID (as described in Appendix A) to deliver
the most accurate unique visitor numbers. Our system filters out bots and spiders to give true consumer visits
only.
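
A minimal sketch of the idea, assuming each event is a dict with a user_id and a user_agent field (both hypothetical names). The bot check shown is a deliberately naive substring test chosen for illustration; the filtering techniques Bango actually uses are not described in this document.

```python
# Assumed markers, for illustration only; real bot filtering is more involved.
BOT_MARKERS = ("bot", "spider", "crawler")


def looks_like_bot(event):
    """Naive check: flag events whose User-Agent contains a known marker."""
    user_agent = event.get("user_agent", "").lower()
    return any(marker in user_agent for marker in BOT_MARKERS)


def unique_visitors(events):
    """Count distinct user IDs among events not flagged as automated traffic."""
    return len({event["user_id"] for event in events if not looks_like_bot(event)})
```

Because the count is taken over a set of user IDs, repeat events from the same visitor are counted once.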


Page views
Bango delivers this basic currency of analytics according to the WAA guidelines, recording each page that is
loaded by the browser.

Visits
By measuring the gaps between the page views of a unique visitor it is possible to determine distinct visits
according to the WAA standards. In addition, Bango can detect when the consumer visits another site using
Bango Analytics.
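
The gap-based approach can be sketched as follows. The 30 minute timeout is an assumption made for this example (the document does not state the value Bango uses), and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def count_visits(page_view_times, timeout=timedelta(minutes=30)):
    """Count visits for one visitor: a gap longer than `timeout` starts a new visit."""
    visits = 0
    previous = None
    for moment in sorted(page_view_times):
        if previous is None or moment - previous > timeout:
            visits += 1
        previous = moment
    return visits


# Example: two page views close together, then two more after a long pause.
views = [
    datetime(2008, 12, 4, 10, 0),
    datetime(2008, 12, 4, 10, 10),
    datetime(2008, 12, 4, 11, 0),
    datetime(2008, 12, 4, 11, 5),
]
visit_count = count_visits(views)
```

With the assumed timeout, the four page views above form two visits, because only the 50 minute gap exceeds 30 minutes.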

Revenue
Bango Analytics works with Bango Payment to deliver true revenue tracking along with up-to-the-minute
conversion rates.

Mobile specific data: countries, operators and devices


Bango delivers additional mobile specific analysis for consumers' country, operator (MNO and MVNO) and
device. This data is constantly being enriched as new devices and operator connections become known.

Back enrichment
Bango is constantly working to provide further insight into data collected. Partners and internal systems enrich the
raw data captured by user visits.
For example, new devices are discovered visiting mobile sites, often before their release. Operators change
their gateways, new MVNOs are discovered and user identities are connected. A back enrichment process is
always running, trawling back through old data to refine and add useful information to it in the data warehouse.

Extracting raw data


At Bango.com, Analytics package owners can (depending on package level) extract all the event records on their
transactions in CSV or XML format. A date range or other queries can be specified.
This enables analysis of data outside Bango.com (for example by importing into another tool).
Note that as back enrichment can occur at any time, it is always best to wait until the latest possible time before
extracting the data for processing, or to use the real-time web service interface.
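
As an illustration of processing an export outside Bango.com, the sketch below reads a small CSV sample into event records. The column names and sample rows are invented for this example (the real export layout is not given in this document), timestamps are assumed to be ISO 8601 in UTC, and the location check assumes the conventional ranges of -180 to 180 degrees for longitude and -90 to 90 degrees for latitude.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical export: column names and values are made up for illustration.
SAMPLE_EXPORT = """owner,time,type,user_id,longitude,latitude,altitude
pkg-1,2008-12-04T10:15:30.123+00:00,page,234534298,0.1218,52.2053,6.0
pkg-1,2008-12-04T10:16:02.500+00:00,link,234534298,,,
"""


def parse_location(row):
    """Return (longitude, latitude, altitude) or None when no location was recorded."""
    if not row["longitude"] or not row["latitude"]:
        return None
    longitude = float(row["longitude"])
    latitude = float(row["latitude"])
    altitude = float(row["altitude"]) if row["altitude"] else None
    if not (-180.0 <= longitude <= 180.0 and -90.0 <= latitude <= 90.0):
        raise ValueError(f"location out of range: {longitude}, {latitude}")
    return (longitude, latitude, altitude)


def read_events(text):
    """Parse exported CSV text into a list of event dicts with typed fields."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            "owner": row["owner"],
            # Keep event times in UTC; map to local time only when reporting.
            "time": datetime.fromisoformat(row["time"]).astimezone(timezone.utc),
            "type": row["type"],
            "user_id": int(row["user_id"]),
            "location": parse_location(row),
        })
    return events
```

Because back enrichment can change records after the fact, a script like this would need to be re-run on a fresh export rather than on a cached copy.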

Real-time data services


Most of the above services are available real-time from within a mobile site as Bango Analytics collects the data.
See Appendix B for an overview of the Bango Identifier Service.


Appendix A: Bango identification techniques


In traditional SMS based systems, the user sends a message and therefore their phone number (MSISDN) is
communicated and can be used as an identity. With web based systems, the MSISDN is not automatically
provided and therefore cannot be used as a reliable identity, just as you would not expect to browse a web page
and have your email address passed to each site owner. In addition, the operator (MNO) rarely provides personal
details for their customers, which includes phone numbers. Consumers may also connect via WiFi or other
non-mobile routes where a phone number does not exist.
A persistent unique user ID is critical for building strong relationships with consumers, but this ID does not need to
be an MSISDN, even for customer communication and payment.
MSISDN (user phone number) e.g. +16176218989
This is linked to the phone user and their handset SIM. The user knows this number. It is portable across
operators and may move from user to user. There are different levels of trust (claimed by user or verified by
MNO).
The MSISDN is required to send SMS messages outbound (alerts, confirmations, support) and is highly
desirable for telephone customer service. It is frequently used in existing SMS CRM systems as the primary key
but is subject to MNO privacy and disclosure regulations.
Bango User ID (Fingerprint) e.g. 234534298
This is a globally consistent anonymous ID for any web user. It is constructed from other forms of identity and
links to the user's MSISDN, MNO-ID, email and other identities where available. This is persistent across
sessions and channels and is recommended as the globally consistent ID for all users across all websites for
mobile payment and analytics.
MNO network identity (SubNo) e.g. R3d78W3QQ2f5r9
Generated by the mobile network operator (MNO) to provide identity without revealing the consumer's
MSISDN. The format varies by MNO and technology. It can change based on phone settings and may only
have a short lifetime. It may include encoded information and is often used for MNO billing systems.
Email e.g. person@hotmail.com
30-40% of users have an email address, which is the most common login identity used on the internet. It is
chosen by the user and typically does not link to any specific device, although some emails are related to
devices such as iPhone, RIM, Android.


A Bango User ID is simply a positive integer.


Users are split between authenticated users, where a mobile operator SIM, username & password or similar
system can be used to determine identity, and unauthenticated users, which are created dynamically without
user awareness, such as by storing cookies or inferring identity from information available through a browser
(IMEI, serial number etc.). Problems may be caused by using unauthenticated identities for financially sensitive
information. It is possible to differentiate between authenticated and unauthenticated user IDs.
A user can transition from unauthenticated to authenticated at any time. For example, a payment transaction
might authenticate the user's MSISDN, and in this case the user identity in previous transaction records may be
back enriched to replace an old user ID with the new authenticated one.
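
The replacement step can be pictured with the short sketch below. The record layout (dicts with user_id and authenticated fields) and the function name are assumptions made for illustration, not the data warehouse's actual schema.

```python
def back_enrich_user_id(events, old_user_id, authenticated_user_id):
    """Return copies of events with an old unauthenticated ID replaced.

    Events belonging to other users are passed through unchanged, and the
    input list is not modified.
    """
    enriched = []
    for event in events:
        if event["user_id"] == old_user_id:
            # Build a new dict so the original record is left untouched.
            event = {**event, "user_id": authenticated_user_id, "authenticated": True}
        enriched.append(event)
    return enriched
```

Returning new records rather than editing in place makes it easy to compare the data before and after enrichment; a real system might equally update records in place.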

Examples

AT&T USA mobile web users


MSISDN is not available to third parties
Track and build relationships with customers, collect payments and run subscriptions using the Bango
User ID
Provide customer service using Bango User ID to MSISDN
AT&T provide MNO-ID (SubNo) to link legacy SMS users

UK MNO Payforit flows


MSISDN not provided to 3rd parties
Track and build relationships with customers, collect payments and run subscriptions using the Bango
User ID
Query MSISDN for customer service or marketing purposes

Blackberry credit card payments


MSISDN not available via RIM gateways
Track and build relationships with customers, collect payments and run subscriptions using the Bango
User ID

WiFi credit card payments


MSISDN may not exist
Track and build relationships with customers, collect payments and run subscriptions using the Bango
User ID
Bango may link MSISDN to Bango User ID and make available

Appendix B: Overview of Bango Identifier


For more details of the Bango Identifier and Relay services, please log in to the Bango Customer Support Center
online.
http://bango.com/assets/data/support/bango_relay.pdf
http://bango.com/assets/data/support/bango_identifier.pdf

Appendix C: Summary of real-time web services


For details of Bango web service APIs, please log in to the Bango Customer Support Center online.
http://bango.com/assets/data/support/bango_userinformation.pdf
http://bango.com/assets/data/support/bango_analytics_webservices.pdf


Appendix D: Web Analytics Standards


Web analytics terms are defined by the Web Analytics Association, of which Bango is a Premier Corporate
Member. Aspects of the most recent WAA standards document are included below for reference. Please visit their
website at http://www.webanalyticsassociation.org for full details.

Definition Framework Overview


There are two types of Web analytics metrics: counts and ratios.
Count: The most basic unit of measure; a single number, not a ratio. Often a whole number (Visits = 12,398),
but not necessarily (Total Sales = $52,126.37). Some metrics cannot be summed across time and/or within a
report. See metric definitions for specific limitations.
Ratio: A derived metric, obtained by dividing one number by another. The result is usually not a whole number.
Because it's a ratio, "per" is typically in the name, such as "Page Views per Visit". Most ratios used in web
analytics are not summable.
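
A small worked example of why a ratio such as Page Views per Visit is not summable, using made-up daily counts. The correct figure for a longer period is recomputed from the underlying counts rather than obtained by adding the daily ratios.

```python
def page_views_per_visit(page_views, visits):
    """Ratio metric: page views divided by visits."""
    return page_views / visits


# Made-up daily counts, for illustration only.
monday = {"page_views": 300, "visits": 100}
tuesday = {"page_views": 500, "visits": 400}

monday_ratio = page_views_per_visit(**monday)      # 3.0
tuesday_ratio = page_views_per_visit(**tuesday)    # 1.25

# These counts can be summed, so the two-day ratio is rebuilt from the totals.
total_page_views = monday["page_views"] + tuesday["page_views"]
total_visits = monday["visits"] + tuesday["visits"]
two_day_ratio = page_views_per_visit(total_page_views, total_visits)  # 1.6

# Adding the daily ratios gives a different, meaningless figure.
summed_ratios = monday_ratio + tuesday_ratio       # 4.25
```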
Another type of definition is included for terms that describe concepts instead of numbers.
Dimension: A component or category of data. Metrics (counts and ratios) are measured across dimensions.
All metrics can apply to three different universes:
Aggregate: Representative of the entire site.
Segmented: A subset of the site traffic for a defined period of time, filtered in some way to gain greater analytical
insight: e.g., by campaign (e-mail, banner, PPC, affiliate), by visitor type (new vs. returning, repeat buyers, high
value), by referrer.
Individual: Activity of a single Web visitor for a defined period of time.

Assumptions and Qualifications


There are certain statements and qualifications that could be added to every definition and would therefore become
repetitive and redundant. This section states those conditions, which apply to every definition unless explicitly
stated otherwise in the definition or comments.
All measures and metrics assume that they relate to an action by a human visitor. This is implied by the reference
to unique visitor in many of the definitions. The types of non-human visitors include robots, spiders and website
crawlers that periodically scan or methodically download (scrape) content from a website. Many identify
themselves via the user agent in the HTTP request, which allows the website to provide a different version of the
content to aid search engines and content aggregators. However, there are many that do not identify themselves
and can be confused with human traffic. Each web analytics provider has various techniques for identifying and
filtering this traffic.
The definitions in this document assume the provider has successfully extracted the traffic due to actual human
visitor behavior, to the extent possible.

Building Block Terms


Building block terms include four main metrics (Unique Visitors, Visits/Sessions, Page Views, and Events) that
make up the foundation for all web measures. These measures can be used either as a unique value by
themselves or as the denominator within various formulas. The following definitions are provided as infrastructure
on which to build.

Page
Type: Dimension
Calculation: An analyst definable unit of content.

Page View
Type: Count
Calculation: The number of times a page was viewed.


Visits (Sessions)
Type: Count
Calculation: A visit is an interaction, by an individual, with a web site consisting of one or more requests for a
page. If an individual has not taken another action (typically additional page views) on the site within a specified
time period, the visit will terminate by timing out.

Unique Visitors
Type: Count
Calculation: The number of inferred individual people (filtered for spiders and robots), within a designated
reporting timeframe, with activity consisting of one or more visits to a site. Each individual is counted only once in
the unique visitor measure for the reporting period.
Unique visitors are calculated according to Appendix A where the Bango User ID behaves in a way resembling a
server side cookie. This avoids issues with handsets as well as cookie limitations, blocking and deletion.

Event
Type: Dimension and/or count
Calculation: Any logged or recorded action that has a specific date and time assigned to it by either the browser
or server.
Events are covered in the main body of this document and can include any data.

Referrer
Type: Dimension
Calculation: Referrer is a generic term that describes the source of traffic to a page or visit.

Conversion Rate
Type: Ratio
Calculation: The ratio of conversions over a relevant denominator.
