
ANALYTICS: A COMPETITIVE EDGE FOR A RETAIL PORTAL1

Laxmi Harikumar, Indian Institute of Management Bangalore, India


Vishnuprasad Nagadevara, Indian Institute of Management Bangalore, India

ABSTRACT

In India, the annual retail market is estimated to be between $450 billion and $500 billion (Rumman, 2011).
The Indian government is currently examining an FDI proposal that would allow multi-brand retailers to own
a stake of up to 51% in joint ventures with Indian partners, as is already the case for single-brand retailers
like Marks & Spencer Group PLC and Nike Inc. Amid such intense competition, analytics can be
a major differentiator for companies. In this study we performed Market Basket Analysis for an
Indian retail e-commerce portal and developed a model for assessing the analytical maturity of the
company. Based on the study, the paper gives recommendations to the e-commerce portal and suggests
how the company can embed analytics in its DNA.

Keywords: Market Basket Analysis, Retail Analytics, Web trend analysis, Analytics maturity assessment

1. INTRODUCTION

Whether in the online e-commerce segment or the brick-and-mortar segment, most companies within the
same retail segment offer similar products, use similar IT tools and infrastructure, and adopt identical, proven
high-performance business models. In such an era of high competition and few sustainable
differentiators, analytics can be a major success factor. Growing computing power, cheaper data
storage and sophisticated data processing tools (such as SAS and SPSS) create the right conditions for
business analytics. This study looks at the application of advanced analytics techniques to an online retail
start-up, “ERetail”.

2. LITERATURE REVIEW

Today, organizations are attempting to capitalize on information and apply analytics to achieve competitive
advantage. Netflix is one of the best examples of companies that owe a large part of their success to
analytics (Davenport and Harris, 2007). The company employs a movie recommendation engine,
“Cinematch”, based on a proprietary algorithm, to personalize each customer’s web page according to
preferences, movie ratings and so on. Netflix uses data to make decisions that moguls make by gut. The
average user rates more than 200 films and Netflix crunches consumers’ rental history and film ratings to
predict what they will like (Davenport and Harris, 2007).

Kolyshkina and Simoff (2007) provide an approach designed to allow optimal utilization of
analytics in an industry setting. They focus on the key stages of the analytics process and also
present various case studies to identify factors responsible for the success or failure of analytics projects.
The report by the Sloan Management Review and the IBM Institute for Business Value (2010) found that
top-performing companies are three times more likely than lower performers to be sophisticated
users of analytics. These top-performing companies also believe that their use of analytics is a competitive
differentiator.

There are multiple examples of retail companies that have seamlessly integrated advanced analytics into
the soul of their businesses and reaped significant returns on their investments. The best example is the
retail giant – Wal-Mart. Wal-Mart collects more data about its products and shoppers' purchasing habits
than any other retailer in order to increase operational efficiency and maximize product sales. Wal-Mart
used predictive modeling on its data to determine the top-selling items in its Florida stores before
hurricanes. Analysis revealed that the top-selling items stretched beyond expected items like flashlights and
water: beer was the top-selling pre-hurricane item, while sales of Strawberry Pop-Tarts increased sevenfold
during these periods (Dey, 2005).

1 Journal of Academy of Business and Economics, Vol 12, No. 1, 2012, pp 43-48

3. OBJECTIVES

The objectives of this study are:


- Identify the trends in website traffic and study how these trends change in relation to
marketing programs and discount cycles
- Explore the application of Market Basket Analysis for an e-commerce portal
- Assess the analytical maturity of the company

4. METHODOLOGY

Data was obtained from an online retail company called ERetail, whose primary focus was to increase website
traffic and convert that traffic into business. Initially, the items retailed by ERetail were grouped into
major categories, which were further subdivided into subcategories. The data was analyzed to
understand the relationship between daily website traffic trends and triggers, and the category-wise sales and
revenue distribution. Market Basket Analysis was carried out to identify purchasing patterns and
bundling possibilities in order to improve sales. Finally, ERetail’s stage of
analytical maturity was identified based on a questionnaire developed for the purpose, and appropriate
recommendations were made for the company to move ahead.

5. PREPROCESSING OF DATA

Most of the transaction (sales) data at ERetail is organized for efficient order tracking and monitoring.
Most details of the products sold in a transaction are combined into a single “Product Details” column.
Hence the data needed major restructuring. This column had to be split into 8 separate
columns, namely Product ID, Product Qty, Product SKU, Product Name, Product Weight, Product Variation
Details, Product Unit Price and Product Total Price. Another essential step was to convert the data
from a transaction format (where each record was a complete transaction) to an itemized format
(where each record was an item sold). This resulted in multiple rows per transaction when multiple items
were purchased as part of the same transaction.
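
The restructuring described above can be illustrated with a minimal sketch. The paper does not give the actual layout of the “Product Details” column, so the separators used here (";" between items, "|" between the 8 fields), the field names and the sample record are all assumptions made purely for illustration.

```python
# Hypothetical layout: the real "Product Details" format is not given in the paper.
# Assumed here: items separated by ";" and the 8 fields of an item separated by "|".
FIELDS = [
    "product_id", "product_qty", "product_sku", "product_name",
    "product_weight", "product_variation", "product_unit_price",
    "product_total_price",
]

def itemize(transactions):
    """Turn one-record-per-transaction rows into one-record-per-item rows."""
    items = []
    for txn in transactions:
        for chunk in txn["product_details"].split(";"):
            values = [v.strip() for v in chunk.split("|")]
            if len(values) != len(FIELDS):
                raise ValueError(f"unexpected item layout: {chunk!r}")
            row = dict(zip(FIELDS, values))
            row["order_id"] = txn["order_id"]  # keep the link to the transaction
            row["product_qty"] = int(row["product_qty"])
            row["product_unit_price"] = float(row["product_unit_price"])
            row["product_total_price"] = float(row["product_total_price"])
            items.append(row)
    return items

# Made-up sample transaction containing two items
sample = [{
    "order_id": "O-1",
    "product_details": "101|1|SKU-A|Pen Drive|0.02|8GB|450.0|450.0;"
                       "102|2|SKU-B|Pouch|0.05|Black|100.0|200.0",
}]
rows = itemize(sample)
```

Each itemized row keeps the order identifier, so the original transaction can still be recovered by grouping on it.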

ERetail handles thousands of Stock Keeping Units (SKUs). However, to understand purchasing
behavior as well as inherent purchase patterns, it was essential to abstract the SKUs to broader categories.
Inferences made at a category level are much simpler to implement and execute. Hence we classified
the SKUs into high-level categories (Apparel, Books, CD, Confectionaries, Electronics, Gift Items and Toys)
and subcategories. For example, the subcategories for electronics are “Mobile phone”, “Mobile phone
accessories”, “Multimedia”, “Phone”, “Computer Accessories” and “Household Items”.

6. DATA ANALYSIS

The initial data analysis was meant to build a better understanding of the data. It covered:

- Relationship between daily website traffic trends and triggers
- Category-wise sales and revenue distribution

6.1. Relationship between daily website traffic trends and triggers


We analyzed the volume of traffic and looked for emerging trends within the traffic patterns. The key focus
was to identify major spikes and troughs as well as the overall trend over a period of one year. The traffic
patterns, presented in Figure 1, showed an increasing trend. Although traffic growth was not
significant for most of early 2010, some intermittent spikes stood out as interesting prospects
for further analysis, and there was a sharp increase in traffic from the start of 2011. As can be seen, there were
8 major spikes: on 3 May 2010, 22 June 2010, 28 October 2010, 10 February 2011, 5 March 2011, 23 March
2011, 31 March 2011 and 12 April 2011. On further investigation we found that these spikes coincided with
various marketing campaigns run during these periods. From the spikes and campaign details
we arrived at the following conclusions:

1. Campaigns run recently (February 2011 onwards) were far more effective in increasing traffic
than the earlier campaigns.

2. The majority of these effective campaigns were email campaigns executed through partner sites.
These are turning out to be excellent “crowd pullers”.

FIGURE-1 NUMBER OF VISITS PER DAY TO THE WEBSITE
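
The paper does not state how the spikes in the daily traffic series were picked out. As one possible approach, the sketch below flags a day as a spike when its visits exceed a multiple of the median of the preceding days. The window length, the threshold factor and the sample series are assumptions for illustration, not values from the study.

```python
from statistics import median

def find_spikes(daily_visits, window=7, factor=3.0):
    """Return indices of days whose visits exceed `factor` times the
    median of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_visits)):
        baseline = median(daily_visits[i - window:i])
        if baseline > 0 and daily_visits[i] > factor * baseline:
            spikes.append(i)
    return spikes

# Made-up series: flat traffic with one campaign-like jump on day index 10
visits = [100, 110, 95, 105, 98, 102, 107, 99, 101, 104, 900, 120, 115]
spike_days = find_spikes(visits)
```

A trailing median is used rather than a mean so that one earlier spike does not inflate the baseline for the days that follow it. The flagged dates can then be compared against the campaign calendar.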

After identifying the reasons for the spikes, we wanted to analyze their impact on the portal’s popularity. We did
this by analyzing the change in the composition of direct hits vs. referral hits over time. The synopsis
of this comparison is shown in Table 1. It can be seen from Table 1 that the share of indirect (referral)
visits fell from 61.44% in January 2011 to 54.41% in March 2011, while the share of direct visits rose.

TABLE-1 CHANGES IN DIRECT HITS VS. REFERRAL HITS OVER THE THREE MONTH PERIOD

Sources               January 2011        February 2011       March 2011
                      Visits  % Visits    Visits  % Visits    Visits  % Visits
Direct                    91    38.56%    13,356    48.39%    16,368    45.59%
Indirect (referral)      145    61.44%    14,245    51.61%    19,535    54.41%
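
The percentage columns of Table 1 follow directly from the visit counts. The short sketch below recomputes them. The counts are copied from the table, while the dictionary layout and function name are only illustrative.

```python
# Visit counts copied from Table 1
visits = {
    "January 2011": {"direct": 91, "referral": 145},
    "February 2011": {"direct": 13356, "referral": 14245},
    "March 2011": {"direct": 16368, "referral": 19535},
}

def source_shares(counts):
    """Percentage share of each traffic source, rounded to two decimals."""
    total = sum(counts.values())
    return {source: round(100 * n / total, 2) for source, n in counts.items()}

shares = {month: source_shares(c) for month, c in visits.items()}
for month, s in shares.items():
    print(month, s)
```

Note that the January 2011 shares rest on only 236 visits in total, so they are far less stable than the February and March figures.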

6.2. Category wise sales and revenue distribution

Category-wise sales and revenue distribution is presented in Table 2. The average quantity purchased per
transaction is approximately one. It can be seen from Table 2 that people usually buy a
single item in most categories. This leads to two possibilities in terms of bundling: accept single-item
purchases for these categories and explore cross-category bundles using Market Basket Analysis (MBA),
or look at increasing sales and revenue within a category by building
and offering attractive bundles within that category. Next we explore how Market Basket Analysis
can be used to form effective bundles.

TABLE-2 CATEGORY WISE SALES AND REVENUE DISTRIBUTION

Category          Number of      Items   Average items     Average revenue    Average revenue
                  transactions   sold    per transaction   per transaction    per item
Apparel                     60      60              1.00            488.02             488.02
Book                       309     314              1.02            254.78             250.72
CD                          28      28              1.00           1736.07            1736.07
Confectionaries             41      50              1.22            601.45             493.19
Electronics                250     255              1.02           2335.02            2289.24
Gift Items                 249     306              1.23           2179.30            1773.35
Toys                        17      17              1.00            547.88             547.88
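
Assuming the first revenue column of Table 2 is the average revenue per transaction, the items-per-transaction and revenue-per-item columns can be recomputed from the counts. The sketch below does this. The figures are copied from the table, and the function and variable names are illustrative.

```python
# (transactions, items sold, average revenue per transaction) copied from Table 2
table2 = {
    "Apparel": (60, 60, 488.02),
    "Book": (309, 314, 254.78),
    "CD": (28, 28, 1736.07),
    "Confectionaries": (41, 50, 601.45),
    "Electronics": (250, 255, 2335.02),
    "Gift Items": (249, 306, 2179.30),
    "Toys": (17, 17, 547.88),
}

def derived_columns(transactions, items, revenue_per_transaction):
    """Return (items per transaction, revenue per item), rounded to 2 decimals."""
    items_per_transaction = items / transactions
    # Total revenue is revenue per transaction times the number of transactions.
    revenue_per_item = revenue_per_transaction * transactions / items
    return round(items_per_transaction, 2), round(revenue_per_item, 2)

derived = {cat: derived_columns(*row) for cat, row in table2.items()}
```

Under this reading, the recomputed values match the last column of the table for every category, including those where more than one item is bought per transaction.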

7. MARKET BASKET ANALYSIS

“Market Basket Analysis (MBA) uses the information about what customers purchase to provide an insight
into who they are and why they make certain purchases” (Berry and Linoff, 2004). By analyzing the products
that are purchased together, the store can decide on the store layout and on which products to bundle together.
The market basket analysis was performed in two stages: the first on product category and the
second on customer history.

- Product Category MBA: To find bundles within the same category (i.e. which electronic goods sell in
bundles). Order ID was the key for identifying the bundles here. Table 3 lists the possible bundles
in the electronics category with the corresponding support and confidence levels.

TABLE-3 PRODUCT CATEGORY MBA: CATEGORY – ELECTRONICS

Keys       Basis          Consequent                  Antecedent                    Support %   Confidence %
Order ID   Product Name   Mobile phone bike charger   Mobile phone magnetic pouch       1.667           25.0
                          Mobile phone ROTO charger   Mobile phone magnetic pouch       1.667           25.0
                          Multimedia                  IPod                                2.0           50.0

- Customer History MBA: We extend the concept of a basket to cover the entire purchase history
of each customer (i.e. all the electronics items that the customer has purchased to date form one single
basket). Identifying bundles within such a basket helps us to discover buying patterns across
transactions. The results of customer history MBA are presented in Table 4.

TABLE-4 CUSTOMER HISTORY MBA: CATEGORY - ELECTRONICS

Keys          Basis          Consequent                  Antecedent                               Support %   Confidence %
Customer ID   Product Name   SanDisk 8GB USB Pen Drive   Toshiba 500GB 2.5” Portable Hard Drive         1.0           50.0
                             Mobile Phone ROTO charger   Samsung B5310 CorbyPRO                         1.0           50.0
Customer ID   Subcategory    Computer Accessories        Household Appliance                            1.0          100.0

Phone

Multimedia IPod 2.0 50.0
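
The two MBA stages differ only in the key used to form a basket: the order for the product category MBA and the customer for the customer history MBA. The sketch below illustrates this on made-up data. The field names (`order_id`, `customer_id`, `product`) and the sample rows are hypothetical. Support is taken here as the share of baskets containing both items and confidence as the share of antecedent baskets that also contain the consequent. Some tools instead report support for the antecedent alone, and the paper does not say which convention its tables follow.

```python
from collections import defaultdict
from itertools import permutations

def build_baskets(rows, key):
    """Group product names into one basket (a set) per value of `key`."""
    baskets = defaultdict(set)
    for row in rows:
        baskets[row[key]].add(row["product"])
    return list(baskets.values())

def pair_rules(baskets):
    """Support and confidence (in %) for every rule antecedent -> consequent."""
    n = len(baskets)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for basket in baskets:
        for item in basket:
            item_count[item] += 1
        for antecedent, consequent in permutations(sorted(basket), 2):
            pair_count[(antecedent, consequent)] += 1
    return {
        (a, c): {
            "support_pct": 100 * both / n,
            "confidence_pct": 100 * both / item_count[a],
        }
        for (a, c), both in pair_count.items()
    }

# Made-up itemized rows; product names are illustrative only
rows = [
    {"order_id": 1, "customer_id": "A", "product": "pouch"},
    {"order_id": 1, "customer_id": "A", "product": "charger"},
    {"order_id": 2, "customer_id": "A", "product": "pen drive"},
    {"order_id": 3, "customer_id": "B", "product": "pouch"},
    {"order_id": 4, "customer_id": "B", "product": "hard drive"},
]
order_rules = pair_rules(build_baskets(rows, "order_id"))
customer_rules = pair_rules(build_baskets(rows, "customer_id"))
```

In this toy data, "pouch" and "hard drive" never share an order but do share a customer, so the rule appears only in `customer_rules`. This is the kind of cross-transaction pattern the customer history MBA is meant to surface.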

8. ANALYZING THE ANALYTICAL MATURITY OF THE COMPANY

Table 5 lists typical characteristics of organizations at various stages of analytics maturity. We
assessed ERetail’s current state and mapped it to the stages in Table 5, based on a questionnaire that we prepared
and administered. The table is based on concepts in Davenport and Harris (2007).

TABLE-5 CHARACTERISTICS OF ORGANIZATIONS DETERMINING ANALYTICS MATURITY

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5


Limited data Non integrated Beginning to Well integrated Continuous
sources. data sources integrate data and orchestrated integration of new
sources data sources data sources
Reactive use of Some inclusion of Full leverage of
analytics external data external data
sources sources
Historical data Static and point in
primarily time
Data not well Data inadequately Data refreshed Data refreshed
maintained maintained periodically continually
Localized Departmental Data mining and Predictive Predictive analytical
modeling take analytical models models are adaptive
place continually used
Transactional Begin to leverage Leverage of Full leverage of
focused other data attitudinal and all types of data
sources such as unstructured data
attitudinal data
Knowledge not Some processes Formal process to Ongoing innovative
shared exist but not capture analytical analytical thinking
formalized ideas exists
No repeatable Analytics reports Analytical reports Predictive
process do not impact impacts decision analytics drives
operational making decisions e.g.
processes. forecasting
Descriptive Statistical More advanced Predictive Real time predictive
analytics only correlations analytics analytic models analytics impact
analytics done on are linked to business decisions
limited capacity metrics
Lack of Analytical life Formalized Analytics Automated end to
analytical cycle analytical lifecycle embedded and end self learning
lifecycle management not management automated within and adaptive
management formalized the business business process
process
Lack of holistic Ad hoc analytical Data architecture Adaptive data
data architecture data architecture in place architecture
support
Lack of holistic Ad-hoc analytical Scalable and Analytical data Analytical data
technical technical Resilient model part of architecture is part
architecture architecture technical enterprise data of enterprise data
support architecture model architecture

- The company currently uses its own database (DB) and the DBs of partner sites to capture and analyze
basic transactional data
- Analytics happens sporadically, based on individual initiatives
- Currently analytics is focused on sales transactions only; it does not factor in attitudinal or demographic data
- Current focus is on descriptive statistics like averages and growth rates

We have highlighted appropriate cells in the above table based on our understanding of ERetail’s analytical
capabilities. Based on this, we placed ERetail between Stage 1 and Stage 2.

9. SUMMARY AND CONCLUSIONS

Our three high-level recommendations to ERetail for moving up from Stage 2 towards Stage 5 are:

- Refine Data Models


o The company needs to collect more demographic data for customer segmentation.
o The data is currently organized mainly for order processing. With some database
redesign, the company can capture data for analytics.
o The website should encourage/incentivize registration with demographic details while
shopping.
- Leverage analytics to understand consumers
o The purchasing behavior captured about a user can be used to create a personalized home
page for that user (displaying the latest offerings in the category from which the user purchases
frequently).
o Explore the possibility of capturing “clickstream” data without significant impact on response
time/website performance. This can be a major source of insights into customer behavior.
- Integrate analytics into all business units/processes
o Manage the company inventory based on demand data, which can be generated from data
models.
o Leverage trend/time series analytics to identify future growth prospects early.
o Apply analytical techniques like market basket analysis, clustering/segmentation
and classification to design effective, targeted promotions and campaigns.
o Devise a phase-wise roadmap to integrate all business processes with analytics at the core.
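
As an illustration of the personalized home page recommendation, the sketch below finds the category each customer buys from most often, which could then drive the offerings shown to that customer. It is a simple frequency count, not a method used in the study, and the customer identifiers and purchase history are made up.

```python
from collections import Counter, defaultdict

def favourite_categories(purchases):
    """Map each customer to the category they bought from most often."""
    counts = defaultdict(Counter)
    for customer_id, category in purchases:
        counts[customer_id][category] += 1
    return {cust: c.most_common(1)[0][0] for cust, c in counts.items()}

# Made-up purchase history as (customer id, category) pairs
history = [
    ("C1", "Book"), ("C1", "Book"), ("C1", "Electronics"),
    ("C2", "Gift Items"),
]
favourites = favourite_categories(history)
```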

10. REFERENCES AND BIBLIOGRAPHY

Ahmed, Rumman, Is FDI the Magic Wand for Indian Retail?, The Wall Street Journal, September 2011,
http://blogs.wsj.com/indiarealtime/2011/09/22/is-fdi-the-magic-wand-for-indian-retail/

Berry, Michael J A and Linoff, Gordon S., Data Mining Techniques, John Wiley & Sons, 2004

Davenport, Thomas H and Harris, Jeanne G., Competing on Analytics: The New Science of Winning,
Harvard Business School Press, 2007

Dey, Arnab, Optimize Revenue through Life-Cycle Analytics, Destination CRM.com, July 2005,
http://www.destinationcrm.com/Articles/Web-Exclusives/Viewpoints/Optimize-Revenue-Through-Life-Cycle-Analytics-44796.aspx

Kolyshkina, I and Simoff, Simeon, “Customer Analytics Projects: Addressing Existing Problems with a
Process that Leads to Success”, 6th Australasian Data Mining Conference (AusDM2007), Gold Coast,
Australia, 3-4 December 2007

Sloan Management Review and IBM Institute for Business Value, “Analytics, the new path to value”,
MIT Sloan Management Review, 2010,
http://c0004013.cdn2.cloudfiles.rackspacecloud.com/MIT-SMR-IBM-Analytics-The-New-Path-to-Value-Fall-2010.pdf

11. AUTHOR PROFILE

Dr. Vishnuprasad Nagadevara earned his Ph.D. at Iowa State University, USA in 1974. Currently he is
a Professor of Quantitative Methods and Information Systems at the Indian Institute of Management –
Bangalore. His current research interests are Data Mining, Operations Research Applications and
Quantitative Techniques.

Ms Laxmi Harikumar is a student of the Post Graduate Programme in Software Enterprise Management at the
Indian Institute of Management – Bangalore. Her areas of interest are retail analytics and financial time
series analysis.
