Gediminas Adomavicius
Information and Decision Sciences
Carlson School of Management
University of Minnesota
gedas@umn.edu
Alexander Tuzhilin
Information, Operation and Management Sciences
Stern School of Business
New York University
atuzhili@stern.nyu.edu
1. Introduction
Over the past several years, there has been much work done in personalization focusing on the
development of new technologies, understanding personalization from the business point of
view, and developing novel personalization applications [CACM00].
Since personalization
constitutes a young and rapidly developing field, academics and practitioners still hold different
points of view on what personalization is. In this article we synthesize these
various points of view and describe personalization technologies from a process-oriented
perspective.
Several attempts have been made to define personalization by industry practitioners and
academic researchers. Some of the representative definitions include:
Personalization is the ability to provide content and services that are tailored to individuals based
on knowledge about their preferences and behavior [Smart Personalization, Forrester Report by
Paul Hagen, 1999].
Personalization is the combined use of technology and customer information to tailor electronic
commerce interactions between a business and each individual customer. Using information either
previously obtained or provided in real-time about the customer and other customers, the exchange
between the parties is altered to fit that customer's stated needs so that the transaction requires less
time and delivers a product best suited to that customer [www.personalization.com].
These definitions cover various aspects of personalization, and several important features of
personalization emerge from them.
This type of
[Figure: three panels, each relating Consumers and Providers: (a) Provider-centric, (b) Consumer-centric, (c) Market-centric]
product and service recommendations, e.g., for books, CDs and vacations;
personalized email;
Personalization Goals. The personalization objectives usually are multifaceted. They may
range from simply improving the consumer's browsing and shopping experience (e.g., by
presenting only the content that is relevant to the consumer) to much more complex objectives,
such as building long-term relationships with consumers, improving consumer loyalty, and
generating a measurable value for the company. Currently, the most commonly used metrics are
accuracy metrics that measure how the consumer liked a specific personalized offering, e.g.,
how accurate the recommendation was [BS97, Paz99]. Although important, accuracy metrics
are fairly simplistic and do not reflect the bigger picture, i.e., to what extent the more complex
personalization objectives have been met.
sophisticated metrics, such as the consumer lifetime value, consumer loyalty value, purchasing
and consumption experiences, and other ROI-based metrics [CS00, RNE+02].
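As an illustration of the accuracy metrics mentioned above, the following minimal sketch computes the mean absolute error between predicted and observed ratings. The choice of this particular metric, the 1-5 rating scale, and the sample values are assumptions made for the example; the text does not prescribe a specific accuracy measure.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute gap between predicted and observed ratings."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical ratings on a 1-5 scale for four recommended items.
predicted_ratings = [4.0, 3.5, 5.0, 2.0]
actual_ratings = [5.0, 3.5, 4.0, 1.0]
mae = mean_absolute_error(predicted_ratings, actual_ratings)  # 0.75
```

A low error says only that individual recommendations were close to the consumer's ratings; it says nothing about loyalty or lifetime value, which is the limitation the paragraph above points out.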
Consumer Knowledge.
knowledge about personal preferences and behavior of the consumers that is usually distilled
from the large volumes of granular information about the consumers and stored in consumer
profiles [Paz99, AT01]. We will cover this topic in Section 3.
The definitions listed above collectively cover several major points about personalization.
However, the point that personalization constitutes an iterative process that takes place over time
has not been sufficiently addressed before, and we describe it in the next section.
2. Personalization Process
Personalization constitutes an iterative process that can be defined by the Understand-Deliver-Measure cycle taking place in time and consisting of the following stages shown in Fig. 2:
Deliver personalized offering based on the knowledge about each consumer, as stored in
the consumer profiles. The personalization engine must be able to find the most relevant
offerings and deliver them to the consumer.
Measure personalization impact by determining how much the consumer is satisfied with
the delivered personalized offerings.
understanding about consumers or point out the deficiencies of the methods for
personalized delivery. Therefore, this additional information serves as feedback for
possible improvements to each of the other components of the personalization process. This
feedback information completes one cycle of the personalization process, and sets the
stage for the next cycle where improved personalization techniques can make better
personalization decisions.
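The iterative nature of the cycle can be sketched as a small simulation. Everything in the sketch is a simplifying assumption introduced for illustration: the profile is a per-category score, the consumer response is simulated from a fixed set of liked categories, and the function names understand, deliver, and measure are hypothetical labels for the three stages.

```python
from typing import Dict, List, Set, Tuple

History = List[Tuple[str, bool]]  # (offered category, accepted?)


def understand(history: History) -> Dict[str, int]:
    """Toy profile: +1 per accepted offering, -1 per rejected one, by category."""
    profile: Dict[str, int] = {}
    for category, accepted in history:
        profile[category] = profile.get(category, 0) + (1 if accepted else -1)
    return profile


def deliver(profile: Dict[str, int], catalog: List[str]) -> str:
    """Offer the best-scoring category; ties go to the earlier catalog entry."""
    return max(catalog, key=lambda category: profile.get(category, 0))


def measure(history: History) -> float:
    """Satisfaction proxy: share of delivered offerings accepted so far."""
    return sum(accepted for _, accepted in history) / len(history)


def run_cycles(catalog: List[str], likes: Set[str], n_cycles: int) -> Tuple[History, List[float]]:
    history: History = []
    satisfaction: List[float] = []
    for _ in range(n_cycles):
        profile = understand(history)                   # Understand
        offering = deliver(profile, catalog)            # Deliver
        history.append((offering, offering in likes))   # simulated consumer response
        satisfaction.append(measure(history))           # Measure; feeds the next cycle
    return history, satisfaction


history, satisfaction = run_cycles(["books", "music", "travel"], {"music"}, 3)
```

In this toy run the first offering is rejected, the rejection lowers that category's score, and the next cycles deliver the category the simulated consumer accepts, so the measured satisfaction rises from one cycle to the next.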
The technical implementation of the Understand-Deliver-Measure cycle consists of the six
stages presented in Fig. 2 and described below.
Data collection.
channels of interaction between consumers and providers (e.g., Web, phone, direct mail, and
other channels) and from various other heterogeneous data sources. Such data can be solicited
explicitly (e.g., via surveys) or tracked implicitly and may include histories of consumers'
purchasing and searching activities, as well as demographic and psychographic information. The
objective is to obtain the most comprehensive picture of a consumer. After the data is
collected, it is usually processed, cleaned, and stored in a consumer-oriented data warehouse.
Building consumer profiles.
constructing accurate and comprehensive consumer profiles based on the collected data. We
discuss the techniques for building consumer profiles in more detail in Section 3.
Matchmaking.
individual consumers. There are many matchmaking technologies including user-specified rule-based content delivery systems, e.g., as deployed by BroadVision [http://www.broadvision.com]
and ATG [http://www.atg.com], statistics-based predictive approaches, and recommender
systems. However, in this paper we will focus on recommender systems technologies because of
the space limitation and because they represent the most developed matchmaking technologies
applicable to various types of personalized offerings. We will describe them further in Section 3.
[Fig. 2. Personalization process. Recovered labels: Understand the Consumer; Data Collection; Matchmaking; Deliver Personalized Offering; Measure Impact of Personalization; Feedback loop]
consumer is getting frustrated with the personalization system and stops using it. The de-personalization effect is largely responsible for failures of some of the personalization projects
reported in the literature [PR01]. Therefore, one of the main challenges of personalization is the
ability to achieve the virtuous cycle of personalization and not fall into the de-personalization
trap. From the algorithmic sophistication perspective, the technologies that contribute the most
to this goal are the profiling and the matchmaking technologies. We describe them and their
interaction in the next section.
information may include consumers' demographics, such as name, gender, date of birth and
address, or be derived from the past transactions of a consumer, such as the largest purchase
value made at a Web site. This factual profile information is usually defined as a record of
values and stored in a relational database, one record per consumer.
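A factual profile of this kind might be represented as a simple record type. The sketch below is only illustrative: the class name, the field names, and the sample values are assumptions, and an in-memory object stands in for a row of the relational table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass
class FactualProfile:
    """One record per consumer; field names are illustrative assumptions."""
    name: str
    gender: str
    date_of_birth: date
    address: str
    largest_purchase: float = 0.0  # derived from past transactions


def update_largest_purchase(profile: FactualProfile, purchase_values: Iterable[float]) -> FactualProfile:
    """Refresh the derived field from a batch of past purchase values."""
    profile.largest_purchase = max([profile.largest_purchase, *purchase_values])
    return profile


# Hypothetical consumer and purchase history.
profile = FactualProfile("John Doe", "M", date(1970, 1, 1), "1 Main St")
update_largest_purchase(profile, [19.99, 120.50, 45.00])
```

The demographic fields are stored as given, while largest_purchase shows a value derived from the transaction history rather than supplied by the consumer.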
In addition to the factual information, [AT01] considers profiles that capture more complex
behavioral information of consumers. These profiles are modeled using such techniques as
Conjunctive rules. For example, the rule "John Doe prefers to see action movies on
weekends" (i.e., Name = "John Doe" & MovieType = "action" ⇒ TimeOfWeek =
"weekend") can be a part of John Doe's profile that describes his movie viewing habits
[AT01]. Such rules can be learned from the transactional history of the consumer using
various data mining techniques, including association and classification rule discovery
methods [HMS01].
Sequences, such as sequences of Web browsing activities. For example, we may want to
store in John Doe's profile his typical browsing sequence: "when John Doe visits the book
Web site XYZ, he usually first accesses the home page, then goes to the
Home&Gardening section of the site, then browses the Gardening section and then
leaves the Web site" (i.e., XYZ: StartPage → Home&Gardening → Gardening → Exit).
Such sequences can be learned from the transactional histories of consumers using
frequent episodes and other sequence learning methods [HMS01].
Signatures, i.e., the data structures that are used to capture the evolving behavior learned
from large data streams of simple transactions [CFP+00]. For example, top 5 most
frequently browsed product categories over the last 30 days would be an example of a
signature that could be stored in individual consumer profiles in a Web store application.
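To make the conjunctive-rule representation from the list above more concrete, the following sketch checks how well the John Doe movie rule is backed by a transaction history by computing its support and confidence, the two measures commonly used in association rule discovery. The transaction layout, the attribute names, and the sample records are assumptions made for the example, not data from [AT01].

```python
from typing import Dict, List, Tuple

Transaction = Dict[str, str]


def matches(transaction: Transaction, conditions: Dict[str, str]) -> bool:
    """True if the transaction satisfies every attribute = value condition."""
    return all(transaction.get(attr) == value for attr, value in conditions.items())


def rule_stats(transactions: List[Transaction],
               antecedent: Dict[str, str],
               consequent: Dict[str, str]) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => consequent."""
    n_antecedent = sum(matches(t, antecedent) for t in transactions)
    n_both = sum(matches(t, antecedent) and matches(t, consequent) for t in transactions)
    support = n_both / len(transactions) if transactions else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical viewing history.
transactions = [
    {"Name": "John Doe", "MovieType": "action", "TimeOfWeek": "weekend"},
    {"Name": "John Doe", "MovieType": "action", "TimeOfWeek": "weekend"},
    {"Name": "John Doe", "MovieType": "action", "TimeOfWeek": "weekday"},
    {"Name": "John Doe", "MovieType": "comedy", "TimeOfWeek": "weekday"},
    {"Name": "Jane Roe", "MovieType": "action", "TimeOfWeek": "weekday"},
]
support, confidence = rule_stats(
    transactions,
    antecedent={"Name": "John Doe", "MovieType": "action"},
    consequent={"TimeOfWeek": "weekend"},
)
```

Here two of John Doe's three action-movie viewings fall on a weekend, so the rule holds with confidence 2/3 and support 2/5; a rule discovery method would typically keep only rules above chosen thresholds for both.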
In summary, all profiling approaches can be classified into simple ones, which support unstructured
factual information about the consumers (e.g., demographic information), and advanced ones, which
support the behavioral information about consumers expressed in the form of rules, sequences,
signatures, and other knowledge representation methods.
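The signature example given earlier, the most frequently browsed product categories over the last 30 days, can be sketched as follows. This version recomputes the signature from raw events for clarity, which is an assumption of the example; signatures in the sense of [CFP+00] are meant to be maintained over large data streams. The function name, event layout, and sample dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def top_categories_signature(events: Iterable[Tuple[date, str]],
                             today: date,
                             window_days: int = 30,
                             k: int = 5) -> List[str]:
    """Return the k most frequently browsed categories within the window."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(category for day, category in events if cutoff < day <= today)
    return [category for category, _ in counts.most_common(k)]


# Hypothetical browsing events for one consumer.
today = date(2003, 6, 30)
events = (
    [(date(2003, 6, 20), "gardening")] * 3
    + [(date(2003, 6, 25), "books")] * 2
    + [(date(2003, 6, 28), "music")]
    + [(date(2003, 4, 1), "travel")] * 5   # outside the 30-day window
)
signature = top_categories_signature(events, today, k=2)
```

The older travel events are ignored even though they are the most numerous, which is what lets the signature follow the consumer's evolving behavior.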
Besides profiling, delivering targeted content and services for the consumers is another crucial
aspect of personalization that depends significantly on the quality of the underlying
matchmaking technologies. There has been much research done on this subject, including rule-based matchmaking, statistics-based predictive approaches, and recommender systems.
However, as was explained in Section 2, in this paper we will focus on recommender systems-based matchmaking techniques.
In the context of recommender systems, matchmaking technologies are often classified into
broad categories according to their recommendation approach as well as their algorithmic
technique as described below.
Classification based on the recommendation approach [BS97]:
Hybrid approaches: these methods combine collaborative and content-based methods. This
combination can be done in many different ways, e.g., separate content-based and
collaborative systems are implemented and their results are combined to produce the final
recommendations.
Model-based techniques use the previous transactions to learn a model (usually using some
machine learning or statistical learning technique), which is then used to make
recommendations. For example, based on the movies that I have seen, a probabilistic model
is built to estimate how likely I am to like each of the unseen movies.
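A very small instance of the model-based idea is sketched below: from the movies a consumer has already seen, it estimates the probability of liking a movie given its genre, using additive smoothing, and then scores unseen movies. The choice of a per-genre probability model, the smoothing constant alpha, the fallback prior, and the movie titles are all assumptions made for the example; practical systems learn much richer models.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def fit_like_model(seen: Iterable[Tuple[str, bool]], alpha: float = 1.0) -> Dict[str, float]:
    """Estimate P(like | genre) from (genre, liked) pairs with additive smoothing."""
    liked: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for genre, did_like in seen:
        total[genre] += 1
        liked[genre] += int(did_like)
    return {genre: (liked[genre] + alpha) / (total[genre] + 2 * alpha) for genre in total}


def predict_like(model: Dict[str, float], genre: str, prior: float = 0.5) -> float:
    """Probability of liking a movie of this genre; unseen genres get the prior."""
    return model.get(genre, prior)


# Hypothetical viewing history: (genre, liked?).
seen_movies = [("action", True), ("action", True), ("action", False), ("comedy", False)]
model = fit_like_model(seen_movies)

# Hypothetical unseen movies and their genres.
unseen = {"Movie A": "action", "Movie B": "comedy", "Movie C": "drama"}
scores = {title: predict_like(model, genre) for title, genre in unseen.items()}
```

The model is learned once from past transactions and then reused for every unseen movie, which is the distinction from techniques that compute each recommendation directly from the raw transaction data.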
As we have done with profiling techniques, we classify various matchmaking methods into
simple and advanced. Existing empirical research suggests that hybrid approaches outperform
the pure content-based and pure collaborative approaches [BS97, Paz99] and that model-based
techniques outperform heuristic-based ones in terms of recommendation accuracy [BHK98].
Based on these results, we classify content-based and collaborative approaches that use heuristic
algorithms as simple, and hybrid heuristics and model-based matchmaking techniques as
advanced.
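One of the simplest ways to build a hybrid, in which separate content-based and collaborative systems are run and their results are combined, is a weighted sum of the two score lists. The sketch below assumes both systems already produce scores on a comparable 0-1 scale; the weight, the item names, and the scores are hypothetical.

```python
from typing import Dict, List


def combine_scores(content_scores: Dict[str, float],
                   collaborative_scores: Dict[str, float],
                   weight: float = 0.5) -> Dict[str, float]:
    """Weighted sum of two recommenders' scores; a missing score counts as 0."""
    items = set(content_scores) | set(collaborative_scores)
    return {
        item: weight * content_scores.get(item, 0.0)
        + (1.0 - weight) * collaborative_scores.get(item, 0.0)
        for item in items
    }


def top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Highest-scoring items first; ties are broken alphabetically."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


# Hypothetical outputs of the two separate systems.
content_scores = {"A": 0.9, "B": 0.2, "C": 0.5}
collaborative_scores = {"A": 0.3, "B": 0.8, "D": 0.6}
combined = combine_scores(content_scores, collaborative_scores, weight=0.5)
recommendations = top_n(combined, 2)
```

Treating a missing score as zero penalizes items that only one system can rate; other combination schemes are possible, and this one is shown only because it is the easiest to read.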
Combining the profiling and the matchmaking classifications, personalization technologies can
be characterized by the 2x2 matrix presented in Table 1 emphasizing that various
recommendation techniques often use different types of consumer profiles for recommendation
purposes. In particular, many collaborative techniques only use the ratings of items that were
provided by individuals as feedback to the recommender system, although some techniques
take into account simple demographic attributes (e.g., age, gender) in the recommendation
process. Content-based techniques commonly use keywords to represent tastes and preferences
of individuals.
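To illustrate how a collaborative technique can work from ratings alone, the following sketch predicts a consumer's rating of an item as a similarity-weighted average of other consumers' ratings, with cosine similarity computed over co-rated items. This is one common heuristic chosen for the example, not a method prescribed by the text; the names and ratings are hypothetical.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # consumer -> {item: rating}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity restricted to the items both consumers rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Similarity-weighted average of other consumers' ratings for the item."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[item]
        weight_total += abs(similarity)
    return weighted_sum / weight_total if weight_total else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"m1": 5, "m2": 1},
    "bob": {"m1": 5, "m2": 1, "m3": 4},
    "cat": {"m1": 1, "m2": 5, "m3": 2},
}
prediction = predict_rating(ratings, "ann", "m3")
```

The only profile information used is the ratings themselves, so the prediction for ann leans toward bob's rating because their past ratings agree.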
factual profiling information. Even more advanced approaches to recommender systems, such
as the current generation of hybrid recommendation heuristics and model-based techniques, still
use the same limited profiling information as simple heuristics (i.e., ratings, keywords,
demographic attributes).
classified as having simple profiling components. Thus, heuristic techniques for content-based
and collaborative recommendations were placed in the upper left quadrant of Table 1. Similarly,
the more advanced recommendation approaches, such as hybrid heuristics and model-based
techniques, also utilize simple profiling methods, as discussed above. Therefore, they fall into
the lower left quadrant of Table 1.
Table 1
                          PROFILING
                          Simple                                 Advanced
MATCHMAKING  Simple       Content-based heuristics;              Rule-based matching
                          Collaborative filtering heuristics
             Advanced     Hybrid heuristics;                     Future work?
                          Model-based approaches
Finally, there has been very little prior work done for the lower right quadrant of Table 1 because
most of the personalization research has focused on the development of matchmaking techniques
that do not require a comprehensive understanding of consumers' tastes, preferences, and
behavior.
Similarly, the advanced profiling methods have only been applied for simple
matchmaking problems.
profiles, poorly chosen techniques for matchmaking or content delivery? Alternatively, the
selection metrics may not be well suited for the application at hand and are not giving us
accurate measurements. We call this a feedback integration problem, since the main issue is
how to improve the personalization process by integrating the feedback into the overall process.
Note that the feedback integration problem is a recursive one, i.e., if we are able to identify the
underperforming stages of the personalization process, we may still face similar challenges when
deciding on the specific adjustments within each stage. For example, if we need to improve the
data collection phase of the personalization process, should we collect more data, collect
different data, or use better data pre-processing techniques [SMB+03]?
The problem of feedback integration in the personalization process has not been extensively
studied before. Therefore, more research is needed in order to achieve a more comprehensive
understanding of how to transform the e-business measurements into specific adjustments to
various stages of the personalization process.
personalization described in this paper suggests the importance and the need for vertical
personalization research, i.e., research that integrates all the stages of the personalization process.
References
[AT01]
[AT02]
[BS97]
[CFP+00]
[CS00]
M. Cutler and J. Sterne. E-Metrics: Business Metrics for the New Economy,
NetGenesis Corporation, 2000. [http://www.emetrics.org/articles/emetrics.pdf]
[GT01]
[HMS01]
D. Hand, H. Mannila, and P. Smyth. Principles of Data Mining, MIT Press, 2001.
[Paz99]
[PR01]
D. Peppers and M. Rogers. Why CRM Initiatives Fail and What You Can Do
about It. Inside 1to1, Oct. 8, 2001.
[RNE+02]
E-commerce recommendation