
Statistical Science

2006, Vol. 21, No. 2, 256–276


DOI: 10.1214/088342306000000222
© Institute of Mathematical Statistics, 2006

Network-Based Marketing: Identifying Likely Adopters via Consumer Networks
Shawndra Hill, Foster Provost and Chris Volinsky

Abstract. Network-based marketing refers to a collection of marketing techniques that take advantage of links between consumers to increase sales. We concentrate on the consumer networks formed using direct interactions (e.g., communications) between consumers. We survey the diverse literature on such marketing with an emphasis on the statistical methods used and the data to which these methods have been applied. We also provide a discussion of challenges and opportunities for this burgeoning research topic. Our survey highlights a gap in the literature. Because of inadequate data, prior studies have not been able to provide direct, statistical support for the hypothesis that network linkage can directly affect product/service adoption. Using a new data set that represents the adoption of a new telecommunications service, we show very strong support for the hypothesis. Specifically, we show three main results: (1) “Network neighbors” (those consumers linked to a prior customer) adopt the service at a rate 3–5 times greater than baseline groups selected by the best practices of the firm’s marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More detailed network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption.

Key words and phrases: Viral marketing, word of mouth, targeted marketing, network analysis, classification, statistical relational learning.

Shawndra Hill is a Doctoral Candidate and Foster Provost is Associate Professor, Department of Information, Operations and Management Sciences, Leonard N. Stern School of Business, New York University, New York, New York 10012-1126, USA (e-mail: shill@stern.nyu.edu; fprovost@stern.nyu.edu). Chris Volinsky is Director, Statistics Research Department, AT&T Labs Research, Shannon Laboratory, Florham Park, New Jersey 07932, USA (e-mail: volinsky@research.att.com).

1. INTRODUCTION

Network-based marketing seeks to increase brand recognition and profit by taking advantage of a social network among consumers. Instances of network-based marketing have been called word-of-mouth marketing, diffusion of innovation, buzz marketing and viral marketing (we do not consider multilevel marketing, which has become known as “network” marketing). Awareness or adoption spreads from consumer to consumer. For example, friends or acquaintances may tell each other about a product or service, increasing awareness and possibly exercising explicit advocacy. Firms may use their websites to facilitate consumer-to-consumer advocacy via product recommendations (Kautz, Selman and Shah, 1997) or via on-line customer feedback mechanisms (Dellarocas, 2003). Consumer networks may also provide leverage to the advertising or marketing strategy of the firm. For example, in this paper we show how analysis of a consumer network improves targeted marketing.

This paper makes two contributions. First we survey the burgeoning methodological research literature on network-based marketing, in particular on statistical analyses for network-based marketing. We review the research questions posed, and the data and analytic techniques used. We also discuss challenges and opportunities for research in this area. The review allows us to postulate necessary data requirements for studying the effectiveness of network-based marketing and to highlight the lack of current research that satisfies those requirements. Specifically, research must have access both to direct links between consumers and to direct information on the consumers’ product adoption. Because of inadequate data, prior studies have not been able to provide direct, statistical support (Van den Bulte and Lilien, 2001) for the hypothesis that network linkage can directly affect product/service adoption.

The second contribution is to provide empirical support that network-based marketing indeed can improve on traditional marketing techniques. We introduce telecommunications data that present a natural testbed for network-based marketing models, in which communication linkages as well as product adoption rates can be observed. For these data, we show three main results: (1) “Network neighbors” (those consumers linked to a prior customer) adopt the service at a rate 3–5 times greater than baseline groups selected by the best practices of the firm’s marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More sophisticated network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption.

2. NETWORK-BASED MARKETING

There are three, possibly complementary, modes of network-based marketing.

Explicit advocacy: Individuals become vocal advocates for the product or service, recommending it to their friends or acquaintances. Particular individuals such as Oprah, with her monthly book club reading list, may represent “hubs” of advocacy in the consumer relationship network. The success of The Da Vinci Code, by Dan Brown, may be due to its initial marketing: 10,000 books were delivered free to readers thought to be influential enough (e.g., individuals, booksellers) to stimulate the traffic in paid-for editions (Paumgarten, 2003). When firms give explicit incentives to consumers to spread information about a product via word of mouth, it has been called viral marketing, although that term could be used to describe any network-based marketing where the pattern of awareness or adoption spreads from consumer to consumer.

Implicit advocacy: Even if individuals do not speak about a product, they may advocate implicitly through their actions, especially through their own adoption of the product. Designer labeling has a long tradition of using consumers as implicit advocates. Firms commonly capitalize on influential individuals (such as athletes) to advocate products simply by conspicuous adoption. More recently, firms have tried to induce the same effect by convincing particularly “cool” members of smaller social groups to adopt products (Gladwell, 1997; Hightower, Brady and Baker, 2002).

Network targeting: The third mode of network-based marketing is for the firm to market to prior purchasers’ social-network neighbors, possibly without any advocacy at all by customers. For network targeting, the firm must have some means to identify these social neighbors.

These three modes may be used in combination. A well-cited example of viral marketing combines network targeting and implicit advocacy: The Hotmail free e-mail service appended to the bottom of every outgoing e-mail message the hyperlinked advertisement, “Get your free e-mail at Hotmail,” thereby targeting the social neighbors of every current user (Montgomery, 2001), while taking advantage of the user’s implicit advocacy. Hotmail saw an exponentially increasing customer base. Started in July 1996, in the first month alone Hotmail acquired 20,000 customers. By September 1996 the firm had acquired over 100,000 accounts, and by early 1997 it had over 1 million subscribers.

Traditional marketing methods do not appeal to some segments of consumers. Some consumers apparently value the appearance of being on the cutting edge or “in the know,” and therefore derive satisfaction from promoting new, exciting products. The firm BzzAgents (Walker, 2004) has managed to entice voluntary (unpaid) marketing of new products. Furthermore, although more and more information has become available on products, parsing such information is costly to the consumer. Explicit advocacy, such as word-of-mouth advocacy, can be a useful way to filter out noise.

A key assumption of network-based marketing through explicit advocacy is that consumers propagate “positive” information about products after they either have been made aware of the product by traditional marketing vehicles or have experienced the product themselves. Under this assumption, a particular subset of consumers may have greater value to firms because they have a higher propensity to propagate product information (Gladwell, 2002), based on a combination of their being particularly influential and their having more friends (Richardson and Domingos, 2002). Firms should want to find these influencers and to promote useful behavior.

3. LITERATURE REVIEW

Many quantitative statistical methods used in empirical marketing research assume that consumers act independently. Typically, many explanatory attributes are collected on each actor and used in multivariate modeling such as regression or tree induction. In contrast, network-based marketing assumes interdependency among consumer preferences. When interdependencies exist, it may be beneficial to account for their effects in targeting models. Traditionally in statistical research, interdependencies are modeled as part of a covariance structure, either within a particular observational unit (as in the case of repeated measures experiments) or between observational units. Studies of network-based marketing instead attempt to measure these interdependencies through implicit links, such as matching on geographic or demographic attributes, or through explicit links, such as direct observation of communications between actors. In this section, we review the different types of data and the range of statistical methods that have been used to analyze them, and we discuss the extent to which these methods naturally accommodate networked data.

Work in network-based marketing spans the fields of statistics, economics, computer science, sociology, psychology and marketing. In this section, we organize prominent work in network-based marketing by six types of statistical research: (1) econometric modeling, (2) network classification modeling, (3) surveys, (4) designed experiments with convenience samples, (5) diffusion theory and (6) collaborative filtering and recommender systems. In each case, we provide an overview of the approach and a discussion of a prominent example. This (brief) survey is not exhaustive. In the final subsection, we discuss some of the statistical challenges inherent in incorporating this network structure.

3.1 Econometric Models

Econometrics is the application of statistical methods to the empirical estimation of economic relationships. In marketing this often means the estimation of two simultaneous equations: one for the marketing organization or firm and one for the market. Regression and time-series analysis are found at the core of econometric modeling, and econometric models are often used to assess the impact of a target marketing campaign over time.

Econometric models have been used to study the impact of interdependent preferences on rice consumption (Case, 1991), automobile purchases (Yang and Allenby, 2003) and elections (Linden, Smith and York, 2003). For each of the aforementioned studies, geography is used in part as a proxy for interdependence between consumers, as opposed to direct, explicit communication. However, different methods are used in the analysis. Most recently, Yang and Allenby (2003) suggested that traditional random effects models are not sufficient to measure the interdependencies of consumer networks. They developed a Bayesian hierarchical mixture model where interdependence is built into the covariance structure through an autoregressive process. This framework allows testing of the presence of interdependence through a single parameter. It also can incorporate the effects of multiple networks, each with its own estimated dependence structure. In their application, they use geography and demography to create a “network” of consumers in which links are created between consumers who exhibit geographic or demographic similarity. The authors showed that the geographically defined network of consumers is more useful than the demographic network for explaining consumer behavior as it relates to purchasing Japanese cars. Although they do not have data on direct communication between consumers, the framework presented by Yang and Allenby (2003) could be extended to explicit network data where links are created between consumers through their explicit communication as opposed to demographic or geographic similarity.

A drawback of this approach is that the interdependence matrix has size $n^2$, where $n$ is the number of consumers; consumer networks are extremely large
and prohibit parameter estimation using this method. Sparse matrix techniques or clever clustering of the observations would be a natural extension.

3.2 Network Classification Models

Network classification models use knowledge of the links between entities in a network to estimate a quantity of interest for those entities. Typically, in such a model an entity is influenced most by those directly connected to it, but is also affected to a lesser extent by those further away. Some network classification models use an entire network to make predictions about a particular entity on the network; Macskassy and Provost (2004) provided a brief survey. However, most methods have been applied to small data sets and have not been applied to consumer data. Much research in network classification has grown out of the pioneering work by Kleinberg (1999) on hubs and authorities on the Internet, and out of Google’s PageRank algorithm (Brin and Page, 1998), which (to oversimplify) identifies the most influential members of a network by how many influential others “point” to them. Although neither study uses statistical models, both are related to well-understood notions of degree centrality and distance centrality from the field of social-network analysis.

One paper that models a consumer network for maximizing profit is by Richardson and Domingos (2002), in which a social network of customers is modeled as a Markov random field. The probability that a given customer will buy a given product is a function of the states of her neighbors, attributes of the product and whether or not the customer was marketed to. In this framework it is possible to assign a “network value” to every customer by estimating the overall benefit of marketing to that customer, including the impact that the marketing action will have on the rest of the network (e.g., through word of mouth). The authors tested their model on a database of movie reviews from an Internet site and found that their proposed methodology outperforms non-network methods for estimating customer value. Their network formulation uses implicit links (customers are linked when a customer reads a review by another customer and subsequently reviews the item herself) and implicit purchase information (they assume a review of an item implies a purchase and vice versa).

3.3 Surveys

Most research in this area does not have information on whether consumers actually talk to each other. To address this shortcoming, some studies use survey sampling to collect comprehensive data on consumers’ word-of-mouth behavior. By sampling individuals and contacting them, researchers can collect data that are difficult (or impossible) to obtain directly by observing network-based marketing phenomena (Bowman and Narayandas, 2001). The strength of these studies lies in the data, including the richness and flexibility of the answers that can be collected from the responders. For instance, researchers can acquire data about how customers found out about a product and how many others they told about the product. An advantage is that researchers can design their sampling scheme to control for any known confounding factors and can devise fully balanced experimental designs that test their hypotheses. Since the purpose of models built from survey data is description, simple statistical methods like logistic regression or analysis of variance (ANOVA) typically are used.

Bowman and Narayandas (2001) surveyed more than 1700 purchasers of 60 different products who previously had contacted the manufacturer of that product. The purchasers were asked specific questions about their interaction with the manufacturer and its impact on subsequent word-of-mouth behavior. The authors were able to capture whether the customers told others of their experience and if so, how many people they told. The authors found that self-reported “loyal” customers were more likely to talk to others about the products when they were dissatisfied, but interestingly not more likely when they were satisfied. Although studies like this collect some direct data on consumers’ word-of-mouth behavior, the researchers do not know which of the consumers’ contacts later purchased the product. Therefore, they cannot address whether word-of-mouth actually affects individual sales.

3.4 Designed Experiments with Convenience Samples

Designed experiments enable researchers to study network-based marketing in a controlled setting. Although the subjects typically comprise a convenience sample (such as those undergraduates who answer an ad in the school newspaper), the design of the experiment can be completely randomized. This is unlike the studies that rely on secondary data sources or data from the Web. Typically ANOVA is used to draw conclusions.

Frenzen and Nakamoto (1993) studied the factors that influence individuals’ decisions to disseminate information through a market via word-of-mouth. The
subjects were presented with several scenarios that represented different products and marketing strategies, and were asked whether they would tell trusted and nontrusted acquaintances about the product/sale. They studied the effect of the cost/value manipulations on the consumers’ willingness to share information actively with others, as a function of the strength of the social tie. In this study, the authors did not allow the subjects to construct their explicit consumer network; instead, they asked the participants to hypothesize about their networks. The experiments used the data from a convenience sample to generalize over a complete consumer network. The authors also employed simulations in their study. They found that the stronger the moral hazard (the risk of problematic behavior) presented by the information, the stronger the ties must be to foster information propagation. Generally, the authors showed that network structure and information characteristics interact when individuals form their information transmission decisions.

3.5 Diffusion Models

Diffusion theory provides tools, both quantitative and qualitative, to assess the likely rate of diffusion of a technology or product. Qualitatively, researchers have identified numerous factors that facilitate or hinder technology adoption (Fichman, 2004), as well as social factors that influence product adoption (Rogers, 2003). Quantitative diffusion research involves empirical testing of predictions from diffusion models, often informed by economic theory.

The most notable and most influential diffusion model was proposed by Bass (1969). The Bass model of product diffusion predicts the number of users who will adopt an innovation at a given time $t$. It hypothesizes that the rate of adoption is a function solely of the current proportion of the population who have adopted. Specifically, let $F(t)$ be the cumulative proportion of adopters in the population. The diffusion equation, in its simplest form, models $F(t)$ as a function of $p$, the intrinsic adoption rate, and $q$, a measure of social contagion. When $q > p$, this equation describes an S-shaped curve, where adoption is slow at first, takes off exponentially and tails off at the end. This model can effectively model word-of-mouth product diffusion at the aggregate, societal level.

In general, the empirical studies that test and extend accepted theories of product diffusion rely on aggregate-level data for both the customer attributes and the overall adoption of the product (Ueda, 1990; Tout, Evans and Yakan, 2005); they typically do not incorporate individual adoption. Models of product diffusion assume that network-based marketing is effective. Since understanding when diffusion occurs and the extent to which it is effective is important for marketers, these methods may benefit from using individual-level data. Data on explicit networks would enable the extension of existing diffusion models, as well as the comparison of results using individual- versus aggregate-level data.

In his first study, Bass tested his model empirically against data for 11 consumer durables. The model yielded good predictions of the sales peak and the timing of the peak when applied to historical data. Bass used linear regression to estimate the parameters for future sales predictions, measuring the goodness of fit ($R^2$ value) of the model for 11 consumer durable products. The success of the forecasts suggests that the model may be useful in providing long-range forecasting for product sales or adoption. There has been considerable follow-up work on diffusion since this groundbreaking work. Mahajan, Muller and Kerin (1984) review this work. Recent work on product diffusion explores the extent to which the Internet (Fildes, 2003) as well as globalization (Kumar and Krishnan, 2002) play a role in product diffusion.

3.6 Collaborative Filtering and Recommender Systems

Recommender systems make personalized recommendations to individual consumers based on demographic content and link data (Adomavicius and Tuzhilin, 2005). Collaborative filtering methods focus on the links between consumers; however, the links are not direct. They associate consumers with each other based on shared purchases or similar ratings of shared products.

Collaborative filtering is related to explicit consumer network-based marketing because both target marketing tasks benefit from learning from data stored in multiple tables (Getoor, 2005). For example, Getoor and Sahami (1999), Huang, Chung and Chen (2004) and Newton and Greiner (2004) established the connection between the recommendation problem and statistical relational learning through the application of probabilistic relational models (PRMs) (Getoor, Friedman, Koller and Pfeffer, 2001). However, neither group used explicit links between customers for learning. Recommendation systems may well benefit from information about explicit consumer interaction as an additional, perhaps quite important, aspect of similarity.
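
As an illustration of the Bass model described in Section 3.5, the following Python sketch evaluates the diffusion curve. It assumes the standard textbook form of the Bass equation, dF/dt = (p + q F(t))(1 - F(t)) with F(0) = 0, which this survey describes only in words. The function names and the parameter values p = 0.03 and q = 0.38 are illustrative assumptions, not estimates from any data set discussed here.

```python
import math


def bass_cumulative(t, p, q):
    """Cumulative proportion of adopters F(t), assuming F(0) = 0.

    Closed-form solution of dF/dt = (p + q*F) * (1 - F).
    """
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_rate(t, p, q):
    """Instantaneous adoption rate dF/dt at time t."""
    f = bass_cumulative(t, p, q)
    return (p + q * f) * (1.0 - f)


def bass_peak_time(p, q):
    """Time at which the adoption rate peaks (requires q > p)."""
    if q <= p:
        raise ValueError("the adoption rate has an interior peak only when q > p")
    return math.log(q / p) / (p + q)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    p, q = 0.03, 0.38
    t_star = bass_peak_time(p, q)
    print(f"peak adoption rate at t = {t_star:.2f}")
    for t in range(0, 21, 4):
        f = bass_cumulative(t, p, q)
        r = bass_rate(t, p, q)
        print(f"t = {t:2d}  F(t) = {f:.3f}  rate = {r:.4f}")
```

With q > p, the printed values trace the S-shaped curve described in Section 3.5: slow initial growth, a peak in the adoption rate at t = ln(q/p)/(p + q), and then saturation. At the peak the cumulative proportion equals (q - p)/(2q), which offers a simple check on any implementation.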

3.7 Research Opportunities and Statistical Challenges

We see that there is a burgeoning body of work that addresses consumers’ interactions and their effects on purchasing. To our knowledge the foregoing types represent the main statistical approaches taken in research on network-based marketing. In each approach, there are assumptions made in the data collection or in the analysis that restrict them from providing strong and direct support for the hypothesis that network-based marketing indeed can improve on traditional techniques. Surveys and convenience samples can suffer from small and possibly biased samples. Collaborative filtering models have large samples, but do not measure direct links between individuals. Models in network classification and econometrics historically have used proxies like geography instead of data on direct communications, and almost all studies have no accurate, specific data on which (and what) customers purchase.

To paint a complete picture of network influence for a particular product, the ideal data set would have the following properties: (1) large and unbiased sample, (2) comprehensive covariate information on subjects, (3) measurement of direct communication between subjects and (4) accurate information on subjects’ purchases. The data set we present in the next section has all of these properties and we will demonstrate its value for statistical research into network influence. The question of how to analyze such data brings up many statistical issues:

Data-set size. Network-based marketing data sets often arise from Internet or telecommunications applications and can be quite large. When observations number in the millions (or hundreds of millions), the data become unwieldy for the typical data analyst and often cannot be handled in memory by standard statistical analysis software. Even if the data can be loaded, their size renders the interactive style of analysis common with tools like R or Splus painfully slow. In Internet or telecommunications studies, there often are two massive sources of data: all actors (web sites, communicators), along with their descriptive attributes, and the transactions among these actors. One solution is to compress the transaction information into attributes to be included in the actors’ attribute set. It has been shown that file squashing (DuMouchel, Volinsky et al., 1999), which attempts to combine the best features of preprocessed data with random sampling, can be useful for customer attrition prediction. DuMouchel et al. claimed that squashing can be useful when dealing with up to billions of records. However, there may be a loss of important information which can be captured only by complex network structure.

More sophisticated network information derived from transactional data can also be incorporated into the matrix of customer information by deriving network attributes such as degree distribution and time spent on the network (which we demonstrate below). Similarly, other types of data such as geographical data or temporal data, which otherwise would need to be handled by some sophisticated methodology, can be folded into the analysis by creating new covariates. It remains an open question whether clever data engineering can extract all useful information to create a set of covariates for traditional analysis. For example, knowledge of communication with specific sets of individuals can be incorporated, and may provide substantial benefit (Perlich and Provost, 2006).

Once the data are combined, the remaining data set still may be quite large. While much data mining research is focused on scaling up the statistical toolbox to today’s massive data sets, random sampling remains an effective way to reduce data to a manageable size while maintaining the relationships we are trying to discover, if we assume the network information is fully encoded in the derived variables. The amount of sampling necessary will depend on the computing environment and the complexity of the model, but most modern systems can handle data sets of tens or hundreds of thousands of observations. When sampling, care must be taken to stratify by any attributes that are of particular interest or to oversample those attributes that have extremely skewed distributions.

Low incidence of response. In applications where the response is a consumer’s purchase or reaction to a marketing event, it is common to have a very low response rate, which can result in poor fit and reduced ability to detect significant effects for standard techniques like logistic regression. If there are not many independent attributes, one solution is Poisson regression, which is well suited for rare events. Poisson regression requires forming buckets of observations based on the independent attributes and modeling the aggregate response in these buckets as a Poisson random variable. This requires discretization of any continuous independent attributes, which may not be desirable. Also, if there are even a moderate number of independent attributes, the buckets will be too sparse to allow Poisson modeling. Other solutions that have been proposed include oversampling positive responses and/or undersampling
negative responses. Weiss (2004) gave an overview of the literature on these and related techniques, showing that there is mixed evidence as to their effectiveness. Other studies of note include the following. Weiss and Provost (2003) showed that, given a fixed sample size, the optimal class proportion in training data varies by domain and by ultimate objective (but can be determined); generally speaking, to produce probability estimates or rankings, a 50:50 distribution is a good default. However, Weiss and Provost’s results are only for tree induction. Japkowicz and Stephen (2002) experimented with neural networks and support-vector machines, in addition to tree induction, showing (among other things) that support-vector machines are insensitive to class imbalance. However, they considered primarily noise-free data. Other techniques to deal with unbalanced response attributes include ensemble (Chan and Stolfo, 1998; Mease, Wyner and Buja, 2006) and multiphase rule induction (Clearwater and Stern, 1991; Joshi, Kumar and Agarwal, 2001). This is an area in need of more systematic empirical and theoretical study.

Separating word-of-mouth from homophily. Unless there is information about the content of communications, one cannot conclude that there was word-of-mouth transmission of information about the product. Social theory tells us that people who communicate with each other are more likely to be similar to each other, a concept called homophily (Blau, 1977; McPherson, Smith-Lovin and Cook, 2001). Homophily is exhibited for a wide variety of relationships and dimensions of similarity. Therefore, linked consumers probably are like-minded, and like-minded consumers tend to buy the same products. One way to address this issue in the analysis is to account for consumer similarity using propensity scores (Rosenbaum and Rubin, 1984). Propensity scores were developed in the context of nonrandomized clinical trials and attempt to adjust for the fact that the statistical profile of patients who received treatment may be different than the profile of those who did not, and that these differences could mask or enhance the apparent effect of the treatment. Let $T$ represent the treatment, $X$ represent the independent attributes excluding the treatment and $Y$ represent the response. Then the propensity score is $PS(x) = P(T = 1 \mid X = x)$. By matching propensity scores in the treatment and control groups using typical indicators of homophily like demographic data, we can

Incorporating extended network structure. Data with network structure lend themselves to a robust set of network-centric analyses. One simple method (employed in our analysis) is to create attributes from the network data and plug them into a traditional analysis. Another approach is to let each actor be influenced by her neighborhood modeled as a Markov random field. Domingos and Richardson (2001) used this technique to assign every node a “network value.” A node with high network value (1) has a high probability of purchase, (2) is likely to give the product a high rating, (3) is influential on its neighbors’ ratings and (4) has neighbors like itself. Hoff, Raftery and Handcock (2002) defined a Markov-chain Monte Carlo method to estimate latent positions of the actors for small social-network data sets. This embeds the actors in an unobserved “social space,” which could be more useful than the actual transactions themselves for predicting sales. The field of statistical relational learning (Getoor, 2005) has recently produced a wide variety of methods that could be applicable. Often these models allow influence to propagate through the network.

Missing data. Missing data in network transactions are common; often only part of a network is observable. For instance, firms typically have transactional data on their customers only or may have one class of communication (e-mail) but not another (cellular phone). One attempt to account for these missing edges is to use network structure to assign a probability of a missing edge everywhere an edge is not present. Thresholding this probability creates pseudo-edges, which can be added to the network, perhaps with a lesser weight (Agarwal and Pregibon, 2004). This is closely related to the link prediction problem, which tries to predict where the next links will be (Liben-Nowell and Kleinberg, 2003). One extension of the PRM framework models link structure through the use of reference uncertainty and existence uncertainty. The extension includes a unified generative model for both content and relational structure, where interactions between the attributes and link structure are modeled (Getoor, Friedman, Koller and Taskar, 2003).

4. DATA SET AND PRIMARY HYPOTHESIS

This section details our data set, derived primarily from a direct-mail marketing campaign to potential customers of a new communications service (later we augment the primary data with a large set
account (partially) for the possible confoundedness of of consumer-specific attributes). The firm’s market-
other independent attributes. ing team identified and marketed to a list of prospects

using its standard methods. We investigate whether network-related effects or evidence of “viral” information spread are present in this group. As we will describe, the firm also marketed to a group we identified using the network data, which allows us to test our hypotheses further. We are not permitted to disclose certain details, including specifics about the service being offered and the exact size of the data set.

4.1 Initial Data Details

In late 2004, a telecommunications firm undertook a large direct-mail marketing campaign to potential customers of a new communications service. This service involved new technology and, because of this, it was believed that marketing would be most successful to those consumers who were thought to be “high tech.”

In keeping with standard practice, the marketing team collected attributes on a large set of prospects, that is, consumers whom they believed to be potential adopters of the service. The marketing team used demographic data, customer relationship data, and various other data sources to create profitability and behavioral models to identify prospective targets: consumers who would receive a targeted mailing. The data the marketing team provided us with did not contain the underlying customer attributes (e.g., demographics), but instead included values for derived attributes that defined 21 marketing segments (Table 1) that were used for campaign management and post hoc analyses. The sample included millions of consumers. The team believed that the different segments would have varying response rates and it was important to separate the segments in this way to learn the most from the campaign.

An important derived variable was loyalty, a three-level score based on previous relationships with the firm, including previous orders of this and other services. Roughly, loyalty level 3 comprises customers with moderate-to-long tenure and/or those who have subscribed to a number of services in the past. Loyalty level 2 comprises those customers with whom the firm has had some limited prior experiences. Loyalty level 1 comprises consumers who did not have service with the firm at the time of mailing; little (if any) information is available on them. Previous analyses have shown that loyalty and tenure attributes have substantial impact on response to campaigns.

Other important attributes were based on demographics and other customer characteristics. The attribute Intl is an indicator of whether the prospect had previously ordered any international services; Tech1 (hi, med or low) and Tech2 (1–10, where 1 = high tech) are scores derived from demographics and other

TABLE 1
Descriptive statistics for the marketing segments (see Section 4.1 for details)

Segment  Loyalty  Intl  Tech1    Tech2  Early Adopt  Offer  % of list  %NN

1        3        Y     Hi       1–7    Med–Hi       P1      1.6       0.63
2        3        Y     Med      1–7    Med–Hi       P1      2.4       1.26
3        2        Y     Hi       1–4    Hi           P1      1.7       0.08
4        2        Y     Med      1–4    Hi           P1      1.7       0.10
5        1        Y     Hi       1–4    Hi           P1      0.1       0.22
6        1        Y     Med      1–4    Hi           P1      0.1       0.25
7        3        N     Hi       1–7    Med–Hi       P2     10.9       0.50
8        3        N     Med      1–7    Med–Hi       P2     13.1       0.83
9        2        N     Hi       1–4    Hi           P2     17.5       0.04
10       2        N     Med      1–4    Hi           P2     11.0       0.07
11       1        N     Hi       1–4    Hi           P2      5.3       0.14
12       1        N     Med      1–4    Hi           P2      7.7       0.25
13       3        N     Hi       1–7    Med–Hi       P2      2.0       0.63
14       1, 2     N     Hi       1–4    Hi           P2      2.0       0.15
15       1        Y     ?        ?      ?            P3      2.0       1.01
16       1        N     ?        ?      ?            P2      1.6       0.46
17       3        N     Hi       1–7    Med–Hi       P2+     2.0       0.70
18       1, 2     N     Hi       1–4    Hi           P2+     2.0       0.15
19       1, 2, 3  Y     Hi       1–7    Med–Hi       P3      1.8       0.67
20       2        N     Hi, Med  1–4    Hi           L1      6.0       0.05
21       2        N     Hi, Med  1–4    Hi           L2      6.0       0.05

attributes that estimate the interest and ability of the customer to use a high-tech service; Early Adopt is a proprietary score that estimates the likelihood that the customer will use a new product, based on previous behavior. We also show the Offer, indicating that different segments received different marketing messages: P1–P3 indicate different postcards that were sent, L1 and L2 indicate different letters, and a “+” indicates that a “call blast” accompanied the mailing. In defining the segments, those groups with high loyalty values were permitted lower values from the technology and early adoption models. Segments 15 and 16 were provided by an external vendor; there were insufficient data on these prospects to fit our Tech and Early Adopt models, as indicated by a “?” in Table 1.

4.2 Primary Hypothesis and Network Neighbors

The research goal we consider here is whether relaxing the assumption of independence between consumers can demonstrably improve the estimation of response likelihood. Thus, our first hypothesis is that someone who has direct communication with a current subscriber is more likely herself to adopt the service. It should be noted that the firm knows only of communications initiated by one of its customers through a service of the firm, so the network data are considerably incomplete, especially for the lower loyalty groups. Data on communications events include anonymous identifiers for the transactors, a time stamp and the transaction duration. For the purposes of this research, all data are rendered anonymous so that individual identities are protected.

In pursuit of our hypothesis, we constructed an attribute called network neighbor (or NN), a flag that indicates whether the targeted consumer had communicated with a current user of the service in a time period prior to the marketing campaign. Overall, 0.3% of the targets are network neighbors. In Table 1, the percentage of network neighbors (%NN) is broken down by segment.

In addition, the marketing team invited us to create our own segment, which they also would target. Our “segment 22” consisted of network neighbors that were not already on the current list of targets. To make sure our list contained viable prospects, the marketing team calculated the derived technology and early adopter scores for the consumers on our list. They filtered based on these scores, but they relaxed the thresholds used to limit their original list. For instance, someone with loyalty = 1 needed a Tech2 score less than 4 to merit inclusion on the initial list; this threshold was relaxed for our list to Tech2 less than 7. In this way, the marketing team allowed prospects who missed inclusion on the first cut to make it into segment 22 if they were network neighbors. However, the marketing team still avoided targeting customers who they believed had very small probabilities of a purchase. For those network neighbors who did not score high enough to warrant inclusion in segment 22, we still tracked their purchase records to see if any of them subscribed to the service in the absence of the marketing campaign; see below. Overall, the profile of the candidates in our segment 22 was considered to be subpar in terms of demographics, affinity and technological capability. Notably, for our final conclusions, these targets are potential customers the firm would have otherwise ignored. The size of segment 22 was about 1.2% of the marketing list.

To summarize, the above process divides the prospect universe along two dimensions: (1) targets, those consumers identified by the marketing models as being worthy of solicitation, and (2) network neighbors, those who had direct communication with a subscriber. Table 2 shows the relative size for each combination (using the non-network-neighbor targets as the reference set). Note the non-NN nontargets, who are neither network neighbors nor deemed to be good prospects. This group is the majority of the prospect space and includes consumers that the firm has very little information about, because they are low-usage communicators or do not subscribe to any services with the firm.

4.3 Modeling with Consumer-Specific Data

To determine whether relaxing the independence assumption (using the network data) improves modeling, we fit models using a wide range of demographic and consumer-specific independent attributes (many of which are known or believed to affect the estimated likelihood of purchase). Overall, we collected the values for over 150 attributes to assess their effect on sales likelihood and their interactions with the network-neighbor variable. These values included the following:

• Loyalty data: We obtained finer-grained loyalty information than the simple categorization described above, including past spending, types of service, how often the customer responded to prior mailings, a loyalty score generated by a proprietary model and information about length of tenure.

TABLE 2
Data categories

          Target = Y                                      Target = N

NN = Y    NN targets                                      NN nontargets
          Segments 1–22
          Relative size = 0.015                           Relative size = 0.10
          Prospects identified by marketing models and    Consumers who were network neighbors, but
          who also are network neighbors. Those in        were not marketed to because they scored
          segment 22 have reduced thresholds on the       poorly on marketing models.
          marketing model scores.

NN = N    Non-NN targets                                  Non-NN nontargets
          Segments 1–21
          Relative size = 1                               Relative size > 8
          Prospects identified by marketing models but    Consumers who were not network neighbors and
          who are not network neighbors.                  also were not considered to be good prospects
                                                          by the marketing model.

NOTES. The data for our study are broken down into targets and network neighbors. The “relative size” value shows the number of prospects who show up in each group, relative to the non-NN target group.

• Geographic data: Geographic data were necessary for the direct mail campaign. These data include city, state, zip code, area code and metropolitan city code.
• Demographic data: These include information such as gender, education level, credit score, head of household, number of children in the household, age of members in the household, occupation and home ownership. Some of this information was inferred at the census tract level from the geographic data.
• Network attributes: As mentioned earlier, we observed communications of current subscribers with other consumers. In addition to the simple network-neighbor flag described earlier, we derived more sophisticated attributes from prospects’ communication patterns. We will return to these in Section 5.6.

4.4 Data Limitations

We encountered missing values for customers across all loyalty levels. The amount of missing information depends directly on the level of experience we have had with the customer just prior to the direct mailing. For example, geography data are available for all targets across all three loyalty levels. On the other hand, as the number of services and tenure with the firm decline, so does the amount of information (e.g., transactions) available for each target. Given the difference in information as loyalty varies, we grouped customers by loyalty level and treated the levels separately in our analyses. This stratification leaves three groups that are mostly internally consistent with respect to missing values.

The overall response rate is very low. As discussed above, this presents challenges inherent with a heavily skewed response variable. For example, an analysis that stratifies over many different attributes may have several strata with no sales at all, rendering these strata mostly useless. The data set is large, which helps to ameliorate this problem, but in turn presents logistical problems with many sophisticated statistical analyses. In this paper, we restrict ourselves to relatively straightforward analyses.

4.5 Loyalty Distribution

A look at the distribution of the loyalty groups across the four categories (Figure 1) of prospects shows that the firm targeted customers in the higher loyalty groups relatively heavily. The network-neighbor target group appears to skew toward the less loyal prospects; this is due to the fact that segment 22, which makes up a large part of the network-neighbor population, comprises predominantly low-loyalty consumers.

5. ANALYSIS

Next we will show direct, statistical evidence that consumers who have communicated with prior customers are more likely to become customers. We show this in several ways, including using our own best efforts to build competing targeting models and conducting thorough assessments of predictive ability on out-of-sample data. Then we consider more sophisticated network attributes and show that targeting can be improved further.
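
As a concrete reminder of the attribute being tested in this analysis, the network-neighbor flag of Section 4.2 could be derived from communication records roughly as follows. This is only a minimal sketch under stated assumptions: the record layout (two anonymous party identifiers, a time stamp and a duration), the function name `network_neighbors` and the 90-day window are all hypothetical, since the paper does not disclose the exact window or data format.

```python
from datetime import datetime, timedelta


def network_neighbors(events, subscribers, targets, campaign_start, window_days=90):
    """Return the targets who communicated with a current subscriber
    during the window_days before campaign_start.

    events: iterable of (party_a, party_b, timestamp, duration_seconds).
    The window length is an assumption made for illustration only.
    """
    window_start = campaign_start - timedelta(days=window_days)
    flagged = set()
    for party_a, party_b, timestamp, _duration in events:
        if not (window_start <= timestamp < campaign_start):
            continue  # outside the pre-campaign window
        # Treat the link as undirected: either party may be the subscriber.
        if party_a in subscribers and party_b in targets:
            flagged.add(party_b)
        if party_b in subscribers and party_a in targets:
            flagged.add(party_a)
    return flagged


# Hypothetical toy data with anonymous identifiers.
events = [
    ("s1", "t1", datetime(2004, 10, 5), 120),  # subscriber and target, in window
    ("t2", "t3", datetime(2004, 10, 6), 60),   # no subscriber involved
    ("s2", "t3", datetime(2004, 1, 2), 30),    # too old for a 90-day window
]
nn = network_neighbors(events, {"s1", "s2"}, {"t1", "t2", "t3"}, datetime(2004, 11, 1))
```

Because the firm sees only communications that pass through its own services, a flag built this way will miss some true neighbors, which is the incompleteness noted in Section 4.2.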

FIG. 1. Loyalty distribution by customer category. The three bars show the relative sizes of the three loyalty groups for our four data categories. The network neighbors (NN) show a much larger proportion of low-loyalty consumers than the non-NN group.

5.1 Network-Based Marketing Improves Response

Segmentation provides an ideal setting to test the significance and magnitude of any improvement in modeling by including network-neighbor information, while stratifying by many attributes known to be important, such as loyalty and tenure. The response variable is the take rate for the targets in the two months following the direct mailing. The take rate is the proportion of the targeted consumers who adopted the service within a specified period following the offer. For each segment, we performed a simple logistic regression for the independent network-neighbor attribute versus the dependent sales response. In Figure 2, we graphically present parameter estimates (equivalent to log-odds ratios) for the network attribute along with 95% confidence intervals for 20 of the 21 segments (segment 5 had only a small number of network-neighbor prospects and zero network-neighbor sales, and therefore had an infinite log odds). Figure 2 shows that in all 20 segments the network-neighbor effect is positive (the parameter estimate is greater than zero), demonstrating an increased take rate for the network-neighbor group within each segment. For 17 of these segments, the log-odds ratio is significantly different from the null hypothesis value of 0 (p < 0.05), indicating that being a network neighbor significantly affected sales in those segments.

FIG. 2. Results of logistic regression. Parameter estimates plotted as log-odds ratios with 95% confidence intervals. The number plotted at the value of the parameter estimate refers back to segment numbers from Table 1.

While odds ratios allow for tests of significance of an independent variable, they are not as directly interpretable as comparisons of take rates of the network-neighbor and non-network-neighbor groups in a given segment. The take rates for the network neighbors are plotted versus the non-network neighbors in Figure 3, where the size of the point is proportional to the log size of the segment. All segments have higher take rates in the network-neighbor subgroup, except for the one segment that had no network-neighbor sales (the smallest sample size). Over the entire data set, the network-neighbors’ take rates were greater by a factor of 3.4. This value is plotted in Figure 3 as a dotted line with slope = 3.4. The right-hand plot of Figure 3 shows the relationship between each segment’s take rate and its lift ratio, defined as the take rate for NN divided by the take rate for non-NN. The plot shows that the benefit of being a network neighbor is greater for those segments with lower overall take rates.

As Figure 3 shows, some of the segments had much higher take rates than others. To assess statistical significance of the network-neighbor effect after accounting for this segment effect, we ran a logistic regression across all segments, including the main effects for the

FIG. 3. Take rates for marketing segments. Left: For each segment, comparison of the take rate of the non-network neighbors with that of the network neighbors. The size of the glyph is proportional to the log size of the segment. There is one outlier not plotted, with a take rate of 11% for the network neighbors and 0.3% for the non-network neighbors. Reference lines are plotted at x = y and at the overall take-rate ratio of 3.4. Right: Plot of the take rate for the non-network group versus lift ratio for the network neighbors.

network-neighbor attribute, dummy attributes for each segment and the interaction terms between the two. Two of the interaction terms had to be deleted: one from segment 22, which only had network-neighbor cases, and one from the segment with no sales from the network neighbors. We ran a full logistic regression and used stepwise variable selection.

The results of the logistic regression reiterate the significance of being a network neighbor. The final model can be found in Table 3. The coefficient of 2.0 for the network-neighbor attribute in the final model is an estimate of the log-odds ratio, which we exponentiate to get an odds ratio of 7.49, with a 95% confidence interval of (5.64, 9.94). More than half of the segment effects and most of the interactions between the network-neighbor attribute and those segment effects are significant. The interpretation of these interactions is important. Note that the interaction coefficients are negative and very close in magnitude to the coefficients of the main effects of the segments themselves. Therefore, although the segments themselves are significant, in the presence of the network attribute the segments’ effect is mostly negated by the interaction effect. Since the segments represent known important attributes like loyalty, tenure and demographics, this is evidence that being a network neighbor is at least as important in this context.

TABLE 3
Coefficients and confidence intervals for the final segment model

Attribute                 Coeff (c.i.)          Significance (a)

Network neighbor (NN)      2.0 (1.7, 2.3)       **
Segment = 1                1.7 (0.9, 2.5)       **
Segment = 2                1.8 (1.2, 2.4)       **
Segment = 4                2.1 (1.3, 3.0)       **
Segment = 5                1.9 (0.4, 3.3)       **
Segment = 6                1.9 (1.2, 2.5)       **
Segment = 7                1.4 (1.0, 1.9)       **
Segment = 8                1.3 (0.9, 1.7)       **
Segment = 17               1.5 (0.7, 2.2)       **
Segment = 19               2.2 (1.6, 2.9)       **
NN × Segment = 1          −1.1 (−2.1, 0.0)      *
NN × Segment = 2          −0.9 (−1.7, −0.2)     **
NN × Segment = 4          −1.8 (−4.0, 0.4)      **
NN × Segment = 6          −1.5 (−2.6, −0.6)     **
NN × Segment = 7          −1.2 (−1.7, −0.6)     **
NN × Segment = 8          −0.8 (−1.3, −0.4)     **
NN × Segment = 17         −1.6 (−2.8, −0.5)     **
NN × Segment = 19         −1.1 (−1.9, −0.3)     **

(a) Significance of the attributes in the logistic regression model is shown at the 0.05 (*) and 0.01 (**) levels.

In Table 4 we present an analysis of deviance table, an analog to analysis of variance used for nested logistic regressions (McCullagh and Nelder, 1983). The table confirms the significance of the main effects and of the interactions. Each level of the nested model is significant when a chi-squared approximation is used for the differences of the deviances. The fact that so many interactions are significant demonstrates that the network-neighbor effect varies for different segments of the prospect population.
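
To make the quantities in this section concrete, the following sketch computes a take rate, a lift ratio and a log-odds ratio with a Wald 95% confidence interval for a single segment. The counts are hypothetical (the study's counts are not disclosed), and the function names are illustrative rather than anything used by the authors. The sketch relies on a standard fact: for a logistic regression with one binary predictor, the fitted slope equals the sample log-odds ratio, and its Wald standard error is the square root of the sum of the reciprocal cell counts.

```python
import math


def take_rate(sales, prospects):
    """Proportion of targeted prospects who adopted the service."""
    return sales / prospects


def log_odds_ratio_ci(nn_sales, nn_total, non_sales, non_total, z=1.96):
    """Log-odds ratio of a sale (NN vs. non-NN) with a Wald interval.

    Raises ValueError when any cell of the 2x2 table is empty, as in a
    segment with zero network-neighbor sales (the estimate is infinite).
    """
    a, b = nn_sales, nn_total - nn_sales      # NN: sales, no sales
    c, d = non_sales, non_total - non_sales   # non-NN: sales, no sales
    if min(a, b, c, d) <= 0:
        raise ValueError("empty cell: log-odds ratio is not finite")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, (log_or - z * se, log_or + z * se)


# Hypothetical segment: 30 sales among 3,000 NN targets,
# 300 sales among 100,000 non-NN targets.
lift = take_rate(30, 3000) / take_rate(300, 100000)
log_or, (ci_low, ci_high) = log_odds_ratio_ci(30, 3000, 300, 100000)

# Exponentiating a coefficient gives an odds ratio.
odds_ratio_from_rounded_coeff = math.exp(2.0)
```

Exponentiating the rounded coefficient 2.0 gives roughly 7.39; the reported odds ratio of 7.49 and its interval were presumably computed from the unrounded estimate.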

TABLE 4
Analysis of deviance table for the network-neighbor study

Variable                       Deviance   DF   Change in deviance   Significance (a)

Intercept                      11200
Segment                        10869       9    63                  **
Segment + NN                   10733       1   370                  **
Segment + NN + interactions    10687       8    41                  **

(a) Significance of the group of attributes at each step is shown at the 0.05 (*) and 0.01 (**) levels.
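
The chi-squared approximation used for Table 4 can be sketched as follows: each change in deviance is referred to a chi-squared distribution whose degrees of freedom equal the number of parameters added at that step. The helper `chi2_sf` is a hypothetical name for a small standard-library implementation of the upper-tail probability for integer degrees of freedom; it is offered as an illustration and is not the authors' code. The inputs are the "Change in deviance" and "DF" columns exactly as reported in the table.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable
    with a positive integer number of degrees of freedom."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: complementary error function plus a finite sum.
    total = 0.0
    for i in range(df // 2):
        total += half ** (i + 0.5) / math.gamma(i + 1.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# (change in deviance, degrees of freedom) as reported in Table 4.
steps = {
    "Segment": (63, 9),
    "Segment + NN": (370, 1),
    "Segment + NN + interactions": (41, 8),
}
p_values = {name: chi2_sf(change, df) for name, (change, df) in steps.items()}
```

With these inputs every step falls below the 0.01 level, in line with the double asterisks in the table.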

5.2 Segment 22

The segment data enable us to compare take rates of network and non-network targets for the segments that contained both types of targets. However, many of the network-neighbor targets fall into the network-only segment 22. Segment 22 comprises prospects that the original marketing models deemed not to be good candidates for targeting. As we can see from the distribution in Figure 1, this segment for the most part contains consumers who had no prior relationship with the firm. We compare the take rates for segment 22 with the take rates for the combined group, including all of segments 1–21, in the leftmost three bars of Figure 4. The network-neighbor segment 22 is (not surprisingly) not as successful as the NN groups in segments 1–21, since the targets in segments 1–21 were selected based on characteristics that made them favorable for marketing. Interestingly, we see that the segment 22 network neighbors outperform the non-NN targets from segments 1–21. These segment 22 network neighbors, identified primarily on the basis of their network activity, were more likely by almost 3 to 1 to purchase than the more “favorable” prospects who were not network neighbors. Since those in segment 22 either were not identified by marketing analysts or were deemed to be unworthy prospects, they represent customers who would have “fallen through the cracks” in the traditional marketing process.

FIG. 4. Take rates for marketing segments. Take rates for the network neighbors and non-network neighbors in segments 1–21 compared with the all-network-neighbor segment 22 and with the nontarget network neighbors. All take rates are relative to the non-network-neighbor group (segments 1–21).

5.3 Improving a Multivariate Targeting Model

Now we will assess whether the NN attribute can improve a multivariate targeting model by incorporating all that we know or can find out (over 150 different attributes) about the targets, including geography, demographics and other company-specific attributes, from internal and external sources (see Section 3.2).

As discussed in Section 3.7, we tried to address (as well as possible) an important causal question that arises: Is this network-neighbor effect due to word of mouth or simply due to homophily? The observed effect may not be indicating viral propagation, but instead may simply demonstrate a very effective way to find like-minded people. This theoretical distinction may not matter much to the firm for this particular type of marketing process, but is important to make, for example, before designing future campaigns that try to take advantage of word-of-mouth behavior.

Although we cannot control for unobserved similarities, we can be as careful as possible in our analysis to ensure that the statistical profile of the NN prospects is the same as the profile for the non-NN cases. Since our data set contains many more non-NN cases than NN cases, we match each NN case with a single non-NN case that is as close as possible to it by calculating propensity scores using all of the explanatory attributes considered (as described in Section 3.7). At the end of this matching process, the NN group is as close as is reasonably possible in statistical properties to the non-NN group.

Due to heterogeneity of data sources across the three loyalty groups, we used the propensity scores to create a matched data set for each group. For each (individually), we fitted a full logistic regression including interactions and selected a final model using stepwise

TABLE 5
Results of multivariate model

                   Loyalty 3                        Loyalty 2                        Loyalty 1

Significant        NN                               NN                               NN
attributes         Discount calling plan (-)(I)     Discount calling plan (-)        Previous responder to mailing
                   Level of Int’l Comm. (I)         Tenure with firm                 High Tech Msg
                   # of devices in house (-)        Referral plan                    Letter (vs. postcard)
                   Revenue band                     High Tech model score (I)        Recent responder to mailing
                   Tenure with firm (-)             Region of country indicator      User of incentive credit card
                   International communicator       Belonged to loyalty program      Any children in house (-)
                   Belonged to loyalty program      Churner (-)
                   Referral plan                    College grad
                   Type of previous service         Tenure at residence (-)
                   Credit score                     Any children in house (-)
                   Number of adults in house        Child < 18 at home (-)

Beta hat for NN
(95% CI)           0.68 (0.46, 0.91)                0.99 (0.49, 1.49)                0.84 (0.52, 1.16)
Take rate          0.9%                             0.4%                             0.3%

NOTES. Significant attributes from logistic regressions across loyalty levels (p < 0.05). Bold indicates significance at 0.01 level; (-) indicates the effect of the variable was negative; (I) indicates a significant interaction with the NN variable.

variable selection. All attributes were checked for outliers, transformations and collinearity with other attributes, and we removed or combined the attributes that accounted for any significant correlations.

Table 5 shows the results of the logistic regressions, which show the attributes that were found to be significant, those that were negatively correlated with take rate, and those that had interactions with the NN attribute. Each of the three models found the network-neighbor attribute to be significant along with several others. The significant attributes tended to be attributes regarding the prospects’ previous relationships with the firm, such as previous international services, tenure with firm, churn identifiers and revenue spent with the firm. These attributes are typically correlated with demographic attributes, which explains the lack of significance of many of the demographic attributes considered. Interestingly, tenure with firm is significant in loyalty groups 2 and 3, but with different signs. In the most loyal group, tenure is negatively correlated, but in the mid-level loyalty group it is positive. This unexpected result may be due to differing compositions of the two groups; those consumers with long tenure in the most loyal group might be people who just never change services, while long tenure in the other group might be an indicator that they are gaining more trust in the company. In loyalty group 1, there is limited information about previous services with the firm. For those customers, knowing whether the customer has responded to any previous marketing campaigns has a significant effect.

Table 5 also shows parameter estimates for NN and the take rates in the three loyalty groups. The take rates are highest in the group with the most loyalty but, interestingly, this group gets the least lift (smallest parameter estimate) from the NN attribute. So the impact of network-neighbor is stronger for those market segments with lower loyalty, where actual take rates are weakest.

5.4 Consumers Not Targeted

As discussed above, only a select subset of our network-neighbor list was subject to marketing, based on relaxed thresholds on eligibility criteria. The remainder of the list, the nontarget network neighbors, made up the majority. Potential customers were omitted for various reasons: they were not believed to have high-tech capacity; they were on a do-not-contact list; address information was unreliable, and so on. Nonetheless, we were able to identify whether they purchased the product in the follow-up time period. The take rate for this group was 0.11%, and is shown relative to the target groups as the rightmost bar in Figure 4. Although they were not even marketed to, their take rate is almost half that for the non-NN targets, who were chosen as some of the best prospects by the marketing team. This group comprises consumers without

any known favorable characteristics that would have well as network-neighbor status. Note that in different
put them on the list of prospects. The fact that they business scenarios, different types and amounts of data
are network neighbors alone supports a relatively high are available. For example, for low-loyalty customers,
take rate, even in the absence of direct marketing. very few descriptive attributes are known. We report
This lends some support to an explanation of word-of- results here using all attributes; the findings are quali-
mouth propagation rather than homophily. tatively similar for every different subset of attributes
Finally, we will briefly discuss the remainder of the we have tried (namely, segment, loyalty, geography,
consumer space, the non-NN nontarget group. Unfortunately, it is very difficult to estimate a take rate in this category, which could be considered a baseline rate for all of the other take rates. To do this, we would need to estimate the size of the space of all prospects. This includes all of the prospects the firm knows about, as well as customers of the firm’s competitors and consumers who might purchase this product that do not have current telecommunications service with any provider. It has been established that the size of the communications market is difficult to estimate (Poole, 2004); our best estimates of this baseline take rate put it at well below 0.01%, at least an order of magnitude less than even the nontarget network neighbors.

On the other hand, a by-product of our study is that we can upper-bound the effect of the mass marketing campaigns in general by comparing the target-NN group and the nontarget-NN group. The difference in take rates between the targeted network neighbors and the nontargeted network neighbors is about 10 to 1. This difference cannot all be attributed to the marketing effect, since the targeted group was specifically chosen to be better prospects and it is likely that more of them would have signed up for the service even in the total absence of marketing. However, it does seem reasonable to call this factor of 10 an upper bound on the effect of the marketing.

5.5 Out-of-Sample Ranking Performance

These results suggest that we can give fine-grained estimations as to which customers are more or less likely to respond to an offer. Such estimations can be quite valuable: the consumer pool is immense and a campaign will have a limited budget. Therefore, being able to pick a better list of “top-k” prospects will lead directly to increased profit (assuming targeting costs are not much higher for higher ranked prospects). In this section, we show that combining the network-neighbor attribute with the traditional attributes improves the ability to rank customers accurately.

For each consumer, we create a record that comprises all of the traditional attributes (trad atts), including loyalty, demographic and geographic attributes, as

demographic). The response variable is the same as above and we used the same logistic regression models. We measure the predictive ranking ability in the binary response variable by an increase in the Wilcoxon–Mann–Whitney statistic, equivalent to the area under the ROC curve (AUC). The ROC curve represents the trade-off between false negative and false positive rates for each possible predicted probability score cutoff resulting from the logistic regression model. Specifically, the AUC is the probability that a randomly chosen (as yet unseen) taker will be ranked higher than a randomly chosen nontaker; AUC = 1.0 means the classes are perfectly separated and AUC = 0.5 means the list is randomly shuffled. All reported AUC values are averages obtained using 10-fold cross-validation.

Table 6 shows the AUC values for the three loyalty groups, quantifying the expected benefit from the improved logistic regression models. There is an increase in AUC for each group, with the largest increase belonging to loyalty level 1, for which the least information is available; note that here the ranking ability without the network information is not much better than random.

TABLE 6
ROC analysis: AUC values that result from the application of logistic regression models

Loyalty    trad atts    trad atts + NN
1          0.54         0.60
2          0.64         0.67
3          0.60         0.64

NOTE. The logistic regression models were built using all available attributes with (trad atts + NN) and without (trad atts) the network-neighbor attribute. We see an increase in AUC across all loyalty groups when the NN attribute is included in the model.

FIG. 5. (a) Lift curves. Power of the segmentation curves for models built with all attributes with (trad atts + NN) and without (trad atts) the network-neighbor attribute. The model with the NN attribute outperforms the model without it. For example, if the firm sent out 50% of the mailing, they would get 70% of the positive responses with the NN attribute compared to receiving only 63% of the responses without it. (b) Top-k analysis. Consumers are ranked by the probability scores from the logistic regression model. The model that includes the NN attribute outperforms the model without. For example, for the top 20% of targets, the take rate is 1.51% without the NN attribute and 1.72% with the NN attribute.

To visualize this improvement, Figure 5(a) shows cumulative response (“lift”) curves when using the model on loyalty group 3. The lower curve depicts the performance of the model using all traditional attributes, and the upper curve includes the traditional marketing attributes and the network-neighbor attribute. In Figure 5(b), one can see the marked improvement that
would be obtained from sending to the top-k prospects on the list. For example, for the top 20% of the list, without the NN attribute, the take rate is 1.51%; with the NN attribute, it is 1.72%. The NN attribute does not improve the ranking for the top 10% of the list.

5.6 Improving Performance By Adding More Sophisticated Network Attributes

Knowing whether a consumer is a network neighbor is one of the simplest indicators of consumer-to-consumer interaction that can be extracted from the network data. We now investigate whether augmenting the model with more sophisticated social-network information can add additional value. In this section, we focus on the social network that comprises (only) the current customers of this service (which here we will call “the network”), along with the periphery of prospects who have communicated with those on the network (the network neighbors). We investigate whether we can improve targeting by using more sophisticated measures of social relationship with the network of existing customers.

Table 7 summarizes a set of additional social-network attributes that we add to the logistic regression. The terminology we use is borrowed to some degree from the fields of social-network analysis and graph theory. Social-network analysis (SNA) involves measuring relationships (including information transmission) between people on a network. The nodes in the network represent people and the links between them represent relationships between the nodes. The SNA measures help quantify intuitive social notions, such as connectedness, influence, centrality, social importance and so on. Graph theory helps to understand problems better by representing them as interconnected nodes, and provides vocabulary and methods for operating mathematically.

TABLE 7
Network attribute descriptions

Attribute                    Description
Degree                       Number of unique customers communicated with before the mailing
Transactions                 Number of transactions to/from customers before the mailing
Seconds of communication     Number of seconds communicated with customers before mailing
Connected to influencer      Is an influencer in prospect’s local neighborhood?
Connected component size     Size of the connected component prospect belongs to
Max similarity               Max overlap in local neighborhood with any existing neighboring customer

Three of the attributes that we introduce can be derived from a prospect’s local neighborhood (the set of immediate communication partners on the network; recall that these all are current customers). Degree measures the number of direct connections a node has. Within the local neighborhood, we also count the number of Transactions, and the length of those transactions (Seconds of communication).

The network is made up of many disjoint subgraphs. Given a graph G = (V, E), where V is a set of vertices (nodes) and E is a set of links between them, the connected components of G are the sets of vertices such that all vertices in each set are mutually connected (reachable by some path) and no two vertices in different sets are connected. The size of the connected component may be an indicator for awareness of and positive views about the product. If a prospect is linked to a large set of “friends” all of whom have adopted the service, she may be more likely to adopt herself. Connected component size is the size of the largest connected component (in the network) to which the prospect is connected.

We also move beyond a prospect’s local neighborhood. Observing the local neighborhoods of a prospect’s local neighbors, we can define a measure of social similarity. We define social similarity as the size of the overlap in the immediate network neighborhoods of two consumers. Max similarity is the maximum social similarity between the prospect and any neighbors of the prospect. Finally, the firm also can observe the prior dynamics of its customers. In particular, the firm can observe which customers communicated before and/or after their adoption as well as the date customers signed up. Using this information, we define influencers as those subscribers who signed up for the service and, subsequently, we see one of their network neighbors sign up for the service. Connected to influencer is an indicator of whether the prospect is connected to one of these influencers. We appreciate that we do not actually know if there was true influence.

We use all of the aforementioned attributes and show AUC values for these predictive models in Table 8. We find that some of these network attributes have considerable predictive power individually and have even more value when combined. This is indicated by AUCs of 0.68 for both transactions and seconds of communication. We do not find high AUCs individually for connected component size, similarity or connected to influencer. Ultimately, we find that the logistic regression model built with the network attributes results in an AUC of 0.71 compared to an AUC of 0.66 without the network attributes (that is, using only the traditional marketing attributes described in previous sections). (Recall that this represents the ability to rank the network neighbors, who already have especially high take rates as a group, as we have shown.)

TABLE 8
ROC analysis

Attribute(s)                                          AUC
Transactions                                          0.68
Seconds of communication                              0.68
Degree                                                0.59
Connected to influencer                               0.53
Connected component size                              0.55
Similarity                                            0.55
All network                                           0.71
All traditional (loyalty, demographic, geographic)    0.66
All traditional + all network                         0.71

NOTE. AUC values result from logistic regression models built on each of the constructed network attributes individually, as well as in combination. Results are presented for loyalty-level 3 customers.

Interestingly, when we combine the traditional attributes with the network attributes, there is no additional gain in AUC, even though many of these attributes were shown to be significant in the broader analysis above. The similarities represented implicitly or explicitly in the network attributes seem to account for all useful information captured by traditional demographics and other marketing attributes. That traditional demographics and other marketing attributes do not add value is not only of theoretical interest, but practical as well, for example, in cases such as this where demographic data must be purchased.

Our result is further confirmed by the lift and take rate curves displayed in Figure 6(a) and (b), respectively. One can achieve substantially higher take rates using the new network attributes as compared to using the traditional attributes. For example, we find that for the top 20% of the targeted list, without the network attributes, the take rate is 2.2%; with the network attributes, it is 3.1%. Likewise, at the top 10% of the list, the take rate with the network attributes is 4.4% compared to 2.9% without them.

FIG. 6. (a) Lift curves. Power of segmentation curves for models built with all traditional attributes, with (trad atts + net) and without (trad atts) the network attributes. If the firm sent out 50% of the mailing, they would have received 77% of the positive responses with the network attributes compared to receiving 63% of the responses without the network attributes. (b) Top-k analysis. The model including the network attributes (trad atts + net) outperforms the model without them (trad atts). For example, for the top 20% of targets ranked by score, the take rate is 2.2% without the network attributes and 3.1% with the network attributes.

6. LIMITATIONS

We believe our study to be the first to combine data on direct customer communication with data on product adoption to show the effect of network-based marketing statistically. However, there are limitations in our study that are important to point out.

There are several types of missing, incomplete or unreliable data which could influence our results. We have records of all of the communication (using the
firm’s service) to and from current customers of the service. That is not true for all the network-neighbor consumers. As such, we do not have complete information about the network-neighbor targets (as well as the non-network-neighbor targets). In addition, some of the attributes we used were collected by purchasing data from external sources. These data are known to be at least partially erroneous and outdated, although it is not well known how much so. An additional problem is joining data on customers from external sources to internal communication data, leading to missing data or sometimes just blatantly incorrect data. Finally, telecommunications firms are not legally able to collect information regarding the actual content of the communication, so we are not able to determine if the consumers in question discussed the product. In this regard, our data are inferior to some other domains where content is visible, such as Internet bulletin boards or product discussion forums.

We expect the network-neighbor effect to manifest itself differently for different types of products. Most of the studies done to date on viral marketing have focused on the types of products that people are likely to talk about, such as a new, high-tech gadget or a recently released movie. We expect there to be less buzz for less “sexy” products, like a new deodorant or a sale on grapes at the supermarket. The study presented in this paper involves a new telecommunications service, which involves a new technology and features that consumers have perhaps never been exposed to before. The firm hopes the new technology and features are such that they would encourage word of mouth.

What can we say about other products that might not be quite so buzz-worthy? To study this, we compared the new service studied here to a roll-out of another product by the same firm. This other product was simply a new pricing plan for an older telecommunications service. Customers who signed up for this new plan could stand to save a significant amount of money, depending on their current usage patterns. However, the range and variety of telecommunications pricing plans in the marketplace is so extensive and so confusing to the typical consumer that we do not believe that this is the type of product that would generate a lot of discussion between consumers. We refer to the two products as the pricing plan and the new technology. For the pricing plan, we have the same knowledge of the network as we do for the new technology. For those consumers who belong to the pricing plan, we know who they communicate with and then we can follow these network-neighbor candidates to see if they ultimately sign up for the plan. We construct a measure of “network neighborness” as follows. For a series of consecutive months, we gather data for all customers who ordered the product in that month. We calculate the percentage of these new customers who were network neighbors, that is, those who had previously communicated with a user of the product. This percentage is a measurement of the proportion of new sales being driven by network effects. By comparing this percentage across two products, we get insight into which product stimulates network effects more.

We now look at this value for our two products over an 8-month period. The time period for the two products was chosen so that it would be within the first year after the product was broadly available. The results are shown in Figure 7. The two main points to

take away are that the new service has a higher percent of purchasers who are network neighbors and also an increasing one (except for the dip in month 5). In contrast, the pricing plan has a flat network-neighbor percentage, never increasing above 3%.

FIG. 7. Network-neighborness plot for new service versus pricing plan.

Interestingly, the dip in the plot for the new service corresponds exactly to the month of the direct marketing discussed earlier. Before the campaign, we can see that the network-neighbor effect was increasing, that more and more of the purchasers in a given month were network neighbors. During the mass marketing campaign, we exposed many non-network neighbors to the service and many of them ended up purchasing it, temporarily dropping the network-neighbor percentage. After the campaign, we see the network-neighbor percentage starting to increase again.

This network-neighborness measure should not be confused with the success of the product, as the pricing plan was quite successful from a sales perspective, but it does suggest that the pricing plan is a product that has less of a network-based spread of information. This difference might be due to the new service creating more word-of-mouth or perhaps we are seeing the effects of homophily. People who interact with each other are more likely to be similar in their propensity for purchasing the new service than in their propensity for purchasing a particular pricing plan. Again, the effects of word of mouth versus homophily are difficult to discern without knowing the content of the communication.

7. DISCUSSION

One of the main concerns for any firm is when, how and to whom they should market their products. Firms make marketing decisions based on how much they know about their customers and potential customers. They may choose to mass market when they do not know much. With more information, they may market directly based on some observed characteristics. We provide strong evidence that whether and how well a consumer is linked to existing customers is a powerful characteristic on which to base direct marketing decisions. Our results indicate that a firm can benefit from the use of social networks to predict the likelihood of purchasing. Taking the network data into account improves significantly and substantially on both the firm’s own marketing “best practices” and our best efforts to collect and model with traditional data.

The sort of directed network-based marketing that we study here has applicability beyond traditional telecommunications companies. For example, eBay recently purchased Internet-telephony upstart Skype for $2.6 billion; they now also will have large-scale, explicit data on who talks to whom. With gmail, Google’s e-mail service, Google now has access to explicit networks of consumer interrelationships and already is using gmail for marketing; directed network-based marketing might be a next step. Various systems have emerged recently that provide explicit linkages between acquaintances (e.g., MySpace, Friendster, Facebook), which could be fruitful fields for network-based marketing. As more consumers create interlinked blogs, another data source arises. More generally, these results suggest that such linkage data potentially could be a sort of data considered for acquisition by many types of firms, as purchase data now are being collected routinely by many types of retail firms through loyalty cards. Even academic departments could benefit from such data; for example, the enrollment in specialized classes could be bolstered by “marketing” to those linked to existing students. Such links exist (e.g., via e-mail). It remains to design tactics for using them that are acceptable to all.

It is tempting to argue that we have shown that customers discuss the product and that discussion helps to improve take rates. However, word of mouth is not the only possible explanation for our result. As discussed in detail above, it may be that the network is a powerful source of information on consumer homophily, which is in accord with social theories (Blau, 1977; McPherson, Smith-Lovin and Cook, 2001). We have tried to control for homophily by using a propensity-matched sample to produce our logistic regression model. However, it may well be that direct communication between people is a better indicator of deep similarity
than any demographic or geographic attributes. Either cause, homophily or word of mouth, is interesting both theoretically and practically.

ACKNOWLEDGMENTS

We would like to thank DeDe Paul and Deepak Agarwal of AT&T, as well as Chris Dellarocas of the University of Maryland, for useful discussions and helpful suggestions. We would also like to thank three anonymous reviewers who offered insightful comments on previous drafts.

REFERENCES

ADOMAVICIUS, G. and TUZHILIN, A. (2005). Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowledge and Data Engineering 17 734–749.
AGARWAL, D. and PREGIBON, D. (2004). Enhancing communities of interest using Bayesian stochastic blockmodels. In Proc. Fourth SIAM International Conference on Data Mining. SIAM, Philadelphia.
BASS, F. M. (1969). A new product growth for model consumer durables. Management Sci. 15 215–227.
BLAU, P. M. (1977). Inequality and Heterogeneity: A Primitive Theory of Social Structure. Free Press, New York.
BOWMAN, D. and NARAYANDAS, D. (2001). Managing customer-initiated contacts with manufacturers: The impact on share of category requirements and word-of-mouth behavior. J. Marketing Research 38 281–297.
BRIN, S. and PAGE, L. (1998). The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems 30 107–117.
CASE, A. C. (1991). Spatial patterns in household demand. Econometrica 59 953–965.
CHAN, E. and STOLFO, S. (1998). Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In Proc. Fourth International Conference on Knowledge Discovery and Data Mining 164–168. AAAI Press, Menlo Park, CA.
CLEARWATER, S. H. and STERN, E. G. (1991). A rule-learning program in high-energy physics event classification. Computer Physics Communications 67 159–182.
DELLAROCAS, C. (2003). The digitization of word of mouth: Promise and challenges of online feedback mechanisms. Management Sci. 49 1407–1424.
DOMINGOS, P. and RICHARDSON, M. (2001). Mining the network value of customers. In Proc. Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 57–66. ACM Press, New York.
DUMOUCHEL, W., VOLINSKY, C., JOHNSON, T., CORTES, C. and PREGIBON, D. (1999). Squashing flat files flatter. In Proc. Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 6–15. ACM Press, New York.
FICHMAN, R. G. (2004). Going beyond the dominant paradigm for information technology innovation research: Emerging concepts and methods. J. Assoc. Information Systems 5 314–355.
FILDES, R. (2003). Review of New-Product Diffusion Models, by V. Mahajan, E. Muller and Y. Wind, eds. Internat. J. Forecasting 19 327–328.
FRENZEN, J. and NAKAMOTO, K. (1993). Structure, cooperation, and the flow of market information. J. Consumer Research 20 360–375.
GETOOR, L. (2005). Tutorial on statistical relational learning. Inductive Logic Programming, 15th International Conference. Lecture Notes in Comput. Sci. 3625 415. Springer, Berlin.
GETOOR, L., FRIEDMAN, N., KOLLER, D. and PFEFFER, A. (2001). Learning probabilistic relational models. In Relational Data Mining (S. Džeroski and N. Lavrač, eds.) 307–338. Springer, Berlin.
GETOOR, L., FRIEDMAN, N., KOLLER, D. and TASKAR, B. (2003). Learning probabilistic models of link structure. J. Mach. Learn. Res. 3 679–707. MR1983942
GETOOR, L. and SAHAMI, M. (1999). Using probabilistic relation models for collaborative filtering. In Proc. WEBKDD 1999, San Diego, CA.
GLADWELL, M. (1997). The coolhunt. The New Yorker March 17, 78–88.
GLADWELL, M. (2002). The Tipping Point: How Little Things Can Make a Big Difference. Back Bay Books, Boston.
HIGHTOWER, R., BRADY, M. K. and BAKER, T. L. (2002). Investigating the role of the physical environment in hedonic service consumption: An exploratory study of sporting events. J. Business Research 55 697–707.
HOFF, P. D., RAFTERY, A. E. and HANDCOCK, M. S. (2002). Latent space approaches to social network analysis. J. Amer. Statist. Assoc. 97 1090–1098. MR1951262
HUANG, Z., CHUNG, W. and CHEN, H. C. (2004). A graph model for E-commerce recommender systems. J. Amer. Soc. Information Science and Technology 55 259–274.
JAPKOWICZ, N. and STEPHEN, S. (2002). The class imbalance problem: A systematic study. Intelligent Data Analysis 6 429–449.
JOSHI, M., KUMAR, V. and AGARWAL, R. (2001). Evaluating boosting algorithms to classify rare classes: Comparison and improvements. In Proc. IEEE International Conference on Data Mining 257–264. IEEE Press, Piscataway, NJ.
KAUTZ, H., SELMAN, B. and SHAH, M. (1997). Referral web: Combining social networks and collaborative filtering. Comm. ACM 40(3) 63–65.
KLEINBERG, J. (1999). Authoritative sources in a hyperlinked environment. J. ACM 46 604–632. MR1747649
KUMAR, V. and KRISHNAN, T. V. (2002). Multinational diffusion models: An alternative framework. Marketing Sci. 21 318–330.
LIBEN-NOWELL, D. and KLEINBERG, J. (2003). The link prediction problem for social networks. In Proc. Twelfth International Conference on Information and Knowledge Management 556–559. ACM Press, New York.
LINDEN, G., SMITH, B. and YORK, J. (2003). Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Computing 7 76–80.
MACSKASSY, S. and PROVOST, F. (2004). Classification in networked data: A toolkit and a univariate case study. CeDER Working Paper #CeDER-04-08, Stern School of Business, New York University.
MAHAJAN, V., MULLER, E. and KERIN, R. (1984). Introduction strategy for new products with positive and negative word-of-mouth. Management Sci. 30 1389–1404.
MCCULLAGH, P. and NELDER, J. A. (1983). Generalized Linear Models. Chapman and Hall, New York. MR0727836
MCPHERSON, M., SMITH-LOVIN, L. and COOK, J. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology 27 415–444.
MEASE, D., WYNER, A. and BUJA, A. (2006). Boosted classification trees and class probability/quantile estimation. J. Mach. Learn. Res. To appear.
MONTGOMERY, A. L. (2001). Applying quantitative marketing techniques to the Internet. Interfaces 31(2) 90–108.
NEWTON, J. and GREINER, R. (2004). Hierarchical probabilistic relational models for collaborative filtering. In Proc. Workshop on Statistical Relational Learning, 21st International Conference on Machine Learning. Banff, Alberta, Canada.
PAUMGARTEN, N. (2003). No. 1 fan dept. acknowledged. The New Yorker May 5.
PERLICH, C. and PROVOST, F. (2006). Distribution-based aggregation for relational learning with identifier attributes. Machine Learning 62 65–105.
POOLE, D. (2004). Estimating the size of the telephone universe. A Bayesian Mark-recapture approach. In Proc. Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 659–664. ACM Press, New York.
RICHARDSON, M. and DOMINGOS, P. (2002). Mining knowledge-sharing sites for viral marketing. In Proc. Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 61–70. ACM Press, New York.
ROGERS, E. M. (2003). Diffusion of Innovations, 5th ed. Free Press, New York.
ROSENBAUM, P. R. and RUBIN, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. J. Amer. Statist. Assoc. 79 516–524.
TOUT, K., EVANS, D. J. and YAKAN, A. (2005). Collaborative filtering: Special case in predictive analysis. Internat. J. Computer Mathematics 82 1–11. MR2159280
UEDA, T. (1990). A study of a competitive Bass model which takes into account competition among firms. J. Operations Research Society of Japan 33 319–334.
VAN DEN BULTE, C. and LILIEN, G. L. (2001). Medical innovation revisited: Social contagion versus marketing effort. American J. Sociology 106 1409–1435.
WALKER, R. (2004). The hidden (in plain sight) persuaders. The New York Times Magazine Dec. 5, 69–75.
WEISS, G. and PROVOST, F. (2003). Learning when training data are costly: The effect of class distribution on tree induction. J. Artificial Intelligence Research 19 315–354.
WEISS, G. M. (2004). Mining with rarity: A unifying framework. ACM SIGKDD Explorations Newsletter 6 7–19.
YANG, S. and ALLENBY, G. M. (2003). Modeling interdependent consumer preferences. J. Marketing Research 40 282–294.
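
As an illustration of the ranking measures used in Section 5.5, the following Python sketch computes the AUC directly from its Wilcoxon–Mann–Whitney interpretation (the probability that a randomly chosen taker is scored above a randomly chosen nontaker) and the take rate among the top-k prospects. It is not the code used in the study. The function names, the tie handling (a tie counts one half) and the toy scores are assumptions made only for this example.

```python
from itertools import product


def auc_wmw(scores, labels):
    """AUC as the probability that a random taker outscores a random nontaker.

    A tie between a taker and a nontaker counts as one half (an assumption).
    """
    takers = [s for s, y in zip(scores, labels) if y == 1]
    nontakers = [s for s, y in zip(scores, labels) if y == 0]
    if not takers or not nontakers:
        raise ValueError("need at least one taker and one nontaker")
    wins = 0.0
    for t, n in product(takers, nontakers):
        if t > n:
            wins += 1.0
        elif t == n:
            wins += 0.5
    return wins / (len(takers) * len(nontakers))


def top_k_take_rate(scores, labels, fraction):
    """Take rate among the top `fraction` of prospects ranked by score."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(round(fraction * len(ranked))))
    return sum(y for _, y in ranked[:k]) / k


# Hypothetical model scores and outcomes (1 = taker), for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0, 1, 0, 0]
print(auc_wmw(scores, labels))
print(top_k_take_rate(scores, labels, 0.25))
```

The pairwise loop is quadratic in the number of prospects, which is acceptable for a small illustration; a rank-based computation would be the natural choice for data of the size described in the paper.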

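
Three of the network attributes described in Section 5.6 and Table 7 (Degree, Connected component size and Max similarity) can be sketched in a few lines of Python. This is one possible reading of the definitions, not the authors' implementation. The data layout (a dictionary of symmetric links among current customers, and a set of customers a prospect has communicated with), the function names and the toy network are all hypothetical.

```python
from collections import deque


def component_sizes(customer_links):
    """Map each customer to the size of its connected component (via BFS)."""
    size_of = {}
    for start in customer_links:
        if start in size_of:
            continue
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in customer_links[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        for node in seen:
            size_of[node] = len(seen)
    return size_of


def prospect_attributes(prospect_contacts, customer_links):
    """Degree, connected component size and max similarity for one prospect."""
    sizes = component_sizes(customer_links)
    # Degree: number of unique customers the prospect communicated with.
    degree = len(prospect_contacts)
    # Largest customer component that the prospect touches.
    comp_size = max((sizes[c] for c in prospect_contacts), default=0)
    # Largest neighborhood overlap with any neighboring customer (assumed reading).
    max_sim = max(
        (len(prospect_contacts & customer_links[c]) for c in prospect_contacts),
        default=0,
    )
    return {"degree": degree, "component_size": comp_size, "max_similarity": max_sim}


# Hypothetical network: symmetric links among current customers only.
customer_links = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"e"},
    "e": {"d"},
}
# Hypothetical prospect who communicated with customers a, b and d.
print(prospect_attributes({"a", "b", "d"}, customer_links))
```

The sketch assumes every customer appears as a key in `customer_links` and that links are stored in both directions; Transactions and Seconds of communication would be simple sums over call records and are omitted.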