In recent years, social media has become an important part to increase. As of September 2016, approximately 1.79 billion
[Figure: curves for shopping sites (sh) and social networks (sn); x-axis: Time Spent with the Site (t); y-axis ticks from 0 to 2.5.]
Notes: All else equal, the information extracted per unit of time spent on shopping sites yields higher returns than time spent on social networks: $\tilde{I}_{sh} > \tilde{I}_{sn}$ (left side of graph). Over the long run, the incremental daily time t spent on social networking sites could result in greater informational value ($\tilde{I}_{sn}$) than if that time t were spent on the shopping sites that have already received a substantial amount of the consumer’s attention (the right side of the graph: $\hat{I}_{sh}$).
Using the model described in Equation 1, we can assess the impact of cumulative social network usage (e.g., daily usage) by comparing the accumulation of informational value under N and n days of social network usage, where N > n. In the latter scenario, there are N - n days with no social network usage. The expected informational value on each of these no-social-network days is always $I_{sh} \cdot \ln(T)$. On days with social network usage, we calculate (in Web Appendix A) the expected informational value as follows:

(5) $E[\text{informational value}] = I_{sn}[\ln(T) - 1] + I_{sh}[\ln(T) - 1].$

If we compare these two cases, using social networking sites over N days (compared with n days) generates more informational value under the following condition:

(6) $\frac{I_{sn}}{I_{sh}} > \frac{1}{\ln(T) - 1}.$

In other words, as long as there is informational content available on social networks, allocating a small amount of time $\varepsilon$ to social networking sites increases the amount of informational value, which is correlated with increased purchase activity. It follows that if $\varepsilon$ is measured in smaller units (i.e., seconds), T becomes large (total available seconds), and $1/[\ln(T) - 1]$ becomes small.

We note that this is different from the substitution effect proposed previously. In the prior example, we assume that consumers already spend time on social networking sites and that the variation in immediate time spent (greater or less) is correlated with differences in e-commerce activity. Thus, conditional on a person’s already allocating time to social networking activities on a particular day, more time spent on these sites results in less informational value because less time is spent on shopping sites. Here, with respect to cumulative social network usage, we are comparing variation in day-to-day social network usage incidence across an extended period of time (days or even months). Our assumption is that consumers can obtain new information on social networking sites on a day-to-day basis, providing them with greater marginal value relative to their initial daily time allocation to social networking activities. Thus, a consumer who frequently engages in social networking activities each day of the week, compared with a consumer who visits these sites only once per week, will potentially generate greater marginal informational value. Overall, this suggests that, in the long run, there is a positive correlation between social networking usage and shopping activities.

Of course, this outcome relies on the premise of new informational value on social networking sites across different days. We argue that this is indeed the case. Consumers are often exposed to product- and consumption-related content as well as opinions and eWOM posted by other consumers (e.g., Chen, Wang, and Xie 2011; Chevalier and Mayzlin 2006; Chintagunta, Gopinath, and Venkataraman 2010; Godes and
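To make the condition in Equation 6 concrete, the following minimal sketch evaluates the threshold 1/[ln(T) - 1] for different time units. Treating T as one day expressed in hours, minutes, or seconds is an illustrative assumption, and the function names and the sample values of I_sn and I_sh are hypothetical.

```python
import math


def sn_threshold(total_time_units: float) -> float:
    """Right-hand side of Equation 6: 1 / [ln(T) - 1]."""
    if total_time_units <= math.e:
        raise ValueError("T must exceed e so that ln(T) - 1 is positive")
    return 1.0 / (math.log(total_time_units) - 1.0)


def more_value_with_sn(i_sn: float, i_sh: float, total_time_units: float) -> bool:
    """Check Equation 6: I_sn / I_sh > 1 / [ln(T) - 1]."""
    return i_sn / i_sh > sn_threshold(total_time_units)


# Assumption: T is one day, measured in progressively finer units.
thresholds = {
    unit: sn_threshold(t)
    for unit, t in [("hours", 24), ("minutes", 24 * 60), ("seconds", 24 * 60 * 60)]
}

for unit, value in thresholds.items():
    print(f"T in {unit}: threshold = {value:.4f}")
```

The threshold shrinks as the unit of measurement gets finer, which matches the observation that 1/[ln(T) - 1] becomes small when T is counted in seconds.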
Observation Period
when a user opens a website and ends when the user leaves the website by closing the page, progressing to a different domain on the same browser page, or remaining inactive for more than 30 minutes) because it accounts for visits to multiple websites in one sitting. We choose to analyze our data at the Internet session level rather than at the website session level because the latter raises simultaneity concerns. Specifically, once a user opens multiple websites, we are unable to identify the order of use within these concurrent sessions. In contrast, we are able to identify distinct Internet sessions without overlap in usage activity because, by definition, Internet sessions are separated by periods with no Internet usage.

Online Purchase Records

The data set includes all online purchases made by the panelists between January 1 and December 31, 2007. For each purchase, we know who made the purchase, the purchase date, the retailer (i.e., website domain) where the purchase was made, and the category of purchase. We observe 140,291 distinct purchase transactions made during this period by 7,402 unique users. Figure 3 shows a distribution of purchase activity by user. Of our panelists, 73% purchased at least once during the observation period, and conditional on having made a purchase, each person made an average of 19 purchases over the course of the year.11 We show the aggregate purchasing activity for our panel in Figure 4. As expected, we observe seasonal fluctuations (e.g.,

records, on a session basis, whether individual i made a purchase in session t. The second variable, NUMPURCHASE_it, records the number of purchases made in session t, which represents a panelist’s shopping activity intensity for that session. The third variable, NUMRETAILERS_it, records the number of retailers from which a panelist made purchases in each session. We measure each of these variables at the individual session level. Table 1 provides the summary statistics of these variables.

Web-Browsing History

The second component of our data set is a detailed history of website visitation for all members of the panel during the observation period. This includes records of the domain (e.g., www.facebook.com) and domain pages, a time stamp of each website visit, the website category, and duration of the visit. The majority of Internet usage by our panelists in 2007 was related to

FIGURE 3
Distribution of Purchase Activity Per User
[Figure: y-axis: Number of Users.]
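The idea that Internet sessions are separated by periods with no Internet usage can be sketched as an interval-merging step over website visits. This is only an illustration: the passage does not state how long an idle period must be to separate two Internet sessions, so the `idle_gap` parameter (set here to 30 minutes, borrowed from the website-session inactivity rule) is an assumption, as are the function name and the sample visits.

```python
from datetime import datetime, timedelta


def internet_sessions(visits, idle_gap=timedelta(minutes=30)):
    """Group (start, end) website visits into non-overlapping Internet sessions.

    A visit that overlaps the running session, or starts within `idle_gap`
    of its end, is merged into it. `idle_gap` is an assumed parameter.
    """
    sessions = []
    for start, end in sorted(visits):
        if sessions and start - sessions[-1][1] <= idle_gap:
            # Extend the current session so that it covers this visit.
            sessions[-1][1] = max(sessions[-1][1], end)
        else:
            # A long enough period with no usage starts a new session.
            sessions.append([start, end])
    return [tuple(s) for s in sessions]


def at(hour, minute):
    """Hypothetical helper: a time stamp on an arbitrary day in 2007."""
    return datetime(2007, 1, 15, hour, minute)


example_visits = [
    (at(9, 0), at(9, 20)),    # first site
    (at(9, 10), at(9, 40)),   # second site, open concurrently
    (at(9, 50), at(10, 0)),   # short pause, same sitting
    (at(13, 0), at(13, 5)),   # after a long idle period
]
example_sessions = internet_sessions(example_visits)
```

With these sample visits, the two concurrent website visits and the one that follows shortly after collapse into a single Internet session, so the order of use within concurrent visits no longer needs to be identified.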
12. We also investigate blog browsing activities. The results appear in Web Appendix D.

13. We note that Facebook launched its mobile app in July 2008 (after our observation period). Therefore, we expect our data to capture the majority of usage because people are limited to web browser–based interactions.

14. We note that social networks in 2007 (our data time frame) are different from the dominant platforms in the present. In 2007, Facebook contained a news feed on individual profiles with photos and information about groups, interests, updates, and other user-generated content. Myspace was similar in that it included a profile and also allowed users to share posted information with friends. At present, Facebook contains an aggregated news feed that combines postings of all friends. We expect that the current format exposes users to a greater variety of consumption-related content (e.g., sponsored product placement) than the format in our data set.

15. One possible way to deal with left censoring is to drop the early observations in our data. We run a robustness check by dropping the left-censored users from the analysis. The results are consistent with our main results and are available in Web Appendix C.

16. To explore the validity of this classification of new versus existing users, we measure the average interarrival time to social networking websites. We find that, on average, consumers visit these social networking websites about once a week. For social networking websites, the mean number of days between usage, across individuals, is 6.8 (SD = 5.67). Therefore, we expect that the average user is a new user if (s)he has not visited a social networking website during the first two weeks. We also calculate new users as those for whom we observe a first usage on social networking websites four weeks after the beginning of our observational period. The results of this robustness check (available on request) are equivalent in terms of sign and significance.
[Figure: number of users by day of first observed social media usage; x-axis: Day of First Observed Social Media Usage (1-Jan to 31-Jan); y-axis: Number of Users (0 to 1,000).]
records the total number of days that user i has been active on social networks before session t.17 For example, we set this variable to 1 during the first day a user actively visits social networking websites. For the second day with an observed visit, we set this variable to 2, and so on. We use this variable to capture cumulative usage of social networking sites.

Fourth, we created measures related to users’ shopping browsing behavior. We created two variables to account for this: shopping session count and shopping duration. SHOP-SESSION_it records the number of unique visits user i made to e-commerce websites in session t, and SHOP-DURATION_it records the total time (minutes) user i spent on e-commerce websites in session t. We also record the total Internet session count and duration for user i in session t, exclusive of shopping or social networks, as covariates (NET-SESSION_it and NET-DURATION_it). These variables are important because they allow us to control for individual i’s Internet activity in session t. In other words, they capture the variation in a user’s Internet activity over time. In addition, we include cumulative measures of shopping and Internet usage (days that user i has visited the respective site before session t), labeled SHOP-CUMULATIVE_it and NET-CUMULATIVE_it.

Fifth, in addition to social network and shopping metrics, our web-browsing history data enable us to infer referral metrics that identify which domains help lead consumers to online retailers. Although we do not have clickstream data to identify referral sources precisely, we use our web history data to infer referrals to e-commerce websites. Specifically, we define a referral site as one that a user visited immediately before going to an e-commerce website if the referral session ended less than five minutes from the start of the shopping session. Therefore, if we do not observe any activity within five minutes before a shopping session, we assume that the shopping session was not initiated by a referral. The referral activity allows us to assess the relationship of the previously visited website with purchase activity. Thus, this helps us identify potential referral effects of social networks, where consumers may discover a new product or referral link and immediately transition to an e-commerce website, from the cumulative effects of social networks. We separately identify three different categories of referral websites: social network, search, and shopping (see Table 2). There are many referrals from both e-commerce websites and search engines (i.e., Google and Yahoo). In contrast, social network visits precede only about 2% of all shopping sessions as referrals. Using this referral information, we created three variables: (1) SN-REFERRAL_it, (2) SEARCH-REFERRAL_it, and (3) SHOP-REFERRAL_it. These are dummy variables equal to 1 if we observe a referral to an e-commerce website for individual i in session t.

Demographics and Control Variables

The final part of the data set comprises demographic information for each panelist and several control variables. Table 3 provides summary statistics on age, gender, household size, income, and number of working members in a household, as well as an indicator for children in the household. Note that we do not have a continuous measure for income, which we instead measure as one of four categories (see Table 3). We use these demographic variables to control for observed heterogeneity in our model. Note that the average age is greater than what might be expected on social networks because our data set captures a representative sample of the U.S. population.

Empirical Model

We now describe how we model the relationship of people’s immediate usage of and engagement with social networking sites with their online shopping activities. As we mentioned previously, shopping activity is indicated by three variables: (1) a binary session-level purchase action, (2) the number of purchases made each day, and (3) the number of retailers

17. To test robustness, we also calculate this variable as the number of sessions and duration on social networking sites. The results are consistent with our main results and are available on request.
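The five-minute referral rule described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the tuple layout, the category labels ("social", "search", "shopping"), and the function name are hypothetical, and a preceding visit that overlaps the start of the shopping visit is treated here as activity within the window.

```python
from datetime import datetime, timedelta

REFERRAL_WINDOW = timedelta(minutes=5)  # "less than five minutes" in the text


def infer_referrals(visits):
    """Infer the referral category for each shopping visit.

    `visits` is a list of (start, end, category) tuples sorted by start time.
    Returns a list of (shopping_start, referral_category_or_None).
    """
    referrals = []
    for idx, (start, _end, category) in enumerate(visits):
        if category != "shopping":
            continue
        previous = visits[idx - 1] if idx > 0 else None
        # The visit immediately before counts as a referral only if it
        # ended less than five minutes before the shopping visit began.
        if previous is not None and start - previous[1] < REFERRAL_WINDOW:
            referrals.append((start, previous[2]))
        else:
            referrals.append((start, None))
    return referrals


def at(hour, minute):
    """Hypothetical helper: a time stamp on an arbitrary day in 2007."""
    return datetime(2007, 1, 15, hour, minute)


example_visits = [
    (at(10, 0), at(10, 2), "search"),
    (at(10, 3), at(10, 20), "shopping"),   # search ended 1 minute earlier
    (at(12, 0), at(12, 5), "social"),
    (at(12, 30), at(12, 40), "shopping"),  # social ended 25 minutes earlier
]
example_referrals = infer_referrals(example_visits)

# Dummy variables in the spirit of SN-REFERRAL and SEARCH-REFERRAL.
sn_referral = [int(ref == "social") for _, ref in example_referrals]
search_referral = [int(ref == "search") for _, ref in example_referrals]
```

In the sample data, only the first shopping visit is attributed to a referral (from search); the second follows social network activity by 25 minutes and is therefore treated as not initiated by a referral.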
| Category | Variable | Description |
|---|---|---|
| Social network | SN-SESSION | Indicator for visit to social network websites during the previous session |
| | SN-DURATION | Total duration during visits to social network websites during the previous session |
| | SN-ENGAGEMENT | Variable indicating the number of days the user has visited a social networking site |
| | SN-ACTIVE | Dummy variable indicating whether user has ever visited a social networking site |
| | SN-FIRSTMONTH | Dummy variable indicating whether user has visited the social networking site during the first month |
| Internet | SHOP-SESSION | Indicator for visit to shopping websites during the previous session |
| | SHOP-DURATION | Total duration during visits to shopping websites during the previous session |
| | SHOP-CUMULATIVE | Variable indicating the number of days the user has visited a shopping site |
| | NET-SESSION | Indicator for visit to any website (exclusive of shopping or social networks) during the previous session |
| | NET-DURATION | Total duration during visits to all websites (exclusive of shopping or social networks) during the previous session |
| | NET-CUMULATIVE | Variable indicating the number of days the user has visited any non-social-media, nonshopping site |
| Referral | SN-REFERRAL | Dummy variable indicating whether the referral website was from social networks |
| | SEARCH-REFERRAL | Dummy variable indicating whether the referral website was from search |
| | SHOP-REFERRAL | Dummy variable indicating whether the referral website was from other shopping sites |
| Demographics | GENDER | Gender of user |
| | HOUSEHOLD-SIZE | Number of people in user’s household |
| | AGE | Age of user |
| | INCOME | Income of user |
| | WORKING-MEMBERS | Number of working members in user’s household |
| | CHILDREN | Dummy indicating whether user has children |
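The cumulative variables in the table above (SN-ENGAGEMENT, SHOP-CUMULATIVE, NET-CUMULATIVE) count the days on which a user has visited the respective type of site. A minimal sketch of such a running count is shown below. It assumes the count includes the day of the current session, which is one reading of the rule that the variable equals 1 on the first day with an observed visit; the function name and sample data are hypothetical.

```python
from datetime import date


def cumulative_active_days(session_days, active_flags):
    """Running count of distinct days with activity, per session.

    `session_days` holds the calendar day of each session (chronological
    order) and `active_flags` marks whether the user visited the site type
    of interest in that session. Assumption: the current day is included.
    """
    seen_days = set()
    counts = []
    for day, active in zip(session_days, active_flags):
        if active:
            seen_days.add(day)
        counts.append(len(seen_days))
    return counts


example_days = [
    date(2007, 1, 1),
    date(2007, 1, 1),  # second session on the same day
    date(2007, 1, 2),  # no social network visit
    date(2007, 1, 3),
    date(2007, 1, 5),
]
example_flags = [True, True, False, True, True]
example_engagement = cumulative_active_days(example_days, example_flags)
```

Two sessions on the same day count once, so the measure tracks days of usage rather than the number of sessions.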
such as those found in Figure 4 during the holiday shopping period. In addition, given the early stage of social network adoption in 2007, $z_t$ also controls for time-varying changes in consumer behavior with regard to social network usage and online shopping. For identification purposes, we assume that $z_t$ is independent of the other error terms in the model and is normally distributed with a mean fixed at 0 ($z_t \sim N(0, \sigma)$, where $\sigma$ is the variance hyperparameter for the prior distribution of $z_t$).

Finally, simultaneity may be a concern. Simultaneity occurs if consumers concurrently use social networking websites and shop online. We control for this issue by including lagged social network activity in the model (i.e., lagged duration and sessions) and examine the relationship of these lagged variables with a consumer’s current online purchase activity.20 Thus, we assume that a person’s social network usage at time t - 1 is not affected by that person’s future shopping activity at time t.

20. As a check, we run a robustness test with same-day social network variables, and the findings are consistent.
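The use of lagged social network activity can be illustrated with a short sketch that attaches each user's previous-session values to the current session. The field names (`user`, `t`, `sn_session`, `sn_duration`) and the function name are hypothetical, and leaving the lag undefined for a user's first observed session is an assumption.

```python
def add_lags(sessions, fields=("sn_session", "sn_duration")):
    """Attach previous-session values of `fields` to each session row.

    `sessions` is a list of dicts sorted by time within each user. The lag
    is None for a user's first observed session (an assumption).
    """
    previous_by_user = {}
    lagged = []
    for row in sessions:
        prev = previous_by_user.get(row["user"])
        new_row = dict(row)
        for field in fields:
            new_row["lag_" + field] = prev[field] if prev is not None else None
        lagged.append(new_row)
        previous_by_user[row["user"]] = row
    return lagged


example_rows = [
    {"user": "a", "t": 1, "sn_session": 1, "sn_duration": 12.0},
    {"user": "a", "t": 2, "sn_session": 0, "sn_duration": 0.0},
    {"user": "b", "t": 1, "sn_session": 1, "sn_duration": 5.0},
]
example_lagged = add_lags(example_rows)
```

Each row then pairs social network usage at t - 1 with purchase activity at t, so the regressors are measured before the outcome.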
| Variable | Column 1 | Column 2 | Column 3 |
|---|---|---|---|
| Other Variables | | | |
| LAG-Y | .0354a [.0467, .0277] | .0641a [.0982, .0316] | .1903a [.2312, .1554] |
| NET-SESSION | -.0132a [-.0035, -.0204] | -.2324a [-.1657, -.3102] | -.3704a [-.2831, -.4229] |
| NET-DURATION | .0016 [.0111, -.0074] | .0492a [.0694, .0257] | .0510a [.0735, .0288] |
| SHOP-SESSION | .0118a [.0175, .0054] | .4887a [.5535, .4165] | .7140a [.82, .6115] |
| SHOP-DURATION | .0078a [.0116, .0039] | .1692a [.1973, .1444] | .1387a [.1629, .1102] |
| NET CUMULATIVE SESSION | .0006 [.0378, -.0358] | -.8839a [-.7629, -.9691] | -.8839a [-.7789, -.9719] |
| SHOP CUMULATIVE SESSION | .0006 [.0161, -.0140] | .1284a [.1803, .0629] | .1638a [.2122, .0865] |
| SEARCH-REFERRAL | .0075a [.0153, .0020] | N.A. | N.A. |
| Demographic Variables | | | |
| AGE | .0010a [.0013, .0007] | .0719a [.0845, .0595] | .0157a [.0258, .009] |
| GENDER | -.0059 [.002, -.013] | -.0613a [-.0553, -.0683] | -.0431a [-.0306, -.0649] |
| HOUSEHOLD-SIZE | .0035a [.0085, .0005] | .0262a [.0314, .0195] | -.0627a [-.0527, -.0735] |
| INCOME | .0261 [.0381, -.0124] | -.0047 [.0004, -.0107] | .0762a [.0837, .071] |
| WORKING-MEMBERS | .0111a [.0161, .0066] | -.0082 [.0004, -.014] | -.1082a [-.0999, -.1167] |
| CHILDREN | .0118a [.021, .0048] | .0893a [.097, .0855] | -.0478a [-.0329, -.0575] |
| INCOME2 | .0101a [.0133, .0054] | .0435a [.0517, .0279] | .0238a [.0291, .0016] |
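As a reading aid for the bracketed intervals above, the sketch below checks whether an interval excludes zero. It assumes the brackets are credible intervals reported as [upper, lower] and that the "a" superscript marks estimates whose interval excludes zero; neither is stated in this excerpt, so both are assumptions. The values are copied from the first column of the table, and the function name is hypothetical.

```python
def excludes_zero(interval):
    """Return True if the interval lies entirely above or below zero."""
    lower, upper = min(interval), max(interval)
    return lower > 0 or upper < 0


# (estimate, interval) pairs taken from the first column of the table.
column_1 = {
    "NET-DURATION": (0.0016, (0.0111, -0.0074)),
    "SHOP-SESSION": (0.0118, (0.0175, 0.0054)),
    "AGE": (0.0010, (0.0013, 0.0007)),
    "GENDER": (-0.0059, (0.002, -0.013)),
}

flags = {name: excludes_zero(ci) for name, (_, ci) in column_1.items()}
for name, flag in flags.items():
    print(f"{name}: interval excludes zero = {flag}")
```

For these four rows, the check agrees with the presence or absence of the "a" marker in the table.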
These findings demonstrate that the immediate and cumulative usage of social networking websites may play two distinct, opposing roles in their relationship with shopping activities. First, these results point to an immediate, short-term negative relationship, such that time spent on social networking is negatively correlated with online shopping, which suggests a substitution effect. Second, our results indicate that there is a cumulative, longer-term positive relationship, such that cumulative usage of or engagement with social networking websites over time is associated with increased purchasing activity. In line with our theory, this may be because people who have been engaged in social networks for an extended period of time are better informed about new shopping options that they learned from their networks. In other words, continuous exposure to new consumption-related content over time may provide greater marginal benefit than if that time were spent on shopping websites.23

Results for Other Variables

We also consider the relationship of other Internet-related variables with purchase activity in Table 5. These variables allow us to control for short-term effects (e.g., eWOM) that are correlated with immediate purchase. We find that search referrals (i.e., Google and Yahoo web search) have a positive relationship with the probability of purchase (e.g., SEARCH-REFERRAL = .0075). This means that search activity is associated with subsequent purchase activity, which intuitively makes sense (e.g., Hauser and Wernerfelt 1990; Ratchford and Srinivasan 1993).

We also control for several other important factors. For example, NET-DURATION captures a person’s session-level Internet activity and, along with the individual-level random effect, controls for possible activity bias. Our results suggest a positive relationship between the duration of a person’s Internet session usage and both the number of purchases made (NET-DURATION = .0492) and the number of retailers purchased from (NET-DURATION = .0510). SHOP-SESSION and SHOP-DURATION control for the propensity to engage in shopping-related activities. As we expected, these shopping-related activities are positively associated with shopping activities. In the purchase incidence model shown in Table 5, greater shopping duration is related to greater purchase incidence, which means that consumers who visit shopping websites or spend more time shopping are more likely to make a purchase (Column 1: SHOP-SESSION = .0118, SHOP-DURATION = .0078). We also find (Table 5, Column 2) that, consistent with our expectation, more visits and more time spent on e-commerce websites are correlated with the number of purchases (Column 2: SHOP-SESSION = .4887, SHOP-DURATION = .1692). Finally, in Column 3 of Table 5, we find that more time spent on e-commerce websites is associated with purchasing from a greater variety of retailers (Column 3: SHOP-SESSION = .7140, SHOP-DURATION = .1387).

As for the other demographic control variables, we also find that older consumers and those with more income are associated with greater online purchase activity.24 Higher-income people may have more disposable income, which is correlated with greater online spend. We also find that women are more likely than men to make more purchases and buy from a wider variety of retailers. Our results also suggest a notable relationship of household size and number of children with purchase activity. Having a larger household and/or more children is positively associated with greater purchase incidence but negatively associated with the number of retailers purchased from. We speculate this might be a result of larger households combining their purchases within one retailer. We also find that households with more working members are more likely to purchase online but buy from fewer retailers. Finally, as a check, the time-specific estimates are consistent with our expectations, with

23. We also find that number of shopping sessions mediates the relationship between cumulative social network usage and purchase incidence. Thus, our findings suggest that the correlation between increased engagement on social networking websites and the higher likelihood of making online purchases is due in part to increased shopping browsing activity. The results of this analysis are available on request.

24. We also include interactions between the demographic variables and the other variables in the model to test whether demographic characteristics moderate any of the effects. The one interaction that was significant was income × $y_{i,t-1}$ (lag purchase), and it was positive. This suggests that higher-income consumers tend to have a higher purchase probability if they purchased during the previous day. The other interactions were not significant.
| Category | SN-ENGAGEMENT (Column 1) | SN-SESSION (Column 2) |
|---|---|---|
| Children | .700 | -1.681 |
| Chocolate/Candy | .370 | -.299 |
| Clothing | 5.079 | -11.046 |
| Computer | .151 | -.165 |
| Computer Software | .021 | -.297 |
| Electronics | .127 | -.178 |
| Flowers | 1.812 | -1.811 |
| Gift Cards | .018 | -.127 |
| Jewelry | .428 | -1.891 |
| Magazines | .072 | -.588 |
| Photo | .181 | -.543 |
| Shoes | .097 | -.259 |
| Sports/Outdoors | .355 | -.499 |
| Theater/Entertainment | .276 | -.951 |
| Tools/Hardware | .132 | -.573 |
| Toys and Games | .081 | -.240 |
| Video | .639 | -.637 |

[Figure: distribution across users of the β_i estimate for SN-ENGAGEMENT; x-axis: β_i Estimate for SN-ENGAGEMENT (-.10 to .30); y-axis: Number of Users (0 to 400).]
Notes: Each row consists of a separate model, with the dependent variable specified as whether a purchase was made (or not) in that particular category. We report the partial effects (scaled by 1,000 for presentation purposes) for the social network engagement and session variables. Boldfaced values indicate that 0 is not contained in the 95% Bayesian credible interval. The differences between the most affected product categories (i.e., clothing, jewelry, and children’s products) and the least affected categories (i.e., computers, books, and gift cards) are statistically significant.

higher baseline purchase activity occurring during the December holiday season.

Category-Level Analysis

We also empirically examine product category data to provide further evidence that the positive correlation between social network engagement and purchase activities may arise from greater consumption-related exposure.25 In accordance with H3a, we expect that product category moderates this positive correlation because consumers are more likely to be exposed to certain categories of products on social networks. In addition, product category might moderate the negative correlation between immediate social network usage and purchase incidence (H3b). To test for this, we identify the product category for each purchase observation in our data. There are 21 categories under which the purchases fall (see Table 6). We use this information to create a new set of purchase incidence dependent variables, which we denote as $y_{itc} = 1$ if consumer i purchased a product in category c during session t. For each category, we separately estimate the binary probit model using $y_{itc}$ as the dependent variable.

In Table 6, we report the average partial effects for the key social network variables: both immediate social network usage (SN-SESSION) and cumulative social network usage (SN-ENGAGEMENT).26 In Table 6, each row shows the partial effects from one purchase incidence model with the respective category purchase incidence described in Column 1 as the dependent variable. Consistent with H3a, our results (Table 6, Column 1) suggest that product category moderates the positive correlation between social network engagement and purchase incidence. Specifically, categories such as clothing, children’s products, jewelry, and chocolates have significant, positive coefficients (Column 1: 5.079, .700, .428, .370, respectively). In contrast, categories such as tools/hardware, auto parts, or gift cards have nonsignificant effects (Column 1: .132, .915, .018, respectively). This suggests that the positive correlation between social network engagement and purchase likelihood is more pronounced for products that are more likely to be shared on social networks.

Similarly, consistent with H3b, we find that product category moderates the negative correlation between immediate social network usage and purchase incidence. Consumers with higher previous period social network usage may have satisfied their need for entertainment value, thus exhibiting lower purchase activity for entertainment-related products. While all the partial effects are significantly negative, consistent with H1, we observe variation across the categories. The negative relationships in Column 2 of Table 6 appear stronger for categories that provide greater entertainment value (e.g., -11.046 for clothing, -1.891 for jewelry, -1.681 for children’s products). While not perfect, to a large extent, these results are in line with our theoretical framework.

25. We thank an anonymous reviewer for this suggestion.

26. Partial effects show the effect on the probability of purchase for a unit change in the respective social network variable. We report partial effects for comparison across models. The estimated coefficients and other variables are consistent with our main results and are available on request.
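The category-level dependent variables described above, with y_itc = 1 if consumer i purchased a product in category c during session t, can be constructed along the following lines. This is a minimal sketch of the indicator construction only (not of the probit estimation); the data layout, function name, and sample records are hypothetical.

```python
def category_indicators(sessions, purchases, categories):
    """Build y_itc indicators for every (user, session) and category.

    `sessions` is a list of (user, session) keys, `purchases` is a list of
    (user, session, category) records, and `categories` lists the product
    categories. Returns {category: {(user, session): 0 or 1}}.
    """
    bought = set(purchases)
    return {
        c: {(i, t): int((i, t, c) in bought) for (i, t) in sessions}
        for c in categories
    }


example_session_keys = [("u1", 1), ("u1", 2), ("u2", 1)]
example_purchases = [
    ("u1", 1, "Clothing"),
    ("u1", 1, "Jewelry"),
    ("u2", 1, "Clothing"),
]
example_categories = ["Clothing", "Jewelry", "Flowers"]

y = category_indicators(example_session_keys, example_purchases, example_categories)
```

Each category then has its own binary outcome over the same set of sessions, which is what allows a separate purchase incidence model to be estimated per category.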