© Academy of Management Journal

2014, Vol. 57, No. 2, 321–326.


http://dx.doi.org/10.5465/amj.2014.4002

FROM THE EDITORS

BIG DATA AND MANAGEMENT

Editor’s note: This editorial launches a series written by editors and co-authored with a senior executive, thought leader, or scholar from a different field to explore new content areas and grand challenges with the goal of expanding the scope, interestingness, and relevance of the work presented in the Academy of Management Journal. The principle is to use the editorial notes as “stage setters” for further work and to open up fresh, new areas of inquiry for management research. GG

Big data is everywhere. In recent years, there has been an increasing emphasis on big data, business analytics, and “smart” living and work environments. Though these conversations are predominantly practice driven, organizations are exploring how large-volume data can usefully be deployed to create and capture value for individuals, businesses, communities, and governments (McKinsey Global Institute, 2011). Whether it is machine learning and web analytics to predict individual action, consumer choice, search behavior, traffic patterns, or disease outbreaks, big data is fast becoming a tool that not only analyzes patterns, but can also provide the predictive likelihood of an event.

Organizations have jumped on this bandwagon of using ever-increasing volumes of data, often in tera- or petabytes’ worth of storage capacity, to better predict outcomes with greater precision. For example, the United Nations’ Global Pulse is an initiative that uses new digital data sources, such as mobile calls or mobile payments, with real-time data analytics and data mining to assist in development efforts and understanding emerging vulnerabilities across developing countries. Though “big data” has now become commonplace as a business term, there is very little published management scholarship that tackles the challenges of using such tools or, better yet, that explores the promise and opportunities for new theories and practices that big data might bring about. In this editorial, we explore some of its conceptual foundations as well as possible avenues for future research and application in management and organizational scholarship.

WHAT IS “BIG DATA”?

Big data is generated from an increasing plurality of sources, including Internet clicks, mobile transactions, user-generated content, and social media as well as purposefully generated content through sensor networks or business transactions such as sales queries and purchase transactions. In addition, genomics, health care, engineering, operations management, the industrial Internet, and finance all add to big data pervasiveness. These data require the use of powerful computational techniques to unveil trends and patterns within and between these extremely large socioeconomic datasets. New insights gleaned from such data-value extraction can meaningfully complement official statistics, surveys, and archival data sources that remain largely static, adding depth and insight from collective experiences, and doing so in real time, thereby narrowing both information and time gaps.

Perhaps the misnomer is in the “bigness” of big data, which invariably attracts researchers’ attention to the size of the dataset. Among practitioners, there is emergent discussion that “big” is no longer the defining parameter, but, rather, how “smart” it is; that is, the insights that the volume of data can reasonably provide. For us, the defining parameter of big data is the fine-grained nature of the data itself, thereby shifting the focus away from the number of participants to the granular information about the individual. For example, a participant in a Formula 1 car race generates 20 gigabytes of data from the 150 sensors on the car that can help analyze the technical performance of its components, but also the driver reactions, pit stop delays, and communication between crew and driver that contribute to overall performance (Munford, 2014). The emphasis thus moves away from outcomes (win/lose race), to instead focus on each proximal, contributory element for success or failure mapped for every second during the race. Similarly, one
could analyze the social networks and social engagement behaviors of individuals by mapping mobility patterns onto physical layouts of workspaces using sensors, or the frequency of meeting room usage using remote sensors that track entry and exit patterns, which could provide information on communication and coordination needs based on project complexity and approaching deadlines. These micro data provide a richness of individual behaviors and actions that have not yet been fully tapped in management research. Whether it is “big” or “smart” data, the use of large-scale data to predict human behavior is gaining currency in business and government policy practice, as well as in scientific domains where the physical and social sciences converge (recently referred to as “social physics”) (Pentland, 2014).

Sources of Big Data

Big data is also a wrapper for different types of granular data. Below, we list five key sources of high volume data: (1) public data, (2) private data, (3) data exhaust, (4) community data, and (5) self-quantification data.

“Public data” are data typically held by governments, governmental organizations, and local communities that can potentially be harnessed for wide-ranging business and management applications. Examples of such data include those concerning transportation, energy use, and health care that are accessed under certain restrictions in order to guard individual privacy. “Private data” are data held by private firms, non-profit organizations, and individuals that reflect private information that cannot readily be imputed from public sources. For example, private data include consumer transactions, radio-frequency identification tags used by organizational supply chains, movement of company goods and resources, website browsing, and mobile phone usage, among several others.

“Data exhaust” refers to ambient data that are passively collected, non-core data with limited or zero value to the original data-collection partner. These data were collected for a different purpose, but can be recombined with other data sources to create new sources of value. When individuals adopt and use new technologies (e.g., mobile phones), they generate ambient data as by-products of their everyday activities. Individuals may also be passively emitting information as they go about their daily lives (e.g., when they make purchases, even at informal markets; when they access basic health care; or when they interact with others). Another source of data exhaust is information-seeking behavior, which can be used to infer people’s needs, desires, or intentions. This includes Internet searches, telephone hotlines, or other types of private call centers.

“Community data” is a distillation of unstructured data (especially text) into dynamic networks that capture social trends. Typical community data include consumer reviews on products, voting buttons (such as, “I find this review useful”), and Twitter feeds, among many others. These community data can then be distilled for meaning to infer patterns in social structure (e.g., Kennedy, 2008). “Self-quantification data” are types of data that are revealed by the individual through quantifying personal actions and behaviors. For example, a common form of self-quantification data is that obtained through the wristbands that monitor exercise and movement, data which are then uploaded to a mobile phone application and can then be tracked and aggregated. In psychology, individuals have “stated preferences” of what they would like to do versus “revealed preferences,” wherein the preference for an action or behavior is inferred. For example, an individual might buy energy-efficient lightbulbs with the goal of saving electricity, but, instead, keep the lights on longer because they are now using less energy. Such self-quantification data helps bridge the connection between psychology and behavior. Social science scholars from diverse areas, such as psychology, marketing, or public policy, could benefit from stated and implicit preference data for use in their research.

Data Sharing, Privacy, and Ethics

In current information technology infrastructures, the provision of services such as network connectivity is usually associated with a Service Level Agreement (SLA) defining the nature and quality of the service to be provided. Such SLAs are important to limit liability, to enable better provisioning of the operational infrastructure for the provider, and to provide a framework for differential pricing. The exponential expansion of network connectivity and web services was, in large part, due to significant technological advances in the automation of SLA enforcement, in terms of monitoring and verification of compliance with the contract. In contrast, the realm of big data-sharing agreements remains informal, poorly structured, manually enforced, and linked to isolated transactions (Koutroumpis & Leiponen, 2013). This acts as
a significant barrier to the market in data, especially for social science and management research, which cannot access these private data for integration with other public sources.

Data sharing agreements need to be linked into the mechanisms for data protection and privacy, including anonymization for open data, access control, rights management, and data usage control. Issues such as imputed identity, where individual identity can be inferred through data triangulation from multiple sources, will need to be carefully considered and explicitly acknowledged and permitted. Management scholars will be invited to embed themselves into social issues based on defining research questions that integrate data sharing and privacy as part of their research methodology. Doing so will likely allow us to refine the model for data sharing and data rights, which could be universally beneficial and define big data collaborations in the future.

ANALYZING BIG DATA

Equally relevant as the sources of data are the methodologies to analyze them and the standards of evidence that would be acceptable to management scholars for their publication. As with any nascent science, there is likely to be a trade-off between theoretical and empirical contribution, and the rigor with which data are analyzed. Perhaps, with big data, we are liable to initially be confounded by the standard of evidence that should be expected. The typical statistical approach of relying on p values to establish the significance of a finding is unlikely to be effective because the immense volume of data means that almost everything is significant. Using our typical statistical tools to analyze big data, it is very easy to get false correlations. However, this doesn’t necessarily mean that we should be moving toward more and more complex and sophisticated econometric techniques to deal with this problem; indeed, such a response poses a substantial danger of over-fitting the data. Instead, basic Bayesian statistics and stepwise regression methods may well be appropriate approaches. Beyond these familiar approaches, there is a range of specialized techniques for analyzing big data, each of which is important for those entering this field to understand, though beyond the scope of this editorial. These techniques draw from several disciplines, including statistics, computer science, applied mathematics, and economics. They include (but are not limited to) A/B testing, cluster analysis, data fusion and integration, data mining, genetic algorithms, machine learning, natural language processing, neural networks, network analysis, signal processing, spatial analysis, simulation, time series analysis, and visualization (McKinsey Global Institute, 2011).

The challenge, though, is to shift away from focusing on p values to focusing, rather, on effect sizes and variance explained. With further empirical work, perhaps scholars can develop and converge on rough heuristics; for example, an R² of more than 0.3 could suggest that closer scrutiny of the pattern of relationships is warranted. Another pitfall of big data (again, amplified by our commonly used statistical techniques) lies in focusing too much on aggregates or averages and too little on outliers. In many situations, averages are very important, and often revealing about how people tend to behave under particular conditions. But, in the vastness of a big data universe, the outliers can be even more interesting: critical innovations, trends, disruptions, or revolutions may well be happening outside the average tendencies, yet still involve enough people to have dramatic effects over time. The fine-grained nature of big data offers opportunities to identify these sources of change (be they business innovations, social trends, economic crises, or political upheavals) as they gather steam.

Once promising leads have been identified, the next challenge of analyzing big data is to then move beyond identifying correlational patterns to exploring causality. Given the unstructured nature of most big data, causality is not built into their design and the patterns observed are often open to a wide range of possible causal explanations. There are two main ways to approach this issue of causality.

The first is to recognize the central importance of theory. An intuition about the causal processes that generated the data can be used to guide the development of theoretical arguments, grounded in prior research and pushing beyond it. The second, complementary way is to then test these theoretical arguments in subsequent research, ideally through field experiments. Of course, laboratory experiments offer the advantage of greater control, but they usually focus on a very limited number of variables, and the nature of big data research is that there may be many factors driving the observed correlational patterns. In a field experiment, a wider net can be cast, as a richer set of data about behaviors and beliefs can be collected, and over an extended period of time. For scholars as well as managers with an interest in action research, there are alluring opportunities here to engage in “management engineering” that goes beyond more typical management research by bringing theory and practice together with much faster cycle times between the identification of a promising theoretical insight and the testing of that insight with a well-designed intervention that can help to both advance management knowledge and address pressing practical questions.

Ultimately, the promise and the goal of strong management research built on big data should be not only to identify correlations and establish plausible causality, but, ultimately, to reach consilience, that is, convergence of evidence from multiple, independent, and unrelated sources, leading to strong conclusions (Wilson, 1998). Big data offers exciting new prospects for achieving such consilience due to its unprecedented volume, micro-level detail, and multifaceted richness. The vast majority of current management research relies on painstaking collection of low numbers of measures that cover a short duration of time (or, possibly, in the case of more historically based research, a longer duration but comprised of larger periods, such as years). In contrast, big data offers voluminous quantities of data over multiple periods (whether seconds, minutes, hours, days, months, or years). While some big data datasets are unidimensional or single channel, focusing, for example, on a particular transaction or communication behavior and relying on single-channel interactions (e.g., via phone or email), there are increasingly opportunities to collect and analyze multidimensional datasets that offer insight into constellations of behaviors, often through a variety of channels (e.g., call center customer interactions that switch between voice, web, chat, mobile, video, etc.). For management researchers, the result of such richness is that there are unprecedented opportunities to notice potentially important variables that previous studies might have failed to consider at all, due to their necessarily more focused nature. And, once such variables capture a researcher’s attention, the relationships between them can be explored and the contextual conditions under which these relationships may or may not hold can be examined.

BIG DATA IN MANAGEMENT RESEARCH

Our intent in this editorial is to encourage fresh, new areas of scholarly inquiry. It is not to provide a systematic review of big data applications; neither do we pretend to provide a definitive guide for future research. Instead, our goal is to trigger broader discussions of big data in society and its implications for management research. The constantly changing environment in the digital economy has challenged traditional economic and business concepts. Huge volumes of user-generated data are transferred and analyzed within and across different sectors, gradually increasing the markets’ dependency on precise and timely information services. A mere Tweet from a trusted source can cause losses or profits of billions of dollars and a chain reaction in the press, social networks, and blogs. This situation makes information goods even more difficult to value, as they have a catalytic impact on real-time decision making. Meanwhile, entrepreneurs and innovators have taken aggregate open and public data as well as community, self-quantification and exhaust data to create new products and services that have the power to transform industries. In private and public spheres, big data sourced from mobile technologies and banking services, such as digital/mobile money, when combined with existing “low-tech” services, such as water or electricity, can transform societies and communities. There is little doubt that, over the next decade, big data will change the landscape of social and economic policy and research.

What is unclear is how these “new models” for mixing and matching these products, services and data come about and evolve into a sustainable social and economic model. Categorizing big data, assessing its quality, and identifying its impact is radically new in social sciences, especially in management and organizational research. The rate and scale of content generation multiplies its impact and diminishes the time to respond. Consequently, management scholars will need to unpack how ubiquitous data can generate new sources of value, as well as the routes through which such value is manifest (mechanisms of value creation) and how this value is apportioned among the parties and data contributors, entrepreneurs, businesses, industries, and governments through new business models and new governance tools, such as contracts and licenses (mechanisms of value capture).

Empirical research in management often infers relationships; for example, two companies might be competing in the same market, have complementary products, collaborate in production or R&D, or be linked through supplier–customer relationships, or they might be close to each other in geographic, technology, or some other space that might facilitate knowledge spill-overs between them.
Detailed data on these relationships are typically unavailable in firm-level datasets that allow representative statistical inference. However, information on such relationships is often available in unstructured textual form, such as in news articles or company blogs on the web. IBM estimates that as much as 80% of this relationship information is unstructured “content” of various communications through email, texts, and videos, and they reckon unstructured content data is growing at twice the rate of conventional structured databases. To address such data, content analytics is emerging as a commercial evolution of what academics call “content analysis,” or the analysis of text and other kinds of communication for the purposes of identifying robust patterns.

There are additional uses of big data that have broader implications for communities and societies, but which managers would find useful. For example, disease spreads, commuting patterns, or emotions and moods of communities, which can all be accessed through live Twitter feeds or Facebook postings, could affect organizational responses, products and services, and their strategies. Patterns in social media are being used to glean information on the creation of new markets and product categories. Many companies now use digital intervention labs that track social media on a real-time basis around the world, thereby creating longitudinal data structures of millions of posts, Tweets, or reviews. Any deviations from normal patterns that invoke their brand or products are immediately flagged for action to provide rapid responses to consumer reactions, shape new product introductions, and create new markets.

The continuous, ubiquitous nature of the data means that scholars have a wealth of new opportunities to focus on the microfoundations of organizational strategies or behaviors; for instance, we can examine the dynamics of how business processes and opportunities evolve on a minute-to-minute, day-to-day basis, rather than being constrained to assessing snapshots such as quarterly inputs and outcomes or sales cycle trends. Consider the famous example of the Hubble space telescope having the wrong optics installed because one group assumed metric measurements and another imperial measurements, or the example of the Airbus 380 in which the wiring harness built in Germany and Spain did not fit the airframe built in Britain and France because the standards adopted were different. Current practice would be to review procedures and suggest more checkpoints; that is, a relatively static measurement and control of organizational actions. Instead, we could use big data to check what sort of communication patterns are required to avoid such disasters, and where we might discover that the lack of face-to-face communication at the “alpha test” stage was the critical variable, we might then suggest establishment of a real-time data-monitoring mechanism to ensure that face-to-face communication happened at all the necessary “alpha test” junctures.

Big data can also be a potent tool for analysis of individual or team behavior, using sensors or badges to track individuals as they work together, move around their workspace, or spend time interacting with others or allocated to specific tasks. While early management research codified diaries and time-management techniques of CEOs, evolving practices using big data can allow us to study entire organizations and workgroups in near-real time to predict individual and group behaviors, team social dynamics, coordination challenges, and performance outcomes. Scholars could examine questions around the differences between stated versus revealed preferences by tracking data on purchasing, mobile applications, and social media engagement and consumption, to state but a few examples. Social network studies could also use big data to examine the dynamics of formal and informal networks as they form and evolve, as well as their impact on individual, network, and organizational behaviors. Such granular, high-volume data can tell us more about workplace practices and behaviors than our current data-collection methods allow, and have the potential to transform management theory and practice.

Gerard George
Imperial College London

Martine R. Haas
University of Pennsylvania

Alex Pentland
MIT

REFERENCES

Kennedy, M. T. 2008. Getting counted: Markets, media, and reality. American Sociological Review, 73: 270–295.

Koutroumpis, P., & Leiponen, A. 2013. Understanding the value of (big) data. In Proceedings of 2013 IEEE international conference on big data: 38–42. Silicon Valley, CA, October 6–9, 2013. Los Alamitos, CA: IEEE Computer Society Press.

McKinsey Global Institute. 2011. Big data: The next frontier for innovation, competition, and productivity. June 2011. Lexington, KY: McKinsey & Company.

Munford, M. 2014. Rule changes and big data revolutionise Caterham F1 chances. The Telegraph, Technology Section, 23 February 2014. Available from http://www.telegraph.co.uk/technology/technology-topics/10654658/Rule-changes-and-big-data-revolutionise-Caterham-F1-chances.html.

Pentland, A. 2014. Social Physics. New York, NY: Penguin.

Wilson, E. O. 1998. Consilience: The unity of knowledge. New York, NY: Knopf.

Gerry George is professor of innovation and entrepreneurship at Imperial College London and serves as deputy dean of the Business School. He is the editor of the Academy of Management Journal.

Martine Haas is associate professor of management at the Wharton School at the University of Pennsylvania. She is an associate editor of the Academy of Management Journal, covering the topics of knowledge management, multinationals, and organization theory.

Alex “Sandy” Pentland is Toshiba professor of media, arts, and sciences and director of the MIT Media Lab Entrepreneurship Program at MIT. He is a pioneer in organizational engineering, mobile information systems, and computational social science. His research focus is on harnessing information flows and incentives within social networks, the big data revolution, and converting this technology into real-world ventures. He is the World Economic Forum’s lead academic for its big data and personal data initiatives. He is among the most-cited computer scientists in the world, and, in 1997, Newsweek magazine named him one of the 100 Americans likely to shape this century. His book Honest Signals: How They Shape Our World was published in 2008 by the MIT Press. His most recent book, Social Physics, was published by Penguin in 2014.
