
Mine the gap

- a multi-method investigation of web-based groupware use

Dissertation submitted in partial fulfilment of the requirements for the Ph.D.,


University of Copenhagen

January 15, 2003

by
Kristian Billeskov Bøving
kristian@billeskov.dk

Supervisor:
Professor Klaus Bruhn Jensen
Department of Film and Media Studies
University of Copenhagen
Abstract
Computer-mediated communication in organizations today is characterized by the
introduction of packaged, generic computer media based on Internet technology to support
communication and collaboration. This challenges conventional views and theories on how
technology is related to the context (i.e. social structures) in which it is embedded. One type
of technology introduced in organizations is virtual workspaces, a specific type of web-based
groupware. This thesis investigates the adoption and use of a virtual workspace technology in
an organization. It studies how the technology is adopted and integrated in the organization
and in specific work practices. The theory of genres of organizational communication is used
as a framework for specifying the context relevant for understanding the adoption of the
technology.
The study of computer-mediated communication in temporally and geographically
distributed settings poses methodological challenges as to the observation of usage. This
study explores the triangulation of different methods for analyzing usage of the technology
and develops a method for utilizing and integrating log file analysis in a case study, which can
serve as a possible response to the methodological challenge.

Acknowledgements
The completion of this thesis is the result of a collaborative effort. The thoughts
presented here are solely my responsibility, but I am grateful to the following persons and
institutions:
• Klaus Bruhn Jensen for excellent guidance and invaluable feedback. I have
immense respect for your intellectual generosity and your ability to pose the
hard questions.
• The research assistants on the DIWA project, Rasmus Helles and Frank Bjergø,
for doing a fantastic job of analysing log files and providing much input to my
thesis. Thanks also to the DIWA research program for financing Rasmus and
Frank.
• Jesper Simonsen, Keld Bødker and Jens Kaaber Pors from the Department of
Communication, Journalism and Computer Science, Roskilde University for
their cooperation in gathering and interpreting data from Beta. Many of the
ideas that have inspired this thesis are products of our cooperation and
discussions.
• Rasmus Helles for our joint development of the functional model for
understanding the dynamics of folder structures in addition to many invaluable
discussions.
• Lone Hoffmann Petersen, Department of Communication, Journalism, and
Computer Science, Roskilde University for collaboration on investigating the
concept of end-user design, and for useful feedback during the writing process.
• Sisse Finken, Department of Communication, Journalism, and Computer
Science, Roskilde University for giving me much feedback in the writing
process.
• The DIWA project and the Ph.D. forum in DIWA for providing an excellent
framework for working and researching this project.
• IBM Denmark A/S for providing the Lotus Domino and Lotus Quickplace
software, and SPSS Denmark for providing Clementine Data Mining software.
• IBM Denmark A/S, Interse A/S, e-sense and Framfab AB for interactions that
have enabled me to test the practical implications of the work on my thesis.

Table of headers
Abstract
Acknowledgements
Table of headers
Table of contents
Introduction
Framing the research
Research method
HTTP-log analysis for CMC
Introducing virtual workspaces
The study of Quickplace use
Characterising Quickplace use
The three Quickplace exemplars
Three genres of communication
Statistical generalizations of document use
Classification practices in Quickplace
Conclusions and implications
References

Table of contents

Abstract
Acknowledgements
Table of headers
Table of contents
Introduction
Framing the research
Organizational computing
Two imperatives and an alternative
IT and organizational design
Organizations and task-technology fit
Media Richness theory and its critics
The emergent perspective
Structuration theory as a basis for an emergent perspective
Adaptive structuration theory
Technology-in-practice
Selecting the relevant social structure
Organizational communication
CSCW
Research in Group Decision Support Systems (GDSS)
Units of analysis for organizational communication
Genres of communication
The role of computer media in genre theory
Genre repertoire and genre system
The consequence of introducing genres of communication
Research method
Defining the object of study
Schools of IS research methods
The case study or field study approach
On Generalizability
Level two generalizations in my study:
Combining Research methods
Two hypothetical studies of Lotus Quickplace
The research design
The interviews
Log file analysis
The survey
Level-one generalizations
Sampling in a case study
HTTP-log analysis for CMC
Web mining and web usage mining
A survey of the research in web usage mining
Mining computer-mediated communication
HTTP-log analysis and cryptanalysis
The practical process of log analysis
Mapping user actions to log lines
Breaking the code in Lotus Quickplace
Some generic technical challenges of HTTP-log analysis
Identifying the user
Handling caching
Consistency of the resource ID
Introducing virtual workspaces
Introduction
Method
Three economic models of virtual workspaces
The design process of virtual workspaces
The standards process
The application development process
The adoption of virtual workspaces
Design and the use of metaphors
The design strategy for virtual workspaces
Approaches to modelling the anticipated use
The use of metaphors
The metaphorical landscape
The house, room or the office
The domains of reference.
Summary of virtual workspaces
The study of Quickplace use
Characterising Quickplace use
The implementation of QP
The communication infrastructure at Beta
Unintended uses
The organizational distribution of use
Size of the Quickplaces
QP lifecycles
Summary of the initial characterization
The three Quickplace exemplars
NP_Solo-ID
GIC
IC
Characterizing document use
Characterization of the users
Three genres of communication
The translation of a press release
The holiday list
The meeting agenda
Mixing media in a genre
Why do people use e-mail instead?
Statistical generalizations of document use
Deriving the basic document types
The six clusters of document life cycles
A top-down typology of documents:
Findings from clustering document lifecycles
Zipf’s law on documents and QPs
Classification practices in Quickplace
The functional model
The structural factors and disturbances
The dynamics of the folder structure
The NP_solo-ID folder structure
The IC folder structure
The GIC folder structure
Summary of the folder structures
Levels of functional relationships
Conclusions and implications
The value of log analysis for CMC studies
A new type of data for case studies
The downside of log analysis
Practical implications
The design of virtual workspaces
Adoption of virtual workspaces
References

Introduction
The research documented in this thesis has been conducted under the auspices of the
DIWA (Design and use of Interactive Web Applications) research program. The ambitions of
the program are expressed in the following way:
“The goal of the program is to examine how Web-technology - as a networked,
distributed computing platform - is changing organizational IS development and use.”
Source: www.diwa.dk
This thesis investigates the adoption and use of a specific web-based communication
technology in a specific organization. The investigated technology is Lotus Quickplace,
marketed by IBM as "Instant, Secure Team Workspaces for the Web". Lotus Quickplace is a
type of web-based application that is called a virtual workspace in this thesis. Virtual
workspaces emerged as part of the dot-com boom. They are applications inspired by
groupware technologies, and they are designed to support a group of people working together
remotely. As the marketing message from IBM indicates, the value proposition is that a group
of people can instantly establish a platform for cooperation on, for example, a project across
geographical and organizational boundaries.
Virtual workspaces started out as applications that could be leased via the web for a low
monthly fee or offered as ad-ware. The largest commercial success, however, seems to have
come from selling the product under a more traditional software license model: the
technology has spread as an internal collaboration tool in large organizations and as a
platform for consulting organizations collaborating with customers on engagements.
The first commercial virtual workspaces were offered in the second half of 1999 and,
according to Gartner, "Team collaboration support" is now a maturing market segment (IBM
(2002)). "Team collaboration support" covers more or less the same applications that are
denoted as virtual workspaces here. According to IBM, Lotus Quickplace is used in 60% of
Fortune 100 companies (IBM (2002)). Virtual workspaces seem to have become a commercial
success in terms of licenses sold.
The promise of "instant" collaboration across geographical and organizational
boundaries naturally raises suspicion. My own experience with using Lotus TeamRoom as a
consultant in IBM Global Service was that it is difficult to utilize such a tool, and that in many
cases it actually fails. TeamRoom is a predecessor to Quickplace based on the proprietary
Lotus Notes platform. A genuine fascination with the possibilities of virtual
workspaces, combined with an interest in the obstacles involved, spurred me to
investigate how virtual workspaces are actually adopted and used in an organizational
setting.
The opportunity to study the use of a virtual workspace arose in Beta Corporation, a
partner in the DIWA research program. Beta is a Nordic financial corporation that was
formed as the result of a merger during 2000 of a Danish financial corporation named Alpha,
a Swedish-Finnish financial corporation, and a Norwegian financial corporation. In May 2000
Beta had implemented Lotus Quickplace as a technology to support projects spanning more
than one country.
One of the interesting aspects of virtual workspaces introduced in organizations is that
they are introduced as though the magic word "instant" marketed by vendors can be taken
literally. At Beta the technology is introduced "as is" without education or guidelines for how
to use it. A virtual workspace is a generic groupware solution that can be used in a wide
variety of ways. It typically offers basic support for controlling access to documents, sharing
documents, integration with e-mail, discussions, and synchronous chat. This leaves the users
of a virtual workspace with the task of finding out what it should be used for and how it
should be integrated in their existing work and with the existing media for communication.
This approach to the implementation of IT-systems is new - at least at Beta. The only
predecessor might be e-mail. E-mail has been introduced in corporations as an open
communication platform without an explicit purpose or specific guidelines for how it should
be integrated into the work practice. E-mail therefore seems to be the computer medium that
can provide the best inspiration for understanding how virtual workspaces are adopted in an
organization.
The open character of the virtual workspace technology and the way it is implemented at
Beta call for a closer examination of how it is actually integrated in diverse work practices.
Investigating this integration has been the main driver of the research process. The research
question, which I formulated in order to guide the investigation, was the following:

How do the properties of an IT artefact for communication in a group of people
interact with the social structures in the organization and result in a work practice in
which the IT artefact plays a role?

The main purpose of the investigation is therefore to understand in detail how virtual
workspaces interact with the social structures of organizations in which they are adopted. An
important aspect of this is to understand the actual role of the technology in relation to the
work practice and in relation to other available technologies for communication and
collaboration.

A secondary purpose of the research has been to experiment with different methods for
analyzing the use of a virtual workspace. The use of virtual workspaces is distributed in time
and space, which renders traditional observation of their use very difficult. This poses a
methodological challenge for detailed studies of technology use, and addressing it has
therefore been the other main driver of the research process. Virtual workspaces offer an
opportunity that has not been developed in previous research on the use of technology in
organizations. Virtual workspaces are web-based. This means that the client used is a browser
and that the communication between client and server goes through a web server
(HTTP-server). HTTP-servers generally support a de facto standard for logging activity known
as the Common Log File Format. This makes it possible to use the HTTP-log as part of the
empirical investigation of virtual workspace use.
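As a minimal sketch of what such a log offers, the following Python fragment parses a single line in the Common Log File Format into its fields. The sample line, including the host, user name and path, is invented for illustration and is not taken from the logs at Beta. The function name parse_clf_line is likewise an assumption, and real servers may append further fields to each line.

```python
import re
from datetime import datetime

# Common Log File Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means that no body was returned.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request field normally holds "METHOD path PROTOCOL".
    parts = entry["request"].split()
    entry["method"] = parts[0] if parts else None
    entry["path"] = parts[1] if len(parts) > 1 else None
    return entry

# Hypothetical log line; host, user and path are invented for illustration.
sample = ('10.0.0.7 - jdoe [15/May/2001:10:32:07 +0200] '
          '"GET /projectx/agenda.doc HTTP/1.0" 200 5120')
entry = parse_clf_line(sample)
```

Each parsed entry records who requested which resource and when, which is the raw material for the analyses described in the later sections.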
My investigation combines interviews, document analysis, a questionnaire
and HTTP-log analysis in the study of technology use, and it maintains a particular focus on
the possibilities of log analysis. The analysis of HTTP-log files is
used to some degree in HCI (Human-Computer Interaction) and commercially to analyze the
interaction between a user and a web application. It has not yet been used to understand
communication between users mediated by a web application. The thesis proposes some
promising applications and identifies limitations and pitfalls of using HTTP-log analysis for
understanding computer-mediated communication.

The thesis is divided into six main sections, which are presented here:
Framing the research
The purpose of this section is to provide the theoretical framework for investigating how
virtual workspaces are adopted in an organization. This involves a discussion of theories that
have been important in previous research on organizational computing, especially in the
research of related technologies such as group support systems, CSCW and e-mail. I have
chosen structuration theory developed by Anthony Giddens (1984) and extended by Klaus
Bruhn Jensen (2000) as the overall framework for understanding the dynamics of social
structures in the organization. I have taken Yates' and Orlikowski's (1992) application of
genre theory and structuration theory in the concept of genres of organizational
communication to specify the social structures relevant for understanding the adoption and
use of Lotus Quickplace at Beta. Lastly, I have used Orlikowski's (2000) distinction between
the technology as an artefact and the technology-in-practice to explain the interaction between
the properties of the virtual workspaces and the social structures of the organization.
Research method
This section discusses methodological issues related to doing research on the use of
technology in an organization, and presents the research design. It focuses on the issues
related to combining qualitative and quantitative traditions of research. The research
presented in this thesis combines interviews, a survey, and HTTP-log analysis as the primary
sources of data in a case study. As we shall see, this has both advantages and potential pitfalls.
The research design and the collection and analysis of data are presented here to enable
the reader to assess the conclusions drawn when the results of the case
study are presented.
HTTP-log analysis
A separate section is devoted to HTTP-log analysis for two reasons. First, HTTP-log analysis,
and log analysis in general, is not a commonly used method in research on technology use in
organizations. The methodological and practical challenges of using log analysis are therefore
discussed in detail. Second, the use of HTTP-log analysis to investigate
computer-mediated communication is a novel approach to utilizing HTTP-logs. The
previous analyses of HTTP-logs in HCI and in commerce have focused on session-based
analysis. A document-based analysis is introduced as a method for analyzing
computer-mediated communication.
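The difference between the two perspectives can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not the procedure used in the thesis: the log entries are invented, the function names are hypothetical, and the 30-minute idle timeout used to delimit sessions is a common heuristic in web usage mining rather than a value taken from this study.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed log entries: (user, timestamp, document path).
entries = [
    ("anna", datetime(2001, 5, 15, 9, 0), "/qp/agenda.doc"),
    ("anna", datetime(2001, 5, 15, 9, 5), "/qp/budget.xls"),
    ("bo", datetime(2001, 5, 15, 11, 0), "/qp/agenda.doc"),
    ("anna", datetime(2001, 5, 16, 14, 0), "/qp/agenda.doc"),
]

def group_by_session(entries, timeout=timedelta(minutes=30)):
    """Session-based view: split each user's requests into visits by idle time."""
    sessions = []
    last_seen = {}  # user -> (time of last request, index of that user's open session)
    for user, time, path in sorted(entries, key=lambda e: e[1]):
        if user in last_seen and time - last_seen[user][0] <= timeout:
            index = last_seen[user][1]
        else:
            # Assumed heuristic: a longer pause starts a new session.
            sessions.append((user, []))
            index = len(sessions) - 1
        sessions[index][1].append(path)
        last_seen[user] = (time, index)
    return sessions

def group_by_document(entries):
    """Document-based view: collect, per document, who accessed it and when."""
    documents = defaultdict(list)
    for user, time, path in sorted(entries, key=lambda e: e[1]):
        documents[path].append((user, time))
    return dict(documents)

sessions = group_by_session(entries)
documents = group_by_document(entries)
```

The session-based view describes what one user did during one visit, whereas the document-based view shows a single document passing between several users over time, which is the perspective relevant for studying communication between users.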
Introducing virtual workspaces
Preceding the reports from the case study, a separate section is devoted to an analysis of
the virtual workspace technology. This I have done by means of a comparative analysis of
seven different virtual workspace products. The design process as well as the user interface
and functionality are analyzed. The design process of virtual workspaces is characterized both
by a heavy reliance on the development of Internet standards and by the fact that the
products are designed as generic software. The analysis of the user interface and functionality
shows that the functionality across the products is more or less the same. The main
differences between the products lie in the approaches taken to model the anticipated use
situation and in the different metaphors used in the design. The "metaphorical landscape" is
introduced as a method for analyzing user interfaces.
The study of Quickplace use
The analysis and findings from the case study at Beta are divided into five sub-sections,
which reflect the explorative nature of the study.
The first sub-section gives an overall characterization of the introduction and adoption of
Lotus Quickplace at Beta. The history of Lotus Quickplace at Beta is reported, and the use of
Quickplace (QP) is characterized based primarily on data from the survey and log analysis.
The next sub-section looks more closely at three QPs at Beta and gives a detailed
account of the use of these. In this section three different instantiations of genres of
communication from the three Quickplaces are analyzed in detail as a basis for understanding
the exact role of the Lotus Quickplace technology. This analysis also addresses the role of
additional media. The analysis shows that genres of communication involve more than one
medium. The study of the instantiations of genre exemplifies a valuable contribution from log
analysis to the understanding of computer media use.
Following this I zoom out again to explore another perspective on the use of log
analysis. The document-based log analysis is used as a method for characterizing patterns of
computer-mediated communication. In this analysis different data-mining techniques are
applied.
The fourth sub-section suggests a theoretical model for understanding a specific aspect of the
use of virtual workspaces: maintaining a folder structure is a central aspect of integrating a
virtual workspace within a work practice, and a model is suggested to explain how folder
structures develop over time. The model is substantiated by an analysis of the development of
folder structures in three Quickplaces based on log analysis and interviews.
Finally, based on the model for understanding folder structures and the study of genres of
communication, the fifth sub-section draws some general implications for how we should
understand the virtual workspace technology.
Conclusions and implications
The last section summarizes and evaluates the findings of the thesis. This section is
divided into three parts. The first part addresses the findings related to the research question
posed. The second part summarizes the experiences with using log analysis, and the last part
is devoted to the presentation of some practical consequences of the findings for designers
and users of virtual workspaces.

Framing the research


The purpose of this section is to provide the theoretical framework for investigating how
virtual workspaces are adopted in an organization. This involves a discussion of theories that
have been important in previous research on organizational computing.
Research in organizational computing can, as suggested by Markus and Robey
(1988), be divided according to the underlying causal relationship between the properties of
technology and properties of an organization. This distinction frames the discussion of how
the adoption of a technology can be conceived. The result of this discussion is that
structuration theory, as it is formulated by Giddens (1984) and extended by Jensen (2000), is
selected as the framework for understanding the relationship between agents, social structures
and computer media.
Structuration theory has previously been applied to organizational computing and two
earlier attempts to use it to understand groupware (Lyytinen and Ngwenyama (1992),
Desanctis and Poole (1994)) are discussed. This discussion concludes by selecting
Orlikowski's (2000) application of structuration theory. This application involves a distinction
between technology characterized as an artefact, and technology-in-practice that captures the
"enacted structures of technology use" (Orlikowski (2000) p. 407).
Virtual workspaces are characterized as media for communication, and the social
structures, which are central for understanding the role of the technology, are the social
structures active in organizational communication.
Virtual workspaces can also be characterized as a kind of groupware or group support
system. CSCW (Computer Supported Cooperative Work) and GDSS (Group Decision
Support Systems) offer themselves as frameworks for understanding groupware and the
respective approaches are discussed.
As a result of the discussions, the theory of genres of organizational communication
developed in a number of papers (Yates and Orlikowski (1992), Orlikowski and Yates (1994),
Yates, Orlikowski et al. (1997), Yates, Orlikowski et al. (1999), Yoshioka, Herman et al.
(2001), Yates and Orlikowski (2002)) is introduced as the theoretical framework for
understanding the use patterns that emerge in virtual workspaces.

Organizational computing
The idea of using computers to support various tasks in organizations has existed since
the early days of computing. At first, when computing power was expensive, computers were
primarily used for processing information such as handling invoices or inventories. One of the
first industries to exploit computers was the banking industry. Now, computers have entered
most aspects of organizational life. They are no longer conceived only as information
processing machines, but are now also researched and used as a medium for communication.
Research into issues regarding organizational computing has been conducted using a
variety of names: Information Systems, Management Information Systems, information
processing systems, information and decision systems, organizational information systems,
etc. Orlikowski and Baroudi (1991) use the broad term "Information Systems Research" to
denote research that from different theoretical perspectives studies information technology in
organizations. Information systems research is divided into a number of sub-disciplines
such as DSS (Decision Support Systems), GDSS (Group Decision Support Systems), CSCW
(Computer Supported Cooperative Work), CMC (Computer Mediated Communication), PD
(Participatory Design), HCI (Human-Computer Interaction), and others.
Most research into organizational computing is focused on producing theories that
can be useful for guiding practice. M. Lynne Markus and Daniel Robey state the imperative
of IS research rather precisely:
“Good theory guides research, which, when applied, increases the likelihood that
information technology will be employed with desirable consequences for users,
organizations, and other interested parties.” Markus and Robey (1988) p. 583
This characteristic of IS research implies that it is performed in the same system in
which it is later applied: the research is done with the purpose of applying its results in the
same organizations in which it is performed. The results of the research presented in this
thesis are intended to promote the understanding of how IT is integrated in organizations,
and to be applicable in improving the design and adoption of computer media in organizations.

Two imperatives and an alternative


Research in organizational computing has as its object of study the relationship between
the technology and the organization. It studies causal and other types of relations between
properties of the technology and properties of the organizations where technology is used. In
the case of this thesis, the relationship between a technology characterized as a medium for
communication and organizational communication is investigated. Research in organizational
computing (or IS research) can be divided into three types of research, categorized by the
underlying conception of the relationship between properties of the technology and the
characteristics of organizations. This is suggested by Markus and Robey (1988), who label
these three types the technological imperative, the organizational imperative, and the
emergent perspective.

In the technological imperative, organizational forms act as the dependent variable and
IT is treated more or less consistently as an independent variable.
The organizational imperative treats IT as the dependent variable. It holds that
“...human actors design information systems to satisfy organizational needs for information.”
Markus and Robey (1988) p. 587. One example of the organizational imperative is
media richness theory, which holds that information needs vary with task equivocality and
variety and that media are chosen accordingly (Daft and Macintosh (1981), Daft and Lengel (1986)).
According to this theory, e-mail would be used for routine communications, while face-to-
face communication would be used for equivocal and unique situations. Another example is
theories of task-technology fit, for example Zigurs and Buckland (1998). The organizational
imperative argues basically that managers and developers of IT can choose rationally to
design an IT system according to their needs and that subsequently the system will be used
accordingly. As Markus and Robey (1988) note, the empirical support for the organizational
imperative is limited. In her classic case study, Markus (1994) has shown
that media richness theory is not a good theory for understanding how e-mail is used.
As a third perspective Markus and Robey present the emergent perspective.
“The emergent perspective holds that the uses and consequences of information
technology emerge unpredictably from complex social interactions.” Markus and Robey (1988) p. 588
An early example of the emergent perspective used in an empirical study is Barley's
(1986) study of two radiology departments introducing and using CT scanners. The study
shows that the same properties of the technology produced two different outcomes. Barley
concludes his article by stating that structuring theory, as he labels his contribution,
“...departs from previous approaches to the study of technology by postulating that
technologies are social objects capable of triggering dynamics whose unintended and
unanticipated consequences may nevertheless follow a contextual logic.” Barley (1986) p. 107
He argues here that neither the technological nor the organizational imperative is useful
for explaining his case. Rather, studying the context in which the technology is put to use will
produce useful theories for understanding the relationship between IT and organizations. The
theoretical background on which Barley builds his theory is structuration theory.

In the following, the three perspectives on the relation between technology and
organization are introduced in more detail by using examples of research performed from the
perspectives. The purpose is to clarify some of the issues involved in the relationship and thus
the background for the thesis.

IT and organizational design


The technological imperative states that technology is an exogenous force, which per se
determines or constrains how organizations and individuals behave. The technological
imperative implies that IT has some general properties, not dependent on a specific design of
the technology, which will affect organizations. The technological imperative apparently
stems from a period when technology was used for rather limited purposes such as
accounting or for electronic bank accounts. With the quite diverse uses of IT observed in
contemporary organizations, the likelihood of theorizing “universal” properties of IT systems,
which will affect organizations in certain ways, is decreasing. The area in which the
technological imperative has had the most impact is in the relationship between IT and
organizational design. Classic examples are Leavitt and Whisler (1958) and Simon
(1977). A recent example is Fiedler, Grover et al. (1996), who derive a taxonomy empirically
from a survey of 313 organizations. The taxonomy relates the information technology
structure to the organizational structure. They are uncertain about the direction of the
causality, and the study shows that, in the contemporary research on the relationship between
IT and organizational design, the causality is not as clear as in the traditional examples.
Groth (1999) offers a contemporary example of research on the relationship between IT
and organizational design that postulates the direction of causality assumed by the
technological imperative. His agenda is to investigate the relationship between the
possibilities of IT and the basic organizational designs. His work is based on
Mintzberg’s classic taxonomy of organizations and follows the tradition of the
technological imperative.
Groth's work concludes with a suggestion for five computer-based configurations for
organizations. The configurations build on Mintzberg's (1979) two-by-two taxonomy of
organizational configurations. This taxonomy identifies four ideal organizational structures
based on environmental "determinants". The determinants are complexity (simple or
complex) and pace of change (stable or dynamic). The resulting four forms are exhibited in
the matrix below:
           Simple                                   Complex
Stable     Machine bureaucracy                      Professional organization
           Standardized work processes and output   Standardized skills and norms
Dynamic    Entrepreneurial start-up                 Adhocracy
           Direct supervision                       Mutual adjustment

The primary means of coordination used in the four organizational types are described
below the name of the type. Whereas the machine bureaucracy primarily coordinates
activities through standardized work processes and output, the professional organization
coordinates primarily through standardized skills and norms.
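The two-by-two taxonomy can also be read as a simple lookup from the two environmental determinants to a configuration and its primary coordination mechanism. The following is a minimal sketch of that reading only; the names `Configuration`, `MINTZBERG_MATRIX` and `configuration_for` are hypothetical and introduced here for illustration, not taken from Mintzberg or Groth.

```python
from typing import NamedTuple


class Configuration(NamedTuple):
    """One cell of the matrix: the ideal type and its primary means of coordination."""
    name: str
    coordination: str


# Keys are (complexity, pace of change), the two environmental determinants.
# Cell contents are copied from the matrix in the text.
MINTZBERG_MATRIX = {
    ("simple", "stable"): Configuration(
        "Machine bureaucracy", "Standardized work processes and output"),
    ("complex", "stable"): Configuration(
        "Professional organization", "Standardized skills and norms"),
    ("simple", "dynamic"): Configuration(
        "Entrepreneurial start-up", "Direct supervision"),
    ("complex", "dynamic"): Configuration(
        "Adhocracy", "Mutual adjustment"),
}


def configuration_for(complexity: str, pace: str) -> Configuration:
    """Return the ideal configuration for an environment (hypothetical helper)."""
    return MINTZBERG_MATRIX[(complexity.lower(), pace.lower())]


if __name__ == "__main__":
    cell = configuration_for("complex", "dynamic")
    print(cell.name, "-", cell.coordination)
```

The lookup makes explicit that, in this taxonomy, the environment alone selects the organizational form; nothing about a specific technology or organization enters the mapping.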
According to Groth, the basic contributions of IT are the following three, listed in order of
significance.
1. IT can process information outside of the mind
2. IT improves information storage capabilities
3. IT improves communication
This rating of IT contributions shows how information processing is still considered the
most important aspect of IT. The rating also shows that, while this thesis explores the
potential of IT to support communication, some authors, such as Groth, doubt
the value of supporting human communication with IT compared to other uses.
“... not to say that personal communication and networking are unimportant, but they
have limited potential compared to other uses of information systems.” Groth (1999) p. 14
Additionally, the rating documents a contemporary account of the technological
imperative. Groth's vision for the future organizational design is the model-driven
organization.
“The main constituting part of the organization will be the integrated computer-based
systems, their programmed patterns of action, and, implicitly, the conceptual model they are
based on. The coordination of the organization members will then be mediated mainly by the
systems and thereby (logically) by the model, not by direct human communication.” Groth (1999) p. 356
His central point is that the model you create of an organization is no longer a passive
model like an organization diagram, but that once it is built into a computer system it
becomes an active model. It is active because it simultaneously defines and describes the
organization. This characteristic was first expressed, although with a completely different
emphasis, by Shoshana Zuboff (1988) as a fundamental duality of information technology.
“Activities, events, and objects are translated into and made visible by information when
a technology informates as well as automates.” p. 10
While previous technologies only produced concrete products, IT simultaneously
generates information about the process in which the products are produced. Groth uses the
duality of information technology in an argument for letting the model drive the organization
in the sense that the model is embedded in IT systems which at the same time describe and
"are" the producing organization. The central systems of financial organizations like the one
studied in the present thesis can actually be analyzed in this way. Groth's own examples
include airplane design and car production. He concludes with three basic models: the
regulating model, the mediating model, and the assisting model. These three models result in
five different organizational forms: joystick organization, flexible bureaucracy, interactive
adhocracy, meta-organization, and organized cloud (Groth (1999) pp. 402-403).
Groth's work illustrates a perspective on IT in organizations that differs from
the one chosen for this thesis. Groth suggests that approaches other than the support of
communication between humans might produce more radical results in terms of
organizational effectiveness. His approach is basically to build a model of the organization and
represent the model in an IT system. In fact, he conceives of the building of the model as the
same process as the building of the IT system. The IT system should then guide the operations
of the organization. The approach of groupware and communication technologies, by contrast, is
to construct simple tools that let people construct simple local models of coordination.
The bank of the case study, like most other organizations, uses
both strategies for organizational computing. Processes such as financial transactions or
counselling processes for people wishing to acquire loans are modelled so that the
coordination between employees who take part in these processes is mediated by the model
and not directly by the employees. At the other end, the use of virtual workspaces, e-mail,
LAN-drives, etc. is meant to support the local construction of coordination/communication
between employees.
Besides serving as a contemporary account of the technological imperative, Groth
also expresses an often-heard opinion that groupware is not worth betting on. His opinion
aside, it is a fact that groupware is being used in a number of organizations. E-mail is now a
medium as important as the telephone in many organizations, and other kinds of groupware
have also been deployed, and among these, the virtual workspaces of our case study
organization.

Underlying both the technological and organizational imperative is the belief that the
relationship between organizational structure and properties of the technology should be
studied independently of specific organizational configurations and specific technologies.
Theories of task-technology fit are examples of the organizational imperative.

Organizations and task-technology fit


The organizational imperative typically works on a micro-level in organizations. While
most research in the technological imperative has dealt with the consequences of IT for the
macro structures of organizations, research in the organizational imperative typically
deals with micro structures in the organization.

A school of thought that has had impact on the study of groupware is decision support
theory (see e.g. Jarvenpaa (1989)). Decision support theory hinges on the idea that
organizations are units producing decisions. A natural role for IT systems is therefore to
support decision-making.
Saunders and Jones (1990) have, for example, attributed the use of media to different
phases in the organizational decision-making process. They divide the decision-making
process into three phases: identification, development, and selection. The role of written
media such as e-mail is attributed to the last phase of the decision-making process.
A special type of decision is the group decision, and Group Decision Support
Systems have consequently been developed to support such decisions. This area has been
intensely investigated. See for
example Fjermestad and Hiltz (2000), Fjermestad and Hiltz (2000) for an overview of the
empirical research.
Stemming from the decision support school are theories of task-technology fit. The task-
technology fit theories are based on the idea that there are a limited number of task types that
can be mapped to a design of an IT system. The IT system should thus enable the group of
people who perform the task to accomplish it better. That could be either faster,
with fewer errors, with fewer conflicts or with greater satisfaction.
Zigurs and Buckland (1998) present a theory of task-technology fit for Group Support
Systems. We have previously quoted the rationale behind the proposed theory:
“Can we specify particular combinations of task and GSS (Group Support Systems)
technology that will enhance group performance?” Zigurs and Buckland (1998) p. 314
The task-technology fit theory answers the question positively based on a typology of
tasks and a typology of characteristics of computer support. The different tasks are divided by
their structural properties and the typology consists of tasks which are labelled: simple,
problem, decision, judgement, and fuzzy. The types of computer support are reproduced in
the table below.
Communication support     “any aspect of the technology that supports, enhances or defines
                          the capability of group members to communicate with each
                          other.” p. 319
Process structuring       “any aspect of the technology that supports, enhances, or defines
                          the process by which groups interact, including capabilities for
                          agenda setting, agenda enforcement, facilitation, and create a
                          complete record of group interaction.” p. 319
Information processing    “the capability to gather, share, aggregate, structure, or
                          evaluate information, including specialized templates.” p. 319

By introducing different kinds of fit between technology and task, Zigurs and Buckland
propose a model for perfect matches between technology and task. The match is expressed in
five propositions, one for each type of task. For example the proposition on simple tasks
states:
“P1: Simple tasks should result in the best group performance (as defined for the specific
task) when done using a GSS configuration that emphasizes communication support.” p. 326
The proposition for fuzzy tasks states:
“P5: Fuzzy tasks should result in the best group performance (as defined for the specific
task) when done using a GSS configuration that emphasizes communication support and
information processing, and includes some process structuring.” p. 328
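The two propositions quoted above can be read as a lookup from task type to a prescribed GSS configuration. The sketch below encodes only P1 and P5, since those are the only propositions quoted here; the remaining task types are deliberately left out rather than guessed. The names `PROPOSITIONS` and `is_predicted_fit` are hypothetical, and reading "emphasizes" as an exact set of emphasized dimensions is an assumption of this sketch, not a claim about Zigurs and Buckland's formal model.

```python
# The three dimensions of computer support named by Zigurs and Buckland (1998).
COMMUNICATION = "communication support"
PROCESS = "process structuring"
INFORMATION = "information processing"

# Only P1 (simple tasks) and P5 (fuzzy tasks) are quoted in the text.
# "emphasize" = dimensions the configuration should emphasize,
# "include_some" = dimensions that should be present to some degree.
PROPOSITIONS = {
    "simple": {"emphasize": {COMMUNICATION}, "include_some": set()},
    "fuzzy": {"emphasize": {COMMUNICATION, INFORMATION}, "include_some": {PROCESS}},
}


def is_predicted_fit(task_type, emphasized, some=frozenset()):
    """Return True if a GSS configuration matches the quoted proposition.

    Assumption of this sketch: a fit requires exactly the prescribed set of
    emphasized dimensions, plus at least the prescribed "some" dimensions.
    """
    proposition = PROPOSITIONS.get(task_type)
    if proposition is None:
        raise KeyError(f"no proposition quoted in the text for task type {task_type!r}")
    return (set(emphasized) == proposition["emphasize"]
            and proposition["include_some"] <= set(some))


if __name__ == "__main__":
    print(is_predicted_fit("simple", {COMMUNICATION}))
    print(is_predicted_fit("fuzzy", {COMMUNICATION}))
```

Written this way, the theory's central assumption becomes visible: the prediction depends only on the task type and the configuration, and no property of the specific organization appears as an input.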
The theory of task-technology fit exhibits a common approach to researching and
theorizing in IS. It defines some generic structure of tasks or processes, which can be matched
to IT systems with certain properties, independently of the specific organizational setting. The
theory also exhibits some of the limitations of this kind of theorizing.
Firstly, the categories for both task and computer support are too generic to be of
practical use in the design of IT systems. As the first proposition exhibited above shows,
linking a simple group task with communication support is of limited value for designing IT
or for making decisions about implementing IT in an organization. Communication support is
a very abstract notion that could involve a number of different media such as telephone, e-
mail, instant messaging, virtual workspaces, etc.
Secondly, while acknowledging that existing organizational structures and issues of
implementation are not insignificant, the theory states that, despite all the noise, at the core
there is a perfect match between task and technology. As we shall see, when we turn to an
overview of the research on Group Decision Support Systems, the empirical evidence for this
kind of perfect match is lacking.

Media Richness theory and its critics


Another research area of importance to the present study is the research on the use of e-
mail. E-mail has from the outset been studied from the media richness theory perspective.
Media richness theory, originally formulated by Daft and Lengel, is another example of the
organizational imperative. This theory also exemplifies the view of organizations as
information processing systems.
Daft and Lengel (1986) argue that organizations process information to reduce
uncertainty and equivocality imposed by their surroundings. Seeing organizations as
information processing systems implies that organizational processes and structures are
correlated with requirements for processing information.

“Variables in the design of organizations, such as structural self-containment and
vertical information systems, should provide sufficient capacity to meet information needs.”
Daft and Macintosh (1981) p. 207
What follows is that communication media are used to support information processing,
and their ability to support information processing is characterized by their capacity to
support rich information.
“Communication transactions that can overcome different frames of reference or clarify
ambiguous issues to change understanding in a timely manner are considered rich.” Daft and
Lengel (1986) p. 560
Face-to-face communications are considered the richest, followed by telephone and then
personal documents such as letters and memos.
“Media of low richness process fewer cues and restrict feedback, and are less
appropriate for resolving equivocal issues. However, an important point is that media of low
richness are effective for processing well understood messages and standard data.” Daft and
Lengel (1986) p. 560
Media richness theory (MRT) has been used as the theory for a number of studies of e-
mail. (See Garton and Wellman (1995) for an overview of the research in e-mail.) The theory
basically predicts that e-mail will be used for "well understood messages and standard data".
This assumes that media use is based on individual rational choice independent of the social
settings in which the individuals operate.
The theory has been criticized by a number of researchers (Yates and Orlikowski (1992),
Lee (1994), Markus (1994), Ngwenyama and Lee (1997)). The most notable critique is put
forward by Markus (1994) in a classic case study of managers' use of e-mail. Her case study
shows that three aspects predicted to be aligned in MRT are actually not aligned. These are:
- Individual perception and selection of communication media
- Actual media use patterns
- Perceived media appropriateness
The perception of media differs significantly from the actual use. While managers
perceived communication media according to MRT, they used e-mail more and differently
than MRT would predict. Decisions on using communication media are not, as MRT predicts,
based on individual rational choice, but rather seem to be shaped by social factors.
Firstly, this theoretical dispute, where e-mail has been used as the empirical example
(since it was THE computer medium for communication used in organizations in the early
1990s), shows that we need to consider individuals' perceptions of what they do as something
significant and different from what the same individuals actually do. Secondly, it suggests
that we need to study the social construction of media characteristics, rather than the
individual adoption, in order to advance the understanding of communication media use in
organizations.
As a final criticism of MRT, Markus (1994) notes:
“Because it reflects our shared cultural norms about these issues, information richness
theory is generally quite able to predict managers’ perception and uses of older
communication technologies.” p. 523
MRT is reflected in our common-sense understanding of media and this may be the
reason why it works to predict managers' perception and use of older communication
technologies, but it simply cannot be used with a new medium such as e-mail. If we want to
understand how new communication media are adopted and used in organizations we should
look to research which takes into consideration the specific social structures in organizations.
Markus' critique of MRT is an example of what was earlier referred to as the emergent
perspective on the relation between properties of IT and properties of the organization. It
demonstrates the importance of using social theory in addition to the economic and
psychological theories underlying the technological and organizational imperative.

The emergent perspective


The discussions on IT and organizational design and MRT have highlighted some
problems with the two imperatives. Markus' identification of how perceptions of media and
usage of media are dependent on social structures and processes highlights the problems of a
one-way causality model. The adoption of IT systems should be understood as an interplay
between properties of the technology and the social structures and processes involved in the
adoption.
The case study of the introduction of Lotus Notes in a consulting organization, published
by Wanda Orlikowski in 1993 (reprinted in Kling (1996)) also illustrates the kind of approach
taken by the emergent perspective. Lotus Notes is one of the first groupware platforms and
the most successful in terms of sold licenses. The technology of Lotus Quickplace studied in
this thesis is built upon later developments of the Lotus Notes platform.
The purpose of her case study was to investigate the intended and unintended
consequences of the introduction of Lotus Notes on the "nature of work" and the "patterns of
social interaction". The conclusion drawn from the study of the introduction of Lotus Notes
is that:
"Two organizational elements seem especially relevant in influencing the effective
utilization of groupware: people's cognitions, or mental models, about technology and their
work, and the structural properties of the organization, such as policies, norms, and reward
systems." Orlikowski (1996) p. 174

As an example of the structural properties relevant to the adoption of the technology she
mentions the competitive culture of the consulting organization. This culture had the
consequence that employees were reluctant to share documents with each other.
The study exemplifies the emergent perspective in the sense that it identifies specific
cognitive and social structures in the organization that are necessary for understanding how
the technology is used.
Regardless of whether generic theories of IT and organizational design (e.g. Groth
(1999)) or of task-technology fit are true or useful, a study such as Orlikowski's identifies
specific problems of introducing IT that are perhaps of more practical importance. There
is, however, an ambiguity in her interpretation of the results. One could interpret the results
as pointing in two directions. The users' mental models of the technology could be seen as an
issue regarding the implementation: the fact that the technology was implemented in a
certain way by management could explain the mental models. The identification of the
structural properties of the organization, by contrast, points in a direction that has greater
consequences for how the relationship between IT and organization should be understood. In
later papers she focuses on the social structures as the most profound finding (e.g. Orlikowski (2000)).

Both Markus (1994) and Orlikowski (1996) argue for the importance of the specific
social structures in organizations for the understanding of how IT is introduced in
organizations. Structuration theory formulated by Anthony Giddens has been proposed as a
theoretical frame for addressing this.

Structuration theory as a basis for an emergent perspective


Structuration theory has been adopted as one of the social theories of choice in IS.
Alongside actor network theory (Callon and Law (1989), Latour (1991), Lea, O'Shea et al.
(1995)), activity theory (Engeström (1987), Engeström, Miettinen et al. (1999)), critical
theory (Habermas (1981), Lyytinen (1992)) and others, it has established itself as a possible
framework for understanding the adoption of IT in organizations.
One of the basic purposes of structuration theory (as for any social theory) is to explain
the relationship between agents and social structures. This relation is captured in the notion of
the duality of structure. The duality of structure holds that social structures regulate how
agents interact in situations while the same agents in the same situations create, recreate and
change the same structures that regulate the agents.
Structuration theory has been criticized by, for example, Jensen (2000) as a social theory
that ignores the influence of media in the interactions between agents and social structures.
He proposes a model that includes the media in the equation, and suggests that we understand
the relationship between agents, structure and media as a trichotomy that extends Giddens'
dichotomy of structure and agents.

[Figure: the trichotomy of Structure, Agent, and Medium]
While Giddens' concept of duality of structure addresses how social structures interact
with agents, the introduction of the media element results in three different kinds of
interactivity. This distinction allows us to specify the relationship studied in this thesis. While
the discussion until now has been structured around the relationship between organizations
and technology, structuration theory includes the agents as an important element. The agents
create and recreate social structures.
The three interactivities are agent - medium, agent - structure, medium - structure. The
interactivities specified in a setting of organizational computing and computer media will
result in three different kinds of relations to look for:
Agent - medium        The relationship between agent and medium is the relationship
                      studied in HCI. It analyzes the agents' interactions with a medium.
Agent - structure     The relationship between agent and structure is studied in, for
                      example, research on computer mediated communication. It studies
                      how computer media affect communication.
Structure - medium    This relation concerns how social structures in an organization
                      interact with the properties of computer media.

The focus of this thesis is the agent - structure interactivity. The strength of Jensen's
model, however, is its insistence that the three elements cannot be studied independently.
The purpose of the research is therefore to analyze how the agent - structure interactivity
is affected by the medium, and the agent - medium and structure - medium interactivities
should be included in observations on agent - structure interactivity.
Structuration theory has been used as the starting point for a number of theoretical
accounts of how collaborative technologies are adopted in organizations. It has been used for
characterizing two different processes in relation to IT: the design of IT, and the adoption of
IT. This distinction has not until recently (Orlikowski (2000) provides an attempt) been
clarified. Orlikowski and Robey (1991), Lyytinen and Ngwenyama (1992), Orlikowski
(1992), Desanctis and Poole (1994) all focus their use of structuration theory on how social
structures are built into IT. Social structures are therefore primarily thought of as something
built into IT.
Two specific uses of structuration theory have offered themselves as theories for
understanding Group Support Systems (Desanctis and Poole (1994)) and CSCW (Lyytinen
and Ngwenyama (1992)). We shall therefore deal with them in more detail.

Adaptive structuration theory


Desanctis and Poole (1994) suggest adaptive structuration theory (AST) as a way of
handling some of the problems with the existing research on Group Support Systems (GSS)
and Group Decision Support Systems (GDSS).
They describe the purpose of AST in the following way:
"…for studying the role of advanced information technologies in organization change.
AST examines the change process from two vantage points: (1) the types of structures that are
provided by advanced technologies and (2) the structures that actually emerge in human
action as people interact with these technologies." Desanctis and Poole (1994) p. 121
AST thus addresses the distinction between structures related to the design of IT (1) and
the adoption of IT (2).
Adaptive structuration theory attempts to bridge the gap between what Desanctis and Poole refer to as
the decision theorists and the institutional school. These more or less correspond to the terms
technological and organizational imperative.
They describe the social structures that are built into technology in the following way:
"Prior to development of an advanced technology, structures are found in institutions
such as reporting hierarchies, organizational knowledge, and standard operating procedures.
Designers incorporate some of these structures into the technology; the structures may be
reproduced so as to mimic their non-technology counterpart, or they may be modified,
enhanced, or combined with manual procedures, thus creating new structures within the
technology. Once complete, the technology presents an array of social structures for possible
use in interpersonal interaction, including rules and resources." Desanctis and Poole (1994)
p. 125
AST characterizes the social structures built into information technology by two dimensions:
1. Structural features
The structural features are specific capabilities, rules and resources offered by the system.
These features can be evaluated in terms of their restrictiveness (the set of possible
actions), their sophistication (see Desanctis and Gallupe 1987) and by comprehensiveness
or richness. The more comprehensive the system the greater the number of features
offered to users.
2. The spirit of these features
Spirit can be identified by treating the technology as a "text" and developing a reading of
its philosophy based on analysis of:
- the design metaphor
- naming and presentation of features
- the nature of the user interface
- training materials and on-line guidance
- other training or help provided with the system
The structure built into the technology is a social structure along with others like the
"task" and "organizational environment" (examples provided by Desanctis and Poole (1994)).
They describe the second aspect of structuring (the adoption of IT) in the following way:
"When the social structures of the advanced information technology are brought into
action, they may take on new forms. That is, interpersonal interaction may reflect rules and
resources that are modified from the advanced information technology. For example, when a
group uses voting rules built into a GDSS, it is employing the rules to act, but - more than this
- it is reminding itself that these rules exist, working out a way of using the rules, perhaps
creating a special version of them. In short the group is producing and reproducing the GDSS
rules for present and future use."
This description of the process of adopting a technology actually ignores the social
structures existing in the context where an IT system is put to use. It only deals with the social
structures that are built into the IT system. AST uses "appropriation" as the concept for
explaining how the technology is integrated in the work environment. In the fourth
proposition they state that:
"New social structures emerge in group interaction as the rules and resources of an
advanced information system are appropriated in a given context and then reproduced in
group interactions over time." Desanctis and Poole (1994)
By appropriations they mean the following aspects of adoption:
- The group can choose to appropriate a structural feature of the system in different
ways: adopt it directly, combine it with other social structures, or interpret and
reflect on it.
- Faithful vs. unfaithful appropriations (whether they adhere to the "spirit of the
technology")
- Appropriation of features for different instrumental uses
- Attitudes towards the structural features (confidence, perceived value etc.)

The problem with AST is two-fold. Firstly, technology is considered as something that
represents social structure. Orlikowski (2000) notes that this is a misinterpretation of Giddens'
theory. As Jensen's (2000) development of structuration theory suggests, we should
separate the media (IT) from the social structures. The concept of the duality of structure
implies that social structures only exist through the enactment of agents in recurrent
situations. Contrary to this, properties of technology, despite their origin in social processes,
have a material existence independent of their use.
The second problem with AST has to do with the limited aspects of the use situation
captured by the concept of appropriation. The appropriation process only focuses on how the
structures that are built into the technology affect the use. It therefore tends to ignore
social structures such as existing patterns of communication, power structures, and culture.

Lyytinen and Ngwenyama (1992) attempt a definition of CSCW systems based on
Giddens' structuration theory that has similar problems to AST.
They define CSCW applications in the following way:
"Computer Supported Cooperative Work applications are open evolutionary structures
embedding organizational and linguistic rules and serving as resources that mediate and
transform cooperative interactions via recurrent use-processes (procedures and practices)
within specific organizational contexts." Lyytinen and Ngwenyama (1992) p. 26
As with AST, Lyytinen and Ngwenyama (1992) define the technology as a social
structure. They therefore make the same category mistake, which may not be evident when
we discuss systems that are custom designed for an organization. When we deal with
standardized systems such as virtual workspaces, however, the social structures relevant for
understanding the design of a technology are different from those which are relevant for
studying its adoption. Orlikowski (2000) has clarified this distinction.

Technology-in-practice
Orlikowski (2000) argues that we should draw a distinction between the technology as
artefact, and the technology-in-practice. The technology as artefact is an entity with certain
properties that we for example describe as functionality. These properties should neither be
thought of as properties determining specific uses, nor as social structures that are “built” into
the artefact as suggested by AST and Lyytinen and Ngwenyama (1992). In other words,
we should distinguish clearly between the social processes involved in the design of IT, and
the social structures involved in adopting IT. The result of the social processes of design
exists as fixed properties of an IT artefact when the IT is adopted.
What happens in the use situation is that users interact with some properties of the
technology at hand, while ignoring most of them, and in this interaction create and recreate
the social structures that constitute work.
"Through their regularized engagement with a particular technology (and some or all of
its inscribed properties) in particular ways in particular conditions, users repeatedly enact a
set of rules and resources, which structures their ongoing interactions with that technology."
Orlikowski (2000) p. 407
Thus she corrects the category mistake made by AST and by her own previous attempts
in Orlikowski (1992) and Orlikowski and Robey (1991). The technology as artefact thus
represents the material properties of a technology that are fixed at the point of adoption.
Technology-in-practice is characterized in the following way:
“These enacted structures of technology use, which I term technologies-in-practice, are
the sets of rules and resources that are (re-) constituted in people’s recurrent engagement
with the technologies at hand.” Orlikowski (2000) p. 407
While users interact with some of the properties of the IT artefact, they do not interact
with all of them. Nor can the designer predict which properties will be used. This differs from
AST, which hinges on a notion that the properties of IT have specific effects on the adoption.
Desanctis and Poole leave some room for variation, but they basically assume a specific link
between properties of IT and the resulting use patterns.
Orlikowski mentions the World Wide Web technology originally designed by Tim
Berners-Lee (Berners-Lee and Fischetti (1999)), as an example of a technology which would
defy the notion of a direct link between its design and the resulting use. Another obvious
example is the e-mail technology, which Orlikowski has herself studied but for some reason
does not mention. In Bøving (2001) the design of the e-mail standard is analyzed, and there is
by no means a direct link between its design and the use observed in the numerous studies on
e-mail in organizations (see Garton and Wellman (1995) for an overview of e-mail
research).
The conceptual distinction between the technology as artefact and the technology-in-
practice also seems a useful distinction for understanding the observations made in this thesis.
The character of the design process of virtual workspaces as well as the use patterns observed
in the case study suggest that there is no direct link between the properties of the technology
and the resulting use patterns. At least there are other social structures more important for
understanding how the technology is adopted.

Selecting the relevant social structure


Using structuration theory and its applications in IS is, at the outset, a highly generic
approach that could justify the study of a broad range of aspects. The relationship between
organizational structures and computer media has been studied from a number of more
specific perspectives. Rice and Gattiker (2000) provide an overview of the studies.
One of the aspects of social structure studied is the issue of power relations. The
introduction of communication technologies affects existing power relations in organizations
(see e.g. Markus (1983)).
The kinds of social structures studied in the present thesis are patterns of organizational
communication. This study investigates the relationship between a collaborative technology
and organizational communication. This also implies that the technology is understood as a
medium for communication.

Organizational communication
The research field of organizational communication has been around since the late 1930s
(Jablin and Putnam (2000)) and has from different perspectives studied diverse aspects of the
communication taking place in organizations.
According to Stanley Deetz (Jablin and Putnam (2000) p. 4 – 5), organizational
communication can be approached through three different conceptions:
1. Organizational communication as a specialty of communication departments and
communication associations
2. Communication as a phenomenon that exists in organizations alongside other
phenomena
3. Communication as a way to describe and explain organizations
In the third approach organizational communication becomes an alternative theory of
organizations to the decision making (e.g. Simon (1977)) and information processing (e.g.
Media Richness Theory) approaches, which we have discussed previously and which are
primarily based on psychological and economic theories.
In the context of this thesis, communication is treated as such an alternative theory of
organizations. Before we proceed to the choice of a theory that uses structuration theory as
the starting point for understanding organizational communication and its relation to
computer media, we need to deal more specifically with alternative theories for understanding
groupware. The virtual workspace technology studied here should be considered a kind of
groupware, and both the CSCW and GSS traditions offer frameworks for understanding
groupware.

CSCW
The IS research tradition, of which we have seen a number of examples, and studies of
the impact of communication technologies on organizational communication are
characterized by a modest interest in specific technology designs. Most articles refer to rather
abstract characterizations of technologies’ properties such as “communication support”,
“decision modeling” and “rule-writing capability” (Desanctis and Gallupe (1987)).
Computer Supported Cooperative Work has established itself as a research tradition
with a biennial American conference since 1986, a biennial European conference since
1989 and the Journal of CSCW. Grudin (1994) provides an overview of the CSCW tradition;
Hughes, Randall et al. (1991) and Schmidt and Bannon (1992) attempt a definition of the field
of CSCW. The CSCW tradition seems rather isolated from the other related IS disciplines
presented so far and is characterized by many experiments with the construction of CSCW
systems (e.g. Conklin and Begeman (1988), Bentley and Dourish (1995), Bentley, Horstmann
et al. (1997), Guzdial, Rick et al. (2000)). It is also characterized by numerous accounts of
methods for designing CSCW systems (e.g. Grudin (1991), Grudin (1994), Teege (2000),
Büscher, Gill et al. (2001)).
The tradition has also produced theories of the nature of cooperative work relevant for
understanding CSCW. These are based on notions such as a distinction between "work" and
"coordination of work", and a notion of articulation work. CSCW draws on a number of
research fields and there is no unified theory of work underlying CSCW. However,
cooperation, coordination and articulation work stand out as central concepts in the CSCW
understanding of work.
The rationale behind CSCW is, as implied in the acronym, to support what is termed
"cooperative work".
"CSCW should be conceived as an endeavour to understand the nature and
characteristics of cooperative work with the objective of designing adequate computer-based
technologies". Schmidt and Bannon (1992)
At least for several researchers in CSCW, coordination and articulation are seen as basic
concepts for understanding the work supported by computers (Schmidt and Bannon (1992),
Schmidt and Simone (1996), Suchman (1996), Divitini and Simone (2000)). The concept of
articulation work stems from Strauss (1985). In the words of Schmidt and Bannon (1992),
articulation work amounts to:
"First, the meshing of the often numerous tasks, clusters of tasks, and segments of the
total arc. Second, the meshing of efforts of various unit-workers (individuals, departments
etc.). Third, the meshing of actors with their various types of work and implicated tasks."
Schmidt and Bannon (1992)
Underlying the idea of articulation work is a distinction between articulation work and
more basic work tasks. This distinction indeed makes sense in settings where
physical labour is involved, for example, in a production facility. The coordination of efforts
through the means of articulation is distinct from the completion of the actual work of, for
example, manipulating steel. Whether this distinction is relevant for symbolic work settings
(e.g. office work) is less clear. One "side-effect" of the notion of articulation work concerns
the role it assigns to communication in understanding work.
The role of communication in this theory of work is that it is conceived as a means of
articulating work. This has the "hidden" consequence that communication is not understood
as basic work. The purpose of a CSCW system is to facilitate the articulation of work and
"thus augment the capacity of the ensembles in articulating their distributed work" (Schmidt
and Bannon (1992)). In production settings where the basic work is seen as the physical
manipulation of materials, treating communication as a means for the articulation of work
seems like a very useful distinction. In the settings of work dealt with in the context of this
thesis, where all that is manipulated is symbols, the role of communication as an articulator of
work seems limiting. In symbolic work, communication should be considered basic work. The
outcomes of the activities of symbolic work could, in many cases, actually be characterized as
communication.

Research in Group Decision Support Systems (GDSS)


Another tradition that has researched groupware is the tradition of Group Decision
Support Systems (GDSS) and, more broadly, Group Support Systems (GSS). As described
previously, decision-making is considered a core activity in organizations, especially for
managers. Research on group behavior and factors that improve decision-making has a
long tradition in social psychology. McGrath (1984) provides an overview of the research.
The research has, among other things, resulted in a large number of methods (see e.g.
VanGundy (1988)) for improving group decision processes. These are then typically used as a
basis for research on computer support for group decision-making. See e.g. Desanctis and
Gallupe (1987) for a suggestion for a framework for studying GDSS.
“A GDSS aims to improve the process of group decision making by removing common
communication barriers, providing techniques for structuring decision analysis, and
systematically directing the pattern, timing, or content of the discussion.” Desanctis and
Gallupe (1987) p. 589
Research in GDSS has primarily been performed as experiments. A typical example of a
research design can be found in Watson, Desanctis et al. (1988). Here 82 university student
groups of three to four persons were assigned randomly to three different experimental
conditions: a computer-based support system (GDSS), a paper and pencil based support
system, or no support at all. The common task of the groups was to act as a philanthropic
organization deciding how to allocate money among six projects competing for funding. The
authors summarize the results as follows:
“In general, the GDSS technology appeared to offer some advantages over no support,
but little advantage over the pencil and paper method of supporting group discussion.”
Watson, Desanctis et al. (1988) p. 463
Group Decision Support Systems are in general systems that support meetings where
people are interacting synchronously. One GDSS called GroupSystems (now a commercial
company, www.groupsystems.com) has been used in 55% of 54 case and field studies of
GDSS (Fjermestad and Hiltz (2000)). GroupSystems is a tool that supports online meetings and
has tools for supporting the decision process. These include support for collaborative
generation of ideas (electronic whiteboard), surveys, and voting among the
participants. The typical experimental research design also places people in the same location
with computer screens.
While the term GDSS denotes research generally based on decision making with the
“group” being a special case, a broader term “Group Support Systems” has emerged. GSS
also includes experiments that are not specifically focused on decision-making, and includes
what is referred to as CMC systems.
At least 200 experiments have been conducted on GSS (Fjermestad and Hiltz (1998-1999)),
compared to 54 case and field studies (Fjermestad and Hiltz (2000)).
“The results show that the modal outcome for GSS systems compared with face-to-face
(FtF) methods is "no difference," while the overall percentage of positive effects for
hypotheses that compare GSS with FtF is a disappointing 16.6 percent.” Fjermestad and Hiltz
(1998-1999) p. 7
The approach taken by GSS and GDSS has not produced much empirical evidence for
the underlying theories of group decision-making and task-technology fit. The research in
GSS is typically based on the notion of a group task. A recent theory of task-technology fit
was introduced previously. It defines a task in the following way:
“Thus, a group task is defined here as the behavior requirements for accomplishing
stated goals, via some process, using given information.”
Zigurs and Buckland (1998)
Apart from the lack of empirical evidence for the usefulness of the notion of task, the notion is
problematic for studying the use of virtual workspaces in another sense as well. When we
look at the use of Lotus Quickplace at Beta, the process of defining and changing tasks during
the integration of the technology is important. In the settings we look at, tasks are rarely
defined in advance. The goals are clear only at an abstract level and are not entirely agreed
upon by all members, the process is not defined, and the information needed is not given in
advance. Thus the three defining criteria for a task as formulated by Zigurs and Buckland are
not met.
Much of what goes on in the settings where Lotus Quickplace is used is that members of
a group negotiate goals, process, and information needed. This is an important part of
working and part of the rationale behind designing a technology such as Lotus Quickplace.
Instead of focusing on the technologies-in-practice, the concepts of group decision
process and task assume a rational definition of work that ignores the specific social
structures involved in the adoption of IT.

Units of analysis for organizational communication


The present thesis studies empirically how a virtual workspace technology is adopted in
the communication of an organization. I am interested in how the patterns of organizational
communication affect, and are affected by, the technology. This could be stated in the
following question: What is the role of the technology in the ongoing process of reproducing
and changing patterns of communication in the organization? The underlying assumption is
that this is important for understanding the adoption of the technology in the organization.
At this point a unit of analysis for studying patterns of communication is needed. JoAnne
Yates and Wanda Orlikowski have developed the theory of genres of organizational
communication and genre system in a number of papers (Yates and Orlikowski (1992),
Orlikowski and Yates (1994), Orlikowski, Yates et al. (1995), Yates, Orlikowski et al. (1997),
Yates, Orlikowski et al. (1999), Yates and Orlikowski (2002)), which are based on three
empirical studies.

Genres of communication
The theory of genres of organizational communication is based on two major theoretical
developments. The first is that of structuration theory, which we have dealt with earlier, and
the second is the concept of genre, which is drawn from rhetorical theory.
As noted in Orlikowski and Yates (1994) the approach taken is to treat organizational
communication as an alternative theory of organizations to the decision making and
information processing approaches we have discussed above, which are primarily based on
psychological and economic theories. In this respect they agree with Deetz (2000) mentioned
earlier.
The theory of genres of organizational communication has adopted the concept of genre
from rhetorical theory and used the premises of structuration as a framework for
understanding organizational communication and how it develops, and how media, and
especially computer media, affect and are affected by the genres of communication. From
rhetorical theory Orlikowski and Yates adopt Miller's (1984) definition of genres as "typified
rhetorical actions based on recurrent situations", p. 159.
They define the overall concept of genres of organizational communication in the
following manner:
“…a genre of organizational communication is a typified communicative act having a
socially defined and recognized communicative purpose.” Yates and Orlikowski (1992) p. 3
Genres of organizational communication are therefore types of communicative acts such
as project meetings or meeting agendas. Henceforth genre is used as an abbreviation of genre
of organizational communication.
Three different aspects define genres: the social rules, form, and content. While the form
and content are the observable properties of the genre, the social rules are social structures in
Giddens' sense, which are only observable through their effect on the form and content of the
genre. The content of the genre includes: “… social motives, themes, and topics being
expressed in the communication.” (Yates and Orlikowski (1992) p. 301). The form is “… the
observable physical and linguistic features of the communication.” The medium used for the
genre is considered an aspect of the form of the genre.
"Media are the physical means by which communication is created, transmitted, or
stored. Genres are typified communicative actions invoked in recurrent situations and
characterized by similar substance [content] and form." Yates and Orlikowski (1992) p. 319
A project meeting, for example, has certain social rules. There is a project manager who
issues invitations for the meeting and decides who should participate. A social rule dictates
formal cancellation if one cannot participate in the meeting. The form of the project meeting
concerns aspects such as the typical existence of an agenda, that the meeting is held in a room
and the appointment of a person to produce minutes from the meeting. The content of the
project meeting concerns the themes dealt with in the meeting. (A project meeting typically
discusses the project plan and whether the project is on target or delayed.)
The agenda of a meeting is another example. It involves certain social rules, for instance
that it is produced by the chairperson of the meeting, in some cases at the
meeting and in other cases sent out in advance. The form of the agenda could be that it is
a Word document attached to an e-mail with the invitation to the meeting. The content of the
meeting agenda is the subjects to be dealt with at the meeting, listed in a specific
order.
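The three aspects can also be written down as a simple coding record, for instance when observed communicative acts are to be classified by genre. The following is a minimal sketch only: the class name `Genre`, its field names, and the method `observable_aspects` are illustrative assumptions and are not part of Yates and Orlikowski's theory. The two records paraphrase the project meeting and meeting agenda examples above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Genre:
    """Hypothetical coding record for one genre of organizational communication."""

    name: str
    # Social rules are social structures: inferred, not directly observable.
    social_rules: List[str] = field(default_factory=list)
    # Form covers observable physical and linguistic features, including the medium.
    form: List[str] = field(default_factory=list)
    # Content covers the themes and topics expressed in the communication.
    content: List[str] = field(default_factory=list)

    def observable_aspects(self) -> Dict[str, List[str]]:
        """Return only form and content, the aspects that can be observed directly."""
        return {"form": self.form, "content": self.content}


# The two examples from the text, paraphrased as coding records.
project_meeting = Genre(
    name="project meeting",
    social_rules=["project manager issues invitations", "formal cancellation if absent"],
    form=["agenda", "held in a room", "minutes taken"],
    content=["project plan", "project on target or delayed"],
)
meeting_agenda = Genre(
    name="meeting agenda",
    social_rules=["produced by the chairperson"],
    form=["Word document", "e-mail attachment"],
    content=["subjects of the meeting in a specific order"],
)

for genre in (project_meeting, meeting_agenda):
    print(genre.name, sorted(genre.observable_aspects()))
```

Keeping the social rules in a separate field from form and content mirrors the point that the rules can only be inferred from their effect on the two observable aspects.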
A very important feature of genres is how they change, and this is where Yates and
Orlikowski put Giddens' idea of structuration into play. As introduced earlier, one of the
central concepts in the theory of structuration is the duality of structure: this duality holds that
social structures regulate how agents interact in situations, while the same agents in the same
situations create, recreate and change the same structures which regulate the agents. This is
known as the process of structuration.

The role of computer media in genre theory


Adopting the concept of genre implies that IT (in our case the virtual workspace) is
conceived as a medium for communication. In this respect, virtual workspaces facilitate
communication as a medium of that communication in the same sense as the telephone or e-
mail is a medium. Stated in the active voice: virtual workspaces mediate genres.
The introduction of a new medium such as a virtual workspace may change existing
genres or create new ones through the change of the social rules, form, or content of the
genre. But the change happens in the process of structuration; it does not happen as a causal
effect of the technology. Moreover, the media will be shaped by the existing genres of
communication in the organization.
A good example of a new genre of communication that has emerged with the
introduction of e-mail, which I have personally experienced in a consulting company, is the
“cc: of the bosses”. When you want somebody in another department to do something for you
and you are not sure they will do it, it is a well-known and often used trick to cc: your boss
and the recipient's boss. It works as a way of notifying the recipient that you are serious and
prepared to cause trouble if they do not react properly. Before e-mail this trick would not
have worked, as you would have had to call up both bosses, resulting in a reputation as the
troublesome employee. Surely, the act of notifying bosses when departments were working
together was not introduced with e-mail. But it is a new genre of communication, because the
social rules around it have changed (allowed frequency, consequences for the implicated
parties) and the form has changed (simple cc: function in the mail standard).

Adhering to the distinction between technology as an artefact with fixed properties and
technology-in-practice, the concept of genres of organizational communication offers a
specification of relevant social structures for understanding the use of computer media. The
process in which genres of communication are reproduced and changed using computer
media is thus an example of technology-in-practice.
While computer media will affect the existing genres, the reverse effect is perhaps the
most important contribution for understanding the adoption of computer media:
"Any time a new communication medium is introduced into an organization, we expect
that existing genres of communication will influence the use of this new medium, though the
nature of this influence will reflect the interaction between existing genres and human action
within specific contexts." Yates and Orlikowski (1992) p. 318

Genre repertoire and genre system


In addition to the original concept of genre, Yates and Orlikowski have developed two
additional concepts concerning genre.
Community is a concept used extensively in the analysis of computer media, and genre
repertoire is used by Yates and Orlikowski as a means of characterizing a community. A
repertoire of genres is a collection of genres routinely enacted by a particular community.
“A community’s genre repertoire indicates its established communicative practices. Hence,
the concept of genre repertoire can serve as a useful analytic tool for investigating the
structuring of a community’s communicative practices over time.” Orlikowski and Yates
(1994) p. 546
The concept of genre system, developed in Yates, Orlikowski et al. (1997) and Yates and
Orlikowski (2002), serves as a way of defining genre itself more clearly. In Yates and
Orlikowski (1992) they use the meeting as an example of a genre. A meeting consists of
multiple communicative moves. In later accounts (Orlikowski and Yates (1994), Yates,
Orlikowski et al. (1997)), they distinguish between a genre and a genre system. Thus, it
becomes clearer that a genre is conceived as a single communicative "move". So, a meeting
should actually be analyzed as a genre system because it consists of multiple "moves".
"To summarize, a genre system, when enacted by participants, structures or
choreographs multi-party interactions within and across communities by specifying the key
dimensions of communication - purpose, content, participants, form, time, and location. A
genre system therefore, serves as an interaction template that participants draw on in
engaging with each other across media, time, and space." Yates and Orlikowski (2002) p. 17
They suggest that genre systems should be used as the unit of analysis for studying
communication in groupware:
"We suggest that the notion of genre system may be particularly useful for studying
collaborative communicative activities in electronic media, because a genre system is an
interlocking and interdependent set of genres that, by definition, requires collaboration."
Yates, Orlikowski et al. (1997) p. 51
Yates, Orlikowski et al. (1997) report a case study in which the genres enacted in a
groupware system are studied. The groupware technology is Lotus TeamRoom, which is a
predecessor of the Lotus Quickplace technology studied in the present thesis. The study
identifies three genre systems used in the TeamRoom: meeting documentation genre system,
collaborative repository genre system, and collaborative authoring genre system. All genre
systems consist of multiple genres. The meeting documentation genre system, for example,
consists of meeting logistics, meeting agenda, and meeting minutes.
The study is based on the contents of all messages posted to three TeamRooms
over a seven-month period. A total of 492 messages were analysed, and the content analysis was
followed by interviews with members from each of the TeamRooms.
Genre systems clarify that communicative acts cannot be understood without
understanding preceding and subsequent communicative acts. The concept of genre systems
will be used as the primary unit of analysis when adopting the genre theory in the context of
this thesis.

The consequence of introducing genres of communication


Stating that virtual workspaces should be conceived as media used in genres of
organizational communication has certain important consequences:
1. The adoption of a virtual workspace is measured in terms of how it is integrated in
the genres of communication of the organization. Whether the adoption makes
communication more efficient or improves the quality of work, or indeed whether other
measures of success can be assigned to the introduction of the technology is not assessed by
genre theory. It purely addresses how the technology is adopted.
2. The agents or users change the genres in the situations where they communicate. This
is not accomplished by the designer of the application, and is not a direct effect of the
properties of the technology artefact.
3. The integration of a virtual workspace into the daily work of the organization means
that some of the existing genres of communication will be modified, as the technology is
adopted in the genre.
4. While virtual workspaces will change existing genres, new genres will also emerge as
a consequence of the integration of the medium in the practice of work.

The purpose of introducing the theory of genres of organizational communication into
this thesis is that it should work as a framework for understanding the adoption of the
technology. The study will show that the theory can be used as a sensible framework for
understanding in detail how the virtual workspace is integrated in the practice of work. It is
not the purpose of this thesis to confirm or disconfirm the theory as applied to the specific
combination of technology and organization studied.

As stated in the introduction of this thesis, the secondary purpose of the thesis is to
experiment with multi-method studies of the use of virtual workspaces. More specifically, the
use of log file analysis will be explored in combination with interviews and survey data. The
theory of genres of organizational communication has implications for how to research genres.
Yates and Orlikowski (1992) suggest that both diachronic analysis and synchronic analysis of
genres of communication could be useful. Their study of how the memo genre has evolved in
business organizations is an example of a diachronic analysis. On synchronic analysis they
state the following:
"Synchronic analyses would identify the existing genres influencing communication and
media use within certain contexts, either by searching for the presence of well-established
genres such as the memo or the meeting, or by identifying genres based on detailed analysis
of communication form, substance, and the invoking situation." Yates and Orlikowski (1992)
p. 322
The investigations presented in this thesis identify genres based on a detailed analysis of
specific situations in which a genre is used. The purpose of the study is not to identify the
typical genres used in relation to a virtual workspace, but to provide examples of the detailed
adoption of specific genres and illustrate how log analysis can provide additional insights in
this process. Particularly for the analysis of genre systems (multiple related communicative
"moves"), log analysis provides important insights into the relationship between the
individual communicative moves not captured by content analysis alone.
Genre studies are typically based on the analysis of content. This is the case both in the
context of organizational communication and in the aesthetic analysis of media
products in general such as films, novels, etc.
“Genre analysis requires qualitative textual analysis of messages to understand the
situations within which certain genres are invoked and their shared purpose, substance and
form.” Orlikowski and Yates (1994)
Textual analysis is not used as a method in the study reported here. This will, of
course, limit what can be said about the communication content of a genre. Instead, the study
shows how log file analysis, which provides a detailed account of how communicative actions
are linked temporally in a genre system, can extend the analysis of genres and genre systems.
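As a minimal sketch of what "linked temporally" can mean in practice, the following Python fragment orders a handful of logged actions by time and computes the interval between consecutive actions. The events, user names, and action labels are hypothetical illustrations (loosely modelled on the meeting documentation genre system), not data from the case study, and the function name is an assumption introduced here.

```python
from datetime import datetime

# Hypothetical logged actions: (timestamp, user, action). Not case-study data.
events = [
    ("2003-01-15 10:30", "user_b", "read meeting agenda"),
    ("2003-01-15 09:00", "user_a", "post meeting agenda"),
    ("2003-01-16 14:00", "user_a", "post meeting minutes"),
]

def temporal_sequence(raw_events):
    """Sort actions by time and attach the gap to the preceding action."""
    parsed = sorted(
        (datetime.strptime(stamp, "%Y-%m-%d %H:%M"), user, action)
        for stamp, user, action in raw_events
    )
    sequence = []
    previous_time = None
    for when, user, action in parsed:
        # The first action has no predecessor, so its gap is None.
        gap = None if previous_time is None else when - previous_time
        sequence.append((action, user, gap))
        previous_time = when
    return sequence

sequence = temporal_sequence(events)
for action, user, gap in sequence:
    print(action, user, gap)
```

A sequence of this kind says nothing about the content of the messages; it only makes the order and spacing of the communicative moves visible, which is the aspect content analysis alone does not capture.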

Research method
Before we can proceed to the analysis of the virtual workspace technology, and report
from the case study, we need to reflect on the research methods chosen for the investigation.
A research method and a research design allow for the drawing of certain kinds of
conclusions and the ruling out of others. This section is devoted to a discussion of
some methodological issues concerning the study of computer media use in an
organization, to issues specifically related to utilizing log analysis, and to the presentation
of the research design of the case study.
In this thesis I am devoting relatively more text to these methodological considerations
than is customary, due to the experimental character of the case study. There are two main
issues that have not been resolved systematically in the literature on research method:
1. How can we use quantitative data from user-actions to investigate a research
question, which would normally be classified as an interpretive research question
lending itself to qualitative methodologies?
2. What is the validity of HTTP-log files when using them to analyse the use of a
web-based technology such as Lotus Quickplace?
The purpose of this section is firstly, to discuss the methodological issues of combining
quantitative and qualitative data, and secondly to report the considerations made during the
different phases of the case study, in order for the reader to judge the conclusions drawn
based on the actual process of gathering and analysing data.

Defining the object of study


An important step in any scientific research process is the identification of the research
object. This might seem trivial, and in some well-established cases it is. The important aspect
of identifying the research object (besides being able to tell others about what you are
studying) is that the nature of the object determines or at least influences the choice of
research method.
If you are in biology, describing your object of study as some specific process in the cell
is pretty straightforward. The research tradition of biology pretty much defines the objects or
relations relevant for biological research.
When you turn to the study of organizational IT, identifying the object of research is not
necessarily obvious. A first distinction that seems natural would be to distinguish between
social structures or properties of the organization, and the properties of the technology. Since
we are interested in the interaction between social structures and the properties of an IT
artefact, we need to find a hypothesis on the kind of relationship between the technology and
the social structures. As discussed in the previous section on existing research, there are two
dominant imperatives for this relation: the technological imperative, and the organizational
imperative. The technological imperative states that the properties of the technology affect the
social structures of the organization (e.g. the organizational form (Leavitt and Whisler (1958),
Simon (1977), Groth (1999))). In positivist research terms, the social structures are treated as
the dependent variable in the technological imperative. The organizational imperative treats
the technology as the dependent variable, and states that IT is designed to satisfy needs that
are a result of the social structures in an organization (e.g. the media richness theory (Daft and
Macintosh (1981), Daft and Lengel (1986))).
As Orlikowski and Iacono note, studying either the social implications of technology or
the properties of the technology independently from its use is analytically hazardous.
“By following specific artefacts over time, it should become clear that changes occur not
only in the social, behavioural, and economic circumstances within which the artefacts are
embedded (resulting in the so called “societal” or “organizational transformations” that we
hear so much about) but also that changes are constantly occurring in the IT artefacts
themselves – whether through invention, innovation, regulation, expansion, slippage,
upgrades, patches, cookies, viruses, workarounds, wear and tear, error, and failure.”
Orlikowski and Iacono (2001) p. 132
As others also have stressed (see e.g. Markus and Robey (1988)), both imperatives are
too general to be useful in understanding how a technology such as Lotus Quickplace
interacts with the social structures at Beta. While this opens the space for new understanding
of the relationship between technology and organizations, it leaves us with the rather daunting
task of answering the question:
What properties of the artefact and which social structures are relevant to the
relationship between technology and social structures?
Let me repeat the research question in order to bring it into the light of this discussion:
How do the properties of an IT artefact for organizational communication in a group of
people interact with the social structures and result in a work practice in which the IT artefact
plays a role?
The phrases “properties” and “interact with” indicate that the research here is based on
the assumption that we cannot understand the relation between technology and social
structures if we separately choose either one of the imperatives. This is true on a general level
and implies that theories that do not deal with the specific social structure in the
organization will not improve the understanding of how technology and organization
interact. From another perspective, however, we should not ignore that technologies do affect
organizations and vice versa. The meeting of technology and organization is simply a
collision of factors and consequences combined in different ways, which produce a result
where both technology and the organization are affected. The properties of the IT artefact will
allow for certain usages of the technology and rule out others.
The concept of “social structures” used in the research question is as vague a term as
any. Social structures can be investigated on many levels in an organization. The specification
of social structures in the context of this thesis is made through the notion of genres of
organizational communication. The kinds of social structures studied here are those that
are relevant to the genres of communication used by “groups of people”. The notion of groups
of people means that we are not dealing with e.g. strategic communication from management,
for which the Intranet is used at Beta.
The formulation of the research question also indicates (at least it should!) the kind of
answer one might provide. The purpose of a case study such as the one reported here is to
provide a set of generalizations that make the observations made here relevant to
researchers or professionals dealing with related issues.
I can rule certain generalizations out:
- It is not the purpose of this thesis to draw conclusions on the general behaviour,
ethics, aesthetics etc. of people. No generic sociological implications are drawn from
the research.
- It is not the purpose to draw sociological implications on the dynamics of
organizations from the study.
- It is not the purpose of this study to provide better algorithms or designs of virtual
workspaces.
The purpose of the study is to understand the object, a virtual workspace, and
its relations to the social setting of use. The research provided here is intended for other IS
researchers as well as practitioners (not primarily for sociologists or humanists in general),
and its generalizations are made in order to improve the understanding, implementation and
design of other IT systems for communication in other organizations.
As the field of IS is interdisciplinary and combines research traditions from natural
science, humanities, and sociology, the choice of method is not at all self-evident.

Schools of IS research methods


Research traditions in the IS field have been distinguished according to their underlying
epistemology by Orlikowski and Baroudi (1991) into positivist, interpretivist and critical
research. These are of course not specific to the IS community, but are distinctions used to
characterize research in general. The distinctions are widely used in the IS field to distinguish
between research traditions and to label research and researchers (“oh, he’s a
real positivist”). While the terms positivist and interpretive originally refer to
epistemologies, in IS they refer rather to approaches to research. They are named after their
underlying epistemology, but carry with them different theoretical frameworks,
methodologies, data analysis methods, and data collection methods.
Orlikowski and Baroudi characterize positivist research as studies which
“…are premised on the existence of a priori fixed relationships within phenomena which
are typically investigated with structured instrumentation. Such studies serve primarily to test
theory in an attempt to increase predictive understanding of phenomena.” Orlikowski and
Baroudi (1991) p. 5
The theory of task-technology fit (Zigurs and Buckland (1998)) or media richness theory
(Daft and Lengel (1986)) are examples of theories which are characterized as positivist.
Empirical research based on a positivist epistemology can, in principle, use both qualitative
and quantitative methods, but positivist IS research is associated with quantitative methods.
Quantitative research methods investigate the relationship between one or a few dependent
variables and a number of independent variables. Examples mentioned previously include the
experimental research performed in the tradition of GSS (see Fjermestad and Hiltz (1998-
1999) for an overview). Gallupe, Desanctis et al. (1988) provide a specific example in which
measures of decision quality and individual perceptions are dependent variables, and the
difficulty of the decision task and whether the process is supported by a GDSS are
independent variables. This yields four experimental conditions (high vs. low difficulty *
GDSS support vs. no GDSS support).
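The way two independent variables with two levels each yield four experimental conditions can be illustrated with a short sketch. The factor levels are taken from the description above; the variable names are illustrative assumptions and do not reproduce the design of the cited experiment in any further detail.

```python
from itertools import product

# Factor levels as named in the text; variable names are illustrative only.
task_difficulty = ["high difficulty", "low difficulty"]
gdss_support = ["GDSS support", "no GDSS support"]

# Crossing the two factors gives every combination of levels.
conditions = list(product(task_difficulty, gdss_support))

for difficulty, support in conditions:
    print(f"{difficulty} / {support}")
```

Each of the resulting combinations is one experimental condition, in which the dependent variables (decision quality and individual perceptions) would be measured.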
Orlikowski and Baroudi provide the following characterization of interpretive research:
Interpretive research “… assume that people create and associate their own subjective and
intersubjective meanings as they interact with the world around them. Interpretive
researchers thus attempt to understand phenomena through accessing the meanings that
participants assign to them.” p. 5
While the positivist research approach involves quantitative methods, the interpretive
research approach is associated with qualitative methods taken from sociology, ethnography
and other related fields.
The positivist research approach was the first and dominant approach in IS research.
Interpretive and critical research have emerged as rival approaches. Very illustrative of this is
the abstract of Benbasat, Goldstein et al. (1987). "The article defines and discusses one of
these qualitative methods - the case research strategy. Suggestions are provided for
researchers who wish to undertake research employing this approach."

The research underlying the theory of genres of organizational communication presented
earlier would be characterized as interpretive research, since it draws on theory and methods
from sociology and text analysis.
Critical studies “…aim to critique status quo, through the exposure of what are believed
to be deep-seated, structural contradictions within social systems, and thereby to transform
these alienating and restrictive social conditions.” p. 6
The critical approach is based on the tradition of critical social theory. The difference
between the three approaches can be studied in the critique of media richness theory (Daft and
Macintosh (1981), Daft and Lengel (1986)). Markus (1994) uses first a positivist approach to
show that MRT does not apply to e-mail. Then she develops alternative explanations of the
use of e-mail based on social factors using an interpretivist approach. Ngwenyama and Lee
(1997) offer an illustration of how a critical approach, based on Habermas' theory of
communicative actions (Habermas (1981)) can offer an alternative explanation of what
communication richness can mean in e-mail.
The distinction between positivist, interpretive, and critical approaches to research in IS
is a problematic distinction. The problem is that what is characterized as interpretive research
in this distinction covers a number of rather different theories and methods from sociology
and ethnography. The distinction reflects a historical development of IS research where
positivist approaches have dominated, and interpretive and critical approaches have emerged
as rival approaches.
For the further discussion on empirical research methods it will be useful to distinguish
between different levels of characterizations. Jensen (2002) identifies six levels relevant to the
conduct of empirical research in general. These can serve as our guide in addressing the
different considerations related to research method.

In terms of placing the present case study within the three traditions, the research
question rules out that this be a piece of critical research. It is not the aim of this case study to
shed light on underlying contradictions and alienating social conditions.
As for the remaining two research traditions, it is the intention of this work to try to
combine interpretive and positivist research traditions. I have not in my research proposed
specific propositions or hypotheses that I wished to test, and in that sense neither the
theoretical framework nor the methodology is taken from the positivist tradition. On the
level of data analysis methods and data collection methods I have however used methods
from the positivist research tradition.
The study presented here is based on the theoretical framework presented earlier, which
in turn is based on two interrelated perspectives for understanding computer media.
Technology can on the one hand be described as an artefact with certain properties. On the
other hand it can be described as something embedded in a social practice referred to as
technology-in-practice. For understanding technology-in-practice, the theory of genres of
communication is used. This theory carries with it a certain methodology, as well as data
analysis methods and data collection methods. The empirical research performed using genre
theory uses the analysis of texts such as e-mails or documents in a specific groupware
application, but the purpose of this study is to explore new methods for collecting and
analyzing data. Rather than basing the research on a pre-packaged research approach
described in the six levels presented by Jensen (2002), part of the purpose of the study is to
explore how log files and log analysis as data collection and data analysis methods can be
combined with interview and survey data. Some of the results presented will be based on the
theoretical framework of genre theory, but some of them will be characterized by a focus on
exploring the value of log analysis for field study research of computer media use.

The case study or field study approach

The case study approach is used in IS research alongside the more traditional approaches
of controlled experiments or surveys. Next to surveys and controlled experiments, it is the
most widely used research design in IS (Orlikowski and Baroudi (1991)). The case study
strategy is described in e.g. Benbasat, Goldstein et al. (1987), Yin (1994), Walsham (1995).
The concept of field studies has also been used in IS (Klein and Myers (1999)), and there
seems to be no general agreement on the distinction between case and field study, other than
as an indicator of the research background of the researcher.
The main characteristic of a case or field study in IS is, as in most other disciplines, that
phenomena are studied in their natural contexts (as opposed to experimental laboratories) and
that it does not use inferential statistics in the process of generalizing observations.

Yin (1994) p. 13 defines a case study as "an empirical inquiry that investigates a
contemporary phenomenon within its real-life context, especially when the boundaries
between phenomenon and context are not clearly evident." According to this broad definition,
the study reported in this thesis is an example of a case study.
Yin (1994) identifies six sources of evidence or data relevant to performing case studies.
These are presented in the following table with a description of the types of data used in this
study.
Source of evidence        Data used in this study

Documentation             Standard Operating Procedures for using Lotus Quickplace
                          and the Intranet have been studied.
Archival Records          Applications for opening a Lotus Quickplace gathered since
                          the introduction of the technology in the organization.
Interviews                Interviews with managers of Quickplaces and with the people
                          responsible for the introduction of the technology have
                          been conducted.
Direct Observations       HTTP-log files could be characterized as a kind of direct
                          observations.
Participant-observation
Physical artefacts        The Lotus Quickplace technology has been analyzed both in
                          terms of its functionality for the user, and for
                          understanding the relationship between what a user does
                          and how this is represented in the log file.

HTTP-log files are characterized as a special type of direct observations, which combine
certain characteristics of archival records and direct observations. Archival records are
characterized by Yin (1994) p. 80 as:
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• exact - contains exact names and details of an event
• broad coverage - long span of time, many events and many settings
• precise and quantitative
All of these properties are also properties of log files. They differ, however, in a very
important way from archival records in that they are not produced intentionally by members
of the organization studied. The information present in the log files is a combined product of
both the de-facto standard HTTP-log format and of the technical design of the Lotus
Quickplace technology. In this sense log files are very different from an archive of all
applications for using Lotus Quickplace, which have been sent to the technical manager of the
Lotus Quickplace server.
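To make concrete what kind of information such a log entry holds, the sketch below parses a single line written in the NCSA Common Log Format, a widely used web server log layout. That the server at Beta logged in exactly this layout is an assumption made for the sake of illustration, and the sample line, user name, and URL path are invented; they are not taken from the case study or from actual Lotus Quickplace URLs.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_clf_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means that no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Invented sample line; the path is hypothetical.
sample = ('192.0.2.1 - user1 [15/Jan/2003:10:15:32 +0100] '
          '"GET /workspace/page1 HTTP/1.0" 200 2326')
entry = parse_clf_line(sample)
print(entry["user"], entry["time"], entry["request"], entry["status"])
```

The fields recovered are a host, a user name, a timestamp, the requested resource, and a status code. Nothing in the line records why the request was made, which is the sense in which the entries are not produced intentionally by members of the organization.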
In this respect it might be better to characterize the HTTP-log as a type of direct
observation. Yin characterizes direct observations as:
• reality - covers events in real time
• contextual - covers context of the event
Clearly HTTP-log files only capture a very limited aspect of events, and they do not
capture what Yin calls the context of the event. Comparing them to Yin's typology, log files
represent a new type of data for case studies and the study presented here is partly devoted to
exploring their qualities as a data source for case or field studies.

On Generalizability
Before we get down to the discussion of data analysis methods and data collection
methods, a little more space will be devoted to the issue of generalizability. Discussing
generalizability is a useful approach to discussing the issues of combining research traditions.
Generalizations made from a study constitute the results of the study; furthermore, the way
the generalizations are made determines whether the study is judged valid or proper in a research tradition. Therefore
we shall discuss some general (!) issues of generalizing, as a background for the
generalizations made from the present case study.
A central question that has consequences both for the design of an empirical study and
the results drawn from the study is the question of generalizability. First of all, empirical
research must show results or insights that in some way are general beyond the empirical
setting. This means that it must be generalizable beyond the organization in which the case
study has been undertaken or beyond the specific technology used.
The most common way of thinking about generalizability is the statistical generalization
from a sample to a population. The methods of statistical inference are used to assess whether
a characteristic of a sample (e.g. that 30% of males between 20 – 34 watch football against
only 5% of females over 50 found in a sample of 2000 Danes) can be generalized to the
whole population (the Danish population). One of the basic requirements for a valid
generalization from sample to population is that the sample is chosen randomly.
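A minimal sketch of such an inference is the normal-approximation confidence interval for a sample proportion. The 30% figure is the one used in the example above; the subgroup size of 300 is a purely hypothetical assumption, since the example only states the total sample size (2000), and the function name is introduced here for illustration.

```python
import math

def proportion_confidence_interval(p_hat, n, z=1.96):
    """Normal-approximation interval for a proportion from a random sample."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin

# 0.30 is the proportion from the example; n = 300 is a hypothetical
# subgroup size, not a figure given in the text.
low, high = proportion_confidence_interval(0.30, 300)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

The interval is only meaningful under the requirement just stated, namely that the sample has been drawn randomly from the population.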
This kind of generalization is not the only one relevant to research. In qualitative
research, the generalization from sample to population and the methods of inferential statistics
do not apply.

Yin (1994) distinguishes level-one generalizations from level-two generalizations.
Level-one generalizations are either generalizations from sample to population as in
statistical generalizations or from subjects to experimental findings in a closed experiment
with dependent and independent variables. As Lee and Baskerville (2001) note, the results of
these generalizations are themselves empirical statements. They are, for example, statements
about a population or a case description of an organization.
Level two generalizations are generalizations from empirical statements to theoretical
statements. These generalizations are statements that suggest the relevance of some
characteristic of the empirical setting beyond that setting. In my study the level-two
generalizations are generalizations from the specific Lotus Quickplace technology to other
technologies, or from the specific organization to other organizations. Another level-two
generalization would be to generalize the finding that Danish men watch more football than
Danish women, to a theory of football-viewing patterns of men and women in general.

Level two generalizations in my study:


When considering level-two generalizations in my study, there are at least two different
overall generalizations, which are relevant to distinguish and deal with separately.
1. Generalizations of the technology from Lotus Quickplace to virtual workspaces to …
2. Generalizations of the organizational setting to other Nordic corporations, to other
financial services companies or to other types of organizations.
Actually, our hypothesis is that certain relations between the technology and the genres
of communication are generalizable. We can, however, only generalize through substitution
of the two elements in the relation: technology and genres of communication.

Let's take an example. At some point in this thesis I will conclude that, based on the case
study, "end-user design is essential for establishing use of Lotus Quickplace at Beta".
Generalizations under 1.) concern properties of the Lotus Quickplace
technology. If the technology has properties that we can find in other technologies, and if
these properties are essential for the relation of the technology to end-user design, then we
would have something that is generalizable to these other technologies. Following from that,
we would expect that exchanging Lotus Quickplace for another technology X with the same
essential properties would produce the conclusion: "end-user design is essential for
establishing the use of technology X at Beta".
Generalizations under 2.) concern the social structures of the organization.
Rather than looking for properties of the technology, we would be looking for social
structures of Beta that firstly can be found in other organizations, and secondly are essential
for its relation to end-user design. By combining the two, we are striving for a generalization
of our conclusion as stating something like "end-user design is essential for establishing the
use of technology X in organization Y."
The generalizations both on technology and genres of communication are based on
specific properties. If we generalize from Beta to other organizational settings, we do so
because of some specific properties of the genres of communication at Beta. The type-two
generalizations such as the example above will therefore essentially be a discussion of which
properties of both technology and genres of communication (which we have observed in the
case study) are the essential ones. That is to say, the essential ones for the relationship
between technology and social structures stated in the generalized conclusion.
The type-two generalizations made in a thesis are usually done at the end to allow the
reader to assess whether he agrees on the generalizations or not, so we shall leave them for
now and turn to the issues of how to combine research approaches and present the design of
the present research. This includes the practicalities of combining data collection methods and
data analysis methods in a case study.

Combining Research methods


The present study is based on two assumptions:
1. Using more than one method of research can give a richer understanding of the
issue at hand.
2. Combining more methods in the same case study makes sense despite their
diverse approaches and implicitly different views of the world.

The advantages and challenges of combining research methods have been discussed by a
number of authors (see e.g. Kaplan and Duchon (1988), Lee (1991), Mingers (2001),
Jensen (2002)). John Mingers argues that:
“Different methods generate information about different aspects of the world. The
information is used to construct theories about the world, which in turn condition our
experience of the world. It is both desirable and feasible to combine together different
research methods to gain richer and more reliable research results.” Mingers (2001) p. 243
It sounds both true and important to try to combine research methods. The difficult
question that follows is how this combination can be exercised in practice. The paper of
Lynne Markus (1994) - which we have discussed earlier - is an empirical example of
combining quantitative and qualitative methods. The main purpose of her paper is to
challenge media richness theory and propose better alternatives based on social construction.
In her study, Markus uses a survey based on statistical sampling to test the propositions from
media richness theory. In parallel, she uses text analysis of e-mails and interviews, which
were then analyzed interpretively as a basis for proposing better alternative explanations for
e-mail use.
“…a better explanation for e-mail use patterns at HCP [ the case study organization ]
may be found in views shared by most HCP managers about what various media were good
for – social definitions of media appropriateness that do not necessarily reflect the material
characteristics of the technology, such as its objectively-defined or individually-perceived
degree of information richness.” Markus (1994) p 519
Her approach to the combination of research methods is to divide her research question
into two parts. The first to test the propositions of information richness theory; the second to
propose better alternative explanations of e-mail use. Together the combination creates a
much stronger argument than researching the two parts in two separate case studies. If the
organizations had differed, one might have argued that organizational differences
undiscovered by the case studies could disturb the result.
Markus (1994) combines quantitative and qualitative methods in a more specific way
than described above. When she combines the analysis of e-mail archives with interviews, she
engages in what is called triangulation. The term triangulation is taken from trigonometry.
Webster's Dictionary defines it as:
“any similar trigonometric operation for finding a position or location by means of
bearings from two fixed points a known distance apart” (Webster's Dictionary)
Triangulation is originally the technique for determining the position in two-dimensional
space by means of two measurement points and the geometry of triangles. Triangulation is
used metaphorically in multi-method research approaches, to describe the study of a
phenomenon from different angles and thereby gain a better view of its character. In Jensen
(2002) p. 272 it is described as "a general strategy for gaining several perspectives of the
same phenomenon."
Norman K. Denzin (Denzin (1989), Denzin and Lincoln (2000)) is one of the developers
of triangulation as a strategy for integrating multiple perspectives on the same phenomenon.
He distinguishes four basic types of triangulation (reprinted from Denzin and Lincoln
(2000)):
1. Data triangulation: the use of a variety of data sources in a study
2. Investigator triangulation: the use of several different researchers or evaluators
3. Theory triangulation: the use of multiple perspectives to interpret a single set of data
4. Methodological triangulation: the use of multiple methods to study a single problem

Data triangulation is, as Yin (1994) notes, a typical characteristic of a case study. In my
study, interviews, log file analysis, a survey, and documentation are combined. Investigator
triangulation has also been used, both for performing interviews and in the early stages of
interview and survey data analysis. In the results presented in my thesis, investigator
triangulation is primarily observable in the acknowledgements. It is not dealt with
explicitly except in the collection of the data.
The present case study uses triangulation in a more specific sense to understand the
relationship between genres of communication and the technology in the usage situation.
When we identify genres of communication, interviews and log file analysis are triangulated
to analyze individual instantiations of a genre. This is an example of methodological
triangulation.

Before I proceed to the presentation of the research design, which will allow us to
discuss the issues raised here much more concretely, I will present two hypothetical studies.
This will clarify in which sense I combine qualitative and quantitative research methods.

Two hypothetical studies of Lotus Quickplace.


The two hypothetical studies of Lotus Quickplace are genuine positivist research
designs. The first is an experimental study, mimicking the large body of studies produced in
the GDSS (Group Decision Support Systems) tradition (see Fjermestad and Hiltz (1998-1999)
for an overview and assessment of the results). The second is based on a survey, which was,
by 1991, the most common research method, accounting for approximately 50% of the
empirical research surveyed in Orlikowski and Baroudi (1991).

The experimental study would investigate the effect of a new design of the document
views in Lotus Quickplace on the time spent to find a specific document. Document views are
the lists of documents in a repository, which are used to select the document you wish to read.
Our hypothesis would be that design Y shortens the time spent on finding a document
compared to the traditional design X. The independent variable would then be the design of
the document views and the dependent variable would be search time. We would then design
an experiment where a number of randomly selected people would solve a pre-defined task of
finding certain documents. A control group given the old design X would solve the same
tasks. The results would then be analyzed to see whether there was any significant correlation
between the independent and the dependent variable. (See e.g. Smith, Cadiz et al. (2000) for a
similar research design).
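The analysis step of such an experiment can be sketched as follows. This is only a minimal illustration, assuming that the two groups are compared with Welch's t-statistic; the text above does not name a specific test, and the function name `welch_t` and all search times are invented for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t-statistic for two independent samples."""
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = variance(sample_a), variance(sample_b)
    # Standard error of the difference between the two group means.
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical search times in seconds (invented values, not data from the study).
design_x_times = [41.0, 38.5, 45.2, 50.1, 39.9, 44.3]  # control group, old design X
design_y_times = [35.2, 33.8, 40.1, 36.7, 31.9, 37.4]  # group given the new design Y

t_statistic = welch_t(design_x_times, design_y_times)
print(f"Mean X: {mean(design_x_times):.1f} s, mean Y: {mean(design_y_times):.1f} s")
print(f"Welch t-statistic: {t_statistic:.2f}")
```

A positive value means that the group with design X was slower on average. Deciding whether the difference is significant would additionally require the degrees of freedom and a t-distribution, which the sketch leaves out.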
The statistical study would investigate the ideal size for the group of users using a
Quickplace. The hypothesis to be tested would be that the size of the group is significant for
the successful use of a Quickplace. The independent variable would be the size of the group,
and the dependent variable would be successful use. One would then randomly select a
sample of Quickplaces. This study could through log analysis capture the independent
variable by counting the number of unique users present in the log files (or by some other
means discovering the number of users). The dependent variable could be captured by a
questionnaire of a sample of users, to establish whether the Quickplace is a success or not.
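Capturing the independent variable from the log could look roughly like the sketch below. It assumes that each log line has already been reduced to a pair of Quickplace name and user name, and that "-" marks an unauthenticated request; both the data layout and the function name `group_sizes` are assumptions made for illustration.

```python
from collections import defaultdict


def group_sizes(events):
    """Count the unique users seen in each Quickplace.

    `events` is an iterable of (quickplace, username) pairs.
    """
    users_by_place = defaultdict(set)
    for place, user in events:
        # Skip requests without an authenticated user name.
        if user and user != "-":
            # Treat user names as case-insensitive.
            users_by_place[place].add(user.lower())
    return {place: len(users) for place, users in users_by_place.items()}


# Hypothetical log extract (invented pairs, not data from the study).
events = [
    ("gic", "linda"),
    ("gic", "jens"),
    ("gic", "Linda"),
    ("ic", "lone"),
    ("ic", "-"),
]
sizes = group_sizes(events)
print(sizes)
```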
It is well known in IS that both the implementation process and the organization in
which the technology is introduced are important factors in determining the success of a
technology. This could be controlled for by choosing a sample of users that all had a similar
implementation process, or by documenting the implementation process through, for example, the
questionnaire and having the sample reflect different types of implementation processes.
The two hypothetical studies exemplify research designs which would appear to be
possible to pursue in the case study presented here. However, there are two reasons why this
is not the case:
1. They would not produce answers to the questions I am interested in asking.
2. They would have required a more conscious top-down research design than has actually
been the case.
Let us look at how the case study actually was designed (or emerged).

The research design


The case study reported in this thesis was conducted partly as a collaborative effort
between Jesper Simonsen, Keld Bødker, Jens Kaaber Pors from Roskilde University and
myself, and partly by myself alone. It was performed as part of the DIWA research program.
DIWA is an acronym for Design and use of Interactive Web Applications, and the program
performed empirical studies of both design and use processes of interactive web applications.
Our study was classified as a study of use. We worked with partly individual research
questions, and this has affected, in particular, the way in which we conducted interviews.
Besides the case study at Beta, the present thesis also contains a comparative analysis of
technologies similar to Lotus Quickplace (virtual workspaces). The comparative analysis is
included in the thesis because it strengthens some of the conclusions drawn from the case
study. As discussed earlier, technology can partly be conceived as an artefact and partly as
part of a technology-in-practice. In the following section the technology will be analyzed as
an artefact. I will devote most of my attention to the methodological issues of the case study, since
it forms the primary data source of the thesis.
The case study has been primarily based on three sources of data:
1. Interviews with selected managers of the Lotus Quickplaces and the people
responsible for introducing and managing the technology.
2. HTTP-log files from the Lotus Quickplace server
3. A survey among managers of the Lotus Quickplaces
The numbering of the data sources does not rate them according to importance; rather,
it reflects a time sequence.

[Figure: timeline showing the use case study and the Quickplace study (interviews, log analysis, pilot and survey). Dates marked on the axis: 1/6-2000, 1/10, 1/4, 15/5-2001, 8/11, 23/11, 10/12, 18/2-2002, 10/5.]

Timeline of the case study

The results of the first part of the study entitled “Use Case study” are not reported in this
thesis. This study analyzed the use of “use cases” in an IT-development project. Use cases
(see Fowler and Scott (1997)) are a specific genre of documents used in various phases of the
development of an IT-system. Some of the results of the study are reported in Bøving (2001).
The study of the use of "use case" documents and technologies supporting their use could have
been relevant to this thesis. The problem is that the technology used in the IT-development
project for exchanging "use case" documents was a custom groupware system based on Lotus
Notes. Mixing two studies based on two different technologies would blur the conclusions
presented here.
In spite of this, I mention the use case study because insights into and interests in the
workings of the organization as well as document genres from this study have been used as
input for the Lotus Quickplace (QP) study.
The research design of our case study was not finalized before starting the study. From
the outset we planned and agreed with our Beta contact person that we would perform
approximately 10 interviews and obtain log files from the Lotus Quickplace server for
analysis. As the analysis of the log files proceeded I had the idea that a survey might provide
a useful insight into what the Quickplaces were used for. The log files do not reveal any
purpose of the activity observed, nor do they link the activity to concepts such as work, teams
or groups. Despite the fact that log files were a planned part of the empirical data from the
outset, the analyses of the log files have developed significantly over time. At the start, I did
not have a clear idea of what information the log files could provide and which specific
analyses would be useful.
Beta has been a partner of the DIWA project since the launching of the project in 1999.
The partnership includes a study of the Intranet in the Danish part of the organization, the use case study
mentioned above, and a study of a unit responsible for supporting the organization with
methods for managing projects and re-using knowledge across projects.
Preceding my engagement as a Ph.D. researcher in the DIWA project, I worked as a
management consultant for IBM Global Services. There I was involved in a project spanning
several phases and approximately one and a half years of planning and construction of the Intranet for
the Danish part of the organization. This work has given me much insight into the processes
of this organization, and specifically into how IT is managed and projects are done. It has also
provided experience with interacting with this organization, which has eased the process of
getting access to information etc.
The experiences gained from my engagement as consultant have provided background
insights beneficial to the section on the general introduction to Beta and Lotus Quickplace.
Some of the knowledge on how IT is managed and implemented stems from the interviews
conducted as part of the case study, but some of it (such as my understanding of Standard
Operating Procedures and how they are used) stems from my experience as a consultant. This
blend of roles in my relationship with Beta could be seen as a potential methodological
problem. When we conducted the interviews, I knew some of the interviewees from my
engagement as a consultant and therefore had a special relationship to them. One could argue
that this would influence their answers. I feel confident that I have taken precautions to avoid
this in my study. The reader is asked to make up his own mind as I seek to lay open the
source of the descriptions I provide and the conclusions I draw.

The interviews
The interviews conducted in collaboration by Jesper Simonsen, Keld Bødker, Jens
Kaaber Pors and myself constitute the first data collection from the case. The purpose of the
interviews was partly to investigate how Quickplace was implemented and used in general,
and partly to investigate a specific Quickplace called GIC, which was used by the "Group
International Communication" department. Our primary contact at Beta was employed in this
department, which was also the department responsible for introducing the Lotus Quickplace
technology in the organization.
For the planning of the interviews, we used a number of tools which I had previously
used as a management consultant. We used the tools primarily because we had
differing research questions. An obvious problem for a researcher collaborating with
researchers with differing interests in the interviews is that you only get answers to your
questions in the interviews you perform yourself, unless you do something about it. We used
a data-gathering matrix (see appendix 1). This is a tool for planning the gathering of data used
in the issue-based consulting methodology, which is used widely in consulting companies
(e.g. IBM). The issue-based approach is, for example, used by Kunz and Rittel (1970) to build
an issue-based information system, IBIS. The data-gathering matrix is a document that is
created collaboratively between the participants. The idea is that you can distribute the data
collection among the participants based on a shared understanding. The rows of the matrix
constitute issues (or research questions), hypotheses and key questions the answering of
which should confirm or disconfirm the hypotheses. The columns of the matrix contain data
sources. In our case these were interviews, log files, survey, and QP observations/document
analysis.
The data gathering matrix was used to create an “interview matrix” (see appendix 2).
The rows of the interview matrix contained the key questions checked in the data-gathering
matrix as relevant for the interviews. Rather than having data sources as columns in the
matrix, each interviewee was given a column. We scanned the interview matrix together, and
checked for each key question the person we thought could answer the question. The selection
of interviewee for each question was based on our knowledge of the interviewee gained from
our primary contact at Beta, and based on the fact that we had about one hour for each
interview. The interview guides (see appendix 3 for an example) were then generated
automatically from the interview matrix, followed by adjustments since some of the
interviews ended up having too many questions.
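The step from interview matrix to interview guides can be sketched as below. This is an illustration only: it assumes the matrix is stored as a mapping from key question to the interviewees checked for it, and the questions, the function names and the question limit are invented, not taken from the appendices.

```python
def build_interview_guides(interview_matrix):
    """Turn a question-by-interviewee matrix into one question list per interviewee."""
    guides = {}
    for question, interviewees in interview_matrix.items():
        for name in interviewees:
            guides.setdefault(name, []).append(question)
    return guides


def overloaded(guides, max_questions):
    """Return the interviewees whose guide holds more questions than the limit."""
    return sorted(
        name for name, questions in guides.items() if len(questions) > max_questions
    )


# Hypothetical matrix: each key question lists the interviewees checked for it.
interview_matrix = {
    "How was Quickplace introduced in the organization?": ["Linda", "Richard"],
    "What is the GIC Quickplace used for?": ["Linda", "Jens"],
    "Which other media are used alongside Quickplace?": ["Jens"],
}

guides = build_interview_guides(interview_matrix)
for name, questions in guides.items():
    print(name, questions)

# Guides flagged for manual adjustment, given a hypothetical limit of one question.
print(overloaded(guides, max_questions=1))
```

Flagging the guides that exceed a limit corresponds to the manual adjustment described above, where some interviews ended up with too many questions for a one-hour session.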

The data gathering matrix and the interview matrix enabled us to create interview guides
which consisted of a collection of questions which would answer (at least partially) all of our
individual research questions. The rigid character of the data-gathering matrix had the effect
that we only used it for planning the interviews and not for guiding the analysis of the data.
None of us had a clear enough picture of what the survey and the log file analysis should be
used for at that point in time. However, it ensured that we were able to plan an interview
process where different research questions could be investigated using the same data.

Selecting the interviewees


The interviewees were not selected at any one point in time or on the basis of one
specific principle. I actually conducted one of the interviews (with the manager of the
NP_Solo-ID QP) two months before the interview planning. I made contact with the
interviewee because he was part of the project I studied in the use case study. An interview
with our primary contact, who was both involved in the implementation of Lotus Quickplace
and manager of the GIC QP, was also conducted before the planning process.
The selection of the interviewees was done in collaboration with our primary contact.
They were selected partly on the basis of the data gathering matrix, and partly
because our primary contact wanted us to investigate the QP in which the contact
was involved. Therefore all interviews conducted were with users from the GIC Quickplace.
It turned out that one of the interviewees was the manager of International_communications, a
third Quickplace, and the interview was actually conducted with her playing the role of
manager of International_communications rather than as a member of GIC.
We had planned to interview 9 people and ended up completing interviews with 7
people. We did not manage to schedule interviews with the remaining two because they
said that they lacked the time to complete the interview.
The following table lists the interviewees (under pseudonyms).

Name      Role
Linda     Involved in the implementation of Lotus Quickplace; manager of the GIC Quickplace
Lone      Manager of the International_Communications Quickplace; member of the GIC Quickplace
Richard   Administrator of the Lotus Quickplace server (he also delivered the log files)
Phillip   Manager of the GIC Quickplace
Birgit    Member of both GIC and International_Communications
Jens      Member of GIC Quickplace
Ernst     Manager of the NP_Solo-ID Quickplace

In my thesis I have selected three Quickplaces as exemplars, which I study in more
detail than the rest. These three are GIC, NP_Solo-ID, and
International_Communications (IC). They were selected because of the interview data, which
describe the purpose of the Quickplaces and give accounts of specific genres of
communication.
As noted previously, the interviews were conducted as the process of logging data on the
use of the Lotus Quickplace server was started. We did not have any results from the log
analysis before the interview process had ended.

Besides the interviews, the log files have been the most important source of data for this
study. While interviewing is a well-proven technique in case and field studies, the use of log
files as data is not. Therefore I will devote quite extensive space to the discussion of log file
analysis.

Log file analysis


Most IT systems use log files. The purpose of log files is generally to trace the history of
some application. The concept of logging predates IT systems and is known, for example, from
maintaining a ship’s log. The advantage of IT-systems is that they can maintain their own
log. Logging the activity of an IT-system generally means writing a trace of what the system
has done to a file as a part of the execution. Logging is known as tracing in application design
and as transactional logging in transaction monitors and database systems. In application
design the purpose of the tracing is primarily to analyze the program execution in order to
optimize the algorithm. In transactional logging of databases (e.g. DB2, Oracle, or MySQL)
and transaction monitors (e.g. CICS from IBM inc. or Tuxedo from BEA inc.) the purpose is
to maintain consistency and to be able to recover from system breakdowns without loss of
information. This is a high priority in settings where IT-systems are used to automate
transactions such as maintaining a bank account.
While less common in single-user systems, logs are maintained or at least offered as a
possibility in most server-applications. This is also the case for the HTTP-server, perhaps
more commonly known as a web server. The HTTP-server is one of the cornerstones of the
www. Its job is to serve .html pages and other files according to the specifications of the
Hyper Text Transfer Protocol (see RFC1945 for HTTP 1.0 and RFC2068 for HTTP 1.1). It
serves requests from a browser or some other HTTP-compliant client.
All HTTP-servers (see www.netcraft.com for a list of web servers and surveys of the
most popular ones) have the built-in possibility of maintaining an HTTP-log. The HTTP-log
is not specified in a standard, but two formats have emerged as de-facto standards.
The fact that the logging format follows a de-facto standard makes it a lot cheaper to
analyze the log files. The reason is that it is possible to build general software that can
analyze log files across different sites and different HTTP-server implementations. This fact
was one of the reasons for using the HTTP-log of the Lotus Quickplace as data for the case
study.
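General software of this kind typically starts by parsing each log line into fields. The sketch below assumes that the log follows the NCSA Common Log Format, a widely used web server log format; the text above does not state which format the Quickplace server wrote. The sample line, including host, user name and path, is invented for illustration.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)(?: (?P<protocol>[^"]+))?" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Parse one Common Log Format line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Timestamps look like 23/Nov/2001:10:15:32 +0100.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # A dash in the size field means that no body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


# Hypothetical log line (invented host, user and path).
sample = (
    '10.0.0.5 - linda [23/Nov/2001:10:15:32 +0100] '
    '"GET /QuickPlace/gic/Main.nsf HTTP/1.0" 200 5120'
)
entry = parse_clf_line(sample)
print(entry["user"], entry["time"].isoformat(), entry["path"], entry["status"])
```

Because the parser depends only on the log format, the same function could be applied to logs from other sites or server implementations that write the same format.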
In the tradition of IS and the study of IT systems, log files can be used as a means of
observing how an IT-system is actually used. When one studies the use of an IT-system one
can either study users' accounts of how they use it, or actually try to observe the users when
they use it. Log file analysis has only been used in experiments in laboratory settings and not
as a part of field studies. The reason for this can only be speculated upon. One possible reason is
that the standards-based Internet has put more focus on the analysis of use-patterns. It
might also be due to the fact that log files are problematic for observing use for a number of
reasons:
1. They show very few aspects of a user's activities. Two identical lines in a log file
might therefore document use processes that would differ significantly in a direct
observation of use.
2. Typically log files are not designed by the researchers. The creators of the software
typically design them with another purpose in mind than a study of use patterns.
Therefore the log files often do not contain the information required by a researcher.
This problem can be solved in very controlled settings such as in experimental research, where
the IT-system is custom designed so that it logs the wanted information.
For real life studies of IT use, using log files as a data source is not common. I have been
unable to track down a case study or other empirical real-life studies in IS using log files as a
data source. Therefore the use of log files in this thesis must be considered explorative. As we
shall see when the results of the log file analysis are presented, log files might serve as a
mediator between two kinds of accounts: qualitative accounts of usage based on observations of use, which suffer
from a lack of any indication of their generality, and surveys, which give a general picture of
usage that is not related to how systems actually get used.
While the log files have serious shortcomings, as we shall see in more detail later on,
they still seem to be an interesting solution to the problem of observing use of Lotus
Quickplace. Generally, systems for distributed collaboration have some built-in challenges for
the case study researcher:
1. The users are distributed geographically.
2. The usage is distributed in time.
The problem with the geographical distribution of users is that they are difficult for a
researcher to observe. If the users are few in number and in predictable
locations it might be possible to observe usage, but if there are many it soon becomes
infeasible. In addition, web-based systems including Lotus Quickplace can be used from
different client machines. It is therefore unpredictable in which locations the usage takes
place. Finally, the usage is distributed over a timescale of up to weeks and months.

[Figure: diagram of usage events plotted by location (Location A, Location B, Location C) against time (T1, T2, T2).]

Through the ID of a document on which a user performs some action, and a username in
the log file, we can link together events that are spread organizationally and temporally,
which would otherwise have been difficult for a researcher to discover.
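The linking described here can be sketched as grouping log events by document ID and ordering them in time. The sketch assumes that each event has already been extracted into a record with a timestamp, a user name, an action and a document ID; the field names, the function name `document_histories` and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def document_histories(events):
    """Group events by document ID and sort each group chronologically."""
    histories = defaultdict(list)
    for event in events:
        histories[event["doc_id"]].append(event)
    for history in histories.values():
        history.sort(key=lambda e: e["time"])
    return dict(histories)


# Hypothetical events (invented users, dates and document IDs).
events = [
    {"time": datetime(2001, 11, 8, 9, 0), "user": "linda", "action": "create", "doc_id": "D1"},
    {"time": datetime(2001, 11, 23, 14, 30), "user": "jens", "action": "read", "doc_id": "D1"},
    {"time": datetime(2001, 11, 9, 11, 0), "user": "lone", "action": "create", "doc_id": "D2"},
    {"time": datetime(2001, 11, 12, 8, 15), "user": "birgit", "action": "read", "doc_id": "D1"},
]

histories = document_histories(events)
d1_users = [event["user"] for event in histories["D1"]]
for event in histories["D1"]:
    print(event["time"].date(), event["user"], event["action"])
```

Each history then shows, for one document, who acted on it and when, regardless of where the users were located.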
The qualities and deficiencies of the HTTP-log files as data for a case study might be
summarized as follows:
Indices: They provide perfect information about a very limited aspect of usage. Log
lines are indices of use.
Consistent: They provide this information consistently across time.
Analyzable: The data is very easy to analyze using data mining techniques compared to
richer observations such as videos, tape recordings, researchers' notes, etc.

These features of log files make them useful for studies that span many users and a long
time span.

The survey
Doing a survey was not on the agenda for the joint DIWA group (Jesper, Keld, Jens and
myself). As I started working on the analysis of the log files it became clearer that, without
some account of what the QPs were used for, and how they were used, the information we
could deduce from the log files would be limited. Therefore I planned a survey to answer
some overall questions regarding the use of the QPs, and some more specific questions about
how it was used together with other media, etc.
The only way of doing a survey that would not take up all of my research time was to
use the web: paper-based surveys as well as telephone-based would have taken up too many
resources. Another good reason for this was that, according to our contact at Beta, they had
good experience with doing web-based surveys internally in the organization. I customized a
freeware survey application built on top of Lotus Notes and hosted the survey application on a
computer at the University. This allowed me to customize the look of the application so that it
included a Beta logo and looked familiar to the respondents.
The response rate of the questionnaire was 46% completed, with a further 12%
partially completed. 46% is a good response rate, also compared with the usual rate at Beta, and
12% partially completed questionnaires is not a very high percentage. It is therefore likely
that using a web-based questionnaire, compared to more well-proven methods, did not disturb
the results.
The survey was made in two stages: a pilot stage with 10 respondents and a full stage
with 123. Prior to the pilot, the technical set-up as well as the questions were tested on
selected researchers in the DIWA project. The pilot was produced to test several aspects of
the questionnaire:
- to test the technical set-up of e-mail invitations and links to the survey application.
- to test the wording of the invitation.
- to test the questions and descriptions for errors.
- to test whether we received answers that made sense in relation to what we would
like to know.
The pilot revealed some technical problems with the set-up, and some minor errors in the
wording of the questions. None of the questions were changed significantly, and none were
removed. After implementing the changes prompted by the pilot, we issued invitations to 123
respondents on 23/11 2001. This resulted in 33 answers. On 5/12 2001 we reissued
invitations to the 92 who had not responded. This produced an additional 24 answers, giving
57 completed answers in all. A completed answer does not necessarily imply that all questions
were answered, but that the respondent had clicked through the entire questionnaire.
Typically, though, the respondents answered all questions.
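The response rate follows directly from the figures reported above. The short calculation below uses only the number of invitations and the completed answers from the two rounds; the variable names are chosen for illustration.

```python
# Figures reported in the text.
invited = 123
first_round = 33   # answers after the first invitation
second_round = 24  # additional answers after the reminder

completed = first_round + second_round
response_rate = completed / invited

print(f"{completed} completed answers, response rate {response_rate:.0%}")
```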

Design of the survey


The survey was designed with an overall goal of asking no more than 30 questions.
There are no fixed recommendations as to the length of surveys (Weisberg, Krosnick et al.
(1996)), but 30 questions ensured that the questionnaire would not take more than 10 minutes
to complete. A longer survey generally increases the risk of people not completing the survey.
Surveys can be used for a number of different purposes. Weisberg, Krosnick et al.
(1996) mention three overall groups of purposes:
- Attitudes and Preferences
- Beliefs and Predictions
- Facts and past behavioral experiences.
The purpose of our survey was to have respondents describe facts about QP use and past
behavioural experiences. The goal of the survey was at least two-fold. Firstly, I wanted some
idea of the prevalence of QP usage characteristics. These were to be used both as a
characterization of use and as input for a correlation analysis between the answers in the
survey and patterns in the log file. Secondly, I wanted some descriptions of use that could be
used as material for qualitative interpretations. The second goal was achieved mostly with the
open questions, which, for example, provided a rich body of examples of concrete uses.
The survey was divided into three main parts (see appendix 4 for a reprint of the
complete survey). The first consisted of questions intended to provide an overview of who
was using the QP, both in terms of organizational and geographical placement and in terms of
the kind of group it supported. The second group of questions aimed at finding out how QP
was used in different collaborations. The last part of the questionnaire served to get an idea of
how the users planned and agreed on using the QP, and to what extent they succeeded in
agreeing on its use.
Sampling respondents
Due to practical factors, the only users available as respondents for the survey were users
who had the role of manager in the QP. The manager is allowed to invite users and define
their rights as well as change the structure of the QP through the creation and deletion of
folders and sub-rooms. Users with manager rights account for around 10% of the total
number of users. The questionnaire was not sent to all managers of all Quickplaces. They
were selected in two rounds. First, we selected QPs that had been active in the three-month
period leading up to the questionnaire. This selection was done because I was interested in
accounts of how QP was used: to ask managers of dead QPs about how it was used would
have produced very problematic answers.
Had the goal of the survey been to get users' attitudes and preferences towards the QP
technology, this procedure of selecting only active QPs would have been very problematic. I made
another round of selection, this time of managers of the 77 selected QPs. From interviews I knew that
some managers were merely appointed as QP managers because they were real-life managers.
Some of these managers were not using the QP at all. Since I was interested in accounts of
how it was used, asking non-active managers would therefore produce problematic answers. I
took the list of 77 Quickplaces and the list of managers and made queries in the database to
see which managers were active in the three-month period before the questionnaire was sent
out.
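As an illustration only, a query of this kind could be sketched as follows. This is not the query actually run in the study: the table names `log` and `managers`, their columns, the sample rows and the exact window dates are assumptions made for the example. Only the idea of keeping managers with logged activity in the three-month window comes from the text.

```python
import sqlite3

# Hypothetical schema (an assumption, not the study's database):
# log      - one row per logged request: QP name, user name, ISO date
# managers - one row per (QP, user) pair holding manager rights
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log (qp TEXT, username TEXT, day TEXT)")
conn.execute("CREATE TABLE managers (qp TEXT, username TEXT)")
conn.executemany("INSERT INTO log VALUES (?, ?, ?)", [
    ("projA", "anna", "2001-10-02"),
    ("projA", "bo", "2001-05-14"),    # activity outside the window
    ("projB", "carl", "2001-11-01"),  # active, but not a manager
])
conn.executemany("INSERT INTO managers VALUES (?, ?)", [
    ("projA", "anna"), ("projA", "bo"), ("projB", "dora"),
])


def active_managers(conn, start, end):
    """Return (qp, username) for managers with a logged request in [start, end]."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.qp, m.username
        FROM managers AS m
        JOIN log AS l ON l.qp = m.qp AND l.username = m.username
        WHERE l.day BETWEEN ? AND ?
        ORDER BY m.qp, m.username
        """,
        (start, end),
    )
    return rows.fetchall()


# Illustrative three-month window before the invitations were sent.
result = active_managers(conn, "2001-08-23", "2001-11-22")
```

With the invented rows above, only the manager who has log entries inside the window is retained.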
As this selection process documents, the respondents were not chosen as a random
sample of all users. This was not possible because nobody knew who the users were. There was
information in the log file, but the user name in the QP was not identical to the user's ID in the
e-mail system.

Level-one generalizations
As discussed earlier, level-one generalizations are generalizations used in empirical
research guided by the methods of inferential statistics.
An important distinction in quantitative research is that between reliability and validity
considerations. The concept of validity deals with how well that which is measured actually
measures the desired variable. Reliability concerns how well the sample measured can be
generalized to the population of objects over whom the question is posed. Reliability is
assessed using the methods of inferential statistics.
As to the question of validity, one of the challenges of the survey was how the
respondents were selected. Our practical limitation was that we could only survey managers
of the QPs since we had access to their e-mail addresses. The first challenge was that this
group of users was not representative of the whole group of QP users. If we asked them
questions as users of QP, the reliability of conclusions drawn for all users of QP at Beta
would be poor. Instead they were asked questions as representatives of uses of QPs. The
population is thus uses of QPs rather than users of QP. Because I asked them questions as
representatives of uses of QPs, the reliability of the sample is increased. The question then
becomes whether asking the managers is a valid way of gaining descriptions of uses of
Quickplaces. This problem of validity will be addressed as I report the results, because it
can only be assessed for single questions.
Quantitative methods are typically used to measure associative or causal links between
few variables, whereas qualitative research investigates relationships between more variables
at a time. In qualitative research, relationships are not usually named variables. In the present
case study, where we combine quantitative and qualitative methods, variables can be used as
a way of discussing reliability. All members of a sample will share some characteristics and
differ in other characteristics, called the variables. The characteristics that are shared are
called background variables. The variables are the ones we ask the sample about.
Quantitative methodology and its theoretical framework investigate relationships between
dependent and independent variables. Statistical methods such as regression analysis are then
typically used to analyze the relationship between one dependent and one or more
independent variables.
As we have discussed previously, it makes sense to distinguish between level-one and
level-two generalizations. A level-two generalization concerns the generalization from
empirical statements to theoretical ones. Level-one generalizations are the generalizations
ruled by the discipline of inferential statistics. They concern the generalizations made from
one or more samples observed to the population from which the samples are drawn. Because
both log file analysis and the survey are part of the investigation it is relevant to discuss
whether it makes sense to apply the methods of inferential statistics in our case study.
The basic issue of inferential statistics is the relationship between the sample and the
population (Bakeman (1992), Gunter (2002)). The method of sampling (or lack of method)
determines the conclusions that can be drawn about the population. The first step is to define
the population relevant for the investigation. In the present case one could set up a few
alternative possibilities:
- Uses of virtual workspaces in general
- Uses of virtual workspaces in intra-organizational settings
- Uses of Lotus Quickplace in Financial institutions
- …
Common to these suggestions for populations is that they go beyond the setting of the
study, which is the use of Lotus Quickplace at Beta. If we chose one of these populations, the
next step would be to argue for the principle used for sampling. In any case, the sampling of
both the technology and the organization would be useless for this purpose because it is not
random: it is based on convenience. While one could argue that the sampling of the
technology is random, the choice of organization certainly is not. Therefore I will not spend
more time on level-one generalizations beyond
the population of “use of Lotus Quickplace at Beta”. This is perfectly in line with the overall
characterization of the study as a case study. In the following considerations, "use of Lotus
Quickplace at Beta" is our population. The conclusions we draw on this population will then
form part of the basis for level-two generalizations.
Regarding the issue of sampling in the population “use of Lotus Quickplace at Beta”:
The log file analyses are generally made on the whole population. This means that we
have made the analysis on all log file data from the whole 10-month period. Sampling is
generally used in data mining when the amount of data exceeds what is technically feasible.
We did not run into that limit (although some of our queries in the database took several days
to complete).
For some specific analyses we set up criteria for selecting a collection of documents
because we, for example, only wanted documents for which we had a full lifecycle. This was
the case also with the selection of QPs. These criteria are relevant for the specific analyses
and will be dealt with when the results are presented.

Sampling in a case study


Even when the methods of inferential statistics are not relevant for use either in the
survey or in the log file analysis, sampling is still an important issue, in both the planning and
in the analysis of the data. It is important in the planning phase, because it will increase the
probability of collecting data that can actually answer the research question. It is also
important in the data analysis process, because the sampling is one of the determinants of the
conclusions that can be drawn.
When making a case study, the first sampling issue is of course the choice of case
(Jensen (2002) p. 238). In the present context, this means choosing the specific technology
and the organization as the object of study.
Jensen (2002) identifies three procedures for sampling in a qualitative study: maximum
variation sampling, theoretical sampling, and convenience sampling.
In the choice of Lotus Quickplace as the medium, as well as in the choice of organization,
convenience sampling has been used. Convenience sampling is not only a nice way of
expressing that the sampling is based on “what was accessible”; it also implies that the choice
between organizations can be based on how well they provide access to studying the object of
interest. The organization was chosen because it had a pre-established engagement with the
DIWA project, and because I had a connection to the organization stemming from previous
consulting engagements. The organization therefore provided easier access to
interviews, observations, document studies, questionnaires, and log files for analysis. For the
purpose of studying the research question “how a computer medium interacts with
pre-existing social structures in an organization”, convenience sampling is probably the best
sampling method. One of the critical issues of such a study is precisely to get access to data.
The choice of medium for the study was a result of my participation in the DIWA
project, as Lotus Quickplace is an example of an interactive web application. The choice of
interactive web applications rather than, for example, e-mail is primarily due to their
novelty. Interactive web applications have been the object of relatively little research, while
e-mail has been studied empirically and extensively.


The sampling issue is an issue of which kinds of conclusions or generalizations can be
drawn from empirical data. Different samples and different sampling methods in combination
with different data provide different foundations for valid generalizations. The primary task of
this thesis is therefore to lay open the sampling and let the reader evaluate whether the
generalizations made are valid.

Before we proceed to the actual results, I wish to spend some additional time on the
analysis of HTTP-logs. This will provide information that will enable the reader to
critically evaluate the results from the log analysis. My discussion of the use of HTTP-log
analysis is extensive because it is not a well-described method in IS research and research on
computer-mediated communication. HTTP-log analysis is established as a method for
understanding how single users interact with a web site in the research tradition of Human-
Computer Interaction (HCI). It is argued that HTTP-logs also offer a valuable source for
analyzing computer-mediated communication via web sites, not just in case study research of
computer use in organizations, as is the case in this study, but as a generic method for various
purposes. The HTTP-log analysis performed in this study thus illustrates a use of HTTP-logs
not previously described in research or performed in practice.


HTTP-log analysis for CMC


The purpose of this section is to present the HTTP-log analysis produced in the study as
a method that builds on the discipline of data mining and its sub-discipline web usage
mining. A separate section is devoted to it because the analysis constitutes an as yet
undeveloped method of HTTP-log analysis.
The challenge of log files is that there is so much data to analyze, and so many relations
and patterns to look for, that the process can be infinite. The analysis of log files is considered
a special case in the broader discipline of data mining. The discipline of data mining is
summarized in the following definition:
“Data mining is the analysis of (often large) observational data sets to find unsuspected
relationships and to summarize the data in novel ways that are both understandable and
useful to the data owner.” (Hand, Mannila et al. (2001))
Data mining is a discipline that has grown out of the use of computer systems with large
databases. Typical examples include banks which analyze records of transactions in bank
accounts and the medical testing of new medicines. Data mining of transactional records in
banks has, for example, resulted in earlier detection of credit card fraud. The data mining
work has produced a model or pattern for how credit card fraud looks in transactional data.
This model or pattern is then applied in an on-going analysis of transactions, which can
identify frauds as they happen.
The process of producing statistics is usually described as a two-step process. The first
step is to generate hypotheses about a certain phenomenon. The second is to test these
hypotheses using inferential statistics. The methods of data mining address both steps but, in
contrast to traditional statistics, they focus more on discovering and visualizing patterns in data
that could serve as a basis for generating statistical hypotheses. As Glymour, Madigan et al.
(1996) state, the data collections used for data mining are typically large convenience
samples. They are therefore problematic as the basis for performing inferential statistics.
Whether or not they are problematic for generating global models, the results can be very
useful for more modest or practical purposes. As mentioned earlier, the purpose of using data
mining techniques for this thesis was not to produce global models, which are valid according
to inferential statistics.
An HTTP-log is a special kind of log file produced by an HTTP-server (web server). An
HTTP-server is a piece of server software that adheres to the HTTP-protocol as defined in
RFC 1945 (HTTP v. 1.0) and RFC 2068 (v. 1.1). The logging of activity on an HTTP-server
is not dealt with in any of the RFCs.


The logging of activity on an HTTP-server is produced using “the common log file
format”, which has emerged as a de facto standard (see Appendix 5).
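To make the format concrete, the following is a minimal sketch of how a single log line could be parsed. It assumes the field layout commonly documented for the common log file format (host, identity, user name, timestamp, request line, status code, bytes sent). The sample line is invented for illustration and is not taken from the study's log files. The sketch also shows the kind of date re-formatting that such logs typically require before analysis.

```python
import re
from datetime import datetime

# Field layout assumed from the commonly documented common log file format:
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    rec = match.groupdict()
    # Re-format the timestamp, e.g. 23/Nov/2001:09:15:02 +0100, into a datetime.
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    # The request line normally reads: METHOD URL PROTOCOL
    parts = rec["request"].split()
    rec["method"], rec["url"] = (parts[0], parts[1]) if len(parts) >= 2 else (None, None)
    rec["status"] = int(rec["status"])
    rec["size"] = None if rec["size"] == "-" else int(rec["size"])
    return rec


# Invented sample line, for illustration only.
example = '10.0.0.1 - anna [23/Nov/2001:09:15:02 +0100] "GET /index.html HTTP/1.0" 200 2326'
record = parse_clf_line(example)
```

Because every server that writes this format produces the same fields, a parser of this kind can be reused across systems, which is the practical consequence of the standardization discussed here.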
HTTP-logs are attractive compared to other logs because they represent a standardized
format for logging. This has some interesting consequences. Firstly, it means that analytical
tools for HTTP-logs can be built across diverse systems, because they use the same HTTP-log
standard. Secondly, it enables researchers to more easily compare studies of use across
different technologies.
The standardized format of HTTP-logs also limits the type of analysis one can perform.
This has led to a working draft under W3C (World Wide Web Consortium) to define an
“Extended log file format” (not to be confused with the extended common log file format),
which standardizes the description of the data that is logged from a specific HTTP-server
rather than defining the data itself, as is the case with the common log file format. This would
enable more flexible logging that can be suited to specific purposes.

Web mining and web usage mining


The advent of the www has led to another sub-discipline of data mining: web mining.
Data mining analyzes large data sets and the www certainly is a large data set, or rather a
huge number of large datasets. Therefore it has been natural to transfer the data mining
discipline to web mining.
As Cooley, Srivastava et al. (1997) note, the term "web mining" has been used for
covering quite diverse phenomena. To clarify matters they have provided this conceptual
division of web mining.

[Figure: taxonomy of web mining. Web mining divides into web content mining and web usage
mining; web content mining divides further into the agent-based approach and the database
approach.]

Web content mining is the mining of the content on the Internet. The Google search
engine is an example of a practical application of web content mining. The Google search engine
analyzes the contents of the Internet and indexes the contents so that users can search the
index. This is a relatively simple process of web content mining, which is interesting mainly
because of the huge amount of data analyzed. Usually web content mining is interested in
establishing models to describe data rather than just indexing it for users to search, as Google
does, but in principle Google is a simple example of web content mining.
The model above distinguishes between the agent-based approach and the database
approach to web content mining. The agent-based approach searches for information and
patterns of information where the information is, whereas the database approach is focused on
structuring web-data in a database to make it available for structured querying using
languages such as SQL (Structured Query Language). The Google search engine is an
example of the agent-based approach.
Web usage mining is the discipline from which our analyses of log files set out. Web
usage mining is the adoption of data mining techniques to discover use patterns from HTTP-
log files. Web usage mining is still a young discipline and is characterized by a very strong
drive from the industry. The next section provides an overview of the types of research that
have been conducted on web usage mining.

A survey of the research in web usage mining


Existing research on web usage mining, as well as its practical applications, is focused
on analyzing how single users use a system. A typical approach is to analyze actual uses of a
system in terms of, for example, how people navigate a web site and use this as input for
redesigning the web site. A simple example would be to discover that some specific page on
level 3 of the web site hierarchy is the target of 50% of all visits to the web site. This could
suggest that the page should take a more central role in the information hierarchy. Similarly,
the fact that a page considered very important by the site owners is rarely used suggests that
it should be moved to another place in the information hierarchy. (See e.g. Masseglia, Poncelet et al.
(1999), Spiliopoulou (2000) for examples.)
Web usage mining has a number of scientific as well as practical applications. The
following presents a categorization of applications of web usage mining both in research
literature and in practice:
Technical analysis:
The analysis of HTTP-logs is used to generate simple statistics of usage, which are used,
for example, for load planning and sizing of the technology, or for measuring activity levels
in different areas of a web site. One of the difficult aspects of sizing web technologies is to
foresee the level of activity and how it fluctuates in different time periods. Analyses of log
files, which produce activity level histograms for a site, can help the planning for future
investments in technology or define time periods where the site can be maintained without
disturbing too many users. These analyses are supported by all commercial web site analysis
tools (e.g. WebTrends (http://www.webtrends.com)), and some tools exist as freeware (e.g.
Analog (http://www.analog.cx/)).
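A simple activity-level histogram of the kind described above could be sketched as follows. This is an illustration under stated assumptions, not the output of any of the tools mentioned: the function name and the sample timestamps are invented, and the timestamps are assumed to have already been parsed from the log.

```python
from collections import Counter
from datetime import datetime


def requests_per_hour(timestamps):
    """Count requests for each hour of the day (0-23)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Invented timestamps standing in for parsed log entries.
sample = [
    datetime(2001, 11, 23, 9, 15),
    datetime(2001, 11, 23, 9, 40),
    datetime(2001, 11, 23, 14, 5),
]
histogram = requests_per_hour(sample)

# The hour with the fewest requests is a candidate maintenance window
# (ties are resolved in favour of the earliest hour).
quietest_hour = min(range(24), key=lambda hour: histogram[hour])
```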
Methods for extracting use patterns:
Different generic methods and algorithms have been developed in research to extract
typical patterns of usage of a web site. These are mostly based on different types of sequence
analysis and association rule mining taken from the data mining discipline. See e.g. Srikant
and Agrawal (1995), Srikant and Agrawal (1996), Hidber (1998), Cooley, Tan et al. (1999),
Andersen, Larsen et al. (2000), Pei, Han et al. (2000). Sequence analysis analyzes typical
sequences of events in time, while association rule mining analyzes associations between web
pages based on users' actions. Association rule mining is also known from shopping basket
analysis, providing results such as "80% of people who buy beer also buy potato chips". In the
context of usage of a web site a possible result could be that "10% of people who visit the
product descriptions section also visit the web-shop."
Research on methods for clustering users based on their usage of a web site has also
been conducted (See e.g. Fu, Sandhu et al. (1999)). A practical application of user clustering
could be to reflect the types of users in the information architecture.
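The association rule example above ("10% of people who visit the product descriptions section also visit the web-shop") can be made concrete with a small sketch. It computes the two standard measures of a rule, support and confidence, for a single pair of pages. The page names and visit data are invented, and the sketch does not implement any of the mining algorithms cited above, which search for such rules efficiently across all pages.

```python
def rule_metrics(visits, antecedent, consequent):
    """Support and confidence of the rule: antecedent -> consequent.

    visits is a list of sets, each holding the pages seen by one user.
    """
    n_total = len(visits)
    n_antecedent = sum(1 for pages in visits if antecedent in pages)
    n_both = sum(1 for pages in visits if antecedent in pages and consequent in pages)
    support = n_both / n_total if n_total else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Invented visit data, one set of pages per user.
visits = [
    {"products", "shop"},
    {"products"},
    {"products", "contact"},
    {"home"},
    {"products", "shop", "home"},
]
support, confidence = rule_metrics(visits, "products", "shop")
```

In this invented data, half of the users who saw the product pages also visited the shop (the confidence), and such users make up two fifths of all users (the support).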
Personalization and Information architecture:
Research has been conducted into the utilization of the generic methods described above
for improving the user interface of a web site. The most popular approach has been to utilize
the analysis of usage for providing personalization of the user interface. See Mobasher,
Cooley et al. (2000), Mobasher, Dai et al. (2001), Su, Yang et al. (2002), Toolan and
Kushmerick (2002) for examples. The idea is that the history of usage for a single user or
group of users can provide input on how to promote specific web pages on a site to specific
users. A simple implementation could be to generate a list of news articles, based on news
articles previously read by a user. Such an application would also require a typology for news
which could be mapped to the usage. The generation of the news list would involve selections
such as "if the user has previously read news of type X, and we have current news of type X,
then add the current news of type X to the list of news displayed to the user".
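The selection rule quoted above can be written out as a short sketch. The news typology, titles and function name are invented for illustration. A real personalization system would derive the reading history from the log and the typology from the site's content.

```python
def personalized_news(read_history, current_news):
    """Select current news whose type the user has read before.

    read_history: list of news types previously read by the user.
    current_news: list of (title, news_type) pairs.
    """
    seen_types = set(read_history)
    return [title for title, news_type in current_news if news_type in seen_types]


# Invented example data.
current = [
    ("Rates cut", "finance"),
    ("Cup final", "sport"),
    ("New branch opens", "finance"),
]
news_list = personalized_news(["finance", "finance", "weather"], current)
```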
Apart from personalization, more generic approaches to dynamic information
architectures have also been developed. Masseglia, Poncelet et al. (1999), Spiliopoulou, Pohle
et al. (1999), Su, Yang et al. (2002) provide examples of this. They generally incorporate
usage history in the automated design of the interaction with users.
On-line shopping behavior:
Specific methods have been developed for studying the behaviour of on-line shoppers.
Srikant and Agrawal (1995), Srikant and Agrawal (1996) develop an algorithm for applying
sequence analysis to the analysis of shopping behaviour. Different forms of sequence analysis
for analyzing on-line shopping behavior are now built into commercial data mining products
and services (e.g. Clementine from SPSS, WebHound from SAS, SurfAid from IBM).
Marketing based on log analysis:
Clustering of users based on usage and correlated analyses of other data such as
customer segmentation models (e.g. Minerva or Kompass) or buying history have been
researched as an input for marketing. The goal is to direct marketing more precisely towards
potential customers. One of the buzzwords is personalized marketing (Büchner and Mulvenna
(1998)). While not based on the analysis of HTTP-log files but rather transactional data, the
marketing e-mails sent out by the Amazon bookstores illustrate this principle.

This overview of the research shows that web usage mining has been driven strongly by
the commercial interest in utilizing www. The applications of the methods are also all focused
on the interaction between single users and a web site. It could therefore be characterized
partly as a contribution to research in HCI (Human-Computer Interaction) and partly to
marketing and sales research.

Mining computer-mediated communication


Many web sites can be understood as collections of information or shops available for
single users. The existing research in web usage mining is based on this understanding. It
ignores, however, the fact that web technologies or web sites also function as media for
communication between users. This is, for example, the case with the virtual workspace
technologies studied here, and also with web-based discussion fora and sites such as
slashdot.org, where users act as both producers and readers of information.
It is suggested here that HTTP-logs can also be used as a source for understanding how
people collaborate or communicate mediated by web-based technologies. This requires a
different approach to analysis and increases the importance of combining the quantitative
“traces” of action with actors' accounts of their actions derived from interviews or
questionnaires.
While the analysis of HTTP-log files to study human-computer interaction is fairly well
established in research and to a lesser extent also in the industry, using it to study computer-
mediated communication (CMC) is still undeveloped. I have been unable to find accounts of
such research or practical applications in journals, conference proceedings, etc.
While HCI focuses on one user's interaction with a system, the focus of CMC is the
communication between two users mediated by a system. The analytical unit for HCI is the
session. A session is a sequence or collection of interactions with the system by one user
limited by function and by time. In HTTP-log analysis, a session is usually defined negatively
as interactions of one user preceded by 30 minutes of non-activity and followed by 30
minutes of non-activity.
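The negative definition of a session can be sketched as follows. The function name and sample events are invented, and the sketch assumes that each log entry has already been reduced to a user name and a timestamp. A new session is started whenever at least 30 minutes have passed since the same user's previous request.

```python
from datetime import datetime, timedelta


def split_sessions(events, gap=timedelta(minutes=30)):
    """Group (user, timestamp) events into per-user sessions.

    A gap of at least `gap` between two consecutive requests by the
    same user closes one session and opens the next.
    """
    sessions = {}
    for user, ts in sorted(events, key=lambda event: (event[0], event[1])):
        user_sessions = sessions.setdefault(user, [])
        if user_sessions and ts - user_sessions[-1][-1] < gap:
            user_sessions[-1].append(ts)   # continue the current session
        else:
            user_sessions.append([ts])     # start a new session
    return sessions


# Invented events for illustration.
events = [
    ("anna", datetime(2001, 11, 23, 9, 0)),
    ("anna", datetime(2001, 11, 23, 9, 10)),
    ("bo", datetime(2001, 11, 23, 9, 5)),
    ("anna", datetime(2001, 11, 23, 9, 45)),  # 35 minutes after 9:10
]
sessions = split_sessions(events)
```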
In the analysis of CMC, the session is not a relevant analytical unit. Instead, the
analytical unit must be the document (or the html page). One would intuitively think that
communication would be the analytical unit. Due to the architecture of the HTTP-protocol
specifically and www in general, this is not possible. The HTTP protocol, and therefore also
HTTP-logs, only record single users' interactions with specific resources on a web server. To
compensate for this, the data must be reworked. As an illustration, think of the
logging of telephone conversations. Communication over the telephone is something that
technologically requires two telephones and two users. When the telephone company logs the
activity of this telephone conversation, they log both phone numbers. Therefore
communications can be studied directly. On the www the situation is more cumbersome. If
we take a question and answer session on, for example, a programmer’s discussion forum, the
HTTP-log has no account of any communication. In the HTTP-log an action identifiable as
“one user posing a question” is written, and then at some later time another action “one user
answering a question” is written (see Appendix 5 for a definition of the information contained
in an HTTP-log). Initially these two actions cannot be identified as a communication between
the two users. The link between the two is the document. Therefore the document is the
analytical unit for HTTP-log analysis for CMC. The following table summarizes the
differences between the two approaches:
Name                          Analytical unit       Use situation
Session-based log analysis    Session               Single users' interactions with a system
Document-based log analysis   Document life cycle   Communication between users mediated by the system
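The document-based approach can be illustrated with a minimal sketch. It assumes that the log has already been decoded into records of the form (timestamp, user, document ID, action), and the action labels "create" and "read" are hypothetical: an HTTP-log contains only URLs, so such labels must first be derived from them. Grouping the records by document links the user who wrote a document to the users who later read it, which is the link that the log itself does not record.

```python
from collections import defaultdict


def document_life_cycles(records):
    """Group (timestamp, user, doc_id, action) records by document, in time order."""
    cycles = defaultdict(list)
    for ts, user, doc_id, action in sorted(records):
        cycles[doc_id].append((ts, user, action))
    return dict(cycles)


def communication_pairs(cycles):
    """Return (author, reader, doc_id) for every later read by another user."""
    pairs = set()
    for doc_id, events in cycles.items():
        authors = set()
        for ts, user, action in events:
            if action == "create":
                authors.add(user)
            elif action == "read":
                pairs.update((author, user, doc_id) for author in authors if author != user)
    return pairs


# Invented records; ISO timestamps sort correctly as strings.
records = [
    ("2001-11-23T10:00", "bo", "doc1", "read"),
    ("2001-11-23T09:00", "anna", "doc1", "create"),
    ("2001-11-23T10:30", "anna", "doc1", "read"),
    ("2001-11-24T08:00", "carl", "doc2", "read"),
]
cycles = document_life_cycles(records)
pairs = communication_pairs(cycles)
```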

Some of the analysis of HTTP-log files made in this thesis will exhibit some potential
applications of HTTP-log analysis for CMC.

The next sections will be devoted to some practical problems involved in using
HTTP-logs for analysis. Some of the problems cover both session-based log analysis and
document-based log analysis, while others specifically deal with doing document-based analysis. The
purpose of the description in relation to this thesis is to clarify the process of doing log
analysis. This will enable us to judge the implications drawn when the results of the analysis
are presented later.

HTTP-log analysis and cryptanalysis


One of the challenges of HTTP-logs and logs in general is the relationship between the
contents of the resource and the information that is stored in the log. By contents of the
resource, I refer to the meaning of the resource to the people using the system. For a web
page, this could be expressed in the question “what is the HTML page about?”. HTTP-logs do
not contain information to answer that question. The question can only be answered by
reading the HTML page. Text-mining algorithms have also been developed for clustering
documents according to the contents of the document.
Drawing an analogy to cryptanalysis can illustrate the task of the log analyst. Basically,
there is no guarantee that there is a link between the URL saved in the HTTP-log and the
contents of the resource. In traditional web-server environments, where a web-server is
serving html files out of a file system and the filenames of the html files are related to their
contents, it is possible to make sense out of the URL in the log. As an example, my personal
homepage has an “index.html” file, which is my homepage, and a “curriculum.html” file,
which contains my curriculum. When log statistics are generated it is relatively
straightforward to see, from the statistics, that when users accessed the
“curriculum.html” file they were probably reading (or at least skimming) my curriculum. As
soon as we turn to more complex situations where the HTTP-server is not just serving files
from a file system, there is no longer any obvious link between the URL in the log and the
contents of the resource.
Another aspect of HTTP-log analysis justifies the analogy. When a user performs some
action (e.g. clicks a hyperlink), it typically generates a number of lines in the log. Some of
these lines are relevant, but a large percentage is irrelevant for understanding the user's
actions. In the case of Lotus Quickplace approximately 90% of the lines in the log files are
irrelevant.
Doing log file studies could be compared to the work of a cryptologist trying to decrypt
a message. The cryptologist tries to break the code and then reconstruct the message
originally written by someone. He knows (or assumes) that the encryption is systematic and
uses a number of techniques to test hypotheses about the principles on which the encryption is
based, and tests specific keys to break the code. The log analyst is faced with a similar task. The
log files represent traces that are systematically related to actions performed by the users of
the system. The extraction of relevant information from the log demands setting up
hypotheses on how a line in the log file is related to a user's actions, and testing the validity of
the hypotheses in an on-going learning cycle.
The process of decrypting text consists of multiple cycles of hypothesis formulation and
hypothesis testing. The cryptanalyst assumes that there is a fairly simple relationship between
an unknown text and the encrypted text. There are two elements of this relation in
most cryptography (Singh (1999)): the algorithm and the key.
[Figure: the cleartext and the encryption key are fed into the encryption algorithm, which produces the encrypted text.]

The algorithm specifies the mathematical relationship between three entities: the
encrypted text, the clear text (decrypted) and the encryption key. Often the cryptanalyst
knows the algorithm and needs to identify the encryption key.
The analogous task of analyzing log files looks like this: as we have seen above, in the
explanation of the common log file format, some user action causes a number of lines to be
written to the log. In the common log file format, the only field available for understanding
the relationship between user actions and the log line is the URL field. Thus we can
schematize the log file analysis in the same manner as the process of cryptanalysis.

[Figure: a user action and an action code are processed by the application server architecture, which produces the URL of the log line.]

The practical process of log analysis


The process of analyzing HTTP-log files involves a number of steps:
1. Defining the goal of the analysis, and therefore also the user actions relevant for the
analysis. This is an iterative process of defining goals and investigating the possibilities
available with the server architecture, which produces the log file.
2. Breaking the code of the relationship between user actions and properties of the URL
in a log line.
3. Cleansing and preparation of the data. Much of the data is irrelevant, and for
computing efficiency reasons it is usually preferred to extract data from the URL and store
them as separate data. With the Lotus Quickplace server we extracted the name of the
Quickplace and the ID of the document from the log file. Also, formatting of some parts of
the log data is necessary. In our case the date needed to be re-formatted.
4. Generation of data matrices based on the cleansed data. A lot of the analyses of the
log data are made on matrices containing aggregated information. An example can be found
in appendix 6. In our case this was done in a relational database.
5. Analysis and visualisation of the data matrices. In our case this was achieved using a
number of tools: Clementine from SPSS, SQL, Excel spreadsheets, and SPSS statistical
package.
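The cleansing step (step 3) can be illustrated with a minimal sketch. It parses one line in the common log file format, extracts the name of the workspace and the document ID from the URL, and re-formats the date. The URL layout `/workspaces/<place>/documents/<ID>` and the function name `clean_line` are assumptions made for illustration only; they are not the actual Lotus Quickplace URL scheme, which has to be established through the code-breaking process.

```python
import re
from datetime import datetime

# Common log file format: host ident username [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<username>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] "(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

# HYPOTHETICAL URL layout, used only for illustration:
#   /workspaces/<place name>/documents/<32-character hexadecimal ID>
URL_PATTERN = re.compile(
    r'^/workspaces/(?P<place>[^/]+)/documents/(?P<doc_id>[0-9A-Fa-f]{32})'
)


def clean_line(line):
    """Return the fields needed for analysis, or None for irrelevant lines."""
    entry = CLF_PATTERN.match(line)
    if entry is None:
        return None  # not a well-formed log line
    url = URL_PATTERN.match(entry.group("url"))
    if url is None:
        return None  # images, scripts and other auxiliary resources
    stamp = datetime.strptime(entry.group("timestamp"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "username": entry.group("username"),
        "timestamp": stamp.strftime("%Y-%m-%d %H:%M:%S"),  # re-formatted date
        "place": url.group("place"),
        "doc_id": url.group("doc_id").upper(),
    }
```

Records returned by such a function can be stored as separate columns in a relational database, which is what makes the aggregation in step 4 efficient.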
In the following, some aspects of these processes are dealt with more extensively.

Mapping user actions to log lines


As previously noted, the first complication in analysing the log file is that
one user action typically causes a number of lines to be written to the log. This is due to the
architecture of HTML and HTTP. When you request a plain HTML document through your
browser from an HTTP-server using a URL, the HTTP-server will serve you the document.
However, HTML documents usually contain elements that are stored on the HTTP-server
as individual resources (e.g. images, java-applets, flash movies, etc.). The browser then
analyses the HTML document and requests the elements that were linked from the HTML
document in the same way it requested the original document.
If a user types www.mydomain.com/mainpage.html in the address field and hits return
or clicks on a link on an HTML page (“<a
href=www.mydomain.com/mainpage.html>linktext</a>”), an HTTP request is produced in
the browser. Once the HTML page is returned to the browser, the browser scans the page for
links to additional resources, which need to be requested from a server before the page can be
shown to the user. In our simple scenario, we have four other resources, which we need to obtain
by sending an HTTP request. The four additional resources are labelled “#2” because they are
issued as a consequence of “#1”. The browser deals with all #2 requests automatically,
without involving the user.
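The way a browser derives the additional (#2) requests from a page can be sketched with the HTML parser in the Python standard library. The page content, the file names and the selection of tags are invented for illustration; a real browser handles many more cases.

```python
from html.parser import HTMLParser


class EmbeddedResourceFinder(HTMLParser):
    """Collects resources that a browser would request without user involvement."""

    # tag -> attribute pointing at an automatically fetched resource (simplified)
    AUTO_FETCHED = {"img": "src", "script": "src", "embed": "src", "applet": "code"}

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.AUTO_FETCHED.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                self.resources.append(value)


# Invented example page with four embedded resources and one ordinary link.
PAGE = """<html><body>
<img src="logo.gif"><img src="photo.jpg">
<applet code="menu.class"></applet>
<embed src="intro.swf">
<a href="next.html">next</a>
</body></html>"""

finder = EmbeddedResourceFinder()
finder.feed(PAGE)

# One request for the page itself (#1) plus one per embedded resource (#2).
expected_log_lines = 1 + len(finder.resources)
```

The hyperlink to `next.html` is not collected, because it only produces a request when the user clicks it.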

The mouse click, or the typing of the URL in the browser address field followed by pressing return,
eventually produces five log lines in the server HTTP-log. The problem is then to locate the
resource relevant for the analysis. Typically the relevant resource to analyze is the one that
uniquely identifies that the user is in fact looking at the contents of a specific document.
For most analyses, the goal is to identify the resource that contains the contents of the
document on which the user performs the actions. In certain specialized analyses it
might be different. In the process of analyzing the log files of the Lotus Quickplace server we
discovered that some of the .gif files loaded were named after the folders of the Quickplace. It was
therefore possible to analyze the folder structure of the Quickplace through the log file. For this
specialized analysis the contents of documents were unimportant.
This problem of a one-to-many relationship between a user action and a log line is
solved in a number of ways. One of the elements for solving the one-to-many relationship
puzzle is to identify the main resource of interest. When users read or edit an HTML page,
there is one resource that is unique for this action and several that are not. How this is
organized is completely dependent on the architecture of the server. Two scenarios are
important to distinguish between in this respect. We might name the first one the file server
scenario and the other the application server scenario.
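A simple way of singling out the main resource among the lines produced by one action is to discard requests for auxiliary resources. The sketch below does this by file extension. The extension list is an assumption that suits simple sites only; in the application server scenario the criteria depend on the server architecture and have to be found empirically.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed list of extensions that identify auxiliary resources.
AUXILIARY_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".class", ".swf"}


def is_main_resource(url):
    """True if the URL does not look like an image, script or other auxiliary file."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension not in AUXILIARY_EXTENSIONS


# The five requests produced by one click in the simple scenario (invented names).
requested = ["/mainpage.html", "/logo.gif", "/photo.jpg", "/menu.class", "/intro.swf"]
main_resources = [url for url in requested if is_main_resource(url)]
```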
The file-server scenario:
This is the original HTTP-server scenario. It is still being used and is characterized by its
simplicity.

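[Figure: the file-server scenario. The browser requests resources from the HTTP server, which serves files (HTML, PDF, GIF, JPEG...) from the file system of the server.]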
In this scenario, log analysis is simple:
- All resources requested by a browser via HTTP exist on the server in a hierarchical
file system. The HTTP-server has a mapping table between URLs and places in the
file system so that, for example, the URL http://www.billeskov.dk/publications.html
maps to C:\www\html\publications.html.
- Usually the file-naming convention tells something about the contents of the page
being viewed. So just by looking at the name of the resource, you can determine
information about the contents.
- Often, the hierarchy of the file system is mapped to the information architecture of
the web site. (Such that, for example, all html files related to product presentations
are placed in the folder “\products” in the file system.)
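The mapping described in the list above can be sketched as a function from a URL to a place in the file system. The sketch assumes a single document root, `C:\www\html`, taken from the example above; a real HTTP-server may hold a more elaborate mapping table. The function name `url_to_file` is hypothetical.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlsplit

# Assumed document root, taken from the example in the text.
DOCUMENT_ROOT = PureWindowsPath(r"C:\www\html")


def url_to_file(url):
    """Map a URL to the file that a simple file-based HTTP-server would serve."""
    path = unquote(urlsplit(url).path)
    # Drop empty segments and segments that would leave the document root.
    parts = [part for part in path.split("/") if part not in ("", ".", "..")]
    return DOCUMENT_ROOT.joinpath(*parts)


publications = url_to_file("http://www.billeskov.dk/publications.html")
product_page = url_to_file("http://www.billeskov.dk/products/overview.html")
```

Because the folder hierarchy is preserved, the name of the parent folder (here `products`) can serve as a rough indicator of where a page belongs in the information architecture.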
Even though the file server scenario is simple in some ways, it faces problems when:
- the information architecture does not relate to the hierarchical structure of the file
system.
- the complexity of the web site (number of resources and number of links) exceeds
what can be grasped by the mind of the log-analyst.
Attempts have been made to solve the problem of complexity by machine-analyzing the
contents of individual resources (See e.g. IBM SurfAid (http://www.ibm.com/surfaid) which
uses a text mining clustering algorithm for categorizing pages by analyzing contents).
The application server scenario:
Today many web sites are managed via an application server. This means that our
scenario above is now more complicated.

[Figure: the application server scenario. Components shown: browser, HTTP server, CGI interface, application, file system, database interface and relational database (Oracle, DB2, MySQL…).]

Application servers are used for a number of reasons: for handling more complex
interactions than reading HTML pages, for handling authentication, for handling scalability
etc. While the file server scenario utilizes the URL standard (RFC 2396) for accessing
resources in the form http://host/resource, in an application server everything following
http://host is not standardized and depends solely on the internal architecture of the
application server. This means that the techniques used in the file server scenario cannot be
applied. In the file server scenario it is often possible to provide standardized solutions to the
analysis of log files independent of the architecture of the specific web site. All standardized
log analysis tools are based on the file server scenario and they are therefore not applicable
for application servers.
In the application server scenario it is therefore necessary to perform the code-breaking
process described above. I will now go through the process with the Lotus Quickplace server.

Breaking the code in Lotus Quickplace


Breaking the code of the Lotus Quickplace server, which would allow a mapping
between properties of the URL and a user action, was the first step in the preparation for the
log analysis.
The first step of breaking the code was to select the user actions that would be relevant
for the analysis of log files for studying computer-mediated communication. The Lotus
Quickplace technology allows users to perform a large number of different actions. Some of
these are very common while others are extremely specialized. For the purpose of studying
computer-mediated communication eight different common user actions were selected. The
eight action types are defined in appendix 7.
The next step was to start the code-breaking process. This was conducted on a test
installation of Lotus Quickplace, over which we had full control. The process began by
performing some action on the system such as uploading an attachment to a document.
Afterwards, the log lines produced by the action were studied to search for patterns useful for
identifying that the action had taken place. This we repeated a number of times for each type
of action and in various combinations until we had a reasonably good idea of how to identify,
in the log, that a user had performed one of the action types.
After this, we performed a test which included a sequence of all the action types. The log
file was then searched to see whether the actions predicted by the search of the log files were
identical to the actions actually performed. This test exposed some problems with the criteria
and meant that the process was repeated for some of the action types. After the second
overall test, the predictions made from searching the log files matched the actions performed.
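The test cycle described above can be sketched as a set of rules that map URL patterns to action types, followed by a comparison of predicted and performed actions. The three rules and their URL patterns are hypothetical and serve only to show the shape of the procedure; the criteria actually used for Lotus Quickplace are not reproduced here.

```python
import re

# HYPOTHETICAL criteria: (action type, pattern that the URL must match).
RULES = [
    ("upload attachment", re.compile(r"/attachments/upload\b")),
    ("edit document", re.compile(r"/documents/[0-9A-F]{32}\?action=edit")),
    ("read document", re.compile(r"/documents/[0-9A-F]{32}$")),
]


def classify(url):
    """Return the action type indicated by a URL, or None for irrelevant lines."""
    for action, pattern in RULES:
        if pattern.search(url):
            return action
    return None


def predicted_actions(urls):
    """Reduce a sequence of logged URLs to the sequence of predicted actions."""
    return [action for action in map(classify, urls) if action is not None]


def mismatches(performed, predicted):
    """List positions where the prediction differs from the action performed."""
    if len(performed) != len(predicted):
        return [("length", len(performed), len(predicted))]
    return [(i, a, b) for i, (a, b) in enumerate(zip(performed, predicted)) if a != b]
```

An empty list of mismatches corresponds to the outcome of the second overall test; a non-empty list points to the criteria that need another round of hypothesis testing.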
As for the problem of establishing the relationship between properties of the URL and
the contents of the document, the Lotus Quickplace server provided no possibilities of doing
this. This is very often the case in the application server scenario. The documents on Lotus
Quickplace were all identified in the URL via a 32-character hexadecimal code and thus gave
no indications whatsoever as to the contents of the documents. For the analysis where this
relation was important, the file-names of attached files were used to indicate the contents of
the document. This was, for example, necessary for the analysis of specific genres of
communication.

This section has explained some aspects of the first three phases of the log analysis. The
remaining two phases will be dealt with when the results of the analysis are presented.

Some generic technical challenges of HTTP-log analysis


Before we proceed, some generic technical issues with HTTP-log analysis need to be
explained since they are important for producing reliable analysis. These problems exist in
both web server scenarios, and are related to the identification of the user and to the caching
of documents.

Identifying the user


It is very common to visit web sites without logging in with a user-name. This means
that the USERNAME field in the log file contains a “-” for all activity performed by these users. In
the analysis of how users use a web site, this of course is a challenge because there is no
obvious way of identifying the user. Two approaches can be used to overcome this problem.
The first possibility is the client IP-address stored in the log file. In an IP-based network such
as the Internet, every computer has a unique IP-address, and therefore it seems like a good bet
for identifying the user. There are however two problems with using the IP-address:
1. People who connect to the HTTP-server via an ISP (Internet Service Provider), as well
as people on a network using DHCP change IP address over time. Therefore the
identity of the user changes. The IP-address remains the same across a use session,
but for analysis over longer periods of time it is highly unreliable. Therefore it cannot
be used for document-based analysis of CMC.
2. Even in the same session, the IP address can pose problems. People connecting
through a proxy-server will all appear in the log with the same IP-address. Thus an
analysis would collapse perhaps 10 or 100 users into one.
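Where a log contains both the client IP-address and a user-name, both problems can be made visible by cross-tabulating the two fields. The sketch below uses invented records; the addresses and names are not taken from the case study.

```python
from collections import defaultdict

# Invented (client IP-address, USERNAME) pairs.
records = [
    ("192.0.2.10", "anna"),
    ("192.0.2.10", "bo"),
    ("192.0.2.10", "carl"),    # three users behind one proxy address
    ("198.51.100.4", "dora"),
    ("198.51.100.9", "dora"),  # one user whose address changed over time
]

users_per_ip = defaultdict(set)
ips_per_user = defaultdict(set)
for ip, user in records:
    users_per_ip[ip].add(user)
    ips_per_user[user].add(ip)

# Addresses that would collapse several users into one (problem 2).
shared_ips = {ip for ip, users in users_per_ip.items() if len(users) > 1}
# Users who would be split into several identities (problem 1).
roaming_users = {user for user, ips in ips_per_user.items() if len(ips) > 1}
```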

The second possibility is to use persistent cookies. A cookie is a de facto standard
invented by Netscape. The cookie specification allows a server to set a cookie with a name,
domain, expiration date and a value on the client's browser. This cookie then gets sent back to
the server with each HTTP-request to the domain specified in the cookie. Cookies are used a
great deal in web applications, but can also be used for analytical purposes.
If the cookie is logged together with the other data, it can be used as an ID of the user,
even though he changes his IP-address and is not logged in, until the cookie expires. This of
course requires that the standard log-format is changed to also log the cookie value. Different
HTTP-servers can change their log-format with varying degrees of difficulty, but typically a
log analyst does not have access to change the configuration of the server. If a cookie is used
as identification of the user it is therefore important to analyze how this can be achieved in
practice on the web-server.
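As a minimal sketch of the mechanism, the following uses the `http.cookies` module of the Python standard library to build a persistent identification cookie and to read it back from a request header. The cookie name `visitor_id`, the lifetime of one year and the function names are assumptions made for illustration; the value read back is what would have to be written to the log.

```python
import uuid
from http.cookies import SimpleCookie


def make_tracking_cookie(domain, days=365):
    """Build a persistent cookie carrying a random visitor ID (name is assumed)."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = uuid.uuid4().hex
    cookie["visitor_id"]["domain"] = domain
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = days * 24 * 60 * 60  # lifetime in seconds
    return cookie


def visitor_id_from_header(cookie_header):
    """Extract the visitor ID from the Cookie header of a request, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None
```

The server sends the output of `make_tracking_cookie(...).output()` as a response header once; on later requests the browser returns the value, and `visitor_id_from_header` recovers it so that it can be logged with the other fields.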
Another potential problem with using cookies arises in settings where multiple users
access the web-server via the same client machine. Because the cookie is stored in the
browser different users will appear as one.
In the case of the Lotus Quickplace, most requests to the HTTP-server are logged with a
user-name. Consequently this was used as the user identification without the necessity of
cookies.
While one of the qualities of log files is that they log consistently over time, on rare
occasions this is not the case. In the middle of our data analysis, we had an extremely
frustrating experience. After a specific date (5/10/2001), we discovered a radical increase in
requests where the USERNAME field contained a “-”. One of our hypotheses was that
anonymous access had been allowed to some of the Quickplaces. It turned out, however, to be
much worse. On 5/10/2001, which was a Saturday, the Lotus Quickplace server was
upgraded to a new version. For some obscure reason, which cannot have been the intention of
the developers, a change in the design of the application meant that far fewer of the
requests were logged with a USERNAME. Actions such as reading or editing a document
were no longer logged with a USERNAME. This meant that some of the analyses conducted,
including the ones related to studying CMC, could only be based on data in the period before
5/10/2001.
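A change of this kind can be detected early by computing, for each day, the share of requests logged without a user-name. The sketch below uses invented records and an arbitrary threshold of 50%; neither is taken from the case study.

```python
from collections import Counter


def anonymous_share_per_day(records):
    """Map each day to the fraction of its requests logged with USERNAME '-'."""
    total, anonymous = Counter(), Counter()
    for day, username in records:
        total[day] += 1
        if username == "-":
            anonymous[day] += 1
    return {day: anonymous[day] / total[day] for day in sorted(total)}


def first_day_above(shares, threshold):
    """Return the first day on which the anonymous share exceeds the threshold."""
    for day in sorted(shares):
        if shares[day] > threshold:
            return day
    return None


# Invented (day, USERNAME) pairs; days are ISO strings so that they sort correctly.
records = [
    ("2000-01-01", "anna"), ("2000-01-01", "bo"),
    ("2000-01-01", "-"), ("2000-01-01", "carl"),
    ("2000-01-02", "anna"), ("2000-01-02", "-"),
    ("2000-01-02", "-"), ("2000-01-02", "-"),
]
shares = anonymous_share_per_day(records)
break_day = first_day_above(shares, 0.5)
```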

Handling caching
Apart from the problem of identifying the user, caching poses another challenge to the
reliability of the results from log analysis. Caching means that documents and other resources
on the web-server are stored in other places to save network traffic. There are two scenarios
where a mouse-click does not produce lines in the server HTTP-log because of caching:
Browser caching:
The browser caches resources from the web according to two sets of rules:
1. A browser can be set to cache files in different ways. In Internet Explorer the options for
when the browser should check for an updated page are “Never”, “Once per session”
and “Always”. If set to “Once per session”, the browser checks only once during the
period in which a browser window is open.
2. For each HTML-page an additional rule applies. The browser caching settings are
only used when an HTML-page is not expired. HTML-page headers have a field for
defining an expiration time. If this expiration time is before the time on the client
machine, the browser will automatically check for an updated version of the html
page.
If a page has not expired and the browser cache settings are set to “Once per
Session”, each additional mouse-click to the same HTML-page in the same session will
not produce any additional lines in the HTTP-log. This is of course important to know if
one wants to study the exact path that a user has taken.
In the case of Lotus Quickplace all browsers at Beta were set up in a unified way, so
the browser caching was not a problem. Generally the browser caching poses a greater
problem for session-based analysis than document-based analysis because
session-based analysis analyzes events in the same session.
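The two sets of rules can be combined into a small decision function that predicts whether a click on an already cached page is expected to leave a trace in the HTTP-log. This is a sketch of the rules as described above, not of the behaviour of any particular browser; treating a page without an expiration time as not expired is an assumption.

```python
from datetime import datetime


def request_reaches_server(setting, expires, now, seen_in_session):
    """Predict whether a click on a cached page produces a line in the server log.

    setting          -- "Never", "Once per session" or "Always"
    expires          -- expiration time from the page header, or None (assumed: not expired)
    now              -- current time on the client machine
    seen_in_session  -- True if the page was already checked in this session
    """
    if expires is not None and expires < now:
        return True  # expired page: the browser checks for an updated version
    if setting == "Always":
        return True
    if setting == "Once per session":
        return not seen_in_session
    if setting == "Never":
        return False
    raise ValueError(f"unknown cache setting: {setting!r}")
```

For path analysis, every click for which this function returns False is a step in the user's path that is missing from the log.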
Server/Proxy caching:
For performance reasons it is not uncommon that files often served from an
HTTP-server are cached. This means that they are not served by the HTTP-server itself, but
by a cache server that is placed between the browser and the HTTP-server on the
network. This of course means that such requests are not logged at the
server because the request never reaches it. The Lotus Quickplace server did not use
caching.

Consistency of the resource ID


When conducting document-based log analysis it is very important that there is a
consistent relationship between the content of a particular document and its identification in
the log. For the study over time of the communication between two or more users via a single
document, it is crucial that the identification of the document in the log file is consistent over
time. If the document changes ID, while still appearing as the same document to the users, it
is not possible to perform document-based analysis. In Lotus Quickplace the document ID
was consistent over time, but in some application server scenarios this might not be the case.

This section has dealt with some rather detailed problems related to the analysis of log
files. It has also presented a new approach to log analysis. It has been shown how HTTP-logs
can be used as a basis for studying computer-mediated communication. Before we proceed to
the presentation of the results from the case study, including the use of log analysis, we need
to take a closer look at the technology that has been introduced in the organization. As
discussed previously, the specific properties of the technology artefact are essential for
understanding how it is adopted in the organization.

Introducing virtual workspaces


The following section introduces the type of technology, which has been introduced at
Beta, through a comparative study of different virtual workspace products. One of these is the
Lotus Quickplace technology studied at Beta. Other versions of this section have been
published in Bøving (2001), Bøving (2002).
The purpose of the section is to characterize the type of technology employed
at Beta. Using Orlikowski's (2000) distinction between technology as an artefact and the
technology-in-practice, this section describes the character of the technology as an artefact.
Also included in the characterization is a description of the design process of virtual
workspaces. Some of the applications of structuration theory such as Lyytinen and
Ngwenyama (1992), Desanctis and Poole (1994) have characterized the properties of the
technology as a social structure. While the properties of a technology are a product of
structuration processes, the relevant social structures for understanding the design of virtual
workspaces are different from the ones which are relevant for studying the adoption. It
therefore appears to be a category mistake to characterize the technology as a social structure.
This is especially the case with standards-based systems such as virtual workspaces. The
analysis of the design process of virtual workspaces will clarify that we should analytically
separate social structures and the properties of media (Jensen (2000)).

Introduction
The WWW is no longer simply a medium for publishing information. It is being used for a
wide range of purposes such as commercial transactions, shopping, and so on. One of the
trends is to use Internet technologies to support collaborative work in teams and projects. The
promised advantages include, amongst others, the ability to enable geographically dispersed
teams to work together, to improve collaboration in the team, and to lower the cost of setting
up inter-organizational projects or teams.
Since August 1999, when Intranets.com began offering a virtual workspace service, it has
been possible to lease an application for collaboration over the Internet either at a monthly
rate or as ad-ware. Since then a large number of companies have offered this service and
several of them have already disappeared again.
My first interest in this type of application came from the sheer enthusiasm concerning
the possibility of obtaining a shared workspace application for collaboration almost for free.
This illustrates the new economic possibilities of the Internet and the economic effects of
infrastructure and standard protocols.

Virtual workspaces are also interesting in other respects. They exemplify how the design
process of software applications has changed, and how the distinction between design and use
is shifting. The design of virtual workspaces is very open in the sense that it is designed to
support a wide variety of use situations or genres of communication.
Virtual workspaces also exemplify how the design of the protocols and standards of the
Internet has an increasing significance for the end-user situation. This is the case, as we will
see, with both the HTTP and the SMTP standards. Seen in the light of emerging standards
such as XML, which is a standard for modelling data and creating standard data models, this
trend is due to continue and enter new territories.
The virtual workspaces studied are very similar in terms of functionality yet project very
different images to the user. An analysis of the metaphors used reveals different strategies for
modelling the anticipated use situation. The differentiator in this type of application is not the
functionality but the metaphors used, and the strategy for modelling the use situation.

Virtual workspaces are suggested as a name for the type of application studied. PCWorld
has also used this term (see footnote 1). There is no unified, agreed name or definition of a virtual
workspace. Other names used include: The Digital Workspace, Virtual Office, Team
Workspace, Worksite, and Teamware. Most of these names are registered as trademarks,
which prevents them from being used for naming a type.
A virtual workspace is an application that facilitates people working together. Typically
this means that it has facilities for sharing files, engaging in discussions, sharing a calendar,
integration with e-mail, synchronous chat, and simple work-flow. All virtual workspaces also
provide access control and model the user in standard profiles such as managers, authors, and
readers.
A virtual workspace is an application that is available as a service over the Internet. It
is designed to be ready for use by a group of people. In the marketing of virtual workspaces,
phrases such as "instant collaboration" and "lets your whole team start working immediately" are
typical. The findings from the Beta study will show that, while there are no technical
difficulties in setting up a virtual workspace, "instant collaboration" is not a term well suited
to the adoption of the technology. On the contrary, it is a complex task to integrate the virtual
workspace in the genres of communication already existing in the organization.
Appendix 8 is a compiled list of virtual workspace products available at the time of the
analysis, together with some products that have appeared since this analysis took place.

1 http://www.zdnet.com/pcmag/stories/reviews/0,6755,2619206,00.html
Virtual workspaces are not new in terms of functionality, nor are they based on new
insights into the workings of groups. Shared workspace applications and groupware have
existed for a while, both in the scientific community and commercially, and the virtual
workspaces studied here do not present new aspects in terms of functionality or new unique
approaches to collaboration support. The research in both CSCW and GSS has investigated
the kind of functionality offered. Commercially, the Lotus Notes platform is an example of
groupware, adopted in many organizations, that offers the same functionality as virtual
workspaces but is based on proprietary protocols.
In October 1995, GMD made the first version of their BSCW system (Basic Support for
Co-operative Work) available for the public to test and use (see Bentley, Horstmann et al.
(1997), Appelt (1999)). BSCW was the first groupware system based on web protocols such as
HTTP, HTML and TCP/IP. It has facilities for file sharing, discussions, user
modelling and access control. The BSCW system seems to be a main inspiration for all the
commercial products studied here.

Method
The empirical basis for my analysis of virtual workspaces is a selection of seven
commercially available products (See Appendix 8). All are web-based applications providing
a set of functions, which enable communication, the exchange of documents, discussions and
collaboration within a group of people. All the virtual workspaces studied were available
from an ASP (Application Service Provider) at a monthly rate or as ad-ware. The criterion for
selecting the products to be analyzed was primarily their availability for testing without cost.
The results presented are drawn from analyses of the application user interfaces and self-
observation in simple use-situations. The other Ph.D. students, from the DIWA project, and I
have used the applications in different situations for collaborative work. Also the functionality
of the applications has been analyzed in detail. Appendix 9 provides an example of the guide
used for the analysis of the virtual workspaces with answers taken from the analysis of Lotus
Quickplace.

Three economic models of virtual workspaces


The seven virtual workspaces are all based on three economic models, two of which
presume the existence of the Internet. The three models are shown below.

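[Figure: three economic models of virtual workspaces, all resting on standards. .com-model: a combined application developer and service provider serves the application users. ASP-model: an application developer supplies an Application Service Provider, which serves the application users. Traditional model: an application developer supplies an IT-department, which serves the application users.]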

The first virtual workspaces (except for BSCW which was a research project) were
based on the .com model. In this model, a company develops the virtual workspace and offers
it as a service directly to users. The primary medium of communication to customers is the
Internet, and in many cases it is available as ad-ware where users get a free service in return
for being exposed to banner-ads. This model has turned out not to be viable, and by the end of
2002 very few virtual workspaces were solely based on the .com model. The companies have
either closed or they have combined the .com model with either the ASP or the traditional
model.
The ASP-model divides the development of the technology and the offering of the
service to customers between two companies. The software development companies lease the
software to ASPs, who offer the virtual workspace directly to users.
In both the .com- and ASP-models the virtual workspace is offered as a service to be
paid for on a per unit of use basis. The customer does not buy a software license with unlimited use but
pays specifically for each unit of use. The ASP or .com also hosts the application and the data,
so that all the customer needs is a browser and Internet access. The typical unit of use by
which payment is measured is one month of use by a given maximum number of users.
In the traditional model for offering virtual workspaces, a software license is sold to an
organization, which installs and runs the software in its internal IT-department, as is the case
at Beta. The main reason for this choice at Beta is that financial corporations have a history of focusing on
security and protection of corporate data. Therefore it was not an option to have a .com or
ASP host the data.

The case of virtual workspaces shows how the .com model, which has turned out to be
problematic, has produced a type of application that is finding its way into organizations
primarily through the ASP and traditional model.
The customers for whom the service is marketed are, for example, projects in and
between organizations of varying sizes. As we shall see in the section on the metaphors used,
the virtual workspaces are targeting slightly different customer settings. The ASPs all provide
a very easy start-up of the virtual workspace, with no installation required. All that is needed
for someone to start a virtual workspace is a browser.

The economic model underlying virtual workspaces is heavily dependent on a number of
things being available as simple and very cheap commodities. A cheap network (the Internet) is
needed as well as the standards used on the Internet for communicating on the network. A
more expensive network access would hinder the success of virtual workspaces. So
would a troublesome installation procedure of a proprietary client. The Internet infrastructure
provides simplicity of building, running, and using applications, which reduces their price.
The virtual workspaces are thus partly based on the availability of an infrastructure (the
Internet) and of protocols that simplify development and deployment of the application.
The cost of deploying an extra virtual workspace is limited to the price of extra server load
and server monitoring and the cost of a customer spending 5 minutes to start up the new
virtual workspace. Low sales and marketing costs are additional factors that determine the
extremely low pricing of a service previously only available after lengthy implementation
projects.
The economic model underlying the virtual workspaces also assumes that there is no
design and customization, which cannot be handled by the customer. This challenge is
handled in different ways among the virtual workspaces. Some of them deal with it by
modelling general aspects of group work such as document sharing, shared calendar, e-mail,
etc. Two of the workspaces (eRoom and Lotus Quickplace) use a strategy of modelling
typical settings of use. eRoom has templates for “Consulting Engagement”, “Product
Development”, and so on, which can then be used and modified by the customer. The
possibility for providing different templates for different use situations is not used by Beta.
All Quickplaces at Beta are based on the same standard template.

The design process of virtual workspaces


In traditional in-house software development the design process can usually be described
as one process that takes place in one organizational setting with the same group of
participants. In standard systems such as Lotus Notes and SAP, the software design process
takes place as a product development process in the software organization, and the
customization process takes place at the customer, handled by different people within a
different organizational setting. With Internet based applications such as virtual workspaces,
the design process is best described as three distinct processes. The first process is the
development of the standards of the Internet, the second is the application design process and
the third is the process of customization or adoption of the technology in the organization of
use.
The virtual workspaces showcase a design process, which has changed in two ways.
Firstly, the development of standards is becoming a process that is significant not only to the
developers but also to the way IT systems will be adopted in the use organization. Secondly,
an increasing proportion of the design process is left to be solved in the use situation. Virtual
workspaces are examples of applications that are open to many different forms of integration
into work settings, and processes which in standard systems such as SAP or work-flow
systems are the task of specialists who customize the software, are now the domain of end-
users.
In traditional in-house development projects, the design process is typically organized as
a project in the IT department with perhaps representatives from the users. Once the project is
completed, the application is then taken into use and in the event of necessary re-designing,
this will generally take place as a new project in the IT department. Broadly speaking, there is
one design process involved and a use process.
With standard applications such as SAP, Lotus Notes and others the design process is
divided into a software design process taking place at the software company and an extensive
customization process in the use organization. Here the design process is divided into two
organizationally and temporally distinct processes. Typically the customization process is
made in-house, organized as a project in the same way as the traditional design process. The
customization or adoption of virtual workspaces, and other applications such as e-mail,
WikiWikiWeb (Leuf and Cunningham (2001)), or CoWeb (Guzdial, Rick et al. (2000)), is
accomplished by users more or less as part of the daily work, and in many cases without
explicit attention to it as being a design process.
The most important feature of virtual workspaces and other web-based applications is
their reliance on standards. They all use the browser as the client and rely on TCP/IP, HTTP,
HTML, and the group of standards guiding e-mail. (Bøving (2001) contains an analysis of the
standards underlying the e-mail system and how they affect the use patterns of e-mail). These
standards impose a number of constraints and "ways of doing things" which are important for
understanding how the application will be developed and used. In addition to the
well-established standards, new standards have been developed but not yet adopted in virtual
workspaces. These are XML (Extensible Markup Language) and WebDAV (Web-based
Distributed Authoring and Versioning). XML is a standard for structuring information,
including providing standard data structures for specific purposes known as DTDs
(Document Type Definitions) or Schemas. WebDAV is an extension to the HTTP 1.1
protocol and handles the locking of files for editing as well as version control and other
related features.
Of course the effects of standards are not new in systems development, but their
importance has increased significantly, and they are, in the case of RFC822 (Standard for the
format of ARPA Internet text messages), HTML and XML, not only concerned with
providing basic communication protocols but also with forming the content of the application.
The problem of editing documents via HTTP provides a specific example of how the
standards process is important for the use situation in the case of virtual workspaces (see Dix
(1997) for an extensive discussion on CSCW and internet protocols). The virtual workspaces
rely on the HTTP protocol, which is a stateless protocol for requesting and sending resources
between a client (browser) and a server (HTTP-server). That the protocol is stateless means
that when the server has sent what the client asks for it forgets everything about the
transactions except for maybe writing it to an HTTP-log. In a virtual workspace where part of
the purpose is the sharing of documents in the making this is somewhat inconvenient, since
one risks simultaneous editing of different versions of the document that will result in a
conflict when the changes are sent back to the server. Because of this, all of the virtual
workspaces have implemented locking of files, which enables a user to lock a file against
editing or reading by other users while downloading a copy of the document to the local
machine for editing directly in the browser or in another application such as Word.
The locking of files is implemented differently in each of the workspace applications and
is generally not very intuitive to use. As a result of the deficiency in HTTP, WebDAV is
being developed as a standard protocol under IETF (Internet Engineering Task Force). If
WebDAV is adopted by the software organizations who develop virtual workspaces, the
locking of documents for editing and version control will be defined by the standard rather
than by the software organization. This example illustrates how the protocol is setting up
conditions, which have consequences for the design of the application and, subsequently, for
the end-user.
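The conflict described above, and the way a lock avoids it, can be illustrated with a minimal sketch. It is not the implementation of any of the workspaces discussed here, nor of WebDAV. The class StatelessStore, the user names and the document text are hypothetical, and the toy lock only blocks writing, whereas the workspaces may also block reading.

```python
class StatelessStore:
    """Toy document server: like HTTP, it keeps no memory of who fetched what."""

    def __init__(self, text):
        self.text = text
        self.locked_by = None  # name of the user holding the edit lock, if any

    def get(self):
        # Each request is answered in isolation; nothing about it is remembered.
        return self.text

    def put(self, user, new_text):
        # Without a lock held by someone else, the last writer silently wins.
        if self.locked_by not in (None, user):
            raise PermissionError(f"locked by {self.locked_by}")
        self.text = new_text

    def lock(self, user):
        if self.locked_by not in (None, user):
            raise PermissionError(f"locked by {self.locked_by}")
        self.locked_by = user

    def unlock(self, user):
        if self.locked_by == user:
            self.locked_by = None


def demo_lost_update():
    """Two users edit the same version; the second upload overwrites the first."""
    store = StatelessStore("draft")
    copy_a = store.get()                  # Anna downloads the document
    copy_b = store.get()                  # Bo downloads the same version
    store.put("anna", copy_a + " +anna")
    store.put("bo", copy_b + " +bo")      # Anna's change is lost
    return store.text


def demo_with_lock():
    """With a lock, Bo is refused until Anna has uploaded her changes."""
    store = StatelessStore("draft")
    store.lock("anna")
    copy_a = store.get()
    try:
        store.lock("bo")
        refused = False
    except PermissionError:
        refused = True
    store.put("anna", copy_a + " +anna")
    store.unlock("anna")
    store.lock("bo")
    store.put("bo", store.get() + " +bo")
    store.unlock("bo")
    return refused, store.text
```

In the first function the server has no way of knowing that Bo's copy is out of date, which is exactly the consequence of statelessness. In the second, the state that HTTP does not keep is kept by the application in the form of the lock.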
Above I have illustrated how the development of standards is becoming an important
factor in the design of applications such as virtual workspaces. There is one other major
factor, which should be mentioned, which concerns the tools and methods used to develop the
application. The virtual workspaces divide into three main groups according to the basic
model they are built on. Most of the applications (e.g. eRoom, Projectplace, BSCW) are based
on the concepts of files and folders known from file systems in operating systems such as
Windows. Lotus Quickplace is based on the concept of forms, documents and views known
from the world of Lotus Notes. Forms are document templates or data structure definitions on
which individual documents are based; views are collections of documents selected by some
common property. Views serve the same purpose as the folder but have a more flexible
structure. Quickteam is based on a generic object-oriented model. The whole organisation of
the Quickteam points to an object modelling approach. There are many entities such as issues,
documents, goals, events, etc. which share the same general properties such as the possibility
of assigning security settings. It would be rather simple to reconstruct the class hierarchy
underlying the application. The logic of classes, objects, properties and inheritance is very
evident in the way things are organized.
This is exemplified in the modelling of a project. A project is modelled not as an
aggregated class containing tasks, documents, etc. The project class only contains a schedule.
Overall a project is modelled as a property of all objects such as documents, tasks, etc. In the
file system approach a project is either the whole room as with eRoom or a folder such as in
BSCW. In the Lotus Notes approach, a project is not modelled and would therefore first be
modelled in the use situation as a view collecting a number of documents or by the naming of
a Quickplace after the name of the project it is supporting.
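The difference between modelling a project as an aggregate and modelling it as a property can be sketched as follows. This is an illustration only and not a reconstruction of the actual class hierarchy of any of the products. The names Item, AggregateProject and items_in_project, as well as the sample data, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    """Any workspace object: a document, a task, an event and so on."""
    name: str
    kind: str
    project: Optional[str] = None  # property approach: project is just an attribute


@dataclass
class AggregateProject:
    """Aggregate approach: the project is a container that owns its parts."""
    name: str
    items: list = field(default_factory=list)


def items_in_project(items, project):
    """Property approach: a project is whatever shares the same property value."""
    return [item.name for item in items if item.project == project]


# Property approach: objects of any kind are tagged with a project.
workspace = [
    Item("Spec.doc", "document", project="Alpha"),
    Item("Review meeting", "event", project="Alpha"),
    Item("Holiday list", "document"),  # belongs to no project
]

# Aggregate approach: the project object holds its contents directly.
alpha = AggregateProject("Alpha", [Item("Spec.doc", "document")])
```

In the property approach a project only exists as the result of a selection such as items_in_project(workspace, "Alpha"), whereas in the aggregate approach it exists as an object in its own right, which is closer to the file system approach of a room or a folder.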
Generally, we can depict the design process of virtual workspaces as three distinct
processes, which are distinctly different in character:

The standards process


The standards process is characterized by time cycles of 3+ years. Most of the standards
being developed for the Internet are controlled by two organizations: W3C (World Wide Web
Consortium) and IETF. The process is organized in groups with members from organizations
in both academia and the commercial world. It is typically done as a side activity for the
members of the group. Both organizations have defined a standard process for developing
standards (see Bradner (1996), Jacobs (2001)).
As an example, the WebDAV working group under IETF was approved in March 1997,
and WebDAV has the status of "Proposed Standard" by IESG (Internet Engineering Steering
Group). "Proposed Standard" is the first maturity level of an IETF standard. The next is the
"Draft Standard" and the last is "Internet Standard". As a comparison HTTP 1.1 now has the
status of "Draft Standard" while TCP, IP and FTP are "Internet Standards".
It is, however, important to note that the standards development processes of W3C and
IETF are not the only relevant standards processes. The Internet has several examples of de-
facto standards, which have emerged typically because a software company has been so
successful that competitors have adopted them. The "common log file format" and the
"extended common log file format" are both examples of de-facto standards. The cookie is
another example, which was developed by Netscape as part of the Navigator browser. A third
example is JavaScript, which was also a Netscape invention.
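As an illustration of what the "common log file format" standardizes, the following minimal sketch parses a few log lines and counts requests per authenticated user. The field layout (host, identity, user, time, request line, status, size) follows the format as commonly documented for web servers. The sample lines, paths and user names are invented for the purpose of the example and are not taken from the case study.

```python
import re
from collections import Counter

# One line of the common log file format:
# host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF.match(line)
    return match.groupdict() if match else None


# Hypothetical sample lines, invented for illustration.
sample = [
    '10.0.0.1 - anna [15/Jan/2003:10:12:01 +0100] "GET /qp/alpha/main.nsf HTTP/1.1" 200 5120',
    '10.0.0.2 - bo [15/Jan/2003:10:13:44 +0100] "GET /qp/alpha/main.nsf HTTP/1.1" 200 5120',
    '10.0.0.1 - anna [15/Jan/2003:10:15:09 +0100] "POST /qp/alpha/upload HTTP/1.1" 302 -',
]

records = [parse_line(line) for line in sample]
requests_per_user = Counter(r["user"] for r in records if r)
```

Because the format is shared across servers, a parser of this kind can be reused unchanged for logs from different products, which is one practical consequence of a de-facto standard.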

The application development process


There are two characteristics of the application development process, which deserve to
be mentioned here. Firstly, it is characterized as packaged software development. The process
and organization of developing packaged software differ significantly from that of custom
designed applications (Carmel (1995), Carmel and Bird (1997), Carmel and Sawyer (1998)).
One of the characteristics is the low degree of user involvement. "Many of the assumptions in
these approaches [approaches for involving users] are inapplicable for packaged software
development." (Carmel and Sawyer (1998) p. 11). Packaged software development is focused
on developing a product for a market rather than a custom application for a specific
organization. Sales representatives, focus groups or other types of market analysis must
mediate the communication with users and, in many cases, user needs are defined without
systematic research. This characteristic of the development of virtual workspaces highlights
the category mistake made by Lyytinen and Ngwenyama (1992) and DeSanctis and Poole (1994)
in their application of structuration theory. The social structures relevant for understanding
the design of the packaged software are very different from the ones relevant for
understanding the adoption of the technology in an organization.
The other characteristic of application development is that it is software for the Internet.
As discussed above, this means that application development is dependent on the
development of standards. The Internet is also characterized by being a highly dynamic
environment and it is a critical factor for success that products are developed to be very
flexible. The shifts in demand and the constant release of products by competitors mean that
software development must be able to change significant aspects of products at late stages in
the development processes (MacCormack, Verganti et al. (2001)).

The description of the standards development and its importance for how applications are
designed and packaged in products, and the description of the development process, have
highlighted some structural factors relevant for understanding how virtual workspaces turn
out as they do. The model of the three major processes involved in understanding the role of
virtual workspace technologies is characterized as "levels of structuring a domain of work".
This means that the standards set up both constraints and possibilities for the development of
the applications. More specifically, the application development process is producing an
artefact with certain properties, which will constrain some and enable other possible
technologies-in-practice.

The adoption of virtual workspaces


While the application process constrains and enables technologies-in-practice, virtual
workspaces are characterized by their openness. They are open in the sense that they support
very different technologies-in-practice in different types of settings. In this sense they are
similar to other Internet technologies such as WikiWikiWeb (Leuf and Cunningham
(2001)) and e-mail. The character of the adoption and integration of the virtual workspace
artefact in a work setting will be dealt with further in the reports on the adoption of Lotus
Quickplace at Beta. The distinction between the technology as an artefact and the technology-
in-practice is used to distinguish between the characteristics of the virtual workspace
technology and the genres of communication in which the single virtual workspace is
embedded. It remains, however, an open question as to whether it makes sense to characterize
the different examples of Lotus Quickplace use as one artefact, identical to the software
which left the Lotus Corporation. We shall get back to this discussion after reporting the case
study, but two illustrations of why it is questionable to characterize it as one artefact will be
provided here.
All users who start up a virtual workspace must decide on, and create a structure in
which the documents shared by the users of the virtual workspace are stored. This is well
known from LAN drives and from paper-based management of information. The structure is
modelled as a hierarchy of folders in most virtual workspaces. When the virtual workspace is
deployed in a group, the users must design and implement a hierarchy of folders.

The support for designing folders is approached quite differently in the virtual
workspaces, mainly in the way the creation of folders is organized. In, for example, eRoom,
Quickteam, and Projectplace any member of the group can create folders. In Lotus
Quickplace the manager is the only one allowed to create structures for organizing content.
One aspect of designing the document structure is not modelled/considered by any of the
applications. This is the process of deciding on a structure of the documents. The decision
process of structuring the documents in the virtual workspace could be the task of a project
manager, a project librarian or it could involve the whole group. As we shall see in the case
study at Beta, the decision processes on structuring are approached differently across the
Quickplaces.
Another illustration is the discussion forum. All virtual workspaces model a discussion
forum where members can post information and others can respond to it in a threaded way. A
discussion forum is open to many different genres of communication. In a project these could
be: announcements from the project manager, issues which need resolution, questions, etc.
The discussion forum is thus a very open structure that can be used for many different genres
of communication.
In one of the interviews made at Beta, a manager of a Quickplace reported trying to
model the issue resolution genre. He was managing a geographically distributed project
group, which met face-to-face in workshops every two weeks and otherwise communicated
using the telephone, e-mail and a Lotus Quickplace. At one of these workshops the project
decided on a number of issues which the participants in the different countries should each
investigate. The project manager posted each of the issues as root documents in the discussion
forum, and anticipated that each of the countries would respond to each issue so that the
project could take an informed decision at the next workshop. However, nobody responded in
the Quickplace, but all had done their work and were ready to report back when the project
met again. They just did not use the Quickplace, because the project manager did not use an
existing genre of communication, and was suggesting a new one without making it explicit.
He did nothing but post the issues, leaving the project members to guess how they should
respond to them.
These illustrations exemplify the challenge of adopting the virtual workspace in a work
setting, which is analyzed in greater detail in the report from the Beta case study.

Design and the use of metaphors


I have now given an overall characterization of virtual workspaces and the process of
designing and adopting them. The rest of this section will be devoted to the analysis of the

Mine the gap - a multi-method investigation of web-based groupware use


84
Introducing virtual workspaces

different strategies used by the application developers for modelling the anticipated use
situation.
The most interesting aspect of modelling the anticipated use situation is the use of
metaphors. The use of metaphors is the primary means by which the virtual workspaces are
differentiated. As noted previously, the functionality offered across the different virtual
workspaces is more or less the same. The typical approach to comparing software focuses on
functionality, but this approach does not work when assessing an application such as virtual
workspaces.

The design strategy for virtual workspaces


The economic model underlying the development process of virtual workspaces implies
a strategy for generality/specificity. The economic model dictates a certain generality with
regard to the types of settings in which it can be put to use. It must work in different types of
organizations with different management models in order for the market to be large enough to
provide a low price per use. It also requires a certain specificity in order to provide perceived
value for the customer. Many processes must be automated so that the application is ready to
use and easy to use. A virtual workspace requiring the use of a man-day of expensive IT-
technician time to set it up would disable its potential in this economic model.
A virtual workspace could be characterized as a web-based CSCW application. The
CSCW community has a long tradition for creating and analyzing IT systems to support the
co-ordination of work. These systems are generally based on formal constructs for the co-
ordination of work such as a calendar and are not concerned with modelling the subject
matter, or the intellectual process, which the system supports. All the elements present in the
virtual workspaces are well known elements in the CSCW community so, as stated earlier, the
virtual workspaces present nothing new in terms of functionality.
Shipman III and Marshall (1999) propose a typology of four different types of systems
according to which aspects of the use situation are modelled: The hypermedia model, the
argumentation model, Knowledge Based Systems, and Groupware systems. Groupware
systems are focused on structuring the interactions between people working together in some
setting. This means that a groupware system asks the user to formalize the interaction with
co-workers. The virtual workspaces are Groupware systems in this respect. The virtual
workspaces are modelling discussions, document management, co-ordination of activities and
messaging, while leaving other aspects open. Knowledge based systems such as IBIS (Issue
Based Information System) and gIBIS (graphical Issue Based Information System) (see
Conklin and Begeman (1988)) model the intellectual process of breaking down information
logically into issues, hypotheses on the issues, and key questions which confirm the
hypotheses, and leave the interactions untouched.

Approaches to modelling the anticipated use


Virtual workspaces are then groupware, which means that they model some aspects of
the interaction between the users of the application. The following analysis will illustrate how
the virtual workspaces approach this modelling, and show the consequences that the chosen
approach has for the adoption in practice.
The design and marketing messages indicate that virtual workspaces are built to support
project-like work. It is therefore natural to compare the different ways in which the concept of
project is modelled in some of the applications:
Projectplace: Project is modelled as an object that is created with a number of features
              attached to it: Document Archive, Document Templates, Discussion Forum,
              Project Calendar, Tasks, Contacts, Deleted Items, and Members.
Quickplace:   Project is not modelled.
Quickteam:    Project is modelled as a property you can attach to any object regardless of
              whether it is a calendar entry or an uploaded file.
eRoom:        The whole application is modelled as a project. eRoom has a number of
              templates for different types of projects (e.g. a consulting engagement
              template, a product development template, etc.).
BSCW:         Project is not modelled.

This description of how a project is modelled shows how it is modelled in the
application development process. This creates rather different bases for the process of
establishing a technology-in-practice, which is left to the users of the application. In both
BSCW and Quickplace it is for the user to model a project. In BSCW this would be done by
creating a folder at some place in the folder hierarchy, giving it the same name as the
project and specifying who would have access to perform actions in the folder and sub-folders.
With Lotus Quickplace, a Quickplace can be created and named after the project so that it
represents the project, or a room in a Quickplace can represent a project. Both approaches
are used at Beta.
The eRoom strategy for modelling project is quite different. eRoom has templates for
different kinds of projects. One example is the Consulting Project. The diagram which
follows shows the right navigation bar of the Consulting Project eRoom. What is modelled
here is the structure of the documents in folders and a principle for structuring the documents,
which match different phases of a typical consulting project. So, in fact, the management of
the project is modelled in different phases such as Project Initiation, Development, Testing,
and so on.
The different approaches to modelling the idea of a project illustrate how the virtual
workspaces have different strategies for modelling the anticipated use situation. Whereas the
functionality of Quickplace and eRoom is more or less the same, the modelling of the
anticipated use situation is quite different. As we will see in the reports from the case study,
Lotus Quickplace, which was implemented to support projects, is also used to support other
groups of people. It can only be speculation, but this pattern would probably not have
emerged had Beta chosen to implement eRoom instead.
There are two underlying trade-offs in play here: specificity vs. generality and flexibility
vs. complexity. The specificity/generality trade-off is important in standard applications such
as these with the underlying economic model we have described earlier. One wants specificity
in relation to the work situation, which the application is supposed to support, in order to
create perceived value. On the other hand one wants it to be general so that no design is
required from the ASP or software company for each copy of the software leased or sold.
The other trade-off between flexibility and complexity is actually two trade-offs. The
first one is well known from systems engineering. Maximum flexibility is desired so that one
can cope with changes in the use situation without having to re-design. However, flexibility
increases the complexity of the application and thereby the costs of designing and maintaining
the application. In addition, and here is the second variant of the trade-off, flexibility also
adds complexity to the process of adopting the technology. People using Quickplace or
BSCW have to deal with a more complex design-process when modelling a project than
people using Projectplace or eRoom.
While these trade-offs are attached to the virtual workspaces as a type, the outcome in
the individual workspaces has been quite different. Which approach is the most successful can
only be decided in the use situation.
Another aspect in which the virtual workspaces differ is through the way the
organization of the users is modelled. The manner in which user rights are distributed
projects an image of how the users are organized.

Virtual workspace   Defined roles             Scope of the roles

Lotus Quickplace    Manager, Author, Reader   The Manager invites members and defines access
                                              rights for the main Quickplace and sub-rooms. The
                                              Author can create documents and can move and
                                              delete only his own documents. The Reader can
                                              only read documents.
Projectplace        Project Owner,            The Project Owner can terminate the project and
                    Project Member            invite new members. Otherwise everybody has
                                              equal rights.
eRoom               Administrator, Member     The Administrator can invite members and close
                                              down the eRoom. Otherwise, members and
                                              administrators have equal rights.
Quickteam           Team Member and Guest     Access rights are only defined on specific objects
                                              and are set by the creator of the object. There
                                              is no administrator role.
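The role models in the table can be expressed as a simple mapping from roles to permitted actions, which makes the difference in projected hierarchy easy to see. The sketch below is an interpretation and not taken from the products themselves. The action names are hypothetical, and it is an assumption that the roles described as having "equal rights" can all create and read documents and that the Quickplace Manager can do so as well. Quickteam is left out because its rights are defined per object rather than per role.

```python
# Hypothetical action names; the assignment of "create_document" and "read"
# to roles described only as having "equal rights" is an assumption.
PERMISSIONS = {
    "Lotus Quickplace": {
        "Manager": {"invite_members", "define_access", "create_document", "read"},
        "Author": {"create_document", "modify_own_document", "read"},
        "Reader": {"read"},
    },
    "Projectplace": {
        "Project Owner": {"terminate_project", "invite_members", "create_document", "read"},
        "Project Member": {"create_document", "read"},
    },
    "eRoom": {
        "Administrator": {"invite_members", "close_workspace", "create_document", "read"},
        "Member": {"create_document", "read"},
    },
}


def can(product, role, action):
    """True if the given role in the given product is allowed to perform the action."""
    return action in PERMISSIONS.get(product, {}).get(role, set())


def distinct_permission_levels(product):
    """Number of different permission sets: a rough indicator of hierarchy."""
    return len({frozenset(actions) for actions in PERMISSIONS[product].values()})
```

Under these assumptions Quickplace has three distinct permission levels while Projectplace and eRoom have two, which differ only in administrative actions such as inviting members or closing the workspace.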

Quickplace, Projectplace, eRoom, and Quickteam project quite different images of the
organization of the group or project which uses the workspace. Projectplace is projecting an
image of peer-to-peer collaboration with no hierarchical structure. Quickplace on the other
hand projects the image of a manager (or a librarian hired by a manager), a group of core
members and peripheral members who can only read documents.
The two examples of modelling a project and modelling the organization of the users are
prototypical examples of how virtual workspaces, which are quite similar in functionality,
choose to model different aspects of an anticipated use situation. The user rights management
clearly shows a difference in the image of the use organization projected into the application
by the developers. The result is that the different virtual workspaces create quite different
starting points for the establishment of technology-in-practice.
The use patterns of Lotus Quickplace at Beta show that in most of their Quickplaces a
few persons are assigned the role of manager, a small group of people author documents,
while a larger group of people only read documents. This pattern reflects therefore the pattern
anticipated in the design of Lotus Quickplace. It would not have been a practical problem to
assign the role of manager to everybody in a Quickplace but for some reason the model
suggested by the application is used. Again, the consequences of choosing another virtual
workspace technology such as eRoom can only be speculated, but perhaps rather different
technologies-in-practice would have emerged. It is notable that Beta has not consciously
chosen to implement Lotus Quickplace rather than, for example, eRoom because it modelled
the right aspects of the use situation. The choice of virtual workspace was made because the
IT department had previous experience with Lotus Notes software, on which Lotus
Quickplace is built.

The use of metaphors


Typical product comparisons in computing magazines usually focus on the functionality
offered by the application. PCWorld has evaluated a number of virtual workspaces and they
use functionality matrices as a means of comparison. In a functionality matrix each row
describes certain functionalities such as "has threaded discussion", and each column
represents a product. Even though it is possible to define a functionality comparison matrix
that can define a winner, the question is whether the focus on functionality grasps the most
important differentiators between the virtual workspaces. As seen above the different
strategies for modelling the anticipated use situation have revealed important differences.
Related to this is the use of metaphors.
The use of metaphors is recognized as a central aspect in building computer systems. A
search in the ACM Digital Library for “metaphor” or “metaphors” in the document title
results in 52 entries either from journals or conference proceedings published by ACM.
The body of literature on metaphorical theory is vast and metaphors play a central role in
cognitive theory, art theory, and so on. Metaphors are applied in HCI where they serve as
guidelines and methods for building user-friendly systems. The typical approach for using
metaphors in the design of computer systems is to take one central metaphor such as the
desktop and let that inspire the design.
However, I want to show here that the use of metaphors, in creating the user interface of
the virtual workspaces, is much more encompassing than the question of using a central
metaphor. The difference lies in the unconscious use of metaphors. The designers of the
virtual workspaces use many metaphors or frames of reference, which are implicit and not
reflected upon. The current analysis will reveal some of these “hidden” metaphors. The
underlying hypothesis is that the use of the metaphors is significant for the use patterns
emerging.
The desktop is one of the most successful metaphors in the history of computing. It
works as the basic metaphor in the Mac and Windows operating systems as well as the GNOME
and KDE graphical environments for Linux. This type of metaphor is sometimes called a root
metaphor, because it guides the overall structuring of e.g. the interaction with the Operating
System. Most design research on the use of metaphors looks for one or more root metaphors
to inspire the design process and to familiarize the user with the computer system.
Merriam-Webster defines a metaphor in the following way:

“a figure of speech in which a word or phrase literally denoting one kind of object or idea is used
in place of another to suggest a likeness or analogy between them.”
(www.merriam-webster.com)

Goodman (1976) describes one important aspect of a metaphor, which is not covered by
the Merriam-Webster definition.
“Now a metaphor typically involves a change not merely of range but also of realm”
(Goodman 1976) p. 72

A metaphor changes the range of a predicate. When you use “sad” to describe music, the
predicate, which we literally apply to human feelings, has now changed its range to include
music metaphorically. However, as soon as the predicate “sad” is applied to music it is also
possible to apply “happy”, “depressing” or “gay”. A predicate such as “sad” is always part of
a symbol scheme, e.g. the scheme of human feelings, and when “sad” is applied
metaphorically to music, it makes the rest of the scheme applicable in the new realm.
This property of metaphors is central in explaining the use of metaphors in the virtual
workspaces. Some of the metaphors are explicit, while others are implicit or potential
metaphors.
Nelson Goodman thus offers us three different types of metaphors to look for: the
explicit ones such as when applying “sad” to music, the hidden ones such as “gay” or “happy”
and the symbol scheme or realm which would in this case be “human feelings”.

The metaphorical landscape


The virtual workspaces use metaphors at several levels. They use a root metaphor to
establish a sense of place for what is going on. The root metaphor answers basic questions such as “where am
I?” and “what is this?”. For example, Quickplace uses a house as the root metaphor for the
application.
Another level of metaphors is found in the naming of the different parts of the
application. An analysis made at this level identifies the schemes or realms from which the
metaphors used explicitly are taken. This level of metaphors reveals the realms that inspired
the application design. It therefore also reveals the use situation anticipated by the
designers.
The analysis produces what we could call a metaphorical landscape of the applications;
an additional level of description besides functionality and technical architecture.

The house, room or the office


The root metaphors used in the virtual workspaces are all connected to a place. There are
two main groups. The first group consists of the applications that use the desk or personal
office as metaphor, and the second group uses a room, house or huddle, which emphasizes the
idea of a place where people meet and do things together.
The method of identifying the root metaphors was to collect the symbols used in the
application or used to describe the application in help systems and marketing material. The
BSCW system was the only virtual workspace that provided some reflection on the design.
BSCW is a scientific project as opposed to the rest, which are all commercial. In the
description of the system (Appelt (1999)) the author reflects upon the use of metaphor in
BSCW:
“The BSCW system is based on the metaphor of shared workspaces.”
Exactly how the “shared workspaces” metaphor informed the design of BSCW is not
reflected upon in the article. The interface of BSCW seems to point in many directions.
Quickplace, eRoom and Huddle247 use the house, room or huddle respectively as the
root metaphor. The metaphor is used directly in the naming and marketing messages of the
three applications:
“eRoom is the digital workplace on the web that enables distributed teams to work
together on their complex business projects.” (www.eroom.com)
“Lotus QuickPlace is the self-service Web tool for team collaboration. QuickPlace
enables the creation of a team workspace on the Web -- instantly!”
(www.lotus.com/quickplace)
The metaphor is also used in the interface design. One enters a Quickplace via a URL,
logs in and then sees a user interface which one assumes is similar for all members (in fact it is
not, since documents and rooms can be hidden from certain members). The same is the case
with eRoom and Huddle247.
The other group of virtual workspaces uses the personal office space as the root
metaphor. Projectplace, TeamNow, and HotOffice use this approach. While the marketing
messages all focus on the possibility of collaboration between geographically separated
people, the different metaphors are reflected in the start-up screens. When one enters the
virtual workspace a personal interface is presented, and the projects one participates in are
modelled. This does not produce a feeling of a shared place, which is common to all the
members of the project or team.
The two overall metaphors used in the virtual workspaces are thus the room or house,
and the personal office. It is again very important to note that most of the functionality is the
same. One example is the personal storing of files. The applications using the room or house
metaphor, such as Quickplace and eRoom, also allow files to be stored without being
visible to others, just as the applications using the personal office metaphor do.

The domains of reference


If we look at how the different functionalities of the virtual workspaces are named, we
see another level of metaphors.
The present analysis takes key elements of the virtual workspaces and analyses the
naming. It combines the explicit and hidden metaphors and produces a metaphorical
landscape for a virtual workspace technology. If we allow ourselves the generalization from
metaphor to realm we see an interesting difference between the applications. It should be
noted that the generalizations are not made based on empirical evidence which could be
gathered from interviews with developers of the virtual workspaces. They are products of my
interpretation based on the appearance and functionality of the virtual workspaces, and the
available marketing material. The metaphors used were not traced back to their origin, but
rather to the domains, which the developers of the applications would assume that the users
would think of. This is why the "files" in HotOffice is traced back to corporate IT rather than
to the paper-based units of administration.
The metaphorical landscapes of eRoom, HotOffice and Lotus Quickplace are depicted
below.

[Figure: metaphorical landscape of HotOffice. Labels: My Desk, Calendar, Contacts, Phone messages, Documents, E-mail, The Office, Bulletin Board, Departments, The Organization, Projects, Benefits, Internet, Chat, Corporate IT, Users, Files, Tools.]

[Figure: metaphorical landscape of eRoom. Labels: List, Note, Poll, Client Engagement, Personal org., Project, Public, Discussion, New Product Launch, Org. comm., Internet, PC, Intercom, File, Inbox, Link, Folder.]

[Figure: metaphorical landscape of Lotus Quickplace. Labels: Calendar, Folder, News digest, Search, Document, Personal org., Chat, Index, Internet, Archive, Library, Tools, IT, Customize, Discussion, Group, Reader, Members, Organization, Authoring, Manager, Revision, Tasks, Notify, Author.]


The metaphorical landscapes reveal a mixture of realms from which the metaphors are
drawn. As an example of the way my analysis is made we can look at the discussion facility.
The discussion facility is in all three cases the standard threaded discussion. The user can post
messages and respond to the messages as well as respond to the responses. In eRoom,
discussion is used in conjunction with a poll. A poll can be performed as part of a discussion,
so that one response in a discussion could be to define a poll for the members of the eRoom.
This is why the elements of poll and discussion are gathered in the “public” realm as they
borrow from public discussions and democratic decision making. HotOffice, on the other
hand, names the same threaded discussion facility a “bulletin board”. I have not traced it back
to the bulletin board in a university or an office building. This is because HotOffice uses it
together with a chat function, which is a standard synchronous non-threaded chat. Therefore
the original realm of the bulletin board metaphor in HotOffice is rather the BBSs (Bulletin
Board Systems) on the Internet. In Quickplace discussion is associated with members and
should therefore be assigned to a generic realm of groups where members discuss.
Another example, which illustrates the different realms from which the metaphors are
drawn, is the facility for sharing files. The eRoom relies on metaphors from PCs and uses
"folders" in association with "files". Also the graphical design of the folder structure
resembles what is seen in, for example, the explorer application of the Windows operating
system. In Quickplace the metaphors of "document", "Library", and "Index" are used in
conjunction. These metaphors stem from the realm of the archive or library, where documents
are stored for future use.
As was the case with the different strategies for modelling projects and distributing
rights to users, the consequences of applying various different metaphorical landscapes can
only be speculated upon.

Summary of virtual workspaces


The analysis of the strategies for modelling the anticipated use situation, and the use of
different metaphorical landscapes has served to illustrate how the virtual workspaces
differentiate themselves. We have characterized virtual workspace technologies as open for
diverse technologies-in-practice and as very similar in functionality. The analysis of
metaphors and aspects of the modelled use situation has shown that the virtual workspaces
probably create different conditions in which the technology is adopted and integrated in a
work practice.
While the analysis of functionality did not reveal the differences between the virtual
workspace designs, the analysis of the metaphors used in the applications clarified the
different approaches to modelling the use situation. The typical analysis of the functionality
and the underlying technological architecture should be extended with an analysis of the
metaphorical landscape in order to understand the properties of the technology relevant for its
adoption and integration in a work practice.
With applications such as virtual workspaces which are open for very different
technologies-in-practice or genres of communication, the image projected by the metaphors is
probably an important factor in determining how the application will be used, and which
genres of communications will be mediated by the application (See also Hamilton (2000)).
There are some indications in the study of Lotus Quickplace at Beta, but a comparative study of
usage between different virtual workspaces would be able to address this question more
directly.
The metaphorical landscape analysis is not only relevant for research into the character
and role of computer media. The analysis could also inform the decision on deploying virtual
workspaces in organizations. Designers of virtual workspaces would also benefit from
adopting a conscious and systematic use of metaphors. Using root metaphors such as the
desktop is well established in HCI, but a more extensive use, exemplified in the use of the
metaphorical landscape, can probably help to build better virtual workspaces.

The study of Quickplace use


The reports from the use of Quickplace (QP) at Beta are divided into several parts.
While the thesis follows a fairly strict research question, it also has an explorative character in
the application of methods to investigate the research question. The use of log-analysis in
combination with interviews and survey data is not prevalent in the study of the use of
computer media. The objective has been to explore the possibilities and pitfalls of combining
types of data in the study of computer media use as well as to present results of how QP
is used at Beta.
The first part is concerned with an overall characterization of QP use. This
characterization uses data from interviews, survey and log-analysis. It is not the purpose to
discuss generalizations and generate or modify theory, but to serve as a background for a
more in-depth treatment of specific issues.
The issues that will be dealt with in-depth are:
1. The investigation of genres of communication in three exemplars:
Three QPs are selected for a study of the individual genres of communication existing in
them. The QPs are not chosen because they are interesting in any particular way. They are
chosen for two reasons primarily. Firstly, interview, survey and log-data are available for all
three, and secondly, because they are QPs which have shown a reasonable amount of activity
during the time of study. One of the results of the analysis regards the genre theory and its
relation to media. A finding of the study is that most genres combine different media.
Therefore a specific genre is not attached to a specific medium. It also shows how an analysis
of interactions through specific documents can constitute an alternative viewpoint for
studying genres in addition to the traditional analysis of texts in, for example, e-mail (Yates
and Orlikowski (1992), Orlikowski and Yates (1994)).
2. Statistical generalizations:
A so far little used method of investigating the use of collaborative systems is to look at
the “document lifecycle”. The analysis here presents how different statistical tools were used
to characterize the use of documents in all QPs at Beta. One of these methods is the cluster
analysis. The section also presents the test of two specific quantitative hypotheses. It turns out
that both the use of documents and the number of users in the QPs follow Zipf’s law. They
are not distributed according to a normal distribution but follow the power-law distribution
formulated in Zipf’s law.
3. Folder structures
During our analysis of the log files an interesting data material turned up, about which
we had no idea when we began. It turned out that the log files contained information about the
naming of folders. For some reason, the .gif pictures used as buttons for the folders in the QP
are named after the text on the button. The text on the button is the name of the folder. This
means that a button named “presentations” can be clicked and will take you to a folder which,
if used with some sense, contains material related to the concept of presentations. Since these
.gif files were loaded when a person used the QP, it gave us access to a very detailed account
of how the folder structure of each QP has developed over time.
Structuring the contents of a QP is an important aspect of making a QP work (Bannon
and Bødker (1997), Bannon (2000), Schmidt and Israel (2000)). Besides being a
communication tool, a QP is a place to store information in a way that makes it easier for the
users to find and work on. Our analysis of the development of folder structures in the three
exemplars suggests that information structuring should not be characterized as a static
librarian discipline but rather as something performed by users as a part of their work. The
analysis concludes with the presentation of a functional model that attempts to explain the
dynamics of folder structures.
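
The power-law hypothesis mentioned under issue 2 above can be checked informally with a rank-frequency fit. The following is a minimal sketch and not the procedure used in this thesis: it assumes a list of counts per QP (the numbers below are synthetic and generated to follow an ideal Zipf curve, they are not Beta data), and the function name zipf_slope is an assumption introduced here. It estimates the slope of log(count) against log(rank) by ordinary least squares; Zipf's law predicts a slope near -1.

```python
import math

def zipf_slope(counts):
    """Least-squares slope of log(count) versus log(rank).

    Counts are sorted in descending order and ranked 1, 2, 3, ...
    Zero counts are dropped because log(0) is undefined.
    """
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    if len(ordered) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(c) for c in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic illustration only: counts proportional to 1/rank.
synthetic_counts = [1000.0 / rank for rank in range(1, 51)]
slope = zipf_slope(synthetic_counts)
print(f"estimated slope: {slope:.2f}")
```

A straight line on a log-log plot is only a rough diagnostic; it suggests a power law but does not by itself rule out other heavy-tailed distributions.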

Characterising Quickplace use


Beta is a young organizational construct forged by the merger between a Danish
financial organization (Alpha) and a Finnish/Swedish one, which took place in spring 2000.
Alpha was thereby transformed from being a Danish bank with offices around the world to a
true Nordic bank with retail banking business in all Nordic countries.
At the time of the merger no IT infrastructure was in place for communication between
the pre-merger organizations. This was a challenge for the projects, which were formed to
complete the merger. As a fast solution to the problem, it was decided to buy and implement
Lotus QuickPlace as a tool for supporting the communication in the new organization. One of
the goals was to minimize travelling. The decision was not based upon the technology selection
process that is usual in the bank. Decisions on buying software usually entail tests to assess
security, scalability, manageability and integration with the existing infrastructure. There was
an instant need for a browser-based, platform independent tool to support communication. A
standard application was needed, and Lotus QuickPlace was chosen because of prior
knowledge of the product and because IT-Operations had positive experiences with Lotus
products. Lotus Notes was already used as the e-mail system for parts of Alpha and was used
as the platform for the Alpha Intranet.
Lotus Quickplace was introduced as a tool for inter-Nordic groups in May 2000. The
department "Group Information and Communication" (hereafter: GIC) was commissioned to
make the new organization aware of the opportunity. GIC was a newly formed Nordic
organizational unit consisting of a merger of the three national communication departments.
The initiative to use QP came from the Danish part of GIC. Inter-Nordic groups
were offered the opportunity to use QP for their work.
The first QP was opened in May 2000, and by April 2001 when we began our study,
there were 91 QPs created on the server. In the period of log-analysis 170 different QPs
showed activity in the logs.

[Figure: pagereads per week from week 19, 2001 to week 8, 2002. Y-axis: pagereads (0 to 180,000); x-axis: week number.]

The chart shows the overall activity on the QP server in the whole log period
summarized per week. This shows a steady increase in activity over the 10-month period
summarized across all QPs. 80 different QPs were active in the first month of logging. In the
last month of the log period 126 QPs were active. The number of active QPs had therefore
risen by 58%. By comparison, the overall activity had risen by 275%. Not only was there an
increase in the number of QPs but the average activity per QP had risen over the log period by
138%.
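
The three percentages above are linked: the growth in overall activity is the product of the growth in the number of active QPs and the growth in average activity per QP. The following small sketch reproduces that arithmetic using only the figures quoted in the text; the helper name pct_increase is an assumption introduced here.

```python
def pct_increase(ratio):
    """Convert a growth ratio (new / old) to a percentage increase."""
    return (ratio - 1.0) * 100.0

active_first_month = 80      # active QPs in the first month of logging
active_last_month = 126      # active QPs in the last month of logging
activity_increase_pct = 275  # reported rise in overall activity

qp_ratio = active_last_month / active_first_month       # 1.575
activity_ratio = 1.0 + activity_increase_pct / 100.0    # 3.75
per_qp_ratio = activity_ratio / qp_ratio                # about 2.38

print(f"active QPs:      +{pct_increase(qp_ratio):.1f}%")
print(f"activity per QP: +{pct_increase(per_qp_ratio):.1f}%")
```

Rounded to whole percentages, this gives 58% more active QPs and 138% more activity per QP, which agrees with the figures reported above.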
QP was only offered to the headquarters of the bank. Retail banks historically have a
sharp distinction between the customer-facing branches and the headquarters where IT,
Communications, Human Resources, etc. are placed. Corporate and institutional banking and
investment banking are also considered part of the “headquarters”.
Around July 2001 the Danish part of GIC, which had initiated and commissioned the use of QP
in the organization, told us that QP was probably going to be closed down. The reason
was a missing approval from the department of IT Security. According to them QP had some
features which violated Beta's IT security policy. However, QP was not shut down. After an
intense political struggle between IT Security and the users represented by the
Communications Department, a compromise was agreed where IT Security took over issuing
QPs to new users and QP remained operational in the organization. A main reason why QP
remained in the organization was the fact that there were over 80 active QPs at that time. For
practical reasons it was impossible to close down. Some of the activities supported by QP,
such as the translation of the quarterly financial reports, were indeed business critical. Closing
it down would have created many problems, and would have required that some alternative
technology be found to replace it.
In order to understand this conflict, we need to take a closer look at the security
architecture of QP. Once the opening of a new QP is granted, two QP managers are assigned
to it by IT Security. After this appointment the two managers have full control over security
in the QP; they are even able to invite other managers. The distributed nature of the security
model in QP was originally motivated by the need to ensure privacy of data to the users in an
ASP context. The QP manager defines who can use the QP and the author of a document
solely defines who is able to read and edit it. This distributed security model also enables a
manager to create new "sub-rooms" potentially inaccessible by the two QP managers
originally appointed by IT Security.
It is obvious that QP hereby compromises the hierarchical and centrally managed
security model normally used at Beta. Neither does the central IT security unit have any way
of controlling access to rooms or documents, nor does a QP manager have any means of
controlling what is in “his/her” QP.
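
The delegation chain described above can be made concrete with a toy model. The sketch below is purely illustrative: the class, method and person names are hypothetical and do not correspond to the QuickPlace API. It only encodes the rules stated in the text, namely that managers can appoint further managers and that a manager can create a sub-room whose membership excludes the managers originally appointed by IT Security.

```python
from dataclasses import dataclass, field

@dataclass
class Room:
    """Hypothetical model of a QP room with locally managed access."""
    name: str
    managers: set = field(default_factory=set)
    members: set = field(default_factory=set)
    subrooms: list = field(default_factory=list)

    def invite_manager(self, inviter, newcomer):
        # Only an existing manager of this room may appoint another one.
        if inviter not in self.managers:
            raise PermissionError(f"{inviter} is not a manager of {self.name}")
        self.managers.add(newcomer)
        self.members.add(newcomer)

    def create_subroom(self, creator, name, members):
        # The creator decides who gets in; nothing forces the parent
        # room's managers to be included.
        if creator not in self.managers:
            raise PermissionError(f"{creator} is not a manager of {self.name}")
        sub = Room(name=name, managers={creator}, members={creator, *members})
        self.subrooms.append(sub)
        return sub

    def can_access(self, person):
        return person in self.members

# Two managers appointed centrally, as in the procedure described above.
qp = Room("merger-project", managers={"anna", "bo"}, members={"anna", "bo"})
qp.invite_manager("anna", "carl")                    # delegation by a manager
hidden = qp.create_subroom("carl", "drafts", ["dina"])

print(hidden.can_access("anna"))   # a centrally appointed manager
print(hidden.can_access("dina"))   # a member invited by the sub-room creator
```

In this toy run the sub-room is closed to a manager appointed by IT Security but open to a member chosen by the delegated manager, which is the situation that conflicts with a centrally managed security model.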
What we see is a tradition of centrally managing both the technology and the use of the
technology on a macro-level at Beta. This tradition is not suited to handling a technology
such as QP, where access rights, what the system should be used for, and how it should
be used are all defined at the level of the individual QP (the micro level). The tradition of centrally
managing IT is also evident when we look at the way QP was implemented.
Despite the conflict QP has remained at Beta, and in June 2002 when I last had contact
with Beta, QP was still a part of the IT infrastructure offered to units that work across the
Nordic countries.
The IT management of the QP technology has caused a number of conflicts. At Beta
there is a long tradition of assigning system owners to IT systems. The system owner is
typically the manager of a business unit. The role of the system owner is to define the purpose
of the system and define rules for the proper use of it. As an example, the communications
department is the system owner of the Intranet. It has been rather difficult to find someone
willing to play the role of system owner of QP. The reason is that QP, like other virtual
workspaces, makes it impossible to exercise the role of system owner. This is because it is
very decentralized in the way it is managed, and because it is a very generic technology. It is
very difficult to define a “proper” use and to exercise control.
The typical IT system at Beta has a surveillance functionality, which enables the system
owner (or the system administrator on his behalf) to oversee and control the actual use of the
system. In QP this surveillance functionality does not exist. As previously discussed, QP is
originally designed as an application for an ASP environment. In an ASP environment, the
last thing you would want, as a customer leasing a QP, is for some system administrator to
have unlimited access to the documents you choose to put in it. For that reason nobody but
the members of the individual QP have access to it and define who else should have access to
the QP.

The typical IT system at Beta has an SOP (Standard Operating Procedure), which
documents the “proper” use of it. The SOP describes who should use the system, what it
should be used for, and how it should be used. When an IT system is put to use, the system
owner writes an SOP. The SOP is meant for users and managers of the system. There are
SOPs for, for example, issuing a mortgage to a private customer which tells the bank assistant
step by step how it should be done, including the use of the supporting IT system. In the case
of QP, it has taken more than one year to come to an agreement about an SOP for QP. It has
been very hard for the people responsible for the implementation to actually formulate an SOP
for an open technology such as QP.
So, QP can be characterized as a “rebel” technology in a financial institution where IT is
traditionally very centrally managed. The organization was used to technologies where the
proper use is defined centrally.
Not only the character of the technology, but also the implementation of the QP
technology has been different from normal.

The implementation of QP
The QP implementation effort is understood more clearly when contrasted with the way
one of the pre-merger banks previously implemented an Intranet. Intranets have in some
organizations been the first experiments with bottom-up IT initiatives, where Intranet-sites
have emerged without a planned change approach (Bansler, Damsgaard et al. (2000)). This
was however not the case at Beta. The Intranet implementation implied the defining of a
number of communication channels, as well as roles and work-flows for the publication of
information through these channels. The implementation of the Intranet included a formalized
education effort where editors and authors, in a two-day seminar, learned about system
features as well as how to write for the new medium. Also, all readers were introduced to the
Intranet by video-presentations within the organizational units. The SOP written for the
Intranet is a 50+ page document.
There was very little effort to implement QP in the organization. Some resources
were spent on customizing the look of the application, but apart from that the only formal
means of implementation were e-mails sent to potential QP managers, and oral
communication. The e-mails were sent by the initiator of the QP initiative to people in his
network and then forwarded.
The e-mail contained an instruction to people who wanted to begin using a QP. Potential
QP managers should send an e-mail to IT Operations applying for a QP. The original idea was
that the application should contain a business justification, but in practice all applications
were approved. The rule of thumb for granting an application for a QP was that the use was
justified when the proposed members were from geographically dispersed organizational
units.
A part of our data from the case study consists of the applications for QPs sent to the QP
administrator. The conclusion after reading them is that very few applications had business
justifications for using a QP. Most of them simply contained a request to start up a QP with
the names and e-mail addresses for the proposed managers.
In contrast to the Intranet implementation there was neither an educational effort aimed at
users nor any guidelines on how the QP could and should be used to support various
communicational and collaborative needs. A 5-page SOP (compared to the 50+ page Intranet
SOP) was written containing information about how to open and close down a QP, but it was
not issued until one year after the introduction of QP. As to how they should set up and use the
QP, the users were left with the general guidelines provided by the software manufacturer.
As noted above, creating and setting up a QP is by default distributed to the manager(s)
of the QP. The QP manager(s) define the initial structure (rooms and folders) of the QP and
the authorization structure.
The start-up of a new QP is initiated by someone sending an e-mail to Security with the
application for opening a QP. The e-mail typically contains names of at least two persons who
are assigned the roles of managers in the room.
Being a manager firstly allows one to invite other members to the QP and assign them
rights as either reader, author or manager. Secondly, the manager can create and name new
folders and sub-rooms in the QP. Below is the start-up screen as shown to the user after the
QP has been created. The layout and graphics are customized at Beta but the functionality and
default folder structure are the same as shown here.

The default folder structure of the QP consists of a welcome page and seven different
folders. The folders are:
- Discussion: a threaded document repository, which allows people to post
comments to published documents.
- Library: a simple document repository which does not allow threaded comments
- Calendar: a basic calendar where one can post events, meetings etc.
- Tasks: a simple planning tool which allows one to create tasks with start-date and
duration and visualize the tasks in a simplified GANTT-like manner.
- Index: a repository for all documents in the QP. All the contents of the other
folders are also available here.
- Customize: only available for managers. It allows them to create sub-rooms,
forms for structured data entry, and to change the appearance of the QP.
- Members: the list of members in the QP. Here the manager can also invite new
members or expel existing members who are no longer welcome.
The start-up technically consists of inviting members, and possibly of changing the
default folder structure to suit the needs of the group. Of course this is but a small part of
the efforts needed to actually make a group begin using it. The section on the development of
the folder structures will go into one aspect of this.

QP is the first technology at Beta to have been implemented using what
Bansler, Damsgaard et al. (2000) call an "improvisational" approach. The use that we will
study in detail is not framed by strict limitations defined centrally in an SOP and
distributed to users through, for example, educational sessions.

The communication infrastructure at Beta


As previously noted, QP is not the only communication technology available for the
employees at Beta. In fact one of the conclusions derived in this thesis is that the role of QP
cannot be understood without taking the other communication technologies into account.
The employees of the Beta headquarters have a whole suite of communication
technologies available to them. These technologies form a communication infrastructure,
which is used in many different ways by the employees.
The available technologies that are related to the QP technology are:
E-mail: the e-mail system in the Beta headquarters is partly based on Lotus Notes and
partly on Microsoft Outlook. Both Lotus Notes and Microsoft Outlook have a built-in central
directory of people and a calendar, which can be viewed by others. They also contain
facilities for inviting people to a meeting, which are integrated with the calendar. When QP
was started, employees were unable to send e-mails back and forth between the countries,
because there was no closed network available. This was one of the reasons for starting QP.
By the time we entered the organisation a closed network had been set up and e-mailing was
possible between the countries.
Telephone: the telephone system is a standard phone system.
LAN-drive: LAN-drives have been available at Beta headquarters for a number of years
before QP arrived. A LAN-drive is only available in a certain geographical unit. LAN-drives
are used to store files. I studied the use of the LAN-drive as a part of the “Use-case study”
(Bøving (2001)) mentioned previously in the section on research method. LAN-drives were
used in the project primarily as personal backup. Interviews, the analysis of documents
on the drive and the naming of folders (e.g. “Bob’s documents”) all indicated this. There were a
few accounts of people who used a folder on the drive to exchange documents.
Intranet: each of the three banks that merged into Beta had their own Intranet. The
Danish one was by far the best developed (according to the interview with our primary contact at
Beta). From version two, the Danish Intranet consisted of six overall types of information:
1. News: a highly structured set of communication channels targeted at specific
organizational, geographical and functional units.
2. Reference information: a collection of handbooks and SOPs. All SOPs for the use of
IT systems were available here.
3. Homepages: a tool available for creating information targeted at the members of an
organisational or a geographical unit.
4. Discussion forums: a collection of discussion forums, which were hardly used.
5. Tools: a collection of tools targeted at specific tasks and not related to
communication
6. Bulletin board: used for various informal information, from the bank's wine clubs and sports
clubs to the list of available holiday houses to let.
I have no accounts of the contents of the Intranet in the other countries, but the Danish
Intranet gives an overall idea of the possibilities available.
These communication media together with the QP technology form an infrastructure for
communication. The functionality offered in these media is partly supplementary and partly
overlapping. For example, both QP and e-mail have calendar functionality and both LAN-
drive and QP enable shared storing of documents. Even a quick look at the functionality offered by
the different media shows a communication infrastructure that is not clearly divided in terms of
functionality. In the survey, I asked the users two questions on how they combined QP
use with the other available communication media.
The following graph shows the answers to this inclusive question (more than one answer could be
selected) in the survey:
“Which other media do you use to communicate or exchange files with the other
members of the QP?”

[Figure: Which media are used. Bar chart; y-axis: Frequency (0% to 100%); x-axis: E-mail, Face to Face, Telephone, Intranet, LAN-drive, Regular mail.]

Clearly QP does not work as a communication medium with an isolated and well-defined
role. Rather, it is sometimes used in combination with, and sometimes instead of, alternative
media. As the answers to the next, exclusive question indicate, e-mail is the most important
competitor and supplement to QP.
“Please select the medium most frequently used by you to communicate with the other
members of the QP.”

[Figure: Medium most frequently used. Bar chart; y-axis: Frequency (0% to 80%); x-axis: communication medium. Bar values as extracted: 74%, 14%, 5%, 4%, 4%, 0%, 0%, 0%.]

We will investigate the combination of media in more detail later on. It would
appear that even when we study a single genre of communication, different media are put to
use. This seems to challenge the traditional way of studying genres of communication, which is to
analyze the contents of one medium. The ad-hoc combination of communication media in a
communication situation also challenges the deep-seated notion of a rational relation between
functionality and the use of an IT-medium. We will analyze this in more detail in the section
on design-in-use.

The next four sections contain some general observations on how the QP technology was
used in the 10-month period of investigation. The first two sections are observations on
how the technology had spread in the organization. The means by which it had been
introduced produced some uses of QP that were not intended at the outset. The third
section contains statistics on the sizes of the QPs observed in terms of the number of active
users. The fourth section analyzes the QPs that were started during the
log period and shows very diverse outcomes of deploying a QP.

Unintended uses
The intention of the people who implemented QP in the organization was to supply a
tool for merger projects which would support a faster merger process and a lower level of
travel expenses. As the survey results reported below show, the use of QP has grown
into other areas than the one originally intended.
I asked the respondents of the survey to characterize their use of their QP in terms of
four types of use: to support organizational units, to support recurrent
tasks such as translating the quarterly financial reports, to support special interest groups and
to support projects, which was the original intent. The types of use were derived from the
interviews conducted.

[Figure: Types of use. Bar chart; y-axis: Frequency (0 to 25); x-axis categories: An organisational unit, A project, A group performing a recurrent task, A special interest group, Other. Bar values as extracted: 20, 17, 7, 6, 3.]

The graph shows the distribution of QPs according to the typology, and therefore gives
an indication of the purposes for which QPs were used. The answers in the “other” category were written by
the respondents as free text. The three responses in this category covered one
empty description, a QP spanning more than one project and a QP spanning more than one
organizational unit.
The respondents’ answers suggest that the use of QP had evolved without central
control. This pattern is very different from what is usually seen in the bank. E-mail and LAN-drives
are the only previous experience the bank had had with a technology that expands with
very little central control.

The organizational distribution of use


The use of QP had spread to most parts of the organization considered a part of the
"headquarters". In the survey, the respondents were asked the following inclusive question:
"Which areas do the members of the QP work in?" The respondents were asked to select one
or more organizational units reproduced from the organizational diagram.
The graph below shows the presence of different organizational units in the QPs.

[Figure: Frequency of org. unit present in a Quickplace. Bar chart; y-axis: Frequency (0 to 30); x-axis: Organisational unit.]

The graph should be interpreted in the following manner: in 28 of the 45 QPs covered by
the survey answers, Corporate and Institutional Banking was selected. 50% of the QPs
reported that all members were from one organizational unit, while the other 50% had members
from two or more units. The names of organizational units used in the survey were the major
organizational units of the company. It is therefore very likely that, for example, a QP
reporting only members from IT in the survey response will have members who span both
countries and functional divisions of IT. All major organizational units in the bank were
represented in the QPs. Corporate and Institutional Banking and IT were probably
present in so many QPs because the strategy of merging the banks focused on
aligning IT to save costs, and additionally on aligning Corporate and Institutional Banking,
because many corporate and institutional customers operate internationally. Retail Banking
probably has a high presence because it is by far the biggest
business area in terms of people employed.
The survey also asked a question on the geographical distribution of the QP members,
which was phrased: "In which countries do the members of the QP primarily work (Where are
their offices located)?" The respondents were asked to select one or more countries from the list of
countries in which Beta has offices.

[Figure: % of QPs where country is represented. Bar chart; y-axis: % Presence (0% to 100%); x-axis: Country (Denmark, Sweden, Norway, Finland, United Kingdom, USA, Singapore, Poland). Bar values as extracted: 93%, 91%, 88%, 75%, 11%, 9%, 7%, 5%.]

Not surprisingly, the four Nordic countries are represented in almost all of the QPs, and
all but four QPs had more than one country represented. This corresponds to the fact that
applications for a QP were only accepted if the users came from more than one country.
While most major organizational units on the organizational chart are represented, there
is a striking difference in activity across geographies.

[Figure: Distribution of QP use pr. Nordic country. Pie chart; legend: Finland, Sweden, Norway, Denmark. Shares as extracted: 13%, 12%, 4%, 71%.]

This chart shows the distribution of the overall activity observed in the log based on the
IP-addresses of the requests made to the QP server. The network architecture allowed us to
identify from which country the request was made. The graph is slightly misleading since the
requests displayed as coming from the four Nordic countries also include requests made from
offices in countries outside of the Nordic countries. The amount of traffic stemming from
other countries is indubitably very small and fairly evenly divided across the four Nordic
countries, so that it still provides a reasonable picture of the actual distribution.
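The attribution of requests to countries described above can be sketched as follows. This is a minimal illustration in Python, not the procedure actually used in the study: the subnets in COUNTRY_SUBNETS and the function names country_of and activity_share are hypothetical, since the text does not document the address plan of the network at Beta.

```python
import ipaddress
from collections import Counter

# Hypothetical subnet-to-country table (assumption; the real address plan is not given).
COUNTRY_SUBNETS = {
    "Denmark": ipaddress.ip_network("10.1.0.0/16"),
    "Sweden": ipaddress.ip_network("10.2.0.0/16"),
    "Norway": ipaddress.ip_network("10.3.0.0/16"),
    "Finland": ipaddress.ip_network("10.4.0.0/16"),
}


def country_of(ip):
    """Return the country whose subnet contains the address, or 'Unknown'."""
    address = ipaddress.ip_address(ip)
    for country, network in COUNTRY_SUBNETS.items():
        if address in network:
            return country
    return "Unknown"


def activity_share(request_ips):
    """Share of all logged requests per country, as fractions that sum to 1."""
    counts = Counter(country_of(ip) for ip in request_ips)
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}


if __name__ == "__main__":
    sample = ["10.1.0.5", "10.1.2.3", "10.2.0.1", "10.4.9.9"]
    print(activity_share(sample))
```

Requests routed through a Nordic office from elsewhere would be counted under that Nordic country in such a scheme, which is the source of the slight distortion noted above.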
The distribution of activity across the countries can be interpreted as a result of the fact
that the technology was introduced by the Danish part of the organization. It could also reflect a
difference in the experience of working with collaborative technologies across the countries.
There are no data to support this second interpretation other than an indication made in one of the
interviews with a Danish QP manager.

Size of the Quickplaces


As an initial characterization of QP use we can take a look at the size of the QPs in terms
of the number of users active in each QP. It gives an idea of the diversity of settings in which
the QP is used.
The size of the QPs, measured in terms of the number of active users, varies from 1 to 269
users. The histogram shows how many QPs fall into each interval of number of users. It
should be read so that, for example, 19 QPs have between 23 and 42 users and
120 QPs have between 2 and 22 users. The users counted are those who have visited the QP
in the period of logging.
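The reading of the histogram can be made concrete with a small sketch. It assumes that each interval is labelled by its upper bound and includes that bound, which matches the reading given above (2 to 22, 23 to 42); the function name histogram and the sample data are hypothetical.

```python
from bisect import bisect_left


def histogram(values, upper_bounds):
    """Count values per interval.

    Interval i holds the values v with upper_bounds[i-1] < v <= upper_bounds[i].
    One extra slot at the end ("More") holds values above the last bound.
    upper_bounds must be sorted in ascending order.
    """
    counts = [0] * (len(upper_bounds) + 1)
    for v in values:
        # bisect_left returns the index of the first bound that is >= v
        counts[bisect_left(upper_bounds, v)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical numbers of active users in six QPs
    users_per_qp = [1, 2, 22, 23, 42, 269]
    print(histogram(users_per_qp, [1, 22, 42]))
```

With bounds 1, 22 and 42, a QP with 22 active users falls in the second interval and a QP with 23 in the third.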

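[Figure: Histogram of users pr. Quickplace. Bar chart; y-axis: Frequency (number of QPs); x-axis: No. of users interval, labelled by upper bound. Number of QPs per interval: 1: 9; 22: 120; 42: 19; 63: 7; 83: 3; 104: 4; 125: 3; 145: 1; 166: 0; 187: 2; 207: 1; 228: 0; 248: 0; More: 1.]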

The histogram shows a huge concentration of QPs with between 2 and 22
users. The following histogram zooms in on this interval to give a detailed account of the distribution.

[Figure: Histogram of users pr. Quickplace (2 - 22 users). Bar chart; y-axis: Frequency (number of QPs); x-axis: No. of users interval (2; 3,9; 5,8; 7,7; 9,6; 11,5; 13,4; 15,3; 17,2; 19,1; More). Bar values as extracted: 19, 20, 17, 14, 11, 10, 9, 8, 5, 4, 3.]

The histograms show a large diversity of QP sizes. This suggests quite diverse uses of
the technology. One possible hypothesis is that there would be an “ideal”
size, which most of the QPs would have, and that few QPs would have far fewer or many
more users than the average. This hypothesis does not, however, match the data. We shall see
later that the number of users in the QPs is distributed according to a power-law distribution.
NP_InternationalDivision exemplifies a QP with
many users. Within the log period, 203 users had been active using 5297 different documents.
The QP supports an organizational unit called "International Division", which is an
organizational unit spanning the four Nordic countries as well as all other countries where
Beta is present. The QP is used for holiday lists, to support credit projects, distribute credit
limits and related information on issuing credits to large customers, and for marketing
materials (responses to the survey question: "Please give some specific examples of tasks the
Nordicplace is used for?").
Another example is the NP_MarketRiskReports, which had 35 active users within the
log period working on 1896 documents. The QP was used to collect daily risk reports in the
form of spreadsheets from different parts of the organization and to consolidate them into one
spreadsheet, providing a daily snapshot of the overall risk situation in the organization. In
contrast to NP_InternationalDivision, this QP was used to support one very specific task.
A third example is NP_nnn, which was used to support a group of IT people working
with the Lotus Domino platform. It had 50 active users working on 114 different documents
within the log period. The QP was used to support "Discussions, experiments, programming,
documents. All relevant topics that have to do with the Domino platform within Beta" (quote
from the survey).
The large concentration of QPs with a small number of users shown in the histograms is
also a reflection of the fact that not all QPs are successful in being integrated in a work
practice.

QP lifecycles
In order to look closely into how new QPs actually evolve, I will present an analysis of
QPs that were started in the log period.
The general usage statistics presented in the beginning of this section are a generalization
of 170 QP lifecycles. Some of them were active before we started logging, some died out
during the period of logging and some started up during the log period. For a few we may
have the full lifecycle represented in the log-data. None of the QPs residing on the server
were deleted during the period of logging. There were no procedures for closing down a QP
and archiving or deleting its contents, so QPs only die out in the sense that they are no longer
used.
In order to analyze the lifecycle of a QP, 37 QPs were selected, all of which were started in the
period of logging. On a per-week basis, the number of active users, the number of document
reads and the number of document edits were plotted for each of the 37 QPs. These graphs
give an overall idea of the level of activity or, indeed, whether there was activity at all. The 37
graphs are reproduced in appendix 10.
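The per-week aggregation behind such graphs can be sketched as follows. This is an illustration under stated assumptions, not the script used in the study: the record layout (day, QP, user, action), the action names "read" and "edit", and the function name weekly_activity are all hypothetical, and weeks are numbered from the start of the log period.

```python
from collections import defaultdict
from datetime import date


def weekly_activity(records, log_start):
    """Aggregate log records per QP and week.

    records: iterable of (day, qp, user, action) tuples, where day is a
    datetime.date and action is a string such as 'read' or 'edit'.
    Returns {(qp, week): {'users': n, 'reads': n, 'edits': n}},
    with week 1 starting on log_start.
    """
    cells = defaultdict(lambda: {"users": set(), "reads": 0, "edits": 0})
    for day, qp, user, action in records:
        week = (day - log_start).days // 7 + 1
        cell = cells[(qp, week)]
        cell["users"].add(user)  # distinct active users in that week
        if action == "read":
            cell["reads"] += 1
        elif action == "edit":
            cell["edits"] += 1
    return {
        key: {"users": len(c["users"]), "reads": c["reads"], "edits": c["edits"]}
        for key, c in cells.items()
    }


if __name__ == "__main__":
    sample = [
        (date(2001, 1, 1), "np_example", "u1", "read"),
        (date(2001, 1, 2), "np_example", "u2", "edit"),
        (date(2001, 1, 9), "np_example", "u1", "read"),
    ]
    print(weekly_activity(sample, date(2001, 1, 1)))
```

A QP with no entry for a given week had no logged activity in that week, which is what the lifecycle graphs make visible.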
The three graphs below show three typical use patterns observed in these graphs. The
first use pattern is from a QP used to support a project. This QP had been started 13 weeks
after the start of the log period and showed an activity level that grew throughout the
remaining log period without signs of a decline.

[Figure: np_fx-mm-globalisation. Chart of weekly Reads, Edits and Users; x-axis: week of the log period (1 to 37); y-axis: 0 to 700.]

The second QP had been used to support a short-term project, which had been started
and closed down during the log period. The purpose of the project supported by the QP was to
implement a new corporate name and logo throughout the merged organization. It
demonstrates a short-term use of a QP. Since the closing of the project, there has been almost no
activity in the QP.

[Figure: np_name-change. Chart of weekly Reads, Edits and Users; x-axis: week of the log period (1 to 37); y-axis: 0 to 450.]

The last use pattern is a QP that never got started. The example here is a QP
that was supposed to support a system development project. Clearly the QP was never integrated into
the practice of the project.

[Figure: np_fonda. Chart of weekly Reads, Edits and Users; x-axis: week of the log period (1 to 39); y-axis: 0 to 40.]

One should be cautious about drawing conclusions from these graphs. It is impossible to
deduce whether a QP was actually used for anything sensible simply from a
plot of the number of users and the activity. One observation is, however, possible: if the QP
showed no activity, or only very little activity in very few weeks, it should be characterized as
dead. About 14 of the 37 QPs never really got started and showed a little activity for only a
few weeks, or showed very fragmented activity over a longer period. The graphs tell us
nothing about the reason for this, but we can conclude that it is not straightforward to begin
using a QP. The nature of the technology, which we have analyzed earlier in the comparative
analysis of virtual workspaces, allows for a wide range of uses. This is also confirmed by the
study of use at Beta. QP is very different from applications targeted at specific uses, such as an
application for calculating interest rates for a loan to a customer, or the Intranet as it has
been implemented at Alpha.
Instead of approaching the adoption of QP in the organization as problematic because so
many of the QPs never get to be used, one could re-direct the argument. The adoption of this
type of technology occurs even though its use is prescribed neither in the technology nor in
the implementation process. The adoption documents the ability of groups of users to re-think
their communication processes and integrate the technology in these processes.

Summary of the initial characterization


QP has been introduced at Beta as a technology without much centralized effort or
control of its use. In spite of this, the technology has been adopted by the organization and is
a part of the communication infrastructure of projects and organizational units in the
headquarters. Probably due to the lack of central implementation efforts and the lack of
central control inherent in the technology, the QPs have been used for quite different purposes,
including purposes unintended in the beginning. The uncontrolled development of use also
supports the characterization of QP as a technology that does not have a specific use
inscribed, but is rather a general communication technology best compared to e-mail.
QP had spread in the organization both in terms of the organizational units that
participated and in terms of the countries. The distribution of use also shows that Denmark was
highly overrepresented in terms of activity. The first obvious explanation for this is the fact
that the technology had been introduced from the Danish part of the organization. Since the
use of QP had spread through informal e-mails and by word-of-mouth, the difference in
activity level also indicates that the organization was only partly integrated across the country
borders. Another possible explanation is that the use of Intranet technology and Lotus Notes
was at a more developed stage in the Danish organization.

Both in terms of the increase in the number of active QPs and the activity of
each QP, the use of the technology had evolved during the period in which it was observed. It
is not a technology that is stagnating or dying out. The people responsible for the QP
technology discussed how to replace the technology because it did not fit with the
IT-management practice of the organization. Replacing it would create many problems, and would
therefore be very difficult, because the technology had been integrated in very diverse
communication practices. A replacement of an
application which serves a specific purpose, such as calculating loan offers to customers, is
much easier to handle centrally.
The QP technology is not easy to integrate into a communication or work practice. The
statistics of the QPs that were started during the log period show that many of them did not
really start at all. It is therefore a technology with a large economic overhead involved. Many
QPs start up, but only some of them survive to become part of the communication infrastructure of
a group of people.
In order to understand the microstructures of the QPs, about which we have until now
only seen crude patterns, we will focus on three QPs to study the users and the usage in more
detail. This will also assist us in understanding the process of integrating QP in a work
practice.

The three Quickplace exemplars


In this section I will focus on three of the QPs. As previously noted, they have been
selected because of the data available. For all three QPs, an interview with one of the
managers, the answers from the survey and log-data are available.
The purpose of the section is to give a more detailed picture of the use of QP than is
possible across a representative number of QPs. It also serves as a preparation for the
detailed analysis of individual genres of communication. The level of analysis presented here
gives an idea of the role played by the QP for a specific group of people with certain needs to
communicate and coordinate work.
The three QPs cover three of the four types of use presented earlier. The NP_solo-id QP
is an example of a QP supporting a project, the GIC QP is an example of a QP supporting an
organizational unit, while International_Communications (hereafter IC) supported a number
of recurrent tasks.

NP_Solo-ID
This QP served as a project repository. The project was a Nordic IT project with the
purpose of finding an IT-solution for giving one ID to each customer, which they could use
across all Internet-based services offered by the bank. The customers should only have one
cryptographic key and associated password for all the different services. The project could be
characterized as a technical infrastructure project with the purpose of producing a common
solution to be used by all Internet-based customer services. It therefore had many
stakeholders. They included all the managers of existing customer services, customer services
under development and planned services, and the people responsible for the overall IT-
strategy. The project was therefore organized with a relatively small core team of people led
by the project manager who drove the definition and implementation of the solution forward,
and a larger outer group of stakeholders, who were involved in the decision process and who
needed to be informed about the progress of the project.
The project was led by a Danish project manager and the core team included a few
IT-architects, IT-specialists and business people placed in Denmark who conducted the actual
development work. The stakeholders included people from the other Nordic countries.
The project had held bi-weekly meetings. According to the project manager it had been a
challenge to try to work together in the newly merged corporation across the old
organizational and geographical boundaries. Therefore, they had decided to meet this
frequently in spite of the cost of transportation.

According to the project manager, the QP was used to store and share all relevant
information about the project, including documentation of the process and the solution, as
well as the documentation of decisions, meeting minutes, IT solution documentation, business
process documentation, presentations and material from suppliers of technical solutions.
The QP was started at the initiative of the project manager, and he also decided the
initial folder structure. The management of the folder structure had later been delegated to
people responsible for different parts of the project.

GIC
The GIC QP served as a communication tool for a Nordic organizational unit, which was
formed after the merger. The acronym GIC stands for Group Identity and Communication and
the organizational unit consisted of people working with external and internal
communication. The responsibility for the Danish Intranet was placed in this unit and it also
included the people involved in translation work. There was a large overlap between
the members of the GIC and the IC QPs. The GIC QP was used for meeting agendas, meeting
minutes and the holiday list, and was also used to store presentations, which had been used at
different venues.
The GIC QP was started at the same time as the organizational unit, and the decision to
use it was taken at a workshop where most of the members of the organizational unit were
present. It had four managers who could invite members and maintain the folder structure.

IC
The IC QP served primarily as a tool for a number of translation tasks. It was used for the
translation of the financial reports, which were released quarterly. It was also used for the
production and translation of an internal magazine for all employees in the organization and
for the translation of press releases, which we will study in detail in the next section.
The manager of the translators in the organization started the IC QP. As to the reasons
for starting to use the QP, she states:
“If we should ever make the translation process work, we needed some drive or
something were one could be sure that what was there was the right.”
From the beginning the purpose of the IC QP was not to inform people, as in GIC
where meeting minutes were distributed. It was meant to support specific work
processes.
“I mean we use it for other purposes than just a medium to keep people informed. One
could see it as a place where you go get some things and work on them and put them back
when you are finished.”

They had been using e-mail and LAN-drives previously but had had problems with
controlling versions of documents. Also, the e-mail and LAN-drive were not available after
the merger, when they needed to communicate across country boundaries.

Characterizing document use


The following characterizations of document use in the three QPs are based on a
document matrix containing properties of each document. The matrix shows the number of
actions performed on a specific document, its lifespan and the number of users accessing it.
This matrix is also used as a basis for the statistical analysis of QP use in the section on
statistical generalizations and is described there in more detail.
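How such a document matrix could be derived from log events is sketched below. The sketch rests on assumptions that the text does not confirm: events are taken to be (timestamp, document, user, action) tuples, the lifespan is taken to be the time in days between the first and the last logged action on a document, and the function name document_matrix is hypothetical.

```python
from datetime import datetime


def document_matrix(events):
    """Build one row per document from log events.

    events: iterable of (timestamp, doc_id, user, action) tuples.
    Each row holds the lifespan in days (assumed here to be the time between
    the first and last logged action), the number of distinct users and the
    number of actions per action type.
    """
    rows = {}
    for ts, doc, user, action in events:
        row = rows.setdefault(
            doc, {"first": ts, "last": ts, "users": set(), "actions": {}}
        )
        row["first"] = min(row["first"], ts)
        row["last"] = max(row["last"], ts)
        row["users"].add(user)
        row["actions"][action] = row["actions"].get(action, 0) + 1
    return {
        doc: {
            "lifespan_days": (row["last"] - row["first"]).total_seconds() / 86400,
            "n_users": len(row["users"]),
            "actions": row["actions"],
        }
        for doc, row in rows.items()
    }


if __name__ == "__main__":
    sample = [
        (datetime(2001, 6, 1, 9, 0), "doc1", "u1", "upload"),
        (datetime(2001, 6, 3, 9, 0), "doc1", "u2", "read"),
    ]
    print(document_matrix(sample))
```

Averaging the columns of such a matrix per QP gives figures of the same kind as the averages reported for the three QPs.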
The following matrix shows basic statistics for the documents in the three Quickplaces.
These statistics only cover the documents whose full lifecycle lies in the period between the
start of the log period and 5/10 2001, when the server was upgraded and the log format
changed.
Statistics of     No. of      Avg.       Avg. #   Avg. #    Avg. #   Avg. #      Avg. #
document use      documents   lifespan   users    uploads   reads    downloads   edits
IC                456         7,0        2,8      1,2       4,8      2,7         0,3
GIC               68          2,3        3,0      1,1       6,7      1,9         0,2
NP_solo-ID        138         11,5       2,0      2,2       3,8      2,0         0,4

The number of documents used in the three QPs shows a striking difference between IC
and the two other QPs. IC had more than three times as many documents as NP_solo-ID and
almost seven times as many as GIC.
The matrix also shows that the average document contained one attachment in GIC and
IC while the average NP_solo-ID document contained two. It is therefore safe to say that
most of the actual contents of the documents used across the QPs are stored in attached files.
The three genres we will analyze later also share this characteristic. The QP technology
allows for a number of different ways of storing contents, but the one used here was to attach
a file to a document. This meant that the contents of the file were not directly visible before
the file was downloaded.
The average number of edits, between 0,2 and 0,4, indicates that most documents were not
edited after they had been created. The QP technology offers several possibilities for having
more than one user edit a document and it automatically locks the document to prevent two
users from editing it at the same time. These possibilities were hardly used in the three QPs and,
as the general statistics will show, this was also the case across all QPs.

The document use statistics for the three QPs shown above hide an important
characteristic of how documents are used. A large proportion of the documents in the QPs can
be characterized as dead documents. A dead document is a document where the number of
users is one and the lifespan is less than a day. A dead document is therefore a document that
is created by a user and never used again by anyone.

Dead documents    # of documents   % of all documents   Avg. # uploads
IC                191              41%                  1,0
GIC               59               87%                  1,1
NP_solo-ID        94               68%                  2,6

For GIC 87% of the documents were dead. This indicates great difficulties in
establishing genres of communication, which utilize the QP. The average of dead documents
across all QPs at Beta was 72%, so the difficulty of utilizing the QP was a general
phenomenon, which was especially striking in the case of GIC. While the percentage of dead
documents in NP_solo-ID was close to the average, IC had a significantly lower percentage.
It seems therefore that IC had been more successful in utilizing QP.
A phenomenon well known from LAN-drives is that they are used as private archives.
The analysis I conducted in the IT development project, before the QP study, included an
analysis of the structure and use of the LAN-drive used by the project. The conclusion was
that it was mostly used as a private archive, where people placed their files and were the only
ones to access them. One might suspect that this was the case in QP as well. The tendency to
use it as a private archive was, however, not prevalent in the three QPs or generally. The
average across all QPs was 4%. For IC it was 2%, for GIC 1,4% and for NP_solo-ID 6%. A document
in QP that served as a private archive was a document with only one user and a lifespan of
more than one day.
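The definitions of dead documents and private archives can be expressed as a small classification rule. The sketch below is an illustration only: the function names are hypothetical, documents with more than one user are labelled "living", and a single-user document with a lifespan of exactly one day, which the definitions leave open, is assumed here to count as a private archive.

```python
from collections import Counter


def classify_document(n_users, lifespan_days):
    """Classify a document from its number of users and its lifespan in days."""
    if n_users > 1:
        return "living"  # used by more than one person
    if lifespan_days < 1:
        return "dead"  # one user, lifespan of less than a day
    # One user and a longer lifespan; the boundary case of exactly
    # one day is an assumption, the text does not specify it.
    return "private archive"


def category_shares(documents):
    """Share of each category among (n_users, lifespan_days) pairs."""
    counts = Counter(classify_document(u, l) for u, l in documents)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical documents: (number of users, lifespan in days)
    docs = [(1, 0.0), (1, 0.5), (1, 3.0), (4, 10.0)]
    print(category_shares(docs))
```

Under this rule every document falls into exactly one of the three categories, so the shares of dead documents, private archives and living documents in a QP add up to 100%.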
The table below is cleansed of dead documents and personal archives and clarifies the
different patterns in the QPs. Only documents with more than one user are included.
Statistics of     # of        Avg.       Avg. #   Avg. #    Avg. #   Avg. #      Avg. #
living docs       documents   lifespan   users    uploads   reads    downloads   edits
IC                255         12,1       4,2      1,4       8,1      4,6         0,6
GIC               8           16,6       17,9     1,3       54,0     14,6        0,1
NP_solo-ID        36          40,0       4,7      1,7       11,6     6,4         1,2

The GIC QP use is actually limited to 8 documents whose full lifecycle was from 5/5
2001 to 5/10 2001. On average, each of these was read 54 times and accessed by 18 users. Compared to the other QPs there
were significantly more readers. The GIC QP seems primarily to have been used for
publishing very few documents to a larger group of people such as the management meeting
agenda genre analyzed in the next section.
A document in NP_solo-id was edited 1,2 times on average, which was more than in the
two other QPs. This makes sense if the QP works as a project repository where documents are
worked on while they reside in the QP.

Characterization of the users


The following characterization of the users is based on a matrix, produced for each of
the three QPs, which shows the activity per user per week. The matrix is defined in appendix
11.
The three QPs had approximately the same number of active users. GIC was the largest
with 108 active users, NP_solo-ID had 97 and IC had 81. An active user is defined as
someone who has used the QP at least once during the log period. For all three QPs, the level
of activity was spread with many users with very low activity and few users who were very
active. The table illustrates this by displaying different percentages of the users and their
corresponding percentage of the overall activity in the QP.

% of activity per % of users   5% of users   25% of users   50% of users
IC                             35%           85%            98%
GIC                            44%           80%            95%
NP_solo-id                     55%           86%            96%

For all three QPs, at least 95% of the activity was conducted by the most active half of the users. The
other half of the users were therefore very peripheral to the QPs and should be characterized
as virtually inactive. The table also shows that IC had a larger proportion of “core” users. The
most active 5% of the users had a lower share of the activity than in the two other QPs, while the
most active 25% of the users had
approximately the same share of activity as in GIC and NP_solo-ID.
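The concentration figures in the table can be computed along the following lines. This is a sketch under assumptions: activity is taken to be a count of logged actions per user, the top fraction of users is rounded up to a whole number of users, and the function name share_of_top_users and the sample data are hypothetical.

```python
import math


def share_of_top_users(activity_per_user, fraction):
    """Share of total activity accounted for by the most active users.

    activity_per_user: list with one activity count per user.
    fraction: e.g. 0.05 for the most active 5% of the users; the number of
    users is rounded up (an assumption, the text does not state the rounding).
    """
    ranked = sorted(activity_per_user, reverse=True)
    k = math.ceil(fraction * len(ranked))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical activity counts for ten users
    activity = [50, 20, 10, 10, 5, 5, 0, 0, 0, 0]
    for fraction in (0.1, 0.5):
        print(fraction, share_of_top_users(activity, fraction))
```

In the hypothetical data the most active user accounts for half of the activity and the most active half of the users for 95%, a pattern of the same kind as in the three QPs.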
Another way of characterizing the users is to look at the author/reader ratio. Apart from
the level of activity, a distinction should be made between the users who only read documents
from the QP and the users who authored documents meant for others.
                                 Number of users   Number of authors   Author percentage
International-Communications     81                29                  36%
GIC                              108               8                   7%
NP_solo-id                       97                8                   8%

The image of a limited number of people being responsible for most of the activity is
repeated when we look at the percentage of authors. In GIC and NP_solo-ID, 7% and 8% of the
users respectively were authors, while the rest only read the documents. While the tables do
not document the link directly, the authors were in the group of the most active users.
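The author percentages follow directly from the user and author counts in the table. As a
minimal check, the sketch below recomputes them in Python; the counts are those given in the
table, while the function name author_percentage is introduced here for illustration only.

```python
# Users and authors per QP, as given in the table above.
counts = {
    "International-Communications": (81, 29),
    "GIC": (108, 8),
    "NP_solo-id": (97, 8),
}

def author_percentage(users, authors):
    """Percentage of the active users who authored at least one document."""
    return 100 * authors / users

for qp, (users, authors) in counts.items():
    # Rounded to whole percentages, as in the table.
    print(f"{qp}: {author_percentage(users, authors):.0f}%")
```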
There is a striking difference in the author percentage between IC and the two others.
This matches the descriptions derived from the interviews and survey. GIC and
NP_solo-id were used to distribute information (e.g. meeting minutes) and to document work
(e.g. solution documentation and presentations). IC, by contrast, was used to support work
processes in which people posted articles for the internal magazine or press releases that had
been translated, and therefore 36% of its users were authors of documents. It is clear that for
GIC and NP_solo-id the task of distributing information and documenting work was executed
in a centralized manner.
The distribution of authors and readers across all QPs shows that 11,9% of the users
were authors. The patterns of GIC and NP_solo-id are therefore much closer to the overall
image of how the usage of QP was divided among the users.
We have seen a wide spectrum in the activity levels of the users. The following table
shows statistics on how often the average user visited the QP. The numbers are measured on a
per week basis. The week unit was chosen based on the hypothesis that if a user has visited
the QP once a week, he can be considered active. The level of activity measured across the
whole period used above gives no indication as to how concentrated the usage was. In
principle the most active user could have used the QP during only one week being otherwise
totally inactive.
QP                             Avg. no. of users   Avg. weeks of       Avg. weeks of activity /
                               per week            activity per user   total no. of weeks
International-Communications   18,8                9,3                 25%
GIC                            19,5                6,9                 19%
NP_solo-id                     17,9                7,1                 19%

The table shows that in an average week 19% of the users of GIC and NP_solo-id used the
QP, or, put differently, that the average user used the QP roughly every fifth week. The
average user in IC used the QP every fourth week.
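The three columns of the table can be derived from a user-by-week activity matrix. The sketch
below shows one possible calculation in Python. The layout of the matrix (a mapping from user
to a list of weekly activity counts) is an assumption made for this illustration; the matrix
actually used is defined in appendix 11. The sample data are made up.

```python
def weekly_statistics(matrix):
    """Summarize a user-by-week activity matrix.

    matrix: dict mapping each user to a list of weekly activity counts,
            all lists having the same length (one entry per log week).
    Returns (avg users per week, avg active weeks per user,
             avg active weeks per user / total number of weeks).
    """
    n_weeks = len(next(iter(matrix.values())))
    # Number of users with any activity in each week.
    users_per_week = [
        sum(1 for weeks in matrix.values() if weeks[w] > 0)
        for w in range(n_weeks)
    ]
    # Number of weeks with any activity for each user.
    weeks_per_user = [
        sum(1 for count in weeks if count > 0) for weeks in matrix.values()
    ]
    avg_users_per_week = sum(users_per_week) / n_weeks
    avg_weeks_per_user = sum(weeks_per_user) / len(matrix)
    return avg_users_per_week, avg_weeks_per_user, avg_weeks_per_user / n_weeks

# Hypothetical matrix: four users, four weeks (not data from the study).
sample = {
    "user_a": [3, 0, 1, 0],
    "user_b": [0, 0, 2, 0],
    "user_c": [1, 1, 1, 1],
    "user_d": [0, 5, 0, 0],
}
print(weekly_statistics(sample))
```

The "every fifth week" reading is the reciprocal of the last column: a ratio of 19% corresponds
to one active week in about 1/0.19 ≈ 5.3 weeks, and 25% to one active week in four.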
The following graphs display the number of users for each week in the log period. The
week numbers on the x-axis correspond to calendar weeks.

NP_Solo-ID weekly users (figure: number of users per week, plotted against calendar weeks 23 to 53 and 1 to 7)

NP_solo-id shows a rather stable number of users per week. Between 15 and 30 users
used the QP each week, except during the public holidays, when the number of users dropped
significantly. The activity seems to have dropped in the last period, which might indicate that
the project was ending or entering a new phase.

IC weekly users (figure: number of users per week, plotted against calendar weeks 23 to 53 and 1 to 7)

The number of IC users increased rather dramatically from a level of 6 – 8 users before
the summer holiday to a level of 25 – 30 after. This is probably because the IC QP had started
to support new tasks. The press release, which we are going to analyze in detail as a genre, was
produced in week 33. This was the first time the QP was used for the translation of press
releases. The use of IC to support the production and translation of the internal magazine
also started after the summer holidays. After the dramatic increase in the number of users, the
weekly number of users stayed between 15 and 30 for the rest of the log period.

Weekly users GIC (figure: number of users per week, plotted against calendar weeks 23 to 53 and 1 to 7)

Compared to NP_solo-ID and IC, the GIC QP had a much more unstable number of
weekly users. Apart from the peak in week 47, the graph also shows a tendency towards fewer
users per week. The activity during the late autumn and early winter was at the same level as
during the summer holiday. As we will see later in the section on folder structures, the peak in
week 47 was concurrent with a major reorganization of the GIC QP. Similarly, the dramatic
increase in the number of weekly users in IC occurred concurrently with a major
reorganization of the QP. While the IC reorganization seems to have been a success, the
dropping number of users in GIC might indicate that the reorganization there had been less
successful. We will deal with this in more detail in the section on folder structures.

The characterization of the three Quickplace exemplars has provided us with a
reasonably detailed overview of how they are used. The characterization has also illustrated
how statistical data based on log files can provide a supplementary viewpoint to the accounts
from interviews and survey responses. It helps the researcher interpret the statements made in
the interviews. It also provides data that cannot be derived from interviews simply because the
interviewees do not know how the QP is used. Interviewees only have knowledge of their own
use patterns and of others' use through conversations. As an illustrative example, Linda, the
manager of the GIC QP, reported that she shared presentations with the other users by placing
them in the QP. A study based on interviews would have no other option than to report that
the QP was used to share presentations. The interpretation of "sharing" changes radically when
provided with the knowledge that all but 8 documents in GIC were never used by anyone. The
term "share" as Linda used it therefore means that she had the intention of sharing the
presentation with others. A study of GIC based on interviews would have required interviews
with all 108 users of GIC to provide this insight.
As the example with Linda shows, communication does not follow from good intentions.
In the context of this thesis the concept of a genre of communication is used as a basis for
understanding what makes QP a medium for communication. The analysis of instantiations
where a genre is established with QP as the medium will provide us with a greater
understanding of this issue.

Three genres of communication


In the following we will analyze three different instantiations of genres of
communication. The first genre is taken from IC while the last two are from GIC. The reason
for selecting these instantiations is that data from interviews are available. The interview data
provides an understanding of what is going on which enables an interpretation of the patterns
observed in the log. The three genre examples were not chosen because they
represent typical uses of QP in terms of the genres. The level of detail at which the genres are
studied does not in practice allow for an analysis of how representative they are of QP use at
Beta.
The purpose of the analysis is to give a detailed account of how the QP is used to
support specific communication processes. This viewpoint will provide us with a precise
understanding of the role of the QP technology in a very specific setting.
Apart from understanding the specific genres and the role of the technology, the analysis
exemplifies how log file analysis can be used to support a qualitative study of a work process.
It exemplifies a concrete approach to multi-method research.

The translation of a press release


One of the tasks handled in the GIC department was translation. The IC QP
functioned as a tool for the translation of corporate annual and interim reports, press releases
and a magazine for employees. It was managed by Lone, who was interviewed. She gave an
account of how the team of translators should use the QP in the translation of press releases.
At the time of the interview, the work process had been agreed among the translators but had
not been tried out in practice.
In the translation process each language had a folder, and the person responsible for the
translation in that language agreed on conventions with the relevant people. This was not
done centrally by Lone. The description, provided in the interview, of the translation of the
internal magazine can be summarized in the following diagram of the process:

1. The authors of articles for the internal magazine place their finished articles in the IC
QP in the folder named after the language in which they are written. (In appendix 12 the
folders called "Nordic ideas - English" and so on can be found in the depiction of the IC
folder history.)
2. The translators download the articles and translate them into English (typically one
translator per source language). Once they have completed the draft translation, they
e-mail it back to the author and when he/she has accepted the draft it is sent to a
proof-reader. The translator then places the final English version in the “Nordic
ideas - English” folder.
3. A deadline is agreed on when all the articles should be available in English. Then the
translators translate the articles into the other languages. When they have completed
the translations they also place them in the QP.
If an article is changed after the translation process has started, the author must place the
new version in QP and notify the involved translators by telephone or e-mail. By the deadline
the result should be that all the articles are available collected in the folders by language.
This description of the genre of translation is taken from the translation of the internal
magazine. While it may not be identical in details to other translation processes, it exemplifies
a genre, which is used for the translation of annual and interim financial reports, the magazine
and for press releases.
Yates and Orlikowski (1992) identify three criteria for defining something as a genre of
communication:
“…that is, the business letter and the recommendation letter, the meeting and the
personnel committee may all be designated as genres of communication if there can be
identified for each a recurrent situation, a common subject (either very general or more
specific), and common formal features.” Yates and Orlikowski (1992)
The translation process can be defined as a genre of organizational communication in the
following way:
The recurrent situation occurs when someone in the organization requests that the
translation unit translate a certain document. Most of these tasks recur at fixed intervals.
The financial reports occur each quarter and so do the press releases connected with the
release of the financial report. The internal magazine is also published four times a year. Once
the request is made, the manager of the unit (Lone) initiates the work by informing the
translators and by sending the document to be translated. A translation process always has a
deadline, so the duration of the process is always known before it starts.
The common subject of the translation genre is always that it is about the translation of
a document. The subject of the documents to be translated of course varies.
The common formal features of the translation process are described above for the
translation of the magazine. The translator collects the document from the QP or receives it
via e-mail (as we shall see e-mail is used in the translation of the press release). In some cases
it is first translated into English and then from English to the other languages. One person
typically conducts a translation. In cases of tight deadlines, as is the case with the financial
reports, the translation process begins on drafts of the documents. This requires that the
translators coordinate the translation with the author of the original text. In these cases the
translation to one language requires more cycles as the original document is finalized in
parallel with the translation. In the case which we will study in detail, the document is,
however, finalized before the translation process starts.
The detailed analysis of the translation of a press release is based solely on data from the
log file analysis. This allows us to focus on the specific role of the QP technology in the genre
of translation and the limits of interpretations made from log data. The visualizations of the
process can be found in total in appendix 13.
The first indication of the translation process in the log file is that on 15/8 at 12:00 Eva
Berg uploaded two files to a document (a document is equivalent to an HTML page which can
contain attached files), which we shall name “Danish”. These files were named 16press-dk.doc
and 16aug-dk.doc. The “16” probably refers to the deadline of the translation process.
In support of this interpretation, the Beta website had an announcement dated 16/8,
which stated that the interim financial report would be released on 22/8.
The chart below shows the full lifecycle of the “Danish” document. Each dot represents
an action and a number denotes the type of action (8 for uploading an attachment, 7 for loading
the document in a browser and 2 for downloading a file attachment). The graph shows us a very
short lifecycle of the document. After the last download of the attached files at 9:00 on 16/8
by Lena, the document was not touched again in the log period; neither was it deleted, as
this would have shown in the log. The document was not edited after uploading, since this
would have produced an action of type 4. This means that no new versions of the attached
files were uploaded in the document lifecycle.
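The reasoning about a document's lifecycle can be expressed as a small summary over the
logged events. The sketch below is a minimal illustration in Python. The four action codes are
those named in the text; the record layout (timestamp, user, document, action code), the
function name lifecycle and the sample events are assumptions for this example and do not
reproduce the actual QP log format or the Beta data.

```python
from collections import Counter
from datetime import datetime

# Action codes named in the text; the short labels are paraphrases.
ACTIONS = {8: "upload", 7: "load", 2: "download", 4: "edit"}

def lifecycle(events, document):
    """Summarize the logged lifecycle of one document.

    events: iterable of (timestamp, user, document, action_code) tuples.
    Returns None if the document never appears in the log.
    """
    own = sorted(e for e in events if e[2] == document)
    if not own:
        return None
    counts = Counter(ACTIONS[e[3]] for e in own)
    return {
        "first": own[0][0],             # first logged action
        "last": own[-1][0],             # last logged action
        "edited": counts["edit"] > 0,   # any action of type 4?
        "counts": dict(counts),
    }

# Hypothetical events (not taken from the study's log).
events = [
    (datetime(2000, 1, 3, 12, 0), "user_a", "doc1", 8),
    (datetime(2000, 1, 3, 12, 1), "user_a", "doc1", 8),
    (datetime(2000, 1, 3, 15, 0), "user_b", "doc1", 7),
    (datetime(2000, 1, 4, 9, 0), "user_c", "doc1", 2),
    (datetime(2000, 1, 4, 11, 0), "user_a", "doc2", 8),
]
summary = lifecycle(events, "doc1")
print(summary)
```

A document with no action of type 4 between its first and last logged action is, in the terms
used above, a document that was uploaded and read but never edited.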

Eva Berg, who uploaded the Danish files, later on the same day also uploaded the
Norwegian and the English versions. On the next day (16/8) at 11:00 she created a new
document (which we name ”samling1”) and uploaded all language versions of the two press
releases to that document, which amounts to ten attached files. By the time she did this, all
language versions were available as documents in the QP. Lone had uploaded the Swedish
version and Kati had uploaded the Finnish version during the afternoon of 15/8.
The following chart summarizes the actions performed by Eva Berg on all the
documents involved in the translation process.

<<evaberg0.jpg>>
If we assume that QP was used as the coordination tool for the translation process, we
would predict that Eva Berg had downloaded the Swedish and Finnish versions from the QP in
order to be able to collect them in “samling1”. However, this was not the case. Some other
medium must have been used to send her the files, probably the e-mail system.
Eva Berg's role in the translation process could therefore be characterized as that of a proxy
in relation to the QP. A proxy is a widely used concept in IT and means that someone does
something on behalf of others. Eva Berg acted as a proxy in relation to the use of QP in the
sense that the translators of the Swedish and Finnish versions sent the documents to Eva by
e-mail instead of uploading them directly in the QP. The reason for the proxy role might be that
the two translators stuck to the way the translation process had worked before, where e-mail
was used. Unfortunately the log data cannot tell how widespread the proxy role is in QP use.
If we turn to the other people using the “Danish” documents, all of them either simply
looked at the document or downloaded the attached files, and all these actions happened in
the period from the creation of the “Danish” document to the creation of the “samling1”
document.
In addition to the observation made on the actions of Eva Berg, another
observation strengthens the interpretation that the QP was used as a secondary medium for
coordination. Approximately five hours after Eva Berg had created the “samling1” document,
Lone, the manager of the translation unit, created another document (“samling2”) and
uploaded all ten files.
As the chart for Lone shows (please refer to appendix 13), she had herself uploaded the
Swedish versions and downloaded the Finnish versions. In order for her to upload the ten
files, she must have had them sent, probably by e-mail. Eva Berg acted as a proxy and so did
Lone, who was uploading the Swedish versions (she was Danish, had no special knowledge
of Swedish and therefore could not be the translator).
From the first analysis of the detailed process we can make two interesting observations:
1. While Lone described QP as the medium used to coordinate the translation process,
the log file analysis tells us that at least e-mail was used also. Actually it seems that
e-mail was used as the primary way of routing the documents from the translators to
the people responsible for publishing them.
2. Eva Berg and Lone Kaas seem to have acted as proxies for some translators, while
other translators seem to have uploaded documents themselves.
Lone reported in the interview that the e-mail notification function was used when a
document was published. While the files were sent as e-mail attachments as the primary
means of routing the work, the QP notification function must also have been used; this would
explain the observation that the files were downloaded from the QP and read by others.
Six people read the “Danish” document during its short lifecycle. All these readings took
place before all the five language versions (ten files) were collected in “samling1”. The two
readings by Kati and Tore of the "Danish" document were of action type 7, which means that
they were simply checking to see whether the documents were there. Kati uploaded the two
Finnish files and immediately after that she loaded the “Danish” document as if she was
checking that the Danish version was also there. This indicates that the QP was used as a way
of checking the status of the translation process.
The other four readings of the “Danish” document were of action type 2, which means
that one or more of the attached files were downloaded. Susanna, Claus and Lena downloaded
both files, while Per downloaded one file and then, an hour later, downloaded the same file
again.
The six people who read the “Danish” document must all have been notified that it had
been published by means of the notification function in QP. We can only speculate about their
reason for reading the document, but it is likely that at least some of them read it for
proofreading or approval. If they had provided input for the Danish press releases to the
translator Eva Berg, they must either have e-mailed the comments, phoned her or, most likely,
walked across the room to her and provided the input face-to-face. All Danish members of the
communications department were located in one open office space. The finding, as in the
analysis of the actions of Eva Berg and Lone Kaas, is that QP was not used as the only tool to
support the coordination of the translation process. The use of QP was mixed with the use of
e-mail, phone calls and face-to-face conversations.
While the hypothesis that the readings of the "Danish" document had a corrections and
approval function could not be validated, it was validated in one of the readings of
“samling1”. Two hours after Eva Berg published “samling1”, Jens downloaded the two
Danish files from “samling1”. Jens was press officer for Beta in Denmark. Therefore, he was
responsible for all communication with the Danish press and responsible for the two Danish
press releases analyzed here. We conducted an interview with Jens, where he specifically
addressed his role as approver.
A number of people read the “samling1” and after a break in activity in the QP of three
hours, Lone created a new document “samling2” and uploaded all ten files. There is one
probable explanation for this: The readers of “samling1” had provided input for the different
language versions and the updated versions had been routed via e-mail to Lone. Again the
finding is the same, that QP was used together with e-mail, phones and face-to-face
encounters.
What is then the role of QP in this genre of communication? From the interview with
Lone, it seemed that QP was used as the coordination tool for the translation process. The
result of the analysis of the log files gives us a rather different picture.
Firstly, the primary routing of documents from the translators was only partially done
using QP. E-mail must have been a central part of it. Otherwise we cannot make sense of the
action patterns observed in the log file. The routing of documents is not the only relevant
aspect of the translation process, and in these other aspects QP seems to have played a more
central role. The diagrams show that for all published documents regarding this translation
process, a number of people downloaded (and probably read) the attached files. Either they
did it simply to inform themselves of the contents of the press release, or to provide
corrections or to provide approvals for the press releases. In the cases where the readers fed
comments back into the translation process, they did it by other means than the QP.
Secondly, the use of QP was limited to simple processes of upload and read. The files
that were posted in the QP were not edited after their creation. The exchanges back and forth
of new versions between the translators and readers must have been accomplished using
e-mail. This initially seems strange. QP provides facilities for locking documents in and
out and thus supports a document being edited several times by different people
without the risk that versions get out of sync.
In the interview Lone stated that: “…only the things that are finished are put in the
Quickplace”. This might provide an explanation for the observed pattern that documents were
simply uploaded and read without any edits of the document. However, the logical
consequence of the principle stated by Lone would be that once the “Danish” document was
uploaded, it would serve as the final document. This is not the pattern observed in the
diagrams. The QP was in fact used for draft versions, which were later collected in a final
version. The principle stated by Lone explains why the revision processes were divided into
simple upload and read processes in the QP.
One of the questions arising here is whether they were using QP as intended in the
design. In the words of DeSanctis and Poole (1994) we might ask whether it was used
according to “the spirit” of the technology. This question might be answered with both a yes
and a no in this context. The “no” answer would argue that the QP has facilities for handling
the full process of translation, including that other people than the author provide direct input
for corrections and the handling of different concurrent versions of the documents. According
to this line of argument (which is very common in IS), QP was not used according to its design.
The “yes” answer would argue that they used some facilities provided by QP and integrated
them with the use of other media in a way which worked to their own satisfaction.
There is a third response to the question of whether QP is used as intended in the design.
The third response is that the question is wrong. The way in which the QP was used together
with other media as observed in this genre of communications prevents a clear answer to the
question. The problem of answering this question points to a distinction between design and
use common in the study of computer systems and their use. It seems that the observations of
the use of QP challenge this distinction in terms of who, where and when design and use
take place. We will return to this discussion after collecting more evidence that this
distinction needs to be reworked when applied to computer media such as QP.

The translation of a press release showed us how QP was used to support a genre of
communication that involved several documents and several exchanges of documents. The
two remaining genres analyzed are of a simpler nature. They are taken from the GIC QP. The
two genres represent a significant amount of the activity in the GIC QP. As documented
earlier only eight of the documents were used by more than one user in the period between
5/5 2001 and 5/10 2001, and the two genres account for two of the eight.

The holiday list


One of the tasks of an organizational unit is to coordinate vacations. In GIC the GIC QP
was used to maintain the plan for holidays. According to the survey this was also the case for
a number of other QPs. The plan for holidays typically existed in the form of an Excel sheet.

The holiday list was one of the most visited documents in the GIC QP. The document
was created on 7/5 2001 and was still used at the end of the log period. The analysis presented
here only covers the period before 5/10 2001 because the change in the logging-format caused
by a new version of the QP server prevented a proper identification of users after this date.
The holiday list was used as follows: when a member of GIC wanted to plan a vacation
he/she downloaded the Excel sheet from the QP. The first graph shows the uploads, edits of
the document and downloads of the attached Excel sheet containing the holiday list. Souma
was the person responsible for maintaining the holiday list. She uploaded the spreadsheet and
conducted all edits of the document.

The graph shows extensive downloads of the spreadsheet on two occasions. This is
probably in response to an e-mail from Souma requesting the individual holiday plans from
each member of the organizational unit.
A closer look at the actions of Souma reveals that she performed 14 edits of the
document during the period of analysis. Each edit was preceded by a download of the
spreadsheet, indicating that the QP was used to maintain the master copy of the sheet.
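The pattern that every edit follows a download can be checked mechanically on one user's
sequence of actions on a document. The sketch below is a hedged illustration in Python: it
uses the action codes named earlier (2 for download, 4 for edit, 8 for upload), while the
function name and the sample sequences are hypothetical and not taken from the log.

```python
def edits_preceded_by_download(actions):
    """Check the download-then-edit pattern for one user and one document.

    actions: chronologically ordered list of action codes.
    Returns True if every edit (4) is preceded by at least one
    download (2) since the previous edit.
    """
    downloaded = False
    for code in actions:
        if code == 2:
            downloaded = True
        elif code == 4:
            if not downloaded:
                return False
            downloaded = False  # the next edit needs a fresh download
    return True

# Hypothetical action sequences.
print(edits_preceded_by_download([8, 2, 4, 2, 4]))  # pattern holds
print(edits_preceded_by_download([8, 2, 4, 4]))     # second edit lacks a download
```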

A graph of five days of activity in June illustrates how the holiday list was maintained.

The triggering event was that some user downloaded the spreadsheet. Probably he/she
was planning a holiday and wanted to check the other members' holiday plans. The next
observation in the log was that Souma downloaded the spreadsheet and thereafter edited the
document, indicating that she was uploading a new version of the spreadsheet. As in the
case of the translation genre, some other medium must have been used in between the two
events in the log. Probably the user who downloaded the spreadsheet sent an e-mail to Souma
telling her when he/she wanted to have a holiday. She then read the e-mail, made the
appropriate changes to the sheet and uploaded it.
Apart from the e-mail to Souma the holiday was probably coordinated directly with
colleagues, either by face-to-face conversations, phone calls or by e-mail.
The observation in this case is the same as with the translation regarding the
combination of media. The spreadsheet in QP was used for some aspects of
communicating and coordinating the holiday plans for the employees, while other aspects
used e-mail, telephone and face-to-face conversations.
It is also characteristic that one person maintained the holiday list. Souma was a
secretary and therefore had no power to decide when people should take their vacation. It
would have been simple to delegate the updating of the spreadsheet to the individual users. It
is a general observation, as we will see in the statistical analysis of document lifecycles, that
only in extremely rare cases does more than one person make changes to the same document.

The meeting agenda


While the same holiday list was used in the whole log period, the lifecycle of the next
genre is very short. Before the regular meeting of the management group in GIC, the agenda
in the form of a PowerPoint presentation was distributed to the members. In the instantiation
of the genre analyzed here the filename of the PowerPoint presentation was
“Meeting16/8.ppt”, indicating that the meeting was held on 16/8 2001. Souma (the person also
maintaining the holiday list) was secretary to the manager of GIC and uploaded the
presentation on 14/8. 42 users then downloaded it before the meeting was held. After the
meeting it was downloaded by two different users, 8 and 12 days after the meeting.
The graph shows the actions on the document from the upload to the meeting. In
appendix 14 the full lifecycle of the document is displayed.

The upload of the document by Souma must have been accompanied by an e-mail
notification, which would explain why so many people were aware that the document had been
uploaded.

The meeting agenda genre served to inform the employees of the GIC department about
the subjects of the management meeting, and it is a simple example of one-way
communication. According to an interview with Linda, the publishing of the agenda also
served as a chance for people to respond to the agenda and provide input or comments. The
log documents that the responses were not mediated by the QP, but were probably mediated
by the telephone or face-to-face communication.
The downloads of the meeting agenda 8 and 12 days after the meeting was held indicate
that it also had an archival function.
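The distinction between downloads before the meeting and later, archival downloads can be
derived from the download dates in the log. The sketch below is a minimal illustration in
Python. The meeting date (16/8 2001) and the offsets of 8 and 12 days come from the text; the
user names, the other dates and the function name split_downloads are hypothetical.
Counting downloads on the meeting day itself as "before" is an assumption.

```python
from datetime import date

def split_downloads(downloads, meeting_day):
    """Separate pre-meeting readers from later, archival downloads.

    downloads: list of (day, user) tuples.
    Returns (set of users who downloaded up to and including the meeting
    day, sorted list of day offsets for downloads after the meeting).
    """
    before = {user for day, user in downloads if day <= meeting_day}
    after = sorted(
        (day - meeting_day).days for day, _ in downloads if day > meeting_day
    )
    return before, after

meeting = date(2001, 8, 16)
# Hypothetical download records; only the 8- and 12-day offsets mirror the text.
downloads = [
    (date(2001, 8, 14), "u1"),
    (date(2001, 8, 15), "u2"),
    (date(2001, 8, 15), "u1"),
    (date(2001, 8, 24), "u3"),
    (date(2001, 8, 28), "u4"),
]
before, after = split_downloads(downloads, meeting)
print(len(before), after)
```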
Analyzed in terms of genre theory, the recurrent situation initiating the genre is
the production of the agenda for the management meeting. This agenda is agreed in the
management group, and for this process the QP is not used. Once the agenda is settled in the
management group it is ready for publishing. The publishing of the document should take
place before the meeting so that employees have a chance of commenting or contributing
information or insights.
The common subject for the genre is of course the agenda of the management meeting.
The subjects dealt with at the different meetings will vary, but the stable factor is the fact that
it is the agenda for the GIC management meeting.
The form of the genre is that the secretary of the manager uploads the agenda once it is
settled and notifies all employees of the department of the publication. Some of the
employees then read the agenda and possibly respond to it using some other medium than
the QP.

Mixing media in a genre


The analysis of the three genres has shown that the use of QP was closely intertwined
with the use of other media. The results of the survey showed us that the QP was used in
parallel with e-mail, telephone, etc., and the present analysis tells us that we should take this
very seriously. The survey question was: “what other media were used to communicate with
the other members of the QP”. All the respondents answered that QP was used with other
media. The present analysis shows us that this was also the case in individual genres of
communication.
The finding in relation to genre theory is therefore that important aspects of genres of
organizational communication cannot be captured by the analysis of just one medium. It
might have made sense before computer media were introduced, but the availability of
multiple media with both supplementing and overlapping functionality makes it difficult. The
users have an infrastructure consisting of multiple media that are combined in the genres,
utilizing various combinations of functionality from the available media.


Existing empirical studies of genres of organizational communication have focused on
one medium (Yates and Orlikowski (1992), Orlikowski and Yates (1994), Yates and
Orlikowski (1994), Yates, Orlikowski et al. (1999), Davidson (2000), Karjalainen, Päivärinta
et al. (2000), Yoshioka, Herman et al. (2001), Yates and Orlikowski (2002)). The present
analysis has shown that a focus on a single medium should be adopted cautiously.
One other characteristic of existing studies of genres of organizational communication is
the focus on content analysis. The studies performed by Yates and Orlikowski are based on
the study of media contents. Orlikowski and Yates (1994) is based on the analysis of 2000 e-
mail transcripts and Yates and Orlikowski (2002) analyzes 682 documents posted to a Lotus
Team Room. In both studies the content analysis is supplemented by interviews with users.
The intent of this study in relation to genre theory has not been to map the genre systems
enacted in QP at Beta. It has provided detailed analysis of specific instances of genres, and the
data does not provide evidence that these are the most common genres enacted in QP.
Genre systems structure several aspects of communication. Yates and Orlikowski (2002)
identify these as the "what", "who/m", "how", "when", and "where" of communication. The
present analysis has shown that log analysis provides a supplementary approach to the
analysis of genres of communication. In combination with interviews it clarifies the aspects of
genres that have to do with "who/m", "how" and "when". The "what" of communication is only
addressed on a very abstract level because the log analysis does not include content analysis. The strength of the log
analysis is that it clarifies the relationship between media and genre systems by providing a
detailed account of the sequence of communicative actions involved in a genre system.

Why do people use e-mail instead?


E-mail has been used at Beta for a number of years and people have gotten used to
interpreting a lot of details about e-mail communication and how to respond to them. They
base themselves on a number of genres of communication. The genres observed in the QPs
indicate a less developed sensibility. The holiday list was an example of this. The users
themselves could just as well have updated the Excel sheet, rather than sending an e-mail to
Souma. The question is why they did not. I have no empirical data that supports one or the
other explanation. For some reason it was more difficult for people to edit the spreadsheet
than to send Souma an e-mail stating the same information. It seems very hard for different
people to edit the same document without being near each other so as to correct mistakes and
misunderstandings.
If the organization continues to use the QP technology, the use of more
"advanced" features, such as trusting that people can update a document asynchronously,
uncontrolled, and from geographically dispersed locations, could probably be observed. The
point made here is not that I as a researcher or system developer have overlooked some
crucial little detail in the coordination work that makes me misinterpret the work situation.
Neither is it the point that some ignorant system developer has designed the software so that it
cannot be used for the task for which it was meant. The point is rather that the existing genres
of communication, in which e-mail is a well-established medium, are social structures that are
not changed from one day to the next.
An interesting difference between the three QPs studied in detail hinges on the
relationship to existing genres of communication. The use of NP_solo-ID and IC was
characterized by the fact that they served well-established genres in the organization. Projects
have exchanged descriptions and kept a project archive, and the translators have translated
and coordinated their translations before the introduction of the QP technology. In the case of
GIC it was harder to identify the genres of communication on which it should be based. While
the communication needs of a project and translation tasks are pretty straightforward, on the
level of an organizational unit, they are not. The two genres that seem to have been successful
in GIC were the meeting agenda and the holiday list. These genres have most likely been
established at the time the GIC organizational unit was formed. The holiday list was not a
new genre and the members of GIC must have been familiar with it before it was put in QP.
In the case of the management meeting agenda, the genre was certainly a new invention that
was connected to the merger of the communications departments into GIC. Before the merger
the Danish organization was characterized by a fairly simple structure with one manager. The
meeting agenda genre was therefore a new genre that was unique to QP. The question is in
what sense it was uniquely integrated with the QP technology.
The meeting agenda genre provides an opportunity to pinpoint the relationship between
the QP technology and the social structures of Beta and highlight the difference between the
QP technology as an artefact and the technology-in-practice. Since the management meeting
agenda is unique to the QP technology one might speculate whether it is a consequence of
introducing the QP technology and whether the genre is directly related to unique properties
of the QP technology. This is hardly the case even with a new genre. First of all, the QP
properties utilized in the management-meeting genre are not unique. E-mail could have
served exactly the same purpose of distributing the agenda. Also, the genre was introduced as
a consequence of change in the social structures. It was introduced because the
communication departments had merged into one unit, resulting in a new management
structure.
It is probably the case that the decision to publish the management meeting agenda for
the purpose of getting input and informing employees was influenced by the presence of a
communication infrastructure that allows easy distribution of information to a group of
specific people. Only in this abstract sense do the properties of the technology artefact
affect the genres of communication. We cannot pinpoint specific properties of the artefact that
have caused specific properties of the genre. Rather it exemplifies the emergence of a
technology-in-practice in the form of a genre of communication that includes utilizing some
properties of the artefact.

Up till now, the reports on the use of QP at Beta have exhibited a continuous zooming in
from an overall characterization to the analysis of three specific instantiations of genres in
which the QP was integrated. In the next section we shall zoom out again and attempt an
overall characterization of QP document use. The next section is solely devoted to
document-based HTTP-log analysis that will both illustrate the potential of this kind of log analysis and
provide some interesting findings for the characterization of QP use.


Statistical generalizations of document use


As described earlier in the section on research method, log-analysis of
computer-mediated communication analyzes the life cycle of documents. In contrast, log-analysis of
human-computer interaction focuses on the single user's interaction with an application.
The analysis of the translation genre exemplifies how a document-centered analysis of
the log file can be used in the process of understanding one single instantiation of a genre of
communication. The quality of such an analysis is that it provides a very detailed account of
actions performed by users in a single instantiation of a genre. The challenge is that it
produces a lot of complexity, which makes it impossible to use for the analyses of larger
samples and defies statistical generalizations beyond the sample. In data-mining terms the
analysis of the single instantiation of a genre is denoted a search for local patterns in the data
(Hand, Mannila et al. (2001)).
The opposite approach to the search for local-patterns in data mining is to look for global
models. A model is a simple description of the data that can be used to explain a
phenomenon. Global models are of course based on statistical generalizations, and the value
of a global model is whether it can explain all the data analyzed, and whether it perhaps can
be generalized using inferential statistics to data beyond the sample.
This section will report some attempts made to create global models of the document life
cycles in the body of data. The strength of a global model is that it explains all the data
gathered in a simple way. In this respect the approach used to analyze genres of
communication and this section represent opposites in a trade off between specificity and
complexity.
In some cases there is a correlation between this analysis and the genre analysis. There is
however not a direct link between them. The translation genre showed that seven document
life cycles constituted a genre of communication. Therefore when we attempt to analyze a
document life cycle, we shouldn’t hope for any statements regarding genres. Some simple
genres such as the distribution of management meeting minutes are equivalent to a document
life cycle, but it would be unjustified to think that this is the case in general. The purpose of
the genre analysis was to understand in detail how QP was used in specific settings while the
purpose of generating global models is to make general statements regarding the use across
all QPs.
The purpose of attempting to create a global model of document life cycles is three-fold.
Firstly, it will provide us with a general idea about how QP documents are used; secondly, it
will reveal some interesting patterns that we can use for the understanding of the distinction
between properties of the artefact and technology-in-practice; thirdly, it illustrates how
different results can derive from the analysis of log files. When the global models are
compared to the instantiations of the genre, the trade off between investigating many aspects
of use in one situation and investigating one aspect across a number of situations becomes
clear.

Deriving the basic document types


The purpose of the general model of document life cycles was to look for a limited
number of basic types of document life cycles. The initial idea was that these types of life
cycles could be interpreted as the basic genres of communication used in the QPs at Beta. The
detailed analysis of genre instantiations showed that this interpretation is not sensible. The
analysis of the translation genre identified two problems: first, that more than one document
was used in the genre and second, that other media were part of the genre. Still, the basic
types of document life cycles provide valuable input to the characterization of QP as a
technology for computer mediated communication at Beta.

The first step of the process of deriving the basic document life cycles is to define how a
document life cycle should be represented. This determines the relevant methods available in
statistics. We worked on two different definitions, which were distinguished by the manner in
which time was represented.
1. A document life cycle is represented as a sequence of different types of actions on
the same document.
2. A document life cycle is a collection of properties of a document that characterizes
how it has been used.
The first representation is the most complex but would also cover more aspects of use. It
would distinguish between documents that were first edited two times and then read seven
times from one that was first edited, read three times, edited once more and finally read four
times. Sequence analysis is the relevant statistical method of analysis for this representation
of a document life cycle. Sequence analysis is a method that can analyze a sequence of events
and produces results of the form: event 1, event 2 -> event 3. It can be interpreted so that if
event 1 occurs and event 2 occurs after event 1, then event 3 will occur after event 2 with a
probability score. Sequence analysis is used in HTTP-log analysis for analyzing click streams.
The click streams are the most probable paths through a web site. It has been identified as a
very useful method for analyzing shopping behaviour in on-line stores (Srikant and Agrawal
(1995), Srikant and Agrawal (1996)). The sequence analysis was tried out using a matrix
derived from the database and the sequence analysis algorithm in Clementine from SPSS but
did not produce useful results. The main practical problem was to set the time limits of events
(within what period of time should actions be treated as one event, and what period of time
should separate one event from the next). The difference in lifespan between the
meeting agenda and the holiday list very clearly illustrates this problem. It was simply not
possible to define these time limits because of an extreme variance in the data. Besides the
practical problems, a general limitation of sequence analysis is that it cannot represent users.
It can only analyze types of events over time.
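
The form of a sequence rule such as "event 1, event 2 -> event 3" can be illustrated with a minimal sketch. It is not the Clementine algorithm used in the study; it only estimates, for a handful of invented action sequences, how often a given event follows once two earlier events have occurred in order. The function name, the event labels, and the sample sequences are all hypothetical.

```python
def rule_confidence(sequences, antecedent, consequent):
    """Share of sequences containing the antecedent events in order
    in which the consequent event occurs afterwards."""
    matched = 0
    followed = 0
    for seq in sequences:
        pos = 0
        found = True
        for event in antecedent:
            try:
                # Earliest occurrence of the event at or after the current position.
                pos = seq.index(event, pos) + 1
            except ValueError:
                found = False
                break
        if not found:
            continue
        matched += 1
        if consequent in seq[pos:]:
            followed += 1
    return followed / matched if matched else 0.0


# Hypothetical sequences of actions on four documents.
sequences = [
    ["upload", "read", "download"],
    ["upload", "read", "read"],
    ["upload", "edit", "read", "download"],
    ["read", "upload"],
]
confidence = rule_confidence(sequences, ["upload", "read"], "download")
print(f"upload, read -> download: {confidence:.2f}")
```

The sketch treats every action as a separate event, so it sidesteps precisely the problem described above: deciding how much time may pass before two actions count as distinct events.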
The second representation selects a number of properties of individual documents that
are descriptive of their life cycle. These are, for example, the lifespan, number of users, number
of downloads etc. of each document. In statistical terms this means an analysis of multiple
variables simultaneously. Finding types of document life cycles based on multiple properties
amounts to grouping instances according to multiple variables, known as
non-hierarchical clustering. The most popular method for non-hierarchical clustering is
K-means clustering (see e.g. Hand, Mannila et al. (2001)). The K-means clustering process
starts by choosing the number of clusters one wishes to end up with as a result of running the
algorithm on the data. Initially the algorithm defines k cluster centres in an n-dimensional
space (where n is the number of variables). Thereafter it iterates through all instances and
assigns the instances to their closest cluster and recalculates the cluster centre. This process is
iterated until no change occurs in the assignment of instances.
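
The procedure just described can be sketched in a few lines. This is a generic batch variant of k-means (all instances are assigned before the centres are recalculated), not the implementation in the statistical package used for the study. Taking the first k points as initial centres is an assumption made only to keep the example deterministic, and the sample points are invented.

```python
import math


def k_means(points, k, max_iter=100):
    """Group points into k clusters; returns (assignment, centres)."""
    # Assumption: the first k points serve as the initial cluster centres.
    centres = [list(p) for p in points[:k]]
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Assign every instance to its closest cluster centre.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # no change in the assignment of instances
        assignment = new_assignment
        # Recalculate each centre as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centres


# Invented two-variable instances forming two obvious groups.
points = [(1, 1), (9, 9), (1, 2), (8, 9), (2, 1), (9, 8)]
assignment, centres = k_means(points, k=2)
print(assignment, centres)
```

With n variables per document, each point would simply have n coordinates; the distance and mean calculations are unchanged.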
The weakness of this representation of the document life cycle is that it ignores the order
of events. Instead it gives us a possibility to choose any property of the document life cycle
for defining the types.
Choosing a sample for the analysis
As the basis for analyzing document life cycles we used a matrix that collected a number
of data items for all documents used across all QPs in the log period. This amounts to 14049
documents. Because the server was upgraded on 5/10 2001, we had to limit the analysis to
documents with a life cycle that ended before this date. The server upgrade changed the log
format and meant that e.g. the calculation of the number of readers of each document would
be misleading after 5/10 2001. This left us with 5826 documents. The number of documents
allowed us to perform the analysis without performing further sampling. It is not possible to
determine whether the sample represents all 14049 documents.
Selecting the properties of the documents
For each document we calculated how long the document had shown
activity in the log, how many times it had been accessed and by how many different people
(for all details on the matrix, please refer to Appendix 6).


Specifically for the cluster analysis a number of properties were chosen to characterize
each document life cycle. These properties were selected because they were relevant for the
characterization of the life of a document. For all documents in the sample we calculated the
following properties and collected them in a matrix:

Property                    Definition
No. of uploads              Occurrences of action type 8
No. of edits                Occurrences of action type 4
No. of reads                Occurrences of action type 7
No. of downloads            Occurrences of action type 2
No. of different readers    Unique users who had performed action type 7
No. of different editors    Unique users who had performed action type 4
No. of users                No. of distinct users of the document
Lifespan                    No. of days from first to last use of the document

The document properties provide a simple picture of how documents were used ignoring
special actions such as document moves and ignoring the sequence in which actions were
performed.
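
As an illustration of how such a matrix could be derived, the sketch below computes the eight properties from a list of log events. The action type codes are those given in the table; the event format (document id, user, action type, day number), the function name, and the sample events are assumptions made for the example and do not reproduce the actual database described in Appendix 6.

```python
from collections import defaultdict

# Action type codes as listed in the property table.
UPLOAD, EDIT, READ, DOWNLOAD = 8, 4, 7, 2


def document_properties(events):
    """events: iterable of (doc_id, user, action_type, day) tuples (hypothetical format)."""
    by_doc = defaultdict(list)
    for doc_id, user, action, day in events:
        by_doc[doc_id].append((user, action, day))
    matrix = {}
    for doc_id, rows in by_doc.items():
        actions = [a for _, a, _ in rows]
        days = [d for _, _, d in rows]
        matrix[doc_id] = {
            "uploads": actions.count(UPLOAD),
            "edits": actions.count(EDIT),
            "reads": actions.count(READ),
            "downloads": actions.count(DOWNLOAD),
            "readers": len({u for u, a, _ in rows if a == READ}),
            "editors": len({u for u, a, _ in rows if a == EDIT}),
            "users": len({u for u, _, _ in rows}),
            "lifespan": max(days) - min(days),  # days from first to last use
        }
    return matrix


# Invented events for two documents.
events = [
    ("doc1", "anna", UPLOAD, 0),
    ("doc1", "bo", READ, 1),
    ("doc1", "bo", DOWNLOAD, 1),
    ("doc1", "carl", READ, 3),
    ("doc2", "anna", UPLOAD, 5),
]
matrix = document_properties(events)
print(matrix["doc1"])
```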
K-means clustering of the document sample
The process of k-means clustering was iterative. Initially the matrix was fed into the
k-means algorithm with the original values of the matrix. It turned out that a few documents had
extreme values in some of the variables (e.g. number of reads). This produced very uneven
clusters in terms of size: most documents were put in one cluster, and the rest of the clusters
contained very few documents. The clusters therefore didn’t provide a broad description of
the documents but characterized only the few with the extreme values. Because of this the
values of the variables were categorized based on two principles: the number of documents in
each category should be equal and the number of categories should be small and relevant for
interpretation. The categories are described in Appendix 15.
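
The first of the two principles, roughly equal numbers of documents per category, corresponds to what is commonly called equal-frequency binning. The sketch below shows one simple way of doing it; it is not the categorization actually used (the boundaries in Appendix 15 are not reproduced here), and the function name and sample values are hypothetical. Identical values always end up in the same category, so with many ties the categories cannot be perfectly equal in size.

```python
from bisect import bisect_left


def equal_frequency_categories(values, n_categories):
    """Label each value A, B, C ... so that the categories hold roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    if n < n_categories:
        raise ValueError("need at least as many values as categories")
    # Inclusive upper boundaries taken at evenly spaced ranks in the sorted data.
    cuts = sorted({ordered[(i * n) // n_categories - 1] for i in range(1, n_categories)})
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return [labels[bisect_left(cuts, v)] for v in values]


# Invented "number of reads" values with two extreme cases.
reads = [0, 1, 2, 3, 5, 8, 40, 300]
categories = equal_frequency_categories(reads, 4)
print(categories)
```

After categorization the two extreme values share a category with nothing more than an ordinal position, which is what prevents them from dominating the distance calculations in the clustering.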
Another observation from the initial clustering was that a large proportion of the
documents were “dead” documents. 72% of all documents analyzed were only used by one
user and had a lifespan of less than one day. These documents were simply created by a user
and never accessed again. This is an interesting finding in its own right and we shall get back
to it later, but the extreme skewness in the data produced by the dead documents was a
problem for the clustering algorithm. We therefore chose to leave out the 72% from the
cluster analysis. They were very easy to characterize as a type of document life cycle without
clustering and would only disturb the process.
With 28% of the documents and 8 variables we started the clustering process. We had to
make two other adjustments before doing the final clusters. A correlation analysis between
the variables showed that there was no significant correlation between the lifespan of a
document and any of the other variables. So, while lifespan = 0 was important in
distinguishing dead documents, it turned out to be insignificant to the rest of the documents.
This lack of correlation is illustrated in the following graph:

The last adjustment to the data for the k-means clustering was to leave out the number of
readers and the number of editors and just use "no. of users" as the variable describing the
number of persons who had accessed the document. As the hierarchical analysis of document
types will show, all but 22 of the documents that were edited at all were edited by a single
user.
Setting the number of clusters when doing the k-means clustering is a non-trivial task. In
some analyses, such as analyzing use patterns of credit cards for predicting credit card fraud, k
is straightforward, because you want to isolate the cases where a credit card is used fraudulently.
In our case the number of types of document life cycles was of course unknown at the outset.
We tried out k values from three to twelve. The iterations with different values of k showed
that 5 clusters provided the best grouping of properties. When comparing the clusters from k
= 3 to k = 5 it was clear that the 5 clusters evolved from 3 so that two of the clusters for k = 3
would divide in two for k = 5. For k > 5 the algorithm kept the 5 clusters from k = 5 more or
less the same while the rest of the clusters were very small. The output of the clustering
algorithm for k = 3 to k = 8 can be found in Appendix 16.
The clusters produced for k = 5 were then used as the basis for a characterization of the
six types of document life cycles. As an illustration, the interpretation process of Cluster 1 is
presented below. The definition of the categorizations of the document properties into A, B, C
etc. can be found in Appendix 15:
Cluster 1: 467 examples

Uploads:
    A -> n/a, B -> 0.858672, C -> 0.141328
    The number of uploads in this cluster is typically 1, because 86% of the occurrences
    in the cluster have 1 upload
Edits:
    A -> 0.732335, B -> 0.152034, C -> 0.115632
    The number of edits are 0 – 1 with 0 being the most typical
Reads:
    A -> 0.010707, B -> 0.025696, C -> 0.239829, D -> 0.434689, E -> 0.239829, F -> 0.049251
    The number of reads of the documents is between 4 and 15.
Downloads:
    A -> 0.023555, B -> 0.201285, C -> 0.762313, D -> 0.012848
    The number of downloads of attached files is between 1 and 8
Users:
    A -> 0.014989, B -> 0.070664, C -> 0.730192, D -> 0.184155
    The documents in this cluster typically have more than three users

If we take the upload property, the numbers show the share of the documents in the cluster
that have each categorical value. For example, 0.858672 or 85.9% of the examples have B,
which refers to 1 upload, while 0.141328 or 14.1% have C, which refers to more than one
upload.
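
The shares can be reproduced by simple counting. In the sketch below the counts of 401 documents with category B and 66 with category C are back-calculated from the reported shares and the cluster size of 467; they are an assumption made for illustration, not figures taken from the clustering output, and the function name is hypothetical.

```python
from collections import Counter


def category_shares(categories):
    """Share of documents in a cluster holding each categorical value."""
    counts = Counter(categories)
    total = len(categories)
    return {cat: counts[cat] / total for cat in sorted(counts)}


# Assumed counts: 401 + 66 = 467 documents in Cluster 1.
uploads = ["B"] * 401 + ["C"] * 66
shares = category_shares(uploads)
for cat, share in shares.items():
    print(f"{cat} -> {share:.6f}")
```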

The six clusters of document life cycles


The k-means clustering provided 5 types of document life cycles. Including the “dead”
documents not included in the clustering process, we get 6 basic types of document life
cycles. These are summarized below:
Cluster 1 (467 examples): Publish an attached file which is downloaded by a small
number of users.
    Basic characteristics: 1 upload of an attached file, 0-1 edits, between 4 and 15 reads,
    1 – 8 downloads and more than three (mostly 3 – 5) users.

Cluster 2 (325 examples): Publish a document with no attachments which is read by
a small number of users.
    Basic characteristics: 0 uploads (20 percent > 1 uploads), 0 downloads and 1 – 5 users.

Cluster 3 (271 examples): Publish an attached file, which is downloaded by a larger
number of readers.
    Basic characteristics: More than one upload and more than 8 downloads, more than 10
    reads and more than 3 readers (with more than 5 in 79% of the cases).

Cluster 4 (245 examples): Publish a document that is hardly read; if it has an
attachment it is not downloaded.
    Basic characteristics: 0 downloads and 0 – 3 reads, 0 edits, and 0 or 3 – 5 users.

Cluster 5 (250 examples): Publish one or more files that are downloaded by one or
two readers.
    Basic characteristics: 1 or more uploads of a file, 0 edits, 1 – 5 reads and 0-2
    downloads by mostly 2 users.

Cluster 6 (4215 examples): Dead documents.
    Basic characteristics: Created by one user and never accessed again.

The first observation from the clusters is that the three genres of communication studied
are not typical for the overall use of documents. The meeting agenda genre and the holiday
list genre both belong to cluster 3, while the translation genre consists of 7 documents also
belonging to cluster 3. Cluster 3 consists of 271 examples or 4,6% of the sample of 5826
documents. It would have been nice to connect the cluster analysis with the genre analysis by
providing a genre analysis exemplifying each of the clusters. There were not sufficient
descriptions of use available from interviews to provide a basis for interpreting the
patterns in the log files to exemplify each cluster. Apart from the fact that sufficient data was
not available, suggesting a connection between the genre analysis and the cluster analysis
would also be unjustified. To suggest that a genre exemplifies a document cluster is true only in
the sense that they have properties in common. The fact that all three genres analyzed stem
from the same cluster should raise a suspicion as to the relation between genre analysis and
the clustering of document life cycles. Firstly, the two differ in the role of time. For the analysis
of genre, time is central. In the statistical analysis of documents it turned out that time was not
significant. Secondly, it illustrates the difference between the two analyses. Genre analysis
takes into account the specific work practice and the meaning assigned by users to the
documents used in the genre. In contrast it is the purpose of the cluster analysis to ignore the
specific work practice and make statements that are generalized from that.
The second observation, which does not stem from the k-means clustering process but was
discovered during the initial statistical analysis, is that most documents are dead documents.
72% of all documents published are never again used by anyone. Of course one cannot know
for sure whether they will ever be used. The 4215 dead documents have all been published in
the period 5/5-2001 – 5/10-2001 and it is known positively that they have not been used more
than once before 19/2-2002, when the log period ends. This gives a minimum period of 4 1/2
months to measure the “deadness” of the documents.
One kind of conclusion to draw from the percentage of dead documents is to question
the usefulness of the technology. I think it is justified to say that 72% of the documents in
the QPs might as well not have been there. If people just put stuff there without using it for
anything, it might as well be closed down. The genre analysis however shows that it is
certainly possible to use the QP technology as a communication tool for doing work and
coordinating things. One might rather just observe that integrating an open technology such as
QP that can support many different genres of communication is not straightforward.
Obviously a lot of unsuccessful attempts are made to use the QP technology.

A top-down typology of documents


The cluster analysis has exemplified a bottom-up approach to a global model of
document life cycles in QP. Another way of approaching a global description of the life
cycles is to use a top-down approach where the criteria for dividing the sample of documents
into different types are explicit and controlled. In the following I shall present two such
attempts that will clarify some important aspects of document life cycles that are not captured
by the cluster analysis.
I will present one possible typology of document lifecycles. It is a hierarchical typology
made from a stepwise division of the document sample according to a number of criteria. It
divides the sample of documents first according to lifespan, then number of users, and finally
according to the presence of attached files. The types were generated using plain SQL into the
dokqp table exhibited in Appendix 6.


For each top-down type of documents, the average value of some of the properties was
calculated as a way of characterizing the documents of each type. These averages are
exhibited below.
Document type              % of document sample   Average values for properties
Dead documents             72,3                   Uploads=2.58, Downloads=0.24, Edits=0.12
Short term coordination    4,8                    Users=2.48, Uploads=1.50, Reads=4.45, Downloads=1.67, Edits=0.30
Personal archive           4,1                    Lifespan=12.95, Uploads=2.64, Reads=4.80, Downloads=2.12, Edits=0.90
Publish document           2,5                    Lifespan=29.41, Users=3.86, Reads=10.44, Edits=0.99
Publish files              16,8                   Lifespan=25.83, Users=5.58, Uploads=2.69, Reads=13.88, Downloads=8.17, Edits=0.81

The top-down typology identifies two types of documents not visible in the cluster
analysis. Firstly, it identifies short-term coordination. The short-term coordination documents
have a lifespan of less than one working day. Some of the documents involved in the
translation of the press release have this characteristic. On average they have 1,5
attachments and are read by 2,5 users. Secondly, the personal archive also exists in QP. 4% of
the documents in the document sample are only used by one user, and have an average
lifespan of 13 days.
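
Read as a decision procedure, the stepwise division by lifespan, number of users, and attached files could be sketched as follows. The thresholds are inferred from the type descriptions and from the definition of dead documents; they are assumptions, and the original SQL is not reproduced here. In particular, treating "one or more uploads" as the presence of attached files, and the function and parameter names, are hypothetical.

```python
def document_type(lifespan_days, users, uploads):
    """Classify a document life cycle top-down (assumed thresholds)."""
    # Step 1: lifespan of less than one day.
    if lifespan_days < 1:
        # Step 2: number of users.
        return "Dead document" if users == 1 else "Short term coordination"
    if users == 1:
        return "Personal archive"
    # Step 3: presence of attached files (assumed to mean at least one upload).
    return "Publish files" if uploads > 0 else "Publish document"


# Invented examples, one per type.
examples = [
    (0, 1, 1),   # created by one user and never touched again
    (0, 3, 1),   # several users within the same day
    (13, 1, 2),  # one user over roughly two weeks
    (29, 4, 0),  # several users, no attachments
    (26, 6, 3),  # several users, attachments
]
types = [document_type(*e) for e in examples]
print(types)
```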

The next model serves to clarify how the documents are divided according to how they
are edited. This model only includes documents with a lifespan of at least one day and more
than one user.

Document characteristics   % of document sample   Average values for selected properties
No edits                   12,7                   Lifespan=24.60, Users=5.10, Uploads=2.73, Reads=9.96, Downloads=7.67
Edited by one user         6,3                    Lifespan=29.31, Users=5.77, Uploads=2.22, Reads=19.58, Downloads=13.69, Edits=2.33
Edited by several users    0,3                    Lifespan=32.95, Users=6.95, Uploads=8.26, Reads=27.77, Downloads=16.95, Edits=4.05

The model shows that most documents are not edited at all. While representing 12,7% of
the total document sample they represent 66% of the documents with a lifespan of at least one
day and more than one user.
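
The 66% follows directly from the table: the three rows together make up the documents with a lifespan of at least one day and more than one user, and the unedited documents are their largest part. A small check of the arithmetic, using only the rounded percentages from the table (so the result is approximate; the variable names are illustrative):

```python
# Percentages of the total document sample, as reported in the table.
no_edits, one_editor, several_editors = 12.7, 6.3, 0.3

# Documents with a lifespan of at least one day and more than one user.
shared_documents = no_edits + one_editor + several_editors

share_no_edits = no_edits / shared_documents
print(f"{shared_documents:.1f}% of the sample; {share_no_edits:.0%} of these are never edited")
```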
The most striking observation of this analysis is the number of documents edited by
several users. One of the features of the QP technology is that it supports collaboration on
documents, where several users can edit the same document without the risk of overwriting
each other's changes. It can be used to co-author documents without e-mailing the document
back and forth. In the document sample this happens on 22 occasions or 1,9% of the documents with a
life cycle of at least one day and more than one user. It may be concluded that this feature of
QP is hardly used at Beta. The empirical data does not provide a specific explanation for this, but
one explanation can be ruled out. The respondents in the survey were asked whether they
wrote collaboratively with other people.

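[Figure: bar chart of survey responses to the question "Do you write collaboratively with other people?", with the response categories Often, Seldom and Never on the horizontal axis and the share of respondents (0% to 60%) on the vertical axis.]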

The reason is not that people don't write collaboratively; they just do it without the
support of QP.

Findings from clustering document lifecycles


A large share of the documents in the QPs is not used as one would expect in a technology for
communication. 72% of the documents created in QPs are dead. They are put in a QP and
never serve as a medium for communication. Also the QPs are used to a small degree as a
private archive. The number of dead documents can be interpreted in two ways, depending on the
intentions of the user who has put them in the QP. The first interpretation is exemplified by
the interview with Linda from GIC mentioned previously. She states that she shares
presentations with the other members of GIC. What actually happens is that they are put in
the QP and never used by anyone. The second interpretation is that the dead documents are
personal backups of documents residing on users' hard drives. There is no evidence in the
interview or survey data to confirm the second interpretation, but this might very well be
something that respondents and interviewees choose not to report. The first interpretation
supports the view that integrating QP in communication processes is a process of integrating
QP in existing genres of communication and possibly creating new genres. The dead documents are
examples of attempts to communicate via the QP without a genre of communication. Whether
some of the dead documents are personal backups or not, the observation based on the
interview with Linda from GIC can be generalized to the use of QP across the whole
organization.
In the analysis of the genre of translation and the meeting agenda, we observed that the
use of QP documents was very simple. All documents followed the pattern where the
document is created and a file is attached, and subsequently read and downloaded by a
number of users. None of the documents were changed after creation nor were more people
involved in changing them. From the typologies we can generalize this observation.
Only 22 documents out of the sample of 5826, or 0,3%, were edited by more than one
person. Thus QP is not used as a means for collaborative writing. This does not imply that
users of QP do not write collaboratively; they just use other communication technologies to
do it. Even though all interviewees complain about the e-mail system as a means of
coordinating the production of documents, they seem to use it anyway. It is also clear to the
interviewees that the functionality of QP is better suited to support collaborative writing. This
finding shows that theories of task-technology fit (Zigurs and Buckland (1998)) are not very
useful in understanding the use of QP and e-mail.
The clustering of document life cycles has shown that CMC log-analysis can serve as a
useful tool in a case study. The statistical generalizations of document life cycles have been
used to test the generality of findings from the interviews. Apart from that, the analysis has
also produced findings that are completely invisible to interviewees or respondents in the
survey. The fact that 72% of the documents are dead is not observable through interviews or
surveys. Observing it would have required cross-checking who had accessed each individual
document, which is not possible in practice.

Zipf’s law on documents and QPs


Until now this section on statistical generalizations has only dealt with descriptive
statistics. Common methods for descriptive statistics such as means, histograms showing
distributions etc., and k-means clustering for finding a natural grouping of document
lifecycles have been used to provide descriptions of the data material.
The quantitative research process is often described as consisting of two
phases: a discovery phase that has the purpose of generating hypotheses, followed by a phase
where the hypotheses are tested using inferential statistics. I referred earlier to inferential
statistics as a method for doing level-one generalizations.
This last part of the section on statistical generalizations will report an attempt to test
such a hypothesis – or rather two hypotheses. The two hypotheses are:
1. The use of documents in QPs is distributed according to Zipf’s law

2. The number of users in QPs is distributed according to Zipf’s law


While the testing of these hypotheses might seem a bit off the track of the research
question of this thesis, I have chosen to include them for two reasons. Firstly, they provide
interesting knowledge about the character of QP use. Secondly, they illustrate what kind of
results a purely quantitative method can provide in the analysis of QP use.
A lot of statistical methods are based on the mathematics of the normal distribution. In
inferential statistics the normal distribution curve is used to estimate the probability of a
sample being representative of a population. The assumption is that the mean of the
sample will typically be placed somewhere on the normal distribution curve of the population. This
enables the calculation of the probability that the mean of the sample is close to the mean of
the population. The normal distribution also fits a lot of natural phenomena. E.g. the
height of people in a population follows the normal distribution: a lot of people
have a height close to the mean, while equally small numbers of people are either much taller
or much shorter than the mean. The normal distribution does however not fit a number of
man-made and natural phenomena. These phenomena are instead distributed according to a
power-law distribution. A power-law distribution is a distribution where small instances are
extremely common whereas large instances are extremely rare.
George Kingsley Zipf gave one of the precise definitions of a power-law distribution. He
studied the frequency of word occurrences in the English language and found that the
frequency of occurrence of a word was inversely proportional to its rank. Zipf's law
states that:

$$y \sim \frac{1}{r^{b}}$$

The frequency y is approximately proportional to the inverse power b of the rank r, where b is a
constant. If the equation fits a set of data, the data will follow a straight line when plotted on
double logarithmic paper. The slope of the line will be –b.
Zipf's law has been tested on a number of other phenomena such as the population of cities
(see e.g. Gabaix (1999)). Lately a number of phenomena on the www have turned out to
follow Zipf's law. Huberman (2001) provides an overview of the research, which he names
the ecology of information. Among the results are that the number of visits to a site follows
Zipf's law (Adamic and Huberman (2000)), as does the number of pages within a
site (Huberman and Adamic (1999)). Zipf's law is also used to explain the way users surf the
web (Levene, Borges et al. (2001)).
The first hypothesis was tested by counting the number of uses (across all types of
actions) for each document in all of the QPs. The documents were ranked so that the
document with the largest number of uses was given rank 1. The number of uses was then
plotted as a function of the rank on a double logarithmic graph.

As the graph shows, the points almost fall on a straight line. The calculated Pearson
correlation is –0.971, where –1 would describe a perfect negative linear relationship.
The second hypothesis was tested by counting the number of unique users for each of the
QPs. The QPs were ranked in the same manner as the documents and the number plotted as a
function of the rank.

The Pearson correlation is –0.970, which is very close to the correlation calculated for the
documents.
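
The rank-and-correlate procedure used for both hypotheses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example counts are hypothetical, and the sketch assumes that the correlation is computed on the logarithms of rank and count, which the text above does not state explicitly.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def zipf_check(counts):
    """Rank counts (largest first) and compare log(rank) with log(count).

    Returns (correlation, slope) of the log-log rank/count relationship.
    Assumption: zero counts are dropped because their logarithm is undefined.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    log_rank = [math.log10(r) for r in range(1, len(ranked) + 1)]
    log_count = [math.log10(c) for c in ranked]
    correlation = pearson(log_rank, log_count)
    # Least-squares slope of log(count) on log(rank); under Zipf's law this is -b.
    mean_x = sum(log_rank) / len(log_rank)
    mean_y = sum(log_count) / len(log_count)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_rank, log_count))
             / sum((x - mean_x) ** 2 for x in log_rank))
    return correlation, slope

# Hypothetical use counts that follow count = 1000 / rank exactly (b = 1).
example_counts = [1000 / rank for rank in range(1, 51)]
correlation, slope = zipf_check(example_counts)
print(round(correlation, 3), round(slope, 3))
```

For counts that follow the law exactly, the correlation and the slope both come out at –1; real log data, such as the uses per document or users per QP, would be passed in as the list of counts instead of the constructed example.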
The two hypotheses are hereby confirmed by our analysis. Both the number of uses of
documents and the number of users per QP are distributed according to Zipf's law. The
question arising now is: "so what?" In the context of this thesis it does not seem like an
explanation of how QP is used at Beta. It completely ignores the social structures of the
organization, which according to the research question in this thesis, as well as in most IS
research, are considered crucial for understanding use of technology. Huberman and Adamic
(1999), Adamic and Huberman (2000), Huberman (2001) and others seem to treat it as a kind
of natural law of information.
Zipf's law has been applied in the economy of the www (Adamic and Huberman (2000))
to describe the structure of the market of web sites as a "winner-takes-all" market. It has also
been used to design caching algorithms for web-servers (Breslau, Cao et al. (1998), Breslau,
Cao et al. (1999)). The fact that the frequency of access to the documents on a web server is
distributed according to Zipf's law is used to decide which pages should be cached with the
purpose of increasing the performance of the web-server. A number of possible implications
of the confirmation of the Zipf's law hypotheses in our case can be drawn:
Zipf's law is independent of information architecture and centralized authoring.
It has previously been shown that the use of web pages on a web-server is distributed
according to Zipf's law. These analyses have been performed on web pages residing in an
information architecture and for web sites where the information distribution is centralized. In
the context of the QP technology there is no central information architecture, because each
QP has its own information architecture, and authoring is not centralized: the number of
authors is 394, or 11,9% of the total users.
Zipf's law questions task-technology fit
As we have seen in the section on previous research, a lot of research in IS either tries to
build theories on task-technology fit or hinges on the notion that there is an ideal fit between
the task at hand and the design of the technology. While not explicitly stated in the literature, a
task-technology fit theory of the use of QP would assume that there is an ideal relationship
between number of users and the use of a QP. According to this hypothesis the distribution of
the number of users in a QP would be something close to a normal distribution, where most QPs
would have roughly the same number of users and very few would have either very few or very many
users. This is refuted by the fact that the number of users follows a power-law distribution.
While this result does not refute the overall idea of task-technology fit as a way of
understanding the use of IT, it questions whether it is a good theory for understanding QP or
other technologies that are open for diverse kinds of usage.
Zipf's law might solve the archival problems of virtual workspaces
The Zipf distribution of web pages on a web site has been used to design caching
algorithms. The fact that document use is distributed in the same way might be used for archival
algorithms for virtual workspaces. One of the problems inherent in using virtual workspaces
in an organization, as well as a general problem in the knowledge management technology
discipline, is to control the archival of documents that might be valuable to reuse. There is no
archival process defined for QP use and therefore the process of reusing documents across
different projects is not supported. The fact that document use is distributed according to
Zipf's law might be used to design an archival algorithm. By automatically archiving the 25%
most used documents in QP one would archive all of the documents that have been used
by more than one user. It might turn out to be a quick way of solving a task that is otherwise
extremely complex because it requires the individual judgment of an archivist. Another
problem of centrally archiving documents for later reuse by someone else is that the
person who created the document does not have an incentive to spend time on the archival
process.
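
The archival idea can be made concrete with a small Python sketch. It is a hypothetical illustration, not an algorithm implemented in the study: the function name, the document ids and the use counts are invented, and only the 25% share is taken from the text.

```python
def select_for_archive(use_counts, share=0.25):
    """Return the ids of the most used documents, up to the given share.

    use_counts maps a document id to its number of uses (assumed to come
    from the log files); share is the fraction of documents to archive.
    """
    ranked = sorted(use_counts, key=use_counts.get, reverse=True)
    cutoff = int(len(ranked) * share)
    return ranked[:cutoff]

# Hypothetical use counts for eight documents.
uses = {"d1": 40, "d2": 1, "d3": 12, "d4": 1, "d5": 3, "d6": 1, "d7": 1, "d8": 2}
archived = select_for_archive(uses)
print(archived)
```

Because use is concentrated on a few highly ranked documents, a fixed share at the top of the ranking captures most of the actual reuse without anyone having to judge individual documents.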

After this short detour into the world of inferential statistics, we shall return to the
question of the relationship between properties of the technology artefact and the social
structures.

Classification practices in Quickplace


Up till now we have analyzed the primary use of QPs from different perspectives as a
medium for communication. One is the statistical generalization presented in the cluster
analysis; another is the analysis of specific instantiations of genres in which QP is used.
An important aspect of the QP technology is that it serves as a means of structuring
documents in different folders. It is recognized by several authors that labelling or
classification practices are important and in some respects overlooked aspects of
organizational life (Ashforth and Humphrey (1997)) and for society in general (Bowker and
Star (1999)). The field of CSCW recognizes classification or categorization as an important
issue in using technology to support what is termed “common information spaces” (Bannon
(2000), Schmidt and Christensen (2000), Schmidt and Israel (2000)).
A lot of research has been conducted on classification as a scientific discipline (see e.g.
Bailey (1994)), which provides methods and techniques for “good” classifications. Such
methods and techniques are also used extensively in the design of IT systems. The traditional
view of IT as a technology for information processing puts classification at the centre of
building IT systems. In the discipline of systems development the data model of an IT system
is crucial. It defines the aspects of the world that are represented in a system and thus also the
aspects that are ignored. It also orders these aspects in e.g. the object models of OOD (Object
Oriented Design) (Jacobson (1992)) or in the entity-relation diagrams of the structured
method. Some research has been done into the consequences of these classifications that are
built into IT-systems (Bowker and Star (1999)) and of the practical problem of getting users
to use the classifications (Shipman III and Marshall (1999)).
The QP technology is an example of a technology where some of the classification work
is left to the users. A QP does not prescribe a certain classification of documents. It provides a
default structure as presented earlier but it does not insist on this structure.
Defining the folder structure of a QP could be seen as an enabling process for the
primary use observed through e.g. the analysis of genres of communication. Besides the
management of which users have access to which parts of a QP it is the most important
enabling process for using the QP.
It is the purpose of this section to understand the process of structuring the folders in the
QPs and how it is related both to the practice of using it for different genres of
communication and to the classifications and classification processes surrounding the QP. A
model for understanding the process of structuring documents in the QP or any collaborative
technology that includes classifications is suggested. The classification practice is an
important element in establishing what we have earlier termed technology-in-practice using
the concept from Orlikowski (2000). First the theoretical model is presented and thereafter the
empirical evidence for the model is presented. While the empirical evidence is limited to the
QP technology at Beta, it is proposed as a generic model that can be used for other
technologies and certainly other organizational settings.

The functional model


By means of the application of Stinchcombe's (1968) generic model for functionalist
analysis, the folder-structure in QP is conceptualized as a homeostatic variable, dependent on
both structural factors (e.g. media design, organizational structures) and external disturbances
(e.g. changing work tasks).
A functionalist model is chosen because the means of maintaining the folder structure
vary while the goal remains similar. Stinchcombe (1968) proposes to select functional
explanations for social phenomena with the following characteristics:
"Whenever we find uniformity in the consequences of action but great variety of the
behaviour causing those consequences, a functional explanation in which the consequence
serves as a cause is suggested." p. 80.
As we shall see, the way folder structures are created and maintained varies across the
three folder structures analyzed, but the consequence is always a folder structure.
It is proposed to view users' classifications in computer media - in this case the folder
structures of Lotus Quickplaces - as the dependent variable H in a functional relationship with
two independent variables S and D. D is short for disturbances, or the set of factors that will
disturb the existing equilibrium and cause S (the structural mechanisms) to interact with H
(the homeostatic variable) and find a new equilibrium between the structural mechanisms and
the classifications. Classification can thus be understood as a functional relation between H
(the folder structures), D (the disturbances) and S (the structural mechanisms) in constant
search for equilibrium. There is no stable equilibrium, but at each time t an equilibrium
exists.

The reason for viewing the folder structures as a homeostatic variable in a functional
relationship is to provide a view of classification as something other than a more or less
scientific discipline of classifying entities into the best structure.
The functional relationship introduces the dynamics observed in the QP folder
structures. The environmental conditions for the folder structure change over time. This either
causes the folder structure to change or produces an unstable situation. The mere statement
that it is a functional relationship however provides limited insights. The strength of the
model is tested in the identification of the specific structural factors and the disturbances.
Introducing a functional model in addition to the classic discipline of classification is
also done because the virtues of the librarian's and the scientist's classifications are not the most
important ones here. The model is introduced based on the hypothesis that a good folder structure
(classification) is one that aims at equilibrium between the folder structure, the structural
factors and the disturbances. This quality criterion is much more important than the virtues of
consistency and stability known from the classification discipline (Bailey (1994)).
By selecting a functional explanation we also assume that given the same structural
factors and the same disturbances, the homeostatic variable - the resulting folder structure -
will be similar.
The purpose of the following will firstly be to show how the empirical data support the
relationship between the structural factors, the disturbances and the homeostatic variable.
Secondly, the purpose will be to identify the relevant structural factors and disturbances for
the folder structures. Without this identification the model would be too generic and only
describe very few aspects of the data.
For the analysis the three exemplars GIC, NP_solo-ID and IC are chosen, partly because
they were chosen for the genre analysis, partly because of the data available from interviews
in which the process of structuring the folders is described.

The structural factors and disturbances


The specific structural factors and types of disturbances are introduced before the
analysis of the dynamics of the model. The three structural factors are: media design,
organizational classifications, and genres of communication.
Media Design:
The design of the medium in which the folder structure exists represents a very concrete
structural factor. Some important aspects were captured in the comparative analysis of virtual
workspace technologies. These include the functionality of the virtual workspace. In QP a
very specific structural factor is the fact that the QP technology only supports one level of
folders, combined with the possibility of creating sub-rooms. Another is the concept of
documents and the possibility of attaching multiple files to each document. The comparative
analysis of virtual workspaces also showed that the metaphors of the user interface should be
included as structural factors.
Organizational Classifications:
Organizational classifications are the classifications surrounding the users of the QP.
The organizational diagram is an important classification that is visible in GIC. Another is the
classification of tasks in a project that can be observed in NP_solo-ID. Organizational
classifications, though, encompass much more than those visible in e.g. the organizational
diagram. Organizational classifications might be captured by the notion of code in semiotic
theory (Eco (1976)) or "symbol scheme" in Goodman (1976).
Genres of communication:
The third structural factor is the genres of communication (and genre systems) supported
by the QP and thus also the folder structure. The genres of communication capture the
communication processes in which the QP is embedded. The work of genres as a structural
factor can be observed in a very direct sense in IC where the genre of translation is reflected
in the naming of folders such as "Press releases - in progress" and "Press releases - info". Of
course the genres are not always reflected directly in the folder structure.
The types of disturbances in the model are new documents, new work tasks, and new
members. Disturbances generally represent the changes happening in the environment of the
folder structure that cause instability.
New documents:
As it is well known from the personal folder structures of a PC, new documents are
introduced in an existing structure and some of the documents do not seem to fit. Over time
this will create instability in the equilibrium. The users might not explicitly note the instability
at first as it happens gradually over time. In the period where a folder name remains constant
the fact that new documents are put in a folder over time changes the interpretation of the
folder.
New work tasks:
The introduction of new work tasks that are to be supported by the QP is an obvious
disturbance that can be observed e.g. in IC, where the translation of press releases is
introduced as a new task supported by the QP.
New members:
New members (or users) of the QP also introduce a disturbance. They disturb in two
ways. Firstly, they disturb the equilibrium because they interpret the folder structure
differently from the existing users. Secondly, they may produce instability because they
introduce new documents in the QP and new work tasks to be supported by the folder
structure.
The disturbances are very closely related. The introduction of a new work task that
should be supported by the QP will typically include new documents and perhaps also new
members. Also new members may introduce new documents and tasks.

The dynamics of the folder structure


After the introduction of the structural elements, the dynamics of the functional model
will be illustrated by observations made on the empirical data. The data consists primarily of
the history of the folder structures derived from the log files. These are visualized in appendix
12. Also the interviews are used.
The first observation that supports the functional model is that the folder structures
change in the log period. The first table shows a simple count of the number of folders.

QP           Folders on   Folders on    Discontinued folders   Folders created after June 1
             June 1       January 10    from June 1            and active on January 10
IC           45           36            18                     9
GIC          69           74            13                     18
NP_solo-ID   37           56            5                      24

Columns one and two are snapshots of the number of folders on June 1, a bit less than
a month after the beginning of the log period, and on January 10, a little more than a
month before the end of the log period. Column three counts the number of folders registered
on June 1 that are deleted in the period up to January 10. Column four counts the number of
folders created after June 1 that are still active on January 10. This gives a crude picture of
the amount of change to the folder structures, ignoring the folders that have been created and
discontinued between the two dates. The table enables us to calculate a change index that
measures the degree of lasting changes in the folder structure. The changes are lasting
because the index ignores folders that are created and quickly deleted again.
$$\text{Change index} = \frac{\text{New folders} + \text{Deleted folders}}{\text{Initial folders}}$$
The change index of IC is 0,60. For GIC it is 0,45 and for NP_solo-ID it is 0,78. A
change index of 0 would indicate a stable folder structure and an index of 1 would indicate a
radical change, e.g. that the number of folders was doubled. Based on the change index scores,
it is safe to say that none of the folder structures in the three QPs is stable.
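
The change index can be recomputed directly from the folder counts in the table. The following Python sketch is only an illustration; the function and variable names are chosen here and do not come from the study.

```python
def change_index(new_folders, deleted_folders, initial_folders):
    """Lasting changes (created plus deleted folders) relative to the initial count."""
    return (new_folders + deleted_folders) / initial_folders

# (folders created after June 1 and still active, folders discontinued, folders on June 1)
folder_counts = {
    "IC": (9, 18, 45),
    "GIC": (18, 13, 69),
    "NP_solo-ID": (24, 5, 37),
}

indices = {qp: round(change_index(*counts), 2) for qp, counts in folder_counts.items()}
print(indices)

# A QP whose number of folders doubles without any deletions scores 1.
print(change_index(new_folders=45, deleted_folders=0, initial_folders=45))
```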

The following descriptions of how the three QP folder structures change serve to
illustrate the process of reaching equilibrium in the functional relationship.

The NP_solo-ID folder structure


The NP_solo-ID folder structure is characterized by a continuous increase in complexity.
The number of folders increases from 37 to 56 in the period. The folder history map in
appendix 12 shows that even though the rate of change is higher in some periods than in
others, e.g. at the beginning of July, the changes are not organized as a major reorganization
of the QP, as is the case with both IC and GIC. The project manager (of the project supported
by the QP), who is also the initiator of the NP_solo-ID, explains in the survey that the
management of folders is distributed to people responsible for different parts of the project.
Thus different sub-rooms are created, each with a manager responsible for that sub-room. The
organization of the maintenance of the equilibrium as a distributed and decentrally managed
process explains the pattern of continuous change in the folder structure over time.
The continuous change of folders and the naming of folders indicate that folders are
created as new tasks in the project are started. As an example folders named "IT-Production",
"Processess-general", and "processes-corporate" are created in early October. Probably the
two folders about processes refer to the production processes and business processes related
to the IT-system, which the project is designing and implementing. The definition of business
processes related to an IT-system is a standard element in an IT project at Beta. The "IT-
Production" folder probably refers to the descriptions on how the new IT-system should be
deployed in the existing IT-production environment. It is notable that these tasks are not
foreseen in the folder structure at the outset but are created as the needs arise. This very well
illustrates how new tasks and thus also new documents are introduced as disturbances and
result in a change of the folder structure.
The effects of the media design on folder structures can also be observed in NP_solo-ID.
A QP only supports one level of folders. Multiple levels of folders are only possible
through the creation of sub-rooms. In July it seems that the limits of having just one
level of folders and the need for sub-categorizations lead to the creation of a number of
sub-rooms. The QP changes from something consisting primarily of folders in one
level to something consisting of different sub-rooms. At this point the administration of the
sub-rooms is also delegated, as reported by the project manager. The change also includes
renaming a number of existing rooms and folders, e.g. from "pre-study docs" to "pre-study room".
Some sub-rooms were started before the shift but were then renamed to reflect the strategy.

NP_solo-ID is, as explained in the section on the three exemplars, used as a project
repository. This explains the gradual increase in the number of folders. The old documents in
the QP continue to stay relevant as documentation while new documents arrive on new
subjects, thus creating the need for more folders. Another observation supports that the
NP_solo-ID is a project repository and that this affects the structure of the QP. Besides the
creation and deletion of folders, the movement of documents from one folder to
another and the deletion of documents are important to understanding the development of the
folder structure. The number of documents that are moved around in NP_solo-ID is
significantly higher than in the other two QPs. In NP_solo-ID 138 documents were active in the
period. Of these, 44 or 32% were moved. 28 documents were deleted in the period. The
following graph shows the number of new folders created and the number of document moves
per week from week two (week no. 20) of the log period until week 51.

[Figure: NP_solo-ID new folders and moves. Number of new folders and number of document moves per week, weeks 20 to 51.]

We can observe a close correlation between the creation of new folders and document
moves. This indicates that new folders also change the relevant classification of the existing
documents. A new folder is therefore created as a reactive action to instability caused by new
documents introduced in the QP. While some folders are created proactively as e.g. the folder
structure created initially by the manager, folders in NP_solo-ID are as often created
reactively.
Like the maintenance of the folder structure, the moving of documents is distributed among
several users. Three users perform the document moves: the project manager
has performed 21 moves, a second user 20 and a third user 3.

The IC folder structure


The folder structure of the IC QP is simplified in the log period in contrast to GIC and
NP_solo-ID. The number of folders decreases from 45 to 36. This simplification happens
even though the number of weekly users increases significantly from an initial level of 6 - 8
users to a final level of 25 - 30 users. Also the translation of press releases and the translation
of the internal magazine are introduced as new work tasks in the QP in the period where the
folder structure is simplified.
The introduction of the new work task of translating press releases is reflected in the
major reorganization of IC. Three new folders are created: "Press releases - FINAL:", "Press
releases - in progress:", and "Press releases - info:".
As concluded in the detailed analysis of the IC QP, it is used for supporting a number of
work processes. These processes are repeated several times during the period of logging. This
is reflected in the fact that there are 538 deletions of documents over the log period. Thus the
QP is not used for archiving the documents used for coordinating the translation processes.
While documents in NP_solo-ID are moved around as the equilibrium changes over time, IC
only has 7 document moves.
The folder structure is maintained centrally by the manager:
“I am the manager for the sub-rooms and I create them. Then I create on an ad-hoc
basis structures for the individual tasks we have.” Interview IC QP manager Lone
The maintenance of the folder structure is done primarily in a major reorganization
carried out by Lone on August 13, 2001. The major reorganization is followed by a dramatic
increase in the number of users using the QP.

[Figure: IC weekly users. Number of users per week, weeks 23 to 53 and weeks 1 to 7, with the major reorganisation of the IC QP marked.]

The correspondence between the increase in weekly users and the major reorganization
indicates that the reorganization is done proactively as a preparation for new work tasks. The
creation of new folders named "Press-releases..." also supports this. The decrease in the
number of folders also indicates a reactive reorganization. Some folders have turned out not
to be used or are not used anymore.
While the maintenance of folders is centralized, the deletion of documents, which is an
important task related to the maintenance of the folder structure, is distributed among 18 users.
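The two measures used here, the number of distinct users per week and the number of users who perform deletions, can be derived from log records along the following lines. This is a minimal sketch and not the procedure actually used in the study: the record layout, the user names and the action labels are hypothetical, and ISO week numbering is an assumption that may differ from the week numbering in the charts.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records: (timestamp, user, action). The layout and the
# values are illustrative assumptions, not the format of the Beta logs.
records = [
    ("2001-08-06 09:15:00", "user_a", "read"),
    ("2001-08-07 10:02:00", "user_a", "edit"),
    ("2001-08-08 11:30:00", "user_b", "read"),
    ("2001-08-13 08:45:00", "user_a", "create_folder"),
    ("2001-08-14 14:20:00", "user_b", "delete"),
    ("2001-08-15 16:05:00", "user_c", "read"),
]


def weekly_users(records):
    """Return {(iso_year, iso_week): number of distinct users}."""
    users_per_week = defaultdict(set)
    for timestamp, user, _action in records:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        iso_year, iso_week, _weekday = moment.isocalendar()
        users_per_week[(iso_year, iso_week)].add(user)
    return {week: len(users) for week, users in sorted(users_per_week.items())}


def users_performing(records, wanted_action):
    """Return the set of users who performed the given action at least once."""
    return {user for _timestamp, user, action in records if action == wanted_action}


counts = weekly_users(records)
deleters = users_performing(records, "delete")
```

Counting distinct users per week rather than requests keeps a single very active user from dominating the measure, which matters when the question is how widely the QP is adopted.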


Figure: Folder creations and document deletions. Weekly counts of new folders and deletions (y-axis, 0 to 80), weeks 20 to 51 (x-axis).

The chart does not exhibit the same relationship between change in folder structure and
document deletions, as was the case with the relationship between changes in folder structures
and document moves in NP_solo-ID.
In contrast to NP_solo-ID the maintenance of the folder equilibrium is not done
continuously and distributed but through a major reorganization and some minor adjustments.
Thus the timing of re-establishing the equilibrium after the introduction of disturbances
differs significantly between IC and NP_solo-ID.

The GIC folder structure


The GIC QP is an example of a folder structure that never reaches equilibrium.
According to the functional model this would be explained by the absence or partial absence
of one of the structural factors. The GIC QP has by far the most complex folder structure, and
the complexity increases over the log period from 69 to 74 folders. When the complexity of
the folder structure is compared to two other observations the lack of equilibrium becomes
apparent.
The number of folders is between 69 and 74 in the period of logging. In the same period
68 different documents were active in the GIC QP. Of these 8 had more than one user. This
leaves us with 8 documents that could be characterized as parts of a genre or genre system in
a folder structure that is far more complex. One would suspect that even in extreme cases one
or more documents per folder would make sense.


The increase in the complexity of the folder structure happens concurrently with a
decrease in activity both in terms of overall activity and in terms of the number of weekly
users.
The folder structure of GIC can be interpreted according to the functional relationship as
an example of the absence of genres of communication as a structural factor to guide the
folder structure of the QP. Except for the meeting agenda and holiday list genre analyzed in
the section on genres of communication, there seems to be no guidance from the genres in the
structuring of QP.
Linda - one of the managers of the GIC QP - addresses specifically the issues concerning
the folder structure of the QP. The following quotes can qualify the interpretation of the GIC
folder structure:
“Whether organizing after subject is a better idea…the question is then whether one can
agree on what a specific subject covers.”
“We use communication plans quite a lot. Typically that involves more than one section.
And that would be hard to put in.”
Linda mentions problems with the fact that the GIC QP is structured according to the
organizational diagram. What characterizes the two quotes is that she discusses the use of the
QP in the mode of hypotheses. The interviews with the manager of NP_solo-ID and IC both
describe the creation and maintenance of the folder structure as something related to real
events and not hypothetical situations. Her own understanding of the problem with the folder
structure is that it is hard to decide between principles of structuring. The one possibility -
according to her - is the organizational diagram, the other is to structure it according to
subject. The problem underlying the apparent issue of choosing the "right" principle of
structuring the folders is clarified in the following quote.
“I think even if we have all sat down and discussed it [the structuring of the
NordicPlace] this time, we would not have agreed, because we didn’t have an idea of what it
was. Maybe we didn’t have an idea of what type [of information] should be out there [in the
NordicPlace].”
The fact that no genres of communications are established that utilize the GIC QP
produces the problem of deciding on a relevant folder structure.
As in NP_solo-ID and IC there is more than one manager. As in NP_solo-ID, more
than one person makes changes to the folder structure. Linda addresses this as a problem.
“And I think personally that there is a problem in being four managers. Well it takes a
lot of coordination there…because the one…well it requires that you have an agreement on
how you wish to structure it, so suddenly some folder appears that doesn’t fit in with the
rest.”


What is not conceived as a problem in NP_solo-ID is a problem in GIC. Again the best
explanation is that the genres of communication are lacking as a structural factor.
As in IC, the GIC QP undergoes a major restructuring in the period of logging, and as in IC
the major reorganization corresponds to a significant increase in the number of weekly users.

Figure: Weekly users GIC. Number of users per week (y-axis, 0 to 70), weeks 23 to 53 and 1 to 7 (x-axis); the major reorganisation of the GIC QP is marked.

Unlike IC, the increase in weekly users is not permanent. The week after the
reorganization the number of users drops to a lower average level than before. While the IC
reorganization was correlated with the introduction of new work tasks and thus also new
documents, this is not the case in GIC.

The GIC QP is primarily structured according to the organizational diagram of the GIC
organizational unit. Analyzed as a static classification, the folder structure of the GIC QP is
the most consistent of the three. It classifies according to the organizational diagram of the
GIC department. This illustrates very clearly the point of introducing the folder structures as a
functional relationship. The criteria of consistency and completeness are not important in their
own right. The virtue of a folder structure in a QP is that it supports genres of
communication and uses terms that are part of the surrounding classifications in the
organization. According to the interviews the GIC folder structure appears as the one done
with the most systematic and careful considerations of the three. The problem is that the
structural factor of genres of communication is not there to guide it. Expressed in everyday
language, they don't know what they should use the QP for.


The librarian's or the scientist's approach to classification is to create a structure that is
stable across time. This makes perfect sense in a setting where you build a library or a
taxonomy for animals. These virtues of the librarian are not of much use in understanding the
folder structures of a QP or any other collaborative technology.

Summary of the folder structures


The analysis of the processes of creating and maintaining the folder structure in the three
QPs as well as the related tasks of moving and deleting documents has illustrated how the
functional model can explain the change of folder structures over time.
The reason for introducing a functional model was that the processes involved in
establishing an equilibrium after a disturbance differed. The IC and NP_solo-ID exhibited
quite different approaches to the maintenance of the folder structure. It differed in the way it
was organized: in IC, it was done centrally by a manager and in NP_solo-ID it was
distributed among more managers. It also differed in the timing: in NP_solo-ID the
maintenance of the equilibrium was a continuous process while IC had a major reorganization.
The analysis of GIC exhibited the strength of the functional model compared to
traditional approaches to understanding classifications. Judged by the merits of a librarian the
GIC folder structure was the most consistent of the three. The problem was that it was not
used. The functional model explained the folder structure that could not reach equilibrium
because the structural factor of genres of communication was not present to establish it.
The functional model has been applied to Lotus Quickplace at Beta but it is suggested as
a generic model for virtual workspaces and other computer media that involve structuring
units of information involved in communication.
The functional model is primarily suggested as a way of explaining the development of
folder structures in virtual workspaces and other computer media. It explains a central aspect
of establishing the technology-in-practice in a virtual workspace but can also work as a
guideline for designing folder structures in virtual workspaces. The classical virtues of
classification practices are stability and consistency, and neither of these is important in a
virtual workspace. The folder structure should be seen as something dynamic that
continuously adapts to the disturbances guided by the structural factors. The functional model
also suggests that the folder structure is related closely to the genres of communication in
which the virtual workspace is integrated. This aspect can specifically be formulated as a
guideline for establishing technologies-in-practice. The design and maintenance of the folder
structure is closely related to the continuous enactment and negotiation of the genres of
communication and is not something that should be handled by a librarian.


The functional model does not address an important finding from the analysis of genres
of communication in QP. In most genres multiple media are involved. With respect to the
issue of structuring information in computer media or in general in IT systems it points to the
fact that the structures present in different media affect each other. This is addressed in the
following section, which presents a model for how different functional relationships used to
explain structures in specific media affect each other.

Levels of functional relationships


As we have seen in the analysis of genres of communication in QP, the use of QP is
intertwined with other media because they support the genre concurrently. The genre of
translation utilized at least both e-mail and QP.
Not only through the genres of communication but also in the functional relationship of
the folder structure is QP closely dependent on and related to other media. The folder
structure of the QP is functionally related to the structural factors and disturbances discussed
above. It is however also dependent on structures of related IT-systems. The structure of the
Intranet is an example of a functional relationship that affects the functional relationship of
the QP folder structure. The structure of the personal e-mail folders of the people who use the
QP is another.

Figure: Levels of functional relationships. Organization level: central IT-system, Intranet structure. Group level: QP structure. Individual level: PC folders, e-mail folders, paper archives.

This is illustrated in the model. The model suggests distinguishing between three
relevant levels of functional relationships. The first level is the structures relevant to the
individuals in an organization. These include the folder structure of the PC that can be
understood using the same functional relationship as the QP structure (with slightly modified
disturbances and structural factors). The next level is the group of people using the QP. The
third level is the structures that exist across the organization. This includes the structure of the
Intranet and the structure of the central IT-system. The structure of the Intranet is directly
observable while the structure of the central IT-system is observable in what is referred to as
the "data model" at Beta. This "data model" is an Entity-Relationship diagram e.g. specifying
which information is stored for each employee in the organization.
The model thus provides an overview of the relationship between the different functional
models, which apply to each individual structure. It provides an overview of the infrastructure
of classifications embedded in technologies in an organization and specifies that not only the
structural factors identified in the functional model for virtual workspaces affect the creation
of folder structures. The model thus takes the functional relationship beyond the scope of one
technology used by a group and applies it to the whole organization.


Conclusions and implications


The conclusions and implications section of this thesis is divided into three main parts.
The first part deals directly with how the observations presented answer the research question.
The second part will summarize my experiences with log analysis. As stated in the
introduction an important purpose of the work underlying this thesis has been to explore the
possibilities of using log analysis to study computer mediated communication and to perform
case studies in general. The last part of the conclusion is a smaller section on practical
implications of the observations. While the primary aim of the thesis has been to contribute to
research, it has nevertheless produced operational insights, which I believe are relevant for
designers, implementers, and users of virtual workspaces.

The comparative analysis of virtual workspaces described a standardized technology for
communication open to many different settings and types of use. The different virtual
workspace products hardly varied in functionality but varied in the metaphors used and the
approach chosen to model the anticipated use situation.
Manufacturers market virtual workspaces as "instant collaboration", but it was difficult
to observe any indications of "instant" collaboration at Beta. The impression left from the
study at Beta was rather that of a gradual adoption of the technology and a high number of
unsuccessful attempts in the process. On the level of QPs, it was observed that more than one
third of the QPs started were hardly used. While the explanations for this could be many, it
indicates that starting a QP is by no means "instant". On the document-level the 72% dead
documents also clearly indicate that adopting a QP is not "instant", and produces much
overhead in terms of unsuccessful use attempts.
The use of QP was characterized as very simple in relation to the possibilities offered by
the technology. Only in 0.3% of the document life cycles studied did more than one person
edit the same document. The analysis of the three instantiations of genres showed that e-mail
was used as the primary medium for routing work. These indications might point to an
explanation, which is connected with the genre theory. One might characterize the use of QP
as simple with a low degree of sensitivity towards the use of the medium. The users are
simply insecure about how other users react and are insecure about how to precisely interpret
that someone is putting a particular document in a particular folder in the QP at a particular
point in time. Because e-mail has been in the organization for much longer than QP, users
have developed a greater sensitivity in interpreting e-mails. This characteristic of the use
of QP and other computer media is captured by the genre theory and the underlying theory of
structuration. QPs are mediating genres of communication and the users establish a genre
through continuous enactment of the genre. When QP is introduced, the genres change during
this continuous enactment: QPs are gradually integrated in certain genres and eventually create
new genres.
Two other observations support this. Perhaps most strikingly, 72% of all documents
were characterized as dead documents. They were simply placed in a QP and never used
again. Although other explanations may account for some of the dead documents they support
the finding that many attempts are made at using QP to communicate which are not successful
because the QP has only partly been integrated in genres of organizational communication. It
also indicates that users explore the new medium in a trial-and-error manner. Although there
is no direct evidence in the observations it might be the case that new genres emerge in this
trial-and-error manner. Not all genres are planned and then executed, as for example is the
case with the translation genre, and it would be an interesting subject for further research to
study how individual genres might emerge without conscious planning by the participants.
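The two document-level indicators referred to above, the share of dead documents and the share of life cycles in which more than one person edits the document, can be computed along the following lines. This is a minimal sketch under stated assumptions: the life cycles, user names and action labels are hypothetical, a document is treated as dead when nothing follows its creation, and the creator is counted among the editors. The thesis may operationalize both notions differently.

```python
# Hypothetical life cycles: document id -> time-ordered list of (user, action).
life_cycles = {
    "doc1": [("anna", "create")],
    "doc2": [("anna", "create"), ("ben", "read"), ("ben", "edit")],
    "doc3": [("carl", "create")],
    "doc4": [("anna", "create"), ("anna", "edit")],
}


def is_dead(events):
    """Assumed rule: a document is dead if no event follows its creation."""
    return len(events) <= 1


def editors(events):
    """Users who changed the content (assumption: creating counts as editing)."""
    return {user for user, action in events if action in ("create", "edit")}


total = len(life_cycles)
dead_share = sum(is_dead(events) for events in life_cycles.values()) / total
multi_editor_share = sum(
    len(editors(events)) > 1 for events in life_cycles.values()
) / total
```

Both shares are sensitive to the chosen rules. If a read by the uploader alone did not count as use, for instance, more documents would be classified as dead.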
The present study has presented two interesting findings in relation to existing uses of
genre theory in the study of the use of IT in organizations. Firstly, the study of the three
instantiations of genres has shown that genres or genre systems span multiple media. The
detailed study of the use of QP for, for example, the translation genre showed that QP was
used together with e-mail and probably also telephone and face-to-face communication.
Secondly, the use of log analysis for the study of genre instantiations illustrated a useful
viewpoint for studying the enactment of genres and understanding in detail the role played by
the computer medium. We will deal with the second finding in the next section.
The close integration of different media in the enactment of a genre also points to a
general precaution for the study of how computer media are integrated in work practice.
Understanding this integration requires an understanding of the supplementing and
overlapping media available to users. While this observation is not new, it has perhaps not
been taken seriously enough. It suggests that studies of individual computer media should be
supplemented by studies of infrastructures of computer media. An infrastructure of computer
media is the suite of individual computer media available to a group of people, for example,
the members of an organization or a project. The notion of infrastructure should be considered
a supplementary research object to that of the individual medium when conducting empirical
research as well as in the theories of how media are adopted in an organization.
The research question formulated in the introduction to this thesis was the following:

How do the properties of an IT artefact for communication in a group of people interact
with the existing social structures in the organization and result in a work practice in
which the IT artefact plays a role?


Orlikowski's distinction between the technology as an artefact and the technology-in-practice
was used as a basic distinction for conceptualizing the relationship between the IT
artefact and the social structures. It was also used to structure the empirical part of the thesis.
The comparative analysis of virtual workspaces characterized the IT artefact independently of
the setting in which it is adopted and used. The case study on the adoption and use of the
Lotus Quickplace technology served to characterize and understand how the technology-in-
practice emerges as the artefact is embedded in a work practice.
The distinction between technology as an artefact and the technology-in-practice has
proven to be a useful distinction in the study of the QP technology and its adoption and use. It
is useful because it denies the direct connection between the properties of the artefact and the
resulting patterns of use inherent in both the technological and organizational imperative as
well as in the adoptions of structuration theory in, for example, adaptive structuration theory.
Adaptive structuration theory mistakes categories when it postulates that social structures are
built into the IT artefact. If we wish to understand the adoption of technologies such as virtual
workspaces, it is crucial to distinguish the social structures in the organization (i.e. the genres
of organizational communication) from the social structures that have effects on the properties
built into the IT artefact. This can be observed from the present study in two ways. Firstly, the
design process of virtual workspaces is disconnected from the organization in which it is
adopted. Virtual workspaces are software products designed by a software company to be
deployed diversely both in terms of the organizational setting and the work process. Secondly,
as virtual workspaces are generic products there is no direct relation between the properties of
the artefact and the genres of communication in which it can function as a medium. It would
be meaningless to postulate a direct connection between the properties of the virtual
workspace and the use of the technology in a specific combination with related media in
specific genres of communication as can be observed in, for example, the translation of press
releases.
The theory of genres of organizational communication was selected as the specification
of the social structures relevant for understanding the adoption of virtual workspaces at Beta.
Genres of organizational communication were used to conceptualize how virtual workspaces
are adopted in a setting and provided a useful explanation of how some virtual workspaces are
integrated in work practices and others are not. The purpose of adopting genre theory was not
to map the communication patterns in QP at Beta, but rather to use it as a framework for
understanding in detail how the technology is adopted in a specific setting. The findings in
relation to genre theory have been dealt with earlier.


Selecting genres of communication as the concretization of social structures relevant for
understanding the emergence of technology-in-practice of course means ignoring other
perspectives such as the structures of power or the concept of community. This is not to imply
that the structures of power or the concept of community are irrelevant. It is merely a
selection made because genres of communication are considered important and because an
in-depth empirical study can only deal with a certain level of complexity in terms of the aspects
studied.
In addition to the integration of the IT artefact in genres of communication, one other
aspect of the relationship between the properties of the IT artefact and the social structures of
the organization was investigated. The study of the dynamics of the folder structures provided
this other perspective that in a more specific way addressed the relationship between the
properties of the IT artefact and the social structures of the organization. The functional
model for understanding how folder structures develop over time provided a better
understanding of the dynamics than traditional theories of classification. It also provided
another perspective on the relationship between properties of the IT artefact and the
technology-in-practice.
The analysis of maintaining a folder structure is an example of a central activity in
establishing the technology-in-practice. The functional model developed links the process of
developing a folder structure with genres of communication. The process of integrating the
use of QP in genres of communication might be characterized as an example of an activity,
which Bøving and Pedersen (2002) have characterized as end-user design. The reason for
denoting it end-user design is that this process is considered a design process in other settings.
Work-flow systems can provide an example of this. The introduction of a work-flow system
typically consists of a process of modelling a work-process. This model is then built into a
computer system that supports this model and allows a supervision of the work process. The
translation of press releases is an example of a work process which, had it been performed
often enough, could be a potential candidate for building a work-flow system. In the case of
the work-flow system, the process is characterized as design, while in the case of QP one
would traditionally denote it use. The distinction then becomes primarily based on the
involvement of work process designers. If expensive designers are involved it is characterized
as design, but when the same design is performed by end-users themselves it is merely use.
Thus this distinction tends to ignore the processes of maintaining a folder structure and
establishing genres of communication - or work-flows - in virtual workspace technologies.
Hiding this process is dangerous for the adoption of virtual workspaces, because it is a
process necessary for integrating a QP into a work practice, and the term end-user design
serves to highlight the role of the end-users. Introducing end-user design transforms the
end-users from objects that will gradually adopt a technology in more or less predictable patterns to
agents that actively design the integration of the IT artefact in a work setting.
The introduction of the concept of end-user design might be problematic because it blurs
the distinction between what it is like to design IT-systems and what it is like to work and
communicate using IT-systems. However, the analysis of virtual workspaces has at least
shown that the distinction between design and use has shifted considerably from the
traditional world of customized IT-systems such as a work-flow system. Some of the
processes previously handled by professional designers are now left to the end-users and
whether they should be denoted use, adoption, appropriation or end-user design is a question
for debate. At least it should not be ignored.

The value of log analysis for CMC studies


The study presented in this thesis has experimented with the use of log file analysis as a
means for understanding computer-mediated communication. The purpose of this section is to
summarize and evaluate the experiences with the analyses.
The primary quality of the log file is its rigorous detail. Like other quantitative data it
captures very few aspects of technology use in a systematic manner that allows analysis
across many different specific settings.
Existing methods for the analysis of HTTP-log files have focused on one user's interaction
with the web-technology using the analytical unit of a session. A session is defined as a
specific user's interaction with a web application (or visit to a web site) that is typically
defined negatively through a preceding and subsequent period of inactivity by the user. Session
analyses have, for example, been used in HCI to provide input for the improvement of
user interfaces. The method that has been developed and used in the context of this case study
uses the HTTP-log to analyze communication mediated by the web-technology. It does this
by exchanging the analytical unit of session with that of a document life cycle that captures
different users' actions on the same document or resource over time.
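The difference between the two analytical units can be sketched as two ways of grouping the same log entries. The example below is illustrative only and does not reproduce the tools used in the study: the parsed entries, user names and resource paths are hypothetical, and the 30-minute inactivity threshold is an assumed convention for delimiting sessions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, already-parsed HTTP log entries: (timestamp, user, resource).
entries = [
    (datetime(2001, 8, 13, 9, 0), "anna", "/ic/press-release-1.doc"),
    (datetime(2001, 8, 13, 9, 5), "anna", "/ic/press-release-2.doc"),
    (datetime(2001, 8, 13, 13, 0), "ben", "/ic/press-release-1.doc"),
    (datetime(2001, 8, 14, 10, 0), "anna", "/ic/press-release-1.doc"),
]


def sessions(entries, timeout=timedelta(minutes=30)):
    """Session view: split each user's requests wherever inactivity exceeds timeout."""
    per_user = defaultdict(list)
    for moment, user, resource in sorted(entries):
        user_sessions = per_user[user]
        # Continue the user's latest session if the previous request is recent enough.
        if user_sessions and moment - user_sessions[-1][-1][0] <= timeout:
            user_sessions[-1].append((moment, resource))
        else:
            user_sessions.append([(moment, resource)])
    return dict(per_user)


def document_life_cycles(entries):
    """Life cycle view: collect all users' requests per document, in time order."""
    cycles = defaultdict(list)
    for moment, user, resource in sorted(entries):
        cycles[resource].append((moment, user))
    return dict(cycles)


by_user = sessions(entries)
by_document = document_life_cycles(entries)
```

In the session view the three requests to the first press release are spread over three separate sessions of two users. In the life cycle view they form one sequence, which is what makes the hand-over between users visible.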
The document-based HTTP-analysis was used for two very different analyses. On the
one end it was used as a basis for producing statistical analyses across all document life
cycles in the QPs at Beta. On the other end it was used for the detailed analysis of specific
instantiations of genres of communication. While the statistical analysis of documents was
solely based on the log file data, the analysis of genre instantiations involved both interviews
and log file analysis.

Because the log-data are systematically collected by the HTTP-server and because they
are structured so that computer-based analysis tools can manipulate them, they lend
themselves to quantitative analysis. The types of document life cycles generated from the
cluster analysis and the top-down analysis are examples of using data mining techniques for
looking for global models in the data. In terms of statistics they merely used descriptive
statistics. The methods from mathematical (inferential) statistics were not used to assess the
generality of the result. In one case a quantitative hypothesis was tested using inferential
statistics. The hypothesis that the number of uses of documents and number of users in QPs
were distributed according to Zipf's law was tested and the Pearson correlation was calculated
as a way of assessing the reliability of the correlation found in the data. The distribution
according to Zipf's law was an interesting finding in its own right but it is a finding that is a
bit off the track of understanding the role of Lotus Quickplace at Beta.
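One common way to examine a Zipf-like distribution is to rank the items by frequency and compute the Pearson correlation between the logarithm of the rank and the logarithm of the frequency; a value close to -1 indicates a near-linear relationship on the log-log scale. The sketch below illustrates this check only. The use counts are hypothetical, and it is an assumption that the correlation reported in this study was computed on the log-log scale.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(variance_x * variance_y)


def zipf_log_log_correlation(counts):
    """Correlate log(rank) with log(count) after sorting counts in descending order."""
    ordered = sorted((count for count in counts if count > 0), reverse=True)
    log_ranks = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    log_counts = [math.log(count) for count in ordered]
    return pearson(log_ranks, log_counts)


# Hypothetical use counts that follow count = 120 / rank exactly.
uses_per_document = [120, 60, 40, 30, 24, 20]
r = zipf_log_log_correlation(uses_per_document)
```

A strong log-log correlation is an informal indication rather than a rigorous goodness-of-fit test, since many heavy-tailed distributions look roughly linear on this scale.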
The log analysis has proven useful in relation to understanding the research question
connected with two concepts that address the relation between the technology and social
structure. Firstly, it was used for investigating genres of organizational communication in QP.
Secondly, it was used as a basis for understanding the dynamics of folder structures, which is an
important part of establishing technology-in-practice.
In both analyses of instantiations of genres and the dynamics of folder structures it has
proven useful as a means of triangulating. In the case of genres, descriptions of genres of
communication were triangulated with indications of how the QP was used in a specific
instantiation of that genre. This provided a very precise analysis of the role of the technology.

In general, triangulation of interpretations of use provided through interviews and
accounts of use through log analysis significantly improves the researcher's interpretation of
interview statements. The log files serve as a way of understanding the practical
consequences of a description made by an interviewee. As this analysis has shown, analyzing
the practical consequences in the log files has changed the interpretation of the interviews
radically. One example is the statement made by Linda (one of the managers of GIC) that "we
share presentations". The initial interpretation based on the interview is that they upload
PowerPoint presentations in the QP, which are then downloaded by others for reuse. As the
log analysis showed "we share presentations" should rather be interpreted so that someone is
uploading presentations with the intention that others might use them. The problem is that
nobody downloaded the documents. They are of the type dead documents that cover 72% of
documents across all QPs and 87% of documents in GIC. Another example comes from the
interview with Lone, the manager of IC and manager of the translation unit using IC. The
interview leaves the impression that QP is used as the tool to coordinate the translation
process. The result of the log-analysis gives us a rather different picture of this. It turned out
that e-mail was the most important medium for coordinating the translation. The IC QP had
but a secondary role.
Approaching the evaluation of the use of a communication technology by interviewing
the users has turned out to be very problematic. The use of QP is so intertwined
with other media, in this case e-mail in particular, that interviewing people about a specific
technology is not a very good approach if one wishes to understand how the technology is
used and integrated in a work practice. The interview situation, which is framed to be about
the use of a technology, seems to make the interviewee focus on that aspect and ignore other
technologies involved. Interviews on the usage of one medium tend to exaggerate the role of
that technology.
The exaggeration of the role of the technology is also a general methodological finding.
In order to research the role of a technology in an organization it is necessary to address the
whole set of technologies available to users - in the case of virtual workspaces, the
technological infrastructure available for communication. The triangulation of interviews and
log analysis has shown that the design of the interview guides in this case has produced
answers that in some cases would have been misleading had they been used as the only
empirical data.
Log analysis turned out to be useful for another kind of triangulation as well. It can
provide a perspective on use that cannot be captured through interviews, because the
interviewee has no information on how the technology is used beyond his or her personal
experience. The document life cycle is a perspective on use that is practically impossible to
extract from interviews or direct observations of use. A very specific example was the finding
that 72% of documents were dead. Another is that although virtual workspaces are
characterized as tools for collaboration and coordination, most documents had a very simple
life cycle. They were created, and after that a number of users would read them. Only in 22 cases
were documents edited by more than one person.
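As a minimal sketch of how such life-cycle figures can be derived from document-based log
records, the following Python fragment computes the share of dead documents and the number of
documents edited by more than one person. The record layout (user, document, action), the
action names, and the sample data are assumptions made for illustration; they do not reproduce
the format of the QP logs analyzed in this thesis. The working definition of a dead document
used here (no read event logged at all) is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical log records: (user, document, action).
# The field layout and the action names are illustrative assumptions.
records = [
    ("anna", "agenda.doc", "create"),
    ("bo", "agenda.doc", "read"),
    ("carl", "agenda.doc", "edit"),
    ("anna", "slides.ppt", "create"),
    ("bo", "notes.doc", "create"),
    ("bo", "notes.doc", "edit"),
]


def summarize_life_cycles(records):
    """Return (share of dead documents, number of multi-editor documents)."""
    readers = defaultdict(set)
    editors = defaultdict(set)
    documents = set()
    for user, document, action in records:
        documents.add(document)
        if action == "read":
            readers[document].add(user)
        elif action in ("create", "edit"):
            # Assumption: the creator counts as the first editor.
            editors[document].add(user)
    if not documents:
        return 0.0, 0
    # Assumption: a document is "dead" if no read event was ever logged for it.
    dead = [d for d in documents if not readers[d]]
    multi_edited = [d for d in documents if len(editors[d]) > 1]
    return len(dead) / len(documents), len(multi_edited)


dead_share, multi_edited_count = summarize_life_cycles(records)
print(f"dead share: {dead_share:.2f}, multi-editor documents: {multi_edited_count}")
```

Figures of this kind describe life cycles only. Relating them to a work practice still
requires the interviews.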
The techniques of data mining can produce patterns and link different observations that
would otherwise have been invisible. The analysis of the folder structures uses the
history of folder structures and combines it with user activity and document moves and
deletions. These data were all derived from the log files. They were then triangulated
with interviews on how the folder structure was maintained and what the QP was
generally used for. Linking these would probably never have occurred without the techniques of
data mining. The functionalistic model for explaining folder structures is a theory that would not
have been possible without data mining of the log files. In principle it would be possible to
observe the history of the folder structure by visiting the QP daily or weekly through the
whole period, but it could not have been related to numbers on the activity. A study of folder
structures using observation and interviews would probably have focused on documenting the
structure as well as interviewing the person responsible for the changes and inquiring into the
reasons for making them.

While the log analysis was used for understanding the role of technology in genres of
communication, another sociological method lends itself to an investigation through
document-based log analysis. Communication between users in a virtual workspace can be
used for drawing social networks. Social network analysis (Harary, Norman et al. (1965)) is a
method for describing the structure of a social setting, and it can be based on many
different types of data. The document-based log analysis provides excellent data for drawing
social networks based on who is accessing the same documents in a virtual workspace. Social
network analysis is used, for example, as a method for analyzing the concept of community in
computer media (Wellman, Salaff et al. (1996)). Some experiments with drawing the social
network of the three QPs have been conducted in the course of the work on this thesis. No final
results are presented here, however, because time ran out and because social network analysis
lends itself to the investigation of social structures other than genres of communication, i.e. the
concept of virtual community.
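The idea of drawing a social network from who accesses the same documents can be sketched as
follows. This is an illustration under stated assumptions, not the experiments conducted for
this thesis: the (user, document) record layout, the user names, and the choice of weighting a
tie by the number of shared documents are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical access records: (user, document). Names are illustrative only.
accesses = [
    ("anna", "agenda.doc"),
    ("bo", "agenda.doc"),
    ("carl", "agenda.doc"),
    ("anna", "budget.xls"),
    ("bo", "budget.xls"),
    ("dora", "memo.doc"),
]


def co_access_network(accesses):
    """Weight each pair of users by the number of documents both accessed."""
    users_by_document = defaultdict(set)
    for user, document in accesses:
        users_by_document[document].add(user)
    edges = defaultdict(int)
    for users in users_by_document.values():
        # Sorting gives every undirected pair a single canonical key.
        for pair in combinations(sorted(users), 2):
            edges[pair] += 1
    return dict(edges)


network = co_access_network(accesses)
for (first, second), weight in sorted(network.items()):
    print(first, second, weight)
```

A user who only accesses documents that nobody else touches, such as the hypothetical "dora"
above, has no ties in the resulting network.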

A new type of data for case studies


The log files used in the document-based analysis could be characterized as systematic
indications of computer-mediated communication. They contain aspects of the use of a
technology to support communication between people. Analysis of log files should therefore
be understood as a method for observing CMC alongside other methods for observation such
as document analysis and participant observation. Participant observation of media use that is
distributed temporally and geographically is very hard to perform in practice and requires a
lot of resources.
Log files might be characterized as a new type of data for case study research. Yin
(1994) identifies six sources of evidence or data relevant to performing case studies, which
were presented earlier in the section on research method. These were:
• Documentation
• Archival Records
• Interviews
• Direct observation
• Participant-observation
• Physical artefacts
Log files can be characterized as "new" in the sense that they are not covered by Yin's
categories and in the sense that they have not been used for case study research in IS.
Log files share characteristics with both direct observations and archival records.
Archival records are characterized by Yin (Yin (1994) p. 80) as:
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• exact - contains exact names and details of an event
• broad coverage - long span of time, many events and many settings
• precise and quantitative
All of these properties, except perhaps "exact", are also properties of a log file. The
analysis of HTTP-log files has shown that it is often difficult to link the log file data to names
and details that are part of the social discourse in the organization.
Log files differ from archival records in one very important way: they are not the
product of an intentional archiving process by members of the organization studied. In the
case of the HTTP-log, the log file is a product partly of the de facto HTTP-log standard and
partly of the technical architecture of the Lotus Quickplace technology. In this sense log files
are very different from archival records. An example of archival records used in this study is
the archive of all applications for opening a Lotus Quickplace sent to the technical manager of
the Lotus Quickplace server.
It might be better to characterize the HTTP-log as a kind of direct observation. Yin
characterizes direct observations as:
• reality - covers events in real time
• contextual - covers context of the event
Clearly, HTTP-log files capture only a very limited aspect of events, and they do not
capture what Yin calls the context of the event. Log files have another characteristic that is
very important: the data are structured computer records. This means that they are directly
available for analysis using data mining techniques. In sum, we can characterize log files by
the following properties:
• reality - covers events in real time
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• broad coverage - long span of time, many events and many settings
• precise and quantitative
• structured data records - analyzable through data mining techniques
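To illustrate what "structured data records" means in practice, the sketch below turns one line
of an HTTP log into a record with named fields. It assumes the widely used Common Log Format.
Whether the Quickplace server logs analyzed here followed exactly this layout is not stated, and
the sample line, user name, and path are invented for the example.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A size of "-" means that no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical sample line; the user name and path are illustrative only.
sample = (
    '10.0.0.1 - anna [15/Jan/2002:10:30:00 +0100] '
    '"GET /workspace/agenda.doc HTTP/1.0" 200 5120'
)
print(parse_line(sample))
```

Once every line is a record of this kind, the records can be loaded into a relational database
or analyzed with data mining techniques.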


The downside of log analysis


Up till now the focus has been on the qualities of log files as a data source. As with
other data types, log files have limitations of both a methodological and a practical nature.
HTTP-log files and other log files are, in contrast to archival records, not produced
intentionally by the organization. They are designed as part of an IT-system for different
purposes, such as being able to reverse transactions on a bank account. The use of log files
to understand social practices, as is done in this thesis, is not intended in the design of the log
file. This means that the analysis of log files is sometimes cumbersome and sometimes
impossible due to the design of the log file. One consequence of this in this study was that in
most cases no link existed between the contents of what was communicated and the
information in the log file. Only in the case of attached files used in the genre analysis was it
possible to relate the information in the log file to a social meaning of the information.
The problem of relating social meaning of information to information in the log file puts
severe constraints on the interpretation of the results of the log analysis. This is e.g.
observable in the global models produced for document usage in this thesis. The types of
document life cycles generated from the cluster analysis should not be interpreted as patterns
that are directly related to concepts of social practice such as genres of communication. A
typified document life cycle can cover very diverse uses in terms of the social setting in
which it is integrated. The documents involved in the genre of translation and the documents
involved in the genre of the meeting agenda are all characterized as the same document type
in the cluster analysis. Characterizing the types of document life cycles directly as genres of
communication or any other socially defined concept such as coordination or collaboration
would be a category mistake.
Apart from the methodological challenge of using log analysis, a more practical
challenge is central. Using log analysis in case studies requires knowledge of statistical
analysis, data mining, and relational databases. There are many pitfalls in the process of
preparing data and analyzing data which can render the results misleading or useless.
Paradoxically, in this situation it requires data analysis techniques even to discover that errors
have been made.
The necessary skills are in many cases not present in case researchers who are used to doing
qualitative analysis. In the research leading to this thesis, much new knowledge on
computing and statistical analysis had to be acquired along the way, which has
been very time consuming. Denzin and Lincoln (2000) mention different types of
triangulation in social research, one of which is "investigator triangulation". Investigator
triangulation basically means that different researchers with different knowledge and
backgrounds collaborate on collecting and analyzing data. Using log analysis in case studies
calls for investigator triangulation simply because experienced interview researchers are
seldom experienced data miners at the same time.

The research presented in this thesis has shown that log analysis is a method for
investigating computer mediated communication, whether in organizational settings or not,
which should be taken seriously. It not only lends itself to the testing of quantitative
hypotheses, but can also be applied in case or field studies as a means of triangulation. Log
analysis can help solve the problem of direct observation of CMC. Direct observation is used
in combination with interviews and other reflections on computer use in settings where the
use is not distributed temporally and geographically. Direct observations can be used to
analyze the gap between what people say and do. In situations where this is not possible, I
recommend that researchers "Mine the gap".

Practical implications
The purpose of this section is to provide implications of the study of the QP technology
at Beta, which can be of practical value to people who are either constructing virtual
workspace technologies or are planning to implement them.

The design of virtual workspaces


The comparative analysis of virtual workspace technologies concluded that in terms of
functionality, the different products were more or less the same. The main differentiators were
characterized by two related concepts: the aspects of the anticipated use situation that were
modelled, and the metaphorical landscape that was used. These concepts were used for
purely analytical purposes but could also be used consciously in the design of virtual
workspaces. As an example of the different approaches to modelling the anticipated use
situation, the modelling of projects was compared. While it was not investigated empirically
in the Beta study, it is very likely that had Beta chosen a virtual workspace that
explicitly modelled projects in the user interface, this would have had an impact on the adoption.
It is likely that Lotus Quickplace would not have been used to support organizational units, as was
the case with the GIC QP, had it modelled the concept of the project explicitly in the user
interface. Therefore the decision to model specific aspects of an anticipated use situation
should be considered a strategic choice and be dealt with systematically in the design and
marketing of virtual workspaces.
The technique of producing a metaphorical landscape for a virtual workspace (and for
other technologies as well) was also introduced in the section describing the virtual
workspace technology. It was suggested that it could be used in software development as a
technique for systematic reflection on design choices. Not only the root metaphor used but
the entire metaphorical landscape produced from the organization and naming of interface
elements is important for the use patterns that emerge. In a generic technology such as virtual
workspaces, which can support a multitude of settings and genres of communication, the
associations produced by users as a consequence of the metaphors are important.

The log analysis has been used in this thesis to analyze the use of QP in order to
understand how it is used at Beta. The information gained from log analysis might also be
used directly by the users and by the virtual workspaces themselves. The analysis of document
use according to Zipf's law suggests that the distribution of the use of documents can be used
for making decisions on which documents to delete or archive and which documents to keep
in the folder structure. The fact that 72% of all documents were never used could be useful
information to users for reflecting on the use of the virtual workspace. As an example, the
BSCW system (Bentley, Horstmann et al. (1997), Appelt (1999)) displays the number of
readers who have accessed a document as an inherent property of the document. Other
statistics of document usage could, for example, be used as a basis for maintaining folder
structures. Given that 72% of the documents are never used, the number of documents is not a
good indication of the value of a folder. Displaying document usage statistics for each folder
in a virtual workspace would serve as a better basis for considering a change in the folder
structure. Making use of information about other users' activity in the design of IT systems has
previously been suggested by e.g. Dieberger, Dourish et al. (2000).
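A per-folder usage display of the kind proposed above could be computed along the following
lines. This is only a sketch: the mapping from documents to folders, the list of read events,
and all names are hypothetical, and a real virtual workspace would draw these data from its own
logs.

```python
from collections import Counter, defaultdict

# Hypothetical inputs: which folder each document is in, and one list entry
# per logged read event. All names and counts are illustrative assumptions.
folder_of = {
    "agenda.doc": "Meetings",
    "minutes.doc": "Meetings",
    "slides.ppt": "Presentations",
    "old_slides.ppt": "Presentations",
    "draft.ppt": "Presentations",
}
read_events = ["agenda.doc", "agenda.doc", "minutes.doc", "slides.ppt"]


def folder_usage(folder_of, read_events):
    """Return per-folder counts of documents, unused documents, and reads."""
    reads = Counter(read_events)
    stats = defaultdict(lambda: {"documents": 0, "unused": 0, "reads": 0})
    for document, folder in folder_of.items():
        entry = stats[folder]
        entry["documents"] += 1
        entry["reads"] += reads[document]
        if reads[document] == 0:
            entry["unused"] += 1
    return dict(stats)


for folder, entry in sorted(folder_usage(folder_of, read_events).items()):
    print(folder, entry)
```

In this invented example the folder with the most documents also holds the most unused ones,
which is the situation in which a plain document count is misleading.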

Adoption of virtual workspaces


The use of the theory of genre systems of communication for understanding the adoption
of virtual workspaces generates more practical guidelines for users who wish to use a virtual
workspace. The theory suggests that when a group of users start up a virtual workspace they
should focus on the genres of communication in which they wish to integrate the virtual
workspace. Instead of thinking about which documents to put in a virtual workspace, thereby
viewing it as an archive, contemplating the document life cycle and the communication
situation will provide a better starting point for understanding what the technology is useful
for.
The analysis of QP use has also shown the importance of understanding virtual
workspaces as an element in a communication infrastructure alongside other media such as
the telephone and e-mail. When the use of a virtual workspace is agreed upon, it is crucial to
think of it as part of an infrastructure and its use should be defined in relation to e-mail, the
Intranet, etc.
The analysis of how technologies-in-practice emerge as virtual workspaces are
integrated in genres of communication suggests a conscious approach to starting a virtual
workspace. The adoption of a virtual workspace should be seen as a design process where
existing genres of communication are changed and integrated with the technology. There is
no specific use inscribed in the properties of the technology, and it is the responsibility of the
group of users to define in which ways the virtual workspace should be integrated in their
work practice.

The functional model for understanding the dynamics of folder structures carries some
consequences for how users of virtual workspaces should think about and maintain a folder
structure. The folder structure is a "living" structure, which requires constant re-working.
Both centralized and distributed approaches to managing the folder structure seem to work, and
in both approaches a change log for the virtual workspace visible to all users could provide a
simple means of communicating changes and help the users to adjust their personal
interpretations of folder contents. The change log would document the changes made to the
structure and the arguments for making them.

"My lord, facts are like cows. If you look them in the face hard enough, they generally
run away."
Dorothy L. Sayers

References

Adamic, L. A. and B. A. Huberman (2000). "The nature of markets in the World Wide Web."
Quarterly Journal of Electronic Commerce 1(1): 5-12.
Andersen, J., R. S. Larsen, et al. (2000). Analyzing Clickstreams Using Subsessions. ACM
Third International Workshop on Data Warehousing and OLAP(DOLAP'00),
Washington DC, ACM.
Appelt, W. (1999). WWW Based Collaboration with the BSCW System. SOFSEM'99,
Milovy, Czech Republic, Springer Lecture Notes in Computer Science.
Ashforth, B. E. and R. H. Humphrey (1997). "The ubiquity and potency of labeling in
organizations." Organization Science 8(1): 43-58.
Bailey, K. D. (1994). Typologies and taxonomies : an introduction to classification
techniques. Thousand Oaks, Calif., Sage Publications.
Bakeman, R. (1992). Understanding social science statistics : a spreadsheet approach.
Hillsdale, N.J., L. Erlbaum.
Bannon, L. (2000). Understanding common information spaces in CSCW. Workshop on
common information spaces. Copenhagen.
Bannon, L. and S. Bødker (1997). Constructing Common Information Spaces. European
conference on Computer Supported Cooperative Work, Lancaster, UK, Kluwer
Academic Publishers, Netherlands.
Bansler, J., J. Damsgaard, et al. (2000). "Corporate Intranet Implementation: Managing
Emergent Technologies and Organizational Practices." Journal of the Association for
Information Systems 1(10).
Barley, S. R. (1986). "Technology as an Occasion for Structuring - Evidence from
Observations of CT Scanners and the Social-Order of Radiology Departments."
Administrative Science Quarterly 31(1): 78-108.
Benbasat, I., D. K. Goldstein, et al. (1987). "The Case Research Strategy in Studies of
Information Systems." MIS Quarterly 11(3): 369-386.
Bentley, R. and P. Dourish (1995). Medium versus mechanism: Supporting collaboration
through customisation. Xerox. London.
Bentley, R., T. Horstmann, et al. (1997). "The World Wide Web as enabling technology for
CSCW: The case of BSCW." Journal of Computer Supported Cooperative Work 6(2-3): 111-134.
Berners-Lee, T. and M. Fischetti (1999). Weaving the Web : the original design and ultimate
destiny of the World Wide Web by its inventor. San Francisco, HarperSanFrancisco.
Bowker, G. C. and S. L. Star (1999). Sorting things out : classification and its consequences.
Cambridge, Mass., MIT Press.
Bradner, S. (1996). The Internet Standards Process -- Revision 3, IETF. 2002.
Breslau, L., P. Cao, et al. (1998). On the Implications of Zipf's Law for Web Caching. 3rd
International WWW Caching Workshop.
Breslau, L., P. Cao, et al. (1999). Web Caching and Zipf-like Distributions: Evidence and
Implications. IEEE INFOCOM, New York.
Büchner, A. G. and M. D. Mulvenna (1998). "Discovering Internet Marketing Intelligence
through Online Analytical Web Usage Mining." SIGMOD Record 27(4): 54-61.
Büschner, M., S. Gill, et al. (2001). "Landscapes of Practice: Bricolage as a Method for
Situated Design." Computer Supported Cooperative Work 10(1): 1-28.
Bøving, K. B. (2001). Datastructuring, Standards, and Knowledge Work. Proceedings of the
24th Information Systems Research Seminar in Scandinavia, Ulvik, Norway,
Department of Information Science, University of Bergen.
Bøving, K. B. (2001). "Digitalt samarbejde via Virtual Workspaces." Internethåndbogen,
Børsens forlag.
Bøving, K. B. (2002). Digitale web-baserede samarbejdssystemer. Digital genopbygning af
den offentlige sektor. København, Forvaltningshøjskolen.
Bøving, K. B. and L. H. Pedersen (2002). Design for Dummies: Understanding Design Work
in Virtual Workspaces. PDC2002, Malmö, Sweden.
Callon, M. and J. Law (1989). "On the Construction of Sociotechnical Networks: Content and
Context Revisited." Knowledge and Society 8: 57-83.
Carmel, E. (1995). "Cycle-time in packaged software firms." Journal of Product Innovation
Management 12(2): 110-123.
Carmel, E. and B. J. Bird (1997). "Small is beautiful: a study of packaged software
development teams." Journal of High Technology Management Research 8(1): 129-148.
Carmel, E. and S. Sawyer (1998). "Packaged software development teams: what makes them
different?" Information Technology & People 11(1): 7-19.
Conklin, J. and M. Begeman (1988). gIBIS: A Hypertext Tool for Exploratory Policy
Discussion. CSCW'88, Portland, Oregon, Association of Computing Machinery.

Cooley, R., J. Srivastava, et al. (1997). Web mining: Information and pattern discovery on the
world wide web. 9th IEEE International Conference on Tools with Artificial
Intelligence (ICTAI'97), Newport Beach, California.
Cooley, R., P.-N. Tan, et al. (1999). WebSIFT: The Web Site Information Filter System.
Proceedings of the 1999 KDD Workshop on Web Mining, San Diego, CA, Springer-Verlag.
Daft, R. L. and R. H. Lengel (1986). "Organizational Information Requirements, Media
Richness and Structural Design." Management Science 32(5): 554-571.
Daft, R. L. and N. B. Macintosh (1981). "A Tentative Exploration into the Amount and
Equivocality of Information-Processing in Organizational Work Units." Administrative
Science Quarterly 26(2): 207-224.
Davidson, E. J. (2000). "Analyzing genre of organizational communication in clinical
information systems." Information Technology & People 13(3): 196-209.
Deetz, S. (2000). Conceptual Foundations. The New Handbook of Organizational
Communication. F. M. Jablin and L. Putnam, Sage Publications: 3 - 46.
Denzin, N. K. (1989). The research act : a theoretical introduction to sociological methods.
Englewood Cliffs, N.J., Prentice Hall.
Denzin, N. K. and Y. S. Lincoln (2000). Handbook of qualitative research. Thousand Oaks,
Calif., Sage Publications.
Desanctis, G. and R. B. Gallupe (1987). "A Foundation for the Study of Group Decision
Support Systems." Management Science 33(5): 589-609.
Desanctis, G. and M. S. Poole (1994). "Capturing the Complexity in Advanced Technology
Use - Adaptive Structuration Theory." Organization Science 5(2): 121-147.
Dieberger, A., P. Dourish, et al. (2000). "Social Navigation: Techniques for Building More
Usable Systems." Interactions 7(6): 36-45.
Divitini, M. and C. Simone (2000). "Supporting Different Dimensions of Adaptability in
Workflow Modeling." Computer Supported Cooperative Work 9(3/4): 365-397.
Dix, A. (1997). "Challenges for Cooperative Work on the Web: An Analytical Approach."
Journal of Computer Supported Cooperative Work 6(2-3): 135-156.
Eco, U. (1976). A theory of semiotics. Bloomington, Indiana University Press.
Engeström, Y. (1987). Learning by expanding : an activity-theoretical approach to
developmental research. Helsinki, Orienta-Konsultit Oy.
Engeström, Y., R. Miettinen, et al. (1999). Perspectives on activity theory. Cambridge ; New
York, Cambridge University Press.

Fiedler, K. D., V. Grover, et al. (1996). "An Empirically Derived Taxonomy of Information
Technology Structure and Its Relationship to Organizational Structure." Journal of
Management Information Systems 13(1): 9-34.
Fjermestad, J. and S. R. Hiltz (1998-1999). "An Assessment of Group Support Systems
Experimental Research: Methodology and Result." Journal of Management Information
Systems 15(3): 7-149.
Fjermestad, J. and S. R. Hiltz (2000). Case and Field Studies of Group Support Systems: An
Empirical Assessment. 33rd Hawaii International Conference on System Sciences,
Hawaii, IEEE.
Fjermestad, J. and S. R. Hiltz (2000). "Group support systems: A descriptive evaluation of
case and field studies." Journal of Management Information Systems 17(3): 115-159.
Fowler, M. and K. Scott (1997). UML distilled : applying the standard object modeling
language. Reading, Mass., Addison Wesley Longman.
Fu, Y., K. Sandhu, et al. (1999). Clustering of Web Users Based on Access Patterns.
Proceedings of the 1999 KDD Workshop on Web Mining, San Diego, CA, Springer-Verlag.
Gabaix, X. (1999). "Zipf's law for cities: an explanation." Quarterly Journal of Economics
114(3): 739-767.
Gallupe, R. B., G. Desanctis, et al. (1988). "Computer-Based Support for Group
Problem-Finding - an Experimental Investigation." MIS Quarterly 12(2): 277-296.
Garton, L. and B. Wellman (1995). "Social Impacts of Electronic Mail in Organizations: A
Review of the Research Literature." Communication Yearbook 18: 434-453.
Giddens, A. (1984). The Constitution of Society, Berkeley University of California Press.
Glymour, C., D. Madigan, et al. (1996). "Statistical Inference and Data Mining."
Communications of the ACM 39(11): 35-41.
Goodman, N. (1976). Languages of Art, Hackett Publishing.
Groth, L. (1999). Future organizational design : the scope for the IT-based enterprise.
Chichester, England ; New York, John Wiley & Sons.
Grudin, J. (1991). "The Development of Interactive Systems: Bridging the Gaps Between
Developers and Users." IEEE Computing 24(4): 59-69.
Grudin, J. (1994). "CSCW: History and Focus." IEEE Computing 27(5): 19-26.
Grudin, J. (1994). "Groupware and social dynamics: Eight challenges for developers."
Communications of the ACM 37(1): 92-105.
Gunter, B. (2002). The quantitative research process. A Handbook of Media and
Communication Research. K. B. Jensen. London, Routledge: 209-234.

Guzdial, M., J. Rick, et al. (2000). Recognizing and Supporting Roles in CSCW. CSCW
2000, Philadelphia, PA, ACM.
Habermas, J. (1981). Theorie des kommunikativen Handelns. Frankfurt am Main, Suhrkamp.
Hamilton, A. (2000). "Metaphors in theory and practice: the influence of metaphors on
expectation." ACM Journal of Computer Documentation 24(4).
Hand, D. J., H. Mannila, et al. (2001). Principles of data mining. Cambridge, Mass., MIT
Press.
Harary, F., R. Z. Norman, et al. (1965). Structural models: an introduction to the theory of
directed graphs. New York, Wiley.
Hidber, C. (1998). Online Association Rule Mining. Berkeley, CA, International Computer
Science Institute, University of California at Berkeley.
Huberman, B. A. (2001). The laws of the Web : patterns in the ecology of information.
Cambridge, Mass., MIT Press.
Huberman, B. A. and L. A. Adamic (1999). "Growth Dynamics of the World Wide Web."
Nature 401(131).
Hughes, J., D. Randall, et al. (1991). CSCW: Discipline or Paradigm? Second European
Conference on CSCW (ECSCW'91), Amsterdam.
IBM (2002). Lotus Quickplace Homepage. 2002.
IBM (2002). Research Lists IBM Lotus Software As A Leader In Team Collaboration Tools.
Cambridge, Mass.
Jablin, F. M. and L. Putnam, Eds. (2000). The new handbook of organizational
communication : advances in theory, research, and methods. Thousand Oaks, Calif.,
Sage Publications.
Jacobs, I. (2001). World Wide Web Consortium Process Document, W3C (World Wide Web
Consortium). 2002.
Jacobson, I. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach,
Addison-Wesley.
Jarvenpaa, S. L. (1989). "The Effect of Task Demands and Graphical Format on Information-
Processing Strategies." Management Science 35(3): 285-303.
Jensen, K. B. (2000). Interactivities: Constituents of a Model of Computer Media and
Communication. Moving Images, Culture, and the Mind. I. Bondebjerg. Luton,
University of Luton Press.

Jensen, K. B. (2002). The complementarity of qualitative and quantitative methodologies in
media and communication research. A Handbook of Media and Communication
Research. K. B. Jensen. London, Routledge: 254-272.
Jensen, K. B., Ed. (2002). A Handbook of Media and Communication Research. London,
Routledge.
Jensen, K. B. (2002). The qualitative research process. A Handbook of Media and
Communication Research. K. B. Jensen. London, Routledge: 235-254.
Kaplan, B. and D. Duchon (1988). "Combining Qualitative and Quantitative Methods in
Information-Systems Research - a Case-Study." MIS Quarterly 12(4): 571-586.
Karjalainen, A., T. Päivärinta, et al. (2000). Genre-Based Metadata for Enterprise Document
Management. Hawaii international Conference on Systems Sciences, Hawaii, IEEE.
Klein, H. K. and M. D. Myers (1999). "A Set of Principles for Conducting and Evaluating
Interpretive Field Studies in Information Systems." MIS Quarterly 23(1): 67-94.
Kling, R. (1996). Computerization and controversy : value conflicts and social choices. San
Diego, Academic Press.
Kunz, W. and H. W. J. Rittel (1970). Issues as elements of information systems.
Latour, B. (1991). Technology is Society Made Durable. Essays on Power, Technology and
Domination. J. Law. London, Routledge.
Lea, M., T. O'Shea, et al. (1995). "Constructing the Networked Organization: Content and
Context in the Development of Electronic Communications." Organization Science
6(4): 462-478.
Leavitt, H. J. and T. L. Whisler (1958). "Management in the 1980s." Harvard Business
Review 36(6): 41-48.
Lee, A. S. (1991). "Integrating Positivist and interpretive approaches to organizational
research." Organization Science 2(4): 342-365.
Lee, A. S. (1994). "Electronic Mail as a Medium for Rich Communication - an
Empirical-Investigation Using Hermeneutic Interpretation." MIS Quarterly 18(2): 143-157.
Lee, A. S. and R. Baskerville (2001). Generalizing Generalizability in Information Systems
Research: 1 - 49.
Leuf, B. and W. Cunningham (2001). The Wiki way : quick collaboration on the Web.
Boston, Addison-Wesley.
Levene, M., J. Borges, et al. (2001). "Zipf's law for web surfers." Knowledge and Information
Systems 3(1): 120-129.
Lyytinen, K. (1992). Information Systems and Critical Theory. Critical Management Studies.
M. Alvesson and H. Willmott. Newbury Park, CA, Sage: 159-180.

Lyytinen, K. and O. K. Ngwenyama (1992). "What does computer support for cooperative
work mean? A structurational analysis of computer supported cooperative work."
Accounting, Management and Information Technologies 2(1): 19-37.
MacCormack, A., R. Verganti, et al. (2001). "Developing Products on Internet Time: The
Anatomy of a Flexible Development Process." Management Science 47(1): 133-150.
Markus, M. L. (1983). "Power, Politics, and MIS Implementation." Communications of the
ACM 26(6): 430-444.
Markus, M. L. (1994). "Electronic Mail as the Medium of Managerial Choice." Organization
Science 5(4): 502-527.
Markus, M. L. and D. Robey (1988). "Information Technology and Organizational-Change -
Causal-Structure in Theory and Research." Management Science 34(5): 583-598.
Masseglia, F., P. Poncelet, et al. (1999). "Using Data Mining Techniques on Web Access
Logs to Dynamically Improve Hypertext Structure." ACM SIGWEB letters 8(3): 1-19.
McGrath, J. E. (1984). Groups : interaction and performance. Englewood Cliffs, N.J.,
Prentice-Hall.
Miller, C. R. (1984). "Genre as Social Action." Quarterly Journal of Speech 70: 151-167.
Mingers, J. (2001). "Combining IS research methods: Towards a pluralist methodology."
Information Systems Research 12(3): 240-259.
Mintzberg, H. (1979). The structuring of organizations : a synthesis of the research.
Englewood Cliffs, N.J., Prentice-Hall.
Mobasher, B., R. Cooley, et al. (2000). "Automatic personalization based on Web usage
mining - Web usage mining can help improve the scalability, accuracy, and flexibility
of recommender systems." Communications of the ACM 43(8): 142-151.
Mobasher, B., H. Dai, et al. (2001). Effective Personalization Based on Association Rule
Discovery from Web Usage Data. WIDM'01 3rd ACM Workshop on Web Information
and Data Management, Atlanta, Georgia, ACM.
Ngwenyama, O. K. and A. S. Lee (1997). "Communication richness in electronic mail:
Critical social theory and the contextuality of meaning." MIS Quarterly 21(2): 145-167.
Orlikowski, W. (1996). Learning from Notes. Computerization and controversy : value
conflicts and social choices. R. Kling. San Diego, Academic Press.
Orlikowski, W. and J. Baroudi (1991). "Studying information technology in organizations:
research approaches and assumptions." Information Systems Research 2(1): 1-28.
Orlikowski, W. and D. Robey (1991). "Information Technology and the Structuring of
Organizations." Information Systems Research 2(2): 143-169.
Orlikowski, W. J. (1992). "The Duality of Technology - Rethinking the Concept of
Technology in Organizations." Organization Science 3(3): 398-427.

Orlikowski, W. J. (2000). "Using technology and constituting structures: A practice lens for
studying technology in organizations." Organization Science 11(4): 404-428.
Orlikowski, W. J. and C. S. Iacono (2001). "Research commentary: Desperately seeking the
"IT" in IT research - A call to theorizing the IT artifact." Information Systems Research
12(2): 121-134.
Orlikowski, W. J. and J. Yates (1994). "Genre Repertoire - the Structuring of Communicative
Practices in Organizations." Administrative Science Quarterly 39(4): 541-574.
Orlikowski, W. J., J. Yates, et al. (1995). "Shaping Electronic Communication - the
Metastructuring of Technology in the Context of Use." Organization Science 6(4): 423-
444.
Pei, J., J. Han, et al. (2000). Mining Access Patterns Efficiently from Web Logs. Proceedings
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'00).
Rice, R. E. and U. E. Gattiker (2000). New Media and Organizational Structuring. The New
Handbook of Organizational Communication. F. M. Jablin and L. Putnam, Sage
Publications: 544-581.
Saunders, C. and J. W. Jones (1990). "Temporal Sequences in Information Acquisition for
Decision-Making - a Focus on Source and Medium." Academy of Management Review
15(1): 29-46.
Schmidt, K. and L. Bannon (1992). "Taking CSCW seriously - Supporting Articulation
Work." Journal of Computer Supported Cooperative Work 1(1-2): 7-40.
Schmidt, K. and U. Christensen (2000). Using classification in common information spaces.
Workshop on Common information spaces. Copenhagen.
Schmidt, K. and R. Israel (2000). Cooperative Management of common information spaces.
Workshop on classification schemes in cooperative work, ACM conference on CSCW.
Philadelphia.
Schmidt, K. and C. Simone (1996). "Coordination Mechanisms: Towards a Conceptual
Foundation of CSCW Systems Design." Computer Supported Cooperative Work 5(2-
3): 155-200.
Shipman III, F. M. and C. C. Marshall (1999). "Formality Considered Harmful: Experiences,
Emerging Themes, and Directions on the Use of Formal Representations in Interactive
Systems." Journal of Computer Supported Cooperative Work 8: 333-352.
Simon, H. A. (1977). The new science of management decision. Englewood Cliffs, N.J.,
Prentice-Hall.
Singh, S. (1999). The code book : the evolution of secrecy from Mary, Queen of Scots, to
quantum cryptography. New York, Doubleday.

Smith, M., J. J. Cadiz, et al. (2000). Conversation Trees and Threaded Chats. ACM2000
Conference on Computer Supported Cooperative Work, Philadelphia, PA, ACM press.
Spiliopoulou, M. (2000). "Web Usage Mining for Site Evaluation: Making a Site Better Fit
its Users." Communications of the ACM 43(8): 127-134.
Spiliopoulou, M., C. Pohle, et al. (1999). Improving the Effectiveness of a Web Site with
Web Usage Mining. Workshop on Web Usage Analysis and User Profiling
(WEBKDD'99), San Diego, August 1999.
Srikant, R. and R. Agrawal (1995). Mining sequential patterns. Int'l Conference on Data
Engineering (ICDE), Taipei, Taiwan.
Srikant, R. and R. Agrawal (1996). Mining sequential patterns: generalizations and
performance improvements. Fifth Int'l Conference on Extending Database Technology
(EDBT), Avignon, France.
Stinchcombe, A. L. (1968). Constructing social theories. New York, Harcourt Brace &
World.
Strauss, A. (1985). "Work and the Division of Labor." The Sociological Quarterly 26(1): 1-
19.
Su, Z., Q. Yang, et al. (2002). "Correlation-Based Web Document Clustering for Adaptive
Web Interface Design." Journal of Knowledge and Information Systems 4(2).
Suchman, L. (1996). Supporting Articulation Work. Computerization and Controversy. R.
Kling, Academic Press: 407-423.
Teege, G. (2000). "Users as Composers: Parts and Features as a Basis for Tailorability in
CSCW Systems." Computer Supported Cooperative Work 9(1): 101-122.
Toolan, F. and N. Kushmerick (2002). Mining web logs for personalized site maps.
International Conference on Web Information Systems Engineering, Singapore.
VanGundy, A. B. (1988). Techniques of structured problem solving. New York, Van
Nostrand Reinhold.
Walsham, G. (1995). "Interpretive case studies in IS research: nature and method." European
Journal of Information Systems 4: 74-81.
Watson, R. T., G. DeSanctis, et al. (1988). "Using a GDSS to Facilitate Group Consensus -
Some Intended and Unintended Consequences." MIS Quarterly 12(3): 463-478.
Websters (2002). Merriam-Webster's Collegiate Dictionary.
Weisberg, H. F., J. A. Krosnick, et al. (1996). An introduction to survey research, polling, and
data analysis. Thousand Oaks, Calif., Sage Publications.
Wellman, B., J. Salaff, et al. (1996). "Computer Networks as Social Networks: Collaborative
Work, Telework, and Virtual Community." Annual Review of Sociology 22: 213-238.

Yates, J. and W. Orlikowski (2002). "Genre Systems: Structuring Interaction through
Communicative Norms." Journal of Business Communication 39(1): 13-35.
Yates, J., W. Orlikowski, et al. (1997). Collaborative Genres for Collaboration: Genre
Systems in Digital Media. 30th Hawaii International Conference on System Sciences,
Hawaii, IEEE Computer Society Press.
Yates, J. and W. J. Orlikowski (1992). "Genres of Organizational Communication - a
Structurational Approach to Studying Communication and Media." Academy of
Management Review 17(2): 299-326.
Yates, J. and W. J. Orlikowski (1994). "Contextualizing Technology - from Ill-Defined
Borders to Socially Defined Genres." Human-Computer Interaction 9(1): 132-135.
Yates, J., W. J. Orlikowski, et al. (1999). "Explicit and implicit structuring of genres in
electronic communication: Reinforcement and change of social interaction."
Organization Science 10(1): 83-103.
Yin, R. K. (1994). Case study research : design and methods. Thousand Oaks, Sage
Publications.
Yoshioka, T., G. Herman, et al. (2001). "Genre taxonomy: A knowledge repository of
communicative actions." ACM Transactions on Information Systems 19(4): 431-456.
Zigurs, I. and B. K. Buckland (1998). "A theory of task/technology fit and group support
systems effectiveness." MIS Quarterly 22(3): 313-334.
Zuboff, S. (1988). In the age of the smart machine : the future of work and power. New York,
Basic Books.
