Zina Houhamdi
Software Engineering Department, College of Engineering and IT
Al Ain University of Science and Technology, UAE
zina.houhamdi@aau.ac.ae
Belkacem Athamena
MIS Department, College of Business Administration
Al Ain University of Science and Technology, UAE
belkacem.athamena@aau.ac.ae
ABSTRACT
This paper addresses a general, significant and recurring problem in information systems practice: under-investment in client information quality. Many organizations require sound financial models before committing investment to their information systems and associated processes. Nevertheless, there are no broadly recognized strategies for accurately combining the expected costs and benefits of candidate quality improvements to client information. The result can be client information of inadequate quality, which undermines organizational goals. Further, the absence of such a strategy impedes the ability of Information System (IS) developers to argue the business case for investing in improvements, since access to organizational resources depends on such a case being made. To address this problem, we propose and assess a framework for generating financial models of the costs and benefits of client information quality. These models can be used to select and prioritize among candidate interventions across multiple client processes and information resources, and to establish a business case for the organization to make the investment. Because the work aims to produce and evaluate an artifact rather than answer a question, design science was identified as the most suitable research approach. In design science, the utility of the designed artifact, rather than the truth of a theory, is explicitly established as the goal. So instead of formulating and answering a sequence of research questions, design science proceeds by constructing and evaluating an artifact. In this case, the framework is built as an abstract artifact incorporating models, metrics and a method.
Keywords: Information quality, Information system representation, Decision making and design science
I. INTRODUCTION
Practitioners have long recognized the organizational and financial impact of poor-quality information (Redman, 2008). Nevertheless, the costs of addressing the underlying causes can be substantial. For organizations struggling with Information Quality (IQ), estimating the expected costs and benefits of IQ improvements can be an essential first step toward achieving larger organizational goals.
IS researchers have been working on this problem since the 1980s (Ballou and Pazer, 1985; Ballou and Tayi, 1989). Indeed, economists and management scientists have been examining it for even longer (Marschak, 1971). However, despite the proliferation of IQ models and frameworks from the 1990s onward by IS researchers (Strong and Lee, 1997; Wang and Wang, 2008) and practitioners (English, 1999), the question of IQ expenditure has received comparatively little attention within the discipline.
This work attempts to develop and assess a comprehensive framework that helps analysts compute the costs and benefits of IQ improvement. The framework should include the definitions, computations and steps required to generate a business case on which decision-makers can base a sound investment decision.
The abstraction level should be high enough that the framework is generic and can be applied across a wide variety of circumstances and organizations. At the same time, it should be low enough to deliver useful results that guide decision-makers in their specific situations.
This paper is an investigation of IQ undertaken to set out a framework for IQ evaluation. It evaluates and integrates ideas from established reference disciplines (e.g. information theory, semiotics and decision theory), justified by demand identified in practitioner context interviews. As part of the design science approach, this informs the artifact design, where the artifact is a framework comprising an abstract model, metrics and techniques for assessing IQ in customer relationship management processes.
III. STRATEGY
This section describes the sequence of steps used to design and evaluate an IQ intervention. It proceeds by setting out a value-driven analysis of both the opportunities for enhancement (technical ability) and the areas of greatest necessity (business needs), applying the mathematical metrics.
The target scenario for method selection has the following properties: a single information source (e.g. operational data store, data warehouse) containing a set of client records, one for each client. Each record has a set of attributes, comprising demographic and transactional information, used by a number of client decision processes to segment, classify, predict, assign or identify on a per-client basis, where various candidate IQ improvement interventions are to be assessed.
Notice that the single information source doesn't have to be one database or even one physical table; a network of interconnected ISs acting together to produce an overall representation of the client suffices. An illustrative scenario is a financial services company where a client database, augmented with demographic data from external suppliers, is used regularly by a group of marketing, campaign management, client involvement control and credit scoring processes. The firm may be weighing the relative advantages of training contact personnel on data entry, buying a data cleansing tool, or merging client data from an acquired insurance business.
The method's purposes are:
Effectiveness. The initiatives suggested by the method should be demonstrably (approximately) optimal in terms of value creation.
Efficiency. The method should produce an analysis using the smallest quantity of resources, including time and skills.
Feasibility. The data availability, degree of theoretical understanding, measurement requirements and disruption to IS operations should be within reasonable limits.
Clarity. The concepts, metrics and processes used should be understandable and plausible to IS experts, and perceived as impartial and aligned with organization-wide concerns.
Instead of assessing suggested interventions, in short, asking "What can we fix?", the approach proceeds by asking "What needs to be fixed?" Only then do we ask "What is the optimum way to fix it?" This approach avoids specifying otherwise useful interventions for IQ problems of comparatively little economic importance.
Formally, we describe an IQ intervention as a modification of the IS state X into X′. With reference to the client process model, the modification is induced on one or more attributes of X. The decision function D is applied to the new IS state X′, resulting in a new action Y′. This is depicted in Figure 1.
Figure 1: The intervention model. The external context C is captured by a communication process into the IS state X; a decision-making process maps X to the action Y. An intervention process produces a revised state X′, from which the decision-making process yields a new action Y′; a realization process relates the actions to the optimal decision Y*.
No influence. The intervention leaves the IS state unchanged, i.e. the new value is equal to the old one. Therefore there is no change in the decision or in value.
Inactionable revision. A modification of the IS state at the semantic level, but no change in the decision and consequently none in value.
Actionable revision. A modification of the IS state at both the semantic and the pragmatic (i.e. decision) level, corresponding to a change in value.
Valuable revision. An actionable revision leading to a positive change in value.
Thus, it is possible for an actionable revision to carry negative value (when an intervention actually causes a process to perform worse). This may nevertheless be acceptable if the intervention has a net positive influence. The situation is similar to a public vaccination program, where the harm of some individuals' allergic reactions is outweighed by the public benefit of eliminating the disease.
Nevertheless, it is generally not possible to predict in advance whether a particular state modification will be valuable, or indeed actionable. This must be determined by executing the decision procedure over the modified state and observing whether it generates a different decision.
Accordingly, it is appropriate to assess a specific intervention process T by analyzing its impact on the decision results. An intervention that results in exactly the same set of decisions being made (i.e. Y′ = Y) is economically useless even though it fixed mistakes in the initial communication process. This is the distinction between semantic quality improvement and pragmatic quality improvement.
An IQ intervention is actionable if and only if the modification of the IS state results in a modified decision. Evidently, any intervention that doesn't modify the IS state (i.e. X′ = X) is inactionable, and non-actionable interventions are necessarily insignificant and useless. From a design perspective, the purpose is to efficiently identify interventions that are likely to have an effect on decision-making, particularly on decisions with high value.
The strategy has two steps. It begins with a broad range of possible problem areas and narrows it over successive iterations of data gathering and analysis using the performance metrics (Figure 2).
Figure 2: The screening workflow. Starting from a list of candidate interventions T, data are collected and analyzed to produce a list of attributes A and a list of processes P. Attributes are prioritized using the Influence metric and processes using the Stake metric, yielding shortlists of valuable attributes and valuable processes. Each surviving intervention is then evaluated with the Yield formula and ranked using the Traction metric, iterating until the intervention list is exhausted.
After that, candidate interventions are evaluated in terms of costs and benefits, producing an assessment of their value in terms compatible with the company's requirements for formal decision-making. The same measurements can be employed to monitor and analyze the implementation phase of the interventions.
IV. STRUCTURAL PROCESSES
A consequence of taking a company-wide view of client IQ improvement is that money saved in Marketing is valued the same as money saved in Sales. Accordingly, all structural processes that rely on client information to produce value should be included within the scope. For many companies, this is a potentially huge number of processes drawing on various information sources. Instead of a complete examination of all processes, the Stake metric can be employed to prioritize those processes that are likely to have commercial importance.
S_p = N · f_p · R*_p    (1)

where N is the number of clients, f_p the execution frequency of process p, and R*_p its expected realization per decision.

I_i = 1 − H(X_i | Y) / H(X_i) = I(X_i ; Y) / H(X_i)    (2)

I(X ; Y) = H(Y) − H(Y | X) = I(Y ; X)    (3)
The impact of a specific attribute on a specific decision-making procedure is calculated without regard to the internal workings of the procedure. It can be determined purely by examining the joint rate of occurrence of inputs (IS values) and outputs (process decisions). As a small representative example, consider the influence of a gender attribute (binary value) on an offer decision (binary value):
P(X_gender, Y)    Yes     No
Male              0.25    0.10
Female            0.05    0.60

Table 1: Attribute Influence on a Decision
In this illustration, the client base is 35% male and 65% female, and the offer is made to 30% of clients (Yes = 30%). Using the Influence equation, we calculate:

I_gender = I(Y ; X_gender) / H(X_gender) = 0.325 / 0.934 ≈ 34.8%    (4)
Thus, in this example, the gender attribute has approximately 35% influence on the decision. This indicates how much of the uncertainty is removed when the gender attribute value is known. For a randomly selected client, there is a 30% probability that the decision function will label them as receiving an offer. Once the client is known to be female, this falls to about an 8% chance; had they been male, it would have risen to about 71%. In this sense, the attribute is judged considerably influential.
As this computation doesn't depend on having actual external context values or correct decisions, it is immediately available and inexpensive to execute using existing data. The only requirement is an appropriate query language (e.g. SQL or XQuery) or a very simple spreadsheet.
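Reading Influence as the mutual information between attribute and decision, normalized by the attribute's entropy (the form used in the worked numbers above), the gender example can be reproduced in a few lines; only the joint probabilities of Table 1 are needed.

```python
from math import log2

def influence(joint):
    # Influence of attribute X on decision Y: I(X;Y) / H(X), computed from
    # the joint distribution P(X, Y) given as {(x, y): probability}.
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    mi = sum(p * log2(p / (px[x] * py[y]))
             for (x, y), p in joint.items() if p > 0)
    hx = -sum(p * log2(p) for p in px.values() if p > 0)
    return mi / hx

# Joint probabilities from Table 1.
table1 = {("male", "yes"): 0.25, ("male", "no"): 0.10,
          ("female", "yes"): 0.05, ("female", "no"): 0.60}
print(round(influence(table1), 3))  # 0.348
```

As the text notes, the same aggregation is a one-line GROUP BY in SQL or a pivot in a spreadsheet.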
Applying this metric, the most influential attributes for the high-stake functions can be identified and ranked. To compute the aggregate impact of an attribute over a group of structural processes, we can use the following metric: the Importance M_a of attribute a is the product of its Influence and the Stake, summed over the set of processes P of concern:

M_a = Σ_{p ∈ P} S_p · I_a    (5)

Because the Influence values don't sum to unity, weighting the Stakes in this manner produces a form of worth that is valuable only as a model and shouldn't be treated as a genuine monetary measure. However, it does go some way toward giving a sense of the monetary significance of each attribute, aggregated over the processes of concern.
The result of this stage is a list of the attributes that matter most to decision-making in high-value processes, ranked by Importance M.
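A minimal sketch of the Importance aggregation of equation (5); the process names, stakes and influence values below are hypothetical figures chosen for illustration.

```python
def importance(attribute, processes, stake, influence):
    # Importance (equation 5): stake-weighted sum of the attribute's
    # influence over the processes of concern.
    return sum(stake[p] * influence[(attribute, p)] for p in processes)

# Hypothetical stakes (currency units) and per-process influence values.
stake = {"campaign": 120_000.0, "credit": 80_000.0}
infl = {("gender", "campaign"): 0.35, ("gender", "credit"): 0.10}
print(round(importance("gender", ["campaign", "credit"], stake, infl), 2))  # 50000.0
```

As the text cautions, the result is a ranking device, not a monetary estimate.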
VI. INFORMATION SYSTEM REPRESENTATION
Using the list of significant attributes, the analyst proceeds to analyze the potential for improvement. We start by looking at the places with the worst IQ: the supposition is that the best scope for enhancement (and therefore financial benefit) lies in those attributes that perform worst.
The raw error rate isn't a reliable measure of improvability, due to the issue of prior probabilities. Instead, the fidelity metric (on a per-attribute basis) is a sound way to compare attributes with one another:
F_i = 1 − H(C_i | X_i) / H(C_i)    (6)
The fidelity metric is a gap measure, in the sense that it produces a normalized representation of how far reality falls short of the ideal standard. Accordingly, it gives a sense of the comparative improvability of each attribute under analysis.
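A sketch of estimating fidelity from a paired sample, assuming the metric is one minus the conditional-entropy ratio of equation (6) and using the identity H(C|X) = H(C,X) − H(X); the sample values are invented.

```python
from math import log2
from collections import Counter

def entropy(counts):
    # Shannon entropy (bits) of an empirical distribution given as counts.
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)

def fidelity(true_values, stored_values):
    # Fidelity: 1 - H(C|X)/H(C), estimated from a paired sample of true
    # context values C and stored values X.
    h_c = entropy(Counter(true_values))
    h_c_given_x = (entropy(Counter(zip(stored_values, true_values)))
                   - entropy(Counter(stored_values)))
    return 1 - h_c_given_x / h_c if h_c else 1.0

# A perfectly recorded attribute has fidelity 1; an uninformative one, 0.
print(fidelity(["e", "w", "e", "w"], ["e", "w", "e", "w"]))  # 1.0
```

Because only the distribution matters, the estimate works on a modest sample rather than the full client base, as the next subsection discusses.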
Certainly, collecting the true context value of a client attribute is costly and hard; if it weren't, this whole exercise wouldn't be needed! So gathering these values should be done carefully. In practical terms, this is achieved by:
Determining an appropriate source. Depending on the attribute, this can be achieved by using an official source (such as an applicable government registry) or direct client verification. In many cases, it may be enough to obtain another trusted source system or a business benchmark source.
Sampling the client base. Since the fidelity metric calculation needs only a probability distribution, only a subset of clients needs to be examined. The question "How many clients do we need in our evaluation?" is answered by the number necessary to be confident that the attributes are ranked in the correct order.
Based on the trusted source and sample, the fidelity metric for each attribute is calculated and a new attribute ranking is created. This ranking can be used to remove attributes that, while significant in the sense described before, are already performing well. The shortlist of significant attributes is thus reduced to only those where substantial improvement is both warranted and possible.
VII. INFORMATION QUALITY INTERVENTIONS
The next step is to analyze candidate IQ improvement interventions. These may have been proposed before this exercise, or new ones may have been suggested by the preceding examination of structural processes, decision-making procedures and the representational efficacy of the IS.
The starting point is the shortlist of attributes. Interventions that address these attributes are more likely to be comparatively important than ones that address attributes not included in the list, because the shortlisted attributes are both significant and improvable. The value of a particular intervention T can be calculated using the yield formula:

V_T = Σ_{p ∈ P} (R*′_p − R*_p) · N · f_p    (7)
This requires the calculation of two realization matrices over the set of processes of concern. The first relates the actual decisions Y to the decisions made with perfect information, Y*. The second relates the improved decisions Y′ (after the intervention) to Y*. Usually, this can only be achieved by applying the intervention (in short, correcting the client data), re-executing the same decision processes, and measuring the degree to which decisions change. This is likely to be costly, time-consuming and perhaps disruptive to operations, because it consumes operational resources such as CPU cycles, storage space and network bandwidth.
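A minimal sketch of the yield computation of equation (7). The process names, per-decision realizations, client count and frequencies below are hypothetical figures, not data from the framework.

```python
def yield_value(realization_before, realization_after, n_clients, frequency):
    # Yield (equation 7): change in expected realization per decision,
    # scaled by the client count and each process's execution frequency,
    # summed over the processes of concern.
    return sum((realization_after[p] - realization_before[p])
               * n_clients * frequency[p]
               for p in realization_before)

# Hypothetical per-decision realizations before and after the intervention.
r_before = {"campaign": 1.20, "credit": 0.80}
r_after = {"campaign": 1.35, "credit": 0.83}
print(round(yield_value(r_before, r_after, 100_000,
                        {"campaign": 4, "credit": 1}), 2))  # 63000.0
```

The expensive part in practice is not this arithmetic but obtaining R*′_p, which requires re-executing the decision processes on corrected data.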
As with calculating fidelity, only estimates of overall probabilities are needed, so sampling will help reduce the cost of the exercise. Nevertheless, before this is undertaken over the whole set of candidate interventions, some may be eliminated in advance. Recall that for any intervention on a particular client attribute, there will be a fraction of instances in which the corrected value disagrees with the original value. Mathematically, we define this fraction as:

τ = Pr(X′_i ≠ X_i)    (8)

This metric is labeled Traction, because it defines the degree to which an intervention actually modifies the status quo.
Not all interventions will be actionable, in short, result in a modified decision. In addition, not all actionable modifications will have a positive effect on value. In general, we might expect a trade-off between the proportion of client records that are modified and whether the modification is beneficial. An intervention with great Traction may be called destructive, while an intervention that concentrates on ensuring all modifications are beneficial might be called safe.
The Traction of an intervention can be estimated without re-executing the client records through the decision procedure. Consequently, it is a relatively inexpensive metric, once the intervention sample is available. Taking a conservative approach, candidate interventions with low Traction can be eliminated if they have a small value even when presumed to be maximally safe, i.e. every modification results in the highest positive effect. This bound can be computed as the product of τ and the maximum entry of each process's pay-off matrix.
The interventions can then be ranked according to their expected value. At this point, suggested interventions can be combined, disaggregated or more narrowly focused. For example, correcting the birth date and the address may each reveal benefits sufficient to support implementation; but when combined, they may produce even greater benefits than separately, through decision synergy. Alternatively, the yield may improve if the intervention is restricted to the top 25% of clients instead of applied over the complete client database. Previously eliminated suggestions can be restored if the shortlisted interventions deliver less than the intended benefit. If this happens, the re-execution and subsequent analysis can be repeated with a view to expanding the value delivered by the complete set of interventions.
Illustrative example: assume the region attribute is under examination. This attribute, X_region, has four values: east, west, south and north. The intervention T consists of replacing region values with those stored in a database owned by a newly acquired auxiliary business with (assumedly) better geographic coding, X′_region, whose marginal distribution is:

P(X′_region) = (0.15, 0.11, 0.25, 0.49)    (9)

Sampling the joint probability mass function of (X_region, X′_region) gives the matrix T_region (10); summing its off-diagonal entries yields the Traction:

τ_T = Pr(X′_region ≠ X_region) = 0.19    (11)

This signifies that 19% of client records will be modified. Some of those modifications will be actionable, some will not. Part of the actionable modifications will have a positive effect on value (i.e. improve the decision) while the rest will have a negative effect.
Suppose we consider another intervention, T′, on the same attribute, based on an internal consistency test against those clients with a valid street address, giving:

P(X′_region) = (0.16, 0.14, 0.15, 0.45)    (12)

Again sampling the joint probability mass function gives the matrix T′_region (13), whose off-diagonal entries sum to:

τ_T′ = Pr(X′_region ≠ X_region) = 0.03    (14)
Therefore the first intervention has a Traction of 19% while the second has just 3%. Although this doesn't necessarily imply that the first intervention is preferable, it does suggest that the second could be eliminated from the shortlist for further (costly) analysis because, even if it were perfect, it couldn't have more than a 3% effect on any process. Of course, if further analysis revealed that the first intervention was harmful (in short, introduced more errors than it removed), then the second could be reconsidered.
VIII. CONCLUSION
This paper describes a framework for evaluating improvements to the quality of client information. It includes a mathematical model, grounded in economic and semiotic theory and used to derive performance metrics, and an approach for systematically examining the value opportunities in client IQ improvements.
The framework responds to practitioner demands to construct strong and testable business cases, based on cash flows, to support improvements in IQ. These financial metrics are essential for IS practitioners and business managers to communicate the value of such initiatives and to influence existing structural resourcing processes.
The intended users of the framework are analysts within the company engaged in a value-led, technology-agnostic analysis exercise. The approach employs iterative elimination of suggestions to focus on high-value opportunities and tries to reduce wasted effort on irrelevant, useless or low-stakes interventions. It also takes into consideration the costs of obtaining information about the efficacy of distinct model perspectives.
The result is a set of interventions that improves the total expected financial value generated by the company's client processes. The model does not capture the elusive and intangible soft benefits (morale improvements, reputation, planning and forecasting, etc.), so it is best regarded as a lower bound on the interventions' value. Still, it is a hard lower bound of the kind most acceptable to financial administrators, allowing IQ projects to compete with a range of IS and non-IS investments.
From a project management viewpoint, the analysis scope is defined mainly by two elements: the number of structural processes and the number of suggested quality interventions. Other factors, like the number of attributes in the database and the state of the current information, aren't directly manageable but are determined by existing conditions. The time required for the analysis depends on the initial scope and the required quality of the results, in short, the level of financial rigor and detail. This level is determined by the expected usage: a formal cost/benefit analysis for very large projects will be held to a higher standard than a smaller, informal analysis.
A large part of the analysis can be reused later. For example, the Influence metrics will stay valid as long as the underlying business logic and database definitions don't change excessively. Likewise, fidelity and stake will change only moderately with business conditions and, once evaluated, may only need to be revised periodically to retain their usefulness.
REFERENCES
Arnott, D., Cognitive Biases and Decision Support Systems Development: A Design Science Approach,
Information Systems Journal, Vol.16, No.1, 2006, pp.55-78.
Ballou, D.P. and Pazer, H.L., Modeling Data and Process Quality in Multi-Input Multi-Output Information
Systems, Management Science, Vol.31, No.2, 1985, pp.150-162.
Ballou, D.P. and Tayi, G.K., Methodology for Allocating Resources for Data Quality Enhancement,
Communications of the ACM, Vol.32, No.3, 1989, pp.320-329.
Burstein, F. and Gregor, S., The Systems Development or Engineering Approach to Research in Information
Systems: An Action Research Perspective, Proceedings of the 10th Australasian Conference on Information
Systems, 1999, pp.122-134.
English, L.P., Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and
Increasing Profits (Wiley, 1999).
Gregor, S., The Nature of Theory in Information Systems, Management Information Systems Quarterly,
Vol.30, No.3, 2006, pp.611-642.
Gregor, S. and Jones, D., The Anatomy of a Design Theory, Journal of the Association for Information
Systems, Vol.8, No.5, 2007, pp.312-335.
Hevner, A.R., March, S.T., Park, J. and Ram, S., Design Science in Information Systems Research,
Management Information Systems Quarterly, Vol.28, No.1, 2004, pp.75-105.
Jörg, B., Björn, N. and Christian, J., Socio-Technical Perspectives on Design Science in IS Research,
Advances in Information Systems Development, 2007, pp.127-138.
Marschak, J., Economics of Information Systems, Journal of the American Statistical Association, Vol.66,
No.333, 1971, pp.192-219.
Peffers, K. E. N., Tuunanen, T., Rothenberger, M.A., and Chatterjee, S., A Design Science Research
Methodology for Information Systems Research, Journal of Management Information Systems, Vol.24, No.3,
2007, pp.45-77.
Redman, T.C., Data Driven: Profiting from Your Most Important Business Asset (Harvard Business School
Press, 2008).
Simon, H.A., The Sciences of the Artificial (MIT Press, 1996).
Strong, D.M. and Lee, Y.W., 10 Potholes in the Road to Information Quality, Computer, Vol.30, No.8, 1997,
pp.38-46.
Wang, S. and Wang, H., Information Quality Chain Analysis for Total Information Quality Management,
International Journal of Information Quality, Vol.2, No.1, 2008, pp.4-15.