
Reliability and Validity of Measurement
RCS 6740
6/28/04
Validity of Measurement
• Validity of measurement refers to the extent to which a test accurately measures what it is intended to measure. The three major types of validity are content, construct, and criterion-related.
Validity of Measurement Cont.
• Validity has a different meaning in the context of measurement than in the context of research design (Pedhazur & Schmelkin, 1991).
• In measurement, “validity, or rather validation, refers not to a measure in question but to inferences made on the bases of scores obtained on it” (Pedhazur & Schmelkin, 1991, p. 31).
Validity of Measurement Cont.
• Validity is a unitary concept. The widely used types of validity are not mutually exclusive.
• To borrow the elephant analogy: if validity is the elephant, the types of validity refer to different ways of measuring the elephant, or perhaps to its different facets.
Validity of Measurement Cont.
• Much in the way that we use multiple measures to operationalize constructs, we also have a variety of ways to assess validity.
• The most commonly used types of validity are content, criterion, and construct.
• Pedhazur and Schmelkin (1991) point out that although the content of a measure is important, content validity is not validity in the sense of the definition presented above; the focus should be on the construct being measured.
Types of Measurement Validity
Content Validity
– The test is a representative sample of performance in some defined area of job-related knowledge, skill, ability, or other characteristic: the extent to which a test adequately samples the domain of information, knowledge, or skill that it purports to measure. Content validity is most important for achievement and job sample tests.
Types of Validity Cont.
Criterion-related Validity
• The test is shown to be statistically related to some criterion of successful job performance. This type of validity involves determining the relationship (correlation) between the predictor and the criterion; the correlation coefficient is referred to as the criterion-related validity coefficient. Criterion-related validity can be either concurrent or predictive.
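As a minimal illustration (not from the source; the predictor and criterion scores below are hypothetical), the validity coefficient is simply the Pearson correlation between predictor scores and criterion scores:

    # Hypothetical sketch: correlating a selection test (predictor) with
    # supervisor ratings of job performance (criterion).
    from scipy.stats import pearsonr

    test_scores = [82, 75, 91, 68, 88, 79, 95, 71]        # predictor
    ratings = [3.9, 3.1, 4.5, 2.8, 4.1, 3.4, 4.7, 3.0]    # criterion

    r, p_value = pearsonr(test_scores, ratings)
    print(f"criterion-related validity coefficient r = {r:.2f} (p = {p_value:.3f})")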
Types of Validity Cont.
Construct Validity
• The test is demonstrated to be a measure of a job-relevant characteristic (e.g., reasoning ability): the extent to which a test measures the hypothetical trait (construct) it is intended to measure.
• Methods for establishing construct validity include correlating test scores with scores on measures that do and do not measure the same trait (convergent and discriminant validity); conducting a factor analysis to assess the test’s factorial validity; determining if changes in test scores reflect expected developmental changes; and seeing if experimental manipulations have the expected impact on test scores.
Criterion-Related Validity
• Criterion-related validation focuses on prediction, the overriding concern being the degree of successful prediction of a criterion, regardless of whether or not it is possible to explain the process or processes leading to the phenomenon that is being predicted (Pedhazur & Schmelkin, 1991, p. 32).
Criterion-Related Validity Cont.
Criterion-Related Validation Approaches
• A criterion is any variable (e.g., academic achievement, voting, aggression, productivity, drug use, absenteeism, delinquency) one wishes to explain and/or predict by resorting to information from another variable(s) (Pedhazur & Schmelkin, 1991, p. 32).
Criterion-Related Validity Cont.
Criterion-Related Validation Approaches
• Criterion-related validation takes two forms:
– Predictive Validity
– Concurrent Validity
• Regression equations are common tools in criterion-related validation (a sketch follows below).
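For example, here is a hedged sketch (hypothetical data, not the authors' worked example) of fitting the regression equation criterion′ = a + b·predictor and using it to forecast a new examinee's criterion score:

    # Sketch of criterion prediction via a simple regression equation;
    # a real validation study would use an adequately sized sample.
    import numpy as np

    predictor = np.array([82, 75, 91, 68, 88, 79, 95, 71], dtype=float)
    criterion = np.array([3.9, 3.1, 4.5, 2.8, 4.1, 3.4, 4.7, 3.0])

    b, a = np.polyfit(predictor, criterion, deg=1)  # slope b, intercept a
    print(f"regression equation: criterion' = {a:.2f} + {b:.3f} * predictor")

    new_score = 85.0
    print(f"predicted criterion for predictor = {new_score}: {a + b * new_score:.2f}")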
Construct Validity
• Construct validation is concerned with the validity of inferences about unobserved variables (the constructs) on the basis of observed variables (their presumed indicators) (Pedhazur & Schmelkin, 1991, p. 52).
• A construct derives its meaning and relevance from the theoretical context, the nomological network, within which it is embedded (Pedhazur & Schmelkin, 1991, p. 69).
Construct Validity Cont.
Construct Validation Approaches
• Construct validation is a never-ending enterprise. Gains, or losses, in credibility of inferences made on the bases of responses to (or status on) indicators of a construct depend on the nature and quality of the accumulated evidence involving the construct under consideration.
• Because tests of hypotheses have a bearing on its validation, it is clear that the approaches to validation are limited only by the researcher’s imagination and acumen, and the theoretical formulation and expectations regarding the construct under consideration (Pedhazur & Schmelkin, 1991, p. 59).
Construct Validity Cont.
• Pedhazur and Schmelkin (1991) group approaches to construct validation under the following topics:
– (a) logical analysis
– (b) internal-structure analysis
– (c) cross-structure analysis
• Following is a summary of their points:
Construct Validity Cont.
Logical Analysis
• Is the construct well defined?
• Is the definition of the construct grounded in knowledge of theories and relevant research findings?
• Are the items consistent with the definition of the construct?
• To what extent are the obtained scores adversely influenced by the specific measurement procedures used (e.g., X-rays of the elephant)?
Construct Validity Cont.
Internal Structure Analysis
• If appropriate, was factor analysis used? (See the sketch after this list.)
– Factor analysis: an approach that, like cluster analysis, identifies relationships without using an outcome (dependent) variable.
– By grouping related characteristics, factor analyses reveal unobserved "dimensions" that underlie a larger number of observed variables. The technique can either identify a subset of variables to represent these dimensions or derive new variables that are composites of the original variables associated with each dimension.
– In either case, subsequent analyses (e.g., regression or cluster analysis) can benefit from the variable reduction.
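As a rough sketch of the technique (the six-item data set, the two-factor choice, and the use of scikit-learn's FactorAnalysis are all illustrative assumptions, not part of the source):

    # Illustrative factor analysis of six hypothetical questionnaire items;
    # items 1-3 are built to tap one latent dimension, items 4-6 another.
    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    n = 200
    dim1 = rng.normal(size=n)                     # latent dimension 1
    dim2 = rng.normal(size=n)                     # latent dimension 2
    noise = rng.normal(scale=0.5, size=(n, 6))
    items = np.column_stack([dim1, dim1, dim1, dim2, dim2, dim2]) + noise

    fa = FactorAnalysis(n_components=2, random_state=0).fit(items)
    # Loadings (rows = factors, columns = items): items that load on the
    # same factor are candidates for the same underlying dimension.
    print(np.round(fa.components_, 2))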
Construct Validity Cont.
Cross-Structure Analysis
• Convergent validity refers to the convergence among different methods (preferably maximally different ones) designed to measure the same construct (Pedhazur & Schmelkin, 1991, p. 74).
• Discriminant validity refers to the distinctiveness of constructs (Pedhazur & Schmelkin, 1991, p. 74).
• Measures of similar constructs should have reasonably high correlations, whereas measures of different constructs should not be highly correlated.
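A minimal sketch of this logic (hypothetical scores; a full multitrait-multimethod analysis would be more elaborate): two different methods of measuring the same construct should correlate highly with each other and only weakly with a measure of a different construct.

    # Hypothetical check of convergent and discriminant validity.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)
    n = 150
    anxiety = rng.normal(size=n)         # latent construct A
    sociability = rng.normal(size=n)     # latent construct B
    data = pd.DataFrame({
        "anxiety_self_report": anxiety + rng.normal(scale=0.4, size=n),
        "anxiety_observer": anxiety + rng.normal(scale=0.4, size=n),
        "sociability_scale": sociability + rng.normal(scale=0.4, size=n),
    })

    # Expect a high r between the two anxiety measures (convergent validity)
    # and low r between anxiety and the sociability scale (discriminant validity).
    print(data.corr().round(2))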
Reliability of Measurement
• Reliability of measurement refers to the degree to which the results of an assessment are dependable and consistent.
• Reliability is an indication of the consistency of scores over time, between scorers, or across different tasks or items that measure the same thing.
• If scores from an assessment are unreliable, interpretations based on these scores, and subsequent decisions, will not be valid.
Reliability of Measurement Cont.
• In the context of measurement, the reliability of a measure is the “ratio of true-score variance to observed-score variance” (Pedhazur & Schmelkin, 1991, p. 85).
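In standard classical test theory notation (a textbook formulation, not a quotation from the source), each observed score decomposes as X = T + E, and this ratio can be written:

    \rho_{XX'} = \frac{\sigma^2_T}{\sigma^2_X} = \frac{\sigma^2_T}{\sigma^2_T + \sigma^2_E}

Reliability therefore approaches 1 as error variance shrinks relative to true-score variance.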
• The same instrument may be more or less reliable in different populations. Therefore, “the relevant reliability estimate is the one obtained for the sample used in the study under consideration” (Pedhazur & Schmelkin, 1991, p. 86).
Reliability of Measurement Cont.
• An important thing to remember is that different formulations of reliability imply different treatment of error variance. Thus, we cannot talk about reliability without talking about the formulation.
Reliability of Measurement Cont.
Test-Retest Reliability
• Test-retest reliability: individuals are asked to take the test of interest and then take the same test again at a later date. The two sets of scores are then compared.
• The closer the scores, the more reliable the test. Reliability is an important factor in testing because it paves the way for accuracy.
• Note that test-retest reliability should be used with caution, because “a low test-retest correlation, for example, may be indicative of a measure with low reliability, of true changes in the individuals measured, or a combination of both” (Pedhazur & Schmelkin, 1991, p. 89).
Reliability of Measurement Cont.
Equivalent Forms
• Equivalent forms is a way to gauge the consistency of measurement based on the correlation between scores on two similar forms of the same test taken by the same individual.
• This method also has problems, because the coefficients of equivalence reflect both the extent of equivalence of the forms and the actual reliability.
Reliability of Measurement Cont.
Internal Consistency
• Internal consistency is a method of estimating whether different parts of a test are measuring the same variable. One of the most commonly used reliability coefficients for internal consistency is Cronbach's alpha (α).
• This type of reliability tells us the extent to which the items in a scale measure the same phenomenon and the extent to which the scale hangs together. Two methods are the split-half method and coefficient alpha; alpha is preferred (a sketch of its computation follows below).
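A minimal sketch of coefficient alpha computed from its defining formula (the four-item response matrix is hypothetical; real analyses would typically rely on a statistics package):

    # Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of totals).
    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """items: respondents-by-items matrix of scores."""
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1)
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

    # Four hypothetical Likert-type items answered by six respondents.
    scores = np.array([
        [4, 5, 4, 4],
        [2, 2, 3, 2],
        [5, 5, 4, 5],
        [3, 3, 3, 4],
        [1, 2, 1, 2],
        [4, 4, 5, 4],
    ], dtype=float)

    print(f"alpha = {cronbach_alpha(scores):.2f}")

Alpha can be shown to equal the mean of all possible split-half coefficients (Cronbach, 1951), which is one reason it is preferred over a single, arbitrary split.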
Reliability of Observation
• As Pedhazur and Schmelkin (1991) suggest, there are multiple potential sources of error in observational data.
• The simplest source is addressed through inter-observer agreement, which is the percentage of agreements relative to the total number of decisions (a minimal computation is sketched below).
• Tinsley and Weiss (1975) suggest multiple types of computation to examine the agreement and reliability among observers.
• Generalizability theory (Shavelson, Webb, & Rowley, 1992) is one approach to evaluating the multiple sources of error in such data.
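A minimal sketch of the percent-agreement index (the observer codes are hypothetical); Cohen's kappa, a standard chance-corrected alternative in this literature, is included for contrast:

    # Two observers' category codes for the same ten observation intervals.
    from collections import Counter

    obs_a = ["on", "off", "on", "on", "off", "on", "off", "on", "on", "off"]
    obs_b = ["on", "off", "on", "off", "off", "on", "off", "on", "on", "on"]

    n = len(obs_a)
    p_observed = sum(a == b for a, b in zip(obs_a, obs_b)) / n
    print(f"percent agreement = {p_observed:.0%}")

    # Cohen's kappa corrects observed agreement for chance agreement,
    # estimated from each observer's marginal category proportions.
    freq_a, freq_b = Counter(obs_a), Counter(obs_b)
    p_chance = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    kappa = (p_observed - p_chance) / (1 - p_chance)
    print(f"Cohen's kappa = {kappa:.2f}")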
