Validity
Validity refers to the appropriateness, correctness,
meaningfulness, and usefulness of the specific
inferences researchers make based on the data they
collect.
It is the most important idea to consider when
preparing or selecting an instrument.
Validation is the process of collecting and analyzing
evidence to support such inferences.
Evidence of Validity
There are four types of evidence:
Face validity – looking at the operationalization and judging whether,
"on its face" (superficially), it seems like a good instrument
Content-related evidence of validity
Content and format of the instrument
Criterion-related evidence of validity
Relationship between scores obtained using the new
instrument and scores obtained in a standard instrument
Construct-related evidence of validity
Psychological construct or domain being measured by
the instrument
Content-Related Evidence
A key element is how adequately the
instrument samples the content of the
construct it is supposed to represent.
The other aspect of content validation is the
format of the instrument.
Attempts to obtain evidence that the items
measure what they are supposed to
measure typify the process of gathering
content-related evidence.
The McGraw-Hill Companies, Inc.
All rights reserved.
Criterion-Related Evidence
Reflects the use of a criterion (a well-
established measurement procedure) to create
a NEW measurement procedure or to predict
the construct you are interested in
Possible reasons for using criteria to create a new
measurement procedure: to create a shorter version of a
well-established measurement procedure; to account for a
new context or location; and/or to use the new procedure
as a predictor of a criterion
A correlation coefficient (r), called the validity coefficient,
indicates the degree of relationship between the scores
individuals obtain on the two instruments.
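The validity coefficient above can be sketched in a few lines of code. This is a minimal illustration, not a prescribed procedure, and the score lists are entirely hypothetical:

```python
# Sketch: a validity coefficient computed as Pearson's r between
# scores on a new instrument and scores on an established criterion.
# All score values below are hypothetical illustration data.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

new_instrument = [12, 15, 11, 18, 14, 16]   # hypothetical new-test scores
criterion      = [50, 62, 45, 70, 58, 64]   # hypothetical standard-test scores

validity_coefficient = pearson_r(new_instrument, criterion)
print(round(validity_coefficient, 3))
```

A coefficient near +1 indicates that the new instrument ranks individuals much as the established one does; a value near 0 offers no criterion-related evidence.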
Criterion-Related Evidence
A criterion and a second test are presumed to be
theoretically related, i.e., to measure the same variable.
Two forms of criterion-related validity:
Predictive validity: researchers allow a time interval to
elapse between administration of the instrument and
obtaining the criterion scores. E.g., a researcher administers
a 'new' mathematics attitude test (predictor) to a group of
Form 2 (Grade 8) students. Later these scores are
correlated with the students' end-of-year mathematics
grades (a criterion measure)
Concurrent validity: new-instrument data and criterion
data are gathered at about the same time, and correlated
Construct Validity
Reliability
Refers to the consistency of scores or answers
provided by an instrument... From one
administration of instrument to another, and
from one set of items to another
Scores can be quite reliable yet not valid.
An instrument should be both valid and reliable
for the context in which it is used.
Reliability Coefficient
Expresses a relationship between scores of
the same instrument at two different times
or parts of the instrument.
The three best-known methods are:
Test-retest
Equivalent-forms method
Internal-consistency method
Test-Retest Method
Involves administering the same test twice to the
same group after a certain time interval has
elapsed.
A reliability coefficient is calculated to indicate the
relationship between the two sets of scores.
Reliability coefficients are affected by the lapse of
time between the administrations of the test.
An appropriate time interval should be selected.
In educational research, scores collected over a two-
month period are considered sufficient evidence of
test-retest reliability.
Equivalent-Forms Method
Two different but equivalent forms of an
instrument are administered to the same
group.
A reliability coefficient is then calculated
between the two sets of scores.
Internal-Consistency Methods
There are several internal-consistency methods that
require only one administration of an instrument.
Split-half Procedure: involves scoring two halves of a
long test separately for each subject and calculating the
correlation coefficient between the two scores.
Top half vs bottom half
Even vs odd numbered items
Then adjusted with the Spearman-Brown formula:
r_whole = (2 × r_halves) / (1 + r_halves)
Internal-Consistency Methods
Kuder-Richardson Approaches (KR20 and KR21) require 3
pieces of information:
Number of items on the test
The mean
The standard deviation
Formula for KR-21:
r(KR-21) = [K·s² − M(K − M)] / [s²(K − 1)]
WHERE:
K – number of items in the whole test
s² – variance of the scores
M – mean of the scores
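The KR-21 formula uses only the three pieces of information listed above, so it is easy to sketch directly. The K, mean, and variance values here are hypothetical:

```python
# Sketch of KR-21 reliability from the three required quantities:
# number of items K, the mean M, and the variance s^2 of the scores.

def kr21(k, mean, variance):
    """KR-21: (K*s^2 - M(K - M)) / (s^2 * (K - 1))."""
    return (k * variance - mean * (k - mean)) / (variance * (k - 1))

K = 50           # hypothetical number of items on the test
mean = 40.0      # hypothetical mean score
variance = 25.0  # hypothetical variance of the scores

print(round(kr21(K, mean, variance), 3))
```

Because it needs only summary statistics, KR-21 can be computed without the item-by-item data that KR20 or alpha require, at the cost of assuming all items are of equal difficulty.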
Most frequent method for determining internal consistency
Alpha Coefficient (Cronbach's alpha): a general form of the
KR20 used to calculate the reliability of items that are not
scored right vs. wrong, such as Likert-scale items
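Since Likert items are not scored right vs. wrong, the alpha coefficient works from item variances rather than correct/incorrect counts. A minimal sketch with hypothetical 1–5 ratings:

```python
# Sketch of Cronbach's alpha for Likert-type items:
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance of totals).
from statistics import pvariance

# Hypothetical ratings: rows = respondents, columns = 4 Likert items (1-5).
ratings = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]

k = len(ratings[0])                                   # number of items
item_vars = [pvariance([row[i] for row in ratings]) for i in range(k)]
total_var = pvariance([sum(row) for row in ratings])  # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 3))
```

When items all tap the same construct, the total-score variance greatly exceeds the sum of the item variances, pushing alpha toward 1.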
Scoring Agreement
Scoring agreement requires a demonstration that
independent scorers can achieve satisfactory
agreement in their scoring.
Instruments that use direct observations are highly
vulnerable to observer differences.
What is desired is a correlation of at least .90 among
scorers (inter-rater reliability) as an acceptable level of
agreement.