
RESEARCH METHODS

Lecture 17

CRITERIA FOR GOOD MEASUREMENT
Goodness of measurement
• Ensure the accuracy of the instrument. For that:
• Make sure no dimension, element, or question is missing, and that nothing irrelevant is included.
• Assess the “goodness” of the measuring instrument.
• Characteristics of a good measurement: validity, reliability, and sensitivity – together, the accuracy of the measuring instrument.
The Goal of Measurement:
Validity and Reliability
Validity
• The ability of an instrument to measure what is intended to be measured.
• Validity of the indicator → Is it a true measure? Are we tapping the concept?
• Abstract ideas (constructs) but concrete observations. Validity is the degree of fit between a construct and the indicators used to measure it.
Researchers ask questions:
• Do colleagues agree with my
measurement?
• Does my measure correlate with others’
measures of the same concept?
• Does it really measure what it is expected to measure?
• The answers provide some evidence of
the measure’s validity.
Types of Validity
• 1. Content validity
• 2. Criterion-related validity
• 3. Construct validity
1. Content Validity:
• Do the measures include an adequate
and representative set of items that tap
the concept?
• How well have the dimensions and elements of the concept been delineated?
• Let us take the example of measuring
feminism
Example of Feminism:
• Implies a person’s commitment to a set of beliefs supporting full equality between men and women – in areas of:
• The arts, intellectual pursuits, family, work, politics, and authority relations. These are the dimensions.
• Is there adequate coverage of dimensions?
• Do we have questions on each dimension?
• A panel of judges may attest to the content validity of an instrument.
• Each panelist assesses the test items.
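One common way to turn panelists’ item-by-item judgments into a number is Lawshe’s content validity ratio (CVR). The lecture does not name this technique, so the sketch below is purely illustrative; the function name is our own.

```python
# Illustrative sketch (not from the lecture): Lawshe's content validity
# ratio summarizes how many judges rate an item "essential".
# CVR = (n_e - N/2) / (N/2), where n_e = judges rating the item
# essential and N = total judges. CVR ranges from -1 to +1.

def content_validity_ratio(essential_votes: int, n_judges: int) -> float:
    half = n_judges / 2
    return (essential_votes - half) / half

# Example: 8 of 10 panelists rate an item essential -> CVR = 0.6
print(content_validity_ratio(8, 10))
```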
Face Validity
• The most basic and minimal index of content validity.
• Items that are supposed to measure a concept do, on the face of it, look like they measure the concept. For example:
• Measure a college student’s math ability → ask: 2 + 2 = ? This is not a valid measure of college-level math ability.
• Subjective agreement among professionals that the content of the measure is appropriate.
2. Criterion-Related Validity
• Uses some standard or criterion to indicate
a construct accurately.
• Compare the measure with another
accepted measure of the same construct.
• Does the measure differentiate individuals
on the criterion it is expected to predict?
• Two subtypes (a) concurrent validity, (b)
predictive validity
a. Concurrent Validity:
• An indicator must be associated with a
preexisting valid indicator. Example:
• Create a new intelligence test. Is it highly
associated with the existing IQ test?
• Do those who score high on the old test also score high on the new test? If yes, the new test is valid.
• A scale also shows concurrent validity when individuals who are known to be different actually score differently on the instrument.
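As a concrete illustration of the IQ example above, here is a minimal sketch of a concurrent validity check; the scores are invented, and scipy is assumed to be available.

```python
# Hypothetical data: the same 8 respondents take an established IQ test
# and a newly created intelligence test.
from scipy.stats import pearsonr

old_iq   = [95, 110, 102, 128, 87, 115, 99, 134]
new_test = [48, 61, 55, 70, 42, 63, 52, 74]

r, _ = pearsonr(old_iq, new_test)
print(f"r = {r:.2f}")  # a high positive r supports concurrent validity
```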
b. Predictive Validity
• Indicates the ability of the measuring instrument to differentiate among individuals with respect to a future criterion.
• Aptitude test at the time of selection.
Those who score high at time one (T-1)
should perform better in future (T-2)
than those who scored low.
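A minimal sketch of the aptitude-test example, using a simple median split; all numbers are invented for illustration.

```python
# Hypothetical data: aptitude at selection (T-1) and performance
# ratings collected later (T-2) for the same 8 people.
aptitude_t1    = [72, 85, 60, 90, 55, 78, 66, 88]
performance_t2 = [3.1, 4.2, 2.8, 4.5, 2.5, 3.9, 3.0, 4.4]

cutoff = sorted(aptitude_t1)[len(aptitude_t1) // 2]   # median split
high = [p for a, p in zip(aptitude_t1, performance_t2) if a >= cutoff]
low  = [p for a, p in zip(aptitude_t1, performance_t2) if a < cutoff]

# Predictive validity: the high-aptitude group should outperform later.
print(sum(high) / len(high), sum(low) / len(low))
```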
3. Construct Validity
• Used for measures with multiple indicators.
• Do the various indicators operate in a consistent manner?
• How well do the results obtained from the use of the measure fit the theories around which the test was designed? This is assessed through (a) convergent and (b) discriminant validity.
a. Convergent Validity
• Multiple indicators of a concept
converge or are associated with one
another.
• Multiple indicators hang together, or
operate in similar ways. For example,
we measure “education” as a
construct.
Construct “education”
• Ask the level of education completed.
• Verify the certificates.
• Give a test measuring school level knowledge.
• If the measures do not converge, i.e.:
• People claim a college degree but the college records do not support it, or those with a college degree perform no better than high-school dropouts on the test.
• Then the outcome of each indicator does not converge: weak convergent validity. Do not combine the three indicators into one measure.
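Here is a minimal sketch of this convergence check on three invented indicators of “education”; high off-diagonal correlations would justify combining them.

```python
# Hypothetical data for 8 people: self-reported years of education,
# years verified from records, and a school-level knowledge test.
import numpy as np

self_report = np.array([12, 16, 10, 14, 18, 12, 16, 11])
records     = np.array([12, 16, 10, 13, 18, 12, 15, 11])
test_score  = np.array([61, 82, 50, 70, 90, 58, 80, 55])

corr = np.corrcoef([self_report, records, test_score])
print(corr.round(2))
# High off-diagonal correlations -> the indicators converge and may be
# combined into one measure; low ones -> weak convergent validity.
```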
b. Discriminant Validity
• Divergent validity.
• Indicators of one concept hang
together or converge, but also diverge
or are negatively associated with
opposing constructs.
• If two constructs A and B are very
different then measures of A and B
should not be associated.
• Example of political conservatism.
Measuring political conservatism
• We have 10 questions to measure political conservatism (PC).
• People answer all 10 in similar ways.
• We add 5 further questions that measure liberalism.
• The two scores are theoretically predicted to be different and are empirically found to be so.
• If the 10 conservatism items hang together and are negatively associated with the 5 liberalism items, the measure has discriminant validity.
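A minimal sketch of this check, with invented total scores; the clearly negative correlation between the two scales is what discriminant validity predicts.

```python
# Hypothetical data: each respondent's summed score on the 10
# conservatism items and on the 5 liberalism items.
from scipy.stats import pearsonr

conservatism = [42, 18, 35, 27, 45, 12, 30, 22]
liberalism   = [ 8, 22, 11, 15,  6, 24, 13, 18]

r, _ = pearsonr(conservatism, liberalism)
print(f"r = {r:.2f}")  # a strongly negative r -> discriminant validity
```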
RELIABILITY
• The degree to which measures are free from
random error and therefore yield consistent
results
• Stability and consistency with which the
instrument measures the concept and helps
to assess the goodness of a measure.
• Maintains stability over time in the
measurement of a concept.
• Two important dimensions of reliability: (a)
stability and (b) consistency
a. Stability of Measures
• Ability of the measure to remain the
same over time.
• It attests to the “goodness” of measure
because the measure of the concept is
stable, no matter when it is applied.
• Two tests of stability: (1) test-retest
reliability, and (2) parallel-form
reliability
1. Test-Retest Reliability:
• Administering the same test to the same
respondents at two separate times
• Use an instrument measuring job satisfaction at T-1: 64% satisfied. Repeat after 4 weeks: same results. High stability.
• The reliability coefficient is obtained by repeating the same measure on a second occasion; it is the correlation between the two sets of scores.
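A minimal sketch of the job-satisfaction example: the same scale given at T-1 and again four weeks later, with invented scores.

```python
# Hypothetical data: the same respondents' satisfaction scores at T-1
# and again 4 weeks later (T-2).
from scipy.stats import pearsonr

scores_t1 = [4.1, 3.5, 2.8, 4.6, 3.0, 3.9, 2.5, 4.3]
scores_t2 = [4.0, 3.6, 2.9, 4.5, 3.2, 3.8, 2.6, 4.2]

r, _ = pearsonr(scores_t1, scores_t2)
print(f"test-retest reliability = {r:.2f}")  # near 1.0 = high stability
```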
Two problems with test-retest
• It is a longitudinal approach. So:
• 1. It may sensitize the respondents.
• 2. Attitudes may change over time; the subjects may also mature.
• Hence the results may not show a high correlation, due to the time factor rather than a lack of reliability.
2. Parallel-Form Reliability
• Also called equivalent-form reliability.
• When responses on two comparable sets of
measures tapping the same construct are
highly correlated.
• Both forms/sets have similar items and the same response format; only the wording/ordering is changed.
• Minimal error variance caused by wording, ordering, or other factors.
• Split-half reliability → correlation between two halves of an instrument.
b. Internal Consistency
of Measures
• Indicative of the homogeneity of the
items in the measure.
• Items should ‘hang together as a set,’ and each item should be capable of independently measuring the same concept.
• Examine if the items and subsets of
items in the instrument are highly
correlated.
• Two ways to do it.
1. Inter-item
Consistency Reliability
• Test of consistency of
respondents’ answers to all items
in a measure.
• To the degree that items are
independent measures of the same
concept, they will be correlated.
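The usual index of inter-item consistency is Cronbach’s alpha. The slide does not name it, so the sketch below is an assumption about how one would compute it, on invented 5-point data.

```python
# Hypothetical data: 5 respondents (rows) answer 4 items (columns)
# that are meant to measure the same concept.
import numpy as np

items = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])

k = items.shape[1]
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")   # .70+ is often taken as adequate
```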
2. Split-Half Reliability
• Reflects the correlation between two
halves of an instrument.
• One half could consist of the even-numbered items and the other half of the odd-numbered items.
• A high correlation indicates similarity among the items.
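A minimal sketch of the odd/even split on invented data. The Spearman-Brown step-up at the end is a standard correction the slide does not mention; it adjusts the half-test correlation to full-test length.

```python
# Hypothetical data: 4 respondents (rows) answer a 6-item instrument.
import numpy as np

items = np.array([
    [4, 4, 5, 4, 4, 5],
    [2, 3, 2, 2, 3, 2],
    [5, 5, 4, 5, 5, 4],
    [3, 2, 3, 3, 3, 3],
])

odd_half  = items[:, 0::2].sum(axis=1)   # items 1, 3, 5
even_half = items[:, 1::2].sum(axis=1)   # items 2, 4, 6
r_half = np.corrcoef(odd_half, even_half)[0, 1]
r_full = 2 * r_half / (1 + r_half)       # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, corrected = {r_full:.2f}")
```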
Note
• Reliability is a necessary but not sufficient condition for the goodness of a measure.
• A measure could be highly stable and
consistent, but may not be valid.
• Validity ensures the ability of the
instrument to measure the intended
concept.
Reliability and Validity on Target
[Figure: three rifle targets. Target A (old rifle): shots widely scattered – low reliability. Target B (new rifle): shots tightly clustered on the bull’s-eye – high reliability. Target C (new rifle, sun glare): shots tightly clustered but off-center – reliable but not valid.]
Sensitivity
• Instrument’s ability to accurately measure
variability in responses.
• A dichotomous response category, such as
“agree or disagree” does not allow the
recording of subtle attitude changes.
• A sensitive measure, with numerous items on the scale, may be needed. For example:
• Increase the number of items and the number of response categories (strongly agree, agree, neutral, disagree, strongly disagree). This increases a scale’s sensitivity.
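A minimal sketch of why extra response categories matter: collapsing invented 5-point answers to agree/disagree discards variability that the finer scale records.

```python
# Hypothetical 5-point responses (1 = strongly disagree ... 5 = strongly agree).
import numpy as np

likert = np.array([1, 2, 2, 3, 4, 4, 5, 5])
dichotomous = (likert >= 3).astype(int)   # collapse to disagree/agree

print("5-point distinct values:", np.unique(likert).size)       # 5
print("dichotomous values     :", np.unique(dichotomous).size)  # 2
# A subtle shift such as 4 -> 5 registers on the 5-point scale but is
# invisible after dichotomizing: the finer scale is more sensitive.
```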
Practicality
• Validity, reliability, and sensitivity
are the scientific requirements of a
project.
• Operational requirements call for it
to be practical in terms of
economy, convenience, and
interpretability.