
VALIDITY AND RELIABILITY
Measurement and Evaluation in Education
Graduate School- LSU Ozamis
Presenter: Euprime B. Regalado
Validity
• The degree to which a test measures
what it is supposed to measure
• The most central and essential
quality in the development,
interpretation, and use of
educational measures.
Evidence
• Is used to determine whether a
test is measuring what it is
supposed to measure
• Grouped into three categories:
– Construct-related evidence
– Content-related evidence
– Criterion-related evidence
Construct-Related Evidence
• Establishes a link between
the underlying
psychological construct
we wish to measure and the
visible performance we
choose to observe
• In measuring learned
knowledge, bear in mind the
question:

“How would I indicate to others that I have this knowledge?”
Example:
• Give other examples of a
rectangle
• What happens to electrical
voltage if resistance is
decreased while current
remains constant?
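By Ohm’s law (V = I × R), decreasing resistance at constant current decreases the voltage. A minimal numeric check, with made-up values:

```python
# Ohm's law: voltage (V) = current (I) * resistance (R)
def voltage(current_amps: float, resistance_ohms: float) -> float:
    return current_amps * resistance_ohms

# Hypothetical circuit: current held constant at 2 A while resistance is halved
v_before = voltage(2.0, 10.0)  # 20.0 V
v_after = voltage(2.0, 5.0)    # 10.0 V
print(v_before, v_after)  # 20.0 10.0
```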
Content-Related Evidence
• Indicates how well the
content of a test
corresponds to the
student performance to
be observed.
2 techniques
• Table of Specifications
(TOS)
• Performance Objectives
Performance Objectives for a test on validity
• Describe characteristics of each of the 6 methods used to establish evidence of validity
• When provided an example of evidence used to establish validity, classify by type the evidence being illustrated (No. of questions: 4)
Other Procedures Used to Judge
the Relevance of Test Content
Face Validity

• Test questions have face validity when they appear to be relevant to the group being examined.
Example:
Question: How many square feet are contained in a 4ft by 6ft rectangle?

The same question, worded for each group tested:
• Carpenters: How many square feet are contained in a 4ft by 6ft piece of plywood?
• Glass Cutters: How many square feet of glass are contained in a 4ft by 6ft window?
• Painters: How many square feet of painted surface are contained in a 4ft by 6ft area?
Curricular and Instructional Validity
• An evaluation of the
extent to which the
content of a test agrees
with the content of
instruction
Criterion-Related Evidence
• Indicates how well performance on a test
correlates with performance on relevant
criterion measures external to the test.

“If a test is valid, with what other things should performance on the test correlate?”

A teacher’s present knowledge of a student is used as the criterion for informal assessment.
Factors Affecting Validity
• Test-related factors
• Functioning Content and
Teaching Procedures
• Intervening events
• Nature of the group and the criterion
• Reliability
RELIABILITY

•The consistency of
measurements
A RELIABLE TEST
Produces similar scores
across various conditions
and situations, including
different evaluators and
testing environments.
How do we account for an individual who does not get exactly the same test score every time he or she takes the test?
1. Test-taker’s temporary psychological or physical state
2. Environmental factors
3. Test form
4. Multiple raters
TEST-RETEST RELIABILITY
Suggests that subjects tend to obtain the same score when tested at different times.
Split-Half Reliability
• Sometimes referred to as internal consistency
• Indicates that subjects’ scores on some trials consistently match their scores on other trials
INTERRATER RELIABILITY
Involves having two raters independently
observe and record specified behaviors,
such as hitting, crying, yelling, and getting
out of the seat, during the same time period

TARGET BEHAVIOR
A specific behavior the
observer is looking to record
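A simple interrater index is interval-by-interval percent agreement between the two observers’ records. A sketch with hypothetical data:

```python
# Hypothetical records from two independent observers over ten intervals
# (1 = target behavior recorded, 0 = not recorded)
rater_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

# Count intervals where both observers recorded the same thing
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = 100 * agreements / len(rater_a)
print(percent_agreement)  # 90.0
```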
ALTERNATE FORMS RELIABILITY
• Also known as equivalent forms reliability or
parallel forms reliability
• Obtained by administering two equivalent
tests to the same group of examinees
• Items are matched for difficulty on each test
• It is necessary that the time frame between
giving the two forms be as short as possible
OBTAINED SCORE
•The score you get when you administer a test
•Consists of two parts: the true score and the
error score
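This decomposition is the classical true-score model: obtained score = true score + error score. A minimal illustration with hypothetical numbers:

```python
# Classical true-score model: obtained score = true score + error score
true_score = 80     # hypothetical score under error-free measurement
error_score = -3    # random error on this particular administration
obtained_score = true_score + error_score
print(obtained_score)  # 77
```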
STANDARD ERROR of
MEASUREMENT (SEM)
Gives the margin of error that you should expect in an individual test score because of the imperfect reliability of the test
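SEM is commonly computed from the test’s standard deviation and its reliability coefficient: SEM = SD × √(1 − r). A sketch with hypothetical values:

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: standard deviation 10, reliability 0.91
sem = standard_error_of_measurement(10, 0.91)
print(round(sem, 2))  # 3.0
```

An obtained score of 75 on this hypothetical test would then carry a band of roughly 75 ± 3 (one SEM).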
FACTORS AFFECTING RELIABILITY
1. Test length
2. Test-retest interval
3. Variability of scores
4. Guessing
5. Variation within the test
situation
Let the target (bull’s eye) be the content objective.
Let the darts be the TEST SCORES.
Let the distance of the darts from the target be the measure of VALIDITY.
Let the distance of the darts from each other be the measure of RELIABILITY.

[Figure: two dartboards; on the first the darts cluster tightly but away from the bull’s eye (RELIABLE), on the second they cluster tightly on the bull’s eye (RELIABLE and VALID).]

THUS, A RELIABLE TEST IS NOT ALWAYS VALID
Thank you
for
LISTENING!
