Scales clarify the characteristics of measurement processes.
Scales indicate which statistical procedures are appropriate.
Ordinal: categories with order. Examples: size (S, M, L), social class, agreement (strong, some, low, none).
Interval: distance between categories is meaningful. Examples: temperature, ACT scores, shoe size, IQ.
Ratio: scale of categories has an absolute zero. Examples: age, income, all rates and percents, vacation time.
Does my measurement procedure give the same accurate measurement each time it is used?
Reliability is consistency in measurement.
Does this procedure or test yield the same results if you repeat the measurement, so long as conditions have not changed? How do we know?
Stern Tone Variator, from The Archives of the History of American Psychology
Validity is truth in measurement.
Does this procedure or test actually measure the construct or dimension it is intended to measure? How do we know?
Lavery Psychograph, from The Archives of the History of American Psychology
Reliable: each shot hits the same part of the target each time, so the shooting is consistent. Not valid: the goal is to hit the center of the target, but the shots are not in that area.
Valid: the shots are evenly distributed around the correct goal (the center), so the person probably tried to hit the correct place. Not reliable: the shots are off the mark in every possible direction, so they are not consistent.
Not reliable: the shots are not tightly clustered together, so they are not consistent. Not valid: to the extent there is any pattern, it is not at the true target, the center.
Reliable: the darts are close together, so the red player can reliably hit the same part of the target. Valid: the darts are clustered at the center, where they were aimed.
The scores only have meaning if they measure what they are
Reliability
Does the value observed and [...] multiple times or ways.
Every researcher must either use a known instrument, or test and demonstrate the reliability of a new tool.
The literature search is a huge labor-saving device.
Using a known instrument improves research quality.
Validity
Does the value observed and [...] data or similar processes.
Every researcher must either use a known instrument, or test and demonstrate the validity of a new tool.
The literature search is a huge labor-saving device.
Using a known instrument improves research quality.
Unreliable measurement tools introduce error.
Reliability improves with a new tool or method.
Test-retest is the simplest way to assess reliability.
It can be used whenever the process of measuring will not, by itself, affect the data.
Test with the correlation of first scores with second scores.
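The test-retest check above can be sketched in a few lines of Python. This is only an illustration: the function name `pearson_r` and the two score lists are made up for the example, and Pearson's r is assumed as the correlation, which suits interval or ratio scores.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # Note: fails with ZeroDivisionError if either list has no variation
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data: the same five people measured on two occasions
first_scores = [10, 12, 15, 18, 20]
second_scores = [11, 12, 14, 19, 21]

test_retest_r = pearson_r(first_scores, second_scores)
print(f"test-retest r = {test_retest_r:.2f}")
```

A value close to 1 suggests that people keep roughly the same relative standing from the first measurement to the second.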
theories and hypotheses. The more error in the data, whether random or systematic, the less likely they are to find true and significant results. Researchers in psychology and human service fields spend months and years developing a set of questions or observations that has high reliability.
Meaning of questions is unclear or produces random answers.
Raters are not adequately trained on the method of making ratings.
Some of the questions or items measure a subtly different dimension; they don't go with the others.
Instructions may be unclear or inconsistent, even if the test questions are fine.
Outside events may be having an effect.
so it can't be repeated
Test by computing
Abstract or complex dimensions can't be measured directly.
Several related items or observations are more likely to get an accurate result.
Need to verify that all the items relate to the same dimension.
Cronbach's alpha (α) is commonly reported.
Interpret strength on the same scale as correlation.
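As a rough illustration of how Cronbach's alpha is obtained, the sketch below uses the standard formula: alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the three items with five respondents are hypothetical, not data from any real instrument.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position i in every
    inner list belongs to the same respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score for each respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers of five respondents to three related questions
item_1 = [2, 3, 4, 4, 5]
item_2 = [3, 3, 4, 5, 5]
item_3 = [2, 4, 4, 5, 5]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

Items that rise and fall together across respondents push alpha toward 1; items that vary independently of each other pull it toward 0.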
Observation process is skilled and requires individual judgment.
Trained researchers make observations of the same subject independently.
Test with correlation of ratings (interval or ratio) or percent agreement (nominal or ordinal).
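For nominal or ordinal ratings, percent agreement between two raters can be computed as in the minimal sketch below. The function name `percent_agreement` and the two lists of ratings are invented for illustration.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of subjects that two raters placed in the same category."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)

# Hypothetical ratings of the same eight subjects, made independently
rater_a = ["high", "low", "medium", "high", "low", "medium", "high", "low"]
rater_b = ["high", "low", "medium", "medium", "low", "medium", "high", "high"]

agreement = percent_agreement(rater_a, rater_b)
print(f"agreement = {agreement:.1f}%")
```

For interval or ratio ratings, a correlation between the two raters' scores would be used instead; in Python 3.10 and later, `statistics.correlation` in the standard library computes Pearson's r.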
Taking In the View by Randy Son of Robert at http://www.flickr.com/photos/randysonofrobert/2384256036/
Measurement process needs accuracy across all possible outcomes.
Ceiling effect: the process cannot measure extreme high scores.
Floor effect: the process cannot measure extreme low scores.
Both are scale attenuation problems.
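A small sketch may make scale attenuation concrete. It assumes a hypothetical instrument that can only report values between 50 and 100; the function name `apply_scale_limits` and the "true" scores are made up for the example.

```python
def apply_scale_limits(true_scores, floor, ceiling):
    """Return the scores an instrument would report if it cannot go
    below `floor` or above `ceiling`."""
    return [min(max(score, floor), ceiling) for score in true_scores]

# Hypothetical true scores, some beyond what the instrument can register
true_scores = [35, 60, 85, 100, 115, 130]
observed = apply_scale_limits(true_scores, floor=50, ceiling=100)

true_range = max(true_scores) - min(true_scores)
observed_range = max(observed) - min(observed)
print(observed)                    # the three highest scorers look identical
print(true_range, observed_range)  # the observed spread is narrower
```

Once scores pile up at either limit, real differences among the extreme cases disappear from the data.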
sort out all the reliability questions.
Reliability issues are intertwined with validity.
Learn by seeing what competent researchers do.
beauty have an evolutionary impact (i.e., people are more likely to have children to pass along their genes)?
Dr. Harrell addresses both of those questions in this 4-minute audio clip.
belt safety
Previously validated measures of beauty
Two trained researchers evaluate attractiveness (inter-rater reliability)
Two different trained researchers observe safety (inter-rater reliability and avoiding bias)
more stable measure of a complex dimension
Some questions will be eliminated because they evoke varied and inconsistent responses.
The items need to cover the entire range of the dimension in order to observe the extreme values.
Reliability may change if observations are made in very different populations or situations (e.g., college students vs. seniors in assisted living).
Does my measurement procedure give a measurement of the construct or variable that I intend? Or is it measuring something else?
linked to the dimension or concept claimed?
Research design: how well does the experiment or study control the situation, so that we are confident that the relationships or results observed were due to the impact of the independent variable?
In this course, we consider validity in measurement now
Face validity
Content validity
Are all aspects of the dimension or concept covered?
Are any aspects over- or under-emphasized?
Does the measure differentiate this dimension from other similar ones?
Improving content validity
Thorough search of the literature
Consult with experts who disagree with your perspective
criterion
A music audition is a valid measure if it selects the better players over those with less ability (concurrent validity).
The GRE is a valid measure if people who do well on the GRE succeed in graduate school (predictive validity).
It is often as hard to demonstrate the link between the test and the criterion as between the test and the dimension.
idea related to a group of interrelated variables.
The construct itself might be socially constructed.
Classic studies in obedience are being re-interpreted.
Some cultures lack a word or idea for schizophrenia.
Researchers make their case for
Milgram's obedience experiment
evaluated in terms of degrees.
If a measurement has very low reliability, it can't be valid because it is not even accurate.
Maximum validity is the square root of reliability.
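The square-root rule can be tried out with a few numbers. The sketch below assumes reliability is expressed as a coefficient between 0 and 1; the function name `max_validity` and the sample reliability values are hypothetical.

```python
from math import sqrt

def max_validity(reliability):
    """Upper limit on a validity coefficient for a given reliability
    coefficient, following the rule that maximum validity is the
    square root of reliability."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sqrt(reliability)

# Hypothetical reliability coefficients for three instruments
for reliability in (0.25, 0.49, 0.81):
    print(f"reliability {reliability:.2f} -> "
          f"maximum validity {max_validity(reliability):.2f}")
```

The ceiling falls steadily as reliability falls, and an instrument with zero reliability has a ceiling of zero.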
Validity Reliability
Especially in inferential statistics, when relationships are tested, the analyst must remember to examine the quality of measurement.