Assessing Measurement
VALIDITY
Validity of measurement: are the claims of having appropriately measured the DV and the IVs valid?
Validity of causal inference: assuming this study finds a relationship, if we claim causality, is the relationship in fact causal?
Validity of Measurement: an empirical measure is valid to the extent that it adequately captures the real meaning of the concept under consideration (i.e., how well it measures the concept it is intended to measure).
Face Validity - look at the operationalization and assess whether, "on its face," it seems like a good translation of the construct. To improve the quality of a face-validity assessment, make it more systematic.
Content Validity
- concerns the extent to which a measure represents all facets of a concept;
- identify clearly the components of the total content domain, then show that the items adequately represent those components;
- Ex: knowledge tests;
- assumes a good, detailed description of the content domain, which may not always be available.
Criterion-related Validity I
Applies to measures (tests) that should indicate a person's present or future standing on a specific behavior (trait).
Criterion-related Validity II
Predictive Validity - assess how well the measure is able to predict something it should theoretically be able to predict.
Concurrent Validity - assess how well the measure is able to distinguish between groups that it should theoretically be able to distinguish between.
Construct Validity I
- based upon the accumulation of research evidence (literature review);
- here, "construct" refers to the concept being measured.
Assumption: the meaning of any concept is implied by statements of its theoretical relation to other concepts
Hence:
- examine theory;
- form hypotheses about variables that should be related to measure(s) of the concept;
- form hypotheses about variables that should NOT be related to measure(s) of the concept;
- gather evidence.
Construct Validity II
Convergent Validity - examine the degree to which the measure is similar to (converges on) other measures that it theoretically should be similar to.
Discriminant Validity - examine the degree to which the measure is not similar to (diverges from) other measures that it theoretically should not be similar to.
To estimate the degree to which any two measures are related to each other, one typically uses the correlation coefficient.
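Since both convergent and discriminant validity come down to correlating two measures, here is a minimal sketch of the correlation coefficient; the two scales and their scores below are made up for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores on two scales that should converge
# (e.g., two different measures of the same concept):
scale_a = [12, 15, 9, 20, 14, 17]
scale_b = [11, 16, 10, 19, 13, 18]
print(round(pearson_r(scale_a, scale_b), 2))  # a high r suggests convergence
```

For discriminant validity one would instead expect a correlation near zero between the measure and a theoretically unrelated scale.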
http://www.socialresearchmethods.net/kb/constval.php - construct validity
Assessing Reliability
Inter-coder reliability:
- check the degree to which different interviewers/observers/raters/coders give consistent estimates of the same phenomenon.
A. Nominal measure, where raters check off which category each observation falls in: calculate the % of agreement between the raters.
Ex: one measure with 3 categories; N = 100 observations, rated by two raters. On 86 of the 100 observations the raters checked the same category (i.e., 86% inter-rater agreement).
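The percent-agreement calculation from the example can be sketched in a few lines; the ratings below are fabricated to reproduce the 86-out-of-100 agreement:

```python
def percent_agreement(ratings_a, ratings_b):
    """Share of observations on which two raters chose the same category."""
    assert len(ratings_a) == len(ratings_b)
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

# Made-up ratings echoing the example: 100 observations, 3 nominal
# categories, with the two raters agreeing on 86 of them.
rater_1 = [1] * 50 + [2] * 30 + [3] * 20
rater_2 = [1] * 50 + [2] * 16 + [1] * 14 + [3] * 20
print(percent_agreement(rater_1, rater_2))  # prints 0.86
```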
Test-retest reliability - administer the same test to the same sample on two different occasions.
- calculate the correlation between the repeated applications of the measure through time.
Problems: respondents may remember and repeat their earlier answers (memory/practice effects), and the trait being measured may genuinely change between the two administrations.
Split-half reliability - calculate the correlation between responses to subsets of items from the same measure:
- apply the scale to a sample;
- randomly divide the scale's items into two halves;
- score each half separately;
- correlate the results of the two halves (the higher the correlation, the better).
- Systematic measurement error - when factors systematically influence the process of measurement, or the concept we measure.
- Random error - when temporary, chance factors affect measurement; its presence, extent, and direction are unpredictable from one question to the next, or from one respondent to the next.
See:
www.socialresearchmethods.net
Appropriate Measurement
A measure should be:
- Valid;
- Reliable;
- Exhaustive - it should exhaust the possibilities of what it is intended to measure; there must be sufficient categories so that virtually all units of observation being classified will fit into one of the categories;
- Mutually exclusive - each observation fits one and only one of the scale values (categories).
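A quick way to check exhaustiveness and mutual exclusivity for a concrete coding scheme is to verify that every observation falls into exactly one category; the age brackets below are hypothetical:

```python
# Hypothetical age-bracket coding scheme; each bracket is (low, high) inclusive.
brackets = {"18-29": (18, 29), "30-49": (30, 49), "50+": (50, 120)}

def categories_for(age):
    """All bracket labels a given observation falls into."""
    return [label for label, (lo, hi) in brackets.items() if lo <= age <= hi]

ages = [18, 25, 30, 49, 50, 90]
for age in ages:
    hits = categories_for(age)
    # Zero hits would mean the scheme is not exhaustive;
    # two or more hits would mean the categories overlap.
    assert len(hits) == 1, f"age {age}: not exactly one category ({hits})"
print("scheme is exhaustive and mutually exclusive for the observed ages")
```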