The MOS Short-Form General Health Survey: Reliability and Validity in a Patient Population
Author(s): Anita L. Stewart, Ron D. Hays and John E. Ware, Jr.
Source: Medical Care, Vol. 26, No. 7 (Jul., 1988), pp. 724-735. Published by: Lippincott Williams & Wilkins. Stable URL: http://www.jstor.org/stable/3765494

There is a great demand for measures of physical and mental health, social and role functioning, and other general health concepts for use in evaluating health care [1-3]. To be useful, these instruments should represent multiple health concepts and a range of health states pertaining to general functioning and well-being [4]. They should adhere to conventional standards of reliability and validity [5]. To be useful in clinic settings, measures must also be simple and easy to use. Patients tend to be sicker than the general population, their attention is divided, and time is limited. Therefore, measures that work well in general populations may not work well for patients.
Health surveys that are comprehensive and satisfy psychometric standards are currently available, including the McMaster Health Index Questionnaire, the Sickness Impact Profile (SIP), the Functional Status Questionnaire, the Duke-UNC Health Profile, the RAND Health Insurance Experiment (HIE) measures, the Nottingham Health Profile, and the Index of Well-being (IWB) [6-12]. However, these instruments are too long to be practical in most clinic settings. For example, the HIE health scales included 108 questionnaire items that took an average of 45 minutes to complete. The short form of the SIP includes 136 questions that take an average of 30 minutes [13]. The IWB is interviewer-administered and takes about 18 minutes [14].

The length of most available instruments has prompted investigators to adopt surveys based on a few single-item measures [15]. For example, Spitzer, Dobson, Hall, et al. developed a quality of life index that aggregates five single-item measures of health and health-related concepts; the index required about a minute for completion by physicians [16]. A single-item rating of health in general is one of the most commonly used measures, such as in the National Health and Nutrition Examination Survey [17]. Single-item measures of subjective well-being have been proposed by several investigators [18-20]. In general, single-item measures are less satisfactory than multi-item scales because single items are generally less precise, less reliable, and less valid [21,22]. Multi-item scales also provide more options for estimating scores when a response to a given item is missing.

A compromise between lengthy instruments and single-item measures of health was sought. A small subset of items from long-form measures has been shown to satisfy standards of acceptability, reliability, and validity in a general population [23]. Reported here are results from administering such a survey to patients. The results are compared with those from administration of this survey to a general population.

From the Department of Behavioral Sciences, The RAND Corporation, Santa Monica, California. Supported by grants for the Medical Outcomes Study from the Robert Wood Johnson Foundation, The Henry J. Kaiser Family Foundation, and the Pew Charitable Trusts. The opinions expressed are those of the authors and do not necessarily reflect the opinions of the sponsors or The RAND Corporation. Address correspondence to: Anita L. Stewart, The RAND Corporation, 1700 Main Street, Santa Monica, CA 90406.

Methods

Sampling

Patients surveyed were participating in the Medical Outcomes Study (MOS), an observational study of variations in physician practice styles and patient outcomes in different systems of care. At each of three study sites (Boston, Chicago, Los Angeles), physicians (general internists, family physicians, cardiologists, endocrinologists, diabetologists, psychiatrists), psychologists, and other mental health providers were sampled from health maintenance organizations (HMO), large multispecialty groups, and solo fee-for-service practices.
Altogether, 526 health care providers aged 31-55, who reported direct patient care as their primary professional activity and who had been in their current practice setting at least one year, were included in the MOS.

The information in this article is based on a sample of 11,186 adult, English-speaking patients who visited these providers during the sampling period (lasting 9 days on average). Ages ranged from 18 to 103 (mean age was 47). Thirty-eight percent were male and 87% had completed high school (average of 13.7 years of education). Fifty percent of the sample had a total household income of at least $20,000 in 1985 dollars. Seventy-nine percent were white, 11% black, 5% Latino, and 3% Asian or Pacific Islander. The general population sample of adults representing United States households, to which we compare results, is described elsewhere [23]. As expected [24], the patient sample was slightly older and overrepresented women relative to the general population sample. The patients were also slightly more educated and had slightly higher income.

Data Collection

Data from patients were collected from February through October 1986. The 20 health items were located in the middle of a 75-item self-administered questionnaire, which was completed by patients as they waited to see their doctor. In all solo practices and in some group practices, surveys were distributed by office staff. In most group practices, they were distributed by MOS field representatives. The entire questionnaire took an average of 13 minutes to complete, of which the health items are estimated to have taken 3-4 minutes, on average. Questionnaires were returned for about 74% of the eligible patient visits in group practices and about 65% of such visits in fee-for-service practices. These return rates underestimate patient acceptance of the questionnaire because, when practices were very busy, staff were encouraged to survey every other patient.
Health Measures

In accordance with the minimum standard of content validity for a comprehensive health measure suggested by Ware [4] and consistent with previous definitions of health [25-27], 20 items were selected to represent six health concepts: physical functioning, role functioning, social functioning, mental health, health perceptions, and pain. Physical functioning was assessed by limitations in a variety of physical activities, ranging from strenuous to basic, due to health. Role and social functioning were defined by limitations due to health problems. Mental health was assessed in terms of psychological distress and well-being. The measure of health perceptions tapped patients' own ratings of their current health in general. Pain was included to capture differences in physical discomfort. Definitions of physical functioning, mental health, and health perceptions tap positive as well as negative states of health. The definitions are summarized in Table 1. Questionnaire items are presented in the appendix.

Eighteen of the 20 items were adapted from longer HIE measures of these concepts and were used successfully in a general population survey [23]. The two additional single-item measures (social functioning and pain) were added after that administration, based on experience with similar measures in the HIE.

TABLE 1. Definitions of Health Concepts

| Measure | No. of Items | Definition | Item Numbers (a) |
|---|---|---|---|
| Physical functioning | 6 | Extent to which health interferes with a variety of activities (e.g., sports, carrying groceries, climbing stairs, and walking) | 16a-16f |
| Role functioning | 2 | Extent to which health interferes with usual daily activity such as work, housework, or school | 18, 19 |
| Social functioning | 1 | Extent to which health interferes with normal social activities such as visiting with friends during past month | 20 |
| Mental health | 5 | General mood or affect, including depression, anxiety, and psychologic well-being during the past month | 21-25 |
| Health perceptions | 5 | Overall ratings of current health in general | 2, 26a-26d |
| Pain | 1 | Extent of bodily pain in past 4 weeks | 17 |

(a) See appendix.

Analysis Plan

Analyses were designed to evaluate the extent to which very short multi-item scales would satisfy traditional psychometric criteria [5]. A multitrait scaling method was used to test item convergent and discriminant validity [28]. This method consists of three steps designed to determine whether items have equivalent variances, whether each item in a hypothesized group is substantially related (r > 0.40) to the total score computed from other items in that group (item convergent validity criterion), and whether each item correlates significantly higher with its hypothesized scale than with other scales (item discriminant validity criterion). If these conditions are met, it is appropriate to combine items as hypothesized into simple summated ratings scales. These multitrait scaling tests were performed for the patients having complete data on all 20 items (N = 8,294, 73% of respondents).

Cronbach's alpha [29], a measure of internal-consistency reliability, was estimated for the four multi-item scales.
Reliability is considered acceptable for group comparisons when alpha is 0.50 or above [30]. On the strength of experience with longer forms of these measures, the authors thought it would be possible to achieve reliability coefficients above 0.70, as recommended by Nunnally [31]. Reliability was evaluated for the total sample and in two subsamples for whom data quality was hypothesized to be lower based on prior studies: those with less than a high school education and those over age 75 [32]. In addition, because patients with serious health problems might have trouble completing such a questionnaire, the authors tested the reliability separately for groups of patients with congestive heart failure, depressive symptoms, diabetes, and/or recent myocardial infarction. Where possible, the reliability estimates for the short-form scales were also compared with those obtained for longer versions of the same scales.

Preliminary tests of validity were also possible within the available data set, including product-moment correlations among the health measures, discrimination between patient and general population groups, and correlations with sociodemographic characteristics. All correlations were examined to determine whether the short-form measures produced the same pattern of results as observed for long-form measures in previous research. Patient and general population groups were compared by examining the magnitude of the difference in the proportion of those scoring in the "poor" health range (i.e., known groups validity).

The variability of score distributions was examined for all six health measures. Our goal was to achieve scores that were normally distributed or at least not highly skewed or kurtotic.
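As a rough illustration of the two statistics named in the analysis plan, the sketch below computes Cronbach's alpha and an item-scale correlation corrected for overlap (each item correlated with the sum of the remaining items in its scale). The function names and the toy responses are hypothetical and are not data from the study. The article does not describe the software used, so this is only one plausible way to carry out the calculation on complete-data cases.

```python
from math import sqrt


def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (complete data only)."""
    k = len(rows[0])
    item_variances = [sample_variance([row[j] for row in rows]) for j in range(k)]
    total_variance = sample_variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def corrected_item_scale_r(rows, j):
    """Correlate item j with the sum of the other items in the same scale."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson_r(item, rest)


# Hypothetical responses from five people to a three-item scale.
toy_rows = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(toy_rows)
item_rs = [corrected_item_scale_r(toy_rows, j) for j in range(3)]
print(f"alpha = {alpha:.2f}")
print("corrected item-scale r:", [round(r, 2) for r in item_rs])
```

Under the criteria stated in the text, an item would pass the convergent validity check when its corrected correlation exceeds 0.40, and an alpha of 0.50 or above would be treated as acceptable for group comparisons.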
Construction of Health Measures

Consistent with previous studies, limitations in physical and role functioning were counted regardless of duration and were scored to reflect the number of limitations present [33,34]. Scores were reversed so that a high value indicated better functioning. The mental health scale was scored by summing the item responses, after reversing the scoring of some items so that a high score indicated better health. Before combining items in the health perceptions scale, the authors recoded the response choices of the overall health item (item 2) to better reflect the unequal intervals of the item.* The scale was scored by summing the item responses, after recoding some items so that a high score indicated better health. The single-item measures were scored so that high scores indicated better social functioning and more pain. Finally, for all measures, scores were transformed linearly to 0-100 scales, with 0 and 100 assigned to the lowest and highest possible scores, respectively.†

* To do this, the authors calculated the mean general health score for the other four items for each response level on the overall item. These mean scores were then transformed to a 1-5 scale to correspond to the scale used for the other 4 items. This resulted in the following transformation: 1 = 5, 2 = 4.36, 3 = 3.43, 4 = 1.99, 5 = 1.

Results

Multitrait Scaling

The item means and standard deviations were roughly equal within each scale, thus meeting the first criterion for summated ratings scales, with one minor exception in the physical functioning scale. Item-scale correlations (corrected for overlap) for the four multi-item scales indicated that our stringent criterion of convergent validity was met in all cases. Item-scale correlations for hypothesized scales ranged from 0.45 to 0.79, with a median of 0.68. All items in each hypothesized scale also exceeded the discriminant validity criterion.
These results support the construction of simple summated ratings scales based on hypothesized item groupings.

Variability of Health Measures

The mean and standard deviation for each of the measures are shown in Table 2. The full range of possible scores was observed for all measures (data not reported).‡ The distributions of mental health and health perceptions scores were roughly symmetric, as desired. The distributions of physical and role functioning scores were skewed, with more people scoring along the positive end of the scale but to a lesser degree than in general populations [34]. The role functioning scale had a somewhat bimodal distribution with the least prevalent category being the middle one; 72% had perfect functioning and 17% scored at the worst level. The distribution of the social functioning item was quite skewed and kurtotic, with the modal score (69% of the sample) being perfect functioning. The pain item was well distributed even though the modal score was no pain (30%), with approximately 9% reporting severe or very severe pain and about 20% reporting moderate pain.

TABLE 2. Descriptive Statistics for Health Scales and Percent of Patient (N = 11,186) and General Population (N = 2,008) Samples Scoring in the "Poor" Health Range

| Measure (a) | No. of Items | Mean | SD | % in Poor Health (b): Patients | % in Poor Health (b): General Population (c) |
|---|---|---|---|---|---|
| Physical functioning | 6 | 78.5 | 30.8 | 45 | 22 |
| Role functioning | 2 | 77.5 | 38.3 | 28 | 12 |
| Social functioning | 1 | 87.2 | 23.6 | 9 | (d) |
| Mental health | 5 | 72.6 | 20.2 | 31 | 19 |
| Health perceptions | 5 | 63.0 | 26.8 | 52 | 20 |
| Pain | 1 | 31.4 | 27.7 | 29 | (d) |

(a) Observed range of all scores was 0-100. A high score indicates better health except for pain, where a high score indicates more pain.
(b) Poor health defined as: physical and role functioning = one or more limitations; social functioning = limitations a good bit of the time or more; mental health = lowest 19% of scores in general population sample (score of 67 or lower) (cutoff defined as close as possible to the bottom 20%); health perceptions = lowest 20% of scores in general population sample (score of 70 or lower); pain = moderate, severe, or very severe pain.
(c) T-tests of difference between proportions in patient and general population samples were statistically significant (P < 0.01) for every possible comparison.
(d) Not available.

† For the mental health and general health scales, a missing score was assigned only if all five items in the scale were missing. For the two-item role-functioning scale, a missing score was assigned initially if either item was missing. However, if the one nonmissing item indicated that the person was unable to work (limited on item 18), the lowest possible functioning score of 0 was assigned. If the one nonmissing item indicated that the person had no limitations in the kind or amount of work (not limited on item 19), the highest possible functioning score of 100 was assigned. For the physical functioning scale, a missing score was assigned if more than one item was missing. However, if the first item (item 16a) was answered as not limited and any remaining items were missing, a score of 100 was assigned. The authors found about 8% of respondents were assigned a missing value on physical and role functioning, 3% on mental health, and less than 1% on general health. Missing data rates for the single-item pain and social functioning measures were both about 3%. Overall, 15% of the sample had missing data on one or more of the final scales. Missing data rates were highest among those over 75 (39%).

‡ Actual score distributions are available from the senior author upon request.
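The scoring and missing-data rules described for the health perceptions and role functioning scales can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scoring program: the function names are hypothetical; the choice of which of items 26a-26d to reverse is inferred from the item wording in the appendix; averaging the answered items is an assumed way of estimating a scale score when some items are missing; and the three-level role functioning score (0, 50, 100) is inferred from the statement that scores reflect the number of limitations present. Only the recode values for the overall health item are taken directly from the footnote.

```python
# Recode of the overall health item (1 = Excellent ... 5 = Poor),
# using the values given in the article's footnote.
OVERALL_HEALTH_RECODE = {1: 5.0, 2: 4.36, 3: 3.43, 4: 1.99, 5: 1.0}


def to_0_100(raw, lowest, highest):
    """Linear transformation so the lowest possible score is 0 and the highest 100."""
    return 100.0 * (raw - lowest) / (highest - lowest)


def score_health_perceptions(overall, a, b, c, d):
    """Items are coded as in the appendix; pass None for a missing response."""
    values = []
    if overall is not None:
        values.append(OVERALL_HEALTH_RECODE[overall])
    if a is not None:
        values.append(a)        # "I am somewhat ill": 5 (definitely false) is better
    if b is not None:
        values.append(6 - b)    # "as healthy as anybody": reversed (assumption)
    if c is not None:
        values.append(6 - c)    # "My health is excellent": reversed (assumption)
    if d is not None:
        values.append(d)        # "feeling bad lately": 5 (definitely false) is better
    if not values:
        return None             # missing only if all five items are missing
    mean_item = sum(values) / len(values)  # assumed estimate from answered items
    return to_0_100(mean_item, 1.0, 5.0)


def score_role_functioning(item18, item19):
    """Responses: 1 or 2 = limited (any duration), 3 = not limited, None = missing."""
    def limited(response):
        return response in (1, 2)

    if item18 is not None and item19 is not None:
        n_limitations = int(limited(item18)) + int(limited(item19))
        return to_0_100(2 - n_limitations, 0, 2)
    if item18 is not None and limited(item18):
        return 0.0              # unable to work: lowest score despite missing item 19
    if item19 is not None and not limited(item19):
        return 100.0            # no limits in kind or amount: highest score
    return None


print(score_health_perceptions(2, 4, 2, 3, None))
print(score_role_functioning(3, None))
```

The second call returns a missing score because a "not limited" answer on item 18 alone does not settle whether the respondent is limited in the kind or amount of work.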
Reliability of Health Measures

Reliability coefficients for the multi-item health scales ranged from 0.81 to 0.88 (Table 3). These estimates are nearly identical to those for the same scales in the general population sample (range was from 0.76 to 0.88). Estimates for the four multi-item scales were similar for depressed patients (0.82 to 0.87) and for other subgroups analyzed: congestive heart failure (0.77 to 0.87), diabetes (0.83 to 0.87), myocardial infarction (0.77 to 0.88), less than a high school education (0.86 to 0.88), and over age 75 (0.84 to 0.89).

The internal-consistency reliabilities of these short-form measures were lower, but not much lower, than those of their full-length versions. The internal-consistency reliability of the five-item health perceptions measure was 0.87, compared with 0.88 for a nine-item general health scale [35]. The reliability of the five-item mental health measure was 0.88, compared with 0.96 for a 38-item version [36]. The reliability of the six-item physical functioning measure was 0.86, compared with 0.90 for a similar 10-item measure [37]. Finally, the reliability of the two-item role functioning measure was 0.81, compared with a coefficient of reproducibility of 0.92 for a three-item version [34].

TABLE 3. Reliability Estimates and Correlations Among Health Measures

| Measure | PF | RF | SF | MH | HP | P |
|---|---|---|---|---|---|---|
| Physical Functioning (PF) | (0.86) | | | | | |
| Role Functioning (RF) | 0.65 | (0.81) | | | | |
| Social Functioning (SF) | 0.47 | 0.56 | (-) | | | |
| Mental Health (MH) | 0.24 | 0.33 | 0.45 | (0.88) | | |
| Health Perceptions (HP) | 0.53 | 0.57 | 0.53 | 0.45 | (0.87) | |
| Pain (P) | -0.39 | -0.42 | -0.39 | -0.42 | -0.47 | (-) |

Note: Ns varied from 9,729 to 10,860, due to missing data. All correlation coefficients are statistically significant (P < 0.01). Internal-consistency reliabilities are given on the diagonal for multi-item scales.

Validity of Health Measures

Results for the three types of validity analyses, introduced in the methods section, are presented below.

Correlations Among Health Measures. All correlations among the health measures were statistically significant (P < 0.01) and most were substantial in magnitude (see Table 3). This pattern of correlations corresponds well with that observed from studies of full-length versions of these measures [10,38]. Because the social functioning item had the same format as the mental health items, it might be expected to correlate highest with mental health. It did not, suggesting that this method effect did not dominate the results. Consistent with previous research [35], the health perceptions scale correlated substantially with both physical and mental health; in fact, that scale correlated substantially with all of the other health scales.

Comparison of Patient and General Population Samples. The authors calculated the percentage scoring in the "poor" health range for each of the health measures (see Table 2 for definitions of poor health). The percentage of respondents with poor health was significantly greater (P < 0.01) in the patient sample than in the general population sample on all four comparable measures, consistent with previous studies [13,16,39,40]. The percentage of patients with physical or role limitations or poor health perceptions was about twice that observed for the general population sample. The percentage of respondents with poor mental health was 50% larger in the patient sample than in the general population sample. These differences could not be accounted for by differences in sociodemographic characteristics between the two samples.
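The known-groups comparison can be retraced from the percentages published in Table 2. The sketch below computes the patient-to-general-population ratio for each comparable measure and a pooled two-sample z statistic for the difference in proportions. It is an illustration only: the article reports t-tests rather than this z statistic, and the nominal sample sizes from the table title are used even though the number of respondents with non-missing scores was smaller for each measure. The function and variable names are hypothetical.

```python
from math import sqrt

# Percent in the "poor" health range, copied from Table 2 as
# (patients, general population) for the four comparable measures.
POOR_HEALTH_PCT = {
    "physical functioning": (45, 22),
    "role functioning": (28, 12),
    "mental health": (31, 19),
    "health perceptions": (52, 20),
}

# Nominal sample sizes from the Table 2 title (assumption: ignores missing data).
N_PATIENTS, N_GENERAL = 11186, 2008


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-sample z statistic for a difference between proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err


results = {}
for measure, (pct_patients, pct_general) in POOR_HEALTH_PCT.items():
    ratio = pct_patients / pct_general
    z = two_proportion_z(pct_patients / 100, N_PATIENTS,
                         pct_general / 100, N_GENERAL)
    results[measure] = (ratio, z)
    print(f"{measure}: ratio = {ratio:.2f}, z = {z:.1f}")
```

A z statistic above about 2.58 corresponds to a two-sided P below 0.01 under the normal approximation, which is the significance level quoted in the text.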
Correlations Between Health Measures and Sociodemographics. Correlations between the health measures and age, sex, education, income, and race (not shown) were consistent with results using longer form measures [38]. People with more education and income tended to have better health. In this study, men reported slightly better health than women on all measures except health perceptions. Older people tended to report poorer health than younger people on all measures except mental health, as expected. Nonwhites tended to report poorer health perceptions, poorer social functioning, and more pain than whites.

Discussion

The authors' goal was to develop a general health survey that is comprehensive and psychometrically sound, yet short enough to be practical for use in large-scale studies of patients in practice settings. The resulting 20-item survey assesses physical functioning, role and social functioning, mental health, health perceptions, and pain. A full range of favorable and unfavorable health levels is tapped by items in these measures. Thus, the survey achieves breadth and depth of measurement, while permitting self-administration in only 3-4 minutes. By virtue of its reduction of respondent burden by 80% relative to the lengthier measures from which it was derived, this short-form survey offers a more practical approach to patient health assessment.

The reliability of each of the multi-item scales is acceptable for group comparisons, even in subgroups of patients over age 75, with serious chronic conditions, with depressive symptoms, and with less than a high school education. The reliability of the mental health and the health perceptions measures, however, might be slightly inflated because these items were asked in a sequence.
Future tests of the reliability of these items should split them up to minimize recall effects.

The fact that the reliabilities observed here were not substantially lower than those for long-form measures is encouraging. However, it does not necessarily follow that short-form measures will achieve equivalent precision in measuring changes in health over time. Such precision is essential for studies of health outcomes. Although some sacrifice in precision is likely with short measures, compared with lengthier ones, these short-form scales represent a gain in precision relative to single-item measures, which are typically coarse [22]. Tradeoffs between short- and long-form measures in detecting changes in health over time are currently being evaluated in the MOS.

The results also offer preliminary support for the validity of the measures. First, excellent item discrimination among hypothesized scales in the multitrait scaling analyses was observed. Second, correlations among the health measures and between the measures and sociodemographic characteristics were similar to correlations observed using longer form versions of these measures. Third, substantial differences in health between the patient and general population samples were observed and the pattern of differences (across measures) was consistent with previous research [39].

The validity of a health survey cannot be established in a single study. Future studies of these short-form measures should evaluate how well they discriminate among groups differing in diagnosis and disease severity. Their validity in predicting future health and utilization of health services should also be tested. Short-form measures may not do as well as long-form measures in tests of validity [22]. Thus, it is important to establish the limits of the short-form measures and understand fully the tradeoffs involved in their use.
A relatively high rate of missing data for items in the physical and role functioning scales (8%) was observed. The pattern of missing data in the role functioning scale suggests that some people overlooked the second role functioning item (item 19). This item should be printed directly below the first one in future administrations of this battery. For the physical functioning items, instructions should make clearer that respondents should answer every item, not just those describing problems that apply.§ Missing data were more prevalent among patients older than 75, those with less than a high school education, and those with diabetes or heart disease. The authors suspect that this problem is not unique to this study and suggest that special steps be taken to guard against this problem in studies of these populations. An advantage of multi-item scales in this regard is the option they provide for estimating missing item scores from other items in the same scale. Using this method, the percentage of respondents available for analysis was increased from 73% to 85%.

§ The modified instructions for physical functioning are as follows: For how long has your health limited you in each activity listed? Please provide an answer for each of the activities listed.

The results of this study offer some encouragement regarding the potential of short-form health measures in surveys of both patients and general populations. It may no longer be necessary to completely omit measures of functional status and well-being from large-scale studies because of practical constraints. Further, for some purposes the same short-form measures may be appropriate for use in both populations.
The usefulness of these measures in health policy research to evaluate the effects of health care as well as in clinical trials also warrants further study. (Key words: health; health assessment; functioning.) References 1. McDermott W. Absence of indicators of the influ- ence of its physicians on a society's health. Am J Med 1981;70:833. 2. Schroeder SA. Outcome assessment 70 years later: are we ready? N Engl J Med 1987;316:160. 3. Tarlov AR. Shattuck lecture-the increasing sup- ply of physicians, the changing structure of the health- services system, and the future practice of medicine. N Engl J Med 1983;308:1235. 4. Ware JE. Standards for validating health mea- sures: definition and content. J Chronic Dis 1987;40:473. 5. Ware JE. Methodological considerations in the se- lection of health status assessment procedures. In: Wenger NK, Mattson ME, Furberg CD, et al., eds. As- sessment of Quality of Life in Clinical Trials of Cardio- vascular Therapies. New York: Le Jacq Publishing, 1984. 6. Chambers LW, MacDonald LA, Tugwell P, et al. The McMaster Health Index Questionnaire as a mea- sure of quality of life for patients with rheumatoid dis- ease. J Rheumatol 1982;9:780. 7. Bergner M, Bobbitt RA, Carter WB, et al. The Sickness Impact Profile: development and final revi- sion of a health status measures. Med Care 1981; 19:787. 8. Jette AM, Davies AR, Cleary PD, et al. The Func- tional Status Questionnaire: reliability and validity when used in primary care. J Gen Intern Med 1986; 1:143. 9. Parkerson GR, Gehlbach SH, Wagner EH, et al. The Duke-UNC health profile: an adult health status instrument for primary care. Med Care 1981; 19:806. 10. Brook RH, Ware JE, Davies-Avery A, et al. Con- ceptualization and measurement of health for adults in the Health Insurance Study: vol VIII, overview. Santa Monica, CA: The RAND Corporation (publication number R-1987/8-HEW), 1979. 11. Hunt SM, McKenna SP, McEwen J, et al. 
The Nottingham Health Profile: Subjective health status and medical consultations. Soc Sci Med 1981;15A:221. 12. Patrick DL, Bush JW, Chen MM. Toward an op- erational definition of health. J Health Soc Behav 1973; 14:6. 13. Bergner M. The Sickness Impact Profile (SIP) In: Wenger NK, Mattson ME, Furberg CD, et al., eds. As- sessment of Quality of Life in Clinical Trials of Cardio- vascular Therapies. New York: Le Jacq Publishing, 1984. 14. Read JL, Quinn RJ, Hoefer MA. Measuring overall health: an evaluation of three important ap- proaches. J Chronic Dis 1987;40(Supp):7S. 15. Diener E. Subjective well-being. Psychol Bull 1984;95:542. 16. Spitzer WO, Dobson AJ, Hall J, et al. Measuring the quality of life of cancer patients: a concise QL-index for use by physicians. J Chronic Dis 1981;34:585. 17. Wan TTH, Livieratos B. Interpreting a general index of subjective well-being. Milbank Mem Fund Q 1978;56:531. 18. Andrews FM, Withey SB. Developing measures of perceived life quality: results from several national surveys. Social Indicators Research 1974; 1:1. 19. Cantril H. The Pattern of Human Concerns. New Brunswick, NJ: Rutgers University Press, 1965. 20. Gurin G, Veroff J, Feld S. Americans View Their Mental Health. New York: Basic Books, 1960. 21. Ware JE, Karmos AH. Development and valida- tion of scales to measure perceived health and patient role propensity: volume II of a final report. Springfield, VA: National Technical Information Services (NTIS publication no. PB288-331), 1976. 22. Manning WG, Newhouse JP, Ware JE. The status of health in demand estimation; or, beyond ex- cellent, good, fair, and poor. In: Fuchs VR, ed. Eco- nomic Aspects of Health. Chicago: University of Chi- cago Press, 1982. 23. Ware JE, Sherboure CA, Davies AR, et al. A short-form general health survey. Santa Monica: The RAND Corporation (publication number P-7444), 1988. 24. McKinlay JB. Some approaches and problems in the study of the use of services: an overview. 
J Health Soc Behav 1972;13:115.
25. World Health Organization. Constitution of the World Health Organization. In: Basic Documents. Geneva: World Health Organization, 1948.
26. Bergner M. Measurement of health status. Med Care 1985;23:696.
27. Breslow L. A quantitative approach to the World Health Organization definition of health: physical, mental and social well-being. Int J Epidemiol 1972;1:347.
28. Ware JE, Brook RH, Davies-Avery A, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol. I, model of health and methodology. Santa Monica: The RAND Corporation (publication number R-1987/1-HEW), 1980.
29. Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika 1951;16:297.
30. Helmstadter GC. Principles of Psychological Measurement. New York: Appleton-Century-Crofts, 1964.
31. Nunnally JC. Psychometric Theory, 2nd ed. New York: McGraw-Hill, 1978.
32. Andrews FM. Construct validity and error components of survey measures: a structural modeling approach. Public Opinion Quarterly 1984;48:409.
33. Stewart AL, Ware JE, Brook RH. Advances in the measurement of functional status: construction of aggregate indexes. Med Care 1981;19:473.
34. Stewart AL, Ware JE, Brook RH. Construction and scoring of aggregate functional status measures: volume I. Santa Monica: The RAND Corporation (publication no. R-2551-1-HHS), 1982.
35. Davies AR, Ware JE. Measuring health perceptions in the Health Insurance Experiment. Santa Monica: The RAND Corporation (publication number R-2711-HHS), 1981.
36. Veit CT, Ware JE. The structure of psychological distress and well-being in general populations. J Consult Clin Psychol 1983;51:730.
37. Stewart AL, Ware JE, Brook RH, et al. Conceptualization and measurement of health for adults in the Health Insurance Study: vol II, physical health in terms of functioning. Santa Monica: The RAND Corporation (publication no. 1987/2-HEW), 1978.
38. Ware JE, Davies-Avery A, Brook RH. Conceptualization and measurement of health for adults in the Health Insurance Study: vol VI, analysis of relationships among health status measures. Santa Monica: The RAND Corporation (publication number R-1987/6-HEW), 1980.
39. Nelson E, Conger B, Douglass R, et al. Functional health status levels of primary care patients. JAMA 1983;249:3331.
40. Cassileth BR, Lusk EJ, Strouse TB, et al. Psychosocial status in chronic illness: a comparative analysis of six diagnostic groups. N Engl J Med 1984;311:506.

Appendix
Short-form Health Survey: Medical Outcomes Study

2. In general, would you say your health is:
□ Excellent
□ Very Good
□ Good
□ Fair
□ Poor

17. How much bodily pain have you had during the past 4 weeks?
1 □ None
2 □ Very mild
3 □ Mild
4 □ Moderate
5 □ Severe

16. For how long (if at all) has your health limited you in each of the following activities? (Check One Box on Each Line)
Response options: 1 = Limited for more than 3 months; 2 = Limited for 3 months or less; 3 = Not limited at all
a. The kinds or amounts of vigorous activities you can do, like lifting heavy objects, running or participating in strenuous sports   □ □ □
b. The kinds or amounts of moderate activities you can do, like moving a table, carrying groceries or bowling   □ □ □
c. Walking uphill or climbing a few flights of stairs   □ □ □
d. Bending, lifting or stooping   □ □ □
e. Walking one block   □ □ □
f. Eating, dressing, bathing, or using the toilet   □ □ □

18. Does your health keep you from working at a job, doing work around the house or going to school?
1 □ Yes, for more than 3 months
2 □ Yes, for 3 months or less
3 □ No

19. Have you been unable to do certain kinds or amounts of work, housework or schoolwork because of your health?
1 □ Yes, for more than 3 months
2 □ Yes, for 3 months or less
3 □ No

For each of the following questions, please check the box for the one answer that comes closest to the way you have been feeling during the past month. (Check One Box on Each Line)
Response options: 1 = All of the Time; 2 = Most of the Time; 3 = A Good Bit of the Time; 4 = Some of the Time; 5 = A Little of the Time; 6 = None of the Time
20. How much of the time, during the past month, has your health limited your social activities (like visiting with friends or close relatives)?   □ □ □ □ □ □
21. How much of the time, during the past month, have you been a very nervous person?   □ □ □ □ □ □
22. During the past month, how much of the time have you felt calm and peaceful?   □ □ □ □ □ □
23. How much of the time, during the past month, have you felt downhearted and blue?   □ □ □ □ □ □
24. During the past month, how much of the time have you been a happy person?   □ □ □ □ □ □
25. How often, during the past month, have you felt so down in the dumps that nothing could cheer you up?   □ □ □ □ □ □

26. Please check the box that best describes whether each of the following statements is true or false for you. (Check One Box on Each Line)
Response options: 1 = Definitely True; 2 = Mostly True; 3 = Not Sure; 4 = Mostly False; 5 = Definitely False
a. I am somewhat ill   □ □ □ □ □
b. I am as healthy as anybody I know   □ □ □ □ □
c. My health is excellent   □ □ □ □ □
d. I have been feeling bad lately   □ □ □ □ □

NOTE: Item numbers indicate the order in which the questions appeared in the questionnaire.