
AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 121:1–9 (2003)

Bayes' Theorem in Paleopathological Diagnosis


Steven N. Byers¹* and Charlotte A. Roberts²

¹Department of Anthropology, University of New Mexico, Albuquerque, New Mexico 87131-1086
²Department of Archaeology, University of Durham, Durham DH1 3LE, England, UK

KEY WORDS paleopathology; Bayes' theorem; diagnosis

ABSTRACT The utility of Bayes' theorem in paleopathological diagnoses is explored. Since this theorem has been used heavily by modern clinical medicine, its usefulness in that field is described first. Next, the mechanics of the theorem are discussed, along with methods for deriving the prior probabilities needed for its application. Following this, the sources of these prior probabilities and their accompanying problems in paleopathology are considered. Finally, an application using prehistoric rib lesions is presented to demonstrate the utility of this method to paleopathology. Am J Phys Anthropol 121:1–9, 2003. © 2003 Wiley-Liss, Inc.

One of the most vexing problems facing paleopathologists is the accurate identification of pathological conditions in ancient osteological remains. As discussed by numerous workers (Brothwell, 1981; Ortner, 1992; Waldron, 1994; Miller et al., 1996; Lovell, 2000), there are many difficulties associated with this process, because different pathological conditions have similar osteological manifestations (Ortner, 1992; Aufderheide and Rodríguez-Martín, 1998). Since the presence of pathologies has been used to explain aspects of life in past populations (Steinbock, 1976; Ortner and Putschar, 1981; Buikstra and Ubelaker, 1994), accuracy of diagnosis is of paramount importance. Although many good publications describe the visual and radiographic appearance of various diseases, traumata, and other departures from normal bone form (e.g., Brothwell and Sandison, 1967; Steinbock, 1976; Ortner and Putschar, 1981; Zimmerman and Kelley, 1982; Roberts and Manchester, 1995; Aufderheide and Rodríguez-Martín, 1998), very little attention is given to methods for resolving problems in differential diagnosis. When faced with a bone containing abnormalities that could be caused by any number of pathological processes, it is up to the paleopathologist to use his or her best judgment to correctly identify the causative agent.

The lack of methods for untangling conflicting information has often resulted in fairly strong disagreements among researchers. Miller et al. (1996) showed that even persons experienced in the practical aspects of skeletal biology, paleopathology, and medicine often offer incorrect judgments, and/or may be unsure of a precise pathological classification. Using dry bone lesions of known etiology, these authors compiled data on the accuracy of diagnoses made by persons attending workshops in 1990 and 1991 at the annual meetings of the Paleopathology Association. Their results showed that attendees

placed only 42.9% of provided specimens into their correct pathological category (e.g., anomaly, trauma-repair, or metabolic). Even worse, when identifications of specific conditions were attempted (e.g., Paget's disease, rheumatoid arthritis), only 28.6% of specimens were correctly assigned. These rather distressing results underscore the need for techniques that aid in the identification of dry bone pathologies.

In the paleopathological literature, few works describe methods focused specifically on differential diagnosis. Steinbock (1976), while devoting considerable attention to differential diagnosis, did not present a general method for dealing with this problem. Buikstra (1976) used a "key diagram" (also called a decision tree) as a method to aid in the identification of tuberculosis in the spines of Caribou Eskimo. From an assemblage of defects that also could indicate rheumatoid arthritis, blastomycosis, brucellosis, and other diseases, she derived the most likely diagnosis of a pathological specimen by tracing the expression of traits along different paths for different individuals. (Rogers et al. (1987) presented information that could be used in a similar manner when differentiating between various arthropathies.) Another method, typified by Blackman et al. (1991), involves making a "decision table" to relate diseases to their concomitant pathological signs and other relevant data (e.g., demographics). This is constructed by listing signs of pathological conditions (or other data that can be used in a diagnosis) down the
*Correspondence to: Steven N. Byers, Department of Anthropology, University of New Mexico, Albuquerque, NM 87131. Received 24 August 2001; accepted 30 May 2002. DOI 10.1002/ajpa.10164

2003 WILEY-LISS, INC.

S.N. BYERS AND C.A. ROBERTS


left-hand side of a table, while the different pathological entities that contain these signs are arranged across the top as column headings. The expression of each sign or other data for a condition is entered in the appropriate confluence of column and row, and a diagnosis is determined by the column that best describes the skeleton(s) under study.

Unfortunately, these are the only examples from the literature involving methods of paleopathological diagnosis. Thus, this area is in need of concentrated attention if a greater battery of procedures is to be available to the osteological researcher. One source of such methods is the techniques used by clinicians to diagnose pathologies in living persons.

In the medical literature, most solutions to the problem of diagnosis can be subsumed under the overall heading of decision support systems (DSSs). In a good overview, Degoulet and Fieschi (1997) distinguished four methodological approaches underlying these systems: mathematics, expert systems, neural networks, and probability. Mathematical models involve calculating functional relationships between physical and/or physiological parameters of interest (e.g., blood vessel size and flow rate). Because these functions do not calculate categorical variables that would represent pathological conditions, their applicability to diagnosis is limited. Expert systems are computer applications that use artificial intelligence languages to mimic the methods used by the most skillful diagnosticians, while neural networks are computer systems that mimic the structure and function of the human brain. The effectiveness of these two approaches is still being researched. Finally, probabilistic methods use various observed frequencies of pathological conditions in conjunction with their signs and/or symptoms to develop probabilities of these conditions in patients presenting similar clinical pictures. A number of statistical procedures are available in this method, including Bayes' theorem, logistic regression and/or discrimination, and linear discriminant analysis.

TABLE 1. Pathological conditions explored using Bayes' theorem or Bayesian logic

Pathological condition        Reference
Alzheimer's disease           Prince, 1996
Ankylosing spondylitis        Diffey et al., 1985
Appendicitis                  Edwards and Davies, 1984
Asthma                        Perpina et al., 1993
Bone tumors                   Lodwick et al., 1963
Cancer                        Du Boulay et al., 1977; Kahn et al., 1997; Lind and Singer, 1986; Montironi et al., 1994, 1998; Pastor et al., 1997
Colonic diseases              Siles et al., 1997; van de Merwe, 1983
Coronary artery disease       Diamond et al., 1980
Cushing's syndrome            Nugent et al., 1964
Gallbladder disease           Haddawy et al., 1994
Hearing loss                  Phaneuf et al., 1985
Heart disease                 Warner et al., 1961, 1964; Reale et al., 1968
Hypertension                  Blinowska et al., 1991; Elijovich and Laffer, 1992
Jaundice                      Malchow-Moller et al., 1986
Liver and biliary diseases    Begon et al., 1979
Lupus                         Somogyi et al., 1993
Mental retardation            Mani et al., 1997
Multiple sclerosis            Blinowska et al., 1992
Osteomyelitis                 Worbel and Connolly, 1998
Pneumonia                     Aronsky and Haug, 1998
Rheumatological disorders     Bernelot Moens and van der Korst, 1991, 1992
Rheumatoid arthritis          Bernelot Moens et al., 1992
Streptococcal infections      Poses et al., 1986
Thyroid disease               Overall and Williams, 1963
Viral infections              Berg, 1981

Of the four methodological approaches, probability models using Bayesian theory and especially Bayes' theorem are the most widely employed. Table 1 presents a partial list drawn from the medical literature of pathological conditions to which clinicians have applied Bayes' theorem or Bayesian theory to the problem of diagnosis. As can be seen, even an incomplete list is extensive, indicating that the theory has been widely studied by the medical establishment. Furthermore, Bayes' theorem has been shown to be as good as, or better than, both clinical diagnoses and guessing. Table 2 presents results from a sample of medical studies where Bayes' theorem was used to diagnose a number of conditions whose correct identification was obtained from complete testing. As can be seen, in almost all studies, Bayes' theorem predicts the correct diagnoses better (usually much better) than expected from chance alone. For example, when applied to the problem of differentiating appendicitis from other abdominal problems, Edwards and Davies (1984)

found that the theorem was correct 91–93% of the time, which is a significant increase over the 50% probability due to chance alone. In addition, where reported, the theorem performs as well as, or better than, the initial diagnosis by physicians.

Since the purpose of both modern clinical medicine and paleopathology is accurate diagnosis, the information in Tables 1 and 2 indicates that this method should be useful in paleopathology. Therefore, this paper explores how Bayes' theorem could be applied to paleopathology. The issues and problems surrounding its application will be discussed, and an example given. It is hoped that the presentation here will stimulate further research into this promising method.

TABLE 2. Accuracy of Bayes' theorem in selected clinical studies¹

                              Number of     % correct diagnosis
General pathological          specific                    Bayes'
condition                     conditions    Clinicians    theorem    Reference
Congenital heart disease      33            73.2          81.4       Warner et al., 1961
Congenital heart disease      94            73.2          81.8       Reale et al., 1968
Polycythemic states           5             65–76         95         Bishop and Warner, 1969
Intrathoracic radiographs     15            nr            53         Alperovitch and Lellouch, 1974
Thoracic conditions           3             nr            73         McNeil and Sherman, 1978
Gastrointestinal bleeding     2             nr            49–69      Ohmann et al., 1988
Acute abdominal pain          9             76            74         Gammerman and Thatcher, 1991
Liver disease                 40            nr            64–77      Croft and Machol, 1974
Appendicitis                  2             76–89         91–93      Edwards and Davies, 1984

¹ nr, not reported.

BAYES' THEOREM

Bayes' theorem describes how knowledge of prior probabilities can be used to calculate the probability of unknown events in many subject areas, including the sciences, the humanities, and even games of chance. In problems involving the diagnosis of modern disease, the prior probabilities are both the prevalences of pathological conditions and the likelihoods of signs found within those conditions. Prevalences are the frequencies with which pathological conditions (i.e., infectious diseases, nutritional deficiencies, trauma, and any other departure from normal structure and function) occur in populations. Likelihoods are the frequencies with which signs (i.e., visually observable characteristics, symptomatic complaints, the results of medical tests, location of abnormalities within the body, and any other attribute that can be used to identify a pathological condition) appear within the conditions for which the prevalence is known. From these statistics, the simplest form of Bayes' theorem can be used to calculate the probability of the presence or absence of an ailment, given its prevalence and the likelihood of a single sign occurring in people suffering from that condition. In its general form, it can be extended for multiple diseases and multiple signs so that the probability of a pathological condition can be calculated as:

$$P(C_i \mid S_1, S_2, \ldots, S_s) = \frac{P(C_i)\, L(S_1 \mid C_i,\ S_2 \mid C_i,\ \ldots,\ S_s \mid C_i)}{\sum_{i=1}^{C} P(C_i)\, L(S_1 \mid C_i,\ S_2 \mid C_i,\ \ldots,\ S_s \mid C_i)} \tag{1}$$

where:
$P(C_i \mid S_1, S_2, \ldots, S_s)$ = probability of pathological condition $i$, given the presence of signs 1, 2, etc.
$P(C_i)$ = prevalence of pathological condition $i$
$L(S_1 \mid C_i,\ S_2 \mid C_i,\ \ldots,\ S_s \mid C_i)$ = likelihood of sign 1, sign 2, etc. in pathological condition $i$

From equation (1), it can be seen that any number of conditions ($C_c$) or signs ($S_s$) can be used in the calculation of the probability of a disorder. In the medical literature, the number of conditions analyzed has varied from 1 (e.g., Ohmann et al., 1988) to almost 100 (e.g., Reale et al., 1968), with most researchers limiting their study to under 30 (e.g., Alperovitch and Lellouch, 1974; McNeil and Sherman, 1978; Gammerman and Thatcher, 1991). The number of signs used has varied from fewer than 20 (e.g., Sherman, 1978; Bishop and Warner, 1969) to around 30 (e.g., Alperovitch and Lellouch, 1974; Fryback, 1978; Gammerman and Thatcher, 1991) and as high as 50 (e.g., Warner et al., 1961).

Once these values are determined, Bayes' theorem can aid in diagnosis in the following manner. First, those signs found in a patient are identified, and their prior likelihoods are entered into equation (1). This equation is then calculated for each pathological condition for which prevalences exist. The condition with the highest resulting probability would be the most likely diagnosis.

Given this, it seems reasonable that Bayes' theorem has applications to diagnosing osteopathological conditions. Since modern medicine has demonstrated its efficacy and since both paleopathologists and clinicians have similar goals, the theorem should help meet the aforementioned need for methods in paleopathological diagnosis. All that needs to be done is to derive the necessary prior probabilities, and equation (1) could be applied. Prevalences for each pathological condition can be calculated according to the following formula:

$$P(C_i) = \frac{N_{C_i}}{N} \tag{2}$$

where:
$P(C_i)$ = prevalence of pathological condition $i$
$N_{C_i}$ = number of people with pathological condition $i$
$N$ = total number of people in the population

As can be seen, the number of individuals suffering from a condition is simply divided by the total number of individuals from the population on which the statistic is based. This calculation is carried out for each pathological condition.

The computation of likelihoods follows a similar course. This statistic is computed by dividing the number of individuals having both a sign and a pathological condition by the total number of individuals suffering from that pathological condition. Thus:

$$L(S_j \mid C_i) = \frac{N(S_j \mid C_i)}{N_{C_i}} \tag{3}$$

where:
$L(S_j \mid C_i)$ = likelihood of sign $j$ in condition $i$
$N(S_j \mid C_i)$ = number of persons exhibiting sign $j$ in condition $i$
$N_{C_i}$ = number of persons with condition $i$

This calculation is carried out on all signs for a pathological condition. As can be seen, to apply Bayes' theorem to paleopathology requires only that the prevalences and likelihoods be available for input to equation (1). However, a number of issues and problems surrounding their derivation require explanation.

PREVALENCES AND LIKELIHOODS IN PALEOPATHOLOGY

In modern medicine, most researchers calculate prevalences and likelihoods from patients seen in a clinical setting for whom positive diagnoses have been obtained (e.g., Ledley and Lusted, 1959; Warner et al., 1961; Reale et al., 1968; Bishop and Warner, 1969; Croft and Machol, 1974; Alperovitch and Lellouch, 1974; Norusis and Jacquez, 1975; Starmer and Lee, 1976; Fryback, 1978; McNeil and Sherman, 1978; Ohmann et al., 1988; Gammerman and Thatcher, 1991). Since similar information is not available to paleopathologists, these prior probabilities have to be derived elsewhere. One obvious source is health statistics on living populations, both old data collected by industrialized countries from the 1800s and early 1900s as well as new statistics published by modern health agencies. Another source is data that could be gathered from the collections of human skeletons, such as the Terry and Todd Collections and the newer but smaller collections at places such as the University of New Mexico and the University of Tennessee, Knoxville. Before any of these are used to compute prior probabilities, a number of issues must be considered.

The first is, how representative of prehistoric populations are the health statistics and data that are derived from skeletal collections? It seems reasonable that the older the data (e.g., old health statistics, data from the Terry and Todd Collections), the more likely they are to reflect the prehistoric condition because the effects of modern therapies would not mitigate (or eliminate) the defects used in the determination of prevalences and likelihoods. Thus, the people on whom these prior probabilities would be computed are more analogous to past populations than the people represented in newer sources. However, these samples contain biases of unknown effect, e.g., old health statistics are available only from industrialized countries, and the Terry and Todd Collections were derived from people of the midwestern United States in the earlier part of the 1900s. Newer skeletal collections have comparable problems with sample bias, i.e., they are derived from specific regions in the United States. Similarly, although modern health statistics from Third World countries may be more representative of past populations because they derive from people living a traditional lifestyle without access to modern therapies, the impact of diseases when people were first exposed to them may have been different from what it is today (Brothwell, 1967; Ortner and Putschar, 1981). This would bias prevalences and likelihoods calculated from these statistics.

In sum, all sources of prior probabilities have problems with representativeness; however, this simply characterizes the state of the science today. For example, the Terry and Todd Collections provide much of the information used by human skeletal biologists in osteological studies; this makes them the source of standards in physical anthropology, whether or not they are representative of the population at large. Also, as will be discussed below, the effect of this problem is not so critical as to invalidate the method.

The next issue to consider is the accuracy of the diagnoses in these sources. As pointed out by Rothschild et al. (1990) when studying the Terry Collection, modern criteria for the identification of pathological conditions were unavailable in earlier times. Therefore, prevalences and likelihoods derived from early health statistics in industrialized countries, or calculated from older skeletal series, can be questioned. For example, although some support the diagnoses seen in the records for the Terry Collection (David Hunt, personal communication), others have noted that a number of these are "laughable" (Stan Rhine, personal communication).
Despite these possible problems, it seems unlikely that all (or even most) of the diagnoses in early health statistics or skeletal collections are incorrect; thus, their use in computing prior probabilities is supported. Another problem with diagnoses is that health statistics and skeletal collection records list the presence of a disease usually only if it was the cause of death. Diseases suffered during life but not causing death are noted only infrequently (e.g., Terry Collection), if at all (e.g., modern health statistics).

From the above discussion, it is evident that obtaining prevalences and likelihoods that are applicable to past populations is a difficult endeavor, at best. Although the problems seem insurmountable, there is evidence that their resolution may not need to meet all of the criteria demanded by science. In a potentially devastating article, Miettinen and Caro (1994) point out unsolvable epistemological problems with the calculation of these prior probabilities in modern clinical medicine. Their thesis is that the application of Bayes' theorem to the identification of disease is based on untenable premises. At the same time, however, they openly admit that feasible diagnostic probabilities are calculable. This


is encouraging for paleopathologists. If modern clinicians can compute useful probabilities (as seen by the number of pathological conditions in Table 2) despite underlying problems with prevalences and likelihoods, Bayes' theorem almost certainly has application to the identification of osteological manifestations of diseases in past populations. In addition, it has been shown that the theorem is robust, since it yields useful results even when the assumptions on which it is based (e.g., independence of signs; see below) are violated in a limited manner. Thus, despite the difficulties surrounding prior probabilities, Bayes' theorem deserves the attention of paleopathologists.

No matter which source is chosen for prevalences and likelihoods, a number of issues surrounding these statistics must be considered. For prevalences, four of these are of major importance. First, the list of pathological conditions for which prevalences are calculated must be exhaustive, i.e., all possible prevalences must be available for use in Bayes' theorem. In the clinical literature, this problem is solved by identifying the diseases specifically being studied and then creating a "normal" or "other" category for all other conditions (e.g., Warner et al., 1961; McNeil and Sherman, 1978). Another method for dealing with this is to restrict the number of diseases being examined (e.g., Alperovitch and Lellouch, 1974; Gammerman and Thatcher, 1991), so that diagnoses are attempted only on patients suffering from specific pathologies. A second issue is that prevalences are calculated for conditions that are mutually exclusive. Thus, a person cannot have more than one of the C conditions appearing in equation (1).
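To make the role of these two requirements concrete, equations (1) to (3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the condition labels "A", "B", and "other", and all counts are hypothetical, and the joint likelihood is taken as the product of the per-sign likelihoods, which is the independence assumption referred to above. The denominator sums over every condition in the list, which is why that list has to be exhaustive and its categories mutually exclusive.

```python
from math import prod


def prevalence(n_condition, n_total):
    """Equation (2): share of the reference population with the condition."""
    return n_condition / n_total


def likelihood(n_sign_and_condition, n_condition):
    """Equation (3): share of people with the condition who show the sign."""
    return n_sign_and_condition / n_condition


def posterior(prevalences, likelihoods):
    """Equation (1), with signs treated as independent within a condition.

    prevalences: {condition: P(C_i)}
    likelihoods: {condition: [L(S_1|C_i), L(S_2|C_i), ...]} for observed signs
    """
    if set(prevalences) != set(likelihoods):
        raise ValueError("each condition needs a prevalence and likelihoods")
    joint = {c: prevalences[c] * prod(likelihoods[c]) for c in prevalences}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("observed signs are impossible under every condition")
    return {c: value / total for c, value in joint.items()}


# Hypothetical counts, invented purely for illustration.
counts = {"A": 50, "B": 30, "other": 20}      # people per condition
sign_counts = {"A": 40, "B": 6, "other": 2}   # of those, people with the sign
n_total = sum(counts.values())

prev = {c: prevalence(n, n_total) for c, n in counts.items()}
like = {c: [likelihood(sign_counts[c], counts[c])] for c in counts}

result = posterior(prev, like)
most_likely = max(result, key=result.get)  # condition with highest probability
```

The "other" entry plays the part of the catch-all category described above; without it the probabilities would be normalized over an incomplete list of conditions.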
Unfortunately, despite the fact that patients suffering from multiple diseases are part of the clinical world, most workers simply ignore this problem (e.g., Warner et al., 1961; Reale et al., 1968; Bishop and Warner, 1969; Alperovitch and Lellouch, 1974; Gammerman and Thatcher, 1991), while only a few (e.g., Reale et al., 1968) create disease categories which are combinations of individual conditions. A third issue involves the sum of the prevalences. Traditionally, the prevalences of all diseases add up to 1 because the $P(C_i)$ values are computed from a population of individuals suffering from any number of pathological conditions of interest. However, despite the widespread use of this standard, it is not necessary for the proper application of Bayes' theorem. Since this formula describes the relationship between probabilities, any values of prevalence that are considered valid can be used. The only requirement is that the relationship between the frequency of diseases must be maintained (i.e., if one disease is twice as common as another, the prevalence of the first must be twice that of the second). Similarly, if only the rank order of prevalences is known, Bayes' theorem still can be applied (for an example, see Horbar, 1983).

The final complication affecting prevalences is the set of intrinsic and extrinsic population factors that influence the rate at which persons will suffer pathological conditions. For example, the frequency of genetically induced diseases such as sickle-cell anemia and thalassemia, which can have osteological manifestations (Cooley et al., 1927; Sebes and Diggs, 1979; Whipple and Bradford, 1932), would cause prevalences in some areas of the world to be different from those in other areas. Similarly, environmental factors such as climate and weather, which long have been recognized to be correlated with the appearance of diseases and often are seen to be related to seasonality (Yan, 2000; De Garine, 1993; Lukacs and Walimbe, 1998; Steinbock, 1976; Ortner and Putschar, 1981; Aufderheide and Rodríguez-Martín, 1998), also make the calculation of this statistic dependent on geographic location. Even cultural factors such as dietary elements and their combinations (e.g., Larsen, 1995; El-Najjar et al., 1975), or simply the aggregation of people in towns and cities (Brothwell, 1967; Howe, 1997; Schell and Ulijaszek, 1999), would affect the frequency of diseases. Finally, many workers (e.g., Steinbock, 1976; Ortner and Putschar, 1981; Roberts and Manchester, 1995; Aufderheide and Rodríguez-Martín, 1998; Grauer and Stuart-Macadam, 1998; Pollard and Hyatt, 1999) indicate that demographic factors such as sex and age are correlated with pathological conditions, which then would affect the calculation of prevalences.

Likelihoods have three issues to consider. The first is the simple identification of signs. Ortner (1994) delineated two factors that are important when describing a sign of a pathological condition: what is its nature, and where is it located? To these can be added demographic characteristics such as ancestral group, age, and sex (Ortner and Putschar, 1981). Unfortunately, present methods for describing the nature of abnormalities are not yet as well developed as paleopathologists might wish (Ortner, 1991, 1992, 1994).
Generally four characteristics, and their combinations, have been presented as indicators of a pathologically induced change (Ortner and Putschar, 1981; Ortner, 1992, 1994): abnormal bone loss, abnormal bone gain, abnormal shape, and abnormal size. In addition to these, the amount of bone that is affected, either in absolute measure or percent of the total bone, is similarly important (Buikstra and Ubelaker, 1994). Fortunately, the other feature of description (i.e., location of lesions) is fairly easy to define. This involves naming the bone(s) manifesting the condition(s), and the placement of defects within it/them (Ortner, 1992). Buikstra and Ubelaker (1994) thoroughly described this latter factor in a typical long bone as involving side (i.e., right, left), section (i.e., diaphysis, metaphysis, epiphysis), and aspect (i.e., medial vs. lateral, proximal vs. distal, anterior vs. posterior, and circumferential). Similar descriptions are available for other bones of the skeleton, and more detail could be useful.
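One way to picture the descriptive scheme just outlined is as a small record type, so that identical signs can be tallied when likelihoods are computed. The Python sketch below is only an illustration under stated assumptions: the class and field names (`LesionSign`, `Nature`, `Side`, `Section`, `Aspect`) are hypothetical and do not come from any published recording standard, and the three example records are invented.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Nature(Enum):
    BONE_LOSS = "abnormal bone loss"
    BONE_GAIN = "abnormal bone gain"
    SHAPE = "abnormal shape"
    SIZE = "abnormal size"


class Side(Enum):
    RIGHT = "right"
    LEFT = "left"


class Section(Enum):
    DIAPHYSIS = "diaphysis"
    METAPHYSIS = "metaphysis"
    EPIPHYSIS = "epiphysis"


class Aspect(Enum):
    MEDIAL = "medial"
    LATERAL = "lateral"
    PROXIMAL = "proximal"
    DISTAL = "distal"
    ANTERIOR = "anterior"
    POSTERIOR = "posterior"
    CIRCUMFERENTIAL = "circumferential"


@dataclass(frozen=True)
class LesionSign:
    """Hypothetical record: what the abnormality is and where it is located."""
    bone: str
    nature: FrozenSet[Nature]               # one or more of the four indicators
    side: Optional[Side] = None
    section: Optional[Section] = None
    aspect: Optional[Aspect] = None
    percent_affected: Optional[float] = None

    def __post_init__(self):
        if not self.nature:
            raise ValueError("at least one kind of abnormality is required")
        if self.percent_affected is not None and not 0 <= self.percent_affected <= 100:
            raise ValueError("percent_affected must lie between 0 and 100")


# Invented records; frozen dataclasses are hashable, so they can be counted.
gain = frozenset({Nature.BONE_GAIN})
records = [
    LesionSign("rib", gain, aspect=Aspect.ANTERIOR),
    LesionSign("rib", gain, aspect=Aspect.ANTERIOR),
    LesionSign("rib", gain, aspect=Aspect.POSTERIOR),
]
tally = Counter(records)
```

Because identical descriptions compare equal, the tally gives the sign counts that would feed a likelihood calculation once the number of individuals with each condition is known.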


The second issue affecting likelihoods is the assumption of independence, i.e., the occurrence of one sign does not depend on the presence of another. Since many co-occur, unusually high or low probabilities may result from Bayes' theorem calculations, thereby causing misclassification of diseases. In the medical literature, this problem has been approached either by ignoring dependencies (called "simple Bayes" by Gammerman and Thatcher, 1991) or by accounting for these in the calculation of disease probabilities (termed "proper Bayes" by Gammerman and Thatcher, 1991). Some studies show that misclassification goes down with proper Bayes (Norusis and Jacquez, 1975; Russek et al., 1983; Ohmann et al., 1988; Chard, 1989), while others show an increase in misclassification when dependencies are taken into account (Fryback, 1978; Gammerman and Thatcher, 1991). On the whole, however, the data indicate that misclassification rates between simple and proper Bayes are not great when a small number of signs (e.g., fewer than 10) is used (see Norusis and Jacquez, 1975; Fryback, 1978; Russek et al., 1983; Ohmann et al., 1988; Chard, 1989). Thus, although the assumption of independence between signs is theoretically incorrect, Bayes' theorem is robust enough to be of value in diagnosis, even when this assumption is violated in a limited manner.

The last problem involves determining which signs to employ in the calculation of Bayes' theorem. In modern clinical medicine where many signs are considered, some researchers check first for causally related (dependent) signs and use only the one from the group with the highest likelihood (e.g., Warner et al., 1961; Ohmann et al., 1988; Gammerman and Thatcher, 1991). Others only use signs that occur more often in one disease than another (e.g., McNeil and Sherman, 1978; Norusis and Jacquez, 1975).
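The effect of dependent signs on a simple Bayes calculation can be seen in a deliberately extreme toy case. The sketch below is hypothetical throughout: the conditions "X" and "Y", their equal prevalences, and the likelihoods 0.8 and 0.4 are invented, and the second sign is assumed to be perfectly dependent on the first, so it carries no new information.

```python
from math import prod


def simple_bayes(prevalences, likelihoods):
    """Equation (1) with every listed sign treated as independent."""
    joint = {c: prevalences[c] * prod(likelihoods[c]) for c in prevalences}
    total = sum(joint.values())
    return {c: value / total for c, value in joint.items()}


# Hypothetical two-condition problem with equal prevalences.
prevalences = {"X": 0.5, "Y": 0.5}

# One sign, more common in X than in Y.
one_sign = simple_bayes(prevalences, {"X": [0.8], "Y": [0.4]})

# The same evidence entered twice, as if a fully dependent second sign
# were independent of the first.
counted_twice = simple_bayes(prevalences, {"X": [0.8, 0.8], "Y": [0.4, 0.4]})

# P(X) rises from about 0.667 to 0.8 although nothing new was observed.
overstatement = counted_twice["X"] - one_sign["X"]
```

In this toy case a calculation that accounted for the dependency would leave the probability of X at about 0.667, since the joint likelihood of two perfectly dependent signs equals the likelihood of one. Keeping only one sign from a causally related group, as some of the cited researchers do, avoids this kind of double counting.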
This information indicates that only the likelihoods from uncorrelated (independent) signs that have frequencies that are either high (i.e., approaching 1) or low (i.e., approaching 0) should be used in the calculation of the theorem, while those whose frequencies are correlated and nearly equal in all diseases should be avoided. However, as discussed above, this issue can be ignored if fewer than 10 signs are being considered.

PALEOPATHOLOGICAL EXAMPLE

As an example of how Bayes' theorem could be applied to paleopathology, one of the authors (S.N.B.) discovered rib fragments with osteoproliferative lesions (Fig. 1) in a collection of bone excavated by Czajkowski (1934) from Little Woods, a Native American site in southern Louisiana dated 500–250 BC. Since tuberculosis has been hypothesized to occur prehistorically in the Americas (for a list of sources, see Powell, 1992), albeit later in time, it was reasonable to question if these lesions represented that disease. Unfortunately, the ribs were from an ossuary where considerable disarticulation

Fig. 1. Osteoproliferative lesion on internal surface of a rib from prehistoric Louisiana.

and fragmentation of osteological material had occurred. Thus, there were no associated skeletal elements to aid in the diagnosis or any opportunity to look at the distribution pattern of lesions.

Roberts et al. (1994) researched the incidence of similar osteoproliferative lesions in some 380 skeletons from the Terry Collection. Their study revealed lesion frequencies in persons whose deaths were recorded as being due either to tuberculosis (Tub), other pulmonary disease (Oth Pul), or nonpulmonary disease (NonPul). These conditions met the criteria discussed above for the calculation of prevalences, i.e., they represented a mutually exclusive and exhaustive list. In addition, their frequencies added up to 1.0, and the disease of interest was identified (in this case, tuberculosis), while diseases of lesser interest were placed into other categories. Also, the precept by Ortner (1994, p. 6) that the nature of the abnormality must first be specified was satisfied by the description "osteoproliferative lesions." Likelihoods then would be based on frequencies concerning location as well as incidence by age, sex, and other demographic characteristics. In this case, Roberts et al. (1994) indicated that age, side, location in thorax, and affected surface of ribs were useful for distinguishing the three cause-of-death categories. Also, the independence between signs was not important, since there were fewer than 10, making the search for correlation unnecessary.

When calculating Bayes' theorem, the usual method is to place the prior probabilities into a table with diseases arranged as rows, and signs as columns (Warner et al., 1961; Vishnevskiy et al., 1973). In this manner, the likelihoods of signs within diseases are located at the intersection of the disease row with the sign column. Table 3 presents a summary of the data of Roberts et al. (1994) for the three causes of death.
These statistics show that there are no characteristics that occur only in one disease (i.e., likelihood = 1.0) and not in the others. Although this would allow for a more confident diagnosis, this situation is so rare that it should not be expected. Second, age at death is a useful characteristic for determining disease (e.g., persons between ages 26–35 are six times more likely to die of tuberculosis than from a nonpulmonary disease). Although it is



TABLE 3. Prior probabilities for application of Bayes' theorem to osteoproliferative lesions

                                 Age at death
Cause of death     Prevalence    15–25    26–35    36–45    46–55    56+
Tuberculosis       0.421         0.155    0.316    0.226    0.142    0.161
Other pulmonary    0.137         0.100    0.100    0.160    0.220    0.420
Nonpulmonary       0.442         0.049    0.049    0.221    0.221    0.436

                   Side of lesion     Position of lesions within cage
Cause of death     One      Both      Low      Mid      High     All
Tuberculosis       0.548    0.452     0.019    0.382    0.076    0.204
Other pulmonary    0.760    0.240     0.020    0.529    0.039    0.137
Nonpulmonary       0.521    0.479     0.069    0.500    0.050    0.163

                   Position of lesions on ribs
Cause of death     Anterior    Lateral    Posterior    All
Tuberculosis       0.076       0.064      0.057        0.611
Other pulmonary    0.200       0.040      0.080        0.440
Nonpulmonary       0.282       0.043      0.049        0.399

theoretically incorrect to include such frequencies (see Miettinen and Caro, 1994), they can be used in calculations because of the robust nature of the theorem discussed above. Since the age, side, and position within the rib cage cannot be determined for the individual(s) from Little Woods, the only usable likelihood is the affected rib surface. Because these lesions are all located anteriorly (Ant), equation (1) for the three causes of death gives:

$$P(\mathrm{Tub} \mid \mathrm{Ant}) = \frac{.421 \times .076}{(.421 \times .076) + (.137 \times .200) + (.442 \times .282)} = \frac{.032}{.032 + .027 + .125} = \frac{.032}{.184} = .174 \tag{4}$$

$$P(\mathrm{Oth\ Pul} \mid \mathrm{Ant}) = \frac{.137 \times .200}{(.421 \times .076) + (.137 \times .200) + (.442 \times .282)} = \frac{.027}{.032 + .027 + .125} = \frac{.027}{.184} = .147 \tag{5}$$

$$P(\mathrm{NonPul} \mid \mathrm{Ant}) = \frac{.442 \times .282}{(.421 \times .076) + (.137 \times .200) + (.442 \times .282)} = \frac{.125}{.032 + .027 + .125} = \frac{.125}{.184} = .679 \tag{6}$$
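As a check on equations (4) through (6), the single-sign calculation can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the original study: the names PREVALENCE, LIKELIHOOD_ANTERIOR, and posterior are hypothetical, and the numbers are copied from Table 3. Because the code carries full precision, its values for Oth Pul and NonPul (about 0.149 and 0.677) differ slightly in the third decimal from the published 0.147 and 0.679, which appear to result from rounding each product to three decimals before dividing.

```python
# Prevalences (prior probabilities) copied from Table 3.
PREVALENCE = {"Tub": 0.421, "OthPul": 0.137, "NonPul": 0.442}

# Likelihood of anterior rib-surface lesions within each cause of death (Table 3).
LIKELIHOOD_ANTERIOR = {"Tub": 0.076, "OthPul": 0.200, "NonPul": 0.282}


def posterior(prior, likelihood):
    """Return P(disease | sign) for every disease using Bayes' theorem."""
    # Numerator of Bayes' theorem for each disease: prior * likelihood.
    joint = {d: prior[d] * likelihood[d] for d in prior}
    # Shared denominator: total probability of observing the sign.
    total = sum(joint.values())
    return {d: joint[d] / total for d in joint}


post = posterior(PREVALENCE, LIKELIHOOD_ANTERIOR)
for disease, p in post.items():
    print(f"P({disease} | Ant) = {p:.3f}")
```

The ranking of the three causes is the same as in the published equations; only the third decimal of two values shifts.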

These calculations reveal several important points. First, there is a higher probability that the lesions are associated with a nonpulmonary cause of death than with tuberculosis. This is surprising, because Roberts et al. (1994) showed that tuberculosis is a highly probable cause for these osteoproliferations. However, since the lesions are not pathognomonic for this infection (other pulmonary diseases such as pneumonia and neoplastic disease also exhibit them), the result here is reasonable. Second, these calculations double the probability of correct diagnosis over that due to chance. That is, all things being equal, there is only a 0.333 probability that a nonpulmonary disease caused the lesions. Third, the calculations are predicated on the applicability of Terry Collection frequencies to the prehistoric peoples of southern Louisiana. The assumption of representativeness of sample prior probabilities has already been discussed; however, for the purpose of illustration, the sample bias of the Terry Collection is ignored. Fourth, sometimes probabilities are not as persuasive as one would like (e.g., the probability of a nonpulmonary cause of death is only P = 0.679). However, in other cases, stronger probabilities result; for example, if these same lesions were found along the entire surface of both sides of the upper ribs of a skeleton from an individual who was around 30 years old at the time of death, the probabilities from Bayes' theorem given these four signs would be: P(Tub|4 Signs) = 0.914, P(Oth Pul|4 Signs) = 0.018, and P(NonPul|4 Signs) = 0.068, resulting in a better case for tuberculosis being the cause of the lesions. Last, given the paucity of data used in the above calculations, diagnosis based solely on the computed probabilities would be unwarranted. This would be true even in the hypothetical example, where the probabilities were more compelling. The results of Bayes' theorem should never be construed as the only basis for developing a diagnosis; rather, all data should be considered before attempting to identify a pathological condition.

As a way of reinforcing this last point, consider the following analysis. Since the question being explored here is whether or not the osteoproliferative lesions are due to tuberculosis, other signs of this disease should be evident in the skeletal material despite its disarticulation. According to Ortner and Putschar (1981), the primary osteological manifestation of tuberculosis is cavitating lesions in the hemopoietic marrow, with little reactive bone formation. This defect is found most often in the bodies of vertebrae, especially in the lumbar region, where it usually affects two or more bones in the same vertebral column. In the joints, the hip and knee are affected most often, with destruction of the articular surfaces followed by fusion (ankylosis). Among 95 complete or nearly complete vertebral bodies from Little Woods, only one exhibits a lytic lesion that approximates the cavitations expected in tuberculosis, and none of the joints show destruction and ankylosis.
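The hypothetical four-sign case described above (age 26–35, lesions on both sides, high in the rib cage, along the entire rib surface) can be reproduced from Table 3 with a short Python sketch. It assumes, as the text does, that the signs can be treated as independent, so their likelihoods are simply multiplied. The names DISEASES, SIGNS, and posterior_given_signs are hypothetical and chosen only for illustration.

```python
from math import prod

DISEASES = ("Tub", "OthPul", "NonPul")

# Prevalences (prior probabilities) from Table 3, in DISEASES order.
PREVALENCE = (0.421, 0.137, 0.442)

# Likelihoods from Table 3 for the four observed signs, in DISEASES order.
SIGNS = {
    "age 26-35": (0.316, 0.100, 0.049),
    "both sides": (0.452, 0.240, 0.479),
    "high in rib cage": (0.076, 0.039, 0.050),
    "entire rib surface": (0.611, 0.440, 0.399),
}


def posterior_given_signs(prevalence, signs):
    """Posterior for each disease, treating the signs as independent."""
    # Numerator per disease: prior times the product of its sign likelihoods.
    joint = [
        prior * prod(likelihoods[i] for likelihoods in signs.values())
        for i, prior in enumerate(prevalence)
    ]
    total = sum(joint)  # probability of observing all signs together
    return {d: j / total for d, j in zip(DISEASES, joint)}


result = posterior_given_signs(PREVALENCE, SIGNS)
for disease, p in result.items():
    print(f"P({disease} | 4 signs) = {p:.3f}")
```

Rounded to three decimals, the sketch yields 0.914, 0.018, and 0.068, matching the values quoted for the hypothetical case.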
Thus, a diagnosis of tuberculosis for these rib fragments is unwarranted, and the presence of this disease at the site cannot be proven with these data. Unfortunately, because nonpulmonary causes of death were not investigated further by Roberts et al. (1994), nothing additional can be ascertained as to the cause of these lesions.

CONCLUSIONS

To summarize, Bayes' theorem has been used successfully by the medical profession to aid in the diagnosis of disease in a clinical setting. Although physicians have the advantage of a larger range of tests as well as therapies whose effectiveness can be used to support a diagnosis, the method is still applicable to paleopathology. Despite the difficulties in obtaining reasonable prevalences and likelihoods, the derivation of these statistics would make it possible to calculate the probability of any number of pathological conditions for skeletons manifesting osteological abnormalities. Prevalence and likelihood statistics could be derived by a comprehensive study of all pathologies in the Terry and Todd Collections as well as other smaller skeletal series, using a data-gathering instrument that includes, and goes beyond, those characteristics described by Buikstra and Ubelaker (1994), Lovell (2000), or Byers (2002). Additionally, information on prevalences could be obtained from modern health statistics, especially from the Third World, as a means of checking the applicability of skeletal series data to the world at large. Information gathered in this manner could be used to develop a computerized diagnosis system (S.N.B. has a prototype of such a system, which is available by e-mailing him at j708@unm.edu), which would calculate probabilities that could support or refute diagnoses developed using more traditional methods. This would provide more information for the diagnostic process, thereby increasing the accuracy with which paleopathological conditions are identified.

LITERATURE CITED

Alperovitch A, Lellouch J. 1974. Methods for aiding medical decision: application to diagnosis of round intra-thoracic x-ray picture. Comput Biomed Res 7:127–141.
Aronsky D, Haug PJ. 1998. Diagnosing community-acquired pneumonia with a Bayesian network. Proc AMIA Symp 1998:632–636.
Aufderheide AC, Rodríguez-Martín C. 1998. The Cambridge encyclopedia of human paleopathology. Cambridge: Cambridge University Press.
Begon F, Lockhart AM, Metreau JM, Dhumeaux DA. 1979. Computer-aided system for the diagnosis of hepato-biliary diseases. A comparison with the performance of physicians. Med Inf (Lond) 4:35–42.
Berg AO. 1981. Case-control diagnosis and Bayesian inference in common viral infections. J Fam Pract 12:1017–1021.
Bernelot Moens HJ, van der Korst JK. 1991. Comparison of rheumatological diagnoses by a Bayesian program and by physicians. Methods Inf Med 30:187–193.
Bernelot Moens HJ, van der Korst JK. 1992. Development and validation of a computer program using Bayes' theorem to support diagnosis of rheumatic disorders. Ann Rheum Dis 51:266–271.
Bernelot Moens HJ, Hirshberg AJ, Claessens AA. 1992. Data-source effects on the sensitivities and specificities of clinical features in the diagnosis of rheumatoid arthritis: the relevance of multiple sources of knowledge for a decision-support system. Med Decis Making 12:250–258.
Bishop CR, Warner HR. 1969. A mathematical approach to medical diagnosis: application to polycythemic states utilizing clinical findings with values continuously distributed. Comput Biomed Res 2:486–493.
Blackman J, Allison MJ, Aufderheide, Oldroyd N, Steinbock RT. 1991. Secondary hyperparathyroidism in an Andean mummy. In: Ortner DJ, Aufderheide AC, editors. Human paleopathology: current syntheses and future options. Washington, DC: Smithsonian Institution Press. p 291–296.
Blinowska A, Chatellier G, Bernier J, Lavril M. 1991. Bayesian statistics as applied to hypertension diagnosis. IEEE Trans Biomed Eng 38:699–706.
Blinowska A, Verroust J, Malapert D. 1992. Bayesian statistics as applied to multiple sclerosis diagnosis by evoked potentials. Electromyogr Clin Neurophysiol 32:17–25.
Brothwell DR. 1967. The bio-cultural background to disease. In: Brothwell DR, Sandison AT, editors. Diseases in antiquity. Springfield, IL: C.C. Thomas. p 56–68.
Brothwell DR. 1981. Digging up bones, 3rd ed. Ithaca, NY: Cornell University Press.
Brothwell DR, Sandison AT. 1967. Diseases in antiquity. Springfield, IL: C.C. Thomas.
Buikstra JE. 1976. The Caribou Eskimo: general and specific disease. Am J Phys Anthropol 45:351–368.
Buikstra JE, Ubelaker DH. 1994. Standards for data collection from human skeletal remains. Arkansas Archeological Survey research series no. 44. Fayetteville, AR: Arkansas Archeological Survey.
Byers SN. 2002. A model for the diagnostic process in paleopathology. Paleopathol Newslett 117:11–18.
Chard T. 1989. The effect of dependence on the performance of Bayes' theorem: an evaluation using computer simulation. Comput Methods Programs Biomed 29:15–19.
Cooley TB, Witner ER, Lee P. 1927. Anemia in children. Am J Dis Child 34:347–363.
Croft DJ, Machol RE. 1974. Mathematical methods in medical diagnosis. Ann Biomed Eng 2:69–89.
Czajkowski JR. 1934. Preliminary report of archeological excavations in Orleans Parish. Louisiana Conservation Rev 4:12–18.
De Garine I. 1993. Culture, seasons and stress in two traditional African cultures (Massa and Mussey). In: Ulijaszek SJ, Strickland SS, editors. Seasonality and human ecology. Cambridge: University Press. p 184–201.
Degoulet P, Fieschi M. 1997. Introduction to clinical informatics. New York: Springer.
Diamond GA, Forrester JS, Hirsch M, Staniloff HM, Vas R, Berman DS, Swan HJ. 1980. Application of conditional probability analysis to the clinical diagnosis of coronary artery disease. J Clin Invest 65:1210–1221.
Diffey BL, Pal B, Gibson CJ, Clayton CB, Griffiths ID. 1985. Application of Bayes' theorem to the diagnosis of ankylosing spondylitis from radioisotope bone scans. Ann Rheum Dis 44:667–670.
Du Boulay GH, Teather D, Harling D, Clarke G. 1977. Improvement in the computer-assisted diagnosis of cerebral tumours. Br J Radiol 50:849–854.
Edwards FH, Davies RS. 1984. Use of a Bayesian algorithm in the computer-assisted diagnosis of appendicitis. Surg Gynecol Obstet 158:219–222.
Elijovich F, Laffer CL. 1992. Bayesian analysis supports use of ambulatory blood pressure monitors for screening. Hypertension [Suppl] 19:268–272.
El-Najjar MY, Lozoff B, Ryan DJ. 1975. The paleoepidemiology of porotic hyperostosis in the American Southwest: radiological and ecological considerations. AJR Radium Ther Nucl Med 125:918–924.
Fryback DG. 1978. Bayes' theorem and conditional nonindependence of data in medical diagnosis. Comput Biomed Res 11:423–434.
Gammerman A, Thatcher AR. 1991. Bayesian diagnostic probabilities without assuming independence of symptoms. Methods Inf Med 30:15–22.
Grauer AL, Stuart-Macadam P. 1998. Sex and gender in paleopathological perspective. Cambridge: Cambridge University Press.
Haddawy P, Kahn CE Jr, Butarbutar M. 1994. A Bayesian network model for radiological diagnosis and procedure selection: work-up of suspected gallbladder disease. Med Phys 21:1185–1192.
Horbar JD. 1983. Revising ranked probabilities: a Bayesian approach to incomplete knowledge. Comput Biomed Res 16:367–377.
Howe GM. 1997. People, environment, disease and death. Cardiff: University of Wales Press.
Kahn CE Jr, Roberts LM, Wang K, Jenks D, Haddawy P. 1997. Construction of a Bayesian network for mammographic diagnosis of breast cancer. Comput Biol Med 27:19–29.



Larsen CS. 1995. Biological changes in human populations with agriculture. Annu Rev Anthropol 24:185–213.
Ledley RS, Lusted LB. 1959. Reasoning functions of medical diagnosis. Science 130:9–21.
Lind SE, Singer DE. 1986. Diagnosing liver metastases: a Bayesian analysis. J Clin Oncol 4:379–388.
Lodwick GS, Haun DL, Smith WE, Keller RF, Roberston ED. 1963. Computer diagnosis of primary bone tumors: a preliminary report. Radiology 80:273.
Lovell NC. 2000. Paleopathological description and diagnosis. In: Katzenberg MA, Saunders SR, editors. Biological anthropology of the human skeleton. New York: Wiley-Liss. p 217–248.
Lukacs JR, Walimbe SR. 1998. Physiological stress in prehistoric India: new data on localized hypoplasia of primary canines linked to climate and subsistence. J Archaeol Sci 25:571–585.
Malchow-Moller A, Thomsen C, Matzen P, Mindeholm L, Bjerregaard B, Bryant S, Hilden J, Holst-Christensen J, Johansen TS, Juhl E. 1986. Computer diagnosis in jaundice. Bayes' rule founded on 1002 consecutive cases. J Hepatol 3:154–163.
Mani S, McDermott S, Valtorta M. 1997. MENTOR: a Bayesian model for prediction of mental retardation in newborns. Res Dev Disabil 18:303–318.
McNeil BJ, Sherman H. 1978. Example: Bayesian calculations for the determination of the etiology of pleuritic chest pain in young adults in a teaching hospital. Part B. Comput Biomed Res 11:187–194.
Miettinen OS, Caro JJ. 1994. Foundations of medical diagnosis: what actually are the parameters involved in Bayes' theorem? Stat Med 13:201–209.
Miller E, Ragsdale BD, Ortner DJ. 1996. Accuracy in dry bone diagnosis: a comment on paleopathological methods. Int J Osteoarchaeol 6:221–229.
Montironi R, Bartels PH, Thompson D, Scarpelli M, Hamilton PW. 1994. Prostatic intraepithelial neoplasia. Development of a Bayesian belief network for diagnosis and grading. Anal Quant Cytol Histol 16:101–112.
Montironi R, Mazzucchelli R, Santinelli A, Hamilton PW, Thompson D, Bartels PH. 1998. Case diagnosis as positive identification in prostatic neoplasia. Anal Quant Cytol Histol 20:424–436.
Norusis MJ, Jacquez JA. 1975. Diagnosis. I. Symptom nonindependence in mathematical models for diagnosis. Comput Biomed Res 8:156–172.
Nugent CA, Warner HR, Dunn JT, Tyler FH. 1964. Probability theory in diagnosis of Cushing's syndrome. J Clin Endocrinol Metab 24:621–623.
Ohmann C, Yang Q, Kunneke M, Stoltzing H, Thon K, Lorenz W. 1988. Bayes' theorem and conditional dependence of symptoms: different models applied to data of upper gastrointestinal bleeding. Methods Inf Med 27:73–83.
Ortner DJ. 1991. Theoretical and methodological issues in paleopathology. In: Ortner DJ, Aufderheide AC, editors. Human paleopathology: current syntheses and future options. Washington, DC: Smithsonian Institution Press. p 5–11.
Ortner DJ. 1992. Skeletal paleopathology: probabilities, possibilities, impossibilities. In: Verano JW, Ubelaker DH, editors. Disease and demography in the Americas. Washington, DC: Smithsonian Institution Press. p 5–13.
Ortner DJ. 1994. Descriptive methodology in paleopathology. In: Owsley DW, Jantz RL, editors. Skeletal biology in the Great Plains. Washington, DC: Smithsonian Institution Press. p 73–80.
Ortner DJ, Putschar WGJ. 1981. Identification of pathological conditions in human skeletal remains. Smithsonian contributions to anthropology, no. 28. Washington, DC: Smithsonian Institution Press.
Overall JE, Williams CM. 1963. Conditional probability program for diagnosis of thyroid function. JAMA 183:307.
Pastor A, Menendez R, Cremades MJ, Pastor V, Llopis R, Aznar J. 1997. Diagnostic value of SCC, CEA and CYFRA 21.1 in lung cancer: a Bayesian analysis. Eur Respir J 10:603–609.
Perpina M, Pellicer C, de Diego A, Compte L, Macian V. 1993. Diagnostic value of the bronchial provocation test with methacholine in asthma. A Bayesian analysis approach. Chest 104:149–154.

Phaneuf R, Hetu R, Hanley JA. 1985. A Bayesian approach for predicting judged hearing disability. Am J Ind Med 7:343–352.
Pollard T, Hyatt SB. 1999. Sex, gender and health. Cambridge: Cambridge University Press.
Poses RM, Cebul RD, Collins M, Fager SS. 1986. The importance of disease prevalence in transporting clinical prediction rules. The case of streptococcal pharyngitis. Ann Intern Med 105:586–591.
Powell ML. 1992. Health and disease in the late prehistoric southeast. In: Verano JW, Ubelaker DH, editors. Disease and demography in the Americas. Washington, DC: Smithsonian Institution Press. p 41–53.
Prince MJ. 1996. Predicting the onset of Alzheimer's disease using Bayes' theorem. Am J Epidemiol 143:301–308.
Reale A, Maccacaro GA, Rocca E, D'Intino S, Gioffre PA, Vestri A, Motolese M. 1968. Computer diagnosis of congenital heart disease. Comput Biomed Res 1:533–549.
Roberts C, Manchester K. 1995. The archaeology of disease. Gloucester, UK: Sutton Publishing.
Roberts C, Lucy D, Manchester K. 1994. Inflammatory lesions of the ribs: an analysis of the Terry Collection. Am J Phys Anthropol 95:169–182.
Rogers J, Waldron T, Dieppe P, Watt I. 1987. Arthropathies in palaeopathology: the basis of classification according to most probable cause. J Archaeol Sci 14:179–193.
Rothschild BM, Woods RJ, Ortel W. 1990. Rheumatoid arthritis in the buff: erosive arthritis in defleshed bones. Am J Phys Anthropol 82:441–450.
Russek E, Kronmal RA, Fisher LD. 1983. The effect of assuming independence in applying Bayes' theorem to risk estimation and classification in diagnosis. Comput Biomed Res 16:537–552.
Schell LM, Ulijaszek SJ, editors. 1999. Urbanism, health and human biology in industrialized countries. Cambridge: Cambridge University Press.
Sebes JI, Diggs LW. 1979. Radiographic changes in the skull in sickle-cell anemia. AJR 132:373–377.
Sherman H. 1978. A pocket diagnostic calculator program for computing Bayesian probabilities for nine diseases with sixteen symptoms. Comput Biomed Res 11:177–186.
Siles S, Garrigues V, Ponce J, Galvez C, Berenguer J. 1997. Analysis of the predictive value of clinical data in patients with suspected colonic disease. Rev Esp Enferm Apar Dig 89:445–456.
Somogyi L, Cikes N, Marusic M. 1993. Evaluation of criteria contributions for the classification of systemic lupus erythematosus. Scand J Rheumatol 22:58–62.
Starmer CF, Lee KL. 1976. A mathematical approach to medical decisions: application of Bayes' rule to a mixture of continuous and discrete clinical variables. Comput Biomed Res 9:531–541.
Steinbock RT. 1976. Paleopathological diagnosis and interpretation. Springfield, IL: C.C. Thomas.
Van de Merwe JP. 1983. Differentiation between two diseases using a programmable hand-held calculator and Bayes' theorem: application to Crohn's disease and ulcerative colitis. Comput Biol Med 13:317–332.
Vishnevskiy AA, Artobolevskiy II, Bykowskiy ML. 1973. Principles of a solution to the problem of diagnosis using digital computers. In: Vishnevskiy AA, editor. Machine diagnosis and information retrieval in medicine in the USSR. Bethesda, MD: U.S. Department of Health, Education, and Welfare. p 1–13.
Waldron T. 1994. Counting the dead. The epidemiology of skeletal populations. Chichester, UK: John Wiley & Sons.
Warner HR, Toronto AF, Veasey LG, Stephenson R. 1961. A mathematical approach to medical diagnosis. JAMA 177:177–183.
Warner HR, Toronto AF, Veasey LG. 1964. Experience with Bayes' theorem for computer diagnosis of congenital heart disease. Ann NY Acad Sci 115:558–567.
Whipple GH, Bradford WL. 1932. Racial or familial anemia of children. Am J Phys Anthropol 44:336–365.
Wrobel JS, Connolly JE. 1998. Making the diagnosis of osteomyelitis. The role of prevalence. J Am Podiatr Med Assoc 88:337–343.
Yan YY. 2000. The influence of weather on human mortality in Hong Kong. Soc Sci Med 50:419–427.
Zimmerman MR, Kelley MA. 1982. Atlas of human paleopathology. New York: Praeger.
