
Source: Journal of Paleontology, Vol. 53, No. 5 (Sep., 1979), pp. 1213-1227

Published by: SEPM Society for Sedimentary Geology

Stable URL: http://www.jstor.org/stable/1304099 .


JOURNAL OF PALEONTOLOGY, V. 53, NO. 5, P. 1213-1227, 9 TEXT-FIGS., SEPTEMBER 1979

PALEONTOLOGY

Department of Geology, Field Museum of Natural History, Chicago, Illinois 60605 and

Museum of Paleontology, University of Michigan, Ann Arbor, Michigan 48109

ABSTRACT: A probabilistic index of faunal similarity is proposed which compares the number of taxa common to two faunas with the number that would be expected to be in common if the taxa were distributed randomly. Departures of observed from expected numbers in common express the level of similarity or dissimilarity. The frequency of taxa in the whole data set is used to adjust for the differing probability of occurrence of taxa (cosmopolitan versus endemic). The new index can be used to determine whether similarities or dissimilarities between faunas are statistically significant.

The index is tested with 1) modern biogeography of echinoids, 2) environmental distribution of modern foraminifera in Santa Monica Bay, and 3) Ordovician biogeography of nautiloids. In each case, the proposed index is more effective than traditional indexes of faunal similarity (Simpson, Jaccard, and Dice coefficients) in addition to the advantage of making possible rigorous assessment of statistical confidence. The index should also be useful in a biostratigraphic context. The computer program used for calculating the index is available from the authors.

A COMMON PROBLEM throughout paleoecology, biostratigraphy, and paleobiogeography is that of comparing faunal lists to evaluate their similarities and differences. If two lists have no taxa in common, it can be assumed that something was different. The possible causes vary from ecological differences (marine vs. fresh water; shallow vs. deep water, etc.) to temporal differences (complete evolutionary turnover) to biogeographic differences (provinciality, separation by geographic barriers, etc.). If the two lists are identical, on the other hand, ecological, temporal, and biogeographic unity is assumed (at some scale, at least). The problems arise when some but not all taxa are shared. Ordinarily, two rather different approaches are taken in this intermediate situation: 1) assessment of similarity on the basis of well informed intuition, and 2) computation of a numerical index of similarity.

The intuitive approach is not without value. The experienced practitioner recognizes that some taxa are more common than others and gives less weight to their joint occurrence in two or more assemblages. Many paleobiogeographers, for example, discount cosmopolitan forms. In biostratigraphy, the joint occurrence of long-ranging taxa is given much less weight than that of short-ranging taxa. Judgment is [...] assemblages being compared. For example, if two assemblages have 15 taxa each, the presence of 10 taxa in common clearly indicates a greater fundamental similarity than if the two assemblages had had 100 taxa each. Questions and ambiguities occur, however, and these have given rise to attempts to quantify the process of comparison.

Many quantitative measures of faunal similarity have been proposed and several are in common use by paleontologists. Perhaps the one most widely applied is the Simpson Coefficient (Simpson, 1943, 1947) which may be defined as follows:

$$ S = 100k/B, $$

where: k = the number of taxa common to two assemblages A and B,
B = the total taxa found in the smaller assemblage (B < A).

The Simpson Coefficient varies from zero to 100 and thus implies percentage similarity. Although it is an easily managed index, authors have pointed out (most recently, Henderson and Heron, 1977), that the Simpson and comparable indexes have important shortcomings. Text-figure 1 illustrates one of the prime difficulties. A series of Venn diagrams shows several cases which yield identical values of 'S' yet which are intuitively very different. In each case, the area of the largest circle

1 Present address: Dept. Geology, Univ. Texas, Arlington 76019.

Copyright © 1979, The Society of Economic Paleontologists and Mineralogists. 0022-3360/79/0053-1213$03.00


1214 DAVID M. RAUP AND REX E. CRICK

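[Text-figure 1: four Venn diagrams, cases 1-4. Each shows a pool N containing assemblages A and B with overlap k, annotated with three coefficients whose printed denominators are B (Simpson), A + B - k (Jaccard), and A + B (Dice). Legible values: Simpson 20 in all four cases; Jaccard .11, .02, .11, .11; Dice .20, .04, .20, .20.]

TEXT-FIG. 1-Venn diagrams showing hypothetical cases wherein two faunal assemblages (A and B) are drawn from a pool of taxa (N). The number of taxa (k) common to A and B is indicated by the overlap of the two smaller circles. For each case, the Simpson, Jaccard, and Dice similarity measures have been calculated.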
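The three traditional coefficients named in the caption can be sketched in a few lines. This is only an illustration: the Jaccard and Dice formulas are the standard textbook definitions, and the assemblage sizes below are hypothetical values chosen to show two pairs that share the same Simpson value while differing in the other two measures; they are not quoted from the source.

```python
def simpson(a, b, k):
    """Simpson coefficient S = 100k/B, where B is the smaller assemblage size."""
    return 100.0 * k / min(a, b)


def jaccard(a, b, k):
    """Jaccard coefficient: shared taxa over all taxa present in either assemblage."""
    return k / (a + b - k)


def dice(a, b, k):
    """Dice coefficient: twice the shared taxa over the sum of assemblage sizes."""
    return 2.0 * k / (a + b)


# Hypothetical (A, B, k) triples, assumed for illustration only.
cases = {
    "equal sizes": (5, 5, 1),
    "unequal sizes": (40, 5, 1),
}

for name, (a, b, k) in cases.items():
    print(f"{name}: Simpson={simpson(a, b, k):.0f} "
          f"Jaccard={jaccard(a, b, k):.2f} Dice={dice(a, b, k):.2f}")
```

Both hypothetical pairs give the same Simpson value because that coefficient looks only at the smaller assemblage, which is the shortcoming the figure is meant to expose.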

indicates the total pool (N) of taxa which could occur in an assemblage and the two smaller circles (A and B) are assemblages drawn from this pool. The overlap zone (k) between the smaller circles represents the number of taxa found in both assemblages. In all cases, the Simpson Coefficient is 20 yet one's intuition suggests that the four cases do not indicate the same similarity in the sense of process (meaning ecological, temporal or geographic similarity).

In recognition of these and other difficulties, many authors have proposed alternate schemes. Two of the coefficients most commonly applied in paleontology are the Jaccard and Dice. Cheetham and Hazel (1969) have provided an excellent comparative review of these and about 20 other similarity coefficients. Other critical reviews of selected coefficients exist in the literature (see, for example, papers by Henderson and Heron, 1977, and Simberloff, 1978). Most of these authors have emphasized that it is important to have valid measures of faunal similarity because of the increasing use of multivariate statistical analysis of large masses of distributional (presence/absence) data. The multivariate analysis can be only as good as the matrix of similarity values that forms its input!

At the risk of oversimplification, one can


FAUNAL SIMILARITY 1215

argue that most existing similarity coefficients suffer from two main problems. First, they have not been derived in a mathematically rigorous way: that is, they have been 'thought up' rather than built on sound mathematical principles. Their validity has all too often been tested by whether they seem to work in practice. Second, they have not been tied to clearly defined null hypotheses; as a result, statistically meaningful comparisons between values of a coefficient are impossible. It has been impossible to say whether two assemblages are similar (or dissimilar) at the 95% level of confidence, for example. The discussion by Simberloff (1978) includes a particularly good treatment of this point.

Henderson and Heron (1977) recognized and discussed many of the problems just described and made an attempt to produce a rigorous and statistically valid similarity measure. The present effort takes a slightly different tack in the hope of developing a yet more robust approach to the similarity question. Our approach is similar to that of Simberloff (1978), but our objectives and the resulting technique are substantially different.

THE APPROPRIATE NULL HYPOTHESIS

Suppose that taxa are sprinkled randomly in space and time and that species lists are made up from the taxa that happen, by chance, to fall in certain areas and in certain stratigraphic intervals. Most of the species lists will differ from one another just because of the vagaries of sampling but they will have an average similarity which is predictable from the numbers of taxa, areas, and stratigraphic intervals involved. As will be shown below, it is possible under the random sprinkling hypothesis to predict how many species should be expected to be shared ('k') and the expected variation in this number. The expected 'k' and its probable variation constitute the appropriate null hypothesis for assessing faunal similarity.

In the real world, the distribution of taxa in space and time is generally non-random. Trilobites are confined to a small portion of geologic history (the Paleozoic), echinoderms are confined to marine environments, and so on. When we use the temporal confinement of trilobites or the ecological confinement of echinoderms to make other interpretations, we are tacitly rejecting the null hypothesis of random sprinkling. In the trilobite and echinoderm examples the departure from chance expectations is so obvious that sophisticated statistical testing is unnecessary. But most cases of interest are more subtle and rigorous treatment is obligatory, which is to say that one cannot rely on intuition alone.

In actual cases, it makes no difference whether the null hypothesis of randomness is rejectable 10% or 90% of the time. We wish to use it only as a standard of comparison and as a means of assessing the probability that two assemblages had different ecologic, temporal, or geographic settings. When dealing with assemblages from radically different facies or from different continents, one would expect to be able to reject the null hypothesis most of the time. On the other hand, when dealing with assemblages from the same formation in a local area, one would expect not to reject the null hypothesis and to conclude that the compositional differences between assemblages are just the result of chance differences in sampling.

We propose to use a comparison between the observed number of taxa common to two assemblages (or faunas) and the probability distribution of the expected number of common taxa as a measure of the similarity of the two assemblages. Assemblages which are more similar than predicted by the null hypothesis will be interpreted as indicating a positive bias in the make-up of the assemblages. That is, ecologic, temporal, or geographic factors must have limited the taxa available for those assemblages. Conversely, assemblages less similar than predicted will be interpreted as indicating a negative bias.

Simberloff (1978) had a quite different objective. Working with modern species distributions in the Galapagos Islands, he was asking whether the total distribution represents a significant departure from the null hypothesis of random sprinkling. That is, he was asking whether the array of species lists is consistent with the proposition that differences in composition result solely from sampling error (in dispersal of species) and not from real biogeographic effects.

METHODS

Consider the Venn diagrams in Text-figure 1 and assume, as before, that the areas of the circles correspond to the numbers of species in


[Text-figure 2: four panels, cases 1-4. Horizontal axis: $k_{exp}$ (0 to 5 in panels 1 and 2, 0 to 20 in panels 3 and 4). Vertical axis: probability. The observed value $k_{obs}$ is marked in each panel.]

TEXT-FIG. 2-Curves showing solutions to equation (4) for all possible values of 'k' in the four cases illustrated in Text-figure 1. The point marked $k_{obs}$ is the number of taxa observed in common in Text-figure 1. Note that the relationships are not continuous functions: only integer values of 'k' are possible.

the total pool and two assemblages drawn from that pool. Assume further that all species in the pool have an equal chance of being chosen for each of the assemblages. This is a simplistic assumption because it is well known that species vary in their abundance so that some have a much higher probability of occurring in any given assemblage than others. But this is a convenient scenario with which to introduce a methodology and is the one used by Henderson and Heron (1977) and by Simberloff (1978).

The situation just presented can also be thought of in the classic context of an 'urn problem.' Assume that a large urn contains many balls, each numbered differently, and that this collection of balls constitutes the pool of species (N). Now draw 'A' balls at random from the urn to define assemblage A. Then replace them and draw 'B' balls to form assemblage B. The question is: how many of the same balls will be found in both A and B?

This problem was solved by Henderson and Heron (1977) by a logical series of steps culminating in their equation (4) and in a slightly different form by Simberloff (1978), (Null Hypothesis I). The solution presented here is more straightforward and more flexible than either of the previous efforts.

The total number of different 'A' assemblages that can be drawn from the pool (N) is the number of combinations of N things taken A at a time, or

$$ {}_{N}C_{A} = \frac{N!}{(N - A)!\,A!} \tag{1} $$
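As a brief worked instance of equation (1), take a hypothetical pool of N = 100 taxa and an assemblage of A = 5 taxa (values assumed here for illustration, not taken from the source):

$$ {}_{100}C_{5} = \frac{100!}{95!\,5!} = \frac{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 75{,}287{,}520 $$

Even a small assemblage drawn from a modest pool can therefore be composed in tens of millions of ways, which is why the probabilities that follow are built from ratios of such counts rather than by enumeration.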


TABLE 1-Probabilities calculated from equation (4) for the four cases shown in Text-figure 2: '$k_{obs}$' is the number of species observed to be in common and '$k_{exp}$' is the number expected to be in common on the assumption of random sprinkling of species.

                                                   Case 1   Case 2   Case 3   Case 4
Probability that k_exp is less than k_obs:          .77      .07      .39      .005
Probability that k_exp equals k_obs:                .21      .26      .24      .012
Probability that k_exp exceeds k_obs:               .02      .67      .37      .983
Total:                                             1.00     1.00     1.00     1.000

[Likewise, the number of different 'B'] assemblages is $_{N}C_{B}$. Therefore, the total number of different pairs of assemblages is the product:

$$ {}_{N}C_{A} \cdot {}_{N}C_{B}. $$

The probability that some particular number of species ('k') will occur in both assemblages may be expressed as follows:

$$ \text{Prob}[k \text{ species in common}] = \frac{\text{total ways of obtaining } k \text{ species in common}}{\text{total number of different assemblage pairs}} \tag{2} $$

The denominator in this expression is the product developed above. The numerator can be constructed by inspection of the Venn diagrams in Text-figure 1, as follows: all species in the 'k' area must also belong to 'B' which, in turn, must belong to 'N.' Therefore, all possible compositions of the 'k' area can be expressed by:

$$ {}_{N}C_{B} \cdot {}_{B}C_{k}. $$

There remain only the species that belong to 'A' but not to 'k.' All possible compositions of this group can be expressed by:

$$ {}_{(N-B)}C_{(A-k)}. $$

The numerator of the probability expression is thus the product:

$$ {}_{N}C_{B} \cdot {}_{B}C_{k} \cdot {}_{(N-B)}C_{(A-k)}. $$

The terms for $_{N}C_{B}$ in numerator and denominator cancel and we are left with:

$$ \text{Prob}[k \text{ species in common}] = \frac{{}_{B}C_{k} \cdot {}_{(N-B)}C_{(A-k)}}{{}_{N}C_{A}} $$

When this is evaluated using factorials as in equation (1), the result is:

$$ \text{Prob}[k \text{ species in common}] = \frac{A!\,B!\,(N - A)!\,(N - B)!}{N!\,k!\,(A - k)!\,[(N - B) - (A - k)]!\,(B - k)!} \tag{4} $$

For any set of A, B, and N values, this equation can be solved for the series of 'k' values ranging from 0 to B so that a precise probability distribution can be developed for variation in the expected number of species in common. This is illustrated in Text-figure 2 for the four cases shown in Text-figure 1. The equation has also been applied to the cases treated by Henderson and Heron (1977, fig. 3) and the results are comparable though not identical to theirs.

In Text-figure 2, the values of 'k' observed in Text-figure 1 are indicated by '$k_{obs}$' and the several theoretically possible values of 'k' by '$k_{exp}$.' The ordinate is the probability of each particular $k_{exp}$ occurring by chance. Note that the distributions are of markedly different shapes depending on the size of the pool and the sizes of the two assemblages. The distributions are often highly skewed, in contrast to the examples shown by Henderson and Heron. In fact, skewing is probably typical of real world data because N is generally much larger than A or B.

The numerical data from Text-figure 2 are summarized in Table 1. For Case 1, there is only a .02 probability of finding more than the observed number of species in common following the hypothesis of random sprinkling. For Case 4, on the other hand, the observed value will be exceeded by chance more than 98% of the time. These relations can be used to define a similarity measure, as follows:

INDEX OF SIMILARITY = 1 minus the probability that $k_{exp}$ will be greater than $k_{obs}$.
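Equation (4) and the INDEX OF SIMILARITY for the equal-probability case can be sketched as follows. This is a minimal illustration rather than the authors' program: the function names are invented here, and the values N = 100, A = 5, B = 5, k_obs = 1 are assumptions chosen for the example (the source does not state the sizes behind its four cases, although these values give probabilities close to those listed for Case 1 in Table 1).

```python
from math import comb


def prob_common(n, a, b, k):
    """Equation (4) in combination form: probability that exactly k taxa are
    shared by assemblages of sizes a and b drawn at random from a pool of n."""
    return comb(b, k) * comb(n - b, a - k) / comb(n, a)


def similarity_index(n, a, b, k_obs):
    """INDEX OF SIMILARITY = P(k_exp <= k_obs) = 1 - P(k_exp > k_obs)."""
    return sum(prob_common(n, a, b, k) for k in range(k_obs + 1))


# Assumed illustrative values (not stated in the source).
n, a, b, k_obs = 100, 5, 5, 1

# Full distribution over all possible numbers of shared taxa.
dist = [prob_common(n, a, b, k) for k in range(min(a, b) + 1)]

p_less = sum(dist[:k_obs])
p_equal = dist[k_obs]
p_more = sum(dist[k_obs + 1:])

print(f"P(k_exp < k_obs) = {p_less:.3f}")
print(f"P(k_exp = k_obs) = {p_equal:.3f}")
print(f"P(k_exp > k_obs) = {p_more:.3f}")
print(f"INDEX OF SIMILARITY = {similarity_index(n, a, b, k_obs):.3f}")
```

Working from `math.comb` rather than raw factorials keeps the arithmetic in exact integers until the final division, which avoids overflow for larger pools.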


INDEX OF SIMILARITY = the probability that $k_{exp}$ will be less than or equal to $k_{obs}$.

The four values of the INDEX are thus .98, .33, .63, and .02, respectively.

At this point, we can ask if any of the above figures are statistically significant in the sense that the null hypothesis of random sprinkling can be rejected. Questions concerning statistical significance must be framed carefully. It is tempting to say that the INDEX OF SIMILARITY in Case 1 allows for the rejection of the null hypothesis with 98% confidence. But note that the observed number in common in Case 1 is expected to occur 21% of the time under the null hypothesis so that the existence of this number in common is not startling. In Case 4, on the other hand, a 'k' greater than that observed is expected more than 98% of the time and we can conclude that the null hypothesis can be rejected. We can go further and say that because $k_{obs}$ is significantly low, the two assemblages are significantly dissimilar, which is to say that something influenced the selection of species from the pool such that fewer than expected occur in common. None of the other cases show significant departure from chance expectations at the 95% level of confidence. A significant similarity would be represented by a number greater than or equal to 0.95 in the first row of Table 1. Thus, the INDEX OF SIMILARITY as defined here cannot be used as a direct test of statistical significance but the data contributing to it can be so used.

The foregoing scheme is unfortunately not appropriate for general application because it uses the simplifying assumption that all species are present in equal numbers in the pool and thus have an equal chance of occurring in any assemblage. When equation (4) is used with actual assemblage pairs, the observed 'k' values are usually much higher than would be expected. Most assemblages appear to be significantly similar to each other. This is because most related faunas contain a few common, nearly ubiquitous species which have the effect of elevating the observed 'k' values. This problem can be avoided by explicitly accounting for differences in the relative probability of each species occurring in an [...] can be constructed with species of differing frequencies. This is analogous to an urn which has more of some kinds of balls than of others. Simberloff (1978) solved this problem in the formulation of his Null Hypothesis II. He used the actual frequencies of species in a region as an explicit measure of the probability of these species appearing in any one assemblage drawn from that region. We will follow the same approach.

The formation of the pool can be illustrated by using one of the examples of actual data that we will employ later in this paper as a test of the methodology. We will use data on the present biogeography of the 222 genera of living echinoid echinoderms. Their distribution is expressed in terms of their presence or absence in 40 sampling areas in the present-day oceanic world. Some of the areas are arbitrarily defined and some are based on traditional biogeographic divisions. The basic data set thus consists of lists of genera for each of the 40 sampling areas. Some genera are endemic to a single area while others are found in many areas. In this data set, 60 of the 222 genera are endemics and the most 'cosmopolitan' genus is found in 20 of the 40 areas. The number of genera occurring in a given area ranges from 1 to 120.

We will assume, a la Simberloff, that the probability of occurrence of a genus in a sampling area is directly proportional to the number of areas in which that genus occurs. Therefore, a genus which occurs in only one area is seen as having a probability of 1/40 of occurring in any given area. A genus which is found in 10 areas has a probability of 10/40 of occurring, and so on. There is a definite hint of circularity in this reasoning but Simberloff (1978) has presented convincing arguments for the lack of significant circularity. The main point is that the reasoning does not make any demands on the spatial distribution of occurrences within the whole region under study: it does not preclude a concentration of occurrences in one part of the region. Therefore, the null hypothesis of random sprinkling is a valid null hypothesis and can be falsified.

The pool of taxa from which local faunas are formed has as many occurrences of each genus as there are occurrences of that genus in the total data set. When assemblages are selected at random from such a pool, the cosmopolitan genera are more likely to occur and are thus more likely to be genera common to both members of a pair of assemblages. Local endemics (those with probabilities of 1/40, in the echinoid case) can occur in two assemblages but the probability of this event is low.

One would naturally like to be able to derive an equation equivalent to equation (4) which would predict values of $k_{exp}$ under the conditions described above. We have been unable to derive this equation. Therefore, we have had to rely on monte carlo simulations, just as Simberloff did for his purposes. Our method is as follows:

1) For each pair of assemblage sizes in the real world data set construct an imaginary pair of assemblages (A and B) by drawing species from the pool. This is accomplished by computer with a random number generator. Having made the two assemblages, the lists of genera are compared and the number of genera in common is recorded. This number is one outcome of sampling under the random sprinkling hypothesis: that is, one point in a $k_{exp}$ probability distribution.

2) The same procedure is repeated many times with the number of taxa shared by each pair of assemblages being recorded.

3) A frequency distribution of the results is an estimate of the probability distribution of $k_{exp}$ under the specified conditions of A and B.

4) The number of taxa actually shared ($k_{obs}$) by assemblages of these sizes in the real world is compared with the monte carlo generated distribution and the INDEX OF SIMILARITY is computed as in the simplified case described earlier.

An actual example of this procedure is illustrated in Text-figure 3 for a pair of sampling areas in the echinoid data set: these areas had 19 and 36 genera, respectively. A frequency distribution of 50 simulated assemblage pairs is shown in Text-figure 3 along with the actual number observed in common (5). In this case, $k_{obs}$ falls nearly at the center of the simulated distribution and the null hypothesis cannot be rejected. But the percentage of simulations having 'k' less than or equal to $k_{obs}$ may be used as the INDEX OF SIMILARITY.

When this procedure was followed with the entire echinoid data set, most cases fell between the 5% tails of their distributions (as in Text-fig. 3).

[Text-figure 3: frequency distribution of simulated $k_{exp}$ values (horizontal axis, 0 to 19) for assemblages with A = 36 and B = 19.]

TEXT-FIG. 3-Example of treatment of a comparison between two echinoid faunas. $k_{obs}$ is the number of genera actually observed to be in common. The curve shows the percent of simulations yielding each value of $k_{exp}$. The ruled portion indicates the number of simulations having 'k' values less than or equal to the observed value.

A special problem arises where the smaller assemblage (B) is very small. In one echinoid pair, for example, assemblage B contained only two genera and thus the only possible values of 'k' are 0, 1, and 2. Fifty simulations produced the following k's:

k = 0: 40 (80%),
k = 1: 9 (18%),
k = 2: 1 (2%).
Total: 50

There were no genera actually common to the two areas ($k_{obs}$ = 0). Thus, using the procedure described earlier, computing the INDEX would yield a value of 0.80. But this implies a higher similarity than may exist. In other words, we do not know where the value of $k_{obs}$ falls within the 80%. Therefore, an arbitrary convention was adopted: the INDEX is computed on the basis of the midpoint of the string of simulated 'k' values which are equal to the observed 'k.' The INDEX in this case is recorded as 0.40. The same convention was followed throughout. In the case illustrated in Text-figure 3, the INDEX was recorded as 0.39. This was arrived at by summing the percentages of the simulations that gave 'k' values less than $k_{obs}$ (2 + 6 + 20 = 28) and adding one-half the percentage of simulations where 'k' equaled $k_{obs}$ (1/2 x 22 = 11).
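The four simulation steps and the midpoint convention can be sketched as below. This is an illustrative reading, not the authors' program: the function names, the toy occurrence counts, and the choice to draw weighted genera until the assemblage holds the required number of distinct genera are all assumptions made here, since the source does not specify how duplicate draws were handled.

```python
import random
from collections import Counter


def draw_assemblage(genera, weights, size, rng):
    """Draw `size` distinct genera, each draw weighted by occurrence count.
    Assumption: repeat draws of a genus already chosen are simply ignored."""
    if size > len(genera):
        raise ValueError("assemblage larger than the number of genera in the pool")
    chosen = set()
    while len(chosen) < size:
        chosen.add(rng.choices(genera, weights=weights, k=1)[0])
    return chosen


def simulate_k(occurrences, a, b, n_sims, rng):
    """Steps 1-3: tally the number of shared genera over n_sims random pairs."""
    genera = list(occurrences)
    weights = [occurrences[g] for g in genera]
    counts = Counter()
    for _ in range(n_sims):
        set_a = draw_assemblage(genera, weights, a, rng)
        set_b = draw_assemblage(genera, weights, b, rng)
        counts[len(set_a & set_b)] += 1
    return counts


def midpoint_index(counts, k_obs):
    """Step 4 with the midpoint convention: the fraction of simulations below
    k_obs plus one-half the fraction of simulations equal to k_obs."""
    total = sum(counts.values())
    below = sum(c for k, c in counts.items() if k < k_obs)
    equal = counts.get(k_obs, 0)
    return (below + 0.5 * equal) / total


# The small-assemblage example from the text: 40, 9, and 1 simulations
# gave k = 0, 1, and 2, and no genera were actually shared (k_obs = 0).
text_example = Counter({0: 40, 1: 9, 2: 1})
print(midpoint_index(text_example, 0))

# Hypothetical toy pool: genus name -> number of areas in which it occurs.
toy_pool = {f"genus_{i}": (i % 5) + 1 for i in range(30)}
sims = simulate_k(toy_pool, a=8, b=4, n_sims=50, rng=random.Random(0))
print(midpoint_index(sims, k_obs=2))
```

Passing an explicit `random.Random` instance keeps a run repeatable, which matters when only 50 simulations are used per pair of assemblage sizes and the estimated tails are correspondingly coarse.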


The use of monte carlo methods calls for considerable computation time, much more than would be required if an analytical expression comparable to equation (4) were available. But the results are just as accurate, given enough simulations. In the echinoid case we used 50 simulations for each pair of assemblage sizes. 100 or 1,000 simulations per pair would have produced more precise distributions but 50 was chosen as the best compromise with the limitations of computer budgets. The important point is that the simulation technique does not sacrifice rigor unless the number of simulations is too low.

The computer program used for the echinoid and other analyses is available from the authors. It is a relatively expensive program to run. The cost depends on the number of assemblages and the variation in their sizes. The echinoid data set described here is unusually large and requires about 26 cpu minutes on an IBM 360/65 to produce the similarity matrix plus a complete record of the 19,750 simulations required for the job. A variety of techniques could be used to reduce the cost but they would sacrifice accuracy.

APPLICATIONS

The method described in this paper can be applied to any data set consisting of presence and absence of taxa. In other words, any situation which yields floral or faunal lists is appropriate. Each list may represent a single collecting locality or a composite of information from a group of geographically, ecologically or stratigraphically related localities.

In order to test the methodology, we will present three quite different examples: 1) global biogeography of living echinoid echinoderms, 2) distribution of benthic foraminifera in Santa Monica Bay, California, and 3) global biogeography of Ordovician nautiloid cephalopods. It should be emphasized that the proposed INDEX OF SIMILARITY, like all other similarity measures, is a purely descriptive tool. Its purpose is to measure similarities and differences between taxonomic lists and to assess the statistical significance of these similarities and differences. It does not interpret the results in the sense of telling us the biological or geological factors responsible for the similarities or differences.

Echinoid biogeography.-The data set used for this test was presented briefly above. The distributional data all come from Mortensen's Monograph of the Echinoidea (1928-51) which provides a consistent and authoritative taxonomic base. Of the forty geographic sampling areas used for the present study, most are relatively shallow water coastal or insular areas where distributions of taxa tend to reflect regional climate. The others have uniformly cold water faunas: the non-insular ocean areas of the North Pacific, South Pacific, North Atlantic, Central Atlantic, and South Atlantic and the Arctic and Antarctic Oceans.

As indicated earlier, the data set consists of 222 genera which range from local endemics to those found in as many as 20 of the 40 sampling areas. In keeping with the philosophy of the method, no data were discarded because of endemism or cosmopolitanism and no areas were excluded because of small sample size.

The basic computer program was run to assess similarity between the members of all possible pairs of the 40 generic lists. Fifty simulations were used for each pair of assemblage sizes. The output consisted of 1) the tabulated results of all simulations (number of genera in common) and 2) a matrix of values of the computed INDEX OF SIMILARITY. Various analyses were performed on the output, some of which will be described below.

Text-figure 4 shows how one of the sampling areas compares with the other thirty-nine. The reference area (marked by an 'X') on the west coast of Central America was chosen arbitrarily and other choices produce comparable results. Values of the INDEX OF SIMILARITY are contoured and show decrease in similarity with distance from the reference area. Contouring was straightforward; that is, extreme contortion of contour lines was not necessary. Furthermore, the resulting pattern is plausible and interpretable in biogeographic terms. The map shows clearly that the echinoids of the Eastern Pacific are much more similar to those of the Western Atlantic than to those of the Western Pacific and Indian Ocean regions. In fact, the presence of the Central American barrier is not evident in the pattern. (This would not be the case at the species level where virtually no echinoid species are common to both sides of the Isthmus of Panama.) While some details of the pattern may reflect sampling error, there is no reason to believe that Text-figure 4 is not a


TEXT-FIG. 4-[...] numbers indicate the INDEX OF SIMILARITY of each area with respect to an arbitrarily chosen reference area (marked with X: Pacific coast of Central America). Numbers enclosed in circles indicate significant similarity to the reference area; numbers in boxes indicate significant dissimilarity to the reference area. Intermediate values of the INDEX are contoured.

[...] group of echinoderms.

Similar plots have been made using the Simpson, Dice, and Jaccard similarity indexes. The results are approximately the same although contouring was more difficult. (In particular, the Simpson Coefficient data contained several unexplained anomalies.) The good results produced by the conventional indexes are not surprising in view of the extremely high quality of Mortensen's distributional data and the basic simplicity of echinoid biogeography at the generic level.

If the production of a map such as Text-figure 4 were the only purpose, the other indexes would probably be adequate and the saving in computation time would be substantial. But the similarity measure proposed here allows one to evaluate differences between [...] Text-figure 4, seven of the similarities are significantly high (at the 95% level of confidence): these are the areas whose INDEX values are circled. In these cases, the number of genera observed to be in common is greater than in at least 48 of the 50 simulations. The values of the INDEX contained in squares are significantly low: that is, the number of taxa observed in common is less than in at least 48 of the 50 simulations. The distribution of circled and boxed similarity values is the expected one. The intermediate cases indicate a probability of faunal similarity but do not lead to the rejection of the null hypothesis. We can say, for example, that the Alaskan echinoids appear to be different from those of Central America but the difference is not statistically significant and thus could be caused by chance


TEXT-FIG. 5-Multivariate analysis of echinoid biogeographic data. The first two principal components (PCI and PCII) are plotted for the 40 sampling areas. PCI separates coastal areas of the Indo-Pacific from other areas and from the cold water, open ocean areas. PCII reflects water temperature.

[Plot labels recoverable from the figure: coastal tropical/subtropical Indo-Pacific, North Atlantic, Sea of Japan, Antarctic; horizontal axis PC I.]

sampling error. The orderliness of the contour lines strongly suggests, of course, that the Alaskan and Central American echinoids are in fact different in the sense that they do not represent random sprinkling from the same pool.

Text-figures 5 and 6 show 2-dimensional ordination plots of the first three principal component axes representing 98.5 percent of the variation in the data set. The principal components, PCI, PCII, PCIII, account for 51, 28, and 19.5 percent of the variation, respectively. The sampling areas which form tight, natural groups are shown as solid dots and the groups are labeled. Others are shown as open circles and identified individually.

Text-figure 5 is an ordination of PCI and PCII. PCI clearly separates the coastal areas of the Indo-Pacific region from those of the Eastern Pacific and the Atlantic. This separation is the result of the East Pacific Barrier (Ekman, 1953), an 1,810 km expanse of open ocean separating the islands of Outer Polynesia and the tropical/subtropical coast of America. Under ordinary oceanic conditions, echinoid larvae are not capable of crossing this barrier and faunas on either side of the barrier are significantly different below the family level. Separation of the South Australian and New Zealand regions from the tropical/subtropical regions of the Indo-Pacific illustrates the cold-temperate character of the South Australian and New Zealand faunas. Although geographically proximal, the echinoid faunas of the Sea of Japan and the Sea of Okhotsk are remarkably different. The echinoid fauna of the Sea of Japan consists of shallow water, warm-temperate genera derived from the subtropical Indo-Pacific via the warm Kuroshio Current while the echinoid fauna of the Sea of


TEXT-FIG. 6-Multivariate analysis of echinoid biogeographic data. The first and third principal components (PCI and PCIII) are plotted for the 40 sampling areas. PCIII serves to separate the coastal areas of the Eastern Pacific and Atlantic into distinct regions.

[Plot labels recoverable from the figure include: Indian Ocean, North Atlantic, Central Atlantic, South Atlantic, Sea of Okhotsk, South Pacific, Sea of Japan, coastal tropical/subtropical Indo-Pacific, Patagonia, Antarctic, Western Atlantic; horizontal axis PC I.]
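The percentages of variation attributed to PCI, PCII, and PCIII are the shares of total variance carried by successive principal components. This section does not describe how the components were computed, so the Python fragment below is a sketch under stated assumptions rather than a reconstruction of the authors' procedure: it assumes an areas-by-taxa presence/absence matrix, a covariance-based analysis, and power iteration with deflation to extract the leading components. The toy matrix and all function names are invented for illustration.

```python
def column_center(rows):
    """Subtract each column (taxon) mean from the data matrix."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    return [[r[j] - means[j] for j in range(m)] for r in rows]


def covariance(centered):
    """Sample covariance matrix between columns of a centered matrix."""
    n, m = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(m)]
            for i in range(m)]


def top_eigenpair(matrix, iterations=200):
    """Leading eigenvalue and unit eigenvector by power iteration."""
    m = len(matrix)
    v = [1.0 + i for i in range(m)]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(m)) for i in range(m)]
    return sum(v[i] * w[i] for i in range(m)), v


def explained_variance(rows, n_components):
    """Fraction of total variance carried by each leading component."""
    cov = covariance(column_center(rows))
    m = len(cov)
    total = sum(cov[i][i] for i in range(m))
    fractions = []
    for _ in range(n_components):
        lam, v = top_eigenpair(cov)
        fractions.append(lam / total)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(m)]
               for i in range(m)]
    return fractions


# Invented toy data: six sampling areas (rows) by five taxa (columns).
toy_matrix = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
print(explained_variance(toy_matrix, 2))
```

In the toy matrix the first component separates the two blocks of areas and the second picks up the one taxon that cuts across them, which is the same kind of reading applied to PCI, PCII, and PCIII in the text.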

Okhotsk consists of cold-temperate genera derived from the north via the cold Oyashio Current. Any chance mixing of the faunas is reduced by a shallow submarine sill separating the two bodies of water. Deep water and high latitude faunas tend to cluster in the lower right corner of Text-figure 5. Text-figure 6 shows that the coastal areas of the Eastern Atlantic, Western Atlantic, and Eastern Pacific are separated along PCIII. Naturally, statistical significance cannot be assessed in the results of the multivariate analysis but the ordination plots yield considerable information of biogeographic interest.

Foraminifera in Santa Monica Bay.-In a superbly detailed study, Zalesny (1959) recorded and interpreted the distribution of the foraminifera in the bottom sediments of Santa Monica Bay, California. Seventy bottom samples were analyzed to produce distributional data on 96 foraminiferal species and subspecies. The samples covered an area of approximately 100 square kilometers in water depths ranging from 10 to 828 meters. All occurrence data (in terms of percentage abundance) were tabulated in the Zalesny paper. For the present study, these were converted to simple presence and absence of taxa and the INDEX OF SIMILARITY was computed for all pairs of the 70 sampling localities.

Text-figure 7 shows a contour map of raw similarity data comparable to that for echinoids (Text-fig. 4). The reference fauna (Zalesny's sample #3110) is in the left central part of the map and is indicated by an 'X.' Contours reflect decreasing similarity of the other 69 localities with respect to the reference locality. Numerical values of the INDEX OF SIMILARITY are not shown in this case but the location of each site is shown by a small symbol. Those indicated by triangles are the ones that are significantly similar to sample #3110, those indicated by open circles are significantly dissimilar, and the solid dots represent localities which are not significant in either direction. Also included are the bathymetric contours for 10, 50, and 100 fathoms. The shelf edge is well defined in Santa Monica Bay and the continental slope is steep. The shelf is indented by two major submarine canyons: Santa Monica Canyon (near the reference locality) and the Redondo Canyon (southeast corner of the mapped area).

TEXT-FIG. 7-Analysis of foraminiferal assemblages from Santa Monica Bay. Solid contours indicate variation of the INDEX OF SIMILARITY with respect to an arbitrary reference fauna (#3110). Triangles represent assemblages which are significantly similar to the reference fauna: open circles are assemblages significantly dissimilar to the reference fauna; solid circles are assemblages not significantly similar or dissimilar to the reference fauna.

The contours of faunal similarity follow the bathymetry with remarkable faithfulness. The shelf edge is clearly defined and both canyons are evident. The one major anomaly is the small 'bump' on the inner shelf produced by sample #3348. Similarity between this sample and the deep water reference fauna is substantially higher than is found between the other shelf faunas and the reference fauna. It is not, however, a statistically significant anomaly. In fact, when #3348 is compared with the three closest localities, statistically significant similarity is found! #3348 may therefore be a simple chance departure from the overall pattern of the contoured similarity surface or it may


show the locations of 52 sampling areas, one of which (Siberia) was used as the reference fauna. The paleogeographic reconstruction is an interpolation between published maps (Scotese, et al., 1979) provided by C. R. Scotese and A. M. Ziegler.

result from a minor habitat difference between #3348 and other shelf sites. The latter suggestion is likely in view of the fact that Zalesny's sediment maps show that #3348 comes from a small patch of silt on the shelf surface otherwise covered with sand, gravel, or rock. All the deep water sampling sites are in silty sediments. It is not surprising, therefore, that the silt patch on the shelf should yield an assemblage with relatively high similarity to the deep-water reference fauna.

It should be emphasized that Text-figure 7 does not in itself require a bathymetric or sediment interpretation. As Zalesny (1959) points out, many other ecological parameters such as temperature and salinity parallel changes in depth and sediment type. The contoured INDEX OF SIMILARITY only provides a statistical framework for interpretation.

Text-figure 7 can be used also to investigate another aspect of faunal similarity. The null hypothesis of random sprinkling predicts that about 5% of the sites should appear to be significantly similar to the reference fauna and that about 5% should be significantly dissimilar and that these cases of apparent statistical significance should be randomly distributed over the area. In this instance, therefore, 10% or about seven of the 69 assemblages should show statistical significance in one direction or the other. In fact, 21 or 30% show statistical significance and their distribution is obviously non-random over the geographic area. This means that the null hypothesis of random sprinkling can be rejected when considering foraminiferal assemblages of the whole bay. This is not surprising in the Santa Monica Bay case and is certainly substantiated by Zalesny's detailed analysis of the distributions of individual taxa. It illustrates how the method being presented here can be used to explore the question of whether distribution of taxa is purely stochastic or whether it is biased by deterministic biological and/or physical factors. Even though the stochastic model can be rejected easily in this case, the INDEX based on the null hypothesis of random distribution is still a valuable aid to ecological interpretation.

Multivariate analysis was also carried out on the foraminiferal data. Bivariate ordination plots are eminently contourable and follow bathymetry.

Ordovician nautiloid biogeography.-This data set consists of 182 genera of Arenigian nautiloid cephalopods which range from endemics to those found in as many as 20 of 52 sampling areas. The data are taken from a broader study of Ordovician biogeography (Crick, 1978). The location of sampling areas


TEXT-FIG. 9-Multivariate analysis of Arenigian biogeographic data. A plot of the first two principal components (PCI and PCII) separates the major geographic elements of the early Ordovician.
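The whole-data-set argument made for the Santa Monica Bay foraminifera (about seven of 69 assemblages expected to appear significant under random sprinkling, against 21 observed) can be given a rough numerical footing. The fragment below is a sketch only. It treats the 69 comparisons as independent trials, each with probability 0.10 of appearing significant; that independence is an assumption introduced here and is not claimed in the text, since neighboring sites share taxa and are surely correlated. The function and variable names are hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_sites = 69            # assemblages compared with the reference fauna
p_significant = 0.10    # about 5% similar plus about 5% dissimilar
observed_significant = 21

expected_count = n_sites * p_significant
tail_probability = binomial_tail(n_sites, observed_significant, p_significant)

print("expected significant sites:", round(expected_count, 1))
print("P(at least 21 significant by chance):", tail_probability)
```

Under these assumptions the tail probability is very small, in line with the rejection of random sprinkling for the bay as a whole; because of the independence assumption the figure should be read as an order-of-magnitude guide and not as an exact significance level.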

is shown in Text-figure 8 on a reconstruction of Ordovician paleogeography developed by Scotese et al. (1979).

The faunal relationships were measured with the computer program used in the preceding examples. Contouring of the similarity values with respect to Siberia as an arbitrary reference area (Text-fig. 8) shows the expected decrease in similarity away from the reference area. Contouring the same data on a map of modern geography (not shown) reveals substantial anomalies which reflect the differences between modern and Ordovician geography. For example, the Bear Island fauna is significantly similar (at the 95% level) to faunas from Arctic Canada and Scotland but it is not significantly similar to Norway, Sweden, or Estonia. The latter three areas are closest to Bear Island at the present time but were separated (as part of Baltica) from Bear Island in the Ordovician.

Patterns in 2-dimensional ordinations of the principal component axes are not quite as easily interpreted as were comparable plots of Recent echinoid and foraminiferal data. This reflects loss of information about physical environments and a certain amount of geographic uncertainty. However, information on associated faunas and sediments, along with knowledge of tectonic setting, does make the multivariate plots understandable. Text-figure 9 shows a plot of PCI and PCII for the 52 sampling areas. Clusters showing the principal geographic regions (plates) are indicated. Plots including PCIII (not shown) show separation of two important Ordovician facies; the platform facies characterized by shelly faunas and the slope deposits (graptolitic facies). More detail on this aspect is given elsewhere (Crick, 1978).

DISCUSSION

The similarity measure presented here is somewhat cumbersome and expensive because of the simulation technique. The rewards may be worth the extra effort, however. These may be summarized as follows:

1) Distributional data are weighted on the basis of frequency so that widespread taxa do not have a disproportionate influence on measurement of similarity.

2) There is no need to discard taxa on the a priori grounds that they are too widespread or too localized.

3) The similarity or dissimilarity of any two faunas can be tested for statistical significance. Such tests are robust assuming that enough simulations have been run.

4) Because the evaluation of similarity does not presume any particular shape for the probability distribution of expected numbers of taxa in common, the results may be considered precise and not dependent upon generalizations drawn from computed variances of the probability distribution.

5) An entire faunal realm or data set can be investigated for significance of the observed departures from a random sprinkling (stochastic) model of taxon distribution.

The three examples that have been described do not include one in a biostratigraphic context but biostratigraphic applications should be straightforward and follow logically from the biogeographic/ecological cases used here. For example, the probable stratigraphic position of a new fossil assemblage could be assessed by comparing it with a large number of assemblages in a standard (possibly composite) sequence. This could be done in the fashion of the contour maps of Text-figures 4, 7, and 8 except that it would be a one-dimensional instead of a two-dimensional problem. The highest INDEX OF SIMILARITY would be centered on the assemblages in the standard sequence most similar to the new assemblage. This would not demand that a temporal correlation be made at that point, of course, because the similarity might be due to ecological or biogeographic proximity rather than temporal identity. But this is an ever-present problem in biostratigraphy which must be dealt with regardless of the method used to assess similarity. In the biostratigraphic context, tests of statistical significance could be performed in the manner of the echinoid and foraminiferal examples.

ACKNOWLEDGMENTS

This work was supported in part by the Earth Sciences Section, National Science Foundation, NSF Grant DES75-03870. We would also like to thank Richard K. Bambach and Alan H. Cheetham for helpful reviews of the manuscript.

REFERENCES

Cheetham, A. H. and J. E. Hazel. 1969. Binary (presence-absence) similarity coefficients. J. Paleontol. 43:1130-1136.

Crick, R. E. 1978. Ordovician nautiloid biogeography: a probabilistic and multivariate analysis. Ph.D. Dissertation, Univ. Rochester, 166 p.

Ekman, S. 1953. Zoogeography of the Sea. Sedgwick & Jackson Ltd., London, 417 p.

Henderson, R. A. and M. L. Heron. 1977. A probabilistic method of paleobiogeographic analysis. Lethaia 10:1-15.

Mortensen, T. 1928-1951. A Monograph of the Echinoidea. C. A. Reitzel, Copenhagen. 5 vols., 4469 p.

Rohlf, F. J., J. Kishpaugh and D. Kirk. 1971. NT-SYS. Numerical Taxonomy System of Multivariate Statistical Programs. Tech. Rep. State Univ. New York at Stony Brook, New York.

Scotese, C. R., R. K. Bambach, C. Barton, R. Van Der Voo and A. M. Ziegler. 1979. Paleozoic base maps. J. Geol. 87:217-277.

Simberloff, D. S. 1978. Using island biogeographic distributions to determine if colonization is stochastic. Am. Naturalist 112:713-726.

Simpson, G. G. 1943. Mammals and the nature of continents. Am. J. Sci. 241:1-31.

Simpson, G. G. 1947. Holarctic mammalian faunas and continental relationships during the Cenozoic. Geol. Soc. Am. Bull. 58:613-688.

Zalesny, E. R. 1959. Foraminiferal ecology of Santa Monica Bay, California. Micropaleontol. 5:101-126.

MANUSCRIPT RECEIVED FEBRUARY 17, 1979
REVISED MANUSCRIPT RECEIVED APRIL 12, 1979

The Field Museum of Natural History contributed $500 in support of this article.
