
Economic Geology

Vol. 64, 1969, pp. 538-550

A Simplified Statistical Treatment of Geochemical Data
by Graphical Representation
CLAUDE LEPELTIER

Abstract

In the course of a mineral exploration sponsored by the United Nations Development Programme in two selected zones of Guatemala, a stream sediment reconnaissance was carried out, and graphical methods of interpretation were attempted in the search for a simplified statistical treatment of about 25,000 geochemical results. The data were grouped by drainage and lithological units, and the frequency distributions of the abundance of Cu, Pb, Zn and Mo were studied in the form of cumulative frequency curves. The four elements appear to be approximately lognormally distributed. Background, coefficients of deviation and threshold levels were graphically estimated. Examples are given of simple and complex populations. Mineral associations were studied by correlation diagrams.

Contents

Introduction
Difficulty of the statistical approach in the case of stream sediment survey
Adjustment to a lognormal distribution
  Definitions
  Construction of the cumulative frequency curve
  Comparison with histograms
Information given by cumulative frequency curves
  Background
  Deviation
  Threshold
  Examples
Advantages of cumulative frequency curves
The coefficients of deviation
Correlation diagrams
Conclusion
References

This article is published with the authorization of the United Nations. The opinions expressed are not necessarily endorsed by this Organization.

Downloaded from https://pubs.geoscienceworld.org/segweb/economicgeology/article-pdf/64/5/538/3483614/538.pdf by Juan M. Garcia on 02 October 2018

Introduction

The United Nations Mineral Exploration Programme in Guatemala relied heavily on geochemical prospecting. During one year (1967) 60 percent of the total Project area was covered systematically by a geochemical reconnaissance carried out in the drainage systems. Nine thousand stream sediment samples were collected over about 12,000 km² (rounded figures). All the samples were analyzed for copper and zinc, and the total number thinned out to approximately 4,000 before being run for lead and molybdenum. Finally about 25,000 geochemical results were available for compilation and interpretation. As they accumulated, it became apparent that high-contrast anomalies which are obvious targets for follow-up operations would not be encountered but rather more subtle features not so easy to pinpoint and interpret.

The interpretation phase of the survey was characterized by two essential features: the great amount of data to be analyzed and the lack of precision of these data.

Sampling and analytical methods must sacrifice precision for speed due to the nature of geochemical prospecting, and the first consequence of this fact is that an isolated result has little meaning in geochemistry. It must be part of a population as numerous and homogeneous as possible. Indeed in all kinds of phenomena, individual inaccuracies shade off progressively when observation is extended to larger and larger populations.

The first phase of geochemical interpretation is to condense large masses of numerical data and extract from them the essential information. The most objective and reliable way to do it (and sometimes the only one) is statistically. Large sets of numbers, cumbersome and difficult to interpret, may be reduced to a useful form by the use of descriptive statistics. This is best done by the graphical representation of the frequency distribution of a given set of data; then the average value, an expression of the degree of variation around the average, and the limit above which the anomalies start are immediately and precisely determined as well as the existence of one or several populations in the surveyed area.

This treatment of the data also simplifies the comparison of the geochemical behavior of an element in various geological surroundings or of several elements in the same lithological unit.

I am grateful to Mr. Henry H. Meyer, Project Manager of the Guatemala and El Salvador Mineral
Surveys, and to Mr. Stephen S. Steinhauser for technical criticism and much helpful discussion.

Difficulty of Statistical Approach in Stream Sediment Surveys

A reliable statistical interpretation requires that a great quantity of data be treated and that these data be homogeneous.

In drainage reconnaissance surveys, the first condition is easily fulfilled but not the second. As a matter of fact, the importance of sampling technique is sometimes overlooked in this type of prospecting. But even if it is given the appropriate attention, too many types of rivers and too many lithological units are generally sampled to result in a homogeneous collection of samples. The best way to limit the inconvenience of the heterogeneity of the samples (particularly pH, organic content and grain size) is to split the survey area into drainages and lithological units, when possible, and to make the statistical interpretation for each of them separately. However, even if this is done, the same degree of precision cannot be achieved as in the case of a soil survey where good homogeneity is possible.

Adjustment to a Lognormal Distribution

Definitions

When dealing with a large mass of geochemical data, the first step is to find what sort of distribution pattern best fits the various sets of observations. And, thus far, the lognormal distribution pattern appears to be the one most applicable to the results of most geochemical surveys (Ahrens, 1957).

In geochemical prospecting, we study the content of trace elements in various natural materials, and to say that the values are lognormally distributed means that the logarithms of these values are distributed following a normal law (or Gauss' law) well known as the bell-shaped curve (Monjallon, 1963).

Many natural or economic phenomena can be expressed by a value varying between zero and infinity, represented by a skewed distribution curve. If, instead of the actual value of the variable itself, we plot its logarithm in abscissae, the frequency curve takes a symmetrical, bell-shaped form, typical of the normal distribution. This happens when a phenomenon is subject to a proportional effect, that is to say when independent initial causes of variations of the studied value take effect in a multiplicative way. It is the case, for instance, for the distribution of trace elements in rocks, for the area of the different countries of the world, for the income of individuals in a country, for the grain size in samples of sedimentary rocks, and others (Coulomb, 1959; Cousins, 1956).

In all these examples, the character studied follows the lognormal law, which is probably more common than the normal one.

It is interesting to note here that the lognormal law fits very well in the case of low-grade deposits like gold but for high-grade deposits, iron for instance, the experimental distributions are generally negatively skewed because of the limitation towards the high values. G. Matheron gives a thermodynamic interpretation of the proportional effect in the case of ore deposits and relates it to the Mass Action Law (Matheron, 1962). To the extent that geochemical anomalies are extrapolations of ore deposits this theory should apply to geochemical prospecting.

Construction of the Cumulative Frequency Curve

A lognormal distribution curve is defined by two parameters: one dependent on the mean value, and the other dependent on the character of value-distribution. This latter parameter is a measure of the range of distribution of values, that is whether the distribution covers a wide or narrow range of values. The two parameters can be determined graphically as will be explained on following pages. For practical purposes, we work on cumulative frequency curves, and their construction shall be explained by means of a concrete example.

The various steps of this construction are the following:

(a) Selection of a precise set of data ("population") as large and homogeneous as possible.
(b) Grouping of the values into an adequate number of classes.
(c) Calculating the frequency of occurrence in each class and plotting it against the class limits; this gives a diagram called the "histogram."
(d) Smoothing the histogram to get the frequency curve.
(e) Plotting the cumulated frequencies as ordinates gives the cumulative frequency curve, which is the integral of the frequency curve.
(f) By replacing the arithmetic ordinate scale with a probability scale the cumulative frequency curve is represented by one or more straight lines.

Examples of lognormal frequency curves are shown in Figure 1.

Some brief comments on the different steps follow:

(a) The larger the population to be analyzed, the more precise and reliable the results. If necessary, as few as 50 values may be treated statistically but
the confidence limits must be calculated to see if the analysis is meaningful.

[Figure 1. Lognormal distribution curves. Panels: frequency (%) against value on an arithmetic scale; frequency (%) against value on a logarithmic scale; cumulated frequency against value on a logarithmic scale (two panels).]

(b) A correct grouping of the values is mandatory if some precision is to be achieved in the statistical interpretation; too few classes will result in shading out important features of the curve; too many in losing significant details amidst a cloud of erratic ones. The results are distributed in classes, the modulus of which should be proportional to the precision of the analyses: the more precise the analyses, the smaller the modulus. The logarithmic interval must be adapted to the variation amplitude of the values and to the precision of the analytical methods (Miesh, 1967).

In statistics, working with 15 to 25 intervals (or classes) is recommended. As a rule, the width of a class, expressed logarithmically, must be kept equal to or smaller than half of the standard deviation (Shaw, 1964).

For geochemical purposes, it is convenient to work with 10 to 20 points on the cumulative frequency line, that is to say with 9 to 19 intervals or classes. There are three variables to consider: the number of points (n) necessary to construct a correct line; the range of distribution of the values (R), expressed as the ratio of the highest to the lowest value of the population; and the width of the classes expressed logarithmically (log. int.), which has to be selected as a function of the first two parameters. These three variables are linked by the relation:

$$\text{log. int.} = \frac{\log R}{n}$$

In most of the cases R varies from 6 to 300 (experimental average values); then, with (n) varying from 10 to 20 and log R from 0.78 to 2.48, the extreme values for the logarithmic interval will be:

$$\text{log. int.} = \frac{0.78}{20} = 0.039$$

$$\text{log. int.} = \frac{2.48}{10} = 0.25$$

The 0.10 was selected as the best suited logarithmic interval for the classes because it suits most distribution, giving reasonable number of classes and a good definition of the curve. In case of very reduced dispersion of the values around the mean, it may be necessary to use 0.05, and if the dispersion is specially large, 0.2 will be chosen. When the logarithmic interval is selected, it is easy to calculate a table giving the class limits in ppm. The only precaution is to avoid starting with a round value so that no analytical results will fall on the limit of two classes. The most useful and commonly employed in geochemical work is the 0.1 log. int. class table, a part of which is given below:

class limit (log):  0.07  0.17  0.27  0.37  0.47  0.57
class limit (ppm):  1.17  1.48  1.86  2.34  2.95  3.72

It can be extended in both directions as far as necessary.

(c-d) After selecting the class table, the values are grouped and the frequency calculated for each class (in percentage); then the frequencies are plotted against the class limits (the latter being logarithmically calculated, ordinary arithmetic-arithmetic paper must be used), giving a histogram which is smoothed to a frequency curve. But histograms are often misleading, being strongly affected by slight changes in class intervals, and frequency curves are difficult to draw and handle: for instance, it is necessary to determine the inflexion points of the curve in order to evaluate the standard deviation.

Practically, the histogram-frequency curve step is skipped and the cumulative frequency directly constructed. However, note here an advantage of the histogram: it clearly illustrates the effect of the sensitivity of the analytical method and more precisely the bias brought to the low values by the use of colorimetric scales of standards. As a matter of fact, experience shows that there is an inevitable concentration of the readings, whoever the analyst, on the values actually represented in the colorimetric scale. For instance, in the case of copper, the lower part of the standard colorimetric scale reads 0, 2, 4, 7 ... ppm. Usually this results in an excess of 2, 4 and 7 values, and a conspicuous lack of 1, 3, 5 ppm values. This is of importance for a correct construction of the frequency curve, and the raw values must often be corrected by extrapolating the general shape of the curve.

(e-f) By plotting the cumulated frequencies as ordinates instead of the frequencies, one obtains the integral curve of the preceding. It has the form of a straight line when using the appropriate graph paper (probability-log), and it is the one used in geochemical presentation and interpretation of the results. Then two questions have to be answered: where to start accumulating the frequencies, and where to plot the cumulated frequencies?

[Figure 2. Cumulative frequency distribution for Zn and Cu.]

As for the first point, the normal procedure followed by many authors is to start cumulating the
frequencies from the lowest values toward the highest (Fig. 1) (Hubaux, 1961; Tennant and White, 1959). However, one has to consider a property of the probability scale used as ordinates: the values zero and 100% are rejected to infinity; it does not matter for zero because zero % never occurs, but in each case the last cumulated frequency is 100%, and this value is impossible to plot, lost for the curve. Then considering the lack of precision in the low values and the importance of the high ones for the determination of the threshold level, I consider it much better to cumulate the frequencies from the highest to the lowest values; thus, the 100% will correspond to the lowest class and be eliminated.

As for the second point, the curve being an integral one, the ordinates must be plotted at class limits and not at class center; then, since one cumulates the frequencies from the highest values to the lowest, cumulated frequencies are to be plotted against the lower class limits. Using the class center will entail an error of excess on the central tendency parameters (background and threshold) but not on the dispersion parameter (coefficient of deviation). This error, or difference, varies with the type of classes used and is easily calculated (6% for the 0.05 logarithmic class interval, 12% for the 0.1 log. int. and 26% for the 0.2 log. int.). If the class limit is used, curves constructed from different log. int. classes can be directly compared without correction.

Let us take a concrete example: the distribution of Zn in the quaternary alluvial deposits of Block I (Fig. 2). There are 989 results ranging from 10 to 230 ppm.

$$\text{population: } N = 989 \qquad \text{range: } R = \frac{230}{10} = 23$$

The best class interval is selected as explained above:

$$\text{log. int.} = \frac{\log R}{n} = \frac{1.36}{14} = 0.097$$

A 0.1 log. interval will give 14 intervals, which is acceptable. Usually, the histogram-frequency curve step is skipped and the cumulative frequency diagram directly constructed.

In Figure 2, the points fit fairly well along a straight line, suggesting a lognormal distribution of zinc in the alluvial deposits. Actually, the points never fit the line exactly, but this does not matter provided they stay in a channel delimited by the confidence limits usually taken at the 5% probability level. This confidence interval has been drawn on Figure 2 by using a graph (Fig. 3), which avoids tedious calculation and gives a fairly good precision for the cumulative frequency values between 5% and 95%. The width of the confidence channel is inversely proportional to the importance of the population considered: the bigger the population, the narrower the confidence interval. To check that a distribution fits a lognormal pattern, one should use Pearson's test (Rodionov, 1965; Vistelius, 1960), but this longer operation is generally not warranted in this type of interpretation and, for practical purposes, the graphical control described above is satisfactory.

[Figure 3. Confidence limits (P1, P2) at the 0.05 probability level, as a function of cumulative frequency (in %) and number of samples. Source: A. Liorzou, Initiation pratique à la statistique, Gauthier-Villars, Paris, 1961.]

Comparison with Histograms

For comparison purposes the cumulative frequency curve for Cu in the Motagua drainage (Fig. 2) was also constructed, then, in Figure 4, the corresponding histograms and frequency curves for Cu and Zn. Figures 2 and 4 present the same data in two different ways. Before enumerating and commenting on the advantages of the former presentation over the latter, an interesting feature of the histogram should be mentioned: in the case of colorimetric determinations made in the lower range of sensitivity of the analytical method, the histogram shows clearly the bias introduced in the readings by the human factor and by the accuracy and sensitivity limits of the method. This effect is illustrated for copper in Figure 4, where the classes including a colorimetric standard are shaded and the value of the standard itself is given as a larger figure (1, 2, 4... ppm); the cumulation of the frequency reduces this effect, particularly if it is started from the high values, but it may be necessary to bring some corrections to the low value frequencies in order to construct a precise distribution curve.

[Figure 4. Histogram and frequency curve for Zn and Cu.]

Comparing Figures 2 and 4, one sees immediately that it is easier to compare two straight lines than two overlapping bell-shaped curves; many more populations can be presented on the same diagram by using cumulative frequency curves than by using histograms. Cumulative frequency curves are of easier construction and more precise than ordinary frequency curves; it is simpler to draw a line that fits a set of points than to draw a bell-shaped curve with inflexion points.

Information Given by Cumulative Frequency Curves

The main purpose in constructing the cumulative frequency curve for a given population is to check if it fits a lognormal distribution, and if it does, to estimate graphically its basic parameters: background (b), coefficients of deviation (s, s', s") and threshold level (t).

(b) gives an idea of the average concentration level of the elements in a given surrounding.

(s) expresses the scatter of the values around (b): it corresponds to the spread of the values and their range, from the lowest to the highest.

(t) is a complex notion which might be termed "conditional": statistically it depends on the probability level chosen; geologically, and for practical purposes, it is supposed to be the upper limit of the fluctuations of (b): it depends on (b) and (s). The values equal to or higher than (t) are considered anomalous.

Adjustment to the lognormal law is generally the case when soil samples are considered: in the drainage reconnaissance survey in Guatemala, we found that trace element contents in stream sediments appear also to be lognormally distributed.

Background

A straight line denotes a single population lognormally distributed. In this simple case, the background value (b) is given by the intersection of the line with the 50% ordinate. In the examples given in Figure 2, we have:

background value for copper: b (Cu) = 9.2 ppm
background value for zinc: b (Zn) = 48 ppm

Of course, these values must be rounded off; it would be illusory to imply a precision far out of reach of the analytical methods. In the illustrated example, 10 and 50 ppm are taken as reasonably good approximations of the background levels.

In the case of a perfect frequency distribution curve, the background thus calculated corresponds to the mode (most frequent) and median (50% of the values above, 50% below it) values, and is the geometric mean of the results. This geometric mean is a more significant value than the arithmetic mean. It is also a more stable statistic, less subject to change with the addition of new data and less affected by high values.

Deviation

Before explaining how to determine graphically the deviation coefficient, an essential property of the normal distribution (i.e., fitting the "bell-shaped" curve) must be recalled here:

(b) being the median value and (s) the standard deviation then:

68.26% of the population falls between b - s and b + s
95.44% of the population falls between b - 2s and b + 2s
99.74% of the population falls between b - 3s and b + 3s

This holds true in the case of the lognormal distribution since the logarithms of the values are normally distributed. Then, rounding off the above-mentioned percentages and taking (b) as the background, we can say that 68% of the population falls between b - s and b + s or that 32% is outside these limits. The distribution curve being symmetrical around an axis of abscissa (b) (Fig. 4), 16% of the values will fall above b + s and 16% below b - s.

In Figure 2, the values b + s and b - s will be obtained by projecting the intersection of the distribution line with the ordinates 16 and 84% on the abscissa axis. Working with logarithms, one has to consider the ratios and not the absolute values thus established. Taking the same example of Cu (Fig. 2), one determines the points P (at the 16% ordinate) and A. OA is the geometrical expression of the deviation: it is inversely proportional to the slope of the line. We call it the geometric deviation (s'); it has no dimension: it is a factor obtained by dividing the value read in A by the value read in O:

$$s' = \frac{21}{9.2} = 2.28$$

Then multiplying or dividing the background value by the geometric deviation will give the upper and lower limits of a range including 68% of the population (from b - s to b + s, or A'A on the figure). Multiplying or dividing by the square of the geometric deviation gives a range including about 95% of the values (b - 2s to b + 2s).

Because all the reasoning is made on logarithms, it is also necessary to express the deviation by a logarithm: the coefficient of deviation (s) is the logarithm (base 10) of the geometric deviation (s').

$$s' = 2.28$$

$$s = \log s' = 0.36$$

It will be seen later that it might be interesting to consider a third deviation index: the relative deviation (s"), sometimes called coefficient of variation. It is expressed as a percentage:

$$s'' = 100\,\frac{s}{b}$$

$$s'' = 100 \times \frac{0.36}{9.2} = 3.9\%$$

Threshold

After the background and the coefficient of deviation, the third important parameter is the threshold level (t), which is a function of the two former. It has been seen that in the case of symmetrical distribution (either normal or lognormal) 95% of the individual values fall between b + 2s and b - 2s, that is to say that only 2.5% of the population exceeds the upper limit b + 2s. This upper limit is conventionally taken as the threshold level (t) above which the values are considered as anomalies:

$$\log t = \log b + 2s$$

or to avoid using logarithms:

$$t = b \times s'^{2}$$

$$t = 9.2 \times 5.2 = 47.8\ \text{ppm}$$

Practically, (t) as well as (b), is read directly on the graph as the abscissa of the intersection of the distribution line with the 2.5% ordinate. In this example one reads 47 ppm, and the slight difference is due to the rounding off of the exact
ordinate 2.28% to 2.5%. This shows the importance of the deviation in the estimation of the threshold; two populations may have the same background but, nevertheless, different thresholds if their coefficients of deviation are different. In Figure 2, the threshold is five times the background for Cu and only 2.7 times for Zn.

In all the foregoing, I have considered the simplest case: a single lognormal population, the diagrammatic expression of which is a straight line. However, when constructing cumulative frequency curves, a broken line is frequently obtained suggesting that the set of data considered consists of a complex population or of different ones. Whenever possible in practice, the interpretation is made on sets of data selected so as not to include more than two different distributions; for instance, a lithological unit may include two types of mineralization showing up in soil or sediment samples; one representative of the normal or background content of the material sampled, and the other, a superimposed mineralization related to ore.

Examples

The three main cases of non-homogeneous distribution that are the most likely to occur are, in decreasing frequency order:

a. an excess of high values in the considered population;
b. a mixture of two populations in a given set of data; and
c. an excess of low values in the considered population.

These three cases are represented graphically in Figure 5. They correspond to real distributions encountered in the Guatemalan drainage survey and appear as solid lines with slope breaks on the diagram. Some indications are given below showing how to interpret such lines.

Copper Distribution (in a lithological unit). The cumulative frequency line (Fig. 5) shows a break to a flatter slope at the 30% level. This is the case when there is an excess of high values in the population; the histogram will give a frequency curve skewed to the right, in the direction of the high values (positive skewness). If the population was lognormally distributed, the main branch Ox should extend as a straight line in Oz whereas, in this case, Ox is lifted to Oy which means that instead of having 2.5% of the values 30 ppm or greater, there are 17% of them. The abscissa of the breaking point, O, (in this case 18 ppm) indicates the limit above which there is a departure from the norm (i.e., from the lognormal distribution), an excess of high values. In this case, background and coefficients of deviation are calculated with the main branch Ox. The abscissa of the breaking point may be conveniently
taken as threshold value if the break occurs above the normal threshold level of 2.5%. If, however, the break occurs below the 2.5% level (at point p for instance) the threshold should be taken as usual (abscissa of point P). Positively broken distribution lines are the more interesting because they indicate an excess over the background mineralization.

Molybdenum Distribution (in a lithological unit). The cumulative distribution line shows two breaks: first a positive, then a negative one. Such a graph is the expression of a dual distribution, suggesting the existence of two distinct populations in the set of data considered. It gives a double-peaked histogram. We shall consider here only the most frequent case of a main "background" population mixed with a smaller one of higher average value, the two of them being lognormally distributed. On the diagram (Fig. 5), branch A corresponds to the main or normal population, branch B to the anomalous population and the central branch A + B to a mixture of the two. By splitting the data at a value taken around the middle of A + B (at 4 ppm for instance), it is possible to separate the total population into two elementary ones appearing as a and b on the diagram. The general background will be taken with branch A and the threshold as the abscissa of the middle of branch A + B, though the threshold of population a may also be considered, but we have not enough examples of such complex distributions to make definite recommendations, and we lacked computing facilities to calculate theoretical distributions. The coefficients of deviation must be calculated separately for distributions a and b.

Zinc Distribution (in a drainage unit). The negatively broken line on Figure 5 is the expression of an excess of low values in an essentially lognormal distribution; in this case, the histogram is skewed to the left, toward these low values (negative skewness). Provided their proportion is not too high (20% or less for instance), they do not interfere in the interpretation, which is done on the main branch of the distribution line in the usual way. This excess of low values may be due to the inclusion in the population of a low-background lithological unit or, more often, to poor sampling (for instance, collecting an important set of sediment samples that are too coarse).

When the results do not fit a lognormal distribution, an explanation may generally be found among these three factors: (1) lack of homogeneity in sampling, (2) complex geology (imprecision in the lithological boundaries), and (3) analytical errors. It should also be kept in mind that some elements in some surroundings may not be lognormally distributed.

Advantages of Cumulative Frequency Curves

Plotting the distribution of an element in a selected unit as a cumulative frequency curve on probability graph paper is the easiest and most precise way to present a great amount of data (for instance, presenting Figure 5 as histograms and frequency curves will result in an overloaded and illegible diagram). All the characteristic parameters of the distribution can be estimated without cumbersome calculations. Comparisons between various populations are easy and complex distributions are clearly identified. Furthermore, the adjustment to a lognormal distribution can be checked graphically.

Comparing the geochemical features of the various units of a survey area is important in assessing their mineral potential. This is conveniently done by plotting the corresponding distributions on the same diagram, for instance Cu distribution in three or four different drainages in the case of a stream sediment reconnaissance. Distribution heterogeneities will be spotted and the corresponding units selected for further investigations. On a broader scale, the geochemical behavior of trace elements in a given geological environment from different countries or metallogenic provinces can be readily compared. This is an approach to a better understanding of the distribution laws of trace elements in naturally occurring materials.

The Coefficients of Deviation

A lognormal distribution is completely determined by two parameters: the geometric mean (b) and the coefficient of deviation (s). It has been seen that the absolute deviation can be expressed as a geometric factor s' or, more commonly, as a logarithmic coefficient s. The term "deviation" is preferred to "dispersion" which might be more expressive, because there is no genetic implication in the concept of statistical dispersion whereas there is one in the notion of geochemical dispersion; however, many people use the term "dispersion" in statistical interpretation of geochemical data.

The coefficient of deviation is a dispersion index specific for the distribution of a given element in a given environment and expresses the degree of homogeneity of this distribution. When rocks are considered, a similarity in the coefficient of deviation, together with similar average values, may indicate similar geochemical processes in their formation. It is possible that a given value of s corresponds to each type of mineralization in a lithological unit. Confirming this assumption would require very extensive geological-statistical studies encompassing all metallogenic cases.

There is also a relationship between the background (b) and the coefficient of deviation (s)

which is the expression of the geochemical law which states that the dispersion of an element is inversely proportional to its abundance. This is expressed very clearly by the relative dispersion s" (or relative deviation), a percentage related to b and s as follows:

$$ s'' = 100\,\frac{s}{b} $$

The higher the background, the lower the relative deviation. This is best shown on a log/log correlation diagram by plotting s" as abscissa and b as ordinate. Figure 6, for instance, shows the variation of s" as a function of b in the different lithological units of Blocks I and II, for Cu, Zn, Pb and Mo. The diagram has been constructed by taking, for each element, the extreme values for b and s", thus determining parallelograms including all the individual values. One sees immediately that there is an inverse linear relationship between b and s" (which is evident from the definition of s") and that the average absolute deviation (graphically estimated in Fig. 6) also decreases when the abundance of the element increases.

The weighted mean values of b, s and s" for each element have been calculated separately for Blocks I and II:

              Block I                        Block II
       b (ppm)    s       s" (%)      b (ppm)    s       s" (%)
  Zn    55        0.23     0.42        70        0.17     0.24
  Cu     8        0.30     3.8          8        0.34     4.2
  Pb     6.8      0.32     4.7          5.8      0.30     5.2
  Mo     0.38     0.37    97.5          0.35     0.40   125

The fact that the absolute deviation for Pb is equal to or slightly lower than that for copper is due to two factors: (1) the sensitivity limit of the analytical method for lead, which entailed a number of assumptions and extrapolations in the interpretation (determination of b and s), and (2) the existence of some Pb mineralized zones in the survey area where b was high and s low.
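As a quick numerical check of the definition of the relative deviation, s" can be recomputed from the tabulated b and s. The sketch below is illustrative only: it assumes the relation s" = 100 s/b as read from the text, uses the Block I weighted means from the table, and its function and variable names are hypothetical, not taken from the paper.

```python
# Relative deviation s" (percent) from the background b and the
# coefficient of deviation s, assuming s" = 100 * s / b.

def relative_deviation(b, s):
    """Return s" = 100 * s / b, with b the background and s the coefficient of deviation."""
    if b <= 0:
        raise ValueError("background b must be positive")
    return 100.0 * s / b

# Weighted mean (b, s) pairs for Block I, copied from the table in the text.
BLOCK_I = {
    "Zn": (55.0, 0.23),
    "Cu": (8.0, 0.30),
    "Pb": (6.8, 0.32),
    "Mo": (0.38, 0.37),
}

def relative_deviations(table):
    """Apply relative_deviation to every element of a {name: (b, s)} table."""
    return {element: relative_deviation(b, s) for element, (b, s) in table.items()}

if __name__ == "__main__":
    for element, value in relative_deviations(BLOCK_I).items():
        print(f'{element}: s" = {value:.2f} %')
```

The recomputed values should agree with the published s" column to within rounding; small differences are to be expected because the table lists weighted means. Taking logarithms of the same relation gives log s" = 2 + log s - log b, which is why, for a roughly constant s, the points fall along a line of slope -1 on a log/log diagram.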

Figure 6. Correlation diagram b/s" for Blocks I and II (all lithological units).


Figure 7. Correlation diagram Cu/Zn (Cu in ppm). N1 = n1 + n3 = 168.

In Figure 6, it is also interesting to note the variations of the dispersion of the same element in different lithological units, which is particularly noticeable for copper; the width of each parallelogram indicates the range of variation of s for each element.

The coefficient of deviation is a very important character of the distribution of an element in a given surrounding; it is probably related to the type of geochemical dispersion, mechanical or chemical, and consequently might give an indication of the type of anomaly encountered: syngenetic or epigenetic. It appears that a higher coefficient of deviation indicates a preponderantly mechanical dispersion, but this has not been proved. Much remains to be done in this field.

Correlation Diagrams

In the case of a polymetallic mineralization, with two or more elements lognormally distributed, there is generally a positive correlation between them; for instance between lead and zinc, a sample high in Pb is commonly also high in Zn. This geologic concept of a relationship between two types of mineralization (only qualitative and rather vague) may be replaced by a precise factor, the coefficient of correlation ρ, which gives a rigorous measure of their degree of dependency. In the case of geochemical prospecting, ρ measures the degree of dependency of two lognormal variables, namely the tenors of two elements in a sample population (Matheron, 1962). The coefficient ρ always falls between -1 and +1. ρ = 0 means a complete independence between the two elements; ρ = ±1 indicates a functional relationship, direct or inverse, between them (it is a linear relationship between the logarithms of the tenors).

Simplified Calculation of ρ. There is a graphical way to estimate ρ, slightly less precise but much faster than the complete statistical calculation: constructing a correlation cloud in full log. coordinates (Fig. 7, 8). Each sample of the population under study is plotted following its two coordinates: its tenor in element A and its tenor in element B, and the total population appears as a cloud of points.
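To make the comparison between the graphical estimate and the complete calculation concrete, the sketch below computes both on a synthetic sample. It assumes the quadrant-count formula of this section, ρ = sin[(π/2)(N1 - N2)/(N1 + N2)], with N1 and N2 the numbers of points in the first-plus-third and second-plus-fourth quadrants about the background point, and it takes the "complete" calculation to be the ordinary linear correlation between the logarithms of the tenors. The function names, the synthetic data, the use of the sample geometric mean as background, and the choice to ignore points lying exactly on an axis are all assumptions made for illustration, not part of the paper.

```python
import math
import random

def rho_from_counts(n1, n2):
    """Quadrant-count estimate: sin[(pi/2) * (N1 - N2) / (N1 + N2)]."""
    if n1 + n2 == 0:
        raise ValueError("no points to count")
    return math.sin(0.5 * math.pi * (n1 - n2) / (n1 + n2))

def quadrant_rho(xs, ys, bx, by):
    """Count points in opposite quadrants about (bx, by) and estimate rho."""
    n1 = n2 = 0
    for x, y in zip(xs, ys):
        side = (x - bx) * (y - by)
        if side > 0:
            n1 += 1          # first or third quadrant
        elif side < 0:
            n2 += 1          # second or fourth quadrant
        # points exactly on an axis are ignored (an assumption)
    return rho_from_counts(n1, n2)

def geometric_mean(values):
    """Geometric mean, used here as a stand-in for the background b."""
    return 10 ** (sum(math.log10(v) for v in values) / len(values))

def log_pearson(xs, ys):
    """Linear correlation coefficient between log10(x) and log10(y)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    return sxy / math.sqrt(sxx * syy)

def synthetic_pair(n, rho, seed=0):
    """Hypothetical bivariate lognormal tenors with log-correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(10 ** (0.8 + 0.3 * z1))
        ys.append(10 ** (1.9 + 0.2 * (rho * z1 + math.sqrt(1 - rho * rho) * z2)))
    return xs, ys

if __name__ == "__main__":
    xs, ys = synthetic_pair(2000, 0.8)
    bx, by = geometric_mean(xs), geometric_mean(ys)
    print("quadrant estimate:", round(quadrant_rho(xs, ys, bx, by), 2))
    print("log-Pearson value:", round(log_pearson(xs, ys), 2))
```

With the counts that can be read from Figure 8 (N1 = 83, N2 = 16), `rho_from_counts(83, 16)` gives about 0.87, in line with the value quoted for the Pb/Zn example. The quadrant count needs only a pencil and the plotted cloud, which is the practical appeal of the method; the price is a somewhat larger sampling error than the full calculation.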


Practically, this presentation of the data is very convenient because it gives a geometric image of the distribution laws. The axes passing through the gravity center (bA, bB), that is to say through the point whose coordinates are the background values for the two elements considered, are then drawn. In Figure 7, the axes pass through the point (bCu = 5.3 ppm, bZn = 75 ppm). The points falling in each quadrant are counted and summed as follows:

N1 = number of points in first and third quadrants
N2 = number of points in second and fourth quadrants.

Then ρ is given by the formula:

$$ \rho = \sin\left[\frac{\pi}{2}\,\frac{N_1 - N_2}{N_1 + N_2}\right] $$

Practically, ρ is never equal to ±1 (which would be the case if all the points were on a straight line) and the points form an elliptical cloud. Two cases may happen:

(1) either ρ is equal or near to zero: the elliptical cloud has its axes parallel to the coordinate axes and the two variables are independent,
(2) or ρ is clearly different from zero and the cloud is an ellipse whose axes are inclined relative to the coordinates. The slope of the main axis has the same sign as ρ (if ρ > 0 the two elements vary in the same direction; if ρ < 0 the two elements vary inversely).

The correlation cloud is in fact a two-dimensional histogram; it is the best and simplest way to establish whether a population is homogeneous or heterogeneous: in the first case, the points tend to group in a single elliptical cloud; in the second, they split into two or several attraction centers and form several elliptical clouds more or less overlapping. G. Matheron points out that the relation expressed by ρ is an expression of the Mass Action Law if ρ = ±1 (or of the order of ±0.95) (Matheron, 1962); then it is likely that a geologically based chemical equilibrium exists between the two elements considered. In geochemical prospecting, correlation coefficients

Figure 8. Correlation diagram Pb/Zn (Pb in ppm). Quadrant counts: N1 = n1 + n3 = 83 (n3 = 45); N2 = n2 + n4 = 16 (n2 = 10, n4 = 6); ρ = sin[(π/2)(N1 - N2)/(N1 + N2)].


may be used to assess mineral associations of elements in natural samples. The correlation diagram shows whether two elements are spatially associated and if one may be used as a pathfinder for the other. Let us consider two examples: the relationship of Cu/Zn in the drainage of the Suchiate River (Fig. 7) and the relationship of Pb/Zn in the Rio Grande drainage (Fig. 8).

The first example, in Figure 7, is intended only to illustrate the lack of relationship between two types of mineralization. The cloud of points has no definite shape, but it can be divided into three zones: one around the intersection point of the axes, including the majority of the points, which are spread more or less equally among the four quadrants; an elliptical one, marked Cu, in the range of high-Cu/background-Zn values; and a third one, including only a few high-Zn/background-Cu points. This shows that, in the Suchiate drainage, there is no relationship whatsoever between the Cu and Zn mineralization, that the Cu anomaly is more important than that for Zn, and that the two anomalies are well separated spatially. All this is expressed by the coefficient of correlation:

ρ = -0.11

Its low absolute value indicates a nearly complete independence of the two mineralizations, with a tendency to inverse relationship (negative value).

On the contrary, Figure 8 shows an example of direct relationship between two types of mineralization. In the Rio Grande drainage, Pb and Zn are associated: the correlation cloud is an elongated ellipse whose main axis has a 45° slope and the correlation coefficient is ρ = +0.87. In this drainage, lead and zinc anomalies will have the same pattern and will be spatially related. In similar geological conditions, one element may be used as a pathfinder for the other.

Conclusion

In the Guatemalan geochemical reconnaissance, statistical analysis of the data, although elementary, was useful in outlining subdued anomalous patterns in a complex geochemical surrounding, but much more information can certainly be extracted from the analytical results by a more thorough, computer-oriented treatment.

The graphical methods described above have the great advantage of being quick, cheap and easy to use in the field without any special mathematical knowledge. They offer a convenient and synthetic way to present a great amount of geochemical data, and I think they might be useful to any geologist involved in geochemical prospecting.

UNITED NATIONS MINERAL SURVEY,
GUATEMALA CITY, GUATEMALA,
January 20; March 28, 1969

REFERENCES

Ahrens, L. H., 1957, The lognormal distribution of the elements, a fundamental law of geochemistry: Geochim. et Cosmochim. Acta, v. 11, no. 4.
Coulomb, R., 1959, Contribution à la géochimie de l'uranium dans les granites intrusifs: Rapport C.E.A. 1173, Centre d'Études Nucléaires de Saclay, France.
Cousins, C. A., 1956, The value distribution of economic minerals with special reference to the Witwatersrand Gold Reefs: Geol. Soc. South Africa Trans., v. LIX.
Hubaux, A., 1961, Représentation graphique des distributions d'oligo-éléments: Ann. Soc. Géol. Belgique, T. LXXXIV, Mars 1961.
Tennant, C. B., and White, M. L., 1959, Study of the distribution of some geochemical data: Econ. Geol., v. 54, p. 1281-1290.
Matheron, G., 1962, Traité de géostatistique appliquée, tome 1: Mémoire no. 14 du Bureau de Recherches Géologiques et Minières, Paris.
Miesch, A. T., 1967, Methods of computation for estimating geochemical abundance: U.S. Geological Survey Professional Paper 574-B.
Monjallon, A., 1963, Introduction à la méthode statistique: Vuibert, Paris.
Rodionov, D. A., 1965, Distribution functions of the elements and mineral contents of igneous rocks: Consultants Bureau, New York.
Shaw, D. M., 1964, Interprétation géochimique des éléments en trace dans les roches cristallines: Masson et Cie, Paris.
Vistelius, A. B., 1960, The skew frequency distributions and the fundamental law of the geochemical processes: Journal of Geol., Jan. 1960.
