Measurement Uncertainty in Chemical Analysis
Springer
Prof Dr. Paul De Bievre
Duineneind 9
2460 Kasterlee
Belgium
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting,
reproduction on microfilm or in other ways, and storage in data banks. Duplication of this publication or
parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer-Verlag. Violations
are liable for prosecution under German Copyright Law.
springeronline.com
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
Product liability: The publisher cannot guarantee the accuracy of any information about dosage and
application contained in this book. In every individual case the user must check such information by
consulting the relevant literature.
For over six years, the journal "Accreditation and Quality Assurance" (ACQUAL) has been publishing contributions from the chemical measurement community on various aspects of reliability in chemical measurement, the key mission of ACQUAL.

One of these aspects is uncertainty. Although even its very concept is still controversial, ACQUAL authors are quite proficient in writing about it. Their papers show that uncertainty is interpreted - and used - in many divergent ways.

It seemed a good idea to the publisher, Springer-Verlag, to present some of the most prominent contributions on the topic that have appeared in ACQUAL in the course of the years. The result lies before you.

It should be clear to the reader that this is a collection of papers and not an "integrated" book. We are still far from a homogeneous, internationally accepted common perception of what uncertainty means in chemical measurement. The result is that we use it in different ways. But dramatic changes in the perception and interpretation of "uncertainty" amongst chemists are becoming visible. They are already reflected in the papers selected. Clearly, more time is necessary for end-of-20th-century uncertainty concepts to be implemented and to be accepted by beginning-of-21st-century minds.

The spectrum of what we read and hear in matters of uncertainty in chemical measurement is very broad: it goes from interpreting uncertainty as a mere repeatability of measurement results obtained from replicate measurements of the quantity subject to measurement*, all the way to the full uncertainty of the result of a measurement procedure applied to a quantity intended for measurement.

The latter interpretation has the consequence that the uncertainty of any quantity influencing the result, including the chemical sample preparation prior to the measurement, must be included in the final uncertainty evaluation, thus yielding a combined uncertainty. That, however, almost invariably entails an increase in the size of the uncertainty bar of the measurement result, previously called the "error bar". Most chemists do not yet agree on this. Thus, uncertainty is increased as the result of more work because the whole measurement process must be evaluated for possible uncertainty contributions. All of this makes uncertainty evaluation more elaborate but more realistic, and therefore more responsible. This marks a truly dramatic change.

It would be very helpful if the ongoing revision of the "International Vocabulary of Basic and General Terms in Metrology" (VIM) would define "measurement result" unequivocally in order to promote one meaning of the term in our work as well as in international discussions. A common language is absolutely essential in this matter.

It gives us great pleasure to deliver what we consider a useful compendium from ACQUAL authors and editors to ACQUAL readers and to other colleagues in the art and science of chemical measurement. For those among the readers who consider themselves newcomers in the field, I propose to read the articles by Dube and / or Kadis as an introduction to the current state and to the problems involved.

Prof Dr. P. De Bievre
Editor-in-Chief
Accreditation and Quality Assurance
Kasterlee
2002-09-20
* quantity (German: "Grosse", French: "grandeur", Dutch: "grootheid")
is not used here in the meaning 'amount', but as the generic term for
the quantities we measure: concentration, volume, mass, temperature,
time, etc.
Analytical procedure in terms of measurement (quality) assurance
R. Kadis

prescribed procedure to follow in producing results with a known uncertainty.

If we have indeed recognized chemical analysis to be measurement, though possessing its own peculiarities, we can apply the principles and techniques of quality assurance developed in measurement to analytical work. These principles and techniques constitute the field of measurement assurance [3], a system affording a confidence that all the measurements produced in a measurement process maintained in statistical control are good enough for their intended use. "Good enough" implies here nothing more than having an allowable uncertainty. Although measurement assurance was originally developed for instrument calibration, i.e. with emphasis on measurement traceability, it is reasonable to treat it more generally. One can say that a fixed measurement procedure is a means of assigning an uncertainty to a single measurement, and this is the essence of measurement assurance. This also reveals the role a prescribed (analytical) procedure plays in routine analytical measurement. We will focus on different aspects involved in the concept of an analytical procedure defined in terms of measurement assurance such as terminology, content, evaluation, and validation.

Starting point

Chemical analysis generally consists of several operational stages beginning with taking a sample representative of the whole mass of the material to be analysed and ending with calculation and reporting of results. In this sequence the measurement proper usually makes a relatively small contribution to the overall variability involved in the entire chemical measurement process (CMP) [4], the largest portion of which being concerned with "non-measurement" operations such as isolation, separation, and so on. Because of this, everything in the chain that affects the chemical measurement result must be predetermined as far as practically possible: the experimental operations, the apparatus and equipment, the materials and reagents, the calibration and data handling. Thus, a "complete analytical procedure, which is specified in every detail, by fixed working directions (order of analysis) and which is used for a particular analytical task" [5] - a concept presented by Kaiser and Specker as far back as in the 1950s [6] - becomes a point of critical importance in obtaining meaningful and reproducible results. We use the term "analytical procedure" or merely "procedure" for short, in the sense outlined above.

"Method", "procedure", or "protocol"

The importance of the correct usage of relevant terms, in particular, the term "procedure" rather than "method" is noteworthy. The terms actually correspond to different levels in the hierarchy of analytical methodology [7] expressed as a sequence from the general to the specific:

technique → method → procedure → protocol

Indeed, the procedure level provides the specific directions necessary to utilize a method, which is in line with the definition of measurement procedure given in the International Vocabulary of Basic and General Terms in Metrology (VIM): "set of operations, described specifically, used in the performance of particular measurements according to a given method" [8].

This nomenclature is however not always adhered to. In many cases, i.e. scientific publications, codes of practice, or official directives, an analytical procedure is virtually implied when an analytical method is spoken about. Commonly used expressions such as "validation of analytical methods" or "performance characteristics of analytical methods" are typical examples of incorrect usage. Such confusion appears even in the definition suggested by Wilson in 1970 for the term "analytical method" [9]. As he then put it, "an analytical method is to be regarded as the set of written instructions completely defining the procedure to be adopted by the analyst in order to obtain the required analytical result". It is actually difficult to make a distinction between the two notions when one of them is defined in terms of the other. On the other hand, there is normally no reason to differentiate the two most specific levels in the hierarchy above, carrying the term "procedure" over to the designated "protocol". The latter was defined [7] as "a set of definitive directions that must be followed, without exceptions, if the analytical results are to be accepted for a given purpose". So, written directions have to be faithfully followed in both cases. In many instances the term "procedure" actually signifies a document in which the procedure is recorded - this is specifically noted in VIM in respect to "measurement procedure". Besides, the term "standard operating procedure" (SOP), especially applied to a procedure intended for repetitive use, is popular in quality assurance terminology.

A clear distinction needs to be drawn between analytical procedure as a generalized concept and its particular realization, i.e. an individual version of the procedure arising in specific circumstances. In practice, an analytical procedure exists as a variety of realizations, differing in terms of specimens, equipment, reagents, environmental conditions, and even the analyst's own routine. Not distinguishing between these concepts can lead to a misinterpretation embodied, for instance, in the viewpoint that with a detailed specification the procedure will change "each time the analyst, the laboratory, the reagents or the apparatus changed" [9]. What will actually change is realizations of the procedure, only provided that all specified variables remain within the specification. Also one cannot but note
that the hierarchy of methodology above concerns, in fact, a level of specificity rather than the extent to which the entire CMP may be covered. Although sampling is the first (and even the most critical) step of the process, it is often treated as a separate issue when addressing analytical methodology. A "complete analytical procedure" may or may not include sampling, depending on the particular analytical problem to be solved and the scope of the procedure.

An analytical procedure yields the results of established accuracy

In line with Doerffel's statement which refers to analytical science as "a discipline between chemistry and metrology" [10], one may define analytical service - as a sort of analytical industry, that is practical activities directed to meeting customer needs - as based upon concepts of chemistry, metrology, and industrial quality control. The intention of any analytical methodology in service is to produce data of appropriate quality, i.e. those that are fit for their intended purpose. The question to answer is what kind of criteria should be addressed in characterizing fitness-for-purpose.

From the viewpoint of objective of measurement, which is to estimate the true value of the quantity measured, and its applicability for decision-making, closeness of the result to the true value, no matter how it is expressed, should be such a criterion. If a measurement serves any practical need, it is to meet an adequate level of accuracy. It is compliance with an accuracy requirement that fundamentally defines the suitability of measurement results for a specific use, and hence corresponding demands are to be made on a measurement process that produces the results. Next it is assumed that the process is generated by the application of a measurement procedure and thus, the accuracy requirements should be finally referred to in the procedure itself. (The "requirements sequence" first implies substantiation of the demands on accuracy in a particular application area, the problem that needs special consideration in chemical analysis [11].)

Following this basic pattern, it is reasonable to re-define Kaiser's "complete analytical procedure", so that the fitness-for-purpose criterion is explicitly allowed for. There must be an accuracy constraint built in the definition so as to give a determining aspect of the notion. It is probably unknown to most analytical chemists worldwide that such a definition has long since been adopted in analytical terminology in Russia. This was formulated in 1975 by the Scientific Council on Analytical Chemistry of the Russian Academy of Sciences. As defined by the latter an analytical procedure is the: "a detailed description of all the conditions and operations that ensure established characteristics of trueness and precision" [12]. This wording which goes beyond the scope of analytical chemistry specifically differs from the VIM definition of measurement procedure quoted above by including an accuracy requirement as a goal function. It clearly points out that adhering to such a fixed procedure ensures (at least conceptually) that the results obtained are of a guaranteed degree of accuracy.

Two basic statements underlie the definition above. First, an analytical procedure when performed as prescribed, with the chemical measurement process operating in a state of control, has an inherent accuracy to be evaluated. Second, a measure of the accuracy can be transferred to the results produced, providing a degree of their confidence. In essence, the measure of accuracy typical of a given procedure is being assigned to future results generated by the application of the procedure under specified conditions. The justification for both the propositions was given by Youden in his work on analytical performance [13, 14] where methods for determining accuracy in laboratories were discussed in detail.

As a prerequisite for practical implementation of the analytical procedure concept it is assumed that the chemical measurement process remains in a state of statistical control, being operated within the specifications. To furnish evidence for this and to avoid the reporting of invalid data the analytical system needs to be tested for continuing performance. A number of control tests may be used with this aim, for instance, testing the difference between parallel determinations when they are prescribed to be carried out, duplicating complete analysis of a current test material, and analysis of a reference material. An important point is that the control tests are to be performed in such a manner and in such a proportion as a given measurement requires, and are an integral part of the whole analytical procedure. The control tests may be more specific in this case and relate to critical points of the measurement process. As examples, calibration stability control with even one reference sample or interference control by spiking provide useful means of expeditious control in an analytical procedure.

A principal point in this scheme is that accuracy characteristics should be estimated before an analytical procedure is regularly used and should be the characteristics of any future result obtained by application of the procedure under specified conditions. Measurements of this type are most commonly performed (by technicians, not measurement scientists) in control engineering and are sometimes called "technical measurements". It is such measurements that are usually referred to as routine in chemical analysis. In fact, the problems of evaluation of routine analyses faced by chemists are treated more generally in the "technical measurements" theory [15].
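
The control tests named above (comparing parallel determinations and analysing a reference material) can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the 95 % factor z = 1.96, the coverage factor k = 2 and all numerical values are assumptions chosen for illustration, and s_r stands for a repeatability standard deviation assumed to have been established for the procedure beforehand.

```python
import math


def parallel_determinations_ok(x1, x2, s_r, z=1.96):
    """Accept two parallel determinations if they differ by no more than
    z * sqrt(2) * s_r, the critical difference for two results whose
    repeatability standard deviation is s_r (normality is assumed)."""
    critical_difference = z * math.sqrt(2.0) * s_r
    return abs(x1 - x2) <= critical_difference


def reference_material_ok(measured, certified, u_procedure, u_certified, k=2.0):
    """Accept a reference-material result if it agrees with the certified
    value within k times the combined standard uncertainty of the two."""
    limit = k * math.sqrt(u_procedure ** 2 + u_certified ** 2)
    return abs(measured - certified) <= limit


# Hypothetical numbers, all in the same unit (for example mg/kg).
duplicates_pass = parallel_determinations_ok(10.12, 10.20, s_r=0.05)
duplicates_fail = parallel_determinations_ok(10.12, 10.30, s_r=0.05)
crm_pass = reference_material_ok(5.08, 5.00, u_procedure=0.04, u_certified=0.02)
```

In this sketch a failed check would only signal that the process may have left statistical control; what action follows is a matter for the written procedure itself.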
Uncertainty as an index of accuracy of an analytical procedure

It is generally accepted that accuracy as a qualitative descriptor can be quantified only if described in terms of precision and trueness corresponding to random and systematic errors, respectively. Accordingly, the two measures of accuracy, the estimated standard deviation and the (bounds for) bias, taken separately, have to be generally evaluated and reported [16]. As the traditional theory of measurement errors holds, the two figures cannot be rigorously combined in any way to give an overall index of (in)accuracy. Notice that accuracy, as such, ("closeness of the agreement between the result of a measurement and a true value of the measurand" [8]) by no means involves any measurement error categorization.

On the other hand, it has long been recognized that the confidence to be placed in a measurement result is conveniently expressed by its uncertainty that was thought, from the outset, to mean an estimate of the likely limits to the error of measurement. So, uncertainty has traditionally been treated as "the range of values about the final result within which the true value of the measured quantity is believed to lie" [17]. However, there was no agreement on the best method for assessing uncertainty. Consistent with the traditional subdivision, the "random uncertainty" and the "systematic uncertainty" each arising from corresponding sources should be kept separate in the evaluation of a measurement, and the question of how to combine them was an issue of debate for decades.

Now a unified and widely applicable approach to the uncertainty statement set out in ISO Guide (GUM) [18] is being accepted in many fields of measurement, particularly in analytical measurements due to the helpful adaptation in the EURACHEM Guide [19]. Some peculiarities of the new approach can be intimated, specifically, the abandonment of the previous distinction between random and systematic uncertainties, treating all of them as standard-deviation-like quantities (after the corrections for known systematic effects have been made), and their possible estimation by other than statistical means. Fundamental, however, is that any numerical measurement is not thought of in isolation, but in relation to the process which generates the measurements. All the factors operative in the process being defined, they virtually determine the relevant uncertainty sources, so making practicable their quantification to finally derive the value of total uncertainty. One can say that the measurement uncertainty methodology fits neatly the starting idea of a procedure specified in every detail, since the procedure itself defines the context which the uncertainty statement refers to.

This is true of the component-by-component ("bottom-up") method for evaluating uncertainty that is directly in line with GUM. Also this is true for the "top-down" approach [20] that provides a valuable alternative when poorly understood steps are involved in the CMP and a full mathematical model is lacking. An important point is that the top-down methodology implies a reconciliation of information available with the required one that is based on a detailed analysis of the factors which affect the result. For both approaches to work advantageously a clear specification of the analytical procedure is evidently a necessary condition.

The break with the traditional subdivision of measurement errors has a crucial impact on the way accuracy may be quantified and expressed. In 1961, Youden wrote [14]: "There is no solution to the problem of devising a single number to represent the accuracy of a procedure". He was indeed right in the sense that a strict probability statement cannot be made about a combination of random and systematic errors. Today, thanks to the present uncertainty concept, we maintain the other opinion that such a solution does exist. It is measurement uncertainty that can be regarded as a single-number index of accuracy inherent in the procedure. In doing so we must not be confused by the fact that the operational definition of measurement uncertainty that GUM presents does not use the unknown "true value" of the measured quantity following pragmatic philosophy. The old definitions and, in particular, that cited above are equally valid and are now considered ideal. Consequently, we can define an analytical procedure as leading to results with a known uncertainty, as in Fig. 1 in which typical "constituents" to be specified in an analytical procedure are shown.

Specific inaccuracy sources in an analytical procedure

What has been said in the previous section generally refers to specified measurement procedures used in many fields of measurement. There are, however, some special reasons, specific to chemical analysis, that make the uncertainty methodology particularly appealing in analytical measurements. This is because of specific inaccuracy sources in an analytical procedure which are difficult to be allowed for otherwise. Two such sources, sampling and matrix effects, will be mentioned here, with an outline of the methods for their evaluation.

Sampling

Where sampling forms part of the analytical procedure, all operations in producing the laboratory sample such as sampling proper, sample pre-treatment, carriage, and sub-sampling require examination in order to be taken into
account as possible sources contributing to the total uncertainty.

It is generally accepted that a reliable estimate of this uncertainty can be obtained empirically rather than theoretically. Accordingly, an appropriate methodology has been developed [e.g. 21, 22] aimed at separating the sampling contribution from the total variability of the measurement results in a specially designed experiment. This is not, however, the only way of quantifying uncertainty in sampling. Explicit use of scientific judgement is now equally approved when experimental data are unavailable. An illustrative example from the EURACHEM Guide (Ref. 19, Example A4) clearly demonstrates the potential of mathematical modelling of inhomogeneity as an alternative to the sampling assessment experiment.

It is significant that with the uncertainty methodology both the major analytical properties, "accuracy" and "representativeness" [23], which quality of analytical data relies on, can be quantified and properly taken into account to give a single index of accuracy. This index expresses consistency between the measurement results and the true value that refers to a bulk sample of the material rather than the test portion analysed.

Matrix effects

The problem of matrix mismatch is always attendant when one analyses an unknown sample "with the same matrix" using a fixed, previously determined, calibration function. Not uncommonly, an analytical procedure is developed to cover a range of sample matrices in such a way that an "overall" calibration function can be used. An error due to matrix mismatch is therefore inevitable if not necessarily significant. Commonly regarded as systematic for a sample with a particular matrix, the error becomes random when a population of samples to which the procedure applies is considered; this in fact constitutes an inherent part of the total variability associated with the analytical procedure.

Meanwhile, these effects are in no way included in the usual measures of accuracy as they result from a "method-performance study" in accordance with the accepted protocols [24, 25]. The accuracy experiment defined by ISO 5725 (Ref. 24, Part 1, Section 4) does not presuppose any variable matrix-dependent contribution, being confined to identical test items. The underlying statistical model assumes that solely laboratory components of bias and their distribution must be considered.

It is notable that such kinds of error sources are fairly treated using the concept of measurement uncertainty which makes no difference between "random" and "systematic". When simulated samples with known analyte content can be prepared, the effect of the matrix is a matter of direct investigation in respect of its chemical composition as well as physical properties that influence the result and may be at different levels for analytical samples and a calibration standard. It has long since been suggested in examination of matrix effects [26, 27] that the influence of matrix factors be varied (at least) at two levels corresponding to their upper and lower limits in accordance with an appropriate experimental design. The results from such an experiment enable the main effects of the factors and also interaction effects to be estimated as coefficients in a polynomial regression model, with the variance of matrix-induced error found by statistical analysis. This variance is simply the (squared) standard uncertainty we seek for the matrix effects. In many ways, this approach is similar to ruggedness testing aimed at the identification of operational (not matrix-related) conditions that are critical to the analytical performance.

"Method validation" in terms of measurement assurance

The presented concept of analytical procedure offers a clear perspective on the problem of "method validation" which is an issue of great concern in quality matters. Validation is generally taken to mean a process of demonstration that a methodology is suitable for its intended application. The question is how should suitability be assessed, based on customer needs? It is commonly recommended [e.g. 2, 28-30] that a number of characteristics such as selectivity/specificity, limits of detection and quantitation, precision and bias, linearity and working ranges be considered as criteria for analytical performance and evaluated in the course of a validation study. In principle, they need to be compared to some standard; based on this, judgement is made as to whether the procedure under issue is capable of meeting the specified analytical requirements, that is to say, whether a "method is fit-for-purpose" [28]. However, from the perspective of end-users of analytical results, it is important that the data be only of the required quality and thus appropriate for their intended purpose. In other words, the matter of primary concern is quality of analytical results as an end-product. In this respect, a procedure will be deemed suitable when the data produced are fit-for-purpose.

It follows that the common criteria of validation should be made more specific in terms of measurement assurance. It is (the index of) accuracy that requires overriding consideration among the characteristics of analytical performance if quality of the results is primarily kept in mind. Other performance characteristics are desirable to ensure that a methodology is well-established and fully understood, but validation of an analytical procedure on those criteria seems impractical also in view of the lack of corresponding requirements as is commonly the case.
Fig. 1 Typical "constituents" to be specified within an analytical procedure, which ensures obtaining
results with a known uncertainty
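
The two-level experiment described in the Matrix effects section can be sketched in Python as follows. This is only an illustration, not the method of Refs. 26 and 27: the two matrix factors, the error values and the function names are hypothetical, and the step from regression coefficients to a variance rests on a labelled assumption, namely that in the population of samples each coded factor varies independently and uniformly between its lower (-1) and upper (+1) limit.

```python
import math
from itertools import product


def factorial_coefficients(errors):
    """Estimate b0, b1, b2 and b12 of the model
    e = b0 + b1*x1 + b2*x2 + b12*x1*x2
    from a full 2x2 factorial design with coded levels -1 and +1."""
    runs = list(product((-1, 1), repeat=2))
    n = len(runs)
    b0 = sum(errors[r] for r in runs) / n
    b1 = sum(r[0] * errors[r] for r in runs) / n
    b2 = sum(r[1] * errors[r] for r in runs) / n
    b12 = sum(r[0] * r[1] * errors[r] for r in runs) / n
    return b0, b1, b2, b12


def matrix_effect_variance(b1, b2, b12):
    """Variance of the matrix-induced error, ASSUMING each coded factor is
    independent and uniformly distributed over [-1, +1] in the sample
    population, so that Var(x) = 1/3 and Var(x1*x2) = 1/9."""
    return (b1 ** 2 + b2 ** 2) / 3.0 + b12 ** 2 / 9.0


# Hypothetical relative errors (%) found for simulated samples of known content,
# keyed by the coded levels (x1, x2) of two matrix factors.
errors = {(-1, -1): -0.8, (1, -1): 0.4, (-1, 1): -0.2, (1, 1): 1.4}
b0, b1, b2, b12 = factorial_coefficients(errors)
u_matrix = math.sqrt(matrix_effect_variance(b1, b2, b12))
```

With these made-up numbers the main effects are about 0.7 % and 0.4 %, the interaction about 0.1 %, and u_matrix about 0.47 %. The mean term b0 is left out of the variance on purpose, since a constant offset would be treated as a bias rather than as a matrix-dependent contribution.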
(Strictly speaking, there is no validation unless a particular requirement has been set.)

We have every reason to consider the estimation of measurement uncertainty in an analytical procedure followed by the judgement of compliance with a target uncertainty value as a kind of validation. This is in full agreement with ISO 17025 that points to several ways of validation, among them "systematic assessment of the factors influencing the result" and "assessment of the uncertainty of the results ..." [31]. In line with this is also a statistical modelling approach to the validation process that has recently been developed and exemplified as applied to in-house [32] and interlaboratory [33] validation studies.

A concrete example of such validation is worthy of notice. Certification (attestation) of analytical procedures used in regulated fields such as environmental control and safety is operative in the Russian state measurement assurance system as a process of establishing metrological properties and confirming their compliance with relevant requirements. (By metrological properties we mean herein the assigned measurement error characteristics, i.e. measurement uncertainty.) This is introduced by the Russian Federation state standard GOST R 8.563 [34] which also covers procedures for quantitative chemical analysis. This certification is, in fact, a legal metrology measure similar, to some extent, to pattern evaluation and approval of measuring instruments. Some scepticism concerning the efficiency of legal metrology practice in ensuring the quality of analytical measurements may be in order. Nevertheless, the conceptual (measurement assurance) basis of this approach to validation deserves attention beyond doubt.

Conclusions

This debate allows the following propositions to be made:

1. The term "analytical procedure" commonly used without reference to the quality of data is best defined in terms of measurement (quality) assurance to explicitly include quality matters. This means a specified procedure which ensures results with an established accuracy.
2. The measurement uncertainty methodology neatly fits the idea of a specified measurement procedure and furthermore provides a tool for covering specific inaccuracy sources peculiar to analytical measurement. Uncertainty can be regarded as a single-number index of accuracy of an analytical procedure.
3. When an analytical procedure is so defined, uncertainty becomes the performance parameter that needs overriding consideration over and above all the others assessed during validation studies. This kind of validation gives a direct answer to the question whether the data produced are of required quality and thus appropriate for their intended use.
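
Validation understood as "estimate the uncertainty, then judge compliance with a target value" can be sketched in a few lines of Python. The sketch is illustrative only: the component names and values, the target, the coverage factor k = 2 and the function names are assumptions, and the components are taken to be uncorrelated standard uncertainties already expressed in the same (relative) units, so that they combine in quadrature as in the bottom-up approach.

```python
import math


def combined_standard_uncertainty(components):
    """Root-sum-of-squares combination of uncorrelated standard
    uncertainties that are all expressed in the same units."""
    return math.sqrt(sum(u ** 2 for u in components.values()))


def meets_target(components, target_expanded, k=2.0):
    """Return the expanded uncertainty U = k * u_c and whether it complies
    with the target expanded uncertainty set for the intended use."""
    expanded = k * combined_standard_uncertainty(components)
    return expanded, expanded <= target_expanded


# Hypothetical relative standard uncertainties (%) for one procedure.
components = {
    "sampling": 1.2,
    "matrix effects": 0.5,
    "calibration": 0.4,
    "repeatability": 0.3,
}
expanded, fit_for_purpose = meets_target(components, target_expanded=3.0)
```

Here the expanded uncertainty comes out at about 2.8 %, below the assumed target of 3.0 %, so this hypothetical procedure would be judged fit for purpose; with a stricter target the same budget would fail, which shows that the verdict depends on the requirement as much as on the procedure.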
References

1. Holcombe D (1999) Accred Qual Assur 4: 525-530
2. International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use (1994) Text on validation of analytical procedures. ICH Quality topic Q2A: Definitions and terminology (http://www.ifpma.org/ich5q_html)
3. Cameron JM (1976) J Qual Technol 8: 53-55
4. Currie LA (1978) Sources of error and the approach to accuracy in analytical chemistry. In: Kolthoff IM, Elving PE (eds) Treatise on analytical chemistry. Part I. Theory and practice, vol. 1, 2nd edn. Wiley, New York, pp 95-242
5. Kaiser H (1978) Spectrochim Acta 33B: 551-576
6. Kaiser H, Specker H (1956) Fresenius Z Anal Chem 149: 46-66
7. Taylor JK (1983) Anal Chem 55: 600A-604A, 608A
8. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology, 2nd edn. International Organization for Standardization (ISO), Geneva
9. Wilson AL (1970) Talanta 17: 21-29
10. Doerffel K (1998) Fresenius J Anal Chem 361: 393-394
11. Shaevich AB (1989) Fresenius Z Anal Chem 335: 9-14
12. Terms, definitions, and symbols for metrological characteristics in analysis of substance (1975) Zh Anal Chim 30: 2058-2063 (in Russian)
13. Youden WJ (1960) Anal Chem 32 (13): 23A-37A
14. Youden WJ (1961) Mat Res Stand 1: 268-271
15. Zemelman MA (1991) Metrological foundations of technical measurements. Izdatelstvo standartov, Moscow (in Russian)
16. Eisenhart C (1963) J Res Nat Bur Stand 67C: 161-187
17. Campion PJ, Burns JE, Williams A (1973) A code of practice for the detailed statement of accuracy. National Physical Laboratory, Her Majesty's Stationery Office, London
18. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
19. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn (http://www.eurachem.bam.de/guides/quam2.pdf)
20. Ellison SLR, Barwick VJ (1998) Analyst 123: 1387-1392
21. Ramsey MH (1998) J Anal Atom Spectr 13: 97-104
22. van der Veen AMH, Alink A (1998) Accred Qual Assur 3: 20-26
23. Valcarcel M, Rios A (1993) Anal Chem 65: 781A-787A
24. ISO 5725 (1994) Accuracy (trueness and precision) of measurement methods and results. Parts 1-6. International Organization for Standardization, Geneva
25. IUPAC (1995) Protocol for the design, conduct and interpretation of method performance studies. Pure Appl Chem 67: 331-343
26. Makulov NA (1976) Zavod Lab 42: 1457-1464 (in Russian)
27. Parczewski A, Rokosz A (1978) Chem Analityczna 23: 225-230
28. EURACHEM (1998) The fitness for purpose of analytical methods. A laboratory guide to method validation and related topics. LGC, Teddington
29. Wegsheider W (1996) Validation of analytical methods. In: Günzler H (ed.) Accreditation and quality assurance in analytical chemistry. Springer, Berlin, etc., pp 135-158
30. Bruce P, Minkkinen P, Riekkola M-L (1998) Mikrochim Acta 128: 93-106
31. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration laboratories. International Organization for Standardization, Geneva
32. Jülicher B, Gowik P, Uhlig S (1999) Analyst 124: 537-545
33. van der Voet H, van Rhijn JA, van de Wiel HJ (1999) Anal Chim Acta 391: 159-171
34. GOST R 8.563-96 State system for ensuring the uniformity of measurements. Procedures of measurements. Gosstandart of Russia, Moscow (in Russian)
Accred Qual Assur (2001) 6:3-7
© Springer-Verlag 2001
[Fig. 1 Import of natural gas, Germany. Axes: year (1965 to 1995) against energy in TJ (0 to 3e+6). Ref.: Bundesministerium für Wirtschaft und Technologie (BMWi), http://www.bmwi.de]

[...] nearly 3 million terajoule (TJ) [4]. The data shown in Fig. 1 are given in TJ because the natural gas price is calculated from the energy consumed, which is the product of calorific value and volume. To determine the energy, the calorific value H of the natural gas must be known. The well-known calorimetric method for determining calorific values is increasingly being replaced by a new method, whose main feature is the determination of the mole fractions of the gas components by gas chromatography. The mole fractions $x_j$ are multiplied by the molar calorific values $H_j^{\circ}(t_1)$ of the gas components. These products are summed and multiplied by $p_2/(R\,T_2)$ according to Eq. (1).

$$H^{\circ}[t_1, V(t_2, p_2)] = \sum_{j=1}^{N} x_j \, H_j^{\circ}(t_1) \cdot \frac{p_2}{R\,T_2} \qquad (1)$$

[...] year, this error of 0.1 % leads to a price difference of DM 20 million.

Uncertainty and traceability of measurement results

It follows from these examples that the reliability of measurement results is of great public interest. Measurement results are reliable only if their uncertainty is known and quantified. Uncertainty is a metrological term which is defined as follows: Uncertainty: parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand [5].

The uncertainty can be stated only if the traceability of the measurement result to a system of units is guaranteed. Traceability is defined as follows [5]: Traceability: property of a result of a measurement or the value of a standard whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons, all having stated uncertainties.

Such a traceability system is demonstrated in Fig. 2. The International System of Units (SI) is at the top of the system. Its units are realized by standards. A measurement is a process in the course of which the measurand is compared to a standard. For practical measurements, a working standard rather than a primary standard is usually used. To state the uncertainty of the measurement result, the uncertainty of the value assigned to the working standard must be known. It results from the uncertainty of the comparison measurement of the working standard with the reference standard. The uncertainty of the value assigned to the reference standard in turn results from the uncertainty of the comparison measurement of the reference standard with the primary standard. This chain of comparison measurements is exactly what the definition of the term "traceability" means. If the traceability of a measurement result is [...]

[Fig. 2 Labels: Responsibility; SI; Primary Method; CGPM; CIPM, NMI]
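As an illustration of Eq. (1), the following minimal Python sketch computes a volume-based calorific value from mole fractions. It assumes ideal-gas behaviour (no compression factor), a hypothetical gas composition, and rounded literature values for the molar calorific values at 25 °C; none of these numbers come from the article. The last lines only restate the arithmetic of the price example: if a relative error of 0.1 % corresponds to DM 20 million, the implied annual value is DM 20 million divided by 0.001.

```python
R = 8.314462618  # molar gas constant in J/(mol K)


def volumetric_calorific_value(x, h_molar, p2=101325.0, t2=273.15):
    """Eq. (1): sum of x_j * H_j(t1), multiplied by p2 / (R * T2).

    x        mole fractions of the gas components (must sum to 1)
    h_molar  molar calorific values in kJ/mol at the combustion temperature t1
    p2, t2   metering pressure (Pa) and temperature (K); defaults are assumptions
    Returns kJ per cubic metre of (ideal) gas at p2 and T2.
    """
    if abs(sum(x.values()) - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    h_mix = sum(x[name] * h_molar[name] for name in x)  # kJ/mol
    return h_mix * p2 / (R * t2)


# Hypothetical composition and rounded molar calorific values (kJ/mol, 25 C).
x = {"methane": 0.90, "ethane": 0.06, "propane": 0.02, "nitrogen": 0.02}
h = {"methane": 890.6, "ethane": 1560.7, "propane": 2219.2, "nitrogen": 0.0}

H = volumetric_calorific_value(x, h)
print(f"H = {H / 1000:.1f} MJ/m^3")

# Energy is the product of calorific value and volume, so a relative error
# in H carries over unchanged to the energy and hence to the price.
relative_error = 0.001          # 0.1 %
price_difference_dm = 20e6      # DM 20 million, as stated in the text
implied_annual_value_dm = price_difference_dm / relative_error
print(f"implied annual value: DM {implied_annual_value_dm:.3g}")
```

Because the calorific value enters the energy as a simple factor, any improvement in the uncertainty of the mole fractions translates directly into a smaller uncertainty of the invoiced energy.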
[...] tools, necessary to get reliable measurement results.

In analytical chemistry, traceability of measurement results to SI units is not always possible and the traceability hierarchy ends below the level of the SI units. For example, in the case of standard measuring devices, or reference materials, the values are fixed by mutual [...]

[Fig. 3 Traceability in clinical chemistry. Labels: SI; CGPM; Primary Method; CIPM; NMI; 5x10E-3; Secondary Method; Accredited Reference Labs; Producers; Calibration Labs]

Tasks of the National Metrology Institutes (NMIs)

[...] provided by PTB and is carried out by the German Calibration Service (DKD).

The International Committee for Weights and Measures Mutual Recognition Arrangement (CIPM MRA)

The NMIs are obliged by national law to realize, maintain and disseminate the national standards. However, they also take care of the uniformity of measurement worldwide. The first activity for this was the signing of the Meter Convention in 1875. On the basis of this treaty, the General Conference of Weights and Measures (CGPM) and CIPM work today. However, the NMIs are confronted today with new challenges [11]. The calibration certificates issued by them are generally valid only in the country of issue and are not accepted worldwide. This turned out to be a barrier to international trade. So, different activities have been launched by various institutions to overcome these obstacles. The contribution of the NMIs to these efforts is the "Mutual Recognition Arrangement" (CIPM MRA), which was signed by the presidents of 38 NMIs in October 1999 during the twenty-first session of the CGPM [12]. Its objectives are:
- To establish the degree of equivalence of national measurement standards maintained by NMIs
- To provide for the mutual recognition of calibration and measurement certificates issued by the NMIs
- Thereby to provide governments and other parties with a safe technical foundation for wider agreements related to international trade, commerce and regulatory affairs.

The technical basis of the CIPM MRA is a system of key intercomparisons. Furthermore, the NMIs have to prove that they work in accordance with a quality system. The key comparisons are international comparison measurements. The Consultative Committee for Amount of Substance (CCQM) of CIPM is responsible for the comparisons in the field of chemistry. It selects the substance systems and organizes the realization of the measurements and the evaluation of the measurement results. The substance systems are chosen from areas of public interest in which traceability is necessary. Priority areas are:
- Health
- Food
- Environment
- Advanced materials
- Commodities
- Forensic matters
- Pharmaceuticals
- Biotechnology

In the field of amount-of-substance measurements, 70 comparisons have been planned. Some of them have already been started. Up to now, the measurement programme for clinical chemistry has carried out the comparisons given in Table 1 [13].

Table 1 CCQM international comparisons, clinical diagnostic markers. NIST: National Institute of Standards and Technology, USA; LGC: Laboratory of the Government Chemist, UK; IRMM: Institute for Reference Materials and Measurements, Belgium; SP: Sveriges Provnings- och Forskningsinstitut, Sweden

Measurand                    Reference No.     Pilot lab   Date
Cholesterol in serum         CCQM-P6           NIST        1998
                             CCQM-K6           NIST        1999
Glucose in serum             CCQM-P8           NIST        1999
Creatinine in serum          CCQM-P9           NIST        1999
Creatinine in serum          CCQM-K12          NIST        200?
Ca in serum                  CCQM-P14          IRMM/SP     20??
Anabolic steroids in urine   In preparation    -
Hormones in serum            In preparation    -

The results of the key comparisons, including the uncertainty statement, will be stored in an Internet-accessible database. This will enable companies, accrediting bodies, and institutions to evaluate the equivalence of the measurements performed by the NMIs. The database will make it easier for businesses and organizations relying on these services to prove compliance with the measurement-related requirements of regulations and standards. The database will be an integral part of the infrastructure necessary to expand free trade and to eliminate technical barriers to export.

Acknowledgements Stimulating discussions with Mrs. P. Spitzer and Dr. P. Ulbig are thankfully acknowledged.
References
1. Doerffel K (1987) Preface to: Statistik in der analytischen Chemie, 4th edn. VCH, Weinheim, Germany
2. Bundesministerium für Bildung und Forschung, Bundesministerium für Gesundheit, Statistisches Bundesamt (2000) Die Gesundheitsberichterstattung des Bundes; http://www.gbe-bund.de
3. Semerjian HG (1998) Metrology: Impact on national economy and international trade. In: Seiler E (ed) The role of metrology in economic and social development. PTB-Texte, Band 9, Braunschweig, pp 99-133
12 G. Dube
4. Bundesministerium für Wirtschaft und Technologie (1999) Entwicklung der Einfuhr Naturgas in die Bundesrepublik; http://www.bmwi.de
5. Deutsches Institut für Normung (1994) Internationales Wörterbuch der Metrologie, 2nd edn. Beuth, Berlin Wien Zürich
6. Quinn T (1997) Metrologia 34:61-65
7. Richter W (1997) Accred Qual Assur 2:354-359
8. Spitzer P, Eberhardt E, Schmidt I, Sudmeier U (1996) Fresenius J Anal Chem 356:178-181
9. Spitzer P (1997) Metrologia 34:375-376
10. Bundesärztekammer (1988) Dt Ärzteblatt 85:699-706; (1994) 91:211-212
11. Richter W (1999) Fresenius J Anal Chem 365:569-573
12. Comité international des poids et mesures (CIPM) (1999) Mutual recognition of national measurement standards and of calibration and measurement certificates issued by national metrology institutes. Bureau international des poids et mesures (BIPM), Sèvres, France
13. BIPM (1999) Comité consultatif pour la quantité de matière (CCQM). Report of the 5th Meeting (February 1999). Bureau international des poids et mesures (BIPM), Sèvres, France
DOI 10.1007/s00769-001-0438-7
Chemical metrology, chemistry and the uncertainty of chemical measurements

In the last ten years much effort has been applied to introduce these same concepts of physical measurement into chemical measurement. For example:
- The Bureau International des Poids et Mesures (BIPM) has put in place a consultative committee, the Consultative Committee for Amount of Substance (CCQM) [3], to strengthen the relationship of chemical measurements to their SI unit, the mole.
- EURACHEM and CITAC [4] have developed a guide for quantifying uncertainty in chemical analysis, based on metrological principles and the GUM [2] approach to quantifying uncertainty of measurement.
- ISO/IEC 17025:1999 [5] is replacing ISO Guide 25 [6] as the standard against which laboratories are accredited and supports these moves by placing increased emphasis on this metrological approach.

Incorporating traceability to the mole and uncertainty budgets into chemical analysis is more complex than their application to physical measurement. Normally a chemical measurement depends on a combination of physical measurements, chemical separation of the compounds of interest and the selection of the test portion from the bulk material. An understanding of the chemistry involved in these separation processes is vital before reliable results can be achieved, and chemical analysts have tended to concentrate on this area of analysis. It is however a part of the measurement that is tending to be ignored in moves to align chemical measurement with the traditional physical metrological process. The sampling process, both in the laboratory and outside in the field, also contributes to the uncertainty of the measurement but has tended to be ignored by analysts. Understanding of the uncertainty of chemical measurements will not be achieved without an understanding of the whole process.

Discussion

Chemical measurement has a fundamental difference from physical measurement in that it does not take place under controlled and defined conditions. Almost always, the primary objective of a chemical measurement is to determine the amount of components of interest, not the total composition of the sample. Total composition will almost always remain unknown and therefore the total environment under which the measurement is taking place cannot be defined or controlled. Unknowns will always increase the uncertainty associated with any measurement.

Three components can be considered as contributing to uncertainty in chemical measurement. These are sources of uncertainty associated with the sampling process, the underlying chemistry of the chosen method, including its selectivity, and the more readily quantifiable aspects of uncertainty associated with the repeatability of the measurement. The cause and effect diagram in Fig. 1 represents this situation. Each of these components is important. Get one wrong and the result is unlikely to be "fit for purpose".

[Fig. 1 Cause and effect diagram showing sources of uncertainty associated with chemical measurements. Labels: Confirmation; Correct chemistry; Speciation]

For many years, analytical chemists have used reference methods as a means of limiting the number of unknowns by removing those associated with traceability of the measurement to the defined chemical entity. Although reference methods remove uncertainty associated with traceability of the result to the named chemical entity and thereby eliminate most chemical unknowns as an issue, they always have the disadvantage that they redefine the analyte in terms of a method rather than as a chemical species. Amongst the best reference methods are those published by the Association of Official Analytical Chemists International (AOAC) [7]. These methods will have been validated within a number of laboratories in a collaborative study and will have associated estimations of uncertainty based on repeatability and reproducibility results from the study.

In reference methods, uncertainty in the result will directly relate to the measured repeatability. Defining the analyte as the method result eliminates any uncertainty related to the underlying chemistry. It may also define the procedure for taking the test portion in the laboratory and thereby include some of the uncertainty associated with sampling.

Modern methods of analytical chemistry are less conducive than traditional methods to the reference method approach. Instrument and equipment combinations are much more variable between laboratories and change over time as manufacturers add technical improvements. Reference methods also cause problems between countries unless they have international acceptance, and they limit the adoption of new analytical methodology and equipment.

As a consequence of the problems associated with reference methods, there is now more emphasis on an absolute measure of analytes of interest where these are distinct chemical entities. For instance, the Codex Committee on Methods of Analysis and Sampling is presently debating whether analytical requirements for discrete chemical components in foods can be defined by method performance criteria or whether a prescribed method is also required in dispute situations [8]. Reference methods must still remain for those analytes not readily definable as a distinct chemical entity [8].

Moves away from reference methods towards performance criteria will make problems with the underlying chemistry of the selected method or in taking the test portion more significant. A problem in either of these areas may have the consequence of making the result meaningless. However, much of the recent work on the analysis of uncertainty in chemical measurement has neglected these issues and instead has tended to concentrate on alternatives, based on GUM [2], to repeatability estimated from collaborative trials, examples being the work of EURACHEM [4] and the survey of King [9].

[Fig. 2 Cause and effect diagram showing the issues around understanding the chemistry. Labels: Measurement process; Sampling; Uncertainty in the result; Chemistry involved]

Uncertainty arising from the repeatability of chemical measurement is a characteristic of the method. Its calculation has similarities to the calculation of uncertainty in physical measurement and many components, such as uncertainty in mass and volume, are identical to those involved in physical measurement. Others, such as purity of reference materials and recovery, are more specific to chemistry but, once determined, can still be incorporated into the uncertainty budget using the standard techniques developed for physical metrology and described in the GUM [2]. Once determined, this estimation of uncertainty can be applied to future tests using the same method in the same laboratory with the same equipment, reagents and staff.

Uncertainty due to the underlying chemistry and sampling is much more difficult to estimate. Both may vary with changes of sample. Realistic estimations of uncertainty as to the specificity of the method when testing samples from one source may not be applicable to samples from other sources. However, laboratories must be able to incorporate uncertainty arising from any lack of specificity into their estimation of total uncertainty if they are to be able to judge whether results will be "fit for purpose" and give reliable information on uncertainty as required by ISO/IEC 17025 [5].

This same issue, that uncertainty in analytical chemistry involves more than the uncertainty in the reported result, has been highlighted previously in a small number of papers including those of Wells and Smith [10] and Alexandrov [11].

Uncertainty in the underlying chemistry

The underlying chemistry involved in the test method is obviously important. A simple cause and effect diagram for the underlying chemistry is shown in Fig. 2.

Almost without exception, modern methods of trace analysis require separation of the analyte of interest from the sample matrix, then estimation of the amount of analyte present using some unique characteristic, or combination of characteristics, of the molecules involved. Examples are systems involving chromatography, with or without mass spectrometric detection, atomic absorption spectrometry and emission spectrometry. These methods do not work unless the measured characteristic, or combination of characteristics, is unique to the compound of interest or the impact of known interferences can be removed by calculation. The analyst should have used enough properties of the compound to make it unlikely that responses from interfering substances could be incorporated into the reported result.

Interference has been a concern of analysts using chromatography for a long time. All positive results from analyses based on chromatographic methods must have an appropriate level of confirmation if they are to be specific to the analyte of interest and results are to be reliable. Methods of confirmation have included:
- Independent analysis of the sample by a different method.
- Re-analysis of the sample on a column of different polarity known to separate compounds in a significantly different order.
- Re-analysis of the sample using a different wavelength on the detector. The second wavelength must be chosen so that it will give a good indication of the shape of the absorption curve.
- Re-analysis of the sample using a detector that operates on a different principle. An example could be an FID and a nitrogen-specific detector, but the different sensitivities of detectors will limit this option.
- A statement from the client of the expected level of the analyte in the sample. Normally, it could be considered that the test has an appropriate degree of specificity if the result is similar to this expected value.
- Use of a detector such as a mass spectrometer that gives additional information on the nature of the compound detected.
- Recording of the spectrum of the detected compound using, for instance, a diode array detector.
16 J.L.Love
- Use of a spike of the compound of interest at a level that gives a similar response to the measured compound. A spike gives good evidence that a compound differs from the compound of interest when retention times are close but it does not provide good confirmation if retention times coincide. Spikes also reveal modifications to responses that can arise with new matrixes.
- Experience from a number of known similar samples. This is unlikely for ad hoc samples from a number of sources but may be relevant for a project or a process control laboratory.

Many of these approaches to confirmation involve an additional check procedure following that used to generate the analytical result. A change in the method of confirmation may have a major impact on the specificity of the result and its potential "fitness for purpose" without having any impact on the metrological estimation of uncertainty. In fact, a laboratory could ignore confirmation to get a commercial advantage and still be able to demonstrate the same uncertainty using the current metrological approach. Confirmation will normally have no effect on the components of uncertainty included in the metrological approach, but lack of confirmation is likely to make the specificity of the result so uncertain that it will be "unfit for any purpose".

Suspect chemistry does not apply only to chromatography; it can make any measurement meaningless. The December 1999 report from the Canadian Food Inspection Agency on their Histamine Quality Assurance Programme [12] shows major differences between results obtained by high performance liquid chromatography (HPLC) methods, mainly with fluorescence detection, and some of those obtained by immunoassay based systems. Using immunoassay methods for the sample of fish sauce (code HI6), 2 of the 3 results, at 48.00 and 105.83 mg/100 g, were considerably higher than all except 1 of the 21 HPLC based results, which had a mean of 11.83 mg/100 g. It seems unlikely that these laboratories or the manufacturers of the immunoassay kits did not understand the method or had failed to validate it. Obviously something is missing from these validations and the present understanding of these chemical measurements, and it is unlikely to have been included in any uncertainty associated with the method. Alternatively, but less likely, an interference with the same retention time as histamine has quenched the observed fluorescence and resulted in low results for HPLC based methods.

Uncertainty arising from any lack of specificity of the underlying chemistry is not going to go away. It is also not a new concern. Interference has always dominated thinking and precautions in analytical chemistry and is well discussed in traditional reference textbooks such as Vogel [13]. Uncertainty may not be decreased by the new methods analysts continue to introduce or by manufacturers who continue to improve the ease of operation of existing equipment. Improvements can be very good for the expert analyst who understands the limitations of all measurements but can also allow laboratories to downgrade the skill level of equipment operators. Automated equipment can also lead to a decreased level of appraisal of individual results, with an increase in uncertainty arising from overlooking problems with the underlying chemistry.

Automated equipment often allows replacement of expert technicians with less skilled staff, which will decrease the reliability of judgements made during the analytical process. A decrease in the skill level of laboratory staff seems to be a problem throughout the world [14]. It is partly addressed by accreditation authorities requiring a minimum level of expertise within laboratories accredited to ISO/IEC 17025:1999 [5] but it also requires the support of laboratory owners. Too often this is not forthcoming. Accreditation authorities can probably control major reductions in skill levels within a laboratory over a short time frame but may have less ability to resist longer-term incremental shifts that have the same overall consequence. Skill levels in the laboratory certainly have a major impact on the quality of analytical judgements and on analytical reliability without necessarily any effect on the calculated uncertainty of the recorded numerical result.

It is often stated, for instance by King, that a significant number of reported chemical measurements are wrong [9]. This is almost certainly true and likely arises from problems in the chemistry underlying the method, resulting in lack of specificity and/or loss of the analyte. Uncertainty arising from problems with the chemistry will be relevant to most chemical measurements aimed at measuring discrete chemical entities. In addition, lower limits of detection are often associated with increased uncertainty unless significantly improved equipment is used.

The sampling process

Sampling is the process used to obtain a portion of the bulk material for testing. It may take place in the field or in the laboratory. Most chemical analyses will involve at least two stages of sampling: primary sampling in the field and secondary sampling in the laboratory to give the final test portion. All samples are heterogeneous if looked at on a small enough scale [15]. The uncertainty arising from sampling will depend on the degree of heterogeneity in that sample.

Chemical laboratories are usually not responsible for primary sampling in the field but this must be appropriate or the result will be meaningless. They will however have to prepare the sample delivered to the laboratory and take a representative secondary sample for testing. The uncertainty associated with sampling is dependent on the sample and may vary between nominally identical samples although the procedures used are identical. This is [...]
[Figure: panels labelled Sample 1 and Sample 2]
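One common way to put a number on the sampling contribution discussed in this section is a balanced duplicate design: several nominally identical samples are taken from the same bulk and each is analysed more than once. The sketch below is only an illustration under stated assumptions, not the procedure used for the rice flour data: the function name `split_variance` and all values are hypothetical, and the calculation is an ordinary one-way analysis of variance that attributes the within-sample spread to the analysis and the excess between-sample spread to sampling.

```python
from math import sqrt
from statistics import mean


def split_variance(replicates):
    """Separate sampling variance from analytical variance.

    replicates: list of lists; each inner list holds the repeat results
    for one sample. The design must be balanced (same number of repeats
    per sample). Returns (sampling variance, analytical variance).
    """
    n = len(replicates)          # number of samples
    k = len(replicates[0])       # repeat analyses per sample
    if n < 2 or k < 2 or any(len(r) != k for r in replicates):
        raise ValueError("need a balanced design with n >= 2 and k >= 2")
    means = [mean(r) for r in replicates]
    grand = mean(means)
    # Within-sample mean square: spread of repeats around their own mean.
    ms_within = sum((v - m) ** 2
                    for r, m in zip(replicates, means) for v in r) / (n * (k - 1))
    # Between-sample mean square: spread of the sample means.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # Excess between-sample variance, truncated at zero.
    var_sampling = max(0.0, (ms_between - ms_within) / k)
    return var_sampling, ms_within


# Hypothetical results: four samples from one bulk, each analysed twice.
data = [[10.1, 10.3], [11.0, 10.8], [9.6, 9.8], [10.5, 10.7]]
var_sampling, var_analysis = split_variance(data)
print(f"s(sampling) = {sqrt(var_sampling):.2f}, s(analysis) = {sqrt(var_analysis):.2f}")
```

In this invented data set the between-sample spread is several times the repeatability, which is the situation the text warns about: a repeatability-only uncertainty would look reassuring while the result remains a poor estimate for the bulk material.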
Analysts must understand and include the uncertainty associated with sampling if they are to realistically estimate the uncertainty of chemical measurements for clients. They must understand what is meant by a representative sample and have appropriate data to ensure that the samples tested are representative. However, as Gy [15] has pointed out, this is not a traditional part of the education or training of analysts although it is critical to the use of measurements and their "fitness for purpose". This must change, and sampling processes must be included as part of the metrological approach to chemical measurement.

In the past laboratories may have excluded primary sampling from their concern, but ISO/IEC 17025:1999 [5] now imposes a requirement that the test and calibration methods selected are capable of meeting the client's requirements. Without knowledge of the traceability of the measurement back to the client's bulk sample, this is impossible.

Conclusion

Two parameters define the result of a chemical measurement. These are the named chemical entity and the amount of this entity estimated by the defined procedure. Any estimation of uncertainty in the result must consider traceability of the measurement to both these reference points. To be useful, the result must also be traceable back to the original sample. The uncertainty in chemical measurements must include:
- Uncertainty arising from assumptions made in the chemistry on which the method is based and the possibility that the measured result does not solely represent the chemical entity of interest and/or has not measured all of this entity present.
- Uncertainty arising from variation between repeats of the measurement.
- Uncertainty arising from the sampling procedure and the likelihood that the test portion does not contain a representative amount of the analyte of interest.

Present approaches to uncertainty based on metrological principles developed for physical measurement concentrate on estimating the uncertainty inherent in the traceability of the quantity of measurand to its reference point. Incorporation of metrological practices of physical measurement into chemical measurement is certainly an improvement but more effort is needed to incorporate uncertainty associated with the sampling process. The metrological practices of physical measurement do not address uncertainty arising from the choice of chemistry and the traceability of the result to the defined chemical entity, that is, the specificity of the method, or uncertainty associated with sampling. At present chemists use judgement as to when uncertainty associated with sampling and specificity can be ignored. However, judgements are subjective and incompatible with formal methods to estimate uncertainty. Effort must be spent in developing a metrological approach for chemical measurements of discrete chemical entities that will allow realistic estimations of the total uncertainty associated with the reported result, not just the uncertainty associated with the numerical value.

Acknowledgements The author thanks Dr. Don Ferry from International Accreditation New Zealand (IANZ) for the supply of the data on rice flour replicates. This paper arose from some discussions between the author and the late Dr. John Nicholas at New Zealand Measurements Standards Laboratory (MSL).
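The three contributions listed in the Conclusion (chemistry, repeatability and sampling) can be combined in a single budget once each has been expressed as a standard uncertainty. The sketch below is a minimal illustration that assumes the components are independent and enter the result with unit sensitivity, so that they add in quadrature as in the GUM treatment of uncorrelated inputs; the component values and the coverage factor of 2 are hypothetical and are not taken from the article.

```python
from math import sqrt


def combined_standard_uncertainty(components):
    """Root-sum-of-squares of independent standard uncertainties."""
    return sqrt(sum(u ** 2 for u in components.values()))


# Hypothetical standard uncertainties, all in the units of the result.
budget = {
    "chemistry (specificity, recovery)": 0.8,
    "repeatability": 0.3,
    "sampling": 0.6,
}

u_c = combined_standard_uncertainty(budget)
U = 2 * u_c  # expanded uncertainty with an assumed coverage factor k = 2

largest = max(budget, key=budget.get)
print(f"u_c = {u_c:.2f}, U (k=2) = {U:.2f}, largest contribution: {largest}")
print(f"repeatability alone understates u_c by a factor of "
      f"{u_c / budget['repeatability']:.1f}")
```

With these invented numbers the repeatability term is the smallest of the three, which mirrors the argument of the paper: a budget built on repeatability alone can be several times too optimistic.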
References
1. CITAC (2000) Traceability in chemical measurement. CITAC web page at http://www.vtt.fi/ket/citac/traceability.pdf
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva
3. Consultative Committee for Amount of Substance (Bureau International des Poids et Mesures) http://www.bipm.org/enus/2_Committees/CCQM.shtml
4. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn., Final Draft April 2000. EURACHEM
5. ISO/IEC 17025 (1999) General requirements for the competence of testing and calibration laboratories. ISO, Geneva
6. ISO Guide 25 (1990) General requirements for the competence of calibration and testing laboratories. ISO, Geneva
7. Horwitz W (2000) Official methods of analysis of AOAC International. AOAC International, Gaithersburg, Md., USA
8. Codex Committee on Methods of Analysis and Sampling 23rd Session, Budapest, Hungary 26 February-2 March, 2001. Proposed draft guidelines for the application of the criteria approach by the committee on methods of analysis and sampling. Agenda item 4a (CX/MAS 01/4)
9. King B (2000) Accred Qual Assur 5: 173-179
10. Wells RJ, Smith RJ (1996) Chem Aust April 1996: 167-168
11. Alexandrov YI (1997) Fresenius J Anal Chem 357: 563-571
12. Burns-Flett E (2000) Report by the Histamine Quality Assurance Co-ordinator, Canadian Food Inspection Agency dated 31 March 2000. Canadian Food Inspection Agency, 501 University Crescent, Winnipeg, Manitoba R3T 2N6
13. Vogel AI (1961) A textbook of quantitative inorganic analysis including elementary instrumental analysis, 3rd edn. Longmans, London, UK
14. Clapp S (2000) Professional qualifications: How close should we look? Inside Laboratory Management June 2000: 18-20. AOAC International, Gaithersburg, Md., USA
15. Gy P (1998) Sampling for analytical purposes. Wiley, Chichester, UK
16. Love JL (2000) Sampling - What should analytical chemists learn from microbiologists? Inside Laboratory Management February 2000: 17-18. AOAC International, Gaithersburg, Md., USA
Accred Qual Assur (1999) 4:401-405
© Springer-Verlag 1999
gates trueness to a lower priority. The reliance on precision is repeatedly seen in the results from external quality assessment (or proficiency testing) schemes all over the world, where method-dependent groupings of results for a given measurand are abundant.

Bias always impairs the comparability over space and time of the results for a given type of quantity and distorts the relationships between different types of quantity. Biological reference intervals are changed in comparison with a true distribution [e.g. 4, 5]. Harris even suggested a new term for such intervals, "medical indifference ranges" [6]. Whereas serial monitoring for change can sometimes live with a constant bias, this is not the case with screening, initial diagnosis, and movement towards a fixed discriminatory true limit, where diagnostic misclassifications are the outcome [e.g. 6-10]. A positive or negative bias of, say, 1 mmol/l in the amount-of-substance concentration of cholesterol or glucose in blood plasma has enormous effects on population health and economy.

official definition of traceability in metrology is: "property of the result of a measurement or the value of a measurement standard whereby it can be related to stated references, usually national or international measurement standards, through an unbroken chain of comparisons all having stated uncertainty" [11]. As stressed in the first resolution of the 20th General Conference on Weights and Measures (CGPM) in 1995 [12], the top of the calibration hierarchy, when possible, should be the definition of an SI unit.

The physical calibration hierarchy

In physics, the use of calibration hierarchies is well established and is used in any laboratory, e.g. for balances, volumetric equipment, spectrometer wavelengths, cuvette light path lengths, thermometers, barometers and clocks.
sec. C (NMI --> accr. CL --> mf.'s lab.)
mf.'s selected MP (mf.'s lab.)
mf.'s working C (mf.'s lab.)
mf.'s standing MP (mf.'s lab.)
mf.'s product C (mf. --> user)
routine MP (mf., user)
routine sample (user)
result (user)

The length of the hierarchy can be reduced by eliminating pairs of consecutive steps, thereby reducing uncertainty.

Commutability and analytical specificity

There are two major reasons why a traceability chain may be broken and trueness lost due to the introduction of bias: insufficient commutability of a calibration material and non-specificity of a measurement procedure. The effects of these separate properties are often indiscriminately lumped together as "matrix effect". Commutability refers to the ability of a material, here a calibrator, to show the same relationships between results from a set of procedures as given by routine samples [16, 17]. Analytical specificity refers to the ability of a measurement procedure to measure solely that quantity which it purports to examine [16, 18]. Discrepancies between results of a reference procedure and a routine procedure applied to routine samples are often caused by non-specificity of the routine procedure. The use of a set of human samples as a manufacturer's calibrator to eliminate so-called matrix effects should only be accepted if the relationship between the results from reference and routine procedures is sufficiently constant to allow explicit correction with consequent increased uncertainty of assigned values.

Traceability in practice

It is relevant to ask how often the routine measurement procedures currently used in laboratory medicine provide results that are traceable to high-level calibrators and reference measurement procedures (Lequin: personal communication). It turns out that primary reference measurement procedures and primary calibrators are only available for about 30 types of quantity such as blood plasma concentration of bilirubins, cholesterols and sodium ion. International reference measurement procedures from the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) and corresponding certified reference material from BCR are available for the catalytic activity concentration of a few enzymes such as alkaline phosphatase and creatine kinase in plasma. For another 25 types of quantity, such as the catalytic activity concentration of aspartate aminotransferase in plasma and number concentration of erythrocytes in blood, no high-level calibrators exist. International calibrators, e.g. from WHO, but no high-level in vitro procedures characterize a couple of hundred types of quantity involving, for example, choriogonadotropin. An overwhelming number of types of quantity have no high-level ending of the traceability chain, but rely on the internal best-measurement procedure and calibrator of the reagent set manufacturer or individual laboratory. The end-user, as a rule, cannot be expected to establish the entire traceability chain if that goes above an in-house procedure. The laboratorian usually has to rely on the manufacturer which, in turn, may claim traceability of its product calibrators to the highest available level, preferably provided by a national metrology institute, an accredited calibration laboratory, or a reference measurement laboratory. In fact, this responsibility of the manufacturer is now enshrined in the EU Directive on in vitro diagnostic medical devices [19], which will be supported by four EN/ISO standards under development. The laboratorian should, however, bolster his or her belief in trueness and comparability - especially if the traceability chain does not reach high - by recovery experiments [20], comparison with a selected procedure [21], and interlaboratory parallel measurements [22], including external quality assessment [23], preferably on material with reference measurement procedure assigned values [24]. The internal quality control system finally checks, with a given probability, whether the current measurements are in statistical control with no sign of change in the assumed zero bias.

Uncertainty of measurement

The definition of metrological traceability (see above) stipulates that each link in the chain has a known uncertainty. Nowadays, this concept and its application have been reformulated by the BIPM and recently detailed in the "Guide to the expression of uncertainty in measurement" (GUM) [25]: "parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand". Useful explanations are provided in several other guides [26-30] as well as commentaries [e.g. 31-33]. The philosophy is to apply a bottom-up approach by formulating a function of all input quantities giving the measurand as output. An uncertainty budget of all sources of uncertainty is established. Important items to consider are:
- definition of the measurand
- realization of the measurand
- sampling
- speciation and matrix
22 R. Dybkaer
References
1. Westgard JO, Hunt MR (1973) Clin Chem 19:49-57
2. Westgard JO, Carey RN, Wold S (1974) Clin Chem 20:825-833
3. Westgard JO, Stein B, Westgard SA, Kennedy R (1997) Comput Method Programs Biomed 53:175-186
4. Gowans EMS, Hyltoft Petersen P, Blaabjerg O, Hørder M (1988) Scand J Clin Lab Invest 48:757-764
5. Hyltoft Petersen P, Gowans EMS, Blaabjerg O, Hørder M (1989) Scand J Clin Lab Invest 49:727-737
6. Harris EK (1988) Arch Pathol Lab Med 112:416-420
7. Ehrmeyer SS, Laessig RH (1988) Am J Clin Path 89:14-18
8. Hyltoft Petersen P, Lytken Larsen M, Hørder M, Blaabjerg O (1990) Scand J Clin Lab Invest 50 (Suppl 198):66-72
9. Hyltoft Petersen P, Hørder M (1992) Scand J Clin Lab Invest 52 (Suppl 208):65-87
10. Hyltoft Petersen P, de Verdier C-H, Groth T, Fraser CG, Blaabjerg O, Hørder M (1997) Clin Chim Acta 260:189-206
11. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology. ISO, Geneva
From total allowable error via metrological traceability to uncertainty of measurement of the unbiased result 23
12. Comité International des Poids et Mesures (1998) National and international needs relating to metrology. Bureau International des Poids et Mesures, Sèvres
13. Kaarls R, Quinn TJ (1997) Metrologia 34:1-5
14. Quinn TJ (1997) Metrologia 34:61-65
15. Adams F (1998) Accred Qual Assur 3:308-316
16. Dybkaer R (1997) Eur J Clin Chem Clin Biochem 35:141-173
17. Fasce CF, Rej R, Copeland WH, Vanderlinde RE (1973) Clin Chem 19:5-9
18. Kaiser H (1972) Z Anal Chem 260:252-260
19. EU Directive 98/79/EC (1998) Off J Eur Comm L 331:1-37
20. Willets P, Wood R (1998) Accred Qual Assur 3:231-236
21. Hyltoft Petersen P, Stöckl D, Blaabjerg O, Pedersen B, Birkemose E, Thienpont L, Flensted Lassen J, Kjeldsen J (1997) Clin Chem 43:2039-2046
22. Groth T, de Verdier C-H (1993) Upsala J Med Sci 98:259-274
23. Hirst AD (1998) Ann Clin Biochem 35:12-18
24. Stamm D (1982) J Clin Chem Clin Biochem 20:817-824
25. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
26. Taylor BN, Kuyatt CE (1994) NIST Technical Note 1297. National Institute of Standards and Technology, Washington
27. Eurachem (1995) Quantifying uncertainty in analytical measurement.
28. EAL-G23 (1996) The expression of uncertainty in quantitative testing.
29. EAL-R2 (1997) Expression of the uncertainty of measurement in calibration.
30. ISO TR 14253-2 (1998) Geometrical product specifications (GPS) - Inspection by measurement of workpieces and measuring equipment - Part 2: Guide to the estimation of uncertainty in GPS measurement, in calibration of measuring equipment and in product verification. ISO, Geneva
31. Kadis R (1998) Accred Qual Assur 3:237-241
32. Bremser W (1998) Accred Qual Assur 3:398-402
33. Hässelbarth W (1998) Accred Qual Assur 3:418-422
34. Golze M (1998) Accred Qual Assur 3:227-230
Accred Qual Assur (1998) 3:180-184
© Springer-Verlag 1998
noted $u(x_i)$. In its second recommendation, the Comité International des Poids et Mesures (CIPM) requested that this combined standard uncertainty be used "by all participants in giving results of all international com-
CIPM and Comités Consultatifs" [3].
the uncertainty of the result of a measurement, it may
may be expected to encompass a large fraction of the
- Between 6 and 15 laboratories carry out each six measurements spread on two different units.
- Samples of each of both units are measured on two
Frequently, it is observed that the results of several laboratories participating in the certification do not even overlap with the value which is certified. The reason for this is not, as is generally believed on the basis of rou-
[Figure: bar graphs for laboratory means and expanded uncertainties, LAB 03 to LAB 14, horizontal axis from 140.0 to 170.0]
increase) with time are determined by calculating the time at which the 95% lower (or higher) confidence limit intersects the acceptable lower (or higher) specification limit, i.e. the lower or higher limit of the certified interval. The time so determined may then be considered as the expiration date, as one may be 95% confident that the average value of the batch characteristic will remain within specification until that date. As was the case for the within-unit variation, this possible instability should, in general, not be included in the CRM uncertainty, except if the degradation is significant compared to the certified uncertainty of the CRM. In such cases it might be preferred, rather than to reject the material as CRM, to certify an arbitrarily chosen interval within which the material can be expected to remain stable during a significant period of time, i.e. until the expiry date of the certificate.

Conclusion

The Guide to the expression of uncertainty in measurement provides a framework for assessing uncertainty which can and should be used for the certification of reference materials by laboratory intercomparison. However, as is stated in its paragraph 3.4.8., the following should be noted:
- It cannot substitute for critical thinking, intellectual honesty, and professional skill.
- The evaluation of uncertainty is neither a routine task nor a purely mathematical one and depends on detailed knowledge of the nature of the measurand and of the measurement.
- The quality and utility of the uncertainty quoted for the result of a measurement therefore ultimately depend on the understanding, critical analysis, and integrity of those who contribute to the assignment of its value.
This is particularly the case for the certification of reference materials. The above procedures can be used to obtain an estimation of both the certified value of a reference material and its uncertainty. However, there must be room for critical evaluation of the results by the people and organizations taking up responsibility for the values assigned to a CRM. Therefore it may be common practice in some organizations to increase the calculated uncertainty as it is felt to be optimistic. One should however be careful not to give lower uncertainties just on the basis of the fact that large uncertainty intervals may be interpreted as being the consequence of e.g. an analytical artefact.
References
1. Guidelines for the production and certification of BCR reference materials (1997) - document BCR/01/97, European Commission, DG XII-5-C (SMT Programme).
2. Guide to the expression of uncertainty in measurement (1995) ISO, Geneva, ISBN 92-67-10188-9
3. Giacomo P (1987) Metrologia 24:49-50
4. Quantifying uncertainty in analytical measurement, 1st edn (1995) Eurachem, ISBN 0-948926-08-2
5. Pauwels J, Lamberty A, Schimmel H, Homogeneity testing of reference materials, Accred Qual Assur 2:51-55
6. Ingamells CO, Switzer P (1973) Talanta 20:547-568
7. Pauwels J, Vandecasteele C (1993) Fres J Anal Chem 345:121-123
8. Pauwels J, Lamberty A, Schimmel H, Quantification of the expected shelf-life of certified reference materials, Fres J Anal Chem (accepted)
Accred Qual Assur (2000) 5:95-99
© Springer-Verlag 2000
cal and chemical form, and to be stored at the correct temperature from a very early stage in the production process. In general, microbiological degradation can be minimised by reducing the water content of the material to a level between 1 and 3%. Packaging is best carried out in an atmosphere of argon - not under vacuum as this may become a source of leaks - whereby all precautions must be taken to guarantee absolute tightness. This can be achieved using bottles with inserts, penicillin vials or ampoules, whereby it must be stressed that all three solutions have failed in the past: bottles and vials due to insufficiently tight or retracting inserts (e.g. due to ageing or freeze-temperature effects) or ampoules due to cracks appearing during storage as a consequence of stresses present in the glass.

What is important in homogeneity testing?

Homogeneity testing addresses a double problem: What is the variation in mean value which exists between the various units of a batch of candidate RM? And, how inhomogeneous is the material contained in a bottle?
The first problem is of utmost importance to the user as he/she will, in general, buy just one bottle, and will not care about the other ones! Therefore, between-units variation is an important component of uncertainty which must be included in the certified value of the CRM. The determination of the between-units variation is carried out by measuring the value of a significant number of units. As the result of such measurements is a combination of two effects, the between-bottle variability [$s_{bb}$] and the measurement repeatability [$s_{meas}$]

$u_{bb}^2 = s_{bb}^2 + s_{meas}^2$ (1)

the variation between the mean value of the bottles can only be obtained from measurements carried out with the highest repeatability: i.e. that each bottle must be analysed, using a highly repeatable method, on sample intakes of optimal size and carrying out a number of repetitions which is sufficient to obtain a measurement uncertainty which is negligible compared to the variation between the bottles, i.e. $s_{meas}^2 \ll s_{bb}^2$. Usually, this is however not the case. Then, $u_{bb}^2$ should, as far as possible, be corrected for $s_{meas}^2$ to obtain the best estimate of $s_{bb}^2$ [9].
To evaluate the inhomogeneity of the material contained in a bottle, within-bottle measurements have to be carried out. Also here, the result of such measurements is a combination of two effects, the within-bottle inhomogeneity [$s_{inh}$] and the method repeatability [$s_{meth}$]

$u_{inh}^2 = s_{inh}^2 + s_{meth}^2$ (2)

and the variation between the different samples within a bottle can only be obtained from measurements carried out using a highly repeatable method so that the method repeatability is negligible compared to the variation between the samples in a bottle, i.e. $s_{meth}^2 \ll s_{inh}^2$. In this case, sample intakes must however be minimal, as the contribution of $s_{meth}^2$ to $u_{inh}^2$ becomes negligible when extrapolating $s_{inh}^2$ from smaller (m) to larger (M) sample sizes according to:

$[s_{inh}^2]_M = [s_{inh}^2]_m \cdot m/M \approx [u_{inh}^2]_m \cdot m/M$ (3)

It must be emphasised that $s_{inh}$ is irrelevant for the CRM uncertainty, provided the minimum representative sample intake is properly determined. The value of $s_{inh}$ is, however, of prime importance to estimate this minimum representative sample intake correctly [10].
In both cases it should be noted that:
- Not correcting $u_{bb}$ or $u_{inh}$ for $s_{meas}$ or $s_{meth}$ is not really a problem, but leads to (too) conservative CRM uncertainty estimates.
- Corrected $s_{bb}$ or $s_{inh}$ values may never be taken smaller than their respective combined uncertainties, i.e. $u(s_{bb})$ and $u(s_{inh})$ [9].

What to do with stability data?

Stability testing at higher temperatures simulating possible transport conditions and conditions of long-term storage are often part of procedures describing the production of CRMs [11]. In most cases they do, however, not give quantitative information on presumed instability, mainly as a consequence of insufficient measurement reproducibility and of an insufficient number of replicates. With the upcoming requirements of fixing expiry dates [12], it will be mandatory that not only quantitative data be available, but that their quality is such that high precision extrapolations can be made. This requires however that data are produced with measurement reproducibilities (or repeatabilities when isochronous measurements are carried out [13]) which are negligible compared to the certified uncertainty. An extrapolation method was recently proposed by Pauwels et al. [14] to determine the time for which the certified value of a CRM remains valid, based on the determination of the intersection of the lower 95% confidence bound with the lower limit of the certified confidence interval (see Fig. 1). Such calculations show however that, with the levels of uncertainty presently certified, either unrealistically high precisions are required, or that shelf-lives must be reduced to unrealistically short periods of time, even if one considers that further stability monitoring during the lifetime of the CRM makes regular re-evaluation and updating of the shelf-life possible. Therefore, in many cases, it may become necessary to re-evaluate the certified uncertainties of
32 J. Pauwels et al.
[Figure 1: plot against time (months), horizontal axis from 0 to 60, vertical axis from 0.9 to 1]

Fig. 1 Example of determination of the long-term stability of certified reference material (CRM): Cr in CRM 278R (mussel tissue)

RMs taking into account a realistic stability uncertainty. Possibly, other approaches may be found to solve this extremely important problem, such as the one proposed by a group of experts working in the framework of a "Standards, Measurements and Testing Accompanying Measure" under the co-ordination of LGC (S. Burke, personal communication), consisting in extrapolating the certified value to mid-way of an arbitrarily chosen life-time and calculating the associated supplemental uncertainty.
A similar reasoning may be appropriate for possible degradation of the CRM during transportation to the customer.

The characterisation of a homogenous batch of material

The estimation of the mean value of a quantity of a CRM batch using: (1) a primary method of analysis, or (2) by comparing the results of a limited number of reference methods, or (3) the results of various independent methods applied in a series of laboratories should, in fact, only be variants of one and the same philosophy. The third characterisation method, however, requires that a number of analyses are carried out by one or more techniques in one or more laboratories, whereby each series of measurements is carried out with maxi-

the batch ($u_{char}$) should then take into account all these standard uncertainties, considering that those uncertainties which have been repeatedly determined in an independent way, decrease proportionally with the square root of the number of degrees of freedom. A proposal to handle this problem was published by Pauwels et al. [15]. It is based on a separate consideration of three types of standard uncertainties:
- Those which are exclusively laboratory dependent.
- Those which are common to all laboratories, such as the effect of between-bottle variation or the use of a common calibrant.
- Those which are common to groups of laboratories, e.g. those using the same measurement procedure.
In this context it should be noted that matrix CRMs are generally certified for mass fractions related to dry matter, i.e. that not only the amount of substance but also the dry sample mass has to be assessed and its uncertainty evaluated: a problem that is ignored and/or underestimated by many analytical chemists and a potential source of significant errors and unaccounted uncertainties in CRMs.

The CRM uncertainty according to GUM

The final uncertainty of a CRM according to GUM should consider all sources of uncertainty described above:

$u_{CRM} = \left[ u_{char}^2 + u_{bb}^2 + u_{lts}^2 + u_{sts}^2 \right]^{1/2}$, (4)

whereby lts and sts refer to long-term stability (upon storage) and short-term stability (during transport), respectively.
It is good practice to quantitatively determine all sources of uncertainty, be they significant or not. In the latter case they will anyhow disappear in the rounding-off of the calculation, but it will:
- Avoid the risk of overlooking sources of uncertainty due to ignorance.
- Demonstrate to users that they have been considered and what is their magnitude.
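The combination in Eq. 4, together with the correction of an observed between-bottle spread for measurement repeatability discussed under homogeneity testing, can be sketched as follows. This is a minimal illustration under stated assumptions: the two effects are taken to add in quadrature, the function names, the `minimum` floor parameter and all numerical values are hypothetical, and none of them come from the paper.

```python
import math


def corrected_between_bottle_sd(u_bb, s_meas, minimum=0.0):
    """Estimate s_bb from the observed spread u_bb, assuming u_bb^2 = s_bb^2 + s_meas^2.

    The result is never allowed to fall below `minimum`, a hypothetical floor
    standing in for the rule that a corrected value must not be taken smaller
    than its own uncertainty.
    """
    variance = u_bb ** 2 - s_meas ** 2
    s_bb = math.sqrt(variance) if variance > 0.0 else 0.0
    return max(s_bb, minimum)


def combined_crm_uncertainty(u_char, u_bb, u_lts, u_sts):
    """Root-sum-of-squares combination of the four contributions in Eq. 4."""
    return math.sqrt(u_char ** 2 + u_bb ** 2 + u_lts ** 2 + u_sts ** 2)


# Hypothetical relative standard uncertainties, in percent.
s_bb = corrected_between_bottle_sd(u_bb=1.0, s_meas=0.6)
u_crm = combined_crm_uncertainty(u_char=1.2, u_bb=s_bb, u_lts=0.9, u_sts=0.2)
```

Writing every contribution out, even a small one such as the short-term stability term here, matches the recommendation above to quantify all sources and let the negligible ones disappear in the rounding.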
References
1. Quinn TJ (1997) Metrologia 34:61-65
2. ISO Guide 30 (1981) Terms and definitions used in connection with reference materials. ISO, Geneva, Switzerland
3. ISO Guide 35 (1989) Certification of reference materials - General and statistical principles. ISO, Geneva, Switzerland
4. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva, ISBN 92-67-10188-9
5. Jorhem L (1998) Fresenius J Anal Chem 306:370-373
Evaluation of uncertainty of reference materials 33
6. Pauwels J (1999) In: Fajgelj A, Parkany M (eds) The use of matrix reference materials in environmental analytical processes. The Royal Chemical Society, London, pp 31-45
7. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London, ISBN 0-948926-08-2
8. Kramer GN, Pauwels J (1996) Mikrochim Acta 123:87-93
9. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:51-55
10. Pauwels J, Vandecasteele C (1993) Fresenius J Anal Chem 345:121-123
11. European Commission: DG XII-C-5 - document BCR/01/97 (1997) Guidelines for the production and certification of BCR reference materials. European Commission, Brussels
12. ISO Guide 31 (1998) Reference materials - Contents of certificates and labels (draft). ISO, Geneva, Switzerland
13. Lamberty A, Schimmel H, Pauwels J (1998) Fresenius J Anal Chem 360:359-361
14. Pauwels J, Lamberty A, Schimmel H (1998) Fresenius J Anal Chem 361:395-399
15. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:180-184
Accred Qual Assur (2002) 7:90-94
DOI 10.1007/s00769-001-0434-y
© Springer-Verlag 2002
tion range and, consequently, the experimental bias can be estimated using one reference sample with a concentration similar to the routine samples. If this is the case, the experimental bias is calculated as the difference between the reference value, $c_{ref}$, and the mean value, $c_{found}$: bias $= c_{ref} - c_{found}$. The experimental bias is not significant if:

$|\mathrm{bias}| \le t_{\alpha/2,\mathrm{eff}} \cdot u(\mathrm{bias})$ (1)

where $t_{\alpha/2,\mathrm{eff}}$ is the two-sided t tabulated value for the effective degrees of freedom, $\nu_{eff}$, [2] associated with u(bias), and can be replaced by the coverage factor k if the effective degrees of freedom are large enough [3,4]. The uncertainty of the experimental bias, u(bias), depends on the reference used to assess trueness. If a certified reference material (CRM) is used, this uncertainty is calculated as:

$u(\mathrm{bias}) = \sqrt{\dfrac{s^2}{p} + u(c_{ref})^2}$ (2)

where s is the standard deviation of the p results obtained when analysing the CRM and $u(c_{ref})$ is the standard uncertainty of the CRM (i.e. $U(c_{ref})/k$, where k is normally equal to 2 and $U(c_{ref})$ is the uncertainty of the CRM provided by the manufacturer).
If the experimental bias is significant, the procedure should subsequently be revised in order to identify and eliminate the systematic errors which produced the bias. Otherwise, we assume that the procedure is unbiased and, consequently, we do not correct results for the experimental bias. However, several questions arise in this latter case because, from a chemical point of view, some bias is always to be expected in an analytical procedure.

timating precision. In this paper, this latter term will be considered to be negligible. The overall expanded uncertainty, U, is then calculated by multiplying the standard uncertainty, u, by the two-sided t tabulated value, $t_{\alpha/2,\mathrm{eff}}$, for the effective degrees of freedom, $\nu_{eff}$ [2], i.e. $U = t_{\alpha/2,\mathrm{eff}} \cdot u$. A coverage factor of k=2 is recommended for most purposes when the effective degrees of freedom, $\nu_{eff}$, are large enough. This value represents a level of confidence of approximately 95%. Strictly, the uncertainty calculated in Eq. 3 corresponds to results of future samples obtained after correcting the concentration found for the experimental bias. However, analytical results are never corrected for non-significant experimental bias. As a result, this bias should be included as a component of uncertainty because the procedure may have a true bias.

Approaches for including non-significant experimental bias in the uncertainty budget

Different approaches have been proposed in the field of physical measurements to include bias as a component of uncertainty when results are not corrected for systematic errors [1]. In this paper, we will study whether these approaches can be applied to include non-significant experimental bias in chemical measurements. The first approach consists of including this bias as another component of uncertainty and simply to add it in the usual root-sum-of-squares (RSS) manner, i.e.

$U(\mathrm{RSSu}) = t_{\alpha/2,\mathrm{eff}} \cdot \sqrt{u^2 + \mathrm{bias}^2}$.

The second approach sums this bias in a RSS manner with the expanded uncertainty, U, i.e.
[Figure: two plots with horizontal axis "% β error" running from 100 to 0]
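The significance test for the experimental bias and the first root-sum-of-squares approach, U(RSSu), described in this section can be sketched in a few lines. The sketch assumes that the effective degrees of freedom are large enough for the coverage factor k = 2 to replace the tabulated t value, as the text allows; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def bias_uncertainty(s, p, u_ref):
    """Standard uncertainty of the experimental bias when a CRM is the reference.

    s: standard deviation of the p results obtained on the CRM
    u_ref: standard uncertainty of the CRM value, i.e. U(c_ref) / k
    """
    return math.sqrt(s ** 2 / p + u_ref ** 2)


def bias_is_significant(c_ref, c_found, s, p, u_ref, k=2.0):
    """True if |bias| exceeds k * u(bias); k stands in for the tabulated t value."""
    bias = c_ref - c_found
    return abs(bias) > k * bias_uncertainty(s, p, u_ref)


def expanded_uncertainty_rss_u(u, bias, k=2.0):
    """U(RSSu): the bias is added to the standard uncertainty u in quadrature."""
    return k * math.sqrt(u ** 2 + bias ** 2)


# Hypothetical example: CRM certified at 10.0 with U = 0.8 (k = 2), so u_ref = 0.4;
# four replicate results with mean 9.5 and standard deviation 0.6.
significant = bias_is_significant(c_ref=10.0, c_found=9.5, s=0.6, p=4, u_ref=0.4)
U_rss = expanded_uncertainty_rss_u(u=0.7, bias=10.0 - 9.5)
```

In this example the bias of 0.5 is smaller than 2 x 0.5 = 1.0 and is therefore not significant, which is exactly the situation in which the question of carrying the bias into the uncertainty budget arises.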
Table 2 True bias of the analytical procedure, δ_procedure, and percentage of times that the experimental bias is identified as non-significant, i.e. % β error, for the three cases described in Table 1

Case 1                       Cases 2 and 3
δ_procedure   % β error      δ_procedure   % β error
0             95             0             95
0.4           91             0.4           88
0.8           79             0.8           66
1.2           60             1.2           36
1.6           37             1.4           22
2             18             1.6           13
2.4           7              1.8           6

ed the true concentration of the routine sample, $c_{fut,true}$, within the interval $c_{fut} \pm$ Uncertainty. Uncertainty was calculated using the z-value for a level of significance α=5%. Therefore, if the uncertainty is correctly calculated, this percentage, i.e. % traceable future results, should be 95% (i.e. 100-α%). If uncertainty is underestimated, this percentage is lower than 95% and, if it is overestimated, it is higher than 95%. This percentage was calculated and plotted as a function of the β error committed in the assessment of trueness.
Figure 2 shows these results for case 1. In this case the contribution of u(bias) to the overall uncertainty, u(bias)/u, is 43%. We see that uncertainty can be greatly
38 A. Maroto et al.
underestimated when the experimental bias is not included as a component of uncertainty. This underestimation depends on the ratio u(bias)/u, i.e.: the higher this ratio is, the higher is the underestimation of uncertainty. The best approach to include non-significant experimental bias is the SUMU approach because it gives the percentage of traceable future results closest to 95%. The uncertainty, U(bias), is also a good approach for including this bias. However, this approach gives higher uncertainty values than the SUMU approach. The U(RSSU) and the U(RSSu) uncertainties are clearly inferior for including this bias because they overestimate uncertainty for higher probabilities of β errors and underestimate uncertainty for lower probabilities of β errors. Moreover, these approaches give higher uncertainty values than the SUMU approach.
Figure 3 shows the percentage of traceable results versus the percentage of β error for case 2 (i.e. 32% of contribution of u(bias) to the overall uncertainty) and for case 3 (i.e. 15% of contribution). In this Figure, uncertainty is calculated without including the experimental bias and with the SUMU approach. We see that the experimental bias should be included in case 2 but this is not necessary in case 3. This is because in case 3 the uncertainty of the experimental bias is negligible when compared to the overall uncertainty.

Conclusions

Non-significant experimental bias should be included in the uncertainty budget when the uncertainty of this bias represents about 30% of the overall uncertainty. The higher this contribution is, the more important it is to include the non-significant experimental bias. In contrast, it is not necessary to include this bias when its uncertainty has a low contribution to the overall uncertainty, i.e. 15% or lower. The best approach for including this bias is the SUMU approach. The uncertainty, U(bias), also gives good results. Otherwise, we can use the uncertainty U(bias) because, opposite to the SUMU approach, it has the advantage that it gives a symmetric confidence interval around the estimated result. However, it gives higher uncertainty values than the SUMU approach.
References
1. Phillips SD, Eberhardt KR, Parry B (1997) J Res Natl Inst Stand Technol 102:577-585
2. Satterthwaite FE (1941) Psychometrika 6:309-316
3. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement, ISO, Geneva
4. EURACHEM (1995) Quantifying uncertainty in analytical measurements, EURACHEM Secretariat, P.O. Box 46, Teddington, Middlesex, TW11 0LY, UK
5. Maroto A, Riu J, Boqué R, Rius FX (1999) Anal Chim Acta 391:173-185
6. Maroto A, Boqué R, Riu J, Rius FX (1999) Trends Anal Chem 18/9-10:577-584
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
8. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
9. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, EURACHEM, 2nd Edition. Helsinki
Accred Qual Assur (2002) 7:269-273
DOI 10.1007/s00769-002-0485-8
© Springer-Verlag 2002
[…] + 2 · x_pred · cov(b_0, b_1) + var(b_1))   (3)
calibration data set {x_i, y_i} and the (predicted) content is calculated from the observed response with the help of the inverse function (Eq. 3, marked in [2] as E3.3). A second estimation
L. Brüggemann · R. Wennrich
$$\operatorname{var}(x_{pred}) = \frac{s_{yx}^2}{b_1^2}\left[\frac{1}{p} + \frac{1}{n} + \frac{(x_{pred} - \bar{x}_{cal})^2}{\sum_i (x_i - \bar{x}_{cal})^2}\right] \tag{4}$$
is based on the calibration data (Eq. 4, E3.5 in [2]). It is the well-known formula for the variance of an average value predicted from p repetitions (i indexes the calibration levels, cal indicates that the average values belong to the calibration data set, and $s_{yx}^2$ is the residual variance of the regression model).

Comparison of the estimations of measurement uncertainty

With the help of the relations $\operatorname{var}(b_1) = s_{yx}^2 / \sum_i (x_i - \bar{x}_{cal})^2$ and […]

$$x_{pred} = \bar{x}_{cal} + \frac{y_{obs} - \bar{y}_{cal}}{b_1} \tag{9}$$

$$\operatorname{var}(x_{pred}) = \frac{1}{b_1^2}\left(\operatorname{var}(y_{obs}) + \left(\frac{y_{obs} - \bar{y}_{cal}}{b_1}\right)^2 \operatorname{var}(b_1) + \operatorname{var}(\bar{y}_{cal})\right) \tag{10}$$

According to [1], the variance estimates of the measurement uncertainty are now denoted by $u^2$ and the standard measurement uncertainty by $u = \sqrt{u^2}$. Thus, for the standard measurement uncertainty of a content $x_{pred}$ one obtains the expressions

$$u(x_{pred}) = \frac{s_{yx}}{b_1}\sqrt{\frac{1}{p} + \frac{1}{n} + \frac{(x_{pred} - \bar{x}_{cal})^2}{\sum_i (x_i - \bar{x}_{cal})^2}} \tag{11}$$

[…]   (12)

$$u(x_{pred}) = \frac{1}{b_1}\sqrt{u^2(y_{obs}) + \left(\frac{y_{obs} - \bar{y}_{cal}}{b_1}\right)^2 u^2(b_1) + u^2(\bar{y}_{cal})} \tag{13}$$

In order to check the trueness of an analytical procedure concerning the analysis of a sample with the content x_s and of one component, an additional analytical quality control (AQC) measurement of a reference material with the certified content x_r is to be executed. The determined content x_q (observed from the reference material) is compared with x_r. Two further uncertainties have to be considered: the uncertainty of the AQC measurement u(x_q),
Evaluation of measurement uncertainty for analytical procedures using a linear calibration function 41
based on the standard uncertainty u(y_q) of the appropriate response values, and the uncertainty u(x_r) concerning the content specification of the reference material [6]. With the help of a correction factor f_r defined by

$$f_r = x_q / x_r \tag{14}$$

and the model equation for the corrected content

$$x_{corr} = x_s / f_r \tag{15}$$

u(x_q) and u(x_r) can be included in the calculation of combined measurement uncertainty.
The standard uncertainties of f_r and x_corr are estimated by

$$u(f_r) = f_r \sqrt{\left(\frac{u(x_q)}{x_q}\right)^2 + \left(\frac{u(x_r)}{x_r}\right)^2} \tag{16}$$

and

$$u(x_{corr}) = x_{corr} \sqrt{\left(\frac{u(x_s)}{x_s}\right)^2 + \left(\frac{u(f_r)}{f_r}\right)^2} \tag{17}$$

respectively. By multiplication of the combined measurement uncertainty u(x_corr) with a coverage factor (k=2) the expanded measurement uncertainty

$$U = k \cdot u(x_{corr}) \tag{18}$$

is obtained and reported in the result of the analysis.
Using the test statistic

$$T = \frac{|1 - f_r|}{u(f_r)} \tag{19}$$

based on the t-distribution, it can be proven whether f_r significantly differs from 1. If T is larger than 2 (according to the size of k), an existent method bias is suggested [7]. This method bias should be eliminated in the context of further investigations. If this is not possible, then it must be considered with the calculation of the sample content.

Example

The evaluation of measurement uncertainty is presented on the basis of a calibration data set (Table 1) for the determination of zinc in aqua regia extracts of polluted sediment samples. The aqua regia extracts were prepared according to DIN ISO 11466 [8]. The concentration of zinc was determined in the diluted (deionized water) extracts by ICP-atomic emission spectrometry with pneumatic nebulization (Spectroflame MfP, Spectro A.I.). The aim of this work was to estimate the applied calibration procedure based on a set of diluted ICP multielemental standards (Merck IV) in 0.1 mol l⁻¹ nitric acid for "true" results in the aqua regia extracts. The trueness of this analytical procedure should be proved on the basis of SRM 2709, Montana soil (NIST), which was handled as a sample within the procedure. The certified value for zinc (aqua regia soluble) is reported to be 100 mg kg⁻¹ (range 87-120 mg kg⁻¹). If one considers the conversion factor from the sample preparation, the mean concentration of zinc in the solution amounts to 0.285 mg l⁻¹.
Spread-sheet programs and special software solutions can be used for the necessary calculations. For example the program tool "SQS98" [9] supplies the standard uncertainties u(b_0), u(b_1), and u(y_cal) needed in Eqs. (12) and (13). However, a small side-calculation is necessary (viz. Table 2).

Table 2 Standard uncertainty in the parameters of linear regression
Quantity                       Value             Standard uncertainty
Residual standard deviation    s_yx = 48.33
Intercept                      b_0 = -42.5       CI(b_0) = 71.16; u(b_0) = CI(b_0)/t = 71.16/2.776 = 25.63
Slope (sensitivity)            b_1 = 2224.9      CI(b_1) = 31.68; u(b_1) = CI(b_1)/t = 31.68/2.776 = 11.41
Abscissa centroid              x_cal = 1.433     u(x_cal) = 0
Ordinate centroid              y_cal = 3146.6    u(y_cal) = sqrt(u²(b_0) - x_cal² · u²(b_1)) = sqrt(25.63² - 1.433² · 11.41²) = 19.74
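The side-calculation of Table 2 can be retraced with a few lines of Python. This is only a sketch using the values quoted in the table; the variable names are illustrative, and the factor 2.776 is taken from the table as given (it is the Student t value for a two-sided 95% interval with 4 degrees of freedom).

```python
import math

# Values quoted in Table 2 (confidence-interval half-widths from the regression output)
ci_b0 = 71.16        # CI(b0), intercept
ci_b1 = 31.68        # CI(b1), slope
t_crit = 2.776       # Student t factor used in Table 2
x_cal_mean = 1.433   # abscissa centroid of the calibration data

# Convert confidence-interval half-widths to standard uncertainties
u_b0 = ci_b0 / t_crit
u_b1 = ci_b1 / t_crit

# Uncertainty of the ordinate centroid: u(y_cal)^2 = u(b0)^2 - x_cal^2 * u(b1)^2
u_y_cal = math.sqrt(u_b0**2 - x_cal_mean**2 * u_b1**2)

print(f"u(b0) = {u_b0:.2f}, u(b1) = {u_b1:.2f}, u(y_cal) = {u_y_cal:.2f}")
```

The three results agree with the values 25.63, 11.41 and 19.74 listed in Table 2.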
Table 1 Calibration data set for n=6 calibration levels (p=5 repetitions)
x y1 y2 y3 y4 y5
a Content in mg l⁻¹
Table 3 Measurement uncertainty concerning analyte contents in calibration range (without AQC measurement)
Columns: analyte content (a) x_i; response y_obs; standard uncertainty u(y_obs); analyte content (a) x_pred; uncertainty contributions u_obs, u_b1 and u_cal; measurement uncertainty (b) u(x_pred) by Eq. (13) and by Eq. (11)
a Content in mg l⁻¹
b Eqs. (13) and (11) correspond to EURACHEM Guide [2], E3.3 and E3.5, respectively.
Table 4 Evaluation of measurement uncertainty for the special application (analyte content x=2) including the assessment of trueness (contents given in mg l⁻¹)

Sample content:
x_s = 1.433 + (4372.4 - 3146.5)/2224.9 = 1.984
u(x_s) = (1/2224.9) · sqrt(52.06² + ((4372.4 - 3146.5)/2224.9)² · 11.41² + 19.74²) = 0.0252

AQC measurement:
x_q = 1.433 + (679.3 - 3146.5)/2224.9 = 0.324
u(x_q) = (1/2224.9) · sqrt(30.95² + ((679.3 - 3146.5)/2224.9)² · 11.41² + 19.74²) = 0.0175

RM content, certified:
x_r = 0.285

Correction factor:
f_r = x_q/x_r = 0.324/0.285 = 1.137

Corrected content:
x_corr = 1.984/1.137 = 1.745
u(x_corr) = x_corr · sqrt((u(x_s)/x_s)² + (u(f_r)/f_r)²) = 1.745 · sqrt((0.0252/1.984)² + (0.0877/1.137)²) = 0.136
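The arithmetic of Table 4 can be checked with the following sketch. It assumes the centroid form of the prediction uncertainty (response uncertainty, slope term and centroid term under one square root, divided by the slope), which is the form the numbers in Table 4 follow. The function and variable names are illustrative. The value u(f_r) = 0.0877 is taken directly from Table 4, because u(x_r) is not stated explicitly in the text.

```python
import math

def predict_content(y_obs, b1, x_cal_mean, y_cal_mean):
    """Content predicted from a response via the inverse calibration function."""
    return x_cal_mean + (y_obs - y_cal_mean) / b1

def u_predicted(y_obs, u_y_obs, b1, u_b1, y_cal_mean, u_y_cal):
    """Standard uncertainty of the predicted content (centroid form)."""
    lever = (y_obs - y_cal_mean) / b1
    return math.sqrt(u_y_obs**2 + lever**2 * u_b1**2 + u_y_cal**2) / b1

# Regression parameters as used in Table 4
b1, u_b1 = 2224.9, 11.41
x_cal_mean, y_cal_mean, u_y_cal = 1.433, 3146.5, 19.74

# Sample (s) and AQC reference material (q): response and its standard uncertainty
x_s = predict_content(4372.4, b1, x_cal_mean, y_cal_mean)
u_x_s = u_predicted(4372.4, 52.06, b1, u_b1, y_cal_mean, u_y_cal)
x_q = predict_content(679.3, b1, x_cal_mean, y_cal_mean)
u_x_q = u_predicted(679.3, 30.95, b1, u_b1, y_cal_mean, u_y_cal)

x_r = 0.285      # certified content of the reference material in solution
u_f_r = 0.0877   # u(f_r) as quoted in Table 4

f_r = x_q / x_r                      # correction factor, Eq. (14)
x_corr = x_s / f_r                   # corrected content, Eq. (15)
u_x_corr = x_corr * math.sqrt((u_x_s / x_s)**2 + (u_f_r / f_r)**2)  # Eq. (17)
U = 2 * u_x_corr                     # expanded uncertainty with k = 2, Eq. (18)
T = abs(1 - f_r) / u_f_r             # test statistic, Eq. (19)

print(f"x_s = {x_s:.3f}, u(x_s) = {u_x_s:.4f}")
print(f"x_q = {x_q:.3f}, u(x_q) = {u_x_q:.4f}")
print(f"f_r = {f_r:.3f}, x_corr = {x_corr:.3f}, u(x_corr) = {u_x_corr:.3f}")
print(f"U = {U:.2f}, T = {T:.2f}, bias significant: {T > 2}")
```

With these inputs T stays below 2, so the sketch reaches the same decision as the text: the method bias is not significant.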
In Table 3, the standard uncertainties calculated according to Eqs. (13) and (11), without consideration of the AQC measurement, are listed for all calibration levels. One can see that the measurement uncertainties in the part of the calibration range within which the condition of variance homogeneity is correctly fulfilled (see the third column in Table 3 and Fig. 1) are nearly equal. For u(y_obs) = sqrt(s_yx²/p) = sqrt(48.33²/5) = 21.6 the curves in Fig. 1 have an intersection point.
In Table 4 the determination of the combined measurement uncertainty, including the assessment of trueness, is presented for a special analytical application (analyte content x = 2 mg l⁻¹). Here f_r = 1.137 and u(f_r) = 0.088 give the test statistic T = 1.56 ≤ 2, so that a significant method bias cannot be proven, although the relatively large difference between x_r and x_q suggests a bias. Because of the non-significance […] uncertainty u(x_corr) = 0.136 is used; thus for the analysis result x_s = 1.98 the associated expanded measurement uncertainty is U = 2 · 0.136 = 0.27.

Fig. 1 Comparison of the measurement uncertainty estimations (Eqs. (13) and (11) correspond to EURACHEM Guide [2], E3.3 and E3.5, respectively); x-axis: analyte content [mg/l]

Conclusion

The application of unweighted ordinary least squares regression for linear calibration can lead to a slightly underestimated uncertainty value if the condition of variance homogeneity in the calibration range is not correctly fulfilled and the uncertainty calculation is based on the residual standard deviation of the regression (Eq. E3.5 in [2]). In this case the other indicated possibility of uncertainty calculation (uncertainty deduced from repeated measurements, Eq. E3.3 in [2]) can result in more realistic estimations.
References
1. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM/CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn., Final Draft April 2000. EURACHEM: http://www.measurementuncertainty.org
3. IUPAC Recommendations (1998) Guidelines for calibration in analytical chemistry: http://www.iupac.org
4. ISO 5725-2 (2000) Accuracy (trueness and precision) of measurements methods and results, Draft May 2000. ISO, Geneva
5. Rawlings JO (1988) Applied regression analysis. Wadsworth and Brooks, California, USA
6. Kurfürst U (1998) Accred Qual Assur 3:406-411
7. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
8. DIN ISO 11466: 06.97 Soil quality - extraction of trace elements soluble in aqua regia. Beuth, Berlin
9. Kleiner J, Lernhardt U (1998) Program SQS98. Perkin Elmer GmbH, Überlingen, Germany
Accred Qual Assur (2001) 6:352-359
© Springer-Verlag 2001
the ability of computing devices to treat large data sets automatically, and hence their use is more widely spread. Unfortunately, users are often unaware of the limitations of the corresponding models and tools.
Mathematical statistics is developing continuously and a number of new procedures (e.g. robust procedures) [6] are built into new specialized statistics software. Improved software to calculate precision has been published [7, 8]. These new tools provide more correct results compared to traditional ones. It is therefore reasonable to expect that their use becomes standard.

Complete characterisation of analytical results and uncertainty distribution law

The use of standard uncertainty has one important (but often forgotten) limitation - results can be compared only in the case of the same uncertainty distribution law. The other interval measures of precision and derived values, such as confidence intervals, detection limit or determination limit, cannot be reliably determined without knowledge of the distribution, and likewise further statistical deductions are not valid without this information.
Gaussian distribution is often supposed. The analytical determinations are, however, generally complex problems - the measurements are indirect, many single operations are carried out, each of them with its own uncertainty (considered as direct measurement or more or less complex, indirect measurement). The uncertainty of a result is the composition of these elemental uncertainties. There is a widely accepted opinion that, due to the large number of uncertainty sources, the result distribution is often Gaussian. This is valid only in the case of addition or subtraction of the individual constituents; in other cases the solution of the distribution law problem is rather difficult - simple application of the central limit theorem without careful consideration of its limitations leads to overestimation of the applicability of the Normal distribution.
The identification of the distribution law is reliable only for very large data sets [9], which are practically impossible in real chemical measurements. Verification using common statistical tests requires, for good reliability, large data sets too. In a real situation often only an assumption can be made. As this assumption about the distribution law is not well-founded, there is a risk of false information about uncertainty. The assumption of a domination of Gaussian distribution was not confirmed in the study where a number of large data sets of various types [10] were assessed. Only about 25% of the studied measurements could be considered as a Gaussian distribution.
In recommendations on uncertainty evaluation [1-3], Gaussian distribution is said to be fundamental, but some other distributions are possible to use. The application of the assumed distribution is also made less strict by using a less firm relation between probability and standard deviation than is used in traditional statistics.
There is no consideration of the relation between the distributions of input and output values - while the result of the measurement is obtained as a function of several values with their own uncertainties, the standard deviation of the result is determined using error propagation and the interval estimate is based on the assumption that the distribution is Gaussian. But the shape of the distribution depends on the distribution of the input values and on their functional relation too.
The applied functional relation can be relatively simple or more complex. The measurement of individual input values is often a complex process and one input number is the result of many single operations. In the evaluation of chemical analytical results these relations are of various complexity, but they can mostly be expressed as an explicit function and their evaluation using the propagation rule is simple. But some important types of measurement are more difficult to treat. Two examples are described below.

Example 1. A simple case: preparation of a standard solution of zinc

When a certified reference material (CRM) is prepared from a weighed amount of Zn-metal by dissolving in acid and adjusting to the desired volume, the obtained concentration from this process is influenced by several factors and a relatively complete description can be obtained as an extensive equation comprising 16 parameters, the values of which carry some uncertainty [11]. For comparison: the simple but not complete relationship in this case is:

$$c = \frac{m}{V \cdot M} \tag{1}$$

where m is the amount weighed using analytical balances, V is the volume measured using a volumetric flask and M is the molar weight. The more complete form is as follows:
c = ((1110 + ml read + mica! + mloper + m2read + ~cer! + ~(}per ) I I - rpair . (I Irpmb) - 11 rpWeid!h)) X (1 + KI tl - Kn
(2)
('-':Iecl + Vread + '-':lecl.E. (t2 - 20))· M
P. Tarapcik et al.
where all used parameters (except m_0, K_1 and K_2) are considered as uncertain values and their symbol meanings are described in [11]. The actual values and origin of non-experimental values (molar mass, densities, etc.) are given in [11]. This relationship is even more complex if the contribution of impurities of the materials used is included.
From the relationship above, standard uncertainty can be obtained, but one has to do relatively extensive work - the error propagation law must be applied, which means one needs 16 sensitivity coefficients (squares of the partial derivatives) and one has to estimate the standard uncertainties of all 16 constituents. The derivatives can be obtained without deep knowledge of mathematical analysis using the appropriate software, or using a similar numerical approach as described in [12] with a common spreadsheet. This procedure, however, after the large amount of work, gives no information about the result distribution law, even if the input distributions are known, and so the result interval estimates are usually based on a non-verified assumption.

Example 2. A more complex case: measurement of diffusion coefficient

The situation is more difficult if an analytical expression is not possible in the explicit form y = f(x_i). This differs only in the more complicated method to obtain the sensitivity coefficient. The method for this is described, for example, in [2]. Even more complex was our particular measurement of diffusion coefficients using the open capillary method, where this process is described by an infinite series [13]:

$$y = \frac{c_{tr}}{c_0} = \sum_{n} \frac{8}{\pi^2 (2n+1)^2} \exp\left[-\frac{\pi^2 (2n+1)^2 D t}{4 l^2}\right] \tag{3}$$

the sum is over the integers n (i.e. from 0 to infinity), where c_tr and c_0 are the concentrations in the capillary before and after diffusion, l is the length of the capillary, D is the measured diffusion coefficient, important for example for characterisation of particle size, t is the diffusion time, π has its common meaning and n is the index of the term in the infinite series.
This series converges rapidly - for a simple mathematical solution, the time of diffusion is usually adjusted so as to obtain a second term of the series (n=1) 1000 times lower than the first term (n=0).
The solution for the first term only (n=0) is:

$$D_{(n=0)} = \frac{4 l^2}{\pi^2 t} \ln\left(\frac{8}{y \pi^2}\right) \tag{4}$$

and the relative standard uncertainty from the error propagation rule:

$$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_y^2 \left(\frac{1}{\ln\left(8/(\pi^2 y)\right)}\right)^2 \tag{5}$$

If one needs to use two terms - due to somewhat higher values of r (lower rate of diffusion), the equation can be designed in the form:

$$y = \frac{8}{\pi^2} \exp\left(-\frac{\pi^2 D t}{4 l^2}\right) \left\{1 + \frac{1}{9} \exp\left(-\frac{8 \pi^2 D t}{4 l^2}\right)\right\} \tag{6}$$

The solution for D is treated for example in [14]:

$$D = \frac{4 l^2}{\pi^2 t} \ln\left(\frac{8}{\pi^2 r}\right) + \frac{4 l^2 \pi^{14} r^8}{9\, t\, 8^8} \tag{7}$$

and for uncertainty after very awkward work

$$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_r^2 \left\{\frac{8\,(\pi^{16} r^8 - 9 \cdot 8^7)}{9 \cdot 8^8 \ln\left(8/(\pi^2 r)\right) + \pi^{16} r^8}\right\}^2 \tag{8}$$

The value r is conveniently measured as the radioactivity ratio (after background corrections), hence it is also a typical indirect measurement, which is a combination of four individual measurements of radioactivity decay and four measurements of time.
If one needs a more complete equation (n = 2 or more), due to slow diffusion of larger particles, when r values are higher over a reasonable time period, the solution is more complex - D can be calculated by an iterative procedure, but determination of uncertainty according to the usual procedure [2] is very awkward. The general relationship for relative standard uncertainty is as follows:

$$\delta_D^2 = 4\delta_l^2 + \delta_t^2 + \delta_y^2 \left\{\frac{l^2 y}{2 t D \sum_n \exp\left[-\pi^2 (2n+1)^2 D t / (4 l^2)\right]}\right\}^2 \tag{9}$$

and nothing is known about the result distribution.
The examples above show that excessive work is required using the traditional approach without achieving completely reliable output information from well-measured input data.

Looking at the distribution law for composition of more values

More complete output information requires knowledge about the distribution law. The composition of distributions can be considered for two values (e.g. a ratio of two Gaussians gives a Cauchy distribution, multiplication gives a Laplace distribution in this case). More complex relations are very awkward and for most analysts also technically impossible. By solving the problem of uncertainty distribution in the above way one has two values and can divide the problem and work in single steps, dealing with one problem at a time.

The distribution of results is obtained as a combination of two uncertain values

The procedures to obtain the probability density functions in this case can be found in advanced mathematical statistics [5]; here we provide some examples. Table 1 presents convolutions of some distributions
Measurement uncertainty distributions and uncertainty propagation by the simulation approach 47
generally known and often used. The Gaussian and Laplace distributions are denoted N(μ, σ) and L(μ, σ), respectively (where μ is the true value and σ is the standard deviation); R(-a, a) means a rectangular distribution on the interval [-a, a]. The graphic presentations of convolutions are shown in Fig. 1: graph type 5* is similar to graph type 5 with the exception that the values of high deviation have higher probability.

Table 1 Combined uncertainty distributions of results obtained by arithmetic combination of two measurements with known and common uncertainty distributions
Columns: f(A,B); distributions of constituents A and B; resulting distribution type, probability density; graph type

The traditional suggestion (ISO) in such cases recommends as a first approximation a Gaussian distribution (graph type 4 in Table 1 and Fig. 1), where the standard uncertainty of the result is obtained by a combination of standard uncertainties of the constituents according to the rules of differential calculus. The other case is, of course, relatively simple and can be found in the basic text books or derived by a person with moderate skills.
An illustrative example can be given as a comparison of distributions obtained by the addition of two rectangular distributed values - graph type 1 in Table 1 and Fig. 1. According to ISO one can obtain the probability density function by:

$$P(x) = \frac{1}{\sqrt{2\pi(\sigma_A^2 + \sigma_B^2)}} \exp\left(-\frac{(x - \mu_A - \mu_B)^2}{2(\sigma_A^2 + \sigma_B^2)}\right) \tag{10}$$

and a comparison for the same input values in Fig. 2 emphasises a sharp difference in this case. The example presented is, however, quite academic. Real measurements are not so simple and the distribution laws and their parameters are only known from small data sets, i.e. in most cases only an estimation is used.
Some remarks can be made on this basis:
- The distribution of the value obtained by dividing two uncertain values often has no defined statistical moments other than the zero moment (e.g. Cauchy), and there is no sense in determining such characteristics as standard deviation or skewness, which are based on the higher moments.
- When multiplication of uncertain values is used, the resulting distribution has a large value of kurtosis and the usual precision characteristics have small statistical efficiency. More suitable are estimates based on the median or robust methods.

Simulation approach for uncertainties

A possible solution is an exploitation of the ability of standard software to generate data with defined statistical properties - with a defined distribution and its parameters, e.g. mean value, standard deviation and the like. In this way, sets of data can be obtained, each representing a single "experiment". The number of these simulated data sets can be very large and the set of results can be treated traditionally - that means by determining statistical moments and derived values, constructing histograms, polygons and the like. Or interval estimates can be obtained simply from percentiles of the set of results (this is really more correct than the moment method, where the assumption about the distribution often plays a negative role). This procedure requires no new
= $A$1-$A$2*SQRT(3) + RAND()*2*$A$2*SQRT(3)   (12)

(NORMINV, SQRT and RAND are the names of the appropriate built-in functions of the spreadsheet.)
How to generate numbers of many other distributions is a well known and solved problem [6]:
- If the cells of column B hold the other simulated values and column C holds the result of the calculation for the functional relation of A and B (for example the ratio), one obtains 1000 lines with simulated sets of input data and a set of 1000 calculated results.
- Statistical analysis of the last set gives standard statistical outputs, e.g. interval estimates (standard deviation, percentiles) or, better still, a more complete picture using a graph (polygon or histogram) of the simulated distribution, which can then be readily compared with an a priori assumed distribution - simply visually or by application of standard statistical tests. The result distribution can be described by an empirical equation and used in further result utilisation.

Examples of the simulation approach

Fig. 3 Integral distribution function (polygon) of simulated certified reference material preparations

Table 2 Statistical evaluation of simulated certified reference material "preparations"
Mean                           0.0100004 mol/l
Median                         0.0100002 mol/l
Standard uncertainty           0.000004 mol/l
Asymmetry                      0.0902
Kurtosis                       -0.2527
Relative standard deviation    0.03996%

The results are given in Table 2 and in graphical form as an integral distribution function (polygon) in Fig. 3. In this graph there is also a curve for the Gaussian distribution using the moment method, the 95% interval obtained from the percentiles, and the same estimate from the parameters of the Gaussian distribution. It is evident that in this case the assumptions were adequate. The simulation procedure gave results in agreement with the results of the standard procedure but without the element of vagueness due to the unknown composition of the individual distributions.
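The spreadsheet recipe described above can also be sketched in Python. The rectangular generator uses the same construction as Eq. (12): a uniform number spread over mean ± sd·sqrt(3). The input means, standard deviations, random seed and the choice of a ratio as the functional relation are hypothetical and serve only to illustrate the procedure; they are not the authors' data.

```python
import math
import random
import statistics

def rectangular(mean, sd, rng):
    """Rectangular value with given mean and standard deviation, as in Eq. (12)."""
    half_width = sd * math.sqrt(3)
    return mean - half_width + rng.random() * 2 * half_width

rng = random.Random(12345)   # fixed seed so the run is repeatable
n_sim = 1000                 # number of simulated "experiments"

# Column A: rectangular input; column B: Gaussian input (hypothetical parameters)
col_a = [rectangular(10.0, 0.1, rng) for _ in range(n_sim)]
col_b = [rng.gauss(5.0, 0.05) for _ in range(n_sim)]

# Column C: the functional relation of A and B, here the ratio
col_c = [a / b for a, b in zip(col_a, col_b)]

# Moment-based characteristics of the simulated results
mean_c = statistics.fmean(col_c)
sd_c = statistics.stdev(col_c)

# Percentile-based 95 % interval: cut points at 2.5 % and 97.5 %
cuts = statistics.quantiles(col_c, n=40)
low_95, high_95 = cuts[0], cuts[-1]

print(f"mean = {mean_c:.4f}, standard deviation = {sd_c:.4f}")
print(f"95 % interval from percentiles: {low_95:.4f} to {high_95:.4f}")
```

The percentile interval is read directly from the simulated results, so it does not depend on any assumption about the shape of the result distribution.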
For a rapid diffusion of Na+, at a mean measured value of y = 0.424, using the uncertainty propagation rule we obtained a value of δ_y = 0.71%. In this case simulation gave a value of 0.70%, and the relative uncertainty of the calculated D was 1.15% by uncertainty propagation and 1.14% by the simulation. At the same time, the simulation procedure gave details of the uncertainty distribution of the result (expressed, e.g., as a histogram, moments of distribution, etc.), and it was possible to show the interval limits directly from the simulated set of results, regardless of the distribution type. In activity measurements, the assumptions usual for radiometric measurements were used - a normal distribution for counts; measurements of time and length were also taken as normally distributed (type 1). The histogram and parameters of the simulated set of results indicated a normal-like distribution, and this character was maintained also in the case of the rectangular distribution used for time and length measurements (type 2).
For slow diffusion and a more critical measurement of Eu3+, with a diffusion time that was not long enough, y was about 0.75, and the situation is different. The application of the uncertainty propagation rule leads to an overestimation of precision by about 40-50% of the standard deviation, as compared to the values from the simulation. Moreover, the simulation gave information about the shape of the distribution. It is also simple to judge the influence of an assumption; the simulation gave an objective result. Illustrative examples are shown in Table 3.
Favourable conditions means a measurement in which y is about 0.4, so higher terms of the diffusion equations are not important. Under unfavourable conditions the value of y was about 0.75: about four terms of the series are of importance. In both cases, in an iterative calculation, the first eight terms of the series were applied.
Under favourable conditions there is no difference in a number of characteristics between the traditional and the simulation procedure, and similarly there is no significant difference in relation to the assumption about the distribution. However, the simulation gave the shape of the distribution, and for the often used assumptions of type 2 a slightly narrower interval at 95% probability from the percentiles technique is seen, which is most evident.
In the less favourable conditions, the traditional procedure overestimated precision. The influence of a given choice of assumptions about distributions even displayed a skewed distribution and an evident enlargement of the interval, which shows the importance of well-founded assumptions.
For such measurements, more capillaries are used in the same experiment to obtain repeated values, although a very limited number (8 for example) and with a risk of correlations, but the details are known by a skilled person. The uncertainties thus obtained show that the result of the simulation is more realistic compared to the result of the traditional method.

Conclusions

Almost every analytical result is obtained as an indirect measurement. The distribution of the results is given by the distribution of the directly measured values and their relations. There is no reason to suppose a Gaussian distribution of elementary measured values and results simultaneously. In some cases, the simple error propagation law is sufficient but in some cases it could be in error and a simulation approach would be worth adopting. It is almost impossible to decide a priori which method to use.
If an assumption about the distribution is used in one step of the analysis, further conclusions in the other steps must be in harmony with this assumption, e.g.
if the parameters of calibration of a straight line were tested as Gaussian distributed, simultaneously the assumption of a Cauchy distribution of the concentrations obtained by using these parameters is automatically made and further statistical deductions must agree with this fact.
If assumptions about the directly measured values are used, the result of the determination of the distribution law is also an assumption and it is not worth making a difficult exact analysis of the distribution.
The determination of the distribution of an analytical result is rather a difficult problem. Even if distributions of directly measured values are known, it is reasonable to turn to statisticians for a solution, but a simulation procedure is available for the moderately skilled person.
The simulation approach for interval estimates of measured values does not solve this problem entirely; it cannot overcome the problem of small numbers of degrees of freedom, but in most cases it is a good tool:
- It can be used in cases where the distribution and distribution parameters of the constituents are known and in cases where only approximate values of parameters, derived from small data sets, are used. This method is also only an approximation, but in many cases more correct and less strenuous than the error propagation method.
- The result is easily varied in relation to the assumptions; this variability includes the distribution law too.
- The method is able to solve difficulties originating from complex calculations.
- It is easy to realise, and the procedure does not require any special mathematical knowledge or ability.
- Only a moderate knowledge of standard software is required.
- Result distribution is given in table and graphical form.

The simulated distribution, in many cases without known (assumed) input distributions, remains a model of the frequency distribution to be expected. This model is particularly useful when it is compared to experimentally determined distributions, to check whether all significant sources of uncertainties have been identified and properly assessed.
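The comparison between error propagation and simulation discussed for the diffusion measurements can be illustrated for the single-term solution, Eq. (4), whose relative uncertainty follows from Eq. (5). The sketch below is an assumption-laden illustration: y = 0.424 and its relative uncertainty of 0.71% are taken from the Na+ example in the text, whereas the capillary length, the diffusion time, their relative uncertainties, the Gaussian input distributions and the random seed are hypothetical.

```python
import math
import random
import statistics

def diffusion_coefficient(length, time, y):
    """Single-term solution, Eq. (4): D = 4 l^2 / (pi^2 t) * ln(8 / (pi^2 y))."""
    return 4 * length**2 / (math.pi**2 * time) * math.log(8 / (math.pi**2 * y))

# y and its relative uncertainty come from the text; the rest is hypothetical
length, time, y = 3.0, 86400.0, 0.424        # cm, s, concentration ratio
rel_l, rel_t, rel_y = 0.002, 0.001, 0.0071   # relative standard uncertainties

# Error propagation, Eq. (5)
log_term = math.log(8 / (math.pi**2 * y))
rel_d_prop = math.sqrt(4 * rel_l**2 + rel_t**2 + (rel_y / log_term)**2)

# Simulation: draw the inputs, recompute D, evaluate the spread of the results
rng = random.Random(2001)
sims = [
    diffusion_coefficient(
        rng.gauss(length, rel_l * length),
        rng.gauss(time, rel_t * time),
        rng.gauss(y, rel_y * y),
    )
    for _ in range(20000)
]
rel_d_sim = statistics.stdev(sims) / statistics.fmean(sims)

print(f"relative uncertainty of D, propagation: {100 * rel_d_prop:.2f} %")
print(f"relative uncertainty of D, simulation:  {100 * rel_d_sim:.2f} %")
```

Under these favourable conditions (y near 0.4) both routes give nearly the same relative uncertainty. The simulated list additionally allows percentiles or a histogram to be read off directly.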
References
1. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London, UK; ISBN 0-948926-08-2
2. TPM 0051-93 (1993) Stanovenie neistot pri meraniach (Determination of measurement uncertainties). Slovak Institute of Metrology, Bratislava, Slovak Republic
3. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland; ISBN 92-67-10188-9
4. Taylor JK (1990) Statistical techniques for data analysis. Lewis Publishers, Chelsea, Mich., USA
5. Koroliuk VS, Portenko NI, Skorokhod AV, Turbin AF (1985) Spravotchnik po teorii veroyatnostei i matematitcheskoi statistike (Handbook of probability theory and mathematical statistics). Nauka, Moscow, Russia
6. Antoch J, Vorlickova D (1992) Vybrane metody statisticke analyzy dat (Selected statistical methods). Academia, Prague, Czech Republic
7. Eckschlager K (1991) Collect Czech Chem Commun 56:505-559
8. Tarapcik P (1992) Chem Lett 86:648-652
9. Thompson M, Howarth RJ (1980) Analyst 105:1188-1195
10. Novitskii PV, Zograf IA (1985) Otsenka pogreshnostei rezultatov izmerenii (Errors evaluation of results of measurements). Energoatomizdat, Leningrad, Russia
11. Tarapcik P, Buzinkaiova T, Polonsky J, Dlouha M, Chromek F (1997) Metrologia a skusobnictvo, 2:6-10; 3:10-14
12. Kragten J (1994) Analyst 119:2161-2166
13. Gosman A, Jech C (1989) Jaderne metody v chemickem vyskumu (Nuclear method in chemical research). Academia, Prague, Czech Republic
14. Fourest B (1983) Coefficients de diffusion limites et structure de quelques ion aquo d'elements 5f et 4f. PhD Thesis, Institute de Physique Nucleaire, Orsay, France
Accred Qual Assur (2000) 5: 118-94
© Springer-Verlag 2000
- Similar effects, same time: check carefully whether similar effects are accounted for twice. In that case resolve the duplication.
- Similar effects, different instances: re-label.

3. Quantifying uncertainty components

Measure or estimate the size of the uncertainty associated with each potential source of uncertainty identified. The goal of the component by component ap-
54 M. Rosslein
An example for evaluating the measurement uncertainty utilising the component by component approach

The evaluation process for the measurement uncertainty utilising the component by component approach is illustrated in the following textbook example:

1. Specification

The concentration of a hydrogen chloride solution (HCl) is determined by titration against freshly standardised sodium hydroxide solution (NaOH). It is assumed that the NaOH concentration is known to be of the order of 0.1 mol/l. The end-point of the titration is determined by an automatic titration system using a combined pH-electrode to measure the shape of the pH-curve.

Procedure

The measurement sequence to determine the HCl concentration has the following stages (Fig. 2):
1. Transfer an aliquot of 15 ml of HCl into the titration vessel using a bulb pipette.
2. Approximately 50 ml of ion-free water is added to the vessel, the solution is titrated with the NaOH and the pH-curve is recorded. The end-point of the titration is determined from the shape of the recorded curve.

Calculation

$c_{\mathrm{HCl}} = \frac{c_{\mathrm{NaOH}} \cdot V_{\mathrm{Tit}}}{V_{\mathrm{HCl}}}$ (mol/l)

$c_{\mathrm{HCl}}$: concentration of the HCl solution (mol/l)
$c_{\mathrm{NaOH}}$: concentration of the NaOH solution (mol/l)
$V_{\mathrm{Tit}}$: titration volume of NaOH solution (ml)
$V_{\mathrm{HCl}}$: aliquot of HCl titrated with NaOH solution (ml)

Fig. 2 Flow chart of the analytical procedure

2. Identifying and analysing uncertainty sources

The aim of this step is to identify all major uncertainty sources and to understand their effect on the measurand and its uncertainty. This is best done by drawing a cause and effect diagram using the procedure described by Ellison et al. [2] (Fig. 3).

3. Quantifying uncertainty component by component

Within step 3 each uncertainty source identified in step 2 has to be quantified using relevant data and then converted to a standard uncertainty.

The concentration of the NaOH solution is 0.10215 mol/l with a standard uncertainty of 0.00009 mol/l.

$V_{\mathrm{HCl}}$

1. Repeatability. The uncertainty due to variability in filling and delivery is determined as a standard deviation of 0.0037 ml.

2. Calibration. The uncertainty on the stated internal volume is given by the manufacturer as ± 0.02 ml. This value is transformed into a standard uncertainty assuming a triangular distribution: $0.02/\sqrt{6} = 0.0082$ ml.

3. Temperature. The effect of the temperature difference between the pipette calibration temperature and the laboratory environment can be calculated from an estimated temperature range and the coefficient of volume expansion.
Evaluation of uncertainty utilising 55
[Figure: cause and effect diagram for c(HCl), with Repeatability, Calibration and Temperature branches, including those of V(HCl)]
$\frac{15\ \mathrm{ml} \cdot 2.1 \cdot 10^{-4}\ {}^{\circ}\mathrm{C}^{-1} \cdot 4\ {}^{\circ}\mathrm{C}}{\sqrt{3}} = 0.0073\ \mathrm{ml}$.

Combining the three contributions to the uncertainty $u(V_{\mathrm{HCl}})$ of the volume $V_{\mathrm{HCl}}$ gives a value of

$u(V_{\mathrm{HCl}}) = \sqrt{0.0037^2 + 0.0082^2 + 0.0073^2} = 0.012$ ml.

$V_{\mathrm{Tit}}$

1. Repeatability of the volume delivery. The variability of the delivered volume of the piston burette is determined as a standard deviation of 0.004 ml.

2. Calibration. The limits of accuracy of the delivered volume are indicated by the manufacturer as ± 0.03 ml for a 20-ml piston burette. The standard uncertainty is calculated assuming a triangular distribution: $0.03/\sqrt{6} = 0.012$ ml.

3. Temperature. The uncertainty due to the lack of temperature control is calculated in the same way as for $V_{\mathrm{HCl}}$: 0.0073 ml.

4. Repeatability of the end-point detection. The repeatability of the end-point detection is thoroughly investigated during the method evaluation under the given conditions and a standard uncertainty of 0.004 ml is found appropriate.

5. Bias of the end-point detection. During the method evaluation no indication of any bias was found, because a strong base (NaOH) is used to titrate a strong acid (HCl). This leads to a large change of the pH-values around the end-point, resulting in a very accurate determination of the value of the end-point.

$V_{\mathrm{Tit}}$ is determined to be 14.89 ml and combining the four remaining contributions to the uncertainty $u(V_{\mathrm{Tit}})$ gives a value of:

$u(V_{\mathrm{Tit}}) = \sqrt{0.004^2 + 0.012^2 + 0.0073^2 + 0.004^2} = 0.015$ ml.

The above section shows that even for a relatively simple textbook example the process of calculating the standard uncertainties for each of the components is time-consuming.

4. Calculate the combined standard uncertainty

The intermediate values, their standard uncertainty and their relative standard uncertainty are shown in Table 1.

Using the values obtained above:

$c_{\mathrm{HCl}} = \frac{0.10215 \cdot 14.89}{15} = 0.10140$ mol/l

and

$u(c_{\mathrm{HCl}}) = 0.00016$ mol/l.
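The arithmetic of steps 3 and 4 can be retraced with a short script. This is only a sketch under stated assumptions: the helper names (`triangular`, `rectangular`, `root_sum_squares`) are illustrative and not from the article, the temperature term is divided by the square root of 3 as in the formula quoted in the text, and the combined uncertainty is obtained by adding the relative standard uncertainties of the three factors in quadrature, which is the usual rule for a pure product/quotient model.

```python
import math

def triangular(half_width):
    """Standard uncertainty for limits +/- half_width, triangular distribution."""
    return half_width / math.sqrt(6)

def rectangular(half_width):
    """Standard uncertainty for limits +/- half_width, rectangular distribution."""
    return half_width / math.sqrt(3)

def root_sum_squares(values):
    """Combine independent standard uncertainties in quadrature."""
    return math.sqrt(sum(v * v for v in values))

# Input values quoted in the example
c_naoh, u_c_naoh = 0.10215, 0.00009   # mol/l
v_hcl, v_tit = 15.0, 14.89            # ml

# Temperature term: volume * expansion coefficient * temperature range
u_temp = rectangular(15.0 * 2.1e-4 * 4.0)

# Pipette: repeatability, calibration, temperature
u_v_hcl = root_sum_squares([0.0037, triangular(0.02), u_temp])
# Burette: delivery repeatability, calibration, temperature, end-point repeatability
u_v_tit = root_sum_squares([0.004, triangular(0.03), u_temp, 0.004])

c_hcl = c_naoh * v_tit / v_hcl
relative_u = root_sum_squares(
    [u_c_naoh / c_naoh, u_v_tit / v_tit, u_v_hcl / v_hcl]
)
u_c_hcl = c_hcl * relative_u

print(f"u(V_HCl) = {u_v_hcl:.3f} ml, u(V_Tit) = {u_v_tit:.3f} ml")
print(f"c_HCl = {c_hcl:.5f} mol/l, u(c_HCl) = {u_c_hcl:.5f} mol/l")
```

Working with unrounded intermediate values gives the same figures as the text once they are rounded to the number of digits quoted there.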
Table 1 The standard uncertainties and relative standard uncertainties of the components used to calculate the combined standard
uncertainty
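The sensitivity of the result to the distribution assumed for the burette calibration limits (Table 2) can be explored with the sketch below. It is an illustration, not the author's calculation: the function name `budget` is hypothetical, the divisor 3 for the normal case corresponds to the factor $1/\sqrt{9}$, and the remaining component values are the rounded figures quoted in the example. Because the script does not round intermediate values, the last digit of u(V(T)) may differ from the tabulated value.

```python
import math

# Divisors applied to the manufacturer's limit of +/- 0.03 ml
DIVISORS = {
    "rectangular": math.sqrt(3),
    "triangular": math.sqrt(6),
    "normal": 3.0,
}

def budget(divisor):
    """Return (u_cal, u_v_tit, u_c_hcl) for a given calibration divisor."""
    u_cal = 0.03 / divisor
    # delivery repeatability, calibration, temperature, end-point repeatability
    u_v_tit = math.sqrt(0.004**2 + u_cal**2 + 0.0073**2 + 0.004**2)
    c_hcl = 0.10215 * 14.89 / 15.0
    relative_u = math.sqrt(
        (0.00009 / 0.10215) ** 2 + (u_v_tit / 14.89) ** 2 + (0.012 / 15.0) ** 2
    )
    return u_cal, u_v_tit, c_hcl * relative_u

for name, divisor in DIVISORS.items():
    u_cal, u_v_tit, u_c = budget(divisor)
    print(f"{name:12s} {u_cal:.3f} ml  {u_v_tit:.3f} ml  {u_c:.5f} mol/l")
```

The combined uncertainty changes only from about 0.00015 to 0.00018 mol/l across the three assumptions, which is the point the table makes.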
Table 2 The influence of the choice of shape of the distribution on the combined standard uncertainty
Distribution   Factor          u(V(T; cal))   u(V(T))    u(c(HCl))
Rectangular    $1/\sqrt{3}$    0.017 ml       0.019 ml   0.00018 mol/l
Triangular     $1/\sqrt{6}$    0.012 ml       0.015 ml   0.00016 mol/l
Normal         $1/\sqrt{9}$    0.010 ml       0.014 ml   0.00015 mol/l

[Figures: cause and effect diagrams for c(HCl) with regrouped branches: Repeatability, Temperature, Calibration, V(HCl), V(Tit; delivery), V(Tit; End-point)]

Similar components, such as repeatability or temperature dependence, are combined in Fig. 7. These components influence different parameters of the equation of the measurand, but combining them in this new way makes it possible to visualise their overall effect on the measurement and the measurement uncertainty. Employing this approach, the user of the data might gain a better understanding of the measurement process. This is another benefit of the evaluation of measurement uncertainty besides its major objective of obtaining comparable results.

A closer look at the process of combining components shows that it is very similar to the grouping of influencing quantities in step 3 of the approach utilising validation data (Figs. 8, 9). In this approach the influencing quantities have to be grouped to accommodate the sometimes limited amount of information obtained during a validation study. GUM comments on combining individual components in its introduction [1]: "The actual quantity used to express uncertainty should be:
- internally consistent: it should be directly derivable from the components that contribute to it, as well as independent of how these components are grouped and of the decomposition of the components into subcomponents".

Some drawbacks of the component by component approach

The component by component approach has a few drawbacks, which mostly reduce the efficiency of the evaluation process. For example, there are always components whose effect cannot be directly measured. In the above textbook example, the repeatability of the end-point determination is such a component. The variation of the end-point detection is always part of the overall repeatability of the experiment and there is no other means to directly determine its size in an independent measurement. In contrast, the other two repeatability components, the variations of the two volume deliveries, have been determined independently in a series of ten delivery and weight experiments.

The determination of the different components in independent experiments increases the risk of neglecting correlation between these components in a given analytical procedure. The aim of experiments to determine the size of a single component is to reduce any other influences; in particular, one tries to avoid correlations. In a given routine analysis, correlations between some components might occur. Therefore the relevance of these correlations has to be investigated if one wants to use the component by component approach correctly. In contrast, overall method performance parameters are determined during validation studies. These parameters already take into account most of the correlations between different components.

The drawbacks outlined above show that the utilisation of validation data can be more efficient than the component by component approach for evaluating measurement uncertainty. However, the former approach does not provide information about the relative size of the components. This information is important when it is necessary to develop the method further to reduce its uncertainty.
References
affairs for large data dispersions, e.g. in interlaboratory tests. Nevertheless, the very definition of uncertainty implies at least the nearly symmetric distribution of values about the measured result (see e.g. [5]).

The applicability of other data distributions, such as the logarithmically normal, for analytical data is also discussed, but the procedures currently used in analytical data analysis (e.g., homogeneity, outlier tests etc. [6, 7]) have mainly been developed for normal data distributions. Asymmetry can be due to contamination and other errors, but one general reason is asymmetry of the concentration scale - zero acts as the lower limit and there is no upper limit. According to our analysis, the distribution peak position in relation to the mean value and the symmetry of the data scatter of the interlaboratory test data can be improved essentially if the symmetric scale of logarithms of concentration (- infinity, + infinity) is used instead of the concentration itself. The difference is important at large data scatters only, and all the data scales are almost symmetric if the data dispersion is small enough. So the problem can be formulated as follows: can scales be found in which the applicability of the normal distribution would be wider than in the linear scale?

In some cases, the applicability of the log-normal distribution has been strongly contested (e.g. at the detection limit [8]). Even in these cases, some arguments seem to be reformulated. Of course, no distribution is universal (the normal one as well), and if more suitable types of distribution (or data scales) exist, these ought to be applicable only to some definite types of measurement data. Concerning data distributions in interlaboratory tests, more extensive analysis of the interlaboratory data distributions from different test schemes ought to be carried out. Better applicability of the normal distribution could enable wider fitness of the statistical approach and wider possibilities for validation of the analytical methods. Besides this, possibly some of the results at the large concentration "tail" could be regarded as normal. Estimation of uncertainty in the case of other distributions, including the asymmetric ones, also needs attention.

The 1/f noise approach to understanding bias, error functions and uncertainty reduction possibilities

Repeatability standard deviation of the measurements with the same apparatus at identical conditions is usually limited by low-frequency noise. This noise is $1/f^{\alpha}$ frequency dependent. Often $\alpha$ is close to 1, and the noise is known as 1/f noise (in parallel, the term flicker noise is used). The phenomenon, while not fully understood (see e.g. [9]), takes part in many natural, physical, chemical, social, economical etc. phenomena and is of special interest when accuracy of the measurements is considered. In analytical measurements, the measurement time usually exceeds the time at which the 1/f noise, which increases with time, becomes comparable to other noises. Usually this takes place in the order of seconds.

Long-lasting measurement sets seem to be characteristic of chemical analysis. Quality control charting [10], interlaboratory studies [6, 7, 11], and applications of the certified reference materials are typical examples. In spite of this, the concept of 1/f noise is hardly ever applied to analytical measurements with large time intervals. The following question in relation to 1/f noise is considered in this paper: is the 1/f noise concept applicable to the long-term variations of analytical data, where the information on the characteristic variation time or variation time dependences can be of importance?

The result of an attempt to measure the low-frequency noise spectrum from AAS soil analysis quality control data lasting for almost two years is presented in Fig. 1. Fourier analysis of the traditional quality control data time series was carried out. It follows that 1/f noise seems to be a useful approximation for such a low-frequency spectrum. In addition to the common 1/f character of the noise, some characteristic frequencies at about 1/(25 days) and 1/(2 months) are observed.

The example presented seems to be of interest from different points of view. First of all, characteristic noise frequencies or time constants of the main slowly varying uncertainty sources are displayed. This is an additional, parallel source of information compared to the ruggedness tests and uncertainty evaluation, and can be used to confirm, support or supplement the obtained conclusions or to control the effectiveness of uncertainty reduction procedures. Not only the characteristic frequencies but also the contributions of the corresponding uncertainty sources to the combined uncertainty can be obtained as integrals of the noise spectral density in the characteristic frequency ranges.

Fig. 1 Noise spectrum of the AAS quality control analytical results: solid line from measured results for the period 1995.01-1996.09, dashed line 1/f dependence. Horizontal axis: frequency, 1 div = 3.6·10^-8 Hz, with marks at 1/year, 1/month and 1/(10 days)
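The kind of Fourier analysis described for the quality control time series can be sketched as follows. This is an illustration only, not the procedure actually used for Fig. 1: it assumes evenly spaced control results, uses a plain discrete Fourier transform, and the function names `periodogram` and `loglog_slope` are hypothetical. The negative of the fitted log-log slope is a rough estimate of the exponent $\alpha$ in a $1/f^{\alpha}$ spectrum.

```python
import cmath
import math

def periodogram(series, dt):
    """Return (frequencies, power) at the positive Fourier frequencies
    of an evenly sampled series with sampling interval dt."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        freqs.append(k / (n * dt))
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def loglog_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency).
    For a 1/f**alpha spectrum the slope is approximately -alpha."""
    pts = [(math.log10(f), math.log10(p)) for f, p in zip(freqs, power) if p > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx

# Synthetic check: a sinusoid with 5 cycles in 64 samples peaks at 5/64
demo = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
demo_freqs, demo_power = periodogram(demo, dt=1.0)
peak_frequency = demo_freqs[demo_power.index(max(demo_power))]
print(peak_frequency)
```

Characteristic frequencies such as those mentioned in the text would show up as peaks above the fitted power-law trend; real control data are often unevenly spaced and would need resampling first.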
Uncertainty - statistical approach, 1/f noise and chaos 61
It is easy to understand that the number of such long measurement data sets available from an individual laboratory is restricted. More extensive and profound studies can hardly be carried out without interlaboratory collaboration. Comparison of similar data from different laboratories would allow one to distinguish experimentally between the uncertainty sources characteristic of the method and those of individual laboratories.

The integral of the noise spectrum up to the frequencies comparable to the inverse characteristic time of the individual measurement would contain the full uncertainty due to all sources, representing the transition from one measurement to the long time scale. Of course, the integration frequency range is a problem. The high-frequency limit of the order of the inverse time of the individual measurement seems to be natural. The low-frequency limit, the longest time scale, depends on the practical problems being considered. In any case, we ought to take into account that if the measurement time being regarded is comparable with the characteristic time of some uncertainty source, such a source will reveal itself only as a tendency, trend, or bias. If the measurement lasts for an essentially longer time, such a source will manifest itself as an error function. These circumstances represent a more rigid, mathematical, basis to the bias-error relativity [5] problem and could be important when the collaborative trial repeatability and reproducibility data are subjected to uncertainty and bias estimates [12].

If the noise spectrum is regarded as information for assessment of the integral uncertainty, then in spite of the spectral density increment at low frequencies, an accurate value of the lowest frequency is not important. As an example, a time interval of between 1 and 100 years covers a frequency range of only about $3 \cdot 10^{-8}$ Hz, while time intervals from 1 year to 1 min cover about $2 \cdot 10^{-2}$ Hz. So, for example, contributions to uncertainty for pure 1/f noise from 1 s to 1 h and 1 h to half-year time intervals would be comparable.

Probably the low-frequency noise studies need too much data for routine uncertainty estimates. It is almost accepted that the total uncertainty can be assessed as the combined effect of the identified uncertainty sources. But the situation is possible when it is mainly due to a large number of comparatively weak sources, and experimental uncertainty measurement from data dispersion or a noise spectrum would be highly desirable in such cases.

Another problem is how far the uncertainty can be reduced. Many methods can be used for such reduction. The most straightforward procedure seems to be the "bottom-up" assessment of the contributions from all the uncertainty sources and modification of the analytical method to eliminate the largest of them. As was discussed earlier, the noise spectrum studies can aid such identification and assessment as well. If further elimination of the uncertainty sources is not possible, multichannel measurements can be used to monitor the variations of the important conditions of the experiment (see e.g. [13]). If such information is available, many methods can be used to exclude the effect of the varying experimental conditions from the measurement results through determination of the relations between those variations and variations of the analytical signal. Cross-correlation measurement between the final signal and the variations of the experimental conditions is a very sensitive tool to ascertain how complete the exclusion was. How far can we go in such a process? If the 1/f noise is, as we understand till now, really a general phenomenon due to a large number of effects, this component can hardly be reduced significantly, while the characteristic (excess) noise components (see Fig. 1) ought to be reducible. For the case represented in the figure (the accuracy of the spectrum itself is still under study), contributions of the two noises to the uncertainty seem to be comparable, and uncertainty reduction by a factor about 2 can be expected, and only fundamental changes to the system (including preparation of extremely low-noise methods as a special case) could result in a different 1/f noise level.

Nonlinear phenomena - oscillations and chaos

Hitherto, stochastic (or random) fluctuations have been considered. Sometimes special, quasi-oscillatory variations of the analytical signal are observed. The amplitude of the oscillations is comparable with the analytical signal itself, approaching two or more possible values of the signal. Variations of the integral analytical signal depend on both the signal intensity dynamics and the number of the spikes observed. Because of the limited number of such spikes, some special values of the integral analytical signal are preferable, and the distribution of this quantity would not be normal in this case. The well-known example of such variations is spikes in graphite furnace atomic absorption measurements (e.g. [14]). In the still-continuing discussion on the detailed mechanism of this phenomenon, processes being considered usually include autocatalytic, essentially nonlinear, reactions.

More or less similar signal-time dependences seem to follow analytical sample transformations more often than was traditionally expected. Arc discharge is well known as a radiation source susceptible to instabilities. An example of the quasi-oscillatory radiation intensity time dependence from original measurements in a carbon arc is presented in Fig. 2. Two repetitive intensity peaks were very often observed and were traditionally attributed to different volatilities of the sample components or reaction products or different volatilization mechanisms. The studies reported in [15] were possibly
62 P. Serapinas
the first to take notice of the oscillatory character of the process. It seems characteristic that often the quantity of the sample material involved is too small (or the observation time is too short) for the dynamical structure of the sample transformations to be explicitly displayed, and only one or two peaks are observed. In the atomization of solids and slurries such a situation can be expected more often now than hitherto [16].

Fig. 2 Two individual Cu 261.8-nm spectral line intensity time dependences (horizontal axis: time, s, 0 to 300). Spiked multielemental oxide sample atomization in carbon arc: current 20 A, sample 15 mg, analyte concentrations 1 mg/g. Data points are means from 14 measured time signal values

The spike phenomenon, while interesting in itself, reduces repeatability and needs special attention in organization of measurements, but can be considered in the context of the common uncertainty assessment. The essential feature seems to be the limited reproducibility or irreproducibility of the time scenario or even the characteristic time constants observed in measurements similar to those of Fig. 2. From the mathematical point of view, if at least three reactions or processes, one of them being nonlinear, are involved, it can result in chaotic dynamics of the phenomenon in the sense that very small variations of the initial conditions can change the character of the phenomenon as a whole. Such numbers of reactions may well be the rule rather than the exception in the atomization of multielement samples. Thus, competition between different reaction mechanisms, autocatalytic and other nonlinear atomization processes, seems to be a possible cause of oscillations and chaos in the analytical chain.

One of the early basic principles of analytical chemistry was to allow the reactions providing the analysis results to proceed to completion. In the fast modern [...] character and reproducibility of the process.

Because of the high sensitivity of the causes of nonlinear phenomena to the initial and dynamic characteristics of the system, such phenomena, especially in interlaboratory tests, can be appreciable sources of uncertainty that are difficult to assess. Of course, even in chaotic phenomena, limits of variation of the signal exist, but the phenomenon as it affects analytical applications is hardly studied and can hardly be properly accounted for in certain cases. Understanding of the problem as such is in progress: "To date, chemometrics has dealt with systems as deterministic or random, yet many chemical systems behave chaotically" [17]. Studies of nonlinear phenomena should help in understanding atomization mechanisms, revealing links between the reproducibility, repeatability and ruggedness tests, and should present additional information for the assessment of balance and reduction of uncertainty.

Conclusions

The problems of uncertainty understanding and assessment in the range of small (a few per cent) and larger uncertainties seem to be different. Experimental on-line uncertainty analysis should be of interest in both cases. Comprehensive information inherent in interlaboratory tests and quality control measurements could (and possibly should) be more effectively gained, summarized and used in method development. Preliminary results show both the 1/f and characteristic noises in the quality control data. Attention to the nonlinear phenomena including chaos seems to be essential in method preparation and performance studies. From this point of view, problems in uncertainty understanding and reduction remain, and more intensive and in-depth interlaboratory collaboration would be highly desirable for achieving faster progress.

Acknowledgements The author thanks Dr. J. Lubyte, Agrochemical Research Center, Kaunas, for presenting quality control data and the Regional Programme on Quality Assurance PRAG-III for financial support for the participation at the 2nd Workshop "Measurement Uncertainty in Chemical Analysis - Current Practice and Future Directions".
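The figures quoted in the discussion of the integration limits can be checked numerically. The sketch below assumes an idealised spectral density proportional to 1/f, for which the variance contributed between two time scales is proportional to the logarithm of their ratio; the function names and the unit amplitude are illustrative choices, not part of the paper.

```python
import math

YEAR = 365.25 * 24 * 3600.0  # seconds

def bandwidth(t_short, t_long):
    """Width in Hz of the frequency interval between 1/t_long and 1/t_short."""
    return 1.0 / t_short - 1.0 / t_long

def one_over_f_variance(t_short, t_long, amplitude=1.0):
    """Integral of amplitude/f from f = 1/t_long to f = 1/t_short."""
    return amplitude * math.log(t_long / t_short)

# Frequency ranges mentioned in the text
print(bandwidth(YEAR, 100 * YEAR))   # 1 to 100 years
print(bandwidth(60.0, YEAR))         # 1 min to 1 year

# Variance contributions of pure 1/f noise over two very different spans
short_span = one_over_f_variance(1.0, 3600.0)        # 1 s to 1 h
long_span = one_over_f_variance(3600.0, YEAR / 2)    # 1 h to half a year
print(long_span / short_span)
```

Although the second span is thousands of times longer in time, its contribution is nearly the same as that of the first, because only the ratio of the limiting frequencies matters for a 1/f spectrum.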
References
1. Eurachem (1995) Quantifying Uncertainty in Analytical Measurement (1st edn). Laboratory of the Government Chemist, London
2. Horwitz W (1997) VAM Bulletin no. 16:5-6
3. Williams A (1997) VAM Bulletin no. 16:6-7
4. ISO (1993) International Vocabulary of Basic and General Terms in Metrology. International Organization for Standardization, Geneva
5. Analytical Methods Committee (1995) Analyst 120:2303-2308
6. Horwitz W (1988) Pure Appl Chem 60:855-864
7. Sutarno R (1993) Procedure for statistical evaluation of analytical data resulting from international tests. ISO TC 102 N 458
8. Thompson M, Howarth RJ (1980) Analyst 105:1188-1195
9. Hooge FN (1997) In: Claeys C, Simoen E (eds) Noise in physical systems and 1/f fluctuations, Proc 14th Int Conf, World Scientific, Singapore New Jersey London Hong Kong, pp 3-10
10. Howarth RJ (1995) Analyst 120:1851-1873
11. Thompson M, Wood R (1993) Pure Appl Chem 65:2123-2144
12. Ellison S (1997) In: Measurement uncertainty in chemical analysis - current practice and future directions, Proc 2nd Workshop. BAM, Berlin
13. Oberauskas J, Serapinas P, Salkauskas J, Svedas V (1981) Spectrochim Acta B 36:799-807
14. L'vov BV (1996) Spectrochim Acta B 51:533-541
15. Katasus Portuondo MR, Petrov AA, Sheinina GA (1980) Zh Prikl Spektr 33:19-24
16. Jackson KW, Chen G (1996) Anal Chem 68:243R-244R
17. Brown SD, Sum ST, Despagne F, Lavine K (1996) Anal Chem 68:23R
Accred Qual Assur (2002) 7:153-158
DOI 10.1007/s00769-002-0440-8
© Springer-Verlag 2002
excitation of the sample. The ICP-AES instrument was a Perkin-Elmer model Plasma II emission spectrometer.

Calibrants

All samples were prepared by dilution of a Merck certified lead standard with a nominal concentration of 1000 mg/L. Aliquots of the standard were taken with a calibrated pipette and diluted with a mixture of hydrochloric and nitric acids in calibrated volumetric flasks.

Pure acid mixture was used as a blank, and calibration samples were prepared with nominal concentrations of 0.25, 1.00, 2.00, 5.00, and 10.00 mg/L. A nominal 1.5 mg/L sample was chosen as an unknown. Samples were labelled A, B ... G in random order.

Uncertainty budgets prepared in accordance with Example A1 in Ref. [2] provided an estimated relative standard uncertainty of 0.22% for the dilution process.

Measurements

Ten people working in pairs carried out all measurements in one afternoon. Each participant measured all samples in alphabetic order, and the instrument was zero-adjusted at the beginning of each series.

Readings in arbitrary units - counts - are presented in Table 1 together with the mean and standard deviation for each level.

The calibration data - Levels A to F - were tested in accordance with the procedure recommended by ISO [1] as Example 3. No outliers were detected. In addition we used ANOVA to detect any influence from the order of measurement or from a possible difference between the teams; no such effects were found.

The variability of measurements expressed as standard deviations - SD - is therefore assumed to depend only on the measurement level.

Results and discussion

The first step in predicting the uncertainty of calibration is to establish the relationship between the standard deviation and the level of the signal, and this is dealt with in paragraph 7.5 in [1]. Three types of functional relationship are considered for the reproducibility standard deviation, $s_r$, and the mean level, $m$

I $s_r = bm$, where $b$ is the slope
II $s_r = a + bm$, where $a$ is the intercept
III $s_r = Cm^d$, $d \le 1$, where $d$ is the logarithmic slope

Simple linear regression analysis may be used for the determination of the parameters $a$ and $b$, and from the logarithm of expression III

$\log s_r = \log C + d \cdot \log m$

we can determine in the same way $\log C$ as the intercept $a$, and $d$ as the slope $b$.

These expressions are mathematically simple, but make no physical or chemical sense. Much more sense is found in Ref. [2] Appendix E4, which reflects the normal situation that the standard deviation is independent of the result, $x$, at low signal levels and proportional to it at high levels

E4 $u(x) = \sqrt{s_0^2 + (x \cdot s_1)^2}$ (1)

As proposed in paragraph E.4.5.2 in Ref. [2], linear regression of $u(x)^2$ on $x^2$ can be used to determine $s_0$ as the square root of the intercept, and $s_1$ as the square root of the slope.

Relationships found by simple linear regression

The quality of these representations can be judged by the statistic T [3], which compares the deviations of the observed values from the calculated values with the uncertainty of the observations

$T = \sum_{1}^{n} \frac{(\mathrm{observed} - \mathrm{expected})^2}{(\mathrm{standard\ uncertainty})^2}$ (2)
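Equations (1) and (2) can be illustrated with a short sketch. It is not the authors' calculation: the readings of Table 1 are not reproduced here, so the levels below are hypothetical and the standard deviations are generated from Eq. (1) itself with illustrative parameters, simply to show that the regression of $u(x)^2$ on $x^2$ recovers $s_0$ and $s_1$. The function names are assumptions introduced for this example.

```python
import math

def fit_uncertainty_model(levels, sds):
    """Ordinary least squares of sd**2 on level**2 (Eq. 1).
    Returns (s0, s1) as square roots of intercept and slope.
    Raises ValueError if the fitted intercept or slope is negative."""
    xs = [x * x for x in levels]
    ys = [s * s for s in sds]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.sqrt(intercept), math.sqrt(slope)

def predicted_u(x, s0, s1):
    """Standard uncertainty of a reading x according to Eq. (1)."""
    return math.sqrt(s0 ** 2 + (x * s1) ** 2)

def t_statistic(observed, expected, uncertainties):
    """Sum of squared deviations scaled by the standard uncertainties (Eq. 2)."""
    return sum(((o - e) / u) ** 2
               for o, e, u in zip(observed, expected, uncertainties))

# Hypothetical signal levels in counts (not the data of Table 1)
levels = [0.0, 250.0, 1000.0, 2000.0, 5000.0, 10000.0]
# Illustrative parameters used only to generate noise-free test data
true_s0, true_s1 = 8.43, 0.0154
sds = [predicted_u(x, true_s0, true_s1) for x in levels]

s0, s1 = fit_uncertainty_model(levels, sds)
print(f"s0 = {s0:.2f} counts, s1 = {s1:.4f}")
```

With real standard deviations the value of T computed from the fitted model would then be compared with a chi-square critical value for the appropriate number of degrees of freedom.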
Table 2 Test for adequacy of simple functional relationships for representation of the uncertainty of a reading
Level   SD (counts)   u(SD) (counts)   Linear regression   Log/log regression   sqr/sqr regression
Table 3 Test for goodness of fit of functional relationships for representation of the uncertainty of a reading
Level   SD (counts)   u(SD) (counts)   Linear weighted   sqr/sqr weighted   Non-linear weighted
ed according to Eq. (2). With 5 degrees of freedom the critical values for T are 15.1 for p < 1% and 11.1 for p < 5%, which means that neither of the weighted regressions gives satisfactory agreement with the observations. Only the non-linear - unweighted - fit is acceptable at the 5% level of significance.

Moreover, the actual deviations seen in the last column of Table 3 are much more evenly distributed over the range than in the other two cases, which confirms that the non-linear case is the best overall representation of the data from Table 1. This is confirmed by the logarithmic plot of the standard deviation as a function of the level shown in Fig. 1.

With the ready availability of non-linear regression to users of Microsoft Excel the computational effort is probably less than that associated with the iterative, weighted regression. Consequently we are going to use the non-linear results presented in Table 4 for estimating the uncertainty of our calibration.

Linearity of calibration function

The calibration function expresses the relationship between the known concentrations of Pb in the reference solutions A to F and the readings of the instrument; to the extent that the superposition principle is observed this function is a straight line. However, deviations are expected to occur at higher levels of the indicator, so that the linearity range becomes restricted.

Instrument calibration is found by weighted linear regression of the mean readings using the reciprocal square of their respective uncertainties as the initial weight and Eq. (1) to calculate weights for subsequent iterations. The linearity is tested by the value of T from Eq. (2) with the standard uncertainties calculated from Eq. (1) using the original data, and results are presented in Table 5. Simple linear regression gave in all cases a less satisfactory fit.

It is concluded that a perfectly satisfactory linear calibration is maintained up to approx. 2 mg/L, which means that the observed deviations from the linear relationship are fully accounted for by the assumed uncertainty of the readings. Thus no additional uncertainty contribution needs to be included in the uncertainty budget.

The uncertainty budget

The uncertainty budget for the determination of Pb by ICP-AES is reduced to

$u(y) = \sqrt{s_0^2 + (y \cdot s_1)^2}$

with $s_0 = 8.43$ and $s_1 = 0.0154$, where $y$ is the reading. The conversion of a reading to a concentration is linear up to at least 2 mg/L, and the additional uncertainty of 0.22% is negligible in comparison with $u(y)$, which is always larger than 1.5%.

Experimental verification of the budget is based on the separate determination by each of the participants of the concentration of Pb in the unknown sample G from Table 1. With the uncertainty budget above each participant will calculate a result and its uncertainty from his own data alone.

The participant may choose to assume a linear calibration and use all his observations to arrive at a result, or the participant may calculate the result by interpolation only between the two calibration points bracketing the unknown.

Results based on linear calibration

Each participant determines his own calibration data from his readings in Table 1 of the reference solutions A
68 K. Heydorn . T. Anglov
Table 5 Tests for linearity of calibration based on weighted linear regression over a decreasing range
Reference (mg/L)   Mean counts   Uncertainty u(counts)   Linear weighted calibration   Sq. residual   Linear weighted calibration   Sq. residual   Linear weighted calibration   Sq. residual
Table 6 Results obtained by using a linear calibration function over a range from zero to at least 2 mg/L
to F with the known concentrations given in Table 5. The zero-offset and the slope are determined by weighted regression assigning to each reading y a weight of $u(y)^{-2}$, as calculated from the uncertainty budget above. The linear range that may be used is determined in exactly the same way as in Table 5.
With a 95% confidence level of T = 9.49 at 4 degrees of freedom 8 out of 10 participants could use all 6 calibration points to determine their calibration data. One participant had to eliminate the highest level and one the two highest levels; only the last case was therefore restricted to the range found in Table 5.
Calibration data and results for the concentration of Pb in sample G are shown for individual participants in Table 6 together with the estimated uncertainty of their predicted result, $x_{pred}$. Its uncertainty is calculated in accordance with the approach presented in Appendix E.3 of Ref. [2], while using the T-values calculated for the weighted regression from paragraph 7.5 in Ref. [1]

$$u^2(x_{pred}) = \frac{1}{b^2}\left[u^2(y_{obs}) + \frac{T}{\sum w\,(n-2)}\left(1 + \frac{(x_{pred}-\bar{x})^2}{\overline{x^2}-\bar{x}^2}\right)\right] \qquad (5)$$

The agreement between individual results is actually better than expected from their estimated uncertainties, and the calculated T-statistic for 9 degrees of freedom is far below the 95% level of significance of 16.9.
The calibration uncertainty as expressed by the second term in the expression (5) is based on 4-6 observations and therefore a minor contribution compared to the measurement uncertainty of the reading for sample G based on only one observation. Thus the variability among final results is probably less than the variability of the readings on which these results are based, which indicates that the readings obtained by a particular participant are not completely independent of each other.

Results based on interpolation

Calculation of results may also be carried out without assuming a linear calibration function over a particular range, but only between two calibration points bracketing the unknown. This is simply done as linear interpolation and has the further advantage that uncertainty in both x and y values are easily taken into account.
Results for the unknown sample G are obtained by each participant using simple interpolation between reference levels A and D; the interpolation formula is entered into a spreadsheet, and the procedure described by Kragten in Appendix E.2 in Ref. [2] calculates the com-
Calibration uncertainty 69
Participant Initials Reference samples Sample G Reading Result mg/L Uncertainty u(x)
bined uncertainty from the uncertainties of the readings, as well as from the assigned value of the reference samples.
Table 7 lists results and their uncertainties, which show that also in this case the variability may be less than expected from the uncertainties with a value of the statistic T that is well below the 95% level.

Bias

The unknown sample G was prepared by dilution of the same lead standard as the calibrants, and its concentration was 1.485 mg/L with a standard uncertainty of 0.008 mg/L.
Weighted mean values of the analytical results obtained by the participants are presented in Tables 6 and 7 together with their standard uncertainty. In either case the bias is positive, but barely significant at the 5% level of confidence.
For the results based on linear calibration the bias is 0.022 mg/L with a standard uncertainty of 0.013 mg/L, which is not significantly different from zero. For the interpolated results the bias is 0.031 mg/L with a standard uncertainty of 0.014 mg/L which slightly exceeds the critical value of z = 1.96.

Conclusion

Neither simple nor weighted linear regression yielded satisfactory results for the functional representation of standard deviation as a function of level with any of the alternatives proposed in Ref. [1]. Only the use of non-linear regression of the logarithmic standard deviation on the expression recommended in Ref. [2], gave acceptable agreement with experimental data.
An uncertainty budget based on this representation for the determination of Pb in an aqueous solution yielded experimental results in statistical control and without bias.

Acknowledgements The authors are indebted to the Danish Institute of Occupational Health for making their ICP-AES available to the first course in "Metrology in Chemistry" held at Novo Nordisk in the year 2000. The original readings made by the participants in the course are used in this study exactly as they were made on 2000-10-09. In particular we acknowledge Miss Dorrit Meincke for supervising the ICP-AES instrument and for preparing the calibration standards under careful statistical control.
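As a numerical check on the uncertainty budget and the bias figures quoted in this article, the following minimal Python sketch may help. It assumes only the values given in the text (s0 = 8.43, s1 = 0.0154, the two bias estimates and their standard uncertainties); the function names and the example readings are hypothetical and serve illustration only.

```python
import math

S0 = 8.43    # constant term s0 of the budget, in counts (value from the text)
S1 = 0.0154  # proportional term s1 of the budget (value from the text)
Z_CRIT = 1.96  # two-sided critical value of z at the 5% level


def reading_uncertainty(y):
    """Standard uncertainty u(y) = sqrt(s0^2 + (y*s1)^2) of a reading y."""
    return math.hypot(S0, y * S1)


def z_score(bias, u_bias):
    """Bias divided by its standard uncertainty."""
    return bias / u_bias


# Hypothetical readings (counts), chosen only to show how u(y) behaves.
for y in (1000.0, 10000.0, 50000.0):
    u = reading_uncertainty(y)
    print(f"y = {y:8.0f}   u(y) = {u:7.1f}   relative = {100 * u / y:.2f}%")

# Bias values and standard uncertainties as quoted in the Bias section (mg/L).
z_linear = z_score(0.022, 0.013)
z_interp = z_score(0.031, 0.014)
print(f"linear calibration: z = {z_linear:.2f}, significant: {z_linear > Z_CRIT}")
print(f"interpolation:      z = {z_interp:.2f}, significant: {z_interp > Z_CRIT}")
```

Because the constant term is added in quadrature, the relative uncertainty u(y)/y can never fall below s1, which agrees with the statement that u(y) is always larger than 1.5%.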
References
1. ISO 5725-2, Accuracy of Measurement Methods and Results, 1st Edition, 1994
2. EURACHEM/CITAC Guide. Quantifying Uncertainty in Analytical Measurement, 2nd edn. 2000
3. Heydorn K (1991) Mikrochim Acta (Wien) III: 1-10
4. ISO Guide to the Expression of Uncertainty in Measurement, Geneva, 1993
5. Heydorn K, Griepink B (1990) Fresenius Z Anal Chem 338: 287-292
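The interpolation between two bracketing calibration points, with uncertainty propagated by Kragten's spreadsheet procedure, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' spreadsheet: all concentrations, readings and reference uncertainties below are hypothetical, and only the budget constants s0 and s1 are taken from the text. The Kragten step shifts each input by its standard uncertainty, records the change in the result, and combines the changes in quadrature.

```python
import math


def u_reading(y, s0=8.43, s1=0.0154):
    """Uncertainty budget for a reading: sqrt(s0^2 + (y*s1)^2)."""
    return math.hypot(s0, y * s1)


def interpolate(x_lo, x_hi, y_lo, y_hi, y_unknown):
    """Linear interpolation between two calibration points."""
    return x_lo + (y_unknown - y_lo) * (x_hi - x_lo) / (y_hi - y_lo)


def kragten(f, values, uncertainties):
    """Numerical propagation: shift each input by its u, sum squared changes."""
    base = f(*values)
    total = 0.0
    for i, u in enumerate(uncertainties):
        shifted = list(values)
        shifted[i] += u
        total += (f(*shifted) - base) ** 2
    return base, math.sqrt(total)


# Hypothetical inputs: two reference concentrations (mg/L) and three readings (counts).
values = [0.5, 2.0, 5000.0, 20000.0, 14850.0]
# Hypothetical uncertainties of the assigned reference values; readings use the budget.
uncertainties = [0.003, 0.010,
                 u_reading(5000.0), u_reading(20000.0), u_reading(14850.0)]

x_g, u_x_g = kragten(interpolate, values, uncertainties)
print(f"interpolated result = {x_g:.3f} mg/L, u = {u_x_g:.3f} mg/L")
```

The same `kragten` helper works for any result formula, which is the practical appeal of the spreadsheet approach: no partial derivatives have to be written out by hand.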
Accred Qual Assur (2001) 6:372-375
© Springer-Verlag 2001
mation is addressed only to the extent of estimating the binomial sampling variance of fractions confirmed.
The Finnish guidance document for measuring uncertainty is a novel approach for the estimation of uncertainties in microbiological measurements based on cultivation and including confirmation [1]. This document has been elaborated as an activity of the Advisory Committee for Metrology in Finland and the Centre for Metrology and Accreditation in Finland will produce a translation into English. The complex measurements such as taxonomically valid identification are not included in this document. It is hardly possible to know the exact number of atoms or molecules in chemical analysis. Similarly, it is not possible to know the actual number of viable target cells or spores in a sample.
Therefore, it is only possible to measure the relative uncertainty of measurement. In this paper the uncertainty measurements of microbiological cultivation methods are described according to Niemelä [1].

Principles

The instructions for the calculation of uncertainty of measurement elaborated for chemistry [2, 3] are not directly applicable to microbiology. In the guidance document [1] the principles of these chemistry documents are interpreted and adapted to microbiology.
In microbiological measurements sample pretreatment is normally limited to homogenisation and dilution (or concentration by filtration). The measured portion of the sample or its dilution is transferred to the "detector" and the result is obtained by counting individual objects. The sole principle of the measurement is reflected in the formula for calculation of the result:

$$y = F \cdot \frac{C}{V}$$

where
y = number of microorganisms per volume
F = dilution factor
V = volume of the portion of the final dilution
C = number of microbial particles in V.
It is evident from the above formula that the counted number of microbial particles strongly affects the result. The particle statistical variation can be estimated by applying the Poisson theory:

$$RSD_C = u_C = \frac{1}{\sqrt{C}}$$

Because the microbial detectors function optimally at particle numbers between 25 and 100 per test portion, the particle statistical variation often dominates the uncertainty of the measurement. Therefore uncertainty estimates such as reproducibility and repeatability determined in collaborative efforts are not generally applicable in microbiology and it is not worthwhile investing much effort in calculating values for these parameters (see Type A below). Instead, the uncertainty of the method can be expressed as a formula into which observed values of each measurement can be inserted.
Because the Type A uncertainty estimation (based on replicate measurements) is usually not economically feasible in microbiology, the emphasis in the guidance document [1] is on the Type B approach.
Type A: The standard uncertainty is calculated from n independent replicate measurements $x_1, x_2, \ldots, x_n$ as the experimental standard deviation [2]:

$$s_x = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$$

The number of replicates must be rather high because even 30 replicates from sample sources following normal distribution yield estimates of the standard deviation with only 13% relative uncertainty.
Type B: According to [2], Type B uncertainty is obtained using other approaches than replicate samples. The uncertainty variance $u^2$ or the standard uncertainty u are based on the whole body of scientific information available (with the exception of replicate measurements) on the possible variation of the measurand. Information of statistical theory, earlier measurements, experience or general beliefs on instruments and materials, specifications of the manufacturer, published reference values in calibration and certification reports, and uncertainty estimates in handbooks can be utilised.
The common sources of uncertainty in cultivation methods are sample stability, dilution, counting (including particle statistical variation and personal interpretation of the target), yield on the medium, crowding effect (coincidence error) and uncertainty of confirmation. The combined uncertainty can be calculated as the quadratic sum of different uncertainty components.

Compilation of uncertainty in cultivation methods

In microbiological cultivation methods only four types of detection systems are used: the one-plate instrument, the set of plates instrument, the one-tube detector (Presence/Absence) and the set of tubes instrument (Most Probable Number). All these classes necessitate different formulas for uncertainty estimates. Different media and incubation conditions, and confirmation and identification tests offer the versatility needed for the enumeration of different organisms. However, this versatility is not reflected in the principles of the estimation of uncertainty.
In microbiological cultivation methods the sample is homogenised, after which a measured amount is diluted or concentrated (e.g. by membrane filtration) to the
72 R. M. Niemi· S. I. Niemela
proper measuring range of the analytical method. Suitable aliquots are transferred to the plates, tubes or wells and, after incubation, colonies or numbers of positive and negative reactions in tubes or wells are counted.
It is often necessary to confirm that typical colonies or tubes yielding a positive reaction actually show the presence of the target microorganism. When it is possible to confirm all the presumptive results, there is no uncertainty due to the random error caused by sampling for binomial attributes. On the other hand, when only a fraction of the presumptive positive reactions are tested for confirmation, a significant increase will occur. The binomial sampling variation in confirmation tests should be addressed in addition to the error due to Poisson distribution of presumptive target colonies counted. In the primary cultivation step, counting results of typical colonies or reactions are susceptible to subjective judgement that can cause significant uncertainty.
The components causing uncertainty can be seen from the formula used for calculating the measurement result. Usually it includes all the dilution steps, colony counts from different plates (or most probable numbers) and numbers of isolates tested further and those confirmed. The uncertainty of counting of colonies caused by differences between technicians should be included. It is possible to correct systematic errors, e.g. confirmation rate (often expressed as % of typical colonies or reactions confirmed).
In the calculation of the result multiplication is needed and, therefore, the combined uncertainty is composed of the sum of the relative uncertainties. Fortunately, the components of uncertainty tend to be independent in microbiological measurements, so that covariances need seldom be considered.
Corrections for systematic errors, e.g. confirmation rate and dilution error, are mostly multipliers. The corrected final result is therefore expressed as:

Instructions for the estimation of individual uncertainty components are described by Niemelä [1].

A shortcut to uncertainty estimates of the instrument with several plates

Consider a detection instrument consisting of several plates with colony counts $c_i$ derived from the volumes $v_i$ (i = 1, 2, ..., n) of the final suspension. The average particle concentration x of the final suspension is estimated from

$$x = \frac{\sum c_i}{\sum v_i}$$

The microbial concentration of the original sample is obtained from the calculation

$$y = F x$$

where F is the dilution factor.
The uncertainty of the average particle density of the final suspension consists of particle statistical variation ($u_c$), counting uncertainty (u) and the uncertainty of inoculum volume measurements ($u_v$) including possible dilution effects within the detection instrument. These components merge to form the uncertainty of x. It can be estimated by using the log likelihood ratio statistic $G^2$ calculated with the following formula:

$$G^2_{n-1} = 2\left[\sum_{i=1}^{n} c_i \ln\left(\frac{c_i}{v_i}\right) - \left(\sum c_i\right)\ln\left(\frac{\sum c_i}{\sum v_i}\right)\right]$$

where
$c_i$ = colony count of the plate i
$v_i$ = inoculum size (in ml of final suspension) on the plate i
n = number of plates
The relative uncertainty of the microbial concentration x is calculated with the formula
References
1. Niemelä SI (2001) Uncertainty of quantitative microbiological culture methods. (In Finnish, to be translated into English and later available in electronic form at: www.mikes.fi). The Centre for Metrology and Accreditation in Finland (MIKES) J1/2001, 69p ISBN 952-5209-3
2. ISO (1995) Guide to the expression of uncertainty in measurement, 1st edn. International Organization for Standardization (ISO), Geneva
3. EURACHEM (2000) Quantifying uncertainty in analytical measurement, 2nd edn. Laboratory of the Government Chemist (LGC), Teddington, UK
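The quantities defined in this article for cultivation methods (the result y = F·C/V, the Poisson relative standard deviation, the pooled concentration x and the statistic G²) can be sketched in Python as below. The function names and the example plate counts are hypothetical. The helper `relative_uncertainty_of_sd` uses the common large-sample approximation 1/sqrt(2(n-1)) for normally distributed data, which is an assumption added here to illustrate the 13% figure quoted for 30 replicates; it is not a formula taken from the text.

```python
import math


def concentration(F, C, V):
    """y = F * C / V: microorganisms per volume of the original sample."""
    return F * C / V


def poisson_rsd(C):
    """Relative standard deviation of a Poisson count C: 1 / sqrt(C)."""
    return 1.0 / math.sqrt(C)


def relative_uncertainty_of_sd(n):
    """Approximate relative uncertainty of a standard deviation from n replicates."""
    return 1.0 / math.sqrt(2.0 * (n - 1))


def pooled_concentration(counts, volumes):
    """x = sum(c_i) / sum(v_i) for a set of plates."""
    return sum(counts) / sum(volumes)


def g_squared(counts, volumes):
    """Log likelihood ratio statistic G^2 (n - 1 degrees of freedom)."""
    total_c, total_v = sum(counts), sum(volumes)
    # A plate with c_i = 0 contributes nothing, since c*ln(c) tends to 0.
    plate_term = sum(c * math.log(c / v) for c, v in zip(counts, volumes) if c > 0)
    return 2.0 * (plate_term - total_c * math.log(total_c / total_v))


# Hypothetical plates: counts and inoculum volumes (ml of final suspension).
counts, volumes = [60, 3], [1.0, 0.1]
x = pooled_concentration(counts, volumes)
print(f"x = {x:.1f} per ml, y = {concentration(100, sum(counts), sum(volumes)):.0f} per ml")
print(f"Poisson RSD at C = 25: {poisson_rsd(25):.2f}, at C = 100: {poisson_rsd(100):.2f}")
print(f"G^2 = {g_squared(counts, volumes):.2f}")
print(f"relative uncertainty of s for n = 30: {relative_uncertainty_of_sd(30):.3f}")
```

When the counts are exactly proportional to the inoculum volumes, G² is zero; larger values indicate more scatter between plates than the pooled concentration alone would explain.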
Accred Qual Assur (2002) 7:228-233
DOI 10.1007/s00769-002-479-6
Abstract When a test is performed in order to qualify a material or a product for a certain use, the result is generally compared with an acceptance limit. The test result has an uncertainty which should be estimated and stated (e.g. in accordance with GUM). Very often this is not the case. Further, discussions often arise on the issue of how the uncertainty shall be considered in relationship to the acceptance limit. The intention of this note is to describe, in simple terms, the statistical background and to give some recommendations. In short, there are two clean-cut, extreme situations. The first case is when the uncertainty of the testing procedure is the dominating factor. Here it is found that the estimates of single laboratories cannot, generally, be used for comparisons with acceptance limits. One should have standardised, well-verified estimates based on comprehensive investigations of the method. It can also be concluded that comparisons between test results and acceptance limits have to be made with regard to the actual circumstances, as, e.g. how the acceptance limit is related to the risk. In the second case, the variation in the property of the material or product dominates and the uncertainty of the testing procedure is negligible. When the results are non-quantitative (go - no go), statistical methods can be used to estimate the risk taken with a certain sampling and acceptance strategy that a certain proportion of the batch to be delivered does not qualify. This should be considered more often in standardisation of product test methods. When the results are quantitative, a statistical analysis should be performed and the uncertainty should be compared with the acceptance limit as before, from the actual circumstances. When effects of testing uncertainty and product variation are comparable a sound treatment requires extensive experimental work. No short cuts can be made without loss of confidence!

Keywords Uncertainty · Conformity assessment · Acceptance limit

H. Andersson (✉)
SP Swedish National Testing and Research Institute,
P.O. Box 857, 501 15 Boras, Sweden
e-mail: hans.andersson@sp.se
Tel.: +46-33-165000
Fax: +46-33-165010
where φ is the normalised Gaussian probability distribution. The plus and minus signs relate to the two cases above, respectively. It is readily seen that if the distribution of property and risk are well separated, the risk will be small. In practise, there are considerable difficulties to find $P_E(x)$ and $P_s(x)$, and therefore assumptions and approximations are used. The implications of this are treated next.
If there is a correlation between some of the $t_j$:s, Eq. (4) becomes more complex, see [2]. In many cases x has the form

$$x = \prod_{i=1}^{n} t_i^{k_i}$$

then
[Figure residue: legend entries "Partially accredited and seeking further accreditation", "Seeking accreditation" and "Not accredited or no ..."; vertical axis scaled from -40 to 40.]
For various reasons authorities, and written standards, routinely replace the real distribution of risk with a limiting value which is intended to "safely" encompass all risks. In analogy with Fig. 1, this corresponds to a situa-
This is in accordance with Eq. (3), if $x_0 = m_s$ and $\sigma_s = 0$. The result may seem trivial, but the reasoning gives some insight in the process of replacing a real risk distribution with a limiting value.
It is evident that the relationships between the parameters, $m_s$, $\sigma_s$ and $x_0$ play an important part when it shall be decided how a test result with the value $m_E$ and the uncertainty estimate $2\sigma_E$ shall be related to a limiting value $x_0$.
Assume, as an example, that the limiting value $x_0$ is set in some relationship to the risk distribution $P_s(x)$, and that we require that the measure $m_E$ shall have certain "margins" to the limiting value. We use the following four cases:

1) $x_0$ is determined as $m_s$, and it is accepted that the value $m_E$ may be used for comparison with $x_0$.
2) $x_0$ is determined as $m_s$, and it is required that the value $m_E + 2\sigma_E$ shall be compared with $x_0$.
3) $x_0$ is determined as $m_s - 2\sigma_s$ and it is accepted that the value $m_E$ is compared with $x_0$.
4) $x_0$ is determined as $m_s - 2\sigma_s$ and it is required that the value $m_E + 2\sigma_E$ shall be compared with $x_0$.

This will give the corresponding risks for damage, with assumption of normal distributions.

1) $\phi(0) = 50\%$
2) $\phi\left(\frac{-2\sigma_E}{\sqrt{\sigma_E^2 + \sigma_s^2}}\right)$, which is around 8% if $\sigma_E = \sigma_s$
3) $\phi\left(\frac{-2\sigma_s}{\sqrt{\sigma_E^2 + \sigma_s^2}}\right)$, which is around 8% if $\sigma_E = \sigma_s$

Hence, an impression is obtained about the consequences of different strategies. If, for example, it is required that $x_0 = m_s - 3\sigma_s$ and $\sigma_E \ll \sigma_s$, which is a usual situation, the risk of accepting $m_E$ for comparison with $x_0$ is only

$$\phi\left(\frac{-3\sigma_s}{\sqrt{\sigma_s^2 + \sigma_E^2}}\right) \approx \phi(-3) \approx 0.1\%$$

It is noted that in most cases the risk distribution is not Gaussian, still the reasoning remains the same.
It is usual to set limits for contents of hazardous substances in food or in the occupational environment. Here, the determination of the risk distribution, $P_s(x)$, is very unsure for small probabilities of injury or health effects. For this reason, there is a tendency for responsible authorities to set the acceptance limit far below the lowest levels observed to cause effects, corresponding to large k-values (k = 4, 5 or even 6). This may cause difficulties in performing the measurement at all since the acceptance level and detection levels become comparable. It is also clear that in many such situations the uncertainty of the measurement can be allowed to be omitted in the comparison with the limiting value.
Another illustrative example is the case where the alcohol content in the blood of a car driver is measured from the air exhaled from the lungs by a direct spectrometric method. The uncertainty of the method gives a $P_E(x)$ distribution with a carefully determined uncertainty approximating $\sigma_E$. The limiting value in Sweden, $x_0$ = 0.2 ‰, is decided for political and pedagogical reasons and it is comparable to factors as fatigue, irritation, etc. The expanded uncertainty can therefore be safely credited to the car driver. There is still a margin to real risks. Of course, the fact that a sentence has severe consequences also indicates that the technical uncertainty should be credited to the car driver.
In other cases other reasoning and conclusions have to be made. The general conclusion is that it is not possible to have a single "rule of thumb" for all cases when determining how the uncertainty of a measurement shall be related to a given acceptance limit. Different situations have to be treated separately.
There are, of course, also situations where the uncertainty should be included in the comparison. One such is when the material strength, e.g. fatigue properties are used for safe dimensioning of critical structures, such as pipe lines or nuclear reactors. (Even here, though, the limits are set with safety factors far exceeding the testing uncertainty.)
Again, the important principle to follow is that the relationship between limiting values and measurements and their uncertainties should be technically well founded. This requires a case by case decision.

It is now assumed that the uncertainty in the test method is small in relationship to the variation in product property and that the test is of the go - no go type.
The problem at hand is then to draw conclusions about the proportion of approved products in a batch from the proportion of approved specimen in a sample tested. Examples may be safety helmets (impact resistance), iron ore (contents of iron), pre-packed food (weight). This type of test is described, e.g. in ISO 2859 [4] and may be described by so-called OC-diagrams (operating characteristics).
Such diagrams are based on the binomial distribution and the functional relationship is

$$P_{Accept}(x) = \sum_{i=0}^{p} \binom{n}{i} x^i (1-x)^{n-i}$$

where $P_{Accept}(x)$ is the probability to accept a batch with the proportions x of non-accepted units if a sample of size n is tested and if the batch is approved if at most p non-accepted units are found. The diagram has the general shape according to Fig. 7.
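The risk values quoted earlier for the acceptance strategies (50%, about 8%, and about 0.1%) can be checked numerically. The sketch below assumes, as the text does, normal distributions for both the test result and the risk, so that the risk is the cumulative normal probability of minus the margin divided by the combined standard deviation. The function names are hypothetical and the standard deviations are expressed in arbitrary units.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def risk(margin, sigma_e, sigma_s):
    """Risk for a given margin between m_s and m_E, assuming normal distributions."""
    return normal_cdf(-margin / math.hypot(sigma_e, sigma_s))


sigma_e = sigma_s = 1.0  # equal uncertainties, as in the quoted 8% figures

print(f"case 1 (no margin):        {100 * risk(0.0, sigma_e, sigma_s):.1f}%")
print(f"case 2 (margin 2*sigma_E): {100 * risk(2 * sigma_e, sigma_e, sigma_s):.1f}%")
print(f"case 3 (margin 2*sigma_s): {100 * risk(2 * sigma_s, sigma_e, sigma_s):.1f}%")

# x0 = m_s - 3*sigma_s with sigma_E much smaller than sigma_s (here a tenth).
print(f"margin 3*sigma_s, small sigma_E: {100 * risk(3.0, 0.1, 1.0):.2f}%")
```

Changing `sigma_e` relative to `sigma_s` shows how quickly the benefit of a margin expressed in one standard deviation erodes when the other standard deviation dominates.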
The use of uncertainty estimates of test results in comparisons with acceptance limits 79
[Figure residue: two plots with vertical axis $P_{accept}$, tick marks at 1.0 and 0.5.]
Of course it is desirable that the diagram shall be decisive, i.e. that it is close to the rectangular shape of the limiting curve obtained for n → ∞. If the non-accepted proportion is smaller than p/n the batch should not be accepted and vice versa.
As an example may be taken testing of packaging for dangerous goods, where a drop test, from a certain height is approved if no leakage occurs. Three drops are made and the batch is approved if no leakage occurs in any of them.
If it is assumed that the batch has a proportion x which would not pass the test the probability to accept the batch will be

$$P_{Accept}(x) = \binom{3}{0} x^0 (1-x)^3 = (1-x)^3$$

The corresponding curve is shown in Fig. 8.
Hence, if the batch contains 20% (x = 0.2) erroneous units there is a 50% probability to accept it. This means that two laboratories may well get contradictory results or that the supplier may get different results from the same laboratory on two consecutive occasions!
It should be noted that when quantitative measurements are made one can determine in the usual fashion an approximate $P_E(x)$ function.
This type of product testing is very usual in written standards, and much confusion and debate occurs between customers and laboratories, due to lack of understanding of the properties of the testing procedure. Yet it is necessary to have these methods, for cost reasons, in product testing. What is needed is that the standards are produced with more care, and with clear explanations of the risks and properties of the test methods.
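The acceptance probability behind an OC-diagram can be evaluated directly from the binomial relationship given in this article. The following Python sketch is an illustration only; the function name `p_accept` and the alternative sampling plan at the end are hypothetical, while the drop-test case (n = 3, p = 0, x = 0.2) is the example from the text.

```python
import math


def p_accept(x, n, p):
    """Probability of accepting a batch with a proportion x of non-accepted units,
    when n units are tested and at most p non-accepted units are allowed."""
    return sum(math.comb(n, i) * x**i * (1.0 - x) ** (n - i) for i in range(p + 1))


# Drop test from the text: three drops, no leakage allowed.
print(f"P_accept(0.2) for n=3, p=0: {p_accept(0.2, 3, 0):.3f}")  # (1 - 0.2)**3 = 0.512

# A coarse OC curve for the same plan.
for x in (0.0, 0.1, 0.2, 0.3, 0.5):
    print(f"x = {x:.1f}   P_accept = {p_accept(x, 3, 0):.3f}")

# Hypothetical larger plan, to show how the curve steepens with sample size.
print(f"P_accept(0.2) for n=30, p=3: {p_accept(0.2, 30, 3):.3f}")
```

With only three units tested, the curve falls slowly, which is why a batch with 20% defective units is still accepted about half of the time.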
References
1. ISO 5725 (1994) Precision of test methods - Determination of repeatability and reproducibility by inter-laboratory tests. ISO, Geneva
2. ISO (1993) Guide to the expression of uncertainty in measurements. ISO, Geneva
3. Andersson H (1994) Assessment and practical use of uncertainty in test results. 2nd EUROLAB Symposium, Florence, 1994. EUROLAB,
4. EURACHEM Guide (2000) Quantifying uncertainty in analytical measurements. 2nd edn. EURACHEM, LGC, Teddington, UK
5. ISO 2859 (1974) Sampling procedures and tables for inspection by attributes. ISO, Geneva
Accred Qual Assur (2001) 6:493-500
© Springer-Verlag 2001
odes) to be applied to the model. Laboratories that do not meet these basic requirements can use the model to help establish their overall Quality System program but should not expect to consistently receive acceptable ratings on PTs without being in compliance with ISO 17025.
The implementation of the model is simple. It makes extensive use of the logarithmic correlation between performance statistic of a test method and concentration, independent of analyte (measurand), matrix, or method that was initially developed by Horwitz [2-5], beginning in the early 1980s and further developed by Rocke and Lorenzato [6]. Horwitz et al. used this model to analyze the performance characteristics of test methods used for regulatory purposes by the Food and Drug Administration. As discussed in the following representative examples, others use the concept to model the uncertainty in analytical measurement systems.
ISO 5725 [7] provides practical numerical definitions for the repeatability, r, and reproducibility, R, of a standard test method and describes the organization and analysis of interlaboratory experiments for the numerical determination of r and R. As part of its work program, ISO Technical Committee 17 (Iron and Steel), Subcommittee 1 (Chemical Analysis) applied these definitions and practices to the Horwitz model [8]. That work included 45 published and ready-to-publish national and international standard test methods (BSI, ECISS/TC20, and ISO/TC 17/SC 1) that employed 6 method principles (gravimetric, spectrophotometric, flame AAS, titrimetric-visual, titrimetric-potentiometric, and combustion/infra-red) to determine 21 elements in iron and steel. The work showed that a clear correlation existed between the log of both r and R vs. the log of the analyte concentration. Based on that work, ISO/TC 17/SC 1 set a policy [9] to use the log-log plots of r and R vs. concentration to evaluate test data of candidate ISO test methods. Any method that has data that exceeds specified limits beyond the historical log-log plots will not be submitted for international ballot to elevate to international standard status. ISO 15350 [10] is a recent example of a test method that met those requirements.
More recently, Flinchbaugh and Poholarz [11] described how the Horwitz model could be adapted to set MQOs for a Reference Materials (RM) Program. That program uses the ISO/TC 17/SC 1 plot in a manner similar to the model described in this paper to predict the MQOs (uncertainties) needed in certified homogeneity and concentrations. A laboratory using those RMs should be able to meet performance requirements consistent with the r and R limits set by the ISO/TC 17/SC 1 plot. The program based on that model is now accredited to ISO Guide 34 [12, 13].
ASTM Committee E01 recently developed a Standard Guide [14] for managing uncertainties within a laboratory organization that uses multiple locations (workstations) to perform the same tests. The guide presumes that the laboratory organization has established data quality objectives and is committed to meeting them. The Annex to that Guide describes a procedure for setting laboratory-wide data quality objectives based on the ISO/TC 17/SC 1 plot. Other recent ASTM Standards also make use of the logarithmic correlation between variation and concentration [15-17].
The referenced programs typically use classical statistical techniques, such as Dixon or Cochran, to reject individual data points and then use the log-log plot concept to identify data set outliers. Outliers are defined as points above or below a specified limit on the log-log plot. Root cause analysis of log-log plot outliers above the limit usually shows possible deficiencies including: 1) heterogeneity of one element in the test material, 2) unanticipated test method interference, 3) attempted application of the test method above or below the optimum concentration range or 4) inadequate test method calibration and control protocols. Conversely, outliers below the line, although infrequent, appear to be "too good", and usually indicate that the experimental design did not include all of the normal sources of variation.

ASTM's proficiency test programs as a source of interlaboratory data

ASTM currently sponsors PT programs in petroleum products and lubricants, stainless steel, plain carbon and low alloy steel, aluminum, gold in bullion; plastics: mechanical properties testing, plastics testing (polyethylene): melt index and ash; Textiles, and engine coolants. More information regarding ASTM PTs can be obtained on their website, www.astm.org. These programs provide laboratories ongoing, statistically sound objective evidence of their performance on common test materials as compared with other competent laboratories around the world. With the exception of the petroleum-related programs, the PT reports provide feedback to laboratories on the current PT samples only and ASTM does not monitor a laboratory's long-term performance. The petroleum-related programs publish 2-year summary reports of each laboratory's robust standard deviations. These reports provide indications as to whether a lab may have a persistent relative bias on a test or if their precision performance has been poorer than the majority of labs. This analysis also attempts to identify labs with especially tight performance. ASTM's Gold in Bullion Program offers guidance to laboratories on how to track and monitor their own PT performance [18].
ASTM Committee E01, on the Chemical Analysis of Metals, Ores, and Related Materials, conducts PT programs in plain carbon and low alloy steels, stainless steels, aluminum, and gold in bullion. These programs are conducted in compliance with ISO Guide 43 and
82 D. A. Flinchbaugh · L. F. Crawford· D. Bradley
ASTM E2027 [19, 20]. This paper utilizes data from the first two of the E01 programs for its source of interlaboratory test data collected over a 2-year period. Samples used to generate the PT data incorporated in this paper were supplied by the Brammer Standard Company and tested by ASTM E-826 [21], Standard Practice for Test-
[Figure: ASTM Proficiency Test Data, Plain Carbon/Low Alloy and Stainless Steel. Interlaboratory Robust Standard Deviation (95% confidence) vs. Robust Mean]
Table 1
Element, Conc. Matrix Method a Method range
blocks on Fig. 1 in lieu of diamonds. Table 1 shows the source of those points. This visual analysis indicates that the most clearly outlying points were at or beyond the specification ranges of the alloys typically produced, were at or beyond the maximum concentrations for which the Standard Test Methods were validated, or were beyond the scope of the standard test method. In the case of Al, neither of the standard test methods reported (ASTM E572 and ASTM E1086) includes Al in the method scope. For Si, ASTM E572 does not include Si in the method scope and the 1% silicon value is above the validated range of the test method. Laboratories should not use/report standard test methods for elements not included in the standard test method scope nor for samples above (or below) the analytical range. Based on these findings, there is a strong possibility that at least some of the laboratories provided the test data outside the demonstrated capability of ASTM test methods.
This brief overview illustrates how PT providers can use the log-log plot of two related programs under one provider to identify and improve performance on any element/concentration pairs that tend to generate unusually high standard deviations. As progressive PT providers improve their robust standard deviations, their participating laboratories will need to improve their performance to maintain consistent Z-scores over time. Clearly, laboratories will not benefit by participating in PT programs with unusually large robust standard deviations, because those laboratories will earn relatively low Z-scores when their measurement uncertainties are relatively large. Conversely, laboratories will benefit by participating in PT programs that consistently provide low robust standard deviations. The long-term goal is to have all compositional-based PT programs show the same, minimum standard deviations at each tested concentration.
We believe that as this model is used and overall data quality is improved the best-fit line will drop to some minimum standard deviation, which is undefined at this time. However, we can compare log-log plots of interlaboratory standard deviation data from other programs.
Figure 4 (Figs. 4-7 are available as electronic supplementary material) compares the best-fit line from Fig. 1 with the best-fit lines from the ISO/TC 17/SC 1 plot referred to in the Introduction. Note that the general agreement is good, especially considering that the two lines represent very large, totally independent data sets. Both measure interlaboratory standard deviations, but they were calculated from data derived under very different protocols. Many of the individual ISO data were from the final, most successful iteration of several optimization experiments. On the other hand, most of the ASTM data were generated under production conditions using many more pieces of equipment, each designed to handle a specific facility's fairly unique product mix. In many cases, the number of digits reported in test results is preset to meet local production requirements and may create rounding errors that make precision worse. Most of the ISO data resulted from calibrations with certified reference materials (CRMs) and pure chemicals that are highly reliable, while many of the ASTM data sets are from X-ray and optical emission instruments calibrated with secondary materials because appropriate calibrants are not always available. Also, the trace determinations of most residual elements in the 0.001 to 0.01 concentration range are not critical to routine spectrometric laboratories. Therefore, laboratories may not exercise as much diligence in controlling those elements.
Using data from two ASTM PT and ISO/TC 17/SC 1 test methods programs, we have shown that interlaboratory standard deviations from these diverse programs are very similar at any given concentration. We have also shown that opportunities exist for further improvement in the degree of curve-fit, once the respective organizations begin to evaluate and improve their procedures using the model. The similarities of these two curves tend to verify the model and should give laboratories the confidence they need to use this model to establish MQOs.
This correlation can be used to predict future performance of PT exercises. We will show how laboratories can use the predictability of future PT results to set MQOs and design control schemes to ensure satisfactory performance.

Establishing MQOs

Laboratories must understand client expectations and establish MQOs that are consistent with the agreed, client expectations. High-quality PT programs that attract large numbers of ISO 17025 [25] compliant laboratories are reliable sources of competitive performance data that can be used by laboratories to negotiate realistic performance goals with clients. This approach is also useful to laboratories in helping clients better understand the uncertainty in their test results by establishing performance goals where none previously existed.
Starting with the interlaboratory robust standard deviation (data shown in Fig. 1), we estimate the intralaboratory standard deviation of the participants by dividing interlaboratory data by $\sqrt{2}$, a standard practice for estimating a reduction in variation by removing one major source of error [7]. These data are modeled using a "power" fit, shown as a line in Fig. 5 (available as supplemental electronic material), to yield
$y = 0.0271\,x^{0.58}$ (2)
The intralaboratory line represents the maximum uncertainty a single laboratory can have and still expect to receive Z-scores of less than 2 in 19 out of 20 PTs. This line can be used as the maximum allowable uncertainty for a laboratory. For example, to establish the maximum allowable uncertainty for a 0.3 wt.% Mn sample, Eq. (2) yields a result of 0.0135 wt.%.
Thus, if a laboratory's estimated uncertainty is 0.0135 wt.% (m/m) at 0.3 wt.% (m/m) they can be assured that they will receive Z-scores of less than 2 in any ASTM PT 95% of the time and that their instrument is operating correctly at this concentration level. This also means that a laboratory will receive Z-scores of greater than 2 in a PT 5% of the time due to random causes when a 95% confidence level is employed. It may be in the laboratory's best interest to reduce the MQO to allow them to work with less than a 5% probability of receiving a Z-score of greater than 2 due to random error. For the determination of some chemical species, only one method may be available and it may not be optimized, resulting in higher than the desired variation. In these cases, efforts should be made to continue optimizing the test method to reduce the amount of uncertainty in the reported value.

Developing control limits to comply with MQOs

Having defined the maximum allowable error to meet MQOs, such as to pass 95% of all PTs, it is easy to use the uncertainty budget concept to establish the maximum error to be allowed at each major step in the subject method. In a bias free environment, measurement uncertainty can be described by pooling standard deviations as follows:
$\sigma_{R,\mathrm{total}}^{2} = \sum \sigma_{R,\mathrm{all}}^{2}$ (3)
where:
$\sigma_{R,\mathrm{total}}^{2}$ = combined uncertainty in a measurement system,
$\sigma_{R,\mathrm{all}}^{2}$ = each source of variance in measurement system R.
An essential step in utilizing the uncertainty budget is to identify all sources of uncertainty in the measurement system and then to quantify these sources to the extent possible. Individual sources of variation that contribute significantly to the combined uncertainty for a typical measurement system utilizing an analytical instrument are: instrument calibration (i.e. quantity and quality of CRMs used in the calibration protocol), instrument control, and field sampling. Other significant sources of variation may exist for specific situations and should be accounted for, as appropriate. Although field sampling is generally considered a significant source of uncertainty in an analytical measurement system, it will not be considered here because sampling variation is not significant in PT programs. It is recommended that accepted sampling techniques are used and documented when sampling is under the control of the laboratory organization to reduce this source of variation as much as possible.
The identified sources of variation can be used to expand Eq. 3 to better describe the measurement system:
$\sigma_{R,\mathrm{total}}^{2} = \sigma_{R,\mathrm{Control}}^{2} + \sigma_{R,\mathrm{Calib}}^{2}$ (4)
or
$\sigma_{R,\mathrm{total}} = \sqrt{\sigma_{R,\mathrm{Control}}^{2} + \sigma_{R,\mathrm{Calib}}^{2}}$ (5)
where:
$\sigma_{R,\mathrm{total}}$ = combined uncertainty in measurement system R
$\sigma_{R,\mathrm{Control}}^{2}$ = uncertainty due to instrumental control in measurement system R
$\sigma_{R,\mathrm{Calib}}^{2}$ = uncertainty due to instrument calibration in measurement system R.
$\sigma_{R,\mathrm{total}}$ from Eq. 5 can be set as equal to the MQO determined by Fig. 5. The laboratory must design and control its measurement systems so that the right-hand side of Eq. 5 is equal to or less than the combined uncertainty allowed in the system. This requires that the laboratory quantify the individual sources of variation contributing
A model to set measurement quality objectives and to establish measurement uncertainty expectations 85
[Figure: Measurement Quality Objective Prediction]
yield the MQO for calibration variation. The MQO for control sample variation is described by
$y = 0.0192\,x^{0.58}$ (6)
and the MQO for calibration variation is described by
$y = 0.0136\,x^{0.58}$ (7)
Equations 6 and 7 are shown graphically in Fig. 2.
Equations 6 and 7 give the maximum allowable variation for control and calibration and are used to design test methods in a cost-effective manner. If the amount of variation is less than the maximum allowed in either control or calibration, it reduces the probability of failing a PT due to random causes or allows the variation to be given

shaped curve, with a maximum frequency near 0.1, indicating that laboratories in compliance with the presented model, as Bethlehem is, will consistently receive acceptable Z-scores.
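
The power-law MQO lines quoted in Eqs. (2), (6) and (7) are simple to evaluate at any concentration. The following is a minimal Python sketch, assuming only the coefficients and the exponent 0.58 given in the text; the names `MQO_COEFFICIENTS`, `mqo` and `meets_mqo` are illustrative and do not come from the paper.

```python
# Sketch: evaluate the MQO lines y = c * x**0.58 of Eqs. (2), (6) and (7).
# Coefficients and exponent are copied from the text; all names are
# hypothetical helpers introduced for illustration only.

MQO_COEFFICIENTS = {
    "total": 0.0271,        # Eq. (2): maximum intralaboratory uncertainty
    "control": 0.0192,      # Eq. (6): control sample variation
    "calibration": 0.0136,  # Eq. (7): calibration variation
}
EXPONENT = 0.58


def mqo(concentration_wt_pct, component="total"):
    """Return the maximum allowable standard deviation (wt.%) at a concentration (wt.%)."""
    if concentration_wt_pct <= 0:
        raise ValueError("concentration must be positive")
    return MQO_COEFFICIENTS[component] * concentration_wt_pct ** EXPONENT


def meets_mqo(lab_std_dev, concentration_wt_pct, component="total"):
    """True if a laboratory's estimated standard deviation is within the MQO line."""
    return lab_std_dev <= mqo(concentration_wt_pct, component)


if __name__ == "__main__":
    # The 0.3 wt.% Mn example from the text
    for name in MQO_COEFFICIENTS:
        print(name, round(mqo(0.3, name), 4))
```

For the 0.3 wt.% Mn example, `mqo(0.3)` reproduces the 0.0135 wt.% quoted after Eq. (2); the control and calibration lines can be checked the same way against a laboratory's own control chart and calibration data.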
References
1. ISO VIM (1993) International vocabulary of basic and general terms in metrology, 2nd edn. ISO, Geneva
2. Horwitz W, Kamps LR, Boyer KW (1980) J Assoc Off Anal Chem 63:1344-1354
3. Horwitz W (1982) Anal Chem 54:67A-76A
4. Margosis M, Horwitz W, Albert R (1988) J Assoc Off Anal Chem 71:619-635
5. Horwitz W, Britton P, Chitrel SJ (1998) J Assoc Off Anal Chem 81:1257-1265
6. Rocke DM, Lorenzato S (1995) Technometrics 37:176-184
7. ISO 5725 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
8. Hobson JD; ISO Document N 938 (1992) A survey of the precision of standard methods for the analysis of steel and iron, based on ISO, ECISS/EN and BSI Statistics, revised 1992-06-01. ISO, Geneva
9. ISO TC 17/SC 1 N 1235 (1998) The procedures for activities of ISO/TC 17/SC 1 (The 5 edn., version 2). ISO, Geneva
10. ISO 15350: Steel and iron: determination of total carbon and sulfur contents - Infrared absorption method after combustion in an induction furnace (routine method). ISO, Geneva
11. Flinchbaugh DA, Poholarz JM (1998) Accred Qual Assur 3:367-372
12. ISO Guide 34 (2000) General requirements for the competence of reference material producers. ISO, Geneva
13. American Association of Laboratory Accreditation (A2LA) Certificate No. 300.03, presented 7 November, 2000 valid through 31 August, 2002
14. E2093-00: Standard guide for optimizing, controlling and reporting test method uncertainties from multiple workstations in the same laboratory organization
15. ASTM D6091-97: Standard practice for 99%/95% interlaboratory detection estimate (IDE) for analytical methods with negligible calibration error
16. ASTM D6591-99: Standard practice for an interlaboratory quantitation estimate
17. ASTM E 1763: Guide for the interpretation and use of results from the interlaboratory testing of chemical analytical methods
18. ASTM Proficiency Test Program Report: Determination of gold in bullion by cupellation (E 1335) May/June 200 Appendix "Tracking and Monitoring your own Proficiency Test Performance"
19. ISO/IEC Guide 43-1 (1997) Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes. ISO, Geneva
20. ASTM E2027-99: Standard practice for conducting proficiency tests in the chemical analysis of metals, ores, and related materials
21. ASTM: E-826-85 (Reapproved 1996): Standard practice for testing homogeneity of materials for development of reference materials
22. Analytical Methods Committee of the Royal Society of Chemistry (1989) Analyst 114:1693-1697
23. ASTM E572-94 (reapproved 2000): Standard test method for X-ray emission spectrometric analysis of stainless steel
24. ASTM E1086-94 (reapproved 2000): Standard test method for optical emission vacuum spectrometric analysis of stainless steel by the point-to-plane excitation technique
25. ISO 17025 (1999) General requirements for the competence of calibration and testing laboratories. ISO, Geneva
26. A2LA Certificate No. 300.01, presented October 18, 2000 valid through August 31, 2002
Accred Qual Assur (2000) 5:464-469
© Springer-Verlag 2000

Abstract The preparation and certification of reference materials is a rapidly developing area. Many innovative reference materials have limited homogeneity and stability, and, additionally, the uncertainty estimation of the property values must be brought in agreement with the principles of the "Guide to the expression of uncertainty in measurement" (GUM). The results of the homogeneity and stability studies must be included to a certain extent in the uncertainty of the property values of the reference material, in order to comply with these requirements. The basic theory needed to accomplish this is essentially the theory of analysis of variance (ANOVA). As GUM also allows alternative evaluations other than Type A evaluations, a reinterpretation of the theory of ANOVA is necessary to establish a model for the certification of reference materials that is widely applicable. For this, analysis of variance can be used as a statistical technique to derive standard uncertainties from homogeneity, stability and characterisation data.

Keywords Reference materials · Measurement uncertainty · Analysis of variance · Homogeneity study · Stability study

A.M.H. van der Veen (✉)
Nederlands Meetinstituut, Schoemakerstraat 97, 2600 AR Delft, The Netherlands
e-mail: avdveen@nmi.nl
Tel.: +31-15-2691 733
Fax: +31-15-2612971

J. Pauwels
European Commission, Joint Research Centre, Institute for Reference Materials and Measurements, Retieseweg, 2440 Geel, Belgium
e-mail: Jean.Pauwels@irmm.jrc.be
Tel.: +32-14-571 722
Fax: +32-14-590406
where a second bias term has been introduced: $B_{ij}$. For this bias term, which is at the level of subgroups, the same assumptions are made as for $A_i$, that it is normally distributed with mean zero, and that it is independent from both any $A_i$ and any $\varepsilon_{ijk}$. The subscript "$B \subset A$" should be read as "among subgroups, within groups", as A represents the level of groups (for example: samples), and B represents the level of subgroups (for example: extracts). In a homogeneity study, a two-way ANOVA might be considered if additionally to the between-sample variation also the repeatability of subsampling and extraction is to be determined.
For the calculation of variances from these complex experiments, two things are needed. First a method is needed to partition the total scattering into contributions, attributed to the various levels in the ANOVA. In a second step, these contributions are converted into variances. These variances can directly be used in uncertainty evaluations that are compliant to the "Guide to the expression of uncertainty in measurement" (GUM) [2].

Partitioning sums of squares

Scattering of data can be represented in various ways. Probably the best known way is to express scattering of data in terms of variances, covariances, and standard deviations. In analysis of variance, the scattering is often expressed in terms of sums of squared differences, or in short "sums of squares". These sums of squares express the scattering at various (hierarchic) levels in the analysis of variance. At the top level, the total sum of squares ($SS_{\mathrm{total}}$) is defined
$SS_{\mathrm{total}} = \sum_{i=1}^{a}\sum_{j=1}^{n_i}\left(Y_{ij}-\bar{\bar{Y}}\right)^{2}$ (3)
where
$\bar{\bar{Y}} = \frac{1}{\sum_{i=1}^{a} n_i}\sum_{i=1}^{a}\sum_{j=1}^{n_i} Y_{ij}$ (4)
denotes the grand mean; a is the number of groups and $n_i$ is the number of members in the group. $SS_{\mathrm{total}}$, the total sum of squares, can be partitioned as follows. First, $SS_{\mathrm{within}}$ will be defined

$SS_{\mathrm{among}} = \sum_{i=1}^{a} n_i\left(\bar{Y}_i-\bar{\bar{Y}}\right)^{2}$ (6)
Without proof, the relationship between the three sums of squares reads as
$SS_{\mathrm{total}} = SS_{\mathrm{among}} + SS_{\mathrm{within}}$ (7)
Each of these sums of squares has well-defined numbers of degrees of freedom. $SS_{\mathrm{among}}$ has $a-1$ degrees of freedom, $SS_{\mathrm{within}}$ has $\sum n_i - a$ degrees of freedom and $SS_{\mathrm{total}}$ has $\sum n_i - 1$ degrees of freedom. Dividing $SS_{\mathrm{within}}$ and $SS_{\mathrm{among}}$ by their respective number of degrees of freedom leads to the respective mean squares, abbreviated as MS. $MS_{\mathrm{within}}$ can thus be calculated as
$MS_{\mathrm{within}} = \frac{SS_{\mathrm{within}}}{\sum_{i=1}^{a} n_i - a}$ (8)
and $MS_{\mathrm{among}}$ is thus defined as
$MS_{\mathrm{among}} = \frac{SS_{\mathrm{among}}}{a-1}$ (9)
The mean squares take up the form of variances, but they are, apart from $MS_{\mathrm{within}}$, not equal to the variance at their specific level.
The main objective of this partitioning is that it enables the separation of different effects that contribute to the combined standard uncertainty of the measurand. This separation only makes sense if an uncertainty component, obtained from one experiment is used in another experiment, thus in the case of a kind of Type B evaluation [2].

Estimating uncertainties from ANOVA

In order to be able to calculate variances from mean squares, the first thing needed is the expectations of the mean squares, expressed in terms of variances $\sigma_A^{2}$ and $\sigma^{2}$. These relationships read as
$MS_{\mathrm{within}} = \sigma^{2}$ (10)
$MS_{\mathrm{among}} = \sigma^{2} + n_0\,\sigma_A^{2}$ (11)
where $n_0$ is a function of the number of degrees of freedom. For a complete data set, where for any value of i, $n_i = n$, then $n_0 = n$. It has been determined by mathematical statisticians that the appropriate value for $n_0$ for incomplete data sets reads as
90 A. M. H. van der Veen . J. Pauwels
(14)
Although $s_A^{2}$ is an unbiased estimator for $\sigma_A^{2}$, and can be employed as such, there is some aspect to keep in mind. In various references [3], the following expression for the confidence interval for this variance ratio can be found
$F_{0.975}\left(1 + n\,\sigma_A^{2}/\sigma^{2}\right) < \ldots < F_{0.025}\left(1 + n\,\sigma_A^{2}/\sigma^{2}\right)$ (15)
where $F_{0.975}$ and $F_{0.025}$ are the lower and upper 2.5% one-tailed levels of F with numbers of degrees of freedom of $a-1$ and $a(n-1)$, respectively. This expression can be rearranged to
$\frac{1}{n}\left(\ldots\right) < \sigma_A^{2}/\sigma^{2} < \frac{1}{n}\left(\ldots\right)$
ing, that $s^{2}/n$ defines the "resolution" of the method for obtaining $s_A^{2}$. The smaller $s^{2}/n$, the better the estimator for $s_A^{2}$. This fact plays an important role when transferring uncertainty components from one experiment to another.

Two-way fully nested design

The expressions of the variances $s_A^{2}$, $s_{B\subset A}^{2}$, and $s^{2}$ from a two-way fully nested design are developed in a similar way as those for the one-way layout. As in uncertainty calculations for especially stability studies often a two-way design is necessary (1: time; 2: samples; 3: repeatability of measurement), the formulae are given below. The model for a two-way ANOVA has been given in Eq. (2). The grand mean is computed using
$\bar{\bar{Y}} = \frac{1}{\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}}\sum_{i=1}^{a}\sum_{j=1}^{b_i}\sum_{k=1}^{n_{ij}} Y_{ijk}$ (17)
In this expression, a denotes the number of groups, $b_i$ the number of subgroups within groups, and $n_{ij}$ the number of observations in the subgroups. The formula

The double bar over Y denotes the grand mean; although one could argue that it should be a triple bar, a grand mean in the ANOVA literature is always denoted by a double bar. Likewise, group and subgroup means ($\bar{Y}_A$ and $\bar{Y}_B$) are always denoted by a single bar. The subscript denotes the level: A is the top level, B is the second level. The second step in developing expressions for the variances $\sigma_A^{2}$, $\sigma_{B\subset A}^{2}$, and $\sigma^{2}$ is to convert the sums of squares into their respective mean squares. The expressions for the three mean squares are
$MS_{\mathrm{among}} = \frac{SS_{\mathrm{among}}}{a-1}$ (21)
$MS_{B\subset A} = \frac{SS_{B\subset A}}{\sum_{i=1}^{a} b_i - a}$ (22)
$MS_{\mathrm{within}} = \frac{SS_{\mathrm{within}}}{\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij} - \sum_{i=1}^{a} b_i}$ (23)
which are, after the discussion of the one-way ANOVA, self-explanatory. In the denominators, the expressions for the respective number of degrees of freedom are given. The number of degrees of freedom is equal to the number of observations, minus the number of parameters computed from them. Thus, the number of degrees of freedom among groups equals $a-1$, as there are a group means and there is one parameter computed from them (in fact, the grand mean).
These mean squares can be converted into variances using the following expressions
$\sigma_{\mathrm{method}}^{2} = MS_{\mathrm{within}}$ (24)
$\sigma_{B\subset A}^{2} = \frac{MS_{B\subset A} - MS_{\mathrm{within}}}{n_0}$ (25)
$\sigma_A^{2} = \frac{MS_{\mathrm{among}} - n_0'\,\sigma_{B\subset A}^{2} - \sigma^{2}}{(nb)_0}$ (26)
Again, $MS_{\mathrm{within}}$ equals $\sigma^{2}$. $\sigma_{B\subset A}^{2}$ is calculated by subtracting $MS_{\mathrm{within}}$ from $MS_{B\subset A}$. Likewise, $\sigma_A^{2}$ could be obtained by subtracting $MS_{B\subset A}$ from $MS_{\mathrm{among}}$. For incomplete data sets, it is less effort to compute $\sigma_A^{2}$ as
Uncertainty calculations in the certification of reference materials. I. Principles of analysis of variance 91
stated in Eq. (26) [1]. For higher-order ANOVAs, the pattern is similar. In [1], higher-order ANOVA as well as other designs of ANOVA are given. The expressions for the denominators of Eqs. (25 and 26) as well as in the nominator of Eq. (26) read as follows
(27)
$n_0 = \frac{\ldots}{\sum_{i=1}^{a} b_i - a}$ (28)
$(nb)_0 = \frac{\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij} - \dfrac{\sum_{i=1}^{a}\left(\sum_{j=1}^{b_i} n_{ij}\right)^{2}}{\sum_{i=1}^{a}\sum_{j=1}^{b_i} n_{ij}}}{a-1}$ (29)
As already pointed out, for practical reasons, the formulae for incomplete data sets are given, deliberately. In most references, e.g. ISO 5725-3 [5] among others, usually only the formulae for complete data sets are given. From a theoretical point of view, this may find its justification in that ANOVA has been developed for complete data sets, but a few observations (or even subgroups) missing does not necessarily mean that the whole experimental set-up has become invalid. However, the formulae needed certain modifications in order to work with the (approximately) correct numbers of degrees of freedom. Obviously, the more "holes" in the data set, the poorer the method works, and the poorer the results.

Useful relationships and inferences

The significance of ANOVA goes beyond the applications sketched here. Traditionally, in chemistry ANOVA has always been associated with the F-test, testing mean square ratios for significance. Although there are cases where this becomes relevant, in uncertainty evaluations it is rarely needed. Often, it is sufficient to draw up the expression for the combined standard uncertainty, and if components coming from ANOVA are insignificant, then they will effectively drop out in the summation anyway. What matters is the significance of the uncertainty components in the ANOVA in relation to the combined standard uncertainty, not so much in relation to the repeatability of the measurement method used.
Returning to a one-way ANOVA, a useful relationship can be developed from the expression for $MS_{\mathrm{among}}$. It is defined as
$MS_{\mathrm{among}} = \frac{\sum_{i=1}^{a} n_i\left(\bar{Y}_i-\bar{\bar{Y}}\right)^{2}}{a-1}$ (30)
The second relationship that is of interest is the expression for a sample variance
$s^{2} = \frac{\sum_{i=1}^{a}\left(\bar{Y}_i-\bar{\bar{Y}}\right)^{2}}{a-1}$ (31)
If, in the expression for $MS_{\mathrm{among}}$, $n_i$ is set to unity for all i, then this expression becomes identical with the one for a sample variance for a row of results. So, from a matrix of a one-way analysis of variance, $MS_{\mathrm{among}}$ can be computed directly from the variance of the group means.
Furthermore, returning to the model of a one-way ANOVA and the principle of propagation of uncertainties, the following expression can be developed
(32)
For the group means, the following expression can be derived
(33)
whereby it has been assumed that all groups are complete, i.e. $n_i = n$ for all i. Combining this result with Eq. (31) leads to the interesting result that
$s^{2} = s_A^{2} + \frac{s_{\mathrm{within}}^{2}}{n}$ (34)
which is consistent. Furthermore, under the assumption that $n_i = n$ for all i,
(35)
These formulae also open up other options. According to GUM [2], there is no difference in nature and properties of a standard uncertainty coming from a Type A or a Type B evaluation. Accepting this principle, the formulae given also open up possibilities to work with a combination of Type A/Type B evaluation of uncertainty. Especially in cases, where a series of data is to be processed from which it is known that the values carry an uncertainty (apart from the variation inherent to the data set), the formulae given may provide a basis for developing a procedure for the uncertainty analysis, based on this ANOVA work. This actually is one of the
reasons why the ANOVA theory is very suitable for the description of the certification of batch reference materials.
An illustration of this runs as follows. Suppose a series of data is obtained, with a certain degree of scattering, but from which it is known that each member of the series has some additional measurement uncertainty. This measurement uncertainty may be a combination of uncertainty from Type A and Type B analysis, but it is assumed that there is only one additional uncertainty component ($u_{\mathrm{add}}$) to be considered, and it comes from a Type B evaluation. This assumption does not affect the general validity of this inference. Using Eq. (31), the variance can be computed. What about the uncertainty of the mean? It is known that each $Y_i$ in the series has this component $u_{\mathrm{add}}$. Given Eq. (33), it must be noted that it is impossible to determine $u^{2}(A_i)$ and $u^{2}(\varepsilon_{ij})$, separately. This is the "penalty" for not having more than one data point per "group".
Applying the principle of uncertainty propagation, two alternatives can now be developed, which represent extremes
$u^{2}(m) = \frac{s^{2}}{a} + u_{\mathrm{add}}^{2}$ (36)
and
$u^{2}(m) = \frac{s^{2} + u_{\mathrm{add}}^{2}}{a}$ (37)
The difference between the two is obvious. The first alternative leads to a greater value for $u^{2}(m)$ than the second. Under what conditions can the second be used, under what conditions must Eq. (36) be used? It depends on the nature of $u_{\mathrm{add}}$, and at what level it affects the uncertainty in the mean m. If $u_{\mathrm{add}}$ is the same for all $Y_i$, it is clear that $u_{\mathrm{add}}$ affects m at its own level: the scattering of $Y_i$ has no relationship with the value of $u_{\mathrm{add}}$. If $u_{\mathrm{add}}$ is specific to each $Y_i$, then it affects the uncertainty of m at the level of $Y_i$, and in this case Eq. (37) can be used.
In terms of correlations, Eq. (36) represents the case where the system is fully correlated with respect to $u_{\mathrm{add}}$, whereas Eq. (37) represents the fully independent case. In practice, usually it is not clear whether this additional uncertainty source is correlated or not. As a principle, in lack of information, the conservative alternative should be chosen, unless positive evidence is available that the assumption of independence holds. That is, in cases of doubt, Eq. (36) should be used instead of Eq. (37).

Underlying assumptions revisited

As with most mathematical and statistical techniques, the computational methods as presented are only valid within certain restrictions. The first equation setting restrictions is the model, as specified in Eq. (1) for a one-way ANOVA and in Eq. (2) for a fully nested two-way ANOVA. These assumptions have already been discussed. The requirement of independence of the variables on the right-hand sides of Eqs. (1) and (2) is probably the most critical one. This assumption seems to be met in many cases in analytical chemistry as well as in physical testing, but it cannot be taken for granted that this assumption is always valid. For instance, heterogeneity of a material will in a homogeneity study lead to a greater value for $\mathrm{Var}(A_i)$, but usually also $\mathrm{Var}(\varepsilon_{ij})$ increases with increasing heterogeneity.
Another important point related to the model is that the data obtained does not show any trend. This may seem obvious, but for some applications (e.g. computing uncertainty budgets from stability studies) it is something that should be carefully investigated. In absence of any kind of trend, the ANOVA approach is valid. Otherwise, some kind of trend analysis or regression technique is recommended. This requirement boils down to the usefulness of a grand mean: in absence of a trend, the grand mean is (from a theoretical point of view) a useful property. If there is a trend, the concept of calculating a grand mean, other than for internal purposes of the regression analysis, is of doubtful value.
A further assumption already mentioned is the normality of data. This assumption is quite notorious, and has led to a variety of statistical tests for normality. Probably the best known test for this purpose is the Kolmogorov-Smirnov test [4] or one of its variants. Normality of data is often assumed, but not so often observed as desired. Statistical tests, which require normality of data, are known to be very sensitive with respect to deviations from normality. A skewed distribution may very well lead to completely wrong decisions.
In the applications discussed here, ANOVA is used as a method for obtaining values for uncertainty components. These values are variances, and their value is relatively insensitive with respect to the underlying distribution. This means that the evaluation method as discussed is quite robust with respect to the actual distribution of the data. This is an important aspect, as it makes testing for normality redundant. The variances thus obtained can be treated as any other variances from Type A or Type B evaluations [2].
The fact that non-normality of data does not play such a role as in the classical use of ANOVA is discussed in ISO 5725-1 and -3 [5, 6], and ISO Guide 35 [7]. In the classical approach, the F-test for testing significance of ratios of means of squares plays a dominant role. This F-test is very sensitive with respect to non-normality of the underlying data, thus leading to false-positive or false-negative results.
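
The one-way relationships above lend themselves to a short numerical check. The following Python sketch assumes a complete data set ($n_i = n$ for all i, so $n_0 = n$) and solves Eqs. (10) and (11) for the among-group variance; it also returns the variance of the group means so that Eq. (34) can be verified, and adds the two alternatives of Eqs. (36) and (37). The function names and the dictionary keys are illustrative assumptions, not notation from the paper, and a negative among-group estimate is returned unchanged because the text does not say how to treat it.

```python
from statistics import mean, variance


def one_way_variance_components(groups):
    """Mean squares and variance components for a complete one-way layout.

    `groups` is a list of a groups, each a list of n results (n_i = n).
    """
    a = len(groups)
    n = len(groups[0]) if groups else 0
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least 2 groups of equal size n >= 2")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(v for g in groups for v in g)

    # Sums of squares: among groups, Eq. (6), and within groups
    ss_among = sum(n * (gm - grand_mean) ** 2 for gm in group_means)
    ss_within = sum((v - gm) ** 2
                    for g, gm in zip(groups, group_means) for v in g)

    ms_among = ss_among / (a - 1)        # Eq. (9)
    ms_within = ss_within / (a * n - a)  # Eq. (8)

    # Eqs. (10) and (11) solved for the among-group variance with n_0 = n
    s2_among = (ms_among - ms_within) / n

    return {
        "ms_among": ms_among,
        "ms_within": ms_within,
        "s2_among": s2_among,
        "s2_group_means": variance(group_means),  # Eq. (31)
    }


def u2_mean_fully_correlated(s2, a, u_add):
    """Eq. (36): u_add is common to all a results."""
    return s2 / a + u_add ** 2


def u2_mean_independent(s2, a, u_add):
    """Eq. (37): u_add is specific to each of the a results."""
    return (s2 + u_add ** 2) / a


if __name__ == "__main__":
    data = [[10.0, 10.2], [10.4, 10.6], [9.8, 10.0]]  # made-up illustration data
    print(one_way_variance_components(data))
```

With the made-up data in the example, the variance of the group means equals `s2_among + ms_within / n`, which is the content of Eq. (34); for any positive `u_add` and more than one result, Eq. (36) gives the larger (more conservative) value.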
References
1. Sokal RR, Rohlf FJ (1995) Biometry, 3rd edn. Freeman, New York
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva
3. Snedecor GW, Cochran WG (1989) Statistical methods, 8th edn. Iowa State University Press, Iowa, USA, Chapter 13
4. Law AM, Kelton WD (1991) Simulation modeling and analysis, 2nd edn. McGraw Hill, New York, Chapter 6
5. ISO 5725-3:1994 (1994) Accuracy (trueness and precision) of measurement methods and results - Part 3: Intermediate measures of the precision of a standard measurement method in statistical methods for quality control. International Organization for Standardization (ISO), Geneva, pp 75-104
6. ISO 5725-1:1994 (1994) Accuracy (trueness and precision) of measurement methods and results - Part 1: General principles and definition of statistical methods for quality control, vol. 2. International Organization for Standardization (ISO), Geneva, pp 9-29
7. ISO Guide 35:1989 (1989) Certification of reference materials - General and statistical principles, 2 edn. International Organization for Standardization (ISO), Geneva
Accred Qual Assur (2001) 6:26-30
© Springer-Verlag 2001
[Figure: lay-out diagram of a study; recoverable labels: sub-samples, transformation, n measurements, among groups]
Uncertainty calculations in the certification of reference materials. 2. Homogeneity study
A. M. H. van der Veen · T. Linsinger · J. Pauwels

[Fig. 2 Lay-out of a within-bottle homogeneity study (within-bottle homogeneity testing: sub-samples, transformation, measurements, variation among groups)]

... resolution of the method. The better the resolution, the smaller the effects that can be estimated.

Looking from the perspective of the "Guide to the expression of uncertainty in measurement" (GUM) [8], the variation of the bottle averages ($u_{c(bb)}$) is a combined uncertainty consisting of the between-bottle heterogeneity ($s_{bb}$) and the measurement variation ($s_{meas}$). The latter comprises analytical variation and within-bottle heterogeneity, which should be pooled for the estimation of between-bottle heterogeneity anyway. The relationship between $u_{c(bb)}$, $s_{bb}$ and $s_{meas}$ can be expressed as [6]

$$u_{c(bb)}^2 = s_{bb}^2 + s_{meas}^2 \qquad (1)$$

which implies that

$$s_{bb}^2 = u_{c(bb)}^2 - s_{meas}^2 \qquad (2)$$

Note that $u_{c(bb)}$, $s_{bb}$ and $s_{meas}$ were named $u_{exp}$, $u_{betw}$ and $u_{meas}$, respectively, in [6]. $s_{meas}$ is the analytical variation divided by the square root of the number of replicates per bottle. By increasing the number of replicates per bottle, a small $s_{meas}$ can be obtained even for methods with poor repeatability, thus allowing a good estimation of $s_{bb}$, the estimate of between-bottle variation sought for.

Equation (1) obviously cannot be used if the variation of the measurement is large compared to the heterogeneity, without looking at the repeatability standard deviation of the measurements, $s_{meas}$. In Part 1 [1], it has been demonstrated that the variance "among groups" is affected by the variance within groups. This means that for small values of $s_{bb}$ a problem of quantification arises. There are two principal choices to deal with this:
1. Acceptance of the value for $s_{bb}$, even if it is zero, because the samples are expected to be homogeneous (e.g. solutions) or known from previous experience to have negligible heterogeneity when prepared properly
2. Application of a more conservative estimation technique for this uncertainty source, based on $s_{bb}$ as observed and $s_{meas}$.

Both options comply with GUM. Apart from the expectations about the heterogeneity aspect, $s_{meas}$ should also fulfil the requirement to be smaller than the repeatability standard deviation of the measurements in the characterisation. If this requirement is not clearly fulfilled, then in any case a more conservative estimator than the value of $s_{bb}$ as observed from ANOVA is necessary. This is an effect of the existing correlation between both standard deviations. This topic will be covered in more detail in Part 4, about the certification process, as it also has consequences for the treatment of stability data.

For within-bottle homogeneity studies, a similar reasoning can be developed. The uncertainty from the experiment can be expressed as [6]

$$u_{c(wb)}^2 = s_{wb}^2 + s_{method}^2 \qquad (3)$$

which leads to

$$s_{wb}^2 = u_{c(wb)}^2 - s_{method}^2 \qquad (4)$$

with $u_{c(wb)}$ being the combined standard uncertainty of the experiment, $s_{wb}$ the within-bottle variation and $s_{method}$ the intrinsic variability of the method. These terms were named $u_{exp}$, $u_{inh}$ and $u_{meas}$, respectively, in [6]. The second term of this expression differs from that of Eq. (1) due to the difference in experimental design. Usually, $s_{method}$ cannot be determined independently, as this would require a material of the same type with perfect within-unit homogeneity, which renders estimation of $s_{wb}$ impossible. In these cases, $u_{c(wb)}$ must be used to estimate the minimum sample intake. To diminish the influence of $s_{method}$ as much as possible, a sample intake should be chosen for which $s_{wb}$ is much larger than $s_{method}$. Examples of this approach for (trace) elements are solid sampling (SS) techniques like solid sampling electrothermal atomic absorption spectrometry (SS-ETAAS) or solid sampling inductively coupled plasma-mass spectrometry (SS-ICP-MS). In any case, $s_{wb}$ is not part of the uncertainty of the certified reference material, as will be explained in Part 4. It is only needed to establish the minimum sample intake for which the stated uncertainty is valid.

Evaluation of a between-bottle homogeneity study

For a clay soil sample, 18 samples were taken out of a batch for a homogeneity study on barium. The results, expressed in mg/kg on a dry basis, are given in Table 1 [Van Son M, Van der Veen AMH, Verkuil D, unpublished data]. The measurements were carried out by ICP-MS on extracts obtained from aqua regia digestion using NEN 6465. Using Excel, the following ANOVA table (one-way layout) can be computed (Table 2). The column "SS" provides the sums of squares, the column "df" the associated degrees of freedom, and the column "MS" the mean squares, which form the basis for the computation of variances as discussed in Part 1 [1]. The F-test indicates that the result of the homogeneity study is not significant ($F < F_{crit}$, the critical value of F for α = 5%). The P-value gives the level for which the observed F equals $F_{crit}$.

Table 1 Homogeneity study of barium in soil

Sample #  Data #1  Data #2  Data #3  Mean  s   n
O1Hs      323      301      310      311   11  3
0201      340      334      3Hi      330   12  3
0383      320      321      309      317   7   3
0442      315      338      321      325   12  3
0557      326      338      325      330   7   3
0666      325      302      304      310   13  3
0791      324      331      317      324   7   3
0918      310      310      331      317   12  3
1026      336      321      328      328   8   3
1133      310      328      312      317   10  3
1249      314      314      302      310   7   3
1464      329      300      299      309   17  3
1581      320      329      311      320   9   3
1607      322      312      311      315   6   3
1799      332      317      299      316   17  3
1877      313      294      293      300   11  3
1996      324      314      335      324   10  3
2000      321      342      316      327   14  3

Table 2 Analysis of variance (ANOVA) table for barium in soil

Source of variation  SS    df  MS   F     P-value  Fcrit
Between groups       3467  17  204  1.66  0.10     1.92
Within groups        4412  36  123
Total                7880  53

The calculation of uncertainties is now very straightforward. The repeatability of the test method is just the square root of $MS_{within}$, equal to 11 mg/kg (= 3.5%). For the variance among groups, the following expression can be used

$$s_A^2 = s_{bb}^2 = \frac{MS_{among} - MS_{within}}{n} \qquad (5)$$

This equation can be used instead of the formulas from Part 1, as in this case the data matrix is complete (all groups have the same number of members, n = 3). The variance is 27 mg²/kg²; the standard deviation between bottles ($s_{bb}$) is 5.2 mg/kg (= 1.6%), which is the contribution of heterogeneity to the uncertainty of one bottle. The link between Eqs. (1) and (5) is as follows: $MS_{among}$ is equal to n times $u_{c(bb)}^2$ and $MS_{within}$ is equal to $s_{method}^2$ (see Part 1 of this paper). As $s_{meas} = s_{method}/\sqrt{n}$, the equivalence of Eqs. (2) and (5) follows. As can be seen, the computation of the grand mean is not needed for this uncertainty evaluation, although Excel will calculate this parameter internally.

Evaluation of a between-bottle homogeneity study in an alternative format

In the experimental set-up described above, method repeatability and between-bottle heterogeneity were estimated by analysing several units n times each. Frequently a different approach to obtain estimates for $s_{meas}$ and $s_{bb}$ is used: one unit is analysed several times to obtain an estimate for $s_{method}$, and several units are then analysed in one replicate each to obtain an estimate of $u_{c(bb)}$. For the estimation of the between-bottle variability by this approach, it is vital that the results from one unit and from the different units are obtained by the same technique using the same sample intake. As n = 1 in this case, $s_{meas} = s_{method}$ and $s_{bb}$ can be estimated according to Eq. (1).

For the certification of a mussel-tissue material [9], ten units were used for the homogeneity study on selenium. Five determinations on one unit were performed, whereas the other nine units were analysed once. All analyses were done by k0-NAA without sample pretreatment. The results of this study are given in Table 3. In this case, $s_{bb}$ amounts to 2.84%, which is the uncertainty contribution of homogeneity to the uncertainty of one bottle.

Table 3 Homogeneity study for selenium in mussel tissue

                       5 replicates from   Results from 9
                       one unit [mg/kg]    different units [mg/kg]
                       1.907               1.872
                       1.917               1.874
                       1.961               1.928
                       1.901               1.833
                       1.834               1.944
                                           1.840
                                           1.726
                                           1.952
                                           1.861
Average                1.904               1.870
Standard deviation     0.046               0.070
Variation coefficient  2.40% (s_meas)      3.72% (u_c(bb))

Obviously, this method requires fewer measurements than performing the complete matrix of measurements for the ANOVA evaluation. This advantage is paid for with less significant results, as measurement repeatability is not reduced by replication. The effect is that $s_{meas}$ is more likely to be larger than $u_{c(bb)}$, leading to the problems already addressed about $s_{bb}$ values tending to zero. This format may therefore lead to a greater uncertainty for the certified reference material, as in many cases a more conservative value for the uncertainty due to between-bottle variation must be inserted rather than the value of $s_{bb}$ as obtained directly from ANOVA. On the other hand, this example also demonstrates how to conduct a homogeneity test in cases where repetition of measurements is impossible. In these cases, the results in the first column of Table 3 must be obtained from other sources ("Type B evaluation" [8]), such as the quality manual of the laboratory, validation data or some other source.

Conclusions

A general framework for the estimation of within- and between-bottle heterogeneity has been developed. The approach consists of correcting an estimation of heterogeneity, impaired by measurement variability, for measurement variability. Reliable estimates for between- and within-bottle heterogeneity can be obtained given that the measurement variability is small compared to the heterogeneity to be detected. If the requirement of low measurement variability compared to heterogeneity is not met, more conservative approaches should be employed.

The method as such works equally well on the fully nested ANOVA designs of experiments, and on other formats. The underlying theory and concepts are the same, but the implementation differs. This allows application of the method for homogeneity tests on test pieces in destructive testing as well, which has great benefits.

This work also shows that carrying out homogeneity tests cannot and must not be separated from other parts of the certification project (e.g. stability studies, characterisation measurements), as the accuracy of the measurements in the homogeneity study has important implications for the establishment of the combined standard uncertainty of the candidate reference material.
References
1. Van der Veen AMH, Pauwels J (2000) Accred Qual Assur 5:464-469
2. Pauwels J, Kurfürst U, Grobecker KH, Quevauviller P (1993) Fresenius J Anal Chem 345:478-481
3. ISO Guide 35:1989 (1989) Certification of reference materials - General and statistical principles, 2nd edn. International Organization for Standardization (ISO), Geneva, Switzerland
4. Schiller SB (1996) Statistical aspects of the certification of chemical batch SRMs. NIST Special Publication 260-125. NIST, Gaithersburg, USA
5. BCR Guidelines (1994) Standards, Measurement and Testing Programme, Brussels, Belgium
6. Pauwels J, Lamberty A, Schimmel H (1998) Accred Qual Assur 3:51-55
7. Van der Veen AMH, Alink A (1998) Accred Qual Assur 3:20-26
8. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva, Switzerland
9. Lamberty A, Muntau H (1999) The certification of the mass fractions of As, Cd, Cr, Cu, Hg, Mn, Pb, Se and Zn in mussel tissue Mytilus edulis. EUR 18840 EN
Accred Qual Assur (2001) 6:257-263
© Springer-Verlag 2001
$$b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{\sum_{i=1}^{n}x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)\left(\sum_{i=1}^{n}y_i\right)}{\sum_{i=1}^{n}x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n}x_i\right)^2} \qquad (5)$$

whereby it should be noted that the first expression is more suitable for numerical work in computer programs, whereas the second is more suitable for pocket calculators, as the computations are less tedious. The first expression for the slope is less sensitive to round-off errors, so in most cases more accurate.

The estimate for the intercept can be computed from

$$b_0 = \bar{y} - b_1\bar{x} \qquad (6)$$

Using the error propagation formula [1], the standard deviations of $b_1$ and $b_0$ can be computed. The estimated standard deviation of $b_1$ is given by

$$s(b_1) = \frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \qquad (7)$$

whereby

$$s^2 = \frac{\sum_{i=1}^{n}(y_i - b_0 - b_1 x_i)^2}{n-2} \qquad (8)$$

[Table 1 (fragment): source "Due to regression", sum of squares $\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2$, 1 degree of freedom, mean square $\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2/1$]

The mean square due to regression is often denoted as $SS(b_1|b_0)$, to be read as "sum of squares for $b_1$ after allowance has been made for $b_0$". The mean square about regression ($s^2$) is an estimate for the property denoted by $\sigma^2_{y \cdot x}$ and called the variance about regression.

The ratio $MS_{reg}:s^2$ can be tested for significance using the F-tables. Table 1 provides the necessary information with respect to the degrees of freedom. The advantage of using the F-table instead of the method using the t-test is twofold:
1. The F-table is generated by most software systems by default.
2. The F-table can readily be extended to other regression models, which makes it more widely applicable.

Irrespective of what kind of test is used, it should be noted that the outcome is only meaningful if the repeatability standard deviation of measurement, possibly in conjunction with the between-bottle homogeneity, is sufficiently small. It can be demonstrated that if the repeatability standard deviation is comparable to that of the homogeneity study and the characterisation of the material (e.g. the determination of the property value), this requirement is met.
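
The slope, intercept and standard deviation of the slope from Eqs. (5) to (8) can be sketched in a few lines of Python. This is an illustration only: the function name `fit_line` and the four data points are hypothetical and are not taken from any of the stability studies in this paper. The "computer" form of Eq. (5) is used for the slope, and the "pocket calculator" form is evaluated alongside it to show that both give the same value.

```python
import math

def fit_line(x, y):
    """Least-squares line y = b0 + b1*x following Eqs. (5)-(8)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                       # Eq. (5), first expression
    b0 = y_bar - b1 * x_bar              # Eq. (6)
    s2 = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)  # Eq. (8)
    s_b1 = math.sqrt(s2) / math.sqrt(sxx)                                 # Eq. (7)
    return b0, b1, s2, s_b1

def slope_shortcut(x, y):
    """Eq. (5), second expression (sums of raw products)."""
    n = len(x)
    num = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    den = sum(xi * xi for xi in x) - sum(x) ** 2 / n
    return num / den

# Hypothetical data: time in months, measured value in arbitrary units
months = [0, 2, 4, 6]
values = [10.0, 9.8, 9.9, 9.6]

b0, b1, s2, s_b1 = fit_line(months, values)
print(b0, b1, s2, s_b1, slope_shortcut(months, values))
```

For data with a large offset from zero, the second expression subtracts two large, nearly equal sums, which is the round-off sensitivity the text refers to.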
dent. This might seem obvious, but it is not. If a material shows considerable between-bottle heterogeneity, it can also be expected that the stability of the material differs from bottle to bottle, as the stability of the material will depend (among others) on the composition of the material. However, as the preparation of a reference material involves the reduction of heterogeneity and the improvement of the stability, it is for most reference materials reasonable to assume independence between effects from heterogeneity and instability.

The model also raises another question: is it possible to estimate $s_{bb}$ from a stability study? The answer is that statistically speaking it is possible. Especially in those cases where the effect of (in)stability is expected to be small anyway, it is certainly an option. A two-way fully nested ANOVA will do the job and provide the three standard deviations $s_r$, $s_{bb}$ and $s_{stab}$. As stability studies are often carried out at different temperatures, it is to be recommended to pick the value at one of the lowest temperatures.

Furthermore, it can be noted that if $s_{stab}$ is zero or sufficiently close to zero, it is possible to scatter the bottles along the time axis, and to consider the set of data as a homogeneity study. For example, if at 5 different points in time 3 bottles have been measured, then, provided that $s_{stab} \approx 0$, the experiment can be evaluated as a between-bottle homogeneity study with 15 bottles. If there is some instability left, that is, $s_{stab}$ is not exactly zero, this will be accounted for in the estimate obtained for $s_{bb}$ when evaluating the data as a homogeneity study.

In the classical design, the expression for the uncertainty reads as

$$u^2(y_{ijk}) = s_{stab}^2 + s_{lor}^2 + s_{bb}^2 + s_r^2 \qquad (11)$$

whereby one term has been added, the variance due to lack of repeatability, $s_{lor}^2$. This term represents the stability of the measurement system. The measurements in a classical stability study take place under (within-laboratory) reproducibility conditions. The other terms are identical to the isochronous case. The problem with the classical stability study is that the separation between $s_{stab}$ and $s_{lor}$ is not possible; as a result, the model describing a typical analysis of variance layout (two-way, fully nested design) reads as

$$u^2(y_{ijk}) = s_{stab'}^2 + s_{bb}^2 + s_r^2 \qquad (12)$$

whereby $s_{stab'}$ now denotes the uncertainty component due to instability of the measurement system and the material. This is the case for both the short-term stability study and the long-term stability study.

Uncertainty evaluation of stability monitoring

The uncertainty evaluation of stability monitoring is quite different from the long-term and short-term stability studies. First, it should be noted that stability monitoring does not affect the uncertainty statement of the reference material on the certificate, $u_{CRM}$, and it is unnecessary as will be demonstrated. The uncertainty from monitoring can be expressed as

(13)

whereby $u_{CRM}$ denotes the combined standard uncertainty of the reference material, and $u_{meas}$ the uncertainty from measurement, including calibration. Ideally, $u_{CRM}$ is considerably greater than $u_{meas}$, but it should be considered that this is not always possible. Furthermore, the measurements should be carried out in such a way that their validity is not demonstrated by means of the CRM itself. One cannot check two things at the same time in one experiment. The validity of the CRM is to be reconfirmed, which is only possible if the measurement is demonstrably reliable.

If these experimental conditions are fulfilled, both the property value and its expanded uncertainty are reconfirmed. There is, under these conditions, no need to increase $u_{CRM}$, as the uncertainty from measurement is something which must be accounted for separately. This is true both for monitoring and for the normal use of the CRM. It should however be noted that for the sake of the validity of the monitoring measurement, $u_{meas}$ should be as small as possible, and certainly not exceed $u_{meas}$ from a typical user of the CRM, who will use a similar approach for verifying her measurements.

An alternative approach is to consider the point obtained in monitoring as just the next point in the stability study, and from the complete set of data, a new estimate for $s_{lts}$ can be obtained. If necessary, the uncertainty of the CRM can be reviewed, but usually the evaluation will only reconfirm the value of $s_{lts}$ already obtained and just extend the shelf-life (i.e. the time for which the certificate is considered to be valid).

Examples

An example of the results of an isochronous stability study is shown in Table 2, which lists the results of a 12-month isochronous measurement for the determination of total glucosinolate in rapeseed, BCR 190R. Two units were analysed in triplicate for each time.

A standard deviation between bottles of 0.28 µmol/kg (1.1%) was calculated, which corresponds well with the homogeneity study, in which a method repeatability of 3.7% and a between-unit variation of 1.4% were estimated [3]. The standard deviation between times was estimated as 0.31 µmol/kg (1.4%). The detected instability was refuted by subsequent stability studies. Results of a classical stability study are shown in Table 3 [7].

Performing a one-way ANOVA gives a standard deviation within groups of 0.063 µg/kg (7.9%) and a standard
A. M. H. van der Veen et al.
Table 3 Results for a stability study for aflatoxin M1 in milk powder in µg/kg [7]
t=1 month  t=2 months  t=4 months  t=6 months  t=8 months  t=10 months  t=12 months
deviation between groups of 0.064 (8.0%). This result is obviously strongly influenced by measurement reproducibility. Looking at the results, the aflatoxin content seems to decrease initially, rise afterwards and decrease again. Such behaviour is very unlikely, but would nevertheless be included in an estimation of uncertainty of stability.

Discussion

Evaluating stability studies using ANOVA seems to neglect the information of the relative position of the measurement in time. As is shown in the Annex to this paper, the statistics from ANOVA and those of regression analysis are closely related. For obtaining the estimate of $s_{lts}$, it seems that a stability study with measurements after 0, 2, 4 and 6 months contains the same information as one with measurements after 0, 8, 16 and 24 months. When using the appropriate expressions for extrapolating the data and the evaluation of measurement uncertainty, it will become clear that the 24-month stability study will be different from that of 6 months, as would be expected. The expressions for the uncertainty must appreciate the distance between the centre of gravity of the data.

In many cases, $s_{stab}^2$ will be smaller than the sum of the other contributions to $u^2(y_{ijk})$ in Eqs. (1) and (3), which makes the estimation of $s_{stab}$ impossible. This problem was already addressed in Part 2 for the homogeneity study [3]. The same options for treating these cases exist for the stability study:
1) Accept the low value and conclude that uncertainty of stability is negligible compared to the other uncertainty contributions.
2) Choose a more conservative approach like, for example, the ones outlined in [10].

If profound knowledge of the material and the production process is required to adopt possibility 1) for the homogeneity case, it is even more true for stability testing. Homogeneity of a material can be assumed constant over time. Neglecting a small inhomogeneity will therefore result in only slight underestimation of uncertainty. On the contrary, instability exacerbates with time. Degradation between the monitoring measurements may therefore result in unrealistic uncertainty statements if no allowance for possible degradation is made.

All these points emphasise the importance of profound knowledge of the material and possible degradation pathways. Being able to predict degradation allows one actively to counteract it, which is to be preferred over any statistical evaluation of the facts. Knowledge also allows a more reliable estimation of $u_{lts}$ for those cases in which degradation cannot be prevented than statistical evaluation a posteriori.

Finally, it should be noted that the estimates $s_{lts}$ and, to a lesser extent, $s_{sts}$ form only the basis for the values for these uncertainty components in the expression of the measurement uncertainty of the property values. One of the aspects not covered here is the development of a recipe for a shelf-life, in conjunction with developing an appropriate estimate for $u_{lts}$ at the shelf-life.

Conclusions

A framework for the estimation of uncertainty of stability from ANOVA has been developed. The approach separates between-bottle variation from measurement repeatability and variation in time. For isochronous measurements, this variation in time represents variation due to instability. For classical stability studies, this variation is confounded with the intralaboratory reproducibility.
Uncertainty calculations in the certification of reference materials. 3. Stability study
Estimates of uncertainty of stability are therefore smaller when using isochronous schemes.

It has been shown that estimates for between-unit variation can be obtained from stability studies. These can be used to back up original homogeneity studies or may even serve as the sole homogeneity study.

The method requires that variation due to stability is not negligible compared to measurement and between-unit variation. If this requirement is not met, a more conservative approach should be employed. However, the decision about this should be made based on a profound knowledge of the material.
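
The remark in the Discussion that a study sampled at 0, 2, 4 and 6 months is not equivalent to one sampled at 0, 8, 16 and 24 months can be made concrete with Eq. (7) of the Annex, in which the standard deviation of the slope depends on the spread of the time points. The sketch below is only one way of looking at it: the helper `slope_sd` is hypothetical, and the same residual standard deviation s = 1 (arbitrary units) is assumed for both designs.

```python
import math

def slope_sd(times, s):
    """Eq. (7): s(b1) = s / sqrt(sum((t_i - t_bar)^2))."""
    t_bar = sum(times) / len(times)
    return s / math.sqrt(sum((t - t_bar) ** 2 for t in times))

# Same number of points and same residual scatter, different time spans
sd_6_months = slope_sd([0, 2, 4, 6], s=1.0)
sd_24_months = slope_sd([0, 8, 16, 24], s=1.0)

ratio = sd_6_months / sd_24_months
print(sd_6_months, sd_24_months, ratio)
```

Under these assumptions the slope from the 24-month design is about four times better determined than that from the 6-month design, although an ANOVA that ignores the time axis would treat the two data sets alike.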
References
1. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1995) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva, Switzerland
2. Van der Veen AMH, Pauwels J (2000) Accred Qual Assur 5:464-469
3. Van der Veen AMH, Linsinger TPJ, Pauwels J (2000) Accred Qual Assur 6:26-30
4. Van der Veen AMH (2000) "Determination of the certified value of a reference material appreciating the uncertainty statements obtained in the collaborative study", presented at AMCTM 2000 conference, Monte de Caparica, May 2000
5. Draper NR, Smith H (1981) "Applied regression analysis", 2nd edn. Wiley, New York, chapter 1
6. Lamberty A, Schimmel H, Pauwels J (1997) Fresenius J Anal Chem 360:359-361
7. Van Egmond HP, Wagstaffe PJ (1992) "The certification of aflatoxin M1 in four milk powder samples. CRM No's 282, 283, 284 and 285", European Commission, EUR 10412
Accred Qual Assur (2000) 5: 231-237
© Springer-Verlag 2000
Due to CRMs used in calibration, $u_{CRM}$ | Homogeneity, stability, certified value of concentration | As indicated in ISO Guide 35:1989, adopted as SR 13252-2:1995
Due to the use of the measurement standard | Method of certification, calibration, conditions, matrix mismatch | As indicated in ISO Guide 33:1989, adopted as SR 13252-4:1995
Due to the spectrophotometer | Photometric linearity, photometric accuracy, wavelength accuracy, stray light | From technical specification of the producer, from calibration certificate or within the calibration process
considerably simplified by calibration of the measurement system with traceable measurement standards, since calibration considerably reduces the number of uncertainty components that have to be evaluated. Also, for some routine measurements, both SR 13434:1999 and the EURACHEM Guide [5] are used in accredited chemical laboratories, and RMs, or some data and results from previous work, or even the judgment of the experienced analyst are chosen to do this evaluation. Some of the problems experienced by INM in the evaluation of measurement uncertainty of spectrometric results using the genealogical approach are discussed in Ref. [6].

The evaluation of measurement uncertainty using RMs

By definition [7], a RM is a material or a substance one or more of whose property values are sufficiently homogeneous and well established to be used for the calibration of an apparatus, the assessment of a measurement or for assigning values to materials.

Further, a RM, accompanied by a certificate, one or more of whose property values are certified by a procedure which establishes its traceability to an accurate realization of the unit in which the property values are expressed, and for which each certified value is accompanied by an uncertainty at a stated level of confidence, is a CRM.

Thus, according to the above definitions and if they are properly used, both RMs and CRMs may contribute to the evaluation of the performance of a measurement system and to the estimation of measurement uncertainty. Three general cases of how CRMs may be used to evaluate measurement uncertainty are illustrated in Fig. 2. When the matrix of the CRM matches the matrix of the sample being measured, and the homogeneity and stability of the sample have been proven (first situation), the uncertainty of the sample measurements can be equated to that observed in measurement of the CRM. Further, if this match is not possible, but the CRM has a related matrix to the sample (second situation), the test sample uncertainty may be related by a factor of k to that observed when measuring the CRM. When the previous situation does not apply, the measurement of an appropriate CRM (third situation) can give an indication of the measurement uncertainty. Even so, it is recommended that isolated results on CRMs should be completed with the use of control charts if one consistently needs information on the measuring process.

A control chart is simply a graphical way to interpret test data. If a selected RM is measured periodically and the results are plotted sequentially on a graph (chart control type), a lot of information may be obtained on the combined effect of many potential sources of errors occurring in the measuring process. Limits for acceptable values are defined and the chemical measurement system is assumed to be in control as long as the results stay within these limits. The monitored precision of measurement and the accuracy of measurement of the reference material may be transferred, by inference, to all other appropriate measurements made by the system while it is in a state of control (i.e. repeated measurements over a period of time of standard samples processed right through the system are consistent with the measured variance of the system). Thus, the resulting judgment of uncertainty could be assigned to the sample data output of the process provided the following sources of uncertainty are taken into account:
- Uncertainty of the assigned value of the RM
- Reproducibility of the measurements made on the RM
- Any difference between the measured value of the RM and its assigned value
- Difference between the composition of RM and sample
- Difference in the measurement response due to interferences or matrix effects
- Operations that are carried out in the laboratory on samples but not on RMs due to subdivision of the original sample.
Some aspects of the evaluation of measurement uncertainty using reference materials
M. Buzoianu

[Fig. 2: use of CRMs to evaluate measurement uncertainty; surviving panel labels: "Measurement" and "Matrix-Related Measurement" (uncertainty related to that of the CRM by a factor k)]

One may easily note that one of the main prerequisites for estimation of measurement uncertainty using RMs is the stability (statistical control) of the measurement system. Statistical control may be defined as the attainment of a state of predictability [8]. Working under the above-mentioned conditions, the mean of a large number of measurements will approach a limiting value and the individual measurements should have a stable distribution, described by their standard deviation. The limits within which any new measured value will fall can be predicted with a specified probability. Confidence limits for a single measurement or for the mean of a set of measurements can be calculated, and the number of measurements required to obtain a mean value with a given confidence may be estimated.

Some outcomes on the estimation of measurement uncertainty using RMs and CRMs

Two examples of estimation of measurement uncertainty using both the CRM and RM approach are discussed below.

Estimation of measurement uncertainty of a mass fraction result using CRMs

Three examples of this type of evaluation are given in Table 2 for methods based on molecular absorption spectrophotometry, flame atomic absorption spectroscopy (F-AAS), and ICP-OES. In each case samples of NIST-SRM 14g (AISI 1078 type) were analysed using a Perkin-Elmer 192 flame atomic spectrophotometer, a UV2-100 ATI Unicam spectrophotometer, and a Spectroflame ICP-P. Prior to this experiment, each measurement system was metrologically verified in accordance with the legal metrological norms (NML 9-12-97, NML 9-02-94 and OIML R116). Each system was calibrated against INM's own CRMs and internal quality procedure. The practical operation conditions and the parameters of the calibration curves are also indicated in Table 2. Note that the uncertainty due to the sampling and sample preparation took into consideration [5] aspects regarding the homogeneity estimate, dissolution, dilution errors, chemical effects, etc. Therefore, several samples were taken and analysed separately according to the considered methods. The variability between the individual results was considered as a measure of the reproducibility of the specific analytical method and of the uncertainty of sampling and sample preparation. Thus, the uncertainty due to sampling and sample preparation was determined as the difference between the above mentioned variabilities divided by the number of samples analysed.

Table 2 Estimation of the measurement uncertainty of a mass fraction result using a CRM

 | Molecular absorption spectrophotometry | F-AAS | ICP-OES
Operation conditions
wavelength (nm) | 450 (copper diethyldithiocarbamate) | 324.8 | 324.75
repeated measurements | 10 | 5 | 10
measurement method | as described in STAS 1463-84 | as described in STAS 1463-84 | as described in internal procedure
calibration | against BCS (206/2, 224, 255, 257) | single element solution from high purity elements | multielement solution from high purity metals (synthetic matrix form)
Calibration curves parameters
slope, b | 0.8455 | 2.1016 | 71517
intercept, a | -0.0093 | 0.0016 | 11
standard deviation of the regression, s0 | 0.007 | 0.001 | 13.65
Uncertainties due to:
- sampling and sample preparation, u_samp (rel) | 0.015 | 0.005 | 0.005
- measurement method ¹, u_meth (rel) | 0.040 | 0.045 | 0.037
- instrument calibration ², u_cal (rel) | 0.025 | 0.003 | 0.002
- CRM, u_CRM (rel) | 0.020 | 0.005 | 0.005
- data treatment (rel) | negligible | negligible | negligible
Measurement result, w (%) | 0.047 | 0.055 | 0.047
Expanded uncertainty ³ (k=2), u_c (rel) | 0.1068 | 0.0914 | 0.0755

As one may note, the absolute expanded uncertainty of measurement (for k = 2) evaluated for the mass fraction result obtained with the molecular spectrophotometric method (0.005%) was equal to the interlaboratory standard deviation of the standard method of analysis (0.005%). Using the F-AAS method, the absolute uncertainty of the measurement result (0.004%) exceeded

[...]

may conclude that there is no evidence that the measurement process is not as precise as required. For the ICP method, the expanded uncertainty agreed with the certified value of the CRM (0.003%) used in this experiment. Starting from the certified value of the mass fraction of Cu in NIST-SRM (0.047%), one may also note that the measurement results obtained with both the molecular spectrophotometric and the ICP methods were in good agreement. In the F-AAS method, a bias exceeding the prescribed limits was observed. Thus, it was concluded that optimization of the measurement process was necessary.
the standard method accuracy (O.003%-abs). Compar-
ing the two values, one may note that K..1I1 = (0.004/
0.003)2 = 1.78, is less than ,itllble(4:0095) =3.65, and one
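The expanded uncertainties in the last row of Table 2 can be checked with a short calculation. The sketch below assumes that the four relative components of each column (sampling and sample preparation, measurement method, instrument calibration, CRM) are independent and combined as a root sum of squares before multiplying by k = 2; the text does not state the combination rule, so this is an assumption, and the column labels are only positional (the order presumably follows spectrophotometry, F-AAS, ICP-OES as listed in the text).

```python
import math

# Relative standard uncertainty components transcribed from Table 2, in the
# order: sampling/sample preparation, measurement method, calibration, CRM.
# The "data treatment" term is reported as negligible and is left out.
TABLE_2_COMPONENTS = {
    "column 1": [0.015, 0.040, 0.025, 0.020],
    "column 2": [0.005, 0.045, 0.003, 0.005],
    "column 3": [0.005, 0.037, 0.002, 0.005],
}


def expanded_uncertainty(rel_components, k=2.0):
    """Root-sum-of-squares combination multiplied by the coverage factor k.

    Assumes the components are uncorrelated (an assumption of this sketch).
    """
    combined = math.sqrt(sum(u * u for u in rel_components))
    return k * combined


if __name__ == "__main__":
    for column, components in TABLE_2_COMPONENTS.items():
        print(f"{column}: U(rel) = {expanded_uncertainty(components):.4f}")
```

Under these assumptions the results agree with the tabulated 0.1068, 0.0914 and 0.0755 to within about one unit in the last digit, which would be consistent with the components having been rounded for the table.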
Some aspects of the evaluation of measurement uncertainty using reference materials III
Table 3 Example of determination of measurement uncertainty of a turbidity result using the ANOVA method
Instrument 2 3 4 5
Mean turbidity
measured
Instrument 6 7 9 10
Mean turbidity
measured
s^2(\bar{T}) = \frac{(J-1)\, s_a^2 + J\,(k-1)\, s_b^2}{J\,k\,(J\,k-1)}
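A minimal sketch of how such an ANOVA-based variance of the overall mean might be computed is given below. It assumes the expression above reads s²(T̄) = [(J-1)·s_a² + J·(k-1)·s_b²] / [J·k·(J·k-1)], with J instruments, k repeated readings per instrument, s_a² the between-instrument mean square and s_b² the pooled within-instrument variance. That reading of the garbled symbols, the function name and the turbidity readings are all assumptions made for illustration; the values of Table 3 are not available here.

```python
import statistics


def variance_of_grand_mean(groups):
    """Variance of the overall mean for a balanced one-way layout.

    groups: list of J lists, each holding k repeated readings.
    """
    J = len(groups)
    k = len(groups[0])
    if any(len(g) != k for g in groups):
        raise ValueError("balanced design expected: same k for every group")
    group_means = [statistics.mean(g) for g in groups]
    # Between-group mean square: k times the variance of the group means.
    s_a2 = k * statistics.variance(group_means)
    # Pooled within-group variance (simple average is valid for equal k).
    s_b2 = statistics.mean([statistics.variance(g) for g in groups])
    return ((J - 1) * s_a2 + J * (k - 1) * s_b2) / (J * k * (J * k - 1))


if __name__ == "__main__":
    # Hypothetical turbidity readings for three instruments.
    readings = [[1.0, 1.2, 1.1], [1.3, 1.4, 1.2], [0.9, 1.0, 1.1]]
    s2 = variance_of_grand_mean(readings)
    print(f"s^2 of grand mean = {s2:.6f}, s = {s2 ** 0.5:.4f}")
```

For a balanced design the numerator is the total sum of squares, so the result equals the variance of all J·k readings divided by J·k. Pooling in this way is only appropriate when the between-instrument and within-instrument variances do not differ significantly.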
References
1. Buzoianu M, Duta S (1996) National System for Reference Materials in Romania. Proceedings of Central European Conference on Reference Materials, Slovakia
2. Buzoianu M (1998) Accred Qual Assur 3:270-277
3. Buzoianu M (1999) Metrological Calibration in Traceability. Proceedings of the EURACHEM Workshop on the Status of Traceability in chemistry, Bratislava
4. ISO (1993) Guide to the expression of uncertainty in measurements (GUM), 1st edn. ISO, Geneva
5. EURACHEM (1995) Guide to quantifying uncertainty in analytical measurements. EURACHEM, London
6. Buzoianu M, Aboul-Enein HY (1997) Accred Qual Assur 2:11-17
7. ISO (1993) International vocabulary of basic and general terms in metrology (VIM), 2nd edn. International Organization for Standardization (ISO), Geneva
8. ISO Guide 33 (1989) Uses of certified reference materials. ISO, Geneva
Accred Qual Assur (1998) 3:115-116
© Springer-Verlag 1998
There is no specific chemical metrology: Instead, as a truly horizontal discipline, metrology is applied in chemistry

Although "chemical metrology" is frequently used in chemical analysis, this term should be abandoned.

The main reason for this recommendation is that metrology is a truly horizontal discipline, operating on uniform principles, largely independently of the particular application field. Although first developed for physical measurements, the basic metrological terms, concepts and procedures are applicable throughout analytical chemistry. Nevertheless, there are specific challenges in chemical measurements such as the enormous diversity of measurands (i.e. analyte/level/matrix combinations) and the lack of direct measurement methods, which require specific strategies.

To promote the application of metrology in chemistry, its basic concepts and procedures have to be made crystal clear, emphasizing purposes instead of protocols.

The main objective of metrology in chemistry is known uncertainty of analytical results

By focussing on purposes instead of procedures, most of the current "buzz items" in measurement quality assurance such as comparability, traceability and validation are reduced to uncertainty as the primary performance characteristic, as follows.

Comparability

For comparing different measurement results on the same measurand, three basic requirements have to be fulfilled:
• The uncertainties of the measurement results have to be known.
• The units used to express the measurement results (including uncertainties) have to be the same, or at least convertible.
• The measures used to express the uncertainties have to be the same, or at least convertible.

114 W. Hasselbarth

Evidently, the last two requirements call for standardization. For the units of physical quantities the standardization problem was solved by establishing the International System of Units (SI) in 1960. The confusion about different uncertainty measures continued beyond that date and was solved only recently, with the appearance and world-wide acceptance of the "standard uncertainty" proposed by the Guide to the Expression of Uncertainty in Measurement (GUM) in 1993. The first requirement - known uncertainty - is still largely unsettled, although the concepts and methods of uncertainty evaluation proposed by the GUM have paved the way for substantial progress.

Traceability

Evaluation of measurement uncertainty according to the concept of the GUM basically includes two steps. In the first step, the measurement process is investigated for bias. If significant bias is found, a correction is applied. In the second step, the uncertainty on the bias correction is combined with the uncertainty due to random effects to yield the overall uncertainty of the corrected measurement.

Traceability serves the purpose of excluding, or of determining and correcting, significant measurement bias by comparison between measured values and corresponding reference values. Thus traceability, where applicable, provides a firm basis for valid uncertainty statements.

Validation

The performance characteristics considered in method validation serve the purpose of specifying, for a given analytical method, an application range with defined uncertainty. For example
• Specificity and selectivity are intended to specify the range of matrices where the uncertainty budget is valid. For other matrices, a correction and/or an additional uncertainty component are necessary.
• Robustness and ruggedness are intended to specify a range of operating conditions where the uncertainty budget is valid. For other operating conditions a correction and/or an additional uncertainty component are necessary.
• Linearity is intended to specify that range of analyte content where a linear calibration function applies. Beyond this range a correction and/or an additional uncertainty are necessary.
• Reproducibility aims at establishing a "top-down" estimate of the uncertainty of an analytical method, including interlaboratory bias as an uncertainty component, but excluding method bias.

Discussion and guidance on determination and correction of bias in analytical methods is urgently needed

To date analytical chemists as a rule have been content with stating agreement or disagreement of analytical results, obtained on a certified reference material, with the corresponding certified values. In cases of disagreement, usually no attempt is made to derive a correction. Neither is the uncertainty on the correction taken into account, which is also necessary in cases of agreement, because a correction factor of unity comes also with an uncertainty.

At the workshop, the topic of bias handling was raised on many occasions, indicating an urgent demand for discussion and guidance, for example on
• Practical traceability procedures, i.e. procedures for performing valid comparisons between measured values and reference values and for evaluating the comparison results
• Criteria for when to apply corrections on the basis of bias information
• Procedures for bias correction and for estimating correction uncertainty.

The topic of bias correction will be addressed in the forthcoming revision of the EURACHEM Guide Quantifying Uncertainty in Analytical Measurement.

In practice, rigorous and complete traceability of analytical results to established references will be the exception rather than the rule. Therefore it will be an important task to agree on levels of rigour and completeness of traceability statements required for, and feasible in, specific analytical sectors.
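The two-step evaluation described under "Traceability" (determine and correct bias, then combine the uncertainty of the correction with the uncertainty due to random effects) might be sketched as follows. This is one common way of doing it and not a procedure prescribed by the text: the bias is estimated from repeated measurements on a certified reference material, its standard uncertainty is taken as the root sum of squares of the standard error of the mean and the CRM uncertainty, and an additive correction is applied. The function name and all numerical values are hypothetical.

```python
import math
import statistics


def bias_corrected_result(crm_readings, crm_value, u_crm, sample_value, u_random):
    """Return (bias, u_bias, corrected value, u of corrected value).

    crm_readings : repeated results obtained on the reference material
    crm_value    : certified value of the reference material
    u_crm        : standard uncertainty of the certified value
    sample_value : uncorrected result for the test sample
    u_random     : standard uncertainty of that result due to random effects
    """
    n = len(crm_readings)
    bias = statistics.mean(crm_readings) - crm_value
    # Uncertainty of the bias estimate: scatter of the mean plus the CRM itself.
    u_bias = math.sqrt(statistics.variance(crm_readings) / n + u_crm ** 2)
    corrected = sample_value - bias
    # The correction uncertainty is carried even when the bias is close to zero.
    u_corrected = math.sqrt(u_random ** 2 + u_bias ** 2)
    return bias, u_bias, corrected, u_corrected


if __name__ == "__main__":
    bias, u_bias, value, u_value = bias_corrected_result(
        crm_readings=[10.2, 10.4, 10.3, 10.1, 10.5],
        crm_value=10.0,
        u_crm=0.05,
        sample_value=12.0,
        u_random=0.1,
    )
    print(f"bias = {bias:.3f} +/- {u_bias:.3f}")
    print(f"corrected result = {value:.3f} +/- {u_value:.3f}")
```

Note that u_bias stays in the budget even if the estimated bias happens to be zero, which mirrors the remark that a correction factor of unity also comes with an uncertainty.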
Accred Qual Assur (1998) 3:101-105
© Springer-Verlag 1998
In this paper, we describe and illustrate a structured methodology applied in our laboratory to overcome these difficulties, and present results obtained using the methodology. It will be argued that application of the approach can lead to a full reconciliation of validation studies with the GUM approach, and the advantages and disadvantages of the methodology will be considered. Finally, some uncertainty estimates obtained using the methodology are presented, and the relative contributions of different contributions are considered.

Principles of approach

The strategy has two stages:
1. Identifying and structuring the effects on a result. In practice, we effect the necessary structured analysis using a cause and effect diagram (sometimes known as an Ishikawa or "fishbone" diagram) [9].
2. Reconciliation. The reconciliation stage assesses the degree to which information available meets the requirement and thus identifies factors requiring further study.
The approach is intended to generate an estimate of overall uncertainty, not a detailed quantification of all components.

Cause and effect analysis

The principles of constructing a cause and effect diagram are described fully elsewhere [9]. The procedure employed in our laboratory is as follows:
1. Write the complete equation for the result. The parameters in the equation form the main branches of the diagram. (We have found it is almost always necessary to add a main branch representing a nominal correction for overall bias, usually as recovery, and accordingly do so at this stage.)
2. Consider each step of the method and add any further factors to the diagram, working outwards from the main effects. Examples include environmental and matrix effects.
3. For each branch, add contributory factors until effects become sufficiently remote, that is, until effects on the result are negligible.
4. Resolve duplications and re-arrange to clarify contributions and group related causes. We have found it convenient to group precision terms at this stage on a separate precision branch.
Note that the procedure parallels the EURACHEM guide's sequence of preliminary operations very closely; specification of the measurand (step 1), identification of sources of uncertainty (steps 2 and 3) and grouping of related effects where possible (step 4) are explicitly suggested [2].

The final stage of the cause and effect analysis requires further elucidation. Duplications arise naturally in detailing contributions separately for every input parameter. For example, a run-to-run variability element is always present, at least nominally, for any influence factor; these effects contribute to any overall variance observed for the method as a whole and should not be added in separately if already so accounted for. Similarly, it is common to find the same instrument used to weigh materials, leading to over-counting of its calibration uncertainties. These considerations lead to the following additional rules for refinement of the diagram (though they apply equally well to any structured list of effects):
1. Cancelling effects: remove both. For example, in a weight by difference, two weights are determined, both subject to the balance "zero bias". The zero bias will cancel out of the weight by difference, and can be removed from the branches corresponding to the separate weighings.
2. Similar effect, same time: combine into a single input. For example, run-to-run variation on many inputs can be combined into an overall run-to-run precision "branch". Some caution is required; specifically, variability in operations carried out individually for every determination can be combined, whereas variability in operations carried out on complete batches (such as instrument calibration) will only be observable in between-batch measures of precision.
3. Different instances: re-label. It is common to find similarly named effects which actually refer to different instances of similar measurements. These must be clearly distinguished before proceeding.

The procedure is illustrated by reference to a simplified direct density measurement. We take the case of direct determination of the density d(EtOH) of ethanol by weighing a known volume V in a suitable volumetric vessel of tare weight Mtare and gross weight including ethanol Mgross. The density is calculated from

d(EtOH) = (Mgross - Mtare)/V

For clarity, only three effects will be considered: equipment calibration, temperature, and the precision of each determination. Figures 1-3 illustrate the process graphically.

A cause and effect diagram consists of a hierarchical structure culminating in a single outcome. For our purpose, this outcome is a particular analytical result ["d(EtOH)" in Fig. 1]. The "branches" leading to the outcome are the contributory effects, which include both the results of particular intermediate measurements and other factors, such as environmental or matrix effects. Each branch may in turn have further contributory effects. These "effects" comprise all factors
Estimating measurement uncertainty: reconciliation using a cause and effect approach 117
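For the simplified ethanol density example, d(EtOH) = (Mgross - Mtare)/V, a first-order uncertainty calculation might look like the sketch below. It is an illustration only: it assumes the two weighings share the same run-to-run precision and are independent, it treats the volume uncertainty as a single standard uncertainty (calibration and temperature already folded together), and all numbers are hypothetical. It also shows the "cancelling effects" rule: a balance zero bias added to both weighings leaves the density unchanged, so it needs no term in the budget.

```python
import math


def density_with_uncertainty(m_gross, m_tare, volume, u_weighing, u_volume):
    """Return (density, standard uncertainty) for d = (m_gross - m_tare) / V.

    u_weighing : run-to-run standard uncertainty of a single weighing
    u_volume   : standard uncertainty of the delivered/contained volume
    """
    net_mass = m_gross - m_tare
    density = net_mass / volume
    # Two independent weighings contribute to the mass difference.
    u_net = math.sqrt(2.0) * u_weighing
    # Relative uncertainties add in quadrature for a quotient.
    rel_u = math.sqrt((u_net / net_mass) ** 2 + (u_volume / volume) ** 2)
    return density, density * rel_u


if __name__ == "__main__":
    # Hypothetical values: masses in g, volume in mL.
    d, u_d = density_with_uncertainty(57.905, 50.000, 10.00, 0.0002, 0.01)
    print(f"d(EtOH) = {d:.4f} g/mL, u = {u_d:.4f} g/mL")

    # A common zero bias on both weighings cancels in the difference.
    zero_bias = 0.003
    d_biased, _ = density_with_uncertainty(
        57.905 + zero_bias, 50.000 + zero_bias, 10.00, 0.0002, 0.01
    )
    print(f"with zero bias on both weighings: {d_biased:.4f} g/mL")
```

With these made-up inputs the volume term dominates, which is the kind of conclusion the structured list of effects is meant to make visible before any detailed study is planned.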
served dispersion of results, it is sufficient to demonstrate that the distribution of values taken by the influence parameter in the particular experiment is representative of f(Δx_i). [Strictly, u(x_i) could characterise many possible distributions and not all will yield the same value of u(y_i) for all functions y(x_i, x_j, ...). It is assumed here that either f(Δx_i) is the particular distribution appropriate to the problem, when g(Δy_i) necessarily generates the correct value of u(y_i), or that y(x_i, x_j, ...) satisfies the assumptions justifying the first order approximation of Ref. [1], in which case any distribution f(Δx_i) characterised by u(x_i) will generate u(y_i)].

Following these arguments, it is normally straightforward to decide whether a given parameter is sufficiently covered by a given set of data or planned experiment. Where a parameter is already so accounted for, the fact is noted. The parameters which are not accounted for become the subject of further study, either through planned experimentation, or by locating appropriate standing data, such as calibration certificates or manufacturing specifications. The resulting contributions, obtained from a mixture of whole method studies, standing data and any additional studies on single effects, can then be combined according to ISO GUM principles.

An illustrative example of a reconciled cause and effect study is shown in Fig. 4, which shows a partial diagram (excluding long-term precision contributions and secondary effects on recovery) for an internally standardised GC determination of cholesterol in oils and fats. The result, cholesterol concentration C_ch in mg/100 g of material, is given by

C_{ch} = \frac{A_C \times R_f \times IS}{A_B \times m} \times \frac{1}{R} \times 100 \qquad (2)

where A_C is the peak area of the cholesterol, A_B is the peak area of the betulin internal standard, R_f the response factor of cholesterol with respect to betulin (usually assumed to be 1.00), IS the weight of the betulin internal standard (mg), and m the weight of the sample (g). In addition, a nominal correction (1/R) for recovery is included; R may be 1.0, though there is invariably an associated uncertainty. If a recovery study including a representative range of matrices and levels of analyte is conducted, and it includes several separate preparations of standards, the dispersion of the recovery results will incorporate uncertainty contributions from all the effects marked with a tick. For example, all run-to-run precision elements will be included, as will variation in standard preparation; matrix and concentration effects on recovery will be similarly accounted for. Effects marked with a cross are unlikely to vary sufficiently, or at all, during a single study; examples include most of the calibration factors. The overall uncertainty can in principle be calculated from the dispersion of recoveries found in the experiment combined with contributions determined for the remaining terms. Due care is, of course, necessary to check for homoscedasticity before pooling data.

Fig. 4 Partial cause and effect diagram for cholesterol determination. See text for explanation. (Diagram not reproduced; legible branch labels: Recovery, Int Std, Repeatability, Volume, Temperature, Internal Standard weight. Legend: tick = covered by experiment*, cross = not covered by experiment*; *see text.)

Results

We have found that the methodology is readily applied by analysts. It is intuitive, readily understood and, though different analysts may start with differing views, leads to consistent identification of major effects. It is particularly valuable in identifying factors for variation during validation studies, and for identifying the need for additional studies when whole method performance figures are available. The chief disadvantage is that, in focusing largely on whole method studies, only the overall uncertainty is estimated; individual sources of uncertainty are not necessarily quantified directly (though the methodology is equally applicable to formal parameter-by-parameter studies). However, the structured list of effects provides a valuable aid to planning when such additional information is required for method development. Some results of applying this methodology are summarised in Fig. 5, showing the relative magnitudes of contributions from overall precision and recovery uncertainties u(precision) and u(recovery), before combination. "Other" represents the remaining combined contributions. That is, the pie charts show the relative magnitudes of u(precision), u(recovery) and √(Σ u(y_i)²) with u(y_i) excluding u(precision) and u(recovery). It is clear that, as expected, most are dominated by the "whole method" contributions, suggesting that studies of overall method performance, together with specific additional factors,
should provide adequate estimates of uncertainty for many practical purposes.

Fig. 5 Contributions to combined standard uncertainty. Charts show the relative sizes of uncertainties associated with overall precision, bias, and other effects (combined). See text for details

Conclusions

We have presented a strategy capable of providing a structured analysis of effects operating on test results and reconciling experimental and other data with the information requirements of the GUM approach. The initial analysis technique is simple, visual, readily understood by analysts and encourages comprehensive identification of major influences on the measurement. The reconciliation approach is justified by comparison with the ISO GUM principles, and it is shown that the two approaches are equivalent given representative experimental studies. The procedure permits effective use of any type of analytical data, provided only that the ranges of influence parameters involved in obtaining the data can be established with reasonable confidence. Use of whole method performance data can obscure the magnitude of individual effects, which may be counter-productive in method optimisation. However, if an overall estimate is all that is required, it is a considerable advantage to avoid laborious study of many effects.

Acknowledgement Production of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme.
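The cholesterol calculation and the grouping used for the pie charts (overall precision, recovery, and the remaining terms combined) can be sketched in a few lines. The sketch assumes that Eq. (2) reads C_ch = (A_C × R_f × IS)/(A_B × m) × (1/R) × 100, that all contributions are expressed as relative standard uncertainties, and that they are independent so that a root sum of squares applies. The function names and every numerical value are hypothetical and are not taken from the paper.

```python
import math


def cholesterol_mg_per_100g(a_c, a_b, r_f, is_mg, m_g, recovery=1.0):
    """Eq. (2) as read here: peak-area ratio, internal standard mass (mg),
    sample mass (g) and a nominal recovery correction 1/R."""
    return (a_c * r_f * is_mg) / (a_b * m_g) * (1.0 / recovery) * 100.0


def combine_contributions(u_precision, u_recovery, u_remaining):
    """Return (u_other, u_combined), all as relative standard uncertainties.

    u_remaining holds the terms not covered by the whole-method study,
    e.g. calibration factors taken from certificates.
    """
    u_other = math.sqrt(sum(u * u for u in u_remaining))
    u_combined = math.sqrt(u_precision ** 2 + u_recovery ** 2 + u_other ** 2)
    return u_other, u_combined


if __name__ == "__main__":
    c_ch = cholesterol_mg_per_100g(
        a_c=1200.0, a_b=1500.0, r_f=1.0, is_mg=0.5, m_g=1.0, recovery=1.0
    )
    u_other, u_rel = combine_contributions(
        u_precision=0.03, u_recovery=0.04, u_remaining=[0.005, 0.002, 0.001]
    )
    print(f"C_ch = {c_ch:.1f} mg/100 g")
    print(f"u(other) = {u_other:.4f}, combined relative u = {u_rel:.4f}")
    print(f"standard uncertainty = {c_ch * u_rel:.2f} mg/100 g")
```

With inputs of this kind the precision and recovery terms dominate and the "other" terms barely change the total, which is the pattern the text reports for most of the methods studied.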
References
1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM (1995) Guide: Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London
3. Analytical Methods Committee (1995) Analyst 120:2303
4. Ellison SLR (1997) In: Ciarlini P, Cox MG, Pavese F, Richter D (eds) Advanced mathematical tools in metrology III. World Scientific, Singapore, pp 56-67
5. Horwitz W (1988) Pure Appl Chem 60:855-864
6. AOAC (1989) Recommendation. J Assoc Off Anal Chem 72:694-704
7. ISO 5725:1994 (1995) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
8. Ellison SLR, Williams A, Accred Qual Assur (in press)
9. ISO 9004-4:1993 (1993) Total quality management, part 2. Guidelines for quality improvement. ISO, Geneva
Accred Qual Assur (1998) 3:6-10
tion of the range of the values that could reasonably be attributed to the concentration of the analyte and enables a judgement to be made as to whether the result is fit for its intended purpose.

Uncertainty estimation according to GUM principles is based on the identification and quantification of the effects of influence parameters, and requires an understanding of the measurement process, the factors influencing the result and the uncertainties associated with those factors. These factors include corrections for duly quantified bias. This understanding is developed through experimental and theoretical investigation, while the quantitative estimates of relevant uncertainties are established either by observation or prior information (see below).

Method validation

For most regulatory applications, the method chosen will have been subjected to preliminary method development studies and a collaborative study, both carried out according to standard protocols. This process, and subsequent acceptance, forms the 'validation' of the method. For example, the AOAC/IUPAC protocol [5, 6] provides guidelines for both method development and collaborative study. Typically, method development forms an iterative process of performance evaluation and refinement, using increasingly powerful tests as development progresses, and culminating in collaborative study. On the basis of the results of these studies, standard methods are accepted and put into use by appropriate review or standardisation bodies. Since the studies undertaken form a substantial investigation of the performance of the method with respect to trueness, precision and sensitivity to small changes and influence effects, it is reasonable to expect some commonality with the process of uncertainty estimation.

Comparison of measurement uncertainty and method validation procedures

The evaluation of uncertainty requires a detailed examination of the measurement procedure. The steps involved are shown in Fig. 1. This procedure involves very similar steps to those recommended in the AOAC/IUPAC protocol [5, 6] for method development and validation, shown in Fig. 2. In both cases the same processes are involved: step 1 details the measurement procedure, step 2 identifies the critical parameters that influence the result, step 3 determines, either by experiment or by calculation, the effect of changes in each of these parameters on the final result, and step 4 their combined effect.

Fig. 1, Fig. 2 (figures not reproduced; the only legible label is "Measurement Uncertainty")

The AOAC/IUPAC protocol recommends that steps 2, 3 and 4 be carried out within a single laboratory, to optimise the method, before starting the collaborative trial. Tables 1 and 2 give a comparison of this part of the protocol [6] with an extract from corresponding parts of the EURACHEM Guide [3]. The two procedures are very similar. Section 1.3.2 of the method validation protocol is concerned with the identification of the critical parameters and the quantification of the effect on the final result of variations in these parameters; the experimental procedures (a) and (c) suggested are closely similar to experimental methodology for evaluating the uncertainty. Though the AOAC/IUPAC approach aims initially to test for significance of change of result within specified ranges of input parameters, this should normally be followed by closer study of the actual rate of change in order to decide how closely a parameter need be controlled. The rate of change is exactly what is required to estimate the relevant uncertainty contribution by GUM principles. The remainder of the sections in the extract from the protocol give guidance on the factors that need to be considered; these correspond very closely to the sources of uncertainty identified in the EURACHEM Guide. The data from method development studies required by existing method validation protocols should therefore provide much of the information required to evaluate the uncertainty from consideration of the main factors influencing the result.

122 S. L. R. Ellison · A. Williams

Table 1 Method development and uncertainty estimation

Method validation¹
1.3.2 Alternative approaches to optimisation
(a) Conduct formal ruggedness testing for identification and control of critical variables.
(b) Use Deming simplex optimisation to identify critical steps.
(c) Conduct trials by changing one variable at a time.

Uncertainty estimation
Having identified the possible sources, the next step is to make an approximate assessment of size of the contribution from each source, expressed as a standard deviation. Each of these separate contributions is called an uncertainty component.
Some of these components can be estimated from a series of repeated observations, by calculating the familiar statistically estimated standard deviation, or by means of subsidiary experiments which are carried out to assess the size of the component. For example, the effect of temperature can be investigated by making measurements at different temperatures. This experimental determination is referred to in the ISO Guide as "Type A evaluation".

¹ Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694-704. Copyright 1989, by AOAC INTERNATIONAL, Inc.

The possibility of relying on the results of a collaborative study to quantify the uncertainty has been considered [8], following from a general model of uncertainties arising from contributions associated with method bias, individual laboratory bias, and within- and between-batch variations. Collaborative trial is expected to randomise most of these contributions, with the exception of method bias. The latter would be addressed via combination of the uncertainties associated with a reference material or materials to which results are traceable with the statistical uncertainty associated with any estimation of bias using a finite number of observations. Note that the necessary investigation and reporting of bias and associated statistical uncertainty (i.e. excluding reference material uncertainty), are now recommended in existing collaborative study standards [7]. Where the method bias and its uncertainty are small, the overall uncertainty estimate is expected to be represented by the reproducibility standard deviation. The approach has been referred to as a "top-down" view. The authors concluded that such an approach would be feasible given certain conditions, but noted that demonstrating that the estimate was valid for a particular laboratory required appropriate internal quality control and assurance. Clearly, the controls required would relate particularly to the principal factors affecting the result. In terms of ISO principles, this requirement corresponds to control of the main contributions to uncertainty; in method development and validation terms, the requirement is that factors found to be significant in robustness testing are controlled within limits set, while factors not found individually significant remain within tolerable ranges. In either case, where the control limits on the main contributing factors, together with their influence on the result, are known to an individual laboratory, the laboratory can both check that its performance is represented by that observed in the collaborative trial and straightforwardly provide an estimate of uncertainty following ISO principles.

The step-by-step approach recommended in the ISO Guide and the "top down" approach have been seen as alternative and substantially different ways of evaluating uncertainty, but the comparison between method development protocols and ISO approach above shows that they are more similar than appears at first sight. In particular, both require a careful consideration and study of the main effects on the result to obtain robust results accounting properly for each contribution to overall uncertainty. However, the top down approach relies on that study being carried out during method development; to make use of the data in ISO GUM estimations, the detailed data from the study must be available.

Availability of validation data

Unfortunately, the necessary data are seldom readily available to users of analytical methods. The results of the ruggedness studies and the within-laboratory op-
Measurement uncertainty and its implications for collaborative study method validation and method performance parameters 123
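The remark that the rate of change found in ruggedness or one-variable-at-a-time trials is what a GUM-style estimate needs can be illustrated with a short sketch. It assumes a locally linear response, estimates the rate of change as a least-squares slope, and converts a control range of ± half_width into a standard uncertainty by assuming a rectangular distribution (half-width divided by √3). The rectangular assumption, the function names and the temperature data are illustrative choices and are not taken from the paper.

```python
import math


def sensitivity_coefficient(x_values, y_values):
    """Least-squares slope dy/dx from results obtained at several settings
    of one influence parameter, all other parameters held constant."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    return sxy / sxx


def uncertainty_contribution(slope, half_width):
    """u(y) = |dy/dx| * u(x), with u(x) = half_width / sqrt(3) for a
    parameter controlled within +/- half_width (rectangular assumption)."""
    u_x = half_width / math.sqrt(3.0)
    return abs(slope) * u_x


if __name__ == "__main__":
    # Hypothetical trial: result measured at three temperatures (deg C).
    temperatures = [20.0, 25.0, 30.0]
    results = [10.0, 10.1, 10.2]
    slope = sensitivity_coefficient(temperatures, results)
    u_y = uncertainty_contribution(slope, half_width=2.0)
    print(f"dy/dT = {slope:.4f} per deg C, u(y) = {u_y:.4f}")
```

Tightening the control limits on the parameter reduces its contribution in direct proportion, which is why the same data serve both to set control limits during method development and to estimate uncertainty afterwards.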
Table 2 Method performance and measurement uncertainty estimation. Note that the text is paraphrased for brevity and the numbers in parentheses refer to corresponding items in the EURACHEM guide (column 2)

Column 1 (method validation protocol¹)
1.4 Develop within-laboratory attributes of the optimised method
(Some items can be omitted; others can be combined.)
1.4.1 Determine [instrument] calibration function ... to determine useful measurement range of method. (8, 9)
1.4.2 Determine analytical function (response vs concentration in matrix ...). (9)
1.4.3 Test for interference (specificity):
(a) Test effects of impurities ... and other components expected ... (5)
(b) Test non-specific effects of matrices. (3)
(c) Test effects of transformation products ... (3)
1.4.4 Conduct bias (systematic error) testing by measuring recoveries ... (Not necessary when method itself defines the property or component.) (3, 10, 11)
1.4.5 Develop performance specifications ... and suitability tests ... to ensure satisfactory performance of critical steps ... (8)
1.4.6 Conduct precision testing ... [including] ... both between-run (between-batch) and within-run (within-batch) variability. (4, 6, 7, 8, 12)
1.4.7 Delineate the range of applicability to the matrices or commodities of interest. (1)
1.4.8 Compare the results of the application of the method with existing tested methods intended for the same purposes, if other methods are available.
1.4.9 If any of the preliminary estimates of the relevant performance of these characteristics are unacceptable, revise the method to improve them, and retest as necessary
1.4.10 Have method tried by analyst not involved in its development. Revise method to handle questions raised and problems encountered.

Column 2 (EURACHEM guide)
The evaluation of uncertainty requires a detailed examination of the measurement procedure. The first step is to identify possible sources of uncertainty. Typical sources are:
1. Incomplete definition of the measurand (for example, failing to specify the exact form of the analyte being determined).
2. Sampling - the sample measured may not represent the defined measurand.
3. Incomplete extraction and/or pre-concentration of the measurand, contamination of the measurement sample, interferences and matrix effects.
4. Inadequate knowledge of the effects of environmental conditions on the measurement procedure or imperfect measurement of environmental conditions.
5. Cross-contamination or contamination of reagents or blanks.
6. Personal bias in reading analogue instruments.
7. Uncertainty of weights and volumetric equipment.
8. Instrument resolution or discrimination threshold.
9. Values assigned to measurement standards and reference materials.
10. Values of constants and other parameters obtained from external sources and used in the data reduction algorithm.
11. Approximations and assumptions incorporated in the measurement method and procedure.
12. Variations in repeated observations of the measurand under apparently identical conditions.

¹ Reprinted from The Journal of AOAC INTERNATIONAL (1989) 72(4):694-704. Copyright 1989, by AOAC INTERNATIONAL, Inc.
who normally provide reports to trial co-ordinators; it should therefore be possible to include them in the final report.
Of course there will frequently be additional sources of uncertainty that have to be examined by individual laboratories, but providing this information from the validation study would considerably reduce the work involved.

Acknowledgements The preparation of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement (VAM) Programme [10].
References
1. Sargent M (1995) Anal Proc 32:201-202
2. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland, ISBN 92-67-10188-9
3. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London. ISBN 0-948926-08-2
4. Williams A (1996) Accred Qual Assur 1:14-17
5. Horwitz W (1995) Pure Appl Chem 67:331-343
6. AOAC recommendation (1989) J Assoc Off Anal Chem 72:694-704
7. ISO (1994) ISO 5725:1994 Precision of test methods. ISO, Geneva
8. Analytical Methods Committee (1995) Analyst 120:2303-2308
9. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, London
10. Fleming J (1995) Anal Proc 32:31-32
Accred Qual Assur (1997) 2:180-185
© Springer-Verlag 1997
gree of belief" in the judgement of the analyst and others as in all theoretical results in science.
The principles of the analytical methods validation [4-7], in contrast to those described above, are based on the cybernetic approach, which considers the whole analytical procedure as a "black box". In this case, only the results of analysis can serve as the data for statistical characterization of the analytical method without any direct relationship with intermediate measurement results such as weighings, volumes, instrument readings, or other parameters such as molecular masses. Moreover, from the Horwitz function [8] it follows that the standard deviation of analytical results arising from random errors is practically independent of the specification of the analytical method.
The characteristics of the method used for the validation (validation parameters) such as repeatability, reproducibility and accuracy are certainly correlated with the uncertainty of the analytical results. In particular, the combined uncertainty arising from random effects cannot be less than the repeatability [2].
Obviously, the use of the uncertainty calculation of ref. [2] for the definition of the quality of analytical data should be harmonized with the concepts and practices of the method validation as well as quality control, proficiency testing, and certification of reference materials [9].
In the present paper, the results of the uncertainty calculation and validation are discussed for a new method of acid value determination in oils by pH measurement without titration developed in our laboratory [10].

The calculation of the uncertainty in the method by pH measurement

In this method [10], an oil sample is introduced into the reagent consisting of triethanolamine dissolved in a mixture of water and isopropanol. First, a conditional pH (pH1) in the reagent-oil system is measured (1) and then a second pH (pH2) is measured after the addition of standard acid (for example, HCl) to the system. Acid value is calculated according to the following formula:

$$AV = \frac{M_{KOH}\, C_{st}\, V_{st}}{(10^{\Delta pH} - 1)\, m} \quad \text{(mg KOH/g oil)} \qquad (1)$$

where $M_{KOH}$ is the molecular mass of KOH (g), $C_{st}$ is the concentration of the standard acid (mol/L), $V_{st}$ is the volume of the standard acid added (mL), $\Delta pH = pH_1 - pH_2$, and $m$ is the mass of the oil sample (g).
The main AV uncertainty components discussed below are presented in Table 1.

(1) Because the measurements of pH are performed with an aqueous reference electrode calibrated by aqueous buffer solutions, the results of measurements are conditional [10]

Preparation of the standard acid (HCl) solution

A Titrisol (Merck, Germany) solution of HCl containing $m_{HCl}$ = 18.230 g HCl is used to prepare $C_{st}$ = 0.5 M HCl of volume V = 1000 mL. The volumetric flask used for the solution preparation has the volume 1000 ± 0.4 mL at 20 °C (DIN A, Superior, Germany). The appropriate standard deviation of the calibrated volume (a rectangular distribution [1, 2]) is $0.4/\sqrt{3} = 0.23$ mL. Since the difference between the actual temperature and the flask calibration temperature is ±3 °C (with 95% confidence), at volume coefficient of water expansion $2.1 \times 10^{-4}$/°C, the possible volume variation is $1000 \times 3 \times 2.1 \times 10^{-4} = 0.63$ mL, and the corresponding standard deviation is 0.63/1.96 = 0.32 mL. The standard deviation of the flask filling is less than 1/3 of the standard deviations for calibration and temperature variations (mentioned above) and is thus negligible. Combining the two contributions of the uncertainty u(V) we have $u(V)/V = \sqrt{0.23^2 + 0.32^2}/1000 = 0.00039$.
The concentration of HCl is $C_{st} = m_{HCl}/(M_{HCl}\, V)$, where $M_{HCl}$ is the molecular mass of HCl. The manufacturer of the HCl solution indicates a possible deviation of its titer of 0.02%/°C. Taking a possible temperature variation in the manufacturer's laboratory of ±2 °C (with 95% confidence), the standard uncertainty of $m_{HCl}$ is $u(m_{HCl}) = 18.230 \times 0.02 \times 2/(100 \times 1.96) = 0.004$ g, and $u(m_{HCl})/m_{HCl} = 0.00022$.
The standard uncertainty of the molecular mass of HCl, according to IUPAC atomic masses and rectangular distribution [2], is $u(M_{HCl})$ = 0.000043.
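The arithmetic for the standard acid solution can be retraced with a short Python sketch. It is only an illustration under stated assumptions: the helper names are hypothetical, the input figures are the ones quoted in the text, and intermediate values are not rounded, so results can differ in the last digit from the rounded figures in the text (for example, u(m_HCl)/m_HCl comes out near 0.00020 rather than 0.00022, because the text first rounds u(m_HCl) to 0.004 g).

```python
import math

def rectangular_to_standard(half_width):
    """Standard uncertainty for a rectangular distribution of given half-width."""
    return half_width / math.sqrt(3)

def interval95_to_standard(half_width):
    """Standard uncertainty from a 95 % half-width, assuming a normal distribution."""
    return half_width / 1.96

def combine(*components):
    """Root-sum-of-squares combination of independent components."""
    return math.sqrt(sum(c * c for c in components))

# Volumetric flask: 1000 mL, tolerance 0.4 mL, temperature difference of 3 degrees C
V = 1000.0                                    # mL
u_cal = rectangular_to_standard(0.4)          # calibration, about 0.23 mL
dV_temp = V * 3 * 2.1e-4                      # possible volume variation, 0.63 mL
u_temp = interval95_to_standard(dV_temp)      # about 0.32 mL
rel_u_V = combine(u_cal, u_temp) / V          # u(V)/V

# HCl mass: titer deviation 0.02 % per degree C, temperature variation of 2 degrees C
m_HCl = 18.230                                # g
u_m = interval95_to_standard(m_HCl * 0.02 / 100 * 2)
rel_u_m = u_m / m_HCl                         # u(m_HCl)/m_HCl

# Concentration of the standard acid (molecular-mass term neglected, as in the text)
rel_u_C = combine(rel_u_V, rel_u_m)           # u(C_st)/C_st

print(f"u(V)/V = {rel_u_V:.5f}, u(m)/m = {rel_u_m:.5f}, u(C)/C = {rel_u_C:.5f}")
```

Keeping the two conversion rules (division by the square root of 3 for a rectangular distribution, division by 1.96 for a 95% interval) as separate functions makes it explicit which assumption is applied to each component.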
Uncertainty in chemical analysis and validation of the analytical method: acid value determination in oils 127
Since $u(M_{HCl})/M_{HCl}$ is negligible in comparison with $u(V)/V$ and $u(m_{HCl})/m_{HCl}$, the relative standard uncertainty is $u(C_{st})/C_{st} = \sqrt{0.00039^2 + 0.00022^2} = 0.00045$.

Weighing and transfer of an oil sample to the reagent

The final mass of an oil sample is the difference in mass between a beaker with the sample and the empty beaker (after transfer of oil to the reagent). In the range up to 50 g, by analogy with ref. [2], u(m) = 0.000087. Since for different AVs the recommended sample mass is from 0.1 to 40 g, u(m)/m values are from 0.00087 to 0.0000022.

Measurement of pH1

After mixing the system "oil-reagent", pH1 is measured with a pH meter pHM 95 (Radiometer, Denmark), and the standard uncertainty of pH reading u(pH) = 0.01.

Addition of HCl to the "oil-reagent" system

A recommended standard addition for samples with different AV is 0.05-0.4 mL of 0.5 M HCl; this volume should be negligible in comparison with the volume of the reagent, ~50 mL. For transfer of the acid to the "oil-reagent" system, a mechanical hand pipette (Gilson, France) was used with a relative standard uncertainty of $u(V_{st})/V_{st} = 0.01$ according to the manufacturer's information.

Measurement of pH2 and calculation of ΔpH

After mixing the system with HCl added, pH2 is measured under the same conditions as those for pH1 (both measurements performed within 2-3 min) with the same uncertainty of pH reading. The expanded uncertainty of the pH measurements can reach [11] 0.05 or even [12] 0.1, but in our case the standard uncertainty of the difference between the two measurements is caused only by repeatability factors. Thus, only the uncertainty of reading is important, and $u(\Delta pH) = \sqrt{2} \times u(pH) = 0.014$, which is less than would be expected from [11] and [12].

Calculation of acid value

The acid value calculation is performed using Eq. 1. In this equation, all sources of uncertainty were described by us with the exception of $M_{KOH}$. The relative standard uncertainty of the molecular mass of KOH, according to IUPAC data and rectangular distribution, is $u(M_{KOH})/M_{KOH} = 0.000041$, which is less than 1/3 of the standard uncertainty of any component in Eq. 1, for example $u(C_{st})/C_{st} = 0.00045$. In their turn, $u(C_{st})/C_{st}$ and $u(m)/m$ are negligible in comparison with the relative standard uncertainty in the standard acid addition $u(V_{st})/V_{st} = 0.01$. The latter is also a negligible component of the uncertainty of AV determination, since pH measurement is the dominant source of $u(AV)/AV$ (see Table 1). Therefore, after the logarithmic differentiation of Eq. 1 only the following remains valid:

$$\frac{u(AV)}{AV} = \frac{u(10^{\Delta pH} - 1)}{10^{\Delta pH} - 1} = \frac{10^{\Delta pH} \times 2.30 \times u(\Delta pH)}{10^{\Delta pH} - 1} \qquad (2)$$

Therefore

$$\frac{u(AV)}{AV} = \frac{0.032}{1 - 1/10^{\Delta pH}} \qquad (3)$$

From the relationship of Eq. 3 illustrated in Fig. 1 for the ΔpH range 0.1-1.0, it is clear that ΔpH < 0.2 leads to an essential increase in the AV uncertainty. At ΔpH > 0.4, the amount of standard acid added may exceed 3 times the sum of the free fatty acids in the oil sample. This acid addition may cause (1) pH2 to deviate from the linear range of pH versus AV or (2) a significant change in the concentration of the free form of triethanolamine in the reagent, which is inadmissible [10]. Therefore the recommended ΔpH range is 0.25-0.35, and corresponding values of $u(AV)/AV$ are 0.07-0.06. The expanded uncertainty $U(AV)/AV = k\, u(AV)/AV$ is 0.14-0.12, coverage factor k being 2. This uncertainty is higher than in ref. [13], where some additional simplifications were made.
Note, the interference of atmospheric CO2 (due to the reaction with triethanolamine) is not taken into account.
Since there are no general criteria for evaluation of expanded uncertainty values, it is worth while to compare the values obtained with corresponding ones for the standard titrimetric method of AV determination in oils [14].

[Figure: curve of the AV uncertainty (vertical axis, ticks 0.03-0.15) against ΔpH (horizontal axis, ticks 0.05-1.05)]
Fig. 1 Dependence of AV uncertainty on ΔpH
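The dependence shown in Fig. 1 can be tabulated directly from Eq. 2. The sketch below is an illustration only: the function name is hypothetical, u(pH) = 0.01 is taken from the text, and the factor 2.30 in Eq. 2 is evaluated as ln 10 without rounding.

```python
import math

U_PH_READING = 0.01  # standard uncertainty of a single pH reading, from the text

def rel_u_av(delta_ph, u_ph=U_PH_READING):
    """Relative standard uncertainty u(AV)/AV as a function of delta pH (Eq. 2)."""
    u_dph = math.sqrt(2) * u_ph              # uncertainty of the difference of two readings
    ratio = 10 ** delta_ph
    return math.log(10) * u_dph * ratio / (ratio - 1)

# Tabulate over the range covered by Fig. 1; k = 2 gives the expanded uncertainty
print("dpH   u(AV)/AV   U(AV)/AV (k=2)")
for d in (0.1, 0.2, 0.25, 0.35, 0.4, 1.0):
    u = rel_u_av(d)
    print(f"{d:4.2f}  {u:8.3f}  {2 * u:8.3f}")
```

Within the recommended range the function returns values that round to 0.07 at ΔpH = 0.25 and 0.06 at ΔpH = 0.35, in line with the figures quoted in the text, and it rises steeply as ΔpH falls below 0.2.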
128 I. Kuselman · A. Shenhar
with two higher AV values were prepared by adding to the initially purchased oils (with minimal AV) the known amounts of oleic acid. These oils were analyzed by the validated method: four replicates for each sample daily, during a period of five days [15]. Results of the AV determination by the standard titration method (average from ten replicates) were used as "true" or assigned [9] values. Thus, the whole experiment consisted of 5 × 3 × 4 × 5 = 300 AV determinations by the validated method and 5 × 3 × 10 = 150 determinations by the standard method. Corresponding statistical data are given in Table 3.

Table 3 [caption and column headings not preserved in this copy; the headings below are inferred from the text]

Oil         AV      S1      S2      Bias     t0.95 S2
Sunflower   24.8    0.022   0.016   -0.010   0.045
            1.51    0.025   0.008   -0.021   0.023
            0.055   0.042   0.041   -0.053   0.112
Soya        24.9    0.024   0.014   -0.019   0.039
            1.60    0.025   0.017   -0.032   0.047
            0.107   0.036   0.042   -0.082   0.117
Maize       23.4    0.014   0.012   -0.006   0.034
            1.57    0.016   0.011   -0.016   [illegible]
            0.096   0.047   0.014   -0.022   0.038
Canola      22.4    0.023   0.012   -0.023   0.033
            1.57    0.029   0.009   -0.010   0.025
            0.063   0.026   0.034   -0.066   0.090
Olive       23.7    0.013   0.003   -0.004   0.009
            6.62    0.019   0.004   -0.008   0.012
            0.579   0.016   0.016   -0.029   0.045

Average values of the relative standard deviation S1 of replicates (within a day - repeatability) and values of the daily relative standard deviation S2 (within a week - reproducibility or intermediate precision [7]) for all the samples satisfy Horwitz's criterion [4]. The relative bias of the average result by pH measurement for each oil with respect to the corresponding "true" value is less than t0.95 S2 (t0.95 is Student's coefficient at 95% level of confidence), and consequently satisfies Student's criterion.

Discussion

Comparing the combined uncertainty arising from random effects [u(AV)/AV = 0.06-0.07] with S1 values, one can see that it is not less than the repeatability, as is required in ref. [2]. Also, from Table 3 it follows that for all oil samples S2 < u(AV)/AV, i.e. the combined uncertainty is not less than the reproducibility (intermediate precision) too.
Bias values characterizing the accuracy [4] of the method (its trueness [16]) are less than the expanded uncertainty U(AV)/AV = 0.12-0.14 even for minimal AV, where the bias values are naturally the highest.
The uncertainty of the AV determination evaluated from the data collected in Table 3 by the scheme proposed in ref. [9] (as a root of the sum of the variances caused by S1, S2 and the bias) is also less than U(AV)/AV, the maximum value obtained being 0.10. This value may be higher when the evaluation is complete, since at present we still have no estimation of interlaboratory deviations in the AV determination, i.e. reproducibility of the method.
As shown above by statistical criteria, the values of the validation parameters can be accepted as satisfactory. Their comparison with the results of the uncertainty calculation according to ref. [2] allows us only to be sure that assumptions made during this calculation (see, for example, the notes) were admissible.
On the other hand, the comparison of the uncertainties of the new method by pH measurement with the standard titrimetric method cleared up the possibility of using the latter as a source of "true" values in the validation process. Moreover, although judgement on the acceptability of the analytical method is based today on the final validation report [7], the uncertainty quantified by ref. [2] may be useful at an earlier stage before experiments are carried out.
For our example, the advantages of the new method based on pH measurement are simplicity, rapidity, low-cost instruments, and suitability for automation [10]. It can be applied for on-line quality control of oils and regulation of the extraction (from oil seeds) and refining processes. Its expanded uncertainty is satisfactory for practical purposes because there are no two species or grades of oil with AV values differing by less than 12-14%.
Thus, the quantification of uncertainty by ref. [2] is a suitable instrument for planning or forecasting method applications. Therefore, relationships between the uncertainty and validation parameters should be analyzed in all possible aspects.

Acknowledgements The authors express their gratitude to Prof. Ya. I. Tur'yan, Prof. E. Schoenberger and Dr. O. Yu. Berezin for helpful discussions.
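The root-sum-of-squares scheme of ref. [9] mentioned in the Discussion can be sketched in a few lines of Python. This is an illustration, not the authors' computation: the function name is hypothetical, and the input row (0.036, 0.042, -0.082, listed against AV = 0.107 for soya oil) is read from Table 3 on the assumption that its columns are S1, S2 and relative bias in that order.

```python
import math

def uncertainty_from_validation(s1, s2, bias):
    """Root of the sum of the variances due to repeatability (S1),
    intermediate precision (S2) and relative bias, as described for ref. [9]."""
    return math.sqrt(s1 ** 2 + s2 ** 2 + bias ** 2)

# Soya oil at the lowest AV level (assumed column order: S1, S2, bias)
u_soya_low = uncertainty_from_validation(0.036, 0.042, -0.082)

# Lower end of the expanded uncertainty range quoted in the text
U_EXPANDED_MIN = 0.12

print(f"u = {u_soya_low:.3f}, below U(AV)/AV: {u_soya_low < U_EXPANDED_MIN}")
```

For this row the result rounds to 0.10, which agrees with the maximum value reported in the Discussion and stays below the expanded uncertainty of 0.12-0.14.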
References
1. ISO (1993) Guide to the expression of uncertainty in measurement, 1st edn. ISBN 92-67-10188-9, Geneva
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. ISBN 0-948926-08-2, Teddington
3. Williams A (1996) Accred Qual Assur 1:14-17
4. AOAC (1993) AOAC Peer-verified methods program. Manual on policies and procedures, Gaithersburg
5. Accreditation for Chemical Laboratories. Guidance on the interpretation of the EN 45000 series of Standards and ISO/IEC Guide 25 (1993) WELAC Guidance Document No. WGD2/EURACHEM Guidance Document No. 1, 1st edn., Teddington
6. Hokanson GC (1994) Pharmaceut Technol 18:118-130; 92-100
7. Green JM (1996) Anal Chem 68:305A-309A
8. Boyer KW, Horwitz W, Albert R (1985) Anal Chem 57:454-459
9. Analytical Methods Committee (1995) Analyst 120:2303-2308
10. Tur'yan Ya I, Berezin OYu, Kuselman I, Shenhar A (1996) J Amer Oil Chem Soc 73:295-301
11. Danish Standard DS 287 (1978) Vandundersogelse pH (Water Analysis, pH), 2nd edn
12. Jensen H, Nielsen L (1994) Uncertainty of pH Measurements. Report of Danish Institute of Fundamental Metrology DFM-94-R24 on Nordtest-project No. 1194-94, Lyngby, Denmark
13. Tur'yan Ya I, Ruvinsky OE, Sharudina SYa (1991) J Anal Chem (in Russian):917-925
14. International Standard ISO 660 (1983) Animal and Vegetable Fats and Oils - Determination of Acid Value and of Acidity, 1st edn., Switzerland
15. Berezin OYu, Kogan L, Tur'yan Ya I, Kuselman I, Shenhar A (1996) The Proceedings of the Eleventh International Conference of the Israel Society for Quality, Nov. 19-21, Jerusalem, pp 530-538
16. Pocklington WD (1991) In: Rossell JB, Pritchard JLR (eds) Analysis of Oilseeds, Fats and Fatty Foods, Elsevier, London, pp 1-38
Accred Qual Assur (2002) 7: 182-188
DOI 10.1007/s00769-002-0447-1
© Springer-Verlag 2002
out. Studies on soil sampling have mainly addressed precision and bias [4-11]. Sampling intercomparison exercises have confirmed the contribution of different sampling protocols and devices [12] to the variability of the final analytical data. However, the best way of evaluating the combined measurement uncertainty, which includes sampling, is still under discussion. In the case of soil, there is no doubt that an assessment of the contribution of sampling operations on the overall measurement uncertainty is necessary to completely understand the meaning of the analytical results [13, 14].
On the basis of these considerations, the National Environmental Protection Agency of Italy (ANPA) has funded a project for the "Assessment of the uncertainty associated with the soil sampling in agricultural, semi-natural, urban and contaminated environments (SOILSAMP)". The project covers a three-year period from 2001 to 2003 and involves collaboration with an Expert Advisory Group (EAG) composed of experts from national and international institutions. The following Institutions are represented in the SOILSAMP EAG:

- National Environmental Protection Agency - ANPA (Italy)
- International Union of Pure and Applied Chemistry - IUPAC (United States)
- International Union of Radioecology - IUR (Belgium)
- Netherlands Energy Research Foundation - ECN (The Netherlands)
- Ente Italiano Nazionale di Unificazione - UNI (Italy)
- Università Cattolica del Sacro Cuore di Piacenza, "Istituto di Chimica Agraria ed Ambientale - ICAA", (Italy)
- Università di Pisa, Area della Ricerca CNR "Istituto di Chimica del Terreno" (Italy)
- Università di Perugia, "Dipartimento di Scienze Agro-ambientali e della Produzione Vegetale - DiSAProV", (Italy)
- University of Barcelona, "Dipartimento Química Analítica" (Spain)
- University of Utrecht, Faculty of Geographical Science, "Utrecht Centre for Environment and Landscape Dynamic" (The Netherlands)
- Regional Environmental Protection Agencies - ARPA within the framework of the Centro Tematico Nazionale - Suoli e Siti Contaminati, CTN-SSC (Italy)
- Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia - ERSA (Italy)
- Dr. Herbert Muntau (Germany).

SOILSAMP is aimed at: i) the assessment of uncertainties associated with soil sampling in different environments, based on trace element concentration measurement in soil; ii) the characterization, in terms of trace element spatial variability, of a site to be qualified as a reference site for national and international intercomparison exercises.
This paper reports the methodological approach and a description of the first experimental activities performed in the framework of SOILSAMP, including a few preliminary considerations.

Methodological approach

The evaluation of uncertainty associated with sampling activities is based on a methodological approach including the identification of the different sources of uncertainty attributable to sampling procedures, the characterization of the sampling site (reference sampling) in terms of trace element spatial distribution and the intercomparison exercise.

Identification of uncertainty sources

The combined uncertainty of analytical results u(r) includes uncertainties associated with sampling u(s), sample reduction u(rd) and analysis u(a). In the following equation (Eq. 1) the relationship between the above reported uncertainties (combined standard uncertainty) is given:

$$u(r) = \sqrt{u(s)^2 + u(rd)^2 + u(a)^2} \qquad (1)$$

The principles of EURACHEM/CITAC Guide [15] indicate several steps to assess the uncertainty associated with an analytical process: a) specification of the measurand, b) identification of the uncertainty sources, c) quantification of the uncertainty components, d) calculation of the combined uncertainty.
The EURACHEM approach requires a clear definition of the measurand and a quantitative expression of the relations existing between the value of the measurand and the parameters affecting its value. The parameters have to be identified; they can be other measurands, quantities not measurable, or constants.
The first phase of the SOILSAMP project has been devoted to the identification of the significant sources of uncertainty linked to soil sampling. To this end, a cause-effects diagram has been used. The diagram (sometimes called fish-bone) easily shows the parameters considered and how they relate to each other. The fish-bone permits visualization of the different sources of uncertainty avoiding over-counting.

Characterization of the sampling sites (reference sampling)

Reference sampling is aimed at the characterization of the sampling site in terms of element spatial distribution:
A practical approach to assessment of sampling uncertainty 133
it allows assessment of the element concentrations at any point of a field with known uncertainty.
In order to be used as a reference sampling site, the site first has to be characterized for long- and short-range spatial variation of trace element concentrations in the soil. The long-range spatial variation is assessed by subdividing the sample site into sub-areas of the same size. The same number of single soil samples is collected from each sub-area. The samples are then pooled to give a composite sample. The comparison of trace element concentrations between the composite soil samples allows evaluation of the long-range spatial variation.
The short-range spatial variation of trace element concentration in the soil is assessed by comparing the analytical results obtained from single soil samples collected from randomly selected sub-areas. The number of sub-areas to be considered for single sampling depends on the expected spatial variability of the trace elements considered.
The selection of the reference site must fulfil some minimal requirements, such as representative size, heterogeneity, easy access, and a suitable trace element gradient within the site.

Intercomparison sampling exercise

The intercomparison exercise is intended to assess the uncertainty component attributable to different sampling devices.
The trace element concentrations in soil samples collected at different point locations differ as a result of spatial variation, effects of soil sampling, sample reduction, and laboratory analyses. The final aim of the project is not the evaluation of spatial variability of trace elements, but the assessment of the contribution of sampling to the uncertainty associated with the analytical data. To this end, the spatial variation must be accounted for, and subsequently eliminated. The regionalized variable theory [16, 17] assumes that samples collected at locations close to each other are, on average, more similar than samples collected further away from each other. Accordingly, the spatial variation of an attribute is assumed to be the sum of three components: a) a structural component, having a constant mean or trend, b) a spatially correlated component, and c) an uncorrelated random noise. The spatially correlated and noise terms are encapsulated in an experimental variogram, plotting the experimental semi-variance as a function of sampling distance. The experimental semi-variance is estimated from the sample data and its value at zero distance is called the nugget. Theoretically, the semi-variance should be zero at zero distance, but short-distance variation and other sources of uncertainty produce a positive value of the semi-variance.
The nugget is an estimation of the spatially uncorrelated noise component mentioned above, including variance due to sampling, measurement and other unexplained spatially uncorrelated sources of variance.

Trace element determination

Trace element measurement in all samples is carried out using instrumental neutron activation analysis (INAA). This technique achieves high precision levels and requires little or no sample processing prior to analysis. This analytical technique also eliminates uncertainty associated with sample processing [18-21]. To rule out variability possibly caused by different analytical laboratories, a single laboratory, following a predefined analytical protocol, performs all the analyses.

Field sampling exercise

The SOILSAMP project foresees the evaluation of the sampling uncertainty in four different environments: agricultural, semi-natural, urban and contaminated sites.
The agricultural site (10,000 m2) has a regular shape and is characterized by the presence of three sub-areas with different gravel content. These two conditions comply with the pre-requisites of representative size and of a structural heterogeneity of the soil. The site is a research field belonging to a public scientific institution which is easily accessible at any time, and where any accidental or unauthorized use can be prevented (e.g. spreading of unknown substances, transit of vehicles such as tractors). The present and past land use of this site are known.
Considering that, generally, agricultural fields are not characterized by high spatial variability of trace elements, a spot-wise addition of fertilizer was performed, to produce a well-marked analyte gradient within the test site. The fertilizer containing 46% P2O5 was added manually to two triangular-shaped areas, of about 50 m2 each. The quantity of fertilizer was sufficient to increase the concentration of phosphate in the first 5 cm of the top soil by about one order of magnitude.
The agricultural test site has been divided into 10 × 10 m sub-areas. Figures 1 and 2 report the grid sampling points selected in the sampling exercises.
To assess the long-range spatial variation of trace elements in each sub-area, samples were taken using an Edelman auger (20 cm length, 7 cm diameter) at a 2 m distance from each other, after removing any surface vegetation, resulting in 25 soil samples. These samples were pooled and processed to give one composite sample. The sampling device was cleaned after sampling each sub-area.
To assess the short-range spatial variation of trace elements, on the hypothesis that the spatial variability of trace element is comparable between the different sub-areas, only 2 sub-areas (one where P2O5 had been added)
134 P. de Zorzi et al.
[Figure: map of the test site with the grid of sampling points and a scale bar in meters; caption not preserved]
were sampled again. The resulting 25 samples per sub-area will be analyzed separately to explore the within-sub-area variability.
Three different devices, commonly used for sampling in agricultural fields, were used for the intercomparison exercise: an Edelman auger (20 cm length, 7 cm diameter), a gauge auger (20 cm length, 3 cm diameter) and a shovel. The devices were cleaned after each sampling.

Sample preparation

The samples were weighed (wet weight) and the data recorded. The samples were then stored in cartons before being dried in a fan oven at 35-40 °C for several days (to constant weight). They were then disaggregated with a wooden pestle and passed through an automatic, rotating stainless steel sieve (2 mm mesh). The fraction above
[Figure: map of the test site with the areas marked "Phosphate added" and a scale bar in meters; caption not preserved]
2 mm was removed and was not considered in the analytical phases.
The samples sieved at 2 mm were reduced in order to obtain the laboratory samples. The reduction phase was carried out to obtain samples representative of the soil collected but having a reduced size to make them more manageable in the laboratory. The reduction phase began with coning and quartering of the sample sieved at 2 mm and ended with the reduction by a riffle divider.

Preliminary considerations

By applying the EURACHEM principles a general cause-effect diagram (fish-bone) of the sampling phase
can be established. Figure 3 reports all the potential sources of uncertainty in soil sampling.
It is necessary to point out that not all the sources of uncertainty have to be considered in all experimental activities involving soil sampling. The relative contribution of the sources of uncertainty is dependent on the type of sampling and on the type of the analyte considered. The contribution of sampling strategy is higher in the case of the evaluation of "hot spots" or in the case of the assessment of elements distribution in contaminated sites. This aspect is not relevant in the case of the determination of the mean value of an analyte in an agricultural field. Sample type (disturbed/undisturbed) and 3D-spatial variation give an important contribution in the determination of vertical distribution of an analyte along the soil profile. Sample stability and sample handling have a high relative contribution for volatile elements determination. Environmental conditions, like moisture content of the soil and temperature, can influence the depth of soil sampled (in wet conditions, the layer sampled with an auger or corer samplers can increase due to compression of the soil). The influence of this source is higher in unmanaged soil (semi-natural ecosystems) than in managed soil (agricultural field).
The above reported considerations indicate that the sources of uncertainty associated with sampling are strongly dependent on the ecosystem investigated, sampling objectives and analytes studied. Figure 4 reports the cause-effect diagram for the assessment of the superficial distribution of trace elements in agricultural soil. In the frame of the SOILSAMP project, the influence of the operator is ruled out by selecting only one operator for sampling. Environmental conditions are ruled out as well, because all sampling activity is carried out with similar temperature and moisture content in soil.
Another aspect that has to be considered is that in some cases it is extremely difficult to quantify each single uncertainty independently. In this case it is more useful to select the uncertainty sources that can be evaluated as a "block". In the agricultural SOILSAMP experimental
design, some aggregated uncertainties were defined as reported in Fig. 4. The influence of particle size, sampling device, sampling strategy, sample handling, sample container, and part of sample preparation are included in the first block, while the critical phases of cone quartering and riffling (reduction of the sample) in sample preparation are considered separately.
Each step of the sample reduction phase has its own quantifiable uncertainty, and it is possible to quantify the uncertainty of the sample reduction as a "block". This uncertainty will be quantified experimentally as the standard deviation after several repetitions of the reduction phase in three different samples. To quantify the contribution of the uncertainty linked with the reduction phase, 137Cs and uranium series radionuclide activity concentrations will be determined. These radionuclides have been selected as appropriate elements because their determinations can be carried out both on the entire sample before reduction and on the different fractions resulting from reduction, without any treatment before radionuclide analysis. In addition, 137Cs and uranium series radionuclides show environmental behavior similar to that of many other trace elements in soil. 137Cs and uranium series radionuclide activity concentrations will be determined in at least 10 replicates by gamma-spectrometry.

Acknowledgements The authors would like to thank all the participants of the SOILSAMP external advisory group for their scientific contribution during the development of the activities. A special thanks to Dr. Luisa Stellato, consultant ANPA, for assessing the georef sampling points. Moreover, we are grateful to Valter Coletti, ERSA - Ente Regionale per lo Sviluppo Agricolo del Friuli Venezia-Giulia, for support and technical assistance during the field activity.
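The experimental estimate of the sample-reduction uncertainty described above (a standard deviation over repeated reductions, carried out in three different samples) can be sketched in Python as follows. This is only an illustration under stated assumptions: pooling relative standard deviations across the three samples is one possible way of combining them and is not taken from the SOILSAMP protocol, the function name `pooled_relative_sd` is hypothetical, and the activity concentrations are made-up numbers, not project data. Note that a spread obtained this way also contains the analytical (gamma-spectrometry) variability.

```python
import math
import statistics


def pooled_relative_sd(replicates_by_sample):
    """Pool the relative standard deviations of several samples.

    Each inner list holds the results of repeated reductions of one
    sample. The pooling is weighted by degrees of freedom (n - 1).
    """
    weighted_sum = 0.0
    degrees_of_freedom = 0
    for values in replicates_by_sample:
        n = len(values)
        mean = statistics.mean(values)
        rsd = statistics.stdev(values) / mean  # relative SD of this sample
        weighted_sum += (n - 1) * rsd ** 2
        degrees_of_freedom += n - 1
    return math.sqrt(weighted_sum / degrees_of_freedom)


# Made-up activity concentrations (Bq/kg) for three samples; illustrative only.
samples = [
    [10.0, 10.4, 9.6],
    [20.0, 21.0, 19.0],
    [30.0, 30.6, 29.4],
]

u_rel = pooled_relative_sd(samples)
print(f"pooled relative standard uncertainty: {u_rel:.3f}")
```

With at least 10 replicates per sample, as planned in the text, each inner list would simply be longer; the pooling itself does not change.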
References
1. ISO (1993) Guide to the expression of uncertainty in measurement. International Organization for Standardization (ISO), Geneva
2. ISO/IEC 17025:1999 (1999) General requirements for the competence of testing and calibration laboratories. International Organization for Standardization (ISO), Geneva
3. Thompson M (1999) J Environ Monit 1: 19-21
4. ISO 3534-1 (1993) Statistics, vocabulary and symbols - Part 1. Probability and general statistical terms. International Organization for Standardization, Geneva
5. Thompson M, Ramsey MH (1995) Analyst 120: 261-270
6. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120: 1353-1356
7. Ramsey MH, Argyraki A (1997) Sci Total Environ 198: 243-257
8. Ramsey MH (1997) Analyst 122: 1255-1260
9. Ramsey MH (1998) J Anal At Spectrom 13: 97-104
10. Ramsey MH, Squire S, Gardner MJ (1999) Analyst 124: 1701-1706
11. Squire S, Ramsey MH, Gardner MJ (2000) Analyst 125: 139-145
12. Belli M, de Zorzi P, Menegon S, Sansone U (2000) In the Proceedings of XXXI National Congress of Radioprotection, 20-22 September 2000, Ancona (Italy), pp. 97-105
13. Ramsey MH, Thompson M, Hale M (1992) J Geochem Explor 44: 23-36
14. Muntau H, Rehnert A, Desaules A, Wagner G, Theocharopoulos S, Quevauviller P (2001) Sci Total Environ 264: 27-49
15. EURACHEM-CITAC Guide (2000) Quantifying uncertainty in analytical measurement, 2nd edn. EURACHEM
16. Isaaks EH, Srivastava RM (1989) An introduction to applied geostatistics. Oxford University Press, Oxford, UK
17. Mulla DJ, McBratney AB (2000) Soil spatial variability. In: Sumner ME (ed) Handbook of soil science. CRC Press, Boca Raton, FL
18. Smodis B (1992) Vestn Slov Kem Drus 39(4): 503-519
19. Smodis B, Jacimovic R, Jovanovic S, Stegnar P (1990) Biol Trace Elem Res 26: 43-51
20. Smodis B, Jacimovic R, Medin G, Jovanovic S (1993) J Radioanal Nucl Chem Artic 169(1): 177-185
21. Svetina M, Smodis B, Jeran Z, Jacimovic R (1996) J Radioanal Nucl Chem Artic 204: 45-55
Accred Qual Assur (2002) 7: 106-110
DOI 10.1007/s00769-001-0420-4
© Springer-Verlag 2002
In this paper, we discuss how to ensure the quality of the analytical data for micro elements in food against problems that can introduce error.

Sampling

Our laboratory performed two kinds of task: (i) mandatory inspection (including quality supervisory inspection, production license inspection, and product attestation inspection, as assigned by government) and (ii) entrusted inspection (including common sample analysis and arbitration analysis).
Samples used for the mandatory inspection were randomly sampled from the qualifying products, within their guarantee dates, at the producer's factory storehouse or from market display cabinets. Samples used for the entrusted inspection were delivered to our laboratory in person by the sampling person. A sufficient amount of the sample was selected by a suitable method according to the provisions of the relevant standard.
Only the determined value of each ingredient in the sample was used to decide whether the product was up to standard; we did not infer whether the whole product set was up to standard.

Treatment of the analytical sample

An appropriate sample treatment method was selected before analysis according to the rules of the relevant standard in order to reduce sampling error and misrepresentation.
Common liquids, powders, and small pellet foods were homogenized simply by shaking. Solid food, especially non-uniform solid food, was first broken into pieces and then mixed to achieve sufficient homogeneity. Canned foods were poured into a blender and thoroughly mixed, since the element content at the can center differs considerably from that at surfaces in contact with the can wall.
Most of the foods examined were multi-component organisms. For accurate determination, the micro elements must be freed from organic matter before analysis. Decomposition procedures for the organic matter and extraction procedures for the inorganic elements were generally applied. For all procedures applied, however, the steps necessary to avoid contamination or loss of the elements determined during the sample treatment were always taken.
1. Steps to avoid contamination:
(a) Clean the laboratory air by filtration to avoid contamination of the analytical sample by elements in floating dust.
(b) Soak the glass vessel in dilute acid and then wash it with deionized water to avoid contamination by elements adsorbed on the vessel walls.
2. Steps to avoid loss:
(a) Keep the ashing temperature controlled and below 500 °C to avoid the loss of volatile elements (e.g., Cd and Pb) when the dry ashing method is used during sample treatment. Ashing aids used in dry ashing may promote the decomposition of the organic matter and ash solubilization, and also help avoid the loss of the elements determined because of the solute produced. For example, Mg(NO3)2 has been applied as an ashing aid to avoid the loss of Se, As, Cd, and Pb in fish, milk, and fruit juice samples treated with the high temperature ashing method. In the determination of volatile halogens, NaOH has been used to fix fluorine, and Mg(NO3)2 has been used as an ashing aid; the alkaline dry ashing method may be used for the destruction of organic matter and the liberation of fluorine without any loss. ZnSO4 has been used as an ashing aid in the iodine determination; the sample may then be ashed at 550-600 °C without any iodine loss.
(b) When the wet digestion method is used for food sample treatment, the appropriate digestant must be selected for the sample and the element being determined. For example, the oxidizing HNO3-H2SO4 mixture must be used as the digestant for arsenic determination in food containing high amounts of salt in order to ensure that all the arsenic present is arsenic (V); otherwise, the arsenic (III) may be lost as the volatile AsCl3 (b.p. = 130 °C). When HNO3-H2SO4 is used in canned food digestion, the acid-insoluble meta-stannic acid is produced and adsorbed on the inner wall of the Kjeldahl flask, so that the tin is lost. In this case, 4 mol l⁻¹ NaOH solution must be added and the flask gently heated with swirling until the meta-stannic acid is fully dissolved in the sample solution. However, if H2SO4-H2O2 is used in this digestion, this trouble may be avoided. H2SO4 should not be used in digestions for the determination of trace lead in samples containing large amounts of calcium, because the insoluble CaSO4 produced at the end of the digestion causes adsorption loss of lead.
(c) The original oxidation state (OS) of the elements must be retained when the different OSs of the elements are to be determined individually. For example, in the simultaneous determination of total chromium and chromium (VI) in food by atomic absorption spectrophotometry (AAS), the sample must be treated with 10% aqueous tetramethyl ammonium hydroxide in an ultrasonic water bath at 60±2 °C until all solid matter is dissolved; all the chromium ions are extracted into the alkaline solution without any change in OS. For the determination of total iron and iron (II) in
140 Z. Hu' L. Liu
infant food by AAS, the sample is treated with an acid-extraction procedure using hydrochloric acid and ultrasonic vibration under nitrogen flow; all iron is extracted into the acid solution without any change in OS.
(d) When the hydride generation method is applied to the determination of total arsenic in food, all of the organic and inorganic arsenic-containing components in the sample are completely converted into arsenic (V) by digestion with HNO3-H2SO4; the reduction of As5+ to arsine by borohydride is very slow, and the As5+ must therefore be completely pre-reduced to As3+ by potassium iodide-ascorbic acid (KI-VC). However, when graphite furnace AAS is used for the total arsenic determination, the arsenic-containing components in the sample solution must be completely oxidized to As5+, since the atomization temperature of As3+ is very different from that of As5+. When a sodium diethyldithiocarbamate-methyl isobutyl ketone (DDTC-MIBK) system is applied to the determination of chromium concentration, the Cr3+ must be completely oxidized to Cr6+ to ensure an accurate analytical result, because Cr3+ is not readily chelated by DDTC.

Selection of the analytical method

Many analytical methods can be applied to the determination of micro elements in food. Suitable analytical methods should have the following features:
1. The uncertainty of the method should be minimized (good precision). In general, the relative standard deviation of the method should be lower than ±5%.
2. The sensitivity and detection limits of the method should meet the needs of the standard (high sensitivity and low detection limit). In general, the detection limit of the method should be lower than the permitted content in the sample (provided in the product standard) by at least one order of magnitude.
3. A fair agreement between the true content and the expected content observed by the method is sufficient (good accuracy).
4. The method used for investigation is different from the reference method and should have a better precision than the method generally used for the determination of the parameters.
In general, we selected the suitable analytical method according to the element, content, and matrix in the determined sample in order to ensure the accuracy and reliability of the analytical result.
Our principle for method selection is that the standard method should always be used where possible. Under normal conditions, the existing national standard, international standard, and trade standard have been selected for the mandatory inspection. The national standard is preferred for the entrusted inspection and arbitration inspection. If it is not available, the normal standard and then the trade standard or contract standard will be used.
When no standard method was available for a certain sample, a reliable method published or developed by us was selected; however, these methods must first pass an assessment. Furthermore, it should be emphasized that we must carefully consider the following factors when the standard method is applied to certain element determinations in practical samples, because the standard method has been worked out for many kinds of food:
1. We must consider how to remove the various interferences in the determination of trace elements in food by AAS, according to the elements determined and their content:
(a) Prepare the standard solution in the same composition as the sample or apply a standard addition method in order to remove physical interference from differences in viscosity, surface tension, and vapor pressure.
(b) Suitable chelation-extraction or ion-exchange methods should be applied to collect the determined element in order to separate off the high amounts of interfering inorganic salts or extract out the micro elements.
(c) Make use of the characteristic gaseous state of the hydride (at normal atmospheric temperature and pressure); it can therefore be decomposed at lower temperatures. As, Sn, Bi, Pb, Se, Sb, Te, and Ge may therefore be readily separated from their mother solutions at normal temperature and pressure.
(d) When alkali metals and some of the alkaline-earth metals are determined, a readily ionizable element (another alkali metal) must be added to the analytical solution in order to increase the free electron concentration in the flame, thereby effectively controlling or removing the effect of the ionization interference.
(e) Chemical interference can be removed using the temperature effect, the gaseous state of the flame, the addition of a release agent, protective agent, flux agent, or organic solvent, etc., or by pre-separating the interfering matter.
(f) Molecular absorption interference can be removed by zero-point adjustment, background subtraction with a continuum light source, or the Zeeman effect.
(g) When graphite furnace AAS is used for the determination of trace elements in food, the matrix effect is more serious. A matrix improver must be used for the removal of the matrix inter-
Quality assurance for the analytical data of micro elements in food 141
ference. For example, phosphoric acid must be added to the sample solution for the determination of trace lead; the ash temperature may be increased to 900-1000 °C and the matrix interference for the lead determination is therefore removed. In the determination of arsenic, Mg(NO3)2 and Ni(NO3)2 are added as a matrix improver, increasing the ash temperature to 1100 °C; the interference of the anions and cations that coexist in the sample solution is thus removed, allowing the detection of concentrations as low as 6 ng g⁻¹.

Standard solution and certified reference materials

1. Standard solutions
A standard solution of very reliable quality is a necessity for quantitative analysis. In our laboratory, the certified reference reagents used as stock solutions for the micro element analysis were prepared by American Fisher Scientific Company or the China National Research Center for CRM.
Each working standard solution was prepared before use by diluting the stock solution to a known concentration with deionized water or dilute acid, using a calibrated burette and volumetric flask. The containers for the storage of the standard solution were soaked with acid and cleaned thoroughly with deionized water prior to use in order to avoid contamination. The standard working solutions for super micro element analysis were stored in Teflon containers to avoid adsorption loss and contamination by elements dissolving from the container. The storage conditions of the standard working solutions were rigorously maintained. For example, the standard tin working solution was prepared before use with dilute acid (0.1 mol l⁻¹ HCl, HNO3, or H2SO4) and may be stored for several months in glass, polypropylene, polyvinyl chloride, polycarbonate, and Teflon containers. If prepared with water, obvious losses occurred when the standard working solution was stored for as little as one day.
In our laboratory, the standard solutions were prepared by two people, calibrated against each other, and then checked against newly purchased standards in order to eliminate the risk of error.

2. Certified reference material
The certified reference material (CRM) is used as a quality assurance sample for the assessment of the analytical methods and results. The CRMs used for element analysis in our laboratory were prepared by the American National Bureau of Standards, MBH Analytical Limited, and the China National Research Center for CRM. For example, oyster tissue (SRM 1566), bovine liver (SRM 1577a; 308FI85), wheat flour (SRM 1567; GBW08503), rice flour (SRM 1568), milk powder (SRM 1549; 304 F063; GBW08509), cabbage (GBW08504), mussel (GBW08571), prawn (GBW08572) and pork (GBW08552).

Blank test

The blank test is a measure for checking whether the reagents and methods used in the analysis meet the requirements of trace analysis. A blank test must be carried out for each set of samples, using high-purity reagents and water passed through re-distillation, ion-exchange, or sub-boiling distillation apparatus in order to reduce the reagent blank value to a sufficiently low level. For example, a nitric acid solution containing large amounts of chromium should not be used in the sample digestion for chromium determination in food; the H2SO4-H2O2 digestion method is, however, suitable. High pressure digestion (performed in a sealed container) is applied to the analysis of food because only small amounts of reagent are used in the sample digestion, so its blank value is lower than that of the other methods. If a microwave heater is used in this method, the effect may be even better.

Calibration of the instrument and equipment

All of the instruments and equipment in our laboratory, including the atomic absorption spectrophotometer, UV-spectrophotometer, balance, thermometer, pressure gauge, vacuum meter, and high capacity glass containers, were calibrated at regular intervals, and operative inspections were performed every day before use to ensure a good operative state, accuracy, and reliability.
The calibration of the instruments is carried out every year by the legal measurement department (China National Institute of Metrology, China National Research Center for CRMs) according to the national rules for calibration. The verified values may be traced to the national measurement standards. Operative inspections were carried out by analytical personnel prior to each use.
For instruments without calibration rules, calibration was carried out by comparison with a similar instrument so as to make the verified value comparable.
Any erroneous instrument or equipment was withdrawn from use, and the data for samples recently analyzed with it were re-inspected.
Our laboratory had one set of small capacity glass containers which had passed calibration by the legal measurement department. Other glass containers were self-calibrated by personnel who had passed the special training at the legal measurement department, using the verified containers above as the measurement standard in order to ensure that their quantitative values could be traced back to the national measurement standard.
Application of the quality control chart [3]

The quality control chart for the single measurement of micro element determination in food has been applied in our laboratory. The analytical results can therefore be directly displayed in order to discover and correct problems during the inspection, which supports the accuracy of the determination data.
Preparation of the quality control chart: a suitable reference material was selected as a quality control sample. The certified value (A) of the reference material was taken for the center line, and the certified value plus and minus two times the standard deviation of the analytical method (A±2σ) for the upper and lower control limits, respectively; the determined values and the data points were taken as the vertical and horizontal coordinates, respectively, and a quality control chart of the single measurement was drawn for every element.
These quality control charts were always available in our laboratory, so that the determination result of the quality control sample could be plotted on its chart in order to control the analytical quality at any time.
The analytical quality control test for the assessment of trueness of the analytical result was carried out with each of the sample determinations; at least one CRM was analyzed with each set of practical samples. If the determination result of the quality control sample fell within the upper and lower control limits, it showed that the determination process for this sample set was in a state of statistical control; these results were therefore valid. If the determination result of the CRM fell outside the control limits, it showed that the determination process was out of control. The results obtained since the previous in-control check were then considered invalid, and the reason for the deviation was investigated and corrected.

Assessment of the final analytical result [4]

Important analytical results (including arbitration analytical results, specific sample analytical results, and determinations by non-standard methods) obtained in our laboratory were verified by the following techniques:
1. A CRM with a matrix similar to the sample was selected and analyzed together with the sample using the same procedure. If the analytical result of the CRM fell within the range of the certified value, it showed that the sample analytical result was accurate and reliable.
2. When no suitable CRM was available, the sample was determined by classical methods or other methods based on a different principle. Statistical analysis was then carried out on the means obtained by the two methods. If there was no obvious difference between them, we considered the sample analytical result to be accurate and valid.
3. The analytical result of the unknown sample is also corrected by the "recovery" of the certified value in the analysis of the CRM.
4. We joined comparison tests with national or international laboratories at irregular intervals and obtained good results. This indicated that our analytical results were accurate and reliable. For example, in 1998, we joined the proficiency test between 108 international laboratories in the Asia Pacific region for the determination of As, Cd, Pb, Hg, and Zn content in a canned fish sample labeled APLAC T009. The between-laboratory Z-scores ranged from -0.67 to 0.98 and the within-laboratory Z-scores from -0.24 to 0.73. Our performance was excellent.

Quality Assurance System [5]

Our laboratory is a qualified laboratory, having passed the examination and accreditation by the China National Accreditation Committee for Laboratory (CNACL). We have set up a quality assurance system that ensures the accuracy and reliability of the inspection data in six fields (environmental conditions, instruments and equipment, personnel quality, inspection process, quality appeal treatment, and accident treatment). It has been continuously revised and progressively standardized through the management review (including internal quality audits and review) each year.
Our laboratory is also subject to one re-examination every five years and one selective examination every year by the CNACL.
In our laboratory, the CRM is used as a blind sample in order to allow examination of personnel. Our personnel are all sufficiently competent and qualified.
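The control-chart rule described under "Application of the quality control chart" (center line at the certified value A, control limits at A±2σ, a run being in control when the CRM result lies within the limits) can be sketched in Python as below. This is a minimal illustration, not the laboratory's actual software: the function names `control_limits` and `in_control` are hypothetical, and the certified value, standard deviation, and results are made-up numbers.

```python
def control_limits(certified_value, sigma):
    """Return (lower, upper) control limits at A - 2*sigma and A + 2*sigma."""
    return certified_value - 2 * sigma, certified_value + 2 * sigma


def in_control(crm_result, certified_value, sigma):
    """True if the CRM result lies within the control limits."""
    lower, upper = control_limits(certified_value, sigma)
    return lower <= crm_result <= upper


# Made-up example: certified value A = 0.50 mg/kg, method SD sigma = 0.02 mg/kg.
A, SIGMA = 0.50, 0.02
crm_results = [0.51, 0.48, 0.52, 0.55, 0.49]  # one CRM result per sample set

for run, result in enumerate(crm_results, start=1):
    status = "in control" if in_control(result, A, SIGMA) else "OUT OF CONTROL"
    print(f"sample set {run}: CRM = {result:.2f} -> {status}")
```

In this sketch the fourth sample set would be flagged, which in the procedure described above means that the results of that set are treated as invalid until the cause of the deviation has been found and corrected.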
References
1. Hu Zhengzhi (1996) China Encyclopedia of Chemical Industry. Chemical Industry Publisher, Beijing, 10:57-138
2. China National Standard, GB2762-94 and GB4810-94
3. Pan Xiurong (1993) Introduction to quality assurance of analysis and inspection. Scientific and Technical Information Network for Standard Material, Beijing, pp 1-63
4. Wang Shuchun (1991) Mathematical statistics and quality control for food analysis. China People's Hygienic Publisher, pp 24-49
5. Quality Manual (1998) China National Center of Food Quality Supervision and Testing
Accred Qual Assur (1998) 3:227-230
© Springer-Verlag 1998
Often geometrical tolerances of manufactured parts are controlled by use of calliper gauges. According to a German standard [5], the highest permissible error of such a calliper gauge should be u = 0.03 mm. Thus the measurable tolerance using this gauge should be T ≥ 0.3 mm. If the tolerance is below this value one should use a more accurate device or measuring procedure.

Compliance with limiting values

The important task of checking whether products or samples comply with limiting values defined by regulations or laws for reasons of health, safety or environmental protection is related to the control of tolerances. It is only mentioned here for the sake of completeness and is dealt with in detail in [6]. Again the uncertainty has to be taken into account when making decisions.

Laboratory quality control

Reference materials are widely used in chemical analysis to establish traceability and to control analytical procedures, e.g. by use of control charts. For this purpose two uncertainties should be known:
- The uncertainty of the certified reference value
- The uncertainty of the analytical procedure to be checked.
Usually the latter is dominant.
As an example (Fig. 3), a control chart is shown set up by a BAM laboratory for the analysis of aluminium in steel by spark emission spectroscopy. The reference material used is the EURONORM-CRM No. 194-1 and the inner limits depicted in Fig. 3 give the confidence interval of the certified value. The total interval reflects the uncertainty of the analytical procedure estimated as ± 2 s (s = standard deviation). As can be seen, the procedure was out of control in early 1996 and had to be readjusted.

[Fig. 3: control chart "ZRM 194-1 - Aluminium"; plotted lines: ZRM, ZRM ± 2s, ZRM ± CI/2; vertical axis from 0.080 to 0.100]

Uncertainty of tests caused by sampling

Often the uncertainty associated with a test is not mainly caused by the measurement process itself but by the sampling procedure performed beforehand. Therefore it is important to include this component in the uncertainty budget, which otherwise would be misleading.
In connection with sampling, two questions arise:
- The representativeness of the sample
- The deduction of a result for the whole batch based on the sample result.
The first question causes severe problems, e.g. in the field of environmental analysis, but cannot be treated here.
The second problem, which is of importance e.g. in the fields of industrial quality control and market surveillance, can be treated appropriately by statistical means [7, 8]. An application is the market surveillance with regard to the so-called e-mark. This mark on prepacked food and consumer goods is intended to assure the consumer that the actual contents of the prepack conform with the nominal contents within certain limits. For example, a German regulation [9] applied to a lot of 10 000 packages of butter with a nominal weight of 250 g each stipulates the sampling instruction 125-7/8, which means that a sample of 125 packages is randomly taken and each pack has to be weighed. If x gives the number of items in the sample with m_j < 241 g (the minimum permissible weight), then the criteria for acceptance or rejection of the whole batch are x ≤ 7 and x ≥ 8, respectively. The probability of acceptance P_a as a function of the quality level of the lot can be derived from the so-called operational characteristic of this specific sampling instruction and is given in Table 1 for some selected values. It can be seen that a batch containing 8% packages with a weight less than 241 g will
still be accepted with a probability of approx. 20%. This example may demonstrate the limited resolving power of such a sampling instruction, and should be kept in mind.
In the case of the e-mark, according to a second requirement, a batch has also to be rejected if the mean weight of the sample items is less than the nominal weight with a significance level of 99%.

Concluding remarks

It is the aim of this paper to demonstrate that uncertainty statements are essential for the users of measurement and test results when they assess these results and have to take decisions based on them. However, for this purpose it is often sufficient to know the generic uncertainty of the type of test performed instead of the uncertainty of the particular result. But we are faced with some problems.

Confusion of the customers, hesitation of the laboratories

Laboratory practitioners know from their contacts with the customers that often the latter are not familiar with the concepts of uncertainty and are rather confused when they are confronted with uncertainty statements. On the other hand many laboratories fear that a comprehensive and honest statement of uncertainty might affect their reputation and competitiveness.

Concerted actions of the testing community, accreditation and standardization bodies

These problems cannot be solved by individual laboratories. Instead, the testing community as a whole is asked to co-operate with accreditation and standardization bodies aiming at:
- Evaluation of the (generic) uncertainty of measurement and test procedures
- Inclusion of uncertainty characteristics (e.g. repeatability, reproducibility) in testing standards
- Education of the customers.

Provision of the necessary funds

It is the task of national and European authorities to provide the necessary funds for these concerted actions
- Because by this means the testing infrastructure can be improved
- Because authorities are also customers of testing laboratories and important administrative and political decisions are based on their results.
Organizations like EURACHEM, EUROLAB and NORDTEST can help to initiate and stimulate these co-operative processes.

Acknowledgements The author would like to thank colleagues from EUROLAB, NORDTEST and BAM for fruitful discussions. In particular the contributions of Nazmir Presser, Rolf Oberhauser, Siegfried Noack and Thomas Goedecke are gratefully acknowledged.
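The acceptance probability of the 125-7/8 sampling instruction discussed in the section on sampling (sample of 125 packs, accept the lot if at most 7 are underweight) can be illustrated with a short Python sketch. It rests on an assumption that is not stated in the text: because the sample is small compared with the lot of 10 000 packs, the number of underweight packs in the sample is approximated by a binomial distribution instead of the exact hypergeometric one. The function name `acceptance_probability` is hypothetical.

```python
from math import comb


def acceptance_probability(p_defective, n=125, c=7):
    """Probability that at most c of n sampled packs are underweight.

    p_defective is the fraction of underweight packs in the lot.
    Binomial approximation (assumption); the exact model for a finite
    lot would be hypergeometric.
    """
    return sum(
        comb(n, k) * p_defective ** k * (1 - p_defective) ** (n - k)
        for k in range(c + 1)
    )


# Operating characteristic for a few lot qualities.
for fraction in (0.01, 0.02, 0.04, 0.06, 0.08, 0.10):
    p_accept = acceptance_probability(fraction)
    print(f"{fraction:5.0%} underweight -> P(accept) = {p_accept:.3f}")
```

Under this approximation, a lot with 8% underweight packs is accepted with a probability of roughly 0.2, in line with the "approx. 20%" quoted in the text.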
References
1. EN 45001 (1989) General criteria for the operation of testing laboratories, Brussels
2. ISO/IEC Guide 25 (1990) General requirements for the competence of calibration and testing laboratories, 3rd edn. Geneva
3. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement, 1st edn
4. Hernla M (1996) Qualität und Zuverlässigkeit 41: 1156-1162
5. DIN 862, Meßschieber - Anforderungen, Prüfungen, 1988
6. Christensen JM, Holst E (1998) Accred Qual Assur
7. ISO 2859-1 (1989) Sampling procedures for inspection by attributes; sampling plans indexed by acceptable quality level (AQL) for lot-by-lot inspection. Geneva
8. ISO 3951 (1989) Sampling procedures and charts for inspection by variables for percent nonconforming. Geneva
9. Bundesgesetzblatt Part I (1981) Verordnung über Fertigpackungen (Fertigpackungsverordnung) of 18.12.1981: 1585-1620, last change: Bundesgesetzblatt Part I (1989): 1557-1567
Accred Qual Assur (1998) 3: 237-241
© Springer-Verlag 1998
[5, Appendix A] do not seem entirely correct and call for comment. This is due to some oversimplified or incorrect insights into the subject, which may result in unrealistic estimation. Regardless how much the "error" in an estimate of the uncertainty may be, these issues are of principal importance in view of the educational significance of the Guide. Therefore, these "trifles" are worth drawing attention to in order to formulate some (obvious enough) rules, so that uncertainty estimates can be as correct as possible. These "in pursuit of correctness" rules are given below, with reference to the respective examples of the Guide.
Three "in pursuit of correctness" rules in estimating uncertainty
Rule 1: The choice of an appropriate distribution function (in type B evaluation of uncertainty) should be made on the basis of all the available information on the quantity at issue.
An estimate of standard uncertainty is often made from bounds $a_-$ and $a_+$ within which values of the quantity in question X are expected to lie. [The range $a_-$ to $a_+$ is commonly symmetric with respect to the best estimate of X and has half-width $a = (a_+ - a_-)/2$.] It is a fairly frequent task in the practice of measurement data evaluation, as assigning maximum bounds (based on objective knowledge or personal judgment) is often the only thing to do. One may simply divide the value of a above by an appropriate conversion factor depending on what kind of probability distribution is assumed. It is essential here that all the relevant information available be used. The three typical cases of what may be known are as follows (see Fig. 1):
1. A (statistically) estimated confidence interval having a stated confidence level
2. An expected value and assigned maximum bounds about it
3. Assigned maximum bounds only.
Unless otherwise stated, it is quite natural to assume in case 1 that a normal (Gauss) distribution was used to calculate the interval and recover the standard uncertainty by using a suitable quantile of the distribution. (The quantile is taken equal to 2.0 for 95% confidence level.) In contrast, extremely little information about the quantity in question is available in case 3, and all one can do is to model it by symmetric uniform (rectangular) distribution. Then, the expected value of the quantity is the midpoint of the range and the conversion factor is equal to $\sqrt{3}$. Case 2 occurs where additional information such as an expected value allows us to regard values of the quantity near this value as being more likely than values near the bounds. This situation differs from that in case 3. It is because of this that item F.2.3.3 of the ISO Guide to the Expression of Uncertainty in Measurement [6] recommends in such instances the adoption of a triangular distribution as a compromise between the two extremes, normal and rectangular distributions. The conversion factor is equal to $\sqrt{6}$ in this case, and the standard uncertainty obtained proves to be about 30% smaller than that obtained using the rectangular distribution model. It can be said that increasing uncertainty in going from case 2 to case 3 is in a sense a "payment" for our ignorance about the distribution of possible values of the quantity between the bounds.
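The three conversion factors named under Rule 1 can be put side by side in a short sketch. It is only an illustration: the names `CONVERSION_FACTORS` and `standard_uncertainty` and the tolerance of 0.10 mL are assumptions introduced here, not values taken from the Guide.

```python
import math

# Conversion factors quoted in the text: the half-width a divided by the
# factor gives the standard uncertainty u.
CONVERSION_FACTORS = {
    "normal_95": 2.0,               # case 1: 95 % confidence interval, normal model
    "triangular": math.sqrt(6.0),   # case 2: expected value plus maximum bounds
    "rectangular": math.sqrt(3.0),  # case 3: maximum bounds only
}


def standard_uncertainty(half_width, model):
    """Return u = a / k for half-width a and the chosen distribution model."""
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    return half_width / CONVERSION_FACTORS[model]


# Hypothetical half-width of 0.10 mL (illustrative value only).
a = 0.10
u_rect = standard_uncertainty(a, "rectangular")
u_tri = standard_uncertainty(a, "triangular")

# Relative reduction obtained by moving from the rectangular to the triangular model.
reduction = 1.0 - u_tri / u_rect
print(f"rectangular: {u_rect:.4f}  triangular: {u_tri:.4f}  reduction: {reduction:.1%}")
```

Whatever half-width is chosen, the reduction equals 1 - 1/$\sqrt{2}$, which is the "about 30%" mentioned in the text.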
[Fig. 1 Examples of the three cases: confidence interval for a weighing result (case 1); nominal values and specification limits for volumetric glassware (case 2); the purity of a material stated as being "not less than p (%) level" (case 3)]
It should be recognized that all the instances of using rectangular distributions in the EURACHEM Guide examples fall in fact under case 2, not case 3. Such are, in particular, the evaluations of the uncertainty concerned with volumetric glassware: a nominal capacity is simply an expected value. Thus, all the standard uncertainties calculated by applying the conversion factor of $\sqrt{3}$ appear to be overestimated. Although this approach based on Bayes "principle of equal ignorance" is common practice in estimating uncertainty in metrology, we cannot regard it as correct in all the cases of specifying measurement errors in the form of maximum limits. Though simple and universal, this scheme comes into conflict with common sense. Rectangular distribution is only to be assumed when nothing but the limits for possible values of the quantity are available. For example, the uncertainty associated with purity of a material and expressed as being "not less than the p (%) level" might be one such case insofar as an expected (or nominal) value of purity is unknown here. Other examples of the application of model distributions are shown in Fig. 1.
Rule 2: It is necessary to consider uncertainty compo-
fied tolerances. Only the two contributions to the uncertainty remain in such a case, and a substantially reduced uncertainty value is achieved in the final analysis. The cases considered are schematically depicted in Fig. 2. [It is necessary to note that an additional contribution to the uncertainty may arise due to a substantial difference between the properties (such as viscosity, surface tension, and so on) of a liquid to be measured and those of water, for instance, in the case of nonaqueous solutions. These effects are taken into account by means of individual calibration with an appropriate calibration liquid.]
Let us examine another situation, "Determination of organophosphorus pesticides in bread" (Example 3 of the Guide). This is a multistage procedure consisting of several sequential steps, beginning with homogenization and ending with a GC determination. The combined correction $F_c$ and the combined uncertainty $u_c$ for the procedure as a whole are derived from individual values of the correction factor $F_i$ and the uncertainty $u_i$ each relating to a stage i as follows:
Combined correction $F_c = \prod_{i=1}^{n} F_i$
[Fig. 3 (table; layout lost in extraction, caption only partly recoverable: "... summary estimates $\bar{F}_i$, $\bar{u}_i$ (lower part of the table)"). Columns: recovery $R_i$, correction factor, uncertainty, combined correction, combined uncertainty. From individual estimates: $F_c = \prod_{i=1}^{n} F_i$ and $u_c = \sqrt{\sum_{i=1}^{n} u_i^2}$. Individual from summary estimates: $F_i = \bar{F}_i / \prod_{j=i+1}^{n} F_j$ and $u_i^2 = \bar{u}_i^2 - \sum_{j=i+1}^{n} u_j^2$. Stopping at the first summary estimate: $F_c = \bar{F}_i \prod_{j=1}^{i-1} F_j$ and $u_c = \sqrt{\bar{u}_i^2 + \sum_{j=1}^{i-1} u_j^2}$. Calculations for Example 3: $F_c = 1 \times 1.00 \times 1.04 \times 1 \times 1.10 \times 0.96 \times 1 \times 1 \times 1 = 1.10$ and $F_c = 1 \times 1.00 \times 1.10 = 1.10$.]
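The two routes to the combined correction that Fig. 3 lists can be sketched in a few lines. This is a hedged illustration: the function names are introduced here, the stage factors are those legible in the figure residue, and the root-sum-of-squares combination assumes independent uncertainties expressed on a common (for instance relative) scale, which the surviving text does not state explicitly.

```python
import math


def combined_correction(factors):
    """F_c as the product of the stage correction factors."""
    return math.prod(factors)


def combined_uncertainty(uncertainties):
    """u_c as the root sum of squares of the stage uncertainties (assumed independent)."""
    return math.sqrt(sum(u * u for u in uncertainties))


def individual_from_summary(summary_factor, following_factors):
    """Individual factor: the summary factor divided by the factors of all following stages."""
    return summary_factor / math.prod(following_factors)


# Route 1: product of all individual stage factors as read from the Fig. 3 residue.
individual = [1, 1.00, 1.04, 1, 1.10, 0.96, 1, 1, 1]
f_c_individual = combined_correction(individual)

# Route 2: stop at the first summary factor (1.10, covering stage 3 and all later stages).
f_c_summary = combined_correction([1, 1.00, 1.10])

# Recovering the individual factor of stage 3 from its summary factor.
f_3 = individual_from_summary(1.10, individual[3:])

# Hypothetical stage uncertainties, used only to exercise the combination rule.
u_c_example = combined_uncertainty([0.03, 0.04])

print(round(f_c_individual, 2), round(f_c_summary, 2), round(f_3, 2))
```

Both routes give the same combined correction after rounding to two decimals, which is the point of breaking the procedure into independent elements.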
correction factor $F_i$ for such a stage one must divide the summary value $\bar{F}_i$ by the correction factor(s) relating to all the following stages (the available data in Example 3 permit us to do this), and one can then calculate the combined factor for the procedure.
It is also possible to do this without finding the individual correction factors in the calculation of the combined correction. It is sufficient to stop at the first summary factor $\bar{F}_i$ (such as $\bar{F}_3$ in Example 3) in the product of the factors. This leads the multistage procedure to be broken up into the elements that make independent contributions to the combined value. Figure 3 demonstrates the two possible ways of calculating: getting individual estimates from summary estimates is depicted by upright arrows, and obtaining the combined correction as a product is depicted by horizontal arrows. (The appropriate procedures in calculating the combined uncertainty are also included in the table.) The right-hand column of the table shows the relevant calculations associated with Example 3.
Rule 3: When estimating uncertainty only those influence factors are to be considered that really affect the result of a measurement in the context of the procedure.
Let us refer again to Example 3 of the Guide. Evaluation of the uncertainty associated with the GC measurements is based here (Table A3.6 [5]) on a wide-ranging study of GC variability across different instruments, operators, (and times). However, in spite of the wide variation of the factors, the usefulness of the uncertainty estimate so obtained is doubtful. Indeed, the conditions of the GC determinations are usually such that the responses for both a sample and a standard are registered with the same instrument, by the same analyst, and over a short period of time. This is why these influence factors largely cancel in the result of analysis. Therefore, accounting for these factors (with separate estimation of variabilities for GC determination and calibration stages) seems to be based on a misunderstanding in the context of the procedure.
Evaluation of an uncertainty associated with weighing in the same example should also be mentioned. The two contributions to the uncertainty taken into account in this case are: a standard deviation for "repeatability experiments" (0.03 g) and a standard deviation of the mean of the long-term data (0.008 g). The two components are combined, giving the value 0.031 g.
Note first of all that the use of the standard deviation of the mean as a measure of long term variability is not correct in estimating the uncertainty sought for. If the data available have covered not 11 months, as in the example, but say 22 months, the long-term contribution would be $\sqrt{2}$ times smaller following this way of thinking. The available monthly check weights data (Table A3.1(3) [5]) give a standard deviation of 0.026 g, and this value alone characterizes the long-term variability of a single weighing. At the same time, Table A3.1(1) [5] shows repeat weighings / replicate readings results that may be very useful for detailed examination of the precision of weighing. Application of one-way analysis of variance according to a standard scheme [9] allows us to estimate separate components of total weighing variation, inasmuch as the standard deviation of 0.03 g mentioned above was evidently obtained by treating the data as one large sample, without a separation.
So, a replicate readings standard deviation is found to be 0.020 g (with the number of degrees of freedom f being equal to 36). One can easily prove by means of the F test that the long term standard deviation of 0.026 g (f = 10) does not differ significantly from this estimate, even at the 10% level. This means here that
Evaluating uncertainty in analytical measurements: the pursuit of correctness 151
time is not a factor at all, so that a contribution due to long-term variability need not be considered. The analysis further leads to a repeat weighings standard deviation of roughly 0.05 g (f = 11) for a single weighing. This estimate is significantly greater (F test) than 0.020 g, the standard deviation for replicate readings, and hence there is some kind of additional source of variation in weighing in the laboratory. Whatever the source may be, it is the value 0.05 g which should be taken as an actual contribution of weighing to the combined standard uncertainty required.
And one more remark on the subject
One of the most important points relating to the evaluation of uncertainty is that all relevant error sources should be taken into account, with the corresponding contributions being combined. The estimate so obtained quantifies the overall uncertainty inherent in the analytical procedure at issue. Apart from the meaningful reporting and interpretation of an analytical result, the overall uncertainties are applicable, for instance, for quality control purposes when reference materials are used.
There are, however, many cases in analytical practice in which the overall uncertainty estimates seem inappropriate to be handled and thus unnecessary. Suppose, for instance, one has to compare two articles of ceramic ware with respect to cadmium release according to BS 6478 (see Example 2 of the Guide). The experiment is carried out in such a way that the tested vessels filled with the same leaching solution are allowed to stand during the same period of time at the same temperature (both measured with reasonable accuracy), and the two extract solutions obtained are analyzed by AAS using the same bracketing reference solutions. The question is whether the two samples differ from each other with respect to the test or not, or, in terms of statistics, whether the difference between the two measurement results is significant against the background of their own variabilities.
Appropriate conformity criteria based on Bayesian theory as well as those of the usual statistics can be applied to the problem of comparison of two measurement results, taking into account their uncertainties [10], regardless of the fact that the over-all uncertainties, as calculated in Example 2, would be unsuitable to solve the problem correctly with respect to the comparative experiment. They are "excessive" for this. Clearly, a number of error components caused in particular by deviations of actual experimental conditions from nominal ones are the same for the two results to be compared, and the corresponding contributions vanish in the uncertainty budget for the difference. Thus, if only an internal consistency of results, not absolute trueness, is of interest, the influence quantities which are not variable in the scope of such a comparative trial may be disregarded, with the overall uncertainty being reduced to that suited to the particular conditions and referred to as a conditional uncertainty.
The possibility of such an approach, albeit with respect to the "top-down" method of dealing with uncertainty, was noticed in [1]. The term "conditional uncertainty" or a similar one is likely to gain currency in analytical data treatment, since a considerable part of everyday tasks in analytical laboratories only requires such an internal consistency of results. It should not be regarded as a "loophole" in order to reduce an uncertainty that may otherwise be too large. It is to be reckoned rather as an instance of applying the fitness-for-purpose principle. The notion of fitness for purpose is apparently quite applicable to uncertainty estimates as well as to data produced by the measurement process itself.
In conclusion, it would be relevant to cite a very true and profound passage (item 3.4.6) from the ISO Guide [6], which was fully carried over to the EURACHEM document (item 5.4.16): "The evaluation of uncertainty is neither a routine task nor a purely mathematical one; it depends on detailed knowledge of the nature of the measurand and of the measurement method and procedure used. The quality and utility of the uncertainty quoted for the result of a measurement therefore ultimately depends on the understanding, critical analysis, and integrity of those who contribute to the assignment of its value."
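The weighing figures discussed under Rule 3 can be checked with a few lines of arithmetic. The sketch below uses only the standard deviations quoted in the text; the function name `f_ratio` and the variable names are introduced here for illustration, and the critical values of the F distribution are deliberately left to a statistical table for the stated degrees of freedom.

```python
import math


def f_ratio(s_a, s_b):
    """Variance ratio F = (larger s / smaller s)^2 as used in the F test."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return (larger / smaller) ** 2


# Standard deviations quoted in the text, in grams.
s_replicate = 0.020   # replicate readings, f = 36
s_long_term = 0.026   # monthly check weights, f = 10
s_repeat = 0.05       # repeat weighings, f = 11

# Combination criticized in the text: repeatability 0.03 g with the standard
# deviation of the mean of the long-term data, 0.008 g.
u_guide = math.hypot(0.03, 0.008)

# The standard deviation of the mean depends on how many months happen to be
# available, which is why it does not describe a single weighing.
sd_mean_11 = s_long_term / math.sqrt(11)
sd_mean_22 = s_long_term / math.sqrt(22)

# Variance ratios to be compared with tabulated critical F values.
f_long_vs_replicate = f_ratio(s_long_term, s_replicate)   # f = (10, 36)
f_repeat_vs_replicate = f_ratio(s_repeat, s_replicate)    # f = (11, 36)

print(f"u_guide = {u_guide:.3f} g")
print(f"sd of mean: {sd_mean_11:.4f} g (11 months), {sd_mean_22:.4f} g (22 months)")
print(f"F long-term/replicate = {f_long_vs_replicate:.2f}")
print(f"F repeat/replicate = {f_repeat_vs_replicate:.2f}")
```

The first ratio is modest and the second is large, in line with the conclusions drawn in the text that long-term variability is not significant while the repeat-weighing variation is.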
References
1. Analytical Methods Committee, RSC (1995) Analyst 120:2303-2308
2. Cortez L (1995) Microchim Acta 119:323-328
3. Williams A (1996) Accred Qual Assur 1:14-17
4. Wegscheider W, Zeiler H-J, Heindl R, Mosser J (1997) Annal Chim 87:273-283
5. EURACHEM (1995) Quantifying Uncertainty in Analytical Measurement
6. ISO, IEC, OIML, BIPM (1992) Guide to the Expression of Uncertainty in Measurement. 1st edn. ISO (The 1993 edition in the name of the seven organizations including IFCC, IUPAC, IUPAP is also available)
7. ISO 384 (1979) Laboratory glassware. Principles of design and construction of volumetric glassware
8. ISO 4787 (1984) Laboratory glassware. Volumetric glassware. Methods for use and testing of capacity
9. Doerffel K (1990) Statistik in der analytischen Chemie, 5th edn, chap 8. Deutscher Verlag für Grundstoffindustrie, Leipzig
10. Weise K, Wöger W (1994) Meas Sci Technol 5:879-882
Accred Qual Assur (1998) 3:14-19
© Springer-Verlag 1998
Abstract The problem with which analytical laboratories are confronted, after traceability of their results has been demonstrated, is correctly estimating their uncertainty - to which traceability is also to some extent subject. While the general principles for calculating the uncertainty of physical measurements are applicable to chemical metrology, some refinements are needed, especially careful selection and planning the level at which uncertainty will be estimated by each laboratory in accordance with its capacity and required demands. Depending on the particular decision to be made, the mechanism to be used to estimate the uncertainty varies markedly; also, the rigour of the estimation increases with increasing stringency of the demands. This paper describes the primary sources of uncertainty in chemical metrology and discusses different approaches to its estimation in relation to the type of analytical laboratory concerned. The view presented tries to be close to the bench analytical level, in order to be practical and flexible for laboratories, although it could sometimes be considered slightly heterodox.
Key words Uncertainty · Quality Assurance · Chemical measurements · Metrology
A. Rios (✉) · M. Valcarcel
Department of Analytical Chemistry, Faculty of Sciences, University of Córdoba, E-14004 Córdoba, Spain
strated, supported and documented in most instances. One must concede, however, that calculating the uncertainty of the results obtained by an analytical chemical laboratory is currently an issue of great concern for laboratories and also, occasionally, a source of controversy between auditors themselves. It is clear that the direct use of metrological principles in analytical laboratories as they are used in the physical field is pretty useless and produces a strong aversion on the part of the laboratories, which see the process as artificial and far from reality.
This paper is aimed at clarifying the way analytical laboratories should approach the problem and should adopt a solution consistent with their role and competence. An assumption is made that there are several levels at which uncertainty can be estimated and which make up global uncertainty (a more rigorous and valuable concept). While every laboratory should aim to estimate this last value, it would be foolish to ignore the fact that most analytical control laboratories - those accredited included - estimate other types of uncertainty that are numerically more accessible; in so doing, they restrict the diversity and intrinsic heterogeneity of the samples they receive.
what is measured (i.e. the analyte or measurand) but also where it is measured (i.e. samples and their matrices). Obviously, determinations of iron in rocks and human blood are not the same. The "tools" to be used in each case vary, i.e. the analytical process that follows sample collection differs (viz. samples are treated differently and subjected to measuring methods and techniques that are dictated by their analytical properties). The analytical problem addressed, which demands a solution, is obviously not the same either [3]. As a result, in chemical metrology the sample (as the physical materialisation of the analytical problem) is the decisive factor. Whatever chemical measurement is to be made will be dictated by the type of sample; also, even if the measurand is the same, the standard to be used and the way measurements are to be made (viz. the analytical technique of choice) can vary markedly. The result of a determination will be subject to a global uncertainty arising from three distinct but closely related agents of the analytical process, namely: (a) the measuring instrument, (b) the analytical method (sample treatment included) and (c) the sampling and sub-sampling procedure. One other distinct feature of chemical metrology is qualitative analysis, which also requires appropriate standards and is absent from physical metrology.
this activity (e.g. the use of holmium or didymium filters to calibrate wavelengths in UV-visible spectrophotometers).
The standardisation of an instrument's response - commonly referred to as "calibration", as in "calibration curve", despite the fact that calibration is a different activity - is a purely analytical activity typical of chemical metrology. It affects analytical instruments only and defines their response to the measurand(s) to be measured. Standardisation is crucial to ensuring traceability in the results subsequently obtained. One must make several decisions and take several steps to reach this goal, namely: (a) select an appropriate standard, (b) choose a suitable standardisation method, (c) derive a mathematical relation between the analytical signal and concentration, and (d) validate the model established. There are some manuals and literature references of help in this context, particularly those with a chemometric slant [4, 5]. It is worth emphasising the significance of validating the experimental model in terms of quality. Validation of a model involves experimentally confirming that it is a correct simplification of the series of experimental points it contains in such a way that it can accurately predict future unknown values (in the samples to be analysed). Univariate linear calibration is usually done by using least-squares regression, which involves checking fulfilment of various statistical hypotheses - alternatively, residual analysis can be used to check for homoscedasticity. However, one must also determine how closely the model fits experimental points by using analysis of variance (ANOVA) in order to confirm whether a different, more precise type of fitting is needed. The validation process also involves determining the confidence region for the model, its sensitivity and its valid lower limit (represented by the limit of determination).
Validating the standardisation model allows one to ensure traceability in the results produced by inverse interpolation in the analysis of samples; however, the model will obviously be subject to an uncertainty $u_3$ that will be a function of the standard error, $s_{y/x}$; this latter characterises the standard deviation associated with the mathematical definition for the regression line. This is a standard uncertainty essentially subject to the model's random errors, which arise from variations in signal measurements (the dependent variable, y). Because the uncertainties in $x_i$ values are small relative to the previous ones, they can usually be neglected. In any case, calculating this type of uncertainty poses no special problem.
Uncertainty in the analytical method
Analytical methods are currently divided according to traceability into primary methods (formerly designated "absolute" and "stoichiometric" methods) and secondary methods, which involve a longer traceability chain and are commonly referred to as "relative" or "comparative" methods. One prominent part of relative analytical methods is the standardisation of the instrument's response, described in the previous section. However, a relative method also comprises other steps that are collectively designated the preliminary operations of the analytical process. In fact, these operations significantly complicate chemical metrology as they are varied and difficult to control and reproduce in a systematic manner [6]. They are thus the source of major errors not only of the random but also of the systematic type that have a decisive influence on uncertainty.
Method validation is thus a central activity in laboratory quality systems in as much as it assesses adherence of the laboratory to its quality policy. The validation process is closely related to representativeness of the results [7], which depends on the analytical objectives and types of sample. Table 1 shows the basic landmarks of the process. No doubt, demonstrating that the results obtained are accurate is essential proof and an unavoidable requisite. In addition to meeting other objectives, including compatibility with the sample matrix and adequate robustness for use in routine work, one must estimate the degree of uncertainty associated with the results produced by a given method.
Table 1 The process involved in validating an analytical method
1. Checking fitness to the analytical problem:
   - Choosing a suitable method (to be subsequently confirmed or rejected by validation)
2. Preliminary study:
   - Clear, detailed description
   - Checking fitness to the analytical goal via
     • Comparability with the sample matrix (applicability to real samples)
     • Limit of detection
     • Determination range
     • Selectivity
   - Robustness tests
3. Experimentally demonstrating that the system is "under statistical control" (by means of control graphs):
   - The means of measured values should remain constant over long periods (at high and low analyte concentrations)
   - The precision should be adequate and constant
4. Demonstrating accuracy:
   - Recovery tests
   - Comparison with an independent, previously validated method
   - Comparison with CRMs
   - Interlaboratory studies
5. Compiling the SOP after the method has been validated
The primary source of uncertainty lies in the preliminary operations required to treat samples. Such operations as digestion/disaggregation of solid samples or extraction and clean-up processes (fairly frequent) intro-
A view of uncertainty at the bench analytical level 155
duce significant "uncontrolled" sources of error that contribute to uncertainty. Type A evaluation (viz. uncertainty that can be experimentally evaluated from the statistical distribution of the results from a series of measurements) and type B evaluation (uncertainty evaluated from assumed probability distributions based on experience or other information) uncertainties can be used to estimate the global uncertainty (as the ISO recognises) of the analytical method concerned, denoted by $u_2$. Recovery tests must be conducted very carefully if their uncertainty is to be correctly estimated. Thus, the analyte must be added in the same chemical form as it is likely to be present in the samples; also, the spiked sample must be thoroughly homogenised and additions must include variable amounts of analyte. Finally, the recoveries must be evaluated in statistical terms (usually by regression analysis).
The possibility of evaluating uncertainty under a type B approach in this analytical step is a distinct feature of chemical metrology that entails obtaining additional information (mainly about the preliminary operations involved in the method used) frequently obtained outside the laboratory (from the analytical literature or other laboratories). This type of evaluation is closely related to the variety of samples where the analyte can occur or with the fact that sometimes it is not a single analyte but a group of ill-defined individual analytes that are to be determined (e.g. bitter compounds in beer or the hydrocarbon index in waters).
Uncertainty in sampling and sub-sampling
It is widely admitted that sampling poses special problems and influences the representativeness of the results. The portions extracted from a sample for analysis should contain essentially the same information as the population or system studied as a whole. Unsurprisingly, this activity has been the subject of abundant literature [8-10]. The sampling strategy can be suited to the analytical problem addressed by using four different types of approach, viz. intuitive or judgmental, random, systematic and protocol-based. This is therefore the first decision with which one is confronted. The sampling manual describes in detail the conditions, equipment and procedures used in this step. In such a widely variable activity, it is utterly important to develop clear, well-documented protocols in order to release operators from the need to improvise or undertake responsibilities beyond their qualification. Even if these cautions are exercised, there remain the enormous variability of samples and their also variable representativeness of the problem addressed.
Equally important is sub-sampling, which involves withdrawing aliquots from previously collected samples for subjection to the analytical process at the laboratory. This step is also documented in quality schedules; however, the process is complicated by heterogeneity in most samples - particularly solid samples - and introduces appreciable variability between sub-samples that ultimately leads to significant differences between the results obtained for the same sample. No doubt, sampling and sub-sampling will demand greater interest in the future; such standards as EN 45000 and similar ones should provide a more systematic and extensive description of the minimum requirements in this respect, engage laboratories in these activities and encourage the release of sampling guides for specific fields. This will decisively increase the quality of results and connect them to the real world - from which samples ultimately come - rather than only to the portion that reaches the laboratory. Simultaneously, the bodies and institutions concerned with or responsible for quality on a national or international scale should promote efforts in the direction pointed out by Thompson and Ramsey [11] in order to develop reference sampling targets (RST), analogue sampling of an RM or CRM, collaborative trials in sampling, and the possibility to organise future proficiency testing in sampling.
The uncertainty produced by these steps, denoted by $u_1$, is undoubtedly very high and exceeds that resulting from the previous two steps. Also, its estimation is rather complex in most cases: it entails using vast amounts of information about the samples analysed and their origin to ensure a correct assessment. Obviously, as the tools noted in the previous section become available, this task will be easier and more reliable. Heterogeneity within and between samples is the origin of this problem and one more feature that clearly distinguishes chemical metrology from physical metrology. It confirms that the primary target of chemical metrology is the sample rather than the equipment used in the analytical process, which is also important but only secondarily. There is thus a highly significant underlying problem awaiting solution in order to assure quality of chemical measurements: sampling and sub-sampling. Under the influence of our fellow engineers and physicists, who deserve due credit for establishing metrological principles and starting and systematising Quality Assurance systems, we have placed too much emphasis on measuring equipment to the detriment of our true goal and primary source of variability: samples.
Estimating uncertainty
The "Guide to the Expression of Uncertainty in Measurement", published jointly by the ISO and other bodies [12], sets general rules for assessing and expressing uncertainty, and applying them to chemical metrology. Uncertainty is assigned various sources including the following: an incorrect definition of the measurand,
tion used for its validation (by comparison this is the analogous variation that introduces the tolerance stated for the internal volume of a volumetric flask); and, finally, the uncertainty introduced by the variety of samples analysed (real samples, basically heterogeneous). Thus, $u_{total}$ should represent the maximum uncertainty that a particular laboratory could have in its reported results.
Estimating uncertainty under these circumstances is especially complex. The experience of the laboratory concerned (or others) in the analytical process involved and the type of sample processed can be of great assistance as they may allow one to use existing data or plan experiments for different samples or parts of samples in order to derive information on the degree of variability resulting from sample heterogeneity and sample types. Interlaboratory exercises are one other valuable source of information for assigning uncertainties due to sample variability and heterogeneity. Finally, scanning the specialised literature - an essential task for laboratories wishing to sustain their competence - may also provide estimations of variability which, duly justified, can be used to estimate uncertainties in specific cases.
Conclusions
R. Albert have recently said, "the calculation of uncertainty as recommended for physical measurements cannot be transferred readily to chemical measurements" [14], because both testing fields have entirely different error patterns and bias is difficult to identify and eradicate in chemical systems. In chemical metrology, the sample (the physical entity of the analytical problem to be solved) is the primary target. Also, the greatest source of variability is the sample (its nature, its heterogeneity, the matrix that accommodates it, the forms under which the analytes are present, etc.). Therefore, an analytical result can hardly be representative of the object from which information is to be derived unless it is accompanied by the uncertainty introduced at this early stage of the analytical process. The principal virtue of the metrological concept of uncertainty is that it explicitly encompasses aspects related to the representativeness of results that have traditionally been disregarded in much analytical work. On the other hand, the orthodox step-by-step procedure to estimate the uncertainty in chemical measurements can be alternatively replaced by an overall procedure, as it has been presented in this paper, which is simpler, more rational and closer to the bench level. Under this approach, laboratories see as more practical and realistic the way to calculate their uncertainties by themselves.
References
1. XVII General Conference on Weights and Measurements (1983)
2. Valcarcel M, Rios A (1995) Analyst 120:2291-2297
3. Valcarcel M, Rios A (1997) Trends Anal Chem 16:385-393
4. Massart DL, Vandeginste BGM, Deming SN, Michotte Y, Kaufman L (1988) Chemometrics: a textbook. Elsevier, Amsterdam, pp 75-92
5. Miller JC, Miller JN (1993) Statistics for analytical chemistry (3rd edn), chap 5. Ellis Horwood, New York
6. Valcarcel M, Luque de Castro MD, Tena MT (1993) Anal Proc 30:276-280
7. Rios A, Valcarcel M (1994) Analyst 119:109-112
8. Crosby T, Patel Y (1995) General principles of good sampling practice. Royal Society of Chemistry, London
9. Smith R, James GV (1981) The sampling of bulk materials. Royal Society of Chemistry, London
10. Gy PM (1995) Trends Anal Chem 14:67-76
11. Thompson M, Ramsey MH (1995) Analyst 120:261-270
12. Guide to the Expression of Uncertainty in Measurements (1995) ISO, Geneva, Switzerland
13. Quantifying Uncertainty in Analytical Measurement, version 6 (1995) EURACHEM
14. Horwitz W, Albert R (1997) Analyst 122:615-617
Accred Qual Assur (1998) 3:117-121
© Springer-Verlag 1998
Uncertainty of sampling: contribution to the combined uncertainty of an analytical measurement that results from the production of the laboratory sample from the sampling target.
Sampling target: mass of material that the laboratory sample is designed to represent.
Fitness for purpose: property of the result of a measurement when its associated uncertainty minimises a cost function comprising all terms that are functions of the uncertainty.
In analytical science, measurements are not usually made on the whole amount of the material of interest (here called the 'target'), but on a much smaller amount, the sample, which is selected from the target in some manner. As a consequence, metrologists in chemistry have hitherto concentrated on the analytical measurement process in isolation. As that process involves estimating the concentration of an analyte in the laboratory sample, the uncertainty of measurement for the analyst refers to that specific measurand. For the end-user of the data, however, the measurand of interest is the concentration of the analyte in the target. Hence
Uncertainty of sampling in chemical analysis 159
where $u_a$ is the standard uncertainty of 'pure measurement' and $u_s$ is the standard uncertainty resulting from errors in sampling. It is stressed here that sampling uncertainty characterises only those errors made during the process of producing the laboratory sample. Errors introduced during the selection and weighing of a test portion from the laboratory sample are subsumed into the measurement uncertainty.
In most sectors of analytical science sampling procedures regarded as 'best practice' or 'fit-for-purpose' have been developed. Usually, however, we have very little information on the performance of such procedures, because the validation of sampling is far less developed than that of analysis. In such cases the uncertainty of sampling usually needs to be estimated ab initio. As such estimation can present considerable practical difficulties, it is currently attempted in only a limited number of sectors of analytical practice.
Sampling errors can be quantified only after analysis of the samples, so the results of ordinary measurements carry both sampling and analytical errors. As a consequence, results used to estimate sampling uncertainty must usually be obtained from designed experiments (with replication and randomisation) for interpretation by anova (analysis of variance) methods.
Fig. 1 Design for replicate sampling and analysis of a single sampling target, for the estimation of sampling and analytical precisions
variation characterised by $\sigma_s^2 + \sigma_a^2$, where $\sigma_a^2$ is the analytical variance. An experimental design for estimating $\sigma_s^2$ and $\sigma_a^2$ is shown in Fig. 1. A reasonably reliable estimate of $\sigma_s^2$ can be made by anova if $\sigma_s > 3\sigma_a$ (i.e., the analytical precision must be somewhat better than the sampling precision) and if there are a sufficient number of replicate samples (i.e., more than ten). If the mean squares between and within samples from the anova are designated MSB and MSW respectively, the estimate $\hat{\sigma}_s^2$ of the sampling variance is given by
$\hat{\sigma}_s^2 = (\mathrm{MSB} - \mathrm{MSW})/n$
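The variance estimate from MSB and MSW can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming a balanced design (every replicate sample analysed the same number of times) and taking n to be the number of analyses per sample, which is the usual one-way anova convention but is an assumption here. The function name and the example data are hypothetical.

```python
from statistics import mean


def sampling_variance(results):
    """One-way anova on a balanced replicate-sampling design.

    results: list of samples, each a list of n replicate analyses.
    Returns (MSB, MSW, estimated sampling variance, estimated analytical variance).
    """
    m = len(results)       # number of replicate samples
    n = len(results[0])    # analyses per sample (assumed meaning of n)
    if any(len(r) != n for r in results):
        raise ValueError("design must be balanced")

    sample_means = [mean(r) for r in results]
    grand_mean = mean(sample_means)

    # Mean square between samples, m - 1 degrees of freedom
    msb = n * sum((sm - grand_mean) ** 2 for sm in sample_means) / (m - 1)
    # Mean square within samples, m * (n - 1) degrees of freedom
    msw = sum(
        (x - sm) ** 2 for r, sm in zip(results, sample_means) for x in r
    ) / (m * (n - 1))

    var_analytical = msw
    # A negative difference is conventionally set to zero
    var_sampling = max((msb - msw) / n, 0.0)
    return msb, msw, var_sampling, var_analytical


if __name__ == "__main__":
    # Hypothetical data: three samples from one target, each analysed twice
    data = [[10.0, 12.0], [14.0, 16.0], [18.0, 20.0]]
    print(sampling_variance(data))
```

With only three samples the example falls well short of the "more than ten" replicates recommended in the text; it is meant to show the arithmetic only.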
Fig. 2 Design for duplicate sampling and analysis of a number m of similar sampling targets, for the estimation of sampling and analytical precisions
ple should be made up of a number of increments taken from the target in a specific two-dimensional pattern, the replicate samples should be obtained by relocating the origin and orientation of the pattern at random for each sample. Any deviation from randomness would tend to give rise to an underestimate of sampling precision. In practice the extraction of a random sample from particular targets may be difficult or impracticable. Samplers must do their best under prevailing circumstances.
In all of these experiments, the set of samples collected should be analysed in a random order under repeatability conditions. This strategy avoids confusing analytical problems like drifts with genuine differences between the samples.
Bias in sampling is a difficult topic. Some experts on sampling argue that sampling bias is not a meaningful concept: sampling is either 'correct' or 'incorrect' [2]. Certainly sampling bias may well be difficult to detect or estimate, because we need an alternative sampling technique or protocol, regarded as a reference point, with which to compare our method under test, before we can claim that bias is present. However, it is easy to see how bias could arise in sampling methodology. For example, the samples could be consistently contaminated by the sampling tools such as containers or grinding devices (example - rock chips from a borehole contaminated by chromium from the drill bit). Alternatively, some of the analyte could be consistently lost from the samples because of inappropriate handling (example - loss of elemental mercury from a rock sample during grinding). A sample can be biased if the sampler misunderstands the protocol (example - sampler collects a-horizon soil contaminated with b-horizon soil). Finally a sample can be biased if the sampler is selective (instead of random) in the selection of increments that form the aggregate sample (example - always picking up large pieces of the target materials rather than small pieces).
It is often possible to avoid these biases once they are known. All too often it is very difficult to establish the existence of such a bias in sampling for want of (a) a reference sampling method for comparison, or (b) the will to carry out the comparison, or because sampling precision is too large to allow the existing bias to be demonstrated at a significant level. In such cases it is usual to regard the sampling method as 'empirical' (by analogy with the empirical analytical method, where the result is dependent on the analytical method). An empirical sampling method would have zero bias by definition. In the absence of any readily available information on sampling bias it is best in most circumstances to calculate combined uncertainty estimates only from precision contributions.
Internal quality control in sampling
Given estimates of $u_s$ and $u_a$ for a particular type of material and a fixed sampling protocol we can consider the application of an internal quality control method to ensure that the measurement system, including sampling, stays in statistical control. Let us assume that some or all of the successive routine sampling targets are sampled in duplicate and each of the two samples is analysed once, to give two values $(x_1, x_2)$ for each target. For monitoring these data, a control chart, based on $N(0, u^2 = 2(u_s^2 + u_a^2))$, could be constructed for the difference $d = x_1 - x_2$ between the two values. As usual the warning limits should be at $\pm 2u$ and the action limits at $\pm 3u$. That would be useful as a control on sampling as long as $u_s$ was the dominant precision term. An out-of-control situation could arise if, for example, the target were more heterogeneous than usual, or if an inadequate number of increments were combined to form the aggregate sample. In contrast to analytical quality control, there exists an extra possibility, namely that the two samples could be too similar. That could arise if the two samples were not independent, for instance, in
an extreme case, if both laboratory samples were splits from a single aggregate sample. A possible approach to monitoring too great a similarity would be to set up additional control lines at (say) $\pm 0.5u$. Four successive points within these inner lines would occur only rarely under statistical control and suggest problems with sampling.
The collaborative trial in sampling
Until recently sampling precision has been treated implicitly as if it were independent of conditions under which sampling is executed, so that repeatability conditions should suffice for its estimation. However, it is well recognised that in analytical measurement precisions measured under reproducibility conditions (one method, one material, different laboratories) are greater than repeatability precisions (one analyst, one material, one method, one instrument, short time period) by a factor of about two [3], i.e., $u_R/u_r \approx 2$. Consequently it is worth enquiring whether, for sampling, using the same protocol and test material, reproducibility precisions are greater than repeatability precisions to any important degree. Such a finding would have considerable implications for the correct estimation of sampling uncertainty.
The established method of considering reproducibility precisions in analysis is the collaborative trial (method performance study) [4]. Applied to sampling, the collaborative trial would require a number of samplers each to take independent duplicate samples from a target, at random, using a fixed sampling protocol [5]. If all of the samples were then analysed in duplicate, together under randomised repeatability conditions, then hierarchical anova could be used to decide whether either within-sampler or between-sampler precision had reached statistically significant levels. If that were so, the sampling precision under repeatability and reproducibility conditions could be estimated from the mean squares.
Very little experimentation has been conducted along those lines so far. The findings are suggestive, but insufficient work has been done for general conclusions. Studies on sampling contaminated land [5] showed contrasting results for different analytes. At the sites investigated the major contaminant (lead) was spatially distributed in a very heterogeneous manner. As a result, the within-sampler precision was so large (RSD $\approx$ 30%) that no significant between-sampler variation could be detected for this element. In contrast, for elements present at near the background levels (i.e., present because of natural processes rather than contamination and therefore not wildly heterogeneous), significant levels of between-sampler precision were found. Values of the ratio $u_R/u_r$ found for sampling were $\approx 2$, close to the value found in analytical collaborative trials. This finding suggests that in some instances reproducibility precisions might be most appropriate for estimating sampling uncertainty. At the other extreme, if the target comprises material that is nearly homogeneous in the analyte, it may be impossible to find significant between-sampler or within-sampler variation, because they are both small relative to the analytical variation. This situation has been found in a collaborative trial in sampling wheat (unpublished data).
The lack of information in this area is hardly surprising. It is sometimes an unpopular activity, often technically difficult, and always expensive to organise a collaborative exercise in sampling. The samplers all have to travel to the sampling target(s) and, as they must work independently, they must visit the target in succession. This might delay the shipment of a large amount of a valuable commodity. Often a commodity (especially a packaged material) would be spoilt to an unacceptable degree by multiple sampling. However, there is no doubt that an optimised sampling/analytical system can maximise profits for a manufacturer (see below), and there are industrial instances known to the author where proper attention to sampling errors has resulted in a substantial net gain.
Uncertainty and fitness for purpose
Given that errors are introduced into measurements by both sampling and analysis, we need to consider two questions relating to fitness for purpose, namely: (a) given limited resources, how can we divide them optimally between expenditure on sampling and on analysis; and (b) how can we decide whether a combined uncertainty is adequate for the end-user needs (i.e., is the end-user able to make valid decisions given the uncertainty of the measurements)?
A simple answer to the former question is to consider the relative contributions of sampling and analysis to the combined uncertainty. If sampling uncertainty is the dominant term there is no point in utilising an expensive highly accurate analytical method, because the combined uncertainty will not be usefully improved. For example if $u_a < 0.2u_s$, then $u_t < \sqrt{u_s^2 + (0.2u_s)^2} = 1.02u_s$, so that $u_t \approx u_s$ regardless of how small $u_a$ becomes. At the other 'extreme', if $u_a > 0.5u_s$, then $u_t$ is dilated to a level substantially greater than $u_s$. Therefore $u_a$ should best fall within the approximate range $0.2u_s$ to $0.5u_s$. The same argument, applied to a dominant analytical uncertainty, produces the corresponding result. These considerations, although informative in themselves, pay no attention to the relative costs of sampling and analysis as functions of the precision obtained.
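The argument about the relative sizes of the sampling and analytical uncertainties can be checked numerically. The sketch below assumes only the root-sum-of-squares combination of the two standard uncertainties used in the text; the function name and the chosen ratios are illustrative.

```python
import math


def combined_uncertainty(u_s, u_a):
    """Combine sampling and analytical standard uncertainties in quadrature."""
    return math.sqrt(u_s ** 2 + u_a ** 2)


if __name__ == "__main__":
    u_s = 1.0  # sampling uncertainty taken as the unit
    for ratio in (0.1, 0.2, 0.5, 1.0):
        u_t = combined_uncertainty(u_s, ratio * u_s)
        # Inflation of the combined uncertainty relative to u_s alone
        print(f"u_a = {ratio:.1f} u_s  ->  u_t = {u_t:.3f} u_s")
```

At a ratio of 0.2 the combined uncertainty exceeds the sampling uncertainty by about 2%, and at 0.5 by about 12%, which is the basis for the suggested working range.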
162 M. Thompson
To obtain a more realistic picture, including an elementary consideration of costs, we need to examine the apportionment of a fixed financial resource between sampling and analysis [6]. We first consider the cost A of procuring a sample with unit sampling uncertainty. We can achieve an uncertainty of 1/2 by collecting and thoroughly mixing four independent samples, collected using the same protocol, at a cost of 4A. The cost $L_s$ of achieving any uncertainty will therefore generally be
$L_s = A/u_s^2$.
The same type of consideration would apply to analysis. We could assume with reasonable confidence that the cost of analysis $L_a$ would be given by
$L_a = B/u_a^2$.
Both costs escalate steeply with requirements for decreasing uncertainty. To apportion the costs we need to minimise the total cost $L_t = L_s + L_a$ of the measurement operation for a particular combined uncertainty $u_t^2 = u_s^2 + u_a^2$. It can be shown [6] that the minimum is defined by
on the measurement). For the sake of a simple example, we take a linear function such as
$L_e = Q + R u_t$
where Q and R are constant costs. We are now in a position to define fitness for purpose in an operational manner: it is the combined uncertainty that minimises the total cost L, which is given by
$dL/du_t = 0$
where
$L = L_e + L_t = Q + R u_t + D/u_t^2$.
Concrete applications of this idea have yet to be published, and it is likely that the necessary information would be difficult to obtain in many practical circumstances. However, it provides a useful conceptual framework for defining the relationship between uncertainties of sampling and analysis and fitness for purpose.
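The cost argument can be explored with a small calculation. The sketch below assumes the cost model given in the text, $L_s = A/u_s^2$ and $L_a = B/u_a^2$, with $u_t^2 = u_s^2 + u_a^2$. Minimising $L_s + L_a$ at fixed $u_t$ with a Lagrange multiplier gives $u_s/u_a = (A/B)^{1/4}$ and a minimum cost of $(\sqrt{A} + \sqrt{B})^2/u_t^2$. Reading the constant D as $(\sqrt{A} + \sqrt{B})^2$ is an inference from that result and not something the text states; the function names and numbers are hypothetical.

```python
import math


def optimal_split(A, B, u_t):
    """Split a target combined uncertainty u_t between sampling and analysis
    so that A/u_s**2 + B/u_a**2 is a minimum. Returns (u_s, u_a, cost)."""
    ra, rb = math.sqrt(A), math.sqrt(B)
    var_s = u_t ** 2 * ra / (ra + rb)
    var_a = u_t ** 2 * rb / (ra + rb)
    cost = (ra + rb) ** 2 / u_t ** 2
    return math.sqrt(var_s), math.sqrt(var_a), cost


def fit_for_purpose_uncertainty(A, B, R):
    """Combined uncertainty minimising Q + R*u_t + D/u_t**2,
    with D assumed equal to (sqrt(A) + sqrt(B))**2.
    Setting the derivative R - 2*D/u_t**3 to zero gives the result."""
    D = (math.sqrt(A) + math.sqrt(B)) ** 2
    return (2.0 * D / R) ** (1.0 / 3.0)


if __name__ == "__main__":
    u_s, u_a, cost = optimal_split(A=16.0, B=1.0, u_t=1.0)
    print(u_s / u_a, cost)                      # ratio equals (A/B)**0.25
    print(fit_for_purpose_uncertainty(16.0, 1.0, R=50.0))
```

The constant Q does not affect the position of the minimum, which is why it does not appear in the second function.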
Conclusions
References
1. Thompson M, Ramsey MH (1995) Analyst 120:261-270
2. Gy PM (1992) Sampling of heterogeneous and dynamic materials. Elsevier, Amsterdam
3. Boyer KW, Horwitz W, Albert R (1985) Anal Chem 57:454-459
4. Horwitz W (1995) Pure Appl Chem 67:331-343
5. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120:2309-2312
6. Thompson M, Fearn T (1996) Analyst 121:275-278
Accred Qual Assur (2002) 7:274-280
DOI 10.1007/s00769-002-0489-4
© Springer-Verlag 2002
Received: 28 December 2001
Accepted: 25 April 2002
Presented at EUROLAB/EURACHEM Workshop "Sampling", 5-6 November 2001, Lisbon, Portugal
Abstract Appropriate sampling, that includes the estimation of measurement uncertainty, is proposed in preference to representative sampling without estimation of overall measurement quality. To fulfil this purpose the uncertainty estimate must include contribution from all sources, including the primary sampling, sample preparation and chemical analysis. It must also include contributions from systematic errors, such as sampling bias, rather than from random errors alone. Case studies are used to illustrate the feasibility of this approach and to show its advantages for improved reliability of interpretation of the measurements. Measurements with a high level of uncertainty (e.g. 50%) can be shown to be fit for some specified purposes using this approach. Once reliable estimates of the uncertainty are available, then a probabilistic interpretation of results can be made. This allows financial aspects to be considered in deciding upon what constitutes an acceptable level of uncertainty. In many practical situations "representative" sampling is never fully achieved. This approach recognises this and instead, provides reliable estimates of the uncertainty around the concentration values that imperfect appropriate sampling causes.
Keywords Representative sampling · Uncertainty of measurement
M.H. Ramsey (✉)
Centre for Environmental Research, School of Chemistry, Physics and Environmental Science, University of Sussex, Falmer, Brighton BN1 9QJ, UK
e-mail: m.h.ramsey@sussex.ac.uk
Tel.: +44-1273-678085
Fax: +44-1273-677196
value lies is stated together with the measured value of concentration.
There is one crucial concept that must be accepted in order to make this approach applicable: the action of primary sampling must be considered as the first step in the making of a measurement of concentration. In this way the uncertainty of the measurement includes the contribution from all of the sources [3]. These include the primary sampling, the physical preparation, the laboratory sub-sampling, the chemical preparation and the chemical analysis of the sample. The word "measurement" refers to the final estimate of analyte concentration, but the phrase "measurement process" in this paper is used to denote all of these processes collectively. Estimates of measurement uncertainty that omit some of these sources, particularly the sampling, will inevitably be too small and therefore unrealistic.
A second important step is to include systematic errors (unknown or uncorrected) in the estimates of uncertainty. If the uncertainty interval is to include the true value, then its calculation cannot be restricted to just the random errors that are unrelated to the true value. In primary sampling, the random errors are traditionally well characterised and used, for example, to judge the amount of sample that is required to achieve a specified sampling precision. It is however the systematic errors (e.g. sampling bias) that can often cause unsuspected errors, and therefore both types of error need to be estimated.
If the actual uncertainty is estimated for every measurement, rather than relying on an assumption of correctness or representivity of the sampling, then the reliability of measurements (and the sampling component) will improve. Moreover, once the assumption of perfectly representative samples is set aside, it is possible to decide how close to the true value the measurements are required to be, for any particular application. There are cases where relatively high levels of uncertainty (e.g. 80%) can be shown to be appropriate for some purposes (i.e. the measurements are "fit-for-purpose") [3]. The practical limitation on the number and quality of measurements made is frequently financial, and this "appropriate sampling" approach also allows an optimal balance to be made between the quality of the measurements (from sampling and analysis), and the cost of both the measurements and the consequences of undetected measurement errors [4].
The traditional approach to quality sampling is usually linked to an equally traditional but separate approach to quality in chemical analysis. In traditional analytical quality control (AQC), various AQC materials are analysed in the same batch as the samples. If the measurements made on these materials fall within predetermined limits, then the measurements made on the samples are reported to a customer as single concentration values (e.g. in µg of analyte per g of sample). The customer who interprets these measurements often assumes that they are "correct" and does not use, or have access to, any of the information gained from the AQC materials. In the alternative "appropriate" approach, the laboratory reports an uncertainty value with each concentration value. This estimate of uncertainty is not just that arising from the chemical analysis (currently being reported in some labs), but also includes components from the primary sampling and physical preparation. This uncertainty value is estimated using information derived in part from the AQC materials, and gives the customer access to this information in a form that is useful.
This paper aims to give an overview of the research that underpins this approach to "appropriate sampling", with references given to sources of more detailed information. It will cover definitions of uncertainty, methods to estimate uncertainty that include all sources, acceptable levels of uncertainty (fitness-for-purpose), implications of uncertainty for interpretation of measurements, and conclusions, with some examples from case studies to help clarify the explanations.
Definitions of uncertainty
The formal definition of uncertainty of measurement is "A parameter associated with the result of a measurement, that characterises the dispersion of the values that could reasonably be attributed to the measurand" [5]. The meaning of this definition is not entirely clear, and depends heavily on the definition of the word "measurand", which is formally "the particular quantity subject to measurement" [6]. The previously quoted informal definition of uncertainty as "the interval around the result of the measurement that contains the true value with high probability", clearly interprets measurand as being the "true value" of the analyte concentration, and not just as the "analyte concentration" as has been implied by some sources. This interpretation has the important implication that uncorrected systematic errors should be included within the estimates of uncertainty. It differentiates uncertainty from the traditional "error bars" often quoted for analytical measurements, which are invariably based upon random errors alone, with no reference to the "true value" of the analyte concentration.
The estimation of uncertainty for analytical measurements has now been widely advocated, at least in Europe [7]. However, these estimates specifically exclude the contribution to uncertainty arising from the primary sampling, and often from the physical preparation of the sample (e.g. drying, grinding, splitting). Several studies have shown that these are often the largest sources of uncertainty in measurements [3, 8-10].
Appropriate rather than representative sampling, based on acceptable levels of uncertainty 165
Methods to estimate uncertainty that include all sources
Methods for the estimation of uncertainty from analytical methods are well developed. They often use "bottom up" methods, which sum all of the individual components in the uncertainty budget [7], but they can also use information from "top down" approaches that use estimates of the total uncertainty, from an inter-laboratory trial for example, without necessarily subdividing it. The step of primary sampling is traditionally excluded from these estimates of measurement uncertainty, but recent attempts have been described to apply bottom up methods to this aspect [10].
New methods have been devised to estimate the uncertainty of measurements, which is caused by all of the sources, including procedures used for primary sampling. Uncertainty of measurement ($u_c$) can be estimated by summing contributions from the four types of error in the methods of measurement. These include two random components (sampling precision and analytical precision) and two systematic components (sampling bias and analytical bias). The expanded uncertainty (U) is estimated in this case by the use of a coverage factor of two, to give approximately 95% confidence for a normal distribution.
Well established methods are available to estimate three of these four components (Table 1). Analytical precision can be estimated most cost-effectively using duplicate chemical analyses. Sampling precision can be estimated similarly by taking duplicated samples at points separated in space (or time) by a distance reflecting the possible ambiguity in the sampling protocol [11]. Analytical bias can be estimated using certified reference materials that have a chemical composition that is well matched to the samples.
The estimation of sampling bias is potentially much more problematic, but two methods have been described. One method requires the use of a Reference Sampling Target (RST), which is the sampling equivalent of a reference material for the estimation of bias. The RST can either be created synthetically to have a known concentration of analyte [12], or it can be a routine sampling target selected for the purpose [13]. The accepted or certified value of concentration (and its uncertainty) can either be taken from the known concentration of analyte added, for the synthetic RST [12], or established by the consensus of an inter-organisational sampling trial (IOST) [13]. The accepted value can also include a specification of the spatial distribution of the analyte, and its uncertainty [14].
The second method of estimating the contribution of sampling bias to the uncertainty of the measurement is to apply more than one sampling protocol to a sampling target, ideally with more than one sampler (i.e. the person who takes the sample). One extreme example of this approach is the inter-organisational sampling trial in which eight or more samplers take samples from the same sampling target for the same specified purpose. If all participants use the same protocol it constitutes a Collaborative Trial in Sampling (CTS) [15], but if they all select their own protocols, based on their professional judgement, it constitutes a Sampling Proficiency Test (SPT) [13]. The variability of the estimates of analyte concentration between the participants can then be used to estimate the uncertainty of the measurement procedure as a whole, as applied to a particular site. Any bias caused by the sampling of any participant then becomes part of the random error across the whole sampling trial, and hence is automatically included in the uncertainty [3].
There are therefore four methods that can be identified for the estimation of the overall uncertainty of measurement (Table 2). None of these methods use the traditional "bottom-up" approach of adding all of the separate components of uncertainty together [5]. They rely on a fundamentally "top-down" approach that aims to get the most reliable estimate of the uncertainty overall, without necessarily identifying the contributions from all of the possible sources [16, 7].
These four methods estimate the uncertainty with increasing rigour, but at increasing cost. Method 1 is the least expensive, but it does not include an estimate of sampling bias, although this can be added by the independent use of an RST. Separation of the main sources of uncertainty requires the use of analysis of variance, usually of the robust type, to allow for non-normal frequency distributions. Detailed description of these methods is given elsewhere [3]. These methods have the advantage of estimating the actual uncertainty for a particular investigation, and should in that way be more realistic than estimates made by bottom-up methods. These latter methods will need to use generalised values for the component variances, and cannot easily reflect the special contribution to variance at each site, such as those made by the local levels of analyte heterogeneity. Top-down methods are therefore particularly appropriate for estimation of uncertainty from sampling, especially for variable matrices such as those found in environmental materials. Further research will be needed to investigate the relative merits of top-down and bottom-up methods for this purpose.
Table 1 The four types of errors in methods that contribute to the uncertainty of measurements, and examples of how they might be estimated
Process     Random error (precision), estimate using:    Systematic error (bias), estimate using:
Analysis    Duplicate analyses                           Certified reference materials
Sampling    Duplicate samples                            RST, IOST
RST, reference sampling target; IOST, inter-organisational sampling trial
Table 2 Four methods for estimating uncertainty in measurements (including that from sampling)
CTS, collaborative trial in sampling; SPT, sampling proficiency test; CRM, certified reference material
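The duplicate approach (duplicate samples, each analysed in duplicate) can be illustrated with a short calculation. The sketch below is a minimal Python example that uses classical nested analysis of variance as a stand-in for the robust analysis of variance the paper recommends, so it is more sensitive to outlying values than the cited method. The data layout, function name and numbers are hypothetical, and the expanded uncertainty uses the coverage factor of two mentioned in the text.

```python
from math import sqrt
from statistics import mean


def nested_variance_components(data):
    """Classical balanced nested anova.

    data[target][sample][analysis] holds the measured concentrations.
    Returns between-target, sampling and analytical variances, plus the
    measurement standard deviation and expanded uncertainty (k = 2).
    """
    t, s, a = len(data), len(data[0]), len(data[0][0])
    if any(len(tg) != s or any(len(sm) != a for sm in tg) for tg in data):
        raise ValueError("design must be balanced")

    sample_means = [[mean(sm) for sm in tg] for tg in data]
    target_means = [mean(row) for row in sample_means]
    grand_mean = mean(target_means)

    # Sums of squares at the three levels of the hierarchy
    ss_anal = sum((x - sample_means[i][j]) ** 2
                  for i, tg in enumerate(data)
                  for j, sm in enumerate(tg)
                  for x in sm)
    ss_samp = a * sum((sample_means[i][j] - target_means[i]) ** 2
                      for i in range(t) for j in range(s))
    ss_targ = s * a * sum((tm - grand_mean) ** 2 for tm in target_means)

    ms_anal = ss_anal / (t * s * (a - 1))
    ms_samp = ss_samp / (t * (s - 1))
    ms_targ = ss_targ / (t - 1)

    var_anal = ms_anal
    var_samp = max((ms_samp - ms_anal) / a, 0.0)        # negative set to zero
    var_targ = max((ms_targ - ms_samp) / (s * a), 0.0)  # negative set to zero

    s_meas = sqrt(var_samp + var_anal)  # sampling plus analysis
    return {
        "var_between_target": var_targ,
        "var_sampling": var_samp,
        "var_analytical": var_anal,
        "s_meas": s_meas,
        "expanded_U": 2.0 * s_meas,
    }


if __name__ == "__main__":
    # Hypothetical results: 3 targets x 2 samples x 2 analyses
    example = [
        [[10.0, 12.0], [14.0, 16.0]],
        [[20.0, 22.0], [24.0, 26.0]],
        [[30.0, 32.0], [34.0, 36.0]],
    ]
    print(nested_variance_components(example))
```

Because no reference sampling target or reference material enters the calculation, the result covers the two random components only and says nothing about sampling or analytical bias.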
Applications of uncertainty estimation that includes contribution from sampling

Estimation of uncertainty in the measurement of lead in top soils has been reported using duplicate samples [3] (i.e. Method 1 in Table 2). The 1.8 hectare site in Derbyshire, UK, was contaminated by a lead smelter operating in the 14th-16th centuries, and demonstrates the general principle.

A regular grid of 40 sampling points at 20 m spacing was applied initially with single samples. This grid was repeated using 5-fold composite samples (i.e. 5 increments taken within 1 m² around each sampling point) in order to investigate how composite sampling would affect the measurement uncertainty. The duplicate samples were taken at around 20% of the sampling points (Table 3), at a distance of 2 m away from the original sample point. This distance represents the spatial uncertainty caused by the method of surveying employed (i.e. measuring tape) on this undulating site. Duplicated analytical measurements were then taken on both of the sample duplicates in a balanced design, and the three components of the variance separated using robust analysis of variance (ANOVA) [17], according to the model:

$s^2_{total} = s^2_{geochem} + s^2_{samp} + s^2_{anal}$ (1)

The three components of the total variance are the analytical variance ($s^2_{anal}$), the sampling variance ($s^2_{samp}$) and the geochemical variance ($s^2_{geochem}$). The measurement variance can be considered as the sum of the sampling and analytical variance:

$s^2_{meas} = s^2_{samp} + s^2_{anal}$ (2)

The expanded uncertainty (U) can be estimated as $2s_{meas}$, using a coverage factor of 2 for 95% confidence.

The use of 5-fold composite samples reduces the overall expanded measurement uncertainty (U) by a factor of two (3742 to 1881 µg g⁻¹). This is not significantly different from the reduction of 2.2 (i.e. √5), predicted by the theory that sample mass is inversely proportional to sample variance [1]. In this case the 5-fold increase in the sample mass would be predicted to reduce the variance by a factor of 5, and hence the uncertainty by the square root of 5.

Acceptable levels of uncertainty (fitness-for-purpose)

Once estimates of the uncertainty of measurements are known, it is possible to judge whether that level of uncertainty is acceptable for a particular stated purpose. This can be used to judge whether measurements (rather than all of the measurement procedures themselves) are "fit-for-purpose" (FFP), where fitness for purpose is defined as "The property of data produced by a measurement process that enables a user of the data to make technically correct decisions for a stated purpose" [18].

Three basic types of FFP criteria have been suggested, in somewhat different contexts. The first, and most widely accepted, criterion is based on the relative precision of the measurement method, usually specified within an Analytical Quality Control (AQC) scheme. A typical criterion is that the relative analytical precision should be better than 10% (at 95% confidence). AQC is normally used to check the measurement process and to check that this process step is in statistical control and comparable with the performance pertaining at the time of validation. This target performance is often set, however, so as to enable users of the data to make technically correct decisions. It can therefore be considered as a crude type of FFP criterion. The main problem with this approach is that this criterion is set by the laboratory, often without reference to the specific purpose for which the customer will use the data. It could be, for example, that a precision of 30% would be quite good enough for some of the user's purposes.

The second FFP criterion that has been suggested is that the uncertainty of the measurement (including that from the sampling) should not contribute more than 20% of the total variance for the analyte across all of the samples in a particular survey [11]. The relative contributions to the total variance can be usefully represented using pie charts (Fig. 1). In the case study in Derbyshire, it
Table 3 Estimates of uncertainty made at a site in Derbyshire using Method 1 (Table 2). The use of composite samples reduces the expanded measurement uncertainty (U) by nearly a factor of 2

Sampling design               Points      Mean Pb (x),  s_total,   s_meas,    s²_meas/s²_total  U = 2 s_meas  U = 200 s_meas/x
                              duplicated  robust        robust     robust     (%)               (µg g⁻¹)      (%)
                                          (µg g⁻¹)      (µg g⁻¹)   (µg g⁻¹)
Regular grid, single sample   7           7516          8185       1871       5.2               3742          49.8
Regular grid, comp. samples   9           6093          5600       940.5      2.7               1881          30.9

Despite the high level of relative uncertainty using single samples (U% = 50%), the variance caused by the measurements only contributes 5.2% to the overall total variance of Pb measurements across the site (s²_meas/s²_total %). This is well within the second FFP criterion of 20% (Fig. 1)
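The derived columns of Table 3 follow from the robust standard deviations by simple arithmetic. The sketch below recomputes them under that reading; the function and variable names are illustrative assumptions rather than anything from the original study, and the robust ANOVA that produced s_total and s_meas is not reproduced.

```python
import math

def table3_metrics(mean_pb, s_total, s_meas, coverage=2.0):
    """Derive the uncertainty columns of Table 3 from the robust estimates.

    mean_pb, s_total and s_meas are in µg/g; coverage=2 corresponds to
    roughly 95% confidence, as stated in the text.
    """
    u_expanded = coverage * s_meas                     # U = 2 s_meas
    u_relative = 100.0 * coverage * s_meas / mean_pb   # U% = 200 s_meas / x
    variance_share = 100.0 * s_meas**2 / s_total**2    # s2_meas / s2_total (%)
    return u_expanded, u_relative, variance_share

single = table3_metrics(7516, 8185, 1871)
composite = table3_metrics(6093, 5600, 940.5)

# Second FFP criterion: measurement variance below 20% of the total variance.
ffp_single = single[2] < 20.0

# Observed reduction in U from 5-fold compositing versus the sqrt(5) from theory.
observed_reduction = single[0] / composite[0]
predicted_reduction = math.sqrt(5)
```

Recomputing from the rounded table entries gives a variance share of about 2.8% for the composite design, close to the tabulated 2.7%; the small difference is presumably due to rounding in the published values.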
Appropriate rather than representative sampling, based on acceptable levels of uncertainty 167
Fig. 1a, b Contributions towards total variance from the sampling, the analysis and their combination, the measurement. The contribution from the sampling is reduced by the use of composite samples b compared with single samples a. This reduction is however not required, as the contribution to the measurement variance using single samples a is already less than the fitness-for-purpose criterion (FFP) of 20% of total variance. (Pie charts with segments Geochemical, Sampling and Analytical; geochemical share 94.76% in a and 97.24% in b; FFP limit marked at s_meas < 20%)

Fig. 2 Optimisation of measurement uncertainty against cost, demonstrated for a metal contaminated site in West London. It shows the economic loss function with a clear minimum cost at the optimal uncertainty (estimated by s_meas), which is 30× lower than the expectation of loss at the actual uncertainty of the measurements. (Horizontal axis: uncertainty, estimated as s_meas, µg g⁻¹)

was possible to optimize the sampling protocol by use of this criterion. The use of 5-fold composite samples in place of single non-composite samples was shown to reduce the variance contributed by the sampling protocol by a factor of approximately ..n. However, this improvement was shown to be unnecessary as the variance contributed by the measurements overall (5.2%) was already well below the limit of 20% of the total variance, even when using the non-composite samples [3]. This judgement that the measurements were fit-for-purpose was reached despite the fact that the value of the measurement uncertainty was 50% (relative to concentration value, at 95% confidence), using the protocol with single samples. This level of uncertainty is much higher than is often considered acceptable. These samples are not "representative" in the usually accepted sense of the word, but they have been shown to be "appropriate", in providing measurements that are fit-for-purpose.

The third FFP criterion that has been suggested incorporates a balance between uncertainty and financial loss [4]. The optimal value of uncertainty is identified as that which incurs the minimum value for the expectation of loss. This loss includes both the cost of making the measurement and also the loss that may arise due to incorrect decisions made on the basis of the uncertain measurements. In an initial application of these ideas to a site investigation in West London [19], the actual measurement uncertainty of 110 µg g⁻¹ resulted in an expectation of financial loss of around £9000 per sampling point (Fig. 2). When this uncertainty was optimised to a value of 55 µg g⁻¹ the expectation of loss was reduced by a factor of thirty to around £300. A second stage of the optimisation allocates expenditure in an optimal way between the sampling and the chemical analysis, based on their respective contributions to the overall uncertainty. A recent application of this approach to the sampling of food has shown that a substantial reallocation of expenditure to the sampling process (+300%) gave a substantial reduction in the overall uncertainty (-31%) and a consequent saving of £428 per batch [20].

Implications of uncertainty for interpretation of measurements

One of the main advantages of reporting realistic estimates of uncertainty together with measurements of concentration, is that end-users of the analyses can consider
Fig. 3a, b Comparison between a deterministic and b probabilistic classification of contaminated land, to show the effect of using estimates of uncertainty to improve the interpretation of measurements (derived from general case described previously [7]). (Plotted quantity: concentration, c)

the implications of the uncertainty [7]. One example that illustrates this is in the classification of contaminated land [3]. The traditional deterministic approach is to compare the measured concentration values with an appropriate regulatory threshold value (Fig. 3a). Any sampling point that has a reported concentration value below the threshold is classified as uncontaminated, and those above as contaminated. This approach ignores the presence of uncertainty, but once it is known a probabilistic approach can be taken (Fig. 3b). If the measured concentration is below the threshold but the uncertainty interval extends above it, the point is classified as "possibly contaminated" rather than "uncontaminated". Similarly if the measured concentration is above the threshold but the uncertainty interval extends below it, the point is classified as "probably contaminated" rather than "contaminated".

In one application of this probabilistic approach to a disused landfill site in West London, the effect of the uncertainty was to totally change the interpretation of the extent of the lead contamination at the site [3]. From the 100 samples taken from the site in a regular grid pattern, the deterministic interpretation showed only eight samples, scattered across the site, to be over the appropriate threshold value (500 µg Pb g⁻¹ soil). However, the uncertainty of the measurements at the site was estimated as 83.6%, using Method 1 (Table 2). This large value is due primarily to the high degree of small-scale heterogeneity of lead distribution at the site. After taking this into account 91% of the sampling points were classified as either

bilistic approach.

More sophisticated interpretations of contaminated land use a risk assessment approach, but this can also be made more reliable by allowing for the uncertainty in the

Appropriate sampling is potentially much more reliable and more cost-effective than representative sampling. The assumption that a "correct" protocol will give representative samples does not give either the rationale for estimation of sampling bias, or the flexibility to vary the quality of the sampling depending on the proposed objective. An "appropriate" protocol can be selected to give an acceptable level of uncertainty that can allow for consideration of financial constraints (e.g. the potential financial consequences of errors, or logistical constraints such as local conditions, time limitations). All "appropriate" protocols do require the estimation of uncertainty to be incorporated into their design. The simplest method, using duplicate samples at a small proportion of sampling points, is adequate for most purposes. More elaborate methods to estimate uncertainty will only be required where the consequences of unsuspected uncertainty are large. These estimates of uncertainty are used initially to judge the fitness-for-purpose of the measurements. They are also very useful, however, for improving the reliability of the interpretation of the measurements (e.g. in risk assessment or hazard classification). These techniques also provide ways to assess the performance of sampling protocols (using collaborative trials in sampling) and to assess and improve performance of samplers (using sampling proficiency tests). Sampling is never perfect; it is better, therefore, to measure the uncertainty that an "appropriate" sampling protocol generates, rather than to assume the perfect application of a "correct" protocol.
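The probabilistic classification of contaminated land described in this article can be sketched as follows. The sketch assumes a symmetric interval whose half-width is the expanded uncertainty expressed as a percentage of the measured value, as with the 83.6% quoted for the West London site. The function name, the example concentrations and the treatment of a value exactly at the threshold are assumptions for illustration only.

```python
def classify(concentration, threshold, u_rel_percent):
    """Classify one sampling point, allowing for expanded relative uncertainty.

    The interval is concentration +/- (u_rel_percent/100) * concentration.
    A value exactly equal to the threshold is treated as not above it
    (an assumption; the text does not cover this case).
    """
    half_width = concentration * u_rel_percent / 100.0
    lower = concentration - half_width
    upper = concentration + half_width
    if concentration > threshold:
        # Above the threshold: certain only if the whole interval is above it.
        return "contaminated" if lower > threshold else "probably contaminated"
    # At or below the threshold: certain only if the whole interval is below it.
    return "uncontaminated" if upper < threshold else "possibly contaminated"

# Hypothetical lead concentrations (µg/g) against the 500 µg/g threshold.
examples = {c: classify(c, 500.0, 83.6) for c in (200.0, 300.0, 600.0, 4000.0)}
```

With a relative uncertainty this large, only points far from the threshold keep their deterministic label, which is consistent with the reported shift of most sampling points into the "possibly" or "probably" classes.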
References
1. Gy P (1979) Sampling of particulate materials - theory and practice. Elsevier, Amsterdam
2. Thompson M (1995) Analyst 120:117N
3. Ramsey MH, Argyraki A (1997) Science of the Total Environment 198:243
4. Thompson M, Fearn T (1996) Analyst 121:275
5. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
6. ISO (1995) VIM: 1995 Vocabulary of metrology. Part 1. Basic and general terms (international). International Organisation for Standardisation (Geneva) [and published by the British Standards Institution (London) as PD6461:Part 1:1995, 59 pp]
7. CITAC (2000) Quantifying uncertainty in analytical measurement. Eurachem
8. Rios A, Valcarcel M (1998) Accred Qual Assur 3:14
9. BCR Report (1998), EUR 18405 EN Metrology in chemistry and biology: a practical approach. Office for Official Publications of the European Communities, Luxembourg
10. de Zorzi P, Belli M, Barbizzi S, Menegon S, Deliusa A (2002) Accred Qual Assur this issue
11. Ramsey MH, Thompson M, Hale M (1992) Journal of Geochemical Exploration 44:23
12. Ramsey MH, Squire S, Gardner MJ (1999) Analyst 124:1701
13. Argyraki A, Ramsey MH, Thompson M (1995) Analyst 120:2799
14. Squire S, Ramsey MH, Gardner MJ, Lister D (2000) Analyst 125:2026
15. Ramsey MH, Argyraki A, Thompson M (1995) Analyst 120:2309
16. Analytical Methods Committee (1995) Analyst 120:2303
17. Analytical Methods Committee (1989) Analyst 114:1699
18. Thompson M, Ramsey MH (1995) Analyst 120:261
19. Hulls J (1998) Optimising sampling and analytical strategies for assessing contaminated land. MSc thesis, Imperial College, London
20. Ramsey MH, Lyn JA, Wood R (2001) Analyst 126:1777
Accred Qual Assur (2001) 6:368-371
Fig. 1 Mean weight loss vs drying temperature. Weight loss with temperature for two samples (● and ○ respectively)

or at most two alternate levels corresponding to the expected (or permitted) variation in the parameter. The

For the determination of moisture, two different feeds were analysed in triplicate using the oven-loss method specified in appropriate United Kingdom legislation [7]. A drying time of 3 h was used for all test portions. For the study of temperature dependence, three 5 g portions of each sample were dried separately at each temperature (that is, in separate drying runs), with the oven held within ±1°C of the target temperature for the whole of the heating period. The weight loss for each 5°C increment in temperature from 85°C to 115°C was determined for each sample.

In the grinding/milling experiments, a legislative pre-treatment (grinding) method [7] was used. The method is intended to reduce materials to <1 mm particle size in three or fewer grinding cycles, which allows variation upward from an experimentally determined minimum cycle time. Approximately 1200 g of one feed was thoroughly mixed in a tumbler mixer overnight and divided into five equal parts, which were ground using cycle times of 8, 10, 12, 15 and 20 s, respectively. Ground samples were analysed for dimetridazole by extraction with dichloromethane and clean-up using a Sep-Pak silica cartridge (Waters Corporation), followed by reverse phase high performance liquid chromatography (HPLC) with UV detection. The solvent employed was acetonitrile with ammonium acetate buffer.

Results and discussion

Case study 1 - Uncertainties from oven temperatures in moisture determination

Moisture content is usually determined using a calculation of the form:

$l = \frac{m_w - m_d}{m_w}$

where l is the fractional loss in mass (usually expressed as a percentage), $m_w$ the mass before drying and $m_d$ the mass after drying.

The variability of the method is controlled by restricting the drying temperature to a narrow range, typically ±1°C. This figure represents an uncertainty in temperature. Results obtained by methods of this type are accordingly subject to an uncertainty component due to the allowable variation in temperature. In this study, a series of experiments was carried out on two samples of pelleted animal feed. The drying temperature was varied over a range around the target temperature of 100°C and the effect on the weight loss noted. The range chosen, 85°C to 115°C, is substantially larger than the permissible range, to permit both an investigation of the linearity or otherwise of the effect and a sound sensitivity analysis.

The results, which are typical for moisture determination, are shown in Fig. 1. Both curves are approximately linear in the range 90-105°C, but depart from linearity at the extremes. At 85°C, the mean weight loss is lower than might be expected from the trend at the higher temperatures. At higher extremes, factors such as progressive oxidation or thermal degradation lead to different directions of departure from linearity.

To estimate the uncertainty associated with weight loss using the Guide to the Expression of Uncertainty in Measurement (GUM) principles [1], the gradient dl/dT at the nominal temperature (100°C) is multiplied by the temperature uncertainty u(T). Here, linear regression applied to the linear temperature range in each case produces a gradient dl/dT = 0.030 %m/m °C⁻¹ for sample 1 and dl/dT = 0.019 %m/m °C⁻¹ for sample 2. The temperature uncertainty u(T) is 0.577°C, estimated from the permitted variation of ±1°C taken as the limits of a rectangular distribution. Calibration uncertainties in the thermometer used are under 0.1°C, so can be neglected by comparison. This gives u(l) = 0.577 × 0.030 = 0.017 %m/m for sample 1 and u(l) = 0.577 × 0.019 = 0.010 %m/m for sample 2. Comparing this with a repeatability estimate of 0.05 %m/m obtained from the replicate data by analysis of variance (ANOVA), it is clear that the uncertainty estimates associated with temperature are just on the margin of practical significance compared to the repeatability estimate (assuming an uncertainty is practically significant if greater than a fifth of the largest component - here, the precision is the largest component so far found). Under these circumstances, therefore, there is
172 J.R. Cowles
Fig. 2 Effect of grinding time on analyte concentration. Duplicate analyses of samples ground for different grinding cycle times (the two plotting symbols are replicates 1 and 2, respectively). (Horizontal axis: grinding cycle time, s)

context of normal use, of course, these are both small uncertainties; for most practical purposes, therefore, the temperature-related uncertainty can be considered sufficiently small. Further, note that the study is carried out under the most precise conditions available; the uncertainties found will almost certainly prove negligible compared, for example, to between-run variation.

Returning to the principal aim of the study, it is useful to consider whether a typical ruggedness test, operating at one or both extremes of the permitted range, would have given comparable results in the present case. Given the largest calculated sensitivity of 0.030 %m/m °C⁻¹, the expected variations in l across a 1 or 2°C range are 0.030 %m/m and 0.060 %m/m, respectively. Clearly, neither would reliably lead to statistically significant effects with the present repeatability precision unless a prohibitively large number of replicates were undertaken in the ruggedness test; uncertainty estimates would accordingly be extremely variable. For small to modest effects, then, uncertainty estimation from sensitivity experiments requires substantially wider variation than the 'expected range'. Further, if the 'expected range' is based on control limits intended to render an effect insignificant - as in most standard methods - it is generally to be expected that the change in influence quantity will not provide useful uncertainty estimates.

Case study 2 - Milling and particle size effects on HPLC determination of dimetridazole

Another common requirement is for the determination of analytes in samples of agglomerated materials such as soils or animal feeds. A common preparative method is to grind or mill the material so that it passes through a sieve of specified aperture. The grinding time in such cases is not generally specified, leaving open the possibility of variation in grinding time and particle size. Both constitute potential sources of uncertainty. Longer grinding times may affect the results through greater production of fines (that is, a particle size effect) or by thermal degradation or loss of the analyte. Grinding/milling therefore constitutes a possible source of uncertainty. In this study, a series of experiments with different grinding times was carried out on a sample of medicated pelleted animal feed. A rough particle size distribution was also estimated using different sieve sizes. The feed contained approximately 100 mg kg⁻¹ of dimetridazole (a coccidiostat) and the experiments were designed to assess the effect of grinding on the subsequent determination of this compound.

A plot of the variation of observed concentration of dimetridazole with grinding time is shown at Fig. 2. Although there appears to be a decrease in the concentration at longer grinding times, the plot does not suggest strong linearity, and ANOVA does not show this effect to be statistically significant (P = 0.45, CL = 95%). It remains possible to provide an estimate of uncertainty using linear regression and the first-order GUM expression [1]; in this case, we obtain a linear regression result of y = -0.45t + 112.41, where y is analyte concentration in mg kg⁻¹, and t the cycle time (in seconds). For an uncertainty u(t) of 1 s in grinding time (based on the practical difficulties of controlling grinding time more closely), the uncertainty in analyte concentration would be ±0.45 mg kg⁻¹. In comparison, the precision of the analytical method for dimetridazole, at the analyte level encountered for the whole sample (ca. 105 mg kg⁻¹), was 3.9 mg kg⁻¹. Reassuringly, this very rough uncertainty estimate confirms the insignificance implied by the ANOVA result.

Considering the possible outcome of a ruggedness test aimed at establishing the effect of grinding time across a range of, for example, 3-5 s about the nominal time of 10 s (much smaller variation would be impractical), again we find that such a test would almost certainly fail to find a significant effect. In this instance, however, it would have been entirely correct to ignore the effect in comparison with observed precision.

Conclusions

The experiments were designed with the aim of using sensitivity values, generated by linear regression, to obtain estimates of the contribution to overall measurement uncertainty from two methods of sample pre-treatment. By comparison, the minimal range typical of current practice in ruggedness testing would not be expected to give useful or reliable uncertainty estimates in either case. This is consistent with recent studies on derivatisation effects [8] which show that as modelling coefficients (such as gradients) become statistically insignificant, uncertainty estimates become progressively
Experimental sensitivity analysis applied to sample preparation uncertainties 173
more unreliable and can be misleadingly large, even though the average remains negligible. However, the lack of statistical significance predicted for such minimal studies was generally consistent with the finding that the uncertainties were small compared to repeatability precision.

This has important implications for the use of data from simple ruggedness studies in uncertainty estimation. Given a typical ruggedness study, properly designed to confirm insignificance of an expected range for an influence quantity, the sensitivity estimates will generally be too unreliable for useful uncertainty estimation even though the study is a valid check on the potential influence quantities' effects. It follows that typical ruggedness studies are not generally appropriate sources of data for reliable uncertainty estimation. However, where no significant effect is found in such a study, sensitivity coefficients obtained from a study redesigned to assure a significant change (for example by increasing the influence quantity range well beyond that expected) will generally only confirm practical insignificance of the effect. It is clearly more sensible to use a recorded lack of statistical significance in typical ruggedness tests as justification for omitting an effect from the measurement model and associated uncertainty budget (which is, in fact, the traditional statistical view), than to attempt to construct an unreliable estimate for a practically insignificant effect.

Acknowledgements Production of this paper was supported under contract with the Department of Trade and Industry as part of the National Measurement System Valid Analytical Measurement Programme.
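The first-order calculation used in both case studies multiplies a regression gradient by the standard uncertainty of the influence quantity. The sketch below restates that arithmetic with the gradients, limits and precisions quoted in the text. The helper names are illustrative assumptions, and the one-fifth rule is the working significance criterion stated in Case study 1, not a general requirement.

```python
import math

def rectangular_u(half_width):
    """Standard uncertainty of a rectangular distribution with limits +/- half_width."""
    return half_width / math.sqrt(3)

def sensitivity_u(gradient, u_input):
    """First-order contribution to the result uncertainty: |dy/dx| * u(x)."""
    return abs(gradient) * u_input

def practically_significant(u, largest_component):
    """Working rule from the text: significant if above one fifth of the largest component."""
    return u > largest_component / 5.0

# Case study 1: oven temperature held within +/-1 degC.
u_T = rectangular_u(1.0)                   # degC
u_l_sample1 = sensitivity_u(0.030, u_T)    # %m/m, gradient for sample 1
u_l_sample2 = sensitivity_u(0.019, u_T)    # %m/m, gradient for sample 2

# Case study 2: gradient of y = -0.45 t + 112.41 with u(t) = 1 s.
u_c_grinding = sensitivity_u(-0.45, 1.0)   # mg/kg
```

Against a repeatability of 0.05 %m/m the one-fifth threshold is 0.01 %m/m, so both temperature contributions sit close to the margin, whereas the grinding contribution of 0.45 mg kg⁻¹ is well below one fifth of the 3.9 mg kg⁻¹ method precision.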
References
1. ISO-GUM (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland; ISBN 92-67-10188-9
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement. EURACHEM, London; ISBN 0-948926-08-2. Second edition now available at: http://www.vtt.fi/ket/eurachem/publications.htm
3. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
4. Barwick VJ, Ellison SLR (2000) Accred Qual Assur 5:47-53
5. Barwick VJ, Ellison SLR, Rafferty MJQ, Gill RS (2000) Accred Qual Assur 5:104-113
6. Youden WJ, Steiner EH (1975) Statistical manual of the AOAC. Association of Official Analytical Chemists International (AOAC), Arlington, Va., USA
7. The Feeding Stuffs (Sampling and Analysis) Regulations (1999). Statutory Instrument No 1633. Her Majesty's Stationery Office (HMSO), London
8. Ellison SLR, Burns M, Holcombe DG (2001) Analyst 126:199-210
Accred Qual Assur (1998) 3: 462-467
© Springer-Verlag 1998
(combined) measurement uncertainty at that level. In metrology, the VIM [12] and GUM [13] are the basis for evaluating uncertainty in measurement. However, the implementation of the principles of the GUM is far from straightforward for matrix materials, especially when parameters are defined by the measurement process. As a result, an interpretation of uncertainty analysis of this kind of parameter/matrix combination is required that explains the experimental results and is in agreement with the basic principles of the GUM [13]. A problem that was already addressed in a paper by Van der Veen and Alink [14] is that it is impossible to quantify several sources of uncertainty when dealing with matrix materials.

Set-up of the programme

The objectives of the programme have been described in the previous section. With respect to the acceptance of laboratories as participants, no specific requirements were set other than that these laboratories should be involved in the analysis of coal in support of trade. This requirement implies that the laboratories are involved in one-to-one comparisons between coal buyer and coal seller. The implementation of a quality assurance system (QAS) was, however, a requirement. An accreditation was not asked for.

For the characterisation of coal there are two series of written standards: ASTM and ISO. In this interlaboratory study programme the ISO standards were requested. The laboratories were allowed to use their own methods if the results of these methods are comparable to those obtained with the ISO method. It was the responsibility of the laboratory to verify whether its method is comparable to the ISO method.

Several methods, such as the determination of the ash content, define the parameter: ash is the result of a chemical conversion of coal. Its formation (and as a result, the ash content) highly depends on the conditions under which the coal is combusted. This fact has some consequences. The first consequence is that it is generally not possible to determine the parameter with inde-

Outliers and/or stragglers [11] were identified by computing a Z-score [15] based on the mean and the standard deviation of the laboratory averages. The criterion was that this "Z" should not exceed a value of 2. The criterion was developed based on requirements set by all parties involved in coal production, trade, and consumption. For all samples involved in the programme, the performance characteristics were computed after removal of stragglers and outliers. A database of values of grand mean, repeatability standard deviation, and reproducibility standard deviation resulted.

In each interlaboratory study, a blind sample was used, except in round 1 [2]. The link between the results of this interlaboratory study and the results of the other rounds in the programme was established by using the materials of ILS Coal Characterisation 1 in later rounds [1]. Figure 1 shows the principle of establishing these traceability links in an interlaboratory study programme. The use of a blind sample enables the evaluation of whether the results of a round are comparable to those of other interlaboratory studies due to the fact that each laboratory was requested to perform the measurement of all samples of the suite as independent measurements under repeatability conditions. This way of implementing comparability also enables the assessment of traceability of measurement results to the written standard [12]. All laboratories involved use internationally accepted certified reference materials as a part of their QAS. So, from that point a traceability link is established between these reference materials and the results of the interlaboratory study programme.

Fig. 1 (diagram) Four successive rounds with sample sets A, B, C; A, D, E; D, F, G; and G, H, J, each set distributed to the participating laboratories
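The outlier screening described in this section can be sketched as follows, assuming the Z-score is the deviation of each laboratory average from the mean of all laboratory averages, divided by their standard deviation. The use of the sample standard deviation, the function name and the example ash contents are assumptions for illustration; the programme's exact procedure is the one given in its references [11, 15].

```python
from statistics import mean, stdev

def flag_by_z_score(lab_averages, limit=2.0):
    """Return {lab: z} for laboratories whose |Z| exceeds the limit."""
    values = list(lab_averages.values())
    centre = mean(values)      # mean of the laboratory averages
    spread = stdev(values)     # sample standard deviation (assumed)
    scores = {lab: (value - centre) / spread
              for lab, value in lab_averages.items()}
    return {lab: z for lab, z in scores.items() if abs(z) > limit}

# Hypothetical laboratory averages for ash content (weight-%).
ash = {"lab1": 10.1, "lab2": 10.2, "lab3": 10.0, "lab4": 10.3,
       "lab5": 10.1, "lab6": 10.2, "lab7": 10.0, "lab8": 12.5}
flagged = flag_by_z_score(ash)
```

In this made-up data set only lab8 exceeds the limit of 2; the performance characteristics would then be recomputed without it, as the text describes.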
Uncertainty analysis

Starting with the working hypothesis, an uncertainty analysis can be carried out. The basic problem with the approach of the "Guide to the expression of uncertainty in measurement" (GUM) is that many sources of uncertainty are difficult, not to say impossible, to quantify in evaluating measurement results from matrix materials. There must also be a relationship between the performance characteristics obtained from the interlaboratory study programme and the combined standard uncertainty that can be obtained by applying the GUM directly.

In an interlaboratory study programme the measurement chain starts with the arrival of one or more samples to be analysed. These samples may or may not undergo further treatment prior to the measurement. The principles of this part of the measurement chain, as well as the role that can be played by (certified) reference materials, have been the subject of a previous paper [19]. In the interlaboratory study programme the role of the (C)RM has been taken over by the blind sample. The structure of the interlaboratory study programme met the requirements set by ISO Guide 35 [16], so that in principle the blind samples would be suitable to serve as (at least) reference materials.

The basic expression for the combined measurement uncertainty after splitting up the measurement chain reads as follows

$u^2_{combined} = u^2_{crushing} + u^2_{subdividing} + u^2_{measurement}$ (1)

where it should be realised that each of the terms on the right-hand side of this equation consists of one or more contributions from various sources. This approach is in agreement with the rule in statistics that variances of parts of a process can be added in order to obtain the total variance of the whole process [20-23].

The dominant factor in terms of uncertainty budgets is the heterogeneity of the material. Unfortunately, the heterogeneity of the material affects all terms on the right-hand side of Eq. 1, and as a result these contributions are heavily correlated. A crushing step for instance increases the heterogeneity on the level of particles, but decreases heterogeneity on the level of, say, a few grams of powder. So, from the point of view of evaluating uncertainty, it is no use to make an attempt to quantify the contribution of heterogeneity. A better approach is to select the opposite way, starting with investigating a measurement chain and - by experiment - breaking it up into smaller parts. The measurement will, however, always be part of the uncertainty evalua-

tation value $\mu$ changes, or the variance $\sigma^2$ changes. Both changes can happen simultaneously. Procedures such as milling may well change the probability density function of the content on the level of particles. Changes in this distribution function are sometimes wanted (reduction of the combined measurement uncertainty by increasing the total number of particles (crushing/milling)), sometimes unwanted. An example of the latter is the loss of volatiles and moisture during milling of coal.

The evaluation model to be developed should therefore (1) avoid "double counting" of sources of uncertainty and (2) comply with the additivity rule of uncertainties as expressed in Eq. 1. This has been done by working with a reference term, i.e. the uncertainty is expressed in terms of the uncertainty of the measurement, followed by several correction terms that may account for extra budgets due to other steps in the measurement chain.

The statistical fundamentals read as follows. Let the random variable X denote the content of the critical component (in this example: ash content). The expectation of X is given by

$E(X) = \mu$ (2)

If, during the process represented by the measurement chain, changes in this expectation take place, then it is said that the method is biased. For the determination of the ash content, it is very hard to find out whether the measurement method is biased, as the parameter is determined by the method. Ash is as such not present in coal; the precursor of ash is the mineral matter in coal. During combustion, this mineral matter is converted into ash, a chemical process. The ash composition is a function of the temperature at which the ash formation takes place. As a result, the ash content (expressed as weight-% of the coal on a dry basis) is also a function of the temperature.

The requested method (ISO 1171) requires a constant temperature of 815°C. Insufficient control of this temperature, or a deviation of the sample temperature in the oven may lead to a change in expectation value, and thus in a bias of the measurement method. A convenient way of expressing the expectation value of the critical content could be

$\mu = E(X) + E(B_{crushing}) + E(B_{subsampling}) + E(B_{measurement})$ (3)

where B denotes the bias of the given step. Equation 3 expresses the expectation value in a sum of random variables, where the first term denotes the expectation
tion, as was shown in a previous paper [14]. value of the method, followed by several correction
During the processing of the material in a measure- terms. In the ideal case each of these terms has the ex-
ment chain, the critical property (say, for instance, ash pectation value 0 (~expected bias=O). Now a match
content) undergoes (from a statistical perspective) must be found between (3) and (1). The expression for
changes. In principle, there are two changes: the expec- the variance of ,.... reads as

$\sigma^2 = Var(X) + Var(B_{crushing}) + Var(B_{subsampling}) + Var(B_{measurement})$ (4)

where Var() denotes variance. Equation 4 provides a mathematical model for expressing the variance of the measurement in terms of the variance of the measurand X and the variances of the bias terms. This equation is only valid if the biases of the steps involved in the measurement chain are not correlated. Otherwise, covariance terms should be introduced [13, 23] in Eq. 4. It is unlikely that the bias terms are truly uncorrelated, as a bias is defined by a systematic difference between the expectation value of the measurand before and after a specific treatment. However, it is always possible to modify Eq. 4 in such a way that it can be expressed as a set of independent variables.
However, even without assuming that the bias terms are uncorrelated, Eq. 4 provides an explanation for the insignificant difference in the performance characteristics in the interlaboratory studies as shown in Fig. 1. As all laboratories maintain the operational conditions in steps such as crushing, milling, subdividing and analysis by means of a QAS, this will eventually result in comparable measurement results. Maintenance of the operational conditions during sample preparation and analysis steps will also minimise the variance of the biases associated with these steps in the measurement chain. That is, the terms Var(B_i) in Eq. 4 are minimised by a detailed description of the processing of the sampled material. If the QAS is successful, it may be expected that the contributions of the Var(B_i) terms in Eq. 4 will be small in comparison with the value of the term Var(X).
There are still two terms to be interpreted: E(X) and Var(X). It is well known that the concept of a true value is not very useful. The value E(X) is the expectation of the measurand, given the matrix and given the measurement method. The same holds for Var(X). Both parameters account for the performance characteristics of the test method involved. So, the foundation of the model given in Eq. 3 complies with the GUM. Moreover, it does not contain parameters that are inaccessible in practice. The concept of expectations also complies with the GUM, as it forms the basis for statistics. The concept of uncertainty is very closely related to the standard deviation [13].
Similarly, the QAS will also aim to reduce E(B_crushing) and E(B_subdividing) (and other bias terms) to values close to zero. Likewise, the values for Var(B_crushing) and Var(B_subdividing) will be minimised by maintaining the procedure as well as is feasible. In a separate paper [14] the determination of these terms has been discussed in more detail.
Finally, the combined measurement uncertainty of the measurement chain can also be calculated from the experimental biases and variances by the procedures as given in the GUM. Thus, with each of the terms known in Eqs. 3 and 4, the combined measurement uncertainty can be calculated. Whether the resulting combined standard uncertainty equals the reproducibility standard deviation depends on the answer to the question whether all sources of uncertainty have been included in the interlaboratory study. If this is the case, both processes should lead to the same result. If not, the reproducibility standard deviation will be lower than the combined standard uncertainty that would be obtained by identifying and quantifying all sources of uncertainty.

Conclusions

The concepts of the GUM are also valid for solid-state materials. With a careful interpretation of the statistical concepts of the standard for the organisation of interlaboratory studies, ISO 5725:1994 [10, 11, 24-26] can be brought into agreement with the concepts of the GUM.
A measurement chain is best evaluated when taking the consensus/certified value of a reference material as a reference term and expressing all other terms in the chain in the form of bias terms. The sum of the reference term and the bias terms defines the expectation value of the critical content typical for the laboratory, which cannot be better than the reference term (critical content and stated uncertainty of the reference material). The expectation value of the critical content is independent of a possible correlation of the bias terms.
The expression of the measurement chain in a reference term in combination with several bias terms enables the evaluation of the effectiveness of the reduction of the bias (and its variance).
An exact evaluation of the combined measurement uncertainty requires (at least) knowledge about the relationship between the critical content X and any of the bias terms B. The functional relationships between the bias terms may be left out in a first approximation, as it may be expected that successful implementation of a QAS will minimise the variance of these terms, and as a result will lead to a very low covariance value between any of the bias terms when compared to the variance of the critical content, Var(X).
The evaluation of the functional relationship between X and B is problematic for matrix materials (i.e. coal) due to "matrix effects", but can be well established for synthetic, more homogeneous systems.

Acknowledgements The European Coal and Steel Community (ECSC) is acknowledged for its financial support of this work done under contract number ECSC 7220/EC-036, "Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes". The participants in the interlaboratory study programme are thanked for their work and their expression of interest during the project.
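
As an illustration of the measurement-chain model, Eqs. 3 and 4 can be sketched numerically. The function names and all numbers below are hypothetical and are not taken from the interlaboratory study; the optional covariance argument reflects the remark that covariance terms have to be added when the bias terms are correlated.

```python
import math

def chain_expectation(e_x, bias_means):
    """Eq. 3: expectation of the critical content = E(X) + sum of E(B_step)."""
    return e_x + sum(bias_means.values())

def chain_variance(var_x, bias_vars, covariances=None):
    """Eq. 4: Var(X) + sum of Var(B_step).

    If the bias terms are correlated, each pairwise covariance contributes
    2 * Cov(B_i, B_j) to the variance of the sum.
    """
    total = var_x + sum(bias_vars.values())
    if covariances:
        total += 2.0 * sum(covariances.values())
    return total

# Hypothetical ash-content example (weight-%); values are illustrative only.
bias_means = {"crushing": 0.0, "subsampling": 0.02, "measurement": -0.01}
bias_vars = {"crushing": 0.0004, "subsampling": 0.0009, "measurement": 0.0016}

mu = chain_expectation(10.0, bias_means)
var_uncorrelated = chain_variance(0.01, bias_vars)
var_correlated = chain_variance(
    0.01, bias_vars, covariances={("crushing", "subsampling"): 0.0002}
)
u_combined = math.sqrt(var_uncorrelated)  # combined standard uncertainty
```

With small Var(B) terms, as expected under a working QAS, the result is dominated by Var(X).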
References
1. Veen AMH van der, Broos AJM (1996) Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes. Draft final report, ECSC 7220/EC-036, Eygelshoven
2. Veen AMH van der (1994) ILS Coal Characterisation I, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
3. Veen AMH van der, Broos AJM (1994) ILS Coal Characterisation II, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
4. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation III, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
5. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation IV, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
6. Veen AMH van der, Broos AJM (1995) ILS Coal Characterisation V, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
7. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VI, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
8. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VII, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
9. Veen AMH van der, Broos AJM (1996) ILS Coal Characterisation VIII, Evaluation report. NMi Van Swinden Laboratorium B.V., Eygelshoven
10. International Organization for Standardization (1994) ISO 5725-1:1994 Accuracy (trueness and precision) of measurement methods and results, part 1. General principles and definition. Statistical methods for quality control, vol 2, pp 9-29
11. International Organization for Standardization (1994) ISO 5725-2:1994 Accuracy (trueness and precision) of measurement methods and results, part 2. Basic method for the determination of repeatability and reproducibility of a standard measurement method. Statistical methods for quality control, vol 2, pp 30-74
12. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) International vocabulary of basic and general terms in metrology, 2nd edn. ISO, Geneva
13. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement, 1st edn. ISO, Geneva
14. Veen AMH van der, Alink A (1998) Accred Qual Assur 3:20-26
15. International Organization for Standardization (1996) ISO/IEC Guide 43-1: voting draft. Proficiency testing by interlaboratory comparisons, part 1. Development and operation of proficiency testing schemes
16. International Organization for Standardization (1989) ISO Guide 35:1989 - Certification of reference materials - general and statistical principles, 2nd edn. ISO, Geneva
17. International Organization for Standardization (1975) ISO 1988 Hard coal - sampling. ISO, Geneva
18. International Organization for Standardization (1997) ISO 1171 Solid mineral fuels - determination of ash. ISO, Geneva
19. Veen AMH van der, Alink A, Verkuil D, Lecq B van der (1996) Accred Qual Assur 1:207-212, 250
20. International Organization for Standardization (1994) ISO 3534-1:1993 Statistics - vocabulary and symbols, part 1. Probability and general statistical terms. Statistical methods for quality control, vol 1, pp 9-57
21. International Organization for Standardization (1994) ISO 3534-2:1993 Statistics - vocabulary and symbols, part 2. Statistical quality control. Statistical methods for quality control, vol 1, pp 58-92
22. International Organization for Standardization (1994) ISO 3534-3:1993 Statistics - vocabulary and symbols, part 3. Design of experiments. Statistical methods for quality control, vol 1, pp 93-134
23. DeGroot MH (1989) Probability and statistics, 2nd edn. Addison-Wesley
24. International Organization for Standardization (1994) ISO 5725-3:1994 Accuracy (trueness and precision) of measurement methods and results, part 3. Intermediate measures of the precision of a standard measurement method. Statistical methods for quality control, pp 75-104
25. International Organization for Standardization (1994) ISO 5725-4:1994 Accuracy (trueness and precision) of measurement methods and results, part 4. Basic methods for the determination of the trueness of a standard measurement method. Statistical methods for quality control, pp 105-130
26. International Organization for Standardization (1994) ISO 5725-6:1994 Accuracy (trueness and precision) of measurement methods and results, part 6. Use in practice of accuracy values. Statistical methods for quality control, pp 131-176
Accred Qual Assur (2000) 5:47-53
© Springer-Verlag 2000
Abstract A protocol has been developed illustrating the link between validation experiments, such as precision, trueness and ruggedness testing, and measurement uncertainty evaluation. By planning validation experiments with uncertainty estimation in mind, uncertainty budgets can be obtained from validation data with little additional effort. The main stages in the uncertainty estimation process are described, and the use of trueness and ruggedness studies is discussed in detail. The practical application of the protocol will be illustrated in Part 2, with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness testing

V.J. Barwick (✉) · S.L.R. Ellison
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk
Tel.: +44-20-8943 7421
Fax: +44-20-8943 2767

protocol and discusses the use of data from trueness and ruggedness studies in detail. The practical application of the protocol will be described in Part 2 with reference to a high performance liquid chromatography (HPLC) procedure for the determination of markers in road fuel [17].

Principles of approach

The stages in the uncertainty estimation process are illustrated in Fig. 1. An outline of the procedure discussed in the protocol is presented in Fig. 2. The first stage of the procedure is the identification of sources of uncertainty for the method. Once the sources of uncertainty have been identified they require evaluation. The main tools for doing this are precision, trueness (or bias) and ruggedness studies. The aim is to account for as many sources of uncertainty as possible during the precision and trueness studies. Any remaining sources of uncertainty are then evaluated either from existing data (e.g. calibration certificates, published data, previous studies, etc.) or via ruggedness studies. Note that it may not be necessary to evaluate every source of uncertainty in detail, if the analyst has evidence to suggest that some are insignificant. Indeed, the EURACHEM Guide states that unless there are a large number of them, uncertainty components which are less than one-third of the largest need not be evaluated in detail. Finally, the individual uncertainty components for the method are combined to give standard and expanded uncertainties for the method as a whole. The use of data from trueness and ruggedness studies in uncertainty estimation is discussed in more detail below.

Trueness studies

In developing the protocol, the trueness of a method was considered in terms of recovery, i.e. the ratio of the observed value to the expected value. The evaluation of uncertainties associated with recovery is discussed in detail elsewhere [18, 19]. In general, the recovery, R, for a particular sample is considered as comprising three components:
- Rm is an estimate of the mean method recovery obtained from, for example, the analysis of a CRM or a spiked sample. The uncertainty in Rm is composed of the uncertainty in the reference value (e.g. the uncertainty in the certified value of a reference material) and the uncertainty in the observed value (e.g. the standard deviation of the mean of replicate analyses).
- Rs is a correction factor to take account of differences in the recovery for a particular sample compared to the recovery observed for the material used to estimate Rm.
- Rrep is a correction factor to take account of the fact that a spiked sample may behave differently to a real sample with incurred analyte.
These three elements are combined multiplicatively to give an estimate of the recovery for a particular sample, R, and its uncertainty, u(R):

$R = R_m \times R_s \times R_{rep}$ (1)

$u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2}$ (2)

Rm and u(Rm) are calculated using Eq. (3) and Eq. (4):

[Flowchart residue; recoverable box text: "Identify sources of uncertainty"; "Plan and carry out precision study"; "Plan and carry out trueness study"; "Identify additional sources of uncertainty and evaluate"; "Remove sources of uncertainty covered by precision experiments from list"]
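
The multiplicative combination of the three recovery components can be sketched as follows. This is a minimal illustration that assumes Eq. 2 is the usual quadrature sum of relative standard uncertainties for a product; the function name and the numerical values are hypothetical.

```python
import math

def combined_recovery(rm, u_rm, rs=1.0, u_rs=0.0, rrep=1.0, u_rrep=0.0):
    """Return (R, u(R)) for R = Rm * Rs * Rrep.

    Rs and Rrep default to 1, as the text assumes; their uncertainties
    still contribute through the relative terms.
    """
    r = rm * rs * rrep
    relative_u = math.sqrt(
        (u_rm / rm) ** 2 + (u_rs / rs) ** 2 + (u_rrep / rrep) ** 2
    )
    return r, r * relative_u

# Hypothetical inputs: mean recovery 0.95 with u = 0.019, u(Rs) = 0.03,
# and a spike assumed fully representative (u(Rrep) = 0).
r, u_r = combined_recovery(0.95, 0.019, u_rs=0.03)
```

Because the components are combined as relative terms, a component fixed at 1 with zero uncertainty drops out of the sum.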
tion of Rm to the combined uncertainty for the method as a whole, the estimate is compared with 1, using an equation of the form:

$t = \frac{|1 - R_m|}{u(R_m)}$ (5)

To determine whether Rm is significantly different from 1, the calculated value of t is compared with the coverage factor, k = 2, which will be used to calculate the expanded uncertainty [19]. A t value greater than 2 suggests that Rm is significantly different from 1. However, if in the normal application of the method, no correction is made to take account of the fact that the method recovery is significantly different from 1, the uncertainty associated with Rm must be increased to take account of this uncorrected bias. The relevant equation is:

$u(R_m)' = \sqrt{\left(\frac{1 - R_m}{k}\right)^2 + u(R_m)^2}$ (6)

A special case arises when an empirical method is being studied. In such cases, the method defines the measurand (e.g. dietary fibre, extractable cadmium from ceramics). The method is considered to define the true value and is, by definition, unbiased. The presumption is that Rm is equal to 1 and that the only uncertainty is that associated with the laboratory's particular application of the method. In some cases, a reference ma-
The evaluation of measurement uncertainty from method validation studies. Part I: Description of a laboratory protocol 183
Fig. 2 Continued
[Flowchart residue; recoverable box text: "Calculate Rs and u(Rs)"; decision "... cover a range of analyte concentrations and/or matrices?"; decision "Spiking study used to estimate Rm?"; "Combine all recovery uncertainties to give R and u(R)"; decision "... have a significant ...?" with branches "Yes: Use data from ruggedness study/carry out additional experiments" and "No"]
terial certified for use with the method may be available. Where this is so, a bias study can be carried out and the results treated as discussed above. If there is no relevant reference material, it is not possible to estimate the uncertainty associated with the laboratory bias. There will still be uncertainties associated with bias, but they will be associated with possible bias in the temperatures, masses, etc. used to define the method. In such cases it will normally be necessary to consider these individually.
Where the method scope covers a range of sample matrices and/or analyte concentrations, an additional uncertainty term Rs is required to take account of differences in the recovery of a particular sample type, compared to the material used to estimate Rm. This can be evaluated by analysing a representative range of spiked samples, covering typical matrices and analyte concentrations, in replicate. The mean recovery for each sample type is calculated. Rs is normally assumed to be equal to 1. However, there will be an uncertainty associated with this assumption, which appears in the spread of mean recoveries observed for the different spiked samples. The uncertainty, u(Rs), is therefore taken as the standard deviation of the mean recoveries for each sample type.
When a spiked sample, rather than a matrix reference material, has been used to estimate Rm it may be necessary to consider Rrep and its uncertainty. In general, Rrep is assumed to equal 1, indicating that the recovery observed for a spiked sample is truly representative of that for the incurred analyte. The uncertainty, u(Rrep), is a measure of the uncertainty associated with that assumption. In some cases it can be argued that a spike is a good representation of a real sample, for example in liquid samples where the analyte is simply dissolved in the matrix; u(Rrep) can therefore be assumed to be small. In other cases there may be reason to believe that a spiked sample is not a perfect model for a test sample and u(Rrep) may be a significant source of uncertainty. The evaluation of u(Rrep) is discussed in more detail elsewhere [18].

Evaluation of other sources of uncertainty

An uncertainty evaluation must consider the full range of variability likely to be encountered during application of the method. This includes parameters relating to the sample (analyte concentration, sample matrix) as well as experimental parameters associated with the method (e.g. temperature, extraction time, equipment settings, etc.). Sources of uncertainty not adequately covered by the precision and trueness studies require separate evaluation. There are three main sources of information: calibration certificates and manufacturers' specifications, data published in the literature and specially designed experimental studies. One efficient method of experimental study is ruggedness testing, discussed below.

Ruggedness studies

Ruggedness tests are a useful way of investigating simultaneously the effect of several experimental parameters on method performance. The experiments are based on the ruggedness testing procedure described in the Statistical Manual of the AOAC [20]. Such experiments result in an observed difference, $D_{x_i}$, for each parameter studied which represents the change in result due to varying that parameter. The parameters are tested for significance using a Student's t-test of the form [21]:

$t = \frac{\sqrt{n} \times D_{x_i}}{\sqrt{2} \times s}$ (7)

where s is the estimate of the method precision, n is the number of experiments carried out at each level for each parameter (n = 4 for a seven-parameter Plackett-Burman experimental design), and $D_{x_i}$ is the difference calculated for parameter $x_i$. The values of t calculated using Eq. (7) are compared with the appropriate critical values of t at 95% confidence. Note that the degrees of freedom for $t_{crit}$ relate to the degrees of freedom for the precision estimate used in the calculation of t. For parameters identified as having no significant effect on the method performance, the uncertainty in the final result y due to parameter $x_i$, $u(y(x_i))$, is calculated using Eq. (8):

$u(y(x_i)) = \frac{\sqrt{2} \times t_{crit} \times s}{\sqrt{n} \times 1.96} \times \frac{\delta_{real}}{\delta_{test}}$ (8)

where $\delta_{real}$ is the change in the parameter which would be expected when the method is operating under control in routine use and $\delta_{test}$ is the change in the parameter that was specified in the ruggedness study. In other words, the uncertainty estimate is based on the 95% confidence interval, converted to a standard deviation by dividing by 1.96 [1, 2]. The $\delta_{real}/\delta_{test}$ term is required to take account of the fact that the change in a parameter used in the ruggedness test may be greater than that observed during normal operation of the method. For parameters identified as having a significant effect on the method performance, a first estimate of the uncertainty can be calculated as follows:

$u(y(x_i)) = u(x_i) \times c_i$ (9)

$c_i = \frac{\text{Observed change in result}}{\text{Change in parameter}}$ (10)

where $u(x_i)$ is the uncertainty in the parameter and $c_i$ is the sensitivity coefficient.
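
The ruggedness calculations of Eqs. 7-10 can be sketched as below. The function names are hypothetical, the critical t value is passed in by the caller rather than looked up, and the absolute value of the difference is used so that t can be compared directly with the critical value (an assumption of this sketch). The significant-parameter example reuses the sample-volume figures reported for CI solvent red 24 in Table 1 of Part 2 (a change of -0.353 mg l-1 for a 2 ml change in volume, with u(x_i) = 0.04 ml); the precision, n and critical t used in the other example are illustrative only.

```python
import math

def ruggedness_t(d_xi, s, n):
    """Eq. 7: t statistic for the observed difference of one parameter."""
    return math.sqrt(n) * abs(d_xi) / (math.sqrt(2.0) * s)

def u_not_significant(t_crit, s, n, delta_real, delta_test):
    """Eq. 8: uncertainty for a parameter with no significant effect."""
    half_width = math.sqrt(2.0) * t_crit * s / math.sqrt(n)
    return half_width / 1.96 * (delta_real / delta_test)

def u_significant(observed_change, parameter_change, u_xi):
    """Eqs. 9 and 10: magnitude of the sensitivity coefficient, and c_i * u(x_i)."""
    c_i = abs(observed_change / parameter_change)
    return c_i, c_i * u_xi

# Hypothetical non-significant parameter: s = 0.05 mg/l, n = 4, t_crit = 2.0.
t_value = ruggedness_t(0.0275, s=0.05, n=4)
u_small = u_not_significant(2.0, s=0.05, n=4, delta_real=1.0, delta_test=10.0)

# Sample-volume row of Table 1 (Part 2), CI solvent red 24.
c_volume, u_volume = u_significant(-0.353, 2.0, 0.04)
```

The sample-volume result agrees with the tabulated c_i of 0.176 and u(y(x_i)) of 0.00705 mg l-1 to within rounding.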
[Figure residue: horizontal axis scaled from 0 to 0.1; the only recoverable entry label is "Recovery, u(R)"]
The estimates obtained by applying Eqs. 8-10 are intended to give a first estimate of the measurement uncertainty associated with a particular parameter. If such estimates of the uncertainty are found to be a significant contribution to the overall uncertainty for the method, further study of the effect of the parameters is advised, to establish the true relationship between changes in the parameter and the result of the method. However, if the uncertainties are found to be small compared to other uncertainty components (i.e. the uncertainties associated with precision and trueness) then no further study is required.

Calculation of combined measurement uncertainty for the method

The individual sources of uncertainty, evaluated through the precision, trueness, ruggedness and other studies are combined to give an estimate of the standard uncertainty for the method as a whole. Uncertainty contributions identified as being proportional to analyte concentration are combined using Eq. (11):

$\frac{u(y)}{y} = \sqrt{\left(\frac{u(p)}{p}\right)^2 + \left(\frac{u(q)}{q}\right)^2 + \left(\frac{u(r)}{r}\right)^2 + \ldots}$ (11)
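
A short sketch of the combination step for contributions that are proportional to analyte concentration, assuming Eq. 11 is the quadrature sum of relative standard uncertainties. The function names and the input values are hypothetical; the coverage factor k = 2 is the one the text uses for the expanded uncertainty.

```python
import math

def combined_relative_uncertainty(relative_components):
    """Eq. 11: root sum of squares of relative standard uncertainties."""
    return math.sqrt(sum(rel * rel for rel in relative_components))

def expanded_uncertainty(result, relative_u, k=2.0):
    """Expanded uncertainty U = k * y * (u(y)/y) for a result y."""
    return k * result * relative_u

# Hypothetical relative contributions, e.g. precision 0.03 and recovery 0.04.
rel_u = combined_relative_uncertainty([0.03, 0.04])
big_u = expanded_uncertainty(2.0, rel_u)  # for a result of 2.0 mg/l
```

A component smaller than one-third of the largest changes the total only slightly, which is why such components need not be evaluated in detail.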
References
1. ISO (1993) Guide to the expression of uncertainty in measurement. ISO, Geneva
2. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist (LGC), London
3. Pueyo M, Obiols J, Vilalta E (1996) Anal Commun 33:205-208
4. Williams A (1993) Anal Proc 30:248-250
5. Analytical Methods Committee (1995) Analyst 120:2303-2308
6. Ellison SLR (1997) In: Ciarlini P, Cox MG, Pavese F, Richter D (eds) Advanced mathematical tools in metrology III. World Scientific, Singapore
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
8. Ríos A, Valcárcel M (1998) Accred Qual Assur 3:14-29
9. IUPAC (1988) Pure Appl Chem 60:885
10. AOAC (1989) J Assoc Off Anal Chem 72:694-704
11. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva
12. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101-105
13. Ellison SLR, Barwick VJ (1998) Analyst 123:1387-1392
14. Barwick VJ, Ellison SLR (1998) Anal Comm 35:377-383
15. ISO 9004-4:1993 (1993) Total quality management Part 2: Guidelines for quality improvement. ISO, Geneva
16. Barwick VJ, Ellison SLR (1999) Protocol for uncertainty evaluation from validation data. VAM Technical Report No. LGC/VAM/1998/088, available on LGC website at www.lgc.co.uk
17. Barwick VJ, Ellison SLR, Rafferty MJQ, Gill RS (1999) Accred Qual Assur
18. Barwick VJ, Ellison SLR (1999) Analyst 124:981-990
19. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors in trace analysis. Royal Society of Chemistry, Cambridge
20. Youden WJ, Steiner EH (1975) Statistical manual of the association of official analytical chemists. Association of Official Analytical Chemists (AOAC), Arlington, Va.
21. Vander Heyden Y, Luypaert K, Hartmann C, Massart DL, Hoogmartens J, De Beer J (1995) Anal Chim Acta 312:245-262
Accred Qual Assur (2000) 5:104-113
© Springer-Verlag 2000
Abstract A protocol has been developed illustrating the link between validation experiments and measurement uncertainty evaluation. The application of the protocol is illustrated with reference to a method for the determination of three markers (CI solvent red 24, quinizarin and CI solvent yellow 124) in fuel oil samples. The method requires the extraction of the markers from the sample matrix by solid phase extraction followed by quantification by high performance liquid chromatography (HPLC) with diode array detection. The uncertainties for the determination of the markers were evaluated using data from precision and trueness studies using representative sample matrices spiked at a range of concentrations, and from ruggedness studies of the extraction and HPLC stages.

Key words Measurement uncertainty · Method validation · Precision · Trueness · Ruggedness · High performance liquid chromatography

V.J. Barwick (✉) · S.L.R. Ellison · M.J.Q. Rafferty · R.S. Gill
Laboratory of the Government Chemist, Queens Road, Teddington, Middlesex, TW11 0LY, UK
e-mail: vjb@lgc.co.uk
Tel.: +44-20-8943 7421
Fax: +44-20-8943 2767

Introduction
taken and shows how the data were used in the calculation of the measurement uncertainty.
itrile (2.5 ml) and the resulting solution placed in an ultrasonic bath for 5 min. The solution was then passed through a 0.45 µm filter prior to analysis by high performance liquid chromatography (HPLC).

$C = \frac{A_S \times V_F \times C_{STD}}{A_{STD} \times V_S}$ (1)

where $A_S$ is the peak area recorded for the sample solution, $A_{STD}$ is the peak area recorded for the standard solution, $V_F$ is the final volume of the sample solution (ml), $V_S$ is the volume of the sample taken for analysis (ml) and $C_{STD}$ is the concentration of the standard solution (mg l-1).

Fig. 1 Cause and effect diagram illustrating sources of uncertainty for the method for the determination of markers in fuel oil
[Diagram residue; main branches: Sample peak area (A_S), Working std conc (C_STD), Recovery (R), Working std peak area (A_STD), Sample volume (V_S), Precision (P). Key: T temperature effects; C flask/pipette calibration; Bc balance calibration; L balance linearity]
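
For orientation, Eq. 1 can be written as a small function. The function name and the peak areas and concentrations in the example are hypothetical; only the form of the equation comes from the text.

```python
import math

def marker_concentration(a_s, a_std, v_f, v_s, c_std):
    """Eq. 1: C = (A_S * V_F * C_STD) / (A_STD * V_S).

    a_s, a_std : peak areas for sample and standard solutions
    v_f, v_s   : final solution volume and sample volume taken (ml)
    c_std      : concentration of the standard solution (mg/l)
    """
    return (a_s * v_f * c_std) / (a_std * v_s)

# Hypothetical example: 10 ml of sample concentrated into 2.5 ml.
conc = marker_concentration(a_s=1200.0, a_std=1500.0, v_f=2.5, v_s=10.0, c_std=10.0)
```

Each input of this function corresponds to one main branch of the cause and effect diagram, with recovery and precision added as separate branches.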
The evaluation of measurement uncertainty from method validation studies. Part 2 189
Evaluation of other sources of uncertainty: Ruggedness test

The effects of parameters associated with the extraction/clean-up stages and the HPLC quantification stage were studied in separate experiments. The parameters

CI solvent red 24

No significant difference was observed (F-tests, 95% confidence) between the three estimates obtained for the relative standard deviation (0.0323, 0.0289 and 0.0414). However, the test was borderline and across the range studied (0.04 mg l-1 to 4 mg l-1) the method
Table 1 Results from the ruggedness testing of the procedure for the determination of CI solvent red 24, quinizarin and CI solvent yellow 124 in fuel oil. a Ruggedness testing of the extraction/clean-up procedure

Columns: Parameter | Values | δreal/δtest | u(xi) | CI solvent red 24: Dxi (mg l-1), ci, u(y(xi)) (mg l-1) | Quinizarin: Dxi (mg l-1), ci, u(y(xi)) (mg l-1) | CI solvent yellow 124: Dxi (mg l-1), ci, u(y(xi)) (mg l-1)

Brand of silica cartridges | A Varian, a Waters | 1/1 | | 0.00750*, -, 0.0493 | -0.00750*, -, 0.0174 | -0.00250*, -, 0.0199
Sample volume | B 10 ml, b 12 ml | | 0.04 ml | -0.353, 0.176, 0.00705 | -0.180, 0.090, 0.00360 | -0.423, 0.212, 0.00845
Rate of elution of oil with hexane | C vacuum, c gravity | 1/10 | | 0.0275*, -, 0.00493 | 0.070, -, 0.0070 | -0.020*, -, 0.00199
Volume of hexane wash | D 12 ml, d 8 ml | | 0.04 ml | 0.213, 0.0531, 0.00213 | 0.176, 0.0444, 0.00177 | 0.225, 0.0563, 0.00225
Concentration of butan-1-ol/hexane | E 12%, e 8% | 0.2% (v/v)/4% (v/v) | | -0.0425*, -, 0.00247 | 0.0175*, -, 0.000868 | -0.010*, -, 0.0010
Volume of 10% butan-1-ol/hexane | F 12 ml, 8 ml | 0.08 ml/4 ml | 0.04 ml | 0.0625*, -, 0.000986 | -0.0050*, -, 0.000347 | 0.080, 0.020, 0.00080
Evaporation temperature | G 50°C, g 80°C | 10°C/30°C | 2.89°C | -0.0275*, -, 0.0164 | -0.0425, 0.00142, 0.00409 | 0.00750*, -, 0.00663

Parameter | Values | δreal/δtest | u(xi) | CI solvent red 24 | Quinizarin | CI solvent yellow 124
Table 2 Summary of data used in the estimation of u(P)

Analyte/Matrix           n       Mean (mg l⁻¹)   Standard deviation (mg l⁻¹)   Relative standard deviation
CI solvent red 24
  Matrix B               12      1.92            0.0621                        0.0323
  BP diesel              48 a    3.88            0.112                         0.0289
  Matrices A-C           15 b    -               0.0370                        0.0414
Quinizarin
  Matrix B               11      0.913           0.0210                        0.0230
  BP diesel              48 a    1.89            0.0250                        0.0136
  Matrices A-C           15 b    -               0.0470                        0.0788
CI solvent yellow 124
  Matrix B               12      2.35            0.0251                        0.0107
  BP diesel              48 a    4.99            0.0618                        0.0124
  Matrices A-C           15 b    -               0.0247                        0.0464

a Standard deviation and relative standard deviation estimated from ANOVA of 3 sets of 16 replicates (see text)
b Standard deviation and relative standard deviation estimated from duplicate results (15 sets) for a range of concentrations and matrices (see text)

precision was approximately proportional to analyte concentration. It was decided to use the estimate of 0.0414 as the uncertainty associated with precision, u(P), to avoid underestimating the precision for any given sample. This estimate was obtained from the analysis of different matrices and concentrations and is therefore likely to be more representative of the precision across the method scope.

Quinizarin

the method is more variable across different matrices and analyte concentrations for quinizarin than for the other markers. The uncertainty associated with the precision was taken as the estimate of the relative standard deviation obtained from the duplicate results, 0.0788. This estimate should ensure that the uncertainty is not underestimated for any given matrix or concentration (although it may result in an overestimate in some cases).

CI solvent yellow 124

There was no significant difference between the estimates of the relative standard deviation obtained for samples at concentrations of 2.4 mg l⁻¹ and 4.99 mg l⁻¹. However, the estimate obtained from the duplicate analyses was significantly greater than the other estimates. Inspection of that data revealed that the normalised differences observed for the samples at a concentration of 0.04 mg l⁻¹ were substantially larger than those observed at the other concentrations. Removing these data points gave a revised estimate of the relative standard deviation of 0.00903. This was in agreement with the other estimates obtained (F-tests, 95% confidence). The three estimates were therefore pooled to give a single estimate of the relative standard deviation of 0.0114. At present, the uncertainty estimate cannot be applied to samples with concentrations below 1.2 mg l⁻¹. Further study would be required to investigate in more detail the precision at these low levels.
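The pooling of the three relative standard deviation estimates for CI solvent yellow 124 can be illustrated with the usual degrees-of-freedom weighting. This is a minimal sketch: the text does not state all the degrees of freedom, so the values attached to the ANOVA estimate and to the reduced set of duplicates below are placeholders chosen for illustration, and the result is therefore only expected to land near the reported 0.0114, not to reproduce it exactly.

```python
import math


def pooled_rsd(estimates):
    """Pool (rsd, degrees_of_freedom) pairs, weighting each variance by its degrees of freedom."""
    weighted = sum(dof * rsd ** 2 for rsd, dof in estimates)
    total_dof = sum(dof for _, dof in estimates)
    return math.sqrt(weighted / total_dof)


# (relative standard deviation, degrees of freedom)
estimates = [
    (0.0107, 11),   # Matrix B, n = 12
    (0.0124, 45),   # BP diesel, 3 sets of 16 replicates (45 is an assumption)
    (0.00903, 12),  # duplicates after removing the 0.04 mg/l samples (12 is hypothetical)
]

pooled = pooled_rsd(estimates)
print(f"pooled relative standard deviation = {pooled:.4f}")
```

Because the variances are weighted, the estimate with the most degrees of freedom dominates the pooled value.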
a Estimated from ANOVA of 3 groups of 16 replicates according to ISO 5725:1994 [9]
192 V. J. Barwick et al.
spiked sample. The uncertainty, u(Rm), was calculated using Eq. (3):

(3)

where u(C_RM) is the standard uncertainty in the concentration of the spiked sample. The standard deviation of the mean of the results, s_obs, was estimated from ANOVA of the data according to Part 4 of ISO 5725:1994 [9].

Using information on the purity of the material used to prepare the spiked sample, and the accuracy and precision of the volumetric glassware and analytical balance used, the uncertainty in the concentration of CI solvent red 24 in the sample, u(C_RM), was estimated as 0.05 mg l⁻¹.¹ The uncertainties associated with the concentration of quinizarin and CI solvent yellow 124 were estimated as 0.025 mg l⁻¹ and 0.062 mg l⁻¹, respectively. The relevant values are:

CI solvent red 24: Rm = 0.957, u(Rm) = 0.0148
Quinizarin: Rm = 0.949, u(Rm) = 0.0121
CI solvent yellow 124: Rm = 1.00, u(Rm) = 0.0129

Applying Eq. (4):

$$t = \frac{|1 - R_m|}{u(R_m)} \qquad (4)$$

indicated that the estimates of Rm obtained for CI solvent red 24 and quinizarin were significantly different from 1.0 (t > 2) [7, 10]. During routine use of the method, the results reported for test samples will not be corrected for incomplete recovery of the analyte. Equation (5) was therefore used to calculate an increased uncertainty for Rm to take account of the uncorrected bias:

(5)

u(Rm)' was calculated as 0.0262 for CI solvent red 24 and 0.0283 for quinizarin. The significance test for CI solvent yellow 124 indicated that Rm was not significantly different from 1.0. The uncertainty associated with Rm is the value of u(Rm) calculated above (i.e. 0.0129).

u(Rs) is the standard deviation of the mean recoveries obtained for the samples analysed in the precision studies and the BP diesel sample used in the study of Rm. This gave estimates of u(Rs) of 0.0322 for CI solvent red 24, 0.0932 for quinizarin and 0.0138 for CI solvent yellow 124. [...] only includes concentrations above 1.2 mg l⁻¹, for the reason discussed in the section on precision.

¹ Detailed information on the estimation of uncertainties of this type is given in Ref. [7].

Calculation of R and u(R)

The recovery, R, for a particular test sample and the corresponding uncertainty, u(R), is calculated using Eqs. (6) and (7):

(6)

$$u(R) = R \times \sqrt{\left(\frac{u(R_m)}{R_m}\right)^2 + \left(\frac{u(R_s)}{R_s}\right)^2 + \left(\frac{u(R_{rep})}{R_{rep}}\right)^2} \qquad (7)$$

In this study a spiked sample can be considered a reasonable representation of test samples of marked fuel oils. There is therefore no need to correct the estimates of Rm and u(Rm) by including the Rrep and u(Rrep) terms. Both Rm and Rs are assumed to be equal to 1. R is therefore also equal to 1. Combining the estimates of u(Rm) and u(Rs), the uncertainty u(R) was calculated as 0.0415 for CI solvent red 24, 0.0974 for quinizarin and 0.0187 for CI solvent yellow 124.

Ruggedness test of extraction/clean-up procedure

The results from the ruggedness study of the extraction/clean-up procedure are presented in Table 1a. The precision of the method for the analysis of the sample used in the ruggedness study had been estimated previously as 0.0621 mg l⁻¹ (ν = 11) for CI solvent red 24, 0.0216 mg l⁻¹ (ν = 10) for quinizarin and 0.0251 mg l⁻¹ (ν = 11) for CI solvent yellow 124. Parameters were tested for significance using Eq. (8):

$$t = \frac{\sqrt{n} \times D_{x_i}}{\sqrt{2} \times S} \qquad (8)$$

where S is the estimate of the method precision, n is the number of experiments carried out at each level for each parameter (n = 4 for a seven-parameter Plackett-Burman experimental design), and Dxi is the difference calculated for parameter xi [1, 11]. The degrees of freedom for tcrit relate to the degrees of freedom for the precision estimate used in the calculation of t.

The parameters identified as having no significant effect on method performance, at the 95% confidence level, are highlighted in Table 1a. For these parameters the uncertainty in the final result was calculated using Eq. (9):

where δreal is the change in the parameter which would be expected when the method is operating under con-
The evaluation of measurement uncertainty from method validation studies. Part 2 193
trol in routine use and δtest is the change in the parameter that was specified in the ruggedness study. The estimates of δreal are given in Table 1a. For parameter A, brand of silica cartridge, the conditions of the test (i.e. changing between two brands of cartridge) were considered representative of normal operation of the method. δreal is therefore equal to δtest. The effect of the rate of elution of oil by hexane from the cartridge was investigated by comparing elution under vacuum with elution under gravity. In routine analyses, the oil will be eluted under vacuum. Variations in the vacuum applied from one extraction to another will affect the rate of elution of the oil and the amount of oil eluted. However, the effect of variations in the vacuum will be small compared to the effect of having no vacuum present. It can therefore be assumed that variations in the observed concentration of the markers, due to variability in the vacuum, will be small compared to the differences observed in the ruggedness test. As a first estimate, the effect of variation in the vacuum during routine application of the method was estimated as one-tenth of that observed during the ruggedness study. This indicated that the parameter was not a significant contribution to the overall uncertainty for CI solvent red 24 and CI solvent yellow 124, so no further study was required. The estimates of δreal for the concentration and volume of butan-1-ol in hexane used to elute the column were based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solution.

For the parameters identified as having a significant effect on method performance, the uncertainty was calculated using Eqs. (10) and (11):

$$u(y(x_i)) = u(x_i) \times c_i \qquad (10)$$

$$c_i = \frac{\text{Observed change in result}}{\text{Change in parameter}} \qquad (11)$$

The estimates of the uncertainty in each parameter, u(xi), are given in Table 1a. The uncertainties associated with the sample volume, volume of hexane wash and volume of the 10% butan-1-ol/hexane solution were again based on the manufacturers' specifications and typical precision data for the volumetric flasks and pipettes used to prepare and deliver the solutions. The uncertainty in the evaporation temperature was based on the assumption that the temperature could be controlled to ±5 °C. This was taken as a rectangular distribution and converted to a standard uncertainty by dividing by √3 [7]. As discussed previously, the effect on the final result of variations in the vacuum when eluting the oil from the cartridge with hexane was estimated as one-tenth that observed in the ruggedness test.

The effects of all the parameters were considered to be proportional to the analyte concentration. The uncertainties were therefore converted to relative standard deviations by dividing by the mean of the results obtained from previous analyses of the sample under normal method conditions (see results for Matrix B in Table 2).

Ruggedness test of the HPLC procedure

The results from the ruggedness study of the HPLC procedure, and the values of δreal and u(xi) used in the uncertainty calculations, are presented in Table 1b. Replicate analyses of a standard solution of the three markers gave the following estimates of the precision of the HPLC system at the concentration of the sample used in the study: CI solvent red 24, s = 0.0363 mg l⁻¹ (n = 69); quinizarin, s = 0.0107 mg l⁻¹ (n = 69); CI solvent yellow 124, s = 0.0196 mg l⁻¹ (n = 69). Parameters were tested for significance, at 95% confidence, using Eq. (8). The uncertainties for parameters identified as having no significant effect on the method performance were calculated using Eq. (9). Based on information from manufacturers' specifications for HPLC systems, the uncertainty associated with the column temperature was estimated as ±1 °C, giving an estimate of δreal of 2 °C. Again, based on manufacturers' specifications for DAD detectors, the uncertainty associated with the detector wavelengths was estimated as ±2 nm, giving a δreal value of 4 nm.

The uncertainties due to significant parameters were estimated using Eqs. (10) and (11). Information in the literature suggests that a typical variation in flow rate is ±0.3% [12]. The uncertainty in the flow rate was therefore estimated as 0.00173 ml min⁻¹ assuming a rectangular distribution. Data in the literature gave 1.5% as a typical coefficient of variation for the volume delivered by an autosampler [13]. The uncertainty associated with the injection volume of 50 µl was therefore estimated as 0.75 µl.

Two remaining parameters merit further discussion: the type of acetonitrile used in the mobile phase and whether or not the mobile phase was degassed. The method was developed using HPLC grade acetonitrile. The ruggedness test indicated that changing to far-UV grade results in a lower recovery for all three analytes. The method protocol should therefore specify that for routine use, HPLC grade acetonitrile must be used. The ruggedness test also indicated that not degassing the mobile phase causes a reduction in recovery. The method was developed using degassed mobile phase, and the method protocol will specify that this must be the case during future use of the method. As these two parameters are being controlled in the method protocol, uncertainty terms have not been included.

The effects of all the parameters were considered to be proportional to the analyte concentration. The uncertainties were therefore converted to relative standard deviations by dividing by the mean of results obtained from previous analyses of the sample under normal method conditions (see results for Matrix B in Table 2).

Other sources of uncertainty

The precision and trueness studies were designed to cover as many of the sources of uncertainty as possible (see Fig. 1), for example, by analysing different sample matrices and concentration levels, and by preparing new standards and HPLC mobile phase for each batch of analyses. Parameters which were not adequately varied during these experiments, such as the extraction and HPLC conditions, were investigated in the ruggedness tests. There are, however, a small number of parameters which were not covered by the above experiments. These generally related to the calibration of pipettes and balances used in the preparation of the standards and samples. For example, during this study the same pipettes were used in the preparation of all the working standards. Although the precision associated with the operation of the pipette is included in the overall precision estimate, the effect of the accuracy of the pipettes has not been included in the uncertainty budget so far. A pipette used to prepare the standard may typically deliver 0.03 ml above its nominal value. In the future a different pipette, or the same pipette
after re-calibration, may deliver 0.02 ml below the nominal value. Since this possible variation is not already included in the uncertainty budget it should be considered separately. However, previous experience [14] has shown us that uncertainties associated with the calibration of volumetric glassware and analytical balances are generally small compared to other sources of uncertainty such as overall precision and recovery. Additional uncertainty estimates for these parameters have not therefore been included in the uncertainty budgets.

[Figure: bar chart of the contributions to the uncertainty budget on a horizontal axis running from 0 to 0.1; legible bar labels include Precision, u(P); Conc. butan-1-ol/hexane, u(y(xE)); Vol. butan-1-ol/hexane, u(y(xF))]

Calculation of measurement uncertainty

[...] associated with the variation in recovery from sample to sample was the major contribution to the recovery uncertainty, u(R). This was due to the fact that the recoveries obtained for matrix B were generally higher than those obtained for matrices A and C. However, in this study, a single uncertainty estimate for all the matrices and analyte concentrations studied was required. It was therefore necessary to use "worst case" estimates of the uncertainties for precision and recovery to adequately cover all sample types. If this estimate was found to be unsatisfactory for future applications of the method, separate budgets could be calculated for individual matrices and concentration ranges.
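The statement that sample-to-sample variation dominates u(R) can be checked from the figures quoted earlier for the recovery study. The sketch below applies the significance test of Eq. (4) and the quadrature combination of Eq. (7), with Rm = Rs = 1 and the Rrep term omitted as described in the text. The dictionary layout and function names are illustrative, and the "share" figure is simply the fraction of the combined variance that comes from u(Rs).

```python
import math


def recovery_t(r_m, u_rm):
    """Eq. (4): t = |1 - Rm| / u(Rm)."""
    return abs(1.0 - r_m) / u_rm


def combined_u_r(r, *relative_terms):
    """Eq. (7): R times the root sum of squares of the relative uncertainties."""
    return r * math.sqrt(sum(term ** 2 for term in relative_terms))


# Values quoted in the text; u_Rm_used is the increased uncertainty u(Rm)'
analytes = {
    "CI solvent red 24": {"Rm": 0.957, "u_Rm": 0.0148, "u_Rm_used": 0.0262, "u_Rs": 0.0322},
    "quinizarin": {"Rm": 0.949, "u_Rm": 0.0121, "u_Rm_used": 0.0283, "u_Rs": 0.0932},
}

results = {}
for name, v in analytes.items():
    t = recovery_t(v["Rm"], v["u_Rm"])
    u_r = combined_u_r(1.0, v["u_Rm_used"], v["u_Rs"])  # Rm = Rs = R = 1
    share_rs = v["u_Rs"] ** 2 / u_r ** 2                # variance share of u(Rs)
    results[name] = (t, u_r, share_rs)
    print(f"{name}: t = {t:.2f}, u(R) = {u_r:.4f}, share of u(Rs) = {share_rs:.0%}")
```

For both analytes the test statistic exceeds 2, and more than half of the combined variance comes from the u(Rs) term.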
References
1. Barwick VJ, Ellison SLR (1999) Accred Qual Assur (in press)
2. May EM, Hunt DC, Holcombe DG (1986) Analyst 111:993-995
3. Ellison SLR, Barwick VJ (1998) Accred Qual Assur 3:101-105
4. Ellison SLR, Barwick VJ (1998) Analyst 123:1387-1392
5. ISO 9004-4:1993 (1993) Total quality management Part 2. Guidelines for quality improvement. ISO, Geneva, Switzerland
6. EURACHEM (1998) The fitness for purpose of analytical methods, a laboratory guide to method validation and related topics. Laboratory of the Government Chemist, London
7. EURACHEM (1995) Quantifying uncertainty in analytical measurement. Laboratory of the Government Chemist, London
8. Farrant TJ (1997) Practical statistics for the analytical scientist: a bench guide. Royal Society of Chemistry, Cambridge
9. ISO 5725:1994 (1994) Accuracy (trueness and precision) of measurement methods and results. ISO, Geneva, Switzerland
10. Ellison SLR, Williams A (1996) In: Parkany M (ed) The use of recovery factors on trace analysis. Royal Society of Chemistry, Cambridge
11. Youden WJ, Steiner EH (1975) Statistical manual of the association of official analytical chemists. Association of Official Analytical Chemists, Arlington, Va
12. Brown PR, Hartwick RA (eds) (1989) High performance liquid chromatography. Wiley, New York
13. Dolan JW (1997) LC-GC International 10:418-422
14. Barwick VJ, Ellison SLR (1998) Anal Comm 35:377-383
Accred Qual Assur (1998) 3:412-415
© Springer-Verlag 1998
8 final product
Evaluation uncertainty (uncertainty
caused by integration)
Uncertainty of the reference material(s)
<0.1% 0.1%
Table 2 Results of the estimation of the measurement uncertainty for HPLC analysis

                                                             Option 1 a    Option 2 b
Calibration                                                  1%            1%
Analysis of samples                                          1.5%          1%
Total uncertainty (received from uncertainty propagation)    1.8%          1.4%

a Option 1: two weights of the calibration standard with six injections for each of them, two weights of the sample with two injections for each weight
b Option 2: two weights of the sample with four injections for each weight

Table 3 Comparison of the results of the estimation and the analysis (type B estimation compared to type A estimation)

                                                   Estimation    Found (one example)
Calibration                                        1%            0.8%
Analysis of samples                                1%            0.8%
Total uncertainty (by uncertainty propagation)     1.4%          1.14%

as for example covered in the control chart, the inhomogeneity of the sample has to be included in the estimation. Using the mean values between the min and max column in Table 1 the uncertainty estimated by uncertainty propagation is 1.46%, which is close to the value found in the control chart. In the case presented here the estimation of measurement uncertainty can be performed in only two steps as shown in Table 2. Calibration and analysis of samples represent the major uncertainties and combined they provide the complete uncertainty of our experiment; in this case the uncertainty of the HPLC sampler. The influence of the HPLC sampler is known, therefore, there are two options to perform the analysis.

Together with the customer it was decided to perform the analysis according to option 2. The samples were analysed in one analytical run. The result is shown in Table 3. The result of the estimation compares well with the "found" results. The found uncertainty is smaller than the estimation. Assessment of the results of the validation of the manufacturing formula becomes easier from the customer's point of view because the customer is able to decide if a variation of his result is related to his process or to the uncertainty of the analytical method. Additionally, the influence of the individual contributions to uncertainty becomes smaller because of the uncertainty propagation. Therefore, the difference between the estimated and found uncertainty becomes smaller with an increasing number of parameters that influence uncertainty.

The estimation of uncertainty replaces a full validation of the analytical method. It generates the necessary information at the right time. The statistical information received from the analysis can be used for the interpretation of the data and finally the analysis is designed to the customer's needs. In this case measurement uncertainty is a good alternative to validation.

The second example illustrates the determination of water content which is an important characteristic for chemical substances and is needed in many chemical reactions. It is usually determined by Karl Fischer (KF) titration. The water content determined in our laboratory ranges from <0.1% to about 30%. It is widely known that KF water titrations may be influenced by the sample and, depending on the range, some other parameters may significantly affect the uncertainty.

Because the concentration of a reagent is often determined on the basis of the water content of the reaction mixture, uncertainty information for the water determination is needed. The problem in a development environment is that various synthesis routes are tested. If salts are used in a chemical reaction it is usual that chemists test different counter ions for the optimization of the synthesis. However, different salts are rarely tested in the analytical department. One of the problems of the variation of counter ions is that the hygroscopicity of the salts is often different.

Two independent steps have to be followed:
1. Substance dependent influences have to be observed. In most cases the chemical structure of the component is known and therefore serious mistakes can be avoided. Titration software and various KF reagents have to be available and standard operation procedures have to be established.
2. The individual uncertainty has to be considered. Therefore a preliminary specification has to be set and a type B estimation of uncertainty can be used to show if there is any problem arising from the data and the specification limit [4].

The first step is a general task using the appropriate equipment and has been established in our laboratory. However, the second part needs to be discussed in detail.

Suppose there is a chemical reaction with reagent A where:

A + B -> C. (1)

The alternative reaction for A may also be with water to D.

A + H2O -> D. (2)

The reaction with water is often faster than with B. Because of the low molecular weight of water 0.5% (w/w) of water in B may be 10 mol %. Therefore an excess of at least 10 mol % of A might be needed to complete the reaction. The water content is determined in the
200 S. Küppers
Table 4 Estimation of the measurement uncertainty for the titration of water (example performed manually)

Results from the titrations (% (w/w) from the weight of the sample)             0.20%; 0.23%    0.215% +0.0150%
Minimum uncertainty of 1.1%                                                     1.1%            +0.00236%
Hygroscopicity (estimated on the basis of experience with amine compounds)      5%              +0.00621%
Uncertainty of the titer                                                        1%              +0.00124%
Reaction of the sample with the solvent                                         5%              +0.00621%
Result reported including uncertainty                                                           0.25%

analytical department. For example the value determined by KF titration is 0.22% (w/w) from two measurements with 0.2% (w/w) and 0.23% (w/w) as the individual values. The question that has to be asked is: Is 0.22% of the weight always smaller than 0.25% (w/w)? What we need is the measurement uncertainty added to the "found" value. If this value is smaller than the limit (0.25%) the pre-calculated amount of the reagents can be used.

The model set up consists of two terms:

Y = X*(1 + U)

where X is the mean value of the measurements plus the standard uncertainty. U is the sum of the various influence parameters on measurement uncertainty:
- hygroscopicity
- uncertainty of the titer
- reaction of the KF solvent with the sample
- a minimum uncertainty constant of 1.1% (taken from the control chart of the standard reference material) covering balance uncertainties, the influence of water in the atmosphere and the instrument uncertainty with the detection of the end of titration.

The factors for the three contributions mentioned above are estimated on the basis of experience. The calculation is performed using a computer program [5]. This makes the decision easy and fast. In this case the type A estimation on the basis of the results and the type B estimation of the influence factors are combined. An example is given in Table 4 in a compressed form.

The alternative would be an experimental validation. In this case the uncertainty estimation has proven to be a very useful alternative to validation, although, on the basis of experience, the estimate of hygroscopicity is difficult and may lead to incorrect values.

Conclusions

Method validation is a process used to confirm that an analytical procedure employed for a specific test is suitable for the intended use. The examples above show that the estimation of measurement uncertainty is a viable alternative to validation. The estimation of measurement uncertainty can be used to confirm that an analytical procedure is suitable for the intended use. If the estimation of measurement uncertainty is used together with validations both the uncertainty estimation and the validation have their own place in a developmental environment. The major advantages of measurement uncertainty are that it is fast and efficient. Normally, if the analytical method is understood by the laboratory very similar results are found for the estimation of uncertainty and for the classical variation of critical parameters, namely, validation. The decision on how to perform a validation should be made on a case to case basis depending on experience.

Acknowledgements Fruitful scientific discussions with Dr. P. Blaszkiewicz, Schering AG, Berlin and Dr. W. Hasselbarth, BAM, Berlin are gratefully acknowledged.
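The figures in Table 4 can be reproduced from the two titration results and the four influence factors of the model Y = X*(1 + U). The sketch below is a reading of the table, not the commercial program cited as [5]: dividing the three experience-based factors by √3 (a rectangular distribution) while leaving the 1.1% control-chart constant undivided is an inference from the tabulated numbers, and the plain additive combination at the end is likewise an assumption that happens to round to the reported 0.25%. All names are illustrative.

```python
import math
import statistics

titrations = [0.20, 0.23]  # % (w/w), individual KF results from Table 4

mean = statistics.mean(titrations)
# Type A: standard uncertainty of the mean of the two results
u_mean = statistics.stdev(titrations) / math.sqrt(len(titrations))


def contribution(mean_value, percent, rectangular=True):
    """Influence factor as % (w/w); optionally treated as a rectangular half-width."""
    half_width = mean_value * percent / 100.0
    return half_width / math.sqrt(3) if rectangular else half_width


contributions = {
    "minimum uncertainty (1.1 %)": contribution(mean, 1.1, rectangular=False),
    "hygroscopicity (5 %)": contribution(mean, 5.0),
    "titer (1 %)": contribution(mean, 1.0),
    "reaction with solvent (5 %)": contribution(mean, 5.0),
}

# Assumed additive combination: mean + type A term + all type B terms
upper_value = mean + u_mean + sum(contributions.values())

print(f"mean = {mean:.3f} %, u(mean) = {u_mean:.4f} %")
for label, value in contributions.items():
    print(f"{label}: +{value:.5f} %")
print(f"value including uncertainty = {upper_value:.3f} %")
```

Under these assumptions the value including uncertainty sits just below 0.25%, which is why the decision against the 0.25% limit is a close one in this example.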
References
1. Küppers S (1997) Accred Qual Assur 2:30-35
2. Küppers S (1997) Accred Qual Assur 2:338-341
3. Henrion A, Dube G, Richter W (1997) Fres J Anal Chem 358:506-508
4. Renger B (1997) Pharm Tech Europe 9:36-44
5. Evaluation of uncertainty (1997) R. Metrodata GmbH, Grenzach-Whylen, Germany
Accred Qual Assur (1998) 3:155-160
© Springer-Verlag 1998

R. J. N. Bettencourt da Silva (✉) · M. F. G. F. C. Camões
CECUL, Faculdade de Ciências da Universidade de Lisboa, P-1700 Lisbon, Portugal

J. Seabra e Barros
Instituto Nacional de Engenharia e Tecnologia Industrial, Estrada do Paço do Lumiar, P-1699 Lisbon Codex, Portugal

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

Abstract Every analytical result should be expressed with some indication of its quality. The uncertainty as defined by Eurachem ("parameter associated with the result of a measurement that characterises the dispersion of the values that could reasonably be attributed to the, ..., quantity subjected to measurement") is a good tool to accomplish this goal in quantitative analysis. Eurachem has produced a guide to the estimation of the uncertainty attached to an analytical result. Indeed, the estimation of the total uncertainty by using uncertainty propagation laws is components-dependent. The estimation of some of those components is based on subjective criteria. The identification of the uncertainty sources and of their importance, for the same method, can vary from analyst to analyst. It is important to develop tools which will support each choice and approximation. In this work, the comparison of an estimated uncertainty with an experimentally assessed one, through a variance test, is performed. This approach is applied to the determination by atomic absorption of manganese in digested samples of lettuce leaves. The total uncertainty estimation is calculated assuming 100% digestion efficiency with negligible uncertainty. This assumption was tested.

Key words Uncertainty · Validation · Quality control · Solid samples · Atomic spectrometry
many further doubts concerning observation of legal limits and protects the user of the analytical data from financial losses. The use of uncertainty instead of less informative percentage criteria brings considerable benefits to the daily quality control.

Despite the analyst's experience, some analytical steps like sampling and recovery are particularly difficult to estimate. Mechanisms should be developed to support certain choices or approximations. The comparison of an estimated uncertainty with the experimentally assessed one can be of help.

In this work the Eurachem guide [1] was used for the estimation of uncertainties involved in the determination by electrothermic atomic absorption spectrometry (EAAS) of manganese in digested lettuce leaves. The total uncertainty estimation was calculated assuming a 100% digestion efficiency with negligible uncertainty. The experimental precision was compared with an estimated one for the purpose of validation of the proposed method of evaluation. After this validation the uncertainty estimation was used in an accuracy test and in routine analysis with the support of a spreadsheet programme.

The uncertainty estimation can be divided into four steps [1]: (1) specification, (2) identification of uncertainty sources, (3) quantification of uncertainty components, and (4) total uncertainty estimation.

Specification

A dry-base content determination method is proposed, the sample moisture determination being done in parallel. Figure 1 represents the different steps of the analysis. The analytical procedure was developed for laboratory samples. Sampling uncertainties were not considered.

The dry-base correction factor, fcorr., is calculated from the weights of the vial (z), vial plus non-dried sample (x) and vial plus dry sample (y)

$$f_{corr.} = 1 - \frac{x - y}{x - z} \qquad (1)$$

The sample metal content, M, is obtained from the interpolated concentration in the calibration curve, CInter., the mass of the diluted digested sample, a, and the dilution factor, fdil. (digested sample volume times dilution ratio).

$$M = \frac{C_{Inter.} \times f_{dil.}}{a} \qquad (2)$$

The dry-base content, D, is obtained by application of the correction factor, fcorr., to the metal content, M.

$$D = f_{corr.} \times M \qquad (3)$$

Identification of uncertainty sources

The uncertainty associated with the determination of fcorr. is estimated from the combination of the three involved weighing steps, Fig. 1a.

The uncertainty associated with the sample metal content is estimated from the weighing, dilution and interpolation sources (Fig. 1b). The model used for the calculation of the contribution from the interpolation source assumes negligible standards preparation uncertainty when compared with the instrumental random oscillation [5, 10].

Quantification of the uncertainty components

The quantification of the uncertainty is divided into equally treated operations:

The weighing operations are present in the dry-base correction factor (three) and in the sample metal content (one). Two contributions for the associated uncertainty, $\sigma_{Weighing}$, were studied:

1. Uncertainty associated with the repeatability of the weighing operations, $\sigma_{Repeat.}^{Balance}$, is obtained directly from the standard deviation of successive weighing operations. The corresponding degrees of freedom are the number of replicates minus 1.

Fig. 1 Proposed method for dry-base metal content determination in lettuce leaves (panels: i) the dry base correction factor determination; ii) metal content in sample quantification)
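Equations (1) to (3) and the first-order propagation of the three weighing uncertainties into the dry-base correction factor can be sketched as follows. The weights, the weighing uncertainty, the interpolated concentration and the dilution factor used here are hypothetical numbers chosen only to exercise the formulas; the partial derivatives are obtained by differentiating Eq. (1) with respect to x, y and z, and the three weighings are assumed to be uncorrelated.

```python
import math


def dry_base_factor(x, y, z):
    """Eq. (1): f_corr = 1 - (x - y) / (x - z)."""
    return 1.0 - (x - y) / (x - z)


def metal_content(c_inter, f_dil, a):
    """Eq. (2): M = C_inter * f_dil / a."""
    return c_inter * f_dil / a


def u_dry_base_factor(x, y, z, u_x, u_y, u_z):
    """First-order propagation of the three weighing uncertainties into f_corr."""
    span = x - z
    df_dx = -(y - z) / span ** 2
    df_dy = 1.0 / span
    df_dz = -(x - y) / span ** 2
    return math.sqrt((df_dx * u_x) ** 2 + (df_dy * u_y) ** 2 + (df_dz * u_z) ** 2)


# Hypothetical weighings in grams: vial, vial + fresh sample, vial + dried sample
z, x, y = 10.0, 12.0, 10.2
u_weighing = 0.0002  # hypothetical standard uncertainty of each weighing, g

f_corr = dry_base_factor(x, y, z)
u_f_corr = u_dry_base_factor(x, y, z, u_weighing, u_weighing, u_weighing)

m = metal_content(c_inter=2.0, f_dil=50.0, a=0.5)  # hypothetical values and units
d = f_corr * m                                      # Eq. (3)

print(f"f_corr = {f_corr:.4f} +/- {u_f_corr:.6f}")
print(f"M = {m:.1f}, D = {d:.2f}")
```

With a sample that loses most of its mass on drying, the weighing of the dried sample (y) and of the vial (z) carry far larger sensitivity coefficients than the weighing of the fresh sample (x).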
Validation of the uncertainty evaluation for the determination of metals in solid samples by atomic spectrometry 203
$$\sigma_{Calib.}^{Balance} = \frac{2 \times Tolerance}{\sqrt{12}} \qquad (4)$$

where the Tolerance is obtained from the balance calibration certificate.

The Eurachem guide suggests that when the uncertainty components are described by a confidence interval, α ± β, without information on degrees of freedom, the associated uncertainty is 2β/√12, which represents the uncertainty of a 2β amplitude rectangular distribution. These uncertainties are designated type B. The number of degrees of freedom associated with the $\sigma_{Calib.}^{Balance}$ type B estimation, $\nu_{Calib.}^{Balance}$, is approximately [1] equal to

$$\nu_{Calib.}^{Balance} \approx \frac{1}{2} \left[ \frac{\sigma_{Calib.}^{Balance}}{m_{Calib.}^{Balance}} \right]^{-2} \qquad (5)$$

where $m_{Calib.}^{Balance}$ is the mass associated with the balance calibration tolerance.

The two uncertainties are then combined

$$\sigma_{Weighing} = \sqrt{\left(\sigma_{Calib.}^{Balance}\right)^2 + \left(\sigma_{Repeat.}^{Balance}\right)^2} \qquad (6)$$

The corresponding degrees of freedom are calculated by the Welch-Satterthwaite equation [1-3]. When the pairs (uncertainty, degrees of freedom) $(\sigma_a, \nu_a), (\sigma_b, \nu_b), (\sigma_c, \nu_c), (\sigma_d, \nu_d), \ldots$ for the quantities a, b, c, d, in a function V = f(a, b, c, d, ...) are taken into account, then the effective number of degrees of freedom associated with V, $\nu_V$, is

$$\nu_V = \frac{\sigma_V^4}{\left(\frac{\partial V}{\partial a}\right)^4 \frac{\sigma_a^4}{\nu_a} + \left(\frac{\partial V}{\partial b}\right)^4 \frac{\sigma_b^4}{\nu_b} + \left(\frac{\partial V}{\partial c}\right)^4 \frac{\sigma_c^4}{\nu_c} + \left(\frac{\partial V}{\partial d}\right)^4 \frac{\sigma_d^4}{\nu_d} + \ldots} \qquad (7)$$

The calculation of the uncertainty and of the degrees of freedom associated with the sample weight is by the direct application of Eqs. 4-7. The calculations of the dry-base correction factor are more elaborate.

3. Uncertainty associated with the dry base factor

The dry base correction factor is a function of three weighing operations (Eq. 1). To estimate the uncertainty, $\sigma_{f_{corr.}}$, associated with the fcorr., the general equation (Eq. 8) was used [1]

$$\sigma_{f_{corr.}} = \sqrt{\left(\frac{\partial f_{corr.}}{\partial x}\right)^2 \sigma_x^2 + \left(\frac{\partial f_{corr.}}{\partial y}\right)^2 \sigma_y^2 + \left(\frac{\partial f_{corr.}}{\partial z}\right)^2 \sigma_z^2} \qquad (8)$$

It is therefore

$$\sigma_{f_{corr.}} = \sqrt{\left(\frac{y - z}{(x - z)^2}\right)^2 \sigma_x^2 + \left(\frac{1}{x - z}\right)^2 \sigma_y^2 + \left(\frac{x - y}{(x - z)^2}\right)^2 \sigma_z^2} \qquad (9)$$

The values of σx, σy and σz are then calculated as described in the section "Gravimetric operations" above. The number of degrees of freedom is calculated by the Welch-Satterthwaite equation (Eq. 7). The application of a spreadsheet program available in the literature simplifies this task [3]. However, the classical approach is more flexible for different experimental configurations or for one or more dilution steps, and is also easily automated.

Volumetric operations

The uncertainties associated with the volumetric operations were calculated from the combination of two [(1) and (2) below] or three [(1), (2) and (3) below] components:

1. Uncertainty associated with volume calibrations, $\sigma_{Calib.}^{Vol.}$

$$\sigma_{Calib.}^{Vol.} = \frac{2 \times Tolerance}{\sqrt{12}} \qquad (10)$$

where the information on this tolerance is normally available with the instrument in the form: volumetric instrument volume ± tolerance. This type B uncertainty estimation has the same treatment as the one reported in Eq. 5 for the degrees of freedom

$$\nu_{Calib.}^{Vol.} = \frac{1}{2} \left[ \frac{\sigma_{Calib.}^{Vol.}}{V} \right]^{-2} \qquad (11)$$

where $\nu_{Calib.}^{Vol.}$ is the number of degrees of freedom associated with $\sigma_{Calib.}^{Vol.}$ for a certain volume V.

2. Uncertainty associated with volume repeatability tests, $\sigma_{Repeat.}^{Vol.}$

The $\sigma_{Repeat.}^{Vol.}$ and the corresponding degrees of freedom, $\nu_{Repeat.}^{Vol.}$, are also extracted directly from the repeatability tests. Such tests consist of successive weighings of water volumes measured by the instrument. The observed standard deviation is a function of the analyst's expertise.

3. Uncertainty associated with the use of volumetric equipment at a temperature different from that of calibration, $\sigma_{Temp.}^{Vol.}$

This third component corrects for errors associated with the use of 20 °C calibrated material in 20 ± 3 °C solutions. When two consecutive volumetric operations are performed at the same temperature, as is the case in dilution stages, they become self-corrected for this effect.

The glass instrument expansion coefficient is much smaller than that of the solution. For this reason we
204 R. J. N. Bettencourt da Silva· M. F. G. F. C. Camoes· J. Seabra e Barros
have only calculated the latter. For a temperature oscillation of $\Delta T = \pm 3$ K with a 95% significance level and for a volumetric expansion coefficient of pure water of $2.1 \times 10^{-4}$ °C$^{-1}$ (our solutions can be treated as pure water because of their low concentrations), the 95% volume confidence interval becomes $V \pm V \times 3 \times 2.1 \times 10^{-4}$. Dividing the expanded uncertainty by the Student t value, t(∞, 95%) = 1.96, we obtain the temperature effect component uncertainty

$\sigma_{\mathrm{Temp.}}^{\mathrm{Vol.}} = \dfrac{V \times 3 \times 2.1 \times 10^{-4}}{1.96}$ (12)

The number of degrees of freedom due to the temperature effect can also be estimated as for $\nu_{\mathrm{Calib.}}^{\mathrm{Vol.}}$ (Eq. 11), substituting $\sigma_{\mathrm{Calib.}}^{\mathrm{Vol.}}$ by $\sigma_{\mathrm{Temp.}}^{\mathrm{Vol.}}$.
These components are then combined to calculate the volume uncertainty, $\sigma_{\mathrm{Vol.}}$

$\sigma_{\mathrm{Vol.}} = \sqrt{\left(\sigma_{\mathrm{Calib.}}^{\mathrm{Vol.}}\right)^2 + \left(\sigma_{\mathrm{Repeat.}}^{\mathrm{Vol.}}\right)^2 + \left(\sigma_{\mathrm{Temp.}}^{\mathrm{Vol.}}\right)^2}$ (13)

The number of degrees of freedom associated with $\sigma_{\mathrm{Vol.}}$ can also be calculated by the Welch-Satterthwaite equation.

4. Uncertainty associated with the dilution factor
Our analytical method has three volumetric steps that can be combined as a dilution factor, $f_{\mathrm{dil.}}$, whose uncertainty, $\sigma_{f_{\mathrm{dil.}}}$, can easily be estimated by:

$\dfrac{\sigma_{f_{\mathrm{dil.}}}}{f_{\mathrm{dil.}}} = \sqrt{\left(\dfrac{\sigma_{\mathrm{Vol.}}^{\mathrm{DSV}}}{V_{\mathrm{DSV}}}\right)^2 + \left(\dfrac{\sigma_{\mathrm{Vol.}}^{P}}{V_{P}}\right)^2 + \left(\dfrac{\sigma_{\mathrm{Vol.}}^{V}}{V_{V}}\right)^2}$ (14)

where DSV, P and V stand respectively for digested solution volume, dilution operation pipette and dilution operation vial; $\sigma_{\mathrm{Vol.}}$ and V represent respectively each corresponding volumetric uncertainty and volume. As in the other cases, the degrees of freedom were calculated by the Welch-Satterthwaite equation.

Sample signal interpolation from a calibration curve

The mathematical model used to describe our calibration curve was validated by the method of Penninckx et al. [4]. At this stage we proved the good fitting properties of the unweighted linear model to our calibration curve. With this treatment we aimed not only at accuracy but also at the estimation of more realistic sample signal interpolation uncertainties. These uncertainties were obtained by the application of an ISO international standard [5].
The instrument was calibrated with four standards (0-2-4-6 µg/L for Mn) with three measurement replicates each [4]. Samples and control standard (4 µg/L for Mn) were also measured three times. The control standard was analysed for calibration curve quality control (see "Quality control").

Total uncertainty estimation

The total uncertainty estimation, $\sigma_T$, is a function of the dry-base correction factor uncertainty, $\sigma_{f_{\mathrm{corr.}}}$, of the uncertainty associated with the analysis sample weighing operation, $\sigma_{\mathrm{Weighing}}^{\mathrm{sample}}$, of the dilution factor, $\sigma_{f_{\mathrm{dil.}}}$, and of the instrumental calibration interpolated uncertainty, $\sigma_{C_{\mathrm{inter.}}}$. These four quantities combine their uncertainties in Eq. 15, where D represents the dry-base sample metal content and a has the same meaning as in Eq. 2. The other quantities have already been described.
The expanded uncertainty can then be estimated after the calculation of the effective number of degrees of freedom, df (Eq. 7). Therefore the coverage factor used was the Student t defined for that number and a 95% significance level, t(df, 95%). The estimated confidence interval is defined by

$D \pm \sigma_T \cdot t(df, 95\%)$ (16)

Quality control

Ideally, the readings of the instruments for each sample and for each standard should be random [6-7]. Normally, the instrument software separates the calibration from the sample reading. Although this allows an immediate calculation, it can produce gross errors if the operator does not verify the drift of the instrument response. For this reason, the calibration curves should be tested from time to time by reading a well-known control standard. This standard can also be prepared from a mother solution different from that of the calibration standards, for stability and preparation checking.
Normally, the laboratories use fixed and inflexible criteria for this control. They define a limit to the percentage difference between the expected and the obtained value, and in low precision techniques they are obliged to increase this value. Assuming the uncertainty associated with the control standard preparation to be negligible when compared to the instrumental uncertainty, the case-to-case interpolation uncertainties can be used as a fit for each case. If the observed confidence interval does not include the expected value, there is reason to think that the system is not under control. The instrumental deviation from control can be used as a guide for instrumental checking or as a warning of the inadequacy of the chosen mathematical model for the calibration.
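The volumetric uncertainty chain described above (Eqs. 10 to 13, with the degrees of freedom of Eq. 11 and a Welch-Satterthwaite combination) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pipette volume, tolerance and repeatability figures are hypothetical, the function names are not from the paper, and the Welch-Satterthwaite form used here assumes independent components with unit sensitivity coefficients.

```python
import math


def type_b_uncertainty(tolerance):
    """Standard uncertainty for a 'value +/- tolerance' statement (Eqs. 4 and 10)."""
    return 2.0 * tolerance / math.sqrt(12.0)


def type_b_dof(sigma, quantity):
    """Approximate degrees of freedom of a type B estimate (Eqs. 5 and 11)."""
    return 0.5 * (sigma / quantity) ** -2


def temperature_uncertainty(volume, delta_t=3.0, expansion=2.1e-4, t_value=1.96):
    """95% half-width V * dT * expansion divided by t(inf, 95%) (Eq. 12)."""
    return volume * delta_t * expansion / t_value


def combine(components):
    """Root-sum-of-squares combination (Eqs. 6 and 13)."""
    return math.sqrt(sum(c ** 2 for c in components))


def effective_dof(components, dofs):
    """Welch-Satterthwaite effective degrees of freedom (unit sensitivities assumed)."""
    return combine(components) ** 4 / sum(c ** 4 / nu for c, nu in zip(components, dofs))


# Hypothetical 10 mL pipette: tolerance 0.02 mL, repeatability 0.008 mL from 10 weighings.
volume = 10.0
s_cal = type_b_uncertainty(0.02)
s_rep, nu_rep = 0.008, 9
s_temp = temperature_uncertainty(volume)

parts = [s_cal, s_rep, s_temp]
dofs = [type_b_dof(s_cal, volume), nu_rep, type_b_dof(s_temp, volume)]
s_vol = combine(parts)
nu_vol = effective_dof(parts, dofs)
print(f"sigma_Vol = {s_vol:.4f} mL, effective degrees of freedom = {nu_vol:.0f}")
```

In this made-up case the repeatability term carries few degrees of freedom, so it dominates the effective number even though the calibration term is the larger uncertainty.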
Validation of the uncertainty estimation

[...] ability of a method to produce reliable results [8]. An analytical result should be expressed along with a confidence interval and a confidence level. The confidence [...] interval width. Therefore the validation depends on the reliability of the confidence interval width estimation. The accuracy test can be performed exactly only after that step.
References
1. Eurachem (1995) Quantifying uncertainty in analytical measurement, version 6
2. ISO (1993) Guide to the expression of uncertainty in measurement, Switzerland
3. Kragten J (1994) Analyst 119:2161-2165
4. Penninckx W, Hartmann C, Massart DL, Smeyers-Verbeke J (1996) J Anal At Spectrom 11:237-246
5. ISO International Standard 8466-1 (1990) Water quality - calibration and evaluation of analytical methods and estimation of performance characteristics - Part 1: Statistical evaluation of performance characteristics, Geneva
6. Analytical Methods Committee (1994) Analyst 119:2363-2366
7. Staats G (1995) Fresenius J Anal Chem 352:413-419
8. Taylor JK (1983) Anal Chem 55:600A-608A
9. Grubbs FE, Beck G (1972) Technometrics 14:847-854
10. Miller JC, Miller JN (1988) Statistics for analytical chemistry (2nd edn). Wiley, UK
11. Deaker M, Maher W (1995) J Anal At Spectrom 10:423-431
12. Soares ME, Bastos ML, Carvalho F, Ferreira M (1995) At Spectrosc 4:149-153
13. NIST (1994) Certificate of Analysis SRM1570a Trace elements in spinach leaves
14. Penninckx W, Smeyers-Verbeke J, Vankeerberghen P, Massart DL (1996) Anal Chem 68:481-489
Accred Qual Assur (2000) 5: 495-4911
© Springer-Verlag 2000
Color standard   Cu      Ni      Zn      Cr      As      Cd      Pb
no.              mg/L    mg/L    mg/L    mg/L    mg/L    mg/L    mg/L
1                0       0       0       0       0       0       0
2                0.3     0.5     0.1     0.1     0.1     0.01    0.3
3                0.6     1.0     0.2     0.2     0.5     0.03    0.6
4                1.0     1.5     0.3     0.35    1.0     0.05    1.0
5                1.5     2.0     0.4     0.6     1.7     0.07    -
6                2.0     3.0     0.5     1.0     3.0     0.1     -
7                3.0     4.0     0.7     1.8     -       0.3     -
8                5.0     6.0     1.0     3.0     -       0.5     -
9                7.0     8.0     2.0     6.0     -       1.0     -
10               10.0    10.0    5.0     10.0    -       -       -
standards (half-value reading). The exact procedure for the rapid tests is described in [5].

Requirements for rapid tests

Typically, the aim of any analytical characterization of wastes and contaminated soils is to answer the question whether a specific critical value is exceeded or not. This may have several consequences: for instance a waste is classified as hazardous waste and must be disposed of accordingly, or a contaminated area must be cleaned up by decontamination procedures.
A critical value (cv) is definitely exceeded if the analytical result (c), taken together with its confidence interval (Δc), is greater than that critical value.
The total error of an analytical result is given by three terms: uncertainty of the sampling process, of sample preparation, and of analytical determination. For the propagation, uncertainty must be expressed in terms of the variance, which is equal to the square of the standard deviation s.

$s_{\mathrm{total}}^2 = s_{\mathrm{samp.}}^2 + s_{\mathrm{prep.}}^2 + s_{\mathrm{anal.}}^2$

If we take a numerical example from the domain of inhomogeneous samples like solid wastes and soils, the sampling error is predominant by far: we assume the relative standard deviation due to the sampling error as being 50%, that due to the sample preparation 10%, and that due to the analytical determination 5%. This gives:

$s_{\mathrm{total}} = \sqrt{50^2 + 10^2 + 5^2} = \sqrt{2625} = 51\%$

It is obvious from this very simple calculation that in practice the total error is in fact determined by the sampling process and only insignificantly by the analytical operation. Only if the analytical error is of the same order as the sampling error will its contribution to total error be noticeable. If we examine very heterogeneous materials like soils and wastes the sampling error is predominant. This fact has consequences for the analytical determination: it is not at all necessary to employ a very precise analytical method, which is in general costly, complex, and time-consuming, but rather a rapid method which is less precise but can be performed readily. These rapid tests allow us to generate a greater number of analytical data points in obviously less time, which provides for a more representative declaration of any inhomogeneous material.

Statistical requirements

Some important parameters and relations shall be defined:

Critical value (cv)

The decision whether a measured analytical result, given as a mean concentration c, is definitely (95% significance in our study) below the critical value can be expressed as: c + Δc < cv, where c = mean value and Δc = confidence interval.

Confidence interval

The confidence interval (Δc) is given by

$\Delta c = \dfrac{t(P,f) \times s}{\sqrt{n}}$ (1)

where n = number of parallel measurements obtained from one sample, t = Student's factor, s = standard deviation of c, and P = level of significance.
The term parallel measurements comprises the entire analytical procedure including sampling and sample preparation.

Degree of freedom

The number of degrees of freedom (n-1) for one sample denotes the number of control measurements per-
Statistical evaluation of uncertainty for rapid tests with discrete readings - examination of wastes and soils 209
[Fig. 1: x-axis m (number of samples in a sample family), 2 to 20; y-axis 0.00 to 5.00]
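The error budget and the confidence interval of Eq. 1 discussed above lend themselves to a short numerical sketch. The 50%, 10% and 5% relative standard deviations are the ones used in the text; the Student factor is supplied by the caller from a table (3.182 is the two-sided 95% value for f = 3), and the mean, standard deviation and critical values in the usage lines are hypothetical.

```python
import math


def total_rsd(*rsd_percent):
    """Combine independent relative standard deviations as variances."""
    return math.sqrt(sum(r ** 2 for r in rsd_percent))


def confidence_interval(t_value, s, n):
    """Eq. 1: delta_c = t(P, f) * s / sqrt(n)."""
    return t_value * s / math.sqrt(n)


def definitely_below(mean, delta_c, critical_value):
    """Decision rule from the text: c + delta_c < cv."""
    return mean + delta_c < critical_value


# Sampling 50 %, sample preparation 10 %, analytical determination 5 %.
s_total = total_rsd(50.0, 10.0, 5.0)
print(f"total relative standard deviation: {s_total:.0f} %")

# Hypothetical sample: n = 4 parallel measurements, s = 0.4 mg/L, mean 2.0 mg/L.
T_95_F3 = 3.182  # two-sided Student t, 95 %, f = 3 (tabulated value)
delta_c = confidence_interval(T_95_F3, s=0.4, n=4)
print(definitely_below(2.0, delta_c, critical_value=3.0))
```

Passing the t value in explicitly keeps the sketch free of any statistics library; a table lookup or a library quantile function could replace the constant.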
210 H. Malissa· W. Riepe
no range R is observed in these cases because the color standard sequence is too coarse. The question is how we can still estimate a realistic figure for the confidence interval even in these situations.
An example may illustrate the relevance: two samples (m = 2) are taken from a batch of galvanic sludge (site) and both are measured in duplicate (n = 2). The graduation of the standard window may lead to four identical readings. In this case R and hence the confidence interval would be zero, although it is well known that a certain degree of inhomogeneity simply exists. Therefore we must derive a figure from the measurement process which is, in fact, the range that could be just recognized, and introduce it instead of R.
It can be recognized from Table 1 that the levels Δc in the concentration scales are not equidistant. When taking the actual readings it will not be easy for the operator to assign intermediate concentrations by interpolation. However, it can be reasonably expected that he is able to set an imagined limit of photometric density, which will be situated halfway between two concentration levels, and may be able to allocate the actual sample density to the higher or lower level, respectively.
An upper and lower limit for each concentration reading $c_i$ may be defined:

Upper limit: $\dfrac{c_{(i+1)} + c_i}{2}$,
Lower limit: $\dfrac{c_i + c_{(i-1)}}{2}$,

where $c_i$ = concentration reading for window i, $c_{(i+1)}$ = concentration reading for window i + 1, and $c_{(i-1)}$ = concentration reading for window i - 1. All color densities within the interval between these limits are assigned to one and the same concentration $c_i$. This is equivalent to the statement that the distance between both limits can be regarded as a range $R_1$ of a method, for which a range cannot be derived from the measurement values themselves:

$R_1 = \dfrac{c_{(i+1)} - c_{(i-1)}}{2}$

This is introduced into Eq. 2 instead of R to give the confidence interval for a colorimetric procedure with discrete readings:

$\Delta c = \dfrac{t(P;f)}{d(n_j) \times \sqrt{n}} \times \dfrac{c_{(i+1)} - c_{(i-1)}}{2}$ (4)

With this expression a confidence interval for a sample family is obtained, which is dependent on the number of samples and determinations used to characterize the object as well as on the level of significance selected. Only those critical values that are different from the mean concentration by more than that confidence interval can be recognized as unambiguously different. It must be emphasized that this is a conversion of the uncertainty of the reading into a concentration uncertainty and does not necessarily reflect a real inhomogeneity.

Conclusions

When doing parallel analysis the first term in Eq. 4 is about 2 (for P = 95% and three or more samples), as can be seen from Fig. 1.
From this follows the confidence interval for the mean value:

$\Delta c \approx c_{(i+1)} - c_{(i-1)}$

This means that the difference between the adjacent higher and lower values of the reading step, added to the measured concentration c, is the test figure which must be compared with the critical value.
The critical value is not exceeded (with 95% level of significance) when the following condition is satisfied:

$c + (c_{(i+1)} - c_{(i-1)}) <$ critical value

This easy procedure allows us to use rapid tests with a chosen level of significance as a valuable analytical tool. If the analytical uncertainty is estimated as described above, it is found that Δc has about the same size as the analytical result c itself. A reliable decision (95% confidence level) that a critical value is not exceeded can be made only if the result c is not greater than half of the critical value itself.
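The handling of discrete readings described above can be sketched as follows. The level list is the Cu column of the color standard table; the chosen window and the critical values are hypothetical, and the function names are illustrative only. The last function applies the simplified condition from the Conclusions, which assumes the first term of Eq. 4 is about 2.

```python
CU_LEVELS = [0.0, 0.3, 0.6, 1.0, 1.5, 2.0, 3.0, 5.0, 7.0, 10.0]  # mg/L, Cu color standards


def reading_limits(levels, i):
    """Lower and upper limit of the color-density interval assigned to level i."""
    lower = (levels[i] + levels[i - 1]) / 2.0
    upper = (levels[i + 1] + levels[i]) / 2.0
    return lower, upper


def reading_range(levels, i):
    """R1 = (c[i+1] - c[i-1]) / 2, the just-recognizable range for level i."""
    return (levels[i + 1] - levels[i - 1]) / 2.0


def not_exceeded(levels, i, critical_value):
    """Simplified decision rule: c[i] + (c[i+1] - c[i-1]) < critical value."""
    return levels[i] + (levels[i + 1] - levels[i - 1]) < critical_value


i = 3  # window read as 1.0 mg/L (interior level, so both neighbours exist)
print(reading_limits(CU_LEVELS, i))
print(reading_range(CU_LEVELS, i))
print(not_exceeded(CU_LEVELS, i, critical_value=2.0))
```

The sketch only handles interior levels; the lowest and highest windows have a single neighbour and would need a rule the text does not give.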
References
1. Unger-Heumann M (1990) Strategy of Analytical Test Kits. Fresenius J Anal Chem 354: 803-800
2. Valcarcel M, Cardenas S, Gallego M (1999) Sample Screening Systems in Analytical Chemistry. Trends in Analytical Chemistry, vol 18, no 11, pp 085-094
3. Merck (1974) Untersuchungen von Wasser. Darmstadt, 9. Auflage
4. Götzl A, Malissa H, Riepe W (1997) Analytische Schnellerkennungsmethoden: Bewertung abzulagernder Abfälle und Kontrolle von Deponien. UWSF - Z Umweltchem Ökotox 9:245-248
5. Merck (1994) Applications
6. Doerffel K (1990) Statistik in der analytischen Chemie. Dt Verlag der Grundstoffind, Leipzig, p 28
7. Doerffel K (1990) Statistik in der analytischen Chemie. Dt Verlag der Grundstoffind, Leipzig, p 82
Accred Qual Assur (1998) 3: 122-126
© Springer-Verlag 1998
than those associated with macro- and microelements [3].
The possible causes of variability are present in all the analytical steps, which, in atomic absorption spectrometry, can be narrowed to the following three: sample preparation, mineralisation and instrumental measurement. In the recent past, to achieve satisfactory precision or reproducibility, the errors due to the instrumental techniques and/or the matrix mineralisation were investigated [4, 5, 6]. The quality control of these two steps is indeed simplified by the availability of reference materials.
Our attention was centred on sample grinding, an important step in sample preparation upon which sample homogeneity and possible contamination depend.
The object of this communication is to present the results obtained comparing the effects of two grinding devices on analyses for Pb, Cd, Ni and Cr determined in routine procedures by atomic absorption spectrometry with a graphite furnace.

Materials and methods

Chemical analysis

Two grinding machines, a planetary ball mill (pbm) (PM 4000-Retsch) with grinding jars and balls of agate versus a rotor-speed mill (Pulverisette 14-Fritsch) stainless steel grinder (ssg), were compared by analysing leaf samples of cucumber (Cucumis sativus L.), strawberry (Fragaria x ananassa Duch.), kiwivines (Actinidia deliciosa Liang et Ferg.), apple trees (Malus pumila Mill.) and grapevines (Vitis vinifera L.) from agricultural experimental plots under controlled conditions consisting of mulching treatments (field treatments) with composts of different origins.
Leaf samples were dried in a dust-free forced draft oven at 70°C overnight, then coarsely ground by hand, and two 25-g portions were taken for each grinding method (Table 1). Sample mineralisation was carried out according to the procedure recommended by the CII (Comité Inter-Instituts d'étude des techniques analytiques) [3] in a platinum capsule. The ash was treated with HNO3.
The instrumental measurement was performed on a Varian spectrometer equipped with a graphite tube atomiser and programmable autosampler (Spectra AA-400 Zeeman) with the parameters reported in Table 2.

Statistical analysis

Quality control

An experiment was performed $a_x$ times with two determinations each time, i.e. a total number of $2a_x = N_x$ determinations. Following Stringari et al. [7], the ANOVA SS due to the time, $SSA_x$, and error, $SSE_x$, and the mean squares $MSA_x$ and $MSE_x$ have been determined, followed by the test ratio:

$F = MSA_x / MSE_x$

the variance components:

s;,=MSE, and s~,=(MSAx-s;)/N.

and the reproducibility variance, which is the squared uncertainty for the measurand:

u 2 (x) =Sk = { s;.' + s~" for heterogeneous means, i.e. F significant; s~" for homogeneous means.

The control limits for the mean values are given by

Mx±ksk, yI72-1I(2Nx)
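The quality-control statistics above follow the pattern of a one-way analysis of variance with duplicate determinations. The sketch below uses the standard textbook formulation (error variance equal to MSE, between-occasion variance equal to (MSA - MSE) divided by the number of determinations per occasion); this is an assumption and may differ in detail from the expressions of Stringari et al. [7]. The data are invented for illustration.

```python
def anova_components(groups):
    """One-way ANOVA for equally sized groups (one group per occasion).

    Returns (MSA, MSE, F, error variance, between-occasion variance).
    """
    a = len(groups)     # number of occasions
    n = len(groups[0])  # determinations per occasion
    means = [sum(g) / n for g in groups]
    grand = sum(sum(g) for g in groups) / (a * n)
    ssa = n * sum((m - grand) ** 2 for m in means)
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msa = ssa / (a - 1)
    mse = sse / (a * (n - 1))
    f_ratio = msa / mse
    s_error2 = mse
    # Negative estimates are truncated to zero (a common convention, assumed here).
    s_between2 = max((msa - mse) / n, 0.0)
    return msa, mse, f_ratio, s_error2, s_between2


# Hypothetical duplicate determinations on three occasions.
data = [[10.0, 10.2], [10.6, 10.4], [9.8, 10.0]]
msa, mse, f_ratio, s_error2, s_between2 = anova_components(data)
print(f"F = {f_ratio:.2f}, reproducibility variance = {s_error2 + s_between2:.4f}")
```

Whether F is significant still has to be judged against a tabulated critical value for (a - 1, a(n - 1)) degrees of freedom; the sketch leaves that comparison to the reader.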
Table 1 Sample number and grinding parameters with planetary ball mill
Species       Number of samples   Tissue   No. of balls   Grinding time (min.)   Speed (rpm)
Cucumber      25                  Leaf     10             10                     300
Strawberry    25                  Leaf     12             40                     300
Kiwivines     10                  Leaf     12             30                     300
Apple trees   24                  Leaf     11             20                     300
Grapevines    32                  Leaf     12             40                     300
Analysis of variance

s~=(SSE+SSA)/(n -1)
s; = (SQM -s;;)/n
~(Y)=Vs;;-s;;
s~=lly;J!n
u(Y)=St

Reproducibility

Following the procedure shown in Stringari et al. [7], q repetitions of the determinations are considered. Thus the model of Eq. 1 generalises to

$Y_{ijk} = \mu + \sigma_i + \varepsilon_{ij} + \delta_{ijk}$ (5)

and the variance components of the model without $\delta_{ijk}$ are multiplied by q.

Results

The results of the above analyses are reported in Fig. 4. The horizontal length of the bars indicates the average ppm content for all elements and species for each grinding method.
The total horizontal length of each bar has been divided into portions proportional to the variation coefficients based on the variance components, to which the reproducibility variance has been added.

Lead

The ssg grinding method gave higher mean values in all the plant species (from 7% to 52%), and these differences were significant for cucumber and apple tree leaves. Moreover, it allowed us to highlight significant effects due to the field treatments in four of the five species, while with the pbm method there were significant effects only for apple trees and grapevines.
[Figure legend: bar segments show repeatability variance, error variance and treatment variance; pbm: planetary mill with balls and jars of agate; ssg: stainless steel grinder; a marker indicates a significant difference between pbm and ssg grinding]
Influence of two grinding methods on the uncertainty of determinations of heavy metals 215
References
1. CII - Comité Inter-Instituts d'étude des techniques analytiques (1993-1997) Compte Rendu de 67e, 68e, 69e, 70e, 71e, 72e Réunion
2. Martin-Prevel P, Gagnard J, Gautier P (1984) In: Martin-Prevel P, Gagnard J, Gautier P (eds) Plant analysis. Lavoisier, New York
3. BCR Catalogue. BCR Reference Materials. Community Bureau of Reference (BCR) Commission of European Communities, Brussels
4. Slavin W (1984) Graphite furnace AAS a source book. Perkin-Elmer, Ridgefield, Conn
5. Hoenig M, de Kersabiec AM (1990) L'atomisation électrothermique en spectrométrie d'absorption atomique. Masson, Paris
6. Hoenig M, Baeten H, Vanhentenryk S (in press) Anal Chim Acta
7. Stringari G, Moller F, Ceschini A, Failla O (1996) Comm Soil Sci Plant Anal 27, 5-8:1403-1416
Accred Qual Assur (1998) 3: 328-334
© Springer-Verlag 1998
Abstract The need for reliability of measurements supporting legal decisions in environmental policy or medical diagnosis and treatment is well known and widely accepted. This prerequisite can be met only by ensuring that legal measurements are accurate and traceable to national or international standards. Consequently, an outline of the organizational structure of the Romanian National Institute of Metrology (INM) for ensuring uniformity, consistency and accuracy of all measurements, including legal measurements performed in chemical laboratories, is presented. Since reliable measurements can only be accomplished within an appropriate traceability chain, the experience of the INM in identification and evaluation of measurement uncertainty in legal activities concerning the environment and health is reviewed. Practical examples of measurement uncertainty evaluation in spectrophotometric determination of five analytes commonly determined in environmental and clinical chemistry are described. The implications of measurement uncertainty for interpretation of regulatory compliance are discussed.

Key words Measurement uncertainty · Analytical chemistry · Environment · Clinical chemistry

M. Buzoianu (✉)
National Institute of Metrology,
Sos. Vitan-Bârzeşti No. 11,
75669 Bucharest, Romania
Tel.: +40-1-6344030
Fax: +40-1-330 15 33
quality of these results (i.e. measurement uncertainty) is reflected in regulatory compliance against limits is also discussed.

Outline of metrological assurance of legal measurements

In accordance with the Romanian Law of Metrology (issued in 1992), all measurements performed in production and testing of pharmaceuticals, in trade or in the fields of health, safety and environmental chemistry should be traceable to national or international standards, by the proper use of legal instruments, reference materials (RMs), and adequate methods of measurements. Consequently, the necessary metrological activities for legal measurements are:
1. the assurance of the legality of all instruments used by pattern tests and initial or periodical verification;
2. the development of RMs required by legal metrological norms;
3. the assessment of measurement uncertainty and the achievement of traceability.
In this respect, all instruments used in legal activities are subject to pattern approval of each model and any variants of that model. The performances of these instruments are evaluated and verified, using legal metrological norm (NML) methods and appropriate CRMs. Note that various types of CRMs developed, recognized and accepted for use for spectrophotometric systems are presented in Ref. [1]. Metrological assurance of uniformity and traceability of measurements in legal activities is coordinated and supervised by the Romanian Bureau of Legal Metrology (BRML), and carried out by the INM, 14 area-organized metrological inspectorates (IIJM) and a number of accredited metrological laboratories.
Founded in 1951, the INM's mission is to ensure a valid scientific background for uniformity, consistency and accuracy of all measurements in Romania, regardless of their field of application. The main activities of the INM are shown in Fig. 1.
Measurement uncertainty and traceability are very important for regulatory compliance against limits, when a good reliability of the analytic results and/or monitoring of toxic pollutants is needed. Therefore, much is being done by the INM to improve matters in the specific legal metrology of environmental chemistry and public health. Also, for comparability purposes, the INM organized several inter-laboratory studies using appropriate CRMs (single or multielement). The re-
[Fig. 1: chart headed "National Institute of Metrology"]
against single element CRMs (code 13.01), as indicated in NML 9-02-94 'Atomic absorption spectrometers for water pollution measurements'. A summary of the parameters of each calibration curve is presented in Table 1. An estimated standard uncertainty was evaluated starting from the linear calibration of the instrument (uncertainty of regression, residual standard deviation and uncertainty of calibration curve included) [4]. Then a standard uncertainty was determined, combining standard deviation of repeated measurements, correction of the calibration curve and the uncertainty of RMs. The results of the evaluation of these uncertainties are also presented in Table 1. A good agreement between the two standard uncertainties is observed. Limit ratios of 0.73 and 2.47 were calculated from the determined uncertainty and $U_{RM}$.
The concentration of phosphates in waste water was determined according to a national standard STAS 10064 'Surface and waste waters: determination of phosphates' by measuring the absorbance of the blue colour of a reduced phosphomolybdate complex. Several types of molecular absorption spectrophotometers (SPECORD M40, instrument 1; DR 2000, instrument 4 and CADAS 100, instrument 5), and photometers of different bandwidth were used. Instruments 2 and 3 were specialized for water measurements (AQUANAL type). The unknown sample of 0.250 ± 0.010 mg/l was prepared under well-controlled conditions. The measurement conditions and evaluation of measurement uncertainty are presented in Table 2. Starting from the experimental steps involved in each measurement method used, an estimated measurement uncertainty was calculated as the square sum of partial uncertainties for volume and absorbance measurements, preparation of the calibration standards and the calibration curve [5].
By statistical analysis of the results obtained on control RMs or CRMs, an observed measurement uncertainty was evaluated (taking into account repeated measurements, correction of the calibration curve, the calibration curve and the uncertainty of the RMs). Quite good agreement was found between the two values of measurement uncertainty evaluated starting from two different approaches. Furthermore, the experimental standard deviation of the mean value of concentration was determined using the analysis of variance of individual random effects according to [3]. Experimental variances of individual values, mean values and within parallel measurements for cadmium are
Table 1 Results on evaluation of the measurement uncertainty on cadmium determination in waste water
Measurement method: O-phosphate reacts with ammonium molybdate in acidic medium to produce a phosphomolybdate complex. This complex is then reduced to an intense molybdenum blue colour.

Instrument                    1           2           3           4           5           6
Steps considered:
1. sampling (ml)              50          10          50          25          2           50
2. methods of measurement: STAS (a) 10064 for instruments 1, 3 and 6; as indicated by the manufacturer for instruments 2, 4 and 5
3. volume of reagent (ml)     10          6 drops R1  10          -           0.2         10
                                          2 drops R2
4. final volume (ml)          100         10          100         25          2.2         100
5. calibration curve
   r                          0.9989      0.9988      0.9988      A·0.5722    Linear      0.9989
   a                          0.0168      0.0018      0.0863      -           -0.129 (k)  0.0933
   b                          0.8528      0.2589      1.2746      -           1.423 (F)   1.2086
   s0                         0.0450      0.0267      0.0660      -           -           0.0610
6. measurement conditions
   λ (nm)                     700         635         650         890         890         660
   time (min)                 30          5           30          2           10          30
   path length (mm)           10          15          10          23.5        25          25
(Δc/c) time                   0.016       0.000       -0.04       +0.03       -0.01       -0.04
(Δc/c) temp 25°C              0           0           0           0           0           0
Accuracy of the method, rel   0.05        0.05        0.05        0.01        0.04        0.05
Mathematical equation: (Ax - a)·f_sampl./b for instruments 1, 2, 3 and 6; 0.5722·Ax for instrument 4; F·Ax - k for instrument 5
Validation of the instrument with:
   neutral filters            Yes         Yes         Yes         No          No          No
   CRM 1 mg/l PO4             0.984       1.03        1.05        1.05        0.96        1.01
Absorbance measured on sample 0.246       0.069       0.424       0.472       0.275       0.392
Concentration of the sample
(mg/l)                        0.246       0.275       0.265       0.270       0.262       0.247
Estimated standard
uncertainty, rel              0.051       0.065       0.059       0.032       0.022       0.046
Determined standard
uncertainty, rel              0.047       0.051       0.060       0.030       0.022       0.044

(a) STAS: National standard STAS 10064 'Surface and waste waters: determination of phosphates'
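As a quick consistency check on the table above, the tabulated sample concentrations for instruments 4 and 5 can be recomputed from the listed absorbances and calibration constants. This is a sketch, not the author's procedure: for instrument 5 the tabulated concentration is reproduced only if the listed k = -0.129 is added as an intercept, which is an assumption about the sign convention of the extracted equation.

```python
def conc_instrument_4(absorbance, factor=0.5722):
    """Instrument 4: concentration = 0.5722 * Ax (constants from the table)."""
    return factor * absorbance


def conc_instrument_5(absorbance, slope=1.423, intercept=-0.129):
    """Instrument 5: slope F and constant k from the table.

    Assumption: k = -0.129 is treated as an additive intercept.
    """
    return slope * absorbance + intercept


c4 = conc_instrument_4(0.472)  # absorbance measured on the sample, instrument 4
c5 = conc_instrument_5(0.275)  # absorbance measured on the sample, instrument 5
print(f"instrument 4: {c4:.3f} mg/l, instrument 5: {c5:.3f} mg/l")
```

Both values agree with the tabulated concentrations (0.270 and 0.262 mg/l) to three decimals.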
end-point determination, and their values are indicated in Table 4. A relative measurement uncertainty of 0.058 has been obtained for glucose determination, 0.128 for urea and 0.025 for calcium. Note that the uncertainties of the CRMs used are indicated in parentheses in the table. The ratio between the uncertainty of CRMs and the measurement uncertainty evaluated for the above described analyses varies from 1.03 to 2.78, which is in acceptable agreement with the typically recommended value of 3.

Measurement uncertainty meaning in legal metrology

Measurement uncertainty is significant when interpreting an analytical result of a toxic substance concentration lying within particular limits. Unfortunately few legal limits are set with allowance for uncertainty. Several studies of comparability performed in the national area showed a quite large spread of the results obtained in legal activities. For instance spreads of 35% for Cd and Zn, and 25% for Cu and Cr in waste water have been reported [7]. Also, in clinical laboratories under routine conditions, the spread was lower than 4.9% for Na, 19% for K, 26.1% for Ca, 18.6% for Mg and 15.6% for glucose, asymmetrically distributed around the assigned values [1]. Most outliers were obtained in the absence of a reliable uncertainty budget and with insufficient quality assurance procedures. Nevertheless, limit results do not necessarily mean a higher measurement uncertainty. For instance, seven photometric systems of different photometric accuracy were used to determine nitrite
[Fig. 3: caption fragment "... using different photometric systems"; legend: NO2 determination, Fe determination, glucose determination; x-axis: Instruments 1 to 7]
and iron in water, and glucose in human serum. In each case a standard measurement uncertainty was evaluated as described above; the results are illustrated in Fig. 3 (light-grey columns). The left-hand column in each group shows the photometric uncertainty, evaluated from the manufacturer's specifications. Note that for Fe determination using instruments of the same photometric accuracy, the standard measurement uncertainty (rel) varied from 0.060 to 0.120.

In addition, note how important the confidence interval from Table 2 is when judging the compliance with limits. Measurement results from instruments 1, 3 and 5 need individual consideration if the limit is set with some allowance for measurement uncertainty.

Measurement uncertainty also has a major influence on the traceability chains related to legal spectrophotometric measurements. In such situations both the spectrophotometers and CRMs should be traceable, i.e. they need to be calibrated in a proper manner and with an adequate uncertainty. In this respect the ratio between the uncertainty of upper and lower measurements is very important. For physical standards used to calibrate photometric systems in legal metrology the ratio of 3 is most commonly followed. For concentration calibrations this ratio usually does not exceed 1 or 2.

Conclusions

This paper has examined the importance and legal implications of measurement uncertainty statements in environmental chemistry and in the public health sector.

It is now accepted that the quality of an analytical result relies on the uncertainty of the quoted value, evaluated mainly from the calibration and reproducibility of the measurement system, and from the uncertainty of calibration standards. However, evaluation of the overall uncertainty follows a complex procedure, which is influenced by the skill of the analyst.
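The remark above about judging compliance with limits can be illustrated with a minimal sketch. It assumes an upper limit and a symmetric interval result ± U; the function name `compliance`, the three-way classification and all numbers are hypothetical and are not taken from the paper.

```python
def compliance(result, expanded_u, limit):
    """Classify a result against an upper limit using the interval result +/- U.

    Assumption (not from the paper): the limit is an upper limit and the
    interval is symmetric around the reported result.
    """
    lower = result - expanded_u
    upper = result + expanded_u
    if upper <= limit:
        return "compliant"
    if lower > limit:
        return "non-compliant"
    # The interval straddles the limit: the result needs individual consideration.
    return "inconclusive"


# Illustrative numbers only
for value in (0.40, 0.48, 0.60):
    print(value, compliance(value, expanded_u=0.05, limit=0.50))
```

Results whose interval straddles the limit are the ones that, in the wording of the text, need individual consideration.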
References
1. Buzoianu M (1998) Fresenius J Anal Chem 360:479-485
2. Buzoianu M, Aboul-Enein H-Y (1997) Accred Qual Assur 2:11-17
3. Guide to the expression of uncertainty in measurements, ISO (1993), Geneva
4. ISO 8466 (1990) 1 Qualité de l'eau - Etalonnage et évaluation des méthodes d'analyse et estimation des caractères de performance. Evaluation statistique de la fonction linéaire d'étalonnage. ISO, Geneva
5. EURACHEM Guide: Quantifying uncertainty in analytical measurement, 1st edn (1995) Laboratory of the Government Chemist, London
6. Buzoianu M, Aboul-Enein H-Y (1997) Accred Qual Assur 2:186-192
7. Duta S, Buzoianu M (1996) Comparability of spectrophotometric measurement results in the Romanian Institute of Metrology. Proceedings of Central European Conference on Reference Materials, CERM '96, Slovakia
Accred Qual Assur (2001) 6:160-163
© Springer-Verlag 2001
Uncertainty evaluation in proficiency testing: state-of-the-art, challenges, and perspectives
A.M.H. van der Veen

Basic considerations for evaluating measurement uncertainty

The basis for proficiency testing is described in ISO Guide 43-1:1997 [3]. One of the tools necessary to assess the performance of the participating laboratories is an assigned value, which is used as a reference point. In this paper, the abbreviation PTRV (proficiency test reference value) will be used for this purpose. Classically, there are two ways to obtain a PTRV:

1. By prior measurement ("reference value")
2. From the participants' results ("consensus value")

Irrespective of the model chosen, the GUM [2] provides a framework for the evaluation of the measurement uncertainty with respect to the PTRV. From a fundamental point of view, there is no difference between the two ways of obtaining a PTRV. A practical example of working out the establishment of a PTRV using prior measurement is given elsewhere [4]. Although the process is not uncomplicated, the estimation of measurement uncertainty is certainly feasible.

When working with a consensus value, the philosophy is not different: the GUM can be implemented straightforwardly, as soon as the establishment of the consensus value is defined appropriately. There are however some practical difficulties to be overcome, which have mainly to do with the quality of the participants' data. It should be noted first that the quality of the PTRV is directly dependent on the quality of the participants' data. This will be reflected by the uncertainty of the PTRV as well. A further problem is the presence of suspicious results (e.g., outliers). It is not acceptable in a proficiency test to work without some policy to treat outliers.

In this paper, the establishment of a PTRV through consensus among participants will be revisited. There are a few different cases to be considered, in fact:

1. Results with credible uncertainty statements
2. Results with non-credible uncertainty statements
3. Results without uncertainty statements

Establishment of a PTRV through consensus is more complicated than through prior measurement. The reason for this is that it is more difficult to develop a set of assumptions and assertions that is in compliance with the data obtained, and a sufficient basis on which to develop an algorithm at the same time. The days are gone when all data from all participants could be thrown into a big "hat" and the consensus value would automatically come out. Building consensus values is probably one of the most complex tasks to be carried out by the organizer.

The other mainstream design, in fact with a PTRV based on prior measurement, is always easier to implement. The understanding of what is going on during the establishment of reference values is usually better than in the case of consensus values: consensus values are often used in cases too complex to be handled by reference values. This is often a result of a lack of understanding, in terms of modeling, of the measurement problem. Properties of the sample, matrix effects, extraction/destruction yields, etc. all contribute greatly to this lack of understanding. All these aspects, which may greatly influence the measurement results and therefore also their uncertainty, may lead to the conclusion that working with a consensus value is inevitable. So, this lack of understanding has more to do with the state-of-the-art in measurement science than with the skills of the team operating the proficiency test.

The topic of correlation between measurement results is a very critical one, and it is gaining more and more interest. The assumption of IID data (independent, identically distributed data) is easily made, but difficult to verify, and in most cases highly critical. If data are not IID, most of the statistics known do not work. Often, the problem is not so much in the distribution, it is more in the (in)dependence. Dependent data can already be observed in cases where all laboratories use the same pure substances for their calibration, for instance. This happens, for example, in PAH analysis, where there is only one series of certified pure substances available. Obviously, the purity data of these substances cannot be treated as being independent.

Both in testing and in calibration, correlation of data plays an important role. Ignoring the fact that data are correlated leads to wrong uncertainty estimates. The worst part of the message is that it is not even known whether this leads to over- or underestimation problems. As a result, it will just not work to ignore correlations. A safe practice is to drop the assumption of independence, and to work from there. It does make life somewhat more complicated under certain circumstances, but underestimation problems will be avoided.

The first important case to be considered is the case of credible uncertainty statements. The development of a procedure for the calculation of the consensus value does not differ from an approach suggested for evaluating key comparisons [5] which has also been demonstrated to work for the certification of reference materials [6]. In a recent paper by this author, an implementation of this recipe has been given for the case of reference materials. A disadvantage of the method is that a full description of all measurement models is required. This is - apart from the considerable extra effort - undesirable for another reason: it is far away from the present philosophy of proficiency testing as it violates the principle to work under "normal conditions".

The crux in designing an evaluation method is in the treatment of the data from the laboratories, in relation to the issue of correlations between results. In principle, for each laboratory pair in the proficiency test, the covariance should be computed. To the full extent, this has been established elsewhere [5, 6]. Here, a simpler method will be proposed. The task for the statistician responsible for the evaluation of the proficiency test is to make a fair estimate of the degree of correlation between two laboratory results. In order to make such an estimation, the organizer should have some insight into the methods, chemicals, and standards used, etc. In most proficiency tests, such information is obtained through an inquiry and/or regular participant-organizer communication.

Instead of requesting all measurement models from all laboratories to be reported as in the case of reference materials [6], the statistician should make a conservative estimate of the (possible) degree of correlation of results. This conservative value should be fed into the evaluation method as proposed for the reference materials, and the calculation can be started. Using the methodology of looking at the degrees of equivalence [5, 6], the unsatisfactory results can be removed and the consensus value can be established. Then, with the consensus value after removal of unsatisfactory results, the results of the laboratories can be assessed.

Case of non-credible uncertainty statements

This case cannot be compared with the case of credible uncertainty statements. The problem is that the organizer of the proficiency test gets a lot of information, but the value of this information is to a certain degree questionable. Obviously, the judgment as to whether information is credible or not is something that must be decided from case to case, but always beforehand. If, during a proficiency test, it appears that the wrong decision has been taken, then it is not an easy task to do a repair: the danger of violating other assumptions is great. Furthermore, it leaves the participants in doubt about the outcome of the proficiency test, something to be avoided at all cost.

If the uncertainty statements are not credible, it is better to refrain from using the uncertainty information at all for the establishment of the consensus value. It is better practice to use some kind of approximation, like for instance the following formula:

$$u^2(m) = \frac{s^2}{p} + \sum_{i=1}^{L} u_{i,\mathrm{other}}^2 \qquad (1)$$

where the last term reflects those uncertainty sources other than those randomized in the proficiency test. These uncertainties are considered to be more or less the same for all participants. The standard deviation s is just the standard deviation of the laboratory means, whereas m is the mean of these laboratory means. p denotes the number of laboratories. Further treatment of data can take place as usual, including outlier/straggler testing and/or removal if considered appropriate. It should be noted that the larger the proficiency test (p), the smaller the first term in the expression for the uncertainty, so the more important the second term becomes. This is a serious disadvantage of the approach, and cannot be solved easily, due to apparent problems in the uncertainty estimation.

The method can obviously also be carried out with robust estimation techniques, like for instance the use of the median and the (normalized) median of absolute deviations, MADe. The procedure remains the same, and usually the results from robust estimation techniques do not differ significantly from those after an evaluation using classical statistical techniques [7, 8].

The evaluation of the performance of the laboratories can now take place as in the case of the credible uncertainty statements, as the uncertainty of the consensus value is now available, and so are all uncertainty statements from the laboratories.

No uncertainty information available

In several cases it may still be impossible to come up with an uncertainty statement. This is probably the worst situation, as the customer of the laboratory does not have any indication about the reliability of the reported data. In the absence of uncertainty data it is obviously impossible to work with anything else than the reported laboratory averages. It still leaves the organizer of the proficiency test with the task of estimating the uncertainty of the consensus value. Typically, one could proceed as follows. The uncertainty at the level of a laboratory can be computed from

$$u^2(y) = s^2 + \sum_{i=1}^{L} u_{i,\mathrm{other}}^2 \qquad (2)$$

where all symbols have the same meaning as in the previous case. The major difference is that the division by p has vanished. This is a necessity, as only the reported value of the laboratory (y) can be assessed (there is no uncertainty information).

In this case, the well known z-score can still be used:

$$z = \frac{m - y}{u(y)} \qquad (3)$$

to assess the performance of the laboratories. The estimation of the uncertainty of a "typical" laboratory is a real burden, as the organizer must find ways to come up with an uncertainty statement in a complete lack of information. This situation should be avoided, or circumvented by working with fixed limits in the performance characteristics. This is a completely different philosophy, and outside the scope of this paper.

Role of homogeneity and stability of PTMs

Similarly to the uncertainty of the property values of (certified) reference materials, the uncertainty of the property values of PTMs (proficiency test materials) should also include the between-bottle homogeneity [9] and short- and long-term stability [10]. It should be noted that (1) the stability of the material is only of concern as long as the comparison is ongoing and (2) short-term stability might impose even greater problems than in the case of CRMs. This is due to the fact that PTMs are often more like "real-world" samples, in the sense that the measures taken to improve stability are less severe than for several groups of CRMs. The inclusion of these uncertainty components in the uncertainty of the PTM is analogous to the uncertainty model established for reference materials and is described elsewhere [6, 11].

Conclusions

[...]ing area in the same way as comparisons in the calibration area. The nature of the two comparisons is exactly the same: the problems of credible uncertainty statements as well as that of correlated variables also exist in both cases. The outcome of the restyled proficiency test must not differ from the classical approach, provided that the same assumptions are used and that they are "translated" correctly in the model.

Uncertainty calculations in the testing area are no longer completely different from those in the calibration area. There are differences, and both areas have their specific problems. There is a big task ahead for proficiency testing organizers in adapting to the new situation, but they can borrow a lot from existing techniques made available in comparisons in the calibration area. It will probably bring the science of experimental measurement and the science of uncertainty evaluation more closely and more consistently together, which will improve the learning cycle in proficiency testing considerably. It will give a boost to the understanding of how measurement systems behave, and this will allow for more direct and better heading actions if method improvement is necessary.
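As a minimal sketch of how Eqs. (1)-(3) above might be evaluated, the following Python fragment computes the consensus value, its uncertainty, the uncertainty at laboratory level and a z-score. The function names and all numbers are hypothetical and serve only as illustration; the classical mean and standard deviation are used, not the robust alternatives mentioned in the text.

```python
import math
import statistics


def consensus_uncertainty(lab_means, u_other):
    """Return (m, s, u_m, u_y) following the form of Eqs. (1) and (2).

    lab_means: reported laboratory means (one value per laboratory)
    u_other:   standard uncertainties of sources not randomized in the test
    """
    p = len(lab_means)                        # number of laboratories
    m = statistics.mean(lab_means)            # consensus value
    s = statistics.stdev(lab_means)           # standard deviation of the lab means
    other = sum(u * u for u in u_other)       # sum of the "other" variances
    u_m = math.sqrt(s * s / p + other)        # Eq. (1): uncertainty of the consensus value
    u_y = math.sqrt(s * s + other)            # Eq. (2): uncertainty of a single laboratory
    return m, s, u_m, u_y


def z_score(m, y, u_y):
    """Eq. (3): z = (m - y) / u(y)."""
    return (m - y) / u_y


# Made-up data for illustration only
lab_means = [10.1, 9.8, 10.4, 10.0, 9.7]
u_other = [0.05, 0.03]

m, s, u_m, u_y = consensus_uncertainty(lab_means, u_other)
for y in lab_means:
    print(f"y = {y:5.2f}  z = {z_score(m, y, u_y):+.2f}")
```

With many laboratories the term s²/p shrinks and the sum over the other sources dominates u(m), which is the disadvantage noted in the text.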
References
1. ISO (1999) International Organization for Standardization ISO 17025: General requirements for the competence of testing and calibration laboratories. ISO Geneva
2. ISO (1995) BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML: Guide to the expression of uncertainty in measurement, 1st edn, 2nd corrected print. ISO Geneva
3. ISO (1997) International Organization for Standardization: ISO/IEC Guide 43-1:1997: Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes. ISO Geneva
4. Van der Veen AMH, Horvat M, Milacic R, Buacr T, Repinc U, Scancar J, Jacimovic R (2001) Operation of a proficiency test of trace elements in sewage sludge with reference values. Accred Qual Assur (submitted for publication)
5. Nielsen L (1999) Evaluation of measurement intercomparisons by the method of least squares. DFM Rep 99-R39, presented at the EUROMET workshop on uncertainty calculations in key comparisons, Teddington, Nov 1999
6. Van der Veen AMH (2000) Determination of the certified value of a reference material appreciating the uncertainty statements obtained in the collaborative study. Presented at AMCTM 2000, Monte de Caparica, May 2000
7. Van der Veen AMH, Broos AJM (1996) Preparation and characterisation of coal samples and maceral concentrates for studies on gasification and combustion reactivity of coals in combined cycle processes. Draft Final Rep ECSC 7220/EC-036, Eygelshoven, NL
8. Cox MG (1999) A discussion of approaches for determining a reference value in the analysis of key-comparison data. NPL Rep CISE 42/99, Teddington, UK
9. Van der Veen AMH, Linsinger TPJ, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 2. Homogeneity study. Accred Qual Assur 6:26-30
10. Van der Veen AMH, Linsinger TPJ, Lamberty A, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 3. Stability study. Accred Qual Assur (in press)
11. Van der Veen AMH, Linsinger TPJ, Schimmel H, Lamberty A, Pauwels J (2001) Uncertainty calculations in the certification of reference materials. 4. Characterisation and certification. Accred Qual Assur (in press)
Accred Qual Assur (1998) 3:69-78
© Springer-Verlag 1998
throughout the European air pollution monitoring network, UV fluorescence [5] has virtually replaced TCM for the analysis of atmospheric sulphur dioxide.

The European Reference Laboratory of Air Pollution (ERLAP) is the reference laboratory for atmospheric pollution serving the European Commission. One of its duties is to maintain European standards for the calibration of NO2 and SO2 methods of analysis. ERLAP has chosen the permeation method [6] evaluated by gravimetry to produce reference standards for both NO2 and SO2 methods, but uses the static volumetric method [7] to cross-check the permeation method.

This paper describes ERLAP's implementation of the static volumetric method and deals with the uncertainty of standards generated by this method. Practical examples of the calculation of uncertainty are given.

Principle of the method

General principles

At atmospheric pressure p and room temperature, a known volume v of the pure component to be analysed (C_0 = 1) is transferred with a syringe to a large borosilicate vessel of known volume V filled with a selected carrier gas. The vessel is then filled with the selected carrier gas to pressure P, which is usually about 1.5 atm to facilitate use of the mixture. The mixture can be used once the temperature has returned to ambient. Under these conditions, the volume concentration of the component C_i is practically equal to the molar fraction concentration and can be expressed by the formula:

$$C_i = C_0 \, \frac{p\,v}{P\,V} \qquad (1)$$

Implementation at the ERLAP laboratory

The static volumetric method is described in detail in the Guidelines of the VDI 3490 Blatt 14 [8], and it has been successfully tested for over 20 years at the UBA Pilot Station of the Federal Environmental Agency of Germany. The static volumetric system implemented by ERLAP was devised and developed at the UBA Pilot Station and has been tested at the ERLAP laboratory for more than 3 years. The ERLAP laboratory uses this method for the preparation of SO2 and NO standard calibration gas mixtures. The static volumetric system used for this purpose is shown in Fig. 1.

Mode of operation

Borosilicate glass mixing vessel

Experiments at the UBA Pilot Station have clearly shown that, for components such as SO2 and NO, wall effects from borosilicate glass were negligible when preparing concentrations of 100 ppbv and above in dry carrier gases [9]. The volume of the vessel was determined as 0.11184 m3 ± 0.1% by a replicated process of filling with water and deriving volume from weight of
.j!
Pressure gauge Temperature gauge
~
~
/ / IS Dilution Gas
.1
;z:
Fan
Reference
Analyser
0
ppb
Pure Gas
NO or S02
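The dilution principle described under "General principles" can be sketched numerically. The fragment below assumes ideal-gas behaviour at constant temperature, so that the concentration is C_0 · p · v / (P · V); the function name is hypothetical, the syringe volume and vessel volume are the values quoted in the text, and the two pressures are assumed round numbers.

```python
def mixture_concentration(c0, v, p, V, P):
    """Volume fraction of the component after dilution.

    Assumes ideal-gas behaviour and equal temperature before and after
    filling, i.e. C = C0 * (p * v) / (P * V). v and V must share one unit,
    and so must p and P.
    """
    return c0 * (p * v) / (P * V)


# v: syringe volume in litres (24.71 microlitres, as quoted for syringe 1)
# V: vessel volume in litres (0.11184 m3)
# p, P: pressures in mbar; both are assumed values for illustration
c = mixture_concentration(c0=1.0, v=24.71e-6, p=1013.0, V=111.84, P=1500.0)
print(f"{c * 1e9:.1f} ppbv")
```

Because only the ratios v/V and p/P enter, any consistent pair of units can be used.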
Uncertainty calculation and implementation of the static volumetric method for the preparation 229
The pressure sensor was a Druck model DPI 510, with a precision of 0.025% and an accuracy of 0.04%.

Pure gas

The pure gases used for the preparation of NO and SO2 standard mixtures were manufactured by Messer Griesheim, with a purity better than 99.5% for NO and 99.98% for SO2. NO with purity better than 99.8% may be available in the future.

Carrier gas

For the preparation of NO standards, cylinders of chromatography-grade N2 (NO free) were used. For the preparation of SO2 standards, zero air produced by the ERLAP zero air generator (SO2 free) was used.

[...]

This involves the use of a small stainless steel container (the septum chamber, see Fig. 2), the integrity of which is maintained by two manual stainless steel valves. One valve connects directly to the reducing valve of the pure NO cylinder and the other to a vacuum pump.

Turn on the vacuum pump after first checking that all valves between the pump and the NO cylinder are closed. Introduce an empty syringe into the chamber (through a septum similar to that in the mixing vessel) and carry out the following sequence to ensure that the syringe is filled with pure NO.

(a) Close both septum chamber valves, the reducing valve and the NO cylinder.
(b) Open the septum chamber valves for 20 s to "clean" the septum chamber and tubes.
230 M. Gerboles . E. Diaz . A. Noriega-Guerra
(c) Close both septum chamber valves and open the NO cylinder.
(d) Close the NO cylinder and open the reducing valve until a pressure of 2 bar is established (read on indicator II of the reducing valve).
(e) Close the reducing valve and open the first septum chamber valve to send pure NO into the septum chamber. Close the first septum chamber valve.
(f) Fill the syringe with pure NO and drain the syringe. Repeat three times.
(g) Fill the syringe with pure NO and open the second septum chamber valve to clean the septum chamber and syringe. Drain the syringe and close the second septum chamber valve.
(h) Repeat steps c, f, g.
(i) Repeat steps c, d and e.
(j) Slowly fill the syringe completely with pure NO and wait for one min before removing from the septum chamber.
(k) Switch off the vacuum pump after first opening its venting valve (to avoid oil entering the tubing).

During these operations, it is important not to touch the glass of the syringe or the septum chamber to ensure that these remain at ambient temperature.

Injecting the pure NO into the mixing vessel

Take the syringe filled with pure NO out of the septum chamber, and, 10 s after adjusting it to the required volume (by way of the ERLAP mechanism), slowly inject the pure NO into the mixing vessel. Once the syringe is empty, quickly remove it (to avoid possible loss of pure NO along the surface of the syringe needle) and turn on the fan for 2 min to aid mixing.

Correction for the level of purity of the pure gases

The pure gases (NO and SO2) are supplied by Messer Griesheim Italia. The NO cylinder is of the Nitric Oxide 2.5 F1S type with a certified purity ≥ 99.5%. The SO2 cylinder is of the Sulphur Dioxide 3.8 F1S type with a certified purity ≥ 99.98%. It was decided to apply a correction factor to the reference value of the NO standards of -0.25%, with lower and upper limits of -0.5 and 0%. For SO2 standards, the correction factor was -0.01%, with lower and upper limits of -0.02 and 0%. The purity certified by the manufacturer was verified by FT-IR spectrometry.

The new ISO method for calculating uncertainty

In general, an analytical measurement provides only an estimation of the value of a determinand, and must be accompanied by a quantitative statement of the uncertainty attached to the estimate.

In 1993, ISO published a new guide to the expression of uncertainty [10], in which each component contributing to the uncertainty of a measurement is allotted an estimated uncertainty, termed standard uncertainty ($u_i$), equal to the positive square root of the estimated variance $u_i^2$.

The uncertainty associated with a measurement generally consists of several components, which may be grouped into two categories:

A. Those which are evaluated by statistical methods
B. Those which are evaluated by other means.
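The two categories can be illustrated with a minimal sketch. It assumes the usual textbook evaluations: the standard deviation of the mean of repeated observations for a Type A component, and the half-width divided by √3 for a Type B component with a rectangular distribution. The function names and the replicate readings are hypothetical; the rectangular limits 0.995 and 1 mirror the NO purity limits given in the text.

```python
import math
import statistics


def type_a(observations):
    """Type A: standard deviation of the mean of n independent observations."""
    n = len(observations)
    return statistics.stdev(observations) / math.sqrt(n)


def type_b_rectangular(a_minus, a_plus):
    """Type B: estimate and standard uncertainty for a rectangular distribution."""
    estimate = (a_plus + a_minus) / 2
    half_width = (a_plus - a_minus) / 2
    return estimate, half_width / math.sqrt(3)


# Hypothetical replicate readings (arbitrary units)
readings = [24.70, 24.73, 24.69, 24.72, 24.71]
print("Type A standard uncertainty:", type_a(readings))

# Purity known only to lie between 0.995 and 1
estimate, u = type_b_rectangular(0.995, 1.0)
print("Type B estimate:", estimate, "standard uncertainty:", u)
```

Note that (half-width/√3)² equals (a+ − a−)²/12, which is the form used for the rectangular components of the uncertainty budget.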
Type A evaluation of standard uncertainty

Type A evaluation of uncertainty may be based on any valid statistical method for treating data. For example:
- Calculating the standard deviation of the mean of a series of independent observations
- Using the method of least squares to fit a curve
- Carrying out an analysis of variance (ANOVA) to quantify random effects.

As an example of Type A evaluation, consider an input quantity $X_i$ whose value is estimated from n independent observations $X_{i,k}$ obtained under identical conditions of measurement. In this case, the estimated standard deviation of the mean is the positive square root of:

$$s^2(\bar{X}_i) = \frac{1}{n(n-1)} \sum_{k=1}^{n} \left(X_{i,k} - \bar{X}_i\right)^2 \qquad (3)$$

Type B evaluation of uncertainty

Type B evaluation of standard uncertainty is usually based on scientific judgement and may make use of all available relevant information, including:
- Previously measured data
- Available information concerning the behaviour and properties of materials and instruments involved
- Manufacturer's specifications
- Calibration data and other available information.

As an example of Type B evaluation, consider an input quantity $X_i$ whose value is estimated from an assumed rectangular probability distribution with lower limit $a_-$ and upper limit $a_+$. In this case the input estimate is usually expressed by:

$$x_i = (a_+ + a_-)/2 \qquad (4)$$

and the standard uncertainty associated with $x_i$ is:

$$u(x_i) = a/\sqrt{3}, \quad \text{where } a = (a_+ - a_-)/2 \qquad (5)$$

Combined standard uncertainty

The combined standard uncertainty of a measured value ($u_c$) is assumed to correspond to the estimated standard deviation of the result. In the case of non-correlated components, it is derived by combining individual standard uncertainties $u_i$, which may arise either from Type A or Type B evaluations. The method is often referred to as the law of propagation of uncertainty, and is expressed as:

$$u_c^2(y) = \sum_{i=1}^{N} \left(\frac{\partial f}{\partial x_i}\right)^2 u^2(x_i) \qquad (6)$$

The partial derivatives are referred to as sensitivity coefficients.

It is assumed that corrections have been applied to compensate for each systematic effect which significantly influences the measured value, and that every effort has been made to identify such effects.

Expanded uncertainty

What is often required is an expression of uncertainty to define the limits associated with a measured value y within which the value Y is confidently believed to lie. The measure of uncertainty intended to meet this requirement is termed the expanded uncertainty, suggested symbol U, and is obtained by multiplying $u_c(y)$ by a coverage factor, suggested symbol k. Thus $U = k\,u_c(y)$ and it is confidently believed that $y - U \le Y \le y + U$, which is commonly written $Y = y \pm U$. In general, the value of the coverage factor k is chosen on the basis of the desired level of confidence to be associated with the interval defined by $U = k\,u_c$. Typically k is in the range 2 to 3. When the normal distribution applies, $U = 2\,u_c$ defines an interval having a level of confidence of approximately 95%, which is consistent with current international practice.

Uncertainty budget

Purity of the gases C_0

As described in "Correction for the level of purity of the pure gases", for NO, the lower and upper limits of the purity correction are 99.5-100%. The probability that the purity lies in this interval is 100% (rectangular distribution). The best estimate of the standard uncertainty of the quantity is then the positive square root of:

$$u_{C_0}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(1 - 0.995)^2}{12} = 2.083 \times 10^{-6} \qquad (7)$$

For SO2, the lower and upper limits of the purity correction are 99.8-100%. The best estimate of the standard uncertainty of the quantity is then the positive square root of:

$$u_{C_0}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(1 - 0.998)^2}{12} = 3.333 \times 10^{-7} \qquad (8)$$

Volume of the syringe v

The volume of each syringe (with ERLAP mechanism) was determined by filling with water and deriving the volume from the weight of liquid (measured with a
Table 1 Measured volume of the syringes. Average and standard Pressure of the pure gas p
deviation of 15 replicate measurements, linearity balance devia-
tion U~2 and diffusion deviation through the needle U~3
It is important that pure gas is injected at room temper-
Syringes Average (I) U~1 W) ature following the procedure described in "Mode of
operation" and paying special attention to the precau-
1 (NO) 24.71 10- 6 2.19 10- 15 1.33 10- 16 5.09 10- 17 tions relating to pressure.
2 (NO) 39.86 10- 6 2.08 10- 15 1.33 10 -16 1.32 10 -16
3 (NO) 78.43 10- 6 8.91 10 -15 1.33 10 -16 5.13 10- 16 Room pressure was measured with a barometer
4 (NO) 99.23 10- 6 5.11 10- 15 1.33 10 -16 8.21 10 -16 manufactured by Lambrecht Klimatologish Messtech-
5 (S02) 39.65 10- 6 2.16 10- 15 1.33 10 -16 1.31 10- 16 nik (Gotlingen) model 00.06040.100000, the specified
6 (SOz) 49.16 10- 6 2.44 10 -15 1.33 10 -16 2.01 10 -16 uncertainty Upl of which is ± 0.25 mbar. Pressure is in-
7 (S02) 69.29 10- 6 2.36 10 -15 1.33 10 -16 4.00 10 -16
fluenced by the time interval between extracting the sy-
ringe from the septum chamber and injecting into the
Mettler AT201 balance) contained in the syringe (corrected for water density of 0.998 g/cm3 at 22 °C). Several replicate volume measurements were carried out in a hysteresis cycle, and the results are shown in Appendix 1. Variations in the measured volume reflected variability in the filling and emptying processes and in the performance of the balance. The standard deviation of the various estimates of v is the square root of u_{v1}^2 and is given in Table 1.

The balance was also subject to a linearity deviation from the true value, evaluated by the manufacturer as being ± 0.02 mg (0.02 µl) in the range 0-5 g. Assuming a rectangular distribution, the best estimate of the standard uncertainty of v is the square root of u_{v2}^2, given in Table 1.

The syringe is filled with pure gas (NO or SO2) at a pressure of about 2 bar. When the pure gas is injected into the glass vessel, it must be returned to the ambient pressure without having reacted or undergone any transformation. In fact, the absorption of NO and SO2 on the glass walls of the syringe has never been observed and is not likely to produce a relevant reduction of the injected volume. No reaction or transformation of the pure gases has been evidenced so far. However, diffusion of the pure gas out of the syringe through the syringe needle is observed after the transient period needed for the syringe pressure to adapt to the ambient pressure. The pure gas may leave the syringe chamber by diffusion through the needle before injection into the glass vessel. This diffusion has been investigated in Appendix 2 and Fig. 3. With a time interval of 10 s (+ 15 s of tolerance) before injection, the pure gas represents only 99.9-100% of the syringe volume that is injected into the glass vessel. Assuming a rectangular distribution, the best estimate of the standard uncertainty of v is the square root of u_{v3}^2, given in Table 1.

Obviously, there are differences between injecting a gas with the syringe and injecting a liquid, but good estimates of volume are possible provided the precautions outlined in "Procedure for the preparation of a mixture" are followed (especially with respect to the duration of injection, see Appendix 2).

[...] vessel. Appendix 2 shows the results of SO2 measurements made with different time intervals between withdrawal of the syringe from the septum chamber and injection of pure SO2 into the mixing vessel. The results show that there were few differences between injections after 5- or 15-s intervals, although a transient over-pressure of 0-0.2% cannot be ruled out. Assuming a rectangular distribution, the best estimate of the standard uncertainty of p is the square root of u_p^2 (in mbar^2):

u_p^2 = \frac{(4)^2}{12} = 1.333 \qquad (9)

Volume of the vessel V

Total volume was determined by filling with water as described in "Mode of operation". Assuming a rectangular distribution, the best estimate of the standard uncertainty of V is the square root of u_V^2 (in l^2) with:

u_V^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(111.95 - 111.73)^2}{12} = 4.03 \times 10^{-3} \qquad (10)

The 4-cm wide borosilicate glass vessel is not expected to undergo increases in volume at internal pressures up to 1.7 bar.

Pressure in the vessel P

Pressure was determined as described in "Mode of operation". For pressures up to 1500 mbar, the manufacturer claimed a precision of 0.375 mbar, although the precision of the digital display might be thought to be at least one digit (i.e. 1 mbar). Assuming rectangular distributions, the best estimate of the standard uncertainty of P is the square root of the sum of u_{P1}^2 and u_{P2}^2 (in mbar^2):

u_{P1}^2 = \frac{(a_+ - a_-)^2}{12} = \frac{(0.75)^2}{12} = 0.047 \qquad (11)

(12)
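The rectangular-distribution variances quoted in Eqs. 9-11 can be checked with a few lines of Python. This is only a sketch: the function name and the choice to express each interval as (lower, upper) bounds are illustrative assumptions, and the 4 mbar width used for Eq. 9 is simply read off the printed numerator.

```python
def rectangular_variance(lower, upper):
    """Variance of a rectangular (uniform) distribution on [lower, upper]:
    (upper - lower)^2 / 12."""
    return (upper - lower) ** 2 / 12.0

# Eq. 9: syringe over-pressure, full width 4 mbar (width taken from the printed numerator)
u2_p = rectangular_variance(0.0, 4.0)

# Eq. 10: vessel volume between 111.73 and 111.95 l
u2_V = rectangular_variance(111.73, 111.95)

# Eq. 11: pressure sensor, full width 0.75 mbar (twice the claimed 0.375 mbar)
u2_P1 = rectangular_variance(0.0, 0.75)

for name, value in [("u_p^2", u2_p), ("u_V^2", u2_V), ("u_P1^2", u2_P1)]:
    print(f"{name} = {value:.4g}")
```

The standard uncertainty in each case is the square root of the printed variance.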
Uncertainty calculation and implementation of the static volumetric method for the preparation 233
Uncertainty components: contributions (∂f/∂x_i)^2 u_i^2 to the combined variance, in e-notation (4.0e-20 = 4.0 × 10^-20). Columns are NO syringes 1-4 and SO2 syringes 1-3.

Component                                          NO 1      NO 2      NO 3      NO 4      SO2 1     SO2 2     SO2 3
Gas purity                                         4.0e-20   1.1e-19   4.5e-19   7.6e-19   1.7e-20   2.5e-20   5.3e-20
Syringe volume                                     7.4e-20   7.0e-19   3.3e-19   2.2e-19   7.9e-20   8.6e-20   9.5e-20
  Repetition of measurements                       6.8e-20   6.9e-19   3.1e-19   1.9e-19   7.0e-20   7.5e-20   7.8e-20
  Linearity of the balance                         4.1e-21   4.4e-21   4.6e-21   4.9e-21   4.3e-21   4.1e-21   4.4e-21
  Diffusion of pure gas                            1.6e-21   4.4e-21   1.8e-20   3.0e-20   4.3e-21   6.2e-21   1.3e-20
Syringe pressure                                   2.7e-20   7.4e-20   3.0e-19   5.2e-19   7.3e-20   1.1e-19   2.2e-19
  Barometer repeatability                          1.2e-21   3.3e-21   1.4e-20   2.3e-20   3.2e-21   4.8e-21   9.9e-21
  Over-pressure in the syringe                     2.6e-20   7.1e-20   2.9e-19   4.9e-19   6.9e-20   1.0e-19   2.1e-19
Vessel volume (u_V^2 term)                         6.1e-21   1.7e-20   6.9e-20   1.2e-19   1.7e-20   2.4e-20   5.1e-20
Vessel pressure (u_P^2 term)                       5.4e-21   1.6e-20   6.7e-20   1.2e-19   1.5e-20   2.1e-20   4.7e-20
  Random effect of the sensor (u_P1^2)             3.5e-22   1.0e-21   4.4e-21   8.1e-21   1.0e-21   1.4e-21   3.1e-21
  Systematic effect of the sensor (u_P2^2)         2.5e-21   7.3e-21   3.1e-20   5.7e-20   7.0e-21   9.8e-21   2.2e-20
  Δ pressure due to the room temperature (u_P3^2)  2.5e-21   7.3e-21   3.1e-20   5.7e-20   7.0e-21   9.8e-21   2.2e-20
Combined standard uncertainty u_c                  3.9e-10   9.6e-10   1.1e-9    1.3e-9    4.5e-10   5.1e-10   6.8e-10
Expanded uncertainty U = 2u_c (ppbv)               0.8       1.9       2.2       2.6       0.9       1.0       1.3
Expanded uncertainty U (%)                         0.56      0.83      0.47      0.43      0.39      0.37      0.34
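As a cross-check of the budget, the combined standard uncertainty for one column can be recomputed as the square root of the sum of the five top-level variance contributions. The sketch below uses the NO syringe 1 column; the dictionary keys and function name are illustrative, and treating the contributions as squared volume fractions is an assumption inferred from the fact that 2u_c × 10^9 reproduces the ppbv row.

```python
import math

# Top-level variance contributions for NO syringe 1, as read from the budget table.
# Assumed to be squared volume fractions (dimensionless).
contributions = {
    "gas purity": 4.0e-20,
    "syringe volume": 7.4e-20,
    "syringe pressure": 2.7e-20,
    "vessel volume": 6.1e-21,
    "vessel pressure": 5.4e-21,
}

def combined_standard_uncertainty(variances):
    """Root sum of squares: u_c = sqrt(sum of variance contributions)."""
    return math.sqrt(sum(variances))

u_c = combined_standard_uncertainty(contributions.values())
U_ppbv = 2.0 * u_c * 1e9  # coverage factor 2, converted to parts per billion by volume

print(f"u_c = {u_c:.2e}")
print(f"U   = {U_ppbv:.2f} ppbv")
```

Only the five top-level rows are summed; the indented sub-components are already included in their parent rows.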
234 M. Gerboles . E. Diaz . A. Noriega-Guerra
Table 5 column headings (all volumes in µl): NO syringe 1, NO syringe 2, NO syringe 3, NO syringe 4, SO2 syringe 1, SO2 syringe 2, SO2 syringe 3
Table 6 Static volumetric experiments with various time intervals before injection
[Fig. 3 plot: vertical axis from 0.0% to -3.5%; horizontal axis (time interval before injection) from 0 to 120]
To attain such high levels of confidence in the reference value of NO and SO2 standards, the following should be taken into account:
- The complex procedure for manipulating the dosing syringe must be carefully followed.
- Values specified in the pure gas manufacturer's certificate must be periodically verified.
- The balance used to weigh the syringe and the pressure sensor serving the mixing vessel must be well maintained to ensure accurate and precise measurements. Traceability certificates for these instruments must be available.
- Room temperature must remain constant between injection and dilution with carrier gas.

The uncertainty associated with the reference value of NO2 (obtained by ERLAP using the permeation method) has been evaluated previously [11] as about 1% with 95% confidence limits. This is slightly greater than that of an NO standard prepared using the static volumetric method.

Appendix 1: Volumes dispensed with the syringe

[...] volume from the weight of liquid (measured with a Mettler AT201 balance) contained in the syringe (corrected for water density of 0.998 g/cm3 at 22 °C). The volumes are reported in Table 5.

Appendix 2: SO2 measurements plotted against time interval before injection

In the present experiments, a UV fluorescence analyser manufactured by Environnement SA, model AF21M, was used. A Hamilton syringe series 1800 with a needle series no. 80451 was used, and a volume of 78 µl (v) was injected for all experiments.

Table 6 shows a series of experiments carried out to determine the maximum tolerable time delay before injection (see "Procedure for the preparation of a mixture"). This time interval depends on the diffusion rate of the SO2 through the syringe needle and must be established for the methodology to be viable. The results are presented in Fig. 3 and show that the performance of the methodology is not compromised provided the interval is kept below 30 s.
References
1. Directive du Conseil du 15 juillet 1980 concernant des valeurs limites et des valeurs guides de qualite atmospherique pour l'anhydride sulfureux et les particules en suspension. Journal officiel des Communautes
2. Directive du Conseil du 7 mars 1985 concernant les normes de qualite de l'air pour le dioxyde d'azote (85/203/CEE). Journal officiel des Communautes
3. Norme internationale ISO 6767 (F) (1990) Air ambiant - Determination de la concentration en masse du dioxyde de soufre - Methode au tetrachloromercurate (TCM) et a la pararosaniline
4. Norme internationale ISO/DIS 7996 (F)/TC 146 (1984) Qualite de l'air - Determination des oxydes d'azote dans l'air ambiant - Methode par chimiluminescence
5. Norme internationale ISO/CD 10498 (F)/TC 146 (1984) Air ambiant - Dosage de soufre - Methode par fluorescence dans l'ultraviolet
6. Norme internationale ISO 6349 (F) (1979) Analyse des gaz - Preparation des melanges de gaz pour etalonnage - Methode par permeation
7. Norme internationale ISO 6144 (F) (1981) Analyse des gaz - Preparation des melanges de gaz pour etalonnage - Methode volumetrique statique
8. Verein Deutscher Ingenieure, "Messen von Gasen, Prüfgase - Herstellen von Prüfgasen nach der Volumetrisch-Statischen Methode unter Verwendung von Glasbehältern", VDI 3490 Blatt 14, November 1985
9. Rudolf W, "Implementation of the Static Injection Method", EEC contract n° 4108-90-10-ED ISP D
10. Guide to the expression of uncertainty in measurement, ISBN 92-67-10188-9, copyright International Organisation for Standardisation, Printed in Switzerland
11. Gerboles M, Manalis N, De Saeger E, Payrissat M (1996) Report EUR 16432 EN "Study of the long term stability of NO2 Permeation Sources and the efficiency of Gravimetry in determining their permeation rate"
Accred Qual Assur (2000) 5:280-284
© Springer-Verlag 2000
Q_s = \frac{P}{P_s} \cdot \frac{T_s}{T} \cdot \frac{V}{t_2 - t_1} \qquad (3)
Since allowances must be made for changes in temperature and pressure, the parameters of the perfect gas law must apply. The precise calculation of the volumetric flow rate through a piston meter of this type is based on Eq. 4:

Q_s = \frac{C \cdot T_s \cdot P_m}{K \cdot t \cdot T_m \cdot P_s} \qquad (4)

where:
Q_s = the gas flow rate corrected to STP (l/min);
T_s = standard temperature: 273.15 K;
P_s = standard pressure: 760 mmHg;
T_m = the temperature of the test gas in the cylinder (K);
P_m = the pressure of the test gas in the cylinder (barometric pressure + cylinder pressure) (mmHg);
C = total number of counts accumulated by a shaft encoder synchronized to a crystal clock (counts);
t = total time to accumulate C counts (min);
K = number of counts per liter (calculated for each cylinder at the calibration time).

Secondary standards

As secondary standards our laboratory uses thermal mass flowmeters and controllers in the range of 100 SCCM to 500 SLPM.

Mass flowmeters use the thermal properties of a gas to measure flow rate directly [1]. Mass flow rates are determined by measuring the heat required to maintain an elevated temperature profile along a laminar flow sensor tube. For a specific flowmeter range and gas species, flow is proportional to the voltage necessary to maintain a constant temperature profile. The sensor in a mass flowmeter is a long, thin stainless steel tube, often called a capillary tube because of its shape (see Fig. 1).

Fig. 1 Schematic structure of a typical thermal mass flowmeter

Coils wrapped around the midpoint of the capillary tube serve two functions: first as heaters and second as temperature sensors. Since the resistance of the coils varies with temperature, they function as temperature detectors, or resistance temperature detectors (RTDs), which measure the temperature of the gas. The heaters create a known temperature profile along the sensor tube and then maintain the profile during gas flow by means of an autobalancing bridge circuit. As gas flows through the sensor, the gas flow convects heat and the temperature difference is converted into a flow reading.

Thermal mass flowmeters can give accurate measurements over a fairly wide range of temperature and pressure without the need to enter pressure or temperature corrections into the calculations. This is because the units are calibrated with reference to standard conditions and are stable from about 15 to 32 °C and from about atmospheric pressure to about 30 PSI for the large volume flowmeters (about 25-50 SLPM). Thermal mass flowmeters are always calibrated for a particular type of gas. All manufacturers list gas conversion factors in their manuals for corrections to be made when measuring different gases. These figures are mostly theoretically computed based on densities, specific heats, and atomic weights of the gases.

Calibration methods

Standard thermal mass flowmeters up to 50 SLPM are calibrated by comparison to the primary system, Califlow A150, and those used for higher ranges are sent abroad for calibration.

The secondary thermal mass flowmeters are used to calibrate all kinds of flow devices in accordance with the required range and uncertainty.

A calibration procedure was written by our calibration laboratory, which describes in detail how the two types of calibrations mentioned above are performed.

Uncertainty calculation

The uncertainty evaluation was done according to the recommendations in the ISO Guide for the Expression of Measurement Uncertainty [2]. This procedure was performed on both previously mentioned calibration methods. According to the Guide, we followed the steps described further on.
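Eq. 4 for the piston meter, Q_s = C·T_s·P_m / (K·t·T_m·P_s), is straightforward to evaluate numerically. The following is a minimal sketch; the function and argument names, and all input values in the example, are hypothetical and chosen only to illustrate the correction to standard conditions.

```python
def flow_rate_stp(counts, counts_per_liter, minutes, t_gas_k, p_gas_mmhg,
                  t_std_k=273.15, p_std_mmhg=760.0):
    """Gas flow rate corrected to STP (l/min) following Eq. 4:
    Q_s = C * T_s * P_m / (K * t * T_m * P_s)."""
    return (counts * t_std_k * p_gas_mmhg) / (
        counts_per_liter * minutes * t_gas_k * p_std_mmhg)

# Hypothetical run: 100000 encoder counts in 1 min at 10000 counts per liter.
# With the gas already at standard conditions the result is C / (K * t).
q_std = flow_rate_stp(100_000, 10_000, 1.0, t_gas_k=273.15, p_gas_mmhg=760.0)

# Same displaced volume, but gas at 23 °C and 765 mmHg (illustrative values):
# the warmer gas corresponds to a smaller flow at standard conditions.
q_warm = flow_rate_stp(100_000, 10_000, 1.0, t_gas_k=296.15, p_gas_mmhg=765.0)

print(f"Q_s at standard conditions: {q_std:.3f} l/min")
print(f"Q_s at 23 °C, 765 mmHg:     {q_warm:.3f} l/min")
```

Keeping the standard temperature and pressure as keyword defaults makes it easy to switch to another reference condition if a laboratory uses one.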
Assessment of uncertainty in calibration of a gas mass flowmeter
Table 1 Uncertainty budget for the calibration of a 10 standard liters per minute (SLPM) mass flowmeter vs. the Califlow

Table 2 Calculated expanded uncertainties for calibrated flowmeters vs. Califlow A150
Columns: Description of flowmeter; Calculated expanded uncertainty (U) (%)

Table 4 Calculated expanded uncertainties for calibrated flowmeters vs. flowmeters
Columns: Description of flowmeter; Calculated expanded uncertainty (U) (%)
Calibration vs. a secondary standard mass flowmeter

In this kind of calibration a precise mass flowmeter previously calibrated by the A150 Califlow serves as a secondary standard. Using the same method for determining the measurement uncertainty as described in the section on Calibration vs. Califlow A150, we proceeded with the following steps:

Determination of the measurement model

In this case the physical model is described by Eq. (12):

(6)

where:
Q_m = the gas flow rate measured by the tested flowmeter (SLPM),
Q_R = the gas flow rate measured by the reference flowmeter (SLPM).

Sensitivity coefficients calculation

Sensitivity coefficients for Q_m and Q_R were calculated and resulted in 1 for each of them.

Identification of standard uncertainty components

Standard uncertainty components are described in Table 3. Both flowmeters were assumed to be calibrated with the same type of gas and were exposed to the same temperature and pressure conditions.

Table 3 Calibration uncertainty budget for 10 SLPM flowmeter vs. secondary standard flowmeter

Expanded uncertainties for calibration of flowmeters are presented in Table 4, using the same method as described above. In this case, the reference mass flowmeter, the drift and the repeatability measurements contribute most to the uncertainty.

Summary and conclusions

Descriptions are given of the Rafael Calibration Laboratory's facilities for calibrating flowmeters.

Methods of operation are given together with the measurement uncertainties obtained using the "ISO Guide for the Expression of Uncertainty" recommendations.

The first method uses a primary standard of the piston prover type, Califlow A150. The uncertainty estimation in this case is based on the manufacturer's uncertainties, experienced judgment, and uncertainty propagation techniques. Typical values of uncertainty using this method are around 0.3% F.S., with the drift and repeatability measurements as the most significant contributors to the total budget. The most significant contributor to the Califlow A150 uncertainty is the temperature measurement uncertainty, followed by that of the encoder and the volume.

The second method is based on a comparison of the UUT to a secondary standard calibrated by the first method. The uncertainty budget in this case is determined equally by the drift and the reference standard used for the calibration. Typical values obtained in that case are around 0.5%.
References
1. Hinkle LD, Marino CF (1990) Towards understanding the fundamental mechanism and properties of the thermal mass flow controller. MKS Instruments, Andover, Mass., USA
2. ISO (1995) Guide to the expression of uncertainty in measurement. ISO, Geneva, Switzerland
Accred Qual Assur (1998) 3:231-236
© Springer-Verlag 1998
Abstract Steps which are taken to implement the concept of measurement uncertainty in analytical chemical laboratories should take full account of existing internationally agreed protocols for analytical quality assurance and reflect the needs of particular analytical sectors. For the food sector this may mean that for official purposes the use of the term measurement uncertainty is replaced by the term measurement reliability and that a quantitative estimation of this is made based on existing collaborative trial data. In many analytical sectors, the differing strategies currently followed for the determination and use of recovery information are an important cause of the non-comparability of analytical results. Guidelines which are being prepared for the estimation and use of recovery information in analytical measurement may provide a more unified approach which includes measurement uncertainty as a key concept in the use of recovery data.

Presented at: 2nd EURACHEM Workshop on Measurement Uncertainty in Chemical Analysis, Berlin, 29-30 September 1997

P. Willetts · R. Wood (✉)
Food Labelling and Standards Division, Ministry of Agriculture, Fisheries and Food, CSL Food Science Laboratory, Norwich Research Park, Colney, Norwich NR4 7UQ, UK
Tel.: +44-1603-259350
Fax: +44-1603-501123
e-mail: r.wood@fscii.maff.gov.uk
[...] been advanced recently which affect specifically the issue of data quality, and therefore reliability, in this area of analysis. Firstly, in the EU, there is a tendency in the food analysis sector to not prescribe specific methods of analysis but to adopt a "criteria of methods approach" whereby analysts may use the method of their choice provided it meets certain prescribed quality criteria. This flexibility of approach, to take advantage of the developments of new techniques and procedures as they occur in analytical chemistry, clearly has consequences for the comparability and measurement uncertainty of reported data. Secondly, there is a requirement in the food sector, as set out in EC Directive 93/99, that methods of analysis for food control purposes should wherever possible be formally validated by collaborative trial [5]. Thirdly, there have been discussions on measurement uncertainty within the Codex Committee on Methods of Analysis and Sampling. The Report of the March 1997 Session of that Committee states that with regard to measurement uncertainty [6]:

1. The Committee will develop for Codex purposes an appropriate alternative term for measurement uncertainty, e.g. measurement reliability.
2. The precision of a method may be estimated through a method-performance study, or where this information is not available, through the use of internal quality control and method validation.
3. Consideration should be given as to whether it is necessary to undertake an additional formal evaluation of a method of analysis using the ISO approach [7] in addition to using information obtained through a collaborative trial.
4. Governments should advise accreditation agencies that for national and Codex purposes the measurement uncertainty result need not be calculated using the ISO approach [7] providing the laboratory is complying with the appropriate Codex principles.

Discussions are on-going in Codex. However, if these proposals are accepted, it is likely that the term 'measurement reliability' rather than measurement uncertainty will be adopted and that estimates of this will be made from collaborative trial data if such data are available. In a recent study, carried out in the UK, which compared 'top-down' (collaborative trial) and 'bottom-up' (ISO) approaches to the estimation of measurement uncertainty, it was concluded that for comparable matrix/analyte combinations these approaches gave not dissimilar results in the limited number of cases studied [8]. It should be noted that, in recognising the importance of the concept of measurement uncertainty in underpinning the reliability of analytical data, the Codex recommendations and discussions are in accordance with statements on uncertainty in ISO Guide 25 and EN 45001, which require accreditation agencies to ensure that measurement uncertainty estimations are carried out as part of the accreditation process [9].

Recovery and analyte losses

One aspect of analytical chemistry where, for all analytical sectors including the food sector, current practice continues to have important consequences in terms of the non-comparability and uncertainty or reliability of reported data, is that of the use of recovery information. This arises because of the different strategies for dealing with recovery assessment and the effect these may have on the variability of the analytical results reported.

Recovery studies are an essential component of quality assurance systems in analytical measurement. Their use, particularly in the trace analyte area, to assess the efficiency of the removal of the measurand from the sample matrix and its transfer prior to detection is widely quoted in the scientific literature. Although they thus provide an important indication of the reliability of these steps in the measurement process, there generally has been no consistent approach to the way in which recovery information is derived and used in analytical data. In particular, in the case of recovery factors calculated and applied to analytical data to correct for displacement or bias, the absence of accepted strategies for the determination and use of these factors has meant that it frequently has been difficult to make comparisons between analytical results produced in different laboratories or verify the suitability of that data for the intended purpose. This is particularly marked in the case of complex matrices, such as foodstuffs, where the difficulties of completely extracting the analyte are most pronounced. Quite commonly in such procedures a substantial proportion of the analyte remains in the matrix after extraction, so that the transfer is incomplete, and the subsequent measurement is lower than the true concentration in the original test material. If no compensation for these losses is made, then markedly discrepant results may be obtained by different laboratories. Even greater discrepancies are likely to arise if some laboratories compensate for losses and others do not. These considerations are especially important in legislative/enforcement situations where for instance the difference between applying or not applying a recovery factor to correct for the incomplete removal of the analyte may mean respectively that a legislative limit is exceeded or that a result is in compliance with the limit.
Thus, where an estimate of the true concentration is required, there is a compelling case for including a compensation for losses in the calculation of the reported analytical result, provided that the correction factor can be estimated reliably. In the case of an empirical method, where the measurand is defined in terms of the method used and no attempt is being made to estimate the amount of analyte actually present in the sample matrix, the question whether or not a correction is applied is a matter for the definition of the empirical method.

The four most common approaches which typically have been taken by analysts in respect of the application of recovery factors are shown in Table 1.

Table 1
a The reporting of an analytical result without correcting for bias by the application of a recovery factor, no accompanying statement being given of the level of recovery achieved
b The reporting of an analytical result without correcting for bias by the application of a recovery factor, together with a statement of the level of recovery achieved
c The reporting of an analytical result corrected for bias by the application of a recovery factor, without an accompanying statement of the level of recovery
d The reporting of an analytical result corrected for bias by the application of a recovery factor, together with a statement of the level of recovery achieved

Reference materials and spiking experiments

Quite apart from the variation which can arise from laboratories adopting different practices in respect of whether a correction factor is applied or is not applied to an analytical result, a further aspect which can hinder data comparison is the fact that 'recovery' information may be derived either from the inclusion of reference materials or the use of spiked samples.

In the case of reference materials, the analyte is usually integrated or incorporated into the matrix, whereas in the case of spiked samples the analyte is merely added to the matrix. Potentially different information relating to the behaviour of the native analyte to be measured may be derived from each type of recovery measurement. Moreover, the regularity and pattern of use of these recovery materials may affect the recovery information produced. In the case of spiking, for example, the different ways in which the recovery factor may be determined include those shown in Table 2.

Table 2 Examples of ways in which the recovery factor may be determined with spiking
a Basing a recovery correction factor on the recovery of the analyte from a spiked sample in the batch
b Basing a recovery correction factor on the mean value obtained for the recovery of the analyte spiked into a sample in each of a number of batches
c Basing a recovery correction factor on the recovery of a chemically similar internal standard added to the test material
d Basing a recovery correction factor on the recovery of an isotopic form of the analyte added as internal standard to the test material

Each of these approaches differs in the representativeness it provides of the actual extraction of the analyte itself, the basis of the representation being different in each case. While it is generally agreed that, of these four alternatives, the use of an isotopic internal standard is the preferred approach since the recovery of the auxiliary analyte equates most closely to being 'fully equivalent' to that of the target analyte, this option is often not possible. As a consequence one of the other alternatives is often followed in spiking experiments.

When a reference material is used rather than spiking, then it will be included at a different position in the batch to the test material itself. In this respect the use of a reference material is akin to options a or b for spiking (see Table 2).

Consideration of these different strategies has led analytical chemists to recognise the desirability of using a more uniform approach when dealing with the topic of recovery measurements in order to facilitate the comparability of data.

Guidelines for using recovery information

Following the circulation to a broad cross section of the analytical community world-wide of a questionnaire on the determination and use of recovery measurements in 1995, background information was obtained which enabled further consideration to be given to the role of recovery studies in chemical analysis [10]. The main questions addressed the issues shown in Table 3.

As expected, the differing answers given to the questions posed revealed considerable variation in the ways in which analysts deal with recovery measurements. In particular, the question on measurement uncertainty itself produced more differences than any of the other questions, perhaps suggesting a lack of appreciation of either the need for or the means of calculating this value. The findings of this survey were presented at the
Measurement uncertainty - a reliable concept in food analysis and for the use of recovery data? 245
Table 3 Outline of questions included in the recovery factors questionnaire

Table 4 A summary of guidelines for the use of recovery information
(u_R) in the determination of R, at some level of confidence. The significance test takes the form

|R - 1| / u_R > t: R differs significantly from 1
|R - 1| / u_R ≤ t: R does not differ significantly from 1

where t is a critical value based either on a 'coverage factor' allowing for practical significance or, where the test is entirely statistical, t(α/2, n-1), being the relevant value of Student's t for a level of confidence 1-α.

Following this assessment, for a situation where incomplete recovery is achieved, four cases can be distinguished, chiefly differentiated by the use made of the recovery R.
(a) R is not significantly different from 1. No correction is applied.
(b) R is significantly different from 1 and a correction for R is applied.
(c) R is significantly different from 1 but, for operational reasons, no correction for R is applied.
(d) An empirical method is in use. R is arbitrarily regarded as unity and u_R as zero. (Although there is obviously some variation in recovery in repeated or reproduced results, that variation is subsumed in the directly estimated precision of the method.)

In the first case, where R is not significantly different from 1, the recovery can be viewed as being equal to unity, no correction being applied. There is still an uncertainty, u_R, about the recovery that contributes to the overall uncertainty of the analytical result.

In the cases where R is significantly different from 1, the loss of analyte occurring in the analytical procedure is taken into account, and two uncertainties need to be considered separately. First, there are the uncertainties associated only with the determination, namely those due to gravimetric, volumetric, instrumental, and calibration errors. That relative uncertainty u_x/x will be low unless the concentration of the analyte is close to the detection limit. Second, there is the uncertainty u_R on the estimated recovery R. Here the relative uncertainty u_R/R is likely to be somewhat greater. If the raw result is corrected for recovery, we have x_corr = x/R (i.e., the correction factor is 1/R). The relative uncertainty on x_corr is given by

\frac{u_{corr}}{x_{corr}} = \sqrt{\left(\frac{u_x}{x}\right)^2 + \left(\frac{u_R}{R}\right)^2}

which is necessarily greater than u_x/x and may be considerably greater. Hence correction for recovery seems at first sight to degrade, perhaps substantially, the reliability of the measurement.

It is stated that such a perception is incorrect. Only if the method is regarded as empirical, and this has drawbacks in relation to comparability as already discussed, is u_x the appropriate uncertainty. If the method were taken as rational, and the bias due to loss of analyte were not corrected, a realistic estimate of u_x would have to include a term describing the bias. Hence u_x/x would be at least comparable with, and may be even greater than, u_corr/x_corr.

These approaches to the estimation of the uncertainty of a recovery are necessarily tentative. Nevertheless, the following important principles of relevance to the conduct of recovery experiments are demonstrated.
(a) The recovery and its standard uncertainty may both depend on the concentration of the analyte. This may entail studies at several concentration levels.
(b) The main recovery study should involve the whole range of matrices that are included in the category for which the method is being validated. If the category is strict (e.g., bovine liver) a number of different specimens of that type should be studied so as to represent variations likely to be encountered in practice (e.g., sex, age, breed, time of storage etc.). Probably a minimum of ten diverse matrices are required for recovery estimation. The standard deviation of the recovery over these matrices is taken as the main part of the standard uncertainty of the recovery.
(c) If there are grounds to suspect that a proportion of the native analyte is not extracted, then a recovery estimated by a surrogate will be biased. That bias should be estimated and included in the uncertainty budget.
(d) If a method is used outside the matrix scope of its validation, there is a matrix mismatch between the recovery experiments at validation time and the test material at analysis time. This could result in extra uncertainty in the recovery value. There may be problems in estimating this extra uncertainty. It would probably be preferable to estimate the recovery in the new matrix, and its uncertainty, in a separate experiment.
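The significance test on R and the effect of a recovery correction on the uncertainty can be illustrated numerically. The sketch below assumes that x and R are independent, so that the relative uncertainties of x_corr = x/R combine in quadrature (first-order propagation); the function names, the default critical value t = 2 and all input numbers are hypothetical.

```python
import math

def recovery_differs_from_one(R, u_R, t_crit=2.0):
    """Significance test: |R - 1| / u_R > t means R differs significantly from 1.
    t_crit = 2 is an assumed coverage-factor style critical value."""
    return abs(R - 1.0) / u_R > t_crit

def correct_for_recovery(x, u_x, R, u_R):
    """Return (x_corr, u_corr) for x_corr = x / R, assuming x and R are
    independent so that relative uncertainties add in quadrature."""
    x_corr = x / R
    rel_u_corr = math.sqrt((u_x / x) ** 2 + (u_R / R) ** 2)
    return x_corr, rel_u_corr * x_corr

# Hypothetical raw result 8.0 (u_x = 0.2) with recovery R = 0.80 (u_R = 0.04)
x, u_x, R, u_R = 8.0, 0.2, 0.80, 0.04

significant = recovery_differs_from_one(R, u_R)
x_corr, u_corr = correct_for_recovery(x, u_x, R, u_R)

print(f"R significantly different from 1: {significant}")
print(f"x_corr = {x_corr:.2f}, u_corr = {u_corr:.3f}")
print(f"relative uncertainty: raw {u_x / x:.3f}, corrected {u_corr / x_corr:.3f}")
```

In this made-up case the corrected relative uncertainty is roughly twice the raw one, which is the apparent degradation the text discusses; the raw figure is only appropriate if the method is treated as empirical.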
References
1. Horwitz W (1988) Pure Appl Chem 60:855-864
2. Thompson M, Wood R (1993) Pure Appl Chem 65:2123-2144 [Also published in J AOAC Int (1993) 76:926-940]
3. Thompson M, Wood R (1995) Pure Appl Chem 67:649-666
4. Thompson M (1996) Analyst 121:285-288
5. Official Journal of the European Communities (1993) L290/14 Council Directive 93/99/EEC
6. Codex Alimentarius Commission (1997) Codex Committee on Methods of Analysis and Sampling 21st Session
7. ISO (1993) Guide to the Expression of Uncertainty in Measurement, Geneva
8. Brereton P, Anderson S, Willetts P, Ellison S, Barwick V, Thompson M, Wood R (1997) CSL Report FD 90/103
9. Draft - ISO/IEC Guide 25 (1996) General requirements for the competence of testing and calibration laboratories
10. Willetts P, Anderson S, Wood R (1998) CSL Report FD 97/65
11. International Union of Pure and Applied Chemistry (1997) Draft Harmonised Guidelines for the use of Recovery Information in Analytical Measurement
12. Ellison SLR, Williams A (1996) In: Parkany M (ed) Proceedings of the Seventh International Symposium on the Harmonisation of Quality Assurance Systems in Chemical Analysis. Royal Society of Chemistry, London
Accred Qual Assur (1998) 3:127-130
© Springer-Verlag 1998
Fig. 1 Sampling scheme and sources of variation attributable to the tiers [diagram not reproduced; labelled sources of variation: freeze-drying of the material, reconstitution, extraction of cholesterol, laboratory of the user, instrument reproducibility (CV 0.48%)]

For particulars of experimental designs and ANOVA, the reader not familiar with it is referred to the multitude of textbooks on this subject, e.g. Anderson and Bancroft [3].

Results

The direct results of the ANOVA are compiled in Table 1. Since all of the F ratios exceed the tabulated critical value corresponding to 5% risk of error, all of the mean squares can reasonably be assumed to represent significant contributions to the total sum of squares apart from the sole GC/MS sum of squares. Otherwise, the model would have to be recalculated, omitting the insignificant source(s).

Table 1 ANOVA of the amount-of-cholesterol data

Source of variation    DF(a)    SS(b)     MS(c)     F ratio
Inter vials              3      99.067    33.022    147.595
Inter extr.             12      70.068     5.839     26.098
Inter deriv.            36      21.290     0.592      2.644
Inter GC/MS runs       108      24.164     0.224

(a) Degrees of freedom
(b) Sum of squares
(c) Mean square

The mean squares given in Table 1 do not yet represent the variances attributable to the individual sources, as each of these is actually a weighted sum of contributions of all sources below it in the hierarchy. This becomes evident from the calculated expected mean squares derived from theory for this nested model [3]. These are given in Table 2. The coefficients reflect the numbers of subsamples drawn on each of the tiers and would change when other sampling schemes were used.

Equating the mean squares in Table 1 with the corresponding expected values furnishes the sought-for estimates of variances of the individual sources. Their square roots, which can be interpreted in terms of uncertainties caused by them, are compiled in the third column of Table 2. These figures at the same time represent the coefficients of variation (CV), since, in this example, prior to the ANOVA, all data had been standardized to yield mean values of 100 for each of the pools.
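The step from mean squares to variance components can be sketched in Python as follows. This is a minimal illustration, assuming a balanced nested design with three sub-units per unit on every tier (as implied by the coefficients of the expected mean squares) and mean squares as read from Table 1; the function name is hypothetical.

```python
import math


def variance_components(mean_squares, sub_units):
    """Estimate variance components for a balanced nested design.

    mean_squares: mean squares ordered from the top tier to the bottom tier.
    sub_units:    number of sub-units per unit for each tier below the top,
                  in the same order (len(mean_squares) - 1 entries).
    Each component is obtained by equating a mean square with its expected
    value and solving from the bottom tier upwards.
    """
    components = [mean_squares[-1]]  # bottom tier: E(MS) is the variance itself
    coeff = 1
    for i in range(len(mean_squares) - 2, -1, -1):
        coeff *= sub_units[i]
        diff = mean_squares[i] - mean_squares[i + 1]
        components.append(max(diff / coeff, 0.0))  # truncate negative estimates
    return components[::-1]


tiers = ["vials", "extr.", "deriv.", "GC/MS"]
ms = [33.022, 5.839, 0.592, 0.224]   # mean squares, top tier first
var = variance_components(ms, sub_units=[3, 3, 3])
cv = [math.sqrt(v) for v in var]     # data standardized to mean 100, so SD equals CV in %

for name, value in zip(tiers, cv):
    print(f"CV({name}) = {value:.2f} %")
print(f"CV of a single measurement = {math.sqrt(sum(var)):.2f} %")
```

With these inputs the sketch yields CVs close to those listed in Table 2; small differences in the last digit can arise from rounding of the tabulated mean squares.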
Table 2 Expected mean squares and derived coefficients of variation for the amount-of-cholesterol data ($\sigma$: standard deviations of the individual sources)

Source of variation    E(MS)(a)                                                                                                          CV(b) (%)
Inter vials            $\sigma^2_{GC/MS} + 3\,\sigma^2_{derivs.} + 3\cdot3\,\sigma^2_{extr.} + 3\cdot3\cdot3\,\sigma^2_{vials}$          1.00
Inter extr.            $\sigma^2_{GC/MS} + 3\,\sigma^2_{derivs.} + 3\cdot3\,\sigma^2_{extr.}$                                            0.76
Inter deriv.           $\sigma^2_{GC/MS} + 3\,\sigma^2_{derivs.}$                                                                        0.35
Inter GC/MS runs       $\sigma^2_{GC/MS}$                                                                                                0.48

(a) Expectation of mean square
(b) Coefficient of variation

Table 3 Cholesterol concentrations [in µmol/g (dry mass)] certified and found for SRM 1952a

Pool    Certified(a)    Found    Diff. in %
A       41.66           41.77    +0.25
B       63.85           64.76    +1.42
C       86.72           85.50    -1.40

(a) Calc. from the data given in the certificate

A comparison of the mean concentrations found for the pools with the data stated in the manufacturer's certificate is given in Table 3.

Discussion

The CVs (Table 2) can be combined to obtain an estimate of the CV of a single measurement: $CV = (CV^2_{vials} + CV^2_{extr.} + CV^2_{deriv.} + CV^2_{GC/MS})^{1/2} = 1.39\%$. This is characteristic of the one-time drawing of a random vial out of a pool, separating from the matrix and extracting the cholesterol, derivatizing it and finally detecting the concentration by GC/MS. It is, of course, equivalent to the CV calculated with the whole measurement repeated several times in a straightforward way (i.e. formally with only one element on each of the tiers).

Knowledge of the individual variance contributions allows a further interpretation. For illustration, see the right-hand part of Fig. 1. Instrument reproducibility (GC/MS) is not a critical source of uncertainty, as possibly could have been assumed, and neither is the derivatization step. Among the manipulations in the laboratory of the user of the material, the separation from the matrix and the extraction into solvent might be reviewed for improvement. The in-laboratory CVs can be combined to form a joint CV of about 1.0%.

In addition to this, a vial-to-vial effect is observed. It is significant though it was characterized with only three degrees of freedom (see Table 1). The corresponding CV is about 1.0%. It can be discussed in terms of heterogeneity of the material, reproducibilities of filling into vials, freeze-drying and reconstitution prior to use. The author believes the last source to be negligible in magnitude. If so, the vial-to-vial CV can be regarded as a plain off-laboratory contribution to the overall CV.

Knowledge of the off-laboratory CV will be important if the material is intended to be used for self-assessment or evaluation of the performance of a laboratory to be accredited. Then

$$U_{vials} = t_{p,df} \cdot CV_{vials} / \sqrt{n_{vials}}$$

$U_{vials}$ being the expanded uncertainty [4] attributable to the off-laboratory source and $t_{p,df}$ the percentage points of the t distribution. This, for instance, is about 9% if only two vials are available as in the present case, but would not have been more than 1% if six vials out of each pool had been under investigation.

Considering that

$$U^2_{total} = (t_{p,df} \cdot CV_{vials})^2 / n_{vials} + (t_{p,df} \cdot CV_{in\text{-}lab.})^2 / n_{in\text{-}lab.}$$

one can derive the minimum number of vials that would be needed if a laboratory was to be tested for its capability of determining the concentration for the pool with a given uncertainty $U_{in\text{-}lab.}$.

In the example discussed here, $U_{vials}$, as already mentioned, is estimated to be as high as 9%. However, the mean concentrations of cholesterol found for the pools happen to be pretty close to those certified (see Table 3). Therefore, at this level of uncertainty, there is no reason to suspect that the results are biased.

Conclusions

A carefully designed survey combined with ANOVA is a powerful tool for providing the experimenter with knowledge of particular components of the total variance. In many instances, as in the example presented here, no other way of detecting them is conceivable. The variance components furnish valuable information as to what steps of the whole procedure need to be checked for possible improvement. These at the same time would be the steps which perhaps would require further subsampling and averaging of the results in order to keep the uncertainty of the final result low.
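The dependence of the off-laboratory expanded uncertainty on the number of vials can be checked with a short Python sketch. It assumes that the expanded uncertainty of a mean over n vials is t·CV/√n with n − 1 degrees of freedom and a two-sided 95% level; the t values are standard table entries, and the function names and the 2% target are hypothetical.

```python
import math

# Two-sided 95% Student t quantiles, keyed by degrees of freedom
# (standard table values, hard-coded to keep the sketch dependency-free).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def expanded_uncertainty(cv, n):
    """Expanded uncertainty (in the units of cv) of a mean over n units, df = n - 1."""
    return T_95[n - 1] * cv / math.sqrt(n)


def min_units(cv, target):
    """Smallest tabulated n whose expanded uncertainty does not exceed target."""
    for n in range(2, max(T_95) + 2):
        if expanded_uncertainty(cv, n) <= target:
            return n
    return None  # target not reachable within the tabulated range


cv_vials = 1.0  # vial-to-vial CV in %
print(f"U with 2 vials: {expanded_uncertainty(cv_vials, 2):.1f} %")
print(f"U with 6 vials: {expanded_uncertainty(cv_vials, 6):.2f} %")
print(f"vials needed for U <= 2 %: {min_units(cv_vials, 2.0)}")
```

The steep drop between two and six vials comes mostly from the t factor, which falls from 12.7 at one degree of freedom to 2.6 at five.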
References
1. Certificate of Analysis for SRM 1952a, National Institute of Standards & Technology, Gaithersburg, MD 20899, January 8, 1990
2. Henrion A, Dube G, Richter W (1997) Fresenius J Anal Chem 358:506-508
3. Anderson RL, Bancroft TA (1952) Statistical Theory in Research. McGraw-Hill Book Company, New York
4. Guide to the Expression of Uncertainty in Measurement (1st edn) (1993) International Organization for Standardization, ISBN 92-67-10188-9
Accred Qual Assur (1999) 4:124-128
© Springer-Verlag 1999
used today for quality control of many products. The method is relevant to ISO 9001-9003, Good Manufacturing Practice, Good Laboratory Practice, and Food and Drug Administration (FDA) guidelines [10]. Therefore, different aspects of the measurement uncertainty in this method were studied in a number of publications [10-13]. In particular, the work concerning the analysis of the uncertainty budget for Ca(OH)2 and Mg(OH)2 determination in CaO and MgO is interesting [14, 15]. In this method the Karl Fischer titration of water produced by the analytes at high temperatures is the second, final step of the determination.

One of the most important sources of uncertainty is the presence of materials in the sample (components of the matrix) other than water that react with the Karl Fischer reagent (KFR). If the sample includes such components in significant concentrations, direct Karl Fischer titration will be impossible. This is a problem, for example, for ene-diols (such as ascorbic acid and its preparations) and thiols which are used in the pharmaceutical, perfumery, food and other industries.

We have developed a new method for simultaneous determination of water and ene-diols or thiols in samples unsuitable for direct Karl Fischer titration [16]. The method is based on the consecutive titration first of ene-diol or thiol against a novel reagent [17] and then of water against a conventional KFR in the same test portion and the same cell for electrometric location of the end-point in both titrations. For ene-diol or thiol the method is classified as a Category I method and for water a Category II method. Therefore LOD and LOQ evaluation is only necessary for water.

In the present paper, the dependence of LOD and LOQ on uncertainty in analytical measurement is discussed and used in the design of an experiment for the assessment of LOD and LOQ in the new method for water determination.

Dependence of LOD and LOQ on uncertainty in analytical measurement

LOD is the lowest concentration of an analyte in a sample that can be detected, but not necessarily quantified, by the analytical method [2, p. 1983]. Wegscheider has shown [9] that defining this concentration $C$ as corresponding to three standard deviations of the blank response means that the relative standard measurement uncertainty (relative standard deviation in the concentration domain) is $u(C)/C = 33\%$ at $C = \mathrm{LOD}$. Hence, using the coverage factor 2, the relative expanded uncertainty at $C = \mathrm{LOD}$ is $U(C)/C = 2u(C)/C = 66\%$ and vice versa:

$$\mathrm{LOD} = (100/66)\,U(C) = 1.5\,U(C), \qquad (1)$$

where $U(C)$ is the expanded uncertainty (absolute value).

The analogous dependence is also valid for LOQ. By definition, LOQ is the lowest concentration of an analyte in a sample which can be determined by the analytical method with acceptable precision and accuracy; LOQ is usually equal to ten standard deviations of the blank response [2, p. 1983]. In other words, the relative standard measurement uncertainty is $u(C)/C = 10\%$ at $C = \mathrm{LOQ}$. Therefore, the relative expanded uncertainty with the coverage factor 2 at the LOQ is $U(C)/C = 20\%$ and

$$\mathrm{LOQ} = (100/20)\,U(C) = 5\,U(C). \qquad (2)$$

Since uncertainty in analytical measurement can be calculated with "pen and paper only" [18], Eqs. (1) and (2) also allow one to calculate or predict LOD and LOQ before an experiment, as is shown below for water determination in the presence of ene-diols or thiols.

Uncertainty in water determination

The analytical procedure begins with the titration of the sample test portion against the novel reagent [17]. This reagent includes iodine in non-aqueous solvents, which oxidizes ene-diol or thiol to diketones or sulphide derivatives, respectively, which do not interfere with the next titration of water by KFR.

After the first titration (assay) the total water content in the flask consists of the original amount of water in the test portion and that introduced with the novel reagent during titration. This total water content is titrated against KFR (second titration).

The original water content in the sample ($C_w$, % mass) is calculated from the equation:

$$C_w = [V_{KFR} - (F \times V_{r-a})] \times T_{KFR} \times 100/m, \qquad (3)$$

where $m$ is the mass of the sample test portion in mg; $V_{KFR}$ is the volume of the KFR spent for titration of the solution formed after the first titration in ml; $T_{KFR}$ is the titre of the KFR in mg H2O/ml. Then

$$T_{KFR} = m^T_{H_2O} / V^T_{KFR}, \qquad (4)$$

where $m^T_{H_2O}$ is the mass of water test portion used for determination of the KFR titre in mg; $V^T_{KFR}$ is the volume of the KFR spent for the titration of $m^T_{H_2O}$ in ml; $V_{r-a}$ is the volume of the reagent spent for titration of the sample test portion in ml, and $F$ is the factor that corresponds to the volume of the KFR in ml spent for titration of water traces in 1 ml of the novel reagent. The $F$ value is calculated from two consecutive titrations of one and the same dry SnCl2 sample by the novel reagent and then by the KFR [16]:

$$F = V^0_{KFR} / V^0_r, \qquad (5)$$

where $V^0_r$ is the volume of the novel reagent spent for the first SnCl2 titration in ml, and $V^0_{KFR}$ is the volume
in ml of the KFR spent for the second titration of the solution formed after the first titration.

Masses of the sample test portion ($m \approx 50$ mg) and even of the water test portion for determination of the KFR titre ($m^T_{H_2O} \approx 5$ mg) are measured with negligible uncertainty (for example, with Mettler AT 201 balance it is 0.015 mg) in comparison to the uncertainty of volumes [6, 8]. Therefore, it is desirable to transform Eqs. (3-5) in the following way:

$$C_w = [V_{KFR} - (V_{r-a} \times V^0_{KFR}/V^0_r)] \times [(m^T_{H_2O} \times 100/m)/V^T_{KFR}], \qquad (6)$$

where $(m^T_{H_2O} \times 100/m) = K$ is a negligible source of uncertainty (designated as $K$ for convenience).

The standard combined uncertainty of the water determination can be evaluated by the partial differentiation of Eq. (6) taking into account the value $K$:

$$u(C_w) = \Big\{ (K/V^T_{KFR})^2 \times \big[ (u(V_{KFR})/V_{KFR})^2 + (V_{r-a}/V^0_r)^2 \times (u(V^0_{KFR})/V^0_{KFR})^2 + (V^0_{KFR}/V^0_r)^2 \times (u(V_{r-a})/V_{r-a})^2 + (V_{r-a} \times V^0_{KFR}/(V^0_r)^2)^2 \times (u(V^0_r)/V^0_r)^2 \big] + \big[ (V_{KFR} - (V_{r-a} \times V^0_{KFR}/V^0_r)) \times K/(V^T_{KFR})^2 \big]^2 \times (u(V^T_{KFR})/V^T_{KFR})^2 \Big\}^{1/2}. \qquad (7)$$

All the volumes were measured with a 10-ml burette graduated in 0.02 ml divisions (Bein Z.M., Israel). The manufacturer specifies a calibration accuracy of ±0.02 ml, which can be converted using rectangular distribution to a standard deviation $u_c(V) = 0.02/\sqrt{3} = 0.012$ ml. The standard deviation of the burette filling obtained was equal to $u_r(V) = 0.013$ ml. Since the volumes are spent for titrations, they also depend on the standard deviation of end-point detection. Bipotentiometric location of the end-point by the direct dead-stop technique is very precise [19]. Therefore, in both titrations the main deviation in end-point detection arises due to the drop size of the burette. In our case the drop is 0.013 ml and the corresponding standard deviation is $u_e(V) = 0.013/\sqrt{3} = 0.0075$ ml. The temperature uncertainty source is negligible here, therefore the standard uncertainty of each volume spent for titration using the burettes described above is $u(V) = [(u_c(V))^2 + \ldots$

[...] water concentrations in the range 0.1-1.0% mass are shown in Fig. 1.

LOD and LOQ prediction and design of the experiment

According to Eqs. (1) and (2) and the calculated value $U(C_w)$, the predicted values of LOD and LOQ for the water determination are as follows:

$$\mathrm{LOD}_{wp} = 1.5\,U(C_w) = 1.5 \times 0.074 = 0.11\%\ \text{mass} \qquad (8)$$

and

$$\mathrm{LOQ}_{wp} = 5\,U(C_w) = 5 \times 0.074 = 0.37\%\ \text{mass}. \qquad (9)$$

Corresponding values of the relative combined uncertainty (66% rel. and 20% rel.) are indicated in Fig. 1 by empty circles.

Based on this prediction, the experiment for evaluation of LOD and LOQ was designed to analyse a sample with a water concentration close to $\mathrm{LOD}_{wp}$ and two samples with water concentrations, one a little less and one a little more than $\mathrm{LOQ}_{wp}$. A purchased ascorbic acid powder containing 0.15% mass of water, purchased α-monothioglycerol with 0.24% mass of water and a fortified sample of α-monothioglycerol with 0.53% mass of water were used for this purpose.

Experimental data

The "true" values of water concentration, $C_{tr}$, in these samples, shown in Table 1, were obtained as a difference between two titrations of two independent test
portions (each with ten replicates). First, by the Karl Fischer method and then by the pharmacopoeial method for the determination of ascorbic acid [2, p. 131] or of α-monothioglycerol [2, p. 2271]. The average of 20 replicate water determinations by the new method, $C_{av}$; the corresponding relative standard deviation of replicates, $S_r$, and relative bias $S_b = (C_{av} - C_{tr})/C_{av}$ are shown in Table 1.

The relative standard uncertainty of the replicates $u_{exp}(C_w)/C_w = 100\,(S_r^2 + S_b^2)^{1/2}$ is shown (% rel.) for values $C_w = C_{tr}$ in Fig. 1 by full circles. Relative expanded uncertainty in the uncertainty $u_{exp}(C_w)/C_w$ by [18] for the level of confidence 0.95 and 20 - 1 = 19 degrees of freedom is $U_{exp} = 2.09 \times 100/(2 \times 19)^{1/2} = 39\%$ rel., where 2.09 is the corresponding two-tailed percentile of the Student's t distribution. The $U_{exp}$ values (39% rel. of $u_{exp}(C_w)/C_w$) are shown in Fig. 1 by the bars to the full circles.

Discussion

From Fig. 1 one can see that the calculated relative uncertainty $U(C_w)/C_w$ based on Eq. (7), being a hyperbola, depends on each 0.01% mass of water concentration at $C_w$ close to $\mathrm{LOD}_{wp}$. At $C_w < \mathrm{LOD}_{wp}$ the values of $U(C_w)/C_w$ quickly tend to infinity. On the other hand, for $C_w > \mathrm{LOQ}_{wp}$ the relative uncertainty asymptotically draws nearer to zero.

Note that another definition of LOD and LOQ (for example, with requirements to both Type I and Type II errors in decisions [20-22]) leads to another correlation between these parameters and the uncertainty. It is clear also that the model of uncertainty given in Eq. (7) is not absolutely complete, but predicted parameters of the method correspond rather well to the experimental data. Maybe $U(C_w)/C_w$ values calculated for concentrations close to $\mathrm{LOQ}_{wp}$ and above are excessively pessimistic (high). However, it is obvious that the experimental data (even obtained from 20 replicates) have a wide uncertainty range: sample statistical values can be significantly different from the population ones.

Because the whole study was performed within the framework of the new method validation according to the AOAC Peer-Verified Methods Program [1] which defines LOD and LOQ as experimental values, finally $\mathrm{LOD}_w = 0.2\%$ mass and $\mathrm{LOQ}_w = 0.5\%$ mass were accepted. These values are sufficient for the purposes of the method. It has been adopted as an AOAC Peer-Verified Method with the assigned number PVM 1:1998.

Conclusions

The approach to the assessment of the limits of detection and quantitation using uncertainty calculation can be helpful for the prediction of the former and experimental design.

The calculation performed for the new method of water determination in samples which are unsuitable for direct Karl Fischer titration is in good conformity with the experimental validation data.

Acknowledgements The authors thank Prof. E. Schoenberger for helpful discussions.
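The arithmetic behind the burette uncertainties and the predicted limits can be retraced with a minimal Python sketch. Two points are assumptions rather than statements of the paper: the three volume contributions (calibration, filling, drop size) are combined here in quadrature, and the limits are written in the general form 3U/k and 10U/k, which reduces to Eqs. (1) and (2) for the coverage factor k = 2. All function and variable names are hypothetical.

```python
import math


def rectangular_to_standard(half_width):
    """Standard uncertainty of a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)


# Burette contributions quoted in the text (ml).
u_cal = rectangular_to_standard(0.02)    # calibration accuracy +/- 0.02 ml
u_fill = 0.013                           # standard deviation of burette filling
u_drop = rectangular_to_standard(0.013)  # drop size 0.013 ml (end-point detection)

# Assumption: independent contributions combined in quadrature.
u_volume = math.sqrt(u_cal**2 + u_fill**2 + u_drop**2)


def predicted_limits(expanded_u, k=2.0):
    """Predict (LOD, LOQ) from an absolute expanded uncertainty U = k * u.

    LOD is taken where u/C = 1/3 and LOQ where u/C = 1/10,
    i.e. LOD = 3U/k and LOQ = 10U/k (1.5 U and 5 U for k = 2).
    """
    return 3.0 * expanded_u / k, 10.0 * expanded_u / k


lod, loq = predicted_limits(0.074)  # U(C_w) = 0.074 % mass
print(f"u_cal = {u_cal:.4f} ml, u_drop = {u_drop:.4f} ml, u(V) = {u_volume:.4f} ml")
print(f"predicted LOD = {lod:.2f} % mass, predicted LOQ = {loq:.2f} % mass")
```

Because U(C_w) is treated as constant over the concentration range, the relative expanded uncertainty U/C follows a hyperbola in C, which is the behaviour described in the discussion.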
References
1. Lauwaars M (1998) Accred Qual Assur 3:32-35
2. USP 23 (1995) The United States Pharmacopeia (USP). The National Formulary. United States Pharmacopeial Convention Inc, Md., USA
3. Kuselman I, Shenhar A (1995) Anal Chim Acta 306:301-305
4. ISO/IEC Guide 25 (1990) General requirements for the competence of calibration and testing laboratories, 3d edn. International Organization for Standardization (ISO), Geneva
5. CITAC Guide 1 (1995) International guide to quality in analytical chemistry: An aid to accreditation, 1st edn. Teddington, UK
6. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM, Teddington, UK
7. Ellison SLR, Williams A (1998) Accred Qual Assur 3:6-10
8. Kuselman I, Shenhar A (1997) Accred Qual Assur 2:180-185
9. Wegscheider W (1997) The Proceedings of the 2nd EURACHEM Workshop "Measurement Uncertainty in Chemical Analysis. Current Practice and Future Directions", 29-30 September, Berlin. EURACHEM
10. Dietrich A (1994) American Lab 26(5):33-39
11. Mitchell J Jr, Smith DM (1980) Aquametry, Part III (the Karl Fischer reagent). A treatise on methods for the determination of water. Wiley-Interscience, N.Y., USA
12. Margolis SA (1995) Anal Chem 67:4239-4240
13. Margolis SA (1997) Anal Chem 69:4864-4871
14. Zeiler HJ, Heindl R, Wegscheider W (1990) Veitsch-Radex Rundschau 2:48-55
15. Wegscheider W, Zeiler HJ, Heindl R, Mosser J (1997) Ann Chim 87:273-283
16. Sherman F, Kuselman I, Shenhar A (1996) Talanta 43:1035-1042
17. Sherman F, Kuselman I, Shenhar A (1998) Reagent for determining water and ene-diols or thiols. USA Patent No. 5,750,404, 12.05.98
18. Kuselman I (1998) Accred Qual Assur 3:131-133
19. Cedergren A (1996) Anal Chem 68:3079-3081
20. American Public Health Association, American Water Works Association, Water Environment Federation (1995) Standard methods for the examination of water and wastewater, 19th edn. American Public Health Association, Washington, USA, pp 1-10, 1-11
21. Kaus R (1998) Accred Qual Assur 3:150-154
22. Vogelgesang J, Hadrich J (1998) Accred Qual Assur 3:242-255
Accred Qual Assur (2002) 7:115-120
DOI 10.1007/s00769-002-0442-6
© Springer-Verlag 2002
Variable Factor
Regular precipitation nd +
Reverse precipitation ++ nd ++
Increase excess of precipitant nd nd nd +
Increase adding rate of precipitant -s -s nd -s +
Increase concentration of precipitant -s + nd + ++
Increase acidity of solution + nd -s
Increase concentration of analyte -s nd + +
Increase standing time nd
Increase standing temperature -s nd nd nd nd
Increase volume of washes +
Ignition condition nd nd
Increase coexisting ion nd nd nd nd +
-: factor decreases, +: factor increases, nd: no data available, -s: factor decreases not distinctly, - - : factor not applicable
cated with the minus (-). A variable that increases the magnitude of a factor is indicated with the plus (+). The relative magnitude of the effect is not considered in this table. The influence of these variables is explained below.

1. Barium in the filtrate: Because of the solubility of the precipitate, there is always a trace amount of barium in the filtrate. A relatively large amount of barium was found in the filtrate when reverse precipitation was used. The amount of barium in the filtrate increases with the acidity of the precipitating solution. To reduce barium in the filtrate, regular precipitation was used with a proper excess amount and adding rate of precipitant, so that the common-ion effect is dominant. Diluting the precipitating solution, choosing a proper acidity and temperature, and letting the precipitate stand in the filtrate for a time allow the small and imperfect crystals to form larger and more perfect crystals, which decreases the solubility of the precipitate. Another important variable is the filtrate volume: for a given condition the solubility of the precipitate is almost constant, so the loss of barium in the filtrate is reduced as the filtrate volume decreases.
2. Barium in the washes: There is always a trace amount of barium dissolved in the washes when the precipitate is washed. Increasing the concentration of the precipitant and the volume of washes increases the amount of barium in the washes. In order to decrease precipitate contamination, thorough rinsing of the precipitate is necessary. Increasing the adding rate of precipitant, the acidity of the precipitating solution, the concentration of analyte and the standing time of the precipitate decreases the amount of barium in the washes.
3. Mechanical loss of barium: Some of the micro crystals of the precipitate still adhere to the beaker side, stir rod and policeman when the precipitate is transferred onto the filter. The amount of the mechanical loss depends upon the transferring and washing processes. It can be dissolved in a hot HCl solution and detected.
4. BaCl2 and Na2SO4 occlusion: It has been found that all chloride in the precipitate was present as BaCl2 and the sodium as Na2SO4 [2]. The variables sometimes have contradictory effects on the contamination by BaCl2 and Na2SO4. Using reverse precipitation, the occlusion of BaCl2 decreased but that of Na2SO4 increased markedly. Increasing the excess amount and the adding rate of precipitant, or the content of coexisting cations and anions, aggravated the occlusion of Na2SO4, but almost no data changed for BaCl2. By diluting the solution, prolonging the standing time of the precipitate, and increasing the volume of washes, both occlusions of Na2SO4 and BaCl2 were minimized.

Experimental

1. Sample preparation

The analytical-reagent grade BaCl2 was weighed and dissolved in 5% HCl; 20 kg of barium solution was produced and the nominal concentration was 20 mg·g⁻¹. The solution was mixed and packed in 20 mL ampoules. Some of them were taken for analysis.

2. General procedure

The optimal test conditions shown in Table 2 were used in the following experiment. A hot solution of Na2SO4 was added to a pre-weighed hot acidified solution of BaCl2. The precipitated BaSO4 was left overnight on a water-bath at about 94 °C and then filtered and washed. The precipitate and filter were charred, then ignited to constant mass (the difference between two ignitions was less than 0.03 mg) at 800 °C in a platinum crucible. The masses of the
[...] mechanical loss were determined by ICP-AES. The precipitate was fused with K2CO3 at 910 °C, extracted with hot water and acidified with HNO3. Some of the aliquots were used for determining chloride by IC, calculated as BaCl2 and barium; others were used for determining sodium by AAS, calculated as Na2SO4. A schematic diagram of the procedure is shown in Fig. 1 and the gravimetric factors used for calculation are listed in Table 3.

Table 2 Optimal test conditions for the gravimetric analysis of barium sulfate

Fig. 1 Schematic diagram of the procedure [diagram not reproduced; labelled elements: Ba²⁺ sample solution, BaSO4 precipitate, filtrate, washes, ICP-AES measurement of Ba²⁺ in filtrate and in washes, occlusion, corrections]

3. Apparatus
Results and discussion

Results

The total concentration of barium in solution is calculated as follows

$$C_{Ba^{2+}}\ (\mathrm{mg \cdot g^{-1}}) = \frac{\left[(m_2 - m_6 - m_8) \times 0.5883993 + m_3 + m_4 + m_5 + m_7\right] \times 10^{-3}}{m_1} \qquad (1)$$

The factors used for calculation are given in Table 3. The results for gravimetric coupled with instrumental determination are listed in Table 5. All corrections were applied on a sample-by-sample basis. The expanded uncertainty of the final result was calculated according to the GUM guide [3].

Gravimetric determination

Using the gravimetric method alone, the concentration of Ba²⁺ in solution is 19.2680 mg·g⁻¹ and the relative standard deviation is 0.089%. They are all listed in Table 5.

Loss of barium and contamination

Loss of barium in filtrate. The mass of barium in the filtrate is about 0.02%-0.09% relative to that of the total barium. This quantity of barium should be added to the total barium. If it is not corrected, it may lead to a negative error of about 0.1% under this condition.

Loss of barium in washes. The mass of barium in washes was about 0.09% except in one sample which was up to 0.3% due to leaking, relative to that of the total barium. This quantity of barium should be added to that of the total barium. If it is not corrected, it may lead to a negative error of about 0.09% under this condition. It can also be seen that the effect of leaking or other factors can be minimized or eliminated by using correction.

Mechanical loss of barium. The mass of the mechanical loss of barium was about 0.03%-0.08% relative to that of the total barium. This quantity of barium should be added to the total barium. If it is not corrected, it may lead to a negative error of about 0.03%-0.08% under this condition.

Occlusion of BaCl2. The mass of BaCl2 occluded was about 0.04%-0.1% relative to that of BaSO4, or 0.02%-0.05% relative to that of the total barium. This quantity of BaCl2 should be subtracted from the mass of BaSO4 and the relevant mass of barium should be added to that of the total barium. If it is not subtracted, it may lead to a positive error of about 0.02%-0.05% under this condition.

Occlusion of Na2SO4. The mass of Na2SO4 occluded was about 0.03%-0.06% relative to that of BaSO4, or 0.02%-0.04% relative to that of the total barium. This quantity of Na2SO4 should be subtracted from the mass of BaSO4. If it is not subtracted, it may lead to a positive error of about 0.02%-0.04% under this condition.

Total loss of barium. Total loss of barium includes the loss in filtrate and washes, mechanical loss and the barium contained in the occlusion of BaCl2. The mass of total loss of barium was about 0.2%-0.3% relative to that of the total barium. This quantity of barium should be added to that of the total barium. Otherwise, it may lead to a negative error of about 0.2%-0.3% under this condition.
Gravimetric-instrumental determination

Using gravimetric-instrumental determination, the concentration of Ba²⁺ in solution is 19.3010 mg·g⁻¹ and a relative standard deviation of 0.024% was obtained. The results are listed in Table 5 and were calculated with Eq. (1). All the corrections were made on a sample-by-sample basis.

The over-all repeatability of the low-precision instrumental measurement coupled with high-precision gravimetric analysis (relative standard deviation, 0.024%) is three times better than the repeatability of the gravimetric analysis alone and the value (19.3010 mg·g⁻¹) is greater than the gravimetric value (19.2680 mg·g⁻¹). This indicates that an improvement has been made by the coupled instrumental analysis. In addition, a determination made without the instrumental correction would be negatively biased by 0.2%.

Assessment of uncertainty

Instrumental measurement

Barium in the filtrate measured by ICP-AES. There was a lot of NaCl in the filtrate, so the effect of matrix is very high. The barium was measured by means of standard addition. The concentration of barium in the filtrate was about 0.2 mg·L⁻¹, and the repeatability of ICP-AES was less than 2.5%. The combined standard uncertainty was 3.5% and the mass of barium in filtrate was about 100 µg, so the estimated relative standard uncertainty of the correction for the barium in filtrate was 1.8×10⁻⁵.
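The comparison between the gravimetric value and the corrected gravimetric-instrumental value can be retraced with a few lines of Python. The sketch only reuses the four figures quoted in the text (two concentrations and two relative standard deviations); the variable names are hypothetical.

```python
# Values quoted in the text.
c_gravimetric = 19.2680   # mg/g, gravimetric method alone
c_corrected = 19.3010     # mg/g, gravimetric-instrumental determination
rsd_gravimetric = 0.089   # %, relative standard deviation without corrections
rsd_corrected = 0.024     # %, relative standard deviation with corrections

# Relative bias of the uncorrected result with respect to the corrected one.
bias_percent = 100.0 * (c_gravimetric - c_corrected) / c_corrected

# Improvement in repeatability obtained by applying the corrections.
rsd_ratio = rsd_gravimetric / rsd_corrected

print(f"bias of the uncorrected result: {bias_percent:.2f} %")
print(f"repeatability improvement factor: {rsd_ratio:.1f}")
```

The computed bias of roughly -0.17% is consistent with the rounded figure of 0.2% given in the text, and the ratio of the two relative standard deviations is somewhat above three.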
[...] 180 µg, so the estimated relative standard uncertainty of the correction for the barium in washes was 1.7×10⁻⁵.

Mechanical loss of barium measured by ICP-AES. The barium was determined by ICP-AES with the matrix-matched calibration curve method. The concentration of barium in solution was about 2 mg·L⁻¹, and the repeatability was 1.5%. The combined standard uncertainty was 2% and the mass of mechanical loss of barium was about 100 µg, so the estimated relative standard uncertainty of the correction for mechanical loss of barium was 1.1×10⁻⁵.

Chloride measured by IC. There is a lot of KNO3 in the extract, so the effect of the matrix is high. The chloride is measured by means of standard addition. The concentration of chloride in solution is about 0.2 mg·L⁻¹, and the repeatability of IC is about 5%. The combined standard uncertainty was 10%. The mass of BaCl2 (transferred from chloride) in extract was about 250 µg, so the estimated relative standard uncertainty of the correction for BaCl2 and barium from occlusion was 1.1×10⁻⁴.

Sodium measured by FAAS. There was a lot of KNO3 in the extract, so the effect of the matrix is high. The sodium is measured with a matrix-matched calibration curve method. The concentration of sodium in solution is about 0.3 mg·L⁻¹, and the repeatability of FAAS was about 3%. The combined standard uncertainty was 5% and the mass of Na2SO4 (transferred from sodium) in the extract was about 160 µg, so the estimated relative standard uncertainty of the correction for occluded Na2SO4 was 2.3×10⁻⁵.

(3) standard uncertainty for correction of barium loss in washes: u5 = 1.7×10⁻⁵
(4) standard uncertainty for correction of mechanical loss of barium
    a standard uncertainty for the measurement of mechanical loss of barium: u6 = 1.1×10⁻⁵
    b standard uncertainty caused by the incomplete extraction for the mechanical loss of barium: u7 = 1×10⁻⁵ (the extraction rate of barium from the beaker, stir rod and policeman was about 98%)
(5) standard uncertainty for correction of occlusion BaCl2
    a standard uncertainty for the measurement of occlusion BaCl2: u8 = 1.1×10⁻⁴
    b standard uncertainty caused by the incomplete extraction for occluded BaCl2 in precipitate: u9 = 1.5×10⁻⁵
(6) standard uncertainty for correction of occlusion Na2SO4
    a standard uncertainty for the measurement of occlusion Na2SO4: u10 = 2.3×10⁻⁵
    b standard uncertainty caused by the incomplete extraction for occluded Na2SO4 in precipitate: u11 = 1.0×10⁻⁵
(7) standard uncertainty of the measurement for atomic weight: u12 = 1.9×10⁻⁴
(8) standard uncertainty of the buoyancy modification: u13 = 5.8×10⁻⁶

The combined standard uncertainty is 4×10⁻⁴.
The expanded standard uncertainty is 8×10⁻⁴.
Conclusion
Assessment of the combined uncertainty
The combination of a classical gravimetric determination
Uncertainty Assessment of type A. The source of Type A
together with instrumental techniques was used to ana-
uncertainty is the relative standard deviation of result by
lyze the concentration of barium in solution. Corrections
gravimetric -instrumental method u 1=2.4xlO-4.
were made to the classical gravimetric method to correct
the loss of barium in the filtrate, washes and the mechan-
Uncertainty Assessment of type B. The sources of Type B
ical loss, and the contaminants of BaCl 2 and Na2S04' In-
uncertainty are as follows:
strumental methods (ICP-AES, IC, FAAS) were used to
(1) weighing uncertainty: quantify the loss of barium and the contaminants in the
a standard uncertainty for sample weighing u2=2xl()-6 precipitate. The sources of the uncertainties have been
b standard uncertainty for precipitate weighing assessed thoroughly, and the values were obtained. The
u3=8xlO--6 uncertainty of the combined method has been improved
(2) standard uncertainty for correction of barium loss in remarkably and the expanded standard uncertainty (k =2)
filtrate u4=1.8xI0-5 is 0.08%.
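The Type A and Type B components listed above can be combined numerically. The following is a minimal sketch, assuming the thirteen components are independent and are combined by root-sum-of-squares (the text does not state the combination rule explicitly); the dictionary and function names are illustrative only. Under that assumption the sum comes out near $3.3 \times 10^{-4}$, which agrees with the reported $4 \times 10^{-4}$ only if the authors rounded upward.

```python
import math

# Relative standard uncertainties as listed in the text (u1 is Type A, the rest Type B).
COMPONENTS = {
    "u1": 2.4e-4,   # repeatability of the gravimetric-instrumental result
    "u2": 2e-6,     # sample weighing
    "u3": 8e-6,     # precipitate weighing
    "u4": 1.8e-5,   # barium loss in filtrate
    "u5": 1.7e-5,   # barium loss in washes
    "u6": 1.1e-5,   # mechanical loss, measurement
    "u7": 1e-5,     # mechanical loss, incomplete extraction
    "u8": 1.1e-4,   # occluded BaCl2, measurement
    "u9": 1.5e-5,   # occluded BaCl2, incomplete extraction
    "u10": 2.3e-5,  # occluded Na2SO4, measurement
    "u11": 1.0e-5,  # occluded Na2SO4, incomplete extraction
    "u12": 1.9e-4,  # atomic weight
    "u13": 5.8e-6,  # buoyancy correction
}


def combine(values):
    """Combine independent standard uncertainties by root-sum-of-squares (assumed rule)."""
    return math.sqrt(sum(v * v for v in values))


u_combined = combine(COMPONENTS.values())
U_expanded = 2 * u_combined  # coverage factor k = 2, as in the text

print(f"combined u = {u_combined:.2e}, expanded U (k=2) = {U_expanded:.2e}")
```

The three largest terms ($u_1$, $u_{12}$ and $u_8$) account for almost all of the variance, so the instrumental corrections for filtrate, washes and mechanical loss contribute very little to the final figure.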
Accred Qual Assur (2000) 5:100-103
© Springer-Verlag 2000
peatability for four replicates is $R_{4,s} = 3.63\,u(AN) = 0.22\,AN$.
At the level of intermediate precision for five replicates, each an average from four daily
results, the permissible range is $R_{5,d} = 3.86\,u(AN)/(0.83\sqrt{4}) = 0.14\,AN$.
For the reproducibility level, the predicted permissible range for laboratory results, each an
average from 4 × 5 = 20 replicates, is $R_{2,l} = 2.77\,u(AN)/(0.67\sqrt{20}) = 0.06\,AN$.
termination (from 4 replicates, j = 1, 2, ..., 4), and $W_i$ is the range of replicates
$X_{ij}$ in the i-th day (i = 1, 2, ..., 5).
Table 2 shows the ranges $W_5$ between the highest and lowest daily average values $X_i$, the
total laboratory average results $X_{avg} = \sum_{i=1}^{5} \sum_{j=1}^{4} X_{ij}/20$ and the
difference $W_2$ between them, with the predicted permissible ranges: 1) for four replicates
during a day - $R_{4,s}$; 2) for five daily average values - $R_{5,d}$ and 3) for two total
average results - $R_{2,l}$.
Comparing $W_i$ values with their norms $R_{4,s}$ one can see that all $W_i \le R_{4,s}$, so
the method repeatability is satis-
Oil sample                  Lab 1, day                               Lab 2, day
                            1       2       3       4       5        1       2       3       4       5
Basic oil,            Xi    0.0060  0.0062  0.0060  0.0061  0.0058   0.0062  0.0061  0.0062  0.0060  0.0066
purchased             Wi    0.0004  0.0010  0.0009  0.0003  0.0008   0.0006  0.0008  0.0007  0.0004  0.0003
Basic oil,            Xi    0.053   0.055   0.051   0.052   0.052    0.055   0.050   0.051   0.050   0.052
fortified             Wi    0.003   0.003   0.002   0.005   0.001    0.003   0.008   0.005   0.006   0.005
Transformer oil,      Xi    0.0023  0.0022  0.0023  0.0023  0.0023   0.0021  0.0021  0.0021  0.0022  0.0022
purchased             Wi    0.0001  0.0002  0.0001  0.0001  0.0001   0.0002  0.0000  0.0001  0.0001  0.0001
Transformer oil,      Xi    0.052   0.051   0.052   0.051   0.051    0.049   0.052   0.052   0.051   0.052
fortified             Wi    0.001   0.003   0.001   0.003   0.004    0.000   0.000   0.001   0.001   0.001
White oil,            Xi    0.0021  0.0022  0.0021  0.0021  0.0022   0.0020  0.0021  0.0019  0.0019  0.0020
purchased             Wi    0.0003  0.0002  0.0002  0.0002  0.0001   0.0002  0.0002  0.0000  0.0000  0.0000
White oil,            Xi    0.050   0.052   0.052   0.051   0.050    0.051   0.050   0.050   0.049   0.049
fortified             Wi    0.003   0.001   0.002   [illegible] 0.003 0.002  0.004   0.004   0.002   0.002
Table 2 Results of the range prediction and statistical analysis of the experimental data using Horwitz's norms
Columns R4,s to R2,l: range calculations (mg KOH/g oil). Columns C to RSD3N: statistical analysis with the Horwitz's norms (C in parts of 1, RSD values in %).
Oil sample        Lab   R4,s    W5      R5,d    Xavg    W2      R2,l    C           RSD1   RSD1N  RSD2   RSD2N  RSD3   RSD3N
Basic oil,        1     0.0014  0.0004  0.0009  0.0060  0.0002  0.0004  1.1·10^-5   4.29   7.47   1.96   4.50   2.32   2.49
purchased         2             0.0006          0.0062                              3.40          2.91
Basic oil,        1     0.011   0.004   0.007   0.053   0.001   0.003   9.4·10^-5   2.24   5.40   2.77   3.25   1.35   1.80
fortified         2             0.005           0.052                               4.28          3.66
Transformer oil,  1     0.0005  0.0001  0.0003  0.0023  0.0001  0.0001  4.0·10^-6   2.33   8.71   2.22   5.25   3.14   2.90
purchased         2             0.0001          0.0022                              2.01          1.05
Transformer oil,  1     0.011   0.001   0.007   0.051   0.000   0.003   9.1·10^-5   1.77   5.43   0.90   3.27   0.00   1.81
fortified         2             0.003           0.051                               0.83          2.17
White oil,        1     0.0004  0.0001  0.0003  0.0021  0.0001  0.0001  3.7·10^-6   3.47   8.83   2.70   5.32   3.45   2.94
purchased         2             0.0002          0.0020                              1.67          4.24
White oil,        1     0.011   0.002   0.007   0.051   0.001   0.003   9.0·10^-5   1.35   5.43   1.65   3.27   1.40   1.81
fortified         2             0.002           0.050                               2.16          1.66
266 I. Kuselman
factory. The intermediate precision is also satisfactory, since $W_5 \le R_{5,d}$ for all oils
in both laboratories. The similar assessment of the method reproducibility by the condition
$W_2 \le R_{2,l}$ is positive too.
The same results can be obtained by statistical analysis of the experimental data using
comparison of the relative standard deviations (RSDs) of $X_{ij}$, $X_i$ and $X_{avg}$ with the
corresponding empirical Horwitz's norms [8, 10-12]. For this purpose the following parameters
are calculated and shown in Table 2:
1. Values $X_{avg}$ (mg KOH/g oil) expressed as concentrations of naphthenic acid in decimal
   fractions: $C = X_{avg}(100/56.11)/1000$, where 100 and 56.11 are the molecular masses of
   naphthenic acid and KOH, respectively, and 1000 is the factor for transformation of mg to g.
2. RSDs of $X_{ij}$ averaged for 5 days (the i-th values were homogeneous):
   $RSD_1 = 100 \{ \sum_{i=1}^{5} \sum_{j=1}^{4} [(X_{ij} - X_i)^2 / X_i^2] / [5(4-1)] \}^{1/2}$, %.
3. Norms for $RSD_1$ by Horwitz:
   $RSD_{1N} = 2^{(1 - 0.5 \log C)} \times 0.67$, %.
4. RSD of $X_i$:
   $RSD_2 = 100 [ \sum_{i=1}^{5} (X_i - X_{avg})^2 / (5-1) ]^{1/2} / X_{avg}$, %.
5. Norms for $RSD_2$ derived from $RSD_{1N}$:
   $RSD_{2N} = RSD_{1N} / (0.83\sqrt{4})$, %.
6. RSD of $X_{avg}$:
   $RSD_3 = 100\sqrt{2}\,[(X_{avg1} - X_{avg2}) / (X_{avg1} + X_{avg2})]$, %,
   where numbers 1 and 2, as additional indices for $X_{avg}$, denote Lab 1 and Lab 2,
   respectively.
7. Norms for $RSD_3$ by Horwitz:
   $RSD_{3N} = RSD_{1N} / (0.67\sqrt{20})$, %.
All the RSD values are less than their norms, except inter-laboratory $RSD_3$ for purchased
transformer and white oils. These samples have the lowest AN and are, therefore, more difficult
to analyse. However, it should be noted that the sample standard deviations $RSD_3$ have
2(20-1)-1 = 37 degrees of freedom, while the number of degrees of freedom of the norms
$RSD_{3N}$ can be accepted as infinity. So, correct comparison of the standard deviations with
their norms should be based on $\chi^2$ or Fisher's criteria [13]. For example, by Fisher's
criterion for transformer oil $F = RSD_3^2 / RSD_{3N}^2 = 3.14^2 / 2.90^2 = 1.17$. It is less
than the critical value $F_{0.05}\{37, \infty\} = 1.54$ at the level of confidence 0.95.
Therefore, the population value of $RSD_3$ for this oil is no more than the Horwitz's norm. The
same is true also for white oil, since the corresponding
$F = 3.45^2 / 2.94^2 = 1.38 < F_{0.05}\{37, \infty\} = 1.54$.
From the values above it can be seen that the range prediction and uncertainty calculation (on
which the prediction is based) are adequate and in good conformity not only with the
experimental data for the method validation, but also with the database used by Horwitz for
calculation of his norms.
Conclusions
The approach to evaluate the permissible ranges for results of replicate determinations using
uncertainty calculation can be helpful for prediction of ranges and statistical analysis of the
data obtained during validation of the analytical (chemical) method.
The range prediction performed for the new method of pH-metric AN determination without
titration in petroleum oils is in good conformity with the experimental validation data.
Acknowledgements The author thanks Professor E. Schoenberger, Professor Ya. Tur'yan and
Dr. E. Strochkova for helpful discussions.
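The Horwitz norms and the Fisher comparison used in the statistical analysis above can be checked numerically. The sketch below is illustrative only: the function names are made up for this example, the formulas are transcribed from items 1, 3, 5, 6 and 7 of the list, and the inputs are the Table 2 values for purchased basic oil and purchased transformer oil.

```python
import math


def concentration_fraction(x_avg):
    """AN in mg KOH/g oil -> naphthenic acid concentration as a decimal fraction (item 1)."""
    return x_avg * (100 / 56.11) / 1000


def rsd1_norm(c):
    """Horwitz norm for repeatability RSD, in % (item 3)."""
    return 2 ** (1 - 0.5 * math.log10(c)) * 0.67


def rsd2_norm(rsd1n):
    """Norm for the RSD of daily averages of four replicates, in % (item 5)."""
    return rsd1n / (0.83 * math.sqrt(4))


def rsd3_norm(rsd1n):
    """Norm for the RSD of laboratory averages of 20 replicates, in % (item 7)."""
    return rsd1n / (0.67 * math.sqrt(20))


def rsd3(x_avg1, x_avg2):
    """Inter-laboratory RSD from two laboratory averages, in % (item 6)."""
    return 100 * math.sqrt(2) * abs(x_avg1 - x_avg2) / (x_avg1 + x_avg2)


# Purchased basic oil: C = 1.1e-5 in Table 2.
n1 = rsd1_norm(1.1e-5)
print(round(n1, 2), round(rsd2_norm(n1), 2), round(rsd3_norm(n1), 2))

# Purchased transformer oil: laboratory averages 0.0023 and 0.0022 mg KOH/g.
f_ratio = (rsd3(0.0023, 0.0022) / 2.90) ** 2
print(round(f_ratio, 2), f_ratio < 1.54)  # 1.54 is the critical value quoted in the text
```

Expressing each norm as a function of $RSD_{1N}$ mirrors the way the text derives them, so a change in the assumed concentration propagates to all three norms at once.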
References
1. Havlicek LL, Crain RD (1988) Practical statistics for the physical sciences. American Chemical Society, Washington, D.C.
2. United States Pharmacopeia. USP 23 (1995) US Pharmacopeial Convention, Inc., Rockville, Md., pp 1982-1984
3. Kuselman I, Sherman F (1999) Accred Qual Assur 4:124-128
4. Tur'yan YI, Strochkova E, Berezin OY, Kuselman I, Shenhar A (1998) Talanta 47:53-58
5. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM, p 16
6. Owen DB (1962) Handbook of statistical tables. Addison-Wesley, Reading, Mass., pp 138-139
7. Dixon WJ, Massey FJ Jr (1969) Introduction to statistical analysis, 3rd edn. International Student Edition, New York, Table A-8b
8. AOAC Peer-Verified Methods Program (1993) Manual on policies and procedures. Association of Official Analytical Chemists International, Arlington, p 9
9. Kuselman I, Shenhar A (1997) Accred Qual Assur 2:180-185
10. Horwitz W, Albert R (1987) Anal Proc 24:49-55
11. Thompson M, Fearn T (1996) Analyst 121:275-278
12. King B (1999) Accred Qual Assur 4:27-30
13. Miller JC, Miller JN (1993) Statistics for analytical chemistry, 3rd edn. Ellis Horwood, Bodmin, England
Accred Qual Assur (2002) 7: 13-18
© Springer-Verlag 2002
low blank values, while isooctane is less toxic. Therefore, stages of the KI oxidation in all
the discussed methods are the same, but in the new methods they are combined with the iodine
redox-potentiometric measurements in the same electrochemical cell. So, the main point is that
the uncertainty of iodine determinations by volumetric titration and the uncertainty of iodine
redox-potentiometric measurements without titration should be compared.
To compare the iodine measurement uncertainties, they are assessed by identification of the
uncertainty sources, quantification of uncertainty components and calculation of combined
uncertainties according to the EURACHEM/CITAC Guide [8] as values
$u_c(y(x_1, x_2, \ldots, x_n)) = [\sum_i (u(y, x_i))^2]^{1/2}$, where
$y(x_1, x_2, \ldots, x_n)$ is a function of parameters $x_1, x_2, \ldots, x_n$ and $u(y, x_i)$
is the uncertainty in y arising from the uncertainty in $x_i$, i = 1, 2, ..., n.
Uncertainty of measurement results by the proposed methods
After completing the reaction of the KI oxidation by hydroperoxides contained in the oil test
portion, the equilibrium $I_2 + I^- \rightleftharpoons I_3^-$ is established at the KI excess.
Thus, redox-potential $E_1$ caused by the electrochemically reversible couple
$I_3^- + 2e^- \rightleftharpoons 3I^-$ is measured in the aqueous phase with the Pt indicator
and Ag/AgCl, 3 mol/l KCl, 3 mol/l KNO3 reference electrodes. After $E_1$ measurement the
standard addition of the iodine aqueous solution is introduced into the cell and potential
$E_2$ is measured (for more details see [3]).
PV of the test portion is calculated using the following equation:
$PV_t = (1000/m)\,[(C_{st} \times V_{st}) / (10^{\Delta E/S} - 1)]$, meq/kg, (1)
where m is the mass of the oil test portion [g]; $C_{st}$ is the iodine concentration in the
standard solutions for addition, expressed in gram-equivalents per liter [N]; $V_{st}$ is the
volume of the iodine standard addition [ml]; $\Delta E = E_1 - E_2$ is the difference of the
potentials [mV]; S is an electrochemical parameter equal to 2.303 RT/2F [mV], where R is the
universal gas constant (8.314 J/(K × mol)), T is the temperature [K], and F is the Faraday
constant (96485 coulombs).
To obtain PV of the tested oil, $PV_t$ should be corrected for the blank (organic solvent-water
system without oil). Blank value $PV_0$ is calculated by the same formula as $PV_t$, for the
same mass value m. Finally, PV of the oil is
$PV = PV_t - PV_0$. (2)
The main PV uncertainty components following from Eqs. (1) and (2) are shown in Table 1. Note
only some details for their evaluation.
The final mass m of an oil test portion is the difference in masses between a beaker with the
test portion and the empty beaker (after oil transfer to the solvent). These masses are weighed
using the balance (Mettler AE 163, Switzerland) with reading 0.0001 g and calibration expanded
uncertainty of ±0.0002 g at the level of confidence 0.95 and coverage factor 2 (normal
distribution) in the range up to 100 g.
Uncertainty in iodine concentration $C_{st}$ in the standard solutions is calculated taking
into account the manufacturer information on possible deviation of the iodine titer
(0.02%/°C) in a Titrisol ampoule (Merck, Germany), as well as the information on the volume
uncertainty for volumetric flasks according to DIN, Class A, used for the solutions
preparation, and possible temperature variation in the laboratory (in limits of 20±2 °C).
A recommended standard addition for oil samples with different expected PV is 0.1-1 ml of
0.01 N, 0.1 N, or 0.5 N iodine solutions: this volume should be negligible in comparison with
the volume of the aqueous phase (70-110 ml). For transfer of the addition to the "oil-organic
solvent-water" system, a mechanical hand pipette is used (Gilson, France, calibrated at INPL
based on the gravimetric method [9]).
$E_1$ and $E_2$ are measured under the same conditions, both within 2-3 min, by the same
instrument (a pH/ion-meter PHM 95, Radiometer, France). The expanded measurement uncertainty is
±(0.2+0.0005E) mV according to the Radiometer's information. At the normal distribution it
corresponds to the standard uncertainty u(E) = 0.1+0.00025E mV. Since the E measurement range
Fig. 1 Dependence of the relative standard uncertainty $u(PV_t)/PV_t$ on the measured difference of potentials $\Delta E$ [mV]. The optimal $\Delta E$ value is shown by the dotted line. Wavy lines show the range of recommended $\Delta E$ values
Fig. 2 The expanded uncertainty U(PV) [meq/kg] as function of the ratio $r = PV/PV_0$. Line 1 corresponds to $PV_0$ = 0.06 meq/kg, line 2 to $PV_0$ = 0.2 meq/kg, line 3 to $PV_0$ = 0.5 meq/kg. LOD and LOQ for $PV_0$ = 0.5 meq/kg are shown by the dotted and wavy lines, correspondingly
recommended in the method [3] is 282-333 mV, u(E) = 0.2 mV. So, the standard uncertainty of the
difference between such two E measurements is
$u(\Delta E) = \sqrt{2} \times u(E) = 0.3$ mV.
The main parameter influencing S value is the temperature. Its variations in the laboratory in
the range 291-295 K (18-22 °C) lead to the S changes from 28.87 to 29.27 mV.
As one can see from Table 1, $\Delta E$ measurement is the dominant source of the uncertainty
in results of PV determination calculated by Eq. (1). So, the relative standard uncertainty of
such a result $u(PV_t)/PV_t$ is calculated by the logarithmic partial differentiation of the
function (1) concerning $\Delta E$:
$u(PV_t)/PV_t = [2.303 \times 10^{\Delta E/S} \times u(\Delta E/S) / (10^{\Delta E/S} - 1)]$. (3)
Taking into account $u(\Delta E)$ = 0.3 mV and S = 28.87-29.27 mV, Eq. (3) can be simplified:
$u(PV_t)/PV_t = 0.024 / (1 - 1/10^{\Delta E/S})$. (4)
line. Moreover, all the range $\Delta E$ = 7-13 mV can be recommended for practical use as far
as it covers uncertainty values close to the optimal one: $u(PV_t)/PV_t$ = 0.047±0.010. The
same is correct for the blanks.
Uncertainty of the final result of PV determination in the tested oil calculated by Eq. (2) is
$u(PV) = [(u(PV_t))^2 + (u(PV_0))^2]^{1/2}$. At the optimal $\Delta E$, assuring
$u(PV_t)/PV_t = u(PV_0)/PV_0 = 0.047$, normal distribution and coverage factor 2, the expanded
uncertainty U(PV) is
$U(PV) = 2 \times (0.047\,[PV_t^2 + PV_0^2]^{1/2}) = 0.094\,[PV_t^2 + PV_0^2]^{1/2}$, meq/kg. (5)
U(PV) as function of the ratio $PV/PV_0 = r$ is shown in Fig. 2 in the range r = 1-3 at $PV_0$
equal to 0.06, 0.20, and 0.50 meq/kg (lines 1-3, correspondingly). From Eq. (5) the relative
expanded uncertainty is the following:
$U(PV)/PV = 0.094\,(r^2 + 1)^{1/2} / (r + 1)$. (6)
When $PV_0$ is negligible, $PV = PV_t$, $r + 1 \approx r$, $r^2 + 1 \approx r^2$ and the
relative expanded uncertainty is U(PV)/PV = 0.094.
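Equations (1), (3), (5) and (6) above lend themselves to a short numerical check. The following is a minimal sketch under stated assumptions: the function names are hypothetical, S is treated as fixed when propagating $u(\Delta E)$ (so that $u(\Delta E/S) = u(\Delta E)/S$), and the test-portion mass, iodine concentration and addition volume in the usage lines are example values chosen for illustration, not data from the paper.

```python
import math

R = 8.314    # universal gas constant, J/(K mol), as given in the text
F = 96485.0  # Faraday constant, C/mol, as given in the text


def slope_mV(temperature_K):
    """Electrochemical parameter S = 2.303 RT / 2F, converted from V to mV."""
    return 1000 * 2.303 * R * temperature_K / (2 * F)


def pv_test_portion(m_g, c_st_N, v_st_ml, delta_e_mV, s_mV):
    """Eq. (1): PV_t in meq/kg from a single standard addition."""
    return (1000 / m_g) * (c_st_N * v_st_ml) / (10 ** (delta_e_mV / s_mV) - 1)


def rel_u_pvt(delta_e_mV, s_mV, u_delta_e_mV=0.3):
    """Eq. (3) with S taken as fixed: relative standard uncertainty of PV_t."""
    x = delta_e_mV / s_mV
    return 2.303 * (u_delta_e_mV / s_mV) * 10 ** x / (10 ** x - 1)


def expanded_u_pv(pv_t, pv_0, rel_u=0.047, k=2):
    """Eq. (5): expanded uncertainty of PV = PV_t - PV_0, in meq/kg."""
    return k * rel_u * math.hypot(pv_t, pv_0)


def rel_expanded_u_pv(r):
    """Eq. (6) exactly as printed in the text."""
    return 0.094 * math.sqrt(r ** 2 + 1) / (r + 1)


print(round(slope_mV(291), 2), round(slope_mV(295), 2))  # S at 18 and 22 degrees C
print(round(rel_u_pvt(9.0, slope_mV(293)), 3))           # near the quoted 0.047
print(pv_test_portion(5.0, 0.01, 0.5, 9.0, slope_mV(293)))  # example inputs only
print(round(rel_expanded_u_pv(1.47), 3))
```

Because $10^{x}/(10^{x} - 1) = 1/(1 - 10^{-x})$, `rel_u_pvt` is algebraically the same as the simplified Eq. (4) once $2.303\,u(\Delta E)/S$ is rounded to 0.024.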
LOQ can be predicted as PV corresponding to the ten standard uncertainties of the blank response
or its five expanded uncertainties:
$LOQ = 5\,U(PV_0) = 0.47\,PV_0$, meq/kg oil. (8)
It means $PV_t = 1.47\,PV_0$ or r = 1.47 and again, practically as for LOD, U(PV)/PV = 0.068.
Values r and U(PV) appropriate to LOD and LOQ are shown in Fig. 2 for the case of
$PV_0$ = 0.50 meq/kg, for example, by the dotted and wavy lines, correspondingly. The limit
values here are LOD = 0.14 × 0.50 = 0.07 meq/kg and LOQ = 0.47 × 0.50 = 0.23 meq/kg. For more
pure solvents, i.e., for blanks with lower $PV_0$, LOD and LOQ are lower also.
It is clear from Eqs. (5), (6), (7), and (8) and Fig. 2 how the uncertainty of the result of PV
determination depends on the solvent purity (blank $PV_0$), especially for fresh refined oils
having PV~0.5 meq/kg.
Uncertainty of measurement results by the standard methods
According to the standard methods [4,5], iodine, liberated after completing the reaction of the
KI oxidation by hydroperoxides contained in the oil test portion, is titrated against sodium
thiosulfate. The final result is calculated as usual in titrimetry:
$PV = (1000/m) \times (V_{th} - V_{th-0}) \times C_{th}$, meq/kg, (9)
where $V_{th}$ and $V_{th-0}$ are the volumes of thiosulfate solution spent for titration of the
test solution and of the blank, correspondingly [ml]; $C_{th}$ is the thiosulfate concentration
[N].
The main PV uncertainty components in this way are shown in Table 2. Some details for their
evaluation are discussed below in the same manner as previously for the new methods.
The final mass of an oil test portion (m = 5 g) is determined as described already: it does not
depend on the method for PV determination.
Uncertainty in thiosulfate concentration $C_{th}$ in solutions prepared from a Titrisol ampoule
(Merck, Germany) is calculated by analogy with the uncertainty in iodine concentration $C_{st}$
in solutions also prepared from such ampoules.
Volume of the thiosulfate solution spent for titration is the more complicated parameter. A
2-ml microburette (Bein Z.M., Israel) with 0.01-ml divisions and a drop size reduced to 0.008 ml
was used for this titration [3]. The manufacturer specifies a calibration accuracy of ±0.01 ml.
The standard deviation of the burette filling obtained was equal to the standard deviation of
calibration - 0.006 ml. Since the volumes $V_{th}$ and $V_{th-0}$ are spent for titrations,
they depend also on the standard deviation of end-point detection. Using "potato starch for
iodometry" at the end of titration, as recommended in the standard [5], which produces a deep
blue color in the presence of the iodonium ion, location of the end-point (when the blue color
just disappears) is precise. It requires analytical experience, especially at low iodine
concentrations (low PV), but for an experienced analyst the main deviation in the end-point
detection arises from the drop size of the burette. For the described burette the corresponding
standard deviation is $0.008/\sqrt{3} = 0.005$ ml. The temperature uncertainty source is
negligible here, and therefore the standard uncertainty of each volume spent for titration
using such a burette is
$u(V_{th}) = u(V_{th-0}) = (0.006^2 + 0.006^2 + 0.005^2)^{1/2} \approx 0.01$ ml.
Note, the use of a 50-ml burette, DIN, Class A (Duran, Germany) with 0.1-ml divisions,
calibration accuracy ±0.05 ml and a drop size 0.05 ml, recommended in the standard [10], leads
to the standard uncertainty $u(V_{th}) = u(V_{th-0}) = 0.05$ ml. So, such a burette is not
suitable for fresh refined oils with PV~0.5 meq/kg, as far as the volume of the 0.01 N
thiosulfate solution spent for titration in this case is $V_{th}$~0.25 ml.
From comparison of the main components of PV uncertainty in Table 2, one can see that volume
components are dominant. So, assuring normal distribution and coverage factor 2, the relative
expanded uncertainty of the titration result is
$U(PV)/PV = 2 \times [(u(V_{th}))^2 + (u(V_{th-0}))^2]^{1/2} / (V_{th} - V_{th-0}) = 0.028 / (V_{th} - V_{th-0})$. (10)
Dependence of U(PV)/PV on the volume spent for titration is shown in Fig. 3 in the range of
$(V_{th} - V_{th-0})$ up to
Uncertainty and other metrological parameters of peroxide value determination in vegetable oils 271
For a blank the volume $V_{th-0}$ is relevant only in Eq. (10), so
$U(PV_0)/PV_0 = 0.02/V_{th-0}$. The same in Eq. (9): at
272 I. Kuselman · E. Kardash-Strochkova · Y. I. Tur'yan
Oil | Proposed technique: PVp, meq/kg; Sp, meq/kg | Standard titration: PVs, meq/kg; Ss, meq/kg | Fisher's ratio F | Student's ratio t | PVp/PVs, %
To compare the methods at PV levels higher than $PV_0$, five kinds of oils were analyzed using
the purified isooctane: canola, soya, sunflower, olive, and maize. Table 4 shows the average
results $PV_p$ and $PV_s$ obtained by the proposed and standard methods from n = 5 replicates
for each sample; the standard deviations for these replicates - $S_p$ and $S_s$, respectively;
Fisher's ratio $F = S_p^2/S_s^2$; Student's ratio
$t = |PV_s - PV_p| / [(S_s^2 + S_p^2)/5]^{0.5}$; and $PV_p/PV_s$, %.
The critical value for the F-ratio is 6.39 at the 95% level of confidence and the number of the
degrees of freedom n-1 = 4. For the t-ratio the critical value is 2.31 at the 95% level of
confidence and the number of the degrees of freedom 2(n-1) = 8.
From the comparison of the F-data with the critical value it follows that the differences
between repeatability of the results obtained by the standard titration and by the proposed
method are insignificant (all F values are less than 6.39), i.e., repeatability of the proposed
method is sufficient.
The accuracy of these techniques is approximately the same, since the deviations of the average
$PV_p$ results obtained by the proposed method from the average results obtained by the
standard titration $PV_s$ are insignificant in comparison with the repeatability deviations
(all t values are less than 2.31), i.e., accuracy of the proposed technique is sufficient.
Average ratio $PV_p/PV_s$ is 99.6%.
Similar results were obtained in our previous work with the same oils dissolved in chloroform
[3]. However, in the experiment described in [3] the oils were more fresh, and though they were
stored in a refrigerator, their PV have increased after some months. Therefore, results of PV
determinations shown in Table 4 are higher than those in [3].
Since LOD and LOQ should be determined experimentally [11], only LOQ = 0.2 meq/kg predicted for
the standard methods is approved based on the work [3] data for fresh refined canola. Test of
other predictions requires an additional experiment with very good refined fresh oils.
Conclusions
1. Metrological parameters of the new methods for redox-potentiometric PV determination in
   vegetable oils are fit for purpose (similar to those demonstrated by the standard methods).
2. Uncertainties of results of PV determination by the proposed methods are close to those by
   the standard methods, or worse than them, for oils with PV significantly more than
   0.5 meq/kg. However, for fresh refined oils with PV~0.5 meq/kg the proposed methods are
   better than the standard ones, if the criterion is the measurement uncertainty.
3. For the proposed methods LOD = 0.01 meq/kg and LOQ = 0.03 meq/kg are predicted, while for
   the standard methods LOD = 0.06 meq/kg and LOQ = 0.20 meq/kg.
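The F- and t-ratio comparison of the proposed and standard methods described above can be sketched in a few lines. This is an illustration only: the replicate values below are invented (the Table 4 data are not reproduced in this text), the function name is hypothetical, and the critical values 6.39 and 2.31 are the ones quoted in the text for n = 5.

```python
import math
from statistics import mean, stdev

F_CRITICAL = 6.39  # 95% confidence, 4 and 4 degrees of freedom (from the text)
T_CRITICAL = 2.31  # 95% confidence, 8 degrees of freedom (from the text)


def compare_methods(proposed, standard):
    """Return (F, t) for two equally sized sets of replicate PV results."""
    if len(proposed) != len(standard):
        raise ValueError("both methods need the same number of replicates")
    n = len(proposed)
    s_p, s_s = stdev(proposed), stdev(standard)
    f_ratio = s_p ** 2 / s_s ** 2
    t_ratio = abs(mean(standard) - mean(proposed)) / math.sqrt((s_s ** 2 + s_p ** 2) / n)
    return f_ratio, t_ratio


# Hypothetical replicate results in meq/kg, for illustration only.
pv_proposed = [2.10, 2.05, 2.12, 2.08, 2.11]
pv_standard = [2.09, 2.07, 2.13, 2.06, 2.10]

f_ratio, t_ratio = compare_methods(pv_proposed, pv_standard)
print(f"F = {f_ratio:.2f} (limit {F_CRITICAL}), t = {t_ratio:.2f} (limit {T_CRITICAL})")
```

Both ratios falling below their critical values is the condition the text uses to call the repeatability and the accuracy of the proposed method sufficient.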
References
1. EURACHEM (1998) EURACHEM Guide. The fitness for purpose of analytical methods. A laboratory guide to method validation and related topics, 1st edn. Teddington, UK
2. Kuselman I, Sherman F (1999) Accred Qual Assur 4:124-128
3. Kardash-Strochkova E, Tur'yan Ya, Kuselman I (2001) Talanta 54:411-416
4. AOCS (1996) AOCS official methods, vol II, method Cd 8-53: Peroxide value. Acetic acid-chloroform method. AOCS, Champaign, IL, USA
5. AOCS (1996) AOCS official methods, vol II, method Cd 8b-90: Peroxide value. Acetic acid-isooctane method. AOCS, Champaign, IL, USA
6. Finne G, Ikins WG, Williams J Jr, Welborn JL (1998) Inside Lab Manage 2:24-26
7. Israel Standard No 216 (1994) Edible vegetable oils. Tel Aviv, Israel
8. EURACHEM/CITAC (2000) EURACHEM/CITAC Guide. Quantifying uncertainty in analytical measurement, 2nd edn. Teddington, UK
9. ISO (1999) ISO/DIS 8655-6 Draft. Piston-operated volumetric apparatus. Part 6: Gravimetric test methods. Geneva, Switzerland
10. AOCS (1996) AOCS official methods, vol II, method Ja 8-87: Peroxide value. AOCS, Champaign, IL, USA
11. AOAC (1997) AOAC peer-verified methods program. Manual on policies and procedures. AOAC, Gaithersburg, MD
Accred Qual Assur (1999) 4:504-510
© Springer-Verlag 1999
Introduction
Determination of nitrogen content plays a key role in assigning values to insulin reference
materials. The reference materials in question serve as reference standards when measuring
insulin in drug products. Thus, the Kjeldahl nitrogen determination is a crucial link in the
traceability chain. The uncertainty budget for the Kjeldahl method published in this paper has
three objectives: firstly, to estimate the uncertainty of results obtained by the Kjeldahl
method; secondly, to identify steps in the analytical procedure that may be targets for
improvement; and thirdly, to contribute a generally applicable procedure for evaluating an
uncertainty budget for a chemical analytical method to the current literature. The need for such
schemes or procedures is urgent as accredited laboratories in the near future will be required
to state their uncertainty of measurement [1]. The uncertainty budget published in this paper
is based on existing guidelines [2, 3].
Methods
Kjeldahl method
The nitrogen content of dry insulin was determined by the Kjeldahl method, which consists of
three steps: digestion, distillation and titration. In brief, approximately 50 mg of the sample
was weighed accurately and transferred to a digestion test tube. Concentrated sulphuric acid
was added and the mixture was heated to 390 °C for 4 h (digestion). The tube with the digested
sample was placed in the Kjeldahl apparatus, sodium hydroxide and steam were added, and ammonia
was distilled off. The ammonia-rich steam was condensed in the receiving flask (distillation).
The content of ammonia in the receiving flask was determined by end-point titration to pH 4.5
with 0.1 mol/l hydrochloric acid (titration). The normality of the acid ($N_{HCl}$, in mol/l)
was determined by titration of tris-(hydroxymethyl)-aminomethane (Tris). Blanks consisted of
empty digestion
274 T. Anglov · I.M. Petersen · J. Kristiansen
tubes treated as samples. Samples were measured in duplicate and blanks in triplicate.
Let a (mg) denote the amount of sample. If b (ml) and c (ml) denote the volume of hydrochloric
acid used for titration of the blanks and the sample, respectively, then the relative nitrogen
content of the sample ($N_{total}$) can be calculated as:
$N_{total} = 14.01\ \text{g/mol} \cdot (c - b) \cdot N_{HCl} / a$ (1)
The method was validated using two insulin drugs. The relative standard deviation under
intermediate precision conditions was found to be 0.085%.
Uncertainty budgets
$\left(\frac{u(N_{total})}{N_{total}}\right)^2 = \left(\frac{u(c)}{\sqrt{2}\,(c-b)}\right)^2 + \left(\frac{u(b)}{\sqrt{3}\,(c-b)}\right)^2 + \left(\frac{u(N_{HCl})}{N_{HCl}}\right)^2 + \left(\frac{u(a)}{\sqrt{2}\,a}\right)^2$ (4a)
The factors $1/\sqrt{2}$ and $1/\sqrt{3}$ account for the number of replicate determinations
(2 and 3, respectively). The expression may be rearranged in order to emphasize the sensitivity
coefficient of each uncertainty component:
$u(N_{total})^2 = \left(\frac{N_{total}}{\sqrt{2}\,(c-b)}\right)^2 u(c)^2 + \left(\frac{N_{total}}{\sqrt{3}\,(c-b)}\right)^2 u(b)^2 + \left(\frac{N_{total}}{\sqrt{2}\,a}\right)^2 u(a)^2 + \left(\frac{N_{total}}{N_{HCl}}\right)^2 u(N_{HCl})^2$ (4b)
The standard uncertainties $u(N_{HCl})$, u(a), u(b) and u(c) are themselves composed from
various uncertainty components. These uncertainties were likewise obtained from uncertainty
budgets (Appendix B1-4).
Using a coverage factor k = 2 the result of the measurement should be reported as:
$N_{total} \pm U(N_{total}) = 0.1233\ (\pm 0.00046)$
where $U(N_{total})$ is the expanded standard uncertainty. Details on the uncertainty budgets
for $u(N_{HCl})$, u(a), u(b) and u(c) are given in Appendix B.
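Equations (1) and (4a) can be evaluated together as follows. This is a minimal sketch, not the authors' budget: the function names are illustrative, the sample mass, titration volumes and the uncertainties of a, b and c are hypothetical values chosen only so that $N_{total}$ comes out near the reported 0.1233, and only $N_{HCl}$ = 0.1 mol/l with $u(N_{HCl}) = 1.1 \times 10^{-4}$ mol/l is taken from Table 1 of the paper.

```python
import math

M_N = 14.01  # molar mass of nitrogen in g/mol, as used in Eq. (1)


def n_total(a_mg, b_ml, c_ml, n_hcl):
    """Eq. (1): relative nitrogen content (mass fraction) of the sample."""
    return M_N * (c_ml - b_ml) * n_hcl / a_mg


def rel_u_n_total(a_mg, b_ml, c_ml, n_hcl, u_a, u_b, u_c, u_n_hcl):
    """Eq. (4a): relative standard uncertainty for duplicate samples and triplicate blanks."""
    diff = c_ml - b_ml
    terms = [
        u_c / (math.sqrt(2) * diff),  # sample titration volume, duplicate
        u_b / (math.sqrt(3) * diff),  # blank titration volume, triplicate
        u_n_hcl / n_hcl,              # normality of the acid
        u_a / (math.sqrt(2) * a_mg),  # sample mass, duplicate
    ]
    return math.sqrt(sum(t * t for t in terms))


# Hypothetical inputs except n_hcl and u_n_hcl (see lead-in).
a, b, c, n_hcl = 50.0, 0.05, 4.45, 0.1
u_a, u_b, u_c, u_n_hcl = 0.03, 0.005, 0.007, 1.1e-4

n = n_total(a, b, c, n_hcl)
u_rel = rel_u_n_total(a, b, c, n_hcl, u_a, u_b, u_c, u_n_hcl)
print(f"N_total = {n:.4f}, U (k=2) = {2 * u_rel * n:.5f}")
```

Writing the budget in relative form keeps each term dimensionless, which makes it easy to see which component dominates before converting back to an absolute expanded uncertainty.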
Uncertainty of nitrogen determination by the Kjeldahl method 275
Table 1 Values of uncertainty components, their standard uncertainty and sensitivity coefficient
Normality of hydrochloric acid: symbol $N_{HCl}$; value 0.1 mol/l; standard uncertainty $1.1 \times 10^{-4}$ mol/l; sensitivity coefficient $N_{total}/N_{HCl}$ = 1.23 l/mol; source Appendix B1
Relative contributions to the uncertainty
The relative contributions ($r_i$) from $u(N_{HCl})$, u(a), u(b) and u(c) to the combined
standard uncertainty variance $u(N_{total})^2$ are shown in Fig. 1. The two largest
contributions come from $u(N_{HCl})$ and u(c). Both contribute 35-38% to the combined standard
uncertainty variance. The uncertainty budget for $u(N_{HCl})$ (Appendix B1) shows that the
uncertainty of this component is composed mainly of uncertainty contributions from the
temperature of the titrant, the weighing of the Tris base and the volume of the titrant. The
primary contributions to u(c) (see Appendix B4) come from the uncertainty of digesting the
sample (which consists of the uncertainty of the amount of sample transferred to the digestion
tube) and from the uncertainty of the volume of the titrant.
In Fig. 2 the relative contributions to the uncertainty variance are summarized for various
types of analytical steps and equipment used in the Kjeldahl method. The contribution from
volumetric equipment (Fig. 2) includes the contributions from uncertainty of the volume of the
titrant and from uncertainty of the temperature of the titrant (Appendix B1, 3 and 4). Weighing
(Fig. 2) includes first and second weighing of both Tris base and the sample (Appendix B1 and
4). Digestion (Fig. 2) includes all uncertainty components denoted in digestion in Appendix B3
and 4. Lastly, Tris-purity (Fig. 2) denotes the uncertainty component associated with the
purity of the Tris base evaluated in Appendix B1. From Fig. 2 it can be seen that the largest
uncertainty contribution comes from the use of volumetric equipment, i.e. burettes used for
titration of hydrochloric acid and for titration of samples and blanks. This uncertainty
component is composed of the precision of titration and of the uncertainty of the temperature.
The latter factor influences the volume of titrant used for titration. The second largest
contribution comes from weighing (of Tris base and of the sample).
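1 Various contributions (footnote a): symbol $q_{contr}$; uncertainty $1 \times 10^{-4}$, Type B, rectangular; standard uncertainty $1 \times 10^{-4}/\sqrt{3} = 5.8 \times 10^{-5}$; sensitivity coefficient $N_{HCl}/q_{contr}$ = 0.1 mol/l; contribution $5.8 \times 10^{-6}$ mol/l; source: own estimate
2 MPE of the pH-meter (footnote b): symbol $q_{pH}$; source: instrument specification, own estimate
3 Purity of Tris (footnote c): symbol $q_{pur}$; value 0.9989 (99.89%); uncertainty 0.0005, Type B; standard uncertainty $0.0005/\sqrt{3} = 2.9 \times 10^{-4}$; sensitivity coefficient $N_{HCl}/q_{pur}$ = 0.1 mol/l; contribution $2.9 \times 10^{-5}$ mol/l; source: supplier certificate
4 First weighing of Tris base: symbol $m_1$; uncertainty 0.1 mg, Type B, rectangular; standard uncertainty $0.1\ \text{mg}/\sqrt{3} = 5.8 \times 10^{-2}$ mg; sensitivity coefficient $N_{HCl}/(m_1 - m_2) = 1.02 \times 10^{-3}$ mol/(mg·l); contribution $5.9 \times 10^{-5}$ mol/l; source: instrument specification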
a Contributions from: the water content of the silica gel; lack of complete drying of the Tris base; variation in the time of drying; variation in the time of dissolution of the Tris base; the hygroscopy of the Tris base; air bubbles in the burette. Combined uncertainty estimated to 0.01 mg Tris base or 0.01% of approximately 100 mg Tris base
b The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget
c The stipulated purity of tris(hydroxymethyl)aminomethane (Tris) is 99.89% with an uncertainty of 0.05%
d A standard deviation of 0.004 mL was found experimentally
e Water density at 15°C is 0.99913 g/mL, at 20°C 0.99823 g/mL and at 25°C 0.99707 g/mL. It is assumed that the temperature in the laboratory (and the temperature of the acid) on average is 20°C, but in the worst case may vary ±5°C. The largest change in water density from 20 to 25°C is 0.00116 g/mL, or 0.12%
f The uncertainty of the molar mass is assumed to give a negligible contribution to the combined uncertainty
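The arithmetic behind the individual budget lines can be sketched in a few lines of Python. This is only an illustration. It assumes, as the "rectangular" entries suggest, that a stated limit is divided by √3 to obtain a standard uncertainty, which is then multiplied by the sensitivity coefficient. The function names are hypothetical and the numerical inputs are the purity and weighing values as read from the budget above.

```python
import math

def rectangular_standard_uncertainty(half_width):
    """Standard uncertainty of a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)

def contribution(half_width, sensitivity):
    """Uncertainty contribution = sensitivity coefficient * standard uncertainty."""
    return sensitivity * rectangular_standard_uncertainty(half_width)

# Purity of Tris: stated uncertainty 0.05 % (0.0005), sensitivity about 0.1 mol/L
u_purity = contribution(0.0005, 0.1)

# First weighing of Tris base: 0.1 mg, sensitivity 1.02e-3 mol/(mg*L)
u_weighing = contribution(0.1, 1.02e-3)

print(f"{u_purity:.1e} mol/L, {u_weighing:.1e} mol/L")
# prints: 2.9e-05 mol/L, 5.9e-05 mol/L
```

Both values agree with the contributions quoted for the purity and the first weighing of the Tris base.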
a No Kjeldahl mixture or acid was used. The digestion was not performed
b Addition of water and NaOH; distillation time; leaks; incomplete transmission of NH3 to the receiving vessel; temperature of the vapour. All components are assumed to be negligible
c A standard deviation of 0.004 mL was found experimentally
d Water density at 15°C is 0.99913 g/mL, at 20°C 0.99823 g/mL and at 25°C 0.99707 g/mL. It is assumed that the temperature in the laboratory (and the temperature of the acid) on average is 20°C, but in the worst case may vary ±5°C. The largest change in water density from 20 to 25°C is 0.00116 g/mL, or 0.12%
e The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget

a The uncertainty components amount of catalyst, amount of Kjeldahl mixture and H2SO4, block temperature, digestion time and boiling are assumed not to contribute significantly to the uncertainty. Transfer of sample to the digestion vessel is assumed to contribute with 0.1 mg or 0.2%
b Addition of water and NaOH; distillation time; leaks; incomplete transmission of NH3 to the receiving vessel; temperature of the vapour. All components are assumed to be negligible
c A standard deviation of 0.004 mL was found experimentally
d The pH-meter specifications stipulate an MPE of 0.02 pH units. The form of the titration curve indicates that an uncertainty of 0.02 pH corresponds to a negligible uncertainty in the volume of titrant. Thus, the value of this uncertainty component is not included in the uncertainty budget
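The temperature figure quoted in the footnotes on water density can be checked with a short calculation. The sketch below uses only the three densities given there. Taking 20°C as the reference and expressing the change relative to the density at 20°C are assumptions made here for illustration.

```python
# Water densities quoted in the footnotes, in g/mL, keyed by temperature in degC
density = {15: 0.99913, 20: 0.99823, 25: 0.99707}

reference = density[20]

# Largest absolute density change for a +/-5 degC excursion from 20 degC
largest_change = max(abs(density[t] - reference) for t in (15, 25))

# The same change expressed relative to the density at 20 degC, in percent
relative_change_percent = 100 * largest_change / reference

print(f"{largest_change:.5f} g/mL, {relative_change_percent:.2f} %")
# prints: 0.00116 g/mL, 0.12 %
```

The larger excursion is the one towards 25°C, which is why the footnotes quote the change from 20 to 25°C.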
N: Number of quantities or uncertainty components
N_HCl: The normality of the hydrochloric acid
N_total: Content of nitrogen in the sample (average of 2 samples)
q_i: Correction factors
u(x_i): Standard uncertainty of the input estimate x_i
U(y): Expanded uncertainty of y
u(y): The combined standard uncertainty of y
x_i: Input estimates
y: Output estimate (the result of a measurement)
$\partial f/\partial x_i$: Sensitivity coefficient for x_i

$\partial f/\partial x_i = y/x_i$    (A2)

If the input estimates x_i are related to y by additions and subtractions, Eq. (3) becomes:

... correction factors for the water content of Tris, the accuracy of the pH meter, the temperature of the titrant, and the purity of Tris (Table B1):

2. Estimation of the uncertainty of a

The input estimate a is obtained from weighing the sample (m_2) and the sample cup alone (m_1), i.e. $a = f(x_i) = m_2 - m_1$. Thus, the standard uncertainty of a is given by the equation $u_a^2 = u_{m_2}^2 + u_{m_1}^2$ (Table B2).

where v is the estimated volume of titrant read from the burette and all the correction factors (q_i) are equal to 1 (Table B3).
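For a quantity obtained as the difference of two weighings, a = m_2 - m_1, both sensitivity coefficients have magnitude 1, so the variances of the two weighings simply add. A minimal Python sketch follows. It assumes, purely for illustration, that each weighing carries a 0.1 mg rectangular specification like the one quoted for the Tris weighing. The actual Table B2 values are not reproduced here, and the function name is hypothetical.

```python
import math

def u_difference(u_m2, u_m1):
    """Standard uncertainty of a = m2 - m1 for independent weighings."""
    return math.sqrt(u_m2**2 + u_m1**2)

# Assumed balance specification: 0.1 mg, rectangular distribution
u_m = 0.1 / math.sqrt(3)

u_a = u_difference(u_m, u_m)
print(f"u_a = {u_a:.3f} mg")
# prints: u_a = 0.082 mg
```

With equal uncertainties on both weighings, the uncertainty of the difference is √2 times that of a single weighing.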
References
1. International Standard ISO/IEC 17025 (1998) General requirements for the competence of testing and calibration laboratories (Draft). International Organization for Standardization, Geneva
2. BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML (1993) Guide to the expression of uncertainty in measurement. International Organization for Standardization, Geneva
3. EURACHEM (1995) Quantifying uncertainty in analytical measurement, 1st edn. EURACHEM
4. Miller JC, Miller JN (1993) Statistics for analytical chemistry, 3rd edn. Ellis Horwood, New York
5. Kristiansen J, Christensen JM, Nielsen JL (1996) Mikrochim Acta 123:241-249
6. Hansen AM, Kristiansen J, Nielsen JL, Byrialsen K, Christensen JM (1990) Talanta 50:367-379
7. ISO (1993) International vocabulary of basic and general terms in metrology (VIM), 2nd edn. International Organization for Standardization, Geneva
John Fleming
Bernd Neidhart
Christoph Tausch
Wolfhard Wegscheider

Glossary of analytical terms*

John Fleming
LGC, Queens Road, Teddington, Middlesex TW11 0LY, UK

Bernd Neidhart, Christoph Tausch
Philipps-Universität Marburg, Hans-Meerwein-Strasse, D-35032 Marburg, Germany

Wolfhard Wegscheider
Montanuniversität Leoben, Franz-Josef-Strasse 18, A-8700 Leoben, Austria
are urged to contribute to the debate and work towards a consensus on the usage of the key terms covered by the glossary. This debate can be pursued either by corresponding with the editor of this journal or by sending an email message to jwf@lgc.co.uk for consideration by the working group.

Repeatability

Wiederholpräzision (D, A, CH); Répétabilité (F, B); Repetibilidad (E); Επαναληψιμότητα (GR); Ripetibilità (I); Herhaalbarheid (NL); Powtarzalność (PL); Toistettavuus (SF); Ismételhetőség (H); (RUS); Repetibilidade (P)

Definition

Precision under repeatability conditions.¹

Description

Repeatability is the closeness of the agreement between the results of independent measurements of the same analyte carried out subject to all of the following conditions: the same method of measurement, the same observer, the same measuring instrument, the same location, the same conditions of use, repetition over a short period of time.²
Independent measurements are made on distinct subsamples of a test material. If possible, at least 8 measurements should be performed.
Repeatability is a characteristic of a method, not of a result.

Example

Successive measurements under the above conditions gave eight single results from which a standard deviation is calculated. The standard deviation multiplied by 2.8 gives the repeatability at 95% confidence level.
Suppose that an analyst uses a method for which the repeatability has been established as 2 mg/mL. If, in a real case, the same analyst reported results of a measurement repeated over a short time interval as 50 and 56 mg/mL, there would be a question over the validity of these results, as they are very unlikely to have differed by 6 mg/mL as a result of random variability.

1 ISO 3534-1 (1993)
2 International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO central secretariat, 1 rue de Varembé, CH-1211 Geneva 20

Reproducibility

Vergleichpräzision (D, A, CH); Reproductibilité (F, B); Reproducibilidad (E); Αναπαραγωγιμότητα (GR); Riproducibilità (I); Reproduceerbarheid (NL); Odtwarzalność (PL); Uusittavuus (SF); Reprodukálhatóság (H); (RUS); Reprodutibilidade (P)

Definition

Precision under reproducibility conditions.¹

Description

Reproducibility is the closeness of the agreement between the results of measurements of the same analyte in distinct subsamples of a test material, where the individual measurements are carried out changing conditions such as: observer, measuring instrument, location, conditions of use, time, but applying the same method.²

Example

In a laboratory intercomparison, samples (e.g. a surface water) were sent to a number of laboratories for determination of e.g. nitrite. Each laboratory reports its results as single values. The standard deviation from all accepted individual results multiplied by 2.8 gives the reproducibility at 95% confidence level.
Suppose that the reproducibility of a method has been determined to be x. If two of the laboratories in a real case reported results for subsamples of the same sample which differed by more than x, there would be a question concerning the quality of performance.
Methods which have a large reproducibility may not be suitable for making valid comparisons in a given real situation. In this case either the method must be improved or another method with a smaller reproducibility must be applied.

1 ISO 3534-1 (1993)
2 International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO central secretariat, 1 rue de Varembé, CH-1211 Geneva 20

Traceability

Rückführbarkeit (D, A, CH); Traçabilité (F, B); Trazabilidad (E); Ιχνηλάτηση (GR); Riferibilità (I); Herleidbarheid (NL); Rastreabilidade (P); Jaeljitettaevyys (SF); Visszavezethetőség (H); Zgodność (PL); (RUS)

Definition

The property of a result of measurement whereby it can be related to appropriate standards, generally international or national standards, through an unbroken chain of comparisons.¹
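The repeatability example in this glossary can be expressed as a short Python sketch. The factor 2.8 is the one quoted in the example and corresponds approximately to 1.96 × √2. The eight results in the list are hypothetical values invented for illustration, and the function names are not taken from the glossary.

```python
import statistics

def repeatability_limit(results, factor=2.8):
    """Repeatability at the 95 % level: factor * sample standard deviation."""
    return factor * statistics.stdev(results)

def exceeds_limit(x1, x2, limit):
    """True if two results differ by more than the established limit."""
    return abs(x1 - x2) > limit

# Hypothetical set of eight results obtained under repeatability conditions
results = [50.1, 49.6, 50.8, 49.9, 50.4, 49.2, 50.7, 50.3]
r = repeatability_limit(results)

# Case from the example: repeatability 2 mg/mL, reported results 50 and 56 mg/mL
print(exceeds_limit(50, 56, 2.0))
# prints: True
```

The same check applies to reproducibility, with the standard deviation taken over the accepted results of the participating laboratories.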
The uncertainty for the determination of e.g. atrazine in water consists of the combination of several components of uncertainty, such as the uncertainty of the true content of the atrazine standard, uncertainty from dilution of this standard, uncertainty regarding the loss of atrazine in sampling and storage prior to analysis, as well as that associated with the preconcentration step after correction for recovery. The result would be expressed as:
1.02 ± 0.13 mg/L

International vocabulary of basic and general terms in metrology, 1993 (BIPM, IEC, IFCC, ISO, IUPAC, IUPAP, OIML); ISO central secretariat, 1 rue de Varembé, CH-1211 Geneva 20
Quantifying uncertainty in analytical measurement, EURACHEM, Queens Road, Teddington, Middlesex TW11 0LY, UK