Multivariate Calibration
Harald Martens
Norwegian Computing Center,
N-0314 Oslo 3, Norway
Norwegian Food Research Institute,
Oslovegen 1, N-1430 Aas, Norway
and
Tormod Næs
Norwegian Food Research Institute,
Oslovegen 1, N-1430 Aas, Norway
JOHN WILEY & SONS
Chichester - New York - Brisbane - Toronto - Singapore
1 Introduction to Multivariate Calibration
SUMMARY Multivariate calibration is a general selectivity and reliability
enhancement tool. It is applicable to determination of major constituents as well as
microcomponents and other qualities, and for a very wide range of instrument types.
Successful examples range from spectrophotometric determination of the protein
percentage in intact, whole wheat kernels, to chromatographic determination of
dioxin in smoke at the nanogram range.
With multivariate calibration the need for sample preparation is greatly reduced.
The reason is that selective input measurements are no longer needed; it is the
output results that must be selective.
Multivariate calibration can thus stimulate the development of new analytical
instruments. It can also increase the analytical capacity and reliability of traditional
instruments. This extends the usefulness of quantitative chemical analysis in on-line
industrial process control, analysis of intact biological or medical samples, low-cost
pollution monitoring, etc.
This chapter provides a non-technical background. It shows why it can be useful
to perform indirect measurements and calibrate these to yield valuable information,
instead of always trying to measure the wanted information directly. It illustrates
why selectivity problems often make multivariate procedures necessary and it
motivates why the reader should make the effort of learning certain statistical
techniques in order to ensure relevant, precise and reliable calibrations.
1.1 WHY MULTIVARIATE CALIBRATION?
SUMMARY There is a need for improved quantitative information in science and
technology. This requires transformation of measurements into informative results.
Calibration is to establish this transformation.
1.1.1 ABSOLUTE VERSUS RELATIVE CALIBRATION
The word ‘calibrate’ traditionally means to determine the inner diameter or capacity
(the calibre) of a gun or some other cylinder, for instance using the traditional
‘caliper’ instrument. But if you want to determine the calibre of something, then
you first have to learn how to relate your measurements X to some calibre reference
Y: you have to calibrate your instrument.
In this case the calibration is a standardization to a fixed scale, and is here
termed absolute calibration. Such absolute calibrations should be traceable to legally
accepted international standards, like the tuning of musical instruments to a fixed
frequency scale using a tuning fork.
However, in practical quantitative analysis, the absolute accuracy of the end
results is often less important than their reliability and relevance.
It sometimes happens that absolute ‘tuning fork’ standards simply do not
exist or are irrelevant for certain instruments. And more significantly, the tuning
of the instrument strings does not ensure good music: absolute calibration of
each individual variable may be irrelevant for the purpose of the analysis. One
example of this is diffuse near infrared (NIR) spectroscopy, for which many of the
techniques in this book originally were developed: The important thing is not to
attain universally accepted absorbance readings at some individual wavelengths;
the purpose is to predict protein content in a certain type of wheat samples,
octane number in a certain type of gasoline etc. So while the NIR reflectance
or transmission from the intact samples appears highly confusing, multivariate
calibration converts readings at several wavelengths into precise and relevant information.
Therefore in the present book the meaning of the word ‘calibrate’ is generalized
in the following way, here sometimes referred to as relative calibration:
To CALIBRATE is to use empirical data and prior knowledge for determining
how to predict unknown quantitative information Y from available measurements
X, via some mathematical transfer function.
Multivariate calibration then means determining how to use many measured
variables x1, x2, ..., xK simultaneously for quantifying some target variable y. For
instance, the X-variables could be chromatographic or spectroscopic measurements,
and the target variable could be analyte concentration.
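One simple form such a transfer function can take, written here only as a sketch, is a linear combination of the measured variables. The coefficient symbols b_0, ..., b_K, the residual term e and the count K are notation assumed for this illustration; they are not taken from the text above.

```latex
y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_K x_K + e
```

In this linear case, calibration amounts to estimating b_0, ..., b_K from samples where both the x-variables and y are known. Prediction then applies the estimated coefficients to new measurements.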
1.1.2 AN EXTREME CALIBRATION PROBLEM THAT CANNOT
BE SOLVED BY TRADITIONAL METHODS
The following spectroscopic illustration summarizes the content of this book by
demonstrating some advantages of multivariate calibration in a real, although
somewhat exaggerated, example from chemical analysis.
Assume that you want to monitor the concentration of a chemical constituent
in a complex liquid industrial process stream by high-speed light absorption
spectroscopy (Figure 1.1a). The analyte in the example is actually the old litmus
pH-indicator.
Figure 1.1 An example of a selectivity problem: Spectroscopic quantification of litmus at
unknown pH and unknown turbidity without sample preparation. a) Application potential:
Remote high-speed analysis of a complex liquid process stream by fiber optic spectroscopy
to determine the concentration of an analyte.
Now you may have serious analytical problems:
The way that the analyte actually absorbs light in situ in the complex samples
may be different from the spectrum of the analyte in pure form, if the constituent
interacts with the solvent and with other constituents in the samples (analogy:
the NIR spectrum of H2O in wheat flour is different from that of pure H2O). So
calibrating for the analyte in isolated, purified model systems may be of little use;
it will have to be done empirically on realistic samples from the actual process.
But there may be other problems too: The analyte may not be stable and/or
homogeneous, and the measurements may be contaminated by interferences.
First of all, let us assume that there are natural pH variations in the process,
and the in situ absorbance spectrum of the analyte changes with this pH variation
(which of course litmus does).
Secondly, there are varying levels of particulate material in the liquid samples,
causing strong turbidity changes in the samples to be analyzed (in this illustration
unknown levels of ZnO powder were added).
And thirdly, there may be spectral interferences from other, more or less
unidentified constituents and instrument variations.
Since your purpose for determining the analyte is on-line industrial process
control, you have no time for cleaning and standardizing samples in the laboratory
prior to the light absorption measurements. You may choose to measure the diffuse
light transmittance T directly through the more or less turbid liquids, since these are
fast, simple and precise measurements. Figure 1.1b shows the apparent absorbance
spectra, log(1/T), of a set of such samples with varying analyte concentrations,
varying pH and varying light scattering levels.
Figure 1.1c shows the ‘best univariate calibration line’ for concentration by
traditional calibration procedures. The ‘best’ wavelength is the isosbestic point
of the constituent (520 nm, marked by a vertical arrow in Figure 1.1b). The
crosses represent the data of the 10 calibration samples used in determining this
univariate calibration line; they show a rather unsatisfactory relationship between
concentration of the analyte and the absorbance at its ‘best’ wavelength.
Under traditional circumstances chemists would conclude that these high-speed
diffuse absorption data are unsatisfactory for determining the analyte in the given
type of samples. This is illustrated for ‘unknown’ samples (A, B, C, D and E).
The solid arrow marked A shows how the analyte concentration is subsequently
predicted from the absorbance reading, via this best univariate calibration line.
The concentration predictions are quite erroneous when compared to their correct
values, as expected.
Figure 1.1d, in contrast, shows the corresponding prediction results obtained from
the same data, but with multivariate calibration: The apparent absorbances from
a number of different wavelengths were now combined in a statistical calibration
model using 10 calibration samples as a training set. In this case the three normal
samples’ concentrations of the analyte were then correctly predicted, irrespective
of level of pH and light scattering. This illustrates how multivariate calibration can
greatly enhance the selectivity of analytical measurements.
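The contrast between the univariate and the multivariate calibration can be reproduced in a small, idealized simulation. The sketch below is not the authors' data or method: the five wavelengths, the absorptivity profiles, the noise-free mixing model and the use of ordinary least squares are all assumptions made for illustration. The analyte has an acid and a base form with equal absorptivity at one wavelength (standing in for the isosbestic point), and a turbidity baseline is added at an unknown level.

```python
import numpy as np

# Hypothetical absorptivity profiles at five illustrative wavelengths (nm).
wavelengths = np.array([440, 480, 520, 560, 600])
acid = np.array([0.9, 0.6, 0.5, 0.2, 0.1])     # acid form of the indicator
base = np.array([0.1, 0.3, 0.5, 0.8, 0.6])     # base form; equals acid at 520 nm
scatter = np.array([1.0, 0.9, 0.8, 0.7, 0.6])  # assumed turbidity baseline
ISO = 2                                        # index of the isosbestic point

def simulate(n, rng):
    """Return (spectra, concentration) for n samples with random pH and turbidity."""
    c = rng.uniform(0.2, 1.0, n)   # analyte concentration
    f = rng.uniform(0.0, 1.0, n)   # fraction in base form (stands in for pH)
    t = rng.uniform(0.0, 0.5, n)   # light scattering level
    X = np.outer(c * (1 - f), acid) + np.outer(c * f, base) + np.outer(t, scatter)
    return X, c

def fit_linear(X, y):
    """Least-squares fit of y ~ b0 + X @ b; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=1e-10)
    return coef

def predict(X, coef):
    return coef[0] + X @ coef[1:]

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

rng = np.random.default_rng(0)
X_cal, c_cal = simulate(10, rng)   # 10 calibration samples (training set)
X_new, c_new = simulate(3, rng)    # 3 'unknown' normal samples

uni = fit_linear(X_cal[:, [ISO]], c_cal)   # one wavelength only
multi = fit_linear(X_cal, c_cal)           # all wavelengths combined

rmse_uni = rmse(predict(X_new[:, [ISO]], uni), c_new)
rmse_multi = rmse(predict(X_new, multi), c_new)
print(f"univariate RMSE:   {rmse_uni:.4f}")
print(f"multivariate RMSE: {rmse_multi:.2e}")
```

In this noise-free setting the multivariate model recovers the concentrations essentially exactly, because pH and turbidity each contribute their own spectral pattern that the combined wavelengths can separate. The single-wavelength model cannot tell analyte absorbance from the turbidity baseline. With real, noisy spectra the gain would be smaller.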
The two letters D and E represent measurements under two (unknown to us)
abnormal conditions: In one case the spectrophotometer instrument was not working
properly and needed maintenance; in the other an unforeseen chemical constituent
was present and interfered with the spectral reading. In both cases the obtained
predicted analyte concentrations happened to come in a reasonable range, and
might have passed unnoticed in spite of the gross errors involved.
But the multivariate ‘disharmony analysis’ identified both abnormal observations
as outliers, and an error warning was automatically given by the software. Upon
this prompt we compare the residual ‘disharmony’ spectrum of each outlier to the
expected noise levels in normal samples, and get a quick indication of what the
problems seem to be (Figure 1.1e). This illustrates how multivariate calibration can
make quantitative analysis safer.
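One common way to obtain such a residual spectrum is to describe the calibration spectra by a few principal components and examine what is left of a new spectrum after projection onto them. Whether this matches the authors' ‘disharmony analysis’ is an assumption. The band shapes, noise level, number of components and the three-times-noise warning limit below are likewise invented for this sketch.

```python
import numpy as np

wl = np.arange(440, 700, 30)  # nine illustrative wavelengths (nm)

def band(centre, width):
    """Gaussian absorption band evaluated on the wavelength grid."""
    return np.exp(-0.5 * ((wl - centre) / width) ** 2)

acid, base = band(500, 40), band(590, 40)   # two forms of the analyte
scatter = 1.0 - (wl - 440) / 480.0          # assumed turbidity baseline
interferent = band(650, 15)                 # constituent absent from calibration

rng = np.random.default_rng(1)
NOISE = 0.002                               # assumed instrument noise (absorbance)

def spectrum(c, f, t, extra=0.0):
    """Noisy spectrum for concentration c, base fraction f, turbidity t."""
    clean = c * (1 - f) * acid + c * f * base + t * scatter + extra * interferent
    return clean + rng.normal(0.0, NOISE, wl.size)

# Ten calibration samples with varying concentration, pH and turbidity.
params = rng.uniform([0.2, 0.0, 0.0], [1.0, 1.0, 0.5], (10, 3))
X_cal = np.array([spectrum(c, f, t) for c, f, t in params])

# Model subspace: mean spectrum plus three principal components.
mean = X_cal.mean(axis=0)
_, _, Vt = np.linalg.svd(X_cal - mean, full_matrices=False)
P = Vt[:3]

def residual(x):
    """Part of the centred spectrum that the model subspace cannot describe."""
    d = x - mean
    return d - (d @ P.T) @ P

# Warning limit: three times the RMS residual size seen in calibration.
cal_sizes = np.array([np.linalg.norm(residual(x)) for x in X_cal])
limit = 3.0 * np.sqrt(np.mean(cal_sizes ** 2))

samples = {"normal": spectrum(0.6, 0.4, 0.2),
           "abnormal": spectrum(0.6, 0.4, 0.2, extra=0.3)}
sizes = {}
for name, x in samples.items():
    r = residual(x)
    sizes[name] = float(np.linalg.norm(r))
    status = "OUTLIER" if sizes[name] > limit else "ok"
    peak = wl[np.argmax(np.abs(r))]         # where the misfit is largest
    print(f"{name}: residual {sizes[name]:.4f} ({status}), largest near {peak} nm")
```

The residual size acts as the warning signal. The shape of the residual across wavelengths gives the hint about the cause, since an unmodelled constituent leaves a misfit concentrated around its own absorption band.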
With multivariate calibration it is possible to build somewhat ‘intelligent’
analytical instruments that give quantitative, reliable determination of valuable
information from high-speed, but highly non-selective input data. This can