
2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2010)

A Quantification Method of Glucose in Aqueous Solution by FTIR/ATR Spectroscopy
Jiemei Chen1, Lingling Wu1,2, Tao Pan2,*, Jun Xie2, Huazhou Chen2,3
1 Department of Biological Engineering, Jinan University, Guangzhou 510632, P.R. China
2 Key Laboratory of Optoelectronic Information and Sensing Technologies of Guangdong Higher Educational Institutes (Jinan University), Guangzhou 510632, P.R. China
3 Department of Mathematics, Shanghai University, Shanghai 200444, P.R. China
*tpan@jnu.edu.cn

Abstract: A rapid method for the quantitative analysis of glucose in aqueous solution was established using FTIR/ATR spectroscopy, partial least squares (PLS) regression and Savitzky-Golay (SG) smoothing. The calibration set and the prediction set were divided according to the prediction effect of the optimal single-wavenumber model. By extending the number of smoothing points and the degree of the polynomial, 483 smoothing modes were calculated. PLS models were established for all combinations of the 483 SG smoothing modes and 1-40 PLS factors. The optimal smoothing parameters were first-order derivative smoothing, a 5th- or 6th-degree polynomial and 63 smoothing points. The corresponding optimal PLS factor, root mean squared error of prediction (RMSEP), correlation coefficient of prediction (RP) and relative root mean squared error of prediction (RRMSEP) were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively. These results were clearly superior to those of the direct PLS model without SG smoothing and of the optimal SG smoothing model within 25 smoothing points (the original smoothing method). This demonstrates that extending the SG smoothing modes and performing a large-scale simultaneous optimization of the SG smoothing parameters and the PLS factor are both necessary, and that the approach can be applied effectively to model optimization in FTIR/ATR spectroscopic analysis.

Keywords: glucose solution; FTIR/ATR spectroscopy analysis; partial least squares; Savitzky-Golay smoothing

I. INTRODUCTION

Glucose is an important metabolite, and much significant biological information can be obtained by measuring the glucose concentration in a living system. Routine methods for measuring glucose concentration require chemical reagents, which may damage the living system, so they are not well suited to this purpose.
Fourier transform infrared (FTIR) spectroscopy and attenuated total reflection (ATR) technology are effective methods for determining the structure and content of components[1]. They allow nondestructive, reagent-free quantitative analysis online, in real time and in situ. The FTIR/ATR spectroscopy technique has been applied extensively to the analysis of agricultural products and food[2], fermentation process monitoring[3], enzyme activity analysis[4], cell metabolism measurement[5-6] and many other fields.
To establish a quantitative analysis method for glucose in living systems, this paper first develops a quantification method for glucose in aqueous solution by FTIR/ATR spectroscopy. The glucose concentrations of the aqueous solution samples were prepared according to the glucose concentration range in human or animal blood.
Partial least squares (PLS) is a classical and effective chemometric method that is widely used in spectroscopic analysis[7-11]. The number of PLS factors is an important parameter. If it is too small, the spectral information of the samples cannot be fully used and the model accuracy decreases. If it is too large, noise is introduced into the model and the prediction ability also decreases. It is therefore important to select a reasonable PLS factor.
Savitzky-Golay (SG) smoothing is a widely used spectral pretreatment method that can eliminate noise[12-15]. The SG smoothing parameters are the order of the derivative, the degree of the polynomial and the number of smoothing points. Setting the number of smoothing points is particularly important. If the number is too small, new errors are introduced into the model. If it is too large, spectral data that carry information are smoothed away and lost. Both situations decrease the model accuracy. The appropriate SG smoothing parameters differ between analytes and between measurement modes, so a large-scale simultaneous optimization of the SG smoothing parameters and the PLS factor according to the prediction effect is necessary. Because there are many smoothing modes, each with a different formula, the workload is very large, and this work has seldom been carried out in previous studies. In addition, some practical measurement systems may need more smoothing points. To widen the scope of application, it is necessary to expand the table of smoothing parameters following the original method[12].
In this paper, 82 glucose aqueous solution samples were designed to simulate the glucose concentration in human or animal blood. A rapid determination method and an analysis model for glucose solution were established using FTIR/ATR spectroscopy, PLS regression and SG smoothing. In particular, the simultaneous optimization of the SG smoothing parameters and the PLS factor was applied to the FTIR/ATR spectroscopic analysis of glucose solution.

This work was supported by the National Natural Science Foundation of China (10771087), the Natural Science Foundation of Guangdong Province (7005948), and the Science and Technology Project of Guangdong Province (2007A020905001, 2009B030801239). *Corresponding author: Tao Pan (tpan@jnu.edu.cn)

978-1-4244-5934-6/10/$26.00 ©2010 IEEE
II. EXPERIMENT AND METHODS

A. Experimental Materials, Instrument and Measurement Method
A total of 82 glucose aqueous solution samples were designed. The glucose concentration ranged from 0.416 to 39.036 mmol/L, with a mean of 18.449 mmol/L and a standard deviation of 10.810 mmol/L.
A VERTEX 70 FTIR spectrometer (BRUKER Company) equipped with a KBr beamsplitter and a deuterated triglycine sulfate KBr detector was used to collect the spectra. The MIR spectra were obtained from 4500 to 600 cm⁻¹ with a horizontal ATR sampling accessory with a diamond internal reflection element on a ZnSe crystal (SPECAC Company, 45° angle of incidence, 3 reflections). For each spectrum, 32 scans of symmetrical interferograms at 4 cm⁻¹ resolution were co-added. The spectra were measured at a temperature of 25 ± 1 °C and a relative humidity of 46%.
For each measurement, 0.075 mL of the glucose aqueous solution sample was used. Each sample was measured 3 times, and the average spectrum was calculated.
B. Dividing Method for Calibration Set and Prediction Set
The calibration set and the prediction set were divided according to the prediction effect of the optimal single-wavenumber model for all samples. The glucose chemical values and spectral data of the calibration samples were combined to establish the models. Each established model was then applied to the spectral data of the prediction samples to calculate the predicted glucose values. The prediction effect of the model was evaluated by comparing the calculated values with the actual values.
By Beer's law, the single-wavenumber linear model relating the absorbance of a glucose aqueous solution sample to its glucose chemical value is
$$A(\nu) = k(\nu)\,C + \varepsilon, \qquad (1)$$
where, for each wavenumber $\nu$, $A(\nu)$ is the absorbance of the sample and $k(\nu)$ is the absorption coefficient per unit concentration of the glucose aqueous solution; $C$ is the glucose chemical value and $\varepsilon$ is other unknown interference. $k(\nu)$ was calculated by regression using the absorbance and chemical values of all samples, and the predicted value $C'_i(\nu)$ of sample $i$ ($i = 1, 2, \ldots, N$, where $N$ is the number of samples) was then calculated from $k(\nu)$ and the sample absorbance. The root mean square error (RMSE) between the predicted values and the chemical values was also calculated. Let $C_i$ be the chemical value of sample $i$; then
$$\mathrm{RMSE}(\nu) = \sqrt{\frac{\sum_{i=1}^{N} \left(C'_i(\nu) - C_i\right)^2}{N-1}}. \qquad (2)$$
The optimal single-wavenumber model and the corresponding wavenumber $\nu_{\mathrm{Optimal}}$ were selected according to the minimum RMSE. Based on the optimal model, the bias between the predicted value and the chemical value of each sample was calculated and named the single wavenumber prediction bias (SWPB):
$$\mathrm{SWPB}(i) = \left|C'_i(\nu_{\mathrm{Optimal}}) - C_i\right|, \quad i = 1, 2, \ldots, N. \qquad (3)$$
The calibration set and the prediction set were divided according to SWPB. A computer procedure was used to give the two sets a similar SWPB distribution (similar mean and standard deviation, with a relative error of less than 1%). Because SWPB combines the chemical values and the spectral data, this makes the distributions of the calibration set and the prediction set consistent, so the division is representative for modeling. To ensure that the concentration range of the calibration set covers that of the prediction set, the samples with the maximum and minimum chemical values were assigned to the calibration set, and the samples with the second maximum and second minimum chemical values were assigned to the prediction set.
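The single-wavenumber screening and the SWPB-based division described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' procedure: the paper does not say how k(ν) is regressed or how the two sets are balanced, so the sketch assumes a least-squares slope through the origin, the inversion C'_i(ν) = A_i(ν)/k(ν), and a simple random search for the split. The function names `single_wavenumber_swpb` and `split_by_swpb` and the synthetic data are hypothetical.

```python
import numpy as np

def single_wavenumber_swpb(A, C):
    """A: (N, n_wavenumbers) absorbances; C: (N,) chemical values.

    Assumption (not stated in the paper): k(v) is the least-squares slope
    of A(v) on C through the origin, and C'_i(v) = A_i(v) / k(v).
    """
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    k = A.T @ C / (C @ C)                                   # Eq. (1), one slope per wavenumber
    C_pred = A / k                                          # C'_i(v), shape (N, n_wavenumbers)
    rmse = np.sqrt(((C_pred - C[:, None]) ** 2).sum(axis=0) / (C.size - 1))  # Eq. (2)
    j_opt = int(np.argmin(rmse))                            # index of the optimal wavenumber
    swpb = np.abs(C_pred[:, j_opt] - C)                     # Eq. (3)
    return j_opt, rmse, swpb

def split_by_swpb(C, swpb, n_pred, n_trials=2000, seed=0):
    """Hypothetical random search for a split with similar SWPB mean and SD."""
    C = np.asarray(C, dtype=float)
    rng = np.random.default_rng(seed)
    order = np.argsort(C)
    forced_cal = [int(order[0]), int(order[-1])]            # min and max chemical value
    forced_pred = [int(order[1]), int(order[-2])]           # second min and second max
    free = np.setdiff1d(np.arange(C.size), forced_cal + forced_pred)
    best = (np.inf, None, None)
    for _ in range(n_trials):
        pick = rng.choice(free, size=n_pred - 2, replace=False)
        pred = np.sort(np.concatenate([forced_pred, pick]))
        cal = np.setdiff1d(np.arange(C.size), pred)
        rel_mean = abs(swpb[cal].mean() - swpb[pred].mean()) / swpb[pred].mean()
        rel_sd = abs(swpb[cal].std(ddof=1) - swpb[pred].std(ddof=1)) / swpb[pred].std(ddof=1)
        err = max(rel_mean, rel_sd)                         # worst relative mismatch
        if err < best[0]:
            best = (err, cal, pred)
    return best

# Synthetic demonstration: only the middle channel follows Beer's law closely.
rng = np.random.default_rng(0)
C = np.linspace(1.0, 40.0, 20)
A = np.column_stack([
    0.010 * C + rng.normal(0.0, 0.05, C.size),              # noisy channel
    0.020 * C + rng.normal(0.0, 0.001, C.size),             # clean channel
    rng.normal(0.5, 0.05, C.size),                          # channel unrelated to glucose
])
j_opt, rmse, swpb = single_wavenumber_swpb(A, C)
err, cal, pred = split_by_swpb(C, swpb, n_pred=7)
print(j_opt, len(cal), len(pred))
```

The random search simply keeps the candidate split whose SWPB mean and standard deviation differ least between the two sets; any other balancing procedure that meets the 1% criterion would serve the same purpose.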

C. SG Smoothing Method
The SG smoothing parameters are the order of the derivative (smoothing of the original spectrum is recorded as zero-order derivative smoothing), the degree of the polynomial and the number of smoothing points. Some practical measurement systems (for example, those with a small spectral wavenumber gap) may require more smoothing points. In this paper, the number of smoothing points was therefore expanded from 5, 7, ..., 25 (odd)[12] to 5, 7, ..., 81 (odd), and the degree of the polynomial was expanded to n = 2, 3, 4, 5, 6 (originally n = 2, 3, 4, 5). Following the original method[12], 14 tables of smoothing coefficients, which cover the original smoothing coefficients, and 483 smoothing modes (originally 117 modes) were calculated by a computer program. This gives an SG smoothing preprocessing group with a wider scope of application.
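A minimal sketch of how SG coefficients for an arbitrary window can be generated, assuming the standard least-squares polynomial formulation of the original method[12]. It is not the authors' program; the function name `sg_coefficients` is hypothetical, and unit spacing between spectral points is assumed.

```python
import numpy as np
from math import factorial

def sg_coefficients(n_points, degree, deriv=0):
    """Savitzky-Golay weights for the centre point of an odd window.

    A polynomial of the given degree is fitted by least squares to n_points
    equally spaced samples; the returned weights, dotted with the window,
    give the deriv-th derivative of the fit at the centre (unit spacing).
    """
    if n_points % 2 == 0 or n_points <= degree or deriv > degree:
        raise ValueError("need an odd window with n_points > degree >= deriv")
    m = n_points // 2
    x = np.arange(-m, m + 1, dtype=float) / m        # scaled to [-1, 1] for conditioning
    V = np.vander(x, degree + 1, increasing=True)    # columns 1, x, x^2, ...
    # Row `deriv` of the pseudo-inverse yields the fitted coefficient of x^deriv.
    w = np.linalg.pinv(V)[deriv] * factorial(deriv)
    return w / m ** deriv                            # undo the scaling of x

print(sg_coefficients(5, 2, 0) * 35)   # classic 5-point quadratic smoothing: -3 12 17 12 -3
w5 = sg_coefficients(63, 5, 1)
w6 = sg_coefficients(63, 6, 1)
print(np.allclose(w5, w6))
```

For a symmetric window, a first-derivative filter depends only on the odd powers of the fitted polynomial, so degrees 5 and 6 yield the same weights. This is why a mode can be reported as "5 or 6 degree polynomial".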
D. Model Evaluation Indicators
The main model evaluation indicators are the root mean squared error of prediction (RMSEP), the correlation coefficient of prediction (RP) and the relative root mean squared error of prediction (RRMSEP), defined as follows:
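$$\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{M} \left(C'_{ip} - C_{ip}\right)^2}{M-1}}, \qquad (4)$$
$$R_P = \frac{\sum_{i=1}^{M} \left(C_{ip} - C_{mp}\right)\left(C'_{ip} - C'_{mp}\right)}{\sqrt{\sum_{i=1}^{M} \left(C_{ip} - C_{mp}\right)^2 \sum_{i=1}^{M} \left(C'_{ip} - C'_{mp}\right)^2}}, \qquad (5)$$
$$\mathrm{RRMSEP} = \frac{\mathrm{RMSEP}}{C_{mp}} \times 100\ (\%), \qquad (6)$$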

where $C'_{ip}$ and $C_{ip}$ are the predicted value and the chemical value of sample $i$ in the prediction set, $C'_{mp}$ and $C_{mp}$ are the mean predicted value and the mean chemical value of all samples in the prediction set, and $M$ is the number of samples in the prediction set. RMSEP was used as the objective for model optimization and parameter design.
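The three indicators can be computed directly from the predicted and chemical values of the prediction set. The following sketch follows Eqs. (4)-(6) term by term; the function name `prediction_metrics` and the four-sample data are hypothetical and serve only as an illustration.

```python
import numpy as np

def prediction_metrics(c_pred, c_ref):
    """Return (RMSEP, RP, RRMSEP in %) for predicted and chemical values."""
    c_pred = np.asarray(c_pred, dtype=float)
    c_ref = np.asarray(c_ref, dtype=float)
    M = c_ref.size
    rmsep = np.sqrt(np.sum((c_pred - c_ref) ** 2) / (M - 1))            # Eq. (4)
    dp = c_pred - c_pred.mean()                                          # C'_ip - C'_mp
    dr = c_ref - c_ref.mean()                                            # C_ip - C_mp
    rp = np.sum(dr * dp) / np.sqrt(np.sum(dr ** 2) * np.sum(dp ** 2))    # Eq. (5)
    rrmsep = rmsep / c_ref.mean() * 100.0                                # Eq. (6)
    return rmsep, rp, rrmsep

rmsep, rp, rrmsep = prediction_metrics([1.1, 1.9, 3.2, 3.8], [1.0, 2.0, 3.0, 4.0])
print(f"RMSEP={rmsep:.4f}  RP={rp:.4f}  RRMSEP={rrmsep:.2f}%")
```

As a consistency check on the reported results, Eq. (6) with RMSEP = 0.3729 mmol/L and the prediction-set mean chemical value of 15.021 mmol/L gives 0.3729 / 15.021 × 100 ≈ 2.48%, which agrees with the reported RRMSEP.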
III. RESULTS AND DISCUSSION

The FTIR/ATR spectra of the 82 samples are shown in Figure 1. All single-wavenumber models were established by the method of Section II-B, and the optimal wavenumber $\nu_{\mathrm{Optimal}}$ according to the minimum RMSE was 1034 cm⁻¹. Based on the 1034 cm⁻¹ model, the SWPB of each sample was calculated. The distributions of SWPB and of the chemical values for all samples are shown in Figure 2. All 82 samples were used for modeling and were split by the method of Section II-B into a calibration set of 55 samples and a prediction set of 27 samples. TABLE I shows the means and standard deviations of SWPB and of the chemical values. TABLE I and Figure 2 indicate that the SWPB distributions of the calibration set and the prediction set are very consistent.

Figure 1 The FTIR/ATR spectra of 82 glucose aqueous solution samples.

Figure 2 The distributions of SWPBs and chemical values.

For comparison, direct PLS models without SG smoothing were established for both the whole spectral region (4500-600 cm⁻¹) and the fingerprint region (1600-900 cm⁻¹). The optimal RMSEPs for the whole spectral region and the fingerprint region were 1.109 and 0.6604 mmol/L, respectively, so the prediction accuracy of the latter was clearly better. The fingerprint region (1600-900 cm⁻¹) was therefore selected as the spectral band for PLS modeling; the corresponding optimal PLS factor was 6, the optimal RMSEP was 0.6604 mmol/L, RP was 0.9981 and RRMSEP was 4.40%.

TABLE I. THE MEANS AND THE STANDARD DEVIATIONS OF SWPBS AND CHEMICAL VALUES IN CALIBRATION SET AND PREDICTION SET

                   Chemical value (mmol/L)        SWPB (mmol/L)
                   Mean     Standard deviation    Mean    Standard deviation
Calibration set    20.131   10.851                8.615   5.829
Prediction set     15.021   10.064                8.664   5.810

PLS models with the various SG smoothing modes were then built. Using a computer algorithm platform developed by the authors, PLS models were established for all combinations of the 483 smoothing modes and PLS factors from 1 to 40. The SG smoothing parameters and the PLS factor were optimized simultaneously according to the prediction effect. The RMSEP values of the optimal models for the different derivative orders and numbers of smoothing points are shown in Figure 3. The degree of the polynomial, number of smoothing points, PLS factor and RMSEP of the optimal model for each derivative order are shown in TABLE II, which also lists the result of the direct PLS model without SG smoothing.

Figure 3 The optimal RMSEP corresponding to the number of smoothing points for each order derivative mode. (0) Original spectra smoothing; (1) 1st order derivative; (2) 2nd order derivative; (3) 3rd order derivative; (4) 4th order derivative; (5) 5th order derivative.

The global optimal SG smoothing parameters were first-order derivative smoothing, a 5th- or 6th-degree polynomial and 63 smoothing points. The corresponding optimal PLS factor, RMSEP, RP and RRMSEP were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively. This prediction effect was clearly better than that obtained without SG smoothing. TABLE II and Figure 3 show that the optimal number of smoothing points and the optimal PLS factor differ between derivative orders. If the fixed smoothing parameters of previous studies are used without a large-scale selection, it is difficult to find the optimal SG smoothing parameters and PLS factor. TABLE II and Figure 3 also show that the optimal number of smoothing points was never less than 25; with any number of smoothing points within 25, the optimal prediction effect could not be obtained (within 25 smoothing points, the best RMSEP was 0.6604 mmol/L and RP was 0.9981). Figure 4 compares the predicted value and the chemical value of each sample for the optimal SG smoothing model. The correlation between the predicted and chemical values of all samples was very good. These results indicate that expanding the SG smoothing modes was necessary.
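The simultaneous optimization described above amounts to an exhaustive search over SG modes and PLS factors with RMSEP as the objective. The sketch below illustrates the idea under stated assumptions: the authors' platform is not described, so a textbook PLS1 (NIPALS) regression and a least-squares SG filter are used instead, edge points without a full window are discarded, and all function names, the three example modes and the synthetic spectra are hypothetical.

```python
import numpy as np
from math import factorial

def sg_filter(X, n_points, degree, deriv):
    """Apply SG smoothing/differentiation along each row (full windows only)."""
    m = n_points // 2
    x = np.arange(-m, m + 1, dtype=float) / m
    w = np.linalg.pinv(np.vander(x, degree + 1, increasing=True))[deriv]
    w = w * factorial(deriv) / m ** deriv
    # np.convolve flips its kernel, so pass the reversed weights.
    return np.array([np.convolve(row, w[::-1], mode="valid") for row in X])

def pls1_fit(X, y, n_factors):
    """Textbook PLS1 (NIPALS); returns (x_mean, y_mean, regression vector)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f
        w /= np.linalg.norm(w)          # weight vector
        t = E @ w                       # scores
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        q.append(f @ t / tt)            # y loading
        E = E - np.outer(t, p)          # deflate X
        f = f - q[-1] * t               # deflate y
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)

def rmsep(c_pred, c_ref):
    return np.sqrt(np.sum((c_pred - c_ref) ** 2) / (c_ref.size - 1))

def grid_search(X_cal, y_cal, X_pred, y_pred, modes, max_factors):
    """Return (best RMSEP, best SG mode, best PLS factor)."""
    best = (np.inf, None, None)
    for n_points, degree, deriv in modes:
        Xc = sg_filter(X_cal, n_points, degree, deriv)
        Xp = sg_filter(X_pred, n_points, degree, deriv)
        for a in range(1, max_factors + 1):
            x_mean, y_mean, b = pls1_fit(Xc, y_cal, a)
            err = rmsep((Xp - x_mean) @ b + y_mean, y_pred)
            if err < best[0]:
                best = (err, (n_points, degree, deriv), a)
    return best

# Synthetic spectra: one Gaussian band proportional to concentration, plus noise.
rng = np.random.default_rng(1)
v = np.linspace(0.0, 1.0, 120)
band = np.exp(-((v - 0.5) / 0.08) ** 2)
c = rng.uniform(1.0, 40.0, 30)
X = 0.01 * np.outer(c, band) + rng.normal(0.0, 0.01, (30, 120))
is_pred = np.arange(30) % 3 == 2
modes = [(5, 2, 0), (11, 2, 0), (21, 3, 1)]     # (points, degree, derivative order)
best = grid_search(X[~is_pred], c[~is_pred], X[is_pred], c[is_pred], modes, max_factors=5)
print(best)
```

In the paper the same loop would run over 483 modes and 40 factors; the nesting order does not matter because every combination is evaluated.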
TABLE II. PREDICTION EFFECT OF THE OPTIMAL MODEL CORRESPONDING TO EACH ORDER DERIVATION

                             Polynomial   Number of          PLS      RMSEP
                             degree       smoothing points   factor   (mmol/L)   RRMSEP   RP
No SG smoothing              -            -                  6        0.6604     4.40%    0.9981
Original spectra smoothing   4, 5         41                          0.6418     4.27%    0.9983
1st order derivative         5, 6         63                 3        0.3729     2.48%    0.9995
2nd order derivative         4, 5         67                          0.4762     3.17%    0.9989
3rd order derivative         3, 4         57                          0.5279     3.51%    0.9988
4th order derivative         4, 5         65                          0.5483     3.65%    0.9987
5th order derivative         5, 6         75                          0.5459     3.63%    0.9985

With the global optimal SG smoothing parameters (first-order derivative smoothing, a 5th- or 6th-degree polynomial and 63 smoothing points), the corresponding smoothing formula is as follows:
$$\tilde{a}_0 = \sum_{i=-31}^{31} \omega_i a_i, \qquad (7)$$
where $a_i$ are the original spectral data, $\tilde{a}_0$ is the spectral datum after SG smoothing, and $\omega_i$ are the smoothing coefficients, as follows: -4.886, -1.796, 0.459, 1.997, 2.928, 3.353, 3.364, 3.045, 2.472, 1.713, 0.830, -0.125, -1.104, -2.067, -2.978, -3.808, -4.536, -5.141, -5.611, -5.937, -6.113, -6.138, -6.016, -5.752, -5.354, -4.833, -4.204, -3.480, -2.679, -1.820, -0.920, 0, 0.920, 1.820, 2.6789, 3.480, 4.204, 4.834, 5.354, 5.752, 6.016, 6.139, 6.113, 5.937, 5.611, 5.141, 4.536, 3.808, 2.978, 2.067, 1.104, 0.125, -0.830, -1.713, -2.472, -3.045, -3.364, -3.353, -2.928, -1.997, -0.459, 1.796, 4.887 (×10⁻³).
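Eq. (7) is a sliding weighted sum over a 63-point window. A minimal sketch of applying the listed coefficients is given below; the coefficient values are copied from the text, while the function name `first_derivative_sg63`, the ×10⁻³ scaling and the choice to drop the 31 edge points on each side are assumptions of this illustration.

```python
import numpy as np

# Coefficients of Eq. (7) as listed in the text for i = -31, ..., 31 (scaled by 1e-3).
OMEGA = 1e-3 * np.array([
    -4.886, -1.796, 0.459, 1.997, 2.928, 3.353, 3.364, 3.045, 2.472, 1.713,
    0.830, -0.125, -1.104, -2.067, -2.978, -3.808, -4.536, -5.141, -5.611,
    -5.937, -6.113, -6.138, -6.016, -5.752, -5.354, -4.833, -4.204, -3.480,
    -2.679, -1.820, -0.920, 0.0, 0.920, 1.820, 2.6789, 3.480, 4.204, 4.834,
    5.354, 5.752, 6.016, 6.139, 6.113, 5.937, 5.611, 5.141, 4.536, 3.808,
    2.978, 2.067, 1.104, 0.125, -0.830, -1.713, -2.472, -3.045, -3.364,
    -3.353, -2.928, -1.997, -0.459, 1.796, 4.887,
])

def first_derivative_sg63(a):
    """Evaluate Eq. (7) at every point that has a full 63-point window."""
    a = np.asarray(a, dtype=float)
    # np.convolve flips its kernel, so the reversed weights give a plain dot product.
    return np.convolve(a, OMEGA[::-1], mode="valid")

ramp = 0.5 * np.arange(200.0)            # straight line with slope 0.5 per point
out = first_derivative_sg63(ramp)
print(out.shape, out.min(), out.max())   # all values close to 0.5
```

Applied to a straight line, the listed coefficients return approximately its slope, which is the expected behaviour of a first-derivative filter with unit point spacing.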
IV. CONCLUSION

The simultaneous optimization of the SG smoothing parameters and the PLS factor was applied to the FTIR/ATR analysis of glucose solution. The calibration set and the prediction set were divided on the basis of the optimal single-wavenumber model. For the direct PLS model without SG smoothing, the optimal PLS factor, RMSEP, RP and RRMSEP were 6, 0.6604 mmol/L, 0.9981 and 4.40%, respectively. By extending the number of smoothing points and the degree of the polynomial, 483 smoothing modes were calculated, and PLS models were established for all combinations of the 483 SG smoothing modes and 1-40 PLS factors. The optimal smoothing parameters were first-order derivative smoothing, a 5th- or 6th-degree polynomial and 63 smoothing points; the corresponding optimal PLS factor, RMSEP, RP and RRMSEP were 3, 0.3729 mmol/L, 0.9995 and 2.48%, respectively. These results were clearly superior to those of the direct PLS model without SG smoothing and of the optimal SG smoothing model within 25 smoothing points (the original smoothing method). This demonstrates that extending the SG smoothing modes and performing a large-scale simultaneous optimization of the SG smoothing parameters and the PLS factor are both necessary, and that the approach can be applied effectively to model optimization in FTIR/ATR analysis.

Figure 4 Comparison of predictive value and chemical value of each sample for the optimal SG smoothing model

REFERENCES
[1] F. M. Mirabella and N. J. Harrick, Internal reflection spectroscopy: review and supplement, New York: Harrick Scientific Corporation, 1985.
[2] R. H. Wilson, "Fourier Transform mid-infrared spectroscopy for food analysis," Trends in Analytical Chemistry, vol. 9, pp. 127-131, 1990.
[3] P. H. Axelsen, W. D. Braddock, H. L. Brockman, et al., "Use of internal reflectance infrared spectroscopy for in situ study of supported lipid monolayers," Applied Spectroscopy, vol. 49, no. 5, pp. 526-531, 1995.
[4] T. Fujii, Y. Miyahara, "Infrared ATR spectroscopy of substrates in aqueous solution using cryoenrichment and its application in enzyme-activity assays," Applied Spectroscopy, vol. 52, no. 2, pp. 128-133, 1998.


[5] K. Nakanishi, A. Hashimoto, T. Pan, et al., "Mid-infrared spectroscopic measurement of ionic dissociative materials in metabolic pathway," Applied Spectroscopy, vol. 57, no. 12, pp. 1510-1516, 2003.
[6] T. Pan, A. Hashimoto, M. Kanou, et al., "Development of a quantification system of ionic dissociative metabolites using an FTIR/ATR method," Bioprocess and Biosystems Engineering, vol. 26, no. 2, pp. 133-139, 2003.
[7] D. A. Burns, E. W. Ciurczak, Handbook of near-infrared analysis, 2nd ed., New York: Marcel Dekker Inc., 2001, pp. 633-647.
[8] X. L. Chu, H. F. Yuan, W. Z. Lu, "Progress and Application of Spectral Data Pretreatment and Wavelength Selection Methods in NIR Analytical Technique," Progress in Chemistry, vol. 16, no. 4, pp. 528-542, 2004.
[9] X. L. Chu, Y. Xu, W. Z. Lu, "Research and Application Progress of Chemometrics Methods in Near Infrared Spectroscopic Analysis," Chinese Journal of Analytical Chemistry, vol. 36, no. 5, pp. 702-709, 2008.
[10] P. Cao, T. Pan, X. D. Chen, "Choice of wave band in design of minitype near-infrared corn protein content analyzer," Optics and Precision Engineering, vol. 15, no. 12, pp. 1952-1958, 2007.
[11] J. H. Jiang, R. J. Berry, H. W. Siesler, Y. Ozaki, "Wavelength interval selection in multicomponent spectral analysis by moving window partial least-squares regression with applications to mid-infrared and near-infrared spectroscopic data," Analytical Chemistry, vol. 74, pp. 3555-3565, 2002.
[12] A. Savitzky, M. J. E. Golay, "Smoothing and differentiation of data by simplified least squares procedures," Analytical Chemistry, vol. 36, no. 8, pp. 1627-1637, 1964.
[13] J. M. Chen, T. Pan, X. D. Chen, "Application of second derivative spectrum prepares in quantification measuring glucose-6-phosphate and fructose-6-phosphate using a FTIR/ATR method," Optics and Precision Engineering, vol. 14, no. 1, pp. 1-7, 2006.
[14] H. Yin, T. Pan, D. F. Liu, et al., "A simultaneous quantification method of thalassemia screening multiple indicators using FTIR/ATR spectroscopy," Photonics and Optoelectronics, 2009, SOPO 2009, Symposium on 14-16 Aug. 2009, pp. 1-4.
[15] H. Yin, T. Pan, P. L. Tian, et al., "Rapid Quantitative Analysis for the Human Blood Hemoglobin Applied through FTIR/ATR Spectrum," Chinese Journal of Spectroscopy Laboratory, vol. 26, no. 2, pp. 432-436, 2009.
