Ch 4. Statistics
Quantitative analysis requires
: sound knowledge of chemistry
: awareness of possible interferences
WHY do we need to use STATISTICS in Anal. Chem. ?
Uncertainty always exists. → Must we always accept it ?
If not, on what basis do we reject data ?
→ by statistical treatment
Random events follow a Gaussian distribution.
Anal. Chem. by Prof. Myeong Hee Moon
4-1 Gaussian Distribution
test of the lifetimes of 4768 light bulbs
1) mean value & standard deviation

* mean : $\bar{x}$ (also called the average)
$\bar{x} = \frac{\sum_i x_i}{n}$

4-1 Gaussian Distribution (Cont.)
* standard dev. : s : measures how closely the data are clustered
around the mean
$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$
n-1 : degrees of freedom
for an infinite set of data:
$\bar{x}$ (mean) → $\mu$ (mu, population mean)
s → $\sigma$ (sigma, population standard deviation)
$s^2$ or $\sigma^2$ : variance
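2) std.dev. & probability
Gaussian curve: $y = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$
$\sigma$ tells the broadness of the Gaussian curve
in a Gaussian curve, area within
$\mu \pm 1\sigma$ = 68.3 %
$\mu \pm 2\sigma$ = 95.5 %
$\mu \pm 3\sigma$ = 99.7 %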
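The areas quoted above can be checked numerically. The following is a minimal sketch, assuming the standard relation that the area within $\mu \pm k\sigma$ equals erf($k/\sqrt{2}$); the function names `gaussian` and `area_within` are illustrative and not part of the notes.

```python
import math

def gaussian(x, mu=0.0, sigma=1.0):
    """Height y of the Gaussian curve at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def area_within(k):
    """Fraction of the total area lying within mu +/- k*sigma."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * area_within(k):.2f} %")
```

The printed percentages agree with the slide values once rounded to the precision used there.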
3) std.dev. of mean
more measurements → more confidence in the average (nearer the true value)
uncertainty decreases by $1/\sqrt{n}$ : n = number of meas.
standard deviation of mean = $s/\sqrt{n}$ : s = std.dev.
* relative standard deviation (RSD) = $s/\bar{x}$
or as a percentage = $(s/\bar{x}) \times 100$ = C.V. (coefficient of variation)
precision of mean = $\sigma_x/\sqrt{n}$
average deviation of mean = $\bar{d}/\sqrt{n}$ ( $\bar{d} = \frac{\sum_i |x_i - \bar{x}|}{n}$ )
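As a quick illustration of these quantities, here is a small sketch using only the Python standard library. The data are the five red cell counts quoted later in these notes; the variable names are illustrative.

```python
import math
import statistics

# Red cell counts (x 10^6 cells per microliter) quoted later in these notes
counts = [5.1, 5.3, 4.8, 5.4, 5.2]

n = len(counts)
mean = statistics.mean(counts)     # x-bar = sum(x_i) / n
s = statistics.stdev(counts)       # sample std. dev. (n - 1 in the denominator)
s_mean = s / math.sqrt(n)          # standard deviation of the mean
rsd = s / mean                     # relative standard deviation
cv_percent = 100 * rsd             # coefficient of variation, in percent

print(f"mean = {mean:.2f}, s = {s:.2f}")
print(f"s/sqrt(n) = {s_mean:.3f}, RSD = {rsd:.4f}, C.V. = {cv_percent:.1f} %")
```

The mean and s reproduce the 5.16 and 0.23 used in the red cell example.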

4-2 Confidence Intervals

1) confidence interval : an expression stating that the true mean, $\mu$,
is likely to lie within a certain distance from the measured mean, $\bar{x}$
our measurements → $\bar{x}$, s (instead of $\mu$, $\sigma$)
Confidence interval:
$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$
(t : Student's t for n - 1 degrees of freedom at the chosen confidence level)
Ex. The content of carbohydrate in a glycoprotein (a protein with
sugars attached to it) is determined to be 12.6, 11.9, 13.0, 12.7, and
12.5 g per 100 g of protein in replicate analyses. Find the 50% and
90% confidence intervals for the carbohydrate content.
mean $\bar{x}$ = 12.5, std.dev. s = 0.4 (n = 5, 4 degrees of freedom)
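The example can be worked through numerically. This is a sketch only: the t values (0.741 at 50% and 2.132 at 90%, both for 4 degrees of freedom) are taken from a standard Student's t table rather than from these notes, and the helper name `confidence_interval` is illustrative.

```python
import math
import statistics

carbohydrate = [12.6, 11.9, 13.0, 12.7, 12.5]   # g per 100 g of protein
# Student's t for 4 degrees of freedom (standard table values, assumed here)
T_TABLE_4DF = {50: 0.741, 90: 2.132}

def confidence_interval(data, t):
    """Return (mean, half-width) of the interval mean +/- t*s/sqrt(n)."""
    mean = statistics.mean(data)
    half_width = t * statistics.stdev(data) / math.sqrt(len(data))
    return mean, half_width

for level, t in T_TABLE_4DF.items():
    mean, hw = confidence_interval(carbohydrate, t)
    print(f"{level}% confidence interval: {mean:.2f} +/- {hw:.2f}")
```

Rounded to the precision of the data, this gives roughly 12.5 ± 0.1 at 50% and 12.5 ± 0.4 at 90%: the interval widens as the required confidence increases.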

4-3 Comparison of means with Student's t

(from different measurements)
: a tool for expressing confidence intervals and for comparing results
from different experimental techniques
Normally, 95% confidence level
: Two results are considered different only IF there is less than a 5%
chance that the observed difference arises from random variation
(i.e. a 95% chance that our conclusion is correct).
Case 1. t test : comparing a measured result with a known value
: when we test a new analytical method,
we want to see if it agrees with a known value.
ex) Ni content; known value : 0.0319% (from a standard reference material)
measured values : 0.0329, 0.0322, 0.0330, 0.0323 %
The 95% confidence interval ?
$\bar{x} \pm \frac{t s}{\sqrt{n}} = 0.0326 \pm \frac{(3.182)(0.0004)}{\sqrt{4}} = 0.0326 \pm 0.0006$
This interval does not cover 0.0319,
thus the measured value differs from the known value.
The difference is not within the random error boundary.
(It implies that a systematic error exists.)
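The Ni example can be reproduced with a few lines of Python. This is a sketch under the assumptions stated in the slide (t = 3.182 for 95% confidence and 3 degrees of freedom); the equivalent comparison of a calculated t with the table value is added for illustration and uses the same quantities.

```python
import math
import statistics

known = 0.0319                                 # certified Ni content, %
measured = [0.0329, 0.0322, 0.0330, 0.0323]    # replicate results, %
T_95_3DF = 3.182                               # Student's t, 95 %, 3 degrees of freedom

n = len(measured)
mean = statistics.mean(measured)
s = statistics.stdev(measured)

# 95 % confidence interval around the measured mean
half_width = T_95_3DF * s / math.sqrt(n)
lower, upper = mean - half_width, mean + half_width
agrees = lower <= known <= upper

# Equivalent test: compare the calculated t with the table value
t_cal = abs(mean - known) / s * math.sqrt(n)

print(f"mean = {mean:.4f}, 95% CI = {lower:.4f} to {upper:.4f}")
print(f"known value inside interval: {agrees}; t_cal = {t_cal:.2f}")
```

Both views give the same verdict: the known value falls just outside the interval, and t_cal exceeds the tabulated t.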

1. <t-test> You are developing a procedure for determining traces of copper in
biological materials using a wet digestion followed by measurement by atomic
absorption spectrophotometry. In order to test the validity of the method, you
obtain a NIST orchard leaves standard reference material and analyze this
material. Five replicates are sampled and analyzed, and the mean of the results is
found to be 10.08 ppm with a standard deviation of 0.7 ppm. The listed value is
11.7 ppm. Does your method give a statistically correct value at the 95%
confidence level ?
Case 2. t test: comparing replicate measurements

(test of two sets of measurements)
: test whether the two techniques are statistically the SAME or NOT
for two sets of data with $n_1$ and $n_2$ measurements:
$t_{cal} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
$s_{pooled} = \sqrt{\frac{\sum_i (x_i - \bar{x}_1)^2 + \sum_j (x_j - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$
If $t_{cal} > t_{table}$ (at 95% confidence),
this difference is significant
(outside the random error range)
→ a systematic error exists

Ex) The average mass of nitrogen from air in Table 4-3 is $\bar{x}_1$ = 2.31011 g, with
a standard deviation of $s_1$ = 0.00014 (for $n_1$ = 7 measurements). The average
mass from chemical sources is $\bar{x}_2$ = 2.29947 g, with a standard deviation of
$s_2$ = 0.00138 (for $n_2$ = 8 measurements). Are the two masses significantly different ?
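A minimal sketch of the Case 2 calculation applied to the nitrogen data follows. The function name `pooled_t` is illustrative, and the table value t = 2.160 (95%, 13 degrees of freedom) is taken from a standard Student's t table rather than from these notes.

```python
import math

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t_cal, s_pooled, degrees of freedom) for two sets of replicates."""
    dof = n1 + n2 - 2
    s_pooled = math.sqrt((s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / dof)
    t_cal = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t_cal, s_pooled, dof

# Nitrogen from air (set 1) vs. nitrogen from chemical sources (set 2)
t_cal, s_pooled, dof = pooled_t(2.31011, 0.00014, 7, 2.29947, 0.00138, 8)

T_95_13DF = 2.160          # Student's t, 95 %, 13 degrees of freedom (standard table)
significant = t_cal > T_95_13DF

print(f"s_pooled = {s_pooled:.5f}, t_cal = {t_cal:.1f}, dof = {dof}")
print(f"difference significant at 95%: {significant}")
```

Here t_cal comes out near 20, far above the table value, so the two masses differ by much more than random error can explain.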
2. <t-test> A new gravimetric method is developed for iron (II) in which the iron
is precipitated in crystalline form with an organocarbon "cage" compound. The
accuracy of the method is checked by analyzing the iron in an ore sample and
comparing with the results using the standard precipitation with ammonia and
weighing of Fe2O3. The results, reported as % Fe for each analysis, were as
follows.
Test method        Reference method
20.10%             18.89%
20.50              19.20
18.65              19.00
19.25              19.70
19.40              19.40
19.99              19.40
$\bar{x}_1$ = 19.65%     $\bar{x}_2$ = 19.24%
Is there a difference between the two methods ?

Case 3. Comparing individual differences (paired t test)
Two different methods applied to several different samples (no duplication)

Cholesterol content (g/L)
Plasma sample    Method A    Method B    Difference ($d_i$)
1                1.46        1.42         0.04
2                2.22        2.38        -0.16
3                2.84        2.67         0.17
4                1.97        1.80         0.17
5                1.13        1.09         0.04
6                2.35        2.25         0.10
                                         $\bar{d}$ = +0.06
$t_{cal} = \frac{|\bar{d}|}{s_d} \sqrt{n}$
$s_d = \sqrt{\frac{\sum_i (d_i - \bar{d})^2}{n - 1}}$
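The table above can be evaluated with a short sketch. The table value t = 2.571 (95%, 5 degrees of freedom) is taken from a standard Student's t table and is not stated in these notes.

```python
import math
import statistics

method_a = [1.46, 2.22, 2.84, 1.97, 1.13, 2.35]   # cholesterol, g/L
method_b = [1.42, 2.38, 2.67, 1.80, 1.09, 2.25]   # cholesterol, g/L
T_95_5DF = 2.571        # Student's t, 95 %, n - 1 = 5 degrees of freedom (standard table)

d = [a - b for a, b in zip(method_a, method_b)]   # individual differences d_i
d_mean = statistics.mean(d)                       # mean difference
s_d = statistics.stdev(d)                         # std. dev. of the differences
t_cal = abs(d_mean) / s_d * math.sqrt(len(d))
significant = t_cal > T_95_5DF

print(f"d_mean = {d_mean:+.2f}, s_d = {s_d:.3f}, t_cal = {t_cal:.2f}")
print(f"methods differ at 95%: {significant}")
```

With these data t_cal is about 1.2, below the table value, so the two methods are not significantly different at the 95% confidence level.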
Is my red blood cell count high today ?
Red cell counts on five "normal" days
: 5.1, 5.3, 4.8, 5.4, and 5.2 × 10^6 cells/µL → $\bar{x}$ = 5.16, s = 0.23
Today's value = 5.6 × 10^6 cells/µL
$t_{cal} = \frac{|\text{today's count} - \bar{x}|}{s} \sqrt{n} = \frac{|5.6 - 5.16|}{0.23} \sqrt{5} = 4.28$
What is the probability of finding t = 4.28 for 4 degrees of freedom ?
See Table 4.2: at 4 degrees of freedom, 4.28 lies between the 98% and 99%
confidence levels.
→ There is less than a 2% probability of observing a count of
5.6 × 10^6 cells/µL on normal days.
→ It is reasonable to conclude that today's count is elevated.

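4-4 Comparison of std.dev. with the F test
F test : checks whether two std.devs are significantly different from each other.
$F_{calc} = \frac{s_1^2}{s_2^2}$ (larger std.dev. in the numerator, so that $F_{calc} \ge 1$)
If $F_{calc} > F_{table}$, the difference is significant.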
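As an illustration, the F test can be applied to the two nitrogen standard deviations quoted in the Case 2 example (0.00014 for 7 measurements, 0.00138 for 8 measurements). This is a sketch: the function name `f_calc` is illustrative, and the critical value 4.21 (7 degrees of freedom in the numerator, 6 in the denominator) is taken from a standard 5% F table, not from these notes.

```python
def f_calc(s_a, s_b):
    """Ratio of variances with the larger standard deviation on top (F >= 1)."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return (larger / smaller) ** 2

# Nitrogen data: s1 = 0.00014 (n1 = 7), s2 = 0.00138 (n2 = 8)
F = f_calc(0.00014, 0.00138)

# Larger s belongs to the n = 8 set: 7 degrees of freedom on top, 6 below
F_TABLE = 4.21             # standard F table value (assumed, not from the notes)
significant = F > F_TABLE

print(f"F_calc = {F:.1f}, significant: {significant}")
```

F_calc is close to 100 here, so the two standard deviations differ significantly.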
4-6. Grubbs test for an outlier
During measurements of the mass loss of zinc,
we may need to decide whether to discard a questionable datum:
10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2,
11.3, 9.5, 10.6, 11.6
If $G_{calc} > G_{table}$, the questionable value is rejected.
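The slide does not show how G is calculated; the usual definition is $G_{calc} = |\text{questionable value} - \bar{x}| / s$, with $\bar{x}$ and s computed from the full data set including the questionable point. The sketch below applies that definition to the zinc data. The critical value 2.285 (95% confidence, 12 observations) comes from a standard Grubbs table and is an assumption here.

```python
import statistics

mass_loss = [10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2, 11.3, 9.5, 10.6, 11.6]
G_TABLE_12 = 2.285      # Grubbs critical value, 95 %, 12 observations (standard table)

mean = statistics.mean(mass_loss)
s = statistics.stdev(mass_loss)

# The questionable value is the one farthest from the mean
questionable = max(mass_loss, key=lambda x: abs(x - mean))
g_calc = abs(questionable - mean) / s
rejected = g_calc > G_TABLE_12

print(f"questionable value = {questionable}, G_calc = {g_calc:.2f}, rejected: {rejected}")
```

With these numbers G_calc is about 2.1, below the tabulated 2.285, so the value 7.8 would be retained rather than discarded.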

4-7. Method of Least Squares

1. Finding the BEST STRAIGHT LINE
; correlation between data points
1) Method of Least Squares
y = mx + b
m: slope, b: y-intercept
each data point --- ( $x_i$, $y_i$ )
vertical deviation
= $d_i = y_i - y$
= $y_i - (m x_i + b)$
we want to MINIMIZE $d_i$ (whether positive or neg.)
-- direct summation of each $d_i$ ? no good (positive and negative deviations cancel)
method of maximum likelihood
: Assume a Gaussian distribution with std.dev. $\sigma_i$
for the observations about the actual value $y(x_i)$ at $x = x_i$
the probability $P_i$: $P_i = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y_i - y}{\sigma_i}\right)^2\right]$
→ maximize the probability ?
→ minimize the sum in the exponential…

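$\chi^2 = \sum_i \left(\frac{d_i}{\sigma_i}\right)^2$, where $d_i^2 = (y_i - y)^2 = (y_i - m x_i - b)^2$
minimizing $\chi^2$ (assume all $\sigma_i$ are equal):
$\frac{\partial \chi^2}{\partial m} = 0$, $\frac{\partial \chi^2}{\partial b} = 0$
METHOD OF LEAST SQUARES
$m = \frac{n \sum (x_i y_i) - \sum x_i \sum y_i}{n \sum (x_i^2) - (\sum x_i)^2}$
$b = \frac{\sum (x_i^2) \sum y_i - \sum (x_i y_i) \sum x_i}{n \sum (x_i^2) - (\sum x_i)^2}$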
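The slope and intercept formulas translate directly into code. The following is a minimal sketch; the function name `least_squares` and the four data points are illustrative and do not come from these notes.

```python
def least_squares(x, y):
    """Slope m and intercept b of the best straight line y = m*x + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    denom = n * sum_x2 - sum_x ** 2          # common denominator of m and b
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_x2 * sum_y - sum_xy * sum_x) / denom
    return m, b

# Illustrative points (assumed for this example only)
x = [1, 3, 4, 6]
y = [2, 3, 4, 5]
m, b = least_squares(x, y)
print(f"m = {m:.4f}, b = {b:.4f}")
```

For these points the sums are small enough to check by hand: the denominator is 52, m = 32/52 and b = 70/52.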
2) How reliable are least-squares parameters ?
estimate UNCERTAINTY in slope & intercept
std. dev. of y: $\sigma_y \approx s_y = \sqrt{\frac{\sum (d_i)^2}{n - 2}}$ (degrees of freedom = n - 2)
$\sigma_m^2 = \frac{\sigma_y^2 \, n}{n \sum (x_i^2) - (\sum x_i)^2}$
$\sigma_b^2 = \frac{\sigma_y^2 \sum (x_i^2)}{n \sum (x_i^2) - (\sum x_i)^2}$
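A sketch of the uncertainty estimates follows, with $s_y$ standing in for $\sigma_y$. The function name `fit_with_uncertainty` and the data points are illustrative assumptions, not part of the notes.

```python
import math

def fit_with_uncertainty(x, y):
    """Least-squares line plus s_y, s_m, s_b (n - 2 degrees of freedom)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    denom = n * sum_x2 - sum_x ** 2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_x2 * sum_y - sum_xy * sum_x) / denom

    # Vertical deviations d_i = y_i - (m*x_i + b)
    d = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    s_y = math.sqrt(sum(di ** 2 for di in d) / (n - 2))
    s_m = s_y * math.sqrt(n / denom)          # uncertainty in the slope
    s_b = s_y * math.sqrt(sum_x2 / denom)     # uncertainty in the intercept
    return m, b, s_y, s_m, s_b

# Illustrative points (assumed for this example only)
m, b, s_y, s_m, s_b = fit_with_uncertainty([1, 3, 4, 6], [2, 3, 4, 5])
print(f"m = {m:.3f} +/- {s_m:.3f}, b = {b:.3f} +/- {s_b:.3f}, s_y = {s_y:.3f}")
```

Note that at least three points are needed, since n - 2 appears in the denominator of $s_y$.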

4-8. Calibration Curves
Std. solution : a solution with known concentration

How to build a calibration curve ?
1. Prepare a series of std. solutions (varying conc.) and
measure their absorbance.
2. Subtract the absorbance of the blank solution.
3. Plot the corrected absorbances vs. concentration
→ then do least squares.


Uncertainty propagation in the calibration curve
m : slope
The uncertainty depends on the number of calibration points.
The error is lowest for data near the center of the calibration range.
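The slide does not reproduce the propagation formula. A commonly used expression for the uncertainty in a concentration x read from a calibration line is $s_x = \frac{s_y}{|m|} \sqrt{\frac{1}{k} + \frac{1}{n} + \frac{(y - \bar{y})^2}{m^2 \sum (x_i - \bar{x})^2}}$, where k is the number of replicate readings of the unknown and n the number of calibration points. The sketch below assumes that expression and uses hypothetical calibration data; neither is taken from these notes.

```python
import math

# Hypothetical calibration data (assumed): concentration vs. blank-corrected absorbance
conc = [0.0, 5.0, 10.0, 15.0, 20.0]
absorbance = [0.002, 0.123, 0.241, 0.363, 0.478]

n = len(conc)
x_bar = sum(conc) / n
y_bar = sum(absorbance) / n
s_xx = sum((x - x_bar) ** 2 for x in conc)
m = sum((x - x_bar) * (y - y_bar) for x, y in zip(conc, absorbance)) / s_xx
b = y_bar - m * x_bar
s_y = math.sqrt(sum((y - (m * x + b)) ** 2 for x, y in zip(conc, absorbance)) / (n - 2))

def unknown_conc(y, k=1):
    """Concentration and its uncertainty for the mean signal y of k replicate readings."""
    x = (y - b) / m
    s_x = (s_y / abs(m)) * math.sqrt(1 / k + 1 / n + (y - y_bar) ** 2 / (m ** 2 * s_xx))
    return x, s_x

for y in (min(absorbance), y_bar, max(absorbance)):
    x, s_x = unknown_conc(y)
    print(f"A = {y:.3f}: conc = {x:.2f} +/- {s_x:.2f}")
```

The last term vanishes when y equals the mean calibration signal, which is why the uncertainty is smallest near the center of the calibration range; more calibration points (larger n) or more replicate readings (larger k) also reduce it.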
Homework
4-F, 13, 14, 16, 20, 33, Additional Problems Set

Additional Problems Set

1. The following replicate calcium determinations on a blood sample using AAS
and a new colorimetric method were reported. Is there a significant difference in
the precision of the two methods ?
AAS (mg/dL): 10.9, 10.1, 10.6, 11.2, 9.7, 10.0
Colorimetric (mg/dL): 9.2, 10.5, 9.7, 11.5, 11.6, 9.3, 10.1, 11.2
2. Students measured the concentration of HCl in a solution by
titrations using different indicators to find the end point. Is the
difference between indicators 1 and 2 significant at the 95%
confidence level ? Answer the same question for indicators 2 and 3.
Indicator              Mean HCl concentration (M)    Number of
                       (± std.dev.)                  measurements
1. Bromothymol blue    0.09565 ± 0.00225             28
2. Methyl red          0.08686 ± 0.00098             18
3. Bromocresol green   0.08641 ± 0.00113             29

3. A Standard Reference Material is certified to contain 94.6 ppm of an organic
contaminant in soil. Your analysis gives values of 98.6, 98.4, 97.2, 94.6, and
96.2 ppm. Do your results differ from the expected result at the 95%
confidence level ? If you made one more measurement and found 94.5, would
your conclusion change ?
