
Ch 4. Statistics

Quantitative analysis requires

: sound knowledge of chemistry
: awareness of possible interferences

WHY do we need to use STATISTICS in Anal. Chem.?

Uncertainty always exists. Will we always accept it?

If not, on what basis do we discard data?

By statistical treatment.

Random events follow a Gaussian distribution.

Anal. Chem. by Prof. Myeong Hee Moon


4-1 Gaussian Distribution

Example: test of the lifetimes of 4768 light bulbs

1) mean value & standard deviation


* mean ($\bar{x}$), or average:

$$\bar{x} = \frac{\sum_i x_i}{n}$$


4-1 Gaussian Distribution (Cont.)

* standard deviation ($s$): measures how closely the data are clustered around the mean

$$s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$$

n-1 : degrees of freedom

for an infinite set of data:

$\bar{x}$ (mean) → $\mu$ (mu, population mean)
$s$ → $\sigma$ (sigma, population standard deviation)
or $\sigma^2$ : variance


4-1 Gaussian Distribution (Cont.)

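2) std.dev. & probability

Gaussian curve:

$$y = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

$\sigma$ tells the breadth of the Gaussian curve.

In a Gaussian curve, the area under the curve within
$\mu \pm 1\sigma$ = 68.3 %
$\mu \pm 2\sigma$ = 95.5 %
$\mu \pm 3\sigma$ = 99.7 %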
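
The areas quoted above can be checked numerically. The sketch below assumes only the standard result that the fraction of a Gaussian population within k standard deviations of the mean is erf(k/√2); the function name `fraction_within` is illustrative and does not come from the slides.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a Gaussian population lying within mu +/- k*sigma."""
    # For a normal distribution, P(|x - mu| < k*sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} sigma: {100 * fraction_within(k):.2f} %")
```

The printed percentages agree with the values on the slide to within rounding.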


4-1 Gaussian Distribution (Cont.)

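3) std.dev. of mean

more measurements → more confidence in the average (nearer the true value)

uncertainty decreases by $1/\sqrt{n}$ : n = number of meas.

standard deviation of mean = $\frac{s}{\sqrt{n}}$ : s = std.dev.

* relative standard deviation (RSD) = $\frac{s}{\bar{x}}$
  or as a percentage = $\frac{s}{\bar{x}} \times 100$ = C.V. (coefficient of variation)

precision of mean = $\frac{\sigma_x}{\sqrt{n}}$

average deviation of mean = $\frac{\bar{d}}{\sqrt{n}}$, where $\bar{d} = \frac{\sum_i |x_i - \bar{x}|}{n}$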
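
As a minimal sketch of these definitions, the following computes the mean, the sample standard deviation (n-1 in the denominator), the standard deviation of the mean, and the RSD in percent. The five replicate values are hypothetical and chosen only for illustration; `summarize` is an assumed helper name.

```python
import math
import statistics


def summarize(data):
    """Return n, mean, s, standard deviation of the mean, and %RSD."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 degrees of freedom)
    return {
        "n": n,
        "mean": mean,
        "s": s,
        "s_mean": s / math.sqrt(n),      # standard deviation of the mean
        "rsd_percent": 100 * s / mean,   # coefficient of variation
    }


# Hypothetical replicate measurements (not from the slides).
values = [10.0, 10.2, 9.9, 10.1, 9.8]
summary = summarize(values)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```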


4-2 Confidence Intervals


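1) confidence interval : an expression stating that the true mean, $\mu$, is likely to lie within a certain distance from the measured mean, $\bar{x}$

our measurements → $\bar{x}$, $s$ (instead of $\mu$, $\sigma$)

The true mean ($\mu$) is likely to lie within a certain range from $\bar{x}$.

Confidence interval:

$$\mu = \bar{x} \pm \frac{t\,s}{\sqrt{n}}$$

where t is Student's t for n-1 degrees of freedom at the chosen confidence level.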


Ex. The carbohydrate content of a glycoprotein (a protein with sugars
attached to it) is determined to be 12.6, 11.9, 13.0, 12.7, and 12.5 g
per 100 g of protein in replicate analyses. Find the 50% and 90%
confidence intervals for the carbohydrate content.

mean = 12.5, std = 0.4
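
The example can be worked through as follows. This is a sketch, not the slide's own solution: the Student's t values for 4 degrees of freedom (0.741 at 50%, 2.132 at 90%) are taken from a standard t table, which the slides do not reproduce, and `confidence_interval` is an illustrative name.

```python
import math
import statistics

# Student's t for 4 degrees of freedom, from a standard t table
# (assumption: the table itself is not shown in the slides).
T_TABLE_DF4 = {50: 0.741, 90: 2.132}


def confidence_interval(data, t):
    """Return (mean, half-width) of the interval mean +/- t*s/sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    return mean, t * s / math.sqrt(n)


carbohydrate = [12.6, 11.9, 13.0, 12.7, 12.5]  # g per 100 g of protein
intervals = {
    level: confidence_interval(carbohydrate, t)
    for level, t in T_TABLE_DF4.items()
}
for level, (mean, half_width) in intervals.items():
    print(f"{level}% confidence interval: {mean:.1f} +/- {half_width:.1f}")
```

With these table values the 50% interval is about 12.5 ± 0.1 and the 90% interval about 12.5 ± 0.4 g per 100 g: a higher confidence level requires a wider interval.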


4-3 Comparison of means with Student's t


(from different measurements)

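: a tool for expressing confidence intervals and for comparing results
from different experiments or techniques

Normally, the 95% confidence level is used
: two results are judged to differ from each other only IF there is a
95% chance that this conclusion is correct (less than a 5% chance that
the difference arises from random error).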


Case 1. t test : comparing a measured result with a known value

: when we test a new analytical method, we want to see whether it agrees with a known value.

ex) Ni content; known value : 0.0319% (from a standard reference material)
measured values : 0.0329, 0.0322, 0.0330, 0.0323 % ($\bar{x}$ = 0.0326, s = 0.0004, n = 4)

The 95% confidence interval?

$$\bar{x} \pm \frac{t\,s}{\sqrt{n}} = 0.0326 \pm \frac{(3.182)(0.0004)}{\sqrt{4}} = 0.0326 \pm 0.0006$$

This interval does not cover 0.0319,
thus the measured value differs from the known value.

The difference is not within the random error boundary.
(It implies that systematic error exists.)
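
The Case 1 comparison can be sketched in a few lines. The t value 3.182 (95%, 3 degrees of freedom) is the one used on the slide; the function name `differs_from_known` is an assumption made for this illustration.

```python
import math
import statistics

T_95_DF3 = 3.182  # Student's t at 95% for 3 degrees of freedom (value used on the slide)


def differs_from_known(data, known, t):
    """True if the known value lies outside mean +/- t*s/sqrt(n)."""
    n = len(data)
    mean = statistics.mean(data)
    half_width = t * statistics.stdev(data) / math.sqrt(n)
    return abs(mean - known) > half_width, mean, half_width


ni_percent = [0.0329, 0.0322, 0.0330, 0.0323]
differs, ni_mean, ni_half_width = differs_from_known(ni_percent, 0.0319, T_95_DF3)
print(f"mean = {ni_mean:.4f}, 95% half-width = {ni_half_width:.4f}")
print("differs from known value:", differs)
```

The same function applies to any comparison of replicate results with a certified value, given the t value for n-1 degrees of freedom.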

1. <t-test> You are developing a procedure for determining traces of copper in
biological materials using a wet digestion followed by measurement by atomic
absorption spectrophotometry. In order to test the validity of the method, you
obtain a NIST orchard leaves standard reference material and analyze this
material. Five replicates are sampled and analyzed, and the mean of the results is
found to be 10.08 ppm with a standard deviation of 0.7 ppm. The listed value is
11.7 ppm. Does your method give a statistically correct value at the 95%
confidence level?


Case 2. t test: comparing replicate measurements
(test of two sets of measurements)

: test whether two techniques are statistically the SAME or NOT
for two sets of data with n1 and n2 measurements

$$t_{cal} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

$$s_{pooled} = \sqrt{\frac{\sum_i (x_i - \bar{x}_1)^2 + \sum_j (x_j - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$$

If tcal > ttable (at the 95% confidence level),
this difference is significant
(outside the random error range):
there exists systematic error

Ex) The average mass of nitrogen from air in Table 4-3 is $\bar{x}_1$ = 2.31011 g, with
a standard deviation of s1 = 0.00014 (for n1 = 7 measurements). The average
mass from chemical sources is $\bar{x}_2$ = 2.29947 g, with a standard deviation of
s2 = 0.00138 (for n2 = 8 measurements).
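
A minimal sketch of the Case 2 calculation for the nitrogen data, using the pooled standard deviation from the summary statistics. The table value t = 2.160 (95%, 13 degrees of freedom) comes from a standard t table and is an assumption here, since the slides do not list it; `pooled_t` is an illustrative name.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t_calc, s_pooled) for two sets of replicate measurements."""
    s_pooled = math.sqrt(
        (s1 ** 2 * (n1 - 1) + s2 ** 2 * (n2 - 1)) / (n1 + n2 - 2)
    )
    t_calc = abs(mean1 - mean2) / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))
    return t_calc, s_pooled


# Nitrogen from air vs. nitrogen from chemical sources (values from the example).
t_calc, s_pooled = pooled_t(2.31011, 0.00014, 7, 2.29947, 0.00138, 8)

T_TABLE_95_DF13 = 2.160  # assumption: standard t table, 7 + 8 - 2 = 13 degrees of freedom
print(f"s_pooled = {s_pooled:.5f}, t_calc = {t_calc:.1f}")
print("difference is significant:", t_calc > T_TABLE_95_DF13)
```

Here tcal is far larger than ttable, so the two masses differ by much more than random error can explain.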

2. <t-test> A new gravimetric method is developed for iron (II) in which the iron
is precipitated in crystalline form with an organocarbon "cage" compound. The
accuracy of the method is checked by analyzing the iron in an ore sample and
comparing with the results using the standard precipitation with ammonia and
weighing of Fe2O3. The results, reported as % Fe for each analysis, were as
follows.

Test method            Reference method
20.10%                 18.89%
20.50                  19.20
18.65                  19.00
19.25                  19.70
19.40                  19.40
19.99                  19.40
$\bar{x}_1$ = 19.65%   $\bar{x}_2$ = 19.24%

Is there a difference between the two methods?


Case 3; Comparing individual differences

Two different methods applied to several different samples (no duplication)

Cholesterol content (g/L)

Plasma sample   Method A   Method B   Difference (di)
1               1.46       1.42        0.04
2               2.22       2.38       -0.16
3               2.84       2.67        0.17
4               1.97       1.80        0.17
5               1.13       1.09        0.04
6               2.35       2.25        0.10
                                      $\bar{d}$ = +0.06

$$t_{cal} = \frac{|\bar{d}|}{s_d} \sqrt{n}$$

$$s_d = \sqrt{\frac{\sum_i (d_i - \bar{d})^2}{n - 1}}$$


Is my red blood cell count high today?

Red cell counts on five “normal” days
: 5.1, 5.3, 4.8, 5.4, and 5.2 × 10^6 cells/µL → $\bar{x}$ = 5.16, s = 0.23
Today’s value = 5.6 × 10^6 cells/µL

$$t_{cal} = \frac{|\text{today's count} - \bar{x}|}{s} \sqrt{n} = \frac{|5.6 - 5.16|}{0.23} \sqrt{5} = 4.28$$

What is the probability of finding t = 4.28 for 4 degrees of freedom?

See Table 4.2: at 4 degrees of freedom, 4.28 lies between the 98% and 99% confidence levels.
→ There is less than a 2% probability of observing a count of
5.6 × 10^6 cells/µL on normal days.

→ It is reasonable to conclude that today’s count is elevated.


4-4 Comparison of st.dev. with the F test

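F test ---- checks whether two std.devs are significantly different from each other.

$$F_{calc} = \frac{s_1^2}{s_2^2}$$

(put the larger std.dev. in the numerator so that F ≥ 1)

If Fcalc > Ftable, then the difference is significant.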
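
As a small sketch, Fcalc can be computed from two standard deviations with the larger one placed in the numerator. The inputs below are the standard deviations quoted earlier for the nitrogen example (0.00014 and 0.00138); the function name `f_calc` is an assumption, and the comparison value Ftable must still be looked up for the appropriate degrees of freedom.

```python
def f_calc(s_a: float, s_b: float) -> float:
    """Ratio of variances with the larger standard deviation in the numerator."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return (larger / smaller) ** 2


# Standard deviations from the nitrogen example (air vs. chemical sources).
f_nitrogen = f_calc(0.00014, 0.00138)
print(f"F_calc = {f_nitrogen:.1f}")
```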


4-6. Grubbs test for an outlier


During measurements of the mass loss of zinc,
we may need to decide whether to discard a questionable datum:
10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2,
11.3, 9.5, 10.6, 11.6

If Gcalc > Gtab, then the questionable point is rejected.
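
A sketch of the Grubbs calculation for the zinc data, assuming the usual definition Gcalc = |questionable value - mean| / s, which the slide does not write out. The critical value 2.285 (95% confidence, 12 observations) is taken from a standard Grubbs table and is likewise an assumption.

```python
import statistics

G_TABLE_95_N12 = 2.285  # assumption: standard Grubbs table, 12 observations


def grubbs(data):
    """Return (G_calc, suspect), where suspect is the point farthest from the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # computed with the suspect point included
    suspect = max(data, key=lambda x: abs(x - mean))
    return abs(suspect - mean) / s, suspect


mass_loss = [10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2, 11.3, 9.5, 10.6, 11.6]
g_calc, suspect = grubbs(mass_loss)
print(f"suspect = {suspect}, G_calc = {g_calc:.2f}")
print("reject suspect:", g_calc > G_TABLE_95_N12)
```

With this table value, Gcalc is smaller than Gtab, so the questionable point 7.8 would be retained.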


4-7. Method of Least Squares


1. Finding the BEST STRAIGHT LINE

; correlation between data points

1) Method of Least Squares

y = mx + b
m: slope, b: y-intercept

each data --- ( xi, yi )


vertical deviation
= di = yi - y
= yi - (mxi + b)


4-7. Method of Least Squares

we want to MINIMIZE the magnitude of di (whether positive or negative)

-- direct summation of each di? no good (positive and negative deviations cancel)

method of maximum likelihood

: Assume a Gaussian distribution with std.dev. $\sigma_i$
for the observations about the actual value y(xi) at x = xi

the probability Pi:

$$P_i = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y_i - y}{\sigma_i}\right)^2\right]$$

→ maximize the probability?

→ minimize the sum in the exponential…


4-7. Method of Least Squares


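$$\chi^2 = \sum_i \left(\frac{d_i}{\sigma_i}\right)^2, \qquad d_i^2 = (y_i - y)^2 = (y_i - m x_i - b)^2$$

minimizing $\chi^2$ (assume all $\sigma_i$ are equal):

$$\frac{\partial \chi^2}{\partial m} = 0, \qquad \frac{\partial \chi^2}{\partial b} = 0$$

METHOD OF LEAST SQUARES

$$m = \frac{n \sum (x_i y_i) - \sum x_i \sum y_i}{n \sum (x_i^2) - \left(\sum x_i\right)^2}$$

$$b = \frac{\sum (x_i^2) \sum y_i - \sum (x_i y_i) \sum x_i}{n \sum (x_i^2) - \left(\sum x_i\right)^2}$$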

4-7. Method of Least Squares

2) How reliable are least-squares parameters ?


estimate UNCERTAINTY in slope & intercept

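std. dev. of y:

$$\sigma_y \approx s_y = \sqrt{\frac{\sum (d_i)^2}{n - 2}} \qquad (\text{degrees of freedom} = n - 2)$$

$$\sigma_m^2 = \frac{\sigma_y^2\, n}{n \sum (x_i^2) - \left(\sum x_i\right)^2}$$

$$\sigma_b^2 = \frac{\sigma_y^2 \sum (x_i^2)}{n \sum (x_i^2) - \left(\sum x_i\right)^2}$$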


4-8. Calibration Curves

Standard solutions : solutions with known concentrations

How to build a calibration curve?
1. Prepare a series of standard solutions (varying conc.) and
measure their absorbances.

2. Subtract the absorbance of the blank solution.


4-8. Calibration Curves

3. Plot the corrected absorbances vs. concentration

→ then find the best straight line by least squares.


4-8. Calibration Curves


Uncertainty Propagation in a Calibration Curve

m : slope

The uncertainty in a concentration read from the curve depends on the number of calibration points.

The error is lowest for data near the center of the calibration range.
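
The slide does not show the propagation formula itself. The sketch below assumes the commonly used expression s_x = (s_y/|m|) · sqrt(1/k + 1/n + (y - ȳ)² / (m² Σ(xi - x̄)²)), where k is the number of replicate measurements of the unknown and n is the number of calibration points. The calibration data and all function names are hypothetical.

```python
import math


def calibrate(x, y):
    """Fit y = m*x + b by least squares and return (m, b, s_y)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    denom = n * sum_x2 - sum_x ** 2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_x2 * sum_y - sum_xy * sum_x) / denom
    s_y = math.sqrt(sum((yi - m * xi - b) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    return m, b, s_y


def unknown_concentration(x, y, y_unknown, k=1):
    """Concentration from corrected absorbance, with its standard uncertainty.

    Uses the assumed propagation formula described in the lead-in.
    """
    n = len(x)
    m, b, s_y = calibrate(x, y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    conc = (y_unknown - b) / m
    s_x = (s_y / abs(m)) * math.sqrt(
        1 / k + 1 / n + (y_unknown - y_mean) ** 2 / (m ** 2 * sxx)
    )
    return conc, s_x


# Hypothetical standards: concentration vs. blank-corrected absorbance.
conc_std = [0.0, 5.0, 10.0, 15.0, 20.0]
abs_std = [0.001, 0.099, 0.202, 0.298, 0.400]

center = unknown_concentration(conc_std, abs_std, 0.200)  # middle of the range
edge = unknown_concentration(conc_std, abs_std, 0.400)    # top of the range
print(f"center: {center[0]:.2f} +/- {center[1]:.2f}")
print(f"edge:   {edge[0]:.2f} +/- {edge[1]:.2f}")
```

Under this formula the uncertainty is smallest when the unknown's absorbance is near the mean of the calibration absorbances, and it shrinks as more calibration points or more replicates of the unknown are used.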


Homework

4-F, 13, 14, 16, 20, 33, Additional Problems Set


Additional Problems Set


1. The following replicate calcium determinations on a blood sample using AAS
and a new colorimetric method were reported. Is there a significant difference in
the precision of the two methods?
AAS (mg/dL): 10.9, 10.1, 10.6, 11.2, 9.7, 10.0
Colorimetric (mg/dL): 9.2, 10.5, 9.7, 11.5, 11.6, 9.3, 10.1, 11.2


2. Students measured the concentration of HCl in a solution by
titrations using different indicators to find the end point. Is the
difference between indicators 1 and 2 significant at the 95%
confidence level? Answer the same question for indicators 2 and 3.

Indicator               Mean HCl concentration (M)   Number of
                        (± std.dev.)                 measurements
1. Bromothymol blue     0.09565 ± 0.00225            28
2. Methyl red           0.08686 ± 0.00098            18
3. Bromocresol green    0.08641 ± 0.00113            29


3. A Standard Reference Material is certified to contain 94.6 ppm of an organic
contaminant in soil. Your analysis gives values of 98.6, 98.4, 97.2, 94.6, and
96.2 ppm. Do your results differ from the expected result at the 95%
confidence level? If you made one more measurement and found 94.5, would
your conclusion change?
