Ch 4. Statistics

: sound knowledge of chemistry

: possibility of interferences

WHY do we need to use STATISTICS in Anal. Chem. ?

if not, on what basis can we decide to disregard data ?

by statistical treatment

4.2

* mean : $\bar{x}$ (or average)

$\bar{x} = \dfrac{\sum_i x_i}{n}$

4.3

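standard deviation, s : how closely the data are clustered

around the mean

$s = \sqrt{\dfrac{\sum_i (x_i - \bar{x})^2}{n - 1}}$

as n becomes very large :

$\bar{x}$ (mean) → $\mu$ (mu, population mean)

s → $\sigma$ (sigma, population standard deviation)

$s^2$ or $\sigma^2$ : variance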

4.4

4.5

Gaussian curve : $y = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left( -\dfrac{(x - \mu)^2}{2 \sigma^2} \right)$

of Gaussian curve

in a Gaussian curve

area under $\mu \pm 1\sigma$ = 68.3 %

$\mu \pm 2\sigma$ = 95.5 %

$\mu \pm 3\sigma$ = 99.7 %
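The three areas quoted for the Gaussian curve can be checked numerically. The sketch below uses the standard relation between the Gaussian integral and the error function; the helper name `area_within` is only illustrative and not part of the lecture.

```python
import math

def area_within(k: float) -> float:
    """Fraction of a Gaussian population lying within mu +/- k*sigma."""
    # Integrating the normalized Gaussian from -k*sigma to +k*sigma
    # gives erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"mu +/- {k} sigma: {100 * area_within(k):.2f} %")
```

The value for 2 sigma comes out close to 95.45 %, which is usually rounded to 95.5 %.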

4.6

3) std.dev. of mean

more measurements → more confidence in the average (nearer to the true value)

uncertainty decreases by $\dfrac{1}{\sqrt{n}}$ : n = number of meas.

standard deviation of mean = $\dfrac{s}{\sqrt{n}}$ : s = std.dev.

* relative standard deviation (RSD) = $\dfrac{s}{\bar{x}}$

or as a percentage = $\dfrac{s}{\bar{x}} \times 100$ = C.V. (coefficient of variation)

precision of mean = $\dfrac{\sigma_x}{\sqrt{n}}$

average deviation of mean = $\dfrac{\bar{d}}{\sqrt{n}}$ ( $\bar{d} = \dfrac{\sum |x_i - \bar{x}|}{n}$ )

4.7

1) confidence interval : an expression stating that the true mean, $\mu$,

is likely to lie within a certain distance from the measured mean, $\bar{x}$

our measurements give $\bar{x}$, s (instead of $\mu$, $\sigma$)

Confidence interval : $\mu = \bar{x} \pm \dfrac{t s}{\sqrt{n}}$ (t = Student's t)

4.8

Ex) The carbohydrate content of a glycoprotein (a protein with

sugars attached to it) is determined to be 12.6, 11.9, 13.0, 12.7, and

12.5 g per 100 g of protein in replicate analyses. Find the 50% and

90% confidence intervals for the carbohydrate content.
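A minimal sketch of how this example could be worked, assuming the interval $\bar{x} \pm t s / \sqrt{n}$. The Student's t values for 4 degrees of freedom are copied from a standard t table and are assumptions here; check them against Table 4-2. The function name `confidence_interval` is illustrative.

```python
import math
import statistics

# Replicate carbohydrate results from the example (g per 100 g of protein).
data = [12.6, 11.9, 13.0, 12.7, 12.5]

# Student's t for n - 1 = 4 degrees of freedom (assumed table values).
T_TABLE_DF4 = {50: 0.741, 90: 2.132}

def confidence_interval(values, t):
    """Return (mean, half_width) for the interval mean +/- t*s/sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation, n - 1 in the denominator
    return mean, t * s / math.sqrt(n)

for level, t in T_TABLE_DF4.items():
    mean, half = confidence_interval(data, t)
    print(f"{level}% confidence interval: {mean:.2f} +/- {half:.2f} g per 100 g")
```

With these inputs the half-widths should come out near 0.13 (50%) and 0.38 (90%) around a mean of 12.54: demanding more confidence widens the interval.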

4.9

(from different measurements)

interval for comparing results

from other experimental tech.

: We conclude that two results differ from

each other only IF there is at least a 95%

chance that this conclusion is

correct; otherwise we treat them as the same.

4.10

we want to see if a measured result agrees with a known value.

ex) Ni content; known value : 0.0319% (from std. material)

measured values : 0.0329, 0.0322, 0.0330, 0.0323 %

The 95% confidence interval ?

$\bar{x} \pm \dfrac{t s}{\sqrt{n}} = 0.0326 \pm \dfrac{(3.182)(0.0004)}{\sqrt{4}} = 0.0326 \pm 0.0006$

this interval doesn't cover 0.0319,

thus, the measured value is different from the known value.

(it implies that a systematic error exists)

4.11

1. <t-test> You are developing a procedure for determining traces of copper in

biological materials using a wet digestion followed by measurements by atomic

absorption spectrophotometry. In order to test the validity of the method, you

obtain a NIST orchard leaves standard reference material and analyze this

material. Five replicates are sampled and analyzed, and the mean of the results is

found to be 10.08 ppm with a standard deviation of 0.7 ppm. The listed value is

11.7 ppm. Does your method give a statistically correct value at the 95%

confidence level ?
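One way to approach this problem is to compute a t value from the measured mean and the listed value and compare it with the tabulated t. The sketch below assumes the usual form t = |mean - known| sqrt(n) / s. The tabulated t for 4 degrees of freedom at 95% is copied from a standard table and is an assumption, and the function name is illustrative.

```python
import math

def t_calculated(mean, known, s, n):
    """t = |mean - known| * sqrt(n) / s, for comparison with an accepted value."""
    return abs(mean - known) * math.sqrt(n) / s

# Numbers taken from the problem statement.
t_calc = t_calculated(mean=10.08, known=11.7, s=0.7, n=5)

# Student's t at 95% confidence for n - 1 = 4 degrees of freedom
# (assumed value from a standard t table).
T_TABLE_95_DF4 = 2.776

differs = t_calc > T_TABLE_95_DF4
print(f"t_calc = {t_calc:.2f}, t_table = {T_TABLE_95_DF4}, significant difference: {differs}")
```

If t_calc exceeds the table value, the measured mean differs from the listed value by more than random error would explain at this confidence level.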

4.12

(test of two sets of measurements)

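: test whether the two techniques are statistically the SAME or NOT

for two sets of data, with n1 and n2 measurements

$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\dfrac{n_1 n_2}{n_1 + n_2}}$

$s_{pooled} = \sqrt{\dfrac{\sum_i (x_i - \bar{x}_1)^2 + \sum_j (x_j - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

if t calculated > t table (95%, n1 + n2 - 2 degrees of freedom) :

this difference is significant

(out of random error range)

there exists systematic error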

4.13

Ex) The average mass of nitrogen from air in Table 4-3 is $\bar{x}_1$ = 2.31011 g, with

a standard deviation of s1 = 0.00014 (for n1 = 7 measurements). The average

mass from chemical sources is $\bar{x}_2$ = 2.29947 g, with a standard deviation of

s2 = 0.00138 (for n2 = 8 measurements).
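The pooled-standard-deviation comparison can be sketched for these numbers as follows. The function names are illustrative, and the tabulated t for 13 degrees of freedom at 95% is an assumed value from a standard t table.

```python
import math

def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation of two data sets."""
    return math.sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))

def t_two_means(x1, s1, n1, x2, s2, n2):
    """t for comparing two means using the pooled standard deviation."""
    sp = pooled_std(s1, n1, s2, n2)
    return abs(x1 - x2) / sp * math.sqrt(n1 * n2 / (n1 + n2))

# Values from the nitrogen example: air (set 1) vs. chemical sources (set 2).
t_calc = t_two_means(2.31011, 0.00014, 7, 2.29947, 0.00138, 8)

# Student's t at 95% confidence for 7 + 8 - 2 = 13 degrees of freedom
# (assumed value from a standard t table).
T_TABLE_95_DF13 = 2.160

print(f"t_calc = {t_calc:.1f}, significant: {t_calc > T_TABLE_95_DF13}")
```

A t_calc near 20 is far above the table value, so the two sources of nitrogen give masses that differ by much more than random error.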

4.14

2. <t-test> A new gravimetric method is developed for iron (II) in which the iron

is precipitated in crystalline form with an organocarbon "cage" compound. The

accuracy of the method is checked by analyzing the iron in an ore sample and

comparing with the results using the standard precipitation with ammonia and

weighing of Fe2O3. The results, reported as % Fe for each analysis, were as

follows.

Test method          Reference method
20.10%               18.89%
20.50                19.20
18.65                19.00
19.25                19.70
19.40                19.40
19.99                19.40
$\bar{x}$ = 19.65%   $\bar{x}$ = 19.24%

Is there a difference between the two methods ?

4.15

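Cholesterol content (g/L)

sample   Method A   Method B   difference (d_i)
1        1.46       1.42        0.04
2        2.22       2.38       -0.16
3        2.84       2.67        0.17
4        1.97       1.80        0.17
5        1.13       1.09        0.04
6        2.35       2.25        0.10
                    $\bar{d}$ = +0.06

$t_{calc} = \dfrac{|\bar{d}|}{s_d} \sqrt{n}$

$s_d = \sqrt{\dfrac{\sum_i (d_i - \bar{d})^2}{n - 1}}$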
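A short sketch of the paired comparison for the cholesterol table, using the formulas for t_calc and s_d above. The two measurement columns are called `method_a` and `method_b` here purely for illustration, and the tabulated t for 5 degrees of freedom at 95% is an assumed value from a standard t table.

```python
import math
import statistics

# The two measurement columns of the cholesterol table (g/L), samples 1 to 6.
method_a = [1.46, 2.22, 2.84, 1.97, 1.13, 2.35]
method_b = [1.42, 2.38, 2.67, 1.80, 1.09, 2.25]

# Work with the individual differences d_i, not with the two means.
d = [a - b for a, b in zip(method_a, method_b)]
d_mean = statistics.mean(d)
s_d = statistics.stdev(d)  # uses n - 1 in the denominator
t_calc = abs(d_mean) / s_d * math.sqrt(len(d))

# Student's t at 95% confidence for n - 1 = 5 degrees of freedom
# (assumed value from a standard t table).
T_TABLE_95_DF5 = 2.571

print(f"d_mean = {d_mean:+.2f}, s_d = {s_d:.3f}, t_calc = {t_calc:.2f}")
print("significant difference:", t_calc > T_TABLE_95_DF5)
```

Here t_calc falls below the table value, so the two columns are not significantly different at the 95% level.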

4.16

: 5.1, 5.3, 4.8, 5.4, and 5.2 × 10^6 cells/µL ; $\bar{x}$ = 5.16, s = 0.23

Today's value = 5.6 × 10^6 cells/µL

$t_{calc} = \dfrac{|5.6 - 5.16|}{0.23} \sqrt{5} = 4.28$

See table 4.2: at 4 degrees of freedom, t = 4.28 lies between the 98% and 99% confidence levels

There is less than a 2% probability of observing a count of

5.6 × 10^6 cells/µL on normal days.

4.17

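F test ---- check whether two std.devs are significantly different from each other.

$F_{calc} = \dfrac{s_1^2}{s_2^2}$ (put the larger std.dev. in the numerator, so that F ≥ 1)

If Fcalc > Ftable, then the difference is significant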
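As an illustration, the F test can be applied to the two standard deviations of the nitrogen example (s = 0.00014 for 7 measurements, s = 0.00138 for 8 measurements). This is a sketch only: the critical F value for 7 and 6 degrees of freedom at 95% is copied from a standard F table and should be treated as an assumption.

```python
def f_calculated(s_a, s_b):
    """F = (larger s)^2 / (smaller s)^2, so that F >= 1."""
    larger, smaller = max(s_a, s_b), min(s_a, s_b)
    return (larger / smaller) ** 2

# Standard deviations from the nitrogen example.
f_calc = f_calculated(0.00014, 0.00138)

# Critical F at 95% confidence: numerator 8 - 1 = 7 degrees of freedom,
# denominator 7 - 1 = 6 degrees of freedom (assumed table value).
F_TABLE_7_6 = 4.21

print(f"F_calc = {f_calc:.1f}, significant: {f_calc > F_TABLE_7_6}")
```

The degrees of freedom follow the data set that supplies each variance: the numerator uses the set with the larger standard deviation.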

4.18

during measurements of the mass loss of zinc,

we need to decide whether to discard some questionable data

10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2,

11.3, 9.5, 10.6, 11.6
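The slide does not name the test used for this data set. One common choice for a single questionable point is the Grubbs test, and the sketch below assumes that test. The critical value for 12 observations at 95% confidence is copied from a standard Grubbs table and is an assumption, as is the function name.

```python
import statistics

# Mass-loss data from the slide.
data = [10.2, 10.8, 11.6, 9.9, 9.4, 7.8, 10.0, 9.2, 11.3, 9.5, 10.6, 11.6]

def grubbs_statistic(values):
    """Return (suspect, G) with G = |suspect - mean| / s,
    where suspect is the point farthest from the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    suspect = max(values, key=lambda v: abs(v - mean))
    return suspect, abs(suspect - mean) / s

# Critical G at 95% confidence for 12 observations (assumed table value).
G_TABLE_95_N12 = 2.285

suspect, g_calc = grubbs_statistic(data)
discard = g_calc > G_TABLE_95_N12
print(f"suspect = {suspect}, G_calc = {g_calc:.2f}, discard: {discard}")
```

Under this assumption the point 7.8 is the suspect, and it would be retained because G_calc stays below the critical value.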

4.19

1. Finding the BEST STRAIGHT LINE

y = mx + b

m: slope, b: y-intercept

vertical deviation

= di = yi - y

= yi - (mxi + b)

4.20

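-- direct summation of each di ? no good (positive and negative deviations cancel)

: Assume a Gaussian distribution with std.dev. $\sigma_i$

for the observations about the actual value y(xi) at x = xi

the probability $P_i$ : $P_i = \dfrac{1}{\sigma_i \sqrt{2\pi}} \exp\left[ -\dfrac{1}{2} \left( \dfrac{y_i - y(x_i)}{\sigma_i} \right)^2 \right]$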

minimize the sum in the exponential…

4.21

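$\chi^2 = \sum_i \dfrac{d_i^2}{\sigma_i^2}$

$d_i^2 = (y_i - y)^2 = (y_i - m x_i - b)^2$

minimizing $\chi^2$ (assume $\sigma_i = \sigma$ for every point)

$\dfrac{\partial \chi^2}{\partial m} = 0$ , $\dfrac{\partial \chi^2}{\partial b} = 0$

METHOD OF LEAST SQUARES

$m = \dfrac{n \sum (x_i y_i) - \sum x_i \sum y_i}{n \sum (x_i^2) - (\sum x_i)^2}$

$b = \dfrac{\sum (x_i^2) \sum y_i - \sum (x_i y_i) \sum x_i}{n \sum (x_i^2) - (\sum x_i)^2}$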
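The least-squares slope and intercept can be evaluated directly from the four sums that appear in the formulas. The sketch below is a plain transcription of those expressions; the four data points are hypothetical and only serve as a check.

```python
def least_squares(xs, ys):
    """Return slope m and intercept b of the best straight line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x**2  # shared denominator of m and b
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_xy * sum_x) / denom
    return m, b

# Hypothetical illustration data, not taken from the lecture.
xs = [1.0, 3.0, 4.0, 6.0]
ys = [2.0, 3.0, 4.0, 5.0]

m, b = least_squares(xs, ys)
print(f"m = {m:.4f}, b = {b:.4f}")
```

For these points the sums are 14, 14, 57 and 62, which gives m = 32/52 and b = 70/52.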

Anal. Chem. by Prof. Myeong Hee Moon

4.22

estimate UNCERTAINTY in slope & intercept

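std. dev. of y

$\sigma_y \approx s_y = \sqrt{\dfrac{\sum (d_i)^2}{n - 2}}$ (degrees of freedom = n - 2)

$\sigma_m^2 = \dfrac{\sigma_y^2 \, n}{n \sum (x_i^2) - (\sum x_i)^2}$

$\sigma_b^2 = \dfrac{\sigma_y^2 \sum (x_i^2)}{n \sum (x_i^2) - (\sum x_i)^2}$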
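A sketch of how the uncertainties in slope and intercept could be computed alongside the fit, using s_y with n - 2 degrees of freedom as the estimate of the standard deviation of y. The data points are hypothetical, and the function name is illustrative.

```python
import math

def fit_with_uncertainty(xs, ys):
    """Least-squares line y = m*x + b plus the standard deviations s_y, s_m, s_b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x**2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_xy * sum_x) / denom
    # Sum of squared vertical deviations d_i = y_i - (m*x_i + b).
    residual_ss = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    s_y = math.sqrt(residual_ss / (n - 2))  # two degrees of freedom used by m and b
    s_m = s_y * math.sqrt(n / denom)
    s_b = s_y * math.sqrt(sum_xx / denom)
    return m, b, s_y, s_m, s_b

# Hypothetical illustration data, not taken from the lecture.
m, b, s_y, s_m, s_b = fit_with_uncertainty([1.0, 3.0, 4.0, 6.0], [2.0, 3.0, 4.0, 5.0])
print(f"m = {m:.3f} +/- {s_m:.3f}, b = {b:.3f} +/- {s_b:.3f}, s_y = {s_y:.3f}")
```

At least three points are needed, since n - 2 must be positive.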

4.23

How to build a calibration curve ?

1. prepare a series of std. solutions (varying conc.)

and measure the absorbance of each.

4.24

then do least squares.

4.25

Uncertainty Propagation in Calibration curve

m : slope

The error is lowest for data near the center of the calibration range
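The propagation formula itself did not survive on this slide. A commonly used expression for the uncertainty of a concentration x read from a linear calibration curve is $s_x = \dfrac{s_y}{|m|} \sqrt{\dfrac{1}{k} + \dfrac{1}{n} + \dfrac{(y - \bar{y})^2}{m^2 \sum (x_i - \bar{x})^2}}$, where k is the number of replicate measurements of the unknown. The sketch below assumes that expression and uses hypothetical calibration points to show why the error is smallest near the center.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope, intercept and s_y (n - 2 degrees of freedom)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x**2
    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_xx * sum_y - sum_xy * sum_x) / denom
    s_y = math.sqrt(sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / (n - 2))
    return m, b, s_y

def x_from_calibration(y_meas, xs, ys, k=1):
    """Return (x, s_x) for a measured response y_meas (mean of k replicates)."""
    n = len(xs)
    m, b, s_y = fit_line(xs, ys)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    x = (y_meas - b) / m
    s_x = (s_y / abs(m)) * math.sqrt(1 / k + 1 / n + (y_meas - y_mean) ** 2 / (m**2 * sxx))
    return x, s_x

# Hypothetical calibration points (concentration, response), for illustration only.
xs = [1.0, 3.0, 4.0, 6.0]
ys = [2.0, 3.0, 4.0, 5.0]

_, s_center = x_from_calibration(3.5, xs, ys)  # response at the mean of the standards
_, s_edge = x_from_calibration(5.0, xs, ys)    # response at the top of the range
print(f"s_x at center = {s_center:.3f}, s_x at edge = {s_edge:.3f}")
```

The last term under the square root vanishes when the measured response equals the mean response of the standards, which is why unknowns near the middle of the calibration range carry the smallest uncertainty.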

4.26

Homework

4.27

1. The following replicate calcium determinations on a blood sample using AAS

and a new colorimetric method were reported. Is there a significant difference in

the precision of the two methods ?

AAS (mg/dL) 10.9, 10.1, 10.6, 11.2, 9.7, 10.0

Colorimetric (mg/dL) 9.2, 10.5, 9.7, 11.5, 11.6, 9.3, 10.1, 11.2

difference between indicators 1 and 2 significant at the 95%

confidence level ? Answer the same question for indicators 2 and 3.

Indicator               Mean HCl concentration (M)   Number of
                        (± std.dev.)                 measurements
1. Bromothymol blue     0.09565 ± 0.00225            28
2. Methyl red           0.08686 ± 0.00098            18
3. Bromocresol green    0.08641 ± 0.00113            29

4.29

contaminant in soil. Your analysis gives values of 98.6, 98.4, 97.2, 94.6, and

96.2 ppm. Do your results differ from the expected results at the 95%

confidence level ? If you made one more measurement and found 94.5, would

your conclusion change ?

