
Errors, Standard Deviation, Data Analysis

Errors In Chemical Analyses

Mean

Mean, arithmetic mean, and average (x̄) are synonyms for the
quantity obtained by dividing the sum of replicate
measurements by the number of measurements in the set:

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$

where x_i represents the individual values of x making up a set
of N replicate measurements.
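As a minimal sketch of the definition above, the mean can be computed in a few lines of Python. The function name and the replicate values are illustrative assumptions, not data taken from these slides.

```python
def mean(values):
    """Arithmetic mean: sum of the replicates divided by their count N."""
    if not values:
        raise ValueError("mean requires at least one measurement")
    return sum(values) / len(values)

# Hypothetical replicate results (illustrative values only)
replicates = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]
x_bar = mean(replicates)
print(round(x_bar, 2))  # prints 19.78
```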

Median

The median is the middle result when replicate data are
arranged in order of size.

• Equal numbers of results are larger and smaller than the
  median.
• For an odd number of data points, the median can be
  evaluated directly.
• For an even number, the mean of the middle pair is used.

[Figure: Results from replicate determinations (x_m = 19.70)]
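The odd and even rules for the median can be sketched as follows. The function name and both data sets are assumptions made for illustration only.

```python
def median(values):
    """Middle value of the sorted data; mean of the middle pair if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one measurement")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd N: middle result
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: mean of middle pair

# Hypothetical replicate results (illustrative values only)
odd_set = [19.4, 19.5, 19.6, 19.8, 20.1]
even_set = odd_set + [20.3]
print(median(odd_set))             # prints 19.6
print(round(median(even_set), 2))  # prints 19.7
```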
Precision

Precision is the closeness of results to others that have been
obtained in exactly the same way.

- It describes the reproducibility of measurements.
- It is determined by simply repeating the measurement on
  replicate samples.

Three terms are widely used to describe precision: standard
deviation, variance, and coefficient of variation.

All these terms are functions of the deviation from the mean,
d_i, or just the deviation:

$$d_i = |x_i - \bar{x}|$$
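A short sketch of the deviation from the mean, computed for each replicate. The data are hypothetical and the variable names are chosen only for this example.

```python
# Hypothetical replicate results (illustrative values only)
replicates = [19.4, 19.5, 19.6, 19.8, 20.1, 20.3]

x_bar = sum(replicates) / len(replicates)

# d_i = |x_i - x_bar| for every replicate
deviations = [abs(x_i - x_bar) for x_i in replicates]

for x_i, d_i in zip(replicates, deviations):
    print(f"x_i = {x_i:.1f}  d_i = {d_i:.2f}")
```

Tightly grouped results give small deviations, which is what the precision measures listed above summarize.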

Accuracy

Accuracy is the closeness of a measurement to the true or
accepted value and is expressed by the error.

Accuracy measures agreement between a result and its true
value, whereas precision describes the agreement among several
measurements.

We can never determine accuracy exactly because the true
value of a measured quantity can never be known exactly.
We need to use an accepted value instead.

Accuracy is expressed in terms of either absolute or relative
error.

[Figure: Illustration of accuracy and precision]
Absolute Error
The absolute error of a measurement is the
difference between the measured value and the true
value. The absolute error E in the measurement of
a quantity x_i is given by the equation

$$E = x_i - x_t$$

where x_t is the true or accepted value of the quantity.

– A negative sign indicates that the experimental result is
  smaller than the accepted value.
– A positive sign indicates that the result is larger than the
  accepted value.
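The sign convention can be illustrated with a small sketch. The accepted value and the two results are hypothetical, and `absolute_error` is a name introduced only for this example.

```python
def absolute_error(measured, accepted):
    """E = x_i - x_t; the sign shows whether the result is low or high."""
    return measured - accepted

accepted_value = 20.00  # hypothetical accepted value
for result in (19.80, 20.10):  # hypothetical experimental results
    error = absolute_error(result, accepted_value)
    direction = "low" if error < 0 else "high" if error > 0 else "exact"
    print(f"x_i = {result:.2f}  E = {error:+.2f}  ({direction})")
```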
Relative Error

Relative error of a measurement is the absolute
error divided by the true value.

The percent relative error is given by the
expression

$$E_r = \frac{x_i - x_t}{x_t} \times 100\%$$

Relative error is also expressed in parts per thousand (ppt),
in which case the factor is 1000 ppt instead of 100%.
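A minimal sketch of the relative error in both percent and parts per thousand. The values of x_i and x_t are hypothetical, and the `scale` parameter is an assumption introduced to switch between the two units.

```python
def relative_error(measured, accepted, scale=100.0):
    """(x_i - x_t) / x_t times a scale: 100 for percent, 1000 for ppt."""
    if accepted == 0:
        raise ValueError("accepted value must be non-zero")
    return (measured - accepted) / accepted * scale

x_i, x_t = 19.80, 20.00  # hypothetical result and accepted value
percent = relative_error(x_i, x_t)
ppt = relative_error(x_i, x_t, scale=1000.0)
print(f"{percent:.1f} %")  # prints -1.0 %
print(f"{ppt:.0f} ppt")    # prints -10 ppt
```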

[Figure: Absolute error in nitrogen determination]
Random or Indeterminate Errors
Random errors affect the precision of a
measurement. This type of error causes data to be
scattered more or less symmetrically around a
mean value. Random error in a measurement is
reflected by its precision.

Systematic or Determinate Errors
Systematic errors affect the accuracy of a result. This
type of error causes the mean of a set of data to
differ from the accepted value.

Gross Errors
Gross errors usually occur only occasionally, are often large,
and may cause a result to be either high or low. Gross errors
lead to outliers: results that differ significantly from the
rest of the results.

Bias Errors
Bias measures the systematic error associated with an
analysis. It has a negative sign if it causes the result to be low
and a positive sign otherwise.

Bias has a definite value and an assignable cause, and is about
the same magnitude for replicate measurements. Bias affects
all the data in a set in the same way.
How do Systematic Errors Arise?

There are three types of systematic errors:

1. Instrumental errors are caused by imperfections in
   measuring devices and instabilities in their components.
2. Method errors arise from non-ideal chemical or physical
   behavior of analytical systems.
3. Personal errors result from the carelessness, inattention,
   or personal limitations of the experimenter.

Effects of Systematic Errors

i) Constant Errors: Constant errors do not
depend on the size of the quantity measured.

ii) Proportional Errors: Proportional errors
decrease or increase in proportion to the size of the
sample taken for analysis.

A common cause of proportional errors is the presence of
interfering contaminants in the sample.
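The difference between the two kinds of systematic error shows up in the relative error as the sample size changes. The sketch below uses made-up numbers (a fixed 0.5 mg loss and a loss of 0.1 % of the sample) purely for illustration.

```python
# Hypothetical numbers chosen only to illustrate the two behaviours.
CONSTANT_ERROR_MG = -0.5        # same absolute loss for every sample
PROPORTIONAL_FRACTION = -0.001  # loss equal to 0.1 % of the sample size

def relative_error_percent(error, true_value):
    """Percent relative error: absolute error divided by the true value."""
    return error / true_value * 100.0

for true_mass_mg in (50.0, 200.0, 500.0):
    constant_rel = relative_error_percent(CONSTANT_ERROR_MG, true_mass_mg)
    proportional_error = PROPORTIONAL_FRACTION * true_mass_mg
    proportional_rel = relative_error_percent(proportional_error, true_mass_mg)
    print(f"{true_mass_mg:6.1f} mg  constant: {constant_rel:+.2f} %  "
          f"proportional: {proportional_rel:+.2f} %")
```

With these assumed numbers, the relative effect of the constant error shrinks from 1 % to 0.1 % as the sample grows, while the proportional error stays at 0.1 % throughout.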

Detecting Systematic Errors

Systematic instrument errors are usually corrected
by calibration. Periodic calibration of equipment is
always desirable.

The bias of an analytical method (method error) is
estimated by analyzing standard reference materials.

Personal errors can be minimized by care and
self-discipline.

Standard reference materials are substances sold or
certified by the National Institute of Standards and
Technology (NIST) to contain specified concentrations of
one or more analytes.

The concentration of the components in these materials
has been determined in one of three ways:

1. By analysis with a previously validated reference
   method.
2. By analysis by two or more independent, reliable
   measurement methods.
3. By analysis by a network of cooperating laboratories.
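One simple way to estimate method bias from a standard reference material is to compare the mean of replicate analyses with the certified value. The sketch below assumes that approach; the certified concentration, the replicate results, and the function name are all hypothetical.

```python
from statistics import mean

def estimate_bias(results, certified_value):
    """Bias estimate: mean of the replicate results minus the certified value."""
    return mean(results) - certified_value

certified = 5.00                        # hypothetical certified concentration
srm_results = [4.91, 4.95, 4.93, 4.94]  # hypothetical replicate analyses
bias = estimate_bias(srm_results, certified)
print(f"estimated bias = {bias:+.4f}")
```

A negative estimate means the method reads low on this material, matching the sign convention for bias given earlier.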

Blank Determinations
Blank determinations are useful for detecting certain types
of constant errors.

In a blank determination, all steps of the analysis are
performed in the absence of a sample.

A blank solution contains the solvent and all the reagents
in an analysis but none of the sample. The results from the
blank are then applied as a correction to the sample
measurements.

Blank determinations reveal errors due to interfering
contaminants from the reagents and vessels employed in
the analysis.
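Applying the blank as a correction can be sketched as subtracting the mean blank reading from each sample reading. This is one common way to apply it, assumed here for illustration; the readings and the function name are hypothetical.

```python
from statistics import mean

def blank_corrected(sample_readings, blank_readings):
    """Subtract the mean blank reading from every sample reading."""
    blank = mean(blank_readings)
    return [reading - blank for reading in sample_readings]

blank_readings = [0.012, 0.010, 0.011]   # hypothetical blank signals
sample_readings = [0.352, 0.348, 0.355]  # hypothetical sample signals
corrected = blank_corrected(sample_readings, blank_readings)
print([round(value, 3) for value in corrected])  # prints [0.341, 0.337, 0.344]
```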
Variance

Variance is the average squared deviation from the mean of a
set of data. It is used to find the standard deviation.

Variance
1. Find the mean of the data.
   Hint: the mean is the average, so add up the
   values and divide by the number of items.
2. Subtract the mean from each value; the
   result is called the deviation from the mean.
3. Square each deviation from the mean.
4. Find the sum of the squares.
5. Divide the total by the number of items.
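The five steps above can be followed line by line in a short sketch. The data set and function name are assumptions for illustration; the result is cross-checked against `statistics.pvariance` from the Python standard library, which also divides by the number of items.

```python
from statistics import pvariance

def population_variance(values):
    """Variance following the five steps listed above (divide by n)."""
    n = len(values)
    if n == 0:
        raise ValueError("variance requires at least one value")
    mean = sum(values) / n                        # step 1: mean
    deviations = [x - mean for x in values]       # step 2: deviations
    squared = [d ** 2 for d in deviations]        # step 3: squares
    total = sum(squared)                          # step 4: sum of squares
    return total / n                              # step 5: divide by n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
print(population_variance(data))  # prints 4.0
```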

Variance Formula

The variance formula includes the sigma notation, Σ, which
represents the sum of all the items to the right of the sigma.

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$$

The mean is represented by μ and n is the
number of items.
Standard Deviation
Standard deviation shows the variation in data.
If the data are close together, the standard deviation will be
small. If the data are spread out, the standard deviation will
be large.

Standard deviation is often denoted by the lowercase Greek
letter sigma, σ. It measures the dispersion of data.

The greater the value of the standard deviation, the
further the data tend to be dispersed from the mean.

The bell curve, which represents a normal
distribution of data, shows what standard deviation
represents.

• One standard deviation away from the mean (μ) in either
  direction on the horizontal axis accounts for around 68
  percent of the data.
• Two standard deviations away from the mean account for
  roughly 95 percent of the data.
• Three standard deviations account for about 99.7 percent of
  the data.
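These percentages can be checked numerically for an ideal normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `fraction_within` is an assumption made for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of a normal population lying within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(k) * 100:.1f} %")
```

For an ideal normal distribution this gives about 68.3 %, 95.4 %, and 99.7 %; real data sets only approximate these figures.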
Standard Deviation
Find the variance.
a) Find the mean of the data.
b) Subtract the mean from each value.
c) Square each deviation from the mean.
d) Find the sum of the squares.
e) Divide the total by the number of items.
f) Take the square root of the variance.

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$
Standard Deviation
Population Standard Deviation:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

Sample Standard Deviation:

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Correlation Coefficient (r)

A statistic showing the degree of relation between two
variables.

• The sign of r denotes the nature of the association.
• The magnitude of r denotes the strength of the
  association.

• If the sign is positive, the relation is direct: an
  increase in one variable is associated with an increase
  in the other, and a decrease in one variable is
  associated with a decrease in the other.

• If the sign is negative, the relation is inverse or
  indirect: an increase in one variable is associated
  with a decrease in the other.

• The value of r ranges between -1 and +1.
• The value of r denotes the strength of the association, as
  summarized below.

  r = -1            perfect indirect correlation
  -1 to -0.75       strong indirect
  -0.75 to -0.25    intermediate indirect
  -0.25 to 0        weak indirect
  r = 0             no relation
  0 to 0.25         weak direct
  0.25 to 0.75      intermediate direct
  0.75 to 1         strong direct
  r = +1            perfect direct correlation
Computation of Correlation Coefficient (r)

Covariance

$$\mathrm{Cov}(x, y) = \frac{1}{n} \sum xy - \bar{x}\,\bar{y}$$

Correlation Coefficient

$$r = \frac{\frac{1}{n} \sum xy - \bar{x}\,\bar{y}}{\sigma_x \sigma_y}$$
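Example:

Pupil           A   B   C   D   E   F   G   H   I   J
Marks X (/30)   20  23  8   29  14  11  11  20  17  17
Marks Y (/40)   30  35  21  33  33  26  22  31  33  36

Find the correlation coefficient.

$$\mathrm{Cov}(x, y) = \frac{1}{n} \sum xy - \bar{x}\,\bar{y} = \frac{1}{10}(5313) - 17 \times 30 = 21.3$$

$$r = \frac{\frac{1}{n} \sum xy - \bar{x}\,\bar{y}}{\sigma_x \sigma_y} = \frac{\frac{1}{10}(5313) - 17 \times 30}{(6)(5)} = \frac{21.3}{30} = 0.71$$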
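The worked example can be reproduced with a short sketch that follows the same formulas: covariance as the mean of the products minus the product of the means, divided by the two population standard deviations. The marks are those in the table; the function names are assumptions made for this example.

```python
import math

marks_x = [20, 23, 8, 29, 14, 11, 11, 20, 17, 17]   # marks out of 30
marks_y = [30, 35, 21, 33, 33, 26, 22, 31, 33, 36]  # marks out of 40

def population_sd(values):
    """Population standard deviation (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)

def covariance(xs, ys):
    """Cov(x, y) = (1/n) * sum(x*y) - mean(x) * mean(y)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy / n - (sum(xs) / n) * (sum(ys) / n)

def correlation(xs, ys):
    """r = Cov(x, y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / (population_sd(xs) * population_sd(ys))

print(round(covariance(marks_x, marks_y), 1))   # prints 21.3
print(round(correlation(marks_x, marks_y), 2))  # prints 0.71
```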