
STATISTICS AND

EXPERIMENTAL DESIGN
ON RESEARCH

DR. DEWI YUNITA, S.TP., M.RES.

LAST UPDATED: 5 NOVEMBER 2017


Factors
(Experimental Parameters)
 Numerical: continuous factors that can have intermediate settings.
Example: temperature or pressure.

 Categorical: discrete factors with two or more settings.
Example: a cat or a dog.

Where possible, express factors on a numeric scale.
Example: big hammer - small hammer (categorical) → hammers of 400 g or 100 g (numerical).
This would allow for an intermediate hammer (example: one of 250 g as a mid-point).
Replicates

 A replicate is an INDEPENDENT repeat of a particular sample.


 Analysing one extract twice is not true replication.
 Extracting two different samples and then analysing both is replication.

 The more times you see a difference, the more confidence you have that it is real.
 If we do an experiment once and get statistical significance at 5%, there is a 1 in 20 chance of a mistake.
 If we repeat it and get significance at 5% again, the probability that both results are mistakes is 1/20 * 1/20: a 1 in 400 chance.
 A significant result on a 3rd repeat of the experiment makes it 1 in 8000.
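The arithmetic behind these odds can be sketched in a few lines of Python. This is only an illustration, and it assumes that the repeats are truly independent and that each false positive has the same 5% chance; the function name is hypothetical.

```python
def chance_all_false_positives(alpha: float, n_experiments: int) -> float:
    """Chance that n independent experiments are ALL false positives at level alpha."""
    return alpha ** n_experiments


for n in (1, 2, 3):
    p = chance_all_false_positives(0.05, n)
    # Express the probability as "1 in N", as on the slide.
    print(f"{n} significant experiment(s): about 1 in {round(1 / p)}")
```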
Randomisation

 If samples are NOT prepared and analysed in a random order, the results obtained may be TOTAL RUBBISH!!
 If analysing two groups of samples (A and B), DO NOT extract all the A’s then all the B’s. Run them in a random order.
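One possible way to draw up a random run order is sketched below. The sample labels (A1, A2, ...) and the function name are hypothetical; a fixed seed is used only so that the same order can be written in the lab book and reproduced later.

```python
import random


def randomised_run_order(group_sizes, seed=None):
    """Return all sample labels (e.g. 'A1', 'B3') shuffled into a random order."""
    samples = [
        f"{group}{i}"
        for group, size in group_sizes.items()
        for i in range(1, size + 1)
    ]
    rng = random.Random(seed)  # separate generator so the seed is local
    rng.shuffle(samples)
    return samples


# Four samples in group A and four in group B (hypothetical numbers).
order = randomised_run_order({"A": 4, "B": 4}, seed=1)
print(order)
```

Extract and analyse the samples in the printed order rather than group by group.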
Case study: My first error!

8 samples removed from freezer and placed on bench. Samples were from plant material water stressed for 0-8h. I extracted samples starting with the 0h sample, working up to the 8h sample. Extracts analysed by liquid chromatography.

A compound appeared to go up as the duration of the water stress treatment increased. I later discovered that this compound appeared as plant material thawed out. The 8h sample had thawed out for longer than the 0h sample, hence the trend.
THE RESULTS

[Figure: two plots of compound level against stress time (0 to 8 h). Left: what I saw and wasted time on. Right: what I would have seen if I had randomised the samples, i.e. the truth.]
Standard Deviation (SD)
Standard Error (SE)

 The standard deviation is VERY IMPORTANT in the real world (commercial production).
 It defines quality control
 what is in/out of specification
 How variable your product is
 If you are trying to improve your consistency, aim for a lower SD (σ).
 You can use the F-test to look for differences in SD/variance
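As a rough sketch, SD and SE can be computed with the Python standard library as below. It assumes the usual definitions (sample SD with n - 1 in the denominator, SE = SD / √n); the measurements and function names are hypothetical. The variance ratio is only the F statistic itself; judging its significance still needs F tables or a statistics package.

```python
import math
import statistics


def sd_and_se(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)          # sample SD, n - 1 denominator
    se = sd / math.sqrt(len(values))       # SE = SD / sqrt(n)
    return sd, se


def variance_ratio(a, b):
    """F statistic for comparing two variances: larger variance over smaller."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


batch_1 = [10.1, 9.8, 10.3, 9.9]    # hypothetical measurements
batch_2 = [10.9, 9.1, 10.6, 9.2]    # hypothetical, visibly more variable
sd, se = sd_and_se(batch_1)
print(f"SD = {sd:.3f}, SE = {se:.3f}, F = {variance_ratio(batch_1, batch_2):.1f}")
```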
% Coefficient of Variation

 This gives an idea of how variable the data are.
 Question: How big is the standard deviation relative to the mean?

%CV = 100 * SD / Mean

 Acceptable %CV’s depend on what you are working with.
 Lower %CV’s (<5%) are better
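The %CV formula above translates directly into code. The sketch below assumes the sample standard deviation is meant; the function name and data are hypothetical.

```python
import statistics


def percent_cv(values):
    """%CV = 100 * SD / Mean, using the sample standard deviation."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


replicates = [9, 10, 11]  # hypothetical replicate measurements
print(f"%CV = {percent_cv(replicates):.1f}")
```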
Calculating SE in MS EXCEL
ERROR BARS
Regression and Correlation

Regression Analysis:
 Estimation among the research variables
 Modelling

Correlation Coefficient: how strong a relationship there is between data

R2 Value:
 >0.9 excellent
 >0.8 good
 >0.7 reasonable
 >0.6 not much use
 <0.6 there is too much error in the model; it is not much use and its predictive power is poor.
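For readers who want to see where the R2 value comes from, here is a minimal sketch of a least-squares straight-line fit in plain Python. It assumes a simple one-variable linear model; the function names are hypothetical, and the rating function only encodes the rule of thumb listed above.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


def rate_r2(r2):
    """Rule-of-thumb rating taken from the thresholds on the slide."""
    if r2 > 0.9:
        return "excellent"
    if r2 > 0.8:
        return "good"
    if r2 > 0.7:
        return "reasonable"
    return "not much use"


# Hypothetical data lying exactly on y = 1 + 2x.
slope, intercept, r2 = linear_fit_r2([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {slope:.2f}x + {intercept:.2f}, R2 = {r2:.3f} ({rate_r2(r2)})")
```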
Inserting R2-Value and Equation in MS. Word
The t-test

Samples are taken from two populations and the means and standard deviations compared to see if they are different.

e.g. cows fed on diet A or B
4 replicates of each
Cows randomly allocated to stalls where they will be fed
t-test performed to see if means are different; the outcome depends on the standard deviation

[Figure: two pairs of distributions, each marked with the mean for diet A and the mean for diet B.]
Left: standard deviation low, means different, little overlap of populations; probably statistically different.
Right: standard deviation high, means different, big overlap of populations; probably not statistically different.
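To make the role of the standard deviation concrete, the sketch below computes a two-sample t statistic by hand. It assumes the equal-variance (pooled) form of the t-test and compares against the tabulated two-tailed 5% critical value for 6 degrees of freedom; the cow data and function name are hypothetical.

```python
import math
import statistics

# Two-tailed 5% critical value of Student's t for 6 degrees of freedom
# (4 + 4 - 2 replicates), taken from standard t tables.
T_CRIT_DF6 = 2.447


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, na + nb - 2


diet_a = [20.0, 21.0, 19.0, 20.0]  # hypothetical yields, 4 replicates
diet_b = [25.0, 26.0, 24.0, 25.0]  # hypothetical yields, 4 replicates
t, df = pooled_t_statistic(diet_a, diet_b)
print(f"t = {t:.2f}, df = {df}, significant at 5%: {abs(t) > T_CRIT_DF6}")
```

With the same means but a larger spread inside each group, the pooled variance grows, |t| shrinks, and the difference may no longer pass the critical value.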
Paired sample t-test

Sometimes all samples cannot be run at the same time or in the same place, so pairs of cows are given diet A or B. The comparison is then based on the differences within each pair.

P-value < 0.05 → statistically significant
P-value ≥ 0.05 → no statistically significant difference between the groups
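A paired comparison can be sketched as a one-sample t-test on the within-pair differences. The data and function name below are hypothetical, and the critical value is the tabulated two-tailed 5% value for 3 degrees of freedom (4 pairs).

```python
import math
import statistics

# Two-tailed 5% critical value of Student's t for 3 degrees of freedom
# (4 pairs - 1), taken from standard t tables.
T_CRIT_DF3 = 3.182


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) computed from the within-pair differences."""
    diffs = [ai - bi for ai, bi in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # SE of the mean difference
    return statistics.mean(diffs) / se, n - 1


# Hypothetical data: each position is one pair of cows.
diet_a = [10, 12, 14, 16]
diet_b = [9, 10, 13, 14]
t, df = paired_t_statistic(diet_a, diet_b)
print(f"t = {t:.2f}, df = {df}, significant at 5%: {abs(t) > T_CRIT_DF3}")
```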
T-Test in MS. Word

P value: 0.693769 → not significant


The paired sample t-test
experimental design

Are there any other pairs?

[Figure: Farm 1 and Farm 2]

Pairs of cows are fed on different diets at either Farm 1 or Farm 2, each with a pair of cows on a high and a low pasture because of a lack of space.
Outcome

 By carefully designing the experiment we can test the influence of three factors at the same time.
 This can save lots of effort, time and money
 Classical one factor test becomes multi factor….
The design space

We can show this 3 factor design as a cube with each variable on a separate vector or dimension.

[Figure: cube with axes Diet (A or B), Pasture (High or Low) and Farm (1 or 2); corners are labelled H-1, L-1, H-2 and L-2.]

This type of experiment is known as a Factorial Design
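The corners of the cube can be listed programmatically. The sketch below assumes the three factors are diet, pasture and farm, each at two levels, as in the cow example; the dictionary keys are hypothetical names chosen for illustration.

```python
from itertools import product

# Three two-level factors from the cow example (names are illustrative).
factors = {
    "diet": ["A", "B"],
    "pasture": ["High", "Low"],
    "farm": [1, 2],
}

# Every combination of levels is one corner of the cube: 2 x 2 x 2 = 8 runs.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for run in runs:
    print(run)
```

Because every level of each factor appears with every level of the others, the design is balanced.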
Features
 Balanced design
 Can have 2 or more dimensions testing lots of factors
simultaneously
 Determines statistical difference of changing a factor from one
extreme to another (Top and Bottom test)
 No information on changes across design space (in this case
categorical factors, so not possible)
 Can test for interactions
Statistical significance

 Quote probabilities as
P< 0.05 significantly different
P< 0.01 very different
P< 0.001 extremely different
 Or use actual value from stats package
P= 0.024
NOT P = 0.02356 - statistics is not that precise
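Both reporting styles can be produced from the raw value that a statistics package prints. This is a small sketch; the function names are hypothetical, and two significant figures is an assumption based on the P = 0.024 example above.

```python
def format_p(p):
    """Report the actual value, rounded to two significant figures."""
    return f"P = {p:.2g}"


def threshold_label(p):
    """Report the strictest conventional threshold that the value passes."""
    for threshold in (0.001, 0.01, 0.05):
        if p < threshold:
            return f"P < {threshold}"
    return "not significant"


raw = 0.02356  # value as printed by a stats package
print(format_p(raw), "or", threshold_label(raw))
```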
So based on sample data….

 A difference at the 5% level (P≤0.05) means we may be wrong about the samples actually being different 5% of the time or 1 in 20.
 Statistical differences with lower P-values give greater confidence that the differences are real
 P<0.01 → 1 in 100 chance we are misled by the data
 P<0.001 → 1 in 1000 chance we are misled by the data
A Data Set in MS. Excel

Copy → Paste: Transpose


Copy the Data into SPSS
Experimental Design in SPSS: Completely Randomized Design (RAL)
OUTPUT
Experimental Design in SPSS: Randomized Block Design (RAK)
OUTPUT
EXERCISE
 The study used a factorial Randomized Block Design (RAK) with two factors.
 The first factor is the type of stabilizer used (S), with 2 levels: carrageenan (S1) and xanthan gum (S2).
 The second factor is the stabilizer concentration (C), with 4 levels: 0.1 % (C1), 0.2 % (C2), 0.3 % (C3) and 0.4 % (C4).
 The results showed that the interaction between the two factors had a highly significant effect on the liquid separation rate.
[Figure: bar chart of the results for liquid separation rate. Y-axis: liquid separation rate (minutes), 0 to 35. X-axis: type of stabilizer (carrageenan, xanthan gum). Legend: 0.1 %, 0.2 %, 0.3 %, 0.4 %. Data labels on the bars: 30, 30, 26, 25, 20, 20, 16, 15. Callout: Are they significantly different??]
A study used a factorial Randomized Block Design (RAK) with two factors. The first factor is the type of stabilizer used (S), with 2 levels: carrageenan (S1) and xanthan gum (S2). The second factor is the stabilizer concentration (C), with 4 levels: 0.1 % (C1), 0.2 % (C2), 0.3 % (C3) and 0.4 % (C4). The results showed that the interaction between the two factors had a highly significant effect on the liquid separation rate. The data obtained are shown during the lecture.
Question: Are the individual levels significantly different from each other?
Exercise: Present these results in an Excel chart, complete with error bars, data labels, and the significance notation calculated beforehand in SPSS.
NEVER BELIEVE A MODEL
ABSOLUTELY:
“SOME MODELS ARE USEFUL,
MOST ARE DANGEROUS”
(Alan Collins)

“ALL MODELS ARE WRONG”


(Florian Wulfert)
