Parametric and Non Parametric Tests

ANALYSIS OF DATA
Manoj Kumar Y Department of Pharmaceutics 1701128860111
April 5, 2013
Contents
1. Brief introduction on sampling
   I. Methods of sampling
2. Probability distribution
3. Terms used
4. Parametric and Non Parametric methods
5. Parametric methods
   I. t test
   II. Chi-square (χ²) test
   III. F test
6. Nonparametric methods
   I. Types of data
   II. Methods
   III. Sign test
   IV. Wilcoxon signed rank test
   V. Wilcoxon rank sum test
   VI. Kruskal Wallis test (one way ANOVA)
   VII. Friedman test (two way ANOVA)
   VIII. Analysis of covariance
   IX. Run test for randomness
   X. Contingency tables
7. References
Brief introduction on sampling.

There is no data until samples are selected, so sampling is the first step. A sample is a portion that represents the whole lot. Samples include raw materials, intermediate products, finished products, packed products, etc.
Sampling plan
After the sample is selected, a plan should be made for approving the lot. The plan may use single, double, or multiple sampling.
Single: the lot is approved by examining a single sample.
Double: a second sample is examined when the first does not meet the requirements.
Multiple: further samples are examined until the preset requirements are satisfied, e.g. the Q limits of the dissolution test.
Methods of sampling
Random sampling
   Random sampling (each sample has an equal opportunity): random numbers, lottery method
   Stratified sampling
   Systematic sampling
Non random sampling
   Judgement, purposive, or deliberate
   Quota
   Convenience
Cluster sampling
Probability distribution
Mathematical determination of frequency distribution.
Types of probability distribution

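1. Binomial distribution: for two mutually exclusive results with probabilities p and q (q = 1 - p), the probability of r successes in n trials is given by the formula
$$P(r) = \binom{n}{r} p^{r} q^{n-r}$$

2. Poisson distribution: describes the probability of occurrence of rare events. It is given by
$$P(r) = \frac{e^{-m} m^{r}}{r!}$$
where m is the mean number of occurrences; values of e^{-m} are tabulated.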
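3. Normal distribution: for continuous variables. It is given by
$$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(X-\mu)^2}{2\sigma^2}\right)$$
and it uses the Z transformation
$$Z = \frac{X-\mu}{\sigma}$$
where X is the random variable, \mu is the mean, and \sigma is the standard deviation.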
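As a minimal sketch, the three distributions above can be evaluated with Python's standard library. The function names are illustrative and not from the source; m denotes the Poisson mean, and the example arguments are arbitrary.

```python
import math

def binomial_pmf(r, n, p):
    """Probability of r successes in n trials with success probability p."""
    q = 1.0 - p
    return math.comb(n, r) * p**r * q**(n - r)

def poisson_pmf(r, m):
    """Probability of r occurrences of a rare event whose mean count is m."""
    return math.exp(-m) * m**r / math.factorial(r)

def normal_pdf(x, mu, sigma):
    """Normal density at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma  # Z transformation
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Arbitrary illustrative arguments (not taken from the source).
print(binomial_pmf(2, 4, 0.5))   # 6 * 0.5**4 = 0.375
print(poisson_pmf(0, 2.0))       # equals e**-2
print(normal_pdf(0.0, 0.0, 1.0)) # height of the standard normal curve at its mean
```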
Terms used
Significance: the existence of a real difference between two compared samples or methods.
Level of significance: the probability of making a type 1 error, i.e. rejecting the null hypothesis when it is true. Common levels are 1%, 5%, and 10%; the 5% level (p = 0.05) is used most often.
Ranking, average ranking, geometric mean of ranks: ranks are assigned to the data in ascending or descending order. Tied values are each given the average of the ranks they occupy. The geometric mean of n values is given as
$$GM = (x_1 x_2 \cdots x_n)^{1/n}$$
Hypothesis: null hypothesis H0 and alternative hypothesis H1.
Parametric methods and non parametric methods

Parametric methods
Parametric methods consider the parameters of the probability distribution.
   t test
   Chi-square test
   F test

Non Parametric methods
No specific distribution parameters are considered; the data need to come from a continuous distribution.
   Sign test
   Wilcoxon signed rank test
   Wilcoxon rank sum test
   Kruskal Wallis test
   Friedman test
   Analysis of covariance
   Run test
   Contingency tables
Parametric methods
t - Test
A test of significance assesses whether an observed difference is significant.
The t test differs for small and large sample sizes.

For large samples, the standard error of the difference is obtained as
$$SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
where s_1^2 and s_2^2 are the variances and n_1 and n_2 are the numbers of samples. The difference in means is tested by
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{SE}$$
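Smaller samples
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$s = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$$
\bar{x}_1 and \bar{x}_2 are means, and the degrees of freedom are n_1 + n_2 - 2.
Paired samples
$$t = \frac{\bar{d}}{s_d/\sqrt{n}}$$
where \bar{d} is the mean of the paired differences, s_d is their standard deviation, and n is the number of pairs.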
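The three forms of the test can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and all numbers are hypothetical, the small-sample form uses the usual pooled standard deviation, and the paired form uses the standard deviation of the differences.

```python
import math
import statistics

def large_sample_z(mean1, var1, n1, mean2, var2, n2):
    """Z for large samples: difference in means over its standard error."""
    se = math.sqrt(var1 / n1 + var2 / n2)
    return (mean1 - mean2) / se

def pooled_t(x, y):
    """t for two small independent samples, using the pooled standard deviation."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * statistics.variance(x)
                  + (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    t = (statistics.mean(x) - statistics.mean(y)) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # t value and degrees of freedom

def paired_t(a, b):
    """t for paired samples, based on the differences b - a."""
    d = [bi - ai for ai, bi in zip(a, b)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1  # t value and degrees of freedom

# Hypothetical numbers, chosen only to exercise the formulas.
print(large_sample_z(10.0, 4.0, 100, 9.0, 5.0, 100))
print(pooled_t([1, 2, 3], [2, 3, 4]))
print(paired_t([1, 2, 3], [2, 4, 6]))
```

The calculated value is then compared with the tabulated t (or Z) value at the chosen level of significance.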
χ² (chi-square) test
This test describes the magnitude of the difference between the observed frequency and the expected frequency.
$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where f_o is the observed frequency and f_e is the expected frequency. The calculated value is compared with the tabulated value to show significance.
Degrees of freedom: given by (c-1), or by (c-1)(r-1) for a table of r rows and c columns. An increase in the degrees of freedom increases the symmetry of the curve.
Applications
Check goodness of fit.
Check independence of variables by contingency tables.
Test for homogeneity.
Detection of linkage.
F - test
This test checks the equality of the variances of two populations. It is simply given as
$$F = \frac{S_1^2}{S_2^2}$$
where S_1^2 and S_2^2 are the variances; if they are not given directly in the data, they can be obtained by squaring the standard deviations.
With reference to ANOVA it is given as
$$F = \frac{\text{mean square between groups}}{\text{mean square within groups}}$$
It is the ratio of the variability between the groups to the variability within the groups.
Non Parametric Methods

Distribution-free statistics, used when the type of distribution is unknown. The data need to be continuous and independent.
Types of Data
1. Nominal/categorical: the data are categorized into a particular group, e.g. colours such as white, black, yellow, red.
2. Ordinal: the data are ranked, e.g. assigning ranks 1, 2, 3 in order.
3. Interval/ratio: the data are measured on a scale with equal intervals, e.g. 1-5, 6-10.
Methods
1. Sign test
2. Wilcoxon signed rank test
3. Wilcoxon rank sum test
4. Kruskal Wallis test (one way ANOVA)
5. Friedman test (two way ANOVA)
6. Analysis of covariance
7. Contingency tables
8. Run test
Sign test
It is the simplest of the nonparametric tests and is the counterpart of the one-sample t test among the parametric tests.
Ex: data from peak plasma concentration

           Time of peak plasma conc. (hr)
Subject    A        B        Difference (B - A)
1          2.5      3.5      +1
2          3.0      4.0      +1
3          1.25     2.5      +1.25
4          1.75     2.0      +0.25
5          3.5      3.5      0
6          2.5      4.0      +1.5
7          1.75     1.5      -0.25
8          2.25     2.5      +0.25
9          3.5      3.0      -0.5
10         2.5      3.0      +0.5
11         2.0      3.5      +1.5
12         3.5      4.0      +0.5

Steps
1. Calculate the difference of the measurements for each matched pair.
2. Remove ties, if any.
3. Count the number of times one treatment has the higher value, i.e. the number of positive or negative differences.
4. Compare with the tabulated values.
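The counting steps can be sketched in Python for the peak plasma concentration data. As an assumption not found in the source, the table lookup in the last step is replaced here by an exact two-sided binomial probability with p = 0.5; the function name is illustrative.

```python
import math

# Time of peak plasma concentration (hr) for treatments A and B, subjects 1 to 12.
a = [2.5, 3.0, 1.25, 1.75, 3.5, 2.5, 1.75, 2.25, 3.5, 2.5, 2.0, 3.5]
b = [3.5, 4.0, 2.5, 2.0, 3.5, 4.0, 1.5, 2.5, 3.0, 3.0, 3.5, 4.0]

def sign_test(a, b):
    """Count positive and negative differences (B - A) after dropping ties."""
    diffs = [bi - ai for ai, bi in zip(a, b)]
    diffs = [d for d in diffs if d != 0]          # step 2: remove ties
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)    # step 3: count signs
    negatives = n - positives
    # Assumed substitute for the table lookup: exact two-sided binomial probability.
    k = min(positives, negatives)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    p_value = min(1.0, 2 * tail)
    return positives, negatives, p_value

positives, negatives, p_value = sign_test(a, b)
print(positives, negatives, round(p_value, 4))  # 9 2 0.0654
```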
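For more than 20 samples, the value of Z can be obtained by the normal approximation
$$Z = \frac{|p - 0.5| - \frac{1}{2N}}{\sqrt{\frac{0.5 \times 0.5}{N}}}$$
where p is the observed proportion of positive (or negative) differences, N is the number of samples, and 1/(2N) is a continuity correction.
It can be simplified as
$$Z = \frac{|N_{+} - N_{-}| - 1}{\sqrt{N}}$$
where N_{+} and N_{-} are the numbers of positive and negative differences.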
Wilcoxon signed rank test

This test is more sensitive than the sign test because it uses the magnitude of the differences as well as their sign.
Steps
1. Calculate the differences.
2. Assign ranks disregarding sign.
3. Give average ranks to equal values.
4. Reassign the signs to the assigned ranks.
5. Sum the ranks with like sign.

Value    Rank    Assigned rank    Rank with sign
-0.25    1       2                -2
0.25     2       2                2
0.25     3       2                2
-0.5     4       5                -5
0.5      5       5                5
0.5      6       5                5
1.0      7       7.5              7.5
1.0      8       7.5              7.5
1.25     9       9                9
1.5      10      10.5             10.5
1.5      11      10.5             10.5
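The ranking steps can be sketched in Python using the same peak plasma concentration data as in the sign test. The helper names are illustrative, not from the source; the zero difference is dropped before ranking, as in the table of ranks.

```python
# Time of peak plasma concentration (hr) for treatments A and B, subjects 1 to 12.
a = [2.5, 3.0, 1.25, 1.75, 3.5, 2.5, 1.75, 2.25, 3.5, 2.5, 2.0, 3.5]
b = [3.5, 4.0, 2.5, 2.0, 3.5, 4.0, 1.5, 2.5, 3.0, 3.0, 3.5, 4.0]

def average_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks

def signed_rank_sums(a, b):
    """Return the sums of ranks for positive and for negative differences (B - A)."""
    diffs = [bi - ai for ai, bi in zip(a, b) if bi != ai]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])        # rank disregarding sign
    positive = sum(r for d, r in zip(diffs, ranks) if d > 0)
    negative = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return positive, negative

positive_sum, negative_sum = signed_rank_sums(a, b)
print(positive_sum, negative_sum)  # 59.0 7.0
```

The smaller of the two sums (here the sum for negative differences) is the one compared with the tabulated values.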
Compare the smaller rank sum with the tabulated values.
For a sample size larger than 20, it is calculated by the normal approximation as
$$Z = \frac{\left|R - \frac{N(N+1)}{4}\right|}{\sqrt{\frac{N(N+1)(2N+1)}{24}}}$$
where R is the sum of ranks (larger or smaller) and N is the number of samples.
For bioavailability studies confidence intervals are used.
Wilcoxon Rank sum test

This is also known as the Mann Whitney U test. It is for independent samples.
Steps
1. Data are pooled and ranked irrespective of group.
2. Average ranks are given to equal values.
3. Ranks of one group are summed up.
4. The sum is compared with the significance limits.

Original apparatus             Modified apparatus
Amount dissolved    Rank       Amount dissolved    Rank
53                  3          58                  11
61                  14         55                  5.5
57                  9          67                  21
50                  1          62                  15.5
63                  17         55                  5.5
62                  15.5       64                  18.5
54                  4          66                  20
52                  2          59                  12.5
59                  12.5       68                  22
57                  9          57                  9
64                  18.5       69                  23
                               56                  7

If ties are included, a correction factor is used.
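For samples larger than 10, the normal approximation is used with the formula
$$Z = \frac{\left|T - \frac{N_1(N_1+N_2+1)}{2}\right|}{\sqrt{\frac{N_1 N_2 (N_1+N_2+1)}{12}}}$$
where T is the sum of ranks for the smaller sample, N_1 is the smaller sample size, and N_2 is the larger sample size.
For the above data Z is obtained as 1.63, which is below the critical value of 1.96 and is therefore not significant at the 5% level (from the significance table).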
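The Z value of 1.63 for the dissolution data can be reproduced with a short Python sketch. The helper names are illustrative, and the calculation assumes the normal approximation without a correction for ties, which is what yields the quoted value.

```python
import math

# Amount dissolved with the original and modified apparatus.
original = [53, 61, 57, 50, 63, 62, 54, 52, 59, 57, 64]
modified = [58, 55, 67, 62, 55, 64, 66, 59, 68, 57, 69, 56]

def average_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks

def rank_sum_z(small, large):
    """Rank sum T of the smaller sample and its normal-approximation Z (no tie correction)."""
    n1, n2 = len(small), len(large)
    ranks = average_ranks(list(small) + list(large))  # pool and rank irrespective of group
    t = sum(ranks[:n1])                               # ranks of the smaller sample
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return t, abs(t - expected) / sd

t, z = rank_sum_z(original, modified)
print(t, round(z, 2))  # 105.5 1.63
```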
Kruskal Wallis Test ( one way ANOVA)

This is an extension of the rank sum test.
Steps
1. Ranks are given to the pooled data as in the Wilcoxon rank sum test.
2. Average ranks are assigned to ties and the data are regrouped.
3. Ranks in each group are summed.
Example: effect of sedative on rats (time in min)

Control    Rank     Low dose    Rank     High dose    Rank
8          22       10          26       3            10
1          3.5      5           13       4            12
9          24.5     8           22       8            22
9          24.5     6           15       1            3.5
6          15       7           18.5     1            3.5
3          10       7           18.5     1            3.5
15         28       15          28       1            3.5
1          3.5      1           3.5      6            15
7          18.5     15          28       2            7.5
                    7           18.5     2            7.5
This test statistic can be approximated by the chi-square distribution:
$$\chi^2 = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$
where k-1 is the degrees of freedom, k is the number of groups, N is the total number of observations, R_i is the sum of ranks of the i-th group, and n_i is the number of observations in the i-th group.
The chi-square value obtained for the above data shows significance (p = 0.05).
ANOVA One way.

ANOVA: Analysis of Variance. In the one way model the influence of one factor is studied. The variation is split into variation between the samples and variation within the samples. Rows show the replicates; columns show the different treatments.
Example:

             Different treatments
Replicate    Sample 1    Sample 2    Sample 3
1            20          19          13
2            10          13          12
3            17          17          10
4            17          12          15
5            16          9           5
Steps involved in ANOVA one way
1. Find the sum of the squares of all observations and denote it as A.
2. Square the sum of each column, divide by the number of observations in the column, add the results, and denote the total as B.
3. Find the correction factor, which is the square of the sum of all the observations divided by the total number of observations. Denote it as D.
4. Calculate the F value and compare it with the tabulated value.

Source of variation            Degree of freedom    Sum of squares    Mean of squares
Between treatments (column)    c-1 (n1)             B-D               (B-D)/(c-1)
Residual                       c(r-1) (n2)          A-B               (A-B)/(c(r-1))
Total                          cr-1                 A-D
The calculated F value is 2.16. The tabulated F value for (2, 12) degrees of freedom is 3.89. Because the calculated F value is below the tabulated value, there is no significant difference at the 5% level (p = 0.05).
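The F value of 2.16 can be reproduced from the example data with a short Python sketch that follows the A, B, D steps of the one way model. The function name is illustrative and not from the source.

```python
# Columns are the three samples (treatments); each holds five replicates.
samples = [
    [20, 10, 17, 17, 16],
    [19, 13, 17, 12, 9],
    [13, 12, 10, 15, 5],
]

def one_way_anova(columns):
    """Return F and its degrees of freedom using the A, B, D quantities."""
    c = len(columns)
    n_total = sum(len(col) for col in columns)
    a = sum(x * x for col in columns for x in col)          # A: sum of squared observations
    b = sum(sum(col) ** 2 / len(col) for col in columns)    # B: squared column sums / column size
    d = sum(sum(col) for col in columns) ** 2 / n_total     # D: correction factor
    df_between = c - 1
    df_residual = n_total - c                               # equals c(r-1) for equal columns
    ms_between = (b - d) / df_between
    ms_residual = (a - b) / df_residual
    return ms_between / ms_residual, df_between, df_residual

f_value, df1, df2 = one_way_anova(samples)
print(round(f_value, 2), df1, df2)  # 2.16 2 12
```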
Friedman Test (two way ANOVA)

Data are arranged as in the two way ANOVA model.
Steps
1. Unlike the rank sum test, ranks are assigned within each group (row) separately.
2. If there are ties, average ranks are given.
3. The test for significance is obtained from the chi-square distribution.

Values are shown with their within-row ranks in parentheses.

                      Tablet press
Tablet formulation    A         B         C         D
1                     7.5(4)    637(1)    7.3(3)    7.0(2)
2                     8.2(3)    8.0(2)    8.5(4)    7.9(1)
3                     7.3(1)    7.9(3)    8.0(4)    7.6(2)
4                     6.6(3)    6.5(2)    7.1(4)    6.7(1)
5                     7.5(3)    6.8(2)    7.6(4)    6.7(1)
Ri                    14        10        19        7
The chi-square statistic is given as
$$\chi^2 = \frac{12}{rc(c+1)} \sum_{i=1}^{c} R_i^2 - 3r(c+1)$$
where c is the number of columns, c-1 is the degrees of freedom, r is the number of rows, and R_i is the sum of ranks in the i-th group (column).
The value of chi-square for the above data is 9.72. This exceeds the tabulated value at the 5% level of significance (7.81 for 3 degrees of freedom), so the tablet presses show significant differences.
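The chi-square value of 9.72 can be checked with a short Python sketch that starts from the within-row ranks of the tablet press data (the figures in parentheses in the table). The function name is illustrative and not from the source.

```python
# Within-row ranks for presses A, B, C, D; one row per tablet formulation.
ranks = [
    [4, 1, 3, 2],
    [3, 2, 4, 1],
    [1, 3, 4, 2],
    [3, 2, 4, 1],
    [3, 2, 4, 1],
]

def friedman_chi_square(rank_rows):
    """Return the column rank sums Ri and the Friedman chi-square statistic."""
    r = len(rank_rows)       # number of rows (formulations)
    c = len(rank_rows[0])    # number of columns (presses)
    rank_sums = [sum(row[j] for row in rank_rows) for j in range(c)]
    chi2 = 12 / (r * c * (c + 1)) * sum(s * s for s in rank_sums) - 3 * r * (c + 1)
    return rank_sums, chi2

rank_sums, chi2 = friedman_chi_square(ranks)
print(rank_sums, round(chi2, 2))  # [14, 10, 19, 7] 9.72
```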
ANOVA Two way.

This method can handle two variables or influencing factors.
Example
Yields of three varieties of rice at four nitrogen rates were recorded.

                                       Different treatments
Nitrogen rate (kg/ha) (replicates)     v1       v2       v3       Total
0                                      4.50     5.01     6.11     15.62
30                                     4.30     6.17     6.92     17.39
60                                     5.60     6.37     7.27     19.24
90                                     5.21     6.84     7.86     19.91
Total                                  19.61    24.39    28.16    72.16
Steps involved in ANOVA two way
1. Find the sum of the squares of all observations and denote it as A.
2. Square the sum of each column, divide by the number of observations in the column, add the results, and denote the total as B.
3. Square the sum of each row, divide by the number of observations in the row, add the results, and denote the total as C.
4. Find the correction factor, which is the square of the sum of all the observations divided by the total number of observations. Denote it as D.
5. Calculate the F values and compare them with the tabulated values.

Source of variation            Degree of freedom    Sum of squares          Mean of squares
Between treatments (column)    c-1 (n1)             B-D                     (B-D)/(c-1)
Between replicates (rows)      r-1 (n1)             C-D                     (C-D)/(r-1)
Residual                       (c-1)(r-1) (n2)      (A-D)-[(B-D)+(C-D)]     {(A-D)-[(B-D)+(C-D)]}/((c-1)(r-1))
Total                          cr-1                 A-D

The calculated F values for the given data are 25.34 between varieties and 14.38 between nitrogen rates. The F value for the treatments, i.e. the different varieties, is significant; for nitrogen fertilizer it is below the tabulated value and is not significant.
Analysis of covariance
This method was proposed by Quade.
Steps involved
1. Rank X and Y irrespective of their treatments.
2. Correct the ranks so that the mean of the ranks is 0.
3. Perform a regression on all the data to obtain the predicted values.
4. Calculate the residuals (observed - predicted), i.e. RY - predicted.

                                                 Ranks
Method      Raw material X    Final assay Y     RY       RX       Predicted    Residual
Method 1    98.40             98.0              2.50     -3.0     1.445        1.0548
            98.6              97.80             1.50     -1.0     0.481        1.0182
            98.6              98.5              3.50     -1.0     0.481        3.0182
            99.2              97.4              -0.5     2.50     -1.204       0.7042
            Sum                                                                5.7957
Method 2    98.7              97.6              0.5      0.5      -0.240       0.7408
            99.0              95.4              -3.5     1.5      -0.722       -2.7774
            99.3              96.1              -2.0     -3.0     1.445        -3.4451
            Sum                                                                -5.7957
Run test for randomness

A run is a series of uninterrupted like observations.
Ex: 6 tablets with weight > 200 mg, then 5 tablets with weight < 200 mg, then 4 tablets with weight > 200 mg, then 5 tablets with weight < 200 mg, etc.
In this method the number of runs is compared with the tabulated values; the upper limits are checked for measuring significance.
If the sample size is greater than 40, significance is calculated by the normal approximation using the formula
$$Z = \frac{r - \mu_r}{\sigma_r}, \qquad \mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_r^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$
where r is the number of runs, and n_1 and n_2 are the numbers of observations of each kind.
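Counting runs can be sketched in Python as follows. The sequence below encodes only the first four runs of the tablet weight example ("H" for > 200 mg, "L" for < 200 mg), which is shorter than the sample size at which the normal approximation is recommended, so the Z value is purely illustrative. The Z calculation assumes the general normal approximation for the number of runs with n1 and n2 observations of each kind; the function names are hypothetical.

```python
import math

def count_runs(sequence):
    """Number of runs: one plus the number of changes between neighbours."""
    if not sequence:
        return 0
    return 1 + sum(1 for x, y in zip(sequence, sequence[1:]) if x != y)

def runs_z(sequence, first_kind):
    """Normal-approximation Z for the number of runs in a two-valued sequence."""
    n1 = sum(1 for item in sequence if item == first_kind)
    n2 = len(sequence) - n1
    r = count_runs(sequence)
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (r - mean) / math.sqrt(variance)

# First four runs of the example: 6 heavy, 5 light, 4 heavy, 5 light tablets.
weights = "H" * 6 + "L" * 5 + "H" * 4 + "L" * 5
print(count_runs(weights))  # 4
print(runs_z(weights, "H"))
```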
Contingency tables
This is used for categorical data, which cannot be analyzed by ranking methods. R x C denotes rows x columns. The relationship between rows and columns in a contingency table is tested by the chi-square method with (R-1)(C-1) degrees of freedom.
Expected values are obtained by multiplying the row sum by the column sum and dividing by the grand total.
Data
                     Treatment A    Treatment B    Total
Very severe          13             19             32
Moderately severe    24             20             44
Mildly severe        18             12             30
Total                55             51             106

Expected values
                     Treatment A    Treatment B    Total
Very severe          16.60          15.40          32
Moderately severe    22.83          21.17          44
Mildly severe        15.57          14.43          30
Total                55             51             106
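The expected values in the table can be reproduced, and the chi-square statistic computed, with a short Python sketch. The function name is illustrative and not from the source.

```python
# Observed counts: rows are very, moderately, and mildly severe;
# columns are Treatment A and Treatment B.
observed = [
    [13, 19],
    [24, 20],
    [18, 12],
]

def expected_and_chi_square(table):
    """Expected counts (row sum * column sum / grand total), chi-square, and df."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    expected = [[rs * cs / total for cs in col_sums] for rs in row_sums]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df

expected, chi2, df = expected_and_chi_square(observed)
for row in expected:
    print([round(e, 2) for e in row])
print(round(chi2, 2), df)
```

The calculated chi-square is then compared with the tabulated value for (R-1)(C-1) = 2 degrees of freedom.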
References:
1. Khan and Khanum, Fundamentals of Biostatistics, third edition, 2009.
2. Sanford Bolton, Charles Bon, Pharmaceutical Statistics: Practical and Clinical Applications, fourth edition.
First, samples need to be collected and analysed by an appropriate method. The data are then fitted to one of the frequency curves or to whichever distribution is appropriate. The samples are then characterized by different methods, such as parametric methods, nonparametric methods, and control charts, which show whether the samples or data differ significantly or fall within the control chart limits.
