
Statistics in 1 hour: The most important statistics and how to calculate them in Excel

Michael G. Walker, Ph.D.
Walker Bioscience
1050 Borregas Ave., #80
Sunnyvale, CA, USA 94089-1643
mwalker@stanfordalumni.org
408 234-8971
© Michael G. Walker 2003-2007

This course is available as a free, 1-hour lecture at your work site. Contact Michael Walker at the phone number or email above. Visit walkerbioscience.com for descriptions of other courses.

How to set up the Data Analysis Toolpak in Excel

The Excel Data Analysis Toolpak gives you menus to calculate many statistics. Before using the Data Analysis Toolpak for the first time, you need to ask Excel to install it.

1. In Excel, select the Tools menu.
2. Select Add-Ins.
3. Put a check beside the Analysis Toolpak item.
4. Click OK.

Now, when you open the Tools menu, you will see a new menu item, Data Analysis. Select Data Analysis and you will see the Data Analysis menu.

Use the mean to describe a typical (average) value of a group.

Birthweights of 5 children (cells A6:A10): 5.1, 7.5, 7.5, 8.1, 8.3

[Chart: Birthweights of 5 children; x-axis: Child number, y-axis: Birthweight (pounds)]

Using Excel workbook functions:
Mean = 7.3    =AVERAGE(A6:A10)

Using Menu: Tools / Data Analysis / Descriptive Statistics:

Birthweight
Mean                  7.3
Standard Error        0.572713
Median                7.5
Mode                  7.5
Standard Deviation    1.280625
Sample Variance       1.64
Kurtosis              3.543724
Skewness              -1.80933
Range                 3.2
Minimum               5.1
Maximum               8.3
Sum                   36.5
Count                 5
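If you want to double-check these Excel results outside of Excel, here is a minimal sketch in Python (an addition to this handout, using only the standard library) that reproduces the mean, median, and standard deviation for the same five birthweights:

    import statistics

    birthweights = [5.1, 7.5, 7.5, 8.1, 8.3]   # same values as cells A6:A10

    print(statistics.mean(birthweights))    # 7.3, matches =AVERAGE(A6:A10)
    print(statistics.median(birthweights))  # 7.5
    print(statistics.stdev(birthweights))   # about 1.28 (sample standard deviation)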

Use the standard deviation to describe the variability of a group.


Birthweights of 5 children (cells A6:A10): 5.1, 7.5, 7.5, 8.1, 8.3

[Chart: Birthweights of 5 children; x-axis: Child number, y-axis: Birthweight (pounds)]

Using Excel workbook functions:
Mean = 7.3    =AVERAGE(A6:A10)
Standard deviation = 1.28    =STDEV(A6:A10)

Small variability => small standard deviation = 1.28


Birthweights of a second group of 5 children (cells A28:A32): 3.5, 6.6, 7.1, 8.1, 11.2

[Chart: Birthweights of a second group of 5 children; x-axis: Child number, y-axis: Birthweight (pounds)]

Using Excel workbook functions:
Mean = 7.3    =AVERAGE(A28:A32)
Standard deviation = 2.78    =STDEV(A28:A32)

Larger variability => larger standard deviation = 2.78
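As a cross-check outside Excel, the same comparison can be written as a short Python sketch (an addition to this handout, using only the standard library); the more spread-out group has the larger sample standard deviation:

    import statistics

    group1 = [5.1, 7.5, 7.5, 8.1, 8.3]    # first group, cells A6:A10
    group2 = [3.5, 6.6, 7.1, 8.1, 11.2]   # second group, cells A28:A32

    print(statistics.mean(group1), statistics.stdev(group1))   # 7.3, about 1.28
    print(statistics.mean(group2), statistics.stdev(group2))   # 7.3, about 2.78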


Use the t-test to determine if two groups (two treatments) are different.

Compare the blood pressure of patients taking a drug to that of patients not taking the drug. Two treatments: Drug vs. No drug.

Blood pressure:
Drug (cells A7:A13):    110, 115, 120, 125, 130, 135, 140
No drug (cells B7:B13): 130, 145, 145, 150, 150, 170, 175

[Chart: Blood pressure vs. treatment; x-axis: Patient number, y-axis: Blood pressure; series: Drug, No drug]

The t-test tells you the probability that the observed difference between the two groups is just due to random differences in sampling the two groups, in the absence of any effect of the drug. Expressed another way: the p-value for the t-test tells us the probability that we are wrong if we conclude that there is a difference between the treatments.

Using the Excel TTEST workbook function:
=TTEST(A7:A13,B7:B13,2,2)
t-test p-value = 0.00253

Using Menu: Tools / Data Analysis / t-Test: Two-Sample Assuming Equal Variances

t-Test: Two-Sample Assuming Equal Variances
                               Drug        No drug
Mean                           125         152.1429
Variance                       116.6667    240.4762
Observations                   7           7
Pooled Variance                178.5714
Hypothesized Mean Difference   0
df                             12
t Stat                         -3.8
P(T<=t) one-tail               0.001265
t Critical one-tail            1.782287
P(T<=t) two-tail               0.00253
t Critical two-tail            2.178813
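The same two-sample t-test can be reproduced in Python with SciPy (a sketch added to this handout; it assumes SciPy is installed and uses the same blood pressure values as above):

    from scipy import stats

    drug    = [110, 115, 120, 125, 130, 135, 140]
    no_drug = [130, 145, 145, 150, 150, 170, 175]

    # Two-sample t-test assuming equal variances, the analog of =TTEST(...,2,2)
    t_stat, p_value = stats.ttest_ind(drug, no_drug, equal_var=True)
    print(t_stat)   # about -3.8
    print(p_value)  # about 0.0025 (two-tailed)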

Analysis of variance (ANOVA) answers the same question as the t-test, but you can use it to compare 2 or more groups.

Compare the blood pressure of patients taking three different drugs: Drug A, Drug B, and Drug C.

Drug A: 110, 115, 120, 125, 130, 135, 140
Drug B: 105, 115, 125, 125, 125, 140, 140
Drug C: 130, 145, 145, 150, 150, 170, 175

[Chart: Blood pressure by drug; x-axis: Patient, y-axis: Blood pressure; series: Drug A, Drug B, Drug C]

ANOVA tells you the probability that the observed differences between the groups are just due to random differences in sampling the groups, in the absence of any effect of the drugs. If at least one group is different, ANOVA gives us a small p-value. The p-value for the ANOVA tells us the probability that we are wrong if we conclude that there is any difference between the treatments.

Using Menu: Tools / Data Analysis / ANOVA: Single Factor

SUMMARY
Groups    Count    Sum     Average     Variance
Drug A    7        875     125         116.6667
Drug B    7        875     125         158.3333
Drug C    7        1065    152.1429    240.4762

ANOVA
Source of Variation    SS          df    MS          F           P-value     F crit
Between Groups         3438.095    2     1719.048    10.00462    0.001198    3.554561
Within Groups          3092.857    18    171.8254
Total                  6530.952    20
If you look at the SUMMARY table, you'll see that the average for Drug C is 152.1429, while the average for the other two groups is 125. The ANOVA table tells you that the P-value is 0.001198, which means it is very unlikely that you would see this big a difference between the three groups just by chance.
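The same single-factor ANOVA can be reproduced in Python with SciPy (a sketch added to this handout, assuming SciPy is installed):

    from scipy import stats

    drug_a = [110, 115, 120, 125, 130, 135, 140]
    drug_b = [105, 115, 125, 125, 125, 140, 140]
    drug_c = [130, 145, 145, 150, 150, 170, 175]

    # One-way (single factor) ANOVA across the three treatment groups
    f_stat, p_value = stats.f_oneway(drug_a, drug_b, drug_c)
    print(f_stat)   # about 10.0
    print(p_value)  # about 0.0012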

Use correlation to look at association between continuous variables. Example: is a baby's weight related to the mother's weight?

Mother's weight (cells A6:A15): 100, 115, 120, 125, 140, 150, 155, 160, 180, 200
Child's weight (cells B6:B15):  5.10, 4.50, 6.00, 6.80, 7.50, 8.10, 7.50, 8.10, 9.00, 11.00

[Chart: Child's weight vs. mother's weight; x-axis: Mother's weight, y-axis: Child's weight]

The Pearson correlation coefficient, R, tells us how well we can predict one variable from a correlated variable. For example, given the mother's weight, how well can we predict the child's weight? R ranges from 1.0 (perfect prediction), through 0.0 (no predictive value at all), to -1.0 (the variables are perfectly correlated but move in opposite directions).

Using the Excel PEARSON workbook function:
=PEARSON(A6:A15,B6:B15)
Correlation R = 0.95790646

Using Menu: Tools / Data Analysis / Correlation

                   Mother's weight    Child's weight
Mother's weight    1
Child's weight     0.95790646         1

In this hypothetical example, mother's weight is an almost perfect predictor of child's weight, and the Pearson correlation coefficient of 0.958 is near 1.0.
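The Pearson correlation can also be checked in Python with SciPy (a sketch added to this handout, assuming SciPy is installed):

    from scipy import stats

    mother = [100, 115, 120, 125, 140, 150, 155, 160, 180, 200]    # cells A6:A15
    child  = [5.1, 4.5, 6.0, 6.8, 7.5, 8.1, 7.5, 8.1, 9.0, 11.0]   # cells B6:B15

    r, p_value = stats.pearsonr(mother, child)
    print(r)   # about 0.958, matches =PEARSON(A6:A15,B6:B15)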


An example of negative correlation: suppose that we reversed the order of the children's weights.

Mother's weight:            100, 115, 120, 125, 140, 150, 155, 160, 180, 200
Child's weight (reversed):  11.00, 9.00, 8.10, 8.10, 7.50, 7.50, 6.80, 6.00, 5.10, 4.50

We would get a negative correlation.

Using the Excel PEARSON workbook function:
=PEARSON(A42:A52,B42:B52)
Correlation R = -0.963398608

[Chart: Child's weight vs. mother's weight (reversed order), showing a downward trend; x-axis: Mother's weight, y-axis: Child's weight]

What if we scrambled the children's weights?

Mother's weight:             100, 115, 120, 125, 140, 150, 155, 160, 180, 200
Child's weight (scrambled):  8.10, 7.50, 6.00, 6.80, 7.50, 8.10, 11.00, 4.50, 5.10, 9.00

We would see very little correlation (R = 0.028976).

[Chart: Child's weight (scrambled) vs. mother's weight, showing no clear trend; x-axis: Mother's weight, y-axis: Child's weight]
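Continuing the Python sketch above (again an addition to this handout), reversing or scrambling the children's weights shows how R changes sign or shrinks toward zero:

    from scipy import stats

    mother    = [100, 115, 120, 125, 140, 150, 155, 160, 180, 200]
    reversed_ = [11.0, 9.0, 8.1, 8.1, 7.5, 7.5, 6.8, 6.0, 5.1, 4.5]   # reversed order
    scrambled = [8.1, 7.5, 6.0, 6.8, 7.5, 8.1, 11.0, 4.5, 5.1, 9.0]   # scrambled order

    print(stats.pearsonr(mother, reversed_)[0])   # about -0.96
    print(stats.pearsonr(mother, scrambled)[0])   # about 0.03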

Use regression to quantify an association between continuous variables. Suppose we want to know how much our profit is increasing each month, on average. Regression will tell us.

Month:              1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Profit (millions):  1.10, 2.30, 2.90, 4.50, 5.20, 6.20, 6.90, 8.30, 9.10, 11.00

[Chart: Profit vs. month; x-axis: Month, y-axis: Profit (millions)]
Using Menu: Tools / Data Analysis / Regression

SUMMARY OUTPUT
            Coefficients
Intercept   0.02
Month       1.041818182

I've edited out some of the Excel tables to show just the coefficients, which are what we want. In this case, the coefficient for "Month" is 1.04, which means that our profit increases, on average, by 1.04 million per month.
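The same regression coefficients can be obtained in Python with SciPy (a sketch added to this handout, assuming SciPy is installed):

    from scipy import stats

    month  = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    profit = [1.1, 2.3, 2.9, 4.5, 5.2, 6.2, 6.9, 8.3, 9.1, 11.0]

    result = stats.linregress(month, profit)
    print(result.slope)      # about 1.04 (million per month)
    print(result.intercept)  # about 0.02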

Use the chi-square test to test for association between categorical variables. Example: is there an association between age (child vs. adult) and drink preference (beer vs. lemonade)? Ask 50 adults which drink they prefer, ask 50 children which drink they prefer, and record their answers in a 2x2 table.

Outcome observed (cells C12:D13):
         Lemonade    Beer
Child    50          0
Adult    0           50

In this case, we observe that all the children preferred lemonade, while all the adults preferred beer. If there were no association between age and drink preference, we'd expect half of each group to prefer beer and half to prefer lemonade.

Outcome expected by chance (cells C21:D22):
         Lemonade    Beer
Child    25          25
Adult    25          25

The chi-squared test examines the difference between the observed outcome and the outcome expected by chance, and tells us the probability that the observed outcome would have occurred in the absence of any true association (between age and drink preference, in this case).

Use the Excel CHITEST workbook function. The CHITEST function requires that we specify the observed data (C12:D13) and the expected values (C21:D22):
=CHITEST(C12:D13,C21:D22)
p-value = 1.52397E-23
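The same chi-squared test can be reproduced in Python with SciPy (a sketch added to this handout; correction=False is needed to match Excel's CHITEST, which does not apply the Yates continuity correction):

    from scipy import stats

    observed = [[50, 0],    # Child: lemonade, beer
                [0, 50]]    # Adult: lemonade, beer

    chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed, correction=False)
    print(expected)   # [[25. 25.] [25. 25.]], the outcome expected by chance
    print(p_value)    # about 1.5e-23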

How to calculate the expected value in each cell and get the p-value using chi-square:

1. Calculate the row totals, column totals, and grand total for the observed data.

                Lemonade    Beer    Row Total
Child           30          13      43
Adult           11          30      41
Column Total    41          43      84  <= Grand total

2. Calculate the expected value for each cell as (row total * column total / grand total).

                Lemonade    Beer       Row Total
Child           20.9881     22.0119    43
Adult           20.0119     20.9881    41
Column Total    41          43         84  <= Grand total

3. Calculate the p-value for the chi-square test.

p-value = =CHITEST(C36:D37,C42:D43)
p-value = 8.3076E-05
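The same three steps (totals, expected values, p-value) can be written out explicitly in Python (an illustrative sketch added to this handout; only the final chi-square probability uses SciPy):

    from scipy import stats

    observed = [[30, 13],   # Child: lemonade, beer
                [11, 30]]   # Adult: lemonade, beer

    # Step 1: row totals, column totals, and grand total
    row_totals = [sum(row) for row in observed]         # [43, 41]
    col_totals = [sum(col) for col in zip(*observed)]   # [41, 43]
    grand_total = sum(row_totals)                       # 84

    # Step 2: expected value for each cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Step 3: chi-square statistic and p-value (df = 1 for a 2x2 table)
    chi2_stat = sum((o - e) ** 2 / e
                    for obs_row, exp_row in zip(observed, expected)
                    for o, e in zip(obs_row, exp_row))
    p_value = stats.chi2.sf(chi2_stat, df=1)
    print(p_value)   # about 8.3e-05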

If you have more than 1 hour

In the following worksheets, I describe more statistical tests that you can perform using Excel. I also give short explanations of p-values, sample size, and power calculations, and some alternative tests that require other statistical analysis packages.

Paired versus unpaired t-tests

Use a paired t-test when the same subject receives both treatments, or is measured before and after treatment.

Test the hypothesis that bears lose weight during hibernation.

November weight: 300, 470, 550, 650, 750, 760, 800, 985, 1100, 1200
March weight:    280, 420, 500, 620, 690, 710, 790, 935, 1050, 1110

[Chart: Weight by animal; x-axis: Animal, y-axis: Weight; series: November weight, March weight]

Here is the output for the t-tests using the Excel TTEST workbook function.

First, use an ordinary t-test, assuming 10 bears in November and a different 10 bears in March:
Two-sample t-test assuming equal variances: p-value = 0.71338    =TTEST(A9:A18,B9:B18,2,2)

Second, use a paired t-test. The same 10 bears are weighed in November and again in March:
Paired t-test: p-value = 0.00011    =TTEST(A9:A18,B9:B18,2,1)

The t-test analysis for two different samples (different bears in November and March) gives a p-value of 0.713. The paired t-test analysis for paired samples (same bear in November and March) gives a p-value of 0.00011. The paired t-test is more powerful than the ordinary t-test for detecting differences, and it requires a smaller number of subjects (smaller sample size).

Here is the output for the t-tests using the Data Analysis Toolpak menu.

First, use an ordinary t-test, assuming 10 bears in November and a different 10 bears in March.
Using Menu: Tools / Data Analysis / t-Test: Two-Sample Assuming Equal Variances

t-Test: Two-Sample Assuming Equal Variances
                               November weight    March weight
Mean                           756.5              710.5
Variance                       79255.83333        72691.38889
Observations                   10                 10
Pooled Variance                75973.61111
Hypothesized Mean Difference   0
df                             18
t Stat                         0.373
P(T<=t) one-tail               0.357
t Critical one-tail            1.734
P(T<=t) two-tail               0.713
t Critical two-tail            2.101

Second, use a paired t-test. The same 10 bears are weighed in November and again in March.
Using Menu: Tools / Data Analysis / t-Test: Paired Two Sample for Means

t-Test: Paired Two Sample for Means
                               November weight    March weight
Mean                           756.5              710.5
Variance                       79255.83333        72691.38889
Observations                   10                 10
Pearson Correlation            0.997684745
Hypothesized Mean Difference   0
df                             9
t Stat                         6.54919
P(T<=t) one-tail               0.00005
t Critical one-tail            1.83311
P(T<=t) two-tail               0.00011
t Critical two-tail            2.26216
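Both t-tests can be reproduced in Python with SciPy (a sketch added to this handout, assuming SciPy is installed):

    from scipy import stats

    november = [300, 470, 550, 650, 750, 760, 800, 985, 1100, 1200]
    march    = [280, 420, 500, 620, 690, 710, 790, 935, 1050, 1110]

    # Ordinary two-sample t-test: treats the two columns as different bears
    print(stats.ttest_ind(november, march, equal_var=True).pvalue)   # about 0.713

    # Paired t-test: the same bear is weighed in November and again in March
    print(stats.ttest_rel(november, march).pvalue)                   # about 0.00011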

In case you are interested, the probability that 10 out of 10 bears weigh less in March than in November by chance, using the binomial model (equal chance of gain or loss) is (0.5)^10 = 0.00098

P-values

We commonly use statistical tests to test hypotheses, such as the following:

Hypothesis 1: Patients who receive a placebo have the same blood pressure as patients who receive a drug.
Hypothesis 2: There is no association between an individual's age and their preference for beer versus lemonade.

These hypotheses are stated in a form called the "null hypothesis", which means that we are assuming that there is no effect (of the drug) or that there is no association (between age and drink preference). The null hypothesis usually says that there is no effect or no association.

Statistical hypothesis tests usually give us a p-value, which we need to interpret. A p-value is a probability, which can range from 0 to 1. A probability of 0 means an event will not occur. A probability of 1 means an event is certain to occur. A p-value tells us the probability that we are wrong if we conclude that there is a difference between the treatments, or that there is an association. In general, the p-value tells us the probability that we are wrong if we reject the null hypothesis.

Let's consider Hypothesis 1: Patients who receive a placebo have the same blood pressure as patients who receive a drug. If we get a p-value of 0.7, we would be wrong 7 out of 10 times, or 70% of the time, if we concluded that the drug has an effect different from placebo. If we get a p-value of 0.01, we would be wrong 1 in every 100 times, or 1% of the time, if we concluded that the drug has an effect different from placebo. Usually we consider an experiment to be successful if we get a small p-value, less than 0.05, meaning that we would be wrong less than 1 time in 20 if we conclude that the drug has an effect.

Let's consider Hypothesis 2: There is no association between an individual's age and their preference for beer versus lemonade. If we get a p-value of 0.7, we would be wrong 7 out of 10 times, or 70% of the time, if we concluded that there is an association between an individual's age and their drink preference. If we get a p-value of 0.01, we would be wrong 1 in every 100 times, or 1% of the time, if we concluded that there is an association between an individual's age and their drink preference.

Sample size and power for a t-test

Suppose we want to test whether a drug is more effective than a placebo. How many patients should we enroll in our study? If the drug really works, what is the probability that we will detect the drug effect?

Power is the probability that our experiment will detect a significant difference between the treatment groups, assuming that there is a real difference; that is, we assume that the drug is more effective than placebo. Note that power makes the opposite assumption from the usual case: we usually assume that there is no difference between treatment groups. To calculate power, we need to specify the following:

Effect size: the difference between the means of the two treatment groups.
Standard deviation: the average standard deviation of the two treatment groups.
Sample size: how many subjects will be in each group.

Sample size for a t-test is the number of subjects we need in each group. To calculate sample size, we need to specify the following:

Effect size: the difference between the means of the two treatment groups.
Standard deviation: the average standard deviation of the two treatment groups.
Power: the power we want the test to have, e.g., 80% power.

Commercial statistics software can calculate power and sample size, for example NCSS PASS and Statistica; see also Glantz, Primer of Biostatistics. The worksheet labeled "Sample size calc" gives an example of calculating the sample size for a t-test.

Approximate calculation of sample size for a t-test

This example gives the sample size required for a t-test under a particular set of assumptions. For other tests or other assumptions, use the sample size and power software noted on the previous page. For the calculation given here we assume the following:

Desired p-value (alpha) is less than 0.05
Desired power is 80%
Equal size groups

Enter the values for your example:
Mean for group 1 = 2 (must be larger than the mean for group 2)
Mean for group 2 = 1
Pooled standard deviation = 2 (the standard deviation calculated using all of the observations pooled together)

Under the given assumptions and the values for your example:

Difference between means = 1
Standardized effect = Difference of means / Pooled standard deviation = 0.5
Sample size (for each group) = 16 / (standardized effect)^2 = 64
Total sample size = 2 * sample size for each group = 128
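The same approximate calculation takes only a few lines of Python (an addition to this handout; it uses the same rule of thumb of 16/(standardized effect)^2, which assumes alpha = 0.05, 80% power, and equal size groups):

    mean_group_1 = 2.0
    mean_group_2 = 1.0
    pooled_sd    = 2.0

    standardized_effect = (mean_group_1 - mean_group_2) / pooled_sd   # 0.5
    n_per_group = 16 / standardized_effect ** 2                       # 64 per group
    total_n = 2 * n_per_group                                         # 128 in total
    print(n_per_group, total_n)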

Non-parametric tests

Parametric tests include the t-test, paired t-test, ANOVA, and repeated measures ANOVA. The p-values calculated by the parametric tests can be influenced greatly by extreme observations (outliers). Non-parametric tests are analogs of the parametric tests; they are usually based on ranking the observations and are less influenced by outliers. Parametric tests commonly use the mean, while non-parametric tests commonly use the median.

Here are the analogous tests:

Parametric test               Non-parametric test
t-test                        Wilcoxon rank sum test (also known as the Mann-Whitney U test)
paired t-test                 Wilcoxon signed rank test
ANOVA                         Kruskal-Wallis test
Repeated measures ANOVA       Friedman test
Pearson linear correlation    Spearman rank correlation
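SciPy also provides these non-parametric tests directly; here is a minimal Python sketch (an addition to this handout), reusing the drug / no-drug blood pressures from earlier as example data:

    from scipy import stats

    a = [110, 115, 120, 125, 130, 135, 140]
    b = [130, 145, 145, 150, 150, 170, 175]

    print(stats.mannwhitneyu(a, b))   # Wilcoxon rank sum / Mann-Whitney U test
    print(stats.wilcoxon(a, b))       # Wilcoxon signed rank test (paired samples)
    print(stats.kruskal(a, b))        # Kruskal-Wallis test (2 or more groups)
    print(stats.spearmanr(a, b))      # Spearman rank correlation
    # The Friedman test is stats.friedmanchisquare(...), which needs 3 or more groups.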
