
STAB22 section 7.2 and Chapter 7 exercises

7.54 I would have no reason to suppose ahead of time that a particular one of the two designs would be better, so I would use a two-sided test. (This would be different if Design A was what I am using now and Design B was a new design, where I would want Design B to be better before I would consider adopting it.) With 2 groups of 25, my (approximate) df is the smaller sample size less 1, ie. 25 - 1 = 24. (If you're doing a problem by hand, use the "smaller sample size less 1" thing for the df, because the software formula is much too messy to deal with by hand.) Table D gives one-sided P-values, and to get a 2-sided P-value you have to double what you find there. Thus to get a 2-sided P-value of exactly 0.05, you want a one-sided P-value of exactly 0.025 (with 24 df), so t = 2.064 will do it: that is, reject H0 for t >= 2.064. Because it's a two-sided test, you would also reject H0 for t <= -2.064. (Remember the procedure for a two-sided test: if the test statistic is negative, ignore the minus sign and look up the positive version of it in the table.)

7.57 This one is the same as 7.56, but with smaller samples. Do it the same way. The df is 10 - 1 = 9, so t* = 2.262 and the margin of error is 2.262 sqrt(10²/10 + 12²/10) = 11.17. The confidence interval is 100 - 120 ± 11.17, from -31.17 to -8.83. Because this interval does not contain 0, 0 is not a plausible value for the difference; specifically, the null hypothesis that the difference in means is zero is rejected at the 0.05 level (1 - 0.95) against the two-sided alternative. (Calculate the test statistic and P-value for yourself if you want to check this.) The interval in 7.57 goes down further and up further than the one in 7.56. This is a reflection of the smaller sample sizes in 7.57 (and because everything else is the same, you can compare directly).
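Not part of the original solution, but the 7.57 arithmetic is easy to check with a short Python sketch (the function name and variables are mine): it recomputes the margin of error t* sqrt(s1²/n1 + s2²/n2) and the interval endpoints from the summary numbers in the text.

```python
import math

def two_sample_margin(tstar, s1, n1, s2, n2):
    """Margin of error t* * sqrt(s1^2/n1 + s2^2/n2) for a two-sample CI."""
    return tstar * math.sqrt(s1**2 / n1 + s2**2 / n2)

# 7.57: samples of size 10, SDs 10 and 12, means 100 and 120, t* = 2.262 (9 df)
m = two_sample_margin(2.262, 10, 10, 12, 10)
lo, hi = (100 - 120) - m, (100 - 120) + m
print(round(m, 2), round(lo, 2), round(hi, 2))  # 11.17 -31.17 -8.83
```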
7.55 Quickly: you can see that 2.75 is bigger than 2.064, so the two-sided P-value is less than 0.05 and you would reject H0. To get a more accurate P-value, see where 2.75 comes along the 24 df line: a one-sided P-value would be between 0.005 and 0.01, so two-sided it would be between 0.01 and 0.02.

7.56 These two-sample confidence intervals may look messy, but it's just a matter of getting all the numbers into the formula. With 50 - 1 = 49 df, t* = 2.009 (or if you want to be safe, use the 40 df value 2.021). The margin of error is then 2.009 sqrt(10²/50 + 12²/50) = 4.44, and the interval is 100 - 120 ± 4.44, from -24.44 to -15.56. As always, a 99% CI has to contain the population parameter (here, the difference between the two population means) in more of all possible samples than a 95% CI does, so the 99% interval has to contain more values to make this happen (all the samples that work for a 95% interval also work for a 99% interval, and the 99% interval contains those plus more).

7.61 We should be safe as long as the males and females are (or can be treated as) simple random samples of all students (or whatever you think the population is). Unless you have theory that says that females should be higher (or lower), there's no justification for anything other than a two-sided alternative here, so, letting 1 denote females and 2 males, we have H0: μ1 = μ2 and Ha: μ1 ≠ μ2 (or you can express those in terms of μ1 - μ2 if you prefer, which is clearer in some ways). The bottom of the test statistic is sqrt(s1²/n1 + s2²/n2) = sqrt(34.79²/71 + 33.24²/37) = 6.85, so the test statistic itself is t = (173.70 - 171.81)/6.85 = 0.28. With 37 - 1 = 36 df (use 30 or 40), the one-sided P-value is bigger than 0.25, and the two-sided P-value we want is bigger than 0.50. There is no way we can reject the null hypothesis here. That is, there is no evidence of a difference in mean total cholesterol levels. t* for a 95% confidence interval is 2.042 (30 df) or 2.021 (40). The margin of error for the CI is just t* times the bottom of the test statistic, so the interval itself (using 30 df) comes out as 173.70 - 171.81 ± (2.042)(6.85) = 1.89 ± 13.99, from -12.10 to 15.88. This interval contains 0, so it supports the result of the test (which said that the difference in means could be 0). The variability for both males and females is high, so it was difficult to estimate the difference in means accurately, even with these moderately large sample sizes.

7.62 This one is the same idea as the test in 7.61, but with different numbers. This time, the researchers want evidence that the mean LDL level for males is higher than for females, so a one-sided alternative is called for: H0: μ1 = μ2 and Ha: μ1 < μ2 (since 1 was females). (The introduction in 7.61 said that these were sedentary students, so "sedentary" in 7.62 is all right.) The bottom of the test statistic is sqrt(29.78²/71 + 31.05²/37) = 6.21, using the LDL numbers, so the test statistic is t = (96.38 - 109.44)/6.21 = -2.10. This is correctly negative (according to the alternative, the males are supposed to be higher), so we can go ahead and find the P-value for -2.10 on 30 (or 40) df. This is between 0.02 and 0.025, which is smaller than 0.05, so we can conclude that the mean LDL level for all (sedentary student) males is higher than for females.
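As an aside (not in the original), the two-sample recipe used in 7.61 and 7.62 is easy to verify from the summary statistics alone; a sketch with my own names:

```python
import math

def two_sample_t(x1, s1, n1, x2, s2, n2):
    """Unpooled two-sample t statistic and its standard error."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) / se, se

# 7.61: cholesterol means 173.70 (71 females) and 171.81 (37 males)
t, se = two_sample_t(173.70, 34.79, 71, 171.81, 33.24, 37)
print(round(se, 2), round(t, 2))  # 6.85 0.28
```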

7.63 The data here cannot be normally distributed, because rather than being any (decimal) numbers, they have to be whole numbers between 0 and 6. That said, using t procedures on the means is almost certainly going to be OK, because we have large samples from distributions with no possible outliers (so the exact distributional shape won't matter very much).

Let μI be the population mean for the intervention and μC be the population mean for the controls. Think about the alternative first; we might want to know if the intervention actually led to an improvement in dietary behaviour, in which case Ha: μI > μC. Or we might want to know whether the intervention had any effect, positive or negative, in which case Ha: μI ≠ μC would be more appropriate. In either case, the null hypothesis should be H0: μI = μC. The test statistic is

t = (5.08 - 4.33) / sqrt(1.15²/165 + 1.16²/212) = 6.26.

This is off the end of the 100 df line of Table D, so the P-value is less than 0.0005 (one-sided) or 0.001 (two-sided). These P-values are very small (smaller than 0.05), so we would reject the null hypothesis in favour of the alternative in both cases. There is strong evidence that the intervention is effective (one-sided) or makes a difference (two-sided). Using 100 df gives a 95% confidence interval of

5.08 - 4.33 ± (1.984) sqrt(1.15²/165 + 1.16²/212) = 0.75 ± 0.24,

or 0.51 to 0.99. The interval only contains positive values, so the evidence points to the intervention having a positive effect. This is the same conclusion that we drew from the one-sided test. There may be something special about Durham, North Carolina, that prevents these results from generalizing well to other parts of the US or Canada (let alone to other parts of the world). It would be better to be careful.

7.64 As in 7.63, the data cannot be normal (they are whole numbers), but the distribution cannot have outliers, and so, with large sample sizes as we have here, using the t procedures won't be too far off the mark.

Thinking about the alternative first: you might think that the intervention could only have a positive effect, in which case a one-sided alternative is correct, or you might wish to allow for the intervention being harmful, in which case a two-sided alternative is what you want. In either case, the null says that the two means are equal. The test statistic is

t = (4.10 - 3.67 - 0) / sqrt(1.19²/165 + 1.12²/212) = 3.57,

and the P-value, using 100 df, is less than 0.0005 one-sided and 0.001 two-sided (use the one corresponding to the alternative you chose). Either way, we'd reject the null hypothesis and conclude that the intervention is having a positive effect (one-sided) or some effect (two-sided).

For the 95% confidence interval: either way, we're using 100 df, so t* = 1.984, and the interval is 0.43 ± 0.24, or 0.19 to 0.67. Since the confidence interval says that only positive values for the difference in means are plausible, we are led to the same conclusion as from the (one-sided) test: that the intervention makes a positive difference.

7.65 (a) These results may apply to workers who are employed in similar types of conditions (and who receive medical examinations as part of routine health checkups). The key is that the workers sampled should be close enough to a simple random sample from the population of all possible workers, whatever that is precisely.

(b) The 95% interval, with 100 df, is

18 - 6.5 ± 1.984 sqrt(7.8²/115 + 3.4²/220) = 11.5 ± 1.51,

or 9.99 to 13.01 milligram years per cubic metre.

(c) You might suspect that drill and blast workers are going to be exposed to more dust than concrete workers, in which case a one-sided alternative would make sense.

t = (18 - 6.5) / sqrt(7.8²/115 + 3.4²/220) = 15.08.

This is off the end of the table with 100 df, so the P-value is less than 0.0005, probably much less. Drill and blast workers have a higher exposure to dust.

(d) Skewness in the distribution of the data shouldn't matter if the sample sizes are large, as they are here.

7.66 Just doing the confidence interval and test here. For the 95% confidence interval, again with 100 df:

6.3 - 1.4 ± (1.984) sqrt(2.8²/115 + 0.7²/220) = 4.9 ± 0.53,

or 4.37 to 5.43 milligram years per cubic metre. As you might expect, the test gives a really small P-value:

t = (6.3 - 1.4) / sqrt(2.8²/115 + 0.7²/220) = 18.47,

with again a P-value (much) less than 0.0005. This is strong evidence that drill and blast workers have a higher exposure than outdoor concrete workers.

7.67 Since the changes were recorded over a 17-year period, it seems likely that these were two separate samples of Americans, and therefore that a two-sample test should be done. (If the measurements had been made one year apart, the authors could have looked to see how soft drink sizes changed for the same group of people, ie. a matched pairs experiment, but over a longer time span, this seems unlikely.) So we figure out what we need to know

to make a two-sample CI: we have the sample means, but we also need to know the sample SDs and sample sizes. These could make a big difference in assessing how big the increase really might be; if the sample SDs are large or the sample sizes are small, this apparently big increase might have occurred by chance, because the size of the increase is poorly estimated. (See, for example, 7.61, where the sample SDs were large and the confidence interval was long.)

7.69 (a) The hypotheses have to be statements about the population, not the sample. (A hypothesis about sample means could be declared true or false with certainty by looking at the sample statistics: if the sample means are different, the null hypothesis would be false.)

(b) The two samples need to be independent. The scores of all 50 freshmen contain the scores of the males, so the two samples cannot be independent. (The right way here would be to compare the 24 males with the 26 females.)

(c) Confidence only relates to confidence intervals. In a test, the null hypothesis is rejected only if the P-value is small, smaller than an α like 0.05. This P-value is far from small, so we have no evidence to be rejecting the null hypothesis with. (This kind of P-value comes from samples where the sample means are close together, so a hypothesis that the population means are equal is quite plausible.)

(d) When you're doing a one-sided test, you have to check that your test statistic value is on the correct side. With this alternative (containing "less than"), the test statistic needs to be negative before you can think about rejecting the null. It isn't, here, so it is on the wrong side and the null hypothesis wouldn't be rejected. (In plainer terms, a positive test statistic means that the sample mean for sample 1 is bigger than the sample mean for sample 2, which is hardly evidence that the mean for population 1 is smaller than the mean for population 2.)

7.70 A two-sample test would be testing the null hypothesis that the difference in population means is 0. Remembering that a confidence interval includes all plausible values for the difference in population means, and that 0 is in the given interval, we'd conclude that the null hypothesis should not be rejected and that the difference in population means could be 0. (Precisely, a 95% confidence interval goes with a two-sided test at α = 5% = 0.05, so the P-value for the test is bigger than 0.05.)

For (b), larger samples generally give more information, and so the margin of error based on larger samples will generally be smaller. Or you can look at the formula: the margin of error is t* sqrt(s1²/n1 + s2²/n2), and making the sample sizes bigger will make the margin of error smaller. (Making the sample sizes bigger will also increase the df, so that t* will also be smaller.) This is not a certainty because the sample SDs might also come out bigger, but you can expect larger samples to be more informative most of the time.

7.71 (a) This is a two-sided test, so throw away the minus sign on the test statistic and look up 3.69 in the 9 df line of Table D. This gives a one-sided P-value right on 0.0025, so the two-sided P-value would be 0.005. This is smaller than 0.05, so we can reject a hypothesis of equal means in favour of a two-sided alternative.

(b) For this one-sided test, we first check that we are on the correct side. If the alternative says that the difference in means is negative, the test statistic should be negative as well, which it is. So we can find the P-value from Table D, which is as in (a) only without the doubling: the P-value is 0.0025, and the null hypothesis is rejected in favour of the alternative that the difference in means is negative.

7.73 This is no longer a two-separate-sample situation, but rather matched pairs. It is better to do a 1-sample t test on the seven differences (for the two different designs on the same day of the week), because day of the week may also make a difference.
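The "correct side" rule in 7.69(d) and 7.71 can be written out as a tiny sketch; the helper name and its interface are my own invention, not anything from the text or Table D.

```python
def one_sided_p(t, alternative, table_one_sided_p):
    """alternative: 'less' (Ha: mu1 - mu2 < 0) or 'greater'.
    table_one_sided_p: the tail area read from Table D for |t|.
    On the wrong side, the P-value is at least 0.5 and never significant."""
    on_correct_side = (t < 0) if alternative == "less" else (t > 0)
    return table_one_sided_p if on_correct_side else 1 - table_one_sided_p

# 7.71(b): t = -3.69, Ha says the difference is negative -> correct side
p_right = one_sided_p(-3.69, "less", 0.0025)    # 0.0025, reject H0
# Same t but Ha: greater -> wrong side, huge P-value
p_wrong = one_sided_p(-3.69, "greater", 0.0025)  # 0.9975, cannot reject
print(p_right, p_wrong)
```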

7.80 The alternative hypothesis is that the two sources differ in mean trustworthiness. (I think a two-sided alternative is better, but if you want to take it for granted that the National Enquirer is not going to be a more trustworthy source for anything, go ahead and use a one-sided alternative.) The null hypothesis is that the two sources have the same mean trustworthiness. In symbols: H0: μ1 = μ2 and Ha: μ1 ≠ μ2, where μ1 and μ2 are the population means for the Wall Street Journal and National Enquirer. The test statistic is t = 8.369 with 60 df, and the null hypothesis is rejected without question: there is a real difference in mean trustworthiness. You might guess that 0 will not be in the confidence interval, which is, using 60 df, 1.78 to 2.90. Using the software df of 121.5, it is 1.787 to 2.894. In summary, advertising in the Wall Street Journal is seen as between 1.8 and 2.9 points more credible, on average, than advertising in the National Enquirer. This is probably not a great surprise to you!

7.81 When you have the data, it's easiest to get Minitab to do the heavy lifting for you.

There's no reason to suspect a difference in a particular direction, so a two-sided alternative makes sense. In Minitab, select Stat, Basic Statistics and 2-sample t. Our sample values (heights) are in one column, so make sure the first alternative is selected, and select dbh for the heights and ns for the subscripts (indicators of north and south). Minitab gives us both a test and a confidence interval, so ensure the right alternative, Not Equal, is selected (in Options), and also the right level, 95%, for the confidence interval. Do not check the "assume equal variances" box; that uses the pooled procedures (described on page 461), which we do not want here. My results are in Figure 2. The two-sided P-value is 0.011, smaller than 0.05, so we can reject the null hypothesis and conclude that the mean DBHs differ in the northern and southern parts of the tract. The confidence interval, from -19.1 to -2.6, gives us an idea of the direction and size of the difference in means: the southern trees are bigger, but the size of the difference is not estimated very accurately.
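Not part of the original write-up, but you can re-derive the test statistic in Figure 2 from its printed summary numbers (means 23.7 and 34.5, SDs 17.5 and 14.3, 30 trees in each group). The small discrepancy with Minitab's T = -2.63 comes from the rounding of the printed means and SDs; Minitab works from the full data.

```python
import math

# Unpooled two-sample t from the rounded summary output in Figure 2
se = math.sqrt(17.5**2 / 30 + 14.3**2 / 30)
t = (23.7 - 34.5) / se
print(round(t, 2))  # -2.62, vs Minitab's -2.63 from the unrounded data
```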

Minitab cannot do back-to-back stemplots, but it can do side-by-side boxplots. Select Graph and Boxplot. Select With Groups (because you want a separate boxplot for each group, North and South). Select dbh as your graph variable and ns as your categorical variable for grouping. My boxplots are in Figure 1. Both distributions are somewhat skewed, and differ somewhat in spread. The North diameters are skewed to the right, and the South ones a little skewed to the left. The samples are relatively large; we hope they're large enough to overcome the skewness. The difference in spreads is not a concern. So we shrug our shoulders and continue.

Figure 1: Boxplots for tree diameter data

7.82 This question bears a distinct similarity to 7.81.

Two Sample T-Test and Confidence Interval
Two sample T for dbh
ns   N  Mean  StDev  SE Mean
n   30  23.7   17.5      3.2
s   30  34.5   14.3      2.6
95% CI for mu (n) - mu (s): (-19.1, -2.6)
T-Test mu (n) = mu (s) (vs not =): T = -2.63  P = 0.011  DF = 55

Figure 2: Minitab output for tree diameter data (north-south)

Side-by-side boxplots first. Select Graph, Boxplot and With Groups. In the dialog box, select DBH as the graphing variable and east-west as the categorical grouping variable. Click OK. My boxplots are shown in Figure 3. The East tree diameters have a smaller median than the West tree diameters, but are more spread out. The East tree diameters are slightly right-skewed and the West tree diameters are slightly left-skewed. There are no outliers. With our largish sample sizes (30 trees in each half-tract), mild skewness like this is not a problem.

Figure 3: Boxplots for tree diameter data (east-west)

The hypotheses are the same as 7.81(c), and for the same reasons. In Minitab, select Stat, Basic Statistics and 2-sample t. Our diameters are all in one column with a second column labelling whether each diameter goes with East or West ("subscripts" in Minitab lingo). So select DBH as samples and ew as subscripts. Check that your alternative is "not equal" and that the confidence level is 95%. Don't check "assume equal variances". My output is shown in Figure 4. Taking the test first: the P-value is 0.039, which means (at α = 0.05) we can just reject the null hypothesis and conclude that the mean diameters are different. Looking at the rest of the output, this is because the west mean is bigger than the east mean. Since the P-value of a two-sided test was less than 0.05, the 95% confidence interval for the difference of the two means should not quite contain 0. The east-minus-west interval goes from -16.7 to -0.4, as promised, not quite reaching zero.

Two Sample T-Test and Confidence Interval
Two sample T for dbh
ew   N  Mean  StDev  SE Mean
e   30  21.7   16.1      2.9
w   30  30.3   15.3      2.8
95% CI for mu (e) - mu (w): (-16.7, -0.4)
T-Test mu (e) = mu (w) (vs not =): T = -2.11  P = 0.039  DF = 57

Figure 4: T test and confidence interval for tree diameter data (east-west)

7.83 Use 55 - 1 = 54 df and t* = 2.009 (50 df in table), to get

53 - 50 ± 2.009 sqrt(15²/70 + 18²/55) = 3 ± 6.06,

or -3.06 to 9.06. Using the software df formula will give you more df and an interval that doesn't extend so far, but, either way, the intervals are not very different. (If you are doing a question by hand, using the df based on the smaller sample is fine, unless of course the question asks you to use the software df.)
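A quick check of the 7.83 hand calculation (my own code, not from the text), using the conservative df and the Table D value t* = 2.009:

```python
import math

# 95% CI for the difference in mean sales: means 53 and 50,
# SDs 15 and 18, sample sizes 70 and 55, t* = 2.009 (50 df line)
m = 2.009 * math.sqrt(15**2 / 70 + 18**2 / 55)
lo, hi = 3 - m, 3 + m
print(round(m, 2), round(lo, 2), round(hi, 2))  # 6.06 -3.06 9.06
```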

Because of random variation between stores, we might (by chance) have observed a rise in mean sales in our sample if there wasn't one in the population (from looking at all stores). So we have to consider variability in sales too.

By way of comparison, Minitab can do the test based on just the means, SDs and sample sizes. Select Stat, Basic Statistics, 2-sample t. In the dialog box, click Summarized Data (the 3rd option). Fill in the sample sizes, means and SDs. Do not click "assume equal variances". The confidence interval goes from -2.98 to 8.98, which is (as we claimed) a little shorter than the one we found by hand (it uses the software df). Minitab also does the test, which not surprisingly says that we should not reject the null hypothesis of equal mean sales (it gives the P-value as 0.322). See Figure 5. The software df is almost as big as the two sample sizes combined, but not quite.

Two-Sample T-Test and CI
Sample   N  Mean  StDev  SE Mean
1       70  53.0   15.0      1.8
2       55  50.0   18.0      2.4
Difference = mu (1) - mu (2)
Estimate for difference: 3.00000
95% CI for difference: (-2.98378, 8.98378)
T-Test of difference = 0 (vs not =): T-Value = 0.99  P-Value = 0.322  DF = 104

Figure 5: Minitab output for sales data

7.84 We need to choose the hypotheses without looking at the data; to do otherwise is cheating. If you are going to take a one-sided alternative, there needs to be some reason to expect it before looking at the sample results. Your friend needs to have a two-sided alternative hypothesis, and thus the P-value he reports should be double the 0.06 that he thought was correct for a one-sided test (ie. the correct P-value is 0.12).

7.85 Same old thing: we need a test, with a one-sided alternative ("is there evidence that the mean ... is higher?"). The test statistic is 1.654, with a one-sided P-value between 0.05 and 0.10, so the null hypothesis cannot quite be rejected. The confidence interval, using 18 df, goes from -0.24 to 2.04.

The above procedures assume normally distributed data that are simple random samples from their populations, or at least data that are sufficiently normal, given the sample sizes, for the Central Limit Theorem to hold. The pooled procedures assume in addition that the population SDs are equal.

In the same way as 7.83, we can also do this one in Minitab, using the Summarized Data option. See Figure 6. The df are bigger, so the CI is shorter, compared to our calculation by hand, but the change in the confidence interval is really very small.

Two-Sample T-Test and CI
Sample   N   Mean  StDev  SE Mean
1       23  13.30   1.70     0.35
2       19  12.40   1.80     0.41
Difference = mu (1) - mu (2)
Estimate for difference: 0.900000
95% CI for difference: (-0.202700, 2.002700)
T-Test of difference = 0 (vs not =): T-Value = 1.65  P-Value = 0.107  DF = 37

Figure 6: Minitab output for breast-feeding data

7.112 When doing a one-sided test, check that your sample result is on the correct side before finding a P-value. In (a), it is (look at the alternative hypothesis), so you can claim that the P-value is 0.07/2 = 0.035, and you would reject H0 in favour of your one-sided Ha at the 5% level. In (b), t should be positive, so you are on the wrong side, the P-value is large, and you cannot reject H0 in favour of this one-sided Ha.

7.113 The simplest way to do this is to fire up Minitab and fill column 1 with df values from 2 to 100 and column 2 with the 95% t* values copied from the table. Then you plot them. To be slightly more clever: for your df, select Calc, Make Patterned Data and Simple Set of Numbers. Store them in C1, and go from first number 2 to last number 100 in steps of 1. On your plot, you can also add the line at 1.96: right-click the plot, select Add and Reference Lines, and add a line at y = 1.96. My plot is in Figure 7. The values of t* decrease as the df increases, rapidly at first and then more slowly, and they become closer and closer to z* = 1.96 as the df increases. This reinforces the idea that when you have a large sample, it matters little whether you know σ or not (because t and z are almost the same), but when you have a small sample, it makes a really big difference (in that case, using z when you should be using t gives you an over-optimistic result).

7.115 Use Minitab for this: to get your column of n-values, select Calc and Make Patterned Data, then ask for values from 5 to 100 in steps of 5. Copy the t* values from Table D into an empty column, and then work out the margin of error (using Calc and Calculator) as m = t* s / sqrt(n), since s = 1. If you have the n values in column C1 and the t* values in C2, this, as a formula, would be c2/sqrt(c1). Then plot your margins of error against the sample sizes. My plot looks like Figure 8. The margin of error decreases steadily as the sample size increases, though the rate of decrease tails off as the sample size gets bigger.

7.117 This is a matched-pairs test (actually, two of them: one for body weight and one for calories consumed).
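As an aside (mine, not the text's): the "software df" that Minitab keeps reporting comes from the Welch-Satterthwaite formula, and you can check the df in the output above from the summary numbers alone.

```python
import math

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite degrees of freedom for the unpooled t procedures."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Sales data (Figure 5): SDs 15 and 18, sizes 70 and 55 -> DF = 104
# Breast-feeding data (Figure 6): SDs 1.70 and 1.80, sizes 23 and 19 -> DF = 37
print(int(welch_df(15, 70, 18, 55)), int(welch_df(1.70, 23, 1.80, 19)))
```

Minitab truncates the fractional df, which is why `int` reproduces its output.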
The difference in body weight loss between the two experimental conditions is 0.4 - 1.1 = -0.7 kg (taking wine as treatment and no wine as control). This has standard error 8.6/sqrt(14) = 2.30 kg. For calories, the difference in means is 2589 - 2575 = 14 and the standard error is 210/sqrt(14) = 56.12 calories. The df throughout all of this is 14 - 1 = 13. The test statistic for weight loss is t = -0.7/2.30 = -0.30 and for calories is 14/56.12 = 0.25. The report said "no significant differences", which implies two-sided alternative hypotheses; neither of these results comes close to statistical significance at the 0.05 level, because the P-values are much bigger (you can guess this without even looking at the table). For the confidence intervals, t* = 2.160: for weight gain difference, -0.7 ± (2.160)(2.30), -5.7 to 4.3, and for calories, 14 ± (2.160)(56.12), -107 to 135. Both of these confidence intervals, especially the second one, suggest that the sample sizes weren't large enough to estimate the difference in means accurately.

Figure 7: Plot of t* against df

Figure 8: Margin of error vs. sample size
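The matched-pairs arithmetic in 7.117 can be replicated from the summary numbers; a sketch with my own function name:

```python
import math

def paired_t_and_ci(dbar, s, n, tstar):
    """One-sample t statistic and CI for a mean difference from summary stats."""
    se = s / math.sqrt(n)
    return dbar / se, dbar - tstar * se, dbar + tstar * se

# n = 14 subjects, t* = 2.160 with 13 df
t_w, lo_w, hi_w = paired_t_and_ci(-0.7, 8.6, 14, 2.160)   # weight loss, kg
t_c, lo_c, hi_c = paired_t_and_ci(14, 210, 14, 2.160)     # calories
print(round(t_w, 2), round(t_c, 2))      # about -0.30 and 0.25
print(round(lo_w, 1), round(hi_w, 1))    # about -5.7 and 4.3
```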

The population about which the researchers appear to want to draw conclusions is all adults, and so the question is then one of how the people in the study resemble a simple random sample of all adults. For one, the subjects in the study were all adult males up to age 50 of moderate weight (91 kg is 200 lbs), and all from the same city. So we cannot generalize to females, seniors or overweight people. The other issue with this kind of study is how well the subjects complied with the experimental protocol, which in this case was pretty well, so that should not be a limiting factor.

7.118 Grab the data from the disk and read the values into Minitab. A histogram or a boxplot (my boxplot is in Figure 9) reveals that the OC values are moderately right-skewed (on my boxplot, the upper part of the rectangle is bigger and the upper whisker is longer, but there are no outliers). If the skewness were more severe, I would have problems using a t confidence interval on these data (and a moderate sample size of 31), but as it is, I think we are safe. Get the CI in Minitab the usual way: Stat, Basic Statistics, 1-sample t. Sample in column oc. Click Options, check that the confidence level is correct and that the Alternative is Not Equal. Click OK a couple of times. I got a confidence interval from 26.2 to 40.6 (you can verify this by hand using the sample mean 33.42 and the sample SD 19.61).

Figure 9: Boxplot of OC data

7.129 The first question to ask yourself here is "where are the random samples?". This appears to be a census of all pet owners in the town, but you could argue that these people are (or might be) representative of pet owners everywhere, and their potential behaviour if they were faced with evacuation from their homes. Working on that principle, we can ask whether those people who evacuated some or all of their pets tended to have a higher score on commitment to adult animals. (Emergency-response people will tell you that the first priority in this kind of situation is to get yourself out of there, but that's by the way.)

So: let group 1 be those who evacuated some or all pets, and let group 2 be those who did not. Let μ1 and μ2 be the population mean scores on the Commitment to Adult Animals scale. Then

our alternative is Ha: μ1 > μ2 (you would expect, without looking at any data, that those who evacuated some pets would have if anything a higher score), and the null is H0: μ1 = μ2. The square root on the bottom of the test statistic is sqrt(3.62²/116 + 3.56²/125) = 0.46, and the test statistic itself is t = (7.95 - 6.26)/0.46 = 3.65. Using 100 df in the table (116 - 1 = 115), the P-value for this (one-sided) is less than 0.0005. This is very small, and so we reject the null hypothesis in favour of the alternative, concluding that those people who did evacuate some pets do have a higher average score on the Commitment to Adult Animals scale. To get a sense of how much higher, you can make a confidence interval for the difference in means. Using 100 df again, t* for a 95% confidence interval is 1.984, so the interval is 7.95 - 6.26 ± (1.984)(0.46), from 0.77 to 2.61. This gives a sense of how big the difference in mean scores might be.
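A quick numerical check of the 7.129 calculation (not in the original), straight from the summary statistics:

```python
import math

# Evacuated-pets group: mean 7.95, SD 3.62, n = 116
# Did-not-evacuate group: mean 6.26, SD 3.56, n = 125
se = math.sqrt(3.62**2 / 116 + 3.56**2 / 125)
t = (7.95 - 6.26) / se
m95 = 1.984 * se  # t* = 1.984 from the 100 df line of Table D
print(round(se, 2), round(t, 2))                      # 0.46 3.65
print(round(1.69 - m95, 2), round(1.69 + m95, 2))     # 0.77 2.61
```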

Two-Sample T-Test and CI
Sample    N  Mean  StDev  SE Mean
1       116  7.95   3.62     0.34
2       125  6.26   3.56     0.32
Difference = mu (1) - mu (2)
Estimate for difference: 1.69000
95% lower bound for difference: 0.92546
T-Test of difference = 0 (vs >): T-Value = 3.65  P-Value = 0.000  DF = 237

Two-Sample T-Test and CI
Sample    N  Mean  StDev  SE Mean
1       116  7.95   3.62     0.34
2       125  6.26   3.56     0.32
Difference = mu (1) - mu (2)
Estimate for difference: 1.69000
95% CI for difference: (0.77790, 2.60210)
T-Test of difference = 0 (vs not =): T-Value = 3.65  P-Value = 0.000  DF = 237

Figure 10: Minitab output for 7.129

All of this can be verified in Minitab: Stat, Basic Statistics, 2-sample t. Click on Summarized Data, and fill in the sample sizes, sample means and sample SDs (in fact the exact same format as the table in the textbook). Click Options, and make sure Alternative is Greater Than (for the test). See the top part of Figure 10 (ignoring the one-sided confidence interval): the test statistic is verified as 3.65, and the P-value is very small. Note that Minitab is using the software df, which is bigger (237) than the by-hand 116 - 1.

For the confidence interval, repeat this process, but after clicking on Options, turn the Alternative back into Not Equal. See the bottom half of Figure 10. Minitab again uses the software df, so its confidence interval is a smidgen shorter than the one we calculated (it is 0.78 to 2.60). With sample sizes this big, though, any t* value you might use is going to be very close to the z* = 1.96 that you would use for a Chapter-6 confidence interval; not knowing the σ values really doesn't hurt you here.

7.141 Type the data into Minitab (it appears not to be on the disk). Enter all the prices into one column and the number of bedrooms into a second (you could also use two separate columns, but the suggested way is easier if you want to do things like side-by-side boxplots). Save yourself some trouble by working to the nearest $500, rounding those 900 values upwards.

A nice way to see both groups of data at once is a side-by-side boxplot. Select Graph and Boxplot, and then select With Groups (the second alternative in the first row). Select asking price into the Graph Variables box, and number of bedrooms into the Categorical Variable box. (I know number of bedrooms is really quantitative, but we are treating it as a way to divide the houses into two groups, so we are treating it in a categorical way.) My side-by-side boxplots are in Figure 11. For the four-bedroom houses, the prices have a right-skewed distribution (the upper part of the box is bigger and the upper whisker is longer, but there are no outliers). The three-bedroom house prices also appear to have a right-skewed distribution, though it's harder to see: the upper whisker is longer than the lower, and there are two outliers at the upper end, which are the houses with asking price over $250,000. The sample of four-bedroom houses is rather small (only 9 of them), but we are looking at the difference of means of two distributions that are skewed the same way, which will help.

Figure 11: Boxplots for asking prices of houses

You can answer (b) and (d) in one go, by getting a two-sided test and confidence interval. Select Stat, Basic Statistics and 2-sample t. Click on Samples in One Column, and enter your column of asking prices in the first box, and your column containing the number of bedrooms in the second. (If you entered the asking-price values into two columns, use the second option and enter your two columns of prices into the two boxes.) The Options should be OK, but check it anyway: 95% confidence, alternative Not Equal. My output is in Figure 12. The test statistic is -3.08, and the two-sided P-value is 0.010, so at α = 0.05, we would reject the null hypothesis and conclude that there is a difference in mean asking price. The confidence interval goes from -111.6 to -19.1, saying that 3-bedroom houses are between about $20,000 and $110,000 cheaper on average than 4-bedroom houses.

Two-Sample T-Test and CI: asking, bedrooms
Two-sample T for asking
bedrooms   N   Mean  StDev  SE Mean
3         28  129.6   49.3      9.3
4          9  195.0   57.2       19
Difference = mu (3) - mu (4)
Estimate for difference: -65.3750
95% CI for difference: (-111.5962, -19.1538)
T-Test of difference = 0 (vs not =): T-Value = -3.08  P-Value = 0.010  DF = 12

Figure 12: Minitab output for 7.141

You might have been wondering why we didn't use a one-sided alternative in the first place, because you would guess that three-bedroom houses would be cheaper than four-bedroom houses (all other things being equal, anyway). So I would have done this as a one-sided test; in that case, the P-value would have been 0.005 (notice that we are on the correct side), and we would have been able to reject the null at α = 0.01 as well against our one-sided alternative. (The first sentence of the question takes it for granted that four-bedroom houses will be more expensive than three-bedroom ones.)

These asking prices are not simple random samples, but they are asking prices for homes in West Lafayette, Indiana that happened to be on the market at the time the data were collected. You could argue that there's nothing much special about homes that happen to be for sale (as compared to other homes in the same town that are not currently for sale), and so these homes are like a simple random sample of homes in the town. (Think of a time 20 years from now: can you predict which homes will be for sale at that time? I thought not.)
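The halving step in the one-sided version of 7.141 can be written out explicitly (my own sketch, not from the text): the two-sided P-value printed by Minitab is 0.010, and since the test statistic (-3.08) falls on the side the one-sided alternative predicts, halving it is legitimate.

```python
# One-sided P-value from a two-sided one, with the side check first
p_two_sided = 0.010
t_stat = -3.08
on_correct_side = t_stat < 0  # Ha: mu3 - mu4 < 0 (3-bedroom cheaper)
p_one_sided = p_two_sided / 2 if on_correct_side else 1 - p_two_sided / 2
print(p_one_sided)  # 0.005
```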
