
Practice Exam Questions; Statistics 301

Chapters 1-3, and 12

1. Below are 50 sorted observations.

0.01 0.18 0.38 0.47 0.75 1.00 1.21 1.53 2.09 2.99
0.01 0.23 0.39 0.49 0.91 1.08 1.30 1.64 2.20 3.21
0.05 0.24 0.42 0.50 0.93 1.08 1.33 1.86 2.27 3.96
0.08 0.28 0.45 0.50 0.93 1.09 1.40 1.94 2.37 4.09
0.16 0.34 0.46 0.61 0.99 1.14 1.41 1.99 2.74 4.32

(a) Draw a density scale histogram of these data. Use 0.00-0.50, 0.50-1.00, 1.00-2.00 and 2.00-4.50 as your class intervals. Clearly label the height and endpoints of each of the four rectangles.
(b) Obtain the interquartile range of these data.
(c) Given the mean and standard deviation of these data are 1.24 and 1.09, respectively, determine the proportion of observations that are within one standard deviation of the mean. Comment.

2. Below are 50 sorted observations.

0.05 0.22 0.49 1.50 2.64 3.18 3.76 4.35 4.63 4.85
0.07 0.33 0.71 1.58 2.67 3.34 3.96 4.41 4.65 4.92
0.08 0.42 0.93 1.59 2.88 3.53 4.00 4.45 4.75 4.93
0.09 0.43 0.99 2.01 2.91 3.57 4.04 4.56 4.78 4.94
0.13 0.46 1.36 2.50 3.01 3.71 4.28 4.57 4.83 4.96

(a) Draw a density scale histogram of these data. Use 0.00-0.50, 0.50-2.50, 2.50-4.50 and 4.50-5.00 as your class intervals. Clearly label the height and endpoints of each of the four rectangles.
(b) Obtain the interquartile range of these data.
(c) Given the mean and standard deviation of these data are 2.76 and 1.78, respectively, determine the proportion of observations that are within one standard deviation of the mean. How does your proportion compare to the value predicted by the empirical rule? Are you surprised by the agreement/disagreement? Comment.

3. A sample of size n = 200 yields the histogram printed below.

[Histogram: horizontal axis marked from 140 to 250 in steps of 10.]

The mean of these data is 200.00. The standard deviation is one of the following four numbers.

5.00 10.00 20.00 40.00

Circle the standard deviation. Explain your answer.

4. Consider a balanced study with eight subjects, identified as A, B, C, D, E, G, H, and J. In the actual study, A, B, C and D are assigned to the first treatment, and the remaining subjects are assigned to the second treatment. There are exactly four successes, and they are obtained by A, B, C, and H. This information is needed for parts (a)-(c) below.

(a) Compute the observed value of the test statistic.
(b) Assume that the Skeptic is correct. Determine the observed value of the test statistic for the assignment that places A, D, E, and G on the first treatment, and the remaining subjects on the second treatment.

(c) We have obtained the sampling distribution of the test statistic on the assumption that the Skeptic is correct. It is also possible to obtain a sampling distribution of the test statistic if the Skeptic is wrong, provided we specify exactly how the Skeptic is in error. These new sampling distributions are used in the study of statistical power, which is briefly described in Chapter 7 of the text. Assume that the Skeptic is incorrect about subjects C, D, H, and J, but correct about subjects A, B, E, and G. This means that for subjects C, D, H, and J, his/her/its response will change if the treatment changes. For the assignment that puts A, D, E, and H on the first treatment, and the other subjects on the second treatment, determine the response for each of the eight subjects.

5. Consider an unbalanced study with nine subjects, identified as A, B, C, D, E, G, H, J, and K. In the actual study, A, B, C, D, and E are assigned to the first treatment, and the remaining subjects are assigned to the second treatment. There are exactly five successes, and they are obtained by B, C, E, H, and J. This information is needed for parts (a)-(c) below.

(a) Compute the observed value of the test statistic.
(b) Assume that the Skeptic is correct. Determine the observed value of the test statistic for the assignment that places A, C, D, G, and K on the first treatment, and the remaining subjects on the second treatment.
(c) We have obtained the sampling distribution of the test statistic on the assumption that the Skeptic is correct. It is also possible to obtain a sampling distribution of the test statistic if the Skeptic is wrong, provided we specify exactly how the Skeptic is in error. These new sampling distributions are used in the study of statistical power, which is briefly described in Chapter 7 of the text. Assume that the Skeptic is incorrect about subjects C, D, E, G, H, J, and K, but correct about subjects A and B. This means that for subjects C, D, E, G, H, J, and K, his/her/its response will change if the treatment changes. For the assignment that puts A, E, G, H, and J on the first treatment, and the other subjects on the second treatment, determine the response for each of the nine subjects.

6. Sally performs a CRD with a dichotomous response and obtains the following data.

Treatment    S    F   Total
1            a    b    40
2            c    d    40
Total       14   66    80

Next, she obtains the sampling distribution of the test statistic for Fisher's test for her data; it is given below.

    x    P(X = x)  P(X ≤ x)  P(X ≥ x)
-0.30     0.0003    0.0003    1.0000
-0.25     0.0029    0.0032    0.9997
-0.20     0.0151    0.0184    0.9968
-0.15     0.0514    0.0697    0.9816
-0.10     0.1193    0.1890    0.9303
-0.05     0.1957    0.3848    0.8110
 0.00     0.2305    0.6152    0.6152
 0.05     0.1957    0.8110    0.3848
 0.10     0.1193    0.9303    0.1890
 0.15     0.0514    0.9816    0.0697
 0.20     0.0151    0.9968    0.0184
 0.25     0.0029    0.9997    0.0032
 0.30     0.0003    1.0000    0.0003

(a) Find the P-value for the second alternative (p1 < p2) if a = 4.
(b) Determine the P-value for the third alternative (p1 ≠ p2) if x = 0.25.
(c) Determine the value of x and the P-value that satisfy the following condition: The data are statistically significant but not highly statistically significant for the first alternative (p1 > p2).
(d) Assuming the Skeptic is correct, what is the largest possible value of the test statistic? Explain your answer.

7. Pam performs a CRD with a dichotomous response and obtains the following data.

Treatment    S    F   Total
1            a    b    45
2            c    d    45
Total       15   75    90

Next, she obtains the sampling distribution of the test statistic for Fisher's test for her data; it is given below.

    x     P(X = x)  P(X ≤ x)  P(X ≥ x)
-0.289     0.0002    0.0002    1.0000
-0.244     0.0016    0.0018    0.9998
-0.200     0.0089    0.0107    0.9982
-0.156     0.0330    0.0437    0.9893
-0.111     0.0851    0.1288    0.9563
-0.067     0.1576    0.2864    0.8712
-0.022     0.2136    0.5000    0.7136
 0.022     0.2136    0.7136    0.5000
 0.067     0.1576    0.8712    0.2864
 0.111     0.0851    0.9563    0.1288
 0.156     0.0330    0.9893    0.0437
 0.200     0.0089    0.9982    0.0107
 0.244     0.0016    0.9998    0.0018
 0.289     0.0002    1.0000    0.0002

(a) Find the P-value for the second alternative (p1 < p2) if a = 5.
(b) Determine the P-value for the first alternative (p1 > p2) if p̂2 = 0.022.
(c) Determine all values of x and the P-value that satisfy the following condition: The data are statistically significant but not highly statistically significant for the third alternative (p1 ≠ p2).

8. An unbalanced CRD has a total of 250 subjects, with 100 subjects on treatment 1. The total number of successes is 135, with 45 of the successes on the first treatment. Use the standard normal curve to obtain an approximate P-value for Fisher's test with the third alternative (≠).

9. An unbalanced CRD has a total of 280 subjects, with 100 subjects on treatment 1. The total number of successes is 153, with 45 of the successes on the first treatment. Use the standard normal curve to obtain an approximate P-value for Fisher's test with the third alternative (≠).

10. Below is the sampling distribution of the test statistic for Fisher's test for an unbalanced CRD.

   x    P(X = x)       x    P(X = x)
-0.8     0.0070       0.1    0.3916
-0.5     0.0932       0.4    0.1632
-0.2     0.3263       0.7    0.0186

I performed a simulation experiment with 5,350 runs. The frequencies of occurrence of the six values of the test statistic were obtained and then sorted: 34, 107, 482, 868, 1769, and 2090. Which of these six numbers is the observed frequency of x = 0.4? Briefly justify your answer.

11. An unbalanced CRD yields the data below.

Treatment    S    F   Total
1            7    3    10
2            3    2     5
Total       10    5    15

(a) Calculate the observed value of the test statistic.
(b) On the assumption the Skeptic is correct, list all possible values of the test statistic.
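The sampling distributions printed for Fisher's test in the questions above can be reproduced with a short script. The following is a minimal sketch, assuming the test statistic is x = a/n1 - c/n2 (the difference of the two sample proportions of successes) and that, when the Skeptic is correct, every assignment of subjects to treatments is equally likely, so the count a follows a hypergeometric distribution. The function name `fisher_distribution` is hypothetical and chosen only for this illustration.

```python
from math import comb

def fisher_distribution(n1, n2, m1):
    """Return a list of (a, x, P(X = x)) rows.

    n1, n2: numbers of subjects on treatments 1 and 2
    m1:     total number of successes in the study
    a:      number of successes on treatment 1 (so c = m1 - a on treatment 2)
    x:      assumed test statistic a/n1 - c/n2
    """
    denom = comb(n1 + n2, m1)
    rows = []
    for a in range(max(0, m1 - n2), min(n1, m1) + 1):
        c = m1 - a
        prob = comb(n1, a) * comb(n2, c) / denom  # hypergeometric probability
        rows.append((a, a / n1 - c / n2, prob))
    return rows

# Margins of question 6: 40 subjects per treatment, 14 successes in total.
rows = fisher_distribution(40, 40, 14)
p_at_zero = next(p for a, x, p in rows if a == 7)      # x = 0.00
p_lower_tail = sum(p for a, x, p in rows if a <= 4)    # P(X <= -0.15)
```

Under these assumptions the two computed values can be compared with the entries printed for x = 0.00 and x = -0.15 in the table of question 6.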

12. A balanced study has ten subjects: five men and five women. Given that there are 252 possible assignments of subjects to treatments, calculate the probability that all the men are assigned to the same (either) treatment.

13. An unbalanced CRD yields the data below.

Treatment    S    F   Total
1            9    6    15
2            7   13    20
Total       16   19    35

Bert wants to use a simulation experiment to approximate the sampling distribution of the test statistic for Fisher's test. The first run of his simulation experiment yields the table below. Briefly summarize what he has learned from this single run.

Treatment    S    F   Total
1            7    8    15
2            8   12    20
Total       15   20    35

14. Julia performs a balanced CRD and obtains the data given below.

Treatment    S    F   Total
1            7    3    10
2            3    7    10
Total       10   10    20

Julia performs a simulation experiment with 1000 runs to approximate the sampling distribution of the test statistic for Fisher's test. Her results are below.

   x    freq. of x       x    freq. of x
-0.8         2          0.0      333
-0.6        15          0.2      254
-0.4        71          0.4       71
-0.2       238          0.6       16

Use the results above and your general knowledge to answer the questions below.

(a) Based on the simulation experiment, approximate the value of P(X = 0.2).
(b) Based on the simulation experiment, approximate the value of P(X = 0.2).
(c) Which is larger, P(X = -0.2) or P(X = 0.2)? Explain your answer.
(d) Diana approximates P(X = 0.2) by 0.246. Do you think this is a good approximation? Explain.

Chapters 5-7

15. I select a random sample of size 180 from population 1 and obtain 63 successes. I select an independent random sample of size 300 from population 2 and obtain 87 successes.

(a) Compute the 95% confidence interval for p1, the proportion of successes in the first population.
(b) Compute the point estimate of p1 - p2.
(c) Evaluate the following expression for these data: (p̂1 q̂1)/n1 + (p̂2 q̂2)/n2.
(d) Nature knows that p1 = 0.32 and p2 = 0.26. Use your solutions above and Nature's knowledge to mark each of the following statements True or False; no explanations are needed.
The point estimate of p1 is correct.
The 95% confidence interval estimate of p1 is correct.
The point estimate of p1 - p2 is correct.

16. I select a random sample of size 240 from population 1 and obtain 144 successes. I select an independent random sample of size 320 from population 2 and obtain 176 successes.

(a) Compute the 95% confidence interval estimate of p1, the proportion of successes in the first population.
(b) Compute the 90% confidence interval estimate of p1 - p2.

(c) Nature knows that p1 = 0.62 and p2 = 0.57. Use your solutions above and Nature's knowledge to mark each of the following statements True or False; no explanations are needed.
The point estimate of p1 is correct.
The 95% confidence interval estimate of p1 is correct.
The point estimate of p1 - p2 is correct.

17. Agnes calculates a confidence interval for p1 - p2 and obtains [0.16, 0.44]. On the assumption that this interval is correct, that is, that it contains p1 - p2, mark each of the following statements T for definitely true, F for definitely false or M for might be true, might be false.
p1 - p2 = 0.20
p1 - p2 = 0.30
p1 > p2
p1 - p2 = 0.50

18. Angela calculates a confidence interval for p1 - p2 and obtains [0.12, 0.32]. On the assumption that this interval is correct, that is, that it contains p1 - p2, mark each of the following statements T for definitely true, F for definitely false or M for might be true, might be false.
p1 - p2 = 0
p1 - p2 = 0.10
p1 = p2 + 0.40
p1 = p2 - 0.10

19. Carol performs 100 dichotomous trials and obtains the following data.

Half      S    F   Total
First     a    b    50
Second    c    d    50
Total    34   66   100

For each statement below, determine any one value of c that makes the statement true for the data above. Be careful. Note that I am asking for the value of c. Also, note that for some statements, there might be more than one possible value of c that works.
A. There is evidence that Carol's ability improved during the course of the study.
B. There is no evidence that Carol's ability changed during the course of the study.
C. There is evidence that Carol's ability declined during the course of the study.

20. Barb performs 201 dichotomous trials and obtains the following data.

              Current
Previous    S     F   Total
S           a     b     40
F           c     d    160
Total      40   160    200

For each statement below, determine any one value of c that makes the statement true for the data above. Be careful. Note that I am asking for the value of c. Also, note that for some statements, there might be more than one possible value of c that works.
A. There is no evidence that the outcome of the previous trial has any influence on the outcome of the current trial.
B. There is evidence that Barb performs better after a success than after a failure.
C. There is evidence that Barb performs better after a failure than after a success.

21. Carl performs 200 dichotomous trials and obtains the following data.

Half       S    F   Total
First      a    b   100
Second     c    d   100
Total    134   66   200

For each statement below, determine any one value of c that makes the statement true for the data above. Be careful. Note that I am asking for the value of c. Also, note that for some statements, there might be more than one possible value of c that works.
A. There is evidence that Carl's ability improved during the course of the study.
B. There is no evidence that Carl's ability changed during the course of the study.
C. There is evidence that Carl's ability declined during the course of the study.

22. Edith performs 501 dichotomous trials and obtains the following data.

              Current
Previous    S     F   Total
S           a     b    150
F           c     d    350
Total     150   350    500

For each statement below, determine any one value of c that makes the statement true for the data above. Be careful. Note that I am asking for the value of c. Also, note that for some statements, there might be more than one possible value of c that works.
A. There is no evidence that the outcome of the previous trial has any influence on the outcome of the current trial.
B. There is evidence that Edith performs better after a success than after a failure.
C. There is evidence that Edith performs better after a failure than after a success.

23. On each of five days next week (Monday thru Friday), Alice will shoot four free throws. Assume that Alice's shots satisfy the assumptions of Bernoulli trials with p = 0.75.

(a) Compute the probability that Alice obtains a total of (exactly) three successes on any particular day. For future reference, if Alice obtains exactly three successes on a particular day, then we say that the event Brad has occurred.
(b) Compute the probability that Brad occurs exactly twice next week. For future reference, if Brad occurs exactly twice during a week, we say that the event Mel has occurred.
(c) Compute the probability that in the next four weeks, Mel occurs, then does not occur twice, then occurs, in that order.

24. On each of four days (Monday thru Thursday) every week for the next four weeks, Alex will shoot five free throws. Assume that Alex's shots satisfy the assumptions of Bernoulli trials with p = 0.45.

(a) Compute the probability that Alex obtains a total of (exactly) two successes on any particular day. For future reference, if Alex obtains exactly two successes on a particular day, then we say that the event Brad has occurred.
(b) Compute the probability that Brad occurs exactly once next week. For future reference, if Brad occurs exactly once during a week, we say that the event Mel has occurred.
(c) Compute the probability that in the next four weeks, Mel occurs, then does not occur twice, then occurs, in that order.

25. Don and Eric select independent random samples from the same population. Their sample sizes are different. Don uses his data to compute a b% confidence interval for p. Eric uses his data to compute c% and d% confidence intervals for p. The exact values of b, c, and d are unimportant, but you need to know that b < c < d. As a result of their computations, Don and Eric have three lower bounds and three upper bounds. Unfortunately, these six numbers became mixed up; their sorted values are below.

0.63 0.65 0.69 0.70 0.71 0.72

Given that 0.65 is one of the lower bounds and 0.69 is one of the upper bounds, reconstruct the

three confidence intervals and match each interval with its confidence level.

26. George and Henry select independent random samples from the same population. Their sample sizes are different. George uses his data to compute a b% confidence interval for p. Henry uses his data to compute c% and d% confidence intervals for p. The exact values of b, c, and d are unimportant, but you need to know that c < d. As a result of their computations, George and Henry have three lower bounds and three upper bounds. Unfortunately, these six numbers became mixed up; their sorted values are below.

0.30 0.35 0.37 0.42 0.43 0.45

Given that the three smallest of these numbers are lower bounds, determine the three confidence intervals and match each interval to its confidence level.

27. A box contains 10 red cards and six blue cards for a total of 16 cards. Wilbur is going to select n = 8 cards at random with replacement from the box. Let W denote the number of red cards that Wilbur obtains. Let X denote the number of blue cards that Wilbur obtains. Yolanda is going to select n = 8 cards at random without replacement from the box. Let Y denote the number of red cards that Yolanda obtains. Finally, let Z denote the number of blue cards that Yolanda obtains. You may use the fact that the probability histograms of the sampling distributions of W, X, Y and Z are pictured below. The number above each rectangle is its height, which also equals its area.

(a) Place an X next to the probability histogram of the sampling distribution of X and place a Y next to the probability histogram of the sampling distribution of Y.
(b) Let w represent the probability that Wilbur obtains a representative sample. What is the numerical value of w?
(c) Compute the probability that Yolanda obtains a sample that is either representative or misses being representative because it has one too many red cards.

[Figure: four probability histograms on a horizontal axis from 0 to 8. Rectangle heights appearing in the figure: .392, .282, .245, .235, .211, .112, .101, .056, .030, .023, .005, .003.]
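The distinction the card questions draw between sampling with and without replacement corresponds to two standard probability models: the binomial distribution (with replacement) and the hypergeometric distribution (without replacement). The sketch below, with hypothetical function names, shows how the rectangle heights for the number of red cards could be computed for the box of question 27; it is an illustration under those standard models, not a reproduction of the course's own method.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(K = k) for n independent draws with replacement, success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def hypergeometric_pmf(k, n, successes, population):
    """P(K = k) for n draws without replacement from a finite box.

    math.comb returns 0 when the lower index exceeds the upper one,
    so impossible counts automatically get probability 0.
    """
    failures = population - successes
    return comb(successes, k) * comb(failures, n - k) / comb(population, n)

# Box of question 27: 10 red and 6 blue cards, n = 8 cards drawn.
red_with_replacement = [binomial_pmf(k, 8, 10 / 16) for k in range(9)]
red_without_replacement = [hypergeometric_pmf(k, 8, 10, 16) for k in range(9)]

for k in range(9):
    print(k, round(red_with_replacement[k], 3), round(red_without_replacement[k], 3))
```

The number of blue cards in each case is 8 minus the number of red cards, so its histogram is the mirror image of the corresponding list.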

28. A box contains 10 red cards and five blue cards for a total of 15 cards. Walter is going to select n = 9 cards at random with replacement from the box. Let W denote the number of red cards that Walter obtains. Let X denote the number of blue cards that Walter obtains. Yvonne is going to select n = 9 cards at random without replacement from the box. Let Y denote the number of red cards that Yvonne obtains. Finally, let Z denote the number of blue cards that Yvonne obtains. You may use the fact that the probability histograms of the sampling distributions of W, X, Y and Z are pictured below. The number above each rectangle is its height, which also equals its area.

(a) Place a W next to the probability histogram of the sampling distribution of W and place a Z next to the probability histogram of the sampling distribution of Z.
(b) Let y represent the probability that Yvonne obtains a representative sample. What is the numerical value of y?
(c) Compute the probability that Walter obtains a sample that is either representative or misses being representative because it has one too many red cards.

[Figure: four probability histograms on a horizontal axis from 0 to 9. Rectangle heights appearing in the figure: .420, .273, .252, .240, .234, .205, .117, .102, .045, .042, .034, .026, .007, .002, .001.]

29. I cast my round-cornered die 1,000 times and obtained the number six 240 times. I plan to cast the die an additional 500 times. Assume that successive casts are Bernoulli trials.

(a) Compute the point prediction of the number of times the die yields a six in the additional 500 casts.
(b) Calculate the 90% prediction interval for the number of times the die yields a six in the additional 500 casts.
(c) I perform the additional 500 casts and obtain the number six a total of 131 times. Comment on your answers to parts (a) and (b) above.

30. Bert enjoys the dart game Cricket. To prepare for Cricket, he practices by aiming at the region of the dart board marked 20. (For the purpose of this exercise, we won't distinguish between single, double and triple twenties.) In 220 throws, Bert hits a 20 a total of 77 times. Assume that successive throws are Bernoulli trials.

(a) Compute the point prediction of the number of times Bert hits a 20 in his next 140 throws.
(b) Calculate the 90% prediction interval for the number of times Bert hits a 20 in his next 140 throws.

(c) Bert performs the additional 140 throws and hits a 20 a total of 62 times. Comment on your answers to parts (a) and (b) above.

31. The Course Notes presented examples of the possible effects of background (lurking) variables. I stated in lecture that Case 4 on that page is an example of Simpson's Paradox. For this exercise, suppose that the original table is:

                 Outcome
Gender    Released   Not released   Total
Female       50           50         100
Male         56           44         100
Total       106           94         200

Below are three possibilities for the results obtained by expanding the above data to take into account type of job.

Possibility 1

Job A            Outcome
Gender    Released   Not released   Total
Female       16            4          20
Male         49           21          70
Total        65           25          90

Job B            Outcome
Gender    Released   Not released   Total
Female       34           46          80
Male          7           23          30
Total        41           69         110

Possibility 2

Job A            Outcome
Gender    Released   Not released   Total
Female       27           23          50
Male         21           29          50
Total        48           52         100

Job B            Outcome
Gender    Released   Not released   Total
Female       24           26          50
Male         34           16          50
Total        58           42         100

Possibility 3

Job A            Outcome
Gender    Released   Not released   Total
Female       20           20          40
Male         28           12          40
Total        48           32          80

Job B            Outcome
Gender    Released   Not released   Total
Female       30           30          60
Male         28           32          60
Total        58           62         120

Match each possibility (1, 2, and 3) with one of the statements below. Note: Perhaps each statement is used once, perhaps not.

(a) The expansion is not possible; it is inconsistent with the original table.
(b) The expansion is possible and it is an example of Simpson's Paradox.
(c) The expansion is possible but it is not an example of Simpson's Paradox.

Chapters 15, 16, 8, and 13

32. Below is the table of population counts for a condition and its screening test. (Recall that A means the condition is present and B means the screening test is positive.)

          B     Bc    Total
A        190    10     200
Ac        50   750     800
Total    240   760    1000

(a) What proportion of the population has the condition?
(b) What proportion of the population has the condition and would test positive?
(c) Of those who have the condition, what proportion would test negative?
(d) Of those who would test positive, what proportion is free of the condition?
(e) What proportion of the population has the condition or would test positive?

33. Below is the table of population counts for a condition and its screening test. (Recall that A means the condition is present and B means the screening test is positive.)

          B     Bc    Total
A         49     7      56
Ac        14   210     224
Total     63   217     280

(a) What proportion of the population has the condition?
(b) What proportion of the population has the condition and would test positive?
(c) Of those who have the condition, what proportion would test negative?
(d) Of those who would test negative, what proportion is free of the condition?
(e) What proportion of the population has the condition or would test positive?

34. Casey (my dog) is looking out the window. There is a 20% chance that she will see one or more squirrels in the next 10 minutes. Given that she sees one or more squirrels, there is a 90% chance that Casey will bark during the time period. Given that she sees no squirrels, there is a 35% chance that Casey will bark during the time period.

(a) What is the probability that Casey will bark during the next 10 minutes?
(b) Given that Casey barks during the next 10 minutes, what is the probability that she saw one or more squirrels?

35. Bailey (my dog) is looking out the window. There is a 24% chance that he will see one or more squirrels in the next 10 minutes. Given that he sees one or more squirrels, there is an 87.5% chance that Bailey will bark during the time period. Given that he sees no squirrels, there is a 25% chance that Bailey will bark during the time period.

(a) What is the probability that Bailey will bark during the next 10 minutes?
(b) Given that Bailey barks during the next 10 minutes, what is the probability that he saw one or more squirrels?

36. Below is a scatterplot of 10 cases, with one case denoted by the letter A, and one case denoted by B.

[Figure: scatterplot of 10 cases; eight are plotted as O, one as A and one as B.]

The correlation coefficient for these ten cases is r = 0.622. Consider three new data sets:
Set 1: the nine cases that remain after deleting case A.
Set 2: the nine cases that remain after deleting case B.
Set 3: the eight cases that remain after deleting cases A and B.
The correlation coefficients for these three data sets are: r = 0.153, r = 0.352 and r = 0.810. Match each data set with its correlation coefficient.

37. Below is a scatterplot of nine cases, with one case denoted by the letter A, and one case denoted by B.

[Figure: scatterplot of nine cases; seven are plotted as O, one as A and one as B.]

The correlation coefficient for these nine cases is r = 0.670. Consider three new data sets:
Set 1: the eight cases that remain after deleting case A.
Set 2: the eight cases that remain after deleting case B.
Set 3: the seven cases that remain after deleting cases A and B.
The correlation coefficients for these three data sets are: r = 0.466, r = 0.806 and r = 0.916. Match each data set with its correlation coefficient.
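The two scatterplot questions turn on how much a single case can move the correlation coefficient. Because the plotted coordinates are not available here, the sketch below uses entirely hypothetical data (the list `cloud` and the point `case_a` are invented for illustration) to show the effect of deleting one isolated case on Pearson's r.

```python
from math import sqrt

def pearson_r(points):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a loose cloud of eight cases plus one isolated case A
# far to the upper right of the cloud.
cloud = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 5), (6, 3), (7, 4), (8, 2)]
case_a = (15, 10)

r_with_a = pearson_r(cloud + [case_a])
r_without_a = pearson_r(cloud)
print(round(r_with_a, 3), round(r_without_a, 3))
```

In this invented example the isolated case lies along the upward direction of the cloud, so removing it lowers r considerably; a case lying off that direction would have the opposite effect.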

38. Independent random samples are selected from two populations. Below are selected summary statistics.

Pop.    Mean    Stand. Dev.   Sample size
1       38.00      8.00           15
2       32.00      6.00            9

(a) Construct the 95% confidence interval for μX - μY.
(b) Obtain the P-value for the alternative μX ≠ μY.

39. Independent random samples are selected from two populations. Below are selected summary statistics.

Pop.    Mean    Stand. Dev.   Sample size
1       28.00      6.00            6
2       22.00      4.00           12

(a) Construct the 90% confidence interval for μX - μY.
(b) Obtain the P-value for the alternative μX ≠ μY. Make sure to specify which reference curve you use.

40. A random sample of size 16 is selected from a pdf. The sorted data are below.

17.9 29.1 42.3
22.0 30.6 46.0
22.1 36.1 49.0
24.2 37.5 56.1
24.9 38.7
25.6 40.3

(Hint: x̄ = 33.90 and s = 11.06.)

(a) Construct a confidence interval for the median of the population. Select the confidence level and remember to report it with your answer.
(b) Construct Gosset's 90% confidence interval for the mean of the population.

41. A random sample of size 16 is selected from a pdf. The sorted data are below.

34.4 54.1  73.2
41.6 65.5  78.4
41.6 67.6  79.9
42.3 70.5 101.0
43.7 70.7
49.9 71.9

(Hint: x̄ = 61.64 and s = 18.35.)

(a) Construct a confidence interval for the median of the population. Select the confidence level and remember to report it with your answer.
(b) Construct Gosset's 95% confidence interval for the mean of the population.

42. Sixty students take two midterm exams. The scores on the first midterm have a mean of 55.00 and a standard deviation of 10.00. The scores on the second midterm have a mean of 63.00 and a standard deviation of 15.00. The correlation coefficient of the set of two scores is 0.62. One of these students, Barbara, scored 45 on the first midterm and 65 on the second midterm.

(a) Obtain the regression line for using the first midterm score to predict the second midterm score.
(b) Use your answer to part (a) to obtain a predicted score for Barbara.
(c) Calculate Barbara's residual.

43. Sixty students take two midterm exams. The scores on the first midterm have a mean of 65.00 and a standard deviation of 12.00. The scores on the second midterm have a mean of 50.00 and a standard deviation of 6.00. The correlation coefficient of the set of two scores is 0.46. One of these students, Bill, scored 71 on the first midterm and 47 on the second midterm.

(a) Obtain the regression line for using the second midterm score to predict the first midterm score.
(b) Use your answer to part (a) to obtain a predicted score for Bill.
(c) Calculate Bill's residual.

44. Each of 18 researchers selects a random sample from a given population and calculates a confidence interval for the mean. The 18 lower bounds are below.

59.0 73.3  99.1
60.9 76.2 101.6
63.4 87.9 103.7
70.7 88.2 104.2
71.1 89.7 118.7
72.1 94.0 121.9

Given that exactly four of these 18 intervals are too large (i.e. all numbers in the interval are larger than μ), what can you say about the value of μ?

45. Each of 18 researchers selects a random sample from a given population and calculates a confidence interval for the mean. The 18 upper bounds are below.

39.0 59.0 73.3
39.8 60.9 76.2
43.3 63.4 87.9
45.4 70.7 88.2
55.2 71.1 89.7
57.9 72.1 94.0

Given that exactly three of these 18 intervals are too small (i.e. all numbers in the interval are smaller than μ), what can you say about the value of μ?

46. Below is a coordinate system with the regression line ŷ = 1 + 0.5x.

[Figure: coordinate grid with the regression line drawn; vertical axis marked from 0 to 6.]

(a) Locate the point that has x = 8 and y = 2; put an A at that point.
(b) Locate the point that has x = 4 and e = 2; put a B at that point.
(c) Locate the point that has y = 6 and e = 3; put a C at that point.
(d) Draw the line that represents all points for which e = +1.
(e) Given that x = 5, what is the value of ŷ?

47. Below is a coordinate system with the regression line ŷ = 0.5x.

[Figure: coordinate grid with the regression line drawn; vertical axis marked from 0 to 5.]

(a) Locate the point that has x = 4 and y = 3; put an A at that point.
(b) Locate the point that has x = 5 and e = 2; put a B at that point.
(c) Locate the point that has y = 2 and e = 3; put a C at that point.
(d) Draw the line that represents all points for which e = +1.
(e) Given that x = 6, what is the value of ŷ?

48. Below is a pdf.

[Figure: graph of the pdf.]

Carly performs a simulation study of this population with 10,000 runs. Below is the description of each run of the study.

A random sample of size 5 is selected from the pdf. The values W and V are calculated, where W is the minimum of the five numbers and V is the maximum of the five numbers. (For example, the first sample generated yielded: 5.94, 6.49, 1.91, 3.06, and 6.97; for this sample, W = 1.91 and V = 6.97.)

(a) The number of runs that yields a value of W larger than 5 is one of the counts below. Which one is it? Explain your answer.

2 28 106 285 749

(b) The number of runs that yields a value of V smaller than 5 is one of the counts below. Which one is it? Explain your answer.

7 30 115 315 812

49. Below is a pdf.

[Figure: graph of the pdf.]

Bert performs a simulation study of this population. Below is the description of each run of the study.

A random sample of size 6 is selected from the pdf. The values of x̄ and s are obtained. The values L1 = x̄ - (2.015s)/√6 and U1 = x̄ + (2.015s)/√6 are calculated. The values L2 and U2 are calculated, where L2 is the minimum of the six numbers and U2 is the maximum of the six numbers. One thousand runs are performed.

(a) Note that because the pdf is symmetric, its mean equals its median. What is this common value?
(b) Thirteen of the values of an L are larger/smaller (circle one) than the mean of the pdf. Forty-two of the values of the other L are larger/smaller (circle one) than the mean of the pdf. Fifteen of the values of a U are larger/smaller (circle one) than the mean of the pdf. Fifty of the values of the other U are larger/smaller (circle one) than the mean of the pdf. Explain what can be learned from these results. Be specific. In the course of answering this question, you should explain your answers above (i.e. the circlings of larger or smaller).

50. A regression analysis yields x̄ = 40 and ȳ = 80. In addition, one of the subjects, Sally, has x = 50, y = 65 and e = 5. Determine the equation of the regression line.

51. A regression analysis yields x̄ = 50 and ȳ = 100. In addition, one of the subjects, Sally, has x = 75, y = 45 and e = 5. Determine the equation of the regression line.
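The regression questions (42, 43, 50 and 51) all rest on two facts about the least-squares line: its slope equals r times the ratio of the standard deviations, and it passes through the point of means. The sketch below applies those facts to the summary statistics printed in question 42, taking r = 0.62 as printed; the function name `regression_line` is hypothetical and used only for this illustration.

```python
def regression_line(x_bar, s_x, y_bar, s_y, r):
    """Return (intercept, slope) of the least-squares line for predicting y from x."""
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar  # the line passes through (x_bar, y_bar)
    return intercept, slope

# Summary statistics as printed in question 42:
# x = first midterm score, y = second midterm score.
intercept, slope = regression_line(x_bar=55.0, s_x=10.0, y_bar=63.0, s_y=15.0, r=0.62)

x_observed, y_observed = 45.0, 65.0       # Barbara's two scores
predicted = intercept + slope * x_observed
residual = y_observed - predicted          # e = y - (predicted y)
print(intercept, slope, predicted, residual)
```

The same function handles question 43 if the roles of the two midterms are swapped, since there the second score is used to predict the first.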

52. Bart performs three simulation studies. The population is substantially skewed to the right with μ = 100. For one study he has his computer generate 10,000 random samples of size n = 10 from his population. For each random sample, the computer calculates the Gosset 95% confidence interval for μ and checks to see whether the interval is correct. His second study is like his first, but n = 100. Finally, his third study is like the first, but n = 200. In one of his studies, Bart obtains 8,644 correct intervals; in another he obtains 9,483 correct intervals; and in the remaining study he obtains 9,509 correct intervals. Match each number of correct intervals to its study. To obtain credit you must explain your answer.

53. Matt performs two simulation studies. The population is a normal curve with μ = 100. For one study he has his computer generate 10,000 random samples of size n = 10 from his population. For each random sample, the computer calculates the Gosset 95% confidence interval for μ and checks to see whether the interval is correct. His second study is like his first, but n = 100. In one of his studies, Matt obtains 9,490 correct intervals and in the other he obtains 9,517 correct intervals. Match each number of correct intervals to its study. To obtain credit you must explain your answer.

54. Alma, Betty, and Chloe each calculate an 80% confidence interval for the mean μ of a population. Their intervals are Alma: [29, 36]; Betty: [32, 38]; and Chloe: [33, 42]. (Hint: For each of the questions below, the correct answer is not any of the above three intervals.)

(a) Suppose Nature announces, "All three intervals are correct." Given this information, what is the smallest interval known to contain μ?

(b) Suppose Nature announces, "The intervals of Alma and Betty are correct, but Chloe's is incorrect." Given this information, what is the smallest interval known to contain μ?

(c) Suppose Nature announces, "The intervals of Alma and Chloe are correct, but Betty's is incorrect." Given this information, what is the smallest interval known to contain μ?

Solutions

1. (a) First, create the following table.

Class Interval   Freq.   Rel. Freq.   Density
0.0–0.5          17      0.34         0.68
0.5–1.0           8      0.16         0.32
1.0–2.0          15      0.30         0.30
2.0–4.5          10      0.20         0.08

[Figure: density scale histogram with rectangle heights 0.68, 0.32, 0.30, and 0.08; horizontal axis from 0.0 to 4.5.]

(b) The first quartile is 0.42. The third quartile is 1.86. The IQR is 1.86 − 0.42 = 1.44.

(c) The interval is [0.15, 2.33]. By counting, 39 values are in this interval; this gives a proportion of 39/50 = 0.78. The comment should compare this to the 68% of the empirical rule. It would be nice if you said that the higher proportion reflects the skewness in the picture in part (a).
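The density column in Solution 1(a) is each relative frequency divided by the width of its class interval. A minimal Python sketch of that arithmetic follows; the function name `density_heights` is an illustrative choice, and the frequencies are copied from the table above rather than recounted from the raw data.

```python
# Class intervals and frequencies copied from the table in Solution 1(a).
intervals = [(0.0, 0.5), (0.5, 1.0), (1.0, 2.0), (2.0, 4.5)]
freqs = [17, 8, 15, 10]


def density_heights(intervals, freqs):
    """Return a (relative frequency, density) pair for each class interval."""
    total = sum(freqs)
    rows = []
    for (lo, hi), f in zip(intervals, freqs):
        rel = f / total                      # relative frequency
        rows.append((rel, rel / (hi - lo)))  # density = rel. freq. / width
    return rows


rows = density_heights(intervals, freqs)
for (lo, hi), (rel, dens) in zip(intervals, rows):
    print(f"{lo:.1f}-{hi:.1f}  rel. freq. = {rel:.2f}  density = {dens:.2f}")
```

Because the heights are densities, the rectangle areas (height times width) add up to 1, which is a quick check on a hand-drawn histogram.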

2. (a) First, we create the relative frequency table.

Class Interval   Freq.   Rel. Freq.   Density
0.00–0.50        11      0.22         0.44
0.50–2.50         8      0.16         0.08
2.50–4.50        19      0.38         0.19
4.50–5.00        12      0.24         0.48

[Figure: density scale histogram with rectangle heights 0.44, 0.08, 0.19, and 0.48; horizontal axis from 0 to 5.0.]

(b) Q1 = x(13) = 0.93, Q3 = x(38) = 4.45; thus, IQR = 4.45 − 0.93 = 3.52.

(c) First, x̄ − s = 2.76 − 1.78 = 0.98 and x̄ + s = 2.76 + 1.78 = 4.54. There are 25 values between 0.98 and 4.54 (starting with 0.99 and up to 4.45). The proportion between is 25/50 = 0.50. This proportion is considerably below the 0.68 predicted by the empirical rule. This is not surprising because the histogram (part (a)) is far from bell-shaped.

3. Use the empirical rule for interpreting s. Clearly, 5 and 10 are too small, and 40 is too large. The value s = 20 looks reasonable.

4. (a) x = 3/4 − 1/4 = 0.50.

(b) x = 1/4 − 3/4 = −0.50.

(c) Neither A, D, G, nor J will have its response change b/c its treatment does not change. Neither B nor E will have its response change b/c the Skeptic is correct about them. Each of the two remaining subjects, C and H, will have its response change b/c its treatment changed and the Skeptic was incorrect about it. As a result, A and B will yield S, and the others will yield F.

5. (a) x = 3/5 − 2/4 = 0.10.

(b) x = 1/5 − 4/4 = −0.80.

(c) We must refer back to the actual outcome, which is given at the beginning of the question. The responses of A, E, and K will not change b/c their treatments did not change. The response of B will not change b/c the Skeptic is correct about it. For the remaining subjects, C, D, G, H, and J, the responses will change. Thus, B, D, E, and G will yield successes and the remaining subjects will yield failures.

6. (a) For a = 4, x = 4/40 − 10/40 = 0.10 − 0.25 = −0.15. The P-value is P(X ≤ −0.15) = 0.0697.

(b) The P-value is P(X ≥ 0.25) + P(X ≤ −0.25) = 0.0032 + 0.0032 = 0.0064.

(c) The P-value is P(X ≥ x), which is in the fourth column of the table. In addition, it is between 0.01 and 0.05; the only such entry is 0.0184 for x = 0.20. Thus, the P-value is 0.0184 and x = 0.20.

(d) The largest x comes from the largest a, which is a = 14, yielding x = 0.35 − 0.00 = 0.35.

7. (a) Given a = 5, x = 5/45 − 10/45 = −0.111. Thus, the P-value is P(X ≤ −0.111) = 0.1288.

(b) First, 0.022 = p̂2 = c/45, which implies that c = 0.022(45) = 1. (Remember, c must be an integer.) Thus, a = 14 and x = 14/45 − 1/45 = 0.289. The P-value is P(X ≥ 0.289) = 0.0002.

(c) The P-value is P(X ≥ |x|) + P(X ≤ −|x|). The first of these probabilities is in column 4 of the table, the second is in column 3, and they are the same value. In addition, this sum is between 0.01 and 0.05. The only entry that satisfies these conditions is 0.0107. Therefore, the P-value equals 2(0.0107) = 0.0214, and |x| = 0.200. Thus, x = 0.200.

8. First, use the data to construct the following table.

Tr.      S     F     Total
1        45    55    100
2        90    60    150
Total    135   115   250

Next, x = 45/100 − 90/150 = −0.15,

σ = √[135(115) / (100(150)(249))] = 0.0645,

and z = −0.15/0.0645 = −2.33. For the third alternative, calculate |z| = 2.33; the area under the standard normal curve (snc) to the right of 2.33 is 0.0099, and the approximate P-value is 2(0.0099) = 0.0198.
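The standardization used in Solutions 8 and 9 can be checked with a few lines of Python. This is a sketch under the assumption that the formula is the one shown in those solutions, σ = √[m1·m2 / (n1·n2·(n − 1))], with m1 and m2 the total numbers of successes and failures; the function name `two_sample_z` is hypothetical, and the normal tail area is computed with `math.erfc` instead of being read from a table.

```python
import math


def two_sample_z(a, n1, c, n2):
    """Return (x, sigma, z, two-sided approximate P-value) for a 2x2 table.

    a and c are the numbers of successes on treatments 1 and 2;
    n1 and n2 are the numbers of subjects on each treatment.
    """
    n = n1 + n2
    m1 = a + c        # total successes
    m2 = n - m1       # total failures
    x = a / n1 - c / n2
    sigma = math.sqrt(m1 * m2 / (n1 * n2 * (n - 1)))
    z = x / sigma
    # erfc(|z| / sqrt(2)) is the standard normal area beyond |z| in both tails.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return x, sigma, z, p_two_sided


# Data from Solution 8: 45 of 100 successes versus 90 of 150.
x, sigma, z, p = two_sample_z(45, 100, 90, 150)
print(round(x, 2), round(sigma, 4), round(z, 2), round(p, 4))
```

The P-value computed this way differs slightly from 0.0198 because the solution rounds z to 2.33 before using the table.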

9. The data yield the following table.

Treatment   S     F     Total
1           45    55    100
2           108   72    180
Total       153   127   280

Thus, x = 45/100 − 108/180 = 0.45 − 0.60 = −0.15. Also,

σ = √[153(127) / (100(180)(279))] = 0.0622,

yielding z = −0.15/0.0622 = −2.41. Next, |z| = 2.41 and the area under the snc to the right of 2.41 is 0.0080. The approximate P-value is 2(0.0080) = 0.0160.

10. Apply the LRRFIOP (page 60 of text). x = 0.4 has the third largest probability in the table so it should have the third largest relative frequency, 868.

11. (a) x = 0.7 − 0.6 = 0.1.

(b) The smallest possible value of a is 5, which yields x = 0.5 − 1 = −0.5. Next, a = 6 yields x = −0.2. Thus, successive values of x differ by 0.3. The largest a, 10, yields x = 1. Therefore, the possible values of x are: −0.5, −0.2, 0.1, 0.4, 0.7, and 1.

12. The answer is 2/252. There is one assignment that puts all of the men on treatment one, and one assignment that puts all of the men on treatment two.

13. He has learned that his computer program contains an error b/c the values of m1 and m2 changed.

14. (a) We approximate the probability by its relative frequency, 0.238.

(b) We approximate the probability by its relative frequency, 0.254.

(c) These probabilities are equal b/c the study is balanced.

(d) Because the probabilities are equal (part (c)), it makes sense to average our two estimates from (a) and (b) to obtain (238 + 254)/2000 = 0.246.

15. (a) p̂1 = 63/180 = 0.35. The 95% confidence interval is

0.35 ± 1.96 √[0.35(0.65)/180] = 0.350 ± 1.96(0.0356) = 0.350 ± 0.070 = [0.280, 0.420].

(b) The point estimate is p̂1 − p̂2 = 0.35 − 87/300 = 0.06.

(c) √[p̂1q̂1/n1 + p̂2q̂2/n2] = √[0.35(0.65)/180 + 0.29(0.71)/300] = √(0.001264 + 0.000686) = √0.00195 = 0.0442.

(d) The first statement is false because p̂1 = 0.35 ≠ 0.32 = p1. The second statement is true because 0.32 is in the interval. The third statement is true because p1 − p2 = 0.32 − 0.26 = 0.06 = p̂1 − p̂2.

16. (a) First, p̂1 = 144/240 = 0.600. The 95% confidence interval estimate of p1 is

0.600 ± 1.96 √[0.6(0.4)/240] = 0.600 ± 0.062 = [0.538, 0.662].

(b) First, p̂2 = 176/320 = 0.550. The 90% confidence interval estimate of p1 − p2 is

(0.600 − 0.550) ± 1.645 √[0.6(0.4)/240 + 0.55(0.45)/320] = 0.050 ± 0.069 = [−0.019, 0.119].

(c) "The point estimate of p1 is correct." This is false b/c 0.62 ≠ 0.60.
"The 95% confidence interval estimate of p1 is correct." This is true b/c 0.62 is in the CI.
"The point estimate of p1 − p2 is correct." This is true b/c 0.05 = 0.05.
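The interval arithmetic in Solutions 15 and 16 follows one pattern: estimate, plus or minus a table multiplier times the estimated standard error. The sketch below reproduces it in Python; the function name `diff_proportion_ci` is an illustrative choice, and the multiplier (1.645 for 90%) is supplied by the caller from the normal table rather than computed.

```python
import math


def diff_proportion_ci(x1, n1, x2, n2, z_star):
    """Return (estimate, standard error, lower, upper) for p1 - p2.

    x1 and x2 are success counts; z_star is the table multiplier
    for the desired confidence level.
    """
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    est = p1 - p2
    half = z_star * se
    return est, se, est - half, est + half


# Solution 16(b): 144 of 240 versus 176 of 320, 90% confidence.
est, se, lo, hi = diff_proportion_ci(144, 240, 176, 320, 1.645)
print(round(est, 3), round(lo, 3), round(hi, 3))
```

Calling the same function with the counts from Solution 15 (63 of 180 and 87 of 300) returns the standard error computed in part (c) of that solution.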

17. p1 − p2 = 0.20. M; 0.20 is in the CI; thus, it might be the value of p1 − p2.
p1 > p2. T; all numbers in the CI are positive, so this inequality must be true.
p̂1 − p̂2 = 0.30. T; 0.30 is the center of the CI; thus, it equals p̂1 − p̂2.
p1 − p2 = 0.50. F; 0.50 is not in the CI; thus, it cannot be the value of p1 − p2.

18. p1 − p2 = 0. M; 0 is in the CI; thus, it might be the value of p1 − p2.
p1 − p2 = 0.40. F; 0.40 is not in the CI; thus, it cannot be the value of p1 − p2.
p̂1 − p̂2 = 0.10. T; 0.10 is the center of the CI; thus, it equals p̂1 − p̂2.
p1 − p2 = 0.10. M; 0.10 is in the CI; thus, it might be the value of p1 − p2.

19. A. We need a/50 < c/50, which implies that a < c. Any c = 18, 19, . . . 34 will work.
B. We need a/50 = c/50, which implies that a = c. Thus, c = 17.
C. We need a/50 > c/50, which implies that a > c. Any c = 0, 1, . . . 16 will work.

20. A. First, note that m1/n = 40/200 = 0.20. Thus, we need c/160 = 0.20, which gives c = 32.
B. We need a/40 > c/160 or 4a > c. Any c = 0, 1, . . . 31 will work.
C. We need a/40 < c/160 or 4a < c. Any c = 33, 34, . . . 40 will work.

21. A. We need a/100 < c/100, which implies that a < c. Any c = 68, 69, . . . 100 will work.
B. We need a/100 = c/100, which implies that a = c. Thus, c = 67.
C. We need a/100 > c/100, which implies that a > c. Any c = 34, 35, . . . 66 will work.

22. A. First, note that m1/n = 150/500 = 0.30. Thus, we need c/350 = 0.30, which gives c = 105.
B. We need a/150 > c/350 or 7a > 3c. Any c = 0, 1, . . . 104 will work.
C. We need a/150 < c/350 or 7a < 3c. Any c = 106, 107, . . . 150 will work.

23. (a) Each shot is a BT. Thus, the total number of successes on any given day will have a binomial distribution. The probability that a Brad occurs on any given day is

P(X = 3) = [4!/(3!1!)] (0.75)^3 (0.25) = 4(0.4219)(0.25) = 0.4219.

(b) Now we are focusing on days as the BTs. A day is a S if it yields a Brad. From part (a) the probability of a success is 0.4219. Like part (a), this is a binomial problem.

P(X = 2) = [5!/(2!3!)] (0.4219)^2 (0.5781)^3 = 10(0.1780)(0.1932) = 0.3439.

(c) Now we are focusing on weeks as the BTs. A week is a success if it yields a Mel. Unlike (a) and (b), this is a multiplication rule problem.

P(SFFS) = (0.3439)^2 (0.6561)^2 = 0.0509.

24. (a) Each shot is a BT. Thus, the total number of successes on any given day will have a binomial distribution. The probability that a Brad occurs on any given day is

P(X = 2) = [5!/(2!3!)] (0.45)^2 (0.55)^3 = 10(0.2025)(0.1664) = 0.3370.

(b) Now we are focusing on days as the BTs. A day is a S if it yields a Brad. From part (a) the probability of a success is 0.3370. Like part (a), this is a binomial problem.

P(X = 1) = [4!/(1!3!)] (0.337)(0.663)^3 = 4(0.337)(0.2914) = 0.3928.

(c) Now we are focusing on weeks as the BTs. A week is a success if it yields a Mel. Unlike (a) and (b), this is a multiplication rule problem.

P(SFFS) = (0.3928)^2 (0.6072)^2 = 0.0569.

25. Eric's intervals will have the same center; Don's interval will have a possibly different center. First, we need to find a way to create the intervals that satisfy this requirement. We know that 0.63 and 0.65 are lower bounds, while 0.69 and 0.72 are upper bounds. We don't know which of 0.70 and 0.71 is a lower bound. We will consider cases.

First, suppose that 0.71 is a lower bound. Then [0.71, 0.72] is an interval. The other two intervals are [0.63, 0.69] and [0.65, 0.70], or [0.63, 0.70] and [0.65, 0.69]. Unfortunately, the centers don't work for either of these. Thus, our premise that 0.71 is a lower bound is wrong.

Thus, 0.71 is an upper bound. The lower bounds are 0.63, 0.65, and 0.70. The upper bounds are 0.69, 0.71, and 0.72. The bound 0.70 must be matched with 0.71 or 0.72. If it is with 0.71, the centers don't work (check it). Thus, [0.70, 0.72] is a CI. This means we must match the lower bounds 0.63 and 0.65 with the upper bounds 0.69 and 0.71. The only way to do this and make the centers work is [0.65, 0.69] and [0.63, 0.71]. The wider of these is d%, the narrower c%, and [0.70, 0.72] is b%.

26. The first fact to use is that the center of the CI is at p̂. As a result, two of the CIs must have the same center. Trial and error leads to the following three CIs.

CI              Center
[0.30, 0.42]    0.36
[0.35, 0.45]    0.40
[0.37, 0.43]    0.40

The first of these intervals belongs to George. The other two intervals are Henry's. Next, use the fact that for a given set of data, the larger the level, the wider the interval. Thus, the second of these intervals is for d% and the third is for c%.

27. (a) I will match each of W, X, Y, and Z with its picture. First, we list our facts.
B/c of sampling without replacement, it is impossible for Y to be smaller than 2.
B/c of sampling without replacement, it is impossible for Z to be larger than 6.
The sampling distribution of W is Bin(8, 0.625).
The sampling distribution of X is Bin(8, 0.375).
Thus, the third picture is for Y, the second picture is for Z, the first picture is for W and the fourth picture is for X.

(b) Wilbur must obtain exactly five red cards and three blue cards. From picture 4 (or 1) we see that the probability of this event is 0.282.

(c) Yolanda must obtain either 5 or 6 red cards. From picture 3, the probability is 0.392 + 0.245 = 0.637.

28. (a) I will match each of W, X, Y, and Z with its picture. First, we list our facts.
B/c of sampling without replacement, it is impossible for Y to be smaller than 4.
B/c of sampling without replacement, it is impossible for Z to be larger than 5.
The sampling distribution of W is Bin(8, 2/3).
The sampling distribution of X is Bin(8, 1/3).
Thus, the first picture is for Y, the second picture is for Z, the fourth picture is for W and the third picture is for X.

(b) Yvonne must obtain exactly six red cards and three blue cards. From picture 1 (or 2) we see that the probability of this event is 0.420.

(c) Walter must obtain either 6 or 7 red cards. From picture 4, the probability is 0.273 + 0.234 = 0.507.

29. (a) First, p̂ = 240/1000 = 0.24. The point prediction is mp̂ = 500(0.24) = 120.

(b) 120 ± 1.645 √[120(0.76)] √(1 + 500/1000) = 120 ± 1.645(9.55)(1.225) = 120 ± 19.2 = [101, 139].

(c) The point prediction was too small by 11, but the prediction interval was correct.

30. (a) B/c p is unknown, use the data to estimate it; namely, p̂ = 77/220 = 0.35. The point prediction is mp̂ = 140(0.35) = 49.

(b) The 90% prediction interval is

49 ± 1.645 √[49(0.65)] √(1 + (140/220)) = 49 ± 11.88 = [37, 61], after rounding.

(c) The point prediction is much too small. The prediction interval is incorrect too, but just barely.

31. In the original table, p̂1 = 0.50 and p̂2 = 0.56. Possibility (1) is an example of Simpson's Paradox b/c p̂1 > p̂2 in both component tables. Possibility (2) is impossible b/c, for example, the a's in the component tables add to 51, not the required 50. Possibility (3) is not an example of Simpson's Paradox b/c p̂1 < p̂2 for Job A.

32. (a) 200/1000 = 0.2. (b) 190/1000 = 0.19. (c) 10/200 = 0.05. (d) 50/240 = 0.208. (e) 250/1000 = 0.25.

33. (a) 56/280 = 0.2. (b) 49/280 = 0.175. (c) 7/56 = 0.125. (d) 210/217 = 0.968. (e) 70/280 = 0.25.

34. The given information leads to the following table, where A denotes Casey seeing the squirrel and B denotes Casey barking.

         B      Bc     Total
A        0.18   0.02   0.20
Ac       0.28   0.52   0.80
Total    0.46   0.54   1.00

(a) Reading from the above table, the probability is 0.46.

(b) Reading from the above table, P(A|B) = 0.18/0.46 = 0.391.

35. The given information leads to the following table, where A denotes Bailey seeing the squirrel and B denotes Bailey barking.

         B      Bc     Total
A        0.21   0.03   0.24
Ac       0.19   0.57   0.76
Total    0.40   0.60   1.00

(a) Reading from the above table, the probability is 0.40.

(b) Reading from the above table, P(A|B) = 0.21/0.40 = 0.525.

36. First note the following. Deleting A (B) will strengthen (weaken) the linear relationship. As a result, r = 0.153 is for deleting B, r = 0.810 is for deleting A, and r = 0.352 is for deleting both.
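The prediction intervals in Solutions 29 and 30 share one formula: mp̂ ± z·√(mp̂q̂)·√(1 + m/n), where n is the number of past trials and m the number of future trials. A small Python sketch of that calculation is given here; the function name `prediction_interval` is hypothetical, and the multiplier 1.645 is the one used in both solutions.

```python
import math


def prediction_interval(successes, n, m, z_star):
    """Return (point prediction, lower, upper) for the future success count.

    successes out of n past trials estimate p; m is the number of
    future trials; z_star is the normal-table multiplier.
    """
    p_hat = successes / n
    point = m * p_hat
    half = z_star * math.sqrt(point * (1 - p_hat)) * math.sqrt(1 + m / n)
    return point, point - half, point + half


# Solution 29: 240 successes in 1000 past trials, 500 future trials.
point29, lo29, hi29 = prediction_interval(240, 1000, 500, 1.645)
# Solution 30: 77 successes in 220 past trials, 140 future trials.
point30, lo30, hi30 = prediction_interval(77, 220, 140, 1.645)
print(round(point29), round(lo29), round(hi29))
print(round(point30), round(lo30), round(hi30))
```

Rounding the endpoints to whole numbers matches the solutions' practice, since the quantity being predicted is a count.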

37. First note the following. Deleting A (B) will strengthen (weaken) the linear relationship. As a result, r = 0.466 is for deleting B, r = 0.916 is for deleting A, and r = 0.806 is for deleting both.

38. (a) First,

sp² = [14(64) + 8(36)] / (14 + 8) = 53.82,

yielding sp = 7.34. The 95% confidence interval is

6.00 ± 2.074(7.34) √[(1/15) + (1/9)] = 6.00 ± 2.074(7.34)(0.422) = 6.00 ± 2.074(3.095) = 6.00 ± 6.42 = [−0.42, 12.42].

(b) The test statistic is t = 6/3.095 = 1.94. The area under the t(22) curve to the right of 1.94 is between 0.025 and 0.05. The P-value lies between twice these boundaries; that is, it lies between 0.05 and 0.10.

39. (a) First,

sp² = [5(36) + 11(16)] / (5 + 11) = 22.25,

yielding sp = 4.717. The 90% confidence interval is

6.00 ± 1.746(4.717) √[(1/6) + (1/12)] = 6.00 ± 1.746(4.717)(0.5) = 6.00 ± 1.746(2.358) = 6.00 ± 4.12 = [1.88, 10.12].

(b) The test statistic is t = 6/2.358 = 2.544. The area under the t(16) curve to the right of 2.544 is between 0.01 and 0.025. The P-value lies between twice these boundaries; that is, it lies between 0.02 and 0.05.

40. (a) There are four possible correct answers: 99.6%: [22.1, 46.0], 97.9%: [24.2, 42.3], 92.3%: [24.9, 40.3], and 79.0%: [25.6, 38.7].

(b) The confidence interval is 33.90 ± 1.753(11.06/√16) = 33.90 ± 4.85 = [29.05, 38.75].

41. (a) There are four possible correct answers: 99.6%: [41.6, 78.4], 97.9%: [42.3, 73.2], 92.3%: [43.7, 71.9], and 79.0%: [49.9, 70.7].

(b) The confidence interval is 61.64 ± 2.131(18.35/√16) = 61.64 ± 9.78 = [51.86, 71.42].

42. From the given information, identify x̄ = 55, sx = 10, ȳ = 63, sy = 15, and r = 0.62.

(a) The slope is b1 = 0.62(15/10) = 0.93, and the y-intercept is b0 = 63 − 0.93(55) = 11.85. Thus, the regression line is ŷ = 11.85 + 0.93x.

(b) For Barbara, x = 45; thus, her predicted value is ŷ = 11.85 + 0.93(45) = 53.70.

(c) Barbara's residual is e = y − ŷ = 65 − 53.70 = 11.30.
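The pooled-variance intervals in Solutions 38 and 39 can be reproduced with the short Python sketch below. The function name `pooled_t_interval` is an illustrative choice; the sample sizes, variances, and the difference in means (6.00) are read off the solutions, and the t multiplier is passed in from the table rather than computed.

```python
import math


def pooled_t_interval(diff, n1, var1, n2, var2, t_star):
    """Return (pooled variance, standard error, lower, upper).

    diff is the difference of the two sample means; var1 and var2 are
    the sample variances; t_star is the t-table multiplier with
    n1 + n2 - 2 degrees of freedom.
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    half = t_star * se
    return sp2, se, diff - half, diff + half


# Solution 39: n1 = 6, s1^2 = 36, n2 = 12, s2^2 = 16, 90% confidence.
sp2, se, lo, hi = pooled_t_interval(6.00, 6, 36, 12, 16, 1.746)
print(round(sp2, 2), round(se, 3), round(lo, 2), round(hi, 2))
```

Dividing the difference in means by the returned standard error gives the test statistic used in part (b) of each solution.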

43. From the given information, identify x̄ = 50, sx = 6, ȳ = 65, sy = 12, and r = 0.46.

(a) The slope is b1 = 0.46(12/6) = 0.92, and the y-intercept is b0 = 65 − 0.92(50) = 19. Thus, the regression line is ŷ = 19 + 0.92x.

(b) For Bill, x = 47; thus, his predicted value is ŷ = 19 + 0.92(47) = 62.24.

(c) Bill's residual is e = y − ŷ = 71 − 62.24 = 8.76.

44. Four intervals are too large; this implies μ < 103.7. If five intervals were too large, then μ < 101.6. But exactly four intervals are too large; thus, μ ≥ 101.6. To summarize, 101.6 ≤ μ < 103.7.

45. Three intervals are too small; this implies μ > 43.3. If four intervals were too small, then μ > 45.4. But exactly three intervals are too small; thus, μ ≤ 45.4. To summarize, 43.3 < μ ≤ 45.4.

46. [Figure: coordinate system showing the points A, B, and C; vertical axis from 0 to 6, horizontal axis from 0 to 10.]

ȳ = 1 + 0.5(x̄) = 1 + 2.5 = 3.5.

47. Below is a coordinate system with the regression line ŷ = 0.5x.

[Figure: the regression line with the points A, B, and C; vertical axis from 0 to 5, horizontal axis from 0 to 10.]

ȳ = 0.5(x̄) = 3.0.

48. I offer two ways to solve this problem. W > 5 occurs if, and only if, every observation is greater than 5. By the multiplication rule the probability of this occurring is (0.5)^5 = 0.0312. As a result, the proportion of times the simulation study yields W > 5 should be close to 0.0312. Of the choices, 285 is the natural candidate. A similar argument gives 315 for part (b).

Next, the second solution. The interval [W, V] is the 93.8% CI for the median (see Table A.7). Thus, about 6.2%, or 620, of the simulated CIs should be incorrect. By symmetry, roughly half of these incorrect intervals should be too large and half too small.

49. (a) 5.

(b) (Note: This problem turned out to be much too confusing, but a small number of students answered it correctly.) Read the solution to the previous question. By similar reasoning, about 1.6% of the L2's (U2's) should be greater than (less than) 5. By examining the choices, we find that 13 of the L2's are greater than 5, and 15 of the U2's are less than 5. The Gosset 90% CI for μ is [L1, U1]. Gosset's CI should work pretty well for a symmetric pdf. Thus, about 5% of the L1's (U1's) should be greater than (less than) 5. By examining the choices, we find that 42 of the L1's are greater than 5, and 50 of the U1's are less than 5.

50. For Sally, ŷ = y − e = 70. This gives us two equations:

80 = b0 + 40b1, and 70 = b0 + 50b1.

By subtraction, 10 = −10b1, or b1 = −1. Thus, b0 = 120 and the regression line is ŷ = 120 − x.

51. For Sally, ŷ = y − e = 50. This gives us two equations:

50 = b0 + 75b1, and 100 = b0 + 50b1.

By subtraction, 50 = −25b1, or b1 = −2. Thus, b0 = 200 and the regression line is ŷ = 200 − 2x.

52. Recall the following results about Gosset's CI. For a skewed population, it performs poorly for n small. As n increases it performs better until it reaches the target confidence level. Once it reaches the target CL, increasing n has no impact. Putting these facts together we conclude that 8,644 correct intervals were obtained for n = 10. But 9,483 and 9,509 are very close to the target of 9,500. Thus, we cannot tell which of these numbers goes with n = 100 and which goes with n = 200.

53. For a normal population, Gosset's CI works perfectly for every value of n. Thus, it is impossible to tell which result goes with which study.

54. (a) [33, 36]. (b) [32, 33). (c) This is impossible. If A and C are correct, then 33 ≤ μ ≤ 36; but this implies that B is correct.
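The multiplication-rule argument in Solution 48 can be checked by simulation. The Python sketch below rests on two assumptions that are not stated in the solutions: the number of runs is taken to be 10,000 (inferred from "6.2%, or 620"), and a Uniform(0, 10) population stands in for the exam's pdf, which is not reproduced here. Any population with median 5 gives the same probability, (0.5)^5, for the event W > 5.

```python
import random

RUNS = 10_000       # assumption: inferred from "6.2%, or 620" in Solution 48
SAMPLE_SIZE = 5
MEDIAN = 5.0

# Exact expectation from the multiplication rule: P(W > 5) = 0.5 ** 5.
expected_count = RUNS * 0.5 ** SAMPLE_SIZE


def count_min_above_median(runs, sample_size, seed=0):
    """Count the runs whose sample minimum W exceeds the median.

    Assumption: Uniform(0, 10) is a stand-in population with median 5.
    """
    rng = random.Random(seed)
    count = 0
    for _ in range(runs):
        w = min(rng.uniform(0.0, 10.0) for _ in range(sample_size))
        if w > MEDIAN:
            count += 1
    return count


simulated_count = count_min_above_median(RUNS, SAMPLE_SIZE)
print(expected_count, simulated_count)
```

The simulated count varies with the seed, but it should land near 312, which is why 285 and 315 are the natural candidates among the listed choices.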
