
Nick Timmons BADM 572 Stats Assignment 6 Responses 10/07/2011

1.) This is a familiar problem that we have seen before, except that this time we are comparing two different branches and making statistical inferences from that comparison, as opposed to making inferences about a population from a sample at a single location. We are given samples of size n from both branches and a level of confidence. The null hypothesis is essentially a statement that assumes our claim is not valid. Of the two hypotheses that we make, null and alternative, the null hypothesis always contains the equality sign. Here, we want to claim there is a difference in wait times between the two branches, using results from one branch that has been trained and another branch that has not yet been trained. So, we make our null the opposite of that claim, namely that there is no difference. Hence: $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 \neq 0$

The test statistic is the t Stat in the Excel output below, 1.121 in absolute value. To find out whether or not we reject the null hypothesis, we must compare this to a critical t value, which we can also find using Excel. T.INV.2T takes the total two-tailed probability, so at $\alpha = .05$ the critical value is =T.INV.2T(0.05,198) = 1.972, which matches the table. (=T.INV.2T(0.025,198) returns 2.259, which is the critical value for $\alpha = .025$, not $\alpha = .05$.)

t-Test: Two-Sample Assuming Equal Variances

                                Variable 1     Variable 2
Mean                            5.433690566    5.796085342
Variance                        6.102537708    4.347545611
Observations                    100            100
Pooled Variance                 5.225041659
Hypothesized Mean Difference    0
df                              198
t Stat                          1.121042432
P(T<=t) one-tail                0.131813804
t Critical one-tail             1.652585784
P(T<=t) two-tail                0.263627607
t Critical two-tail             1.972017478

Because the absolute value of the t Stat (1.121) is less than the two-tail critical t value (1.972), and the two-tail p-value (0.264) is greater than .05, we DO NOT REJECT the null hypothesis.

Now we do the same thing, but assume unequal variances. The critical value in this case is given by =T.INV.2T(0.05,193) = 1.972. Because the sample sizes are so large, and thus give us such large degrees of freedom, the critical value is nearly the same even though the df is slightly smaller in the unequal variances case, and it is the same if we round to three decimals.

t-Test: Two-Sample Assuming Unequal Variances

                                Variable 1     Variable 2
Mean                            5.433690566    5.796085342
Variance                        6.102537708    4.347545611
Observations                    100            100
Hypothesized Mean Difference    0
df                              193
t Stat                          1.121042432
P(T<=t) one-tail                0.131831367
t Critical one-tail             1.652787068
P(T<=t) two-tail                0.263662735
t Critical two-tail             1.972331676

Once again, the t Stat (1.121 in absolute value) is smaller than the two-tail critical t value (1.972) at the 5% level of significance, and the two-tail p-value (0.264) is greater than .05. So, we DO NOT REJECT the null hypothesis, meaning that there is no statistically significant difference between the two sets of waiting times, whether we assume equal or unequal variances.
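As a cross-check on the Excel output for problem 1, here is a minimal Python sketch that recomputes the t Stat and the degrees of freedom from the summary rows alone. The function name two_sample_t is made up for illustration, and the critical value is copied from the table rather than computed. The sign of t depends only on which branch is subtracted from which, so the comparison uses the absolute value.

```python
import math

def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled t, pooled df, Welch t, Welch df) from summary statistics."""
    # Equal-variances (pooled) version
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df_pooled
    t_pooled = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Unequal-variances (Welch) version
    se1, se2 = var1 / n1, var2 / n2
    t_welch = (mean1 - mean2) / math.sqrt(se1 + se2)
    df_welch = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t_pooled, df_pooled, t_welch, df_welch

# Mean, variance and observations copied from the Excel tables for problem 1
t_p, df_p, t_w, df_w = two_sample_t(5.433690566, 6.102537708, 100,
                                    5.796085342, 4.347545611, 100)

T_CRIT_TWO_TAIL = 1.972017478  # "t Critical two-tail" from the equal-variances table
reject = abs(t_p) > T_CRIT_TWO_TAIL

print(f"pooled: |t| = {abs(t_p):.3f}, df = {df_p}")
print(f"Welch:  |t| = {abs(t_w):.3f}, df = {df_w:.1f} (Excel reports the rounded value)")
print("reject H0:", reject)
```

With equal sample sizes the pooled and Welch t statistics coincide, which is why both Excel tables show the same t Stat and differ only in df.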

2.) In this case, we are testing the difference between the proportions of two populations. The null hypothesis is essentially a statement that assumes our claim is not valid. Of the two hypotheses that we make, null and alternative, the null hypothesis always contains the equality sign. Here, we want to claim there is a difference in satisfaction (which is based on wait times), using results from one branch that has been trained and another branch that has not yet been trained. So, we make our null the opposite of that claim, namely that there is no difference. Hence: $H_0: p_1 - p_2 = 0$ and $H_a: p_1 - p_2 \neq 0$

We know a few things: $n_1 = 100$, $\hat{p}_1 = .60$, $n_2 = 100$, $\hat{p}_2 = .53$, and the difference between the two sample proportions is .07. Given that this is not zero, and that $n\hat{p}$ and $n(1-\hat{p})$ are greater than 5 for both samples, the confidence interval is given by $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$, and we know from a z table that z at the given level of confidence is 1.96.

$.07 \pm 1.96\sqrt{(.60)(.40)/100 + (.53)(.47)/100} = .07 \pm .137 = [-.067, .207]$. We can continue on trying to accept or reject based on

the p-value, or we can simply look at the interval and recognize that it contains zero. Because the interval contains zero, a true difference of zero is consistent with the sample results at the given level of confidence. Thus, we do not reject the null hypothesis, and cannot say that the proportions of satisfied customers are different at the 5% level of significance. Put another way, the measured difference is not statistically significant.

3.) This is essentially the same problem that we saw in the first problem, except we are comparing wait times within a branch, before and after training, rather than wait times between two branches. The underlying statistical techniques, though, are identical. The null hypothesis is essentially a statement that assumes our claim is not valid. Of the two hypotheses that we make, null and alternative, the null hypothesis always contains the equality sign. Here, we want to claim there is a difference in wait times before and after training, using results from before the training and results from after it. So, we make our null the opposite of that claim, namely that there is no difference. Hence: $H_0: \mu_1 - \mu_2 = 0$ and $H_a: \mu_1 - \mu_2 \neq 0$

The test statistic is the t Stat in the Excel output below, t = 3.088. To find out whether or not we reject the null hypothesis, we must compare this to a critical t value, which we can also find using Excel. T.INV.2T takes the total two-tailed probability, so at $\alpha = .05$ the critical value is =T.INV.2T(0.05,198) = 1.972, which matches the table.

t-Test: Two-Sample Assuming Equal Variances

                                Variable 1     Variable 2
Mean                            5.796085342    4.834763024
Variance                        4.347545611    5.343107265
Observations                    100            100
Pooled Variance                 4.845326438
Hypothesized Mean Difference    0
df                              198
t Stat                          3.088108184
P(T<=t) one-tail                0.001151741
t Critical one-tail             1.652585784
P(T<=t) two-tail                0.002303481
t Critical two-tail             1.972017478

Because the t Stat (3.088) is greater than the two-tail critical t value (1.972), and the two-tail p-value (0.0023) is less than .05, we REJECT the null hypothesis.
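The t Stat in the equal-variances table for problem 3 can be reproduced by hand from the Mean, Variance and Observations rows. A worked version, using values rounded to four decimals (so the last digit may differ slightly from Excel):

```latex
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}
      = \frac{99(4.3475) + 99(5.3431)}{198} = 4.8453

t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}
  = \frac{5.7961 - 4.8348}{\sqrt{4.8453(0.02)}}
  = \frac{0.9613}{0.3113} \approx 3.088
```

This agrees with the t Stat row, and 3.088 is larger than the t Critical two-tail value of 1.972 reported in the same table.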

Now we do the same thing, but assume unequal variances. The critical value in this case is given by =T.INV.2T(0.05,196) = 1.972. Because the sample sizes are so large, and thus give us such large degrees of freedom, the critical value is nearly the same even though the df is slightly smaller in the unequal variances case, and it is the same if we round to three decimals.

t-Test: Two-Sample Assuming Unequal Variances

                                Variable 1     Variable 2
Mean                            5.796085342    4.834763024
Variance                        4.347545611    5.343107265
Observations                    100            100
Hypothesized Mean Difference    0
df                              196
t Stat                          3.088108184
P(T<=t) one-tail                0.001153271
t Critical one-tail             1.652665059
P(T<=t) two-tail                0.002306541
t Critical two-tail             1.972141222

With unequal variances, the t Stat (3.088) is larger than the two-tail critical t value (1.972) at the 5% level of significance, and the two-tail p-value (0.0023) is well below .05. So, we REJECT the null hypothesis, meaning that there is a statistically significant difference between the two sets of waiting times before and after training, whether we assume equal or unequal variances. The sample mean wait time fell from 5.80 before training to 4.83 after. Thus we would tell HR that, based on the sample study, the training is worth repeating for all branches.

4.) So, now we aren't simply making assumptions about variance; we actually want to compare the variances of two samples. We are supposing that we randomly select independent samples from two normally distributed populations. The null hypothesis is that the variances of the two populations are the same.

$H_0: \sigma_1^2 = \sigma_2^2$, where 1 is before training, and 2 is after training.
$H_a: \sigma_1^2 > \sigma_2^2$
This is exactly equivalent to:
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_a: \sigma_1^2/\sigma_2^2 > 1$

F-Test Two-Sample for Variances

                       Variable 1     Variable 2
Mean                   5.796085342    4.834763024
Variance               4.347545611    5.343107265
Observations           100            100
df                     99             99
F                      0.813673654
P(F<=f) one-tail       0.153314097
F Critical one-tail    0.717328593

So, with Excel having spit out the numbers based on the data, let us test the hypothesis by taking a look at the sampling distribution, which in this case is given by the F statistic.

The critical value is given by =F.INV.RT(0.05,99,99) = 1.391. The test statistic is $F = s_1^2/s_2^2 = 4.3475/5.3431 = .8137$ (which, incidentally, is given in the table; I just wanted some extra practice). The right-tail p-value is =F.DIST.RT(0.813673,99,99) = 1 - 0.1533 = .8467, using the lower-tail probability from the table (F.INV.RT is the inverse function and does not return a p-value). Because $F = .8137 < 1.391$, we do not reject the null hypothesis, and so the study does not support the assertion that the customers' experience is more consistent after training than before training.
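The variance comparison in problem 4 is just a ratio and a lookup, so it is easy to sanity-check outside Excel. A minimal Python sketch follows, assuming the right-tailed alternative (variance before training greater than variance after); the function name f_ratio_test is made up for illustration, and the critical value is the 1.391 quoted in the text rather than something the code computes.

```python
def f_ratio_test(var_1, var_2, f_critical):
    """Right-tailed F test of H0: var_1 == var_2 against Ha: var_1 > var_2.

    Returns the F statistic and whether it falls in the rejection region.
    """
    f_stat = var_1 / var_2
    return f_stat, f_stat > f_critical

# Sample variances from the F-Test table: 1 = before training, 2 = after training
f_stat, reject = f_ratio_test(4.347545611, 5.343107265, f_critical=1.391)

print(f"F = {f_stat:.4f}, reject H0: {reject}")  # F = 0.8137, reject H0: False
```

Since the sample variance after training is actually larger than the one before, F is below 1 and cannot land in the right tail, whatever the critical value.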

13.27) a.) With this problem we are starting to get into simple linear regression analysis. The simple linear regression model relating y to x is given by $y = \beta_0 + \beta_1 x + \varepsilon$. We are looking to find the least squares point estimate of the slope, given by $b_1 = SS_{xy}/SS_{xx}$, where $SS_{xy} = \sum x_i y_i - (\sum x_i)(\sum y_i)/n$ and $SS_{xx} = \sum x_i^2 - (\sum x_i)^2/n$.

         MeanTaste (xi)    xi^2           MeanPref (yi)    xi*yi
         3.5659            12.71564281    4.2552           15.17361768
         3.3290            11.082241      4.0911           13.6192719
         2.4231            5.87141361     3.0052           7.28190012
         2.0895            4.36601025     2.2429           4.68653955
         1.9661            3.86554921     2.5351           4.98426011
         3.8061            14.48639721    4.7812           18.19772532
Total    17.1797           52.38725409    20.9107          63.9433

$SS_{xy} = 63.94 - (17.18 \times 20.91)/6 = 4.07$
$SS_{xx} = 52.39 - (17.18^2)/6 = 3.2$

The point estimate is $b_1 = 4.07/3.2 = 1.27$. This is the point estimate of the slope of the linear regression equation for these two measured things, taste (x) and preference (y). Given that it is positive, we know that the rise over run of that equation is positive, and that y increases by an estimated 1.27 for every one-unit increase in x.

b.) The confidence interval at 95% is given by [.9885, 1.5577], from the results in the spreadsheet analysis shown in the problem. This means that we are 95% confident that the true slope $\beta_1$ of the linear regression equation lies between .9885 and 1.5577.

13.7) a.)

As we can see after going through the whole rigmarole in Excel, next to the above chart, there is the answer we are looking for here. The linear regression equation is $y = 10.146x + 18.488 + \varepsilon$.

b.) b1 is 10.146, and this represents the slope. b0 is 18.488, and this represents the y intercept. The y intercept, in particular, makes sense. It essentially tells us that even when x is zero, that is, when a production run somehow doesn't produce anything, there are still fixed costs involved.

c.)

d.) y = 10.146x + 18.488
y = 10.146(60) + 18.488
y = 608.76 + 18.488
y = 627.248
So, when we have a batch size x of 60, the direct labor cost y that corresponds to that batch size is 627.248.
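To tie the two regression problems together, here is a minimal Python sketch of the least squares shortcut formulas from 13.27 and the prediction step from 13.7 d. The function names least_squares and predict are made up for illustration. The taste and preference values are the six pairs from the 13.27 table, and the intercept and slope in predict are the ones read off the Excel chart in 13.7, not values this code derives.

```python
def least_squares(xs, ys):
    """Return (b0, b1) using the SSxy / SSxx shortcut formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    b1 = ss_xy / ss_xx                  # slope
    b0 = sum_y / n - b1 * sum_x / n     # intercept: y-bar minus b1 times x-bar
    return b0, b1

# 13.27: MeanTaste (x) and MeanPref (y) from the table
mean_taste = [3.5659, 3.3290, 2.4231, 2.0895, 1.9661, 3.8061]
mean_pref = [4.2552, 4.0911, 3.0052, 2.2429, 2.5351, 4.7812]
b0, b1 = least_squares(mean_taste, mean_pref)
print(f"13.27 slope b1 = {b1:.2f}")  # about 1.27, matching the hand calculation

def predict(x, intercept=18.488, slope=10.146):
    """13.7 d: predicted direct labor cost for batch size x."""
    return intercept + slope * x

print(f"13.7 d prediction at x = 60: {predict(60):.3f}")  # 627.248
```

Working from the unrounded sums gives a slope that sits at the midpoint of the [.9885, 1.5577] interval, which is a quick consistency check on part b.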
