Hypothesis Testing

8.4 Two-Sample t Test for Independent Samples with Equal Variances

Let's now discuss the question posed in Example 8.2, assuming that the cross-sectional study defined in Equation 8.2 is being used, rather than the longitudinal study defined in Equation 8.1.

Example 8.9 Cardiovascular Disease, Hypertension Suppose a sample of eight 35- to 39-year-old nonpregnant, premenopausal OC users who work in a company is identified who have mean systolic blood pressure of 132.86 mm Hg and sample standard deviation of 15.34 mm Hg. A sample of 21 35- to 39-year-old nonpregnant, premenopausal non-OC users is similarly identified who have mean systolic blood pressure of 127.44 mm Hg and sample standard deviation of 18.23 mm Hg. What can be said about the underlying mean difference in blood pressure between the two groups?

Assume systolic blood pressure (SBP) is normally distributed in the first group with mean $\mu_1$ and variance $\sigma_1^2$ and in the second group with mean $\mu_2$ and variance $\sigma_2^2$. We want to test the hypothesis $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \ne \mu_2$. Assume in this section that the underlying variances in the two groups are the same (that is, $\sigma_1^2 = \sigma_2^2 = \sigma^2$). The means and variances in the two samples are denoted by $\bar{x}_1$, $\bar{x}_2$, $s_1^2$, $s_2^2$, respectively.

It seems reasonable to base the significance test on the difference between the two sample means, $\bar{x}_1 - \bar{x}_2$. If this difference is far from 0, then $H_0$ will be rejected; otherwise, it will be accepted. Thus we wish to study the behavior of $\bar{x}_1 - \bar{x}_2$ under $H_0$. We know $\bar{X}_1$ is normally distributed with mean $\mu_1$ and variance $\sigma^2/n_1$ and $\bar{X}_2$ is normally distributed with mean $\mu_2$ and variance $\sigma^2/n_2$. Hence, from Equation 5.10, because the two samples are independent, $\bar{X}_1 - \bar{X}_2$ is normally distributed with mean $\mu_1 - \mu_2$ and variance $\sigma^2(1/n_1 + 1/n_2)$. In symbols,

Equation 8.7
$$\bar{X}_1 - \bar{X}_2 \sim N\left[\mu_1 - \mu_2,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right]$$

Under $H_0$, we know that $\mu_1 = \mu_2$. Thus Equation 8.7 reduces to

Equation 8.8
$$\bar{X}_1 - \bar{X}_2 \sim N\left[0,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right]$$

If $\sigma^2$ were known, then $\bar{X}_1 - \bar{X}_2$ could be divided by $\sigma\sqrt{1/n_1 + 1/n_2}$. From Equation 8.8,

Equation 8.9
$$\frac{\bar{X}_1 - \bar{X}_2}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0, 1)$$
and the test statistic in Equation 8.9 could be used as a basis for the hypothesis test. Unfortunately, $\sigma^2$ in general is unknown and must be estimated from the data. How can $\sigma^2$ be best estimated in this situation?

From the first and second samples, the sample variances are $s_1^2$, $s_2^2$, respectively, each of which could be used to estimate $\sigma^2$. The average of $s_1^2$ and $s_2^2$ could simply be used as the estimate of $\sigma^2$. However, this average will weight the sample variances equally even if the sample sizes are very different from each other. The sample variances should not be weighted equally, because the sample variance from the larger sample is probably more precise and should be weighted more heavily. The best estimate of the population variance $\sigma^2$, which is denoted by $s^2$, is given by a weighted average of the two sample variances, where the weights are the number of degrees of freedom in each sample.

Equation 8.10 The pooled estimate of the variance from two independent samples is given by
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

In particular, $s^2$ will then have $n_1 - 1$ df from the first sample and $n_2 - 1$ df from the second sample, or $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$ df overall. Then $s$ can be substituted for $\sigma$ in Equation 8.9, and the resulting test statistic can then be shown to follow a t distribution with $n_1 + n_2 - 2$ df rather than an $N(0, 1)$ distribution, because $\sigma^2$ is unknown. Thus the following test procedure is used.

Equation 8.11 Two-Sample t Test for Independent Samples with Equal Variances Suppose we wish to test the hypothesis $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \ne \mu_2$ with a significance level of $\alpha$ for two normally distributed populations, where $\sigma^2$ is assumed to be the same for each population. Compute the test statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad \text{where } s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
If $t > t_{n_1+n_2-2,\,1-\alpha/2}$ or $t < -t_{n_1+n_2-2,\,1-\alpha/2}$, then $H_0$ is rejected.
If $-t_{n_1+n_2-2,\,1-\alpha/2} \le t \le t_{n_1+n_2-2,\,1-\alpha/2}$, then $H_0$ is accepted.
The acceptance and rejection regions for this test are shown in Figure 8.3.

Figure 8.3 Acceptance and rejection regions for the two-sample t test for independent samples with equal variances (distribution of $t$ in Equation 8.11 under $H_0$: a $t_{n_1+n_2-2}$ distribution)

Similarly, a p-value can be computed for the test. Computation of the p-value depends on whether $\bar{x}_1 \le \bar{x}_2$ ($t \le 0$) or $\bar{x}_1 > \bar{x}_2$ ($t > 0$). In each case, the p-value corresponds to the probability of obtaining a test statistic at least as extreme as the observed value $t$. This is given in Equation 8.12.

Equation 8.12 Computation of the p-Value for the Two-Sample t Test for Independent Samples with Equal Variances Compute the test statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad \text{where } s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
If $t \le 0$, $p = 2 \times$ (area to the left of $t$ under a $t_{n_1+n_2-2}$ distribution).
If $t > 0$, $p = 2 \times$ (area to the right of $t$ under a $t_{n_1+n_2-2}$ distribution).

The computation of the p-value is illustrated in Figure 8.4.

Figure 8.4 Computation of the p-value for the two-sample t test for independent samples with equal variances (two panels: $t \le 0$ and $t > 0$, each under a $t_{n_1+n_2-2}$ distribution)

Example 8.10 Cardiovascular Disease, Hypertension Assess the statistical significance of the data in Example 8.9.

Solution The common variance is first estimated:
$$s^2 = \frac{7(15.34)^2 + 20(18.23)^2}{27} = \frac{8293.9}{27} = 307.18$$
or $s = 17.527$. The following test statistic is then computed:
$$t = \frac{132.86 - 127.44}{17.527\sqrt{1/8 + 1/21}} = \frac{5.42}{17.527 \times 0.415} = \frac{5.42}{7.282} = 0.74$$
If the critical-value method is used, then note that under $H_0$, $t$ comes from a $t_{27}$ distribution. Referring to Table 5 in the Appendix, we see that $t_{27,.975} = 2.052$. Because $-2.052 \le 0.74 \le 2.052$, it follows that $H_0$ is accepted using a two-sided test at the 5% level, and we conclude that the mean blood pressures of the OC users and non-OC users do not differ significantly. This stands in contrast to the longitudinal design, with which significant differences could be detected; the longitudinal design is usually more efficient because it uses people as their own controls.
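The arithmetic of Equations 8.10 and 8.11 can be sketched in a few lines of Python. This is a minimal illustration using the summary statistics of Example 8.9, not part of the original text: the function name `pooled_t_test` is hypothetical, and the critical value 2.052 is simply copied from the text (Table 5 in the Appendix) rather than computed.

```python
import math


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (pooled_sd, t, df) for the equal-variance two-sample t test.

    Hypothetical helper: implements Equation 8.10 (pooled variance)
    and the test statistic of Equation 8.11 from summary statistics.
    """
    df = n1 + n2 - 2
    # Weighted average of the sample variances, weights = degrees of freedom
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = math.sqrt(pooled_var)
    standard_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / standard_error
    return pooled_sd, t, df


# Summary statistics from Example 8.9 (OC users vs. non-OC users)
pooled_sd, t, df = pooled_t_test(132.86, 15.34, 8, 127.44, 18.23, 21)

T_CRIT = 2.052  # t_{27,.975}, quoted in the text, not computed here
reject_h0 = abs(t) > T_CRIT

print(f"s = {pooled_sd:.3f}, t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```

The function works from summary statistics only, which matches how the example is stated; with raw data one would first compute the sample means and standard deviations.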
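8.5 Interval Estimation for the Comparison of Means from Two Independent Samples (Equal Variance Case)

In the previous section, methods of hypothesis testing for the comparison of means from two independent samples were discussed. It is also useful to compute $100\% \times (1 - \alpha)$ CIs for the true mean difference between the two groups, $\mu_1 - \mu_2$. From Equation 8.7, if $\sigma$ is known, then
$$\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}} \sim N(0, 1)$$
If $\sigma$ is unknown, then $\sigma$ is estimated by $s$ from Equation 8.10 and
$$\frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$$
To construct a two-sided $100\% \times (1 - \alpha)$ CI, note that
$$\Pr\left(-t_{n_1+n_2-2,\,1-\alpha/2} \le \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \le t_{n_1+n_2-2,\,1-\alpha/2}\right) = 1 - \alpha$$
This can be written in the form of two inequalities:
$$-t_{n_1+n_2-2,\,1-\alpha/2} \le \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \quad\text{and}\quad \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S\sqrt{1/n_1 + 1/n_2}} \le t_{n_1+n_2-2,\,1-\alpha/2}$$
Each inequality is multiplied by $S\sqrt{1/n_1 + 1/n_2}$ and $\mu_1 - \mu_2$ is added to both sides to obtain
$$\mu_1 - \mu_2 - t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \bar{X}_1 - \bar{X}_2 \quad\text{and}\quad \bar{X}_1 - \bar{X}_2 \le \mu_1 - \mu_2 + t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Finally, $t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{1/n_1 + 1/n_2}$ is added to both sides of the first inequality and subtracted from both sides of the second inequality to obtain
$$\mu_1 - \mu_2 \le \bar{X}_1 - \bar{X}_2 + t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \quad\text{and}\quad \bar{X}_1 - \bar{X}_2 - t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2$$
If these two inequalities are combined, the required confidence interval is obtained:
$$\left(\bar{X}_1 - \bar{X}_2 - t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},\ \ \bar{X}_1 - \bar{X}_2 + t_{n_1+n_2-2,\,1-\alpha/2}\,S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$$
If the sample means $\bar{x}_1$, $\bar{x}_2$ are substituted for the random variables $\bar{X}_1$, $\bar{X}_2$ and the sample standard deviation $s$ is substituted for $S$, then this procedure can be summarized as follows.

Equation 8.13 Confidence Interval for the Underlying Mean Difference ($\mu_1 - \mu_2$) Between Two Groups (Two-Sided) ($\sigma_1^2 = \sigma_2^2$) A two-sided $100\% \times (1 - \alpha)$ CI for the true mean difference $\mu_1 - \mu_2$ based on two independent samples is given by
$$\left(\bar{x}_1 - \bar{x}_2 - t_{n_1+n_2-2,\,1-\alpha/2}\,s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}},\ \ \bar{x}_1 - \bar{x}_2 + t_{n_1+n_2-2,\,1-\alpha/2}\,s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right)$$

Example 8.11 Cardiovascular Disease, Hypertension Using the data in Examples 8.9 and 8.10, compute a 95% CI for the true mean difference in blood pressure between 35- to 39-year-old OC users and non-OC users.

Solution A 95% CI for the underlying mean difference in systolic blood pressure between the population of 35- to 39-year-old OC users and non-OC users is given by
$$\left[5.42 - t_{27,.975}(7.282),\ 5.42 + t_{27,.975}(7.282)\right] = \left[5.42 - 2.052(7.282),\ 5.42 + 2.052(7.282)\right] = (-9.52,\ 20.36)$$

8.6 Testing for the Equality of Two Variances

In Section 8.4, the two-sample t test for independent samples assumed that the underlying variances of the two samples were the same. In this section a significance test is developed to check this assumption. In particular, we wish to test the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \ne \sigma_2^2$, where the two samples are assumed to be independent random samples from an $N(\mu_1, \sigma_1^2)$ and an $N(\mu_2, \sigma_2^2)$ distribution, respectively.

Example 8.12 Cardiovascular Disease, Pediatrics Consider the familial aggregation of cholesterol levels. Suppose cholesterol levels are assessed in 100 children of men who have died from heart disease, and it is found that the mean cholesterol level in the group ($\bar{x}_1$) is 207.3 mg/dL. Suppose the sample standard deviation in this group ($s_1$) is 35.6 mg/dL. Previously, the cholesterol levels in this group of children were compared with a value that was assumed to be the underlying mean level in children in this age group. A better experimental design would be to select a group of control children whose fathers are alive and do not have heart disease, who come from the same census tract as the case children, and then to compare their cholesterol levels with those of the case children. The case and control children are not individually matched, so they are considered as two independent samples rather than as two paired samples.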
The cholesterol levels in these children can then be measured. Suppose the researchers found that among 74 control children, the mean cholesterol level ($\bar{x}_2$) is 193.4 mg/dL, with a sample standard deviation ($s_2$) of 17.3 mg/dL. We would like to compare the means of these two groups using the two-sample t test for independent samples given in Equation 8.11, but are hesitant to assume equal variances, because the sample variance of the case group is about four times as large as that of the control group: $35.6^2/17.3^2 = 4.23$. What should we do?

What we need is a significance test to determine if the underlying variances are in fact equal; that is, we want to test the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \ne \sigma_2^2$. It seems reasonable to base the significance test on the relative magnitudes of the sample variances ($s_1^2$, $s_2^2$). The best test in this case is based on the ratio of the sample variances ($s_1^2/s_2^2$) rather than on the difference between the sample variances ($s_1^2 - s_2^2$). Thus $H_0$ would be rejected if the variance ratio is either too large or too small and accepted otherwise. To implement this test, the sampling distribution of $S_1^2/S_2^2$ under the null hypothesis $\sigma_1^2 = \sigma_2^2$ must be determined.

The F Distribution

The distribution of the variance ratio ($S_1^2/S_2^2$) was studied by statisticians R. A. Fisher and G. Snedecor. It can be shown that the variance ratio follows an F distribution under the null hypothesis that $\sigma_1^2 = \sigma_2^2$. There is no unique F distribution but rather a family of F distributions. This family is indexed by two parameters termed the numerator and denominator degrees of freedom (df), respectively. If the sample sizes of the first and second samples are $n_1$ and $n_2$, respectively, then the variance ratio follows an F distribution with $n_1 - 1$ (numerator df) and $n_2 - 1$ (denominator df), which is called an $F_{n_1-1,\,n_2-1}$ distribution.

The F distribution is generally positively skewed, with the skewness dependent on the relative magnitudes of the two degrees of freedom.
If the numerator degrees of freedom are 1 or 2, then the distribution has a mode at 0; otherwise it has a mode greater than 0. The distribution is illustrated in Figure 8.5. Table 9 in the Appendix gives the percentiles of the F distribution.

Definition 8.6 The $100 \times p$th percentile of an F distribution with $d_1$ and $d_2$ degrees of freedom is denoted by $F_{d_1,d_2,p}$. Thus
$$\Pr(F_{d_1,d_2} \le F_{d_1,d_2,p}) = p$$

The F table is organized such that the numerator df ($d_1$) is shown in the first row, the denominator df ($d_2$) is shown in the first column, and the various percentiles ($p$) are shown in the second column.

Figure 8.5 Probability density for the F distribution

Example 8.13 Find the upper 1st percentile of an F distribution with 5 and 9 degrees of freedom.

Solution $F_{5,9,.99}$ must be found. Look in the 5 column, the 9 row, and the subrow marked .99 to obtain $F_{5,9,.99} = 6.06$.

Generally, F distribution tables give only upper percentage points, because the symmetry properties of the F distribution make it possible to derive the lower percentage points of any F distribution from the corresponding upper percentage points of an F distribution with the degrees of freedom reversed. Specifically, note that under $H_0$, $S_2^2/S_1^2$ follows an $F_{d_2,d_1}$ distribution. Therefore,
$$\Pr\left(S_2^2/S_1^2 \ge F_{d_2,d_1,1-p}\right) = p$$
By taking the inverse of each side and reversing the direction of the inequality, we get
$$\Pr\left(\frac{S_1^2}{S_2^2} \le \frac{1}{F_{d_2,d_1,1-p}}\right) = p$$
Under $H_0$, however, $S_1^2/S_2^2$ follows an $F_{d_1,d_2}$ distribution. Therefore,
$$\Pr\left(\frac{S_1^2}{S_2^2} \le F_{d_1,d_2,p}\right) = p$$
It follows from the last two inequalities that
$$F_{d_1,d_2,p} = \frac{1}{F_{d_2,d_1,1-p}}$$
This principle is summarized as follows.

Equation 8.14 Computation of the Lower Percentiles of an F Distribution The lower $p$th percentile of an F distribution with $d_1$ and $d_2$ df is the reciprocal of the upper $p$th percentile of an F distribution with $d_2$ and $d_1$ df. In symbols,
$$F_{d_1,d_2,p} = 1/F_{d_2,d_1,1-p}$$

Thus from Equation 8.14 we see that the lower $p$th percentile of an F distribution is the same as the inverse of the upper $p$th percentile of an F distribution with the degrees of freedom reversed.

Example 8.14 Estimate $F_{6,8,.05}$.

Solution From Equation 8.14, $F_{6,8,.05} = 1/F_{8,6,.95} = 1/4.15 = 0.241$.

The F Test

We now return to the significance test for the equality of two variances. We want to test the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \ne \sigma_2^2$. We stated that the test would be based on the variance ratio $S_1^2/S_2^2$, which under $H_0$ follows an F distribution with $n_1 - 1$ and $n_2 - 1$ df. This is a two-sided test, so we want to reject $H_0$ for both small and large values of $S_1^2/S_2^2$. This procedure can be made more specific, as follows.

Equation 8.15 F Test for the Equality of Two Variances Suppose we want to conduct a test of the hypothesis $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_1: \sigma_1^2 \ne \sigma_2^2$ with significance level $\alpha$. Compute the test statistic $F = s_1^2/s_2^2$.
If $F > F_{n_1-1,\,n_2-1,\,1-\alpha/2}$ or $F < F_{n_1-1,\,n_2-1,\,\alpha/2}$, then $H_0$ is rejected.
If $F_{n_1-1,\,n_2-1,\,\alpha/2} \le F \le F_{n_1-1,\,n_2-1,\,1-\alpha/2}$, then $H_0$ is accepted.

The acceptance and rejection regions for this test are shown in Figure 8.6.

Figure 8.6 Acceptance and rejection regions for the F test for the equality of two variances (distribution of $F = S_1^2/S_2^2$ under $H_0$: an $F_{n_1-1,\,n_2-1}$ distribution)

Alternatively, the exact p-value is given by

Equation 8.16 Computation of the p-Value for the F Test for the Equality of Two Variances Compute the test statistic $F = s_1^2/s_2^2$.
If $F \ge 1$, then $p = 2 \times \Pr(F_{n_1-1,\,n_2-1} > F)$.
If $F < 1$, then $p = 2 \times \Pr(F_{n_1-1,\,n_2-1} < F)$.

This computation is illustrated in Figure 8.7.

Figure 8.7 Computation of the p-value for the F test for the equality of two variances (two panels: $F \ge 1$, $p = 2 \times$ area to the right of $F$; $F < 1$, $p = 2 \times$ area to the left of $F$, each under an $F_{n_1-1,\,n_2-1}$ distribution)

Example 8.15 Cardiovascular Disease, Pediatrics Test for the equality of the two variances given in Example 8.12.

Solution $F = s_1^2/s_2^2 = 35.6^2/17.3^2 = 4.23$

Because the two samples have 100 and 74 people, respectively, we know from Equation 8.15 that under $H_0$, $F \sim F_{99,73}$. Thus $H_0$ is rejected if
$$F > F_{99,73,.975} \quad\text{or}\quad F < F_{99,73,.025}$$
Note that neither 99 df nor 73 df appears in Table 9 in the Appendix. One approach is to obtain the percentiles using a computer program. In this example, we want to find the values $c_1 = F_{99,73,.025}$ and $c_2 = F_{99,73,.975}$ such that
$$\Pr(F_{99,73} \le c_1) = .025 \quad\text{and}\quad \Pr(F_{99,73} \le c_2) = .975$$
The result is shown in Table 8.3 using the FINV function of Microsoft Excel 2000, where the first argument of FINV is the desired right-hand tail area and the next two arguments are the numerator and denominator df.

Table 8.3 Computation of critical values for the cholesterol data in Example 8.15 using Excel 2000

Numerator df = 99, denominator df = 73

Percentile    Value     Excel formula
0.025         0.6548    FINV(.975, 99, 73)
0.975         1.5491    FINV(.025, 99, 73)

Thus $c_1 = 0.6548$ and $c_2 = 1.5491$. Because $F = 4.23 > c_2$, it follows that $p < .05$. Alternatively, we could compute the exact p-value. This is given by $p = 2 \times \Pr(F_{99,73} \ge 4.23)$.

Table 8.4 Computation of the exact p-value in Example 8.15 using Excel 2000 (table values not recoverable from the source)

Using the Excel 2000 FDIST function, which calculates the right-hand tail area, we see from Table 8.4 that to four decimal places, the p-value $= 2 \times \Pr(F_{99,73} \ge 4.23) \le .0001$. Thus the two sample variances are significantly different. The two-sample t test with equal variances, as given in Section 8.4, cannot be used, because this test depends on the assumption that the variances are equal.

A question often asked about the F test is whether or not it makes a difference which sample is selected as the numerator sample and which as the denominator sample. The answer is that, for a two-sided test, it does not make a difference, because of the rules for calculating lower percentiles given in Equation 8.14. A variance ratio > 1 is usually more convenient, so there is no need to use Equation 8.14.
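The critical-value form of the F test (Equation 8.15) and the reciprocal rule for lower percentiles (Equation 8.14) can be sketched as follows. This is an illustrative sketch rather than part of the original text: the function names are hypothetical, and the percentiles 0.6548, 1.5491 and 4.15 are taken from the values quoted for Examples 8.14 and 8.15, not computed from the F distribution.

```python
def f_statistic(sd1, sd2):
    """Variance ratio F = s1^2 / s2^2 (test statistic of Equation 8.15)."""
    return sd1 ** 2 / sd2 ** 2


def lower_percentile_from_upper(upper_percentile_reversed_df):
    """Equation 8.14: F_{d1,d2,p} = 1 / F_{d2,d1,1-p}."""
    return 1.0 / upper_percentile_reversed_df


def f_test_rejects(f, lower_critical, upper_critical):
    """Two-sided decision rule: reject H0 if F falls outside [c1, c2]."""
    return f < lower_critical or f > upper_critical


# Example 8.15: case group s1 = 35.6 (n = 100), control group s2 = 17.3 (n = 74)
f = f_statistic(35.6, 17.3)

# Critical values of the F_{99,73} distribution as quoted in the text
C1, C2 = 0.6548, 1.5491
reject_h0 = f_test_rejects(f, C1, C2)

# Example 8.14: F_{6,8,.05} from the tabled value F_{8,6,.95} = 4.15
f_6_8_05 = lower_percentile_from_upper(4.15)

print(f"F = {f:.2f}, reject H0: {reject_h0}, F_6,8,.05 = {f_6_8_05:.3f}")
```

Because only upper percentiles are tabulated, the reciprocal helper is what makes the lower critical value available when the smaller variance happens to sit in the numerator.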
Thus the larger variance is usually put in the numerator and the smaller variance in the denominator.

Example 8.16 Cardiovascular Disease, Hypertension Using the data in Example 8.9, test whether or not the variance of blood pressure is significantly different between OC users and non-OC users.

Solution The sample standard deviation of blood pressure for the 8 OC users was 15.34 and for the 21 non-OC users was 18.23. Hence the variance ratio is
$$F = (18.23/15.34)^2 = 1.41$$
Under $H_0$, $F$ follows an F distribution with 20 and 7 df, whose percentiles do not appear in Table 9. However, the percentiles of an $F_{24,7}$ distribution are provided in Table 9. Also, it can be shown that for a specified upper percentile (e.g., the 97.5th percentile), as either the numerator or denominator df increases, the corresponding percentile decreases. Therefore,
$$F_{20,7,.975} \ge F_{24,7,.975} = 4.42 > 1.41$$
It follows that $p > 2(.025) = .05$, and the underlying variances of the two samples do not significantly differ from each other. Thus it was correct to use the two-sample t test for independent samples with equal variances for these data, where the underlying variances were assumed to be the same.

To compute an exact p-value, a computer program must be used to evaluate the area under the F distribution. The exact p-value for Example 8.16 has been evaluated using Excel 2000, with the results given in Table 8.5. The program evaluates the right-hand tail area $= \Pr(F_{20,7} \ge 1.41) = .334$. The two-tailed p-value $= 2 \times \Pr(F_{20,7} \ge 1.41) = 2 \times .334 = .669$.

Table 8.5 Computation of the exact p-value for the blood-pressure data in Example 8.16 using the F test for the equality of two variances with the Excel 2000 FDIST program

one-tailed p-value    0.334279505    =FDIST(1.41,20,7)
two-tailed p-value    0.66855901     =2*FDIST(1.41,20,7)

If the numerator and denominator samples are reversed, then the F statistic $= 1/1.41 = 0.71 \sim F_{7,20}$ under $H_0$. We use the FDIST program of Excel 2000 to calculate $\Pr(F_{7,20} \ge 0.71)$. This is given by FDIST(0.71, 7, 20) = .666. Because $F < 1$,
we have p-value $= 2 \times \Pr(F_{7,20} \le 0.71) = 2 \times (1 - .666) = .668$, which agrees, up to rounding, with the p-value just given. Thus it was correct to use the two-sample t test for independent samples with equal variances for these data, where the variances were assumed to be the same.

In this section, we have introduced the F test for the equality of two variances. This test is used to compare variance estimates from two normally distributed samples. If we refer to the flowchart (Figure 8.13, p. 338), then starting from position 1 we answer yes to (1) two-sample problem? and (2) underlying distribution normal or can central-limit theorem be assumed to hold? We answer no to (3) inferences concerning means? which leads to (4) inferences concerning variances. This leads us to the box labeled "Two-sample F test to compare variances." Be cautious about using this test with nonnormally distributed samples.

8.7 Two-Sample t Test for Independent Samples with Unequal Variances

The F test for the equality of two variances from two independent, normally distributed samples was presented in Equation 8.15. If the two variances are not significantly different, then the two-sample t test for independent samples with equal variances outlined in Section 8.4 can be used. If the two variances are significantly different, then a two-sample t test for independent samples with unequal variances, which is presented in this section, should be used.

Specifically, assume there are two normally distributed samples, where the first sample is a random sample of size $n_1$ from an $N(\mu_1, \sigma_1^2)$ distribution, the second sample is a random sample of size $n_2$ from an $N(\mu_2, \sigma_2^2)$ distribution, and $\sigma_1^2 \ne \sigma_2^2$. We again want to test the hypothesis $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \ne \mu_2$. Statisticians refer to this problem as the Behrens-Fisher problem.

It still makes sense to base the significance test on the difference between the sample means $\bar{x}_1 - \bar{x}_2$.
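Under either hypothesis, $\bar{X}_1$ is normally distributed with mean $\mu_1$ and variance $\sigma_1^2/n_1$ and $\bar{X}_2$ is normally distributed with mean $\mu_2$ and variance $\sigma_2^2/n_2$. Hence it follows that

Equation 8.17
$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

Under $H_0$, $\mu_1 - \mu_2 = 0$. Thus, from Equation 8.17,

Equation 8.18
$$\bar{X}_1 - \bar{X}_2 \sim N\left(0,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$

If $\sigma_1^2$ and $\sigma_2^2$ were known, then the test statistic

Equation 8.19
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

could be used for the significance test, which under $H_0$ would follow an $N(0, 1)$ distribution. However, $\sigma_1^2$ and $\sigma_2^2$ are usually unknown and are estimated by $s_1^2$ and $s_2^2$, respectively (the sample variances in the two samples). Notice that a pooled estimate of the variance was not computed as in Equation 8.10, because the variances ($\sigma_1^2$, $\sigma_2^2$) are assumed to be different. If $s_1^2$ is substituted for $\sigma_1^2$ and $s_2^2$ for $\sigma_2^2$ in Equation 8.19, then the following test statistic is obtained:

Equation 8.20
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

The exact distribution of $t$ under $H_0$ is difficult to derive. However, several approximate solutions have been proposed that have appropriate type I error. The Satterthwaite approximation is presented here. Its advantage is its easy implementation using the ordinary t tables.

Equation 8.21 Two-Sample t Test for Independent Samples with Unequal Variances (Satterthwaite's Method)
(1) Compute the test statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
(2) Compute the approximate degrees of freedom $d'$, where
$$d' = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\left(s_1^2/n_1\right)^2/(n_1 - 1) + \left(s_2^2/n_2\right)^2/(n_2 - 1)}$$
(3) Round $d'$ down to the nearest integer $d''$.
If $t > t_{d'',\,1-\alpha/2}$ or $t < -t_{d'',\,1-\alpha/2}$, then reject $H_0$.
If $-t_{d'',\,1-\alpha/2} \le t \le t_{d'',\,1-\alpha/2}$, then accept $H_0$.

The acceptance and rejection regions for this test are illustrated in Figure 8.8.

Figure 8.8 Acceptance and rejection regions for the two-sample t test for independent samples with unequal variances ($t_{d''}$ distribution = approximate distribution of $t$ in Equation 8.21 under $H_0$)

Similarly, the approximate p-value for the hypothesis test can be computed as follows.

Equation 8.22 Computation of the p-Value for the Two-Sample t Test for Independent Samples with Unequal Variances (Satterthwaite Approximation) Compute the test statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
If $t \le 0$, then $p = 2 \times$ (area to the left of $t$ under a $t_{d''}$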
distribution). If $t > 0$, then $p = 2 \times$ (area to the right of $t$ under a $t_{d''}$ distribution), where $d''$ is given in Equation 8.21.

Computation of the p-value is illustrated in Figure 8.9.

Figure 8.9 Computation of the p-value for the two-sample t test for independent samples with unequal variances (two panels: $t \le 0$ and $t > 0$, each under a $t_{d''}$ distribution)

Example 8.17 Cardiovascular Disease, Pediatrics Consider the cholesterol data in Example 8.12. Test for the equality of the mean cholesterol levels of the children whose fathers have died from heart disease versus the children whose fathers do not have a history of heart disease.

Solution We have already tested for equality of the two variances in Example 8.15 and found them significantly different. Thus the two-sample t test for unequal variances in Equation 8.21 should be used. The test statistic is
$$t = \frac{207.3 - 193.4}{\sqrt{35.6^2/100 + 17.3^2/74}} = \frac{13.9}{4.089} = 3.40$$
The approximate degrees of freedom are now computed:
$$d' = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\left(s_1^2/n_1\right)^2/(n_1 - 1) + \left(s_2^2/n_2\right)^2/(n_2 - 1)} = \frac{\left(35.6^2/100 + 17.3^2/74\right)^2}{\left(35.6^2/100\right)^2/99 + \left(17.3^2/74\right)^2/73} = \frac{16.718^2}{1.8465} = 151.4$$
Therefore, the approximate degrees of freedom $= d'' = 151$. If the critical-value method is used, note that $t = 3.40 > t_{120,.975} = 1.980 > t_{151,.975}$. Therefore $H_0$ can be rejected using a two-sided test with $\alpha = .05$. Furthermore, $t = 3.40 > t_{120,.9995} = 3.373 > t_{151,.9995}$, which implies that the p-value $< 2 \times (1.0 - .9995) = .001$. To compute the exact p-value, we use Excel 2000, as shown in Table 8.6.

Table 8.6 Computation of the exact p-value for Example 8.17 using Excel 2000

two-tailed p-value    0.000862    =TDIST(3.4,151,2)

We see from Table 8.6 that the p-value $= 2 \times [1 - \Pr(t_{151} \le 3.40)] = .0009$. We conclude that mean cholesterol levels in children whose fathers have died from heart disease are significantly higher than mean cholesterol levels in children of fathers without heart disease.
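The steps of Equations 8.21 and 8.22 can be sketched in Python for the cholesterol data of Example 8.17. This is a hedged illustration, not the method used in the text: the function names are hypothetical, and the p-value is obtained by numerically integrating the t density with Simpson's rule (an assumption made here to stay within the standard library), whereas the text uses the Excel TDIST function.

```python
import math


def satterthwaite_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, d_prime, d_double_prime) as in Equation 8.21."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    d_prime = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, d_prime, math.floor(d_prime)  # d'' = d' rounded down


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area between 0 and |t|), Simpson's rule.

    `steps` must be even; this numerical shortcut is an assumption of
    this sketch, not the procedure used in the text.
    """
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)


# Example 8.17: case children (n = 100) vs. control children (n = 74)
t, d_prime, d_dd = satterthwaite_t(207.3, 35.6, 100, 193.4, 17.3, 74)
p = two_sided_p_value(t, d_dd)

print(f"t = {t:.2f}, d' = {d_prime:.1f}, d'' = {d_dd}, p = {p:.4f}")
```

Rounding $d'$ down, as Equation 8.21 prescribes, is slightly conservative; statistical packages often use the fractional $d'$ directly, which is why SAS output can show noninteger degrees of freedom.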
It would be of great interest to identify the cause of this difference; that is, whether it is due to genetic factors, environmental factors such as diet, or both.

In this chapter, two procedures for comparing two means from independent, normally distributed samples have been presented. The first step in this process is to test for the equality of the two variances, using the F test in Equation 8.15. If this test is not significant, then use the t test with equal variances; otherwise, use the t test with unequal variances. This overall strategy is illustrated in Figure 8.10.

Figure 8.10 Strategy for testing for the equality of means in two independent, normally distributed samples (perform the F test for the equality of two variances; if significant, perform the t test assuming unequal variances; if not significant, perform the t test assuming equal variances)

Example 8.18 Infectious Disease Using the data in Table 2.11, compare mean duration of hospitalization between antibiotic users and nonantibiotic users.

Solution Refer to Table 8.7, where the PC-SAS T-TEST program (PROC TTEST) was used to analyze these data. Among the 7 antibiotic users (ANTIB = yes), mean duration of hospitalization was 11.57 days with standard deviation 8.81 days; among the 18 nonantibiotic users (ANTIB = no), mean duration of hospitalization was 7.44 days with standard deviation 3.70 days. Both the F test and the t test with equal and unequal variances are displayed in this program. Using Figure 8.10, note that the first step in comparing the two means is to perform the F test for the equality of two variances, to decide whether to use the t test with equal or unequal variances. The F statistic is denoted in Table 8.7 by F' = 5.68, with p-value (labeled Prob > F') = .004. Thus the variances differ significantly, and a two-sample t test with unequal variances should be used. Therefore, refer to the Unequal Variance row, where the t statistic (as given in Equation 8.21) is 1.20 with degrees of freedom $d'$ (df) = 6.8. The corresponding two-tailed p-value (labeled Prob > |T|) = .271.
Thus no significant difference exists between the mean duration of hospitalization in these two groups.

If the results of the F test had revealed a nonsignificant difference between the variances of the two samples, then the t test with equal variances would have been used, which is provided in the Equal Variance row of the SAS output. In this example, considerable differences are present in both the test statistics (1.68 versus 1.20) and the two-tailed p-values (.106 versus .271) resulting from using these two procedures.

Table 8.7 Use of the PC-SAS T-TEST program (PROC TTEST) to analyze the association between antibiotic use and duration of hospitalization (raw data presented in Table 2.11)

ANTIB    N     Mean     Std Dev
no       18    7.44     3.70
yes      7     11.57    8.81

Variances    T       DF      Prob > |T|
Unequal      1.20    6.8     0.271
Equal        1.68    23.0    0.106

For H0: Variances are equal, F' = 5.68    DF = (6,17)    Prob > F' = 0.0043

Using similar methods to those developed in Section 8.5, we can show that a two-sided $100\% \times (1 - \alpha)$ confidence interval for the underlying mean difference $\mu_1 - \mu_2$ in the case of unequal variances is given as follows:

Equation 8.23 Two-Sided $100\% \times (1 - \alpha)$ Confidence Interval for $\mu_1 - \mu_2$ ($\sigma_1^2 \ne \sigma_2^2$)
$$\left(\bar{x}_1 - \bar{x}_2 - t_{d'',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},\ \ \bar{x}_1 - \bar{x}_2 + t_{d'',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\right)$$
where $d''$ is given in Equation 8.21.

Example 8.19 Infectious Disease Using the data in Table 8.7, compute a 95% CI for the mean difference in duration of hospital stay between patients who do and patients who do not receive antibiotics.

Solution Using Table 8.7, the 95% CI is given by
$$\left[(11.571 - 7.444) - t_{6,.975}\sqrt{8.81^2/7 + 3.70^2/18},\ (11.571 - 7.444) + t_{6,.975}\sqrt{8.81^2/7 + 3.70^2/18}\right]$$
$$= \left[4.127 - 2.447(3.442),\ 4.127 + 2.447(3.442)\right] = (4.127 - 8.423,\ 4.127 + 8.423) = (-4.30,\ 12.55)$$

In this section, we have introduced the two-sample t test for independent samples with unequal variances. This test is used to compare the mean level of a normally distributed random variable (or a random variable with samples large enough so the central-limit theorem can be assumed to hold) between two independent samples with unequal variances. If we refer to the flowchart (Figure 8.13, p.
338), then starting from position 1 we answer yes to (1) two-sample problem? (2) underlying distribution normal or can central-limit theorem be assumed to hold? (3) inferences concerning means? and (4) are samples independent? We answer yes to (5) variances of two samples significantly different? This leads us to the box labeled "Use two-sample t test with unequal variances."

Review Questions 8B (question text not recoverable from the source)

8.8 Case Study: Effects of Lead Exposure on Neurological and Psychological Function in Children

Example 8.20 Environmental Health, Pediatrics In Section 2.9, we described a study performed in El Paso, Texas, that examined the association between lead exposure and developmental features in children. There are different ways to quantify lead exposure. One method used in the study consisted of defining a control group of children whose blood-lead levels were < 40 μg/100 mL in both 1972 and 1973 (n = 78) and an exposed group of children who had blood-lead levels ≥ 40 μg/100 mL in either 1972 or 1973 (n = 46). Two outcome variables in the study were the number of finger-wrist taps per 10 seconds (a measure of neurological function) as well as the Wechsler full-scale IQ score (a measure of intellectual development). Because only children ≥ 5 years of age were given the neurological tests, we actually have 35 exposed and 64 control children who have finger-wrist tapping scores.

The distributions of these variables by group were displayed in a box plot in Figures 2.9 and 2.10, respectively. The distributions appeared to be reasonably symmetric, particularly in the exposed group, although there is a hint that a few outliers may be present. We discuss detection of outliers more formally in Section 8.9. We also note from these figures that the exposed group seems to have lower levels than the control group for both these variables. How can we confirm whether this impression is correct?

One approach is to use a two-sample t test to compare the mean level of the exposed group to the mean level of the control group on these variables. We used the PC-SAS TTEST procedure for this purpose, as shown in Tables 8.10 and 8.11. The program actually performs three different significance tests each time the t test procedure is specified.
In Table 8.10, we analyze the mean finger-wrist tapping scores. Following the flowchart in Figure 8.10, we first perform the F test for equality of two variances. In Table 8.10, the F statistic (labeled as F') = 1.19 with 34 and 63 df. The p-value (labeled Prob > F') equals 0.5408, which implies we can accept $H_0$ that the variances are not significantly different. Therefore, following Figure 8.10, we should perform the two-sample t test with equal variances (Equation 8.11). The t statistic, found in the T column and the Equal row, is 2.6772 with 97 df. The two-tailed p-value, found in the column headed Prob > |T| and the Equal row, is 0.0087, which implies there is a significant difference in mean finger-wrist tapping scores between the exposed and the control group, with the exposed group having lower mean scores. If there had been a significant difference between the variances from the F test, that is, if (Prob > F') < 0.05, then we would use the two-sample t test with unequal variances. The program automatically performs both t tests and lets the user decide which to use. If a two-sample t test with unequal variances were used, then referring to the Unequal row, the t statistic equals 2.6091 (as given in Equation 8.21) with 65 df ($d'$ in Equation 8.21), with a two-sided p-value equal to 0.0113. The program also provides the mean, standard deviation (Std Dev), and standard error (Std Error) for each group. Referring to Table 8.11, for the analysis of the full-scale IQ scores, we see that the p-value for the F test is 0.6982, which is not statistically significant. Therefore we again use the equal variance t test. The t statistic is 1.8334 with 122 df, with two-tailed p-value equal to 0.0692. Thus the mean full-scale IQ scores for the two groups do not differ significantly.

Table 8.10 Comparison of mean finger-wrist tapping scores for the exposed versus control group, using the SAS TTEST procedure

Variances    T         DF      Prob > |T|
Unequal      2.6091    65.0    0.0113
Equal        2.6772    97.0    0.0087

For H0: Variances are equal, F' = 1.19    DF = (34,63)    Prob > F' = 0.5408

Table 8.11 Comparison of mean full-scale IQ scores for the exposed versus control group, using the SAS TTEST procedure

Variances    T         DF       Prob > |T|
Equal        1.8334    122.0    0.0692

For H0: Variances are equal, Prob > F' = 0.6982

8.9 The Treatment of Outliers

We saw that the case study in Section 8.8 suggested there might be some outliers in the finger-wrist tapping and IQ scores. Outliers can have an important impact on the conclusions of a study. It is important to definitely identify outliers and either exclude them outright, or at least perform alternative analyses with and without the outliers present. Therefore, in this section we study some decision rules for outlier detection.

We refer to Figures 8.11 and 8.12, which provide stem-and-leaf and box plots from SAS of the finger-wrist tapping scores and the full-scale IQ scores for the control group and the exposed group, respectively. According to the box plots in Figure 8.11, there are potential outlying finger-wrist tapping scores (denoted by zeros in the plot) of 13, 23, 26, and 84 taps per 10 seconds for the control group and 13, 14, and 83 taps per 10 seconds for the exposed group. According to the box plots in Figure 8.12, there are potential outlying full-scale IQ scores of 50, 56, 125, 128, and 141 for the control group and 46 for the exposed group. All the potentially outlying values are far from the mean in absolute value. Therefore, a useful way to quantify an extreme value is by the number of standard deviations that a value is from the mean. This statistic applied to the most extreme value in a sample is called the Extreme Studentized Deviate (or ESD) and is defined as follows:

Definition 8.7 The Extreme Studentized Deviate (or ESD statistic) $= \max_{i=1,\ldots,n} |x_i - \bar{x}|/s$.

Example 8.21 Compute the ESD statistic for the finger-wrist tapping scores for the control group.

Solution From Table 8.10, we see that $\bar{x} = 54.4$, $s = 12.1$. From Figure 8.11a we note that the distances from the mean for the smallest and largest values are $|13 - 54.4| = 41.4$ and
