
HYPOTHESIS TESTING

DEFINITION
Hypothesis tests are procedures for making rational decisions about the reality of effects.

Rational Decisions

Most decisions require that an individual select a single alternative from a number of possible alternatives. The decision is made without knowing whether or not it is correct; that is, it is based on incomplete information. For example, a person either takes or does not take an umbrella to school based upon both the weather report and observation of outside conditions. If it is not currently raining, this decision must be made with incomplete information. A rational decision is characterized by the use of a procedure which ensures that the likelihood or probability of success is incorporated into the decision-making process. The procedure must be stated in such a fashion that another individual, using the same information, would make the same decision.

One is reminded of a STAR TREK episode. Captain Kirk, for one reason or another, is stranded on a planet without his communicator and is unable to get back to the Enterprise. Spock has assumed command and is being attacked by Klingons (who else?). Spock asks for and receives information about the location of the enemy, but is unable to act because he does not have complete information. Captain Kirk arrives at the last moment and saves the day because he can act on incomplete information. This story goes against the concept of rational man. Spock, being the ultimate rational man, would not be immobilized by indecision. Instead, he would have selected the alternative which realized the greatest expected benefit given the information available. If complete information were required to make decisions, few decisions would be made by rational men and women. This is obviously not the case. The script writer misunderstood Spock and rational man.

Effects

When a change in one thing is associated with a change in another, we have an effect.
The changes may be either quantitative or qualitative, with the hypothesis testing procedure selected based upon the type of change observed. For example, if changes in salt intake in a diet are associated with activity level in children, we say an effect occurred. In another case, if the distribution of political party preference (Republicans, Democrats, or Independents) differs by sex (Male or Female), then an effect is present. Much of behavioral science is directed toward discovering and understanding effects. The effects discussed in the remainder of this text appear as various statistics, including differences between means, contingency tables, and correlation coefficients.

GENERAL PRINCIPLES
All hypothesis tests conform to similar principles and proceed with the same sequence of events.

A model of the world is created in which there are no effects. The experiment is then repeated an infinite number of times. The results of the experiment are compared with the model of step one. If, given the model, the results are unlikely, then the model is rejected and the effects are accepted as real. If the results could be explained by the model, the model must be retained. In the latter case no decision can be made about the reality of effects.

Hypothesis testing is equivalent to the geometrical concept of hypothesis negation. That is, if one wishes to prove that A (the hypothesis) is true, one first assumes that it isn't true. If it is shown that this assumption is logically impossible, then the original hypothesis is proven. An analogous situation exists with respect to hypothesis testing in statistics. In hypothesis testing one wishes to show real effects of an experiment. By showing that the experimental results were unlikely, given that there were no effects, one may decide that the effects are, in fact, real. The hypothesis that there were no effects is called the NULL HYPOTHESIS. The symbol H0 is used to abbreviate the Null Hypothesis in statistics. Note that, unlike geometry, the hypothesis may never be proven: we cannot prove the effects are real; rather, we may decide that the model of no effects is unlikely enough that the opposite hypothesis, that of real effects, must be accepted. For example, suppose the following probability model (distribution) described the state of the world. In this case the decision would be that there were no effects; the null hypothesis is true.

Event A might be considered fairly likely, given the above model was correct. As a result the model would be retained, along with the NULL HYPOTHESIS. Event B, on the other hand, is unlikely, given the model. Here the model would be rejected, along with the NULL HYPOTHESIS.

The Model

The SAMPLING DISTRIBUTION is a distribution of a sample statistic. It is used as a model of what would happen if

1.) the null hypothesis were true (there really were no effects), and
2.) the experiment was repeated an infinite number of times.

Because of its importance in hypothesis testing, the sampling distribution will be discussed in a separate chapter.

Probability

Probability is a theory of uncertainty. It is a necessary concept because the world according to the scientist is unknowable in its entirety. However, prediction and decisions are obviously possible. As such, probability theory is a rational means of dealing with an uncertain world. Probabilities are numbers associated with events that range from zero to one (0-1). A probability of zero means that the event is impossible. For example, if I were to flip a coin, the probability of a leg is zero, due to the fact that a coin may have a head or a tail, but not a leg. Given a probability of one, however, the event is certain. For example, if I flip a coin, the probability of heads, tails, or an edge is one, because the coin must take one of these possibilities. In real life, most events have probabilities between these two extremes. For instance, the probability of rain tonight is .40; tomorrow night the probability is .10. Thus it can be said that rain is more likely tonight than tomorrow.

The meaning of the term probability depends upon one's philosophical orientation. In the CLASSICAL approach, probabilities refer to the relative frequency of an event, given that the experiment was repeated an infinite number of times. For example, the .40 probability of rain tonight means that if the exact conditions of this evening were repeated an infinite number of times, it would rain 40% of the time. In the SUBJECTIVE approach, however, the term probability refers to a "degree of belief." That is, the individual assigning the number .40 to the probability of rain tonight believes that, on a scale from 0 to 1, the likelihood of rain is .40.
This leads to a branch of statistics called "BAYESIAN STATISTICS." While many statisticians take this approach, it is not usually taught at the introductory level. At this point all the introductory student needs to know is that a person calling themselves a "Bayesian Statistician" is not ignorant of statistics. Most likely, he or she is simply involved in the theory of statistics. No matter what theoretical position is taken, all probabilities must conform to certain rules. Some of the rules are concerned with how probabilities combine with one another to form new probabilities. For example, when events are independent, that is, one doesn't affect the other, the probabilities may be multiplied together to find the probability of the joint event. The probability of rain today AND getting a head when flipping a coin is the product of the two individual probabilities.

A deck of cards illustrates other principles of probability theory. In bridge, poker, rummy, etc., the probability of a heart can be found by dividing thirteen, the number of hearts, by fifty-two, the number of cards, assuming each card is equally likely to be drawn. The probability of a queen is four (the number of queens) divided by the number of cards. The probability of a queen OR a heart is sixteen divided by fifty-two. This figure is computed by adding the probability of hearts to the probability of a queen, and then subtracting the probability of a queen AND a heart, which equals 1/52. An introductory mathematical probability and statistics course usually begins with the principles of probability and proceeds to the applications of these principles. One problem a student might encounter concerns unsorted socks in a sock drawer. Suppose one has twenty-five pairs of unsorted socks in a sock drawer. What is the probability of drawing out two socks at random and getting a pair? What is the probability of getting a match to one of the first two when drawing out a third sock? How many socks on the average would need to be drawn before one could expect to find a pair? This problem is rather difficult and will not be solved here, but is used to illustrate the type of problem found in mathematical statistics.
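These rules can be checked numerically, and even the sock-drawer problem yields to simulation. The Python sketch below is our addition, not part of the original text: it applies the multiplication rule to the rain-and-coin example, the addition rule to the queen-or-heart example, and estimates the two-sock probability by simulation. The assumption that the twenty-five pairs are mutually distinct is ours.

```python
import random
from fractions import Fraction

# Multiplication rule for independent events: P(rain AND heads)
p_rain, p_heads = 0.40, 0.50
print(p_rain * p_heads)  # 0.2

# Addition rule: P(queen OR heart) = P(queen) + P(heart) - P(queen AND heart)
p_queen_or_heart = Fraction(4, 52) + Fraction(13, 52) - Fraction(1, 52)
print(p_queen_or_heart)  # 16/52, which reduces to 4/13

# Sock drawer: 25 distinct pairs = 50 socks.  Estimate by simulation the
# probability that two socks drawn at random form a pair.
random.seed(0)
socks = [pair for pair in range(25) for _ in range(2)]
trials = 100_000
matches = 0
for _ in range(trials):
    first, second = random.sample(socks, 2)  # two different socks, no replacement
    if first == second:                      # same pair label -> a match
        matches += 1
print(matches / trials)  # close to 1/49, about 0.02
```

The simulated value agrees with the exact answer 1/49: the first sock can be anything, and only one of the remaining forty-nine socks completes the pair.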

ERRORS IN HYPOTHESIS TESTING


A superintendent in a medium-size school has a problem. The mathematics scores on nationally standardized achievement tests, such as the SAT and ACT, of the students attending her school are lower than the national average. The school board members, who don't care whether the football or basketball teams win or not, are greatly concerned about this deficiency. The superintendent fears that if it is not corrected, she will lose her job before long.

As the superintendent was sitting in her office wondering what to do, a salesperson approached with a briefcase and a sales pitch. The salesperson had heard about the problem of the mathematics scores and was prepared to offer the superintendent a "deal she couldn't refuse." The deal was teaching machines to teach mathematics, guaranteed to increase the mathematics scores of the students. In addition, the machines never take breaks or demand a pay increase. The superintendent agreed that the machines might work, but was concerned about the cost. The salesperson finally wrote some figures. Since there were about 1000 students in the school and one machine was needed for every ten students, the school would need about one hundred machines. At a cost of $10,000 per machine, the total cost to the school would be about $1,000,000. As the superintendent picked herself up off the floor, she said she would consider the offer, but didn't think the school board would go for such a big expenditure without prior evidence that the machines actually worked. Besides, how did she know that the company that manufactures the machines might not go bankrupt in the next year, leaving the school stuck with a million dollars' worth of useless electronic junk?

The salesperson was prepared, because an offer was made to lease ten machines to the school for testing purposes for one year at a cost of $500 each. At the end of the year the superintendent would make a decision about the effectiveness of the machines. If they worked, she would pitch them to the school board; if not, then she would return the machines with no further obligation. An experimental design was agreed upon. One hundred students would be randomly selected from the student population and taught using the machines for one year. At the end of the year, the mean mathematics scores of those students would be compared to the mean scores of the students who did not use the machines. If the means were different enough, the machines would be purchased. The astute student will recognize this as a nested t-test. In order to help decide how different the two means would have to be in order to buy the machines, the superintendent did a theoretical analysis of the decision process. This analysis is presented in the following decision box.
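The logic of the design above, comparing the mean of the machine-taught group with the mean of the other students, can be sketched in code. The scores below are invented for illustration, and the permutation test is our stand-in for the t-test named in the text; both ask how likely so large a mean difference would be if the machines had no effect.

```python
import random
from statistics import mean

def perm_test(a, b, n_perm=10_000, seed=1):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = a + b
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                     # relabel scores at random
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(mean(perm_a) - mean(perm_b)) >= observed:
            count += 1
    return count / n_perm                       # fraction as extreme as observed

# Hypothetical test scores (invented numbers, for illustration only)
machine = [52, 55, 58, 61, 54, 59, 57, 60]
control = [48, 51, 50, 53, 47, 52, 49, 51]
print(perm_test(machine, control))  # small p-value: the means differ
```

A small returned proportion plays the role of a small p-value: if the machines did nothing, random relabeling would almost never produce so large a gap between group means.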

                                    "Real World"
  DECISION                      The machines don't work.    The machines work.
  ----------------------------------------------------------------------------
  Buy the machines.             Type I ERROR                CORRECT ("power")
  Decide the machines work.     probability = α             probability = 1 - β

  Do not buy the machines.      CORRECT                     Type II ERROR
  Decide that the machines      probability = 1 - α         probability = β
  do not work.

The decision box has the decision that the superintendent must make on the left-hand side. For simplicity's sake, only two possibilities are permitted: either buy all the machines or buy none of the machines. The columns at the top represent "the state of the real world." The state of the real world can never be truly known, because if it were known whether or not the machines worked, there would be no point in doing the experiment. The four cells represent various places one could be, depending upon the state of the world and the decision made. Each cell will be discussed in turn.

1. Buying the machines when they do not work.

This is called a Type I error and in this case is very costly ($1,000,000). The probability of this type of error is α, also called the significance level, and is directly controlled by the experimenter. Before the experiment begins, the experimenter directly sets the value of α. In this case the value of α would be set low, lower than the usual value of .05, perhaps as low as .0001, which means that one time out of 10,000 the experimenter would buy the machines when they didn't work.

2. Not buying the machines when they really don't work.

This is a correct decision, made with probability 1 - α, when in fact the teaching machines don't work and the machines are not purchased.

The relationship between the probabilities in these two cells can be illustrated using the sampling distribution when the null hypothesis is true. The decision point is set by α, the area in the tail or tails of the distribution. Setting α smaller moves the decision point further into the tails of the distribution.

3. Not buying the machines when they really work.

This is called a Type II error and is made with probability β. The value of β is not directly set by the experimenter, but is a function of a number of factors, including the size of α, the size of the effect, the size of the sample, and the variance of the original distribution. The value of β is inversely related to the value of α; the smaller the value of α, the larger the value of β. It can now be seen that setting the value of α to a small value was not done without cost, as the value of β is thereby increased.

4. Buying the machines when they really work.

This is the cell where the experimenter would usually like to be. The probability of making this correct decision is 1 - β and is given the name "power." Because α was set low, β would be high, and as a result 1 - β would be low. Thus it would be unlikely that the superintendent would buy the machines, even if they did work. The relationship between the probability of a Type II error (β) and power (1 - β) is illustrated below in a sampling distribution when there actually was an effect.

The relationship between the sizes of α and β can be seen in the following illustration, which combines the two previous distributions (H0 true and H1 true) into overlapping distributions, the top graph with α = .05 and the bottom with α = .01.

The size of the effect is the difference between the center points (the means) of the two distributions. If the size of the effect is increased, the relationship between the probabilities of the two types of errors is changed.

When the error variance of the scores is decreased, the probability of a Type II error is decreased if everything else remains constant, as illustrated below.

An interactive exercise designed to allow exploration of the relationships between alpha, size of effects, size of sample (N), size of error, and beta can now be understood. The values of alpha, size of effects, size of sample, and size of error can all be adjusted with the appropriate scroll bars. When one of these values is changed, the graphs will change and the value of beta will be re-computed. The area representing the value of alpha on the graph is drawn in dark gray. The area representing beta is drawn in dark blue, while the corresponding value of power is represented by the light blue area. Using this exercise the student should verify:

The size of beta decreases as the size of error decreases.
The size of beta decreases as the size of the sample increases.
The size of beta decreases as the size of alpha increases.
The size of beta decreases as the size of the effects increases.

The size of the increase or decrease in beta is a complex function of changes in all of the other values. For example, changes in the size of the sample may have either small or large effects on beta depending upon the other values. If a large treatment effect and small error are present in the experiment, then changes in the sample size are going to have a small effect.
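The four relationships above can also be verified without the interactive exercise. The sketch below is our addition: it computes the power of a one-sided z-test for a shift in a mean, with an arbitrary choice of effect 5, sigma 15, and n = 25; the assertions restate the exercise's four claims.

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def normal_quantile(p, lo=-10.0, hi=10.0):
    """Inverse of the normal CDF by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def power(effect, sigma, n, alpha):
    """Power of a one-sided z-test for a mean shift of `effect`."""
    crit = normal_quantile(1 - alpha)      # decision point under H0
    shift = effect / (sigma / sqrt(n))     # effect in standard-error units
    return 1 - normal_cdf(crit - shift)    # P(reject H0 | H1 true) = 1 - beta

base = power(effect=5, sigma=15, n=25, alpha=0.05)
print(round(base, 3))
# The four relationships from the exercise (more power means smaller beta):
assert power(5, 10, 25, 0.05) > base   # smaller error    -> smaller beta
assert power(5, 15, 100, 0.05) > base  # larger sample    -> smaller beta
assert power(5, 15, 25, 0.10) > base   # larger alpha     -> smaller beta
assert power(10, 15, 25, 0.05) > base  # larger effect    -> smaller beta
```

Changing one input at a time, as the scroll bars do, shows each relationship in isolation; the size of each change in beta still depends on all the other values at once.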

A SECOND CHANCE
As might be expected, in the previous situation the superintendent decided not to purchase the teaching machines, because she had essentially stacked the deck against deciding that there were any effects. When she described the experiment and the result to the salesperson the next year, the salesperson listened carefully and understood the reason why α had been set so low. The salesperson had a new offer to make, however. Because of an advance in microchip technology, the entire teaching machine had been placed on a single integrated circuit. As a result the price had dropped to $500 a machine. Now it would cost the superintendent a total of $50,000 to purchase the machines, a sum that is quite reasonable.

The analysis of the probabilities of the two types of errors revealed that the cost of a Type I error, buying the machines when they really don't work ($50,000), is small when compared to the loss encountered in a Type II error, when the machines are not purchased when in fact they do work, although it is difficult to put into dollars the cost of the students not learning to their highest potential. In any case, the superintendent would probably set the value of α to a fairly large value (.10 perhaps) relative to the standard value of .05. This would have the effect of decreasing the value of β and increasing the power (1 - β) of the experiment. Thus the decision to buy the machines would be made more often if in fact the machines worked. The experiment was repeated the next year under the same conditions as the previous year, except the size of α was set to .10. The results of the significance test indicated that the means were significantly different, the null hypothesis was rejected, and a decision about the reality of effects was made. The machines were purchased, the salesperson earned a commission, the math scores of the students increased, and everyone lived happily ever after.

THE ANALYSIS GENERALIZED TO ALL EXPERIMENTS

The analysis of the reality of the effects of the teaching machines may be generalized to all significance tests. Rather than buying or not buying the machines, one rejects or retains the null hypothesis. In the "real world," rather than the machines working or not working, the null hypothesis is true or false. The following presents the decision box representing significance tests in general.

                                          "Real World"
                                   NULL TRUE                NULL FALSE
                                   ALTERNATIVE FALSE        ALTERNATIVE TRUE
  DECISION                         (No Effects)             (Real Effects)
  ----------------------------------------------------------------------------
  Reject Null.                     Type I ERROR             CORRECT ("power")
  Accept Alternative.              prob = α                 prob = 1 - β
  Decide there are real effects.

  Retain Null.                     CORRECT                  Type II ERROR
  Retain Alternative.              prob = 1 - α             prob = β
  Decide that no effects
  were discovered.

CONCLUSION
Setting the value of α is not automatic, but depends upon an analysis of the relative costs of the two types of errors. The probabilities of the two types of errors (I and II) are inversely related. If the cost of a Type I error is high relative to the cost of a Type II error, then the value of α should be set relatively low. If the cost of a Type I error is low relative to the cost of a Type II error, then the value of α should be set relatively high.

How to compare data sets -- Anova


In 1920, Sir Ronald A. Fisher invented a statistical way to compare data sets. Fisher called his method the Analysis of Variance, which was later dubbed an ANOVA. This method eventually evolved into Six Sigma data set comparisons. An ANOVA is a guide for determining whether or not an event was most likely due to the random chance of natural variation. Or, conversely, the same method provides guidance in saying, with a 95% level of confidence, that a certain factor (X) or factors (X, Y, and/or Z) were the more likely reason for the event. The F ratio is the probability information produced by an ANOVA. It was named for Fisher. The orthogonal array and the DMAIC designed experiment's cube were also his inventions. An ANOVA can be, and ought to be, used to evaluate differences between data sets. It can be used with any number of data sets, recorded from any process. The data sets need not be equal in size. Data sets suitable for an ANOVA can range from as small as three or four numbers to infinitely large sets of numbers.

How to Complete an Excel ANOVA

The difficulty of calculating ANOVAs by hand prevented most people from using this Six Sigma tool until the 1990s. Now, using software like Microsoft Excel, anyone and everyone can quickly determine whether differences in a set of counts or measurements were most likely due to chance variation or should more likely be attributed to a "combination of factors." These variables are often labeled factor X, Y, or Z. Here is how you could use an Excel ANOVA to determine who is a better bowler. You could and can use an ANOVA to compare any scores. Lengths of stay, days in AR, the number of phone calls, readmission rates, stock prices, and any other measure are all fair game for an ANOVA. Below are six game scores for three bowlers. Which bowler is best? If there is a best bowler, is the difference between bowlers statistically significant?

Step 1. Recreate the columns using Excel. Each bowler's name is the field title.

Step 2. Go to Tools and select Data Analysis as shown. If Data Analysis does not appear as the last choice on the list in your computer, you must click Add-Ins and check the Analysis ToolPak option.

Step 3. Click OK to the first choice, ANOVA: Single Factor.

Step 4. Click and drag your mouse from Pat's name to the last score in Sheri's column. This automatically completes the Input Range for you: $A$1:$C$7. Click the box labeled "Labels in First Row." Click Output Range. Then either type in an empty cell location or mouse-click an empty cell, $I$2, as illustrated by the dotted cell below. Click OK.

Step 5. Interpret the probability results by evaluating the F ratio. If the F ratio is larger than the F critical value (F crit), there is a statistically significant difference. If it is smaller than the F crit value, the score differences are best explained by chance.

The F ratio, 12.57, is larger than the F crit value, 3.68. Mark is a better bowler. The difference between him and the other two bowlers is statistically significant. Excel automatically calculated the average, the variance -- which is the standard deviation squared -- and the essential probability information instantly. You can use this technique to compare physicians, nurses, hospital lengths of stay, revenue, expense, supply cost, days in accounts receivable, or any other factor of interest.
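The same F ratio can be computed outside Excel. The sketch below is our addition: it implements single-factor ANOVA directly from its definition, with invented bowling scores standing in for the original table, which is not reproduced here.

```python
from statistics import mean

def anova_f(*groups):
    """Single-factor ANOVA: return the F ratio for any number of groups."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    # Between-group variation: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    df_between = len(groups) - 1
    # Within-group variation: how far scores sit from their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical six-game series for three bowlers (invented numbers)
pat   = [140, 155, 150, 145, 160, 148]
mark  = [175, 180, 168, 185, 172, 178]
sheri = [150, 148, 158, 145, 152, 155]
f = anova_f(pat, mark, sheri)
print(round(f, 2))
# With 2 and 15 degrees of freedom, F crit at the 0.05 level is 3.68, so an
# F ratio well above that indicates a statistically significant difference.
```

The function mirrors what Excel's ANOVA: Single Factor output reports as "F": the ratio of between-group to within-group mean squares.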

The Graphic ANOVA

Excel takes care of the first three Six Sigma rules for completing an analysis. Unfortunately, it does not create a meaningful, analytic graph. As mentioned earlier, most Excel graphs are descriptive, rather than analytic.

As you advance in your Six Sigma learning you may want to learn to use a more advanced Six Sigma software program. One such program is called Stat-Ease Design-Expert. Stat-Ease calculates an ANOVA and graphically shows statistical differences between sets of data. It is all achieved with mouse clicks. You won't have to look at, or calculate, any equation.

Each I-Bar has a black square in its center. This square identifies the average score for each bowler. The top and bottom of each I-Bar extend two standard deviations above and below the mean. Think of these as the Upper Control Limit (UCL) and Lower Control Limit (LCL) for each bowler's score. Each I-Bar covers 95% of an imaginary, on-its-side bell curve for each bowler.

This Six Sigma data array of fields and records would tell us a little about each observation. The more fields, the richer our understanding can be. For example, in the same amount of space the following table has twice as much data. Rich data, meaning each column/field has a crystal-clear operational definition, can yield rich information. Many times we collect dozens of fields for each recorded observation. Since data collection is time consuming and expensive, design your collection plan with care before you begin.

Note the overlapping values in red circle markers. On occasion both Pat and Sheri could have bowled a better game than Mark. But, when the data are viewed using the mean

value, the standard deviation, the probability information provided by the F ratio, and a meaningful analytic graph, Mark is obviously a better bowler. In fact, as we mentioned before, he is a better bowler in a statistically significant way.

About The Author

Daniel Sloan has provided senior executive leadership, project management, seminar leadership, education, Six Sigma training, and consultant services to manufacturing companies, software corporations, computer network companies, health care corporations, aerospace, insurance, and governmental agencies in 38 of the United States, Australia, Uruguay, Mexico, and Brazil. Mr. Sloan has provided consultant services to a diverse group of other organizations including: the Washington State Department of Health Facilities, Services, and Licensing Division, City University, the University of Pittsburgh, and the University of Washington Business School.

When Definitions Collide
by Thomas Pyzdek


It's important to think before reacting to an observed difference, whether it's significant or not

Few things breed more misunderstanding than the term "significance" in statistical process control. Quality and statistics professionals often use the word in one sense while their audience understands it in a completely different way. For example, I once helped a team of process engineers experiment on a new vapor-phase soldering process. Being new, the process produced much variation from many unknown sources. This statistical noise overwhelmed any signals caused by the controlled changes we made to the variables we were studying. I commented that, based on the experiments' results, none of the effects were significant. An engineer disagreed, saying that, according to well-established scientific principles, all the variables tested were important. He was right. So was I. Our miscommunication stemmed from different interpretations of "significant."

When discussing statistical significance, statisticians and quality engineers usually are referring to significance tests such as Student's t-test for equality of means. In simple terms, significance testing answers the question, "Are these things really different, or could the difference be due to pure chance?" However, problems arise when these test results are presented to nonstatisticians. In common parlance, "significance" and "importance" are synonymous. When statisticians speak of "statistical significance" (often dropping the "statistical" and adding to the confusion), then "importance" and "significance" bear no relationship to one another. In fact, the following can be said:

A difference or effect can be important but insignificant.
A difference or effect can be both significant and important.
A difference or effect can be unimportant and insignificant.
A difference or effect can be significant but of no importance to the situation at hand.

In short, "important" and (statistically) "significant" are not synonymous. Consider the following examples.

Important but not significant -- A hospital's fatality rate for coronary artery bypass surgery doubled from one month to the next. The loss of life clearly is important. However, if the average death rate is low -- say, 0.5 percent -- and the sample size is small -- 100 surgeries per month -- then one month's fatality rate can double the previous month's without being (statistically) significant.

Significant and important -- A part's average hole size falls below the lower control limit, and the part won't work in the field. This agreement in meaning between significant and important is, however, purely coincidental. The fact that it often occurs only adds to the confusion in other circumstances.

Unimportant and not significant -- The part's hole size changes by a statistically meaningless amount, and the part still functions perfectly. Again, agreement in meaning here can cause confusion at other times.

Significant but not important -- Carpenter A saws wall studs to within ±1/64" of the nominal length, while Carpenter B saws them with twice the variation at ±1/32". The contractor can tolerate ±1/8". Because each carpenter cuts many studs, even small differences are statistically significant. However, a difference of only 1/64" in a sawing process's total spread is of no economic or architectural importance.

Significance is a technical term when used in statistical science. It has a precise meaning as determined by the statistical test being applied. While not advisable, it's possible to determine statistical significance without knowing what the numbers mean or how the results will be used. Importance, on the other hand, is a nontechnical term. The questions being answered are implicit: Important to whom? Important for what purpose? Unlike analyzing statistical significance, pat answers can't be given in advance regarding a particular number's importance.
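The hospital example can be made concrete. Assuming a true fatality rate of 0.5 percent and 100 surgeries, an exact binomial tail probability (our worked sketch, not part of the original column) shows that observing a doubled rate -- one death -- is entirely unremarkable:

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Baseline fatality rate 0.5%; 100 surgeries in a month; the observed rate
# doubles to 1% (one death).  How surprising is one or more deaths if the
# true rate is still 0.5%?
p_value = binom_tail(1, 100, 0.005)
print(round(p_value, 3))  # about 0.394 -- nowhere near significance
```

A p-value near 0.39 means roughly two months in five would show at least one death even with no change in the underlying process: important, but not statistically significant.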

It's important to think before reacting to an observed difference, whether it's significant or not. Figure 1 provides some useful guidelines for the thinking process.

About the author

Thomas Pyzdek is president and CEO of Pyzdek Management Inc. Comments can be e-mailed to him at tpyzdek@qualitydigest.com.

P values and statistical significance

Reply 1
"Hypothesis tests are typically used in the [Six Sigma] Analyze Phase to identify the critical x's (inputs) of a process. Generally these critical x's are assumed to exist when we reject the null hypothesis. The significance level (alpha) used in these hypothesis tests is often set at 5% (i.e., a p-value < 0.05 threshold). When a hypothesis test is performed, the p-value represents the probability of getting a sample as extreme (or worse) assuming the null hypothesis is true. We therefore reject the null hypothesis when the p-value is less than the significance level we established."

Reply 2
"The basic notion of the p value is this: What is the probability (p) that the association seen in the data (in this case the correlation) would have been seen by chance (i.e., if in fact there is no relationship between the variables)? Or, more accurately, what is the probability that you will again get this value for extent of correlation if in fact there is no correlation? If the p-value is 0.05, the common understanding is that the observed relationship would be expected in 5% of measurements if there was no correlation. You would expect 1 out of 20 samples to show a correlation when in fact the variables were not correlated." Posted by: Don Nanneman
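The "1 out of 20 samples" remark in Reply 2 can be demonstrated by simulation. The sketch below is our illustration: it draws thousands of samples from a process with no effect at all and counts how often a two-sided z-test at alpha = 0.05 nevertheless rejects the null.

```python
import random
from math import sqrt
from statistics import mean

# Simulate many samples from a process with NO real effect and count how
# often a z-test on the sample mean is "significant" at alpha = 0.05.
random.seed(42)
n, trials, rejections = 20, 5000, 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]  # the null is true here
    z = mean(sample) * sqrt(n)                       # z-statistic, sigma = 1 known
    if abs(z) > 1.96:                                # two-sided 5% test
        rejections += 1
print(rejections / trials)  # close to 0.05: about 1 false alarm in 20
```

Because every sample comes from the no-effect world, every rejection here is a Type I error, and their long-run frequency is the alpha that was chosen.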

Reply 3
"I always struggled with p-values until I heard this helper. First, the null hypothesis is always 'no difference' (i.e., for normality -- the sample is no different from the normal curve; for a 2-sample t -- the two samples are no different; etc.). The alternate is always that there is a difference. Then remember this saying: If the p is high (>.05), the null will fly -- if the p is low (<.05), the null must GO!" Posted by: Cindy

Reply 4
Another reader added to Cindy's thoughts: "Just for completeness, there are other null hypotheses that may be tested for. For example, we may investigate the null hypothesis that data may have come from a Normal distribution (via the Anderson-Darling or Ryan-Joiner test, etc.) or that the process mean is equal to some specified value, e.g., H0: mu = 250g, etc. Your rhyme is still valid, though one should always consider the sample size selected -- would we be able to detect a difference, say, of a certain size if it were truly present? That's why we have power and sample size computational functionality as well, of course."

Reply:
In statistics you can virtually never get data from an entire population, so you have to take samples. A p-value is just an indication that there is a high chance that the factor or data is significant, although it is never black or white. How sensitively (or how thoroughly) you want to analyze the data determines where you set your p-value threshold for significance. I'm not sure of your exact situation, but it sounds like the higher you set your p-value threshold, the more thorough you are trying to be, particularly during a DOE when you are trying to measure interaction effects. A p-value of .05 may show no significance; however, a p-value of .20 may show significance. Basically it's an arbitrary but well-thought-out level for detecting significance in your data. The higher, the more thorough. "P-values in normality tests are something a little different. If they are above .05, chances are good your data is normal. If less than .05, chances are good it's not, and you should look for other options. Again, this is just an inference made from the sample of the population you took. But you'd better believe it's usually dead on."

Reply: "The p-value is simply the actual level of confidence provided by the model in question (regression, hypothesis test, F test, difference of means, etc.). For example, if you decide to set your confidence level at 0.05, which means that you are willing to allow for a 5% chance of erring in your final analysis (i.e. finding a significant association when one does not really exist), and the p-value is 0.0035, then your model passes the test."

Reply: "I'm guessing that you're discussing the p-value related to the correlation coefficient. Of course, the p-value represents the probability of incorrectly rejecting the null hypothesis. If the p-value is less than some significance level alpha (typically practitioners use an alpha of 0.05), then we say that the result is statistically significant (at the 5% level), i.e. the probability of incorrectly rejecting the null hypothesis is less than 5%. "For the test I think you're alluding to, it would indicate that we would reject the null hypothesis that rho (the true correlation coefficient) is equal to zero; hence there may be some evidence to suggest that a linear relation is present. Don't ignore a scatter diagram, though, of course!"

Another reader continued the explanation: "The previous reply is exactly right, but maybe a further discussion will help. Imagine that there is a universe of points from the process you are studying. You take a sample of those to see if you can prove or disprove correlation. The truth that you are assuming is that there is no, or null, correlation. "Now you develop some test, or way to mathematically relate the sample to some statistic, in this case rho. You then compare it to some reference distribution. The probability (p) that you selected the sample in such a way that you got a sample that shows there is some correlation (i.e. that rho is not zero when in fact it is zero, the truth you assumed) is the p-value. In other words, it is the probability that your sample indicates that the state of nature in the universe is different from the truth you assumed, when your assumption was in fact correct. In this case, it is the probability that your sample indicates the correlation is not zero when in fact it is zero. "Usually, if we have a one-in-twenty (0.05) chance of making the wrong decision, we are satisfied that there is a difference, i.e. statistical significance, and we reject the null hypothesis that there is no difference. You can set this level based on your need to be right. In drug-testing work, for instance, a p = 0.01 is often used, since the consequences of being wrong are much more severe than being wrong about a knob for a radio." Posted by: Dave Strouse
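The point that the threshold is a choice driven by the stakes can be reduced to a tiny sketch (the p-value of 0.08 and the function name are made up for illustration):

```python
# The same p-value leads to different decisions under different alphas:
# a strict drug-trial threshold vs. a looser DOE screening threshold.
def reject_null(p_value: float, alpha: float) -> bool:
    """Reject H0 when the p-value falls below the chosen significance level."""
    return p_value < alpha

p = 0.08
decision_drug_trial = reject_null(p, alpha=0.01)  # strict: consequences are severe
decision_doe_screen = reject_null(p, alpha=0.20)  # looser: casting a wide net
```

With p = 0.08 the strict test fails to reject while the looser screen rejects, which is exactly the "set this level based on your need to be right" trade-off described above.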

Reply:
And yet another reader continued the explanation: "What the earlier readers are saying is: if the p-value for a correlation coefficient test is less than 0.05, it indicates that the correlation coefficient IS significantly different from zero (either positive or negative) at the alpha = 0.05 level. This means that there is some significant amount of linear relationship between your two variables of interest. This test uses the test statistic t0 = [r*sqrt(n-2)] / sqrt(1-r^2). It has been proven that IF the true correlation coefficient is equal to zero, then t0 will follow the t-distribution with n-2 degrees of freedom. "Extreme values of this t0 statistic are indicative that t0 does not actually follow a t-distribution with n-2 df; thus they indicate that the true correlation coefficient is NOT equal to zero. Extreme values of t0 are characterized by small p-values. Thus, small p-values indicate that the true correlation coefficient is NOT equal to zero." Posted by: Dave Strouse
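The test statistic quoted above can be checked numerically against scipy's built-in Pearson correlation test (a sketch under made-up data; the seed and sample size are arbitrary):

```python
# Compute t0 = [r*sqrt(n-2)] / sqrt(1-r^2) by hand, convert it to a
# two-sided p-value via the t-distribution with n-2 df, and compare
# with the p-value scipy reports for the same data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(size=25)
y = 0.5 * x + rng.normal(size=25)    # some genuine linear relationship

r, p_scipy = stats.pearsonr(x, y)
n = len(x)
t0 = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)

# Under H0: rho = 0, t0 follows a t-distribution with n-2 degrees of
# freedom; the two-sided p-value is the chance of a more extreme |t0|.
p_manual = 2 * stats.t.sf(abs(t0), df=n - 2)
```

The hand-computed p-value agrees with scipy's, confirming that the t0 formula is the machinery behind the reported significance of a correlation coefficient.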
