
HYPOTHESIS TESTING

Hypothesis testing begins with an assumption, called a hypothesis, that we make about a
population parameter. A hypothesis in statistics is simply a quantitative statement about a
population.

PROCEDURE OF TESTING HYPOTHESES

The procedure of testing hypotheses is briefly described below:

1. Setting up a hypothesis

The first thing in hypothesis testing is to set up a hypothesis about a population parameter.
Then we collect sample data, produce sample statistics, and use this information to decide how
likely it is that our hypothesized population parameter is correct. These hypotheses must be so
constructed that if one hypothesis is accepted, the other is rejected, and vice versa.

(a) Null Hypothesis

The null hypothesis is a very useful tool in testing the significance of difference.
In its simplest form the hypothesis asserts that there is no real difference between the sample
and the population in the particular matter under consideration. For example, if we want to find
out whether extra coaching has benefited the students or not, we shall set up a null hypothesis
that "extra coaching has not benefited the students". Similarly, if we want to find out whether a
particular drug is effective in curing malaria, we will take the null hypothesis that "the drug is
not effective in curing malaria".

(b) Alternative Hypothesis

As against the null hypothesis, the alternative hypothesis specifies those values that the
researcher believes to hold true, and, of course, he hopes that the sample data lead to acceptance
of this hypothesis as true. The alternative hypothesis may embrace a whole range of values
rather than a single point. It is the opposite of the null hypothesis.

The null and alternative hypotheses are distinguished by the use of two different symbols,
Ho representing the null hypothesis and H1 the alternative hypothesis.

2. Setting up a suitable significance level

Having set up the hypothesis, the next step is to test the validity of Ho against that of H1
at a certain level of significance. The significance level, customarily expressed as a
percentage such as 5 per cent, is the probability of rejecting the null hypothesis if it is true.

By rejecting a hypothesis at the 5 per cent level, the statistician runs the risk of rejecting a
true hypothesis in 5 out of every 100 occasions; that is, in the long run he will be making
the wrong decision about 5 per cent of the time.

 
3. Setting a test criterion

The third step in the hypothesis testing procedure is to construct a test criterion. This involves
selecting an appropriate probability distribution for the particular test, that is, a probability
distribution which can properly be applied. Some probability distributions that are commonly
used in testing procedures are t, F and χ².

4. Doing computations

Having taken the first three steps, we have completely designed a statistical test. The fourth
step consists of performing the computations necessary for the test; these include the testing
statistic and the standard error of the testing statistic.

5. Making decisions

Finally, as a fifth step, we may draw statistical conclusions and take decisions. A statistical
conclusion, or statistical decision, is a decision either to reject or to accept the null hypothesis.
The decision will depend on whether the computed value of the test criterion falls in the region
of rejection or the region of acceptance.
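
The five steps can be traced in a minimal Python sketch, assuming made-up sample data, a
hypothesized mean of 50 and a 5 per cent significance level:

# Worked illustration of the five-step procedure:
# testing H0: population mean = 50 against H1: population mean != 50.
from scipy import stats

# Step 1: hypotheses. H0: mu = 50, H1: mu != 50 (two-tailed).
mu0 = 50

# Step 2: significance level.
alpha = 0.05

# Step 3: test criterion. With a small sample and unknown population
# variance, the t distribution is the appropriate choice.
sample = [52, 48, 55, 51, 49, 53, 56, 50, 47, 54]  # assumed data

# Step 4: computations - the t statistic and its p-value.
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

# Step 5: decision - reject H0 if the p-value falls below alpha.
if p_value < alpha:
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}: reject H0")
else:
    print(f"t = {t_stat:.3f}, p = {p_value:.3f}: do not reject H0")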

TWO TYPES OF ERRORS IN TESTING OF HYPOTHESES
When a statistical hypothesis is tested there are four possibilities:
1. The hypothesis is true but our test rejects it. (Type I error)
2. The hypothesis is false but our test accepts it. (Type II error)
3. The hypothesis is true and our test accepts it. (Correct decision)
4. The hypothesis is false and our test rejects it. (Correct decision)

Obviously, the first two possibilities lead to errors. In a statistical hypothesis testing
experiment, a Type I error is committed by rejecting the null hypothesis when it is true.
On the other hand, a Type II error is committed by not rejecting (i.e., accepting) the null
hypothesis when it is false.
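
A small simulation sketch, assuming a normal population for which the null hypothesis is
actually true, shows that a test at the 5 per cent level commits a Type I error about 5 times
in 100:

# Simulated frequency of Type I errors when H0 is true.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05
n_trials = 10_000
rejections = 0

for _ in range(n_trials):
    # Draw a sample from a population whose mean really is 50, so
    # H0: mu = 50 is true and every rejection is a Type I error.
    sample = rng.normal(loc=50, scale=5, size=30)
    _, p_value = stats.ttest_1samp(sample, popmean=50)
    if p_value < alpha:
        rejections += 1

print(f"Estimated Type I error rate: {rejections / n_trials:.3f}")  # about 0.05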

STANDARD ERROR

The standard deviation of the sampling distribution of a statistic is known as its standard
error.
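
For example, the standard error of the sample mean is s/√n. A minimal sketch with assumed
data:

# Standard error of the sample mean, s / sqrt(n).
import numpy as np

sample = np.array([52, 48, 55, 51, 49, 53, 56, 50, 47, 54])  # assumed data
n = sample.size
s = sample.std(ddof=1)          # sample standard deviation (n - 1 divisor)
standard_error = s / np.sqrt(n)

print(f"mean = {sample.mean():.2f}, standard error = {standard_error:.3f}")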

STATISTICAL ESTIMATION

When data are collected by sampling from a population, the most important objective of
statistical analysis is to draw inferences about that population from the information embodied
in the sample. Statistical estimation, or briefly estimation, is concerned with the methods by
which population characteristics are estimated from sample information. It may be pointed
out that the true value of a parameter is an unknown constant that can be correctly
ascertained only by an exhaustive study of the population.

 


POINT ESTIMATION

A point estimate is a single number which is used as an estimate of the unknown
population parameter. The estimator θ̂ is a single point on the real number scale, and thus the
name point estimation.

INTERVAL ESTIMATION

An interval estimate of a population parameter is a statement of two values between
which it is estimated that the parameter lies. An interval estimate would always be specified
by two values, i.e., the lower one and the upper one. In more technical terms, interval
estimation refers to the estimation of a parameter by a random interval, called the
confidence interval.
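
A minimal sketch contrasting the two methods, assuming made-up data and a 95 per cent
confidence level: the point estimate is the sample mean, and the interval estimate is built
from the t distribution.

# Point estimate vs. 95% confidence interval for the population mean.
import numpy as np
from scipy import stats

sample = np.array([52, 48, 55, 51, 49, 53, 56, 50, 47, 54])  # assumed data
n = sample.size
point_estimate = sample.mean()
standard_error = sample.std(ddof=1) / np.sqrt(n)

# Two-tailed t critical value with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
lower = point_estimate - t_crit * standard_error
upper = point_estimate + t_crit * standard_error

print(f"point estimate: {point_estimate:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")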

On comparing these two methods of estimation we find that point estimation has an
advantage inasmuch as it provides an exact value for the parameter under investigation.

PROPERTIES OF A GOOD ESTIMATOR

A good estimator's quality is to be evaluated in terms of the following properties:

(i) Unbiasedness: An estimator is said to be unbiased if its expected value is identical with
the population parameter being estimated. Many estimators are "asymptotically unbiased" in
the sense that the biases reduce practically to zero when n becomes sufficiently large.
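
A simulation sketch of unbiasedness, assuming a normal population with variance 4: the
sample variance with divisor n − 1 is unbiased, while the divisor-n version is biased
downward.

# Unbiased (n - 1 divisor) vs. biased (n divisor) estimators of the variance.
import numpy as np

rng = np.random.default_rng(0)
true_variance = 4.0  # population is N(0, 2^2), assumed
n, n_trials = 10, 100_000

unbiased, biased = [], []
for _ in range(n_trials):
    sample = rng.normal(loc=0, scale=2, size=n)
    unbiased.append(sample.var(ddof=1))  # divisor n - 1
    biased.append(sample.var(ddof=0))    # divisor n

print(f"true variance:            {true_variance}")
print(f"mean of unbiased version: {np.mean(unbiased):.3f}")  # about 4.0
print(f"mean of biased version:   {np.mean(biased):.3f}")    # about 3.6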

(ii) Consistency: If an estimator, say θ̂, approaches the parameter θ more and more closely as
the sample size n increases, θ̂ is said to be a consistent estimator of θ. Stated somewhat more
rigorously, the estimator θ̂ is said to be a consistent estimator of θ if, as n approaches infinity,
the probability approaches 1 that θ̂ will differ from the parameter θ by not more than an
arbitrarily small constant.

In the case of large samples, consistency is a desirable property for an estimator to possess.
However, in small samples, consistency is of little importance unless the limit of probability
defining consistency is reached even with a relatively small sample size.

(iii) Efficiency: The concept of efficiency refers to the sampling variability of an estimator. If
two competing estimators are both unbiased, the one with the smaller variance (for a given
sample size) is said to be relatively more efficient.

If the population is symmetrically distributed, then both the sample mean and the sample
median are consistent and unbiased estimators of the population mean μ. Yet the sample mean
is better than the sample median as an estimator of μ. This claim is made in terms of
efficiency, as the simulation sketch below illustrates.
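
A simulation sketch of this claim, assuming a standard normal population: the sample mean
shows a smaller sampling variance than the sample median, i.e., it is the more efficient
estimator.

# Sampling variance of the mean vs. the median for a symmetric population.
import numpy as np

rng = np.random.default_rng(1)
n, n_trials = 25, 50_000

means = np.empty(n_trials)
medians = np.empty(n_trials)
for i in range(n_trials):
    sample = rng.normal(loc=0, scale=1, size=n)
    means[i] = sample.mean()
    medians[i] = np.median(sample)

# For a normal population the median's variance is about pi/2 times larger.
print(f"variance of sample mean:   {means.var():.5f}")
print(f"variance of sample median: {medians.var():.5f}")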

(iv) Sufficiency: An estimator is said to be sufficient if it conveys as much information as is
possible about the parameter as is contained in the sample. The significance of sufficiency lies
in the fact that if a sufficient estimator exists, it is absolutely unnecessary to consider any other
estimator; a sufficient estimator ensures that all information a sample can furnish with respect
to the estimation of a parameter is being utilized.

CORRELATION

Correlation analysis deals with the association between two or more variables. Correlation is
an analysis of the covariation between two or more variables.

POSITIVE AND NEGATIVE CORRELATION

Whether correlation is positive (direct) or negative (inverse) depends upon the direction of
change of the variables. If both the variables vary in the same direction, i.e., if as one variable
increases the other, on an average, also increases, or if as one variable decreases the other,
on an average, also decreases, correlation is said to be positive, e.g., height and weight, or
income and expenditure. If, on the other hand, the variables vary in opposite directions, i.e.,
as one variable increases the other decreases, or vice versa, correlation is said to be negative,
e.g., volume and pressure, or price and demand.
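
A minimal sketch with assumed data: the correlation coefficient comes out positive for series
moving together and negative for series moving in opposite directions.

# Sign of the correlation coefficient for direct and inverse relationships.
import numpy as np

height = np.array([150, 155, 160, 165, 170, 175])  # assumed data
weight = np.array([50, 54, 59, 63, 68, 74])        # rises with height
price = np.array([10, 12, 14, 16, 18, 20])         # assumed data
demand = np.array([95, 88, 80, 71, 64, 55])        # falls as price rises

r_positive = np.corrcoef(height, weight)[0, 1]
r_negative = np.corrcoef(price, demand)[0, 1]
print(f"height vs weight: r = {r_positive:.3f}")  # close to +1
print(f"price vs demand:  r = {r_negative:.3f}")  # close to -1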

SIMPLE, PARTIAL AND MULTIPLE CORRELATION

The distinction between simple, partial and multiple correlation is based upon the number of
variables studied. When only two variables are studied it is a problem of simple correlation.
When three or more variables are studied it is a problem of either multiple or partial
correlation. In multiple correlation three or more variables are studied simultaneously. For
example, when we study the relationship between the yield of rice per acre and both the
amount of rainfall and the amount of fertilizers used, it is a problem of multiple correlation.
On the other hand, in partial correlation we recognize more than two variables but consider
only two variables to be influencing each other, the effect of other influencing variables being
kept constant. For example, in the rice problem taken above, if we limit our correlation
analysis of yield and rainfall to periods when a certain average daily temperature existed, it
becomes a problem of partial correlation.
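
One common way to compute a partial correlation is to correlate the residuals left after
regressing each variable on the variable held constant. A sketch with synthetic data echoing
the rice example:

# Partial correlation of yield and rainfall, holding temperature constant,
# via the residual method.
import numpy as np

rng = np.random.default_rng(7)
n = 200
temperature = rng.normal(25, 3, n)                      # synthetic data
rainfall = 2.0 * temperature + rng.normal(0, 4, n)
yield_ = 1.5 * rainfall + 0.8 * temperature + rng.normal(0, 5, n)

def residuals(y, x):
    """Residuals of a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)

# Remove the linear influence of temperature from both variables,
# then correlate what is left over.
r_partial = np.corrcoef(residuals(yield_, temperature),
                        residuals(rainfall, temperature))[0, 1]
print(f"partial correlation of yield and rainfall: {r_partial:.3f}")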

"
  
 Ô
#"
  $‰ 
  ‰ 
 uhe distinction between linear and non-
linear correlation is based upon the constancy of the ratio of change between the variables. If
the amount of change in one variable tends to bear constant ratio to the amount of change in the
other variable then the correlation is said to be linear.

Correlation would be called non-linear or curvilinear if the amount of change in one variable
does not bear a constant ratio to the amount of change in the other variable.
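
A minimal sketch, assuming data on a parabola: the (linear) correlation coefficient is near
zero even though the relationship is perfect, because it measures only linear covariation.

# Pearson's r can miss a perfect non-linear relation.
import numpy as np

x = np.linspace(-5, 5, 101)
y_linear = 3 * x + 2   # perfectly linear relation
y_curved = x ** 2      # perfect but curvilinear relation

print(f"r for y = 3x + 2: {np.corrcoef(x, y_linear)[0, 1]:.3f}")  # 1.000
print(f"r for y = x^2:    {np.corrcoef(x, y_curved)[0, 1]:.3f}")  # about 0.000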

REGRESSION ANALYSIS

Regression is the measure of the average relationship between two or more variables in terms
of the original units of the data.

The term 'regression analysis' refers to the methods by which estimates are made of the
values of a variable from a knowledge of the values of one or more other variables, and to the
measurement of the errors involved in this estimation process.
 


USES OF REGRESSION ANALYSIS

• Regression analysis provides estimates of values of the dependent variable from values of
the independent variable.
• A second goal of regression analysis is to obtain a measure of the error involved in using
the regression line as a basis for estimation.
• With the help of regression coefficients we can calculate the correlation coefficient (see the
sketch below).
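
A minimal sketch covering these three uses, assuming made-up data; the last step recovers
the correlation coefficient as r = √(b_yx · b_xy):

# Fitting y on x, estimating, measuring error, and recovering r
# from the two regression coefficients.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])  # assumed data
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.9])

# Use 1: estimate y from x via the regression line of y on x.
b_yx, a = np.polyfit(x, y, deg=1)
y_hat = b_yx * x + a
print(f"estimate at x = 9: {b_yx * 9 + a:.2f}")

# Use 2: standard error of estimate as a measure of the error involved.
se_estimate = np.sqrt(np.sum((y - y_hat) ** 2) / (len(x) - 2))
print(f"standard error of estimate: {se_estimate:.3f}")

# Use 3: correlation coefficient from the two regression coefficients,
# r = sqrt(b_yx * b_xy), with the sign taken from the slope.
b_xy = np.polyfit(y, x, deg=1)[0]
r = np.sign(b_yx) * np.sqrt(b_yx * b_xy)
print(f"r from regression coefficients: {r:.4f}")
print(f"r from np.corrcoef:             {np.corrcoef(x, y)[0, 1]:.4f}")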

COEFFICIENT OF DETERMINATION

One very convenient and useful way of interpreting the value of the coefficient of correlation
between two variables is to use the square of the coefficient of correlation, which is called the
coefficient of determination. The coefficient of determination is defined as the ratio of the
explained variance to the total variance.
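
A minimal sketch with assumed data verifying that the square of the correlation coefficient
equals the ratio of explained to total variance:

# Coefficient of determination computed two equivalent ways.
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8])  # assumed data
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1, 14.2, 15.9])

# Way 1: square the correlation coefficient.
r_squared = np.corrcoef(x, y)[0, 1] ** 2

# Way 2: explained variance / total variance around the regression line.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept
explained = np.sum((y_hat - y.mean()) ** 2)
total = np.sum((y - y.mean()) ** 2)

print(f"r^2 from correlation: {r_squared:.4f}")
print(f"explained / total:    {explained / total:.4f}")  # same value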

DIFFERENCE BETWEEN CORRELATION AND REGRESSION ANALYSIS

• Whereas the coefficient of correlation is a measure of the degree of covariability between X
and Y, the objective of regression analysis is to study the 'nature of relationship' between the
variables, so that we may be able to predict the value of one on the basis of another.
• Correlation is merely a tool for ascertaining the degree of relationship between two
variables and, therefore, we cannot say that one variable is the cause and the other the effect.
In regression analysis one variable is taken as dependent and the other as independent, thus
making it possible to study the cause-and-effect relationship.
• There may be nonsense correlation between two variables which is purely due to chance
and has no practical relevance, such as an increase in income and an increase in weight of a
group of people. However, there is nothing like nonsense regression.
• The correlation coefficient is independent of change of scale and origin. Regression
coefficients are independent of change of origin but not of scale.

 
