Unit III

Correlation and Regression Analysis


Course Outcome

Understand correlation and regression analysis and their various types.

Know the various methods of carrying out regression analysis.

Comprehend the standard error, R-squared and error estimation, and interpret them.

Analyse data by applying these tools and interpret the results in a meaningful way.
Correlation Analysis

Correlation analysis deals with the association between two or more variables.

OR

Correlation analysis attempts to determine the degree of relationship between variables.
Causation
Causation indicates that one event is the result of the occurrence of another event, i.e. there is a cause-and-effect relationship (e.g. smoking causes an increase in the risk of developing lung cancer).

Correlation alone does not imply causation: one event can merely correlate with another (e.g. smoking is correlated with alcoholism, but it does not cause alcoholism).
Types of Correlation

Positive or Negative

Simple, Partial and Multiple

Linear and Non-linear


Methods to study Correlation

Scatterplot

Graphical method

Karl Pearson’s Coefficient of Correlation


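Pearson Coefficient of Correlation

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} $$

where $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$ are the deviations of the X and Y values from their respective means.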
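As a minimal sketch of the deviation form of Pearson's coefficient, r = Σxy / √(Σx² · Σy²) with x and y taken as deviations from their means, the calculation could be written as below. The function name `pearson_r` and the advertising and sales figures are hypothetical, chosen only for illustration; they are not the data from the exercise.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # x and y in the formula are deviations from the respective means
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    if sum_x2 == 0 or sum_y2 == 0:
        raise ValueError("r is undefined when a series has no variation")
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


# Hypothetical figures, for illustration only (not from the slides)
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
print(round(pearson_r(advertising, sales), 4))  # 0.7746
```

A value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.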


Find the correlation between Advertising
Expenditure and Sales
Rank Correlation Coefficient
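The British psychologist Charles Edward Spearman introduced the rank correlation coefficient in 1904 for attributes such as leadership ability and judgement of female beauty, which are assessed by assigning ranks rather than by direct measurement.

$$ R = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} $$

where R = rank correlation coefficient, d = difference of ranks between paired items and N = number of paired items.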


The ranks of 10 students in two subjects, A and B, are as follows. Calculate the rank correlation coefficient.

A 6 5 3 10 2 4 9 7 8 1

B 3 8 4 9 1 6 10 7 5 2
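The exercise above can be checked with a short sketch that applies the usual formula R = 1 - 6Σd² / (N(N² - 1)), where d is the difference between the paired ranks and N is the number of pairs. The function name is hypothetical, and the sketch assumes there are no tied ranks, as is the case for this data.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman's rank correlation for untied ranks."""
    if len(ranks_a) != len(ranks_b) or len(ranks_a) < 2:
        raise ValueError("need two equal-length rank lists with at least 2 items")
    n = len(ranks_a)
    # d is the difference of ranks between paired items
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Ranks of the 10 students in subjects A and B, as given in the exercise
ranks_a = [6, 5, 3, 10, 2, 4, 9, 7, 8, 1]
ranks_b = [3, 8, 4, 9, 1, 6, 10, 7, 5, 2]
print(round(spearman_rank_correlation(ranks_a, ranks_b), 4))  # 0.7818
```

Here Σd² = 36, so R = 1 - 216/990 ≈ 0.78, which suggests a fairly strong positive agreement between the two rankings.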
Regression Analysis

Regression analysis attempts to establish the nature of the relationship between variables, that is, to study the functional relationship between the variables and thereby provide a mechanism for prediction or forecasting.
Regression Equation
The regression equation of Y on X is expressed as:

Y = a + bX
Where
Y = dependent/explained variable
X = independent/explanatory variable
a = Y intercept
b = slope (represents the change in Y for a unit change in X)
If the values of constants a and b are obtained, the line is
completely determined.

The values are obtained by the least squares method, which states that the line should be drawn through the plotted points in such a manner that the sum of the squares of the deviations of the actual Y values from the computed Y values is the least. Such a line is known as the line of best fit.
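For reference, minimising that sum of squares for the line Y = a + bX leads to two normal equations, which can be solved for a and b. The forms below are the standard textbook results; the bar notation for the sample means and N for the number of pairs of observations are notational assumptions, not taken from the slides.

```latex
\begin{aligned}
\sum Y  &= N a + b \sum X \\
\sum XY &= a \sum X + b \sum X^2 \\[4pt]
b &= \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2},
\qquad
a = \bar{Y} - b\,\bar{X}
\end{aligned}
```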
Regression equation of Y on X
Example
The following data relate to the scores obtained by 9 salesmen of a company in an intelligence test and their weekly sales in thousand rupees:

Salesman                       A   B   C   D   E   F   G   H   I
Intelligence test score        50  60  50  60  80  50  80  40  70
Weekly sales ('000 rupees)     30  60  40  50  60  30  70  50  60

(a) Obtain the regression equation of sales on the intelligence test scores of the salesmen.
(b) If the intelligence score of a salesman is 65, what would be his expected weekly sales?
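One way to work this example is sketched below, using the least squares results b = Σ(X - X̄)(Y - Ȳ) / Σ(X - X̄)² and a = Ȳ - bX̄. The function name `fit_line` is hypothetical; the data are the test scores and weekly sales given above.

```python
def fit_line(xs, ys):
    """Least squares fit of Y = a + bX; returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    if sum_x2 == 0:
        raise ValueError("slope is undefined when X has no variation")
    b = sum_xy / sum_x2      # slope
    a = mean_y - b * mean_x  # Y intercept
    return a, b


# Data from the example: intelligence test scores (X) and weekly sales (Y)
scores = [50, 60, 50, 60, 80, 50, 80, 40, 70]
sales = [30, 60, 40, 50, 60, 30, 70, 50, 60]

a, b = fit_line(scores, sales)
print(f"Y = {a} + {b}X")  # Y = 5.0 + 0.75X

predicted_sales = a + b * 65
print(predicted_sales)    # 53.75
```

With X̄ = 60, Ȳ = 50, Σxy = 1200 and Σx² = 1600, the slope is 0.75 and the intercept is 5, so a score of 65 corresponds to expected weekly sales of about 53.75 thousand rupees.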
Standard Error of the Estimate
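The standard error of the estimate is a measure of the accuracy of predictions:

$$ \sigma_{est} = \sqrt{\frac{\sum (Y - Y')^2}{N}} $$

where σest is the standard error of the estimate, Y is an actual score, Y' is a predicted score, and N is the number of pairs of scores. The numerator is the sum of squared differences between the actual and predicted scores (also called the sum of squares error).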
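As an illustration, the sketch below computes σest = √(Σ(Y - Y')² / N) for the salesman data, using the line Y' = 5 + 0.75X, which is the least squares fit to those data. The function name and the `ddof` parameter are hypothetical; `ddof=2` is included only because some texts divide by N - 2 rather than N when the line is estimated from a sample.

```python
import math


def standard_error_of_estimate(actual, predicted, ddof=0):
    """sqrt(sum of squared errors / (N - ddof)); ddof=0 divides by N."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = len(actual)
    if n - ddof <= 0:
        raise ValueError("not enough observations for this ddof")
    # Sum of squared differences between actual and predicted scores
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(sse / (n - ddof))


# Salesman data: intelligence test scores (X) and weekly sales (Y)
scores = [50, 60, 50, 60, 80, 50, 80, 40, 70]
sales = [30, 60, 40, 50, 60, 30, 70, 50, 60]

# Predicted sales from the least squares line Y' = 5 + 0.75X
predicted = [5 + 0.75 * x for x in scores]

print(round(standard_error_of_estimate(sales, predicted), 4))          # 8.8192
print(round(standard_error_of_estimate(sales, predicted, ddof=2), 4))  # 10.0
```

The sum of squared errors here is 700, so dividing by N = 9 gives about 8.82 thousand rupees, while the N - 2 variant gives exactly 10. The smaller the standard error, the closer the actual values lie to the regression line.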
