
ASSOCIATION BETWEEN QUANTITATIVE VARIABLES



Two methods for analyzing the relationship between two quantitative variables are linear regression and correlation.

Example 1
Consider the following data on body weight and plasma volume of eight healthy men. The objective of the analysis is to see whether a change in plasma volume is associated with a change in body weight.

Example
Subject   Body weight (kg)   Plasma volume (l)
1         58.0               2.75
2         70.0               2.86
3         74.0               3.37
4         63.5               2.76
5         62.0               2.62
6         70.5               3.49
7         71.0               3.05
8         66.0               3.12

SCATTER DIAGRAM
Two related variables are plotted on a graph as points (dots).
Each point on the diagram represents a pair of values, one read from the X-scale and the other from the Y-scale.
It is the first step in investigating the relationship between two variables.
The diagram shows visually the shape and degree of closeness of the relationship.

SCATTER DIAGRAM
Values on the X-scale refer to the explanatory (independent) variable and values on the Y-scale refer to the response (dependent) variable. In situations where it is not clear which is the dependent variable, the choice of axes is arbitrary.

Is there a trend?
[Figure: scatter diagram of plasma volume against body weight (kg)]

LINEAR REGRESSION
The previous relationship can be summarized by a line drawn through the scatter of points. Any straight line drawn on a graph can be represented by the equation y = a + bx, where y refers to the values of the dependent variable and x to the values of the explanatory (independent) variable.

LINEAR REGRESSION
The constant 'a' is the intercept, the point at which the line crosses the y-axis: the value of y when x = 0.

The coefficient of the x variable ('b') is the slope of the line: the average change (increase or decrease) in y per unit change in x.

b is sometimes called the regression coefficient.

LINEAR REGRESSION
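\[ b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} \]

Numerator:
\[ \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n} \]

Denominator:
\[ \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n} \]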
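As a minimal sketch (Python is an assumption here, since the slides name no software, and the function names are illustrative), the deviation form of b and the computational shortcut for its numerator and denominator can be checked against each other on the Example 1 data:

```python
# Example 1 data from the slides: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]


def slope_deviation_form(x, y):
    """b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def slope_shortcut_form(x, y):
    """Same slope, using sum(xy) - sum(x)sum(y)/n and sum(x^2) - sum(x)^2/n."""
    n = len(x)
    numerator = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    denominator = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    return numerator / denominator


b_dev = slope_deviation_form(weight, plasma)
b_short = slope_shortcut_form(weight, plasma)
print(b_dev, b_short)  # the two forms should agree up to floating-point error
```

The shortcut form needs only running totals, which is why it suits hand calculation; the deviation form is usually the numerically safer choice in code.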

LINEAR REGRESSION

\[ a = \bar{y} - b\bar{x} \]
where \( \bar{y} = \sum y / n \) and \( \bar{x} = \sum x / n \)

The resultant line is called the regression line, which estimates the average value of y for a given value of x.
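A short sketch of the full fit, assuming Python and a hypothetical helper named fit_line: it computes b from the sums of squares and products, then a from the two means. One consequence of a = ȳ - b x̄ is that the regression line always passes through the point (x̄, ȳ), which the last lines check.

```python
# Example 1 data from the slides: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]


def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx            # slope (regression coefficient)
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = fit_line(weight, plasma)
x_bar = sum(weight) / len(weight)
y_bar = sum(plasma) / len(plasma)

# The fitted value at the mean weight equals the mean plasma volume.
fitted_at_mean = a + b * x_bar
print(a, b, fitted_at_mean, y_bar)
```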

Example 1 data on plasma volume and body weight



\[ n = 8, \quad \sum x = 535, \quad \sum x^2 = 35983.5, \quad \sum y = 24.02, \quad \sum y^2 = 72.798, \quad \sum xy = 1615.295 \]
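The totals above can be reproduced directly from the eight data pairs. The following is a small sketch in Python (an assumed choice of language; the variable names are illustrative):

```python
# Example 1 data from the slides: x = body weight (kg), y = plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]

n = len(weight)
sum_x = sum(weight)
sum_x2 = sum(x ** 2 for x in weight)
sum_y = sum(plasma)
sum_y2 = sum(y ** 2 for y in plasma)
sum_xy = sum(x * y for x, y in zip(weight, plasma))

# Rounded for display only; the unrounded totals feed the slope and intercept.
print(n, round(sum_x, 3), round(sum_x2, 3),
      round(sum_y, 3), round(sum_y2, 3), round(sum_xy, 3))
```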

Example
\[ b = \frac{1615.295 - (535)(24.02)/8}{35983.5 - (535)^2/8} = \frac{8.96}{205.38} = 0.043615 \]
and
\[ a = 3.0025 - 0.043615 \times 66.875 = 0.0857 \]

Example 1
Regression line is given by:
Plasma volume = 0.09 + 0.04 × body weight
Interpretation of slope: for every one-unit (1 kg) increase in body weight there is, on average, a corresponding increase of 0.04 l in plasma volume.
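To illustrate how the regression line is used, the sketch below (Python assumed; predict is a hypothetical helper, and 70 kg is an illustrative input chosen inside the observed 58 to 74 kg range) estimates the average plasma volume for a given body weight. It also compares the coefficients as computed (0.0857 and 0.043615) with the two-decimal values quoted in the line.

```python
# Coefficients as computed in the worked example.
A_FULL, B_FULL = 0.0857, 0.043615
# Coefficients as quoted (rounded) in the regression line.
A_ROUNDED, B_ROUNDED = 0.09, 0.04


def predict(a, b, weight_kg):
    """Estimated average plasma volume (l) for a given body weight (kg)."""
    return a + b * weight_kg


weight_kg = 70.0  # illustrative value, within the observed range of the data
full = predict(A_FULL, B_FULL, weight_kg)
rounded = predict(A_ROUNDED, B_ROUNDED, weight_kg)
print(f"{full:.2f} l with full precision, {rounded:.2f} l with rounded coefficients")
```

At 70 kg the two versions give about 3.14 l and 2.89 l, so the rounded equation is best read as a summary of the relationship; predictions are better made with the unrounded coefficients.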

Example 1
[Figure: plasma volume plotted against body weight (kg)]

CORRELATION
Linear regression gives a straight line with which to summarize the relationship between two variables, but it does not tell how closely the data lie on that line. The (Pearson's) correlation coefficient, r, measures the closeness (strength) of the linear association.

Pearson's Correlation Coefficient

Measure of the strength of the linear association between two continuous variables, i.e. the closeness with which the points lie along the straight line.


Pearson's Correlation Coefficient

Let the underlying population correlation between X and Y be ρ (rho). The population correlation can be estimated from a sample of data using Pearson's correlation coefficient, r.

Formula for r
The correlation coefficient is calculated as

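\[ r = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}} \]

From the above example:
\[ r = \frac{8.96}{\sqrt{205.38 \times 0.678}} = 0.76 \]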
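The same calculation can be sketched in Python (an assumed language; pearson_r is an illustrative name, not a library function), following the shortcut formula term by term on the Example 1 data:

```python
from math import sqrt

# Example 1 data from the slides: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, x^2, y^2 and xy."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return sxy / sqrt(sxx * syy)


r = pearson_r(weight, plasma)
print(round(r, 2))  # agrees with the hand calculation, 0.76
```

The numerator is the same quantity as the numerator of the slope b, so r and b always share the same sign.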

Pearson's Correlation Coefficient (r)

It must lie between -1 and +1.
If r = 0, there is no linear relationship.
If r = 1 or -1, the relationship is perfectly linear, i.e. all points lie exactly on the regression line.
If r > 0, then y increases with increasing x values (positive correlation).
If r < 0, then y decreases with increasing x values (negative correlation).
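The two extreme cases can be checked with made-up data. In this sketch (Python assumed; the x values and both lines are invented for illustration and are not from the slides), every point lies exactly on a straight line, so r should be +1 for the rising line and -1 for the falling one:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, x^2, y^2 and xy."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return sxy / sqrt(sxx * syy)


# Hypothetical data: points placed exactly on two straight lines.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_rising = [1.0 + 2.0 * xi for xi in x]    # y = 1 + 2x, positive slope
y_falling = [5.0 - 3.0 * xi for xi in x]   # y = 5 - 3x, negative slope

r_rising = pearson_r(x, y_rising)
r_falling = pearson_r(x, y_falling)
print(r_rising, r_falling)
```

The size of the slope does not matter here: any exact straight-line relationship gives |r| = 1, whatever its steepness.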

Positive Correlation Coefficient


+1: perfect positive correlation
As X ↑, Y ↑
As X ↓, Y ↓

Negative Correlation Coefficient


-1: perfect negative correlation
As X ↑, Y ↓
As X ↓, Y ↑

Correlation Coefficient of 0
0: no correlation
There is no linear relationship between X and Y. There may still be a relationship, but it is nonlinear.
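A small invented example makes the second possibility concrete. In this Python sketch (language and data are assumptions, not from the slides), y is completely determined by x through y = x², yet r is 0 because the relationship is symmetric rather than linear:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, x^2, y^2 and xy."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a perfect but nonlinear (quadratic) relationship.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # zero: no linear association, although y depends entirely on x
```

This is one reason to look at the scatter diagram before relying on r.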

No Correlation

Limitations of the correlation coefficient


It quantifies only the strength of the linear relationship between two variables.
It is very sensitive to outlying values, and thus can sometimes be misleading.
It cannot be extrapolated beyond the observed ranges of the variables.
A high correlation does not imply a cause-and-effect relationship.
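The sensitivity to outlying values can be illustrated with a sketch. Here one invented ninth subject (95 kg, 2.0 l) is appended to the Example 1 data; this point is hypothetical and is not part of the original data set. Python is again an assumed choice.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, x^2, y^2 and xy."""
    n = len(x)
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    return sxy / sqrt(sxx * syy)


# Example 1 data from the slides: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]

# Hypothetical outlier, invented for illustration only.
weight_with_outlier = weight + [95.0]
plasma_with_outlier = plasma + [2.0]

r_original = pearson_r(weight, plasma)
r_with_outlier = pearson_r(weight_with_outlier, plasma_with_outlier)
print(round(r_original, 2), round(r_with_outlier, 2))
```

With this single added point the coefficient changes from about 0.76 to a negative value, even though the other eight observations are unchanged.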
