Multiple Correlation
Inferential Statistics: Regression and Partial Correlation
11/25/2014
Irwan Sulistyanto, S.Pd
Data Analysis
Data analysis is a crucial part of any research report, whether the research is quantitative or qualitative. Whether a research report succeeds depends in part on its data analysis, and there are many kinds of data analysis. This paper focuses on data analysis in quantitative research, specifically regression and partial correlation.
A. Regression
Regression takes two forms: simple linear regression and multiple linear regression.
(1) Simple Linear Regression is a type of regression that can be used as a tool of statistical inference to determine the influence of an independent variable on a dependent variable. Simple linear regression analyzes the relationship between one independent variable (X) and one dependent variable (Y). The result of the analysis shows the direction of the relationship between the independent (predictor) variable and the dependent (response) variable; the direction can be positive or negative. The model is also used to predict the value of the dependent variable when the value of the independent variable changes. Simple linear regression uses interval or ratio data and assumes a normal distribution. The linear regression formulas are:
1. Y = α + βX + ε (population model)
2. Y = a + bX + e (sample model)
Note:
a and b are estimates of α and β
a = constant; graphically, it is the intercept
b = regression coefficient, which shows the influence of variable X on Y; graphically, it is the slope of the regression line.
If the data come from observations on a random sample of size n, then to obtain the regression equation Y = a + bX, a and b are computed with the least squares method:

b = (nΣXY − ΣX·ΣY) / (nΣX² − (ΣX)²)
a = Ȳ − bX̄
To compute the T-test, use the following formula:
T = b / Sb, where Sb is the standard error of b. The result is then compared with the T-table to decide whether the hypothesis is accepted or rejected.
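The least-squares and T-test formulas above can be sketched in Python. The x and y values below are made-up illustrative numbers, not the lecture's T0 data:

```python
import math

# Made-up illustrative data (not the T0 dataset from the lecture)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates: b = S_xy / S_xx, a = y_bar - b * x_bar
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Residual variance and the standard error of b
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)           # residual variance
s_b = math.sqrt(s2 / s_xx)   # standard error Sb

# T = b / Sb, compared with the T-table to accept or reject the hypothesis
t = b / s_b
print(a, b, t)
```

For these numbers the fitted line is Y = 2.2 + 0.6X; the computed T would then be looked up against the T-table with n − 2 degrees of freedom.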
After learning the formulas above, the next step is to use them to compute the relationship between two variables manually. Besides using the formulas, we can also use SPSS to compute the regression. Before doing the calculation in SPSS, we must check some basic assumptions and requirements. These are:
1. The independent variable is not correlated with the disturbance term (error). The expected value of the disturbance term is 0, written E(U | X) = 0.
2. If there is more than one independent variable, there should be no exact linear relation among the independent (explanatory) variables.
3. A good regression model has an ANOVA significance value < 0.05.
4. The predictor (independent variable) should be adequate; this holds when the Standard Error of Estimate < the Standard Deviation of the dependent variable.
5. The regression coefficient should be significant, which can be tested with a T-test. The coefficient is significant if T0 > T-table.
6. The fit of the regression model can be described with the determination coefficient (KD = r² × 100%). The higher this value, the better the model; the closer r² is to 1, the better the regression model.
7. The data should have a normal distribution.
8. The data are interval or ratio data.
9. The two variables have a clear dependence structure: one variable is the independent (predictor) variable while the other is the dependent (response) variable.
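Two of these checks, the Standard Error of Estimate versus the standard deviation of Y (requirement 4) and the determination coefficient KD = r² × 100% (requirement 6), can be computed directly. A minimal sketch, again with made-up data rather than the lecture's T0 file:

```python
import math

# Made-up illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
sst = sum((yi - y_bar) ** 2 for yi in y)                     # total sum of squares

see = math.sqrt(sse / (n - 2))    # Standard Error of Estimate
sd_y = math.sqrt(sst / (n - 1))   # sample standard deviation of Y

r_squared = 1 - sse / sst         # determination coefficient r^2
kd = r_squared * 100              # KD = r^2 x 100%

print(see < sd_y, kd)  # a good predictor satisfies SEE < SD
```

For this toy dataset SEE < SD holds and KD is 60%, i.e., 60% of the variance in Y is explained by X.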
Example:
Simple linear regression is used to analyze two variables. We take the variable names from the data T0 given by your lecturer. The variables are family background (X1) and motivation (Y). We will calculate the linear regression between X1 and Y. The steps are as follows:
1. Open SPSS.
2. Insert the data from Excel: copy, then paste.
3. Click Variable View in the SPSS Data Editor.
4. In the Name column, type X1 in the first row and Y in the second row.
5. In the Label column, type Family Background in the first row and Motivation in the second row.
6. Type 0 in the Decimals column.
7. The other columns can be left at their defaults.
8. Open Data View in the SPSS Data Editor; at the top you will see columns for the variables X1 and Y.
9. Click Analyze - Regression - Linear.
10. Click variable Y, then click the arrow to move it to the Dependent box. Next, click variable X1, then click the arrow to move it to the Independent box.
11. Click Statistics, tick Estimates under Regression Coefficients, and tick Model fit. Click Continue.
12. Click OK, and SPSS displays the regression output.
(2) Multiple Linear Regression extends the model to several independent variables:

Yi = β0 + β1Xi1 + β2Xi2 + … + βp−1Xi,p−1 + εi

Where:
Yi = dependent variable for observation i (i = 1, 2, …, n)
β0, β1, β2, …, βp−1 = parameters
Xi1, Xi2, …, Xi,p−1 = independent variables
εi = error for observation i, assumed to be independent and normally distributed with mean 0 and variance σ².
The parameters are estimated with the OLS (ordinary least squares) estimator:

b = (X'X)⁻¹X'Y

where X is the design matrix and Y is the vector of observations.
1 Kutner, M.H., C.J. Nachtsheim, and J. Neter. 2004. Applied Linear Regression Models. 4th ed. New York: McGraw-Hill Companies, Inc.
2 Gujarati, N.D. 2003. Basic Econometrics. 4th ed. New York: McGraw-Hill Companies, Inc.
This OLS estimator should be unbiased, linear, and best (Best Linear Unbiased Estimator, BLUE) (Sembiring, 2003³; Gujarati, 2003; Widarjono, 2007⁴). When we estimate a multiple linear regression, we must make several assumptions: (1) the regression model is linear in its parameters, (2) the expected value of the error is 0, (3) the error variance is constant (homoscedastic), not heteroscedastic⁵, (4) there is no autocorrelation⁶ between errors, (5) there is no multicollinearity⁷ among the independent variables, and (6) the errors have a normal distribution. Finally, the parameters can be tested in two ways: simultaneously and partially.
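The estimator b = (X'X)⁻¹X'Y can be sketched with plain Python matrix arithmetic. The data below are fabricated so that Y is an exact linear function of X1 and X2, purely to show that OLS recovers the coefficients; they are not the lecture's T0 data:

```python
# Made-up data: Y = 1 + 2*X1 + 3*X2 exactly, so OLS should recover (1, 2, 3)
x1 = [1, 2, 3, 4]
x2 = [2, 1, 4, 3]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# Design matrix with a leading column of ones for the constant term
X = [[1.0, a, b] for a, b in zip(x1, x2)]
p = 3

# Normal equations: (X'X) beta = X'y
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]

def solve(a, b):
    """Solve the linear system a @ beta = b by Gaussian elimination
    with partial pivoting (equivalent to applying (X'X)^-1)."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        beta[r] = (m[r][n] - sum(m[r][c] * beta[c]
                                 for c in range(r + 1, n))) / m[r][r]
    return beta

beta = solve(xtx, xty)  # [constant, coefficient of X1, coefficient of X2]
print(beta)
```

Since the fabricated Y fits the plane exactly, the estimate returns the true parameters; with real data the same computation yields the least-squares fit.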
Example:
Multiple linear regression is used to analyze more than one independent variable against one dependent variable. We take the variable names from the data T0 given by your lecturer. The variables are family background (X1), motivation (X2), and English achievement (Y). We will calculate the linear regression between X1, X2, and Y. The steps are as follows:
1. Open SPSS.
2. Insert the data from Excel: copy, then paste.
3. Click Variable View in the SPSS Data Editor.
4. In the Name column, type X1 in the first row, X2 in the second row, and Y in the third row.
5. In the Label column, type Family Background in the first row, Motivation in the second row, and English Achievement in the third row.
6. Type 0 in the Decimals column.
7. The other columns can be left at their defaults.
8. Open Data View in the SPSS Data Editor; at the top you will see columns for the variables X1, X2, and Y.
9. Click Analyze - Regression - Linear.
10. Click variable Y, then click the arrow to move it to the Dependent box. Next, click variables X1 and X2, then click the arrow to move them to the Independent box.
11. Click Statistics, tick Estimates under Regression Coefficients, and tick Model fit. Click Continue.
12. Click OK, and the result is as follows:
3 Sembiring, R.K. 2003. Analisis Regresi. 2nd ed. Bandung: Institut Teknologi Bandung.
4 Widarjono, A. 2007. Ekonometrika: Teori dan Aplikasi untuk Ekonomi dan Bisnis. 2nd ed. Yogyakarta: Ekonisia Fakultas Ekonomi Universitas Islam Indonesia.
5 The variance of the regression model's errors is not constant, i.e., the variance differs from one error to another; this occurs in time series data and can also occur in cross-sectional data, although rarely (Widarjono, 2007).
7 A linear relationship among the independent variables in a multiple linear regression model (Gujarati, 2003). The linear relationship among the independent variables can be perfect or imperfect.
Variables Entered/Removed(b)

Model | Variables Entered | Variables Removed | Method
1     | X2, X1(a)         | .                 | Enter
a. All requested variables entered.
b. Dependent Variable: Y

ANOVA(b)

Model        | Sum of Squares | df | Mean Square | F    | Sig.
1 Regression | 12.906         | 2  | 6.453       | .013 | .987(a)
  Residual   | 13704.594      | 27 | 507.578     |      |
  Total      | 13717.500      | 29 |             |      |
a. Predictors: (Constant), X2, X1
b. Dependent Variable: Y

Coefficients(a)

Model        | B      | Std. Error | Beta  | t     | Sig.
1 (Constant) | 70.122 | 35.158     |       | 1.994 | .056
  X1         | -.061  | .387       | -.031 | -.159 | .875
  X2         | -.006  | .343       | -.004 | -.018 | .985
(B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.)
a. Dependent Variable: Y
Reading this multiple linear regression output is the same as reading a simple linear regression output, so only the last table is read here. It gives the regression equation Y = 70.122 − 0.061·X1 − 0.006·X2, which is interpreted in the same way as a simple linear regression result.
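The fitted equation from the Coefficients table can be applied directly to predict Y. The inputs X1 = 50 and X2 = 60 below are arbitrary made-up values, used only to show the arithmetic:

```python
# Coefficients read from the SPSS Coefficients table above
b0, b1, b2 = 70.122, -0.061, -0.006

def predict(x1, x2):
    """Predicted Y from the fitted equation Y = b0 + b1*X1 + b2*X2."""
    return b0 + b1 * x1 + b2 * x2

print(predict(50, 60))  # prediction for made-up inputs X1=50, X2=60
```

Note that with Sig. values of .875 and .985, neither coefficient is statistically significant here, so such predictions would carry little weight for this particular model.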
B. Partial Correlation
Suppose we want to find the correlation between X1 and Y while controlling for X2. This is called the partial correlation, and its symbol is rX1Y.X2. What we want to ensure is that no variance predictable from X2 enters the relationship between X1 and Y. In z-score form, we can predict both X1 and Y from X2 and then subtract those predictions, leaving only the information in X1 and Y that is independent of X2.
The purposes of partial correlation are to find the relationship between two variables with the effects of a third variable held constant, or to estimate the relationship between a predictor variable and a criterion (outcome) variable after controlling for the effects of the other predictors in the equation. Partialing is a method of exerting statistical control over variables. It is important to distinguish statistical control from experimental control (e.g., random assignment to treatments, control by constancy, etc.). Generally, experimental control provides stronger evidence than statistical control because it is managed directly by the researcher and planned a priori.
A partial correlation coefficient is a third way of expressing the unique relationship between the criterion and a predictor. It represents the correlation between the criterion and a predictor after the variance they share with the other predictors has been removed from both. That is, after removing the variance that the criterion and the predictor have in common with the other predictors, the partial correlation expresses the correlation between the residualized predictor and the residualized criterion.
Concept
1. The pure relationship between two variables, controlling for the other variables.
2. One dependent variable with one independent variable, controlling for one or more other independent variables (because they are suspected of affecting the relationship between the two variables).
Formula:

rX2Y.X1 = (rX2Y − rX2X1·rYX1) / √((1 − r²X2X1)(1 − r²YX1))

Notation:
rX2Y.X1: the partial correlation of X2 with Y while X1 is controlled.
1 − r² expresses the part of a variable left unexplained; here the denominator contains the parts of X2 and Y that are not explained by X1.
Answer:
The conclusion is that if you want to compute the partial correlation between X1 and X2 controlling for Y, you must first compute the correlations between each pair of variables: rX1X2, rX1Y, and rX2Y. After you have all of the correlation values, you can compute the partial correlation rX1X2.Y.
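This conclusion can be sketched in Python with made-up data (not the lecture's T0 dataset). The partial correlation rX1X2.Y is computed both from the three pairwise correlations, as described above, and by correlating the residuals of X1 and X2 after regressing each on Y; the two routes give the same value:

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) *
                    sum((b - mv) ** 2 for b in v))
    return num / den

def residuals(u, v):
    """Residuals of the simple regression of u on v (the part of u
    that v does not explain)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    b = sum((a - mv) * (c - mu) for a, c in zip(v, u)) / \
        sum((a - mv) ** 2 for a in v)
    a0 = mu - b * mv
    return [c - (a0 + b * a) for a, c in zip(v, u)]

# Made-up illustrative data
x1 = [2, 4, 6, 8, 10]
x2 = [1, 3, 2, 5, 4]
y  = [3, 5, 7, 10, 9]

# Route 1: the formula built from the three pairwise correlations
r12, r1y, r2y = pearson(x1, x2), pearson(x1, y), pearson(x2, y)
r12_y = (r12 - r1y * r2y) / math.sqrt((1 - r1y ** 2) * (1 - r2y ** 2))

# Route 2: correlate the parts of X1 and X2 not explained by Y
r12_y_resid = pearson(residuals(x1, y), residuals(x2, y))

print(r12_y, r12_y_resid)  # the two routes agree
```

The agreement of the two routes illustrates the residualization idea from the partial correlation discussion above: controlling for Y means correlating what is left of X1 and X2 once Y's influence has been removed.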