Presented by
k
FIP – ECO 701 – CLASS 2 – TEAM 2
Executive Summary
Economics is best defined as a social science, that is, a science that studies human behaviour. The
subject has come a long way, and it has given us a new way to understand the world in which we live
and work. Using economics, we can assess whether a country is achieving good growth, because it
provides measures of the returns on the decisions its government makes through policies and laws. With
that in mind, a question arises about economic welfare: what is the best way to tell whether the
economic prosperity of a country is on the right track? Economists still argue with one another about
which indicator best distinguishes a good economy from a bad one, but the most widely accepted
measure is GDP.
Gross domestic product, or GDP, tells us a country's current aggregate production of goods
and services. It is often considered the best measure of how well the economy is performing. GDP
summarises the aggregate of all economic activities in a given period. In any economy, however, the
goods and services produced are not homogeneous. It is not possible to add, for example, 10 barrels of
petroleum to 10 million metric tons of wheat. To get around this, the quantities of all goods
and services are multiplied by their prices and then summed, which gives a single monetary total.
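The aggregation described above can be sketched in a few lines. This is only an illustration: the goods, quantities, and prices below are hypothetical values chosen for the example, not World Bank data.

```python
# Hypothetical economy: quantities and unit prices are made-up illustration values.
goods = {
    "petroleum (barrels)": {"quantity": 10, "price": 80.0},
    "wheat (metric tons)": {"quantity": 5, "price": 200.0},
}


def nominal_gdp(goods):
    """Sum price x quantity over all goods, giving one monetary aggregate."""
    return sum(item["quantity"] * item["price"] for item in goods.values())


total = nominal_gdp(goods)
print(total)  # 1800.0
```

Valuing each good at its price is what makes barrels and tons comparable, since every term is expressed in the same currency unit.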
Simple linear regression is a technique for predicting the value of a dependent variable based
on the value of a single independent variable. Sometimes one relevant independent variable is enough
to make an accurate prediction. More often, however, the prediction is better when two or more
independent variables are used. Multiple regression is a technique for predicting the value of a
dependent variable based on the values of two or more independent variables.
COLLECTION OF SAMPLE
The main source of the data used for this study is the World Bank website. The data cover the period
1985 to 2014, a span of 30 years, and relate specifically to India.
NATURE OF STUDY
We feed this data into Microsoft Excel's 'Data Analysis' package. Any computer with Microsoft Excel
installed can be used for this purpose. This is therefore a desk-based (doctrinal) study: it does not
involve a field survey. It includes an empirical element in the sense that it studies real-life data
on GDP.
In this study we have selected GDP (annual growth rate) as the dependent variable, and all the other
indicators mentioned have been taken as independent variables.
ANOVA         df    SS          MS          F          Significance F
Regression     9    74.32302    8.258113    2.34041    0.054367241
Residual      20    70.56981    3.528491
Total         29    144.8928
Name of the Indicator                                Mean     Minimum   Maximum   No. of observations
GDP Annual Growth Rate                               6.5      1.05      10.25     30
Mortality rate under age 5 (per 1000 live births)    94.06    46.7      145.4     30
Urban Population (% of Total)                        27.94    24.34     32.38     30
Military Expenditure (% of GDP)                      2.93     2.34      4.23      30

Description of the indicators:

GDP Annual Growth Rate: Annual percentage growth rate of GDP at market prices based on constant local
currency. Aggregates are based on constant 2010 U.S. dollars. GDP is the sum of gross value added by
all resident producers in the economy plus any product taxes and minus any subsidies not included in
the value of the products. It is calculated without making deductions for depreciation of fabricated
assets or for depletion and degradation of natural resources.

Mortality rate under age 5 (per 1000 live births): Under-five mortality rate is the probability per
1,000 that a newborn baby will die before reaching age five, if subject to age-specific mortality
rates of the specified year.

Urban Population (% of Total): Urban population refers to people living in urban areas as defined by
national statistical offices. The data are collected and smoothed by United Nations Population
Division.

Military Expenditure (% of GDP): Military expenditures data from SIPRI are derived from the NATO
definition, which includes all current and capital expenditures on the armed forces, including
peacekeeping forces; defense ministries and other government agencies engaged in defense projects;
paramilitary forces, if these are judged to be trained and equipped for military operations; and
military space activities. Such expenditures include military and civil personnel, including
retirement pensions of military personnel and social services for personnel; operation and
maintenance; procurement; military research and development; and military aid (in the military
expenditures of the donor country). Excluded are civil defense and current expenditures for previous
military activities, such as for veterans' benefits, demobilization, conversion, and destruction of
weapons. This definition cannot be applied for all countries, however, since that would require much
more detailed information than is available about what is included in military budgets and
off-budget military expenditure items. (For example, military budgets might or might not cover civil
defense, reserves and auxiliary forces, police and paramilitary forces, dual-purpose forces such as
military and civilian police, military grants in kind, pensions for military personnel, and social
security contributions paid by one part of government to another.)
DATA ANALYSIS
A multiple linear regression model that relates a y-variable to k x-variables is written as

$$ y_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \cdots + \beta_k x_{i,k} + \varepsilon_i $$

The subscript i refers to the ith unit in the data. In the notation for the x variables, the subscript
following i simply denotes which x variable it is.
The error term $\varepsilon_i$ is assumed to have a normal distribution with mean zero and constant
variance.
The estimates of the $\beta$ coefficients are the values that minimize the sum of squared errors for
the sample data.
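To make the least-squares idea concrete, here is a minimal pure-Python sketch that solves the normal equations for a small data set. It is not the procedure Excel uses internally; the function name `ols` and the five observations are hypothetical, and the data were constructed so that y = 1 + 2*x1 + 3*x2 holds exactly.

```python
def ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] via the normal equations.

    rows: one list of predictor values per observation
    y:    list of responses, same length as rows
    Assumes the predictors are not perfectly collinear (otherwise a pivot is zero).
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[a] * x[b] for x in X) for b in range(p)]
        + [sum(x[a] * yi for x, yi in zip(X, y))]
        for a in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[r][p] for r in range(p)]


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2 with no noise.
rows = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
beta = ols(rows, y)
```

Because the example data contain no noise, the recovered coefficients match the generating values up to floating-point error; with real data the same routine returns the values that minimize the sum of squared residuals.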
REGRESSION EQUATION
The first task in our analysis is to define a linear, least-squares regression equation to predict GDP
(annual growth rate) based on mortality rate under age 5 (per 1000 live births), urban population
(% of total), military expenditure (% of GDP), air transport passengers carried, natural gas rents
(% of GDP), industry (% of GDP), trade (% of GDP), mineral rents (% of GDP), and domestic credit to
private sector (% of GDP).
Referring to the coefficients table, the regression equation is:
GDP = 0.7539*Mortality rate under age 5 + 13.6140*Urban population (% of total) + 4.1715*Military
expenditure (% of GDP) – 3.24662E-07*Air transport passengers carried – 9.2247*Natural gas rents (% of
GDP) + 1.2586*Industry (% of GDP) + 0.1377*Trade (% of GDP) + 0.0797*Mineral rents (% of GDP) –
0.3016*Domestic credit to private sector (% of GDP) – 477.301
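The fitted equation can be written as a small function, which makes it easy to see how each coefficient enters a prediction. This is a sketch only: the coefficient values are copied from the equation above, while the dictionary keys and the function name `predict_gdp_growth` are hypothetical labels introduced for illustration.

```python
# Coefficients copied from the regression equation; the key names are illustrative.
INTERCEPT = -477.301
COEFFICIENTS = {
    "mortality_under5": 0.7539,
    "urban_population_pct": 13.6140,
    "military_expenditure_pct_gdp": 4.1715,
    "air_passengers": -3.24662e-07,
    "natural_gas_rents_pct_gdp": -9.2247,
    "industry_pct_gdp": 1.2586,
    "trade_pct_gdp": 0.1377,
    "mineral_rents_pct_gdp": 0.0797,
    "domestic_credit_pct_gdp": -0.3016,
}


def predict_gdp_growth(values):
    """Evaluate the fitted equation for one year's indicator values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


# With every indicator at zero the prediction is just the intercept;
# raising one indicator by one unit shifts it by that indicator's coefficient.
zeros = {name: 0.0 for name in COEFFICIENTS}
baseline = predict_gdp_growth(zeros)
bumped = predict_gdp_growth(dict(zeros, industry_pct_gdp=1.0))
```

The all-zero input is not a realistic year; it is used only to show that a one-unit change in a single indicator moves the prediction by exactly that indicator's coefficient when the others are held fixed.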
To know how well our equation fits the data, we look at the coefficient of multiple determination, R²,
which measures the proportion of the variation in the dependent variable that can be predicted from
the set of independent variables in the regression equation. From the regression statistics table, R²
is 0.5129, which means that 51.29% of the variation in GDP can be explained by the independent
variables considered.
The adjusted R² is a modified version of R² that has been adjusted for the number of predictors in the
model. For our model the adjusted R² is 0.2937. The adjusted R² increases only if a new term improves
the model more than would be expected by chance. It decreases when a predictor improves the model by
less than expected by chance.
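Both statistics can be recomputed from the ANOVA table of the initial model as a quick check. The sketch below uses the standard formulas R² = SS_regression / SS_total and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1); the variable names are illustrative.

```python
# Sums of squares and counts taken from the initial model's ANOVA table.
ss_regression = 74.32302
ss_total = 144.8928
n_obs = 30        # number of yearly observations
n_predictors = 9  # independent variables in the initial model

r2 = ss_regression / ss_total
adj_r2 = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

print(r2, adj_r2)
```

The results agree with the reported 0.5129 and 0.2937 up to rounding in the last digit. The gap between the two values reflects the penalty for using nine predictors with only 30 observations.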
ANALYSIS OF VARIANCE
Another way to evaluate the regression equation is to assess the statistical significance of the
regression sum of squares. For that we examine the ANOVA table produced by Excel. This table tests
whether the independent variables, taken together, are significant predictors of the dependent
variable. The last column of the table shows the result of the overall F-test. The F statistic is
2.34041 and the p-value is 0.05436, which is marginally above the 0.05 significance level. Strictly,
the null hypothesis (all predictor coefficients are zero) cannot be rejected at the 5% level, although
it can be rejected at the 10% level. The overall F-test in the ANOVA table therefore suggests that the
regression equation as a whole is only marginally significant.
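The F statistic and its p-value can be reproduced from the ANOVA sums of squares. The sketch below avoids third-party libraries by integrating the F density numerically with Simpson's rule; the helper names `f_pdf` and `f_sf` are hypothetical, and a statistics package would normally be used instead.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0  # correct at x = 0 when d1 > 2, which holds here
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (
        0.5 * d1 * math.log(d1)
        + 0.5 * d2 * math.log(d2)
        + (0.5 * d1 - 1) * math.log(x)
        - 0.5 * (d1 + d2) * math.log(d1 * x + d2)
        - log_beta
    )
    return math.exp(log_pdf)


def f_sf(f_stat, d1, d2, steps=2000):
    """Upper-tail probability P(F > f_stat); steps must be even (Simpson's rule)."""
    h = f_stat / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_stat, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3


# Mean squares from the initial model's ANOVA table (9 and 20 degrees of freedom).
ms_regression = 74.32302 / 9
ms_residual = 70.56981 / 20
f_stat = ms_regression / ms_residual
p_value = f_sf(f_stat, 9, 20)
reject_at_5_percent = p_value < 0.05
```

The computed p-value should agree with the Significance F column to within numerical integration error, and comparing it with 0.05 makes the borderline nature of the result explicit.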
SIGNIFICANCE OF COEFFICIENTS
The coefficient for mortality rate under age 5 is 0.7539, which signifies that, holding the other
variables constant, a change of 1 unit in mortality rate under age 5 corresponds to a 0.7539 change
in GDP. The p-value of the t-statistic from the coefficients table is 0.077205. Since it is greater
than the 0.05 significance level, mortality rate under age 5 is not a statistically significant
variable for the model.
The coefficient for urban population (% of total) is 13.6140, which signifies that, holding the other
variables constant, a change of 1 unit in urban population (% of total) corresponds to a 13.6140
change in GDP. The p-value of the t-statistic from the coefficients table is 0.0709. Since it is
greater than the 0.05 significance level, urban population is not a statistically significant variable
for the model.
The coefficient for military expenditure (% of GDP) is 4.1715, which signifies that, holding the other
variables constant, a change of 1 unit in military expenditure corresponds to a 4.1715 change in GDP.
The p-value of the t-statistic from the coefficients table is 0.0725. Since it is greater than the
0.05 significance level, military expenditure is not a statistically significant variable for the
model.
The coefficient for air transport passengers carried is -3.246E-07, which signifies that a change of
1 unit in air transport passengers carried corresponds to a -3.246E-07 change in GDP. The negative
sign indicates a negative partial relationship: holding the other variables constant, GDP decreases
as air transport passengers carried increases. The p-value of the t-statistic from the coefficients
table is 0.0196. Since it is less than the 0.05 significance level, air transport passengers carried
is a statistically significant variable for the model.
The coefficient for natural gas rents is -9.2247, which signifies that a change of 1 unit in natural
gas rents corresponds to a -9.2247 change in GDP. The negative sign indicates a negative partial
relationship: holding the other variables constant, GDP decreases as natural gas rents increase. The
p-value of the t-statistic from the coefficients table is 0.6036. Since it is greater than the 0.05
significance level, natural gas rents is not a statistically significant variable for the model.
The coefficient for industry is 1.2586, which signifies that a change of 1 unit in industry (% of GDP)
corresponds to a 1.2586 change in GDP. The p-value of the t-statistic from the coefficients table is
0.0503. Since it is only marginally above the 0.05 significance level, industry is treated as a
borderline relevant variable for the model.
The coefficient for trade is 0.1378, which signifies that a change of 1 unit in trade (% of GDP)
corresponds to a 0.1378 change in GDP. The p-value of the t-statistic from the coefficients table is
0.5577. Since it is greater than the 0.05 significance level, trade is not a statistically
significant variable for the model.
The coefficient for mineral rents is 0.07976, which signifies that a change of 1 unit in mineral
rents (% of GDP) corresponds to a 0.07976 change in GDP. The p-value of the t-statistic from the
coefficients table is 0.9661. Since it is greater than the 0.05 significance level, mineral rents is
not a statistically significant variable for the model.
The coefficient for domestic credit to private sector is -0.30168, which signifies that a change of 1
unit in domestic credit to private sector corresponds to a -0.30168 change in GDP. The negative sign
indicates a negative partial relationship: holding the other variables constant, GDP decreases as
domestic credit to private sector increases. The p-value of the t-statistic from the coefficients
table is 0.4017. Since it is greater than the 0.05 significance level, domestic credit to private
sector is not a statistically significant variable for the model.
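The same decision rule can be applied to all nine p-values at once. In this sketch the p-values are those quoted in the paragraphs of this section; the 0.10 cut-off used to label borderline variables is an assumption added for illustration and is not part of the original analysis.

```python
ALPHA = 0.05       # significance level used in this report
BORDERLINE = 0.10  # assumed secondary threshold, for illustration only

# p-values of the t-statistics as quoted in the text.
p_values = {
    "Mortality rate under age 5": 0.077205,
    "Urban population (% of total)": 0.0709,
    "Military expenditure (% of GDP)": 0.0725,
    "Air transport passengers carried": 0.0196,
    "Natural gas rents (% of GDP)": 0.6036,
    "Industry (% of GDP)": 0.0503,
    "Trade (% of GDP)": 0.5577,
    "Mineral rents (% of GDP)": 0.9661,
    "Domestic credit to private sector (% of GDP)": 0.4017,
}

significant = [name for name, p in p_values.items() if p < ALPHA]
borderline = [name for name, p in p_values.items() if ALPHA <= p < BORDERLINE]
not_significant = [name for name, p in p_values.items() if p >= BORDERLINE]

for label, names in [("significant", significant),
                     ("borderline", borderline),
                     ("not significant", not_significant)]:
    print(label + ":", ", ".join(names))
```

Listing the variables this way shows that only one predictor clears the 5% level in the initial model, while four others fall between 5% and 10%.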
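In regression, multicollinearity refers to the extent to which independent variables are correlated.
Multicollinearity exists when one independent variable is correlated with another independent
variable, or with a linear combination of two or more other independent variables.
There are two popular ways to measure multicollinearity: (1) compute a coefficient of multiple
determination for each independent variable, or (2) compute a variance inflation factor for each
independent variable.
The variance inflation factor is another way to express exactly the same information found in the
coefficient of multiple determination. A variance inflation factor is computed for each independent
variable, using the following formula:

$$ \mathrm{VIF}_j = \frac{1}{1 - R_j^2} $$

where $R_j^2$ is the coefficient of multiple determination obtained by regressing independent
variable j on all the other independent variables.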
The interpretation of the variance inflation factor mirrors the interpretation of the coefficient of
multiple determination. If VIF_j = 1, variable j is not correlated with any other independent
variable. As a rule of thumb, multicollinearity is a potential problem when VIF_j is greater than 4,
and a serious problem when it is greater than 10. Since the VIF output shows values greater than 4 for
all the variables, there is collinearity among the independent variables.
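The rule of thumb can be expressed as a short helper. This is a sketch under the standard definition VIF_j = 1 / (1 - R_j²); the function names are hypothetical, and the R_j² inputs below are illustrative values, not results from this study.

```python
def vif(r2_j):
    """Variance inflation factor from R_j^2, the R-squared obtained by
    regressing predictor j on all the other predictors (requires r2_j < 1)."""
    return 1.0 / (1.0 - r2_j)


def collinearity_flag(vif_j):
    """Apply the rule of thumb: above 4 is a potential problem, above 10 a serious one."""
    if vif_j > 10:
        return "serious"
    if vif_j > 4:
        return "potential"
    return "none flagged"


# Illustrative R_j^2 values only.
for r2_j in (0.0, 0.5, 0.8, 0.95):
    print(r2_j, collinearity_flag(vif(r2_j)))
```

In terms of R_j², the thresholds of 4 and 10 correspond to a predictor being 75% and 90% explained by the other predictors, respectively.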
Regression Statistics
Multiple R           0.694207865
R Square             0.48192456
Adjusted R Square    0.346774445
Standard Error       1.806575161
Observations         30

ANOVA         df    SS             MS          F           Significance F
Regression     6    69.8274143     11.6379     3.565846    0.012068849
Residual      23    75.06541768    3.263714
Total         29    144.892832
A regression model that contains more independent variables than another model can look like
it provides a better fit merely because it contains more variables. The adjusted R² is therefore used
to compare the goodness of fit of models that contain differing numbers of independent variables. The
adjusted R² for our final model is 0.34677, compared with 0.29377 for the initial model, which
indicates an improvement in the model.
From the ANOVA table, the Significance F (0.01207) is less than the 0.05 significance level, hence the
final model passes the F-test.
Apart from the natural gas rents variable, the p-values of all the independent variables are less than
the 0.05 significance level.
APPENDIX
Details of the variables collected from the World Bank database are as below: