
FINAL INTEGRATED PROJECT

ECO 701 – CLASS 2 – TEAM 2

Presented by

M. Sameer Kumar; Prashant Srivastava; Niraj Kumar; Ravi Dusad

FIP – ECO 701 – CLASS 2 – TEAM 2



Executive Summary
Economics is best described as a social science: a science that studies human behaviour. The
subject has come a long way and has given us a new way to understand the world in which we live
and work. Economics lets us assess whether a country is achieving good growth by measuring the
returns on the decisions its government makes through policies and laws. With that in mind,
questions arise about economic welfare, such as how best to judge whether a country's economic
prosperity is on the right track. Economists still argue over which indicator best determines
whether an economy is doing well or badly, but the most widely accepted one is GDP.

Gross domestic product, or GDP, tells us the country's current aggregate production of goods
and services. It is often considered the best measure of how well the economy is performing. GDP
summarises the aggregate of all economic activity in a given period. In any economy, however, the
goods and services produced are not homogeneous. It is not possible to add, for example, 10 barrels
of petroleum to 10 million metric tons of wheat. So the quantities of all goods and services are
multiplied by their prices and then summed.
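
As a minimal illustration of this price-weighted aggregation, the sketch below values two goods in a common currency and sums them. The goods, quantities and prices are hypothetical and chosen only to show the arithmetic; they are not taken from the study's data.

```python
# Hypothetical economy: quantities are in different physical units,
# so they can only be added after being valued at their prices.
output = {
    "petroleum": {"quantity": 10, "price": 60.0},   # barrels, price per barrel
    "wheat": {"quantity": 5, "price": 200.0},       # metric tons, price per ton
}

# Nominal output = sum over goods of (price * quantity).
nominal_gdp = sum(item["quantity"] * item["price"] for item in output.values())
print(nominal_gdp)
```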

A regression is a statistical technique for summarising the empirical relationship between a
variable and one or more other variables. In economics, regression analysis is, by far, the most commonly
used statistical tool for discovering and communicating empirical evidence. The objective of this study
is to understand the basic idea of what a regression means, learn how to read and interpret a table that
presents estimates from a regression, and begin to appreciate some of the reasons why a regression may
or may not provide credible evidence on any particular question.

Simple linear regression is a technique for predicting the value of a dependent variable based
on the value of a single independent variable. Sometimes, you only need one relevant independent
variable to make an accurate prediction. More often, however, the prediction is better when you use
two or more independent variables. Multiple regression is a technique for predicting the value of a
dependent variable based on the values of two or more independent variables.

OBJECTIVE OF THE STUDY


The objective of the study is to examine how Gross Domestic Product (GDP), measured as its annual
growth rate, is affected by the following indicators: mortality rate under age 5 (per 1000 live births),
urban population (% of total), military expenditure (% of GDP), air transport passengers carried, natural
gas rents (% of GDP), industry (% of GDP), trade (% of GDP), mineral rents (% of GDP) and domestic credit
to private sector (% of GDP). Using multiple regression, we estimate the impact of these indicators on GDP.

COLLECTION OF SAMPLE

The main source of the data used for this study is the World Bank website. We collected data for India
over the period 1985 to 2014, a span of 30 years.




NATURE OF STUDY

We feed this data into Microsoft Excel's 'Data Analysis' package; any computer with Microsoft Excel
installed is suitable for this purpose. This is therefore a desk-based (doctrinal) study that does not
involve a field survey. It includes an empirical element in the sense that it studies real-life data on GDP.

DEPENDENT AND INDEPENDENT VARIABLE

In this study we have selected GDP as the dependent variable, and all the indicators mentioned above
have been taken as independent variables.

TABULATION AND CHARTING OF DATA


Regression Statistics
Multiple R           0.716206448
R Square             0.512951676
Adjusted R Square    0.29377993
Standard Error       1.878427681
Observations         30

ANOVA         df    SS          MS          F          Significance F
Regression     9    74.32302    8.258113    2.34041    0.054367241
Residual      20    70.56981    3.528491
Total         29    144.8928

Variable                                             Coefficients     Standard Error   t Stat      P-value
Intercept                                            -477.3015276     236.3202878      -2.01972    0.057013
Mortality rate under age 5 (per 1000 live births)    0.753933441      0.404671435      1.863076    0.077205
Urban Population (% of Total)                        13.61400695      7.137187325      1.907475    0.070923
Military Expenditure (% of GDP)                      4.171515694      2.200241971      1.895935    0.072511
Air Transport Passengers carried                     -3.24662E-07     1.28029E-07      -2.53585    0.019665
Natural gas rents (% of GDP)                         -9.224705166     17.48773377      -0.5275     0.603653
Industry (% of GDP)                                  1.25867476       0.604333261      2.082749    0.050321
Trade (% of GDP)                                     0.137789643      0.231052958      0.596355    0.557628
Mineral rents (% of GDP)                             0.07975275       1.85062155       0.043095    0.966053
Domestic credit to private sector (% of GDP)         -0.301687942     0.35207396       -0.85689    0.401658



Description of the indicators (each with mean, minimum, maximum and number of observations)

Indicator: GDP Annual Growth Rate
Description: Annual percentage growth rate of GDP at market prices based on constant local currency.
Aggregates are based on constant 2010 U.S. dollars. GDP is the sum of gross value added by all
resident producers in the economy plus any product taxes and minus any subsidies not included in the
value of the products. It is calculated without making deductions for depreciation of fabricated
assets or for depletion and degradation of natural resources.
Mean: 6.5   Minimum: 1.05   Maximum: 10.25   No. of observations: 30

Indicator: Mortality rate under age 5 (per 1000 live births)
Description: Under-five mortality rate is the probability per 1,000 that a newborn baby will die
before reaching age five, if subject to age-specific mortality rates of the specified year.
Mean: 94.06   Minimum: 46.7   Maximum: 145.4   No. of observations: 30

Indicator: Urban Population (% of Total)
Description: Urban population refers to people living in urban areas as defined by national
statistical offices. The data are collected and smoothed by United Nations Population Division.
Mean: 27.94   Minimum: 24.34   Maximum: 32.38   No. of observations: 30

Indicator: Military Expenditure (% of GDP)
Description: Military expenditures data from SIPRI are derived from the NATO definition, which
includes all current and capital expenditures on the armed forces, including peacekeeping forces;
defense ministries and other government agencies engaged in defense projects; paramilitary forces,
if these are judged to be trained and equipped for military operations; and military space
activities. Such expenditures include military and civil personnel, including retirement pensions
of military personnel and social services for personnel; operation and maintenance; procurement;
military research and development; and military aid (in the military expenditures of the donor
country). Excluded are civil defense and current expenditures for previous military activities,
such as for veterans' benefits, demobilization, conversion, and destruction of weapons. This
definition cannot be applied for all countries, however, since that would require much more
detailed information than is available about what is included in military budgets and off-budget
military expenditure items. (For example, military budgets might or might not cover civil defense,
reserves and auxiliary forces, police and paramilitary forces, dual-purpose forces such as military
and civilian police, military grants in kind, pensions for military personnel, and social security
contributions paid by one part of government to another.)
Mean: 2.93   Minimum: 2.34   Maximum: 4.23   No. of observations: 30

Indicator: Air Transport Passengers carried
Description: Air passengers carried include both domestic and international aircraft passengers of
air carriers registered in the country.
Mean: 29311475   Minimum: 9441600   Maximum: 82718882   No. of observations: 30

Indicator: Natural gas rents (% of GDP)
Description: Natural gas rents are the difference between the value of natural gas production at
regional prices and total costs of production.
Mean: 0.08   Minimum: 0.0023   Maximum: 0.2364   No. of observations: 30

Indicator: Industry (% of GDP)
Description: Industry corresponds to ISIC divisions 10-45 and includes manufacturing (ISIC
divisions 15-37). It comprises value added in mining, manufacturing (also reported as a separate
subgroup), construction, electricity, water, and gas. Value added is the net output of a sector
after adding up all outputs and subtracting intermediate inputs. It is calculated without making
deductions for depreciation of fabricated assets or depletion and degradation of natural resources.
The origin of value added is determined by the International Standard Industrial Classification
(ISIC), revision 3 or 4.
Mean: 29.01   Minimum: 27.51   Maximum: 31.73   No. of observations: 30

Indicator: Trade (% of GDP)
Description: Trade is the sum of exports and imports of goods and services measured as a share of
gross domestic product.
Mean: 30.92   Minimum: 12.35   Maximum: 55.79   No. of observations: 30

Indicator: Mineral rents (% of GDP)
Description: Mineral rents are the difference between the value of production for a stock of
minerals at world prices and their total costs of production. Minerals included in the calculation
are tin, gold, lead, zinc, iron, copper, nickel, silver, bauxite, and phosphate.
Mean: 0.6466   Minimum: 0.1942   Maximum: 2.6187   No. of observations: 30

Indicator: Domestic credit to private sector (% of GDP)
Description: Domestic credit to private sector refers to financial resources provided to the
private sector by financial corporations, such as through loans, purchases of nonequity securities,
and trade credits and other accounts receivable, that establish a claim for repayment. For some
countries these claims include credit to public enterprises. The financial corporations include
monetary authorities and deposit money banks, as well as other financial corporations where data
are available (including corporations that do not accept transferable deposits but do incur such
liabilities as time and savings deposits). Examples of other financial corporations are finance and
leasing companies, money lenders, insurance corporations, pension funds, and foreign exchange
companies.
Mean: 34   Minimum: 22.81   Maximum: 52.38   No. of observations: 30



DATA ANALYSIS
A multiple linear regression model that relates a y-variable to k x-variables is written as

$$y_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \dots + \beta_k x_{i,k} + \varepsilon_i$$

• The subscript i refers to the ith unit in the data. In the notation for the x-variables, the subscript
following i simply denotes which x-variable it is.
• The error term $\varepsilon_i$ is assumed to have a normal distribution with mean zero and constant variance.
• The estimates of the $\beta$ coefficients are the values that minimize the sum of squared errors for the
sample data.
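
The least-squares idea in the last point can be sketched in a few lines. The example below is a minimal illustration, not the procedure Excel uses internally: it solves the normal equations by Gauss-Jordan elimination in plain Python. The function name `fit_ols` and the tiny data set are hypothetical; the data are generated exactly from y = 1 + 2*x1 + 3*x2, so the fit should recover those coefficients.

```python
def fit_ols(rows, y):
    """Return least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]          # prepend the intercept column
    p = len(X[0])
    # Augmented system (X'X | X'y).
    A = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)]
         + [sum(xi[a] * yi for xi, yi in zip(X, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical data generated without noise from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * a + 3 * b for a, b in rows]
coeffs = fit_ols(rows, y)
print([round(c, 6) for c in coeffs])
```

With strongly correlated predictors the matrix X'X becomes nearly singular and this simple solver loses accuracy, which is one practical face of the multicollinearity issue examined later in the report.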

REGRESSION EQUATION

The first task in our analysis is to define a linear least-squares regression equation to predict GDP based
on mortality rate under age 5 (per 1000 live births), urban population (% of total), military
expenditure (% of GDP), air transport passengers carried, natural gas rents (% of GDP), industry (% of
GDP), trade (% of GDP), mineral rents (% of GDP) and domestic credit to private sector (% of GDP).

Referring to the coefficients table, the regression equation is as follows:
GDP = 0.7539*Mortality rate under age 5 + 13.6140*Urban population (% of total) + 4.1715*Military
expenditure (% of GDP) - 3.24662E-07*Air transport passengers carried - 9.2247*Natural gas rents (% of
GDP) + 1.2587*Industry (% of GDP) + 0.1378*Trade (% of GDP) + 0.0798*Mineral rents (% of GDP) -
0.3017*Domestic credit to private sector (% of GDP) - 477.3015
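
To show how the equation is used, the sketch below evaluates it for the 2014 observation listed in the appendix and compares the fitted value with the actual GDP growth rate for that year. The coefficients are copied at full precision from the coefficients table; the short dictionary keys are labels introduced here for illustration only.

```python
# Coefficients from the initial regression output (full precision).
INTERCEPT = -477.3015276
COEFFS = {
    "mortality": 0.753933441,
    "urban": 13.61400695,
    "military": 4.171515694,
    "air_passengers": -3.24662e-07,
    "gas_rents": -9.224705166,
    "industry": 1.25867476,
    "trade": 0.137789643,
    "mineral_rents": 0.07975275,
    "credit": -0.301687942,
}

# 2014 row from the appendix table.
row_2014 = {
    "mortality": 46.7,
    "urban": 32.384,
    "military": 2.48815792,
    "air_passengers": 82718882.88,
    "gas_rents": 0.100189228,
    "industry": 27.6564012,
    "trade": 48.92218575,
    "mineral_rents": 0.562804199,
    "credit": 51.88218736,
}
actual_2014 = 8.154425028

# Fitted value = intercept + sum of coefficient * indicator value.
predicted = INTERCEPT + sum(COEFFS[k] * row_2014[k] for k in COEFFS)
residual = actual_2014 - predicted
print(round(predicted, 2), round(residual, 2))
```

Because the coefficient on urban population is large and multiplies a value near 32, small rounding of the coefficients shifts the fitted value noticeably, so full-precision coefficients are used here rather than the four-decimal ones in the written equation.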

COEFFICIENT OF MULTIPLE DETERMINATION

To see how well our equation fits the data, we look at the coefficient of multiple determination R²,
which measures the proportion of the variation in the dependent variable that can be predicted from
the set of independent variables in the regression equation. From the regression statistics table, R² is
0.5129, which means that 51.29% of the variation in GDP is explained by the independent variables considered.

The adjusted R² is a modified version of R² that has been adjusted for the number of predictors in the model.
For our model, the adjusted R² is 0.2938. The adjusted R² increases only if a new term improves the model
more than would be expected by chance. It decreases when a predictor improves the model by less than
expected by chance.
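
Both statistics can be reproduced from the ANOVA output. The sketch below assumes the standard definitions R² = SS_regression / SS_total and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), with n = 30 observations and k = 9 predictors; the variable names are illustrative.

```python
# Sums of squares from the ANOVA table of the initial model.
ss_regression = 74.32302
ss_total = 144.8928
n, k = 30, 9            # observations, number of predictors

# Share of the variation in the dependent variable explained by the model.
r_squared = ss_regression / ss_total

# Penalise R-squared for the number of predictors used.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(round(r_squared, 4), round(adj_r_squared, 4))
```

The gap between the two values reflects how many of the 20 residual degrees of freedom are consumed by nine predictors in a 30-observation sample.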

ANALYSIS OF VARIANCE

Another way to evaluate the regression equation is to assess the statistical significance of the regression
sum of squares. For that we examine the ANOVA table produced by Excel. This table tests the statistical
significance of the independent variables, taken together, as predictors of the dependent variable. The
last column of the table shows the result of an overall F-test. The F statistic is 2.34041 and the p-value
is 0.05437, which is marginally above the 0.05 significance level. Strictly, the null hypothesis (that the
coefficients of all predictors are zero) is rejected at the 10% level but not at the 5% level. We treat this
as a borderline pass: the overall F-test in the ANOVA table suggests that the regression equation has
some explanatory power, although the evidence is not strong.
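
The F statistic itself can be re-derived from the ANOVA sums of squares and degrees of freedom, as in the short sketch below. The p-value (Significance F) is copied from the Excel output rather than recomputed, since the Python standard library has no F-distribution function; the variable names are illustrative.

```python
# ANOVA inputs for the initial model.
ss_regression, df_regression = 74.32302, 9
ss_residual, df_residual = 70.56981, 20

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Overall F statistic: explained variance per predictor over residual variance.
f_stat = ms_regression / ms_residual

significance_f = 0.054367241   # reported by Excel, not recomputed here
reject_at_5 = significance_f < 0.05
reject_at_10 = significance_f < 0.10
print(round(f_stat, 5), reject_at_5, reject_at_10)
```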




SIGNIFICANCE OF COEFFICIENTS

• The coefficient for mortality rate under age 5 is 0.7539, which signifies that a change of 1 unit in
the mortality rate is associated with a change of 0.7539 in GDP, holding the other variables constant.
The p-value from the coefficients table, which represents the significance of the t-statistic, is
0.0772. Since it is greater than the 0.05 significance level, mortality rate under age 5 is not a
statistically significant variable in the model.
• The coefficient for urban population (% of total) is 13.6140, which signifies that a change of 1 unit
in urban population (% of total) is associated with a change of 13.6140 in GDP. The p-value from the
coefficients table is 0.0709. Since it is greater than the 0.05 significance level, urban population
is not a statistically significant variable in the model.
• The coefficient for military expenditure (% of GDP) is 4.1715, which signifies that a change of 1 unit
in military expenditure is associated with a change of 4.1715 in GDP. The p-value from the coefficients
table is 0.0725. Since it is greater than the 0.05 significance level, military expenditure is not a
statistically significant variable in the model.
• The coefficient for air transport passengers carried is -3.247E-07, which signifies that a change of
1 unit in air transport passengers carried is associated with a change of -3.247E-07 in GDP. The
negative sign indicates a negative relationship between GDP and air transport passengers carried,
suggesting that GDP decreases as the number of passengers carried increases, other variables held
constant. The p-value from the coefficients table is 0.0197. Since it is less than the 0.05
significance level, air transport passengers carried is a statistically significant variable in the
model.
• The coefficient for natural gas rents is -9.2247, which signifies that a change of 1 unit in natural
gas rents is associated with a change of -9.2247 in GDP. The negative sign indicates a negative
relationship between GDP and natural gas rents, suggesting that GDP decreases as natural gas rents
increase. The p-value from the coefficients table is 0.6037. Since it is greater than the 0.05
significance level, natural gas rents is not a statistically significant variable in the model.
• The coefficient for industry is 1.2587, which signifies that a change of 1 unit in industry is
associated with a change of 1.2587 in GDP. The p-value from the coefficients table is 0.0503. Since
it is only marginally above the 0.05 significance level, we treat industry as a relevant variable for
the model.
• The coefficient for trade is 0.1378, which signifies that a change of 1 unit in trade is associated
with a change of 0.1378 in GDP. The p-value from the coefficients table is 0.5576. Since it is greater
than the 0.05 significance level, trade is not a statistically significant variable in the model.
• The coefficient for mineral rents is 0.07975, which signifies that a change of 1 unit in mineral rents
is associated with a change of 0.07975 in GDP. The p-value from the coefficients table, which
represents the significance of the t-statistic, is 0.9661. Since it is greater than the 0.05
significance level, mineral rents is not a statistically significant variable in the model.
• The coefficient for domestic credit to private sector is -0.30169, which signifies that a change of 1
unit in domestic credit is associated with a change of -0.30169 in GDP. The negative sign indicates a
negative relationship between GDP and domestic credit to private sector, suggesting that GDP
decreases as domestic credit to private sector increases. The p-value from the coefficients table is
0.4017. Since it is greater than the 0.05 significance level, domestic credit to private sector is not
a statistically significant variable in the model.
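
The decision rule applied to each coefficient can be summarised as a short sketch: the t-statistic is the coefficient divided by its standard error, and a variable is flagged as significant when its reported p-value falls below 0.05. The figures are copied from the coefficients table of the initial model; the p-values are taken from the Excel output rather than recomputed.

```python
ALPHA = 0.05

# name: (coefficient, standard error, reported p-value)
table = {
    "Mortality rate under age 5": (0.753933441, 0.404671435, 0.077205),
    "Urban Population (% of Total)": (13.61400695, 7.137187325, 0.070923),
    "Military Expenditure (% of GDP)": (4.171515694, 2.200241971, 0.072511),
    "Air Transport Passengers carried": (-3.24662e-07, 1.28029e-07, 0.019665),
    "Natural gas rents (% of GDP)": (-9.224705166, 17.48773377, 0.603653),
    "Industry (% of GDP)": (1.25867476, 0.604333261, 0.050321),
    "Trade (% of GDP)": (0.137789643, 0.231052958, 0.557628),
    "Mineral rents (% of GDP)": (0.07975275, 1.85062155, 0.966053),
    "Domestic credit to private sector (% of GDP)": (-0.301687942, 0.35207396, 0.401658),
}

# t-statistic = estimated coefficient / its standard error.
t_stats = {name: coef / se for name, (coef, se, _) in table.items()}

# Variables whose p-value is strictly below the chosen significance level.
significant = [name for name, (_, _, p) in table.items() if p < ALPHA]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}, significant = {name in significant}")
```

Applied strictly, the 0.05 cutoff flags only air transport passengers carried; industry (p = 0.0503) falls just outside it.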

TEST FOR MULTICOLLINEARITY

In regression, multicollinearity refers to the extent to which independent variables are correlated.
Multicollinearity exists when:

• One independent variable is correlated with another independent variable.
• One independent variable is correlated with a linear combination of two or more independent
variables.

There are two popular ways to measure multicollinearity: (1) compute a coefficient of multiple
determination for each independent variable, or (2) compute a variance inflation factor for each
independent variable.

COMPUTING A VARIANCE INFLATION FACTOR FOR EACH INDEPENDENT VARIABLE

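The variance inflation factor is another way to express exactly the same information found in the
coefficient of multiple determination: if R_j² is the coefficient of determination obtained by regressing
the jth independent variable on all the others, then VIF_j = 1 / (1 - R_j²). A variance inflation factor
is computed for each independent variable, using the following formula:

$$\mathrm{VIF}_j = \frac{s_{x_j}^2 \, (n-1) \, SE_{b_j}^2}{S^2}$$

s_{x_j} = standard deviation of the jth independent variable
n = total number of observations
SE_{b_j} = standard error of the estimated coefficient of the jth independent variable
S² = mean square error of the residuals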
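
As a worked check of this formula, the sketch below computes the variance inflation factor for Industry (% of GDP) in the initial model. It uses the 30 Industry values listed in the appendix, the coefficient's standard error from the coefficients table and the residual mean square from the ANOVA table. It assumes the formula is read as VIF_j = s²(x_j) * (n - 1) * SE²(b_j) / MSE, with s² the sample variance; the result can be compared with the 5.437 reported in the VIF table.

```python
from statistics import variance

# Industry (% of GDP), India, 1985-2014, from the appendix table.
industry = [
    27.74092125, 27.84438137, 27.81145571, 27.83211386, 28.70282608,
    28.59178444, 27.51087676, 27.85260731, 27.83444655, 28.70177536,
    29.72370903, 29.02263677, 28.9245369, 28.33547672, 27.56060966,
    28.41991278, 27.55211341, 28.77011967, 28.59227115, 30.44560468,
    30.72335882, 31.5825891, 31.73174068, 31.69952822, 31.14192811,
    30.0826782, 30.16167976, 29.3985277, 28.40489956, 27.6564012,
]

n = len(industry)
se_b = 0.604333261      # standard error of the Industry coefficient
mse = 3.528491          # residual mean square from the ANOVA table

s2_x = variance(industry)                # sample variance (n - 1 denominator)
vif = s2_x * (n - 1) * se_b ** 2 / mse

# Equivalent R-squared of Industry regressed on the other predictors.
implied_r2 = 1 - 1 / vif
print(round(vif, 3), round(implied_r2, 3))
```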




Variables                                            VIF         P Value
Mortality rate under age 5 (per 1000 live births)    1214.731    0.077205
Urban Population (% of Total)                        2386.917    0.070923
Military Expenditure (% of GDP)                      8.801731    0.072511
Air Transport Passengers carried                     76.62325    0.019665
Natural gas rents (% of GDP)                         11.47101    0.603653
Industry (% of GDP)                                  5.437366    0.050321
Trade (% of GDP)                                     98.81319    0.557628
Mineral rents (% of GDP)                             10.83003    0.966053
Domestic credit to private sector (% of GDP)         130.0832    0.401658

The interpretation of the variance inflation factor mirrors the interpretation of the coefficient of multiple
determination. If VIFj = 1, variable j is not correlated with any other independent variable. As a rule of
thumb, multicollinearity is a potential problem when VIFj is greater than 4, and a serious problem when
it is greater than 10. Since the output above shows a VIF greater than 4 for every variable, and greater
than 10 for seven of the nine, there is substantial collinearity among the independent variables.
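
The rule of thumb can be applied mechanically to the VIF table, as in the minimal sketch below. The thresholds 4 and 10 are the ones stated above; the function name `severity` and the labels it returns are illustrative.

```python
# VIF values from the initial model's output.
vif = {
    "Mortality rate under age 5": 1214.731,
    "Urban Population (% of Total)": 2386.917,
    "Military Expenditure (% of GDP)": 8.801731,
    "Air Transport Passengers carried": 76.62325,
    "Natural gas rents (% of GDP)": 11.47101,
    "Industry (% of GDP)": 5.437366,
    "Trade (% of GDP)": 98.81319,
    "Mineral rents (% of GDP)": 10.83003,
    "Domestic credit to private sector (% of GDP)": 130.0832,
}

def severity(value):
    """Classify a VIF using the 4 / 10 rule of thumb."""
    if value > 10:
        return "serious"
    if value > 4:
        return "potential"
    return "low"

levels = {name: severity(v) for name, v in vif.items()}
serious = [name for name, level in levels.items() if level == "serious"]

for name, level in levels.items():
    print(f"{name}: VIF = {vif[name]:.2f} ({level})")
```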

FINDINGS AND CONCLUSIONS


Based on the data analysis discussed above, we noticed that several variables have p-values above the
0.05 significance level. Therefore, in order to build a better regression model, we removed three
variables, namely trade (% of GDP), mineral rents (% of GDP) and domestic credit to private sector
(% of GDP), and regenerated the multiple regression model with the remaining independent variables
using Excel.

Regression Statistics
Multiple R           0.694207865
R Square             0.48192456
Adjusted R Square    0.346774445
Standard Error       1.806575161
Observations         30

ANOVA         df    SS             MS          F           Significance F
Regression     6    69.8274143     11.6379     3.565846    0.012068849
Residual      23    75.06541768    3.263714
Total         29    144.892832

Variable                                             Coefficients     Standard Error   t Stat      P-value
Intercept                                            -380.9748457     135.3012329      -2.81575    0.009808
Mortality rate under age 5 (per 1000 live births)    0.576161627      0.251005684      2.295413    0.031163
Urban Population (% of Total)                        10.86537536      3.991012294      2.722461    0.012143
Military Expenditure (% of GDP)                      2.746981309      1.261217576      2.178039    0.03991
Air Transport Passengers carried                     -3.18966E-07     1.05931E-07      -3.01108    0.006226
Natural gas rents (% of GDP)                         -15.56367987     11.98101645      -1.29903    0.206807
Industry (% of GDP)                                  1.113788366      0.41515507       2.682825    0.013286




Variables                                            VIF         P Value
Mortality rate under age 5 (per 1000 live births)    505.2647    0.031163
Urban Population (% of Total)                        806.9118    0.012143
Military Expenditure (% of GDP)                      3.126685    0.03991
Air Transport Passengers carried                     56.71076    0.006226
Natural gas rents (% of GDP)                         5.821015    0.206807
Industry (% of GDP)                                  2.77417     0.013286

The final regression equation is:

GDP = 0.5762*mortality rate under age 5 + 10.8654*urban population + 2.7470*military
expenditure - 3.18966E-07*air transport passengers carried - 15.5637*natural gas rents +
1.1138*industry - 380.9748

• A regression model that contains more independent variables than another model can look like
it provides a better fit merely because it contains more variables. The adjusted R² is therefore
used to compare the goodness of fit of models that contain differing numbers of independent
variables. The adjusted R² for our final model is 0.34677, compared with 0.29378 for the initial
model, which indicates an improvement in the model.
• From the ANOVA table, the Significance F (0.01207) is less than the 0.05 significance level, so the
model passes the F-test.
• Apart from natural gas rents, every independent variable has a p-value below the 0.05
significance level.




APPENDIX
Details of the variables collected from the World Bank database are given below.

Columns (left to right): Year; GDP Annual Growth Rate (%); Mortality rate under age 5 (per 1000 live
births); Urban Population (% of Total); Military Expenditure (% of GDP); Air Transport Passengers
carried; Natural gas rents (% of GDP); Industry (% of GDP); Trade (% of GDP); Mineral rents (% of GDP);
Domestic credit to private sector (% of GDP)

1985 4.77656417 145.4 24.348 3.56903028 10993800 0.009343058 27.74092125 13.0404045 0.254447538 24.8957613
1986 3.965355634 141.5 24.585 4.10646868 11785200 0.013917209 27.84438137 12.35209389 0.243839756 26.14982578
1987 9.62778292 137.6 24.823 4.23131753 12668600 0.006754044 27.81145571 12.72144914 0.194251187 25.72552425
1988 5.947343328 133.6 25.063 3.72867178 12863100 0.002353082 27.83211386 13.63690053 0.258287723 25.61683228
1989 5.533454563 129.7 25.305 3.53417973 12740100 0.010872338 28.70282608 15.33264919 0.326041266 26.9493033
1990 1.056831433 126 25.547 3.24326938 10862200 0.04750043 28.59178444 15.67452157 0.308838474 25.25332238
1991 5.482396022 122.5 25.778 3.00244595 10717400 0.046769621 27.51087676 17.17157606 0.371361993 24.14380996
1992 4.75077622 119.2 25.984 2.79944413 11127100 0.019313161 27.85260731 18.63282812 0.310450291 25.03215582
1993 6.658924067 115.9 26.191 2.92914213 9441600 0.033473047 27.83444655 19.86421331 0.264403522 24.1542603
1994 7.57449184 112.7 26.399 2.75102689 11518400 0.028564879 28.70177536 20.29553587 0.196387706 23.96704902
1995 7.549522249 109.4 26.607 2.65402203 14260600 0.034131929 29.72370903 23.11530552 0.194950376 22.81511615
1996 4.049820849 106 26.817 2.54569243 13394600 0.0556591 29.02263677 22.16718715 0.21126882 23.71883866
1997 6.184415821 102.5 27.028 2.72615672 16039800 0.055147777 28.9245369 22.86457755 0.212084499 23.87378892
1998 8.845755561 98.9 27.24 2.80857269 16521000 0.014958462 28.33547672 23.9564845 0.351470973 23.99786555
1999 3.840991157 95.3 27.453 3.06489109 16005400 0.015945748 27.56060966 25.08477065 0.263035334 25.76657453
2000 4.823966264 91.7 27.667 3.0542772 17299483 0.109660102 28.41991278 27.19234549 0.303540522 28.72269657
2001 3.803975321 88.1 27.918 2.93379444 16862737 0.121380652 27.55211341 26.2748446 0.333666418 29.00634455
2002 7.860381475 84.6 28.244 2.83319668 17633019 0.103251022 28.77011967 29.82833107 0.331111401 32.74326674
2003 7.922943418 81.2 28.572 2.68117622 19455085 0.095937254 28.59227115 30.92374364 0.351976632 32.05384679
2004 9.284824616 77.8 28.903 2.82875261 23934074 0.107403979 30.44560468 37.91026504 0.462879307 36.68111025
2005 9.263964759 74.5 29.235 2.75490677 27879461 0.18117864 30.72335882 42.48530654 1.008417263 40.63665469
2006 9.801360337 71.2 29.569 2.52680623 40288794 0.236466717 31.5825891 46.59202866 1.225365594 44.57317056
2007 3.890957062 68 29.906 2.34263368 51897450 0.161476678 31.73174068 46.15866756 2.017201747 46.22127594
2008 8.479783897 64.7 30.246 2.55019485 49877935 0.189048104 31.69952822 53.76337261 2.618755826 50.05802075
2009 10.25996306 61.6 30.587 2.89349597 54446373 0.137181832 31.14192811 46.77702612 1.127567541 48.77678557
2010 6.6383638 58.4 30.93 2.70746752 64374253.8 0.150162296 30.0826782 49.68889126 1.707200504 51.13514945
2011 5.456387552 55.3 31.276 2.65158495 73996912 0.191326752 30.16167976 55.62388001 1.470912785 51.28923313
2012 6.386106401 52.4 31.634 2.53547653 72151828.89 0.157033103 29.3985277 55.79372173 0.891366471 51.88850765
2013 7.410227605 49.5 32.003 2.46411828 75589071 0.110333266 28.40489956 53.84413195 1.025373782 52.38570952
2014 8.154425028 46.7 32.384 2.48815792 82718882.88 0.100189228 27.6564012 48.92218575 0.562804199 51.88218736




REFERENCES

1. GDP growth (annual %). (n.d.). Retrieved from https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG
2. Stat Trek. (n.d.). Retrieved from https://stattrek.com/multiple-regression/multicollinearity.aspx?Tutorial=reg
3. Angrist, Joshua D., and Jorn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An
Empiricist's Companion. Princeton and Oxford: Princeton University Press.
4. GlobalNxt University study materials.
