
1. Executive Summary

This report examines whether a country's GDP per capita (GDP*) and its number of
passenger cars per 1000 persons influence its carbon dioxide emission per capita
(CDE) and, if so, by how much. Descriptive statistics, correlation coefficients
and a multiple regression analysis are applied to the data. Within the multiple
regression, neither explanatory variable shows a significant individual
relationship with the dependent variable. However, the overall model is still
significant.

2. Introduction

Environmental issues dominate politics nowadays. Nearly all countries strive to
minimize their CDE, as the current climate conference in Copenhagen shows. Among
all environmental figures, carbon dioxide emission is one of the most important,
since it has a large influence on global warming. As GDP* is a measure of the
economic wealth of a country, it is worth finding out whether it has any impact
on the degree of pollution. Furthermore, cars are among the important and
increasingly common products that produce CDE. Since cars are becoming more
available to everyone, it is interesting to see how much they contribute to CDE.
I chose the countries so that the sample represents the population as well as
possible. To describe the data I used descriptive statistics and correlation
analysis. Finally, I applied a multiple regression analysis.

3. Sources of data and their limitations

For the country selection, I could only choose from the given data. The sample
size is therefore very small, which makes it difficult to draw conclusions about
the population.
The second data set is the GDP per capita in US dollars, 2006. The data come from
the International Monetary Fund, which tries to “help their member governments
to take advantage of the opportunities—and manage the challenges—posed by
globalization and economic development more generally.” (IMF, 2009) However,
taking the population size of a country into account would give a different
result, especially given the huge populations of China and India. Additionally,
converting all the different currencies to US dollars introduces some additional
error.
Thirdly, the carbon dioxide emission per capita for 2004 is from the US
Department of Energy (USDE). The “overarching mission from the USDE is to
advance the national, economic, and energy security of the United States.”
(USDE, 2009) Here we should consider how the data were collected and calculated:
different methods lead to different results, which could overestimate or
underestimate the figures.
The fourth data set is the number of passenger cars per 1000 persons in a
country, 2004. The source is the Development Data group, The World Bank. The
World Bank is “a vital source of financial and technical assistance for
developing countries around the world.” (World Bank, 2009) However, car
ownership does not imply anything about the distances driven. In developed
countries, the infrastructure is usually better than in developing ones, which
may affect how far cars are driven.
In general it is important to note the different sources and calculation methods
of the data. Furthermore, the years of observation vary between the sources and
could lead to misinterpretation.
4. Presentation of data using appropriate graphs and tables.

1. Descriptive Analysis

Table 1: Descriptive Statistic Analysis

                          CDE per capita,   GDP per capita     Passenger cars per
                          2004              (US$), 2006        1000 persons, 2004
Mean                      9.79              23,367.33          239.17
Median                    9.79              27,737.00          210.50
Mode                      9.79              -                  -
Standard deviation        6.49              13,499.04          201.47
Sample variance           42.14             182,224,000.24     40,590.15
Skewness                  0.40              -0.14              0.28
Confidence level (95%)    4.12              8,576.88           128.01
Confidence level (99%)    5.82              12,102.82          180.63

Mean:

The arithmetic mean is the average of a data set. It is among the most popular
measures of location. The problem with the mean is that it takes all numbers into
account, so a few extreme values can make it misleading.

Median:

The median is the middle value of an ordered array of numbers. It is not
influenced by extreme values in the data set.

Mode:

The mode is the most frequently occurring value in the data set.

Skewness:

Skewness is a measure of the degree of asymmetry of a distribution. If the
skewness is negative, the distribution is skewed to the left; if it is positive,
it is skewed to the right. If median, mean and mode are the same, the
distribution is symmetric.

Graph 1: Symmetric

Graph 2: Skewed to the left

Graph 3: Skewed to the right

Standard Deviation:
The standard deviation is a measure of variability. For GDP* it is 13,499.04,
which means the values typically deviate from the mean of 23,367.33 by about
13,499.04.
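
The measures described above can be recomputed with a few lines of Python. The
following is only a minimal sketch, assuming the passenger-car column of the raw
data in the Appendix as input; the variable names are illustrative and not part
of the original Excel analysis.

```python
import statistics

# Passenger cars per 1000 persons, 2004, copied from the raw data in the Appendix.
cars = [450, 468, 1, 386, 2, 283, 82, 138, 98, 48, 341, 573]

mean_cars = statistics.mean(cars)        # arithmetic mean
median_cars = statistics.median(cars)    # middle value of the ordered data
var_cars = statistics.variance(cars)     # sample variance (divides by n - 1)
sd_cars = statistics.stdev(cars)         # sample standard deviation

print(f"mean={mean_cars:.2f} median={median_cars:.2f} "
      f"variance={var_cars:.2f} sd={sd_cars:.2f}")
```

The sample versions (dividing by n - 1) are used here because the twelve
countries are treated as a sample of a larger population.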

Confidence Interval:
A confidence interval is a range of values within which we can state, with a
given level of confidence, that the population parameter falls. I chose the 95%
and 99% confidence levels to illustrate how the range of values widens if we
increase the level.
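
As a rough illustration of how the range widens, the half-width of the interval
around the mean can be sketched as t × s / √n. The example below assumes the
passenger-car column from the Appendix and takes the two-sided critical t values
for 11 degrees of freedom from a standard t table (they are typed in, not
computed); the names are hypothetical.

```python
import math
import statistics

# Passenger cars per 1000 persons, 2004, copied from the raw data in the Appendix.
cars = [450, 468, 1, 386, 2, 283, 82, 138, 98, 48, 341, 573]

n = len(cars)
standard_error = statistics.stdev(cars) / math.sqrt(n)

# Assumed two-sided critical t values for n - 1 = 11 degrees of freedom,
# read from a standard t table.
T_CRIT = {0.95: 2.201, 0.99: 3.106}

# Half-width of the confidence interval for the mean at each level.
half_widths = {level: t * standard_error for level, t in T_CRIT.items()}

mean_cars = statistics.mean(cars)
for level, hw in half_widths.items():
    print(f"{level:.0%}: {mean_cars - hw:.2f} to {mean_cars + hw:.2f} "
          f"(half-width {hw:.2f})")
```

The 99% interval is wider than the 95% interval because a larger critical value
is needed to be more confident of covering the population mean.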
2. Correlation

The correlation coefficient measures the linear association between two variables
and is unit free. It lies in the range -1 ≤ r ≤ +1, where -1 means a perfectly
negative linear relationship and +1 a perfectly positive linear relationship.
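
A minimal sketch of how r can be computed by hand is given below. It assumes the
CDE and passenger-car columns of the raw data in the Appendix; the function name
pearson_r is illustrative only.

```python
import math

# Columns copied from the raw data in the Appendix (same country order).
cde = [16.3, 20, 3.84, 9.79, 1.2, 9.84, 4.24, 8, 13.4, 9.77, 9.79, 20.4]
cars = [450, 468, 1, 386, 2, 283, 82, 138, 98, 48, 341, 573]


def pearson_r(x, y):
    """Sample correlation: co-variation divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_cde_cars = pearson_r(cde, cars)
print(f"r(CDE, cars) = {r_cde_cars:.4f}")
```

Because both the numerator and the denominator carry the same units, the units
cancel and r is unit free.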

Table 2: Correlation (r)

                                  CDE per capita,   GDP per capita   Passenger cars per
                                  2004              (US$), 2006      1000 persons, 2004
CDE per capita, 2004              1
GDP per capita (US$), 2006        0.824126161       1
Passenger cars per 1000
persons, 2004                     0.833336146       0.920829006      1

Graph 4:

Graph 5:

In both cases the variables are positively correlated. We can see this in the
table and, more importantly, in the graphs. Both of them are upward sloping,
which indicates a positive correlation.

3. Multiple Regression

Analysis of errors, standardized residuals and predicted Y

Graph 6:

Since a multiple regression model can never be 100% accurate, we check it with an
analysis of the errors. This procedure tests whether the model is adequate or
mis-specified. The following assumptions should be fulfilled:
1. ε is a random variable with a mean or expected value of zero.
2. The variance of ε is the same for all values of x.
3. The values of ε are independent.
4. The error term ε is a normally distributed random variable.

Points 2 and 4 can be checked graphically with graph 6.
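
The standardized residuals used for such a check can be sketched as follows. The
example assumes the five residuals listed in the residual output and the residual
sum of squares from the ANOVA table in the Appendix. Scaling by √(SSE / (n - 1))
is an assumption about how the spreadsheet standardizes residuals, not something
the report states.

```python
import math

# First five residuals as listed in the residual output in the Appendix.
residuals = [1.054032774, 4.048144678, -0.266474225, -4.216011211, -2.22606008]

sse = 116.1737549  # residual sum of squares from the ANOVA table
n = 12             # number of countries in the sample

# Assumed scaling: each residual is divided by sqrt(SSE / (n - 1)).
residual_sd = math.sqrt(sse / (n - 1))
standardized = [e / residual_sd for e in residuals]

for i, z in enumerate(standardized, start=1):
    print(f"observation {i}: standardized residual = {z:.4f}")
```

As a rule of thumb, standardized residuals that stay roughly within ±2 and show
no pattern against the predicted values are consistent with points 2 and 4.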

5. Analysis of the data with explanations of why you have chosen particular
techniques and rejected others.

Significance test for the sample correlation coefficient:

The significance test for the correlation coefficient tries to find out whether
there is an association between the variables or whether r is due to chance.

Correlation between CDE and Passenger cars per 1000 persons

H0: ρ = 0

H1: ρ ≠ 0

Critical value: t(n-2, α/2) = t(10, 0.025) = ± 2.228

t = 4.767

Reject H0 in favour of H1
There is a significant linear relationship between CDE and Passenger cars per
1000 persons. The positive relationship can also be observed in graph 4.

Correlation between CDE and GDP*

H0: ρ = 0

H1: ρ ≠ 0

Critical value: t(n-2, α/2) = t(10, 0.025) = ± 2.228

t = 4.601

Reject H0 in favour of H1

There is a significant linear relationship between CDE per capita and GDP per
capita. Here again, the positive relationship can be observed in graph 5.
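
The two t values can be reproduced from the correlation coefficients with the
usual formula t = r √(n - 2) / √(1 - r²). The sketch below assumes the r values
from Table 2 and n = 12 countries; the function name correlation_t is
illustrative.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = 12  # number of countries in the sample

t_cars = correlation_t(0.833336146, n)  # CDE vs passenger cars (Table 2)
t_gdp = correlation_t(0.824126161, n)   # CDE vs GDP per capita (Table 2)

print(f"t (CDE, cars) = {t_cars:.3f}")
print(f"t (CDE, GDP)  = {t_gdp:.3f}")
```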

I rejected the Spearman rank correlation, since our calculations do not require
ranked data.

Multiple Regression model

Multiple regression analysis is used to construct a function that predicts one
variable from several others. Since we have more than one explanatory variable,
we chose multiple rather than simple regression.

True model: y = β0 + β1x1 + β2x2 + ε
Since β0, β1 and β2 are unknown, we use the estimated regression model.

Excel outcome: ŷ = 2.7221 + 0.000177353x1 + 0.01481024x2

Here x1 is GDP* and x2 is passenger cars per 1000 persons. The formula indicates
that a one-unit increase in x1 leads to a 0.000177353 increase in y, if the other
variable is held constant. Likewise, a one-unit increase in x2 leads to a
0.01481024 increase in y, if the other variable is held constant.
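
For readers without Excel, the estimates can be sketched in plain Python by
solving the least-squares normal equations (X'X)b = X'y. This is only an
illustrative reimplementation, assuming the raw data from the Appendix; the
helper names solve and ols are hypothetical and are not what Excel uses
internally.

```python
# Columns copied from the raw data in the Appendix (same country order).
cde = [16.3, 20, 3.84, 9.79, 1.2, 9.84, 4.24, 8, 13.4, 9.77, 9.79, 20.4]
gdp = [33037, 35514, 7722, 31390, 3802, 32530,
       11369, 15149, 16505, 24084, 35486, 43223]
cars = [450, 468, 1, 386, 2, 283, 82, 138, 98, 48, 341, 573]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, *predictors):
    """Least-squares coefficients [b0, b1, ...] with an intercept column."""
    rows = [[1.0, *xs] for xs in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b1, b2 = ols(cde, gdp, cars)
print(f"b0 = {b0:.4f}, b1 = {b1:.9f}, b2 = {b2:.8f}")
```

Each slope is a partial effect: it describes the change in predicted CDE for a
one-unit change in that variable while the other variable stays fixed.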

Testing the Multi Regression Model

1. Multiple Coefficient of Determination adjusted R2

The multiple coefficient of determination indicates whether the observations lie
close to or far away from the fitted regression. It helps to distinguish between
good and bad regressions. The value of R2 lies between 0 and 1. R2 = 0 means X
does not predict Y at all; R2 = 1 means X predicts Y perfectly.
Adding an extra variable never lowers R2 and usually raises it. However, such an
improvement could be purely due to chance. Since we have more than one variable,
we use the adjusted R2, which only increases if the extra variable improves the
explanatory power of the model by more than chance alone would.

Adjusted R2 = 0.6524

This means 65.24% of the variation in CDE in 2004 can be explained by our two
variables. The other 34.76% cannot be explained and depends on other factors.
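
As a check, both R2 and the adjusted R2 can be recomputed from the sums of
squares. The sketch below assumes the residual and total sums of squares from the
ANOVA table in the Appendix, with n = 12 observations and k = 2 explanatory
variables.

```python
sse = 116.1737549  # residual (unexplained) sum of squares, ANOVA table
sst = 408.542825   # total sum of squares, ANOVA table
n, k = 12, 2       # observations and explanatory variables

# Share of the total variation that the model explains.
r_squared = 1 - sse / sst

# The adjustment penalizes each extra explanatory variable through the
# degrees of freedom.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R2 = {r_squared:.4f}, adjusted R2 = {adjusted_r_squared:.4f}")
```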
2. t-test
The t-test determines whether an individual x has any influence upon y.

b1:

H0: β1 = 0

H1: β1 ≠ 0

Critical value: t(n-k-1, α/2) = t(12-2-1, 0.025) = 2.262

Excel: t = 0.8189
Fail to reject H0, which means that, at the 5% significance level, there is no
evidence that GDP* influences CDE per capita.

b2:

H0: β2 = 0

H1: β2 ≠ 0

Critical value: t(n-k-1, α/2) = t(12-2-1, 0.025) = 2.262

Excel: t = 1.0741
Fail to reject H0, which means that, at the 5% significance level, there is no
evidence that Passenger cars per 1000 persons influences CDE.

Both results illustrate that neither variable has a significant individual effect
on the dependent variable, since each t-value is below the critical value.
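
Each t value is simply the coefficient divided by its standard error. The sketch
below assumes the coefficients and standard errors from the Excel regression
output in the Appendix and the critical value 2.262 quoted above; the dictionary
layout is illustrative.

```python
# (coefficient, standard error) pairs from the regression output in the Appendix.
estimates = {
    "GDP per capita": (0.000177353, 0.000216567),
    "Passenger cars": (0.01481024, 0.013787918),
}

t_critical = 2.262  # two-sided 5% critical value for 9 degrees of freedom

t_stats = {name: b / se for name, (b, se) in estimates.items()}

for name, t in t_stats.items():
    decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"
    print(f"{name}: t = {t:.4f} -> {decision}")
```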

3. F-Test
The F-test determines the significance of the overall model. Here we test the
hypothesis that all slope coefficients are simultaneously 0.

H0: β1 = β2 = 0
H1: at least one of β1, β2 ≠ 0

Critical value: F(2, 12-2-1, 0.05) = 4.26

Excel: F = 11.3249

Reject H0 in favour of H1, which means the regression model as a whole is
significant.
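
The F value is the ratio of the explained to the unexplained mean square. The
sketch below assumes the sums of squares from the ANOVA table in the Appendix and
the critical value 4.26 quoted above.

```python
ssr = 292.3690701  # regression (explained) sum of squares, ANOVA table
sse = 116.1737549  # residual (unexplained) sum of squares, ANOVA table
n, k = 12, 2       # observations and explanatory variables

msr = ssr / k             # mean square of the regression
mse = sse / (n - k - 1)   # mean square of the residuals
f_stat = msr / mse

f_critical = 4.26  # 5% critical value for (2, 9) degrees of freedom
decision = "reject H0" if f_stat > f_critical else "fail to reject H0"
print(f"F = {f_stat:.4f} -> {decision}")
```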

4. Confidence interval for β

Even though the population β is unknown, we can construct a confidence interval
for β using our b value. With this interval, we can say with 95% confidence that
the true value of β lies within the interval.

β1: -0.000312555 to 0.000667261
β2: -0.016380196 to 0.046000676

As we can see, both intervals lie close to zero and include it, which implies
that β may have nearly no influence on the dependent variable.
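
The interval is b ± t × (standard error of b). The sketch below assumes the
coefficients and standard errors from the Excel regression output in the Appendix
and the rounded critical value t = 2.262 for 9 degrees of freedom, so the last
digits may differ slightly from the spreadsheet.

```python
# (coefficient, standard error) pairs from the regression output in the Appendix.
estimates = {
    "b1 (GDP per capita)": (0.000177353, 0.000216567),
    "b2 (Passenger cars)": (0.01481024, 0.013787918),
}

t_critical = 2.262  # rounded two-sided 5% critical value, 9 degrees of freedom

intervals = {
    name: (b - t_critical * se, b + t_critical * se)
    for name, (b, se) in estimates.items()
}

for name, (lower, upper) in intervals.items():
    contains_zero = lower <= 0 <= upper
    print(f"{name}: {lower:.9f} to {upper:.9f} (contains zero: {contains_zero})")
```

An interval that contains zero leads to the same decision as a t-test that fails
to reject H0 at the same significance level.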

5. Multicollinearity
Multicollinearity describes the degree to which the explanatory variables are
correlated with each other. If they are, which is normally the case, it is hard
to say which one is influencing the dependent variable. The symptoms are:

– high correlation between two explanatory variables

– high standard errors leading to low t ratios

– high value of R2

VIF = 2.8769

The best way to overcome the problem is to collect more data or to drop one of
the correlated variables.

6. A discussion of the results, which explains them in their wider social, economic
and political context.

Since our variables do not significantly influence CDE in the regression, they
appear not to be relevant and should not be considered in terms of CDE pollution.
However, the countries with high CDE per capita are mostly developed. Their
populations are, on average, decreasing, which brings them closer to their peak
of CDE. In contrast, most of the countries with fast-growing populations have a
very low CDE. As these countries develop and GDP and car ownership increase, CDE
per capita will probably rise faster and the variables could gain influence upon
it.

7. Conclusion

The investigation shows no significant relationship between either of the
explanatory variables and the dependent variable. However, as mentioned before,
this is only valid for our sample and could change if we take a bigger sample or
different countries. Furthermore, the data were not up to date and came from
different years of observation.

8. An appendix with the raw data that you used, details of the sources and any
detailed analysis you think is inappropriate for the main text.
Appendix

Raw Data:

Country         Carbon dioxide          GDP per capita   Passenger cars per
                emissions per capita,   (US$), 2006      1000 persons, 2004
                2004
Australia       16.3                    33,037           450
Canada          20                      35,514           468
China           3.84                    7,722            1
Germany         9.79                    31,390           386
India           1.2                     3,802            2
Japan           9.84                    32,530           283
Mexico          4.24                    11,369           82
Poland          8                       15,149           138
Saudi Arabia    13.4                    16,505           98
South Korea     9.77                    24,084           48
UK              9.79                    35,486           341
USA             20.4                    43,223           573

Given Sources:

GDP per capita – Gross domestic product per capita in US dollars, 2006. Source:
International Monetary Fund.

Carbon Dioxide Emissions in 2004 – carbon dioxide emissions per capita
(tons/capita), 2004. Source: US Department of Energy.

Passenger cars per 1000 persons, 2004. Source: Development Data group, The World
Bank.

Internet Sources:

International Monetary Fund (2009) What we do [Internet], Available from:
<http://www.imf.org/external/about/whatwedo.htm> [Accessed November, 15th 2009]

US Department of Energy (2009) About the department of Energy [Internet],
Available from: <http://www.energy.gov/about/index.htm> [Accessed November, 15th
2009]

World Bank (2009) About us [Internet], Available from:
<http://web.worldbank.org/WBSITE/EXTERNAL/EXTABOUTUS/0,,pagePK:50004410~piPK:36602~theSitePK:29708,00.html>
[Accessed December, 10th 2009]
SUMMARY OUTPUT

Regression Statistics
R Square            0.715638734
Adjusted R Square   0.652447342
Standard Error      3.592797654

ANOVA
             df   SS            MS            F            Significance F
Regression   2    292.3690701   146.1845351   11.3249401   0.003486719
Residual     9    116.1737549   12.90819499
Total        11   408.542825

               Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept      2.722142889    2.746251224      0.991221366   0.347476974   -3.490308976   8.934594754
X Variable 1   0.000177353    0.000216567      0.81893124    0.433974035   -0.000312555   0.000667261
X Variable 2   0.01481024     0.013787918      1.074146258   0.310710406   -0.016380196   0.046000676

RESIDUAL OUTPUT

Observation   Predicted Y   Residuals      Standard Residuals
1             15.24596723   1.054032774    0.324336962
2             15.95185532   4.048144678    1.245656661
3             4.106474225   -0.266474225   -0.08199692
4             14.00601121   -4.216011211   -1.297310958
5             3.42606008    -2.22606008    -0.684982082
