# RELATIONSHIP BETWEEN PER CAPITA EXPENDITURE AND ECONOMIC AND DEMOGRAPHIC VARIABLES USING ADVANCED DATA ANALYSIS

## SUBMITTED BY: RISHABH SETHI (A049) HARSHIT JAIN (A024)

## 1. Abstract

This study examines how per capita public expenditure relates to three economic and demographic variables: the economic ability index, the percentage of the population living in metropolitan areas, and the percentage change in population. Identifying the factors on which per capita expenditure depends indicates what could be done to improve it.

## 2. Problem statement

First, we check whether the variables are normally distributed, examining the shape of each distribution: its skewness and whether it has a high or a low peak. Next, the descriptive statistics of all the variables give their means and standard deviations. We then estimate a regression line with per capita expenditure as the dependent variable and the economic ability index, the percentage of population in metropolitan areas, and the percentage change in population as the independent variables. This tells us whether the dependent variable is related to the selected independent variables. Finally, we perform an ANOVA test to check whether the population means of the independent variables are equal.

## 3. Description

State and local per capita public expenditures and the associated state demographic and economic characteristics are available for the year 1960; we have 48 cases in this regard. The characteristics given in the data relate to the economic situation of the people and their demography. We have to see whether per capita expenditure really depends on the variables we have selected.

Number of cases: 40

Variable names:

1. EX: Per capita state and local public expenditures ($)
2. ECAB: Economic ability index, in which income, retail sales, and the value of output (manufactures, mineral, and agricultural) per capita are equally weighted
3. MET: Percentage of population living in standard metropolitan areas
4. GROW: Percent change in population, 1950-1960

## 4. Analysis of the Data

4.1 Graphical representation

The histogram of per capita expenditure shows positive skewness and a low peak. Most of the values are around $300. Very few values are below $180, and the average is about $280.

The distribution of the economic ability index is negatively skewed and low peaked. The mean is 92.9. Most of the values are around 100 or 85, while only a very few observations give values of 55, 65, and 115.

The figure above shows the percentage of population living in metropolitan areas. In most cases about 50% of the population lives in metropolitan areas, and only one case gives a figure of 0%. The distribution is slightly negatively skewed and low peaked.

Here the distribution of the percentage change in population is positively skewed and highly peaked. The percentage change in population varies between its minimum and maximum values. Most observations fall at about 15%, and only one observation gives a figure of 30%.

4.2 Descriptive statistics

The table above was produced in SPSS from the data. We have taken 40 observations in total. The per capita expenditure distribution exhibits positive skewness and a low peak. Most of the observations show a value of about $300, very few give a per capita expenditure of $180, and the average comes out to be approximately $280. The economic ability index is negatively skewed and low peaked, with a mean of 92.9. Most of its observations give a value of about 100 or 85, while only a very few give values of 55, 65, and 115. The percentage of population living in metropolitan areas is 50% in most observations, and only a single observation gives a figure of 0%; this distribution is slightly negatively skewed and low peaked. The distribution of the percentage change in population is positively skewed and highly peaked. The percentage change in population varies between its minimum and maximum values; most observations fall at about 15%, and only one observation gives a figure of 30%.
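The descriptive measures discussed above (mean, standard deviation, skewness, and peakedness) can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure used in the study: the function name `describe` and the sample values are hypothetical, and the skewness and kurtosis here are the simple moment-based versions, whereas statistical packages typically apply small-sample corrections, so their output may differ slightly.

```python
import statistics


def describe(values):
    """Return n, mean, sample SD, moment skewness and excess kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)

    # Central moments computed with divisor n (uncorrected).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    skewness = m3 / m2 ** 1.5            # > 0: positively skewed
    excess_kurtosis = m4 / m2 ** 2 - 3   # < 0: low peaked, > 0: highly peaked
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Hypothetical values, NOT the study data.
sample = [180, 220, 250, 260, 270, 280, 290, 300, 300, 370]
for name, value in describe(sample).items():
    print(f"{name}: {value:.3f}")
```

A positive skewness corresponds to what the report calls a positively skewed distribution, and a negative excess kurtosis to a low-peaked one.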

4.3 Test for equality of the means of three samples (ANOVA)

Assumptions: For testing the suggested hypothesis, the following assumptions are made.

In this test we check whether the means of the three independent variables (the economic ability index, the percentage of people living in metropolitan areas, and the percentage change in population) are equal. From the results extracted through SPSS, the p-value is reported as 0.000, which is less than the pre-assigned significance level of 0.05. This suggests that the means of the selected independent variables differ significantly.
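To make the logic of the test concrete, the F statistic behind a one-way ANOVA can be computed by hand as the ratio of between-group to within-group mean squares. The sketch below is an assumption-laden illustration rather than the SPSS output: the function name `one_way_anova` and the three groups of values are hypothetical, and the p-value step (which needs the F distribution) is left out.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.fmean(g)) ** 2 for x in g) for g in groups
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three variables, NOT the study data.
ecab = [85.0, 94.0, 107.0, 95.0, 121.0]
met = [20.0, 18.0, 87.0, 86.0, 78.0]
grow = [7.0, 15.0, 4.0, 1.0, 25.0]

f_stat, df_b, df_w = one_way_anova([ecab, met, grow])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A large F relative to the F distribution with these degrees of freedom gives a small p-value, which is the situation reported here.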

The results in the table above show a significant difference between all three selected independent variables. In every comparison in the table, the p-value is less than the pre-assigned significance level of 0.05. The lower and upper bounds have also been given.

4.4 Estimating the Multiple Regression Line

Here we can conclude that none of the selected independent variables is dropped, because the coefficients of all the variables are significantly different from zero.

From the results shown in the table above, we can see that none of the variables has been dropped. The fitted regression line explains about 50% of the variation in the dependent variable.

The estimated regression line from the above model is valid because its p-value is less than the pre-assigned significance level of 0.05.

Multicollinearity does not appear to be a problem here, because in no case does a p-value exceed the pre-assigned significance level. This is therefore the best model to select: no variable is dropped, it explains a good part of the variation in the dependent variable, and the coefficients of all the variables involved are significant. From this model, the fitted multiple regression line is:

Y (PCE) = 75.485 + 2.439X1 - 0.948X2 0.886X3

where Y = per capita expenditure, X1 = economic ability index, X2 = percentage of population in metropolitan areas, and X3 = percentage change in population.
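How such a regression line and its share of explained variation are obtained can be sketched as follows. This is a dependency-free illustration that solves the least-squares normal equations directly; it is not the SPSS routine used in the study. The helper names (`solve_linear_system`, `fit_ols`, `r_squared`) and the data rows are hypothetical, so the printed coefficients will not match the fitted line reported here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ...] for y = b0 + b1*x1 + ... (least squares)."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coefs):
    """Share of the variation in y explained by the fitted line."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical (ECAB, MET, GROW) rows and EX values, NOT the study data.
predictors = [
    (85.0, 20.0, 7.0), (94.0, 18.0, 15.0), (107.0, 87.0, 4.0),
    (95.0, 86.0, 1.0), (121.0, 78.0, 25.0), (77.0, 46.0, 9.0),
    (100.0, 51.0, 14.0), (69.0, 33.0, 5.0),
]
expenditure = [256.0, 275.0, 327.0, 256.0, 312.0, 213.0, 294.0, 217.0]

coefs = fit_ols(predictors, expenditure)
print("intercept and slopes:", [round(c, 3) for c in coefs])
print("R squared:", round(r_squared(predictors, expenditure, coefs), 3))
```

An R squared of about 0.5 corresponds to the statement that the regression line explains about 50% of the variation in the dependent variable.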

We can conclude that the coefficients of the independent variables involved in this model are significant, so none of them can be dropped when estimating the regression line; in each case the p-value comes out lower than the significance level. On this basis, the three selected independent variables are taken to be uncorrelated with one another.
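A direct way to look at correlation between the independent variables is to compute their pairwise Pearson correlation coefficients. The following is a minimal sketch under stated assumptions: the function names `pearson` and `pairwise_correlations` and the data values are hypothetical, and it is offered as a complementary check, not as the procedure the study carried out.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(columns):
    """Return {(name_a, name_b): r} for every pair of named columns."""
    names = list(columns)
    return {
        (a, b): pearson(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical values, NOT the study data.
data = {
    "ECAB": [85.0, 94.0, 107.0, 95.0, 121.0, 77.0],
    "MET": [20.0, 18.0, 87.0, 86.0, 78.0, 46.0],
    "GROW": [7.0, 15.0, 4.0, 1.0, 25.0, 9.0],
}
for pair, r in pairwise_correlations(data).items():
    print(pair, round(r, 3))
```

Coefficients close to zero support treating the predictors as uncorrelated, while values near 1 or -1 would point to multicollinearity.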

## 5. Conclusion

From the analysis that has been done, we can conclude the following:

The per capita expenditure distribution exhibits positive skewness and a low peak. The distribution of the economic ability index is negatively skewed and low peaked.

The distribution is slightly negatively skewed and low peaked in the case of the percentage of population in metropolitan areas.

The distribution for the percentage change in population is positively skewed and highly peaked.

The fitted multiple regression line is:

Y (PCE) = 75.485 + 2.439X1 - 0.948X2 0.886X3

## Data

256.00 85.50 19.70 6.90

275.00 94.30 17.70 14.70 327.00 107.50 87.00 85.20 .00 1112 3.70 297.00 0 256.00 94.90 86.20 1.00 1 312.00 121.60 77.60 25.40 374.00 111.50 257.00 117 9 0 257.00 103.10 336.00 116.10 269.00 93.40 213.00 77.20 308.00 108 A 0 273.00 111.80 256.00 110.80 287.00 120 9 0 290.00 104 3 0 217.00 85.10 198.00 76.80 217.00 75.10 195.00 78.70 183.00 222.00 283.00 217.00 231.00 329.00 294.00 232.00 369,00 302.00 65.20 73.00 80.90 69.40 57.40 95.70 100.2 0 99.10 93.40 88.20 85.50 78.90 77.90 68.80 78.20 50.90 73.10 69.50 48.10 76.90 46.30 30.90 34.10 45.80 24.60 12.90 25.50 7.80 19.90 31.10 21.90 21.10 21.80 18.30 15.50 14.90 -7.40 .30 8.10 12.40

32.20 12.90 46.00 14.40 65.60 77.20 45.60 7.00 8,60 .50 51.30 14.40 33.20 5 3 0 57.90 9.80 10.60 2.90 12.70 4.60

269.00 99.10 37.60 6.80 291.00 102 2 37.40 13.70 323.00 86.000 5.00 21.90 282.00 84.90 43.90 246.00 98.80 63.40 309.00 86.20 27.60 309.00 90.20 71.40 334.00 97.60 22.60 6.40 24.10 39.40 7430 13.40

