
STATISTICS & BIG DATA FOR REAL ESTATE

Team Members:
• Aatam Shah
• Apurv Shah
• Karan Rego
• Pranypratabsingh Mala
• Shamal Pande
• Akshay Udapure
• Tejas Mohata

COURSE CODE: AUMREAL 639


1
INTRODUCTION
Let’s start

Introduction

Regression is a statistical analysis method used for analysing and finding the
relationship between two variables, where one is the dependent variable and the
other is the independent variable. The technique is primarily used to forecast
the dependent variable from past observations of the independent variable.

2
AIM & OBJECTIVE

Aim & Objective

Here, we have income and expense data for 24 households.

The objective is to derive a mathematical relationship
between income and expenditure.

The relationship established for the sample helps in estimating
the expenses of a household in the population based on its income.
3
METHODOLOGY

Methodology

Seven commonly used types of regression are:

• Linear Regression
• Logistic Regression
• Polynomial Regression
• Stepwise Regression
• Ridge Regression
• Lasso Regression
• Elastic Net Regression

Here we have used Linear Regression to analyze our sample.
The equation for linear regression is
y = mx + c
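As a rough illustration of how m and c in y = mx + c can be obtained by least squares, here is a minimal Python sketch. The function name `fit_line` and the toy data are hypothetical and are not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx              # slope
    c = y_mean - m * x_mean    # intercept: the line passes through the means
    return m, c


# Hypothetical toy data lying exactly on y = 2x + 1 (not from the slides).
m, c = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(f"y = {m:.2f}x + {c:.2f}")
```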

4
ANALYSIS

Sr. No. Income (x) Expense (y) x-X y-Y (x-X)(y-Y) (x-X)^2 (y-Y)^2
1 62.7 53.59 -37.63 -17.77 668.7487 1415.95 315.85
2 94.23 69.55 -6.10 -1.81 11.0522 37.20 3.28
3 95.14 83.84 -5.19 12.48 -64.75 26.93 155.70
4 96.6 56.5 -3.73 -14.86 55.42319 13.91 220.88
5 114.64 91.26 14.31 19.90 284.7558 204.80 395.93
6 91.49 65.49 -8.84 -5.87 51.90432 78.13 34.48
7 96.76 63.14 -3.57 -8.22 29.34599 12.74 67.60
8 132.35 119.29 32.02 47.93 1534.692 1025.33 2297.09
9 73.42 64.18 -26.91 -7.18 193.2639 724.10 51.58
10 101.37 73.25 1.04 1.89 1.965007 1.08 3.56
11 136 91.19 35.67 19.83 707.2783 1272.41 393.15
12 193.08 64.78 92.75 -6.58 -610.494 8602.72 43.32
13 17.45 15.08 -82.88 -56.28 4664.612 6868.96 3167.67
14 27.68 19.71 -72.65 -51.65 3752.481 5277.90 2667.94
15 83.18 75.12 -17.15 3.76 -64.4451 294.09 14.12
16 25.75 18.28 -74.58 -53.08 3958.818 5562.05 2817.71
17 81.3 61.24 -19.03 -10.12 192.6148 362.11 102.46
18 40.58 33.71 -59.75 -37.65 2249.681 3569.96 1417.68
19 81.2 72.02 -19.13 0.66 -12.5854 365.93 0.43
20 78.19 61.66 -22.14 -9.70 214.796 490.14 94.13
21 123.92 80.14 23.59 8.78 207.0784 556.53 77.05
22 198.65 143.77 98.32 72.41 7119.207 9666.99 5242.91
23 176.63 133.7 76.30 62.34 4756.435 5821.82 3886.02
24 185.59 102.2 85.26 30.84 2629.266 7269.41 950.98
Totals: Ʃx = 2407.90, Ʃy = 1712.69, Ʃ(x-X) = 0.00, Ʃ(y-Y) = 0.00,
Ʃ(x-X)(y-Y) = 32531.14, Ʃ(x-X)^2 = 59521.19, Ʃ(y-Y)^2 = 24421.51
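The deviation columns and totals in the table can be recomputed from the raw figures. The sketch below assumes the 24 income and expense values exactly as listed in the table; the helper name `deviation_sums` and the dictionary keys are illustrative choices.

```python
# Income (x) and expense (y) of the 24 households, copied from the table.
income = [62.7, 94.23, 95.14, 96.6, 114.64, 91.49, 96.76, 132.35,
          73.42, 101.37, 136, 193.08, 17.45, 27.68, 83.18, 25.75,
          81.3, 40.58, 81.2, 78.19, 123.92, 198.65, 176.63, 185.59]
expense = [53.59, 69.55, 83.84, 56.5, 91.26, 65.49, 63.14, 119.29,
           64.18, 73.25, 91.19, 64.78, 15.08, 19.71, 75.12, 18.28,
           61.24, 33.71, 72.02, 61.66, 80.14, 143.77, 133.7, 102.2]


def deviation_sums(xs, ys):
    """Return the column totals used in the regression working."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    dx = [x - x_mean for x in xs]   # column x-X
    dy = [y - y_mean for y in ys]   # column y-Y
    return {
        "sum_x": sum(xs),
        "sum_y": sum(ys),
        "sxy": sum(a * b for a, b in zip(dx, dy)),  # sum of (x-X)(y-Y)
        "sxx": sum(a * a for a in dx),              # sum of (x-X)^2
        "syy": sum(b * b for b in dy),              # sum of (y-Y)^2
    }


totals = deviation_sums(income, expense)
for name, value in totals.items():
    print(f"{name}: {value:.2f}")
```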
Solution 1

▰ If y is regressed on x, estimate the expenditure of households whose incomes are 69.75,
85.37, 99.90, 125.30, 120.50 (in thousands)

Solution:

Step 1: Find out the means of x and y


X = Ʃx/n = 2407.90/24 = 100.33 …(where X is the mean of income)
Y = Ʃy/n = 1712.69/24 = 71.36 …(where Y is the mean of expense)
Hence, X = 100.33 & Y = 71.36 ...(1)

Step 2: Find out the variance and standard deviation of x and y
σx² = (1/n) Ʃ(x - X)² = (1/24)(59,521.19) = 2,480.05 &
σx = √σx² = √2,480.05 = 49.80
Also,
σy² = (1/n) Ʃ(y - Y)² = (1/24)(24,421.51) = 1,017.56 &
σy = √σy² = √1,017.56 = 31.90
Hence, σx² = 2,480.05 & σy² = 1,017.56
& σx = 49.80 & σy = 31.90 ...(2)
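Steps 1 and 2 can be checked from the table totals alone. This is a small sketch that assumes the totals quoted in the table (n = 24, Ʃx, Ʃy and the two sums of squared deviations) and divides by n, as the slides do, rather than by n - 1.

```python
import math

# Totals taken from the analysis table.
n = 24
sum_x, sum_y = 2407.90, 1712.69
sxx, syy = 59521.19, 24421.51   # sums of squared deviations

mean_x = sum_x / n              # X, mean income
mean_y = sum_y / n              # Y, mean expense
var_x = sxx / n                 # population variance (divide by n)
var_y = syy / n
sd_x = math.sqrt(var_x)
sd_y = math.sqrt(var_y)

print(f"X = {mean_x:.2f}, Y = {mean_y:.2f}")
print(f"var_x = {var_x:.2f}, sd_x = {sd_x:.2f}")
print(f"var_y = {var_y:.2f}, sd_y = {sd_y:.2f}")
```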
Step 3: Find out the covariance of x and y

Cov(X,Y) = (1/n) Ʃ(x - X)(y - Y) = (1/24)(32,531.14) = 1,355.46


Hence Cov(X,Y) = 1,355.46 …(3)

Step 4: Find out the Correlation of x and y
r(xy) = Cov(X,Y) / (σx σy)
= 1,355.46 / (49.80 × 31.90) = 0.85
Hence, r(xy)= 0.85 …(4)
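The covariance and correlation in Steps 3 and 4 can be reproduced from the table totals. The sketch below assumes those totals; `r_direct` is an illustrative name for the same coefficient computed without the 1/n factors, which cancel.

```python
import math

# Totals taken from the analysis table.
n = 24
sxy, sxx, syy = 32531.14, 59521.19, 24421.51

cov_xy = sxy / n
sd_x = math.sqrt(sxx / n)
sd_y = math.sqrt(syy / n)

r = cov_xy / (sd_x * sd_y)            # Cov(X,Y) / (sd_x * sd_y)
r_direct = sxy / math.sqrt(sxx * syy) # same value; the 1/n factors cancel

print(f"Cov(X,Y) = {cov_xy:.2f}")
print(f"r = {r:.4f}")
```

Keeping r to four decimals (about 0.8533) rather than 0.85 matters in the next step, where the rounded value would shift the slope in the third decimal place.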

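Step 5: Find out the regression coefficient (ƀ) of y on x


ƀ = r(xy)(σy/σx) = Cov(X,Y)/σx² = 1,355.46/2,480.05 = 0.5465
(r is left unrounded here; substituting the rounded r = 0.85 gives 0.85 × 31.90/49.80 = 0.5445.)

Hence, ƀ = 0.5465 …(5)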
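For reference, the slope can be written directly in terms of the table totals. The short derivation below uses the slides' symbols (X and Y for the means) and introduces the subscripted name b_{yx} only for illustration.

```latex
b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}
       = \frac{\operatorname{Cov}(X,Y)}{\sigma_x \sigma_y}\cdot\frac{\sigma_y}{\sigma_x}
       = \frac{\operatorname{Cov}(X,Y)}{\sigma_x^{2}}
       = \frac{\sum (x - X)(y - Y)}{\sum (x - X)^{2}}
       = \frac{32531.14}{59521.19} \approx 0.5465
```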

Step 6: Find out the regression equation

The regression equation of y on x is y = Y + ƀ(x - X)
y = 71.36 + 0.5465(x - 100.33)
y = 71.36 + 0.5465x - 54.83
y = 0.5465x + 16.53

[Chart: data with fitted trendline y = 0.5465x + 16.527, R² = 0.728]

Hence, y = 0.5465x + 16.53 …(6)

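The expansion from the mean-centred form to slope-intercept form can be checked with a short sketch. The helper name `slope_intercept` is hypothetical; the inputs are the rounded means and coefficient from the steps above.

```python
def slope_intercept(mean_x, mean_y, b):
    """Rewrite y = mean_y + b*(x - mean_x) as y = b*x + c; return (b, c)."""
    return b, mean_y - b * mean_x


slope, intercept = slope_intercept(100.33, 71.36, 0.5465)
print(f"y = {slope:.4f}x + {intercept:.2f}")

# Both forms give the same estimate at any x, for example x = 120.
x = 120.0
centred = 71.36 + 0.5465 * (x - 100.33)
expanded = slope * x + intercept
```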
5a
CONCLUSION

Regression equation of y on x: y = Y + ƀ(x - X)
y = 0.5465x + 16.53

INCOME (x)    EXPENDITURE (y)
69.75         54.64
85.37         63.17
99.90         71.11
125.30        85.00
120.50        82.37

[Chart: Expense against Income with trendline y = 0.5465x + 16.527, R² = 0.728; the estimates at incomes 69.75 and 120.50 (54.64 and 82.37) are marked]

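The estimates for Solution 1 can be reproduced with a small sketch that assumes the rounded equation y = 0.5465x + 16.53. The function name `predict_expense` is illustrative. Because of rounding in the intercept, the results may differ from the tabulated figures by one or two hundredths.

```python
def predict_expense(income, slope=0.5465, intercept=16.53):
    """Estimated expense (thousands) for a given income (thousands)."""
    return slope * income + intercept


incomes = [69.75, 85.37, 99.90, 125.30, 120.50]
estimates = [predict_expense(x) for x in incomes]
for x, y in zip(incomes, estimates):
    print(f"income {x:7.2f} -> estimated expense {y:6.2f}")
```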
Solution 2

▰ If x is regressed on y, estimate the income of households whose expenditures are 50.95,
49.50, 60.66, 78.18, 80.50 (in thousands)

Solution:

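Step 1: Find out the regression coefficient (ƀ) of x on y


ƀ = r(xy)(σx/σy) = Cov(X,Y)/σy² = 1,355.46/1,017.56 = 1.3321
(r is left unrounded here; substituting the rounded r = 0.85 gives 0.85 × 49.80/31.90 = 1.3270.)

Hence, ƀ = 1.3321 …(1)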
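The two regression coefficients are linked: their product equals r², which gives a quick consistency check on the working. The sketch below assumes the table totals; the names `b_yx` and `b_xy` are illustrative labels for the y-on-x and x-on-y slopes.

```python
# Totals taken from the analysis table.
sxy, sxx, syy = 32531.14, 59521.19, 24421.51

b_yx = sxy / sxx                    # slope of y on x (Solution 1)
b_xy = sxy / syy                    # slope of x on y (Solution 2)
r_squared = sxy ** 2 / (sxx * syy)  # square of the correlation coefficient

print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}")
print(f"b_yx * b_xy = {b_yx * b_xy:.4f}, r^2 = {r_squared:.4f}")
```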

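Step 2: Find out the regression equation

The regression equation of x on y is x = X + ƀ(y - Y), where x is income and y is expenditure
x = 100.33 + 1.3321(y - 71.36)
x = 1.33y + 5.27

[Chart: income (vertical axis) against expenditure (horizontal axis); the trendline label reads y = 1.3321x + 5.2699, R² = 0.728, where the chart's y is income and its x is expenditure]

Hence, x = 1.33y + 5.27 …(2)
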
5b
CONCLUSION

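Regression equation of x on y: x = X + ƀ(y - Y), where x is income and y is expenditure
x = 1.33y + 5.27

INCOME (x)    EXPENDITURE (y)
73.03         50.95
71.10         49.50
85.95         60.66
109.24        78.18
112.33        80.50

[Chart: Income against Expense; the trendline label on this chart reads y = 1.3213x + 6.6516, R² = 0.7247, where the chart's y is income and its x is expense; the estimates at expenses 50.95 and 80.50 (73.03 and 112.33) are marked]
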
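The income estimates for Solution 2 follow from the rounded equation, income = 1.33 × expenditure + 5.27. The sketch below assumes those rounded coefficients; the function name `predict_income` is illustrative.

```python
def predict_income(expenditure, slope=1.33, intercept=5.27):
    """Estimated income (thousands) for a given expenditure (thousands)."""
    return slope * expenditure + intercept


expenditures = [50.95, 49.50, 60.66, 78.18, 80.50]
estimates = [predict_income(y) for y in expenditures]
for y, x in zip(expenditures, estimates):
    print(f"expenditure {y:6.2f} -> estimated income {x:7.2f}")
```

Using the four-decimal slope 1.3321 instead of 1.33 would raise each estimate by roughly 0.1, so the tabulated figures depend on the rounded coefficient.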
THANKS!
Any questions?

