
# Adriano Axel Pliopas Pereira

# SGH Number 83393


April 01st, 2019

This exercise consists of the analysis of two bike rental datasets, one with
daily records and the other with hourly records. The following
considerations apply:

For the Day dataset:


- The variable “weather situation” was treated as categorical, while for the
  variable “season” four separate models were fitted, one per season.
- Only variables with a p-value less than or equal to 3% were kept in the model,
  except for season 4, where a threshold of 5% was adopted. The search was
  automated: the regression was nested inside a while loop, and on each
  iteration the variable with the highest p-value was removed, until no variable
  had a p-value greater than the specified threshold.
- A model with no distinction for season was also fitted.
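
The elimination loop described in the list above can be sketched as follows. This is a minimal illustration, not the code used for the results in this report: the function name `backward_eliminate`, its arguments, and the toy data are assumptions. It also swaps in `drop1()` F-tests, which give one p-value per term so that a categorical variable is kept or dropped as a whole, instead of reading p-values from `summary()`.

```r
# Hypothetical helper: backward elimination by p-value.
# Refits the model, drops the term with the highest p-value, and repeats
# until every remaining term has a p-value at or below the threshold.
backward_eliminate <- function(data, response, predictors, threshold = 0.03) {
  repeat {
    f   <- reformulate(predictors, response = response)
    mod <- lm(f, data = data)
    # drop1() returns one F-test per term; the first row is "<none>".
    tests <- drop1(mod, test = "F")
    pvals <- tests[["Pr(>F)"]][-1]
    names(pvals) <- rownames(tests)[-1]
    worst <- which.max(pvals)
    if (pvals[worst] <= threshold || length(predictors) == 1) break
    predictors <- setdiff(predictors, names(pvals)[worst])
  }
  mod
}

# Toy data (not the bike data): x2 is built to have no effect on y.
toy <- data.frame(
  x1 = c(1, 1, 1, 1, -1, -1, -1, -1),
  x2 = c(1, 1, -1, -1, 1, 1, -1, -1),
  x3 = c(1, 1, -1, -1, -1, -1, 1, 1)
)
noise <- 0.1 * c(1, -1, 1, -1, 1, -1, 1, -1)
toy$y <- 10 + 5 * toy$x1 + 3 * toy$x3 + noise

fit <- backward_eliminate(toy, "y", c("x1", "x2", "x3"), threshold = 0.03)
attr(terms(fit), "term.labels")   # x2 has been removed
```

For the per-season models, the same helper could be called on `subset(Day, season == 4)` with `threshold = 0.05`; the name of the response column would have to be supplied, since it is not stated in this report.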

For the hour dataset:


- Season and weather were treated as categorical variables; variables with a
  p-value higher than 5% were eliminated.
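
Treating a variable as categorical in R means converting it to a factor, so that `lm()` estimates one coefficient per level relative to a baseline level. This is why the Hour output lists `season2`, `season3` and `season4` but no `season1`. A minimal sketch on made-up data; the data frame `toy_hour`, its values, and the response name `cnt` are assumptions for illustration only:

```r
# Toy stand-in for the Hour data; the numbers are invented.
toy_hour <- data.frame(
  season     = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4),
  weathersit = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3),
  cnt        = c(16, 40, 32, 13, 1, 1, 2, 3, 8, 14, 36, 56)
)

# Convert the integer codes to factors so lm() builds dummy variables.
toy_hour$season     <- factor(toy_hour$season)
toy_hour$weathersit <- factor(toy_hour$weathersit)

mod_cat <- lm(cnt ~ season + weathersit, data = toy_hour)
names(coef(mod_cat))   # level 1 of each factor is absorbed into the intercept
```

Because the levels of a factor enter the model together, an individual level with a high p-value cannot be removed on its own without recoding the variable.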

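The coefficients for the Day dataset are summarized below, followed by the R output for each model. An empty cell means the variable does not appear in that model's final fit.

| | All seasons | Spring | Summer | Fall | Winter |
|---|---:|---:|---:|---:|---:|
| (Intercept) | 1817.7 | 1644.9 | 2144.6 | 10274.1 | 2558.1 |
| yr | 2008.1 | 1453.2 | 2147.7 | 2269.9 | 1946.1 |
| mnth | 93.1 | -57.2 | | | |
| holiday | -617.2 | -643.3 | | | -971.5 |
| weekday | 57.5 | 52.5 | 94.3 | 51.9 | 75.5 |
| weathersit2 | | -307.0 | -605.0 | -505.5 | -340.1 |
| weathersit3 | | -1431.4 | -2257.2 | -2445.2 | -1664.9 |
| temp | | 25527.2 | | -5424.2 | |
| atemp | 6772.8 | -18439.3 | 5817.8 | | 9746.6 |
| hum | -2608.7 | -1113.9 | -1215.3 | -2255.1 | -3424.2 |
| windspeed | -3438.3 | -4940.2 | -2518.7 | -2795.5 | -3120.2 |
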
All seasons and weather:

> summary(mod)

Call:
lm(formula = f, data = Day)

Residuals:
Min 1Q Median 3Q Max
-4240.7 -453.1 104.7 614.5 2933.0

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1817.75 247.08 7.357 5.12e-13 ***
yr 2008.05 71.29 28.167 < 2e-16 ***
mnth 93.07 10.84 8.588 < 2e-16 ***
holiday -617.24 212.92 -2.899 0.00386 **
weekday 57.50 17.76 3.237 0.00126 **
atemp 6772.78 226.13 29.951 < 2e-16 ***
hum -2608.70 263.54 -9.899 < 2e-16 ***
windspeed -3438.31 481.40 -7.142 2.24e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 955.4 on 723 degrees of freedom


Multiple R-squared: 0.7591, Adjusted R-squared: 0.7568
F-statistic: 325.5 on 7 and 723 DF, p-value: < 2.2e-16
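
As a quick consistency check on the all-seasons output, the adjusted R-squared can be recomputed from the reported R-squared and degrees of freedom using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The variable names below are illustrative; the numbers are copied from the summary.

```r
# Values copied from the all-seasons summary.
r2       <- 0.7591            # Multiple R-squared
df_resid <- 723               # residual degrees of freedom (n - p - 1)
p        <- 7                 # predictors, excluding the intercept
n        <- df_resid + p + 1  # number of observations: 731

adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)   # 0.7568, matching the reported Adjusted R-squared
```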

Season 1:
> summary(mod)

Call:
lm(formula = f, data = subset(Day, season == 1))

Residuals:
Min 1Q Median 3Q Max
-1622.33 -301.15 39.83 307.11 2477.75

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1644.90 330.36 4.979 1.56e-06 ***
yr 1453.24 90.68 16.025 < 2e-16 ***
mnth -57.25 13.04 -4.390 1.98e-05 ***
holiday -643.34 229.21 -2.807 0.005589 **
weekday 52.52 22.17 2.369 0.018947 *
weathersit2 -306.97 112.46 -2.730 0.007009 **
weathersit3 -1431.36 302.44 -4.733 4.64e-06 ***
temp 25527.21 5271.89 4.842 2.87e-06 ***
atemp -18439.31 5433.59 -3.394 0.000858 ***
hum -1113.94 377.14 -2.954 0.003586 **
windspeed -4940.19 823.33 -6.000 1.15e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 575.1 on 170 degrees of freedom


Multiple R-squared: 0.8406, Adjusted R-squared: 0.8312
F-statistic: 89.65 on 10 and 170 DF, p-value: < 2.2e-16

Season 2:
> summary(mod)
Call:
lm(formula = f, data = subset(Day, season == 2))

Residuals:
Min 1Q Median 3Q Max
-2642.46 -412.97 49.55 503.80 2023.43

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2144.59 453.18 4.732 4.54e-06 ***
yr 2147.73 111.64 19.238 < 2e-16 ***
weekday 94.26 27.53 3.424 0.000768 ***
weathersit2 -605.01 152.38 -3.970 0.000104 ***
weathersit3 -2257.23 489.41 -4.612 7.64e-06 ***
atemp 5817.80 567.02 10.260 < 2e-16 ***
hum -1215.32 506.61 -2.399 0.017487 *
windspeed -2518.74 809.36 -3.112 0.002169 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 731.4 on 176 degrees of freedom


Multiple R-squared: 0.8211, Adjusted R-squared: 0.814
F-statistic: 115.4 on 7 and 176 DF, p-value: < 2.2e-16

Season 3:
> summary(mod)

Call:
lm(formula = f, data = subset(Day, season == 3))

Residuals:
Min 1Q Median 3Q Max
-2309.34 -330.49 24.71 401.65 1830.53

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 10274.05 668.69 15.364 < 2e-16 ***
yr 2269.87 92.31 24.590 < 2e-16 ***
weekday 51.86 23.33 2.223 0.027475 *
weathersit2 -505.51 130.58 -3.871 0.000151 ***
weathersit3 -2445.22 362.89 -6.738 2.09e-10 ***
temp -5424.15 712.67 -7.611 1.47e-12 ***
hum -2255.15 509.34 -4.428 1.65e-05 ***
windspeed -2795.55 777.50 -3.596 0.000418 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 626.7 on 180 degrees of freedom


Multiple R-squared: 0.8226, Adjusted R-squared: 0.8157
F-statistic: 119.2 on 7 and 180 DF, p-value: < 2.2e-16

Season 4:
> summary(mod)

Call:
lm(formula = f, data = subset(Day, season == 4))

Residuals:
Min 1Q Median 3Q Max
-3044.4 -230.1 114.0 440.9 1453.7

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2558.12 563.25 4.542 1.06e-05 ***
yr 1946.05 122.31 15.910 < 2e-16 ***
holiday -971.52 327.96 -2.962 0.003494 **
weekday 75.47 30.21 2.499 0.013421 *
weathersit2 -340.12 159.89 -2.127 0.034854 *
weathersit3 -1664.94 357.62 -4.656 6.50e-06 ***
atemp 9746.62 672.45 14.494 < 2e-16 ***
hum -3424.18 750.43 -4.563 9.66e-06 ***
windspeed -3120.21 833.17 -3.745 0.000247 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 785.2 on 169 degrees of freedom


Multiple R-squared: 0.7962, Adjusted R-squared: 0.7866
F-statistic: 82.53 on 8 and 169 DF, p-value: < 2.2e-16

Hour dataset
Call:
lm(formula = f, data = Hour)

Residuals:
Min 1Q Median 3Q Max
-361.98 -93.44 -27.72 59.99 641.36

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -34.7604 7.0856 -4.906 9.39e-07 ***
season2 23.3558 3.8843 6.013 1.86e-09 ***
season3 0.8547 5.3690 0.159 0.873515
season4 62.4395 5.4326 11.493 < 2e-16 ***
yr 79.9588 2.1521 37.153 < 2e-16 ***
mnth 0.1819 0.5738 0.317 0.751277
hr 7.5311 0.1654 45.528 < 2e-16 ***
holiday -25.6328 6.4411 -3.980 6.93e-05 ***
weekday 2.0577 0.5366 3.835 0.000126 ***
weathersit2 9.7936 2.6092 3.753 0.000175 ***
weathersit3 -25.6884 4.3791 -5.866 4.54e-09 ***
weathersit4 52.0690 81.4229 0.639 0.522513
atemp 382.4821 10.0794 37.947 < 2e-16 ***
hum -195.1619 6.8745 -28.389 < 2e-16 ***
windspeed 47.6575 9.3746 5.084 3.74e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 140.9 on 17364 degrees of freedom


Multiple R-squared: 0.3973, Adjusted R-squared: 0.3968
F-statistic: 817.6 on 14 and 17364 DF, p-value: < 2.2e-16