
Assignment No: 02

Supervised Learning - Regression (Using R)

i) Generate a proper 2-D data set of N points.

ii) Split the data set into a Training Data set and a Test Data set.

iii) Perform linear regression analysis with the Least Squares Method.

iv) Plot the graphs for Training MSE and Test MSE and comment on Curve Fitting and Generalization Error.

v) Verify the effect of Data Set Size and the Bias-Variance Tradeoff.

vi) Apply Cross Validation and plot the graphs for errors.

vii) Apply the Subset Selection Method and plot the graphs for errors.

viii) Describe your findings in each case.
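The tasks above can be sketched end-to-end in a few lines of R (a minimal sketch on generated data; the synthetic data and all variable names are illustrative, not part of the assignment):

```r
# Sketch: generate a 2-D data set, split 80/20, fit least squares, report MSE.
set.seed(123)
N <- 100
x <- runif(N, 0, 50)
y <- 2 + 0.5 * x + rnorm(N, sd = 2)          # linear signal plus noise
idx   <- sample(seq_len(N), size = 0.8 * N)  # 80% of rows for training
train <- data.frame(x = x[idx],  y = y[idx])
test  <- data.frame(x = x[-idx], y = y[-idx])
fit <- lm(y ~ x, data = train)               # least-squares fit
train_mse <- mean((train$y - predict(fit, train))^2)
test_mse  <- mean((test$y  - predict(fit, test))^2)
```

Test MSE is the estimate of the generalization error; training MSE alone understates it.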

2-D DataSet

Runs  at_bats
   2        4
   3        7
   4        8
   7        9
   8       15
   9       20
  10       25
  12       30
  14       38
  15       44
  20       51
  22       56
  34       61
  44       68

Splitting Into Training and Testing Data Set

getwd()
setwd("C:/users/ML")
#install.packages('caTools')
library(caTools)
input = read.csv("TEST.csv")
set.seed(123)
split = sample.split(input$at_bats, SplitRatio = 0.8)
train_set = subset(input, split == TRUE)
test_set = subset(input, split == FALSE)
> print(train_set)

   Runs at_bats
1     2       4
2     3       7
3     4       8
4     7       9
6     9      20
7    10      25
9    14      38
10   15      44
12   22      56
13   34      61
14   44      68
>

print(test_set)
   Runs at_bats
5     8      15
8    12      30
11   20      51

Perform linear regression analysis with Least Squares Method.

> setwd("C:/users/ML")

> input = read.csv("TEST.csv")

> attach(input)

> names(input)

[1] "Runs"    "at_bats"

> head(input,8)

  Runs at_bats
1    2       4
2    3       7
3    4       8
4    7       9
5    8      15
6    9      20
7   10      25
8   12      30

> plot(Runs~at_bats,main="Runs Vs At_bats")

> cor(Runs,at_bats)
[1] 0.929

> plotSS <- function(x, y, showSquares = FALSE, leastSquares = FALSE)
+ {
+   plot(y~x, asp = 1, xlab = paste(substitute(x)), ylab = paste(substitute(y)))
+   if(leastSquares)
+   {
+     m1 <- lm(y~x)
+     y.hat <- m1$fit
+   }
+   else {
+     cat("Click any two points to make a line.\n")
+     pt1 <- locator(1)
+     points(pt1$x, pt1$y, pch = 8, col = "red")
+     pt2 <- locator(1)
+     points(pt2$x, pt2$y, pch = 8, col = "red")
+     pts <- data.frame("x" = c(pt1$x, pt2$x), "y" = c(pt1$y, pt2$y))
+     m1 <- lm(y ~ x, data = pts)
+     y.hat <- predict(m1, newdata = data.frame(x))
+     # title(paste("b0 = ", pt1$y-(pt2$y-pt1$y)/(pt2$x-pt1$x)*pt1$x, ", b1 = ", (pt2$y-pt1$y)/(pt2$x-pt1$x)))
+   }
+   r <- y - y.hat
+   abline(m1)
+   oSide <- x - r
+   LLim <- par()$usr[1]
+   RLim <- par()$usr[2]
+   oSide[oSide < LLim | oSide > RLim] <- c(x + r)[oSide < LLim | oSide > RLim] # move boxes to avoid margins
+   n <- length(y.hat)
+   for(i in 1:n)
+   {
+     lines(rep(x[i], 2), c(y[i], y.hat[i]), lty = 2, col = "blue")
+     if(showSquares)
+     {
+       lines(rep(oSide[i], 2), c(y[i], y.hat[i]), lty = 3, col = "orange")
+       lines(c(oSide[i], x[i]), rep(y.hat[i], 2), lty = 3, col = "orange")
+       lines(c(oSide[i], x[i]), rep(y[i], 2), lty = 3, col = "orange")
+     }
+   }
+   SS <- round(sum(r^2), 3)
+   cat("\r ")
+   print(m1)
+   cat("Sum of Squares: ", SS, "\n")
+ }
> plotSS(x = at_bats, y = Runs)

Click any two points to make a line.

Call:
lm(formula = y ~ x, data = pts)

Coefficients:
(Intercept)            x
     -0.346        0.563

Sum of Squares: 373

> plotSS(x = at_bats, y = Runs, showSquares = TRUE)
Click any two points to make a line.

Call:
lm(formula = y ~ x, data = pts)

Coefficients:
(Intercept)            x
      2.674        0.347

Sum of Squares: 444

> fit1 <- lm(Runs ~ at_bats)
> summary(fit1)

Call:
lm(formula = Runs ~ at_bats)

Residuals:
   Min     1Q Median     3Q    Max
-6.142 -3.554  0.445  1.584 10.593

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -1.344      2.220   -0.61     0.56
at_bats        0.511      0.059    8.66  1.6e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.66 on 12 degrees of freedom
Multiple R-squared: 0.862,    Adjusted R-squared: 0.851
F-statistic: 75 on 1 and 12 DF,  p-value: 1.65e-06

> plotSS(x = at_bats, y = Runs, leastSquares = TRUE)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
     -1.344        0.511

Sum of Squares: 261
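The coefficients reported by summary(fit1) can be checked against the closed-form least-squares formulas, b1 = Sxy/Sxx and b0 = mean(y) - b1*mean(x) (a sketch using the Runs/at_bats pairs from the 2-D data set above):

```r
# Closed-form least squares for Runs ~ at_bats.
runs    <- c(2, 3, 4, 7, 8, 9, 10, 12, 14, 15, 20, 22, 34, 44)
at_bats <- c(4, 7, 8, 9, 15, 20, 25, 30, 38, 44, 51, 56, 61, 68)
b1 <- sum((at_bats - mean(at_bats)) * (runs - mean(runs))) /
      sum((at_bats - mean(at_bats))^2)        # slope: Sxy / Sxx
b0 <- mean(runs) - b1 * mean(at_bats)         # intercept
# b1 and b0 match coef(lm(runs ~ at_bats)) -- about 0.511 and -1.344.
```

This is exactly what lm() minimizes: the sum of squared vertical residuals drawn by plotSS().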


2) Plot the graphs for Training MSE and Test MSE and comment on Curve Fitting and Generalization Error.

DATASET

    s.id   p.id     y    x1    x2    x3    x4    x5    x6    x7
1      1 297497 10.24  9.47 10.70 10.54  9.01 10.10 10.30  9.86

2 2 218250 10.47 11.46 8.87 9.90 10.31 9.71 11.32 11.28

3 3 808994 11.61 10.00 11.69 11.31 9.39 8.88 10.01 9.58

4 4 352628 9.57 10.09 8.95 8.49 9.63 8.15 9.99 9.56

5 5 371676 9.90 9.37 10.91 11.63 11.77 10.69 10.84 11.28

6 6 342721 9.65 9.13 10.28 10.11 8.74 9.84 10.30 10.51

7 7 936561 11.83 11.77 10.93 8.87 10.14 9.91 10.50 9.95

8 8 688596 9.95 10.89 9.59 8.78 10.50 10.80 10.89 11.43

9 9 617893 11.83 12.59 9.18 11.18 10.72 11.35 9.80 9.56

10 10 892466 10.08 9.44 11.35 9.81 10.64 10.28 10.47 10.13

11 11 583189 11.70 11.52 10.87 9.32 10.96 9.26 9.15 9.64

12 12 884235 9.12 10.49 9.17 7.54 11.33 9.55 10.14 10.79

13 13 247372 10.85 11.43 9.74 11.07 10.60 12.19 10.14 10.71

14 14 845347 11.47 10.72 11.31 10.21 8.86 10.51 8.77 9.31

15 15 584813 9.81 11.29 8.97 8.25 10.23 10.53 11.86 12.21

16 16 331417 9.30 7.65 11.13 10.87 9.69 7.66 7.41 7.08

17 17 420750 9.87 9.40 10.35 9.36 8.39 9.66 9.35 9.57

18 18 848683 8.88 8.97 9.47 10.35 9.82 10.14 11.12 11.22

19 19 432153 10.36 10.72 9.66 10.57 10.92 10.28 11.48 11.17

20 20 628985 10.63 9.87 11.09 10.98 10.61 9.27 9.52 9.31

21 21 402889 10.02 10.64 9.69 9.22 11.51 9.34 9.98 10.45

22 22 545527 9.15 9.20 10.14 9.90 10.30 10.56 10.65 9.93

23 23 651801 10.53 11.11 10.40 8.39 10.60 11.21 9.75 10.44

24 24 128116 10.42 10.53 9.73 9.80 9.76 10.48 9.19 9.26

25 25 962372 8.65 8.56 10.07 10.20 10.61 10.02 10.40 10.38

26 26 40027 10.82 9.48 10.80 12.01 8.82 9.03 8.87 8.24

27 27 331741 10.18 10.09 9.22 11.47 10.01 9.49 10.46 10.25

28 28 688675 11.43 11.57 10.41 8.90 11.45 9.22 6.98 6.90

29 29 185088 9.38 10.40 8.97 9.04 10.47 11.07 9.74 9.57

30 30 149812 10.69 10.60 10.40 11.54 10.50 11.14 12.44 12.40

31 31 448176 9.99 9.46 11.53 9.98 11.66 10.70 12.11 11.75

32 32 260936 9.97 9.89 10.11 9.69 9.97 10.55 8.54 8.80

33 33 91951 11.46 12.22 10.02 8.47 10.70 9.59 10.35 10.60

34 34 894025 9.90 8.97 10.73 11.90 10.69 9.60 9.65 9.92

36 36 305071 9.38 10.35 9.27 9.00 11.47 8.69 12.63 11.59

37 37 179629 11.00 9.90 11.70 10.67 10.28 9.79 9.65 10.06

38 38 300820 9.91 10.88 8.15 10.09 9.81 9.47 10.03 9.48

39 39 103049 9.36 10.69 7.63 10.14 10.41 9.14 10.17 9.96

40 40 915820 9.79 9.64 10.35 9.54 9.51 10.31 10.43 10.96

41 41 953096 8.79 9.54 9.16 9.78 11.03 11.11 9.89 9.15

42 42 446925 9.68 9.81 8.86 10.42 8.89 8.88 9.17 9.11

43 43 233823 7.88 9.40 6.61 9.84 7.82 10.82 9.91 10.13

44 44 323472 9.14 9.13 9.78 10.17 10.62 10.19 11.23 11.18

45 45 675515 10.06 10.02 10.19 9.91 9.35 10.36 12.60 12.26

46 46 558965 10.84 10.35 10.07 10.61 10.45 8.22 8.81 8.85

47 47 291438 10.61 11.38 9.80 9.50 8.96 12.03 10.62 11.18

48 48 441567 10.91 10.82 10.64 10.69 10.59 10.59 10.27 10.16

49 49 33985 10.24 9.20 12.00 9.16 9.94 9.61 10.80 10.74

50 50 729983 10.84 10.29 10.91 9.92 8.31 10.32 10.52 11.23

51 51 183190 9.53 9.80 9.90 9.59 9.51 11.18 10.53 11.15

52 52 767464 9.17 8.80 9.75 10.12 9.71 8.82 9.68 9.03

53 53 144760 10.75 10.58 11.02 11.19 12.29 11.26 12.83 11.74

54 54 704173 11.02 10.56 11.19 11.27 10.58 11.48 10.90 10.93

55 55 679179 11.03 10.49 11.16 10.45 10.89 9.89 8.58 9.00

56 56 382746 9.85 10.83 8.32 10.05 10.09 10.11 11.50 11.06

57 57 302919 9.51 9.01 10.05 10.11 9.81 8.76 9.77 8.60

58 58 594967 10.63 9.80 11.26 10.73 9.17 10.74 9.27 9.90

59 59 702793 10.29 9.83 11.30 8.99 9.72 10.54 9.60 10.73

60 60 974842 8.86 9.51 8.16 11.69 9.51 10.93 10.10 10.43

61 61 755316 11.59 10.40 12.02 10.15 9.95 9.71 8.42 7.55

62 62 275564 9.18 9.73 8.90 10.65 10.35 10.04 9.10 9.64

63 63 474705 8.97 9.70 9.17 9.07 9.71 10.26 9.47 10.02

64 64 158827 9.46 9.12 9.22 11.43 9.46 9.17 10.94 9.51

65 65 225003 11.70 12.31 10.01 9.33 10.61 10.26 10.06 10.97

66 66 978778 10.87 11.04 10.26 9.72 9.61 10.26 11.18 11.47

67 67 345114 10.36 10.33 10.26 10.35 10.59 10.95 10.98 11.14

68 68 331464 10.72 11.16 10.12 10.31 11.46 11.10 9.87 9.76

69 69 60028 9.23 9.54 9.02 10.19 9.86 10.15 9.57 9.35

71 71 376464 9.74 10.46 8.23 10.49 10.29 8.13 10.45 9.53

72 72 889119 9.42 9.75 8.73 11.49 10.79 9.97 9.69 9.02

73 73 471537 11.22 11.63 10.39 9.24 9.19 12.03 10.07 10.91

74 74 701368 9.42 8.76 10.41 10.33 10.45 8.88 8.16 8.50

75 75 91061 10.10 10.17 9.32 11.83 10.62 9.57 11.20 10.45

76 76 876968 10.60 10.71 10.10 9.28 8.94 11.12 10.42 10.14

77 77 938896 9.30 9.07 9.78 9.55 9.88 9.33 9.26 9.54

78 78 672091 8.23 8.52 9.10 9.43 8.90 9.74 10.03 10.77

79 79 760473 9.81 10.11 10.34 8.96 11.39 9.93 9.87 10.06

80 80 772694 10.35 10.05 11.23 9.81 12.33 10.11 8.74 8.98

81 81 643786 10.71 10.69 10.71 7.90 10.19 8.69 9.72 9.75

82 82 323282 7.91 8.64 9.05 8.61 10.85 8.98 9.54 9.49

83 83 657369 7.77 8.83 8.88 8.47 11.56 9.21 9.61 10.37

84 84 234114 8.98 8.85 9.61 9.90 8.81 9.66 8.28 9.15

85 85 979113 10.86 11.10 10.10 9.88 10.72 10.68 8.68 8.56

86 86 770556 8.84 9.87 9.24 9.13 9.94 12.39 10.38 10.87

87 87 678377 10.77 11.58 9.18 9.48 10.81 8.79 9.58 9.49

88 88 747496 9.75 10.11 10.05 10.52 10.76 12.10 10.33 10.89

89 89 679210 11.15 11.68 10.39 9.71 10.82 10.62 10.86 11.31

90 90 671885 10.73 10.48 9.92 11.81 10.23 9.38 11.69 11.14

91 91 460454 10.96 11.47 8.87 10.24 9.67 9.47 9.43 9.89

92 92 675481 11.13 9.60 12.69 10.14 10.96 9.39 8.67 9.59

93 93 913708 8.82 9.20 9.81 8.78 10.33 10.88 8.75 9.20

94 94 448100 10.98 12.20 8.62 8.65 9.86 8.89 10.14 10.30

95 95 608282 11.01 10.51 10.31 10.85 9.79 8.65 10.62 10.31

96 96 216595 8.36 8.67 9.12 9.63 9.63 10.26 9.69 9.87

97 97 40556 9.93 10.08 9.56 11.72 10.65 10.15 11.37 11.24

98 98 810820 9.16 9.32 9.05 11.29 10.92 9.51 9.64 9.07

99 99 341012 9.72 9.96 9.07 10.32 9.29 9.14 9.98 10.14

100 100 238466 9.43 8.79 10.67 10.83 10.58 9.23 11.25 10.85

> TrsumSS<-0
> TssumSS<-0
> TrMSE<-0
> TsMSE<-0
> p<-0
> i<-1
> dim(sample1)
[1] 100  10
> for ( i in 1:10)
+ {
+   p = 10*i
+   TrainingData <- head(sample1, p)
+   #TrainingData
+   dim(TrainingData)
+   lmodel_trainig_data <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, data = TrainingData)
+   predictedY <- predict(lmodel_trainig_data, TrainingData)
+   TestData <- head(sample1, -p)
+   Trerror <- abs(TrainingData$y - predictedY)
+   Trsuqarederror <- Trerror * Trerror
+   TrsumSS[i] <- sum(Trsuqarederror)
+   TrMSE[i] <- mean(Trsuqarederror)
+   predictedY <- predict(lmodel_trainig_data, TestData)
+   Tserror <- abs(TestData$y - predictedY)
+   Tssuqarederror <- Tserror * Tserror
+   TssumSS[i] <- sum(Tssuqarederror)
+   TsMSE[i] <- mean(Tssuqarederror)
+ }
> plot(TrsumSS, main="effect of size of training and test error")
> lines(x=TrsumSS, y = NULL, type = "l", col="blue")
> points(TssumSS, pch=10, col="red")
> lines(x=TssumSS, y = NULL, type = "l", col="red")
> plot(TrMSE, main="effect of size on training MSE(BLUE) and test MSE(RED)")
> lines(x=TrMSE, y = NULL, type = "l", col="blue")
> points(TsMSE, pch=10, col="red")
> lines(x=TsMSE, y = NULL, type = "l", col="red")
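The loop above grows the training set in steps of 10 rows; the same size effect (the variance side of the bias-variance tradeoff) can be sketched more cleanly with a fixed held-out test set (generated data; all names are illustrative):

```r
# Sketch: training vs test MSE as the training-set size grows.
set.seed(42)
n <- 200
x <- rnorm(n, 10)
y <- 3 + 0.7 * x + rnorm(n, sd = 0.5)
test_idx <- 101:200                          # fixed held-out half
tr_mse <- ts_mse <- numeric(10)
for (i in 1:10) {
  p   <- 10 * i                              # training size: 10, 20, ..., 100
  fit <- lm(y ~ x, data = data.frame(x = x[1:p], y = y[1:p]))
  tr_mse[i] <- mean((y[1:p] - predict(fit))^2)
  ts_mse[i] <- mean((y[test_idx] -
                     predict(fit, newdata = data.frame(x = x[test_idx])))^2)
}
# With more data the fitted line stabilizes: test MSE drops toward the
# noise level while training MSE rises toward it.
```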

3) Apply Cross Validation and plot the graphs for errors.

> library(DAAG)
> sample1 = read.table("C:/users/ML/my_data.txt")
> val.daag <- CVlm(sample1, m=10, form.lm=formula(y ~ x1))

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value Pr(>F)
x1         1   48.7    48.7     117 <2e-16 ***
Residuals 98   41.0     0.4
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

fold 1
Observations in test set: 10
                 1      20      29      32      38      52      69      72      81       91
x1           9.470   9.870  10.400  9.8900  10.880  8.8000   9.540   9.750  10.690  11.4700
cvpred       9.596   9.889  10.276  9.9031  10.627  9.1062   9.647   9.801  10.488  11.0584
y           10.240  10.630   9.380  9.9700   9.910  9.1700   9.230   9.420  10.710  10.9600
CV residual  0.644   0.741  -0.896  0.0669  -0.717  0.0638  -0.417  -0.381   0.222  -0.0984

Sum of squares = 2.67    Mean square = 0.27    n = 10

fold 2
Observations in test set: 10
                 3       8      16      18      39      42       45      62      63     100
x1           10.00  10.890    7.65   8.970   10.69   9.810  10.0200   9.730   9.700   8.790
cvpred        9.98  10.651    8.20   9.196   10.50   9.833   9.9918   9.772   9.749   9.059
y            11.61   9.950    9.30   8.880    9.36   9.680  10.0600   9.180   8.970   9.430
CV residual   1.63  -0.701    1.10  -0.316   -1.14  -0.153   0.0682  -0.592  -0.779   0.371

Sum of squares = 6.9    Mean square = 0.69    n = 10

fold 3
Observations in test set: 10
                 4        9      10      19      30      47      55      78     82      93
x1          10.090  12.5900   9.440  10.720  10.600  11.380  10.490   8.520   8.64   9.200
cvpred      10.065  11.7761   9.621  10.496  10.414  10.948  10.339   8.991   9.07   9.456
y            9.570  11.8300  10.080  10.360  10.690  10.610  11.030   8.230   7.91   8.820
CV residual -0.495   0.0539   0.459  -0.136   0.276  -0.338   0.691  -0.761  -1.16  -0.636

Sum of squares = 3.48    Mean square = 0.35    n = 10

fold 4
Observations in test set: 10
                35      54      56      73      74        77     87      88       89     99
x1          11.120  10.560  10.830  11.630   8.760   9.07000  11.58  10.110   11.680   9.96
cvpred      10.795  10.387  10.583  11.166   9.076   9.30198  11.13  10.059  11.2024   9.95
y           11.170  11.020   9.850  11.220   9.420   9.30000  10.77   9.750  11.1500   9.72
CV residual  0.375   0.633  -0.733   0.054   0.344  -0.00198  -0.36  -0.309  -0.0524  -0.23

Sum of squares = 1.48    Mean square = 0.15    n = 10

fold 5
Observations in test set: 10
                 2      21       24     41      50      64      71        75     80      95
x1           11.46  10.640  10.5300   9.54  10.290   9.120  10.460  10.17000  10.05  10.510
cvpred       11.04  10.447  10.3671   9.65  10.193   9.346  10.316  10.10645  10.02  10.353
y            10.47  10.020  10.4200   8.79  10.840   9.460   9.740  10.10000  10.35  11.010
CV residual  -0.57  -0.427   0.0529  -0.86   0.647   0.114  -0.576  -0.00645   0.33   0.657

Sum of squares = 2.56    Mean square = 0.26    n = 10

fold 6
Observations in test set: 10
                12      14        40      57      60      66     83     86      96      98
x1           10.49  10.720  9.640000   9.010   9.510  11.040   8.83   9.87   8.670   9.320
cvpred       10.36  10.516  9.789067   9.365   9.702  10.731   9.24   9.94   9.137   9.574
y             9.12  11.470  9.790000   9.510   8.860  10.870   7.77   8.84   8.360   9.160
CV residual  -1.24   0.954  0.000933   0.145  -0.842   0.139  -1.47  -1.10  -0.777  -0.414

Sum of squares = 7.36    Mean square = 0.74    n = 10

fold 7
Observations in test set: 10
                15      17     23      28      31      33      48     58      59      90
x1           11.29   9.400  11.11  11.570   9.460  12.220  10.820   9.80   9.830  10.480
cvpred       10.91   9.513  10.78  11.121   9.558  11.602  10.565   9.81   9.832  10.313
y             9.81   9.870  10.53  11.430   9.990  11.460  10.910  10.63  10.290  10.730
CV residual  -1.10   0.357  -0.25   0.309   0.432  -0.142   0.345   0.82   0.458   0.417

Sum of squares = 2.88    Mean square = 0.29    n = 10

fold 8
Observations in test set: 10
                 5      13      27      36      46      49     61     70     92      94
x1           9.370  11.430  10.090  10.350  10.350   9.200  10.40  10.99   9.60  12.200
cvpred       9.459  10.984   9.992  10.184  10.184   9.333  10.22  10.66   9.63  11.555
y            9.900  10.850  10.180   9.380  10.840  10.240  11.59  12.02  11.13  10.980
CV residual  0.441  -0.134   0.188  -0.804   0.656   0.907   1.37   1.36   1.50  -0.575

fold 9
Observations in test set: 10
                 7      25     26     43      67       68      76      84      85      97
x1          11.770   8.560   9.48   9.40  10.330  11.1600  10.710   8.850  11.100  10.080
cvpred      11.212   8.984   9.62   9.57  10.213  10.7889  10.477   9.186  10.747  10.039
y           11.830   8.650  10.82   7.88  10.360  10.7200  10.600   8.980  10.860   9.930
CV residual  0.618  -0.334   1.20  -1.69   0.147  -0.0689   0.123  -0.206   0.113  -0.109

Sum of squares = 4.88    Mean square = 0.49    n = 10

fold 10
Observations in test set: 10

                 6      11      22      34     37      44      51     53       65      79
x1           9.130  11.520   9.200   8.970   9.90   9.130   9.800  10.58  12.3100  10.110
cvpred       9.332  11.043   9.383   9.218   9.88   9.332   9.812  10.37  11.6086  10.034
y            9.650  11.700   9.150   9.900  11.00   9.140   9.530  10.75  11.7000   9.810
CV residual  0.318   0.657  -0.233   0.682   1.12  -0.192  -0.282   0.38   0.0914  -0.224

Sum of squares = 2.62    Mean square = 0.26    n = 10

Overall (Sum over all 10 folds)
ms
0.433

> cverr1 = val.daag$y - val.daag$cvpred

> plot(x = sample1$s.id, y = cverr1, type ='l')

> points(cverr1,pch=15, col="red",type ='l')
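CVlm's fold-by-fold output can also be reproduced by hand; a minimal sketch of 10-fold cross-validation for y ~ x1 on generated data (fold assignment is random, so the exact per-fold numbers will differ from the transcript):

```r
# Sketch: manual 10-fold cross-validation for a one-variable linear model.
set.seed(1)
n  <- 100
x1 <- rnorm(n, 10)
y  <- 2.8 + 0.72 * x1 + rnorm(n, sd = 0.6)   # toy data, shape as in my_data.txt
folds <- sample(rep(1:10, length.out = n))   # random fold label per row
cv_mse <- sapply(1:10, function(k) {
  fit  <- lm(y ~ x1, subset = folds != k)    # train on the other 9 folds
  pred <- predict(fit, newdata = data.frame(x1 = x1[folds == k]))
  mean((y[folds == k] - pred)^2)             # mean square on the held-out fold
})
overall_ms <- mean(cv_mse)                   # analogue of CVlm's overall ms
```

Each entry of cv_mse corresponds to one "Mean square = ..." line in the CVlm output.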


4) Apply the Subset Selection Method and plot the graphs for errors.

> sumSS = 0

> library(graphics)

> sample=read.table("C:/users/ML/my_data.txt")

> p <- lm(y ~ sample$x1,data=sample, x=T, y=T)

> #X1

> print(p)

Call:

lm(formula = y ~ sample$x1, data = sample, x = T, y = T)

Coefficients:

(Intercept)

sample$x1

2.78

0.72

> plot(sample$x1, sample$y, main = "Regression model developed using X1")

> abline(p,col="blue")
> predictedY<-predict(p,sample)

> print(predictedY)
    1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
 9.60 11.03  9.98 10.04  9.52  9.35 11.25 10.62 11.84  9.57 11.07 10.33 11.01 10.50 10.91  8.29  9.55  9.24 10.50  9.88
   21    22    23    24    25    26    27    28    29    30    31    32    33    34    35    36    37    38    39    40
10.44  9.40 10.78 10.36  8.94  9.60 10.04 11.11 10.27 10.41  9.59  9.90 11.58  9.24 10.78 10.23  9.91 10.61 10.47  9.72
   41    42    43    44    45    46    47    48    49    50    51    52    53    54    55    56    57    58    59    60
 9.65  9.84  9.55  9.35  9.99 10.23 10.97 10.57  9.40 10.19  9.83  9.11 10.40 10.38 10.33 10.57  9.27  9.83  9.86  9.63
   61    62    63    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79    80
10.27  9.78  9.76  9.34 11.64 10.73 10.22 10.81  9.65 10.69 10.31  9.80 11.15  9.09 10.10 10.49  9.31  8.91 10.06 10.01
   81    82    83    84    85    86    87    88    89    90    91    92    93    94    95    96    97    98    99   100
10.47  9.00  9.14  9.15 10.77  9.88 11.11 10.06 11.19 10.32 11.04  9.69  9.40 11.56 10.34  9.02 10.04  9.49  9.95  9.11

> error<-abs(sample$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[1]<-sum(suqarederror)
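Rather than repeating the fit/predict/sum-of-squares block once per predictor, the per-predictor SSE can be collected in one loop (a sketch on generated data; the data frame and predictor names are illustrative):

```r
# Sketch: single-variable subset selection -- fit y ~ x_j for each candidate
# predictor and record the residual sum of squares; the smallest SSE wins.
set.seed(7)
df <- data.frame(y  = rnorm(100, 10), x1 = rnorm(100, 10),
                 x2 = rnorm(100, 10), x3 = rnorm(100, 10))
preds <- c("x1", "x2", "x3")
sumSS <- sapply(preds, function(v) {
  fit <- lm(reformulate(v, response = "y"), data = df)  # e.g. y ~ x1
  sum(resid(fit)^2)                   # SSE for this one-variable model
})
best <- preds[which.min(sumSS)]       # predictor with the smallest SSE
```

The transcript below does the same thing manually, filling sumSS[1] through sumSS[7] one predictor at a time.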

> p <- lm(y ~ sample$x2,data=sample, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample$x2, data = sample, x = T, y = T)

Coefficients:
(Intercept)    sample$x2
      4.943        0.516

> predictedY<-predict(p,sample)

> print(predictedY)

 

    1     2     3     4     5     6     7     8     9    10    11    12    13    14    15    16    17    18    19    20
10.46  9.52 10.98  9.56 10.57 10.25 10.58  9.89  9.68 10.80 10.55  9.68  9.97 10.78  9.57 10.69 10.28  9.83  9.93 10.67
   21    22    23    24    25    26    27    28    29    30    31    32    33    34    35    36    37    38    39    40
 9.94 10.18 10.31  9.96 10.14 10.52  9.70 10.32  9.57 10.31 10.89 10.16 10.11 10.48 10.79  9.73 10.98  9.15  8.88 10.28
   41    42    43    44    45    46    47    48    49    50    51    52
 9.67  9.52  8.35  9.99 10.20 10.14 10.00 10.43 11.14 10.57 10.05  9.97
   61    62    63    64    65    66    67    68    69    70    71    72    73    74    75    76    77    78    79    80
11.15  9.54  9.68  9.70 10.11 10.24 10.24 10.17  9.60 10.76  9.19  9.45 10.30 10.32  9.75 10.16  9.99  9.64 10.28 10.74
   81    82    83    84    85    86    87    88    89    90    91    92    93    94    95    96    97    98    99   100
10.47  9.61  9.53  9.90 10.16  9.71  9.68 10.13 10.30 10.06  9.52 11.49 10.01  9.39 10.26  9.65  9.88  9.61  9.62 10.45

> error<-abs(sample$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[2]<-sum(suqarederror)
> # y as x3

> p <- lm(y ~ sample1$x3,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x3, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x3
     8.8476       0.1241

> predictedY<-predict(p,sample1)

> print(predictedY)

        1         2         3         4         5         6         7         8         9        10
10.155823 10.076387 10.251394  9.901379 10.291112 10.102452  9.948544  9.937373 10.235259 10.065216
       11        12        13        14        15        16        17        18        19        20
10.004398  9.783466 10.221606 10.114863  9.871590 10.196782 10.009362 10.132240 10.159546 10.210435
       21        22        23        24        25        26        27        28        29        30
 9.991986 10.076387  9.888967 10.063975 10.113622 10.338277 10.271253  9.952268  9.969644 10.279942
       31        32        33        34        35        36        37        38        39        40
10.086316 10.050322  9.898897 10.324624  9.838078  9.964680 10.171958 10.099969 10.106175 10.031704
       41        42        43        44        45        46        47        48        49        50
10.061492 10.140928 10.068939 10.109899 10.077628 10.164511 10.026739 10.174440  9.984539 10.078869
       51        52        53        54        55        56        57        58        59        60
10.037910 10.103693 10.236500 10.246429 10.144652 10.095004 10.102452 10.179405  9.963438 10.298559
       61        62        63        64        65        66        67        68        69        70
10.107416 10.169476  9.973368 10.266288 10.005639 10.054045 10.132240 10.127275 10.112381 10.328348
       71        72        73        74        75        76        77        78        79        80
10.149617 10.273736  9.994468 10.129758 10.315936  9.999433 10.032945 10.018051  9.959715 10.065216
       81        82        83        84        85        86        87        88        89        90
 9.828149  9.916273  9.898897 10.076387 10.073904  9.980815 10.024257 10.153340 10.052804 10.313454
       91        92        93        94        95        96        97        98        99       100
10.118587 10.106175  9.937373  9.921238 10.194299 10.042874 10.302283 10.248912 10.128516 10.191817

> error<-abs(sample1$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[3]<-sum(suqarederror)
> # y as x4
> p <- lm(y ~ sample1$x4,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x4, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x4
    9.36013      0.07193

> predictedY<-predict(p,sample1)

> print(predictedY)

        1         2         3         4         5         6         7         8         9        10
10.008260 10.101775 10.035595 10.052860 10.206800  9.988838 10.089546 10.115443 10.131268 10.125514
       11        12        13        14        15        16        17        18        19        20
10.148533 10.175149 10.122636  9.997470 10.096020 10.057176  9.963661 10.066527 10.145655 10.123356
       21        22        23        24        25        26        27        28        29        30
10.188097 10.101056 10.122636 10.062211 10.123356  9.994593 10.080195 10.183781 10.113285 10.115443
       31        32        33        34        35        36        37        38        39        40
10.198887 10.077317 10.129830 10.129110  9.985960 10.185219 10.099617 10.065808 10.108969 10.044227
       41        42        43        44        45        46        47        48        49        50
10.153568  9.999628  9.922658 10.124075 10.032718 10.111846 10.004663 10.121917 10.075159  9.957906
       51        52        53        54        55        56        57        58        59        60
10.044227 10.058614 10.244206 10.121198 10.143497 10.085950 10.065808 10.019770 10.059334 10.044227
       61        62        63        64        65        66        67
10.075879 10.104653 10.058614 10.040631 10.123356 10.051421 10.121917
       71        72        73        74        75        76        77        78        79        80
10.100337 10.136304 10.021208 10.111846 10.124075 10.003225 10.070843 10.000347 10.179465 10.247083
       81        82        83        84        85        86        87        88        89        90
10.093143 10.140620 10.191694  9.993873 10.131268 10.075159 10.137743 10.134146 10.138462 10.096020
       91        92        93        94        95        96        97        98        99       100
10.055737 10.148533 10.103214 10.069405 10.064369 10.052860 10.126233 10.145655 10.028402 10.121198

 

> error<-abs(sample1$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[4]<-sum(suqarederror)

> # y as x5

> p <- lm(y ~ sample1$x5,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x5, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x5
    9.35834      0.07327

> predictedY<-predict(p,sample1)

> print(predictedY)

        1         2         3         4         5         6         7         8         9        10
10.098342 10.069768 10.008956  9.955471 10.141570 10.079293 10.084422 10.149629 10.189926 10.111530
       11        12        13        14        15        16        17        18        19        20
10.036798 10.058045 10.251471 10.128382 10.129847  9.919571 10.066105 10.101273 10.111530 10.037531
       21        22        23        24        25        26        27        28        29        30
10.042659 10.132045 10.179669 10.126184 10.092481 10.019947 10.053649 10.033867 10.169411 10.174540
       31        32        33        34        35        36        37        38        39        40
10.142303 10.131313 10.060976 10.061709 10.154758  9.995036 10.075630 10.052184 10.028006 10.113728
       41        42        43        44        45        46        47        48        49        50
10.172342 10.008956 10.151095 10.104936 10.117392  9.960600 10.239748 10.134243 10.062441 10.114461
       51        52        53        54        55        56        57        58        59        60
10.177471 10.004560 10.183332 10.199451 10.082956 10.099075 10.000164 10.145233 10.130580 10.159154
       61        62        63        64        65        66        67        68        69        70
10.069768 10.093946 10.110065 10.030204 10.110065 10.110065 10.160619 10.171609 10.102006 10.078560
       71        72        73        74        75        76        77        78        79        80
 9.954006 10.088818 10.239748 10.008956 10.059511 10.173075 10.041927 10.071966 10.085887 10.099075
       81        82        83        84        85        86        87        88        89        90
 9.995036 10.016283 10.033135 10.066105 10.140837 10.266124 10.002362 10.244877 10.136441 10.045590
       91        92        93        94        95        96        97        98        99       100
10.052184 10.046323 10.155491 10.009689  9.992105 10.110065 10.102006 10.055115 10.028006 10.034600

 

> error<-abs(sample1$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[5]<-sum(suqarederror)

> # y as x6

> p <- lm(y ~ sample1$x6,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x6, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x6
  10.192761    -0.009984

> predictedY<-predict(p,sample1)

> print(predictedY)

       1        2        3        4        5        6        7        8        9       10       11
10.08993 10.07975 10.09282 10.09302 10.08454 10.08993 10.08793 10.08404 10.09492 10.08823 10.10141
      12       13       14       15       16       17       18       19       20       21       22
10.09153 10.09153 10.10520 10.07436 10.11878 10.09941 10.08174 10.07815 10.09772 10.09312 10.08644
      23       24       25       26       27       28       29       30       31       32       33
10.09542 10.10101 10.08893 10.10421 10.08833 10.12308 10.09552 10.06856 10.07186 10.10750 10.08943
      34       35       36       37       38       39       40       41       42       43       44
10.09642 10.09782 10.06667 10.09642 10.09263 10.09123 10.08863 10.09402 10.10121 10.09382 10.08064
      45       46       47       48       49       50       51       52       53       54       55
10.06697 10.10481 10.08673 10.09023 10.08494 10.08773 10.08763 10.09612 10.06467 10.08394 10.10710
      56       57       58       59       60       61       62       63       64       65       66
10.07795 10.09522 10.10021 10.09692 10.09193 10.10870 10.10191 10.09822 10.08354 10.09233 10.08114
      67       68       69       70       71       72       73       74       75       76       77
10.08314 10.09422 10.09722 10.08194 10.08843 10.09602 10.09223 10.11129 10.08094 10.08873 10.10031
      78       79       80       81       82       83       84       85       86       87       88
10.09263 10.09422 10.10550 10.09572 10.09752 10.09682 10.11010 10.10610 10.08913 10.09712 10.08963
      89       90       91       92       93       94       95       96       97       98       99
10.08434 10.07605 10.09862 10.10620 10.10540 10.09153 10.08673 10.09602 10.07925 10.09652 10.09312
     100
10.08045

> error<-abs(sample1$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[6]<-sum(suqarederror)

> # y as x7

> p <- lm(y ~ sample1$x7,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x7, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x7
   10.05820      0.00335

> predictedY<-predict(p,sample1)

> print(predictedY)

       1        2        3        4        5        6        7        8        9       10       11
10.09124 10.09600 10.09030 10.09023 10.09600 10.09342 10.09154 10.09650 10.09023 10.09214 10.09050
      12       13       14       15       16       17       18       19       20       21       22
10.09435 10.09409 10.08940 10.09911 10.08192 10.09027 10.09579 10.09563 10.08940 10.09321 10.09147
      23       24       25       26       27       28       29       30       31       32       33
10.09318 10.08923 10.09298 10.08581 10.09254 10.08132 10.09027 10.09975 10.09757 10.08769 10.09372
      34       35       36       37       38       39       40       41       42       43       44
10.09144 10.09328 10.09703 10.09191 10.08996 10.09157 10.09492 10.08886 10.08873 10.09214 10.09566
      45       46       47       48       49       50       51       52       53       54       55
10.09928 10.08785 10.09566 10.09224 10.09419 10.09583 10.09556 10.08846 10.09754 10.09482 10.08836
      56       57       58       59       60       61       62       63       64       65       66
10.09526 10.08702 10.09137 10.09415 10.09315 10.08350 10.09050 10.09177 10.09007 10.09496 10.09663
      67       68       69       70       71       72       73       74
10.09553 10.09090 10.08953 10.09445 10.09013 10.08842 10.09476 10.08668
      78       79       80       81       82       83       84       85       86       87       88
10.09429 10.09191 10.08829 10.09087 10.09000 10.09295 10.08886 10.08688 10.09462 10.09000 10.09469
      89       90       91       92       93       94       95       96       97       98       99
10.09610 10.09553 10.09134 10.09033 10.08903 10.09271 10.09275 10.09127 10.09586 10.08859 10.09218
     100
10.09455

> error<-abs(sample1$y-predictedY)
> plot(error)
> suqarederror<- error * error
> plot(suqarederror)
> sumSS[7]<-sum(suqarederror)

> ##y as x1+x2

> p <- lm(y ~ sample1$x1+sample1$x2,data=sample1, x=T, y=T)

> print(p)

Call:

lm(formula = y ~ sample1$x1 + sample1$x2, data = sample1, x = T, y = T)

Coefficients:
(Intercept)   sample1$x1   sample1$x2
    -2.8421       0.7408       0.5420

> predictedY<-predict(p,sample1)

> print(predictedY)

 

1

2

3

4

5

6

7

8

9

10

9.973081 10.455380 10.902319 9.483832 10.012826 9.493553 11.801611

10.423379

11.460525 10.303175

 

11

12

13

14

15

16

17

18

19

20

11.583887

9.899403 10.904721 11.229731 10.383645 8.857879 9.731514

8.935980

10.335384 10.480796

 

21

22

23

24

25

26

27

28

29

30

10.292380

9.469526 11.025401 10.232572 8.957465 10.034692 9.630180

11.371594

9.724324 10.647588

 

       31        32        33        34        35        36