
Regression analysis of the classic "fish catch" dataset using RegressIt, an Excel add-in available for free at https://regressit.com

The data set to be analyzed contains measurements of height, weight, length, and width, as well as species name, for a set of 159 fish collected in Finland in 1917. See the "notes" worksheet for a complete description of the data set and the steps in the analysis, and see the "source" worksheet for more details of the history of the data. The objective is to develop a model to predict weight from the other variables.

Weight is measured in grams and Length is measured in cm. Height and Width are measures of the relative height and width of the fish expressed as a percentage of Length. For example, a value of 41 for Height means that height is 41% of length.

The quantitative variables have all been renamed to make their units clear: Weight_g, Length_cm, Height_pct, and Width_pct.

The data transformation tool in RegressIt was used to apply a natural log transformation to these 4 variables. By default the logged variables were automatically named Weight_g.Ln, Length_cm.Ln, Height_pct.Ln, and Width_pct.Ln.

The variable called Species is a numeric code with values from 1 to 7. The data transformation tool was applied to this variable to create dummy variables for species, which were automatically assigned the names Species.Eq.1, Species.Eq.2, etc.

Note that there are two missing values of weight in the data, at rows 14 and 47, coded as blank cells. These rows will be automatically ignored in the analysis.

This is the final version of the data sheet that also includes saved values for the residuals and predictions of Model 4 as well as the exponentiated ("unlogged") predictions. These were automatically named Weight_g.Ln.Model.4.Resid, Weight_g.Ln.Model.4, and Weight_g.Ln.Model.4.Exp at the time they were created.

Row Name Species Species.Eq.1 Species.Eq.2 Species.Eq.3 Species.Eq.4 Species.Eq.5 Species.Eq.6 Species.Eq.7 Weight_g
1 Bream 1 1 0 0 0 0 0 0 242
2 Bream 1 1 0 0 0 0 0 0 290
3 Bream 1 1 0 0 0 0 0 0 340
4 Bream 1 1 0 0 0 0 0 0 363
5 Bream 1 1 0 0 0 0 0 0 430
6 Bream 1 1 0 0 0 0 0 0 450
7 Bream 1 1 0 0 0 0 0 0 500
8 Bream 1 1 0 0 0 0 0 0 390
9 Bream 1 1 0 0 0 0 0 0 450
10 Bream 1 1 0 0 0 0 0 0 500
11 Bream 1 1 0 0 0 0 0 0 475
12 Bream 1 1 0 0 0 0 0 0 500
13 Bream 1 1 0 0 0 0 0 0 500
14 Bream 1 1 0 0 0 0 0 0
15 Bream 1 1 0 0 0 0 0 0 600
16 Bream 1 1 0 0 0 0 0 0 600
17 Bream 1 1 0 0 0 0 0 0 700
18 Bream 1 1 0 0 0 0 0 0 700
19 Bream 1 1 0 0 0 0 0 0 610
20 Bream 1 1 0 0 0 0 0 0 650
21 Bream 1 1 0 0 0 0 0 0 575
22 Bream 1 1 0 0 0 0 0 0 685
23 Bream 1 1 0 0 0 0 0 0 620
24 Bream 1 1 0 0 0 0 0 0 680
25 Bream 1 1 0 0 0 0 0 0 700
26 Bream 1 1 0 0 0 0 0 0 725
27 Bream 1 1 0 0 0 0 0 0 720
28 Bream 1 1 0 0 0 0 0 0 714
29 Bream 1 1 0 0 0 0 0 0 850
30 Bream 1 1 0 0 0 0 0 0 1000
31 Bream 1 1 0 0 0 0 0 0 920
32 Bream 1 1 0 0 0 0 0 0 955
33 Bream 1 1 0 0 0 0 0 0 925
34 Bream 1 1 0 0 0 0 0 0 975
35 Bream 1 1 0 0 0 0 0 0 950
36 Whitefish 2 0 1 0 0 0 0 0 270
37 Whitefish 2 0 1 0 0 0 0 0 270
38 Whitefish 2 0 1 0 0 0 0 0 306
39 Whitefish 2 0 1 0 0 0 0 0 540
40 Whitefish 2 0 1 0 0 0 0 0 800
41 Whitefish 2 0 1 0 0 0 0 0 1000
42 Roach 3 0 0 1 0 0 0 0 40
43 Roach 3 0 0 1 0 0 0 0 69
44 Roach 3 0 0 1 0 0 0 0 78
45 Roach 3 0 0 1 0 0 0 0 87
46 Roach 3 0 0 1 0 0 0 0 120
47 Roach 3 0 0 1 0 0 0 0
48 Roach 3 0 0 1 0 0 0 0 110
49 Roach 3 0 0 1 0 0 0 0 120
50 Roach 3 0 0 1 0 0 0 0 150
51 Roach 3 0 0 1 0 0 0 0 145
52 Roach 3 0 0 1 0 0 0 0 160
53 Roach 3 0 0 1 0 0 0 0 140
54 Roach 3 0 0 1 0 0 0 0 160
55 Roach 3 0 0 1 0 0 0 0 169
56 Roach 3 0 0 1 0 0 0 0 161
57 Roach 3 0 0 1 0 0 0 0 200
58 Roach 3 0 0 1 0 0 0 0 180
59 Roach 3 0 0 1 0 0 0 0 290
60 Roach 3 0 0 1 0 0 0 0 272
61 Roach 3 0 0 1 0 0 0 0 390
62 Parkki 4 0 0 0 1 0 0 0 55
63 Parkki 4 0 0 0 1 0 0 0 60
64 Parkki 4 0 0 0 1 0 0 0 90
65 Parkki 4 0 0 0 1 0 0 0 120
66 Parkki 4 0 0 0 1 0 0 0 150
67 Parkki 4 0 0 0 1 0 0 0 140
68 Parkki 4 0 0 0 1 0 0 0 170
69 Parkki 4 0 0 0 1 0 0 0 145
70 Parkki 4 0 0 0 1 0 0 0 200
71 Parkki 4 0 0 0 1 0 0 0 273
72 Parkki 4 0 0 0 1 0 0 0 300
73 Smelt 5 0 0 0 0 1 0 0 6.7
74 Smelt 5 0 0 0 0 1 0 0 7.5
75 Smelt 5 0 0 0 0 1 0 0 7
76 Smelt 5 0 0 0 0 1 0 0 9.7
77 Smelt 5 0 0 0 0 1 0 0 9.8
78 Smelt 5 0 0 0 0 1 0 0 8.7
79 Smelt 5 0 0 0 0 1 0 0 10
80 Smelt 5 0 0 0 0 1 0 0 9.9
81 Smelt 5 0 0 0 0 1 0 0 9.8
82 Smelt 5 0 0 0 0 1 0 0 12.2
83 Smelt 5 0 0 0 0 1 0 0 13.4
84 Smelt 5 0 0 0 0 1 0 0 12.2
85 Smelt 5 0 0 0 0 1 0 0 19.7
86 Smelt 5 0 0 0 0 1 0 0 19.9
87 Pike 6 0 0 0 0 0 1 0 200
88 Pike 6 0 0 0 0 0 1 0 300
89 Pike 6 0 0 0 0 0 1 0 300
90 Pike 6 0 0 0 0 0 1 0 300
91 Pike 6 0 0 0 0 0 1 0 430
92 Pike 6 0 0 0 0 0 1 0 345
93 Pike 6 0 0 0 0 0 1 0 456
94 Pike 6 0 0 0 0 0 1 0 510
95 Pike 6 0 0 0 0 0 1 0 540
96 Pike 6 0 0 0 0 0 1 0 500
97 Pike 6 0 0 0 0 0 1 0 567
98 Pike 6 0 0 0 0 0 1 0 770
99 Pike 6 0 0 0 0 0 1 0 950
100 Pike 6 0 0 0 0 0 1 0 1250
101 Pike 6 0 0 0 0 0 1 0 1600
102 Pike 6 0 0 0 0 0 1 0 1550
103 Pike 6 0 0 0 0 0 1 0 1650
104 Perch 7 0 0 0 0 0 0 1 5.9
105 Perch 7 0 0 0 0 0 0 1 32
106 Perch 7 0 0 0 0 0 0 1 40
107 Perch 7 0 0 0 0 0 0 1 51.5
108 Perch 7 0 0 0 0 0 0 1 70
109 Perch 7 0 0 0 0 0 0 1 100
110 Perch 7 0 0 0 0 0 0 1 78
111 Perch 7 0 0 0 0 0 0 1 80
112 Perch 7 0 0 0 0 0 0 1 85
113 Perch 7 0 0 0 0 0 0 1 85
114 Perch 7 0 0 0 0 0 0 1 110
115 Perch 7 0 0 0 0 0 0 1 115
116 Perch 7 0 0 0 0 0 0 1 125
117 Perch 7 0 0 0 0 0 0 1 130
118 Perch 7 0 0 0 0 0 0 1 120
119 Perch 7 0 0 0 0 0 0 1 120
120 Perch 7 0 0 0 0 0 0 1 130
121 Perch 7 0 0 0 0 0 0 1 135
122 Perch 7 0 0 0 0 0 0 1 110
123 Perch 7 0 0 0 0 0 0 1 130
124 Perch 7 0 0 0 0 0 0 1 150
125 Perch 7 0 0 0 0 0 0 1 145
126 Perch 7 0 0 0 0 0 0 1 150
127 Perch 7 0 0 0 0 0 0 1 170
128 Perch 7 0 0 0 0 0 0 1 225
129 Perch 7 0 0 0 0 0 0 1 145
130 Perch 7 0 0 0 0 0 0 1 188
131 Perch 7 0 0 0 0 0 0 1 180
132 Perch 7 0 0 0 0 0 0 1 197
133 Perch 7 0 0 0 0 0 0 1 218
134 Perch 7 0 0 0 0 0 0 1 300
135 Perch 7 0 0 0 0 0 0 1 260
136 Perch 7 0 0 0 0 0 0 1 265
137 Perch 7 0 0 0 0 0 0 1 250
138 Perch 7 0 0 0 0 0 0 1 250
139 Perch 7 0 0 0 0 0 0 1 300
140 Perch 7 0 0 0 0 0 0 1 320
141 Perch 7 0 0 0 0 0 0 1 514
142 Perch 7 0 0 0 0 0 0 1 556
143 Perch 7 0 0 0 0 0 0 1 840
144 Perch 7 0 0 0 0 0 0 1 685
145 Perch 7 0 0 0 0 0 0 1 700
146 Perch 7 0 0 0 0 0 0 1 700
147 Perch 7 0 0 0 0 0 0 1 690
148 Perch 7 0 0 0 0 0 0 1 900
149 Perch 7 0 0 0 0 0 0 1 650
150 Perch 7 0 0 0 0 0 0 1 820
151 Perch 7 0 0 0 0 0 0 1 850
152 Perch 7 0 0 0 0 0 0 1 900
153 Perch 7 0 0 0 0 0 0 1 1015
154 Perch 7 0 0 0 0 0 0 1 820
155 Perch 7 0 0 0 0 0 0 1 1100
156 Perch 7 0 0 0 0 0 0 1 1000
157 Perch 7 0 0 0 0 0 0 1 1100
158 Perch 7 0 0 0 0 0 0 1 1000
159 Perch 7 0 0 0 0 0 0 1 1000
Weight_g.Ln Weight_g.Ln.Model.4 Weight_g.Ln.Model.4.Exp Weight_g.Ln.Model.4.Resid Length_cm Length_cm.Ln Height_pct Height_pct.Ln
5.4889377 5.605186 271.83248 -0.116248 30 3.401197382 38.4 3.64805746
5.6698809 5.7662535 319.3391 -0.096373 31.2 3.440418095 40 3.68887945
5.8289456 5.8005995 330.49763 0.0283461 31.1 3.437207819 39.8 3.68386691
5.8944028 5.9288173 375.70991 -0.034414 33.5 3.511545439 38 3.63758616
6.0637852 6.0159082 409.89795 0.047877 34 3.526360525 36.6 3.60004824
6.1092476 6.0902811 441.54553 0.0189664 34.7 3.546739687 39.2 3.66867675
6.2146081 6.1430031 465.44926 0.071605 34.5 3.540959324 41.1 3.71600812
5.9661467 6.033734 417.2702 -0.067587 35 3.555348061 36.2 3.58905912
6.1092476 6.1215757 455.58198 -0.012328 35.1 3.55820113 39.9 3.68637632
6.2146081 6.201349 493.4142 0.0132591 36.2 3.589059119 39.3 3.67122452
6.1633148 6.2181519 501.77503 -0.054837 36.2 3.589059119 39.4 3.67376582
6.2146081 6.1923923 489.01457 0.0222158 36.2 3.589059119 39.7 3.68135119
6.2146081 6.1228709 456.17245 0.0917372 36.4 3.594568775 37.8 3.6323091
37.3 3.618993327 37.3 3.61899333
6.3969297 6.3063835 548.0593 0.0905462 37.2 3.616308761 40.2 3.693867
6.3969297 6.367287 582.47542 0.0296427 37.2 3.616308761 41.5 3.72569343
6.5510803 6.3676923 582.71156 0.183388 38.3 3.645449896 38.8 3.65842025
6.5510803 6.3719153 585.17754 0.179165 38.5 3.650658241 38.8 3.65842025
6.413459 6.4000039 601.84737 0.0134551 38.6 3.653252276 40.5 3.70130197
6.4769724 6.4119179 609.0607 0.0650544 38.7 3.6558396 37.4 3.6216707
6.35437 6.464011 641.62951 -0.109641 39.5 3.676300672 38.3 3.6454499
6.5294188 6.4671702 643.65974 0.0622486 39.2 3.668676747 40.8 3.70868208
6.4297195 6.4621272 640.42192 -0.032408 39.7 3.681351188 39.1 3.66612247
6.5220928 6.5798716 720.44682 -0.057779 40.6 3.703768067 38.1 3.64021428
6.5510803 6.5585406 705.24172 -0.00746 40.5 3.701301974 40.1 3.69137633
6.5861717 6.6234924 752.56876 -0.037321 40.9 3.711130063 40 3.68887945
6.5792512 6.6131342 744.81379 -0.033883 40.6 3.703768067 40.3 3.69635147
6.570883 6.638859 764.22254 -0.067976 41.5 3.725693427 39.8 3.68386691
6.7452363 6.6882144 802.8873 0.057022 41.6 3.728100167 40.6 3.70376807
6.9077553 6.8410304 935.45252 0.0667249 42.6 3.751854253 44.5 3.79548919
6.8243737 6.8482823 942.26101 -0.023909 44.1 3.786459782 40.9 3.71113006
6.8617113 6.8445963 938.79426 0.017115 44 3.784189634 41.1 3.71600812
6.8297937 6.9592247 1052.817 -0.129431 45.3 3.813307032 41.4 3.72328088
6.8824375 6.979217 1074.077 -0.09678 45.9 3.826465117 40.6 3.70376807
6.856462 6.936456 1029.1165 -0.079994 46.5 3.839452313 37.9 3.63495111
5.598422 5.5951844 269.12728 0.0032375 28.7 3.356897123 29.2 3.37416871
5.598422 5.6149461 274.49858 -0.016524 29.3 3.377587516 27.8 3.32503602
5.7235851 5.8073431 332.73392 -0.083758 30.8 3.42751469 28.5 3.34990409
6.2915691 6.3001443 544.65047 -0.008575 34 3.526360525 31.6 3.45315712
6.6846117 6.6423468 766.89263 0.0422649 39.6 3.678829118 29.7 3.39114705
6.9077553 6.8444004 938.61036 0.0633548 43.5 3.772760938 28.4 3.34638915
3.6888795 3.6172577 37.235316 0.0716218 16.2 2.785011242 25.6 3.24259235
4.2341065 4.3099 74.433042 -0.075793 20.3 3.010620886 26.1 3.26193531
4.3567088 4.4387457 84.668673 -0.082037 21.2 3.054001182 26.3 3.26956894
4.4659081 4.575598 97.086075 -0.10969 22.2 3.100092289 25.3 3.2308044
4.7874917 4.7043607 110.42767 0.083131 22.2 3.100092289 28 3.33220451
22.8 3.126760536 28.4 3.34638915
4.7004804 4.7458164 115.10173 -0.045336 23.1 3.139832618 26.7 3.28466357
4.7874917 4.7716415 118.11296 0.0158503 23.7 3.165475048 25.8 3.25037449
5.0106353 4.8827847 131.99773 0.1278506 24.7 3.206803244 23.5 3.15700042
4.9767337 4.9102613 135.67487 0.0664724 24.3 3.19047635 27.3 3.3068867
5.0751738 5.0620796 157.91859 0.0130942 25.3 3.230804396 27.8 3.32503602
4.9416424 4.9203438 137.04973 0.0212986 25 3.218875825 26.2 3.26575941
5.0751738 4.9754172 144.80923 0.0997566 25 3.218875825 25.6 3.24259235
5.1298987 5.2431342 189.26236 -0.113235 27.2 3.303216973 27.7 3.32143241
5.0814044 5.1239175 167.99219 -0.042513 26.7 3.284663565 25.9 3.25424297
5.2983174 5.2422594 189.09686 0.056058 26.8 3.288401888 27.6 3.31781577
5.1929569 5.2596394 192.41209 -0.066683 27.9 3.328626689 25.4 3.23474917
5.6698809 5.5654601 261.24538 0.1044208 29.2 3.374168709 30.4 3.41444261
5.6058021 5.6603294 287.24324 -0.054527 30.6 3.421000009 28 3.33220451
5.9661467 6.0358865 418.16933 -0.06974 35 3.555348061 27.1 3.29953373
4.0073332 4.0162238 55.491164 -0.008891 16.5 2.803360381 41.5 3.72569343
4.0943446 4.0853087 59.460289 0.0090359 17.4 2.856470206 37.8 3.6323091
4.4998097 4.477792 88.040065 0.0220177 19.8 2.985681938 37.4 3.6216707
4.7874917 4.7409541 114.54344 0.0465376 21.3 3.058707073 39.4 3.67376582
5.0106353 4.9355872 139.15483 0.0750481 22.4 3.109060959 39.7 3.68135119
4.9416424 4.9740655 144.61363 -0.032423 23.2 3.144152279 36.8 3.60549785
5.1357984 5.0550008 156.80466 0.0807976 23.2 3.144152279 40.5 3.70130197
4.9767337 5.1081095 165.35745 -0.131376 24.1 3.18221184 40.4 3.69882978
5.2983174 5.3522219 211.07676 -0.053905 25.8 3.250374492 40.1 3.69137633
5.6094718 5.6137759 274.17756 -0.004304 28 3.33220451 39.6 3.67882912
5.7037825 5.7063213 300.7626 -0.002539 29 3.36729583 39.2 3.66867675
1.9021075 1.7514426 5.7629105 0.1506649 10.8 2.379546134 16.1 2.77881927
2.014903 2.0196503 7.5356892 -0.004747 11.6 2.451005098 17 2.83321334
1.9459101 1.9280283 6.8759393 0.0178819 11.6 2.451005098 14.9 2.70136121
2.2721259 2.2441648 9.4325344 0.0279611 12 2.48490665 18.3 2.90690106
2.2823824 2.2295697 9.2958652 0.0528127 12.4 2.517696473 16.8 2.82137889
2.163323 2.2285882 9.2867461 -0.065265 12.6 2.533696814 15.7 2.75366071
2.3025851 2.3737105 10.737159 -0.071125 13.1 2.57261223 16.9 2.82731362
2.2925348 2.3230365 10.206619 -0.030502 13.1 2.57261223 16.9 2.82731362
2.2823824 2.3263323 10.240314 -0.04395 13.2 2.58021683 16.7 2.81540872
2.501436 2.4211804 11.259142 0.0802555 13.4 2.595254707 15.6 2.74727091
2.5952547 2.4842334 11.991924 0.1110213 13.5 2.602689685 18 2.89037176
2.501436 2.4768079 11.903208 0.024628 13.8 2.624668592 16.5 2.80336038
2.9806186 3.0699334 21.540468 -0.089315 15.2 2.721295428 18.9 2.93916192
2.9907197 3.1510408 23.360366 -0.160321 16.2 2.785011242 18.1 2.89591194
5.2983174 5.3805743 217.14696 -0.082257 34.8 3.549617387 16 2.77258872
5.7037825 5.6594369 286.98699 0.0443456 37.8 3.632309103 15.1 2.71469474
5.7037825 5.7613412 317.77425 -0.057559 38.8 3.658420247 15.3 2.72785283
5.7037825 5.8004605 330.45171 -0.096678 39.8 3.683866912 15.8 2.76000994
6.0637852 5.9977168 402.50875 0.0660684 40.5 3.701301974 18 2.89037176
5.8435444 5.8608886 351.03593 -0.017344 41 3.713572067 15.6 2.74727091
6.1224928 6.1821212 484.01758 -0.059628 45.5 3.817712326 16 2.77258872
6.2344107 6.1562171 471.64052 0.0781936 45.5 3.817712326 15 2.7080502
6.2915691 6.3283321 560.22142 -0.036763 45.8 3.824284091 17 2.83321334
6.2146081 6.3171696 554.00271 -0.102561 48 3.871201011 14.5 2.67414865
6.3403593 6.415088 610.99453 -0.074729 48.7 3.88567903 16 2.77258872
6.6463905 6.5502113 699.39191 0.0961793 51.2 3.935739532 15 2.7080502
6.856462 6.8570361 950.54555 -0.000574 55.1 4.009149716 16.2 2.78501124
7.1308988 7.1883577 1323.9271 -0.057459 59.7 4.08933202 17.9 2.88480071
7.3777589 7.1793454 1312.0491 0.1984135 64 4.158883083 15 2.7080502
7.3460102 7.1793454 1312.0491 0.1666648 64 4.158883083 15 2.7080502
7.4085306 7.4728432 1759.6025 -0.064313 68 4.219507705 15.9 2.76631911
1.7749524 1.9020093 6.6993422 -0.127057 8.8 2.174751721 24 3.17805383
3.4657359 3.3715402 29.123348 0.0941957 14.7 2.687847494 24 3.17805383
3.6888795 3.6841378 39.810783 0.0047416 16 2.772588722 23.9 3.17387846
3.9415818 3.9793078 53.480004 -0.037726 17.2 2.844909384 26.7 3.28466357
4.2484952 4.1720247 64.846617 0.0764705 18.5 2.917770732 24.8 3.21084365
4.6051702 4.3894584 80.596757 0.2157118 19.2 2.954910279 27.2 3.30321697
4.3567088 4.3733483 79.308734 -0.016639 19.4 2.965273066 26.8 3.28840189
4.3820266 4.4884227 88.980988 -0.106396 20.2 3.005682604 27.9 3.32862669
4.4426513 4.4796495 88.203756 -0.036998 20.8 3.034952987 24.7 3.20680324
4.4426513 4.4422332 84.964475 0.000418 21 3.044522438 24.2 3.18635263
4.7004804 4.7750125 118.5118 -0.074532 22.5 3.113515309 25.3 3.2308044
4.7449321 4.7624347 117.03051 -0.017503 22.5 3.113515309 26.3 3.26956894
4.8283137 4.7914014 120.47008 0.0369123 22.5 3.113515309 25.3 3.2308044
4.8675345 4.8714658 130.51208 -0.003931 22.8 3.126760536 28 3.33220451
4.7874917 4.8795055 131.56558 -0.092014 23.5 3.157000421 26 3.25809654
4.7874917 4.8449281 127.09414 -0.057436 23.5 3.157000421 24 3.17805383
4.8675345 4.8973391 133.93292 -0.029805 23.5 3.157000421 26 3.25809654
4.9052748 4.8716578 130.53714 0.033617 23.5 3.157000421 25 3.21887582
4.7004804 4.8969834 133.88529 -0.196503 23.5 3.157000421 23.5 3.15700042
4.8675345 4.9230524 137.42144 -0.055518 24 3.17805383 24.4 3.19458313
5.0106353 5.0201435 151.43303 -0.009508 24 3.17805383 28.3 3.3428618
4.9767337 4.9500533 141.18248 0.0266805 24.2 3.186352633 24.6 3.20274644
5.0106353 4.8860163 132.42498 0.124619 24.5 3.198673118 21.3 3.05870707
5.1357984 5.0582768 157.31919 0.0775216 25 3.218875825 25.1 3.22286785
5.4161004 5.1930677 180.01996 0.2230327 25.5 3.238678452 28.6 3.35340672
4.9767337 5.1191967 167.20101 -0.142463 25.5 3.238678452 25 3.21887582
5.236442 5.2500039 190.56702 -0.013562 26.2 3.265759411 25.7 3.24649099
5.1929569 5.1771156 177.17105 0.0158412 26.5 3.277144733 24.3 3.19047635
5.2832037 5.2978222 199.90098 -0.014618 27 3.295836866 24.3 3.19047635
5.3844951 5.4111113 223.88024 -0.026616 28 3.33220451 25.6 3.24259235
5.7037825 5.6676396 289.35074 0.0361429 28.7 3.356897123 29 3.36729583
5.5606816 5.4932651 243.0495 0.0674165 28.9 3.363841595 24.8 3.21084365
5.5797298 5.4826179 240.47544 0.0971119 28.9 3.363841595 24.4 3.19458313
5.5214609 5.5310749 252.41508 -0.009614 28.9 3.363841595 25.2 3.22684399
5.5214609 5.56599 261.38384 -0.044529 29.4 3.380994674 26.6 3.28091122
5.7037825 5.6408843 281.71172 0.0628982 30.1 3.404525172 25.2 3.22684399
5.768321 5.7486983 313.78194 0.0196227 31.6 3.453157121 24.1 3.18221184
6.2422233 6.1865093 486.14616 0.055714 34 3.526360525 29.5 3.38439026
6.3207683 6.3637264 580.40513 -0.042958 36.5 3.597312261 28.1 3.33576958
6.7334019 6.5829056 722.63595 0.1504963 37.3 3.618993327 30.8 3.42751469
6.5294188 6.5628265 708.27079 -0.033408 39 3.663561646 27.9 3.32862669
6.5510803 6.5032251 667.29024 0.0478552 38.3 3.645449896 27.7 3.32143241
6.5510803 6.5308613 685.9888 0.020219 39.4 3.673765816 27.5 3.314186
6.5366916 6.5185479 677.5937 0.0181437 39.3 3.671224519 26.9 3.29212629
6.8023948 6.7346516 841.05039 0.0677432 41.4 3.723280881 26.9 3.29212629
6.4769724 6.6179952 748.44308 -0.141023 41.4 3.723280881 26.9 3.29212629
6.7093043 6.7921277 890.80692 -0.082823 41.3 3.7208625 30.1 3.40452517
6.7452363 6.7915255 890.27062 -0.046289 42.3 3.744787086 28.2 3.33932198
6.8023948 6.7979645 896.0216 0.0044302 42.5 3.749504076 27.6 3.31781577
6.9226439 6.8459703 940.08504 0.0766735 42.4 3.747148362 29.2 3.37416871
6.7093043 6.7186695 827.71548 -0.009365 42.5 3.749504076 26.2 3.26575941
7.0030655 6.917726 1010.0206 0.0853394 44.6 3.797733859 28.7 3.35689712
6.9077553 6.9269125 1019.3419 -0.019157 45.2 3.811097087 26.4 3.27336401
7.0030655 6.9801852 1075.1175 0.0228803 45.5 3.817712326 27.5 3.314186
6.9077553 7.0542677 1157.7893 -0.146512 46 3.828641396 27.4 3.31054301
6.9077553 7.0356993 1136.4894 -0.127944 46.6 3.841600541 26.8 3.28840189
Width_pct Width_pct.Ln
13.4 2.59525471
13.8 2.62466859
15.1 2.71469474
13.3 2.58776404
15.1 2.71469474
14.2 2.65324196
15.3 2.72785283
13.4 2.59525471
13.8 2.62466859
13.7 2.61739583
14.1 2.6461748
13.3 2.58776404
12 2.48490665
13.6 2.61006979
13.9 2.63188884
15 2.7080502
13.8 2.62466859
13.5 2.60268969
13.3 2.58776404
14.8 2.69462718
14.1 2.6461748
13.7 2.61739583
13.3 2.58776404
15.1 2.71469474
13.8 2.62466859
14.8 2.69462718
15 2.7080502
14.1 2.6461748
14.9 2.70136121
15.5 2.74084002
14.3 2.66025954
14.3 2.66025954
14.9 2.70136121
14.7 2.68784749
13.7 2.61739583
14.8 2.69462718
14.5 2.67414865
15.2 2.72129543
19.3 2.9601051
16.6 2.8094027
15 2.7080502
14 2.63905733
13.9 2.63188884
13.7 2.61739583
14.3 2.66025954
16.1 2.77881927
14.7 2.68784749
14.7 2.68784749
13.9 2.63188884
15.2 2.72129543
14.6 2.68102153
15.1 2.71469474
13.3 2.58776404
15.2 2.72129543
14.1 2.6461748
13.6 2.61006979
15.4 2.73436751
14 2.63905733
15.4 2.73436751
15.6 2.74727091
15.3 2.72785283
14.1 2.6461748
13.3 2.58776404
13.5 2.60268969
13.7 2.61739583
14.7 2.68784749
14.2 2.65324196
14.7 2.68784749
13.1 2.57261223
14.2 2.65324196
14.8 2.69462718
14.6 2.68102153
9.7 2.27212589
10 2.30258509
9.9 2.29253476
11.5 2.44234704
10.3 2.3321439
10.2 2.32238772
9.8 2.28238239
8.9 2.18605128
8.7 2.16332303
10.4 2.34180581
9.4 2.24070969
9.1 2.20827441
13.6 2.61006979
11.6 2.4510051
9.7 2.27212589
11 2.39789527
11.3 2.42480273
10.1 2.31253542
11.3 2.42480273
9.7 2.27212589
9.5 2.2512918
9.8 2.28238239
11.2 2.41591378
10.2 2.32238772
10 2.30258509
10.5 2.35137526
11.2 2.41591378
11.7 2.45958884
9.6 2.2617631
9.6 2.2617631
11 2.39789527
16 2.77258872
13.6 2.61006979
15.2 2.72129543
15.3 2.72785283
15.9 2.76631911
17.3 2.8507065
16.1 2.77881927
15.1 2.71469474
14.6 2.68102153
13.2 2.58021683
15.8 2.76000994
14.7 2.68784749
16.3 2.79116511
15.5 2.74084002
14.5 2.67414865
15 2.7080502
15 2.7080502
15 2.7080502
17 2.83321334
15.1 2.71469474
15.1 2.71469474
15 2.7080502
14.8 2.69462718
14.9 2.70136121
14.6 2.68102153
15 2.7080502
15.9 2.76631911
13.9 2.63188884
15.7 2.75366071
14.8 2.69462718
17.9 2.88480071
15 2.7080502
15 2.7080502
15.8 2.76000994
14.3 2.66025954
15.4 2.73436751
15.1 2.71469474
17.7 2.87356464
17.5 2.86220088
20.9 3.03974916
17.6 2.8678989
17.6 2.8678989
15.9 2.76631911
16.2 2.78501124
18.1 2.89591194
14.5 2.67414865
17.8 2.87919846
16.8 2.82137889
17 2.83321334
17.6 2.8678989
15.6 2.74727091
15.4 2.73436751
16.1 2.77881927
16.3 2.79116511
17.7 2.87356464
16.3 2.79116511
Notes on analysis of the fish catch data set

See the following worksheets in the file for the analysis performed with RegressIt and for historical background of the source data.

This dataset is part of the jse.amstat.org data archive and also appears as an example in the documentation for SAS. It was originally collected in 1917 in Finland. See the "source" sheet in this file for details. The "pedagogical notes" at the bottom of that sheet list several candidates for models. Some descriptive analysis of this data can be found in the SAS manual. The model to be developed here applies log transformations to the physical measurements, which is standard in the fishery literature, while also controlling for species and shape of fish.

The objective of the analysis is to develop a multiple regression model for predicting the weight of a fish from its physical dimensions (relative height and width as well as length) and its species. There are 7 species in the sample, with different numbers of observations of each. The data set is sorted by species and then approximately by length within a species. There are missing values of weight (in rows 14 and 47 in the original data set), which are coded here as blank cells.
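
The variable preparation described for the data sheet (natural logs, species dummies, and blank weights being ignored) can be sketched in a few lines. This is only an illustration, not how RegressIt does it internally: the four rows are copied from the data sheet, and the function names `species_dummies` and `prepare` are hypothetical.

```python
import math

# A few rows copied from the data sheet: (Row, Species code, Weight_g, Length_cm).
# None stands for a blank (missing) weight cell.
rows = [
    (13, 1, 500.0, 36.4),
    (14, 1, None, 37.3),
    (15, 1, 600.0, 37.2),
    (36, 2, 270.0, 28.7),
]

N_SPECIES = 7


def species_dummies(code, n_species=N_SPECIES):
    """Return the 0/1 indicators Species.Eq.1 ... Species.Eq.n for one species code."""
    return [1 if code == k else 0 for k in range(1, n_species + 1)]


def prepare(rows):
    """Skip rows with a missing weight and add logged and dummy variables."""
    prepared = []
    for row, species, weight, length in rows:
        if weight is None:
            continue  # mimic blank-weight rows being left out of the analysis
        prepared.append({
            "Row": row,
            "Weight_g.Ln": math.log(weight),
            "Length_cm.Ln": math.log(length),
            "Species.Eq": species_dummies(species),
        })
    return prepared


data = prepare(rows)
for d in data:
    print(d["Row"], round(d["Weight_g.Ln"], 4), round(d["Length_cm.Ln"], 4), d["Species.Eq"])
```

Row 14 is dropped because its weight is blank, which leaves three usable rows in this small sample.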

Descriptive statistics analyses were run on the original variables and their natural logs, looking ahead to the regression analysis described below. In the original variable analysis (Stats 1 sheet) the scatterplots of weight versus the dimension variables are highly nonlinear, as expected. Scroll down to the bottom of the sheet to see the scatterplots.

In the descriptive analysis of the transformed variables (Stats 2 sheet), the scatterplot of log weight versus log length shows a nice straight-line pattern, except that there are several distinct lines, evidently corresponding to different species of fish. Clusters of points corresponding to different species are also nicely separated on the other plots.

The classic length-weight-relationship (LWR) model in the fishery literature is Weight = a * Length^b. For fish of the same species with "isometric growth" (i.e., same relative shape at all stages of development), it might be expected that b would be close to 3 because weight should be proportional to volume and volume should be proportional to the cube of length. Taking the log of both sides, we get
Model 1: log Weight = a + b * log Length

which can be fitted by linear regression after applying a log transformation to the variables. See the Model 1 sheet for the results of fitting this model. The plots of actual and predicted values versus observation number and the residual plots do not look very good: the weights of some species are systematically overpredicted or underpredicted, particularly species 5 and 6. The estimate of b is 3.172 with a standard error of 0.061. This is not far from 3, but the difference of about 3 standard errors is statistically significant.
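
As a rough illustration of what fitting Model 1 involves, the sketch below runs ordinary least squares on six Perch rows copied from the data sheet (columns Length_cm.Ln and Weight_g.Ln). It is a hand-rolled fit on a tiny single-species sample, so it will not reproduce the estimate of 3.172 obtained from the full data set. The helper name `fit_line` is hypothetical.

```python
# Six Perch rows copied from the data sheet: (Length_cm.Ln, Weight_g.Ln).
perch_sample = [
    (2.174751721, 1.7749524),  # row 104
    (3.113515309, 4.7004804),  # row 114
    (3.17805383, 5.0106353),   # row 124
    (3.356897123, 5.7037825),  # row 134
    (3.663561646, 6.5294188),  # row 144
    (3.841600541, 6.9077553),  # row 159
]


def fit_line(points):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(perch_sample)
print(f"log Weight = {a:.3f} + {b:.3f} * log Length")
```

Even on this small sample the slope comes out in the neighborhood of 3, in line with the cube-law reasoning above.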

The problems of Model 1, together with the parallel straight lines seen in the scatterplot of log Weight_g vs. log Length_cm, suggest including dummy variables for species, yielding:

Model 2: log Weight = a + b * log Length + c(1) * S(1) + … + c(6) * S(6)


where S(i) is a dummy variable for species i and c(i) is its coefficient. (It is necessary to leave one of the dummies out, and it doesn't matter which one. Here the dummy for species 7 has been left out.) This model allows the constant of proportionality between weight and length^b to differ across species, perhaps due to differences in relative shape. See the Model 2 worksheet for the results of fitting this model. The standard error of the regression has been reduced from 0.310 to 0.103 in natural log units. This means the standard deviation of the model's real errors in percentage terms is around 10%. The residual plots show that the errors are roughly the same size for all species and for large and small values of the predictions (in log units). The error distribution is almost perfectly normal as indicated by the A-D stat and normal probability plot. The coefficient of log length is 3.155 with a standard error of 0.035. This again is close to 3 but the difference is still statistically significant: more than 4 standard errors.
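
The reading of 0.103 natural log units as "around 10%" rests on the fact that exp(x) - 1 is close to x when x is small. A minimal check of that approximation follows. The function name `log_error_to_percent` is made up for this sketch.

```python
import math


def log_error_to_percent(log_error):
    """Convert an error of +/- log_error (natural log units) into percentage errors."""
    over = (math.exp(log_error) - 1.0) * 100.0    # prediction too low by log_error
    under = (math.exp(-log_error) - 1.0) * 100.0  # prediction too high by log_error
    return over, under


over, under = log_error_to_percent(0.103)
print(f"+0.103 log units -> {over:+.1f}%")
print(f"-0.103 log units -> {under:+.1f}%")
```

The two percentages are not exactly symmetric, but both are close to 10 in absolute value, which is why the log-unit figure can be quoted as a percentage.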

Alternatively, we might try to address the differences in length-weight ratios across species and/or across growth stages by including height and width variables, which directly measure the relative shape of the fish. This allows for the possibility that the relative shape of a fish may change with its length, so-called "allometric growth." The file also contains variables Height_pct and Width_pct which are measures of the relative height and width of the fish expressed as a percentage of Length. For example, a value of 41 for Height_pct means that height is 41% of length. A log transformation has been applied to these variables too. Adding them to the variables in Model 1 yields:

Model 3: log Weight_g = a + b * log Length_cm + d * log Height_pct + e * log Width_pct

[Note: it is not obvious a priori whether a log transformation should be applied to Height_pct and Width_pct, insofar as they are already dimensionless ratios. Taking their logs is equivalent to assuming that, other things being equal, weight varies as a power function of the relative height and the relative width. Without the log transformation, the assumption is that weight varies as an exponential function of relative height and width. In fact, it makes little difference whether these variables are logged or not in this model, because the log function is fairly straight over the range of variation of the values.]
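
The two assumptions contrasted in the note can be written out explicitly. Here W is Weight_g, H is Height_pct with coefficient d as in Model 3, and k is simply shorthand for all the other terms held fixed (the symbol k is introduced for this illustration and does not appear in the workbook):

```latex
\begin{align*}
\text{logged:}\quad & \log W = k + d \log H \;\Longrightarrow\; W = e^{k} H^{d}
  && \text{(power function of } H\text{)} \\
\text{not logged:}\quad & \log W = k + d\,H \;\Longrightarrow\; W = e^{k} e^{dH}
  && \text{(exponential function of } H\text{)}
\end{align*}
```

The same pair of forms applies to Width_pct with coefficient e.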

The error statistics of Model 3 are almost exactly the same as those of Model 2: the standard errors of the regression are 0.100 and 0.101, respectively. The plots of predictions and errors also look almost exactly the same. The coefficient of log length is not significantly different from 3 in the model. Its value is 3.017 with a standard error of 0.020. Hence the "cube law" applies nicely when controlling for relative width and relative height.

Now let's consider including *both* the species dummies and the relative height and width measures. This will allow the model to separately distinguish inter-species effects and growth-stage effects, if any.

Model 4: log Weight = a + b * log Length_cm + c(1) * S(1) + … + c(6) * S(6) + d * log Height_pct + e * log Width_pct

When fitting this model, which will be our final one, let's also choose the option to save residuals and predictions back to the data sheet. This will allow the predictions to be unlogged (exponentiated) to get the model's predictions in real units of grams of weight.

Model 4 is a notable further improvement, yielding a standard error of 0.081, corresponding to a standard deviation of around 8% in percentage terms. The mean absolute error of this model is 0.063 in natural log units, which says that the mean absolute percentage error of the model is roughly 6.3% in the original units. Also, the coefficient of log Length_cm is 3.031 with a standard error of 0.031, not significantly different from 3 as in the previous model.
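
The "how many standard errors from 3" comparisons made for the four models can be collected in one place. The sketch below only redoes the arithmetic on the coefficients and standard errors quoted in these notes. The names `estimates` and `std_errors_from` are hypothetical.

```python
# (coefficient of log length, standard error) as quoted in the notes.
estimates = {
    "Model 1": (3.172, 0.061),
    "Model 2": (3.155, 0.035),
    "Model 3": (3.017, 0.020),
    "Model 4": (3.031, 0.031),
}

CUBE_LAW = 3.0  # value of b expected under isometric growth


def std_errors_from(target, coef, se):
    """Distance of a coefficient estimate from a target value, in standard errors."""
    return (coef - target) / se


distances = {
    name: std_errors_from(CUBE_LAW, coef, se)
    for name, (coef, se) in estimates.items()
}

for name, z in distances.items():
    print(f"{name}: {z:.2f} standard errors from 3")
```

Models 1 and 2 sit roughly 3 and more than 4 standard errors away from 3, while Models 3 and 4 are within about one standard error, matching the conclusions stated in the text.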

Overall Model 4 looks very good in terms of its goodness of fit, its satisfaction of the assumptions of linear regression, and the
physical interpretation of its equation.

The predictions of Model 4 have been automatically saved to the data sheet in the form of a new variable which is called Weight_g.Ln.Model.4 by default. We can convert these predictions back to real units by using the variable transformation tool to apply an exponential function to them, yielding a variable called Weight_g.Ln.Model.4.Exp. As a quick-and-dirty way to get the error stats in real terms, we can just run a simple regression of Weight_g on Weight_g.Ln.Model.4.Exp. This is done in Model 5:

Model 5: Weight_g = a + b * Weight_g.Ln.Model.4.Exp

The slope coefficient in this model is 1.001 (almost exactly equal to 1) and the constant is 0.890 (less than one gram) which is insignificant in real terms. Therefore the errors of this model are effectively just the weights minus the predicted weights of Model 4 in real terms. The standard error of the regression (the standard deviation of the real errors) is 49.096 grams, the mean absolute error is 26.967 grams, and the mean absolute percentage error is 6.7%. The latter number is close to the estimate of 6.3% obtained by computing the mean absolute error in log units in Model 4.
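
The unlogging step and the error measures in real units can be illustrated on the first five rows of the data sheet. This sketch skips the Model 5 regression and computes actual minus unlogged prediction directly, which the text says is effectively the same thing. With only five Bream rows, the resulting numbers are not the full-sample statistics quoted above.

```python
import math

# First five rows of the data sheet: (Weight_g, Weight_g.Ln.Model.4).
rows = [
    (242.0, 5.605186),
    (290.0, 5.7662535),
    (340.0, 5.8005995),
    (363.0, 5.9288173),
    (430.0, 6.0159082),
]

# Exponentiate the log-unit predictions to get predictions in grams.
unlogged = [math.exp(log_pred) for _, log_pred in rows]

# Errors in grams: actual weight minus unlogged prediction.
errors = [weight - pred for (weight, _), pred in zip(rows, unlogged)]

mae = sum(abs(e) for e in errors) / len(errors)
mape = 100.0 * sum(abs(e) / weight for e, (weight, _) in zip(errors, rows)) / len(rows)

for (weight, _), pred, err in zip(rows, unlogged, errors):
    print(f"actual {weight:7.1f} g   predicted {pred:8.2f} g   error {err:+7.2f} g")
print(f"mean absolute error: {mae:.2f} g, mean absolute percentage error: {mape:.1f}%")
```

The unlogged values agree with the Weight_g.Ln.Model.4.Exp column of the data sheet (for example, about 271.83 grams for row 1).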

Of course, the plots of the errors of Model 5 are problematic. Much larger errors are made for weight predictions of larger fish as seen in the residual versus predicted plot. The error distribution is also highly non-normal as seen in the histogram and quantile plots. This is to be expected when the predictions of a linear model fitted to a logged variable are converted back to real units.
 
 
 
 
File contents as viewed with the "History" tool on the RegressIt menu:

Descriptive Statistics Stats 1

Variable     # Fitted     Mean    Median   Std.Dev.   Root.M.Sqr.   Std.Err.Mean   Minimum
Weight_g        157     401.235   273.000   358.809     537.507        28.636        5.900
Height_pct      157      28.255    26.900     8.323      29.448         0.664       14.500
Length_cm       157      31.242    29.400    11.655      33.332         0.930        8.800
Row             157      80.631    81.000    45.955      92.735         3.668        1.000
Species         157       4.529     5.000     2.390       5.117         0.191        1.000
Width_pct       157      14.116    14.600     2.288      14.299         0.183        8.700

Series Plots

[Series plots of each variable versus row number:
  Weight_g (n = 157, mean = 401.235)
  Height_pct (n = 157, mean = 28.255)
  Length_cm (n = 157, mean = 31.242)
  Row (n = 157, mean = 80.631)
  Species (n = 157, mean = 4.529)
  Width_pct (n = 157, mean = 14.116)]

Histogram Plots

[Histograms of each variable:
  Weight_g (n=157, mean=401.235): Min = 5.900, Midpoint = 827.950, Max = 1,650
  Height_pct (n=157, mean=28.255): Min = 14.500, Midpoint = 29.500, Max = 44.5
  Length_cm (n=157, mean=31.242): Min = 8.800, Midpoint = 38.400, Max = 68.0
  Row (n=157, mean=80.631): Min = 1.000, Midpoint = 80.000, Max = 159
  Species (n=157, mean=4.529): Min = 1.000, Midpoint = 4.000, Max = 7.00
  Width_pct (n=157, mean=14.116): Min = 8.700, Midpoint = 14.800, Max = 20.9]

Correlation Matrix (n=157)

Variable      Weight_g   Height_pct   Length_cm     Row    Species   Width_pct
Weight_g        1.000
Height_pct      0.194       1.000
Length_cm       0.925       0.133       1.000
Row             0.035      -0.545      -0.004      1.000
Species        -0.133      -0.662      -0.136      0.960    1.000
Width_pct       0.136       0.456       0.038      0.242    0.089      1.000

Scatterplots

[Scatterplots of Weight_g versus each other variable:
  Weight_g vs. Height_pct: r = 0.194, r-squared = 0.038
  Weight_g vs. Length_cm: r = 0.925, r-squared = 0.855
  Weight_g vs. Row: r = 0.035, r-squared = 0.001
  Weight_g vs. Species: r = -0.133, r-squared = 0.018
  Weight_g vs. Width_pct: r = 0.136, r-squared = 0.018]

End of Output
Descriptive Statistics, Maximum column:

Variable      Maximum
Weight_g        1,650
Height_pct     44.500
Length_cm      68.000
Row           159.000
Species         7.000
Width_pct      20.900

Descriptive analysis of original variables, with scatterplots of Weight_g versus the other variables. (Scroll to the bottom of the sheet to see these.) The series plots show the grouping of fish by species in the ordering of rows in the file. The scatterplots show a highly nonlinear relationship between Weight_g and Length_cm, suggesting a need for a nonlinear data transformation. Also, there is a lack of detail at the bottom of all the scatterplots: many points are "clumped" along the bottom edges, making it hard to judge their variations in percentage terms. A log transformation is traditional for this kind of data in the fishery literature.
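To illustrate what the log transformation does to the weight-length relationship, here is a minimal Python sketch. It uses only the Length3 and Weight values of the first ten bream rows listed on the source sheet, so it will not reproduce the full-sample correlations reported above; the helper name pearson_r is an assumption made for this example and is not part of RegressIt.

```python
import math

# Length3 (cm) and Weight (g) for observations 1-10 (bream) from the source data.
length_cm = [30.0, 31.2, 31.1, 33.5, 34.0, 34.7, 34.5, 35.0, 35.1, 36.2]
weight_g = [242, 290, 340, 363, 430, 450, 500, 390, 450, 500]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Correlation in original units and after taking natural logs of both variables.
r_raw = pearson_r(length_cm, weight_g)
r_log = pearson_r([math.log(x) for x in length_cm],
                  [math.log(y) for y in weight_g])

print(f"r (raw units) = {r_raw:.3f}")
print(f"r (log units) = {r_log:.3f}")
```

Within a single species and a narrow size range the two correlations are expected to be similar; the benefit of logging shows up mainly when fish of very different sizes and species are pooled, as in the full data set.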
2/5/20 9:38 AM + FACDS414 + fish_catch_data.xlsx + fishdata + RegressItPC 2020.01.20
Descriptive Statistics Stats 2

Series Plots

[Series plots of the logged variables (Weight_g.Ln, Height_pct.Ln, Length_cm.Ln, Width_pct.Ln) and of Row and Species versus row number.]

Histogram Plots

Correlation Matrix (n=157)

Scatterplots

[Scatterplots of Weight_g.Ln versus each other variable:
  Weight_g.Ln vs. Height_pct.Ln: r = 0.396, r-squared = 0.157
  Weight_g.Ln vs. Length_cm.Ln: r = 0.973, r-squared = 0.946
  Weight_g.Ln vs. Row: r = -0.032, r-squared = 0.001
  Weight_g.Ln vs. Species: r = -0.203, r-squared = 0.041
  Weight_g.Ln vs. Width_pct.Ln: r = 0.366, r-squared = 0.134]

End of Output
Descriptive analysis of logged variables. The scatterplot of Weight_g.Ln versus Length_cm.Ln shows a very linear pattern, with several distinct lines having the same slope. These evidently correspond to different species. Also note that the clumps of points at the bottom edges of the scatterplots of weight versus the other variables have been nicely separated into distinct clusters.

Click the plus signs in the left sidebar to unhide the other tables and plots.
2/5/20 9:40 AM + FACDS414 + fish_catch_stats1.xlsx + fishdata + RegressItPC 2020.01.20
Model: Model 1
Dependent Variable: Weight_g.Ln

R-Squared   Adj.R-Sqr.   Std.Err.Reg.   Std.Dep.Var.   # Fitted   # Missing   Critical t
  0.946       0.946         0.310          1.330         157          2         1.975

Variable       Coefficient   Std.Err.   t-Statistic   P-value   Lower95%   Upper95%    VIF
Constant          -5.269      0.206      -25.557       0.000     -5.676     -4.862    0.000
Length_cm.Ln       3.172      0.061       52.160       0.000      3.052      3.293    1.000

Length_cm.Ln   StdErrMean   StdErrFcst   Predicted   Lower 95%   Upper 95%
   2.175         0.077        0.319        1.630       1.000       2.260
   2.686         0.048        0.313        3.252       2.632       3.871
   3.197         0.027        0.311        4.873       4.259       5.488
   3.708         0.032        0.311        6.495       5.880       7.110
   4.220         0.058        0.315        8.117       7.494       8.739

[Line fit plot: Model 1 for Weight_g.Ln (1 variable, n=157). Predicted Weight_g.Ln = -5.269 + 3.172*Length_cm.Ln. Series shown: Actual, Predicted, Lower 95%, Upper 95%. Horizontal axis: Length_cm.Ln.]

                 Mean Error   RMSE    MAE    Minimum   Maximum   MAPE   A-D* stat
Fitted (n=157)     0.000      0.308   0.235   -0.797    0.522    5.3%   10.88 (P=0.000)

[Plots for Model 1 for Weight_g.Ln (1 variable, n=157): Actual and Predicted -vs- Observation #; Residual -vs- Observation #; Residual -vs- Predicted; Histogram of Residuals (Actual vs. Normal); Normal Quantile Plot.]
Normality test (A-D*): P < 0.001

End of Output
R code: Model.1 <- lm(Weight_g.Ln ~ Length_cm.Ln, data = fishdata)

Confidence level: 95.0%
Std. Coeff.: Constant 0.000, Length_cm.Ln 0.973

The classic length-weight-relationship (LWR) model in the fishery literature is Weight = a * Length^b. For fish of the same species with isometric growth (i.e., same relative shape at all stages of development), it might be expected that b would be close to 3 because weight should be proportional to volume and volume should be proportional to the cube of length. Taking the log of both sides, we get

Model 1: log Weight_g = a + b * log Length_cm

(where the constant a now stands for the log of the original constant of proportionality), which can be fitted by linear regression after applying a log transformation to the variables.

The plots of actual and predicted values versus observation number and the residual plots do not look very good: the weights of some species are systematically overpredicted or underpredicted, particularly species 5 and 6. The estimate of b is 3.172 with a standard error of 0.061. This is not far from 3, but the difference is statistically significant: (3.172 - 3)/0.061 is about 2.8, i.e., a difference of roughly 3 standard errors.
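The equivalence between the log-linear form of Model 1 and the original power law can be checked numerically. The following Python sketch plugs in the coefficients reported for Model 1; the function names and the 30 cm example length are assumptions chosen for illustration.

```python
import math

# Coefficients reported for Model 1 (natural-log units).
A_LOG = -5.269  # intercept: log of the constant of proportionality
B = 3.172       # exponent on length


def predict_weight_from_log_model(length_cm):
    """Unlog the fitted line: exp(a + b * ln(Length))."""
    return math.exp(A_LOG + B * math.log(length_cm))


def predict_weight_power_law(length_cm):
    """The same prediction written as Weight = exp(a) * Length**b."""
    return math.exp(A_LOG) * length_cm ** B


length = 30.0  # hypothetical fish length in cm
w_log_form = predict_weight_from_log_model(length)
w_power_form = predict_weight_power_law(length)
print(f"log-linear form: {w_log_form:.1f} g")
print(f"power-law form:  {w_power_form:.1f} g")
```

Both functions return the same number up to floating-point rounding, which is the point: fitting a straight line in log units is the same as fitting the power law in real units.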
2/5/20 9:43 AM + FACDS414 + fish_catch_stats2.xlsx + fishdata + RegressItPC 2020.01.20
Model: Model 2
Dependent Variable: Weight_g.Ln

R-Squared   Adj.R-Sqr.   Std.Err.Reg.   Std.Dep.Var.   # Fitted   # Missing   Critical t
  0.994       0.994         0.103          1.330         157          2         1.976

Variable       Coefficient   Std.Err.   t-Statistic   P-value   Lower95%    Upper95%     VIF
Constant          -5.054      0.117      -43.151       0.000     -5.286     -4.823      0.000
Length_cm.Ln       3.155      0.035       90.388       0.000      3.086      3.224      2.992
Species.Eq.1      -0.055      0.025       -2.221       0.028     -0.104     -0.006077   1.554
Species.Eq.2       0.071      0.045        1.593       0.113     -0.017      0.159      1.089
Species.Eq.3      -0.122      0.028       -4.427       0.000     -0.177     -0.068      1.206
Species.Eq.4       0.149      0.035        4.301       0.000      0.081      0.218      1.170
Species.Eq.5      -0.671      0.041      -16.453       0.000     -0.752     -0.591      2.010
Species.Eq.6      -0.775      0.034      -22.787       0.000     -0.843     -0.708      1.661

                 Mean Error   RMSE    MAE    Minimum   Maximum   MAPE   A-D* stat
Fitted (n=157)     0.000      0.100   0.079   -0.217    0.369    1.6%   0.68 (P=0.078)

[Plots for Model 2 for Weight_g.Ln (7 variables, n=157): Actual and Predicted -vs- Observation #; Residual -vs- Observation #; Residual -vs- Predicted; Histogram of Residuals (Actual vs. Normal); Normal Quantile Plot.]
Normality test (A-D*): P > 0.05

End of Output
R code: Model.2 <- lm(Weight_g.Ln ~ Length_cm.Ln + Species.Eq.1 + Species.Eq.2 + Species.Eq.3 + Species.Eq.4 + Species.Eq.5 + Species.Eq.6, data = fishdata)

Confidence level: 95.0%
Std. Coeff.: Constant 0.000, Length_cm.Ln 0.967, Species.Eq.1 -0.017, Species.Eq.2 0.010, Species.Eq.3 -0.030, Species.Eq.4 0.029, Species.Eq.5 -0.144, Species.Eq.6 -0.182

A natural way to address the problems of Model 1 and the parallel straight lines seen in the scatterplot of log Weight_g vs. log Length_cm is to include dummy variables for species, yielding:

Model 2: log Weight_g = a + b * log Length_cm + c(1) * S(1) + … + c(6) * S(6)

…where S(i) is a dummy variable for species i and c(i) is its coefficient. This model allows the constant of proportionality between weight and length^b to differ across species, perhaps due to differences in density or relative shape.

The standard error of the regression has been reduced from 0.310 to 0.103 in natural log units. This means the standard deviation of the model's real errors in percentage terms is around 10%. The residual plots show that the errors are roughly the same size for all species and for large and small values of the predictions (in log units). The error distribution is close to normal, as indicated by the A-D stat (P = 0.078) and the normal probability plot.

The coefficient of log length is 3.155 with a standard error of 0.035. This again is close to 3 but the difference is still statistically significant: more than 4 standard errors.
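Because Model 2 is fitted in natural log units, its coefficients and error statistics translate into multiplicative (percentage) terms by exponentiation. The Python sketch below shows that arithmetic using the reported Model 2 numbers. It assumes species 7 is the baseline category, since only the dummies for species 1 to 6 appear in the model; the variable names are illustrative.

```python
import math

# Species dummy coefficients reported for Model 2 (log units).
# Species 7 is assumed to be the omitted baseline, so its effect is 0.
species_coef = {1: -0.055, 2: 0.071, 3: -0.122, 4: 0.149,
                5: -0.671, 6: -0.775, 7: 0.0}

# exp(c(i)) is the factor by which the predicted weight of species i differs
# from a baseline fish of the same length.
multipliers = {sp: math.exp(c) for sp, c in species_coef.items()}

# Standard error of the regression in log units, converted to the
# percentage size of a one-standard-error miss in each direction.
se_log = 0.103
pct_over = math.exp(se_log) - 1.0    # actual above prediction
pct_under = 1.0 - math.exp(-se_log)  # actual below prediction

for sp in sorted(multipliers):
    print(f"species {sp}: weight multiplier {multipliers[sp]:.3f}")
print(f"+1 std. error = +{100 * pct_over:.1f}%, "
      f"-1 std. error = -{100 * pct_under:.1f}%")
```

The two percentages bracket the "around 10%" quoted above; for small log errors the log value itself is a good approximation to the percentage error, and the approximation degrades as the log error grows.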
2/5/20 9:44 AM + FACDS414 + fish_catch_model1.xlsx + fishdata + RegressItPC 2020.01.20
Model: Model 3
Dependent Variable: Weight_g.Ln

R-Squared   Adj.R-Sqr.   Std.Err.Reg.   Std.Dep.Var.   # Fitted   # Missing   Critical t
  0.995       0.994         0.100          1.330         157          2         1.976

Variable        Coefficient   Std.Err.   t-Statistic   P-value   Lower95%   Upper95%    VIF
Constant           -9.143      0.129      -70.698       0.000     -9.398     -8.887    0.000
Height_pct.Ln       0.445      0.033       13.603       0.000      0.380      0.509    1.631
Length_cm.Ln        3.017      0.020      150.412       0.000      2.977      3.057    1.053
Width_pct.Ln        1.113      0.058       19.256       0.000      0.999      1.227    1.598

                 Mean Error   RMSE    MAE    Minimum   Maximum   MAPE   A-D* stat
Fitted (n=157)     0.000      0.098   0.077   -0.299    0.312    1.6%   0.24 (P=0.763)

[Plots for Model 3 for Weight_g.Ln (3 variables, n=157): Actual and Predicted -vs- Observation #; Residual -vs- Observation #; Residual -vs- Predicted; Histogram of Residuals (Actual vs. Normal); Normal Quantile Plot.]
Normality test (A-D*): P > 0.05
End of Output
R code: Model.3 <- lm(Weight_g.Ln ~ Height_pct.Ln + Length_cm.Ln + Width_pct.Ln, data = fishdata)

Confidence level: 95.0%
Std. Coeff.: Constant 0.000, Height_pct.Ln 0.104, Length_cm.Ln 0.925, Width_pct.Ln 0.146

Alternatively, we might try to address the differences in length-weight ratios across species and/or across growth stages by including the height and width variables, which directly measure the relative shape of the fish. A log transformation has been applied to these variables too. Adding them to the variables in Model 1 yields:

Model 3: log Weight_g = a + b * log Length_cm + d * log Height_pct + e * log Width_pct

The error statistics of Model 3 are almost exactly the same as those of Model 2: the standard errors of the regression are 0.100 and 0.103, respectively. The plots of predictions and errors also look almost exactly the same.

The coefficient of log length is not significantly different from 3 in this model. Its value is 3.017 with a standard error of 0.020. Hence the "cube law" applies nicely when controlling for relative width and relative height.
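As a worked example of how the Model 3 equation is used, the Python sketch below computes a prediction for observation 1 of the source data (Length3 = 30 cm, Height = 38.4%, Width = 13.4%, actual weight 242 g) from the reported coefficients. The function name is an assumption made for this illustration.

```python
import math

# Coefficients reported for Model 3 (natural-log units).
CONSTANT = -9.143
B_LENGTH = 3.017
D_HEIGHT = 0.445
E_WIDTH = 1.113


def predict_log_weight(length_cm, height_pct, width_pct):
    """Model 3 prediction of log Weight_g from the logged dimensions."""
    return (CONSTANT
            + B_LENGTH * math.log(length_cm)
            + D_HEIGHT * math.log(height_pct)
            + E_WIDTH * math.log(width_pct))


# Observation 1 (a bream) from the source data sheet.
log_pred = predict_log_weight(30.0, 38.4, 13.4)
pred_g = math.exp(log_pred)          # back to grams
log_residual = math.log(242.0) - log_pred

print(f"predicted log weight = {log_pred:.3f}")
print(f"predicted weight     = {pred_g:.1f} g (actual 242 g)")
print(f"residual in log units = {log_residual:.3f}")
```

The residual in log units can be read roughly as a fractional error, which is why the regression's standard error of 0.100 corresponds to errors of about 10% in real units.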
2/5/20 9:44 AM + FACDS414 + fish_catch_model2.xlsx + fishdata + RegressItPC 2020.01.20
Model: Model 4
Dependent Variable: Weight_g.Ln

R-Squared   Adj.R-Sqr.   Std.Err.Reg.   Std.Dep.Var.   # Fitted   # Missing   Critical t
  0.996       0.996         0.081          1.330         157          2         1.976

Variable        Coefficient   Std.Err.   t-Statistic   P-value   Lower95%   Upper95%     VIF
Constant           -8.228      0.372      -22.143       0.000     -8.963     -7.494     0.000
Height_pct.Ln       0.655      0.138        4.755       0.000      0.383      0.927    43.283
Length_cm.Ln        3.031      0.031       98.765       0.000      2.970      3.091     3.687
Species.Eq.1       -0.228      0.064       -3.555       0.001     -0.355     -0.101    16.575
Species.Eq.2        0.023      0.038        0.612       0.542     -0.052      0.098     1.241
Species.Eq.3       -0.106      0.024       -4.461       0.000     -0.153     -0.059     1.431
Species.Eq.4       -0.083      0.070       -1.191       0.236     -0.221      0.055     7.497
Species.Eq.5       -0.247      0.056       -4.382       0.000     -0.358     -0.135     6.094
Species.Eq.6       -0.159      0.074       -2.149       0.033     -0.306     -0.013    12.600
Width_pct.Ln        0.526      0.105        5.013       0.000      0.319      0.733     7.882

                 Mean Error   RMSE    MAE    Minimum   Maximum   MAPE   A-D* stat
Fitted (n=157)     0.000      0.079   0.063   -0.197    0.223    1.3%   0.32 (P=0.541)

[Plots for Model 4 for Weight_g.Ln (9 variables, n=157): Actual and Predicted -vs- Observation #; Residual -vs- Observation #; Residual -vs- Predicted; Histogram of Residuals (Actual vs. Normal); Normal Quantile Plot.]
Normality test (A-D*): P > 0.05

End of Output
R code: Model.4 <- lm(Weight_g.Ln ~ Height_pct.Ln + Length_cm.Ln + Species.Eq.1 + Species.Eq.2 + Species.Eq.3 + Species.Eq.4 + Species.Eq.5 + Species.Eq.6 + Width_pct.Ln, data = fishdata)

Confidence level: 95.0%
Std. Coeff.: Constant 0.000, Height_pct.Ln 0.153, Length_cm.Ln 0.929, Species.Eq.1 -0.071, Species.Eq.2 0.003338, Species.Eq.3 -0.026, Species.Eq.4 -0.016, Species.Eq.5 -0.053, Species.Eq.6 -0.037, Width_pct.Ln 0.069

Now let's consider including *both* the species dummies and the relative height and width measures. This will allow the model to separately distinguish inter-species effects and growth-stage effects, if any.

Model 4: log Weight_g = a + b * log Length_cm + c(1) * S(1) + … + c(6) * S(6) + d * log Height_pct + e * log Width_pct

When fitting this model, which will be our final one, let's also choose the option to save residuals and predictions back to the data sheet. This will allow the predictions to be unlogged (exponentiated) to get the model's predictions in real units of grams of weight. (Note that the box for saving residuals and predictions to the data sheet has been checked here.)

Model 4 is a notable further improvement, yielding a standard error of 0.081, corresponding to a standard deviation of around 8% in percentage terms. The mean absolute error of this model is 0.063 in natural log units, which says that the mean absolute percentage error of the model is roughly 6.3% in the original units. Also, the coefficient of log Length_cm is 3.031 with a standard error of 0.031, not significantly different from 3 as in the previous model.

Overall Model 4 looks very good in terms of its goodness of fit, its satisfaction of the assumptions of linear regression, and the physical interpretation of its equation.
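The comparison of the length exponent b with the "cube law" value of 3 can be made explicit for all four log models. This Python sketch computes the number of standard errors separating each estimate from 3 and compares it with the critical t reported on each model sheet; the function and dictionary names are assumptions for illustration.

```python
# (estimate of b, its standard error, critical t) as reported on each model sheet.
models = {
    "Model 1": (3.172, 0.061, 1.975),
    "Model 2": (3.155, 0.035, 1.976),
    "Model 3": (3.017, 0.020, 1.976),
    "Model 4": (3.031, 0.031, 1.976),
}


def t_versus_cube_law(b_hat, std_err, hypothesized=3.0):
    """t-statistic for the null hypothesis b = hypothesized."""
    return (b_hat - hypothesized) / std_err


results = {}
for name, (b_hat, std_err, critical_t) in models.items():
    t_stat = t_versus_cube_law(b_hat, std_err)
    # Significant at the 95% level if |t| exceeds the critical value.
    results[name] = (t_stat, abs(t_stat) > critical_t)

for name, (t_stat, significant) in results.items():
    verdict = "differs from 3" if significant else "consistent with 3"
    print(f"{name}: t = {t_stat:.2f} ({verdict})")
```

This matches the narrative: the exponent differs significantly from 3 in Models 1 and 2 but not once relative height and width are controlled for in Models 3 and 4.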
2/5/20 9:47 AM + FACDS414 + fish_catch_model3.xlsx + fishdata + RegressItPC 2020.01.20
Model: Model 5
Dependent Variable: Weight_g

R-Squared   Adj.R-Sqr.   Std.Err.Reg.   Std.Dep.Var.   # Fitted   # Missing
  0.981       0.981        49.096         358.809        157          2

Variable                   Coefficient   Std.Err.   t-Statistic   P-value   Lower95%   Upper95%
Constant                      0.890       5.912        0.150       0.881    -10.789     12.568
Weight_g.Ln.Model.4.Exp       1.001       0.011       90.428       0.000      0.979      1.023

Weight_g.Ln.Model.4.Exp   StdErrMean   StdErrFcst   Predicted    Lower 95%    Upper 95%
        5.763                5.865       49.445        6.660      -91.013      104.333
      444.223                3.949       49.255      445.661      348.364      542.958
      882.683                6.628       49.541      884.663      786.799      982.526
    1,321.143               10.927       50.297    1,323.664    1,224.308    1,423.021
    1,759.603               15.557       51.502    1,762.666    1,660.930    1,864.402

[Line fit plot: Model 5 for Weight_g (1 variable, n=157). Predicted Weight_g = 0.890 + 1.001*Weight_g.Ln.Model.4.Exp. Series shown: Actual, Predicted, Lower 95%, Upper 95%. Horizontal axis: Weight_g.Ln.Model.4.Exp.]

                 Mean Error   RMSE     MAE      Minimum    Maximum   MAPE
Fitted (n=157)     0.000     48.782   26.967   -160.109    285.440   6.7%

[Plots for Model 5 for Weight_g (1 variable, n=157): Actual and Predicted -vs- Observation #; Residual -vs- Observation #; Residual -vs- Predicted; Histogram of Residuals (Actual vs. Normal); Normal Quantile Plot.]
Normality test (A-D*): P < 0.001

End of Output
R code: Model.5 <- lm(Weight_g ~ Weight_g.Ln.Model.4.Exp, data = fishdata)

Critical t: 1.975   Confidence level: 95.0%
VIF: Constant 0.000, Weight_g.Ln.Model.4.Exp 1.000
Std. Coeff.: Constant 0.000, Weight_g.Ln.Model.4.Exp 0.991
A-D* stat: 9.96 (P=0.000)

The predictions of Model 4 were automatically saved to the data sheet in the form of a new variable called Weight_g.Ln.Model.4 by default. We can convert these back to real units by using the variable transformation tool to apply an exponential function to them, yielding a variable called Weight_g.Ln.Model.4.Exp. As a quick-and-dirty way to get the error stats in real terms, we can just run a simple regression of Weight_g on Weight_g.Ln.Model.4.Exp. This is done in Model 5:

Model 5: Weight_g = a + b * Weight_g.Ln.Model.4.Exp

The slope coefficient in this model is 1.001 (almost exactly equal to 1) and the constant is 0.890 (less than one gram), which is insignificant in real terms. Therefore the errors of this model are effectively just the weights minus the predicted weights of Model 4 in real terms. Of course R-squared is very close to 1 (0.981) but the important statistics are the error statistics. The standard error of the regression (the standard deviation of the real errors) is 49.096 grams, the mean absolute error is 26.967 grams, and the mean absolute percentage error is 6.7%. The latter number is very close to the estimate of 6.3% obtained by computing the mean absolute error in log units in Model 4.

Of course, the plots of the errors of Model 5 are problematic. Much larger errors are made for weight predictions of larger fish as seen in the residual versus predicted plot. The error distribution is also highly non-normal as seen in the histogram and quantile plots. This is to be expected when the predictions of a linear model fitted to a logged variable are converted back to real units.
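The real-unit error statistics can also be computed directly from the saved log predictions, without an auxiliary regression. The Python sketch below shows one way to do this; it is not the RegressIt procedure, the function name is an assumption, and the two-fish data set is hypothetical, chosen so the arithmetic is easy to check by hand.

```python
import math


def real_unit_error_stats(actual_g, log_predictions):
    """RMSE, MAE (grams) and MAPE (%) after exponentiating log predictions."""
    predicted_g = [math.exp(p) for p in log_predictions]
    errors = [a - p for a, p in zip(actual_g, predicted_g)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / a for e, a in zip(errors, actual_g)) / n
    return {"rmse": rmse, "mae": mae, "mape": mape}


# Hypothetical example: actual weights of 100 g and 200 g,
# with unlogged predictions of 110 g and 180 g.
actual = [100.0, 200.0]
log_pred = [math.log(110.0), math.log(180.0)]
stats = real_unit_error_stats(actual, log_pred)

print(f"RMSE = {stats['rmse']:.3f} g")
print(f"MAE  = {stats['mae']:.3f} g")
print(f"MAPE = {stats['mape']:.1f}%")
```

Note that an RMSE computed this way divides by n, whereas the standard error of the regression reported for Model 5 adjusts for the two estimated coefficients, which is why the sheet shows 48.782 for RMSE and 49.096 for the standard error.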
2/5/20 9:49 AM + FACDS414 + fish_catch_model4.xlsx + fishdata + RegressItPC 2020.01.20
Summary of Regression Model Results

Model 1 (#vars=1, n=157, AdjRsq=0.946): Weight_g.Ln << Length_cm.Ln

Linear Model For Weight_g.Ln       Model 1           Model 2           Model 3           Model 4
Run Time                           2/5/20 9:43 AM    2/5/20 9:44 AM    2/5/20 9:44 AM    2/5/20 9:46 AM
# Fitted                           157               157               157               157
Mean                               5.407             5.407             5.407             5.407
Standard Deviation                 1.330             1.330             1.330             1.330
Number Of Variables                1                 7                 3                 9
Standard Error of Regression       0.310             0.103             0.100             0.081
R-squared                          0.946             0.994             0.995             0.996
Adjusted R-squared                 0.946             0.994             0.994             0.996
Mean Absolute Error                0.235             0.079             0.077             0.063
Mean Absolute Percentage Error     5.3%              1.6%              1.6%              1.3%
Maximum VIF                                          2.992             1.631             43.283
Normality Test                     ***               _                 _                 _

Coefficients (P-values in parentheses):

                   Model 1           Model 2           Model 3           Model 4
Constant           -5.269 (0.000)    -5.054 (0.000)    -9.143 (0.000)    -8.228 (0.000)
Height_pct.Ln                                           0.445 (0.000)     0.655 (0.000)
Length_cm.Ln        3.172 (0.000)     3.155 (0.000)     3.017 (0.000)     3.031 (0.000)
Species.Eq.1                         -0.055 (0.028)                      -0.228 (0.001)
Species.Eq.2                          0.071 (0.113)                       0.023 (0.542)
Species.Eq.3                         -0.122 (0.000)                      -0.106 (0.000)
Species.Eq.4                          0.149 (0.000)                      -0.083 (0.236)
Species.Eq.5                         -0.671 (0.000)                      -0.247 (0.000)
Species.Eq.6                         -0.775 (0.000)                      -0.159 (0.033)
Width_pct.Ln                                            1.113 (0.000)     0.526 (0.000)

Model 5 (#vars=1, n=157, AdjRsq=0.981): Weight_g << Weight_g.Ln.Model.4.Exp

Linear Model For Weight_g          Model 5
Run Time                           2/5/20 9:49 AM
# Fitted                           157
Mean                               401.235
Standard Deviation                 358.809
Number Of Variables                1
Standard Error of Regression       49.096
R-squared                          0.981
Adjusted R-squared                 0.981
Mean Absolute Error                26.967
Mean Absolute Percentage Error     6.7%
Maximum VIF
Normality Test                     ***

Coefficients (P-values in parentheses):

                           Model 5
Constant                   0.890 (0.881)
Weight_g.Ln.Model.4.Exp    1.001 (0.000)

RegressIt automatically builds a journal-style table of side-by-side comparative statistics for models fitted to a given dependent variable. Color and font coding can be toggled on and off to highlight significance of coefficient estimates. Cell notes under the displayed numbers provide much more detail.
Source: http://jse.amstat.org/jse_data_archive.htm
http://jse.amstat.org/datasets/fishcatch.txt
http://jse.amstat.org/datasets/fishcatch.dat.txt

A few references on log-log models in the fishery literature:
http://eprints.cmfri.org.in/12178/1/17-Estimation%20of%20length%20weight%20relationship%20in%20fishes.pdf
http://www.dnr.state.mi.us/publications/pdfs/IFR/manual/SMII%20Chapter17.pdf
The first one fits a regression model in Excel, the second includes a table of coefficients from many models.

Also see pages 56-67 here for some descriptive analysis of the same data set in SAS:
https://support.sas.com/documentation/onlinedoc/base/procstat93m1.pdf
…and pages 1747 to 1751 here:
https://support.sas.com/documentation/onlinedoc/stat/123/candisc.pdf#cite.fish%3A17@-30

The original dataset included 3 different measures of length, called Length1, Length2, and Length3. See the picture and notes below for details. Only the Length3 variable has been included in this analysis (renamed as Length_cm) because these variables are highly correlated with each other and Length1 and Length2 turn out to be insignificant in any model in which Length3 is included.

NAME: fishcatch
TYPE: Sample
SIZE: 159 observations, 8 variables

DESCRIPTIVE ABSTRACT:

159 fishes of 7 species are caught and measured. Altogether there are
8 variables. All the fishes are caught from the same lake
(Laengelmavesi) near Tampere in Finland.

SOURCES:
Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaera
sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4,
Meddelanden utgivna av fiskerifoereningen i Finland.
Helsingfors 1917

VARIABLE DESCRIPTIONS:
1 Obs Observation number ranges from 1 to 159
2 Species (Numeric)
Code Finnish Swedish English Latin
1 Lahna Braxen Bream Abramis brama
2 Siika Iiden Whitewish Leusiscus idus
3 Saerki Moerten Roach Leuciscus rutilus
4 Parkki Bjoerknan ? Abramis bjrkna
5 Norssi Norssen Smelt Osmerus eperlanus
6 Hauki Jaedda Pike Esox lucius
7 Ahven Abborre Perch Perca fluviatilis

3 Weight Weight of the fish (in grams)


4 Length1 Length from the nose to the beginning of the tail (in cm)
5 Length2 Length from the nose to the notch of the tail (in cm)
6 Length3 Length from the nose to the end of the tail (in cm)
7 Height% Maximal height as % of Length3
8 Width% Maximal width as % of Length3
9 Sex 1 = male 0 = female

___/////___ _
/ \ ___ |
/\ \_ / / H
< ) __) \ |
\/_\\_________/ \__\ _

|------- L1 -------|
|------- L2 ----------|
|------- L3 ------------|

Values are aligned and delimited by blanks.


Missing values are denoted with NA.
There is one data line for each case.

SPECIAL NOTES:
I have usually calculated
Height = Height%*Length3/100
Width = Width%*Length3/100
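The conversion in the special notes above can be sketched in Python as follows, using observation 1 of the data listing (Length3 = 30 cm, Height% = 38.4, Width% = 13.4). The function name is an assumption made for this illustration.

```python
def percent_to_cm(percent_of_length, length3_cm):
    """Convert a dimension stored as a percentage of Length3 into centimeters."""
    return percent_of_length * length3_cm / 100.0


# Observation 1 from the data listing.
length3_cm = 30.0
height_pct = 38.4
width_pct = 13.4

height_cm = percent_to_cm(height_pct, length3_cm)
width_cm = percent_to_cm(width_pct, length3_cm)

print(f"Height = {height_cm:.2f} cm, Width = {width_cm:.2f} cm")
```

Working with the percentages rather than the absolute dimensions, as the regression models in this analysis do, keeps the shape variables largely separate from overall size.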

PEDAGOGICAL NOTES:
I have mainly used only Species=7 (Perch). Here are some of the
models and tests we have used:

Weight=a+b*(Length3*Height*Width)+epsilon
Ho: a=0;
Heteroscedastic case. Question: What is proper weighting,
if you use Length3 as a weighting variable.

Log(Weight)=a+b1*Length3+epsilon

Weight^(1/3)=a+b1*Length3+epsilon
(Given by Box-Cox-transformation)
Ho: a=0;

Log(Weight)=a+b1*Length3+b2*Height+b3*Width+epsilon
Ho: b1+b2+b3=3;
i.e. dimension of the fish = 3

Weight^(1/3)=a+b1*Length3+b2*Height+b3*Width+epsilon
(Given by Box-Cox-transformation)
Ho: a=0;

Weight=a*Length3^b1*Height^b2*Width^b3+epsilon
Nonlinear, heteroscedastic case.
What is proper weighting?
Is obs 143

143 7 840.0 32.5 35.0 37.3 30.8 20.9 0

an outlier? It had in its stomach 6 roach.

REFERENCES:
Brofeldt, Pekka: Bidrag till kaennedom on fiskbestondet i vaara
sjoear. Laengelmaevesi. T.H.Jaervi: Finlands Fiskeriet Band 4,
Meddelanden utgivna av fiskerifoereningen i Finland.
Helsingfors 1917

SUBMITTED BY:
Juha Puranen
Department of Statistics
PL33 (Aleksanterinkatu 7)
000014 University of Helsinki
Finland
e-mail: jpuranen@noppa.helsinki.fi
References on log-log models in the fishery literature:
ints.cmfri.org.in/12178/1/17-Estimation%20of%20length%20weight%20relationship%20in%20fishes.pdf
w.dnr.state.mi.us/publications/pdfs/IFR/manual/SMII%20Chapter17.pdf
The first one fits a regression model in Excel; the second includes a table of coefficients from many models.

See pages 56-67 here for some descriptive analysis of the same data set in SAS:
pport.sas.com/documentation/onlinedoc/base/procstat93m1.pdf
and pages 1747 to 1751 here:
pport.sas.com/documentation/onlinedoc/stat/123/candisc.pdf#cite.fish%3A17@-30

The original dataset included 3 different measures of length,
Length1, Length2, and Length3. See the picture and notes
below for details. Only the Length3 variable has been
used in this analysis (renamed as Length_cm) because
the variables are highly correlated with each other and
Length1 and Length2 turn out to be insignificant in any model
in which Length3 is included.
Obs Species Weight Length1 Length2 Length3 Height% Width% Sex
1 1 242 23.2 25.4 30 38.4 13.4 NA
2 1 290 24 26.3 31.2 40 13.8 NA
3 1 340 23.9 26.5 31.1 39.8 15.1 NA
4 1 363 26.3 29 33.5 38 13.3 NA
5 1 430 26.5 29 34 36.6 15.1 NA
6 1 450 26.8 29.7 34.7 39.2 14.2 NA
7 1 500 26.8 29.7 34.5 41.1 15.3 NA
8 1 390 27.6 30 35 36.2 13.4 NA
9 1 450 27.6 30 35.1 39.9 13.8 NA
10 1 500 28.5 30.7 36.2 39.3 13.7 NA
11 1 475 28.4 31 36.2 39.4 14.1 NA
12 1 500 28.7 31 36.2 39.7 13.3 NA
13 1 500 29.1 31.5 36.4 37.8 12 NA
14 1 NA 29.5 32 37.3 37.3 13.6 1
15 1 600 29.4 32 37.2 40.2 13.9 1
16 1 600 29.4 32 37.2 41.5 15 NA
17 1 700 30.4 33 38.3 38.8 13.8 1
18 1 700 30.4 33 38.5 38.8 13.5 NA
19 1 610 30.9 33.5 38.6 40.5 13.3 NA
20 1 650 31 33.5 38.7 37.4 14.8 NA
21 1 575 31.3 34 39.5 38.3 14.1 1
22 1 685 31.4 34 39.2 40.8 13.7 NA
23 1 620 31.5 34.5 39.7 39.1 13.3 NA
24 1 680 31.8 35 40.6 38.1 15.1 NA
25 1 700 31.9 35 40.5 40.1 13.8 NA
26 1 725 31.8 35 40.9 40 14.8 1
27 1 720 32 35 40.6 40.3 15 NA
28 1 714 32.7 36 41.5 39.8 14.1 NA
29 1 850 32.8 36 41.6 40.6 14.9 NA
30 1 1000 33.5 37 42.6 44.5 15.5 0
31 1 920 35 38.5 44.1 40.9 14.3 0
32 1 955 35 38.5 44 41.1 14.3 NA
33 1 925 36.2 39.5 45.3 41.4 14.9 1
34 1 975 37.4 41 45.9 40.6 14.7 0
35 1 950 38 41 46.5 37.9 13.7 NA
36 2 270 23.6 26 28.7 29.2 14.8 NA
37 2 270 24.1 26.5 29.3 27.8 14.5 NA
38 2 306 25.6 28 30.8 28.5 15.2 NA
39 2 540 28.5 31 34 31.6 19.3 NA
40 2 800 33.7 36.4 39.6 29.7 16.6 0
41 2 1000 37.3 40 43.5 28.4 15 NA
42 3 40 12.9 14.1 16.2 25.6 14 NA
43 3 69 16.5 18.2 20.3 26.1 13.9 NA
44 3 78 17.5 18.8 21.2 26.3 13.7 NA
45 3 87 18.2 19.8 22.2 25.3 14.3 NA
46 3 120 18.6 20 22.2 28 16.1 NA
47 3 0 19 20.5 22.8 28.4 14.7 NA
48 3 110 19.1 20.8 23.1 26.7 14.7 0
49 3 120 19.4 21 23.7 25.8 13.9 0
50 3 150 20.4 22 24.7 23.5 15.2 0
51 3 145 20.5 22 24.3 27.3 14.6 0
52 3 160 20.5 22.5 25.3 27.8 15.1 0
53 3 140 21 22.5 25 26.2 13.3 NA
54 3 160 21.1 22.5 25 25.6 15.2 0
55 3 169 22 24 27.2 27.7 14.1 NA
56 3 161 22 23.4 26.7 25.9 13.6 NA
57 3 200 22.1 23.5 26.8 27.6 15.4 0
58 3 180 23.6 25.2 27.9 25.4 14 NA
59 3 290 24 26 29.2 30.4 15.4 NA
60 3 272 25 27 30.6 28 15.6 0
61 3 390 29.5 31.7 35 27.1 15.3 NA
62 4 55 13.5 14.7 16.5 41.5 14.1 NA
63 4 60 14.3 15.5 17.4 37.8 13.3 1
64 4 90 16.3 17.7 19.8 37.4 13.5 1
65 4 120 17.5 19 21.3 39.4 13.7 1
66 4 150 18.4 20 22.4 39.7 14.7 NA
67 4 140 19 20.7 23.2 36.8 14.2 NA
68 4 170 19 20.7 23.2 40.5 14.7 0
69 4 145 19.8 21.5 24.1 40.4 13.1 0
70 4 200 21.2 23 25.8 40.1 14.2 NA
71 4 273 23 25 28 39.6 14.8 0
72 4 300 24 26 29 39.2 14.6 0
73 5 6.7 9.3 9.8 10.8 16.1 9.7 1
74 5 7.5 10 10.5 11.6 17 10 0
75 5 7 10.1 10.6 11.6 14.9 9.9 1
76 5 9.7 10.4 11 12 18.3 11.5 0
77 5 9.8 10.7 11.2 12.4 16.8 10.3 1
78 5 8.7 10.8 11.3 12.6 15.7 10.2 1
79 5 10 11.3 11.8 13.1 16.9 9.8 1
80 5 9.9 11.3 11.8 13.1 16.9 8.9 0
81 5 9.8 11.4 12 13.2 16.7 8.7 0
82 5 12.2 11.5 12.2 13.4 15.6 10.4 0
83 5 13.4 11.7 12.4 13.5 18 9.4 0
84 5 12.2 12.1 13 13.8 16.5 9.1 0
85 5 19.7 13.2 14.3 15.2 18.9 13.6 0
86 5 19.9 13.8 15 16.2 18.1 11.6 0
87 6 200 30 32.3 34.8 16 9.7 NA
88 6 300 31.7 34 37.8 15.1 11 0
89 6 300 32.7 35 38.8 15.3 11.3 NA
90 6 300 34.8 37.3 39.8 15.8 10.1 NA
91 6 430 35.5 38 40.5 18 11.3 NA
92 6 345 36 38.5 41 15.6 9.7 1
93 6 456 40 42.5 45.5 16 9.5 NA
94 6 510 40 42.5 45.5 15 9.8 NA
95 6 540 40.1 43 45.8 17 11.2 NA
96 6 500 42 45 48 14.5 10.2 NA
97 6 567 43.2 46 48.7 16 10 0
98 6 770 44.8 48 51.2 15 10.5 0
99 6 950 48.3 51.7 55.1 16.2 11.2 NA
100 6 1250 52 56 59.7 17.9 11.7 NA
101 6 1600 56 60 64 15 9.6 NA
102 6 1550 56 60 64 15 9.6 0
103 6 1650 59 63.4 68 15.9 11 0
104 7 5.9 7.5 8.4 8.8 24 16 NA
105 7 32 12.5 13.7 14.7 24 13.6 NA
106 7 40 13.8 15 16 23.9 15.2 NA
107 7 51.5 15 16.2 17.2 26.7 15.3 NA
108 7 70 15.7 17.4 18.5 24.8 15.9 NA
109 7 100 16.2 18 19.2 27.2 17.3 NA
110 7 78 16.8 18.7 19.4 26.8 16.1 NA
111 7 80 17.2 19 20.2 27.9 15.1 NA
112 7 85 17.8 19.6 20.8 24.7 14.6 NA
113 7 85 18.2 20 21 24.2 13.2 NA
114 7 110 19 21 22.5 25.3 15.8 NA
115 7 115 19 21 22.5 26.3 14.7 NA
116 7 125 19 21 22.5 25.3 16.3 1
117 7 130 19.3 21.3 22.8 28 15.5 0
118 7 120 20 22 23.5 26 14.5 0
119 7 120 20 22 23.5 24 15 NA
120 7 130 20 22 23.5 26 15 NA
121 7 135 20 22 23.5 25 15 NA
122 7 110 20 22 23.5 23.5 17 0
123 7 130 20.5 22.5 24 24.4 15.1 0
124 7 150 20.5 22.5 24 28.3 15.1 0
125 7 145 20.7 22.7 24.2 24.6 15 NA
126 7 150 21 23 24.5 21.3 14.8 NA
127 7 170 21.5 23.5 25 25.1 14.9 NA
128 7 225 22 24 25.5 28.6 14.6 NA
129 7 145 22 24 25.5 25 15 NA
130 7 188 22.6 24.6 26.2 25.7 15.9 NA
131 7 180 23 25 26.5 24.3 13.9 0
132 7 197 23.5 25.6 27 24.3 15.7 NA
133 7 218 25 26.5 28 25.6 14.8 NA
134 7 300 25.2 27.3 28.7 29 17.9 0
135 7 260 25.4 27.5 28.9 24.8 15 0
136 7 265 25.4 27.5 28.9 24.4 15 NA
137 7 250 25.4 27.5 28.9 25.2 15.8 0
138 7 250 25.9 28 29.4 26.6 14.3 NA
139 7 300 26.9 28.7 30.1 25.2 15.4 0
140 7 320 27.8 30 31.6 24.1 15.1 0
141 7 514 30.5 32.8 34 29.5 17.7 NA
142 7 556 32 34.5 36.5 28.1 17.5 NA
143 7 840 32.5 35 37.3 30.8 20.9 0
144 7 685 34 36.5 39 27.9 17.6 0
145 7 700 34 36 38.3 27.7 17.6 0
146 7 700 34.5 37 39.4 27.5 15.9 0
147 7 690 34.6 37 39.3 26.9 16.2 0
148 7 900 36.5 39 41.4 26.9 18.1 0
149 7 650 36.5 39 41.4 26.9 14.5 NA
150 7 820 36.6 39 41.3 30.1 17.8 NA
151 7 850 36.9 40 42.3 28.2 16.8 0
152 7 900 37 40 42.5 27.6 17 0
153 7 1015 37 40 42.4 29.2 17.6 0
154 7 820 37.1 40 42.5 26.2 15.6 0
155 7 1100 39 42 44.6 28.7 15.4 0
156 7 1000 39.8 43 45.2 26.4 16.1 0
157 7 1100 40.1 43 45.5 27.5 16.3 0
158 7 1000 40.2 43.5 46 27.4 17.7 1
159 7 1000 41.1 44 46.6 26.8 16.3 0
Values of Weight are missing
for fish #14 and #47, and they
were recoded as blank cells
prior to analysis.
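
Treating these two cases as missing matters for the log transformation, since the listing shows NA for fish #14 and a weight of 0 for fish #47, and the natural log is undefined for both. A minimal sketch of that recoding, with a hypothetical helper name and a few (Obs, Weight) pairs copied from the listing:

```python
import math

# (Obs, Weight) pairs as they appear in the listing; None stands for NA
weights = [(13, 500.0), (14, None), (15, 600.0),
           (46, 120.0), (47, 0.0), (48, 110.0)]

def log_weight_or_none(weight):
    """Natural log of weight, or None when weight is missing or not positive."""
    if weight is None or weight <= 0:
        return None
    return math.log(weight)

logged = {obs: log_weight_or_none(w) for obs, w in weights}
usable = [obs for obs, value in logged.items() if value is not None]
print(usable)
```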
