STAT101-15S2

Assignment 1

Jesse Scott, Jiany Tophoven

Question 1
(a)

(b)
Area (sq miles)
Mean 75840.38
Standard Error 13726.21349
Median 57093
Standard Deviation 97058.98641
Sample Variance 9420446842
Range 661722
Minimum 1545
Maximum 663267
Sum 3792019
Count 50
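As a quick consistency check, the derived figures in this table can be recomputed from the others. The sketch below is a minimal example that uses only the values reported above; the variable names are our own, and it assumes the standard error is the standard deviation divided by the square root of the count.

```python
import math

# Values copied from the summary table above (area in square miles).
n = 50
total = 3792019
minimum, maximum = 1545, 663267
std_dev = 97058.98641

mean = total / n                    # sum divided by count
value_range = maximum - minimum     # largest value minus smallest value
std_error = std_dev / math.sqrt(n)  # standard error of the mean
variance = std_dev ** 2             # sample variance is the SD squared

print(f"Mean:           {mean:.2f}")
print(f"Range:          {value_range}")
print(f"Standard error: {std_error:.5f}")
print(f"Variance:       {variance:.0f}")
```

The recomputed values should agree with the table up to rounding of the reported standard deviation.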
(c)
90% of the US states have an area between 0 and 120,000 square miles. The
remaining 10% consist of three states between 120,000 and 180,000 square miles
and two outliers with areas above 260,000 square miles; no state has an area
between 180,000 and 260,000 square miles. This long upper tail makes the
histogram right-skewed.
(d)
To look at the relationship between population and area, we would use a
scatterplot. We want to show the data for all fifty states, and a scatterplot
displays that many data points well, one point per state. Scatterplots also make
it easy to judge whether the two variables are correlated, because any trend is
directly visible. Finally, clustering effects are easy to see on a scatterplot.

Question 2
(a)
                  Eaglehawk                Bloomsbury
Lowest number     0                        14
Highest number    130                      83
Median            8.5                      40
Lower Quartile    1                        30
Upper Quartile    17.5                     66
IQR               16.5                     36
IQR x 1.5         24.75                    54
Lower Fence       -23.75                   -24
Upper Fence       42.25                    120
5 number summary  (0, 1, 8.5, 17.5, 130)   (14, 30, 40, 66, 83)
(b) Please see Attachment No.1
(c) The modified boxplot of the weekly February rainfall in Eaglehawk has a very
short left whisker and a long right whisker in comparison. The median lies to the
left within the box and there are two outliers positioned to the right. All these
characteristics show that the modified boxplot is extremely right-skewed.
The whiskers of the modified boxplot of the weekly February rainfall in
Bloomsbury are approximately the same length on each side. The box is spread out
over a bigger range and there are no outliers. The median lies to the left of the
centre of the box, which makes the modified boxplot slightly right-skewed overall.
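The outlier statements in (c) rest on the 1.5 x IQR fence rule. The sketch below recomputes the fences from the quartiles in the five-number summaries reported in 2(a); the function name `fences` and the dictionary layout are our own choices for illustration, not part of the assignment.

```python
def fences(q1, q3):
    """Return (lower, upper) fences using the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step

# Five-number summaries (min, Q1, median, Q3, max) as reported in 2(a).
summaries = {
    "Eaglehawk": (0, 1, 8.5, 17.5, 130),
    "Bloomsbury": (14, 30, 40, 66, 83),
}

for town, (low, q1, med, q3, high) in summaries.items():
    lower, upper = fences(q1, q3)
    # Only the extremes are checked: if the minimum and maximum lie inside
    # the fences, no observation can be an outlier.
    print(f"{town}: fences ({lower}, {upper}), "
          f"low outliers: {low < lower}, high outliers: {high > upper}")
```

Because only the minimum and maximum are tested, this shows whether outliers exist on each side but not how many there are; counting them would need the raw weekly values.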
(d) Stating the mean for the rainfall in February in Eaglehawk would not be a
reasonable summary, as the extreme values affect the mean dramatically. With the
extreme values included, the mean is a lot higher than the rainfall in a typical
week.
Stating the mean for the rainfall in February in Bloomsbury would be a reasonable
summary, because the mean lies close to the middle of the range and there are no
outliers to distort it.
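The effect described in (d) can be illustrated with a small sketch. The numbers below are hypothetical values invented purely for illustration; they are not the Eaglehawk or Bloomsbury data. The point is only that one extreme week moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical weekly rainfall values, invented for illustration only;
# they are NOT the data from the assignment.
typical_weeks = [0, 1, 2, 8, 9, 15, 18]
with_extreme_week = typical_weeks + [130]

print("Without extreme week:", mean(typical_weeks), median(typical_weeks))
print("With extreme week:   ", mean(with_extreme_week), median(with_extreme_week))
```

In this made-up sample the single extreme value roughly triples the mean while the median barely changes, which is why the median is the safer summary when outliers are present.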
(e) A tourist visiting Eaglehawk in February will most likely experience very
little rain during the week, if any, although occasionally there is extreme
rainfall compared with the usual amount. In Bloomsbury, however, a tourist will
very likely experience a moderate amount of rain, as in the last four years there
has been no week without rainfall.

Question 3
(a)
Which University Majors are dominated by which gender?