Math 1040
Zeph Smith
T/R 10:00-11:20
Report Introduction
Through active learning techniques, students are able to demonstrate an understanding of key concepts
learned in Math 1040. The Skittles term project allows students to be more engaged throughout the
term and to learn the basic concepts of statistics hands-on. During the project, students follow the
three steps of statistics: preparing, analyzing, and concluding. In the preparation phase of this
project, each student in the class provides data, which is then compiled into a larger sample. That
sample is then analyzed by way of an observational study, and the results can be used to form a
conclusion about the larger population of all Skittles. In this case, the data is a convenience sample:
the number of each color of Skittle found in a 2.17-ounce bag of Skittles candy.
Categorical Data
[Pie chart: proportion of each Skittle color. Red 0.205, Orange 0.183, Yellow 0.195, Green 0.207, Purple 0.210]
[Pareto chart: proportion by Color of Skittle. Purple 0.210, Green 0.207, Red 0.205, Yellow 0.195, Orange 0.183]
After compiling the data, it appears that the distribution of the colors is almost uniform. This can be
seen in both the pie chart and the Pareto chart, which show that the proportion of each color is close
to that of every other color, ranging from 0.183 for orange to 0.210 for purple. It should also be noted
that the larger the sample, the more uniform the results should be if the colors are produced in equal
amounts. While every bag will have a slightly different count of each color, each bag is typically
within normal variance. There will be a few outliers; however, the bag I selected was within one
standard deviation of the mean in every color.
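The proportions shown in the charts can be reproduced from the class totals for each color (500 red, 446 orange, 474 yellow, 503 green, 512 purple). The following is a minimal sketch of that arithmetic; the variable names are my own and are only illustrative.

```python
# Class totals for each color, taken from the compiled data table.
color_totals = {"Red": 500, "Orange": 446, "Yellow": 474, "Green": 503, "Purple": 512}

# Total number of candies across all bags.
total_candies = sum(color_totals.values())

# Proportion of each color, rounded to three decimals as in the charts.
proportions = {
    color: round(count / total_candies, 3)
    for color, count in color_totals.items()
}

# A Pareto chart lists the categories from largest to smallest proportion.
pareto_order = sorted(proportions, key=proportions.get, reverse=True)

for color in pareto_order:
    print(f"{color}: {proportions[color]:.3f}")
```

Sorting by proportion gives the left-to-right order of the bars in the Pareto chart.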
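[Table: number of Skittles of each color in each bag. Counts are listed in the column order Red, Orange, Yellow, Green, Purple, followed by the bag total; rows with fewer than five counts are missing values.]
Respondent 1: 10, 12, 14, 17; total 61
Respondent 2: 13, 20, 11; total 61
Respondent 3: 11, 13, 12, 10, 15; total 61
Respondent 4: 13, 11, 12, 13, 10; total 59
Respondent 5: 14, 16, 13; total 61
Respondent 6: 15, 10, 13, 19; total 62
Respondent 7: 13, 12, 14; total 45
Respondent 8: 18, 14, 11, 13; total 62
Respondent 9: 10, 11, 16, 11; total 56
Respondent 10: 12, 13, 10, 13, 13; total 61
Respondent 11: 22, 17, 12, 15; total 75
Respondent 12: 11, 11, 18, 13, 14; total 67
Respondent 13: 15, 15, 12, 13; total 64
Respondent 14: 26, 24, 21, 22, 21; total 114
Respondent 15: 14, 12, 17; total 61
Respondent 16: 12, 14, 12, 10, 11; total 59
Respondent 17: 12, 13, 11, 13; total 58
Respondent 18: 14, 12, 14, 15; total 58
Respondent 19: 15, 10, 13, 19; total 62
Respondent 20: 17, 11, 10, 15; total 62
Respondent 21: 14, 14, 16; total 60
Respondent 22: 12, 17, 14; total 60
Respondent 23: 11, 12, 14, 17; total 62
Respondent 24: 15, 17, 18; total 61
Respondent 25: 21, 24, 22, 19, 20; total 106
Respondent 26: 19, 10, 14, 11; total 62
Respondent 27: 18, 11, 10, 12; total 60
Respondent 28: 10, 13, 17, 10; total 58
Respondent 29: 14, 15, 10, 13; total 60
Respondent 30: 15, 14, 14, 10; total 60
Respondent 31: 16, 11, 17; total 56
Respondent 32: 14, 12, 15, 11, 10; total 62
Respondent 33: 13, 11, 11, 15; total 59
Respondent 34: 10, 15, 10, 10, 14; total 59
Respondent 35: 16, 12, 16; total 62
Respondent 36: 10, 18, 16; total 60
Respondent 37: 22, 15, 28, 20, 13; total 98
Respondent 38: 11, 14, 14, 14; total 61
Total: Red 500, Orange 446, Yellow 474, Green 503, Purple 512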
Name: Respondent 32. Red 14, Orange 12, Yellow 15, Green 11, Purple 10; total 62
Quantitative Data
Number of Skittles
Statistic   Red     Orange  Yellow  Green   Purple
Min         0       -       3       -       -
Q1          10, 10, 11, 11 (one value missing; column positions unclear)
Q2          13      11      12      13      13.5
Q3          15      14      14      16      15
Max         26      24      28      22      21
STD Dev     4.89    4.47    4.72    3.51    3.52
n           38      38      38      38      38
                   Red     Orange  Yellow  Green   Purple
Ratio of Candies   0.205   0.183   0.195   0.207   0.210
% of Each Color    20.5    18.3    19.5    20.7    21.0
Total Candies      500     446     474     503     512
Bags: 38; candies per bag: 61; total candies: 2435
Table 4: Ratio, Total for Each Color, and Total Candies and Bags Data
Name: Respondent 32. Red 14, Orange 12, Yellow 15, Green 11, Purple 10; total 62
After examining the data, it appears that there was more variation in the red and yellow Skittles
(standard deviations of 4.89 and 4.72), followed by the orange Skittles (4.47). The green Skittles had
the least variation (3.51), followed by the purple Skittles (3.52). Looking more closely at the maximum
and minimum values for each color helps explain why the variation differs so much between colors. The
red Skittles had a maximum of 26 and a minimum of 0. Likewise, the yellow Skittles had a maximum of 28
and a minimum of 3. This is not what one would expect if the intent of the distributor is to have equal
amounts of all colors in a 2.17-ounce bag of Skittles. The bag I selected, however, was within one
standard deviation of the mean in every color.
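The "within one standard deviation" claim can be checked from the tables. The sketch below is only illustrative: it assumes the bag in question is Respondent 32's (14, 12, 15, 11, 10), takes each color's mean as its class total divided by the 38 bags, and uses the standard deviations as reported in the table. The function and variable names are my own.

```python
# Values from the report's tables.
bags = 38
color_totals = {"Red": 500, "Orange": 446, "Yellow": 474, "Green": 503, "Purple": 512}
std_devs = {"Red": 4.89, "Orange": 4.47, "Yellow": 4.72, "Green": 3.51, "Purple": 3.52}

# Assumption: Respondent 32's bag is the bag referred to in the text.
my_bag = {"Red": 14, "Orange": 12, "Yellow": 15, "Green": 11, "Purple": 10}


def within_one_sd(count, mean, sd):
    """Return True if count lies within one standard deviation of the mean."""
    return abs(count - mean) <= sd


results = {}
for color, count in my_bag.items():
    mean = color_totals[color] / bags  # mean count per bag for this color
    results[color] = within_one_sd(count, mean, std_devs[color])
    print(f"{color}: count={count}, mean={mean:.2f}, sd={std_devs[color]}, within={results[color]}")
```

Under these assumptions every color passes the check, with purple the closest to the boundary.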
Reflection
The difference between categorical and quantitative data is that categorical data consists of names and
labels that are not numbers, whereas quantitative data consists of numbers that represent counts or
measurements (Triola, 2014). For categorical data, it makes sense to use graphs such as bar graphs, pie
charts, and Pareto charts. For quantitative data, it makes sense to use graphs such as histograms,
normal quantile plots, time-series graphs, scatterplots, frequency polygons, dot plots, and
stem-and-leaf plots. This is due in large part to the differences in the data being graphed. Categorical
data does not represent counts or measurements, so even if the data can be arranged in some order,
arithmetic such as subtraction and division is meaningless. Quantitative data, whether discrete or
continuous, represents counts or measurements, so arithmetic such as addition, subtraction,
multiplication, and division makes sense and can be calculated for such data.
Confidence Interval Estimates
A confidence interval estimate uses a sample statistic, such as a sample proportion, to construct a
range of values that estimates the true value of a population parameter, such as a population
proportion (Triola, 2014).
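99% confidence interval for the true proportion of yellow candies:
$z_{\alpha/2} = 2.576$
$E = z_{\alpha/2} \sqrt{\frac{(\hat{p})(\hat{q})}{n}}$
$\hat{p} - E < p < \hat{p} + E$
$.174 < p < .216$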
We can say with 99% confidence that the true proportion of yellow candies is between .174 and .216.
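As a rough check of the interval above, the margin of error for a proportion can be computed directly. This sketch assumes the inputs are the rounded yellow proportion from Table 4 (0.195), the total of 2,435 candies, and the critical value 2.576; the function name is my own.

```python
import math


def proportion_confidence_interval(p_hat, n, z_crit):
    """Confidence interval for a population proportion (normal approximation)."""
    q_hat = 1 - p_hat
    margin = z_crit * math.sqrt(p_hat * q_hat / n)  # margin of error E
    return p_hat - margin, p_hat + margin


# Assumed inputs: yellow proportion and total candies from Table 4.
lower, upper = proportion_confidence_interval(p_hat=0.195, n=2435, z_crit=2.576)
print(f"{lower:.3f} < p < {upper:.3f}")
```

With these inputs the limits round to .174 and .216, which agrees with the interval stated above.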
95% confidence interval for the true mean number of candies per bag:
$t_{\alpha/2} = 2.026$
$\bar{x} - E$: $61 - 4.34 = 56.66$
$\bar{x} + E$: $61 + 4.34 = 65.34$
$56.66 < \mu < 65.34$
We can say with 95% confidence that the true mean number of candies per bag is between 56.66 and
65.34.
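The margin of error of 4.34 comes from the formula E = t * s / sqrt(n). The report does not state the sample standard deviation s, so the sketch below assumes s = 13.2, a value chosen because it is consistent with the reported margin of error. The center (61), the critical value (2.026), and n = 38 are taken from the report as given.

```python
import math


def mean_confidence_interval(x_bar, s, n, t_crit):
    """Confidence interval for a population mean when sigma is unknown."""
    margin = t_crit * s / math.sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin, margin


# x_bar, n, and t_crit are the report's values; s = 13.2 is an assumption.
lower, upper, margin = mean_confidence_interval(x_bar=61.0, s=13.2, n=38, t_crit=2.026)
print(f"E = {margin:.2f}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

Under these assumptions the limits round to 56.66 and 65.34.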
98% confidence interval for the standard deviation of the number of candies per bag:
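$\sqrt{\frac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_L}}$
$11.26 < \sigma < 20.76$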
We can say with 98% confidence that the standard deviation of the number of candies per bag is
between 11.26 and 20.76.
Hypothesis Tests
A hypothesis test is a procedure for testing a claim about a property of a population (Triola, 2014).
Claim: 20% of all Skittles are red.
$\alpha = .05$
1-PropZTest:
z = 0
P = .999
Decision: fail to reject $H_0$
The P-value is larger than the alpha value; therefore, there is not sufficient evidence to reject the
null hypothesis that 20% of all Skittles are red.
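A one-proportion z test can also be worked by hand. The sketch below assumes the inputs are the class totals from Table 4 (500 red out of 2,435 candies); the inputs behind the calculator output above are not shown, so the statistic and P-value from this sketch need not match it. The function name is my own, and the normal CDF is built from `math.erf`.

```python
import math


def one_proportion_z_test(successes, n, p0):
    """Two-sided z test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Assumed inputs: red count and total candies from Table 4.
z, p_value = one_proportion_z_test(successes=500, n=2435, p0=0.20)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z = {z:.2f}, P = {p_value:.3f}, decision: {decision}")
```

With these inputs the P-value is well above .05, so the decision is again to fail to reject the null hypothesis.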
Claim: the mean number of candies in a bag of Skittles is 55.
$\alpha = .01$
T-Test:
t = 2.80
P = .008
Decision: reject $H_0$
The P-value is smaller than the alpha value; therefore, there is sufficient evidence to reject the claim
that the mean number of candies in a bag of Skittles is 55.
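The test statistic for a claim about a mean is t = (x_bar - mu_0) / (s / sqrt(n)). In the sketch below, the sample mean of 61 and n = 38 come from the report, while s = 13.2 is an assumption, since the report does not state the sample standard deviation. The P-value of .008 is taken from the calculator output rather than recomputed.

```python
import math


def one_sample_t_statistic(x_bar, mu_0, s, n):
    """Test statistic for H0: mu = mu_0 when sigma is unknown."""
    return (x_bar - mu_0) / (s / math.sqrt(n))


# x_bar and n are the report's values; s = 13.2 is an assumption.
t_stat = one_sample_t_statistic(x_bar=61.0, mu_0=55.0, s=13.2, n=38)

# Decision rule using the reported P-value and significance level.
reported_p_value = 0.008
alpha = 0.01
decision = "reject H0" if reported_p_value < alpha else "fail to reject H0"
print(f"t = {t_stat:.2f}, decision: {decision}")
```

Under these assumptions the statistic rounds to 2.80, in line with the value reported above.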
Reflection
In order to construct confidence interval estimates and perform hypothesis tests, certain criteria need
to be met. First, the sample should be a simple random sample. Second, the data should fit the
conditions for a binomial distribution. Lastly, there must be at least five successes and at least five
failures. This can be checked by multiplying n by p (successes) and n by q (failures) to ensure that
both products are at least five. All three intervals and hypothesis tests above meet all of these
requirements.
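The success and failure check described above takes only a few lines. This sketch assumes n = 2,435 candies, the yellow proportion of 0.195 for the interval, and the claimed red proportion of 0.20 for the test; the function name is my own.

```python
def meets_success_failure_rule(n, p, minimum=5):
    """Check that both n*p (successes) and n*q (failures) are at least the minimum."""
    q = 1 - p
    return n * p >= minimum and n * q >= minimum


# Assumed inputs: total candies and the proportions used in the report.
checks = {
    "yellow interval (p = 0.195)": meets_success_failure_rule(2435, 0.195),
    "red test (p = 0.20)": meets_success_failure_rule(2435, 0.20),
}

for name, passed in checks.items():
    print(f"{name}: {'ok' if passed else 'not met'}")
```

Both products are far larger than five here, so this requirement is easily met for the two proportion procedures.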
When reviewing the data gathered for the Skittles project, there appear to be either a few outliers or,
more likely, errors in which the wrong size of Skittles bag was purchased (for example, the bags with
98, 106, and 114 candies). Another source of error is that this was a convenience sample, which means it
does not necessarily represent the true population proportions of Skittles. Other errors could also
affect the results, such as a sample size that is too small or incorrect data entry by other students.
The sampling method could be improved by removing the outliers, increasing the sample size, and ensuring
a simple random sample.