
Stacey Roman

Skittles Project
In late January we were each told to purchase a 2.17 oz bag of Skittles for an upcoming
project. We would revisit this project at least once a month for the entire semester. Each
part of the project covered concepts we learned throughout our Math 1040 class. As we put
the concepts to use with the data we gathered, we were also put into groups to discuss key
points of each concept. The Skittles project consisted of four parts, and our first task was
to count the colors we received in our individual bags.
PART 1: Project Data Collections

          Count Red   Count Orange   Count Yellow   Count Green   Count Purple   Total
My Bag    13          10             8              10            18             59

After we collected our individual data, our instructor compiled all the data from the class
into a document and shared it with everyone who participated. We had begun the semester
discussing the practice of statistics and different types of sampling. Understanding the
different types of sampling allowed us to continue to the next part of the project.
PART 2: Graphics
1. Predictions:

                                      Red    Orange   Yellow   Green   Purple
Predicted Proportion for each color   .25    .20      .10      .20     .25

Based on the counts in my bag and my memory of other bags of Skittles I had eaten, I
believed red and purple were more abundant than the other colors and yellow was always the
least common. I predicted that red and purple would have similar, higher percentages, that
orange and green would also be similar to each other, and that yellow would have the lowest
percentage.

2. Data:

                                                 Red    Orange   Yellow   Green   Purple   Total Count
Counts for my bag                                13     10       8        10      18       59
Counts for the entire class sample               263    298      286      270     294      1411
Actual Proportions for my bag                    .22    .17      .14      .17     .30
Actual Proportions for the entire class sample   .19    .21      .20      .19     .21
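
The proportions in the table can be reproduced by dividing each color count by the total count. The following is a minimal Python sketch of that arithmetic, not the method used in class; the variable and function names are my own, and the counts are copied from the table above.

```python
# Color counts copied from the data table above.
colors = ["Red", "Orange", "Yellow", "Green", "Purple"]
my_bag = [13, 10, 8, 10, 18]
whole_class = [263, 298, 286, 270, 294]


def proportions(counts):
    """Return each count divided by the total of all counts."""
    total = sum(counts)
    return [count / total for count in counts]


my_props = proportions(my_bag)
class_props = proportions(whole_class)

# Print the two sets of proportions side by side, to three decimals.
for color, mine, everyone in zip(colors, my_props, class_props):
    print(f"{color:<7} my bag: {mine:.3f}   class: {everyone:.3f}")
```

One detail worth noting: 18/59 is about 0.305, so the purple proportion for my bag sits right between the .30 listed in the table and .31.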

3. Graphics for Qualitative Data:

4. Skittles Colors:
I was surprised by the number of yellow Skittles my classmates had overall, since I
received few yellow ones compared to the other colors in my bag. After adding up the class
totals for each color, I was also surprised by how close the counts were to each other. I
had assumed the company produced more of certain colors, but judging by the amounts our
class accumulated, I believe it most likely produces roughly equal amounts of each color.
Comparing my own bag with the class data, I see that the assumption I made from my own bag
was not a good one to apply to all Skittles bags. Having a larger sample shifted my view of
the variation in color proportions. I believe that if we took an even larger sample, the
proportions of the colors would become even closer to each other.
DISCUSSION: What kind of sampling is this?
As a refresher for myself, a random sample uses chance to select individuals from a
population to be included in the sample. I agreed with everyone in my group that it is
ambiguous how big the population is, since it could be the state of Utah, all SLCC Math
1040 classes, or the South City Campus Math 1040 class. Perhaps, in this case, the
population is the South City Math 1040 class and our group is, in a sense, a random sample.
However, since everyone is included after being randomly put into groups, I agree with our
group member that this can be viewed as a cluster sample.
After we predicted and calculated the actual proportions for this sample, we were able to
put the proportion of each color in our Skittles bags into perspective. The next concept we
would examine was the number of candies per bag. For this we would go over the summary
stats and boxplot.
PART 3: Summary Stats
1. Summary statistics:

Mean number of candies per bag                        58.79
Standard deviation of the number of candies per bag   2.02
5-number summary for the number of candies per bag    54, 58.5, 59, 60, 62
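
These values can be checked against the 24 class bag counts listed in Part 4. The sketch below uses Python's standard `statistics` module. The quartile rule (the median of each half of the sorted data) is my assumption about how the calculator computes quartiles; other quartile conventions can give slightly different Q1 and Q3.

```python
import statistics

# Class data of candies per bag, copied from Part 4.
candies_per_bag = [
    59, 61, 61, 56, 59, 59,
    59, 59, 59, 59, 59, 54,
    60, 60, 58, 58, 60, 54,
    59, 56, 60, 62, 59, 61,
]


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles as medians of each half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # For an odd number of values, leave the middle value out of both halves.
    upper = ordered[half:] if len(ordered) % 2 == 0 else ordered[half + 1:]
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )


mean = statistics.mean(candies_per_bag)
sample_sd = statistics.stdev(candies_per_bag)  # sample SD, divides by n - 1
summary = five_number_summary(candies_per_bag)

print(f"mean = {mean:.2f}, sample SD = {sample_sd:.2f}")
print("five-number summary:", summary)
```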

2. Histogram:

3. Boxplot:

4. Number of Candies:
After viewing both graphs, it is interesting to notice that they appear skewed in different
directions. The histogram is skewed slightly to the left, whereas the boxplot looks clearly
skewed to the right. I had originally expected the graphs to be skewed in the same
direction, but then I remembered that a histogram includes the outliers in its bars, while
a boxplot uses fences and marks the outliers separately, so they do not stretch the box and
whiskers. Judging from the boxplot alone, the shape of the distribution is skewed to the
right. When comparing the graphics to the summary statistics, however, I believe the
histogram aligns better with the median and mean, since the median (59) is larger than the
mean (58.79), which normally means the graph is skewed to the left.

DISCUSSION: Qualitative versus Quantitative Data


As a group we agreed that the data was quantitative, mostly because the graphs we designed
only work with quantitative data. We then discussed the outcome of our graphs, and my
results were similar to those of my group. The graphs gave me a visual understanding that
outliers can distort the mean and standard deviation, since neither is resistant, which is
why the histogram was skewed to the left. Our textbook states that the five-number summary
is resistant to extreme values. Because we use the interquartile range to determine the
fences, we can identify the outliers, which typically reflect either an error in data
recording or, in our case, unusually large or small observations. My instructor then made
it clear that we should not exclude the outliers, since they are legitimate values.
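
To make the fence idea concrete, here is a short Python sketch that flags outliers in the class data. It assumes the usual 1.5 × IQR rule for the fences, which I believe is the one our textbook uses, and takes Q1 and Q3 from the five-number summary in Part 3. The variable names are my own.

```python
# Class data of candies per bag, copied from Part 4.
candies_per_bag = [
    59, 61, 61, 56, 59, 59,
    59, 59, 59, 59, 59, 54,
    60, 60, 58, 58, 60, 54,
    59, 56, 60, 62, 59, 61,
]

# Quartiles taken from the five-number summary in Part 3.
q1, q3 = 58.5, 60

# Assumed rule: fences sit 1.5 interquartile ranges beyond the quartiles.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any bag count outside the fences is flagged as an outlier.
outliers = sorted(
    count for count in candies_per_bag
    if count < lower_fence or count > upper_fence
)

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print("outliers:", outliers)
```

Under these assumptions all flagged bags fall below the lower fence, which fits the observation that the mean is pulled below the median.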

After completing the summary stats, it was interesting to see that each bag does not
contain the same number of candies. The counts are close to one another but not always
identical. Putting the data into graphs made it clear that these graphs would not work with
qualitative data. Specific graphs only work for a certain type of data, and the data we
graphed for this part could only be identified as quantitative. After this section we moved
on to learning about confidence intervals and were able to calculate an interval for the
proportion of yellow Skittles and for the mean number of Skittles per bag.
PART 4: Confidence Intervals
1. Proportion Yellow Candies: 
We are looking for the proportion of candies that are yellow. There are 1411 total candies,
and 286 of them are yellow. To find a confidence interval for the proportion we will use the
1-PropZInt function on our calculators.

1-PropZInt
x: 286
n: 1411
C-Level: 0.99
(0.17513, 0.23026)
p-hat= 0.2026931254
n= 1411
This is correct, and we can verify it by checking p-hat:
286/1411 = 0.2026931254
We are 99% confident that the interval (0.17513, 0.23026) contains the true proportion of
candies that are yellow.
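
The calculator result can also be reproduced by hand with the standard one-proportion z-interval formula, p-hat ± z* × sqrt(p-hat × (1 − p-hat) / n). The Python sketch below is my own illustration of that formula, not the calculator's actual code; the function name `one_prop_z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_interval(x, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = x / n
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 286 yellow candies out of 1411, at the 99% confidence level.
low, high = one_prop_z_interval(286, 1411, 0.99)
print(f"({low:.5f}, {high:.5f})")
```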
2. Mean Number of Candies: 
Our Class data of candies per bag

59 61 61 56 59 59
59 59 59 59 59 54
60 60 58 58 60 54
59 56 60 62 59 61

I used the T Interval function to estimate the population mean number of candies per bag.

T Interval - Data            T Interval - Stats
(57.938, 59.645)             x-bar: 58.79
x-bar = 58.79                Sx: 2.02
Sx = 2.02                    n: 24
n = 24                       c-level: 0.95
                             (57.938, 59.645)
We are 95% confident that the population mean number of candies per bag is between 57.938
and 59.645, or roughly between 58 and 60.
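
As a check on the calculator, the interval can be rebuilt from the formula x-bar ± t* × Sx / sqrt(n). In the Python sketch below, the critical value t* = 2.0687 for 23 degrees of freedom is an assumption typed in from a t-table, because Python's standard library has no t-distribution function. The variable names are my own.

```python
from math import sqrt
import statistics

# Class data of candies per bag, copied from the table above.
candies_per_bag = [
    59, 61, 61, 56, 59, 59,
    59, 59, 59, 59, 59, 54,
    60, 60, 58, 58, 60, 54,
    59, 56, 60, 62, 59, 61,
]

# Assumed t-table value for 95% confidence with n - 1 = 23 degrees of freedom.
T_STAR_95_DF23 = 2.0687

n = len(candies_per_bag)
x_bar = statistics.mean(candies_per_bag)
s_x = statistics.stdev(candies_per_bag)  # sample standard deviation

# Margin of error is the critical value times the standard error of the mean.
margin = T_STAR_95_DF23 * s_x / sqrt(n)
t_low, t_high = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar:.2f}, Sx = {s_x:.2f}, n = {n}")
print(f"({t_low:.3f}, {t_high:.3f})")
```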
DISCUSSION: What is a Confidence Interval?
The purpose of a confidence interval is to give a range of values that is likely to contain
a population parameter, together with a confidence level that describes how often the
method captures the true value. It is not the probability that the parameter falls in one
particular interval. Factors that affect the width of a confidence interval include the
sample standard deviation, the confidence level, and the sample size. Increasing the
standard deviation or the confidence level widens the interval, while increasing the sample
size narrows it.
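
The effect of each factor on the width can be seen by changing one input at a time. The Python sketch below is only an illustration: to stay within the standard library it uses the normal critical value z* in place of t*, so the widths are slightly narrower than a T Interval would give, and the alternative values (99% confidence, a standard deviation of 4.04, a sample of 96 bags) are made up for comparison.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(sd, n, confidence):
    """Full width of a z-based interval for a mean: 2 * z* * sd / sqrt(n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z_star * sd / sqrt(n)


# Baseline uses the class values: Sx = 2.02, n = 24, 95% confidence.
baseline = interval_width(2.02, 24, 0.95)

# Change one factor at a time (hypothetical values for comparison only).
higher_confidence = interval_width(2.02, 24, 0.99)
larger_sd = interval_width(4.04, 24, 0.95)
larger_sample = interval_width(2.02, 96, 0.95)

print(f"baseline width:        {baseline:.3f}")
print(f"99% confidence:        {higher_confidence:.3f}")
print(f"doubled std deviation: {larger_sd:.3f}")
print(f"four times the sample: {larger_sample:.3f}")
```

Raising the confidence level or the standard deviation widens the interval, while a larger sample narrows it: doubling the standard deviation doubles the width, and quadrupling the sample size cuts it in half.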
In summary, this was an interesting project to have for the entire semester. At first, I
thought we would apply only one concept in this project and then start a new project for a
different concept. By having only one project, we were able to see how many of these
concepts can be applied to a single data set gathered at the beginning of the semester. It
demonstrates how these concepts, when understood and applied correctly, can be used to draw
predictions and conclusions about a larger population from a small sample. It helped me
become aware that we cannot jump to conclusions from just one bag of Skittles, and that it
is safer to make predictions when there is more data. This project also helped me
understand how to apply statistics in the real world rather than just using made-up numbers
from an unrealistic story problem.
