
Taylor Wilson

Skittles Project Math 1040

February 28, 2017

Skittles Term Project

Introduction:

For my Math 1040 class, we were assigned a statistical analysis of the number of
candies of each color in a 2.7 oz bag of Skittles. The class had 22 students, each
of whom collected data from their own bag. I have compiled the data in several
forms below.

Data Collection:

Number of Candies by Color

Red    Orange    Yellow    Green    Purple
13     16        12        5        12

Organizing and Displaying Categorical Data: Colors

I recorded the counts reported by the class and computed the proportion of each
color. I then created a pie chart and a Pareto chart of the number of candies of
each color. To create the pie chart, I listed the color categories in the first five
cells of column one. In the second column I listed the total number of candies of
each color.
Pie Chart - number of Skittles by color
[Pie chart: red 279, purple 258, yellow 257, green 254, orange 252]

Table describing the data results and proportion for the entire class by
color:

Number of Candies by Color (entire class)

Red    Orange    Yellow    Green    Purple
279    252       257       254      258
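
The proportions for each color follow directly from the class totals in the table.
The short Python sketch below shows one way to compute them; the variable names are
my own and are only for illustration.

```python
# Class totals by color, taken from the table above.
counts = {"Red": 279, "Orange": 252, "Yellow": 257, "Green": 254, "Purple": 258}

# Total number of candies counted by the whole class.
total = sum(counts.values())

# Proportion of each color = count for that color / total count.
proportions = {color: n / total for color, n in counts.items()}

print(f"Total candies: {total}")
for color, p in proportions.items():
    print(f"{color:<7}{counts[color]:>5}{p:>8.3f}")
```

Each proportion is just that color's count divided by the total, so the five
proportions add up to 1.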

For the Pareto chart I did the same thing in Excel, but I used the descending count
option so that the bars would be ordered from greatest to least.

Pareto Chart - showing number of Skittles by color
[Pareto chart: bars in descending order (Red, Purple, Yellow, Green, Orange);
vertical axis from 235 to 285]
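
The descending order used for the Pareto chart can also be reproduced outside of
Excel. This is a minimal sketch in Python, assuming the class totals from the table;
the cumulative percentage column is an optional extra that Pareto charts often
include and was not part of my Excel chart.

```python
# Class totals by color, taken from the class table.
counts = {"Red": 279, "Orange": 252, "Yellow": 257, "Green": 254, "Purple": 258}

# Sort the categories by count, greatest to least (the Pareto ordering).
pareto = sorted(counts.items(), key=lambda item: item[1], reverse=True)

total = sum(counts.values())
running = 0
for color, n in pareto:
    running += n  # running total used for the cumulative percentage
    print(f"{color:<7}{n:>5}{running / total:>8.1%}")
```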

The class data and my own observations show that red candies were the most common
overall and orange candies were the least common. Before I began this project, I
thought the colors would be distributed more evenly than they turned out to be. In
my own bag of Skittles, red had the most candies and green had the fewest. I was
surprised to see so many fewer green candies than the others. The class reported a
variety of counts for green. It was very interesting to see the variety in each bag.
I had always thought that the colors would be distributed more evenly across the
bags of Skittles and less sporadically.

Organizing and Displaying Quantitative Data: Number of Candies per Bag


Column    n     Mean    Std. Dev.    Min    Q1    Median    Q3    Max
Total     22    58.6    2.53         53     58    60        61    63

My observation from this data is that the distribution is not normal; it is skewed
to the left. The boxplot showed no outliers in the data. After computing the
5-number summary, the graphs ultimately reflected what I expected to see.

Reflection

The difference between categorical and quantitative data here is that the colors of
the candies in each bag, which we graphed above, are categorical data, while the
number of candies per bag, which we graphed for this exercise, is quantitative data.
Categorical data are represented by names or labels, such as colors. They do not
count or measure anything the way quantitative data do.

The graphs that work best for categorical data are the pie chart and the Pareto
chart. These draw attention to the most important information. The best graphs to
use for quantitative data are the histogram and the boxplot. They show us whether
there are any outliers and what range of numbers we have to deal with.

Confidence Interval

A confidence interval is a range of values used in statistics to estimate a
population parameter. The confidence level, most commonly 95% or 99%, describes how
often intervals built this way would contain the true parameter over many repeated
samples. The confidence interval allows us to estimate the range in which our true
population parameter falls.


Interpretation of Confidence Intervals

The first problem was to construct a 99% confidence interval estimate for the true
proportion of yellow candies. Based on these calculations, we are 99% confident that
the interval from 0.162 to 0.218 contains the true population proportion. This means
that if we were to randomly select many different samples of the same size (1300
candies) and construct the corresponding confidence intervals, about 99% of them
would contain the true value of the population proportion.
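
As a rough check, the usual normal-approximation interval for a proportion can be
sketched in Python as below. The function name is my own, and the inputs (257 yellow
candies out of 1300) are assumed from the class totals table, so the result need not
match the interval quoted above exactly if that interval was computed from different
inputs or rounding.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed inputs: 257 yellow candies out of 1300 (class totals table).
low, high = proportion_ci(257, 1300, 0.99)
print(f"99% CI for the proportion of yellow: {low:.3f} to {high:.3f}")
```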

The second problem was to construct a 95% confidence interval estimate for the true
mean number of candies per bag. Based on these calculations, we are 95% confident
that the interval from 57.487 to 59.713 contains the true value of the mean number
of candies per bag in the population. This means that if we were to randomly select
many different samples of the same size (22 bags of Skittles) and construct the
corresponding confidence intervals, about 95% of them would contain the true value
of the population mean.
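
The interval for the mean can be approximated from the rounded summary statistics in
the table (n = 22, mean 58.6, standard deviation 2.53). The sketch below assumes a
critical value of 2.080, which is the standard t-table value for 95% confidence with
21 degrees of freedom. Because the inputs are rounded, the endpoints may differ
slightly from the ones reported above.

```python
from math import sqrt

# Summary statistics from the table (rounded values).
n = 22        # number of bags
mean = 58.6   # sample mean candies per bag
s = 2.53      # sample standard deviation

# Assumption: t critical value for 95% confidence and n - 1 = 21 degrees
# of freedom, taken from a standard t table.
T_CRIT = 2.080

# Margin of error = t * s / sqrt(n).
margin = T_CRIT * s / sqrt(n)
low, high = mean - margin, mean + margin
print(f"95% CI for the mean candies per bag: {low:.3f} to {high:.3f}")
```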

Hypothesis Testing

Hypothesis testing is a method for testing a claim about a population by weighing
the sample evidence against the null hypothesis. The null hypothesis is rejected
when the P-value falls below the significance level, which in turn means that the
result is statistically significant at that level.

Interpretation of Hypothesis Testing


The first hypothesis test used a 0.05 significance level to test the claim that 20%
of all Skittles candies are red; the alternative hypothesis is that the proportion
of red candies is not 20%. This is a two-tailed test. I found that the test
statistic, 1.32, is less than the critical value of 1.96, so the result was to fail
to reject the null hypothesis. There is not sufficient evidence to warrant rejection
of the claim that 20% of all Skittles candies are red.
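
The test statistic for this claim can be recomputed from the class totals. This is a
minimal sketch, assuming 279 red candies out of 1300 (from the class table) and a
0.05 significance level, which is the level that corresponds to the critical value
of 1.96; the function and variable names are my own.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using the normal approximation."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Assumed inputs: 279 red candies out of 1300; claimed proportion 0.20.
z = one_proportion_z(279, 1300, 0.20)

alpha = 0.05                                   # assumed significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed P-value

# Reject H0 only when the statistic falls beyond the critical value.
reject_null = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, P-value = {p_value:.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

In a two-tailed test the null hypothesis is rejected only when the absolute value of
the test statistic is larger than the critical value.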

The second hypothesis test used a 0.01 significance level to test the claim that the
mean number of candies in a bag of Skittles is 55. The alternative hypothesis is
that the mean number of candies in a bag is not 55. This is a two-tailed test, and I
found that the test statistic, 6.675, is greater than the critical value of 2.797,
so we reject the null hypothesis. There is sufficient evidence to warrant rejection
of the claim that the mean number of candies in a bag of Skittles is 55.
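
The t statistic for this test can be approximated from the rounded summary
statistics in the table (n = 22, mean 58.6, standard deviation 2.53). The sketch
below simply reuses the critical value of 2.797 quoted in this report rather than
looking one up, and the variable names are my own.

```python
from math import sqrt

# Summary statistics from the table (rounded values).
n = 22        # number of bags
mean = 58.6   # sample mean candies per bag
s = 2.53      # sample standard deviation
mu0 = 55      # claimed population mean under H0

# t statistic = (sample mean - claimed mean) / (s / sqrt(n)).
t_stat = (mean - mu0) / (s / sqrt(n))

# Critical value as quoted in the report (not recomputed here).
t_crit = 2.797

# Two-tailed decision: reject H0 when |t| exceeds the critical value.
reject_null = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, critical value = {t_crit}")
print("Reject H0" if reject_null else "Fail to reject H0")
```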

Conditions

The conditions for doing interval estimates are, first, that the sample is a random
sample and, second, that the population is normally distributed or n > 30. Our
sample of data did meet these criteria. Pooling all of our data gave a sample large
enough for the normal approximation to apply; our sample size, n, was 1300. There
are always possibilities for error. One example is counting errors, which could have
occurred in counting the colors in each bag or in counting the total number of
candies per bag. The sampling method could have been improved by increasing the
sample size. It could also have been improved by buying the bags of Skittles from
various stores around Utah, or possibly by getting some bags from different states
or even other countries.


From my research I have drawn the conclusion that the colors of Skittles are fairly
evenly proportioned across the bags. I have also found that the number of Skittles
in each bag is close to the mean we found from our data.

Reflection on Term Skittles Project

I remember when the class first began and we were given this assignment. It was full
of terms that I had never seen or heard before in my life. I did not think that I
was capable of completing this type of math. During the semester we were taught the
principles and given the formulas needed to complete this project, along with other
assignments.

This project changed my views and the way I think about math in the real world.
Before this semester I felt that I would never need to know any of this information
in my life. Since I am going into the dental field, I figured that these formulas
would be pointless for me. As the semester went on, I was able to grasp the
concepts. I was able to apply them and work through equations that I never thought
I could solve.

Statistics is not only used in this class; it is used in our everyday lives. It is
used in weather forecasting, where prior weather conditions are compared with
current conditions to predict future weather. It is also used in quality testing,
where a sample of a product is tested to see whether it passes the quality check
before it is issued to the buyer. Statistics is also used in the medical field and
in genetic testing. I did not realize how much power statistics really has. It is
not just a bunch of formulas and odd thinking. We need statistics to find out some
very important things in our daily lives.

I feel that statistics has expanded my mind and developed my problem-solving
skills. It has made me more able to think deeply and outside of the box. It is a
neat concept to be able to figure out probabilities and proportions with a simple
formula or table.
