Laura Cano

Term project
Data were collected from 40 bags of Skittles. Each student received at least one bag,
counted the total number of candies in it, and sorted and counted the candies by
color. Each student then submitted the data to the teacher, who compiled it into an
Excel spreadsheet. That data was then released back to the students for analysis.
The following charts show the breakdown of candy colors based on the 40 bags of
Skittles and a total of 2437 candies.

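[Figure: pie chart, "Class skittles color proportion". Legend: red, orange, yellow,
green, purple. Slice values, in the order they appear on the chart rather than legend
order: 0.205, 0.188, 0.185, 0.222, 0.200.]

[Figure: bar chart, "Class skittles color proportion". Categories: red, orange,
yellow, green, purple. Vertical axis: proportion, 0.160 to 0.240.]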

My own bag contained 62 Skittles in total, with the following breakdown of colors:
13 red candies, 13 orange candies, 18 yellow candies, 11 green candies and 7 purple
candies. To compare my personal bag proportions to the class data, a chart has been
constructed.

[Figure: pie chart, "My bag of skittles". red 0.210, orange 0.210, yellow 0.290,
green 0.177, purple 0.113.]

[Figure: bar chart, "My bag of skittles". Categories: red, orange, yellow, green,
purple. Vertical axis: proportion, 0.000 to 0.400.]
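
The proportions charted for my bag follow directly from the color counts given
earlier. As a minimal sketch of that arithmetic (the names my_bag and
color_proportions are illustrative and not part of the original project), each count
is divided by the bag total:

```python
# Color counts for the single bag described in the text (62 candies in total).
my_bag = {"red": 13, "orange": 13, "yellow": 18, "green": 11, "purple": 7}

def color_proportions(counts):
    """Return each color's share of the bag, rounded to three decimals."""
    total = sum(counts.values())
    return {color: round(n / total, 3) for color, n in counts.items()}

proportions = color_proportions(my_bag)
for color, share in proportions.items():
    print(f"{color:<7}{share:.3f}")
```

The same function could be applied to the class totals if the per-color counts for
all 40 bags were available.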

The color distribution in my bag differed from the class distribution, although it
was reasonably close. This is what I expected to see: the larger data set shows a
more even distribution of color than a single bag. The data above show the
categorical distribution; the quantitative data, displayed below, deal with the
number of Skittles per bag. My bag, which contained 62 candies, agrees with the class
data, whose median was 61. The overall class data were close to a normal
distribution, with some low outliers. There were 40 bags of candy in the class data,
with a total of 2437 candies, summarized below.

Candies per bag (class data): min 54, Q1 60, median 61, Q3 62, max 64

In reflection, the difference between categorical and quantitative data is shown
best in the graphs above. Categorical data describe non-numerical attributes, such as
color in the case of the Skittles. We group the candies by color and make
comparisons between the different colors or between different bags of candy.
Quantitative data are numerical. We can summarize and analyze these data to
understand how the number of Skittles varies from bag to bag and what the average
number of candies is. Using the data gathered from our sample of 40 bags, we can
estimate approximately how many candies a bag of Skittles will contain.

A confidence interval is a range of values, computed from a sample, that we expect to contain
the population parameter. If samples are drawn and confidence intervals constructed many
times, the proportion of these intervals that contain the true value of the parameter will match the
confidence level. The general purpose is to express how well our sample reflects the true
population.
The 95% confidence interval estimate for the true proportion of purple Skittles is as follows.
There were 500 purple candies out of a total of 2437 candies. The sample proportion is .205
and the confidence interval is (.1891 < p < .2212), so we are 95% confident that the true
proportion of purple Skittles is between 18.91% and 22.12%.
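
The purple interval can be reproduced from the two counts given in the text. The sketch
below assumes the usual normal-approximation interval for a proportion; the project does not
say which method was used, and the function name proportion_ci is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 500 purple candies out of 2437, as reported in the text.
low, high = proportion_ci(500, 2437)
print(f"{low:.4f} < p < {high:.4f}")  # prints 0.1891 < p < 0.2212
```

Under that assumption the result agrees with the reported interval to four decimal places.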
When considering the mean number of Skittles contained in each bag, we look at our data for all
40 bags of candy, which have a mean of 60.925 Skittles per bag. With a margin of error of .826
at the 99% confidence level, we build a confidence interval for the mean of
(60.099 < mu < 61.751). We are 99% confident that the population mean falls in this range.
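
The interval for the mean is the sample mean plus and minus the margin of error. A short
sketch using only the figures reported in the text (the variable names are illustrative):

```python
# Figures reported in the text.
total_candies = 2437
bags = 40
margin_of_error = 0.826  # reported margin of error at 99% confidence

mean_per_bag = total_candies / bags
low = mean_per_bag - margin_of_error
high = mean_per_bag + margin_of_error

print(f"mean = {mean_per_bag:.3f}")
print(f"{low:.3f} < mu < {high:.3f}")
```

The margin of error itself would come from a t critical value times the sample standard
deviation divided by the square root of 40; the sample standard deviation is not reported in the
text, so that step is not reproduced here.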
The 98% confidence interval for the standard deviation of the number of candies in the bags of
Skittles is (1.510 < sigma < 2.560).
A hypothesis test is a way to test a claim about a population parameter.

Using a .01 significance level, we test the claim that 20% of the Skittles are green. The test
does not give us enough evidence to reject this claim, so at the .01 significance level we fail
to reject the hypothesis that 20% of the Skittles are green.
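
The mechanics of a test like this can be sketched as a two-sided one-proportion z test. The
class count of green candies is not given in the text, so the example below runs the same test
on the purple count (500 of 2437) purely as an illustration; the function name and the choice
of a two-sided normal-approximation test are assumptions, not details from the project.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alpha):
    """Two-sided z test of H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    # Under H0 the standard error uses the claimed proportion p0.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return z, critical, abs(z) > critical

# Illustration only: this is the purple count from the text, NOT the green count.
z, critical, reject = one_proportion_z_test(500, 2437, p0=0.20, alpha=0.01)
print(f"z = {z:.3f}, critical value = {critical:.3f}, reject H0: {reject}")
```

The claim is rejected only when the absolute value of z exceeds the critical value, which is
about 2.576 at the .01 level.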
Using the .05 significance level, we test the claim that the mean number of candies in a bag of
Skittles is 56. Our sample mean is 60.925 candies per bag. The critical value is 2.023 and the
t value obtained was 16.057. Because the test statistic is far beyond the critical value, there
is sufficient evidence at the .05 significance level to reject the claim that the mean number of
candies per bag is 56.
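
The decision rule can be sketched with the one-sample t statistic. The sample standard
deviation is not reported in the text; the value 1.94 used below is an assumption, back-solved
from the reported t value together with the class mean of 60.925 given with the confidence
interval for the mean, and is included only so the example runs.

```python
from math import sqrt

def t_statistic(sample_mean, claimed_mean, sample_sd, n):
    """One-sample t statistic for H0: mu = claimed_mean."""
    return (sample_mean - claimed_mean) / (sample_sd / sqrt(n))

# ASSUMPTION: sample_sd is not reported in the text; 1.94 is back-solved
# from the reported t value and is used here only for illustration.
t = t_statistic(sample_mean=60.925, claimed_mean=56, sample_sd=1.94, n=40)

critical_value = 2.023  # reported critical value (two-tailed, alpha = .05)
reject = abs(t) > critical_value
print(f"t = {t:.2f}, reject H0: {reject}")
```

With these inputs the statistic lands close to the reported 16.057, far beyond the critical
value, so the claim of 56 candies per bag is rejected.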
In reflecting upon the conditions for doing interval estimates and hypothesis tests for population
proportions, we see that the samples met the conditions we assumed they would meet. The class
data and my individual sample were closely matched.

REFLECTION
Upon reflection, I have learned several things from this term project. The most valuable aspect
of this project to me personally was using the data obtained and learning how to make use of it in
a way that made this class more interesting and more applicable. I think it is especially
important in math classes that students have the opportunity to apply the math they are
learning. I took statistics in 1997 and really didn't have the perspective I was able to get from
direct application. The other benefit I received was learning how to use the computer better. We
live in an age where computers and technology are vital, but when I obtained my undergraduate
degree I rarely had to do anything but type a paper in a Word document. This was a very good
opportunity to use Excel and the tools needed to create graphs and showcase the data in a
clean, presentable way.
The math skills that I obtained in this class have helped me better understand the type of data that
will be useful in my current education path. Many times research studies are done for
different reasons, and as I have looked over some of the data, it makes more sense to me. I am
sure these applicable skills will continue to grow as I further my education, as I feel this class has
provided a good fundamental base for additional learning.
The other classes this class will apply to are sure to come in graduate school. I have high
hopes of using all of this education to understand what I read as I learn
more about medicine and the data that has already been published, as well as populations that
have patho-physiological conditions.
In reflecting upon the problem-solving skills I was able to obtain, I believe the strongest
lesson was to question all data to a certain degree. I think it is important to really reflect and
consider many different things before accepting statistical data, even if it has been published,
because there are variables that can really influence the data and skew it.
I have always enjoyed math, and I have found it useful in understanding the world around me. I
have found this class to be a way to understand and think more about real-life statistics. I find
myself more curious about certain things that I was not curious about before this class. I
believe that is the greatest gain I have been able to take with me from this class. The
additional love for learning and curiosity has been enjoyable, although challenging too.
