Skittles Term Project

Amber D. Voorhies

Introduction

For our Statistics 1040 final project, our class was instructed to observe and analyze data gathered from bags of Skittles candy that we opened and counted. We collected data from 40 individual 2.17-ounce bags of Skittles and then collectively analyzed the proportions of the different colors. We calculated the mean, the standard deviation, and other summary statistics (range, minimum, maximum, quartiles, sum, and count). We then organized the data into tables and graphs to illustrate the results. We further investigated our Skittles data by constructing confidence intervals and performing hypothesis tests. We are learning about these methods and how to apply them to real-life examples.

Colors

The pie chart and Pareto chart are good visual aids for the Skittles data our class collected. Each graph shows the proportions and percentages of the colors and allows for visual comparison. The graphs depict the data adequately and show what I expected to see from the data we collected. The overall data collected by the whole class does not agree closely with the data from my own single bag of candies: my proportions are much more uneven than those of the class data. One example is the percentage of green candies. My bag had 4 green candies, which is 6.557% of my bag's total of 61, whereas the class counted 452 green candies, which is 18.547% of the class total of 2,437. That said, in both the class data and my single bag, orange was the most common color and green was the least common. The graphs and data can be seen below.
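As a quick check on the percentages quoted above, the short sketch below recomputes each color's share from the counts reported in this project. The names `single_bag`, `class_total`, and `percentages` are illustrative only and are not part of the original project.

```python
# Color counts reported in the project: my single bag and the class totals.
single_bag = {"Red": 10, "Orange": 19, "Yellow": 15, "Green": 4, "Purple": 13}
class_total = {"Red": 457, "Orange": 541, "Yellow": 487, "Green": 452, "Purple": 500}


def percentages(counts):
    """Return each color's share of the total count as a percentage."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


single_pct = percentages(single_bag)
class_pct = percentages(class_total)

# Print the two sets of percentages side by side for comparison.
for color in single_bag:
    print(f"{color:<7} single {single_pct[color]:6.3f}%   class {class_pct[color]:6.3f}%")
```

Printing the two columns side by side makes it easy to see how much more uneven the single bag is than the class total.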

[Pie chart: Skittle Data Sample by Color. Red 19%, Orange 22%, Yellow 20%, Green 19%, Purple 21%]

Proportions

Color     Proportion
Red       .188
Orange    .222
Yellow    .200
Green     .185
Purple    .205

[Pareto chart: percentage of the class sample by color, in descending order. Orange 22.199% (.222), Purple 20.517% (.205), Yellow 19.984% (.200), Red 18.753% (.188), Green 18.547% (.185)]

Single Bag

Color     # of Candies   Percentage
Red       10             16.393%
Orange    19             31.148%
Yellow    15             24.590%
Green     4              6.557%
Purple    13             21.311%
Total     61             100.00%

Class Samples

Color     # of Candies   Percentage
Red       457            18.753%
Orange    541            22.199%
Yellow    487            19.984%
Green     452            18.547%
Purple    500            20.517%
Total     2437           100.00%

The Number of Candies per Bag

Class Sample

                     Red     Orange  Yellow  Green   Purple  Total
Mean                 11.4    13.5    12.2    11.3    12.5    60.9
Standard Deviation   2.93    3.65    2.75    3.37    3.06    1.93
Range                12.0    13.0    11.0    14.0    13.0    10.0
Minimum              7.0     7.0     7.0     4.0     6.0     54.0
Maximum              19.0    20.0    18.0    18.0    19.0    64.0
Quartile 1           9.0     11.0    10.0    9.0     11.0    60.0
Quartile 3           13.0    15.5    15.0    14.0    14.3    62.0
Sum                  457.0   541.0   487.0   452.0   500.0   2437.0
Count                40.0    40.0    40.0    40.0    40.0    40.0

[Histogram: Number of Skittle Candies Per Bag. Horizontal axis: Total Number of Skittles Per Bag (bins). Vertical axis: Frequency (number of individual Skittle bags)]

The shape of this distribution is roughly a bell curve, that is, approximately normal with a few outliers. The overall rankings in the single-bag and class data sets do agree with one another: the most common color (orange) and the least common color (green) coincide, as does the second least common color (red). As far as proportions and percentages go, I would have expected the numbers for the single bag and the class to be closer than they were. My single bag contained 61 candies in total, compared with the class mean of 60.9 candies per bag. On total count, then, my single bag agrees closely with the class data.
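The remark about outliers can be checked against the class table with a small sketch. It assumes the common 1.5 × IQR boxplot rule for flagging outliers, which the project does not state, and it uses only the summary values for total candies per bag (the raw bag counts are not reproduced here).

```python
# Summary values for total candies per bag, taken from the class table.
q1, q3 = 60.0, 62.0
minimum, maximum = 54.0, 64.0

# Assumption: the usual 1.5 * IQR rule for boxplot outlier fences.
iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # values below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr     # values above this are flagged as outliers

has_low_outlier = minimum < lower_fence
has_high_outlier = maximum > upper_fence

print(f"IQR = {iqr}, fences = ({lower_fence}, {upper_fence})")
print(f"low outlier present: {has_low_outlier}, high outlier present: {has_high_outlier}")
```

Under that rule the smallest bag (54 candies) falls below the lower fence, while the largest bag (64 candies) stays inside the upper fence, so any outliers sit on the low side.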

[Boxplot: Number of Skittle Candies Per Bag. Axis scale from 48 to 66]

Reflection

Two types of data we have studied this semester are quantitative and categorical. According to Elementary Statistics, 12th Edition, quantitative data "consists of numbers representing counts or measurements," whereas "Categorical data consist of names or labels not numbers representing counts or measurements."

Graphs that make more sense for categorical data are pie charts and Pareto charts or bar graphs. These graphs depict categories very well visually. Quantitative data are better represented with graphs such as boxplots, histograms, frequency polygons, ogives, and scatterplots. These graphs depict numbers and show trends and patterns, and they represent numerical data much better than pie or Pareto charts do.

Calculations that make sense for categorical data are counts, sums of counts, proportions, and percentages for each category. These numbers are what pie and Pareto charts display, so they give us solid visual representations of how the categories compare. Calculations that are helpful for quantitative data include the mean, the standard deviation, and summary values such as the range, minimum, maximum, and quartiles. The mean is a measure of the center of our data, which for a bell-shaped distribution is the top of the curve. The center is not easily represented on a pie chart. This is why boxplots, histograms, frequency polygons, ogives, and scatterplots are better suited to quantitative data: these graphs allow us to see that center. According to Elementary Statistics, 12th Edition, the standard deviation "is a measure of how much data values deviate from the mean." This means a lot numerically because it gives us a standard to evaluate with. Pie and Pareto charts do not show outliers that fall far from the mean, but boxplots, histograms, frequency polygons, ogives, and scatterplots do.

To summarize, pie and Pareto charts are good for categorical data, and boxplots, histograms, frequency polygons, ogives, and scatterplots are better for quantitative data.

Confidence intervals are ranges of values that are used to "estimate the true value of a population parameter" (Elementary Statistics, pg 325). They give us boundaries that help us see whether a claimed value for the parameter is reasonable.

Discussion of the Three Interval Estimates

The first confidence interval we constructed was a 95% confidence interval for the true proportion of purple candies. Our result is .194 < p < .226. The interval is built around the sample proportion of purple candies, 500 out of 2,437, which we rounded to .21, so we are 95% confident that the true proportion of purple candies lies in this interval.
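A minimal sketch of the interval for a proportion, assuming the usual normal-approximation formula (sample proportion plus or minus z times the standard error). The function name `proportion_ci` is illustrative and not part of the original project.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Purple candies in the class sample: 500 out of 2437.
low, high = proportion_ci(500, 2437)
print(f"{low:.3f} < p < {high:.3f}")
```

With the unrounded sample proportion (about .205) the interval comes out slightly lower, at roughly .189 to .221. The reported interval of .194 to .226 follows from rounding the sample proportion to .21 before computing the margin of error.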

The second confidence interval we constructed estimates the true mean number of candies per bag: 60.074 < μ < 61.726. Our sample mean of 60.9 lies at the center of this interval.
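The interval for the mean can be reproduced with the t-based formula, the sample mean plus or minus t times s divided by the square root of n. The confidence level is not stated in this write-up. The margin of about 0.826 is consistent with a 99% level (table value t ≈ 2.708 for 39 degrees of freedom), so the sketch below treats that level as an assumption.

```python
from math import sqrt

n = 40          # number of bags
x_bar = 60.9    # sample mean of candies per bag (class table)
s = 1.93        # sample standard deviation (class table)

# Assumption: 99% confidence, using the t-table value for df = 39.
t_crit = 2.708

margin = t_crit * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin
print(f"{low:.3f} < mu < {high:.3f}")
```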

The third confidence interval we constructed was a 98% confidence interval to estimate the standard deviation of the number of candies per bag: 1.510 < σ < 2.560. The standard deviation we calculated from our class data, 1.93, falls within this interval.
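The interval for the standard deviation uses the chi-square distribution: the lower limit is the square root of (n − 1)s² divided by the right-tail critical value, and the upper limit uses the left-tail value. The sketch below hard-codes chi-square table values for the 40-degrees-of-freedom row (the nearest row in many printed tables to the actual 39). That choice is an assumption, but it does reproduce the reported limits.

```python
from math import sqrt

n = 40      # number of bags
s = 1.93    # sample standard deviation of candies per bag

# Assumption: 98% confidence with chi-square table values from the df = 40 row.
chi2_right = 63.691   # area 0.01 to the right
chi2_left = 22.164    # area 0.99 to the right

low = sqrt((n - 1) * s**2 / chi2_right)
high = sqrt((n - 1) * s**2 / chi2_left)
print(f"{low:.3f} < sigma < {high:.3f}")
```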

Hypothesis Tests

Hypothesis tests are used to test claims made "about a property of a population" (Elementary Statistics, pg 382). They give us a good indication of whether we need to re-evaluate the initial claim.

Discussion

For the first hypothesis test, we used a 0.01 significance level to test the claim that 20% of all Skittles candies are green. The null hypothesis is p = .20 and the alternative hypothesis is p ≠ .20. Our significance level is α = .01, so α/2 = .005 in each tail, which gives critical values of -2.575 and 2.575. We then solved for our test statistic and got -1.79. Because -1.79 lies between the critical values, we fail to reject the null hypothesis. There is not sufficient evidence to warrant rejection of the claim that 20% of all Skittles candies are green.
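The test statistic for this first test can be sketched as a one-proportion z test, using the class count of 452 green candies out of 2437. The function name `one_proportion_z` is illustrative. The critical value is computed from the standard normal distribution rather than read from a table, so it comes out near 2.576 instead of the table's 2.575.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """z statistic for testing H0: p = p0 with the normal approximation."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


alpha = 0.01
z = one_proportion_z(452, 2437, 0.20)

# Two-tailed critical value for the chosen significance level.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z) > z_crit

print(f"z = {z:.2f}, critical values = ±{z_crit:.3f}, reject H0: {reject_null}")
```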

For the second hypothesis test, we used a 0.05 significance level to test the claim that the mean number of candies in a bag of Skittles is 56. The null hypothesis is μ = 56 and the alternative hypothesis is μ ≠ 56. The critical values for our significance level are -2.023 and 2.023. Our test statistic is 16.057, which is far outside the critical values. We therefore reject the null hypothesis. We have enough evidence to conclude that the mean number of candies in a bag is not 56.
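The second test statistic follows from the one-sample t formula, using the class mean and standard deviation for total candies per bag. In this sketch the critical value 2.023 is taken from the write-up itself (it matches the t-table value for 39 degrees of freedom at α = 0.05, two-tailed) rather than computed.

```python
from math import sqrt

n = 40        # number of bags
x_bar = 60.9  # sample mean of candies per bag
s = 1.93      # sample standard deviation
mu0 = 56      # claimed mean under the null hypothesis

# One-sample t statistic: (x_bar - mu0) / (s / sqrt(n)).
t_stat = (x_bar - mu0) / (s / sqrt(n))

# Critical value as reported in the project (t table, df = 39, two-tailed 0.05).
t_crit = 2.023
reject_null = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, critical values = ±{t_crit}, reject H0: {reject_null}")
```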

Reflection

The requirements for estimating a population proportion p are that the sample is a simple random sample, there is a fixed number of trials, the trials are independent, there are two categories of outcomes, and the probability is constant for each trial. There also need to be at least 5 successes and 5 failures. The class samples do meet these requirements. The requirements for hypothesis tests concerning population proportions include np ≥ 5 and nq ≥ 5. Here n = 2437 and p = .21, so np = 2437 × .21 = 511.77 and nq = 2437 × .79 = 1925.23, both of which are greater than 5. The sample did fit these requirements.

The requirements for interval estimates of a population mean are that the sample is a simple random sample and that "Either or both the population is normally distributed or n > 30." Our sample was a simple random sample and our n is 40, which is greater than 30. If σ is not known and the population is normally distributed, or σ is not known and n > 30, the equation you use is

$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$

or, if σ is known,

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

The sample did meet these requirements.

The requirements for interval estimates and hypothesis tests for a population standard deviation are that the sample is a simple random sample and that the population is normally distributed, with no exceptions. This sample does meet these requirements.

One possible error in using this data is a type I error, which is "the mistake of rejecting a true null hypothesis" (Elementary Statistics, pg 393). In our Skittles example, this would occur if we rejected the claim that 20% of all Skittles candies are green when, in fact, the claim is true.

The other possible error is a type II error. This is "the mistake of failing to reject the null hypothesis [when it] is false" (Elementary Statistics, pg 392). In our Skittles example, this would occur if we failed to reject the claim that the mean number of candies in a bag of Skittles is 56 when the true mean is actually different from 56. Our calculations put 56 well outside the critical values, and the sample mean of our data is 60.9. A type II error is failing to reject the null hypothesis when it is actually false and ought to be rejected.

It is best to have a large sample to get the best results. Larger samples reduce the margin of error, and the distribution of sample means gets closer to a normal distribution as the sample size grows.

Conclusion-Reflective Writing

There are many math skills applied in this project that I have been able to identify and that will affect other classes I take in my school career.

I have recently completed the prerequisite courses required for my nursing school application. Statistics are used frequently in the medical field. Experiments and trials are performed in order to get new treatments and medications approved for patient care, and they are used to increase the quality of treatments.

Hypothesis tests are used to see whether or not the treatments that health professionals recommend are effective. These experiments and trials use subjects and are often conducted using a double-blind study method. In this class, I learned that in a double-blind study the subjects being tested do not know whether they are receiving a placebo or the medication being tested, and the administrators or doctors who provide the medication do not know which subjects get the placebo and which get the medicine. This type of statistical experiment helps provide unbiased results. Before the blind study is set up, researchers are able to construct hypothesis tests to help them see how effective these medications and treatments are. Methods based in statistics allow people to construct well-founded hypothesis tests and provide helpful results. This is imperative in the medical field. When an individual's life depends on an outcome, it is vital that we have accurate test results so that we can give people the best medical care.

Statistical methods such as confidence levels help medical professionals successfully explain treatment options and their advantages and disadvantages using evidence derived from statistical methods and formulas.

Using methods based in statistics allows researchers to pinpoint their successes and helps them make necessary changes. Researchers were able to do this very thing when the Salk vaccine experiment was conducted in 1954 (see Elementary Statistics, pages 28-29). It is now 2015, and polio is an illness rarely heard of. Children can be vaccinated to prevent the onset of polio. This is all because experiments were conducted and the results were analyzed and deemed reputable using methods founded in statistics.

Taking this course has helped me better understand how the medical community and governments make decisions about patient care. Decisions are not made based on emotions; they are based on facts and experiments that have been conducted in an unbiased manner. We have laws and codes that are based on statistical standards. Treatments must meet requirements based on those standards and then must be successfully repeated over and over. Statistics has provided a foundation for understanding how things are done, or rather how things ought to be done, in the medical community. This is extremely helpful for me as I pursue a career in the medical field. It has expanded my understanding.
