
Intro to Descriptive Statistics

Final Project
Jul 18, 2017
Dane David

1. Introduction

1.1 Descriptive Statistics


Descriptive statistics is the branch of statistics used to summarize data sets and to analyze
the data in a meaningful way, usually as a prerequisite to inferential statistics.
Briefly, descriptive statistics are the coefficients used to represent a given data set.
They comprise measures of central tendency (mean, median, mode, etc.) and measures of
variability (minimum and maximum values, range, IQR).

1.2 Pack of Cards


For the experiment, a standard pack of 52 playing cards was used. The cards are divided into
four suits, viz. Clubs (♣), Diamonds (♦), Hearts (♥) and Spades (♠), with each suit containing 13
cards: the Ace, the numbers 2 - 10, and the face cards Jack, Queen and King.
For this experiment, each card is assigned a value according to the table given below:

Card     Value
Ace      1
2        2
3        3
4        4
5        5
6        6
7        7
8        8
9        9
10       10
Jack     10
Queen    10
King     10

Table 1.1: Values for cards.

The table below shows the absolute frequency and relative frequency of the appearance of
each value in a standard pack of cards:

Value    Frequency    Relative Frequency
1        4            0.07692
2        4            0.07692
3        4            0.07692
4        4            0.07692
5        4            0.07692
6        4            0.07692
7        4            0.07692
8        4            0.07692
9        4            0.07692
10       16           0.30769
Total    52           1.0

Table 1.2: Frequency distribution table for a pack of cards.

The following chart shows a histogram which depicts the above data:
Chart 1.1: Histogram for a standard pack of cards.
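
The frequency table above can be reproduced with a short script. This is a minimal sketch rather than the method used for the report: the names RANK_VALUES, SUITS and deck_values are illustrative assumptions, and only the card values from Table 1.1 come from the text.

```python
from collections import Counter
from fractions import Fraction

# Card values as assigned in Table 1.1: Ace = 1, 2-10 at face value,
# Jack, Queen and King = 10. (Names below are illustrative.)
RANK_VALUES = {"A": 1, "J": 10, "Q": 10, "K": 10}
RANK_VALUES.update({str(n): n for n in range(2, 11)})
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# One value per card: 13 ranks x 4 suits = 52 cards.
deck_values = [value for value in RANK_VALUES.values() for _ in SUITS]

frequency = Counter(deck_values)
relative_frequency = {
    value: Fraction(count, len(deck_values))
    for value, count in sorted(frequency.items())
}

print("Value Frequency Relative Frequency")
for value, rel in relative_frequency.items():
    print(f"{value:>5} {frequency[value]:>9} {float(rel):.5f}")
```

Fractions are used so that the relative frequencies sum to exactly 1 before being rounded for display.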

2. Experiment
2.1 Collecting Samples
For the purpose of the experiment, samples are collected in the following manner:

1. The pack of cards is shuffled and three cards are drawn without replacement.
2. The three cards are recorded and returned to the deck.
3. Steps 1 and 2 are repeated 30 times to obtain the sample.

To see the full sample, please refer to Appendix A.
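
The sampling procedure can also be simulated. The sketch below is an illustration only: the report's sample was drawn separately (Appendix A), and the function names, the seed and the use of Python's random module are assumptions made for this example.

```python
import random

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]


def card_value(rank):
    """Value of a card as defined in Table 1.1."""
    if rank == "A":
        return 1
    if rank in ("J", "Q", "K"):
        return 10
    return int(rank)


def draw_samples(num_samples=30, cards_per_draw=3, seed=None):
    """Draw cards_per_draw cards without replacement, num_samples times.

    The deck list is never modified, which corresponds to returning the
    drawn cards to the deck before the next shuffle.
    """
    rng = random.Random(seed)
    deck = [(rank, suit) for suit in SUITS for rank in RANKS]
    samples = []
    for _ in range(num_samples):
        hand = rng.sample(deck, cards_per_draw)  # no repeats within a hand
        total = sum(card_value(rank) for rank, _ in hand)
        samples.append((hand, total))
    return samples


samples = draw_samples(seed=2017)  # seed chosen arbitrarily for repeatability
for hand, total in samples[:3]:
    print(hand, total)
```

A simulated run will not reproduce the hands listed in Appendix A; it only follows the same procedure.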

2.2 Sample Statistics

2.2.1 Measures of Central Tendency


The following table shows the measures of central tendency computed from the 30 sample sums
listed in Appendix A:

Measure    Value
Mean       19.86667
Median     20
Mode       25

Table 2.1: Measures of central tendency.
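
As a check, the values in Table 2.1 can be recomputed from the Sample Sum column of Appendix A. The sketch below assumes Python's standard statistics module; the variable names are illustrative.

```python
import statistics

# Sample sums copied from the "Sample Sum" column of Appendix A.
sample_sums = [
    16, 13, 20, 23, 27, 8, 25, 26, 12, 12,
    19, 25, 30, 30, 30, 21, 16, 20, 11, 13,
    25, 25, 24, 19, 18, 13, 11, 25, 16, 23,
]

mean_sum = statistics.mean(sample_sums)
median_sum = statistics.median(sample_sums)
mode_sum = statistics.mode(sample_sums)

print(f"Mean   {mean_sum:.5f}")
print(f"Median {median_sum}")
print(f"Mode   {mode_sum}")
```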

2.2.2 Measures of Variability


The following table shows the measures of variability computed from the 30 sample sums. The
variance and standard deviation are the population forms (dividing by n = 30):

Measure                       Value
Range                         22
Inter Quartile Range (IQR)    11.25
Variance                      39.31556
Standard Deviation            6.27021


Table 2.2: Measures of variability.
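
The values in Table 2.2 can be recomputed in the same way. This sketch makes two assumptions that happen to reproduce the reported figures: quartiles are interpolated with the "inclusive" method of statistics.quantiles, and the variance divides by n rather than n - 1.

```python
import statistics

# Sample sums copied from the "Sample Sum" column of Appendix A.
sample_sums = [
    16, 13, 20, 23, 27, 8, 25, 26, 12, 12,
    19, 25, 30, 30, 30, 21, 16, 20, 11, 13,
    25, 25, 24, 19, 18, 13, 11, 25, 16, 23,
]

value_range = max(sample_sums) - min(sample_sums)

# Quartiles with linear interpolation over the observed data (assumption).
q1, _, q3 = statistics.quantiles(sample_sums, n=4, method="inclusive")
iqr = q3 - q1

# Population variance and standard deviation (divide by n).
variance = statistics.pvariance(sample_sums)
std_dev = statistics.pstdev(sample_sums)

print(f"Range    {value_range}")
print(f"IQR      {iqr}")
print(f"Variance {variance:.5f}")
print(f"Std dev  {std_dev:.5f}")
```

Other quartile conventions give a slightly different IQR, so the method should be stated whenever the figure is reported.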

2.2.3 Visualizing the Data


The following histogram represents the sample distribution:

Chart 2.1: Histogram of the sample sums.

It can be seen that the shape of the histogram approaches that of a normal distribution,
although only roughly, because the number of samples is relatively small (30 draws). The shape
would be expected to become more normal as the number of samples is increased.
In comparison, the original distribution of card values was uniform except for the last value (10),
whereas this histogram is closer to a normal distribution. The difference arises because each
recorded value is the sum of three cards, and moderate sums can be formed in more ways than
extreme ones.
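
The sample histogram can be tabulated as text. This is only a sketch: the bin width of 5 is an assumption chosen for illustration and may differ from the bins used in Chart 2.1.

```python
from collections import Counter

# Sample sums copied from the "Sample Sum" column of Appendix A.
sample_sums = [
    16, 13, 20, 23, 27, 8, 25, 26, 12, 12,
    19, 25, 30, 30, 30, 21, 16, 20, 11, 13,
    25, 25, 24, 19, 18, 13, 11, 25, 16, 23,
]

BIN_WIDTH = 5  # assumed bin width, not taken from the report

# Key each bin by its lower edge: 5 covers 5-9, 10 covers 10-14, and so on.
bin_counts = Counter((s // BIN_WIDTH) * BIN_WIDTH for s in sample_sums)

for start in range(min(bin_counts), max(bin_counts) + BIN_WIDTH, BIN_WIDTH):
    count = bin_counts.get(start, 0)
    print(f"{start:>2}-{start + BIN_WIDTH - 1:<2} {'#' * count} ({count})")
```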

2.3 Sampling Distribution


Now we consider the distribution of the sample means, where each mean is taken over a sample of
size n=3. The following histogram represents the Sampling Distribution:

Chart 2.2: Histogram for Sampling Distribution.

By the Central Limit Theorem, the following values of the sampling distribution are found:

Mean (M) = μ = 6.62

Standard Deviation = σ = 2.09


Standard Error (SE) = σ/√n = 1.207
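
The three values above can be reproduced from the Appendix A data. The sketch below mirrors the report's arithmetic under stated assumptions: σ is taken to be the population standard deviation of the 30 sample means, and SE is that value divided by √n with n = 3.

```python
import math
import statistics

# Sample sums copied from the "Sample Sum" column of Appendix A.
sample_sums = [
    16, 13, 20, 23, 27, 8, 25, 26, 12, 12,
    19, 25, 30, 30, 30, 21, 16, 20, 11, 13,
    25, 25, 24, 19, 18, 13, 11, 25, 16, 23,
]

n = 3  # cards per sample
sample_means = [total / n for total in sample_sums]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)  # divides by the number of samples
standard_error = sd_of_means / math.sqrt(n)

print(f"Mean (M)            {mean_of_means:.2f}")
print(f"Standard Deviation  {sd_of_means:.2f}")
print(f"Standard Error (SE) {standard_error:.3f}")
```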
Below is the normal distribution curve corresponding to the data:

Chart 2.3: Normal Distribution curve for Sampling Distribution.


A normal distribution curve for the sample sums may also be obtained:

Chart 2.4: Normal Distribution curve for sample sums.

3. Future Estimates
Treating the sample sums as approximately normally distributed, we can make estimates and
predictions about the outcomes of future experiments. Two such estimates are considered here:

What is the approximate probability that you will get a draw value of at least
20?
To answer this question, we need to find the area under the normal distribution curve above
the value 20.
The steps involved in the calculation are:
• Calculate the z-score of 20:
The z-score can be calculated using the formula
z = (x - μ) / σ
where x = 20, and μ = 19.87 and σ = 6.27 are the mean and standard deviation of the sample sums
(Tables 2.1 and 2.2).
We get z = +0.02
• Look up the z-table to find the area under the normal distribution curve below the z-score:
The z-table is given in Appendix B.
From the table, the area under the curve for z < +0.02 is
P(z < +0.02) = 0.508
• Subtract this probability from one to obtain the required probability:
The required probability is given by
P(z ≥ +0.02) = 1 - 0.508 = 0.492
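
The same calculation can be done without a printed z-table. The sketch below assumes Python's statistics.NormalDist and uses the mean and standard deviation from Tables 2.1 and 2.2. Because it does not round z to two decimals, the result differs slightly from the table-based value.

```python
from statistics import NormalDist

mu = 19.86667     # mean of the sample sums (Table 2.1)
sigma = 6.27021   # standard deviation of the sample sums (Table 2.2)
threshold = 20

# z-score of the threshold, z = (x - mu) / sigma.
z = (threshold - mu) / sigma

# Area under the fitted normal curve at or above the threshold.
sums_model = NormalDist(mu, sigma)
p_at_least = 1 - sums_model.cdf(threshold)

print(f"z = {z:+.2f}")
print(f"P(sum >= {threshold}) is approximately {p_at_least:.3f}")
```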

Within what range will you expect approximately 90% of your draw values
to fall?
To answer this question, we need to find the z-scores of two points:
• The point below which 95% of the values lie.
• The point below which 5% of the values lie.
The x-values corresponding to these z-scores give the range within which 90% of the values
lie.
The steps are:
• Find the required z-scores:
The z-score below which 5% of data lie = -1.64
The z-score below which 95% of data lie = +1.65
• Compute the corresponding x-values using the equation
x = zσ + μ
with μ = 19.87 and σ = 6.27.
The x-value corresponding to -1.64 is 9.58
The x-value corresponding to +1.65 is 30.21

Thus, approximately, we can expect 90% of our future values to lie between 9.58 and 30.21.
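
The interval can also be computed directly from the inverse normal CDF. This is a sketch assuming Python's statistics.NormalDist; it uses z = ±1.645 rather than the table values -1.64 and +1.65, so the limits differ slightly from those above.

```python
from statistics import NormalDist

mu = 19.86667     # mean of the sample sums (Table 2.1)
sigma = 6.27021   # standard deviation of the sample sums (Table 2.2)

standard_normal = NormalDist()  # mean 0, standard deviation 1

# z-scores below which 5% and 95% of a standard normal distribution lie.
z_low = standard_normal.inv_cdf(0.05)
z_high = standard_normal.inv_cdf(0.95)

# Convert back to draw values with x = z * sigma + mu.
x_low = z_low * sigma + mu
x_high = z_high * sigma + mu

print(f"z-scores: {z_low:+.3f}, {z_high:+.3f}")
print(f"About 90% of sums fall between {x_low:.2f} and {x_high:.2f}")
```

Since three cards can total at most 30, an upper limit slightly above 30 reflects the normal approximation rather than an attainable sum.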
Appendix A
Sample

Sl No. Card 1 Value Card 2 Value Card 3 Value Sample Sum Sample Mean
1 3♣ 3 8♣ 8 5♦ 5 16 5.33
2 2♠ 2 5♦ 5 6♠ 6 13 4.33
3 6♥ 6 7♦ 7 7♣ 7 20 6.67
4 9♣ 9 10♣ 10 4♣ 4 23 7.67
5 K♥ 10 K♣ 10 7♥ 7 27 9
6 3♥ 3 3♦ 3 2♥ 2 8 2.67
7 5♠ 5 10♠ 10 Q♠ 10 25 8.33
8 6♥ 6 J♠ 10 J♥ 10 26 8.67
9 10♦ 10 A♣ 1 A♠ 1 12 4
10 4♠ 4 2♥ 2 6♠ 6 12 4
11 5♣ 5 7♦ 7 7♣ 7 19 6.33
12 5♠ 5 10♠ 10 Q♠ 10 25 8.33
13 10♥ 10 J♣ 10 Q♥ 10 30 10
14 10♥ 10 J♣ 10 Q♥ 10 30 10
15 J♣ 10 Q♥ 10 Q♦ 10 30 10
16 9♥ 9 4♥ 4 8♥ 8 21 7
17 2♥ 2 K♦ 10 4♣ 4 16 5.33
18 3♣ 3 J♣ 10 7♥ 7 20 6.67
19 3♣ 3 6♦ 6 2♦ 2 11 3.67
20 2♥ 2 10♦ 10 A♣ 1 13 4.33
21 5♠ 5 10♠ 10 Q♠ 10 25 8.33
22 9♣ 9 7♣ 7 9♦ 9 25 8.33
23 8♠ 8 9♦ 9 7♥ 7 24 8
24 A♠ 1 10♠ 10 8♣ 8 19 6.33
25 6♦ 6 2♦ 2 10♥ 10 18 6
26 A♥ 1 4♥ 4 8♥ 8 13 4.33
27 4♦ 4 A♦ 1 6♠ 6 11 3.67
28 9♦ 9 7♥ 7 9♣ 9 25 8.33
29 7♥ 7 6♣ 6 3♣ 3 16 5.33
30 8♣ 8 10♠ 10 5♦ 5 23 7.67

Appendix B
Normal Distribution table
Descriptive Statistics Final Project - Student Instructions

Descriptive Statistics Final Project


Note: This course is currently only available for free, so you won't be able to submit your work
for review. We encourage you to use the specifications and evaluation tools to complete it, then
self-assess and seek feedback from family, friends, and your social networks. Use their
feedback to improve, then you’ll have a great example of your work to show off anytime!

Overview
Welcome to the Descriptive Statistics Final Project! In this project, you will demonstrate what
you have learned in this course by conducting an experiment dealing with drawing from a deck
of playing cards and creating a writeup containing your findings.
Be sure to check through the project rubric to self-assess and share with others who will give
you feedback.

Questions for Investigation


This experiment will require the use of a standard deck of playing cards. This is a deck of
fifty-two cards divided into four suits (spades (♠), hearts (♥), diamonds (♦), and clubs (♣)), each suit
containing thirteen cards (Ace, numbers 2-10, and face cards Jack, Queen, and King). You can
use either a physical deck of cards for this experiment or you may use a virtual deck of cards
such as that found on random.org (http://www.random.org/playing-cards/).
For the purposes of this task, assign each card a value: The Ace takes a value of 1, numbered
cards take the value printed on the card, and the Jack, Queen, and King each take a value of
10.
1. First, create a histogram depicting the relative frequencies of the card values.
2. Now, we will get samples for a new distribution. To obtain a single sample, shuffle your deck
of cards and draw three cards from it. (You will be sampling from the deck without replacement.)
Record the cards that you have drawn and the sum of the three cards’ values. Replace the
drawn cards back into the deck and repeat this sampling procedure a total of at least thirty
times.
3. Let’s take a look at the distribution of the card sums. Report descriptive statistics for the
samples you have drawn. Include at least two measures of central tendency and two measures
of variability.
4. Create a histogram of the sampled card sums you have recorded. Compare its shape to that
of the original distribution. How are they different, and can you explain why this is the case?
5. Make some estimates about values you will get on future draws. Within what range will you
expect approximately 90% of your draw values to fall? What is the approximate probability that
you will get a draw value of at least 20? Make sure you justify how you obtained your values.
Final Project Rubric - Descriptive Statistics
Descriptive Statistics Final Project Rubric

Overview:
This rubric is here to help you understand the specifications for how your project will be evaluated. It is
the same rubric you should share with others who give you feedback. You should look at the rubric before
you begin working on this project and before you submit it.

Before you begin:


1. Read the final project instructions and this document in detail.

Before you submit:


1. Read the rubric below in detail and do your best to evaluate where your project stands.
2. If you think your project “does not meet specifications” for any criterion, make necessary
changes so that it “meets specifications”.
3. When you are confident that your project meets or exceeds specifications in each criterion, share
it with others for feedback.

The Rubric:

Criteria: Responses to Project Questions

Question 1: Plotting a histogram of card values
Does Not Meet Specifications: The histogram does not accurately reflect the card values’ relative
frequency distribution or no histogram is provided.
Meets Specifications: A histogram is provided that accurately reflects the card values’ relative
frequency distribution.

Question 2: Obtain samples from a deck of cards
Does Not Meet Specifications: Sampled data is not provided, insufficient, or does not reflect the
experiment being performed for the project.
Meets Specifications: At least thirty samples have been performed and the summed values from each
sample have been reported in a submitted spreadsheet.

Question 3: Report descriptive statistics regarding sample taken
Does Not Meet Specifications: Two measures of central tendency and variability are not reported to
describe the sample or are not computed correctly.
Meets Specifications: At least two measures of central tendency and two measures of variability are
accurately reported to summarize and describe the samples taken for Question 2.

Question 4: Plotting a histogram of sampled values
Does Not Meet Specifications: The histogram does not accurately reflect sampled values or no
histogram is provided. No discussion of the shape of the distribution is provided or comparison is
not well-justified.
Meets Specifications: A histogram accurately reflecting the sampled data is provided. Discussion of
the shape is provided, including a comparison to that of the histogram of the original card values.

Question 5: Making estimates based on the sampled distribution
Does Not Meet Specifications: Estimates made for the prompted questions do not reflect the values
obtained from the sample.
Meets Specifications: Estimates are made for the prompted questions that reflect the samples taken
and their distribution.
