
Math 1040 Data Analysis Project

Sara Coats
Math 1040-03

Part 1
I chose to use the body measurement data set to complete this data analysis project. The data
set provides the gender, height, weight, and age of 507 subjects. The data set also provides
the following body measurements for each of the 507 subjects: biacromial diameter, pelvic
breadth, bitrochanteric diameter, chest depth, chest diameter, elbow diameter, wrist
diameter, knee diameter, ankle diameter, shoulder girth, chest girth, waist girth, abdominal
girth, hip girth, thigh girth, forearm girth, knee girth, calf maximum girth, ankle minimum girth,
and wrist minimum girth.
I will use the StatCrunch tool to obtain a random sample of 40 men and 40 women. From the
random sample, I will organize the information and draw conclusions about the knee girth of
men and women and how it compares with that of other randomly selected subjects. This should
be an interesting project, and I am looking forward to completing it.









Part 2
I will use gender as the categorical variable. The following Pie and Pareto charts show the
categorical variable for the entire population of 507 subjects.






Using StatCrunch, I generated a simple random sample of 40 subjects from the 507 original
subjects. The subject numbers and their respective genders are listed below.

19 MALE SUBJECTS:
6 42 51 84 111 115 118 130 145 146 158 161 173 175 183 213 215 229 247
21 FEMALE SUBJECTS:
255 258 271 272 278 286 288 304 354 396 402 404 405 451 460 462 478 488
492 504 506

The following Pie and Pareto charts show the categorical variable (gender) for the simple
random sample.



I created a systematic sample by selecting every 12th subject. The systematic sample contains 40
subjects. The subject numbers and their respective genders are listed below.
20 MALE SUBJECTS:
12 24 36 48 60 72 84 96 108 120 132 144 156 168 180 192 204 216 228 240
20 FEMALE SUBJECTS:
252 264 276 288 300 312 324 336 348 360 372 384 396 408 420 432 444 456
468 480

The following Pie and Pareto charts show the categorical variable (gender) for the systematic
sample.



The two sampling methods I used were the simple random sample and the systematic sample.
To obtain the random sample I used the random numbers applet in StatCrunch. To obtain the
systematic sample I divided the total population by the number I wanted in my sample group:
507/40 = 12.675. I rounded down to 12 and selected every 12th subject.
The samples are very similar. They each contain about half men and half women. I would
expect this because the population is split fairly evenly between the two genders.
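
The two sampling methods described above can be sketched in Python. This is only an illustration, not the StatCrunch procedure itself: the function names and the seed are assumptions, and a different seed (or StatCrunch's own generator) will give different subject numbers for the random sample. The systematic sample starts at subject 12, as in the list above.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw subject numbers 1..population_size without replacement."""
    rng = random.Random(seed)  # the seed is an assumption, used only for repeatability
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

def systematic_sample(population_size, sample_size):
    """Take every k-th subject, where k is population_size / sample_size rounded down."""
    k = population_size // sample_size  # 507 // 40 = 12
    return [k * i for i in range(1, sample_size + 1)]

srs = simple_random_sample(507, 40, seed=1)
systematic = systematic_sample(507, 40)

print(systematic[:3], systematic[-1])  # [12, 24, 36] 480
```

Rounding the interval down to 12 keeps the last selected subject (480) inside the population of 507.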



















Part 3

The quantitative variable I selected to use is the knee girth of the test subjects. The following
chart shows the five number summary, the population mean and the population standard
deviation for the different knee girths of the 507 test subjects.
Summary statistics:
Column      Max  Min  Q1    Q3  Median  Mean       Std. dev.
Knee girth  49   29   34.4  38  36      36.202959  2.6175697

The following frequency histogram and box plot show the quantitative variable for the entire
population.


The following chart shows the mean, the standard deviation and the five number summary, of
the simple random sample of 40 subjects.
Summary statistics:
Column              Std. dev.  Mean   Median  Min   Max   Q1    Q3
Sample(Knee girth)  2.745794   36.69  36.35   31.1  42.8  34.7  38.95

The following frequency histogram and box plot show the quantitative variable for the simple
random sample.



The following chart shows the mean, the standard deviation and the five number summary, of
the systematic sample of 40 subjects.
Summary statistics:
Column             Mean    Std. dev.  Median  Min  Max   Q1    Q3
Systematic sample  35.625  2.2573811  35.8    29   40.9  34.2  37

The following frequency histogram and box plot show the quantitative variable for the
systematic sample.


The shapes of the frequency histograms and the box plots differ a great deal between the
samples and the population. The simple random sample graphs and the systematic sample graphs
each represent only 40 of the 507 subjects. Because the samples are so small, some of the
population outliers were left out of the sample graphs. If the sample size were larger, the
sample graphs would more closely resemble the population graphs.
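
The summary statistics reported in this part can be reproduced outside StatCrunch with a short Python sketch. The knee girth values below are made up for illustration (the raw sample data are not listed in this report), and the quartile method chosen here is an assumption that may not match the one StatCrunch uses, so small differences in Q1 and Q3 are possible.

```python
import statistics

def summarize(values):
    """Return the five number summary, mean and sample standard deviation."""
    # method="inclusive" is one common quartile convention; StatCrunch may use another.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
    }

# Hypothetical knee girth values, NOT taken from the data set.
knee_girths = [34.2, 35.8, 36.4, 33.9, 38.1, 37.0, 35.1, 36.9]
summary = summarize(knee_girths)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```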





















Part 4
Below are the confidence intervals for the simple random sample and the systematic sample obtained in
Part 2 of this project.
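Simple Random Sample - Gender
n = 40 subjects
21 female
19 male
$\hat{p} = 21/40 = 0.525$
$\hat{q} = 1 - \hat{p} = 0.475$
Confidence level = 95%
$\alpha = 0.05$
$z_{\alpha/2} = 1.96$
$E = z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n} = 0.1548$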




Systematic Sample - Gender
n = 40 subjects
20 female
20 male
$\hat{p} = 20/40 = 0.5$
$\hat{q} = 1 - \hat{p} = 0.5$
Confidence level = 95%
$\alpha = 0.05$
$z_{\alpha/2} = 1.96$





Below is the confidence interval for the mean of the samples of knee girth.
Simple Random Sample Knee Girth
$n = 40$
df $(n-1) = 39$
CL 95%
$t_{\alpha/2} = 2.023$
$\bar{x} = 36.69$
$s = 2.75$







Systematic Sample Knee Girth
$n = 40$
df $(n-1) = 39$
CL 95%
$t_{\alpha/2} = 2.023$
$\bar{x} = 35.625$
$s = 2.26$
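
The intervals for the mean follow from the values listed above through the usual t interval, $\bar{x} \pm t_{\alpha/2} \, s/\sqrt{n}$. The sketch below is a minimal Python version, assuming the critical value 2.023 from this report is supplied directly (the Python standard library has no t distribution); the function name is hypothetical. The population mean of 36.202959 comes from the Part 3 summary statistics.

```python
from math import sqrt

def mean_interval(mean, stdev, n, t_critical):
    """Confidence interval for a mean, given a t critical value from a table."""
    margin = t_critical * stdev / sqrt(n)
    return mean - margin, mean + margin

T_CRITICAL = 2.023           # df = 39, 95% confidence, as listed in this report
POPULATION_MEAN = 36.202959  # from the population summary statistics in Part 3

srs_interval = mean_interval(36.69, 2.75, 40, T_CRITICAL)
systematic_interval = mean_interval(35.625, 2.26, 40, T_CRITICAL)

for label, (low, high) in [("Simple random", srs_interval),
                           ("Systematic", systematic_interval)]:
    captured = low < POPULATION_MEAN < high
    print(f"{label}: {low:.2f} < mu < {high:.2f}, captures population mean: {captured}")
```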













Below are the confidence intervals for the standard deviation of the samples of knee girth.
Simple Random Sample knee girth
Standard deviation 2.745794
n 40
df 39
CL 99%



Systematic Sample knee girth
Standard deviation 2.2573811
n 40
df 39
CL 99%
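
The intervals for the standard deviation use the chi-square formula $\sqrt{(n-1)s^2/\chi^2_R} < \sigma < \sqrt{(n-1)s^2/\chi^2_L}$. The sketch below is a rough Python check under one stated assumption: instead of looking up the chi-square critical values in a table, it estimates them with the Wilson-Hilferty approximation, so the endpoints can differ slightly from table-based results. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_approx(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile with left-tail area p."""
    z = NormalDist().inv_cdf(p)
    a = 2 / (9 * df)
    return df * (1 - a + z * sqrt(a)) ** 3

def stdev_interval(s, n, confidence=0.99):
    """Approximate confidence interval for a population standard deviation."""
    df = n - 1
    alpha = 1 - confidence
    chi2_right = chi2_quantile_approx(1 - alpha / 2, df)  # larger critical value
    chi2_left = chi2_quantile_approx(alpha / 2, df)       # smaller critical value
    return sqrt(df * s**2 / chi2_right), sqrt(df * s**2 / chi2_left)

POPULATION_STDEV = 2.6175697  # from the population summary statistics in Part 3

srs_sd_interval = stdev_interval(2.745794, 40)
systematic_sd_interval = stdev_interval(2.2573811, 40)

print("Simple random sample:", [round(x, 2) for x in srs_sd_interval])
print("Systematic sample:   ", [round(x, 2) for x in systematic_sd_interval])
```

Under this approximation the intervals come out to roughly 2.1 to 3.8 for the simple random sample and 1.7 to 3.2 for the systematic sample, and both contain the population standard deviation of about 2.62.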




These confidence intervals give ranges of values that provide a way to gauge the true values of the
population parameters for the categorical variable (gender) and the quantitative variable (knee girth).
Each of the above confidence intervals captures the corresponding population parameter.










Part 6:
Summary Reflection Paper
This project provided data about the body measurements of 507 men and women. I was able to use gender as
my categorical variable and knee girth as my quantitative variable. Using techniques taught in class, I obtained
a simple random sample and a systematic sample and examined both variables in each. Using StatCrunch I was
able to produce pie charts and Pareto charts to make the categorical data easier to appreciate.
For the quantitative variable I computed the sample mean, the sample standard deviation and the five
number summary for each of the samples. Then I used StatCrunch to make histograms and box plots for the
quantitative variable. The shape of the box plots and histograms for the sample varied from the box plots and
histograms of the entire population. I understand that if the sample sizes were increased the sample graphs
would more closely resemble the population graphs.
Using the categorical and quantitative sample values I created confidence intervals for each of the samples.
Doing this provided good practice for the concepts taught in chapter 7.
Overall, this project has given me experience and practice using many statistical principles. Statistics is a very
detailed branch of mathematics. With more time and more practice I feel I could develop a greater
understanding of these principles and how they are involved in everyday life.
