
Math 1040 Term Project

Starting Compensation for College Graduates


Introduction
This project will allow you to pull together many of the concepts you are studying this semester,
including organizing and analyzing data, drawing conclusions using confidence intervals and
hypothesis tests, and presenting your work in a well-organized paper. Your overall report will
be a narrative that clearly explains your process and your conclusions. Your included
mathematical calculations may be neatly handwritten and scanned, but overall summaries and
written conclusions must be word processed.
This is the signature assignment for this course. It is to be posted in your e-portfolio as a single
document. Your e-portfolio must be linked through your SLCC MyPage account. See the specific
e-portfolio instructions for details.

Data Collection
Because of the scope of this project (and because of issues with access to data), data from the
year 2014 has been collected in advance for you to analyze. This is actual data collected from a
simple random sample taken from the US Department of Labor statistics on college graduates.
Just in case you were thinking these numbers are a little high, know that these starting
compensations are including values of benefit packages (retirement contributions, insurance,
etc.). Using the information provided in the Excel file, calculate the following (using Excel
functions):

Major Field                     Sample Mean    Sample Std. Dev.
Business                        51537          7232
Communications                  46227          7214
Computer Science                59542          4670
Education                       40021          2365
Engineering                     60664          7879
Humanities & Social Sciences    38817          7146
Math & Sciences                 41923          4938
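
As a cross-check on the Excel calculations, the same two statistics can be sketched in Python. The list of values below is made-up illustration data, not the course data set, and `summarize` is a hypothetical helper name. Note that `statistics.stdev` uses the n - 1 denominator, which is the sample standard deviation (the same convention as Excel's STDEV.S).

```python
import statistics

# Hypothetical starting compensations for one major field (illustration only,
# NOT the course data set).
sample = [40000, 42000, 44000, 46000, 48000]

def summarize(values):
    """Return (sample mean, sample standard deviation) rounded to whole dollars."""
    mean = statistics.mean(values)
    std_dev = statistics.stdev(values)  # n - 1 denominator (sample std. dev.)
    return round(mean), round(std_dev)

mean, std_dev = summarize(sample)
print(f"Sample mean: {mean}, sample std. dev.: {std_dev}")
```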

Report Introduction
Your report must begin with an introduction explaining the process of the project. Do not
assume that your reader knows in advance what the assignment is about. Get started on your
introduction paragraph. Explain the overall procedures and goals of the assignment in your
own words. Note that you will be editing and adding to this introduction as you proceed
through all the parts of the project.

Whenever data is collected, it is usually in the form of tables or arrays, which makes it difficult
to observe patterns. It is also difficult to draw meaningful real-life conclusions from such raw
data, or to compare it with data obtained from another sample of the same population.
Statistical methods provide a consistent way of organizing and analyzing data, drawing
conclusions using confidence intervals and hypothesis tests, and presenting the data in an
organized fashion.
This report summarizes the 2014 starting compensation data for college graduates across seven
major fields. Histograms were plotted to show the distribution of the data in each field and to
suggest the distribution curve that best fits it. The central tendency of the data was explored
using the mean and median. The spread of the data was examined using the standard deviation,
quartiles, and maximum and minimum values. The results of the analysis were compared across the
major fields in order to draw evidence-based conclusions.

Organizing and Displaying Categorical Data: Histograms


First, you will make a histogram for compensation distribution for each Major Field. Your
graphics must have descriptive titles and be appropriately labeled. Throughout this project,
represent calculated dollar values as whole numbers, and use round numbers for your
histogram categories.
Write a paragraph discussing your observations of this data. Do the graphs reflect what you
expected to see? Are the distributions skewed or symmetric? Are they Uniform, Normal, or do

they seem to follow some other distribution? Comment on the differences amongst different
Major Fields. In which Major Field does your intended career lie?
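
The round-number categories used for the histograms (0-35,000, 35,001-40,000, and so on) can be tallied with a short Python sketch. The data values and the helper names `bin_label` and `frequency_table` are assumptions made for illustration; only the category boundaries come from this report.

```python
from collections import Counter

# Hypothetical compensations (illustration only, NOT the course data set).
sample = [33500, 36200, 38900, 41000, 41750, 43300, 44100, 47800, 52400]

# Upper edges of the round-number categories; each category includes its upper edge.
BIN_EDGES = [35000, 40000, 45000, 50000, 55000]

def bin_label(value, edges=BIN_EDGES):
    """Return the category label, e.g. '35,001-40,000', for a dollar value."""
    lower = 0
    for upper in edges:
        if value <= upper:
            return f"{lower:,}-{upper:,}"
        lower = upper + 1
    return f"over {edges[-1]:,}"

def frequency_table(values):
    """Count how many values fall in each category."""
    return Counter(bin_label(v) for v in values)

# Print a rough text histogram, one row per category that occurs.
for label, count in frequency_table(sample).items():
    print(f"{label:>15}  {'#' * count}")
```

Making each category include its upper edge keeps a value such as exactly 40,000 from being counted twice or not at all.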

[Figure: Histogram for Compensation Distribution for Business. x-axis: Compensation Category; y-axis: Frequency]

[Figure: Histogram for Compensation Distribution for Communications. x-axis: Compensation Category; y-axis: Frequency]

[Figure: Histogram for Compensation Distribution for Education. x-axis: Compensation Category; y-axis: Frequency]

[Figure: Histogram for Compensation Distribution for Computer Science]

[Figure: Histogram for Compensation Distribution for Engineering. x-axis: Compensation Category; y-axis: Frequency]

[Figure: Histogram for Compensation Distribution for Humanities & Social Sciences. x-axis: Compensation Category (0-35,000; 35,001-40,000; 40,001-45,000; 45,001-50,000; 50,001-55,000)]

[Figure: Histogram for Compensation Distribution for Mathematics & Sciences. x-axis: Compensation Category (0-35,000; 35,001-40,000; 40,001-45,000; 45,001-50,000; 50,001-55,000)]

Organizing and Displaying Quantitative Data: Boxplots


Calculate the 5-Number Summary for each Major Field (please use Excel to make these
calculations). Make boxplots for the seven categories. Do any of them seem to have any
outliers? If so, give an explanation of why you think certain Major Fields might have such
extreme high or low values. Also determine if normal, t-distribution, and χ²-distribution
procedures seem to be appropriate for this data.


Write a paragraph discussing your observations of this data. Do any of them seem to have any
outliers? If so, give an explanation of why you think certain Major Fields might have such
extreme high or low values. Also determine if normal and t-distribution procedures seem to
be appropriate for this data.
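
The 5-number summary and a common outlier check can be sketched in Python as follows. The data is hypothetical, the function names are illustrative, and two choices here are assumptions rather than requirements of the assignment: the "inclusive" quartile method (Excel's quartile functions may differ slightly depending on which variant is used) and the 1.5 × IQR rule for flagging outliers.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def outliers(values):
    """Flag values beyond 1.5 * IQR from the quartiles (a common boxplot rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical compensations (illustration only, NOT the course data set).
sample = [36000, 38000, 39000, 40000, 41000, 42000, 44000, 70000]
print(five_number_summary(sample))
print(outliers(sample))
```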

Submit your work (graphs and writing) to this point (in class, on paper) to your
instructor by Tuesday, November 29.
Confidence Interval Estimates
Explain in general the purpose and meaning of a confidence interval.
Whenever a population parameter of interest (e.g., the mean) needs to be investigated,
a random sample is drawn from the population and the corresponding statistic (e.g.,
the sample mean) is computed to approximate the population value. Every random
sample will produce a different value of the statistic. A confidence interval
addresses this issue because it provides a range of values that is likely to contain
the population parameter of interest. The confidence level describes how often
intervals built this way capture the true parameter over many repeated samples.
95% confidence interval estimate for the mean starting compensation for students graduating
in Humanities and Social Sciences.
Confidence interval = mean ± (z_0.975 × standard deviation)/√(sample size)
Confidence interval = 38,817 ± (1.96 × 7146)/√50
Confidence interval = 38,817 ± 1,981

Confidence interval = ($36,836, $40,798)
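
The interval above can be reproduced with a minimal Python sketch. It uses the normal critical value, as the hand calculation does; `mean_confidence_interval` is a hypothetical helper name, and a t critical value would give a slightly wider interval.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, std_dev, n, confidence):
    """z-based confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * std_dev / math.sqrt(n)
    return mean - margin, mean + margin

# Summary values used in this report for Humanities & Social Sciences (n = 50).
low, high = mean_confidence_interval(38817, 7146, 50, 0.95)
print(f"95% CI: (${low:,.0f}, ${high:,.0f})")
```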

80% confidence interval estimate for the proportion of all students with starting
compensation over $50,000.
Sample proportion with compensation above $50,000 = 149/350 = 0.43
Confidence interval
= sample proportion ± z_0.90 × √[sample proportion × (1 - sample proportion)/sample size]
= 0.43 ± 1.28 × √[0.43 × (1 - 0.43)/350]
= 0.43 ± 0.03
= (0.40, 0.46)
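
A minimal Python sketch of the same proportion interval, working from the raw counts instead of the rounded proportion. The function name is hypothetical, and the formula is the standard large-sample (Wald) interval used in the hand calculation.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """z-based (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.28 for 80%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Counts used in this report: 149 of 350 graduates had compensation over $50,000.
low, high = proportion_confidence_interval(149, 350, 0.80)
print(f"80% CI: ({low:.3f}, {high:.3f})")
```

Without intermediate rounding the interval is roughly (0.392, 0.460), so the lower limit comes out slightly below the hand result, which rounds the proportion and the margin of error at each step.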

Discuss and interpret the results of each of your interval estimates. Include neatly written and
scanned copies of your work.

Based on our sample data, we are 95% confident that the mean starting compensation for all
students graduating in Humanities and Social Sciences is between $36,836 and $40,798. This
means that if we repeatedly drew samples of the same size and built an interval from each one,
about 95 out of every 100 such intervals would contain the true population mean.
Based on our sample data, we are 80% confident that the proportion of all students with
starting compensation over $50,000 is between 40% and 46%. This means that if we repeatedly
drew samples of the same size and built an interval from each one, about 80 out of every 100
such intervals would contain the true population proportion.
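
The meaning of a confidence level can be illustrated with a small simulation: draw many samples from a population whose mean is known, build an interval from each sample, and count how often the interval contains the true mean. This is only a sketch; the population mean, standard deviation, and normal shape are hypothetical choices, and `coverage` is an illustrative function name.

```python
import math
import random
import statistics
from statistics import NormalDist

def coverage(true_mean, true_sd, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        margin = z * statistics.stdev(sample) / math.sqrt(n)
        if abs(statistics.mean(sample) - true_mean) <= margin:
            hits += 1
    return hits / trials

# Hypothetical population values chosen only for illustration.
print(coverage(true_mean=40000, true_sd=7000, n=50, confidence=0.95, trials=2000))
```

The printed fraction should land close to the stated confidence level, which is what "95% confident" refers to: the long-run success rate of the method, not the behavior of any single sample.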

Hypothesis Tests
Explain in general the purpose and meaning of a hypothesis test.

A hypothesis test is a statistical test that is used to examine whether there is enough evidence in a
sample of data to infer that a certain condition is true for the entire population. Such a test
examines two opposing hypotheses about a population: the null hypothesis and the alternative
hypothesis. The null hypothesis is the statement being tested. Usually the null hypothesis is a
statement of "no effect" or "no difference". The alternative hypothesis is the statement you want
to be able to conclude is true. Based on the available data, the test determines whether to reject
the null hypothesis. The criterion for this decision is the p-value. If the p-value is less
than or equal to the level of significance, which is a predefined cut-off point, then you
can reject the null hypothesis; otherwise, you fail to reject it.

Use a 0.05 significance level to test the claim that students graduating in Education have an
average starting compensation of under $35,000.
H0: μ = 35,000
Ha: μ < 35,000 (claim)
z = (sample mean - value in hypothesis)/(standard deviation/√(sample size))
z = (40021 - 35000)/(2365/√50)
z = 15.01

critical value = -1.645
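
The test statistic and critical value for this lower-tailed test can be checked with a short Python sketch. The function name is hypothetical, and the inputs are the rounded Education summary values used in this report; with those values the statistic comes out near 15, far above the critical value.

```python
import math
from statistics import NormalDist

def lower_tailed_z_test(sample_mean, mu0, std_dev, n, alpha):
    """Test H0: mu = mu0 against Ha: mu < mu0; return (z, critical value, reject?)."""
    z = (sample_mean - mu0) / (std_dev / math.sqrt(n))
    critical = NormalDist().inv_cdf(alpha)  # about -1.645 for alpha = 0.05
    return z, critical, z < critical

# Summary values used in this report for Education (n = 50).
z, critical, reject = lower_tailed_z_test(40021, 35000, 2365, 50, 0.05)
print(f"z = {z:.2f}, critical value = {critical:.3f}, reject H0: {reject}")
```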

Use a 0.01 significance level to test the claim that 80% of students graduating with a college
degree will find a starting compensation package valued at over $40,000.
Proportion of graduates with starting compensation package valued at over $40,000 = 269/350
= 0.77
H0: p = 0.8
Ha: p > 0.8 (claim)
z = (0.77 - 0.8)/√[0.8(1 - 0.8)/350]
z = -1.40
critical value for the upper-tailed test at the 0.01 significance level = 2.33
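
A minimal Python sketch of the same upper-tailed proportion test, working from the raw counts. The function name is hypothetical, and the hypotheses follow the setup stated above.

```python
import math
from statistics import NormalDist

def upper_tailed_proportion_test(successes, n, p0, alpha):
    """Test H0: p = p0 against Ha: p > p0; return (z, critical value, reject?)."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    critical = NormalDist().inv_cdf(1 - alpha)  # about 2.33 for alpha = 0.01
    return z, critical, z > critical

# Counts used in this report: 269 of 350 graduates had packages valued over $40,000.
z, critical, reject = upper_tailed_proportion_test(269, 350, 0.8, 0.01)
print(f"z = {z:.2f}, critical value = {critical:.2f}, reject H0: {reject}")
```

Using the unrounded proportion 269/350 gives z of about -1.47 rather than the -1.40 obtained from the rounded 0.77. Either way the statistic is well below the critical value, so the decision is the same.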

Discuss and interpret the results of each of your two hypothesis tests. Include neatly written and
scanned copies of your work.
Since the z-value is greater than the critical value of -1.645, it does not fall in the rejection
region, so we fail to reject the null hypothesis that students graduating in Education have an
average starting compensation equal to $35,000. At the 0.05 significance level, there is not
sufficient evidence to support the claim that the average is under $35,000.
Since the z-value is less than the critical value of 2.33, we fail to reject the null hypothesis
that 80% of graduating students have a starting compensation package valued at over $40,000.
At the 0.01 significance level, there is not sufficient evidence to support the claim.

Reflection
Interval estimates and hypothesis tests are appropriate when the following conditions are met:
1. Independence Assumption:
a. Random sampling condition: the sample must be random, and it should usually be
less than 10% of the population size.
2. Nearly Normal Condition: The sample data has a roughly symmetric distribution, so we
can assume that it comes from a nearly normal population. In addition, when the sample
size is large (a common rule of thumb is at least 30), the sampling distribution of the
sample mean can be treated as approximately normal.
My samples meet both conditions. The sample was selected by simple random sampling and
its size is clearly less than 10% of the population, so the independence assumption is met.
In addition, the histograms suggest that the data is approximately normally distributed, and
the sample size of 50 per major field is large enough for normal procedures.
The possible errors that can be made are Type I and Type II errors.
When the null hypothesis is true and you reject it, you make a Type I error. The
probability of making a Type I error is α, which is the level of significance you set for your
hypothesis test. An α of 0.05 indicates that you are willing to accept a 5% chance of
rejecting the null hypothesis when it is actually true. To lower this risk, you must use a lower
value for α. However, using a lower value for α means that you will be less likely
to detect a true difference if one really exists.
When the null hypothesis is false and you fail to reject it, you make a Type II error. The
probability of making a Type II error is β, and the power of the test is 1 - β. You can
decrease your risk of committing a Type II error by ensuring your test has enough power.
You can do this by ensuring your sample size is large enough to detect a practical difference
when one truly exists.
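
The link between sample size, power, and the Type II error probability can be sketched for a lower-tailed z-test. Everything in the scenario below is a hypothetical assumption made for illustration: a true mean $500 below a hypothesized $35,000, a known standard deviation of 2365, and the function name `power_lower_tailed`.

```python
import math
from statistics import NormalDist

def power_lower_tailed(mu0, true_mean, sigma, n, alpha):
    """Power of a lower-tailed z-test (Ha: mu < mu0) when the true mean is true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(alpha)                  # rejection cut-off, e.g. -1.645
    shift = (mu0 - true_mean) / (sigma / math.sqrt(n))  # true effect in standard errors
    return std_normal.cdf(z_crit + shift)               # P(reject H0 | true_mean)

# Hypothetical scenario: the true mean is $500 below the hypothesized value,
# and sigma is assumed known.
for n in (50, 200):
    power = power_lower_tailed(35000, 34500, 2365, n, alpha=0.05)
    print(f"n = {n}: power = {power:.2f}, beta = {1 - power:.2f}")
```

With the larger sample the power rises and the Type II error probability falls, which is the reason for increasing the sample size. When the true mean equals the hypothesized mean, the same function returns the significance level, the Type I error probability.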
The sampling could be improved by increasing the sample size for each major field.
From this data, it can be concluded that graduates in the different major fields receive
starting compensation whose mean and median differ from one another.

Submit your work to this point (in class, on paper) to your instructor by Tuesday,
December 6.

Reflective Writing and e-Portfolio


You may use one or more of the following ideas to build your reflection paper. Your paper must
be at least one page (double spaced) and use correct spelling and grammar. See specific
e-portfolio instructions in your syllabus.
What have you learned as a result of this project?
Discuss how the math skills that you applied in this project will impact other classes you
will take in your school career.

Identify specific parts of the project and your own process in completing the project that
may have applications for other classes.
Discuss how the project helped to develop your problem solving skills.
Discuss how this project changed the way you think about real-world math applications.
If your thinking was not changed, then discuss how the project supported your views
about real-world math applications.

Your completed project (including electronic copies of all previously submitted
work) will be posted on your e-Portfolio and linked through MyPage by Thursday,
December 8.

Kanlayanee Phothiworn
Spencer Bartholomew
Math 1040
December 8, 2016
What have you learned as a result of this project?

From this project, I have learned how to use the mean, the standard deviation, and the 5-number
summary. Working with graphs, Excel, and box plots was a great contribution to my learning. I loved
being able to use different formulas to create a good Excel spreadsheet.
Working in a group is hard because we all have different schedules and are busy with our daily
lives. It was great to learn about other people who are from different countries and have different
beliefs. It also helped me realize that math, and statistics in particular, is a nerve-racking subject
for all of us, just as it is in most countries. I'm glad that our team communicated well with one
another and worked in unity.
Discuss how the math skills that you applied in this project will impact other classes you will take in
your school career.
I know I will use math skills in my daily life, and I know that the higher-level science classes
I am going to take in the future will use sample means. I have also heard from one of my friends that
we will be using statistical applications in Biology 1610.
Identify specific parts of the project and your own process in completing the project that may have
applications for other classes.
I worked with my team to organize the final work on the project, such as making sure materials
were printed, checking the printouts for errors, and collecting ideas from the group.

Discuss how the project helped to develop your problem solving skills.
This project helped me to resolve problems by talking to other people. I learned that
work can be done more efficiently as a group than as an individual. It also helped me to
understand that individuals can contribute in their own significant way. This knowledge will help
me in the future to be more open to ideas when completing a particular piece of work.

Discuss how this project changed the way you think about real-world math applications. If
your thinking was not changed, then discuss how the project supported your views about
real-world math applications.
Yes, this project has changed the way I think about real-world math applications. It
helped me to think critically to solve a problem.