Each year, thousands of families decide where to live. They hope, of course, to find a place
that provides a high quality of life. One major factor that may influence quality of life in a
given zip code is the area’s population density. Does a crowded neighborhood enhance or
diminish quality of life? Or does it have no effect at all?
For the 2010 Math Team Project, we were asked to find the correlation between population
density and quality of life. To support our conclusion, we gathered data from
http://zipskinny.com on cities from different regions of the United States and analyzed whether
the data form any trends. WSMC provided three factors to measure quality of life: income,
educational attainment, and quality of schools. We were required to choose two more factors to
assess quality of life. For population density, we had to evaluate five categories: 0-100
people per square mile, 101-1,000 people per square mile, 1,001-5,000 people per square mile,
5,001-10,000 people per square mile, and 10,001+ people per square mile. We chose at least ten
cities that fall into each of these categories and evaluated each category by the five factors.
Our task was to find how each factor varies among the five density categories.
Approach
We started by determining our two additional factors for measuring quality of life. After some
consideration, we decided on the poverty percentage and marital status. The poverty line is a
good indicator of quality of life because it identifies the percentage of people considered
destitute relative to a consistent standard. For the data, we took only the percentage of people
below the poverty line, because it represents the people whose quality of life we believe to be
undesirable. An increasing trend would therefore mean decreasing quality of life. As for marital
status, most people consider the opportunity to marry and keep a family a necessary component
of life. However, because many people may choose not to marry, and many people below eighteen
years old may not have had a chance to consider marriage, we formed a ratio of the percent
married to the combined percent separated, divorced, and widowed. Leaving out the percent not
married means that the married percentage and the separated, divorced, and widowed percentage
no longer add up to 100%, so we compare them as a ratio rather than as shares of a whole. If we
graphed only the percent married, we would be treating everyone outside the married group as
undesirable, which we cannot assume. For example, one zip code may have 70% married and 25%
separated, divorced, or widowed (a ratio of 2.8), while another may have 60% and 10% (a ratio
of 6.0). Simply graphing the percent married would make the first area look more desirable, but
the ratio makes the second area more attractive. A person would like to marry without a high
chance of becoming separated, divorced, or widowed. Forming the ratio combines the positive and
negative percentages into a single measure.
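The comparison in the example above can be checked with a short sketch. The function name `marriage_ratio` is our own label for illustration, and the percentages are the two hypothetical zip codes from the text, not values from the data set.

```python
def marriage_ratio(pct_married, pct_separated_divorced_widowed):
    """Ratio of percent married to percent separated, divorced, or widowed."""
    if pct_separated_divorced_widowed <= 0:
        raise ValueError("the separated/divorced/widowed percentage must be positive")
    return pct_married / pct_separated_divorced_widowed


# The two hypothetical zip codes from the paragraph above.
first_area = marriage_ratio(70, 25)
second_area = marriage_ratio(60, 10)

# The first area has more married people, but the second has the higher ratio.
print(first_area, second_area)
```

A higher ratio is read as more desirable, so the second area ranks above the first even though its married percentage is lower.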
For our three required factors, we also had to decide which data best represent them. For
income, we simply took the median income for each zip code. Quality of schools involved more
complications, such as non-standard schools. We accepted only schools with the standard grade
structures K-4, K-5, K-6, 5-7, 5-8, 6-8, and 9-12. In addition, we accepted schools whose grade
levels are common among the other schools in the zip code, meaning that the grade structure is
“standard” in that area. We averaged the student-to-teacher ratio over every acceptable school
in each zip code. Lastly, for educational attainment, we reasoned that in today’s American
society high schools are common in most cities, so a high school diploma alone does not let us
predict a person’s future quality of life. We assumed that at least a bachelor’s degree is
needed for improved quality of life, so we took from ZIPSkinny the percent of people with a
bachelor’s degree or higher.
Next, we chose a total of 50 cities from all parts of the United States. Picking cities randomly
from different regions reduces the influence of outside factors, such as climate patterns and
geographical conditions, that are not among our five chosen factors. We gathered the data for
each factor from each of the 50 cities. To analyze the effect of population density on quality
of life, we decided to develop a mathematical model that measures quality of life. After
creating the model, we would apply it to cities of different population densities to evaluate
quality of life based on population density.
Data
Table 1
Mathematics
We compiled all of our data in Table 1. The cities are arranged in order of increasing
population density.
First, we decided on a scatter plot and regression approach to model our data. We plotted
each factor against population density for every city, for a total of five scatter plots.
Then we performed a least-squares regression on each of the scatter plots (Figures 1-5).
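As a minimal sketch of what a least-squares regression computes, the function below returns the slope, intercept, and correlation coefficient r for one scatter plot. The function name and the sample points are made up for illustration; they are not the project's data, and the original work may have used a calculator or spreadsheet instead.

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept, r) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Toy points only (NOT the ZIPSkinny data): x plays the role of density, y of a factor.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
slope, intercept, r = least_squares(toy_x, toy_y)
print(round(slope, 4), round(intercept, 4), round(r, 4))
```

The sign of r gives the direction of the trend, and values of r closer to 1 or -1 mean the points lie closer to the fitted line.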
Figure 1 (Income)
Figure 5 (Marriage/Divorced, Widowed, Separated Ratio)
Of our five factors, the two that we chose, poverty line and marriage ratio, showed the
strongest correlations with population density, with r-values of 0.407422 and -0.359014,
respectively (Figures 4 & 5); these are moderate at best. The regressions for the three
required factors all had weak correlations (Figures 1, 2, & 3). This demonstrates little linear
relationship between population density and income, educational attainment, and quality of
schools. To predict those three factors from population density we would need a clear
relationship between them, and our scatter plots do not show one.
However, each regression line has to summarize 50 widely scattered data points, so low
correlations are not surprising. It is hard to find one best-fit line that represents so many
varied data points. In addition, just about every scatter plot contains outliers that lie very
far from the regression line.
To reduce the influence of outliers and characterize each of the population density
categories separately, we decided to try a bar graph approach. This time, for each factor,
we took the data from the ten cities in each population density category and identified the
median; we used the median to represent that population density category (Figures 6-10).
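The short sketch below illustrates why a median is less sensitive to an outlier than a mean. The values are invented for the example and do not come from Table 1.

```python
from statistics import mean, median

# Made-up values for one factor in one density category; the last one is an outlier.
category_values = [9.5, 10.2, 11.0, 12.4, 48.0]

category_median = median(category_values)  # middle value, unaffected by the outlier
category_mean = mean(category_values)      # pulled upward by the outlier

print(category_median, category_mean)
```

Here the single extreme value moves the mean well above most of the data, while the median stays at the middle value.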
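Figures 6-10: bar graphs of the category medians (Figure 6: income; Figure 7: educational
attainment; Figure 8: quality of schools; Figure 9: poverty line; Figure 10: marital status)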
The graphs of income and poverty line (Figures 6 & 9) show the best quality of life in
the 1,001-5,000 people per square mile category. The 1-100 and 101-1,000 categories both show
better quality of life than the 5,001-10,000 and 10,001+ categories. Educational attainment
(Figure 7) indicates the highest quality of life in the 5,001-10,000 category. The marital
status graph (Figure 10) shows a sharp decrease from the 1,001-5,000 category to the
5,001-10,000 category. Lastly, the quality of schools bar graph (Figure 8) shows no distinct
trend.
Next, we decided to simplify the data one last time by converting each data value for each
zip code to a number on a scale of 0-10. We took the median of the 50 data points for each
factor and denoted it as a five. Then, we divided each median by five to get the value that
represents a one on the 0-10 scale. This way, we can divide any value for a specific factor by
that “one” value and get a number on the scale, with ten representing high quality of life and
zero representing minimal quality of life. Our results are shown in Table 2.
Table 2
Because a lower poverty percentage and a lower student-to-teacher ratio indicate a higher
quality of life, we subtracted the values obtained from the process above for these two factors
from ten. For example, if a city has 13.38% of people below the poverty line, we divide that
by 2.41% to get 5.55187; then we subtract that value from 10 to get 4.448133 on the 0-10 scale.
Any number greater than ten is recorded as a ten, and any negative number is recorded as a
zero.
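The scaling steps can be sketched as one small function. The name `scale_score` and its parameters are our own; the poverty median of 12.05% is inferred from the “one” value of 2.41% in the example (5 × 2.41), not read directly from the data.

```python
def scale_score(value, median_value, lower_is_better=False):
    """Convert a raw value to the 0-10 scale on which the median maps to five."""
    unit = median_value / 5          # the raw amount worth "one" on the scale
    score = value / unit
    if lower_is_better:              # poverty line and student-to-teacher ratio
        score = 10 - score
    return max(0.0, min(10.0, score))  # cap at ten, floor at zero


# Worked example from the text: 13.38% below the poverty line.
poverty_median = 5 * 2.41            # inferred from the "one" value of 2.41%
poverty_score = scale_score(13.38, poverty_median, lower_is_better=True)
print(round(poverty_score, 6))
```

Values far from the median fall outside the 0-10 range before clamping, which is why the last line of the function applies the ten and zero limits described above.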
We calculated the 0-10 number for each factor for each population density category using
the medians from the bar graphs (Table 4). To calculate a final number that represents overall
quality of life, we decided to weight some factors more heavily than others. We deemed poverty
line and educational attainment the most influential, because a person who is in poverty or
cannot go to college would likely have a poor quality of life, so we multiplied the 0-10
numbers for these two factors by two. We considered median income and marriage ratio second in
importance because they depend on cost of living and the decision to marry, respectively, so we
multiplied those numbers by 1.5. Lastly, since quality of schools does not significantly
determine a person’s quality of life, and the student-to-teacher ratio does not completely
represent quality of schools, we left that 0-10 number unweighted (multiplied by one). Then,
for each population density category, we added the weighted numbers from all five factors to
obtain a total out of 80. Our results are shown in Table 3.
Population      Median    Educational   Quality of   Poverty   Marital   Total (Quality
Density Range   Income    Attainment    Schools      Line      Status    of Life)
1-100           4.9243    3.1068        5.6682       5.4149    5.9002    38.94835/80
101-1000        5.8247    4.6359        5.1119       5.5187    5.9371    43.0638/80
1001-5000       6.7683    5.4247        4.4118       7.2407    5.3102    47.86035/80
5001-10000      4.9573    7.1845        5.3325       3.1950    3.4663    38.7269/80
10001+          4.7665    4.6723        4.7762       1.3278    3.5032    29.18095/80
Table 3
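The weighting described above can be reproduced with a short sketch. The dictionary keys and the function name are our own labels; the weights and the 0-10 scores are those stated in the text and in the 1-100 row of Table 3.

```python
# Weights from the text: poverty and education x2, income and marriage x1.5, schools x1.
WEIGHTS = {
    "income": 1.5,
    "education": 2.0,
    "schools": 1.0,
    "poverty": 2.0,
    "marriage": 1.5,
}

# Highest possible total: every factor scores ten.
MAX_TOTAL = 10 * sum(WEIGHTS.values())


def quality_of_life_total(scores):
    """Weighted sum of the five 0-10 factor scores."""
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)


# 0-10 scores for the 1-100 people per square mile category (Table 3).
lowest_density = {
    "income": 4.9243,
    "education": 3.1068,
    "schools": 5.6682,
    "poverty": 5.4149,
    "marriage": 5.9002,
}
total = quality_of_life_total(lowest_density)
print(f"{total:.5f}/{MAX_TOTAL:.0f}")
```

The weights sum to 8, which is why the totals are reported out of 80.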
Population Density   Median       Educational       Quality of Schools   % Below   Marital Status
Range (number of     Household    Attainment (%     (# of students :     Poverty   (% married / %
people/square        Income ($)   with bachelor's   # of teachers)       Line      separated, widowed,
mile)                             or higher)                                       or divorced)
1-100                35322.5      12.8              13.55                11.05     3.20
101-1000             41781.5      19.1              15.2935              10.8      3.22
1001-5000            48546.5      22.35             17.4835              6.65      2.88
5001-10000           35559        29.6              14.6                 16.4      1.88
10001+               34190.5      19.25             16.335               20.9      1.90
Table 4
From the five “total” values, we can infer that the best quality of life is concentrated in
the 1,001-5,000 people per square mile category. If the five totals were graphed as y-values
against population density as the x-value, the curve would be somewhat parabolic, with quality
of life rising until population density reaches around 4,000 people per square mile and then
dropping as population density increases beyond that.
Conclusions
Our first approach, the least-squares regression approach, shows a trend only for
poverty line and marriage ratio. As population density increases, the percent of people below
the poverty line rises, indicating a drop in quality of life. The marriage ratio drops as
population density increases, which also signifies a decrease in quality of life. Because the
correlations for our other three factors were very weak, it would be unreliable to predict
those factors from population density. Taken alone, this model indicates a lower quality of
life as population density increases.
Our second approach, the bar graph approach, indicates the lowest poverty and the highest
income in the 1,001-5,000 people per square mile category. The educational attainment graph
leans toward the 5,001-10,000 category for best quality, but still demonstrates better quality
of life at medium population density than at high or low population density. The marital status
and poverty line bar graphs both indicate sharply decreasing quality of life in the categories
above 5,000 people per square mile. This model indicates that quality of life peaks in the
1,001-5,000 category and drops as population density increases or decreases from there.
Our last approach, the 0-10 model, simplifies our bar graph approach and allows us to
evaluate quality of life as a whole instead of as separate factors, as in the regression and
bar graph approaches. Although the final number passes through several steps of manipulation,
it still gives a generalized value that shows roughly where each population density category
falls in quality of life. According to the final numbers, the highest quality of life lies in
the medium population density range, peaking in the 1,001-5,000 people per square mile
category. The 1-100 and 5,001-10,000 categories have similar values. If the values were plotted
against population density, the curve would resemble a parabola that starts at around 35-40,
rises to the 1,001-5,000 category, drops to around its starting point in the 5,001-10,000
category, and then falls sharply in the 10,001+ category. This model suggests the best quality
of life in the 1,001-5,000 category and a much lower quality of life once population density
exceeds 10,000 people per square mile.
Sources of Error
Because quality of life is very difficult to define, our models may not represent it
completely accurately. The regression model is impaired by the wide spread of population
density: the data points above 10,000 people per square mile lie far from the other data
points. With so many data points, there are also more than a few outliers. The bar graph
approach may be impaired by using only the median to represent a whole range. The 0-10 scale is
weakened by the numerous steps we took to arrive at the final quality of life value. However,
because quality of life only needs to be classified generally, we consider our approaches
sufficient.