
Student Debt Crisis at For-Profit Schools: Do for-profit colleges benefit students or merely saddle them with huge student debt?


Seungkwan Bryan Baek, Benjamin Legesse, Takehiro Matsuzawa
May 11, 2016
Abstract
The ongoing student debt crisis and the show Last Week Tonight with John Oliver inspired this investigation of whether for-profit schools in fact leave students with more debt than public and private schools. The effect of for-profit schools on debt was compared separately against public schools and against private schools to obtain a sharper analysis. The data covers all institutions of higher education in the US. In this observational study, propensity scores were used to balance the covariates. Three distinct analytical methods show that, in general, for-profit institutions create less debt per student than private institutions but more debt per student than public institutions.

Introduction

As more US high school students opt to go to university, the total student loan debt has increased, tripling in the last
decade. Student loan debt exceeded credit card debt in 2010 and auto loan debt in 2011, and it passed the $1 trillion
mark in 2012.1 Like stocks and other commodities, student loans can be packaged for investors, who hope to make returns
as the students slowly repay their student loans after graduation. Some experts argue that student loan debt is currently in a financial bubble,2 with unsustainable growth that could cause a financial collapse much greater in magnitude than the housing mortgage crisis of 2008. Thus, there is a great need to study the student loan debt bubble in order to prevent such a collapse.
In his popular show Last Week Tonight with John Oliver, Oliver asserted that one of the largest contributors to
student debt in the United States is for-profit schools. In America, there are three different types of higher education
schools: nonprofit public universities, nonprofit private universities, and for-profit universities. In the United States, most
public universities are state universities founded, operated, and funded by state government entities. Examples of public
universities are the University of Michigan, University of Massachusetts, and 2-year community colleges. Although private
universities are not operated by government, many still receive tax breaks, public student loans, and grants. However,
depending on their location, private universities may be subject to government regulation. Most private universities are
non-profit organizations. For example, Harvard and Yale are private universities.
On the other hand, for-profit, post-secondary institutions are operated by private, profit-seeking businesses. They
operate as businesses, receiving fees from each student they enroll. Prominent ones like the University of Phoenix and
DeVry University are publicly traded on Wall Street. For-profits exist in large part to fix educational market failures
left by traditional institutions, and they profit by serving students that public and private nonprofit institutions often
ignore.3
Beyond the immoral principle of commodifying college education, for-profit schools leave students with fewer benefits and greater debt. According to Oliver, they cost 5-6 times as much as a community college and twice as much as a state university. While 13% of post-secondary students attend for-profit schools, those students account for 33% of all student debt. As institutions that prioritize profit for their investors, for-profit institutions allocate less of their budget to the quality of education. For example, the University of Phoenix spends 25% of its budget on marketing and 10-20% on teachers. These schools are known to exploit groups such as immigrants, the poor, and veterans, who have traditionally had difficulty accessing post-secondary education.4
Thus, this paper seeks to answer the causal question posed by John Oliver: do for-profit colleges create more student
debt? The paper first compares public universities against for-profit schools, and then compares private universities with
for-profit schools. Naturally, private universities are more expensive than public and for-profit universities, and students
who attend public universities share similar profiles such as family income level and GPA with students who attend
for-profit institutions. As a result, grouping private and public universities together as nonprofit institutions may lead to
inaccurate results such as the ones from the Stat186 in-class presentation.
1 http://time.com/money/4168510/why-student-loan-crisis-is-worse-than-people-think/
2 http://www.marketwatch.com/story/the-us-education-bubble-is-now-upon-us-2015-11-09
3 http://chronicle.com/article/Why-Do-You-Think-Theyre/123660/
4 http://www.hbo.com/last-week-tonight-with-john-oliver/episodes/01/16-september-7-2014/video/ep-16-clip-studentdebt.html?autoplay=true

Data

The data was derived from the 2013 College Scorecard data from the U.S. Department of Education.5 The data was gathered from all undergraduate degree-granting institutions of higher education, and it contains basic school statistics such as student enrollment as well as supporting data such as student completion, debt and repayment, and earnings. These data provide insights into the performance of schools eligible to receive federal financial aid, and offer a look at the outcomes of students at those schools. The data included 7804 schools: 3762 for-profit, 1969 private, and 2073 public. Each unit, or row, represents a specific school. The data had 1729 covariates. For each of the two comparisons described in the introduction, the treatment was whether or not the institution was a for-profit school. The outcome variable was median debt per student.

2.1 Covariates

The study focused on relevant variables that would likely impact the median debt per student. The covariates were
narrowed down to:
1. Net tuition revenue per full-time equivalent (FTE) student
2. Instructional expenditures divided by the number of FTE students
3. Average faculty salary
4. Admission Rate
5. Number of undergraduates
6. Proportion of undergraduates enrolled part time in the fall term
7. Average Undergraduate Student Age
8. Highest Degree offered by the school (categorical; 4 represents masters/doctorates, 3 represents bachelors, etc.)
9. 3-year cohort default rate
10. % of Undergraduates Receiving Federal Loans
11. 3-year Repayment Rate on Federal Student Loans

2.2 Assumptions

First, student characteristics are assumed to be uncorrelated with the quality of the institution in which they enroll. While the aggregate model is attractive because data on aggregate institutional outcomes and student characteristics are more readily available, it is unlikely to yield accurate measures of differences in college quality across a broad and heterogeneous set of institutions.
Additionally, colleges come in two main types: non-profit or for-profit. This leads to the central assumption that having a for-profit structure can be assigned as a treatment. In this context, the control group is the non-profit schools and the treated group is the for-profit schools, which hypothetically represent non-profit schools that have received the treatment and been transformed into for-profit schools. In this hypothetical experiment, SUTVA holds. First, the treatment (transforming a non-profit school into a for-profit school) does not change the outcomes of the other non-profit schools; that is, there is no interference among the units. Next, the treatment is binary; there are no levels of being a for-profit school. In reality, however, treating some non-profit schools may affect the outcomes of other non-profit schools, in which case SUTVA would not hold.

Analysis

Each author of the paper developed his or her own method to address the problem. The data contained many missing values. After selecting the covariates, assigning the treatment, and dividing the study into private vs. for-profit and public vs. for-profit comparisons, Ben and Takehiro removed institutions that had null values for any of the covariates. Because this removed 5734 institutions, Bryan instead imputed the missing values in order to keep all 7804 institutions in the analysis. Bryan and Takehiro then used subclassification to achieve covariate balance. Ben used the R package MatchIt to achieve covariate balance.
5 https://collegescorecard.ed.gov/data/

Figure 1: Plan of Attack.

3.1 Public vs. For-Profit

3.1.1 Bryan

First, it was important to figure out which schools were missing which covariate values. In the context of imputation, missing values should be imputed from schools with a similar profile. The result can be misleading if data for a private university (e.g., Harvard) is imputed from a public university (e.g., University of Massachusetts), because the two differ drastically in operations. In the context of deleting missing values (Ben and Takehiro), it is equally important to know which group of schools is being deleted. The initial analysis showed that for-profits in particular lacked admission rate, faculty salary, and tuition values compared to both public and private institutions. The private institutions were removed for this portion of the analysis.
Next, the missing values were imputed by randomly sampling within each of the two institution type groups (e.g., public). An initial look at the distribution of median debt per student shows that although the distribution for public universities is wider, the median of the distribution is higher for the for-profit institutions.
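
The group-wise imputation described above could be sketched as follows. This is a minimal illustration in Python, not the authors' code (the paper's analysis was carried out in R); the function name `impute_within_group`, the field names, and the toy records are hypothetical and do not correspond to actual College Scorecard column names.

```python
import random

def impute_within_group(rows, group_key, field, seed=0):
    """Fill missing `field` values by sampling observed values from the same group."""
    rng = random.Random(seed)
    # Collect the observed (non-missing) values for each institution type.
    observed = {}
    for row in rows:
        if row[field] is not None:
            observed.setdefault(row[group_key], []).append(row[field])
    imputed = []
    for row in rows:
        new_row = dict(row)  # copy so the input records stay untouched
        if new_row[field] is None:
            donors = observed.get(new_row[group_key])
            if donors:  # leave the gap if the group has no observed values
                new_row[field] = rng.choice(donors)
        imputed.append(new_row)
    return imputed

# Hypothetical toy records (assumed field names).
schools = [
    {"type": "public", "adm_rate": 0.60},
    {"type": "public", "adm_rate": None},
    {"type": "for_profit", "adm_rate": 0.90},
    {"type": "for_profit", "adm_rate": None},
]
filled = impute_within_group(schools, "type", "adm_rate")
```

Sampling only from donors of the same institution type keeps, for example, a public school's missing admission rate from being filled with a for-profit school's value.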

Figure 2: Public (up) vs. For-profit (down) median debt per student
Due to the nature of the dataset, this paper approaches the causal question from the point of view of an observational study. An observational study attempts to replicate a randomized experiment. To that end, the outcome (median debt per student) was removed from the data during the design phase. In addition, considerable attention was given to covariate balance, in order to imitate how a hypothetical randomized experiment addresses the missing data problem.

To achieve covariate balance, propensity scores were estimated via logistic regression. Stepwise regression was used to select covariates automatically from the first-order, second-order, and interaction terms. However, this did not improve the covariate balance significantly, so the analysis kept only the first-order terms. Next, some of the control units that had a higher propensity score than the minimum propensity score of the treatment units were discarded. A propensity score gives the probability of being assigned to treatment (1) rather than control (0), so it would not make sense for control units to have significantly high propensity scores relative to treatment units.
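
As a rough illustration of the propensity score step, the sketch below fits a logistic regression by plain gradient ascent and then trims control units that fall outside the range of the treated scores. It is written in Python with only the standard library and is not the authors' code. The toy data, the learning rate, and the number of steps are assumptions, and the common-support trimming rule shown here is one common convention that may differ from the exact rule used in the analysis.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.1, steps=5000):
    """Fit P(T=1 | x) = sigmoid(b0 + b . x) by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)  # beta[0] is the intercept
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, ti in zip(X, t):
            resid = ti - sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, xi):
    """Estimated probability of treatment for covariate vector xi."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))

# Hypothetical toy data: one standardized covariate, treatment 1 = for-profit.
X = [[-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0]]
t = [0, 0, 1, 0, 1, 0, 1, 1]
beta = fit_logistic(X, t)
scores = [propensity(beta, xi) for xi in X]

# Common-support trimming (assumed rule): keep every treated unit, and keep a
# control unit only if its score lies within the range of the treated scores.
treated_scores = [s for s, ti in zip(scores, t) if ti == 1]
low, high = min(treated_scores), max(treated_scores)
keep = [i for i, (s, ti) in enumerate(zip(scores, t)) if ti == 1 or low <= s <= high]
```

In this toy example the two control units with the lowest covariate values have no comparable treated unit and are dropped.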
Next, the data were divided into 4 and, separately, 8 subclasses based on tuition, and covariate balance was tested under each scheme. For the 4 subclasses, the quantile cut points were chosen arbitrarily to give a more even balance of control and treatment units within the subclasses. These bins also made it easier to interpret the results in terms of the top 10% and bottom 50% of tuition. The 8 subclasses were divided evenly. Histograms show how the median debt of the treatment groups compares within each subclass. To further check the covariate balance, a love plot and a density plot were used.
Table 1: Example of covariate balance with 4 subclasses, c(0.5, 0.75, 0.9)

Subclass (tuition)   Control   Treated
<50%                 1908      1881
50-75%               116       943
75-90%               30        561
>90%                 19        377
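
The subclassification and the balance check can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: the helper names are hypothetical, the quantile rule is a simple empirical one, and the standardized mean difference is used here as the quantity that a love plot typically displays.

```python
import math
import statistics
from collections import Counter

def quantile_cutpoints(values, probs):
    """Simple empirical quantiles of `values` at each probability in `probs`."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[min(n - 1, int(p * n))] for p in probs]

def assign_subclass(value, cuts):
    """Index of the first cut point that `value` falls below (len(cuts) if none)."""
    for i, cut in enumerate(cuts):
        if value < cut:
            return i
    return len(cuts)

def std_mean_diff(treated, control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt((statistics.variance(treated) + statistics.variance(control)) / 2)
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd

# Hypothetical toy data: 20 schools with tuition 1..20 (arbitrary units) and
# alternating treatment assignment.
tuition = list(range(1, 21))
treat = [i % 2 for i in range(20)]

cuts = quantile_cutpoints(tuition, (0.5, 0.75, 0.9))
subclass = [assign_subclass(v, cuts) for v in tuition]
counts = Counter(zip(subclass, treat))  # (subclass, treatment) -> number of schools

# Balance on tuition across the whole toy sample.
overall = std_mean_diff(
    [v for v, ti in zip(tuition, treat) if ti == 1],
    [v for v, ti in zip(tuition, treat) if ti == 0],
)
```

A table of `counts` by subclass and treatment group has the same layout as Table 1, and computing `std_mean_diff` for each covariate before and after subclassification gives the values a love plot would compare.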

Figure 3: The distributions of the median debt in the full data and within each of the 4 subclasses, for both treatment groups.

Figure 4: The love plot and the density plot show that the covariate balance is not optimal. For example, balance on Tuition improved significantly at the expense of the other covariates.

So far, the data was handled as in the design of a randomized experiment, without the outcome; the design must then be checked against the actual data. Thus, the data was reloaded with the actual outcomes and combined with the previous work. As a check, naive estimates based on the difference in mean median debt between the treatment groups were computed, first on the entire dataset and then within each subclass. Finally, a linear regression of the median debt on the covariates was performed to find the treatment effect.

Table 2: Treatment Effect on Median Debt per Student

                        Total        <50% Tuition   50-75% Tuition   75-90% Tuition   >90% Tuition
Mean Treatment Effect   515          -285           -5367            -3728            1496
95% C.I.                (261, 769)   NA             NA               NA               NA

Table 3: Treatment Effect on Median Debt per Student

Estimate   Std. Error   P-value   95% C.I.
608        180          <0.001    (255, 961)
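
The naive estimates within subclasses can be sketched as a difference in means per subclass, optionally combined into a single number. This Python sketch is illustrative only: the function names and toy values are hypothetical, and weighting each subclass by its number of units is an assumption about how such an average might be formed.

```python
from statistics import mean

def subclass_effects(units):
    """units: iterable of (subclass, treated_flag, outcome).

    Returns {subclass: (difference in means, number of units)} for every
    subclass that contains both treated and control units.
    """
    groups = {}
    for s, t, y in units:
        groups.setdefault(s, {0: [], 1: []})[t].append(y)
    effects = {}
    for s, g in groups.items():
        if g[0] and g[1]:
            effects[s] = (mean(g[1]) - mean(g[0]), len(g[0]) + len(g[1]))
    return effects

def weighted_average(effects):
    """Average of the subclass effects, weighted by subclass size (assumed weighting)."""
    total = sum(n for _, n in effects.values())
    return sum(e * n for e, n in effects.values()) / total

# Hypothetical toy data: (subclass, treated, median debt in arbitrary units).
units = [
    (0, 1, 10), (0, 1, 12), (0, 0, 8), (0, 0, 10),
    (1, 1, 20), (1, 0, 26),
]
effects = subclass_effects(units)
overall = weighted_average(effects)
```

As in Table 2, the sign of the estimate can differ between subclasses (here +2 in subclass 0 and -6 in subclass 1), which a single pooled estimate would hide.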

3.1.2 Ben

To balance the data, the propensity scores were estimated using stepwise regression on all the covariates. Control units were not trimmed based on the propensity scores of the treated units, because doing so left too few control units for matching to be carried out successfully. Finally, the treated units were matched one-to-one with the control units. The 234 unmatched units were removed from the data before the analysis.
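
One-to-one matching on the propensity score can be sketched as a greedy nearest-neighbor search without replacement. The Python sketch below is only an illustration of that general idea; the paper does not state which matching method or settings were used, so the greedy rule, the matching order, and the toy scores are all assumptions.

```python
def match_one_to_one(treated, control):
    """Greedy nearest-neighbor matching without replacement.

    treated, control: dicts mapping unit id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)
    pairs = []
    # Match treated units with the highest scores first (assumed ordering).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break  # remaining treated units stay unmatched
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        pairs.append((t_id, c_id))
        del available[c_id]  # each control unit is used at most once
    return pairs

# Hypothetical toy scores.
treated = {"T1": 0.80, "T2": 0.60, "T3": 0.55}
control = {"C1": 0.75, "C2": 0.50}
pairs = match_one_to_one(treated, control)
matched_ids = {unit for pair in pairs for unit in pair}
unmatched = [u for u in list(treated) + list(control) if u not in matched_ids]
```

Here `T3` is left without a partner because there are more treated than control units, which mirrors how unmatched units are dropped before the outcome analysis.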

Figure 5: The covariate balance before and after matching.


After adding back the median debt as the outcome, linear regression was performed to get the treatment effect and the 95% confidence interval.

                      Results
Treatment Effect      494.9
Confidence Interval   (342.16 to 647.65)

Then a Neyman test was performed to compare the results, particularly the significance of the bounds, with that of
the linear regression results. The Neyman test was chosen over the Fisher test since it provides the confidence interval
needed to compare against the linear regression. A comparison is needed to make sure the treatment effect is convincing,
with the given data.

                      Results
Treatment Effect      97.22
Confidence Interval   (21.79 to 172.66)
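
A Neyman-style analysis of this kind can be sketched as a difference in means with a conservative standard error and a normal-approximation interval. The Python sketch below is illustrative and not the authors' code; the function name, the toy outcomes, and the use of z = 1.96 for a 95% interval are assumptions.

```python
import math
from statistics import mean, variance

def neyman(treated, control, z=1.96):
    """Difference in means with the conservative Neyman variance estimate.

    Returns (effect, (lower, upper)) using a normal approximation.
    """
    effect = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return effect, (effect - z * se, effect + z * se)

# Hypothetical toy outcomes (median debt in arbitrary units).
treated = [12, 14, 16, 18]
control = [10, 11, 12, 13]
effect, (lower, upper) = neyman(treated, control)
```

An interval that excludes zero, as in the toy example, is what the text refers to as a significant confidence interval.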

These results show that public schools leave students with lower median debt than for-profit schools. Both the linear regression and the Neyman test have positive treatment effects and 95% confidence intervals that exclude zero.
3.1.3 Takehiro

After removing private institutions and any schools with missing values, the treatment (being a for-profit school) was tested against median debt per student. Schools with different amounts of tuition may have different treatment effects. Therefore, the data was subclassified into quintiles based on the amount of tuition.

            <20% Tuition   20-40% Tuition   40-60% Tuition   60-80% Tuition   >80% Tuition
Control     122            113              128              125              113
Treatment   79             64               61               76               87

After the subclassification, covariate balance was checked in order to provide more meaningful estimates of the effect of for-profit schools and to increase the precision of the results.

Covariate Balance after Subclassification

In the love plot, it is evident that the covariates are more balanced after subclassifying the data.
Using median debt as the outcome variable, linear regression was employed to determine whether for-profit schools lead students to have greater debt than public schools do.
Treatment Effect on Median Debt per Student

                 p-value    coefficient of       p-value            coefficient
                            for-profit (TREAT)   (After Stepwise)   (After Stepwise)
All Data         2.51e-09   434.1                3.58e-13           439.50822
<20% Tuition     0.00238    531.3                5.38e-06           425.94756
20-40% Tuition   0.038788   356.8                0.0600             243.60038
40-60% Tuition   0.00510    450.101307           1.58e-05           497.78057
60-80% Tuition   0.01391    442.3                0.000178           540.45508
>80% Tuition     0.00173    503.1                6.54e-05           557.07975
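
The regression adjustment used throughout this section, in which the coefficient on the for-profit indicator is read as the treatment effect, can be sketched with ordinary least squares. The Python sketch below solves the normal equations directly and uses only the standard library. It is an illustration rather than the authors' code: the single covariate, the variable names, and the toy data (generated without noise from assumed coefficients) are hypothetical.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical toy data: debt = 100 + 500 * TREAT + 2 * tuition, with no noise.
tuition = [10, 20, 30, 40, 15, 25, 35, 45]
treat = [0, 0, 0, 0, 1, 1, 1, 1]
debt = [100 + 500 * t + 2 * x for t, x in zip(treat, tuition)]

X = [[1.0, t, x] for t, x in zip(treat, tuition)]  # intercept, TREAT, tuition
intercept, treat_coef, tuition_coef = ols(X, debt)
```

Fitting the same model separately within each tuition subclass would give one TREAT coefficient per row, as in the table above.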

To provide more context beyond the median debt, the percent of undergraduates receiving federal loans was also used as the outcome variable. The median debt reveals only the typical debt of students who attend the school, not the percentage of students who end up in debt from attending the school, so this second outcome gives more insight.
Percent of Undergraduates Receiving Federal Loans

                 p-value    coefficient of       p-value            coefficient
                            for-profit (TREAT)   (After Stepwise)   (After Stepwise)
All Data         2e-16      1346.24190           2e-16              1276.64534
<20% Tuition     1.15e-06   1192                 1.59e-15           1188.77417
20-40% Tuition   4.5e-09    1340.73360           8.54e-16           1387.75668
40-60% Tuition   0.000106   1069                 1.80e-08           1149.34161
60-80% Tuition   2.85e-09   1438.58887           9.68e-10           1302.62161
>80% Tuition     1.59e-14   1681.28419           2e-16              1658.47359

3.2 Private vs. For-Profit

3.2.1 Bryan

In this part, public institutions were removed. Then, steps similar to those in the public vs. for-profit comparison were taken. An initial look at the distributions shows that the center of the distribution (mean and median) for private institutions is greater (further to the right) than that of the for-profit institutions. The covariate balance was slightly better for all of the covariates.
To illustrate a different subclassification, this section shows the 8 subclasses mentioned in the public vs. for-profit comparison. Whether the data is looked at as a whole or within each subclass, private institutions tend to leave students with higher median debt.
Finally, a linear regression of the median debt on the covariates was performed to find the treatment effect, and it affirms the conclusion that private institutions tend to create greater median debt per student.

Figure 6: Private (up) vs. For-profit (down) median debt per student

Figure 7: The love plot and the density plot show that the covariate balance is not optimal. For example, balance on Tuition improved significantly at the expense of the other covariates.

Treatment Effect on Median Debt per Student

                Mean    95% C.I.
Total           -6300   (-6584, -6017)
1st Subclass    -5054   NA
2nd             -3816   NA
3rd             -5263   NA
4th             -4592   NA
5th             -5807   NA
6th             -5838   NA
7th             -6253   NA
8th             -5867   NA
Weighted Avg.   -5400   (-5681, -5118)

Table 4: Treatment Effect on Median Debt per Student

Estimate   Std. Error   P-value   95% C.I.
-3097      194          <2e-16    (-3477, -2718)

3.2.2 Ben

As in the earlier analysis involving public institutions, the propensity scores were estimated using stepwise regression for the comparison between private and for-profit schools. Then, the control units with propensity scores higher than the maximum treated propensity score were removed, as were the units with propensity scores lower than the minimum treated propensity score. The treated units were then matched one-to-one with the remaining control units. The 582 unmatched units were removed from the data before the analysis.

Figure 8: The covariate balance before and after matching.


The median debt was re-incorporated, then linear regression was performed to get the treatment effect and the 95% confidence interval.

                      Results
Treatment Effect      -65.8
Confidence Interval   (-175.22 to 44.21)

Finally, a Neyman test was done to follow up on the linear regression results.

                      Results
Treatment Effect      -146.21
Confidence Interval   (-233.91 to -58.49)

3.2.3 Takehiro

A similar analysis was done as before, except that this section uses private institutions instead of public institutions as the control group.

            <20% Tuition   20-40% Tuition   40-60% Tuition   60-80% Tuition   >80% Tuition
Control     239            217              210              216              220
Treatment   58             75               82               76               82

Covariate Balance after Subclassification

The covariates are more balanced after subclassification, as shown in the love plot.


Treatment Effect on Median Debt per Student

                 p-value    coefficient of       p-value            coefficient
                            for-profit (TREAT)   (After Stepwise)   (After Stepwise)
All Data         0.00139    -146.7               0.000443           -158.5
<20% Tuition     9.32e-06   -481.969             6.39e-06           -475.30925
20-40% Tuition   0.512406   -59.57
40-60% Tuition   0.46270    87.71868
60-80% Tuition   0.54717    -63.89245
>80% Tuition     0.0695     -187.2

Unplaced after-stepwise entry (row not identified): p-value 0.0249, coefficient -208.49.

Percent of Undergraduates Receiving Federal Loans

                 p-value    coefficient of       p-value            coefficient
                            for-profit (TREAT)   (After Stepwise)   (After Stepwise)
All Data         0.050810   148.08262            0.049664           142.74385
<20% Tuition     0.495464   -133.80037
20-40% Tuition   0.78727    48.32133
40-60% Tuition   0.22529    215.03584
60-80% Tuition   0.257344   183.01937
>80% Tuition     0.339150   161.74011

Unplaced after-stepwise entries (rows not identified): p-values 0.001525 and 0.00651; coefficients -0.35927 and 386.55781.

Results

4.1 Bryan

For public vs. for-profit schools, the result was interesting. The naive estimate and the linear regression on the entire dataset show that for-profit schools saddle students with more debt ($515 and $608, respectively). Both confidence intervals exclude zero, and the regression p-value indicates that the estimate is significant.
However, subclassification tells a different story: for institutions in the top 10% of tuition, for-profit schools have higher median debt per student, whereas in the lower tuition subclasses, and especially in the mid-range (50-90%), public schools have higher median debt per student.
One cause for concern in this finding is the covariate balance shown in the love plot. Improving the balance on Tuition, the covariate used to subclassify, made the balance on the other covariates worse.
For private vs. for-profit schools, the naive estimate and the linear regression on the entire dataset show that private schools saddle students with more debt ($6300 and $3097, respectively). Both confidence intervals exclude zero, and the regression p-value indicates that the estimate is significant.
Subclassification supports this finding. Within each of the 8 subclasses, private institutions always carried higher median debt per student.
Covariate balance improved slightly across the different covariates. Multiple attempts at subclassification and an in-depth look at the data showed that the improvements were insignificant.

4.2 Ben

The results of public versus for-profit schools show that public schools have less median debt per student than for-profit schools. Both the linear regression and the Neyman test produced positive treatment effects with 95% confidence intervals that exclude zero.
The results of private versus for-profit schools show that private schools have greater median debt per student than for-profit schools. Both the linear regression and the Neyman test have negative treatment effects, indicating that private schools are worse. However, the linear regression's 95% confidence interval includes zero, so the treatment effect is ambiguous.

4.3 Takehiro

When comparing against public schools across five subclasses, for-profit schools' median debt and percent of undergraduates receiving federal loans were higher than those of public schools. However, when comparing against private schools, there is no statistically significant difference in median debt or in percent of undergraduates receiving federal loans. This can be explained by the fact that private schools are more expensive than public schools.

Conclusion

All three separate analyses reach a similar conclusion. Students who attend public institutions leave with less median debt than those who attend for-profit schools. However, the results were inconclusive for the treatment effect against private institutions. While Bryan found a significant, negative treatment effect for for-profit schools against private schools, Ben and Takehiro found that the negative treatment effect was insignificant.
In further analysis, Takehiro showed that public schools have a smaller percentage of students receiving federal loans than for-profit schools. There was no statistically significant difference in the case of private schools.

5.1 Limitations

First, the data includes all institutions that offer some form of degree. It includes trade and culinary schools that do not compare well against typical universities. In addition, even among the universities, 2-year and 4-year schools were grouped in the same cohort. There are many differences between them, such as tuition and net expenditure on students, that could change the results of the analysis.
A softer limitation is the amount of missing information. Some schools, especially for-profit schools, withheld information completely. The imputation used in Bryan's method still faces limitations; actual data is always better than simulated data.

The data also lacks measures of the quality of education, such as the teacher-student ratio published in a given year. These criteria are often used to rank universities around the world. Such values could help determine the relative value of the debt.
