Quasi-Experimental Design
Research-8-QuasiExpDesign.doc
not to the other half, but I might not have enough resources to do so. For example,
staffing and equipment restrictions might allow me only to provide the treatment to the
students in the bottom 25% of the distribution of scores on the pretest.
To illustrate the use of the regression-discontinuity design, I have simulated data
from a hypothetical project employing that design. I started by simulating 38 pairs of
scores (pretest, posttest) randomly drawn from a population where
Post = 7 + 1.35∗Pre + error, ρ = .9, and σ_error = 1.71. These data are available at this
hot link: RegD0.txt. This data file is a plain text file. Each line contains data for one
subject. The first score is the letter ‘C’ or ‘T,’ indicating whether that subject was
assigned to the control group or the treatment group. Following a blank space (the
delimiter), the next score is the subject’s posttest score, and the next is the subject’s
pretest score. I assigned to the treatment group all subjects whose score on the pretest
was 6 or less. I defined the effect of the treatment to be exactly zero in the population.
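The simulation described above can be sketched in a few lines of Python. This is a minimal sketch, not the procedure actually used to build RegD0.txt: the text does not give the distribution of pretest scores, so `pre_mean` and `pre_sd` are assumptions (a pretest standard deviation near 2.6 is what ρ = .9 implies, given the stated slope and error standard deviation), and the function name `simulate_rd` and the seed are hypothetical.

```python
import random

def simulate_rd(n=38, intercept=7.0, slope=1.35, sd_error=1.71,
                pre_mean=8.0, pre_sd=2.6, cutoff=6.0, effect=0.0, seed=1):
    """Simulate (group, post, pre) rows for a regression-discontinuity design.

    pre_mean and pre_sd are assumed values; the source text does not state them.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        pre = rng.gauss(pre_mean, pre_sd)
        # Assignment depends only on the pretest: at or below the cutoff -> treatment.
        group = "T" if pre <= cutoff else "C"
        post = intercept + slope * pre + rng.gauss(0.0, sd_error)
        if group == "T":
            post += effect  # zero here, so the treatment has no effect
        rows.append((group, post, pre))
    return rows

rows = simulate_rd()
# Same column order as the data file: group, posttest, pretest.
for group, post, pre in rows[:5]:
    print(f"{group} {post:.2f} {pre:.2f}")
```

Passing `effect=3.0` builds a three-point treatment effect into the treatment group while leaving everything else unchanged.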
If you remember how to use the statistical package you learned in your statistics
course, you can bring these data into that program and conduct a regression analysis
on them. Using all 38 data points, you would find that the estimated regression
parameters are: Post = 7.58 + 1.27∗Pre + error, r = .85, and MSE = 2.13. A plot of the
data with the regression line drawn in is shown in Figure 1 below. On this plot, data
points for subjects in the treatment group are plotted with the symbol ‘T’ and those for
subjects in the control group with the symbol ‘C.’
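For readers without a statistics package at hand, the fit can be computed with ordinary least squares written out directly. The sketch below assumes the file layout described earlier (group letter, posttest, pretest, separated by blanks). `parse_rows` and `fit_line` are hypothetical helper names, and the four sample rows are made-up values that only show the layout; they are not rows from RegD0.txt.

```python
import math

def parse_rows(text):
    """Parse 'group post pre' lines into (group, post, pre) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            rows.append((parts[0], float(parts[1]), float(parts[2])))
    return rows

def fit_line(pre, post):
    """Return intercept, slope, Pearson r, and MSE (SSE / (n - 2))."""
    n = len(pre)
    mean_pre = sum(pre) / n
    mean_post = sum(post) / n
    sxx = sum((x - mean_pre) ** 2 for x in pre)
    syy = sum((y - mean_post) ** 2 for y in post)
    sxy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = mean_post - slope * mean_pre
    r = sxy / math.sqrt(sxx * syy)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(pre, post))
    return intercept, slope, r, sse / (n - 2)

# Made-up rows, only to illustrate the layout (not taken from RegD0.txt).
sample = """T 14.1 5
T 15.9 6
C 17.2 8
C 21.0 10"""

rows = parse_rows(sample)
pre = [p for _, _, p in rows]
post = [q for _, q, _ in rows]
intercept, slope, r, mse = fit_line(pre, post)
print(f"Post = {intercept:.2f} + {slope:.2f}*Pre, r = {r:.2f}, MSE = {mse:.2f}")
```

Filtering `rows` on the group letter before calling `fit_line` gives the separate treatment-group and control-group lines.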
If you were to conduct two regression analyses on these data, one for the
treatment group and one for the control group, you would find slightly different
regression lines, but those differences would be due entirely to sampling error, because
the data for all subjects in both groups were randomly sampled from exactly the same
population. I have estimated the separate regression lines and plotted them on the
scatter plot which is shown in Figure 2 below. For the treatment group the separate
regression estimates are Post = 8.09 + 1.17∗Pre + error, r = .62, and MSE = 2.13 and
for the control group Post = 6.33 + 1.43∗Pre + error, r = .72, and MSE = 2.29. Looking
at the plot, you should be able to see the difference between these two regression lines,
but they don’t look very different from one another.
Next I re-simulated the data, but with one change: I built in a three-point
treatment effect when defining the population for those in the treatment group. These
data are available at hot link RegD1.txt. I have estimated the separate regression lines
and plotted them on the scatter plot which is shown in Figure 3 below. For the
treatment group the separate regression estimates are Post = 11.27 + 1.07∗Pre + error,
r = .82, and MSE = 1.35 and for the control group Post = 7.90 + 1.18∗Pre + error, r = .82,
and MSE = 1.25. Looking at the plot, you can clearly see the difference in these two
regression lines. I extended the control group regression line into the treatment group
area to show what we would expect the regression line to look like for the treatment
group if there were no treatment effect.
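One way to put a number on the visual impression is to evaluate both fitted lines at the cutoff and take the difference. The sketch below uses only the coefficients reported in the text for the two simulated data sets. Evaluating at Pre = 6 (the highest pretest score that received the treatment) is an assumption about where to measure the gap, and `gap_at_cutoff` is a hypothetical helper name.

```python
def gap_at_cutoff(treat, control, cutoff=6.0):
    """Treatment-line prediction minus control-line prediction at the cutoff.

    treat and control are (intercept, slope) pairs.
    """
    t_int, t_slope = treat
    c_int, c_slope = control
    return (t_int + t_slope * cutoff) - (c_int + c_slope * cutoff)

# (intercept, slope) pairs as reported in the text.
no_effect = gap_at_cutoff(treat=(8.09, 1.17), control=(6.33, 1.43))
with_effect = gap_at_cutoff(treat=(11.27, 1.07), control=(7.90, 1.18))

print(f"RegD0 (no effect built in):      gap = {no_effect:.2f}")
print(f"RegD1 (3-point effect built in): gap = {with_effect:.2f}")
```

With these coefficients the gap is about 0.20 points for the first data set and about 2.71 points for the second, close to the three-point effect that was built in.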
The plot provides pretty convincing evidence that there is an effect of the
treatment. The regression line for the treatment group is clearly higher than what we
would expect it to be if the treatment were without effect. It would be hard to imagine
how one of the threats to internal validity that we have already discussed could have
created the observed discontinuity at exactly the cutoff point, but there is another threat
that must be considered. What if the true relationship between the criterion variable and
the pretest is curvilinear, but we have used a linear analysis? As illustrated in Chapter
11 of Trochim, this can lead one to conclude that there is a treatment effect when in fact
there is not.
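The curvilinearity threat is easy to reproduce. In the sketch below the population is a smooth curve with no treatment effect at all and no random error. The curve Post = 5 + 0.2∗Pre², the pretest values 0 through 12, and the helper name `ols` are all assumptions chosen for illustration. Fitting separate straight lines on each side of the cutoff still produces a gap at the cutoff.

```python
def ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

cutoff = 6
pre = list(range(13))                    # pretest scores 0..12 (assumed)
post = [5 + 0.2 * p ** 2 for p in pre]   # curved, no treatment effect, no error

treat = [(p, q) for p, q in zip(pre, post) if p <= cutoff]
control = [(p, q) for p, q in zip(pre, post) if p > cutoff]

t_int, t_slope = ols(*zip(*treat))
c_int, c_slope = ols(*zip(*control))

# Difference between the two straight-line predictions at the cutoff.
gap = (t_int + t_slope * cutoff) - (c_int + c_slope * cutoff)
print(f"Apparent 'treatment effect' at the cutoff: {gap:.2f}")
```

The gap comes entirely from forcing straight lines onto a curve. A common check is to add a squared pretest term to the model and see whether the discontinuity survives.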
Proxy-Pretest Design
N  O1  X  O2
N  O1     O2
In this design one gathers the pretest information after the
experimental treatment has started. In other words, one finds an
archival proxy for the pretest. For example, suppose I ask the following question:
“Does completion of PSYC 2210 (experimental psychology) have an effect on a
student’s knowledge of statistics?” Ideally I would measure the students’ statistical
knowledge at the beginning of the semester, but suppose that the question did not
occur to me until the middle of the semester. I might decide to use as a proxy-pretest
students’ performance in their PSYC 2101 (statistics) class. My control group might
consist of a group of students taking some other class (not 2210). For each student I
would obtain a continuous measurement of the student’s performance in PSYC 2101
and, at the end of the semester, a continuous measurement of the student’s knowledge
of statistics. ANCOV would be used, with the proxy-pretest serving as a covariate.
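The adjustment that ANCOV performs can be written out by hand: the difference between the group means on the posttest is corrected by the pooled within-group slope times the difference between the group means on the covariate. The sketch below is a minimal illustration, not a full analysis (it has no significance test, and it assumes equal slopes in the two groups). The scores are invented, and `ancova_adjusted_difference` is a hypothetical name.

```python
def mean(values):
    return sum(values) / len(values)

def ancova_adjusted_difference(cov_t, post_t, cov_c, post_c):
    """Covariate-adjusted posttest difference (treatment minus control)."""
    sxy = sxx = 0.0
    # Pool the within-group sums of cross-products and squares.
    for cov, post in ((cov_t, post_t), (cov_c, post_c)):
        m_cov, m_post = mean(cov), mean(post)
        sxy += sum((x - m_cov) * (y - m_post) for x, y in zip(cov, post))
        sxx += sum((x - m_cov) ** 2 for x in cov)
    pooled_slope = sxy / sxx
    raw_diff = mean(post_t) - mean(post_c)
    return raw_diff - pooled_slope * (mean(cov_t) - mean(cov_c))

# Invented scores: PSYC 2101 performance (the proxy pretest) and
# end-of-semester statistics knowledge, for students in 2210 and
# for students in the comparison class.
proxy_2210 = [78, 85, 90, 70, 88]
post_2210 = [74, 83, 86, 69, 85]
proxy_other = [72, 80, 91, 65, 84]
post_other = [66, 73, 84, 60, 78]

diff = ancova_adjusted_difference(proxy_2210, post_2210, proxy_other, post_other)
print(f"Adjusted difference (2210 minus comparison): {diff:.2f}")
```

The adjusted difference equals the coefficient on the group indicator in a regression of the posttest on the group indicator and the covariate.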
regression line for predicting posttest from pretest in the control group only, as in Figure
4 below. In the plot, I have plotted the posttest data in the vertical dimension and the
pretest data in the horizontal dimension. The data point for your home town is plotted
with an ‘X’ instead of an ‘O.’ Evidence of the effectiveness of the program is the
displacement of the ‘X’ point from where it would be expected to fall given the
regression line obtained for the control group and your home town’s pretest score.
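The displacement described here is the treated unit's residual from the control-group line. A minimal sketch follows, with invented scores for the control units and for the treated unit, and with `displacement` as a hypothetical helper name:

```python
def displacement(control_pre, control_post, treated_pre, treated_post):
    """Observed posttest of the treated unit minus the control-line prediction."""
    n = len(control_pre)
    mx = sum(control_pre) / n
    my = sum(control_post) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(control_pre, control_post))
             / sum((x - mx) ** 2 for x in control_pre))
    intercept = my - slope * mx
    return treated_post - (intercept + slope * treated_pre)

# Invented values for illustration only.
control_pre = [10, 12, 15, 18, 20]
control_post = [11, 12, 16, 18, 21]
d = displacement(control_pre, control_post, treated_pre=14, treated_post=19)
print(f"Displacement of X from the control-group line: {d:.2f}")
```

A positive value means the treated unit scored higher on the posttest than the control-group line predicts from its pretest. Whether that indicates a beneficial effect depends on the direction of the outcome measure.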