In a medical research study, 36 patients suffering from severe clinical depression were prescribed
different treatments and the effectiveness of these treatments in managing severe clinical
depression was recorded. Note that for the duration of the study, each patient was uniquely
prescribed one and only one of the four available treatments (referred to as Treatments A, B, C, and
D). Moreover, each treatment was prescribed to the exact same number of patients. Finally, all of
the variables recorded in the study are given in Table 3.1.
The pairwise correlation matrix for all these variables is first computed as shown below (Table 3.2):
As the correlation coefficient value between Effectiveness and Treatment C is negative, it was
concluded that Treatment C is not effective in treating clinical depression. State whether this claim is
true or false and justify your conclusion.
Solution: False. The negative correlation coefficient merely shows that, as the Treatment C indicator
goes from 0 to 1, average effectiveness decreases; that is, patients who were not prescribed
Treatment C (Treatment C = 0) have higher effectiveness on average than patients who were
prescribed it (Treatment C = 1). The effectiveness of Treatment C is therefore measured relative to
the other treatments, so it is fair to state that Treatment C is not as effective on average as the
other treatments in the study, but we cannot state that it is not effective in treating clinical
depression. To make that claim, we would have to compare patients given Treatment C with
patients who were not given any treatment at all, but this option is not available in the data.
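The point can be illustrated with a small sketch. The scores below are hypothetical and are not the study data; they are chosen so that every patient scores high, yet the Treatment C group scores somewhat lower than the rest. The helper `pearson` is written here only for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores, NOT the study data: all scores are high,
# but the Treatment C group is slightly lower than the others.
treatment_c = [1, 1, 1, 0, 0, 0, 0, 0, 0]
effectiveness = [70, 72, 74, 80, 82, 84, 86, 88, 90]

r = pearson(treatment_c, effectiveness)
mean_c = mean(e for t, e in zip(treatment_c, effectiveness) if t == 1)
mean_rest = mean(e for t, e in zip(treatment_c, effectiveness) if t == 0)

print(f"correlation with the Treatment C dummy: {r:.3f}")
print(f"mean with C: {mean_c}, mean without C: {mean_rest}")
```

The correlation is negative only because the Treatment C mean lies below the mean of the remaining patients; it says nothing about how Treatment C compares with no treatment.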
Model 1:
A linear regression model was initially constructed to predict effectiveness based only on the
treatment used and the following SPSS output was obtained.
Model 1 Coefficients (a)
(Constant) 55.444
Predicted mean effectiveness by treatment (Treatment D is the base category):
Treatment D = 55.444
Treatment A = 55.444 + 7.333 ≈ 62.778
Treatment B = 55.444 - 2.889 ≈ 52.556
Treatment C = 55.444 - 5.556 ≈ 49.889
What would the prediction equation be if the base category was taken to be Treatment B?
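If the base category is taken to be Treatment B, the constant becomes the Treatment B mean and
each coefficient is the difference between that treatment's mean and the Treatment B mean. The
prediction equation can then be written as:
Effectiveness = 52.556 + 10.222 TreatmentA - 2.667 TreatmentC + 2.889 TreatmentD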
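A minimal sketch of this re-basing is given below. It assumes the rounded Model 1 coefficients used in the worked lines above (7.333, -2.889 and -5.556 relative to Treatment D); the function name `rebase` is hypothetical.

```python
# Group means implied by the Model 1 output (Treatment D is the base category).
# The coefficients are the rounded values from the worked solution.
group_means = {
    "A": 55.444 + 7.333,
    "B": 55.444 - 2.889,
    "C": 55.444 - 5.556,
    "D": 55.444,
}


def rebase(means, base):
    """Return (constant, {treatment: coefficient}) for the chosen base category."""
    constant = means[base]
    coefs = {t: m - constant for t, m in means.items() if t != base}
    return constant, coefs


constant, coefs = rebase(group_means, "B")
print(f"constant = {constant:.3f}")
for treatment, coef in sorted(coefs.items()):
    print(f"Treatment{treatment}: {coef:+.3f}")
```

Changing the base category only shifts the constant and the dummy coefficients; the predicted mean for each treatment is unchanged.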
effectiveness on average. Perform an appropriate test to verify this claim at a 95% confidence level,
clearly stating the null and alternate hypotheses.
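H0: beta1 = beta2 = beta3 = 0 (all treatments have the same mean effectiveness)
H1: at least one of beta1, beta2, beta3 is not equal to 0
k = 3, n = 36, so the degrees of freedom are (k, n - k - 1) = (3, 32)
F = MSR/MSE = 1.9506
F_critical = F_0.05(3, 32) = 2.90
As F < F_critical, we cannot reject the null hypothesis at the 5% significance level, and therefore
we cannot reject the claim that all treatments are the same on average.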
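The decision rule can be sketched as follows. The F statistic and the critical value are taken from the solution as given (the critical value is a table lookup, not computed here); the helper names are hypothetical.

```python
def f_test_dfs(n, k):
    """Numerator and denominator degrees of freedom for the overall F-test."""
    return k, n - k - 1


def reject_null(f_stat, f_critical):
    """Reject H0 only when the observed F exceeds the critical value."""
    return f_stat > f_critical


df1, df2 = f_test_dfs(n=36, k=3)
f_stat = 1.9506      # MSR / MSE as reported in the solution
f_critical = 2.90    # tabulated value for (3, 32) degrees of freedom at the 5% level

print(f"df = ({df1}, {df2}), reject H0: {reject_null(f_stat, f_critical)}")
```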
Model 2
Recognizing that ‘Age’ of the patient might be a factor when measuring the effectiveness of a
received treatment, Model 1 was enhanced by using an interaction variable, namely
“Age*Treatment” (all combinations) and the regression output for this new model is given below.
Model Summaryb
Coefficients (a)
                 Unstandardized Coefficients   Standardized Coefficients
Model 1          B          Std. Error         Beta                        t         Sig.
(Constant)       49.470     5.864                                          8.437     .000
TreatmentB       -22.030    7.882              -.779                       -2.795    .009
TreatmentC       -44.931    7.512              -1.589                      -5.981    .000
TreatmentD       -20.964    7.844              -.742                       -2.673    .012
Age_TreatA       .296       .125               .496                        2.361     .025
Age_TreatB       .549       .110               .948                        5.010     .000
Age_TreatC       1.074      .104               1.742                       10.286    .000
Age_TreatD       .620       .114               1.018                       5.439     .000
a. Dependent Variable: Effectiveness
Question 3.5 (3 points)
Based on the regression output, plot the regression equations for each of the four treatments (on the
same graph), if the minimum and maximum ages of the patients in the dataset are 19 and 67 years,
respectively. Based on your regression plots, which treatment would you recommend as the best on
average for a patient of age 60 years?
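Equating the prediction equations for Treatments A and C (49.470 + 0.296 Age = 4.539 + 1.074 Age),
we get the threshold age where Treatment C surpasses Treatment A as 44.931/0.778 ≈ 57.75 years.
(In the plot above, it is the intersection point between the blue and grey lines). Therefore, based
on the prediction equations, on average, Treatment A works better for younger patients and
Treatment C works better for elderly patients. For a patient of age 60 years, Treatment C is
recommended: its predicted effectiveness is 68.98, compared with 67.23 for Treatment A, 65.71 for
Treatment D and 60.38 for Treatment B.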
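The four regression lines and their comparison can be sketched as below, using the rounded coefficients from the Model 2 table (Treatment A is the base category). The names `predict` and `crossing_age` are hypothetical helpers. With these rounded coefficients the A and C lines cross near 57.75 years.

```python
# Rounded coefficients from the Model 2 table (Treatment A is the base category).
CONSTANT = 49.470
INTERCEPT_SHIFT = {"A": 0.0, "B": -22.030, "C": -44.931, "D": -20.964}
AGE_SLOPE = {"A": 0.296, "B": 0.549, "C": 1.074, "D": 0.620}


def predict(treatment, age):
    """Predicted effectiveness for a patient of the given age on the given treatment."""
    return CONSTANT + INTERCEPT_SHIFT[treatment] + AGE_SLOPE[treatment] * age


def crossing_age(t1, t2):
    """Age at which the regression lines of two treatments intersect."""
    return (INTERCEPT_SHIFT[t1] - INTERCEPT_SHIFT[t2]) / (AGE_SLOPE[t2] - AGE_SLOPE[t1])


predictions_at_60 = {t: predict(t, 60) for t in "ABCD"}
best_at_60 = max(predictions_at_60, key=predictions_at_60.get)

for t, value in predictions_at_60.items():
    print(f"Treatment {t} at age 60: {value:.2f}")
print(f"best at age 60: Treatment {best_at_60}")
print(f"A and C cross at age {crossing_age('A', 'C'):.2f}")
```

Evaluating `predict` at the two ends of the age range (19 and 67) gives the end points needed to draw each line by hand.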
What is the probability that the effectiveness level would be at least 60 for a 55-year old patient who
is administered Treatment B?
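Given sqrt(MSE) = 4.8479 and the predicted mean 49.470 - 22.030 + 0.549(55) = 57.635:
Y|x ~ N(57.635, 4.8479^2)
P(Y >= 60) = P(Z >= (60 - 57.635)/4.8479) = P(Z >= 0.488) ≈ 0.313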
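A short check of this probability is sketched below with Python's `statistics.NormalDist`. As in the solution, it assumes the predicted mean is treated as known and only the residual standard deviation sqrt(MSE) is used.

```python
from statistics import NormalDist

# Predicted mean for Treatment B at age 55, from the rounded Model 2 coefficients.
mean_b_55 = 49.470 - 22.030 + 0.549 * 55
sigma = 4.8479  # sqrt(MSE) as given in the solution

# Upper-tail probability P(Y >= 60) under N(mean_b_55, sigma^2).
prob_at_least_60 = 1 - NormalDist(mu=mean_b_55, sigma=sigma).cdf(60)

print(f"predicted mean = {mean_b_55:.3f}")
print(f"P(Y >= 60) = {prob_at_least_60:.3f}")
```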
Model 3
A stepwise regression procedure was initiated to predict effectiveness using all the original
explanatory variables given in Table 3.1 as well as interaction variables: Age*Treatment A;
Age*Treatment B; and Age*Treatment C. The SPSS output for the first two steps of the stepwise
regression procedure is shown below.
Variables Entered/Removed (a)
Step   Variables Entered   Variables Removed   Method
1      Age                 .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
2      TreatmentA          .                   Stepwise (Criteria: Probability-of-F-to-enter <= .050, Probability-of-F-to-remove >= .100).
At the end of Step 1 of the regression procedure, determine the partial and part correlations for
Treatment A. Correspondingly, determine what percentage of the variability in “effectiveness” is
explained by Step 1 and Step 2, respectively?
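R^2 = 0.7967^2 = 0.6347, i.e. 63.47% (Step 1)
R^2 = 0.6347 + 0.331^2 = 0.7443, i.e. 74.43% (Step 2; the increase in R^2 is the squared part correlation of Treatment A)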
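The arithmetic can be sketched as follows. It assumes that 0.331 is the part (semipartial) correlation of Treatment A at the end of Step 1, which is how the solution uses it; the partial correlation is then obtained from the standard relation partial = part / sqrt(1 - R^2 of Step 1).

```python
from math import sqrt

r_step1 = 0.7967      # multiple R after Step 1 (Age only), from the solution
part_corr_a = 0.331   # assumed to be the part (semipartial) correlation of Treatment A

r2_step1 = r_step1 ** 2
# Adding a variable raises R^2 by its squared part correlation.
r2_step2 = r2_step1 + part_corr_a ** 2
# The partial correlation rescales the part correlation by the variance Step 1 left unexplained.
partial_corr_a = part_corr_a / sqrt(1 - r2_step1)

print(f"R^2 after Step 1: {r2_step1:.4f}")
print(f"R^2 after Step 2: {r2_step2:.4f}")
print(f"partial correlation of Treatment A: {partial_corr_a:.4f}")
```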
We use a partial F-test to test this hypothesis.
Full model: Age, Treatment A, Age*Treatment A (k = 3)
Reduced model: Age, Treatment A (m = 2)
R^2_red = 0.744
F_part = ((R^2_full - R^2_red)/(k - m)) / ((1 - R^2_full)/(n - k - 1)) = 7.37
F_crit = F_0.05(1, 32) ≈ 4.15
As F_part > F_crit, we reject the null hypothesis, and therefore “Age*Treatment A” must be included
in the model in Step 3.
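The partial F statistic can be sketched as a small function. The value of R^2_full is not reproduced in this solution, so the call below uses a made-up value (0.80) purely to show the mechanics; it is not the study's result and will not reproduce 7.37.

```python
def partial_f(r2_full, r2_reduced, n, k, m):
    """Partial F statistic comparing a full model (k predictors) with a reduced one (m predictors)."""
    numerator = (r2_full - r2_reduced) / (k - m)
    denominator = (1 - r2_full) / (n - k - 1)
    return numerator / denominator


# r2_full=0.80 is a hypothetical value for illustration only.
example = partial_f(r2_full=0.80, r2_reduced=0.744, n=36, k=3, m=2)
print(f"partial F (illustrative input) = {example:.2f}")
```

The statistic is compared with the critical F value on (k - m, n - k - 1) degrees of freedom.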
In Model 3, note that “Age” is also an explanatory variable; hence, if we include all possible
combinations of “Age*Treatment” variables, they sum to “Age” for every patient and the
explanatory variables become linearly dependent. We must therefore exclude one of them and
treat it as the base category. In Model 2, however, “Age” was not an explanatory variable, so it is
acceptable to include all possible combinations of “Age*Treatment” as explanatory variables.
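The dependency can be seen directly in a small sketch. The patients below are hypothetical (one per treatment, with made-up ages); the check confirms that the four Age*Treatment columns add up to the Age column in every row.

```python
# Hypothetical patients (age, treatment); NOT the study data.
patients = [(25, "A"), (40, "B"), (55, "C"), (67, "D")]


def interaction_columns(age, treatment):
    """Age*Treatment values for A-D: the age under the patient's own treatment, 0 elsewhere."""
    return [age if treatment == t else 0 for t in "ABCD"]


rows = [interaction_columns(age, trt) for age, trt in patients]

# Each patient receives exactly one treatment, so the interaction columns sum to Age.
dependent = all(sum(row) == age for row, (age, _) in zip(rows, patients))

for (age, trt), row in zip(patients, rows):
    print(f"Age={age}, Treatment {trt}: interactions={row}")
print(f"interaction columns always sum to Age: {dependent}")
```

Because Age equals the sum of the four interaction columns, a model containing Age and all four interactions has no unique least-squares solution, which is why one interaction is dropped in Model 3.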