
Biometrics 60, 892–899

December 2004

Joint Modeling of Longitudinal and Survival Data via a Common Frailty
Sarah J. Ratcliffe, Wensheng Guo, and Thomas R. Ten Have
Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine,
Blockley Hall, 6th Floor, 423 Guardian Drive, Philadelphia, Pennsylvania 19104-6021, U.S.A.

email: sratclif@cceb.upenn.edu
Summary. We develop a joint model for the analysis of longitudinal and survival data in the presence of
data clustering. We use a mixed effects model for the repeated measures that incorporates both subject- and
cluster-level random effects, with subjects nested within clusters. A Cox frailty model is used for the survival
model in order to accommodate the clustering. We then link the two responses via the common cluster-level
random effects, or frailties. This model allows us to simultaneously evaluate the effect of covariates on the
two types of responses, while accounting for both the relationship between the responses and data clustering.
The model was motivated by a study of end-stage renal disease patients undergoing hemodialysis, where we
wished to evaluate the effect of iron treatment on both the patients' hemoglobin levels and survival times,
with the patients clustered by enrollment site.
Key words: EM algorithm; Frailty; Longitudinal data; Nonparametric maximum likelihood; Survival analysis.

1. Introduction
In longitudinal studies, it is common to collect both repeated measures and time-to-event (e.g., survival) data, as well as
treatment information. When modeled separately, random effects models (Laird and Ware, 1982) and Cox proportional
hazards models (Cox, 1972) are often used to estimate the treatment effect on the repeated measures and survival data,
respectively. These types of models can also be used when the two outcomes are related, provided the relationship only
depends on observed variables included as covariates in the separate models. However, often the longitudinal data and
survival times are related through some unobserved variables, with the treatment affecting both types of outcomes. In this
case, separate models can result in biased estimates (e.g., Prentice, 1982; Little and Rubin, 1987) and hence joint modeling
is desired. We present a new approach to joint modeling in this article, in which the responses are linked at a cluster
level, under which the subjects are nested.
Our research was motivated by the end-stage renal disease (ESRD) study. This study consisted of patients with ESRD in
the Fresenius Medical Corporation clinical dataset undergoing hemodialysis. For patients with ESRD, anemia is a common
problem causing low hemoglobin levels. Because this is associated with higher rates of death, parenteral iron is used to
increase a patient's hemoglobin level. However, other studies have found that high doses of iron are associated with
elevated death rates (Feldman et al., 2002). Thus, any analysis needs to consider not just the effect of iron on a patient's
hemoglobin levels or on their survival time, but also whether the effects are distorted by not controlling for the relationship
between hemoglobin levels and mortality.

The subjects were clustered into groups based on the enrollment site. The sites consisted of a mixture of hospital (private
and public) and health centers. Ignoring this clustering was
thought to bias the results due to the nature of the sites involved, because some sites, like teaching hospitals, naturally
attract sicker patients and this could inuence the survival
times and hemoglobin levels. This is reected in the average
survival and hemoglobin scores for each site (Table 1). The
sites whose subjects survive the longest tend to have higher
hemoglobin scores, on average, with a correlation coecient
of 0.4887. Because the type of site is correlated with both the
sickness of the patients and their survival times, linking
the responses at the subject level may not be appropriate. A
subject-level link would assume that the subject eects were
a good surrogate for the unobserved, underlying process that
links the two responses, and would result in conditional independence of the responses. However, for the ESRD study, this
necessary assumption may not hold due to the cluster-level
correlations. Thus, site is a better surrogate for the unobserved process, and a cluster-level link is more likely to result
in conditional independence. Further, linking at the subject
level, or ignoring the link, may result in biased and/or less
ecient estimates. Therefore, a joint model linking the responses at the cluster level is needed for the ESRD study. We
return to this application in Section 4.
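As a quick check on this site-level association, the correlation between the site-average survival times and site-average hemoglobin levels reported in Table 1 can be recomputed directly. The following sketch simply transcribes the two summary columns of Table 1; it is illustrative and reproduces the coefficient of about 0.49 quoted above.

```python
import numpy as np

# Site-level averages transcribed from Table 1 (sites 1-19, in order).
avg_survival = np.array([12.4458, 13.1211, 14.1760, 14.1970, 14.6563, 14.9160,
                         14.9267, 15.1960, 15.3900, 15.6228, 16.5069, 16.5771,
                         16.6182, 16.6815, 18.2640, 19.2240, 19.2785, 19.5703,
                         19.7858])
avg_hemoglobin = np.array([10.2692, 10.2216, 10.7984, 10.8132, 10.3075, 10.8137,
                           10.5145, 10.6512, 10.6512, 10.6312, 10.2155, 10.4702,
                           10.0493, 10.3770, 10.8144, 10.8226, 10.7504, 10.9316,
                           11.0143])

# Pearson correlation between the two site-level summaries (about 0.49).
print(np.corrcoef(avg_survival, avg_hemoglobin)[0, 1])
```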
Many existing approaches to joint modeling link the two responses via subject-level random effects (Wu and Carroll,
1988; Little, 1995; Tsiatis, De Gruttola, and Wulfsohn, 1995; Hogan and Laird, 1997), as opposed to the cluster-level
link presented here.

Table 1
Descriptive statistics for the ESRD data by site

Site   Number of   Proportion   Average survival time for     Average
       patients    of deaths    patients who died (months)    hemoglobin level
  1       29         0.4483              12.4458                 10.2692
  2       27         0.2593              13.1211                 10.2216
  3       21         0.5714              14.1760                 10.7984
  4       26         0.4615              14.1970                 10.8132
  5       39         0.3590              14.6563                 10.3075
  6       40         0.2250              14.9160                 10.8137
  7       16         0.5625              14.9267                 10.5145
  8       32         0.2812              15.1960                 10.6512
  9       31         0.3871              15.3900                 10.6512
 10       31         0.3226              15.6228                 10.6312
 11       29         0.2414              16.5069                 10.2155
 12       28         0.2500              16.5771                 10.4702
 13       26         0.5000              16.6182                 10.0493
 14       25         0.3200              16.6815                 10.3770
 15       26         0.3846              18.2640                 10.8144
 16       27         0.2593              19.2240                 10.8226
 17       27         0.4074              19.2785                 10.7504
 18       39         0.3590              19.5703                 10.9316
 19       36         0.3056              19.7858                 11.0143

For example, when the focus is on the longitudinal data, Schluchter (1992) and De Gruttola and Tu (1994) used a shared
parameter approach by assuming multivariate normal distributions for the two outcomes, linked by individual random
slopes and intercepts. Wulfsohn and Tsiatis (1997) used a similar approach, but with a Cox model for the survival times.
They also differ from our approach by using BLUP estimators, which can cause biases as it results in using a first-order
Taylor term. Further, we assume continuous time and hence use a continuous hazard function, as opposed to the discrete
time multinomial distribution used by Wu and Carroll (1988). Lancaster and Intrator (1998) modeled a Poisson process,
as opposed to the mixed model presented here, and also assumed that the underlying subject-level link (represented by
an observed variable) was positively correlated with both outcomes and had the same proportionate effect on both hazard
functions, assumptions we do not make. Another approach, similar in spirit to the shared parameter model, is to use an
unknown time-varying latent variable to link the two outcomes (e.g., Henderson, Diggle, and Dobson, 2000; Huang et al.,
2001; Wang and Taylor, 2001; Xu and Zeger, 2001). While this provides a more flexible link between the repeated
measures and survival data, a specific model for the latent variable must be defined. When the focus is the survival
times, another approach is to incorporate the repeated measures into the survival model by regarding them as
time-varying covariates measured with error (Prentice, 1982; Hu, Tsiatis, and Davidian, 1998; Tsiatis and Davidian, 2001).
These existing models assume that subject-level effects are adequate surrogates for the unobserved process linking the
two responses, so that given the subject-level link, the repeated measures and survival times are conditionally
independent. These subject-level random effects are the only random effects included in the model. In contrast, two
levels of nested random effects (subject-level and frailty, with subjects nested within clusters) are used in the model
presented in this


article, with the responses linked at the higher cluster level.


This additional level of random effects makes the model more flexible. Further, a more efficient and unbiased estimate
of the link (frailties) may be obtained, compared to that found via a model linked at the subject level. This is
especially true when there are only a few observations per subject. The homogeneity within the clusters means that the
estimate for the underlying process can be found by combining information across subjects within the cluster, as opposed
to only having the few observations with which to estimate a subject-level link. Finally, we are able to investigate the
relationship between the two responses, instead of focusing on one of the responses, through the link parameter.
In this article, we present a method for the joint modeling of longitudinal and survival data that links the two responses
at the cluster level due to data clustering. The repeated measures are modeled with a mixed effects type model, since
random effects models can accommodate cluster- (and subject-) level random effects. The Cox frailty model (Clayton, 1978;
Vaupel, Manton, and Stallard, 1979) is used for the survival times as it allows for between-cluster heterogeneity. In this
extension of the Cox proportional hazards model, the heterogeneity is modeled via the addition of a frailty term, a
cluster-level random effect with a multiplicative effect on the hazard function. We then link these two models via the
common frailty, using a multivariate normal assumption for computational ease (Li and Lin, 2000). Using a random effect
(frailty) as the surrogate for the latent process linking the two responses is natural because it is unobserved. It also
has the advantage of eliminating the need to estimate a large number of nuisance parameters, as would be required in a
fixed effects model.
The remainder of the article is organized as follows. Section 2 presents the joint model and its nonparametric likelihood
function. Section 3 describes the EM algorithms used to compute the parameter estimates. Section 4 applies these methods
to the data from the ESRD study. Conclusions are offered in Section 5.
2. Joint Model for Repeated Measures and Survival Times
In this section, we present the joint model for the repeated measures and survival times. The data are clustered, with
clusters (frailties) indexed by i = 1, ..., n and subjects j = 1, ..., n_i nested within each cluster. Thus, y_ij is the
response vector for subject j within cluster i. Further, the survival time for this subject, s_ij, can be right censored
at some time c_ij. Hence, t_ij = min(s_ij, c_ij) denotes the observed time at risk for subject j within cluster i, and
δ_ij = I(s_ij ≤ c_ij) is the failure indicator.
The longitudinal response y_ij is modeled using a random effects model (Laird and Ware, 1982), with both subject-level
and frailty random effects. The possibly right-censored survival times are modeled using a Cox frailty model (Clayton,
1978). We link the two models via the frailties u_i. Thus,
  y_ij = X_ij1 β1 + u_i + ε*_ij,                                          (1)

  λ_ij(s_ij | x_ij2, u_i) = λ0(s_ij) exp(x_ij2' β2 + b u_i),              (2)

where

  ε*_ij = Z_ij η_ij + ε_ij,

and where the known design matrices for the fixed effects and subject-level random effects associated with y_ij are X_ij1
and Z_ij, respectively, with corresponding parameter vectors β1 and η_ij ~ N(0, Ψ); u_i ~ N(0, σ_u²) is the frailty for
cluster i and ε_ij ~ N(0, σ_e² I_ij) is a vector of residuals for y_ij; x_ij2 is the known fixed effects design matrix
linking the unknown parameter vector β2 to λ_ij; b is an unknown parameter linking u_i to λ_ij; and λ0(t) is the baseline
hazard function.
This model assumes that, conditional on the cluster-level effects, the missing data mechanism for Y is ignorable within
a cluster. Further, the informativeness of the survival times is incorporated into equation (1) by equation (2). This can
be thought of as implicitly imposing a constraint on the frailties in the random effects model. When b = 0, model (1)
reduces to an ignorable missingness (dropout) model. Further, for model (2), we assume that, conditional on the cluster
effects, the hazard of failure at time t is the same for both the censored and uncensored subjects at t, given the
information available up to t.
Model (2) can be generalized to

  λ_ij(s_ij | x_ij2, u_i) = λ0(s_ij) exp(x_ij2' β2 + b u_i + v_i),

where v_i ~ N(0, σ_v²). This generalization allows the baseline hazard function used in (2) to be split into its cluster-
and subject-specific components. Because the focus of this article is on the link, u_i, we have assumed v_i = 0 for
simplicity and allowed the baseline hazard function to model both components. This assumption can easily be checked using
a two-stage procedure (e.g., Bycott and Taylor, 1998), and for our application it did not impact the results.
The parameters β1 and β2 are the implied weighted population-averaged or marginal effects for the fixed effects
covariates. Subject- and cluster-level changes in these effects are accommodated through the random effects. For
simplicity, we have used a single random effect (frailty) for each cluster. In this case, u_i can be thought of as the
cluster-specific deviation for the repeated measures and survival models. Because it is assumed that the two responses
are related, the parameter b describes the nature of this association. The methods can easily be extended to a random
vector u_i, with an associated design matrix W_ij.
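To make the structure of (1) and (2) concrete, the following Python sketch simulates data from the joint model for a few clusters. It is illustrative only; the covariates, sample sizes, and parameter values (beta1, beta2, b, and the exponential baseline hazard) are invented for the example and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_clusters, subj_per_cluster, n_visits = 10, 30, 6
beta1 = np.array([10.0, 0.1])      # hypothetical fixed effects for the longitudinal model
beta2 = np.array([-0.3])           # hypothetical fixed effect for the hazard model
b, sigma_u, sigma_subj, sigma_e = 0.5, 0.3, 0.6, 0.4
lambda0 = 0.05                     # constant baseline hazard, for illustration only

records, survival = [], []
for i in range(n_clusters):
    u_i = rng.normal(0.0, sigma_u)                 # cluster-level frailty shared by both models
    for j in range(subj_per_cluster):
        trt = rng.integers(0, 2)                   # hypothetical binary covariate (e.g., iron)
        eta_ij = rng.normal(0.0, sigma_subj)       # subject-level random intercept
        # Model (2): exponential survival time with rate lambda0 * exp(x'beta2 + b*u_i)
        rate = lambda0 * np.exp(beta2[0] * trt + b * u_i)
        s_ij = rng.exponential(1.0 / rate)
        c_ij = rng.uniform(5, 30)                  # administrative censoring time
        t_ij, d_ij = min(s_ij, c_ij), int(s_ij <= c_ij)
        survival.append((i, j, t_ij, d_ij, trt))
        # Model (1): repeated measures observed only at visits occurring before t_ij
        for visit in range(n_visits):
            time = 3.0 * visit
            if time > t_ij:
                break
            mean = beta1[0] + beta1[1] * trt + u_i + eta_ij
            records.append((i, j, time, mean + rng.normal(0.0, sigma_e)))

print(len(records), "longitudinal observations,", len(survival), "subjects")
```

Note that dropout is informative here: visits stop at t_ij, which depends on the same u_i that shifts the longitudinal trajectories.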
Let θ = (β1, σ_u², Ψ, σ_e²), γ = (β2, b), and Σ_ij = Z_ij Ψ Z_ij' + σ_e² I_ij. Then, assuming that censoring is independent
of the random effects and covariates, the complete data log likelihood, conditional on u_i and up to a constant, is

  l(θ, γ, Λ | T, Δ, Y, X1, x2, Z, u)
    = l(β1, Ψ, σ_e² | T, Δ, Y, X1, x2, Z, u) + l(Λ, β2, b | T, Δ, Y, X1, x2, Z, u) + l(σ_u² | u)
    = Σ_{i,j} { δ_ij log λ0(t_ij) + δ_ij (x_ij2' β2 + b u_i) − Λ(t_ij) exp(x_ij2' β2 + b u_i) }
      − (1/2) Σ_{i,j} { n_ij log |Σ_ij| + (y_ij − X_ij1 β1 − u_i)' Σ_ij⁻¹ (y_ij − X_ij1 β1 − u_i) }
      − (1/2) Σ_i { log σ_u² + u_i² / σ_u² },                                                        (3)

where the cumulative baseline hazard Λ(t) is discretized to a step function with jumps at the observed mortality times.
Without this discretization, the maximum of the likelihood would not exist (e.g., Li and Lin, 2000). This type of
likelihood function is often referred to as a nonparametric log likelihood. Estimates under this likelihood have similar
properties to those of a fully parametric likelihood (Murphy, Rossini, and Van der Vaart, 1997). The complete data form
of the likelihood is needed in order to find the maximum likelihood parameter estimates, as described in Section 3. The
parameter estimates are obtained from the complete data likelihood with an EM algorithm.
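As a concrete illustration of (3), the sketch below evaluates the complete-data log likelihood for given frailties. It is a simplified, assumption-laden rendering (a single random intercept per subject, so Σ_ij has a compound-symmetric form, and a step-function Λ built from a supplied vector of baseline jumps), not the authors' implementation.

```python
import numpy as np

def complete_data_loglik(long_data, surv_data, u, beta1, beta2, b,
                         sigma_subj2, sigma_e2, sigma_u2, jump_times, jumps):
    """Evaluate equation (3) up to an additive constant.

    long_data: list of (cluster, X1, y) per subject, X1 of shape (n_ij, p1).
    surv_data: list of (cluster, x2, t, delta) per subject.
    u: array of cluster frailties.
    jump_times, jumps: support and sizes of the discretized baseline hazard,
        so Lambda(t) = sum of jumps at times <= t, lambda0(t) = jump at t.
    """
    ll = 0.0
    # Survival part: sum_ij delta*[log lambda0 + x'beta2 + b*u] - Lambda*exp(...)
    for i, x2, t, delta in surv_data:
        lin = x2 @ beta2 + b * u[i]
        Lambda_t = jumps[jump_times <= t].sum()
        if delta:
            ll += np.log(jumps[np.argmin(np.abs(jump_times - t))]) + lin
        ll -= Lambda_t * np.exp(lin)
    # Longitudinal part: multivariate normal with compound-symmetric Sigma_ij
    for i, X1, y in long_data:
        n_ij = len(y)
        Sigma = sigma_subj2 * np.ones((n_ij, n_ij)) + sigma_e2 * np.eye(n_ij)
        resid = y - X1 @ beta1 - u[i]
        sign, logdet = np.linalg.slogdet(Sigma)
        ll -= 0.5 * (logdet + resid @ np.linalg.solve(Sigma, resid))
    # Frailty part
    ll -= 0.5 * np.sum(np.log(sigma_u2) + u ** 2 / sigma_u2)
    return ll
```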
Under this model, the two outcomes are dependent only at the cluster level. That is, y_ij and t_ij are independent given
some underlying, but possibly unobserved, process. By linking the two via the frailty rather than at the subject level,
we are able to utilize similarities between subjects in order to obtain more information with which to better estimate
this unobserved process. Further, the nested correlation structure and outcome dependency found in many studies can be
effectively modeled in an interpretable fashion without overparameterization.
3. Estimation Procedures
Parameter estimates can be found using an adaptation of the EM (Dempster, Laird, and Rubin, 1977) or ECM (Meng and Rubin,
1993) algorithms. For either algorithm, the E-step finds the expectation of the conditional likelihood (3), integrating
out the unobserved frailties u_i. For simplicity, we write E(u_i | θ, γ, Λ) for the conditional expectations instead of
the full form E(u_i | θ, γ, Λ, T, Δ, Y, X1, x2, Z). Thus, the expectation of the complete data log likelihood is
  E(l) = Σ_{i,j} { δ_ij log λ0(t_ij) + δ_ij [ x_ij2' β2 + b E(u_i | θ, γ, Λ) ] }
         − Σ_{i,j} Λ(t_ij) exp(x_ij2' β2) E(exp(b u_i) | θ, γ, Λ)
         − (1/2) Σ_{i,j} { n_ij log |Σ_ij| + E[ (y_ij − X_ij1 β1 − u_i)' Σ_ij⁻¹ (y_ij − X_ij1 β1 − u_i) | θ, γ, Λ ] }
         − (1/2) Σ_i { log σ_u² + E(u_i² | θ, γ, Λ) / σ_u² }.

The M-step then maximizes this expected log likelihood. The ECM and EM algorithms used to find the parameter estimates
are described below.
3.1 ECM Algorithm
The parameter estimates for the kth iteration are found by
the following steps:

1. Obtain θ^(k+1), updated estimates for β1, Ψ, σ_e², and σ_u². The fixed effects and the frailty variance have
   closed-form updates,

     β1^(k+1) = [ Σ_{i,j} X_ij1' (Σ_ij^(k))⁻¹ X_ij1 ]⁻¹ Σ_{i,j} X_ij1' (Σ_ij^(k))⁻¹ { y_ij − E(u_i | θ^(k), γ^(k), Λ^(k)) },

     σ_u^{2,(k+1)} = Σ_i E(u_i² | θ^(k), γ^(k), Λ^(k)) / n,

   where n is the number of clusters. The updates Ψ^(k+1) and σ_e^{2,(k+1)} have no closed form and are obtained by
   solving their score equations via one step of a Newton–Raphson algorithm, after which

     Σ_ij^(k+1) = Z_ij Ψ^(k+1) Z_ij' + σ_e^{2,(k+1)} I_ij.

2. Update the estimate of the cumulative baseline hazard Λ. The jump at each observed failure time is

     λ_ij^(k+1) = δ_ij / Σ_{r,s} E[ exp(x_rs2' β2^(k) + b^(k) u_r) | θ^(k), γ^(k), Λ^(k) ] I(t_rs ≥ t_ij),

   and

     Λ^(k+1)(t) = Σ_{t_ij ≤ t} λ_ij^(k+1).

3. Update the estimates of β2 and b by maximizing

     Σ_{i,j} { δ_ij [ x_ij2' β2 + b E(u_i | θ^(k), γ^(k), Λ^(k)) ] − Λ^(k+1)(t_ij) exp(x_ij2' β2) E(e^{b u_i} | θ^(k), γ^(k), Λ^(k)) },

   where E(u_i | ·) and E(e^{b u_i} | ·) are evaluated using the numerical approximation described in the Appendix.
   Writing γ = (β2', b)', the maximization is carried out using one step of the Newton–Raphson method applied to the
   gradient and curvature of this expression.

Thus, we require the following expectations:

     E(u_i | θ, γ, Λ),     E(u_i² | θ, γ, Λ),     E(e^{b u_i} | θ, γ, Λ).

The first two are required for step 1, the last one for step 2, and the formulas for E(u_i) and E(e^{b u_i}) are needed
in the maximization of step 3, as they depend on the parameters being estimated. These expectations do not have a closed
form. Instead, they require some form of numerical integration in order to be evaluated. The procedure used is described
in the Appendix.

3.2 EM Algorithm
A slightly different algorithm is achieved using the EM method, and proceeds as follows:

1. Same as step 1 above.
2. Update the estimates of β2 and b by solving the score equation

     S(γ) = Σ_{i,j} δ_ij { E(Ũ_ij | θ^(k), γ^(k), Λ^(k)) − A_ij } = 0,

   where Ũ_ij = (x_ij2', u_i)', so that γ'Ũ_ij = x_ij2' β2 + b u_i, and

     A_ij = Σ_{r,s} E(Ũ_rs e^{γ'Ũ_rs} | θ^(k), γ^(k), Λ^(k)) I(t_rs ≥ t_ij)
            / Σ_{r,s} E(e^{γ'Ũ_rs} | θ^(k), γ^(k), Λ^(k)) I(t_rs ≥ t_ij).

   This equation is solved using one step of the Newton–Raphson method,

     γ^(1) = γ^(0) + I(γ^(0))⁻¹ S(γ^(0)),

   where

     I(γ) = Σ_{i,j} δ_ij { Σ_{r,s} E(Ũ_rs Ũ_rs' e^{γ'Ũ_rs} | θ^(k), γ^(k), Λ^(k)) I(t_rs ≥ t_ij)
                           / Σ_{r,s} E(e^{γ'Ũ_rs} | θ^(k), γ^(k), Λ^(k)) I(t_rs ≥ t_ij) − A_ij A_ij' }.

3. Update the estimate of Λ:

     λ_ij^(k+1) = δ_ij / Σ_{r,s} E[ exp(x_rs2' β2^(k+1) + b^(k+1) u_r) | θ^(k), γ^(k), Λ^(k) ] I(t_rs ≥ t_ij),

     Λ^(k+1)(t) = Σ_{t_ij ≤ t} λ_ij^(k+1).

This algorithm requires two implementations of the Newton–Raphson method, one in step 1 and again in step 2, as no
closed-form solutions are available for updating Σ_ij and γ. It also requires the calculation of two additional
conditional expectations,

     E(u_i e^{b u_i} | θ^(k), γ^(k), Λ^(k)),     E(u_i² e^{b u_i} | θ^(k), γ^(k), Λ^(k)).

Because the conditional expectations must be calculated using a form of numerical integration, the cumulative effect of
calculating two additional expectations over the number of iterations needed for the algorithm to converge can be time
consuming, as is the extra Newton–Raphson algorithm. Further, each additional procedure needed within the overall
algorithm produces a small amount of rounding error. Thus, the more procedures that need to converge, the looser the
overall convergence criteria must be.
Standard errors for the variance terms (Ψ, σ_u², and σ_e²) can be approximated as part of the Newton–Raphson procedure
used to find the parameter estimates. The standard errors for the fixed effects, β1 and β2, and the linking parameter b
are obtained by inverting the information matrix of the profile likelihood. Due to the complexities of calculating the
second derivative analytically, we use EM-aided differentiation as described by Meilijson (1989). This method also
requires the calculation of E(u_i e^{b u_i} | θ^(k), γ^(k), Λ^(k)), needed for the EM algorithm. Thus, when calculating
the standard errors, it is computationally more efficient to use the EM version of the algorithm, provided good initial
estimates are available. Hence, we have tended to use the ECM algorithm in practice to find parameter estimates, as it
can be more efficient, time wise, when a large number of iterations are needed to reach convergence. Moreover, stricter
convergence criteria can be used. However, once the estimates are found, they are used as initial values in the EM
algorithm in order to calculate the standard errors in a few iterations.
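To make one piece of the iteration concrete, the following Python sketch implements a Breslow-type update of the discretized baseline hazard of the kind used in step 2 of the ECM algorithm (and step 3 of the EM version). It is a minimal illustration that assumes the required conditional expectations E(exp(b u_i) | ·) have already been computed; the function and argument names are our own.

```python
import numpy as np

def update_baseline_hazard(t, delta, x2, beta2, Eexp_bu, cluster):
    """One update of the discretized cumulative baseline hazard.

    t, delta : arrays of observed times and failure indicators, one entry per subject.
    x2       : (n_subjects, p2) covariate matrix for the hazard model.
    beta2    : current fixed-effect estimates.
    Eexp_bu  : E(exp(b*u_i) | current parameters), one value per cluster.
    cluster  : integer cluster index of each subject.
    Returns the jump sizes at each failure time and a step function Lambda(t).
    """
    risk_score = np.exp(x2 @ beta2) * Eexp_bu[cluster]
    jumps = np.zeros_like(t, dtype=float)
    for k in range(len(t)):
        if delta[k]:
            at_risk = t >= t[k]                    # risk set at the k-th failure time
            jumps[k] = 1.0 / risk_score[at_risk].sum()

    def Lambda(time):
        return jumps[(t <= time) & (delta == 1)].sum()

    return jumps, Lambda
```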
4. Application to the ESRD Study
We now return to the ESRD study and apply the methods described in Section 2. The data consist of a cohort of patients
who had ESRD as of January 1996 and who were still alive as of August 1996. These first 6 months were used as the
baseline period in order to calculate their iron exposure. The patients were then followed up until October 1998 or
death. Of the 27,571 eligible patients, a random sample of 555 patients has been used in this analysis due to the
excessive computational burden of analyzing the entire sample. These patients all had at least one follow-up visit, with
approximately a quarter of the patients completing all nine follow-up visits. Of these 555 patients, 195 had died by the
end of the follow-up period. Survival time was measured in months from the start of the follow-up period, with the exact
date of death/censoring known. Also of interest were the patients' hemoglobin levels over time. Initial levels were
recorded at the end of the baseline period, and every 3 months thereafter. The aim of the study was to determine the
relationships between the 6-month cumulative iron dose and mortality, and between iron dose and hemoglobin levels, and
whether they are distorted by not controlling for the relationship between mortality and hemoglobin levels.
Whether iron was administered was determined by calculating the cumulative iron dose over the 6-month baseline period.
Patients categorized as having no iron received between 0 and 50 mg of iron over this period, with the majority receiving
no iron. Patients administered 50 mg or more of iron over the 6 months were categorized as having received iron; 409
patients received iron at baseline. Of these, 32.8% died before the censoring date, while 41.8% of patients who did not
receive iron died. For all the patients who died, the average survival time was 16.14 months, within a range of 6 to
almost 27 months. The overall average hemoglobin score was 10.6, within a range of 4.8–17.35. Each patient had at least
2 and at most 10 repeated measurements taken. Other baseline and repeated measures data were also recorded, such as age,
sex, length (in years) of ESRD prior to enrollment, and albumin levels at each follow-up visit.
Three methods were used to estimate the parameters for the ESRD data: independent models, a two-stage procedure, and the
joint model described in Section 2. Results are shown in Table 2. For the independent models approach, a linear mixed
model was fitted to the hemoglobin scores with both a frailty and a subject-level random intercept. The subject-level
random slope parameters were found to be essentially equal to zero, and hence were not included in the final model. A Cox
frailty model was applied to the survival times, assuming no link with the repeated measures. In these models, it was
found that baseline iron may marginally increase the hemoglobin levels, with a 95% confidence interval of (−0.04, 0.26),
but it did not affect the relative risk of death, 95% CI = (0.62, 1.20). Variables that did influence the hemoglobin
levels included the age of the patient (older patients had higher hemoglobin levels) and the quarterly albumin levels
(the higher the albumin level, the higher the hemoglobin level). Age was also significant in the survival model, where
there is a 1.032 (95% CI = [1.02, 1.04]) times increased risk of death for each year of age.
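For readers reconstructing these quantities from Table 2, the age effect in the independent survival model follows from the usual exponentiation of the Cox coefficient with a Wald-type interval (the 1.96 multiplier is the standard construction; the paper does not state how its intervals were computed):

```python
import numpy as np

est, se = 0.0311, 0.0060              # age coefficient and SE, independent survival model (Table 2)
rr = np.exp(est)                      # relative risk per year of age
ci = np.exp([est - 1.96 * se, est + 1.96 * se])
print(round(rr, 3), np.round(ci, 2))  # ~1.032 and ~[1.02, 1.04]
```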
Next, a two-stage procedure was employed (e.g., Bycott and Taylor, 1998). The same mixed model was again applied to the
hemoglobin scores. However, the relationship between the two responses was modeled in an ad hoc way by using the
estimated pattern effects from this model as fixed covariates in the survival model. This approach reduced the bias in
the survival parameter estimates.
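A two-stage analysis of this general form can be sketched with standard Python tooling, assuming statsmodels and lifelines are available. The file names, column names, and formula below are hypothetical, and carrying only the estimated site-level random intercepts into the survival model is a simplification of the procedure described above.

```python
import pandas as pd
import statsmodels.formula.api as smf
from lifelines import CoxPHFitter

# long_df: one row per hemoglobin measurement; surv_df: one row per patient.
# Hypothetical columns: hgb, time, albumin, age, iron, site, T (months), event.
long_df = pd.read_csv("esrd_long.csv")     # hypothetical file names
surv_df = pd.read_csv("esrd_surv.csv")

# Stage 1: linear mixed model for hemoglobin with a site-level random intercept.
stage1 = smf.mixedlm("hgb ~ time + I(time**2) + albumin + age + iron",
                     data=long_df, groups=long_df["site"]).fit()

# Carry the estimated site effects into the survival data as a fixed covariate.
site_effect = {site: float(ranef.iloc[0])
               for site, ranef in stage1.random_effects.items()}
surv_df["site_effect"] = surv_df["site"].map(site_effect)

# Stage 2: Cox model for survival including the stage-1 site effects.
stage2 = CoxPHFitter().fit(surv_df[["T", "event", "age", "iron", "site_effect"]],
                           duration_col="T", event_col="event")
print(stage2.summary)
```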
Finally, the joint model was fitted to the ESRD data. There was a significant link between the two responses under the
joint model. Having accounted for the relationship between the hemoglobin levels and the survival times, we found that
the baseline iron dose now had no significant effect on the hemoglobin levels (p = 0.3150), but it did significantly
affect the survival times of the patients (p = 0.0794). Thus, the use of iron may increase the hemoglobin levels (though
not significantly) as well as decrease the relative risk of death.
The estimates from the independent models are biased compared to those obtained under the two-stage procedure or the
joint model. In this application, the joint model produced similar estimates to the two-stage procedure. However, based
on simulated data, this does not always occur in practice.


Table 2
Model fitting for the ESRD data. Parameter estimates and standard errors of the estimates are shown.

                            Independent models     Two-stage procedure       Joint model
Parameters                  Estimate    Error      Estimate    Error      Estimate    Error

Model for hemoglobin levels
Intercept                    6.2485     0.2783      6.2485     0.2783      6.2704     0.3264
Time                         0.0144     0.0183      0.0144     0.0183      0.0139     0.0169
Time²                        0.0080     0.0021      0.0080     0.0021      0.0080     0.0019
Albumin                      0.9351     0.0506      0.9351     0.0506      0.9340     0.0553
Length ESRD                  0.0123     0.0079      0.0123     0.0079      0.0101     0.0114
Age                          0.0054     0.0023      0.0054     0.0023      0.0050     0.0031
Iron                         0.1108     0.0758      0.1108     0.0758      0.1038     0.1033
Ψ (subject)                  0.4197     0.0345      0.4197     0.0345      0.4143     0.0768
σ_u² (frailty)               0.0752     0.0315      0.0752     0.0315      0.1025     0.0408
σ_e² (error)                 0.8474     0.0201      0.8474     0.0201      0.8468     0.0174

Model for survival times
Age                          0.0311     0.0060      0.0279     0.0056      0.0290     0.0064
Iron                        −0.1457     0.1679     −0.2489     0.1546     −0.2466     0.1406
b                               –          –        0.1517     0.2997      0.1375     0.0308

For some cases, the two-stage procedure appeared to produce more biased estimates, for example, when there was a small
maximum number of measurements per subject with subjects measured at varying time points. In this case, the largest
biases tend to occur in the estimates for the random effects in the longitudinal model and in the parameters for the
survival model. The value of the maximum number of measurements at which biases occur depends on how large the
cluster-level effects are and on the strength of the link between the models. Further work is needed to determine
exactly at which combinations of parameter values the joint model outperforms the two-stage procedure.
The joint model assumes conditional independence of the two responses. A basic check of this assumption was made by
examining the residuals for each response. Residuals were adjusted for censoring. The marginal correlation coefficient
was 0.068, indicating conditional independence. Also, upon examination of the normality plot for the longitudinal
residuals, the normality assumption for these residuals appeared to be satisfied. Future work will investigate other
methods for testing the modeling assumptions.
5. Conclusion
We have proposed a joint model for longitudinal and survival data that links the responses via a common frailty. Given
the frailty, the two responses are conditionally independent. More efficient and less-biased estimates were obtained
using this model. Further, the joint model is more flexible than models that link at the subject level due to the two
levels of random effects possible. The frailty can be thought of as a surrogate for the underlying process that links
the two responses. Due to the correlations between clusters, survival times, and longitudinal outcomes, the frailty was
a better surrogate than a subject-level link. Finally, the model is able to evaluate the effect of treatment variables
on the responses while accommodating the relationship between them.
The joint model was motivated by the ESRD study, in which the enrollment center was considered a better surrogate
for the underlying process linking the responses than a subject-level link. In this analysis, we have focused only on
the effect of baseline iron. Generally, iron supplementation also occurs during the longitudinal portion of the study.
However, it is unclear whether iron dosage is a treatment for or the outcome of low hemoglobin levels. Thus, to include
a time-varying iron variable in the model would require strong causal modeling assumptions and a complex correlation
structure. While the model can be extended to other correlation structures for the longitudinal component, we chose not
to do so due to the increased computational complexity of the problem. Rather, we chose to use the baseline iron model
from Feldman et al. (2002). Other applications for this model include quality-of-life data, for example, where there is
an interest in evaluating the trade-off between long life/low quality and short life/high quality.
For simplicity in describing the estimation procedure, we have used a single common frailty. Under the multivariate
assumption, the method can easily be extended to multiple common frailties. The model also assumes that the cluster-level
random effect and the frailty are perfectly correlated. This limitation can be overcome by the addition of a frailty not
linked to the random effects model. However, this requires a third level of expectation and can be computationally
demanding. The normal assumption can also be relaxed, but at the expense of an increase in computational difficulty.
Modeling assumptions can be checked by examining the residuals. However, further work is still needed on methods for
checking the model assumptions.

Acknowledgements
This research was supported by grants CA84438, NIMH62298,
NIMH61892, and AG14131 from the National Institutes of
Health. The authors wish to thank the editor, the associate
editor, and the referees for many helpful comments that substantially improved this article.


Résumé
We develop a joint model for the analysis of longitudinal data and survival data in the presence of clustered data. We
use a mixed effects model for the repeated measures that incorporates random effects at both the subject and the cluster
level, with subjects nested within clusters. A Cox model with a frailty parameter is used for the survival model in order
to take the cluster effect into account. We then link the two responses via the common cluster-level random effects, or
frailties. This model allows us to simultaneously evaluate the effects of covariates on the two types of responses,
while taking into account both the relationship between the responses and the clustering of the data. The model was
motivated by a study of patients with advanced-stage renal disease undergoing hemodialysis, where we wished to evaluate
the effects of iron treatment on both the patients' hemoglobin levels and their survival times, with patients clustered
by enrollment center.

References
Abramowitz, M. and Stegun, I. (eds) (1964). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical
Tables. Applied Mathematics Series, Volume 55. Washington, D.C.: U.S. Government Printing Office.
Bycott, P. and Taylor, J. (1998). A comparison of smoothing techniques for CD4 data measured with error in a
time-dependent Cox proportional hazards model. Statistics in Medicine 17, 2061–2077.
Clayton, D. (1978). A model for association in bivariate life tables and its application in epidemiological studies of
familial tendency in chronic disease incidence. Biometrika 65, 141–151.
Cox, D. (1972). Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B
34, 187–220.
De Gruttola, V. and Tu, X. (1994). Modelling progression of CD4-lymphocyte count and its relationship to survival time.
Biometrics 50, 1003–1014.
Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of
the Royal Statistical Society, Series B 39, 1–38.
Feldman, H., Santanna, J., Guo, W., Furst, H., Franklin, E., Joffe, M., Marcus, S., and Faich, G. (2002). Iron
administration and clinical outcomes in hemodialysis patients. Journal of the American Society of Nephrology 13,
734–744.
Henderson, R., Diggle, P., and Dobson, A. (2000). Joint modelling of longitudinal measurements and event time data.
Biostatistics 1, 465–480.
Hogan, J. and Laird, N. (1997). Mixture models for the joint distribution of repeated measures and event times.
Statistics in Medicine 16, 239–257.
Hu, P., Tsiatis, A., and Davidian, M. (1998). Estimating the parameters in the Cox model when covariate variables are
measured with error. Biometrics 54, 1407–1419.
Huang, W., Zeger, S., Anthony, J., and Garrett, E. (2001). Latent variable model for joint analysis of multiple repeated
measures and bivariate event time. Journal of the American Statistical Association 96, 906–914.
Laird, N. and Ware, J. (1982). Random-effects models for longitudinal data. Biometrics 38, 963–974.
Lancaster, T. and Intrator, O. (1998). Panel data with survival: Hospitalization of HIV-positive patients. Journal of
the American Statistical Association 93, 46–53.
Li, Y. and Lin, X. (2000). Covariate measurement errors in frailty models for clustered survival data. Biometrika 87,
849–866.
Little, R. (1995). Modeling the drop-out mechanism in repeated-measures studies. Journal of the American Statistical
Association 90, 1112–1121.
Little, R. and Rubin, D. (1987). Statistical Analysis with Missing Data. New York: John Wiley.
Meilijson, I. (1989). A fast improvement to the EM algorithm on its own terms. Journal of the Royal Statistical Society,
Series B 51, 127–138.
Meng, X.-L. and Rubin, D. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika
80, 267–278.
Murphy, S., Rossini, A., and Van der Vaart, A. (1997). Maximum likelihood estimation in the proportional odds model.
Journal of the American Statistical Association 92, 968–976.
Prentice, R. (1982). Covariate measurement errors and parameter estimation in a failure time regression model.
Biometrika 69, 331–342.
Schluchter, M. (1992). Methods for the analysis of informatively censored longitudinal data. Statistics in Medicine 11,
1861–1870.
Tsiatis, A. and Davidian, M. (2001). A semiparametric estimator for the proportional hazards model with longitudinal
covariates measured with error. Biometrika 88, 447–458.
Tsiatis, A., De Gruttola, V., and Wulfsohn, M. (1995). Modeling the relationship of survival to longitudinal data
measured with error: Applications to survival and CD4 counts in patients with AIDS. Journal of the American Statistical
Association 90, 27–37.
Vaupel, J., Manton, K., and Stallard, E. (1979). The impact of heterogeneity in individual frailty and the dynamics of
mortality. Demography 16, 439–454.
Wang, Y. and Taylor, J. (2001). Jointly modeling longitudinal and event time data with application to acquired
immunodeficiency syndrome. Journal of the American Statistical Association 96, 895–905.
Wu, M. and Carroll, R. (1988). Estimation and comparison of changes in the presence of informative right censoring by
modeling the censoring process. Biometrics 44, 175–188.
Wulfsohn, M. and Tsiatis, A. (1997). A joint model for survival and longitudinal data measured with error. Biometrics
53, 330–339.
Xu, J. and Zeger, S. (2001). Joint analysis of longitudinal data comprising repeated measures and times to events.
Applied Statistics 50, 375–387.

Received October 2002. Revised March 2004. Accepted April 2004.



Appendix
The conditional expectations needed in the EM algorithm can be calculated as follows:

  E(g(u_i) | θ, γ, Λ) = ∫ g(u_i) f(u_i | θ, γ, Λ, T_i, Δ_i, y_i, X_i1, x_i2) du_i,

where

  f(u_i | θ, γ, Λ, T_i, Δ_i, y_i, X_i1, x_i2)
    = f(T_i, Δ_i | γ, Λ, u_i, x_i2) f(y_i | θ, u_i, X_i1) f(u_i | θ)
      / ∫ f(T_i, Δ_i | γ, Λ, u_i, x_i2) f(y_i | θ, u_i, X_i1) f(u_i | θ) du_i,

with

  f(T_i, Δ_i | γ, Λ, u_i, x_i2) = Π_{j=1}^{n_i} λ0(t_ij)^{δ_ij} exp{ δ_ij (x_ij2' β2 + b u_i) } exp{ −Λ(t_ij) e^{x_ij2' β2 + b u_i} },

  f(y_i | θ, u_i, X_i1) = Π_{j=1}^{n_i} f(y_ij | X_ij1, u_i, θ)
    = Π_{j=1}^{n_i} (2π)^{−n_ij/2} |Σ_ij|^{−1/2} exp{ −(1/2) (y_ij − X_ij1 β1 − u_i 1_ij)' Σ_ij⁻¹ (y_ij − X_ij1 β1 − u_i 1_ij) },

  f(u_i | θ) = (2π σ_u²)^{−1/2} exp{ −u_i² / (2σ_u²) }.

Substituting and removing those terms not dependent on u_i, we get

  E(g(u_i) | θ, γ, Λ) = ∫ g(u_i) h(u_i) du_i / ∫ h(u_i) du_i,

where

  h(u_i) = exp{ u_i Σ_{j=1}^{n_i} 1_ij' Σ_ij⁻¹ (y_ij − X_ij1 β1)
               − (1/2) u_i² ( 1/σ_u² + Σ_{j=1}^{n_i} 1_ij' Σ_ij⁻¹ 1_ij )
               + b u_i Σ_{j=1}^{n_i} δ_ij
               − e^{b u_i} Σ_{j=1}^{n_i} Λ(t_ij) e^{x_ij2' β2} }.

We approximate each of the integrals using Gaussian quadrature. Select fixed abscissas a_k and corresponding weights w_k
from tabled data (Abramowitz and Stegun, 1964, p. 919), and choose the integral limits with care. Then,

  E(g(u_i) | θ, γ, Λ) ≈ Σ_{k=1}^{K} w_k g(a_k) h(a_k) / Σ_{k=1}^{K} w_k h(a_k).
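In modern software the tabulated abscissas and weights can be generated directly. The sketch below evaluates E(g(u_i) | θ, γ, Λ) by Gauss–Hermite quadrature for a single cluster, which is one reasonable way to implement the approximation above; the change of variables absorbing the weight function exp(−x²) is our choice, and h must be supplied in the form given in this Appendix.

```python
import numpy as np

def conditional_expectation(g, h, n_points=30, scale=1.0):
    """Approximate E[g(u)] = int g(u) h(u) du / int h(u) du by Gauss-Hermite
    quadrature after the substitution u = scale * x, which rewrites each
    integral as int e^{-x^2} [e^{x^2} f(scale*x)] dx (the scale cancels)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    u = scale * nodes
    # e^{x^2} * h(u) at the nodes; work on the log scale for stability.
    log_h = np.log(h(u)) + nodes ** 2
    log_h -= log_h.max()                     # guard against overflow
    wh = weights * np.exp(log_h)
    return np.sum(wh * g(u)) / np.sum(wh)

# Toy usage: h(u) proportional to a N(0.5, 1) density, so E(u) = 0.5 and
# E(e^{0.5u}) = exp(0.25 + 0.125) hold exactly.
h = lambda u: np.exp(-0.5 * u ** 2 + 0.5 * u)
print(conditional_expectation(lambda u: u, h))                # ~0.5
print(conditional_expectation(lambda u: np.exp(0.5 * u), h))  # ~exp(0.375)
```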
