DOI 10.1007/s10985-010-9181-x
Abstract We propose a Bayesian approach for estimating the hazard functions under
the constraint of a monotone hazard ratio. We construct a model for the monotone
hazard ratio utilizing the Cox’s proportional hazards model with a monotone time-
dependent coefficient. To reduce computational complexity, we use a signed gamma
process prior for the time-dependent coefficient and the Bayesian bootstrap prior for
the baseline hazard function. We develop an efficient MCMC algorithm and illustrate
the proposed method on simulated and real data sets.
1 Introduction
Estimation and inference of two survival functions S1 and S2 under certain order restric-
tions have received much attention in survival analysis. The most popular order restric-
tion is the stochastic ordering, which assumes that S1 (t) ≥ S2 (t) for all t ∈ [0, ∞).
The nonparametric estimators of the survival functions under the stochastic ordering
were found by Brunk et al. (1966) for complete observations and Dykstra (1982)
for right censored data, and asymptotic properties were studied by Paestgaard and
Huang (1996). Bayesian approaches for stochastic ordering have been proposed by
Arjas and Gasbarra (1996) and Gelfand and Kottas (2001). Also, uniform stochastic
J. K. Park
International Vaccine Institute, Seoul, Korea
Yongdai Kim et al.
\[ \lambda_1(t) = \lambda(t) \]
and
\[ \lambda_2(t) = \exp\{\gamma_0 + \gamma_1 H(t)\}\,\lambda(t). \]
Bayesian analysis for monotone hazard ratio
Remark Another advantage of the proposed model is that we could easily incorporate
other covariates z, if they exist in the model, by setting
\[ \lambda_1(t \mid z) = \exp(z'\beta)\,\lambda(t) \]
and
\[ \lambda_2(t \mid z) = \exp\{z'\beta + \gamma_0 + \gamma_1 H(t)\}\,\lambda(t). \]
This is useful if we want to know whether the risk of one group decreases faster than
that of the other group after adjusting for other risk factors such as age, gender, etc.
For the prior, we use standard parametric priors for γ0 and γ1 and a nonparametric prior
for H. A priori, we let γ0 ∼ N(0, σ0²) and Pr(γ1 = k) = 1/3 for k = −1, 0, 1. For H,
a priori, we let H be a gamma process with mean H0 and precision parameter c > 0.
That is, H is a nondecreasing stochastic process on [0, ∞) with independent incre-
ments such that H (0) = 0 and H (t) − H (s), s ≤ t follows a gamma distribution with
mean H0 (t) − H0 (s) and variance (H0 (t) − H0 (s))/c. See Lo (1982) and Kalbfleisch
(1978) for details of gamma processes. To reduce computational complexity, we use
the BB prior for λ, which is explained in detail in Sect. 3.
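The gamma process prior on H can be illustrated by simulating its independent increments on a finite grid. The following is a minimal sketch, not the authors' code: the function name `sample_gamma_process` and the grid are hypothetical, while H0(t) = log(1 + t) and c = 1 are the values used later in the numerical experiments.

```python
import math
import random

def sample_gamma_process(times, h0, c, rng):
    """Draw one path of a gamma process with mean h0 and precision c on a grid.

    Returns [H(t) for t in times]; `times` must be positive and increasing.
    """
    path, level, prev = [], 0.0, 0.0
    for t in times:
        shape = c * (h0(t) - h0(prev))
        # random.gammavariate takes (shape, scale); scale = 1/c gives an
        # increment with mean h0(t) - h0(prev) and variance (h0(t) - h0(prev)) / c.
        if shape > 0:
            level += rng.gammavariate(shape, 1.0 / c)
        path.append(level)
        prev = t
    return path

rng = random.Random(0)
grid = [5.0, 10.0, 20.0, 40.0]           # hypothetical evaluation times
h0 = lambda t: math.log(1.0 + t)         # prior mean, so that H0(0) = 0
paths = [sample_gamma_process(grid, h0, c=1.0, rng=rng) for _ in range(20000)]
mean_at_20 = sum(p[2] for p in paths) / len(paths)   # Monte Carlo estimate of E[H(20)]
```

Every simulated path is nondecreasing, and the Monte Carlo mean at t = 20 should be close to H0(20) = log 21.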
The main idea of the BB approach for the proportional hazards model is to approxi-
mate the full Bayesian posterior by the BB posterior that is proportional to the product
of the empirical likelihood and prior. Let (x1 , δ1 , z 1 (·)), . . . , (xn , δn , z n (·)) be obser-
vations where xi are right-censored times (i.e., minimum of survival and censoring
times), δi are censoring indicators, and z i (·) are (time-dependent) covariates. Under
the proportional hazards model given as
\[ \lambda(t \mid z) = \exp\{z(t)'\beta\}\,\lambda(t), \]
where λ(t|z) is the hazard function of the survival time with covariate z, the likelihood
function of θ = (β, λ(·)) is
\[
\begin{aligned}
L(\theta) &= \prod_{i=1}^{n} \bigl[\exp\{z_i(x_i)'\beta\}\,\lambda(x_i)\bigr]^{\delta_i}
  \exp\Bigl(-\int_0^{x_i} \exp\{z_i(s)'\beta\}\,\lambda(s)\,ds\Bigr) \\
&= \prod_{i=1}^{n} \bigl[\exp\{z_i(x_i)'\beta\}\,d\Lambda(x_i)\bigr]^{\delta_i}
  \exp\Bigl(-\int_0^{x_i} \exp\{z_i(s)'\beta\}\,d\Lambda(s)\Bigr),
\end{aligned}
\tag{1}
\]
where Λ(t) = ∫_0^t λ(s) ds is the cumulative hazard function. Let q be the number of
distinct, uncensored observations, and let 0 < t1 < · · · < tq be the corresponding
ordered, uncensored observations. Then, the empirical likelihood is obtained by
assuming that Λ is a step function having jumps only at t1, . . . , tq and replacing dΛ(t)
by ΔΛ(t) = Λ(t) − Λ(t−) in (1), which results in
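\[
L_E(\theta) = \prod_{i=1}^{n} \bigl[\exp\{z_i(x_i)'\beta\}\,\Delta\Lambda(x_i)\bigr]^{\delta_i}
  \exp\Bigl(-\sum_{k:\,t_k \le x_i} \exp\{z_i(t_k)'\beta\}\,\Delta\Lambda(t_k)\Bigr). \tag{2}
\]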
For details of the empirical likelihood (2), see Andersen et al. (1993). Finally, the BB
posterior of θ is defined to be proportional to the product of the empirical likelihood
and prior.
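As an illustration of how the empirical likelihood (2) is evaluated, the sketch below computes its logarithm in the simplified case of a single time-fixed scalar covariate. This simplification, the function name `log_empirical_likelihood`, and the dictionary representation of the jumps are assumptions made for the example only.

```python
import math

def log_empirical_likelihood(x, delta, z, beta, jumps):
    """Log of the empirical likelihood (2) for a time-fixed scalar covariate.

    x, delta, z : observed times, event indicators (1 = uncensored), covariates
    jumps       : dict mapping each distinct uncensored time t_k to the jump
                  of the cumulative hazard at t_k
    """
    total = 0.0
    for xi, di, zi in zip(x, delta, z):
        risk = math.exp(zi * beta)
        if di == 1:
            # event contribution: exp(z_i * beta) times the jump at x_i
            total += zi * beta + math.log(jumps[xi])
        # survival contribution: jumps accumulated up to and including x_i
        total -= risk * sum(j for t, j in jumps.items() if t <= xi)
    return total

# Toy data: three subjects, the second one censored.
value = log_empirical_likelihood(
    x=[1.0, 2.0, 3.0], delta=[1, 0, 1], z=[0.0, 1.0, 1.0],
    beta=0.5, jumps={1.0: 0.2, 3.0: 0.4},
)
```

Because the cumulative hazard is a step function, the integral in (1) reduces to a finite sum, which is what keeps the parameter finite-dimensional.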
Remark There is an alternative empirical likelihood called the binomial form empir-
ical likelihood. See Kim and Lee (2003b) for details. An advantage of the binomial
form is that the resulting BB posterior can be obtained as a limit of full Bayesian
posteriors. However, the computation is more difficult, and the BB posterior may not
be proper. Therefore, we do not consider the binomial form empirical likelihood in
this paper.
An advantage of the BB approach is that the dimension of the parameter θ is finite
because we discretize Λ to a step function with finitely many jumps. That is, the
parameters in the empirical likelihood are β and {ΔΛ(tk), k = 1, . . . , q}, and hence,
the posterior distribution can be obtained easily using Bayes' theorem.
A technical difficulty in the BB approach is the choice of the prior for {ΔΛ(tk), k =
1, . . . , q}. For this, Kim and Lee (2003b) proposed the following improper prior (BB
prior):
\[ \pi(\Delta\Lambda) \propto \prod_{k=1}^{q} \frac{1}{\Delta\Lambda(t_k)}, \tag{3} \]
and showed that the resulting posterior is always proper, approximates the full Bayes-
ian posterior well, and has desirable large sample properties. It is interesting to note
that the marginal BB posterior of β with the prior (3) turns out to be proportional to
Cox’s partial likelihood times the prior.
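The link to the partial likelihood can be checked directly. The following sketch works through the marginalization under the simplifying assumption of no tied event times, which is made here for brevity only. Write λ_k = ΔΛ(t_k), let i_k denote the subject failing at t_k, and let S_k(β) = ∑_{i: x_i ≥ t_k} exp{z_i(t_k)'β}; the symbols i_k and S_k are introduced only for this illustration.

```latex
% Exchange the order of summation in the exponent of (2):
\[
\sum_{i=1}^{n} \sum_{k:\,t_k \le x_i} \exp\{z_i(t_k)'\beta\}\,\lambda_k
  = \sum_{k=1}^{q} \lambda_k\, S_k(\beta).
\]
% Multiply (2) by the prior (3); with no ties, each factor \lambda_k in the
% likelihood cancels the corresponding 1/\lambda_k in the prior:
\[
L_E(\theta)\,\pi(\Delta\Lambda) \propto
  \prod_{k=1}^{q} \exp\{z_{i_k}(t_k)'\beta\}\,\exp\{-\lambda_k S_k(\beta)\}.
\]
% Integrate each \lambda_k over (0, \infty):
\[
\int_0^\infty \exp\{-\lambda_k S_k(\beta)\}\,d\lambda_k = \frac{1}{S_k(\beta)}
\quad\Longrightarrow\quad
\pi_{BB}(\beta \mid \text{Data}) \propto \pi(\beta)
  \prod_{k=1}^{q} \frac{\exp\{z_{i_k}(t_k)'\beta\}}{S_k(\beta)}.
\]
```

The product on the right is Cox's partial likelihood, so the marginal BB posterior of β is the partial likelihood times the prior π(β).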
Remark The BB approach does not require prior information on Λ, which may be a
disadvantage when we have prior information. However, we could incorporate prior
information into the BB posterior by choosing the prior of ΔΛ accordingly. Suppose
Λ a priori follows a gamma process with mean Λ0 and precision parameter cλ > 0.
Given that we could think of ΔΛ(tk) as an approximation of Λ(tk) − Λ(tk−1), we
could incorporate the prior information into the BB posterior by choosing the BB prior
as
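\[
\pi(\Delta\Lambda) \propto \prod_{k=1}^{q}
  \bigl(\Delta\Lambda(t_k)\bigr)^{c_\lambda(\Lambda_0(t_k) - \Lambda_0(t_{k-1})) - 1}
  \exp\bigl(-c_\lambda\,\Delta\Lambda(t_k)\bigr). \tag{4}
\]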
\[ \lambda(t) = \sum_{k=1}^{m} \lambda_k\, I(s_{k-1} < t \le s_k) \]
for some sequence 0 = s0 < s1 < s2 < · · · < sm . See, for example, Arjas and
Gasbarra (1996) and Ibrahim et al. (2001). Nonetheless, we use the BB approach
because it has a sounder theoretical foundation (at least asymptotically) and provides
a simpler MCMC algorithm. In contrast, it is not easy to choose the break
points s1 , . . . , sm in the piecewise constant hazard model, and the computation of the
posterior would be more difficult.
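The parameter in the model is θ = (γ0, γ1, H, Λ). The likelihood of the proposed
model is
\[
L(\theta) = \prod_{s=1}^{2} \prod_{i=1}^{n_s}
  \bigl[\exp\{(\gamma_0 + \gamma_1 H(x_{si}))\,I(s=2)\}\,d\Lambda(x_{si})\bigr]^{\delta_{si}}
  \exp\Bigl[-\int_0^{x_{si}} \exp\{(\gamma_0 + \gamma_1 H(u))\,I(s=2)\}\,d\Lambda(u)\Bigr].
\]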
The full Bayesian computation is extremely hard, as the likelihood involves terms like
\[ \int_0^t \exp\{\gamma_1 H(s)\}\,d\Lambda(s), \]
which require the knowledge of sample paths of both H(t) and Λ(t). To resolve this
problem, we employ the BB approach as follows:
Let 0 < t1 < t2 < · · · < tq be the ordered distinct uncensored
survival times among the pooled sample, and let R(t) = {(s, i) : xsi ≥ t} and
D(t) = {(s, i) : xsi = t, δsi = 1}. Let ΔΛ(tk) = Λ(tk) − Λ(tk−) = λk, and
assume that Λ(t) = ∑_{tk ≤ t} λk. Then, the empirical likelihood of the proposed model
becomes
\[
L_E(\theta) = \prod_{k=1}^{q} \lambda_k^{d(t_k)}
  \exp\Bigl(\sum_{(2,i)\in D(t_k)} (\gamma_0 + \gamma_1 H(t_k))\Bigr)
  \times \exp\Bigl\{-\lambda_k \sum_{(s,i)\in R(t_k)}
  \exp\{(\gamma_0 + \gamma_1 H(t_k))\,I(s=2)\}\Bigr\}
\]
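where d(t) is the cardinality of D(t). For the prior of the λk's, we use the BB prior
\[ \pi(\lambda) = \prod_{k=1}^{q} \frac{1}{\lambda_k}, \]
and the BB posterior is
\[ \pi_{BB}(\theta \mid \text{Data}) \propto L_E(\theta)\,\pi(\theta). \]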
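Under the empirical likelihood and the BB prior displayed above, each λk enters the BB posterior only through λk^(d(tk) − 1) exp(−λk rk), where rk is the risk-set sum multiplying λk in the exponent. Its conditional distribution is therefore gamma with shape d(tk) and rate rk. This is a consequence derived here from the displayed formulas rather than a step quoted from the text, and the function and argument names below are hypothetical.

```python
import math
import random

def draw_lambda(d, n1_at_risk, n2_at_risk, gamma0, gamma1, H, rng):
    """One Gibbs update of the baseline jumps lambda_k.

    d[k]          : number of events at t_k (both groups pooled), at least 1
    n1_at_risk[k] : number of group-1 subjects with x >= t_k
    n2_at_risk[k] : number of group-2 subjects with x >= t_k
    H[k]          : current value of H(t_k)
    """
    lam = []
    for dk, r1, r2, hk in zip(d, n1_at_risk, n2_at_risk, H):
        # Group-1 subjects contribute 1 each; group-2 subjects contribute
        # exp(gamma0 + gamma1 * H(t_k)) each.
        rate = r1 + r2 * math.exp(gamma0 + gamma1 * hk)
        lam.append(rng.gammavariate(dk, 1.0 / rate))   # (shape, scale)
    return lam
```

Because the λk are conditionally independent given the other parameters, the whole vector can be refreshed in one pass.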
We use a Gibbs sampler algorithm in which the parameters γ0 , γ1 , λ and H are gen-
erated sequentially from the conditional BB posteriors. We can easily generate γ0 and
γ1 using the Metropolis-Hastings (MH) algorithm with the following conditional BB
posterior distributions:
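\[
\pi(\gamma_0 \mid \gamma_1, \lambda, H, \text{Data}) \propto
  \exp\Bigl(\gamma_0 \sum_{k=1}^{q} \sum_{(2,i)\in D(t_k)} 1\Bigr)
  \exp\Bigl[-\exp(\gamma_0) \sum_{k=1}^{q} \lambda_k \exp\{\gamma_1 H(t_k)\}
  \sum_{(2,i)\in R(t_k)} 1\Bigr]\,\pi(\gamma_0), \tag{5}
\]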
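\[
\pi(\gamma_1 \mid \gamma_0, \lambda, H, \text{Data}) \propto
  \exp\Bigl(\gamma_1 \sum_{k=1}^{q} H(t_k) \sum_{(2,i)\in D(t_k)} 1\Bigr)
  \exp\Bigl[-\exp(\gamma_0) \sum_{k=1}^{q} \lambda_k \exp\{\gamma_1 H(t_k)\}
  \sum_{(2,i)\in R(t_k)} 1\Bigr]\,\pi(\gamma_1). \tag{6}
\]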
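A random-walk Metropolis update for γ0 based on (5) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the normal proposal, the step size, and all function and argument names are hypothetical, and the prior is the N(0, σ0²) prior specified earlier.

```python
import math
import random

def log_post_gamma0(g0, gamma1, lam, H, n2_events, n2_at_risk, sigma0_sq):
    """Log of (5) up to an additive constant.

    n2_events     : total number of group-2 events over all t_k
    n2_at_risk[k] : number of group-2 subjects at risk at t_k
    """
    hazard_sum = sum(l * math.exp(gamma1 * h) * r
                     for l, h, r in zip(lam, H, n2_at_risk))
    log_prior = -g0 * g0 / (2.0 * sigma0_sq)
    return g0 * n2_events - math.exp(g0) * hazard_sum + log_prior

def mh_step_gamma0(g0, step, rng, **kw):
    """One Metropolis update with a symmetric normal random-walk proposal."""
    cand = g0 + rng.gauss(0.0, step)
    log_ratio = log_post_gamma0(cand, **kw) - log_post_gamma0(g0, **kw)
    # The proposal is symmetric, so the acceptance probability is
    # min(1, posterior ratio).
    if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
        return cand
    return g0
```

Working on the log scale avoids overflow in the posterior ratio; the update for γ1 does not need a proposal at all because γ1 takes only three values.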
Note that Wk for k > p are not used in the empirical likelihood when p = max{x2i :
δ2i = 1}, as they affect the empirical likelihood through γ0 + γ1 Wk + log λk when
p = max{x1i : δ1i = 1}, in which case Wk and λk are not identifiable by the empirical
likelihood. To avoid these identifiability issues, we let W1 = 0 and Wk = 0 for k > p,
which is equivalent to using H0∗ instead of H0 in the prior parameter of the gamma
process where H0∗ (t) = 0 for t < t1 , H0∗ (t) = H0 (t) − H0 (t1 ) for t1 ≤ t ≤ t p and
H0∗ (t) = H0 (t p ) for t > t p .
We now explain how to generate Wk from its conditional posterior distribution. Let
H_k^(−l) = H(tk) − Wl. Then, the conditional posterior distribution of Wl for 2 ≤ l ≤ p
given others = (γ0, γ1, λ, W^(−l), Data) is given as
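\[
\pi(W_l \mid \text{others}) \propto
  \exp\Bigl(W_l \gamma_1 \sum_{k=l}^{q} \sum_{(2,i)\in D(t_k)} 1\Bigr)
  \times \exp\Bigl\{-\exp(\gamma_1 W_l)\Bigl[\sum_{k=l}^{q} \lambda_k
  \exp\bigl(\gamma_0 + \gamma_1 H_k^{(-l)}\bigr) \sum_{(2,i)\in R(t_k)} 1\Bigr]\Bigr\}
\]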
\[ \alpha_l = \sum_{k=l}^{q} \sum_{(2,i)\in D(t_k)} 1 \]
and
\[
\beta_l = \sum_{k=l}^{q} \lambda_k \exp\bigl(\gamma_0 + \gamma_1 H_k^{(-l)}\bigr)
  \sum_{(2,i)\in R(t_k)} 1.
\]
where
Note that the maximum of h_l(exp(γ1 Wl)), say h_l*, on Wl ∈ (0, ∞) can be easily
calculated and we can easily generate a random number from the gamma distribution.
Hence, we can use the acceptance-rejection (AR) sampling technique for generating Wl from (8) as follows:
1. Generate W ∼ Gamma(vl, c) where Gamma(a, b) is the gamma distribution with
mean a/b and variance a/b².
2. Generate U ∼ Uniform(0, 1).
3. Let Wl = W if h_l(exp(γ1 W))/h_l* ≥ U. Otherwise, go to 1.
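The three steps above follow the usual acceptance-rejection pattern with a gamma proposal and a bounded weight function. The sketch below is generic: the weight `h` and its bound `h_max` are hypothetical stand-ins supplied by the caller, since the specific function h_l is not reproduced here, and for simplicity `h` is applied to the proposed value directly rather than to exp(γ1 W).

```python
import math
import random

def accept_reject_gamma(shape, rate, h, h_max, rng, max_tries=100000):
    """Sample from a density proportional to Gamma(shape, rate) pdf times h(w).

    h must satisfy 0 <= h(w) <= h_max on (0, inf). Returns (draw, tries).
    """
    for tries in range(1, max_tries + 1):
        w = rng.gammavariate(shape, 1.0 / rate)   # step 1: gamma proposal
        u = rng.random()                          # step 2: uniform draw
        if h(w) / h_max >= u:                     # step 3: accept, else retry
            return w, tries
    raise RuntimeError("no acceptance; check that h_max bounds h")

# Toy check: with h(w) = exp(-w), the target is Gamma(shape, rate + 1).
rng = random.Random(4)
results = [accept_reject_gamma(3.0, 2.0, lambda w: math.exp(-w), 1.0, rng)
           for _ in range(20000)]
sample_mean = sum(w for w, _ in results) / len(results)
accept_rate = len(results) / sum(t for _, t in results)
```

The acceptance rate equals the proposal expectation of h(W)/h_max, so a tight bound h_max keeps the number of retries small.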
The MCMC algorithm for the BB posterior can be summarized as follows:
• Sampling γ0 given γ1 , λ, H and data: We use the random-walk MH algorithm. Let
γ0∗ be a candidate value generated from a random-walk kernel q(γ0 , γ0∗ ). Then,
the acceptance rate is
\[
p_h = \frac{\pi(h \mid \gamma_0, \lambda, H, \text{Data})}
           {\sum_{l \in \{-1,0,1\}} \pi(l \mid \gamma_0, \lambda, H, \text{Data})}
\]
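Because γ1 takes only the three values −1, 0 and 1, the probabilities p_h can be computed exactly and γ1 drawn directly from them. A minimal sketch, assuming the unnormalised log conditional posterior of each candidate value has already been evaluated from (6); the function name `draw_gamma1` and the dictionary input are hypothetical.

```python
import math
import random

def draw_gamma1(log_post, rng):
    """Draw gamma1 from {-1, 0, 1} and return (value, probabilities).

    log_post maps each candidate h to log pi(h | gamma0, lambda, H, Data),
    known only up to a common additive constant.
    """
    values = [-1, 0, 1]
    top = max(log_post[h] for h in values)
    # Subtracting the maximum before exponentiating avoids underflow.
    weights = [math.exp(log_post[h] - top) for h in values]
    total = sum(weights)
    probs = [w / total for w in weights]          # the p_h of the text
    return rng.choices(values, weights=probs, k=1)[0], probs

value, probs = draw_gamma1(
    {-1: -1000.0, 0: -1000.0 + math.log(2.0), 1: -1000.0}, random.Random(0)
)
```

In the toy call the middle value has twice the posterior mass of the other two, so the probabilities are 0.25, 0.5 and 0.25.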
4 Numerical experiments
In this section, we illustrate the proposed model on various data sets. For the prior parameters,
we let σ0² = 10, H0(t) = log(1 + t) and c = 1.
4.1 Simulation 1
We let n1 = n2 = 50 and generated survival times of the first group from the exponential
distribution with mean 20, and those of the second group from the exponential
distribution with mean 30. Censoring times are generated from the exponential dis-
tribution such that the censoring probability is 0.3. Note that the model used for the
simulation satisfies the proportional hazards assumption. We obtained the posterior
distributions of θ using the proposed MCMC algorithm. We iterated the MCMC algo-
rithm 100,000 times after a burn-in period of 10,000 iterations. Then, we collected
2,000 samples at every 50th iteration after the burn-in for further analysis. We used
a relatively extreme thinning (every 50th iteration) to make the samples almost inde-
pendent, making further analysis easier.
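The simulation design can be reproduced along the following lines. This is a sketch under assumptions: the censoring rate is chosen separately within each group so that each group has censoring probability 0.3 (the text does not say whether a common censoring distribution was used), and the names `simulate_group` and `kept_iterations` are hypothetical.

```python
import math
import random

def simulate_group(n, mean_survival, censor_prob, rng):
    """Exponential survival times with independent exponential censoring."""
    surv_rate = 1.0 / mean_survival
    # For independent exponentials with rates a (survival) and b (censoring),
    # P(censoring < survival) = b / (a + b), so b = a * p / (1 - p).
    cens_rate = surv_rate * censor_prob / (1.0 - censor_prob)
    data = []
    for _ in range(n):
        t = rng.expovariate(surv_rate)
        c = rng.expovariate(cens_rate)
        data.append((min(t, c), 1 if t <= c else 0))   # (x_i, delta_i)
    return data

rng = random.Random(1)
group1 = simulate_group(50, 20.0, 0.3, rng)
group2 = simulate_group(50, 30.0, 0.3, rng)

# Log hazard ratio of group 2 over group 1 under the simulation model.
true_gamma0 = math.log((1.0 / 30.0) / (1.0 / 20.0))

# Thinning schedule: every 50th of 100,000 iterations after a burn-in of 10,000.
kept_iterations = list(range(10000 + 50, 10000 + 100000 + 1, 50))
```

The log hazard ratio log(20/30) is approximately −0.4055, and the thinning schedule retains exactly 2,000 iterations.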
Figure 1 gives the traceplots and histograms of γ0, H(t) and Λ(t) at t = 20 (the
mean survival time of the first group) generated from the MCMC algorithm. The pro-
posed MCMC algorithm converges well, and the posterior densities have nice shapes
(at least, they are unimodal). Figure 2a shows how the empirical probability of γ1 = 0,
calculated based on the generated samples from the MCMC algorithm, converges. The
two dashed lines in the figure represent the 95% confidence interval obtained from the
samples, assuming that they are independent. With the exception of the early stage of
the iteration, the empirical probabilities lie inside the confidence limits, which implies
that the MCMC algorithm converges well to its stationary distribution for γ1 , too.
Figure 2b displays the posterior probabilities of γ1 , which supports the proportional
hazards model because it has the largest value when γ1 = 0.
Figure 3 shows the acceptance probability of Wk for k = 2, . . . , p in the AR
sampling step inside the MCMC algorithm. The smallest acceptance probability is
around 30%, which implies that the AR sampling step does not significantly increase
the overall computing time of the MCMC algorithm.
Table 1 compares the Bayes estimator and 90% (equal-tail) posterior probability
interval of γ0 with those obtained from the proportional hazards model (i.e., γ1 = 0)
and corresponding frequentist counterpart. The posterior interval based on the proposed
model is much wider than the other two intervals. This is because there is
additional uncertainty in estimating H for the proposed model. However, all intervals
contain the true value −0.4055.
Fig. 1 Panel a shows the traceplots of γ0, H(20) and Λ(20), and panel b shows the corresponding histograms
We conducted additional simulations to investigate the effect of the censoring prob-
ability and sample sizes on the posterior distribution. Table 2 presents the posterior
distributions of γ1 for various values of the censoring probability and sample sizes.
The results are stable and consistently support the proportional hazards model.
Fig. 2 Panel a shows the traceplot of the empirical posterior probability of γ1 = 0 (solid) against the number of iterations, with the 95% confidence limits (dashed), and panel b presents the posterior probabilities of γ1
Fig. 3 Acceptance probabilities of Wl for l = 2, . . . , p in the AR algorithm
Table 1 Bayes estimator and 90% posterior probability interval of γ0 for the proposed model (MHR, monotone hazard ratio), together with those obtained from the proportional hazards (PH) model and the corresponding frequentist results
Table 2 The posterior probabilities of γ1 = −1, 0 and 1 for various values of the censoring probability
and sample sizes in simulation 1
Fig. 4 Panel a shows the posterior probability of γ1, and panels b and c present the Bayes estimators of the log hazard ratio and Λ, respectively, with the pointwise 90% probability bands (PB) and the true functions
Table 3 The posterior probabilities of γ1 = −1, 0 and 1 for various values of the censoring probability
and sample sizes in Simulation 2
4.2 Simulation 2
We let λ2(t) = α t^(α−1) β λ1(t). The hazard ratio is increasing monotonically when
α > 1 and decreasing when α < 1. We set α = 0.5 and λ1(t) = 1/20 to have a
monotonically decreasing hazard ratio, and β = 20/√10 to make the mean survival
time of the second group equal to 20. The other set-ups such as sample sizes, censor-
ing probability, the number of iterations of the MCMC algorithm etc., are the same as
those for the simulated data set 1.
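With these values the cumulative hazard of the second group is β λ1 t^α = √(t/10), which can be inverted in closed form. The sketch below draws survival times by inverse transform and checks the stated mean of 20 by Monte Carlo; the function and constant names are hypothetical.

```python
import math
import random

ALPHA = 0.5
BASE_RATE = 1.0 / 20.0                 # lambda_1(t), constant in t
BETA = 20.0 / math.sqrt(10.0)

def cumulative_hazard_2(t):
    """Integral of alpha * s**(alpha - 1) * beta * lambda_1 over (0, t]."""
    return BETA * BASE_RATE * t ** ALPHA

def sample_group2(rng):
    """Inverse-transform draw: solve cumulative_hazard_2(T) = E, E ~ Exp(1)."""
    e = rng.expovariate(1.0)
    return (e / (BETA * BASE_RATE)) ** (1.0 / ALPHA)

rng = random.Random(2)
draws = [sample_group2(rng) for _ in range(200000)]
sample_mean = sum(draws) / len(draws)
```

Analytically, T = 10 E² with E standard exponential, so the mean is 10 × 2 = 20, in agreement with the choice of β.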
The posterior probability of γ1 is given in Fig. 4a, which strongly supports the
true model, a monotonically decreasing hazard ratio. Figures 4b and c present the Bayes
estimator and corresponding pointwise 90% posterior probability bands of the log
hazard ratio (γ0 + γ1 H(t)) and cumulative baseline hazard function Λ(t) with the true
ones, respectively. Note that the true functions lie inside the probability bands, imply-
ing that the proposed method estimates the monotone hazard ratio and cumulative
baseline hazard function well.
As is done for Simulation 1, Table 3 presents the posterior probabilities of γ1 for
various values of the censoring probability and sample sizes. All of the results strongly
indicate that the hazard ratio is decreasing.
Priors need to be specified for three parameters γ0, γ1 and H. Since γ1 has a value
among {−1, 0, 1}, the uniform prior is a natural one. For γ0, unless the prior variance
is very small, the posterior is not seriously affected by the choice of prior. However,
the specification of the prior of H needs a special consideration.
Table 4 The posterior probabilities of γ1 = −1, 0 and 1 for various specifications of H0 and c
There are two prior parameters, prior mean H0 and precision parameter c. In the
previous two subsections, we let H0 (t) = log(1 + t), which is a concave function.
The concavity of H0 represents the prior belief that the degree of monotonicity of the
hazard ratio decreases as time increases, and we have seen that this choice of prior
works well in various cases. In this view, an alternative choice of H0(t) would be t^a
for a ∈ (0, 1). Table 4 compares the posterior probabilities of γ1 when H0 is either
log(1 + t) or √t with various values of c. We set (n1, n2) = (100, 100) and the
censoring probability to 30%.
When c = 0.1, the probability of γ1 = 0 is relatively smaller than the other cases
for Simulation 1. This is because the model is poorly identifiable when H is very
small, if the true model has a constant hazard ratio. For an extreme case, γ1 is not
identifiable when H(t) ≡ 0. When c is small, the prior variance of H becomes large,
and hence, γ1 becomes poorly identifiable. When c = 10 and H0(t) = √t, the result
is completely misleading for Simulation 2. Note that a priori H(t) is a gamma random
variable with mean H0(t) and variance H0(t)/c. Hence, the prior mass is concentrated
too much around the prior mean when H0(t) and c are large; therefore, when
the true hazard ratio is different from the prior mean, the posterior would support the
constant hazard ratio model rather than the prior mean H0. Based on the simulation
results, we suggest selecting the prior parameters such that H0 does not increase too
fast and c is not too large or small.
In this section, we analyze two real data sets. The first data set, “Leukemia,” is the
leukemia patients remission time data analyzed by Laud et al. (1998). The data con-
sisted of 42 patients, divided into control and treatment groups, with 21 patients each.
The treatment and control groups are given 6-mercaptopurine (or 6-MP) and placebo,
respectively. Of the 42 observations, 12 are censored. The second data set “Ovarian,” is
the time from treatment to progression of disease of 35 patients with stage II (n 1 = 15)
or IIA (n 2 = 20) ovarian cancer. Of 35 observations, 13 are censored. The data set
was analyzed by Gill and Schumacher (1987).
Fig. 5 Panels a and b show the traceplots of the empirical posterior probability of γ1 = 0 and γ1 = −1, respectively (solid), against the number of iterations, with the 95% confidence limits (dashed), for the Leukemia and Ovarian data sets
Remark When we are interested in the validity of the monotone hazard ratio assump-
tion, the frequentist tests are not valid because the rejection of the frequentist tests does
not necessarily mean that the monotone hazard ratio is valid. In contrast, the Bayesian
results (the posterior probability of γ1 and the DIC values) directly confirm whether
the assumption of the monotone hazard ratio is valid.
Remark Along with the DIC values for γ1 = −1, 0 and 1, we calculated the DIC
value of the model where γ1 is random. The DIC value with random γ1 would be
expected to be smaller than that with γ1 = 0 when the proportional hazards assumption
is valid. The DIC and pD values with random γ1 for the Leukemia and Ovarian
data sets are (175.23, 1.20) and (128.41, 2.16), respectively, which do not confirm our
conjecture. We find, however, that the DIC values are unstable, particularly when the
proportional hazards assumption is valid. Note that the difference of the DIC values
between γ1 = 0 and 1 for the Leukemia data set is very small, whereas the posterior
probabilities are much different. We think that the DIC may not be appropriate for
our model because our model is semiparametric (i.e., the hazard ratio is completely
unspecified), and the DIC is developed mainly for parametric models where the maximum
likelihood estimator is asymptotically Gaussian. We leave this problem for
future work.
Fig. 6 The Leukemia data set results: panel a presents the traceplots of γ0, H(10), and Λ(10), and panel b shows the corresponding histograms
Fig. 7 The Ovarian data set results: panel a presents the traceplots of γ0, H(10), and Λ(10), and panel b shows the corresponding histograms
For the Ovarian data set, in which the proportional hazards assumption is rejected
against the monotone hazard ratio, we draw the Bayes estimator of the hazard ratio
with the pointwise 90% probability bands in Fig. 9a. The figure suggests that the hazard
ratio of the second group (stage II) over the first group (stage IIA) decreases steadily.
We draw the Bayes estimators of the two cumulative hazard functions Λ1 and Λ2 with
their pointwise 90% probability bands and the empirical cumulative hazard (ECH)
functions in Fig. 9b and c, respectively. The Bayes estimators and ECH functions are
close and are located inside the probability bands.
Fig. 8 Panels a and b present the posterior probabilities of γ1 for the Leukemia and Ovarian data sets,
respectively
Table 5 P-values of the three frequentist test statistics for the proportional hazards against monotone hazard
ratio, and the DIC and pD values for the proposed model with γ1 = −1, 0, 1, respectively
Data set   P-value (test 1)  P-value (test 2)  P-value (test 3)  DIC, pD (γ1 = −1)  DIC, pD (γ1 = 0)  DIC, pD (γ1 = 1)
Leukemia   0.6897            0.6807            0.1660            176.73, 1.39       174.64, 0.92      174.72, 1.28
Ovarian    0.0571            0.0507            0.0298            127.52, 2.03       130.73, 1.05      133.46, 1.67
Fig. 9 Part a draws the Bayes estimator of H(t) with its pointwise 90% probability band, part b for Λ1
and part c for Λ2.
5 Concluding remarks
We proposed a Bayesian approach for estimating the two hazard functions under the
monotone hazard ratio constraint and developed an efficient MCMC algorithm. We
demonstrated with simulated and real data sets that the MCMC algorithm, based on
the BB approach, converges well and provides reliable results.
In this paper, we modeled the monotone hazard ratio nonparametrically. An alterna-
tive model is a piecewise constant monotone hazard ratio, which provides information
about when the hazard ratio changes. The proposed BB approach can be easily modi-
fied to this model to save significant computational costs.
The proposed model can be extended to a case where there are more than two haz-
ard functions. Suppose there are three hazard functions λ1 , λ2 and λ3 with λ2 /λ1 and
λ3 /λ2 increasing monotonically. We can model λ2 and λ3 by
\[ \lambda_2(t) = \exp\bigl\{\gamma_0^{(2)} + H^{(2)}(t)\bigr\}\,\lambda_1(t) \]
and
\[ \lambda_3(t) = \exp\bigl\{\gamma_0^{(3)} + H^{(2)}(t) + H^{(3)}(t)\bigr\}\,\lambda_1(t), \]
where H^(2) and H^(3) are two independent gamma processes a priori. The proposed
MCMC algorithm can be easily modified for this model as well.
Studying asymptotic properties of the posterior distribution is worth pursuing. Without
H, Kim and Lee (2003b) and Kim (2006) proved that the convergence rate of the
BB and full Bayesian posteriors is 1/√n. We think, however, that the convergence
rate of the posterior of H to the true hazard ratio would be slower than 1/√n, as the
optimal convergence rate for the hazard function is typically slower than 1/√n. This
conjecture would partly explain the wider probability interval of γ0 for the proposed
model compared to the results for the proportional hazards model in Table 1 and the
wider probability band for Λ2 in Fig. 9c compared to that of Λ1 in Fig. 9b.
Acknowledgment This work was supported by the Korea Science and Engineering Foundation (KOSEF)
grant funded by the Korea government (MEST) (R01-2007-000-20045-0(2008)).
References
Andersen PK, Borgan O, Gill RD, Keiding N (1993) Statistical methods based on counting processes.
Springer, New York
Arjas E, Gasbarra D (1996) Bayesian inference of survival probabilities under stochastic ordering con-
straints. J Am Stat Assoc 91:1101–1109
Brunk HD, Franck WE, Hanson DL, Hogg RV (1966) Maximum likelihood estimation of the distribution
of two stochastically ordered random variables. J Am Stat Assoc 61:1067–1080
Deshpande JV, Sengupta D (1995) Testing for the hypothesis of proportional hazards in two population.
Biometrika 82:251–261
Dykstra RL (1982) Maximum likelihood estimation of the survival functions of stochastically ordered
random variables. J Am Stat Assoc 77:621–628
Dykstra RL, Kochar S, Robertson T (1991) Statistical inference for uniform stochastic ordering in several
population. Ann Stat 19:870–888
Gelfand AE, Kottas A (2001) Nonparametric Bayesian modeling for stochastic order. Ann Inst Stat Math 53:865–876
Gill R, Schumacher M (1987) A simple test of the proportional hazards assumption. Biometrika 74:289–300
Hjort NL, Claeskens G (2006) Focussed information criteria and model averaging for Cox’s hazard regres-
sion model. J Am Stat Assoc 101:1449–1464
Ibrahim JG, Chen MH, Sinha D (2001) Bayesian survival analysis. Springer-Verlag, New York
Kalbfleisch JD (1978) Nonparametric Bayesian analysis of survival time data. J R Stat Soc Ser B 40:214–
221
Kim Y, Lee J (2003a) Bayesian analysis of proportional hazard models. Ann Stat 31:493–511
Kim Y, Lee J (2003b) Bayesian bootstrap for proportional hazards models. Ann Stat 31:1905–1922
Kim Y (2006) The Bernstein-von Mises theorem for the proportional hazard model. Ann Stat 34:1678–1700
Laud PW, Damien P, Smith AFM (1998) Bayesian nonparametric and covariate analysis of failure time data.
In: Practical nonparametric and semiparametric Bayesian statistics. Springer, New York, pp 213–225
Lo AY (1982) Bayesian nonparametric statistical inference for Poisson point processes. Z Wahrsch Verw
Gebiete 59:55–66
Mukerjee H (1996) Estimation of survival functions under uniform stochastic ordering. J Am Stat Assoc
91:1684–1689
Paestgaard JT, Huang J (1996) Asymptotic theory for nonparametric estimation of survival curves under
order restriction. Ann Stat 24:1679–1716
Ripley BD (2006) Stochastic simulation. Wiley, New York
Sengupta D, Bhattacharjee A, Rajeev V (1998) Testing for the proportionality of hazards in two samples
against the increasing cumulative hazard ratio alternative. Scand J Stat 25:637–647
Spiegelhalter DJ, Best N, Carlin B, Linde A (2002) Bayesian measures of model complexity and fit (with
discussion). J R Stat Soc Ser B 64:583–639