Ch3 Nonparametric Estimation: Lecture Notes (Student Version)

This chapter describes methods for computing nonparametric estimates and confidence intervals for $F(t)$ from complete or censored data.
3.2 Estimation from singly censored interval data

This section shows how to compute a nonparametric estimate of a cdf from interval-censored data when either all units fail or all of the right censoring occurs at one point at the end of the study (single right censoring).
Let $n$ be the initial number of units (sample size) and let $d_i$ denote the number of units that died or failed in the $i$th interval $(t_{i-1}, t_i]$.
The nonparametric estimator $\hat{F}(t_i)$, based on the simple binomial distribution, is

$$\hat{F}(t_i) = \frac{\text{number of failures up to time } t_i}{n} = \frac{\sum_{j=1}^{i} d_j}{n},$$

the maximum likelihood estimator of $F(t_i)$.
In general, the nonparametric estimator $\hat{F}(t_i)$ is defined at all $t_i$ values (upper endpoints of all observation intervals).
If interval $i$ is known to have no failures, then $\hat{F}(t) = \hat{F}(t_{i-1})$ for $t_{i-1} \le t \le t_i$.
If interval $i$ is known to contain one or more failures, $\hat{F}(t)$ increases from $\hat{F}(t_{i-1})$ to $\hat{F}(t_i)$ in the interval $(t_{i-1}, t_i]$, but $\hat{F}(t)$ is not defined for $t_{i-1} < t < t_i$.
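The estimator above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the text: the function name `cdf_estimate` and the toy counts are assumptions chosen for the example.

```python
from itertools import accumulate

def cdf_estimate(d, n):
    """Return F_hat(t_i) = (d_1 + ... + d_i) / n for i = 1, ..., m.

    d: failure counts per inspection interval; n: initial sample size.
    """
    if sum(d) > n:
        raise ValueError("total failures cannot exceed the sample size")
    return [cum / n for cum in accumulate(d)]

# Hypothetical toy data (not from the text): 100 units, 4 inspection intervals.
d = [0, 1, 2, 2]
F_hat = cdf_estimate(d, n=100)
for i, f in enumerate(F_hat, start=1):
    print(f"interval {i}: F_hat = {f:.2f}")
```

The units that have not failed by the last inspection (95 in this toy case) are the singly right-censored ones; they enter the estimate only through $n$.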
Example 3.1 & 3.2 (p. 47): use the Plant 1 data in Example 1.5.
Example: use the data in Example 1.2 to compute a nonparametric estimate of $F(t_i)$.
3.4 Confidence interval from complete or singly censored data

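1. Pointwise binomial-based confidence interval for $F(t_i)$
A conservative $100(1-\alpha)\%$ confidence interval $[\underset{\sim}{F}(t_i), \tilde{F}(t_i)]$ for $F(t_i)$ is

$$\underset{\sim}{F}(t_i) = \left\{ 1 + \frac{(n - n\hat{F} + 1)\, \mathcal{F}_{(1-\alpha/2;\; 2n - 2n\hat{F} + 2,\; 2n\hat{F})}}{n\hat{F}} \right\}^{-1},$$

$$\tilde{F}(t_i) = \left\{ 1 + \frac{n - n\hat{F}}{(n\hat{F} + 1)\, \mathcal{F}_{(1-\alpha/2;\; 2n\hat{F} + 2,\; 2n - 2n\hat{F})}} \right\}^{-1},$$

where $\hat{F} = \hat{F}(t_i)$ and $\mathcal{F}_{(p;\, \nu_1, \nu_2)}$ is the $p$ quantile of the F distribution with $(\nu_1, \nu_2)$ degrees of freedom.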
Reference: Brownlee, K. A. (1960), Statistical Theory and Methodology in Science and Engineering.
Example 3.3 (p. 50) & Figure 3.2: use the Plant 1 data in Example 1.5.
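The binomial-based limits can be checked numerically without an F-quantile routine. The sketch below swaps in an equivalent formulation: it solves the binomial tail equations $P(X \ge x \mid p) = \alpha/2$ and $P(X \le x \mid p) = \alpha/2$ by bisection, where $x = n\hat{F}$; these are the Clopper-Pearson limits, which coincide with the F-quantile form. The function names and the toy counts are assumptions made for illustration, and the direct summation is only intended for moderate $n$.

```python
from math import comb

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def bisect_root(g, iters=200):
    """Root of a function g that is increasing on [0, 1] with g(0) < 0 < g(1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def binomial_ci(x, n, alpha=0.05):
    """Conservative two-sided interval for F(t_i), where x = n * F_hat(t_i)."""
    # Lower limit solves P(X >= x | p) = alpha/2 (taken as 0 when x = 0).
    lower = 0.0 if x == 0 else bisect_root(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha/2 (taken as 1 when x = n).
    upper = 1.0 if x == n else bisect_root(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper

# Hypothetical toy case (not from the text): 5 failures among 100 units.
lower, upper = binomial_ci(5, 100)
print(f"[{lower:.4f}, {upper:.4f}]")
```

Bisection is used here only to keep the sketch within the standard library; a statistics package with an F or beta quantile function would evaluate the closed-form limits directly.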
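2. Pointwise normal-approximation confidence interval for $F(t_i)$
For a specified value of $t_i$, a simple approximate $100(1-\alpha)\%$ confidence interval for $F(t_i)$ is

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \hat{F}(t_i) \pm z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}},$$

where $z_{(p)}$ is the $p$ quantile of the standard normal distribution and

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\frac{\hat{F}(t_i)\,[1 - \hat{F}(t_i)]}{n}}$$

is an estimate of the standard error of $\hat{F}(t_i)$.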
If $n\hat{F}(t_i)$ is at least 5 to 10 and no more than $n-5$ to $n-10$, the distribution of

$$Z_{\hat{F}} = \frac{\hat{F}(t_i) - F(t_i)}{\widehat{\mathrm{se}}_{\hat{F}}}$$

can be approximated adequately by a standard normal distribution.
Otherwise the approximation will be crude, and it is even possible to get confidence limits that fall outside the interval from 0 to 1.
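A short Python sketch of the normal-approximation interval follows. The function name `normal_ci` and the two toy inputs are assumptions made for illustration; the second call shows how a limit can fall outside the interval from 0 to 1 when $n\hat{F}(t_i)$ is small.

```python
from math import sqrt
from statistics import NormalDist

def normal_ci(F_hat, n, alpha=0.05):
    """Pointwise interval F_hat -/+ z_(1-alpha/2) * se, se = sqrt(F_hat (1 - F_hat) / n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(F_hat * (1 - F_hat) / n)
    return F_hat - z * se, F_hat + z * se

# Hypothetical toy values (not from the text), both with n = 100.
print(normal_ci(0.50, 100))  # n * F_hat = 50: the approximation is reasonable
print(normal_ci(0.01, 100))  # n * F_hat = 1: the lower limit falls below 0
```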
Example 3.4 (p. 51) & Table 3.1: use the Plant 1 data in Example 1.5 to compute the 95% confidence interval based on the normal approximation.
Example 3.5 & Figure 3.3: use the data in Example 1.2 (Table 1.2).

3. Confidence interval based on the logit transformation (Section 3.6)
The approximate confidence interval for $F(t_i)$ based on the large-sample normal approximation,

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \hat{F}(t_i) \pm z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}},$$

may not provide an adequate approximation when the sample size is not large.
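A better approximate confidence interval might be obtained by using the logit transformation,

$$\mathrm{logit}(p) = \ln\left[\frac{p}{1-p}\right].$$

The method is based on the assumption that the distribution of

$$Z_{\mathrm{logit}(\hat{F})} = \frac{\mathrm{logit}[\hat{F}(t_i)] - \mathrm{logit}[F(t_i)]}{\widehat{\mathrm{se}}_{\mathrm{logit}(\hat{F})}}$$

can be approximated adequately by a standard normal distribution.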
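The two-sided approximate $100(1-\alpha)\%$ confidence interval for $F(t_i)$ is

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))\, w},\; \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))/w} \right],$$

where

$$w = \exp\left\{ \frac{z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}}}{\hat{F}(t_i)\,[1 - \hat{F}(t_i)]} \right\}.$$

Whenever $0 < \hat{F}(t_i) < 1$, the endpoints of this interval lie between 0 and 1, and the interval generally gives a better approximation than the untransformed normal-approximation interval.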
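The logit-based interval is easy to compute once $\hat{F}(t_i)$ and its estimated standard error are available. The following Python sketch assumes hypothetical names (`logit_ci`, `logit`) and toy inputs; it uses the binomial standard error from the singly censored case.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def logit(p):
    """logit(p) = ln[p / (1 - p)]."""
    return log(p / (1 - p))

def logit_ci(F_hat, se, alpha=0.05):
    """Interval [F/(F + (1-F) w), F/(F + (1-F)/w)], w = exp{z se / [F (1-F)]}."""
    if not 0.0 < F_hat < 1.0:
        raise ValueError("the logit interval needs 0 < F_hat < 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    w = exp(z * se / (F_hat * (1 - F_hat)))
    lower = F_hat / (F_hat + (1 - F_hat) * w)
    upper = F_hat / (F_hat + (1 - F_hat) / w)
    return lower, upper

# Hypothetical toy values (not from the text): n = 100 and F_hat = 0.01.
n, F_hat = 100, 0.01
se = sqrt(F_hat * (1 - F_hat) / n)
lower, upper = logit_ci(F_hat, se)
print(f"[{lower:.4f}, {upper:.4f}]")  # both limits stay inside (0, 1)
```

On the logit scale the interval is symmetric about $\mathrm{logit}(\hat{F})$: both `logit(upper) - logit(F_hat)` and `logit(F_hat) - logit(lower)` equal $\ln w$.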
Example: use the Plant 1 data in Example 1.5.
(a) Compute and plot a nonparametric estimate of $F(t_i)$.
(b) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the normal approximation and add these to the plot in part (a).
(c) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the logit transformation and the normal approximation.
3.5 Estimation from multiply censored data
This section shows how to compute a nonparametric estimate of a cdf from data with multiple right censoring.
The censoring structure is as follows: if a unit does not fail in interval $i$, it is either censored at the end of interval $i$ or it continues into interval $i+1$. Information is available on the status of the units at the end of each interval. The intervals for different units do not overlap.
The maximum likelihood estimator of $p_i$ (the conditional probability of failing in interval $i$, given that a unit was still operating at the beginning of interval $i$) is the sample proportion failing,

$$\hat{p}_i = \frac{d_i}{n_i}, \quad i = 1, \ldots, m.$$

Thus, the MLEs of $S(t_i)$ and $F(t_i)$ are

$$\hat{S}(t_i) = \prod_{j=1}^{i} (1 - \hat{p}_j) \quad \text{and}$$

$$\hat{F}(t_i) = 1 - \prod_{j=1}^{i} (1 - \hat{p}_j), \quad i = 1, \ldots, m.$$
Here $d_i$ denotes the number of units that died or failed in the $i$th interval $(t_{i-1}, t_i]$, and $r_i$ denotes the number of units that survive the interval and are right-censored at $t_i$.
The units that are alive at the beginning of interval $i$ are called the "risk set" for the interval (i.e., those at risk of failure), and the size of the risk set at the beginning of interval $i$ is

$$n_i = n - \sum_{j=0}^{i-1} d_j - \sum_{j=0}^{i-1} r_j, \quad i = 1, \ldots, m,$$

where $m$ is the number of intervals and $d_0 = 0$, $r_0 = 0$.
Note that the nonparametric estimator $\hat{F}(t_i)$ is defined at all $t_i$ values (upper endpoints of all observation intervals).
If interval $i$ is known to have no failures, then $\hat{F}(t) = \hat{F}(t_{i-1})$ for $t_{i-1} \le t \le t_i$.
If interval $i$ is known to contain one or more failures, $\hat{F}(t)$ increases from $\hat{F}(t_{i-1})$ to $\hat{F}(t_i)$ in the interval $(t_{i-1}, t_i]$, but $\hat{F}(t)$ is not defined within the interval.
When there are no censored observations before the last failure, $\hat{F}(t_i)$ is equivalent to the estimator for singly censored interval data (Section 3.2).
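The recursion for the risk sets, the conditional failure proportions, and the product form of $\hat{S}(t_i)$ can be sketched as follows. The function name `multiply_censored_estimate` and the toy counts are assumptions made for illustration.

```python
def multiply_censored_estimate(d, r, n):
    """Return a list of (n_i, p_hat_i, S_hat(t_i), F_hat(t_i)), one per interval.

    d[k], r[k]: failures in, and units right-censored at the end of, interval k + 1.
    """
    rows = []
    at_risk, surv = n, 1.0
    for d_i, r_i in zip(d, r):
        if at_risk <= 0:
            raise ValueError("risk set is empty; p_hat_i is undefined")
        p_hat = d_i / at_risk          # sample proportion failing in the interval
        surv *= 1.0 - p_hat            # running product of (1 - p_hat_j)
        rows.append((at_risk, p_hat, surv, 1.0 - surv))
        at_risk -= d_i + r_i           # n_{i+1} = n_i - d_i - r_i
    return rows

# Hypothetical toy data (not from the text): 10 units, 3 intervals.
rows = multiply_censored_estimate(d=[1, 0, 2], r=[2, 1, 0], n=10)
for n_i, p_hat, S_hat, F_hat in rows:
    print(n_i, round(p_hat, 4), round(S_hat, 4), round(F_hat, 4))
```

With every entry of `r` set to zero, the last value of `F_hat` reduces to the cumulative failure count divided by `n`, in agreement with the singly censored estimator of Section 3.2.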
Example 3.6 (p. 53), Figure 3.4 & Table 3.2: use the data in Example 1.5, pooling the data from the three different plants in operating time.
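Confidence intervals
(1) The pointwise normal-approximation method

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \hat{F}(t_i) - z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}},\; \hat{F}(t_i) + z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}} \right]$$

(2) The logit transformation method

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))\, w},\; \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))/w} \right],$$

where

$$w = \exp\left\{ \frac{z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}}}{\hat{F}(t_i)\,[1 - \hat{F}(t_i)]} \right\}.$$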
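Since the estimate of $F(t)$ has changed, the standard error of $\hat{F}(t_i)$ should be modified to

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\widehat{\mathrm{Var}}[\hat{F}(t_i)]} = \sqrt{ [\hat{S}(t_i)]^2 \sum_{j=1}^{i} \frac{\hat{p}_j}{n_j (1 - \hat{p}_j)} }.$$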
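A possible Python sketch of this standard error, accumulated alongside $\hat{F}(t_i)$, is shown below. The function name `cdf_with_se` and the toy counts are assumptions made for illustration. The last lines check that, with no censoring, the result agrees with the binomial form $\sqrt{\hat{F}(1-\hat{F})/n}$ used for singly censored data.

```python
from math import sqrt

def cdf_with_se(d, r, n):
    """Return [(F_hat(t_i), se_hat)] for multiply censored interval data."""
    rows = []
    at_risk, surv, total = n, 1.0, 0.0
    for d_i, r_i in zip(d, r):
        if d_i >= at_risk:
            raise ValueError("the variance formula needs p_hat_i < 1")
        p_hat = d_i / at_risk
        surv *= 1.0 - p_hat
        total += p_hat / (at_risk * (1.0 - p_hat))  # p_hat_j / (n_j (1 - p_hat_j))
        rows.append((1.0 - surv, surv * sqrt(total)))
        at_risk -= d_i + r_i
    return rows

# Hypothetical toy data (not from the text): 10 units, 3 intervals.
censored = cdf_with_se(d=[1, 0, 2], r=[2, 1, 0], n=10)
complete = cdf_with_se(d=[1, 0, 2], r=[0, 0, 0], n=10)

# Without censoring the two standard-error formulas agree.
F_last, se_last = complete[-1]
print(se_last, sqrt(F_last * (1 - F_last) / 10))
```

The resulting `se_hat` values can be passed to either the normal-approximation or the logit interval in place of the binomial standard error.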
Example 3.6 & 3.7 (pp. 53, 56), Tables 3.2 & 3.3, Figures 3.4 & 3.5: use the data in Example 1.5, pooling the data from the three different plants in operating time.
(a) Compute and plot a nonparametric estimate of $F(t_i)$.
(b) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the normal approximation and add these to the plot in part (a).
(c) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the logit transformation and the normal approximation.
3.7 Estimation from multiply censored data with exact failure times
If the exact failure times arise from a continuous inspection process or, perhaps, from a very large number of closely spaced inspections, the estimator can be treated as a limiting case of the interval-based nonparametric estimator; it is generally known as the Kaplan-Meier estimator.
In the limit, as the number of inspections increases and the width of the inspection intervals approaches zero, failures are concentrated in a relatively small number of intervals. Most intervals will not contain any failures.
$\hat{F}(t)$ is constant over all intervals that have no failures.
With small intervals, $\hat{F}(t)$ becomes a step function with gaps over the intervals where there were failures and with jumps at the upper endpoints of these intervals.
In the limit, as the width of the intervals approaches zero, the size of the gaps approaches zero and the step function increases at the reported failure times.
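The limiting step function can be computed directly from exact times. The sketch below is a minimal illustration with assumed names (`kaplan_meier`, `times`, `failed`) and toy data; it adopts the convention, consistent with Section 3.5, that a unit censored at the same time as a failure is still in the risk set for that failure.

```python
from itertools import groupby

def kaplan_meier(times, failed):
    """Return [(t, F_hat(t))] at each distinct failure time.

    times: exact failure or censoring times; failed[k] is True for a failure
    and False for a right-censored unit.
    """
    steps = []
    at_risk, surv = len(times), 1.0
    observations = sorted(zip(times, failed))
    for t, group in groupby(observations, key=lambda obs: obs[0]):
        flags = [f for _, f in group]
        deaths = sum(flags)
        if deaths:
            surv *= 1.0 - deaths / at_risk
            steps.append((t, 1.0 - surv))
        at_risk -= len(flags)  # failures and censored units both leave the risk set
    return steps

# Hypothetical toy data (not from the text): 5 units, 3 failures, 2 censored.
times = [5, 8, 8, 12, 15]
failed = [True, False, True, True, False]
for t, F_hat in kaplan_meier(times, failed):
    print(t, round(F_hat, 4))
```

The estimate jumps only at the failure times 5, 8, and 12; the censored units at 8 and 15 change the size of the risk set but produce no jump.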
Confidence intervals
The following two methods can again be used to obtain confidence intervals for $F(t)$:
(1) the pointwise normal-approximation method
(2) the logit transformation method
Example 3.8 & 3.9 (p. 59), Tables C.2 & 3.4, Figure 3.6: vehicle shock absorber failure data. The 95% confidence interval is constructed by the logit transformation method.
3.9 Uncertain censoring times
When censoring times are known only to be within specified intervals, the risk set decreases over each interval in a manner that cannot be specified precisely.
Two extreme methods of handling the censored observations in the intervals are:
1. Assume all censored observations are removed at $t_i$, the upper endpoint of the interval. This gives $\hat{p}_i = d_i / n_i$, as used for multiply censored data (Section 3.5).
2. Assume all censored observations are removed at $t_{i-1}$, the lower endpoint of the interval. This gives $\hat{p}_i = d_i / (n_i - r_i)$.

Note that $n_i = n - \sum_{j=0}^{i-1} d_j - \sum_{j=0}^{i-1} r_j$, $i = 1, \ldots, m$.
However, both of these methods are biased. A commonly used compromise is

$$\hat{p}_i = \frac{d_i}{n_i - r_i/2}.$$
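Then substitute these estimates into the nonparametric estimate

$$\hat{F}(t_i) = 1 - \prod_{j=1}^{i} (1 - \hat{p}_j),$$

and the corresponding standard error

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\widehat{\mathrm{Var}}[\hat{F}(t_i)]} = \sqrt{ [\hat{S}(t_i)]^2 \sum_{j=1}^{i} \frac{\hat{p}_j}{n_j (1 - \hat{p}_j)} }.$$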
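The three choices of $\hat{p}_i$ and the resulting estimate can be compared with a short Python sketch. The function names (`conditional_failure_probs`, `compromise_cdf`) and the toy counts are assumptions made for illustration.

```python
def conditional_failure_probs(d_i, r_i, n_i):
    """Return p_hat_i under (censoring at t_i, the compromise, censoring at t_{i-1})."""
    if n_i - r_i <= 0:
        raise ValueError("n_i - r_i must be positive for the lower-endpoint estimate")
    return d_i / n_i, d_i / (n_i - r_i / 2), d_i / (n_i - r_i)

def compromise_cdf(d, r, n):
    """F_hat(t_i) = 1 - prod_j (1 - p_hat_j) with p_hat_j = d_j / (n_j - r_j / 2)."""
    at_risk, surv = n, 1.0
    F_hat = []
    for d_i, r_i in zip(d, r):
        _, p_hat, _ = conditional_failure_probs(d_i, r_i, at_risk)
        surv *= 1.0 - p_hat
        F_hat.append(1.0 - surv)
        at_risk -= d_i + r_i   # n_{i+1} = n_i - d_i - r_i
    return F_hat

# Hypothetical toy data (not from the text): 10 units, 3 intervals.
print(conditional_failure_probs(d_i=1, r_i=2, n_i=10))  # compromise lies in between
F_hat = compromise_cdf(d=[1, 0, 2], r=[2, 1, 0], n=10)
print([round(f, 4) for f in F_hat])
```

When an interval contains failures and censored units, the compromise value lies strictly between the two extreme estimates; when $r_i = 0$ all three coincide.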
Example 3.12 (p. 64, ex2.12, Appendix C.6), Table 3.6.