
Ch3 Nonparametric Estimation

We describe methods for computing nonparametric estimates and confidence intervals for $F(t)$ from complete or censored data.

3.2 Estimation from singly censored interval data


This section shows how to compute a nonparametric estimate of a cdf from interval-censored data when either all units fail or all of the right censoring occurs at one point at the end of the study (single right censoring).

Let $n$ be the initial number of units (sample size) and let $d_i$ denote the number of units that died or failed in the $i$th interval $(t_{i-1}, t_i]$.

The nonparametric estimator $\hat{F}(t_i)$ based on the simple binomial distribution is

$$\hat{F}(t_i) = \frac{\text{\# of failures up to time } t_i}{n} = \frac{\sum_{j=1}^{i} d_j}{n},$$

which is the maximum likelihood estimator of $F(t_i)$.

In general, the nonparametric estimator $\hat{F}(t_i)$ is defined at all $t_i$ values (upper endpoints of all observation intervals).

If interval $i$ is known to have no failures, then $\hat{F}(t) = \hat{F}(t_{i-1})$ for $t_{i-1} \le t \le t_i$.

If interval $i$ is known to contain one or more failures, $\hat{F}$ increases from $\hat{F}(t_{i-1})$ to $\hat{F}(t_i)$ in the interval $(t_{i-1}, t_i]$, but $\hat{F}(t)$ is not defined for $t_{i-1} < t < t_i$.
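The estimator above is a running sum of failure counts divided by the sample size. A minimal Python sketch follows; the function name `cdf_estimate` and the counts are hypothetical and are not taken from the examples cited in these notes.

```python
from itertools import accumulate

def cdf_estimate(d, n):
    """Return F_hat(t_i) = (d_1 + ... + d_i) / n for each interval endpoint t_i."""
    if n <= 0 or sum(d) > n:
        raise ValueError("need n > 0 and total failures <= n")
    return [c / n for c in accumulate(d)]

# Hypothetical data: n = 20 units, failures counted in four inspection intervals.
d = [2, 0, 3, 1]
F_hat = cdf_estimate(d, n=20)
print(F_hat)  # the second value repeats the first because interval 2 has no failures
```

The estimate is reported only at the interval endpoints, which matches the statement that $\hat{F}(t)$ is undefined inside an interval that contains failures.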

Example 3.1 & 3.2 (p. 47): use the Plant1 data in Example 1.5

Example : use the data in Example 1.2

Compute a nonparametric estimate for $F(t_i)$.

3.4 Confidence interval from complete or singly censored data


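1. Pointwise binomial-based confidence interval for $F(t_i)$
A conservative $100(1-\alpha)\%$ confidence interval $[\underset{\sim}{F}(t_i), \tilde{F}(t_i)]$ for $F(t_i)$ is given by

$$\underset{\sim}{F}(t_i) = \left\{ 1 + \frac{(n - n\hat{F} + 1)\, \mathcal{F}_{(1-\alpha/2;\; 2n - 2n\hat{F} + 2,\; 2n\hat{F})}}{n\hat{F}} \right\}^{-1},$$

$$\tilde{F}(t_i) = \left\{ 1 + \frac{n - n\hat{F}}{(n\hat{F} + 1)\, \mathcal{F}_{(1-\alpha/2;\; 2n\hat{F} + 2,\; 2n - 2n\hat{F})}} \right\}^{-1},$$

where $\hat{F} = \hat{F}(t_i)$ and $\mathcal{F}_{(p;\, \nu_1, \nu_2)}$ is the $p$ quantile of the F distribution with $(\nu_1, \nu_2)$ degrees of freedom.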
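The F-quantile expressions above are one way of writing the exact binomial (Clopper-Pearson) limits. The sketch below does not evaluate F quantiles; as a swapped-in technique it solves the equivalent binomial tail equations by bisection so that only the Python standard library is needed. The names `binom_cdf`, `_solve_p`, and `binomial_ci` are hypothetical, and the results should agree with the formulas above up to numerical tolerance.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def _solve_p(k, n, target, iters=200):
    """Find p in [0, 1] with P(X <= k | p) = target (the cdf decreases in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def binomial_ci(x, n, alpha=0.05):
    """Conservative 100(1 - alpha)% limits for F(t_i), where x = n * F_hat(t_i)
    is the number of failures up to t_i."""
    # Lower limit: P(X >= x | p) = alpha / 2, i.e. P(X <= x - 1 | p) = 1 - alpha / 2.
    lower = 0.0 if x == 0 else _solve_p(x - 1, n, 1 - alpha / 2)
    # Upper limit: P(X <= x | p) = alpha / 2.
    upper = 1.0 if x == n else _solve_p(x, n, alpha / 2)
    return lower, upper

# Hypothetical case: 3 failures out of n = 20 by time t_i.
print(binomial_ci(3, 20))
```

Unlike the normal-approximation interval, these limits always stay inside the interval 0 to 1.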

Reference: Brownlee, K. A. (1960), Statistical Theory and Methodology in Science and Engineering.

Example 3.3 (p.50) & Figure 3.2: use the Plant1 data in Example 1.5

2. Pointwise normal-approximation confidence interval for $F(t_i)$
For a specified value of $t_i$, a simple approximate $100(1-\alpha)\%$ confidence interval for $F(t_i)$ is

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \hat{F}(t_i) \pm z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}},$$

where $z_{(p)}$ is the $p$ quantile of the standard normal distribution and

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\frac{\hat{F}(t_i)\,[1 - \hat{F}(t_i)]}{n}}$$

is an estimate of the standard error of $\hat{F}(t_i)$.

If $n\hat{F}(t_i)$ is at least 5 to 10 and no more than $n-5$ or $n-10$, the distribution of

$$Z_{\hat{F}} = \frac{\hat{F}(t_i) - F(t_i)}{\widehat{\mathrm{se}}_{\hat{F}}}$$

can be approximated adequately by a standard normal distribution.
Otherwise the approximation will be crude, and it is even possible to get confidence limits that are outside the interval 0 to 1.
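A short Python sketch of the normal-approximation interval, using only the standard library. The function name `normal_ci` and the numbers are hypothetical; the second call illustrates how a lower limit can fall below 0 when $n\hat{F}(t_i)$ is small.

```python
from math import sqrt
from statistics import NormalDist

def normal_ci(F_hat, n, alpha=0.05):
    """Approximate 100(1 - alpha)% interval: F_hat +/- z * se,
    with se = sqrt(F_hat * (1 - F_hat) / n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # standard normal quantile z_(1 - alpha/2)
    se = sqrt(F_hat * (1 - F_hat) / n)
    return F_hat - z * se, F_hat + z * se

# Hypothetical values: n * F_hat = 50, comfortably inside the 5-to-(n - 5) rule.
print(normal_ci(0.5, 100))
# Hypothetical values: n * F_hat = 1, so the approximation is crude
# and the lower limit is negative.
print(normal_ci(0.02, 50))
```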

Example 3.4 (p.51) & table 3.1: use the Plant1 data in Example 1.5
The 95% confidence interval based on normal approximation

Example 3.5 &Figure 3.3 (use the data in ex1.2 table 1.2)


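3. The logit transformation, $\operatorname{logit}(p) = \ln[p/(1-p)]$ (in Section 3.6)

The approximate confidence interval for $F(t_i)$ based on the large-sample normal approximation, $[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \hat{F}(t_i) \pm z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}}$, may not provide an adequate approximation when the sample size is not large.

A better approximate confidence interval might be obtained by using the logit transformation,

$$\operatorname{logit}(p) = \ln\left[\frac{p}{1-p}\right].$$

It is based on the assumption that the distribution of

$$Z_{\operatorname{logit}(\hat{F})} = \frac{\operatorname{logit}[\hat{F}(t_i)] - \operatorname{logit}[F(t_i)]}{\widehat{\mathrm{se}}_{\operatorname{logit}(\hat{F})}}$$

can be approximated adequately by a standard normal distribution.
The two-sided approximate $100(1-\alpha)\%$ confidence interval for $F(t_i)$ is

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))\, w},\; \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))/w} \right],$$

where

$$w = \exp\left\{ \frac{z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}}}{\hat{F}(t_i)\,(1 - \hat{F}(t_i))} \right\}.$$

This interval is usually a better approximation, and its limits always lie between 0 and 1.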
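The logit-based limits can be sketched in Python as follows. The function name `logit_ci` and the input values are hypothetical; the standard error passed in is the binomial one, $\sqrt{\hat{F}(1-\hat{F})/n}$, which is an assumption that fits complete or singly censored data.

```python
from math import exp, sqrt
from statistics import NormalDist

def logit_ci(F_hat, se, alpha=0.05):
    """Approximate 100(1 - alpha)% interval for F(t_i) based on the logit transformation.
    se is the estimated standard error of F_hat."""
    if not 0 < F_hat < 1:
        raise ValueError("the logit is undefined for F_hat equal to 0 or 1")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    w = exp(z * se / (F_hat * (1 - F_hat)))
    lower = F_hat / (F_hat + (1 - F_hat) * w)
    upper = F_hat / (F_hat + (1 - F_hat) / w)
    return lower, upper

# Hypothetical values: F_hat = 0.02 from n = 50 units. The plain normal
# approximation gives a negative lower limit here; the logit limits stay in (0, 1).
F_hat, n = 0.02, 50
print(logit_ci(F_hat, sqrt(F_hat * (1 - F_hat) / n)))
```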

Example : use the Plant1 data in Example 1.5
(a) Compute and plot a nonparametric estimate for $F(t_i)$.
(b) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the normal approximation and add these to the plot in part (a).
(c) Compute a set of pointwise approximate 95% confidence intervals for $F(t_i)$ based on the logit transformation and normal approximation.

3.5 Estimation from multiply censored data
This section shows how to compute a nonparametric estimate of a cdf from data with multiple right censoring.

The censoring structure is as follows: if a unit does not fail in interval $i$, it is either censored at the end of interval $i$ or it continues into interval $i + 1$. Information is available on the status of the units at the end of each interval. The intervals for different units do not overlap.

The maximum likelihood estimator of $p_i$ (the conditional probability of failing in interval $i$, given that a unit was still operating at the beginning of interval $i$) is the sample proportion failing,

$$\hat{p}_i = \frac{d_i}{n_i}, \quad i = 1, \ldots, m.$$

Thus, the MLEs of $S(t_i)$ and $F(t_i)$ are

$$\hat{S}(t_i) = \prod_{j=1}^{i} (1 - \hat{p}_j) \quad \text{and} \quad \hat{F}(t_i) = 1 - \prod_{j=1}^{i} (1 - \hat{p}_j), \quad i = 1, \ldots, m.$$

Here $d_i$ denotes the number of units that died or failed in the $i$th interval $(t_{i-1}, t_i]$.

$r_i$ denotes the number of units that survive the interval and are right-censored at $t_i$.

The units that are alive at the beginning of the interval are called the "risk set" for the interval (i.e., those at risk of failure), and the size of the risk set at the beginning of interval $i$ is

$$n_i = n - \sum_{j=0}^{i-1} d_j - \sum_{j=0}^{i-1} r_j, \quad i = 1, \ldots, m,$$

where $m$ is the number of intervals and $d_0 = 0$, $r_0 = 0$.
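The risk-set recursion and the product formula can be sketched in a few lines of Python. The function name `multiply_censored_estimate` and the counts are hypothetical and do not come from the pooled plant data used in the examples.

```python
def multiply_censored_estimate(d, r, n):
    """Return (n_i, p_hat_i, S_hat(t_i), F_hat(t_i)) as four lists.

    d[i], r[i]: failures in, and units censored at the end of, interval i + 1.
    n: initial sample size.
    """
    at_risk, p_hat, S_hat, F_hat = [], [], [], []
    n_i, S = n, 1.0
    for d_i, r_i in zip(d, r):
        if n_i <= 0:
            raise ValueError("risk set is empty; p_i cannot be estimated")
        p_i = d_i / n_i          # sample proportion failing among units at risk
        S *= 1 - p_i             # running product of (1 - p_j)
        at_risk.append(n_i)
        p_hat.append(p_i)
        S_hat.append(S)
        F_hat.append(1 - S)
        n_i -= d_i + r_i         # failures and censored units leave the risk set
    return at_risk, p_hat, S_hat, F_hat

# Hypothetical data: n = 10 units, three intervals.
at_risk, p_hat, S_hat, F_hat = multiply_censored_estimate(d=[2, 1, 0], r=[1, 0, 2], n=10)
print(at_risk)  # [10, 7, 6]
print(F_hat)
```

With every `r[i]` equal to 0, the product telescopes and the result is the cumulative number of failures divided by $n$, as in Section 3.2.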

Note that the nonparametric estimator $\hat{F}(t_i)$ is defined at all $t_i$ values (endpoints of all observation intervals).

If interval $i$ is known to have zero failures, then $\hat{F}(t) = \hat{F}(t_{i-1})$ for $t_{i-1} \le t \le t_i$.

If interval $i$ is known to contain one or more failures, $\hat{F}$ increases from $\hat{F}(t_{i-1})$ to $\hat{F}(t_i)$ in the interval $(t_{i-1}, t_i]$, but $\hat{F}(t)$ is not defined over the interior of the interval.

When there are no censored observations before the last failure, $\hat{F}(t_i)$ is equivalent to the estimate from singly censored interval data (Section 3.2).

Example 3.6 (p. 53) & Figure 3.4 & Table 3.2 (use the data in ex1.5)
Pool data from the three different plants in operation time.

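Confidence intervals
(1) The pointwise normal-approximation method

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \hat{F}(t_i) - z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}},\; \hat{F}(t_i) + z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}} \right]$$

(2) The logit transformation method

$$[\underset{\sim}{F}(t_i), \tilde{F}(t_i)] = \left[ \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))\, w},\; \frac{\hat{F}(t_i)}{\hat{F}(t_i) + (1 - \hat{F}(t_i))/w} \right],$$

where

$$w = \exp\left\{ \frac{z_{(1-\alpha/2)}\, \widehat{\mathrm{se}}_{\hat{F}}}{\hat{F}(t_i)\,(1 - \hat{F}(t_i))} \right\}.$$

Since the estimate of $F(t)$ has changed, the standard error of $\hat{F}(t_i)$ should be modified to

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\widehat{\mathrm{Var}}[\hat{F}(t_i)]}, \qquad \widehat{\mathrm{Var}}[\hat{F}(t_i)] = [\hat{S}(t_i)]^2 \sum_{j=1}^{i} \frac{\hat{p}_j}{n_j (1 - \hat{p}_j)}.$$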
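A minimal Python sketch of this standard error, accumulated alongside the estimate itself. The function name `se_multiply_censored` and the data are hypothetical, and the sketch assumes every $\hat{p}_j < 1$ so that each term of the sum is finite.

```python
from math import sqrt

def se_multiply_censored(d, r, n):
    """Return (F_hat, se) lists, with
    se(t_i) = S_hat(t_i) * sqrt(sum_{j<=i} p_j / (n_j * (1 - p_j)))."""
    F_hat, se = [], []
    n_j, S, total = n, 1.0, 0.0
    for d_j, r_j in zip(d, r):
        if n_j <= 0 or d_j >= n_j:
            # Assumption of this sketch: the risk set is non-empty and p_j < 1.
            raise ValueError("need a non-empty risk set and p_j < 1")
        p_j = d_j / n_j
        S *= 1 - p_j
        total += p_j / (n_j * (1 - p_j))
        F_hat.append(1 - S)
        se.append(S * sqrt(total))
        n_j -= d_j + r_j
    return F_hat, se

# Hypothetical data with censoring at the end of the first interval.
F_hat, se = se_multiply_censored(d=[2, 1, 1], r=[1, 0, 2], n=10)
print(list(zip(F_hat, se)))
```

As a check on the formula, with no censoring the squared standard error reduces to the binomial form $\hat{F}(1-\hat{F})/n$: for `d=[2, 3]`, `r=[0, 0]`, `n=10` the second value is $0.5 \times 0.5 / 10 = 0.025$.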
j=1

Example 3.6 & 3.7(p. 53, 56), table 3.2, 3.3 & figure 3.4, 3.5 :
Use and pool data from the three different plants in operation time in
ex1.5
(a) Compute and plot a nonparametric estimate for F(t i )
(b) Compute a set of pointwise approximate 95% confidence intervals for
F(t i ) based on normal approximation and add these to the plot in
part (a).
(c) Compute a set of pointwise approximate 95% confidence intervals
for F(t i ) based on logit transformation and normal approximation.

3.7 Estimation from multiply censored data with exact failure data
If the exact failure times arise from a continuous inspection process or, perhaps, from a very large number of closely spaced inspections, the estimate can be treated as a limiting case of the interval-based nonparametric estimator. It is generally known as the Kaplan-Meier estimator.

In the limit, as the number of inspections increases and the width of the inspection intervals approaches zero, failures are concentrated in a relatively small number of intervals. Most intervals will not contain any failures. $\hat{F}(t)$ is constant over all intervals that have no failures.

With small intervals, $\hat{F}$ becomes a step function with gaps over the intervals where there were failures and with jumps at the upper endpoints of these intervals.

In the limit, as the width of the intervals approaches 0, the size of the gaps approaches zero and the step function increases at the reported failure times.
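The limiting estimator can be sketched directly from exact times. In the Python sketch below the function name `kaplan_meier` and the data are hypothetical; it also assumes the common convention that a unit censored at time $t$ is still counted as at risk for failures reported at that same $t$.

```python
def kaplan_meier(times, failed):
    """Return [(t, F_hat(t)), ...] at each distinct failure time.

    times: observed times; failed[k] is True for a failure and
    False for a right-censored observation.
    """
    data = sorted(zip(times, failed))
    n_at_risk = len(data)
    S = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j, deaths = i, 0
        while j < len(data) and data[j][0] == t:   # group observations tied at t
            deaths += bool(data[j][1])
            j += 1
        if deaths:
            S *= 1 - deaths / n_at_risk            # the step function jumps only at failure times
            steps.append((t, 1 - S))
        n_at_risk -= j - i                         # failures and censored units leave the risk set
        i = j
    return steps

# Hypothetical data: five units, two of them right-censored.
print(kaplan_meier([3, 5, 7, 7, 9], [True, False, True, True, False]))
```

Censored observations never produce a jump; they only shrink the risk set used at later failure times.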

Confidence intervals
The same two methods can be used to obtain confidence intervals for $F(t)$:
(1) the pointwise normal-approximation method
(2) the logit transformation method

Example 3.8 & 3.9 (p. 59), Table C.2 & 3.4, Figure 3.6:

Vehicle shock absorber failure data.

The 95% confidence interval is constructed by the logit transformation method.

3.9 Uncertain censoring time
When censoring times are known only to be within specified intervals, the risk set decreases over each interval in a manner that cannot be specified precisely.

Two extreme methods of handling the censored observations in the intervals are:
1. Assume all censored observations are removed at $t_i$, the upper endpoint of the interval. This gives $\hat{p}_i = d_i / n_i$, as used for multiply censored data (Section 3.5).
2. Assume all censored observations are removed at $t_{i-1}$, the lower endpoint of the interval. This gives $\hat{p}_i = d_i / (n_i - r_i)$.

Note that

$$n_i = n - \sum_{j=0}^{i-1} d_j - \sum_{j=0}^{i-1} r_j, \quad i = 1, \ldots, m.$$

However, both methods are biased. A commonly used compromise is

$$\hat{p}_i = \frac{d_i}{n_i - r_i/2}.$$

These estimates are then substituted into the nonparametric estimate

$$\hat{F}(t_i) = 1 - \prod_{j=1}^{i} (1 - \hat{p}_j),$$

and the corresponding standard error

$$\widehat{\mathrm{se}}_{\hat{F}} = \sqrt{\widehat{\mathrm{Var}}[\hat{F}(t_i)]}, \qquad \widehat{\mathrm{Var}}[\hat{F}(t_i)] = [\hat{S}(t_i)]^2 \sum_{j=1}^{i} \frac{\hat{p}_j}{n_j (1 - \hat{p}_j)}.$$
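The three choices of $\hat{p}_i$ differ only in how much of $r_i$ is subtracted from the risk set. A small Python sketch that compares them; the function name `uncertain_censoring_estimate`, the `method` labels, and the counts are hypothetical.

```python
def uncertain_censoring_estimate(d, r, n, method="half"):
    """Return F_hat(t_i) when censoring times are known only to the interval.

    method = "end":   censored units removed at t_i      -> p_i = d_i / n_i
    method = "start": censored units removed at t_{i-1}  -> p_i = d_i / (n_i - r_i)
    method = "half":  compromise                         -> p_i = d_i / (n_i - r_i / 2)
    """
    fraction = {"end": 0.0, "half": 0.5, "start": 1.0}[method]
    F_hat = []
    n_i, S = n, 1.0
    for d_i, r_i in zip(d, r):
        denom = n_i - fraction * r_i
        if denom <= 0:
            raise ValueError("effective risk set is empty")
        S *= 1 - d_i / denom
        F_hat.append(1 - S)
        n_i -= d_i + r_i          # risk set for the next interval, as in the note above
    return F_hat

# Hypothetical data: n = 10 units, two intervals, two units censored in the first.
for method in ("end", "half", "start"):
    print(method, uncertain_censoring_estimate([2, 1], [2, 0], 10, method))
```

In this illustration the compromise estimate falls between the two extremes in every interval, which is the motivation for using it.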

Example 3.12 (p.64, ex2.12 appendix C.6), table 3.6 :
