Stat 567: Statistical Reliability
Sixth Class (September 17, 1997)
Data Analysis Methods:
Right Censoring
Product-Limit Estimator

Class Objectives:
- Learn to estimate the survivor function of right-censored life data.
- Understand the Product-Limit (Kaplan-Meier) estimator for right-censored
data.
Homework Assignments:
- Let A1, A2, A3, and A4 be nested events, with A1 being the smallest and A4
being the largest. Show that
P(A1) = P(A1|A2) P(A2|A3) P(A3|A4) P(A4).
- Show that in Example 2.2 on Page 43 of your textbook the Product-Limit
estimator is the same as the empirical survivor function.
- Analyze the rat diet data supplied in class. Write a professional-quality
report. Attach a cover letter directed at management. The report
itself should be directed at the scientists or engineers who are neither
management nor statisticians but are technically oriented. The report should
have:
- Summary and Conclusions
- Data Analysis
- Appendix with the raw data and any other material that would interrupt the
smooth flow of the Data Analysis section of the report.
(Due October 1, 1997)
Go to the bottom of this page to download the data for this homework.
Hints:

- Are the three groups (low fat, saturated fat, and unsaturated fat) significantly
different? What test are you using? If two different tests lead to different
significance levels, can you explain the reason?
- Avoid vague language such as "they appear to be different."
Instead, use language such as: "there is statistical evidence (alpha = 0.05)
to indicate that the three groups are different."
- Are the mortality (hazard) rates constant, decreasing, or increasing
for each group? Why can you use JMP's exponential plot to find this out?
Why is this information useful?
- What distribution fits each group best? Why? Why is this information
useful?
- Avoid language implying causation such as: "Saturated fat causes
tumors." Instead, use language such as: "In this study the x
group had a significantly lower mortality rate than the y group."
- Use JMP's help menu to learn about the properties of the statistical
procedures you are using.
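For the hint on constant, decreasing, or increasing hazard rates, the relations below may help explain why an exponential plot is informative. This is a sketch in standard notation rather than the textbook's: h is the hazard function, H the cumulative hazard, S the survivor function, and \lambda a constant rate (all symbols introduced here for illustration). The exact axes that JMP uses are not assumed.

```latex
\begin{align*}
S(t) &= \exp\{-H(t)\}, \qquad H(t) = \int_0^t h(u)\,du \\
h(t) &= \lambda \ \text{(constant)} \;\Longrightarrow\; -\log S(t) = H(t) = \lambda t
\end{align*}
```

Under a constant hazard, a plot of -log of the estimated survivor function against t should therefore be roughly a straight line through the origin. Upward (convex) curvature suggests an increasing hazard, and downward (concave) curvature suggests a decreasing one.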
Class Outline and Main Points:
- A simple probability formula for nested sets
- How the P-L estimator works: Calculation of the P-L estimator from
first principles.
- General P-L estimator formula
- Standard error of the P-L estimator at a particular time point: Greenwood's
formula
- Confidence intervals for the probability of surviving past a particular
time point
- Estimates and confidence intervals for the cumulative hazard function
based on the P-L estimator
- Improved plotting point for the P-L estimator
- Q-Q plots based on the P-L estimator
- Weibull
- lognormal
- Exponential
- Example: Strength of weathered braided cord
- There is an extension of the P-L estimator due to Turnbull that can
also handle left-censored and interval-censored data. (See References below.)
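The first few outline items (the nested-set product, the P-L calculation from first principles, and Greenwood's formula) can be sketched in Python. This is a minimal illustration, not the textbook's or JMP's implementation: the function names `product_limit` and `pointwise_interval` are hypothetical, the data are made up, censored observations tied with a failure are assumed to be at risk at that failure time, and the interval uses a plain normal approximation clipped to [0, 1], which is only one of several common choices.

```python
import math


def product_limit(times, events):
    """Product-Limit (Kaplan-Meier) estimate with Greenwood standard errors.

    times  : observed times (failure or right-censoring times)
    events : 1 if the time is an observed failure, 0 if right-censored
    Returns one tuple (t, n_at_risk, deaths, s_hat, se) per distinct
    failure time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s_hat = 1.0
    greenwood_sum = 0.0  # running sum of d_j / (n_j * (n_j - d_j))
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every observation tied at time t (failures and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Conditional probability of surviving past t given at risk at t.
            s_hat *= (n_at_risk - deaths) / n_at_risk
            if deaths < n_at_risk:
                greenwood_sum += deaths / (n_at_risk * (n_at_risk - deaths))
                se = s_hat * math.sqrt(greenwood_sum)
            else:
                # Everyone at risk failed: Greenwood's sum is undefined here.
                se = float("nan")
            rows.append((t, n_at_risk, deaths, s_hat, se))
        n_at_risk -= leaving
    return rows


def pointwise_interval(s_hat, se, z=1.96):
    """Normal-approximation interval for S(t), clipped to [0, 1]."""
    return max(0.0, s_hat - z * se), min(1.0, s_hat + z * se)


# Made-up illustrative data (not from the textbook or the homework).
toy_times = [2, 3, 3, 5, 7, 8]
toy_events = [1, 1, 0, 1, 0, 1]
for row in product_limit(toy_times, toy_events):
    print(row)
```

Each factor `(n_at_risk - deaths) / n_at_risk` estimates the conditional probability of surviving past one failure time given survival up to it, so the running product mirrors the nested-events formula. With no censoring, the estimate reduces to the empirical survivor function.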
Remarks:
- Formula (2.14) on Page 39 of your
textbook for the standard error applies when the estimate of the survivor
function is based on the empirical survivor function. Recall that the
book defines the empirical survivor function only when there is no censoring
or when there is Type I or Type II right censoring.
If you have arbitrary right censoring, then you use the Product-Limit estimator
to estimate the survivor function. In this case the standard
error of the estimator is given by Greenwood's formula (Formula (2.21) on
Page 45 of your textbook), not by Formula (2.14).
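To make the contrast in this remark concrete, the two variance estimates are written below in standard notation. This is a sketch that does not reproduce the textbook's numbering or symbols, and it assumes Formula (2.14) is the usual binomial form. Here n is the sample size, t_j are the distinct failure times, d_j is the number of failures at t_j, and n_j is the number at risk just before t_j; all of these symbols are introduced here for illustration.

```latex
\begin{align*}
\text{Empirical survivor function:}\quad
  \widehat{\mathrm{Var}}\bigl[\hat S(t)\bigr]
  &= \frac{\hat S(t)\,\bigl[1 - \hat S(t)\bigr]}{n} \\
\text{Greenwood (P-L estimator):}\quad
  \widehat{\mathrm{Var}}\bigl[\hat S(t)\bigr]
  &= \hat S(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j\,(n_j - d_j)}
\end{align*}
```

With no censoring, n_{j+1} = n_j - d_j, so each term equals 1/n_{j+1} - 1/n_j and the sum telescopes to [1 - \hat S(t)] / [n \hat S(t)]. Greenwood's formula then reduces to the binomial form.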
Study Questions:
- Why is it important to have a non-parametric estimate of the survivor
function?
- What is the relationship between the empirical survivor function and
the P-L estimator of the survivor function?
- Why is right censoring the most common type of censoring?
- What is the difference between individual and simultaneous confidence
intervals?
- What is the difference between Type I and Type II censoring? Why are
these two types of right censoring given so much attention?
- Why do we need parametric models for reliability data? Why not simply
work with the empirical survivor function or the P-L estimator of the survivor
function?
References:
- Statistical Methods for Survival Data Analysis. (1980) Lee, Elisa T.
Lifetime Learning Publications, Belmont, California.
(The rat data come from this book.)
- Nonparametric Estimation of a Survivorship Function with Doubly
Censored Data. (1974) Turnbull, B. W. Journal of the American Statistical
Association, 69, 169-173.
- The Empirical Distribution Function with Arbitrary Grouped, Censored,
and Truncated Data. (1976) Turnbull, B. W. Journal of the Royal
Statistical Society B, 38, 618-26.
SAS JMP files (Mac) of classroom examples and homework:
- Data on the strengths of 48 pieces of weathered braided cord. (Example
2.3, Page 46 of your textbook)


(Instructions for importing a text file into JMP)