What’s this about?

Three types of censoring!

In survival analysis, the time to the event of interest is not always observed exactly. It can be right-censored, left-censored, or interval-censored.

 

A medical study might involve follow-up visits with patients who had breast cancer. Patients are tested for recurrence at regular visits. If cancer is found at a visit, the exact time of recurrence is unknown; we know only that it occurred before that visit. If the cancer recurs before the first visit, the time is left-censored. If it recurs between two visits, the time is interval-censored. If there is no recurrence by the last visit, the time is right-censored.

 

The same situation arises in many other settings, such as unemployment duration in economic data, time of weaning in demographic data, or time to obesity in epidemiological data.

 

The new stintreg command fits parametric survival models and accounts for all three types of censoring. It can analyze current status data, in which the event of interest is known only to have occurred before or after a single observed time. And it can analyze data that mix left-censored, right-censored, interval-censored, and uncensored observations.
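
Current status data arise when each subject is examined only once. As a rough sketch of how such data could be laid out for stintreg, the toy example below builds two interval variables from a single examination time. The variables examtime and event and all values are hypothetical; the missing-value coding assumes the convention documented for stintreg, in which a missing lower endpoint marks a left-censored observation and a missing upper endpoint marks a right-censored one.

```stata
* Hypothetical current status data: one exam per subject
clear
input examtime event
 4 1
 7 0
10 1
12 0
end

* Event already occurred at the exam: left-censored, (0, examtime]
* Event not yet occurred at the exam: right-censored, [examtime, +inf)
generate ltime = cond(event == 1, ., examtime)
generate rtime = cond(event == 1, examtime, .)
list examtime event ltime rtime
```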

Let’s see it work

We want to investigate how the disease stage of AIDS patients affects their time to resistance to the drug zidovudine. The exact times are not available, but each is known to lie within a time interval whose lower and upper endpoints are recorded in the variables ltime and rtime. Some patients developed drug resistance in the first 6 months of therapy. Those times are left-censored. Others developed resistance between 1 and 12 months. Those times are interval-censored. Yet others showed no resistance, even at 16 months. Those times are right-censored.
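
To make the coding concrete, here is a minimal sketch of how each kind of observation could be stored in a pair of interval variables. The values and the censtype variable are hypothetical and are not taken from the zidovudine data; the missing-value convention is the one documented for stintreg's interval() option.

```stata
* Hypothetical toy data, one row per kind of observation
clear
input ltime rtime
 .  6
 6 12
16  .
 9  9
end

* Classify each row from its endpoints
generate str18 censtype = "interval-censored"
replace censtype = "left-censored"  if missing(ltime) | ltime == 0
replace censtype = "right-censored" if missing(rtime)
replace censtype = "uncensored"     if ltime == rtime & !missing(ltime)
list ltime rtime censtype
```

The !missing(ltime) guard matters because Stata treats two missing values as equal, so a row with both endpoints missing would otherwise be labeled uncensored.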

 

We can use stintreg to fit a Weibull model to these data.

 

. stintreg i.stage, interval(ltime rtime) distribution(weibull)



Weibull PH regression                           Number of obs     =         31
                                                   Uncensored     =          0
                                                   Left-censored  =         15
                                                   Right-censored =         13
                                                   Interval-cens. =          3

                                                LR chi2(1)        =      10.02
Log likelihood =  -13.27946                     Prob > chi2       =     0.0016


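------------------------------------------------------------------------------
             | Haz. Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       stage |
       late  |   6.757496   4.462932     2.89   0.004     1.851897    24.65783
       _cons |   .0003517   .0010552    -2.65   0.008     9.82e-07    .1259497
-------------+----------------------------------------------------------------
       /ln_p |   1.036663   .3978289     2.61   0.009     .2569325    1.816393
-------------+----------------------------------------------------------------
           p |   2.819791   1.121795                      1.292958    6.149638
         1/p |   .3546362   .1410845                      .1626112    .7734204
------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.
Note: _cons estimates baseline hazard.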

 

We find that the hazard of developing resistance to zidovudine for late-stage patients is approximately seven times (hazard ratio 6.76) the hazard for early-stage patients.

 

We can plot the hazard functions for those two stages.

 

. stcurve, hazard at1(stage = 0) at2(stage = 1)


[Graph: Weibull hazard functions for stage = 0 and stage = 1]

After about two months, the hazard of developing resistance to zidovudine increases much more rapidly for late-stage patients than for early-stage patients.
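
The hazards can also be computed by hand from the estimates in the first output, assuming the usual Weibull proportional-hazards parameterization h(t) = p × exp(xb) × t^(p−1), in which the exponentiated _cons is the baseline (early-stage) scale. The do-file sketch below plugs in the reported point estimates; the local macro names and the chosen months are illustrative only.

```stata
* Values copied from the first stintreg output above
local p       = 2.819791    // shape parameter p
local lambda0 = .0003517    // exponentiated _cons (baseline, early stage)
local hr      = 6.757496    // hazard ratio, late vs. early stage

* Weibull PH hazard: h(t) = p * lambda * t^(p-1)
foreach t of numlist 2 6 12 16 {
    display "month " %2.0f `t' ///
        "   early: " %8.5f `p'*`lambda0'*`t'^(`p'-1) ///
        "   late: "  %8.5f `hr'*`p'*`lambda0'*`t'^(`p'-1)
}
```

Because the model is a proportional-hazards model, the late-stage hazard is the early-stage hazard multiplied by the same factor at every time; the absolute gap widens because p is greater than 1.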

 

Suppose we believe that the hazard depends on the dosage level, low or high. We use the strata() option to stratify on dose.

 

. stintreg i.stage, interval(ltime rtime) distribution(weibull) strata(dose)



Weibull PH regression                           Number of obs     =         31
                                                   Uncensored     =          0
                                                   Left-censored  =         15
                                                   Right-censored =         13
                                                   Interval-cens. =          3


                                                LR chi2(2)        =      12.40
Log likelihood = -11.115197                     Prob > chi2       =     0.0020


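------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ltime        |
       stage |
       late  |   2.711532   1.084146     2.50   0.012     .5866456    4.836419
             |
        dose |
       high  |  -2.661872   5.883967    -0.45   0.651    -14.19424    8.870492
       _cons |  -9.143003   4.930789    -1.85   0.064    -18.80717    .5211664
-------------+----------------------------------------------------------------
ln_p         |
        dose |
       high  |    .453894    .670098     0.68   0.498    -.8594739    1.767262
       _cons |   1.051935   .6190537     1.70   0.089    -.1613879    2.265258
------------------------------------------------------------------------------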

 

In the stratified model, the stage coefficient is the same across the two dosage levels, but the intercept and the shape parameter (more precisely, the log-shape parameter ln_p) differ between low and high doses. Neither dose term is statistically significant (p = 0.651 in the main equation and p = 0.498 in the ln_p equation), so stratifying on dose does not seem to improve our model.
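
One way to put a number on this is a likelihood-ratio test of the unstratified model against the stratified one, which adds two parameters (dose in the main equation and dose in the ln_p equation). The sketch below uses only the two log likelihoods reported in the outputs; the macro names are illustrative. In a live session, the same comparison could presumably be made with estimates store and lrtest, provided both models are fit to the same 31 observations.

```stata
* Log likelihoods copied from the two outputs above
local ll_base   = -13.27946     // stintreg i.stage, ... distribution(weibull)
local ll_strata = -11.115197    // same model with strata(dose)

* LR statistic and its p-value on 2 degrees of freedom
local lr = 2*((`ll_strata') - (`ll_base'))
display "LR chi2(2)  = " %6.3f `lr'
display "Prob > chi2 = " %6.4f chi2tail(2, `lr')
```

The statistic is about 4.33, with a p-value of roughly 0.11, which agrees with the Wald tests on the two dose terms.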

 

We can consider a simpler model in which only the shape parameter is dose specific. We do this by specifying the ancillary() option.

 

. stintreg i.stage, interval(ltime rtime) distribution(weibull) ancillary(i.dose)



Weibull PH regression                           Number of obs     =         31
                                                   Uncensored     =          0
                                                   Left-censored  =         15
                                                   Right-censored =         13
                                                   Interval-cens. =          3


                                                LR chi2(1)        =      12.20
Log likelihood = -11.214877                     Prob > chi2       =     0.0005


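------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ltime        |
       stage |
       late  |   2.795073   1.167501     2.39   0.017     .5068139    5.083332
       _cons |   -10.8462   4.233065    -2.56   0.010    -19.14286   -2.549547
-------------+----------------------------------------------------------------
ln_p         |
        dose |
       high  |   .1655302   .0874501     1.89   0.058    -.0058689    .3369292
       _cons |   1.252361   .4143257     3.02   0.003     .4402972    2.064424
------------------------------------------------------------------------------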

 

The dose coefficient in the ln_p equation is not significant at the 5% level (p = 0.058), so there is no statistical evidence, at least at that level, that dosage affects the shape parameter of the Weibull model.
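
Because the ancillary equation models ln_p rather than p itself, the dose-specific shape parameters are obtained by exponentiating the linear predictor. A minimal sketch using the point estimates reported in the output, with illustrative macro names:

```stata
* ln_p equation from the ancillary(i.dose) fit above
local b_cons = 1.252361     // _cons: log shape for low dose
local b_high = .1655302     // shift in log shape for high dose

display "p (low dose)  = " %6.3f exp(`b_cons')
display "p (high dose) = " %6.3f exp(`b_cons' + `b_high')
```

The implied shape parameters are about 3.5 for low dose and 4.1 for high dose. These are point estimates only; the confidence interval for the dose coefficient includes zero.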

Tell me more

Learn more about Stata’s survival analysis features.

Read more about Parametric models for interval-censored survival-time data in the Stata Survival Analysis Reference Manual.