survSplit {survival} | R Documentation |

Given a survival data set and a set of specified cut times, split each record into multiple subrecords at each cut time. The new data set will be in ‘counting process’ format, with a start time, stop time, and event status for each record.

survSplit(formula, data, subset, na.action=na.pass, cut, start="tstart", id, zero=0, episode, end="tstop", event="event")

`formula` |
a model formula |

`data` |
a data frame |

`subset, na.action` |
rows of the data to be retained |

`cut` |
the vector of timepoints to cut at |

`start` |
character string with the name of a start time variable (will be created if needed) |

`id` |
character string with the name of new id variable to create (optional). This can be useful if the data set does not already contain an identifier. |

`zero` |
If `start` does not already exist as a variable, it is created with this constant value. |

`episode` |
character string with the name of new episode variable (optional) |

`end` |
character string with the name of event time variable |

`event` |
character string with the name of censoring indicator |
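As a hedged illustration of the optional `id` and `episode` arguments, the sketch below splits the `veteran` data set (shipped with the survival package) and requests both new variables. The names `subject` and `timegroup` are arbitrary choices made for this example, not defaults of the routine.

```r
library(survival)

# Split every subject's follow-up at 60 and 120 days.
# "subject" and "timegroup" are illustrative names chosen for this sketch.
vet_split <- survSplit(Surv(time, status) ~ ., data = veteran,
                       cut = c(60, 120),
                       id = "subject", episode = "timegroup")

# One id per original row; several output rows can share the same id.
head(vet_split[, c("subject", "tstart", "time", "status", "timegroup")])
```

Because `start` is left at its default, the start time appears in a new column named `tstart`.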

Each interval in the original data is cut at the given points; if an original row were (15, 60] with a cut vector of (10, 30, 40), the resulting data set would have intervals of (15, 30], (30, 40] and (40, 60]. The cut at 10 lies outside the original interval and so produces no split.

Each row in the final data set will lie completely within one of the cut intervals. The `episode` variable shows which interval each output row belongs to, where 1 = less than the first cutpoint, 2 = between the first and the second, etc. For the example above the values would be 2, 3, and 4.
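The (15, 60] example can be reproduced with a small sketch. The one-row data frame and its column names (`t1`, `t2`, `status`, `age`) are made up for illustration; only `survSplit` and `Surv` come from the package.

```r
library(survival)

# Hypothetical single-row data set: follow-up over (15, 60], event at 60.
# The column names t1, t2, status and age are invented for this sketch.
d <- data.frame(t1 = 15, t2 = 60, status = 1, age = 50)

pieces <- survSplit(Surv(t1, t2, status) ~ ., data = d,
                    cut = c(10, 30, 40), episode = "episode")

# Show the resulting intervals, event indicator, and episode number.
pieces[, c("t1", "t2", "status", "episode")]
```

Only the last piece carries the event; the earlier pieces are censored at their cut time.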

The routine is called with a formula as the first argument. The right hand side of the formula can be used to delimit variables that should be retained; normally one will use ` ~ .` as a shorthand to retain them all. The routine will try to retain variable names, e.g. `Surv(adam, joe, fred)~.` will result in a data set with those same variable names for the `start`, `end`, and `event` options rather than the defaults. Any user specified values for these options will be used if they are present, of course. However, the routine is not sophisticated; it only does this substitution for simple names. A call of `Surv(time, stat==2)` for instance will not retain "stat" as the name of the event variable.
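A brief sketch of the naming behavior, using the `veteran` data from the package. The object names `simple` and `complex` are arbitrary, and the expression `status == 1` is used here only to trigger the fallback to the default event name.

```r
library(survival)

# Simple names in Surv(): "time" and "status" are reused in the output,
# and the start time gets the default name "tstart".
simple <- survSplit(Surv(time, status) ~ ., data = veteran,
                    cut = c(60, 120))
names(simple)

# An expression for the event: the name cannot be inferred, so the
# default event name "event" is used instead.
complex <- survSplit(Surv(time, status == 1) ~ ., data = veteran,
                     cut = c(60, 120))
names(complex)
```

Supplying `event = "status"` explicitly in the second call would be one way to control the output name.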

Rows of data with a missing time or status are copied across unchanged, unless the `na.action` argument is changed from its default value of `na.pass`. But in the latter case any row that is missing for any variable will be removed, which is rarely what is desired.
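The effect of `na.action` can be sketched with a tiny, made-up data frame (the columns `time`, `status`, and `x` are hypothetical). The first call keeps the default; the second switches to `na.omit`.

```r
library(survival)

# Hypothetical data: the second row has a missing time.
d <- data.frame(time = c(100, NA), status = c(1, 1), x = c(1, 2))

# Default na.action = na.pass: the incomplete row is carried through as is.
kept <- survSplit(Surv(time, status) ~ ., data = d, cut = 50)

# na.omit: any row with a missing value in any variable is dropped first.
dropped <- survSplit(Surv(time, status) ~ ., data = d, cut = 50,
                     na.action = na.omit)

nrow(kept)     # split pieces of row 1 plus the unchanged incomplete row
nrow(dropped)  # split pieces of row 1 only
```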

New, longer, data frame.

    fit1 <- coxph(Surv(time, status) ~ karno + age + trt, veteran)
    plot(cox.zph(fit1)[1])
    # a cox.zph plot of the data suggests that the effect of Karnofsky score
    #  begins to diminish by 60 days and has faded away by 120 days.
    # Fit a model with separate coefficients for the three intervals.
    #
    vet2 <- survSplit(Surv(time, status) ~., veteran,
                      cut=c(60, 120), episode ="timegroup")
    fit2 <- coxph(Surv(tstart, time, status) ~ karno* strata(timegroup) +
                    age + trt, data= vet2)
    c(overall= coef(fit1)[1],
      t0_60  = coef(fit2)[1],
      t60_120= sum(coef(fit2)[c(1,4)]),
      t120   = sum(coef(fit2)[c(1,5)]))

[Package *survival* version 2.44-1.1 Index]