

   CCoommppuuttee aa ssuurrvviivvaall CCuurrvvee ffoorr CCeennssoorreedd DDaattaa

        survfit( object, data=sys.parent(), weights, subset, na.action,
                  newdata, individual=F, conf.int=.95, se.fit=T,
               type=c("kaplan-meier","flemington-harrington", "fh2"),
                  error=c("greenwood","tsiatis"),
               conf.type=c("log","log-log","plain","none"),
               conf.lower=c("usual", "peto", "modified")

   AArrgguummeennttss::

     object: A formula object or a coxph object.  If a formula
             object is supplied it must have a Surv object as
             the response on the left of the ~ operator and, if
             desired, terms separated by + operators on the
             right.  One of the terms may be a strata object.
             For a single survival curve the "~ 1" part of the
             formula is not required.

       data: a data.frame in which to interpret the variables
             named in the formula, or in the subset and the
             weights argument.

    weights: The weights must be nonnegative and it is strongly
             recommended that they be strictly positive, since
             zero weights are ambiguous, compared to use of the
             subset argument.

     subset: expression saying that only a subset of the rows
             of the data should be used in the fit.

   na.action: a missing-data filter function, applied to the
             model.frame, after any subset argument has been
             used.  Default is options()$na.action.

    newdata: a data.frame with the same variable names as those
             that appear in the coxph formula.  The curve(s)
             produced will be representative of a cohort who's
             covariates correspond to the values in newdata.
             Default is the mean of the covariates used in the
             coxph fit.

   individual: a logical value indicating whether the data
             frame represents different time epochs for only
             one individual (T), or whether multiple rows indi-
             cate multiple individuals (F, the default).  If
             the former only one curve will be produced; if the
             latter there will be one curve per row in `new-
             data'.

   conf.int: The level for a two-sided confidence interval on
             the survival curve(s).  Default is 0.95.

     se.fit: a logical value indicating whether standard errors
             should be computed.  Default is true.

       type: either "kaplan-meier" , "fleming-harrington" or
             "fh2",  (only the   first  two characters  are
             necessary).   The  default  is "fleming-harring-
             ton" when a coxph object is given,  and  it  is
             "kaplan-meier" otherwise.

      error: either the string "greenwood" for the Greenwood
             formula or  "tsiatis"  for  the  Tsiatis  formula,
             (only the first character  is  necessary).   The
             default  is  "tsiatis"  when a coxph object is
             given, and it is "greenwood" otherwise.

   conf.type: One of "none", "plain", "log" (the default), or
             "log-log".  Only enough of the string to uniquely
             identify it is necessary.  The first option causes
             confidence intervals not to be generated.  The
             second causes the standard intervals "curve +- k
             *se(curve)", where k is determined from conf.int.
             The log option calculates intervals based on the
             cumulative hazard or log(survival). The last
             option bases intervals on the log hazard or
             log(-log(survival)).  These last will never extend
             past 0 or 1.

   conf.lower: controls modified lower limits to the curve, the
             upper limit remains unchanged.  The modified lower
             limit is based on an ``effective n'' argument.
             The confidence bands will agree with the usual
             calculation at each death time, but unlike the
             usual bands the confidence interval becomes wider
             at each censored observation.  The extra width is
             obtained by multiplying the usual variance by a
             factor m/n, where n is the number currently at
             risk and m is the number at risk at the last death
             time.  (The bands thus agree with the un-modified
             bands at each death time.)  This is especially
             useful for survival curves with a long flat tail.

             The Peto lower limit is based on the same effec-
             tive n argument as the modified limit, but also
             replaces the usual Greenwood variance term with a
             simple approximation.  It is known to be conserva-
             tive.

   DDeessccrriippttiioonn::

        Actually, the estimates used are the Kalbfleisch-Pren-
        tice (Kalbfleisch and Prentice, 1980, p.86) and the
        Tsiatis/Link/Breslow, which reduce to the Kaplan-Meier
        and Fleming-Harrington estimates, respectively, when
        the weights are unity.  When curves are fit for a Cox
        model, subject weights of exp(sum(coef*(x-center))) are
        used, ignoring any value for wt input by the user.
        There is also an extra term in the variance of the
        curve, due to the variance of coef and hence variance
        in the computed weights.

        The Greenwood formula for the variance is a sum of
        terms d/(n*(n-m)), where d is the number of deaths at a
        given time point, n is the sum of wt for all individu-
        als still at risk at that time, and m is the sum of
        weights for the deaths at that time.  The justification
        is based on a binomial argument when weights are all
        equal to one; extension to the weighted case is ad hoc.
        Tsiatis (1981) proposes a sum of terms d/(n*n), based
        on a counting process argument which includes the
        weighted case.

        The two variants of the F-H estimate have to do with
        how ties are handled.  If there were 3 deaths out of 10
        at risk, then the first would increment the hazard by
        3/10 and the second by 1/10 + 1/9 + 1/8.  For curves
        created after a Cox model these correspond to the Bres-
        low and Efron estimates, respectively, and the proper
        choice is made automatically.  The fh2 method will give
        results closer to the Kaplan-Meier.

        Based on the work of Link (1984), the log transform is
        expected to produce the most accurate confidence inter-
        vals.  If there is heavy censoring, then based on the
        work of Dorey and Korn (1987) the modified estimate
        will give a more reliable confidence band for the tails
        of the curve.

   VVaalluuee::

        a survfit object. Methods defined for survfit objects
        are print, plot, lines, and points.

   RReeffeerreenncceess::

        Terry Therneau, author of local function.

        Dorey, F.J. and Korn, E.L. (1987).  Effective sample
        sizes for confidence intervals for survival probabili-
        ties.  Statistics in Medicine 6, 679-87.

        Fleming, T. H. and Harrington, D.P. (1984).  Nonpara-
        metric estimation of the survival distribution in cen-
        sored data.  Comm. in Statistics 13, 2469-86.

        Kablfleisch, J. D. and Prentice, R. L. (1980).   The
        Statistical Analysis of Failure Time Data.  Wiley, New
        York.

        Link, C. L. (1984). Confidence intervals for the sur-
        vival function using Cox's proportional hazards model
        with covariates.  Biometrics 40, 601-610.

        Tsiatis, A. (1981). A large sample study of the esti-
        mate for the integrated hazard function in Cox's
        regression model for survival data. Annals of Statis-
        tics 9, 93-108.

   SSeeee AAllssoo::

        `print', `plot', `lines', `coxph', `Surv', `strata'.

   EExxaammpplleess::

        #fit a Kaplan-Meier and plot it
        fit <- coxph( Surv(admlfuhr, dead) ~ gcs.12, rochadm)
        plot(fit)

        #fit a cox proportional hazards model and plot the
        #predicted survival curve
        fit <- coxph( Surv(admlfuhr, dead) ~ gcs.12, rochadm)
        plot( survfit( fit))

