

   EEmmppiirriiccaall IInnfflluueennccee VVaalluueess

        empinf(boot.out=NULL, data=NULL, statistic=NULL,
               type=<<see below>>, stype="w", index=1, t=NULL,
               strata=rep(1, n), eps=0.001, ...)

   AArrgguummeennttss::

   boot.out=: A bootstrap object created by the function
             `boot'.  If `type' is `"reg"' then this argument
             is required.  For any of the other types it is
             optional.  If it is supplied then the values of
             `data', `statistic', `stype', and `strata' are
             taken from the components of `boot.out' and any
             values passed to `empinf' directly are ignored.

      data=: A vector, matrix or data frame containing the data
             for which empirical influence values are required.
             It is a required argument if `boot.out' is not
             supplied.  If `boot.out' is supplied then `data'
             is set to `boot.out$data' and any value supplied
             is ignored.

   statistic=: The statistic for which empirical influence val-
             ues are required.  It must be a function of at
             least two arguments, the data set and a vector of
             weights, frequencies or indices.  The nature of
             the second argument is given by the value of
             `stype'.  Any other arguments that it takes must
             be supplied to `empinf' and will be passed to
             `statistic' unchanged.  This is a required argu-
             ment if `boot.out' is not supplied, otherwise its
             value is taken from `boot.out' and any value sup-
             plied here will be ignored.

      type=: The calculation type to be used for the empirical
             influence values.  Possible values of `type' are
             `"inf"' (infinitesimal jackknife), `"jack"' (usual
             jackknife), `"pos"' (positive jackknife), and
             `"reg"' (regression estimation).  The default
             value depends on the other arguments.  If `t' is
             supplied then the default value of `type' is
             `"reg"' and `boot.out' should be present so that
             its frequency array can be found.  If `t' is not
             supplied then if `stype' is `"w"', the default
             value of `type' is `"inf"'; otherwise, if
             `boot.out' is present the default is `"reg"'.  If
             none of these conditions apply then the default is
             `"jack"'.  Note that it is an error for `type' to
             be `"reg"' if `boot.out' is missing or to be
             `"inf"' if `stype' is not `"w"'.

     stype=: A character variable giving the nature of the sec-
             ond argument to `statistic'.  It can take on three
             values: `"w"' (weights), `"f"' (frequencies), or
             `"i"' (indices).  If `boot.out' is supplied the
             value of `stype' is set to `boot.out$stype' and
             any value supplied here is ignored.  Otherwise it
             is an optional argument which defaults to `"w"'.
             If `type' is `"inf"' then `stype' MUST be `"w"'.

     index=: An integer giving the position of the variable of
             interest in the output of `statistic'.

         t=: A vector of length `boot.out$R' which gives the
             bootstrap replicates of the statistic of interest.
             `t' is used only when `type' is `"reg"' and it
             defaults to `boot.out$t[,index]'.

    strata=: An integer vector or a factor specifying the
             strata for multi-sample problems.  If `boot.out'
             is supplied then the value of `strata' is set to
             `boot.out$strata'.  Otherwise it is an optional
             argument which has default corresponding to the
             single sample situation.

       eps=: This argument is used only if `type' is `"inf"'.
             In that case the value of epsilon to be used for
             numerical differentiation will be `eps' divided by
             the number of observations in `data'.

        ...: Any other arguments that `statistic' takes.  They
             will be passed unchanged to `statistic' every time
             that it is called.

   DDeessccrriippttiioonn::

        This function calculates the empirical influence values
        for a statistic applied to a data set.  It allows four
        types of calculation, namely the infinitesimal jack-
        knife (using numerical differentiation), the usual
        jackknife estimates, the "positive" jackknife estimates
        and a method which estimates the empirical influence
        values using regression of bootstrap replicates of the
        statistic.  All methods can be used with one or more
        samples.

   DDeettaaiillss::

        If `type' is `"inf"' then numerical differentiation is
        used to approximate the empirical influence values.
        This makes sense only for statistics which are written
        in weighted form (i.e. `stype' is `"w"').  If `type' is
        `"jack"' then the usual leave-one-out jackknife esti-
        mates of the empirical influence are returned.  If
        `type' is `"pos"' then the positive (include-one-twice)
        jackknife values are used.  If `type' is `"reg"' then a
        bootstrap object must be supplied.  The regression
        method then works by regressing the bootstrap repli-
        cates of `statistic' on the frequency array from which
        they were derived.  The bootstrap frequency array is
        obtained through a call to `boot.array'.  Further
        details of the methods are given in Section 2.7 of
        Davison and Hinkley (1997).

        Empirical influence values are used frequently in
        nonparametric bootstrap applications.  For this reason
        many other functions call `empinf' when such values are
        required.  Examples of their use include nonparametric
        delta estimates of variance, BCa intervals and linear
        approximations to statistics for use as control
        variates.  They are also used for antithetic bootstrap
        resampling.

   VVaalluuee::

        A vector of the empirical influence values of
        `statistic' applied to `data'.  The values will be in
        the same order as the observations in `data'.

   WWAARRNNIINNGG::

        All arguments to `empinf' must be passed using the
        `name=value' convention.  If this is not followed then
        unpredictable errors can occur.

   RReeffeerreenncceess::

        Davison, A.C. and Hinkley, D.V. (1997) Bootstrap
        Methods and Their Application. Cambridge University
        Press.

        Efron, B. (1982) The Jackknife, the Bootstrap and Other
        Resampling Plans.  CBMS-NSF Regional Conference Series
        in Applied Mathematics, 38, SIAM.

        Fernholz, L.T. (1983) von Mises Calculus for
        Statistical Functionals.  Lecture Notes in Statistics,
        19, Springer-Verlag.

   SSeeee AAllssoo::

        `boot', `boot.array', `boot.ci', `control',
        `jack.after.boot', `linear.approx', `var.linear'

   EExxaammpplleess::

        # The empirical influence values for the ratio of means in
        # the city data.
        data(city)
        ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
        empinf(data=city,statistic=ratio)
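
        # A minimal sketch (an illustration, not the package source) of
        # the numerical differentiation behind type="inf", assuming the
        # perturbation is eps/n as described for the eps argument above.
        # The names n, eps.n, w0, t0 and L.manual are hypothetical
        # helpers introduced for this illustration only.
        n <- nrow(city)
        eps.n <- 0.001/n                  # default eps divided by n
        w0 <- rep(1/n, n)                 # equal weights: observed data
        t0 <- ratio(city, w0)
        L.manual <- sapply(seq_len(n), function(j) {
             w <- (1 - eps.n)*w0          # shrink every weight slightly
             w[j] <- w[j] + eps.n         # put the freed mass on case j
             (ratio(city, w) - t0)/eps.n  # finite-difference derivative
        })
        # These values should be close to the result of the call above.
        cbind(L.manual, empinf(data=city,statistic=ratio))
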
        city.boot <- boot(city,ratio,499,stype="w")
        empinf(boot.out=city.boot,type="reg")
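
        # A short sketch comparing the other calculation types for the
        # same statistic, followed by the nonparametric delta estimate
        # of variance mentioned in the Details.  The object names
        # L.inf, L.jack and L.pos are hypothetical; `var.linear' is
        # assumed here to take the influence values as its first
        # argument.
        L.inf <- empinf(data=city, statistic=ratio, stype="w", type="inf")
        L.jack <- empinf(data=city, statistic=ratio, stype="w", type="jack")
        L.pos <- empinf(data=city, statistic=ratio, stype="w", type="pos")
        cbind(L.inf, L.jack, L.pos)
        # For a single sample the delta estimate is sum(L^2)/n^2.
        var.linear(L.inf)
        sum(L.inf^2)/length(L.inf)^2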

        # A statistic that may be of interest in the difference of means
        # problem is the t-statistic for testing equality of means.  In
        # the bootstrap we get replicates of the difference of means and
        # the variance of that statistic and then want to use this output
        # to get the empirical influence values of the t-statistic.
        data(gravity)
        grav1 <- gravity[as.numeric(gravity[,2])>=7,]
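        grav.fun <- function(dat, w)
        {
             # Stratum index (1 or 2) for each observation.
             strata <- tapply(dat[, 2], as.numeric(dat[, 2]))
             d <- dat[, 1]
             ns <- tabulate(strata)
             # Normalize the weights to sum to one within each stratum.
             w <- w/tapply(w, strata, sum)[strata]
             # Weighted first and second moments in each stratum.
             mns <- tapply(d * w, strata, sum)
             mn2 <- tapply(d * d * w, strata, sum)
             # Estimated variance of the difference of means.
             s2hat <- sum((mn2 - mns^2)/ns)
             c(mns[2] - mns[1], s2hat)
        }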

        grav.boot <- boot(grav1, grav.fun, R=499, stype="w", strata=grav1[,2])

        # Since the statistic of interest is a function of the bootstrap
        # statistics, we must calculate the bootstrap replicates and pass
        # them to empinf using the t argument.
        grav.z <- (grav.boot$t[,1]-grav.boot$t0[1])/sqrt(grav.boot$t[,2])
        empinf(boot.out=grav.boot,t=grav.z)

