mda                   package:mda                   R Documentation

_M_i_x_t_u_r_e _D_i_s_c_r_i_m_i_n_a_n_t _A_n_a_l_y_s_i_s

_U_s_a_g_e:

     mda(formula, data, subclasses, sub.df, tot.df, dimension, eps,
         iter, weights, method, keep.fitted, trace, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a formula of the form `y~x' describing the response and the
          predictors. The formula can be more complicated, such as
          `y~log(x)+z' (type `?formula' for more details). The
          response should be a factor or category representing the
          response variable, or any vector that can be coerced to such
          (such as a logical variable).

    data: data frame containing the variables in the formula
          (optional).

subclasses: Number of subclasses per class - default = 3. Can be a
          vector with a number for each class.

  sub.df: If subclass centroid shrinking is performed, the effective
          degrees of freedom of the centroids per class. Can be a
          scalar, in which case the same number is used for each
          class, or a vector with one value per class.

  tot.df: The total df for all the centroids can be specified rather
          than separately per class.

dimension: The dimension of the reduced model. If we know our final
          model will be confined to a discriminant subspace (of the
          subclass centroids), we can specify this in advance and have
          the EM algorithm operate in this subspace.

     eps: A numerical threshold for automatically truncating the
          dimension.

    iter: A limit on the total number of iterations - default is 5.

 weights: NOT observation weights! This is a special weight structure
          which, for each class, assigns to each observation in that
          class a weight (prior probability) of belonging to each of
          the subclasses. The default is provided by a call to
          `mda.start(x, g, subclasses, trace, ...)' (by this time x and
          g are known).  See the help for `mda.start()'. Arguments for
          `mda.start()' can be provided via the `...' argument to
          `mda', so the `weights' argument need never be set directly.
          A previously fit `mda' object can be supplied, in which case
          its final subclass `responsibility' weights are used for
          `weights'. This allows the iterations from a previous fit to
          be continued.

  method: regression method used in optimal scaling. Default is linear
          regression via the function `polyreg', resulting in the usual
          mixture model. Other possibilities are `mars' and `bruto'.
          For penalized mixture discriminant models `gen.ridge' is
          appropriate.

keep.fitted: a logical variable, which determines whether the
          (sometimes large) component `"fitted.values"' of the `"fit"'
          component of the returned `mda' object should be kept. The
          default is `TRUE' if `n * dimension < 1000'.

   trace: if `TRUE', iteration information is printed. Note that the
          deviance reported is for the posterior class likelihood, not
          the full likelihood. It is the full likelihood that drives
          the EM algorithm in `mda', but in general it is not
          available.

     ...: additional arguments to `mda.start()' and to `method()'.
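
     As a minimal sketch of how these arguments combine (assuming the
     `mda' package is installed; the object names `fit' and `refit'
     are illustrative only), a fit with a different number of
     subclasses per class can be continued by passing it back as
     `weights':

```r
library(mda)

# Fix the seed in case the starting weights involve random starts;
# this is only for repeatability of the sketch.
set.seed(1)

# Two, three and two subclasses for the three iris species,
# with at most 10 iterations.
fit <- mda(Species ~ ., data = iris, subclasses = c(2, 3, 2), iter = 10)

# Continue from the previous fit: its final subclass weights are
# used as the starting weights for up to 5 further iterations.
refit <- mda(Species ~ ., data = iris, subclasses = c(2, 3, 2),
             weights = fit, iter = 5)

class(refit)
```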

_V_a_l_u_e:

     An object of class `c("mda","fda")'. The most useful extractor is
     `predict', which can make many types of predictions from this
     object. It can also be plotted, and any functions useful for
     `"fda"' objects will work here too, such as `confusion' and
     `coef'.

     The object has the following components: 

percent.explained: the percent between-group variance explained by each
          dimension (relative to the total explained).

  values: optimal scaling regression sum-of-squares for each dimension
          (see reference).

   means: subclass means in the discriminant space. These are also
          scaled versions of the final theta's or class scores, and can
          be used in a subsequent call to `mda()' (this only makes
          sense if some columns of theta are omitted; see the
          references).

theta.mod: (internal) a class scoring matrix which allows predict to
          work properly.

dimension: dimension of discriminant space

sub.prior: subclass membership priors, computed in the fit. No effort
          is currently spent in trying to keep these above a threshold.

   prior: class proportions for the training data

     fit: fit object returned by `method'

    call: the call that created this object (allowing it to be
          `update()'-able)

confusion: confusion matrix when classifying the training data

 weights: These are the subclass membership probabilities for each
          member of the training set; see the weights argument.

assign.theta: a pointer list which identifies which elements of certain
          lists belong to individual classes.

deviance: The multinomial log-likelihood of the fit. Even though the
          full log-likelihood drives the iterations, we cannot in
          general compute it because of the flexibility of the
          `method()' used.  The deviance can increase with the
          iterations, but generally does not.
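
     The components listed above can be read directly from the fitted
     object. The following is a small sketch, assuming the `mda'
     package is installed; `fit' is an illustrative name, and the
     values printed will depend on the starting weights:

```r
library(mda)

set.seed(1)  # only for repeatability of the sketch
fit <- mda(Species ~ ., data = iris)

fit$dimension          # dimension of the discriminant space
fit$percent.explained  # percent between-group variance explained
fit$prior              # class proportions in the training data
fit$sub.prior          # subclass membership priors computed in the fit
fit$confusion          # confusion matrix on the training data
fit$deviance           # multinomial log-likelihood based deviance
```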


     The `method' functions are required to take arguments `x' and `y',
     both of which can be matrices, and should produce a matrix of
     `fitted.values' the same size as `y'. They can take an additional
     argument `weights' and should all have a `...' argument for
     safety's sake. Any arguments to `method()' can be passed on via
     the `...' argument of `mda()'. The default method `polyreg()' has
     a `degree' argument which allows polynomial regression of the
     required total degree.  See the documentation for `predict.fda()'
     for further requirements of `method'.
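
     To make the argument contract above concrete, here is a sketch of
     a function with that shape. The name `wls.method' is hypothetical
     and not part of the package; it uses weighted least squares from
     base R and illustrates only the `x', `y', `weights' and `...'
     arguments and the `fitted.values' component. It does not address
     the further requirements described under `predict.fda()', so it
     is not claimed to be a drop-in `method' for `mda()'.

```r
# Hypothetical regression function following the described contract:
# x and y may be matrices, weights is optional, and ... absorbs any
# extra arguments so that unexpected ones do not cause an error.
wls.method <- function(x, y, weights = rep(1, nrow(x)), ...) {
    x <- cbind(Intercept = 1, as.matrix(x))  # add an intercept column
    y <- as.matrix(y)
    fit <- lm.wfit(x, y, weights)            # weighted least squares (stats)
    list(fitted.values = as.matrix(fit$fitted.values),
         coefficients  = fit$coefficients)
}

# Usage on made-up responses: the fitted values have the same size as y.
set.seed(1)
x <- as.matrix(iris[, 1:4])
y <- cbind(score1 = rnorm(nrow(x)), score2 = rnorm(nrow(x)))
out <- wls.method(x, y)
dim(out$fitted.values)
```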

     The function `mda.start()' creates the starting weights; it takes
     additional arguments which can be passed in via the `...'
     argument to `mda'. See the documentation for `mda.start'.

_N_o_t_e:

     This software is not well tested; we would like to hear of any
     bugs.

_A_u_t_h_o_r(_s):

     Trevor Hastie and Robert Tibshirani

_R_e_f_e_r_e_n_c_e_s:

     ``Flexible Discriminant Analysis by Optimal Scoring'' by Hastie,
     Tibshirani and Buja, 1994, JASA, 1255-1270.

     ``Penalized Discriminant Analysis'' by Hastie, Buja and
     Tibshirani, Annals of Statistics, 1995 (in press).

     ``Discriminant Analysis by Gaussian Mixtures'' by Hastie and
     Tibshirani, 1994, JRSS-B (in press).

_S_e_e _A_l_s_o:

     `predict.mda', `mars', `bruto', `polyreg', `gen.ridge', `softmax',
     `confusion'

_E_x_a_m_p_l_e_s:

     data(iris)
     irisfit <- mda(Species ~ ., data = iris)
     irisfit
     ## Call:
     ## mda(formula = Species ~ ., data = iris)
     ##
     ## Dimension: 4
     ##
     ## Percent Between-Group Variance Explained:
     ##     v1     v2     v3     v4
     ##  96.02  98.55  99.90 100.00
     ##
     ## Degrees of Freedom (per dimension): 5
     ##
     ## Training Misclassification Error: 0.02 ( N = 150 )
     ##
     ## Deviance: 15.102

     data(glass)
     # random sample of size 100
     samp <- c(1, 3, 4, 11, 12, 13, 14, 16, 17, 18, 19, 20, 27, 28, 31, 38,
     42, 46, 47, 48, 49, 52, 53, 54, 55, 57, 62, 63, 64, 65, 67, 68,
     69, 70, 72, 73, 78, 79, 83, 84, 85, 87, 91, 92, 94, 99, 100,
     106, 107, 108, 111, 112, 113, 115, 118, 121, 123, 124, 125, 126,
     129, 131, 133, 136, 139, 142, 143, 145, 147, 152, 153, 156, 159,
     160, 161, 164, 165, 166, 168, 169, 171, 172, 173, 174, 175, 177,
     178, 181, 182, 185, 188, 189, 192, 195, 197, 203, 205, 211, 212, 214)
     glass.train <- glass[samp,]
     glass.test <- glass[-samp,]
     glass.mda <- mda(Type ~ ., data = glass.train)
     predict(glass.mda, glass.test, type="post") # abbreviations are allowed
     confusion(glass.mda, glass.test)

