

   DDiivviissiivvee AAnnaallyyssiiss

        diana(x, diss = FALSE, metric = "euclidean", stand = FALSE)

   AArrgguummeennttss::

          x: data matrix or dataframe, or dissimilarity matrix,
             depending on the value of the `diss' argument.

             In case of a matrix or dataframe, each row corre-
             sponds to an observation, and each column corre-
             sponds to a variable. All variables must be
             numeric.  Missing values (NAs) are allowed.

             In case of a dissimilarity matrix, `x' is typically
             the output of `daisy' or `dist'.  A vector of length
             n*(n-1)/2 is also allowed (where n is the number of
             observations), and will be interpreted in the same
             way as the output of the above-mentioned functions.
             Missing values (NAs) are not allowed.

       diss: logical flag: if TRUE, then `x' will be considered
             as a dissimilarity matrix. If FALSE, then `x' will
             be considered as a matrix of observations by vari-
             ables.

     metric: character string specifying the metric to be used
             for calculating dissimilarities between observa-
             tions.  The currently available options are
             "euclidean" and "manhattan".  Euclidean distances
             are root sum-of-squares of differences, and man-
             hattan distances are the sum of absolute differ-
             ences.  If `x' is already a dissimilarity matrix,
             then this argument will be ignored.

      stand: logical flag: if TRUE, then the measurements in
             `x' are standardized before calculating the dis-
             similarities. Measurements are standardized for
             each variable (column), by subtracting the vari-
             able's mean value and dividing by the variable's
             mean absolute deviation.  If `x' is already a dis-
             similarity matrix, then this argument will be
             ignored.
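
        The `metric' and `stand' arguments can be illustrated with
        a minimal base-R sketch.  The data matrix and the helper
        name `standardize' are made up for illustration; this is
        not the code that `diana' itself runs, only an assumed
        equivalent of the description above.

```r
# Small made-up data set: 4 observations (rows), 2 variables (columns).
x <- matrix(c(1, 2, 4, 9,
              10, 20, 20, 50),
            ncol = 2, dimnames = list(NULL, c("a", "b")))

# Hypothetical helper: standardize each column by subtracting its mean
# and dividing by its mean absolute deviation (not the standard deviation).
standardize <- function(m) {
  apply(m, 2, function(v) {
    centered <- v - mean(v, na.rm = TRUE)
    centered / mean(abs(centered), na.rm = TRUE)
  })
}
z <- standardize(x)

# Dissimilarities between rows under the two available metrics.
d_euc <- dist(z, method = "euclidean")  # root sum of squared differences
d_man <- dist(z, method = "manhattan")  # sum of absolute differences

# A dissimilarity object for n observations holds n*(n-1)/2 values.
n <- nrow(x)
stopifnot(length(d_euc) == n * (n - 1) / 2)
```

        Either `d_euc' or `d_man' could then be passed as `x' with
        `diss = TRUE', in which case `metric' and `stand' are
        ignored.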

   DDeessccrriippttiioonn::

        Returns a list representing a divisive hierarchical
        clustering of the dataset.

   DDeettaaiillss::

        `diana' is fully described in chapter 6 of Kaufman and
        Rousseeuw (1990).  It is probably unique in computing a
        divisive hierarchy, whereas most other software for
        hierarchical clustering is agglomerative.  Moreover,
        `diana' provides (a) the divisive coefficient (see
        `diana.object') which measures the amount of clustering
        structure found; and (b) the banner, a novel graphical
        display (see `plot.diana').

        The `diana' algorithm constructs a hierarchy of
        clusterings, starting with one large cluster containing all
        n observations. Clusters are divided until each cluster
        contains only a single observation.  At each stage, the
        cluster with the largest diameter is selected.  (The
        diameter of a cluster is the largest dissimilarity
        between any two of its observations.)  To divide the
        selected cluster, the algorithm first looks for its
        most disparate observation (i.e., the one with the
        largest average dissimilarity to the other observations
        of the selected cluster). This observation initiates the
        "splinter group". In subsequent steps, the algorithm
        reassigns observations that are closer to the "splinter
        group" than to the "old party". The result is a divi-
        sion of the selected cluster into two new clusters.
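
        A single division step, as described above, might be
        sketched in base R as follows.  The function name
        `split_cluster' and the toy data are hypothetical, and the
        stopping rule (stop when no remaining observation is, on
        average, closer to the splinter group) is an assumption
        based on the description; the actual `diana'
        implementation may differ in details such as tie handling.

```r
# Split one cluster into a "splinter group" and an "old party".
# `d` is a full symmetric dissimilarity matrix; `members` are the
# row indices of the cluster to divide (at least two observations).
split_cluster <- function(d, members) {
  avg_to <- function(i, group) mean(d[i, group])

  # Seed the splinter group with the observation that has the largest
  # average dissimilarity to the other members of the cluster.
  seed_score <- sapply(members, function(i) avg_to(i, setdiff(members, i)))
  splinter <- members[which.max(seed_score)]
  old_party <- setdiff(members, splinter)

  # Keep moving the observation that is, on average, most clearly
  # closer to the splinter group than to the rest of the old party.
  while (length(old_party) > 1) {
    gain <- sapply(old_party, function(i) {
      avg_to(i, setdiff(old_party, i)) - avg_to(i, splinter)
    })
    if (max(gain) <= 0) break   # nobody prefers the splinter group
    mover <- old_party[which.max(gain)]
    splinter <- c(splinter, mover)
    old_party <- setdiff(old_party, mover)
  }
  list(splinter = sort(splinter), old_party = sort(old_party))
}

# Made-up example: five points on a line forming two tight groups.
pts <- c(1, 2, 3, 10, 11)
d <- as.matrix(dist(pts))
split_cluster(d, seq_along(pts))
```

        In this toy example the observation at 11 seeds the
        splinter group, the observation at 10 joins it, and the
        remaining three stay in the old party.  The full algorithm
        would repeat such a step on whichever cluster currently
        has the largest diameter.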

   VVaalluuee::

        An object of class `"diana"' representing the
        clustering.  See `diana.object' for details.
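
        As a brief illustration, the returned object can be
        inspected like any list.  The component names used below
        (`dc', `order', `height', `merge') are the ones documented
        in `diana.object' for the cluster package for R; this
        sketch assumes that package is installed.

```r
library(cluster)

data(agriculture)
da <- diana(agriculture)

class(da)    # includes "diana"
da$dc        # divisive coefficient, a value between 0 and 1
da$order     # ordering of the observations used for plotting
da$height    # diameters of the clusters before each split
da$merge     # (n-1) x 2 matrix describing the hierarchy
```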

   BBAACCKKGGRROOUUNNDD::

        Cluster analysis divides a dataset into groups (clus-
        ters) of observations that are similar to each other.
        Hierarchical methods like `agnes', `diana', and `mona'
        construct a hierarchy of clusterings, with the number
        of clusters ranging from one to the number of observa-
        tions. Partitioning methods like `pam', `clara', and
        `fanny' require that the number of clusters be given by
        the user.
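
        The practical difference can be seen in a short sketch,
        assuming a current version of the cluster package for R:
        a hierarchy from `diana' can be cut at any number of
        clusters after the fact (here via `as.hclust' and
        `cutree'), whereas `pam' needs the number of clusters `k'
        in the call itself.

```r
library(cluster)
data(agriculture)

# Hierarchical: fit once, then choose the number of clusters afterwards.
h  <- diana(agriculture)
hc <- as.hclust(h)           # convert to the standard hclust representation
k2 <- cutree(hc, k = 2)      # membership vector for 2 clusters
k3 <- cutree(hc, k = 3)      # same hierarchy, cut into 3 clusters

# Partitioning: the number of clusters is an input to the fit.
p <- pam(agriculture, k = 2)
p$clustering                 # membership vector for the 2 clusters
```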

   RReeffeerreenncceess::

        Kaufman, L. and Rousseeuw, P.J. (1990).  Finding Groups
        in Data: An Introduction to Cluster Analysis.  Wiley,
        New York.

        Struyf, A., Hubert, M. and Rousseeuw, P.J. (1997).
        Integrating Robust Clustering Techniques in S-PLUS,
        Computational Statistics and Data Analysis, 26, 17-37.

   SSeeee AAllssoo::

        `agnes', `diana.object', `daisy', `dist', `plot.diana',
        `twins.object'.

   EExxaammpplleess::

        data(votes.repub)
        dv <- diana(votes.repub, metric = "manhattan", stand = TRUE)
        print(dv)
        plot(dv)

        data(agriculture)
        ## Plot similar to Figure 8 in ref
        plot(diana(agriculture), ask = TRUE)


