

   FFuuzzzzyy AAnnaallyyssiiss

        fanny(x, k, diss = F, metric = "euclidean", stand = F)

   AArrgguummeennttss::

          x: data matrix or dataframe, or dissimilarity matrix,
             depending on the value of the `diss' argument.

             In case of a matrix or dataframe, each row corre-
             sponds to an observation, and each column corre-
             sponds to a variable. All variables must be
             numeric.  Missing values (NAs) are allowed.

             In case of a dissimilarity matrix, `x' is typi-
             cally the output of `daisy' or `dist'. Also a vec-
             tor with length n*(n-1)/2 is allowed (where n is
             the number of observations), and will be inter-
             preted in the same way as the output of the above-
             mentioned functions. Missing values (NAs) are not
             allowed.

          k: integer, the number of clusters.  It is required
             that 0 < k < n/2 where n is the number of observa-
             tions.

       diss: logical flag: if TRUE, then `x' will be considered
             as a dissimilarity matrix. If FALSE, then `x' will
             be considered as a matrix of observations by vari-
             ables.

     metric: character string specifying the metric to be used
             for calculating dissimilarities between observa-
             tions.  The currently available options are
             "euclidean" and "manhattan".  Euclidean distances
             are root sum-of-squares of differences, and man-
             hattan distances are the sum of absolute differ-
             ences.  If `x' is already a dissimilarity matrix,
             then this argument will be ignored.

      stand: logical flag: if TRUE, then the measurements in
             `x' are standardized before calculating the dis-
             similarities. Measurements are standardized for
             each variable (column), by subtracting the vari-
             able's mean value and dividing by the variable's
             mean absolute deviation.  If `x' is already a dis-
             similarity matrix, then this argument will be
             ignored.
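
        The standardization and the two metrics described above can
        be sketched in base R as follows.  This is only an
        illustration, not the code used inside `fanny': the helper
        `standardize_mad' and the toy data are hypothetical, and
        "mean absolute deviation" is assumed to mean the average
        absolute deviation from the column mean.

```r
# Toy data: 4 observations (rows) on 2 numeric variables (columns).
x <- cbind(height = c(150, 160, 170, 180),
           weight = c(50, 65, 70, 95))

# Hypothetical helper: subtract the column mean, then divide by the
# mean absolute deviation (assumed: mean of |value - column mean|).
standardize_mad <- function(col) {
  centered <- col - mean(col)
  centered / mean(abs(centered))
}
z <- apply(x, 2, standardize_mad)

# Dissimilarities between observations under the two metrics.
d_euc <- dist(z, method = "euclidean")  # root sum-of-squares of differences
d_man <- dist(z, method = "manhattan")  # sum of absolute differences

# A dissimilarity object for n observations holds n*(n-1)/2 values.
n <- nrow(x)
length(d_euc) == n * (n - 1) / 2
```

        An object such as `d_euc' is the kind of input expected
        when `diss' is TRUE.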

   DDeessccrriippttiioonn::

        Returns a list representing a fuzzy clustering of the
        data into `k' clusters.

   DDeettaaiillss::

        In a fuzzy clustering, each observation is "spread out"
        over the various clusters. Denote by u(i,v) the member-
        ship of observation i to cluster v.  The memberships
        are nonnegative, and for a fixed observation i they sum
        to 1.  The particular method `fanny' stems from chapter
        4 of Kaufman and Rousseeuw (1990).  Compared to other
        fuzzy clustering methods, `fanny' has the following
        features: (a) it also accepts a dissimilarity matrix;
        (b) it is more robust to the `spherical cluster'
        assumption; (c) it provides a novel graphical display,
        the silhouette plot (see `plot.partition').
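
        The membership properties and feature (a) can be checked
        with a short sketch.  It assumes that `fanny' is provided
        by the `cluster' package and that the fitted object has a
        `membership' component as described in `fanny.object'; the
        simulated data and the seed are arbitrary.

```r
library(cluster)  # assumed to be the package that provides fanny()

set.seed(1)       # arbitrary seed, only for reproducibility
x <- rbind(cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
           cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5)))

# Fit from the raw data matrix (one row per observation).
fit <- fanny(x, 2)
memb <- fit$membership   # one row per observation, one column per cluster

# Memberships are nonnegative and each row sums to 1.
range(rowSums(memb))
head(round(memb, 3))

# Feature (a): the same call also accepts a dissimilarity object.
fit_d <- fanny(dist(x), 2, diss = TRUE)
dim(fit_d$membership)
```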

        Fanny aims to minimize the objective function

        SUM_v (SUM_(i,j) u(i,v)^2 u(j,v)^2 d(i,j)) / (2 SUM_j u(j,v)^2)

        where the outer sum runs over the clusters v = 1, ..., k,
        the inner sums run over the observations i, j = 1, ..., n,
        and d(i,j) is the dissimilarity between observations i
        and j.
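
        To make the formula concrete, the sketch below evaluates
        it for a given membership matrix.  It is not the
        optimization performed by `fanny': the function
        `fanny_objective' is a hypothetical helper, and the inner
        sum is assumed to run over all ordered pairs (i, j).

```r
# Hypothetical helper: evaluate the objective for a membership matrix u
# (n rows, k columns) and a dissimilarity object or matrix d.
fanny_objective <- function(u, d) {
  d <- as.matrix(d)          # full symmetric n x n matrix, zero diagonal
  u2 <- u^2                  # squared memberships
  per_cluster <- sapply(seq_len(ncol(u)), function(v) {
    # numerator: sum over all (i, j) of u(i,v)^2 u(j,v)^2 d(i,j)
    # denominator: 2 * sum over j of u(j,v)^2
    sum(outer(u2[, v], u2[, v]) * d) / (2 * sum(u2[, v]))
  })
  sum(per_cluster)
}

# Four points on a line forming two tight, well separated pairs.
x <- c(0, 1, 10, 11)
d <- dist(x)

u_crisp <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1))  # each pair is its own cluster
u_flat  <- matrix(0.5, nrow = 4, ncol = 2)      # every point split evenly

obj_crisp <- fanny_objective(u_crisp, d)
obj_flat  <- fanny_objective(u_flat, d)
obj_crisp < obj_flat   # the natural grouping has the lower objective
```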

   VVaalluuee::

        an object of class `"fanny"' representing the cluster-
        ing.  See `fanny.object' for details.

   BBAACCKKGGRROOUUNNDD::

        Cluster analysis divides a dataset into groups (clus-
        ters) of observations that are similar to each other.
        Partitioning methods like `pam', `clara', and `fanny'
        require that the number of clusters be given by the
        user.  Hierarchical methods like `agnes', `diana', and
        `mona' construct a hierarchy of clusterings, with the
        number of clusters ranging from one to the number of
        observations.

   RReeffeerreenncceess::

        Kaufman, L. and Rousseeuw, P.J. (1990).  Finding Groups
        in Data: An Introduction to Cluster Analysis.  Wiley,
        New York.

        Struyf, A., Hubert, M. and Rousseeuw, P.J. (1996).
        Clustering in an Object-Oriented Environment.  Journal
        of Statistical Software, 1.  <URL:
        http://www.stat.ucla.edu/journals/jss/>

        Struyf, A., Hubert, M. and Rousseeuw, P.J. (1997).
        Integrating Robust Clustering Techniques in S-PLUS,
        Computational Statistics and Data Analysis, 26, 17-37.

   SSeeee AAllssoo::

        `fanny.object', `daisy', `partition.object',
        `plot.partition', `dist'.

   EExxaammpplleess::

        ## generate 25 objects, divided into two clusters, and 3 objects lying
        ## between those clusters.
        x <- rbind(cbind(rnorm(10,0,0.5), rnorm(10,0,0.5)),
                   cbind(rnorm(15,5,0.5), rnorm(15,5,0.5)),
                   cbind(rnorm(3,3.5,0.5), rnorm(3,3.5,0.5)))
        fannyx <- fanny(x, 2)
        fannyx
        summary(fannyx)
        plot(fannyx)

        data(ruspini)
        ## Plot similar to Figure 6 in Struyf et al. (1996)
        plot(fanny(ruspini, 5))

