HICLUS provides analysis of (dis)similarity data by means of a hierarchical
(agglomerative) clustering scheme whose results are ordinally invariant, i.e.
they depend only on the rank order of the data values.
Data:      2-way, 1-mode non-negative dis/similarities
Transform: Monotonic
Model:     Ultra-metric distance

The method of hierarchical clustering implemented in HICLUS is often used as
an alternative, or as a supplement, to the basic model of MDS and takes data of
the same form.

The matrix of (dis)similarities between a set of objects is used to define a
set of non-overlapping clusters such that more similar objects are joined
together before less similar objects. The scheme consists of a series of
clusterings (levels), each of which is a partition of the objects. At the
initial level each object forms a cluster of its own, whilst at the highest
level all the objects form a single cluster. In a hierarchical clustering scheme
for p objects there are exactly (p-1) levels. Each intermediate level joins
points or clusters present at the lower (finer) level. The clustering scheme is
hierarchical in the sense that once two objects have been joined together at a
lower level of the scheme, they may not be separated at a higher level. The
clustering at each level therefore includes the ones below it.
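
The nesting property described above can be illustrated with a minimal Python
sketch. It is not part of HICLUS: the function name is_hierarchical and the
representation of a scheme as a list of partitions (finest first, each
partition a list of sets of labels) are assumptions made for illustration.

```python
def is_hierarchical(levels):
    """Return True if `levels` is a valid hierarchical clustering scheme.

    `levels` is a list of partitions ordered from finest to coarsest;
    each partition is a list of sets of object labels.
    (Hypothetical helper, for illustration only.)
    """
    objects = set().union(*levels[0])
    # Initial level: every object forms a cluster of its own.
    if any(len(cluster) != 1 for cluster in levels[0]):
        return False
    # Highest level: all objects form a single cluster.
    if len(levels[-1]) != 1 or levels[-1][0] != objects:
        return False
    for lower, higher in zip(levels, levels[1:]):
        # The higher level must partition the same set of objects.
        if sorted(x for c in higher for x in c) != sorted(objects):
            return False
        # Objects joined at the lower level may not be separated higher up:
        # each lower cluster must lie wholly inside one higher cluster.
        for cluster in lower:
            if not any(cluster <= bigger for bigger in higher):
                return False
    return True


levels = [
    [{"a"}, {"b"}, {"c"}, {"d"}],
    [{"a", "b"}, {"c"}, {"d"}],
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b", "c", "d"}],
]
# Same scheme, but the third level separates "a" from "b" again.
broken = [levels[0], levels[1], [{"a", "c"}, {"b", "d"}], levels[3]]

print(is_hierarchical(levels))  # valid scheme
print(is_hierarchical(broken))  # violates the nesting rule
```

In this example the four objects are joined in three steps, one per level above
the initial one.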

HICLUS expects data in the form of a lower triangle or a full symmetric matrix
of (dis)similarity measures between a set of objects (stimuli). Any type of
data suitable for input to MINISSA is also suitable for HICLUS. Note that data
values must be non-negative.

HICLUS implements Johnson's (1967) Hierarchical Clustering Schemes. If the data
conform exactly to the ultra-metric inequality, which defines a hierarchical
clustering scheme, then there is no ambiguity in defining the distance between
a cluster and a new point. However, most data do not conform perfectly, and the
problem then becomes one of defining that distance unambiguously: a number of
options (mean, median, mode) are possible, but each may produce a different
clustering. Johnson therefore proposes using the two extreme possibilities (the
minimum and the maximum) as solutions, thus alerting the user to the full range
of possible solutions.
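
The ultra-metric inequality is usually written d(x,z) <= max(d(x,y), d(y,z))
for every triple of objects x, y, z. The following Python sketch (not HICLUS
code; the function name is_ultrametric and the two small matrices are invented
for illustration) checks the inequality and shows why the distance from a
cluster to a new point is ambiguous only when it fails.

```python
from itertools import permutations


def is_ultrametric(d, tol=1e-12):
    """Check d[x][z] <= max(d[x][y], d[y][z]) for all distinct x, y, z.

    `d` is a full symmetric matrix (list of lists) of dissimilarities.
    (Hypothetical helper, for illustration only.)
    """
    n = len(d)
    return all(
        d[x][z] <= max(d[x][y], d[y][z]) + tol
        for x, y, z in permutations(range(n), 3)
    )


# Invented data: objects 0 and 1 are joined first (dissimilarity 1) in both.
ultra = [
    [0, 1, 3],
    [1, 0, 3],
    [3, 3, 0],
]
not_ultra = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]

for name, d in (("ultra", ultra), ("not_ultra", not_ultra)):
    # Candidate distances from the cluster {0, 1} to the new point 2.
    candidates = (d[0][2], d[1][2])
    print(name, is_ultrametric(d), min(candidates), max(candidates))
```

For the first matrix the minimum and the maximum coincide, so the choice of
method does not matter; for the second they differ, and the two methods may
produce different clusterings.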

The "minimum" method

Also known as the "connectedness" or "single-link" method, this approach
defines the dissimilarity between a point and a cluster as the smallest of the
dissimilarities between the external point and the constituent points in the
cluster. This method tends to join single points to existing clusters, and
schemes resulting from it are often not easily amenable to substantive
interpretation. The "level" value in this approach gives the length of the
longest chain joining any two points in the cluster. The approach is chosen by
specifying METHODS(1) in the PARAMETERS statement.

The "maximum" method

Also known as the "diameter" or "complete-link" method, this approach defines
the dissimilarity between a point and a cluster to be the largest of the
dissimilarities between it and the points constituting the cluster. In this
case the "level" gives the size of the diameter of the largest cluster at that
level. This method is chosen by specifying METHODS(2) in the PARAMETERS
statement.

The default option METHODS(3) allows for both methods to be used sequentially.
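
The two methods can be compared with a short Python sketch. This is an
illustration under stated assumptions, not the HICLUS implementation: the
function name hierarchical_scheme and the data are invented, the input is
taken to be a full symmetric matrix of dissimilarities, and ties are resolved
by taking the first smallest pair found, which need not match what HICLUS
does.

```python
from itertools import combinations


def hierarchical_scheme(d, method):
    """Agglomerative clustering of a full symmetric dissimilarity matrix.

    `method` is the built-in `min` (minimum / single-link method) or
    `max` (maximum / complete-link method).
    Returns one (level, members of the new cluster) pair per join.
    (Hypothetical helper, for illustration only.)
    """

    def between(a, b):
        # Smallest (min) or largest (max) dissimilarity between the
        # constituent points of clusters a and b.
        return method(d[i][j] for i in a for j in b)

    # Initial level: each object forms a cluster of its own.
    clusters = [frozenset([i]) for i in range(len(d))]
    history = []
    while len(clusters) > 1:
        # Join the two least dissimilar clusters (first pair wins on ties).
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        level = between(a, b)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append((level, sorted(a | b)))
    return history


# Invented dissimilarities between four objects, numbered 0 to 3.
d = [
    [0, 1, 4, 5],
    [1, 0, 2, 6],
    [4, 2, 0, 3],
    [5, 6, 3, 0],
]

print(hierarchical_scheme(d, min))
# [(1, [0, 1]), (2, [0, 1, 2]), (3, [0, 1, 2, 3])]
print(hierarchical_scheme(d, max))
# [(1, [0, 1]), (3, [2, 3]), (6, [0, 1, 2, 3])]
```

With these data the minimum method adds objects 2 and 3 one at a time to the
existing cluster, whereas the maximum method first forms the separate cluster
{2, 3}; its final level, 6, is the largest dissimilarity in the matrix, i.e.
the diameter of the whole set.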

HICLUS is not a dimensional method; the solutions are presented as a dendrogram
or hierarchical tree.

The method used to obtain a solution is not iterative, and no fit measure is
produced.

See also: Displaying dendrograms, The NewMDSX commands in full

INPUT COMMANDS

Keyword                          Function
N OF STIMULI [number]            Number of stimuli in the analysis.
LABELS [followed by a series     Optionally identify the stimuli.
  of labels (<= 65 chars),       There should be as many labels
  each on a separate line]       as there are stimuli.
READ MATRIX                      Start reading input data.

PARAMETERS

Keyword      Default   Function
DATA TYPE    0         0: Lower-triangle matrix of similarities
                          (high values mean high similarities
                          between points).
                       1: Lower-triangle matrix of dissimilarities
                          (high values mean high dissimilarities
                          between points).
                       2: Full-symmetric matrix of similarities
                          (high values mean high similarities
                          between points).
                       3: Full-symmetric matrix of dissimilarities
                          (high values mean high dissimilarities
                          between points).
METHODS      3         1: Only the minimum method is used.
                       2: Only the maximum method is used.
                       3: Both methods are used.

The following statements are not valid with HICLUS:
  N OF SUBJECTS
  DIMENSIONS
  ITERATIONS
  PLOT
  PUNCH

N OF STIMULI may be replaced with N OF POINTS.

The input should be specified as (non-negative) real numbers. With the
lower-triangle DATA TYPE options (0 and 1) it should be presented as a
lower-triangle matrix without diagonal.
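
The lower-triangle layout without diagonal can be illustrated with a small
Python sketch that expands it into a full symmetric matrix. It is not part of
NewMDSX: the function name lower_triangle_to_full and the sample values are
assumptions, and the zero placed on the diagonal is only a placeholder.

```python
def lower_triangle_to_full(rows):
    """Expand a lower triangle without diagonal into a full symmetric matrix.

    Row number i (counting from 1) holds the i values relating object i
    to objects 0 .. i-1, so p objects need p-1 rows.
    (Hypothetical helper, for illustration only.)
    """
    p = len(rows) + 1
    # Diagonal entries are not supplied; 0.0 is used as a placeholder.
    full = [[0.0] * p for _ in range(p)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError(f"row {i} should hold {i} values, got {len(row)}")
        for j, value in enumerate(row):
            if value < 0:
                raise ValueError("data values must be non-negative")
            # Fill both halves so the matrix is symmetric.
            full[i][j] = full[j][i] = float(value)
    return full


# Invented data for four objects: three rows of 1, 2 and 3 values.
rows = [
    [1],
    [4, 2],
    [5, 6, 3],
]
full = lower_triangle_to_full(rows)
for line in full:
    print(line)
```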

PRINT option

Option       Description
HISTORY      A detailed history of the clustering is produced.

PROGRAM LIMIT

Maximum no. of stimuli = 300