provides internal analysis by decomposing a 3- to 7-way data matrix of (dis)similarity or correlation measures into a set of dimensional weights (one set per way), whose scalar product reproduces the data (using a linear transformation of the data).
DATA:
N-way (‘rectangular’) matrix of dis/similarity or correlational measures
TRANSFORM: Linear
MODEL: N-way scalar-products
CANDECOMP takes as input a table of data values with between three and seven ways. In the solution, each of these ways is represented by a configuration of points representing the elements of that particular way in a space of chosen dimensionality. Each data value is regarded as being the scalar product between the relevant elements. The program assumes that the data are at the interval level of measurement.
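The scalar-products model can be illustrated with a minimal sketch. The function name `reconstruct` and the toy coordinates are illustrative assumptions and are not part of the program; the sketch only shows how each data value is modelled as the sum, over dimensions, of the product of the coordinates of the relevant element of each way.

```python
from itertools import product
from math import prod


def reconstruct(configs):
    """Rebuild an N-way table from one configuration per way.

    configs[w][i][r] is the coordinate of element i of way w on dimension r.
    Returns a dict mapping index tuples (i, j, k, ...) to model values.
    """
    n_dims = len(configs[0][0])
    sizes = [len(config) for config in configs]
    table = {}
    for idx in product(*(range(size) for size in sizes)):
        # N-way scalar product: multiply across ways, then sum over dimensions.
        table[idx] = sum(
            prod(configs[w][i][r] for w, i in enumerate(idx))
            for r in range(n_dims)
        )
    return table


# Toy three-way example in two dimensions (hypothetical numbers).
way1 = [[1.0, 0.0], [0.5, 2.0]]               # 2 elements
way2 = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]   # 3 elements
way3 = [[1.0, 0.5], [2.0, 1.0]]               # 2 elements
table = reconstruct([way1, way2, way3])
```

The same function accepts any number of ways, so a four- to seven-way table is handled by passing more configurations.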
There are two basic forms of data input to CANDECOMP, which we will refer to as
1. Canonical Decomposition analysis proper, and
2. "Extended INDSCAL" analysis
1. CANDECOMP proper
In the general CANDECOMP case all the ways are considered distinct, and up to seven ways are allowed for. For example, a four-way CANDECOMP might consist of a set of subjects (W1) who make preference ratings of a set of objects (W2) in different experimental conditions (W3) at different points in time (W4). The default parameter values produce this analysis.
2. The “extended INDSCAL” analysis
What we call the 'extended INDSCAL' refers to the case where the user wishes to extend a conventional INDSCAL analysis to include not only a third way (e.g. subjects) but yet further ways. The number of modes will always be one less than the number of ways in the data. For instance, the user may have a set of two-way one-mode dis/similarity matrices which exist both for individuals (third way) and for different points in time (fourth way). The solution will give a configuration (the Group Space) and a set of dimensional weights for each remaining way: in this case, one set for individuals and another for points in time.
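As a rough illustration of the four-way example, the sketch below computes one model value from a Group Space and two sets of weights. The function name, the argument order and the numbers are assumptions made for illustration; the point is that the same Group Space coordinates serve both stimulus ways, so the modelled matrix for any individual and time point is symmetric.

```python
def extended_indscal_value(group, subject_weights, time_weights, i, j, k, t):
    """Model value for stimuli i and j, individual k and time point t.

    group[i][r]           : Group Space coordinate of stimulus i on dimension r
    subject_weights[k][r] : weight of individual k on dimension r
    time_weights[t][r]    : weight of time point t on dimension r
    """
    n_dims = len(group[0])
    return sum(
        group[i][r] * group[j][r] * subject_weights[k][r] * time_weights[t][r]
        for r in range(n_dims)
    )


# Hypothetical two-dimensional example.
group = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 stimuli
subject_weights = [[2.0, 0.5], [1.0, 1.0]]     # 2 individuals
time_weights = [[1.0, 1.0], [0.5, 2.0]]        # 2 time points

value = extended_indscal_value(group, subject_weights, time_weights, 0, 2, 0, 1)
```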
For data of this type, SET MATRICES should be given the value 1 in the PARAMETERS command. The DATA TYPE parameter should also be given a suitable value. See below for a description of the use of the SIZES parameter.
3. Initial configuration
CANDECOMP is prone to suboptimal solutions; users are recommended to make a series of runs with different starting configurations. A series of similar solutions will usually indicate if a global minimum has been found. If the extended INDSCAL analysis is required (i.e. SET MATRICES(1)) then an initial configuration may also be input.
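The idea of making several runs from different starting configurations can be sketched as follows. This is an illustration only: the function `random_start`, the uniform range and the example sizes are assumptions, and the program's own pseudo-random generator is not reproduced. Each seed yields a different, but repeatable, set of starting coordinates.

```python
import random


def random_start(sizes, n_dims, seed):
    """Return one random starting configuration per way.

    sizes  : number of elements in each way
    n_dims : chosen dimensionality
    seed   : any positive integer; the same seed gives the same start
    The result is indexed as configs[way][element][dimension].
    """
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(n_dims)] for _ in range(size)]
        for size in sizes
    ]


# Three different starts for a hypothetical 10 x 8 x 4 table in 2 dimensions.
starts = {seed: random_start([10, 8, 4], 2, seed) for seed in (12345, 777, 2024)}
```

If the runs started from each of these configurations end in similar solutions, that agreement is the evidence referred to above.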
To perform an external INDSCAL analysis, SET MATRICES(1) is also required, an initial configuration is input, and the FIX POINTS parameter is set to 1. This can be a useful option when there are a very large number of subjects, and an INDSCAL Group Space from a representative sample is used as the fixed initial configuration. The program is then used to estimate weights of input batches of subjects, each referring to the same fixed Group Space.
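One way to see why a fixed Group Space makes batch processing practical is that, once the configuration is held constant, each subject's weights can be found by an ordinary least-squares fit. The sketch below is an assumption-laden illustration, not the program's own algorithm: the function names are hypothetical, the data are taken to be scalar products already, and the fit fails if the Group Space dimensions are linearly dependent.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def subject_weights(group, sp):
    """Least-squares dimension weights for one subject.

    group[i][r] : fixed Group Space coordinates
    sp[i][j]    : the subject's scalar products between stimuli i and j
    """
    n, dims = len(group), len(group[0])
    pairs = [(i, j) for i in range(n) for j in range(n)]

    def predictor(i, j, r):
        return group[i][r] * group[j][r]

    normal = [
        [sum(predictor(i, j, r) * predictor(i, j, s) for i, j in pairs)
         for s in range(dims)]
        for r in range(dims)
    ]
    rhs = [sum(predictor(i, j, r) * sp[i][j] for i, j in pairs) for r in range(dims)]
    return solve_linear(normal, rhs)


# Hypothetical check: data generated with weights 2.0 and 0.5 are recovered.
group = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
true_w = [2.0, 0.5]
sp = [[sum(true_w[r] * group[i][r] * group[j][r] for r in range(2))
       for j in range(3)] for i in range(3)]
weights = subject_weights(group, sp)
```

Because each subject is fitted independently of the others, subjects can be processed in separate batches against the same Group Space.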
4. DATA
Data are read by the READ MATRIX instruction, in free format, or under an associated INPUT FORMAT specification. The dimensions of the input matrix are supplied by the SIZES command, which is peculiar to CANDECOMP. This replaces N OF SUBJECTS and N OF STIMULI, which are not recognized by this program. SIZES requires as its operand up to seven numbers, separated by commas or spaces, each of which is the number of objects in one of the ways of the matrix. As many numbers must be specified as there are ways in the data.
The order in which the ways are entered in the SIZES command is critical:
5. PARAMETERS in CANDECOMP
Keyword         Default   Function
DATA TYPE       0         0: An N-way table is input.
                          1: Lower triangle similarity matrix (without diagonals).
                          2: Lower triangle dissimilarity matrix (without diagonals).
                          3: Lower triangle Euclidean distances (without diagonals).
                          4: Lower triangle correlation matrix (without diagonals).
                          5: Lower half covariance matrix (with diagonals).
                          6: Full symmetric similarity matrix (diagonals ignored).
                          7: Full symmetric dissimilarity matrix (diagonals ignored).
RANDOM          12345     (Any positive integer)
                          Seed for pseudo-random number generator.
SET MATRICES    0         0: The CANDECOMP analysis is performed.
                          1: The extended INDSCAL analysis is performed
                             (matrix 2 and matrix 3 are set equal).
FIX POINTS      0         0: Iterate and solve for all matrices.
                          1: One matrix is input, and held constant (external analysis).
CRITERION       0.005     (values between 0 and 1)
                          Sets improvement level for terminating iterations.
CENTRE          0         0: The matrices are not centred.
                          1: Each of the N ways will be centred by extraction
                             of the appropriate mean (only applicable if DATA TYPE(0)).
NOTES
1. The SIZES command is obligatory for CANDECOMP.
2. The commands N OF STIMULI, N OF SUBJECTS are not valid with CANDECOMP.
3. When DATA TYPE takes values 1 through 5 no diagonal is input.
For values 6 and 7 the diagonals are input but ignored.
4. In the parameters SET MATRICES and FIX POINTS the spaces are significant characters.
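The practical difference between the triangular and full input forms is the number of values read for each matrix. The sketch below simply counts them for a matrix of n objects; the function names are illustrative assumptions and are not program keywords.

```python
def lower_triangle_count(n, with_diagonal=False):
    """Number of values in the lower triangle of an n x n matrix."""
    if with_diagonal:
        return n * (n + 1) // 2
    return n * (n - 1) // 2


def full_matrix_count(n):
    """Number of values in a full n x n matrix (diagonal included)."""
    return n * n


# For 5 objects: 10 values without the diagonal, 15 with it, 25 in full.
counts = (
    lower_triangle_count(5),
    lower_triangle_count(5, with_diagonal=True),
    full_matrix_count(5),
)
```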
5. PRINT options
(N denotes the number of ways in the analysis (3 ≤ N ≤ 7), m the number of modes (2 ≤ m ≤ 7).)
Option          Form                     Description
INITIAL         n matrices are output.   The initial estimates of the configurations are output.
                                         Each matrix contains the coordinates of the points
                                         in the required dimensions. If the user has input an
                                         initial configuration, then the second two matrices
                                         will be identical.
FINAL           m matrices are output.   The solution configurations are output. Each matrix
                                         contains the coordinates of the relevant number of
                                         points on the axes of the space. These are followed
                                         by the correlations between each subject's data and
                                         solution. The matrix of cross-products between the
                                         dimensions is output.
CORRELATIONS                             Correlations between computed scores and original
                                         data for subjects.
HISTORY                                  The overall correlation at each iteration is output.
                                         The unnormalised matrices at convergence are also
                                         output (there will be n of these).
By default only the FINAL matrices and the overall correlation at convergence are output.
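The correlations reported here compare the original data values with the values computed from the solution. Assuming that an ordinary product-moment (Pearson) correlation is meant, which the text does not state explicitly, a minimal sketch is:

```python
from math import sqrt


def pearson(xs, ys):
    """Product-moment correlation between two equally long value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    norm = sqrt(sum(dx * dx for dx in dev_x) * sum(dy * dy for dy in dev_y))
    return cross / norm


# Hypothetical data and fitted values, listed in the same order.
data = [1.0, 2.0, 3.0, 4.0]
fitted = [1.1, 1.9, 3.2, 3.8]
r = pearson(data, fitted)
```

Applied to all data values at once this gives an overall correlation; applied to one subject's values at a time it gives a per-subject figure.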
6. PLOT options
Option          Description
INITIAL         The initial configuration may be plotted only if one has
                been input by the user.
CORRELATIONS    The overall correlation at each iteration is plotted in the
                form of a histogram.
WAY1, WAY2,     r(r-1)/2 plots are produced for each way specified.
WAY3, WAY4,
WAY5, WAY6,
WAY7
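The count r(r-1)/2 is the number of distinct pairs that can be formed from r items. Assuming that r stands for the number of dimensions in the solution (the text does not define it), the plots for one way correspond to the following pairs of axes:

```python
from itertools import combinations


def dimension_pairs(r):
    """All pairs of dimensions (numbered from 1) for an r-dimensional solution."""
    return list(combinations(range(1, r + 1), 2))


# A three-dimensional solution gives 3 plots per way: 1-2, 1-3 and 2-3.
pairs = dimension_pairs(3)
```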
7. PUNCH options (to secondary output file)
Option          Description
FINAL           The configuration of points for each way in the chosen
                dimensionality is output in a fixed format.
CORRELATIONS    The overall correlation at each iteration is output
                in a fixed format.
Note: by default, no secondary output file is produced.
8. PROGRAM LIMITS
Maximum no. of ways = 7
Maximum no. of dimensions = 10
Maximum no. of elements per way = 100
Way1 x Way2 x Way3 = 18000
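Before preparing a run it may help to check the intended SIZES values against these limits. The sketch below is a hypothetical helper, not part of the program. It assumes that the figure of 18000 is an upper bound on the product of the first three ways and that at least three ways are required; the limit on dimensions is not checked.

```python
from math import prod

MIN_WAYS, MAX_WAYS = 3, 7
MAX_ELEMENTS_PER_WAY = 100
MAX_FIRST_THREE_PRODUCT = 18000  # assumed to be an upper bound


def check_sizes(sizes):
    """Return a list of messages for every limit the SIZES values exceed."""
    problems = []
    if not MIN_WAYS <= len(sizes) <= MAX_WAYS:
        problems.append(
            f"{len(sizes)} ways given; between {MIN_WAYS} and {MAX_WAYS} required"
        )
    for way, size in enumerate(sizes, start=1):
        if not 1 <= size <= MAX_ELEMENTS_PER_WAY:
            problems.append(
                f"way {way} has {size} elements; maximum is {MAX_ELEMENTS_PER_WAY}"
            )
    if len(sizes) >= 3 and prod(sizes[:3]) > MAX_FIRST_THREE_PRODUCT:
        problems.append(
            f"Way1 x Way2 x Way3 = {prod(sizes[:3])}; "
            f"maximum is {MAX_FIRST_THREE_PRODUCT}"
        )
    return problems


# A hypothetical 20 x 30 x 10 table is within all the limits checked.
ok = check_sizes([20, 30, 10])
```

An empty list means that none of the checked limits is exceeded.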
See also