INDividual Differences SCALing : INDSCAL

INDSCAL provides internal metric analysis of a "stack" of dis/similarity (or correlation) matrices in terms of a weighted distance model, such that each "individual" (or data source) has a set of dimensional weights which systematically "distort" the Group Stimulus space to produce a "Private" space.

 

Data:  3-way, 2-mode dis/similarities (or correlations)           
Transform:  Linear          
Model:  Weighted Euclidean Distance (or Canonical Decomposition)

 

INDSCAL was originally developed to explain the relationship between subjects' differential cognition of a set of stimuli. Suppose that there are N subjects and p stimuli. The program takes as input a set of N matrices, each of which is a square symmetric matrix (of order p) of (dis)similarity judgments/measures between the p stimuli. The transformation which is assumed to take the data into the solution is a linear one.

The model explains differences between subjects' cognitions by a variant of the distance model. The stimuli are thought of as points positioned in a 'Group' or 'master' Stimulus space. This space is perceived differentially by the subjects, in that each of them affords a different salience or weight to each of the dimensions of the space.

In the graphic displays of the 2-dimensional subject space, an additional menu item, Vectors, optionally enables you to plot the subjects as vectors, together with the line representing equal weighting. On closing the graphic displays of subject spaces, it is also possible to submit arc-distances in the space to further analysis using SUBJSTAT. By convention, a subject space is always represented in the positive quadrant of the plotted space, i.e. the coordinate values are all positive.
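As an illustration of the weighted distance model, the following Python/NumPy sketch computes one subject's distances from a group space and a set of dimension weights, and builds the corresponding private space by stretching each axis by the square root of its weight. The function names, the example configuration and the weight values are hypothetical and are not part of the program.

```python
import numpy as np

def weighted_distances(group_space, weights):
    """Weighted Euclidean distances between all stimulus pairs for one subject."""
    diff = group_space[:, None, :] - group_space[None, :, :]
    return np.sqrt((weights * diff ** 2).sum(axis=-1))

def private_space(group_space, weights):
    """Rescale each axis of the group space by the square root of its weight."""
    return group_space * np.sqrt(weights)

# Hypothetical group space: 4 stimuli at the corners of a unit square.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Hypothetical subject who gives dimension 1 more salience than dimension 2.
w = np.array([0.81, 0.25])

D = weighted_distances(X, w)   # weighted distances in the group space
Y = private_space(X, w)        # this subject's private space
```

Simple Euclidean distances computed between the rows of Y reproduce D, which is the sense in which the weights "distort" the group space into a private space.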

The INDSCAL model is a special case of CANDECOMP (where the second and third 'ways' of the data matrix are identical), and is also akin to the P1 model in the PINDIS hierarchy of models.

INDSCAL is an expressly dimensional model and produces a unique orientation of the axes of the Group Space, in the sense that any rotation will destroy the optimality of the solution and will change the values of the subject weights. Moreover, the distances in the Group Space are weighted Euclidean, whereas those in the private spaces are simple Euclidean. Because of this, it is not legitimate to rotate the axes of a Group Space to a more 'meaningful' orientation, as is commonly done both in factor analysis and in the basic multidimensional scaling model. It has generally been found that the recovered dimensions yield readily to interpretation. In many uses of INDSCAL, the third way consists not of individual "subjects" but of aggregated subgroups ("pseudo-subjects"), or indeed of different replications, time-series, methods, places, etc.

The N (dis)similarity matrices of order p must follow the READ MATRIX command sequentially.

At the beginning of an INDSCAL analysis each input matrix of dis/similarities or distances is converted into a matrix of scalar products. To equalize each subject's influence on the analysis, these data are normalized by scaling each scalar products matrix so that its sum of squares equals one. Data input as product-moment covariances or correlations are already scalar products and do not need converting in this way. It is therefore essential to signal this type of input by means of the DATA TYPE parameter (see below).
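The conversion and normalization steps can be sketched as follows in Python/NumPy. The sketch assumes the standard double-centring formula for turning distances into scalar products; whether the program uses exactly this formula, and how it first treats similarities, is not stated here, so the code is an illustration only. The function names are hypothetical.

```python
import numpy as np

def scalar_products(dist):
    """Double-centre the squared distances (assumed conversion formula)."""
    p = dist.shape[0]
    centering = np.eye(p) - np.ones((p, p)) / p
    return -0.5 * centering @ (dist ** 2) @ centering

def normalize(b):
    """Scale a scalar products matrix so that its sum of squares equals one."""
    return b / np.sqrt((b ** 2).sum())

# Hypothetical subject: three stimuli lying on a line at positions 0, 1 and 3.
d = np.array([[0.0, 1.0, 3.0],
              [1.0, 0.0, 2.0],
              [3.0, 2.0, 0.0]])

B_raw = scalar_products(d)
B = normalize(B_raw)
```

Applying normalize to every subject's matrix gives each subject the same total sum of squares, so that no single subject dominates the fit.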

DIMENSIONS
Some experimentation is generally needed to determine how many dimensions are appropriate for a given set of data. This involves analyzing the data in spaces of different dimensionality. For each space of r dimensions the program uses as a starting configuration the solution in (r + 1) dimensions less the dimension accounting for the least variance. Usually between two and four dimensional solutions will be adequate for any reasonable data set.
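The way a starting configuration for r dimensions is derived from the (r + 1)-dimensional solution can be sketched as below. How the program measures the variance accounted for by each dimension is not stated here; the sketch assumes the sum of squared subject weights on each dimension as a stand-in. The function name and the example values are hypothetical.

```python
import numpy as np

def drop_weakest_dimension(config, weights):
    """Remove the dimension with the smallest assumed importance.

    config  : p x (r + 1) stimulus coordinates
    weights : N x (r + 1) subject weights
    """
    # Assumption: importance of a dimension = sum of squared subject weights.
    importance = (weights ** 2).sum(axis=0)
    weakest = int(importance.argmin())
    return np.delete(config, weakest, axis=1), np.delete(weights, weakest, axis=1)

# Hypothetical 3-dimensional solution for 4 stimuli and 2 subjects.
config3 = np.arange(12.0).reshape(4, 3)
weights3 = np.array([[0.8, 0.1, 0.5],
                     [0.7, 0.2, 0.6]])

start2, _ = drop_weakest_dimension(config3, weights3)  # 4 x 2 starting configuration
```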

The starting configuration
The analysis begins with an initial configuration of stimulus points. This may be supplied by the user following the READ CONFIG command. The configuration should contain the stimulus coordinates in the maximum dimensionality required.

Alternatively, the program can generate a pseudo-random starting configuration if the value of the parameter RANDOM is 0. If RANDOM is assigned a non-zero value, this is used as a seed to generate the random numbers. Since suboptimal solutions are not uncommon with this method, users are strongly recommended to make several runs with different starting configurations. A series of similar (or identical) solutions may be taken to indicate that a true 'global' solution has been found.
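When comparing runs made from different starting configurations, note that two INDSCAL solutions can differ in the order and sign of their dimensions and still be the same solution. The following sketch, written in Python/NumPy outside the program, generates seeded random starts and checks whether two group spaces agree up to order and sign of the axes. The function names and the 0.99 tolerance are hypothetical choices.

```python
import numpy as np
from itertools import permutations

def random_start(n_stimuli, n_dims, seed):
    """Pseudo-random starting configuration, reproducible for a given seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_stimuli, n_dims))

def solutions_agree(a, b, tol=0.99):
    """True if the columns of b match those of a up to order and sign."""
    r = a.shape[1]
    # Absolute correlations between every column of a and every column of b.
    corr = np.abs(np.corrcoef(a.T, b.T)[:r, r:])
    return any(
        all(corr[i, perm[i]] >= tol for i in range(r))
        for perm in permutations(range(r))
    )

# Hypothetical check: a configuration with its axes swapped and reflected.
a = random_start(6, 2, seed=1)
b = -a[:, ::-1]
same = solutions_agree(a, b)
```

With at most five dimensions the exhaustive search over axis orderings involves no more than 120 permutations.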

Alternatively, the user may wish to minimize this particular difficulty by submitting as an initial configuration one obtained from, say, a MINISSA run in which the averaged judgements have been analysed. This method will also reduce the amount of machine time taken to reach a solution.

READ CONFIG / FIX POINTS
It is sometimes useful to determine only subject weights for some previously determined stimulus configuration, such as a previous INDSCAL solution or some other known configuration. This makes it possible to use INDSCAL in an external mode. The configuration may be supplied following the READ CONFIG command.

The full set of data should follow READ MATRIX, but with FIX POINTS set to 1 in the PARAMETERS command the program will solve only for the subject weights.
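Solving for subject weights with the stimulus configuration held fixed can be illustrated with an ordinary least-squares fit of one subject's scalar products B to X diag(w) X', where X is the fixed configuration. This is a minimal sketch of the idea in Python/NumPy; the algorithm actually used by the program is not described here, and the function name and example values are hypothetical.

```python
import numpy as np

def fit_weights(config, scalar_products):
    """Least-squares weights w such that B is approximated by X diag(w) X'."""
    p, r = config.shape
    # One column per dimension: the flattened outer product of that axis.
    design = np.stack(
        [np.outer(config[:, a], config[:, a]).ravel() for a in range(r)],
        axis=1,
    )
    w, *_ = np.linalg.lstsq(design, scalar_products.ravel(), rcond=None)
    return w

# Hypothetical fixed configuration (3 stimuli, 2 dimensions) and one subject
# whose scalar products are generated exactly from known weights.
X = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
w_true = np.array([0.6, 0.2])
B = X @ np.diag(w_true) @ X.T

w_hat = fit_weights(X, B)
```

Because each subject is fitted separately against the same configuration, any number of subjects can be processed one at a time in this way.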

This option is particularly useful when the user has more data than the program is capable of handling. The user can input the configuration obtained either from a MINISSA analysis of averaged judgments or from an INDSCAL analysis of some judiciously (or randomly) selected subset of subjects and fit to it any number of subjects' weights.

If the user wishes to constrain the solution as closely as possible to orthogonality (i.e. in the sense that the correlation between the coordinates is zero) then the parameter SOLUTIONS should be set to 1 in the PARAMETERS statement. Users are warned that this will necessarily produce a suboptimal solution.

INPUT COMMANDS
Keyword          Argument                     Function
N OF STIMULI     [number]                     Number of stimuli for analysis

N OF SUBJECTS    [number]                     Number of subjects for which
                                              data are to be input

DIMENSIONS       [number]                     Dimensions for analysis
                 [number list]
                 [number] TO [number]

LABELS           [followed by a series        Optionally identify the stimuli,
                 of labels (<= 65 chars),     followed by the subjects, as
                 each on a separate           required. All labels should be
                 line]                        entered, without omissions.

PARAMETERS
Keyword        Default Value   Function
SOLUTIONS      0               0: Compute all dimensions simultaneously.
                               1: Compute separate one-dimensional
                                  solutions.

FIX POINTS     0               0: Iterate and solve for all matrices.
                               1: Solve for subject weights only.

RANDOM         0               Random number seed for generating the
                               initial configuration (used when the user
                               does not provide an initial configuration
                               with the READ CONFIG command).

DATA TYPE      1               0: IDIOSCAL starting configuration.
                               1: Lower triangle similarity matrix
                                  (without diagonals).
                               2: Lower triangle dissimilarity matrix
                                  (without diagonals).
                               3: Lower triangle Euclidean distances
                                  (without diagonals).
                               4: Lower triangle correlation matrix
                                  (without diagonals).
                               5: Lower half covariance matrix
                                  (with diagonals).
                               6: Full symmetric similarity matrix
                                  (diagonals ignored).
                               7: Full symmetric dissimilarity matrix
                                  (diagonals ignored).

CRITERION      0.005           Sets criterion value for termination of
                               iterations.

MATFORM        0               0: Input configuration is stimuli
                                  (rows) by dimensions (columns).
                               1: Input configuration is dimensions
                                  (rows) by stimuli (columns).
                               Valid only with READ CONFIG.

PRINT options (to the main output file)
Option     Form      Description
INITIAL    N x r     Three matrices are output:
           p x r     1. the initial estimates of the subject weights;
                     2. & 3. separate estimates of the stimulus
                        configuration.

FINAL      N x r     The matrix of subject weights.
           p x r     The coordinates of the group space.
           N         The correlation between each subject's data and
                     solution.
           r x r     The matrix of cross-products between the
                     dimensions.

HISTORY              An iteration-by-iteration history of the overall
                     correlation. (The final (3) matrices at
                     convergence are also output.)

SUMMARY              Summary of results produced at the end of each
                     analysis.

By default only the solution matrices and the final overall
correlation are output.

PLOT options (to the main output file)
Option           Description
INITIAL          The initial configuration may be plotted only if
                 one is input by the user.
CORRELATIONS     The correlations at each iteration are plotted.
GROUP            Up to r(r-1)/2 plots of the p stimulus points.
SUBJECTS         Up to r(r-1)/2 plots of the Subject Space.
                 Note, however, that it is mistaken to regard
                 these spaces as Euclidean. SUBJSTAT offers
                 an arc-distance measure for the analysis of
                 distances between items in subject spaces.

By default the Subject and Group Spaces will be plotted.
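Since straight-line distances between points in a subject space are not meaningful, subjects are better compared by the direction of their weight vectors. The sketch below assumes that the arc-distance between two subjects is the angle (in radians) between their weight vectors from the origin; the exact measure computed by SUBJSTAT is not defined here, and the function name is hypothetical.

```python
import numpy as np

def arc_distance(w1, w2):
    """Angle in radians between two subjects' weight vectors (assumed measure)."""
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    cos = np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2))
    # Clip to guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Two hypothetical subjects on the equal-weighting line: same direction,
# different vector lengths, so the angle between them is (near) zero.
same_pattern = arc_distance([0.2, 0.2], [0.8, 0.8])

# Two hypothetical subjects who each use only one, different, dimension.
opposite_pattern = arc_distance([0.9, 0.0], [0.0, 0.7])
```

Under this reading, subjects lying on the same line through the origin weight the dimensions in the same proportions, however far from the origin they are plotted.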

PUNCH options (to a secondary output file)
Option              Description
FINAL               Outputs the final configuration
                    and the subject correlations in
                    the following order:
                    - each subject is followed by
                      its weight on each dimension;
                    - each stimulus point is followed
                      by its coordinates on each dimension.
CORRELATIONS        The overall correlation at each
                    iteration is output in a fixed format.
SCALAR PRODUCTS     The scalar product matrix is saved.

By default, no secondary output file is generated.

PROGRAM LIMITS
Maximum no. of dimensions = 5
Maximum no. of stimuli = 200
Maximum no. of subjects = 200
Maximum N OF SUBJECTS x N OF STIMULI x N OF STIMULI = 8,000,000
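A data set can be checked against these limits before a run is prepared. The following short Python sketch simply encodes the figures listed above; the function and constant names are hypothetical.

```python
MAX_DIMENSIONS = 5
MAX_STIMULI = 200
MAX_SUBJECTS = 200
MAX_CELLS = 8_000_000  # subjects x stimuli x stimuli

def within_limits(n_subjects, n_stimuli, n_dims):
    """True if the problem size fits within the listed program limits."""
    return (
        n_dims <= MAX_DIMENSIONS
        and n_stimuli <= MAX_STIMULI
        and n_subjects <= MAX_SUBJECTS
        and n_subjects * n_stimuli * n_stimuli <= MAX_CELLS
    )

largest_ok = within_limits(200, 200, 5)       # exactly at every limit
too_many_subjects = within_limits(201, 10, 2)
```

A data set with too many subjects can still be analysed by fitting subject weights to a fixed configuration with FIX POINTS, as described under READ CONFIG / FIX POINTS.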

See also

  • The NewMDSX commands in full
  • SUBJSTAT - analysis of distances in subject spaces