Multi-Dimensional PREFerence Scaling : MDPREF

MDPREF provides internal analysis of two-way data, in the form of either a set of paired-comparisons matrices or a rectangular, row-conditional matrix, by means of a vector model, using a linear transformation of the data.

 

DATA: 2-Way, 2 mode dis/similarities, preferences (or 3-way, 2-mode dominance data for pair-comparisons option)

TRANSFORM: Linear                        

MODEL: Scalar Products (Vector)

INPUT DATA
MDPREF accepts input data in either of two main forms:
(i) as a set of dominance (0,1) pair-comparisons matrices, or
(ii) as a set of rankings or ratings forming a rectangular, so-called "first-score" matrix.
The options available within the program depend on the form of input. The type of input is indicated by the DATA TYPE parameter in the PARAMETERS command.

1. The first-score matrix - DATA TYPEs 1-4
Suppose a set of N subjects is asked to rank in order of preference, or give a rating to a set of p stimuli. The resultant data form a rectangular 'row-conditional' matrix with N rows (subjects) and p columns (stimuli), called the "first score matrix" in MDPREF. Each row of the matrix represents the preference rank or score assigned by that subject to the stimuli.

Such a matrix can also be obtained by taking the pair comparison matrix for a given subject and summing each row. The resultant column of scores gives that subject's (possibly weak) rank order of preference for the stimuli and these may be collected to form the "first-score matrix".
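The row-summing step described above can be sketched in a few lines of Python. This is an illustration only, not MDPREF's own code; the function name and the example matrix are hypothetical.

```python
import numpy as np

def first_scores_from_pairs(pair_matrix):
    """Sum each row of a 0/1 dominance matrix to get one subject's scores."""
    a = np.asarray(pair_matrix, dtype=float)
    return a.sum(axis=1)

# Hypothetical subject judging p = 4 stimuli.
# pairs[i, j] = 1 means stimulus i is preferred to stimulus j.
pairs = np.array([
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
])

scores = first_scores_from_pairs(pairs)
print(scores)  # [3. 2. 0. 1.]
```

Each score counts how many other stimuli the row stimulus was preferred to, so a higher value means a more preferred stimulus. Stacking one such row per subject gives an N x p first-score matrix.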

Ranks vs. Scores
Preference judgments may be represented for MDPREF (as in MINIRSA and other procedures) in four distinct ways. The major distinction is that between a rank and a score. Ranked data may be input to MDPREF in this form by specifying DATA TYPE(1) if the order is most-preferred to least-preferred, or DATA TYPE(2) if it is the reverse.

If the data represent 'scores', so that the lowest number is used to denote the least preferred stimulus and the highest to represent the most preferred, the option is indicated by DATA TYPE(3). Alternatively, the highest number might have been used to represent the least preferred stimulus and if this is so, DATA TYPE(4) should be specified.
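To make the four representations concrete, the sketch below converts one subject's row to a common "higher means more preferred" orientation. It is a hypothetical helper, not part of MDPREF, and it assumes that DATA TYPE 1 and 2 rows list 1-based stimulus (column) indices in order of preference, as the parameter table describes; how the program itself recodes the data is not documented here.

```python
import numpy as np

def to_scores(row, data_type):
    """Return one subject's row as scores where higher = more preferred."""
    row = np.asarray(row)
    p = row.size
    if data_type in (1, 2):
        # Assumed: the row lists 1-based stimulus indices in preference order.
        # Type 1 is most preferred first; type 2 is the reverse.
        order = row if data_type == 1 else row[::-1]
        scores = np.empty(p, dtype=float)
        scores[order - 1] = np.arange(p, 0, -1)  # p for the best, 1 for the worst
        return scores
    if data_type == 3:
        # Already oriented: high score means high preference.
        return row.astype(float)
    if data_type == 4:
        # Reflect the scale so that high score means high preference.
        return (row.max() + row.min() - row).astype(float)
    raise ValueError("data_type must be 1, 2, 3 or 4")

# The same hypothetical preferences expressed in all four ways.
print(to_scores([3, 1, 2, 4], 1))  # stimulus 3 first, then 1, 2, 4
print(to_scores([4, 2, 1, 3], 2))  # same order, least preferred first
print(to_scores([3, 2, 4, 1], 3))  # scores, high = preferred
print(to_scores([2, 3, 1, 4], 4))  # scores, high = not preferred
```

All four calls describe the same subject, so they return the same vector of scores.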

2. The pair-comparisons matrices - DATA TYPE 0
Suppose a subject is asked to consider all possible pairs of p stimuli and, for each pair, to indicate which stimulus is preferred (or which stimulus possesses more of a given attribute). The subject is thus asked to make p(p-1)/2 judgments of preference. The data obtained may be collected into a square, asymmetric matrix whose rows and columns each represent the p stimuli, and whose entries a(i,j) take the value 1 if the subject prefers stimulus i to stimulus j and 0 if the opposite is the case. The subject may also be allowed to express indifference between the stimuli, or to express no opinion on a particular pair comparison.

The READ CODES command instructs the program to read in four code values, the first of which represents preference, the second its opposite ("anti-preference"), the third indifference, and the fourth is a missing data value. When using the input Wizard, you will be prompted to enter these values, before proceeding to the spreadsheets to enter the subjects' preference matrices using the codes specified.
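As an illustration of the four-code scheme, the following sketch decodes one coded matrix. The code values (1, 2, 3, 9), the use of the missing code on the diagonal, and the scoring of indifference as 0.5 are all assumptions made for the example; the text above does not say how MDPREF treats indifference internally.

```python
import numpy as np

def decode_pairs(coded, codes):
    """Map (preference, anti-preference, indifference, missing) codes to numbers."""
    prefer, anti, indifferent, missing = codes
    coded = np.asarray(coded)
    out = np.full(coded.shape, np.nan)   # missing (or unrecognised) cells stay NaN
    out[coded == prefer] = 1.0
    out[coded == anti] = 0.0
    out[coded == indifferent] = 0.5      # assumption: indifference counts as half
    return out

codes = (1, 2, 3, 9)                     # hypothetical values, in READ CODES order
coded = np.array([
    [9, 1, 3],
    [2, 9, 1],
    [3, 2, 9],
])

decoded = decode_pairs(coded, codes)
p = coded.shape[0]
n_judgements = p * (p - 1) // 2          # number of distinct pairs judged
print(decoded)
print(n_judgements)
```

With p = 3 stimuli the subject makes 3 judgements, each of which appears twice in the square matrix: once as a(i,j) and once, with the opposite code, as a(j,i).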

If there are N subjects performing this test of preference, then there will be N such matrices. These are input to MDPREF by specifying DATA TYPE(0) in the PARAMETERS command.

The N pair comparisons matrices will be read by the READ MATRIX command. They may be in free format (or entered according to an optional INPUT FORMAT specification, to read one row of the input matrices), and the individual matrices should follow each other without separation.

If there are missing data, then MISSING(1) should be specified in the PARAMETERS command.

The MDPREF model represents the preferences of a subject for a group of stimuli as a vector through the configuration of stimulus points. This vector indicates the direction in which his or her preference increases over the space. Substantively, this makes a strong assumption about the nature of preference: the model implies an "ideal" point, i.e. a point of maximum preference, at infinity (which is similar to the classic econometric assumption of insatiability). Because the point of maximum preference is at infinity, the contours of equal preference become perpendicular to the vector.

MDPREF is a linear (or metric) procedure and the measure of goodness-of-fit of the model to the data is a product-moment correlation. Consider one subject vector passing through a configuration of stimulus points, with perpendicular lines drawn from the points onto the vector. The values given to the points at which these perpendiculars meet the vector are maximally correlated with that subject's data. (This is guaranteed by the Eckart-Young decomposition.)

The subject vectors are normalised (for convenience only) to the same length, i.e. so that their ends lie at a common distance from the origin of the space, forming a circle, sphere or hypersphere as the case may be. Thus when a solution of more than 3 dimensions is represented (as it must be) as a set of 2-dimensional plots, some of the vectors will not, in fact, lie on the boundary circle, since they will have been projected down from the higher dimensions. The length of the vector in the sub-space is related to the amount of variation in that subject's data explained by those two dimensions of the solution space. In the graphic displays of these results, an additional menu item, Vectors, enables you to plot or suppress the subject vectors if the display becomes too cluttered.
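The geometry described above can be sketched with a singular value decomposition, which yields the best low-rank approximation in the Eckart-Young sense. This is a minimal illustration on random data, not the program's algorithm: the variable names are hypothetical, and the way the singular values are shared between stimulus and subject coordinates is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, r = 6, 5, 2                               # subjects, stimuli, dimensions

first = rng.normal(size=(N, p))                 # hypothetical first-score matrix
first -= first.mean(axis=1, keepdims=True)      # subtract row means

U, s, Vt = np.linalg.svd(first, full_matrices=False)
stimuli = Vt[:r].T                              # p x r stimulus points
subjects = U[:, :r] * s[:r]                     # N x r subject vectors (assumed scaling)
second = subjects @ stimuli.T                   # N x p reconstructed scores

# Normalise the subject vectors to unit length, for display only.
unit_vectors = subjects / np.linalg.norm(subjects, axis=1, keepdims=True)

# Projections of the stimulus points onto each unit vector.
projections = stimuli @ unit_vectors.T          # p x N

# Fit per subject: correlation between the data and the projections.
fit = np.array([np.corrcoef(first[i], projections[:, i])[0, 1] for i in range(N)])
print(np.round(fit, 3))
```

Rescaling a subject vector to unit length changes the projections only by a positive constant, so the correlations are unaffected; this is why the normalisation can be done "for convenience only".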

Dimensionality
The program lists the latent roots of the matrices. The number of roots will be equal to the number of stimuli or the number of subjects, whichever is the smaller. The magnitude of each root gives an indication of the amount of variation in the data accounted for by the corresponding dimension. The largest root is always listed first and the others follow in decreasing order. Some may be zero. An appropriate dimensionality may be chosen by means of the scree test.
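The behaviour of the roots can be illustrated as follows. The sketch assumes that the latent roots are the eigenvalues of the cross-product matrix (equivalently, the squared singular values of the first-score matrix); the data are hypothetical.

```python
import numpy as np

# Hypothetical ratings: 4 subjects (rows) by 3 stimuli (columns).
first = np.array([
    [5.0, 3.0, 1.0],
    [4.0, 4.0, 2.0],
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 4.0],
])
first = first - first.mean(axis=1, keepdims=True)   # subtract row means

# Singular values come back in decreasing order; their squares are the roots.
singular_values = np.linalg.svd(first, compute_uv=False)
roots = singular_values ** 2

proportion = roots / roots.sum()
cumulative = np.cumsum(proportion)

for k, (root, cum) in enumerate(zip(roots, cumulative), start=1):
    print(f"dimension {k}: root = {root:8.3f}, cumulative share = {cum:.3f}")
```

There are min(4, 3) = 3 roots. Because the row means have been subtracted, every row sums to zero and the last root is zero (up to rounding), which is one way a zero root can arise. Plotting the roots against their index gives the scree diagram.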

Normalising and Centring
With the data in the form of a first score matrix the user may choose how the matrix is to be centred and normalised using the parameters CENTRE and NORMALISE. The default for these parameters is 0 and means no action.

Other options allow various courses:
CENT(1) instructs the program simply to subtract the row means. This will, in a rating exercise, remove any effect due to differences in the actual values used by particular subjects.

NORM(1) allows the program not only to subtract the row means but also to take out any effect due to differences in the range or spread of scores involved by normalising each row by dividing it by its standard deviation.

CENT(2) and NORM(2) perform the same operations on the column elements, i.e. subtracting column means and column normalising respectively. This latter option has the effect of taking out the unanimity effect in subjects' judgements and leaving only the significant differences in judgements.

CENT(3) instructs the program to double centre the matrix by subtracting both row and column means. NORM(3) does this and normalises the entire matrix.
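The options above can be sketched as two small functions. This is an illustration under stated assumptions, not the program's code: the function names are hypothetical, the standard deviation is taken with divisor n, and NORM(3) is interpreted as dividing the double-centred matrix by its overall standard deviation.

```python
import numpy as np

def centre(x, option):
    """Mimic CENTRE 0-3 on a subjects-by-stimuli matrix."""
    x = np.asarray(x, dtype=float)
    if option == 0:
        return x.copy()
    if option == 1:                       # subtract row means
        return x - x.mean(axis=1, keepdims=True)
    if option == 2:                       # subtract column means
        return x - x.mean(axis=0, keepdims=True)
    if option == 3:                       # double centring
        return (x - x.mean(axis=1, keepdims=True)
                  - x.mean(axis=0, keepdims=True) + x.mean())
    raise ValueError("option must be 0, 1, 2 or 3")

def normalise(x, option):
    """Mimic NORMALISE 0-3: centre, then divide by a standard deviation."""
    if option == 0:
        return np.asarray(x, dtype=float).copy()
    c = centre(x, option)
    if option == 1:
        return c / c.std(axis=1, keepdims=True)
    if option == 2:
        return c / c.std(axis=0, keepdims=True)
    return c / c.std()                    # option 3 (assumed interpretation)

# Hypothetical ratings: 3 subjects (rows) by 3 stimuli (columns).
ratings = np.array([
    [1.0, 2.0, 6.0],
    [4.0, 5.0, 9.0],
    [2.0, 8.0, 3.0],
])
print(centre(ratings, 1))
print(normalise(ratings, 1))
```

In this example the first two subjects give the same pattern of ratings three points apart, so CENT(1) makes their rows identical. A row or column with no variation would give a zero standard deviation, which the sketch does not guard against.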

Weighting of pair comparison matrices
Since pairwise judgements are often difficult to make, the user may sometimes wish to accord to each judgement a 'weight'. This might represent the degree of confidence which the subject attaches to his/her judgement, or perhaps the reliability which the researcher ascribes to each judgement.

If weights are input (indicated by WEIGHTS(1) in the PARAMETERS command) there must be one weights matrix per subject. The weights matrix immediately follows its associated pair comparisons matrix. This may be input in free format, or read according to an optional WEIGHTS FORMAT specification, which should be suitable for real (F-type) numbers.

Recurring patterns of input
If, as often happens, there is more than one identical weights matrix, then the number of such matrices may be specified as the SAME PATTERN parameter. In this case, the weights matrix follows the first pair comparisons matrix. Those pair comparisons matrices having the same pattern of weights then follow each other without separation.

Blocking of pair-comparisons data
If the number of possible pair-comparisons judgements is thought to be too great, the researcher may resort to the use of incomplete data, i.e. certain element-pairs may not be presented to the subjects. The resulting data matrix will have 'blocks' missing. If this strategy is used and the data are arranged in blocks, then BLOCK(1) must be specified in the PARAMETERS command so that allowance can be made in the calculation of row- and column-sums.

INPUT PARAMETERS

Keyword          Default                    Function
N OF SUBJECTS    [number]                   Number of subjects
N OF STIMULI     [number]                   Number of stimuli
DIMENSIONS       [number]                   Dimensions of the data
DATA TYPE        0                          0: Data are in a pair-comparisons matrix.
                                            1: Data are ranks (I-scales) of column
                                               indices in decreasing order of preference.
                                            2: As 1 but in increasing order of preference.
                                            3: Data are scores in order of column
                                               indices - high score means high preference.
                                            4: As 3 but high scores mean low preference.

LABELS           [followed by a series      Optionally identify the stimuli and
                 of labels (<= 65 chars),   subjects. These should identify first
                 each on a separate line]   the stimuli (columns), then the subjects
                                            (rows), of the data matrix. Subject
                                            labels may be omitted.

OPTIONS WITH THE FIRST-SCORE MATRIX

Keyword      Default   Function
MATFORM      0         0: The matrix is entered subjects (rows)
                          by stimuli (columns).
                       1: The matrix is entered stimuli (rows)
                          by subjects (columns).
GROUPS       0         The number of groups present in an
                       analysis of variance should be specified.
CENTRE       0         0: The data are not centred.
                       1: Row-means only are subtracted.
                       2: Column-means only are subtracted.
                       3: Matrix is double centred.
NORMALISE    0         0: Matrix is not normalised.
                       1: Rows are centred and normalised.
                       2: Columns are centred and normalised.
                       3: Both rows and columns are centred
                          and normalised.

OPTIONS WITH PAIRED-COMPARISONS MATRICES

Keyword         Default   Function
SAME PATTERN    0         Sets the number of subjects whose pattern
                          of missing data or weights is the same.
WEIGHTS         0         0: No weights are input.
                          1: Weights are input.
BLOCK           0         0: The data are not arranged in blocks.
                          1: The non-empty cells are arranged in
                             blocks or are to be treated as such.
                          NOTE: Weights cannot be used with this option.
MISSING         0         0: There are no missing data.
                          1: There are missing data in the matrix.
GROUPS          0         The number of groups present in an
                          analysis of variance should be specified.
CENTRE          0         0: The data are not centred.
                          1: Row-means only are subtracted.
                          2: Column-means only are subtracted.
                          3: The matrix is double centred.
NORMALISE       0         0: Matrix is not normalised.
                          1: Rows are centred and normalised.
                          2: Columns are centred and normalised.
                          3: Both rows and columns are centred
                             and normalised.

NOTES
1. READ CONFIG is not valid with MDPREF.
2. Even if only two or three codes are used in the paired-comparisons matrices, READ CODES must specify four (integer) codes, which must be in the order specified.

PRINT options (to main output file)
Option            Form     Description
FINAL             p x r    The stimulus matrix, followed by the
                  N x r    subject matrix.
FIRST             N x p    The first-score matrix. (This is the input
                           matrix after being centred/normalised.)
                           Means and standard deviations of subjects
                           are printed.
CROSS-PRODUCTS             Four matrices are printed:
                  N x N    1: the cross-product matrix of subjects;
                  p x p    2: the cross-product matrix of stimuli;
                  N x N    3: the correlation matrix of subjects;
                  p x p    4: the correlation matrix of stimuli.
SECOND            N x p    The second-score matrix.
ROOTS                      The latent roots.
RESIDUALS         N x p    The first-score matrix minus the
                           second-score matrix.
CORRELATIONS      N        The correlation for each subject between
                           the data and the stimulus projections.

By default, only the final configuration is printed.
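The four matrices listed under CROSS-PRODUCTS can be illustrated as below. The sketch assumes that "cross-product" means the product of the first-score matrix with its own transpose; the data and variable names are hypothetical.

```python
import numpy as np

# Hypothetical centred first-score matrix: N = 3 subjects by p = 4 stimuli.
first = np.array([
    [ 2.0,  1.0, -1.0, -2.0],
    [ 1.5, -0.5,  0.5, -1.5],
    [-2.0, -1.0,  1.0,  2.0],
])

cross_subjects = first @ first.T                  # N x N
cross_stimuli = first.T @ first                   # p x p
corr_subjects = np.corrcoef(first)                # N x N, rows as variables
corr_stimuli = np.corrcoef(first, rowvar=False)   # p x p, columns as variables

print(cross_subjects.shape, cross_stimuli.shape)
print(np.round(corr_subjects, 3))
```

In this example the third subject's scores are the exact reverse of the first subject's, so their correlation is -1; in the vector model such subjects would point in opposite directions.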

PLOT options (to main output file)
Option      Description
SUBJECTS    The n(n-1)/2 plots of the subject vectors
            in the chosen dimensionalities.
STIMULI     The p(p-1)/2 plots of the stimulus points
            in the chosen dimensionalities.
JOINT       Both of the above.
SHEPARD     A quasi-Shepard-plot - in this case simply the
            first-score plotted against the second-score.
ROOTS       A scree diagram of the latent roots.
ROOTS                    A scree diagram of the latent roots.

By default, only the first two dimensions of the joint space are plotted.

PUNCH options
Option            Description
SUBJECT SPACE     The final configuration of subjects is output.
STIMULUS SPACE    The final configuration of stimuli is output.

By default no secondary output file is produced.

PROGRAM LIMITS

Maximum no. of subjects = 200
Maximum no. of stimuli = 200
Maximum dimensions = 8

See also

  • The NewMDSX commands in full