Unidimensional CONJOINT measurement

CONJOINT analyses data in the form of a rectangular array of integers by means of a family of simple composition functions, using a monotonic transformation of the data. It may be regarded as a kind of "ordinal analysis of variance". The program takes a dependent variable and a set of independent variables and, for a given simple composition function, estimates the monotone transformation of the data which best fits that function. By a 'simple composition function' we mean an expression linking the independent variables by means of the operators +, - and × (multiplication).

The program estimates a weight for each category of each way (or "facet") of the table, and is hence a unidimensional model. When the weights are combined in the way specified in the model, they reproduce the (monotonically transformed) data as closely as possible.

DATA: N-way (N mode) Table of frequencies                                
TRANSFORM: Monotonic                   
MODEL: Simple composition  

The user must supply two things for a run of CONJOINT:
     i) the data
     ii) the form of the composition model.

The data are presented to the program as a rectangular N-way array of numbers. Each "facet" or "way" of the array (these terms are used interchangeably) corresponds to one of the variables, and its extent is the number of categories of that variable.

The model is coded by means of the MODEL command, which is peculiar to CONJOINT. Its parameter field contains a specification, in ordinary notation, of the model to be fitted. For example, for three facets we might use the simple additive model, in which case the command would be:

MODEL     A + B + C

 

It may be the case that one facet is a subset of another (or indeed identical to it). In this case the name of the earlier facet is repeated. Thus for a study of three facets in which the third is a subset of the second and the model is multiplicative, the command would be:


MODEL     A * B * B

 

Note that '*' is used to denote multiplication when encoding a model.
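As an illustration only (this is not CONJOINT code), the following Python sketch shows how a set of category weights is combined under an additive or a multiplicative model to give one value per cell of the table. The function name fitted_table and the weights are hypothetical. A repeated facet is imitated by passing the same weight list twice, which is a simplification of the subsetting case.

```python
from itertools import product
from math import prod

def fitted_table(weights, model="additive"):
    """Combine per-facet category weights into one value per cell.

    weights: list of weight lists, one list per facet.
    Returns a dict mapping a tuple of 0-based category indices
    to the value of the composition function for that cell.
    """
    combine = sum if model == "additive" else prod
    sizes = [range(len(w)) for w in weights]
    return {
        cell: combine(w[i] for w, i in zip(weights, cell))
        for cell in product(*sizes)
    }

# Hypothetical weights: facet A has 2 categories, facet B has 3.
a = [0.0, 1.0]
b = [-1.0, 0.5, 2.0]

additive = fitted_table([a, b], "additive")           # cf. MODEL A + B
# Re-using one weight list for two facets imitates MODEL A * B * B.
shared = fitted_table([a, b, b], "multiplicative")
```

In the sketch the weights are fixed in advance; in CONJOINT they are the quantities being estimated.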

The coding of categories
The numbers of categories in each of the facets (and thus the dimensions of the input array) are given by the parameters A-FACET, B-FACET, C-FACET, D-FACET and E-FACET in the PARAMETERS command. No more than five facets are allowed. The argument to each of these parameters is the number of categories in the corresponding facet, e.g.

PARAMETERS     A-FACET(2), B-FACET(3), C-FACET(4)

 

Note that the hyphen is a significant character!

If subsetting is involved, then A-FACET refers to the first facet, B-FACET to the second etc., regardless of the actual names given in the MODEL specification.

The program finds the monotone transformation of the data (d[0]) which is as close as possible (in a least squares sense) to a set of values (d) which conform to the requirements of the specified composition function. The transformed data are analogous to the fitting values of the basic MDS model, which approximate the actual distances in the solution space.
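The least-squares monotone fit described above can be sketched in Python with the pool-adjacent-violators method. This is a minimal illustration under stated assumptions, not the program's own routine: the names monotone_fit and stress1 are hypothetical, tied data values are simply left in input order, and the normalisation of STRESS used by CONJOINT is not stated here, so Kruskal's stress formula 1 is assumed purely for the example.

```python
def monotone_fit(data, model_values):
    """Fitting values that are non-decreasing in the order of `data`
    and closest, in the least-squares sense, to `model_values`."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    blocks = []  # each block is [sum of model values, count]
    for i in order:
        blocks.append([model_values[i], 1])
        # Pool while the previous block mean exceeds the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = [0.0] * len(data)
    pos = 0
    for s, n in blocks:
        for i in order[pos:pos + n]:
            fitted[i] = s / n  # every member of a block gets the block mean
        pos += n
    return fitted

def stress1(fitted, model_values):
    """Assumed badness-of-fit measure (Kruskal's stress formula 1)."""
    num = sum((f - m) ** 2 for f, m in zip(fitted, model_values))
    den = sum(m ** 2 for m in model_values)
    return (num / den) ** 0.5
```

For data 1, 2, 3, 4 and model values 1, 3, 2, 4, the second and third model values violate the data order, so they are pooled and both receive the fitting value 2.5.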

Tied values
Two ways of treating tied data values are recognised in the CONJOINT procedure, specified using TIES in the PARAMETERS command:
TIES (1) - In the primary approach, ties in the data are broken in the fitting values, if, in doing so, STRESS is reduced. This option places little or no importance on the appearance of ties.
TIES (2) - By contrast, the secondary approach regards the information on ties as important and requires that tied data values are fit by equal fitting values.
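The difference between the two approaches can be sketched in Python as follows. The sketch assumes the usual treatment of ties in monotone regression: under the primary approach tied data are ordered by their model values before the fit, and under the secondary approach each group of tied data is pooled into a single block beforehand. The function names are hypothetical and the program's own routine may differ in detail.

```python
from itertools import groupby

def _pool(blocks):
    """Pool adjacent violators over (sum, count) blocks."""
    out = []
    for s, n in blocks:
        out.append([s, n])
        # Compare block means by cross-multiplication (counts are positive).
        while len(out) > 1 and out[-2][0] * out[-1][1] > out[-1][0] * out[-2][1]:
            s2, n2 = out.pop()
            out[-1][0] += s2
            out[-1][1] += n2
    return out

def fit_with_ties(data, model_values, ties=1):
    """Monotone fitting values under TIES (1) or TIES (2)."""
    if ties == 1:
        # Primary: tied data may receive different fitting values.
        order = sorted(range(len(data)),
                       key=lambda i: (data[i], model_values[i]))
        start = [(model_values[i], 1) for i in order]
    else:
        # Secondary: tied data are pooled, so they share one fitting value.
        order = sorted(range(len(data)), key=lambda i: data[i])
        start = []
        for _, grp in groupby(order, key=lambda i: data[i]):
            members = list(grp)
            start.append((sum(model_values[i] for i in members), len(members)))
    fitted = [0.0] * len(data)
    pos = 0
    for s, n in _pool(start):
        for i in order[pos:pos + n]:
            fitted[i] = s / n
        pos += n
    return fitted
```

With data 1, 2, 2, 3 and model values 1, 4, 2, 5, the primary approach reproduces the model values exactly, whereas the secondary approach gives both tied cells the fitting value 3.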

CONJOINT treats each facet as being a categorical or nominal scale, and estimates an interval-level weight for each category of each facet. Note that if the categories happen to be ordered (say, High, Medium and Low Status) there is nothing in the procedure which will guarantee that the category weights will be similarly ordered.

Replications
Users may wish to analyse by the same model a number of replications of the same study. Such a study is signalled to the program by means of the REPLICATIONS parameter. This specifies the number of sets of data, not the number of actual replications, i.e. if you have an original study and two follow-ups then the correct coding is REPLICATIONS (3).

If a replicatory study provides data on only a subset of the original variables, then it is suggested that the study be coded as a replication with MISSING DATA values inserted at the appropriate places in the data matrix.

CRITERION
If the improvement in stress between iterations is less than the value specified in the CRITERION parameter then the process is stopped and the current values output as the solution.

The program begins the iterative process by assigning to each of the parameters a pseudo-randomly-generated value. The starting 'seed' for the pseudo-random number generator is specified by RANDOM in the PARAMETERS command. Retaining the same value produces the same results on repeating the analysis for the same data.

The procedure minimises STRESS by adjusting these initial, pseudo-random numbers. Since random starts are prone to the problem of local minima, it is suggested that the user make a number of runs using the same data but different starting values. This is done automatically within one run of CONJOINT by means of the keyword RESTARTS in the PARAMETERS command. The number specified in this parameter should be the number of different starts required. The appearance of a number of highly similar (or identical) solutions suggests, though it does not prove, that a global minimum has been found.
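The interplay of the seed, the restarts and the stopping criterion can be illustrated with a toy Python sketch. Everything in it is an assumption made for the example: the one-parameter function toy_f stands in for STRESS, plain gradient descent stands in for the program's minimisation method (which is not described here), and the names descend and with_restarts are hypothetical.

```python
import random

def descend(f, grad, x, criterion=0.00001, rate=0.05, max_iter=10000):
    """Iterate until the improvement between iterations is below `criterion`."""
    value = f(x)
    for _ in range(max_iter):
        x_new = x - rate * grad(x)
        new_value = f(x_new)
        if value - new_value < criterion:  # improvement too small: stop
            break
        x, value = x_new, new_value
    return value, x

def with_restarts(f, grad, restarts=1, seed=12345, criterion=0.00001):
    """Run `restarts` descents from pseudo-random starts; keep the best."""
    rng = random.Random(seed)  # the same seed gives the same starts
    runs = [descend(f, grad, rng.uniform(-2.0, 2.0), criterion)
            for _ in range(restarts)]
    return min(runs), runs

# Toy objective with two minima, standing in for STRESS.
def toy_f(x):
    return (x * x - 1) ** 2 + 0.3 * x

def toy_grad(x):
    return 4 * x * (x * x - 1) + 0.3

best, runs = with_restarts(toy_f, toy_grad, restarts=5)
```

Comparing the entries of runs shows which starts ended in the same minimum. Repeating the call with the same seed reproduces the same list.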

PARAMETERS

Keyword        Value     Function
TIES           1         1: Primary approach
                         2: Secondary approach
REPLICATIONS   1         Sets number of data-sets for replicated studies.
RANDOM         12345     Seed for pseudo-random-number generator.
RESTARTS       1         Sets the number of times the program will restart
                         analysis using different random starts.
A-FACET        1         Sets the number of categories in each FACET.
B-FACET
C-FACET
D-FACET
E-FACET
CRITERION      0.00001   Sets stopping value for stress.

NOTES
1. The command MODEL is obligatory for CONJOINT.
2. The following are not valid: READ CONFIG, ITERATIONS, N OF STIMULI, N OF SUBJECTS.
3. The program expects as input integer (I-type) variables. The INPUT FORMAT specification, if used, should take account of this, and should read one row of the data. Otherwise, free format data input is assumed.
4. The data for CONJOINT are input as a rectangular array of integers in which the first facet is associated with the fastest-running subscript. Consider first the two-facet case: if facet A has five categories and facet B has three, then the input array has five columns and three rows (NOT five rows and three columns). If a third facet C with two categories is added, then two such arrays of three rows by five columns are input (six rows in all, each of five columns). A fourth facet with four categories results in four such six-row blocks, i.e. twenty-four rows in all. The blocks follow one another without separation.
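The layout described in note 4 can be checked with a short Python sketch that maps a position in the input array to the category indices of its cell. The function names are hypothetical and all indices are 0-based. The sketch assumes that facet B varies next-fastest after A, then C, then D, which is how the blocks in note 4 are built up.

```python
def n_rows(sizes):
    """Number of input rows for facets with the given numbers of categories."""
    rows = 1
    for n in sizes[1:]:  # facet A runs along the columns, not the rows
        rows *= n
    return rows

def cell_index(row, col, sizes):
    """Map a 0-based (row, col) position in the input array to a tuple
    of 0-based category indices (a, b, c, ...)."""
    indices = [col]  # facet A: the fastest-running subscript
    for n in sizes[1:]:
        indices.append(row % n)  # position within the current block
        row //= n                # move out to the enclosing block
    return tuple(indices)

# Facets A, B, C with 5, 3 and 2 categories: 6 rows of 5 columns.
sizes = [5, 3, 2]
rows = n_rows(sizes)
fifth_row_third_col = cell_index(4, 2, sizes)
```

Here the fifth row lies in the second block (the second category of C) and is the second row within that block (the second category of B).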

PRINT options (to main output file)
Option     Description
TABLES     Two matrices are output:
             1. the matrix of fitting-values;
             2. the solution matrix.
           Both will be in the same order as the input data.
HISTORY    An extended history of the iterative process.
SOLUTION

By default, only the SOLUTION will be output along with the final stress value.

PLOT options (to main output file)
Option      Description
STRESS      A histogram of STRESS at each iteration is produced.
SHEPARD     A Shepard diagram plotting data against solution is produced,
            with the fitted values indicated.
RESIDUALS   A histogram of residual values, natural and logarithmic,
            is produced.

A Shepard diagram is produced by default.

PUNCH options (to secondary output file)
Option                     Description
SPSS                The following values are output:
                        I,J,K,L,M (being indices of the
                        five possible facets), DATA, FITTING,
                        SOLUTION, RESIDUALS, being the
                        corresponding values in a fixed format.
FINAL                Outputs final solution
STRESS             Outputs STRESS values by iteration.

By default, no secondary output file is produced.

PROGRAM LIMITS
Maximum no. of facets = 5
Maximum no. of scale values = 500
Maximum no. of replications = 2500

See also

  • The NewMDSX commands in full