CONJOINT analyses data in the form of a rectangular array of integers by means of a family of simple composition functions, using a monotonic transformation of the data. It may be regarded as a kind of "ordinal analysis of variance". The program takes a dependent variable and a set of independent variables and estimates, for a given simple composition function, the monotone transformation of the data which best fits that function. By a 'simple composition function' we mean an expression linking the independent variables by means of the operators +, - and x (multiplication).
The program estimates a weight for each category of each way (or "facet") of
the Table, and hence is a unidimensional model. When weights are combined in
the way specified in the Model, the data will be maximally reproduced.
DATA:       N-way (N mode) Table of frequencies
TRANSFORM:  Monotonic
MODEL:      Simple composition
The user must supply two things for a run of CONJOINT:
i) the data
ii) the form of the composition model.
The data are presented to the program as a rectangular N-way array of numbers.
Each "facet" or "way" of the array (these terms are used interchangeably)
corresponds to one of the variables, and its extent is the number of
categories contained in that variable.
CONJOINT makes use of the MODEL command, which is peculiar to it, for the
coding of the model. The parameter field of this command contains a
specification, in ordinary notation, of the model to be fitted. For example,
for three facets, we might use the simple additive model. In this case the
command would be:
MODEL A + B + C
It may be the case that one facet is a subset of another (or indeed is
identical to it). In this case the name of the earlier facet is repeated in
the model. Thus, for a study with three facets in which the third is a subset
of the second and the model is multiplicative, the command would be:
MODEL A * B * B
Note that '*' is used to denote multiplication when encoding a model.
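As an illustration of how a MODEL specification combines category weights, the
following Python sketch evaluates a model for a single cell of the table. It is
not part of CONJOINT: the function name evaluate_model, the weight tables and
the cell tuple are hypothetical, and ordinary arithmetic precedence ('*' before
'+' and '-') is assumed, since the text does not say how mixed operators are
grouped. Each name in the model is taken to refer, in order, to the next facet
of the table, so that a repeated name draws on the same set of weights.

```python
import re

def evaluate_model(model, weights, cell):
    """Combine category weights for one cell according to a MODEL string.

    model   -- e.g. "A + B + C" or "A * B * B" (names A-E, operators + - *)
    weights -- dict mapping a facet name to its list of category weights
    cell    -- 0-based category index for each facet of the table, in order
    Assumption: ordinary precedence, i.e. products are formed before sums.
    """
    compact = model.replace(" ", "")
    if not re.fullmatch(r"[A-E]([+\-*][A-E])*", compact):
        raise ValueError("expected facet names A-E joined by +, - or *")
    position = 0  # which facet of the table the next name refers to
    total = 0.0
    # Split the model into signed terms; each term is a product of weights.
    for sign, term in re.findall(r"([+-]?)([A-E](?:\*[A-E])*)", compact):
        product = 1.0
        for name in term.split("*"):
            product *= weights[name][cell[position]]
            position += 1
        total += -product if sign == "-" else product
    return total

# Additive model for three facets with 2, 3 and 4 categories.
additive = {"A": [1, 2], "B": [10, 20, 30], "C": [100, 200, 300, 400]}
print(evaluate_model("A + B + C", additive, (1, 2, 0)))

# Multiplicative model in which the third facet reuses the weights of B.
shared = {"A": [1.0, 2.0], "B": [0.5, 1.5, 3.0]}
print(evaluate_model("A * B * B", shared, (1, 0, 2)))
```

In the second call the second and third facets both look up their weight in
the list stored under B, which is one way of reading "the third is a subset of
the second".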
The coding of categories
The numbers of categories in each of the facets (and thus the dimensions of
the input array) are given by the parameters A-FACET, B-FACET, C-FACET,
D-FACET and E-FACET in the PARAMETERS command. No more than five facets are
allowed. The argument to each of these parameters is the number of categories
in the corresponding facet, e.g.
PARAMETERS A-FACET(2), B-FACET(3), C-FACET(4)
Note that the hyphen is a significant character!
If subsetting is involved, then A-FACET refers to the first facet, B-FACET to
the second etc., regardless of the actual names given in the MODEL
specification.
The program finds the monotone transformation of the data (d[0]) which is as
close as possible (in a least squares sense) to a set of values (d) which
conform to the requirements of the composition function specified. This is
analogous in the basic model of MDS to the set of fitting values which
approximate the actual distances in the solution space.
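A least-squares monotone fit of this kind is commonly computed by monotone
(isotonic) regression using the pool-adjacent-violators method. The text does
not say which algorithm CONJOINT uses, so the Python sketch below is only an
illustration of the idea. The name monotone_fit is hypothetical, an increasing
relation between data and model values is assumed, and tied data values are
simply left in input order.

```python
def monotone_fit(data, model_values):
    """Least-squares non-decreasing fit to model_values, ordered by data.

    Returns one fitting value per cell, in the original cell order.
    Assumption: larger data values should receive larger fitting values.
    """
    # Visit the cells from smallest to largest data value (stable for ties).
    order = sorted(range(len(data)), key=lambda i: data[i])
    blocks = []  # each block holds [sum of model values, number of cells]
    for i in order:
        blocks.append([model_values[i], 1])
        # Pool neighbouring blocks while their means violate monotonicity.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    # Every cell in a block receives the block mean.
    fitted_sorted = []
    for total, count in blocks:
        fitted_sorted.extend([total / count] * count)
    fitted = [0.0] * len(data)
    for rank, i in enumerate(order):
        fitted[i] = fitted_sorted[rank]
    return fitted

print(monotone_fit([1, 2, 3, 4], [1.0, 3.0, 2.0, 4.0]))
# → [1.0, 2.5, 2.5, 4.0]
```

The second and third cells are out of order with respect to the data, so they
are pooled and both receive their mean.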
Tied values
Two ways of treating tied data values are recognised in the CONJOINT
procedure, specified using TIES in the PARAMETERS command:
TIES (1) - In the primary approach, ties in the data are broken in the
fitting values if, in doing so, STRESS is reduced. This option places little
or no importance on the appearance of ties.
TIES (2) - By contrast, the secondary approach regards the information on
ties as important and requires that tied data values are fitted by equal
fitting values.
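The difference between the two approaches can be sketched in Python as
follows. This follows the usual textbook treatment of ties in monotone
regression and is an assumption about, not a description of, the CONJOINT
code. The names fit_with_ties and _pava are hypothetical.

```python
def _pava(values, weights):
    """Weighted least-squares non-decreasing fit to values, in given order."""
    blocks = []  # [weighted sum, total weight, number of items]
    for value, weight in zip(values, weights):
        blocks.append([value * weight, weight, 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, weight_sum, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += weight_sum
            blocks[-1][2] += count
    fitted = []
    for total, weight_sum, count in blocks:
        fitted.extend([total / weight_sum] * count)
    return fitted

def fit_with_ties(data, model_values, ties=1):
    """Monotone fit with TIES(1) primary or TIES(2) secondary treatment."""
    n = len(data)
    fitted = [0.0] * n
    if ties == 1:
        # Primary: within a run of tied data, order the cells by model value,
        # so that tied data may receive different fitting values.
        order = sorted(range(n), key=lambda i: (data[i], model_values[i]))
        result = _pava([model_values[i] for i in order], [1] * n)
        for rank, i in enumerate(order):
            fitted[i] = result[rank]
        return fitted
    # Secondary: one fitting value per distinct data value. Each tie group is
    # replaced by its mean, weighted by the size of the group.
    groups = {}
    for i in range(n):
        groups.setdefault(data[i], []).append(i)
    keys = sorted(groups)
    means = [sum(model_values[i] for i in groups[k]) / len(groups[k])
             for k in keys]
    sizes = [len(groups[k]) for k in keys]
    for key, value in zip(keys, _pava(means, sizes)):
        for i in groups[key]:
            fitted[i] = value
    return fitted

data = [1, 2, 2, 3]
model_values = [1.0, 4.0, 2.0, 3.0]
print(fit_with_ties(data, model_values, ties=1))  # → [1.0, 3.5, 2.0, 3.5]
print(fit_with_ties(data, model_values, ties=2))  # → [1.0, 3.0, 3.0, 3.0]
```

Under the primary approach the two cells with data value 2 receive different
fitting values; under the secondary approach they are forced to be equal.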
CONJOINT treats each facet as being a categorical or nominal scale, and
estimates an interval-level weight for each category of each facet. Note that
if the categories happen to be ordered (say, High, Medium and Low Status)
there is nothing in the procedure which will guarantee that the category
weights will be similarly ordered.
Replications
Users may wish to analyse by the same model a number of replications of the
same study. Such a study is signalled to the program by means of the REPLICATIONS
parameter. This specifies the number of sets of data, not
the number of actual replications, i.e. if you have an original study and two
follow-ups then the correct coding is REPLICATIONS (3).
If a replicatory study provides data on only a subset of the original
variables, then it is suggested that the study be coded as a replication with
MISSING DATA values inserted at the appropriate places in the data matrix.
CRITERION
If the improvement in stress between iterations is less than the value
specified in the CRITERION parameter, then the process is stopped and the
current values are output as the solution.
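The stopping rule can be sketched in Python as below. The function
run_until_converged and the sequence of stress values are hypothetical and
serve only to illustrate the rule. The max_iter guard belongs to the sketch
and does not correspond to any CONJOINT parameter.

```python
def run_until_converged(step, initial_stress, criterion=0.00001,
                        max_iter=1000):
    """Call step() until stress improves by less than criterion.

    step           -- hypothetical function doing one iteration and
                      returning the new stress value
    initial_stress -- stress before the first iteration
    Returns (number of iterations performed, final stress).
    """
    previous = initial_stress
    for iteration in range(1, max_iter + 1):
        current = step()
        if previous - current < criterion:
            return iteration, current
        previous = current
    return max_iter, previous

# Invented stress values standing in for successive iterations.
values = iter([0.4, 0.2, 0.19, 0.189999])
print(run_until_converged(lambda: next(values), initial_stress=1.0))
# → (4, 0.189999)
```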
The program begins the iterative process by assigning to each of the
parameters a pseudo-randomly-generated value. The starting 'seed' for the
pseudo-random number generator is specified by RANDOM in the PARAMETERS
command. Retaining the same value produces the same results on repeating the
analysis for the same data.
The procedure minimises STRESS by adjusting these initial, pseudo-random
values. Since random starts are prone to the problem of local minima, it is
suggested that the user make a number of runs using the same data but
different starting values. This is done automatically within one run of
CONJOINT by means of the keyword RESTARTS in the PARAMETERS command. The
number specified in this parameter should be the number of different starts
required. The appearance of a number of highly similar (or identical)
solutions is an indication, though not a proof, that a global minimum has
been found.
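The role of the seed and of repeated starts can be illustrated with a small
Python sketch. Everything in it is a stand-in: toy_stress is an invented
one-parameter function with two local minima, descend is a plain gradient
descent, and best_of_restarts is a hypothetical name. None of this reproduces
the CONJOINT algorithm. The arguments seed and restarts merely mimic the
RANDOM and RESTARTS parameters.

```python
import random

def toy_stress(x):
    """Invented objective with a local and a global minimum."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def descend(x, step=0.01, iterations=2000):
    """Plain gradient descent on toy_stress from the starting value x."""
    for _ in range(iterations):
        gradient = 4.0 * x * (x * x - 1.0) + 0.3
        x -= step * gradient
    return x

def best_of_restarts(seed=12345, restarts=5):
    """Run several seeded random starts and keep the lowest stress."""
    rng = random.Random(seed)  # the same seed gives the same starts
    results = []
    for _ in range(restarts):
        start = rng.uniform(-2.0, 2.0)
        x = descend(start)
        results.append((toy_stress(x), x))
    return min(results), results

best, all_runs = best_of_restarts(seed=12345, restarts=5)
for stress, x in all_runs:
    print(round(stress, 5), round(x, 4))
print("best:", best)
```

Runs that end at the same stress and the same value of x correspond to the
"highly similar (or identical) solutions" mentioned above. Repeating the call
with the same seed reproduces the results exactly.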
PARAMETERS
Keyword        Value      Function
TIES           1          1: Primary approach
                          2: Secondary approach
REPLICATIONS   1          Sets number of data-sets for replicated studies.
RANDOM         12345      Seed for pseudo-random-number generator
RESTARTS       1          Sets the number of times the program will restart
                          analysis using different random starts.
A-FACET        1          Sets the number of categories in each FACET.
B-FACET
C-FACET
D-FACET
E-FACET
CRITERION      0.00001    Sets stopping value for stress.
NOTES
1. The command MODEL is obligatory for CONJOINT.
2. The following are not valid: READ CONFIG, ITERATIONS, N OF
STIMULI, N OF SUBJECTS.
3. The program expects as input integer (I-type) variables. The INPUT FORMAT
specification, if used, should take account of this, and should read one row of
the data. Otherwise, free format data input is assumed.
4. The data for CONJOINT are input as a rectangular array of integers in
which the first facet is that associated with the fastest-running subscript.
Consider first the two-facet case. If facet A has five categories and facet B
has three, then the input array will have five columns and three rows (NOT
five rows and three columns). If a third facet C were added, which had two
categories, then two such 3 x 5 arrays would be input (six rows in all, each
of five columns). A fourth facet with four categories would result in four
such blocks, i.e. twenty-four rows in all. The data follow without separation.
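The layout described in note 4 can be checked with a short Python sketch. The
function names rows_expected and cell_index are hypothetical. The sketch
assumes that, after facet A, the subscripts run in the order B, C, D, E from
fastest to slowest, which is consistent with the blocks described above.

```python
def rows_expected(facet_sizes):
    """Number of input rows: product of all facet sizes except the first."""
    rows = 1
    for size in facet_sizes[1:]:
        rows *= size
    return rows

def cell_index(row, column, facet_sizes):
    """Map a 0-based (row, column) position to 0-based category indices.

    Facet A runs along each row. Among the rows, facet B varies fastest,
    then C, and so on (an assumption consistent with note 4).
    """
    indices = [column]
    for size in facet_sizes[1:]:
        indices.append(row % size)
        row //= size
    return tuple(indices)

print(rows_expected([5, 3]))        # → 3
print(rows_expected([5, 3, 2]))     # → 6
print(rows_expected([5, 3, 2, 4]))  # → 24

# Fifth row, third column of a 5 x 3 x 2 table: category 3 of A,
# category 2 of B and category 2 of C when counted from 1.
print(cell_index(4, 2, [5, 3, 2]))  # → (2, 1, 1)
```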
PRINT options (to main output file)
Option     Description
TABLES     Two matrices are output:
           1. the matrix of fitting-values;
           2. the solution matrix.
           Both, of course, will be in the same order as the input data.
HISTORY    An extended history of the iterative process.
SOLUTION
By default, only the SOLUTION will be output along with the final stress
value.
PLOT options (to main output file)
Option     Description
STRESS     A histogram of STRESS at each iteration is produced.
SHEPARD    A Shepard diagram plotting data against solution is produced,
           and the fitted values indicated.
RESIDUALS  A histogram of residual values, natural and logarithmic, is
           produced.
A Shepard diagram is produced by default.
PUNCH options (to secondary output file)
Option     Description
SPSS       The following values are output: I,J,K,L,M (being indices of the
           five possible facets), DATA, FITTING, SOLUTION, RESIDUALS, being
           the corresponding values in a fixed format.
FINAL      Outputs final solution
STRESS     Outputs STRESS values by iteration.
By default, no secondary output file is produced.
PROGRAM LIMITS
Maximum no. of facets = 100
Maximum no. of scale values = 500
Maximum no. of replications = 2500
See also