Mode Choice Models: Bespoke and Transferred
TAG Unit 3.11.3
January 2006
Contents
1. Mode Choice Models - A Decision Making Framework
1.1 Introduction
2. Issues for Consideration
2.1 Introduction
2.2 Necessary model components
2.3 Policy responses to be measured in the mode choice model
2.4 Available data
2.5 New Revealed Preference (RP) data
2.6 Advice on whether to develop or transfer models
3. Introduction to Bespoke Mode Choice Models
3.1 Introduction
3.2 The purpose of bespoke mode choice models
4. Model Design
4.1 High-level considerations
4.2 Lower-level considerations
5. Model Development
5.1 Logit model
5.2 Specifying the utility function
5.3 Introducing socio-economic variables
5.4 Alternative-specific constants
5.5 Functional form
5.6 Defining the choice set
5.7 Independence from Irrelevant Alternatives (IIA)
5.8 Maximum likelihood estimation
5.9 Preliminary interpretation
5.10 Further interpretation and diagnostic testing
5.11 Validation
5.12 Nested logit
5.13 Estimating and interpreting nested logit
6. Data Collection
6.1 Revealed Preference and Stated Preference
6.2 SP design methods
6.3 Choice set
6.4 Response method
6.5 Number of alternatives
6.6 Number of replications
6.7 Task complexity
6.8 Which attributes?
6.9 Number of attributes
6.10 Units of measurement
6.11 Numbers of levels of attributes
6.12 Selection of values for levels of attributes
6.13 Combining the attribute levels: orthogonality
6.14 Combining the attribute levels: non-orthogonality
6.15 Combining the attribute levels: boundary values
6.16 Realism
6.17 Testing the design using simulation
6.18 Questionnaire design and implementation
6.19 Staging
6.20 Background information
6.21 Means of presentation
6.22 Interception
6.23 Response rates
6.24 Preamble to questionnaire
6.25 Focus groups
6.26 Pre-pilot survey
6.27 Pilot survey
6.28 Field survey
6.29 Cleaning
6.30 Merging SP data
6.31 Combining RP and SP data
6.32 Merging RP and SP data
6.33 Repeat measurements
6.34 Sampling
7. Model Application
7.1 Introduction to model application
7.2 Sample enumeration
7.3 Adjusting the ASCs
7.4 Forecasting
7.5 Patronage build-up
8. Model Outputs and Use in Appraisal
8.1 Introduction
8.2 Spatial detail
8.3 Segmentation by purpose
8.4 Segmentation by person-type
8.5 Choice set
8.6 Generalised costs
8.7 Time
8.8 Outputs for TUBA
9. Documentation
9.1 Model design
9.2 Data collection
9.3 Model development
9.4 Model output
10. Estimation of Transferred Models
10.1 Introduction
10.2 Importing model parameters
10.3 Importing model parameters: Recalibration with disaggregate and semi-aggregate RP data
10.4 Importing model parameters: Recalibration with aggregate RP data
10.5 Transfer of model systems
10.6 Model validation
11. Further Information
12. References
13. Document Provenance
1. Mode Choice Models - A Decision Making Framework
1.1 Introduction
1.1.1 This TAG Unit provides advice for modelling and appraisal of major public transport schemes. It should be read in conjunction with Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1), Road Traffic and Public Transport Assignment Modelling (Unit 3.11.2) and Forecasting and Sensitivity Tests for Public Transport Schemes (Unit 3.11.4).
1.1.2 It has three objectives. First, it provides guidance on the decision of when it may be appropriate to develop a bespoke mode choice model or transfer a demand model and/or any of its components. Second, it provides guidance on the process of developing bespoke mode choice models. Third, it provides advice on procedures for transferring mode choice models where this is appropriate.
2. Issues for Consideration
2.1 Introduction
2.1.1 In general, two approaches for travel demand model development can be considered. In one approach, a bespoke model can be developed from local data, using statistical estimation procedures. In the other approach specific model parameters, or in some cases an entire model structure, can be transferred from elsewhere; in this case adjustments will be required so that the model reflects local behaviour.
2.1.2 In fact, every model development project represents a transfer to some extent, in the sense that the modeller's concepts, developed in other areas, are applied in the area of interest. In most cases, model structures (such as the use of a four-stage procedure) and concepts (like generalised cost) are transferred without considering that these are transfers at all.
2.1.3 In general, bespoke modelling does not require pre-conceptions about the structure of the choice model. Alternative structures can be examined using the available dataset and compared in statistical terms in order to select the one that performs best.
2.1.4 In contrast, transferred models require a priori assumptions about the model structure and parameter values. The latter could be defined exogenously, for example published values from recognised sources, e.g. values of time from Values of Time and Vehicle Operating Costs (Unit 3.5.6), lambdas from Variable Demand Modelling - Key Processes (Unit 3.10.3), or imported from other studies.
2.1.5 Many models may fall somewhere between these two types, depending on how the 'imported parameters' are obtained, and the extent to which local data is available to calibrate the model. When we refer to model calibration, we refer specifically to adjustments that will be required to ensure that:
- the aggregate mode shares are replicated; and
- the sensitivity of the model (i.e. the scale) is correct.
As part of the calibration procedure, it is important that both the observed mode shares and model scale are replicated.
2.1.6 This TAG Unit provides advice to guide the analyst about the appropriateness of transferring or developing a bespoke model for their specific circumstances, focusing on:
- the overall model structure;
- policy variables to be included in the mode choice model;
- available data;
- the size of the scheme and the time and cost budget for the study.
2.2 Necessary model components
2.2.1 The model components that are necessary for forecasting and appraising the effects of a scheme will influence the decision of whether a bespoke model is required or whether a model or parts of that model can be transferred from elsewhere.
2.2.2 In the case of car ownership and trip generation, models and forecasts are provided in the DfT's TEMPRO system. This information provides a basis for forecasting for these components that is generally both transferable and defensible. Although it may be desirable in some circumstances to set up local models for these components, it would be necessary to justify the deviation from TEMPRO forecasts, and compare results against that base.
2.2.3 For structures that require a mode choice model only, the appropriate type of model, i.e. bespoke or one based on a transferred model or model parameters, may also depend on:
- the size of the scheme;
- the degree of socio-economic segmentation required to adequately reflect traveller behaviour;
- any particularities in the area and the type of policies that are to be evaluated.
2.2.4 It is unlikely that anything other than bespoke models should be developed for appraisal of:
- very large public transport schemes;
- schemes which require rich socio-economic segmentation for appraisal;
- areas where traveller behaviour, e.g. in terms of values of time, is substantially different from national norms.
2.2.5 Whether an entire model structure from another area could be transferred depends on the following considerations:
- there is a relevant model to transfer, with appropriate segmentation and behavioural responses;
- the quality of the model considered for transfer is high (based on analysis of the significance of model coefficients, results of validation tests, etc.);
- the age of the model considered for transfer;
- the areas and zone systems are broadly similar;
- local data, preferably disaggregate, are available (or could be collected) and are compatible with the original model;
- the network descriptions between the original model and area of transfer are broadly similar;
- the mode shares and trip lengths between the original model and area of transfer are generally compatible.
2.2.6 In the case of a transfer, the model structure may be constrained to the structure adopted in the original modelling.
2.2.7 For medium and smaller sized schemes, it may be appropriate to construct simple (transferred) mode choice models based on model parameters imported from other sources, assuming that appropriate parameters are available. In these cases, calibration of the model to replicate observed mode shares and model scale will still be required.
2.2.8 Transfer of joint mode and destination choice models is possible, if the considerations raised in 2.2.5 are met.
2.3 Policy responses to be measured in the mode choice model
2.3.1 Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1) advises on necessary model components linked to policies to be tested, and particularly policy responses that are expected. Table 1 below identifies the suitability of a transferred mode choice model for appraisal of specific policy improvements, depending on the expected responses from existing car and public transport users.
Table 1: Suitability of Transferred Mode Choice Models for Specific Policy Improvements
| Policy | Expected passenger transfer from car? | Expected passenger transfer from other PT modes? | Recommended model type |
|---|---|---|---|
| PT performance improvements for existing PT modes and conventional variables | No | No | No model required. |
| PT performance improvements for existing PT modes and conventional variables | No | Yes | Allocation of shares could be done through a mode choice model or assignment. Transferred model appropriate for smaller and medium sized schemes; bespoke model required for large schemes. |
| PT performance improvements for existing PT modes and conventional variables | Yes | Perhaps | Mode choice model required. Transferred model appropriate, depending on the scale of the scheme improvements and variables to be considered. Large scheme improvements require bespoke modelling. |
| New modes or new use of existing modes | Yes | Yes | Mode choice model required. Bespoke model required. |
| Car congestion reduction | Yes | N/a | Mode choice model required, with detailed highway assignment. Transferred model appropriate, depending on the scale of the scheme improvements and variables to be considered. |
| Reliability, Quality, Integration | No | No | No model required. |
| Reliability, Quality, Integration | No | Yes | Allocation of shares could be done through a mode choice model or assignment. Model transfer appropriate. |
| Reliability, Quality, Integration | Yes | Perhaps | Mode choice model required. Transferred model may be appropriate, depending on the scale of the scheme improvements and variables to be considered. |
| Park & Ride | Yes | Yes | Mode choice model required, including access mode choice. Transferred model may be appropriate: may need to consider bespoke modelling, depending on the extent of the proposed system. |
2.3.2 Typically public transport schemes will have more than one policy improvement objective. In these cases, the most demanding modelling requirements of the relevant objectives should be met.
2.4 Available data
2.4.1 Bespoke modelling and model transfers have different data requirements.
2.4.2 The data requirements for transferred models are typically lighter than for bespoke models, because statistically reliable behavioural parameters do not need to be estimated. Aggregate data will be required for model calibration, e.g. of constants. Additionally, either semi-aggregate or disaggregate data are required for calibration of the model scale, with disaggregate data being preferred.
2.4.3 There are three types of Revealed Preference (RP) data that could be used for mode choice modelling:
- aggregate data: including information on aggregate mode shares, mode shares by trip length, etc.;
- semi-aggregate data: reflecting proportions of choices made by groups of travellers, typically from matrix data of choices by origin, destination and purpose categories, trip length distributions, etc.;
- disaggregate data: reflecting detailed observations of actual mode choice behaviour, for a sample of travellers, in the relevant study area.
2.4.4 In addition count data can provide information on aggregate mode shares.
2.4.5 For all mode choice models, aggregate RP data are required as a minimum, either for recalibration or validation of mode shares. Information on mode shares by trip length could be used to (manually) calibrate the model scale, but semi-aggregate or disaggregate data provide much better information for this purpose. Data on mode shares by trip length category should therefore be regarded as the minimum information required to calibrate the model scale, and as appropriate only for the development of models for appraisal of small schemes.
2.4.6 Semi-aggregate data have a relevant role both for bespoke modelling and model transfers. In both cases, these data can be used to provide appropriate scale and constant adjustments. For bespoke models, it is recommended that semi-aggregate data be supplemented with additional disaggregate RP or SP data, in order to estimate statistically reliable model parameters.
2.4.7 Disaggregate data is much richer than semi-aggregate data, in terms of its explanatory power for modelling, and is therefore preferable for bespoke modelling (and for calibration of transferred models, although often disaggregate data will not be available, which may be a reason for opting for a model transfer). Disaggregate choice data can be collected from en-route postcard surveys, from home or phone interviews, travel diaries, as well as from existing sources such as the National Travel Survey. Data collected using choice-based survey procedures, for example, interviewing passengers on a bus or train, will lead to samples which are not representative of observed mode shares. Adjustments to take account of these biases must therefore be made in the model estimation procedure.
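One widely used adjustment for a choice-based sample is to weight each observation by the ratio of the population share to the sample share of its chosen mode. The sketch below illustrates that idea only: the guidance does not prescribe a particular adjustment, and the function name, mode labels and shares are hypothetical.

```python
from collections import Counter

def choice_based_weights(sampled_choices, population_shares):
    """Return a weight per mode: population share divided by sample share."""
    counts = Counter(sampled_choices)
    sample_size = len(sampled_choices)
    weights = {}
    for mode, count in counts.items():
        sample_share = count / sample_size
        weights[mode] = population_shares[mode] / sample_share
    return weights

# Hypothetical on-vehicle survey that over-samples bus users.
sampled_choices = ["bus"] * 50 + ["car"] * 50
# Population shares are assumed to be known from aggregate data.
population_shares = {"bus": 0.2, "car": 0.8}

weights = choice_based_weights(sampled_choices, population_shares)

# After weighting, the sample reproduces the population mode shares.
weighted_total = sum(weights[mode] for mode in sampled_choices)
weighted_bus_share = (
    sum(weights[mode] for mode in sampled_choices if mode == "bus") / weighted_total
)
```

The weights would then be applied to each observation's contribution to the likelihood during estimation.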
2.4.8 Stated Preference (SP) data can also play an important role in bespoke modelling, particularly for modelling demand for new modes (see Section 6 for a detailed discussion of the role of SP data).
2.5 New Revealed Preference (RP) data
2.5.1 New RP surveys should collect, as a minimum, the respondents' origin, destination, choice of mode and purpose of travel.
2.5.2 For public transport users, information on licence holding and car availability is important and for users of all modes pass holding or entitlement is valuable. Other background characteristics, for example, age, gender, income, employment status are desirable but may not be possible to collect in the context of the survey.
2.5.3 For mode choice models, observations of people choosing the specified modes are required for all (existing) modes considered in the study.
2.6 Advice on whether to develop or transfer models
2.6.1 Bespoke models are more costly to develop than transferred models, because of the data that is required to estimate statistically reliable behavioural parameters. The time required to estimate these models will also be more extensive than that required for model transfer, leading to higher model development costs.
2.6.2 Bespoke modelling will be necessary in the following situations:
- for appraisal of new modes or new characteristics, e.g. reliability changes;
- for appraisal of schemes in areas where traveller behaviour, e.g. values of time, may be substantially different from national norms or values from other existing models.
The transfer of an entire model system from another area could be considered, but only if the conditions in Section 2.2.5 are met.
2.6.3 For medium and smaller sized schemes, it may be appropriate to construct (transferred) mode choice models based on model parameters imported from other sources, assuming that appropriate parameters are available. In these cases, calibration of the model to replicate observed mode shares and model scale will still be required. Semi-aggregate or disaggregate data will be required for calibration of the model scale, with disaggregate data being preferred. Aggregate data will be required for calibration of the mode-specific constants.
2.6.4 In general, it is difficult to generalise about the reliability of estimates produced from bespoke or transferred models, because the quality of a model depends on many issues besides whether it is bespoke or transferred: for example, the amount (and quality) of data for model estimation, the level of geographical segmentation of the model area and the degree of socio-economic segmentation. It is therefore advised that model validation is undertaken whether bespoke or transferred models are used, including examination of coefficient ratios (e.g. implied values of time), trip lengths, time and cost elasticities, and realism tests. The results of the model validation should be adequately documented.
3. Introduction to Bespoke Mode Choice Models
3.1 Introduction
3.1.1 This and the following Sections of this Unit (Sections 4 to 9) provide guidance on the procedures and documentation required in the development of bespoke mode choice models. The material covered:
- explains the purpose of bespoke mode choice models;
- provides guidance on model design;
- reviews the stages of data collection;
- gives guidance on model estimation, application and validation; and
- sets out the required documentation for audit.
3.2 The purpose of bespoke mode choice models
3.2.1 Bespoke mode choice models are used to provide forecasts of passenger demand for transport services. They are typically applied where transport services are new, such as the introduction of a light rail system.
3.2.2 Bespoke specification involves developing a new model specific to the context of interest and estimating local parameters. As discussed above, this contrasts with a transferred specification, which involves importing parameters from elsewhere (whether standard values, or valuations identified in previous studies - see Section 10 below).
3.2.3 The guidance that follows is specific to bespoke mode choice models, and considers the specification and estimation of such models, as well as the design and implementation of associated data collection.
3.2.4 It is noted at the outset that the development of a mode choice model is a specialist activity, often requiring the creative contributions of a skilled analyst, but within a structured framework defined by good practice. The objective of this TAG Unit is to provide guidance on the procedures, testing and documentation that would be required for an analyst to demonstrate adherence to good practice. Whilst identifying many elements of good practice, this guidance should not in itself be considered a detailed technical guide to the development of mode choice models.
4. Model Design
4.1 High-level considerations
4.1.1 The choice of modelling approach will depend on a number of potentially conflicting factors:
- the nature of identified problems and their likely solutions;
- the definition and size of the study area;
- the likely number of options to be tested;
- the availability of data and existing models;
- the need to update and (re)calibrate models;
- the need to conduct new surveys;
- the timescale for model development; and
- the required accuracy and robustness of results/recommendations.
4.1.2 As well as guiding strategic-level decisions of the analyst such as sub-mode choice vs. public transport assignment and bespoke vs. transferred, the above considerations should be borne in mind when formulating the precise specification of the bespoke mode choice model - if this is the chosen approach - as well as any associated data collection.
4.1.3 An important strategic question should be one of 'what is required from the model?' Indeed 'fitness for purpose' should be a guiding principle throughout the design process. A number of sub-questions follow.
4.1.4 Is the interest restricted to existing modes, or is consideration of a new mode required? If there were interest in new modes, this would usually imply a need for a bespoke mode choice model, as well as a need for new data collection.
4.1.5 What relevant data already exists? Following from the above, it is important to establish what existing sources of revealed preference (RP), or even historical stated preference (SP) data, might usefully enhance model development. This includes not only disaggregate choice data but also aggregate planning data and traffic count data. The availability of such data may impact on the scale and scope of any new data collection, the specification of the model (for example, whether the model should accommodate a range of data sources, the extent of segmentation, and the nature of any validation procedure), and the reliability of the model in application.
4.1.6 Do similar studies exist, and what methodologies were employed? Further to 4.1.5, analysis of similar studies may yield not only relevant data sources, but also valuable insight into how the mode choice model and data collection might be specified. The analyst should identify and review such studies, and justify the chosen methodology against these.
4.1.7 What resources, in terms of both time and money, are available for mode choice research? Whatever the scheme or policy under investigation, the analyst should always ensure that the budget designated for data collection and mode choice modelling, and the timescale for such activities, is commensurate with the nature and complexity of the problem, and the likely scale and extent of impacts.
4.1.8 In demonstrating that the analysis undertaken follows good practice, the analyst should ensure that the final report to the client adheres to the audit trail detailed in section 9. As a supplement to this report, all questionnaire materials, data sets, and analytical tools (including model command files) should be made available to the client, if requested.
4.2 Lower-level considerations
4.2.1 Having considered the high-level issues, preliminary model specification should move on to address a series of lower-level issues. Each of these issues is fundamental to model specification and, it follows, data requirements. These issues could have a significant impact on resource needs in analysis.
4.2.2 What is the relevant unit of decision-maker? For mode choice, this will usually be the individual, although the travelling party may be relevant in some cases. The advantage of the latter is that car cost is accounted for correctly.
4.2.3 What is the choice set? The analyst should identify the alternatives of interest, including any new modes, and take an initial view on the availability of the complete choice set to decision-makers, as well as the extent of any captivity to alternatives.
4.2.4 What are the key behavioural variables of interest? As well as standard variables such as time and cost, the analyst should consider the relevance of 'softer' variables such as crowding/congestion, quality and reliability. This will be dictated by the nature of the scheme or policy under investigation, and the likely scale and scope of its impacts. For example, an interest in road pricing would usually imply a need to investigate the prevalence of congestion effects. Where such variables are of interest, advice should be sought from Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1) and Road Traffic and Public Transport Assignment Modelling (Unit 3.11.2).
4.2.5 What are the key policy variables of interest? These might include fares, service frequency, reliability, accessibility, and quality in public transport modes, and road pricing and parking for cars.
4.2.6 What are the key socio-economic variables of interest? These will usually include age, sex, employment status, income, and car ownership.
5. Model Development
5.1 Logit model
5.1.1 Mode choice models, as conventionally specified, are based on the behavioural principle that a decision-maker will choose the travel mode that yields greatest satisfaction or 'utility'.
5.1.2 Utility is postulated to comprise both observable (or deterministic) utility and unobservable (or random) utility. Specifically:
$$U_{in} = V_{in} + \varepsilon_{in}$$
where $U_{in}$ is the utility derived from alternative i by decision-maker n, $V_{in}$ is the corresponding deterministic utility, and $\varepsilon_{in}$ is the associated random utility.
5.1.3 For purposes of implementation, a specific model form should be adopted. Although there exist a range of options, model development should always commence with the logit form. Logit offers substantial versatility; indeed it will be sufficient for many needs. Where more complex forms are deemed necessary, logit offers a valuable benchmark for comparison.
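5.1.4 Logit relates the probability of choosing alternative i from J alternatives to deterministic utility as follows:
$$P_{in} = \frac{\exp(\mu V_{in})}{\sum_{j=1}^{J} \exp(\mu V_{jn})} \qquad (5.1)$$
where $\mu$ is a strictly positive scale parameter.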
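Equation (5.1) can be sketched in a few lines of code, assuming the standard logit form in which the probability of an alternative is proportional to the exponential of its scaled deterministic utility. The function name and the utility values are hypothetical and serve only as an illustration.

```python
import math

def logit_probabilities(utilities, scale=1.0):
    """Return logit choice probabilities for a list of deterministic utilities."""
    # Subtracting the maximum utility avoids overflow and leaves the
    # probabilities unchanged, because only utility differences matter.
    v_max = max(utilities)
    exp_terms = [math.exp(scale * (v - v_max)) for v in utilities]
    total = sum(exp_terms)
    return [term / total for term in exp_terms]

# Hypothetical deterministic utilities for car, bus and rail.
probs = logit_probabilities([-1.0, -1.5, -2.0])

# Alternatives with equal utility receive equal probability.
equal_probs = logit_probabilities([-2.0, -2.0, -2.0])
```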
5.1.5 In the context of mode choice, convention is to reinterpret utility as 'generalised cost', which is essentially the negative of deterministic utility expressed in monetary units. The methods discussed below are based on the construct of utility, since this typically includes additional variables - such as ones relating to the decision-maker - that may be difficult to translate into generalised cost terms.
5.1.6 With reference to equation (5.1), the scale parameter $\mu$ is inversely related to the variance of random utility (or 'error') as follows:
$$\mu^2 = \frac{\pi^2}{6\sigma^2}$$
where $\sigma^2$ is the variance of random utility. The amount of error has important implications for the properties of the model. All else equal, the greater the error, the smaller the scale parameter and the closer the choice probabilities will tend to 1/J for all J alternatives. This issue is known as the 'scale factor problem' and is of particular relevance when estimating models to SP data, which may contain biases and errors typically not found in RP data. Since it cannot in practice be estimated separately from the utility parameters $\beta$, $\mu$ is commonly taken to be one.
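The effect of error on the choice probabilities can be illustrated numerically. The sketch below assumes the standard logit relationship between the scale parameter and the standard deviation of random utility, scale = pi / (sqrt(6) x sigma); the utilities and sigma values are hypothetical.

```python
import math

def scale_from_sigma(sigma):
    """Scale parameter implied by the standard deviation of random utility."""
    return math.pi / (math.sqrt(6.0) * sigma)

def logit_probabilities(utilities, scale):
    """Logit probabilities for deterministic utilities and a given scale."""
    exp_terms = [math.exp(scale * v) for v in utilities]
    total = sum(exp_terms)
    return [term / total for term in exp_terms]

# Hypothetical deterministic utilities for J = 3 alternatives.
utilities = [0.0, -1.0, -2.0]

# Small error: choice is dominated by deterministic utility.
low_error = logit_probabilities(utilities, scale_from_sigma(0.5))

# Large error: probabilities drift towards 1/J for every alternative.
high_error = logit_probabilities(utilities, scale_from_sigma(50.0))
```

With the larger error, the same utility differences discriminate far less between the alternatives.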
5.1.7 It is differences in deterministic utility across alternatives that influence probability - not absolute utility. The relationship of utility difference to logit probability is sigmoid (Figure 1). Thus, if an alternative has an extreme probability (whether high or low), a small change in utility difference will have little impact on probability of choice, whereas if an alternative has a probability close to 0.5, the same change in utility difference will have considerably greater impact on probability.

Figure 1: Plot of logit probability against utility difference
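The sigmoid relationship can be checked with a small binary example. This is a sketch with the scale parameter taken as one; the utility differences and the size of the change are hypothetical.

```python
import math

def binary_logit(utility_difference):
    """Probability of choosing an alternative given its utility advantage."""
    return 1.0 / (1.0 + math.exp(-utility_difference))

def change_in_probability(utility_difference, delta=0.1):
    """Change in probability caused by the same small change in utility difference."""
    return binary_logit(utility_difference + delta) - binary_logit(utility_difference)

# Starting from a probability of 0.5 (utility difference of zero).
change_at_middle = change_in_probability(0.0)

# Starting from an extreme probability (utility difference of four).
change_at_extreme = change_in_probability(4.0)
```

The same change in utility difference moves the probability much more in the middle of the curve than in its tails.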
5.2 Specifying the utility function
5.2.1 An important practical issue is the specification of $V_{in}$. This is typically represented as a function of observed variables relating to the alternative and the decision-maker.
5.2.2 As regards functional form, linear-in-parameters is sufficient for most needs. Indeed, under fairly general conditions, any function can be approximated arbitrarily closely by the linear-in-parameters form.
5.2.3 Variables relating to alternatives may be entered in the function 'directly', as follows:
$$V_{in} = \sum_{k} \beta_k x_{ink}$$
where $x_{ink}$ are observations relating to the kth variable (or 'attribute') of decision-maker n and alternative i, and the $\beta_k$ are associated parameters. For example, if there were interest in the effect of time (T) and cost (C) on choice, an appropriate representation would be as follows:
$$V_{in} = \beta_T T_{in} + \beta_C C_{in} \qquad (5.2)$$
where $\beta_T$ and $\beta_C$ are utility or 'taste' parameters relating to time and cost, respectively. Since both time and cost are, in terms of utility, perceived as 'bad', $\beta_T < 0$ and $\beta_C < 0$.
5.2.4 The parameters in equation (5.2) are shown to be 'generic' across choice alternatives. Thus attributes that are common to alternatives are specified as having common parameters, such that estimates of these parameters will be averages across the data. This is not however a requirement; whilst convention is to specify cost as generic, other parameters may be specified not only as mode-specific, but also as person-specific, depending on the focus of interest. Relaxing the assumption of generic parameters allows for different values of time for different modes, people, or both modes and people, for example.
5.2.5 Analysts should adopt the standard of expressing time in minutes and cost in pence. Furthermore, care should be taken to express each in terms of single trip units; return trip attributes such as parking charges should be halved.
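A minimal sketch of a linear-in-parameters utility function follows, with time in minutes, cost in pence and a return-trip parking charge halved to single trip units. The parameter values are hypothetical and are not estimates from any study; the ratio of the time and cost parameters gives the implied value of time in pence per minute.

```python
# Hypothetical generic parameters (both negative, as time and cost are 'bad').
BETA_TIME = -0.05  # utility per minute
BETA_COST = -0.01  # utility per pence

def single_trip_utility(time_minutes, cost_pence, return_parking_pence=0.0):
    """Deterministic utility of one alternative for a single trip."""
    # Parking is paid once per return trip, so half is attributed to each leg.
    total_cost = cost_pence + return_parking_pence / 2.0
    return BETA_TIME * time_minutes + BETA_COST * total_cost

# Implied value of time in pence per minute.
implied_value_of_time = BETA_TIME / BETA_COST

# Car trip: 20 minutes, 100 pence running cost, 200 pence return parking charge.
car_utility = single_trip_utility(20.0, 100.0, return_parking_pence=200.0)
```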
5.3 Introducing socio-economic variables
5.3.1 Following from 5.1.7, variables that for a given observation are common across alternatives, such as those relating to the decision-maker, should not be entered 'directly' since they have no impact on utility difference (and therefore probability). They must instead be 'interacted' with variable(s) that do vary across alternatives. For example, if there were interest in the influence of age on the responsiveness of choice to cost, (5.2) could be re-written:
$$V_{in} = \beta_T T_{in} + \beta_{Cn} C_{in} \qquad (5.3)$$
where:
$$\beta_{Cn} = \beta_C + Y M_n$$
and $Y$ is the parameter relating to the interaction between cost and age ($M$).
Substituting into (5.3), the interaction of age with cost can be seen clearly:
$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + Y M_n C_{in} \qquad (5.4)$$
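The reason for interacting socio-economic variables can be demonstrated numerically. In the sketch below, age entered 'directly' adds the same amount to every alternative and leaves the probabilities unchanged, whereas age interacted with cost changes them. All parameter values, including the interaction parameter Y, are hypothetical.

```python
import math

def logit_probabilities(utilities):
    """Logit probabilities with the scale parameter taken as one."""
    exp_terms = [math.exp(v) for v in utilities]
    total = sum(exp_terms)
    return [term / total for term in exp_terms]

BETA_TIME = -0.05    # hypothetical utility per minute
BETA_COST = -0.01    # hypothetical utility per pence
Y_AGE_COST = 0.0001  # hypothetical cost-age interaction parameter

# (time in minutes, cost in pence) for car and bus.
alternatives = [(20.0, 300.0), (35.0, 150.0)]

def utilities_direct(age, beta_age=0.02):
    # Age enters every alternative identically, so it cancels in differences.
    return [BETA_TIME * t + BETA_COST * c + beta_age * age for t, c in alternatives]

def utilities_interacted(age):
    # Age modifies the cost parameter, so its effect differs by alternative.
    return [BETA_TIME * t + (BETA_COST + Y_AGE_COST * age) * c for t, c in alternatives]

direct_young = logit_probabilities(utilities_direct(25))
direct_old = logit_probabilities(utilities_direct(65))
interacted_young = logit_probabilities(utilities_interacted(25))
interacted_old = logit_probabilities(utilities_interacted(65))
```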
5.3.2 It should be noted that the parameters of (5.4) are again represented generically. Since logit models are usually estimated on data from a sample of decision-makers (the topic of sampling is considered in Section 6.34), it may be revealing to extend the model to investigate the potential for tastes to vary across decision-makers.
5.3.3 If, for example, there were interest in the distribution of tastes with respect to the interaction between age and cost, (5.4) could be re-specified:
$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + Y_n M_n C_{in}$$
which would yield a separate $Y$ parameter for each decision-maker n.
5.3.4 An alternative, and more efficient, representation would be to segment decision-makers by age group, and represent each group by a dummy variable. For any such variable, dummies should be included explicitly in the model for all but one group, thereby avoiding the 'dummy variable trap'. If, for example, the data were assigned to one of three age groups, dummies for two of the groups should be included in the model, with the third acting as the 'base'. The V function should then be represented as follows, where in this case l = 1,2:
$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + \sum_{l=1}^{2} Y_l D_{ln} C_{in}$$
where $D_{ln}$ is a dummy variable equal to one if decision-maker n belongs to age group l (and zero otherwise), and $Y_l$ is the cost parameter specific to segment l.
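The dummy-variable segmentation can be sketched as follows, assuming three age groups with the oldest acting as the base. The group boundaries and parameter values are hypothetical, and the segment parameters are applied here as increments to a generic cost parameter, which is one possible reading of the specification.

```python
BETA_TIME = -0.05           # hypothetical utility per minute
BETA_COST = -0.01           # hypothetical cost parameter for the base group
Y_SEGMENT = (0.004, 0.002)  # hypothetical cost increments for groups l = 1, 2

def age_group_dummies(age):
    """Return dummies for the two non-base age groups (the base gets zeros)."""
    if age < 30:
        return (1, 0)
    if age < 60:
        return (0, 1)
    return (0, 0)  # base group: no dummy, which avoids the dummy variable trap

def utility(time_minutes, cost_pence, age):
    """Deterministic utility with a segment-specific cost parameter."""
    dummies = age_group_dummies(age)
    cost_parameter = BETA_COST + sum(y * d for y, d in zip(Y_SEGMENT, dummies))
    return BETA_TIME * time_minutes + cost_parameter * cost_pence

utility_base_group = utility(20.0, 100.0, age=70)
utility_young_group = utility(20.0, 100.0, age=25)
```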
5.4 Alternative-specific constants
5.4.1 It is essential to include a constant in the utility function of all but one choice alternative. This constant is referred to as the 'alternative-specific constant' (ASC) or 'mode-specific constant', specifically:
$$V_{in} = \mathrm{ASC}_i + \sum_{k} \beta_k x_{ink}$$
where $\mathrm{ASC}_i$ is the alternative-specific constant relating to alternative i.
5.4.2 The ASC is omitted for one alternative - which becomes the 'base' - again to avoid the 'dummy variable trap'. An ASC can be interpreted as representing the net average effect of omitted variables (relative to the base). The inclusion of ASCs ensures that, when estimated by maximum likelihood, logit is able to replicate the aggregate choice shares.
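The link between ASCs and aggregate shares can be illustrated outside estimation. The sketch below is not maximum likelihood: it uses a simple iterative adjustment, adding the log ratio of observed to predicted share to each constant while keeping the base constant at zero. The sample utilities and observed shares are hypothetical.

```python
import math

def predicted_shares(sample_utilities, ascs):
    """Average logit probabilities over a sample of decision-makers."""
    totals = [0.0] * len(ascs)
    for utilities in sample_utilities:
        exp_terms = [math.exp(v + asc) for v, asc in zip(utilities, ascs)]
        denominator = sum(exp_terms)
        for i, term in enumerate(exp_terms):
            totals[i] += term / denominator
    return [total / len(sample_utilities) for total in totals]

def calibrate_ascs(sample_utilities, observed_shares, iterations=200):
    """Adjust ASCs until predicted shares match observed shares."""
    ascs = [0.0] * len(observed_shares)
    for _ in range(iterations):
        shares = predicted_shares(sample_utilities, ascs)
        ascs = [a + math.log(o / s) for a, o, s in zip(ascs, observed_shares, shares)]
        base = ascs[0]
        ascs = [a - base for a in ascs]  # first alternative is the base
    return ascs

# Hypothetical utilities (excluding constants) for four travellers, three modes.
sample_utilities = [
    [-1.0, -1.5, -2.0],
    [-0.5, -1.0, -2.5],
    [-2.0, -1.0, -1.5],
    [-1.0, -2.0, -1.0],
]
observed_shares = [0.6, 0.3, 0.1]

ascs = calibrate_ascs(sample_utilities, observed_shares)
calibrated_shares = predicted_shares(sample_utilities, ascs)
```

Only differences in the constants matter, which is why one alternative can be held at zero without loss of fit.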
5.5 Functional form
5.5.1 Thus far, the model has adhered to linearity-in-variables as well as linearity-in-parameters. In some cases, non-linear forms may offer additional flexibility. One such case, which may be useful in mode choice studies, would be to incorporate an explicit 'income effect', i.e. the marginal utility of income diminishes with increasing income. Chapter 8 of Ortúzar and Willumsen (2001) provides direction on such specifications.
5.5.2 As will become evident later, the linear form is particularly attractive when it comes to interpretation of the model, although additional flexibility can be achieved by introducing non-linearity. Among the more popular alternatives are the following:

While it is difficult to offer clear prescription, the case for using such forms should be based on a combination of theory (i.e. behavioural rationale) and/or data (i.e. empirical support). One of the more common contexts for non-linear forms is where national-level data (e.g. for trip lengths) may be distinct from local level data.
5.6 Defining the choice set
5.6.1 An important specification task is to define the choice set appropriately. For bespoke
mode choice models this will usually be relatively small and clearly defined, at least in
an aggregate sense. What may be less obvious is the propensity for decision-makers to
consider only a subset of alternatives when actually choosing. There could be any
number of reasons why particular alternatives might not be considered, although by far
the most common issue in mode choice modelling is car availability. It is important to
identify such constraints, and represent the appropriate choice set for each decision-maker
(an 'unavailability of alternatives' command is provided in most software
packages). Although it is necessary, for successful estimation, that at least some
decision-makers choose each choice alternative, it is not a requirement that all
decision-makers have access to the full choice set.
5.6.2 Where choice models are developed to consider the demand implications of a new
mode, there is a range of issues associated with data collection and how the new
mode should be considered during forecasting. As regards the former, a requirement for
an SP experiment would often be implied (guidance is offered in Section 6). As regards
the latter, a particular issue is the specification of ASCs (Sections 5.4 and 7.3).
5.7 Independence from Irrelevant Alternatives (IIA)
5.7.1 In defining the choice set, it should be noted that logit is characterised by the property of
independence from irrelevant alternatives (IIA); that is, for any two alternatives, the ratio
of their choice probabilities is unaffected by the presence or absence of any other
alternatives in the choice set. Where two alternatives in the choice set are closely
related in some sense (i.e. the 'red bus-blue bus' problem), IIA is violated, and the use
of logit is (in principle) inappropriate. It should be noted that the IIA property of the logit
model is evident at the level of the decision-maker and not always present for groups of
decision-makers.
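The IIA property, and the 'red bus-blue bus' problem it gives rise to, can be shown with a few lines of arithmetic. The following is a minimal sketch with hypothetical utilities (all set to zero so that the alternatives are equally attractive); it is not drawn from any particular study.

```python
import math

def logit_probabilities(utilities):
    """Multinomial logit probabilities for a dict {alternative: utility}."""
    exp_v = {alt: math.exp(v) for alt, v in utilities.items()}
    total = sum(exp_v.values())
    return {alt: e / total for alt, e in exp_v.items()}

# Two equally attractive alternatives.
two_modes = logit_probabilities({"car": 0.0, "red_bus": 0.0})

# Add a blue bus that is identical to the red bus in every respect.
three_modes = logit_probabilities({"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0})

# IIA: the car/red-bus probability ratio is unchanged by the new alternative.
ratio_before = two_modes["car"] / two_modes["red_bus"]
ratio_after = three_modes["car"] / three_modes["red_bus"]

print(f"ratio before = {ratio_before:.3f}, ratio after = {ratio_after:.3f}")
print(f"car share before = {two_modes['car']:.3f}, after = {three_modes['car']:.3f}")
```

Logit predicts that the car share falls from one half to one third, whereas intuition suggests the two buses should simply split the original bus share. This is the sense in which logit is inappropriate when alternatives are closely related.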
5.7.2 There are a number of ways to identify cases where the IIA assumption is violated, but
arguably the most practical is to calibrate nested models. This process not only
identifies whether IIA applies, but also indicates how any violation can be alleviated (see Section 5.12).
5.7.3 Where IIA provides an accurate representation of reality, it may permit considerable
efficiency in analysis, since models can be estimated on restricted choice sets.
5.8 Maximum likelihood estimation
5.8.1 Logit can be estimated on RP data, SP data, or a combination of the two. Such data are
usually collected on a sample of decision-makers from the population of interest. Data
collection is considered in detail in section 6 of this guidance.
5.8.2 Convention is to estimate logit by maximum likelihood (ML), the purpose of which is to
estimate the parameters for which the observed sample is most likely to have occurred.
A number of software packages offer routines for ML estimation, although these may
vary considerably according to their cost, ease-of-use and flexibility. Whichever
software is chosen, estimation of logit by ML is usually reliable, and it is uncommon for
close examination of the ML routine to be required.
5.8.3 Where computational problems are encountered in estimation, closer examination of
the ML routine may be necessary. A reasonably detailed account of the most popular
ML algorithms is offered in Train (2003), along with diagnostic advice on how common
estimation problems may be overcome.
5.8.4 Having estimated a logit model by ML, an initial post-estimation check is to ensure that
the ML routine converged successfully - this is reported as standard output in most
software packages. If the model failed to converge, then it is necessary to investigate
the reasons for this, resolve them, and repeat the estimation. In the event of such
problems, the software may provide appropriate prescriptive advice, although it is often
necessary for the analyst to interpret such advice in the context of how the data, model
and estimation routine have been specified. The analyst should not draw
behavioural conclusions from software failure.
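To make the estimation and convergence check concrete, the sketch below estimates a binary logit by ML using Newton-Raphson iteration. It is a simplified illustration, not the routine used by any particular package: the data are hypothetical, the model has only an ASC and one coefficient (on a cost difference), and the function name and return fields are assumptions made for this example.

```python
import math

def estimate_binary_logit(observations, tolerance=1e-8, max_iterations=50):
    """Newton-Raphson ML estimation of a binary logit with an ASC and one slope.

    observations: list of (x, y) pairs, where x is the attribute difference
    between alternatives A and B, and y is 1 if A was chosen, 0 otherwise.
    """
    asc, beta = 0.0, 0.0
    converged = False
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        g0 = g1 = 0.0      # gradient of the log-likelihood
        a = b = d = 0.0    # information matrix [[a, b], [b, d]] (minus the Hessian)
        for x, y in observations:
            p = 1.0 / (1.0 + math.exp(-(asc + beta * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            a += w
            b += w * x
            d += w * x * x
        if max(abs(g0), abs(g1)) < tolerance:
            converged = True
            break
        det = a * d - b * b
        # Newton step: parameters += inverse(information matrix) * gradient
        asc += (d * g0 - b * g1) / det
        beta += (a * g1 - b * g0) / det
    log_likelihood = 0.0
    for x, y in observations:
        p = 1.0 / (1.0 + math.exp(-(asc + beta * x)))
        log_likelihood += math.log(p) if y == 1 else math.log(1.0 - p)
    return {"asc": asc, "beta": beta, "converged": converged,
            "iterations": iterations, "log_likelihood": log_likelihood}

# Hypothetical data: x = cost of A minus cost of B, y = 1 if A was chosen.
data = [(-2, 1), (-2, 1), (-1, 1), (-1, 1), (-1, 0), (0, 1),
        (0, 0), (1, 1), (1, 0), (1, 0), (2, 0), (2, 0)]

result = estimate_binary_logit(data)
if not result["converged"]:
    print("Estimation did not converge: investigate before interpreting.")
else:
    print(result)
```

The convergence flag is checked before any parameter is interpreted, mirroring the advice in 5.8.4.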
5.9 Preliminary interpretation
5.9.5 Having estimated logit successfully, a series of preliminary tasks in statistical inference
should be undertaken. Each of the utility parameters should be subjected to a Student's
t-test for statistical significance and, strictly speaking, only parameters that are of
statistical significance should be retained for purposes of model application. In practice,
however, the decision to include/exclude a given variable is less clear cut and depends
as much on the sign and relative magnitude of the coefficient as on its standard
error. Accepting a coefficient with an inappropriate sign or magnitude simply because it
is statistically significant is clearly wrong, as is rejecting a key policy variable if it is
marginally insignificant. The development of a choice model is largely guided by
experience, informed by what the standard errors imply about the accuracy of the
coefficients.
5.9.6 When making such judgements, it may be informative to consider the relationship
between statistical significance and sample size. More specifically, for large populations
and relatively small samples - which is the typical context for mode choice modelling -
the standard error of an estimate relates approximately to sample variance and sample
size as follows:
$$se(\bar{x}) \approx \sqrt{\frac{s^2}{N}}$$
where $s^2$ is the sample variance of $x$ and N is the sample size. To illustrate this relation, a quadrupling of the sample size would, for given sample variance, halve the standard error and so imply a
doubling of the t-ratio in a test of statistical significance.
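The relation between sample size and the t-ratio can be checked numerically. The sketch below assumes the standard error takes the form sqrt(s²/N) described in 5.9.6; the estimate and sample variance are hypothetical values chosen only for illustration.

```python
import math

def standard_error(sample_variance, sample_size):
    """Approximate standard error of an estimate: sqrt(s^2 / N)."""
    return math.sqrt(sample_variance / sample_size)

estimate = 0.5          # hypothetical parameter estimate
sample_variance = 4.0   # hypothetical sample variance, held fixed

t_small = estimate / standard_error(sample_variance, 100)
t_large = estimate / standard_error(sample_variance, 400)

print(f"t-ratio with N = 100: {t_small:.2f}")
print(f"t-ratio with N = 400: {t_large:.2f}")
```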
5.9.7 Such considerations may also impact on the specified degree of segmentation, since
greater segmentation reduces the number of observations in each segment and may
therefore imply increased standard errors for segment-specific
parameters. Moreover, where budget and/or other constraints restrict sampling, the
retention of insignificant variables may be justifiable if a modest expansion of the data
set would likely bring significance.
5.9.8 The sign of each significant parameter should be assessed as to its intuitive validity; for
example, fares should always have a negative effect on utility.
5.10 Further interpretation and diagnostic testing
5.10.9 Analogous to least squares estimation, the presence of any (near) collinearity
between variables may affect the sign and/or significance of parameter estimates. Such
dependency can be investigated through estimating models with restricted sets of
variables, and examining the behaviour of the model as variables are added or
removed. Good estimation software will also produce parameter correlation matrices,
analysis of which will inform any such investigations.
5.10.10 Referring back to 5.1.6, it should be noted that the parameters in the utility function are scaled relative to the variance of unobserved factors; larger variance in the unobserved factors will lead to smaller parameter values. When it comes to interpretation, therefore, ratios of parameters are more meaningful than absolutes, since the scale factor cancels out.
5.10.11 A further attraction of ratios of parameters is that, at least in the case of a linear
functional form, they have ready economic meaning as 'marginal rates of substitution'.
In particular, if the denominator of such a ratio is a cost parameter, then the ratio can be
interpreted as the marginal rate of substitution with respect to cost, or in other words
'value'. For example, and with reference to (5.3), the value of time is given by the ratio
of time and cost parameters:

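5.10.12 VOT can, more generally, be derived from any functional form by taking the ratio of
marginal utilities, as follows:
$$VOT = \frac{\partial V / \partial T}{\partial V / \partial C}$$
where T and C denote time and cost respectively.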
5.10.13 Any derived valuations should be tested for statistical significance; tests for significant
difference from 'reference' values (such as 'standard' values) may also be insightful.
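For the linear case, the value of time is simply the ratio of the time and cost parameters. The sketch below uses hypothetical parameter values and units (utility per minute and utility per pence), and also shows why ratios are robust to the scaling discussed in 5.10.10: multiplying both parameters by the same scale factor leaves the ratio unchanged.

```python
def value_of_time(time_parameter, cost_parameter):
    """Marginal rate of substitution between time and cost (linear utility)."""
    if cost_parameter == 0:
        raise ValueError("cost parameter must be non-zero")
    return time_parameter / cost_parameter

# Hypothetical estimates: utility per minute and utility per pence.
time_parameter = -0.06
cost_parameter = -0.012

vot = value_of_time(time_parameter, cost_parameter)  # pence per minute

# Rescaling both parameters (e.g. a different error variance) leaves VOT unchanged.
scale = 2.0
vot_rescaled = value_of_time(scale * time_parameter, scale * cost_parameter)

print(f"VOT = {vot:.2f} pence per minute (rescaled: {vot_rescaled:.2f})")
```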
5.10.14 If estimated by ML, the goodness of fit of a logit specification should be measured using
the log-likelihood (commonly referred to as 'rho-squared') index. The basic form of this
index is defined:
$$\rho^2 = 1 - \frac{LL_f}{LL_r}$$
where LLf is the final log-likelihood of the full model, and LLr is the final log-likelihood of a restricted model.
5.10.15 Although a number of restricted models may offer bases for meaningful tests, a
minimum requirement should be to implement the test with a market share model (i.e. a
restricted version of the full model that includes only ASCs) as the base. Such a
formulation yields the widely used 'rho-squared with respect to constants' index.
5.10.16 The ρ² index offers a measure of the goodness of fit of the logit model, and is analogous to the R² statistic in ordinary least squares regression. The value of ρ² lies between zero and one, but values between 0.2 and 0.4 are often considered indicative of very good fits. In common with R², the ρ² with respect to constants is comparable across different samples.
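As an illustration of the index, the sketch below computes rho-squared as one minus the ratio of the full-model and restricted-model log-likelihoods. The log-likelihood values are hypothetical; in practice they would be taken from the estimation output, with the restricted model being the ASC-only (market share) model for 'rho-squared with respect to constants'.

```python
def rho_squared(ll_full, ll_restricted):
    """Likelihood ratio index: 1 - LL_full / LL_restricted."""
    if ll_restricted == 0:
        raise ValueError("restricted log-likelihood must be non-zero")
    return 1.0 - ll_full / ll_restricted

# Hypothetical final log-likelihoods from estimation output.
ll_full = -850.0        # full model
ll_constants = -1000.0  # restricted model with ASCs only

rho2_constants = rho_squared(ll_full, ll_constants)
print(f"rho-squared with respect to constants = {rho2_constants:.3f}")
```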
5.10.17 Expanding on the t-tests for hypotheses regarding individual parameters, it may be
insightful in some cases to test more complex hypotheses regarding subsets of
parameters. Two of the more common such hypotheses are (i) that the coefficients of a
subset of variables are collectively zero; (ii) that the coefficients of two variables are the
same. Both of these tests can be implemented using a likelihood ratio test, which is
given by the general form:
$$R = \frac{L_r}{L_f}$$
where Lr is the final likelihood of the restricted model under the null hypothesis (e.g. in case (i), the restricted model would constrain the relevant subset of coefficients to be zero), and Lf is the final likelihood of the unrestricted model. Thus a restricted model should be estimated by ML in accordance with the null hypothesis. The test statistic is given by -2logR, equivalently -2(LLr - LLf) in terms of log-likelihoods, which is distributed chi-squared with degrees of freedom equal to the number of restrictions implied by the null hypothesis.
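A likelihood ratio test can be carried out directly from the two final log-likelihoods. The sketch below is a minimal illustration with hypothetical log-likelihood values; the small table of 5% critical values is hard-coded from standard chi-squared tables to keep the example free of external libraries.

```python
# 5% critical values of the chi-squared distribution, by degrees of freedom.
CHI_SQUARED_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def likelihood_ratio_statistic(ll_restricted, ll_full):
    """Test statistic -2 log R = -2 (LL_restricted - LL_full)."""
    return -2.0 * (ll_restricted - ll_full)

# Hypothetical example: the null hypothesis sets two coefficients to zero.
ll_restricted = -1000.0
ll_full = -990.0
restrictions = 2

statistic = likelihood_ratio_statistic(ll_restricted, ll_full)
critical = CHI_SQUARED_5_PERCENT[restrictions]
reject_null = statistic > critical

print(f"statistic = {statistic:.2f}, critical value = {critical}, "
      f"reject null = {reject_null}")
```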
5.11 Validation
5.11.1 Having conducted the above procedures in statistical inference, the properties of the
estimated model should be validated against benchmark empirical evidence. Such
investigations should focus on two principal constructs - valuation and elasticity. As
regards the former, any valuations implied by the estimated model should be
reconciled, where possible, with empirical evidence from comparable local schemes.
Where such evidence is unavailable at the local level, recourse to national evidence
should be made. As regards the latter, the elasticity properties of the model should be
similarly compared against available local evidence. Such analysis can be based on
measures of point elasticity, calculated for both direct and cross effects, across the
sample. The relevant formulae for direct and cross elasticity for each decision-maker
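are, respectively:
$$E_{ii} = \beta x_i (1 - P_i), \qquad E_{ij} = -\beta x_j P_j$$
where $\beta$ is the parameter on attribute x, $x_i$ is the level of that attribute for alternative i, $P_i$ is the probability of choosing alternative i, and $E_{ij}$ is the elasticity of $P_i$ with respect to $x_j$.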
5.11.2 To obtain elasticity estimates for the sample as a whole it is usual to take a weighted
average of the elasticity estimates across decision-makers, with the weights
being the individual choice probabilities for the mode in question. Simply inserting
average values for P and x will, if there is any variance to the data, lead to an
aggregation bias and incorrect elasticities. An alternative method is to make small
changes to the variables during model application, and derive arc elasticity estimates
from the predicted market shares.
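The aggregation bias described above can be seen in a small numerical example. The sketch assumes the standard logit point elasticities for a linear utility function (direct: β·x·(1 − P); cross: −β·x·P for the other alternative); the cost parameter, fares and probabilities are hypothetical.

```python
def direct_elasticity(beta, x_own, p_own):
    """Direct point elasticity of P_i with respect to its own attribute."""
    return beta * x_own * (1.0 - p_own)

def cross_elasticity(beta, x_other, p_other):
    """Cross point elasticity of P_i with respect to an attribute of alternative j."""
    return -beta * x_other * p_other

beta_cost = -0.02  # hypothetical cost parameter (utility per pence)

# Hypothetical sample: (bus fare in pence, probability of choosing bus).
sample = [(100.0, 0.30), (200.0, 0.10), (50.0, 0.60)]

# Probability-weighted average of the individual direct elasticities.
weighted = (sum(p * direct_elasticity(beta_cost, x, p) for x, p in sample)
            / sum(p for _, p in sample))

# Naive calculation at the sample averages, which is subject to aggregation bias.
mean_x = sum(x for x, _ in sample) / len(sample)
mean_p = sum(p for _, p in sample) / len(sample)
naive = direct_elasticity(beta_cost, mean_x, mean_p)

print(f"weighted elasticity = {weighted:.3f}, naive elasticity = {naive:.3f}")
```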
5.11.3 The validity of the estimated model should be further tested in implementation. The
estimated model should be applied to forecasting (the subject of forecasting is
considered in section 7), and the ability of the model to replicate observed market
shares assessed. A range of indicators of forecasting performance may be employed,
although a simple and robust test is offered by a Chi-squared test:
$$\chi^2 = \sum_{j=1}^{J} \frac{(f_{a,j} - f_{p,j})^2}{f_{p,j}}$$
where
fa is the actual frequency
fp is the forecast frequency
J is the number of alternatives in the choice set
with degrees of freedom:
$$df = J - 1 - m$$
where m is the number of parameters to be estimated on the basis of the sample data.
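The forecast test can be sketched as follows. The frequencies are hypothetical, and two assumptions are made explicit: the statistic is taken as the usual sum of squared differences between actual and forecast frequencies divided by the forecast frequencies, and the degrees of freedom are taken as J − 1 − m, the conventional form for a goodness-of-fit test. The 5% critical values are hard-coded from standard tables.

```python
# 5% critical values of the chi-squared distribution, by degrees of freedom.
CHI_SQUARED_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_squared_statistic(actual, forecast):
    """Sum over alternatives of (actual - forecast)^2 / forecast."""
    return sum((fa - fp) ** 2 / fp for fa, fp in zip(actual, forecast))

# Hypothetical frequencies for a four-alternative choice set.
actual = [520.0, 310.0, 120.0, 50.0]
forecast = [500.0, 320.0, 130.0, 50.0]

J = len(actual)   # number of alternatives
m = 1             # assumed number of parameters estimated from the sample
degrees_of_freedom = J - 1 - m

statistic = chi_squared_statistic(actual, forecast)
critical = CHI_SQUARED_5_PERCENT[degrees_of_freedom]
forecast_rejected = statistic > critical

print(f"chi-squared = {statistic:.3f}, df = {degrees_of_freedom}, "
      f"critical value = {critical}, rejected = {forecast_rejected}")
```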
5.11.4 The validation process should be carried out across a number of dimensions including
those defined by the characteristics of the sample (e.g. income, gender, age) and the
attributes of the choice alternative (e.g. cost, in-vehicle time).
5.12 Nested logit
5.12.1 With reference to Section 5.7, a diagnostic for, and (partial) resolution to, the property of
IIA is offered by the nested logit model, which groups similar alternatives together in
mutually exclusive subsets or 'nests' (i.e. an alternative can be included in only one
nest). Choice probability is represented as the product of the marginal probability of
choosing a nest and the conditional probability of choosing a given alternative from that
nest.
5.12.2 Nested logit can be illustrated by considering a problem of two-levels and two-nests,
although the model can in principle be extended to any numbers of levels and nests.
With reference to the tree diagram in Figure 2, choice probability is given by:
$$P_i = P_m \, P_{i|m}$$
where $P_m$ is the marginal probability of choosing nest m, and $P_{i|m}$ is the conditional probability of choosing alternative i from nest m.
Figure 2: Nested logit for a two-level two-nest problem

5.12.3 Within each nest, the property of IIA holds, and conditional probability is represented as
logit:
$$P_{i|m} = \frac{\exp(V_i / \lambda_m)}{\sum_{j \in m} \exp(V_j / \lambda_m)} \qquad (5.5)$$
where $\lambda_m$ is the scale parameter relating to nest m.
5.12.4 Turning now to marginal probability, the utilities from (5.5) are introduced in an expression for the Expected Maximum Utility of each nest m (commonly referred to as the 'log sum' or composite cost), as follows:

5.12.5 The probability of choosing an alternative in nest m is also of the logit form and is shown as:

5.12.6 A common simplification is to assume that the scale parameter is constant across all m in a given level of the tree (i.e. all nests at a given level have the same scale factor).
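The two-level structure can be sketched in code. This is one common formulation, adopted here as an assumption rather than taken from the equations of this unit: lower-level utilities are divided by a nest scale parameter (written `lam` below), the log sum of each nest is the log of the summed exponentials, and the nest probability is a logit over the scale parameter times the log sum. Utilities, nest names and scale values are hypothetical.

```python
import math

def nested_logit_probabilities(utilities, nests, scales):
    """Two-level nested logit.

    utilities: {alternative: V}, nests: {nest: [alternatives]},
    scales: {nest: scale parameter in (0, 1]}.
    """
    conditional = {}  # P(alternative | nest)
    logsums = {}      # expected maximum utility ('log sum') of each nest
    for nest, alternatives in nests.items():
        lam = scales[nest]
        exp_terms = {alt: math.exp(utilities[alt] / lam) for alt in alternatives}
        total = sum(exp_terms.values())
        logsums[nest] = math.log(total)
        for alt in alternatives:
            conditional[alt] = exp_terms[alt] / total
    # Marginal (nest) probabilities: logit over scale * log sum.
    upper = {nest: math.exp(scales[nest] * logsums[nest]) for nest in nests}
    upper_total = sum(upper.values())
    return {alt: (upper[nest] / upper_total) * conditional[alt]
            for nest, alternatives in nests.items() for alt in alternatives}

utilities = {"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0}
nests = {"car_nest": ["car"], "bus_nest": ["red_bus", "blue_bus"]}

# With all scale parameters equal to one the model collapses to logit.
as_logit = nested_logit_probabilities(
    utilities, nests, {"car_nest": 1.0, "bus_nest": 1.0})

# A bus-nest scale below one reflects similarity between the two buses.
nested = nested_logit_probabilities(
    utilities, nests, {"car_nest": 1.0, "bus_nest": 0.5})

print("logit: ", {k: round(v, 3) for k, v in as_logit.items()})
print("nested:", {k: round(v, 3) for k, v in nested.items()})
```

As the bus-nest scale parameter moves from one towards zero, the car share moves from one third towards one half, which is the behaviour the nested structure is intended to capture.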
5.13 Estimating and interpreting nested logit
5.13.1 It is very important to note that there are (in general) two different and commonly used specifications of the nested logit model. In particular some applications are specified and estimated without dividing the lower level utility by the scale parameter.
5.13.2 Since this distinction between specifications may have substantive implications for
interpretation and application, the analyst is advised to seek appropriate advice from the
software supplier before proceeding.
5.13.3 The inferential and diagnostic analysis required following estimation is essentially the
same as for logit, although one additional test is required in order to check the internal
consistency of the nested logit structure. Following from 5.13.2, the precise specification
of this test differs according to the specification of nested logit adopted. For example, if lower-level utility is specified without dividing through by the scale parameter, the test requires that:
$$0 < \lambda_m \le 1 \qquad (5.6)$$
where $\lambda_m$ is the scale parameter relating to nest m. Interpreting (5.6), where the scale parameter is not significantly different from one, there is no evidence of a violation of the IIA assumption (see Section 5.7); where this holds for all m, the nested logit model collapses to logit.
5.13.4 Although the above discussion was based on a two-level problem, the analysis can be
readily extended to more than two-levels, with different scale factors at each level. The
internal consistency test then involves an extension of (5.6).
5.13.5 A practical difficulty with nested logit is that the most appropriate nesting structure may
not always be obvious. It may therefore take some effort to identify a definitive structure,
judgements on which should be based on internal consistency, relative explanatory
power and other properties of the model such as implied valuation and elasticity.
6. Data Collection
6.1 Revealed Preference and Stated Preference
6.1.1 Revealed Preference (RP) refers to observations of actual behaviour, for example the
mode choices that decision-makers currently make or made in the past.
6.1.2 RP data is inherently more credible than SP data and its use, if only partially, will
strengthen the credibility of demand forecasts in the appraisal framework.
6.1.3 RP data can be obtained from SP respondents, from postcard surveys (an under-used
and relatively inexpensive approach), from home or phone interviews, travel diaries, as
well as from the National Travel Survey and Census.
6.1.4 The collection of RP data is not without problems, however. There are often large biases
in respondents' self-reported data, underestimating the costs of their chosen mode and
overestimating the costs of alternative modes. To overcome these problems it is
sometimes necessary to use explanatory variables from network models and published
timetable data. Even where respondents' reported data is modelled, there is often a
considerable amount of missing data which needs to be collated.
6.1.5 Stated Preference (SP) refers to observations of hypothetical behaviour under
controlled experimental conditions.
6.1.6 The need for a bespoke approach to mode choice modelling would often imply an
interest in a 'new mode'. The interest in new modes would itself imply a need for SP
analysis, since RP data is by definition unavailable for such contexts. In short, bespoke
mode choice development would often require new SP analysis.
6.2 SP design methods
6.2.1 The design of SP experiments has evolved into a specialist technical area.
Comprehensive accounts of SP design methods are offered in a number of dedicated
texts; popular ones include Pearmain and Kroes (1990), Louviere et al. (2000) and
Bateman et al. (2002). It is important to recognise that SP design is a developing
discipline, and that there is some disagreement as to the most appropriate methods.
Indeed the three texts cited above show significant differences in respect of the
methods they promote. Moreover, SP must at all times remain practical and useful, and
it is left to the analyst to reconcile an in-depth knowledge of theoretical concepts of SP
design with an appreciation of the practical implications of particular design features.
There are a number of commercial software packages that can help with this task.
6.2.2 In what follows, we offer generic guidance on the stages to be followed in promoting
best practice for SP analysis of mode choice.
6.3 Choice set
6.3.1 Definition of the choice set of alternatives should be based on RP evidence on existing
behaviour, as well as policy interest in the potential addition of new modes. In most
mode choice studies, the choice set will be reasonably well defined. The judgement of
the analyst may however be required in some cases, for example in deciding whether
two alternatives of common mode should be represented as distinct. Since such
judgements may have a significant impact on the properties of the model, it is
sometimes useful to conduct appropriate preliminary investigation such as focus group
analysis. The option of 'not travel' should always be included in the choice set. In mode
choice studies, alternatives should always be described 'explicitly', meaning that they
should be referred to as 'bus', 'train', or 'car' etc., rather than as 'A', 'B' or 'C' etc.
6.4 Response method
6.4.1 The most natural, and therefore most reliable, response method for mode choice
studies will usually be choice (as opposed to ranking or rating). Where the response
method deviates from choice, the analyst should offer clear and convincing justification.
6.5 Number of alternatives
6.5.1 This decision involves reconciling the demands of the experiment on respondents (both
in terms of the cognitive effort required on any given replication and the number of
replications that require application) with considerations regarding its realism. The most
natural presentation would be to offer the complete choice set on every replication,
although this may be impractical depending on how many alternatives are involved and
the means by which the SP experiment is implemented. For example, if the SP were
administered as a pen-and-paper exercise at a motorway service station, the most
appropriate implementation would perhaps be a binary choice experiment, since larger
choice sets may place unreasonable demands on respondents.
6.6 Number of replications
6.6.1 This decision is related to decisions on numbers of alternatives, attributes, and levels.
For choice response exercises, the presentation of between 5 and 16 replications is
typical, although this may again be influenced by other considerations, such as the
means of implementation.
6.7 Task complexity
6.7.1 There is a controversial literature on the effects of task complexity (defined in a number
of ways including numbers of alternatives, attributes and replications) on statistical
properties, with different researchers reporting different findings. The general advice
offered here is that excessive numbers of alternatives, attributes and replications may
introduce significant bias in valuation and forecasting, as well as impact on response
rates. The prevalence of such effects should be identified at the pilot stage, and the SP
design should be adjusted accordingly.
6.8 Which attributes?
6.8.1 In mode choice studies this is usually dictated by which generalised cost components are of
interest to policymakers; Road Traffic and Public Transport Assignment Modelling (Unit
3.11.2) is helpful in this regard.
6.9 Number of attributes
6.9.1 Having identified the attributes of interest, the next issue to be decided is whether to
present the complete set of attributes together, or subsets of attributes only. In the latter
case, separate designs should be developed for each subset of attributes, whilst
ensuring that each design contains at least one (but preferably more) common attribute
- this permits merger of the designs at the estimation stage. Since mode choice SP is
commonly implemented as an interception survey, the analyst should remain conscious
of the cognitive and time demands placed on respondents. It is uncommon, therefore,
for any single SP experiment to consider more than four attributes.
6.10 Units of measurement
6.10.1 For most attributes, the units of measurement will be natural and straightforward. For
some attributes, however, no natural units may exist (a good example would be an
attribute such as 'ride quality'), and it may be left to the analyst to construct a
measurement unit or scale that is both useful for analysis and comprehensible to the
respondent (the latter perhaps investigated through focus groups).
6.10.2 A question of particular relevance to mode choice studies is whether attributes should
be presented in respect of single or return trips; the answer to this should again be
governed by what appears the more natural for the particular context under study. The
preamble to the questionnaire should always make clear whether the attributes refer to
single or return trips.
6.10.3 An associated consideration is whether to construct attributes as absolutes (e.g. 'car
has travel time of 50 minutes'), or as some form of deviation from a base, whether as an
absolute deviation (e.g. 'car is 5 minutes faster than now'), or as a percentage deviation
('car is 20% faster than now'). As will become apparent in the subsequent discussion,
the deviation options sometimes offer convenience in design; absolute deviations may
permit efficiency in the number of variables required, as well as facilitate easy derivation
of boundary values; percentage deviations are particularly amenable to customisation.
These considerations aside, when it comes to implementation of the SP experiment,
any such deviations should be translated into absolute values for purposes of
presentation to respondents.
6.11 Numbers of levels of attributes
6.11.1 At the early design stage, decisions on the number of levels will often be dictated by the
scope of policy interest. It should be noted that at least three levels would be required in
order to test an attribute for non-linearity. As with many other decisions in the design
process, however, decisions on the numbers of levels have implications elsewhere, in
particular on the number of replications required.
6.12 Selection of values for levels of attributes
6.12.1 The selection of values for attribute levels should be guided by the need to ensure
realism in the representation of the choice problem and the need to include values
relevant for policy testing. Other factors to consider include the range of any interest in
non-linearity, and the variability (and hence significance) of attribute coefficients.
6.13 Combining the attribute levels: orthogonality
6.13.1 Conventional practice is to be guided - to greater or lesser extent - by fractional factorial
designs. These provide templates for combining attribute levels in an orthogonal (i.e. zero correlation) manner. The attraction of orthogonality is that it enables the model to
identify the separate influence of each attribute on utility.
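Orthogonality of a candidate design can be verified by checking that the correlation between every pair of attribute columns is zero. The sketch below uses a standard four-run half-fraction of a design with three two-level attributes, coded -1 (low) and +1 (high); the attribute names are hypothetical.

```python
import math
from itertools import combinations

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(variance_x * variance_y)

# Each row is one replication: (time level, cost level, headway level).
design = [(-1, -1, +1),
          (-1, +1, -1),
          (+1, -1, -1),
          (+1, +1, +1)]

names = ["time", "cost", "headway"]
columns = {name: [row[k] for row in design] for k, name in enumerate(names)}

correlations = {(a, b): correlation(columns[a], columns[b])
                for a, b in combinations(names, 2)}

for pair, r in correlations.items():
    print(f"correlation {pair}: {r:.3f}")
```

The same check can be applied after any manual adjustment to a design (for example, to remove unrealistic combinations) to see how far it has moved from orthogonality.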
6.13.2 Design templates can be found in several references (e.g. Kocur et al., 1982); a number
of software packages offer automated facilities based on the same principles. Although
these templates can, on the face of it, be applied in a reasonably prescriptive and
straightforward manner, significant judgement is required on the part of the analyst in
reconciling conflicting objectives with respect to the numbers of alternatives, attributes,
attribute levels, interaction effects and replications.
6.13.3 A number of strategies may be adopted to mitigate the effects of increasing numbers of
attributes and attribute levels on the number of replications. One strategy, where the
choice set is binary, is to apply the attributes of the design plan as differences between
the two alternatives (e.g. time difference, cost difference, etc.), thereby reducing the
requisite number of attributes in the design by half. Other more complex strategies
include the use of specialised algorithms to arrive at 'optimal' designs, which minimise
the number of replications whilst remaining 'efficient' according to some criteria.
6.14 Combining the attribute levels: non-orthogonality
6.14.1 In some cases, it may be desirable to deviate from orthogonality. One such case is
where an orthogonal design yields an unrealistic combination of attribute levels. Another
motivation for deviating from orthogonality is that some degree of correlation between
attributes may improve the precision of parameter estimates. Moreover, since
orthogonality in design is not preserved in estimation of choice models, some analysts
argue that the attraction of orthogonality is sometimes overstated.
6.15 Combining the attribute levels: boundary values
6.15.1 A more advanced - but powerful - approach to SP is to ground design more firmly in
behavioural theory, thereby establishing an intimacy between the data (i.e. the
experimental design and the experimental responses) and the behavioural model to
which the data are applied. A popular approach of this kind is 'boundary values'.
6.15.2 Boundary values are the implied valuations at which a decision-maker is just indifferent
between two choice alternatives i and j on a given replication of the design. More
formally, let:

Where utilities are equal ($V_i = V_j$), such that an individual is indifferent between the two alternatives, the boundary value of time (BVOT), expressed in terms of cost, is given by:
$$BVOT = -\frac{C_i - C_j}{T_i - T_j}$$
Assuming that utility is entirely deterministic, an individual whose valuation of T in terms of C is greater than BVOT will prefer the alternative with the greater C and lesser T, whilst an individual whose valuation of T in terms of C is less than BVOT will prefer the alternative with the lesser C and greater T. If decision-makers maximise utility, and it is ensured that the SP design presents boundary values closely either side of standard
valuations (of time, headway etc.), one can be reasonably confident that the model will
reproduce realistic valuations in estimation.
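Boundary values can be tabulated for each replication of a binary time-cost design to check that they bracket the valuation of interest. The sketch below assumes the boundary value of time is the cost difference divided by the time difference, with the sign reversed; the replications and the reference value are hypothetical.

```python
def boundary_value_of_time(time_i, cost_i, time_j, cost_j):
    """Valuation at which a decision-maker is indifferent between i and j."""
    time_difference = time_i - time_j
    if time_difference == 0:
        raise ValueError("alternatives must differ in time")
    return -(cost_i - cost_j) / time_difference

# Hypothetical replications: (time_i, cost_i, time_j, cost_j),
# with times in minutes and costs in pence.
replications = [(40, 250, 50, 200),
                (30, 320, 50, 200),
                (45, 215, 50, 200)]

boundary_values = [boundary_value_of_time(*r) for r in replications]

reference_vot = 4.5  # hypothetical reference valuation, pence per minute
brackets_reference = min(boundary_values) < reference_vot < max(boundary_values)

print(f"boundary values (pence per minute): {boundary_values}")
print(f"design brackets the reference value: {brackets_reference}")
```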
6.16 Realism
6.16.1 Although stated preferences are by their very definition hypothetical, the analyst should
at all times endeavour to ground SP design in realism, thereby ensuring that analysis is
insightful and reliable. Thus, having arrived at an initial design by means of the above
methods, it is good practice to check the design for unrealistic or irrational combinations
of attribute levels. These should be adjusted accordingly.
6.17 Testing the design using simulation
6.17.1 Whatever methods are employed in the design process, it is important to test whether
the design is capable of producing a model with realistic properties, particularly with
respect to valuation (i.e. the ability of the design to reproduce acceptable ranges of
parameter ratio). Simulation is a powerful tool for such testing, and can be used to
identify problems in design without incurring the costs of pilot or field application.
6.18 Questionnaire design and implementation
6.18.1 Before proceeding to implementation, it is necessary to develop some form of vehicle
for administering the SP experiment to respondents. This usually involves the design of
a questionnaire. The following issues should be considered:
6.19 Staging
6.19.1 The choice here is between a single-stage or multi-stage questionnaire process.
Although the latter is more demanding of respondents (with associated fall-off in
response rate), and usually more costly, it provides opportunity for customisation of
questionnaires to individuals' circumstances. This can make the SP experiment more
realistic to respondents, and thereby make analysis more insightful.
6.20 Background information
6.20.1 In most cases it will be necessary to collect a range of information relating to
respondents' socio-economic and demographic characteristics. These needs will tend
to be dictated by segmentation requirements.
6.21 Means of presentation
6.21.1 Possible options include mail-back questionnaire, computer-assisted questionnaire,
questionnaire posted on the Internet, questionnaire distributed by e-mail, and
questionnaire administered by telephone interview. Choice of method may have
significant implications for cost and effort (both in administering the experiment and
processing the responses), as well as the size and characteristics of the sample. Some
methods - particularly the electronic methods - may be more amenable to
customisation. For mode choice studies, mail-back or computer-assisted questionnaires
are most common.
6.22 Interception
6.22.1 A decision must be reached on where the questionnaire should be administered;
conventional practice for mode choice is to intercept travellers en route, whether on a
particular mode or at a terminal. Where customised questionnaires are being used,
interception usually involves recruitment and the collection of basic information relating
to the respondent, which can then be used to generate customised SP experiments that
are posted to respondents. 'Cold-calling' by telephone can be a useful means of
interception.
6.23 Response rates
6.23.1 These may vary depending on the means of presentation and interception: surveying
on-mode may yield a response rate of up to 90%; response rates at terminals tend to be
variable; two-stage questionnaires may yield a response of 40-60% of those in scope
and agreeing to participate; a typical response from telephone interviewing is 40%.
6.24 Preamble to questionnaire
6.24.1 It is standard practice to provide a preamble to the questionnaire, which should explain
the purpose of the investigation, the choice context of interest, the variables (including
advice on how to treat variables not explicitly included), how the experiment should be
completed, and how any data will be stored and used (in accordance with the Data
Protection Act). Contact details for any queries should be provided, as should
information on how the questionnaire should be returned (if appropriate). The preamble
should be as succinct as possible.
6.25 Focus groups
6.25.1 A useful means of informing presentation and implementation issues is to test prototype
questionnaires on focus groups of typical respondents. Focus groups are particularly
useful where a wholly new product or research methodology is planned; they are less
useful where the analyst has a good understanding of the new mode.
6.26 Pre-pilot survey
6.26.1 Before proceeding to a pilot survey, it is good practice to apply the questionnaire to a
pre-pilot survey involving a small number of colleagues. The purpose of this is to invite
comment and identify any problems, rather than to test the statistical performance of the
SP design.
6.27 Pilot survey
6.27.1 Although 6.15, 6.17 and 6.26 provide various assessments of the quality of the SP
design, it is essential to test the design fully in the context of a pilot survey. This should
involve a representative sub-sample of the population of interest. The pilot survey
should be a complete dummy run of the field survey, including a full statistical analysis
of the responses to the experiment. Moreover, the pilot survey should consider a range
of issues including: sampling strategy; comprehensibility of the questionnaire and
experiment (respondents should be invited to comment); response and completion
rates; market shares (i.e. prevalence of dominant alternatives); ability to estimate a
choice model successfully using the response data; parameter significance and overall
fit of model; and implied valuations.
6.28 Field survey
6.28.1 Having tested the experiment and questionnaire through simulation, focus groups and
piloting, the analyst can proceed to the field survey with some confidence that the
analysis will be successful when it comes to implementation. If testing has been
sufficiently comprehensive, then the field survey should simply be a repeat of activities
that have been carried out in the pilot surveys; the scope for unexpected problems
should therefore be minimal. Whilst the analyst must remain vigilant of the potential for
bias, a clear justification must be offered for the removal of any observations for
reasons of 'irrationality'. Where such observations are deleted, it must be ensured that
forecasts are adjusted accordingly.
6.29 Cleaning
6.29.1 Whatever form of data are collected, they should be subjected to a cleaning process.
This process should identify, and treat, any 'irrational' or missing observations. In
relation to the former, there is often confusion about the treatment of SP respondents
who always choose the same alternative (these respondents are known as non-traders
or non-switchers). The recommended approach is that car users who never switch
mode should be retained, as should respondents with relatively high valuations.
Inconsistent or biased responses may however be removed. The latter may take
several forms; a common phenomenon is where a respondent always uses a currently
available but rejected alternative. It should be noted that distinguishing non-traders
from those with high valuations is often difficult. Missing observations can often
be treated in some way; deletion should be regarded as a least-preferred option. If
any observations are deleted during cleaning then forecasts should be adjusted
accordingly.
6.30 Merging SP data
6.30.1 If separate designs have been developed for different subsets of attributes, it is
necessary to merge the designs at the estimation stage. Since the data relating to
different designs could have different error variances, and therefore different scale
factors, it is important to apply a correction to ensure that all parameters across different
designs are of common scale.
6.30.2 Where two or more attributes are common to the different designs, this can be
accomplished by exploiting the structure of the nested logit model. For the simpler case
of only one common attribute, recourse to nested logit is unnecessary, and re-scaling
can be achieved simply by multiplying the parameters of one design through by the ratio
of parameters relating to the common attribute.
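The single-common-attribute re-scaling in 6.30.2 can be sketched as follows. This is a minimal illustration only: the attribute names and coefficient values are hypothetical and are not taken from this guidance. The parameters of design b are multiplied by the ratio of the common-attribute coefficient in design a to that in design b, so that both designs share the scale of design a.

```python
def rescale_to_common_scale(params_a, params_b, common):
    """Rescale design-b coefficients to the scale of design a.

    The scaling factor is the ratio of the common attribute's
    coefficient in design a to its coefficient in design b.
    """
    if params_b[common] == 0:
        raise ValueError("common-attribute coefficient in design b is zero")
    ratio = params_a[common] / params_b[common]
    return {name: value * ratio for name, value in params_b.items()}


# Hypothetical estimates (illustrative values only).
design_a = {"cost": -0.040, "in_vehicle_time": -0.020}
design_b = {"cost": -0.080, "wait_time": -0.060}

rescaled_b = rescale_to_common_scale(design_a, design_b, common="cost")
print(rescaled_b)
```

Re-scaling leaves the relative valuations within design b unchanged (here the ratio of the wait time coefficient to the cost coefficient), since every coefficient is multiplied by the same factor.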
6.30.3 To illustrate the nested logit procedure for merger, consider the case of two SP designs (which we refer to as a and b), each of which considers a binary choice between alternatives i and j but represents the alternatives in terms of a different set of attributes (excepting the attributes that are common to both). The merger procedure is as follows:
6.30.4 With reference to Figure 3, four nominal alternatives are specified: specifically alternatives i and j for design a, and alternatives i and j for design b.
6.30.5 The data should be organised such that where utility and preference data relating to
design a are presented, the alternatives relating to design b are specified as
unavailable, and vice versa.
6.30.6 All alternatives should be specified with a path directly to the root (i.e. not nested with other alternatives), although the alternatives relating to one of the designs (here we arbitrarily pick b) should be assigned 'dummy nodes'. This means that the design b alternatives are specified in single-alternative nests at the lower level of the tree. The structural parameter of these nests should be specified as common across the two nests; this accommodates the difference in error variance across the two designs and ensures that all estimated utility parameters are of common scale. Unlike conventional nested logit estimation, there is no requirement that this parameter falls within specified bounds.
Figure 3: Nested logit 'trick' for data merger

6.31 Combining RP and SP data
6.31.1 SP is powerful for eliciting valuations, but less reliable for forecasting. This is because
the scale factor in SP, which may deviate significantly from that in RP, cancels out in
valuation but not in forecasting. If a model is to be applied to forecasting then it should
not be estimated on SP data alone. Best practice is to merge RP and SP data in
estimation. A second best option is to validate a SP-based model against RP evidence
on elasticity. Validation is discussed in Section 5.11.
6.32 Merging RP and SP data
6.32.1 Merger with RP data should be regarded as much the preferred option, and should be
carried out in an analogous manner to the merger of SP data. Thus either the RP or SP
alternatives should be specified at the lower level of the tree, and the difference in the
error variances of the two data sources accommodated in the structural parameter. It should
be remembered that this method of merger is dependent on there being at least one
common variable in the two data sets. Indeed the more common variables there are, the
more confident one can be about the reliability of the merger process. It should be
noted that exact specification of the nested logit 'trick' for data merger is dependent
upon the type of software used for calibration (see section 5.13.1) and the analyst is
advised to seek appropriate advice from the software supplier before proceeding.
6.33 Repeat measurements
6.33.1 It is common, though often incorrect, practice to assume that the observations in the SP
experiments are independent of each other. Where respondents are invited to make a
series of repeated choices, as is typical, the informational content of the data
diminishes. An implication is that, while the coefficients estimated on such data will be
unbiased, the associated t-ratios will be upward biased, giving an illusion of greater
significance than is actually the case. A number of correction procedures have been
proposed, the simplest of which assumes perfect correlation of errors across the
choices of each individual and involves multiplying the standard errors by the square root of the number of responses per individual. A less extreme, but computationally
more difficult approach, is to assume that the principal effect of the repeat observations
is to introduce a structure to the error term:
$$\varepsilon_{int} = \eta_{in} + \nu_{int}$$
where $\eta_{in}$ is an error component associated with respondent $n$ and $\nu_{int}$ is independently and identically Gumbel distributed.
6.33.2 Estimation of the above is only possible using specialist software. Where this is not
available, it is recommended that re-sampling techniques are employed to make
unbiased estimates of model coefficients and their variance. The most popular of these
techniques are known as 'jack-knifing' and 'bootstrapping'.
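The simplest correction described in 6.33.1 (assuming perfect correlation of errors across each individual's choices) can be sketched as below. The coefficient, standard error and number of responses are hypothetical values chosen for illustration.

```python
import math


def corrected_t_ratio(coefficient, naive_std_error, responses_per_individual):
    """Inflate the naive standard error by sqrt(responses per individual)
    and return the corrected standard error and t-ratio."""
    corrected_se = naive_std_error * math.sqrt(responses_per_individual)
    return corrected_se, coefficient / corrected_se


# Hypothetical estimate: each respondent answered 9 SP choice tasks.
coefficient = -0.05
naive_se = 0.01
naive_t = coefficient / naive_se

corrected_se, corrected_t = corrected_t_ratio(coefficient, naive_se, 9)
print(naive_t, corrected_se, corrected_t)
```

In this illustration a coefficient that appears highly significant on the naive t-ratio is no longer significant at conventional levels once the correction is applied, which is the 'illusion of greater significance' the text warns about.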
6.34 Sampling
6.34.1 In general it will not be economically feasible to collect data from the population of
interest and therefore some form of sampling strategy will be required. This strategy
should aim to ensure that the data collected provides the greatest amount of useful
information about the population.
6.34.2 The first task is to identify the population of interest and the sampling unit. In many
cases this will be defined by the objectives of the study and may for example include all
households in a given geographical area. Next an appropriate sampling method should
be chosen. This may involve a simple random sampling approach, or where it is
important to sample from relatively small subgroups in the population a stratified
random sampling approach should be adopted. The latter involves subdividing the
population into homogeneous strata and then conducting a simple random sampling
strategy within each stratum. Whichever approach is adopted, care should be taken to
ensure that the sample is representative of the population. Mode choice modelling will
often involve choice-based sampling, whereby the existing users (i.e. choosers) of a
mode will be surveyed, for example on mode (for public transport) or at roadside
interviews (for car users). It should be noted that where logit is applied to a choice-based
sample, and specifies a full set of J - 1 ASCs, ML estimation will yield
inconsistent estimates of the ASCs, as well as possible bias to other coefficients.
Appropriate correction is therefore required.
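This guidance does not prescribe a particular correction. One widely used approach for multinomial logit with a full set of ASCs is to subtract from each estimated ASC the log of the ratio of that alternative's share of choosers in the sample to its share in the population, and then re-normalise on the base alternative. The sketch below assumes that approach; the shares and the estimated constant are hypothetical.

```python
import math


def correct_ascs(estimated_ascs, sample_shares, population_shares, base):
    """Adjust ASCs estimated on a choice-based sample.

    Each ASC is reduced by ln(sample share / population share) for its
    alternative, then shifted so that the base alternative's ASC stays zero.
    """
    base_adjustment = math.log(sample_shares[base] / population_shares[base])
    corrected = {}
    for alt, asc in estimated_ascs.items():
        adjustment = math.log(sample_shares[alt] / population_shares[alt])
        corrected[alt] = asc - adjustment + base_adjustment
    return corrected


# Hypothetical case: bus users are over-sampled (50% of sample, 20% of market).
estimated = {"bus": 0.50}  # car is the base alternative (ASC fixed at zero)
sample = {"car": 0.5, "bus": 0.5}
population = {"car": 0.8, "bus": 0.2}

corrected = correct_ascs(estimated, sample, population, base="car")
print(corrected)
```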
6.34.3 Although there are no hard and fast rules to determine sample size, it is recommended
that the sample be commensurate with the budget for the study, which in turn should be
commensurate with the likely costs and benefits of the proposed scheme.
6.34.4 Where the sampling methodology generates a sample that is not representative of the
general population, consideration should be given to the development of an appropriate
weighting system to be used during model estimation and application.
7. Model Application
7.1 Introduction to model application
7.1.1 Logit and nested logit are usually estimated on probabilities of choice for a sample of
decision-makers. What is typically of interest to policy-makers, however, is an
aggregate measure of these probabilities - i.e. market share - across a population.
7.1.2 The application of average measures of explanatory variables to the calculation of
probability yields biased measures of average probability.
7.2 Sample enumeration
7.2.1 Consistent estimates of market share can be obtained using sample enumeration. This
involves calculating, for each decision-maker in a sample, the probability of choice for
each alternative in the choice set. These probabilities are then aggregated over
decision-makers; average probability can be obtained by dividing through by the sample
size.
7.2.2 More formally, a consistent estimate of the number of decision-makers choosing
alternative i is given by:
$$\hat{N}_i = \sum_{n} w_n P_n(i)$$
where $P_n(i)$ is the probability that decision-maker n chooses alternative i and $w_n$ is
the weight attributed to decision-maker n. The $w_n$ parameter represents
the number of decision-makers similar to decision-maker n in the population, i.e. the
number of decision-makers within each segment of interest. Thus if the sample is
random then $w_n$ is constant for all n, whereas if the sample is segmented then $w_n$ is
the same for all n within a segment. If the sample is not representative of the
population, then the weights should be adjusted accordingly.
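A minimal sketch of sample enumeration, contrasted with the 'average individual' calculation that 7.1.2 warns against. The utility function, coefficients, weights and journey times are hypothetical and serve only to show the mechanics.

```python
import math


def logit_probabilities(utilities):
    """Multinomial logit probabilities from a dict of utilities."""
    max_u = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: e / total for alt, e in exp_u.items()}


def enumerate_sample(sample, utility_fn):
    """Weighted sum of individual choice probabilities for each alternative."""
    totals = {}
    for person in sample:
        probs = logit_probabilities(utility_fn(person))
        for alt, p in probs.items():
            totals[alt] = totals.get(alt, 0.0) + person["weight"] * p
    return totals


def utility_fn(person):
    # Hypothetical utilities: car is the base; bus carries an ASC of -0.5.
    return {
        "car": -0.05 * person["car_time"],
        "bus": -0.5 - 0.05 * person["bus_time"],
    }


sample = [
    {"weight": 100.0, "car_time": 20.0, "bus_time": 30.0},
    {"weight": 100.0, "car_time": 80.0, "bus_time": 30.0},
]

demand = enumerate_sample(sample, utility_fn)
total_weight = sum(p["weight"] for p in sample)
enumerated_bus_share = demand["bus"] / total_weight

# Naive alternative: average the explanatory variables first, then apply the model.
average_person = {
    key: sum(p["weight"] * p[key] for p in sample) / total_weight
    for key in ("car_time", "bus_time")
}
naive_bus_share = logit_probabilities(utility_fn(average_person))["bus"]

print(enumerated_bus_share, naive_bus_share)
```

The two shares differ because the logit probability is a non-linear function of the explanatory variables, so the probability at the average is not the average of the probabilities.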
7.3 Adjusting the ASCs
7.3.1 In applying a model with ASCs to forecasting, it should be recognised that the influence
of explanatory variables not represented explicitly in the model may change between
estimation and forecast contexts (e.g. over time). Such changes can be accommodated
through re-calibration of the ASCs. This involves inserting the estimated parameters
(including the ASCs) in the model, along with the base data, and assessing the ability of
the model to replicate 'target' market shares. If the forecast shares differ significantly
from the target shares, then the ASCs should be adjusted, and the analysis repeated
iteratively.
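The iterative re-calibration in 7.3.1 can be sketched as below. The guidance does not specify an adjustment rule; this sketch assumes a commonly used one, in which each ASC is incremented by the log of the ratio of target to predicted share, with the base alternative's ASC held at zero. All utilities, weights and target shares are hypothetical.

```python
import math


def predicted_shares(ascs, base_utilities, weights):
    """Sample-enumerated market shares, with ASCs added to the other utility terms."""
    totals = {alt: 0.0 for alt in ascs}
    for utils, w in zip(base_utilities, weights):
        exp_u = {alt: math.exp(utils[alt] + ascs[alt]) for alt in ascs}
        denom = sum(exp_u.values())
        for alt in ascs:
            totals[alt] += w * exp_u[alt] / denom
    total_w = sum(weights)
    return {alt: t / total_w for alt, t in totals.items()}


def recalibrate_ascs(ascs, base_utilities, weights, targets, base_alt,
                     tol=1e-8, max_iter=200):
    """Adjust ASCs until predicted shares match the target shares."""
    ascs = dict(ascs)
    for _ in range(max_iter):
        shares = predicted_shares(ascs, base_utilities, weights)
        if max(abs(shares[a] - targets[a]) for a in ascs) < tol:
            break
        for alt in ascs:
            ascs[alt] += math.log(targets[alt] / shares[alt])
        shift = ascs[base_alt]  # keep the base alternative's ASC at zero
        for alt in ascs:
            ascs[alt] -= shift
    return ascs


# Hypothetical base data: utilities excluding ASCs for two decision-makers.
base_utilities = [{"car": -1.0, "bus": -1.5}, {"car": -4.0, "bus": -1.5}]
weights = [1.0, 1.0]
targets = {"car": 0.7, "bus": 0.3}

new_ascs = recalibrate_ascs({"car": 0.0, "bus": 0.0}, base_utilities,
                            weights, targets, base_alt="car")
final_shares = predicted_shares(new_ascs, base_utilities, weights)
print(new_ascs, final_shares)
```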
7.3.2 Target market shares may be based on external evidence, the analyst's judgement, or
particular requirements relating to a forecast segment. In the latter case, for
example, there may be an interest in the ability of the model to forecast accurately for a
particular segment of the sample, and a need to tailor the ASCs accordingly. Adjusting
the model constants for existing modes is relatively straightforward as the base market
shares will be known. Setting the ASC for a new mode is however more problematic, as
the values from SP research will be estimated on choice sets different from those to
which they are applied, may be of the wrong scale, and are likely to be subject to
various respondent biases inherent in the SP experiment. There is no easy solution
here, and recourse to similar travel situations may be required. The constant for the
new mode is therefore a strong candidate for sensitivity testing.
7.4 Forecasting
7.4.1 Forecasting involves applying the above aggregation methods to some alternative
scenario, defined on the basis of two inputs: first, data on the utility variables under the
scenario of interest (e.g. reflecting an increase in fares); and second, the wn parameters (e.g. reflecting changes in socio-demographics). Changes to the latter are
particularly important for long term forecasts, where changing patterns in population,
income and car ownership are likely to be influential on demand.
7.5 Patronage build-up
7.5.1 In most instances, the mode choice model will predict an equilibrium state in which
mode switching occurs instantaneously (e.g. in SP). In reality, however, there is likely to
be inertia within the market, perhaps because of dissipation of knowledge about the
service and/or a delayed behavioural response to the new journey opportunities (e.g. in
RP). A prudent forecaster might factor down initial patronage forecasts to take account
of the delay in take-up. This can be done 'off-model' using rules of thumb or included
within the model by means of an inertia term that decays over time. In the long run
(greater than 2 years), one would expect the overwhelming degree of inertia to have
disappeared. Further advice on this issue is provided in MSA: Cost Benefit Analysis Unit 3.9.2.
8. Model Outputs and Use in Appraisal
8.1 Introduction
8.1.1 As was noted earlier, a key preliminary consideration in the model building process is to
be clear about the purpose of modelling, and how that impinges on the detailed
specification of the model.
8.1.2 It will be necessary, in many cases, to ensure that the output from the mode choice
model is of appropriate form and detail in respect of a number of considerations. This
could be to ensure consistency with other elements of the model system and/or to
appeal to particular informational needs of policy-makers.
8.1.3 Here we consider the case where the purpose of mode choice modelling is to inform
some form of scheme, strategy or project appraisal, where detailed specification would
usually involve consideration of the following issues:
8.2 Spatial detail
8.2.1 Advice on this issue is offered in Model Structures and Traveller Responses for Public
Transport Schemes (Unit 3.11.1). In respect of mode choice modelling, this
essentially refers to the need to adopt an appropriate representation of zones and
movements between them, whilst noting that finer detail implies greater burden in terms
of both data and computation.
8.3 Segmentation by purpose
8.3.1 Advice can again be found in Model Structures and Traveller Responses for Public
Transport Schemes (Unit 3.11.1), although it is noted that a typical breakdown is:
home-based work, home-based employer's business, home-based other, non-home-based
employer's business, and non-home-based other. If education is a significant
fraction of the market then it should always be modelled separately.
8.4 Segmentation by person-type
8.4.1 A minimum requirement is to segment by non-car-owning and car-owning households,
although greater segmentation by numbers of cars and drivers per household is
advisable. Model Structures and Traveller Responses for Public Transport Schemes
(Unit 3.11.1) offers advice.
8.5 Choice set
8.5.1 It is important to consider all relevant choice alternatives in the mode choice model,
whilst noting that the definition of alternatives, and their representation in the tree, may
have a significant impact on the properties of the model (section 5).
8.6 Generalised costs
8.6.1 A function of the mode choice model is to provide calculations of generalised cost by
mode to TUBA. These should be derived by dividing the utility of each alternative by the
cost coefficient.
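As a small worked illustration of 8.6.1, the sketch below divides a linear utility by the cost coefficient to express it in money units. The coefficients and attribute values are hypothetical, and alternative-specific constants are omitted for simplicity; whether they should be included in the generalised cost is not addressed here.

```python
def generalised_cost(attributes, coefficients, cost_name="cost"):
    """Utility divided by the cost coefficient, giving generalised cost in money units."""
    utility = sum(coefficients[k] * attributes[k] for k in coefficients)
    return utility / coefficients[cost_name]


# Hypothetical coefficients: cost in pence, times in minutes.
coefficients = {"cost": -0.02, "in_vehicle_time": -0.04, "wait_time": -0.08}
attributes = {"cost": 150.0, "in_vehicle_time": 20.0, "wait_time": 5.0}

gc = generalised_cost(attributes, coefficients)
print(gc)
```

With these values the implied value of in-vehicle time is 2 pence per minute and of wait time 4 pence per minute, so the generalised cost is 150 + 2 x 20 + 4 x 5 = 210 pence.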
8.6.2 Non-work values of time and walk and wait times to be used in appraisal should be
tested for significant difference from the values recommended in Values of Time and
Operating Costs (Unit 3.5.6); where local values are not statistically significantly
different, they should not be used; where they are statistically significantly different, they
may be used, subject to sensitivity tests using the recommended values.
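The test in 8.6.2 can be sketched as a simple comparison of the local estimate with the recommended value. The guidance does not specify the form of test; this sketch assumes a two-sided test at the 5% level using the normal critical value of 1.96, and treats the recommended value as known without error. The values shown are hypothetical.

```python
def differs_significantly(local_value, local_std_error, recommended_value,
                          critical=1.96):
    """Return the t-statistic for (local - recommended) and whether it
    exceeds the critical value in absolute terms."""
    t_stat = (local_value - recommended_value) / local_std_error
    return t_stat, abs(t_stat) > critical


# Hypothetical values of time in pence per minute.
t_a, use_local_a = differs_significantly(6.0, 0.8, 5.0)
t_b, use_local_b = differs_significantly(7.0, 0.8, 5.0)
print(t_a, use_local_a)
print(t_b, use_local_b)
```

In the first case the local value is not significantly different and, following 8.6.2, the recommended value would be used; in the second case the local value may be used, subject to sensitivity tests with the recommended value.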
8.6.3 The suitability of any local values, including significance and sensitivity testing, must
be fully documented. Further guidance on local values may be found in Values of Time
and Operating Costs (Unit 3.5.6).
8.7 Time
8.7.1 As well as estimating a model for the base year (typically the year in which bespoke SP
data are collected), it will usually be necessary to apply the model to forecasting for
several future years including the opening year(s) of the relevant scheme, forecast
year(s) for appraisal, and a horizon year. Further guidance is provided in Cost Benefit
Analysis (Unit 3.5.4).
8.7.2 Whilst the focus of the above is on the demand side, it should be acknowledged that
generalised cost forecasts are contingent on a range of assumptions regarding the
supply-side, for example vehicle kilometres and vehicle hours.
8.8 Outputs for TUBA
8.8.1 Requisite outputs from the mode choice model for purposes of appraisal include
forecasts (by O-D pair and mode) of:
- Passenger trips
- Total revenue
- Passenger kilometres
- Generalised costs
9. Documentation
A flow-diagram of the stages to model development is presented in Figure 4 below. For
each stage, a summary of the processes involved together with an audit checklist is
presented below. This audit trail should be completed during model development to justify
the methodological approach taken and any assumptions that are made.
Figure 4: Mode choice model development process

9.1 Model design
9.1.1 The model development process starts with a clear definition of the scheme context and
the need for a choice model to assist in the decision-making process. The modelling
approach should be suited to the scheme, its context and its objectives, and should be
capable of generating output suitable for use in appraisal. Above all, the model should
be designed to assist the investment and planning decision-making process. A
summary of the information required for model design is presented below.
|
The model design report should include:
- Information on the nature of the problem and the objectives of the likely solutions;
- A definition of the size and scope of the study area;
- The availability of existing data to establish new models;
- The need to undertake new surveys to establish new models;
- Preliminary model specification, including information on model structure, explanatory variables and estimation procedures;
- Details of the software to create and apply the model;
- The forecasting parameters and years for which the forecasts are required; and
- Information on the timescale and resources required for model development.
|
9.2 Data collection
9.2.1 The data collection exercise should follow the design stage. This exercise should
involve the collection of RP and SP data and may also involve the use of focus groups.
The data should be processed and cleaned and simple analysis undertaken to ensure that the data covers the relevant dimensions needed for the construction of choice
models. This data should be made available to the client, if requested.
|
The data collection report should include the following:
- Documentation of the key findings of focus groups (if undertaken);
- The RP data collection exercise and sampling strategy;
- The SP data collection exercise and sampling strategy. This will also contain information on the questionnaire design, testing by simulation and pilot survey results; and
- Data processing and cleaning. This will include information on the processing of raw data for use in final model development.
|
9.3 Model development
9.3.1 The model development stage includes model estimation, model application and model
validation. This process can be quite complex requiring a number of iterations until a
satisfactory model is achieved.
9.3.2 Once the data has been collected, processed and cleaned, the model estimation
process can commence.
|
The model development report should include information on model estimation.
The report should include a step-by-step account of the stages to model development including:
- the specification of logit models calibrated to each data set;
- the specification and justification for alternative nested structures; and
- the merging of different data sets to develop joint RP-SP choice models.
For each model, evidence and justification is required for:
- the inclusion/exclusion of each variable;
- the specification of the functional form;
- the degree of market segmentation; and
- the significance of any structural coefficients.
For each reported model, information should be presented on:
- the variables included, their unit of measurement, and which alternatives they apply to;
- the estimated coefficients and associated t-statistics/standard errors. Where models are estimated to SP data, the standard errors should be adjusted to account for repeat observations;
- the number of observations; and
- the explanatory fit of the model;
And where appropriate:
- the relative attribute valuations (e.g. value of time) together with estimates of their statistical confidence; and
- the implied elasticities of demand.
|
9.3.3 Following estimation, the model should be applied to forecasting.
|
The model development report should include information on model application.
The models should be applied where possible using sample enumeration techniques.
Documentation is required to:
- Justify the approach to model application;
- Report any weighting of the sample to make it representative;
- Show how explanatory variables change over time; and
- Show how the model is able to recover the base market demand forecasts over a range of market segments.
|
9.3.4 The final stage of model development is to validate the properties of the model against
evidence available from elsewhere.
|
The model development report should include information on model validation.
- The report should comment on the credibility of the forecasts when compared to
actual patronage figures for similar schemes;
- The report should review the relative attribute values (e.g. value of time) implied
by the model and compare them to published evidence; and
- The report should review the own and cross elasticities of demand implied by the
model and compare them to published evidence.
|
9.4 Model output
9.4.1 The final stage to bespoke mode choice development is to report on the model
forecasts.
|
The model outputs report should include:
- Forecasts of generalised cost, passenger demand, revenue and kilometrage by OD
pair and mode;
- Estimates of patronage build-up over time;
- Sensitivity tests on key input parameters; and
- Specification of the schemes tested and scenario forecasts.
|
10. Estimation of Transferred Models
10.1 Introduction
10.1.1 This Section looks at the process of mode choice transfer. However, in some cases a
transferable mode choice model may be embedded in a complete transferable model
system and advice for these cases is also presented.
10.1.2 This section advises on two issues: the first involves importing model coefficients from
other sources; the second involves the transfer of one or more components of an entire
model system, estimated in one area, for application in another.
10.2 Importing model parameters
10.2.1 The first case of transfer is one in which coefficients are available from sources outside
the study area which are believed to be appropriate for mode choice modelling within
the study area.
10.2.2 Importing model parameters is likely to be appropriate only for appraisal of medium and
smaller sized schemes.
10.2.3 As with bespoke modelling, the foundation of a mode choice model using imported
model parameters is the utility formulation describing the choice alternatives, i.e.:
$$V_{in} = \sum_{k} \beta_k x_{ink}$$
where $V_{in}$ is the deterministic component of utility derived from alternative i by user n, $x_{ink}$ are the relevant attribute values (k) relating to alternative i for user n and $\beta_k$ are the model parameters indicating the relative importance of each attribute.
10.2.4 The specific attribute values, for each choice observation, will usually be derived from
networks or other databases, e.g. fares. In bespoke models, the $\beta$s will be estimated
such that the observed choices are best represented. For transferred models, these
parameters are inputs to the modelling. These inputs may be obtained from a number
of sources, including:
- Values of Time and Operating Costs (Unit 3.5.6);
- TRL Report TRL593, The demand for public transport: a practical guide, for information on the relative valuations of public transport journey components;
- Variable Demand Modelling - Key Processes (Unit 3.10.3);
- Passenger Demand Forecasting Handbook (PDFH), in cases where access to this source is possible;
- other SP or RP studies;
- a mixture of the above.
10.2.5 For transferred models, it is essential that the $\beta$s are measured in consistent units. Two
units of measure are generally used: Generalised Costs (GC) and Generalised Times (GT).
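10.2.6 In the generalised cost formulation, all in-vehicle and out-of-vehicle time components ($x$)
are multiplied by appropriate values, by purpose and journey component ($\beta$), to convert
them into monetary values. For example, if we were to consider a typical generalised
cost formulation for a rail journey, it may include in-vehicle time, out-of-vehicle time and
other components, for example:
$$GC_{rail} = \beta_{ivt}\,x_{ivt} + \beta_{acc}\,x_{acc} + \beta_{wait}\,x_{wait} + \beta_{int}\,x_{int} + \beta_{pen}\,n_{int} + \text{fare}$$
where $x_{ivt}$, $x_{acc}$, $x_{wait}$ and $x_{int}$ are the rail in-vehicle, access and egress, (first) wait and interchange times, $n_{int}$ is the number of interchanges, and:
$\beta_{ivt}$ = value of time for travel by rail, for specific purpose of travel
$\beta_{acc}$ = value of time for access and egress to rail
$\beta_{wait}$ = value of time for (first) wait time
$\beta_{int}$ = value of time for interchange time
$\beta_{pen}$ = monetary penalty value of an interchange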
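10.2.7 When using generalised times, all in-vehicle and out-of-vehicle time components ($x$)
must be multiplied by appropriate values, by purpose and journey component ($\beta$), to
convert the component into units of time. The same example rail journey specified
above would now be specified as follows:
$$GT_{rail} = x_{ivt} + \beta_{acc}\,x_{acc} + \beta_{wait}\,x_{wait} + \beta_{int}\,x_{int} + \beta_{pen}\,n_{int} + \beta_{cost}\,\text{fare}$$
where the $x$ terms are the corresponding journey time components, $n_{int}$ is the number of interchanges, and:
$\beta_{acc}$ = value of access and egress time, relative to rail in-vehicle time
$\beta_{wait}$ = value of (first) wait time, relative to rail in-vehicle time
$\beta_{int}$ = value of interchange time, relative to rail in-vehicle time
$\beta_{pen}$ = time penalty value (in terms of rail in-vehicle time) of an interchange
$\beta_{cost}$ = value of money, in terms of rail time (1 / VOT)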
10.2.8 There will be no discernible difference between generalised cost and generalised time
utility formulations in the base year: the difference between the formulations is simply
one of scale. However, there is an important difference for forecasting. Here
differences will arise when assumptions are made of income increases and
corresponding increases in the value of time. When the value of time increases, the
impact of time in generalised cost will increase (i.e. the generalised cost will increase),
but the impact of cost in generalised time will decrease (i.e. the generalised time will
decrease). The model will therefore behave differently if defined on the basis of
generalised cost than if defined on the basis of generalised time. This property is not
unique to transferred models but the procedure for transfer makes the property more
obvious.
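The different forecasting behaviour of the two formulations can be illustrated with a small numerical sketch. The journey times, relative weights, fare and values of time are hypothetical, and the sketch assumes that all time components are valued relative to a single in-vehicle value of time.

```python
def weighted_time(times, weights):
    """Journey time in in-vehicle-equivalent minutes."""
    return sum(weights[k] * times[k] for k in times)


def generalised_cost(times, weights, fare, vot):
    """Generalised cost in pence: weighted time valued at vot, plus fare."""
    return vot * weighted_time(times, weights) + fare


def generalised_time(times, weights, fare, vot):
    """Generalised time in minutes: weighted time plus fare converted at 1 / vot."""
    return weighted_time(times, weights) + fare / vot


# Hypothetical rail journey (minutes) and relative weights.
times = {"in_vehicle": 30.0, "access": 10.0, "wait": 5.0}
weights = {"in_vehicle": 1.0, "access": 2.0, "wait": 2.0}
fare = 300.0  # pence

vot_base, vot_future = 10.0, 12.0  # pence per minute

gc_base = generalised_cost(times, weights, fare, vot_base)
gt_base = generalised_time(times, weights, fare, vot_base)
gc_future = generalised_cost(times, weights, fare, vot_future)
gt_future = generalised_time(times, weights, fare, vot_future)

print(gc_base, gt_base)
print(gc_future, gt_future)
```

In the base year the two measures differ only by the factor of the value of time. When the value of time rises with the journey itself unchanged, generalised cost rises while generalised time falls, which is the asymmetry described in 10.2.8.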
10.2.9 It is generally preferable to define models in terms of generalised time because in this
formulation an increase in income is modelled as making travel easier, while in the
generalised cost formulation it would appear that an increase in income would make
travel more difficult. The generalised time formulation will therefore lead to increasing
trip lengths over time, which is consistent with observed trends, whereas the
generalised cost formulations will lead to declining trip lengths.
10.2.10 No further advice is offered in this guidance with respect to how values of time should
increase over time. Values of Time and Operating Costs (Unit 3.5.6) sets out the
assumptions that should be made for changes in value of time into the future, and
though these recommendations relate strictly to their use in appraisal, they may be
taken as representing reasonable practice for modelling as well.
10.2.11 We do not recommend any changes to the model alternative-specific constants as a
result of forecast changes in income and/or values of time, on the basis that the
unmeasured component of utility, as measured by the alternative-specific constants,
has no expected relationship with income.
10.2.12 For models using imported model parameters, we recommend the use of local RP data
to calibrate both the model scale and alternative specific constants.
10.3 Importing model parameters: Recalibration with disaggregate and semi-aggregate RP data
10.3.1 The advantage of using disaggregate or semi-aggregate RP data for recalibration of
transferred models is that these data allow the direct estimation of the model scale and
alternative-specific constants through Maximum Likelihood estimation of a logit model,
with the accompanying tests of coefficient accuracy and significance that can be
undertaken (see Sections 5.8, 5.9 and 5.10) and model fit. Specifically, the model
results will indicate the accuracy of the scale coefficient and provide evidence of its
validity, i.e. whether it is significantly different from zero. The methodology for
identification of the model scale and alternative specific constants is set out below.
First, for each observation, the generalised cost or time term is calculated for each
alternative (see the example below). The utility equation for each alternative is then defined as the
generalised cost or time term multiplied by a scale parameter, plus a constant (added to all
but one alternative).

1 |