Mode Choice Models: Bespoke and Transferred
TAG Unit 3.11.3

January 2006


Contents

1. Mode Choice Models - A Decision Making Framework

1.1 Introduction

2. Issues for Consideration

2.1 Introduction
2.2 Necessary model components
2.3 Policy responses to be measured in the mode choice model
2.4 Available data
2.5 New Revealed Preference (RP) data
2.6 Advice on whether to develop or transfer models

3. Introduction to Bespoke Mode Choice Models

3.1 Introduction
3.2 The purpose of bespoke mode choice models

4. Model Design

4.1 High-level considerations
4.2 Lower-level considerations

5. Model Development

5.1 Logit model
5.2 Specifying the utility function
5.3 Introducing socio-economic variables
5.4 Alternative-specific constants
5.5 Functional form
5.6 Defining the choice set
5.7 Independence from Irrelevant Alternatives (IIA)
5.8 Maximum likelihood estimation
5.9 Preliminary interpretation
5.10 Further interpretation and diagnostic testing
5.11 Validation
5.12 Nested logit
5.13 Estimating and interpreting nested logit

6. Data Collection

6.1 Revealed Preference and Stated Preference
6.2 SP design methods
6.3 Choice set
6.4 Response method
6.5 Number of alternatives
6.6 Number of replications
6.7 Task complexity
6.8 Which attributes?
6.9 Number of attributes
6.10 Units of measurement
6.11 Numbers of levels of attributes
6.12 Selection of values for levels of attributes
6.13 Combining the attribute levels: orthogonality
6.14 Combining the attribute levels: non-orthogonality
6.15 Combining the attribute levels: boundary values
6.16 Realism
6.17 Testing the design using simulation
6.18 Questionnaire design and implementation
6.19 Staging
6.20 Background information
6.21 Means of presentation
6.22 Interception
6.23 Response rates
6.24 Preamble to questionnaire
6.25 Focus groups
6.26 Pre-pilot survey
6.27 Pilot survey
6.28 Field survey
6.29 Cleaning
6.30 Merging SP data
6.31 Combining RP and SP data
6.32 Merging RP and SP data
6.33 Repeat measurements
6.34 Sampling

7. Model Application

7.1 Introduction to model application
7.2 Sample enumeration
7.3 Adjusting the ASCs
7.4 Forecasting
7.5 Patronage build-up

8. Model Outputs and Use in Appraisal

8.1 Introduction
8.2 Spatial detail
8.3 Segmentation by purpose
8.4 Segmentation by person-type
8.5 Choice set
8.6 Generalised costs
8.7 Time
8.8 Outputs for TUBA

9. Documentation

9.1 Model design
9.2 Data collection
9.3 Model development
9.4 Model output

10. Estimation of Transferred Models

10.1 Introduction
10.2 Importing model parameters
10.3 Importing model parameters: Recalibration with disaggregate and semi-aggregate RP data
10.4 Importing model parameters: Recalibration with aggregate RP data
10.5 Transfer of model systems
10.6 Model validation

11. Further Information

12. References

13. Document Provenance


1. Mode Choice Models - A Decision Making Framework

1.1 Introduction

1.1.1 This TAG Unit provides advice on the modelling and appraisal of major public transport schemes. It should be read in conjunction with Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1), Road Traffic and Public Transport Assignment Modelling (Unit 3.11.2) and Forecasting and Sensitivity Tests for Public Transport Schemes (Unit 3.11.4).

1.1.2 It has three objectives. First, it provides guidance on the decision of when it may be appropriate to develop a bespoke mode choice model or transfer a demand model and/or any of its components. Second, it provides guidance on the process of developing bespoke mode choice models. Third, it provides advice on procedures for transferring mode choice models where this is appropriate.

2. Issues for Consideration

2.1 Introduction

2.1.1 In general, two approaches for travel demand model development can be considered. In one approach, a bespoke model can be developed from local data, using statistical estimation procedures. In the other approach specific model parameters, or in some cases an entire model structure, can be transferred from elsewhere; in this case adjustments will be required so that the model reflects local behaviour.

2.1.2 In fact, every model development project represents a transfer to some extent, in the sense that the modeller's concepts, developed in other areas, are applied in the area of interest. In most cases, model structures (such as the use of a four-stage procedure) and concepts (like generalised cost) are transferred without considering that these are transfers at all.

2.1.3 In general, bespoke modelling does not require pre-conceptions about the structure of the choice model. Alternative structures can be examined using the available dataset and compared in statistical terms in order to select the one that performs best.

2.1.4 In contrast, transferred models require a priori assumptions about the model structure and parameter values. The latter could be defined exogenously, for example published values from recognised sources, e.g. values of time from Values of Time and Vehicle Operating Costs (Unit 3.5.6), lambdas from Variable Demand Modelling - Key Processes (Unit 3.10.3), or imported from other studies.

2.1.5 Many models may fall somewhere between these two types, depending on how the 'imported parameters' are obtained, and the extent to which local data is available to calibrate the model. When we refer to model calibration, we refer specifically to adjustments that will be required to ensure that:

  • the aggregate mode shares are replicated; and
  • the sensitivity of the model (i.e. the scale) is correct.

As part of the calibration procedure, it is important that both the observed mode shares and model scale are replicated.

2.1.6 This TAG Unit provides advice to guide the analyst about the appropriateness of transferring or developing a bespoke model for their specific circumstances, focusing on:

  • the overall model structure;
  • policy variables to be included in the mode choice model;
  • available data;
  • the size of the scheme and the time and cost budget for the study.

2.2 Necessary model components

2.2.1 The model components that are necessary for forecasting and appraising the effects of a scheme will influence the decision of whether a bespoke model is required or whether a model or parts of that model can be transferred from elsewhere.

2.2.2 In the case of car ownership and trip generation, models and forecasts are provided in the DfT's TEMPRO system. This information provides a basis for forecasting for these components that is generally both transferable and defensible. Although it may be desirable in some circumstances to set up local models for these components, it would be necessary to justify the deviation from TEMPRO forecasts, and compare results against that base.

2.2.3 For structures that require a mode choice model only, the appropriate type of model, i.e. bespoke or one based on a transferred model or model parameters, may also depend on:

  • the size of the scheme;
  • the degree of socio-economic segmentation required to adequately reflect traveller behaviour;
  • any particularities in the area and the type of policies that are to be evaluated.

2.2.4 It is unlikely that anything other than bespoke models should be developed for appraisal of:

  • very large public transport schemes;
  • schemes which require rich socio-economic segmentation for appraisal;
  • areas where traveller behaviour, e.g. in terms of values of time, is substantially different from national norms.

2.2.5 Whether an entire model structure from another area could be transferred depends on the following considerations:

  • there is a relevant model to transfer, with appropriate segmentation and behavioural responses;
  • the quality of the model considered for transfer is high (based on analysis of the significance of model coefficients, results of validation tests, etc.);
  • the age of the model considered for transfer;
  • the areas and zone systems are broadly similar;
  • local, preferably disaggregate data, is available (or could be collected) and it is compatible with the original model;
  • the network descriptions between the original model and area of transfer are broadly similar;
  • the mode shares and trip lengths between the original model and area of transfer are generally compatible.

2.2.6 In the case of a transfer, the model structure may be constrained to the structure adopted in the original modelling.

2.2.7 For medium and smaller sized schemes, it may be appropriate to construct simple (transferred) mode choice models based on parameters imported from other sources, provided that appropriate parameters are available. In these cases, calibration of the model to replicate observed mode shares and model scale will still be required.

2.2.8 Transfer of joint mode and destination choice models is possible, if the considerations raised in 2.2.5 are met.

2.3 Policy responses to be measured in the mode choice model

2.3.1 Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1) advises on necessary model components linked to policies to be tested, and particularly policy responses that are expected. Table 1 below identifies the suitability of a transferred mode choice model for appraisal of specific policy improvements, depending on the expected responses from existing car and public transport users.

Table 1: Suitability of Transferred Mode Choice Models for Specific Policy Improvements

| Policy | Expected passenger transfer from car? | Expected passenger transfer from other PT modes? | Recommended model type |
|---|---|---|---|
| PT performance improvements for existing PT modes and conventional variables | No | No | No model required. |
| PT performance improvements for existing PT modes and conventional variables | No | Yes | Allocation of shares could be done through a mode choice model or assignment. Transferred model appropriate for smaller and medium sized schemes; bespoke model required for large schemes. |
| PT performance improvements for existing PT modes and conventional variables | Yes | Perhaps | Mode choice model required. Transferred model appropriate, depending on the scale of the scheme improvements and variables to be considered. Large scheme improvements require bespoke modelling. |
| New modes or new use of existing modes | Yes | Yes | Mode choice model required. Bespoke model required. |
| Car congestion reduction | Yes | N/a | Mode choice model required, with detailed highway assignment. Transferred model appropriate, depending on the scale of the scheme improvements and variables to be considered. |
| Reliability, Quality, Integration | No | No | No model required. |
| Reliability, Quality, Integration | No | Yes | Allocation of shares could be done through a mode choice model or assignment. Model transfer appropriate. |
| Reliability, Quality, Integration | Yes | Perhaps | Mode choice model required. Transferred model may be appropriate, depending on the scale of the scheme improvements and variables to be considered. |
| Park & Ride | Yes | Yes | Mode choice model required, including access mode choice. Transferred model may be appropriate: may need to consider bespoke modelling, depending on the extent of the proposed system. |

2.3.2 Typically public transport schemes will have more than one policy improvement objective. In these cases, the most demanding modelling requirements of the relevant objectives should be met.

2.4 Available data

2.4.1 Bespoke modelling and model transfers have different data requirements.

2.4.2 In general, the data requirements for transferred models are less demanding than those for bespoke models, because statistically reliable behavioural parameters do not have to be estimated. Aggregate data will be required for model calibration, e.g. of constants. Additionally, either semi-aggregate or disaggregate data are required for calibration of the model scale, with disaggregate data being preferred.

2.4.3 There are three types of Revealed Preference (RP) data that could be used for mode choice modelling:

  • aggregate data: including information on aggregate mode shares, mode shares by trip length, etc.;
  • semi-aggregate data: reflecting proportions of choices made by groups of travellers, typically from matrix data of choices by origin, destination and purpose categories, trip length distributions, etc.;
  • disaggregate data: reflecting detailed observations of actual mode choice behaviour, for a sample of travellers, in the relevant study area.

2.4.4 In addition count data can provide information on aggregate mode shares.

2.4.5 For all mode choice models, aggregate RP data are required as a minimum, either for recalibration or validation of mode shares. Information on mode shares by trip length could be used to (manually) calibrate the model scale, but semi-aggregate or disaggregate data provide much better information for this purpose. It is therefore recommended that data on mode shares by trip length category is the minimum information required to calibrate the model scale and is appropriate for development of models for appraisal of small schemes only.

2.4.6 Semi-aggregate data have a relevant role both for bespoke modelling and model transfers. In both cases, these data can be used to provide appropriate scale and constant adjustments. For bespoke models, it is recommended that semi-aggregate data be supplemented with additional disaggregate RP or SP data, in order to estimate statistically reliable model parameters.

2.4.7 Disaggregate data is much richer than semi-aggregate data, in terms of its explanatory power for modelling, and is therefore preferable for bespoke modelling (and for calibration of transferred models, although often disaggregate data will not be available, which may be a reason for opting for a model transfer). Disaggregate choice data can be collected from en-route postcard surveys, from home or phone interviews, travel diaries, as well as from existing sources such as the National Travel Survey. Data collected using choice-based survey procedures, for example, interviewing passengers on a bus or train, will lead to samples which are not representative of observed mode shares. Adjustments to take account of these biases must therefore be made in the model estimation procedure.
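
One adjustment of this kind, shown here as a minimal sketch rather than a prescribed method, is to weight each observation by the ratio of the population mode share to the sample mode share before estimation. The function name and all share figures below are hypothetical, and the population shares are assumed to be known from an independent source such as count data.

```python
def choice_based_weights(population_shares, sample_shares):
    """Return one weight per mode: population share / sample share."""
    weights = {}
    for mode, pop_share in population_shares.items():
        samp_share = sample_shares[mode]
        if samp_share <= 0:
            raise ValueError(f"mode {mode!r} has no sample observations")
        weights[mode] = pop_share / samp_share
    return weights


# Hypothetical figures: bus users over-sampled by an on-board survey.
population_shares = {"car": 0.70, "bus": 0.20, "rail": 0.10}
sample_shares = {"car": 0.40, "bus": 0.45, "rail": 0.15}

weights = choice_based_weights(population_shares, sample_shares)

# Applying the weights to the sample shares recovers the population shares.
weighted_shares = {m: sample_shares[m] * w for m, w in weights.items()}
```

Over-sampled modes receive weights below one and under-sampled modes weights above one, so the weighted sample has the same mode split as the population.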

2.4.8 Stated Preference (SP) data can also play an important role in bespoke modelling, particularly for modelling demand for new modes (see Section 6 for a detailed discussion of the role of SP data).

2.5 New Revealed Preference (RP) data

2.5.1 New RP surveys should collect, as a minimum, the respondents' origin, destination, choice of mode and purpose of travel.

2.5.2 For public transport users, information on licence holding and car availability is important and for users of all modes pass holding or entitlement is valuable. Other background characteristics, for example, age, gender, income, employment status are desirable but may not be possible to collect in the context of the survey.

2.5.3 For mode choice models, observations of people choosing the specified modes are required for all (existing) modes considered in the study.

2.6 Advice on whether to develop or transfer models

2.6.1 Bespoke models are more costly to develop than transferred models, because of the data that is required to estimate statistically reliable behavioural parameters. The time required to estimate these models will also be more extensive than that required for model transfer, leading to higher model development costs.

2.6.2 Bespoke modelling will be necessary in the following situations:

  • for appraisal of new modes or new characteristics, e.g. reliability changes;
  • for appraisal of schemes in areas where traveller behaviour, e.g. values of time, may be substantially different from national norms or values from other existing models.

The transfer of an entire model system from another area could be considered, but only if the conditions in Section 2.2.5 are met.

2.6.3 For medium and smaller sized schemes, it may be appropriate to construct (transferred) mode choice models based on parameters imported from other sources, provided that appropriate parameters are available. In these cases, calibration of the model to replicate observed mode shares and model scale will still be required. Semi-aggregate or disaggregate data will be required for calibration of the model scale, with disaggregate data being preferred. Aggregate data will be required for calibration of the mode-specific constants.

2.6.4 In general, it is difficult to say anything about the relative reliability of estimates produced from bespoke or transferred models, because the quality of a model depends on many issues besides whether it is bespoke or transferred: for example, the amount (and quality) of data for model estimation, the level of geographical segmentation of the model area, and the degree of socio-economic segmentation. It is therefore advised that model validation is undertaken whether bespoke or transferred models are used, including examination of coefficient ratios (e.g. implied values of time), trip lengths, time and cost elasticities, and realism tests. The results of the model validation should be adequately documented.

3. Introduction to Bespoke Mode Choice Models

3.1 Introduction

3.1.1 This and the following Sections of this Unit (Sections 4 to 9) provide guidance on the procedures and documentation required in the development of bespoke mode choice models. The material covered:

  • explains the purpose of bespoke mode choice models;
  • provides guidance on model design;
  • reviews the stages of data collection;
  • gives guidance on model estimation, application and validation; and
  • sets out the required documentation for audit.

3.2 The purpose of bespoke mode choice models

3.2.1 Bespoke mode choice models are used to provide forecasts of passenger demand for transport services. They are typically applied where transport services are new, such as the introduction of a light rail system.

3.2.2 Bespoke specification involves developing a new model specific to the context of interest and estimating local parameters. As discussed above, this contrasts with a transferred specification, which involves importing parameters from elsewhere (whether standard values, or valuations identified in previous studies - see Section 10 below).

3.2.3 The guidance that follows is specific to bespoke mode choice models, and considers the specification and estimation of such models, as well as the design and implementation of associated data collection.

3.2.4 It is noted at the outset that the development of a mode choice model is a specialist activity, often requiring the creative contributions of a skilled analyst, but within a structured framework defined by good practice. The objective of this TAG Unit is to provide guidance on the procedures, testing and documentation that would be required for an analyst to demonstrate adherence to good practice. Whilst identifying many elements of good practice, this guidance should not in itself be considered a detailed technical guide to the development of mode choice models.

4. Model Design

4.1 High-level considerations

4.1.1 The choice of modelling approach will depend on a number of potentially conflicting factors:

  • the nature of identified problems and their likely solutions;
  • the definition and size of the study area;
  • the likely number of options to be tested;
  • the availability of data and existing models;
  • the need to update and (re)calibrate models;
  • the need to conduct new surveys;
  • the timescale for model development; and
  • the required accuracy and robustness of results/recommendations.

4.1.2 As well as guiding strategic-level decisions of the analyst such as sub-mode choice vs. public transport assignment and bespoke vs. transferred, the above considerations should be borne in mind when formulating the precise specification of the bespoke mode choice model - if this is the chosen approach - as well as any associated data collection.

4.1.3 An important strategic question should be one of 'what is required from the model?' Indeed 'fitness for purpose' should be a guiding principle throughout the design process. A number of sub-questions follow.

4.1.4 Is the interest restricted to existing modes, or is consideration of a new mode required? If there were interest in new modes, this would usually imply a need for a bespoke mode choice model, as well as a need for new data collection.

4.1.5 What relevant data already exists? Following from the above, it is important to establish what existing sources of revealed preference (RP), or even historical stated preference (SP) data, might usefully enhance model development. This includes not only disaggregate choice data but also aggregate planning data and traffic count data. The availability of such data may impact on the scale and scope of any new data collection, the specification of the model (for example, whether the model should accommodate a range of data sources, the extent of segmentation, and the nature of any validation procedure), and the reliability of the model in application.

4.1.6 Do similar studies exist, and what methodologies were employed? Further to 4.1.5, analysis of similar studies may yield not only relevant data sources, but also valuable insight into how the mode choice model and data collection might be specified. The analyst should identify and review such studies, and justify the chosen methodology against these.

4.1.7 What resources, in terms of both time and money, are available for mode choice research? Whatever the scheme or policy under investigation, the analyst should always ensure that the budget designated for data collection and mode choice modelling, and the timescale for such activities, is commensurate with the nature and complexity of the problem, and the likely scale and extent of impacts.

4.1.8 In demonstrating that the analysis undertaken follows good practice, the analyst should ensure that the final report to the client adheres to the audit trail detailed in section 9. As a supplement to this report, all questionnaire materials, data sets, and analytical tools (including model command files) should be made available to the client, if requested.

4.2 Lower-level considerations

4.2.1 Having considered the high-level issues, preliminary model specification should move on to address a series of lower-level issues. Each of these issues is fundamental to model specification and, it follows, data requirements. These issues could have a significant impact on resource needs in analysis.

4.2.2 What is the relevant unit of decision-maker? For mode choice, this will usually be the individual, although the travelling party may be relevant in some cases. The advantage of the latter is that car cost is accounted for correctly.

4.2.3 What is the choice set? The analyst should identify the alternatives of interest, including any new modes, and take an initial view on the availability of the complete choice set to decision-makers, as well as the extent of any captivity to alternatives.

4.2.4 What are the key behavioural variables of interest? As well as standard variables such as time and cost, the analyst should consider the relevance of 'softer' variables such as crowding/congestion, quality and reliability. This will be dictated by the nature of the scheme or policy under investigation, and the likely scale and scope of its impacts. For example, an interest in road pricing would usually imply a need to investigate the prevalence of congestion effects. Where such variables are of interest, advice should be sought from Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1) and Road Traffic and Public Transport Assignment Modelling (Unit 3.11.2).

4.2.5 What are the key policy variables of interest? These might include fares, service frequency, reliability, accessibility, and quality in public transport modes, and road pricing and parking for cars.

4.2.6 What are the key socio-economic variables of interest? These will usually include age, sex, employment status, income, and car ownership.

5. Model Development

5.1 Logit model

5.1.1 Mode choice models, as conventionally specified, are based on the behavioural principle that a decision-maker will choose the travel mode that yields greatest satisfaction or 'utility'.

5.1.2 Utility is postulated to be a function of both observable (or deterministic) utility and unobservable (or random) utility. Specifically:

$$U_{in} = V_{in} + \varepsilon_{in}$$

where $V_{in}$ is the deterministic utility derived from alternative i by decision-maker n, and $\varepsilon_{in}$ is the associated random utility.

5.1.3 For purposes of implementation, a specific model form should be adopted. Although there exist a range of options, model development should always commence with the logit form. Logit offers substantial versatility; indeed it will be sufficient for many needs. Where more complex forms are deemed necessary, logit offers a valuable benchmark for comparison.

5.1.4 Logit relates probability of choosing alternative i from J alternatives as follows:

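$$P_{in} = \frac{\exp(\lambda V_{in})}{\sum_{j=1}^{J} \exp(\lambda V_{jn})} \qquad (5.1)$$

where $P_{in}$ is the probability that decision-maker n chooses alternative i and $\lambda$ is a strictly positive scale parameter.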
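
As an illustration of equation (5.1), the sketch below computes logit probabilities for one decision-maker. The function name and the utility values are hypothetical; the scale parameter defaults to one.

```python
import math


def logit_probabilities(utilities, scale=1.0):
    """Equation (5.1): P_i = exp(scale * V_i) / sum_j exp(scale * V_j)."""
    if scale <= 0:
        raise ValueError("scale must be strictly positive")
    # Subtracting the maximum avoids overflow and does not change the result,
    # because only utility differences matter.
    v_max = max(utilities)
    exps = [math.exp(scale * (v - v_max)) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical deterministic utilities for car, bus and rail.
probs = logit_probabilities([-1.0, -1.5, -2.0])
```

The probabilities sum to one, and the alternative with the highest deterministic utility receives the highest probability.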

5.1.5 In the context of mode choice, convention is to reinterpret utility as 'generalised cost', which is essentially the negative of deterministic utility expressed in monetary units. The methods discussed below are based on the construct of utility, since this typically includes additional variables - such as ones relating to the decision-maker - that may be difficult to translate into generalised cost terms.

5.1.6 With reference to equation (5.1), the scale parameter $\lambda$ is inversely related to the variance $\sigma^2$ of random utility (or 'error') as follows:

$$\lambda^2 = \frac{\pi^2}{6\sigma^2}$$

The amount of error has important implications for the properties of the model. All else equal, the greater the error, the smaller the scale parameter and the closer the choice probabilities will tend to 1/J for all J alternatives. This issue is known as the 'scale factor problem' and is of particular relevance when estimating models to SP data, which may contain biases and errors typically not found in RP data. Since it cannot in practice be estimated separately from the utility parameters $\beta$ (Section 5.2), $\lambda$ is commonly taken to be one.
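
The effect of error variance on the choice probabilities can be checked numerically. The sketch below assumes the standard logit relationship between the scale and the standard deviation of random utility, scale = pi / (sigma * sqrt(6)); the function names and utilities are hypothetical.

```python
import math


def scale_from_sigma(sigma):
    """Logit scale implied by the standard deviation of random utility."""
    return math.pi / (sigma * math.sqrt(6.0))


def logit_probabilities(utilities, scale):
    v_max = max(utilities)
    exps = [math.exp(scale * (v - v_max)) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


utilities = [-1.0, -2.0, -3.0]  # hypothetical, J = 3 alternatives

for sigma in (0.5, 2.0, 50.0):
    scale = scale_from_sigma(sigma)
    probs = logit_probabilities(utilities, scale)
    print(f"sigma={sigma:5.1f}  scale={scale:6.3f}  "
          f"probs={[round(p, 3) for p in probs]}")
```

As sigma grows, the scale shrinks and the three probabilities move towards 1/3 each, whatever the differences in deterministic utility.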

5.1.7 It is differences in deterministic utility across alternatives that influence probability - not absolute utility. The relationship of utility difference to logit probability is sigmoid (Figure 1). Thus, if an alternative has an extreme probability (whether high or low), a small change in utility difference will have little impact on probability of choice, whereas if an alternative has a probability close to 0.5, the same change in utility difference will have considerably greater impact on probability.

Figure 1: Plot of logit probability against utility difference
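
The sigmoid relationship can be illustrated for the binary case. This is a minimal sketch with a scale of one; the function name and the chosen utility differences are hypothetical.

```python
import math


def binary_logit(utility_difference):
    """Probability of choosing alternative 1 given V1 - V2 (scale of one)."""
    return 1.0 / (1.0 + math.exp(-utility_difference))


step = 0.2  # the same small change in utility difference in both cases

# Starting from a probability of 0.5 (utility difference of zero).
change_mid = binary_logit(0.0 + step) - binary_logit(0.0)

# Starting from an extreme probability (utility difference of four).
change_tail = binary_logit(4.0 + step) - binary_logit(4.0)
```

Here `change_mid` is more than ten times `change_tail`, which is the behaviour described in 5.1.7.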

5.2 Specifying the utility function

5.2.1 An important practical issue is the specification of the deterministic utility $V_{in}$. This is typically represented as a function of observed variables relating to the alternative and the decision-maker.

5.2.2 As regards functional form, linear-in-parameters is sufficient for most needs. Indeed, under fairly general conditions, any function can be approximated arbitrarily closely by the linear-in-parameters form.

5.2.3 Variables relating to alternatives may be entered in the function 'directly', as follows:

$$V_{in} = \sum_{k} \beta_k x_{ink}$$

where $x_{ink}$ are observations relating to the kth variable (or 'attribute') of decision-maker n and alternative i, and the $\beta_k$ are associated parameters. For example, if there were interest in the effect of time (T) and cost (C) on choice, an appropriate representation would be as follows:

$$V_{in} = \beta_T T_{in} + \beta_C C_{in} \qquad (5.2)$$

where $\beta_T$ and $\beta_C$ are utility or 'taste' parameters relating to time and cost, respectively. Since both time and cost are, in terms of utility, perceived as 'bad', $\beta_T, \beta_C < 0$.

5.2.4 The parameters in equation (5.2) are shown to be 'generic' across choice alternatives. Thus attributes that are common to alternatives are specified as having common parameters, such that estimates of these parameters will be averages across the data. This is not however a requirement; whilst convention is to specify cost as generic, other parameters may be specified not only as mode-specific, but also as person-specific, depending on the focus of interest. Relaxing the assumption of generic parameters allows for different values of time for different modes, people, or both modes and people, for example.

5.2.5 Analysts should adopt the standard of expressing time in minutes and cost in pence. Furthermore, care should be taken to express each in terms of single trip units; return trip attributes such as parking charges should be halved.
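
The unit conventions above can be sketched as follows for a utility function in time and cost. The coefficient values, trip attributes and function names are hypothetical; the ratio of the time and cost coefficients gives the implied value of time in pence per minute.

```python
def single_trip_cost_pence(out_of_pocket_pence=0.0, return_parking_pence=0.0):
    """Cost per single trip; a return-trip parking charge is halved."""
    return out_of_pocket_pence + return_parking_pence / 2.0


def utility(time_min, cost_pence, beta_time, beta_cost):
    """Linear-in-parameters utility: V = beta_T * T + beta_C * C."""
    return beta_time * time_min + beta_cost * cost_pence


# Hypothetical coefficients, both negative because time and cost are 'bad'.
BETA_TIME = -0.04   # utility per minute
BETA_COST = -0.005  # utility per pence

car_cost = single_trip_cost_pence(out_of_pocket_pence=150.0,
                                  return_parking_pence=400.0)
v_car = utility(25.0, car_cost, BETA_TIME, BETA_COST)
v_bus = utility(40.0, single_trip_cost_pence(out_of_pocket_pence=180.0),
                BETA_TIME, BETA_COST)

# Implied value of time in pence per minute (ratio of coefficients).
value_of_time = BETA_TIME / BETA_COST
```

With these illustrative numbers the implied value of time is about 8 pence per minute; a mode-specific or person-specific time coefficient would give a different ratio for each mode or person type.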

5.3 Introducing socio-economic variables

5.3.1 Following from 5.1.7, variables that for a given observation are common across alternatives, such as those relating to the decision-maker, should not be entered 'directly' since they have no impact on utility difference (and therefore probability). They must instead be 'interacted' with variable(s) that do vary across alternatives. For example, if there were interest in the influence of age on the responsiveness of choice to cost, (5.2) could be re-written:

$$V_{in} = \beta_T T_{in} + \beta_{Cn} C_{in} \qquad (5.3)$$

where:

$$\beta_{Cn} = \beta_C + Y M_n$$

and Y is the parameter relating to the interaction between cost and age (M).

Substituting into (5.3), the interaction of age with cost can be seen clearly:

$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + Y M_n C_{in} \qquad (5.4)$$
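
The reason for interacting age with cost, rather than entering it directly, can be checked numerically. In this sketch all coefficients and trip attributes are hypothetical, and the interaction parameter Y is written `Y_AGE_COST`. An age term that is identical across alternatives leaves the probabilities unchanged, whereas the age and cost interaction of (5.4) alters them.

```python
import math


def logit_probabilities(utilities):
    v_max = max(utilities)
    exps = [math.exp(v - v_max) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


BETA_TIME, BETA_COST = -0.04, -0.005  # hypothetical taste parameters
Y_AGE_COST = -0.00005                 # hypothetical interaction parameter

# (time in minutes, cost in pence) for car and bus.
alternatives = [(25.0, 350.0), (40.0, 180.0)]
age = 60.0

base = [BETA_TIME * t + BETA_COST * c for t, c in alternatives]

# Age entered 'directly': the same amount is added to every alternative.
direct = [v + 0.01 * age for v in base]

# Age interacted with cost, as in (5.4).
interacted = [v + Y_AGE_COST * age * c
              for v, (t, c) in zip(base, alternatives)]

p_base = logit_probabilities(base)
p_direct = logit_probabilities(direct)
p_interacted = logit_probabilities(interacted)
```

`p_direct` equals `p_base` up to rounding, while `p_interacted` shifts probability away from the more expensive alternative.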

5.3.2 It should be noted that the parameters of (5.4) are again represented generically. Since logit models are usually estimated on data from a sample of decision-makers (the topic of sampling is considered in Section 6.34), it may be revealing to extend the model to investigate the potential for tastes to vary across decision-makers.

5.3.3 If, for example, there were interest in the distribution of tastes with respect to the interaction between age and cost, (5.4) could be re-specified:

$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + Y_n M_n C_{in}$$

which would yield a separate Y parameter for each decision-maker n.

5.3.4 An alternative, and more efficient, representation would be to segment decision-makers by age group, and represent each group by a dummy variable. For any such variable, dummies should be included explicitly in the model for all but one group, thereby avoiding the 'dummy variable trap'. If, for example, the data were assigned to one of three age groups, dummies for two of the groups should be included in the model, with the third acting as the 'base'. The V function should then be represented as follows, where in this case l = 1,2.

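$$V_{in} = \beta_T T_{in} + \beta_C C_{in} + \sum_{l} Y_l D_{ln} C_{in}$$

where $D_{ln}$ is a dummy variable equal to one if decision-maker n belongs to age group l and zero otherwise, and $Y_l$ is the cost parameter specific to segment l.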
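
A minimal sketch of the dummy-variable segmentation follows, assuming three hypothetical age bands with the youngest as the base. The coefficient values and names are illustrative only; each segment parameter is treated as an increment to the base cost coefficient.

```python
AGE_GROUPS = ("under_30", "30_to_59", "60_plus")  # hypothetical bands
BASE_GROUP = "under_30"                           # no dummy for the base

BETA_TIME, BETA_COST = -0.04, -0.005              # hypothetical
# One incremental cost parameter per non-base group (l = 1, 2).
Y_SEGMENT = {"30_to_59": 0.001, "60_plus": -0.002}


def age_dummies(group):
    """Dummies for every group except the base (avoids the dummy variable trap)."""
    if group not in AGE_GROUPS:
        raise ValueError(f"unknown age group: {group!r}")
    return {g: 1 if g == group else 0 for g in AGE_GROUPS if g != BASE_GROUP}


def utility(time_min, cost_pence, group):
    """V = beta_T * T + (beta_C + sum_l Y_l * D_l) * C."""
    dummies = age_dummies(group)
    cost_coeff = BETA_COST + sum(Y_SEGMENT[g] * d for g, d in dummies.items())
    return BETA_TIME * time_min + cost_coeff * cost_pence


v_base = utility(10.0, 100.0, "under_30")
v_older = utility(10.0, 100.0, "60_plus")
```

For the base group both dummies are zero, so its cost coefficient is simply the base value; the other groups differ from it by their own segment parameter.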

5.4 Alternative-specific constants

5.4.1 It is essential to include a constant in the utility function of all but one choice alternative. This constant is referred to as the 'alternative-specific constant' (ASC) or 'mode-specific constant', specifically:

$$V_{in} = \mathrm{ASC}_i + \sum_{k} \beta_k x_{ink}$$

where $\mathrm{ASC}_i$ is the alternative-specific constant relating to alternative i, and the sum is the linear-in-parameters function of attributes $x_{ink}$ and parameters $\beta_k$ introduced in 5.2.3.

5.4.2 The ASC is omitted for one alternative - which becomes the 'base' - again to avoid the 'dummy variable trap'. An ASC can be interpreted as representing the net average effect of omitted variables (relative to the base). The inclusion of ASCs ensures that, when estimated by maximum likelihood, logit is able to replicate the aggregate choice shares.
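
The share-replication property is easiest to see in the simplest case, a model containing ASCs only. There the maximum likelihood estimates have a closed form, ASC_i = ln(count_i / count_base). The sketch below uses hypothetical choice counts and function names; with other variables in the utility function, the ASCs would instead be estimated jointly with the other parameters by the estimation software.

```python
import math


def constants_only_ascs(choice_counts, base):
    """Maximum likelihood ASCs for a logit model with constants only."""
    base_count = choice_counts[base]
    return {mode: math.log(count / base_count)
            for mode, count in choice_counts.items()}


def predicted_shares(ascs):
    """Logit shares when the utility of each mode is just its ASC."""
    exps = {mode: math.exp(asc) for mode, asc in ascs.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}


# Hypothetical sample of 1,000 observed choices.
counts = {"car": 650, "bus": 250, "rail": 100}

ascs = constants_only_ascs(counts, base="car")  # the base ASC is zero
shares = predicted_shares(ascs)
```

The predicted shares match the observed shares of 0.65, 0.25 and 0.10, and the ASC of the base alternative is zero by construction.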

5.5 Functional form

5.5.1 Thus far, the model has adhered to linearity-in-variables as well as linearity-in-parameters. In some cases, non-linear forms may offer additional flexibility. One such case, which may be useful in mode choice studies, would be to incorporate an explicit 'income effect' i.e. the marginal utility of income diminishes with increasing income. Chapter 8 of Ortúzar and Willumsen (2001) provides direction on such specifications.

5.5.2 As will become evident later, the linear form is particularly attractive when it comes to interpretation of the model, although additional flexibility can be achieved by introducing non-linearity. Among the more popular alternatives are the following:

mathematical formula

While it is difficult to offer clear prescription, the case for using such forms should be based on a combination of theory (i.e. behavioural rationale) and/or data (i.e. empirical support). One of the more common contexts for non-linear forms is where national-level data (e.g. for trip lengths) may be distinct from local level data.

5.6 Defining the choice set

5.6.1 An important specification task is to define the choice set appropriately. For bespoke mode choice models this will usually be relatively small and clearly defined, at least in an aggregate sense. What may be less obvious is the propensity for decision-makers to consider only a subset of alternatives when actually choosing. There could be any number of reasons why particular alternatives might not be considered, although by far the most common issue in mode choice modelling is car availability. It is important to identify such constraints, and represent the appropriate choice set for each decision-maker (an 'unavailability of alternatives' command is provided in most software packages). Although it is necessary, for successful estimation, that at least some decision-makers choose each choice alternative, it is not a requirement that all decision-makers have access to the full choice set.
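A minimal sketch of how an individual-specific choice set might be handled when computing logit probabilities is given below. The function name, the availability flags and the utility values are assumptions for illustration; estimation software will normally provide its own mechanism.

```python
import math


def logit_probabilities(utilities, available):
    """Logit probabilities over the alternatives available to one decision-maker.

    Unavailable alternatives receive a probability of zero.
    """
    choice_set = [alt for alt in utilities if available.get(alt, False)]
    if not choice_set:
        raise ValueError("decision-maker has an empty choice set")
    v_max = max(utilities[alt] for alt in choice_set)  # guards against overflow
    exp_v = {alt: math.exp(utilities[alt] - v_max) for alt in choice_set}
    denom = sum(exp_v.values())
    return {alt: exp_v.get(alt, 0.0) / denom for alt in utilities}


# Invented utilities; the second traveller has no car available.
v = {"car": 0.5, "bus": 0.0, "rail": 0.2}
with_car = logit_probabilities(v, {"car": True, "bus": True, "rail": True})
no_car = logit_probabilities(v, {"car": False, "bus": True, "rail": True})
print(with_car)
print(no_car)
```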

5.6.2 Where choice models are developed to consider the demand implications of a new mode, there is a range of issues associated with data collection and with how the new mode should be considered during forecasting. As regards the former, a requirement for an SP experiment would often be implied (guidance is offered in Section 6). As regards the latter, a particular issue is the specification of ASCs (Sections 5.4 and 7.3).

5.7 Independence from Irrelevant Alternatives (IIA)

5.7.1 In defining the choice set, it should be noted that logit is characterised by the property of independence from irrelevant alternatives (IIA); that is, for any two alternatives, the ratio of their choice probabilities is unaffected by the presence or absence of any other alternatives in the choice set. Where two alternatives in the choice set are closely related in some sense (i.e. the 'red bus-blue bus' problem), IIA is violated, and the use of logit is (in principle) inappropriate. It should be noted that the IIA property of the logit model is evident at the level of the decision-maker and not always present for groups of decision-makers.
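The IIA property, and the 'red bus-blue bus' problem, can be seen in a small numerical sketch. The utilities are set to zero purely for illustration: adding a blue bus that is identical to the red bus leaves the car/red bus ratio unchanged, so logit takes share from car when intuition suggests the two buses should simply split the original bus share.

```python
import math


def mnl(utilities):
    """Multinomial logit probabilities for a dictionary of utilities."""
    denom = sum(math.exp(v) for v in utilities.values())
    return {alt: math.exp(v) / denom for alt, v in utilities.items()}


two = mnl({"car": 0.0, "red_bus": 0.0})
three = mnl({"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0})

# The ratio of car to red bus probabilities is the same in both choice sets.
print(two["car"] / two["red_bus"], three["car"] / three["red_bus"])
# Car falls from one half to one third, rather than staying at one half.
print(two["car"], three["car"])
```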

5.7.2 There are a number of ways to identify cases where the IIA assumption is violated, but arguably the most practical is to calibrate nested models. This process not only identifies whether IIA is violated, but also indicates how the violation can be alleviated (see Section 5.12).

5.7.3 Where IIA provides an accurate representation of reality, it may permit considerable efficiency in analysis, since models can be estimated on restricted choice sets.

5.8 Maximum likelihood estimation

5.8.1 Logit can be estimated on RP data, SP data, or a combination of the two. Such data are usually collected on a sample of decision-makers from the population of interest. Data collection is considered in detail in section 6 of this guidance.

5.8.2 Convention is to estimate logit by maximum likelihood (ML), the purpose of which is to estimate the parameters for which the observed sample is most likely to have occurred. A number of software packages offer routines for ML estimation, although these may vary considerably according to their cost, ease-of-use and flexibility. Whichever software is chosen, estimation of logit by ML is usually reliable, and it is uncommon for close examination of the ML routine to be required.

5.8.3 Where computational problems are encountered in estimation, closer examination of the ML routine may be necessary. A reasonably detailed account of the most popular ML algorithms is offered in Train (2003), along with diagnostic advice on how common estimation problems may be overcome.

5.8.4 Having estimated a logit model by ML, an initial post-estimation check is to ensure that the ML routine converged successfully - this is reported as standard output in most software packages. If the model failed to converge, then it is necessary to investigate the reasons for this, resolve them, and repeat the estimation. In the event of such problems, the software may provide appropriate prescriptive advice, although it is often necessary for the analyst to interpret such advice in the context of how the data, model and estimation routine have been specified. The analyst should not draw behavioural conclusions from software failure.

5.9 Preliminary interpretation

5.9.5 Having estimated logit successfully, a series of preliminary tasks in statistical inference should be undertaken. Each of the utility parameters should be subjected to a Student's t-test for statistical significance and, strictly speaking, only parameters that are statistically significant should be retained for purposes of model application. In practice, however, the decision to include or exclude a given variable is less clear cut and depends as much on the sign and relative magnitude of the coefficient as on its standard error. Accepting a coefficient with an inappropriate sign or magnitude simply because it is statistically significant is clearly wrong, as is rejecting a key policy variable because it is marginally insignificant. The development of a choice model is largely guided by experience, informed by what the standard errors imply about the accuracy of the coefficients.

5.9.6 When making such judgements, it may be informative to consider the relationship between statistical significance and sample size. More specifically, for large populations and relatively small samples - which is the typical context for mode choice modelling - the standard error of an estimate relates approximately to sample variance and sample size as follows:

mathematical formula

where mathematical formula is the sample variance of mathematical formula and N is the sample size. To illustrate this relation, a quadrupling of the sample size would, for given sample variance, imply a doubling of the t-ratio in a test of statistical significance.
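The effect of sample size on the t-ratio can be checked numerically. The sketch below assumes the approximation takes the familiar form in which the standard error is the square root of the sample variance divided by N; the parameter estimate and variance are invented.

```python
import math


def standard_error(sample_variance, n):
    """Approximate standard error: sqrt(sample variance / sample size)."""
    return math.sqrt(sample_variance / n)


def t_ratio(estimate, sample_variance, n):
    """t-ratio of an estimate against a null hypothesis of zero."""
    return estimate / standard_error(sample_variance, n)


# Same estimate and sample variance, with the sample size quadrupled.
t_small = t_ratio(-0.04, 0.16, 400)
t_large = t_ratio(-0.04, 0.16, 1600)
print(t_small, t_large, t_large / t_small)
```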

5.9.7 Such considerations may also impact on the specified degree of segmentation, since greater segmentation reduces the number of observations in each segment and may therefore imply increased standard errors for segment-specific parameters. Moreover, where budget and/or other constraints restrict sampling, the retention of insignificant variables may be justifiable if a modest expansion of the data set would likely bring significance.

5.9.8 The sign of each significant parameter should be assessed as to its intuitive validity; for example, fares should always have a negative effect on utility.

5.10 Further interpretation and diagnostic testing

5.10.9 Analogous to least squares estimation, the prevalence of any (near) collinearity between variables may affect the sign and/or significance of parameter estimates. Such dependency can be investigated through estimating models with restricted sets of variables, and examining the behaviour of the model as variables are added or removed. Good estimation software will also produce parameter correlation matrices, analysis of which will inform any such investigations.

5.10.10 Referring back to 5.1.6, it should be noted that the parameters in the utility function are scaled relative to the variance of unobserved factors; larger variance in mathematical formula will lead to smaller mathematical formula. When it comes to interpretation, therefore, ratios of parameters are more meaningful than absolutes, since the scale factor mathematical formula cancels out.

5.10.11 A further attraction of ratios of parameters is that, at least in the case of a linear functional form, they have ready economic meaning as 'marginal rates of substitution'. In particular, if the denominator of such a ratio is a cost parameter, then the ratio can be interpreted as the marginal rate of substitution with respect to cost, or in other words 'value'. For example, and with reference to (5.3), the value of time is given by the ratio of time and cost parameters:

mathematical formula

5.10.12 VOT can, more generally, be derived from any functional form by taking the ratio of marginal utilities, as follows:

mathematical formula

5.10.13 Any derived valuations should be tested for statistical significance; tests for significant difference from 'reference' values (such as 'standard' values) may also be insightful.
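A sketch of the value of time calculation for the linear case, together with one common way of obtaining an approximate standard error for the ratio, is given below. The delta method used here is an assumption on our part rather than a method prescribed by this guidance, and the parameter estimates, variances and units (minutes and pence) are invented.

```python
import math


def value_of_time(beta_time, beta_cost):
    """Ratio of time and cost parameters (here pence per minute)."""
    return beta_time / beta_cost


def ratio_standard_error(beta_time, beta_cost, var_time, var_cost, cov_tc):
    """First-order (delta method) standard error of beta_time / beta_cost."""
    r = beta_time / beta_cost
    variance = (var_time + r * r * var_cost - 2.0 * r * cov_tc) / beta_cost ** 2
    return math.sqrt(variance)


# Invented estimates: utility per minute and utility per pence.
b_time, b_cost = -0.06, -0.012
vot = value_of_time(b_time, b_cost)
se = ratio_standard_error(b_time, b_cost, var_time=1e-4, var_cost=4e-6, cov_tc=0.0)
print(vot, se, vot / se)  # valuation, standard error, t-ratio against zero
```

A test against a reference value would replace `vot / se` with `(vot - reference) / se`.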

5.10.14 If estimated by ML, the goodness of fit of a logit specification should be measured using the log-likelihood (commonly referred to as 'rho-squared') index. The basic form of this index is defined:

$\rho^2 = 1 - \frac{LL_f}{LL_r}$

where LLf is the final log-likelihood of the full model, and LLr is the final log-likelihood of a restricted model.

5.10.15 Although a number of restricted models may offer bases for meaningful tests, a minimum requirement should be to implement the test with a market share model (i.e. a restricted version of the full model that includes only ASCs) as the base. Such a formulation yields the widely used 'rho-squared with respect to constants' index.

5.10.16 The ρ² (rho-squared) index offers a measure of the goodness of fit of the logit model, and is analogous to the R² statistic in ordinary least squares regression. The value of ρ² lies between zero and one, but values between 0.2 and 0.4 are often considered indicative of very good fits. In common with R², the ρ² with respect to constants is comparable across different samples.
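The rho-squared index can be computed directly from the two final log-likelihoods. The sketch below assumes the usual definition, one minus the ratio of the full-model log-likelihood to the restricted-model log-likelihood; the log-likelihood values are invented.

```python
def rho_squared(ll_full, ll_restricted):
    """Rho-squared index: 1 - LLf / LLr (both log-likelihoods are negative)."""
    if ll_restricted == 0:
        raise ValueError("restricted log-likelihood must be non-zero")
    return 1.0 - ll_full / ll_restricted


# Invented values: LLr from a constants-only (market share) model.
rho2_constants = rho_squared(ll_full=-750.0, ll_restricted=-1000.0)
print(rho2_constants)
```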

5.10.17 Expanding on the t-tests for hypotheses regarding individual parameters, it may be insightful in some cases to test more complex hypotheses regarding subsets of parameters. Two of the more common such hypotheses are (i) that the coefficients of a subset of variables are collectively zero; (ii) that the coefficients of two variables are the same. Both of these tests can be implemented using a likelihood ratio test, which is given by the general form:

$R = \frac{L_r}{L_f}$

where Lr is the final likelihood of the restricted model under the null hypothesis (e.g. in case (i), the restricted model would constrain the relevant subset of coefficients to be zero), and Lf is the final likelihood of the unrestricted model. Thus a restricted model should be estimated by ML in accordance with the null hypothesis. The test statistic is given by -2 log R, which is distributed chi-squared with degrees of freedom equal to the number of restrictions implied by the null hypothesis.
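In practice the test is computed from log-likelihoods, since -2 log R equals -2(LLr - LLf). The sketch below uses invented log-likelihood values and a short table of standard 5% chi-squared critical values; the function names are illustrative.

```python
# 5% critical values of the chi-squared distribution for 1 to 3 restrictions.
CHI2_CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}


def likelihood_ratio_statistic(ll_restricted, ll_full):
    """Test statistic -2 log R, written in terms of final log-likelihoods."""
    return -2.0 * (ll_restricted - ll_full)


def reject_null(ll_restricted, ll_full, n_restrictions):
    """True if the restrictions are rejected at the 5% level."""
    stat = likelihood_ratio_statistic(ll_restricted, ll_full)
    return stat > CHI2_CRITICAL_5PCT[n_restrictions]


# Invented example: two coefficients constrained to zero in the restricted model.
stat = likelihood_ratio_statistic(ll_restricted=-755.0, ll_full=-750.0)
print(stat, reject_null(-755.0, -750.0, n_restrictions=2))
```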

5.11 Validation

5.11.1 Having conducted the above procedures in statistical inference, the properties of the estimated model should be validated against benchmark empirical evidence. Such investigations should focus on two principal constructs - valuation and elasticity. As regards the former, any valuations implied by the estimated model should be reconciled, where possible, with empirical evidence from comparable local schemes. Where such evidence is unavailable at the local level, recourse to national evidence should be made. As regards the latter, the elasticity properties of the model should be similarly compared against available local evidence. Such analysis can be based on measures of point elasticity, calculated for both direct and cross effects, across the sample. The relevant formulae for direct and cross elasticity for each decision-maker are, respectively:

mathematical formula

5.11.2 To obtain elasticity estimates for the sample as a whole it is usual to take a weighted average of the elasticity estimates for each decision-maker, with the weights being the individual choice probabilities for the mode in question. Simply inserting average values for P and x will, if there is any variance in the data, lead to an aggregation bias and incorrect elasticities. An alternative method is to make small changes to the variables during model application, and derive arc elasticity estimates from the predicted market shares.
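The difference between the probability-weighted average and the naive calculation at average values can be shown with two decision-makers. The sketch assumes the standard direct point elasticity for logit with a linear utility function, that is the parameter multiplied by the attribute level and by one minus the choice probability; the parameter, probabilities and costs are invented.

```python
def direct_point_elasticity(beta, x, p):
    """Direct point elasticity for one decision-maker: beta * x * (1 - P)."""
    return beta * x * (1.0 - p)


def weighted_sample_elasticity(beta, observations):
    """Average of individual elasticities, weighted by choice probabilities."""
    num = sum(p * direct_point_elasticity(beta, x, p) for p, x in observations)
    den = sum(p for p, _ in observations)
    return num / den


def naive_elasticity(beta, observations):
    """Elasticity evaluated at the average P and x (subject to aggregation bias)."""
    n = len(observations)
    mean_p = sum(p for p, _ in observations) / n
    mean_x = sum(x for _, x in observations) / n
    return direct_point_elasticity(beta, mean_x, mean_p)


# Invented sample: (choice probability for the mode, cost in pence).
sample = [(0.8, 100.0), (0.2, 300.0)]
weighted = weighted_sample_elasticity(-0.02, sample)
naive = naive_elasticity(-0.02, sample)
print(weighted, naive)
```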

5.11.3 The validity of the estimated model should be further tested in implementation. The estimated model should be applied to forecasting (the subject of forecasting is considered in section 7), and the ability of the model to replicate observed market shares assessed. A range of indicators of forecasting performance may be employed, although a simple and robust test is offered by a Chi-squared test:

$\chi^2 = \sum_{j=1}^{J} \frac{(f_{a,j} - f_{p,j})^2}{f_{p,j}}$

where

fa is the actual frequency

fp is the forecast frequency

J is the number of alternatives in the choice set

with degrees of freedom:

mathematical formula

where m is the number of parameters to be estimated on the basis of the sample data.
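A sketch of the test statistic is given below, assuming the usual Pearson form in which the squared difference between actual and forecast frequencies is divided by the forecast frequency and summed over the J alternatives. The frequencies are invented, and the result should be compared with the critical value for the degrees of freedom defined above.

```python
def chi_squared_statistic(actual, forecast):
    """Sum over alternatives of (fa - fp)^2 / fp."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must cover the same alternatives")
    return sum((fa - fp) ** 2 / fp for fa, fp in zip(actual, forecast))


# Invented frequencies for car, bus and rail.
observed = [620.0, 280.0, 100.0]
predicted = [600.0, 300.0, 100.0]
chi2 = chi_squared_statistic(observed, predicted)
print(chi2)
```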

5.11.4 The validation process should be carried out across a number of dimensions including those defined by the characteristics of the sample (e.g. income, gender, age) and the attributes of the choice alternative (e.g. cost, in-vehicle time).

5.12 Nested logit

5.12.1 With reference to Section 5.7, a diagnostic for, and (partial) resolution to, the property of IIA is offered by the nested logit model, which groups similar alternatives together in mutually exclusive subsets or 'nests' (i.e. an alternative can be included in only one nest). Choice probability is represented as the product of marginal probabilities of choosing nests and the conditional probability of choosing a given alternative from a nest.

5.12.2 Nested logit can be illustrated by considering a problem of two-levels and two-nests, although the model can in principle be extended to any numbers of levels and nests. With reference to the tree diagram in Figure 2, choice probability is given by:

$P_i = P_m \cdot P_{i|m}$

where $P_m$ is the marginal probability of choosing nest m, and $P_{i|m}$ is the conditional probability of choosing alternative i from nest m.

Figure 2: Nested logit for a two-level two-nest problem

5.12.3 Within each nest, the property of IIA holds, and conditional probability is represented as logit:

mathematical formula (5.5)

where mathematical formula is the scale parameter relating to nest m.

5.12.4 Turning now to marginal probability, the utilities from (5.5) are introduced in an expression for the Expected Maximum Utility of each nest m (commonly referred to as the 'log sum' or composite cost), as follows:

mathematical formula

5.12.5 The probability of choosing an alternative in nest m is also of the logit form and is shown as:

mathematical formula

5.12.6 A common simplification is to assume that mathematical formula is constant across all m in a given level of the tree (i.e. all nests at a given level have the same scale factor).
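The two-level structure can be sketched numerically. The code below adopts one particular specification, in which lower-level utilities are divided by the nest scale parameter and the log sum is multiplied by it at the upper level; as Section 5.13 notes, other specifications are in common use, so this should be read as an assumption rather than as the form used by any given software package. The nests, utilities and scale values are invented.

```python
import math


def nested_logit(utilities, nests, scale):
    """Two-level nested logit probabilities.

    utilities: utility for each alternative
    nests:     nest name -> list of alternatives in that nest
    scale:     nest name -> scale parameter (1.0 reproduces simple logit)
    """
    logsum = {
        m: math.log(sum(math.exp(utilities[a] / scale[m]) for a in alts))
        for m, alts in nests.items()
    }
    upper_denom = sum(math.exp(scale[m] * logsum[m]) for m in nests)
    probabilities = {}
    for m, alts in nests.items():
        p_nest = math.exp(scale[m] * logsum[m]) / upper_denom
        lower_denom = sum(math.exp(utilities[a] / scale[m]) for a in alts)
        for a in alts:
            p_conditional = math.exp(utilities[a] / scale[m]) / lower_denom
            probabilities[a] = p_nest * p_conditional
    return probabilities


v = {"car": 0.0, "red_bus": 0.0, "blue_bus": 0.0}
tree = {"private": ["car"], "public": ["red_bus", "blue_bus"]}
as_logit = nested_logit(v, tree, {"private": 1.0, "public": 1.0})
nested = nested_logit(v, tree, {"private": 1.0, "public": 0.1})
print(as_logit)
print(nested)
```

With a scale of one in every nest the result matches simple logit (one third each); with a small scale in the public transport nest the two buses are treated as close substitutes and car retains a share near one half.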

5.13 Estimating and interpreting nested logit

5.13.1 It is very important to note that there are (in general) two different and commonly used specifications of the nested logit model. In particular some applications are specified and estimated without dividing the lower level utility by mathematical formula.

5.13.2 Since this distinction between specifications may have substantive implications for interpretation and application, the analyst is advised to seek appropriate advice from the software supplier before proceeding.

5.13.3 The inferential and diagnostic analysis required following estimation is essentially the same as for logit, although one additional test is required in order to check the internal consistency of the nested logit structure. Following from 5.13.2, the precise specification of this test differs according to the specification of nested logit adopted. For example, if lower-level utility is specified without dividing through by mathematical formula, the test requires that:

mathematical formula (5.6)

Interpreting (5.6), where mathematical formula is significantly different from one, there is a violation of the IIA assumption (see Section 5.7); where mathematical formula is not significantly different from one for all mathematical formula, the nested logit model collapses to logit.

5.13.4 Although the above discussion was based on a two-level problem, the analysis can be readily extended to more than two-levels, with different scale factors at each level. The internal consistency test then involves an extension of (5.6).

5.13.5 A practical difficulty with nested logit is that the most appropriate nesting structure may not always be obvious. It may therefore take some effort to identify a definitive structure, judgements on which should be based on internal consistency, relative explanatory power and other properties of the model such as implied valuation and elasticity.

6. Data Collection

6.1 Revealed Preference and Stated Preference

6.1.1 Revealed Preference (RP) refers to observations of actual behaviour, for example the mode choices that decision-makers currently make or made in the past.

6.1.2 RP data is inherently more credible than SP data and its use, if only partially, will strengthen the credibility of demand forecasts in the appraisal framework.

6.1.3 RP data can be obtained from SP respondents, from postcard surveys (an under-used and relatively inexpensive approach), from home or phone interviews, travel diaries, as well as from the National Travel Survey and Census.

6.1.4 The collection of RP data is not without problems, however. There are often large biases in respondents' self-reported data, with respondents underestimating the costs of their chosen mode and overestimating the costs of alternative modes. To overcome these problems it is sometimes necessary to use explanatory variables from network models and published timetable data. Even where respondents' reported data is modelled, there is often a considerable amount of missing data which needs to be collated.

6.1.5 Stated Preference (SP) refers to observations of hypothetical behaviour under controlled experimental conditions.

6.1.6 The need for a bespoke approach to mode choice modelling would often imply an interest in a 'new mode'. The interest in new modes would itself imply a need for SP analysis, since RP data is by definition unavailable for such contexts. In short, bespoke mode choice development would often require new SP analysis.

6.2 SP design methods

6.2.1 The design of SP experiments has evolved into a specialist technical area. Comprehensive accounts of SP design methods are offered in a number of dedicated texts; popular ones include Pearmain and Kroes (1990), Louviere et al. (2000) and Bateman et al. (2002). It is important to recognise that SP design is a developing discipline, and that there is some disagreement as to the most appropriate methods. Indeed the three texts cited above show significant differences in respect of the methods they promote. Moreover, SP must at all times remain practical and useful, and it is left to the analyst to reconcile an in-depth knowledge of theoretical concepts of SP design with an appreciation of the practical implications of particular design features. There are a number of commercial software packages that can help with this task.

6.2.2 In what follows, we offer generic guidance on the stages to be followed in promoting best practice for SP analysis of mode choice.

6.3 Choice set

6.3.1 Definition of the choice set of alternatives should be based on RP evidence on existing behaviour, as well as policy interest in the potential addition of new modes. In most mode choice studies, the choice set will be reasonably well defined. The judgement of the analyst may however be required in some cases, for example in deciding whether two alternatives of common mode should be represented as distinct. Since such judgements may have a significant impact on the properties of the model, it is sometimes useful to conduct appropriate preliminary investigation such as focus group analysis. The option of 'not travel' should always be included in the choice set. In mode choice studies, alternatives should always be described 'explicitly', meaning that they should be referred to as 'bus', 'train', or 'car' etc., rather than as 'A', 'B' or 'C' etc.

6.4 Response method

6.4.1 The most natural, and therefore most reliable, response method for mode choice studies will usually be choice (as opposed to ranking or rating). Where the response method deviates from choice, the analyst should offer clear and convincing justification.

6.5 Number of alternatives

6.5.1 This decision involves reconciling the demands of the experiment on respondents (both in terms of the cognitive effort required on any given replication and the number of replications to be completed) with considerations regarding its realism. The most natural presentation would be to offer the complete choice set on every replication, although this may be impractical depending on how many alternatives are involved and the means by which the SP experiment is implemented. For example, if the SP were administered as a pen-and-paper exercise at a motorway service station, the most appropriate implementation would perhaps be a binary choice experiment, since larger choice sets may place unreasonable demands on respondents.

6.6 Number of replications

6.6.1 This decision is related to decisions on numbers of alternatives, attributes, and levels. For choice response exercises, the presentation of between 5 and 16 replications is typical, although this may again be influenced by other considerations, such as the means of implementation.

6.7 Task complexity

6.7.1 There is a controversial literature on the effects of task complexity (defined in a number of ways including numbers of alternatives, attributes and replications) on statistical properties, with different researchers reporting different findings. The general advice offered here is that excessive numbers of alternatives, attributes and replications may introduce significant bias in valuation and forecasting, as well as impact on response rates. The prevalence of such effects should be identified at the pilot stage, and the SP design should be adjusted accordingly.

6.8 Which attributes?

6.8.1 In mode choice studies this is usually dictated by which generalised cost components are of interest to policymakers; Road Traffic and Public Transport Assignment Modelling (Unit 3.11.2) is helpful in this regard.

6.9 Number of attributes

6.9.1 Having identified the attributes of interest, the next issue to be decided is whether to present the complete set of attributes together, or subsets of attributes only. In the latter case, separate designs should be developed for each subset of attributes, whilst ensuring that each design contains at least one (but preferably more) common attribute - this permits merger of the designs at the estimation stage. Since mode choice SP is commonly implemented as an interception survey, the analyst should remain conscious of the cognitive and time demands placed on respondents. It is uncommon, therefore, for any single SP experiment to consider more than four attributes.

6.10 Units of measurement

6.10.1 For most attributes, the units of measurement will be natural and straightforward. For some attributes, however, no natural units may exist (a good example would be an attribute such as 'ride quality'), and it may be left to the analyst to construct a measurement unit or scale that is both useful for analysis and comprehensible to the respondent (the latter perhaps investigated through focus groups).

6.10.2 A question of particular relevance to mode choice studies is whether attributes should be presented in respect of single or return trips; the answer to this should again be governed by what appears the more natural for the particular context under study. The preamble to the questionnaire should always make clear whether the attributes refer to single or return trips.

6.10.3 An associated consideration is whether to construct attributes as absolutes (e.g. 'car has travel time of 50 minutes'), or as some form of deviation from a base, whether as an absolute deviation (e.g. 'car is 5 minutes faster than now'), or as a percentage deviation ('car is 20% faster than now'). As will become apparent in the subsequent discussion, the deviation options sometimes offer convenience in design; absolute deviations may permit efficiency in the number of variables required, as well as facilitate easy derivation of boundary values; percentage deviations are particularly amenable to customisation. These considerations aside, when it comes to implementation of the SP experiment, any such deviations should be translated into absolute values for purposes of presentation to respondents.

6.11 Numbers of levels of attributes

6.11.1 At the early design stage, decisions on the number of levels will often be dictated by the scope of policy interest. It should be noted that at least three levels would be required in order to test an attribute for non-linearity. As with many other decisions in the design process, however, decisions on the numbers of levels have implications elsewhere, in particular on the number of replications required.

6.12 Selection of values for levels of attributes

6.12.1 The selection of values for attribute levels should be guided by the need to ensure realism in the representation of the choice problem and the need to include values relevant for policy testing. Other factors to consider include the range of any interest in non-linearity, and the variability (and hence significance) of attribute coefficients.

6.13 Combining the attribute levels: orthogonality

6.13.1 Conventional practice is to be guided - to greater or lesser extent - by fractional factorial designs. These provide templates for combining attribute levels in an orthogonal (i.e. zero correlation) manner. The attraction of orthogonality is that it enables the model to identify the separate influence of each attribute on utility.
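Orthogonality of a fractional factorial plan can be verified with a simple check. The sketch below builds a half fraction of a design with three two-level attributes, coded -1 and +1, by setting the third attribute equal to the product of the first two; this construction and the function names are illustrative assumptions, not a template taken from the references.

```python
from itertools import combinations, product


def half_fraction():
    """Four-run plan for three two-level attributes (levels coded -1 / +1)."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]


def column_dot_products(design):
    """Dot product of every pair of attribute columns.

    For balanced -1/+1 columns, a zero dot product means zero correlation.
    """
    columns = list(zip(*design))
    return {
        (i, j): sum(x * y for x, y in zip(columns[i], columns[j]))
        for i, j in combinations(range(len(columns)), 2)
    }


plan = half_fraction()
checks = column_dot_products(plan)
print(plan)
print(checks)
```

The plan needs four replications rather than the eight of the full factorial, at the price of being unable to separate the third attribute's main effect from the interaction of the first two.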

6.13.2 Design templates can be found in several references (e.g. Kocur et al., 1982); a number of software packages offer automated facilities based on the same principles. Although these templates can, on the face of it, be applied in a reasonably prescriptive and straightforward manner, significant judgement is required on the part of the analyst in reconciling conflicting objectives with respect to the numbers of alternatives, attributes, attribute levels, interaction effects and replications.

6.13.3 A number of strategies may be adopted to mitigate the effects of increasing numbers of attributes and attribute levels on the number of replications. One strategy, where the choice set is binary, is to apply the attributes of the design plan as differences between the two alternatives (e.g. time difference, cost difference, etc.), thereby reducing the requisite number of attributes in the design by half. Other more complex strategies include the use of specialised algorithms to arrive at 'optimal' designs, which minimise the number of replications whilst remaining 'efficient' according to some criteria.

6.14 Combining the attribute levels: non-orthogonality

6.14.1 In some cases, it may be desirable to deviate from orthogonality. One such case is where an orthogonal design yields an unrealistic combination of attribute levels. Another motivation for deviating from orthogonality is that some degree of correlation between attributes may improve the precision of parameter estimates. Moreover, since orthogonality in design is not preserved in estimation of choice models, some analysts argue that the attraction of orthogonality is sometimes overstated.

6.15 Combining the attribute levels: boundary values

6.15.1 A more advanced - but powerful - approach to SP is to ground design more firmly in behavioural theory, thereby establishing an intimacy between the data (i.e. the experimental design and the experimental responses) and the behavioural model to which the data are applied. A popular approach of this kind is 'boundary values'.

6.15.2 Boundary values are the implied valuations at which a decision-maker is just indifferent between two choice alternatives i and j on a given replication of the design. More formally, let:

mathematical formula

Where mathematical formula such that an individual is indifferent between the two alternatives, the boundary value of time (BVOT), expressed in terms of cost, is given by:

mathematical formula

Assuming that utility is entirely deterministic, an individual whose valuation of T in terms of C is greater than mathematical formula will prefer the alternative with the greater C and lesser T, whilst an individual whose valuation of T in terms of C is less than mathematical formula will prefer the alternative with the lesser C and greater T. If decision-makers maximise utility, and it is ensured that the SP design presents boundary values closely either side of standard valuations (of time, headway etc.), one can be reasonably confident that the model will reproduce realistic valuations in estimation.
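The boundary value for a single replication can be computed from the time and cost of the two alternatives. The sketch below assumes a linear utility in time T and cost C, so that indifference implies a valuation equal to the cost difference divided by the time difference; the attribute values and the reference valuation are invented.

```python
def boundary_value_of_time(time_i, cost_i, time_j, cost_j):
    """Valuation of time at which alternatives i and j are equally attractive.

    A negative result means one alternative is both cheaper and faster,
    so the replication involves a dominant alternative.
    """
    if time_i == time_j:
        raise ValueError("no boundary value when the travel times are equal")
    return (cost_j - cost_i) / (time_i - time_j)


def brackets_reference(boundary_values, reference):
    """True if the design offers boundary values on both sides of the reference."""
    return min(boundary_values) < reference < max(boundary_values)


# Invented replication: i takes 30 minutes for 250p, j takes 45 minutes for 175p.
bvot = boundary_value_of_time(30.0, 250.0, 45.0, 175.0)
print(bvot)  # pence per minute
print(brackets_reference([2.0, bvot, 8.0, 12.0], reference=6.0))
```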

6.16 Realism

6.16.1 Although stated preferences are by their very definition hypothetical, the analyst should at all times endeavour to ground SP design in realism, thereby ensuring that analysis is insightful and reliable. Thus, having arrived at an initial design by means of the above methods, it is good practice to check the design for unrealistic or irrational combinations of attribute levels. These should be adjusted accordingly.

6.17 Testing the design using simulation

6.17.1 Whatever methods are employed in the design process, it is important to test whether the design is capable of producing a model with realistic properties, particularly with respect to valuation (i.e. the ability of the design to reproduce acceptable ranges of parameter ratio). Simulation is a powerful tool for such testing, and can be used to identify problems in design without incurring the costs of pilot or field application.

6.18 Questionnaire design and implementation

6.18.1 Before proceeding to implementation, it is necessary to develop some form of vehicle for administering the SP experiment to respondents. This usually involves the design of a questionnaire. The following issues should be considered:

6.19 Staging

6.19.1 The choice here is between a single-stage or multi-stage questionnaire process. Although the latter is more demanding of respondents (with associated fall-off in response rate), and usually more costly, it provides opportunity for customisation of questionnaires to individuals' circumstances. This can make the SP experiment more realistic to respondents, and thereby make analysis more insightful.

6.20 Background information

6.20.1 In most cases it will be necessary to collect a range of information relating to respondents' socio-economic and demographic characteristics. These needs will tend to be dictated by segmentation requirements.

6.21 Means of presentation

6.21.1 Possible options include mail-back questionnaire, computer-assisted questionnaire, questionnaire posted on the Internet, questionnaire distributed by e-mail, and questionnaire administered by telephone interview. Choice of method may have significant implications for cost and effort (both in administering the experiment and processing the responses), as well as the size and characteristics of the sample. Some methods - particularly the electronic methods - may be more amenable to customisation. For mode choice studies, mail-back or computer-assisted questionnaires are most common.

6.22 Interception

6.22.1 A decision must be reached on where the questionnaire should be administered; conventional practice for mode choice is to intercept travellers en route, whether on a particular mode or at a terminal. Where customised questionnaires are being used, interception usually involves recruitment and the collection of basic information relating to the respondent, which can then be used to generate customised SP experiments that are posted to respondents. 'Cold-calling' by telephone can be a useful means of interception.

6.23 Response rates

6.23.1 These may vary depending on the means of presentation and interception: surveying on-mode may yield a response rate of up to 90%; response rates at terminals tend to be variable; two-stage questionnaires may yield a response of 40-60% of those in scope and agreeing to participate; a typical response from telephone interviewing is 40%.

6.24 Preamble to questionnaire

6.24.1 It is standard practice to provide a preamble to the questionnaire, which should explain the purpose of the investigation, the choice context of interest, the variables (including advice on how to treat variables not explicitly included), how the experiment should be completed, and how any data will be stored and used (in accordance with the Data Protection Act). Contact details for any queries should be provided, as should information on how the questionnaire should be returned (if appropriate). The preamble should be as succinct as possible.

6.25 Focus groups

6.25.1 A useful means of informing presentation and implementation issues is to test prototype questionnaires on focus groups of typical respondents. Focus groups are particularly useful where a wholly new product or research methodology is planned; they are less useful where the analyst has a good understanding of the new mode.

6.26 Pre-pilot survey

6.26.1 Before proceeding to a pilot survey, it is good practice to apply the questionnaire to a pre-pilot survey involving a small number of colleagues. The purpose of this is to invite comment and identify any problems, rather than to test the statistical performance of the SP design.

6.27 Pilot survey

6.27.1 Although 6.15, 6.17 and 6.26 provide various assessments of the quality of the SP design, it is essential to test the design fully in the context of a pilot survey. This should involve a representative sub-sample of the population of interest. The pilot survey should be a complete dummy run of the field survey, including a full statistical analysis of the responses to the experiment. Moreover, the pilot survey should consider a range of issues including: sampling strategy; comprehensibility of the questionnaire and experiment (respondents should be invited to comment); response and completion rates; market shares (i.e. prevalence of dominant alternatives); ability to estimate a choice model successfully using the response data; parameter significance and overall fit of model; and implied valuations.

6.28 Field survey

6.28.1 Having tested the experiment and questionnaire through simulation, focus groups and piloting, the analyst can proceed to the field survey with some confidence that the analysis will be successful when it comes to implementation. If testing has been sufficiently comprehensive, then the field survey should simply be a repeat of activities that have been carried out in the pilot surveys; the scope for unexpected problems should therefore be minimal. Whilst the analyst must remain vigilant of the potential for bias, a clear justification must be offered for the removal of any observations for reasons of 'irrationality'. Where such observations are deleted, it must be ensured that forecasts are adjusted accordingly.

6.29 Cleaning

6.29.1 Whatever form of data are collected, they should be subjected to a cleaning process. This process should identify, and treat, any 'irrational' or missing observations. In relation to the former, there is often confusion about the treatment of SP respondents who always choose the same alternative (these respondents are known as non-traders or non-switchers). The recommended approach is that car users who never switch mode should be retained, as should respondents with relatively high valuations. Inconsistent or biased responses may however be removed. The latter may take several forms; a common phenomenon is where a respondent always uses a currently available but rejected alternative. It should be noted that distinguishing non-traders from those with high valuations is often difficult. Missing observations can often be treated in some way; deletion should be regarded as a least-preferred option. If any observations are deleted during cleaning then forecasts should be adjusted accordingly.

6.30 Merging SP data

6.30.1 If separate designs have been developed for different subsets of attributes, it is necessary to merge the designs at the estimation stage. Since the data relating to different designs could have different error variances, and therefore different scale factors, it is important to apply a correction to ensure that all parameters across different designs are of common scale.

6.30.2 Where two or more attributes are common to the different designs, this can be accomplished by exploiting the structure of the nested logit model. For the simpler case of only one common attribute, recourse to nested logit is unnecessary, and re-scaling can be achieved simply by multiplying the parameters of one design through by the ratio of parameters relating to the common attribute.
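
The re-scaling for the single-common-attribute case can be sketched as follows. This is a minimal illustration only: the coefficient values and attribute names are hypothetical and are not taken from this guidance.

```python
# Hypothetical coefficients from two separately estimated SP designs.
# "time" is the one attribute common to both designs; all values are illustrative.
design_a = {"time": -0.040, "cost": -0.010}
design_b = {"time": -0.060, "crowding": -0.300}

def rescale_to(reference, other, common):
    """Multiply every parameter in `other` by the ratio of the common-attribute
    parameters, so that `other` is expressed on the scale of `reference`."""
    ratio = reference[common] / other[common]
    return {name: value * ratio for name, value in other.items()}

design_b_rescaled = rescale_to(design_a, design_b, "time")

# After re-scaling, the common attribute has the same coefficient in both designs,
# so the two sets of parameters can be combined in one utility function.
merged = {**design_b_rescaled, **design_a}
```

After re-scaling, ratios between parameters within design b (and hence its implied valuations) are unchanged; only the overall scale is altered.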

6.30.3 To illustrate the nested logit procedure for merger, consider the case of two SP designs (which we refer to as a and b), each of which considers a binary choice between alternatives i and j but represents the alternatives in terms of a different set of attributes (excepting the attributes that are common to both). The merger procedure is as follows:

6.30.4 With reference to Figure 3, four nominal alternatives are specified: specifically alternatives i and j for design a, and alternatives i and j for design b.

6.30.5 The data should be organised such that where utility and preference data relating to design a are presented, the alternatives relating to design b are specified as unavailable, and vice versa.

6.30.6 All alternatives should be specified with a path directly to the root (i.e. not nested with other alternatives), although the alternatives relating to one of the designs (here we arbitrarily pick b) should be assigned 'dummy nodes'. This means that the design b alternatives are specified in single-alternative nests at the lower level of the tree. The structural parameter should be specified as common across the two nests; this accommodates the difference in error variance across the two designs and ensures that all estimated utility parameters are of common scale. Unlike conventional nested logit estimation, there is no requirement that the structural parameter falls within specified bounds.

Figure 3: Nested logit 'trick' for data merger

6.31 Combining RP and SP data

6.31.1 SP is powerful for eliciting valuations, but less reliable for forecasting. This is because the scale factor in SP, which may deviate significantly from that in RP, cancels out in valuation but not in forecasting. If a model is to be applied to forecasting then it should not be estimated on SP data alone. Best practice is to merge RP and SP data in estimation. A second-best option is to validate an SP-based model against RP evidence on elasticity. Validation is discussed in Section 5.11.

6.32 Merging RP and SP data

6.32.1 Merger with RP data should be regarded as much the preferred option, and should be carried out in an analogous manner to the merger of SP data. Thus either the RP or SP alternatives should be specified at the lower level of the tree, and the difference in the error variances of the two data sources accommodated in the structural parameter. It should be remembered that this method of merger is dependent on there being at least one common variable in the two data sets. Indeed the more common variables there are, the more confident one can be about the reliability of the merger process. It should be noted that exact specification of the nested logit 'trick' for data merger is dependent upon the type of software used for calibration (see section 5.13.1) and the analyst is advised to seek appropriate advice from the software supplier before proceeding.

6.33 Repeat measurements

6.33.1 It is common, though often incorrect, practice to assume that the observations in the SP experiments are independent of each other. Where respondents are invited to make a series of repeated choices, as is typical, the informational content of the data diminishes. An implication is that, while the coefficients estimated on such data will be unbiased, the associated t-ratios will be upward biased, giving an illusion of greater significance than is actually the case. A number of correction procedures have been proposed, the simplest of which assumes perfect correlation of errors across the choices of each individual and involves multiplying the standard errors by the square root of the number of responses per individual. A less extreme, but computationally more difficult approach, is to assume that the principal effect of the repeat observations is to introduce a structure to the error term:

$$\varepsilon_{nt} = \eta_n + \nu_{nt}$$

where $\eta_n$ is an error component associated with respondent $n$ and $\nu_{nt}$ is independently and identically Gumbel distributed across respondents $n$ and choices $t$.
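
The simplest correction described in 6.33.1 (perfect correlation of errors across each individual's choices) can be sketched as below. The coefficient, standard error and number of responses are hypothetical values chosen only to show the arithmetic.

```python
import math

def adjust_for_repeats(coefficient, standard_error, responses_per_individual):
    """Inflate the standard error by the square root of the number of responses
    per individual, and return the corrected standard error and t-ratio."""
    corrected_se = standard_error * math.sqrt(responses_per_individual)
    return corrected_se, coefficient / corrected_se

# Illustrative inputs (assumptions): a time coefficient of -0.05 with an
# uncorrected standard error of 0.01, from respondents who each made 9 choices.
naive_t = -0.05 / 0.01
corrected_se, corrected_t = adjust_for_repeats(-0.05, 0.01, 9)
```

In this illustration a coefficient that appears highly significant on the uncorrected t-ratio is no longer significant at conventional levels once the correction is applied, which is the 'illusion of greater significance' referred to above.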

6.33.2 Estimation of the above is only possible using specialist software. Where this is not available, it is recommended that re-sampling techniques are employed to make unbiased estimates of model coefficients and their variance. The most popular of these techniques are known as 'jack-knifing' and 'bootstrapping'.
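
As a rough sketch of the bootstrapping idea, assuming that whole respondents (rather than individual responses) are re-sampled so that each person's repeated choices stay together, the procedure might look like the following. The `estimator` argument stands in for whatever model estimation routine is used; the simple share calculation and the data shown here are hypothetical placeholders.

```python
import random
import statistics

def bootstrap_by_respondent(data, estimator, draws=200, seed=0):
    """data maps respondent id -> list of that respondent's observations.
    Respondents are re-sampled with replacement; the estimator is re-run on
    each re-sample and the spread of the results approximates its variability."""
    rng = random.Random(seed)
    ids = list(data)
    estimates = []
    for _ in range(draws):
        drawn = rng.choices(ids, k=len(ids))
        resample = [obs for rid in drawn for obs in data[rid]]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)

# Hypothetical data: 1 = chose the new mode, 0 = did not, three choices each.
responses = {1: [1, 1, 1], 2: [0, 0, 0], 3: [1, 0, 1], 4: [0, 0, 1]}

def share_choosing(observations):
    """Placeholder estimator: proportion of observations choosing the new mode."""
    return sum(observations) / len(observations)

boot_mean, boot_se = bootstrap_by_respondent(responses, share_choosing)
```

In practice the estimator would return model coefficients, and the standard deviation of each coefficient across re-samples would be used in place of the uncorrected standard error.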

6.34 Sampling

6.34.1 In general it will not be economically feasible to collect data from the population of interest and therefore some form of sampling strategy will be required. This strategy should aim to ensure that the data collected provides the greatest amount of useful information about the population.

6.34.2 The first task is to identify the population of interest and the sampling unit. In many cases this will be defined by the objectives of the study and may for example include all households in a given geographical area. Next an appropriate sampling method should be chosen. This may involve a simple random sampling approach, or where it is important to sample from relatively small subgroups in the population a stratified random sampling approach should be adopted. The latter involves subdividing the population into homogeneous strata and then conducting a simple random sampling strategy within each stratum. Whichever approach is adopted, care should be taken to ensure that the sample is representative of the population. Mode choice modelling will often involve choice-based sampling, whereby the existing users (i.e. choosers) of a mode will be surveyed, for example on mode (for public transport) or at roadside interviews (for car users). It should be noted that where logit is applied to a choice-based sample, and specifies a full set of J - 1 ASCs, ML estimation will yield inconsistent estimates of the ASCs, as well as possible bias to other coefficients. Appropriate correction is therefore required.
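
This guidance does not prescribe a particular correction. One widely used adjustment for multinomial logit with a full set of ASCs is to subtract from each estimated ASC the log of the ratio of the mode's sample share to its population share. The sketch below assumes that approach; the shares and constants are hypothetical.

```python
import math

def correct_ascs(ascs, sample_shares, population_shares, reference):
    """Subtract ln(sample share / population share) from each ASC, then
    re-normalise so that the reference alternative has a constant of zero."""
    adjusted = {
        mode: ascs[mode] - math.log(sample_shares[mode] / population_shares[mode])
        for mode in ascs
    }
    base = adjusted[reference]
    return {mode: value - base for mode, value in adjusted.items()}

# Illustrative values (assumptions): bus users are over-sampled, making up
# half of the sample but only 20% of trips in the population.
estimated = {"car": 0.0, "bus": 0.5}
corrected = correct_ascs(
    estimated,
    sample_shares={"car": 0.5, "bus": 0.5},
    population_shares={"car": 0.8, "bus": 0.2},
    reference="car",
)
```

Because bus is over-represented in the sample, its corrected constant is lower than the estimated one.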

6.34.3 Although there are no hard and fast rules to determine sample size, it is recommended that the sample be commensurate with the budget for the study, which in turn should be commensurate with the likely costs and benefits of the proposed scheme.

6.34.4 Where the sampling methodology generates a sample that is not representative of the general population, consideration should be given to the development of an appropriate weighting system to be used during model estimation and application.
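
A simple form of weighting, sketched here under the assumption that population totals are known for each segment, is to give every sampled decision-maker a weight equal to the number of people in the population that they represent. The segment names and counts are hypothetical.

```python
def expansion_weights(population_by_segment, sample_by_segment):
    """Weight per sampled decision-maker = population count / sample count
    for the segment to which the decision-maker belongs."""
    return {
        segment: population_by_segment[segment] / sample_by_segment[segment]
        for segment in sample_by_segment
    }

# Illustrative counts (assumptions): equal sample sizes drawn from segments
# of very different size in the population.
weights = expansion_weights(
    population_by_segment={"car_owning": 8000, "non_car_owning": 2000},
    sample_by_segment={"car_owning": 100, "non_car_owning": 100},
)
```

Weights of this kind can serve as the decision-maker weights used in sample enumeration (Section 7.2).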

7. Model Application

7.1 Introduction to model application

7.1.1 Logit and nested logit are usually estimated on probabilities of choice for a sample of decision-makers. What is typically of interest to policy-makers, however, is an aggregate measure of these probabilities - i.e. market share - across a population.

7.1.2 The application of average measures of explanatory variables to the calculation of probability yields biased measures of average probability.

7.2 Sample enumeration

7.2.1 Consistent estimates of market share can be obtained using sample enumeration. This involves calculating, for each decision-maker in a sample, the probability of choice for each alternative in the choice set. These probabilities are then aggregated over decision-makers; average probability can be obtained by dividing through by the sample size.

7.2.2 More formally, a consistent estimate of the number of decision-makers choosing alternative $i$ is given by:

$$\hat{N}_i = \sum_{n} w_n P_n(i)$$

where $P_n(i)$ is the probability that decision-maker $n$ chooses alternative $i$ and $w_n$ is the weight attributed to decision-maker $n$. The $w_n$ parameter represents the number of decision-makers similar to decision-maker $n$ in the population, i.e. the number of decision-makers within each segment of interest. Thus if the sample is random then $w_n$ is constant for all $n$, whereas if the sample is segmented then $w_n$ is the same for all $n$ within a segment. If the sample is not representative of the population, then the weights should be adjusted accordingly.
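
The following minimal sketch illustrates sample enumeration and contrasts it with the biased approach noted in 7.1.2 of applying the model once to averaged explanatory variables. The two-person sample, the utilities and the weights are hypothetical.

```python
import math

def logit_probabilities(utilities):
    """Multinomial logit probabilities for one decision-maker."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {alt: math.exp(u - top) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    return {alt: e / total for alt, e in exp_u.items()}

def enumerate_demand(sample):
    """Weighted sum of individual choice probabilities over the sample."""
    demand = {}
    for weight, utilities in sample:
        for alt, p in logit_probabilities(utilities).items():
            demand[alt] = demand.get(alt, 0.0) + weight * p
    return demand

# Each entry: (weight, utilities by alternative). Values are illustrative.
sample = [
    (100.0, {"car": 0.0, "bus": -3.0}),
    (100.0, {"car": 0.0, "bus": 1.0}),
]
demand = enumerate_demand(sample)

# Naive alternative: average the utilities first, then apply the model once.
total_weight = sum(w for w, _ in sample)
mean_u = {
    alt: sum(w * u[alt] for w, u in sample) / total_weight
    for alt in ("car", "bus")
}
naive_bus = total_weight * logit_probabilities(mean_u)["bus"]
```

In this illustration sample enumeration gives roughly 78 bus users out of 200, whereas the naive calculation gives roughly 54, because the logit function is non-linear in utility.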

7.3 Adjusting the ASCs

7.3.1 In applying a model with ASCs to forecasting, it should be recognised that the influence of explanatory variables not represented explicitly in the model may change between estimation and forecast contexts (e.g. over time). Such changes can be accommodated through re-calibration of the ASCs. This involves inserting the estimated parameters (including the ASCs) in the model, along with the base data, and assessing the ability of the model to replicate 'target' market shares. If the forecast shares differ significantly from the target shares, then the ASCs should be adjusted, and the analysis repeated iteratively.

7.3.2 Target market shares may be based on external evidence, on the analyst's judgement, or on particular requirements relating to a forecast segment. In the latter case, for example, there may be an interest in the ability of the model to forecast accurately for a particular segment of the sample, and a need to tailor the ASCs accordingly. Adjusting the model constants for existing modes is relatively straightforward as the base market shares will be known. Setting the ASC for a new mode is however more problematic, as the values from SP research will be estimated on choice sets different from those to which they are applied, may be of the wrong scale, and are likely to be subject to various respondent biases inherent in the SP experiment. There is no easy solution here, and recourse to similar travel situations may be required. The constant for the new mode is therefore a strong candidate for sensitivity testing.
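
The iterative re-calibration of ASCs for existing modes can be sketched as follows. The update rule used here (adding the log of the ratio of target share to predicted share, relative to a reference mode) is a common choice but is an assumption, as this guidance does not specify one; the sample and target shares are hypothetical.

```python
import math

def predicted_shares(ascs, systematic):
    """Average logit shares over the sample. `systematic` holds each
    decision-maker's utilities excluding the ASCs."""
    totals = {alt: 0.0 for alt in ascs}
    for utilities in systematic:
        exp_u = {alt: math.exp(utilities[alt] + ascs[alt]) for alt in ascs}
        denom = sum(exp_u.values())
        for alt in ascs:
            totals[alt] += exp_u[alt] / denom
    return {alt: total / len(systematic) for alt, total in totals.items()}

def recalibrate(ascs, systematic, targets, reference, tol=1e-8, max_iter=200):
    """Adjust the non-reference ASCs until predicted shares match the targets."""
    ascs = dict(ascs)
    for _ in range(max_iter):
        shares = predicted_shares(ascs, systematic)
        if max(abs(shares[alt] - targets[alt]) for alt in ascs) < tol:
            break
        ref_step = math.log(targets[reference] / shares[reference])
        for alt in ascs:
            if alt != reference:
                ascs[alt] += math.log(targets[alt] / shares[alt]) - ref_step
    return ascs

# Illustrative inputs (assumptions).
systematic = [{"car": 0.0, "bus": -1.0}, {"car": 0.0, "bus": 0.5}]
targets = {"car": 0.7, "bus": 0.3}
new_ascs = recalibrate({"car": 0.0, "bus": 0.0}, systematic, targets, "car")
final_shares = predicted_shares(new_ascs, systematic)
```

The reference mode's constant is held at zero throughout, so only differences between constants are adjusted.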

7.4 Forecasting

7.4.1 Forecasting involves applying the above aggregation methods to some alternative scenario, defined on the basis of two inputs: first, data on the utility variables under the scenario of interest (e.g. reflecting an increase in fares); and second, the $w_n$ parameters (e.g. reflecting changes in socio-demographics). Changes to the latter are particularly important for long term forecasts, where changing patterns in population, income and car ownership are likely to be influential on demand.

7.5 Patronage build-up

7.5.1 In most instances, the mode choice model will predict an equilibrium state in which mode switching occurs instantaneously (e.g. in SP). In reality, however, there is likely to be inertia within the market, perhaps because of dissipation of knowledge about the service and/or a delayed behavioural response to the new journey opportunities (e.g. in RP). A prudent forecaster might factor down initial patronage forecasts to take account of the delay in take-up. This can be done 'off-model' using rules of thumb or included within the model by means of an inertia term that decays over time. In the long run (greater than 2 years), one would expect the overwhelming degree of inertia to have disappeared. Further advice on this issue is provided in MSA: Cost Benefit Analysis Unit 3.9.2.

8. Model Outputs and Use in Appraisal

8.1 Introduction

8.1.1 As was noted earlier, a key preliminary consideration in the model building process is to be clear about the purpose of modelling, and how that impinges on the detailed specification of the model.

8.1.2 It will be necessary, in many cases, to ensure that the output from the mode choice model is of appropriate form and detail in respect of a number of considerations. This could be to ensure consistency with other elements of the model system and/or to appeal to particular informational needs of policy-makers.

8.1.3 Here we consider the case where the purpose of mode choice modelling is to inform some form of scheme, strategy or project appraisal, where detailed specification would usually involve consideration of the following issues:

8.2 Spatial detail

8.2.1 Advice on this issue is offered in Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1). In respect of mode choice modelling, this essentially refers to the need to adopt an appropriate representation of zones and movements between them, whilst noting that finer detail implies greater burden in terms of both data and computation.

8.3 Segmentation by purpose

8.3.1 Advice can again be found in Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1), although it is noted that a typical breakdown is: home-based work, home-based employer's business, home-based other, non-home-based employer's business, and non-home-based other. If education is a significant fraction of the market then it should always be modelled separately.

8.4 Segmentation by person-type

8.4.1 A minimum requirement is to segment by non-car-owning and car-owning households, although greater segmentation by numbers of cars and drivers per household is advisable. Model Structures and Traveller Responses for Public Transport Schemes (Unit 3.11.1) offers advice.

8.5 Choice set

8.5.1 It is important to consider all relevant choice alternatives in the mode choice model, whilst noting that the definition of alternatives, and their representation in the tree, may have a significant impact on the properties of the model (section 5).

8.6 Generalised costs

8.6.1 A function of the mode choice model is to provide calculations of generalised cost by mode to TUBA. These should be derived by dividing the utility of each alternative by the cost coefficient.
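
As a minimal illustration of this calculation, assuming a linear utility function with hypothetical time and cost coefficients (units of minutes and pence are also assumptions):

```python
def generalised_cost(utility, cost_coefficient):
    """Convert utility into money units by dividing by the cost coefficient."""
    return utility / cost_coefficient

# Hypothetical coefficients and journey attributes, for illustration only.
beta_time = -0.05   # utility per minute
beta_cost = -0.01   # utility per pence
time_minutes, fare_pence = 30.0, 200.0

utility = beta_time * time_minutes + beta_cost * fare_pence
gc_pence = generalised_cost(utility, beta_cost)

# Equivalent view: time valued at beta_time / beta_cost pence per minute, plus fare.
implied_vot = beta_time / beta_cost
```

Here the implied value of time is 5 pence per minute, so the 30 minute journey with a 200 pence fare has a generalised cost of 350 pence.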

8.6.2 Non-work values of time and walk and wait times to be used in appraisal should be tested for significant difference from the values recommended in Values of Time and Operating Costs (Unit 3.5.6); where local values are not statistically significantly different, they should not be used; where they are statistically significantly different, they may be used, subject to sensitivity tests using the recommended values.

8.6.3 The suitability of any local values, including significance and sensitivity testing, must be fully documented. Further guidance on local values may be found in Values of Time and Operating Costs (Unit 3.5.6).
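
One simple way to carry out the test described in 8.6.2, assuming that the standard error of the local value is available and that a two-sided test at the 5% level is acceptable, is sketched below. The numerical values are hypothetical and the choice of test is an assumption rather than a requirement of this guidance.

```python
def differs_significantly(local_value, standard_error, recommended_value,
                          critical_t=1.96):
    """Return (is_significant, t_ratio) for the difference between a locally
    estimated value and the recommended value."""
    t_ratio = (local_value - recommended_value) / standard_error
    return abs(t_ratio) > critical_t, t_ratio

# Illustrative inputs (assumptions): a local value of time of 6.0 pence per
# minute with a standard error of 0.8, against a recommended value of 5.0.
use_local, t_ratio = differs_significantly(6.0, 0.8, 5.0)
```

In this illustration the local value is not significantly different from the recommended value, so under 8.6.2 the recommended value would be used.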

8.7 Time

8.7.1 As well as estimating a model for the base year (typically the year in which bespoke SP data are collected), it will usually be necessary to apply the model to forecasting for several future years including the opening year(s) of the relevant scheme, forecast year(s) for appraisal, and a horizon year. Further guidance is provided in Cost Benefit Analysis (Unit 3.5.4).

8.7.2 Whilst the focus of the above is on the demand side, it should be acknowledged that generalised cost forecasts are contingent on a range of assumptions regarding the supply-side, for example vehicle kilometres and vehicle hours.

8.8 Outputs for TUBA

8.8.1 Requisite outputs from the mode choice model for purposes of appraisal include forecasts (by O-D pair and mode) of:

  • Passenger trips
  • Total revenue
  • Passenger kilometres
  • Generalised costs

9. Documentation

A flow-diagram of the stages to model development is presented in Figure 4 below. For each stage, a summary of the processes involved together with an audit checklist is presented below. This audit trail should be completed during model development to justify the methodological approach taken and any assumptions that are made.

Figure 4: Mode choice model development process

9.1 Model design

9.1.1 The model development process starts with a clear definition of the scheme context and the need for a choice model to assist in the decision-making process. The modelling approach should be suited to the scheme, its context and its objectives, and should be capable of generating output suitable for use in appraisal. Above all, the model should be designed to assist the investment and planning decision-making process. A summary of the information required for model design is presented below.

The model design report should include:

  • Information on the nature of problem and the objectives of the likely solutions;
  • A definition and size and scope of the study area;
  • The availability of existing data to establish new models;
  • The need to undertake new surveys to establish new models;
  • Preliminary model specification, including information on model structure, explanatory variables and estimation procedures;
  • Details of the software to create and apply the model;
  • The forecasting parameters and years for which the forecasts are required; and
  • Information on the timescale and resources required for model development.

9.2 Data collection

9.2.1 The data collection exercise should follow the design stage. This exercise should involve the collection of RP and SP data and may also involve the use of focus groups. The data should be processed and cleaned and simple analysis undertaken to ensure that the data covers the relevant dimensions needed for the construction of choice models. This data should be made available to the client, if requested.

The data collection report should include the following:

  • Documentation of the key findings of focus groups (if undertaken);
  • The RP data collection exercise and sampling strategy;
  • The SP data collection exercise and sampling strategy. This will also contain information on the questionnaire design, testing by simulation and pilot survey results; and
  • Data processing and cleaning. This will include information on the processing of raw data for use in final model development.

9.3 Model development

9.3.1 The model development stage includes model estimation, model application and model validation. This process can be quite complex requiring a number of iterations until a satisfactory model is achieved.

9.3.2 Once the data has been collected, processed and cleaned, the model estimation process can commence.

The model development report should include information on model estimation.

The report should include a step-by-step account of the stages to model development including:

  • the specification of logit models calibrated to each data set;
  • the specification and justification for alternative nested structures; and
  • the merging of different data sets to develop joint RP-SP choice models.

For each model, evidence and justification is required for:

  • the inclusion/exclusion of each variable;
  • the specification of the functional form;
  • the degree of market segmentation; and
  • the significance of any structural coefficients.

For each reported model, information should be presented on:

  • the variables included, their unit of measurement, and which alternatives they apply to;
  • the estimated coefficients and associated t-statistics/standard errors. Where models are estimated to SP data, the standard errors should be adjusted to account for repeat observations;
  • the number of observations; and
  • the explanatory fit of the model;

And where appropriate:

  • the relative attribute valuations (e.g. value of time) together with estimates of their statistical confidence; and
  • the implied elasticities of demand.

9.3.3 Following estimation, the model should be applied to forecasting.

The model development report should include information on model application.

The models should be applied where possible using sample enumeration techniques.

Documentation is required to:

  • Justify the approach to model application;
  • Report any weighting of the sample to make it representative;
  • Show how explanatory variables change over time; and
  • Show how the model is able to recover the base market demand forecasts over a range of market segments.

9.3.4 The final stage to model development is to validate the properties of the model to evidence available from elsewhere.

The model development report should include information on model validation.

  • The report should comment on the credibility of the forecasts when compared to actual patronage figures for similar schemes;
  • The report should review the relative attribute values (e.g. value of time) implied by the model and compare them to published evidence; and
  • The report should review the own and cross elasticities of demand implied by the model and compare them to published evidence.

9.4 Model output

9.4.1 The final stage to bespoke mode choice development is to report on the model forecasts.

The model outputs report should include:

  • Forecasts of generalised cost, passenger demand, revenue and kilometrage by OD pair and mode;
  • Estimates of patronage build-up over time;
  • Sensitivity tests on key input parameters; and
  • Specification of the schemes tested and scenario forecasts.

10. Estimation of Transferred Models

10.1 Introduction

10.1.1 This Section looks at the process of mode choice transfer. However, in some cases a transferable mode choice model may be embedded in a complete transferable model system and advice for these cases is also presented.

10.1.2 This section advises on two issues: the first involves importing model coefficients from other sources; the second involves the transfer of one or more components of an entire model system, estimated in one area, for application in another.

10.2 Importing model parameters

10.2.1 The first case of transfer is one in which coefficients are available from sources outside the study area which are believed to be appropriate for mode choice modelling within the study area.

10.2.2 Importing model parameters is likely to be appropriate only for appraisal of medium and smaller sized schemes.

10.2.3 As with bespoke modelling, the foundation of a mode choice model using imported model parameters is the utility formulation describing the choice alternatives, i.e.:

$$V_{in} = \sum_{k} \beta_k x_{ink}$$

where $V_{in}$ is the deterministic component of utility derived from alternative $i$ by user $n$, $x_{ink}$ are the relevant attribute values ($k$) relating to alternative $i$ for user $n$ and $\beta_k$ are the model parameters indicating the relative importance of each attribute.

10.2.4 The specific attribute values, for each choice observation, will usually be derived from networks or other databases, e.g. fares. In bespoke models, the $\beta_k$ will be estimated such that the observed choices are best represented. For transferred models, these parameters are inputs to the modelling. These inputs may be obtained from a number of sources, including:

  • Values of Time and Operating Costs (Unit 3.5.6);
  • TRL Report TRL593, The demand for public transport: a practical guide, for information on the relative valuations of public transport journey components;
  • Variable Demand Modelling - Key Processes (Unit 3.10.3);
  • Passenger Demand Forecasting Handbook (PDFH), in cases where access to this source is possible;
  • other SP or RP studies;
  • a mixture of the above.

10.2.5 For transferred models, it is essential that the $\beta_k$ are measured in consistent units. Two units of measure are generally used: Generalised Costs (GC) and Generalised Times (GT).

10.2.6 In the generalised cost formulation, all in-vehicle and out-of-vehicle time components ($x$) are multiplied by appropriate values, by purpose and journey component ($VOT$), to convert them into monetary values. For example, if we were to consider a typical generalised cost formulation for a rail journey, it may include in-vehicle time, out-of-vehicle time and other components, for example:

$$GC_{rail} = VOT_{ivt}\,x_{ivt} + VOT_{acc}\,x_{acc} + VOT_{wait}\,x_{wait} + VOT_{int}\,x_{int} + P_{int}\,n_{int} + F$$

where

$VOT_{ivt}$ value of time for travel by rail, for specific purpose of travel

$VOT_{acc}$ value of time for access and egress to rail

$VOT_{wait}$ value of time for (first) wait time

$VOT_{int}$ value of time for interchange time

$P_{int}$ monetary penalty value of an interchange

and $x_{ivt}$, $x_{acc}$, $x_{wait}$ and $x_{int}$ are the corresponding journey times, $n_{int}$ is the number of interchanges and $F$ is the fare.
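
The generalised cost formulation in 10.2.6 can be illustrated with a short calculation. The component names, values of time (in pence per minute), interchange penalty and fare below are all hypothetical and are used only to show how the terms combine.

```python
def rail_generalised_cost(times, values, interchanges, interchange_penalty, fare):
    """Sum of (value of time x time) over journey components, plus a monetary
    penalty per interchange, plus the fare. All money terms in pence."""
    time_cost = sum(values[component] * times[component] for component in times)
    return time_cost + interchange_penalty * interchanges + fare

# Illustrative inputs (assumptions): times in minutes, values in pence per minute.
times = {"in_vehicle": 40, "access_egress": 15, "wait": 5, "interchange": 10}
values = {"in_vehicle": 10.0, "access_egress": 20.0, "wait": 20.0, "interchange": 20.0}

gc_rail = rail_generalised_cost(
    times, values, interchanges=1, interchange_penalty=50.0, fare=400.0
)
```

With these inputs the time components contribute 1,000 pence, the interchange penalty 50 pence and the fare 400 pence.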

10.2.7 When using generalised times, all in-vehicle and out-of-vehicle time components ($x$) must be multiplied by appropriate values, by purpose and journey component ($w$), to convert the component into units of time. The same example rail journey specified above would now be specified as follows:

$$GT_{rail} = x_{ivt} + w_{acc}\,x_{acc} + w_{wait}\,x_{wait} + w_{int}\,x_{int} + T_{int}\,n_{int} + \gamma F$$

where

$w_{acc}$ value of access and egress time, relative to rail in-vehicle time

$w_{wait}$ value of (first) wait time, relative to rail in-vehicle time

$w_{int}$ value of interchange time, relative to rail in-vehicle time

$T_{int}$ time penalty value (in terms of rail in-vehicle time) of an interchange

$\gamma$ value of money, in terms of rail time (1 / VOT)

and $x_{ivt}$, $x_{acc}$, $x_{wait}$ and $x_{int}$ are the corresponding journey times, $n_{int}$ is the number of interchanges and $F$ is the fare.

10.2.8 There will be no discernible difference between generalised cost and generalised time utility formulations in the base year: the difference between the formulations is simply one of scale. However, there is an important difference for forecasting. Here differences will arise when assumptions are made of income increases and corresponding increases in the value of time. When the value of time increases, the impact of time in generalised cost will increase (i.e. the generalised cost will increase), but the impact of cost in generalised time will decrease (i.e. the generalised time will decrease). The model will therefore behave differently if defined on the basis of generalised cost than if defined on the basis of generalised time. This property is not unique to transferred models but the procedure for transfer makes the property more obvious.
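
The divergence between the two formulations when the value of time grows can be seen in a small numerical sketch. It uses a simplified journey with in-vehicle time and fare only; the journey attributes and values of time are hypothetical.

```python
def generalised_cost(time_minutes, fare_pence, vot):
    """Generalised cost in pence: time converted to money, plus fare."""
    return vot * time_minutes + fare_pence

def generalised_time(time_minutes, fare_pence, vot):
    """Generalised time in minutes: fare converted to time, plus time."""
    return time_minutes + fare_pence / vot

# Illustrative inputs (assumptions): a 30 minute journey with a 200 pence fare;
# value of time rises from 10 to 12 pence per minute between base and forecast.
time_minutes, fare_pence = 30.0, 200.0
base_vot, future_vot = 10.0, 12.0

gc_base = generalised_cost(time_minutes, fare_pence, base_vot)
gc_future = generalised_cost(time_minutes, fare_pence, future_vot)
gt_base = generalised_time(time_minutes, fare_pence, base_vot)
gt_future = generalised_time(time_minutes, fare_pence, future_vot)
```

In the base year the two measures differ only by the factor `base_vot`. In the forecast year, with unchanged time and fare, generalised cost rises while generalised time falls.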

10.2.9 It is generally preferable to define models in terms of generalised time because in this formulation an increase in income is modelled as making travel easier, while in the generalised cost formulation it would appear that an increase in income would make travel more difficult. The generalised time formulation will therefore lead to increasing trip lengths over time, which is consistent with observed trends, whereas the generalised cost formulations will lead to declining trip lengths.

10.2.10 No further advice is offered in this guidance with respect to how values of time should increase over time. Values of Time and Operating Costs (Unit 3.5.6) has assumptions that should be made for changes in value of time into the future, and though these recommendations relate strictly to their use in appraisal, they may be taken as representing reasonable practice for modelling as well.

10.2.11 We do not recommend any changes to the model alternative-specific constants as a result of forecast changes in income and/or values of time, on the basis that the unmeasured component of utility, as measured by the alternative-specific constants, has no expected relationship with income.

10.2.12 For models using imported model parameters, we recommend the use of local RP data to calibrate both the model scale and alternative specific constants.

10.3 Importing model parameters: Recalibration with disaggregate and semi-aggregate RP data

10.3.1 The advantage of using disaggregate or semi-aggregate RP data for recalibration of transferred models is that these data allow the direct estimation of the model scale and alternative-specific-constants through Maximum Likelihood estimation of a logit model, with the accompanying tests of coefficient accuracy and significance that can be undertaken (see Sections 5.8, 5.9 and 5.10) and model fit. Specifically, the model results will indicate the accuracy of the scale coefficient and provide evidence of its validity, i.e. whether it is significantly different from zero. The methodology for identification of the model scale and alternative specific constants is set out below.

First, for each observation, the generalised cost or time term is calculated, e.g. $GC_1$, $GC_2$ and $GC_3$ below. The utility equation for each alternative is then defined by the generalised cost or time term, multiplied by a scale $\lambda$ and a constant (added to all but one alternative).

$$V_1 = \lambda\,GC_1$$
$$V_2 = \lambda\,GC_2 + ASC_2$$
$$V_3 = \lambda\,GC_3 + ASC_3$$

1