Forecasting demand for high speed rail

Maria Börjesson
Centre for Transport Studies, KTH Royal Institute of Technology, Teknikringen 14, SE-100 44 Stockholm, Sweden

Article info
Article history: Received 24 October 2013; Received in revised form 30 September 2014; Accepted 13 October 2014; Available online 11 November 2014
Keywords: High speed rail; Demand; Forecasting; Air–rail share; Cost–benefit analysis; Box–Cox transformation of travel time

Abstract
It is sometimes argued that standard state-of-practice logit-based models cannot forecast the demand for substantially reduced travel times, for instance due to High Speed Rail (HSR). The present paper investigates this issue by reviewing the literature on travel time elasticities for long distance rail travel and comparing these with elasticities observed when new HSR lines have opened. This paper also validates the Swedish long distance model, Sampers, and its forecast demand for a proposed new HSR, using aggregate data revealing how the air–rail modal split varies with the difference in generalized travel time between rail and air. The Sampers long distance model is also compared to a newly developed model applying Box–Cox transformations. The paper contributes to the empirical literature on long distance travel, long distance elasticities and HSR passenger demand forecasts. Results indicate that the Sampers model is indeed able to predict the demand for HSR reasonably well. The new non-linear model has even better model fit and also slightly higher elasticities.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Long distance travel stands for a disproportionately large share of person kilometers traveled compared to its share of trip making. Worldwide there are great hopes that High Speed Rail (HSR) may help to alleviate the heavy load of traffic in road and air corridors and improve interregional accessibility.
There is wide political backing for investments in HSR in many countries, and the European Union is considering increasing the funding for HSR projects (European Commission, 2010). However, HSR requires substantial investments. The economic rationale for allocating public money to construction of new HSR tracks is highly dependent on the present volume of rail travel, generation of new rail trips, and the extent to which air and car trips would be diverted to rail.

A common argument is that state-of-practice forecast models tend to underpredict demand when travel times are substantially reduced, for instance due to HSR, and specifically that such models predict too small a diversion of trips from air to rail. So far, few studies have tried to validate forecast models in this respect, which is the purpose of the present study. Flyvbjerg et al. (2005) do, however, analyze the accuracy of demand forecasts, finding that they have systematically overestimated traffic volumes for rail investments. Moreover, Flyvbjerg et al. find that forecasts have not improved over time, although estimation and forecasting techniques have. According to their findings, the demand for road investments is not overestimated as much as for rail investments, indicating that the overestimation of demand for rail investments is not primarily due to unreliable forecast models, but to strong political pressure.

Transportation Research Part A 70 (2014) 81–92. http://dx.doi.org/10.1016/j.tra.2014.10.010. 0965-8564/© 2014 Elsevier Ltd. All rights reserved.
Tel.: +46 702 58 32 66; fax: +46 8 790 60 10. E-mail address: maria.borjesson@abe.kth.se

The purpose of this paper is to investigate whether state-of-practice forecasting models can predict the demand for HSR.
First, model-based long distance elasticities in the literature are compared with elasticities observed when new HSR lines have been introduced. Then the paper briefly describes the Swedish long distance model that is part of the national transport model package Sampers (Beser and Algers, 2002), studies its elasticities and demand forecast for a suggested HSR service, and validates the forecast against previous literature and aggregate Swedish data. The Sampers long distance model has been in use for some ten years and is one of the most comprehensive state-of-practice long distance models in the world presently in use for appraisal. The response of the Sampers long distance model is also compared to the response of a newly developed model applying Box–Cox transformations on time and cost parameters. The paper contributes to the empirical literature on long distance travel elasticities and HSR passenger demand forecasts. There are reasons to believe that long distance models are less reliable than models for regional travel. First, the vast majority of forecasting models deal with regional travel, although the interest in HSR has triggered the development of long distance models in many countries (e.g. Ben-Akiva et al., 2010; de Bok et al., 2010; Outwater et al., 2010; Rohr et al., 2010). When developing long distance models the same modeling techniques are used as have traditionally been used for regional travel although long distance travel seems to be more heterogeneous. Second, non-linearity in travel time sensitivity makes long distance modeling complex. Gaudry (2008) demonstrates that mode choice logit models assuming linear sensitivity underestimate the cross-elasticity in HSR line forecasts. Daly (2010) reveals a large amount of evidence of non-linear time and cost sensitivity in previous research. Third, since long distance travel is less frequent and less evenly distributed in the population, data collection is more difficult. 
For instance, a long reporting period, used to increase the chance that the respondent can report at least one journey, induces underreporting of trips due to forgetfulness (Armoogum and Madre, 1997; Axhausen et al., 1997).

Section 2 reviews evidence of elasticities for HSR investments in the literature. Evidence of cross-elasticities of long distance travel is virtually non-existent, and this section therefore concentrates on own elasticities. Besides, cross-elasticities are less meaningful to compare between situations since they tend to be highly dependent on specific market conditions. Section 3 describes the Sampers long distance model. The section also reports the implied average elasticities and cross-elasticities of this model, which are compared to the previous evidence. The Sampers long distance model has been used to forecast the demand for a proposed new HSR service of about 500 km, connecting the country's two largest cities: Stockholm and Gothenburg. In this corridor there is already a fast train in service, called X2000, with a travel time of 3 h and 5 min, operating on upgraded conventional tracks. With the new track the travel time is expected to decrease to 2 h and 14 min.¹ Section 4 describes the forecast demand response to the suggested HSR service and compares it with international evidence. Since cross-elasticities are rare in the literature and difficult to compare across different contexts, Section 5 validates the forecast effect on air–rail mode split against aggregate traffic count data and the corresponding generalized travel time difference between air and rail in different relations. Section 6 concludes.

2. Elasticities and air–rail split in the literature

The literature on rail travel time elasticities and cross-elasticities for long distance travel is fairly limited. Long distance models, which can produce elasticities, are few, but examples are de Bok et al. (2010), Cabanne (2003), Rohr et al.
(2010) and Román et al. (2010). Cabanne (2003) and Román et al. (2010) report time and price elasticities on number of trips for a particular HSR line. De Bok et al. (2010) and Rohr et al. (2010) report average trip distance elasticities, giving the approximate average percentage change in travel distance by rail in response to a percentage change in the generalized cost of rail trips, applied uniformly over all origin–destination pairs. The elasticities implied by these models, reported in Table 1, are in the range of −0.9 to −0.3.

¹ The cost is assessed to €10–€15 billion.

Table 1
Rail elasticities in the literature.

Study | Elasticity | Comment
Model-based studies
Román et al. (2010) | −0.4 (Madrid–Barcelona); −0.6 (Madrid–Zaragoza) | Cross-section RP/SP data. Spanish HSR corridors. In-vehicle travel time elasticity.
Cabanne (2003) (a) | 0.3/0.45; −0.16 (air cross-elasticity) | Time series data models. Rail accessibility elasticity. French HSR corridor.
de Bok et al. (2010) | −0.6 (business); −0.5 (commute); −0.3 (other) | Average distance elasticity. Portugal. Cross-section RP data.
Rohr et al. (2010) | −0.9 (business); −0.4 (private) | Average distance elasticity. UK. Cross-section RP data.
Dargay (2010) | −0.49 to −3.04 | Aggregate time series, UK. Different purposes and trip length segments.
Empirical studies
Nash (2010) | −1.6 (Paris–Lyon, phase 1); −1.1 (Paris–Lyon, phase 2) | HSR line 1981–1983. In-vehicle travel time elasticity.
Sánchez-Borràs (2010) | −1.3 (Madrid–Barcelona) (b) | HSR line 2008. In-vehicle travel time elasticity.
Sánchez-Borràs (2010) | −1.2 (Madrid–Sevilla) (b) | HSR line 1992. In-vehicle travel time elasticity.

(a) These elasticities refer to rail accessibility and not in-vehicle travel time, implying positive own elasticity.
(b) Computed by the author based on data reported in the reference given.
Dargay (2010) reports average trip distance elasticities estimated on time series data, which have a tendency to be higher² than those estimated on cross-section data, but it remains unclear why. Dargay also reports higher (absolute) elasticities for longer trips. Dargay and Clark (2012) focus on income elasticities for long distance travel.

One particular HSR service will attract travellers from other origin–destination pairs. For this reason, the trip elasticities for a particular HSR service (counting travellers attracted from other destinations as new trips) will tend to be higher than the average trip elasticities, which predict the total change in number of trips in response to a uniform travel time change over all trip relations. Moreover, because travellers can respond by choosing destinations closer to their origin as well as by making fewer trips, one expects that for a uniform change in travel times over all origin–destination pairs, distance elasticities would be higher than trip elasticities. Hence, when comparing the elasticities of travel time on number of trips for a particular HSR line and on average trip distance over all origin–destination pairs, as in Table 1, there is no a priori expectation as to which should be higher. And indeed, there is no clear pattern. Note also that models estimated on data with less accurate travel time information, or with poor model specification, can be expected to have lower elasticities.

Further down, Table 1 also includes observed elasticities found after the opening of three HSR lines. When the TGV (Train à Grande Vitesse) was introduced, the rail travel time was first reduced by 30 percent, and the implied travel time elasticity was then about −1.6 (with respect to number of trips) (Nash, 2010). When travel time was further reduced by 25 percent, the elasticity was lower, −1.1 (Nash, 2010).
The trip elasticity for the Madrid–Barcelona HSR line was −1.3 and for Madrid–Seville −1.2 (computed from volumes reported by Sánchez-Borràs (2010)). The observed elasticities thus seem in general to be larger than the model-based ones, indicating that at least some of the models may underpredict elasticities. The reason for the higher observed elasticities may, on the other hand, be that the train alternative was very unattractive before the introduction of HSR, in particular in the Spanish cases.

The air–rail cross-elasticity with respect to rail travel time is difficult to estimate and to transfer between contexts. Such cross-elasticities are also rarely reported in the literature, but Ben-Akiva et al. (2010), Cabanne (2003) and Rohr et al. (2010) are exceptions. Correlations in time and cost trends for different modes do not usually allow estimation of cross-elasticities in time series data, and in logit models cross-elasticities are very sensitive to different model specifications.

The observed cross-elasticities between rail and other travel modes differ greatly between cases. When the TGV between Paris and Lyon (HSR travel time 2 h) was introduced in 1981, roughly half the additional rail traffic consisted of newly generated trips (Vickerman, 1997), and there was almost no direct substitution of car trips (Nash, 2010). For the Madrid–Seville HSR line (2 h 15 min) opening in 1992, where the initial market share for rail was much lower, only 15 percent of the increase in rail trips was newly generated. Some of the increase in rail travel was due to substitution of car trips, but most of the additional rail trips were substituted air trips (COST318, 1998). In Germany, where the HSR uses existing networks, only 12 percent of the travelers on the HSR lines have shifted from other modes (Cheng, 2010). Cheng suggests that the high price of the train service explains the low shift, and another explanation is that HSR competes less with air travel because it is more focused on regional travel. Sánchez-Borràs et al. (2010) explore specifically how the demand and market share for rail depend on ticket prices.

Fig. 1. Estimated relationship between share of rail trips (air–rail mode split) and in-vehicle train travel time. Source: Jansson and Nelldal (2010).

² Here and in the rest of this paper, “higher” negative elasticities refer to the absolute value.

Jansson and Nelldal (2010) have estimated and plotted the relationship between rail travel time and the air–rail modal split on an aggregate level, depicted in Fig. 1. They suggest that it could be used to validate passenger forecasts. Such validation, however, can be questioned for two reasons. First, it would assume that the in-vehicle rail travel time is the only important determinant of the resulting air–rail market share, ignoring the importance of variables such as accessibility to airports and train stations, differences in air and rail ticket fares, service frequencies and the share of business travel. Second, the relationship is sensitive to selection effects, i.e. to which origin–destination pairs are included. Still, the idea is good, and we will use it later in this paper, adapting the analysis to avoid these two problems.

Jansson and Nelldal (2010) use data from the year 2000. Some of these numbers are published elsewhere, like the air–rail split in Paris–Lyon (2 h), which was 9–91 percent according to COST318 (1998), and the air–rail mode split in the Madrid–Seville corridor (2 h 15 min), which was 17–83 percent in 2000 according to Sánchez-Borràs (2010). Other numbers have not been published elsewhere or have been updated. One updated example is the air–rail split of 20–80 percent in the London–Paris corridor (2 h 15 min) (Eurostar, 2011).
The HSR line operating in the Madrid–Barcelona corridor (2 h 38 min), opened in 2007, has an air–rail mode split of 53–47 percent, but the HSR is more competitive in the shorter travel segment Madrid–Zaragoza (Sánchez-Borràs, 2010).

3. The forecast model

The Sampers model system is administered by the Swedish Transport Administration. Different versions of Sampers have been used by the Transport Administration in project appraisal for approximately 10 years. The model includes five regional models and one model for national long distance trips. All sub-models are nested logit models predicting frequency, destination and mode choice, interacting with the Emme/2 network assignment software (Consultants I.N.R.O., 1999). The long distance sub-model includes car, coach, rail and air. The destination choice includes 700 zones on the national level.³ Different sub-models are estimated for business trips and private trips.

The specification and estimation results of the travel time and cost parameters are described in Section 3.1. Other variables included in the utility functions, and the estimation results in detail, are not described here; the estimation used “best practice” state-of-the-art methods. Details on the estimation and specification of the version of the model evaluated in the present paper can be found in the Technical Report (Transek AB, 2004).⁴ Beser and Algers (2002) describe the estimation of an early version of the model.

The models are estimated applying Alogit (Daly, 1992) and data from the national travel diary survey, collected in 1994–2000. The survey includes a one-day travel diary covering all trips. Since the frequency of long distance trips is small, the one-day survey is supplemented by a long distance travel diary survey, covering trips of at least 100 km taking place in the period starting one day before the survey day and extending 30 days back in time.
All trips above 100 km, in both the long distance and the one-day survey, are used in the model estimation. There are 65,815 observed trip legs in the raw data: car as driver, 28,985; car as passenger, 19,530; rail, 7013; coach, 4809; air, 4406; other modes, 1072. Trips were coded into tours prior to the estimation.

The model was calibrated by estimating dummy variables included in the utility functions. This calibration method ensures consistency. In the first calibration phase, the number of trips and the distance distribution by travel mode and trip purpose were calibrated against the long distance survey data. Mode and distance specific dummy variables were then estimated and included in the utility functions. In the second calibration phase, the number of trips by origin and destination region (8 × 8 regions in total) was calibrated against traffic counts for rail and air. This led to a sharp increase, 61 percent, in the number of rail trips, indicating a substantial underreporting of rail trips in the survey. The number of air trips was increased by 16 percent in the second calibration phase.

For long distance car trips, no traffic counts were available, and the second calibration phase was not carried out. However, more recent comparisons between the long distance and the one-day surveys indicate that car trips below 400 km are underreported in the long distance travel survey (but no underreporting is indicated for longer car trips). It has to be pointed out, however, that the uncertainty margin is much higher in the one-day survey (comprising 1100 long distance trips compared to 28,000 trips in the long distance study), but the basic pattern is stable. The underreporting could be due to forgetfulness or fatigue effects, in particular for individuals making many long distance trips. The problem of underreporting and possible bias in long distance surveys extending over a longer period has been acknowledged in other countries (Axhausen et al., 1997).
Supply data, including various travel time and distance components for each travel mode, were generated from the Emme/2 representation of the Swedish transport network. Car travel times were calculated using the assumption that a half-hour break is taken every two hours.⁵ Fare matrices were used for scheduled travel modes.⁶ Car travel costs were assumed to be proportional to the trip distance.

³ In the estimation a stratified sampling scheme was used, including 20 destinations. Popular destination zones in the three larger cities were then assigned a higher sampling probability.
⁴ Available on request from the author.
⁵ This assumption was deduced from comparing reported travel times (including stops) with travel times imputed from the Emme/2 system.
⁶ Due to the high uncertainty and variability of fares, the cost parameter in the models is likely to be less reliable than the other parameters.

3.1. Implicit values of time

The cost parameter is generic across modes and deflated in the forecast, with an income elasticity of −0.5 in real terms (based on the income elasticity found in the 1994 Swedish value of time study (Dillén and Algers, 1998)). The in-vehicle travel time parameter is segmented with respect to length of stay for private trips; it is higher for one-day trips than for overnight stays, presumably because more time constraints apply to one-day trips. For private trips, but not for business trips, the in-vehicle time parameter is significantly higher for car than for other modes. Many modeling studies (Börjesson, 2012; Wardman, 2004) have found that the marginal valuation of first wait time declines with increasing headway. Since this proved to be difficult to model directly, a piecewise linear function transforming headway into disutility of wait time was applied.⁷ Table 2 reports the values of time of the business and private trip models.
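The piecewise linear headway transformation given in footnote 7 can be sketched in a few lines of Python. This is a minimal illustration, not the Sampers implementation: the function name `wait_time_disutility` is hypothetical, and it assumes that headway is measured in minutes, which the thresholds of 60 and 120 suggest but the text does not state.

```python
def wait_time_disutility(headway_min: float) -> float:
    """Transform headway into equivalent wait time, following footnote 7.

    Assumption: headway is in minutes. The first 60 minutes count in full,
    the next 60 minutes count at half weight, and anything beyond
    120 minutes counts at a weight of 0.2.
    """
    # Full weight on the part of the headway up to 60 minutes.
    first = min(headway_min, 60.0)
    # Half weight on the part between 60 and 120 minutes.
    second = 0.5 * (min(headway_min, 120.0) - 60.0) if headway_min > 60 else 0.0
    # Weight 0.2 on the part above 120 minutes.
    third = 0.2 * (headway_min - 120.0) if headway_min > 120 else 0.0
    return first + second + third


if __name__ == "__main__":
    for headway in (30, 60, 90, 120, 180):
        print(headway, wait_time_disutility(headway))
```

Under this formula a 90-minute headway maps to 60 + 0.5·30 = 75, so each additional minute of headway adds less disutility as services become less frequent, which is the declining marginal valuation described above.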
The values of time for private trips can be compared with values of time estimated on Swedish stated choice data (Börjesson and Eliasson, 2014): car €12/h, rail €10/h and coach €6.5/h in price level 2008. The value of time for one-day car trips is higher in Sampers, but apart from that the values are similar.

3.2. Is HSR a new travel mode?

As stated in the introduction, it is commonly argued that state-of-practice forecast models tend to underpredict demand when travel times are substantially reduced, for instance due to HSR. The first thing to note is that the cross-sectional data on which the present model, and presumably most other models, is estimated include a larger variability in travel times than the time shift due to HSR. Forecasting the effect of HSR should therefore not be impossible.

Another possible difficulty in forecasting demand for HSR could arise because travelers view HSR as a different mode from conventional rail. Using stated choice data, Burge et al. (2010) investigate whether travelers place a special value on HSR compared to conventional rail, over and above the value due to differences in level of service attributes, by estimating a mode-specific constant for HSR. They do find a positive HSR constant for car and air travelers but not for rail travelers. Since the stated choices of rail travelers are assumed to be more reliable, the conclusion is that this constant should not be used for forecasting. A similar Swedish study (WSP Analysis and Strategy, 2012), also based on stated choices, gives the same results.

Burge et al. (2010) and WSP Analysis and Strategy (2012) also investigate how to include HSR and conventional rail in a nested model structure, depending on the substitution patterns between HSR and conventional rail. These studies give some support for including HSR and conventional rail as different modes in the nesting structure, but this is still uncertain given the lack of revealed preference data.
In the late 1990s a fast train service running on upgraded conventional tracks, called X2000, was introduced in Sweden. Different nesting structures including X2000 and conventional trains as one single mode and as two separate modes were explored (Beser Hugosson, 2003), giving some support for including the two train types as separate modes at the same level as other travel modes in the nesting structure. This means that if a new X2000 line was introduced with the same level of service as the conventional train, the market share for rail would instantly be twice as high. However, X2000 differs from conventional trains in that these trains operate primarily between the largest cities, at attractive departure times and with higher fares. The differences between the train types captured by the demand model could arise because of these differences, since they are not sufficiently well captured by the models. For instance, the modeled impact of train fare is unreliable because of uncertain fare information, and departure time is not accounted for. Since it is uncertain what the mode difference found in the estimation represents, all train types, including the proposed HSR evaluated in the present paper, are modeled as one single mode in the later versions of Sampers.

Table 2
Values of time derived from the Sampers long distance model. Unit €/h; price level 2008. Different values of time are reported for business trips because the cost parameter is segmented with respect to high and low income for those trips.

Attribute | Business high income | Business low income | Private trips 1–5 days
In-vehicle time all modes; one-day trips | 115.6 | 64.9 |
In-vehicle time all modes; overnight stay | 58.9 | 33.1 |
Wait time one-day trips | 227.0 | 127.5 |
Wait time overnight stays | 92.7 | 52.1 |
Value of one transfer one-day trips | 61.6 | 34.6 |
Value of one transfer overnight stays | 45.8 | 25.7 |
In-vehicle time car one-day trips | | | 17.9
In-vehicle time car overnight stays | | | 11.0
In-vehicle time other modes; one-day trips | | | 7.7
In-vehicle time other modes; overnight stays | | | 5.5
Wait time, one-day/overnight stays | | | 20.8
Value of one transfer | | | 11.1

⁷ From the 1994 Swedish value of time study (Dillén and Algers, 1998): Wait = min(Headway, 60) + 0.5·(min(Headway, 120) − 60)·(Headway > 60) + 0.2·(Headway − 120)·(Headway > 120), where (x > y) equals 1 if x > y and 0 otherwise.

3.3. Elasticities

The calibrated model implies certain distance elasticities, shown in Table 3, which can be interpreted as the percentage change in travel distance by mode m, $D_m$, per percentage change in a given attribute, $x_m$. These elasticities have been calculated from total traffic volumes (total kilometers) by mode m at two sample points, $D_{m1}$ and $D_{m2}$. The sample points were produced by applying the model to a base scenario and to a scenario where the given attribute $x_{m1}$ has been increased by ten percent to $x_{m2}$, uniformly over all trip relations. Since a ten percent increase is a relatively large change, arc elasticities are applied. These are computed from the two sample points, assuming a constant elasticity demand function.
The arc own elasticity, e, is then computed as

\[ e = \frac{\ln(D_{m1}/D_{m2})}{\ln(x_{m1}/x_{m2})}. \tag{1} \]

It is questionable whether the approximation of a constant elasticity is accurate over changes as large as 10 percent. The elasticities, and their transferability in particular, should therefore be interpreted with caution. Logit models always produce elasticities that vary with market shares.

In Table 3 only elasticities with respect to total travel distance are reported. The elasticities for travel distance are larger (in absolute terms) than elasticities for number of trips, because responses to increased generalized travel costs include not only fewer trips but also shorter trips. The table also reports cross-elasticities, referring to the change in travel distance per day with mode m in response to a change in an attribute associated with another mode n:

\[ e = \frac{\ln(D_{m1}/D_{m2})}{\ln(x_{n1}/x_{n2})}. \tag{2} \]

All the elasticities have the expected sign. As travel time or travel cost increases for one travel mode, the demand for that mode falls, while the demand for travel with other modes increases. The rail fare elasticity is −0.72 for business trips and −0.59 for private trips. These numbers are similar to those reported by Rohr et al. (2010), who find the corresponding elasticities to be −0.5 (business), −0.9 (commuting trips) and −0.6 (other private trips), computed using the same method. However, these elasticities are lower (in absolute value) than the −1 reported by Dargay (2010), who uses time series data. The elasticity for rail in-vehicle travel time is −1.50 for business trips and −1.01 for private trips. These figures are slightly higher (in absolute value) than those reported by Rohr et al. (2010) and Román et al. (2010). The fuel price elasticity of car travel is low, around −0.14, and very similar to what is reported by Rohr et al. (2010).
For all car trips, of which the vast majority are regional trips, many studies have found a long-run fuel price elasticity of around −0.3 (Dargay, 2010; Goodwin et al., 2004; Graham and Glaister, 2004). This is also what is found in Sampers' regional models. Dargay (2010) also finds a lower elasticity for long distance trips than for regional trips (about −0.2). In summary, the own elasticities reported in Table 3 are well in line with other studies using cross-sectional data.

Cross-elasticities for travel time are consistently higher for car than for rail. This is due to the general property of nested logit models that an improvement of one alternative in a nest will have the same proportional impact on the probability of all other alternatives in the same nest, and this impact is proportional to the market share of the improved alternative. Nesting structures with air, rail and coach in the same nest, which would imply higher cross-elasticities between these modes, have been explored but were not supported by the data (Beser Hugosson, 2003).

Table 3
Arc elasticities for travel distance, derived from a ten percent increase in each attribute.

Attribute | Segment | Car | Coach | Air | Rail | Total
Car in-vehicle time | Business | −0.87 | 0.60 | 0.55 | 0.66 | −0.11
Car in-vehicle time | Private | −0.53 | 0.57 | 0.60 | 0.54 | −0.19
Car in-vehicle time | Total | −0.58 | 0.58 | 0.57 | 0.57 | −0.17
Fuel price | Business | −0.14 | 0.11 | 0.09 | 0.10 | −0.03
Fuel price | Private | −0.15 | 0.16 | 0.17 | 0.15 | −0.04
Fuel price | Total | −0.14 | 0.16 | 0.12 | 0.14 | −0.04
Rail in-vehicle time | Business | 0.16 | 0.14 | 0.16 | −1.50 | −0.07
Rail in-vehicle time | Private | 0.06 | 0.09 | 0.20 | −1.01 | −0.08
Rail in-vehicle time | Total | 0.08 | 0.10 | 0.18 | −1.12 | −0.08
Rail fare | Business | 0.07 | 0.07 | 0.07 | −0.72 | −0.04
Rail fare | Private | 0.04 | 0.07 | 0.06 | −0.59 | −0.05
Rail fare | Total | 0.05 | 0.07 | 0.06 | −0.61 | −0.05
Income | Business | −0.59 | −1.57 | 6.48 | 1.50 | 2.15
Income | Private | 0.25 | 0.34 | 0.42 | 0.35 | 0.29
Income | Total | 0.11 | 0.13 | 3.85 | 0.60 | 0.72

The income elasticities for all trips reported in Table 3 are slightly lower for car and higher for air compared to those reported by Dargay and Clark (2012).
For rail and bus, however, they are similar to Dargay and Clark. The basic pattern in Table 3, showing the highest elasticities for air followed by rail, is also consistent with Dargay and Clark. For business trips by coach and car, Sampers' income elasticities are even negative, because car and coach trips are diverted to air and rail at higher incomes.

4. High speed rail forecast

The Swedish Transport Administration has used the Sampers long distance model to forecast the effects of a proposed HSR track in the Stockholm–Gothenburg corridor. The thick line on the map in Fig. 2 marks this HSR track, while the existing conventional rail network is depicted with thinner lines. The travel demand has been forecast in an HSR scenario and in a baseline scenario, the former with the new HSR investment and the latter without. Both scenarios refer to year 2020.

As found by Goodwin et al. (2004) and Dargay and Clark (2012), long-term elasticities tend to be higher than short-term elasticities. Long-term refers to the asymptotic equilibrium state when responses are completed, 5–10 years in most transport applications (Goodwin et al., 2004). When studying the impact of HSR, long-term elasticities should be applied. However, since Sampers is a static model, estimated on cross-section data, there is no explicit temporal dimension. It is therefore not possible to interpret Sampers elasticities as long-term or short-term.

Long-term forecasting, such as the present exercise, always relies on a range of scenario assumptions, e.g. future sociodemographic composition and spatial distribution, economic growth, fuel prices, train fares and car fleet characteristics. In the present case, all scenario assumptions are based on long-term national and international projections. All of them are identical in the baseline and in the HSR scenarios, except for assumptions regarding train timetables, which of course are affected by the HSR investment.
Since we are focusing on the effect of the introduction of the HSR in this paper, i.e. comparing the traffic volumes between the baseline and HSR scenarios, the output error due to the scenario assumptions for 2020 should be minor. Timetable assumptions in the baseline and the HSR scenarios, however, are of course of crucial importance for the predicted effect (Eliasson and Börjesson, 2014). In the baseline scenario the travel time of the X2000 trains is on average 3 h 5 min and there are 18 return trips a day. In the HSR scenario it is assumed that the travel time decreases to 2 h 14 min and the frequency increases to 24 return trips a day. The fare is assumed to be equal in both scenarios. There are also some slower conventional trains using other routes in both scenarios. All trains stop at some intermediate stations, but since the number of people living in these towns is relatively small, these stops are neglected in the analysis. Since there is already a fast train service in the corridor, the travel time gain from the new HSR track is relatively small, 28 percent.

Fig. 2. The evaluated HSR rail track in the Stockholm–Gothenburg corridor.

Table 4 summarizes the forecast travel demand and market share in the 2020 baseline scenario and HSR scenario. According to the forecast, the number of rail trips would increase by 40 percent, or 0.63 million trips per year, when the HSR track is introduced. Of these new rail trips, 75 percent are newly generated, 16 percent are diverted from air, 9 percent are diverted from car and almost none from coach. The predicted market share for rail in the HSR scenario is 49 percent for all trips, which is close to the observed market share for rail in the Madrid–Seville corridor (COST318, 1998). Concentrating on the air–rail mode split only, the share of rail trips increases from 65 percent to 75 percent for all trips.
For Madrid–Seville and London–Paris (where the HSR travel time is the same as in the Stockholm–Gothenburg case), the market share for rail is higher, about 80 percent.

From the numbers in Table 4 we may compute the demand elasticities for the HSR line using formula (1), where demand Dm1 and Dm2 are now taken to be the number of rail trips and xm1 and xm2 are taken to be the rail travel time in the baseline and the HSR scenario, respectively. This trip elasticity is −1.6 for business trips, −0.78 for private trips and −1.0 for all trips. This is similar to the second phase of the opening of the Paris–Lyon HSR line, but lower than observed for the first phase of the opening of the Paris–Lyon line, the Madrid–Barcelona HSR line and the Madrid–Seville HSR line. The higher elasticities in the three latter cases are likely due to substantially lower shares of rail trips in the baseline than in the Stockholm–Gothenburg case. The corresponding cross-elasticities with respect to air, computed by (2), are 0.14 for private trips, 0.54 for business trips and 0.38 for all trips.

A new long distance model has recently been estimated based on the same data and basic model specification as the Sampers model, but applying Box–Cox transformations to the travel time and cost variables. WSP Analysis and Strategy (2012) gives a detailed description of the estimation procedure and results.[8] This model is referred to as ‘the non-linear model’ in the following. The estimation shows clear evidence of non-linearity, with Box–Cox parameters less than unity. The non-linear model has also been used to forecast the effects of the Stockholm–Gothenburg HSR track. All scenario assumptions and timetables in the base and the HSR scenarios are identical to those used in the Sampers forecast. Surprisingly, the total elasticity resulting from this forecast is only slightly higher than that resulting from the corresponding Sampers forecast.
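As a rough check on the Sampers elasticities quoted earlier in this section, the sketch below recomputes them from the rounded trip volumes in Table 4. Formula (1) is defined earlier in the paper and is not reproduced here; the code assumes that it is the logarithmic arc elasticity, ln(D2/D1)/ln(x2/x1). This functional form is an assumption, as are all function and variable names, which are purely illustrative.

```python
import math

def log_arc_elasticity(d_base, d_new, x_base, x_new):
    """Logarithmic arc elasticity of demand d with respect to attribute x.

    Assumed form of formula (1); not confirmed by this section of the paper.
    """
    return math.log(d_new / d_base) / math.log(x_new / x_base)

# Rail in-vehicle time in minutes: 3 h 5 min (baseline) and 2 h 14 min (HSR).
T_BASE, T_HSR = 185.0, 134.0

# Million trips per year (baseline, HSR), as printed in Table 4.
rail = {"business": (0.47, 0.78), "private": (1.13, 1.45), "all": (1.60, 2.23)}
air_all = (0.85, 0.75)

# Own elasticities of rail demand with respect to rail travel time.
own = {seg: log_arc_elasticity(b, h, T_BASE, T_HSR) for seg, (b, h) in rail.items()}

# Cross-elasticity of air demand with respect to rail travel time, all trips.
cross_air_all = log_arc_elasticity(air_all[0], air_all[1], T_BASE, T_HSR)

# Air-rail split for all trips, without and with the HSR investment.
rail_share_base = rail["all"][0] / (rail["all"][0] + air_all[0])
rail_share_hsr = rail["all"][1] / (rail["all"][1] + air_all[1])

for seg, e in own.items():
    print(f"own elasticity, {seg}: {e:.2f}")
print(f"cross-elasticity of air, all trips: {cross_air_all:.2f}")
print(f"rail share of air+rail trips: {rail_share_base:.2f} -> {rail_share_hsr:.2f}")
```

Under this assumption the recomputed values land within a few hundredths of the elasticities reported in the text. Exact agreement is not expected, because Table 4 is rounded to two decimals.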
The own elasticity regarding the number of rail trips is −2.1 for business trips, −0.77 for private trips and −1.15 for all trips.[9] The average arc elasticities described in Section 3.3 are similar to the Sampers forecast.

5. Validation of the forecasts

In this section, the air–rail mode split predicted by the Sampers model and the non-linear model, in response to the introduction of the HSR line, is validated against aggregate data. A relationship between the difference in generalized travel time between air and rail and the air–rail mode split found in aggregate traffic count data is estimated. The estimated relationship is then compared to the demand model of Sampers and the non-linear model using an incremental logit function. This analysis is based on the same idea as the analysis in Fig. 1. However, there are two important differences, introduced to avoid the two problems discussed in Section 2. First, the independent variable is the difference in generalized travel time between air and rail and not only the in-vehicle rail travel time. Second, the aggregate data includes all relations connecting Stockholm to another domestic airport, to avoid selection bias.

5.1. Aggregate data

The function describing the generalized travel time differences between rail and air in the aggregate data is denoted DGTT. DGTT includes the following components: in-vehicle travel time; access/egress time, including an estimate of the time for check-in, security, service and baggage delivery at the airports; first wait time; and number of transfers. All travel time components are transformed to equivalent in-vehicle time using the relative weights of the national guidelines for cost-benefit valuations.[10] The difference in fare is not included explicitly in this function, but is picked up by the constant, because the difference in average air and rail fare is relatively constant across trip relations.[11]
Rail, however, is on average cheaper but slower than air, implying that the market share for air is higher for business trips than for the more price-sensitive private trips. For this reason, business trips and private trips are analyzed separately. Aggregate traffic volumes for 2007 were obtained from the rail operators and Swedavia Swedish Airports.

Table 4
Baseline and forecast scenario 2020.

                       Rail                Air                 Car                 Coach
                       Priv  Bsn   Tot     Priv  Bsn   Tot     Priv  Bsn   Tot     Priv  Bsn   Tot
Million trips per year
  Baseline scenario    1.13  0.47  1.60    0.33  0.52  0.85    1.30  0.21  1.52    0.10  0.00  0.10
  HSR scenario         1.45  0.78  2.23    0.31  0.44  0.75    1.27  0.19  1.46    0.19  0.00  0.19
  % change             29    67    40      −4    −16   −12     −3    −10   −4      −3    −11   −3
Market shares
  Baseline scenario    0.40  0.39  0.39    0.11  0.43  0.21    0.45  0.18  0.37    0.03  0.00  0.02
  HSR scenario         0.46  0.55  0.49    0.10  0.40  0.17    0.40  0.18  0.32    0.03  0.00  0.02

[8] The report is available on request from the author.
[9] The corresponding cross-elasticities with respect to air are 0.15 for private trips, 0.71 for business trips and 0.34 for all trips.
[10] A detailed description of the computation of DGTT is available on request from the author.
[11] Average domestic flight ticket price is almost independent of the trip distance. Average rail fares are more strongly related to the trip distance but the impact is limited. For instance, the price of a normal ticket Stockholm–Malmö is less than twice the price Stockholm–Linköping, although the distance is three times as long.

5.2. Estimation result

A relationship between DGTT and air–rail mode split was estimated on the aggregated data by applying exponential regression. The reason for choosing an exponential function (truncated at 100 percent), as opposed to a logit model, is that it reaches 100 percent and can therefore pick up the effect that air service is reduced or vanishes in travel relations where rail becomes very competitive.
The estimated exponential functions for private and business trips are plotted as continuous lines in Figs. 3 and 4 (‘Exponential model’ in the figures). The parameters of the exponential function are shown in Table 5. The aggregated trip volumes used in the estimation of the exponential function are marked by dots in the same figures.[12] The figures also include Sampers applied as an incremental logit model (‘Incremental logit Sampers’ in the figures). The incremental model is calibrated based on the present air–rail split in the Stockholm–Gothenburg corridor and the corresponding difference in generalized travel time DGTT_SG (given in minutes of rail in-vehicle time). The present share of rail trips is denoted R_SG, and equals 0.29 for business trips and 0.73 for private trips (‘Stockholm–Gothenburg Base scenario’ in the figures) according to the aggregate trip volumes.[13] DGTT_SG is 47 min for business trips and 48 min for private trips. The share of rail trips (of the total number of air and rail trips) in a relation is then predicted by the incremental logit function

$$R(\mathrm{DGTT}) = \frac{P_{rail}}{P_{rail} + P_{air}} = \frac{R_{SG}\,\exp\big(\beta(\mathrm{DGTT} - \mathrm{DGTT}_{SG})\big)}{(1 - R_{SG}) + R_{SG}\,\exp\big(\beta(\mathrm{DGTT} - \mathrm{DGTT}_{SG})\big)},$$

where P_rail and P_air are the market shares for rail and air and β is the parameter corresponding to in-vehicle time for one-day trips in Sampers. The incremental non-linear models are plotted in the same figures (‘Incremental non-linear model’). Note, however, that for private trips the Sampers and the non-linear model do not include identical trips (the latter includes commuting trips in a separate model), so the non-linear model just serves as an illustration in this case. The figures also depict the air–rail mode split of the linear Sampers forecast (‘HSR scenario’), assuming that the rail travel time decreases by 51 min (in-vehicle travel time decreases from 3 h 5 min to 2 h 14 min).
This would imply that DGTT'_SG (Stockholm–Gothenburg, HSR scenario) becomes −4 min for business trips and −3 min for private trips.

Fig. 3 suggests that the Sampers model for business trips has a rather good model fit, as long as the share of rail trips is less than 60 percent. Above this point the curve fit is rather poor. As pointed out by Gaudry (2008), this is a typical problem with the linear-in-parameters logit model, because the response curve is forced to be symmetric around the inflexion point at 0.5. The curve fit, compared to the ‘Exponential model’, is better for the non-linear model. For the part of the response curve relevant for this particular forecast, however, the slope of the Sampers and the non-linear response curves is similar, explaining why their forecasts are similar (see Section 4).

Fig. 4 shows that for private trips the fit of Sampers’ response curve, compared to the ‘Exponential model’, is not as good as for the business model, in particular for high rail market shares. Again, the non-linear model performs considerably better. The comparison should be interpreted with caution since the definition of private trips in Sampers and in the non-linear model is not identical. The forecasts of the two models are similar also for private trips (see Section 4).

Fig. 3. Share for rail travel, R, as function of generalized travel time difference between air and rail, DGTT, business trips.

[12] Traffic counts are available on request from the author.
[13] Note that this air–rail split does not correspond exactly to the Sampers forecast in Table 4, partly because Sampers is not calibrated perfectly and because Table 4 refers to a forecast for 2020.

According to the Sampers forecast, the market share for rail increases from 29 percent to 46 percent for business trips and from 73 percent to 78 percent for private trips when the HSR line is introduced.
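The incremental logit function of this section can be sketched in a few lines. The Sampers in-vehicle time parameter β is not reported here, so the sketch backs out the value of β that is implied by the base and forecast rail shares quoted in the text. This implied β, like the function names, is purely illustrative and should not be read as the estimated Sampers parameter.

```python
import math

def incremental_logit_share(r_base, beta, dgtt, dgtt_base):
    """Rail share of air+rail trips, pivoted around an observed base share."""
    w = r_base * math.exp(beta * (dgtt - dgtt_base))
    return w / ((1.0 - r_base) + w)

def implied_beta(r_base, r_new, dgtt_base, dgtt_new):
    """Beta that moves the share from r_base to r_new (change in log-odds
    divided by the change in generalized travel time difference)."""
    def logit(p):
        return math.log(p / (1.0 - p))
    return (logit(r_new) - logit(r_base)) / (dgtt_new - dgtt_base)

# Business trips, Stockholm-Gothenburg, values quoted in the text:
# rail share 0.29 at DGTT = 47 min, forecast share 0.46 at DGTT = -4 min.
beta_bsn = implied_beta(0.29, 0.46, 47.0, -4.0)

share_base = incremental_logit_share(0.29, beta_bsn, 47.0, 47.0)
share_hsr = incremental_logit_share(0.29, beta_bsn, -4.0, 47.0)

print(f"implied beta per minute: {beta_bsn:.4f}")
print(f"rail share: {share_base:.2f} -> {share_hsr:.2f}")
```

By construction the function returns the observed base share when DGTT equals its base value, which is the point of pivoting the model around the present air–rail split rather than relying on the absolute model prediction.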
According to the exponential model, the resulting market share for rail would be similar for business trips but higher, 88 percent, for private trips. Assuming that the incremental Sampers own elasticity is right, this implies that the linear forecast model underpredicts the reduction of air travel by 176,000 air trips per year (which is about 16 percent of the total current number of air trips). An important point to make here is that the incremental logit does not take into account the effect that the frequency of air service decreases when rail becomes more competitive. This omission, which could be addressed in a second step, is however not the main reason for the underprediction.

The effect on the total air–rail split can be computed from the models above.[14] According to the exponential model, the rail share increases from 55 to 71 percent, and according to the incremental Sampers model the rail share increases from 55 to 67 percent. Hence, the exponential curve suggests that the total air–rail split is underestimated by four percentage points in the forecast. Even a 71 percent market share for rail is lower than for Madrid–Seville and London–Paris, which have approximately the same rail travel time. The higher share for air could be due to the very competitive small airport located within the City of Stockholm, which has high accessibility and fast check-in, security, service and baggage delivery.[15]

In passing we note that the elasticity varies strongly with travel time in all three models plotted in the figures, providing a strong warning against assuming constant elasticity demand functions over large ranges, as in Section 4. The exponential function implies that the elasticity reduces with increasing DGTT. The elasticity of the logit models (the linear and the non-linear), however, first increases with DGTT, up to the inflexion point, where it starts to reduce with DGTT.

6.
Conclusions

This paper investigates the methodological question of to what extent state-of-the-art forecasting models can predict the demand for HSR by varying the in-vehicle time attribute of the rail mode. The main approach is to evaluate the forecasts for the HSR proposed for the Stockholm–Gothenburg corridor using two different model systems:

• the nested logit-based Swedish long distance model, part of the national transport model package Sampers, and
• a modified set of nested logit-based models where the travel time attributes are non-linear, Box–Cox transformed, in the utility function.

Fig. 4. Share for rail travel, R, as function of generalized travel time difference between air and rail, DGTT, private trips.

Table 5
Exponential models explaining how the share of rail trips (air–rail mode split) depends on the difference in generalized travel time.

              Business trips                       Private trips
Nr. Obs.      23                                   23
R-squared     0.849                                0.877

              Estimate   Std. error   T-value      Estimate   Std. error   T-value
Intercept     −0.129     0.034679     −3.732       −0.6837    0.0572       −11.96
D GK          −0.003     0.000255     −10.884      −0.0069    0.00056      −12.33

[14] Business rail 304,039 trips; Business air 733,622 trips; Private rail 1,078,653 trips; Private air 389,750 trips.
[15] Another reason may be that many travelers feel motion sick on rail. In an interview survey among domestic air passengers, 45 percent state that they at least sometimes feel sick on X2000 and 11 percent state that this affects their mode choices (WSP Analysis and Strategy, 2012).

These models are evaluated against observed outcomes of other cases in Europe, and against an exponential model estimated on Swedish aggregate data revealing how the air–rail mode split varies with the difference in generalized travel time between rail and air. The exponential model ensures zero market shares for air when the rail–air in-vehicle time difference falls below zero, pricing out air from the market.
The two logit-based models are also compared to each other.

In general, the elasticities of long distance models estimated on cross-sectional data in the literature tend to be lower (in absolute terms) than the elasticities observed when new HSR lines have been opened, such as those in Madrid–Barcelona, Madrid–Seville and the first phase of the Paris–Lyon HSR line. The high observed elasticities, however, are likely a result of very long initial rail travel times, in particular in the Spanish corridors.

The own travel time elasticities implied by Sampers are well in line with or above those reported from other studies based on cross-sectional data. In particular, a similar UK model produces average elasticities in the same range. The non-linear model produces, as expected, even slightly higher elasticities than the linear-in-parameters Sampers model.

The own elasticity of in-vehicle travel time on travel demand in response to a proposed HSR line in the Stockholm–Gothenburg corridor is −1.0 in the Sampers model and −1.15 in the non-linear model, which is similar to the second phase of the opening of the Paris–Lyon HSR line (Nash, 2010). This is a relevant comparison, since the rail alternative is reasonably good also without a new HSR investment in the Stockholm–Gothenburg corridor, indicating that the Sampers long distance model produces credible elasticities.

For business trips, the air–rail split of the HSR corridor Stockholm–Gothenburg predicted by Sampers and the non-linear model is relatively consistent with the exponential model estimated on the aggregate data. The model fit in comparison to the exponential model is in general slightly better for the non-linear model than for the Sampers model, in particular for relations with a high share of rail trips. Still, the demand forecasts for the proposed HSR line in the Stockholm–Gothenburg corridor are similar for Sampers and the non-linear model.
For private trips, the Sampers model seems to underpredict the elasticities, in particular for relations where the market share for rail is high. This is probably due to too low cross-elasticity of air demand with respect to rail travel time, since the own elasticities are consistent with earlier experience and since cross-elasticities are more difficult to estimate accurately. As for business trips, the model fit in comparison to the exponential model seems to be better for the non-linear model.

If travelers view HSR as a qualitatively new travel mode, not just as a fast train, the demand effect of HSR would be hard to predict using standard forecasting models. This study, however, does not find any evidence supporting that this is the case. One may, however, question to what extent this conclusion can be generalized from the Stockholm–Gothenburg corridor, where the reduction in travel time due to the introduction of HSR would be moderate. This case differs from a number of other potential or realized HSR projects, in which the travel time reduction was or would be considerably larger. However, as long as HSR is not seen as a qualitatively new travel mode, demand predictions should be reliable in most cases, because travel time variability in cross-sectional data (covering wide ranges of origin–destination distances) is usually much larger than the reduction in travel time due to HSR.

Acknowledgments

Many thanks to Staffan Algers for fruitful discussions, and to Lars-Göran Mattsson for useful comments on the several versions of the paper.

References

Armoogum, J., Madre, J., 1997. Accuracy of data and memory effects in home based surveys of travel behaviour. In: 76th Annual Meeting of the Transportation Research Board, Washington, D.C.
Axhausen, K., Köll, H., Bader, M., Herry, M., 1997. Workload, response rate and data yield: experiments with long-distance diaries. Transp. Res. Rec.: J. Transp. Res. Board 1593, 29–40.
Ben-Akiva, M., Cascetta, E., Coppola, P., Papola, A., Velardi, V., 2010. High speed rail demand forecasting in a competitive market: the Italian case study. In: Proceedings of the World Conference of Transportation Research (WCTR), Lisbon, Portugal.
Beser Hugosson, M., 2003. Issues in Estimation and Application of Long Distance Travel Demand Models. Thesis. Division for Transport and Location Analysis, Royal Institute of Technology, Stockholm. No. 03-044. 2003. Retrieved from http://kth.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:9360.
Beser, M., Algers, S., 2002. SAMPERS – The New Swedish National Travel Demand Forecasting Tool. In: Lundqvist, L., Mattsson, L.-G. (Eds.), National Transport Models: Recent Developments and Prospects. Springer.
Börjesson, M., 2012. Valuing perceived insecurity associated with use of and access to public transport. Transp. Policy 22, 1–10. http://dx.doi.org/10.1016/j.tranpol.2012.04.004.
Börjesson, M., Eliasson, J., 2014. Experiences from the Swedish Value of Time study. Transp. Res. A 59, 144–158.
Burge, P., Rohr, C., Kim, C., 2010. Modelling choices for long-distance travellers in the UK: an SP analysis of mode choice. In: Proceedings from the European Transport Conference, Glasgow. Retrieved from http://etcproceedings.org/paper/modelling-choices-for-long-distance-travellers-in-the-uk-an-sp-analysis-of-mod.
Cabanne, I., 2003. A long term model for long distance travel in France. In: Proceedings from the European Transport Conference, Strasbourg.
Cheng, Y.-H., 2010. High-speed rail in Taiwan: new experience and issues for future development. Transp. Policy 17 (2), 51–63. http://dx.doi.org/10.1016/j.tranpol.2009.10.009.
INRO Consultants, 1999. Emme/2 User’s Manual: Release 9.2. Montréal, Canada.
COST318, 1998. Interaction Between High Speed Rail and Air Passenger Transport. European Commission: Directorate General of Transport.
Daly, A., 1992. ALOGIT 3.2 User’s Guide. Hague Consulting Group, The Hague.
Daly, A., 2010. Cost Damping in Travel Demand Models: Report of a Study for the Department for Transport.
Dargay, J., 2010. The prospects for longer distance domestic coach, rail, air and car travel in Britain (Unpublished). Report to the Independent Transport Commission: Institute for Transport Studies. Faculty of Environment Study for Strategic Rail Authority.
Dargay, J.M., Clark, S., 2012. The determinants of long distance travel in Great Britain. Transp. Res. Part A: Policy Pract. 46 (3), 576–587. http://dx.doi.org/10.1016/j.tra.2011.11.016.
De Bok, M., Costa, Á., Melo, S., Palma, V., Frias, R.D., 2010. Estimation of a mode choice model for long distance travel in Portugal. In: Proceedings from World Conference of Transport Research, Lisbon.
Dillén, J., Algers, S., 1998. Further research on the national Swedish value of time study. In: Selected Proceedings of the 8th World Conference on Transport Research, 3, 135–148.
Eliasson, J., Börjesson, M., 2014. On timetable assumptions in railway investment appraisal. Transp. Policy, Forthcoming.
European Commission, 2010. White paper – European Transport Policy for 2010: Time to Decide (White Paper), Brussels.
Eurostar, 2011. Eurostar contributes to rail renaissance as UK domestic passenger numbers hit highest level since 1920s. Retrieved December 1, 2011, from http://www.eurostar.com/uk-en/about-eurostar/press-office/press-releases/2011/eurostar-contributes-rail-renaissance-uk-domestic#.U-DOnigXzIk.
Flyvbjerg, B., Holm, M.K., Buhl, S.L., 2005. How (in)accurate are demand forecasts in public works projects? J. Am. Plan. Assoc. 71 (2), 131–146.
Gaudry, M., 2008. Non linear logit modelling developments and high speed rail profitability. Agora Jules Dupuit, Publication AJD-127, www.e-ajd.org.
Goodwin, P., Dargay, J., Hanly, M., 2004. Elasticities of road traffic and fuel consumption with respect to price and income: a review. Transp. Rev. 24 (3), 275–292.
Graham, D., Glaister, S., 2004. Road traffic demand elasticity estimates: a review. Transp. Rev. 24 (3), 261–274.
Jansson, K., Nelldal, B., 2010. High-speed trains in Sweden – a good idea? In: Proceedings of the 12th World Conference on Transport Research Society, ID 02762.
Nash, C., 2010. When to invest in high-speed rail links and networks? In: Presented at the 18th International ITF/OECD Symposium on Transport Economics and Policy. Retrieved from http://trid.trb.org/view.aspx?id=926734.
Outwater, M.L., Tierney, K., Bradley, M., Sall, E., Kuppam, A., Modugula, V., 2010. California statewide model for high-speed rail. J. Choice Model. 3 (1), 58–83.
Rohr, C., Fox, J., Daly, A., Patruni, B., Patil, S., Tsang, F., 2010. Modelling long-distance travel in the UK. In: Proceedings from the European Transport Conference, Glasgow.
Román, C., Espino, R., Martín, J.C., 2010. Analyzing competition between the high speed train and alternative modes. The case of the Madrid–Zaragoza–Barcelona Corridor. J. Choice Model. 3 (1), 84–108, doi:10.1016/S1755-5345(13)70030-7.
Sánchez-Borràs, M., 2010. High-speed rail in Spain. In: Presented at the 1st TEMPO Conference on Sustainable Transport, Oslo, Norway, 18–19 May 2010.
Sánchez-Borràs, M., Nash, C., Abrantes, P., López-Pita, A., 2010. Rail access charges and the competitiveness of high speed trains. Transp. Policy 17 (2), 102–109. http://dx.doi.org/10.1016/j.tranpol.2009.12.001.
Transek A.B., 2004. Technical Report of Sampers: Long Distance Trips. English.
Vickerman, R., 1997. High-speed rail in Europe: experience and issues for future development. Ann. Reg. Sci. 31 (1), 21–38. http://dx.doi.org/10.1007/s001680050037.
Wardman, M., 2004. Public transport values of time. Transp. Policy 11 (4), 363–377. http://dx.doi.org/10.1016/j.tranpol.2004.05.001.
WSP Analysis & Strategy, 2012. Höghastighetståg Del 2 – modellutveckling och känslighetsanalyser. High Speed Train Part 2 – model development and sensitivity analyses (in English and Swedish).