Introduction
Research Background
Large airport hubs serve as convergence points for multiple transportation modes, such as aviation, high-speed rail/urban rail transit, taxis, ride-hailing services, buses, and private cars. These modes collectively form a complex multimodal transportation network, which not only meets diverse passenger travel needs but also poses significant operational management challenges. Passenger travel demands at airport hubs often involve transfers across different transportation modes and are influenced by flight dynamics, operational environments, and other factors, resulting in highly dynamic and complex traffic demand patterns.
With the continuous growth of passenger flow at airport hubs, accurately predicting multimodal transportation demand to enhance operational efficiency and passenger experience has become a key research focus in transportation management and planning. However, traditional prediction methods exhibit clear limitations in addressing complex cross-modal demand, struggling to adapt to the dynamic and variable characteristics of airport hub traffic. Therefore, constructing an integrated traffic demand prediction model based on multi-source data can more comprehensively capture the dynamic interconnections among different transportation modes, which holds significant importance for improving airport hub traffic management capabilities.
Research Significance
From a theoretical perspective, research on integrated multimodal transportation demand prediction at airport hubs will advance the application of multi-source data fusion technologies and deep learning models in complex transportation systems, deepen the understanding of cross-modal transportation correlations, and further enrich the methodological framework of intelligent transportation systems research.
From a practical perspective, accurate transfer traffic demand forecasting can help managers allocate transportation resources efficiently, optimize intermodal connectivity, alleviate congestion, and enhance passenger transfer experiences. Moreover, this research can serve as a reference for other types of transportation hubs, providing theoretical support and practical guidance for optimizing and developing urban integrated transportation systems.
Literature Review
This section reviews existing research from two perspectives: airport transfer passenger flow and traffic forecasting models, particularly multimodal forecasting models.
Airport Transfer Passenger Flow
Early research on airport transfer passenger flows was constrained by limitations in real-time data collection technology and distribution management conditions, leading scholars both domestically and internationally to rely primarily on operations research theory and simulation modeling. Overseas research took the lead due to earlier airport construction. Monterier and Hansen[3] used San Francisco Airport as a case study to systematically analyze the impact of landside transportation systems on overall airport traffic. They proposed improving operational efficiency by optimizing traffic organization and adding transfer facilities, providing a practical template for subsequent research. Gorstein and Mccabe[8] further defined the boundaries of landside transportation systems, using simulation software to model dynamic interactions between pedestrians and vehicles. By quantifying simulation results, they derived a traffic capacity calculation model, offering key parameter support for airport planning. Stephane's team[17] broke through the single-entity perspective by integrating decisions from airports, airlines, and passengers into a unified framework. Based on survey data from the San Francisco Bay Area, they constructed a two-level nested Logit model, revealing the complex relationships between fare strategies, flight schedules, and passenger choices. Gulsah[18] focused on micro-level individual behavior, conducting field research at Columbus International Airport and using a binomial Logit model to quantify passengers' acceptance thresholds for private car alternatives (e.g., shared transportation). The study found that travel cost and time reliability were core decision-making factors.
Domestic research was constrained by the lag in airport infrastructure construction and emerged only in the early 21st century, but then developed rapidly. In 2006, Yao Yanbin[19] pioneered the combination of the Analytic Hierarchy Process (AHP) with multiple Logit models, using the Capital Airport as a case study, to provide the first quantitative evaluation of the diversion effect of rail transit access on the landside transfer system, demonstrating that an increased rail mode share could alleviate road congestion during peak hours. Wen Yifan[20] further refined the model structure by differentiating the transfer preferences of business and leisure travelers through a hierarchical Logit model, predicting market shares of different transportation modes, and constructing a demand elasticity model to support dynamic pricing strategies. Guo Zhenyi[21] targeted the diversified transfer scenarios at Baiyun Airport, improved the traditional nested Logit model by introducing variables such as terminal layout and future expansion plans, and verified the model's adaptability in long-term planning. The Bao Danwen team[22], using fused stated preference (SP) and revealed preference (RP) data, compared the performance of mixed Logit and nested Logit models, finding that mixed Logit had greater advantages in capturing individual heterogeneity. Zhang Lanfang et al.[42] used the nested Logit (NL) model to reveal that business travelers are highly sensitive to time while non-business travelers are more sensitive to cost, and proposed recommendations for differentiated transfer service designs.
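The Logit-family models discussed above share a common core: each access mode receives a utility that is linear in its attributes, and choice probabilities follow a softmax over those utilities. The following is a minimal Python sketch of the plain multinomial Logit case only. The function name, mode attributes, and coefficients are illustrative assumptions, not values taken from the cited studies, and the nested and mixed extensions are omitted.

```python
import math

def mnl_shares(modes, beta_time, beta_cost):
    """Multinomial logit: P(m) = exp(V_m) / sum_k exp(V_k), V = b_t*time + b_c*cost."""
    utilities = {m: beta_time * t + beta_cost * c for m, (t, c) in modes.items()}
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {m: math.exp(v - v_max) for m, v in utilities.items()}
    total = sum(exp_v.values())
    return {m: e / total for m, e in exp_v.items()}

# Hypothetical access modes: (travel time in minutes, cost in yuan)
MODES = {"rail": (50, 35), "taxi": (40, 120), "bus": (75, 30)}

# Hypothetical coefficients: the business segment weights time more heavily,
# the leisure segment weights cost more heavily
business = mnl_shares(MODES, beta_time=-0.08, beta_cost=-0.005)
leisure = mnl_shares(MODES, beta_time=-0.02, beta_cost=-0.03)

for name, shares in (("business", business), ("leisure", leisure)):
    print(name, {m: round(p, 3) for m, p in shares.items()})
```

Under these assumed coefficients the time-sensitive segment shifts toward the faster but more expensive mode, which is the qualitative pattern that segment-specific Logit models are designed to expose.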
With the maturity of intelligent sensor deployment at airports and real-time passenger flow statistics technology, research focus has gradually shifted to data-driven prediction models. Li Xinyue[43] focused on the real-time decision-making mechanisms of departing passengers, analyzing the influence of multi-source information such as display screen guidance and mobile app prompts on choice behavior. However, the research of Hu Xiaobo[13] and Sun Zhiqiang[14] on Capital Airport remained limited to local transportation connection issues (e.g., taxi dispatch optimization), lacking in-depth exploration of the overall synergistic effects of the transfer system. Xia Wei[16] summarized general patterns of traffic flow organization but did not propose quantitative evaluation tools.
Single-Mode Traffic Prediction
Data-driven models are well established for other transportation modes, especially road traffic. Over years of development, single-mode traffic prediction has evolved from statistical methods to combinations of various machine learning approaches, supported by a substantial body of established theory.
Early single-mode predictions relied on traditional statistical methods, such as ARIMA (Chen et al. 2020), which smooths short-term fluctuations through time series; logistic regression (Apronti et al. 2016); and Bayesian networks (Zhu et al. 2016), which fit probability distributions using historical data. However, their linear assumptions struggle to capture the dynamic nonlinear relationships between passenger flow and external factors such as weather or events. The historical average algorithm (Smith and Demetsky 1997), while simple and practical, fails to adapt to sudden flow changes such as holidays. Kalman filtering (Kumar 2017) can update predictions through state equations, but its high sensitivity to model parameters results in insufficient robustness.
The introduction of machine learning models partially addresses the aforementioned limitations. The KNN algorithm (Xu et al. 2020; Tak et al. 2014) achieves non-parametric prediction through similar-day matching, but faces the "curse of dimensionality" in high-dimensional data; neural networks (Wei and Chen 2012; Zheng et al. 2006) capture nonlinear features through multilayer perceptrons, but shallow structures have limited ability to model spatiotemporal correlations; SVM (Tang et al. 2019; Jiang et al. 2014) maps data into high-dimensional spaces using kernel functions, but relies heavily on hyperparameter selection, and its computational complexity increases sharply with data volume. Although these models outperform traditional methods in specific scenarios, they still lack the ability to finely characterize the implicit spatiotemporal heterogeneity in traffic data (e.g., differences in passenger flow patterns between morning subway peaks and evening bus peaks).
The breakthrough progress in deep learning has propelled prediction accuracy to new heights. Recurrent neural networks (Ma et al. 2015) and their LSTM variants (Liu et al. 2019; Yang et al. 2021; Zhang et al. 2019) capture the temporal dependencies of passenger flow through gating mechanisms, but their unidirectional propagation structures struggle to model long-term (e.g., weekly, monthly) patterns; CNN utilizes convolutional kernels to extract local spatial features, excelling in grid-based OD matrix predictions, yet forcibly rasterizing transportation networks leads to the loss of topological information (e.g., road connectivity). Graph neural networks (GNN) model the native transportation topology through node-edge relationships, emerging as a new paradigm for spatial feature extraction: Guo et al. (2019) designed a spatiotemporal graph neural network (STGNN), integrating graph convolution (GCN) with GRU, achieving the first dynamic prediction of city-wide bicycle demand; Chen et al. (2020) proposed a multi-task GCN framework, simultaneously predicting taxi demand and vacancy rates, enhancing fine-grained prediction accuracy through road cascade relationship modeling; Geng et al. (2019) constructed a multi-graph convolutional network, encoding inter-regional distance, traffic flow similarity, and functional complementarity, significantly optimizing ride-hailing dispatch efficiency. However, single models often face performance bottlenecks, such as GCN's over-smoothing issue when layers deepen, and LSTM's delayed response to abrupt patterns.
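To make the graph-convolution idea concrete, the sketch below implements one commonly used propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 X W), with NumPy. It is an illustration under stated assumptions rather than the architecture of any cited model: the four-node chain network, the single demand feature, and the identity weight matrix are all hypothetical.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1)) # inverse square root of degrees
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ feats @ weight, 0.0)

# Hypothetical 4-node chain (e.g. four consecutive stations on one line)
ADJ = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
X = np.array([[10.0], [0.0], [0.0], [0.0]])  # demand observed at node 0 only
W = np.eye(1)                                 # identity weight, for illustration

h1 = gcn_layer(ADJ, X, W)   # information reaches node 1
h2 = gcn_layer(ADJ, h1, W)  # information reaches node 2
print(h1.ravel())
print(h2.ravel())
```

Each layer spreads the signal one hop further along the chain. Repeating this averaging many times makes node features increasingly similar, which is the intuition behind the over-smoothing issue mentioned above.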
To overcome the limitations of single models, hybrid architectures have become a research hotspot. Ke et al. (2017) combined LSTM with CNN in parallel to separately extract temporal trends and spatial hotspots of ride-hailing demand, improving short-term prediction robustness through feature concatenation; Zhang et al. (2020) designed a GCN-3DCNN hybrid model to capture spatial correlations between subway stations while using 3D convolution to mine spatiotemporal cube features of passenger flow; Zhang et al. (2021) proposed Res-LSTM, which introduces residual connections to alleviate gradient vanishing and combines GCN to model station topology, reducing subway passenger flow prediction errors by 12%. The rise of attention mechanisms has further enhanced model interpretability: Vaswani et al. (2017) proposed the Transformer, which captures global temporal dependencies through self-attention. Li et al. (2019)'s LogSparse Transformer adopted an exponentially decaying sparsity strategy to reduce computational overhead; Zhou et al. (2020)'s Informer achieved efficient modeling of long sequences through probabilistic sparse attention mechanisms, improving subway passenger flow prediction accuracy by 23% compared to traditional LSTM; Yao et al. (2019) designed the Spatio-Temporal Dynamic Network (STDN), which filters noise using local flow gating and weights historical periodic data through attention mechanisms, effectively addressing the challenge of predicting sudden holiday passenger flow fluctuations.
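As an illustration of the self-attention operation underlying the Transformer, the following NumPy sketch computes scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, over a short sequence. The sequence length, feature size, and random projection matrices are hypothetical placeholders; a trained model would learn the projections and typically use multiple heads.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention: softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
T, D = 6, 4                                # hypothetical: 6 time steps, 4 features
x = rng.normal(size=(T, D))                # stand-in for a passenger-flow sequence
w_q, w_k, w_v = (rng.normal(size=(D, D)) for _ in range(3))

out, attn = self_attention(x, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

The attention matrix has one entry per pair of time steps, so its size grows quadratically with sequence length. This cost is what sparse variants such as those described above aim to reduce.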
The aforementioned models all focus on single-mode transportation prediction. However, in the real world, multiple transportation modes form a dynamic interconnected network due to passenger behavior choices and system synergies. Therefore, jointly considering multiple transportation modes rather than ignoring their correlations becomes particularly important.
Multimodal Prediction Models
Irawan et al. (2019) empirically found significant complementary and competitive relationships between motorcycle ride-hailing services, motorcycle taxis, and public transport. For instance, motorcycle taxis provide "last-mile" connectivity for subways during peak hours, while directly competing with buses during low-demand periods. With the integration of multi-task learning and deep learning techniques, joint multimodal prediction is gradually becoming feasible. Ke et al. (2021) proposed a multi-task multi-graph learning method, constructing independent graph structures for taxis, shared bikes, and buses. They utilized multi-graph convolution to extract spatiotemporal features of each mode and designed an inter-task knowledge-sharing mechanism to predict ride-hailing demand. However, such methods do not explicitly model cross-modal influences (e.g., subway delays causing surges in taxi demand), relying instead on implicit parameter transfer for correlation, resulting in weaker interpretability.
Early multimodal prediction studies had significant limitations. Zhong et al. (2017) simply aggregated subway, bus, and taxi passenger flows into regional total passenger volume for prediction, ignoring nonlinear interactions between modes (e.g., demand shifts caused by fare differences). Ye et al. (2019) used a CNN-LSTM hybrid model to simultaneously predict taxi and shared bike demand, but their spatial feature decomposition was based on static grid partitioning, failing to reflect cross-modal flow transmission under real road network topologies. Xu et al. (2022) attempted to introduce multi-spatial correlation graphs to depict competition between bikes and taxis, combining graph attention networks to quantify inter-modal demand substitution elasticity. However, reliance on manually defined correlation matrices made it difficult to adapt to dynamic scenarios (e.g., sudden weather changes causing sharp drops in cycling demand). Liang et al. (2022) proposed a Multi-Relation Graph Neural Network (MR-GNN), using fixed mode correlation matrices to model subway-bus synergy. Yet, static matrices could not capture emergency coordination demands during sudden disruptions, leading to prediction delays.
To break through static correlation constraints, some studies have shifted to adaptive graph modeling. Lu et al. (2020) introduced an adaptive adjacency matrix in road speed prediction, dynamically adjusting the weights of different traffic modes (e.g., private cars and trucks) through a gating mechanism, but it was not extended to multimodal passenger flow prediction scenarios. Huang et al. (2022) designed a dynamic residual GCN, using an attention mechanism to adaptively generate inter-mode correlation matrices, and validated the effectiveness of dynamic modeling in intercity traffic flow prediction; Bai et al. (2020) proposed an adaptive graph convolutional recurrent network, simultaneously optimizing node features and graph structure parameters, but it focused on intra-mode correlations and did not address cross-modal interactions. Recently, the multimodal former (M2-former)[2] addressed the challenge of heterogeneity in multimodal transportation by proposing an encoder-decoder architecture: the encoder separately models the spatiotemporal features of subways, buses, and shared bikes through adaptive multi-graph convolution and captures dynamic interactions using a cross-mode attention mechanism; the decoder then transfers features from high-density traffic modes (e.g., subways) to low-density modes (e.g., customized buses) through knowledge distillation, significantly alleviating data sparsity issues. This model validated the ripple effect of subway flow restrictions on shared bike demand during holidays in a Beijing multimodal dataset, but its applicability to small-scale hub scenarios (e.g., airport landside traffic) has yet to be verified.
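One formulation of an adaptive adjacency matrix that appears in the adaptive graph literature derives the graph from learnable node embeddings E as softmax(ReLU(E E^T)), instead of fixing it in advance. The NumPy sketch below shows only this construction step. The node count, embedding size, and random initialization are assumptions for illustration, and the sketch is not claimed to reproduce the exact mechanism of any model cited above.

```python
import numpy as np

def adaptive_adjacency(node_emb):
    """Row-normalised adjacency from node embeddings: softmax(ReLU(E E^T))."""
    logits = np.maximum(node_emb @ node_emb.T, 0.0)        # ReLU on similarity scores
    logits = logits - logits.max(axis=1, keepdims=True)    # stabilise the softmax
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(42)
N_NODES, EMB_DIM = 5, 3                    # hypothetical sizes
E = rng.normal(size=(N_NODES, EMB_DIM))    # a trainable parameter in practice
A = adaptive_adjacency(E)
print(np.round(A, 3))
```

Because E would be updated by gradient descent together with the rest of the network, the resulting adjacency can adapt to the data rather than depend on a manually defined correlation matrix.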
Knowledge transfer and adaptation to sparse data have become another research focus. Li et al. (2021) utilized memory neural networks to transfer dense passenger flow features from subway stations to sparse intercity bus station predictions, but relied solely on time-series similarity without considering spatial topological constraints; Li et al. (2022) further proposed an unsupervised knowledge adaptation model, aligning feature distributions of different transportation modes through adversarial training, but their LSTM-based backbone network struggled to capture complex spatial dependencies. Notably, the Locality-Perception-Enhanced Spatiotemporal Graph Transformer Network (LPE-STGTN)[1] dynamically captures cross-temporal shared patterns (e.g., commuting cycles) and transient interactions (e.g., emergencies) between regions through a spatiotemporal graph generator, combined with a lightweight AFT-local module to enhance computational efficiency, offering a new approach for multimodal correlation modeling. However, it remains limited to unimodal predictions for taxis and ride-hailing services.
Multimodal prediction models primarily focus on spatiotemporal analysis at urban scales, with limited development for small-scale areas such as transportation hubs.
Summary of Research Status
Based on the above background, existing research on short-term traffic flow prediction for multimodal transportation faces the following challenges: (1) the difficulty of learning interaction mechanisms among multiple transportation modes; (2) the challenge of extracting complex dynamic spatiotemporal features; (3) the organizational difficulty of heterogeneous data structures.
Research Content and Technical Approach
This research on integrated prediction of multimodal landside transportation demand at airports, driven by multi-source data, aims to develop deep learning models capable of dynamically capturing complex intermodal correlations and spatiotemporal features. Through empirical analysis of real-world data from airports such as Beijing Daxing International Airport, it can not only validate the feasibility and effectiveness of the proposed methods but also provide theoretical support and practical references for demand prediction in other airport hubs and comprehensive transportation hubs.