Analyzing the Spatial and Temporal Characteristics of Subway Passenger Flow Based on Smart Card Data

  • Xiaolei Ma
  • Jiyu Zhang
  • Chuan Ding
Part of the Complex Networks and Dynamic Systems book series (CNDS, volume 4)


Passenger flow is a core feature of rail transportation stations, and its fluctuation at the station level is strongly influenced by the surrounding land-use types. This study develops a sequential K-means clustering algorithm that uses smart card data to categorize Beijing subway stations. The clustering considers the temporal characteristics of daily inbound and outbound subway passenger flows. The stations are divided into 10 groups that fall under three categories: employment-oriented, dual-peak, and residence-oriented stations. We analyze how these categories differ in terms of station-level passenger flow. In addition, a station-level buffer area calculation method is used to estimate the land-use density around each subway station. To account for the spatial nonstationarity of passenger flow, we employ a geographically weighted regression (GWR) model to determine the correlation between peak-hour passenger flow and land-use density. We then analyze the spatial distribution of the correlation coefficients. Results demonstrate that most residents commute via rail transportation, and that the passenger flows for the different categories of stations exhibit distinct characteristics of residences and workplaces. The findings of this study provide useful information and a theoretical foundation for rail transportation network design and operation management.
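
The GWR idea mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single predictor (land-use density), a Gaussian distance kernel, and a fixed bandwidth, none of which are specified in the abstract. The function name `gwr_local_fit`, the station coordinates, and all numbers are hypothetical toy values chosen only to show how a separate intercept and slope are estimated at each station by weighted least squares.

```python
import math

def gwr_local_fit(coords, x, y, bandwidth):
    """Fit y = a_i + b_i * x at every station i with Gaussian distance weights.

    coords    : list of (easting, northing) pairs, one per station
    x, y      : predictor (e.g. land-use density) and response (e.g. peak-hour flow)
    bandwidth : kernel bandwidth in the same units as coords (assumed fixed)
    Returns a list of (intercept, slope) pairs, one per station.
    """
    results = []
    for (ux, uy) in coords:
        # Gaussian kernel: stations close to (ux, uy) get weights near 1,
        # distant stations get weights near 0.
        w = [math.exp(-0.5 * (math.hypot(ux - vx, uy - vy) / bandwidth) ** 2)
             for (vx, vy) in coords]
        sw = sum(w)
        x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        # Closed-form weighted least squares for one predictor.
        # Note: sxx is zero if all nearby stations share the same x value.
        sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
                  for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx
        intercept = y_bar - slope * x_bar
        results.append((intercept, slope))
    return results

# Hypothetical toy data: three western and three eastern stations.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
density = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
# Flow responds to density three times as strongly in the east as in the west.
flow = [100.0, 200.0, 300.0, 300.0, 600.0, 900.0]

coefs = gwr_local_fit(coords, density, flow, bandwidth=2.0)
for (cx, cy), (a, b) in zip(coords, coefs):
    print(f"station at ({cx}, {cy}): intercept={a:.2f}, slope={b:.2f}")
```

Because each station gets its own coefficients, the slopes can be mapped to show where passenger flow is more or less sensitive to land-use density, which is the kind of spatial distribution the abstract describes. A single global regression on the same toy data would average the two regimes and hide the difference.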


Subway passenger flow; Sequential clustering; Geographically weighted regression; Spatial and temporal analysis; Smart card data



This work is partly supported by the National Natural Science Foundation of China (51408019, U1564212, and 71503018) and the Beijing Nova Program (z151100000315048).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  1. School of Transportation Science and Engineering, Beijing Key Laboratory for Cooperative Vehicle Infrastructure System and Safety Control, Beihang University, Beijing, China
