Data Mining For Robust Flight Scheduling

  • Ira Assent
  • Ralph Krieger
  • Petra Welter
  • Jörg Herbers
  • Thomas Seidl

In the scheduling of airport operations, the unreliability of flight arrivals is a serious challenge. Robustness with respect to flight delay is incorporated into recent scheduling techniques. To refine proactive scheduling, we propose classification of flights into delay categories. Our method is based on archived data at major airports in current flight information systems. Classification in this scenario is hindered by the large number of attributes that might occlude the dominant patterns of flight delays. As not all of these attributes are equally relevant for different patterns, global dimensionality reduction methods are not appropriate. We therefore present a technique which identifies locally relevant attributes for the classification into flight delay categories. We give an algorithm that efficiently identifies relevant attributes. Our experimental evaluation demonstrates that our technique is capable of detecting relevant patterns useful for flight delay classification.
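
As a rough illustration of what attribute relevance for delay classification can mean, the following sketch ranks attributes by how much uncertainty about the delay class remains once an attribute value is known (conditional Shannon entropy). It is not the algorithm from the chapter, which is not spelled out in the abstract. The function names, attribute names, and toy records are all hypothetical. Applying the ranking to a subset of the records rather than to all of them would give a local notion of relevance.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(rows, attribute, label_key):
    """Entropy of the class label that remains after grouping by one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[label_key])
    total = len(rows)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def rank_attributes(rows, attributes, label_key="delay"):
    """Sort attributes from most to least informative about the label."""
    return sorted(attributes, key=lambda a: conditional_entropy(rows, a, label_key))


# Hypothetical toy records; the values are invented for illustration only.
flights = [
    {"weather": "storm", "weekday": "Mon", "delay": "late"},
    {"weather": "storm", "weekday": "Tue", "delay": "late"},
    {"weather": "clear", "weekday": "Mon", "delay": "on_time"},
    {"weather": "clear", "weekday": "Tue", "delay": "on_time"},
]

# Lower conditional entropy means the attribute says more about the delay class.
ranking = rank_attributes(flights, ["weekday", "weather"])
```

In this toy data the weather attribute determines the delay class completely, while the weekday attribute says nothing about it, so weather is ranked first.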


Class Label, Relevant Attribute, Subspace Cluster, Attribute Entropy, Very Large Data Base
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Ira Assent (1)
  • Ralph Krieger (1)
  • Petra Welter (2)
  • Jörg Herbers (3)
  • Thomas Seidl (1)

  1. Data Management and Exploration Group, RWTH Aachen University, Germany
  2. Dept. of Medical Informatics, RWTH Aachen University, Germany
  3. INFORM GmbH, Aachen, Germany
