Data Mining For Robust Flight Scheduling
In the scheduling of airport operations, the unreliability of flight arrivals is a serious challenge. Recent scheduling techniques incorporate robustness with respect to flight delay. To refine proactive scheduling, we propose classifying flights into delay categories. Our method is based on archived data available at major airports in current flight information systems. Classification in this scenario is hindered by the large number of attributes, which may occlude the dominant patterns of flight delays. As not all of these attributes are equally relevant for different patterns, global dimensionality reduction methods are not appropriate. We therefore present a technique that identifies locally relevant attributes for classification into flight delay categories, and we give an algorithm that identifies these attributes efficiently. Our experimental evaluation demonstrates that our technique is capable of detecting relevant patterns useful for flight delay classification.
Keywords: Class Label, Relevant Attribute, Subspace Cluster, Attribute Entropy, Very Large Data Base
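The abstract does not spell out how locally relevant attributes are found, so the following is only a generic illustration of the idea, not the authors' algorithm. It ranks categorical attributes by information gain (the reduction in class-label entropy) and compares the ranking on a whole data set with the ranking inside a subset. The attribute names (`origin`, `carrier`, `daytime`), the `delay` labels, and the eight toy flights are all hypothetical and chosen so that different attributes matter in different subsets.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def conditional_entropy(rows, attribute, label="delay"):
    """Expected entropy of the label after splitting rows on one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[label])
    total = len(rows)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def rank_attributes(rows, attributes, label="delay"):
    """Return (attribute, information gain) pairs, most informative first."""
    base = entropy([row[label] for row in rows])
    gains = {a: base - conditional_entropy(rows, a, label) for a in attributes}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: at origin "A" the carrier decides the delay class,
# at origin "B" the time of day decides it.
flights = [
    {"origin": "A", "carrier": "X", "daytime": "am", "delay": "late"},
    {"origin": "A", "carrier": "X", "daytime": "pm", "delay": "late"},
    {"origin": "A", "carrier": "Y", "daytime": "am", "delay": "on_time"},
    {"origin": "A", "carrier": "Y", "daytime": "pm", "delay": "on_time"},
    {"origin": "B", "carrier": "X", "daytime": "am", "delay": "on_time"},
    {"origin": "B", "carrier": "X", "daytime": "pm", "delay": "late"},
    {"origin": "B", "carrier": "Y", "daytime": "am", "delay": "on_time"},
    {"origin": "B", "carrier": "Y", "daytime": "pm", "delay": "late"},
]

candidates = ["carrier", "daytime"]
global_ranking = rank_attributes(flights, candidates)
local_a = rank_attributes([f for f in flights if f["origin"] == "A"], candidates)
local_b = rank_attributes([f for f in flights if f["origin"] == "B"], candidates)

print("global:", global_ranking)
print("origin A:", local_a)
print("origin B:", local_b)
```

On this toy data both attributes look only weakly informative over all flights, while each subset has one attribute that fully determines the delay class. This is the kind of pattern a global dimensionality reduction could miss. The split by `origin` is fixed by hand here; discovering such subsets automatically is the harder problem that the paper addresses.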