Feature importance ranking for classification in mixed online environments

  • S.I.: Computational Biomedicine
Annals of Operations Research

Abstract

Online learning is a growing branch of machine learning with applications in many domains. One of its less studied topics is the development of strategies for online feature importance ranking. In this paper we present two methods for incremental ranking of features in classification tasks. Both ranking strategies measure the sensitivity of the classification outcome with respect to individual features. The methods work in classification environments with discrete, continuous, and mixed feature types, and they require minimal prior assumptions. The second method, a modification of the first, is designed to handle concept drift while avoiding cumbersome computations. Concept drift refers to sudden or gradual changes in the characteristics of the learning features, which occur in many online learning tasks such as online marketing analysis. If the rankings cannot adapt, these changes will make them obsolete over time. Moreover, we investigate different feature selection schemes for feature reduction in online environments to effectively remove irrelevant features from the classification model. Finally, we present experimental results that verify the efficacy of our methods against currently available online feature ranking algorithms.
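The abstract gives no formulas, so the following is only a minimal sketch of the general idea of sensitivity-based incremental ranking, not the authors' algorithm. It assumes a one-at-a-time perturbation scheme (each feature of the incoming sample is swapped with the value from the previous sample), a 0/1 "did the predicted class change" indicator as the sensitivity measure, and an exponentially weighted running mean as one possible way to let old evidence fade under concept drift. The helper names `sensitivity_flips` and `update_scores`, the stand-in classifier `predict`, and the forgetting factor of 0.05 are all hypothetical choices made for illustration.

```python
import numpy as np

def update_scores(scores, flips, step, forgetting=None):
    """Fold one step's 0/1 flip indicators into the running scores."""
    if forgetting is None:
        # Plain incremental mean: every past sample carries equal weight.
        return scores + (flips - scores) / step
    # Exponentially weighted mean: older evidence decays, so the
    # ranking can follow changes in the stream (assumed drift handling).
    return scores + forgetting * (flips - scores)

def sensitivity_flips(predict, x, donor):
    """Swap each feature of x with the donor's value, one at a time,
    and record whether the predicted class changes."""
    base = predict(x)
    flips = np.zeros(len(x))
    for j in range(len(x)):
        perturbed = x.copy()
        perturbed[j] = donor[j]
        flips[j] = float(predict(perturbed) != base)
    return flips

rng = np.random.default_rng(0)

# Stand-in classifier (assumption): the label depends on feature 0 only.
predict = lambda x: int(x[0] > 0.5)

n_features = 3
scores = np.zeros(n_features)
previous = None
step = 0
for _ in range(500):
    # Mixed stream: two continuous features and one binary feature.
    x = np.array([rng.random(), rng.random(), float(rng.integers(0, 2))])
    if previous is not None:
        step += 1
        flips = sensitivity_flips(predict, x, previous)
        scores = update_scores(scores, flips, step, forgetting=0.05)
    previous = x

# Features sorted from most to least sensitive.
ranking = np.argsort(-scores)
print(ranking, scores)
```

Swapping in observed values rather than adding numeric noise is what lets the same loop treat discrete and continuous features alike. Passing `forgetting=None` falls back to an ordinary running mean, which weighs all past samples equally and therefore adapts more slowly after a change in the stream.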

Acknowledgements

Dr. Zheng’s work is supported in part by the AFRL Mathematical Modeling and Optimization Institute. The authors also thank the reviewers and editors for their constructive comments and recommendations.

Author information

Corresponding author

Correspondence to Alaleh Razmjoo.

About this article

Cite this article

Razmjoo, A., Xanthopoulos, P. & Zheng, Q.P. Feature importance ranking for classification in mixed online environments. Ann Oper Res 276, 315–330 (2019). https://doi.org/10.1007/s10479-018-2972-2

  • DOI: https://doi.org/10.1007/s10479-018-2972-2
