Neural Computing and Applications, Volume 31, Issue 4, pp 1173–1187

A discriminative model selection approach and its application to text classification

  • Lungan Zhang
  • Liangxiao Jiang
  • Chaoqun Li
Original Article


Abstract

Classification is one of the fundamental problems in data mining, in which a classification algorithm attempts to construct a classifier from a given set of training instances with class labels. It is well known that some classification algorithms perform very well on some domains and poorly on others. For example, naive Bayes (NB) performs well on some domains but poorly on those that involve correlated features. C4.5, on the other hand, typically works better than NB on such domains. To integrate their advantages and avoid their disadvantages, many hybrid approaches, such as model insertion and model combination, have been proposed. In this paper, we take a novel view and propose a discriminative model selection (DMS) approach. DMS discriminatively chooses different single models for different test instances while retaining the interpretability of single models. Empirical studies on a collection of 36 classification problems from the University of California at Irvine repository show that our discriminative model selection approach outperforms single models, model insertion approaches and model combination approaches. In addition, we apply the proposed discriminative model selection approach to some state-of-the-art naive Bayes text classifiers and also improve their performance.
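The abstract's core idea, picking for each individual test instance the single model that should handle it, can be sketched in a few lines. The sketch below is an illustrative toy, not the paper's actual DMS procedure: the two classifiers are minimal stand-ins for NB and C4.5, and using the maximum class posterior as the per-instance selection criterion is an assumption made here for illustration.

```python
import math
from collections import Counter, defaultdict

class TinyGaussianNB:
    """Toy one-feature Gaussian naive Bayes (stand-in for NB)."""
    def fit(self, X, y):
        groups = defaultdict(list)
        for x, c in zip(X, y):
            groups[c].append(x)
        self.params = {}
        for c, xs in groups.items():
            mu = sum(xs) / len(xs)
            var = sum((x - mu) ** 2 for x in xs) / len(xs) or 1e-6  # guard zero variance
            self.params[c] = (len(xs) / len(y), mu, var)  # (prior, mean, variance)
        return self

    def predict_proba(self, x):
        # prior * Gaussian likelihood, normalised over classes
        dens = {c: p * math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
                for c, (p, mu, var) in self.params.items()}
        z = sum(dens.values())
        return {c: d / z for c, d in dens.items()}

class TinyStump:
    """Toy one-split decision stump (stand-in for a decision tree such as C4.5)."""
    def fit(self, X, y):
        values = sorted(set(X))
        best_acc = -1.0
        # try each midpoint between consecutive feature values as the split
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [c for x, c in zip(X, y) if x <= t]
            right = [c for x, c in zip(X, y) if x > t]
            acc = (max(Counter(left).values()) + max(Counter(right).values())) / len(y)
            if acc > best_acc:
                best_acc, self.t = acc, t
                self.dists = (self._dist(left), self._dist(right))
        return self

    @staticmethod
    def _dist(labels):
        n = len(labels)
        return {c: k / n for c, k in Counter(labels).items()}

    def predict_proba(self, x):
        # class frequencies in the leaf the instance falls into
        return self.dists[0] if x <= self.t else self.dists[1]

def dms_predict(models, x):
    """Per-instance model selection: answer with whichever model assigns
    the highest class posterior to this particular instance."""
    chosen = max(models, key=lambda m: max(m.predict_proba(x).values()))
    proba = chosen.predict_proba(x)
    return max(proba, key=proba.get)

X_train = [0.9, 1.0, 1.2, 2.9, 3.0, 3.2]  # toy 1-D data: class 0 low, class 1 high
y_train = [0, 0, 0, 1, 1, 1]
models = [TinyGaussianNB().fit(X_train, y_train), TinyStump().fit(X_train, y_train)]
print(dms_predict(models, 1.1), dms_predict(models, 3.1))  # -> 0 1
```

In this sketch the selector consults both models per instance and keeps only the more confident one's vote, which preserves the interpretability of whichever single model answered; the paper's actual selection rule may differ.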


Keywords: Naive Bayes · C4.5 · Discriminative model selection · Text classification



Acknowledgements

The work was partially supported by the National Natural Science Foundation of China (61203287), the Program for New Century Excellent Talents in University (NCET-12-0953), the Chenguang Program of Science and Technology of Wuhan (2015070404010202), and the Open Research Project of Hubei Key Laboratory of Intelligent Geo-Information Processing (KLIGIP201601).

Compliance with ethical standards

Conflict of interest

We confirm that there is no conflict of interest in this submission and that all authors have approved the manuscript for publication.


References

  1. Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Multiple-Valued Logic Soft Comput 17(2–3):255–287
  2. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
  3. Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30
  4. Frank A, Asuncion A (2010) UCI machine learning repository. Department of Information and Computer Science, University of California, Irvine
  5. Han E, Karypis G (2000) Centroid-based document classification: analysis and experimental results. In: Proceedings of the 4th European conference on the principles of data mining and knowledge discovery. Springer, pp 424–431
  6. Han S, Karypis G, Kumar V (2001) Text categorization using weight adjusted k-nearest neighbor classification. In: Proceedings of the 5th Pacific-Asia conference on knowledge discovery and data mining. Springer, pp 53–65
  7. Jiang L, Cai Z, Zhang H, Wang D (2013) Naive Bayes text classifiers: a locally weighted learning approach. J Exp Theor Artif Intell 25(2):273–286
  8. Jiang L, Li C (2011) Scaling up the accuracy of decision-tree classifiers: a naive-Bayes combination. J Comput 6(7):1325–1331
  9. Jiang L, Li C, Wang S, Zhang L (2016a) Deep feature weighting for naive Bayes and its application to text classification. Eng Appl Artif Intell 50:188–203
  10. Jiang L, Wang D, Cai Z (2012) Discriminatively weighted naive Bayes and its application in text classification. Int J Artif Intell Tools 21(1):1250007
  11. Jiang L, Wang S, Li C, Zhang L (2016b) Structure extended multinomial naive Bayes. Inf Sci 329:346–356
  12. Keerthi SS, Shevade SK, Bhattacharyya C, Murthy KRK (2001) Improvements to Platt's SMO algorithm for SVM classifier design. Neural Comput 13(3):637–649
  13. Kohavi R (1996) Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In: Proceedings of the second international conference on knowledge discovery and data mining. ACM, pp 202–207
  14. Langley P, Iba W, Thomas K (1992) An analysis of Bayesian classifiers. In: Proceedings of the tenth national conference on artificial intelligence. AAAI Press, pp 223–228
  15. Li Y, Luo C, Chung SM (2012) Weighted naive Bayes for text classification using positive term-class dependency. Int J Artif Intell Tools 21(1):1250008
  16. Losada DE, Azzopardi L (2008) Assessing multivariate Bernoulli models for information retrieval. ACM Trans Inf Syst 26(3), Article No. 17
  17. McCallum A, Nigam K (1998) A comparison of event models for naive Bayes text classification. In: Working notes of the 1998 AAAI/ICML workshop on learning for text categorization. AAAI Press, pp 41–48
  18. Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281
  19. Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Schoelkopf B, Burges C, Smola A (eds) Advances in kernel methods: support vector learning. MIT Press, Cambridge, pp 185–208
  20. Ponte JM, Croft WB (1998) A language modeling approach to information retrieval. In: Proceedings of the 21st annual international ACM SIGIR conference on research and development in information retrieval. ACM Press, Melbourne, pp 275–281
  21. Provost F, Domingos P (2003) Tree induction for probability-based ranking. Mach Learn 52:199–215
  22. Quinlan JR (1993) C4.5: programs for machine learning, 1st edn. Morgan Kaufmann, San Mateo
  23. Ratanamahatana CA, Gunopulos D (2003) Feature selection for the naive Bayesian classifier using decision trees. Appl Artif Intell 17:475–487
  24. Rennie JD, Shih L, Teevan J, Karger DR (2003) Tackling the poor assumptions of naive Bayes text classifiers. In: Proceedings of the twentieth international conference on machine learning. Morgan Kaufmann, pp 616–623
  25. Wang S, Jiang L, Li C (2015) Adapting naive Bayes tree for text classification. Knowl Inf Syst 44(1):77–89
  26. Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, Los Altos
  27. Wu X, Kumar V, Quinlan JR (2008) Top 10 algorithms in data mining. Knowl Inf Syst 14(1):1–37
  28. Xiao Y, Zhu Z, Zhao Y, Wei Y, Wei S, Li X (2014) Topographic NMF for data representation. IEEE Trans Cybern 44(10):1762–1771
  29. Yan J, Gao X (2014) Detection and recognition of text superimposed in images based on layered method. Neurocomputing 134:3–14
  30. Zhang H, Ho KL, Wu QM, Ye Y (2013) Multidimensional latent semantic analysis using term spatial information. IEEE Trans Cybern 43(6):1625–1640
  31. Zhang H, Liu G, Chow TWS, Liu W (2011) Textual and visual content-based anti-phishing: a Bayesian approach. IEEE Trans Neural Netw 22(10):1532–1546
  32. Zhang L, Jiang L, Li C (2016a) C4.5 or naive Bayes: a discriminative model selection approach. In: Proceedings of the twenty-fifth international conference on artificial neural networks. Springer, pp 419–426
  33. Zhang L, Jiang L, Li C (2016b) A new feature selection approach to naive Bayes text classifiers. Int J Pattern Recognit Artif Intell 30(2):1650003
  34. Zhang L, Jiang L, Li C, Kong G (2016c) Two feature weighting approaches for naive Bayes text classifiers. Knowl Based Syst 100:137–144
  35. Zhang T, Oles FJ (2001) Text categorization based on regularized linear classification methods. Inf Retr 4(1):5–31

Copyright information

© The Natural Computing Applications Forum 2017

Authors and Affiliations

  1. Department of Computer Science, China University of Geosciences, Wuhan, China
  2. Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan, China
  3. Department of Mathematics, China University of Geosciences, Wuhan, China
