Abstract
It is well known that naive Bayes performs surprisingly well in classification, but that its probability estimates are poor. In many applications, however, what is needed is a ranking based on class probabilities: in direct marketing, for example, customers are ranked by the likelihood that they will buy a product. How well does naive Bayes perform in ranking? In this paper we study this question through both empirical experiments and theoretical analysis. Our experiments show that naive Bayes outperforms C4.4, a state-of-the-art decision-tree algorithm for ranking. We then examine two example problems that have been used to analyze the performance of naive Bayes in classification [3]. Surprisingly, naive Bayes ranks perfectly on both, even though it does not classify them perfectly. Finally, we present and prove a sufficient condition for the optimality of naive Bayes in ranking.
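The setting the abstract describes can be sketched in a few lines: a categorical naive Bayes classifier with Laplace smoothing produces posterior estimates P(+ | x), test instances are ranked by that score, and the ranking is evaluated with AUC, the measure used in work of this kind. This is an illustrative sketch under assumed conventions, not the authors' implementation; the toy dataset and function names are invented.

```python
# Sketch: rank instances by naive Bayes posterior P(+ | x), score with AUC.
# Toy categorical data and all names below are illustrative assumptions.
from collections import defaultdict

def train_nb(X, y, n_attrs):
    # class counts and (class, attribute, value) counts for smoothing later
    prior = defaultdict(int)
    cond = defaultdict(int)
    values = [set() for _ in range(n_attrs)]
    for xs, c in zip(X, y):
        prior[c] += 1
        for i, v in enumerate(xs):
            cond[(c, i, v)] += 1
            values[i].add(v)
    return prior, cond, values, len(y)

def posterior_pos(xs, model):
    # joint P(c) * prod_i P(x_i | c) under conditional independence,
    # with Laplace smoothing, normalized to give P(class=1 | x)
    prior, cond, values, n = model
    joint = {}
    for c, nc in prior.items():
        p = nc / n
        for i, v in enumerate(xs):
            p *= (cond[(c, i, v)] + 1) / (nc + len(values[i]))
        joint[c] = p
    return joint.get(1, 0.0) / sum(joint.values())

def auc(scores, labels):
    # AUC = probability a random positive outranks a random negative
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# two categorical attributes, binary class
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (1, 1)]
y = [0, 0, 1, 1, 0, 1]
model = train_nb(X, y, 2)
scores = [posterior_pos(xs, model) for xs in X]
print(auc(scores, y))  # 1.0 on this separable toy set
```

Note that ranking quality depends only on the ordering of the posterior estimates, not on their calibration, which is why naive Bayes can rank well even when its probability estimates are poor.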
Keywords
- Decision Tree
- Probability Estimate
- Decision Tree Algorithm
- Conditional Independence Assumption
- Decision Tree Learning
References
Bennett, P.N.: Assessing the calibration of Naive Bayes’ posterior estimates. Technical Report CMU-CS-00-155, Carnegie Mellon University (2000)
Bradley, A.P.: The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30, 1145–1159 (1997)
Domingos, P., Pazzani, M.: Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier. Machine Learning 29, 103–130 (1997)
Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. Wiley-Interscience, New York (1973)
Ferri, C., Flach, P.A., Hernández-Orallo, J.: Learning Decision Trees Using the Area Under the ROC Curve. In: Proceedings of the 19th International Conference on Machine Learning, pp. 139–146. Morgan Kaufmann, San Francisco (2002)
Lachiche, N., Flach, P.A.: Improving Accuracy and Cost of Two-class and Multi-class Probabilistic Classifiers Using ROC Curves. In: Proceedings of the 20th International Conference on Machine Learning, pp. 416–423. Morgan Kaufmann, San Francisco (2003)
Frank, E., Trigg, L., Holmes, G., Witten, I.H.: Naive Bayes for Regression. Machine Learning 41(1), 5–15 (2000)
Friedman, N., Geiger, D., Goldszmidt, M.: Bayesian Network Classifiers. Machine Learning 29, 131–163 (1997)
Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45, 171–186 (2001)
Kononenko, I.: Comparison of Inductive and Naive Bayesian Learning Approaches to Automatic Knowledge Acquisition. In: Current Trends in Knowledge Acquisition, IOS Press, Amsterdam (1990)
Ling, C.X., Huang, J., Zhang, H.: AUC: a statistically consistent and more discriminating measure than accuracy. In: Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI 2003), pp. 329–341. Morgan Kaufmann, San Francisco (2003)
Ling, C.X., Yan, R.J.: Decision Tree with Better Ranking. In: Proceedings of the 20th International Conference on Machine Learning, pp. 480–487. Morgan Kaufmann, San Francisco (2003)
Merz, C., Murphy, P., Aha, D.: UCI repository of machine learning databases. Dept of ICS, University of California, Irvine (1997), http://www.ics.uci.edu/~mlearn/MLRepository.html
Pazzani, M., Merz, C., Murphy, P., Ali, K., Hume, T., Brunk, C.: Reducing misclassification costs. In: Proceedings of the 11th International Conference on Machine Learning, pp. 217–225. Morgan Kaufmann, San Francisco (1994)
Provost, F., Fawcett, T.: Analysis and visualization of classifier performance: comparison under imprecise class and cost distribution. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, pp. 43–48. AAAI Press, Menlo Park (1997)
Provost, F., Fawcett, T., Kohavi, R.: The case against accuracy estimation for comparing induction algorithms. In: Proceedings of the Fifteenth International Conference on Machine Learning, pp. 445–453. Morgan Kaufmann, San Francisco (1998)
Provost, F.J., Domingos, P.: Tree Induction for Probability-Based Ranking. Machine Learning 52(3), 199–215 (2003)
Swets, J.: Measuring the accuracy of diagnostic systems. Science 240, 1285–1293 (1988)
Zadrozny, B., Elkan, C.: Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers. In: Proceedings of the 18th International Conference on Machine Learning, pp. 609–616. Morgan Kaufmann, San Francisco (2001)
Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, San Francisco (2000)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Zhang, H., Su, J. (2004). Naive Bayesian Classifiers for Ranking. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. ECML 2004. Lecture Notes in Computer Science(), vol 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_46
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8