Abstract
Most machine learning algorithms are modeled with two life stages: a learning stage and an application stage. The cost of human expertise makes it difficult to label large data sets for training such algorithms. In this paper, we challenge this strict dichotomy in the life cycle while addressing the data-labeling issue. We discuss a learning paradigm called Continuous Learning. After initial training on human-labeled data, a Continuously Learning algorithm iteratively retrains itself on the results of its own previous application stage, without the benefit of any external feedback. We elucidate the intuitive motivation and idea behind this paradigm, and explain how it differs from other learning models. Finally, we present an empirical evaluation of Continuous Learning applied to the Naive Bayesian Classifier for the classification of newsgroup articles from a well-known benchmark.
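The abstract describes the loop but gives no pseudocode here; the following is a minimal sketch of the idea under stated assumptions: a multinomial Naive Bayes classifier with Laplace smoothing is trained once on human-labeled documents, then repeatedly labels an unlabeled pool and retrains on its own output. All function names and toy data are illustrative assumptions, not the paper's implementation.

```python
# Sketch of a Continuous Learning loop around multinomial Naive Bayes.
# Illustrative only; not the authors' code.
from collections import Counter, defaultdict
import math

def train(docs):
    """Fit Naive Bayes counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        for t in tokens:
            word_counts[label][t] += 1
            vocab.add(t)
    return class_counts, word_counts, vocab

def classify(model, tokens):
    """Return the most probable class under Laplace-smoothed Naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, float("-inf")
    for label in class_counts:
        logp = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            # Laplace (add-one) smoothed log likelihood of each token
            logp += math.log((word_counts[label][t] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

def continuous_learning(labeled, unlabeled, iterations=3):
    """Initial training on human-labeled data, then iterative
    retraining on the classifier's own labels for the unlabeled
    pool, with no external feedback."""
    model = train(labeled)
    for _ in range(iterations):
        self_labeled = [(toks, classify(model, toks)) for toks in unlabeled]
        model = train(labeled + self_labeled)
    return model
```

On a toy corpus, the loop folds the unlabeled pool into the training set using the classifier's own predictions:

```python
labeled = [(["ball", "goal"], "sport"), (["vote", "law"], "politics")]
unlabeled = [["ball", "match"], ["law", "court"]]
model = continuous_learning(labeled, unlabeled)
classify(model, ["goal", "ball"])  # -> "sport"
```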
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Vega S. N., V.B., Bressan, S. (2003). Continuous Naive Bayesian Classifications. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_27
DOI: https://doi.org/10.1007/978-3-540-24594-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20608-8
Online ISBN: 978-3-540-24594-0