Abstract
This paper proposes a framework for exploiting hierarchical structures of feature domain values to improve classification performance under the Bayesian learning framework. Inspired by the statistical technique known as shrinkage, we investigate the variance in the parameter estimates used in Bayesian learning. We develop two algorithms that improve these estimates by maintaining a balance between precision and robustness. We evaluated our methods on two real-world data sets, namely a weather data set and a yeast gene data set. The results demonstrate that our models benefit from exploiting the hierarchical structures.
The work described in this paper was partially supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project Nos. CUHK 4385/99E and CUHK 4187/01E).
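To make the shrinkage idea in the abstract concrete, the following is a minimal sketch of shrinkage-based parameter estimation over a value hierarchy. It is not the authors' implementation: `ValueNode`, `shrinkage_estimate`, and the fixed mixture weights are illustrative assumptions. The sketch shows how a sparse, high-variance leaf-level maximum-likelihood estimate can be blended with better-supported estimates from its ancestors, trading the precision of specific values against the robustness of more general ones.

```python
# A minimal, illustrative sketch of shrinkage over a feature-value
# hierarchy. Names and weights are assumptions for exposition, not the
# algorithm from the paper.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValueNode:
    """A node in the feature-value hierarchy with its event counts."""
    count: int = 0   # co-occurrences of this value with the target class
    total: int = 0   # occurrences of this value overall
    parent: Optional["ValueNode"] = None


def path_to_root(node: Optional[ValueNode]):
    """Yield the node and all of its ancestors, leaf first."""
    while node is not None:
        yield node
        node = node.parent


def shrinkage_estimate(leaf: ValueNode, lambdas: list[float]) -> float:
    """Blend maximum-likelihood estimates along the leaf-to-root path.

    Leaf-level estimates are precise but high-variance; ancestor
    estimates pool more data and are robust but coarse. The mixture
    weights (lambdas, one per level, summing to 1) control the
    precision/robustness trade-off; in the shrinkage literature they
    are typically tuned, e.g., by EM on held-out data.
    """
    nodes = list(path_to_root(leaf))
    assert len(lambdas) == len(nodes)
    assert abs(sum(lambdas) - 1.0) < 1e-9
    estimate = 0.0
    for weight, node in zip(lambdas, nodes):
        mle = node.count / node.total if node.total > 0 else 0.0
        estimate += weight * mle
    return estimate


# Example: a two-level hierarchy where a sparse leaf estimate (1 of 2)
# is smoothed toward its better-supported parent (40 of 100).
root = ValueNode(count=40, total=100)
leaf = ValueNode(count=1, total=2, parent=root)
print(shrinkage_estimate(leaf, [0.6, 0.4]))  # 0.6*0.5 + 0.4*0.4 = 0.46
```

Under these assumed weights, the leaf's unreliable estimate of 0.5 is pulled toward the parent's 0.4, illustrating how the hierarchy supplies robustness when leaf-level data are scarce.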