Exploiting Heterogeneous Features for Classification Learning

  • Conference paper
Intelligent Data Engineering and Automated Learning (IDEAL 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2690)

Abstract

This paper proposes a framework for handling heterogeneous features, including hierarchical values and text, under Bayesian learning. To exploit hierarchical features, we use a statistical technique called shrinkage. We also explore an approach for using text data to improve classification performance. We evaluate the framework on a yeast gene data set that contains both hierarchical features and text data.

The work described in this paper was partially supported by grants from the Research Grant Council of the Hong Kong Special Administrative Region, China (Project Nos: CUHK 4385/99E and CUHK 4187/01E).
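
The abstract names shrinkage but does not give its formulation, so the following is only a minimal sketch of one common form of the idea, not the authors' method. It estimates the class-conditional probability of a hierarchical feature value by mixing the maximum-likelihood estimate at the leaf with estimates pooled at each ancestor and with a uniform distribution. The hierarchy, the node names, the observations, and the fixed mixing weights are all hypothetical (such weights are typically fitted on held-out data rather than set by hand), and the sketch assumes every leaf sits at the same depth.

```python
from collections import Counter


def path_to_root(node, parent):
    """Return [node, parent(node), ..., root] by following parent links."""
    path = [node]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path


def shrinkage_estimate(leaf, observations, parent, leaves, weights):
    """Estimate P(leaf | class) by shrinking toward ancestor estimates.

    observations: leaf values seen in the training examples of one class.
    weights: one weight per node on the leaf-to-root path, plus a final
             weight for the uniform distribution; they should sum to 1.
    """
    counts = Counter(observations)
    total = len(observations)
    path = path_to_root(leaf, parent)
    if len(weights) != len(path) + 1:
        raise ValueError("need one weight per path node plus one for uniform")

    estimate = 0.0
    for node, weight in zip(path, weights):
        # Leaves that fall under this node (the node itself if it is a leaf).
        under = [v for v in leaves if node in path_to_root(v, parent)]
        mass = sum(counts[v] for v in under)
        # Pooled ML probability of the node, spread evenly over its leaves.
        p_node = (mass / total) / len(under) if total else 0.0
        estimate += weight * p_node

    # Final component: uniform distribution over all leaf values.
    estimate += weights[-1] / len(leaves)
    return estimate


# Hypothetical two-level hierarchy: root -> {A, B}, A -> {a1, a2}, B -> {b1}.
parent = {"root": None, "A": "root", "B": "root",
          "a1": "A", "a2": "A", "b1": "B"}
leaves = ["a1", "a2", "b1"]
observations = ["a1", "a1", "b1", "a1"]   # values seen for one class
weights = [0.5, 0.3, 0.1, 0.1]            # leaf, parent, root, uniform

estimates = {v: shrinkage_estimate(v, observations, parent, leaves, weights)
             for v in leaves}
```

In this toy data the value `a2` never occurs, so its plain maximum-likelihood estimate would be zero. The shrunken estimate stays positive because `a2` borrows probability mass from its sibling `a1` through their shared parent `A`.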

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Han, Y., Lam, W. (2003). Exploiting Heterogeneous Features for Classification Learning. In: Liu, J., Cheung, Ym., Yin, H. (eds) Intelligent Data Engineering and Automated Learning. IDEAL 2003. Lecture Notes in Computer Science, vol 2690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45080-1_25

  • DOI: https://doi.org/10.1007/978-3-540-45080-1_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40550-4

  • Online ISBN: 978-3-540-45080-1

  • eBook Packages: Springer Book Archive