Exploiting Heterogeneous Features for Classification Learning

  • Yiqiu Han
  • Wai Lam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2690)


This paper proposes a framework for handling heterogeneous features, comprising hierarchical values and text, under Bayesian learning. To exploit hierarchical features, we use a statistical technique called shrinkage. We also explore an approach for utilizing text data to improve classification performance. We have evaluated our framework on a yeast gene data set that contains hierarchical features as well as text data.
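As a rough illustration of the shrinkage idea mentioned in the abstract, the sketch below smooths a sparse probability estimate at a leaf of a feature hierarchy by interpolating it with estimates from its ancestors. It is not the paper's algorithm: the function names, the counts, and the hand-fixed interpolation weights are all hypothetical, and in practice such weights would typically be fit from data rather than chosen by hand.

```python
def ml_estimate(count, total):
    """Maximum-likelihood estimate count/total; 0.0 when no data is available."""
    return count / total if total > 0 else 0.0


def shrinkage_estimate(path_estimates, weights):
    """Linearly interpolate estimates along a leaf-to-root path.

    path_estimates: estimates ordered from the leaf node up to the root.
    weights: non-negative interpolation weights, one per node, summing to 1
             (assumed fixed here; this is an illustrative simplification).
    """
    if len(path_estimates) != len(weights):
        raise ValueError("need exactly one weight per node on the path")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * p for w, p in zip(weights, path_estimates))


# Hypothetical counts: (examples in the class, all examples) at each node,
# ordered leaf -> parent -> root. The leaf has very little data.
counts = [(1, 2), (30, 100), (200, 1000)]
estimates = [ml_estimate(c, n) for c, n in counts]  # [0.5, 0.3, 0.2]

# Hypothetical weights that trust the data-rich ancestors more than the leaf.
weights = [0.2, 0.3, 0.5]
smoothed = shrinkage_estimate(estimates, weights)
print(round(smoothed, 4))
```

With these made-up numbers, the leaf estimate of 0.5 (based on only two examples) is pulled toward the better-supported ancestor estimates, which is the general effect shrinkage is meant to achieve.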


hierarchical features, Bayesian learning, parameter estimation





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Yiqiu Han (1)
  • Wai Lam (1)
  1. Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong
