HierCost: Improving Large Scale Hierarchical Classification with Cost Sensitive Learning

Charuvaka, Anveshi; Rangwala, Huzefa

doi:10.1007/978-3-319-23528-8_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9284)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Hierarchical Classification (HC) is an important problem with a wide range of applications in domains such as music genre classification, protein function classification, and document classification. Although several innovative classification methods have been proposed to address HC, most of them do not scale to web-scale problems. Simple methods such as top-down “pachinko” style classification and flat classification scale well, but they either have poor classification performance or do not effectively use the hierarchical information. Current methods that incorporate hierarchical information in a principled manner are often computationally expensive and unable to scale to large datasets. In the current work, we adopt a cost-sensitive classification approach to the hierarchical classification problem by defining misclassification cost based on the hierarchy. This approach effectively decouples the models for various classes, allowing us to efficiently train effective models for large hierarchies in a distributed fashion.
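
To make the idea of a hierarchy-based misclassification cost concrete, the following is a minimal sketch, not the authors' implementation. It assumes the cost of confusing two classes is their tree distance (number of edges between them), and that positive examples receive the largest negative cost; both choices, the toy hierarchy `PARENT`, and the function names are illustrative assumptions not stated in the abstract. Each class yields its own weighted binary task, which is what allows the per-class models to be trained independently.

```python
# Hypothetical toy hierarchy (assumption): child -> parent, root has parent None.
PARENT = {
    "root": None,
    "music": "root",
    "sports": "root",
    "jazz": "music",
    "rock": "music",
    "tennis": "sports",
}


def path_to_root(node, parent=PARENT):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def tree_distance(a, b, parent=PARENT):
    """Number of edges on the unique path between nodes a and b."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a, parent))}
    for steps_from_b, n in enumerate(path_to_root(b, parent)):
        if n in steps_from_a:  # first shared ancestor on b's path
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes are not in the same tree")


def binary_task_for_class(target, labels, parent=PARENT):
    """Build one independent, cost-weighted binary task for `target`.

    Returns a list of (y, weight) pairs, one per training label.
    Negatives are weighted by tree distance to `target` (assumed cost).
    Positives get the largest negative weight, or 1 if there are no
    negatives (also an assumption made for this sketch).
    """
    neg_costs = [tree_distance(target, l, parent) for l in labels if l != target]
    pos_weight = max(neg_costs, default=1)
    task = []
    for l in labels:
        if l == target:
            task.append((+1, pos_weight))
        else:
            task.append((-1, tree_distance(target, l, parent)))
    return task


if __name__ == "__main__":
    labels = ["jazz", "rock", "tennis"]
    for cls in ("jazz", "rock", "tennis"):
        # Each task depends only on `cls`, so tasks could run on separate workers.
        print(cls, binary_task_for_class(cls, labels))
```

The resulting weights could be passed as per-example weights to any binary learner that accepts them; mistaking a jazz document for rock (a sibling) is then penalized less than mistaking it for tennis (a distant class).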

Author information

Authors and Affiliations

George Mason University, Fairfax, USA
Anveshi Charuvaka & Huzefa Rangwala

Corresponding author

Correspondence to Anveshi Charuvaka.

Editor information

Editors and Affiliations

University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
University of Porto, Porto, Portugal
Pedro Pereira Rodrigues
University of Porto - CRACS/INESC TEC, Porto, Portugal
Vítor Santos Costa
University of Porto - INESC TEC, Porto, Portugal
Carlos Soares
University of Porto - INESC TEC, Porto, Portugal
João Gama
University of Porto - INESC TEC, Porto, Portugal
Alípio Jorge

About this paper

Cite this paper

Charuvaka, A., Rangwala, H. (2015). HierCost: Improving Large Scale Hierarchical Classification with Cost Sensitive Learning. In: Appice, A., Rodrigues, P., Santos Costa, V., Soares, C., Gama, J., Jorge, A. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9284. Springer, Cham. https://doi.org/10.1007/978-3-319-23528-8_42

DOI: https://doi.org/10.1007/978-3-319-23528-8_42
Published: 29 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23527-1
Online ISBN: 978-3-319-23528-8
eBook Packages: Computer Science, Computer Science (R0)
