Quality Control for Crowdsourced Multi-label Classification Using RAkEL

  • Conference paper
Neural Information Processing (ICONIP 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10634)

Abstract

The quality of labels is one of the major issues in crowdsourced labeling tasks. A convenient method for ensuring label quality is to assign the same labeling task to multiple workers and aggregate their labels. Several statistical aggregation methods have been proposed for single-label classification tasks; however, aggregation for multi-label classification tasks has not been well studied. Although the existing aggregation methods for single-label classification can be applied to multi-label classification tasks, they either are not designed to incorporate relationships among classes or require long computation times. To address these issues, we propose to use RAndom k-labELsets (RAkEL). By incorporating an existing aggregation method for single-label classification tasks into RAkEL, we obtain a novel quality control method for crowdsourced multi-label classification. We demonstrate on real data that our method achieves better quality than the existing methods, especially when spammers are included in the worker pool.
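
To make the idea concrete, the following is a minimal Python sketch of how a single-label aggregator might be plugged into random k-labelsets. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say which single-label aggregation method is used, so plain majority voting stands in for it here, and the names `majority_vote` and `rakel_aggregate`, the default number of labelsets, and the 0.5 threshold are all hypothetical choices.

```python
import random
from collections import Counter
from itertools import combinations


def majority_vote(votes):
    """Stand-in single-label aggregator (an assumption of this sketch).

    Returns the most frequent vote; ties go to the vote seen first.
    """
    return Counter(votes).most_common(1)[0][0]


def rakel_aggregate(worker_labels, num_labels, k=3, num_labelsets=None, seed=0):
    """Aggregate the multi-label answers of several workers for ONE item.

    worker_labels: list of sets of label indices, one set per worker.
    num_labels:    total number of candidate labels.
    k:             size of each random labelset.
    num_labelsets: how many labelsets to sample (hypothetical default below).
    """
    rng = random.Random(seed)
    all_subsets = list(combinations(range(num_labels), k))
    if num_labelsets is None:
        num_labelsets = min(2 * num_labels, len(all_subsets))
    labelsets = rng.sample(all_subsets, num_labelsets)

    positive = [0] * num_labels  # times a label was "on" in a winning pattern
    covered = [0] * num_labels   # times a label appeared in a sampled labelset

    for subset in labelsets:
        # Restricting each worker's answer to the subset gives one of 2**k
        # patterns, which is treated as a single class label.
        votes = [tuple(int(label in answer) for label in subset)
                 for answer in worker_labels]
        winner = majority_vote(votes)
        for label, bit in zip(subset, winner):
            positive[label] += bit
            covered[label] += 1

    # A label is kept if more than half of the labelsets covering it voted
    # for it; labels never covered by a sampled labelset are left out.
    return {label for label in range(num_labels)
            if covered[label] > 0 and positive[label] / covered[label] > 0.5}


if __name__ == "__main__":
    # Two workers agree on labels 0 and 1; a third answers only label 2.
    print(rakel_aggregate([{0, 1}, {0, 1}, {2}], num_labels=3, k=2))
```

Because each labelset is aggregated as one class, labels that tend to co-occur are voted on jointly, while the number of classes per vote stays at most 2^k rather than growing with the full label set. Replacing `majority_vote` with a worker-aware estimator such as the one in [2] would be the natural next step.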

Notes

  1. http://www.eiga-ranking.com.

  2. http://www.lancers.jp.

  3. http://www.aozora.gr.jp/cards/001475/files/52111_47798.html.

  4. http://www.aozora.gr.jp/cards/001475/files/52113_46622.html.

  5. http://www.aozora.gr.jp.

References

  1. Bragg, J., Mausam, Weld, D.S.: Crowdsourcing multi-label classification for taxonomy creation. In: HCOMP (2013)

  2. Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. Appl. Stat. 28, 20–28 (1979)

  3. Demartini, G., Difallah, D.E., Cudré-Mauroux, P.: Large-scale linked data integration using probabilistic reasoning and crowdsourcing. VLDB J. 22, 665–687 (2013)

  4. Duan, L., Oyama, S., Kurihara, M., Sato, H.: Crowdsourced semantic matching of multi-label annotations. In: Proceedings of the 24th International Conference on Artificial Intelligence, pp. 3483–3489 (2015)

  5. Duan, L., Oyama, S., Sato, H., Kurihara, M.: Separate or joint? Estimation of multiple labels from crowdsourced annotations. Expert Syst. Appl. 41(13), 5723–5732 (2014)

  6. Ekman, P.: An argument for basic emotions. Cogn. Emotion 6(3–4), 169–200 (1992)

  7. Godbole, S., Sarawagi, S.: Discriminative methods for multi-labeled classification. In: Dai, H., Srikant, R., Zhang, C. (eds.) PAKDD 2004. LNCS, vol. 3056, pp. 22–30. Springer, Heidelberg (2004). doi:10.1007/978-3-540-24775-3_5

  8. Nakamura, A.: Kanjo Hyogen Jiten [Dictionary of Emotive Expressions]. Tokyodo (1993)

  9. Oyama, S., Baba, Y., Sakurai, Y., Kashima, H.: Accurate integration of crowdsourced labels using workers’ self-reported confidence scores. In: Proceedings of the 23rd International Joint Conference on Artificial Intelligence, pp. 2554–2560 (2013)

  10. Tsoumakas, G., Katakis, I.: Multi-label classification: an overview. Int. J. Data Warehouse. Min. 3(3), 1–11 (2007)

  11. Tsoumakas, G., Katakis, I., Vlahavas, I.: Random k-labelsets for multi-label classification. IEEE Trans. Knowl. Data Eng. 23(7), 1079–1089 (2011)

  12. Welinder, P., Branson, S., Perona, P., Belongie, S.J.: The multidimensional wisdom of crowds. In: Advances in Neural Information Processing Systems, vol. 23, pp. 2424–2432 (2010)

  13. Whitehill, J., Fan Wu, T., Bergsma, J., Movellan, J.R., Ruvolo, P.L.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. In: Advances in Neural Information Processing Systems, vol. 22, pp. 2035–2043 (2009)

Acknowledgments

We thank Lei Duan and Satoshi Oyama for sharing the datasets used in [4, 5].

Corresponding author

Correspondence to Kosuke Yoshimura.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yoshimura, K., Baba, Y., Kashima, H. (2017). Quality Control for Crowdsourced Multi-label Classification Using RAkEL. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10634. Springer, Cham. https://doi.org/10.1007/978-3-319-70087-8_7

  • DOI: https://doi.org/10.1007/978-3-319-70087-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70086-1

  • Online ISBN: 978-3-319-70087-8

  • eBook Packages: Computer Science, Computer Science (R0)
