Toward Three-Stage Automation of Annotation for Human Values

  • Conference paper
Information in Contemporary Society (iConference 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11420)

Abstract

Prior work on automated annotation of human values has sought to train text classifiers to assign text spans labels that reflect specific human values such as freedom, justice, or safety. This approach conflates three tasks: (1) selecting the documents to be labeled, (2) selecting the text spans that express or reflect human values, and (3) assigning labels to those spans. This paper proposes a three-stage model in which a separate system can be trained and optimized for each stage. Experiments on the first stage, document selection, indicate that annotation diversity trumps annotation quality, suggesting that when multiple annotators are available, the traditional practice of adjudicating conflicting annotations of the same documents is less cost-effective than an alternative in which each annotator labels different documents. Preliminary results for the second stage, selecting value sentences, indicate that high recall (94%) can be achieved with levels of precision (above 80%) that seem suitable for use as part of a multi-stage annotation pipeline. The annotations created for these experiments are being made freely available, and the content that was annotated is available from commercial sources at modest cost.
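
As an illustration of how the recall and precision figures quoted for the second stage are defined, the following minimal sketch computes both measures for a binary "value sentence" decision. The function name and the toy labels are assumptions made for this example; they are not data or code from the paper.

```python
def precision_recall(gold, predicted):
    """Compute precision and recall for binary sentence selection.

    gold and predicted are equal-length sequences of booleans, where
    True means "this sentence expresses or reflects a human value".
    """
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g and p)
    false_pos = sum(1 for g, p in pairs if not g and p)
    false_neg = sum(1 for g, p in pairs if g and not p)
    # Guard against division by zero when nothing is selected or nothing is gold.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Toy labels for illustration only (hypothetical, not from the paper).
gold = [True, True, True, False, False, True]
predicted = [True, True, False, True, False, True]

p, r = precision_recall(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f}")
```

In a multi-stage pipeline, high recall at this stage matters because a value sentence missed here cannot be recovered by the later label-assignment stage, whereas a false positive can still be rejected downstream.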

Notes

  1. Two documents in Round 9 were discovered to be duplicates and removed.

  2. As fastText parameters we selected: -dim 50; -loss ns (negative sampling); -epoch 1000; -wordNgrams 5; -minCount 2.

  3. We have also generated learning curves using all of the adjudicated annotations, again plotting only at even values for the number of original annotations. This yields similar results.
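
The fastText parameters listed in Note 2 could be passed to the fastText command-line tool roughly as sketched below. This is a sketch under stated assumptions: the use of the `supervised` subcommand and the file names `train.txt` and `values_model` are hypothetical choices for illustration, and the code only assembles the command rather than running it.

```python
import shlex

# Parameter values as reported in Note 2 ("ns" is negative sampling).
FASTTEXT_PARAMS = {
    "dim": 50,
    "loss": "ns",
    "epoch": 1000,
    "wordNgrams": 5,
    "minCount": 2,
}


def build_fasttext_command(train_path, model_prefix, params=FASTTEXT_PARAMS):
    """Return the argument list for a `fasttext supervised` training run.

    train_path and model_prefix are hypothetical file names; the choice
    of the `supervised` subcommand is an assumption, not stated in the note.
    """
    argv = ["fasttext", "supervised", "-input", train_path, "-output", model_prefix]
    for name, value in params.items():
        argv += [f"-{name}", str(value)]
    return argv


command = build_fasttext_command("train.txt", "values_model")
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run` on a machine where the `fasttext` binary is installed.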

Acknowledgements

This work has been supported in part by JSPS KAKENHI Grant Number JP18H03495.

Author information

Corresponding author

Correspondence to Emi Ishita.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ishita, E. et al. (2019). Toward Three-Stage Automation of Annotation for Human Values. In: Taylor, N., Christian-Lamb, C., Martin, M., Nardi, B. (eds) Information in Contemporary Society. iConference 2019. Lecture Notes in Computer Science, vol 11420. Springer, Cham. https://doi.org/10.1007/978-3-030-15742-5_18

  • DOI: https://doi.org/10.1007/978-3-030-15742-5_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-15741-8

  • Online ISBN: 978-3-030-15742-5

  • eBook Packages: Computer Science, Computer Science (R0)
