nCoder+: A Semantic Tool for Improving Recall of nCoder Coding

  • Conference paper
Advances in Quantitative Ethnography (ICQE 2019)

Abstract

Coding is the process of assigning meaning to a given piece of evidence. Evidence may be found in a variety of data types, including documents, research interviews, social media posts, conversations from learning platforms, or any other source of data that may provide insight into the questions under qualitative study. In this study, we focus on text data and treat coding as a process of identifying words or phrases and categorizing them into codes to facilitate data analysis. There are a number of different approaches to generating qualitative codes, such as grounded coding, a priori coding, or a combination of both in an iterative process. However, qualitative and quantitative analysts face the same coding problem: when the dataset is large, manual coding becomes impractical. nCoder is a tool that helps researchers discover and code key concepts in text data with minimal human judgment. Once reliability and validity are established, nCoder automatically applies the coding scheme to the dataset. However, for concepts that occur infrequently, the classifier may still produce too many false negatives even when its reliability is acceptable. This paper explores these problems in the current nCoder and proposes adding a semantic component to it. A tool called “nCoder+” is presented with real data to demonstrate the usefulness of the semantic component. Possible ways of integrating this component and other natural language processing techniques into nCoder are discussed.
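
The point about infrequent concepts can be illustrated with a small worked example. The sketch below is not taken from the paper: the counts are invented for illustration, and the helper names `cohen_kappa` and `recall` are hypothetical. It assumes a dataset of 1,000 segments in which a human rater marks 20 as containing a rare code, while an automated classifier finds 14 of those 20 and raises no false alarms. Agreement measured by Cohen's kappa [15] comes out high, yet 30% of the true occurrences are missed.

```python
def cohen_kappa(human, machine):
    """Cohen's kappa for two binary (0/1) codings of the same segments."""
    n = len(human)
    # Observed agreement: share of segments where both codings match.
    p_observed = sum(h == m for h, m in zip(human, machine)) / n
    # Chance agreement, from each rater's base rate of assigning the code.
    p_h = sum(human) / n
    p_m = sum(machine) / n
    p_expected = p_h * p_m + (1 - p_h) * (1 - p_m)
    return (p_observed - p_expected) / (1 - p_expected)


def recall(human, machine):
    """Share of human-coded positives that the classifier also found."""
    true_pos = sum(1 for h, m in zip(human, machine) if h == 1 and m == 1)
    false_neg = sum(1 for h, m in zip(human, machine) if h == 1 and m == 0)
    return true_pos / (true_pos + false_neg)


# Invented counts (assumption): 20 positives in 1,000 segments,
# classifier catches 14 of them and produces no false positives.
human = [1] * 20 + [0] * 980
machine = [1] * 14 + [0] * 6 + [0] * 980

print(f"kappa  = {cohen_kappa(human, machine):.3f}")  # kappa  = 0.821
print(f"recall = {recall(human, machine):.2f}")       # recall = 0.70
```

Because the 980 true negatives dominate the agreement statistic, the six missed segments barely lower kappa, which is why a recall-oriented check is useful alongside a reliability measure for rare codes.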

Notes

  1. Not all researchers perform IRR tests. For example, researchers may use social moderation, where two or more raters code all of the data and resolve differences until they all agree on the code (Herrenkohl and Cornelius) [14].

References

  1. Shaffer, D.W.: Quantitative Ethnography. Cathcart Press, Madison (2017)

  2. Chi, M.T.H.: Quantifying qualitative analyses of verbal data: a practical guide. J. Learn. Sci. 6, 271–315 (1997)

  3. Saldaña, J.: The Coding Manual for Qualitative Researchers (2014). https://doi.org/10.1007/s13398-014-0173-7.2

  4. Glaser, B.G., Strauss, A.L.: The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine Transaction, New Brunswick (1967)

  5. Charmaz, K.: Constructing Grounded Theory. SAGE, London (2006)

  6. Eagan, B.R., Rogers, B., Serlin, R., Ruis, A.R., Irgens, G.A., Shaffer, D.W.: Can we rely on IRR? Testing the assumptions of inter-rater reliability. In: CSCL 2017 Proceedings, pp. 529–532 (2017)

  7. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003). https://doi.org/10.1162/jmlr.2003.3.4-5.993

  8. Hu, Y., Boyd-Graber, J., Satinoff, B.: Interactive topic modeling. In: Proceedings of the 49th Annual Meeting Association for Computational Linguistics Human Language Technologies, pp. 248–257 (2011)

  9. Marquart, C.L., Swiecki, Z., Eagan, B., Shaffer, D.W.: ncodeR (Version 0.1.2) (2018)

  10. Eagan, B.R., Rogers, B., Pozen, R., Marquart, C., Shaffer, D.W.: rhoR: Rho for inter rater reliability (Version 1.1.0) (2016). https://cran.r-project.org/web/packages/rhoR/index.html

  11. Gašević, D., Joksimović, S., Eagan, B., Shaffer, D.W.: SENS: network analytics to combine social and cognitive perspectives of collaborative learning. Comput. Hum. Behav. 92, 562–577 (2019)

  12. Cai, Z., Pennebaker, J.W., Eagan, B., Shaffer, D.W., Dowell, N.M., Graesser, A.C.: Epistemic network analysis and topic modeling for chat data from collaborative learning environment. In: Proceedings of the 10th International Conference on Educational Data Mining, pp. 104–111 (2017)

  13. Sullivan, S., et al.: Using epistemic network analysis to identify targets for educational interventions in trauma team communication. Surg. (United States) 163, 938–943 (2018). https://doi.org/10.1016/j.surg.2017.11.009

  14. Shaffer, D.W., Ruis, A.R.: Epistemic network analysis: a worked example of theory-based learning analytics. In: Handbook of Learning Analytics Data Mining, in press (2017)

  15. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Meas. 20(1), 37–46 (1960). https://doi.org/10.1177/001316446002000104

  16. Landauer, T., McNamara, D., Dennis, S., Kintsch, W.: Handbook of Latent Semantic Analysis (2007)

Acknowledgements

The research was supported by the National Science Foundation (SBR 9720314, REC 0106965, REC 0126265, ITR 0325428, REESE 0633918, ALT-0834847, DRK-12-0918409, 1108845; DRL-1661036, 1713110; ACI-1443068), the Institute of Education Sciences (R305H050169, R305B070349, R305A080589, R305A080594, R305G020018, R305C120001), the Army Research Lab (W911INF-12-2-0030), the Office of Naval Research (N00014-00-1-0600, N00014-12-C-0643; N00014-16-C-3027), the Wisconsin Alumni Research Foundation, and the Office of the Vice Chancellor for Research and Graduate Education at the University of Wisconsin-Madison. The opinions, findings, and conclusions do not reflect the views of the funding agencies, cooperating institutions, or other individuals.

Author information

Corresponding author

Correspondence to Zhiqiang Cai.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Cai, Z., Siebert-Evenstone, A., Eagan, B., Shaffer, D.W., Hu, X., Graesser, A.C. (2019). nCoder+: A Semantic Tool for Improving Recall of nCoder Coding. In: Eagan, B., Misfeldt, M., Siebert-Evenstone, A. (eds) Advances in Quantitative Ethnography. ICQE 2019. Communications in Computer and Information Science, vol 1112. Springer, Cham. https://doi.org/10.1007/978-3-030-33232-7_4

  • DOI: https://doi.org/10.1007/978-3-030-33232-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-33231-0

  • Online ISBN: 978-3-030-33232-7

  • eBook Packages: Computer Science, Computer Science (R0)
