A performance evaluation of automatic survey classifiers

Viechnicki, Peter

doi:10.1007/BFb0054080

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1433)

Included in the following conference series:

International Colloquium on Grammatical Inference

Abstract

A novel NLP task, automatic survey coding, is described, and two methods for performing this task are presented. The first method uses a Boolean pattern-matching strategy to code survey responses, while the second uses a vector-based (probabilistic) method. The performance of the two methods is tested and compared on three representative survey datasets. The Boolean method is shown to perform slightly better on average than the vector-based method. Linguistic factors affecting the difficulty of the coding task for each survey are discussed.
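
To make the contrast between the two approaches concrete, the following is a minimal sketch, not the chapter's implementation. It assumes a Boolean coder built from hand-written AND-of-OR keyword rules, and it substitutes plain cosine similarity against per-code term-count centroids for the vector-based (probabilistic) method, whose details the abstract does not give. The code labels, rules, and training responses are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a response and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Hypothetical Boolean rules: a code applies when every clause (a set of
# alternative keywords) has at least one word present in the response.
BOOLEAN_RULES = {
    "PAY": [{"pay", "salary", "wage", "wages"}],
    "HOURS": [{"hours", "schedule", "shift"}, {"long", "late", "flexible"}],
}


def boolean_code(response, rules=BOOLEAN_RULES, default="OTHER"):
    """Return the first code whose clauses are all satisfied, else default."""
    words = set(tokenize(response))
    for code, clauses in rules.items():
        if all(words & clause for clause in clauses):
            return code
    return default


def train_centroids(examples):
    """Sum the term counts of the hand-coded examples for each code."""
    centroids = {}
    for text, code in examples:
        centroids.setdefault(code, Counter()).update(tokenize(text))
    return centroids


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_code(response, centroids, default="OTHER"):
    """Return the code whose centroid is most similar to the response.

    Assumption: cosine similarity stands in for the probabilistic
    scoring mentioned in the abstract.
    """
    vec = Counter(tokenize(response))
    best_code, best_score = default, 0.0
    for code, centroid in centroids.items():
        score = cosine(vec, centroid)
        if score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical hand-coded training responses.
TRAINING = [
    ("the pay is too low", "PAY"),
    ("better salary and benefits", "PAY"),
    ("long hours every week", "HOURS"),
    ("the shift schedule is not flexible", "HOURS"),
]
CENTROIDS = train_centroids(TRAINING)
```

In this sketch the Boolean coder needs no training data but only fires on words its rules anticipate, whereas the vector coder generalizes from coded examples at the cost of needing them. This is one way to picture the trade-off the abstract compares, under the assumptions stated above.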

Author information

Authors and Affiliations

Department of Linguistics, The University of Chicago, USA
Peter Viechnicki

Editor information

Vasant Honavar, Giora Slutzki

Cite this paper

Viechnicki, P. (1998). A performance evaluation of automatic survey classifiers. In: Honavar, V., Slutzki, G. (eds) Grammatical Inference. ICGI 1998. Lecture Notes in Computer Science, vol 1433. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054080

DOI: https://doi.org/10.1007/BFb0054080
Published: 23 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64776-8
Online ISBN: 978-3-540-68707-8
eBook Packages: Springer Book Archive
