Mapping an Automated Survey Coding Task into a Probabilistic Text Categorization Framework

Giorgetti, Daniela; Prodanof, Irina; Sebastiani, Fabrizio

doi:10.1007/3-540-45433-0_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2389)

Included in the following conference series:

International Conference for Natural Language Processing in Portugal

Abstract

This paper describes how to apply a probabilistic text categorization method to a new domain in which the documents are answers to open-ended questionnaires and the codes, viewed as categories, are organized in a hierarchical model. Exploiting the hierarchical organization of the categories allows a reduced-size training set to be used. The system developed within this framework aims to help psychologists evaluate open-ended surveys that inquire about job candidates’ competencies.
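
The abstract does not say which probabilistic model is used, so the following is only an illustrative sketch, assuming a multinomial Naive Bayes classifier with add-one smoothing. The toy answers, the code labels of the form "parent/child", and the function names `train` and `classify` are all hypothetical. It shows one way a code hierarchy can help when training data is scarce: answers labeled with child codes are pooled under their parent code, so each top-level category is trained on more examples.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Collect per-code word counts from (answer_text, code) pairs."""
    word_counts = defaultdict(Counter)  # code -> word frequencies
    doc_counts = Counter()              # code -> number of answers
    vocab = set()
    for text, code in examples:
        words = text.lower().split()
        word_counts[code].update(words)
        doc_counts[code] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Return the code with the highest log posterior for `text`."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_code, best_score = None, -math.inf
    for code in doc_counts:
        score = math.log(doc_counts[code] / total_docs)  # log prior
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[code].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words unseen in training are ignored
                score += math.log((word_counts[code][word] + 1) / denom)
        if score > best_score:
            best_code, best_score = code, score
    return best_code

# Hypothetical toy data; real survey answers and codes would differ.
examples = [
    ("i led the team through the project", "leadership/team"),
    ("i motivated my colleagues", "leadership/motivation"),
    ("i analysed the data carefully", "thinking/analytical"),
    ("i found a new idea for the product", "thinking/creative"),
]

# Pool child codes under their parent so each top-level model sees more data.
parent_examples = [(text, code.split("/")[0]) for text, code in examples]
parent_model = train(parent_examples)

print(classify(parent_model, "i led my colleagues"))
```

A second classifier trained only on the answers under the predicted parent could then choose among its child codes. That top-down scheme is an assumption of this sketch, not something the abstract states.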

Research supported by the Sintesi company (Perugia, Italia), which funds the JobNet project.

Author information

Authors and Affiliations

Istituto di Linguistica Computazionale del CNR di Pisa, Italia
Daniela Giorgetti & Irina Prodanof
Istituto di Elaborazione dell’Informazione del CNR di Pisa, Italia
Fabrizio Sebastiani

Editor information

Editors and Affiliations

Universidade de Lisboa e CAUTL (IST), Av. Rovisco Pais, 1049-001, Lisboa, Portugal
Elisabete Ranchhod
L2F/INESC ID Lisboa, Technical University of Lisbon, Av. Rovisco Pais, 1049-001, Lisboa, Portugal
Nuno J. Mamede

About this paper

Cite this paper

Giorgetti, D., Prodanof, I., Sebastiani, F. (2002). Mapping an Automated Survey Coding Task into a Probabilistic Text Categorization Framework. In: Ranchhod, E., Mamede, N.J. (eds) Advances in Natural Language Processing. PorTAL 2002. Lecture Notes in Computer Science, vol 2389. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45433-0_18

DOI: https://doi.org/10.1007/3-540-45433-0_18
Published: 21 June 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43829-8
Online ISBN: 978-3-540-45433-5
eBook Packages: Springer Book Archive
