Abstract
Online health fora are increasingly visited by patients seeking help and information related to their health. However, these fora are not limited to patients: a significant number of health professionals actively participate in many discussions. As experts, their posts are particularly valuable, since they can clearly explain problems and symptoms, correct false statements, and give useful advice. For someone interested in trustworthy medical information, retrieving only these posts can be very useful and informative. Unfortunately, extracting such knowledge requires navigating the fora in order to evaluate the information. When done manually, navigation and selection are time-consuming, tedious, difficult, and error-prone. It is thus important to propose a new method for automatically categorizing the information posted by non-experts and by professionals in online health fora. In this paper, we propose a supervised approach that evaluates which components of a post are the most representative, considering vocabularies, uncertainty markers, emotions, misspellings, and interrogative forms, in order to perform this categorization efficiently. Experiments conducted on two real fora show that our approach is efficient for extracting posts written by professionals.
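To make the kinds of post descriptors named in the abstract concrete, the following minimal Python sketch turns one post into a small feature vector. It is an illustration under stated assumptions, not the implementation used in the paper: the function name `post_features`, the three toy lexicons, the dictionary-lookup test for misspellings, and the ratio-based encoding are all hypothetical choices made here for readability.

```python
import re
from typing import Dict, Set

# Hypothetical toy lexicons for illustration only; they are NOT the paper's resources.
MEDICAL_TERMS: Set[str] = {"diagnosis", "symptom", "dosage", "inflammation"}
UNCERTAINTY_MARKERS: Set[str] = {"maybe", "perhaps", "possibly", "probably"}
EMOTION_WORDS: Set[str] = {"afraid", "worried", "scared", "happy", "sad"}


def post_features(text: str, dictionary: Set[str]) -> Dict[str, float]:
    """Map one forum post to a small vector of descriptors (illustrative sketch)."""
    # Lowercased runs of Unicode letters serve as tokens.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty posts

    # Naive sentence split on whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    n_sentences = max(len(sentences), 1)

    def ratio(lexicon: Set[str]) -> float:
        # Fraction of tokens that belong to the given lexicon.
        return sum(t in lexicon for t in tokens) / n_tokens

    return {
        "medical_ratio": ratio(MEDICAL_TERMS),
        "uncertainty_ratio": ratio(UNCERTAINTY_MARKERS),
        "emotion_ratio": ratio(EMOTION_WORDS),
        # Assumption: a token absent from the supplied word list counts as misspelled.
        "misspelling_ratio": sum(t not in dictionary for t in tokens) / n_tokens,
        # Fraction of sentences that end with a question mark.
        "question_ratio": sum(s.endswith("?") for s in sentences) / n_sentences,
    }


if __name__ == "__main__":
    # Tiny hypothetical word list standing in for a real spell-checking dictionary.
    words = MEDICAL_TERMS | UNCERTAINTY_MARKERS | EMOTION_WORDS | {
        "i", "am", "it", "is", "an", "what", "safe",
    }
    post = "I am worried, maybe it is an inflamation? What dosage is safe?"
    print(post_features(post, words))
```

A vector of this shape could then be passed to any supervised classifier. The lexicons and the word list would need to be replaced by real resources for the language of the fora.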
Notes
- 1. www.allodocteurs.fr/forum-rubrique.asp [collected on: 19-11-2013].
- 2. www.masantenet.com/questions.php [collected on: 18-02-2014].
- 3. www.ihtsdo.org/snomed-ct [last access: 06-05-2014].
- 4. www.theriaque.org [last access: 06-05-2014].
- 5. www.nlm.nih.gov/research/umls [last access: 06-05-2014].
- 6. www.aspell.net [last access: 06-05-2014].
- 7. For readability reasons, ngrams have been translated from French to English.
- 8.
Acknowledgement
This paper is based on studies supported by the “Maison des Sciences de l’Homme de Montpellier” (MSH-M) within the framework of the French project “Patient’s mind” (footnote 8).
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Abdaoui, A., Azé, J., Bringay, S., Grabar, N., Poncelet, P. (2014). Predicting Medical Roles in Online Health Fora. In: Besacier, L., Dediu, AH., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2014. Lecture Notes in Computer Science, vol 8791. Springer, Cham. https://doi.org/10.1007/978-3-319-11397-5_19
DOI: https://doi.org/10.1007/978-3-319-11397-5_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11396-8
Online ISBN: 978-3-319-11397-5
eBook Packages: Computer Science (R0)