Comparison of Clustering Approaches through Their Application to Pharmacovigilance Terms

Dupuch, Marie; Engström, Christopher; Silvestrov, Sergei; Hamon, Thierry; Grabar, Natalia

doi:10.1007/978-3-642-38326-7_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7885)

Included in the following conference series:

Conference on Artificial Intelligence in Medicine in Europe

Abstract

In many applications (e.g., information retrieval, filtering or analysis), it is useful to detect similar terms and to make it possible to use them jointly. Term clustering is one method that can be exploited for this purpose. In this study, we test three term-clustering methods (hierarchical ascendant classification, Radius and maximum), combine them with semantic distance algorithms, and compare them through the results they provide when applied to terms from the pharmacovigilance domain. The comparison indicates that the non-disjoint clustering methods (Radius and maximum) outperform disjoint clustering by 10 to 20 points in all the experiments.
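
The difference between disjoint and non-disjoint clustering can be illustrated with a minimal sketch. The abstract does not define the Radius and maximum methods, so the code below is only an assumption-laden illustration: `disjoint_clusters` is a generic single-linkage agglomerative procedure (hierarchical ascendant classification is commonly known as agglomerative hierarchical clustering), and `overlapping_clusters` is a simple "all terms within a given distance of a center" scheme that stands in for a radius-style method. The terms, the distance values, and all function names are hypothetical and are not taken from the study.

```python
from itertools import combinations

# Hypothetical toy terms and pairwise semantic distances (illustrative only,
# not data or distances from the study).
TERMS = ["nausea", "vomiting", "emesis", "rash"]
DIST = {
    frozenset(("nausea", "vomiting")): 1.0,
    frozenset(("vomiting", "emesis")): 1.0,
    frozenset(("nausea", "emesis")): 2.0,
    frozenset(("nausea", "rash")): 5.0,
    frozenset(("vomiting", "rash")): 5.0,
    frozenset(("emesis", "rash")): 5.0,
}


def distance(a, b):
    """Look up the symmetric distance between two terms."""
    return 0.0 if a == b else DIST[frozenset((a, b))]


def linkage(c1, c2):
    """Single linkage: distance between the closest members of two clusters."""
    return min(distance(a, b) for a in c1 for b in c2)


def disjoint_clusters(terms, threshold):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    until the closest pair is farther apart than `threshold`.
    Each term ends up in exactly one cluster."""
    clusters = [{t} for t in terms]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # safe: combinations guarantees i < j
    return clusters


def overlapping_clusters(terms, radius):
    """Assumed radius-style scheme: every term is a center, and its cluster
    holds all terms within `radius`. A term may belong to several clusters."""
    return {c: {t for t in terms if distance(c, t) <= radius} for c in terms}


if __name__ == "__main__":
    print(disjoint_clusters(TERMS, threshold=1.0))
    print(overlapping_clusters(TERMS, radius=1.0))
```

In this toy setting the disjoint procedure places each term in a single cluster, while the overlapping scheme lets "vomiting" appear in the clusters centered on "nausea", "vomiting" and "emesis". Whether this corresponds to the behavior of the methods evaluated in the paper cannot be determined from the abstract alone.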

Author information

Authors and Affiliations

CNRS UMR 8163 STL, Université Lille 3, 59653, Villeneuve d’Ascq, France
Marie Dupuch & Natalia Grabar
Division of Applied Mathematics, Mälardalen University, Västerås, Sweden
Christopher Engström & Sergei Silvestrov
LIM&BIO (EA3969), Université Paris 13, Sorbonne Paris Cité, France
Thierry Hamon

Editor information

Editors and Affiliations

Academic Medical Center, Dept. of Medical Informatics, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, The Netherlands
Niels Peek
Dept. of Information Engineering and Communications, University of Murcia, Campus de Espinardo, 30100, Espinardo, Murcia, Spain
Roque Marín Morales
Department of Information Systems, University of Haifa, Rabin Bldg, Haifa, Israel
Mor Peleg

About this paper

Cite this paper

Dupuch, M., Engström, C., Silvestrov, S., Hamon, T., Grabar, N. (2013). Comparison of Clustering Approaches through Their Application to Pharmacovigilance Terms. In: Peek, N., Marín Morales, R., Peleg, M. (eds) Artificial Intelligence in Medicine. AIME 2013. Lecture Notes in Computer Science, vol 7885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38326-7_9

DOI: https://doi.org/10.1007/978-3-642-38326-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38325-0
Online ISBN: 978-3-642-38326-7
eBook Packages: Computer Science, Computer Science (R0)
