Abstract
Determining characteristic and discriminating terms, together with their semantic relationships, plays a vital role in text processing applications; term clustering techniques, for example, rely heavily on this information. Classic approaches to this task, such as statistical co-occurrence analysis, usually consider only relationships between two terms that co-occur as immediate neighbours or within the same sentence. This article presents flexible approaches for finding statistically significant correlations between two or more terms using co-occurrence windows of arbitrary size. Their applicability is discussed in detail through solutions that improve interactive and image-based search on the World Wide Web. Approaches for determining directed term associations, and applications of them, are also explained.
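To make the idea of a co-occurrence window of arbitrary size concrete, the following is a minimal sketch and not the authors' implementation. It assumes a pre-tokenised text, a sliding window measured in tokens, and the Dice coefficient as the association measure; the function names `window_cooccurrences` and `dice` are hypothetical, and the abstract does not say which significance measure the chapter actually uses.

```python
from collections import Counter
from itertools import combinations


def window_cooccurrences(tokens, window_size):
    """Count, over all sliding windows of `window_size` tokens, how many
    windows contain each term and how many contain each unordered term pair.
    (Illustrative helper; the windowing scheme is an assumption.)"""
    term_windows = Counter()  # windows containing a given term
    pair_windows = Counter()  # windows containing both terms of a pair
    n_windows = max(len(tokens) - window_size + 1, 1)
    for start in range(n_windows):
        # A set, so that a term repeated inside one window counts once.
        window = set(tokens[start:start + window_size])
        term_windows.update(window)
        pair_windows.update(
            frozenset(pair) for pair in combinations(sorted(window), 2)
        )
    return term_windows, pair_windows


def dice(term_windows, pair_windows, a, b):
    """Dice coefficient on window counts: 2 * n_ab / (n_a + n_b)."""
    both = pair_windows[frozenset((a, b))]
    total = term_windows[a] + term_windows[b]
    return 2.0 * both / total if total else 0.0


if __name__ == "__main__":
    text = "web search engines rank web pages for search queries".split()
    terms, pairs = window_cooccurrences(text, window_size=4)
    print(dice(terms, pairs, "web", "search"))
```

Setting `window_size` to 2 restricts the count to immediate neighbours, while a larger value approximates sentence-level or wider contexts. Extending the sketch from pairs to groups of three or more terms would amount to changing the group size passed to `combinations`.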
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kubek, M.M., Unger, H., Dusik, J. (2015). Correlating Words - Approaches and Applications. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_3
DOI: https://doi.org/10.1007/978-3-319-23192-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23191-4
Online ISBN: 978-3-319-23192-1
eBook Packages: Computer Science (R0)