Correlating Words - Approaches and Applications

Kubek, Mario M.; Unger, Herwig; Dusik, Jan

doi:10.1007/978-3-319-23192-1_3

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9256)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns

Abstract

The determination of characteristic and discriminating terms, as well as of their semantic relationships, plays a vital role in text processing applications. Term clustering techniques, for example, rely heavily on this information. Classic approaches for this purpose, such as statistical co-occurrence analysis, usually consider only relationships between two terms that co-occur as immediate neighbours or at sentence level. This article presents flexible approaches for finding statistically significant correlations between two or more terms using co-occurrence windows of arbitrary size. Their applicability is discussed in detail by presenting solutions that improve interactive and image-based search in the World Wide Web. In addition, approaches for determining directed term associations, and applications for them, are explained.
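
To make the idea of a co-occurrence window of arbitrary size concrete, the following is a minimal sketch and not the authors' implementation. It assumes a sliding window over an already tokenised text and uses the Dice coefficient as one possible association score; the function names (sliding_windows, cooccurrence_counts, dice) and the choice of measure are illustrative assumptions, since the abstract does not specify them.

```python
from collections import Counter
from itertools import combinations


def sliding_windows(tokens, window_size):
    """Yield every run of `window_size` consecutive tokens (assumed windowing scheme)."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    last_start = max(len(tokens) - window_size, 0)
    for start in range(last_start + 1):
        yield tokens[start:start + window_size]


def cooccurrence_counts(tokens, window_size):
    """Count, per window, which terms occur and which unordered term pairs co-occur."""
    term_counts = Counter()  # number of windows containing each term
    pair_counts = Counter()  # number of windows containing both terms of a pair
    for window in sliding_windows(tokens, window_size):
        terms = sorted(set(window))  # distinct terms, sorted so pair keys are canonical
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    return term_counts, pair_counts


def dice(term_a, term_b, term_counts, pair_counts):
    """Dice coefficient 2*n_ab / (n_a + n_b); used here only as an example measure."""
    pair = tuple(sorted((term_a, term_b)))
    denominator = term_counts[term_a] + term_counts[term_b]
    return 2 * pair_counts[pair] / denominator if denominator else 0.0


if __name__ == "__main__":
    tokens = "a b c a b".split()
    term_counts, pair_counts = cooccurrence_counts(tokens, window_size=2)
    print(dice("a", "b", term_counts, pair_counts))
```

Changing window_size moves the analysis from immediate neighbours (size 2) towards sentence-sized or larger contexts. Under the same assumptions, replacing combinations(terms, 2) with combinations(terms, 3) would count co-occurring term triples instead of pairs.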

Author information

Authors and Affiliations

Chair of Communication Networks, FernUniversität in Hagen, Hagen, Germany
Mario M. Kubek, Herwig Unger & Jan Dusik

Corresponding author

Correspondence to Mario M. Kubek.

Editor information

Editors and Affiliations

University of Malta, Msida, Malta
George Azzopardi
University of Groningen, Groningen, The Netherlands
Nicolai Petkov

About this paper

Cite this paper

Kubek, M.M., Unger, H., Dusik, J. (2015). Correlating Words - Approaches and Applications. In: Azzopardi, G., Petkov, N. (eds) Computer Analysis of Images and Patterns. CAIP 2015. Lecture Notes in Computer Science, vol 9256. Springer, Cham. https://doi.org/10.1007/978-3-319-23192-1_3

DOI: https://doi.org/10.1007/978-3-319-23192-1_3
Published: 25 August 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23191-4
Online ISBN: 978-3-319-23192-1
eBook Packages: Computer Science; Computer Science (R0)