Small Worlds of Concepts and Other Principles of Semantic Search

  • Stefan Bordag
  • Gerhard Heyer
  • Uwe Quasthoff
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2877)


Combining the strengths of classic information retrieval with the distributed approach of P2P networks can avoid the weaknesses of both: organising document collections that are relevant to special communities allows both high coverage and quick access. We present a theoretical framework in which the semantic structure between words can be deduced from a document collection. This structural knowledge can then be used to connect document collections to communities based on their content.


Keywords: Random Graph, Regular Graph, Word Form, Small World, Short Path Length
These keywords were added by machine and not by the authors.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Stefan Bordag (1)
  • Gerhard Heyer (1)
  • Uwe Quasthoff (1)
  1. Natural Language Processing Department, Leipzig University Computer Science Institute, Leipzig
