Graph Discovery and Visualization from Textual Data

  • Chapter
Intelligent Technologies for Information Analysis

Abstract

This chapter addresses knowledge discovery in text collections and the dynamic display of the discovered knowledge. We claim that these two problems are deeply intertwined and should be considered together. The contribution of this chapter is fourfold: (1) a description of the properties needed for a high-level representation of concept relations in text; (2) a stochastic measure for fast evaluation of dependencies between concepts; (3) a visualization algorithm for displaying dynamic structures; and (4) a deep integration of discovery and visualization, in which the placement of nodes and edges automatically guides the discovery of the knowledge to be displayed. The resulting program has been tested on two data sets drawn from the domains of molecular biology and WWW how-tos. We have also integrated the proposed discovery and visualization methods into a more refined Web mining process.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dubois, V., Quafafou, M. (2004). Graph Discovery and Visualization from Textual Data. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_12

  • DOI: https://doi.org/10.1007/978-3-662-07952-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07378-6

  • Online ISBN: 978-3-662-07952-2

  • eBook Packages: Springer Book Archive
