Abstract
This chapter addresses knowledge discovery in text collections and the dynamic display of the discovered knowledge. We claim that these two problems are deeply intertwined and should be considered together. The contribution of this chapter is fourfold: (1) a description of the properties needed for a high-level representation of concept relations in text; (2) a stochastic measure for fast evaluation of dependencies between concepts; (3) a visualization algorithm for displaying dynamic structures; and (4) a deep integration of discovery and knowledge visualization, i.e., the placement of nodes and edges automatically guides the discovery of the knowledge to be displayed. The resulting program has been tested on two data sets drawn from the domains of molecular biology and WWW how-tos. We have also integrated the proposed discovery and visualization methods into a more refined Web mining process.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Dubois, V., Quafafou, M. (2004). Graph Discovery and Visualization from Textual Data. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_12
DOI: https://doi.org/10.1007/978-3-662-07952-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07378-6
Online ISBN: 978-3-662-07952-2
eBook Packages: Springer Book Archive