Abstract
TextVis is a visual data mining system for document collections. Such a collection represents an application domain, and the primary goal of the system is to derive patterns that provide knowledge about this domain. The derived patterns can also be used to browse the collection. TextVis takes a multi-strategy approach to text mining and lets the user define complex analysis schemas from basic components provided by the system. An analysis schema is constructed by dragging functional icons from a tool palette onto the workspace and connecting them according to the desired flow of information. The system provides a large collection of basic analysis tools, including frequent sets, associations, concept distributions, and concept correlations. The discovered patterns are presented in a visual interface that allows the user to operate on the results and to access the associated documents. TextVis is a complete text mining system: it uses agent technology to access various online information sources, text preprocessing tools to extract relevant information from the documents, a variety of data mining algorithms, and a set of visual browsers to view the results. This paper provides an overview of the TextVis system. We describe the system's architecture and its various tools, and discuss the advantages of our visual environment for mining large document collections.
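To make two of the basic analysis tools named above more concrete, the following is a minimal Python sketch of frequent concept sets and associations computed over a toy collection. It is an illustration under stated assumptions, not the TextVis implementation: the function names `frequent_sets` and `associations`, the thresholds, and the sample documents are all hypothetical, and each document is assumed to have already been reduced to a set of concept labels by preprocessing.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(docs, min_support, max_size=2):
    """Return concept sets (up to max_size) found in at least min_support documents.

    Hypothetical helper: brute-force counting, adequate only for small examples.
    """
    counts = Counter()
    for concepts in docs:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def associations(freq, min_confidence):
    """Derive rules lhs -> rhs from frequent pairs.

    confidence = support(lhs and rhs) / support(lhs)
    """
    rules = []
    for itemset, support in freq.items():
        if len(itemset) != 2:
            continue
        for lhs in itemset:
            (rhs,) = itemset - {lhs}
            confidence = support / freq[frozenset([lhs])]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, confidence))
    return sorted(rules)


# Toy collection (invented for illustration): each document is a set of concepts.
docs = [
    {"iran", "oil"},
    {"iran", "oil", "usa"},
    {"usa", "trade"},
    {"iran", "usa"},
]

freq = frequent_sets(docs, min_support=2)
for lhs, rhs, confidence in associations(freq, min_confidence=0.8):
    print(f"{lhs} -> {rhs} (confidence {confidence:.2f})")
```

In a system like the one described, each such function would correspond to one component of an analysis schema, with the output of the frequent-set step feeding the association step.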
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Landau, D. et al. (1998). TextVis: An integrated visual environment for text mining. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094805
DOI: https://doi.org/10.1007/BFb0094805
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
eBook Packages: Springer Book Archive