Trend graphs: Visualizing the evolution of concept relationships in large document collections

Feldman, Ronen; Aumann, Yonatan; Zilberstein, Amir; Ben-Yehuda, Yaron

doi:10.1007/BFb0094803

Communications
Conference paper
First Online: 19 October 2006

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1510)

Abstract

The proliferation of digitally available textual data necessitates automatic tools for analyzing large textual collections. Thus, in analogy to data mining for structured databases, text mining is defined for textual collections. A central tool in text mining is the analysis of concept relationships, which discovers connections between different concepts, as reflected in the corpus. Most previous work on text mining in general, and on concept relationships in particular, viewed the entire corpus as one monolithic entity. However, large corpora are often composed of documents with different characteristics. Most importantly, documents are often tagged with timestamps (e.g., news articles), and thus represent the state of the domain in different time periods. In this paper we introduce a new technique for analyzing and visualizing differences and similarities in the concept relationships, as they are reflected in different segments of the corpus. Focusing on the case of timestamped documents, we introduce Trend Graphs, which provide a graphical tool for analyzing and visualizing the dynamic changes in concept relationships over time. Trend Graphs thus provide a tool for tracking the evolution of the corpus over time, highlighting trends and discontinuities.
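
The abstract does not specify how concept relationships are computed or compared, so the following is only an illustrative sketch and not the authors' method. It assumes that a relationship between two concepts can be approximated by counting the documents in which both appear, that documents arrive already tagged with a period label and a set of concepts, and that an edge is kept once its count reaches a threshold. The function names, the `min_count` parameter, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_period(docs):
    """Count co-occurring concept pairs separately for each period.

    docs: iterable of (period, set_of_concepts) tuples (assumed input shape).
    Returns a dict mapping period -> Counter of sorted concept pairs.
    """
    graphs = {}
    for period, concepts in docs:
        counts = graphs.setdefault(period, Counter())
        # Sorting makes each unordered pair map to a single key.
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return graphs


def edge_changes(prev, curr, min_count=2):
    """Compare two periods: which edges appear, vanish, or persist.

    min_count is an assumed support threshold, not a value from the paper.
    """
    prev_edges = {edge for edge, n in prev.items() if n >= min_count}
    curr_edges = {edge for edge, n in curr.items() if n >= min_count}
    return {
        "new": curr_edges - prev_edges,
        "dropped": prev_edges - curr_edges,
        "persisting": curr_edges & prev_edges,
    }


# Made-up toy corpus: two periods, three placeholder concepts.
docs = [
    ("period-1", {"alpha", "beta"}),
    ("period-1", {"alpha", "beta", "gamma"}),
    ("period-2", {"alpha", "gamma"}),
    ("period-2", {"alpha", "gamma"}),
    ("period-2", {"beta", "gamma"}),
]

graphs = cooccurrence_by_period(docs)
changes = edge_changes(graphs["period-1"], graphs["period-2"])
print(changes)
```

In this toy corpus the alpha-beta edge is supported only in the first period and the alpha-gamma edge only in the second, which is the kind of change between time segments that the abstract says Trend Graphs are meant to highlight.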

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, Bar Ilan University, 52900, Ramat-Gan, Israel
Ronen Feldman, Yonatan Aumann, Amir Zilberstein & Yaron Ben-Yehuda

Editor information

Jan M. Żytkow, Mohamed Quafafou

About this paper

Cite this paper

Feldman, R., Aumann, Y., Zilberstein, A., Ben-Yehuda, Y. (1998). Trend graphs: Visualizing the evolution of concept relationships in large document collections. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094803

DOI: https://doi.org/10.1007/BFb0094803
Published: 19 October 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
eBook Packages: Springer Book Archive
