SEuS: Structure Extraction Using Summaries

Ghazizadeh, Shayan; Chawathe, Sudarshan S.

doi:10.1007/3-540-36182-0_9

Shayan Ghazizadeh & Sudarshan S. Chawathe

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2534)

Abstract

We study the problem of finding frequent structures in semistructured data (represented as a directed labeled graph). Frequent structures are graphs that are isomorphic to a large number of subgraphs in the data graph. Frequent structures form building blocks for visual exploration and data mining of semistructured data. We overcome the inherent computational complexity of the problem by using a summary data structure to prune the search space and to provide interactive feedback. We present an experimental study of our methods operating on real datasets. The implementation of our methods is capable of operating on datasets that are two to three orders of magnitude larger than those described in prior work.
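
To illustrate how a summary can prune candidate structures before any expensive subgraph matching, here is a minimal sketch. It is not the method of the paper: the summary form (edge counts per pair of node labels), the toy data graph, and the names `build_summary`, `support_upper_bound`, and `prune` are all assumptions made for this example. The bound used is the product of the per-edge counts, which can never be smaller than the true number of matches of a candidate, so discarding candidates whose bound falls below the threshold is safe.

```python
from collections import Counter
from math import prod

# Hypothetical toy data graph: node id -> label, plus directed edges.
labels = {1: "paper", 2: "author", 3: "author", 4: "paper", 5: "title", 6: "author"}
edges = [(1, 2), (1, 3), (1, 5), (4, 6)]


def build_summary(labels, edges):
    """Collapse the data graph to edge counts per (source label, target label)."""
    return Counter((labels[u], labels[v]) for u, v in edges)


def support_upper_bound(summary, pattern_edges):
    """Upper bound on the matches of a candidate structure.

    Each match picks one data edge per pattern edge, so the number of
    matches cannot exceed the product of the label-pair edge counts.
    A label pair absent from the summary contributes 0.
    """
    return prod(summary[pair] for pair in pattern_edges)


def prune(summary, candidates, threshold):
    """Keep only candidates whose upper bound reaches the support threshold."""
    return [c for c in candidates if support_upper_bound(summary, c) >= threshold]


summary = build_summary(labels, edges)
candidates = [
    [("paper", "author")],
    [("paper", "title")],
    [("paper", "author"), ("paper", "title")],
    [("author", "paper")],
]
survivors = prune(summary, candidates, threshold=2)
```

With a threshold of 2, the candidates `paper -> title` and `author -> paper` are discarded using the summary alone; only the survivors would need to be checked against the full data graph.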

Author information

Authors and Affiliations

Department of Computer Science, University of Maryland, College Park, MD
Shayan Ghazizadeh & Sudarshan S. Chawathe

Editor information

Editors and Affiliations

Deutsches Forschungszentrum für Künstliche Intelligenz, Stuhlsatzenhausweg 3, 66123, Saarbrücken, Germany
Steffen Lange
National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, 101-8430, Tokyo, Japan
Ken Satoh
Department of Computer Science, University of Maryland, College Park, 20742, Maryland, MD, USA
Carl H. Smith

About this paper

Cite this paper

Ghazizadeh, S., Chawathe, S.S. (2002). SEuS: Structure Extraction Using Summaries. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_9

DOI: https://doi.org/10.1007/3-540-36182-0_9
Published: 08 November 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00188-1
Online ISBN: 978-3-540-36182-4
eBook Packages: Springer Book Archive