Discovering Pattern-Based Dynamic Structures from Versions of Unordered XML Documents

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3181)

Abstract

Existing work on XML data mining deals only with snapshot XML data, whereas XML data in real applications is dynamic. In this paper, we discover knowledge from XML data by taking its dynamic nature into account. We present a novel approach to extract pattern-based dynamic structures from versions of unordered XML documents. With the proposed dynamic metrics, the pattern-based dynamic structures are expected to summarize and predict interesting change trends of certain structures based on their past behaviors. Two types of pattern-based dynamic structures are considered: increasing dynamic structures and decreasing dynamic structures. Using our proposed data model, the SMH-Tree, we present an algorithm that mines such pattern-based dynamic structures with only two scans of the XML sequence. Experimental results show that the proposed algorithm extracts the pattern-based dynamic structures efficiently and scales well.

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, Q., Bhowmick, S.S., Madria, S. (2004). Discovering Pattern-Based Dynamic Structures from Versions of Unordered XML Documents. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_8

  • DOI: https://doi.org/10.1007/978-3-540-30076-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22937-7

  • Online ISBN: 978-3-540-30076-2

  • eBook Packages: Springer Book Archive
