Abstract
Existing work on XML data mining deals only with snapshot XML data, whereas XML data in real applications is dynamic. In this paper, we discover knowledge from XML data by taking its dynamic nature into account. We present a novel approach to extracting pattern-based dynamic structures from versions of unordered XML documents. Using the proposed dynamic metrics, pattern-based dynamic structures are expected to summarize and predict interesting change trends of certain structures based on their past behavior. Two types of pattern-based dynamic structures are considered: increasing dynamic structures and decreasing dynamic structures. Based on our proposed data model, SMH-Tree, we present an algorithm that mines such pattern-based dynamic structures with only two scans of the XML sequence. Experimental results show that the proposed algorithm extracts pattern-based dynamic structures efficiently and scales well.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Q., Bhowmick, S.S., Madria, S. (2004). Discovering Pattern-Based Dynamic Structures from Versions of Unordered XML Documents. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_8
DOI: https://doi.org/10.1007/978-3-540-30076-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2
eBook Packages: Springer Book Archive