A Data Mining Approach to XML Dissemination

Wang, Xiaoling; Ester, Martin; Qian, Weining; Zhou, Aoying

doi:10.1007/978-3-642-17616-6_40

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6488)

Included in the following conference series:

International Conference on Web Information Systems Engineering

Abstract

Currently, users' interests in XML dissemination applications are expressed as XPath or XQuery queries. Writing these queries requires a good knowledge of the structure and contents of the documents that will arrive, as well as knowledge of XQuery, which few consumers have. In some cases, where distinguishing relevant from irrelevant documents requires considering a large number of features, formulating such a query may be impossible. This paper introduces a data mining approach to XML dissemination that uses a given document collection of the user to automatically learn a classifier modelling his or her information needs. It also discusses the corresponding optimization methods that allow a dissemination server to execute a massive number of classifiers simultaneously. An experimental evaluation on several real XML document sets demonstrates the accuracy and efficiency of the proposed approach.
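
The idea of replacing hand-written queries with learned classifiers can be illustrated with a minimal sketch. The abstract does not say which features or which classifier the paper uses, so everything below is an assumption made for illustration: documents are represented by their root-to-node tag paths, each subscriber gets a small Bernoulli naive Bayes model (the hypothetical `PathNaiveBayes`), and the hypothetical `disseminate` function extracts features once per incoming document and shares them across all subscriber classifiers. It is not the authors' algorithm or their optimization method.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def path_features(xml_text):
    """Return the set of root-to-node tag paths found in one XML document."""
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


class PathNaiveBayes:
    """Bernoulli naive Bayes over tag-path features (illustrative assumption)."""

    def fit(self, feature_sets, labels):
        self.vocab = set().union(*feature_sets)
        self.class_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.class_counts}
        for feats, label in zip(feature_sets, labels):
            self.feature_counts[label].update(feats)
        return self

    def score(self, feats, label):
        n = self.class_counts[label]
        log_p = math.log(n / sum(self.class_counts.values()))
        for f in self.vocab:
            # Laplace-smoothed probability that path f occurs in this class.
            p = (self.feature_counts[label][f] + 1) / (n + 2)
            log_p += math.log(p if f in feats else 1 - p)
        return log_p

    def predict(self, feats):
        return max(self.class_counts, key=lambda c: self.score(feats, c))


def disseminate(xml_text, subscribers):
    """Return the names of subscribers whose classifier accepts the document."""
    feats = path_features(xml_text)  # extracted once, shared by every classifier
    return [name for name, clf in subscribers.items() if clf.predict(feats)]


# Toy collections standing in for a user's relevant and irrelevant documents.
relevant = ["<paper><title/><author/></paper>", "<paper><title/><abstract/></paper>"]
irrelevant = ["<ad><price/></ad>", "<ad><price/><vendor/></ad>"]
training = [path_features(d) for d in relevant + irrelevant]
labels = [True] * len(relevant) + [False] * len(irrelevant)

alice = PathNaiveBayes().fit(training, labels)                        # wants papers
bob = PathNaiveBayes().fit(training, [not lab for lab in labels])     # wants ads
subscribers = {"alice": alice, "bob": bob}

print(disseminate("<paper><title/></paper>", subscribers))
```

In this sketch neither subscriber writes an XPath or XQuery expression; each only supplies example documents. Sharing the feature extraction step is one simple way to reduce per-document work when many classifiers run on the same server, though the paper's actual optimizations may differ.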

Author information

Authors and Affiliations

Software Engineering Institute, East China Normal University, China
Xiaoling Wang, Weining Qian & Aoying Zhou
School of Computing Science, Simon Fraser University, Burnaby, Canada
Martin Ester

Editor information

Editors and Affiliations

Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
Lei Chen
University of Patras, 26504, Patras, Greece
Peter Triantafillou
Polytechnic Institute of NYU, 11201, Brooklyn, NY, USA
Torsten Suel

About this paper

Cite this paper

Wang, X., Ester, M., Qian, W., Zhou, A. (2010). A Data Mining Approach to XML Dissemination. In: Chen, L., Triantafillou, P., Suel, T. (eds) Web Information Systems Engineering – WISE 2010. WISE 2010. Lecture Notes in Computer Science, vol 6488. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17616-6_40

DOI: https://doi.org/10.1007/978-3-642-17616-6_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17615-9
Online ISBN: 978-3-642-17616-6
eBook Packages: Computer Science, Computer Science (R0)
