Building XML Data Warehouse Based on Frequent Patterns in User Queries
With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Because of the extremely large amount of XML data available on the Web, unguided warehousing of XML data is highly costly and usually cannot accommodate users' needs in XML data acquisition. In this paper, we propose an approach to materializing XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of the integrated XML documents in the warehouse are built from these frequent query patterns, represented as Frequent Query Pattern Trees (FreqQPTs). By using a hierarchical clustering technique, the integration approach is flexible with respect to obtaining and maintaining XML documents in the warehouse. Experiments show that processing the same queries issued against the global schema is considerably more efficient with the resulting XML data warehouse than with a direct search of the multiple data sources.