Building XML Data Warehouse Based on Frequent Patterns in User Queries

  • Ji Zhang
  • Tok Wang Ling
  • Robert M. Bruckner
  • A Min Tjoa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2737)


With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Because of the extremely large amount of XML data available on the web, unguided warehousing of XML data is highly costly and usually cannot accommodate users' needs in XML data acquisition well. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of integrated XML documents in the warehouse are built using these frequent query patterns, represented as Frequent Query Pattern Trees (FreqQPTs). Using a hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that processing the same queries issued against the global schema is much more efficient using the resulting XML data warehouse than directly searching the multiple data sources.
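
As a rough illustration of the idea of deriving a warehouse schema from frequent query patterns, the following minimal Python sketch counts how often each element path occurs in a query log and folds the frequent ones into a tree. It is not the paper's FreqQPT mining algorithm: representing each query as a set of slash-separated paths, the function names (`path_prefixes`, `frequent_paths`, `to_tree`), the `min_count` threshold, and the sample `query_log` are all assumptions made for this example.

```python
from collections import Counter


def path_prefixes(path):
    """Yield every root-anchored prefix of a slash-separated path."""
    parts = path.split("/")
    for i in range(1, len(parts) + 1):
        yield "/".join(parts[:i])


def frequent_paths(queries, min_count):
    """Return the paths (including their prefixes) used by at least min_count queries."""
    counts = Counter()
    for query in queries:
        # Expand into a set first so each query counts a path at most once.
        nodes = set()
        for path in query:
            nodes.update(path_prefixes(path))
        counts.update(nodes)
    return {p for p, c in counts.items() if c >= min_count}


def to_tree(paths):
    """Fold a prefix-closed set of paths into a nested dict (a simplified pattern tree)."""
    tree = {}
    for path in sorted(paths):
        node = tree
        for part in path.split("/"):
            node = node.setdefault(part, {})
    return tree


# Hypothetical query log: each query is the set of element paths it touches.
query_log = [
    {"book/title", "book/author/name"},
    {"book/title", "book/price"},
    {"book/title", "book/author/name", "book/year"},
]

# Keep only the paths requested by at least two of the three queries.
pattern = to_tree(frequent_paths(query_log, min_count=2))
print(pattern)
```

In this toy log, `book/price` and `book/year` each appear in only one query, so they are left out of the resulting tree; a warehouse schema built from the tree would materialize only the parts of the sources that users ask for repeatedly.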





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Ji Zhang (1)
  • Tok Wang Ling (1)
  • Robert M. Bruckner (2)
  • A Min Tjoa (2)
  1. Department of Computer Science, National University of Singapore, Singapore
  2. Institute of Software Technology, Vienna University of Technology, Vienna, Austria
