Information Systems Frontiers, Volume 20, Issue 1, pp 63–90

Evaluating Queries and Updates on Big XML Documents

  • Nicole Bidoit
  • Dario Colazzo
  • Noor Malla
  • Carlo Sartiani
Article

Abstract

In this paper we present Andromeda, a system for processing queries and updates on large XML documents. The system is based on the idea of statically and dynamically partitioning the input document, so as to distribute the computing load among the machines of a MapReduce cluster.

Keywords

XML, Cloud computing, Map/Reduce

Notes

Acknowledgements

The authors would like to thank everyone who contributed to the design and development of Andromeda: Maurizio Nolé, Alessandro Solimando, and Federico Ulliana.

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Nicole Bidoit (1)
  • Dario Colazzo (2)
  • Noor Malla (3)
  • Carlo Sartiani (4)

  1. Laboratoire de Recherche en Informatique, Université Paris-Sud, CNRS UMR 8623, Université Paris-Saclay, Orsay, France
  2. Université Paris-Dauphine, PSL Research University, CNRS, LAMSADE, Paris, France
  3. Saudi School of Paris, Paris, France
  4. DIMIE - Università della Basilicata, Potenza, Italy
