Incremental Validation of String-Based XML Data in Databases, File Systems, and Streams

  • Conference paper
Advances in Databases and Information Systems (ADBIS 2007)

Abstract

Although native (tree-like) storage of XML data is becoming more and more important, there will be an enduring demand to manage XML data in its textual representation, for instance in relational structures or file systems. XML data has to be well-formed by definition and, in many cases, it additionally has to be valid according to a given XML schema. Because XML column types are often derived from text types (e.g. CLOBs), guaranteeing well-formedness as well as validity is not trivial. Even worse, for frequently modified data it is usually too expensive to re-validate the whole XML data after each update, yet waiving re-validation may lead to inconsistencies and to malfunctioning applications. In this paper we present a schema-aware pushdown automaton (i.e. a stack machine) that validates an XML string or stream. Using an element/state index, the pushdown automaton is able to re-validate local modifications of the data while guaranteeing overall validity. Update operations (e.g. SQLXML, XQuery updates) are validated before they are executed.
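To make the idea of a schema-aware stack machine concrete, the following is a minimal Python sketch. It is not the automaton from the paper: the toy schema `SCHEMA`, the element names, the `TAG` pattern, and the function `validate` are all hypothetical. It assumes tags without attributes, ignores text content, and models each element's content as a small finite automaton over child element names. The stack holds one entry per open element together with the current state of that element's content model.

```python
import re

# Hypothetical toy schema (an assumption for illustration only).
# Each element maps to (start state, accepting states, transitions),
# where transitions[(state, child_name)] gives the next state.
SCHEMA = {
    "library": (0, {0}, {(0, "book"): 0}),                    # book*
    "book": (0, {2}, {(0, "title"): 1, (1, "author"): 2,
                      (2, "author"): 2}),                     # title, author+
    "title": (0, {0}, {}),                                    # no child elements
    "author": (0, {0}, {}),                                   # no child elements
}

# Matches <name>, </name> and <name/>; tags with attributes are not supported.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.-]*)\s*(/?)>")


def validate(xml: str, root: str = "library") -> bool:
    """Check nesting and content models in one left-to-right pass."""
    stack = []          # entries are [element name, content-model state]
    seen_root = False
    for match in TAG.finditer(xml):
        closing, name, self_closing = match.groups()
        if closing:
            if self_closing:
                return False                      # "</a/>" is not a valid tag
            if not stack or stack[-1][0] != name:
                return False                      # mismatched or stray end tag
            if stack[-1][1] not in SCHEMA[name][1]:
                return False                      # required children missing
            stack.pop()
            continue
        if name not in SCHEMA:
            return False                          # element unknown to the schema
        if stack:
            parent, state = stack[-1]
            next_state = SCHEMA[parent][2].get((state, name))
            if next_state is None:
                return False                      # child not allowed at this point
            stack[-1][1] = next_state
        else:
            if seen_root or name != root:
                return False                      # wrong or second root element
            seen_root = True
        start, accepting, _ = SCHEMA[name]
        if self_closing:
            if start not in accepting:
                return False                      # empty element needs children
        else:
            stack.append([name, start])
    return seen_root and not stack
```

Under these assumptions, `validate("<library><book><title>T</title><author>A</author></book></library>")` accepts the string, while a `book` without an `author` is rejected when its end tag is read. Because each stack entry records the state reached inside one open element, storing such states per element is one plausible way to restart the check near a modified position instead of at the beginning of the string; the sketch itself does not implement that index.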

Editor information

Yannis Ioannidis, Boris Novikov, Boris Rachev

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hammerschmidt, B.C., Werner, C., Brandt, Y., Linnemann, V., Groppe, S., Fischer, S. (2007). Incremental Validation of String-Based XML Data in Databases, File Systems, and Streams. In: Ioannidis, Y., Novikov, B., Rachev, B. (eds) Advances in Databases and Information Systems. ADBIS 2007. Lecture Notes in Computer Science, vol 4690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75185-4_23

  • DOI: https://doi.org/10.1007/978-3-540-75185-4_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75184-7

  • Online ISBN: 978-3-540-75185-4

  • eBook Packages: Computer Science, Computer Science (R0)
