Abstract
XML plays a critical role in many software infrastructures, such as SOA (Service-Oriented Architecture), Web Services, and Grid Computing. In this paper, we propose a high-performance XML parser intended as a fundamental component that makes such infrastructures more viable, even for mission-critical business applications. We previously proposed an XML parser based on differential processing, under the hypothesis that XML documents are similar to each other. In this paper we enhance that approach to achieve higher performance by leveraging both static and dynamic information. The static information is expressed in XML schema languages and is used to optimize the parser's internal state transitions; the dynamic information consists of statistics gathered over a set of instance documents. The two kinds of information are complementary. Our experimental results show that each of the proposed optimization techniques is effective on its own and that combining multiple optimizations is especially effective, yielding a 73.2% performance improvement over our earlier work.
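The abstract does not spell out the mechanics of differential processing, so the following is only a minimal sketch of the general idea, under stated assumptions: a parser remembers the byte chunks and parse events of the previous document, reuses events for the leading part of a new document that matches byte for byte, and fully parses only the remainder. The names `DifferentialParser`, `full_parse`, and `TOKEN`, and the regex-based toy tokenizer, are hypothetical and are not the authors' implementation, which according to the abstract operates on internal state transitions and additionally exploits schema and statistics.

```python
import re

# Toy tokenizer (assumption): a token is either a tag "<...>" or a run of text.
TOKEN = re.compile(rb"<[^>]+>|[^<]+")


def full_parse(data: bytes, start: int = 0):
    """Tokenize `data` from offset `start`, returning (chunk, event) pairs."""
    pairs = []
    for match in TOKEN.finditer(data, start):
        chunk = match.group(0)
        if chunk.startswith(b"</"):
            event = ("end", chunk[2:-1].decode())
        elif chunk.startswith(b"<"):
            event = ("start", chunk[1:-1].decode())
        else:
            event = ("text", chunk.decode())
        pairs.append((chunk, event))
    return pairs


class DifferentialParser:
    """Reuses events from the previous document while the bytes still match."""

    def __init__(self):
        self.template = []  # (chunk, event) pairs remembered from the last parse

    def parse(self, data: bytes):
        pos = 0
        reused = []
        for chunk, event in self.template:
            end = pos + len(chunk)
            # A text chunk may only be reused if the text really ends here,
            # i.e. the next byte opens a tag or the document is finished.
            at_boundary = (
                chunk.startswith(b"<")
                or end == len(data)
                or data[end:end + 1] == b"<"
            )
            if not (data.startswith(chunk, pos) and at_boundary):
                break
            reused.append((chunk, event))
            pos = end
        # Fall back to ordinary parsing for the part that differs.
        fresh = full_parse(data, pos)
        self.template = reused + fresh
        return [event for _, event in self.template], len(reused)


parser = DifferentialParser()
doc1 = b"<order><id>1</id><qty>10</qty></order>"
doc2 = b"<order><id>1</id><qty>100</qty></order>"
events1, reused1 = parser.parse(doc1)
events2, reused2 = parser.parse(doc2)
```

In this sketch the second document reuses the events of every token before the changed quantity and parses only the rest from scratch. A more realistic design would also resynchronize with the remembered document after a difference rather than parsing the whole tail.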
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Suzumura, T., Makino, S., Uramoto, N. (2006). Optimizing Differential XML Processing by Leveraging Schema and Statistics. In: Dan, A., Lamersdorf, W. (eds) Service-Oriented Computing – ICSOC 2006. ICSOC 2006. Lecture Notes in Computer Science, vol 4294. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11948148_22
DOI: https://doi.org/10.1007/11948148_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68147-2
Online ISBN: 978-3-540-68148-9
eBook Packages: Computer Science (R0)