Evolution of XPath Lists for Document Data Selection

Conference paper in Parallel Problem Solving from Nature, PPSN XI (PPSN 2010)

Abstract

XML has become a standard for structured data, and transformations from one specific format to another are very often needed. XSLT stylesheets are programs designed for this purpose, and they use XPath expressions to select sets of nodes within a document. This paper presents a new version of an evolutionary algorithm that creates XSLT from examples. It improves on previously obtained results by testing a new individual representation with a new set of operators, based mainly on the evolution of XPaths within a fixed XSLT program structure. The experiments show that this new representation, together with a smaller set of operators, yields better results in fewer generations than our previous version.
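
To make the idea of an individual as a list of XPaths inside a fixed program structure more concrete, here is a minimal Python sketch. It is not the authors' implementation: the sample document, the function names `apply_xpath_list` and `fitness`, and the mismatch-count fitness are all assumptions chosen for illustration, and Python's `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for a real XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical input document (not taken from the paper).
SOURCE = """<library>
  <book><title>Dune</title><author>Herbert</author></book>
  <book><title>Emma</title><author>Austen</author></book>
</library>"""


def apply_xpath_list(xml_text, xpaths):
    """Fixed program structure: for each path, emit the text of every selected node."""
    root = ET.fromstring(xml_text)
    selected = []
    for path in xpaths:
        selected.extend(node.text for node in root.findall(path))
    return selected


def fitness(xml_text, xpaths, expected):
    """Assumed fitness: number of mismatches against the expected output (lower is better)."""
    produced = apply_xpath_list(xml_text, xpaths)
    mismatches = sum(1 for a, b in zip(produced, expected) if a != b)
    return mismatches + abs(len(produced) - len(expected))


expected = ["Dune", "Emma"]
good_individual = ["./book/title"]   # selects the desired nodes
bad_individual = ["./book/author"]   # selects the wrong nodes

print(fitness(SOURCE, good_individual, expected))
print(fitness(SOURCE, bad_individual, expected))
```

In this sketch only the list of XPath strings varies between individuals, while the surrounding loop plays the role of the fixed program structure; an evolutionary algorithm would mutate and recombine the paths and keep those with lower fitness values.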

Supported by projects AmIVital (CENIT2007-1010) and EvOrq (TIC-3903).

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

García-Sánchez, P. et al. (2010). Evolution of XPath Lists for Document Data Selection. In: Schaefer, R., Cotta, C., Kołodziej, J., Rudolph, G. (eds) Parallel Problem Solving from Nature, PPSN XI. PPSN 2010. Lecture Notes in Computer Science, vol 6239. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15871-1_35

  • DOI: https://doi.org/10.1007/978-3-642-15871-1_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15870-4

  • Online ISBN: 978-3-642-15871-1

  • eBook Packages: Computer Science, Computer Science (R0)
