A Custom Browser Architecture to Execute Web Navigation Sequences

  • Conference paper
Web Information Systems Engineering – WISE 2015 (WISE 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9419)

Abstract

Web automation applications are widely used for purposes such as B2B integration and automated testing of web applications. Most current systems build the automatic web navigation component on the APIs of conventional browsers. This approach suffers from performance problems in intensive web automation tasks that require real-time responses and/or a high degree of parallelism. Other systems create custom browsers that skip some of the tasks conventional browsers perform, but they still work like conventional browsers when building the internal representation of web pages. In this paper, we present a complete architecture for a custom browser that can efficiently execute web navigation sequences. The proposed architecture supports novel automatic optimization techniques that can be applied when loading pages and building their internal representation. Tests performed on real web sources show that the reference implementation of the proposed architecture runs significantly faster than other navigation components.

Author information

Corresponding author

Correspondence to José Losada.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Losada, J., Raposo, J., Pan, A., Montoto, P., Álvarez, M. (2015). A Custom Browser Architecture to Execute Web Navigation Sequences. In: Wang, J., et al. Web Information Systems Engineering – WISE 2015. WISE 2015. Lecture Notes in Computer Science, vol 9419. Springer, Cham. https://doi.org/10.1007/978-3-319-26187-4_11

  • DOI: https://doi.org/10.1007/978-3-319-26187-4_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26186-7

  • Online ISBN: 978-3-319-26187-4

  • eBook Packages: Computer Science, Computer Science (R0)
