Building Data-Intensive Grid Applications with Globus Toolkit – An Evaluation Based on Web Crawling

  • Andreas Walter
  • Klemens Böhm
  • Stephan Schosser
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4749)


There is a growing trend to run resource-intensive applications not in large, dedicated computing centers but on the resources of computer systems distributed over the Internet. Grid middleware provides a framework for accessing these resources. This paper evaluates a specific grid middleware, Globus Toolkit, for data-intensive applications. As a test case, we have designed and implemented a service-based distributed web crawler on top of this middleware. A web crawler is a complex application consisting of many nodes, and it places significantly higher demands on the administrative flexibility of grid middleware than grid applications that only allocate the computing power of grid nodes. We have observed that some components of Globus Toolkit are flexible enough to provide the control functionality a web crawler needs, while others are not. For the latter, we propose possible extensions. Since we expect this combination of characteristics to occur in many other grid applications as well, our study is of broader interest beyond web crawling.


Globus Toolkit; Grid-Services; Complex Grid Applications; Usability of grid-services; Requirements for data-intensive grid applications



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Andreas Walter (1)
  • Klemens Böhm (2)
  • Stephan Schosser (2)
  1. IPE, FZI Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe
  2. IPD, Universität Karlsruhe, Am Fasanengarten 5, 76131 Karlsruhe
