Skip to main content

Optimization of Task Processing Schedules in Distributed Information Systems

  • Conference paper
Informatics Engineering and Information Science (ICIEIS 2011)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 253))

  • 1230 Accesses

Abstract

The performance of data processing in distributed information systems strongly depends on the efficient scheduling of the applications that access data at the remote sites. This work assumes a typical model of distributed information system where a central site is connected to a number of remote and highly autonomous remote sites. An application started by a user at a central site is decomposed into several data processing tasks to be independently processed at the remote sites. The objective of this work is to find a method for optimization of task processing schedules at a central site. We define an abstract model of data and a system of operations that implements the data processing tasks. Our abstract data model is general enough to represent many specific data models. We show how an entirely parallel schedule can be transformed into a more optimal hybrid schedule where certain tasks are processed simultaneously while the other tasks are processed sequentially. The transformations proposed in this work are guided by the cost-based optimization model whose objective is to reduce the total data transmission time between the remote sites and a central site. We show how the properties of data integration expressions can be used to find more efficient schedules of data processing tasks in distributed information systems.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Ahmad, M., Aboulnaga, A., Babu, S.: Query interactions in database workloads. In: Proceedings of the Second International Workshop on Testing Database Systems, pp. 1–6 (2009)

    Google Scholar 

  2. Ahmad, M., Duan, S., Aboulnaga, A., Babu, S.: Predicting completion times of batch query workloads using interaction-aware models and simulation. In: Proceedings of the 14th International Conference on Extending Database Technology, pp. 449–460 (2011)

    Google Scholar 

  3. Braumandl, R., Keidl, M., Kemper, A., Kossmann, D., Kreutz, A., Seltzsam, S., Stocker, K.: ObjectGlobe: Ubiquitous query processing on the Internet. The VLDB Journal 10(1), 48–71 (2001)

    MATH  Google Scholar 

  4. Costa, R.L.-C., Furtado, P.: Runtime Estimations, Reputation and Elections for Top Performing Distributed Query Scheduling. In: Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 28–35 (2009)

    Google Scholar 

  5. Du, W., Krishnamurthy, R., Shan, M.-C.: Query Optimization in Heterogeneous DBMS. In: Proceedings of the 18th VLDB Conference, pp. 277–299 (1992)

    Google Scholar 

  6. Friedman, M., Levy, A., Millstein, T.: Navigational plans For Data Integration. In: Proceedings of the National Conference on Artificial Intelligence, pp. 67–73 (1999)

    Google Scholar 

  7. Ilarri, S., Mena, E., Illarramendi, A.: Location-dependent query processing: Where we are and where we are heading. ACM Computing Surveys 42(3), 1–73 (2010)

    Article  Google Scholar 

  8. Lenzerini, M.: Data Integration: A Theoretical Perspective (2002)

    Google Scholar 

  9. Mishra, C., Koudas, N.: The design of a query monitoring system. ACM Transactions on Database Systems 34(1), 1–51 (2009)

    Article  Google Scholar 

  10. Nam, B., Shin, M., Andrade, H., Sussman, A.: Multiple query scheduling for distributed semantic caches. Journal of Parallel and Distributed Computing 70(5), 598–611 (2010)

    Article  MATH  Google Scholar 

  11. Harangsri, B., Shepherd, J., Ngu, A.: Query Classification in Multidatabase Systems. In: Proceedings of the 7th Australasian Database Conference, pp. 147–156 (1996)

    Google Scholar 

  12. Ives, Z.G., Green, T.J., Karvounarakis, G., Taylor, N.E., Tannen, V., Talukdar, P.P., Jacob, M., Pereira, F.: The ORCHESTRA Collaborative Data Sharing System. SIGMOD Record (2008)

    Google Scholar 

  13. Liu, L., Pu, C.: A Dynamic Query Scheduling Framework for Distributed and Evolving Information Systems. In: Proceedings of the 17th International Conference on Distributed Computing Systems (1997)

    Google Scholar 

  14. Lu, H., Ooi, B.-C., Goh, C.-H.: Multidatabase Query Optimization: Issues and Solutions. In: Proceedings RIDE-IMS 1993, Research Issues in Data Engineering: Interoperability in Multidatabase Systems, pp. 137–143 (April 1993)

    Google Scholar 

  15. Ozcan, F., Nural, S., Koksal, P., Evrendilek, C., Dogac, A.: Dynamic Query Optimization in Multidatabases. Bulletin of the Technical Committee on Data Engineering 20(3), 38–45 (1997)

    Google Scholar 

  16. Sheth, A.P., Larson, J.A.: Federated Database Systems for Managing Distributed, Heterogeneous, and Autonomous Databases. ACM Computing Surveys 22(3), 183–236 (1990)

    Article  Google Scholar 

  17. Srinivasan, V., Carey, M.J.: Compensation-Based On-Line Query Processing. In: Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data, pp. 331–340 (1992)

    Google Scholar 

  18. Thain, D., Tannenbaum, T., Livny, M.: Distributed computing in practice: the Condor experience: Research Articles. Concurrency Computing: Practice and Experience 17(2-4), 323–356 (2005)

    Article  Google Scholar 

  19. Wache, H., Vogele, T., Visser, U., Stuckenschmidt, H., Schuster, G., Neuman, H., Hubner, S.: Ontology-Based Integration of information - A Survey of Existing Approaches (2001)

    Google Scholar 

  20. Zhou, Y., Ooi, B.C., Tan, K.-L., Tok, W.H.: An adaptable distributed query processing architecture. Data and Knowledge Engineering 53(3), 283–309 (2005)

    Article  Google Scholar 

  21. Zhu, Q., Larson, P.A.: Solving Local Cost Estimation Problem for Global Query Optimization in Multidatabase Systems. Distributed and Parallel Databases 6(4), 373–420 (1998)

    Article  Google Scholar 

  22. Ziegler, P.: Three Decades of Data Integration - All problems Solved? In: 18th IFIP World Computer Congress, vol. 12 (2004)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Getta, J.R. (2011). Optimization of Task Processing Schedules in Distributed Information Systems. In: Abd Manaf, A., Sahibuddin, S., Ahmad, R., Mohd Daud, S., El-Qawasmeh, E. (eds) Informatics Engineering and Information Science. ICIEIS 2011. Communications in Computer and Information Science, vol 253. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25462-8_29

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-25462-8_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25461-1

  • Online ISBN: 978-3-642-25462-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics