Toward Optimizing Latency Under Throughput Constraints for Application Workflows on Clusters

  • Nagavijayalakshmi Vydyanathan
  • Umit V. Catalyurek
  • Tahsin M. Kurc
  • Ponnuswamy Sadayappan
  • Joel H. Saltz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)

Abstract

In many application domains, it is desirable to meet a user-defined performance requirement while minimizing resource usage and optimizing additional performance parameters. For example, application workflows with real-time constraints may have strict throughput requirements and also call for low latency (response time). The structure of these workflows can be represented as directed acyclic graphs of coarse-grained application tasks with data dependences. In this paper, we develop a novel mapping and scheduling algorithm that minimizes the latency of workflows that operate on a stream of input data while satisfying throughput requirements. The algorithm employs pipelined parallelism together with intelligent clustering and replication of tasks to meet throughput requirements. Latency is minimized by exploiting task parallelism and reducing communication overheads. Evaluation using synthetic benchmarks and application task graphs shows that our algorithm 1) consistently meets throughput requirements even when other existing schemes fail, 2) produces lower-latency schedules, and 3) uses fewer resources.
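
The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not the algorithm of the paper; it is a minimal illustration under simplifying assumptions: every task is stateless and can be replicated freely, each task copy runs on its own processor, and communication between co-located tasks is free. The task names, execution times, edge costs, and the helper functions `replicas_needed` and `latency` are all hypothetical.

```python
from math import ceil

# Hypothetical workflow: execution time per data item, in seconds.
exec_time = {"read": 2.0, "filter": 6.0, "warp": 3.0, "write": 1.0}

# Hypothetical DAG edges with the communication cost (seconds) paid
# when the two endpoint tasks run on different processors.
edges = {
    ("read", "filter"): 1.0,
    ("read", "warp"): 1.0,
    ("filter", "write"): 0.5,
    ("warp", "write"): 0.5,
}


def replicas_needed(exec_time, required_throughput):
    """Copies of each task so that its effective period (time / copies)
    does not exceed 1 / required_throughput. Assumes stateless tasks."""
    period = 1.0 / required_throughput
    return {task: ceil(t / period) for task, t in exec_time.items()}


def latency(exec_time, edges, colocated=frozenset()):
    """Longest path through the DAG for a single data item.
    Edges listed in `colocated` are treated as having zero
    communication cost (their endpoints share a processor)."""
    preds = {task: [] for task in exec_time}
    for (u, v), cost in edges.items():
        preds[v].append((u, 0.0 if (u, v) in colocated else cost))

    finish = {}

    def finish_time(task):
        if task not in finish:
            start = max(
                (finish_time(u) + cost for u, cost in preds[task]),
                default=0.0,
            )
            finish[task] = start + exec_time[task]
        return finish[task]

    return max(finish_time(task) for task in exec_time)


# Require 0.25 items per second, i.e. one item every 4 seconds.
copies = replicas_needed(exec_time, required_throughput=0.25)
base_latency = latency(exec_time, edges)
clustered_latency = latency(exec_time, edges, colocated={("filter", "write")})
```

In this model, replication raises throughput without changing the latency of an individual item, while clustering two communicating tasks shortens the critical path by removing their communication cost. Clustering also increases the per-item load on the shared processor, which may in turn call for more replicas; the sketch does not model that interaction.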

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Nagavijayalakshmi Vydyanathan (1)
  • Umit V. Catalyurek (2)
  • Tahsin M. Kurc (2)
  • Ponnuswamy Sadayappan (1)
  • Joel H. Saltz (2)

  1. Dept. of Computer Science and Engineering
  2. Dept. of Biomedical Informatics, The Ohio State University
