LUTMap: A Dynamic Heuristic Application Mapping Algorithm Based on Lookup Tables

  • Conference paper
Internet and Distributed Computing Systems (IDCS 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9864)

Abstract

In this paper, we propose and investigate a dynamic heuristic mapping algorithm with lookup table optimizations. Distributed and parallel computing have become prevalent because of the performance requirements of modern applications. Because applications arrive dynamically and unpredictably, application mapping in a multiprocessor system is critical. We analyse the communication delay among the tasks of an application, and we analyse a fundamental algorithm that optimizes the average delay of the mapping region. We discuss and evaluate the effectiveness of the algorithm in terms of average intra-application latency. Results from synthetic applications show that the average latency of the mapping regions produced by the fundamental algorithm is reduced by up to 23 % compared with incremental mapping. Because the algorithm incurs a time overhead from its larger search space, we introduce a lookup table mechanism to speed up the search for optimized mapping regions. The lookup table is examined in terms of both size and construction time. Experiments show that the lookup table is small enough to fit into the cache and that it can be constructed in milliseconds in most practical cases. Results from real applications show that the proposed algorithm reduces the average execution time of applications by 15.2 % compared with the first fit algorithm.
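
As a rough illustration of the idea summarized above, the following Python sketch precomputes, for every task count, a rectangular mapping region with low average pairwise hop distance and stores it in a table so that a later lookup replaces the search. It is not the algorithm from the paper: the 2D mesh, the Manhattan-distance cost, the restriction to row-major rectangles, and the names `average_pairwise_hops`, `best_rectangle`, and `build_lookup_table` are all assumptions made for this example.

```python
from itertools import combinations


def average_pairwise_hops(cells):
    """Mean Manhattan distance over all unordered pairs of (x, y) cells."""
    cells = list(cells)
    if len(cells) < 2:
        return 0.0
    pairs = list(combinations(cells, 2))
    total = sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in pairs)
    return total / len(pairs)


def best_rectangle(num_tasks, mesh_width, mesh_height):
    """Search rectangle widths and return (cost, (width, height)).

    Tasks are placed row-major inside the rectangle (an assumption of this
    sketch); the last row may be only partly filled.
    """
    best = None
    for width in range(1, mesh_width + 1):
        height = -(-num_tasks // width)  # ceiling division
        if height > mesh_height:
            continue  # rectangle does not fit on the mesh
        cells = [(i % width, i // width) for i in range(num_tasks)]
        cost = average_pairwise_hops(cells)
        if best is None or cost < best[0]:
            best = (cost, (width, height))
    return best


def build_lookup_table(mesh_width, mesh_height):
    """Precompute the best rectangle for every task count that fits the mesh."""
    return {
        n: best_rectangle(n, mesh_width, mesh_height)
        for n in range(1, mesh_width * mesh_height + 1)
    }


# Hypothetical usage: build the table once, then look up a 4-task application.
table = build_lookup_table(4, 4)
cost, shape = table[4]
print(shape, round(cost, 3))
```

On a 4 x 4 mesh the table has only 16 entries, which mirrors the claim that such a table can be small; the exhaustive search runs once at construction time, and each later mapping decision is a single dictionary lookup.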

Author information

Corresponding author

Correspondence to Thomas Canhao Xu.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Xu, T.C., Leppänen, V. (2016). LUTMap: A Dynamic Heuristic Application Mapping Algorithm Based on Lookup Tables. In: Li, W., et al. Internet and Distributed Computing Systems. IDCS 2016. Lecture Notes in Computer Science, vol 9864. Springer, Cham. https://doi.org/10.1007/978-3-319-45940-0_12

  • DOI: https://doi.org/10.1007/978-3-319-45940-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45939-4

  • Online ISBN: 978-3-319-45940-0

  • eBook Packages: Computer Science, Computer Science (R0)
