Abstract
Growing demands on future embedded systems, such as high performance and energy efficiency, have led to the emergence of heterogeneous Multiprocessor System-on-Chip (MPSoC) architectures. To exploit the full power of these architectures, new tools are needed that manage the increasing complexity of the software while keeping productivity high. An MPSoC compiler is the tool-chain that, for a given (pre-)verified MPSoC platform, addresses three problems: expressing parallelism when modeling or programming applications, mapping and scheduling, and generating the software to be distributed across the platform so that it is used efficiently. This chapter discusses the various aspects of MPSoC compilers for heterogeneous MPSoC architectures by comparing them with well-established uni-processor C compiler technology. After a brief introduction to MPSoCs and MPSoC compilers, the important ingredients of the compilation process, such as programming models, granularity and partitioning, platform description, mapping/scheduling and code generation, are explained in detail. As the topic is relatively young, a number of case studies from academia and industry are presented at the end of the chapter to illustrate the concepts.
Notes
1. OpenMP also has distributed-memory extensions.
2. Being more closely related to the so-called Computation Graphs [41].
References
Eclipse. http://www.eclipse.org/. Visited on Jan. 2010
GDB: The GNU Project Debugger. http://www.gnu.org/software/gdb/. Visited on Jan. 2010
MCAPI - Multicore Communications API. http://www.multicore-association.org/workgroup/comapi.php. Visited on Nov. 2009
PISA - A Platform and Programming Language Independent Interface for Search Algorithms. http://www.tik.ee.ethz.ch/sop/pisa/. Visited on Nov. 2009
Real Time Software Components. http://www.eclipse.org/dsdp/rtsc. Visited on Jan. 2010
AbsInt: aiT worst-case execution time analyzers. http://www.absint.com/ait/. Visited on Nov. 2009
ACE: Embedded C for high performance DSP programming with the CoSy compiler development system. a-qual.com/topics/2005/EmbeddedCv2.pdf. Visited on Jan. 2010
Adl-Tabatabai, A.R., Kozyrakis, C., Saha, B.: Unlocking concurrency. Queue 4(10), 24–33 (2007)
Aho, A.V., Sethi, R., Ullman, J.D.: Compilers: Principles, Techniques, and Tools. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA (1986)
Asanovic, K., Bodik, R., Catanzaro, B.C., Gebis, J.J., Husbands, P., Keutzer, K., Patterson, D.A., Plishker, W.L., Shalf, J., Williams, S.W., Yelick, K.A.: The landscape of parallel computing research: A view from Berkeley. Tech. rep., EECS Department, University of California, Berkeley (2006)
Bacivarov, I., Haid, W., Huang, K., Thiele, L.: Methods and tools for mapping process networks onto multi-processor systems-on-chip. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, second edn. Springer (2013)
Benini, L., Bertozzi, D., Guerri, A., Milano, M.: Allocation and scheduling for MPSoCs via decomposition and no-good generation. Principles and Practices of Constrained Programming - CP 2005 (DEIS-LIA-05-001), 107–121 (2005)
Bhattacharya, B., Bhattacharyya, S.S.: Parameterized dataflow modeling for DSP systems. IEEE Transactions on Signal Processing 49(10), 2408–2421 (2001)
Bhattacharyya, S.S., Deprettere, E.F., Theelen, B.: Dynamic dataflow graphs. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, second edn. Springer (2013)
Carro, L., Rutzig, M.B.: Multicore systems on chip. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, second edn. Springer (2013)
Castrillon, J., Leupers, R., Ascheid, G.: MAPS: Mapping concurrent dataflow applications to heterogeneous MPSoCs. IEEE Transactions on Industrial Informatics p. 19 (2011). DOI 10.1109/TII.2011.2173941
Castrillon, J., Shah, A., Murillo, L., Leupers, R., Ascheid, G.: Backend for virtual platforms with hardware scheduler in the MAPS framework. In: Circuits and Systems (LASCAS), 2011 IEEE Second Latin American Symposium on, pp. 1–4 (2011). DOI 10.1109/LASCAS.2011.5750280
Castrillon, J., Sheng, W., Leupers, R.: Trends in embedded software synthesis. In: SAMOS, pp. 347–354 (2011)
Castrillon, J., Zhang, D., Kempf, T., Vanthournout, B., Leupers, R., Ascheid, G.: Task management in MPSoCs: An ASIP approach. In: ICCAD 2009 (2009)
Ceng, J., Castrillon, J., Sheng, W., Scharwächter, H., Leupers, R., Ascheid, G., Meyr, H., Isshiki, T., Kunieda, H.: MAPS: an integrated framework for MPSoC application parallelization. In: DAC ’08: Proceedings of the 45th annual conference on Design automation, pp. 754–759. ACM, New York, NY, USA (2008)
Ceng, J., Sheng, W., Castrillon, J., Stulova, A., Leupers, R., Ascheid, G., Meyr, H.: A high-level virtual platform for early MPSoC software development. In: CODES+ISSS ’09: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis, pp. 11–20. ACM, New York, NY, USA (2009)
Cesario, W., Jerraya, A.: Multiprocessor Systems-on-Chips, chap. 9: Component-Based Design for Multiprocessor Systems-on-Chip, pp. 357–394. Morgan Kaufmann (2005)
Collette, T.: Key technologies for many core architectures. In: 8th International Forum on Application-Specific Multi-Processor SoC (2008)
CoWare: CoWare Virtual Platforms. http://www.coware.com/products/virtualplatform.php. Visited on Apr. 2009
Fisher, J.A., Faraboschi, P., Young, C.: Embedded Computing: A VLIW Approach to Architecture, Compilers and Tools. Morgan Kaufmann (Elsevier) (2005)
Flake, P., Davidmann, S., Schirrmeister, F.: System-level exploration tools for MPSoC designs. In: Design Automation Conference, 2006 43rd ACM/IEEE, pp. 286–287 (2006)
Gao, L., Huang, J., Ceng, J., Leupers, R., Ascheid, G., Meyr, H.: TotalProf: a fast and accurate retargetable source code profiler. In: CODES+ISSS ’09: Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis, pp. 305–314. ACM, New York, NY, USA (2009)
Geilen, M., Basten, T.: Kahn process networks and a reactive extension. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, second edn. Springer (2013)
Gheorghita, S., Basten, T., Corporaal, H.: An overview of application scenario usage in streaming-oriented embedded system design
Gupta, R., Micheli, G.D.: Hardware-software co-synthesis for digital systems. In: IEEE Design & Test of Computers, pp. 29–41 (1993)
Ha, S., Oh, H.: Decidable dataflow models for signal processing: Synchronous dataflow and its extensions. In: S.S. Bhattacharyya, E.F. Deprettere, R. Leupers, J. Takala (eds.) Handbook of Signal Processing Systems, second edn. Springer (2013)
Hankins, R.A., et al.: Multiple instruction stream processor. SIGARCH Comp. Arch. News 34(2) (2006)
Hansson, A., Goossens, K., Bekooij, M., Huisken, J.: CoMPSoC: A template for composable and predictable multi-processor system on chips. ACM Trans. Des. Autom. Electron. Syst. 14(1), 1–24 (2009)
Hewitt, C., Bishop, P., Greif, I., Smith, B., Matson, T., Steiger, R.: Actor induction and meta-evaluation. In: POPL ’73: Proceedings of the 1st annual ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pp. 153–168. ACM, New York, NY, USA (1973)
Hind, M.: Pointer analysis: Haven’t we solved this problem yet? In: PASTE ’01, pp. 54–61. ACM Press (2001)
Hu, T.C.: Parallel sequencing and assembly line problems. Operations Research 9(6), 841–848 (1961). URL http://www.jstor.org/stable/167050
Hwang, Y., Abdi, S., Gajski, D.: Cycle-approximate retargetable performance estimation at the transaction level. In: DATE ’08: Proceedings of the conference on Design, automation and test in Europe, pp. 3–8. ACM, New York, NY, USA (2008)
Hwu, W.M., Ryoo, S., Ueng, S.Z., Kelm, J.H., Gelado, I., Stone, S.S., Kidd, R.E., Baghsorkhi, S.S., Mahesri, A.A., Tsao, S.C., Navarro, N., Lumetta, S.S., Frank, M.I., Patel, S.J.: Implicitly parallel programming models for thousand-core microprocessors. In: DAC ’07: Proceedings of the 44th annual conference on Design automation, pp. 754–759. ACM, New York, NY, USA (2007)
Kahn, G.: The semantics of a simple language for parallel programming. In: J.L. Rosenfeld (ed.) Information Processing ’74: Proceedings of the IFIP Congress, pp. 471–475. North-Holland, New York, NY (1974)
Kandemir, M., Dutt, N.: Multiprocessor Systems-on-Chips, chap. 9: Memory Systems and Compiler Support for MPSoC Architectures, pp. 251–281. Morgan Kaufmann (2005)
Karp, R.M., Miller, R.E.: Properties of a model for parallel computations: Determinacy, termination, queuing. SIAM Journal of Applied Math 14(6) (1966)
Karuri, K., Al Faruque, M.A., Kraemer, S., Leupers, R., Ascheid, G., Meyr, H.: Fine-grained application source code profiling for ASIP design. In: DAC ’05: Proceedings of the 42nd annual conference on Design automation, pp. 329–334. ACM, New York, NY, USA (2005)
Kempf, T., Wallentowitz, S., Ascheid, G., Leupers, R., Meyr, H.: A workbench for analytical and simulation based design space exploration of software defined radios. In: Proc. Int. Conf. VLSI Design, pp. 281–286. New Delhi, India (2009)
Kennedy, K., Allen, J.R.: Optimizing compilers for modern architectures: A dependence-based approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2002)
Kloss, N.: Application Programming Strategies for TI’s OMAP Solutions. Embedded Edge (2003)
Krishnan, V., Torrellas, J.: A chip-multiprocessor architecture with speculative multithreading. IEEE Trans. Comput. 48(9), 866–880 (1999)
Kumar, S., Hughes, C.J., Nguyen, A.: Carbon: Architectural support for fine-grained parallelism on chip multiprocessors. SIGARCH Comp. Arch. News 35(2) (2007)
Kung, H.T.: Why systolic architectures? Computer 15(1), 37–46 (1982)
Kwok, Y.K., Ahmad, I.: Static scheduling algorithms for allocating directed task graphs to multiprocessors. ACM Comput. Surv. 31(4), 406–471 (1999)
Kwon, S., Kim, Y., Jeun, W.C., Ha, S., Paek, Y.: A retargetable parallel-programming framework for MPSoC. ACM Trans. Des. Autom. Electron. Syst. 13(3), 1–18 (2008)
Lam, M.: Software pipelining: An effective scheduling technique for VLIW machines. SIGPLAN Not. 23(7), 318–328 (1988)
Lee, E., Messerschmitt, D.: Synchronous data flow. Proceedings of the IEEE 75(9), 1235–1245 (1987)
Lee, E.A.: Consistency in dataflow graphs. IEEE Trans. Parallel Distrib. Syst. 2(2), 223–235 (1991)
Lee, E.A.: The problem with threads. Computer 39(5), 33–42 (2006). URL http://portal.acm.org/citation.cfm?id=1137232.1137289
Leupers, R.: Retargetable Code Generation for Digital Signal Processors. Kluwer Academic Publishers, Norwell, MA, USA (1997)
Leupers, R.: Code selection for media processors with SIMD instructions. In: DATE ’00, pp. 4–8. ACM (2000)
Li, L., Huang, B., Dai, J., Harrison, L.: Automatic multithreading and multiprocessing of C programs for IXP. In: PPoPP ’05: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 132–141. ACM, New York, NY, USA (2005)
Ma, Z., Marchal, P., Scarpazza, D.P., Yang, P., Wong, C., Gómez, J.I., Himpe, S., Ykman-Couvreur, C., Catthoor, F.: Systematic Methodology for Real-Time Cost-Effective Mapping of Dynamic Concurrent Task-Based Systems on Heterogeneous Platforms. Springer Publishing Company, Incorporated (2007)
Martin, G.: ESL requirements for configurable processor-based embedded system design. Design and Reuse
Mignolet, J.Y., Baert, R., Ashby, T.J., Avasare, P., Jang, H.O., Son, J.C.: MPA: Parallelizing an application onto a multicore platform made easy. IEEE Micro 29(3), 31–39 (2009)
Muchnick, S.S.: Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (1997)
National Instruments: LabView. http://www.ni.com/labview/. Visited on Mar. 2009
Nieuwland, A., Kang, J., Gangwal, O.P., Sethuraman, R., Busa, R.S.C.N., Goossens, K., Llopis, R.P.: C-HEAP: A heterogeneous multi-processor architecture template and scalable and flexible protocol for the design of embedded signal processing systems. Design Automation for Embedded Systems (7), 233–270 (2002)
Nikolov, H., Stefanov, T., Deprettere, E.: Systematic and automated multiprocessor system design, programming, and implementation. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on 27(3), 542–555 (2008)
Nikolov, H., Thompson, M., Stefanov, T., Pimentel, A., Polstra, S., Bose, R., Zissulescu, C., Deprettere, E.: Daedalus: Toward composable multimedia MP-SoC design. In: DAC ’08: Proceedings of the 45th annual conference on Design automation, pp. 574–579. ACM, New York, NY, USA (2008)
The OpenMP specification for parallel programming: www.openmp.org. Visited on Nov. 2009
Palsberg, J., Naik, M.: Multiprocessor Systems-on-Chips, chap. 12: ILP-based Resource-aware Compilation, pp. 337–354. Morgan Kaufmann (2005)
Paolucci, P.S., Jerraya, A.A., Leupers, R., Thiele, L., Vicini, P.: SHAPES: a tiled scalable software hardware architecture platform for embedded systems. In: CODES+ISSS ’06: Proceedings of the 4th international conference on Hardware/software codesign and system synthesis, pp. 167–172. ACM, New York, NY, USA (2006)
Park, S., Hong, D.S., Chae, S.I.: A hardware operating system kernel for multi-processor systems. IEICE 5(9) (2008)
Parks, T.M.: Bounded scheduling of process networks. Ph.D. thesis, Berkeley, CA, USA (1995)
Pimentel, A.D., Erbas, C., Polstra, S.: A systematic approach to exploring embedded system architectures at multiple abstraction levels. IEEE Transactions on Computers 55(2), 99–112 (2006)
Sharma, G., Martin, J.: MATLAB (R): A language for parallel computing. International Journal of Parallel Programming 37(1) (2009)
Sheng, W., Schürmans, S., Odendahl, M., Leupers, R., Ascheid, G.: Automatic calibration of streaming applications for software mapping exploration. In: Proceedings of the International Symposium on System-on-Chip (SoC) (2011)
Snir, M., Otto, S.: MPI-The Complete Reference: The MPI Core. MIT Press (1998)
Sporer, T., Franck, A., Bacivarov, I., Beckinger, M., Haid, W., Huang, K., Thiele, L., Paolucci, P., Bazzana, P., Vicini, P., Ceng, J., Kraemer, S., Leupers, R.: SHAPES - a scalable parallel HW/SW architecture applied to wave field synthesis. In: Proc. 32nd Intl Audio Engineering Society (AES) Conference, pp. 175–187. Audio Engineering Society, Hillerod, Denmark (2007)
Sriram, S., Bhattacharyya, S.S.: Embedded Multiprocessors: Scheduling and Synchronization. Marcel Dekker, Inc., New York, NY, USA (2000)
Standard for information technology - portable operating system interface (POSIX). Shell and utilities. IEEE Std 1003.1–2004, The Open Group Base Specifications Issue 6, section 2.9: IEEE and The Open Group
Synopsys: Synopsys Virtual Platforms. http://www.synopsys.com/Tools/SLD/VirtualPlatforms/Pages/default.aspx. Visited on May 2009
TI: OMAP35x Product Bulletin. http://www.ti.com/lit/sprt457. Visited on Mar. 2009
TI: TI eXpressDSP Software and Development Tools. http://focus.ti.com/general/docs/gencontent.tsp?contentId=46891. Visited on Jan. 2010
Tournavitis, G., Wang, Z., Franke, B., O’Boyle, M.: Towards a holistic approach to auto-parallelization – integrating profile-driven parallelism detection and machine-learning based mapping. In: PLDI ’09: Proceedings of the Programming Language Design and Implementation Conference. Dublin, Ireland (2009)
UMIC: Ultra high speed Mobile Information and Communication. http://www.umic.rwth-aachen.de. Visited on Nov. 2009
Verdoolaege, S., Nikolov, H., Stefanov, T.: pn: A tool for improved derivation of process networks. EURASIP J. Embedded Syst. 2007(1), 19–19 (2007)
Wilhelm, R., Engblom, J., Ermedahl, A., Holsti, N., Thesing, S., Whalley, D., Bernat, G., Ferdinand, C., Heckmann, R., Mitra, T., Mueller, F., Puaut, I., Puschner, P., Staschulat, J., Stenström, P.: The worst-case execution-time problem - overview of methods and survey of tools. ACM Trans. Embed. Comput. Syst. 7(3), 1–53 (2008)
Working Group ISO/IEC JTC1/SC22/WG14: C99, Programming Language C ISO/IEC 9899:1999
Zalfany Urfianto, M., Isshiki, T., Ullah Khan, A., Li, D., Kunieda, H.: Decomposition of task-level concurrency on C programs applied to the design of multiprocessor SoC. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E91-A(7), 1748–1756 (2008)
Acknowledgements
The authors would like to thank Jianjiang Ceng, Anastasia Stulova, Stefan Schürmans, and all ISS research members for their valuable comments.
Copyright information
© 2013 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Leupers, R., Sheng, W., Castrillon, J. (2013). Software Compilation Techniques for MPSoCs. In: Bhattacharyya, S., Deprettere, E., Leupers, R., Takala, J. (eds) Handbook of Signal Processing Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-6859-2_37
DOI: https://doi.org/10.1007/978-1-4614-6859-2_37
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-6858-5
Online ISBN: 978-1-4614-6859-2
eBook Packages: Engineering, Engineering (R0)