Abstract
Architectural Design Space Exploration (DSE) is extremely complex: many architectural parameters must be tuned and optimized, which yields a huge design space that has to be explored efficiently. Furthermore, each architectural parameter and design point is critically affected by decisions made at lower levels of abstraction (e.g., layout and choice of transistors). Ideally, designers would perform DSE using information and decisions from multiple layers of design abstraction, so that the resulting design space is feasible and has good fidelity. Simulation-based methods alone cannot cope with a design space this large and complex. To address these issues, this chapter presents an approach to cross-layer architectural DSE that combines two techniques: systematic pruning of the design space, which gives good coverage at low cost, and cross-layer predictive models, which avoid overly expensive full-system simulations. A single-chip heterogeneous single-ISA multiprocessor serves as an exemplar to demonstrate how the large search space can be covered and evaluated efficiently.
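As a rough illustration of the prune-then-predict idea summarized in the abstract, the following Python sketch explores a toy single-core design space. Everything in it is an assumption made for illustration: the parameter ranges, the area and energy/delay formulas, the area budget, and the function names are hypothetical, and the inverse-distance-weighted surrogate was chosen only for brevity. It is not the chapter's method or models.

```python
import math
from itertools import product

# Hypothetical parameter ranges for one core type (not taken from the chapter).
ISSUE_WIDTHS = [1, 2, 4, 8]
ROB_SIZES = [32, 64, 128, 256]
L2_SIZES_KB = [256, 512, 1024, 2048]


def area_mm2(cfg):
    """Toy area estimate standing in for lower-layer (layout-level) data."""
    width, rob, l2_kb = cfg
    return 0.5 * width + 0.01 * rob + 0.002 * l2_kb


def simulate_edp(cfg):
    """Stand-in for an expensive full-system simulation; returns a toy EDP."""
    width, rob, l2_kb = cfg
    delay = 10.0 / width + 200.0 / rob + 500.0 / l2_kb
    energy = 1.0 * width + 0.02 * rob + 0.001 * l2_kb
    return energy * delay


def predict_edp(cfg, samples):
    """Cheap surrogate: inverse-distance-weighted mean of simulated points."""
    x = [math.log2(v) for v in cfg]
    num = den = 0.0
    for known_cfg, known_edp in samples.items():
        y = [math.log2(v) for v in known_cfg]
        dist2 = sum((a - b) ** 2 for a, b in zip(x, y))
        if dist2 == 0.0:
            return known_edp  # already simulated, no prediction needed
        num += known_edp / dist2
        den += 1.0 / dist2
    return num / den


def explore(area_budget=4.0, top_k=5):
    space = list(product(ISSUE_WIDTHS, ROB_SIZES, L2_SIZES_KB))
    # Step 1: prune points that violate the lower-layer area constraint.
    feasible = [c for c in space if area_mm2(c) <= area_budget]
    # Step 2: simulate only a sparse, evenly strided subset of what is left.
    samples = {c: simulate_edp(c) for c in feasible[::4]}
    # Step 3: rank all feasible points by predicted EDP (lower is better).
    ranked = sorted(feasible, key=lambda c: predict_edp(c, samples))
    # Step 4: spend the remaining simulation budget on the top candidates.
    for cfg in ranked[:top_k]:
        if cfg not in samples:
            samples[cfg] = simulate_edp(cfg)
    best = min(samples, key=samples.get)
    return best, len(space), len(feasible), len(samples)


if __name__ == "__main__":
    best, n_space, n_feasible, n_simulated = explore()
    print(f"best={best}, simulated {n_simulated} of {n_feasible} feasible "
          f"points ({n_space} in the full space)")
```

The point of the sketch is the budget: only a fraction of the feasible points is ever passed to `simulate_edp`, while the surrogate decides where the remaining simulations are spent. A real flow would replace both toy models with calibrated ones.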
Abbreviations
- CLDSE: Cross-Layer Design Space Exploration
- DoE: Design of Experiments
- DSE: Design Space Exploration
- EDP: Energy-Delay Product
- EDSP: Energy-Delay Square Product
- HMP: Heterogeneous Multi-core Processor
- ILP: Instruction-Level Parallelism
- ISA: Instruction-Set Architecture
- RSM: Response Surface Modeling
- SA: Simulated Annealing
Acknowledgements
This work was partially supported by the NSF Variability Expedition award CCF-1029783.
Copyright information
© 2017 Springer Science+Business Media Dordrecht
About this entry
Cite this entry
Sarma, S., Dutt, N. (2017). Architecture and Cross-Layer Design Space Exploration. In: Ha, S., Teich, J. (eds) Handbook of Hardware/Software Codesign. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7267-9_9
DOI: https://doi.org/10.1007/978-94-017-7267-9_9
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-017-7266-2
Online ISBN: 978-94-017-7267-9
eBook Packages: Engineering; Reference Module Computer Science and Engineering