Abstract
Mobile processors, a subclass of embedded processors, increasingly employ multicore designs to improve performance. This often requires sacrificing resources in each CPU, which degrades single-thread performance; by Amdahl’s law, that performance still matters because the serial portion of a program limits the overall speedup. The traditional technique for efficiently boosting serial performance in embedded processors, dedicated hardware acceleration, is unsuitable for modern mobile processors because of the heterogeneity and diversity of the applications they run. This paper proposes ‘general purpose’ accelerators, reconfigured on an application-by-application basis, as a means of increasing single-thread performance. These accelerators are placed within the datapath of CPUs and support dynamic compilation. The paper presents the design of an architecture with such accelerators and evaluates the cost/performance implications of the design.
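The Amdahl's law argument in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the `serial_boost` parameter, and the example figures (a 90% parallel workload on 4 cores) are assumptions chosen for the example, not values taken from the paper. It applies the standard Amdahl formula, extended with a factor that speeds up only the serial portion, which is one simple way to model the effect of a single-thread accelerator.

```python
def amdahl_speedup(parallel_fraction: float, cores: int,
                   serial_boost: float = 1.0) -> float:
    """Overall speedup under Amdahl's law.

    parallel_fraction: share of the original run time that parallelises.
    cores:             number of cores the parallel share is spread over.
    serial_boost:      hypothetical speedup applied to the serial share only
                       (1.0 means no single-thread acceleration).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1 or serial_boost <= 0.0:
        raise ValueError("cores must be >= 1 and serial_boost must be > 0")
    serial_fraction = 1.0 - parallel_fraction
    # New run time, as a fraction of the original single-core run time.
    new_time = serial_fraction / serial_boost + parallel_fraction / cores
    return 1.0 / new_time


if __name__ == "__main__":
    # Illustrative numbers only: 90% parallel code on 4 cores.
    print(f"no serial boost: {amdahl_speedup(0.9, 4):.2f}x")
    print(f"2x serial boost: {amdahl_speedup(0.9, 4, 2.0):.2f}x")
```

Under these assumed numbers, adding cores alone can never push the speedup past 1 / 0.1 = 10x, whereas accelerating the serial share raises that ceiling, which is the motivation the abstract gives for improving single-thread performance.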
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ndu, G., Garside, J. (2012). Boosting Single Thread Performance in Mobile Processors via Reconfigurable Acceleration. In: Choy, O.C.S., Cheung, R.C.C., Athanas, P., Sano, K. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2012. Lecture Notes in Computer Science, vol 7199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28365-9_10
DOI: https://doi.org/10.1007/978-3-642-28365-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28364-2
Online ISBN: 978-3-642-28365-9
eBook Packages: Computer Science, Computer Science (R0)