Abstract
To improve performance and meet power constraints, vendors are introducing heterogeneous multicores that combine high performance and low power cores. However, choosing which cores and scheduling applications on them remain open problems. This paper presents a scheduling algorithm that provably minimizes energy on heterogeneous multicores and meets latency constraints for interactive applications, such as search, recommendations, advertisements, and games. Because interactive applications must respond quickly to satisfy users, they impose multiple constraints, including average, tail, and maximum latency. We introduce SEM (Slow-to-fast, Energy optimization for Multiple constraints), which minimizes energy by choosing core speeds and how long to execute jobs on each core. We prove SEM minimizes energy without a priori knowledge of job service demand, satisfies multiple latency constraints simultaneously, and only migrates jobs from slower to faster cores. We address practical concerns of migration overhead and congestion. We prove optimizing energy for average latency requires homogeneous cores, whereas optimizing energy for tail and deadline constraints requires heterogeneous cores. For interactive applications, we create a formal foundation for scheduling and selecting cores in heterogeneous systems.
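The slow-to-fast idea in the abstract can be illustrated with a small sketch. This is not the paper's SEM algorithm: it only evaluates a fixed, hand-written plan and does not optimize anything. The `Stage` class, the `run_job` function, the example plan, and the power model (power equals speed raised to `alpha`, with `alpha = 3`) are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a hypothetical slow-to-fast plan."""
    speed: float     # work units processed per unit time
    duration: float  # maximum time a job may spend in this stage


def run_job(work, stages, alpha=3.0):
    """Return (latency, energy) for one job under a slow-to-fast plan.

    Assumption: power = speed ** alpha, a common textbook model and not
    a value taken from the paper. The job's demand (work) is not used to
    pick a core; the job simply advances through the stages until done.
    """
    speeds = [s.speed for s in stages]
    if speeds != sorted(speeds):
        raise ValueError("stages must be ordered from slowest to fastest")
    latency = 0.0
    energy = 0.0
    remaining = work
    for stage in stages:
        # Time needed to finish the remaining work at this stage's speed,
        # capped by the time budget of the stage.
        spent = min(remaining / stage.speed, stage.duration)
        latency += spent
        energy += (stage.speed ** alpha) * spent
        remaining -= stage.speed * spent
        if remaining <= 1e-12:
            return latency, energy
    raise ValueError("plan cannot finish this job; extend the last stage")


# Hypothetical plan: up to 2 time units at speed 1, then up to 4 at speed 2.
plan = [Stage(speed=1.0, duration=2.0), Stage(speed=2.0, duration=4.0)]

short_job = run_job(1.0, plan)  # finishes on the slow stage
long_job = run_job(6.0, plan)   # migrates once, from slow to fast
fast_only = run_job(1.0, [Stage(speed=2.0, duration=4.0)])
```

Under these assumed numbers, the short job completes entirely on the slow stage with latency 1 and energy 1, whereas running it on the fast stage alone would halve the latency but cost energy 4. The long job migrates to the fast stage and finishes at latency 4. The plan never needs to know a job's demand in advance, which is the property the abstract highlights.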
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ren, S., He, Y., McKinley, K.S. (2014). A Theoretical Foundation for Scheduling and Designing Heterogeneous Processors for Interactive Applications. In: Kuhn, F. (eds) Distributed Computing. DISC 2014. Lecture Notes in Computer Science, vol 8784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45174-8_11
DOI: https://doi.org/10.1007/978-3-662-45174-8_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45173-1
Online ISBN: 978-3-662-45174-8
eBook Packages: Computer Science, Computer Science (R0)