Abstract
Vector processing can raise peak performance, while multi-threading can improve efficiency. MTV (many-threaded vector) is a new architecture that combines the two to achieve both high computing performance and high throughput. Matrix multiplication is the kernel of many scientific applications. A parallel matrix multiplication algorithm is presented and an analytical performance model is built. Based on the model, the performance of MTV is evaluated and critical configurations are given to guide the design of MTV processors.
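The abstract does not spell out the algorithm itself, so the following is only a minimal sketch of the general idea it names: rows of the result are partitioned across threads, and each thread performs whole-row (vector-style) updates. The function names `axpy`, `multiply_rows`, and `parallel_matmul`, the row-wise partitioning, and the `num_threads` parameter are all assumptions made for illustration and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def axpy(alpha, x, y):
    """Update y += alpha * x over a whole row.

    Stands in for a single vector instruction (hypothetical helper).
    """
    for j in range(len(y)):
        y[j] += alpha * x[j]


def multiply_rows(a, b, c, row_start, row_end):
    """Compute rows [row_start, row_end) of C = A * B with row-wise AXPY updates."""
    for i in range(row_start, row_end):
        for k in range(len(b)):
            axpy(a[i][k], b[k], c[i])


def parallel_matmul(a, b, num_threads=4):
    """Multiply A (m x p) by B (p x n), splitting rows of C across threads.

    Assumes num_threads >= 1 and that b has at least one row.
    """
    m, n = len(a), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    # Ceiling division so every row is assigned to exactly one chunk.
    chunk = max(1, (m + num_threads - 1) // num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(multiply_rows, a, b, c, start, min(start + chunk, m))
            for start in range(0, m, chunk)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return c
```

Each thread writes to a disjoint set of rows of `c`, so no locking is needed. In CPython the threads will not actually run in parallel because of the global interpreter lock; the sketch shows only how the work is decomposed, not the performance behaviour the paper models.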
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Y., Gao, J., Sui, B., Zhang, C., Xu, W. (2015). An Analytical Model for Matrix Multiplication on Many Threaded Vector Processors. In: Xu, W., Xiao, L., Li, J., Zhang, C., Zhu, Z. (eds) Computer Engineering and Technology. NCCET 2014. Communications in Computer and Information Science, vol 491. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45815-0_2
DOI: https://doi.org/10.1007/978-3-662-45815-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45814-3
Online ISBN: 978-3-662-45815-0
eBook Packages: Computer Science, Computer Science (R0)