The Analysis of Generic SIMT Scheduling Model Extracted from GPU
To improve processor performance, a growing number of companies in industry are building single instruction multi-threads (SIMT) scheduling into their processor architectures, which raises the multi-thread parallel performance of multicore processors by strengthening their capacity for parallel multi-thread processing. To support research and development of SIMT technology, this article extracts a generic SIMT scheduling model from the Graphics Processing Unit (GPU), a type of processor used in high performance computing. By analyzing the performance of this scheduling model, the article characterizes the model's attributes and can serve as an important reference for using and optimizing the model in other processors.
Keywords: Multicore processor, Multi-thread parallel processing, Single instruction multi-threads, Scheduling model, Performance analysis