Abstract
Multi-Swarm PSO (MPSO) is an extension of the PSO algorithm that incorporates multiple, collaborating swarms. Although embarrassingly parallel in appearance, MPSO is memory bound, introducing challenges for GPU-based architectures. In this paper, we use device-utilization metrics to drive the development and optimization of an MPSO algorithm applied to the task matching problem. Our hardware architecture is the AMD Accelerated Processing Unit (APU), which fuses the CPU and GPU together on a single chip. We make effective use of features such as the hierarchical memory structure on the APU, the 4-way very long instruction word (VLIW) feature for vectorization, and DMA transfer features for asynchronous transfer of data between global memory and local memory. The resulting algorithm provides a 29% decrease in overall execution time over our baseline implementation.
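The abstract does not give the update rule or the collaboration scheme, so the following is only a minimal CPU-side sketch of a multi-swarm PSO applied to task matching, not the OpenCL implementation described above. It assumes the standard inertia-weight velocity update, a makespan fitness over a hypothetical expected-time-to-compute matrix `etc`, and a simple ring exchange in which each swarm's worst particle is replaced by a copy of its neighbour's best. All function names, parameters, and default values are illustrative assumptions.

```python
import random


def makespan(position, etc):
    """Makespan of a task-to-machine assignment (lower is better).

    position[t] is a continuous coordinate that is clamped and truncated to a
    machine index; etc[t][m] is the assumed run time of task t on machine m.
    """
    n_machines = len(etc[0])
    load = [0.0] * n_machines
    for t, x in enumerate(position):
        m = min(n_machines - 1, max(0, int(x)))
        load[m] += etc[t][m]
    return max(load)


def mpso(etc, n_swarms=4, n_particles=8, iters=200, exchange_every=10,
         w=0.7, c1=1.4, c2=1.4, seed=0):
    """Illustrative multi-swarm PSO; returns (best_position, best_makespan)."""
    rng = random.Random(seed)
    n_tasks, n_machines = len(etc), len(etc[0])
    upper = n_machines - 1e-9  # keep int(x) inside the valid machine range

    def new_particle():
        pos = [rng.uniform(0, n_machines) for _ in range(n_tasks)]
        return {"pos": pos, "vel": [0.0] * n_tasks,
                "best_pos": pos[:], "best_fit": makespan(pos, etc)}

    swarms = [[new_particle() for _ in range(n_particles)]
              for _ in range(n_swarms)]

    for it in range(1, iters + 1):
        for swarm in swarms:
            # Swarm-level best position, fixed for this iteration.
            gbest = min(swarm, key=lambda p: p["best_fit"])["best_pos"][:]
            for p in swarm:
                for d in range(n_tasks):
                    r1, r2 = rng.random(), rng.random()
                    # Inertia-weight velocity update, then clamped move.
                    p["vel"][d] = (w * p["vel"][d]
                                   + c1 * r1 * (p["best_pos"][d] - p["pos"][d])
                                   + c2 * r2 * (gbest[d] - p["pos"][d]))
                    p["pos"][d] = min(upper, max(0.0, p["pos"][d] + p["vel"][d]))
                fit = makespan(p["pos"], etc)
                if fit < p["best_fit"]:
                    p["best_fit"], p["best_pos"] = fit, p["pos"][:]

        if it % exchange_every == 0:
            # Assumed collaboration step: snapshot every swarm's best, then
            # overwrite each swarm's worst particle with its neighbour's best.
            donors = []
            for s in swarms:
                b = min(s, key=lambda p: p["best_fit"])
                donors.append((b["best_pos"][:], b["best_fit"]))
            for i, swarm in enumerate(swarms):
                d_pos, d_fit = donors[(i - 1) % n_swarms]
                worst = max(swarm, key=lambda p: p["best_fit"])
                worst["pos"], worst["best_pos"] = d_pos[:], d_pos[:]
                worst["best_fit"] = d_fit
                worst["vel"] = [0.0] * n_tasks

    best = min((p for s in swarms for p in s), key=lambda p: p["best_fit"])
    return best["best_pos"], best["best_fit"]
```

Calling `mpso(etc)` with a small matrix such as `[[3.0, 1.0], [2.0, 4.0], [1.0, 1.0], [5.0, 2.0]]` returns a position vector and its makespan. In this sketch the swarms interact only at the exchange step, which makes the per-swarm work look independent; the memory traffic for positions, velocities, and personal bests is the kind of cost the abstract identifies as the bottleneck on a GPU.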
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Franz, W., Thulasiraman, P., Thulasiram, R.K. (2013). Memory Efficient Multi-Swarm PSO Algorithm in OpenCL on an APU. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_20
DOI: https://doi.org/10.1007/978-3-319-03859-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science, Computer Science (R0)