Memory-Efficient and Stabilizing Management System and Parallel Methods for RELION Using CUDA and MPI
In cryo-electron microscopy (cryoEM), RELION has proven to be a powerful tool for high-resolution reconstruction and has quickly gained popularity. However, because cryoEM datasets are large and the RELION algorithm is computation-intensive, its refinement procedure is both time-consuming and memory-demanding. These two problems have become major bottlenecks for its use. Although there have been efforts to parallelize RELION, GPU (Graphics Processing Unit) global memory may still be too small to meet its requirements. Moreover, because there is currently no automatic memory management system on the GPU, memory fragmentation increases with each iteration and eventually crashes the program. In this work, we designed a memory-efficient and stabilizing management system to guarantee the robustness of our program and the efficiency of GPU memory usage. To reduce memory usage, we developed a novel data structure for RELION 2.0. We also propose a parallel algorithm for weight calculation to speed up the computation. Experiments show that the memory system avoids memory fragmentation and that our implementation achieves a better speedup ratio than RELION 2.0.
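The fragmentation problem mentioned above can be illustrated with a minimal sketch of one common remedy: reserve a single large buffer once and hand out sub-blocks from it, then release them all together at the end of each iteration. This is not the authors' management system, whose design the abstract does not describe. The class `Arena`, its methods, and the 256-byte default alignment are hypothetical names and values chosen for illustration, and the buffer lives in host memory so the example runs without a GPU. In a CUDA program the backing buffer would presumably come from a single `cudaMalloc` call made at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bump-pointer arena (illustrative only, not RELION code).
// One buffer is reserved up front; blocks are carved out of it in order
// and released all at once, so repeated iterations leave no holes behind.
class Arena {
public:
    explicit Arena(std::size_t capacity, std::size_t alignment = 256)
        : buffer_(capacity), alignment_(alignment), offset_(0) {
        assert(alignment_ > 0);
    }

    // Returns the byte offset of a new block inside the arena.
    // Throws std::bad_alloc when the remaining space is too small.
    std::size_t allocate(std::size_t bytes) {
        // Round the current offset up to the next multiple of alignment_.
        std::size_t start =
            (offset_ + alignment_ - 1) / alignment_ * alignment_;
        if (start > buffer_.size() || bytes > buffer_.size() - start) {
            throw std::bad_alloc();
        }
        offset_ = start + bytes;
        return start;
    }

    // Releases every block at once, e.g. at the end of an iteration.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }
    std::size_t capacity() const { return buffer_.size(); }
    unsigned char* data() { return buffer_.data(); }

private:
    std::vector<unsigned char> buffer_;  // stands in for device memory
    std::size_t alignment_;
    std::size_t offset_;
};
```

Because every iteration starts from offset zero after `reset()`, the same sequence of requests always lands at the same offsets, which is the property that keeps the layout from degrading over time. The trade-off in this sketch is that individual blocks cannot be freed early, so the arena has to be sized for the peak demand of one iteration.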
Keywords: cryoEM, RELION, CUDA, Performance tuning
This research is supported by the National Key Research and Development Program of China (2017YFA0504702), the NSFC projects Grant No. U1611263, U1611261, 61472397, 61502455, 61672493 and Special Program for Applied Research on Super Computation of the NSFC-Guangdong Joint Fund (the second phase).