Journal of Computer Science and Technology, Volume 32, Issue 2, pp. 269-285

Parallel Turing Machine, a Proposal

Regular Paper


We have witnessed tremendous momentum in the second spring of parallel computing in recent years. But we should also remember the field's low point more than 20 years ago, when the late Ken Kennedy, a prominent leader of parallel computing in the world, asked in an article entitled "The Death of Parallel Computing" whether "parallel computing will soon be relegated to the trash heap reserved for promising technologies that never quite make it". Facing the new era of parallel computing, we should learn from the robust history of sequential computation over the past 60 years, and in particular from the foundation established by the Turing machine model (1936) and its profound impact on that history. To this end, this paper examines the disappointing state of work on parallel Turing machine models over the past 50 years of parallel computing research. The lack of a solid yet intuitive parallel Turing machine model will remain a serious challenge for future parallel computing. This paper presents an attempt to address this challenge with a proposal for a parallel Turing machine model. We also discuss why we begin our work from a parallel Turing machine model rather than from other choices.


Keywords: parallel Turing machine; codelet; abstract architecture; parallel computing
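The classical sequential Turing machine of 1936, whose parallel generalization the paper proposes, can be sketched as a small simulator. The machine below (its alphabet, states, and transition table) is an illustrative assumption for exposition, not taken from the paper:

```python
# A minimal sketch of the classical single-tape Turing machine (1936).
# The transition table FLIP below is a hypothetical example machine,
# not one defined in the paper.

def run_tm(transitions, tape, state="q0", accept="qacc", max_steps=1000):
    """Simulate a single-tape Turing machine.

    transitions maps (state, symbol) -> (new_state, write_symbol, move),
    where move is -1 (left), 0 (stay), or +1 (right).
    Returns (final_state, tape_contents_as_string); blank cells are "_".
    """
    cells = dict(enumerate(tape))  # sparse tape representation
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, "_")
        if state == accept or (state, symbol) not in transitions:
            break
        state, write, move = transitions[(state, symbol)]
        cells[head] = write
        head += move
    return state, "".join(cells[i] for i in sorted(cells))

# Example machine: flip every bit of a binary input, accept at the blank.
FLIP = {
    ("q0", "0"): ("q0", "1", +1),
    ("q0", "1"): ("q0", "0", +1),
    ("q0", "_"): ("qacc", "_", 0),
}

state, result = run_tm(FLIP, "1010")
print(state, result)  # -> qacc 0101_
```

The single head and single global state are exactly what a parallel Turing machine model must generalize: many concurrently active control units operating over shared storage.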



Supplementary material

ESM 1 (PDF, 219 kB)


References

[1] Bouknight W J, Denenberg S A, McIntyre D E, Randall J M, Sameh A H, Slotnick D L. The Illiac IV system. Proceedings of the IEEE, 1972, 60(4): 369-388.
[2] Russell R M. The CRAY-1 computer system. Communications of the ACM, 1978, 21(1): 63-72.
[3] Duncan R. A survey of parallel computer architectures. Computer, 1990, 23(2): 5-16.
[4] Eckert Jr J P, Mauchly J W. Automatic high-speed computing: A progress report on the EDVAC. Report of Work under Contract No.W-670-ORD-4926, Moore School of Electrical Engineering, University of Pennsylvania, Sept. 1945.
[5] von Neumann J. First draft of a report on the EDVAC. IEEE Annals of the History of Computing, 1993, 15(4): 27-75.
[6] Valiant L G. A bridging model for parallel computation. Communications of the ACM, 1990, 33(8): 103-111.
[7] Culler D E, Karp R M, Patterson D, Sahay A, Santos E E, Schauser K E, Subramonian R, von Eicken T. LogP: A practical model of parallel computation. Communications of the ACM, 1996, 39(11): 78-85.
[8] Lee E A. The problem with threads. Computer, 2006, 39(5): 33-42.
[9] Hemmerling A. Systeme von Turing-Automaten und Zellularräume auf rahmbaren Pseudomustermengen (Systems of Turing automata and cellular spaces on framable pseudo-pattern sets). Elektronische Informationsverarbeitung und Kybernetik, 1979, 15(1/2): 47-72. (in German)
[10] Wiedermann J. Parallel Turing machines. Technical Report RUU-CS-84-11, Department of Computer Science, University of Utrecht, The Netherlands, 1984.
[11] Cooper S B. Turing's Titanic machine? Communications of the ACM, 2012, 55(3): 74-83.
[12] Goldin D, Wegner P. The Church-Turing thesis: Breaking the myth. In Lecture Notes in Computer Science 3526, Cooper S B, Löwe B, Torenvliet L (eds.), Springer, 2005, pp.152-168.
[13] Turing A M. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 1936, s2-42(1): 230-265.
[14] Lamport L. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 1978, 21(7): 558-565.
[15] Dennis J B. A parallel program execution model supporting modular software construction. In Proc. the 3rd Working Conference on Massively Parallel Programming Models, November 1997, pp.50-60.
[16] Gao G R, Sarkar V. Location consistency — a new memory model and cache consistency protocol. IEEE Transactions on Computers, 2000, 49(8): 798-813.
[17] Garcia E, Orozco D, Gao G R. Energy efficient tiling on a many-core architecture. In Proc. the 4th Workshop on Programmability Issues for Heterogeneous Multicores, held with the 6th International Conference on High-Performance and Embedded Architectures and Compilers, January 2011, pp.53-66.
[18] Dennis J B, Fosseen J B, Linderman J P. Data flow schemas. In Proc. the International Symposium on Theoretical Programming, August 1972, pp.187-216.
[19] Gao G R. An efficient hybrid dataflow architecture model. Journal of Parallel and Distributed Computing, 1993, 19(4): 293-307.
[20] Dennis J B, Gao G R. An efficient pipelined dataflow processor architecture. In Proc. the ACM/IEEE Conference on Supercomputing, November 1988, pp.368-373.
[21] Gao G R, Tio R, Hum H H J. Design of an efficient dataflow architecture without data flow. In Proc. the International Conference on Fifth Generation Computer Systems, December 1988, pp.861-868.
[22] Zuckerman S, Suetterlein J, Knauerhase R, Gao G R. Using a "codelet" program execution model for exascale machines: Position paper. In Proc. the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era, June 2011, pp.64-69.
[23] Dennis J B. Fresh Breeze: A multiprocessor chip architecture guided by modular programming principles. ACM SIGARCH Computer Architecture News, 2003, 31(1): 7-15.
[24] Dennis J B, van Horn E C. Programming semantics for multiprogrammed computations. Communications of the ACM, 1966, 9(3): 143-155.
[25] Gao G R, Suetterlein J, Zuckerman S. Toward an execution model for extreme-scale systems: Runnemede and beyond. CAPSL Technical Memo 104, April 2011.
[26] Denning P J, Dennis J B. The resurgence of parallelism. Communications of the ACM, 2010, 53(6): 30-32.
[27] van Emde Boas P. Machine models and simulations. In Handbook of Theoretical Computer Science (vol. A), van Leeuwen J (ed.), MIT Press, 1990, pp.1-66.
[28] Ito T. Synchronized alternation and parallelism for three-dimensional automata [Ph.D. Thesis]. University of Miyazaki, 2008.
[29] Okinaka K, Inoue K, Ito A. A note on hardware-bounded parallel Turing machines. In Proc. the 2nd International Conference on Information, December 2002, pp.90-100.
[30] Ito T, Sakamoto M, Taniue A, Matsukawa T, Uchida Y, Furutani H, Kono M. Parallel Turing machines on four-dimensional input tapes. Artificial Life and Robotics, 2010, 15(2): 212-215.
[31] Dubois M, Scheurich C, Briggs F. Memory access buffering in multiprocessors. ACM SIGARCH Computer Architecture News, 1986, 14(2): 434-442.
[32] Gharachorloo K, Lenoski D, Laudon J, Gibbons P, Gupta A, Hennessy J. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proc. the 17th Annual International Symposium on Computer Architecture, May 1990, pp.15-26.
[33] Chen C, Manzano J B, Gan G, Gao G R, Sarkar V. A study of a software cache implementation of the OpenMP memory model for multicore and manycore architectures. In Lecture Notes in Computer Science 6272, D'Ambra P, Guarracino M, Talia D (eds.), Springer, 2010, pp.341-352.
[34] Gao G R, Hum H H J, Monti J M. Towards an efficient hybrid dataflow architecture model. In Proc. Parallel Architectures and Languages Europe, June 1991, pp.355-371.
[35] Iannucci R A. Toward a dataflow/von Neumann hybrid architecture. ACM SIGARCH Computer Architecture News, 1988, 16(2): 131-140.
[36] Dennis J B. First version of a data flow procedure language. In Proc. Programming Symposium, April 1974, pp.362-376.
[37] Arvind K, Nikhil R S. Executing a program on the MIT tagged-token dataflow architecture. IEEE Transactions on Computers, 1990, 39(3): 300-318.
[38] Blumofe R D, Joerg C F, Kuszmaul B C, Leiserson C E, Randall K H, Zhou Y L. Cilk: An efficient multithreaded runtime system. Journal of Parallel and Distributed Computing, 1996, 37(1): 55-69.
[39] Hum H H J, Maquelin O, Theobald K B, Tian X M, Tang X A, Gao G R, Cupryk P, Elmasri N, Hendren L J, Jimenez A, Krishnan S, Marquez A, Merali S, Nemawarkar S S, Panangaden P, Xue X, Zhu Y C. A design study of the EARTH multiprocessor. In Proc. the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques, June 1995, pp.59-68.
[40] Dennis J B, Misunas D P. A preliminary architecture for a basic data-flow processor. In Proc. the 2nd Annual Symposium on Computer Architecture, January 1975, pp.126-132.
[41] Hennessy J L, Patterson D A. Computer Architecture: A Quantitative Approach (5th edition). Elsevier, 2012.
[42] Lauderdale C, Khan R. Position paper: Towards a codelet-based runtime for exascale computing. In Proc. the 2nd International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era, March 2012, pp.21-26.
[43] Cassidy A S, Merolla P, Arthur J V, Esser S K, Jackson B, Alvarez-Icaza R, Datta P, Sawada J, Wong T M, Feldman V, Amir A, Rubin D B D, Akopyan F, McQuinn E, Risk W P, Modha D S. Cognitive computing building block: A versatile and efficient digital neuron model for neurosynaptic cores. In Proc. the International Joint Conference on Neural Networks, August 2013.
[44] Khan M M, Lester D R, Plana L A, Rast A, Jin X, Painkras E, Furber S B. SpiNNaker: Mapping neural networks onto a massively-parallel chip multiprocessor. In Proc. the IEEE International Joint Conference on Neural Networks, June 2008, pp.2849-2856.
[45] Ji Y, Zhang Y H, Li S C, Chi P, Jiang C H, Qu P, Xie Y, Chen W G. NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints. In Proc. the 49th Annual IEEE/ACM International Symposium on Microarchitecture, October 2016.

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Peng Qu (1)
  • Jin Yan (2)
  • You-Hui Zhang (1)
  • Guang R. Gao (3)

  1. Department of Computer Science and Technology, Tsinghua University, Beijing, China
  2. Department of Computer Science, Brown University, Providence, U.S.A.
  3. Department of Electrical and Computer Engineering, University of Delaware, Newark, U.S.A.
