1 Introduction

The suggestion of drawing inspiration from neural architectures when designing microprocessors, the “brain in silicon” idea, has accompanied the twists and turns of computer history almost since its beginning. More precisely, three main periods can be identified during which this idea turned into practical projects. The first neural hardware was designed by Marvin Minsky in 1951 [52], building upon the logical interpretation of neural activity by McCulloch and Pitts [48], and was followed by just a few more attempts. After a long period of almost complete lack of progress, interest was rekindled at the end of the 1980s, driven by the success of neural algorithms in software [64], with several funded projects in Europe, the US, and Japan [75]. However, by the beginning of this century almost none of the results of that effort had reached maturity. In the last few years, a new wave of enthusiasm for neural hardware has spread, propelled by a few large projects funded in Europe and the US for realistic brain simulations [45]. Again, a revolution in microprocessor design has been forecast, in terms closely reminiscent of those of the previous two periods. This history is described in the next section.

Despite obvious progress and changes in technology and in the knowledge of neural mechanisms, the analysis of those three periods shows a common view of the reasons to believe that the brain in silicon should be successful. This view, discussed in depth in Sect. 3, essentially hinges on the computational nature of the nervous system and on the extreme degree of efficiency it offers, thanks to millions of years of evolution. We will first point out the difference between the possibility of a shared computational nature of the brain and of human-made digital computers, as held within the cognitive science and artificial intelligence communities, and the more problematic equivalence, at the implementation level, of the features of brain and silicon computations. Second, even if mimicking the brain at the circuit level might be effective, nobody knows exactly what to copy, since what turns the brain into an extremely powerful computer is still obscure. We will review the efforts of the neuroscientific community to characterize the key circuit elements of the brain, and their still inconsistent answers. We will add that the most powerful computational feature of the brain very likely relies on plasticity mechanisms, which are the least likely to be reproduced in silicon. In the end, even if we are unable to make predictions about the future of neuromorphic hardware, we can say that the principles used to promote this enterprise are theoretically flawed, and therefore the premises for the success of this approach are weak.

2 A Short Historical Account

The events identified in the following short historical sketch can be framed within the wider context of the history of the mind as a machine, a history that one might reasonably suggest began with Descartes’ thorough reflections on the possibility of mechanizing the mind [16]. Even if he dismissed a mechanical nature of the human mind, he brought this exciting idea to light. It was embraced shortly after by La Mettrie in the essay L’Homme Machine [39], and put into practice by the French engineer and inventor Jacques de Vaucanson with the Tambourine Player and the Flute Player [9]. In the period around the advent of digital computers, the history of the mind as a machine evolved into at least three main movements: the cybernetic project, aimed at explaining living behaviors by the principles of the information-feedback machine, in the 1940s and 1950s [38, 78]; the advent of artificial intelligence [13, 47]; and finally the enterprise of understanding the mind by modeling its workings within the field of cognitive science [7]. The history traced here is much narrower. It focuses on the more specific idea that hints gathered from observing the structure of the brain may provide guiding principles for the design of computer hardware, which may turn out to be competitive with the standard von Neumann architecture. Even if this history is entwined with the broader histories just mentioned, there are important differences: for example, the goal here is rather practical and does not attempt to tackle an understanding of how the mind works.

2.1 First Ideas and Realization

The first suggestion to design computers by borrowing hints from the brain came from Alan Turing [76], who envisioned a machine based on distributed interconnected elements, called the B-type unorganized machine. This happened even before the first general-purpose electronic computers were up and running. Turing’s neurons are simply NAND gates with two inputs, randomly interconnected. Each NAND input can be connected or disconnected, so that a learning method can “organize” the machine by modifying the connections. His idea of learning generic algorithms by reinforcing successful and useful links in a network, and cutting useless ones, was the most farsighted of this report. Not so farsighted was his employer at the National Physical Laboratory, for which the report was produced, who dismissed the work as a “schoolboy essay”. As a result, the report remained hidden for decades, until it was brought back to attention by Copeland and Proudfoot in 1996 [12].
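The mechanism just described can be illustrated with a minimal Python sketch: a small network of two-input NAND units with random wiring, where each input line carries a switch that can be connected or disconnected. The names `BTypeNetwork`, `set_connection` and `step`, the synchronous update, and the convention that a disconnected line reads as a constant 1 are assumptions made for this example, not details taken from Turing's report.

```python
import random


class BTypeNetwork:
    """Toy network of two-input NAND units with switchable input lines."""

    def __init__(self, n_units, seed=0):
        rng = random.Random(seed)
        # Each unit reads from two randomly chosen units (random wiring).
        self.sources = [(rng.randrange(n_units), rng.randrange(n_units))
                        for _ in range(n_units)]
        # One switch per input line: True = connected, False = disconnected.
        self.connected = [[True, True] for _ in range(n_units)]
        # Binary state of every unit, initially 0.
        self.state = [0] * n_units

    def set_connection(self, unit, line, enabled):
        """'Organize' the machine by enabling or cutting one input line."""
        self.connected[unit][line] = enabled

    def step(self):
        """Synchronously update all units with the NAND rule."""
        new_state = []
        for unit, (a, b) in enumerate(self.sources):
            # Assumption: a disconnected line is read as a constant 1.
            x = self.state[a] if self.connected[unit][0] else 1
            y = self.state[b] if self.connected[unit][1] else 1
            new_state.append(1 - (x & y))  # NAND of the two inputs
        self.state = new_state
        return self.state


net = BTypeNetwork(n_units=8, seed=1)
net.step()                       # all inputs are 0, so every NAND outputs 1
net.set_connection(0, 0, False)  # cut both input lines of unit 0
net.set_connection(0, 1, False)
net.step()                       # unit 0 now computes NAND(1, 1) = 0
```

A learning procedure in this setting would amount to a policy for calling `set_connection`; no such policy is sketched here.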

In contrast, an earlier paper by McCulloch and Pitts [48], suggesting that neurons work like logic gates, had a tremendous impact on the yet unborn field of Artificial Intelligence. They attempted to adapt the logic formalism of Carnap [10] to “neural” elements that have only two possible values, corresponding to the logical truth values. There are two types of connections, excitatory and inhibitory, and an intrinsic feature of every neuron is a threshold, corresponding to the net number of “true” input values needed to produce “true” as the output value. Today it is well known that McCulloch and Pitts’ idea was simply wrong, and they too became well aware of the different direction in which the growing neuroscientific evidence was pointing [59]. Nevertheless, the brain as a logic machine was a fascinating hypothesis, and it galvanized more than one scholar at that time, including Minsky [52], who designed SNARC (Stochastic Neural Analog Reinforcement Calculator), the first neural computer, assembling 40 “neurons”, each made with six vacuum tubes and a motor to mechanically adjust its connections. The objective of SNARC was to find the exit from a maze in which the machine would play the part of a rat. Running a mouse through a maze to investigate behavioral patterns was one of the most common paradigms in empirical psychology at that time. The construction of SNARC was an ambitious project, one of the first attempts to reproduce intelligent behavior artificially, but it had no influence on the contemporary progress of digital general-purpose computers. Later on, Minsky himself was one of the most authoritative voices behind the dismissal of artificial neural research as a whole. The analysis he carried out, together with Papert, of Rosenblatt’s perceptron neural network [61] concluded with these discouraging words [51]:

The perceptron has shown itself worthy of study despite (and even because of!) its severe limitations. It has many features to attract attention: its linearity; its intriguing learning theorem; its clear paradigmatic simplicity as a kind of parallel computation. There is no reason to suppose that any of these virtues carry over to the many-layered version. Nevertheless, we consider it to be an important research problem to elucidate (or reject) our intuitive judgment that the extension is sterile.

[pp. 231–232]

2.2 The Artificial Neural Networks Period

In the 1980s artificial neural networks rapidly became one of the dominant research fields in artificial intelligence, in part thanks to the impressive work of Rumelhart and McClelland [64] and their group, who introduced simple and efficient learning algorithms, like backpropagation. In the 1990s artificial neural networks were the preferred choice for problems in which the rules governing a system are unknown or difficult to implement, thanks to their ability to learn arbitrary functions [28], but applications were typically implemented in software. The enormous worldwide resurgence of artificial neural networks revived interest in building brain-like hardware. In the 1990s the European Commission’s ESPRIT program promoted the development of neural hardware with several research projects: ANNIE, PYGMALION, GALATEA and SPRINT. The Japanese made neuromorphic hardware a key component of their 6th generation computing, and in the US funding for the subject was provided by DARPA, ONR and NSF [75]. In the mid 1990s about twenty neuromorphic hardware products were commercialized, ranging from Siemens’ SYNAPSE-1 (Synthesis of Neural Algorithms on a Parallel Systolic Engine), to Philips’ L-Neuro, to Adaptive Solutions’ CNAPS (Connected Network of Adaptive Processors), to Intel’s ETANN and Hitachi’s MY-NEUPOWER [29]. Despite differences in the technical solutions adopted, the shared approach was essentially to achieve in hardware the maximum performance on array multiplication and summation of the results. This is actually the most intensive computation in a feed-forward network scheme [64], where each layer is defined by a weight matrix to be applied to the current input vector. The way the elements of a neural chip are connected with one another to parallelize the application of the weight matrix varies. 
There are n-parallel configurations, where the synapses of one neuron are mapped to the same processing element, and several neurons are computed in parallel, and s-parallel configurations, where the synapses of one neuron are mapped to different processing elements, so that several synapses, not belonging to the same neuron, are computed at once. The best configuration depends on the network architecture and on the number of processing elements of the chip. The CNAPS processor has 64 processing elements, Siemens’ MA-16 is designed for 4 × 4 matrix operations, and in SYNAPSE-1 eight MA-16 chips are cascaded to form systolic arrays.
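The two mappings can be illustrated with a short Python sketch of a single layer, where the output is the weight matrix applied to the input vector. This is one possible reading of the n-parallel and s-parallel schemes, not a model of any of the chips named above: the function names `n_parallel` and `s_parallel` are hypothetical, and ordinary loops stand in for processing elements that would run concurrently in hardware.

```python
def n_parallel(weights, x):
    """Neuron-parallel: one processing element (PE) per neuron.

    Each PE walks sequentially over all the synapses of its own neuron;
    the outer loop stands for PEs that would run concurrently in hardware.
    """
    outputs = []
    for row in weights:                 # one PE per neuron (parallel in hardware)
        acc = 0.0
        for w, xi in zip(row, x):       # synapses handled one after another
            acc += w * xi
        outputs.append(acc)
    return outputs


def s_parallel(weights, x):
    """Synapse-parallel: the synapses of one neuron are spread over the PEs.

    PE j holds the j-th synapse; all products for a neuron are formed at
    once and then reduced by a sum, neuron after neuron.
    """
    outputs = []
    for row in weights:                              # neurons handled in sequence
        partial = [w * xi for w, xi in zip(row, x)]  # one product per PE (parallel)
        outputs.append(sum(partial))                 # reduction across PEs
    return outputs


W = [[1.0, 2.0, 0.0],
     [0.5, -1.0, 3.0]]
x = [2.0, 1.0, 4.0]
print(n_parallel(W, x))   # both mappings compute the same product W·x
print(s_parallel(W, x))
```

Both functions return the same vector. What differs is which loop is unrolled across processing elements, which is why the better choice depends on the layer shape and on the number of elements available.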

Strictly speaking, not much of the computing architecture design in these projects is inspired by the brain. Rather, it is tailored to the abstract interpretation of the computation performed by the brain from a connectionist point of view: summations of inputs weighted by connection strengths. All solutions met with negligible market interest and soon disappeared.

2.3 The Brain “Reverse-Engineering” Challenge

At the beginning of this century a new wave of efforts towards neuron-like hardware mounted, in part driven by the large worldwide enterprise of brain reverse-engineering, taken up by projects like the Blue Brain Project [43] and the Human Brain Project in Europe [45], and the DARPA C2S2 (Cognitive Computing via Synaptronics and Supercomputing) project at IBM Research [54]. For most of these long-term projects, the ultimate goal is to emulate the entire human brain, and the dominant approach is the emulation of neurons in software, running on the world’s top supercomputing systems, like IBM Blue Gene [67]. However, a number of smaller projects started developing new neural hardware too, like FACETS, Neurogrid, SpiNNaker and NeuroDyn [53]. In most of these projects the imitation of the brain reduces, in fact, to a fast parallel implementation of the most time-consuming part of mathematical abstractions of neural activity. For example, SpiNNaker (Spiking Neural Network Architecture, run by Furber and his team at Manchester [32]) is based on conventional low-power digital ARM9 cores, the kind of CPU commonly found in smartphones, programmed to solve a popular simplified formulation of the action potential [31]. SpiNNaker currently consists of 20,000 chips, each of which emulates 1000 neurons. On a similar path is the new effort undertaken by Modha at IBM with the TrueNorth chip, based on conventional digital devices running Izhikevich’s algorithm, but using an entirely new chip design, consisting of an array of 4096 cores, each with 256 neurons [49]. There are alternatives to digital devices too, which aim at reproducing the analog behavior of neurons. For example, in the FACETS (Fast Analog Computing with Emergent Transient States) project [66] an ASIC chip simulates analog neuron waveforms, for as many as 512 neurons with 256 inputs. 
As soon as an analog neuron reaches the conditions for an action potential, the digital part of the chip generates a spike event with the event time and the address of the spiking neuron.
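The two ingredients mentioned above, a simplified spiking model and spike events carrying a time and a neuron address, can be put together in a minimal Python sketch. It assumes the commonly published two-variable form of the Izhikevich model with its regular-spiking parameter set and a plain forward-Euler step. The function name `simulate`, the 0.5 ms step and the input currents are choices made for this example, not details of SpiNNaker, TrueNorth or FACETS.

```python
def simulate(n_neurons, input_current, t_max_ms, dt=0.5,
             a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler integration of Izhikevich-type neurons, emitting address events."""
    v = [c] * n_neurons              # membrane potential (mV)
    u = [b * c] * n_neurons          # recovery variable
    events = []                      # list of (time_ms, neuron_address)
    steps = int(t_max_ms / dt)
    for k in range(steps):
        t = k * dt
        for i in range(n_neurons):
            if v[i] >= 30.0:         # spike condition reached
                events.append((t, i))
                v[i] = c             # reset the membrane potential
                u[i] += d            # increment the recovery variable
            # Assumed model equations (standard Izhikevich form).
            dv = 0.04 * v[i] ** 2 + 5.0 * v[i] + 140.0 - u[i] + input_current[i]
            du = a * (b * v[i] - u[i])
            v[i] += dt * dv
            u[i] += dt * du
    return events


# Neuron 0 receives a constant drive, neuron 1 receives none.
events = simulate(n_neurons=2, input_current=[10.0, 0.0], t_max_ms=200.0)
for time_ms, address in events[:3]:
    print(time_ms, address)
```

Only the driven neuron produces events. The list of `(time, address)` pairs is the only information that leaves the loop, which mirrors the event-based output described in the text.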

Even if the most advanced achievements in brain reverse-engineering have been obtained on traditional supercomputers [45], neuromorphic hardware systems may offer a valid alternative for this challenging enterprise. But hopes are high, again, for the potential of neuromorphic computation as a general-purpose computer, not just for simulating brain behavior or executing advanced AI applications, but for everyday software run by ordinary users on their own computers [77].

To grasp the motivations for such a periodic impulse toward the brain in silicon, it is instructive to compare three overviews of neural hardware and forecasts for the future, spaced about a decade apart [17, 24, 25]. The similarity among them is striking: each finds the current impact of neural hardware unsatisfactory, yet expresses confidence in the long-run potential of the approach. Heemskerk, in 1995, first expressed his concerns [25]:

“Neurocomputer building is expensive in terms of development time and resources, and little is known about the real commercial prospects for working implementations [...] Another reason for not actually building neurocomputers might lie in the fact that the number and variety of (novel) neural network paradigms is still increasing rapidly”

but concluded with:

“If progress advances as rapidly as it has in the past, this implies that neurocomputer performances will increase by about two orders of magnitude [...]. This would offer good opportunities”

Similarly, in 2004 Dias and coworkers [17] stated that:

“A few new neurochips are reported in this survey while the information collected indicates that more neurochips are no longer available commercially. The new solutions that have appeared indicate that this field is still active, but the removal of the market of other solutions does not seem to be good news. [...] there is no clear consensus on how to exploit the currently available [...] technological capabilities for massively parallel neural network hardware implementations.”

nevertheless, they believe in future opportunities:

“These might be the reasons for the slow development of the ANN hardware market in the last years, but the authors believe that this situation will change in the near future with the appearance of new hardware solutions.”

In 2013 Hasler and Marr [24] acknowledged that not much had been achieved so far:

“A primary goal since the early days of neuromorphic hardware research has been to build large-scale systems, although only recently have enough technological breakthroughs been made to allow such visions to be possible.”

but their hopes are even higher:

“Neuromorphic engineering builds artificial systems utilizing basic nervous system operations implemented through bridging fundamental physics of the two mediums, enabling superior synthetic application performance [...] research in this area will accelerate by the pull of commercial ventures that can start utilizing these technologies to competitive commercial advantage.”

3 Computational Secrets of the Brain

There is a noteworthy difference between the neurocomputer project and the AI enterprise, which is grounded on the famous multiple-realizability thesis [21]: cognition is characterized as computation independent of its physical implementation. Why, then, should the mechanisms that produce computational power in a biophysical system like the brain produce, in a radically different system, efficient computation when executing generic (including non-cognitive) algorithms? The most common argument used to support the belief in the future superiority of neuromorphic hardware is well summarized in these words of Lande [40]:

“One possible answer [to CPU design problems] is to look into what life has invented along half a billion years of evolution [...] Numerous principles found in the brain can provide inspiration to circuit and system designers.”

But this argument is flawed on several counts.

3.1 Evolution Is Not Design

First, the mechanisms implemented by biological evolution are shaped by the specific constraints of the organic system, which impose paths that are in principle neither better nor worse than man-made solutions. Both the brain and digital computers are based on electricity. However, electrical power for digital computation is conducted by metals, like copper, the fastest available conductor, and by semiconductors, such as silicon and germanium, which allow control over electron flow at the highest possible speed. Nature, by contrast, faced an enormous disadvantage in dealing with electricity compared to man-made devices, in that metals and semiconductors cannot be used inside living organisms. Nature opted for the only electrical conductors compatible with organic materials: ions. The biophysical breakthrough in exploiting electric power in animals was the ion channel, a sort of natural electrical device, which first appeared as the potassium channel about three billion years ago in bacteria and then evolved, 650 million years ago, into the sodium channel, currently the most important neural channel [79]. The success of this particular ion channel is very likely due to the abundant availability of sodium in the marine environment during the Paleozoic era. How the neural cell emerged from early ion channels is still uncertain. As with many events in evolution, contingencies may have prevailed over pure adaptation. In the case of neurons, an intriguing theory is that their phylogeny is independent of the history of ion channels. One shared prerequisite of all neurons, from a genomic standpoint, is the capacity to express many more genes and gene products than other cell types, a behavior that most cells also exhibit as a result of severe stress responses. Neurons might have evolved in ancestral metazoans from other cell types, as the result of development in the adaptive response to localized injury and stress [55]. 
Similarly obscure and contingency-dependent histories concern the whole span of neural machinery, from its first appearance in cnidarians, through the early central nervous system of echinoderms and the simple brain of flatworms, to the full brains of polychaetes, insects, cephalopods, and vertebrates [62].

In playing with electricity, nature has its troubles but also its advantages with respect to computer designers. The repertoire of organic structures and compounds is quite vast: there are more than 100 known types of neurotransmitters, and a vast galaxy of neural cell types, diverse both morphologically and in their molecular basis [74]. The organization of neural circuitry spans three dimensions. Even in the new era of three-dimensional semiconductors, which could accommodate the density of the massive connection pool of a nervous system, technology would not suffice as a basis for devising the layered structure, including the growth of a dendritic tree, which has no foreseeable equivalent in artificial systems.

3.2 What Should Be Copied from the Brain?

For the sake of argument, let the evolutionary trajectory of the brain be irrelevant, and the profound physical differences between brain and silicon be somehow overcome. Which key elements of the brain structure should then be used to guide microprocessor design? One risks struggling to reproduce in silicon some architectural aspect of the nervous system that is necessary only for metabolic maintenance, but irrelevant from the computational point of view. The obvious answer should be to take the essential features that make computation in the brain powerful and efficient. The sad point is that nobody has yet been able to identify those features.

The many attempts to relate the complexity of an organism’s behavior to macroscopic measures of the brain remain inconclusive. Both the weight and the size of the brain in vertebrates scale with body size in a regular way. The relative brain size as a percentage of body size is also an index of scarce relevance: it is highest in the mouse and the shrew, with average values for primates, humans included. Another much-discussed general factor is the encephalization quotient, which indicates the extent to which the brain size of a given species deviates from a sort of standard for the same taxon. This index ranks humans at the top, but is inconsistent for other mammals [62]. The picture is even more complicated when invertebrates are included in the comparison [11]. A honeybee’s brain has a volume of about 1 cubic millimeter, yet bees display several cognitive and social abilities previously attributed exclusively to higher vertebrates [68]. Even the pure count of neurons leads to puzzling results: for example, the elephant brain contains 257 billion neurons, against the 86 billion neurons of the human brain [26].

A suggestion for a designer might be to avoid seeking inspiration from the brain in its entirety and to focus instead on a specific structure that exhibits computational power at the maximum level. A good candidate is the cortex, a milestone in the diversification of the brain throughout evolution, which appeared about 200 million years ago [73]. It is widely agreed that the mammalian neocortex is the site of the processes enabling higher cognition [23, 50], and it is particularly attractive as a circuit template because it is extremely regular over a large surface.

However, the reason why the particular way neurons are combined in the cortex makes such a difference with respect to the rest of the brain remains obscure. There are two striking, and at first sight conflicting, features of the cortex:

  • the cortex is remarkably uniform in its anatomical structure and in terms of its neuron-level behavior, with respect to the rest of the brain;

  • the cortex has an incredibly vast variety of functions, compared to any other brain structure.

The uniformity of the cortex, with the regular repetition of a six-layered radial profile, has given rise to the search for a unified computational model able to explain its power and its advantages with respect to other neural circuitry. This is the so-called “canonical microcircuit”, proposed in several formulations. Marr proposed the first canonical model in a long and difficult paper that is one of his least known works [46], trying to derive an organization at the level of neural circuits from a general theory of how mathematical models might explain the classification of sensory signals. The results were far from empirical reality, both at the time and in the light of subsequent experimental evidence, so they were almost totally neglected and later abandoned by Marr himself. A few years later, Shepherd [69–71] elaborated a model that was both much simpler and more closely related to the physiology of the cortex than Marr’s work. This circuit has two inputs: one from other areas of the cortex, making excitatory synapses on dendrites of a superficial pyramidal neuron, and an afferent input terminating on a spiny stellate cell and on dendrites of a deep pyramidal neuron. There are feed-forward inhibitory connections through a superficial inhibitory neuron and feedback connections through a basket cell.

Independently, a similar microcircuit was proposed [19], using a minimal set of neuron-like units, disregarding the spiny stellate cells, whose effect is taken into account in two pyramidal-like cells, and the only non-pyramidal neuron is a generic GABA-receptor inhibitory cell. Several more refined circuits were proposed after these first models [18, 56, 57]. Features of the proposed canonical circuits, such as input amplification by recurrent intracortical connections, have been very influential on researchers in neural computation [15]. Yet these ideas are far from explaining why the cortex is computationally so powerful. Specifically, none of the models provide a direct mapping between elements of the circuits and biological counterparts, and there is no corresponding computational description, with functional variables that can be related to these components, or equations that match dependencies posited among components of the cortical mechanism. Moreover, it is fundamentally quite difficult to see how any of these circuits could account for the diversity of cortical functions. Each circuit would seem to perform a single, fixed operation given its inputs, and cannot explain how the same circuit could do useful information processing across diverse modalities.

3.3 Power from Flexibility

While the search for a cortical microcircuit has advanced only modestly in revealing the computational power of the cortex, there is very strong evidence that a key feature of the cortex is its capability to adapt to perform a wide array of heterogeneous functions, starting from roughly similar structures. This phenomenon is called neural plasticity. It comes in different forms, and it has been investigated from a wide variety of perspectives: the reorganization of the nervous system after injuries and strokes [22, 42], early development after birth [6, 63], and the ordinary, continuous way in which the cortex works, such as in memory formation [3, 8, 72].

With respect to the circuital grain level, plasticity can be classified into:

  1. synaptic plasticity, addressing changes at single synapse level;

  2. intracortical map plasticity, addressing internal changes at the level of a single cortical area;

  3. intercortical map plasticity, addressing changes on a scale larger than a single cortical area.

Synaptic plasticity encompasses long-term potentiation (LTP) [1, 2, 4, 5], where an increase in synaptic efficiency follows repeated coincidences in the timing of the activation of presynaptic and postsynaptic neurons; long-term depression (LTD), the converse process to LTP [30, 80]; and spike-timing-dependent plasticity (STDP), which induces synaptic potentiation when action potentials occur in the presynaptic cell a few milliseconds before those in the postsynaptic site, whereas the opposite temporal order results in long-term depression [41, 44].
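The timing dependence just described can be made concrete with a minimal Python sketch of a pair-based rule. The exponential shape of the window, the function name `stdp_weight_change`, and the amplitudes and time constant are illustrative assumptions, not values taken from the cited studies. Only the sign convention follows the text: presynaptic before postsynaptic gives potentiation, and the opposite order gives depression.

```python
import math


def stdp_weight_change(t_pre_ms, t_post_ms,
                       a_plus=0.01, a_minus=0.012, tau_ms=20.0):
    """Weight change for one pre/post spike pair (illustrative constants)."""
    dt = t_post_ms - t_pre_ms
    if dt > 0:       # presynaptic spike came first: potentiation
        return a_plus * math.exp(-dt / tau_ms)
    if dt < 0:       # postsynaptic spike came first: depression
        return -a_minus * math.exp(dt / tau_ms)
    return 0.0       # simultaneous spikes: no change in this sketch


print(stdp_weight_change(10.0, 15.0))   # positive: pre leads post by 5 ms
print(stdp_weight_change(15.0, 10.0))   # negative: post leads pre by 5 ms
```

In this sketch the magnitude of the change shrinks as the two spikes move further apart in time, so only near-coincident activity modifies the synapse appreciably.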

Intracortical map plasticity is best known from somatosensory and visual cortical maps, with behavioral modifications such as perceptual learning [20, 60, 65], occurring at all ages, and the main diversification of cortical functions taking place early, in part before birth [14, 36, 37].

Intercortical map plasticity refers to changes in one or more cortical maps induced by factors on a scale larger than a single map’s afferents. A typical case is the abnormal development of primary cortical areas following the loss of sensory inputs, when neurons become responsive to sensory modalities different from their original one [34]. The most dramatic cortical reorganizations follow a sensory loss that is congenital or occurs very early in development, although significant modifications have been observed even in adults [33]. Most of the research in this field is carried out by inducing sensory deprivation in animals, resulting in an impressive crossmodal plasticity of primary areas [35].

Thus, the overall picture of how the cortex gains its computational power is rather discouraging from the perspective of taking hints for microprocessor design. Very little can be identified as key solutions in the adult cortex, and the plasticity processes, where its real power is hidden, involve organic transformations, like axon growth, rearrangement of neural connections, appearance and disappearance of boutons and dendritic spines [27], and programmed cell death [58, 81], which are easy to implement in the organic medium but alien to silicon.

4 Conclusions

There are fundamental features of the brain that remain at present not well known, which may account for the lack of computational models able to encompass the real key aspects of biological processing power. We believe that these key aspects should be sought in the very nature and complexity of the plasticity mechanisms rather than in the details of neural unit processing alone. Until a theoretical framework emerges that is able to capture the essential aspects of neural plasticity, and an appropriate technology able to mimic it is devised, the quest for the “brain in silicon” could be severely impaired. We are agnostic concerning the future of neurocomputers. Our point is that the justification put forward for their realizability is scientifically flawed, and this may be the cause of the scarce success met so far.