# Design and specification of analog artificial neural network


## Abstract

In this paper, we have implemented, using Matlab Simulink an analog artificial neural network for breast cancer classification. Simulated results with ideal building blocks exhibit a total error of classification of 2.6%. Thanks to this value, we have modified Simulink models of the building blocks (i.e. multiplier, activation function and its derivative) in order to take into account their non-idealities. This study allows to determine their influence on the classification quality and to extract some specifications of these building blocks.

## Keywords

Neural network · MLP · Breast cancer · Non-ideal blocks · Simulink

## 1 Introduction

Artificial Neural Networks (ANNs) are used in many fields, such as the medical domain (the application of the current paper), communication, power control and other applications. An ANN can be implemented in two ways:

- 1.
Implementation using software.

- 2.
Implementation using hardware.

The first method uses software such as C++, MATLAB or LabVIEW, with different algorithms such as the Multilayer Perceptron with back-propagation (MLP), the Radial Basis Function (RBF) or the Support Vector Machine (SVM) [1, 2, 3]. Some published papers use the fuzzy logic method [4] and others the normalized Multilayer Perceptron with back-propagation (MLP) [5].

The hardware implementation itself can be of two kinds:

- 1.
Digital hardware implementation.

- 2.
Analog hardware implementation.

In the first one, digital devices are used, for example the Field Programmable Gate Array (FPGA) [6, 7, 8, 9].

The second way is based on the analog implementation of ANNs [10]. This type of artificial neural network is particularly interesting for CMOS VLSI implementations because every parallel element (neuron or synapse) is relatively simple, allowing the complete integration of large networks on a single chip [11]. Multipliers [12] and the non-linear function and its derivative [13, 14] are essential key elements of analog signal processing, in particular for the analog VLSI implementation of artificial neural networks. In this article, we design a breast cancer detection and classification system (i.e. malignant or benign) using an MLP (Multi-Layer Perceptron) artificial neural network with back-propagation [14, 15, 16], based on the Wisconsin database [17]. In a previous work, we demonstrated that the most efficient architecture consists of 9 input neurons corresponding to the 9 attributes of the database, 10 neurons in the hidden layer and 2 in the output layer (two output classes: benign or malignant), named 9-10-2, without biases [18]. The main objective of this work is to determine the influence of the non-idealities of the analog building blocks on the quality of the classification and to derive the specifications of each building block. Based on the HCMOS9A 130 nm STMicroelectronics technology [12], we create non-ideal multiplier, activation function and derivative blocks under Simulink Matlab R2016a [19].

This paper is organized as follows. Section 2 highlights the Simulink implementation of the neural network. Ideal models of building blocks are created and simulated results are presented in Sect. 3. In Sect. 4, these models are modified in order to take into account the limitations of the CMOS analog implementation of these building blocks. Finally, conclusion and trends are given in Sect. 5.

## 2 Definition of cancer

The body is built from cells [20]. Normally, cells grow and divide to form new cells; when cells grow out of control, the result is a cancer. If such cells grow inside the breast, they form a tumor [21], which can be benign (not cancer) or malignant (cancer). Breast cancers can begin in different parts of the breast [21].

## 3 Wisconsin breast cancer dataset

The Wisconsin Breast Cancer database [22], hosted in the UCI Machine Learning Repository, is used in this research. Its main characteristics are:

699 instances (benign: 458, malignant: 241).

9 integer-valued attributes.

2 classes: 458 benign and 241 malignant (i.e. 65.5% benign and 34.5% malignant).

No missing attribute values.

The database can also be found at the following link: ftp.cs.wisc.edu, directory math-prog/cpo-dataset/machine-learn/WDBC/.

## 4 Dataset description [17]

Specification of the breast cancer dataset

Dataset | No. of attributes | No. of instances | No. of classes
---|---|---|---
Wisconsin breast cancer (original) | 9 | 699 | 2

## 5 Data set information [17]

Distribution of the 699 instances

Group | Instances | Date
---|---|---
1 | 367 | January 1989
2 | 70 | October 1989
3 | 31 | February 1990
4 | 17 | April 1990
5 | 48 | August 1990
6 | 49 | Updated January 1991
7 | 31 | June 1991
8 | 86 | November 1991
Total | 699 | As of the donated database on 15 July 1992

Wisconsin breast cancer dataset attributes

Id number | Attribute | Domain
---|---|---
1 | Clump thickness | 1–10
2 | Uniformity of cell size | 1–10
3 | Uniformity of cell shape | 1–10
4 | Marginal adhesion | 1–10
5 | Single epithelial cell size | 1–10
6 | Bare nuclei | 1–10
7 | Bland chromatin | 1–10
8 | Normal nucleoli | 1–10
9 | Mitoses | 1–10

Each instance belongs to one of two output classes:

- 1.
Benign (coded 0).

- 2.
Malignant (coded 1).

## 6 Simulink neural network implementation

### 6.1 Feed forward neural network

x_i are the inputs (the 9 cancer attributes of the Wisconsin database), W_ij (or wH) are the 90 coefficients of the hidden layer, a_j are the 10 outputs of the hidden layer (i.e. the inputs of the output layer), W_jk (or wO) are the 20 coefficients of the output layer and s_k are the 2 outputs of the neural network. g_j is the activation function of the hidden layer and g_k is the activation function of the output layer. A log-sigmoid function was chosen as activation function for both hidden and output layers [18].

The back-propagation learning algorithm adjusts the coefficients (i.e. the weights W_ij and W_jk of the neurons). Our reference for breast cancer detection and classification is the Neural Network Toolbox of Matlab R2016a [19], with the following parameters:

- 1.
Number of inputs equal to 9.

- 2.
Number of outputs equal to 2 = Number of neurons in output layer.

- 3.
Number of neurons in hidden layer equal to 10.

- 4.
Number of samples equal to 699.

- 5.
Number of samples for learning process equal to 489.

- 6.
Number of epochs equal to 14.

- 7.
Learning rate equal to 0.25.
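As an illustration, the 9-10-2 feed-forward pass with log-sigmoid activations described above can be sketched in plain Python (our sketch, not the authors' Simulink model; the sample attribute values below are made up):

```python
import math
import random

def logsig(x):
    # Log-sigmoid activation, used for both hidden and output layers
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, wH, wO):
    """Feed-forward pass of the 9-10-2 MLP (no biases).
    x: 9 inputs, wH: 10x9 hidden weights, wO: 2x10 output weights."""
    a = [logsig(sum(wH[j][i] * x[i] for i in range(9))) for j in range(10)]
    s = [logsig(sum(wO[k][j] * a[j] for j in range(10))) for k in range(2)]
    return a, s

# One biopsy: 9 attributes, each an integer in 1-10 (hypothetical values)
x = [5, 1, 1, 1, 2, 1, 3, 1, 1]
random.seed(0)
wH = [[random.uniform(-0.9, 0.9) for _ in range(9)] for _ in range(10)]
wO = [[random.uniform(-0.9, 0.9) for _ in range(10)] for _ in range(2)]
a, s = forward(x, wH, wO)
```

The two outputs s are then compared with the desired class targets during learning.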

### 6.2 Memory package

Before starting the learning process in our model, we must initialize the weights of the hidden layer (wH) and of the output layer (wO) (cf. Fig. 1b) by applying a random function that draws a value between − 0.9 and 0.9. These values are stored in the wH and wO memory packages and will later be used as inputs of the analog CMOS multiplier.

The “Read storage weights for output layer” block reads the weight values of the output layer, and the “Read storage weights for hidden layer” block reads the weight values of the hidden layer.
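The initialization step can be sketched as follows (a hypothetical helper mirroring the uniform draw in [− 0.9, 0.9]; the actual model uses Simulink memory blocks):

```python
import random

def init_weights(rows, cols, lo=-0.9, hi=0.9):
    # Each weight is drawn uniformly in [-0.9, 0.9] before the learning phase
    return [[random.uniform(lo, hi) for _ in range(cols)] for _ in range(rows)]

wH = init_weights(10, 9)   # hidden-layer weights (10 neurons x 9 inputs)
wO = init_weights(2, 10)   # output-layer weights (2 neurons x 10 hidden outputs)
```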

### 6.3 Hidden and output layers

The CancerTargets (the target outputs) and the connections between the hidden and output layers are realized with Simulink routing [23]. The Simulink approach to building an artificial neural network can be found in several papers, for example for the radial basis function (RBF) network [24] or the self-organizing map (SOM) network [25].

To separate desired output 1 from desired output 2, a Demux block is used, and a “Subtract” block generates the difference between the target output (1 or 2) and the neuron output (1 or 2).

### 6.4 Back-propagation

The output error signal δ_k, used in Fig. 6, i.e. the “Back-propagate error signal O” (cf. Fig. 5), is given by Eq. 5; the output weights are then updated with the newest values through the “To store the new values for output neuron” block, based on Eq. 6 (cf. Fig. 6).

From δ_k, the hidden error signal, used in Fig. 6, i.e. the “Back-propagate error signal H”, is given by Eq. 7; the hidden weights are then updated with the newest values through the “To store the new values for hidden weights” block, based on Eq. 8 (cf. Fig. 6).
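Since Eqs. 5–8 are not reproduced in this excerpt, the update can be sketched with the textbook back-propagation step for a log-sigmoid MLP with learning rate 0.25, which the description above matches (a Python sketch under that assumption, not the authors' Simulink implementation):

```python
import math
import random

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))

def backprop_step(x, t, wH, wO, a, s, eta=0.25):
    """One back-propagation update; the log-sigmoid derivative is f*(1-f)."""
    # Output-layer error signals (role of Eq. 5)
    dO = [(t[k] - s[k]) * s[k] * (1.0 - s[k]) for k in range(2)]
    # Hidden-layer error signals, back-propagated through wO (role of Eq. 7)
    dH = [a[j] * (1.0 - a[j]) * sum(dO[k] * wO[k][j] for k in range(2))
          for j in range(10)]
    # Weight updates with learning rate eta (roles of Eqs. 6 and 8)
    for k in range(2):
        for j in range(10):
            wO[k][j] += eta * dO[k] * a[j]
    for j in range(10):
        for i in range(9):
            wH[j][i] += eta * dH[j] * x[i]

# One illustrative update on random values
random.seed(1)
x = [random.uniform(0.0, 1.0) for _ in range(9)]
wH = [[random.uniform(-0.9, 0.9) for _ in range(9)] for _ in range(10)]
wO = [[random.uniform(-0.9, 0.9) for _ in range(10)] for _ in range(2)]
a = [logsig(sum(wH[j][i] * x[i] for i in range(9))) for j in range(10)]
s = [logsig(sum(wO[k][j] * a[j] for j in range(10))) for k in range(2)]
t = [1.0, 0.0]            # desired outputs for one (hypothetical) sample
w_before = wO[0][0]
backprop_step(x, t, wH, wO, a, s)
```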

## 7 Simulation and results

### 7.1 Signals generator

A similar organization chart was used to create the signal generator; it was successfully tested using the PIC18F2680 microcontroller with the PIC C software and Proteus 4.8 SF0 Pro [26]. The unit time can be 0.1 μs, 1 μs, … 1 s, depending on the application.

The durations of the learning and testing phases were computed while running the small program that launches the model; they are respectively equal to tL = m * n (interval [0, m * n − 1]) and tT (interval [m * n, m * n + tT − 1]).

Two switch blocks determine the end of the learning phase and the beginning of the testing phase through the variable tL: as long as this value has not been reached while running the model, we are still in the learning phase.

During each unit time of the learning phase, the first switch lets the model read the weights (memories block) of the hidden layer during the first half of the unit time and write the updated weights during the second half, until the end of the learning time. At the beginning of the testing phase, this switch lets the model only read from the memories block (weights of the hidden layer).

The same procedure applies to the second switch for the output layer.

### 7.2 Artificial neural network signals

Number of samples NS = 489.

Number of epochs = 14.

Learning rate = 0.25.

The needed times for learning and testing phases in our model are equal to 6845 µs (i.e. 489 * 14 µs) and 7545 µs (i.e. 6845 + 699 µs) respectively.

On Figs. 10 and 11, we can notice that at the end of the learning phase (6845 µs) the weights no longer vary.

### 7.3 Classification results

This final result, a total classification error of 2.6%, was obtained with ideal blocks: multipliers, and the log-sigmoid function and its derivative as activation function. It serves as a reference for the following studies.

## 8 Non-idealities implementation

### 8.1 Introduction

In order to determine the influence of the non-idealities of these building blocks on the quality of the classification, we have to take into account their limitations and drawbacks. This study will allow us to determine some specifications of these circuits.

### 8.2 Creating a non-ideal multiplier block

Two characteristics of the multiplier will be extracted from this study: dynamics and noise. The dynamics will be given by the extreme values of the weights (inputs) and the noise by the influence of their accuracy.

By replacing the ideal multiplier block with a noisy one in the Simulink model, we introduce an error in the multiplier block and in the derivative function at the same time. Indeed, the activation function f(x) is the log-sigmoid, and its derivative is equal to f(x) * (1 − f(x)). For this reason, we modified the multiplier and the derivative function at the same time.
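One way to model such a noisy block (our interpretation; the exact Simulink noise model is not detailed here) is to corrupt the ideal product with a relative error drawn between er1 and er2:

```python
import random

def noisy_multiply(x, w, er1=-0.10, er2=0.10):
    """Ideal product x*w corrupted by a relative error drawn uniformly
    in [er1, er2] (here -10% to +10%); sketch of the noisy block."""
    return x * w * (1.0 + random.uniform(er1, er2))

random.seed(2)
p = noisy_multiply(0.5, 0.8)   # ideal value: 0.4
```

The same corruption is applied wherever the multiplier feeds the derivative f(x) * (1 − f(x)).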

The non-ideal multiplier and non-ideal derivative function blocks

er1 (%) | er2 (%) | Nb errors | % Error | Extreme values: wH | Extreme values: wO
---|---|---|---|---|---
0 | 0 | 18 | 2.6 | − 1.37 → 1.17 | − 3.54 → 3.48
− 10 | 10 | 19 | 2.7 | − 1.6 → 1.4 | − 3.8 → 3.5
− 20 | 20 | 23 | 3.3 | − 1.7 → 1.2 | − 3.2 → 3.8
− 30 | 30 | 26 | 3.7 | − 1.7 → 1.2 | − 3 → 3
− 40 | 40 | 31 | 4.3 | − 1.7 → 1.5 | − 3 → 3.2
− 50 | 50 | 43 | 6.2 | − 1.7 → 1.2 | − 3 → 2.8
− 60 | 60 | 68 | 9.7 | − 1.7 → 1.3 | − 3 → 2.8
− 70 | 70 | 77 | 11.0 | − 1.8 → 1.6 | − 2.8 → 3
− 80 | 80 | 86 | 12.3 | − 2.5 → 1.8 | − 2.5 → 3.1
− 90 | 90 | 92 | 13.6 | − 2.4 → 1.8 | − 2.7 → 2.5
− 100 | 100 | 98 | 14.0 | − 2.8 → 1.8 | − 2.7 → 2.5

The variables er1 and er2 (the first two columns) give the percentage error range added to the multiplier block. The next two columns give the total number of classification errors over the 699 biopsies and the corresponding percentage, respectively. Thus, 18 total errors (cf. first row with er1 = er2 = 0%: 12 errors for the benign class and 6 for the malignant class, as shown in the confusion matrix, cf. Fig. 14) give a percentage error of 18/699 = 2.6%. wH and wO give the extreme values (minimum and maximum) of the weights of the hidden and output layers.
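The percentage column is simply the error count over the 699 biopsies:

```python
def error_rate(n_errors, n_samples=699):
    # Classification error in percent over the full dataset
    return 100.0 * n_errors / n_samples

rate = round(error_rate(12 + 6), 1)   # 12 benign + 6 malignant errors
```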

These results show that an error lower than 10% on the inputs of the multiplier, and hence on the values of the weights, does not modify the quality of the classification; 20% (with a final error of 3.3%, compared with 2.6%) and even 50% (with an error of 6.2%) can also suit our application. Moreover, the worst case gives a dynamic range (extreme values) of [− 3.8; 3.5]. If we consider a scale of 5 mV for a weight of 0.01, a weight of 3.8 corresponds to a voltage of 760 mV. A final input dynamic range of the multiplier of ± 800 mV is therefore sufficient and easy to achieve with the HCMOS9A 130 nm technology and a power supply of ± 900 mV [12]. Likewise, a maximum input noise of the multiplier of 80 mV (10% of error) is not a difficult constraint.

### 8.3 Creating a non-ideal log-sigmoid block

By replacing the ideal log-sigmoid block with a non-ideal one, we obtain the results presented in Table 2.

The non-ideal logsigmoid block

er3 (%) | er4 (%) | Nb errors | % Error | Extreme values: wH | Extreme values: wO
---|---|---|---|---|---
− 10 | 10 | 24 | 3.4 | − 1.6 → 1.4 | − 3.2 → 3.8
− 20 | 20 | 49 | 7.0 | − 1.7 → 1.2 | − 4 → 6.2
− 25 | 25 | 241 | 34.5 | − 1.7 → 1.2 | − 4 → 16
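The non-ideal log-sigmoid can be modeled in the same spirit (our interpretation of the er3/er4 columns as a relative error on the activation output):

```python
import math
import random

def noisy_logsig(x, er3=-0.20, er4=0.20):
    """Log-sigmoid output corrupted by a relative error drawn uniformly
    in [er3, er4]; sketch of the non-ideal activation block."""
    ideal = 1.0 / (1.0 + math.exp(-x))
    return ideal * (1.0 + random.uniform(er3, er4))

random.seed(3)
y = noisy_logsig(0.0)          # ideal value: 0.5
```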

Error changes between − 10% and 10% for the logsigmoid block

er1 (%) | er2 (%) | er3 (%) | er4 (%) | Nb errors | % Error | Extreme wH | Extreme wO
---|---|---|---|---|---|---|---
− 10 | 10 | − 10 | 10 | 22 | 3.1 | − 1.6 → 1.2 | − 3.9 → 4.1
− 20 | 20 | − 10 | 10 | 24 | 3.4 | − 1.6 → 1.3 | − 3.6 → 3.6
− 30 | 30 | − 10 | 10 | 28 | 4.0 | − 1.4 → 1.5 | − 3 → 3
− 40 | 40 | − 10 | 10 | 30 | 4.3 | − 1.6 → 1.0 | − 4 → 3.4
− 50 | 50 | − 10 | 10 | 38 | 5.4 | − 1.7 → 1.5 | − 3.2 → 3.4
− 60 | 60 | − 10 | 10 | 241 | 34.5 | − 5.2 → 6.5 | − 6 → 48

Error changes between − 20% and 20% for the logsigmoid block

er1 (%) | er2 (%) | er3 (%) | er4 (%) | Nb errors | % Error | Extreme wH | Extreme wO
---|---|---|---|---|---|---|---
− 10 | 10 | − 20 | 20 | 48 | 6.9 | − 2.2 → 1.2 | − 4 → 6.2
− 20 | 20 | − 20 | 20 | 241 | 34.5 | − 2.1 → 2.3 | − 3.8 → 11

Table 7 shows that the maximum acceptable error range for the multiplier block is from − 10 to + 10% when the error range of the logsigmoid block varies between − 20 and + 20%.

## 9 Results analysis

In analog CMOS circuits, one cannot find, create or fabricate an ideal multiplier or an ideal activation function. For this reason, a non-ideal multiplier and a non-ideal log-sigmoid were designed in order to verify the acceptable ranges to be used in analog CMOS circuits.

wH and wO represent the weights of the hidden and output layers respectively; these weights change during the learning phase. In the software world, there is no limit on their range, contrary to the CMOS world, where this range varies from one circuit to another.

From Table 4, we can deduce, for the CMOS world, for example:

Case 2 (row 2): with an error between − 10 and + 10% applied to the multiplier, the extreme values for wH and wO are respectively − 160 mV → 140 mV and − 380 mV → 350 mV.

The same holds for the activation function, for example from Table 5:

Case 2 (row 2): with an error between − 20 and + 20% applied to the log-sigmoid, the extreme values for wH and wO are respectively − 170 mV → 120 mV and − 400 mV → 620 mV.

When we combine the two blocks, from Table 7:

Case 1 (row 1): with an error between − 10 and + 10% applied to the multiplier and an error between − 20 and + 20% applied to the log-sigmoid, the extreme values for wH and wO are respectively − 220 mV → 120 mV and − 400 mV → 620 mV.

These results show that a ± 800 mV input dynamic range of the multiplier is sufficient, since it gives a very high sensitivity of 5 mV for a 0.01 accuracy of the coefficients (i.e. the weights of the neurons), well beyond what we need. Likewise, an error of 10% (or 20%) on these inputs corresponds to an input noise of the multiplier of 80 mV (or 160 mV, respectively). These specifications will be very easy to meet with the chosen 130 nm CMOS HCMOS9A technology and a power supply of ± 900 mV [12].

In addition, to simplify the final realization of the artificial neural network, we chose a current output for the multipliers. The addition of the multipliers' outputs is thus realized simply by Kirchhoff's current law, feeding the activation function. The constraints on the latter are therefore in current, not in voltage; its dynamic range will depend on the realization of the multiplier, with however a noise constraint not exceeding 10% of the value.

## 10 Conclusion

The creation and simulation of non-ideal blocks, with different errors applied to the multiplier, the activation function and its derivative, give us information about the acceptable error when building a neural network with analog CMOS circuits.

From this study, we conclude that the log-sigmoid activation function block is the most sensitive one with respect to misclassification in breast cancer detection.

In future work, we will build the log-sigmoid activation function with an analog CMOS circuit, taking into account the maximum acceptable error range for this block.

## Notes

### Acknowledgements

The authors would like to acknowledge the support of the Lebanese International University and Polytech’Lab.

### Funding

This study was funded by the Lebanese International University and Polytech’Lab.

### Compliance with ethical standards

### Conflict of interest

The authors declare that they have no conflict of interest.

## References

- 1. Aličković E, Subasi A (2017) Breast cancer diagnosis using GA feature selection and Rotation Forest. Neural Comput Appl 28(4):753–763
- 2. Aličković E, Subasi A (2011) Data mining techniques for medical data classification. In: The international Arab conference on information technology (ACIT)
- 3. Abbas HA (2001) An evolutionary artificial neural network approach for breast cancer diagnosis. Artif Intell Med 25:265–281
- 4. Hassan MR, Hossain MM, Begg RK, Ramamohanarao K, Morsi Y (2010) Breast-cancer identification using HMM-fuzzy approach. Comput Biol Med 40:240–251
- 5. Alickovic E, Subasi A (2019) Normalized neural networks for breast cancer classification. In: International conference on medical and biological engineering. Springer, Cham, pp 519–524
- 6. Liu J, Liang D (2005) A survey of FPGA-based hardware implementation of ANNs. In: 2005 International conference on neural networks and brain, vol 2. IEEE, pp 915–918
- 7. Alçın M, Pehlivan İ, Koyuncu İ (2016) Hardware design and implementation of a novel ANN-based chaotic generator in FPGA. Optik 127(13):5500–5505
- 8. Sahin S, Becerikli Y, Yazici S (2006) Neural network implementation in hardware using FPGAs. In: International conference on neural information processing. Springer, Berlin, Heidelberg, pp 1105–1112
- 9. Dias FM, Antunes A, Mota AM (2004) Artificial neural networks: a review of commercial hardware. Eng Appl Artif Intell 17(8):945–952
- 10. Valle M, Caviglia DD, Bisio GM (1994) Back propagation learning algorithm for analog VLSI implementation. In: Delgado JG, Moore WR (eds) VLSI for neural network and artificial intelligence. Plenum Press, New York, pp 35–44
- 11. Hirai Y (1993) Recent VLSI neural networks in Japan. J VLSI Signal Process 6(1):7–18
- 12. Jouni H, Harb A, Jacquemod G, Leduc Y (2017) Wide range analog CMOS multiplier for neural network application. In: EEETEM2017, Beirut
- 13. Kenzhina M, Dolzhikova I (2018) Analysis of hyperbolic tangent passive resistive neuron with CMOS-memristor circuit. In: Second international conference on computing and network communications (CoCoNet’18)
- 14. Heidari M, Shamsi H (2019) Analog programmable neuron and case study on VLSI implementation of multi-layer perceptron (MLP). Microelectron J 84:36–47
- 15. Jayadeva, Deb A, Chandra S (2002) Algorithm for building a neural network for function approximation. IEEE Proc Circuits Devices Syst 149:301–307
- 16. da Silva IN, Hernane Spatti D, Andrade Flauzino R, Liboni LHB, dos Reis Alves SF (2017) Multilayer perceptron networks. In: Artificial neural networks. Springer, Cham
- 17. Salama GI, Abdelhalim MB, Zeid MA (2012) Breast cancer diagnosis on three different datasets using multi-classifiers. Int J Comput Inf Technol 01(01):2277-0764
- 18. Jouni H, Issa M, Harb A, Jacquemod G, Leduc Y (2016) Neural network architecture for breast cancer detection and classification. IMCET, Beirut
- 19. MathWorks (2016) MATLAB and statistics toolbox release 2016a. The MathWorks Inc, Natick, Massachusetts, United States
- 20.
- 21.
- 22. Frank A, Asuncion A (2010) UCI machine learning repository. School of Information and Computer Science, University of California, Irvine. http://archive.ics.uci.edu/ml
- 23. Matlab Neural Network Toolbox. http://www.mathworks.com/
- 24. Thanh NP, Kung YS, Chen SC, Chou HH (2016) Digital hardware implementation of a radial basis function neural network. Comput Electr Eng 53:106–121
- 25. Tisan A, Cirstea M (2013) SOM neural network design—a new Simulink library based approach targeting FPGA implementation. Math Comput Simul 91:134–149
- 26. Jouni H, Harb A, Jacquemod G, Leduc Y (2017) Programmable signal generator for neural network application. ICM, Beirut