Optimizing Neural Network Parameters Using Taguchi’s Design of Experiments Approach: An Application for Equivalent Stress Prediction Model of Automobile Chassis
Abstract
Artificial neural networks (ANNs) have been successfully applied to problems in different fields including medicine, management, and manufacturing. One major disadvantage of ANNs is that there is no systematic approach for model design. Most of the literature suggests a trial-and-error method for parameter setting, which is time-consuming. The accuracy of an ANN model depends greatly on the network parameter settings, including the number of neurons, momentum, learning rate, transfer function, and training algorithm. In this paper, we apply Taguchi’s design of experiments approach to determine the optimum set of parameters for an ANN trained using feed-forward back-propagation. We present a case study of an equivalent stress prediction model for an automobile chassis to demonstrate the implementation of the approach. After training the network, the optimum values of the ANN parameters are determined according to the performance statistics. The ANN whose parameters are optimized with the Taguchi method performs better than an ANN with randomly chosen parameter values.
Keywords
Artificial neural network; Model selection; ANN parameters; Neural network optimization; Taguchi method
1 Introduction
Most of the literature on ANNs focuses on the results for specific applications rather than on the methodology of developing and training the networks. In general, the ANN parameters, including the learning rate, the number of hidden nodes and hidden layers, and the transfer functions, are set during the training process. These settings are crucial to the performance of the ANN model. The trial-and-error method is typically used to determine the appropriate values of these parameters.
Patel and Bhatt [1] prepared an ANN model using the results of FEA. For training the ANN model, the standard back-propagation algorithm was observed to be the best. A multilayer perceptron network was used for nonlinear mapping between the input and the output parameters. The FEA–ANN hybrid model can save material, production cost, and time [1]. Lefik and Schrefler [2] proposed a back-propagation ANN as a tool to numerically model the constitutive behavior of a physically nonlinear body. They demonstrated that the model was applicable even in the case of complex nonlinear, inelastic behavior [2]. Rao and Babu [3] demonstrated the applicability of ANNs for the design of beams subjected to moment and shear forces. Gudur and Dixit [4] used an ANN to predict the location of the neutral point and the velocity field. The training data were obtained from a rigid–plastic FEA code. This procedure provided highly accurate solutions and was suitable for optimization applications [4]. Castellani and Rowlands [5] proposed the evolutionary artificial neural network generation and training (ANNGaT) algorithm to formulate an ANN system. The algorithm simultaneously evolved the ANN topology and the weights. The results showed no differences in accuracy between ANN architectures with one or two hidden layers [5]. Sholahudin and Han [6] developed a method to predict the instantaneous building energy load depending on various combinations of input parameters using a dynamic neural network model. The results of this study demonstrated that Taguchi’s method could successfully reduce the number of input parameters. Moreover, the dynamic neural network model could precisely predict instantaneous heating loads using a reduced number of inputs [6]. Patel and Bhatt [7] optimized the weight of the Eicher 11.10 chassis frame. To reduce the number of experiments, they used the Taguchi method along with FEA. This method could minimize the materials used, the production cost, and time [7].
Patel and Bhatt [8] developed a mathematical model of von Mises stress (VMS) to optimize the weight of the chassis frame using FEA–response surface methodology (RSM) hybrid modeling. The regression equation for VMS was developed using the FEA results of different variants of the chassis frame [8]. Patel and Bhatt [9] compared the prediction accuracy of the RSM and multiple linear regression (MLR) models for the equivalent stress of the chassis frame. The results indicated that the predictions of the RSM model were more accurate than those of the MLR model [9]. Stojanović et al. [10] used Taguchi’s method to investigate the tribological behavior of aluminum hybrid composites. They used an ANN to predict the wear rate and coefficient of friction [10].
The performance of ANN depends on the network training parameters and network architectural parameters [11]. In general, a standard ANN model that is applicable for every problem has not been established. For this reason, the appropriate parameter values must be determined experimentally for each problem. Different statistical methods have been used to find suitable ANN parameter values. The Taguchi method is a statistical technique used to study the relationship between the factors affecting a process and the outputs of the process. It can be used to systematically identify the best parameter settings that optimize the output. Several authors have applied the Taguchi design of experiments methodology for ANN parameter setting [11, 12, 13, 14, 15, 16, 17, 18, 19].
Tortum et al. [11] optimized the data transformation, the percentage of data used for training, the number of neurons in the first and second layers, and the activation function to increase the accuracy of the back-propagation algorithm [11]. Packianather et al. [12] studied the effect of design variables on back-propagation neural network (BPNN) performance for wood veneer inspection. Roy [13] described using the Taguchi method to optimize the design factors of an ANN. Kuo and Wu [14] prepared a prediction model for polymer blends using a BPNN along with Taguchi’s method to improve the deficiencies of the network architecture design. The objective of the ANN prediction model was to identify the relationship between the control parameter settings and the surface roughness in the film coating process [14]. Sukthomya and Tannock [15] applied Taguchi’s method to optimize ANN parameters in a multilayer perceptron network trained with the back-propagation algorithm in a case study of a complex forming process. Laosiritaworn and Chotchaithanakorn [16] studied the optimum settings of an ANN trained to model ferromagnetic material data. They optimized the number of neurons in the first and second layers, the learning rate, and the momentum. The authors suggested that this optimization procedure should be performed for each ANN application because the significant parameters vary for different purposes [16]. Jung and Yum [17] developed a dynamic parameter design approach for ANNs to optimize the number of neurons in the first and second layers, the learning rate, and the momentum. Madić and Radovanović [18] applied Taguchi’s DOE method to optimize an ANN model trained using the Levenberg–Marquardt algorithm. A high prediction accuracy was achieved by the Taguchi-optimized ANN model [18]. Kazancoglu et al. [19] suggested using Taguchi’s method with a BPNN to minimize the surface roughness in a wire-cut electrical discharge machining process. The predicted values closely approached the experimental values [19].
Moosavi et al. [20] assessed different factors affecting the performance of wavelet–ANN and wavelet–ANFIS hybrid models. Each of the models entailed several levels, and the optimum structures for both models were determined using Taguchi’s method [20]. Adalarasan et al. [21] studied the drilling characteristics of second-generation hybrid composites. The experimental trials were designed using an L18 orthogonal array, and a Taguchi-based response surface method was used for optimizing the drilling parameters [21]. Khoualdia et al. [22] proposed a monitoring and diagnosis system based on a neural network model for gear–bearing combined faults prediction. The time domain parameters and the binary codes of defects were used as input and output data to train and test the neural network model. The Taguchi standard orthogonal array and the Grey–Taguchi method were used as multi-objective optimization approaches to determine the best neural network model architecture [22]. Padhi et al. [23] fabricated intricate parts using fused deposition modeling (FDM). They used a fuzzy inference system combined with Taguchi’s method to generate a single response from three responses. They also used Taguchi’s method with artificial neural networks to evaluate the accuracy of the dimensions of the FDM-fabricated parts, subjected to various operating conditions. The predicted values obtained from both models were consistent with the experimental data [23]. Sahare et al. [24] optimized the end milling process for Al2024-T4 workpiece material. The input process parameters were the cutting speed, feed per tooth, depth of cut, and the cutting fluid flow rate. The response parameters were the surface roughness, cutting force, and material removal rate. They used the Taguchi L9 orthogonal array for the experimentation and performed regression analysis using an ANN to obtain the optimal settings of the end milling process. The obtained results demonstrated that the ANN combined with Taguchi’s method was suitable for optimization [24].
2 ANN Parameter Optimization
To demonstrate ANN parameter optimization using Taguchi’s design of experiments approach, we use the equivalent stress on a vehicle chassis as a case study.
2.1 Experimental Data
FEA experimental datasets for ANN training [1]
Sr. no. | Thickness of web (mm) | Thickness of upper flange (mm) | Thickness of lower flange (mm) | Equivalent stress (N/mm^{2}) |
---|---|---|---|---|
1 | 3 | 3 | 3 | 155.010 |
2 | 3 | 4 | 4 | 128.200 |
3 | 3 | 5 | 5 | 118.160 |
4 | 3 | 6 | 6 | 115.500 |
5 | 3 | 7 | 7 | 103.390 |
6 | 4 | 3 | 4 | 118.570 |
7 | 4 | 4 | 5 | 112.420 |
8 | 4 | 5 | 6 | 102.610 |
9 | 4 | 6 | 7 | 96.970 |
10 | 4 | 7 | 3 | 131.150 |
11 | 5 | 3 | 5 | 108.040 |
12 | 5 | 4 | 6 | 97.007 |
13 | 5 | 5 | 7 | 93.031 |
14 | 5 | 6 | 3 | 121.770 |
15 | 5 | 7 | 4 | 108.200 |
16 | 6 | 3 | 6 | 97.780 |
17 | 6 | 4 | 7 | 88.380 |
18 | 6 | 5 | 3 | 110.080 |
19 | 6 | 6 | 4 | 97.607 |
20 | 6 | 7 | 5 | 88.780 |
21 | 7 | 3 | 7 | 87.279 |
22 | 7 | 4 | 3 | 100.250 |
23 | 7 | 5 | 4 | 93.288 |
24 | 7 | 6 | 5 | 86.179 |
25 | 7 | 7 | 6 | 82.599 |
The ANN consists of three neurons in the input layer that correspond to three topological parameters of the automobile chassis sidebar (thickness of web, thickness of upper flange, thickness of lower flange), and one neuron in the output layer that corresponds to the equivalent von Mises stress (VMS).
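The normalized inputs and targets reported later in the confirmation experiment table are consistent with min-max scaling of the FEA dataset above to the range [0, 1]. The following Python sketch illustrates that scaling under this assumption. The helper name `min_max_scale` is hypothetical and is not taken from the original study.

```python
def min_max_scale(value, lo, hi):
    """Map value from [lo, hi] onto [0, 1] (scaling assumed, inferred from the tables)."""
    return (value - lo) / (hi - lo)


# Thickness levels in the FEA dataset range from 3 mm to 7 mm.
THICKNESS_MIN, THICKNESS_MAX = 3.0, 7.0
# Smallest and largest equivalent stress in the FEA dataset (N/mm^2).
STRESS_MIN, STRESS_MAX = 82.599, 155.010

# Row 2 of the FEA dataset: web 3 mm, both flanges 4 mm, stress 128.200 N/mm^2.
scaled_inputs = [min_max_scale(t, THICKNESS_MIN, THICKNESS_MAX) for t in (3, 4, 4)]
scaled_stress = min_max_scale(128.200, STRESS_MIN, STRESS_MAX)

print(scaled_inputs)            # scaled web, upper flange, lower flange
print(round(scaled_stress, 4))  # scaled equivalent stress
```

For this row the scaled values agree with training row 2 of the confirmation table (inputs 0, 0.25, 0.25 and target 0.6298).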
2.2 Neural Network Parameters
1. ANN architectural parameters: number of neurons in the hidden layer, training algorithm, and transfer function in hidden and output layers;
2. ANN learning parameters: learning rate, momentum, and increment and decrement factors.
ANN training and architectural parameters and levels
Parameter | Parameter description | Level 1 | Level 2 | Level 3 |
---|---|---|---|---|
A | Training algorithm | Trainscg | Trainlm | – |
B | Transfer function in hidden layer | Purelin | Logsig | Tansig |
C | Transfer function in output layer | Purelin | Logsig | Tansig |
D | Increment factor | 5 | 10 | 15 |
E | Decrement factor | 0.05 | 0.1 | 0.2 |
F | Learning rate (μ) | 0.001 | 0.01 | 0.1 |
G | Momentum | 0.1 | 0.3 | 0.5 |
H | Hidden neurons in first layer | 2 | 6 | 10 |
This design problem involves eight main parameters, where one parameter has two levels and the remaining seven parameters each have three levels. Considering all of the possible combinations of the eight parameters entails a total of 2^{1} × 3^{7} = 4374 different experiment sets.
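As a quick check of the arithmetic above, the short Python sketch below counts the full-factorial combinations from the number of levels per parameter and compares that count with the 18 runs of the orthogonal array used in Section 2.3. The dictionary name `levels` is illustrative only.

```python
from itertools import product
from math import prod

# Number of levels for each parameter, as listed in the parameter table.
levels = {"A": 2, "B": 3, "C": 3, "D": 3, "E": 3, "F": 3, "G": 3, "H": 3}

# Closed-form count: 2^1 * 3^7.
full_factorial = prod(levels.values())

# Brute-force enumeration of every level combination, as a cross-check.
n_enumerated = sum(1 for _ in product(*(range(n) for n in levels.values())))

ORTHOGONAL_ARRAY_RUNS = 18  # rows in the orthogonal array of Section 2.3
reduction_factor = full_factorial // ORTHOGONAL_ARRAY_RUNS

print(full_factorial, n_enumerated, reduction_factor)
```

Under these counts, the orthogonal array requires 243 times fewer training runs than the full factorial design.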
2.3 Taguchi Design of Experiments
Taguchi’s orthogonal array
Exp. no. | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 2 |
3 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 3 |
4 | 1 | 2 | 1 | 1 | 2 | 2 | 3 | 3 |
5 | 1 | 2 | 2 | 2 | 3 | 3 | 1 | 1 |
6 | 1 | 2 | 3 | 3 | 1 | 1 | 2 | 2 |
7 | 1 | 3 | 1 | 2 | 1 | 3 | 2 | 3 |
8 | 1 | 3 | 2 | 3 | 2 | 1 | 3 | 1 |
9 | 1 | 3 | 3 | 1 | 3 | 2 | 1 | 2 |
10 | 2 | 1 | 1 | 3 | 3 | 2 | 2 | 1 |
11 | 2 | 1 | 2 | 1 | 1 | 3 | 3 | 2 |
12 | 2 | 1 | 3 | 2 | 2 | 1 | 1 | 3 |
13 | 2 | 2 | 1 | 2 | 3 | 1 | 3 | 2 |
14 | 2 | 2 | 2 | 3 | 1 | 2 | 1 | 3 |
15 | 2 | 2 | 3 | 1 | 2 | 3 | 2 | 1 |
16 | 2 | 3 | 1 | 3 | 2 | 3 | 1 | 2 |
17 | 2 | 3 | 2 | 1 | 3 | 1 | 2 | 3 |
18 | 2 | 3 | 3 | 2 | 1 | 2 | 3 | 1 |
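The defining property of an orthogonal array is that, for every pair of columns, each combination of levels occurs equally often. The Python sketch below checks this for the 18-run array tabulated above. The array is encoded as one string of level digits per run, and the helper `is_balanced` is a hypothetical name introduced for illustration.

```python
from collections import Counter
from itertools import combinations

FACTORS = "ABCDEFGH"
# One string per experiment; character i is the level of factor FACTORS[i].
L18 = [
    "11111111", "11222222", "11333333", "12112233", "12223311", "12331122",
    "13121323", "13232131", "13313212", "21133221", "21211332", "21322113",
    "22123132", "22231213", "22312321", "23132312", "23213123", "23321231",
]

rows = [[int(ch) for ch in run] for run in L18]
columns = {f: [row[i] for row in rows] for i, f in enumerate(FACTORS)}


def is_balanced(col_x, col_y):
    """True if every level pair of the two columns occurs, and equally often."""
    counts = Counter(zip(col_x, col_y))
    expected_pairs = len(set(col_x)) * len(set(col_y))
    return len(counts) == expected_pairs and len(set(counts.values())) == 1


orthogonal = all(
    is_balanced(columns[x], columns[y]) for x, y in combinations(FACTORS, 2)
)
print(orthogonal)
```

Because of this balance, the average response at one level of a parameter is taken over runs in which every other parameter is evenly spread across its own levels, which is what allows main effects to be estimated from only 18 runs.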
Taguchi’s analysis
Exp. no. | MSE (N/mm^{2})^{2} | R^{2} | PI | S/N ratio |
---|---|---|---|---|
1 | 0.0045100 | 0.95566 | 0.91042 | − 0.8151441 |
2 | 0.0031400 | 0.96966 | 0.92868 | − 0.6427020 |
3 | 0.0061800 | 0.93285 | 0.88723 | − 1.0392891 |
4 | 0.0000000 | 1.00000 | 0.99984 | − 0.0014034 |
5 | 0.0029600 | 0.97152 | 0.93125 | − 0.6186625 |
6 | 0.0000013 | 0.99999 | 0.99884 | − 0.0101039 |
7 | 0.0000000 | 1.00000 | 0.99994 | − 0.0005309 |
8 | 0.0004300 | 0.99587 | 0.97720 | − 0.2003626 |
9 | 0.0000711 | 0.99932 | 0.99123 | − 0.0765300 |
10 | 0.0045100 | 0.95566 | 0.91042 | − 0.8151441 |
11 | 0.0031400 | 0.96966 | 0.92868 | − 0.6427020 |
12 | 0.0068100 | 0.93286 | 0.88332 | − 1.0775989 |
13 | 0.0000000 | 1.00000 | 1.00000 | − 0.0000374 |
14 | 0.0000000 | 1.00000 | 0.99994 | − 0.0005452 |
15 | 0.0006950 | 0.99331 | 0.97029 | − 0.2619999 |
16 | 0.0000000 | 1.00000 | 1.00000 | − 0.0000336 |
17 | 0.0000000 | 1.00000 | 0.99999 | − 0.0000867 |
18 | 0.0006950 | 0.99331 | 0.97029 | − 0.2619999 |
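The S/N ratios tabulated above are numerically consistent with Taguchi's larger-the-better criterion applied to a single PI observation per run, that is, S/N = -10 log10(1/PI^2) = 20 log10(PI). The Python sketch below reproduces a few table entries under that assumption. The function name is illustrative, and the formula is inferred from the tabulated numbers rather than quoted from the text.

```python
import math


def sn_larger_the_better(observations):
    """Taguchi larger-the-better S/N ratio in dB: -10*log10(mean(1/y^2))."""
    mean_inverse_square = sum(1.0 / y**2 for y in observations) / len(observations)
    return -10.0 * math.log10(mean_inverse_square)


# PI values for experiments 1, 2, and 8, copied from the table above.
pi_values = {1: 0.91042, 2: 0.92868, 8: 0.97720}

sn_ratios = {run: sn_larger_the_better([pi]) for run, pi in pi_values.items()}
for run, sn in sn_ratios.items():
    print(run, round(sn, 4))
```

Small differences in the last digits relative to the table are expected because the PI values are rounded to five decimals.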
3 Analysis of Results
Table 4 and Fig. 2 indicate that the optimal ANN parameter settings are A2B3C1D1E1F2G2H2. In other words, the optimal ANN model is trained with the Levenberg–Marquardt (LM) algorithm using μ = 0.01 as the initial learning rate, 0.3 as the momentum, μ^{−} = 0.05 as the decrement factor, and μ^{+} = 5 as the increment factor. It uses the tansig transfer function in the hidden layer and the purelin transfer function in the output layer, and it has six neurons in the hidden layer.
Influence of various parameters
Level | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
1 | 0.9583 | 0.9081 | 0.9701 | 0.9667 | 0.968 | 0.9616 | 0.9527 | 0.945 |
2 | 0.9625 | 0.9834 | 0.961 | 0.9522 | 0.9599 | 0.9667 | 0.968 | 0.9746 |
3 | - | 0.9898 | 0.9502 | 0.9623 | 0.9534 | 0.9529 | 0.9605 | 0.9617 |
Delta | 0.0042 | 0.0817 | 0.0199 | 0.0145 | 0.0146 | 0.0138 | 0.0153 | 0.0296 |
Rank | 8 | 1 | 3 | 6 | 5 | 7 | 4 | 2 |
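The level averages in the table above appear to be the mean PI over all runs in which a parameter is held at a given level. The Python sketch below recomputes them from the orthogonal array and the PI column of the Taguchi analysis table, then derives the delta (largest minus smallest level mean), the rank, and the best level of each parameter. All variable and function names are illustrative.

```python
from collections import defaultdict

FACTORS = "ABCDEFGH"
# Orthogonal array: one string per run, character i is the level of FACTORS[i].
L18 = [
    "11111111", "11222222", "11333333", "12112233", "12223311", "12331122",
    "13121323", "13232131", "13313212", "21133221", "21211332", "21322113",
    "22123132", "22231213", "22312321", "23132312", "23213123", "23321231",
]
# PI for runs 1..18, copied from the Taguchi analysis table.
PI = [
    0.91042, 0.92868, 0.88723, 0.99984, 0.93125, 0.99884, 0.99994, 0.97720,
    0.99123, 0.91042, 0.92868, 0.88332, 1.00000, 0.99994, 0.97029, 1.00000,
    0.99999, 0.97029,
]


def level_means(factor_index):
    """Mean PI of the runs at each level of one factor."""
    groups = defaultdict(list)
    for run, pi in zip(L18, PI):
        groups[int(run[factor_index])].append(pi)
    return {level: sum(v) / len(v) for level, v in sorted(groups.items())}


means = {f: level_means(i) for i, f in enumerate(FACTORS)}
deltas = {f: max(m.values()) - min(m.values()) for f, m in means.items()}
ranking = sorted(FACTORS, key=lambda f: deltas[f], reverse=True)
best_levels = {f: max(m, key=m.get) for f, m in means.items()}
optimal = "".join(f"{f}{best_levels[f]}" for f in FACTORS)

print(optimal)
print(ranking)
```

With the tabulated PI values, this procedure selects the same combination reported in the text and ranks the hidden-layer transfer function (B) as the most influential parameter and the training algorithm (A) as the least influential.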
4 Confirmation Experiment
The confirmation experiment is an essential step of Taguchi’s method. There is no need to run the confirmation test if the optimal set of parameters is already included in the orthogonal array (OA). However, the best design identified in this experiment is not included in the OA, and therefore a confirmation test is required. The optimal ANN model is developed and trained using the A2B3C1D1E1F2G2H2 parameter value combination, and its performance is tested.
ANN error rates for training and testing data
Sr. no. | Web (mm) | Upper flange (mm) | Lower flange (mm) | Target equivalent stress (N/mm^{2}) | Predicted equivalent stress (N/mm^{2}) | Error (N/mm^{2}) | MSE (N/mm^{2})^{2} | RMSE (N/mm^{2}) | R^{2} |
---|---|---|---|---|---|---|---|---|---|
Training data | |||||||||
1 | 0 | 0 | 0 | 1.0000 | 0.9991 | 0.0009 | 9.31 × 10^{−06} | 0.0031 | 0.9999 |
2 | 0 | 0.25 | 0.25 | 0.6298 | 0.6316 | − 0.0018 | |||
3 | 0 | 0.5 | 0.5 | 0.4911 | 0.4927 | − 0.0016 | |||
4 | 0 | 0.75 | 0.75 | 0.4544 | 0.4542 | 0.0002 | |||
5 | 0 | 1 | 1 | 0.2871 | 0.2902 | − 0.0030 | |||
6 | 0.25 | 0 | 0.25 | 0.4968 | 0.4972 | − 0.0004 | |||
7 | 0.25 | 0.25 | 0.5 | 0.4118 | 0.4085 | 0.0033 | |||
8 | 0.25 | 0.5 | 0.75 | 0.2764 | 0.2769 | − 0.0005 | |||
9 | 0.25 | 0.75 | 1 | 0.1985 | 0.1952 | 0.0033 | |||
10 | 0.25 | 1 | 0 | 0.6705 | 0.6705 | 0.0000 | |||
11 | 0.5 | 0 | 0.5 | 0.3513 | 0.3528 | − 0.0014 | |||
12 | 0.5 | 0.25 | 0.75 | 0.1990 | 0.1995 | − 0.0006 | |||
13 | 0.5 | 0.5 | 1 | 0.1441 | 0.1440 | 0.0001 | |||
14 | 0.5 | 0.75 | 0 | 0.5410 | 0.5427 | − 0.0017 | |||
15 | 0.5 | 1 | 0.25 | 0.3536 | 0.3470 | 0.0065 | |||
16 | 0.75 | 0 | 0.75 | 0.2097 | 0.2104 | − 0.0007 | |||
17 | 0.75 | 0.25 | 1 | 0.0798 | 0.0782 | 0.0017 | |||
18 | 0.75 | 0.5 | 0 | 0.3795 | 0.3893 | − 0.0098 | |||
19 | 0.75 | 0.75 | 0.25 | 0.2073 | 0.2022 | 0.0051 | |||
20 | 0.75 | 1 | 0.5 | 0.0854 | 0.0920 | − 0.0066 | |||
21 | 1 | 0 | 1 | 0.0646 | 0.0651 | − 0.0005 | |||
22 | 1 | 0.25 | 0 | 0.2438 | 0.2357 | 0.0081 | |||
23 | 1 | 0.5 | 0.25 | 0.1476 | 0.1480 | − 0.0004 | |||
24 | 1 | 0.75 | 0.5 | 0.0494 | 0.0496 | − 0.0001 | |||
25 | 1 | 1 | 0.75 | 0.0000 | 0.0008 | − 0.0008 | |||
Testing data | |||||||||
26 | 0 | 0 | 0.5 | 0.8014 | 0.8008 | 0.0006 | 2.94 × 10^{−06} | 0.0017 | 0.9999 |
27 | 0 | 0 | 0.75 | 0.7232 | 0.7256 | − 0.0024 | |||
28 | 0 | 0.25 | 0.5 | 0.5592 | 0.5577 | 0.0015 | |||
29 | 0.25 | 0.25 | 0.75 | 0.3443 | 0.3447 | − 0.0004 | |||
30 | 0.25 | 0.75 | 0.25 | 0.4313 | 0.4323 | − 0.0010 | |||
31 | 0.25 | 0.75 | 0.75 | 0.3562 | 0.3542 | 0.0019 | |||
32 | 0.5 | 0.25 | 0.25 | 0.3944 | 0.3927 | 0.0017 | |||
33 | 0.5 | 0.5 | 0 | 0.5836 | 0.5821 | 0.0016 | |||
34 | 0.5 | 0.5 | 0.5 | 0.3254 | 0.3281 | − 0.0027 | |||
35 | 0.5 | 1 | 0.5 | 0.3294 | 0.3293 | 0.0001 | |||
36 | 0.75 | 0.25 | 0.25 | 0.2238 | 0.2254 | − 0.0016 | |||
37 | 0.75 | 0.25 | 0.75 | 0.1119 | 0.1147 | − 0.0028 | |||
38 | 0.75 | 0.75 | 0.75 | 0.0769 | 0.0750 | 0.0020 | |||
39 | 1 | 0.5 | 0.5 | 0.0812 | 0.0822 | − 0.0010 |
The training data MSE, RMSE, and R^{2} values for the LM6TP architecture are 9.31 × 10^{−06}, 0.0031, and 0.9999, respectively. The testing data MSE, RMSE, and R^{2} values for the LM6TP architecture are 2.94 × 10^{−06}, 0.0017, and 0.9999, respectively. This indicates that the ANN accurately predicts the equivalent stress for both the training and testing datasets, and there is no evidence of over-fitting because the results are similar for both datasets.
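The error statistics for the testing data can be approximately recomputed from the target and predicted columns of the table above. The Python sketch below does this for the 14 testing rows. R^{2} is computed here as 1 - SS_res/SS_tot, which is an assumption because the definition used in the study is not stated. The results differ slightly from the tabulated values because the table entries are rounded to four decimals.

```python
import math

# Testing rows 26..39: target and predicted (normalized) equivalent stress.
target = [0.8014, 0.7232, 0.5592, 0.3443, 0.4313, 0.3562, 0.3944,
          0.5836, 0.3254, 0.3294, 0.2238, 0.1119, 0.0769, 0.0812]
predicted = [0.8008, 0.7256, 0.5577, 0.3447, 0.4323, 0.3542, 0.3927,
             0.5821, 0.3281, 0.3293, 0.2254, 0.1147, 0.0750, 0.0822]

errors = [t - p for t, p in zip(target, predicted)]

# Mean square error and its square root.
mse = sum(e**2 for e in errors) / len(errors)
rmse = math.sqrt(mse)

# Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
mean_target = sum(target) / len(target)
ss_res = sum(e**2 for e in errors)
ss_tot = sum((t - mean_target) ** 2 for t in target)
r2 = 1.0 - ss_res / ss_tot

print(f"MSE = {mse:.2e}, RMSE = {rmse:.4f}, R2 = {r2:.4f}")
```

Note that the RMSE is the square root of the MSE, so an MSE of the order of 10^{−06} corresponds to an RMSE of the order of 10^{−03}, in line with the values listed in the table.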
5 Conclusions
1. The best ANN model architecture has six neurons in the hidden layer. The analysis shows that adding more neurons to the hidden layer has an adverse effect on ANN performance. This finding further supports the conclusion of Madić and Radovanović [18] that too many neurons in the first hidden layer are not desirable when training ANNs with the LM algorithm.
2. The optimal ANN model is trained with the LM algorithm using μ = 0.01 as the initial learning rate, 0.3 as the momentum, μ^{−} = 0.05 as the decrement factor, and μ^{+} = 5 as the increment factor. It uses the tansig transfer function in the hidden layer and the purelin transfer function in the output layer, and it has six neurons in the hidden layer.
3. The values of the mean square error, root mean square error, and coefficient of determination R^{2} for the LM6TP architecture are 9.31 × 10^{−06}, 0.0031, and 0.9999, respectively, for the training dataset, and 2.94 × 10^{−06}, 0.0017, and 0.9999, respectively, for the randomly chosen testing dataset. This indicates that the ANN accurately predicts the equivalent stress for both the training and testing datasets, and there is no evidence of over-fitting because the results are similar for both datasets.
4. Taguchi’s method can be successfully applied to the design and training of ANNs to develop an optimized, high-performance ANN model with a comparatively small and time-saving set of experiments.
5. The methodology presented in this paper can be applied to different ANN applications.
