The Online Soft Computing Models of key variables based on the Boundary Forest method

Abstract

The Online Soft Computing Models (OSCMs) based on ensemble methods are novel and effective data-driven tools for predicting key variables. The current challenge they face is to enhance reliability, which is degraded both by the uncertainty arising from noise and by unsuitable model specifications, while maintaining high prediction accuracy and low computational cost. To meet this challenge, the OSCM based on the Boundary Forest (OSCM-BF) is proposed in this paper. The BF combines a set of Tree-Structure Ensemble (TSE) models. By assigning different values of θ (i.e., the minimum size of the leaf nodes) to its sub-models, the BF enhances the reliability of a single TSE in two ways: it overlaps the gap segments of the output range (i.e., it connects the discontinuous boundaries of the leaf nodes), and it gains robustness from the diversity produced among the sub-models. Moreover, a theoretical range for the values of θ used to construct a BF is provided. Owing to its simplicity, good interpretability, and flexibility on large-scale data, the moving-window strategy is adopted to update the BF models. Experiments on noisy data from the industrial Ladle Furnace process reveal that the OSCM-BF enhances the reliability of the OSCM-TSE while maintaining high prediction accuracy and low computational cost.
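As a rough, self-contained illustration of the idea in the abstract — not the paper's algorithm — the sketch below stands in a single CART-style regression tree for each TSE sub-model, grows K of them with different minimum leaf sizes θ, and averages their predictions. All function names and the equal-weight combination are assumptions made for illustration only:

```python
import statistics

def build_tree(xs, ys, theta):
    """Grow a tiny CART-style regression tree on 1-D inputs.
    theta is the minimum number of samples allowed in a leaf."""
    if len(ys) < 2 * theta:                 # cannot split without undersized leaves
        return ("leaf", statistics.mean(ys))
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best = None
    for cut in range(theta, len(xs) - theta + 1):  # both children keep >= theta samples
        left, right = ys[:cut], ys[cut:]
        m_l, m_r = statistics.mean(left), statistics.mean(right)
        sse = (sum((y - m_l) ** 2 for y in left)
               + sum((y - m_r) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, xs[cut - 1], cut)
    _, thresh, cut = best
    return ("node", thresh,
            build_tree(xs[:cut], ys[:cut], theta),
            build_tree(xs[cut:], ys[cut:], theta))

def predict_tree(tree, x):
    if tree[0] == "leaf":
        return tree[1]
    _, thresh, left, right = tree
    return predict_tree(left if x <= thresh else right, x)

def predict_forest(trees, x):
    """Boundary-Forest-style prediction: combine the K sub-models, whose
    differing theta values place leaf boundaries at different points."""
    return sum(predict_tree(t, x) for t in trees) / len(trees)

# Noisy piecewise data: output steps from ~0 to ~1 at x = 2.0
xs = [i / 10 for i in range(40)]
ys = [(0.0 if x < 2.0 else 1.0) + 0.01 * ((i * 7) % 5 - 2)
      for i, x in enumerate(xs)]
# Three sub-models with theta = 2, 3, 4, as in the varying-theta idea
forest = [build_tree(xs, ys, theta) for theta in (2, 3, 4)]
```

Because the three trees with different θ place their leaf boundaries at slightly different inputs, the combined prediction varies more smoothly across the boundaries than any single tree does — the "overlapping gap segments" effect described above.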



Abbreviations

BF:

Boundary Forest

CART:

Classification and Regression Tree

ELM:

Extreme Learning Machine

GRNN:

General Regression Neural Network

LF:

Ladle Furnace

LSSVR:

Least Squares Support Vector Regression

MAE:

Maximum Absolute Error

MSE:

Mean Square Error

NN:

Neural Network

OSCM:

Online SCM

OSCM-BF:

OSCM based on the Boundary Forest

OSCM-TSE:

OSCM based on the TSE

pENsemble:

Parsimonious Ensemble

RF:

Random Forest

RMSE:

Root-Mean-Square Error

SCM:

Soft Computing Models

SVM:

Support Vector Machine

TSE:

Tree-Structure Ensemble

\( \varpi \) :

The width of a window

\( \vartheta \) :

The step for updating

Θ:

A learning set, and \( \varTheta = \{ ({\mathbf{X}}, y)_{n} \}_{n = 1}^{N} \)

\( ({\mathbf{X}},y) \) :

A sample pair

y :

The output variable, or the real output, \( y \in {\mathbb{R}}^{1} \)

\( \hat{y} \) :

The prediction of a model

\( {\mathbf{X}} \) :

The input vector or a sample, and \( {\mathbf{X}} = (x_{1} , \ldots ,x_{M} ) \in {\mathbb{R}}^{M} \)

xi, i = 1, 2, …, M :

The ith input variable

N :

The number of the samples in Θ

M :

The dimension of the input variables

p(X):

The piecewise-function mapping of X

\( \hbar_{i} , { }i = 1, \ldots ,M \) :

The threshold of the input variable \( x_{i} \)

Θleft, Θright :

The sample subsets of the left and the right sub-branches

MSEleft, MSEright :

The MSEs of the outputs in Θleft and Θright

\( \bar{y}_{\text{left}} \), \( \bar{y}_{\text{right}} \) :

The mean values of the real outputs in Θleft and Θright

Nleft, Nright :

The numbers of samples in Θleft and Θright

MSEmin :

The minimum sum of MSEleft and MSEright

J :

The number of possible thresholds of an input variable

θ :

The minimum size of leaf nodes in a TSE model

K :

The number of the TSE models in a BF model

T k :

The kth TSE model in a BF model, k = 1, …, K

θ k :

The minimum size of leaf nodes in the TSE model Tk

Φ k :

The set of leaf nodes in the TSE sub-model Tk, and \( \varPhi_{k} = \{ \varTheta_{1k}^{\text{leaf}} ,\varTheta_{2k}^{\text{leaf}} , \ldots ,\varTheta_{{\varGamma_{k} k}}^{\text{leaf}} \} \)

Гk :

The number of the leaf nodes in Φk

\( g_{1k}^{\text{leaf}} ({\mathbf{X}}),g_{2k}^{\text{leaf}} ({\mathbf{X}}), \ldots ,g_{{\varGamma_{k} k}}^{\text{leaf}} ({\mathbf{X}}) \) :

The mappings of the local TSE models learnt on Φk

fBF(X):

The mapping of a BF model

ω = [ω1, ω2, …, ωK] :

The weight vector of the TSE models {T1, T2, …, TK}

\( \omega_{k} \) :

The weight of the TSE sub-model Tk

f TSEk (X):

The mapping of the TSE sub-model Tk

Ω :

The covariance matrix with size K × K

Ω kj :

The element of Ω, j, k = 1, …, K

\( \hat{y}_{ki} \) :

The prediction of the sample Xi from the TSE sub-model Tk, i = 1, …, N, k = 1, …, K

\( y_{i} \) :

The real output of the sample Xi

\( {\hat{\mathbf{\varLambda }}} \) :

The prediction matrix of the training samples from the K TSE models

X q :

The query sample

\( \hat{y}_{{1{\text{q}}}} ,\hat{y}_{{2{\text{q}}}} , \ldots ,\hat{y}_{{K{\text{q}}}} \) :

The predictions of Xq from the K TSE models in a BF model

χ jk :

The size of the jth leaf node in Tk, j = 1, …, Гk, k = 1, …, K
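The symbols ω, Ω, and \( {\hat{\mathbf{\varLambda }}} \) above indicate that the BF combines its K TSE sub-models through weights derived from the error-covariance matrix of their training predictions. The sketch below is an assumption-laden illustration only: it uses the standard generalized-ensemble weighting ω = Ω⁻¹1 / (1ᵀΩ⁻¹1), which may differ in detail from the paper's construction, and the helper names `gem_weights` and `combine` are hypothetical:

```python
import numpy as np

def gem_weights(preds, y, ridge=1e-8):
    """Covariance-based combination weights (a generalized-ensemble-style
    sketch, not necessarily the paper's exact formula).
    preds: (K, N) predictions of the K sub-models on the N training samples.
    y:     (N,) real outputs.  Returns omega, a weight vector summing to 1."""
    err = preds - y                                  # (K, N) residual matrix
    omega_mat = err @ err.T / err.shape[1]           # K x K error covariance
    omega_mat += ridge * np.eye(len(omega_mat))      # guard against singular Omega
    inv_ones = np.linalg.solve(omega_mat, np.ones(len(omega_mat)))
    return inv_ones / inv_ones.sum()                 # omega = Omega^{-1}1 / (1' Omega^{-1} 1)

def combine(preds_q, omega):
    """Weighted prediction for a query sample: sum_k omega_k * yhat_kq."""
    return float(np.dot(omega, preds_q))

# Toy check: three sub-models whose errors have different magnitudes
rng = np.random.default_rng(0)
y = rng.normal(size=200)
preds = np.stack([y + s * rng.normal(size=200) for s in (0.1, 0.3, 0.5)])
w = gem_weights(preds, y)
```

Under this weighting, more accurate sub-models (smaller error variance) receive larger weights, which matches the role of ω and Ω in the nomenclature.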


Acknowledgements

The authors would like to thank Professor Zhi-Zhong Mao for providing the data and for his suggestions. He is a PhD supervisor at Northeastern University, and his research interests include control and optimization in complex industrial systems.

Funding

This study was funded by the National Natural Science Foundation of China (No. 61702070) and the Research Projects of Liaoning Marine Fisheries Office (No. 201512).

Author information


Corresponding author

Correspondence to Xiao-Jun Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Communicated by V. Loia.


Cite this article

Deng, C., Wang, X., Gu, J. et al. The Online Soft Computing Models of key variables based on the Boundary Forest method. Soft Comput 24, 10815–10828 (2020). https://doi.org/10.1007/s00500-019-04584-1


Keywords

  • Industrial process
  • Key variables
  • Soft computing
  • Machine learning
  • Online prediction
  • Big data