An effective approach for causal variables analysis in diesel engine production by using mutual information and network deconvolution
The effective control of power consistency, one of the most important quality indicators of a diesel engine, plays a decisive role in improving product competitiveness. Widely deployed sensors and other data acquisition equipment make "data-driven quality control" possible. However, determining which parameters are highly related to engine power from massive captured manufacturing data, and effectively discriminating between direct and indirect dependencies among these variables, remain challenging. This paper proposes a feature selection algorithm named NMI-ND, which uses network deconvolution (ND) to infer causal correlations among various diesel engine manufacturing parameters from observed correlations based on normalized mutual information (NMI). The proposed algorithm is evaluated in an experimental study by comparing it with other representative feature selection algorithms. The comparison demonstrates that NMI-ND performs better in both effectiveness and efficiency.
Keywords: Power consistency; Causal variables analysis; Transitive effects; Mutual information; Network deconvolution
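The idea summarized in the abstract (build an observed dependency network from NMI, then apply network deconvolution to suppress transitive effects) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes equal-width histogram binning, the normalization NMI = I(X;Y) / sqrt(H(X)H(Y)), and the commonly used closed form G_dir = G_obs (I + G_obs)^(-1) with an eigenvalue scaling parameter `beta`. All function names, the bin count, and the synthetic chain data are hypothetical.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (nats) of a probability vector, ignoring zero cells."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))


def normalized_mutual_information(x, y, bins=8):
    """NMI of two 1-D samples; normalization sqrt(H(X)H(Y)) is an assumption."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    hx = entropy(pxy.sum(axis=1))
    hy = entropy(pxy.sum(axis=0))
    mi = hx + hy - entropy(pxy.ravel())
    denom = np.sqrt(hx * hy)
    return mi / denom if denom > 0 else 0.0


def nmi_matrix(data, bins=8):
    """Symmetric matrix of pairwise NMI between columns, with zero diagonal."""
    n = data.shape[1]
    m = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m[i, j] = m[j, i] = normalized_mutual_information(
                data[:, i], data[:, j], bins
            )
    return m


def network_deconvolution(g_obs, beta=0.9):
    """Estimate direct dependencies from a symmetric observed network.

    The observed matrix is rescaled so that every eigenvalue of the
    result has absolute value at most beta, then each eigenvalue d is
    mapped to d / (1 + d), i.e. G_dir = G (I + G)^(-1) on the scaled G.
    """
    eigvals, eigvecs = np.linalg.eigh(g_obs)
    lam_pos = max(eigvals.max(), 0.0)
    lam_neg = max(-eigvals.min(), 0.0)
    scale = max(lam_pos * (1 - beta) / beta, lam_neg * (1 + beta) / beta)
    if scale == 0:
        return np.zeros_like(g_obs)
    direct_vals = eigvals / (scale + eigvals)
    return eigvecs @ np.diag(direct_vals) @ eigvecs.T


# Synthetic chain a -> b -> power, so a affects power only through b.
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
b = a + 0.5 * rng.normal(size=2000)
power = b + 0.5 * rng.normal(size=2000)
data = np.column_stack([a, b, power])

g_obs = nmi_matrix(data)
g_dir = network_deconvolution(g_obs, beta=0.9)

# Hypothetical ranking step: score each parameter by its deconvolved
# dependency on the last column (the target, here "power").
scores = g_dir[-1, :-1]
ranking = np.argsort(-scores)
```

In a feature selection setting, the rows of `data` would be engines and the columns manufacturing parameters plus the measured power. The choice of `beta` and of the bin count affects the result and would need tuning on real data.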
This work was financially supported by the National Science Foundation of China (Nos. 51435009, 51775348), the National Technology Support Program of China (No. 2015BAF12B02), and the Shanghai Aerospace Science and Technology Innovation Fund (No. SAST2016048).