Control Sensitivity SVM for Imbalanced Data: A Case Study on Automotive Material
In many classification problems the data is imbalanced, that is, the class priors differ. Here we consider the classification of fatigue crack initiation in automotive camshafts, where this imbalance is significant. The standard averaging technique used to assess the performance of a model is inappropriate for imbalanced data, and therefore the geometric mean was used to evaluate model performance. It has been shown elsewhere that the original SVM estimate concurs with that of the Bayes optimal decision rule. Accordingly, the Support Vector Machine (SVM) was compared with the Controlled Sensitivity (CS) SVM using two training sets with different ratios of “crack” to “no crack” examples (1:8 and 1:1). Results show that the balanced training set improved the performance of the SVM. With imbalanced training data, the CS SVM outperformed the SVM. Although computation is faster with balanced data, the emphasis in this application is on model performance; the CS SVM trained on imbalanced data produced an average estimated generalisation performance of over 71%.
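To illustrate why plain averaging can mislead on imbalanced data, the following minimal sketch compares accuracy with a geometric mean score. It assumes the geometric mean is taken over the per-class accuracies (sensitivity and specificity), which is a common definition but is not spelled out in the abstract. The labels, the 10:80 example counts, and all function names are hypothetical and are not taken from the paper's data.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary labelling."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def accuracy(y_true, y_pred, positive=1):
    """Fraction of all examples classified correctly."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    return (tp + tn) / (tp + fn + tn + fp)


def geometric_mean(y_true, y_pred, positive=1):
    """Assumed definition: sqrt(sensitivity * specificity)."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Hypothetical 1:8 test set: 10 "crack" (1) and 80 "no crack" (0) examples.
y_true = [1] * 10 + [0] * 80

# Classifier A ignores the minority class entirely.
always_no_crack = [0] * 90

# Classifier B finds 7 of 10 cracks at the cost of 16 false alarms.
finds_cracks = [1] * 7 + [0] * 3 + [1] * 16 + [0] * 64

for name, y_pred in [("always 'no crack'", always_no_crack),
                     ("finds cracks", finds_cracks)]:
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.3f}, "
          f"g-mean={geometric_mean(y_true, y_pred):.3f}")
```

In this toy case the classifier that never predicts a crack has the higher accuracy, yet its geometric mean is zero because it misses every minority example. That is the kind of behaviour a class-insensitive average hides.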
Keywords: Support Vector Machine, Fatigue Crack, Radial Basis Function, Receiver Operating Characteristic Curve, True Positive