Mass classification of benign and malignant with a new twin support vector machine joint \({l_{2,1}}\)-norm

  • Xiaoming Liu
  • Ting Zhu
  • Leilei Zhai
  • Jun Liu
Original Article


Breast cancer is the second leading cause of cancer-related death among women worldwide, and masses are among the most common abnormalities. A mass can be either benign or malignant, so accurate diagnosis is important for early intervention and treatment. In this paper, we investigate the mass classification problem and propose a new method for feature selection. The proposed method, called TWSVML21, integrates joint \({l_{2,1}}\)-norm minimizing regularization with a nonparallel twin support vector machine. The \({l_{2,1}}\)-norm regularization selects features across the positive and negative classes with joint sparsity, and the final features are chosen by a ranking strategy. An iterative method is proposed to solve the resulting optimization problem. Preliminary results on mass classification and several benchmark datasets show the feasibility and effectiveness of the proposed TWSVML21 method.
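To make the role of the \({l_{2,1}}\)-norm concrete, the following minimal sketch computes the norm of a small weight matrix and ranks features by their row norms. It is an illustration only, not the authors' TWSVML21 implementation: the matrix `W`, its layout (one row per feature, one column for each of the two nonparallel hyperplanes) and the helper names `row_norms`, `l21_norm` and `rank_features` are assumptions made for this example, and the weights are invented rather than learned.

```python
import math

def row_norms(W):
    """Euclidean (l2) norm of each row of W, given as a list of rows."""
    return [math.sqrt(sum(v * v for v in row)) for row in W]

def l21_norm(W):
    """l_{2,1}-norm: the sum of the l2 norms of the rows of W."""
    return sum(row_norms(W))

def rank_features(W):
    """Feature indices ordered from largest to smallest row norm."""
    scores = row_norms(W)
    return sorted(range(len(W)), key=lambda j: scores[j], reverse=True)

# Hypothetical weights for 4 features; the two columns are assumed to hold
# the weight vectors of the positive-class and negative-class hyperplanes.
W = [
    [3.0, 4.0],  # large in both columns
    [0.0, 0.0],  # zero in both columns: the feature is jointly discarded
    [1.0, 0.0],  # used by one hyperplane only
    [0.0, 2.0],  # used by the other hyperplane only
]

total = l21_norm(W)
ranking = rank_features(W)
top_k = ranking[:2]  # keep the two highest-ranked features
```

Because the penalty sums the norms of whole rows, shrinking it tends to drive entire rows to zero, which is why a feature is kept or dropped for both classes at once; ranking by row norm then gives one possible ordering from which the top features can be retained.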


Mass classification; \({l_{2,1}}\)-norm; Twin support vector machine; Feature selection



This work is partially supported by the National Natural Science Foundation of China (Nos. 61403287, 61472293, 31201121, 61572381, 61273303), China Postdoctoral Science Foundation (No. 2014M552039) and the Natural Science Foundation of Hubei Province (No. 2014CFB288).



Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan, China
