A Clinically Applicable Automated Risk Classification Model for Pulmonary Nodules

  • Triparna Poddar
  • Jhilam Mukherjee (Email author)
  • Bhaswati Ganguli
  • Madhuchanda Kar
  • Amlan Chakrabarti
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1016)


Lung cancer accounts for the largest share of cancer-related deaths, owing to its rapid progression and its tendency to be detected only at advanced stages. This paper proposes a novel method for predicting the malignancy risk of a pulmonary nodule (PN), whose presence can indicate lung cancer, with the aim of reducing the number of unnecessary biopsies and sparing patients avoidable anxiety. The study considers several morphological features of the nodule, together with the clinical history of the patient, as described in the medical literature. Based on these features, each nodule is assigned to one of two risk classes: low-risk (benign) or high-risk (malignant). The model is built on a retrospective dataset of 476 PNs (401 malignant or high-risk and 75 benign or low-risk). Classification is performed with the Recursive Partitioning Algorithm (RPA). RPA not only improves accuracy but also helps to interpret how the morphological features determine the malignancy risk of a nodule.
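To make the idea of recursive partitioning concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It grows a small binary tree by repeatedly choosing the feature and threshold that give the lowest weighted Gini impurity. The two features (`diameter_mm`, `spiculated`), the eight toy rows, and all function names are hypothetical and invented purely for illustration; they are not taken from the study's dataset or feature set.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers impurity most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; leaves hold the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Hypothetical toy data: (diameter_mm, spiculated) -> risk class.
rows = [(4, 0), (5, 0), (6, 0), (7, 1), (12, 1), (15, 0), (18, 1), (22, 1)]
labels = ["low", "low", "low", "high", "high", "high", "high", "high"]

tree = grow(rows, labels)
print(tree)
print(predict(tree, (5, 0)), predict(tree, (20, 1)))
```

Because each internal node is a single readable rule of the form "feature <= threshold", the fitted tree can be inspected directly, which is the interpretability property the abstract attributes to RPA. A dataset as imbalanced as the one described (401 versus 75) would normally also call for class weights or a loss matrix, which this sketch omits.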


Pulmonary nodule; Morphological features; Decision tree; Recursive partitioning; Imbalance class problem; ROC curve; Low-risked; High-risked



We are thankful to the Centre of Excellence in Systems Biology and Biomedical Engineering (TEQIP II and III) and the UGC UPE-II projects of the University of Calcutta for providing financial support for this research, and to Peerless Hospital for providing their valuable dataset.

Compliance with Ethical Standard

The collection of patient images and pathological reports for research purposes was approved by the Ethical Committee of Peerless Hospital and B. K. Roy Research Centre Ltd.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Triparna Poddar (1)
  • Jhilam Mukherjee (2, Email author)
  • Bhaswati Ganguli (1)
  • Madhuchanda Kar (3)
  • Amlan Chakrabarti (2)

  1. Department of Statistics, University of Calcutta, Kolkata, India
  2. A.K. Choudhury School of Information Technology, University of Calcutta, Kolkata, India
  3. Peerless Hospitex Hospital, Kolkata, India
