Multimedia Tools and Applications, Volume 77, Issue 17, pp 22771–22786

Chinese materia medica resource images screening method study

  • Xiaobo Zhang
  • Zhanquan Sun
  • Zhao Li


The Chinese materia medica resource survey provides an important basis for the development of the traditional Chinese medicine (TCM) industry. During the survey, millions of images of medicinal plants are collected. The collected dataset includes some images that are unqualified for image analysis, i.e. they cannot be used to build a medicinal plant classifier. Identifying these unqualified images manually is burdensome, so screening them automatically is an important task of the survey. Image recognition techniques have developed quickly in recent years. Outlier detection is an unsupervised approach that can find unqualified images automatically, and much research has been done on the topic. The extracted features and the correlation metric both strongly influence the outlier detection result. To improve image screening performance, this paper proposes a novel outlier detection method. A convolutional neural network (CNN) is used to extract complex features from Chinese materia medica resource images. Extended entropy is introduced into the calculation of information loss, which is used to measure the distance between images. Based on the extracted image features and this correlation metric, a clustering-based outlier detection method is proposed. The efficiency of the screening method is illustrated with a practical example.
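The abstract does not give the formulas, so the following is only a minimal sketch of the general idea of screening by an information-loss distance plus clustering. It makes several assumptions that are not from the paper: the CNN features are taken as already extracted and non-negative, the information loss is the standard information-bottleneck merge cost (a weighted Jensen-Shannon divergence) computed with ordinary Shannon entropy rather than the paper's extended entropy, and the clustering step is a simple threshold-linkage grouping. The names `information_loss`, `screen_outliers`, `threshold`, and `min_cluster_size` are hypothetical.

```python
import math


def normalize(features):
    """Turn a non-negative feature vector into a probability distribution."""
    total = sum(features)
    if total <= 0 or any(f < 0 for f in features):
        raise ValueError("features must be non-negative with a positive sum")
    return [f / total for f in features]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero entries of p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def information_loss(p, q, wp=0.5, wq=0.5):
    """Assumed stand-in metric: information-bottleneck merge cost.

    Equals (wp + wq) times the weighted Jensen-Shannon divergence of p and q.
    Uses plain Shannon entropy, NOT the paper's extended entropy.
    """
    w = wp + wq
    a, b = wp / w, wq / w
    mixture = [a * pi + b * qi for pi, qi in zip(p, q)]
    return w * (a * kl_divergence(p, mixture) + b * kl_divergence(q, mixture))


def screen_outliers(vectors, threshold, min_cluster_size=2):
    """Group images whose pairwise loss is below threshold (hypothetical rule);
    return indices of images that end up in clusters smaller than min_cluster_size."""
    dists = [normalize(v) for v in vectors]
    n = len(dists)
    parent = list(range(n))  # union-find forest over image indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if information_loss(dists[i], dists[j]) < threshold:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return [i for i in range(n) if sizes[find(i)] < min_cluster_size]


# Toy feature vectors (made up for illustration): three similar, one different.
features = [[5, 1, 1], [4, 1, 1], [5, 2, 1], [0, 1, 9]]
flagged = screen_outliers(features, threshold=0.05)
print(flagged)
```

In this toy run the first three vectors fall into one cluster and the fourth is left alone, so it is the only index flagged. Real CNN features would need a sensible mapping to distributions, and the threshold would have to be tuned on data.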


outlier detection; Chinese materia medica resource; feature extraction; deep learning; convolutional neural network; information loss



This work is partially supported by the Shandong Science and Technology Development Plan (Grant Nos. 2016GGC01061 and 2016GGX101029) and the Natural Science Foundation of Shandong Province (Grant Nos. ZR2015JL023 and ZR2015FL025).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. State Key Laboratory Breeding Base of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
  2. Shandong Provincial Key Laboratory of Computer Networks, Shandong Computer Science Center (National Supercomputer Center in Jinan), Shandong Demonstration Engineering Technology Research Center of E-government Big Data, Jinan, China
  3. Shandong Engineering Technology Research Center of E-government Big Data, Jinan, China
