An Improved Outlier Detection Algorithm to Medical Insurance

  • Conference paper
Intelligent Data Engineering and Automated Learning – IDEAL 2016 (IDEAL 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9937)

Abstract

With the development of the medical insurance industry in China, the volume of medical insurance data, which is complex, multidimensional and interdisciplinary, is growing rapidly. How to mine the potential value of these vast amounts of data and how to improve the efficiency of data analysis are topical issues in data mining research. This paper presents GdiLOF, an improved LOF outlier detection algorithm that reduces the dataset by removing normal data and introduces information entropy to improve the accuracy of the LOF algorithm. Its platform adaptability is analyzed by running it on the Hadoop platform. The experimental results show that the GdiLOF algorithm is highly efficient and that its accuracy is 6 percentage points higher than that of the LOF algorithm. It also runs better on the Hadoop distributed platform and has clear advantages in processing huge amounts of data.
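
To make the two ingredients named in the abstract more concrete, the following is a minimal Python sketch of the standard LOF score combined with an entropy-based attribute weighting. It is not the authors' GdiLOF implementation: the abstract does not say how information entropy enters the algorithm, so the weighting scheme here (equal-width binning, weights proportional to Shannon entropy) is an assumption, as are the function names `entropy_weights`, `weighted_dist` and `lof_scores` and the parameters `bins` and `k`. The dataset-reduction step that removes normal data is left out.

```python
import math
from collections import Counter


def entropy_weights(rows, bins=4):
    """Per-attribute weights proportional to Shannon entropy.

    Assumption: values are discretized into equal-width bins and
    higher-entropy attributes receive larger weights. The paper's
    actual weighting scheme is not given in the abstract.
    """
    n, dims = len(rows), len(rows[0])
    entropies = []
    for j in range(dims):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins or 1.0  # constant column: avoid zero width
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in col)
        entropies.append(-sum(c / n * math.log2(c / n) for c in counts.values()))
    total = sum(entropies)
    if total == 0:
        return [1.0 / dims] * dims  # no attribute carries information
    return [h / total for h in entropies]


def weighted_dist(a, b, w):
    """Euclidean distance with one non-negative weight per attribute."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))


def lof_scores(rows, k=3, weights=None):
    """Standard LOF scores; values well above 1 suggest local outliers."""
    n = len(rows)
    w = weights or [1.0] * len(rows[0])
    dist = [[weighted_dist(rows[i], rows[j], w) for j in range(n)] for i in range(n)]
    # k nearest neighbours of each point (ties beyond k are ignored for brevity)
    neigh = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    k_dist = [dist[i][neigh[i][-1]] for i in range(n)]
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = sum(max(k_dist[j], dist[i][j]) for j in neigh[i])
        lrd.append(len(neigh[i]) / reach if reach > 0 else math.inf)
    scores = []
    for i in range(n):
        if math.isinf(lrd[i]):
            scores.append(1.0)  # point coincides with its neighbours
            continue
        scores.append(sum(lrd[j] for j in neigh[i]) / (len(neigh[i]) * lrd[i]))
    return scores
```

On a toy dataset with a tight cluster and one distant record, calling `lof_scores(rows, k=3, weights=entropy_weights(rows))` should give the distant record the largest score. The all-pairs distance matrix makes this sketch quadratic in the number of records, which is the kind of cost that pruning normal data before scoring is meant to reduce.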

Acknowledgement

This work is supported by the National Science Foundation of China (Grant No. 61502082) and the Fundamental Research Funds for the Central Universities (Grant No. ZYGX2014J065).

Author information

Corresponding authors

Correspondence to Zhiping Xie or Xiaoyu Li.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Xie, Z., Li, X., Wu, W., Zhang, X. (2016). An Improved Outlier Detection Algorithm to Medical Insurance. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2016. IDEAL 2016. Lecture Notes in Computer Science, vol 9937. Springer, Cham. https://doi.org/10.1007/978-3-319-46257-8_47

  • DOI: https://doi.org/10.1007/978-3-319-46257-8_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46256-1

  • Online ISBN: 978-3-319-46257-8

  • eBook Packages: Computer Science, Computer Science (R0)
