Scene text reading based cloud compliance access

User behavior analyze of graphical system and images data sanitization

Abstract

A cloud compliance access system ensures that cloud users can use the resources of a cloud platform while conforming to regulations. Its two main functions are user behavior analysis and data sanitization. However, extracting information from image data remains an open problem for both functions. In this paper, the authors propose a new cloud compliance access system based on scene text reading, which enables timely auditing of users' operations on graphical systems and image data sanitization that is transparent to users. To improve the feasibility and availability of the system in the cloud computing setting, the authors refine the scene text reading model and implement the two main functions of cloud compliance access using the information extracted by this model. In the evaluation, the authors measure the accuracy of the scene text reading model on real data (overall accuracy of 93.4%), simulate a real application of the model, and evaluate the effect of the two main functions, user behavior analysis and data sanitization.
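
The way text extracted from images could feed the two functions can be illustrated with a minimal sketch. It is not the authors' implementation: the `TextRegion` type, the patterns, and the forbidden-command list are hypothetical, and the scene text reading model is assumed to have already produced recognized strings with bounding boxes.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


@dataclass(frozen=True)
class TextRegion:
    """One text region as a scene text reader might return it (hypothetical type)."""
    text: str
    box: Box


# Hypothetical rules, chosen only for illustration.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),     # SSN-like number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # e-mail address
]
FORBIDDEN_COMMANDS = ("rm -rf", "drop table")


def regions_to_redact(regions: List[TextRegion]) -> List[Box]:
    """Data sanitization: boxes whose text matches a sensitive pattern."""
    return [r.box for r in regions
            if any(p.search(r.text) for p in SENSITIVE_PATTERNS)]


def audit_alerts(regions: List[TextRegion]) -> List[str]:
    """User behavior analysis: recognized strings containing a forbidden command."""
    return [r.text for r in regions
            if any(c in r.text.lower() for c in FORBIDDEN_COMMANDS)]


if __name__ == "__main__":
    frame = [
        TextRegion("Contact: alice@example.com", (10, 10, 200, 20)),
        TextRegion("$ rm -rf /data", (10, 40, 150, 20)),
    ]
    print(regions_to_redact(frame))
    print(audit_alerts(frame))
```

Under these assumptions, the boxes returned by `regions_to_redact` would be masked in the image before it leaves the user's environment, while the strings returned by `audit_alerts` would be logged for the auditor.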

Acknowledgements

This work is supported by the Key Research and Development Program for Guangdong Province under Grant No. 2019B010136001, the National Key R&D Program of China (Grant No. 2016YFB0800803), and the National Natural Science Foundation of China (Grant No. 61872110).

Author information

Corresponding author

Correspondence to Chuanyi Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Data Science in Cyberspace 2019

Guest Editors: Bin Zhou, Feifei Li and Jinjun Chen

About this article

Cite this article

Pan, H., Liu, C., Duan, S. et al. Scene text reading based cloud compliance access. World Wide Web 23, 2633–2647 (2020). https://doi.org/10.1007/s11280-020-00805-y

Keywords

  • Cloud compliance access
  • Scene text reading
  • User behavior analyze
  • Data sanitization