Towards an Automatic Detection of Sensitive Information in Mongo Database

Heni, Houyem; Gargouri, Faiez

doi:10.1007/978-3-030-16657-1_13

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 940)

Included in the following conference series:

International Conference on Intelligent Systems Design and Applications

Abstract

Before discussing security tools, we must first decide what to protect and how to distinguish sensitive, identifying information within heterogeneous data spread over several sources, such as Facebook, Twitter and other suppliers of big data. In this paper we present a method for identifying sensitive information in a Mongo data store. We propose an innovative approach, implemented as an expert system, for the automatic detection of the candidate attributes for fragmentation. The approach is based mainly on semantic rules that determine which concepts have to be fragmented, and on a linguistic component that retrieves the attributes that semantically correspond to these concepts. Since attributes cannot be considered independently of each other, we also address the challenging problem of propagating sensitivity across the whole Mongo database. An important contribution of our approach is a semantic modeling of sensitive data.
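
As a rough illustration of the idea in the abstract (concepts marked as sensitive, plus a linguistic step that maps attribute names onto those concepts), the following minimal sketch may help. It is not the authors' implementation: the concept lexicon `SENSITIVE_CONCEPTS`, the helper names, and the sample documents are all hypothetical, a hand-written synonym table stands in for a lexical resource, and plain dictionaries stand in for documents read from a Mongo collection. Nested fields are reported with dotted paths, following MongoDB's dot notation.

```python
import re

# Hypothetical lexicon (an assumption, not taken from the paper): each
# sensitive concept maps to terms a lexical resource might list as related.
SENSITIVE_CONCEPTS = {
    "identity": {"name", "surname", "ssn", "passport"},
    "contact": {"email", "mail", "phone", "telephone", "address"},
    "health": {"disease", "illness", "diagnosis"},
}


def tokenize(attribute):
    """Split a name such as 'homeAddress' or 'phone_number' into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", attribute)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def attribute_paths(document, prefix=""):
    """Yield a dotted path for every field of a nested document."""
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        yield path
        if isinstance(value, dict):
            yield from attribute_paths(value, path)
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    yield from attribute_paths(item, path)


def detect_sensitive(documents):
    """Map each attribute path to the sensitive concepts its name matches."""
    found = {}
    for doc in documents:
        for path in attribute_paths(doc):
            leaf = path.rsplit(".", 1)[-1]
            tokens = set(tokenize(leaf))
            for concept, terms in SENSITIVE_CONCEPTS.items():
                if tokens & terms:
                    found.setdefault(path, set()).add(concept)
    return found


# Hypothetical sample documents standing in for a Mongo collection.
sample = [
    {"_id": 1, "fullName": "A. Example",
     "contact": {"email_address": "a@example.org"}, "visits": 3},
    {"_id": 2, "fullName": "B. Example", "medical": {"diagnosis": "n/a"}},
]

result = detect_sensitive(sample)
for path in sorted(result):
    print(path, sorted(result[path]))
```

In this sketch the matched paths play the role of candidate attributes for fragmentation. Matching is done on attribute names only; it does not model the rule engine or the propagation between related attributes that the abstract mentions.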

Notes

1. “JESS, the Rule Engine for the Java Platform,” http://www.jessrules.com.
2. “WordNet: An Electronic Lexical Database,” http://wordnet.princeton.edu.

Author information

Authors and Affiliations

Miracl Laboratory, Sfax University, Sfax, Tunisia
Houyem Heni & Faiez Gargouri

Corresponding author

Correspondence to Houyem Heni.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, WA, USA
Ajith Abraham
School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
Aswani Kumar Cherukuri
Tijuana Institute of Technology, Tijuana, Mexico
Patricia Melin
Machine Intelligence Research Labs, Auburn, WA, USA
Niketa Gandhi

About this paper

Cite this paper

Heni, H., Gargouri, F. (2020). Towards an Automatic Detection of Sensitive Information in Mongo Database. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2018. Advances in Intelligent Systems and Computing, vol 940. Springer, Cham. https://doi.org/10.1007/978-3-030-16657-1_13

DOI: https://doi.org/10.1007/978-3-030-16657-1_13
Published: 12 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16656-4
Online ISBN: 978-3-030-16657-1
eBook Packages: Intelligent Technologies and Robotics (R0)
