Abstract

Thanks to rapid advances in sequencing technologies, genomic data is now being produced at an unprecedented rate. To keep pace with this growth, several algorithms and paradigm shifts have been proposed to increase the throughput of the classical DNA workflow, e.g., by relying on the cloud to perform CPU-intensive operations. However, the scientific community has raised concerns about the privacy attacks that can be mounted on genomic data. In this paper, we review the state of the art in cloud-based alignment algorithms that were developed for performance. We then present several privacy-preserving mechanisms that have been, or could be, used to align reads at an incremental performance cost. Finally, we argue for the use of risk analysis throughout the DNA workflow to strike a balance between performance and data protection.

Notes

  1. Ancestry – https://www.ancestry.com.

  2. DisGeNet – http://www.disgenet.org.

  3. Surname Navigator – http://www.surnamenavigator.org.

Acknowledgements

This work was supported by the Fonds National de la Recherche Luxembourg (FNR) through PEARL grant FNR/P14/8149128, and by the Fundação para a Ciência e para a Tecnologia (FCT) through funding of the LaSIGE Research Unit, ref. UID/CEC/00408/2013.

Author information

Corresponding author

Correspondence to Maria Fernandes.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Fernandes, M., Decouchant, J., Couto, F.M., Esteves-Verissimo, P. (2017). Cloud-Assisted Read Alignment and Privacy. In: Fdez-Riverola, F., Mohamad, M., Rocha, M., De Paz, J., Pinto, T. (eds) 11th International Conference on Practical Applications of Computational Biology & Bioinformatics. PACBB 2017. Advances in Intelligent Systems and Computing, vol 616. Springer, Cham. https://doi.org/10.1007/978-3-319-60816-7_27

  • DOI: https://doi.org/10.1007/978-3-319-60816-7_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60815-0

  • Online ISBN: 978-3-319-60816-7

  • eBook Packages: Engineering, Engineering (R0)
