Abstract
To overcome the limitations that DNA barcoding imposes when multiplexing a large number of samples on the current generation of high-throughput sequencing instruments, we recently proposed a new protocol that leverages advances in combinatorial pooling design (group testing) [9]. We also demonstrated how this protocol enables de novo selective sequencing and assembly of large, highly repetitive genomes. Here we address the problem of decoding the pooled sequenced data obtained from such a protocol. Our algorithm combines ideas from compressed sensing and the decoding of error-correcting codes. Experimental results on synthetic data for the rice genome and real data for the barley genome show that our decoding algorithm enables significantly higher-quality assemblies than the previous approach.
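As a rough illustration of the kind of decoding problem described above, the following Python sketch builds a small pooling design and recovers which samples contain a given k-mer using orthogonal matching pursuit, a standard compressed-sensing method. It is not the algorithm of this paper. The design (25 samples in 20 pools, loosely in the spirit of a shifted transversal design), the noiseless count model, and the names `build_design` and `omp_decode` are all assumptions made for illustration.

```python
import numpy as np


def build_design(q=5, layers=4):
    """Toy pooling design (illustrative assumption, not the paper's design).

    q*q samples sit on a q-by-q grid; layer j puts sample (r, c) into pool
    (r + j*c) mod q. For prime q and layers <= q, any two samples share at
    most one pool.
    """
    phi = np.zeros((layers * q, q * q))
    for b in range(q * q):
        r, c = divmod(b, q)
        for j in range(layers):
            phi[j * q + (r + j * c) % q, b] = 1.0
    return phi


def omp_decode(phi, y, max_samples=2, tol=1e-9):
    """Orthogonal matching pursuit: greedily explain pool counts y = phi @ x."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support, coef = [], np.zeros(0)
    norms = np.linalg.norm(phi, axis=0)
    for _ in range(max_samples):
        if np.linalg.norm(residual) <= tol:
            break
        # Pick the sample whose pool pattern correlates best with the residual.
        scores = np.abs(phi.T @ residual) / norms
        if support:
            scores[support] = -1.0  # never pick the same sample twice
        support.append(int(np.argmax(scores)))
        # Refit all chosen samples jointly by least squares.
        coef, *_ = np.linalg.lstsq(phi[:, support], y, rcond=None)
        residual = y - phi[:, support] @ coef
    x = np.zeros(phi.shape[1])
    for b, v in zip(support, coef):
        x[b] = v
    return x


phi = build_design()
x_true = np.zeros(25)
x_true[3], x_true[17] = 2.0, 1.0   # hypothetical k-mer counts in two samples
y = phi @ x_true                   # noiseless counts observed per pool
x_hat = omp_decode(phi, y, max_samples=2)
print(np.flatnonzero(np.round(x_hat, 6)))
```

In this toy design every pair of samples shares at most one of its four pools, so the mutual coherence of the design matrix is at most 1/4. That is below the 1/(2k-1) threshold for k = 2, which is the standard condition under which orthogonal matching pursuit recovers any 2-sparse signal exactly from noiseless measurements. Real pooled sequencing data are noisy, so a practical decoder needs additional error handling.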
References
1. Alon, S., Vigneault, F., Eminaga, S., et al.: Barcoding bias in high-throughput multiplex sequencing of miRNA. Genome Research 21(9), 1506–1511 (2011)
2. Amir, A., Zuk, O.: Bacterial community reconstruction using compressed sensing. In: Bafna, V., Sahinalp, S.C. (eds.) RECOMB 2011. LNCS, vol. 6577, pp. 1–15. Springer, Heidelberg (2011)
3. Earl, D., et al.: Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Research 21(12), 2224–2241 (2011)
4. Engler, F.W., Hatfield, J., Nelson, W., Soderlund, C.A.: Locating sequence on FPC maps and selecting a minimal tiling path. Genome Research 13(9), 2152–2163 (2003)
5. Erlich, Y., Chang, K., Gordon, A., et al.: DNA sudoku - harnessing high-throughput sequencing for multiplexed specimen analysis. Genome Research 19(7), 1243–1253 (2009)
6. Erlich, Y., Gordon, A., Brand, M., et al.: Compressed genotyping. IEEE Transactions on Information Theory 56(2), 706–723 (2010)
7. Hajirasouliha, I., Hormozdiari, F., Sahinalp, S.C., Birol, I.: Optimal pooling for genome re-sequencing with ultra-high-throughput short-read technologies. Bioinformatics 24(13), i32–i40 (2008)
8. Langmead, B., Trapnell, C., Pop, M., Salzberg, S.L.: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10(3), R25 (2009)
9. Lonardi, S., Duma, D., Alpert, M., et al.: Combinatorial pooling enables selective sequencing of the barley gene space. PLoS Computational Biology 9(4), e1003010 (2013)
10. Ngo, H.Q., Porat, E., Rudra, A.: Efficiently decodable compressed sensing by list-recoverable codes and recursion. In: STACS, pp. 230–241 (2012)
11. Prabhu, S., Pe’er, I.: Overlapping pools for high-throughput targeted resequencing. Genome Research 19(7), 1254–1261 (2009)
12. Shental, N., Amir, A., Zuk, O.: Identification of rare alleles and their carriers using compressed se(que)nsing. Nucleic Acids Research 38(19), e179 (2010)
13. Simpson, J.T., Durbin, R.: Efficient de novo assembly of large genomes using compressed data structures. Genome Research 22(3), 549–556 (2012)
14. The International Barley Genome Sequencing Consortium: A physical, genetic and functional sequence assembly of the barley genome. Nature (advance online publication October 2012) (in press)
15. Thierry-Mieg, N.: A new pooling strategy for high-throughput screening: the shifted transversal design. BMC Bioinformatics 7(28) (2006)
16. Tropp, J.A., Gilbert, A.C.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory 53(12), 4655–4666 (2007)
17. Tropp, J.A., Gilbert, A.C., Strauss, M.J.: Algorithms for simultaneous sparse approximation: Part I: Greedy pursuit. Signal Processing 86(3), 572–588 (2006)
18. Zerbino, D., Birney, E.: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18(5), 821–829 (2008)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Duma, D. et al. (2013). Accurate Decoding of Pooled Sequenced Data Using Compressed Sensing. In: Darling, A., Stoye, J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science, vol 8126. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40453-5_7
DOI: https://doi.org/10.1007/978-3-642-40453-5_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40452-8
Online ISBN: 978-3-642-40453-5
eBook Packages: Computer Science, Computer Science (R0)