Finding Teams in Graphs and Its Application to Spatial Gene Cluster Discovery

Schulz, Tizian; Stoye, Jens; Doerr, Daniel

doi:10.1007/978-3-319-67979-2_11

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10562)

Included in the following conference series:

RECOMB International Workshop on Comparative Genomics

Abstract

Gene clusters are sets of genes in a genome with associated functionality. Often, they lie close to each other on the chromosome, which can be beneficial for their common regulation. A popular strategy for finding gene clusters is to exploit this proximity by identifying sets of genes that are consistently close to each other on their respective chromosomal sequences across several related species.

Yet, even more than gene proximity on linear DNA sequences, the spatial conformation of chromosomes may provide a pivotal indicator for common regulation and/or associated function of sets of genes.

We present the first gene cluster model capable of handling spatial data. Our model extends a popular computational model for gene cluster prediction, called \(\delta\)-teams, from sequences to general graphs. In the extended model, \(\delta\)-teams are single-linkage clusters of a set of vertices shared between two or more undirected weighted graphs such that the largest link in the cluster does not exceed a given threshold \(\delta\) in any input graph.
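The definition above can be illustrated with a small sketch. It is not the algorithm of the paper, only a naive refinement procedure written under stated assumptions: a "link" between two vertices is taken to be their shortest-path distance in the whole graph, graphs are given as adjacency dictionaries, and a candidate set is split repeatedly until it forms a single single-linkage cluster at threshold \(\delta\) in every input graph. The function names (undirected, shortest_paths, split_by_threshold, delta_teams) and the toy graphs g1 and g2 are hypothetical.

```python
import heapq
from itertools import combinations

def undirected(edges):
    """Build a symmetric adjacency dict {u: {v: weight}} from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def shortest_paths(graph, source):
    """Dijkstra's algorithm; returns {vertex: distance} for vertices reachable from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def split_by_threshold(graph, vertices, delta):
    """Single-linkage clusters of `vertices`: link pairs whose distance is <= delta."""
    dist = {u: shortest_paths(graph, u) for u in vertices}
    parent = {v: v for v in vertices}

    def find(x):  # union-find lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(vertices, 2):
        if dist[u].get(v, float("inf")) <= delta:
            parent[find(u)] = find(v)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())

def delta_teams(graphs, delta):
    """Refine the shared vertex set until every part is one cluster in every graph."""
    shared = set.intersection(*(set(g) for g in graphs))
    pending = [shared] if shared else []
    teams = []
    while pending:
        candidate = pending.pop()
        for g in graphs:
            parts = split_by_threshold(g, candidate, delta)
            if len(parts) > 1:  # candidate falls apart in this graph: refine further
                pending.extend(parts)
                break
        else:
            teams.append(candidate)  # stable in all graphs
    return teams

# Toy input: two weighted graphs over the same four (hypothetical) genes.
g1 = undirected([("a", "b", 1), ("b", "c", 1), ("c", "d", 5)])
g2 = undirected([("a", "b", 1), ("b", "c", 4), ("c", "d", 1)])
print(sorted(sorted(t) for t in delta_teams([g1, g2], delta=2)))
```

With \(\delta = 2\), the toy input yields the sets {a, b}, {c} and {d}: a, b and c are linked in the first graph, but c is too far from a and b in the second. The sketch reports singletons as trivial teams and recomputes all distances at every refinement step, so it is meant to clarify the definition rather than to be efficient.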

We apply our model to human and mouse data to find spatial gene clusters, i.e., gene sets with functional associations that exhibit close neighborhood in the spatial conformation of the chromosome across species.

Acknowledgements

We are very grateful to Krister Swenson for kindly providing the Hi-C data used in this study and for his many valuable suggestions. We wish to thank Pedro Feijão for many fruitful discussions at the beginning of this project. This work was partially supported by DFG GRK 1906/1.

Author information

Authors and Affiliations

Faculty of Technology and CeBiTec, Bielefeld University, Bielefeld, Germany
Tizian Schulz, Jens Stoye & Daniel Doerr
International Research Training Group 1906 “Computational Methods for the Analysis of the Diversity and Dynamics of Genomes”, Bielefeld University, Bielefeld, Germany
Tizian Schulz

Corresponding author

Correspondence to Daniel Doerr.

Editor information

Editors and Affiliations

University of Campinas, Campinas, São Paulo, Brazil
Joao Meidanis
Rice University, Houston, Texas, USA
Luay Nakhleh

Cite this paper

Schulz, T., Stoye, J., Doerr, D. (2017). Finding Teams in Graphs and Its Application to Spatial Gene Cluster Discovery. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science, vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_11

DOI: https://doi.org/10.1007/978-3-319-67979-2_11
Published: 15 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67978-5
Online ISBN: 978-3-319-67979-2
eBook Packages: Computer Science, Computer Science (R0)
