# Natural selection in compartmentalized environment with reshuffling

## Abstract

The emerging field of high-throughput compartmentalized in vitro evolution is a promising new approach to protein engineering. In these experiments, libraries of mutant genotypes are randomly distributed and expressed in microscopic compartments (droplets of an emulsion). Desirable variants are selected according to the phenotype of each compartment. Because of the random partitioning, a fraction of compartments receives more than one genotype, which makes the whole process a laboratory implementation of group selection. From a practical point of view, where efficient selection is typically sought, it is important to know how an increase in the mean occupancy of compartments affects selection efficiency. We carried out a theoretical investigation of this problem in the context of selection dynamics for an infinite, non-mutating, subdivided population that randomly colonizes an infinite number of patches (compartments) at each reproduction cycle. We derive an update equation for any distribution of phenotypes and any value of the mean occupancy. Using this result, we demonstrate that, for linear additive fitness, the best genotype is still selected regardless of the mean occupancy. Furthermore, the selection process is remarkably resilient to the presence of multiple genotypes per compartment: at high mean occupancy, it slows down approximately in inverse proportion to the mean occupancy. We extend our results to more general expressions that cover nonadditive and nonlinear fitnesses, as well as non-Poissonian distributions among compartments. Our conclusions may also apply to natural genetic compartmentalized replicators, such as viruses or early *trans*-acting RNA replicators.
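The qualitative claims above can be illustrated with a minimal numerical sketch. It is not the update equation derived in the paper but a simplified instance under assumptions made here: phenotypes take finitely many values, the number of co-inhabitants of a focal individual is Poisson with mean \(\lambda \), and each individual's fitness is the average trait of its compartment. Under these assumptions the expected fitness of phenotype \(x\) is \(g(\lambda )x + (1-g(\lambda )){\bar{x}}\) with \(g(x) = (1-e^{-x})/x\), and the mean fitness is \({\bar{x}}\). The function names `update` and `rounds_to_reach` are hypothetical.

```python
import math


def g(lam):
    """Shortcut g(x) = (1 - exp(-x)) / x, extended by its limit g(0) = 1."""
    return 1.0 if lam == 0 else -math.expm1(-lam) / lam


def update(freqs, traits, lam):
    """One selection round for a discrete phenotype distribution.

    Assumption (not taken from the paper): an individual's fitness is the
    average trait in its compartment, and it shares the compartment with a
    Poisson(lam) number of co-inhabitants drawn from the same distribution.
    """
    mean_x = sum(p * x for p, x in zip(freqs, traits))
    share = g(lam)  # expected weight of the focal individual's own trait
    return [p * (share * x + (1.0 - share) * mean_x) / mean_x
            for p, x in zip(freqs, traits)]


def rounds_to_reach(freqs, traits, lam, threshold=0.99, max_rounds=100_000):
    """Count rounds until the best phenotype reaches the given frequency."""
    best = max(range(len(traits)), key=lambda i: traits[i])
    for n in range(max_rounds):
        if freqs[best] >= threshold:
            return n
        freqs = update(freqs, traits, lam)
    return None  # threshold not reached within max_rounds


if __name__ == "__main__":
    start, traits = [0.99, 0.01], [1.0, 2.0]
    for lam in (0.0, 1.0, 10.0, 20.0, 40.0):
        print(lam, rounds_to_reach(start, traits, lam))
```

In this sketch the best phenotype always increases in frequency, and because \(g(\lambda )\approx 1/\lambda \) for large \(\lambda \), the number of rounds needed grows roughly in proportion to \(\lambda \).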

## Keywords

Directed evolution, Co-compartmentalization, Group selection, Frequency-dependent selection, Acellular genotype-phenotype linkage

## List of symbols

- \({\mathbb {N}}\)
Set of natural numbers; we assume \(0 \in {\mathbb {N}}\)

- \({\mathbb {R}}_+\)
The nonnegative semiaxis: \({\mathbb {R}}_+ = [0,+\infty ) \subset {\mathbb {R}}\)

- \(C_c\)
Space of continuous functions with compact support

- \(C_{c+}\)
Space of nonnegative functions from \(C_c\)

- \(C'_c\)
Space of generalized functions on \(C_c\) (Radon measures)

- \(C'_{c+}\)
Subset of nonnegative generalized functions

- \({\mathbb {P}}\)
Subset of probability densities: \({\mathbb {P}} = \{\rho \in C_{c+}'\,|\,\langle \rho ,1\rangle = 1\}\)

- \({\mathbb {P}}_p\)
Finite point-mass densities: \({\mathbb {P}}_p = \{\rho \in {\mathbb {P}}\,|\,\rho = \sum \limits _{k=1}^n a_k \delta _{x_k}\}\)

- \({\mathscr {I}}\)
Some very large closed interval: \({\mathscr {I}} = [0,{\mathscr {L}}]\)

- \({\mathbb {P}}^{\mathscr {I}}\)
Densities in \({\mathscr {I}}\): \({\mathbb {P}}^{\mathscr {I}} = \{\rho \in {\mathbb {P}}\,|\, {\text {supp}}\rho \subset {\mathscr {I}}\}\)

- \({\mathbb {P}}^{\mathscr {I}}_p\)
Finite point-mass densities in \({\mathscr {I}}\): \({\mathbb {P}}^{\mathscr {I}}_p = {\mathbb {P}}_p \cap {\mathbb {P}}^{\mathscr {I}}\)

- \(\chi _A\)
Indicator function of the set *A*: \(\chi _A(x) = {\left\{ \begin{array}{ll}1,&{}x \in A\\ 0,&{}x\notin A\end{array}\right. }\)

- \(C^k_n\)
Binomial coefficient \(\dfrac{n!}{k!(n-k)!}\)

- \(\langle \rho , \varphi \rangle \)
The action of the generalized function \(\rho \) on the test function \(\varphi \)

- \(\langle \rho , \varphi (x) \rangle \)
Implicitly \(\langle \rho (x),\varphi (x)\rangle \), where *x* is the internal variable

- \(\langle \rho , \varphi (x,y) \rangle \)
Implicitly \(\langle \rho (y),\varphi (x,y)\rangle \), where *y* is internal and *x* is external

- \(\langle \rho _x, \varphi (y) \rangle \)
Implicitly \(\langle \rho _x(y),\varphi (y)\rangle \), where *y* is the internal variable and *x* is a parameter of the distribution family \(\{\rho _x\}\)

- \(g(x)\)
A shortcut for \((1 - e^{-x})/x\)

- \(\delta _a\)
\(\delta \)-function concentrated at *a*: \(\langle \delta _a, \varphi \rangle = \varphi (a)\)

- \({\text {supp}}\varphi \)
Support of the function \(\varphi \): the closure of \(\{x \in {\mathbb {R}}\,|\,\varphi (x) \ne 0\}\)

- \({\text {supp}}\rho \)
Support of the generalized function \(\rho \): \({\text {supp}}\rho = {\mathbb {R}} {\setminus } O_\rho \), where \(O_\rho \) is the largest open subset \(O \subset {\mathbb {R}}\) such that \(\rho |_O = 0\)

- \(\bigotimes \limits _k \rho _k\)
Tensor product \(\rho _1 \otimes \rho _2 \otimes \ldots \)

- \(\rho ^{\otimes n}\)
*n*-th tensorial power: \(\underbrace{\rho \otimes \rho \otimes \ldots \otimes \rho }_{n\text { times}}\)

- \(\mathop {*}\limits _k \rho _k\)
Convolution product \(\rho _1 * \rho _2 * \ldots \)

- \(\rho ^{*n}\)
*n*-th convolution power: \(\underbrace{\rho *\rho *\ldots *\rho }_{n\text { times}}\)

- \(f_\star \)
Pushforward of a generalized function by the map *f* of the domain: \(\langle f_\star \rho , \varphi \rangle = \langle \rho , \varphi \circ f\rangle \)

- \({\mathrm {Corr}}(\rho _1,\rho _2)\)
Cross-correlation of densities \(\rho _1\) and \(\rho _2\)

- \(\rho \)
Probability density of the phenotypes (in the model description and application)

- \(\sigma \)
Probability density of the fitness in a compartmentalized population

- \(\sigma _x\)
Probability density of the fitness conditioned on phenotype *x*

- \({\bar{x}}\)
Mean phenotypic trait: mathematical expectation of the function \(x\mapsto x\) with respect to the phenotype distribution, \(\langle \rho ,x\rangle \) (in the model description and application)

- \(\overline{x^n}\)
The *n*-th moment of the phenotypic trait: mathematical expectation of the function \(x\mapsto x^n\) with respect to the phenotype distribution, \(\langle \rho ,x^n\rangle \) (in the model description and application)

- \({\bar{w}}\)
Mean fitness of an individual in a compartmentalized population: \(\langle \sigma ,x\rangle \) (in the model description and application)

- \({\bar{w}}_x\)
Mean fitness of an individual with phenotype *x* in a compartmentalized population: \(\langle \sigma _x,y\rangle \) (in the model description and application)

- \({\text {ch}}x\)
Hyperbolic cosine of *x*: \({\text {ch}}x = (e^x + e^{-x})/2\)

- \(\lambda \)
Poisson parameter: the mean number of individuals per compartment

- \(\wedge \), \(\Rightarrow \), \(\lnot \)
Logical conjunction, implication, and negation, respectively
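
For finite point-mass densities \(\rho = \sum _k a_k \delta _{x_k}\), several of the operations listed above reduce to elementary sums, which the following sketch makes concrete. The class `PointMass` and its method names are hypothetical and serve only as an illustration of the notation: the action is \(\langle \rho ,\varphi \rangle = \sum _k a_k \varphi (x_k)\), the pushforward moves each atom from \(x_k\) to \(f(x_k)\), and the convolution uses \(\delta _a * \delta _b = \delta _{a+b}\).

```python
from collections import defaultdict


class PointMass:
    """Finite point-mass density sum_k a_k * delta_{x_k}, stored as {x_k: a_k}."""

    def __init__(self, weights):
        self.weights = dict(weights)

    def pair(self, phi):
        """Action <rho, phi> = sum_k a_k * phi(x_k)."""
        return sum(a * phi(x) for x, a in self.weights.items())

    def pushforward(self, f):
        """f_* rho: each atom moves from x_k to f(x_k); colliding atoms merge."""
        out = defaultdict(float)
        for x, a in self.weights.items():
            out[f(x)] += a
        return PointMass(out)

    def convolve(self, other):
        """rho * other: atoms add (delta_a * delta_b = delta_{a+b}), weights multiply."""
        out = defaultdict(float)
        for x, a in self.weights.items():
            for y, b in other.weights.items():
                out[x + y] += a * b
        return PointMass(out)


if __name__ == "__main__":
    rho = PointMass({1.0: 0.25, 3.0: 0.75})
    print(rho.pair(lambda x: 1.0))             # total mass <rho, 1>
    print(rho.pair(lambda x: x))               # mean trait <rho, x>
    print(rho.convolve(rho).pair(lambda x: x)) # mean of the 2nd convolution power
```

The defining identity of the pushforward, \(\langle f_\star \rho , \varphi \rangle = \langle \rho , \varphi \circ f\rangle \), can be checked directly by comparing `rho.pushforward(f).pair(phi)` with `rho.pair(lambda x: phi(f(x)))`.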

## Mathematics Subject Classification

46F99, 46N60, 92D15

## Notes

### Acknowledgements

The authors are grateful to David Lacoste and Luca Peliti for stimulating discussions and especially to Ken Sekimoto for numerous discussions and for critically reading the manuscript. We also would like to thank an anonymous reviewer for pointing out a noncritical but unpleasant mathematical mistake in the manuscript.
