Introduction

Over the last 50 years theoretical models of prebiotic synthesis have shown how peptides, RNA or other polymers could have interacted in a cooperative manner to grow and persist, exploiting mechanisms of self-replication and natural selection (Eigen 1971; Gánti 2003; Kauffman 1986). In more recent years, DNA (von Kiedrowski 1986), bio-inspired molecules (Adamski et al. 2020; Carnall et al. 2010; Tjivikua et al. 1990), RNA (Lincoln and Joyce 2009; Vaidya et al. 2013), and peptides (Lee et al. 1996; Rubinov et al. 2009; Yao et al. 1998) have been engineered to exhibit different forms of self-replication, including auto-catalysis and cross-catalysis (Issac et al. 2001), where species directly and indirectly contribute to their own formation, respectively. While theory and practice have shown the feasibility of designing molecules that can self-replicate, there still remains a significant open question: how could molecular replicators have spontaneously arisen on the prebiotic earth?

To explore possible mechanisms for the spontaneous emergence of self-replicators in plausible prebiotic and other environments, diverse conditions, methods, and model systems have been pursued (Duim and Otto 2017; Gershenson et al. 2018). The most plausible prebiotic systems might well entail complex reaction networks involving precursors of nucleic acids, amino acids and metabolisms (Segré et al. 2000; Vincent et al. 2019; Xavier et al. 2020). However, for simplicity and tractability we focus on a subset of these chemistries: the condensation of amino acids to form peptides (Danger et al. 2012; Greenwald et al. 2018; Rode 1999). Because condensation reactions are disfavored in water, various activating agents, including polyphosphates, magnesium ion and imidazole, have been proposed (Lohrmann and Orgel 1973). In 1990, Rode and Schwendinger found simple conditions for amino acid monomers to form peptide polymers. Using the coordinating properties of copper ion, the dehydrating power of concentrated solutions of NaCl, and evaporation of solvating water, they drove the condensation of glycine to form dimers and trimers (Rode and Schwendinger 1990; Schwendinger and Rode 1991) in a process called salt-induced peptide formation (SIPF) (Schwendinger and Rode 1992); subsequently, they employed SIPF to form about 50 unique dipeptides (Rode et al. 1997). By employing such drying-induced condensation principles, we tracked the dimerization of alanine to form di-alanine in the presence of copper ion, where the dry-state residue was reactive over 25 days (Napier and Yin 2006). Further, we showed that a cyclic triphosphate (TP) can activate amino acids under alkaline conditions (pH 10) and elevated temperature (70-to-95 °C) by N-phosphorylation for condensation as well as hydrolysis (Sibilska et al. 2017), forming and breaking peptide bonds, respectively.
Such TP-mediated activation may have foreshadowed activation by polyphosphates in the living cell, where the C-termini of amino acids are O-phosphorylated by adenosine triphosphate (ATP) and the enzyme aminoacyl tRNA synthetase, producing energy-rich amino acyl adenylates.

To further investigate the range of conditions under which spontaneous polymerization is possible, we and others have employed tandem mass spectrometry to characterize longer and more diverse de novo peptides, revealing their amino acid sequences (Forsythe et al. 2017; Forsythe et al. 2015; Parker et al. 2014; Rodriguez-Garcia et al. 2015; Sibilska et al. 2018; Surman et al. 2019). These studies have begun to reveal an expanding range of conditions and protocols that can promote de novo peptide synthesis, including insights into how environment can influence emergent peptide sequences. For example, starting with equimolar mixtures of glycine and alanine, we reacted samples in the presence and absence of TP, at pH ranging from 1 to 12 and temperatures of 0–100 °C. Different reaction environments impacted the formation of tripeptide isomers containing two glycines (G) and one alanine (A). Specifically, AGG and GGA were detected in at least 50 of the 264 environments, while GAG was detected in none (Sibilska et al. 2018). Such a skewed distribution is not so surprising given that a likely di-peptide precursor, GG, was detected at higher levels than GA or AG in most of the tested environments. Despite such progress, the field of de novo prebiotic peptides has yet to discover conditions that enable the detection of emergent autocatalytic function, a key transition for prebiotic chemistry.

Here, we use theory and experiments to explore how individual amino acids condense to form dimers and combine further to produce trimers, tetramers and other products. We consider these processes from the framework of chemical kinetics, inspired by the quantitative characterization of synthetic molecular replicators (Lee et al. 1996; Nowick et al. 1991; von Kiedrowski 1986). Others have developed detailed kinetic models of amino acid polycondensation (Harshe et al. 2007; Yu et al. 2016), but they have not sought signatures of emergent molecular cooperation. Our simple kinetic models of de novo peptide synthesis show how peptide products of different length could exhibit different patterns of enrichment; levels of the shortest peptides (dimers) grow linearly with time while levels of longer peptides exhibit faster-than-linear or super-linear growth. For our wet-lab study we condensed five amino acids over 21 wet-dry cycles; the amino acids were selected to represent a diversity of potential side-group structures and chemistries: glycine is the smallest of the natural amino acids and is non-polar; serine is a polar amino acid with a potentially reactive hydroxyl moiety that could enable branching or formation of non-linear peptides; proline is a cyclic amino acid with a secondary amine group; arginine has a terminal guanidinium group where conjugation between the double bond and the nitrogen lone electron pair can enable formation of multiple hydrogen bonds, impacting water release; and aspartic acid has an additional carboxylic group that will be negatively charged at and above physiological pH, and it may also act as a hydrogen bond donor or acceptor. We found multiple species that exhibit super-linear growth, as anticipated by our models.

Materials and Methods

Materials

All chemicals were of analytical grade purity and used without further purification. Glycine, sodium hydroxide, potassium phosphate monobasic and hydrochloric acid (12 N) were purchased from Thermo Fisher Scientific (Waltham, MA, USA). Hexanesulfonic acid sodium salt, phosphoric acid, serine, proline, arginine and aspartate were supplied by Sigma Aldrich Co. (St. Louis, MO, USA).

Experiment Setup

Multiple tubes (triplicates for 21 cycles) were prepared so that each tube was sampled only once: at its corresponding cycle number. Extension of this approach to multiple (n) cycles is depicted in Fig. 1.

Fig. 1

Drying-induced reactions over multiple cycles. Each 24-h cycle starts when the dissolved reactants are set at the incubation temperature and left open to the atmosphere, promoting evaporation and solid-phase reactions. The cycle ends when water is added back to dissolve the solid

Reactions were carried out in 1.5 mL low-retention Eppendorf tubes. A standard modular heater with temperature control was used as a heat source (VWR, Radnor, PA, USA). The stock solution, or “Mix5”, contained five amino acids (glycine, serine, proline, arginine, and aspartic acid), each at 200 mM in ddH2O.

General Procedure for Drying-Induced Condensation with Rehydration Cycling

The reaction mixture was prepared using Mix5 stock (50 μl), alkalized with NaOH (30 μl, 1 M aq. solution) to pH 9.5 and further supplemented with ddH2O to a reaction volume of 200 μl; this solution was vortexed for a few seconds to ensure homogeneity of the sample. The reaction mixture was incubated at 70 °C uncapped for 24 h, permitting evaporation and incubation of the dried solid residue. This sample had been subjected to a single sequential process of drying-reaction-rehydration, so it was named “cycle 1”. For the subsequent cycles, the remaining solid crude was dissolved in 200 μl of ddH2O, vortexed for about 1 min (longer if needed), and placed back on the heating block to allow for evaporation; subsequent reactions were carried out in the original Eppendorf tubes.
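The starting concentrations implied by these volumes follow from simple dilution arithmetic. The sketch below assumes additive volumes and exact stock concentrations; the helper name `diluted_mM` is introduced here for illustration only.

```python
# Starting composition of the 200 µl reaction mixture, from the stated volumes.
# Assumption: volumes are additive and the stock concentrations are exact.

def diluted_mM(stock_mM: float, stock_ul: float, final_ul: float) -> float:
    """Concentration after diluting stock_ul of stock to final_ul (C1*V1 = C2*V2)."""
    return stock_mM * stock_ul / final_ul

FINAL_UL = 200.0
amino_acid_mM = diluted_mM(200.0, 50.0, FINAL_UL)     # each of the five amino acids
naoh_mM = diluted_mM(1000.0, 30.0, FINAL_UL)          # 1 M NaOH expressed as 1000 mM
water_ul = FINAL_UL - 50.0 - 30.0                     # ddH2O needed to reach 200 µl
amino_acid_umol = amino_acid_mM * FINAL_UL / 1000.0   # mM x µl / 1000 = µmol per tube

print(amino_acid_mM, naoh_mM, water_ul, amino_acid_umol)
```

Under these assumptions each amino acid starts at one quarter of its stock concentration, and the same amount of material is carried through every rehydration because each cycle restores the 200 μl volume.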

Product Analysis by IP-HPLC

After completing each cycle, a set of triplicates was removed from the heating block, dissolved in 1 mL of ddH2O, vortexed until all material dissolved, and subjected to analysis. Samples were analyzed using a Shimadzu Nexera XR IP-HPLC system fitted with a reversed-phase C18 column (Phenomenex Aeris XB-C18, 150 mm × 4.6 mm, 3.6 μm, Phenomenex, Torrance, CA, USA). Samples were auto-injected in 10 μL aliquots (Shimadzu Nexera X2 Autosampler, Shimadzu, Nakagyo-ku, Kyoto, Japan), and analyzed in binary gradient mode with a flow rate of 1 mL/min. Mobile phase A consisted of 50 mM potassium monophosphate (KH2PO4) and 7.5 mM hexanesulfonic acid (sodium salt) solution adjusted to pH 2.1 with phosphoric acid, and mobile phase B was 5% acetonitrile in 95% phase A. For each sample, the gradient was created by feeding the column with phase A (1-to-10 min), a ramp from phase A to B (10-to-15 min), phase B (15-to-30 min), a ramp from phase B to A (30-to-31 min) and phase A (31-to-35 min). The instrument was controlled and the resulting data analyzed using LabSolutions software (Shimadzu, Kyoto, Japan). Oligomeric products were detected based on their absorption at 195 nm and their characteristic HPLC retention times, which were reproducible to within 0.1-to-0.5% of their mean value across the triplicates.
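The gradient program can be written as a piecewise function of run time, which makes the mobile-phase composition at any retention time explicit. This is a sketch under two assumptions not stated in the text: the ramps are linear, and the interval before 1 min is also pure phase A. The function names are illustrative and are not part of the instrument software.

```python
# Piecewise mobile-phase program described in the text, as the fraction of phase B.
# Assumptions: ramps are linear, and any time up to 10 min is pure phase A.

def fraction_b(t_min: float) -> float:
    """Fraction of mobile phase B delivered at run time t_min (0 to 35 min)."""
    if not 0.0 <= t_min <= 35.0:
        raise ValueError("time outside the 35-min program")
    if t_min <= 10.0:
        return 0.0                      # isocratic phase A
    if t_min <= 15.0:
        return (t_min - 10.0) / 5.0     # ramp from A to B
    if t_min <= 30.0:
        return 1.0                      # isocratic phase B
    if t_min <= 31.0:
        return 31.0 - t_min             # ramp from B back to A
    return 0.0                          # re-equilibration in phase A


def acetonitrile_percent(t_min: float) -> float:
    """Phase B is 5% acetonitrile in 95% phase A, so pure B delivers 5% acetonitrile."""
    return 5.0 * fraction_b(t_min)


for t in (5.0, 12.5, 20.0, 30.5, 33.0):
    print(t, fraction_b(t), acetonitrile_percent(t))
```

Because phase B differs from phase A only by 5% acetonitrile, the organic content never exceeds 5% during a run, so the program is a shallow gradient.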

Results

Kinetic Modeling of De Novo Polymer Synthesis

An elementary reaction between two monomers (M1) yields a dimer (M2), as shown in the following reaction:

$$ {M}_1+{M}_1\overset{k_{11}}{\to }{M}_2 $$
(1)

where we have omitted indicating the formation of a water molecule, a byproduct of amide bond formation by the condensation of two amino acids. The rate of this reaction is given by the law of mass action, so the concentration of dimer will change in time with the square of the concentration of the monomer, and with a rate constant k11, as follows

$$ \frac{d\left[{M}_2\right]}{dt}={k}_{11}{\left[{M}_1\right]}^2 $$
(2)

where the brackets [] indicate concentrations. Initially (and under most experimental conditions to date) the formation of dimer has a negligible effect on the level or concentration of monomer, so one may treat the concentration of monomer as a constant, its initial value. Then the rate of dimer formation is approximately constant:

$$ \frac{d\left[{M}_2\right]}{dt}={k}_{11, app} $$
(3)

where the observed or apparent zeroth-order rate constant for the reaction, k11, app, depends on the rate constant of the elementary reaction and the initial concentration of monomer:

$$ {k}_{11, app}={k}_{11}{\left[{M}_1\right]}_0^2 $$
(4)

If we assume an initial condition where no dimer is present, then integration of Eq. (3) yields a linear dependence of the concentration of dimer with time:

$$ \left[{M}_2\right]={k}_{11, app}t $$
(5)
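To see when the constant-monomer approximation holds, one can solve the mass-action equations exactly for the idealized case in which reaction (1) is the only process that consumes monomer. That restriction is an assumption made here for illustration; it is not a claim about the experimental system.

```latex
\begin{aligned}
\frac{d[M_1]}{dt} &= -2k_{11}[M_1]^2
  \;\Rightarrow\;
  [M_1] = \frac{[M_1]_0}{1+2k_{11}[M_1]_0\,t},\\[4pt]
[M_2] &= \tfrac{1}{2}\left([M_1]_0-[M_1]\right)
  = \frac{k_{11}[M_1]_0^{2}\,t}{1+2k_{11}[M_1]_0\,t}
  = \frac{k_{11,app}\,t}{1+2k_{11}[M_1]_0\,t}.
\end{aligned}
```

The linear law of Eq. (5) is recovered when 2k11[M1]0 t is much smaller than 1, that is, while only a small fraction of the monomer has reacted; at longer times the exact expression bends below the straight line.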

Next, we consider the reaction between a monomer (M1) and a dimer (M2) to produce a trimer (M3), as shown by the elementary reaction:

$$ {M}_1+{M}_2\overset{k_{12}}{\to }{M}_3 $$
(6)

As before, the rate of trimer formation is given by mass action

$$ \frac{d\left[{M}_3\right]}{dt}={k}_{12}\left[{M}_1\right]\left[{M}_2\right] $$
(7)

and we also assume the monomer concentration is approximately its initial value, and the dependence of dimer concentration from Eq. (5) holds, so

$$ \frac{d\left[{M}_3\right]}{dt}={k}_{12}{k}_{11, app}{\left[{M}_1\right]}_0t $$
(8)

Assuming there is no trimer initially present, integration of this rate expression then yields a concentration of trimer that grows with the square of time:

$$ \left[{M}_3\right]={k}_{12, app}{t}^2 $$
(9)

where the rate constant k12, app is an apparent rate constant of trimer formation equivalent to ½(k11, app)(k12)[M1]0.

Finally, tetramer (M4) can be formed through a reaction between two dimers:

$$ \frac{d\left[{M}_4\right]}{dt}={k}_{22}{\left[{M}_2\right]}^2 $$
(10)

The dimers both grow in concentration with linear dependence on time, so substituting that relation from Eq. (5), integrating, and assuming no tetramer is initially present, we get a dependence of tetramer concentration on the cube of time:

$$ \left[{M}_4\right]={k}_{22, app}{t}^3 $$
(11)
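The intermediate step can be written out explicitly. Substituting Eq. (5) into Eq. (10) and integrating from zero gives the form of the apparent constant, by analogy with k12, app; this expression is derived here and is not quoted from the text.

```latex
\frac{d[M_4]}{dt} = k_{22}\left(k_{11,app}\,t\right)^{2}
\;\Rightarrow\;
[M_4] = \tfrac{1}{3}\,k_{22}\,k_{11,app}^{2}\,t^{3},
\qquad
k_{22,app} = \tfrac{1}{3}\,k_{22}\,k_{11,app}^{2}.
```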

Note that the dependence of tetramer concentration on the cube of time would also arise by forming the tetramer through a reaction between a trimer and a monomer, though the rate constant would in general be different. The derived growth of dimer, trimer and tetramer concentrations with time (t) is summarized in Fig. 2; in general, longer polymers exhibit more non-linear growth.
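These scaling laws can be checked numerically without the constant-monomer approximation. The sketch below integrates the full mass-action equations for reactions (1), (6) and the dimer-dimer route of Eq. (10) with a fixed-step Runge-Kutta scheme, then estimates the exponent on time from a log-log slope. The rate constants, initial concentration and time window are arbitrary illustrative values chosen so that conversion stays small; the monomer-plus-trimer route to tetramer is omitted for brevity.

```python
import math

def rates(y, k11, k12, k22):
    """Mass-action rates of change for (M1, M2, M3, M4)."""
    m1, m2, _m3, _m4 = y
    r11 = k11 * m1 * m1      # M1 + M1 -> M2
    r12 = k12 * m1 * m2      # M1 + M2 -> M3
    r22 = k22 * m2 * m2      # M2 + M2 -> M4
    return (-2.0 * r11 - r12, r11 - r12 - 2.0 * r22, r12, r22)

def integrate(t_end, dt=0.01, k11=1e-4, k12=1e-4, k22=1e-4, m1_0=1.0):
    """Classical fourth-order Runge-Kutta from t = 0, starting with monomer only."""
    y = (m1_0, 0.0, 0.0, 0.0)

    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    for _ in range(round(t_end / dt)):
        s1 = rates(y, k11, k12, k22)
        s2 = rates(shifted(y, s1, dt / 2), k11, k12, k22)
        s3 = rates(shifted(y, s2, dt / 2), k11, k12, k22)
        s4 = rates(shifted(y, s3, dt), k11, k12, k22)
        y = tuple(v + dt / 6 * (a + 2 * b + 2 * c + d)
                  for v, a, b, c, d in zip(y, s1, s2, s3, s4))
    return y

def loglog_slope(index, t_a=5.0, t_b=10.0):
    """Apparent exponent on time between t_a and t_b for species y[index]."""
    y_a = integrate(t_a)[index]
    y_b = integrate(t_b)[index]
    return math.log(y_b / y_a) / math.log(t_b / t_a)

slopes = {name: loglog_slope(i) for i, name in enumerate(("M2", "M3", "M4"), start=1)}
print(slopes)
```

With these parameters the slopes should come out close to 1, 2 and 3 for dimer, trimer and tetramer. Raising the rate constants until monomer depletion becomes appreciable lowers the apparent exponents, which is one way the idealized powers can be reduced.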

Fig. 2

Kinetic signatures of species formation and cooperation. The de novo formation of dimers (M2), trimers (M3), and tetramers (M4) exhibits linear, quadratic and cubic dependence on time (t), respectively, as shown. When two species, A and B, cooperate by mutually supporting each other’s enrichment, each species exhibits exponential growth. To enable comparisons, concentrations have been normalized to unity at t = 10; concentrations and time are in arbitrary units

Kinetic Modeling of Emergent Cooperation and Collective Autocatalysis

In the preceding section, we considered how the de novo emergence of short peptide oligomers of different length could arise from the condensation of preexisting components, starting from monomers; we neglected cooperative interactions between oligopeptide products and their effects on growth, which we now take into account. Assume we have de novo species A and B, which are formed with zeroth-order kinetics, exhibiting concentrations that increase linearly in time, analogous to the formation of dimer above in Eq. (5). For simplicity, let us further assume concentrations of both A and B are initially zero, and they grow at the same rate:

$$ \frac{d\left[i\right]}{dt}={k}_0\quad \text{or}\quad \left[i\right]={k}_0t,\quad \text{where}\ i=A,B $$
(12)

Now we allow species A to be recruited toward the formation of species B, a feature that now provides two sources for species B to grow: the original de novo source, which is independent of species A, and a new source that depends on the concentration of species A. We may express the rate of species B change as follows:

$$ \frac{d\left[B\right]}{dt}={k}_0+{k}_1\left[A\right] $$
(13)

where k1 is a rate constant that accounts for the extent to which species A contributes to the rate of formation of species B. We do not say anything mechanistic about how species A achieves this feat; it may be a catalyst or peptide template that is not consumed as it contributes to the formation of species B. We only consider the consequence of species A on the rate of formation of species B. Now, by substituting the linear growth of species A into Eq. (13) for species B, assuming B was not initially present, and integrating, we get

$$ \left[B\right]={k}_0t+\frac{k_0{k}_1}{2}{t}^2 $$
(14)

Species A provides a boost such that the concentration of B grows with a squared dependence on time, comparable to trimer growth (as in Eq. (9) and [M3] in Fig. 2). Finally, we consider the case where B reciprocates, being recruited to the synthesis of A in the same manner that A contributes to the synthesis of B. The process of reciprocal cooperation gives rise to exponential growth in the concentrations of both species:

$$ \left[i\right]=\frac{k_0}{k_1}\left({e}^{k_1t}-1\right),\quad \text{where}\ i=A,B $$
(15)
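The step leading to Eq. (15) can be sketched as follows. It assumes the reciprocal rate law for A has the same form and constants as Eq. (13), as the phrase "in the same manner" suggests, so that by symmetry both concentrations follow a single function x(t).

```latex
\begin{aligned}
\frac{d[A]}{dt} &= k_0 + k_1[B], \qquad \frac{d[B]}{dt} = k_0 + k_1[A],
\qquad [A]_0=[B]_0=0 \;\Rightarrow\; [A]=[B]=x(t),\\[4pt]
\frac{dx}{dt} &= k_0 + k_1 x
\;\Rightarrow\;
x(t) = \frac{k_0}{k_1}\left(e^{k_1 t}-1\right)
= k_0 t + \frac{k_0 k_1}{2}t^{2} + \frac{k_0 k_1^{2}}{6}t^{3} + \cdots
\end{aligned}
```

The first two terms of the expansion match Eq. (14), so one-way recruitment and reciprocal cooperation are indistinguishable at early times and diverge only once k1 t is no longer small.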

This reciprocal cooperation is cross-catalytic in the sense that neither species is consumed as it contributes to its partner’s synthesis, and yet both species exhibit exponential growth behavior that is qualitatively different from the growth of de novo species described in section 3.1 (Fig. 2).
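A short numerical sketch makes the contrast concrete. It integrates the reciprocal rate equations with a simple forward-Euler step, compares the result with the closed form of Eq. (15), and normalizes each growth law to unity at t = 10 in the manner of Fig. 2. The values k0 = 1 and k1 = 0.5 are arbitrary illustrative choices; the constants used to draw the figure are not given in the text.

```python
import math

def cooperative(t, k0, k1):
    """Closed form of Eq. (15); expm1 keeps precision when k1*t is small."""
    return (k0 / k1) * math.expm1(k1 * t)

def simulate_pair(k0, k1, t_end, dt=1e-3):
    """Forward-Euler integration of d[A]/dt = k0 + k1[B], d[B]/dt = k0 + k1[A]."""
    a = b = 0.0
    for _ in range(round(t_end / dt)):
        da = (k0 + k1 * b) * dt
        db = (k0 + k1 * a) * dt
        a, b = a + da, b + db
    return a, b

def normalized_power(t, order, t_ref=10.0):
    """De novo growth t**order, scaled to 1 at t_ref (order 1, 2, 3 for M2, M3, M4)."""
    return (t / t_ref) ** order

def normalized_cooperative(t, k1, t_ref=10.0):
    """Cooperative growth scaled to 1 at t_ref; k0 cancels in the ratio."""
    return math.expm1(k1 * t) / math.expm1(k1 * t_ref)

K0, K1 = 1.0, 0.5                      # illustrative values, arbitrary units
a_num, b_num = simulate_pair(K0, K1, 10.0)
exact = cooperative(10.0, K0, K1)
print(a_num, b_num, exact)
print([normalized_power(5.0, n) for n in (1, 2, 3)], normalized_cooperative(5.0, K1))
```

After normalization, the cooperative curve at the midpoint lies below the linear, quadratic and cubic curves for this choice of k1, reflecting growth that is concentrated late in the time window.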

Population Dynamics of De Novo Products

Reproducible column retention times for triplicate samples were used to detect and track the behavior of 63 de novo product species over 21 wet-dry cycles (Fig. 3, supplementary materials). Following the first cycle, nine species were detected. By the third cycle only one of the original nine species could be detected; losses of the other original eight species were counted as extinction events. The one surviving species was joined by 17 new ones to give a count at cycle three of 18 species and a cumulative diversity of 26 species. In subsequent sampling of cycles, 0-to-10 new species appeared while 0-to-9 species went extinct with each new sampling. Across the 21 cycles, species counts tended to increase during the first 15 cycles, then approached a steady-state level of 40 species. However, even as the species number approached this steady-state level, new species continued to emerge as other species became extinct, apparent from the increase in cumulative species from cycle 17 to cycle 19.
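The bookkeeping behind these counts can be sketched with sets of species identifiers per sampled cycle. The labels below are hypothetical placeholders chosen only to reproduce the counts reported for cycles 1 and 3; the rule that a species counts as "new" only the first time it is ever detected is an assumption, since the text does not say how reappearing species were scored.

```python
def track_species(samples):
    """samples: list of (cycle, set_of_species_ids) in sampling order."""
    seen = set()         # every species detected so far (cumulative diversity)
    previous = set()     # species present at the previous sampling
    rows = []
    for cycle, present in samples:
        new = present - seen            # never detected before this sampling
        extinct = previous - present    # present last time, absent now
        seen |= present
        rows.append({
            "cycle": cycle,
            "current": len(present),
            "new": len(new),
            "extinct": len(extinct),
            "cumulative": len(seen),
        })
        previous = present
    return rows

# Hypothetical labels: nine species at cycle 1; one survivor plus 17 new at cycle 3.
cycle1 = {f"s{i}" for i in range(1, 10)}
cycle3 = {"s1"} | {f"s{i}" for i in range(10, 27)}
rows = track_species([(1, cycle1), (3, cycle3)])
for row in rows:
    print(row)
```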

Fig. 3

Dynamics of emergent de novo products. More than 60 distinct putative peptide products were detectable by UV absorption and HPLC over a course of 21 cycles of wet-dry synthesis; the duration of each cycle was one day. New species appeared and others went extinct as overall (current) species approached a steady-state level

Kinetics of Emergent Cooperation by De Novo Peptides

To test whether putative peptide products exhibit signatures of cooperation, we used peak areas associated with UV absorbance as a proxy of concentration for each species (Sibilska et al. 2018). From cycle 3 to cycle 21 we were able to continuously track the integrated peak areas for eight species (Fig. 4); in most cases, best fits to the data gave super-linear growth, with exponents on time (t) greater than 1.00. Column retention times (RT) for these species ranged from 1.589 to 22.51 min, and variation across triplicate samples was minimal up to cycle 19. Only one species (RT 2.987) exhibited slower-than-linear growth, with an exponent on time of 0.84. In general, super-linear growth was coupled with simple concave-up area-vs-time behavior; however, one species (RT 22.51) with an overall exponent on time of 1.29 exhibited more complex growth, initially concave-up through 9 to 11 days followed by concave-down behavior to 21 days.
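One common way to estimate an exponent on time is a least-squares line through log(area) versus log(t), whose slope is the exponent. The text does not state which fitting procedure was used, so the sketch below is only one plausible approach; the data are synthetic and noise-free, generated here purely to show that the fit recovers a known exponent.

```python
import math

def fit_power_law(times, areas):
    """Least-squares fit of log(area) = log(c) + n*log(t); returns (c, n)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(a) for a in areas]
    n_pts = len(xs)
    x_mean = sum(xs) / n_pts
    y_mean = sum(ys) / n_pts
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(y_mean - exponent * x_mean)
    return prefactor, exponent

# Synthetic illustration only: area = 2.0 * t**1.3 sampled daily from day 3 to day 21.
days = list(range(3, 22))
synthetic_areas = [2.0 * d ** 1.3 for d in days]
prefactor, exponent = fit_power_law(days, synthetic_areas)
print(prefactor, exponent)
```

A log-log fit weights relative rather than absolute deviations, so with real peak areas it can give a somewhat different exponent than a nonlinear fit on the untransformed data, especially when early, small peaks are noisy.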

Fig. 4

Products of de novo synthesis enrich over multiple cycles. A subset of species, defined by their HPLC column retention times (RT, in minutes), was initially detected after 3 days (cycles) and enriched to 21 days. One product (RT 2.897) exhibited sub-linear growth with an exponent on time of 0.84, while most species displayed super-linear growth, with exponents on time ranging from 1.13 to 1.52

Discussion

Our modeling predicted that de novo peptides containing three or more amino acids would enrich with super-linear dependence on time, incorporating shorter peptides from earlier reactions to support the subsequent production of longer peptides. As predicted, we observed super-linear growth among multiple products that were detectable and quantifiable over nearly 20 days. However, our modeling predicted super-linear enrichment with quadratic and cubic powers of time for trimers and tetramers, respectively, while our experimentally observed super-linear growth was relatively modest, with powers of time ranging from 1.13 to 1.52.

The discrepancy between predicted and observed levels of super-linear growth likely reflects differences between assumptions of our model and the complexity of the actual reaction environment. For simplicity, we assumed de novo product formation was well-mixed or spatially homogeneous over the course of the reaction for each cycle; in reality, the precipitation of dissolved solids during evaporation has significant spatial structure and heterogeneity (Charlesworth and Marshall Jr. 1960). The solid residue is a complex reaction environment where the reduction in solution volume by evaporation causes species concentrations to approach infinity (promoting mass action kinetics), while the translational and rotational diffusivities of all species approach zero (inhibiting productive collisions between potential reactants). Such solid residues can be reactive over multiple weeks when they are initially composed of a single small amino acid like alanine (Napier and Yin 2006), but it is not known how factors that promote or inhibit reactions play out in our 5-component starting mixture. The super-linear growth behavior we observed suggests trimer or longer products were formed, but they did not exhibit quadratic or more pronounced super-linear behavior owing to reaction limitations in the solid residue. Future studies that employ activating agents may enable condensation reactions in the liquid phase (without the need for evaporation); in the absence of factors that inhibit solid-phase reactions, one would then predict more pronounced and more easily detected super-linear growth of peptide products.

Our analysis has been based on the detection of reproducible product retention times by HPLC analysis and the assumption that each peak corresponds to a single putative peptide product. Such assumptions will be addressed in the future by coupling HPLC with mass spectrometry, to resolve co-eluting species based on differing mass-to-charge ratios; further, tandem mass spectrometry would resolve species composed of the same amino acids but differing only in sequence. Analysis of other products, particularly by thermal decomposition at more elevated temperatures, may be further enriched by calorimetry and thermogravimetry (Weiss et al. 2018).

Do species actually go extinct? Based on the detection of new species and their absence in subsequent population sampling, a subset of species indeed appears to go extinct (Fig. 3). This may seem counterintuitive because conditions from one cycle to the next should be quite similar; species generated in an early cycle might likewise be generated in subsequent cycles. However, the kinetic expressions for rates of change for a given species depend not only on the rates of species formation but also on rates of species consumption; a species that is more rapidly consumed during subsequent cycles than it is formed may become undetectable and appear to go extinct, even when it is still being formed. It remains to be determined how or whether apparent extinction events relate to the emergence of further recruitment or cooperative reactions.
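A minimal illustration of this argument, under assumptions introduced here rather than taken from the experiments, treats a species S as formed at a constant rate r_f and consumed with first-order rate constant k_c; both symbols are hypothetical.

```latex
\frac{d[S]}{dt} = r_f - k_c[S]
\;\Rightarrow\;
[S](t) = \frac{r_f}{k_c} + \left([S]_0 - \frac{r_f}{k_c}\right)e^{-k_c t}
```

In this sketch the concentration relaxes toward r_f/k_c. If consumption accelerates in later cycles, for example because more reaction partners have accumulated, k_c rises and the plateau r_f/k_c can fall below the detection limit even though r_f remains positive, which would register as an apparent extinction.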

While our data support a range of growth rates of some individual species, none exhibited exponential growth (section 3.2), implying a lack of reciprocal recruitment or mutual cooperation. This does not rule out the possibility of autocatalysis emerging during cycles of de novo peptide formation. First, further cycling may enable new species to continue to emerge, either by condensation or hydrolysis of existing species, even if the total number of species remains constant, as we observed after 21 cycles (Fig. 3). Such new species may have distinct properties, for example template-mediated interactions that favor the ligation or elongation of some species over others. Examples are known for peptides that show such mechanisms (Lee et al. 1996; Rout et al. 2018; Rubinov et al. 2009). Additionally, environments that promote higher product yields during each cycle, for example different temperature, pH, or the inclusion of mineral surfaces (Gillams and Jia 2018; Imai et al. 1999; Lahav et al. 1978; Lohrmann and Orgel 1973; Sibilska et al. 2018; Vincent et al. 2019) may enable exploration of a broader range of species interactions for peptide bond formation or cleavage. Finally, it might be possible to implement cycles as serial dilutions so as to impose selections for reactions that are able to persist. Clearly, many opportunities await further exploration.