Abstract
In the central dogma of biology, RNA bridges the gap between genomics and proteomics. CA rules for RNA building blocks are designed in Chap. 2. This chapter reports how ‘equivalent CA rule groups’ model ‘codon degeneracy’ and covers the design of the RCAM (RNA Modelling CA Machine). RCAM evolution generates signal graphs, and algorithms are designed from signal graph analytics. RCAM evolution of the codon string of an mRNA molecule models the biological process of translation. Algorithms are constructed to predict co-translational folds and mutational effects; the predicted results are similar to those reported from wet lab experiments. From CA signal graph analytics, secondary structures are predicted for different classes of RNA molecules (tRNA, RNA precursor, miRNA, etc.). Predicted results are validated against experimental results reported on web sites or in research papers. Finally, the model for binding of an siRNA molecule on its target gene is reported, along with prediction of the true positive and true negative instances reported in databases.
“I’m fascinated by the idea that genetics is digital. A gene is a long sequence of coded letters, like computer information. Modern biology is becoming very much a branch of information technology”.
—Richard Dawkins
References
Clancy, S., Brown, W.: Translation: DNA to mRNA to protein. Nat. Educ. 1(1), 101 (2008)
Tuller, T.: Selected Publications. Tel Aviv University, www.cs.tau.ac.il/~tamirtul/Sublinkes/Tuller_Publications.html
Diament, A., Tuller, T.: Estimation of ribosome profiling performance and reproducibility at various levels of resolution. Biol. Direct 11(1), 24 (2016)
Stothard, P.: The sequence manipulation suite: JavaScript programs for analyzing and formatting protein and DNA sequences (2000)
Guglielmi, L., et al.: Expression of single-chain Fv fragments in E. coli cytoplasm. Methods Mol. Biol. 215–224 (2009)
Babenko, A.P., Polak, M., Cavé, H., Busiah, K., Czernichow, P., Scharfmann, R., Bryan, J., Aguilar-Bryan, L., Vaxillaire, M., Froguel, P.: Activating mutations in the ABCC8 gene in neonatal diabetes mellitus. N. Engl. J. Med. 355(5), 456–466 (2006)
Bowman, P., et al.: Heterozygous ABCC8 mutations are a cause of MODY. Diabetologia 55(1), 123–127 (2012)
Gingold, H., Pilpel, Y.: Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7(1), 481 (2011)
Iwakawa, H., Tomari, Y.: The functions of microRNAs: mRNA decay and translational repression. Trends Cell Biol. 25(11), 651–665 (2015)
Krol, J., et al.: Structural features of microRNA (miRNA) precursors and their relevance to miRNA biogenesis and small interfering RNA/short hairpin RNA design. J. Biol. Chem. 279(40), 42230–42239 (2004)
Liu, B., et al.: Analysis of secondary structural elements in human microRNA hairpin precursor. BMC Bioinform. 17, 112 (2016)
Fernandez, N., et al.: Genetic variation and RNA structure regulate microRNA biogenesis. Nat. Commun. 8, 15114 (2017)
Lambe, A.T., et al.: Transitions from functionalization to fragmentation reactions of laboratory secondary organic aerosol (SOA) generated from the OH oxidation of alkane precursors. Environ. Sci. Technol. 46(10), 5430–5437 (2012)
Deigan, K.E., et al.: Accurate SHAPE-directed RNA structure determination. Proc. Natl. Acad. Sci. 106(1), 97–102 (2009)
Belter, A., et al.: Mature miRNAs form secondary structure, which suggests their function beyond RISC. PLoS ONE 9(11), e113848 (2014)
Griffiths-Jones, S., et al.: miRBase: tools for microRNA genomics. Nucleic Acids Res. 36(suppl_1), D154–D158 (2007)
Gibb, E.A., et al.: Human cancer long non-coding RNA transcriptomes. PLoS ONE 6(10), e25915 (2011)
Rohloff, J.C., et al.: Nucleic acid ligands with protein-like side chains: modified aptamers and their use as diagnostic and therapeutic agents. Mol. Ther.-Nucleic Acids 3, e201 (2014)
Keeler, A.M., ElMallah, M.K., Flotte, T.R.: Gene therapy 2017: progress and future directions. Clin. Transl. Sci. 10(4), 242–248 (2017)
Reynolds, A., et al.: Rational siRNA design for RNA interference. Nat. Biotechnol. 22(3), 326 (2004)
Riba, A., et al.: Explicit modeling of siRNA-dependent on-and off-target repression improves the interpretation of screening results. Cell Syst. 4(2), 182–193 (2017)
Chalk, A.M., et al.: siRNAdb: a database of siRNA sequences. Nucleic Acids Res. 33(suppl_1), D131–D134 (2005)
Carthew, R.W., Sontheimer, E.J.: Origins and mechanisms of miRNAs and siRNAs. Cell 136(4), 642–655 (2009)
Gerstein, M., Tsai, J., Levitt, M.: The volume of atoms on the protein surface: calculated from simulation, using Voronoi polyhedra. J. Mol. Biol. 249, 955–966 (1995). https://doi.org/10.1006/jmbi.1995.0351
Loughlin, F.E., et al.: Structural basis of pre-let-7 miRNA recognition by the zinc knuckles of pluripotency factor Lin28. Nat. Struct. Mol. Biol. 19(1), 84 (2012)
Annexure
3.1.1 Annexure 3.1: Analysis of Nucleotide Bases to Assign Cellular Automata (CA) Rules
Section 3.3.3 reports the assignment of Cellular Automata (CA) rules to the codon triplets. Such an assignment demands detailed analysis of the atoms of different nucleotide bases and the associated sequence in different codons, as presented in this annexure.
The relevant features of the four nucleotide bases, retrieved from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/), are analyzed in this annexure. The volume surface values of the Nitrogen (N), Oxygen (O), and Carbon (C) atoms are reported as 15.3, 18.1 and 9.9, respectively, in [24]. The polar positive, polar negative and non-polar surface areas are computed for each of the four bases. The XLogP3 values for G, A, U and C are −1.0, −0.1, −1.1 and −1.7, respectively. This value specifies how hydrophilic or hydrophobic a molecule is: a less negative value indicates higher hydrophobicity on the comparative scale for the four bases.
Based on the analysis of these parameters, the bases are arranged in the following descending order: G > A > U > C, with G and C assigned the maximum and minimum weight respectively. Among the bases, the atomic structure of base A stands apart from the others since it has no oxygen atom and therefore no negatively charged surface area. As per the XLogP3 values, base A can be viewed as the most hydrophobic, while C is the least hydrophobic among the four bases.
For comparative analysis of the 16 base pairs formed by the left and middle base (the first two bases of a codon), it has been assumed that the parameters of the left base play a more dominant role. For example, the base A is assigned higher weight than the base C for the pair AC; on the other hand, higher weight is assigned to C than to A for the pair CA. Considering the special characteristics of base A and the XLogP3 values, the 16 base pairs are divided into the following two groups.
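As a small illustration of the XLogP3 comparison above, the following sketch ranks the four bases by the quoted values alone. The function name is hypothetical, and the ranking uses only XLogP3; the weight order G > A > U > C stated in the text draws on the other parameters as well, so the two orders need not coincide.

```python
# XLogP3 values as quoted in the text; a less negative value is read
# as more hydrophobic on the comparative scale of the four bases.
XLOGP3 = {"G": -1.0, "A": -0.1, "U": -1.1, "C": -1.7}


def rank_by_hydrophobicity(xlogp3):
    """Return the bases ordered from most to least hydrophobic (hypothetical helper)."""
    return sorted(xlogp3, key=xlogp3.get, reverse=True)


ranking = rank_by_hydrophobicity(XLOGP3)
print(" > ".join(ranking))  # A > G > U > C
```

The output agrees with the statement that A is the most hydrophobic and C the least hydrophobic base.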
(i) The first group consists of the pairs GG, GU, GC, UC, CG, CU and CC, none of which contains the base A. On a comparative scale, these pairs are assumed to be neither hydrophilic nor hydrophobic. The pair AC (A followed by C) is added to this group since A is the least hydrophilic base, while C is the most hydrophilic.
(ii) The second group covers the six base pairs involving base A, the least hydrophilic base: GA, AA, UA, CA, AG and AU. These pairs are assumed to be either hydrophilic or hydrophobic. Two other base pairs, UG and UU, are added to this group since U and G have similar XLogP3 values (−1.1 and −1.0). Further, the base U displays both polar positive surface area (30.6) and polar negative surface area (36.2).
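The two groups described above can be reproduced with a short sketch. It simply encodes the stated membership rules (pairs containing A go to the second group, except AC, and UG and UU are added to the second group); the function name is hypothetical and the code does not model the underlying surface-area analysis.

```python
from itertools import product

BASES = "GAUC"
# All 16 pairs of (left base, middle base).
ALL_PAIRS = ["".join(p) for p in product(BASES, repeat=2)]


def split_pairs(pairs):
    """Split base pairs into the two groups described in the text (hypothetical helper)."""
    # Second group: pairs involving A (except AC), plus UG and UU.
    group_two = {p for p in pairs if "A" in p and p != "AC"} | {"UG", "UU"}
    # First group: everything else, which includes AC.
    group_one = set(pairs) - group_two
    return group_one, group_two


group_one, group_two = split_pairs(ALL_PAIRS)
print(sorted(group_one))
print(sorted(group_two))
```

Each group ends up with eight pairs, matching the eight pairs listed per group in the δ subdivisions.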
The first group of base pairs is assigned to the (2, 2) 4RGs having balanced rules, subdivided as per the value of the parameter δ. The descending order of base pairs is derived by giving higher weight to the left base and from the sum of polar surface area for each pair. Each pair is associated with an amino acid as per the parameter value δ elaborated next.
- δ = 1: GG > GU > GC > AC
- δ = 2: UC > CG > CU
- δ = 0: CC
The second group of base pairs is assigned to the non-(2, 2) 4RGs (ID 9–16), with subgroups detailed next. Each pair is associated with two or three amino acids, including stop codons.
- δ = 1: GA > AA > UA > CA
- δ = 0: AG
- δ = 2: AU > UG
- δ = 0: UU
The base pairs are next assigned to different 4RGs, as reported in main text Sect. 3.3.3, based on the sum of decimal values of CA rules associated with 4RGs and parameter δ.
3.1.2 Annexure 3.2: Outline of CA Model for Post-co-Translational Folding and Prediction of Protein Structure
CA model for co-translational folding is explained in Sect. 3.6. The algorithm designed to predict the fold points (of nascent peptide chain translated out of codon string) is reported along with the results derived on executing the software code developed for the algorithm.
We have experimented with the mRNA codon strings of 16000 proteins retrieved from the Dunbrack data set (http://dunbrack.fccc.edu) and designed two databases named EFRL (Excluded Fold Residue Location) and VFRL (Valid Fold Residue Location). Using these EFRL and VFRL databases, we classify each predicted fold residue location (FRL) for an input codon string as true or false. Based on the foundation laid down in Sect. 3.6 with the CA model for co-translational folding, we take the next step to model post-co-translational folding.
This annexure highlights the methodology we are implementing for post-co-translational folding and prediction of protein structure. We have assumed that during post-co-translational folding, the secondary structures of helix and beta are introduced on the protein chain. Each helix and beta introduces folds at its start and end on the protein chain. These new folds are added to the folds detected during co-translational folding. Fold Segments (FSs) are next derived from the amino acid residues between each pair of folds identified during co-translational and post-co-translational folding. At either end of such a segment, the dihedral angles of the first and last two residues are noted. The Fold Segment Database (FSDB), designed out of a large number of proteins, lays the foundation for prediction of protein structure. The sequential steps of this methodology are noted below; they are being finalized prior to development of the program.
Step 1: Develop the fold segment databases (FSDB0, FSDB1, FSDB2 and FSDB3) with the results of true VFRL, and the FRL specified on the first and last base of the secondary structures of helix and beta noted in PDB for the 16000 Dunbrack proteins.
- While FSDB0 covers each of the segments derived out of the 16000 proteins, the next three databases merge adjacent fold segments as follows.
- FSDB1 and FSDB2 merge each FSDB0 entry with its left and right neighbouring segment, respectively.
- FSDB3 merges each FSDB0 entry with both its left and right neighbours. These databases have been designed.
- Each entry in FSDB records two dihedral angles of the first and last residue locations, as explained in Sect. 3.6 while introducing the two databases for EFRL (excluded FRL) and VFRL (valid FRL).
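The merging rule of Step 1 can be sketched as follows for the ordered fold segments of a single chain. This is a minimal illustration under assumptions: the function name and the dictionary layout (keyed by the index of the FSDB0 entry being merged) are hypothetical, segments are plain residue strings, and the dihedral angles recorded with each entry are omitted.

```python
def build_fsdb_views(segments):
    """Build FSDB0..FSDB3 style views for one chain (hypothetical helper).

    segments: fold segments in chain order, each a string of residues.
    Each view is keyed by the index of the FSDB0 entry it is built from.
    """
    n = len(segments)
    fsdb0 = dict(enumerate(segments))
    # FSDB1: each entry merged with its left neighbour (first entry has none).
    fsdb1 = {i: segments[i - 1] + segments[i] for i in range(1, n)}
    # FSDB2: each entry merged with its right neighbour (last entry has none).
    fsdb2 = {i: segments[i] + segments[i + 1] for i in range(n - 1)}
    # FSDB3: each entry merged with both neighbours.
    fsdb3 = {i: segments[i - 1] + segments[i] + segments[i + 1]
             for i in range(1, n - 1)}
    return fsdb0, fsdb1, fsdb2, fsdb3


# Toy segments, not taken from the Dunbrack data set.
fsdb0, fsdb1, fsdb2, fsdb3 = build_fsdb_views(["MKT", "AYI", "AKQ", "RQ"])
print(fsdb1[1], fsdb2[1], fsdb3[1])  # MKTAYI AYIAKQ MKTAYIAKQ
```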
Step 2: Select three public domain packages for prediction of helix and beta in a protein chain.
Step 3: In order to predict three-dimensional structure of a candidate protein chain, we execute the Steps 4–7.
Step 4: Derive co-translational fold list out of VFRL (valid fold residue location) noted in Sect. 3.6.
Step 5: Run the three packages (noted in Step 2) to predict helix and beta structure in the candidate input chain. Identify helix and beta locations in the candidate chain based on majority voting of the results derived out of three packages. Based on location of helix and beta in the candidate chain, next step modifies the VFRL list.
Step 6: Modify the VFRL (Valid Fold Residue Location) list and derive the composite VFRL for the candidate chain: mark a VFRL entry as ‘False’ if it lies within a helix or beta, excepting the first and last two bases, and remove such ‘False’ entries. After excluding these ‘False’ FRLs from the VFRL, add folds at the start and end location of each helix and beta. The result is the composite VFRL list.
Step 7: Derive fold segments between each pair of folds noted in the composite VFRL list generated in Step 6.
Step 8: For each fold segment identified in Step 7, search the FSDB (fold segment database) designed in Step 1.
Step 9: Stitch the fold segments identified in Step 8 to predict the three-dimensional structure of the input candidate protein chain, taking into consideration the dihedral angle pairs noted for the start and end residues of each segment. Stitching a pair of adjacent segments extracted from FSDB demands a search for the correct option among the multiple options available.
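Steps 5 to 7 above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' program: the function names are hypothetical, each predictor is assumed to return one label per residue ('H' for helix, 'E' for beta, '-' otherwise), the exemption in Step 6 is read as the first two and last two residues of each helix or beta, and a fold segment is taken to include both bounding fold positions.

```python
from collections import Counter


def majority_vote(predictions):
    """Step 5: per-residue majority label over several predictors."""
    consensus = []
    for labels in zip(*predictions):
        label, count = Counter(labels).most_common(1)[0]
        # Keep the label only if more than half of the predictors agree.
        consensus.append(label if count * 2 > len(labels) else "-")
    return "".join(consensus)


def ss_runs(consensus):
    """Return (start, end) 1-based inclusive positions of each helix/beta run."""
    runs, start, prev = [], None, "-"
    for pos, label in enumerate(consensus + "-", start=1):
        if label != prev:
            if prev in ("H", "E"):
                runs.append((start, pos - 1))
            if label in ("H", "E"):
                start = pos
        prev = label
    return runs


def composite_vfrl(vfrl, runs, edge=2):
    """Step 6: drop folds inside a helix/beta, then add folds at run ends."""
    kept = {frl for frl in vfrl
            if not any(s + edge <= frl <= e - edge for s, e in runs)}
    for s, e in runs:
        kept.update((s, e))
    return sorted(kept)


def fold_segments(chain, folds):
    """Step 7: residues between each pair of consecutive folds (inclusive)."""
    return [(a, b, chain[a - 1:b]) for a, b in zip(folds, folds[1:])]


# Toy input: three hypothetical predictor outputs for a 10-residue chain.
preds = ["--HHHHHH--", "--HHHHHH--", "---HHHHH-E"]
consensus = majority_vote(preds)
runs = ss_runs(consensus)
folds = composite_vfrl([2, 5, 8], runs)
print(consensus, runs, folds)
print(fold_segments("ABCDEFGHIJ", folds))
```

In this toy run the fold at position 5 lies in the interior of the helix spanning positions 3 to 8 and is removed, while folds are added at positions 3 and 8.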
3.1.3 Annexure 3.3: CS/CL Signal Graph of an Example RNA Molecule and Derivation of CrP (Critical Pair)
The CS signal graph is derived from the Cycle Start (CS)/Cycle Length (CL) parameter table reported below for an example RNA molecule having 159 nucleotide bases. The table is laid out in two major column groups, columns 1 to 6 and columns 7 to 12. Columns 1 and 2 (and 7 and 8) report the serial number and the nucleotide base at that location. The CS and CL parameters derived out of ComRCAM are noted in columns 3 and 4 (and 9 and 10). The CS signal graph is derived by subtracting the CS value of the (i − 1)th cell from that of the ith cell, where the two cells model the bases at locations (i − 1) and i respectively. The resulting CS Difference (CSD) values are noted in columns 5 and 11. Most of the CSD values lie in the range 0 to −4, hence the threshold limit is set at −4. A CSD value outside this range is marked as a CrP (critical pair) in columns 6 and 12. For example, the first CrP in the list, at location 28, has the value −32, since the CS value for base location 27 is 372 while the value for location 28 is 340. A positive CSD value also lies outside the range 0 to −4 and is marked as a CrP, as noted for location 62. The CrP values are analysed and processed in the different algorithms reported in the main body of this chapter. The locations of CrPs derived out of CS signal graphs provide the foundation of the signal graph analytics projected for different classes of RNA molecules.
The CS signal graph derived out of the CS difference values is reported below for the RNA molecule. Signal graph analytics of such graphs represents various physical domain features of the RNA molecule, as reported in different sections of this chapter (Fig. 3.37).
Pos | Base | CS Val | CL Val | CSD | CrP | Pos | Base | CS Val | CL Val | CSD | CrP |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | G | 388 | 136 |  |  | 81 | T | 271 | 68 | 0 |  |
2 | G | 387 | 136 | −1 |  | 82 | T | 271 | 68 | 0 |  |
3 | C | 387 | 136 | 0 |  | 83 | T | 270 | 68 | −1 |  |
4 | T | 386 | 136 | −1 |  | 84 | A | 270 | 68 | 0 |  |
5 | C | 386 | 136 | 0 |  | 85 | A | 269 | 68 | −1 |  |
6 | T | 385 | 136 | −1 |  | 86 | T | 269 | 68 | 0 |  |
7 | A | 385 | 136 | 0 |  | 87 | T | 268 | 68 | −1 |  |
8 | T | 384 | 136 | −1 |  | 88 | A | 267 | 68 | −1 |  |
9 | G | 380 | 136 | −4 |  | 89 | G | 267 | 68 | 0 |  |
10 | G | 380 | 136 | 0 |  | 90 | C | 267 | 68 | 0 |  |
11 | C | 380 | 136 | 0 |  | 91 | T | 266 | 68 | −1 |  |
12 | T | 380 | 136 | 0 |  | 92 | G | 266 | 68 | 0 |  |
13 | T | 380 | 136 | 0 |  | 93 | G | 265 | 68 | −1 |  |
14 | A | 380 | 136 | 0 |  | 94 | G | 265 | 68 | 0 |  |
15 | G | 380 | 136 | 0 |  | 95 | A | 264 | 68 | −1 |  |
16 | T | 380 | 136 | 0 |  | 96 | A | 264 | 68 | 0 |  |
17 | T | 380 | 136 | 0 |  | 97 | T | 263 | 68 | −1 |  |
18 | G | 379 | 136 | −1 |  | 98 | G | 263 | 68 | 0 |  |
19 | G | 379 | 136 | 0 |  | 99 | G | 262 | 68 | −1 |  |
20 | T | 378 | 136 | −1 |  | 100 | T | 262 | 68 | 0 |  |
21 | T | 378 | 136 | 0 |  | 101 | G | 261 | 68 | −1 |  |
22 | A | 377 | 136 | −1 |  | 102 | G | 261 | 68 | 0 |  |
23 | A | 377 | 136 | 0 |  | 103 | C | 260 | 68 | −1 |  |
24 | A | 376 | 136 | −1 |  | 104 | A | 256 | 68 | −4 |  |
25 | G | 372 | 136 | −4 |  | 105 | C | 256 | 68 | 0 |  |
26 | C | 372 | 136 | 0 |  | 106 | A | 256 | 68 | 0 |  |
27 | G | 372 | 136 | 0 |  | 107 | C | 228 | 68 | −28 | CrP |
28 | C | 340 | 136 | −32 | CrP | 108 | T | 228 | 68 | 0 |  |
29 | C | 340 | 136 | 0 |  | 109 | C | 226 | 68 | −2 |  |
30 | T | 339 | 136 | −1 |  | 110 | C | 226 | 68 | 0 |  |
31 | G | 339 | 136 | 0 |  | 111 | T | 226 | 68 | 0 |  |
32 | T | 338 | 136 | −1 |  | 112 | G | 226 | 68 | 0 |  |
33 | C | 338 | 136 | 0 |  | 113 | T | 226 | 68 | 0 |  |
34 | T | 337 | 136 | −1 |  | 114 | A | 226 | 68 | 0 |  |
35 | C | 337 | 136 | 0 |  | 115 | G | 226 | 68 | 0 |  |
36 | G | 336 | 136 | −1 |  | 116 | T | 211 | 68 | −15 | CrP |
37 | T | 336 | 136 | 0 |  | 117 | C | 211 | 68 | 0 |  |
38 | A | 335 | 136 | −1 |  | 118 | C | 210 | 68 | −1 |  |
39 | A | 335 | 136 | 0 |  | 119 | C | 210 | 68 | 0 |  |
40 | A | 334 | 136 | −1 |  | 120 | A | 209 | 68 | −1 |  |
41 | A | 330 | 136 | −4 |  | 121 | G | 209 | 68 | 0 |  |
42 | A | 330 | 136 | 0 |  | 122 | C | 208 | 68 | −1 |  |
43 | A | 330 | 136 | 0 |  | 123 | T | 208 | 68 | 0 |  |
44 | T | 302 | 136 | −28 | CrP | 124 | A | 207 | 68 | −1 |  |
45 | G | 302 | 136 | 0 |  | 125 | C | 207 | 68 | 0 |  |
46 | T | 302 | 136 | 0 |  | 126 | T | 206 | 68 | −1 |  |
47 | C | 302 | 136 | 0 |  | 127 | C | 206 | 68 | 0 |  |
48 | A | 302 | 136 | 0 |  | 128 | A | 205 | 68 | −1 |  |
49 | G | 301 | 136 | −1 |  | 129 | G | 205 | 68 | 0 |  |
50 | C | 299 | 136 | −2 |  | 130 | G | 204 | 68 | −1 |  |
51 | C | 295 | 136 | −4 |  | 131 | A | 204 | 68 | 0 |  |
52 | T | 294 | 136 | −1 |  | 132 | G | 203 | 68 | −1 |  |
53 | G | 294 | 136 | 0 |  | 133 | A | 203 | 68 | 0 |  |
54 | A | 293 | 136 | −1 |  | 134 | C | 202 | 68 | −1 |  |
55 | G | 293 | 136 | 0 |  | 135 | T | 202 | 68 | 0 |  |
56 | C | 292 | 136 | −1 |  | 136 | G | 201 | 68 | −1 |  |
57 | A | 288 | 136 | −4 |  | 137 | A | 201 | 68 | 0 |  |
58 | A | 288 | 136 | 0 |  | 138 | A | 200 | 68 | −1 |  |
59 | C | 288 | 136 | 0 |  | 139 | G | 200 | 68 | 0 |  |
60 | A | 282 | 136 | −6 | CrP | 140 | C | 199 | 68 | −1 |  |
61 | T | 281 | 136 | −1 |  | 141 | A | 199 | 68 | 0 |  |
62 | T | 282 | 136 | 1 | CrP | 142 | G | 198 | 68 | −1 |  |
63 | T | 282 | 136 | 0 |  | 143 | G | 194 | 68 | −4 |  |
64 | C | 282 | 136 | 0 |  | 144 | A | 194 | 68 | 0 |  |
65 | T | 282 | 136 | 0 |  | 145 | G | 194 | 68 | 0 |  |
66 | A | 282 | 136 | 0 |  | 146 | G | 194 | 68 | 0 |  |
67 | C | 282 | 136 | 0 |  | 147 | A | 194 | 68 | 0 |  |
68 | A | 282 | 136 | 0 |  | 148 | T | 194 | 68 | 0 |  |
69 | A | 280 | 136 | −2 |  | 149 | C | 194 | 68 | 0 |  |
70 | A | 280 | 136 | 0 |  | 150 | G | 194 | 68 | 0 |  |
71 | T | 280 | 136 | 0 |  | 151 | C | 194 | 68 | 0 |  |
72 | T | 279 | 136 | −1 |  | 152 | T | 193 | 68 | −1 |  |
73 | A | 279 | 136 | 0 |  | 153 | T | 193 | 68 | 0 |  |
74 | T | 275 | 136 | −4 |  | 154 | G | 192 | 68 | −1 |  |
75 | T | 274 | 68 | −1 |  | 155 | A | 192 | 68 | 0 |  |
76 | A | 274 | 68 | 0 |  | 156 | G | 191 | 68 | −1 |  |
77 | T | 273 | 68 | −1 |  | 157 | C | 191 | 68 | 0 |  |
78 | T | 273 | 68 | 0 |  | 158 | C | 190 | 68 | −1 |  |
79 | T | 272 | 68 | −1 |  | 159 | C | 181 | 68 | −9 | CrP |
80 | T | 271 | 68 | −1 |  |  |  |  |  |  |  |
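The derivation of the CSD and CrP columns in the table above can be sketched as follows. The function names are hypothetical, and the range limits (−4 and 0) are the ones stated in the text; the sketch takes the CS values as given and does not model how ComRCAM produces them.

```python
def cs_differences(cs_values):
    """CSD for each location i >= 2: CS of cell i minus CS of cell i - 1."""
    return [curr - prev for prev, curr in zip(cs_values, cs_values[1:])]


def critical_pairs(cs_values, first_pos=1, low=-4, high=0):
    """Return (location, CSD) for every CSD outside the range [low, high].

    first_pos is the location of the first value in cs_values.
    """
    crps = []
    for offset, diff in enumerate(cs_differences(cs_values), start=1):
        if diff < low or diff > high:
            crps.append((first_pos + offset, diff))
    return crps


# CS values copied from the table for locations 56-62 and 27-28.
window_56_62 = [292, 288, 288, 288, 282, 281, 282]
print(critical_pairs(window_56_62, first_pos=56))  # [(60, -6), (62, 1)]
print(critical_pairs([372, 340], first_pos=27))    # [(28, -32)]
```

The positive difference at location 62 is flagged because it lies above the upper limit of 0, in line with the remark in the text.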
3.1.4 Annexure 3.4: Predicted tRNA Secondary Structures for tRNA Sequences Retrieved from the Database - GtRNAdb.ucsc.edu
The algorithm to predict the secondary structure of a tRNA molecule is reported in Sect. 3.7.1. Representative results are reported in Figs. 3.12 and 3.13. The algorithm has been coded and executed for a large number of base sequences of tRNA molecules of different species. The predicted results are tabulated below. The basic structure of the tRNA molecule is identical across a wide variety of species, with three loops (the D, T and anticodon loops) and a variable-length V loop.
Name | Acceptor Stem | D Stem | D Loop | D Stem | A Stem | A Loop | A Stem | V Loop | T Stem | T Loop | T Stem | Acceptor Stem |
---|---|---|---|---|---|---|---|---|---|---|---|---|
chr19trna13 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr1trna50 | 1–5 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–52 | 53–60 | 61–65 | 68–72 |
chr6trna125 | 2–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–49 | 50–52 | 53–61 | 62–64 | 66–71 |
chr6trna126 | 1–3 | 5–7 | 8–12 | 13–15 | 28–32 | 33–39 | 40–44 | 45–49 | 50–52 | 53–63 | 64–66 | 71–73 |
chr6trna127 | 1–3 | 6–8 | 9–24 | 25–27 | 28–32 | 33–39 | 40–44 | 45–59 | 60–62 | 63–69 | 70–72 | 73–75 |
chr6trna122 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–49 | 50–52 | 53–63 | 64–66 | 71–73 |
chr2trna13 | 1–7 | 15–17 | 18–22 | 23–25 | 27–31 | 32–42 | 43–47 | 48–68 | 69–73 | 74–80 | 81–85 | 86–92 |
chr2trna12 | 1–7 | 10–12 | 13–17 | 18–20 | 22–26 | 27–33 | 34–38 | 39–42 | 43–47 | 48–53 | 54–58 | 59–65 |
chr2trna17 | 1–7 | 11–13 | 14–20 | 21–23 | 28–34 | 35–33 | 34–40 | 41–46 | 47–49 | 50–60 | 61–63 | 64–70 |
chr19trna14 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–50 | 51–62 | 63–65 | 66–72 |
chr6trna9 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr1trna78 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr8trna5 | 4–7 | 9–11 | 12–18 | 19–21 | 23–27 | 28–35 | 36–40 | 41–44 | 45–49 | 50–56 | 57–61 | 62–65 |
chr8trna2 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–42 | 43–47 | 48–66 | 67–69 | 70–75 | 76–78 | 86–92 |
chr8trna1 | 1–3 | 11–13 | 14–18 | 19–21 | 28–30 | 31–34 | 35–37 | 38–46 | 47–49 | 50–64 | 65–67 | 68–70 |
chr8trna9 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–57 | 58–60 | 61–71 | 72–74 | 75–81 |
chr8trna8 | 1–3 | 10–12 | 13–22 | 23–25 | 27–30 | 31–39 | 40–43 | 44–48 | 49–52 | 53–61 | 62–65 | 70–72 |
chr6trna95 | 1–4 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–58 | 59–61 | 62–65 |
chr6trna94 | 4–6 | 10–12 | 13–22 | 23–25 | 29–33 | 34–36 | 37–41 | 42–48 | 49–51 | 52–62 | 63–65 | 70–72 |
chr6trna97 | 1–3 | 5–7 | 8–16 | 17–19 | 27–30 | 31–39 | 40–43 | 44–47 | 48–50 | 51–65 | 66–68 | 70–72 |
chr6trna96 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–50 | 51–62 | 63–65 | 66–72 |
chr6trna91 | 1–7 | 11–13 | 14–18 | 19–21 | 27–30 | 31–39 | 40–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr6trna93 | 1–3 | 11–13 | 14–21 | 22–24 | 27–31 | 32–38 | 39–43 | 44–48 | 49–53 | 54–60 | 61–65 | 67–69 |
chr6trna99 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–50 | 51–61 | 62–64 | 65–71 |
chr6trna98 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–59 | 60–62 | 63–69 | 70–72 | 73–75 |
chr16trna8 | 1–4 | 6–8 | 9–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–52 | 53–61 | 62–66 | 69–72 |
chr16trna6 | 1–3 | 6–8 | 9–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–52 | 53–61 | 62–66 | 70–72 |
chr16trna1 | 1–3 | 7–9 | 10–16 | 17–19 | 27–31 | 32–38 | 39–43 | 44–48 | 49–52 | 53–61 | 62–65 | 69–71 |
chr16trna3 | 2–7 | 10–12 | 13–22 | 23–25 | 28–31 | 32–38 | 39–42 | 43–48 | 49–53 | 54–60 | 61–65 | 66–71 |
chr1trna98 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–52 | 53–55 | 56–63 | 64–66 | 71–73 |
chr1trna99 | 1–5 | 12–14 | 15–19 | 20–22 | 28–31 | 32–35 | 36–39 | 40–48 | 49–52 | 53–59 | 60–63 | 67–71 |
chr12trna15 | 1–5 | 7–9 | 10–24 | 25–27 | 29–34 | 35–39 | 40–45 | 46–57 | 58–60 | 61–66 | 67–69 | 70–74 |
chr1trna92 | 1–3 | 12–14 | 15–19 | 20–22 | 28–31 | 32–38 | 39–42 | 43–44 | 45–47 | 48–61 | 62–64 | 70–72 |
chr12trna10 | 1–3 | 12–14 | 15–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–53 | 54–56 | 57–62 | 63–65 | 69–71 |
chr12trna12 | 1–3 | 12–14 | 15–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–53 | 54–56 | 57–62 | 63–65 | 69–71 |
chr2trna18 | 1–7 | 12–14 | 15–19 | 20–22 | 30–32 | 33–36 | 37–39 | 40–47 | 48–51 | 52–60 | 61–64 | 65–71 |
chrXtrna10 | 1–7 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–49 | 50–52 | 53–63 | 64–66 | 67–73 |
chr6trna128 | 1–7 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–49 | 50–52 | 53–63 | 64–66 | 67–73 |
chr7trna29 | 1–5 | 10–12 | 13–21 | 22–24 | 27–30 | 31–37 | 38–41 | 42–47 | 48–51 | 52–60 | 61–64 | 67–71 |
chr15trna8 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–50 | 51–61 | 62–64 | 65–71 |
chr15trna9 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–51 | 52–60 | 61–64 | 65–71 |
chr6trna50 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–57 | 58–60 | 61–71 | 72–74 | 75–81 |
chr7trna21 | 1–5 | 10–12 | 13–21 | 22–24 | 27–30 | 31–37 | 38–41 | 42–47 | 48–51 | 52–60 | 61–64 | 67–71 |
chr7trna20 | 1–4 | 10–12 | 13–21 | 22–24 | 26–30 | 31–37 | 38–42 | 43–47 | 48–50 | 51–61 | 62–64 | 67–70 |
chr7trna23 | 1–5 | 10–12 | 13–21 | 22–24 | 26–30 | 31–37 | 38–42 | 43–47 | 48–51 | 52–60 | 61–64 | 67–71 |
chr7trna22 | 1–3 | 10–12 | 13–21 | 22–24 | 27–30 | 31–37 | 38–41 | 42–48 | 49–51 | 52–60 | 61–63 | 69–71 |
chr15trna1 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–51 | 52–60 | 61–64 | 65–71 |
chr15trna3 | 1–5 | 10–12 | 13–22 | 23–25 | 28–31 | 32–38 | 39–42 | 43–48 | 49–52 | 53–61 | 62–65 | 68–72 |
chr19trna3 | 1–3 | 11–13 | 14–22 | 23–25 | 30–36 | 37–35 | 36–42 | 43–47 | 48–53 | 54–62 | 63–68 | 71–73 |
chr19trna4 | 1–3 | 5–7 | 8–15 | 16–18 | 26–31 | 32–38 | 39–44 | 45–49 | 50–52 | 53–57 | 58–60 | 65–67 |
chr6trna54 | 1–7 | 9–11 | 12–25 | 26–28 | 29–33 | 34–36 | 37–41 | 42–49 | 50–54 | 55–61 | 62–66 | 67–73 |
chr19trna6 | 2–7 | 10–12 | 13–22 | 23–25 | 29–33 | 34–36 | 37–41 | 42–49 | 50–52 | 53–61 | 62–64 | 66–71 |
chr6trna56 | 2–4 | 6–8 | 9–14 | 15–17 | 29–31 | 32–38 | 39–41 | 42–33 | 34–36 | 37–61 | 62–64 | 65–67 |
chr6trna59 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–55 | 56–58 | 59–66 | 67–69 | 71–73 |
chr6trna168 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–49 | 50–52 | 53–63 | 64–66 | 71–73 |
chr5trna5 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr6trna159 | 1–4 | 10–12 | 13–23 | 24–26 | 31–30 | 31–40 | 41–40 | 41–50 | 51–53 | 54–62 | 63–65 | 70–73 |
chr6trna158 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–54 | 55–59 | 60–66 | 66–72 |
chr6trna155 | 1–4 | 10–12 | 13–23 | 24–26 | 31–30 | 31–40 | 41–40 | 41–50 | 51–53 | 54–62 | 63–65 | 70–73 |
chr6trna154 | 1–3 | 10–12 | 13–23 | 24–26 | 31–30 | 31–40 | 41–40 | 41–50 | 51–53 | 54–62 | 63–65 | 71–73 |
chr6trna157 | 3–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–53 | 54–60 | 61–65 | 66–70 |
chr6trna151 | 1–7 | 11–13 | 14–20 | 21–23 | 26–30 | 31–37 | 38–42 | 43–51 | 52–54 | 55–61 | 62–64 | 65–71 |
chr6trna150 | 1–6 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 67–72 |
chr6trna153 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–48 | 49–51 | 52–62 | 63–65 | 66–72 |
chr6trna51 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–57 | 58–60 | 61–71 | 72–74 | 75–81 |
chr19trna1 | 1–3 | 10–12 | 13–23 | 24–26 | 28–32 | 33–39 | 40–44 | 45–52 | 53–55 | 56–63 | 64–66 | 71–73 |
chr19trna2 | 1–7 | 11–13 | 14–23 | 24–26 | 28–31 | 32–38 | 39–42 | 43–47 | 48–52 | 53–59 | 60–64 | 65–71 |
chr1trna59 | 1–4 | 10–12 | 13–24 | 25–27 | 30–33 | 34–40 | 41–44 | 45–59 | 60–64 | 65–71 | 72–76 | 80–83 |
chr9trna1 | 1–7 | 9–11 | 12–24 | 25–27 | 30–33 | 34–40 | 41–44 | 45–54 | 55–58 | 59–63 | 64–67 | 68–74 |
chr9trna1 | 1–7 | 11–13 | 14–20 | 21–23 | 25–28 | 29–45 | 46–49 | 50–54 | 55–58 | 59–63 | 64–67 | 68–74 |
chr6trna137 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–47 | 48–50 | 51–61 | 62–64 | 66–72 |
chr1trna118 | 1–5 | 10–13 | 14–22 | 23–26 | 29–33 | 34–44 | 45–49 | 50–136 | 137–140 | 141–149 | 150–153 | 154–158 |
chrXtrna1 | 1–3 | 6–8 | 9–23 | 24–26 | 27–32 | 33–37 | 38–43 | 44–47 | 48–51 | 52–60 | 61–64 | 69–71 |
chr6trna28 | 1–7 | 10–12 | 13–23 | 24–26 | 41–43 | 44–50 | 51–53 | 54–69 | 70–72 | 73–83 | 84–86 | 87–93 |
chr6trna44 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–57 | 58–60 | 61–71 | 72–74 | 75–81 |
chr14trna19 | 1–7 | 10–12 | 13–22 | 23–25 | 33–35 | 36–46 | 47–49 | 50–69 | 70–72 | 73–83 | 84–86 | 87–93 |
chr17trna41 | 1–7 | 10–12 | 13–22 | 23–25 | 27–31 | 32–38 | 39–43 | 44–57 | 58–61 | 62–70 | 71–74 | 75–81 |
3.1.5 Annexure 3.5: Predicted Results for Binding of siRNA Molecules on RNA Transcript of Target Gene
The algorithm to predict binding of siRNA with target gene is reported in Sect. 3.8.
Table 3.24 reports sample prediction results for ten cases, derived on execution of the program code for the algorithm. The next table reports a larger number of prediction results for siRNAs and target genes retrieved from the NCBI Probe database [22]. The siRNA/gene ID is noted in column 1 and the prediction result (DOM) in the last column.
siRNA ID/Gene ID | M′ | LOM | M | ROM | LOM′/ROM′ | LOM/ROM | DOM |
---|---|---|---|---|---|---|---|
8811589b/UBE2T | 19 | 300–317 | 318 |  | 0/0 | 0/0 | 0 |
8812102b/UBE2E3 | 8 | 518–524 | 525 | 526–536 | 0/0 | 0/1 | 1 |
8812967b/P2RX3 | – | 586–590 | 591 | 592–604 | 0/0 | 0/1 | 1 |
8811547b/UFC1 | 19 | 114–131 | 132 |  | 1/0 | 0/0 | 1 |
8811418b/UBE2L6 | 15 | 437–450 | 451 | 452–455 | 1/0 | 0/0 | 1 |
8811391b/UBE2S | 19 | 160–177 | 178 |  | 0/0 | 1/0 | 0 |
8811310b/UBE2I | 8 | 155–161 | 162 | 163–173 | 0/0 | 0/0 | 0 |
8812922b/P2RX3 | 5 | 326–329 | 330 | 331–344 | 0/0 | 0/1 | 1 |
8817061b/PHF19 | 19 | 1565–1582 | 1583 |  | 0/0 | 0/0 | 0 |
8810803b/UBE2M | 4 | 414–416 | 417 | 418–432 | 0/1 | 0/2 | 1 |
8810630b/UBE2D3 | 12 | 332–342 | 343 | 344–350 | 0/0 | 0/0 | 0 |
8811746b/UBE2J1 | 19 | 489–506 | 507 |  | 1/0 | 1/0 | 0 |
8812175b/UBE2V1 | 19 | 444–461 | 462 |  | 0/0 | 3/0 | 1 |
8811576b/UBE2T | 15 | 472–485 | 486 | 487–490 | 1/0 | 0/0 | 1 |
8811073b/RAB6IP1 | 19 | 3672–3689 | 3690 |  | 0/0 | 1/0 | 1 |
8811496b/UFC1 | 4 | – | – | – | 0/0 | 0/0 | 0 |
8812990b/P2RX3 | 19 | 108–125 | 126 |  | 2/0 | 0/0 | 1 |
8811406b/UBE2S | 10 | 88–96 | 97 | 98–106 | 1/1 | 1/0 | 1 |
8812149b/UBE2E3 | – | 227–227 | 228 | 229–245 | 0/0 | 0/2 | 1 |
8811687b/UBE2W | – | 258–262 | 263 | 264–276 | 0/0 | 0/0 | 0 |
8811792b/UBE2K | 19 | 330–347 | 348 |  | 1/0 | 0/0 | 1 |
8811775b/UBE2K | 19 | 382–399 | 400 |  | 0/0 | 0/0 | 0 |
8817034b/CYP4Z1 | 15 | 1314–1327 | 1328 | 1329–1332 | 1/0 | 0/1 | 2 |
8817011b/OR12D3 | 19 | – | – | – | 0/0 | 0/0 | 0 |
8811564b/UBE2T | 19 | 537–554 | 555 |  | 0/0 | 2/0 | 1 |
8811442b/UBE2L6 | – | 262–266 | 267 | 268–280 | 0/0 | 0/0 | 0 |
8812959b/P2RX3 | 19 | 670–687 | 688 |  | 0/0 | 0/0 | 0 |
8810822b/UBE2M | 15 | 279–292 | 293 | 294–297 | 1/0 | 2/0 | 1 |
8811779b/UBE2K | 13 | 388–399 | 400 | 401–406 | 0/0 | 0/0 | 0 |
8812723b/SOST | – | 212–226 | 227 | 228–230 | 0/0 | 0/0 | 0 |
8812136b/UBE2E3 | 8 | 366–372 | 373 | 374–384 | 0/0 | 0/1 | 1 |
8810843b/UBE2M | 11 | 118–127 | 128 | 129–136 | 0/1 | 0/1 | 0 |
8810821b/UBE2M | – | 291–292 | 293 | 294–309 | 0/0 | 0/0 | 0 |
8812738b/SOST | – | 109–109 | 110 | 111–127 | 0/0 | 0/1 | 1 |
8811513b/UFC1 | 15 | 310–323 | 324 | 325–328 | 1/0 | 0/0 | 1 |
8811561b/UBE2T | 19 | 537–554 | 555 |  | 0/0 | 2/0 | 1 |
8810642b/UBE2D3 | 9 | 284–291 | 292 | 293–302 | 0/0 | 1/0 | 1 |
8812124b/UBE2E3 | 19 | 383–400 | 401 |  | 0/0 | 0/0 | 0 |
8811205b/UBE2H | – | 452–455 | 456 | 457–470 | 0/0 | 0/1 | 1 |
8811017b/RAB6IP1 | 7 | 3046–3051 | 3052 | 3053–3064 | 0/1 | 0/0 | 1 |
8810855b/UBE2M | 5 | 69–72 | 73 | 74–87 | 0/1 | 0/1 | 0 |
8812094b/UBE2E3 | 19 | 563–580 | 581 |  | 0/0 | 2/0 | 1 |
8812027b/UBE2G1 | – | 425–435 | 436 | 437–443 | 0/0 | 0/0 | 0 |
8811703b/UBE2W | – | 221–226 | 227 | 228–239 | 0/0 | 0/0 | 0 |
8817021b/KCNT1 | 19 | 1336–1353 | 1354 |  | 0/0 | 1/0 | 1 |
8811319b/UBE2I | 19 | 116–133 | 134 |  | 0/0 | 0/0 | 0 |
8811622b/UBE2T | 8 | 184–190 | 191 | 192–202 | 0/0 | 1/0 | 0 |
8817088b/DVL2 | 15 | 785–798 | 799 | 800–803 | 0/0 | 0/0 | 0 |
8810760b/UBE2B | – | 187–188 | 189 | 190–205 | 0/0 | 0/1 | 1 |
8811283b/UBE2I | 19 | 313–330 | 331 |  | 0/0 | 0/0 | 0 |
3.1.6 Annexure 3.6: Fold Residue Prediction
Table 3.9 in Sect. 3.6.2 showed results for only a subset of residues. The next table shows the complete results for prediction of Fold Residue Location (FRL) for the mRNA codon string of length 217 transcribed from Gene ID CAA33815; the synthesized protein has UniProt ID P13342 and PDB ID 1C1K.
(1) Pos | (2) IFRL prediction | (3) VFRL fold points | (4) SS motifs as noted in PDB | (5) Comment | (1) Pos | (2) IFRL prediction | (3) VFRL fold points | (4) SS motifs as noted in PDB | (5) Comment |
---|---|---|---|---|---|---|---|---|---|
1 |  |  | – |  | 110 |  |  | H |  |
2 |  |  | E |  | 111 |  |  | H |  |
3 | 1* | 1 | E | True | 112 |  |  | H |  |
4 |  |  | E |  | 113 |  |  | H |  |
5 |  |  | – |  | 114 |  |  | H |  |
6 |  |  | – |  | 115 |  |  | H |  |
7 |  |  | – |  | 116 |  |  | H |  |
8 |  |  | S |  | 117 |  |  | H |  |
9 |  |  | – |  | 118 |  |  | H |  |
10 | 1* | 1 | T | True | 119 |  |  | H |  |
11 |  |  | T |  | 120 |  |  | H |  |
12 |  |  | – |  | 121 |  |  | H |  |
13 |  |  | – |  | 122 |  |  | H |  |
14 |  |  | – |  | 123 |  |  | H |  |
15 | 1* | 1 | – | True | 124 |  |  | H |  |
16 |  |  | H |  | 125 |  |  | H |  |
17 |  |  | H |  | 126 |  |  | H |  |
18 |  |  | H |  | 127 |  |  | T |  |
19 |  |  | H |  | 128 |  |  | T |  |
20 |  |  | H |  | 129 |  |  | – |  |
21 |  |  | H |  | 130 |  |  | S |  |
22 |  |  | H |  | 131 |  |  | S |  |
23 |  |  | H |  | 132 |  |  | G |  |
24 |  |  | H |  | 133 |  |  | G |  |
25 |  |  | H |  | 134 |  |  | G |  |
26 |  |  | H |  | 135 |  |  | G |  |
27 |  |  | H |  | 136 |  |  | T |  |
28 |  |  | H |  | 137 |  |  | S |  |
29 |  |  | H |  | 138 |  |  | – |  |
30 | 1* | 1 | H | True | 139 |  |  | B |  |
31 |  |  | T |  | 140 |  |  | T |  |
32 |  |  | T |  | 141 |  |  | T |  |
33 | 1* | 1 | S | True | 142 |  |  | T |  |
34 |  |  | – |  | 143 |  |  | T |  |
35 |  |  | – |  | 144 |  |  | B |  |
36 |  |  | T |  | 145 |  |  | – |  |
37 |  |  | T |  | 146 |  |  | H |  |
38 |  |  | T |  | 147 |  |  | H |  |
39 |  |  | T |  | 148 |  |  | H |  |
40 |  |  | T |  | 149 |  |  | H |  |
41 |  |  | T |  | 150 |  |  | H |  |
42 |  |  | – |  | 151 | 1* | 1 | H | True |
43 |  |  | – |  | 152 |  |  | H |  |
44 |  |  | – |  | 153 |  |  | T |  |
45 | 1* | 1 | – | True | 154 | 1* | 1 | T | True |
46 |  |  | – |  | 155 |  |  | S |  |
47 |  |  | H |  | 156 | 1* | 1 | S | True |
48 |  |  | H |  | 157 |  |  | – |  |
49 |  |  | H |  | 158 |  |  | H |  |
50 |  |  | H |  | 159 |  |  | H |  |
51 |  |  | H |  | 160 |  |  | H |  |
52 |  |  | H |  | 161 |  |  | H |  |
53 |  |  | – |  | 162 |  |  | H |  |
54 |  |  | S |  | 163 |  |  | H |  |
55 |  |  | – |  | 164 |  |  | H |  |
56 |  |  | H |  | 165 |  |  | H |  |
57 |  |  | H |  | 166 |  |  | H |  |
58 |  |  | H |  | 167 |  |  | H |  |
59 |  |  | H |  | 168 |  |  | H |  |
60 |  |  | H |  | 169 | 1* | 1 | – | True |
61 |  |  | H |  | 170 |  |  | H |  |
62 |  |  | H |  | 171 |  |  | H |  |
63 |  |  | H |  | 172 |  |  | H |  |
64 |  |  | H |  | 173 |  |  | H |  |
65 |  |  | H |  | 174 |  |  | H |  |
66 |  |  | – |  | 175 |  |  | H |  |
67 |  |  | – |  | 176 |  |  | H |  |
68 |  |  | H |  | 177 |  |  | H |  |
69 |  |  | H |  | 178 |  |  | – |  |
70 |  |  | H |  | 179 |  |  | – |  |
71 |  |  | H |  | 180 |  |  | – |  |
72 |  |  | H |  | 181 |  |  | H |  |
73 |  |  | H |  | 182 | 1* | 1 | H | True |
74 |  |  | H |  | 183 |  |  | H |  |
75 | 1* |  | H | False | 184 | 1* |  | H | False |
76 |  |  | H |  | 185 |  |  | H |  |
77 |  |  | H |  | 186 |  |  | H |  |
78 |  |  | H |  | 187 |  |  | H |  |
79 |  |  | H |  | 188 |  |  | H |  |
80 |  |  | H |  | 189 |  |  | H |  |
81 |  |  | H |  | 190 |  |  | H |  |
82 |  |  | – |  | 191 |  |  | H |  |
83 |  |  | H |  | 192 |  |  | H |  |
84 |  |  | H |  | 193 |  |  | H |  |
85 |  |  | H |  | 194 |  |  | H |  |
86 |  |  | H |  | 195 |  |  | H |  |
87 | 1* | 1 | S | True | 196 |  |  | H |  |
88 |  |  | S |  | 197 |  |  | E |  |
89 |  |  | – |  | 198 |  |  | E |  |
90 |  |  | G |  | 199 |  |  | E |  |
91 |  |  | G |  | 200 |  |  | – |  |
92 |  |  | G |  | 201 | 1* | 1 | H | True |
93 |  |  | T |  | 202 |  |  | H |  |
94 |  |  | T |  | 203 |  |  | H |  |
95 |  |  | H |  | 204 | 1* |  | H | False |
96 |  |  | H |  | 205 |  |  | H |  |
97 |  |  | H |  | 206 |  |  | H |  |
98 |  |  | H |  | 207 |  |  | H |  |
99 |  |  | H |  | 208 |  |  | H |  |
100 |  |  | H |  | 209 |  |  | H |  |
101 |  |  | H |  | 210 |  |  | H |  |
102 |  |  | H |  | 211 |  |  | H |  |
103 |  |  | H |  | 212 |  |  | H |  |
104 |  |  | H |  | 213 |  |  | H |  |
105 |  |  | H |  | 214 | 1* | 1 | H | False |
106 |  |  | H |  | 215 |  |  | H |  |
107 | 1* | 1 | H | True | 216 |  |  | H |  |
108 |  |  | T |  | 217 |  |  | – |  |
109 |  |  | H |  |  |  |  |  |  |
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Pal Chaudhuri, P., Ghosh, S., Dutta, A., Pal Choudhury, S. (2018). Cellular Automata Model for Ribonucleic Acid (RNA). In: A New Kind of Computational Biology. Springer, Singapore. https://doi.org/10.1007/978-981-13-1639-5_3
DOI: https://doi.org/10.1007/978-981-13-1639-5_3
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1638-8
Online ISBN: 978-981-13-1639-5
eBook Packages: Computer Science, Computer Science (R0)