DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method

Protocol, in: Viral Metagenomics

Abstract

Multiple sequence alignment (MSA) is a fundamental component of many DNA sequence analyses, including metagenomics studies and phylogeny inference. When guided by protein profiles, DNA multiple alignments become more precise and robust. Here we describe how to use the upgraded version of MSA-PAD (2.0), a DNA multiple sequence alignment framework that aligns DNA sequences coding for single or multiple protein domains, guided by PFAM or user-defined annotations. MSA-PAD offers two alignment strategies, called “Gene” and “Genome,” which account for coding-domain order and genomic rearrangements, respectively. The present version adds new options that let the MSA be guided by protein profiles provided by the user. This allows MSA-PAD 2.0 to run faster and lets users add custom protein profiles of interest that may not be present in the PFAM database. MSA-PAD 2.0 is currently freely available as a Web application at https://recasgateway.cloud.ba.infn.it/.
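The general idea of a protein-guided DNA alignment can be illustrated with a minimal sketch. It is not MSA-PAD's implementation: the function name `thread_codons` and the toy sequences are hypothetical, and the gapped protein rows are assumed to come from some earlier protein-level alignment step. The sketch only shows how in-frame codons can be threaded onto an existing protein alignment so that every gap spans a whole codon.

```python
def thread_codons(dna: str, aligned_protein: str, gap: str = "-") -> str:
    """Map an ungapped, in-frame coding DNA sequence onto its gapped protein row.

    Hypothetical helper for illustration only; assumes one codon per residue
    and that `aligned_protein` was produced by a separate protein alignment.
    """
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(dna) % 3 != 0 or residues != len(codons):
        raise ValueError("DNA length must be 3 x the number of aligned residues")

    remaining = iter(codons)
    # A protein gap becomes three nucleotide gaps; a residue consumes one codon.
    return "".join(gap * 3 if aa == gap else next(remaining) for aa in aligned_protein)


# Toy input (made up for this example): unaligned coding DNA and protein rows.
dna_seqs = {"seqA": "ATGGCTAAA", "seqB": "ATGAAA"}
protein_rows = {"seqA": "MAK", "seqB": "M-K"}

dna_alignment = {
    name: thread_codons(dna_seqs[name], protein_rows[name]) for name in dna_seqs
}
for name, row in dna_alignment.items():
    print(name, row)
```

Because gaps are inserted three nucleotides at a time, the reading frame of each sequence is preserved in the resulting DNA alignment, which is the property that makes protein guidance attractive for coding regions.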

Acknowledgment

This work was supported by the Italian nodes of Lifewatch and ELIXIR Research Infrastructures.

Copyright information

© 2018 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Balech, B. et al. (2018). DNA Multiple Sequence Alignment Guided by Protein Domains: The MSA-PAD 2.0 Method. In: Pantaleo, V., Chiumenti, M. (eds) Viral Metagenomics. Methods in Molecular Biology, vol 1746. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7683-6_13

  • DOI: https://doi.org/10.1007/978-1-4939-7683-6_13

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7682-9

  • Online ISBN: 978-1-4939-7683-6

  • eBook Packages: Springer Protocols
