Large-Scale Nucleotide Sequence Alignment and Sequence Variability Assessment to Identify the Evolutionarily Highly Conserved Regions for Universal Screening PCR Assay Design: An Example of Influenza A Virus

  • Alexander Nagy (Email author)
  • Tomáš Jiřinec
  • Lenka Černíková
  • Helena Jiřincová
  • Martina Havlíčková
Part of the Methods in Molecular Biology book series (MIMB, volume 1275)


The development of a diagnostic polymerase chain reaction (PCR) or quantitative PCR (qPCR) assay for universal detection of highly variable viral genomes is always a difficult task. The purpose of this chapter is to provide a guideline on how to align, process, and evaluate a huge set of homologous nucleotide sequences in order to reveal the evolutionarily most conserved positions suitable for universal qPCR primer and hybridization probe design. Attention is paid to the quantification and clear graphical visualization of the sequence variability at each position of the alignment. In addition, specific problems related to the processing of the extremely large sequence pool are highlighted. All of these steps are performed using an ordinary desktop computer without the need for extensive mathematical or computational skills.
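The per-position variability measure mentioned above can be illustrated with a short sketch. It assumes that variability is expressed as Shannon entropy with the natural logarithm, computed over A, C, G, and T only, and that gaps and ambiguity codes are ignored; these choices, the function names, and the toy alignment are illustrative assumptions and not necessarily the exact procedure used in the chapter.

```python
import math
from collections import Counter


def column_entropy(column, alphabet="ACGT"):
    """Shannon entropy (natural log) of one alignment column.

    Assumption: characters outside `alphabet` (gaps, ambiguity codes)
    are ignored rather than counted as an extra state.
    """
    counts = Counter(ch for ch in column.upper() if ch in alphabet)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column holds only gaps or ambiguity codes
    return sum(-(n / total) * math.log(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Entropy at every position of a list of aligned, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


def conserved_windows(entropies, width, threshold):
    """0-based start indices of windows in which every position has
    entropy <= threshold (hypothetical helper for spotting primer sites)."""
    return [
        i
        for i in range(len(entropies) - width + 1)
        if max(entropies[i:i + width]) <= threshold
    ]


# Toy alignment (invented data): only the fourth column varies.
toy_alignment = ["ACGTAC", "ACGTAC", "ACGAAC", "ACG-AC"]
toy_entropy = alignment_entropy(toy_alignment)
print([round(h, 3) for h in toy_entropy])
print(conserved_windows(toy_entropy, width=3, threshold=0.0))
```

An entropy of zero marks a position that is identical in all sequences, and higher values mark more variable positions. Plotting the returned list against alignment position gives the kind of variability profile the abstract refers to.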

Key words

Alignment; Entropy; Primer; Probe; Inclusivity; Influenza; PCR; Real-time PCR; qPCR



This work was supported by institutional support of the Ministry of Health of the Czech Republic no.1RVO-SZÚ/2014. We acknowledge the authors and originating and submitting laboratories of the sequences from GISAID’s EpiFlu Database and NCBI’s Influenza Virus Resource database.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Alexander Nagy (1, 2), Email author
  • Tomáš Jiřinec (2)
  • Lenka Černíková (1)
  • Helena Jiřincová (2)
  • Martina Havlíčková (2)
  1. Laboratory of Molecular Methods, State Veterinary Institute Prague, Prague 6, Czech Republic
  2. National Reference Laboratory for Influenza, Centre for Epidemiology and Microbiology, National Institute of Public Health, Prague 10, Czech Republic
