Multiple Sequence Alignment Based Upon Statistical Approach of Curve Fitting

Jha, Vineet; Mazumder, Mohit; Bhuyan, Hrishikesh; Jha, Ashwani; Nagar, Abhinav

doi:10.1007/978-3-642-11164-8_30


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5909)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

The main objective of this work is to align multiple sequences using a statistical approach rather than a heuristic one. We propose a novel method in which DNA sequences are treated as lines rather than strings, with each character representing a point on the line. The sequences are positioned so that the overlap between them is maximal; the resulting matched characters serve as the seeds of the alignment. The proposed algorithm first finds these seeds in the sequences being aligned and then grows the alignment by statistical curve fitting based on standard deviation.
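The seed-finding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `best_overlap`, the exhaustive sliding of one sequence along the other, and the tie-breaking rule (the first offset with the most matches wins) are all assumptions made for illustration. It handles only two sequences and does not attempt the curve-fitting growth step, for which the abstract gives no detail.

```python
from typing import List, Tuple


def best_overlap(a: str, b: str) -> Tuple[int, int, List[Tuple[int, int]]]:
    """Slide b along a and return the placement with the most matching characters.

    Returns (offset, match_count, seeds), where b[j] is placed against
    a[j + offset] and seeds is a list of (index_in_a, index_in_b) pairs
    at which the two characters are identical.
    Hypothetical helper: the abstract does not specify how overlap is searched.
    """
    best_offset, best_count, best_seeds = 0, -1, []
    # Try every placement in which the two sequences share at least one column.
    for offset in range(-(len(b) - 1), len(a)):
        start = max(0, offset)
        stop = min(len(a), offset + len(b))
        seeds = [(i, i - offset) for i in range(start, stop) if a[i] == b[i - offset]]
        # Strict comparison keeps the first placement among ties.
        if len(seeds) > best_count:
            best_offset, best_count, best_seeds = offset, len(seeds), seeds
    return best_offset, best_count, best_seeds


if __name__ == "__main__":
    offset, count, seeds = best_overlap("ACGTACGT", "TACG")
    print(offset, count, seeds)
```

Each seed pairs a position in the first sequence with a position in the second, so the seeds can be read as points on a line, which is the view of sequences the abstract proposes.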

Author information

Authors and Affiliations

InSilico Biosolution, 103B, North Guwahati College Road, Abhoypur, Near IIT-Guwahati, P.O College Nagar, North Guwahati, 781031, Assam, India
Vineet Jha, Mohit Mazumder, Hrishikesh Bhuyan, Ashwani Jha & Abhinav Nagar

Editor information

Editors and Affiliations

Electrical Engineering Department, Indian Institute of Technology Delhi, 110016, New Delhi, India
Santanu Chaudhury
Center for Soft Computing Research, Indian Statistical Institute, 700 108, Kolkata, India
Sushmita Mitra
Center for Soft Computing Research, Indian Statistical Institute,
C. A. Murthy
Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore, India
P. S. Sastry
Center for Soft Computing Research, Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, 700 108, Kolkata, India
Sankar K. Pal

About this paper

Cite this paper

Jha, V., Mazumder, M., Bhuyan, H., Jha, A., Nagar, A. (2009). Multiple Sequence Alignment Based Upon Statistical Approach of Curve Fitting. In: Chaudhury, S., Mitra, S., Murthy, C.A., Sastry, P.S., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2009. Lecture Notes in Computer Science, vol 5909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11164-8_30

DOI: https://doi.org/10.1007/978-3-642-11164-8_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11163-1
Online ISBN: 978-3-642-11164-8
eBook Packages: Computer Science, Computer Science (R0)
