Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Retrieval of biological data is an active research area, and numerous algorithms based on pattern matching have been proposed for analyzing such data, ranging from the brute-force approach to the most recent methods. Although a retrieval algorithm must execute quickly, little attention has been paid to the factors that might affect its execution time. Factors such as pattern length, type of dataset, and input size can affect execution time, but the magnitude of their effects remains unknown and unaddressed. This paper addresses the problem using a 2^k factorial design. The factorial technique is designed and implemented so as to give researchers new insight when proposing or developing algorithms for retrieving biological data. The study shows that the dominant factor for algorithm efficiency is pattern length, which accounts for 38.5% of the effect on execution time, followed by the type of dataset at 18%.
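
To make the percentages in the abstract concrete, the following is a minimal sketch of how a 2^k factorial design is commonly used to allocate variation among factors, following the standard sign-table method. It is not the authors' implementation. The function name `allocate_variation`, the factor names, and the response values are hypothetical; the numbers are made up for illustration and are not results from the paper.

```python
from itertools import combinations, product
from math import prod


def allocate_variation(factors, responses):
    """Return {effect name: fraction of total variation} for a 2^k design.

    factors   -- list of k factor names (hypothetical labels)
    responses -- dict mapping each tuple of levels (-1 or +1 per factor)
                 to the measured response, e.g. execution time
    """
    k = len(factors)
    runs = list(product((-1, 1), repeat=k))
    if set(responses) != set(runs):
        raise ValueError("need exactly one response per level combination")
    n = len(runs)  # 2^k runs

    effects = {}
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            # The sign column of an effect is the product of the levels
            # of the factors involved; the effect is the mean contrast.
            q = sum(prod(run[i] for i in subset) * responses[run]
                    for run in runs) / n
            effects["*".join(factors[i] for i in subset)] = q

    # Total variation is 2^k * sum(q^2), so the 2^k factor cancels
    # when each effect is expressed as a share of the total.
    total = sum(q * q for q in effects.values())
    if total == 0:
        raise ValueError("responses are identical; nothing to allocate")
    return {name: q * q / total for name, q in effects.items()}


# Illustrative 2^2 design with made-up execution times.
factors = ["pattern_length", "dataset_type"]
responses = {
    (-1, -1): 15.0,
    (1, -1): 45.0,
    (-1, 1): 25.0,
    (1, 1): 75.0,
}
shares = allocate_variation(factors, responses)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Each share is the fraction of the total variation in the response attributed to one factor or to an interaction between factors, which is the kind of quantity the abstract reports for pattern length and type of dataset.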

Acknowledgements

This research is supported by FRGS FP003A-2017, Faculty of Computer Science and Information Technology, University of Malaya.

Author information

Corresponding author

Correspondence to Saqib Hakak.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Parvez, H.M.S., Hakak, S., Gilkar, G.A., Abdur Rahman, M. (2020). Factorial Analysis of Biological Datasets. In: Uddin, M., Bansal, J. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-13-7564-4_1
