An Open Framework for Extensible Multi-stage Bioinformatics Software
In research labs, there is often a need to customise software at every step of a bioinformatics workflow, but it has traditionally been difficult to obtain both a high degree of customisability and good performance. Performance-sensitive tools are often highly monolithic, which makes them hard to adapt for research purposes. We present a novel set of software development principles and a bioinformatics framework, Friedrich, which is currently in early development. Friedrich applications support both early-stage experimentation and late-stage batch processing, since they combine good performance with a high degree of flexibility and customisability. These benefits are obtained in large part by basing Friedrich on the multiparadigm programming language Scala. We present a case study in the form of a basic genome assembler and its extension with new functionality. Our architecture has the potential to greatly increase the overall productivity of software developers and researchers in bioinformatics.
Keywords: Open Framework, Scala Code, Bioinformatics Application, Short Read Data, Early Stage Experimentation