One of the burning questions in bacterial genomics is how the phenotype of a bacterial strain correlates with its genotype. Some phenotypes of a given organism’s isolate arise through simple sequence variations such as single nucleotide polymorphisms (SNPs) or small insertions/deletions (indels). Other phenotypes, however, cannot be explained by such simple genomic differences; rather, most of them result from more complex sequence variations. Insight into complex phenotypes, such as bacterial pathogenicity or resistance traits, and into their molecular background requires comprehensive data obtained in large-scale projects and involves statistical methods. With the increasing use of next-generation sequencing (NGS) and other “-omics” techniques in molecular biology, projects that provide such a data foundation are now feasible. Big data, however, not only offers new opportunities but also requires extensive data management systems. A coupled system of a relational database, a web interface, and statistical methods provides substantial support for phenotype-genotype correlation studies that aim to unravel the molecular mechanisms underlying complex phenotypes and are designed for biomarker identification.
Keywords: Association study; Biomarker identification; Genotype-phenotype correlation