The concept of operational taxonomic units revisited: genomes of bacteria that are regarded as closely related are often highly dissimilar
The concept of operational taxonomic units (OTUs), which constructs “mathematically” defined taxa, is widely accepted and applied to describe bacterial communities using amplicon sequencing of the 16S rRNA gene. OTUs are often used to infer functional traits since they are considered to fairly represent community members. However, the link between molecular taxa, real taxa, and OTUs seems to be much more complicated. Strains of the same bacterial species (ideally belonging to the same OTU) typically share only some genes (the core genome), while other genes are strain-specific. It is thus unclear to what extent important functional traits are homogeneous within an OTU and how correctly functional traits can be inferred for individual OTU members. Here, we tested in silico the similarity of all genes and, more specifically, of the set of genes encoding glycoside hydrolases (GH) in bacterial genomes that belong to the same OTU. Genome similarity varied among OTUs, with 5–78% of genes not shared between the two bacterial genomes in a pair. The complement of GH families (the presence of gene families and the number of genes per family) differed in 95% of OTUs. On average, 43% of GH families either differed in gene counts or were present in one genome and absent in the other. These results show a serious limitation of OTU-based approaches when used to infer the functional traits of bacterial communities and raise the question of how to link environmental sequencing data and microbial functions.
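The GH family comparison described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes each genome is given as a list of GH family labels, one per annotated gene, and that the fraction is taken over the union of families found in either genome (the abstract does not state the denominator). The function name `gh_family_difference` and the example annotations are hypothetical.

```python
from collections import Counter


def gh_family_difference(genome_a, genome_b):
    """Fraction of GH families whose gene counts differ between two genomes.

    Each argument is a list of GH family labels, one entry per annotated gene.
    A family counts as differing when its gene count is not equal in the two
    genomes, which includes presence in one genome and absence in the other.
    Assumption: the denominator is the union of families seen in either genome.
    """
    counts_a, counts_b = Counter(genome_a), Counter(genome_b)
    families = set(counts_a) | set(counts_b)
    if not families:
        return 0.0
    # Counter returns 0 for missing keys, so absent families compare cleanly.
    differing = sum(1 for fam in families if counts_a[fam] != counts_b[fam])
    return differing / len(families)


# Hypothetical GH annotations for two genomes assigned to the same OTU.
genome_a = ["GH1", "GH1", "GH3", "GH13", "GH23"]
genome_b = ["GH1", "GH3", "GH13", "GH43"]
fraction = gh_family_difference(genome_a, genome_b)
print(f"{fraction:.0%} of GH families differ")
```

In this made-up pair, GH1 differs in gene count, GH23 and GH43 are each present in only one genome, and GH3 and GH13 match, so three of five families differ.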
This work was supported by the Czech Science Foundation (18-25706S) and by the Ministry of Education, Youth and Sports of the Czech Republic (LTT17022).