Introduction

Widely regarded as descendants of an ancient proteobacterial endosymbiont, mitochondria are a defining characteristic of eukaryotic organisms (Gray et al. 1999). The availability of Neurospora strains carrying mutations in the mitochondrial genome enabled the first studies of maternal inheritance in Neurospora (Mitchell & Mitchell 1952), and mitochondrial inheritance has been shown to be primarily maternal for Neurospora (Mannella et al. 1979) as well as for other filamentous fungi (Griffiths 1996). In rare cases mitochondrial genome markers are transmitted by the fertilizing cytoplasm or in unstable heterokaryons (Collins & Saville 1990). Mitochondrial genome analysis has been used both to understand fundamental aspects of evolution (Gray et al. 1999) and as a source of markers for population and species delimitation (Moore 1995). Recent analysis of a rapidly expanding pool of information has led to re-evaluation of some of the assumptions of early studies of mitochondrial genetics (Galtier et al. 2009). Moreover, mitochondrial biology has seen a resurgence of interest since degraded mitochondria were reported in brain tissue from Alzheimer’s (Sultana & Butterfield 2009) and Huntington’s (Damiano et al. 2010) patients.

Filamentous fungi have been described as providing a good model for the study of mitochondrial inheritance and biology (Griffiths 1996). In one instance a fungal mitochondrial genome project emphasized high-level comparisons and used one representative of each major phylogenetic lineage (Paquin et al. 1997). More recent pan-fungal phylogenetic analysis, however, did not include mitochondrial markers (James et al. 2006) and recent fungal genome analysis does not emphasize mitochondrial biology (Martin et al. 2011), although some authors have described mitochondrial genomes as part of their whole genome sequence projects (Torriani et al. 2008). The N. crassa mitochondrial genome is 64,800 bases long and encodes twenty-eight protein-coding genes, as well as two rRNAs and twenty-eight tRNA genes (Borkovich et al. 2004). Among these are genes for the electron transport chain, for subunits of the mitochondrial ATPase, and for protein synthesis, as well as genes of unknown function. Compared to other mitochondrial genomes, the N. crassa mitochondrial genome is larger than many, but still near the middle of the 19 to 109 Kb range for fungi as well as of the overall range of 16 to 366 Kb from human to Arabidopsis (Bullerwell & Lang 2005). The Neurospora mitochondrial genome is a circular molecule that varies somewhat in size depending on the presence or absence of optional intron sequences (Griffiths 1996; Collins & Lambowitz 1983). Additionally, an aberrant version of the NADH dehydrogenase was characterized in the Neurospora mitochondrial genome (de Vries et al. 1986) and this was ultimately associated with a duplication that includes two tRNA genes as well as the mutant version of the NADH dehydrogenase subunit 2 (Agsteribbe et al. 1989). Mitochondrial genome rearrangements were associated with the intermittent cessation of growth phenotype known as ‘stopper’, and these rearrangements involved the NADH dehydrogenase gene fragment (de Vries et al. 1986).
While Neurospora mitochondria have been known to harbor various plasmids, the Varkud satellite plasmids were recently shown to be phenotypically neutral (Keeping & Collins 2011). Other mitochondrial plasmids are known to induce senescence, presumably through recombination with the mitochondrial genome (Court et al. 1991). Self-splicing introns of the 25S rRNA gene were identified in Neurospora mitochondria (Garriga & Lambowitz 1983) and led to the characterization of the mechanism of self-splicing of the group I introns in Neurospora (Garriga et al. 1986). Whole genome resequencing has been used to analyze the nuclear genome of numerous N. crassa classical mutant strains (McCluskey et al. 2011), and that dataset provides unprecedented insight into Neurospora mitochondrial genetics and biology. Because most whole genome data include mitochondrial sequence, it is likely that analyses of mitochondrial genomes will become available for many fungal taxa, and this suggests a renaissance of interest in mitochondrial genetics in fungi.

Materials and Methods

Total DNA from Neurospora strains (Table 1) was prepared as described (McCluskey et al. 2011). Most strains have been preserved on anhydrous silica gel (Perkins 1962a) since their original deposit into the FGSC collection, without multiple passages. For example, strain FGSC 1303 was preserved in 1966 and strain FGSC 1363 was preserved in 1967. Some of these strains have morphological abnormalities; for these strains, the cultures were macerated with a sterile glass tissue grinder and resuspended in fresh culture medium to allow production of enough tissue for DNA extraction. Genome sequencing was carried out at the US DOE JGI using the Illumina platform as described (McCluskey et al. 2011).

Table 1. Strains employed in the current analysis.

SNP and indel analysis was carried out using the MAQ software platform, version 0.7.1 (Li et al. 2008). Larger indels and rearrangements were assessed using Breakdancer (Chen et al. 2009b). Comparative analysis of polymorphisms was carried out as previously described (McCluskey et al. 2011).
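The cross-strain comparison of polymorphisms can be illustrated with a minimal Python sketch. This is not the pipeline used in the study: the input layout (a mapping from strain label to position, reference base and alternate base), the function name `summarize_variants`, and the toy strain labels and positions are all assumptions made for illustration. The sketch counts total calls, distinct variant positions, positions seen in a single strain, and positions shared by two or more strains.

```python
from collections import Counter


def summarize_variants(calls_by_strain):
    """Tally variant calls across strains.

    calls_by_strain maps a strain label to an iterable of
    (position, ref_base, alt_base) tuples (hypothetical input layout).
    """
    strains_per_position = Counter()
    total_calls = 0
    for calls in calls_by_strain.values():
        # Count each position at most once per strain.
        positions = {position for position, _ref, _alt in calls}
        total_calls += len(positions)
        strains_per_position.update(positions)
    private = sum(1 for n in strains_per_position.values() if n == 1)
    return {
        "total_calls": total_calls,
        "distinct_positions": len(strains_per_position),
        "private_positions": private,
        "shared_positions": len(strains_per_position) - private,
    }


# Toy input: made-up strain labels and positions, not data from the study.
toy_calls = {
    "strain_A": [(100, "C", "G"), (200, "A", "T")],
    "strain_B": [(100, "C", "G")],
    "strain_C": [(100, "C", "G"), (300, "T", "C")],
}
summary = summarize_variants(toy_calls)
print(summary)
```

Keying the tally on position alone treats different alternate bases at one site as the same variant site; keying on the full tuple instead would count them separately.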

Results

Among all the resequenced strains, 129 single nucleotide variants (SNVs) occurring at 67 different positions in the mitochondrial genome were detected. Of these positions, 48 were variant in only one strain each while nineteen were variant in two or more strains (Table 2). Two variants were present in fourteen and seventeen strains, respectively. The SNV found in fourteen strains is a C to G at position 2,246 in non-coding sequence. The SNV found in seventeen strains occurs at position 17,478, just downstream from the mitochondrial ribosomal protein S5 (S3). All of the SNVs are non-coding except one that encodes a synonymous substitution in NCU16015 in strain FGSC 821 (Table 3). Strain FGSC 3566 had the most SNVs in its mitochondrial genome (36), of which 23 are unique to this strain. With the exception of the C to G at position 17,478, all of the SNVs in this strain are ambiguous, with alternate bases making up 2 to 49% of reads. At every one of these ambiguous sites in strain FGSC 3566, the primary call was identical to the reference genome. At the other extreme, several strains had fewer than four SNVs, and all of these strains included the C to G mutation at position 17,478. Strains FGSC 106 and FGSC 2261 each had only two SNVs; both SNVs were shared between the two strains and had no significant alternate base calls.
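The distinction between an unambiguous call and an ambiguous one, where a minority of reads supports an alternate base, can be sketched as follows. The function `describe_site`, the per-base read-count input and the 2% cutoff are assumptions for illustration (the cutoff simply mirrors the lowest alternate-base fraction reported above); the criteria actually applied by the variant caller are not given in the text.

```python
def describe_site(ref_base, base_counts, min_minor_fraction=0.02):
    """Summarize read support at one site.

    base_counts maps a base to the number of reads supporting it
    (hypothetical input layout). min_minor_fraction is an assumed cutoff.
    """
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this site")
    # The primary call is the base supported by the most reads.
    primary = max(base_counts, key=base_counts.get)
    minor_fraction = (total - base_counts[primary]) / total
    return {
        "primary": primary,
        "primary_matches_reference": primary == ref_base,
        "minor_fraction": minor_fraction,
        "ambiguous": minor_fraction >= min_minor_fraction,
    }


# Toy read counts, not data from the study.
mixed_site = describe_site("C", {"C": 80, "G": 20})
clean_site = describe_site("C", {"G": 100})
print(mixed_site)
print(clean_site)
```

In this toy example the first site keeps the reference base as its primary call while 20% of reads carry the alternate base, which is the pattern described for most SNVs in strain FGSC 3566.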

Table 2. Number of mitochondrial polymorphisms in each of 18 strains of Neurospora crassa.

A total of 1,250 insertions and deletions were identified among the strains. These represent 553 distinct changes relative to the reference genome, occurring at 475 positions. Of the 1,250 individual indel calls, 1,080 were annotated as homoallelic while 170 were identified as multiallelic (that is, different reads were recovered for the same location in one strain). In total, 662 deletions and 588 insertions were characterized. Three hundred and twenty-five indels occur only once in the dataset while 228 indels occur among two to eighteen strains. Sixty-six sites have two different variants (insertions or deletions of a different base, or of a different number of bases) and four sites have three variants.

One indel site, at position 12,228, is variant in all eighteen strains. This position, falling in intergenic space between the full-length NADH dehydrogenase (NCU16004) and the mitochondrial ribosomal protein S5 (NCU16005), has a deletion of one T in sixteen strains and an insertion of one T in two strains, all adjacent to a stretch of nine Ts. Most of the indels that are found in multiple strains occur within stretches of five or more repeats of the same base as the indel itself.
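Whether an indel falls in or next to a homopolymer run can be checked with a short sketch such as the one below. The function names, the 0-based indexing and the toy reference string are assumptions for illustration; the threshold of five comes from the observation above that most shared indels occur among stretches of five or more repeats of the same base.

```python
def flanking_run_length(reference, index, base):
    """Length of the uninterrupted run of `base` touching `index` (0-based)."""
    left = index
    while left > 0 and reference[left - 1] == base:
        left -= 1
    right = index
    while right < len(reference) and reference[right] == base:
        right += 1
    return right - left


def is_homopolymer_indel(reference, index, base, min_run=5):
    """True when an indel of `base` at `index` touches a run of at least min_run."""
    return flanking_run_length(reference, index, base) >= min_run


# Toy reference with a run of nine Ts; not the N. crassa sequence.
toy_reference = "ACG" + "T" * 9 + "GCA"
run_at_start = flanking_run_length(toy_reference, 3, "T")
run_at_end = flanking_run_length(toy_reference, 12, "T")
print(run_at_start, run_at_end, is_homopolymer_indel(toy_reference, 3, "T"))
```

Because the run is measured in both directions from the indel coordinate, the check gives the same answer whether the caller reports the indel at the start or at the end of the run.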

Among indels occurring in gene coding sequence, the indel at position 1,532 (NCU16002) is seen in fourteen strains. The deletion of one A from this position is homoallelic and strongly supported in thirteen strains, while the addition of one A is less well supported in strain FGSC 3246. Four strains are identical to the reference genome at this position. The deletion at 1,532 causes numerous stop codons in the NCU16002 ORF, beginning with a TAG at amino acid residue 203, which removes 121 residues from the full-length conserved hypothetical protein encoded at NCU16002. In strains FGSC 3566 and FGSC 3831 this ORF has additional indels, including the insertion of GG at position 1,356. Position 1,481 has an insertion of one G in strain FGSC 3566 and of one C in strain FGSC 3831.

NCU16001 encodes a truncated version of NADH dehydrogenase subunit 2, with the full-length version encoded by NCU16006. The truncated NCU16001 ORF is 705 nucleotides in length and has multiple indels in five strains, all of which cause frameshifts. The deletion of the C at position 616 is found in strains FGSC 3114 and FGSC 3566 and is homoallelic in both strains. This deletion causes a frameshift and introduces multiple stop codons, the first being a TAG codon at triplet 120 of the 235 amino acid protein. Similarly, the deletion of one G at position 630 in strains FGSC 322 and FGSC 3921 causes a frameshift that introduces a stop codon at position 121, as well as multiple stops after that position. There are no indels in the full-length version of NADH dehydrogenase subunit 2 (NCU16006) in any of the strains sequenced in this program.
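How a single-base deletion shifts the reading frame and exposes a premature stop codon, as described for NCU16001 and NCU16002, can be illustrated with a small sketch. The sequence below is a made-up toy ORF, not N. crassa sequence, and the helper names are hypothetical. The sketch assumes NCBI translation table 4 (the mold mitochondrial code), under which TAA and TAG are stop codons and TGA encodes tryptophan.

```python
# Stop codons under NCBI translation table 4 (assumed here); TGA encodes Trp.
STOP_CODONS = {"TAA", "TAG"}


def first_stop_codon(orf):
    """Return the 1-based number of the first in-frame stop codon, or None."""
    for codon_number, start in enumerate(range(0, len(orf) - 2, 3), start=1):
        if orf[start:start + 3] in STOP_CODONS:
            return codon_number
    return None


def delete_base(orf, index):
    """Return the ORF with the base at 0-based `index` removed."""
    return orf[:index] + orf[index + 1:]


# Toy ORF: ATG GCT AGC TTA GGA TAA (stop at codon 6).
toy_orf = "ATGGCTAGCTTAGGATAA"
stop_before = first_stop_codon(toy_orf)
# Deleting the G at index 3 gives ATG CTA GCT TAG ..., a stop at codon 4.
stop_after = first_stop_codon(delete_base(toy_orf, 3))
print(stop_before, stop_after)
```

Applying the same two steps to a real ORF and a reported deletion coordinate would require converting the genome position to an offset within the ORF, which depends on the gene's start coordinate and strand.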

In all, twenty mitochondrial ORFs have indels (Table 3) and of these, nine ORFs have indels within the protein coding region of the gene. Eleven ORFs have insertions or deletions in an intron and four have indels directly adjacent (3′ or 5′) to the ORF. Five ORFs have no insertions or deletions and these include the full-length version of the NADH dehydrogenase subunit 2 (NCU16006), two hypothetical proteins (NCU16011, NCU16023), and two endonucleases (NCU16014 and NCU16021).

Table 3. Mitochondrial open reading frames (ORF) with polymorphisms relative to the reference genome.

Seventy-two larger rearrangements with both endpoints within the mitochondrial genome were detected among these 18 strains using Breakdancer (Chen et al. 2009a). An additional 37 rearrangements have one endpoint on a chromosome in the nuclear genome. All of the polymorphisms detected with Breakdancer are unique, although nineteen share an endpoint with another variant. In every such case, the variants sharing an endpoint were different variants within a single strain. Twenty-five of the polymorphisms detected with Breakdancer were deletions while twelve were insertions. The average deletion was 2,925 bases, although three putative deletions of over 20 Kb were identified in different strains. The average insertion was 123 bases, with a range of 96 to 159 bases.
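The separation of rearrangements into those with both endpoints in the mitochondrial genome and those with one endpoint in the nuclear genome, together with mean sizes by type, can be sketched as follows. The record layout (contig of each endpoint, variant type, size), the contig label `mito` and the function name are assumptions for illustration and do not reproduce the Breakdancer output format; the toy records are not data from the study.

```python
from collections import defaultdict
from statistics import mean

MITO_CONTIG = "mito"  # hypothetical label for the mitochondrial contig


def summarize_rearrangements(records, mito_contig=MITO_CONTIG):
    """records: iterable of (chrom1, chrom2, sv_type, size) tuples (assumed layout)."""
    counts = {"within_mito": 0, "mito_nuclear": 0}
    sizes = defaultdict(list)
    for chrom1, chrom2, sv_type, size in records:
        endpoints_in_mito = (chrom1 == mito_contig) + (chrom2 == mito_contig)
        if endpoints_in_mito == 2:
            counts["within_mito"] += 1
            # Sizes are tallied only for events entirely within the mitochondrial genome.
            sizes[sv_type].append(abs(size))
        elif endpoints_in_mito == 1:
            counts["mito_nuclear"] += 1
    mean_sizes = {sv_type: mean(values) for sv_type, values in sizes.items()}
    return counts, mean_sizes


# Toy records with made-up contigs, types and sizes.
toy_records = [
    ("mito", "mito", "DEL", 3000),
    ("mito", "mito", "DEL", 1000),
    ("mito", "mito", "INS", 120),
    ("mito", "chr1", "CTX", 0),
    ("chr2", "chr2", "DEL", 500),
]
counts, mean_sizes = summarize_rearrangements(toy_records)
print(counts, mean_sizes)
```

Records with neither endpoint on the mitochondrial contig are ignored, so the same function can be run over a genome-wide call set.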

Discussion

Overall, the mitochondrial genomes of the eighteen strains characterized by whole genome sequence analysis contain very few SNVs. Even the strain with the most SNVs, FGSC 3566, had only 36, and most of these were at positions where both the reference genome base and an alternate base were detected. Interestingly, several of the strains characterized in the present study are related to those used in the pioneering work that clearly showed uniparental inheritance of Neurospora mitochondria (Mannella et al. 1979). Strain FGSC 821 was deposited as a spontaneous mutant arising in strain 4A, which is the designation used for the Abbott strain in Mannella et al. (1979). In this earlier work, Abbott strains were described as mitochondrial genome type I. In the same work, strains in the Lindegren and St Lawrence backgrounds were described as having type II mitochondria. On the deposit form submitted with the strain, FGSC 1363 was explicitly listed as being in a Lindegren background. Other strains in the current analysis were backcrossed into the St Lawrence background (for example, FGSC 7035). The possibility that the strains in the current study contain the same mitochondrial genomes as those described in Mannella et al. (1979) is supported by the presence of the C to G SNV at position 2,246 in both the St Lawrence type genome (FGSC 7035) and the Lindegren-derived strain (FGSC 1363), as well as in thirteen additional strains, but not in strain FGSC 821 (the Abbott strain).

Two of the indels in NCU16001 (the truncated NADH dehydrogenase subunit 2) occur in multiple strains and are well supported, although both occur in short strings of the same base. NCU16002 encodes a conserved hypothetical protein and has multiple unique and shared indels, including the second most common indel in the mitochondrial genome among these strains. The deletion of one A from position 1,532 in this ORF removes 121 amino acids from the final putative protein product. This ORF, also known as ufILM (D’Souza et al. 2005), has little orthology to other proteins in the NCBI NR protein database and has no conserved protein domains. The finding of these frameshift-inducing indels suggests that these two genes are both pseudogenes resulting from an ancestral partial duplication within the mitochondrial genome (Agsteribbe et al. 1989). While many mitochondrial ORFs have indels, these do not show the bias towards indels that preserve the reading frame that was seen for indels in nuclear genes (McCluskey et al. 2011). Although the observation of the same indel in multiple strains lends credence to the view that these indels accurately represent the underlying sequence, indels commonly occur in runs of the same base, and it cannot be determined from these data whether they are changes in the mitochondrial genome or systematic errors in the sequencing process. Although intrachromosomal rearrangements have previously been implicated in the start-stop growth phenotype of so-called stopper mutants, the rearrangements found in the present study do not correspond to those described for the stopper E35 mutant (de Vries et al. 1986). Indeed, the rearrangements typically comprise only a fraction of the reads for a given region.
The anomalous characterization of interchromosomal recombination between nuclear and mitochondrial genomes by the Breakdancer program suggests artifacts arising either from library construction or in silico during the subsequent analysis. Whether mitochondrial sequences are found in the nuclear genome, or nuclear sequence is present in the mitochondrial genome, cannot be assessed without additional investigation.

While the traditional view of mitochondria is that of individual cell-like organelles (Luck 1963), recent study suggests more of a filamentous or syncytial structure (Bowman et al. 2009) with the mitochondrial DNA organized into nucleoids (Gilkerson et al. 2008; Basse 2010). Moreover, recent analysis of the mitochondrial proteome is adding to the understanding of the role of nuclear and mitochondrial encoded genes (Keeping et al. 2011). While it may be attractive to suggest that the deleterious mutations detected in a fraction of the reads in the whole genome sequencing of Neurospora strains represent defective mitochondrial genomes present in an otherwise healthy background, the present level of analysis does not allow this conclusion. The fact that most of the indels were homoallelic contrasts markedly with the observation that most of the SNVs were multiallelic. Like the SNVs, the larger-scale rearrangements detected by the Breakdancer algorithm were mostly multiallelic. Whether these observations provide insight into fundamental aspects of mitochondrial genome maintenance cannot be determined with the present dataset. Additional experiments, for example comparing sequence from freshly germinated conidia to that generated from stationary-phase cultures, may allow insight into the nature of these polymorphisms. Future studies may take advantage of the information presented here to, for example, amplify unique DNA fragments generated only by deletions or large-scale rearrangements. Recent advances in whole genome sequencing may enable experimental analysis of mutation and rearrangement of the mitochondrial genome in Neurospora, other fungi, and indeed all organisms.