Endogenous retroviral promoter exaptation in human cancer
Cancer arises from a series of genetic and epigenetic changes, which result in abnormal expression or mutational activation of oncogenes, as well as suppression/inactivation of tumor suppressor genes. Aberrant expression of coding genes or long non-coding RNAs (lncRNAs) with oncogenic properties can be caused by translocations, gene amplifications, point mutations or other less characterized mechanisms. One such mechanism is the inappropriate usage of normally dormant, tissue-restricted or cryptic enhancers or promoters that serve to drive oncogenic gene expression. Dispersed across the human genome, endogenous retroviruses (ERVs) provide an enormous reservoir of autonomous gene regulatory modules, some of which have been co-opted by the host during evolution to play important roles in normal regulation of genes and gene networks. This review focuses on the “dark side” of such ERV regulatory capacity. Specifically, we discuss a growing number of examples of normally dormant or epigenetically repressed ERVs that have been harnessed to drive oncogenes in human cancer, a process we term onco-exaptation, and we propose potential mechanisms that may underlie this phenomenon.
Keywords: Gene regulation, Endogenous retrovirus, Long terminal repeat, Retrotransposon, Epigenetics, Cancer, Alternative promoter, Exaptation, Transcription
Abbreviations
AFAP1-AS1: AFAP1 antisense RNA 1
ALCL: Anaplastic large-cell lymphoma
ALK: Anaplastic lymphoma kinase
BANCR: BRAF-regulated lncRNA 1
CAGE: Capped analysis of gene expression
CSF1R: Colony stimulating factor one receptor
DLBCL: Diffuse large B-cell lymphoma
ERBB4: Erb-b2 receptor tyrosine kinase 4
EST: Expressed sequence tag
ETV1: ETS variant 1
EVADR: Endogenous retroviral-associated adenocarcinoma RNA
FABP7: Fatty acid binding protein 7
HOST2: Human ovarian cancer specific transcript-2
HULC: Highly upregulated in liver cancer
IRF5: Interferon regulatory factor 5
IRFE: Interferon regulatory factor-binding element
Linc-ROR: Long intergenic non-protein coding RNA, regulator of reprogramming
LINE-1 (L1): Long interspersed repeat-1
lncRNA: Long non-coding RNA
LTR: Long terminal repeat
MET: MET proto-oncogene, receptor tyrosine kinase
OATP1B3: Organic anion transporting polypeptide 1B3
SAMMSON: Survival associated mitochondrial melanoma specific oncogenic non-coding RNA
SChLAP1: SWI/SNF complex antagonist associated with prostate cancer 1
SINE: Short interspersed element
SLCO1B3: Solute carrier organic anion transporter family member 1B3
TCGA: The cancer genome atlas
TFPI-2: Tissue factor pathway inhibitor 2
TIS: Translation initiation site
TSS: Transcriptional start site
UCA1: Urothelial cancer associated 1
Sequences derived from transposable elements (TEs) occupy at least half the human genome [1, 2]. TEs are generally classified into two categories: DNA transposons, which comprise 3.2% of the human genome, and retroelements, which include short interspersed repeats (SINEs, 12.8% of the genome), long interspersed repeats (LINEs, 20.7%) and long terminal repeat (LTR) elements, derived from endogenous retroviruses (ERVs, 8.6%). Over evolutionary time, TE sequences in the genome can become functional units that confer a fitness advantage, a process called “exaptation” [3, 4]. Exaptation includes protein coding, non-coding and regulatory effects of TEs. This is in contrast to the designation of “nonaptations” for genetic units that perform some function (such as initiating transcription) but do not impact host fitness. Besides their roles in shaping genomes during evolution, TEs continue to have an impact in humans through insertional mutagenesis, inducing rearrangements and affecting gene regulation, as discussed in recent reviews [5, 6, 7, 8, 9, 10, 11, 12].
Efforts to explore the role of TEs in human cancer have focused primarily on LINEs and ERVs. While nearly all L1s, the major human LINE family, are defective, a few hundred retain the ability to retrotranspose, and these active elements occasionally cause germ line mutations [9, 14, 15]. Several recent studies have also documented somatic, cancer-specific L1 insertions [16, 17, 18, 19, 20, 21, 22, 23], and a few such insertions were shown to contribute to malignancy. For example, two L1 insertions were documented to disrupt the tumor suppressor gene APC in colon cancer [16, 23]. However, it is probable that most insertions are non-consequential “passenger mutations”, as recently discussed by Hancks and Kazazian. Thus, the overall biological effect size of LINE retrotransposition on the process of oncogenesis may be limited.
No evidence for retrotranspositionally active ERVs in humans has been reported [24, 25, 26], so it is unlikely that human ERVs activate oncogenes or inactivate tumor suppressor genes by somatic retrotransposition. This is in contrast to the frequent oncogene activation by insertions of exogenous and endogenous retroviruses in chickens or mice, where retrotranspositional activity of ERVs is very high [27, 28, 29]. Therefore, to date, most studies into potential roles for ERVs in human cancer have focused on their protein products. Indeed, there is strong evidence that the accessory proteins Np9 and Rec, encoded by members of the relatively young HERV-K (HML-2) group, have oncogenic properties, particularly in germ cell tumors [30, 31, 32, 33].
Regardless of their retrotranspositional or coding capacity, ERVs may play a broader role in oncogenesis involving their intrinsic regulatory capacity. De-repression/activation of cryptic (or normally dormant) promoters to drive ectopic expression is one mechanism that can lead to oncogenic effects [34, 35, 36, 37, 38, 39, 40]. Because TEs, and especially ERV LTRs, are an abundant reservoir of natural promoters in the human genome [6, 41, 42], inappropriate transcriptional activation of typically repressed LTRs may contribute to oncogenesis. Here we review examples of such phenomena, which we term “onco-exaptation”, and propose two explanatory models to understand the role of LTRs in oncogenesis.
Promoter potential of ERVs
Table 1. ERV/LTR groups mentioned in this review
Columns: Associated LTRs (Repbase names); ~Copies of internal regions (footnote a); ~Copies of solitary LTRs (footnote b)
Associated LTRs: LTR7, 7B, 7C, 7Y; LTR12, 12B-12F; LTR2, 2B, 2C; LTR1, 1A-1F; LTR5, 5A, 5B, 5Hs
Approximately 90% of the “ERV-related” human genomic DNA is in the form of solitary LTRs, which are created over evolutionary time via recombination between the 5’ and 3’ LTRs of an integrated provirus [48, 49]. LTRs naturally contain transcriptional promoters and enhancers, and often splice donor sites, required for autonomous expression of the integrated LTR element. Furthermore, unlike for LINEs (see below), the integration process nearly always retains the primary transcriptional regulatory motifs, i.e. the LTR, even after recombination between the LTRs of a full-length proviral form. Mutations will degrade LTR promoter/enhancer motifs over time, but many of the >470,000 ERV/LTR loci in the genome likely still retain some degree of their ancestral promoter/enhancer function, and hence a gene regulatory capacity.
LTR-mediated regulation of single genes and gene networks has been increasingly documented in the literature. For example, studies have implicated ERV LTRs in species-specific regulatory networks in ES cells, in the interferon response, in p53-mediated regulation, as tissue-specific enhancers [54, 55] and in regulating pluripotency by promoting genes and lncRNAs in stem cells [56, 57, 58, 59, 60]. LTR regulatory capacity arises both from their “ready-to-use” ancestral transcription factor (TF) binding sites and from mutation/evolution of novel sites, possibly maintained through epistatic capture (recently reviewed in ). For more in-depth discussion of the evolutionary exaptation of enhancers/promoters of LTRs and other TEs in mammals, we refer the reader to a rapidly growing number of reviews on this subject [6, 10, 42, 62, 63, 64, 65]. Suffice it to say that retrotranspositionally incompetent ERV LTRs, long considered the “poor cousin” of active L1 elements, have emerged from the shadowy realm of junk DNA and are now recognized as a major source of gene regulatory evolution through exaptation of their promoters and enhancers.
Promoter potential of LINEs and other non-LTR TEs
Besides acting through new retrotransposition events, existing L1 elements can also impact genes through promoter donation. Full-length L1 elements harbor two internal promoters at their 5’ end, a sense promoter that drives expression of the element and an antisense promoter that has been shown to control expression of nearby genes through formation of chimeric transcripts [66, 67, 68, 69]. Recently, this antisense promoter was also shown to promote expression of a small protein, ORF0, which plays a regulatory role in retrotransposition. While there are approximately 500,000 L1 loci in the human genome, the vast majority of them are 5’ truncated due to incomplete reverse transcription during the retrotransposition process. Only ~3500-7000 are full length, retaining their promoters and hence the potential ability to lend these promoters to nearby genes [71, 72]. Therefore, irrespective of differences in promoter strength, epigenetic regulation or mutational degradation, the vast copy number difference (~500,000 LTRs versus ~5000 promoter-containing L1s) is likely a major reason why the great majority of TE-initiated transcripts involve LTRs rather than L1s. In genome-wide screens of TE-initiated transcripts, small fragments of old L2 elements, which do not span the canonical L2 promoter, can be found as TSSs of lowly expressed transcripts (unpublished data). Such instances likely represent “de novo” promoters, i.e. those arising naturally from genomic DNA that happens to be derived from a TE fragment (possibly because L2 fragments have a GC-rich base composition), rather than “ancestral” or “ready-made” promoters, which utilize a TE’s original regulatory sequence.
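The copy-number argument above can be made concrete with a back-of-the-envelope calculation. The sketch below uses only the approximate counts quoted in the text (~500,000 LTR loci and ~3500-7000 full-length L1s); the function name is illustrative, and the calculation deliberately ignores promoter strength, epigenetic state and mutational decay.

```python
# Rough ratio of LTR loci to promoter-containing (full-length) L1 loci.
# The counts are the approximate figures quoted in the text, not new measurements.

LTR_LOCI = 500_000                      # ~500,000 LTR loci
FULL_LENGTH_L1_RANGE = (3_500, 7_000)   # ~3500-7000 full-length L1s


def fold_excess(ltr_loci: int, l1_loci: int) -> float:
    """Return how many times more LTR loci there are than full-length L1 loci."""
    if l1_loci <= 0:
        raise ValueError("l1_loci must be positive")
    return ltr_loci / l1_loci


low_l1, high_l1 = FULL_LENGTH_L1_RANGE
fold_min = fold_excess(LTR_LOCI, high_l1)  # most L1s gives the smallest excess
fold_max = fold_excess(LTR_LOCI, low_l1)   # fewest L1s gives the largest excess
fold_mid = fold_excess(LTR_LOCI, 5_000)    # the ~5000 figure used in the text

print(f"LTR excess over full-length L1s: {fold_min:.0f}x to {fold_max:.0f}x "
      f"(about {fold_mid:.0f}x at ~5000 L1s)")
```

On these numbers alone, LTRs would outnumber promoter-containing L1s by roughly two orders of magnitude, which is the scale of difference the paragraph relies on.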
Human SINE elements, namely Alus and the older MIRs, can also promote transcription of nearby genes, but these instances are relatively rare given their extremely high copy numbers (~1.85 million fragments). This likely partly reflects the fact that SINEs, being derived from small functional RNAs, inherently possess Pol III promoters, rather than Pol II, and their autonomous promoter strength is weak [74, 75]. Old MIR elements, as well as other ancient SINEs and DNA TEs, have been more prominent as enhancers, rather than genic promoters, as shown in several studies [76, 77, 78, 79, 80, 81].
TEs and the cancer transcriptome
While some TE components have assumed cellular functions over evolutionary time, such as the syncytin genes in mammalian placenta, derived from independent ERV env genes in multiple mammals [6, 44, 82, 83, 84], the vast majority of TE/ERV insertions will be neutral or detrimental to the host. Given the potential for harm, multiple host mechanisms to repress these sequences have evolved. In mammals, ERV and L1 transcription is suppressed in normal cells by DNA methylation and/or histone modifications as well as many other host factors [9, 85, 86, 87, 88, 89, 90, 91, 92]. The epigenetic regulation of TEs is relevant in cancer because epigenetic changes are common in malignancy and frequently associated with mutations in “epigenome-modifying” genes [93, 94, 95, 96, 97]. While the ultimate effects of many such mutations are not yet clear, their prominence indicates a central role for epigenomic dysregulation in oncogenesis [94, 98]. The most well established epigenetic changes are promoter hypermethylation and associated silencing of tumor suppressor genes [95, 99, 100] as well as genome-wide DNA hypomethylation [101, 102, 103]. Hypomethylation of ERVs and L1s in many tumors has been documented [104, 105, 106] and general transcriptional up-regulation of ERVs and L1s is often observed in cancers [33, 107, 108, 109]. However, other studies have shown no significant changes in ERV expression in selected human cancers compared to corresponding normal tissues [110, 111].
General conclusions about overall TE transcriptional deregulation in malignancy, or in any other biological state, are not always well founded and can depend on the type and sensitivity of the assay. For example, expression studies that use consensus probes for internal L1 or ERV regions to assay expression by custom microarrays or RT-PCR do not resolve individual loci, so high expression signals could reflect dispersed transcriptional activation of many elements or the high expression of only one or a few loci. Such assays typically also cannot distinguish between expression due to TE promoter de-repression and expression due to increased transcription of transcripts harboring TEs. RNA-Seq has the potential to give information on expression of individual TE loci, but interpretations of expression levels can be confounded by mapping difficulties, read length and sequencing depth. In any event, in most cases where transcriptional up-regulation of TE groups or individual TEs has been detected in cancer, the biological relevance of such aberrant expression is poorly understood.
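To illustrate why a consensus-probe signal is ambiguous, the toy example below builds two hypothetical per-locus expression profiles that sum to the same family-level total. All locus names and values are invented for illustration; only the logic (summing over loci hides how expression is distributed among them) reflects the point made above.

```python
# Two invented per-locus expression profiles for one TE family.
# A consensus probe reports only the family-wide sum, so both look identical.

dispersed = {f"locus_{i}": 10.0 for i in range(1, 101)}   # 100 loci, each weakly expressed
single_hot = {f"locus_{i}": 0.0 for i in range(1, 101)}
single_hot["locus_42"] = 1000.0                           # one locus carries all the signal


def family_signal(per_locus: dict[str, float]) -> float:
    """What a consensus probe would see: the total over all loci."""
    return sum(per_locus.values())


def top_locus_share(per_locus: dict[str, float]) -> float:
    """Fraction of the total contributed by the single most expressed locus."""
    total = family_signal(per_locus)
    return max(per_locus.values()) / total if total > 0 else 0.0


for name, profile in (("dispersed", dispersed), ("single_hot", single_hot)):
    print(name, family_signal(profile), round(top_locus_share(profile), 2))
```

Both profiles give the same family-level signal, yet the share contributed by the top locus differs a hundredfold, which is why locus-resolved data are needed before drawing conclusions about activation of a whole TE group.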
Onco-exaptation of ERV/TE promoters
Table: Activation of oncogenes by onco-exaptation of TE-derived promoters
Columns: Primary result of TE-driven expression; TE promoter coordinates (hg38)
Entries: gene functions (tyrosine kinase receptor; fatty acid binding); primary result (ectopic expression of normal protein); TE promoters ((ERVL-MaLR) THE1B LTR; (ERV1) LOR1a LTR; (ERVL-MaLR) MLT1C LTR; (ERVL-MaLR) MLT1H2 LTR)
Ectopic and overexpression of protein-coding genes
The most straightforward interaction between a TE promoter and a gene is when a TE promoter is activated, initiates transcription, and transcribes a downstream gene without altering the open reading frame (ORF), thus serving as an alternative promoter. Since the TE promoter may be regulated differently than the native promoter, this can result in ectopic and/or overexpression of the gene, with oncogenic consequences.
The first case of such a phenomenon was discovered in the investigation of the potent oncogene colony stimulating factor one receptor (CSF1R) in Hodgkin lymphoma (HL). Normally, CSF1R expression is restricted to macrophages in the myeloid lineage. To understand how this gene is expressed in HL, a B-cell derived cancer, Lamprecht et al. performed 5’ RACE, which revealed that the native, myeloid-restricted promoter is silent in HL cell lines, with CSF1R expression instead being driven by a solitary THE1B LTR, of the MaLR-ERVL class (Fig. 1a). THE1B LTRs are ancient, found in both Old and New World primates, and are highly abundant in the human genome, with a copy number of ~17,000 [50, 114] (Table 1). The THE1B-CSF1R transcript produces a full-length protein in HL, which is required for growth/survival of HL cell lines and is clinically prognostic for poorer patient survival. Ectopic CSF1R expression in HL appears to be completely dependent on the THE1B LTR, and CSF1R protein or mRNA is detected in 39–48% of HL patient samples [115, 116].
To detect additional cases of onco-exaptation, we screened whole transcriptomes (RNA-Seq libraries) from a set of HL cell lines as well as from normal human B cells for TE-initiated transcripts, specifically transcripts that were recurrent in HL and not present in normal B cells. We identified the Interferon Regulatory Factor 5 gene (IRF5) as a recurrently up-regulated gene being promoted by a LOR1a LTR located upstream of the native/canonical TSS (Fig. 1b). LOR1a LTRs are much less abundant compared to THE1 LTRs (Table 1) but are of similar age, with the IRF5 copy having inserted prior to New World-Old World primate divergence. IRF5 has multiple promoters/TSSs and complex transcription and, contrary to the CSF1R case, the native promoters are not completely silent in HL. However, LTR activity correlates with strong overexpression of the IRF5 protein and transcript, above normal physiological levels. While our study was ongoing, Kreher et al. reported that IRF5 is upregulated in HL and is a central regulator of the HL transcriptome. Moreover, they found that IRF5 is crucial for HL cell survival. Intriguingly, we noted that insertion of the LOR1a LTR created an interferon regulatory factor-binding element (IRFE) that overlaps the 5’ end of the LTR. This IRFE was previously identified to be critical for promoter activity as a positive feedback loop through binding of various IRFs, including IRF5 itself. Hence, the inherent promoter motifs of the LTR, coupled with the creation of the IRFE upon insertion, combined to provide an avenue for ectopic expression of IRF5 in HL.
Expression of truncated proteins
In these cases, a TE-initiated transcript results in the expression of a truncated open reading frame of the affected gene, typically because the TE is located in an intron, downstream of the canonical translational start site. The TE initiates transcription, but the final transcript structure depends on the position of downstream splice sites, and protein expression requires usage of a downstream ATG. Protein truncations can result in oncogenic effects due to loss of regulatory domains or through other mechanisms, with a classic example being v-myb, a truncated form of myb carried by acutely transforming animal retroviruses [121, 122].
The first such reported case involving a TE was identified in a screen of human ESTs to detect transcripts driven by the antisense promoter within L1 elements. Mätlik et al. identified an L1PA2 within the second intron of the proto-oncogene MET (MET proto-oncogene, receptor tyrosine kinase) that initiates a transcript by splicing into downstream MET exons (Fig. 1c). Not surprisingly, transcriptional activity of the CpG-rich promoter of this L1 in bladder and colon cancer cell lines is inversely correlated with its degree of methylation [123, 124]. A slightly truncated MET protein is produced by the TE-initiated transcript, and one study reported that L1-driven transcription of MET reduces overall MET protein levels and signaling, although by what mechanism is not clear. Analyses of normal colon tissues and matched primary colon cancers and liver metastasis samples showed this L1 is progressively demethylated in the metastasis samples, which strongly correlates with increased L1-MET transcripts and protein levels. Since MET levels are a negative prognostic indicator for colon cancer, these findings suggest an oncogenic role for L1-MET.
More recently, Wiesner et al. identified a novel isoform of the receptor tyrosine kinase (RTK), anaplastic lymphoma kinase (ALK), initiating from an alternative promoter in its 19th intron. This alternative transcription initiation (ATI) isoform, or ALK ATI, was reported to be specific to cancer samples and found in ~11% of skin cutaneous melanomas. ALK ATI transcripts produce three protein isoforms encoded by exons 20 to 29. These smaller isoforms exclude the extracellular domain of the protein but contain the catalytic intracellular tyrosine kinase domain. This same region of ALK is commonly found fused with a range of other genes via chromosomal translocations in lymphomas and a variety of solid tumors. Wiesner et al. found that ALK ATI stimulates several oncogenic signaling pathways, drives cell proliferation in vitro, and promotes tumor formation in mice.
To gain a molecular understanding of ALK-negative anaplastic large-cell lymphoma (ALCL) cases, Scarfo et al. conducted gene expression outlier analysis and identified high ectopic co-expression of ERBB4 and COL29A1 in 24% of such cases. Erb-b2 receptor tyrosine kinase 4 (ERBB4), also termed HER4, is a member of the ERBB family of RTKs, which includes EGFR and HER2, and mutations in this gene have been implicated in some cancers. Analysis of the ERBB4 transcripts expressed in these ALCL samples revealed two isoforms initiated from alternative promoters, one within intron 12 (I12-ERBB4) and one within intron 20 (I20-ERBB4), with little or no expression from the native/canonical promoter. Both isoforms produce truncated proteins that show oncogenic potential, either alone (I12 isoform) or in combination. Remarkably, both promoters are LTR elements of the ancient MaLR-ERVL class (Fig. 1e). Of note, Scarfo et al. reported that two thirds of ERBB4-positive cases showed a “Hodgkin-like” morphology, which is normally found in only 3% of ALCLs. We therefore examined our previously published RNA-Seq data from 12 HL cell lines and found evidence for transcription from the intron 20 MLT1H2 LTR in two of these lines (unpublished observations), suggesting that truncated ERBB4 may play a role in some HLs.
TE-promoted expression of chimeric proteins
Perhaps the most fascinating examples of onco-exaptation involve generation of a novel “chimeric” ORF via usage of a TE promoter that fuses otherwise non-coding DNA to downstream gene exons. These cases involve both protein and transcriptional innovation and the resulting product can acquire de novo oncogenic potential.
The solute carrier organic anion transporter family member 1B3 gene (SLCO1B3) encodes organic anion transporting polypeptide 1B3 (OATP1B3), a 12-transmembrane transporter with normal expression and function restricted to the liver. Several studies have shown that this gene is ectopically expressed in solid tumors of non-hepatic origin, particularly colon cancer [131, 132, 133, 134]. Investigations into the cause of this ectopic expression revealed that the normal liver-restricted promoter is silent in these cancers, with expression of “cancer-type” (Ct)-OATP1B3 being driven from an alternative promoter in the second canonical intron [133, 134]. While not previously reported as being within a TE, we noted that this alternative promoter maps within the 5’ LTR (LTR7) of a partial antisense HERV-H element that is missing the 3’ LTR. Expression of HERV-H itself and LTR7-driven chimeric long non-coding RNAs is a noted feature of embryonic stem cells and normal early embryogenesis, where several studies indicate an intriguing role for this ERV group in pluripotency (for recent reviews see [8, 10, 60]). A few studies have also noted higher general levels of HERV-H transcription in colon cancer [109, 135]. The LTR7-driven isoform of SLCO1B3 makes a truncated protein lacking the first 28 amino acids but also includes protein sequence from the LTR7 and an adjacent MER4C LTR (Fig. 1f). The novel protein is believed to be intracellular and its role in cancer remains unclear. However, one study showed that high expression of this isoform is correlated with reduced progression-free survival in colon cancer.
In another study designed specifically to look for TE-initiated chimeric transcripts, we screened RNA-seq libraries from 101 patients with diffuse large B-cell lymphoma (DLBCL) of different subtypes and compared them to transcriptomes from normal B-cells. This screen resulted in the detection of 98 such transcripts that were found in at least two DLBCL cases and in no normal samples. One of these involved the gene for fatty acid binding protein 7 (FABP7). FABP7, normally expressed in brain, is a member of the FABP family of lipid chaperones involved in fatty acid uptake and trafficking. Overexpression of FABP7 has been reported in several solid tumor types and is associated with poorer prognosis in aggressive breast cancer [139, 140]. In 5% of the DLBCL cases screened, we found that FABP7 is expressed from an antisense LTR2 (the 5’ LTR of a HERV-E element) (Fig. 1g). Since the canonical ATG is in the first exon of FABP7, the LTR-driven transcript encodes a chimeric protein with a different N-terminus (see accession NM_001319042.1). Functional analysis in DLBCL cell lines revealed that the LTR-FABP7 protein isoform is required for optimal cell growth and also has subcellular localization properties distinct from the native form.
Overall, among all TE types giving rise to chimeric transcripts detected in DLBCL, LTRs were over-represented compared to their genomic abundance and, among LTR groups, we found that LTR2 elements and THE1 LTRs were over-represented. As discussed above, this predominance of LTRs over other TE types is expected.
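The recurrence criterion used in the DLBCL screen (present in at least two cancer cases and in no normal sample) can be sketched as a simple set-based filter. This is a minimal illustration, not the pipeline used in the study: the function name, sample IDs and transcript IDs are all hypothetical, and upstream steps such as read mapping and TSS annotation are assumed to have already produced per-sample lists of TE-initiated transcripts.

```python
# Recurrence filter: keep TE-initiated transcripts seen in >= 2 cancer samples
# and in no normal sample. Sample and transcript identifiers are invented.

from collections import Counter

MIN_CANCER_SAMPLES = 2  # threshold stated in the text


def recurrent_cancer_specific(cancer_calls, normal_calls, min_cancer=MIN_CANCER_SAMPLES):
    """Both arguments map a sample ID to the set of transcript IDs detected in it."""
    seen_in_normal = set().union(*normal_calls.values())
    counts = Counter(t for calls in cancer_calls.values() for t in calls)
    return sorted(t for t, n in counts.items()
                  if n >= min_cancer and t not in seen_in_normal)


cancer = {
    "case_1": {"LTR2-geneA", "THE1B-geneB", "L1-geneC"},
    "case_2": {"LTR2-geneA", "L1-geneC"},
    "case_3": {"THE1B-geneB"},
}
normal = {
    "normal_1": {"L1-geneC"},   # detected in a normal sample, so it is excluded
    "normal_2": set(),
}

print(recurrent_cancer_specific(cancer, normal))
```

In this toy input, the L1-initiated transcript is recurrent in cancer but is dropped because it also appears in a normal sample, leaving only the two LTR-initiated candidates.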
TE-initiated non-coding RNAs in cancer
Table: LTR-driven lncRNAs with oncogenic role
Columns: TE promoter coordinates (hg38)
TE promoters: (ERVL-MaLR) THE1A LTR (footnote b); (ERVL-MaLR) MLT1A LTR; (ERV1) MER41B LTR
TE-initiated LncRNAs with oncogenic properties
Linc-ROR is a non-coding RNA (long intergenic non-protein coding RNA, regulator of reprogramming) promoted by the 5’ LTR (LTR7) of a full-length HERV-H element (Fig. 3b) and has been shown to play a role in human pluripotency. Evidence suggests it acts as a microRNA sponge of miR-145, which is a repressor of the core pluripotency transcription factors Oct4, Nanog and Sox2. Several recent studies have reported an oncogenic role for Linc-ROR in different cancers by sponging miR-145 [147, 148, 149] or through other mechanisms [150, 151].
Using Serial Analysis of Gene Expression (SAGE), Rangel et al. identified five Human Ovarian cancer Specific Transcripts (HOSTs) that were expressed in ovarian cancer but not in other normal cells or cancer types examined. One of these, HOST2, is annotated as a spliced lncRNA entirely contained within a full-length HERV-E and promoted by an LTR2B element (Fig. 3c). Perusal of RNA-Seq from the 9 core ENCODE cell lines shows robust expression of HOST2 in GM12878, a B-lymphoblastoid cell line, which extends beyond the HERV-E. As with Linc-ROR, HOST2 appears to play an oncogenic role by functioning as a miRNA sponge of miRNA let-7b, an established tumor suppressor, in epithelial ovarian cancer.
The RefSeq-annotated lncRNA AFAP1 antisense RNA 1 (AFAP1-AS1) runs antisense to the actin filament associated protein 1 (AFAP1) gene, and several publications report its up-regulation and association with poor survival in a number of solid tumor types [155, 156, 157, 158]. While the oncogenic mechanism of AFAP1-AS1 has not been extensively studied, one report presented evidence that it promotes cell proliferation by upregulating RhoA/Rac2 signaling and that its expression inversely correlates with AFAP1. Although AFAP1-AS1 is clearly annotated as initiating within a solitary THE1A LTR (Fig. 3d), this fact has not been mentioned in previous publications. In screens for TE-initiated transcripts using RNA-seq data from HL cell lines, we noted recurrent and cancer-specific up-regulation of AFAP1-AS1 (unpublished observations), suggesting that it is not restricted to solid tumors. The inverse correlation of expression between AFAP1 and AFAP1-AS1 suggests an interesting potential mechanism by which TE-initiated transcription may suppress a gene, whereby an antisense TE-initiated transcript disrupts the transcription, translation or stability of a tumor suppressor gene transcript through RNA interference.
The SAMMSON lncRNA (survival associated mitochondrial melanoma specific oncogenic non-coding RNA), which is promoted by a solitary LTR1A2 element, was recently reported as playing an oncogenic role in melanoma. This lncRNA is located near the melanoma-specific oncogene MITF and is always included in genomic amplifications involving MITF. Even in melanomas with no genomic amplification of this locus, SAMMSON is expressed in most cases, increases growth and invasiveness and is a target for SOX10, a key TF in melanocyte development which is deregulated in melanoma. Interestingly, the two SOX10 binding sites near the SAMMSON TSS lie just upstream and downstream of the LTR (Fig. 2b), suggesting that both the core promoter motifs provided by the LTR and adjacent enhancer sites combine to regulate SAMMSON.
Other examples of LTR-promoted oncogenic lncRNAs include HULC (highly upregulated in liver cancer) [163, 164], UCA1 (urothelial cancer associated 1) [165, 166, 167, 168] and BANCR (BRAF-regulated lncRNA 1) [169, 170, 171]. Although not mentioned in the original paper, three of the four exons of BANCR were shown to be derived from a partial MER41 ERV, with the promoter within the 5’ LTR of this element annotated MER41B. Intriguingly, MER41 LTRs were recently shown to harbor enhancers responsive to interferon, indicating a role for this ERV group in shaping the innate immune response in primates. It would be interesting to investigate roles for BANCR with this in mind.
TE-initiated lncRNAs as cancer-specific markers
There are many examples of TE-initiated RNAs with potential roles in cancer or which are preferentially expressed in malignant cells but for which a direct oncogenic function has not yet been demonstrated. Still, such transcripts may underlie a predisposition for transcription of specific groups of LTRs/TEs in particular malignancies and therefore function as a marker for a cancer or cancer subtype. Since these events potentially do not confer a fitness advantage for the cancer cell, they are not “exaptations” but “nonaptations”.
One of these is a very long RNA initiated by the antisense promoter of an L1PA2 element, as reported by Tufarelli’s group and termed LCT13 [172, 173]. EST evidence indicates splicing from the L1 promoter to the GNGT1 gene, located over 300 kb away. The tumor suppressor gene tissue factor pathway inhibitor 2 (TFPI-2), which is often epigenetically silenced in cancers, is antisense to LCT13, and it was shown that LCT13 transcript levels are correlated with down-regulation of TFPI-2 and associated with repressive chromatin marks at the TFPI-2 promoter.
Gibb et al. analyzed RNA-Seq from colon cancers and matched normal colon to find cancer-associated lncRNAs and identified an RNA promoted by a solitary MER48 LTR, which they termed EVADR, for Endogenous retroviral-associated ADenocarcinoma RNA. Screening of data from The Cancer Genome Atlas (TCGA) showed that EVADR is highly expressed in several types of adenocarcinomas, that it is not associated with global activation of MER48 LTRs across the genome, and that its expression correlated with poorer survival. In another study, Gosenca et al. used a custom microarray to measure overall expression of several HERV groups in urothelial carcinoma compared to normal urothelial tissue and generally found no difference. However, they found one full-length HERV-E element, located in the antisense direction in an intron of the PLA2G4A gene, that is transcribed in urothelial carcinoma and appears to modulate PLA2G4A expression, thereby possibly contributing to carcinogenesis, although the mechanism is not clear.
By mining long nuclear RNA datasets from ENCODE cell lines, normal blood and Ewing sarcomas, one group identified over 2000 very long (~50–700 kb) non-coding transcripts termed vlincRNAs. They found the promoters for these vlincRNAs to be enriched in LTRs, particularly for cell type-specific vlincRNAs, and the most common transcribed LTR types varied in different cell types. Moreover, among the datasets examined, they reported that the number of LTR-promoted vlincRNAs correlated with degree of malignant transformation, prompting the conclusion that LTR-controlled vlincRNAs are a “hallmark” of cancer.
In a genome-wide CAGE analysis of 50 hepatocellular carcinoma (HCC) primary samples and matched non-tumor tissue, Hashimoto et al. found that many LTR-promoted transcripts are upregulated in HCC, most of these apparently associated with non-coding RNAs, as the CAGE peaks in the LTRs are far from annotated protein coding genes. Similar results were found in mouse HCC. Among the hundreds of human LTR groups, they found the LTR-associated CAGE peaks to be significantly enriched in LTR12C (HERV9) LTRs and mapped the common TSS site within these elements, which agrees with older studies on TSS mapping of this ERV group. Moreover, this group reported that HCCs with highest LTR activity mostly had a viral (Hepatitis B) etiology, were less differentiated and had higher risk of recurrence. This study suggests widespread tissue-inappropriate transcriptional activity of LTRs in HCC.
LTR12s as flexible promoters in cancer and normal tissues
Most recent human ERV LTR research has been focused on HERV-H (LTR7/7Y/7B/7C) due to roles for HERV-H/LTR7-driven RNAs in pluripotency [56, 57, 58, 60, 179, 180] or on the youngest HERV group, HERV-K (LTR5/5Hs), due to its expression in early embryogenesis [181, 182, 183], coding capacity of some members [30, 184] and potential roles for its proteins in cancer and other diseases [30, 31, 32, 33, 185]. LTR12s (including LTR12B, C, D, E and F subtypes), which are the LTRs associated with the HERV-9 group, are generally of similar age to HERV-H but are much more numerous than HERV-H or HERV-K, with solitary LTRs numbering over 6000 (Table 1). There are several examples of LTR12s providing promoters for coding genes or lncRNAs in various normal tissues [63, 188, 189, 190, 191]. LTR12s, particularly LTR12C, are longer and more CpG rich than most other ERV LTRs, possibly facilitating development of diverse inherent tissue-specificities and flexible combinations of TF binding sites, which may be less probable for other LTR types. For example, the consensus LTR7 (HERV-H) is 450 bp whereas LTR12C (of similar age) is 1577 bp, which is unusually long for retroviral LTRs. As noted above, LTR12 elements are among the most enriched LTR types activated as promoters in HCC and appear to be the most active LTR type in K562 cells. It is important to point out, however, that only a very small fraction of genomic LTR12 copies are transcriptionally active in any of these contexts, so general conclusions about activity of ‘a family of LTRs’ should be made with caution.
A number of other recent investigations on LTR12-driven chimeric transcription have been published. One study specifically screened for and detected numerous LTR12-initiated transcripts in ENCODE cell lines, some of which extend over long genomic regions and emanate from bidirectional promoters within these LTRs . The group of Dobbelstein discovered that a male germ line-specific form of the tumor suppressor TP63 gene is driven by an LTR12C . Interestingly, they found that this LTR is silenced in testicular cancer but reactivated upon treatment with histone deacetylase inhibitors (HDACi), which also induces apoptosis . In follow-up studies, this group used 3’ RACE to detect more genes controlled by LTR12s in primary human testis and in the GH testicular cancer cell line and reported hundreds of transcripts, including an isoform of TNFRSF10B which encodes the death receptor DR5 . As with TP63, treating GH or other cancer cell lines with HDAC inhibitors such as trichostatin A activated expression of the LTR12-driven TNFRSF10B and some other LTR12-chimeric transcripts and induced apoptosis [193, 194]. Therefore, in some cases, LTR-driven genes can have a proapoptotic role. In accord with this notion is a study reporting that LTR12 antisense U3 RNAs were expressed at higher levels in non-malignant versus malignant cells . It was proposed that the antisense U3 RNA may act as a trap for the transcription factor NF-Y, known to bind LTR12s , and hence participate in cell cycle arrest .
Chromosomal translocations involving TEs in cancer
Activation or creation of oncogenes via chromosomal translocations most commonly involves either the fusion of two coding genes or juxtaposition of new regulatory sequences next to a gene, resulting in oncogenic effects due to ectopic expression. One might expect some of the latter cases to involve TE-derived promoters/enhancers but, to date, there are very few well-documented examples of this mechanism in oncogenesis. The ETS family member ETV1 (ETS variant 1) is a transcription factor frequently involved in oncogenic translocations, particularly in prostate cancer. Although it is not a common translocation, Tomlins et al. identified a prostate tumor with the 5’ end of a HERV-K (HML-2) element on chromosome 22q11.23 fused to ETV1. This particular HERV-K element is a complex locus with two 5’ LTRs and is quite highly expressed in prostate cancer. Indeed, while a possible function is unknown, this HERV-K locus produces a lncRNA annotated as PCAT-14, for prostate cancer–associated ncRNA transcript-14. In the HERV-K-ETV1 fusion case, the resultant transcript (GenBank Accession EF632111) initiates in the upstream 5’ LTR, providing evidence that the LTR controls expression of ETV1.
The fibroblast growth factor receptor 1 (FGFR1) gene on chromosome 8 is involved in translocations with at least 14 partner genes in stem cell myeloproliferative disorder and other myeloid and lymphoid cancers . One of these involves a HERVK3 element on chromosome 19 and this event creates a chimeric ORF with HERVK3 gag sequences . While it was reported that the LTR promoter may contribute to expression of the fusion gene , no supporting evidence was presented. Indeed, perusal of public expression data (Expressed sequence tags) from a variety of tissues indicates that the HERVK3 element on chromosome 19 is highly expressed, but from a non-ERV promoter just upstream (see chr19:58,305,253–58,315,303 in human hg38 assembly). Therefore, there is little current evidence for LTR/TE promoters playing a role in oncogene activation via chromosomal translocations or rearrangements.
Models for onco-exaptation
The aforementioned cases of onco-exaptation are a distinct mechanism by which proto-oncogenes become oncogenic. Classical activating mutations within TEs may also lead to transcription of downstream oncogenes but we are unaware of any evidence for DNA mutations resulting in LTR/TE transcriptional activation, including cases where local DNA was sequenced  (unpublished results). Thus, it is important to consider the etiology through which LTRs/TEs become incorporated into new regulatory units in cancer. The mechanism could possibly be therapeutically or diagnostically important and perhaps even model how TEs influence genome regulation in evolutionary time.
In some of the above examples, there is no or very little detectable transcription from the LTR/TE in any cell type other than the cancer type in which it was reported, suggesting the activity is specific to a particular TE in a particular cancer. In other cases, CAGE or EST data show that the LTR/TE can be expressed in other normal or cancer cell types, perhaps to a lower degree. Hence the term “cancer-specific” should be considered a relative one. Indeed, the idea that the same TE-promoted gene transcripts occur recurrently in tumors from independent individuals is central to understanding how these transcripts arise. Below we present two models that may explain the phenomenon of onco-exaptation.
The De-repression model
The LOR1a-IRF5 onco-exaptation in HL can be interpreted using a de-repression model. An interferon regulatory factor-binding element was created at the junction of the LOR1a LTR and genomic DNA. In normal cells and in HL cells negative for LOR1a-IRF5, the LTR is methylated and protected from DNase digestion, a state that is lost in de-repressed HL cells. This transcription factor-binding motif is responsive to IRF5 itself and creates a positive feedback loop between IRF5 and the chimeric LOR1a-IRF5 transcript. Thus epigenetic de-repression of this element may reveal an oncogenic exploitation, resulting in high recurrence of LOR1a LTR-driven IRF5 in HL.
A de-repression model explains several experimental observations, such as the necessity for a given set of factors to be present (or absent) for a certain promoter to be active, especially when those factors differ between cell states. Indeed, experiments probing the mechanism of TE/LTR activation have used this line of reasoning, often focusing on DNA methylation [113, 117, 125, 129]. The limitation of these studies is that they fail to determine if a given condition is sufficient for onco-exaptation to arise. For instance, the human genome contains >37,000 THE1 LTR loci (Table 1), and indeed this set of LTRs is generally more active in HL cells compared to B-cells as would be predicted  (unpublished results). The critical question is why particular THE1 LTR loci, such as THE1B-CSF1R, are recurrently de-repressed in HL, yet thousands of homologous LTRs are not.
The Epigenetic Evolution model
Key to the epigenetic evolution model is that there is high epigenetic variance, both between LTR loci and at the same LTR locus between cells in a population. This epigenetic variance fosters regulatory innovation, and increases during oncogenesis. In accord with this idea are several studies showing that DNA methylation variation, or heterogeneity, increases in tumor cell populations and this isn’t simply a global hypomethylation relative to normal cells [207, 208, 209] (reviewed in ). In contrast to the de-repression model, a particular pathogenic molecular state is not sufficient or necessary for TE-driven transcripts to arise; instead the given state only dictates which sets of TEs in the genome are permissive for transcription. Likewise, global de-repression events, such as DNA hypomethylation or mutation of epigenetic regulators, are not necessary, but would increase the rate at which novel transcriptional regulation evolves.
Underpinning this model is the idea that LTRs are highly abundant and self-contained promoters dispersed across the genome that can stochastically initiate low or noisy transcription. This transcriptional noise is a kind of epigenetic variation and thus contributes to cell-cell variation in a population. Indeed, by re-analyzing CAGE datasets of retrotransposon-derived TSSs published by Faulkner et al. , we observed that TE-derived TSSs have lower expression levels and are less reproducible between biological replicates, compared to non-TE promoters (unpublished observations). During malignant transformation, TFs can become deregulated and genome-wide epigenetic perturbations occur [94, 98, 211] which would change the set of LTRs that are potentially active as well as possibly increasing the total level of LTR-driven transcriptional noise. Up-regulation of specific LTR-driven transcripts would initially be weak and stochastic, from the set of permissive LTRs. Those cells gaining an LTR-driven transcript which confers a growth advantage would then be selected for, and the resultant oncogene expression would increase in the tumor population as that epiallele increases in frequency, in a similar fashion as proposed for the epigenetic silencing of tumor suppressor genes [95, 99, 100]. Notably, this scenario also means that within a tumor, LTR-driven transcription would be subject to epigenetic bottleneck effects as well, and that transcriptional LTR noise can become “passenger” expression signals as the cancer cells undergo somatic, clonal evolution.
It may be counter-intuitive to think of evolution and selection as occurring outside the context of genetic variation, but the fact that both genetic mutations and non-genetic/epigenetic variants can contribute to somatic evolution of a cancer is becoming clear [209, 212, 213, 214, 215]. Epigenetic information or variation by definition is transmitted from mother to daughter cells. Thus, in the specific context of a somatic/asexual cell population such as a tumor, this information, which is both variable between cells in the population and heritable, will be subject to evolutionary changes in frequency. DNA methylation in particular has a well-established mechanism by which information (mainly gene repression) is transmitted epigenetically from mother to daughter cells  and DNA hypomethylation at LTRs often correlates with their expression [113, 117, 217]. Thus, this model suggests that one important type of “epigenetic variant” or epiallele is the transcriptional status of the LTR itself, since the phenotypic impact of LTR transcription may be high in onco-exaptation. Especially in light of the fact that large numbers of these highly homologous sequences are spread across the genome, epigenetic variation, and possibly selection, at LTRs creates a fascinating system by which epigenetic evolution in cancer may occur.
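The logic of the epigenetic evolution model can be illustrated with a toy simulation. The following is a minimal sketch, not an analysis from this review: it assumes a fixed-size, asexually dividing cell population, heritable on/off states at a handful of LTR loci, rare stochastic activation with no re-silencing, and a single hypothetical "driver" locus whose activity confers a growth advantage. The function names and all parameter values are illustrative assumptions.

```python
import random


def locus_frequencies(population, n_loci):
    """Fraction of cells in which each LTR locus is transcriptionally active."""
    return [sum(cell[i] for cell in population) / len(population)
            for i in range(n_loci)]


def simulate_epialleles(n_cells=200, n_loci=20, driver=0, advantage=0.5,
                        activation_rate=0.001, generations=100, seed=1):
    """Toy Wright-Fisher-style simulation of heritable LTR activity states.

    Each cell is a tuple of booleans, one per LTR locus (True = active).
    Only the hypothetical `driver` locus affects fitness; every other
    locus is a neutral "passenger". Returns one list of per-locus
    frequencies for the starting population and for each generation.
    """
    rng = random.Random(seed)
    # Start with every permissive LTR silent in every cell.
    population = [(False,) * n_loci for _ in range(n_cells)]
    history = [locus_frequencies(population, n_loci)]
    for _ in range(generations):
        # Selection: cells with an active driver LTR divide more often.
        weights = [1.0 + advantage if cell[driver] else 1.0
                   for cell in population]
        parents = rng.choices(population, weights=weights, k=n_cells)
        # Inheritance of the parental state, plus rare stochastic
        # activation of silent loci (simplification: no re-silencing).
        population = [
            tuple(active or rng.random() < activation_rate for active in cell)
            for cell in parents
        ]
        history.append(locus_frequencies(population, n_loci))
    return history
```

Comparing `history[-1][0]` (the driver) with the remaining entries of `history[-1]` (the passengers) over repeated runs with different seeds gives a feel for how selection and drift could each raise the frequency of an active LTR epiallele under these assumptions.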
Here we have reviewed the growing number of examples of LTR/TE onco-exaptation. Although such TEs have the potential to be deleterious by contributing to oncogenesis if transcriptionally activated, their fixation in the genome and ancient origin suggests that their presence is not subject to significant negative selection. This could be due to the low frequency of onco-exaptation at a particular TE locus and/or to the fact that cancer is generally a disease that occurs after the reproductive years. However, it is generally assumed that negative selection is the reason why TEs are underrepresented near or within genes encoding developmental regulators [218, 219, 220]. Similarly we hypothesize that LTR/TE insertions predisposed to causing potent onco-exaptations at a high frequency would also be depleted by selective forces.
In this review we have also presented two models that may explain such onco-exaptation events. These two models are not mutually exclusive but they do provide alternative hypotheses by which TE-driven transcription may be interpreted. This dichotomy is possibly best exemplified by the ERBB4 case (Fig. 1e). There are two LTR-derived promoters which result in aberrant ERBB4 expression in ALCL. From the de-repression model viewpoint, both LTR elements belong to the MLT1 group (MLT1C and MLT1H) and thus this group can be interpreted as de-repressed. From the epigenetic evolution model viewpoint, this is convergent evolution/selection for onco-exaptations involving ERBB4.
Through application of the de-repression model, TE-derived transcripts could be used as diagnostic markers in cancer. If the set of TE/LTR-derived transcripts is a deterministic consequence of a given molecular state, then by understanding which set of TEs corresponds to which molecular state, it might be possible to assay cancer samples for functional molecular phenotypes. In HL, for example, CSF1R status is prognostically important and this is dependent on the transcriptional state of a single THE1B. HL also has a specific increase in THE1 LTR transcription genome-wide (unpublished observations). Thus, it is reasonable to hypothesize that the prognostic power can be increased if the transcriptional status of all THE1 LTRs is considered. A set of LTRs can then be interpreted as an in situ ‘molecular sensor’ for aberrant NF-κB function in HL/B-cells, for instance.
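One simple way to express the ‘molecular sensor’ idea is as a per-sample score summarizing how many loci of an LTR family are transcribed. The sketch below is an assumption-laden illustration only: the function name, the input layout (a mapping from locus identifier to an expression value) and the expression threshold are hypothetical and are not taken from any study cited here.

```python
def family_activity_score(expression, family_loci, min_expression=1.0):
    """Fraction of an LTR family's loci expressed at or above a threshold.

    expression     -- dict mapping a locus identifier to an expression value
                      in one sample (units are an assumption, e.g. normalized
                      CAGE or RNA-Seq signal)
    family_loci    -- identifiers of all loci assigned to the family
    min_expression -- hypothetical cut-off for calling a locus "active"
    """
    loci = list(family_loci)
    if not loci:
        return 0.0
    # Loci absent from the expression table are treated as silent.
    active = sum(1 for locus in loci
                 if expression.get(locus, 0.0) >= min_expression)
    return active / len(loci)
```

Under the de-repression model, such a score computed for, say, all THE1 loci would be expected to track the underlying molecular state of a sample; whether it adds prognostic information beyond a single locus remains a hypothesis to be tested.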
The epigenetic evolution model proposes that LTR-driven transcripts can be interpreted as a set of epimutations in cancer, similar to how oncogenic mutations are analyzed. Genes that are recurrently (and independently) onco-exapted in multiple different tumors of the same cancer type may be a mark of selective pressure for acquiring that transcript. This is distinct from the more diverse/noisy “passenger LTR” transcription occurring across the genome. These active but “passenger LTRs” may be expressed to a high level within a single tumor population due to epigenetic drift and population bottlenecks but would be more variable across different tumors. Thus analysis of recurrent and cancer-specific TE-derived transcripts may enrich for genes of significance to tumor biology.
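The distinction between recurrent and “passenger” LTR-driven transcripts can be sketched as a simple recurrence count across tumors. This is a minimal illustration under stated assumptions: the function name, the input layout (tumor identifier mapped to the LTR-driven transcripts detected in that tumor) and the recurrence cut-off are hypothetical, and a real analysis would also need to account for expression level and for transcripts present in matched normal tissue.

```python
from collections import Counter


def recurrent_transcripts(tumor_calls, min_fraction=0.5):
    """Return LTR-driven transcripts detected in a large share of tumors.

    tumor_calls  -- dict mapping a tumor identifier to an iterable of
                    LTR-driven transcript names detected in that tumor
    min_fraction -- hypothetical cut-off: minimum fraction of tumors in
                    which a transcript must occur to count as recurrent
    """
    n_tumors = len(tumor_calls)
    if n_tumors == 0:
        return {}
    # Count each transcript at most once per tumor.
    counts = Counter(name
                     for calls in tumor_calls.values()
                     for name in set(calls))
    return {name: count / n_tumors
            for name, count in counts.items()
            if count / n_tumors >= min_fraction}
```

Transcripts returned by such a filter would be candidates for selected onco-exaptations, while those seen in only one or a few tumors would be more consistent with passenger LTR activity.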
While we focused in this review on TE-initiated transcription in cancer, many of the concepts presented here can be applied to other regulatory functions of TEs such as enhancers, insulators, or repressors of transcription. Although less straightforward to measure, it is probable that perturbations to such TE regulatory functions contribute to some malignancies. Furthermore, several studies have shown that TEs play substantial roles in cryptic splicing in humans [221, 222, 223] and thus may be a further substrate of transcriptional innovation in cancer, particularly since DNA methylation state can affect splicing .
Regardless of the underlying mechanism, onco-exaptation offers a tantalizing opportunity to model evolutionary exaptation. Specifically, questions such as “How do TEs influence the rate of transcriptional/regulatory change?” can be tested in cell culture experiments. As more studies that focus on regulatory aberrations in cancer are performed in the coming years, we predict that this phenomenon will become increasingly recognized as a significant force shaping transcriptional innovation in cancer. Moreover, we propose that studying such events will provide insight into how TEs have contributed to reshaping transcriptional patterns during species evolution.
We thank Matt Lorincz and the anonymous reviewers for comments and helpful suggestions on this manuscript. We apologize to colleagues and other researchers if we failed to cite relevant work on this subject.
Work on this topic in our laboratory has been funded by grants from the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC), the Canadian Cancer Society and the Leukemia and Lymphoma Society of Canada, with core support provided by the BC Cancer Agency. AB is supported by a studentship award from NSERC.
Availability of data and materials
Data sharing not applicable as no datasets were generated or analyzed during the current study.
AB and DLM wrote the manuscript and both authors approved the final version.
The authors declare that they have no competing interests.
- 1. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409(6822):860–921.
- 4. Gould SJ, Vrba ES. Exaptation-A Missing Term in the Science of Form. Paleobiology. 1982;8(1):4–15.
- 7. Richardson SR, Doucet AJ, Kopera HC, Moldovan JB, Garcia-Pérez JL, Moran JV. The Influence of LINE-1 and SINE Retrotransposons on Mammalian Genomes. Microbiol Spectr. 2015;3(2). doi: 10.1128/microbiolspec.MDNA1123-0061-2014
- 9. Hancks DC, Kazazian HH. Roles for retrotransposon insertions in human disease. Mob DNA. 2016;7(1):1–28.
- 10. Gerdes P, Richardson SR, Mager DL, Faulkner GJ. Transposable elements in the mammalian embryo: pioneers surviving through stealth and service. Genome Biol. 2016;17(1):1–17.
- 26. Magiorkinis G, Blanco-Melo D, Belshaw R. The decline of human endogenous retroviruses: extinction and survival. Retrovirology. 2015;12(1):1–12.
- 27. Rosenberg N, Jolicoeur P. Retroviral pathogenesis. In: Coffin JM, Hughes SH, Varmus H, editors. Retroviruses. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 1997. p. 475–586.
- 31. Chen T, Meng Z, Gan Y, Wang X, Xu F, Gu Y, Xu X, Tang J, Zhou H, Zhang X, et al. The viral oncogene Np9 acts as a critical molecular switch for co-activating beta-catenin, ERK, Akt and Notch1 and promoting the growth of human leukemia stem/progenitor cells. Leukemia. 2013;27(7):1469–78.
- 34. Gomez-del Arco P, Kashiwagi M, Jackson AF, Naito T, Zhang J, Liu F, Kee B, Vooijs M, Radtke F, Redondo JM, et al. Alternative promoter usage at the Notch1 locus supports ligand-independent signaling in T cell development and leukemogenesis. Immunity. 2010;33(5):685–98.
- 35. Thorsen K, Schepeler T, Øster B, Rasmussen MH, Vang S, Wang K, Hansen KQ, Lamy P, Pedersen JS, Eller A, et al. Tumor-specific usage of alternative transcription start sites in colorectal cancer identified by genome-wide exon array analysis. BMC Genomics. 2011;12(1):1–14.
- 39. O’Connell MR, Sarkar S, Luthra GK, Okugawa Y, Toiyama Y, Gajjar AH, Qiu S, Goel A, Singh P. Epigenetic changes and alternate promoter usage by human colon cancers for expressing DCLK1-isoforms: Clinical Implications. Sci Rep. 2015;5:14983.
- 44. Mager DL, Stoye JP. Mammalian Endogenous Retroviruses. Microbiol Spectr. 2015;3(1). doi: 10.1128/microbiolspec.MDNA3-0009-2014
- 69. Criscione SW, Theodosakis N, Micevic G, Cornish TC, Burns KH, Neretti N, Rodić N. Genome-wide characterization of human L1 antisense promoter-driven transcripts. BMC Genomics. 2016;17(1):1–15.
- 75. Deininger P. Alu elements: know the SINEs. Genome Biol. 2011;12(12):1–12.
- 79. Jjingo D, Conley AB, Wang J, Mariño-Ramírez L, Lunyak VV, Jordan IK. Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression. Mob DNA. 2014;5:14.
- 81. Lynch Vincent J, Nnamani Mauris C, Kapusta A, Brayer K, Plaza Silvia L, Mazur Erik C, Emera D, Sheikh Shehzad Z, Grützner F, Bauersachs S, et al. Ancient Transposable Elements Transformed the Uterine Regulatory Landscape and Transcriptome during the Evolution of Mammalian Pregnancy. Cell Rep. 2015;10(4):551–61.
- 92. Yang F, Wang PJ. Multiple LINEs of retrotransposon silencing mechanisms in the mammalian germline. Semin Cell Dev Biol. 2016; in press.
- 100. Kazanets A, Shorstova T, Hilmi K, Marques M, Witcher M. Epigenetic silencing of tumor suppressor genes: Paradigms, puzzles, and potential. Biochimica et Biophysica Acta (BBA) - Reviews on Cancer. 2016;1865(2):275–88.
- 112. Haase K, Mosch A, Frishman D. Differential expression analysis of human endogenous retroviruses based on ENCODE RNA-seq data. BMC Med Genet. 2015;8(1):71.
- 114. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 2015;6(1):1–6.
- 116. Martín-Moreno AM, Roncador G, Maestre L, Mata E, Jiménez S, Martínez-Torrecuadrada JL, Reyes-García AI, Rubio C, Tomás JF, Estévez M, et al. CSF1R Protein Expression in Reactive Lymphoid Tissues and Lymphoma: Its Relevance in Classical Hodgkin Lymphoma. PLoS One. 2015;10(6):e0125203.
- 119. Kreher S, Bouhlel MA, Cauchy P, Lamprecht B, Li S, Grau M, Hummel F, Köchert K, Anagnostopoulos I, Jöhrens K, et al. Mapping of transcription factor motifs in active chromatin identifies IRF5 as key regulator in classical Hodgkin lymphoma. Proc Natl Acad Sci. 2014;111(42):E4513–22.
- 120. Mancl ME, Hu G, Sangster-Guity N, Olshalsky SL, Hoops K, Fitzgerald-Bocarsly P, Pitha PM, Pinder K, Barnes BJ. Two Discrete Promoters Regulate the Alternatively Spliced Human Interferon Regulatory Factor-5 Isoforms: Multiple isoforms with distinct cell type-specific expression, localization, regulation, and function. J Biol Chem. 2005;280(22):21078–90.
- 126. Gao H, Guan M, Sun Z, Bai C. High c-Met expression is a negative prognostic marker for colorectal cancer: a meta-analysis. Tumor Biol. 2015;36(2):515–20.
- 128. Fantom-Consortium. A promoter-level mammalian expression atlas. Nature. 2014;507(7493):462–70.
- 133. Nagai M, Furihata T, Matsumoto S, Ishii S, Motohashi S, Yoshino I, Ugajin M, Miyajima A, Matsumoto S, Chiba K. Identification of a new organic anion transporting polypeptide 1B3 mRNA isoform primarily expressed in human cancerous tissues and cells. Biochem Biophys Res Commun. 2012;418(4):818–23.
- 138. Lock FE, Rebollo R, Miceli-Royer K, Gagnier L, Kuah S, Babaian A, Sistiaga-Poveda M, Lai CB, Nemirovsky O, Serrano I, et al. Distinct isoform of FABP7 revealed by screening for retroelement-activated genes in diffuse large B-cell lymphoma. Proc Natl Acad Sci. 2014;111(34):E3534–43.
- 144. Masliah-Planchon J, Bièche I, Guinebretière J-M, Bourdeaut F, Delattre O. SWI/SNF Chromatin Remodeling and Human Malignancies. Ann Rev Pathol Mech Dis. 2015;10(1):145–71.
- 149. Zhou P, Sun L, Liu D, Liu C, Sun L. Long Non-Coding RNA lincRNA-ROR Promotes the Progression of Colon Cancer and Holds Prognostic Value by Associating with miR-145. Pathol Oncol Res. 2016;22(4):733–40.
- 155. Yang F, Lyu S, Dong S, Liu Y, Zhang X, Wang O. Expression profile analysis of long noncoding RNA in HER-2-enriched subtype breast cancer by next-generation sequencing and bioinformatics. OncoTargets Ther. 2016;9:761–72.
- 156. Zeng Z, Bo H, Gong Z, Lian Y, Li X, Li X, Zhang W, Deng H, Zhou M, Peng S, et al. AFAP1-AS1, a long noncoding RNA upregulated in lung cancer and promotes invasion and metastasis. Tumor Biol. 2016;37(1):729–37.
- 158. Wu W, Bhagat TD, Yang X, Song JH, Cheng Y, Agarwal R, Abraham JM, Ibrahim S, Bartenstein M, Hussain Z, et al. Hypomethylation of Noncoding DNA Regions and Overexpression of the Long Noncoding RNA, AFAP1-AS1, in Barrett's Esophagus and Esophageal Adenocarcinoma. Gastroenterology. 2013;144(5):956–66. e954.
- 167. Xue M, Chen W, Li X. Urothelial cancer associated 1: a long noncoding RNA with a crucial role in cancer. J Cancer Res Clin Oncol. 2016;142(7):1407–19.
- 172. Cruickshanks HA, Vafadar-Isfahani N, Dunican DS, Lee A, Sproul D, Lund JN, Meehan RR, Tufarelli C. Expression of a large LINE-1-driven antisense RNA is linked to epigenetic silencing of the metastasis suppressor gene TFPI-2 in cancer. Nucleic Acids Res. 2013;41(14):6857–69.
- 176. Tomczak K, Czerwińska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol. 2015;19(1A):A68–77.
- 177. Hashimoto K, Suzuki AM, Dos Santos A, Desterke C, Collino A, Ghisletti S, Braun E, Bonetti A, Fort A, Qin X-Y, et al. CAGE profiling of ncRNAs in hepatocellular carcinoma reveals widespread activation of retroviral LTR promoters in virus-induced tumors. Genome Res. 2015;25(12):1812–24.
- 180. Ohnuki M, Tanabe K, Sutou K, Teramoto I, Sawamura Y, Narita M, Nakamura M, Tokunaga Y, Nakamura M, Watanabe A, et al. Dynamic regulation of human endogenous retroviruses mediates factor-induced reprogramming and differentiation potential. Proc Natl Acad Sci. 2014;111(34):12426–31.
- 187. Mager DL, Medstrand P. Retroviral Repeat Sequences. In: eLS. edn. Hoboken: Wiley; 2005.
- 193. Beyer U, Kronung SK, Leha A, Walter L, Dobbelstein M. Comprehensive identification of genes driven by ERV9-LTRs reveals TNFRSF10B as a re-activatable mediator of testicular cancer cell death. Cell Death Differ. 2016;23:64–75.
- 194. Krönung SK, Beyer U, Chiaramonte ML, Dolfini D, Mantovani R, Dobbelstein M. LTR12 promoter activation in a broad range of human tumor cells by HDAC inhibition. Oncotarget. 2016;7(23):33484–497.
- 201. Prensner JR, Iyer MK, Balbin OA, Dhanasekaran SM, Cao Q, Brenner JC, Laxman B, Asangani IA, Grasso CS, Kominsky HD, et al. Transcriptome sequencing across a prostate cancer cohort identifies PCAT-1, an unannotated lincRNA implicated in disease progression. Nat Biotech. 2011;29(8):742–9.
- 205. Böhne A, Brunet F, Galiana-Arnoux D, Schultheis C, Volff J-N. Transposable elements as drivers of genomic and biological diversity in vertebrates. Chromosom Res. 2008;16(1):203–15.
- 223. Darby MM, Leek JT, Langmead B, Yolken RH, Sabunciyan S. Widespread splicing of repetitive element loci into coding regions of gene transcripts. Hum Mol Gen. 2016; in press.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.