1 Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel
Reprint requests to: Shulamit Michaeli, Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan 52900, Israel; e-mail: michaes{at}mail.biu.ac.il; fax: 972-3-5351824.
| ABSTRACT |
|---|
Keywords: snoRNA; trypanosomatids; C/D; H/ACA; pseudouridines; 2'-O-methyls
| INTRODUCTION |
|---|
The C/D snoRNAs that guide 2'-O-methylation are named after short sequence motifs, the C-box (RUGAUGA; R designates purine) and the D-box (CUGA). These boxes, together with the short sequences near the 5' end and 3' end of the RNA, are essential for their accumulation, processing, localization, and function (Cavaille and Bachellerie 1996; Xia et al. 1997; Lange et al. 1998; Watkins et al. 2002). Most of these snoRNAs contain, between the C and D motifs, sequences related to these boxes known as C' boxes and D' boxes. Four core proteins bind the C/D snoRNAs: fibrillarin (Nop1p in yeast), Nop56p, Nop58p, and the 15.5K protein (Snu13p in yeast). The region of perfect complementarity (10–21 nt) between the target RNA and the snoRNA lies upstream from the D or D' sequences. The methylated nucleotide is always located 5 nt upstream from the D-box or D' box within the domain of interaction between the snoRNA and the target. This is known as the +5 rule (Kiss-Laszlo et al. 1996). C/D snoRNAs usually carry two domains, one upstream from the D-box and one upstream from the D' box, that are complementary to two targets. Such snoRNAs can potentially guide modification at two sites (double guiders). When only one of the sites is used to guide modification, the snoRNA is a single guider. In several cases, however, both sites are complementary to the target RNA but serve different roles; in U14, for example, one of the guide sequences is essential for 18S processing, whereas the second is essential for methylation (Li et al. 1997; Dunbar and Baserga 1998).
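The +5 rule described above can be illustrated with a short Python sketch. It assumes that the guide is a fixed-length stretch directly abutting the D (or D') box and that only Watson-Crick pairs are allowed; the function names, the default guide length, and the toy sequences are hypothetical and are not taken from this study.

```python
def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def predict_nm_site(snorna: str, target: str, d_box_start: int, guide_len: int = 10):
    """Predict the 0-based target position methylated under the +5 rule.

    Assumption: the guide is the `guide_len` nt immediately upstream from
    the D (or D') box and pairs perfectly with the target (no G-U pairs).
    Returns None when no perfectly complementary site is found.
    """
    if d_box_start < guide_len:
        return None
    guide = snorna[d_box_start - guide_len:d_box_start]
    site = target.find(reverse_complement(guide))
    if site < 0:
        return None
    # The fifth snoRNA nucleotide upstream from the box pairs with the
    # fifth nucleotide (index 4) of the complementary target segment.
    return site + 4


# Toy sequences (hypothetical, not a real snoRNA or rRNA).
snorna = "AUGAUGACCAUCCGUACGUCUGAGG"
target = "GGCCACGUACGGAUUUAA"
position = predict_nm_site(snorna, target, snorna.rfind("CUGA"))
```

In this toy case the D-box is located with `rfind`; for a real snoRNA the D and D' boxes would have to be assigned first, since CUGA can occur by chance elsewhere in the molecule.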
In most eukaryotes studied so far, the snoRNAs that govern pseudouridylation consist of two hairpin domains connected by a single-stranded hinge, the H (AnAnnA) domain, and a tail region, the ACA-box. Four core proteins, namely, Gar1p, Nop10p, Nhp2p, and Cbf5p/dyskerin, were identified in eukaryotic H/ACA snoRNPs. With the exception of Gar1p, all core proteins are essential for snoRNA stability. Two short rRNA recognition motifs of the snoRNA, which base-pair with rRNA sequences flanking the uridine to be converted to pseudouridine, have been characterized (Ganot et al. 1997; Tollervey and Kiss 1997). The pseudouridine is always located 14–16 nt upstream from the H-box or ACA-box of the snoRNA. In yeast and mammals, the two hairpin domains are essential for rRNA modification, even when the RNA contains a single guide sequence (Bortolin et al. 1999). The two major structural domains of the H/ACA snoRNA, the 5' hairpin (hp) followed by the H-box and the 3' hairpin followed by the ACA-box, share striking structural and functional similarities. Pseudouridylation pockets are found equally frequently in the 5' or 3' end of the molecule, and many snoRNAs can direct pseudouridylation of rRNA at two different positions (Ganot et al. 1997).
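The motif and spacing constraints described above lend themselves to a simple scan. The following Python sketch lists candidate H-boxes matching the AnAnnA consensus and tests whether a pocket position lies 14–16 nt upstream from a box. The function names are hypothetical, and the way the distance is counted (box start minus the index of the snoRNA position facing the target uridine) is an assumption made for illustration.

```python
import re

# AnAnnA consensus; the zero-width lookahead reports overlapping candidates.
H_BOX = re.compile(r"(?=A[ACGU]A[ACGU]{2}A)")


def find_h_box_candidates(rna: str) -> list:
    """Return the 0-based start of every AnAnnA match in an RNA sequence."""
    return [match.start() for match in H_BOX.finditer(rna)]


def within_psi_window(pocket_pos: int, box_start: int, lo: int = 14, hi: int = 16) -> bool:
    """Check that the pocket position lies lo..hi nt upstream from the box.

    Assumption: distance is counted as box_start - pocket_pos.
    """
    return lo <= box_start - pocket_pos <= hi


candidates = find_h_box_candidates("GGAUACCAGG")  # toy sequence
```

A consensus as short as AnAnnA matches frequently by chance, so such a scan would only be useful in combination with the hairpin structure described in the text.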
All modification-guiding snoRNAs characterized so far are transcribed by RNA polymerase II, whereas snoRNAs involved in rRNA processing can also be transcribed in plants by RNA polymerase III (Brown et al. 2003a), and in trypanosomes U3 is transcribed by RNA polymerase III using a divergently transcribed tRNA as an extragenic promoter (Nakaar et al. 1994). Vertebrates, plants, and yeast contain independently transcribed snoRNA genes flanked by promoter, enhancer, and termination sequences (Brown et al. 2003a). In vertebrates, the majority of the snoRNAs that guide modifications are located within introns. The intronic snoRNAs are transcribed from the host gene promoters. In vertebrates and yeast, where each intron carries only a single snoRNA, the processing is largely splicing-dependent (Ooi et al. 1998; Filipowicz and Pogacic 2002).
Trypanosomes are unicellular parasitic protozoa that are the causative agents of several infamous parasitic diseases such as African trypanosomiasis caused by Trypanosoma brucei, Chagas disease caused by Trypanosoma cruzi, and leishmaniasis caused by Leishmania species. Trypanosomatids are well known for harboring exotic and unique RNA processing events such as nuclear pre-mRNA trans-splicing (Liang et al. 2003a) and mitochondrial RNA editing (Simpson et al. 2003). In addition, the large rRNA subunit undergoes specific cleavages that yield two large rRNA molecules and four small RNAs, ranging in size from 220 to 76 nt (White et al. 1986).
Relatively little is known about snoRNAs in trypanosomatids. Early studies suggest the existence of ~100 2'-O-methylated nucleotides on the rRNA (Gray 1979). The first trypanosome C/D snoRNA (snoRNA-2) was identified in Leptomonas collosoma (Levitan et al. 1998). Later, snoRNAs and reiterated gene clusters encoding snoRNAs were identified in T. brucei (Dunbar et al. 2000a,b). Whereas the trypanosome C/D snoRNAs fit the prototype C/D snoRNA in eukaryotes and Archaea, the trypanosome H/ACA RNAs possess unique features. Most if not all of these guide RNAs are single-hairpin RNAs and carry an AGA-box instead of an ACA-box (Uliel et al. 2004). After the discovery of the single-hairpin guide RNAs in trypanosomes, such guide RNAs were found in Archaea (Tang et al. 2002; Rozhdestvensky et al. 2003) and more recently in Euglena (Russell et al. 2004). Prior to this study, only ~20 C/D snoRNAs and ~10 H/ACA-like RNAs had been described in trypanosomatids; they are listed in Uliel et al. (2004). The organization of trypanosome snoRNA genes most closely resembles that of plants because the genes are clustered and each cluster carries a mixture of both C/D and H/ACA RNAs (Brown et al. 2003a; Uliel et al. 2004). The trypanosome snoRNAs are processed from long polycistronic transcripts (Xu et al. 2001; Liang et al. 2003b), but the machinery involved in this processing is currently unknown.
In this study, bioinformatics and experimental tools were used to describe, on a genomic scale, the snoRNAs that guide methylation and pseudouridylation on rRNA in T. brucei. The data suggest that most but not all the snoRNAs are clustered in reiterated repeats that carry a mixed population of C/D and H/ACA-like RNAs. All the H/ACA-like RNAs that potentially can guide modification are single-hairpin RNAs. Predicting the modifications guided by these RNAs and using partial mapping data, we identified 84 2'-O-methyls (Nms) and 32 pseudouridines (Ψs) on rRNA, suggesting a high number of Nms on rRNA relative to the genome size. Many of these modifications are species-specific and enlarge a domain already rich with such modifications. However, of the trypanosome-specific modifications, 40% are also predicted to exist in unique positions outside the highly conserved domains. These numerous modifications increase the stability of the ribosomes and are perhaps beneficial in coping with the adverse conditions associated with the cycling of these parasites between the mammalian and insect hosts.
| RESULTS |
|---|
The majority of the clusters are repeated in the chromosome, with copy numbers ranging from 1.4 to a maximum of 7.5. The last repeat is almost always incomplete, that is, it lacks portions of its 3' end. However, we also detected complete repeats that are repeated seven and five times. All the snoRNA clusters identified in this study are flanked by protein-coding genes. We noticed that the 5' flanking protein-coding gene is situated ~500 bp upstream from the beginning of the cluster, whereas the location of the 3' flanking protein-coding gene varies considerably and can be as close as 10–20 nt downstream from the end of the cluster. These data suggest that the region upstream from the cluster may have "promoter-like" activity, as was recently demonstrated (Liang et al. 2004).
In a second type of cluster, not only the snoRNAs are repeated, but the repeat also includes a protein-coding gene. An example of such a cluster is TB9Cs4, which carries the gene for glycerol kinase (GLK1) (Tb09.211.3560). This gene, along with its neighboring snoRNA, is repeated 5.4 times. The TB11Cs1 repeat also carries a protein-coding gene (GP63). The cluster appears once carrying two copies of GP63 (Tb11.02.5640; Tb11.02.5630) at the 3' end of the cluster. The second cluster also carries two GP63 genes: Tb11.02.5620 and a truncated version (Tb11.02.5610).
Another type of snoRNA organization is the duplication of only a portion of a cluster, as is the case for TB5Cs1. The cluster contains two copies of a C/D snoRNA (TB5Cs1C1), and another copy of this C/D snoRNA exists elsewhere on the same chromosome. Moreover, the protein-coding genes flanking these snoRNAs are different. Another interesting cluster is TB8Cs3, which carries a full repertoire of snoRNAs (three C/Ds and one H/ACA) and is also found in a form consisting of only two of the C/D snoRNAs. In each case, the snoRNA clusters are flanked by different sequences.
Expression of snoRNA genes
Primer extension was used to detect the expression of different snoRNA genes (TB9Cs2H1, TB6Cs1H2, TB6Cs1H4, TB10Cs3H1, TB9Cs2C5, and TB5Cs1C1). The results, presented in Figure 2A, indicate the expression of clusters 2, 3, 8, and 13. Since snoRNAs are transcribed as polycistronic transcripts, the expression of any snoRNA within a cluster suggests that this cluster is actively transcribed (Roberts et al. 1998; Dunbar et al. 2000a,b; Liang et al. 2001; Xu et al. 2001). Previous studies identified the expression of the following C/D snoRNAs: TB6Cs1C1, TB6Cs1C2, TB6Cs1C3, TB8Cs1C4, TB8Cs3C3, TB9Cs2C1, TB9Cs2C4, TB10Cs2C2, TB10Cs3C1, TB10Cs3C4, TB10Cs3C5, TB11Cs1C2, TB11Cs2C1, TB11Cs2C2, and TB11Cs3C2 (Dunbar et al. 2000a; see Table 1). Additionally, our previous studies confirmed the expression of Leptomonas snoRNAs (Liang et al. 2001, 2004; Xu et al. 2001). The T. brucei homologs of these snoRNAs are TB5Cs1C1, TB9Cs1C1, TB9Cs4C1, TB9Cs4C2, TB11Cs4C2, and TB11Cs4C3 (see Table 1). Collectively, these data suggest the expression of 13 clusters: clusters 2, 3, 4, 6, 7, 8, 10, 12, 13, 15, 16, 17, and 18. Whereas the study of Dunbar et al. (2000a) indicated the expression of only C/D snoRNAs, the data presented here suggest the expression of H/ACA-like RNAs as well. For instance, in cluster 4 the expression of two H/ACA-like RNAs, TB6Cs1H2 and TB6Cs1H4, was confirmed, suggesting that, indeed, all the RNAs within a polycistronic transcript are most likely expressed. Note that the snoRNA precursors can easily be detected in steady-state RNA by RT-PCR both in L. collosoma and T. brucei (Xu et al. 2001; Liang et al. 2003b, 2004).
The distance between the different snoRNA genes (intergenic region) ranges from 15 to 450 nt. We have previously demonstrated that although the length of the intergenic region can vary, 10 nt is essential for proper processing of the snoRNA (Xu et al. 2001; Liang et al. 2004). Interestingly, as shown in Figure 2A, we detected efficient processing of the snoRNA TB10Cs3H1 (cluster 16) that is spaced by only 15 nt from the upstream RNA.
The repertoire and properties of the T. brucei C/D and H/ACA-like snoRNAs
The repertoire of the 57 C/D snoRNAs identified in these clusters is presented in Table 1. All the C/D snoRNAs range in size from 67 to 118 nt. Of the 57 C/D snoRNAs, 27 have the potential to guide two modification sites. In 14 out of the 27 snoRNAs, the sites lie adjacent to each other on the target RNA. In the other cases, the two sites are either located on the same RNA or even on two different RNA molecules. We were able to identify the targets guided by 56 out of 57 C/D snoRNAs. In addition, 39 of the T. brucei C/D snoRNAs have homologs in other organisms such as yeast (these snoRNAs are designated as SnX), human (designated as Ux), or Arabidopsis (designated as AtsnoX). The fact that we identified homologs of the T. brucei RNAs in other eukaryotes suggests that all these snoRNAs should be expressed in trypanosomes as well. Interestingly, 27 snoRNAs seem to be trypanosome-species specific (cf. the two columns in Table 1). Out of these, six snoRNAs were shown to be expressed in T. brucei (Dunbar et al. 2000b), and four were shown to be expressed in L. collosoma (Xu et al. 2001; Liang et al. 2004).
The analysis presented here suggests that at least 38 of the C/D snoRNAs are expressed, but since the rest are situated in expressed clusters, it is reasonable to assume that all these snoRNAs are expressed.
The size of the C/D snoRNAs and their 5' and 3' ends were deduced based on experimental mapping data of several of these snoRNAs. The mapping data indicate that the 5' end of the molecule is situated 1–5 nt upstream from the C-box. In those cases where the 5' end was experimentally mapped, we indicated the exact location. For the remaining molecules, we provided the sequence of the 5 nt upstream from the C-box. Based on experimental data, the 3' end of the C/D molecule is found 1–3 nt downstream from the D-box. For those RNAs with no available experimental data, we provided the sequence of the 3 nt downstream from the D-box. Interestingly, unlike most of the eukaryotic C/D snoRNAs, as well as those described in L. collosoma (Xu et al. 2001; Liang et al. 2004), the 3' and 5' ends of T. brucei snoRNAs cannot form a perfect stem. Note that the comparative analysis between T. brucei and T. cruzi cannot be helpful in determining the ends of the molecule, since the sequence of the C/D snoRNA is not conserved outside the domain that is complementary to the target site (see Fig. 4).
The potential targets for C/D and H/ACA-like RNAs
The base-pair interactions between the guide RNAs and their targets are presented in Figure 3 (Fig. 3A for C/D and Fig. 3B for H/ACA-like RNAs). The interaction domain between the C/D RNA and its target is relatively easy to find, since there is perfect complementarity of 10–16 nt between the C/D snoRNA and its target site. As suggested previously (Levitan et al. 1998; Dunbar et al. 2000b; Xu et al. 2001), the +5 rule for guiding modifications applies to all C/D RNAs identified in this study. This is in contrast to the guiding rules suggested for the C/D RNAs present in the SLA1 locus (cluster 19) (Roberts et al. 1998). Interestingly, TB10Cs2C1 also has the potential to base-pair (10 bp) with the ITS2 (internal transcribed spacer 2) region of the rRNA precursor, as shown in Figure 3A (boxed). SnoRNA interactions with pre-rRNA are relatively rare. However, U8 snoRNA has been shown in vertebrates to be essential in ITS2 processing; it base-pairs with the precursor, serving as a chaperone, but has no methylation guide function (Peculis 1997; Michot et al. 1999). It remains to be seen if TB10Cs2C1 guides modification on pre-rRNA. For only one of the C/D snoRNAs (TB11Cs2C2), no target on either rRNA or snRNAs could be identified. However, this RNA is in the SLA1 locus, which encodes RNAs with special functions. It is therefore possible that this RNA may either function in RNA processing or direct modification on a novel target.
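The search for a stretch of perfect complementarity between a C/D snoRNA and a candidate target can be sketched in Python as a longest-common-substring problem between the target and the reverse complement of the snoRNA. This is an illustration only and not the procedure used in this study; the function names and toy sequences are hypothetical, and only Watson-Crick pairs are considered (G-U pairs are ignored).

```python
def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def longest_complementarity(snorna: str, target: str):
    """Return (length, snorna_start, target_start) of the longest stretch
    of perfect antisense complementarity; (0, -1, -1) if there is none.
    Both start positions are 0-based and refer to the 5' end of the stretch.
    """
    probe = reverse_complement(snorna)
    best = (0, -1, -1)
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    length = curr[j]
                    probe_start = i - length
                    # Map the probe coordinates back onto the snoRNA strand.
                    sno_start = len(snorna) - probe_start - length
                    best = (length, sno_start, j - length)
        prev = curr
    return best


# Toy sequences (hypothetical, not a real snoRNA or rRNA).
hit = longest_complementarity("AUGAUGACCAUCCGUACGUCUGAGG", "GGCCACGUACGGAUUUAA")
```

A candidate would then be retained only if the stretch reaches the 10–16 nt reported above and lies directly upstream from a D or D' box, so that the +5 rule can be applied.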
Sequence and structure conservation among T. brucei, T. cruzi, and L. major
To identify features common to trypanosome snoRNAs that may assist in finding the snoRNAs that we have not yet identified (see Discussion), we examined the conservation of C/D and H/ACA RNAs in three trypanosomatid species: T. brucei, T. cruzi, and L. major. An example of such a comparison is presented in Figure 4A, indicating that the H/ACA-like RNAs are slightly more conserved (58%–75% identity) at the primary sequence level compared with C/D RNAs (43%–75% identity). In other examples, we noticed that on the primary sequence level, the C/D snoRNAs are more conserved than the H/ACA-like RNAs because of the conservation in the boxes and in the extensive interaction domains. Note that the percentage of identity among C/D homolog molecules depends on the length of the molecule. In cases in which the molecule is short and the two extensive complementary sequences to the target exist, the overall identity is high, yet the region between the conserved domains can be highly divergent.
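Because the identity values above depend on molecule length and on how gaps are treated, it helps to make the calculation explicit. The following Python sketch computes percent identity for two pre-aligned sequences; the text does not state how identity was computed in this study, so the choice of denominator (all aligned columns except those where both sequences have a gap) is an assumption, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns in which both sequences carry a gap ('-') are
    ignored, and a gap opposite a nucleotide counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy alignment: four identical columns out of six.
identity = percent_identity("AUGC-A", "AUGGCA")
```

With this definition, a short molecule dominated by two conserved guide regions and the boxes scores high even when the sequence between them has diverged completely, which is the length effect noted above.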
The C- and D-boxes are highly conserved, but deviations can be found in the C' and D' boxes. For all C/D snoRNAs, the homologous guide RNAs in the three or four trypanosomatid species have the potential to guide the same targets, either at adjacent sites or on two different RNA molecules. For example, TB3Cs2C1 has two targets: one on SSU and one on LSU. An inspection of the conservation and compensatory changes in the two H/ACA RNAs (TB8Cs3H1 and TB9Cs3H1) reveals several conserved structural features of the trypanosome H/ACA-like RNAs (see Fig. 4B).
The H/ACA-like RNAs show a high degree of conservation in their secondary structure. The conserved domains are as follows: the lower stem consists of at least 6 bp; the pseudouridylation pocket can range from 5 to 10 nt; and the upper stem usually contains four conserved base pairs. These features can be clearly seen in Figure 4B. The structure of the stems is essential for the functioning of these molecules, since compensatory changes in these domains exist in T. brucei, T. cruzi, and L. major RNAs. There is great variation in the length of the upper stem, in its degree of complementarity (number of bulges), and in the size of the terminal loop. Mutation analysis in L. collosoma is currently in progress to determine the structural features essential for RNA processing, RNP biogenesis, and nucleolar localization of these RNAs.
The localization of modifications guided by the C/D and H/ACA-like RNAs on the rRNA secondary structure
Next, we were interested in mapping the potential modifications guided by both C/D and H/ACA RNAs, and in comparing the pattern of modifications on the rRNAs of trypanosomes to those described in humans, yeast, and plants. The results are summarized in Table 3, which specifies the homologs of trypanosome C/D snoRNAs from yeast, humans, and plants, including their potential target sites on different RNAs (LSU 5' half and 3' half, SSU, or 5.8S rRNA). The results indicate 84 potential 2'-O-methylations on the rRNA in T. brucei. Among those, 44 predicted sites were found to be modified in at least one other organism. Of these modifications, 23 are shared between plants, humans, yeast, and trypanosomes and therefore represent the most highly conserved modifications. Six modifications seem to be shared only by plants and trypanosomes; three by trypanosomes, humans, and plants; four by plants, yeast, and trypanosomes; and four by trypanosomes and yeast. A great overlap exists between plants and trypanosomes. The most striking finding was the number of species-specific modifications identified in trypanosomes.
on rRNA
at positions 581 and 618 in the LSU 3' half were identified, but the RNAs needed to target these modifications were not identified in this study.
| DISCUSSION |
|---|
Ψs is not equal to the quantity of Nms found in yeast and mammals (Decatur and Fournier 2002).
The repertoire described compared to what we expect
The repertoire described here most probably represents most but not all of the small RNAs that guide modification in T. brucei. Only 84 Nms are predicted to exist on rRNA, based on this study, but early studies suggest the existence of as many as 100 Nms in Crithidia rRNA (Gray 1979). In addition, we have not yet identified any guide RNA, such as an scaRNA, that guides modification on trypanosome snRNAs. scaRNAs are chimeric molecules carrying both C/D and H/ACA functions, which are localized in metazoa in special compartments near the nucleolus, the Cajal bodies (Richard et al. 2003). Modifications were mapped on trypanosome snRNAs (Li et al. 2000). We therefore expect to find snoRNAs that will guide these modifications. At this point, we cannot exclude the possibility that enzymes mediate many of these modifications. Indeed, in yeast, U2 Ψ35 and Ψ44 are generated by the enzymes Pus7p and Pus1p, respectively (Massenet et al. 1999; Ma et al. 2003). The recent finding that pseudouridylation of snRNAs can also be guided by conventional H/ACA snoRNAs (Kiss et al. 2004) raises the possibility that such RNAs may also exist in trypanosomes. In addition, trypanosomes may also use enzymatic modifications. Preliminary mapping of the Ψs on U2 snRNA in the pseudouridine synthase (Cbf5p) RNAi-silenced cells indicates that many of the known conserved modifications are not abolished or even changed during the elimination of the H/ACA-like RNA, suggesting that also in trypanosomes some of the modifications may be carried out by enzymes (S. Barth, A. Hury, and S. Michaeli, unpubl.).
Where are the rest of the expected guide RNAs "hiding" in the genome? Since our search identified snoRNAs that are repeated within a chromosome, single-copy snoRNA genes may have escaped our searches. New snoRNAs might be found using experimental approaches. We have recently TAP-tagged RNA-binding proteins of both C/D and H/ACA-like RNAs and are in the process of identifying the RNAs that are coimmunoprecipitated with these particle-specific proteins. At this point, we cannot exclude the possibility that, in fact, we have identified almost all of the C/D snoRNAs in this study and that enzymes are responsible for the remaining modifications on rRNA (12 Nms). Indeed, as can be seen in Figure 6, we have identified modifications for which we have not yet identified a cognate snoRNA. A recent study in yeast indicates that the site whose Nm modification is guided by snR52 is also modified enzymatically by a methyltransferase (Spb1p) and that knocking out both these functions causes a growth defect, suggesting redundant mechanisms for modification of this site (Bonnerot et al. 2003). The homolog of snR52 was identified in this study (TB10Cs3C1). Our recent discovery of snoRNAi in T. brucei (Liang et al. 2003b) may enable us to examine in trypanosomes the existence of a similar redundant mechanism to modify rRNA.
Unique structural features of trypanosome snoRNAs
The only striking property of trypanosome C/D snoRNAs is that many of them are double-guiders and can potentially guide adjacent sites on rRNA. The trypanosome genome is ~30 Mb, which is small relative to plants and mammals. The small genome and the large number of modifications may have selected the double-guide organization. The simultaneous formation of two guide duplexes may suggest that these snoRNAs have a chaperone function that is needed to control pre-rRNA folding. The trypanosome H/ACA RNAs possess a unique structure compared to the molecules in most eukaryotes, since instead of being double-guide molecules, most if not all of them are single-hairpin molecules. When the secondary structural features and compensatory changes in the secondary structure are examined, it will be possible to establish rules to specify these RNA molecules and to write an algorithm that will search for these RNAs in a whole-genome search. Since the discovery of these single Ψ-guide RNAs in trypanosomes, single-guiding RNAs were discovered in Archaea (Tang et al. 2002; Rozhdestvensky et al. 2003) and Euglena (Russell et al. 2004). In Euglena all the guide RNAs that are involved in guiding pseudouridylation also carry the AGA-box (Russell et al. 2004). Indeed, in yeast and humans the AGA sequence never appears naturally at the 3' end of the molecule. It was recently suggested that the trypanosome and Euglena snoRNAs resemble the 5' end of the eukaryotic H/ACA RNA, since the H-box is, in fact, AGANNN (Russell et al. 2004). Moreover, it was already suggested that since both the trypanosomes and Euglena diverged early in the eukaryotic lineage, their single Ψ-guide RNA may represent the primordial guide RNA that gave rise to the double-guiders later in evolution.
In Archaea (Bachellerie et al. 2002) as well as in humans (Kiss et al. 2002), there are molecules that carry several H/ACA-like domains and are most probably the "fusion" products of individual molecules. Such molecules have not yet been found in trypanosomes.
Genome organization compared to other eukaryotes
The genomic organization of snoRNA genes is very diverse in different eukaryotes (recently summarized in Uliel et al. 2004). The organization of trypanosome snoRNA genes most closely resembles that of plants because the genes are clustered and the clusters carry a mixture of both H/ACA and C/D snoRNAs (Brown et al. 2003a). The similarity between plants and trypanosomes is intriguing; in fact, it was recently found that Trypanosoma and Leishmania contain several "plant-like" genes. These genes most probably originated from endosymbiosis with an archaic organelle that was once common to plants and trypanosomes but was later lost during evolution in trypanosomes (Hannaert et al. 2003). Recently, the genome organization of Euglena snoRNA genes was studied, and it is suggested that as in trypanosomes, these genes are also clustered and the clusters are composed of both C/D and H/ACA-like RNAs. These clusters are also repeated (Russell et al. 2004).
Almost all clusters encoding snoRNAs identified in this study are repeated, suggesting that the level of expression is dependent on the number of copies of the repeat. As for many protein-coding genes in trypanosomes, the repeated nature of the snoRNA clusters represents a mechanism of coping with the absence of Pol II promoters (Clayton 2002). The high degree of expression of these RNAs is therefore mediated by gene multiplicity. Interestingly, there are also snoRNA clusters that are single-copy genes, yet their RNAs are also properly expressed (Fig. 2). Several repeats contain an accompanying protein-coding gene. In both cases, the genes (GP63, GLK1) have no direct relationship to snoRNA or RNA metabolism.
Additional snoRNA genes may still be identified in the future, since, as previously discussed, the repertoire of H/ACA and C/D snoRNAs described here is most probably incomplete. Perhaps one should expect to find repeats that carry only H/ACA RNAs; these would not have been identified in our searches because we used the SnoScan program (Lowe and Eddy 1999), which identifies only the C/D snoRNAs. The identification of such H/ACA gene clusters awaits the development of an algorithm that will be able to predict trypanosome H/ACA-like RNAs on a genomic scale. It is also possible that the remaining missing snoRNAs, for instance, the snoRNAs that direct modification on snRNAs, are present as single-copy genes. Indeed, the snRNA molecules to be guided are often 100 times less abundant than rRNA, and therefore single-copy genes may suffice in supplying the need for snRNA modification.
The biological role of modifications and why Nms are so abundant in trypanosomes
In mammals, there are ~93–95 sites of methylation. However, in yeast, there are only 55 such modifications. The estimated number of such modifications in trypanosomatids is ~95–100 (Gray 1979) and resembles the number found in plants and vertebrates (Brown et al. 2003a). Surprisingly, in Euglena, as in trypanosomes, the rRNA is extensively modified, and the estimated number of modifications is 150 Nms and 70 Ψs (Russell et al. 2004). The increased number of methylations on plant rRNA was rationalized by the fact that plants are exposed to large temperature changes during which the ribosomes must be produced and remain active. Also in hyperthermophilic Archaea, there is a correlation between growth at elevated temperatures and the number of Nms in the rRNA (Bachellerie et al. 2002). We initially hypothesized that since trypanosomes undergo temperature changes during their life cycle, from 26°C in the insect host to 37°C in the mammalian host, the hypermodification is related to the need to preserve ribosomal activity under adverse environmental conditions. However, Euglena is not a parasite that cycles between different hosts, but like plants, is exposed to major temperature changes in nature. In addition, like trypanosomes, Euglena diverged very early from the eukaryotic lineage (Sogin et al. 1986), and its large rRNA is fragmented (Schnare and Gray 1990). Each of the unique properties shared by both organisms, as well as their early divergence from the eukaryotic lineage, may have selected for the generation and conservation of the large number of Nms on rRNA found in these organisms. It will be very interesting to compare the positions of the trypanosome-species-specific modifications with those in Euglena and determine whether they are located at similar positions.
Of great interest is the large number of predicted Nms that are species-specific. These modifications are clustered together in the most conserved structural domains. Also of interest is the finding that relatively many adjacent nucleotides are methylated. In several cases such as TB5Cs1C1, TB3Cs2C1, and TB8Cs2C1, the same snoRNA can direct the methylation on two adjacent sites. It is now well accepted that eliminating a single modification does not have a dramatic effect on ribosome function, which suggests that most individual modifications contribute a small nonessential benefit and that a large benefit is provided only when numerous modifications exist (Decatur and Fournier 2003). The increasing number of modifications in the conserved functional domains may stabilize the ribosome and help it to function even under adverse conditions. Indeed, the sites of modifications are clustered in domains where specific translation events take place (Decatur and Fournier 2002).
In this study we identified a large number of H/ACA-like RNAs that appear to be species-specific. However, this may change when more of these guide RNAs are identified in other organisms.
Novel trypanosome snoRNAs with non-nucleolar RNA targets
One of the most interesting H/ACA-like molecules discovered in trypanosomatids is SLA1, which directs pseudouridylation on the SL RNA. This RNA was initially discovered because of its efficient cross-linking to SL RNA and at that time was proposed to represent the U5 snRNA (Watkins et al. 1994). Although it is clear that SLA1 is, indeed, an H/ACA-like RNA (Liang et al. 2002), its role in SL RNA biogenesis is still an open question. Also, it is not yet clear if the main function of SLA1 is to direct the modification at position 12 or to serve as a chaperone for the SL RNA during its early steps of biogenesis before assembly with Sm proteins (Mandelboim et al. 2003).
Additional snoRNAs that deviate from the canonical structures were revealed in the clusters described in this study. TB10Cs1C2 and the 270-nt RNA TB11Cs2C3 (in the SLA1 locus) are longer than the canonical guide RNAs. In addition, these RNAs conform to neither the C/D nor the H/ACA-like RNA structure and must therefore have other functions. It will be of interest to find out if these RNAs guide the cleavage of pre-rRNA. Structurally, the 270-nt RNA highly resembles the Euglena RNA Eg-h1 recently described (Russell et al. 2004). In both cases, the RNA possesses an ACA-box at the 3' end and an H-like box located in a single-stranded region. This kind of molecule may represent the primordial H/ACA RNA already present in protists, which may have evolved from a fusion of single stem–loop RNAs.
Also of great interest are the TB11Cs2C2 snoRNA that appears in the SLA1 locus and the snoRNA that has the potential to guide modification in the ITS (TB10Cs2C1). So far, all the modifications were mapped to functional domains within the mature RNA. In fact, although modification takes place on the nascent elongating transcript, no modification was ever found in the transcribed spacers that are removed from the pre-rRNA during processing. This novel type of snoRNA may serve as a chaperone during rRNA processing to direct or accelerate proper RNA folding. It remains to be seen if the position on the pre-rRNA is, indeed, modified.
Recently we found that snoRNAs can be silenced by an RNA interference-like mechanism (Liang et al. 2003b). In Leptomonas and Leishmania, snoRNA silencing was achieved by overexpression of antisense RNA, whereas in T. brucei, silencing was elicited by in vivo production of double-stranded RNA (Liang et al. 2003b). The mechanism of silencing may therefore differ among trypanosomatid species. Nevertheless, this finding opens up the possibility of elucidating the function of individual snoRNAs or of the specific modifications described in this study.
In summary, in this study we used bioinformatics and experimental tools to investigate the repertoire of snoRNAs that guide modification in T. brucei. The results of past studies suggest that we are at the tip of a large iceberg, and the snoRNomics of trypanosomes and other eukaryotes promises to remain a fascinating branch of RNomics.
| MATERIALS AND METHODS |
|---|
TB5Cs1, 5'-TGTTTTCAATCGCAGGGTCC-3', antisense, complementary to snoRNA TB5Cs1C1, from position 38 to 57;
TB6Cs2-A, 5'-ATGCCCGTTACGGAACTCT-3', antisense, complementary to snoRNA TB6Cs2C1, from position 34 to 52;
TB5H1, 5'-CGCACGTGCTTCGTACCG-3', antisense, complementary to snoRNA TB10Cs3H1, from position 41 to 58;
TB9Cs5-A, 5'-TCTTCACATTTGCTAATTCA-3', antisense, complementary to snoRNA TB9Cs5C2, from position 33 to 52;
TBC-4, 5'-ATAGAGTTCACAGTTGCA-3', antisense, complementary to snoRNA TB9Cs2C5, from position 59 to 76;
TBsno-H-1, 5'-AATTCTCGGACCACGTGA-3', antisense, complementary to snoRNA TB9Cs2H1, from position 58 to 75;
2-CH-2, 5'-CGCGGGTCCGATTGAG-3', antisense, complementary to snoRNA TB6Cs1H2, from position 51 to 66;
2-HH-3, 5'-AACCTCAATGGGTATC-3', antisense, complementary to snoRNA TB6Cs1H4, from position 54 to 71;
1425, 5'-ATCGCCTGCTCCGCTTAC-3', antisense, complementary to rRNA large subunit 5' half, from position 1425 to 1442;
385, 5'-GGCAGAAATCAGTTTGCG-3', antisense, complementary to rRNA large subunit 3' half, from position 385 to 402;
1923, 5'-ATTGTAGTGCGCGTGTCG-3', antisense, complementary to rRNA small subunit, from position 1923 to 1940; and
22269, 5'-ACCTCCAAAGTCGCCGCA-3', antisense, complementary to rRNA large subunit 3' half, from position 637 to 654.
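As a quick consistency check on a list like the one above, the length of each antisense oligonucleotide can be compared with the span of target positions it is stated to cover, and its reverse complement gives the sense-strand sequence it should pair with. The following is only a minimal Python sketch; the function names and the three-entry `OLIGOS` table are illustrative choices, not part of the original methods.

```python
# Minimal sketch: sanity-check antisense oligonucleotides against their
# stated target coordinates. Names below are hypothetical helpers.

DNA_PAIRS = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(dna.upper()))


def span_matches_length(oligo: str, start: int, end: int) -> bool:
    """True if the oligo length equals the inclusive span start..end."""
    return len(oligo) == end - start + 1


# Three entries copied from the list above: (sequence, first pos, last pos).
OLIGOS = {
    "TB5Cs1": ("TGTTTTCAATCGCAGGGTCC", 38, 57),
    "TB6Cs2-A": ("ATGCCCGTTACGGAACTCT", 34, 52),
    "1425": ("ATCGCCTGCTCCGCTTAC", 1425, 1442),
}

for name, (seq, start, end) in OLIGOS.items():
    # The reverse complement is the sense-strand stretch (as DNA) that the
    # antisense oligo is expected to anneal to.
    print(name, span_matches_length(seq, start, end), reverse_complement(seq))
```

The printed reverse complement can then be searched for in the corresponding snoRNA or rRNA sequence to confirm the stated coordinates.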
Prediction of the targets on rRNA
The potential 2'-O-methylation targets in rRNA were determined using the computer program BestFit (from the GCG package) to search for complementarity to rRNA that complies with the +5 guiding rule. Targets were also predicted from the data available for the yeast homologs. To predict the pseudouridines guided by H/ACA RNAs, the secondary structure of each H/ACA RNA was predicted using the MFOLD program (http://www.bioinfo.rpi.edu/applications/mfold/old/rna/form1.cgi), and the sequences from the internal loop were used to search for complementarity with rRNA, based on the guiding rule established in yeast, mammals, and plants (http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html; http://bioinf.scri.sari.ac.uk/cgi-bin/plant_snorna/conservation).
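The +5 rule search can be illustrated with a short Python sketch. It is not the BestFit procedure used here: it assumes exact Watson-Crick complementarity (no mismatches or G-U pairs), a fixed guide length of 10 nt taken immediately upstream of a CUGA box, and it counts the nucleotide directly upstream of the box as position 1. The sequences and function names are invented for illustration.

```python
# Minimal sketch of a +5 rule target search under the assumptions stated
# above (exact complementarity, fixed guide length). Toy sequences only.

RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_PAIRS[base] for base in reversed(seq))


def find_all(text: str, pattern: str) -> list:
    """Return 0-based start indices of all (overlapping) occurrences."""
    hits, start = [], text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def predict_methylation_sites(sno, rrna, guide_len=10, box="CUGA", offset=5):
    """Return (box start, rRNA position, nucleotide) tuples, all 1-based.

    The reported rRNA nucleotide is the one paired with the snoRNA
    nucleotide lying `offset` nt upstream of the box.
    """
    sites = []
    for box_start in find_all(sno, box):
        if box_start < guide_len:
            continue  # not enough sequence upstream of this box
        guide = sno[box_start - guide_len:box_start]
        target = reverse_complement_rna(guide)
        for t in find_all(rrna, target):
            pos = t + offset  # 1-based position of the paired rRNA nucleotide
            sites.append((box_start + 1, pos, rrna[pos - 1]))
    return sites


# Invented toy sequences: a C-box-like motif, a 10-nt guide, and a D box.
toy_sno = "GUGAUGACAAAUCCGUAAGCCUGAGG"
toy_rrna = "AAGGCUUACGGAUCCAGUUU"
print(predict_methylation_sites(toy_sno, toy_rrna))
```

In practice an alignment tool that tolerates mismatches and variable duplex lengths would be needed, and candidate boxes would have to be filtered, because the short CUGA motif occurs frequently by chance.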
Prediction of the secondary structure of rRNA
The secondary structure of T. brucei rRNA was derived from http://www.icmb.utexas.edu. The sequence of the LSU is from EMBL accessions X14553, X05682, and X04986. The T. brucei SSU secondary structure was obtained by superimposing the T. brucei sequence (derived from chromosome 1, positions 79411779636) on the L. major structure available from the same site.
RNA preparation and primer extension analysis
RNA was prepared from T. brucei cells using TRI-Reagent (Sigma). Primer extension analysis was performed as described (Liang et al. 2001; Xu et al. 2001) using 5'-end-labeled oligonucleotides specific to target RNAs, as indicated in the figure legends. The extension products were analyzed on 6% polyacrylamide–7 M urea gels and visualized by autoradiography.
Mapping of the modified nucleotides
2'-O-Methylations on rRNA were mapped by primer extension at different dNTP concentrations, as described in Xu et al. (2001). Pseudouridines were examined after N-cyclohexyl-N'-β-(4-methylmorpholinium)ethylcarbodiimide p-tosylate (CMC) modification, as described in Liang et al. (2001), using primers specific to the relevant region in rRNA. Primer extension products were analyzed on 6% polyacrylamide–7 M urea gels, next to sequencing reactions performed using the same primer.
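The arithmetic that relates a primer extension stop to an rRNA position can be sketched as follows. This is a hedged illustration, not the procedure of the cited protocols, in which stops are read directly against the sequencing ladder. It assumes that reverse transcriptase stops one nucleotide 3' of the modified residue (the `stop_offset` parameter), that the cDNA length includes the primer, and that positions are numbered 5' to 3' on the rRNA; the function names are hypothetical.

```python
# Minimal sketch: convert a cDNA stop length into an rRNA position.
# Assumption (labeled): the reverse transcriptase stop lies `stop_offset`
# nucleotides 3' of the modified residue.


def last_copied_position(primer_5p_pairs_with: int, cdna_length: int) -> int:
    """rRNA position copied by the 3'-terminal nucleotide of the cDNA.

    primer_5p_pairs_with: rRNA position paired with the primer's 5' end,
    i.e. the highest-numbered position of the annealing site.
    cdna_length: length of the extension product, primer included.
    """
    return primer_5p_pairs_with - cdna_length + 1


def modified_position(primer_5p_pairs_with: int, cdna_length: int,
                      stop_offset: int = 1) -> int:
    """Inferred position of the modified nucleotide for a given stop."""
    return last_copied_position(primer_5p_pairs_with, cdna_length) - stop_offset


# Example with oligo 1425, which anneals to positions 1425-1442:
# an unextended 18-nt primer ends at position 1425, and a hypothetical
# 60-nt stop product would point to position 1382.
print(last_copied_position(1442, 18))
print(modified_position(1442, 60))
```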
| ACKNOWLEDGMENTS |
|---|
| Footnotes |
|---|
Article and publication are at http://www.rnajournal.org/cgi/doi/10.1261/rna.7174805.
Received September 2, 2004; accepted January 17, 2005.
| REFERENCES |
|---|
Bachellerie, J.P., Cavaille, J., and Hüttenhofer, A. 2002. The expanding snoRNA world. Biochimie 84: 775–790.
Benson, G. 1999. Tandem repeats finder: A program to analyze DNA sequences. Nucleic Acids Res. 27: 573–580.
Bonnerot, C., Pintard, L., and Lutfalla, G. 2003. Functional redundancy of Spb1p and a snR52-dependent mechanism of the 2'-O-ribose methylation of a conserved rRNA position in yeast. Mol. Cell 12: 1309–1315.
Bortolin, M.L., Ganot, P., and Kiss, T. 1999. Elements essential for accumulation and function of small nucleolar RNAs directing site-specific pseudouridylation of ribosomal RNAs. EMBO J. 18: 457–469.