REVIEW |
1. Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ontario K1H 8M5, Canada
2. School of Information Technology and Engineering, University of Ottawa, Ontario K1N 6N5, Canada
3. Department of Pediatrics, University of Ottawa, Ontario K1H 8M5, Canada
4. Apoptosis Research Centre, Children's Hospital of Eastern Ontario, Ottawa, Ontario K1H 8L1, Canada
| ABSTRACT |
|---|
|
|
|---|
Keywords: IRES; RNA; secondary structure; prediction software
| INTRODUCTION |
|---|
|
|
|---|
There have been many very good reviews on IRESes over the years that are helpful in understanding the different facets of this mechanism of translation initiation. Some favorites are: Hellen and Sarnow (2001)
, Jackson et al. (1995)
, on Picornavirus (Belsham and Sonenberg 1996
), FMDV IRES structure/function (Martinez-Salas et al. 2002
), structural aspects relevant to medical intervention (Gallego 2002
), with respect to cancer (Holcik 2004
; Stoneley and Willis 2004
), the very critical and controversial Kozak (2001
, 2003)
, and on stress-related IRES (Holcik et al. 2000
; Holcik and Sonenberg 2005
; Lewis and Holcik 2005
). In this review, we examine the published data that could aid in the detection of unknown IRESes in an mRNA database and the RNA motif/structure-predicting and search programs presently available, which could be applicable to this search.
| MECHANISM OF CAP-DEPENDENT TRANSLATION, INHIBITION, AND IRES |
|---|
|
|
|---|
At some points of a cell's life, this standard mechanism of translation initiation is compromised, but at these times, some mRNAs use an alternative form of initiation that does not require the cap nucleotide as a congregation site for initiation factors. This was first observed in Picornavirus infections, where the uncapped RNA viral genomes of the polio and encephalomyocarditis virus were efficiently translated in eukaryotic cells through the binding of the ribosome to an internal portion of the 5'-UTR of the viral RNA (Jang et al. 1988
; Pyronnet et al. 2001
). A viral protease cleaves the two forms of eIF4G, shutting down host protein synthesis (Gradi et al. 1998a
,b
; Svitkin et al. 1999
). Even though the cleaved form of eIF4GI does not bind to eIF4E, it has been shown to translate capped mRNA but much less efficiently than viral RNA (Ali et al. 2001
).
Besides viral infection, several other mechanisms lower the efficiency of cap-dependent translation initiation inside a cell. During mitosis, the eIF4E-binding proteins (4E-BPs) are hypophosphorylated and competitively bind to the cap binding protein, eIF4E, preventing eIF4E from forming the eIF4F initiation complex (Pyronnet et al. 2001
). The phosphorylation states of eIF4E and the 4E-BPs in different cellular conditions have been well reviewed by Gingras et al. (1999)
.
During times of cellular perturbation, changes in protein levels and mRNA levels do not always correlate (Ideker et al. 2001
; Nishizuka et al. 2003
). Cellular stresses and the induction of apoptosis cause inhibition of standard translation initiation through the phosphorylation of the eIF2α subunit by one of the four known eIF2α kinases in mammalian cells: HRI, PKR, PERK, and GCN2 (Proud 2005
). The initiation factor eIF2 is the adapter protein that binds Met-tRNA and GTP as part of the 43S preinitiation complex. Phosphorylation of the α-subunit of eIF2 creates tighter binding to eIF2B, which prevents the GDP-GTP exchange activity of eIF2B needed for the recycling of eIF2 for the successive rounds of initiation of protein synthesis. During apoptosis, the cell still requires the de novo synthesis of proteins required for the orderly breakdown of the cell, but the standard protein translation initiation machinery is also slowed down with changes to the phosphorylation of eIF4G (Ling et al. 2005
), eIF4E, and 4E-BPs (Clemens 2001
), as well as caspase cleavage of several canonical initiation factors, eIF4B, eIF3, eIF2α, and proteins of the eIF4G family (Clemens et al. 2000
). Other molecular events like the hyperphosphorylation of eIF4GII (Pyronnet et al. 2001
) or Hsp27 binding to eIF4G during heat shock may hinder the formation of eIF4F (Cuesta et al. 2000
).
Despite these cellular conditions that change the normal translation initiation machinery, some cellular mRNAs and viral RNA still retain the ability to recruit ribosomes to a region of their 5'-UTR to initiate translation. There have been at least 85 cellular IRESes and 39 viral IRESes described in the literature so far, as given in Tables 1
and 2
. These sequences have been shown to exhibit cap-independent translation initiation. The standard method of defining this activity has been the ability of the sequence to initiate translation of the second open reading frame (ORF) or cistron in a bicistronic construct. Some criticisms and caveats have been attached to the use of bicistronic constructs for measuring IRES activity (Hellen and Sarnow 2001
; Kozak 2001
; Sherrill et al. 2004
), and checks need to be made for promoter activity in the UTR, reinitiation of ribosome on the second ORF, aberrant splicing (Holcik et al. 2005
), and inconsistent values (Hennecke et al. 2001
) of the dual luciferase reporter gene construct. Re-evaluation of the 5'-UTRs of PDGF (Han et al. 2003
), PIM-1 (Wang et al. 2005
), and the human cyclin-dependent kinase inhibitor, p27kip1 (Liu et al. 2005
), has shown that they do not have IRESes as was initially thought (Bernstein et al. 1997
; Johannes et al. 1999
; Miskimins et al. 2001
), but the sequences are able to function as promoters. A statistically rigorous methodology has been published to evaluate the output values of the bicistronic constructs (Jacobs and Dinman 2004
). The only drawback to this method becoming universally used is that the minimal sample size might require 25 to 50 measurements, which is more than the three to nine sample measurements that are usually done.
mRNA binding proteins
When the function of some canonical translation initiation factors has been disabled or limited in the cell, making cap-dependent translation less efficient, other RNA binding proteins have been found to be required for, or to enhance, IRES-mediated translation initiation. RNA binding proteins have many functions that affect translation, including localization of mRNA in the cytoplasm (zipcode), stabilization of message (AREs), metabolite riboswitch (Sudarsan et al. 2003
), and translation repression. Our interest in this case is in IRES-specific cellular trans-acting factors (ITAFs). Some of the mRNA binding proteins implicated in IRES-mediated translation are polypyrimidine tract binding protein PTB/hnRNP I (Giraud et al. 2001
; Mitchell et al. 2001
, 2003
, 2005
; Pickering et al. 2003
; Cho et al. 2005
); La autoantigen (Holcik and Korneluk 2000
; Bhattacharyya and Das 2005
; Marash and Kimchi 2005
); hnRNP A1 (Bonnal et al. 2005
); hnRNP C1/C2 (Sella et al. 1999
; Millard et al. 2000
; Holcik et al. 2003
); hnRNP E, hnRNP K, DAP5/p86 (Henis-Korenblit et al. 2000
; Nevins et al. 2003
; Warnakulasuriyarachchi et al. 2004
; Marash and Kimchi 2005
); Unr (Mitchell et al. 2001
, 2003
; Tinton et al. 2005
); p60 (Vagner et al. 1996
), HuR (Millard et al. 2000
), and PCBP1 (Pickering et al. 2003
). For a more comprehensive list, see the online IRES database (http://ifr31w3.toulouse.inserm.fr/IRESdatabase/) or the list in Stoneley and Willis (2004)
.
One group of ITAFs is the heterogeneous nuclear ribonucleoproteins (hnRNPs), which bind to transcripts and form ribonucleoprotein complexes. They play a key role in pre-mRNA processing as well as mRNA export, localization, stability, and translation (Dreyfuss et al. 2002
). As an example, PTB has been connected with several functions such as splicing repression, pre-mRNA 3'-end processing, mRNA localization, and mRNA stability. It has also been shown to be involved in IRES activation with both viral (Sanderbrand et al. 2000
; Wollerton et al. 2001
; Bieleski et al. 2004
) and cellular (Giraud et al. 2001
; Mitchell et al. 2003
, 2005
; Pickering et al. 2003
) IRES but has been also shown to be inhibitory to IRES activity in Unr (Cornelis et al. 2005
) or Bip (Kim et al. 2000
). Several IRESes have a polypyrimidine tract at the 3'-end that has been shown to be important for activity (Kaminski et al. 1994
) and a recognition site for PTB (Kolupaeva et al. 1996
). The PTB consensus recognition sequence shown to be important in the HCV IRES sequence is CYYYYCYYYY(G|Y)G, where Y is a pyrimidine (Anwar et al. 2000
). It is not known if the location of this sequence within the HCV IRES structure is important as only the first four bases are consistently in single-stranded regions, but some believe the binding site needs to be within double-stranded regions (Mitchell et al. 2005
). PTB has at least three splicing isoforms (Wollerton et al. 2001
) and caspase (Back et al. 2002b
) or viral protease cleavage products (Back et al. 2002a
), which affect IRES activity to differing degrees, as well as tissue-specific paralogs (Pilipenko et al. 2001
; Gooding et al. 2003
). The protein also contains four RNA binding motifs, thus it is not surprising that other recognition sites for PTB/RNA interactions have also been found that do not match this consensus sequence (Wollerton et al. 2001
). It is understandable from the variety of PTB forms available that there does not seem to be an unequivocal consensus sequence for PTB.
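As an illustration of how the HCV-derived consensus quoted above, CYYYYCYYYY(G|Y)G, could be used as one weak signal in a sequence scan, the following is a minimal sketch. It assumes Y means C or U, treats T as U, and reports overlapping matches; the function name `find_ptb_sites` is hypothetical and is not taken from any published tool. Given the lack of an unequivocal PTB consensus, a match should be read as a hint rather than as evidence of binding.

```python
import re

# Consensus from the text: C YYYY C YYYY (G|Y) G, with Y = pyrimidine (C or U).
# The zero-width lookahead lets overlapping matches be reported.
PTB_CONSENSUS = re.compile(r"(?=(C[CU]{4}C[CU]{4}[GCU]G))")


def find_ptb_sites(rna):
    """Return (0-based start, matched 12-mer) for each consensus match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    return [(m.start(), m.group(1)) for m in PTB_CONSENSUS.finditer(rna)]


if __name__ == "__main__":
    toy_utr = "AAACUUUUCUUUUGGAAA"  # made-up sequence for demonstration only
    for start, site in find_ptb_sites(toy_utr):
        print(start, site)
```

Because the pattern is purely sequence based, it ignores whether the site lies in a single- or double-stranded region, which the text notes is still unresolved.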
The ITAF La is very promiscuous as well with its binding site recognition, and the requirement for IRES function is not always clear. For example, an early in vitro study had shown that the HCV IRES could function without any noncanonical factors, but recent in vivo studies show that La is required for HCV IRES translation (Shimazaki et al. 2002
; Costa-Mattioli et al. 2004
). The RNA binding protein Unr is required for HRV IRES activity, and although all five cold shock domains of Unr are necessary for RNA binding, the binding seems to be a nonspecific sequence interaction (Brown and Jackson 2004
).
Although there is no overall consistency as to which RNA binding proteins are required for IRES activity, they may be consistently involved in ribonucleoprotein complexes that exist during specific cellular contexts of stress, cell cycle, or particular mechanism of viral control over cellular functions. Using microarrays, some RNA binding proteins have been shown to interact with a specific group of transcripts during times of cellular perturbation (Tenenbaum et al. 2002
, 2003
). It has been postulated that RNA binding proteins play a role in coregulating the translation of groups of proteins analogous to bacterial operons (Keene and Tenenbaum 2002
). Therefore, knowing an RNA binding protein that binds an IRES in a specific cellular context would suggest that other IRESes in that context may also be bound to the same protein.
The binding of the canonical initiation factors also affects IRES activity. When eIF4E, the cap binding protein, has been removed (Hernandez et al. 2004
), the Drosophila reaper IRES initiates translation more efficiently than capped messages. The eIF4G family member DAP5/p97 is cleaved to DAP5/p86 and enhances IRES-mediated translation (Henis-Korenblit et al. 2002
; Nevins et al. 2003
; Warnakulasuriyarachchi et al. 2004
; Marash and Kimchi 2005
). The importance of ribosomal proteins in IRES translation initiation was shown using a genome-wide RNAi screen of Drosophila genes. One hundred twelve cellular genes were found that were required for infection by the IRES-dependent Drosophila C virus (Cherry et al. 2005
). More than 50% of these were genes of ribosomal proteins, two of which when deleted affected IRES but not cap-dependent translation. This suggests that some ITAFs will not be unique to the IRES translation machinery and will include components of the ribosome as well.
The ITAFs are abundant in the cell and seem quite ubiquitous but are not required in all examples of known IRES activity. Specific initiation factors and ribosomal proteins will also play a role. ITAFs can also exist in several forms of post-translational modification and bind a wide range of sequence motifs. Their own regulation and regulated cellular localization would control the IRES function. Do the mRNA binding proteins that are used for induction of IRES activity in the subsets of the published IRESes possibly have shared binding motifs? Although the RNA binding protein data are not defined well enough to be used alone for database searching, they could still serve as an added weight in a search algorithm.
Functional classes
There appear to be differences among IRESes as to which proteins are necessary to bind to the UTR in order to recruit the ribosome for translation. When this is coupled with the results of different IRESes initiating translation with varying efficiency dependent on which cell type or cellular context they are measured in (Nevins et al. 2003
), the results suggest that several IRES classes exist, as many groups have already pointed out. Several groups have examined their characterized IRESes in several cell lines, comparing them to other IRESes, and found that specific IRESes have more activity in one cell line relative to another IRES (Stoneley et al. 2000
; Jopling and Willis 2001
; Nevins et al. 2003
; Jopling et al. 2004
). This may be due to different available protein factors in each cell line and, therefore, the different subsets of mRNAs that different cells or tissues are able to translate at any one time. This may also explain the lack of primary sequence similarity between the cellular IRES. For example, the 5'-UTRs containing IRES from c-Myc and cyclin D are dependent on the activity of AKT through p38 MAPK and ERK signaling to initiate translation (Shi et al. 2005
) but not the 5'-UTR of p27kip1. The link may be the ITAFs PCBP1, PCBP2, and hnRNP K, which are known to be required by the c-Myc IRES (Evans et al. 2003
) and are regulated by phosphorylation (Shi et al. 2005
).
In an investigation of FGF1 IRES activity in muscle and cell culture (Martineau et al. 2004
), FGF1 has four separate 5'-UTRs, each exhibiting some IRES activity. FGF1A and C have similar activity to each other and that of FGF2 IRES but much less than the EMCV IRES in cell culture. The same IRES constructs electrotransferred into mouse muscle cells showed FGF1A to be much more active than FGF1C and similar to EMCV, while the FGF2 IRES seemed to exhibit no activity at all. Clearly the context of available ITAFs must favor the translation of one mRNA over another. A very similar contextual difference is seen where IRES from the muscle-relevant genes SMAD and utrophin are active in myoblast C2C12 muscle cells but not at all in 293T renal epithelial cells for SMAD (Shiroki et al. 2002
) or differentiated muscle cells for utrophin (Miura et al. 2005
). Other examples are the IRES from transcripts of the calcium channel proteins like Scamper exhibiting tissue-specific activity in kidney cells (De Pietri Tonelli et al. 2003
) and the Nkx6.1 IRES being most active in β-cells (Watada et al. 2000
). The Apaf-1 IRES has been found to be more active in neuronal cell types, possibly due to the presence of a neuronal isoform of PTB, which seems to confer greater activity than PTB-1 (Mitchell et al. 2003
), and this may be where the Apaf-1 IRES is most physiologically relevant. This correlates well with the developmental problems found in the brains of Apaf-1 knockout mice (Cecconi et al. 1998
). In contrast, the IRES of HRV is repressed in neuronal cells because of the presence of the mRNA binding protein DRBP76/NF90 (Merrill et al. 2006
). A list of IRESes that are regulated can be found in the review by Komar and Hatzoglou (2005)
. Whereas viral IRESes might share a more universal context of translation regulation, and therefore some similarity has been found among them, the larger number of cellular contexts that would require different regulation implies a large number of classes of IRES with many different sequence and structural components. Even saying there are regulatory classes of IRESes may be too rigid, as there may be a loose overlap of some mRNA translation.
18S complementation and modular elements
It has been pointed out by Chappell et al. (2000)
that partial IRES activity is still retained when the segments of a 5'-UTR ascribed to full IRES activity are partially deleted, and therefore some elements that help to recruit ribosomes must still exist in the remaining sequence. In some UTRs, nonoverlapping segments of sequence retain partial IRES activity, suggesting that different modules may act synergistically to provide full IRES activity in vivo. Nonoverlapping fragments of the Kv1.4 IRES that each retained partial activity showed different patterns of activity when tested in a variety of cell types (Jang et al. 2004
). This suggests distinct modular elements with different modes of regulation. Some examples of postulated IRES elements are listed in Table 3. Chappell et al. (2000)
had found an example of a distinct module with a 9-nt motif in the Gtx mRNA, complementary to 18S rRNA, that can function as a site for internal initiation of translation. This is an attractive model for ribosome recruitment for internal initiation as it parallels the function of the bacterial Shine-Dalgarno sequence and 16S rRNA. This complementation to the 18S rRNA is not new and has also been shown to be necessary for reattachment for scanning of the mRNA during "ribosome shunting," where the ribosome is stalled because of a complex structure in the mRNA and must disengage and then re-engage the mRNA on the 3'-side of the structure (Yueh and Schneider 2000
). Similar motifs with IRES activity exhibiting 18S rRNA complementation were found in a library of random nucleotides (Owens et al. 2001
) and the plant potato virus Y (Akbergenov et al. 2004
), and have been suggested to be in YAP1 and TIF4631 transcripts of yeast (Zhou et al. 2001
). Additional copies of the elements from either Gtx or the potato virus Y arranged in tandem produced additive increases of IRES activity. A segment in the 3'-UTR of a hibiscus plant virus, although perhaps not an IRES, was shown to enhance translation through an 18S rRNA complementation (Koh et al. 2002
). Complementarity to 18S rRNA does not by itself guarantee IRES activity, as several of the matches in the YAP1 and TIF4631 5'-UTRs are in segments that confer no IRES activity (Zhou et al. 2001
). The rRNA/mRNA interaction cannot be too great, as increasing the degree of complementation increases the thermodynamic stability of this interaction, lowering the efficiency of translation, and if large enough can completely inhibit translation (Hu et al. 1999
; Verrier and Jean-Jean 2000
). This differential ability to translate an mRNA based on the rRNA and mRNA interactions as well as the interactions due to changes in the structure of ribosomal subunits from one cell type to another is postulated as an overall method of translation control in the "ribosome filter hypothesis" of Mauro and Edelman (2002)
. The last few years have seen the realization that rRNA is no longer just a scaffolding for the ribosomal proteins; the ribosome is a ribozyme, and translation is now more RNA centered (Woese 2001
). It is, therefore, reasonable to believe with present-day evidence that interactions between mRNA and rRNA can enhance translation initiation.
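A search for short segments complementary to 18S rRNA, such as the 9-nt Gtx module described above, can be sketched as a simple k-mer lookup. The example below is only a minimal sketch under stated assumptions: it counts perfect Watson-Crick pairs only (no G-U wobble and no mismatches), and the function names and toy sequences are hypothetical. As noted above, a match does not by itself imply IRES activity, and extended complementarity can be inhibitory.

```python
def reverse_complement(rna):
    """Watson-Crick reverse complement of an RNA string (A, C, G, U only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def complementary_segments(utr, rrna, k=9):
    """Return (utr_start, rrna_start, utr_kmer) for every k-mer of the UTR
    that is perfectly complementary (antiparallel) to a window of the rRNA.
    Positions are 0-based."""
    utr = utr.upper().replace("T", "U")
    rrna = rrna.upper().replace("T", "U")

    # Index every k-mer of the rRNA by its start positions.
    index = {}
    for j in range(len(rrna) - k + 1):
        index.setdefault(rrna[j:j + k], []).append(j)

    hits = []
    for i in range(len(utr) - k + 1):
        kmer = utr[i:i + k]
        if any(base not in "ACGU" for base in kmer):
            continue  # skip ambiguous bases such as N
        for j in index.get(reverse_complement(kmer), []):
            hits.append((i, j, kmer))
    return hits


if __name__ == "__main__":
    toy_rrna = "AAGGCUACGGAUCC"      # made-up stand-in for 18S rRNA
    toy_utr = "CCCCUCCGUAGCCCCCC"    # made-up 5'-UTR
    print(complementary_segments(toy_utr, toy_rrna, k=9))
```

Keeping `k` as a parameter makes it easy to flag both short candidate modules and longer stretches whose stability might lower translation efficiency.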
Fine mapping of the c-Myc IRES sequence found a minimal 50-base sequence that was responsible for the bulk of the IRES activity (Cencig et al. 2004
). The c-Myc sequence did not seem to map to 18S rRNA and was also not dependent on the secondary structure formed for activity. Within this element, two 14-nt segments with an AX6AC motif were chiefly responsible for ribosome recruitment, reducing c-Myc IRES activity to these modular units. The IRES found within the APC gene that possibly is responsible for the milder form of adenomatous polyposis coli is only 84 nt long (Heppner Goss et al. 2002
) and may be also representative of a modular unit with IRES activity.
In an acyclovir-resistant strain of herpes simplex virus, very low levels of thymidine kinase are translated by a small IRES that requires only 12 bases and contains a CUG start codon (Griffiths and Coen 2005
). The low levels of thymidine kinase prevent its proper activation of acyclovir but are high enough to retain the virus's pathogenicity. As well as being a new IRES modular element, this also shows how very low levels of IRES translation initiation can be physiologically significant.
The above evidence supports the notion that an IRES can be made of modular units that act synergistically with or without trans-acting factors to recruit a ribosome and enhance translation initiation. The overall structure of the UTR for these small units may be somewhat unimportant. A very stable structure like a large hairpin or a tertiary structure that would bury a modular recognition sequence would probably still have an effect on IRES activity. The modulation of the structure by mRNA binding proteins could, therefore, have a regulatory IRES effect by changing the access to these small modules. As these types of short IRES sequences may be available elsewhere on mRNA sequences, they suggest the possibility of a much expanded proteome (Griffiths and Coen 2005
).
| DATABASES AND IRES |
|---|
|
|
|---|
IRES structure data for 16 cellular IRES and five viral IRES sequences are available at Rfam (Griffiths-Jones et al. 2003
), a database of noncoding RNA. For the most part, published IRES structures are used initially at Rfam, and a covariance sequence model is built using UTR sequences from different transcript entries and known ortholog sequences. Where no published structure is known, the covariance models for the IRES structures have either been built from energy minimization program prediction (RNAfold) or from a sequence alignment using the PFOLD program. It should be noted that if the alignment sequences did not give any covariance data, the seed structure would dominate the resulting structure model. This could possibly be the case for HCV, which appears to be little different from its seed structure (Lytle et al. 2002
) but dissimilar from the HCV NMR structure data (Lukavsky et al. 2003
). For tertiary structure information of IRES elements, the data can be found in the structure database at NCBI or RNABase (Murthy and Rose 2003
), a specific RNA structure database that may include additional annotations not included in the standard 3D structure files. At present, there are 13 IRES-related elements from GBV-B, enterovirus, and mostly HCV. Tertiary structures can be converted into secondary structure diagrams using RNAVIEW (Yang et al. 2003
).
There are other available data sets of IRES-enriched sequences where microarray expression studies have been carried out on cells undergoing a perturbation that would turn down the amount of cap-dependent translation and allow for a greater amount of IRES-mediated translation. Assuming that IRES elements under these conditions would more efficiently recruit ribosomes, experiments that isolated mRNAs bound by multiple ribosomes (polysomes) should show mRNAs with a greater likelihood of containing an IRES element. Johannes et al. (1999)
isolated polysomes from poliovirus-infected cells and evaluated any increased amount of bound mRNA compared to noninfected cells using a 10K human cDNA array. They found ~200 transcripts with a greater than twofold enrichment of polysome-bound mRNA out of ~7000 hits that produced an acceptable signal on the microarray. Several other studies have used microarray examination of polysome-bound mRNA: cells with and without von Hippel-Lindau tumor suppressor protein (Galban et al. 2003
), rapamycin's effect on translation (Grolleau et al. 2002
), resting and mitogenically activated fibroblast (Zong et al. 1999
), synchronized cells in the mitotic cycle (Qin and Sarnow 2004
), and in yeast during cell cycle arrest (Serikawa et al. 2003
) or rapid growth (Arava et al. 2003
). Lately some researchers have combined the proteomic approach of measuring both the increase of protein expression using 2D gels and mass spectrometry as well as the mRNA levels (Ideker et al. 2001
; Grolleau et al. 2002
). This allows one to specifically discover transcripts that are regulated post-transcriptionally. Microarray studies of mRNA bound to mRNA binding proteins implicated as ITAFs will also yield a database of transcripts of which some may contain IRES.
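The twofold-enrichment criterion used in the polysome microarray studies above can be expressed as a simple filter. The following is a minimal sketch with hypothetical inputs: two dictionaries mapping transcript names to polysome-bound signal in perturbed and control cells. The threshold, the signal floor, and the function name are assumptions for illustration and do not reproduce the normalization used in any of the cited studies.

```python
def enriched_transcripts(perturbed, control, min_fold=2.0, min_signal=0.0):
    """Return {transcript: fold change} for transcripts whose polysome-bound
    signal in perturbed cells exceeds min_fold times the control signal.
    Transcripts missing from the control, or at or below min_signal in
    either condition, are skipped."""
    hits = {}
    for name, treated in perturbed.items():
        baseline = control.get(name)
        if baseline is None or baseline <= min_signal or treated <= min_signal:
            continue
        fold = treated / baseline
        if fold > min_fold:
            hits[name] = fold
    return hits


if __name__ == "__main__":
    # Made-up signal values for demonstration only.
    infected = {"geneA": 10.0, "geneB": 3.0, "geneC": 5.0}
    uninfected = {"geneA": 4.0, "geneB": 3.0, "geneC": 0.0}
    print(enriched_transcripts(infected, uninfected))
```

The resulting list is a set of candidates only; polysome enrichment under reduced cap-dependent translation is consistent with, but does not demonstrate, the presence of an IRES.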
Databases of UTR sequences can be obtained from several sources. Using the Web-based EnsMart in Ensembl (Clamp et al. 2003
), UTR data sets from a variety of genomes can be made with a user-defined number of flanking nucleotides. As almost all genes produce several transcripts, one must still consider filtering redundant 5'-UTRs, alternately spliced UTRs, or alternate UTRs produced by different promoters when creating a database. A prefiltered nonredundant database, UTRdb (Pesole et al. 2002
), can be found at http://bighost.area.ba.cnr.it/BIG/UTRHome/, in which UTRs with >90% sequence overlap and 95% nucleotide identity within the overlapping region (Grillo et al. 1996
) have been removed. UTRdb has annotated the sequences that contain patterns matching possible RNA structure/sequence, which would confer known regulatory elements such as the iron response element (IRE), histone 3'-UTR stemloop structure (HSL3), or even IRES elements as adapted from the computationally predicted structure of Le and Maizel (1997)
. In Release 20 of UTRdb, there are ~34,000 human 5'-UTR sequences and the IRES pattern is found ~7000 times, in ~20% of all the entries. Although mRNAs containing IRES elements could be quite abundant, there is doubt that the pattern actually detects IRESes. The RefSeq database from NCBI can also be used to produce a data set from the transcript entries but requires the user to write his or her own program/script to pull out the UTR sequences. A redundant version of the RefSeq UTRs is available at http://bighost.area.ba.cnr.it/BIG/UTRHome/.
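The kind of script needed to pull UTR sequences out of transcript entries can be very small once the coding-region coordinates are known. The sketch below assumes the transcript sequence and the 1-based, inclusive CDS start and end have already been read from the record's annotation; parsing of the database files themselves is not shown, and the function name `split_transcript` is hypothetical.

```python
def split_transcript(sequence, cds_start, cds_end):
    """Split a transcript into (5'-UTR, CDS, 3'-UTR).

    cds_start and cds_end are 1-based, inclusive coordinates of the coding
    region, in the style of feature-table annotation."""
    if not (1 <= cds_start <= cds_end <= len(sequence)):
        raise ValueError("CDS coordinates fall outside the transcript")
    five_utr = sequence[:cds_start - 1]
    cds = sequence[cds_start - 1:cds_end]
    three_utr = sequence[cds_end:]
    return five_utr, cds, three_utr


if __name__ == "__main__":
    # Made-up transcript: 4-nt 5'-UTR, 9-nt CDS, 2-nt 3'-UTR.
    five, cds, three = split_transcript("GGCCAUGAAAUAGCC", 5, 13)
    print(five, cds, three)
```

A real data set would still need the redundancy filtering discussed above, since several transcripts of one gene can share or overlap their 5'-UTRs.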
These data sets allow one to compare the overlaps in the UTR sequence characteristics believed to have IRES elements as discovered by different means versus a complete set of a genome's UTRs. For instance, of the ~23,000 genes in the human genome, around half have AUGs upstream of the defined AUG start codon, as seen in Figure 1, A and B. The number of upstream AUGs does not seem to depend on length, as there is a subset of UTRs with more AUGs than expected randomly, as shown in Figure 1B, but the overall frequency is less than expected from random. Although many of the upstream AUGs will most likely be relatively close to the Kozak consensus sequence (Suzuki et al. 2000
), this alone does not seem to mark them as proper start codons (Peri and Pandey 2001
). Around 10% of the upstream AUGs would be considered strong start codons (RXXAUGG, where R is a purine) as defined by Kozak (2005)
, whereas 41% to 45% of the coding region start codons would be considered strong start codons. Perhaps the ribosome starts to translate with the first AUG that is in the proper context of the surrounding folded RNA structure, bound proteins, and/or a continuous open reading frame. Possibly the start codon is decided on by a "pioneer" scan (Ishigaki et al. 2001
) of the mRNA or an RNA binding protein (McBratney and Sarnow 1996
).
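The upstream-AUG statistics above can be reproduced on any UTR data set with a short scan. The following minimal sketch counts AUG triplets in a 5'-UTR and flags those in the strong context RXXAUGG (R = purine) quoted above. It looks only within the UTR string supplied, so an AUG whose -3 or +4 position falls outside that string is not classified as strong; the function name is hypothetical.

```python
import re

# Strong context as quoted in the text: R X X A U G G, with R = A or G.
STRONG_CONTEXT = re.compile(r"(?=([AG]..AUGG))")
ANY_AUG = re.compile(r"(?=AUG)")


def upstream_augs(five_utr):
    """Return (all_positions, strong_positions): 0-based positions of the A
    of every AUG in the 5'-UTR, and the subset in an RXXAUGG context."""
    utr = five_utr.upper().replace("T", "U")
    all_positions = [m.start() for m in ANY_AUG.finditer(utr)]
    # The context match starts 3 nt upstream of the A of the AUG.
    strong_positions = [m.start() + 3 for m in STRONG_CONTEXT.finditer(utr)]
    return all_positions, strong_positions


if __name__ == "__main__":
    toy_utr = "CCACCAUGGCCUUUAUGCC"  # made-up sequence for demonstration only
    print(upstream_augs(toy_utr))
```

Counting per UTR and comparing against UTR length would give the kind of distribution described for Figure 1, although sequence context alone does not mark an AUG as a functional start codon.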
Many more characteristics of the UTR sequences can be examined by comparing and crossing all of these data sets, looking for sequence or structural motifs that may be required for translation initiation in the periods of translation dysregulation.
| IRES STRUCTURE |
|---|
|
|
|---|
There are several ways to experimentally determine RNA structure; for example, X-ray crystallography, NMR, chemical and enzymatic structural probing, as well as mutational analysis (Kjems and Egebjerg 1998
). Structures that have been predicted using computer programs often are functionally tested for validity with small sequence mutations (Kanamori and Nakashima 2001
). X-ray crystallography and NMR, while more definitive in determining the structure, have limitations on either the type or length of the RNA molecules that can be investigated. RNA alone tends to form poor crystals that exhibit weak diffraction or heterogeneous samples (Ke and Doudna 2004
), possibly because of the dynamic folding of many RNA molecules. The only crystal structures of RNA not complexed with proteins >76 bases in length in the structure databases are either ribonuclease P or group I intron ribozymes. The limitation for NMR is due to the severe spectral overlap of the four different nucleotides. These two methods have been used in examining small, structurally stable motifs extracted from larger RNA molecules. Enzymatic and chemical probing are aids in determining a secondary structure of an RNA molecule but often have some ambiguous results in their experiments caused by the difficulty in the art of doing the experiment and interpreting the results. These wet-laboratory determinations of structure can also be supported with phylogenetic data comparisons in which sequences that have changed have preserved the structure to preserve the function of the structure. The best data from all these approaches are used to support a proposed model, but conflicting secondary structure models do arise in the literature as has been evident in domain II of HCV (Honda et al. 1999
; Zhao and Wimmer 2001
; Lytle et al. 2002
).
We must also consider how much of the proposed mRNA structure is as significant for IRES activity in vivo as it is in a naked state in vitro when it is derived. Messenger RNAs in vivo start folding as they are transcribed and are covered with the protein complexes required for translocation, splicing, stabilization, capping, polyadenylation, cellular localization, and translation. In vivo chemical probing of structures has been performed on relatively abundant RNAs like rRNA, telomerase RNA, and snRNA (Zaug and Cech 1995
; Mathews et al. 2004
), and could possibly be used to determine whether IRES structures determined in vitro exist in vivo as well.
Viral IRES structures
Several viral IRESes share similar secondary structures, suggesting that similar structures instead of specific sequence recognition sites are used to bind initiation factors used for cap-independent translation. This is a great aid in understanding the IRES mechanism overall, but it must be remembered that ssRNA viruses have a life cycle in the cytoplasm and will not interact with the nucleoprotein complex like cellular mRNA produced in the nucleus. Therefore, we would not expect the noncanonical protein factors to necessarily interact by exactly the same mechanism. Several groups have used characteristics of the viral IRES to separate them into three groups. Type I viral IRESes, which include entero- and rhinoviruses, translate poorly in rabbit reticulocyte lysates (RRL) and require the ribosome to bind and then scan downstream to a start codon 30 to 150 nt away. Type II viral IRESes, which include cardio- and aphthoviruses, translate very efficiently in RRL, encompass the AUG start codon, and do not require scanning. Type I IRESes are stimulated by the eIF4G cleavage products produced by the viral protease, but Type II are not. Type III IRESes are typified by hepatitis A, do not translate at all in RRL, and encompass the AUG start codon. These classes, which may not definitively separate all viral IRES types, serve to show how some viruses use a similar mechanism to initiate translation, while others use distinctly separate mechanisms to arrive at the same end.
Comparative studies involving covariation analysis as well as enzymatic and chemical probing of structures demonstrated a conservation of structures between members of the Picornaviridae enterovirus family, Poliovirus and coxsackievirus B3, and the human rhinovirus family (Rivera et al. 1988
; Pilipenko et al. 1989b
). Similar studies found sequence and structural conservation between the Picornaviridae cardiovirus genus (EMCV, TMEV) and the aphthovirus FMDV (Pilipenko et al. 1989a
). Further examination of variants and structural probing of PV (Pilipenko et al. 1992
), mutational analysis of EMCV (Hoffman and Palmenberg 1995
; Kolupaeva et al. 1996
), and FMDV (Lopez de Quinto and Martinez-Salas 1997
) have further refined the original structure models. The preservation of a stem structure in the base of region 3 of FMDV has been shown to be more important than the sequence that creates the structure (Martinez-Salas et al. 1996
; Martinez-Salas et al. 2002
). Recently, the central domain was analyzed with chemical and enzymatic probing, which slightly altered the predicted structure (Fernandez-Miragall and Martinez-Salas 2003
).
Several Flaviviridae viruses contain IRES sequences and have had their structures determined. In the pestivirus genus of Flaviviridae, there are the bovine viral diarrhea virus (BVDV) and classical swine fever virus (CSFV); in the hepacivirus genus, there is HCV; and unclassified are the GBV-A, B, and C viruses. Lemon and Honda (1997)
reviewed the similarity and structural importance of their IRES structures. The most studied of all IRES structures is that of the hepatitis C virus (HCV), because of its importance as a human pathogen. The initial secondary structure was predicted using enzymatic probing results as constraints in an early version of the MFOLD program (Zuker and Stiegler 1981
) along with comparative computer-predicted models with the UTRs of BVDV and CSFV (Brown et al. 1992
). There have been numerous studies adding to the refinement of the IRES structure (Lemon and Honda 1997
; Lyons et al. 2001
; Odreman-Macchioli et al. 2001
), arriving at two similar models (Honda et al. 1999
; Zhao and Wimmer 2001
; Lytle et al. 2002
) with some differences in domain II. Smaller stem-loop motifs of the HCV IRES structure have had their tertiary structure determined using both X-ray crystallography (Kieft et al. 2002
) and NMR (Klinck et al. 2000
; Lukavsky et al. 2000
; Collier et al. 2002
). Recently, the tertiary structure for domain II has been determined with NMR (Lukavsky et al. 2003
), finally resolving the conflicts between the structural models proposed for this domain. It is important to note that even 10 years of extensive chemical and enzymatic probing of the HCV IRES did not yield an accurate structure, because of the limitations of that approach. A structure model including the NMR and X-ray data is presented in Figure 4. Cryo-EM studies of the HCV IRES bound to the 40S ribosomal subunit showed that domain II, which is necessary for IRES activity, specifically produces conformational changes in the 40S subunit (Spahn et al. 2001
). The segments IIIe and IIIf have recently been shown to interact with ribosomal protein S5 in an IRES-specific manner that does not seem to be important in cap-dependent translation initiation (Ray and Das 2004
), suggesting that IRES RNA interaction with the ribosome can be different from that of non-IRES-containing mRNA even without considering nonribosomal protein factors.
The other known Flaviviridae viral IRESes, those of GBV-B, BVDV, and CSFV, share secondary structures similar to that of HCV (Lemon and Honda 1997
). BVDV and CSFV structure models (Brown et al. 1992
) were originally proposed by aligning four sequences, doing manual sequence covariation analysis, and using the determined base pairs as constraints on the MFOLD program (Zuker and Stiegler 1981
). Mutation analysis has substantiated the CSFV pseudoknot (Rijnbrand et al. 1997
; Fletcher and Jackson 2002
) and stem-loop IIIa (Fletcher and Jackson 2002
), and enzymatic probing has supported other sections of the structure (Kolupaeva et al. 2000
). Only a little supportive work (Grassmann et al. 2005
) has been done on the original model of the BVDV UTR structure (Brown et al. 1992
). GBV-B was modeled by comparing sequences similar to those of HCV and using those that might base pair as constraints in the MFOLD program, followed by mutational studies on the similar stems and loops in domains II and III (Rijnbrand et al. 2000
) as well as NMR on specific domain III stem-loops (Rijnbrand et al. 2004
).
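The manual covariation step described above can be illustrated with a minimal Python sketch. It assumes a gap-free toy alignment and a deliberately simple criterion: two columns are reported when every sequence can form a Watson-Crick or G-U pair at those positions and at least two different pair types occur, which indicates compensatory changes. Real covariation analyses use more careful statistics, and the function name and example sequences here are hypothetical. The reported column pairs are the kind of base-pair constraints that could then be supplied to a folding program.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

# Watson-Crick pairs plus the G-U wobble pair, in both orientations.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def covarying_columns(alignment: Sequence[str]) -> List[Tuple[int, int]]:
    """Return 0-based column pairs (i, j) that pair in every sequence
    and show at least two distinct pair types across the alignment."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    hits = []
    for i, j in combinations(range(length), 2):
        observed = {(seq[i], seq[j]) for seq in alignment}
        # Every observed combination must be a valid pair, and the pair
        # type must vary between sequences (a compensatory change).
        if observed <= ALLOWED_PAIRS and len(observed) >= 2:
            hits.append((i, j))
    return hits


# Hypothetical toy alignment: only the first and last columns covary.
toy_alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
constraints = covarying_columns(toy_alignment)
```

Requiring at least two pair types excludes columns that are simply invariant, since a fully conserved pair offers no covariation evidence for or against pairing.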
RNase P is an endoribonuclease that processes tRNA precursor 5'-ends to the correct length. It has been used to cleave the 5'-UTR IRES-containing regions of HCV, BVDV, CrPV, EMCV, and CSFV (Lyons and Robertson 2003
). This suggests that the structures of these IRESes may mimic a portion of a tRNA. Lyons and Robertson (2003)
postulate a model in which these IRES structures would occupy the E-site of the ribosome, positioning it for the AUG start codon downstream. A structural element similar to tRNA, spatially situated so that the start codon is properly placed in the ribosome, would greatly enhance the ability to find IRESes in the database, as effective search methods exist that can find these structures within genomic sequences (Lowe and Eddy 1997
; Tsui et al. 2003
). These putative tRNA-like sequences may not reflect a complete tRNA structure as RNase P cleaves several different RNA structures (Gopalan et al. 2002
) including a minimal stem structure of a tRNA molecule as a substrate (Zuleeg et al. 2001
). A complete tRNA structure may also not be needed for affinity to the E-site of the ribosome. This proposed structural element used to direct the positioning of mRNA in the ribosome has an even greater role in CrPV IRES-mediated translation, where translation initiation occurs without eIF2 and an activated Met-tRNA in the P-site (Wilson et al. 2000a
). The first activated tRNA starts translation on an alanine codon at the A-site. The interactions in CrPV substituting for Met-tRNA would be different from those of just a tRNA, but still suggest structural elements that may be shared with other IRESes. This is a good mechanism for the virus to overcome the cell's antiviral response of shutting off protein synthesis through PKR phosphorylation of eIF2α (Stark et al. 1998
). Cellular stress leading to transcription of TNF-α and IFN-γ activates PKR as well through conserved RNA motifs in the UTRs of their mRNAs (Kaempfer 2003
), requiring cellular proteins to translate in this context in an alternative manner possibly akin to CrPV.
Other Dicistroviridae also seem to be able to initiate translation without a methionine tRNA. A structural model was proposed using the intergenic IRES sequence upstream of the capsid proteins from Plautia stali intestine virus (PSIV), Taura syndrome virus (TSV), and acute bee paralysis virus (ABPV) and used to search nucleotide databases with a pattern searching program (Nishiyama et al. 2003
). The key region of the model had experimentally been shown to be a pseudoknot (PK1) 5' of the capsid-coding region. This structure, even with relaxed search parameters, was not found in extensive database searches, strongly suggesting this functional structure to be distinct to Dicistroviridae.
Coxsackievirus B3, a member of the Picornaviridae, has a GNRA loop in its IRES that influences the binding of La but not of PTB (Bhattacharyya and Das 2005
). La binds to several places on the IRES and not specifically the loop, but the RNA structure of the IRES may change because of mutation of this loop, suggesting a structure requirement for La binding in this case.
There is a definite conservation of RNA structures within the IRES sequences of groups of viruses. Mutation analysis studies have shown that the preservation of these structures is more important for IRES function than the actual sequence.
Cellular IRES structures
We know that viruses have evolved regulatory mechanisms borrowed from host cells; therefore, since some viruses share IRES RNA structures, it is reasonable to expect that some cellular IRESes also share a structure/function relationship. At this time, the secondary structure has been derived with enzymatic and chemical probing for several cellular IRESes, from the transcripts of c-Myc (Le Quesne et al. 2001
), L-Myc (Jopling et al. 2004
), Apaf-1 (Mitchell et al. 2003
), FGF-2 (Bonnal et al. 2003b
), FGF-1 (Martineau et al. 2004
), Kv1.4 (Jang et al. 2004
), Bag-1 (Pickering et al. 2004
), Igf2 (Pedersen et al. 2002
), cat-1 (Yaman et al. 2003
), Mnt, and MTG8a (Mitchell et al. 2005
). Their structure/function relationships are described below.
As has been mentioned previously, a common Y structure (Le and Maizel 1997
) had been predicted for cellular IRESes based on the computational comparison of several orthologs of BiP and FGF-2 UTRs. This pattern has been adapted for the PATSEARCH program (Grillo et al. 2003
) to annotate the UTRdb entries as putative IRES motifs and is used by the UTRscan Web server. Of the 33,677 entries in Release 20 of UTRdb, 7122 entries have been annotated as IRESes, which is ~21% of all the entries. Using the same pattern to search the cellular IRESes from Table 1 with RNAMotif finds the pattern in ~21% of the entries, showing that this pattern is no more common in known IRES-containing UTRs than in all UTRs.
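The proportion quoted above, and the conclusion drawn from it, can be checked with a few lines of Python. The two counts come from the text. The hit rate among known IRESes is taken as the rounded 21% stated in the text, since the underlying counts for Table 1 are not given here, and the function and variable names are illustrative.

```python
def fraction(hits: int, total: int) -> float:
    """Fraction of entries matching the pattern."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Counts reported for Release 20 of UTRdb.
utrdb_annotated = 7122
utrdb_total = 33677
background_rate = fraction(utrdb_annotated, utrdb_total)

# Rounded rate reported for the known cellular IRESes (Table 1).
known_ires_rate = 0.21

# An enrichment ratio near 1 means the pattern is no more common in
# known IRES-containing UTRs than in UTRs in general.
enrichment = known_ires_rate / background_rate

print(f"background rate: {background_rate:.1%}")
print(f"enrichment ratio: {enrichment:.2f}")
```

A motif that truly marked IRES elements would be expected to give a ratio well above 1, which is not what these figures show.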
Some studies have proposed that mRNA binding proteins open up the natural RNA structure of the IRES and present single-stranded RNA for other ITAFs or the small ribosomal subunit to bind to. Therefore, the structure would not be the "landing pad" for the ribosome but the "attenuator" that is controlled by the ability of some proteins to change it. For example, the poly(rC)-binding protein 1 (PCBP1) appears to open the Bag-1 IRES structure, allowing PTB-1 to bind. Mutations that open up the binding region of PCBP1 on the Bag-1 IRES seem to remove any requirement for this factor, even enhancing IRES activity after PTB-1 is added (Pickering et al. 2004
). Although PCBP1's role seems to be to open up the structure for PTB-1, it is not clear whether the structure is necessary for its binding. A similar mechanism was proposed for the ITAFs Unr and PTB in the Apaf-1 IRES (Mitchell et al. 2003