Published online before print September 6, 2006, 10.1261/rna.157806
RNA (2006), 12:1755-1785. Published by Cold Spring Harbor Laboratory Press. Copyright © 2006 RNA Society.

REVIEW

Searching for IRES

Stephen D. Baird1,4, Marcel Turcotte2, Robert G. Korneluk1,3,4 and Martin Holcik1,3,4

1. Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ontario K1H 8M5, Canada
2. School of Information Technology and Engineering, University of Ottawa, Ontario K1N 6N5, Canada
3. Department of Pediatrics, University of Ottawa, Ontario K1H 8M5, Canada
4. Apoptosis Research Centre, Children's Hospital of Eastern Ontario, Ottawa, Ontario K1H 8L1, Canada


    ABSTRACT
 TOP
 ABSTRACT
 INTRODUCTION
 MECHANISM OF CAP-DEPENDENT...
 DATABASES AND IRES
 IRES STRUCTURE
 RNA STRUCTURE PREDICTION AND...
 SINGLE SEQUENCE STRUCTURE...
 CONCLUDING REMARKS
 REFERENCES
 
The cell has many ways to regulate the production of proteins. One mechanism is through changes to the machinery of translation initiation. These alterations favor the translation of one subset of mRNAs over another. It was first shown that internal ribosome entry sites (IRESes) within viral RNA genomes allowed the production of viral proteins more efficiently than most of the host proteins. The RNA secondary structure of viral IRESes has sometimes been conserved between viral species even though the primary sequences differ. These structures are important for IRES function, but no similar structural conservation has yet been shown in cellular IRESes. With the advances in mathematical modeling and computational approaches to complex biological problems, is there a way to predict an IRES in a data set of unknown sequences? This review examines what is known about cellular IRES structures, as well as the data sets and tools available to examine this question. We find that the lengths, number of upstream AUGs, and %GC content of 5'-UTRs of the human transcriptome have a similar distribution to those of published IRES-containing UTRs. Although the UTRs containing IRESes are on average longer, almost half of all 5'-UTRs are long enough to contain an IRES. Examination of the available RNA structure prediction software and RNA motif searching programs indicates that while these programs are useful tools to fine-tune empirically determined RNA secondary structures, accurate de novo secondary structure prediction of large RNA molecules, and the subsequent identification of new IRES elements by computational approaches, is still not possible.

Keywords: IRES; RNA; secondary structure; prediction software


    INTRODUCTION
 
The cell has many ways of regulating the production of a protein from a gene. In this review, we focus on one mechanism of initiating the translation of messenger RNA without following the standard pathway used by the majority of mRNAs. Translation initiation via the internal ribosome entry site (IRES) is a mechanism by which the cell allows translation of specific mRNAs because of unique RNA sequences in their untranslated regions (UTRs), which recruit ribosomes. Although some viral IRESes share primary sequence or secondary structure similarity, this similarity has not yet been found between known cellular IRESes, which has raised some questions regarding the existence of cellular IRESes (Kozak 2001, 2003). A sequence containing a cellular IRES can be cloned and tested for function outside of its native gene context, but it is not known what commonality exists to allow various IRESes to recruit ribosomes without using the standard protein translation initiation mechanism.

There have been many very good reviews on IRESes over the years that are helpful in understanding the different facets of this mechanism of translation initiation. Some favorites are Hellen and Sarnow (2001) and Jackson et al. (1995); reviews on picornaviruses (Belsham and Sonenberg 1996), FMDV IRES structure/function (Martinez-Salas et al. 2002), structural aspects relevant to medical intervention (Gallego 2002), and cancer (Holcik 2004; Stoneley and Willis 2004); the very critical and controversial Kozak (2001, 2003); and reviews on stress-related IRESes (Holcik et al. 2000; Holcik and Sonenberg 2005; Lewis and Holcik 2005). In this review, we examine the published data that could aid in the detection of unknown IRESes in an mRNA database and the RNA motif/structure-predicting and search programs presently available, which could be applicable to this search.


    MECHANISM OF CAP-DEPENDENT TRANSLATION, INHIBITION, AND IRES
 
The textbook explanation of standard translation initiation has the cap-binding protein, the eukaryotic initiation factor eIF4E, recruited to the 5'-end of the mRNA, where it binds to the modified "cap" nucleotide, a 7-methylguanosine (m7G), found on the 5'-end of all cellular mRNAs. The initiation factor eIF4G then binds to both the cap-binding protein and the mRNA. eIF4G is called the "scaffold" protein, as it also binds eIF4A, an ATP-dependent helicase responsible for unwinding the secondary and tertiary structure of the RNA during translation, as well as the kinase Mnk1, which regulates eIF4E-binding activity through phosphorylation. The initiation factors eIF4A, eIF4G, and eIF4E are also known collectively as the protein complex eIF4F. These factors are key to the recruitment of the ribosome to the 5'-cap structure of mRNA. The 40S small subunit of the ribosome, the initial part of the ribosome to bind the mRNA, binds the initiator tRNA, Met-tRNAi, together with eIF2 and GTP, at its P-site. This binding is promoted by eIF1A, the eIF4 factors, and eIF3. Together these proteins and tRNAs, now called the 43S complex, are believed to scan along the mRNA, driven by ATP, looking for the proper place to start translation. At the point where the proper start codon is found by the 43S scanning complex, the GTP bound to eIF2 is hydrolyzed to GDP in the presence of eIF5. Several factors dissociate, leaving the 40S subunit with the Met-tRNA anticodon base-paired to the start codon. The large 60S ribosomal subunit then joins the small subunit, and protein synthesis begins. For a review on translation initiation factors, see Dever (2002), and on structural aspects of initiation factors, see Sonenberg and Dever (2003).

At some points of a cell's life, this standard mechanism of translation initiation is compromised, but at these times, some mRNAs use an alternative form of initiation that does not require the cap nucleotide as a congregation site for initiation factors. This was first observed in picornavirus infections, where the uncapped RNA genomes of poliovirus and encephalomyocarditis virus were efficiently translated in eukaryotic cells through the binding of the ribosome to an internal portion of the 5'-UTR of the viral RNA (Jang et al. 1988; Pyronnet et al. 2001). A viral protease cleaves the two forms of eIF4G, shutting down host protein synthesis (Gradi et al. 1998a,b; Svitkin et al. 1999). Even though the cleaved form of eIF4GI does not bind to eIF4E, it has been shown to support translation of capped mRNA, but much less efficiently than that of viral RNA (Ali et al. 2001).

Besides viral infection, there are several other mechanisms that lower the efficiency of cap-dependent translation initiation inside a cell. During mitosis, the eIF4E-binding proteins (4E-BPs) are hypophosphorylated and competitively bind to the cap-binding protein, eIF4E, preventing eIF4E from forming the eIF4F initiation complex (Pyronnet et al. 2001). The phosphorylation states of eIF4E and the 4E-BPs in different cellular conditions have been well reviewed by Gingras et al. (1999).

During times of cellular perturbation, changes in protein levels and mRNA levels do not always correlate (Ideker et al. 2001; Nishizuka et al. 2003). Cellular stresses and the induction of apoptosis cause inhibition of standard translation initiation through the phosphorylation of the eIF2α subunit by one of the four known eIF2α kinases in mammalian cells: HRI, PKR, PERK, and GCN2 (Proud 2005). The initiation factor eIF2 is the adapter protein that binds Met-tRNA and GTP as part of the 43S preinitiation complex. Phosphorylation of the α-subunit of eIF2 creates tighter binding to eIF2B, which prevents the GDP-GTP exchange activity of eIF2B needed for the recycling of eIF2 for successive rounds of initiation of protein synthesis. During apoptosis, the cell still requires the de novo synthesis of proteins required for the orderly breakdown of the cell, but the standard protein translation initiation machinery is also slowed down with changes to the phosphorylation of eIF4G (Ling et al. 2005), eIF4E, and 4E-BPs (Clemens 2001), as well as caspase cleavage of several canonical initiation factors, eIF4B, eIF3, eIF2α, and proteins of the eIF4G family (Clemens et al. 2000). Other molecular events like the hyperphosphorylation of eIF4GII (Pyronnet et al. 2001) or Hsp27 binding to eIF4G during heat shock may hinder the formation of eIF4F (Cuesta et al. 2000).

Despite these cellular conditions that change the normal translation initiation machinery, some cellular mRNAs and viral RNAs still retain the ability to recruit ribosomes to a region of their 5'-UTR to initiate translation. There have been at least 85 cellular IRESes and 39 viral IRESes described in the literature so far, as given in Tables 1 and 2. These sequences have been shown to exhibit cap-independent translation initiation. The standard method of defining this activity has been the ability of the sequence to initiate translation of the second open reading frame (ORF) or cistron in a bicistronic construct. There have been some criticisms of, and caveats attached to, the use of bicistronic constructs for measuring IRES activity (Hellen and Sarnow 2001; Kozak 2001; Sherrill et al. 2004), and checks need to be made for promoter activity in the UTR, reinitiation of the ribosome on the second ORF, aberrant splicing (Holcik et al. 2005), and inconsistent values (Hennecke et al. 2001) of the dual luciferase reporter gene construct. Re-evaluation of the 5'-UTRs of PDGF (Han et al. 2003), PIM-1 (Wang et al. 2005), and the human cyclin-dependent kinase inhibitor, p27kip1 (Liu et al. 2005), has shown that they do not have IRESes as was initially thought (Bernstein et al. 1997; Johannes et al. 1999; Miskimins et al. 2001), but the sequences are able to function as promoters. A statistically rigorous methodology has been published to evaluate the output values of the bicistronic constructs (Jacobs and Dinman 2004). The only drawback to this method becoming universally used is that the minimal sample size might require 25–50 measurements, which is more than the three to nine sample measurements that are usually done.
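The bicistronic readout described above can be sketched numerically. The following is a minimal illustration, assuming the common convention (not stated in this review) of expressing IRES activity as the ratio of second-cistron to first-cistron reporter signal, normalized to the same ratio from a control construct lacking the test sequence. The function name and all readings are hypothetical, and the sketch does not address the promoter, splicing, or reinitiation artifacts discussed above.

```python
from statistics import mean

def relative_ires_activity(test_pairs, control_pairs):
    """Estimate relative IRES activity from bicistronic reporter readings.

    Each pair is (second_cistron_signal, first_cistron_signal).
    Assumption: activity = mean test ratio / mean control ratio.
    """
    test_ratios = [second / first for second, first in test_pairs]
    control_ratios = [second / first for second, first in control_pairs]
    return mean(test_ratios) / mean(control_ratios)

# Hypothetical triplicate readings (arbitrary luminescence units).
test_construct = [(40, 100), (60, 100), (50, 100)]
empty_vector = [(5, 100), (5, 100), (5, 100)]
print(relative_ires_activity(test_construct, empty_vector))
```

With only three readings per construct, as in this toy input, the estimate carries little statistical weight, which is the concern behind the larger sample sizes mentioned above.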


TABLE 1. Reported cellular IRES

 

TABLE 2. Reported viral IRES

 

The problems in assessing IRES activity must be kept in mind when using the published IRES sequences as a data set for bioinformatics tools. Like all biological databases, the data are not perfect, as some sequence activity may have been inadvertently misinterpreted.

mRNA binding proteins
When the function of some canonical translation initiation factors has been disabled or limited in the cell, making cap-dependent translation less efficient, other RNA binding proteins have been found to be required for, or to enhance, IRES-mediated translation initiation. RNA binding proteins have many functions that affect translation, including localization of mRNA in the cytoplasm (zipcode), stabilization of message (AREs), metabolite riboswitches (Sudarsan et al. 2003), and translation repression. Our interest in this case is in IRES-specific cellular trans-acting factors (ITAFs). Some of the mRNA binding proteins implicated in IRES-mediated translation are polypyrimidine tract binding protein PTB/hnRNP I (Giraud et al. 2001; Mitchell et al. 2001, 2003, 2005; Pickering et al. 2003; Cho et al. 2005); La autoantigen (Holcik and Korneluk 2000; Bhattacharyya and Das 2005; Marash and Kimchi 2005); hnRNP A1 (Bonnal et al. 2005); hnRNP C1/C2 (Sella et al. 1999; Millard et al. 2000; Holcik et al. 2003); hnRNP E, hnRNP K, DAP5/p86 (Henis-Korenblit et al. 2000; Nevins et al. 2003; Warnakulasuriyarachchi et al. 2004; Marash and Kimchi 2005); Unr (Mitchell et al. 2001, 2003; Tinton et al. 2005); p60 (Vagner et al. 1996); HuR (Millard et al. 2000); and PCBP1 (Pickering et al. 2003). For a more comprehensive list, see the online IRES database (http://ifr31w3.toulouse.inserm.fr/IRESdatabase/) or the list in Stoneley and Willis (2004).

One group of ITAFs is the heterogeneous nuclear ribonucleoproteins (hnRNPs), which bind to transcripts and form ribonucleoprotein complexes. They play a key role in pre-mRNA processing as well as mRNA export, localization, stability, and translation (Dreyfuss et al. 2002). As an example, PTB has been connected with several functions such as splicing repression, pre-mRNA 3'-end processing, mRNA localization, and mRNA stability. It has also been shown to be involved in IRES activation with both viral (Sanderbrand et al. 2000; Wollerton et al. 2001; Bieleski et al. 2004) and cellular (Giraud et al. 2001; Mitchell et al. 2003, 2005; Pickering et al. 2003) IRESes, but it has also been shown to be inhibitory to IRES activity in Unr (Cornelis et al. 2005) or BiP (Kim et al. 2000). Several IRESes have a polypyrimidine tract at the 3'-end that has been shown to be important for activity (Kaminski et al. 1994) and to be a recognition site for PTB (Kolupaeva et al. 1996). The PTB consensus recognition sequence shown to be important in the HCV IRES sequence is CYYYYCYYYY(G|Y)G, where Y is a pyrimidine (Anwar et al. 2000). It is not known if the location of this sequence within the HCV IRES structure is important, as only the first four bases are consistently in single-stranded regions, but some believe the binding site needs to be within double-stranded regions (Mitchell et al. 2005). PTB has at least three splicing isoforms (Wollerton et al. 2001) and caspase (Back et al. 2002b) or viral protease cleavage products (Back et al. 2002a), which affect IRES activity to differing degrees, as well as tissue-specific paralogs (Pilipenko et al. 2001; Gooding et al. 2003). The protein also contains four RNA binding motifs; thus it is not surprising that other recognition sites for PTB/RNA interactions have also been found that do not match this consensus sequence (Wollerton et al. 2001). Given the variety of PTB forms available, it is understandable that there does not seem to be an unequivocal consensus sequence for PTB.
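The PTB consensus quoted above, CYYYYCYYYY(G|Y)G, translates directly into a regular expression. The sketch below is illustrative only: the function name and the example sequence are hypothetical, Y is taken to mean C or U, and a match says nothing about whether PTB actually binds, since sites that do not fit this consensus are also known.

```python
import re

# Consensus from the text: C YYYY C YYYY (G|Y) G, with Y = C or U.
# The lookahead lets overlapping candidate sites all be reported.
PTB_CONSENSUS = re.compile(r"(?=(C[CU]{4}C[CU]{4}[GCU]G))")

def find_ptb_sites(sequence):
    """Return (1-based start, matched segment) for every consensus match."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA letters
    return [(m.start() + 1, m.group(1)) for m in PTB_CONSENSUS.finditer(rna)]

# Hypothetical sequence containing one site.
print(find_ptb_sites("GGACUUUUCCUCUUGAAC"))  # → [(4, 'CUUUUCCUCUUG')]
```

A pattern this short and degenerate will match often in pyrimidine-rich UTRs, so hits are better treated as candidates to weigh alongside other evidence than as predictions.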

The ITAF La is also very promiscuous in its binding site recognition, and its requirement for IRES function is not always clear. For example, an early in vitro study had shown that the HCV IRES could function without any noncanonical factors, but recent in vivo studies show that La is required for HCV IRES translation (Shimazaki et al. 2002; Costa-Mattioli et al. 2004). The RNA binding protein Unr is required for HRV IRES activity, and although all five cold shock domains of Unr are necessary for RNA binding, the binding seems to be a nonspecific sequence interaction (Brown and Jackson 2004).

Although there is no overall consistency as to which RNA binding proteins are required for IRES activity, they may be consistently involved in ribonucleoprotein complexes that exist during specific cellular contexts of stress, cell cycle, or particular mechanisms of viral control over cellular functions. Using microarrays, some RNA binding proteins have been shown to interact with a specific group of transcripts during times of cellular perturbation (Tenenbaum et al. 2002, 2003). It has been postulated that RNA binding proteins play a role in coregulating the translation of groups of proteins analogous to bacterial operons (Keene and Tenenbaum 2002). Therefore, knowing an RNA binding protein that binds an IRES in a specific cellular context would suggest that other IRESes in that context may also be bound to the same protein.

The binding of the canonical initiation factors also affects IRES activity. When eIF4E, the cap binding protein, has been removed (Hernandez et al. 2004), the Drosophila reaper IRES initiates translation more efficiently than capped messages. The eIF4G family member DAP5/p97 is cleaved to DAP5/p86 and enhances IRES-mediated translation (Henis-Korenblit et al. 2002; Nevins et al. 2003; Warnakulasuriyarachchi et al. 2004; Marash and Kimchi 2005). The importance of ribosomal proteins in IRES translation initiation was shown using a genome-wide RNAi screen of Drosophila genes. One hundred twelve cellular genes were found that were required for infection by the IRES-dependent Drosophila C virus (Cherry et al. 2005). More than 50% of these were genes of ribosomal proteins, two of which when deleted affected IRES but not cap-dependent translation. This suggests that some ITAFs will not be unique to the IRES translation machinery and will include components of the ribosome as well.

The ITAFs are abundant in the cell and seem quite ubiquitous but are not required in all examples of known IRES activity. Specific initiation factors and ribosomal proteins will also play a role. ITAFs can also exist in several forms of post-translational modification and bind a wide range of sequence motifs. Their own regulation and regulated cellular localization would control IRES function. Do the mRNA binding proteins that are used for induction of IRES activity in subsets of the published IRESes possibly have shared binding motifs? Although the RNA binding protein data are not well enough defined to be used alone for database searching, they could still be used in a search algorithm as an added weight.

Functional classes
There appear to be differences among IRESes as to which proteins are necessary to bind to the UTR in order to recruit the ribosome for translation. When this is coupled with the finding that different IRESes initiate translation with varying efficiency depending on the cell type or cellular context in which they are measured (Nevins et al. 2003), the results suggest that there exist several IRES classes. Many groups have pointed out this observation already. Several groups have examined their characterized IRESes in several cell lines, comparing them to other IRESes, and found that specific IRESes will have more activity in one specific cell line relative to another IRES (Stoneley et al. 2000; Jopling and Willis 2001; Nevins et al. 2003; Jopling et al. 2004). This may be due to different available protein factors in each cell line and, therefore, the different subsets of mRNAs that different cells or tissues are able to translate at any one time. This may also explain the lack of primary sequence similarity between the cellular IRESes. For example, the 5'-UTRs containing IRESes from c-Myc and cyclin D are dependent on the activity of AKT through p38 MAPK and ERK signaling to initiate translation (Shi et al. 2005), but the 5'-UTR of p27kip1 is not. The link may be the ITAFs PCBP1, PCBP2, and hnRNP K, which are known to be required by the c-Myc IRES (Evans et al. 2003) and are regulated by phosphorylation (Shi et al. 2005).

In an investigation of FGF1 IRES activity in muscle and cell culture (Martineau et al. 2004), the four separate 5'-UTRs of FGF1 each exhibited some IRES activity. FGF1A and C have similar activity to each other and to that of the FGF2 IRES but much less than the EMCV IRES in cell culture. The same IRES constructs electrotransferred into mouse muscle cells showed FGF1A to be much more active than FGF1C and similar to EMCV, while the FGF2 IRES seemed to exhibit no activity at all. Clearly the context of available ITAFs must favor the translation of one mRNA over another. A very similar contextual difference is seen where IRESes from the muscle-relevant genes SMAD and utrophin are active in myoblast C2C12 muscle cells but not at all in 293T renal epithelial cells for SMAD (Shiroki et al. 2002) or differentiated muscle cells for utrophin (Miura et al. 2005). Other examples are the IRESes from transcripts of calcium channel proteins like Scamper exhibiting tissue-specific activity in kidney cells (De Pietri Tonelli et al. 2003) and the Nkx6.1 IRES being most active in beta-cells (Watada et al. 2000). The Apaf-1 IRES has been found to be more active in neuronal cell types, possibly due to the presence of a neuronal isoform of PTB, which seems to confer greater activity than PTB-1 (Mitchell et al. 2003), and this may be where the Apaf-1 IRES is most physiologically relevant. This correlates well with the developmental problems found in the brains of Apaf-1 knockout mice (Cecconi et al. 1998). In contrast, the IRES of HRV is repressed in neuronal cells because of the presence of the mRNA binding protein DRBP76/NF90 (Merrill et al. 2006). A list of IRESes that are regulated can be found in the review by Komar and Hatzoglou (2005). Whereas viral IRESes might share a more universal context of translation regulation, and therefore some similarity has been found, the larger number of cellular contexts that would require different regulation implies a large number of classes of IRES with many different sequence and structural components. Even saying there are regulatory classes of IRESes may be too rigid, as there may be a loose overlap of some mRNA translation.

18S complementation and modular elements
It has been pointed out by Chappell et al. (2000) that partial IRES activity is still retained when the segments of a 5'-UTR ascribed to full IRES activity are partially deleted, and therefore some elements that help to recruit ribosomes must still exist in the remaining sequence. In some UTRs, nonoverlapping segments of sequence retain partial IRES activity, suggesting that different modules may act synergistically to provide full IRES activity in vivo. Nonoverlapping fragments of the Kv1.4 IRES that each retained partial activity showed different patterns of activity when tested in a variety of cell types (Jang et al. 2004). This suggests distinct modular elements with different modes of regulation. Some examples of postulated IRES elements are listed in Table 3. Chappell et al. (2000) had found an example of a distinct module with a 9-nt motif in the Gtx mRNA, complementary to 18S rRNA, that can function as a site for internal initiation of translation. This is an attractive model for ribosome recruitment for internal initiation, as it parallels the function of the bacterial Shine-Dalgarno sequence and 16S rRNA. This complementation to the 18S rRNA is not new and has also been shown to be necessary for reattachment for scanning of the mRNA during "ribosome shunting," where the ribosome is stalled because of a complex structure in the mRNA and must disengage and then re-engage the mRNA on the 3'-side of the structure (Yueh and Schneider 2000). Similar motifs with IRES activity exhibiting 18S rRNA complementation were found in a library of random nucleotides (Owens et al. 2001) and the plant potato virus Y (Akbergenov et al. 2004), and have been suggested to be in the YAP1 and TIF4631 transcripts of yeast (Zhou et al. 2001). Additional copies of the elements from either Gtx or the potato virus Y arranged in tandem produced additive increases of IRES activity. A segment in the 3'-UTR of a hibiscus plant virus, although perhaps not an IRES, was shown to enhance translation through an 18S rRNA complementation (Koh et al. 2002). Complementarity to 18S rRNA is not sufficient on its own, however, as several of the matches in the YAP1 and TIF4631 5'-UTRs are in segments that confer no IRES activity (Zhou et al. 2001). The rRNA/mRNA interaction cannot be too great, as increasing the degree of complementation increases the thermodynamic stability of this interaction, lowering the efficiency of translation, and if large enough can completely inhibit translation (Hu et al. 1999; Verrier and Jean-Jean 2000). This differential ability to translate an mRNA based on the rRNA and mRNA interactions, as well as the interactions due to changes in the structure of ribosomal subunits from one cell type to another, is postulated as an overall method of translation control in the "ribosome filter hypothesis" of Mauro and Edelman (2002). The last few years have seen the realization that rRNA is no longer just a scaffolding for the ribosomal proteins; the ribosome is a ribozyme, and translation is now more RNA centered (Woese 2001). It is, therefore, reasonable to believe with present-day evidence that interactions between mRNA and rRNA can enhance translation initiation.
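A search for short mRNA segments complementary to rRNA, of the kind discussed above, can be sketched as a k-mer lookup. This is a simplified illustration under stated assumptions: only strict Watson-Crick pairing is considered (G-U wobble pairs and mismatches are ignored), the default window of 9 nt echoes the Gtx motif length, and the function names and both toy sequences are hypothetical rather than real 18S rRNA or UTR sequences.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the Watson-Crick reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]

def complementary_sites(mrna, rrna, k=9):
    """List mRNA windows of length k that can pair perfectly with the rRNA.

    Returns (mRNA start, mRNA segment, rRNA start) with 1-based positions;
    only the first rRNA occurrence of each k-mer is recorded.
    """
    mrna = mrna.upper().replace("T", "U")
    rrna = rrna.upper().replace("T", "U")
    rrna_kmers = {}
    for j in range(len(rrna) - k + 1):
        rrna_kmers.setdefault(rrna[j:j + k], j + 1)
    hits = []
    for i in range(len(mrna) - k + 1):
        segment = mrna[i:i + k]
        partner = reverse_complement(segment)  # what the rRNA must contain
        if partner in rrna_kmers:
            hits.append((i + 1, segment, rrna_kmers[partner]))
    return hits

toy_rrna = "AAGGCUACCACAUCCAAGG"   # hypothetical, not real 18S rRNA
toy_mrna = "UUAUGUGGUAGAA"         # hypothetical 5'-UTR fragment
print(complementary_sites(toy_mrna, toy_rrna))  # → [(3, 'AUGUGGUAG', 5)]
```

As the YAP1 and TIF4631 results above indicate, a complementary match alone does not establish IRES activity, so such hits would need experimental follow-up.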


TABLE 3. Reported minimal IRES modules

 
Relatively small IRES elements have also been synthetically derived using a bicistronic vector with 50-nt-long randomly generated sequences inserted between two reporter genes (Venkatesan and Dasgupta 2001). Although two of the sequences that exhibited IRES activity showed no sequence homology with any transcripts in the public databases, the findings did show that only a relatively small sequence is needed for IRES activity. In this case, the synthetic IRESes did not complement 18S rRNA but were able to compete for the trans-acting factors used by the poliovirus IRES. This procedure was repeated in yeast, and 56 IRES functioning elements were found, with 10 having significant matches in the yeast 18S rRNA (Zhou et al. 2003).

Fine mapping of the c-Myc IRES sequence found a minimal 50-base sequence that was responsible for the bulk of the IRES activity (Cencig et al. 2004). The c-Myc sequence did not seem to map to 18S rRNA, and its activity was also not dependent on the secondary structure formed. Within this element, two 14-nt segments with an AX6AC motif were chiefly responsible for ribosome recruitment, reducing c-Myc IRES activity to these modular units. The IRES found within the APC gene that possibly is responsible for the milder form of adenomatous polyposis coli is only 84 nt long (Heppner Goss et al. 2002) and may also be representative of a modular unit with IRES activity.

In an acyclovir-resistant strain of herpes simplex virus, very low levels of thymidine kinase are translated by a small IRES that requires only 12 bases and contains a CUG start codon (Griffiths and Coen 2005). The low levels of thymidine kinase prevent its proper activation of acyclovir but are high enough to retain the virus's pathogenicity. As well as being a new IRES modular element, this also shows how very low levels of IRES translation initiation can be physiologically significant.

The above evidence supports the notion that an IRES can be made of modular units that act synergistically with or without trans-acting factors to recruit a ribosome and enhance translation initiation. The overall structure of the UTR for these small units may be somewhat unimportant. A very stable structure like a large hairpin or a tertiary structure that would bury a modular recognition sequence would probably still have an effect on IRES activity. The modulation of the structure by mRNA binding proteins could, therefore, have a regulatory IRES effect by changing the access to these small modules. As these types of short IRES sequences may be available elsewhere on mRNA sequences, they suggest the possibility of a much expanded proteome (Griffiths and Coen 2005).


    DATABASES AND IRES
 
A searchable database of published IRESes exists at http://ifr31w3.toulouse.inserm.fr/IRESdatabase/ (Bonnal et al. 2003a). Within this database, the IRESes are classified in several categories based on the function of the gene in which they are found, how the genes are regulated, and with which factors the IRES elements interact. Links to some of the sequences are directed to NCBI entries. Unfortunately, the database has not been updated since 2002; Table 1 includes all published cellular IRES sequences to date. Many of the IRES publications did not give exact coordinates to specific database entries for the UTR sequences used in their research. For this reason, Table 1 was compiled to include the GenBank Identifier (GI) number and the position on the sequence of the minimal known IRES for all published IRESes to date. Multiple fasta files of all the cellular or viral IRES sequences are available as supplemental data at http://bio.site.uottawa.ca/IRES_rna_supplement/ as either the full-length UTRs or as a collection of minimal IRES sequences or full UTRs where the minimal sequence is not known. Filling the gap left by the previous IRES database, a new online database of IRES sequences has been created at http://www.iresite.org/IRESite_web.php (Mokrejs et al. 2006).
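Once UTR or IRES sequences are in hand as a multi-FASTA file, the simple properties this review compares between UTR sets (length, number of upstream AUGs, and %GC) are straightforward to tabulate. The sketch below is a minimal illustration: the parser handles only plain FASTA text, the function names are hypothetical, the two records are invented rather than taken from the supplemental files, and every AUG triplet in the UTR is counted as an upstream AUG regardless of reading frame.

```python
def parse_fasta(text):
    """Parse multi-FASTA text into {record name: RNA sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # first word of the header line
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts).upper().replace("T", "U")
            for n, parts in records.items()}

def utr_features(seq):
    """Return length, %GC, and the count of AUG triplets in a 5'-UTR."""
    length = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length if length else 0.0
    upstream_augs = sum(1 for i in range(length - 2) if seq[i:i + 3] == "AUG")
    return {"length": length, "gc_percent": gc_percent,
            "upstream_augs": upstream_augs}

# Invented records for illustration only.
example_fasta = ">utr_a hypothetical\nGGCAUGCC\nAUGAA\n>utr_b\nATAT\n"
records = parse_fasta(example_fasta)
for name, seq in records.items():
    print(name, utr_features(seq))
```

Collecting these dictionaries for a set of IRES-containing UTRs and for a background UTR set would allow the two distributions to be compared directly.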

IRES structure data for 16 cellular IRES and five viral IRES sequences are available at Rfam (Griffiths-Jones et al. 2003), a database of noncoding RNA. For the most part, published IRES structures are used initially at Rfam, and a covariance sequence model is built using UTR sequences from different transcript entries and known ortholog sequences. Where no published structure is known, the covariance models for the IRES structures have either been built from energy minimization program prediction (RNAfold) or from a sequence alignment using the PFOLD program. It should be noted that if the alignment sequences did not give any covariance data, the seed structure would dominate the resulting structure model. This could possibly be the case for HCV, which appears to be little different from its seed structure (Lytle et al. 2002) but dissimilar from the HCV NMR structure data (Lukavsky et al. 2003). For tertiary structure information of IRES elements, the data can be found in the structure database at NCBI or RNABase (Murthy and Rose 2003), a specific RNA structure database that may include additional annotations not included in the standard 3D structure files. At present, there are 13 IRES-related elements from GBV-B, enterovirus, and mostly HCV. Tertiary structures can be converted into secondary structure diagrams using RNAVIEW (Yang et al. 2003).

Other data sets of IRES-enriched sequences are available from microarray expression studies carried out on cells undergoing a perturbation that turns down cap-dependent translation and allows for a greater amount of IRES-mediated translation. Assuming that IRES elements recruit ribosomes more efficiently under these conditions, experiments that isolate mRNAs bound by multiple ribosomes (polysomes) should recover mRNAs with a greater likelihood of containing an IRES element. Johannes et al. (1999) isolated polysomes from poliovirus-infected cells and evaluated any increase in bound mRNA compared to noninfected cells using a 10K human cDNA array. They found ~200 transcripts with a greater than twofold enrichment of polysome-bound mRNA among the ~7000 hits that produced an acceptable signal on the microarray. Several other studies have used microarray examination of polysome-bound mRNA: in cells with and without the von Hippel-Lindau tumor suppressor protein (Galban et al. 2003), to assess rapamycin's effect on translation (Grolleau et al. 2002), in resting and mitogenically activated fibroblasts (Zong et al. 1999), in synchronized cells in the mitotic cycle (Qin and Sarnow 2004), and in yeast during cell cycle arrest (Serikawa et al. 2003) or rapid growth (Arava et al. 2003). Lately some researchers have added a proteomic approach, measuring the increase in protein expression using 2D gels and mass spectrometry as well as the mRNA levels (Ideker et al. 2001; Grolleau et al. 2002). This allows one to specifically discover transcripts that are regulated post-transcriptionally. Microarray studies of mRNA bound to mRNA binding proteins implicated as ITAFs will also yield a database of transcripts, some of which may contain an IRES.

Databases of UTR sequences can be obtained from several sources. Using the Web-based EnsMart in Ensembl (Clamp et al. 2003), UTR data sets from a variety of genomes can be made with a user-defined number of flanking nucleotides. As almost all genes produce several transcripts, one must still consider filtering redundant 5'-UTRs, alternatively spliced UTRs, or alternate UTRs produced by different promoters when creating a database. A prefiltered nonredundant database, UTRdb (Pesole et al. 2002), can be found at http://bighost.area.ba.cnr.it/BIG/UTRHome/, in which UTRs with >90% sequence overlap and 95% nucleotide identity within the overlapping region (Grillo et al. 1996) have been removed. UTRdb annotates sequences that contain patterns matching possible RNA structures/sequences associated with known regulatory elements such as the iron response element (IRE), the histone 3'-UTR stem–loop structure (HSL3), or even IRES elements, the latter adapted from the computationally predicted structure of Le and Maizel (1997). In Release 20 of UTRdb, there are ~34,000 human 5'-UTR sequences and the IRES pattern is found ~7000 times, in ~20% of all the entries. Although mRNAs containing IRES elements could be quite abundant, there is doubt that the pattern actually detects IRESes. The RefSeq database from NCBI can also be used to produce a data set from the transcript entries but requires the user to write his or her own program/script to pull out the UTR sequences. A redundant version of the RefSeq UTRs is available at http://bighost.area.ba.cnr.it/BIG/UTRHome/.

These data sets allow one to compare the sequence characteristics of UTRs believed to contain IRES elements, as discovered by different means, against a complete set of a genome's UTRs. For instance, of the ~23,000 genes in the human genome, around half have AUGs upstream of the defined AUG start codon, as seen in Figure 1, A and B. The number of upstream AUGs does not seem to depend on length, as there is a subset of UTRs with more AUGs than expected by chance, as shown in Figure 1B, but the overall frequency is less than expected by chance. Although many of the upstream AUGs will most likely be relatively close to the Kozak consensus sequence (Suzuki et al. 2000), this alone does not seem to mark them as proper start codons (Peri and Pandey 2001). Around 10% of the upstream AUGs would be considered strong start codons (RXXAUGG, where R is a purine) as defined by Kozak (2005), whereas 41% to 45% of the coding region start codons would be considered strong start codons. Perhaps the ribosome starts to translate at the first AUG that is in the proper context of the surrounding folded RNA structure, bound proteins, and/or a continuous open reading frame. Possibly the start codon is decided on by a "pioneer" scan (Ishigaki et al. 2001) of the mRNA or by an RNA binding protein (McBratney and Sarnow 1996).
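The upstream-AUG counts discussed above can be reproduced with a few lines of code. The following is a minimal sketch, not the script used for Figure 1: the function names are hypothetical, and it assumes that an upstream AUG must lie entirely within the 5'-UTR and that "strong" means a purine at position -3 and a G at position +4 relative to the A of the AUG (the RXXAUGG context).

```python
def is_strong_context(seq: str, i: int) -> bool:
    """True if the AUG starting at index i sits in an RXXAUGG context.

    Assumption: "strong" means a purine (A or G) three bases upstream
    of the A and a G immediately after the AUG.
    """
    return i >= 3 and i + 3 < len(seq) and seq[i - 3] in "AG" and seq[i + 3] == "G"


def upstream_augs(utr5: str, cds: str) -> list[tuple[int, bool]]:
    """Return (position, is_strong) for each AUG lying entirely in the 5'-UTR.

    The coding sequence is appended only so that the +4 position of an
    AUG near the 3' end of the UTR can be inspected. DNA input (T) is
    converted to RNA (U).
    """
    utr5 = utr5.upper().replace("T", "U")
    seq = utr5 + cds.upper().replace("T", "U")
    hits = []
    for i in range(len(utr5) - 2):
        if seq[i:i + 3] == "AUG":
            hits.append((i, is_strong_context(seq, i)))
    return hits


# Toy example (made-up sequence, not taken from Table 1).
example = upstream_augs("GCCACCAUGGCUUUAUGC", "AUGGCC")
print(example)
```

Applying `upstream_augs` to every entry of a UTR data set and tallying the number of hits per UTR would give the kind of distribution summarized in Figure 1, subject to the assumptions stated above.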


FIGURE 1. (A) The number of AUGs found upstream of the natural start codon in 5'-UTRs is compared in several nonredundant 5'-UTR data sets. Data come from human transcripts from RefSeq (21589, Release 13), RefSeq 5'-UTRs fully reviewed by NCBI staff (1049), and the 66 mammalian UTRs containing published IRES from Table 1. Data bins are represented as the percentage of the total entries in each data set. (B) The frequency of upstream AUGs in several nonredundant data sets. A histogram in several asynchronous bins containing the relative frequency of upstream AUGs as a percentage of each data set compares the 5'-UTR from human transcripts in RefSeq, and fully reviewed RefSeq UTRs. The distribution shows a selection for no upstream AUGs in all the data sets, but approximately half of all transcripts contain an upstream AUG where ~10% are considered "strong" AUGs (RNNAUGG) by Kozak (2005). Approximately 45% of annotated start codons in these transcripts are "strong" AUGs.

 
Each data set could be mapped to one common nonredundant set of genes so that ESTs from microarrays, as well as genes that are represented by multiple sequence IDs in the public databases, could be properly compared. Basic characteristics such as upstream AUGs, polypyrimidine tracts, UTR length, GC content, and the putative thermodynamic equilibrium of the structures can then be compared, and the different IRES predictions can be compared to further enrich a data set. As an example, we have compared the set of published IRESes from Table 1 with human 5'-UTR sequences from each of the UTRdb and/or RefSeq databases for upstream AUGs as discussed (Fig. 1), UTR length (Fig. 2), and % GC content (Fig. 3). The Ensembl data set gave similar results to RefSeq.


FIGURE 2. The lengths of 5'-UTRs of human transcripts from nonredundant data sets. The frequency of different lengths of all the UTRs in each data set is placed in increasing bins of 50 nt and plotted as a percentage of the total number of UTRs in each database. A nonredundant data set of 5'-UTRs from UTRdb, RefSeq, fully reviewed transcripts of RefSeq (see Fig. 1 legend), and the mammalian transcripts containing IRES in their 5'-UTR from Table 1 are compared. The legend gives the median, mean, and third quartile values in each data set with the total number of UTRs in each data set in brackets.
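The binning and summary statistics described in the Figure 2 legend can be sketched as follows. This is an illustrative example only: the function names are hypothetical, bins are assumed to start at 0 nt, and the quartile is computed with the default method of Python's `statistics.quantiles`, which may differ from the method used for the published figure.

```python
from collections import Counter
from statistics import mean, median, quantiles


def length_histogram(lengths: list[int], bin_width: int = 50) -> dict[int, float]:
    """Map each bin start (0, 50, 100, ...) to the percentage of UTRs in it."""
    counts = Counter((n // bin_width) * bin_width for n in lengths)
    total = len(lengths)
    return {start: 100.0 * c / total for start, c in sorted(counts.items())}


def length_summary(lengths: list[int]) -> dict[str, float]:
    """Median, mean, and third quartile of a list of UTR lengths."""
    _q1, _q2, q3 = quantiles(lengths, n=4)
    return {"median": median(lengths), "mean": mean(lengths), "third_quartile": q3}


# Made-up lengths for illustration; real input would be the lengths of
# all 5'-UTRs in one data set.
toy_lengths = [10, 40, 60, 120, 160, 480]
print(length_histogram(toy_lengths))
print(length_summary(toy_lengths))
```

Expressing each bin as a percentage of its own data set, as in the legend, is what allows data sets of very different sizes to be plotted on one axis.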

 

FIGURE 3. The distribution of the % GC content of human 5'-UTRs. The degree of RNA folding and structure stability can be partially assessed by the percentage of possible G and C base pairing within the UTR sequence and therefore the percent of Gs and Cs within the sequence. The % GC content of mammalian UTRs with published IRES is compared to that of the human 5'-UTRs from UTRdb and RefSeq and shows a similar distribution. Plotted values are grouped in bins of five.
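As a concrete illustration of the measure plotted in Figure 3, the % GC content of a sequence and its five-percent bin can be computed as in the sketch below. The function names are hypothetical, and the choices to ignore ambiguous bases (such as N) and to place exactly 100% GC in the top bin are assumptions made for this example.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of a sequence.

    Assumption: ambiguous characters (e.g. N) are ignored rather than
    counted in the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("U") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def gc_bin(pct: float, width: int = 5) -> int:
    """Lower edge of the bin holding pct; 100% is folded into the top bin."""
    return min(int(pct // width) * width, 100 - width)


# Made-up sequence for illustration.
pct = gc_percent("GGCCAAUU")
print(pct, gc_bin(pct))
```

Percent GC is only a rough proxy for folding stability, which is why the text also mentions a secondary structure scanning algorithm as an alternative measure.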

 
The most notable difference between cellular IRES sequences and the UTR data sets is that IRES-containing UTRs are generally longer (Fig. 2). It should also be noted that the median length of 5'-UTRs is >150 nt, longer than the commonly stated dogma of it being "usually <100 bases" (Lewin 2006). As minimal IRES sequences can be of that length or less, one conclusion from our comparison is that half of all UTRs are long enough to include an IRES. The % GC content of the IRES data set seems quite similar to the overall distribution of UTRs (Fig. 3), suggesting that the degree of RNA secondary structure stability, as correlated to the amount of GC pairing possible, is no different from that of most UTRs. A similar result is obtained if the secondary structure scanning algorithm of Rivas and Eddy (2000) is used instead of % GC content (data not shown). This does not rule out secondary structure elements within a 5'-UTR that impede translation, but the overall degree of structure in UTRs with IRESes does not appear different from what is found in the distribution of all human 5'-UTRs.

Many more characteristics of the UTR sequences can be examined by comparing and crossing all of these data sets, looking for sequence or structural motifs that may be required for translation initiation in the periods of translation dysregulation.


    IRES STRUCTURE
 
RNA motifs recognized by RNA binding proteins are often made up of structure and/or some primary sequence. Structures can be preserved without preserving the primary sequence, but proteins often bind to specific sequence motifs. Thus, in searching for common attributes of known IRES sequences, one would expect to find a mixture of the two: preserved structures together with some small preserved sequence motifs. Although it is the 5'-UTR sequence that is responsible for recruiting the ribosome, the 3'-UTR may play a similar role (Izquierdo and Cuezva 2000) or synergistically enhance the IRES-mediated translation of an mRNA (Lopez de Quinto et al. 2002; Dobrikova et al. 2003; Koh et al. 2003). It is understood that motifs in 3'-UTRs play a role in controlling translation (Mazumder et al. 2003). Virtually all the characterized IRESes have been tested and function independently of their 3'-UTRs; for this reason, the 5'-UTR alone can be considered.

There are several ways to experimentally determine RNA structure; for example, X-ray crystallography, NMR, chemical and enzymatic structural probing, as well as mutational analysis (Kjems and Egebjerg 1998). Structures that have been predicted using computer programs are often functionally tested for validity with small sequence mutations (Kanamori and Nakashima 2001). X-ray crystallography and NMR, while more definitive in determining the structure, have limitations on either the type or length of the RNA molecules that can be investigated. RNA alone tends to form poor crystals that exhibit weak diffraction or heterogeneous samples (Ke and Doudna 2004), possibly because of the dynamic folding of many RNA molecules. The only crystal structures in the structure databases of RNA >76 bases in length that is not complexed with protein are those of ribonuclease P and group I intron ribozymes. The limitation for NMR is due to the severe spectral overlap of the four different nucleotides. These two methods have been used to examine small, structurally stable motifs extracted from larger RNA molecules. Enzymatic and chemical probing aid in determining the secondary structure of an RNA molecule but often give ambiguous results because of the difficulty of performing the experiments and interpreting the results. These wet-laboratory determinations of structure can also be supported with phylogenetic comparisons, in which sequences that have changed have nevertheless preserved the structure, and with it the function. The best data from all these approaches are used to support a proposed model, but conflicting secondary structure models do arise in the literature, as has been evident for domain II of HCV (Honda et al. 1999; Zhao and Wimmer 2001; Lytle et al. 2002).

We must also consider whether a proposed mRNA structure, derived from naked RNA in vitro, is equally significant for IRES activity in vivo. Messenger RNAs in vivo start folding as they are transcribed and are covered with the protein complexes required for translocation, splicing, stabilization, capping, polyadenylation, cellular localization, and translation. In vivo chemical probing of structures has been performed on relatively abundant RNAs like rRNA, telomerase RNA, and snRNA (Zaug and Cech 1995; Mathews et al. 2004), and could possibly be used to determine whether IRES structures determined in vitro exist in vivo as well.

Viral IRES structures
Several viral IRESes share similar secondary structures, suggesting that similar structures, rather than specific sequence recognition sites, are used to bind the initiation factors required for cap-independent translation. This is a great aid in understanding the IRES mechanism overall, but it must be remembered that ssRNA viruses have a life cycle in the cytoplasm and will not interact with the nucleoprotein complex as cellular mRNA produced in the nucleus does. Therefore, we would not expect the noncanonical protein factors to necessarily interact by exactly the same mechanism. Several groups have used characteristics of the viral IRESes to separate them into three groups. Type I viral IRESes, which include those of entero- and rhinoviruses, translate poorly in rabbit reticulocyte lysates (RRL) and require the ribosome to bind and then scan downstream to a start codon 30–150 nt away. Type II viral IRESes, which include those of cardio- and aphthoviruses, translate very efficiently in RRL, encompass the AUG start codon, and do not require scanning. Type I IRESes are stimulated by the eIF4G cleavage products produced by the viral protease, but Type II are not. Type III IRESes are typified by hepatitis A, do not translate at all in RRL, and encompass the AUG start codon. These classes, which may not definitively separate the viral IRES types, serve to show how some viruses use a similar mechanism to initiate translation, while others use distinctly separate mechanisms to arrive at the same end.

Comparative studies involving covariation analysis as well as enzymatic and chemical probing of structures demonstrated a conservation of structures between members of the enterovirus genus of the Picornaviridae (poliovirus and coxsackievirus B3) and the human rhinoviruses (Rivera et al. 1988; Pilipenko et al. 1989b). Similar studies found sequence and structural conservation between the cardioviruses of the Picornaviridae (EMCV, TMEV) and the aphthovirus FMDV (Pilipenko et al. 1989a). Further examination of variants and structural probing of PV (Pilipenko et al. 1992), and mutational analysis of EMCV (Hoffman and Palmenberg 1995; Kolupaeva et al. 1996) and FMDV (Lopez de Quinto and Martinez-Salas 1997), have refined the original structure models. The preservation of a stem structure in the base of region 3 of FMDV has been shown to be more important than the sequence that creates the structure (Martinez-Salas et al. 1996; Martinez-Salas et al. 2002). Recently the central domain was analyzed with chemical and enzymatic probing, which slightly altered the predicted structure (Fernandez-Miragall and Martinez-Salas 2003).

Several Flaviviridae viruses contain IRES sequences and have had their structures determined. In the pestivirus genus of the Flaviviridae, there are bovine viral diarrhea virus (BVDV) and classical swine fever virus (CSFV); in the hepacivirus genus, there is HCV; and unclassified are the GBV-A, B, and C viruses. Lemon and Honda (1997) reviewed the similarity and structural importance of their IRES structures. The most studied of all IRES structures is that of the hepatitis C virus (HCV), because of its importance as a human pathogen. The initial secondary structure was predicted using enzymatic probing results as constraints in an early version of the MFOLD program (Zuker and Stiegler 1981), along with comparative computer-predicted models of the UTRs of BVDV and CSFV (Brown et al. 1992). Numerous studies have refined the IRES structure (Lemon and Honda 1997; Lyons et al. 2001; Odreman-Macchioli et al. 2001), arriving at two similar models (Honda et al. 1999; Zhao and Wimmer 2001; Lytle et al. 2002) with some differences in domain II. Smaller stem–loop motifs of the HCV IRES structure have had their tertiary structure determined using both X-ray crystallography (Kieft et al. 2002) and NMR (Klinck et al. 2000; Lukavsky et al. 2000; Collier et al. 2002). Recently, the tertiary structure of domain II was determined with NMR (Lukavsky et al. 2003), finally resolving the conflicts among the structural models proposed for this domain. It is important to note that extensive chemical and enzymatic probing of the HCV IRES over 10 years still did not yield an accurate model, because of the limitations of that approach. A structure model including the NMR and X-ray data is presented in Figure 4. Cryo-EM studies of the HCV IRES bound to the 40S ribosomal subunit showed that domain II, which is necessary for IRES activity, specifically produces conformational changes in the 40S subunit (Spahn et al. 2001). The segments IIIe and IIIf have recently been shown to interact with ribosomal protein S5 in an IRES-specific manner that does not seem to be important in cap-dependent translation initiation (Ray and Das 2004), suggesting that IRES RNA interaction with the ribosome can be different from that of non-IRES-containing mRNA even without considering nonribosomal protein factors.


FIGURE 4. The correctly predicted RNA secondary structure of HCV IRES by MFOLD. (A) The empirically predicted structure of the HCV IRES is shown with correctly predicted basepairs and nonpairing bases with gray background of the lowest energy prediction when using MFOLD with no constraints. (B) The best predicted structure was not the most thermodynamically stable fold, and the predicted base pairs and nonpairing bases shown are in gray. (C) All of the correctly predicted base pairs and nonpairing bases from all of the 36 predicted suboptimal folds using a 50% suboptimal parameter in MFOLD. Structures for figures have been produced with RnaViz2 (De Rijk et al. 2003).
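The comparison summarized in Figure 4, in which predicted base pairs are checked against a reference structure and then pooled over suboptimal folds, can be sketched in code. This example is an illustration under stated assumptions, not the procedure used to draw the figure: structures are assumed to be given in dot-bracket notation (which cannot represent the HCV pseudoknot), the function names are hypothetical, and only base pairs are scored, not nonpairing bases.

```python
def base_pairs(dot_bracket: str) -> set[tuple[int, int]]:
    """Parse a dot-bracket string into a set of (i, j) base-pair indices."""
    stack: list[int] = []
    pairs: set[tuple[int, int]] = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: unclosed '('")
    return pairs


def recovered_pairs(reference: str, predictions: list[str]) -> set[tuple[int, int]]:
    """Reference base pairs found in at least one of the predicted folds."""
    ref = base_pairs(reference)
    found: set[tuple[int, int]] = set()
    for pred in predictions:
        found |= base_pairs(pred) & ref
    return found


# Toy structures (made up): each fold recovers one of the two reference stems.
reference = "((..))..((..))"
folds = ["((..))........", "........((..))"]
print(len(recovered_pairs(reference, folds[:1])), "of", len(base_pairs(reference)))
print(len(recovered_pairs(reference, folds)), "of", len(base_pairs(reference)))
```

Passing only the lowest-energy fold corresponds to panel A, whereas passing the full list of suboptimal folds corresponds to the pooled view in panel C.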

 
The HCV IRES is an example of how both tertiary and secondary structure are important for IRES function. Conservation of specific stems in domain II preserves IRES function regardless of the sequence used (Honda et al. 1999). Point mutations that alter the tertiary structure of interdomain interactions (Kieft et al. 1999) and mutations to the pseudoknot (Wang et al. 1995) severely affect IRES activity. The importance of these domains has become clearer as domain III has been shown to interact directly with eIF3 and the 40S ribosomal subunit (Kieft et al. 2001), and cryo-EM studies have shown domains II and III wrapped on the 40S subunit, both interacting with the E-site and creating a conformational change of the subunit itself (Spahn et al. 2001). Further studies of this interaction show that the HCV IRES structure sits in the same spot on the 40S subunit normally occupied by initiation factor eIF4G (Siridechadilok et al. 2005), suggesting that the IRES structure has taken the place of eIF4G, which is not needed for IRES function in vitro.

The other known Flaviviridae viral IRESes, those of GBV-B, BVDV, and CSFV, share secondary structures similar to that of HCV (Lemon and Honda 1997). The BVDV and CSFV structure models (Brown et al. 1992) were originally proposed by aligning four sequences, doing manual sequence covariation analysis, and using the determined base pairs as constraints in the MFOLD program (Zuker and Stiegler 1981). Mutation analysis has substantiated the CSFV pseudoknot (Rijnbrand et al. 1997; Fletcher and Jackson 2002) and stem–loop IIIa (Fletcher and Jackson 2002), and enzymatic probing has substantiated other sections of the structure (Kolupaeva et al. 2000). Only a little supportive work (Grassmann et al. 2005) has been done on the original model of the BVDV UTR structure (Brown et al. 1992). GBV-B was modeled by comparing sequences similar to HCV and using those that might base pair as constraints in the MFOLD program, followed by mutational studies on the similar stems and loops in domains II and III (Rijnbrand et al. 2000) as well as NMR on specific domain III stem–loops (Rijnbrand et al. 2004).

RNase P is an endoribonuclease that processes tRNA precursor 5'-ends to the correct length. It has been used to cleave the 5'-UTR IRES-containing regions of HCV, BVDV, CrPV, EMCV, and CSFV (Lyons and Robertson 2003). This suggests that the structures of these IRESes may mimic a portion of a tRNA. Lyons and Robertson (2003) postulate a model in which these viruses would occupy the E-site of the ribosome, positioning it for the AUG start codon downstream. A structural element similar to tRNA spatially situated so that the start codon would be properly placed in the ribosome would greatly enhance the ability to find IRESes in the database, as effective search methods exist that can find these structures within genomic sequences (Lowe and Eddy 1997; Tsui et al. 2003). These putative tRNA-like sequences may not reflect a complete tRNA structure as RNase P cleaves several different RNA structures (Gopalan et al. 2002) including a minimal stem structure of a tRNA molecule as a substrate (Zuleeg et al. 2001). A complete tRNA structure may also not be needed for affinity to the E-site of the ribosome. This proposed structural element used to direct the positioning of mRNA in the ribosome has an even greater role in CrPV IRES-mediated translation, where translation initiation occurs without eIF2 and an activated Met-tRNA in the P-site (Wilson et al. 2000a). The first activated tRNA starts translation on an alanine codon at the A-site. The interactions in CrPV substituting for Met-tRNA would be different from that of just a tRNA, but still suggest structural elements that may be shared with other IRESes. This is a good mechanism by the virus to overcome the antiviral response of the cell to shut off protein synthesis through PKR phosphorylation of eIF2α (Stark et al. 1998). Cellular stress leading to transcription of TNFα and IFN-γ activates PKR as well through conserved RNA motifs in the UTRs of their mRNAs (Kaempfer 2003), requiring cellular proteins to translate in this context in an alternative manner possibly akin to CrPV.

Other Dicistroviridae also seem to be able to initiate translation without a methionine tRNA. A structural model was proposed using the intergenic IRES sequence upstream of the capsid proteins from Plautia stali intestine virus (PSIV), Taura syndrome virus (TSV), and acute bee paralysis virus (ABPV) and used to search nucleotide databases with a pattern searching program (Nishiyama et al. 2003). The key region of the model had experimentally been shown to be a pseudoknot (PK1) 5' of the capsid-coding region. This structure, even with relaxed search parameters, was not found in extensive database searches, strongly suggesting that this functional structure is distinct to Dicistroviridae.

Coxsackievirus B3, which is in the same family, has a GNRA loop in its IRES that influences the binding of La but not of PTB (Bhattacharyya and Das 2005). La binds to several places on the IRES and not specifically the loop, but the RNA structure of the IRES may change because of mutation of this loop, suggesting a structure requirement for La binding in this case.

There is a definite conservation of RNA structures between groups of viruses within their IRES sequences. Mutation analysis studies have shown that the preservation of these structures in viruses is more important for IRES function than the actual sequence.

Cellular IRES structures
We know that viruses have evolved regulatory mechanisms borrowed from host cells. Because some viruses share IRES RNA structures, it is therefore thought that some cellular IRESes most probably also have a shared structure/function relationship. At this time, the secondary structure has been derived for several cellular IRESes with enzymatic and chemical probing from the transcripts of c-Myc (Le Quesne et al. 2001), L-Myc (Jopling et al. 2004), Apaf-1 (Mitchell et al. 2003), FGF-2 (Bonnal et al. 2003b), FGF1 (Martineau et al. 2004), Kv1.4 (Jang et al. 2004), Bag-1 (Pickering et al. 2004), Igf2 (Pedersen et al. 2002), cat-1 (Yaman et al. 2003), Mnt, and MTG8a (Mitchell et al. 2005). Their structure/function relationships are described below.

As mentioned previously, a common Y structure (Le and Maizel 1997) had been predicted for cellular IRESes based on the computational comparison of several orthologs of Bip and FGF2 UTRs. This pattern was adapted for the PATSEARCH program (Grillo et al. 2003) to annotate the UTRdb entries as putative IRES motifs and is used by the UTRscan Web server. Of the 33,677 entries in Release 20 of UTRdb, 7122 entries have been annotated as IRESes, which is ~21% of all the entries. Using the same pattern to search the cellular IRESes from Table 1 with RNAMotif finds the pattern in ~21% of the entries, showing that this pattern is no more common in known IRES-containing UTRs than in all UTRs.
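The argument above rests on a simple ratio of two proportions, which can be checked directly. In this small sketch the UTRdb counts are those quoted in the text, the value for the known IRES set is the approximate ~21% reported in the text (the exact counts are not given here), and the function name is hypothetical.

```python
def percent(hits: int, total: int) -> float:
    """Percentage of entries in a data set that match the pattern."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


UTRDB_ENTRIES = 33_677       # Release 20 of UTRdb, as quoted in the text
UTRDB_PATTERN_HITS = 7_122   # entries annotated with the IRES pattern
KNOWN_IRES_PERCENT = 21.0    # approximate value reported for Table 1 entries

utrdb_percent = percent(UTRDB_PATTERN_HITS, UTRDB_ENTRIES)
# A ratio near 1 means the pattern is not enriched among known IRESes.
enrichment = KNOWN_IRES_PERCENT / utrdb_percent

print(f"UTRdb: {utrdb_percent:.1f}% of entries match the pattern")
print(f"Enrichment in known IRES-containing UTRs: {enrichment:.2f}-fold")
```

A pattern that genuinely marked IRES elements would be expected to give a ratio well above 1 in the set of experimentally confirmed IRESes.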

Some studies have proposed that mRNA binding proteins open up the natural RNA structure of the IRES and present single-stranded RNA to which other ITAFs or the small ribosomal subunit can bind. In this view, the structure would not be the "landing pad" for the ribosome but the "attenuator" that is controlled by the ability of some proteins to change it. For example, the poly rC binding protein 1 (PCBP1) appears to open the Bag-1 IRES structure, allowing PTB-1 to bind. Mutations that open up the binding region of PCBP1 on the Bag-1 IRES seem to remove any requirement for this factor, even enhancing IRES activity after PTB-1 is added (Pickering et al. 2004). Although PCBP1’s role seems to be to open up the structure for PTB-1, it is not clear whether the structure is necessary for its binding. A similar mechanism was proposed for the ITAFs Unr and PTB in the APAF-1 IRES (Mitchell et al. 2003