BIOINFORMATICS |
1 Departments of Chemistry and 2 Ecology and Evolutionary Biology, Princeton University, New Jersey 08544, USA
Reprint requests to: Laura Landweber, Department of Ecology and Evolutionary Biology, Princeton University, NJ 08544, USA; e-mail: lfl{at}princeton.edu; fax: (609) 258-7892.
| ABSTRACT |
|---|
Keywords: tRNA mimicry; protein translation; RRF; EF-G; EF-Tu; conserved elements
| INTRODUCTION |
|---|
The structure-function correlation is not a privilege reserved for comparisons between the same type of biological molecules: it seems to extend between totally different types of molecules such as protein and RNA. In the last 10 yr, extensive studies of the three-dimensional structure of the translation apparatus have revealed several vivid examples. Namely, several protein translation factors resemble the tRNA molecule in size and shape (Nissen et al. 1995; Liljas 1996; Selmer et al. 1999; Song et al. 2000; Klaholz et al. 2003; Rawat et al. 2003; Hanawa-Suetsugu et al. 2004), and this has been termed "molecular mimicry." Therefore, it was proposed that tRNA and its protein mimics are functionally related in the way they bind to or interact with the ribosome.
However, such claims were based only qualitatively on overall three-dimensional structural similarity and often turned out to be misleading (Brodersen and Ramakrishnan 2003). Currently, comparing the structures of proteins and RNA molecules quantitatively remains a major challenge. The difficulties are not only technical but, more importantly, conceptual: How can we compare two completely different biopolymers that share very few common characteristics? In this study, we present a novel computational approach to study structural similarity quantitatively between proteins and RNA based on the spatial distribution of conserved elements. We apply it to two previously proposed tRNA-protein mimicry cases for which the functional relatedness of the two molecules has recently been determined experimentally. Our results are consistent with the experimental evidence. We hope that this method can advance our understanding of the structure-function correlation and provide a useful protocol for future examinations of other proposed mimicry pairs.
| MATERIALS AND METHODS |
|---|
Comparison of the spatial distribution of conserved elements between protein and RNA
First, we identified conserved elements in EF-G, RRF, and tRNA, respectively (see Supplementary Material at http://oxytricha.princeton.edu/liang/mimicry/mimicry.htm). The ConSurf server (version 2.0) was used to calculate the conservation scores of amino acids (Glaser et al. 2003). Given the three-dimensional structure of a protein as input, this software extracts the protein sequence from the PDB file and automatically searches for homologous sequences of this protein. It aligns the sequences, builds a maximum likelihood phylogenetic tree consistent with the alignment, and then calculates the conservation scores and classifies amino acid positions into nine groups (9, most conserved; 1, most variable). Because ConSurf involves almost no arbitrarily chosen parameters, using it to define the most conserved (or most variable) amino acid residues avoids subjectivity.
In the case of EF-G, domains IV and V mimic the tRNA in the EF-Tu ternary complex. (Domain III is poorly visible in the electron density map and therefore absent from the Protein Data Bank (PDB) file [Liljas 1996].) The PDB file 1DAR was used as input and 247 homologous sequences were used in the alignment. Seventeen of 206 amino acid residues, those with a score of 9, were identified as conserved elements, and 31 amino acid residues with a score of 1 were identified as the most variable elements.
For RRF, the PDB file 1EH1 was used as input and 121 homologous sequences were used in the alignment. Eighteen of 185 amino acid residues in RRF, those with a score of 9, were identified as conserved elements, and 30 amino acid residues with a score of 1 were identified as the most variable elements. Regarding the conserved elements in tRNA, it has been well documented that there are 16 invariant nucleotides among all normal tRNAs (Kim 1978). The conserved CCA nucleotides at the amino acid binding 3' end were excluded, since they have no corresponding parts in the superimposition in either case.
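The selection of conserved and variable positions from the nine conservation grades can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_by_grade` and the toy grade table are hypothetical, and the sketch assumes the grades have already been obtained and stored as a mapping from residue number to grade.

```python
def split_by_grade(grades, conserved_grade=9, variable_grade=1):
    """Return (conserved, variable) residue numbers from a {residue: grade} map.

    Grades follow the nine-group scheme described in the text:
    9 is most conserved, 1 is most variable.
    """
    conserved = sorted(pos for pos, g in grades.items() if g == conserved_grade)
    variable = sorted(pos for pos, g in grades.items() if g == variable_grade)
    return conserved, variable


# Hypothetical grades for six residues (not taken from any real protein).
toy_grades = {10: 9, 11: 5, 12: 1, 13: 9, 14: 1, 15: 7}
conserved, variable = split_by_grade(toy_grades)
print(conserved, variable)
```

Only the two extreme groups are used in what follows: the grade 9 positions as conserved elements and the grade 1 positions for the negative control.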
Second, we superimposed the two partner structures for each protein-RNA mimicry case (EF-G vs. EF-Tu-tRNA; RRF vs. tRNA). The optimal orientations of the two structures in the superimposition were calculated by the BIND2 program, which applies geometric hashing to find the globally maximal matching of two molecules (Chang et al. 2004). As an additional precaution, superimpositions determined manually were then used to confirm the best geometric alignment of the two partner structures. Importantly, in both examples in our study, the two partner structures mimic each other nearly perfectly, so the best superimposition is quite self-evident. The superimposed PDB files are provided in Supplementary Materials (http://oxytricha.princeton.edu/liang/mimicry/mimicry.htm).
Third, we calculated the number of conserved element pairs (CEPs) for each superimposition as follows. For tRNA and its corresponding protein in the superimposition, each nucleotide was represented by one nitrogen atom (N1 for C and U, N9 for A and G) and each amino acid residue was represented by its Cα atom. For a given threshold (R), if the distance between a conserved amino acid and a conserved nucleotide in the superimposition is smaller than the threshold, the pair is counted as a CEP. We scored the total number of CEPs in each superimposition.
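The CEP count defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `REPRESENTATIVE_ATOM`, `representative_atom`, and `count_ceps` and the toy coordinates are hypothetical, and the sketch assumes that every conserved residue-nucleotide pair closer than R counts as one CEP, so a single residue may contribute to more than one pair.

```python
import math

# One representative atom per element, as described in the text:
# N1 for pyrimidines (C, U), N9 for purines (A, G), C-alpha for amino acids.
REPRESENTATIVE_ATOM = {"C": "N1", "U": "N1", "A": "N9", "G": "N9"}


def representative_atom(residue_name):
    """Return the atom name used to represent a nucleotide or amino acid."""
    return REPRESENTATIVE_ATOM.get(residue_name, "CA")


def count_ceps(protein_coords, rna_coords, threshold):
    """Count conserved element pairs closer than `threshold` (in angstroms).

    Assumption: each (residue, nucleotide) pair is scored independently.
    """
    return sum(
        1
        for p in protein_coords
        for r in rna_coords
        if math.dist(p, r) < threshold
    )


# Hypothetical coordinates in angstroms, not taken from any PDB file.
conserved_ca = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
conserved_n = [(3.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
n_cep = count_ceps(conserved_ca, conserved_n, threshold=9.0)
print(n_cep)
```

In the toy data, only the nucleotide at (3, 4, 0) lies within 9 Å of either residue, so the far nucleotide contributes no pairs.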
Fourth, for each superimposition, we generated a randomized background distribution to determine the statistical significance of the observed CEP number. While preserving the conserved elements in tRNA, the same number of amino acid residues was randomly chosen in the corresponding region of the protein as pseudo-conserved elements. The CEP number was then calculated. This random sampling was repeated 1000 times and the frequency of each CEP number in the simulation was calculated. The statistical significance of the observed CEP number (n) is defined as the cumulative probability of the CEP numbers that are not smaller than n in the simulation.
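The randomization described above amounts to an empirical upper-tail test, which might be sketched as follows. The function names, the fixed random seed, and the toy coordinates are assumptions made for illustration; the text specifies only the 1000 repetitions and the definition of the cumulative probability.

```python
import math
import random


def count_ceps(protein_coords, rna_coords, threshold):
    """Count residue-nucleotide pairs closer than `threshold` (angstroms)."""
    return sum(
        1 for p in protein_coords for r in rna_coords
        if math.dist(p, r) < threshold
    )


def empirical_p_value(residue_coords, conserved_idx, rna_coords,
                      threshold, n_trials=1000, seed=0):
    """Return (observed CEP number, fraction of random draws >= observed).

    residue_coords: coordinates of all residues in the compared region.
    conserved_idx:  indices of the truly conserved residues.
    rna_coords:     coordinates of the conserved nucleotides (kept fixed).
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    observed = count_ceps(
        [residue_coords[i] for i in conserved_idx], rna_coords, threshold
    )
    k = len(conserved_idx)
    hits = 0
    for _ in range(n_trials):
        # Draw the same number of residues as pseudo-conserved elements.
        sample = rng.sample(range(len(residue_coords)), k)
        simulated = count_ceps(
            [residue_coords[i] for i in sample], rna_coords, threshold
        )
        if simulated >= observed:
            hits += 1
    return observed, hits / n_trials


# Hypothetical region: 20 residues spaced 5 angstroms apart along a line,
# with two conserved nucleotides sitting next to residues 0 and 1.
residues = [(5.0 * i, 0.0, 0.0) for i in range(20)]
nucleotides = [(0.0, 1.0, 0.0), (5.0, 1.0, 0.0)]
observed, p_value = empirical_p_value(residues, [0, 1], nucleotides, 6.0)
print(observed, p_value)
```

Because the two conserved residues are the only pair that yields four CEPs in this toy layout, a random draw rarely matches the observed number and the resulting probability is small.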
Finally, we carried out a negative control for each case. The most variable amino acid residues in the protein were used in place of the most conserved ones, and the same calculation was then performed. In this situation, the statistical significance of the observed CEP number (n) is defined as the cumulative probability of the CEP numbers that are not larger than n in the simulation.
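The two definitions of significance differ only in the tail of the simulated distribution that is summed. A minimal sketch, assuming the simulated CEP numbers are already available as a list (the function name `tail_probability` and the toy counts are hypothetical):

```python
def tail_probability(null_counts, observed, tail):
    """Fraction of simulated CEP numbers at least as extreme as `observed`.

    tail="upper": counts not smaller than observed (conserved elements).
    tail="lower": counts not larger than observed (negative control).
    """
    if tail == "upper":
        hits = sum(1 for c in null_counts if c >= observed)
    elif tail == "lower":
        hits = sum(1 for c in null_counts if c <= observed)
    else:
        raise ValueError("tail must be 'upper' or 'lower'")
    return hits / len(null_counts)


# Hypothetical simulated CEP numbers from five random draws.
toy_null = [0, 1, 1, 2, 3]
upper_p = tail_probability(toy_null, 1, "upper")  # 4 of 5 counts are >= 1
lower_p = tail_probability(toy_null, 1, "lower")  # 3 of 5 counts are <= 1
print(upper_p, lower_p)
```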
| RESULTS |
|---|
A control test with protein homologs
As an initial test of this idea, we applied the method to several pairs of protein homologs. FtsZ and tubulin are a well-known pair of ancient protein homologs (Fig. 2) in prokaryotes and eukaryotes that function in cell division, among other roles. Owing to their low sequence identity (< 15%), their distant relationship was firmly established only by comparison of their macromolecular and atomic structures, as well as by their functional mechanism (Lowe and Amos 1998; Nogales et al. 1998a,b). We superimposed the two structures (1FSZ-A and 1FFX-A) using SuperPose (version 1.0; Maiti et al. 2004) and identified the most conserved amino acids using ConSurf (66 FtsZ homologs and 230 tubulin homologs, respectively) (Supplementary Material, http://oxytricha.princeton.edu/liang/mimicry/mimicry.htm). In the superimposition of the two proteins, the number of CEPs then served as a measure of the similarity of the spatial distributions of conserved elements. We used the randomized CEP background distribution to determine the statistical significance of the observed CEPs. As anticipated, there are significantly more CEPs than randomly expected for this protein pair (Fig. 3a). More importantly, as shown in Figure 3b, the statistical significance (P-value) of the CEP number depends strongly on the threshold used to define a CEP. When the threshold is very small, no significant result can be detected because the criterion is too strict to score any CEPs; when the threshold is very large, no significant result can be detected either, because the criterion is so loose that any two conserved elements in the superimposition are counted as a CEP. Nevertheless, when the threshold falls into a suitable range, two superimposed structures that share a similar distribution of conserved elements in three-dimensional space will have a CEP number significantly higher than random expectation.
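The dependence of the P-value on the threshold can be reproduced on toy data. The sketch below is an assumption-laden illustration, not the analysis behind Figure 3: the function names, the seed, the coordinates, and the three thresholds are all hypothetical. It scans several thresholds and shows that both a very strict and a very loose criterion give a P-value of 1, while an intermediate one can be significant.

```python
import math
import random


def count_ceps(coords_a, coords_b, threshold):
    """Count element pairs closer than `threshold` (angstroms)."""
    return sum(
        1 for p in coords_a for q in coords_b if math.dist(p, q) < threshold
    )


def p_value_by_threshold(coords_a, conserved_a, conserved_b_coords,
                         thresholds, n_trials=1000, seed=0):
    """Map each threshold to the upper-tail empirical P-value of the CEP number."""
    rng = random.Random(seed)  # seeded only for reproducibility
    k = len(conserved_a)
    result = {}
    for t in thresholds:
        observed = count_ceps(
            [coords_a[i] for i in conserved_a], conserved_b_coords, t
        )
        hits = 0
        for _ in range(n_trials):
            sample = rng.sample(range(len(coords_a)), k)
            simulated = count_ceps(
                [coords_a[i] for i in sample], conserved_b_coords, t
            )
            if simulated >= observed:
                hits += 1
        result[t] = hits / n_trials
    return result


# Hypothetical molecule A: 20 elements spaced 5 angstroms apart; elements 0
# and 1 are conserved and sit next to the two conserved elements of B.
coords_a = [(5.0 * i, 0.0, 0.0) for i in range(20)]
conserved_b = [(0.0, 1.0, 0.0), (5.0, 1.0, 0.0)]
scan = p_value_by_threshold(coords_a, [0, 1], conserved_b, [0.5, 6.0, 1000.0])
print(scan)
```

At 0.5 Å no pair is scored, so every random draw ties the observed count of zero; at 1000 Å every draw scores all possible pairs. Only the intermediate threshold separates the conserved elements from random ones.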
Results from two protein-RNA structural comparisons
When we applied the method to two well-studied protein-RNA pairs (EF-G vs. the EF-Tu-tRNA complex; RRF vs. tRNA), the graphs of P-value versus threshold were clearly different (Fig. 4a,b). The graph for EF-G and EF-Tu-tRNA, the positive example, showed exactly the expected tendency: there is a middle region of the graph (9-20 Å) where a significant P-value could be detected, and the most significant P-value is < 0.001, indicating that the two structures share a similar distribution of conserved elements. However, for RRF and tRNA, the negative example, no significant P-value can be detected at any threshold.
| DISCUSSION |
|---|
For the two pairs of proposed protein mimics that bear remarkable structural resemblance to tRNA, why is the spatial distribution of conserved elements a reliable indicator of function? Regardless of the type of biopolymer, the conserved elements essentially preserve the nature of the biological molecule and determine its overall shape, flexibility, and specific interactions. These factors intrinsically define the role of the molecule in a biological system. Specifically, internal conserved elements play a key role in maintaining the overall structural stability of the molecule, while conserved elements on the surface are more likely to be binding sites and to perform specific interactions. Therefore, the spatial distribution of conserved elements can reflect the selective constraints on the molecule more accurately than overall shape alone. Regarding the positive case in this study, the similar pattern of conserved elements between EF-G and EF-Tu-tRNA may represent a general requirement to enter the same ribosomal cavity, which is necessary to fulfill their biological functions.
While this method appears promising, the extent of its application remains to be explored. First, we applied our method mainly to two well-studied examples of tRNA mimicry because of the relatively strict requirement that the two partner structures can be superimposed. For molecules as dissimilar as RNA and protein, a near perfect superimposition provides the only platform for further analysis. Such similarity does not hold for other proposed protein-RNA mimicry pairs (Song et al. 2000; Rawat et al. 2003), making any structural comparison very subjective and difficult. Second, the method cannot establish a direct correspondence between a conserved amino acid and a conserved nucleotide in the two molecules. Because protein and RNA are chemically different polymers, the specific details for two functional molecules are surely distinct. Here we use the spatial distribution of conserved elements as a proxy for comparison, but one should be cautious not to overinterpret this information. Third, as with any other computational method, the observation of a similar distribution of conserved elements between two molecules does not guarantee related function, since the elements may be conserved for different reasons. The significance of our method is that it increases one's confidence in drawing functional inferences based on the member of the structural pair with known function.
Our study also calls attention to the evolutionary relationship between protein and RNA. The RNA world hypothesis (Gilbert 1986) leads to the conjecture that most or many proteins displaced RNA ancestors, allowing the transition from the RNA world to the protein-dominated world of today (Nakamura 2001). The fact that protein translation, a web of interactive RNAs and proteins, contains protein components that mimic tRNA implies that an important step in this transition was mimicry of functional or catalytic RNA by the proteins that usurped their role (Landweber 1999). As we demonstrate here, our work may further uncover one important architectural rule for RNA mimicry. This approach may furthermore prove useful for probing ancient homology in either proteins or RNA.
| ACKNOWLEDGMENTS |
|---|
| Footnotes |
|---|
Received October 15, 2004; accepted May 19, 2005.
| REFERENCES |
|---|
Aevarsson, A., Brazhnikov, E., Garber, M., Zheltonosova, J., Chirgadze, Y., al-Karadaghi, S., Svensson, L.A., and Liljas, A. 1994. Three-dimensional structure of the ribosomal translocase: Elongation factor G from Thermus thermophilus. EMBO J. 13: 3669-3677.
Agrawal, R.K., Penczek, P., Grassucci, R.A., and Frank, J. 1998. Visualization of elongation factor G on the Escherichia coli 70S ribosome: The mechanism of translocation. Proc. Natl. Acad. Sci. 95: 6134-6138.
Agrawal, R.K., Sharma, M.R., Kiel, M.C., Hirokawa, G., Booth, T.M., Spahn, C.M., Grassucci, R.A., Kaji, A., and Frank, J. 2004. Visualization of ribosome-recycling factor on the Escherichia coli 70S ribosome: Functional implications. Proc. Natl. Acad. Sci. 101: 8900-8905.
Brodersen, D.E. and Ramakrishnan, V. 2003. Shape can be seductive. Nat. Struct. Biol. 10: 78-80.
Chang, P.K., Chen, C.C., and Ouhyoung, M. 2004. A tool for structure alignment of molecules. In Proceedings of IEEE Sixth International Symposium on Multimedia Software Engineering (IEEE-MSE2004) Special Session on Bioinformatics, pp. 354-361. IEEE Computer Society Press, Miami, FL.
Czworkowski, J., Wang, J., Steitz, T.A., and Moore, P.B. 1994. The crystal structure of elongation factor G complexed with GDP, at 2.7 Å resolution. EMBO J. 13: 3661-3668.
Gilbert, W. 1986. The RNA world. Nature 319: 618.
Glaser, F., Pupko, T., Paz, I., Bell, R.E., Bechor-Shental, D., Martz, E., and Ben-Tal, N. 2003. ConSurf: Identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics 19: 163-164.
Hanawa-Suetsugu, K., Sekine, S., Sakai, H., Hori-Takemoto, C., Terada, T., Unzai, S., Tame, J.R., Kuramitsu, S., Shirouzu, M., and Yokoyama, S. 2004. Crystal structure of elongation factor P from Thermus thermophilus HB8. Proc. Natl. Acad. Sci. 101: 9595-9600.
Holm, L. and Sander, C. 1999. Protein folds and families: Sequence and structure alignments. Nucleic Acids Res. 27: 244-247.
Kim, S.H. 1978. Three-dimensional structure of transfer RNA and its functional implications. Adv. Enzymol. Relat. Areas Mol. Biol. 46: 279-315.
Klaholz, B.P., Pape, T., Zavialov, A.V., Myasnikov, A.G., Orlova, E.V., Vestergaard, B., Ehrenberg, M., and van Heel, M. 2003. Structure of the Escherichia coli ribosomal termination complex with release factor 2. Nature 421: 90-94.
Koehl, P. 2001. Protein structure similarities. Curr. Opin. Struct. Biol. 11: 348-353.
Lancaster, L., Kiel, M.C., Kaji, A., and Noller, H.F. 2002. Orientation of ribosome recycling factor in the ribosome from directed hydroxyl radical probing. Cell 111: 129-140.
Landweber, L.F. 1999. Experimental RNA evolution. Trends Ecol. Evol. 14: 353-358.
Liljas, A. 1996. Imprinting through molecular mimicry. Protein synthesis. Curr. Biol. 6: 247-249.
Lowe, J. and Amos, L.A. 1998. Crystal structure of the bacterial cell-division protein FtsZ. Nature 391: 203-206.
Maiti, R., Van Domselaar, G.H., Zhang, H., and Wishart, D.S. 2004. SuperPose: A simple server for sophisticated structural superposition. Nucleic Acids Res. 32: W590-W594.
Nakamura, Y. 2001. Molecular mimicry between protein and tRNA. J. Mol. Evol. 53: 282-289.
Nakano, H., Yoshida, T., Uchiyama, S., Kawachi, M., Matsuo, H., Kato, T., Ohshima, A., Yamaichi, Y., Honda, T., Kato, H., et al. 2003. Structure and binding mode of a ribosome recycling factor (RRF) from mesophilic bacterium. J. Biol. Chem. 278: 3427-3436.
Nissen, P., Kjeldgaard, M., Thirup, S., Polekhina, G., Reshetnikova, L., Clark, B.F., and Nyborg, J. 1995. Crystal structure of the ternary complex of Phe-tRNAPhe, EF-Tu, and a GTP analog. Science 270: 1464-1472.
Nogales, E., Downing, K.H., Amos, L.A., and Lowe, J. 1998a. Tubulin and FtsZ form a distinct family of GTPases. Nat. Struct. Biol. 5: 451-458.
Nogales, E., Wolf, S.G., and Downing, K.H. 1998b. Structure of the αβ tubulin dimer by electron crystallography. Nature 391: 199-203.
Rawat, U.B., Zavialov, A.V., Sengupta, J., Valle, M., Grassucci, R.A., Linde, J., Vestergaard, B., Ehrenberg, M., and Frank, J. 2003. A cryo-electron microscopic study of ribosome-bound termination factor RF2. Nature 421: 87-90.
Selmer, M., Al-Karadaghi, S., Hirokawa, G., Kaji, A., and Liljas, A. 1999. Crystal structure of Thermotoga maritima ribosome recycling factor: A tRNA mimic. Science 286: 2349-2352.
Song, H., Mugnier, P., Das, A.K., Webb, H.M., Evans, D.R., Tuite, M.F., Hemmings, B.A., and Barford, D. 2000. The crystal structure of human eukaryotic release factor eRF1: Mechanism of stop codon recognition and peptidyl-tRNA hydrolysis. Cell 100: 311-321.
Stark, H., Rodnina, M.V., Rinke-Appel, J., Brimacombe, R., Wintermeyer, W., and van Heel, M. 1997. Visualization of elongation factor Tu on the Escherichia coli ribosome. Nature 389: 403-406.
Teichmann, S.A., Chothia, C., and Gerstein, M. 1999. Advances in structural genomics. Curr. Opin. Struct. Biol. 9: 390-399.
van den Ent, F. and Lowe, J. 2000. Crystal structure of the cell division protein FtsA from Thermotoga maritima. EMBO J. 19: 5300-5307.
Wilson, K.S. and Noller, H.F. 1998. Mapping the position of translational elongation factor EF-G in the ribosome by directed hydroxyl radical probing. Cell 92: 131-139.