tRNA Structure

Secondary article

Eric Westhof, Institute of Molecular and Cellular Biology, Strasbourg, France
Pascal Auffinger, Institute of Molecular and Cellular Biology, Strasbourg, France

Article Contents
. Introduction
. Cloverleaf Structure
. Universally Invariant and Semi-invariant Bases
. The Two Domains of the Three-dimensional Architecture
. Unusual Hydrogen Bonds Maintain the Tertiary Structure
. Conservations of Bases and Loop Conformations
. Unusual tRNAs
. Summary

Transfer ribonucleic acid (tRNA) molecules that participate in the elongation step of protein synthesis on the ribosome have a conserved secondary structure, known as the cloverleaf, and fold into a common three-dimensional architecture.

Introduction

Single-stranded RNA (ribonucleic acid) molecules are ubiquitous in cells and play many biological roles. However, three main types of RNA dominate molecular biology: the messenger RNAs (mRNAs), the transfer RNAs (tRNAs) and the ribosomal RNAs (rRNAs). They all participate in the biosynthesis of proteins. The mRNAs contain the sequence of nucleotide triplets, or codons, which are read by tRNAs during translation on the ribosomes, of which the rRNAs constitute central components. In mRNAs, the sequence alone is the primary determinant of biological function. For tRNAs and rRNAs, however, the folding in space of the polynucleotide chain into a native tertiary structure forms the basis of biological activity.

The existence of some type of ‘adaptor’ RNA molecules acting as go-betweens between the RNA world and the protein universe was predicted by Francis Crick as early as 1955, long before transfer RNA molecules were characterized biochemically. Because there are 61 different sense codons in the genetic code, one would expect about the same number of different tRNAs, distributed among the 20 isoacceptor families corresponding to the 20 amino acids. However, if the wobble hypothesis of Crick is followed (it relaxes Watson–Crick complementarity and considers not only G.U but also I.U, I.C and I.A pairings involving inosine (I), a deaminated form of adenosine), only 31 different tRNAs are necessary to decode all codons (32 with the additional initiator tRNA) (Crick, 1966). The recent sequencing of whole eubacterial genomes has revealed that the number of tRNA genes varies between 33 (in Mycoplasma genitalium, a parasitic bacterium with a minimal genome) and 88 (in the Gram-positive bacterium Bacillus subtilis), with most being between 44 and 46. It should be added that the copy number of tRNA genes can be as high as four and that some tRNA variants present nucleotide changes outside the anticodon. Thus, in Escherichia coli the 84 tRNA genes code for 45 tRNA species, with only 41 tRNAs having different anticodons.

As with every biological macromolecule, the molecular evolution of tRNAs is intimately coupled to the structural constraints imposed by the nature of the polymer and its functions. Each tRNA molecule has to evolve under two opposing constraints. On the one hand, it needs a three-dimensional architecture that allows it to fit precisely in the ribosome-binding sites for promoting protein synthesis; on the other hand, it needs to contain enough molecular diversity to guarantee specific recognition by a unique cognate aminoacyl-tRNA synthetase. Recall that the synthetase aminoacylates the 3’-end adenosine of the tRNA with the amino acid specific to that tRNA. Indeed, a tRNA carries an anticodon triplet complementary to a given codon and should consequently be charged by the synthetase solely with the amino acid specified by the anticodon–codon interactions.
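The codon arithmetic of the introduction can be checked with a short sketch. It is not taken from the article: the standard genetic code string and the wobble table (Crick's 1966 rules as they are usually summarized) are assumptions added here, and the names `CODON_TABLE`, `WOBBLE` and `codons_read` are hypothetical. The sketch counts the sense codons and lists the codons that a given anticodon could read under wobble pairing.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code (assumption, not from the article), codons ordered
# UUU, UUC, UUA, UUG, UCU, ... GGG; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SENSE_CODONS = sorted(c for c, aa in CODON_TABLE.items() if aa != "*")

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Wobble rules (assumption): base 34 of the anticodon -> allowed third codon bases.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

def codons_read(anticodon):
    """Codons (5'->3') readable by an anticodon written 5'->3' (positions 34-36)."""
    n34, n35, n36 = anticodon
    stem = COMPLEMENT[n36] + COMPLEMENT[n35]  # strict Watson-Crick positions
    return sorted(stem + third for third in WOBBLE[n34])

print(len(CODON_TABLE), len(SENSE_CODONS))                 # total and sense codons
print(codons_read("GAA"), codons_read("IGC"))              # a Phe and an Ala anticodon
print({CODON_TABLE[c] for c in codons_read("IGC")})        # amino acids decoded
```

With inosine at position 34, one tRNA covers three codons of the same amino acid, which is why fewer than 61 tRNAs can suffice.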
The signatures of these two opposing biological constraints, a common architecture and a sufficient molecular diversity, are apparent in the network of invariant residues and interactions that maintain a common architecture on which enough molecular diversity can be encoded for specific recognition by the tRNA synthetases. These structural constraints are clearly seen in the invariance in the length of some helices of the secondary structure and in the constant presence of some residues at definite positions.

However, tRNAs constitute a set of molecules with which biological evolution has tinkered enormously. Although plant chloroplast and some mitochondrial tRNAs show high sequence similarity to eubacterial tRNAs, other mitochondrial tRNAs display a palette of unusual structural features. The tRNAs responsible for the cotranslational insertion of the 21st amino acid, selenocysteine, also contain odd features. Nature has further exploited the sequence and structural diversity compatible with tRNA folding to adapt tRNA structure to functions unrelated to protein synthesis. For example, a special tRNA charged with glutamic acid is used during chlorophyll synthesis in plants, while another tRNA charged with glycine is necessary for peptidoglycan synthesis in bacteria. Retroviruses, like HIV, require a tRNA (in human cells, a tRNALys) as a primer for the replication of their genomic RNA. Several plant viruses contain at their 3’ ends a tRNA-like structure, the integrity of which is necessary for viral replication.

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net

Cloverleaf Structure

The secondary structure of an RNA molecule is generally defined by the set of contiguous Watson–Crick pairs (including the G.U wobble pairs) forming helices between segments of the single-stranded RNA. For a given family of RNA sequences, the ensemble of the helices so defined is conserved, as is their relative arrangement. This is why secondary structures can be established by sequence comparisons.
The basic hypothesis is, indeed, that partly divergent, but nevertheless functionally and historically related, molecules from various biological origins fold into similar secondary and tertiary structures. Helices can occur at homologous positions in the secondary structures of two different sequences if compensatory changes occur on both strands so as to maintain complementarity between the bases in the Watson–Crick sense. For example, when Holley sequenced the first tRNA in 1965, he suggested several possible secondary structures (Holley et al., 1965). However, when a second tRNA was sequenced, the only secondary structure common to both sequences was the one in the form of a planar cloverleaf (Figure 1a). All known cytoplasmic elongator tRNA sequences (of lengths varying between 74 and 98 nucleotides) can be folded in the cloverleaf structure.

The cloverleaf structure consists of four helices and three hairpin loops. The four helices are called: the acceptor (AA) helix (or stem), because it will carry the amino acid once charged by the cognate aminoacyl-tRNA synthetase; the dihydrouridine (D) hairpin, because it often contains the modified base dihydrouracil; the anticodon (AC) hairpin, because its apical loop presents the anticodon triplet; and the thymine (T) hairpin, because its loop often contains the thymine base, unusual in RNAs.

Universally Invariant and Semi-invariant Bases

Alignments of tRNA sequences, equivalent to inserting tRNA sequences on the cloverleaf template, reveal positions that are always occupied by a single type of base (‘invariant’) and positions that are occupied either only by purines or only by pyrimidines (‘semi-invariant’). About 22 nucleotides belong to these categories.
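The distinction between invariant and semi-invariant positions can be made concrete with a minimal sketch. It assumes a gap-free alignment of equal-length sequences; the function names and the three short sequences are made up for illustration and are not real tRNA data. Each alignment column is labelled with the single conserved base, with R (purine only), with Y (pyrimidine only), or with N otherwise.

```python
PURINES = set("AG")
PYRIMIDINES = set("UC")

def classify_column(bases):
    """Return (category, symbol) for one alignment column."""
    observed = set(bases)
    if len(observed) == 1:
        return "invariant", next(iter(observed))
    if observed <= PURINES:
        return "semi-invariant", "R"
    if observed <= PYRIMIDINES:
        return "semi-invariant", "Y"
    return "variable", "N"

def consensus(aligned_sequences):
    """Consensus string over the columns of a gap-free alignment."""
    return "".join(classify_column(col)[1] for col in zip(*aligned_sequences))

# Hypothetical toy alignment (not taken from the tRNA database).
toy_alignment = ["GUAGCA", "AUAGUC", "GUGGCG"]
print(consensus(toy_alignment))
```

Real alignments also contain gaps at the variable positions, which this sketch deliberately ignores.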
As is usual with biological sequences, those conservations are never absolute and, for clarity, we will focus the following description on cytoplasmic elongator tRNAs, for which the numbering of yeast tRNAPhe constitutes the reference system, since it was the first tRNA structure solved by X-ray crystallography. Figure 1b shows the distribution of the four nucleotides found in the 932 cytoplasmic tRNA genes listed in the 1998 release of the tRNA database (Sprinzl et al., 1998). In the following, R stands for a purine (A or G), Y for a pyrimidine (U or C) and N for any of the four nucleotides; standard Watson–Crick secondary base pairs will be separated by a dash (e.g. U12–A23); non-Watson–Crick pairs by a dot (e.g. G10.U25 or the trans U8.A14); and the third base of triples will be separated by three dots from the interacting secondary base pair on the side of the contacting base (e.g. G45...G10.U25).

tRNA synthetases attach the cognate amino acid (i.e. the one dictated by the codon–anticodon pairings) on either the 3’-OH or the 2’-OH group of the terminal adenosine. The first residue of all tRNAs, which starts the first base pair of the acceptor helix, carries a 5’-phosphate group. This phosphate originates in the biosynthesis of tRNAs. After being transcribed from tRNA genes, the tRNA transcripts are matured at the 5’ end by the ribonucleoprotein enzyme RNase P, which clips off the 5’-end segment of the transcript. The catalysis is effected by the RNA moiety of RNase P, a process which leaves a 3’ OH and a 5’ phosphate. The maturation of tRNAs is very complex and requires a great number of enzymes, since tRNAs are characterized by the presence of several modified nucleotides.

The lengths of the helices and loops are either conserved or vary between defined limits. The distribution of the four Watson–Crick pairs within helices is not uniform.
For example, the acceptor helix starts in three-quarters of sequences with a G1–C72 pair and ends very rarely with a C7–G66 base pair. The D helix also frequently starts with a R10–Y25 followed by a Y11–R24 base pair and ends in half of the sequences with a C13–G22 pair. The conservation is very strong for the last base pair of the thymine helix, which is almost always G53–C61. Only the central base pairs of the acceptor helix, as well as the very central one in the anticodon helix, present an almost equal distribution of the four Watson–Crick combinations. Interestingly, those two sets of base pairs constitute determinants for recognition by aminoacyl-tRNA synthetases.

The acceptor helix has seven base pairs, except in tRNAHis, where an eighth residue at the 5’ end is added post-transcriptionally. The acceptor helix possesses a 3’ dangling strand with four nucleotides, the sequence of which is -RCCA-3’OH (R is most often A and half as frequently G; that position is infrequently occupied by a pyrimidine). Position 73 is called the ‘discriminator’ owing to its role in synthetase selection of tRNAs. In RNA helices, interstrand stacking occurs in 5’-YR-3’ steps. The last base pair of the acceptor helix, often an R1–Y72 base pair, is thus probably stabilized by interstrand stacking with R73. The first four base pairs of the acceptor helix are important discriminant elements for synthetase recognition.

Figure 1 Nomenclature and base distributions in elongator transfer RNAs. (a) The accepted nomenclature of transfer RNA molecules (Sprinzl et al., 1998). Variable positions are present in the dihydrouridine and variable loops. The variable loop itself forms a hairpin when long enough. Residue 0 occurs only in histidinyl-tRNAs. Straight lines indicate secondary base pairing and broken lines unusual base pairings at the beginning or end of a helix. (b) The distribution of the four common bases at corresponding positions along the sequence in 932 sequences of elongator tRNA genes (Auffinger and Westhof, 1998). In single strands, the adenine region always starts at −90° from the vertical. For the variable positions which are not always occupied, the proportion of sequences where they are occupied can be evaluated starting from the outer ring. Thus, position 17 is present in less than half of the sequences and positions 45 and 46 in more than half of the sequences. In helices, the colour codes for paired residues are arranged so as to follow Watson–Crick pairings; the 5’ strand has a thin outer circle and the 3’ strand a thick outer circle.
[Figure 1 graphics not reproduced. Labels in panel (a): 5’ P and 3’ OH ends, acceptor stem (AA helix), dihydrouridine hairpin (D hairpin) with regions α and β, anticodon hairpin (AC hairpin) with the anticodon triplet, variable loop, thymine hairpin (T hairpin).]

The dihydrouridine helix generally possesses four base pairs, while its hairpin loop is more variable, with 7–11 residues. The variation occurs in two regions, called α and β, situated on either side of two invariant guanine residues, G18 and G19 (Figure 1a). The dihydrouridine residues occupy either or both of those regions. In the loop, the first (14) and last (21) residues are always a purine, mainly an adenine.
The first three base pairs of the D helix constitute identity elements for aminoacyl-tRNA synthetases.

The anticodon hairpin has five base pairs in the helix and seven residues in the loop. The second residue of the anticodon loop is always a uridine (U33); it precedes the three bases of the anticodon, the first position of which is called the wobble base (34). Following the anticodon triplet, a highly modified nucleotide is generally present (position 37). Except for the first two residues (32 and 33), the residues of the anticodon loop are important identity elements for synthetases.

After the anticodon helix, there is a variable region which may vary between 4 and 21 residues. According to the size of the variable loop, tRNAs belong to class I or II. Class I tRNAs have 4–5 nucleotides in the variable loop and class II tRNAs from 10 to 24. In class II tRNAs, the variable loop is long enough to form a fifth helix. Only tRNAs specific for the amino acids leucine and serine (with tyrosine in eubacteria and organelles only) belong to class II.

Finally, the thymine hairpin has, like the anticodon hairpin, five base pairs and seven residues in the loop. However, the base conservation in the thymine loop is very different from that in the anticodon loop. The last base pair of the thymine helix, G53–C61, is highly conserved. In the thymine loop, the first three residues are rather well conserved and modified to -TΨC-, where Ψ stands for pseudouridine, a modified nucleoside in which the link between the base and the sugar is C(1’)-C5 instead of C(1’)-N1. A pseudouridine is generally present at the second position of the loop in tRNAs but, since pseudouridines represent more than 50% of the modified nucleosides in tRNAs, many more modified positions exist in the cloverleaf (Auffinger and Westhof, 1998).
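The helix lengths and the class I/class II rule described above can be illustrated on a concrete sequence. The following is a minimal sketch under stated assumptions: the unmodified yeast tRNAPhe sequence is supplied here (it does not appear in the article), the helix boundaries and the 44–48 variable region are read from the numbering of Figure 1a, and all function names are hypothetical.

```python
# Unmodified yeast tRNAPhe sequence, 5'->3', split by structural element
# (assumption: the sequence itself is not given in the article).
SEGMENTS = [
    "GCGGAUU", "UA", "GCUC", "AGUUGGGA", "GAGC", "G",     # 1-7, 8-9, 10-13, 14-21, 22-25, 26
    "CCAGA", "CUGAAGA", "UCUGG", "AGGUC",                 # 27-31, 32-38, 39-43, 44-48
    "CUGUG", "UUCGAUC", "CACAG", "AAUUCGC", "A", "CCA",   # 49-53, 54-60, 61-65, 66-72, 73, 74-76
]
SEQ = "".join(SEGMENTS)

# 5' strand positions and their 3' strand partners for the four cloverleaf helices.
HELICES = {
    "acceptor": (range(1, 8), range(72, 65, -1)),
    "D": (range(10, 14), range(25, 21, -1)),
    "anticodon": (range(27, 32), range(43, 38, -1)),
    "T": (range(49, 54), range(65, 60, -1)),
}
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE_PAIRS = {("G", "U"), ("U", "G")}

def base(position):
    """Base at a 1-based position of the reference numbering."""
    return SEQ[position - 1]

def pair_kind(b1, b2):
    if (b1, b2) in WATSON_CRICK:
        return "Watson-Crick"
    if (b1, b2) in WOBBLE_PAIRS:
        return "wobble"
    return "mismatch"

def helix_pairs(name):
    """List of (i, j, kind) for every base pair of the named helix."""
    five, three = HELICES[name]
    return [(i, j, pair_kind(base(i), base(j))) for i, j in zip(five, three)]

def variable_loop_class(length):
    """Class I: 4-5 nucleotides; class II: 10-24; otherwise unclassified."""
    if 4 <= length <= 5:
        return "I"
    if 10 <= length <= 24:
        return "II"
    return None

for name in HELICES:
    print(name, helix_pairs(name))
print("variable loop class:", variable_loop_class(len(range(44, 49))))
```

In this sequence every helix position pairs in the Watson–Crick or G.U wobble sense, and the five-residue variable region places the molecule in class I.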

The Two Domains of the Three-dimensional Architecture

The folding in three dimensions of the cloverleaf secondary structure posed a tantalizing problem for many years. In a remarkable paper of 1969, Levitt proposed a three-dimensional model of tRNA (Levitt, 1969). Although the overall architecture of the model turned out to be wrong, that article laid the foundations for our thinking about the relations between biological evolution and biomolecular structure and function. Since the first correct tracing of a tRNA chain in crystals of yeast tRNAPhe in 1974 (Suddath et al., 1974), only one other crystal structure of a free tRNA has been solved (yeast tRNAAsp in 1980) (Moras et al., 1980). In addition, several structures of tRNAs complexed with their cognate aminoacyl-tRNA synthetases now exist. An examination of the three-dimensional folding revealed by the crystal structures suggests a structural basis for all of the observed base conservations.

The four helices of the cloverleaf stack on each other coaxially and two by two, forming two main arms or domains (Figure 2a). Thus, the acceptor helix and the thymine hairpin form the acceptor arm, and the dihydrouridine hairpin together with the anticodon hairpin forms the anticodon arm. The two coaxial and contiguous stacks make an angle of about 90° between them, giving the overall architecture of tRNAs the appearance of the letter L or the Greek Γ. At the two ends, which are about 7.5–8.0 nm apart, are the anticodon triplet and the -CCA 3’ terminus. When the tRNA is bound to the acceptor site of the ribosome (A site), these extremities should, respectively, bind to the codon triplet of the mRNA and bring the attached amino acid into the peptidyl reaction centre for reacting with the peptide chain carried by the preceding tRNA present in the peptidyl site (P site). The interfaces between the two coaxial helices are different in each domain.
The tRNA chain is continuous between the 3’ end of the thymine helix and the beginning of the acceptor helix, while the last residue of the 5’ strand of the acceptor helix adopts the C2’-endo sugar pucker, which facilitates branching off the helix. By contrast, at the interface between the dihydrouridine and the anticodon helices, there is usually a non-Watson–Crick base pair linking residues 26 and 44 (e.g. an imino G.A pair), with a twist angle relative to the last base pair of the dihydrouridine stem of around 45°, much larger than the standard 33° present between base pairs of RNA helices.

The L-shaped architecture is locked in place by two main structural features. First, the single-stranded regions linking the two domains (residues 8–9 between the acceptor and the dihydrouridine helices, as well as the variable region between the anticodon and the thymine helices) adopt conformations such that their residues form base triples in the deep groove of the dihydrouridine helix (Figure 2b). Secondly, those base triples position the D and T loops so that extremely precise tertiary interactions can occur between them. Fluorescence and ultraviolet melting experiments have shown that formation of the base triples in the deep groove of the D helix is the rate-limiting step in the tertiary folding of tRNAs. The stereochemistry of the nucleotides is rather uniform in helices (ribose puckers in the C3’-endo domain, bases in anti orientation with respect to the sugar, helical phosphate torsion angles), and the folding of the chain is accomplished in single-stranded segments by altering the ribose pucker to the C2’-endo domain or by rotating the torsion angles at the phosphorus atoms away from the helical conformations (Sundaralingam, 1973).

Unusual Hydrogen Bonds Maintain the Tertiary Structure

Residues not participating in helices defining the secondary structure, i.e.
residues of loops and single-stranded regions, form tertiary interactions either by interacting between themselves or by interacting with base pairs within a helix (leading to base triples). The four natural bases can interact via hydrogen bonding in many different ways. There are 27 pairs with two standard hydrogen bonds. The recent crystal structures of other RNAs expand further the recognition modes, since they reveal pairings with only one interbase hydrogen bond or with a water molecule bridging the two bases.

Figure 2 Two-dimensional and three-dimensional representation of the tertiary structure of elongator transfer RNAs. (a) Two-dimensional representation of the tertiary structure of tRNAs, as proposed by Kim (1978), which emphasizes the two main domains and the tertiary contacts linking them. Only the secondary structure can be represented in a plane without crossing lines (in other words, mathematically, a secondary structure is equivalent to a planar graph). The representation of a three-dimensional structure and of the underlying tertiary contacts can be drawn in a plane, but with several line crossings. Such schematic drawings are, however, useful for quick assessment and comparisons of tertiary contacts. The Kim representation shows clearly the two domains, the contacts between the T and D loops and the tertiary base pairs and triples between the single-stranded segments and the D hairpin. The contacts represented correspond to those of yeast tRNAAsp (Westhof et al., 1985). (b) Stereoview of a schematized representation of the tertiary structure of yeast tRNAAsp. The sugar–phosphate backbone is drawn as a ribbon and the base pairs as rods. The colour code is the same as in Figure 2a. Notice the characteristic deep and shallow grooves of an RNA helix in the acceptor and thymine helices, respectively.

The fundamental property of the Watson–Crick pairs (the isosteric geometry of all four possible pairs G–C, C–G, A–U, U–A involving complementary bases) is thus even more remarkable.

In order to discuss the tertiary interactions in tRNAs, it is necessary to explain two terms. First, the nucleotides in a pair may approach with their sugars on the same side of the H bonds or on either side of them. Following the chemical literature, in the first case the pairing is said to be cis and, in the second, trans (often called ambiguously ‘reverse’). Secondly, purine bases can interact via their edge carrying the imidazole ring (involving ring nitrogen N7) instead of only their pyrimidine edge, as in Watson–Crick pairings. Such pairings are called Hoogsteen pairs, after their discovery by Hoogsteen in 1959 in cocrystals of derivatives of U and A. Figure 3a shows examples of a variety of pairings.

Elongator tRNAs contain two Hoogsteen U.A pairs, both with a trans orientation of the sugars. The first trans Hoogsteen pair occurs between the invariant U at position 8, the first residue after the acceptor helix, and the invariant A at position 14, the first residue of the D loop. The U8.A14 pair thus links the two main domains of tRNAs. The second trans Hoogsteen pair (T54.A58) occurs between two invariant residues within the T loop, the T at position 54 and the A at position 58. There is also a third trans base pair, between the semi-invariant residues R15 and Y48, but involving the Watson–Crick sites. The trans Watson–Crick R15.Y48 base pair, sometimes named after Levitt, who first noticed the covariation, stacks with the trans Hoogsteen U8.A14 pair. They are both central for the maintenance of the L-shaped structure of tRNAs (Figure 3a).

Class I tRNAs contain three base triples, all in the deep groove of the D helix.
Of the four base pairs of the D helix, only the 11–24 pair does not contact a third base. Both single-stranded regions linking the two domains participate in the network of triple interactions. Residue 9 interacts with base 23 of base pair 12–23 (12–23...9), while residues 45 and 46 of the variable loop interact, respectively, with base pair 10–25 (45...10–25) and base pair 13–22 (13–22...46).

The type of the hydrogen bonding sites (Watson–Crick or Hoogsteen) is related to the local orientation of the chains. Thus, cis Watson–Crick pairs (as in usual double helices) lead to antiparallel strands while, with equivalent stereochemistry of the nucleotides, trans Watson–Crick pairs lead to a local parallel orientation of the strands (e.g. R15.Y48). The reverse is true for Hoogsteen pairs between a pyrimidine and a purine (e.g. U8.A14) and for Hoogsteen pairs involving the Watson–Crick sites of one purine with the Hoogsteen sites of another purine (e.g. G22...A46 in yeast tRNAAsp). However, purine–purine pairs involving only the Hoogsteen sites are locally antiparallel when cis and parallel when trans (e.g. A9...A23 in yeast tRNAPhe). The local orientations of the chains cannot be altered without altering profoundly the overall topology or flipping some bases into the syn orientation with respect to the sugar (a conformation extremely rare in crystal structures of RNA fragments). These structural constraints impose definite covariations between residues involved in triple interactions.

The set of triples in the deep groove of the D helix is such that residue 9 interdigitates between residues 45 and 46 of the variable link, and the contacts are made alternately to the 5’ and 3’ strands of the D helix. Again, this topology dictates the relative strand orientations: residue R45 interacts with G10 and is parallel to it, but R46 is antiparallel to residue R22 to which it binds, while the in-between residue R9 is parallel to its binding residue 23.
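The orientation rules stated above can be condensed into a parity rule, sketched below. This is only a restatement of the cases listed in the text, valid under the same assumption of equivalent (anti) stereochemistry for both nucleotides: starting from the antiparallel cis Watson–Crick pair, each Hoogsteen edge and a trans arrangement each reverse the local strand orientation. The edge labels "WC" and "H" and the function name are hypothetical.

```python
def strand_orientation(edge_1, edge_2, arrangement):
    """Local strand orientation for a base pair between two anti nucleotides.

    edge_1, edge_2: "WC" (Watson-Crick sites) or "H" (Hoogsteen sites)
    arrangement:    "cis" or "trans"
    """
    if edge_1 not in ("WC", "H") or edge_2 not in ("WC", "H"):
        raise ValueError("edges must be 'WC' or 'H'")
    if arrangement not in ("cis", "trans"):
        raise ValueError("arrangement must be 'cis' or 'trans'")
    # Count the reversals relative to a standard cis Watson-Crick pair.
    reversals = (edge_1 == "H") + (edge_2 == "H") + (arrangement == "trans")
    return "parallel" if reversals % 2 else "antiparallel"

# Examples quoted in the text.
print("R15.Y48   ", strand_orientation("WC", "WC", "trans"))
print("U8.A14    ", strand_orientation("WC", "H", "trans"))
print("A9...A23  ", strand_orientation("H", "H", "trans"))
print("helix pair", strand_orientation("WC", "WC", "cis"))
```

The rule does not cover bases flipped into the syn orientation, which the text notes are extremely rare.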
Because the D and anticodon helices stack coaxially with a right-handed twist at the interface, the 3’-dangling variable region faces the deep groove and is antiparallel to the 3’ strand of the D helix, which leads to the 5’ strand of the anticodon helix. On the other hand, the junctions between the two domains occur at the internucleotide linkages of residues 7–8 and 48 so that residues 7 and 49 could almost be linked as in a continuous helix. In short, the two single-stranded segments linking the two domains run antiparallel to each other, facing the deep groove of the D helix.

Figures 2 and 3 correspond to the molecular structure of the yeast tRNAAsp as determined by X-ray crystallography (Westhof et al., 1985). Figure 4 illustrates the sequence variability in 33 available tRNAAsp genes from other organisms. Invariant residues are clearly seen, e.g. U8, A14, and G18 or G10. Covariations typical of secondary structure are displayed by residues 11 and 24 or 12 and 23, the two central base pairs of the D helix. The flanking base pairs present biases, G10–Y25 and Y13–R22, with some unusual oppositions (for example Y13.Y22). Correlations between the third base and the secondary pair forming triples are not easy to detect. For example (not apparent in Figure 4), any of the four bases is found at position 45 with the very frequent G10–C25, although a G at position 45 is by far the most frequent situation. In contrast, position 9, either A or G, covaries with base pair 12–23 so that A9...U12–A23 and G9...G12–C23 are most frequently observed, although A9 (and almost G9) can be found together with any of the four Watson–Crick base pairs (see Table 1). Even the residues involved in triples appear to vary. For example, in the structure of E. coli tRNAGln complexed with its cognate synthetase (Rould et al., 1989), it is residue 45 which interacts with the 13–22 base pair, and not residue 46, which points towards the exterior of the molecule.
In that structure, the 13–22 base pair is an unusual A.A pair (see below). Such adaptability blurs the covariation tables (Klug et al., 1974; Gautheret et al., 1995).

Several isolated additional H bonds contribute to the integrity of the tertiary structure. They involve the O2’ hydroxyl group of the ribose or the anionic phosphate oxygen atoms, bound to each other or to a base atom.

Figure 3 Atomic representations of the tertiary contacts in elongator transfer RNAs. (a) Tertiary interactions between bases as observed in yeast tRNAAsp. From top to bottom: the two trans Hoogsteen pairs T54.A58 and U8.A14; the trans Watson–Crick pair A15.U48; the cis Watson–Crick G.A pair (also called imino G.A pair); the standard cis Watson–Crick G19–C56 base pair; the two unusual bifurcated pairs G18.Ψ55 and Ψ32.C38. (b) The tertiary triple contacts present in yeast tRNAAsp. The colour code is the same as in Figure 2a. In green, the G45...G10.U25 triple, in which the amino N2 group of G45 H-bonds to the Hoogsteen sites of G10 (N7 and O6). In orange, the U12–A23...A9 triple, which includes a trans symmetric Hoogsteen A.A pair. In red, the C13.G22...A46 triple. Notice how, in G10.U25 and C13.G22, the pyrimidine base protrudes into the deep groove.

Figure 4 Sequence comparison of the 33 available sequences of aspartic acid specific tRNAs. The disposition and colour codes emphasize the structural alignment. Identical colour codes emphasize observed covariations. The sequence of the yeast tRNAAsp, illustrated in Figures 2 and 3, is boxed. The consensus sequence, which reflects the most frequent base at each position, is shown at the bottom. Because of the small number of sequences, this comparison gives only a glimpse of the possible base variations.
[Figure 4 alignment not reproduced. Rows: organism; columns: positions 7–27 (D stem and D loop, including 17A, 20A and 20B) and 43–49 (variable loop).]

Table 1 Covariations between the secondary base pair 12–23 in the D helix and residue 9 in class I tRNAs (analysis made on 745 sequences of class I elongator tRNAs)

                     Residue 9
Base pair 12–23      A      U      G      C
A–U                  21     –      –      4
U–A                  412    –      34     –
G–C                  39     4      104    –
C–G                  23     2      64     13
Others               9      1      10     1
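
The counts of Table 1 can be explored with a small sketch that finds, for each identity of residue 9, the 12–23 base pair it most often accompanies. The numbers are transcribed from Table 1; treating a dash as zero occurrences is an assumption made here, and the names `TABLE_1`, `most_frequent_pair` and `fraction` are hypothetical.

```python
RESIDUE_9 = ("A", "U", "G", "C")

# Rows: base pair 12-23; columns follow RESIDUE_9.
# None stands for a dash in Table 1 (assumed here to mean no occurrence).
TABLE_1 = {
    "A-U": (21, None, None, 4),
    "U-A": (412, None, 34, None),
    "G-C": (39, 4, 104, None),
    "C-G": (23, 2, 64, 13),
    "Others": (9, 1, 10, 1),
}

def column(residue):
    """Counts per base pair for one identity of residue 9."""
    index = RESIDUE_9.index(residue)
    return {pair: (row[index] or 0) for pair, row in TABLE_1.items()}

def most_frequent_pair(residue):
    counts = column(residue)
    return max(counts, key=counts.get)

def fraction(residue, pair):
    """Share of sequences with this residue 9 that carry the given 12-23 pair."""
    counts = column(residue)
    return counts[pair] / sum(counts.values())

for residue in RESIDUE_9:
    best = most_frequent_pair(residue)
    print(residue, best, round(fraction(residue, best), 2))
```

For A9 and G9 this recovers the two covariations singled out in the text, A9...U12–A23 and G9...G12–C23.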
For example, the N1 atom of the invariant base A21 binds to the O2’ hydroxyl of residue U8; the O2’(Ψ55) forms an H bond to the N7 of G57; the O2’(A58) gives a proton to an anionic oxygen of phosphate 60. The amino nitrogen N2 of guanine often binds either to an O2’ hydroxyl or to the intraring oxygen O4’ (e.g. N2(G57) to O4’(G19) and O2’(G18)). The anionic oxygen atoms of the 5’ phosphate of residue 60 each receive an H bond, one from the hydroxyl of residue 58 and the other from the amino group of the invariant C61. Only some of those additional contacts are observed in all crystalline forms of tRNAs, and a comparison between different crystal structures reveals a subtle diversity of slight rearrangements and alternative contacts, often including distorted hydrogen bond geometries and the use of the weaker C-H...O/N hydrogen bonds (see Figure 5).

Conservations of Bases and Loop Conformations

As mentioned above, both the anticodon and thymine loops always contain seven residues. However, their patterns of conservation are rather different and reflect their respective functions. In the anticodon loop, the only strictly invariant residue is U33, while in the thymine loop five consecutive residues are highly conserved, -T54ΨCRA58-. The anticodon loop must bind precisely the mRNA triplet complementary to its anticodon bases to form a regular RNA helix, since recognition is mainly of the Watson–Crick type. In contrast, the T loop binds intramolecularly to the D loop. The conformation of the T loop might also be important for RNase P binding to tRNAs during maturation and for recognition by protein elongation factors during protein synthesis.

Figure 5 Two examples of triples implicating a sheared R.R pair. In the structure of the class I E. coli tRNAGln complexed with its cognate synthetase (Rould et al., 1989), the sheared A13.A22 forms a trans Watson–Crick contact with A46 (both strands are parallel). Notice the contact between the C2-H group of A13 and the N7 of A22, marked by a lightly dotted line. Notice also, in both triples, the H bond between the hydroxyl O2’ atom of the ribose of R13 and the amino N6 of A22. In the structure of the class II T. thermophilus tRNASer (Biou et al., 1994), the sheared G13.A22 pair forms a trans Watson–Crick/Hoogsteen contact with G9 (both strands are antiparallel). Notice the H bond between the N1(G9) and an anionic phosphate oxygen of A22.

The contact between the D and T loops occurs via the invariant -GG- residues of the D loop and the quasi-invariant -ΨC- residues of the T loop. The interactions connect the first G residue with the Ψ residue and the second G with the C residue. The G18.Ψ55 contact is locally parallel, with a trans orientation of the H bonds, which are bifurcated and occur between the O4(Ψ) and N1(G)/N2(G). The G19–C56 interaction is a classical cis Watson–Crick pair, antiparallel following a rotation about its 5’ phosphate. At the same time, the -GG- segment interdigitates with the R57A58 segment so that R57 of the T loop intercalates between G18 and G19 and G18 of the D loop intercalates between R57 and A58.

The only strictly invariant residue in the anticodon loop, U33, is situated where the chain reversal occurs. The reversal is accomplished by a U turn identical to that present in the T loop. It is the phosphate group 3’ of U33 (or Ψ55 in the T loop) which rotates about the P-O3’ torsion angle. The U turn is stabilized by contacts with residue n + 2, A35 (or G57 in the T loop). The pyrimidine ring of the U turn stacks over the 5’ phosphate of residue n + 2, the O2’ hydroxyl of the pyrimidine interacts with the N7 of the purine at n + 2, and the N3 imino proton of the uracil H bonds to an anionic phosphate oxygen (the Rp oxygen) of the 3’ phosphate of residue n + 2.
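The requirement that the anticodon loop binds the mRNA triplet complementary to its anticodon, mainly through Watson–Crick pairs in an antiparallel helix, can be illustrated with a minimal Python sketch. The function names are hypothetical, the numbering of the anticodon bases as positions 34 to 36 follows the usual tRNA convention and is an assumption here, and wobble pairs and modified nucleotides are deliberately ignored.

```python
# Strict Watson-Crick partners for unmodified RNA bases (wobble pairs and
# modified nucleotides are deliberately ignored in this sketch).
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def watson_crick_codon(anticodon: str) -> str:
    """Return the codon (5'->3') that pairs base-for-base with an anticodon (5'->3')."""
    anticodon = anticodon.upper()
    if len(anticodon) != 3 or any(base not in WATSON_CRICK for base in anticodon):
        raise ValueError("anticodon must be three letters from A, C, G, U")
    # The two strands are antiparallel, so the anticodon is read backwards.
    return "".join(WATSON_CRICK[base] for base in reversed(anticodon))


def pairing_table(anticodon: str) -> list[tuple[int, str, int, str]]:
    """List (codon position, codon base, tRNA position, anticodon base) per pair."""
    codon = watson_crick_codon(anticodon)
    anticodon = anticodon.upper()
    # Assumed numbering: anticodon bases occupy tRNA positions 34, 35 and 36,
    # so the first codon base faces position 36.
    return [(i + 1, codon[i], 36 - i, anticodon[2 - i]) for i in range(3)]


if __name__ == "__main__":
    # GUC is used as an example anticodon; its strict complement is GAC.
    for codon_pos, codon_base, trna_pos, anticodon_base in pairing_table("GUC"):
        print(f"codon base {codon_pos} ({codon_base}) pairs with "
              f"tRNA position {trna_pos} ({anticodon_base})")
```

Because the pairing is antiparallel, the first base of the codon faces the third base of the anticodon, which is the pair adjacent to position 37 in the loop.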
Position 37 is invariably occupied by a purine (mainly A) and is often heavily modified. It stabilizes the codon–anticodon triplet interaction by stacking on the very discriminating base pair formed between the first base of the codon and the third base of the anticodon. The structural and functional bases for the chemical modifications on residue 37 are difficult to establish; proposals include stabilization by 3’-end dangling purines (as at the discriminator position 73) and blocking of the Watson–Crick sites, which prevents pairing with the preceding residue on the mRNA. Correlations have also been found between the chemical nature of the modifications and the hydrophobicity/hydrophilicity of the coded amino acids. The first and last bases of the anticodon loop have restricted variations, leading in more than 60% of sequences to a one-H-bond Y32.A38 base pair between O2(Y32) and N6(A38).

Unusual tRNAs

In the preceding sections, the class I tRNAs, the best known structurally, have been extensively described. Only one class II tRNA (or long arm tRNA), the T. thermophilus tRNASer, is known crystallographically, in a complex with its cognate aminoacyl-tRNA synthetase (Biou et al., 1994). In class II tRNAs, residue 45 base-pairs with a residue in the variable helix and cannot form a triple with base pair 10–25. Also, residue 9 does not interact with 12–23 but with base pair 13–22, which is often a purine–purine pair, G13.A22 or A13.A22 (see Table 2). The R13.R22 base pair belongs to a special type of base pair, the sheared type, in which the Hoogsteen sites of the A interact with the N3 and N2 atoms of the G (in the sheared A.A pair, there is only one bona fide H bond, between N6(A) and N3(A)). In the triple 9...13–22, the Hoogsteen sites of residue 9 interact with the Watson–Crick sites of residue 13, which are still available since residue 22 interacts with N3 and N2 on the shallow groove side.
The triple A9...A13.A22 is also observed, which implies a slight displacement of A9 so that its Hoogsteen sites (N7 and N6) interact with the Watson–Crick sites (N6 and N1) of A13.

Table 2 Table of covariations between the secondary base pair 13–22 in the D helix and residue 9 in class II tRNAs (analysis is made on 187 sequences of class II elongator tRNAs)

Base pair 13–22    Residue 9
                   A      U      G      C
A•A                32     –      8      3
G•A                –      4      115    –
G•U                –      –      11     –
Others             7      1      4      2

The tRNASec is a tRNASer able to read, in an appropriate context, UGA stop codons for incorporation of selenocysteine and constitutes a variant of class II tRNAs. In archaea and eukaryotes, the acceptor helix is 9 bp long and the T helix is only 4 bp long (9/4 model). In prokaryotes, on the other hand, the acceptor helix is 8 bp long with a 5-bp T helix (8/5 model). Thus, for all phylogenetic groups, the length of the acceptor arm is 13 bp. The D helix is also changed and displays 6 bp. The D loop is shorter, with the invariant -GG- residues centrally disposed.

Initiator tRNAs (Schmitt et al., 1998) also display some sequence peculiarities linked to their biological function: they must bind directly to the ribosomal P site and not to the A site, to which elongator tRNAs are channelled. Thus, the first base pair of the acceptor helix is never Watson–Crick in prokaryotes, which favours recruitment of initiation factor IF-2. Prokaryotic initiator tRNAs also present an interesting reversal at the level of the 11–24 base pair, which is R11–Y24 instead of the Y11–R24 found in elongators. The T loop, normal in prokaryotes, is different in eukaryotes, where positions 54 and 60 are occupied by adenines. The usual trans Hoogsteen pair T54.A58 is thus replaced by the similar trans A54.A58 pair between the Watson–Crick sites of A54 and the Hoogsteen sites of A58.
Eubacterial and eukaryotic initiator tRNAs have an odd distribution of G–C pairs at the beginning of the acceptor helix (alternating G–C/C–G) and at the end of the anticodon helix (a run of three G–C pairs). Eukaryotic initiator tRNAs have other unusual features, such as the unique sugar modification on residue 64, a phosphoribosyl group on the 2’ hydroxyl, which hinders binding of the elongation factor EF-1α and, thus, access to the A site.

The structural diversity of mitochondrial tRNAs is enormous. In some of them the T hairpin is missing and replaced by a single-stranded segment, while in others it is the D hairpin that is missing. In the remaining stems and loops, there is a systematic loss of base invariance or semi-invariance compared to cytoplasmic tRNAs. In the absence of a reference crystal structure, it is difficult to discuss the structural aspects of such truncated yet functional tRNAs.

Summary

tRNA constitutes a paradigm for RNA structure and folding. It possesses a well-defined consensus secondary structure, the cloverleaf, which folds into a common three-dimensional architecture with a characteristic L shape. The three-dimensional fold can be divided into two domains, similarly built of two coaxial helical stems. The three-dimensional architecture is kept in place by loop–loop tertiary contacts between the thymine and dihydrouridine loops, together with tertiary contacts involving the single-stranded junctions linking the helical domains and the deep groove of one helix, the dihydrouridine helix. The latter contacts involve non-Watson–Crick pairings and formation of base triples.
tRNA structure displays several elements of the RNA folding logic: coaxial stacking of contiguous helices, formation of base triples in the deep groove of an RNA helix, loop–loop interactions with Watson–Crick and non-Watson–Crick pairs, the U-turn motif for hairpin formation, and the extensive use of additional hydrogen bonding involving the ribose hydroxyl O2’, the anionic phosphate oxygen atoms, and some polar C-H groups. The modular view of RNA architecture, based on a hierarchical assembly of recurrent RNA motifs and first glimpsed in the tRNA structure, has been corroborated by the recent crystal structures of larger RNA domains and ribozymes. The increasing number of crystal structures has revealed an amazing and subtle variability in precise atomic contacts. However, despite these microheterogeneities in the specific atomic contacts between residues important for the stability of the global tertiary fold, the overall topological arrangements remain identical. These structural constraints lead to base conservations or base covariations in sequence comparisons and alignments. Comparisons between RNA molecules show that topologically and functionally distinct molecules share quasi-identical three-dimensional motifs which display clear signatures in sequence conservation and variability.

References

Auffinger P and Westhof E (1998) Location and distribution of modified nucleotides in tRNA. In: Grosjean H and Benne R (eds) Modification and Editing of RNA, pp. 569–576. Washington, DC: American Society for Microbiology.
Biou V, Yaremchuk A, Tukalo M and Cusack S (1994) The 2.9 Å crystal structure of T. thermophilus seryl-tRNA synthetase complexed with tRNASer. Science 263: 1404–1436.
Crick FHC (1966) Codon–anticodon pairing: the wobble hypothesis. Journal of Molecular Biology 19: 548–555.
Gautheret D, Damberger SH and Gutell RR (1995) Identification of base-triples in RNA using comparative sequence analysis. Journal of Molecular Biology 248: 27–43.
Holley RW, Apgar J, Everett GA et al. (1965) Structure of a ribonucleic acid. Science 147: 1462–1465.
Kim S-H (1978) The three dimensional structure of transfer RNA and its functional implications. Advances in Enzymology 46: 279–315.
Klug A, Ladner J and Robertus JD (1974) The structural geometry of co-ordinated base changes in transfer RNA. Journal of Molecular Biology 89: 511–516.
Levitt M (1969) Detailed molecular model for transfer ribonucleic acid. Nature 224: 759–763.
Moras D, Comarmond MB, Fisher J et al. (1980) Crystal structure of yeast tRNAAsp. Nature 288: 669–673.
Rould MA, Perona JJ, Söll D and Steitz TA (1989) Structure of E. coli glutaminyl-tRNA synthetase complexed with tRNA(Gln) and ATP at 2.8 Å resolution. Science 246: 1135–1142.
Schmitt E, Panvert M, Blanquet S and Mechulam Y (1998) Crystal structure of methionyl-tRNAfMet transformylase complexed with the initiator formyl-methionyl-tRNAfMet. EMBO Journal 17: 6819–6826.
Sprinzl M, Horn C, Brown M, Loudovitch A and Steinberg S (1998) Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Research 26: 148–153.
Suddath FL, Quigley GJ, McPherson A et al. (1974) Three-dimensional structure of yeast phenylalanine transfer RNA at 3.0 angstroms resolution. Nature 248: 20–24.
Sundaralingam M (1973) The concept of a conformationally ‘rigid’ nucleotide and its significance in polynucleotide conformational analysis. Jerusalem Symposia of Quantum Chemistry and Biochemistry 5: 417–456.
Westhof E, Dumas P and Moras D (1985) Crystallographic refinement of yeast aspartic acid transfer RNA. Journal of Molecular Biology 184: 119–145.

Further Reading

Grosjean H and Benne R (eds) (1998) Modification and Editing of RNA. Washington, DC: American Society for Microbiology.
Quigley GJ and Rich A (1976) Structural domains of transfer RNA. Science 194: 796–806.
Rich A and RajBhandary UL (1976) Transfer RNA: molecular structure, sequence, and properties. Annual Review of Biochemistry 45: 805–860.
Saenger W (1984) Principles of Nucleic Acid Structure. New York: Springer-Verlag.
Söll D and RajBhandary UL (eds) (1995) tRNA Structure, Biosynthesis, and Function. Washington, DC: American Society for Microbiology.

ENCYCLOPEDIA OF LIFE SCIENCES / © 2001 Nature Publishing Group / www.els.net