ScienceWeek
GENOME BIOLOGY: ON THE GENETIC CODE
The following points are made by Brian Hayes (American Scientist 2004 92:494):
1) The genetic code was cracked 40 years ago, and yet we still do not fully understand it. We know enough to read individual messages, translating from the language of nucleotide bases in DNA or RNA into the language of amino acids in a protein molecule. The RNA language is written in an alphabet of four letters (A, C, G, U), grouped into words three letters long, called triplets or codons. Each of the 64 codons specifies one of 20 amino acids or else serves as a punctuation mark signaling the end of a message. That's all there is to the code. But a nagging question has never been put to rest: Why this particular code, rather than some other? Given 64 codons and 20 amino acids plus a punctuation mark, there are 10^(83) possible genetic codes. What is so special about the one code that -- with a few minor variations -- rules all life on the planet Earth?
2) The canonical nonanswer to this question came from Francis Crick (1916-2004), who argued that the code need not be special at all; it could be nothing more than a "frozen accident". The assignment of codons to amino acids might have been subject to reshuffling and refinement in the earliest era of evolution, but further change became impossible because the code was embedded so deeply in the core machinery of life. A mutation that altered the codon table would also alter the structure of every protein molecule, and thus would almost surely be lethal. In other words, the genetic code is the qwerty keyboard of biology -- not necessarily the best solution, but too deeply ingrained to be replaced or improved.
3) There has always been resistance to the frozen-accident theory. Who wants to believe that the key to life is so arbitrary and ad hoc? And there is evidence that the accident is not quite frozen. Certain protozoa, bacteria and intracellular organelles employ genetic codes slightly different from the standard one, hinting that changes to codon assignments are not impossible after all. And if the code is subject to change, then it must also be subject to natural selection, which in turn suggests the possibility of ongoing improvement. Perhaps ours is not the very best of all possible codes, but after four billion years of evolution it ought to be an especially good one.
4) The urge to find something singular and superlative about the code was already evident even before it was deciphered. For several years before experiments began to reveal the true structure of the genetic code, theorists were at liberty to dream up codes of their own. Some of the proposals were so ingenious that the real code seemed a bit disappointing. But the creative thinking did not end with the publication of the codon table; indeed speculation seems to have been inhibited very little by the constraints of mere fact.(1-5)
References (abridged):
1. Alff-Steinberger, C. 1969. The genetic code and error transmission. Proceedings of the National Academy of Sciences of the U.S.A. 64:584-591
2. Beland, Pierre, and T. F. H. Allen. 1994. The origin and evolution of the genetic code. Journal of Theoretical Biology 170:359-365
3. Cortazzo, Patricia, Carlos Cervenansky, Monica Marín, Claude Reiss, Ricardo Ehrlich and Atilio Deana. 2002. Silent mutations affect in vivo protein folding in Escherichia coli. Biochemical and Biophysical Research Communications 293:537-541
4. Crick, F. H. C. 1968. The origin of the genetic code. Journal of Molecular Biology 38:367-379
5. Freeland, Stephen J., and Laurence D. Hurst. 1998. The genetic code is one in a million. Journal of Molecular Evolution 47:238-248
American Scientist http://www.americanscientist.org
--------------------------------
Related Material:
MOLECULAR BIOLOGY: ON THE GENETIC CODE
A.R. Cavalcanti and L.F. Landweber (Current Biology 2004 14:R147):
1) The so-called "universal" or standard genetic code is the set of rules that define the correspondence between the 20 amino acids in proteins and groups of three bases (codons) in the mRNA.
2) The code, however, is not "universal". Although most organisms have the same genetic code, researchers began to discover exceptions to the "universal" code in 1979, and today we know of more than 15 alternative codes; each has just a few differences from the standard code, indicating common ancestry from this code. Several of these codes arose independently a number of times in evolution and are present in a variety of taxa.
3) In addition, the genetic code is not limited to 20 amino acids. The first exception was selenocysteine, encoded by the stop codon UGA in genes with a selenocysteine insertion sequence (SECIS) element. Selenocysteine is used in several proteins and is found in every domain of life. In 2002, a 22nd amino acid -- pyrrolysine -- was found to be encoded by UAG in the genetic code of some Archaea and possibly Eubacteria species. It is not yet clear if other signals in the mRNA are necessary to specify this amino acid.
4) The genetic code can be changed. Recent studies have successfully engineered aminoacyl-tRNA synthetases and tRNAs to incorporate several unnatural amino acids in the code of prokaryotes and eukaryotes. In 2003, a strain of E. coli capable of autonomously synthesizing and incorporating an artificial amino acid into proteins was developed.
5) Codons do not need to be three bases long. Studies have shown that the E. coli translational machinery is capable of accommodating four and even five base codons. But these seem to be the limits for possible codon sizes: functional two or six base codons have not yet been found, despite efforts to create them. A number of naturally occurring suppressor tRNAs exhibit four-base anticodons, and several studies of artificial genetic code expansion use four base codons to incorporate new amino acids into the code.(1-5)
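The counting behind point 5 can be sketched in a few lines. This is an illustration added here, not part of the Cavalcanti and Landweber text; the function name `codon_count` is hypothetical. It counts the distinct codons available at each codon length over a four-letter alphabet and compares that number with the 21 meanings (20 amino acids plus a stop signal) of the standard code.

```python
ALPHABET_SIZE = 4          # A, C, G, U
MEANINGS_NEEDED = 20 + 1   # 20 standard amino acids plus a stop signal


def codon_count(length: int) -> int:
    """Number of distinct codons of the given length (hypothetical helper)."""
    return ALPHABET_SIZE ** length


for length in range(2, 7):
    n = codon_count(length)
    verdict = "enough" if n >= MEANINGS_NEEDED else "too few"
    print(f"{length}-base codons: {n:5d} ({verdict} for {MEANINGS_NEEDED} meanings)")
```

Two-base codons give only 16 combinations, fewer than the 21 meanings, while triplets give 64. Counting alone does not account for the absence of six-base codons; that limit is an experimental observation reported above, not a consequence of this arithmetic.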
References (abridged):
1. Knight, R.D., Freeland, S.J., and Landweber, L.F. (2001). Rewiring the keyboard: evolvability of the genetic code. Nat. Rev. Genet. 2, 49-58
2. Srinivasan, G., James, C.M., and Krzycki, J.A. (2002). Pyrrolysine encoded by UAG in Archaea: Charging of a UAG-decoding specialized tRNA. Science 296, 1459-1462
3. Cavalcanti, A.R.O. and Landweber, L.F. (2003). Genetic Code: What nature missed. Curr. Biol. 13, R884-R885
4. Anderson, J.C., Maglieri, T.J., and Schultz, P.G. (2002). Exploring the limits of codon and anticodon size. Chem. Biol. 9, 237-244
5. Knight, R., Landweber, L.F., and Yarus, M. (2003). Tests of a stereochemical genetic code. In Translation Mechanisms. Lapointe, J. and Brakier-Gingras, L. eds. (Kluwer Academic/Plenum Publishers), pp. 115-128
Current Biology http://www.current-biology.com
--------------------------------
Related Material:
A CRITIQUE OF THE COEVOLUTION THEORY OF THE GENETIC CODE
Notes by ScienceWeek:
In general, the term "genetic code" refers to the sequences of nucleotides in DNA and RNA that determine the various amino acid sequences of proteins. Proteins are not synthesized directly from information in DNA; instead a "messenger" molecule, "messenger RNA" (mRNA), is synthesized from DNA and then directs the synthesis of protein in special structures called "ribosomes". Messenger RNA is built from 4 kinds of nucleotides, distinguished by their bases: adenine (A), guanine (G), cytosine (C), and uracil (U). Three adjacent nucleotides constitute a unit known as a "codon", and it is this unit that codes for a particular amino acid. For example, the sequence AUG is a codon that specifies the amino acid methionine. There are 64 possible codons, three of which do not code for amino acids but instead indicate the end of a protein. The remaining 61 codons specify the 20 amino acids that make up proteins.
The AUG codon, in addition to coding for methionine, is found at the beginning of the protein-coding sequence of a messenger RNA, where it indicates the start of a protein. Because most of the 20 amino acids are coded for by more than one codon, the genetic code, as a code, is "degenerate". The code was once believed to be identical in all forms of life, but the codon-amino acid code differs in *mitochondria, in certain bacteria, and in certain *ciliated protozoa. The differences, however, are rare, and in general the genetic code is identical in nearly all species, the same codons specifying the same amino acids.
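The reading of a message described above (start at AUG, read three nucleotides at a time, stop at a terminating codon) can be sketched in Python. This is a minimal illustration, not part of the source. It assumes the standard codon table, written as the usual compact string of one-letter amino acid symbols with "*" marking the three stop codons. The function name `translate` and the example sequence are hypothetical, and starting at the first AUG found is a simplification of how cells choose a start site.

```python
BASES = "UCAG"
# Standard code, one-letter amino acid symbols, '*' = stop.
# Symbols are listed in codon order UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    # Step through the reading frame three nucleotides at a time.
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: AUG UUU GGC UAA, with extra bases on either side.
print(translate("GGAUGUUUGGCUAAGG"))  # MFG
```

The example prints MFG: methionine (AUG), phenylalanine (UUU) and glycine (GGC), ending at the stop codon UAA.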
In this context, the term "quartet" derives from the following: The 64 possible codons can be organized into 16 "quartets" defined by the first 2 nucleotides in a triplet. Thus, UUU, UUC, UUA, UUG is one quartet; GAU, GAC, GAA, GAG is another quartet.
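The quartet grouping can be reproduced by enumerating all triplets and keying them on their first two nucleotides. The sketch below is an illustration only; the base order "UCAG" is the ordering conventionally used in codon tables and is an assumption here, chosen because it reproduces the order of the two example quartets above.

```python
from itertools import product

BASES = "UCAG"

# All 4**3 = 64 codons, in the order UUU, UUC, UUA, UUG, UCU, ...
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Group the codons by their first two nucleotides.
quartets = {}
for codon in codons:
    quartets.setdefault(codon[:2], []).append(codon)

print(len(codons), len(quartets))  # 64 16
print(quartets["UU"])              # ['UUU', 'UUC', 'UUA', 'UUG']
print(quartets["GA"])              # ['GAU', 'GAC', 'GAA', 'GAG']
```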
As with any other biological subsystem, researchers believe the genetic code evolved from a more primitive form, although no ancestral codes have yet been discovered. Attempts at an evolutionary reconstruction of the genetic code have therefore generally relied on a detailed analysis of the features of the present code. The features that appear most important in theoretical (logical and probabilistic) considerations of its evolution are as follows:
a) Messenger RNA molecules consist of only 4 kinds of nucleotide bases, and these compose chains of varying lengths and varying sequences.
b) A codon that specifies a particular amino acid is always a triplet consisting of a chain of 3 nucleotides.
c) The code has no "spacers" (intermediary nucleotides) between codons (the code is "comma-less" and nonoverlapping): each codon is translated in a continuous sequence, 3 successive nucleotides at a time, from one end of a messenger RNA "reading frame" to the other. (A "reading frame" is a particular nucleotide sequence coding for a polypeptide that starts at a specific point and then partitions into codons until it reaches the final codon of that sequence.)
d) Each codon sequence is complementary to an "anticodon" sequence of another RNA molecule, transfer RNA (tRNA), that carries a particular amino acid to the messenger RNA codon.
e) Except for rare instances, all living organisms share the same coding "dictionary".
f) Ambiguities have not been found in the code: the same codon does not specify two or more amino acids.
g) With the exception of methionine and tryptophan, all amino acids are each designated by more than one codon.
h) The pattern of degeneracy in the code is mostly in the 3rd codon position.
i) When an amino acid is coded for by only 2 of the 4 codons in a quartet, the third codon positions in the amino acid "duet" are both pyrimidines (U and C) or both purines (A and G), never one pyrimidine and one purine.
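Features g), h) and i) can be checked mechanically against the standard codon table. The sketch below is an illustration, not part of the source. It assumes the standard table written as the usual compact string of one-letter amino acid symbols ("*" for stop), and the helper name `duets_respect_base_class` is hypothetical.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard code, one-letter symbols, '*' = stop; codon order UUU, UUC, UUA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

PYRIMIDINES = {"U", "C"}
PURINES = {"A", "G"}

# Feature g: collect the codons for each amino acid (stop codons excluded).
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        codons_for[aa].append(codon)
single_codon = sorted(aa for aa, cs in codons_for.items() if len(cs) == 1)

# Feature h: quartets whose four codons all mean the same thing, so the
# third position carries no information at all.
uniform_quartets = [
    b1 + b2
    for b1 in BASES
    for b2 in BASES
    if len({CODON_TABLE[b1 + b2 + b3] for b3 in BASES}) == 1
]


def duets_respect_base_class() -> bool:
    """Feature i: every two-codon amino acid within a quartet uses third
    bases that are both pyrimidines or both purines."""
    for b1 in BASES:
        for b2 in BASES:
            thirds = defaultdict(set)
            for b3 in BASES:
                aa = CODON_TABLE[b1 + b2 + b3]
                if aa != "*":
                    thirds[aa].add(b3)
            for third_bases in thirds.values():
                if len(third_bases) == 2 and third_bases not in (PYRIMIDINES, PURINES):
                    return False
    return True


print(single_codon)                # ['M', 'W']
print(len(uniform_quartets))       # 8
print(duets_respect_base_class())  # True
```

Under these assumptions, only methionine (M) and tryptophan (W) have a single codon, 8 of the 16 quartets are fully degenerate in the third position, and every duet respects the pyrimidine/purine split.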
So there is the problem: given that no ancestor codes are available to us, the research goal is to provide a theoretical model that can explain the evolution of the present universal genetic code.
In evolutionary biology, the term "coevolution" refers to the evolution of one or more species in synchrony with another species as a consequence of their interdependence. In the context of the genetic code, the term "coevolution theory" refers to a set of theories proposing that the genetic code evolved in step with amino acid biosynthetic pathways.
The following points are made by T.A. Ronneberg et al (Proc. Natl. Acad. Sci. USA 2000 97:13690):
1) The authors point out that the idea that the canonical genetic code evolved from a simpler primordial form that encoded fewer amino acids originates with F. Crick (1968). The more recent (and more influential) version of this idea is the "code coevolution" hypothesis introduced by J. Wong in 1975, which proposes that the genetic code coevolved with the biosynthetic pathways for new amino acids. Coevolution theory further proposes that a comparison of modern codon assignments with the evolutionarily conserved metabolic pathways of amino acid biosynthesis can reveal the history of code expansion: A central tenet of coevolution theory is that a "product" amino acid synthesized from a precursor amino acid usurped codons previously assigned to this precursor, such that the sequence of steps by which the code expanded is visible within modern codon assignments.
2) The authors report they have re-examined the biochemical basis of the code coevolution theory to test the validity of its statistical support. The authors demonstrate that the theory's definition of "precursor-product" amino acid pairs is unjustified biochemically because it requires the energetically unfavorable reversal of steps in extant metabolic pathways to achieve desired relationships. In addition, the authors suggest coevolution theory neglects important biochemical constraints when calculating the probability that chance could assign precursor-product amino acids to contiguous codons.
3) The authors report that a conservative correction for these errors reveals a surprisingly high 23 percent probability that apparent patterns within the code are caused purely by chance. Finally, even this figure rests on post hoc assumptions about primordial codon assignments, without which the probability rises to 62 percent that chance alone could explain the precursor-product pairings found within the code. The authors state: "We conclude that coevolution theory cannot adequately explain the structure of the genetic code."
Proc. Natl. Acad. Sci. http://www.pnas.org
--------------------------------
Notes by ScienceWeek:
mitochondria: Mitochondria are double-membrane enclosed organelles of cells that are involved with several important biochemical pathways, including electron transport and oxidative metabolism. Depending on the cell type, *eukaryotic cells may contain from a few to several thousand mitochondria per cell. The mitochondria are relatively large cylindrical structures up to 10 microns long and up to 2 microns in diameter, and most biologists believe mitochondria are cell organelles that may have originated as separate organisms that became resident in eukaryotic cells. Mitochondrial DNA is independent of nuclear DNA. It consists of a circular molecule, 16,569 base pairs long in humans, with a known nucleotide sequence.
eukaryotic cells: In general, refers to cells which contain a nucleus and/or other membrane-bound organelles (e.g., mitochondria).
ciliated protozoa: A phylum (or subkingdom) comprising unicellular and colonial animals of varied form, cells ranging from simple to extremely complex macro-structures. "Cilia" are short threadlike extensions, hundreds usually present on an individual ciliated cell, the cilia undergoing synchronized movements to produce locomotion of the protozoan.
ScienceWeek http://scienceweek.com