ScienceWeek
EVOLUTION: GENOMES AND THE TREE OF LIFE
The following points are made by K.A. Crandall and J.E. Buhay (Science 2004 306:1144):
1) Although we have not yet counted the total number of species on our planet, biologists in the field of systematics are assembling the "Tree of Life" (1,2). The Tree of Life aims to define the phylogenetic relationships of all organisms on Earth. Driskell et al (3) recently proposed a computational method for assembling this phylogenetic tree. These investigators probed the phylogenetic potential of ~300,000 protein sequences sampled from the GenBank and Swiss-Prot genetic databases. From these data, they generated "supermatrices" and then supertrees.
2) Supermatrices are extremely large data sets of amino acid or nucleotide sequences (columns in the matrix) for many different taxa (rows in the matrix). Driskell et al (3) constructed a supermatrix of 185,000 protein sequences for more than 16,000 green plant taxa and one of 120,000 sequences for nearly 7500 metazoan taxa. This compares with a typical systematics study of, on a good day, four to six partial gene sequences for 100 or so taxa. Thus, the potential data enrichment that comes with carefully mining genetic databases is large. However, this enrichment comes at a cost. Traditional phylogenetic studies sequence the same gene regions for all the taxa of interest while minimizing the overall amount of missing data. With the database supermatrix method, the data overlap is sparse, resulting in many empty cells in the supermatrix, but the total data set is massive.
3) To solve the problem of sparseness, the authors built a "supertree" (4). The supertree approach estimates phylogenies for subsets of data with good overlap, then combines these subtree estimates into a supertree. Driskell et al (3) took individual gene clusters and assembled them into subtrees, and then looked for sufficient taxonomic overlap to allow construction of a supertree. For example, using 254 genes (2777 sequences and 96,584 sites), the authors reduced the green plant supermatrix from more than 16,000 taxa to 69 taxa, with an average of 40 genes per taxon and 84% missing sequences! This represents one of the largest data sets for phylogeny estimation in terms of total nucleotide information, but it is the sparsest in terms of the percentage of overlapping data.
4) Yet even with such sparseness, the authors are still able to estimate robust phylogenetic relationships that are congruent with those reported using more traditional methods. Computer simulation studies (5) recently showed that, contrary to the prevailing view, phylogenetic accuracy depends more on having sufficient characters (such as amino acids) than on whether data are missing. Clearly, building a supertree allows for an abundance of characters even though there are many missing entries in the resulting matrix.
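The sparseness figures quoted in point 3 can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers reported above, not the authors' method. It assumes that the reduced supermatrix has one cell per taxon-gene pair and that each of the 2777 sequences fills exactly one cell; the variable names are illustrative.

```python
# Figures quoted in point 3 for the reduced green plant supermatrix.
n_taxa = 69          # rows of the matrix
n_genes = 254        # gene clusters (assumed: one cell per taxon-gene pair)
n_sequences = 2777   # sequences actually present (assumed: one per filled cell)

total_cells = n_taxa * n_genes
missing_fraction = 1 - n_sequences / total_cells
genes_per_taxon = n_sequences / n_taxa

print(f"cells in matrix:  {total_cells}")
print(f"missing fraction: {missing_fraction:.1%}")
print(f"genes per taxon:  {genes_per_taxon:.1f}")
```

Under these assumptions the matrix has 17,526 cells, of which roughly 84% are empty, and each taxon carries about 40 genes on average, in line with the values reported above.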
References (abridged):
1. M. Pagel, Nature 401, 877 (1999)
2. A new NSF program funds computational approaches for "assembling the Tree of Life" (AToL). Total AToL program funding is $13 million for fiscal year 2004. NSF, Assembling the Tree of Life: Program Solicitation NSF 04-526 (www.nsf.gov/pubs/2004/nsf04526/nsf04526.pdf)
3. A. C. Driskell et al., Science 306, 1172 (2004)
4. M. J. Sanderson et al., Trends Ecol. Evol. 13, 105 (1998)
5. J. Wiens, Syst. Biol. 52, 528 (2003)
Science http://www.sciencemag.org
--------------------------------
Related Material:
EVOLUTIONARY BIOLOGY: PHYLOGENETIC TREES AND MICROBES
The following points are made by W. Martin and T. M. Embley (Nature 2004 431:134):
1) Charles Darwin (1809-1882) described the evolutionary process in terms of trees, with natural variation producing diversity among progeny and natural selection shaping that diversity along a series of branches over time. But in the microbial world things are different, and various schemes have been devised to take both traditional and molecular approaches to microbial evolution into account. For example, Rivera and Lake(1), based on analysis of whole-genome sequences, call for a radical departure from conventional thinking.
2) Unknown to Darwin, microbes use two mechanisms of natural variation that disobey the rules of tree-like evolution: lateral gene transfer and endosymbiosis. Lateral gene transfer involves the passage of genes among distantly related groups, causing branches in the tree of life to exchange bits of their fabric. Endosymbiosis -- one cell living within another -- gave rise to the double-membrane-bounded organelles of eukaryotic cells: mitochondria (the powerhouses of the cell) and chloroplasts. At the endosymbiotic origin of mitochondria, a free-living proteobacterium came to reside within an archaebacterially related host. This event involved the genetic union of two highly divergent cell lineages, causing two deep branches in the tree of life to merge outright. To this day, biologists cannot agree on how often lateral gene transfer and endosymbiosis have occurred in life's history; how significant either is for genome evolution; or how to deal with them mathematically in the process of reconstructing evolutionary trees. The report by Rivera and Lake(1) bears on all three issues: Instead of a tree linking life's three deepest branches (eubacteria, archaebacteria and eukaryotes), they uncover a ring.
3) The ring comes to rest on evolution's sorest spot -- the origin of eukaryotes. Biologists fiercely debate the relationships between eukaryotes (complex cells that have a nucleus and organelles) and prokaryotes (cells that lack both). For a decade, the dominant approach has involved another intracellular structure called the ribosome, which consists of complexes of RNA and protein, and is present in all living organisms. The genes encoding an organism's ribosomal RNA (rRNA) are sequenced, and the results compared with those for rRNAs from other organisms. The ensuing tree(2) divides life into three groups called "domains". The usefulness of rRNA in exploring biodiversity within the three domains is unparalleled, but the proposal for a natural system of all life based on rRNA alone has come increasingly under fire.
4) Ernst Mayr(3), for example, argued forcefully that the rRNA tree errs by showing eukaryotes as sisters to archaebacteria, thereby obscuring the obvious natural division between eukaryotes and prokaryotes at the level of cell organization. A central concept here is that of a tree's "root", which defines its most ancient branch and hence the relationships among the deepest-diverging lineages. The eukaryote-archaebacteria sister-grouping in the rRNA tree hinges on the position of the root. The root was placed on the eubacterial branch of the rRNA tree based on phylogenetic studies of genes that were duplicated in the common ancestor of all life(2). But the studies that advocated this placement of the root on the rRNA tree used, by today's standards, overly simple mathematical models and lacked rigorous tests for alternative positions(4).
5) One discrepancy is already apparent in analyses of a key data set used to place the root, an ancient pair of related proteins, called elongation factors, that are essential for protein synthesis(5). Although this data set places the root on the eubacterial branch, it also places eukaryotes within the archaebacteria, not as their sisters(5). Given the uncertainties of deep phylogenetic trees based on single genes(4), a more realistic view is that we still don't know where the root on the rRNA tree lies and how its deeper branches should be connected.
References (abridged):
1. Rivera, M. C. & Lake, J. A. Nature 431, 152-155 (2004)
2. Woese, C., Kandler, O. & Wheelis, M. L. Proc. Natl Acad. Sci. USA 87, 4576-4579 (1990)
3. Mayr, E. Proc. Natl Acad. Sci. USA 95, 9720-9723 (1998)
4. Penny, D., Hendy, M. D. & Steel, M. A. in Phylogenetic Analysis of DNA Sequences (eds Miyamoto, M. M. & Cracraft, J.) 155-183 (Oxford Univ. Press, 1991)
5. Baldauf, S., Palmer, J. D. & Doolittle, W. F. Proc. Natl Acad. Sci. USA 93, 7749-7754 (1996)
Nature http://www.nature.com/nature
--------------------------------
Related Material:
EVOLUTIONARY BIOLOGY: ON THE PRIMEVAL KINGDOMS
Notes by ScienceWeek:
During most of the past 100 years, the consensus view among biologists was that all life on Earth evolved from a universal common ancestor, a primitive cellular form that lived approximately 3.5 to 3.8 billion years ago. This view capped centuries of detailed classifications of living systems, with relationships between organisms deduced and revised and revised again as new discoveries were made. Detailed analysis of many traits indicated, for example, that primates in the human family (hominids) shared a common ancestor with apes, that this common ancestor shared an earlier common ancestor with monkeys, and that that common ancestor, in turn, shared an even earlier common ancestor with primitive primates (prosimians; e.g., lemurs), and so on.
The view was thus of a "tree of life", with discrete branches rising ever higher, but with all branches deriving from a single primeval trunk. The known organisms that might have comprised the primeval trunk and its lowest branches, however, did not provide enough organismic information to define detailed relationships, so that biologists were left with apparent mysteries concerning radical evolutionary innovations between primitive cells and more complex cells, between the first biological cells and the appearance of multicellular fungi, plants, and animals.
The following points are made by W. Ford Doolittle (Scientific American February 2000):
1) In the mid-1960s, Zuckerkandl and Pauling proposed a revolutionary strategy that might supply the missing information concerning evolutionary branching. The essential idea was that instead of investigating anatomy and physiology, family trees of living organisms should be based on differences in the monomer sequences in selected genes or proteins. This approach became known as "molecular phylogeny", and its essential basis was that as a result of changes in genes caused by mutations, as two species diverge from an ancestor, the gene sequences they share will also diverge, and as time passes, the genetic divergence will increase. Researchers could thus reconstruct the evolutionary past of living species by assessing the apparent history of divergence of genes or proteins isolated from those species. Protein studies completed in the 1960s and 1970s demonstrated the general utility of molecular phylogeny by confirming and then extending the already established family trees of well-studied groups such as the vertebrates.
2) A new research development occurred in the late 1970s, when Carl Woese proposed that the two-domain view of life that divided living organisms into a) bacteria and b) cells with internal membrane-bound organelles (eukaryotes) was no longer tenable on the basis of molecular analysis. Woese suggested that certain so-called "bacteria" formed a distinct third primary group -- the archaea -- and that members of this group were as different from other bacteria as bacteria were different from eukaryotes. Woese suggested that although certain cells without internal membrane-bound organelles (prokaryotes) classified as bacteria might look like bacteria, they were genetically much different, and their *ribosomal RNA (rRNA) supported an early evolutionary divergence.
3) Once the idea of three rather than two primeval domains was accepted by researchers, an important question was which of the two structurally primitive groups -- bacteria or archaea -- gave rise to the first eukaryotes. Because of evidence indicating an apparent kinship between the gene expression/protein synthesis machinery of archaea and eukaryotes, the consensus was that eukaryotes diverged from the archaea.
4) One important result of research in molecular phylogeny during the past 15 years has been the production of strong evidence supporting the "endosymbiont hypothesis". In biology, the term "symbiosis" refers in general to an intimate and protracted association of individuals of different species, and "endosymbiosis" refers to a symbiotic association between cells of two or more different species in which a smaller cell inhabits a larger host cell. The endosymbiont hypothesis in evolutionary biology, now a consensus view, proposes that the mitochondria of eukaryotes, so essential for eukaryote metabolism, formed when an early eukaryote engulfed and then retained one or more primitive bacteria of a certain type (alpha-proteobacteria). Eventually, these bacteria relinquished their ability to live on their own and transferred some of their genes to the nucleus of the host cell, and these bacteria then evolved into the extant mitochondria. In addition, and similarly, the hypothesis proposes that some mitochondria-bearing eukaryotes ingested bacteria capable of producing oxygen during photosynthesis (cyanobacteria), and these resident symbiotic bacteria subsequently evolved into the chloroplasts, the present internal structures that drive photosynthesis in certain eukaryotes (e.g., in plant cells).
5) Until very recently, therefore, the consensus view in biology could be summarized as follows: The early descendants of the last universal common ancestor -- a small prokaryote cell -- divided into two prokaryotic groups: the bacteria and the archaea. Later, the archaea gave rise to the eukaryotes. Subsequently, the eukaryotes gained valuable energy-generating organelles -- mitochondria and (in the case of plants, for example) chloroplasts -- by taking up and retaining certain symbiotic bacteria.
6) Several years ago, however, the consensus view stated above became complicated by a large amount of evidence concerning the phenomenon of "lateral gene transfer" (horizontal gene transfer). Biologists recognize two types of gene transfer from one organism to another: vertical and horizontal. Vertical gene transfer occurs between parents and offspring, and horizontal gene transfer is any transfer between organisms that are not parent and offspring. It is in bacteria that horizontal gene transfer has been studied most extensively, particularly in the last decade. Three types of horizontal gene transfer are known: conjugation, transduction, and transformation. Conjugation is a process exhibited by some bacteria in which genetic material is exchanged by means of a tube or bridge, the transfer of DNA occurring either in one direction or in both directions. Transduction involves the transfer of genetic material from one bacterium to another with the intermediation of a virus. Essentially, when the virus infects one bacterium, it often carries away pieces of that bacterium's genome, and those pieces, upon the infection of a new bacterium, become incorporated into the second bacterial genome. Finally, transformation is the uptake and incorporation of free DNA fragments by a bacterium, a phenomenon first observed by Frederick Griffith in 1928; in 1944 Oswald Avery and colleagues showed that the transforming material is DNA. In this context, the important aspect of horizontal gene transfer is that in primitive cells such as prokaryotes it is now apparent that horizontal gene transfer readily occurs across species. As a consequence of the new evidence, the consensus view of the interrelations between the primeval three kingdoms has now been seriously destabilized.
7) In general, the current situation concerning the evolutionary "tree of life" is as follows: The conceptual tree-like structure with discrete branches is retained at the top of the eukaryote domain, and also retained is the idea that eukaryotes obtained mitochondria and chloroplasts from bacteria. But the lower parts of the tree are now seen to involve an extensive anastomosis of branches -- branches joining other branches in a complex network of intersecting links -- resulting from extensive horizontal gene transfer of single or multiple genes, the horizontal gene transfer known to be common in unicellular organisms. Thus, the author (Doolittle) suggests that the "tree of life" lacks a single organism at its base, and that "the three major domains of life probably arose from a population of primitive cells that differed in their genes."
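The idea behind molecular phylogeny described in point 1 above, that gene sequences drift apart as species diverge, can be illustrated with a toy calculation. The sketch below computes the uncorrected proportion of differing sites (often called the p-distance) between pairs of aligned sequences. It is a minimal illustration, not the procedure used by Zuckerkandl and Pauling or by any study cited here; the function name `p_distance` and the four short sequences are invented for the example.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of comparable aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore sites where either sequence has a gap ("-").
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


# Hypothetical aligned sequences, made up for illustration only.
sequences = {
    "species_A": "ACGTACGTAC",
    "species_B": "ACGTACGTAT",
    "species_C": "ACGAACCTAT",
    "species_D": "TCGAACCTGT",
}

for (name_1, seq_1), (name_2, seq_2) in combinations(sequences.items(), 2):
    print(f"{name_1} vs {name_2}: {p_distance(seq_1, seq_2):.1f}")
```

In this made-up data set, species_A and species_B differ at one site in ten, while species_A and species_D differ at five, so a distance-based method would place A and B closer together on a tree. Such a tree assumes purely vertical descent; a gene acquired by horizontal transfer would carry the history of its donor rather than of the organism it now sits in, which is the complication raised in points 6 and 7.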
Scientific American http://www.sciam.com
--------------------------------
Notes by ScienceWeek:
ribosomal RNA (rRNA): A ribosome (not to be confused with riboZYME) is a small particle, a complex of various ribonucleic acid component subunits and proteins that functions as the site of protein synthesis.
ScienceWeek http://scienceweek.com