ScienceWeek
2004 25 June C5 5. PLANT BIOLOGY: ON PLANT PROTEOMICS
The following points are made by Paul Jarvis (Current Biology 2004 14:R317):
1) The evolution of the modern plant cell involved the acquisition of mitochondria and chloroplasts through endosymbiosis, and it is now widely accepted that these organelles are distant relatives of present-day alpha-proteobacteria and cyanobacteria, respectively. Over the course of evolution, the progenitors of mitochondria and chloroplasts conceded many of their genes to the nuclear genome, so that now more than 90% of their constituent proteins are translated on cytoplasmic ribosomes [1]. Many of these nucleus-encoded, organellar proteins initially bear an amino-terminal targeting signal -- called a "presequence" or "transit" peptide -- which guides them through a post-translational targeting pathway to their final destination [2,3]. While programs for predicting targeting signals from sequence data do exist [4], these in silico methods are not 100% reliable, and so the only truly dependable method for determining the protein complement of a particular organelle is laboratory experimentation.
2) The completion of genome sequencing projects, and advances in methods for routine protein identification by mass spectrometry, have precipitated the onset of the proteomic era. In plants, the chloroplast proteome of the model plant, Arabidopsis thaliana, has received considerable attention [5]. Although the proteome of an Arabidopsis chloroplast is substantially smaller and more manageable than that of an entire cell, it nevertheless comprises several thousand different proteins. For this reason, initial studies tended to focus on a particular suborganellar compartment. Peltier et al. [5] studied the space enclosed within the photosynthetic membranes, called the thylakoid lumen, whereas other researchers focused on the double membrane system, or envelope, that surrounds each chloroplast. While many of the identified proteins turned out to have functions that one would expect in the compartment in question, many more did not, and so these studies have paved the way for major advances in our understanding of thylakoids and the envelope. Such studies also facilitate the development of protein localization prediction tools.
3) Kleffmann et al. (Current Biology 2004 14:354) have recently reported the first extensive study of the whole chloroplast proteome. Using a comprehensive series of fractionation procedures to overcome dynamic range limitations -- the tendency of abundant proteins to mask the presence of less abundant proteins -- the authors identified a total of 690 different proteins in highly purified preparations of Arabidopsis chloroplasts. After putative contaminating proteins from other compartments were eliminated, a final set of 636 proteins was selected for analysis, 604 of which are encoded by nuclear genes (the other 32 are encoded by the chloroplast's own genome).
4) More than 30% of these proteins are of unknown function. In a recent, similarly comprehensive study of the Arabidopsis mitochondrial proteome, almost 20% of the identified proteins were of unknown function, and so it seems that we have some way to go yet before the functions of these two organelles are fully understood.
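The protein counts in points 3 and 4 can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the "unknown function" value is a lower bound implied by "more than 30%", not a count reported in the text.

```python
# Counts quoted in point 3 (Kleffmann et al. 2004).
identified = 690        # proteins found in purified chloroplast preparations
retained = 636          # final set after removing putative contaminants
nuclear_encoded = 604   # retained proteins encoded by nuclear genes

# Derived values.
removed_as_contaminants = identified - retained
chloroplast_encoded = retained - nuclear_encoded
nuclear_share = nuclear_encoded / retained

# Point 4 says "more than 30%" are of unknown function. The smallest whole
# number of proteins strictly above 30% of the retained set is a lower
# bound only (an inference, not a figure from the article).
unknown_lower_bound = retained * 30 // 100 + 1

print(removed_as_contaminants)   # 54
print(chloroplast_encoded)       # 32
print(f"{nuclear_share:.1%}")    # 95.0%
print(unknown_lower_bound)       # 191
```

The 32 chloroplast-encoded proteins obtained by subtraction agree with the figure given in parentheses in point 3.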
References (abridged):
1. Leister, D. (2003). Chloroplast research in the genomic age. Trends Genet. 19, 47-56
2. Pfanner, N. and Geissler, A. (2001). Versatility of the mitochondrial protein import machinery. Nat. Rev. Mol. Cell Biol. 2, 339-349
3. Jarvis, P. and Soll, J. (2002). Toc, Tic, and chloroplast protein import. Biochim. Biophys. Acta 1590, 177-189
4. Emanuelsson, O., Nielsen, H., Brunak, S., and von Heijne, G. (2000). Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300, 1005-1016
5. Peltier, J.B., Emanuelsson, O., Kalume, D.E., Ytterberg, J., Friso, G., Rudella, A., Liberles, D.A., Soderberg, L., Roepstorff, P., and von Heijne, G. et al. (2002). Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell 14, 211-236
Current Biology http://www.current-biology.com
--------------------------------
Related Material:
GENOMIC BIOLOGY: ON THE YEAST PROTEOME
The following points are made by J.A. Wohlschlegel and J.R. Yates (Nature 2003 425:671):
1) Saccharomyces cerevisiae was the first eukaryote -- the type of organism characterized by a nucleus and membrane-bound organelles, which also includes humans -- to have its genome sequenced(3). Work with this organism has since led the way in functional genomics. Experiments pioneered in yeast have set the standard for the global analysis of cellular processes and paved the way for similar approaches in other organisms. They have also generated genome-wide collections of reagents that have been tremendously valuable.
2) Open reading frames (ORFs) are commonly the center of attention in genome biology. These are stretches of DNA that have the characteristics of protein-coding capacity; that is, they may be genes. Collections of yeast strains now exist in which the expected ORFs have been either deleted or fused to various protein tags(4,5). Arrays have been created by using yeast strains expressing proteins that carry so-called affinity tags, allowing large numbers of proteins to be rapidly purified, then immobilized on a solid support. Large-scale studies involving various techniques -- protein arrays, and yeast two-hybrid or co-immunoprecipitation assays -- have revealed the identities of proteins that interact with individual proteins, large macromolecular complexes, or even specific small molecules.
3) All in all, yeast biologists have led the charge in developing approaches to understanding eukaryotic genomes. Huh et al(1) and Ghaemmaghami et al(2) continue that tradition. Their goal was to tag and study the gene products of all recognized ORFs in the yeast genome. A key component of these studies was the tagging method used: artificially altering a protein's expression level can lead to results, such as mislocalization, that do not reflect its characteristics when it is expressed normally.
4) In technical terms, Huh et al and Ghaemmaghami et al used homologous recombination to integrate a DNA sequence, encoding either a tandem affinity purification tag (TAP) or green fluorescent protein (GFP), in-frame with the 3'-end of the coding sequence of each gene in its original chromosomal location. Because a gene's promoter and upstream regulatory sequences are not affected in this approach, it is likely that the behavior of these fusion genes is nearly identical to that of their normal counterparts.
5) These studies have achieved three major results: First, we now have data on protein abundance and localization for 75% of the predicted yeast ORFs. Second, we have a value for the number of proteins present in a yeast cell during normal growth. Previously, a fun game to play with yeast biologists was to ask how many proteins they thought should be present under a given set of conditions. Numbers ranged between 2500 and 5000. It appears that the higher number was correct. Last -- and most important -- reagents have been developed for tracking a large majority of yeast genes while keeping them under native regulatory control. The reagents will be tools for further studies.
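The in-frame tagging described in point 4 can be illustrated at the sequence level. The following is a minimal sketch, not the authors' protocol: it assumes that fusing a tag to the 3'-end of a coding sequence means dropping the ORF's stop codon and appending a tag whose length is a multiple of three, so that the reading frame is preserved. The function name and the toy sequences are hypothetical and are not real yeast genes or real TAP/GFP sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_tag_in_frame(orf, tag):
    """Return ORF (minus its stop codon) followed by the tag, in frame.

    Hypothetical helper for illustration: both inputs must be whole
    codons, must end in a stop codon, and must contain no internal stop.
    """
    orf, tag = orf.upper(), tag.upper()
    if not orf or not tag or len(orf) % 3 or len(tag) % 3:
        raise ValueError("ORF and tag must be non-empty multiples of 3 bases")
    orf_codons, tag_codons = codons(orf), codons(tag)
    if orf_codons[-1] not in STOP_CODONS or tag_codons[-1] not in STOP_CODONS:
        raise ValueError("ORF and tag must each end with a stop codon")
    if any(c in STOP_CODONS for c in orf_codons[:-1] + tag_codons[:-1]):
        raise ValueError("internal stop codon would truncate the fusion")
    # Drop the ORF's own stop so translation reads through into the tag.
    return "".join(orf_codons[:-1] + tag_codons)


# Toy sequences (made up for this example).
toy_orf = "ATGGCTAAAGGTTAA"   # ATG GCT AAA GGT TAA
toy_tag = "GGATCCCATCACTGA"   # GGA TCC CAT CAC TGA
fused = fuse_tag_in_frame(toy_orf, toy_tag)
print(fused)  # ATGGCTAAAGGTGGATCCCATCACTGA
```

Because only the 3'-end of the coding sequence changes in this sketch, anything upstream of the start codon is untouched, which mirrors the point made above about the promoter and regulatory sequences.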
References (abridged):
1. Huh, W-K et al. Nature 425, 686-691 (2003)
2. Ghaemmaghami, S. et al. Nature 425, 737-741 (2003)
3. Goffeau, A. et al. Science 274, 546, 563-567 (1996)
4. Winzeler, E. A. et al. Science 285, 901-906 (1999)
5. Martzen, M. R. et al. Science 286, 1153-1155 (1999)
Nature http://www.nature.com/nature
--------------------------------
Related Material:
ON PROTEOMICS
The following points are made by Ruedi Aebersold (Nature 2003 422:115):
1) One of the most striking results obtained from completed genome sequencing projects is the knowledge of the precise number of genes in the genome of a species. Although the number of genes analysed to date is relatively small -- ranging from a few hundred for bacteria to tens of thousands for mammalian species -- the number of possible products encoded by these genes is much higher. In particular, the number of encoded proteins is enormous, as the same gene can generate multiple protein products that differ as a result of combinatorial splicing, processing and modification.
2) Because the universe of a species' biological processes, and of the molecules that constitute these processes, is finite and knowable, there are several important consequences for experimental biology. First, even the most complex biological phenomena such as development, differentiation, metabolism and memory will be explained using known genes, their products and their interplay with the external conditions that the organism encounters. Second, projects to discover genes and their products in a species have defined end points. Although this end point has so far been reached only for gene discovery (sequencing), at some point in the future all of the gene products of a species -- including messenger RNA and proteins -- will also be comprehensively described. Third, once all the possible molecules and activities within a species have been discovered and described, biological experimentation will be transformed from a discovery mode of identifying and describing molecules, to a "browsing" mode, in which the universe of possible events is searched to find constellations that correlate with a particular state or function. Genomics-style biology can therefore be separated into two distinct phases: a discovery phase to characterize the universe; and a browsing phase, in which system-wide biological assays navigate the universe.
3) Proteins are involved in all biological processes and can therefore be considered the functionally most important biological molecules. They are also particularly rich in biological information. In addition to the amino-acid sequence defining a protein, protein properties such as the amount of a protein expressed, its specific activity, state of modification and association with other proteins or molecules of different types are crucial for the description of biological systems. The systematic identification and characterization of proteins, called "proteomics", carries with it huge expectations, such as diagnostic and prognostic markers in blood serum and other body fluids; targets for pharmaceutical drugs; and improving the knowledge of fundamental biological processes. Hence the development of technologies to search the "proteome" routinely and systematically would be a significant achievement.
4) Unfortunately, the same properties that make proteins information-rich also significantly complicate their experimental analysis. There is no experimental platform, even under development, to systematically measure the diverse properties of proteins at high throughput.
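The claim in point 1 that one gene can yield many protein products is easiest to see as a counting exercise. The sketch below is a simplified upper bound under stated assumptions, not a model from the article: it assumes each optional exon is independently included or skipped and each modification site is independently modified or not. The function name and the example numbers are hypothetical.

```python
def max_protein_forms(optional_exons, modification_sites):
    """Upper bound on distinct protein forms from a single gene.

    Simplifying assumptions (not from the article): every optional exon
    and every modification site is an independent yes/no choice.
    """
    splice_isoforms = 2 ** optional_exons
    modification_states = 2 ** modification_sites
    return splice_isoforms * modification_states


# Hypothetical gene with 3 optional exons and 4 modification sites.
print(max_protein_forms(3, 4))  # 8 * 16 = 128
```

Even with these small made-up numbers, a single gene allows over a hundred distinguishable forms, which illustrates why the protein count can far exceed the gene count.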
Nature http://www.nature.com/nature
ScienceWeek http://scienceweek.com