
ANNUAL REVIEWS


Quantitative, High-Resolution Proteomics for Data-Driven Systems Biology


Jürgen Cox¹ and Matthias Mann¹,²
¹ Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried D-82152, Germany; email: mmann@biochem.mpg.de, cox@biochem.mpg.de
² The Novo Nordisk Foundation Center for Protein Research, Faculty of Health Sciences, University of Copenhagen, Copenhagen DK-2200, Denmark

Annu. Rev. Biochem. 2011.80:273-299. Downloaded from www.annualreviews.org by b-on: Universidade Nova de Lisboa (UNL) on 10/31/11. For personal use only.

Annu. Rev. Biochem. 2011. 80:273-99. First published online as a Review in Advance on May 3, 2011. The Annual Review of Biochemistry is online at biochem.annualreviews.org. This article's doi: 10.1146/annurev-biochem-061308-093216. Copyright © 2011 by Annual Reviews. All rights reserved. 0066-4154/11/0707-0273$20.00

Keywords
bioinformatics, genetics, protein quantification, protein interactions, posttranslational modifications, signaling

Abstract
Systems biology requires comprehensive data at all molecular levels. Mass spectrometry (MS)-based proteomics has emerged as a powerful and universal method for the global measurement of proteins. In the most widespread format, it uses liquid chromatography (LC) coupled to high-resolution tandem mass spectrometry (MS/MS) to identify and quantify peptides at a large scale. This peptide intensity information is the basic quantitative proteomic data type. It is used to quantify proteins between different proteome states, including the temporal variation of the proteome, to determine the complete primary structure of proteins including posttranslational modifications, to localize proteins to organelles, and to determine protein interactions. Here, we describe the principles of analysis and the areas of biology where proteomics can make unique contributions. The large-scale nature of proteomics data and its high accuracy pose special opportunities as well as challenges in systems biology that have been largely untapped so far.


Contents
INTRODUCTION
PEPTIDE AND PROTEIN IDENTIFICATION AND QUANTIFICATION
    Generic Shotgun Proteomics Workflow
    Peptide Identification
    Quantification in Proteomics
    Targeted Analysis
EXPRESSION PROTEOMICS
    Comprehensive Proteome Quantification
    Compartment, Organellar, and Temporal Proteomics
    Proteomes of Clinical Interest
PROTEOMICS OF POSTTRANSLATIONAL MODIFICATIONS
    Large-Scale Measurement of Posttranslational Modifications by Mass Spectrometry
    Making Use of Large-Scale Posttranslational Modification Data Sets
    Implications for Biology and Systems Biology
INTERACTION PROTEOMICS
    Quantitative Affinity Purification Followed by Mass Spectrometry
    Large-Scale Data Sets for Network Biology
    Protein Interactions with Posttranslational Modifications, DNA, and Other Biomolecules
INTEGRATING PROTEOMICS WITH OTHER LARGE-SCALE DATA
    Integration with Ontologies and Pathways
    Physical and Functional Protein-Protein Interactions
    Correlation of Transcriptome and Proteome
    Combining Genetics with Proteomics
OUTLOOK: TRANSFORMING BIOCHEMISTRY INTO A SYSTEMS SCIENCE VIA PROTEOMICS

INTRODUCTION
The word proteomics was coined more than 10 years ago in analogy to genomics; its object, the proteome, denotes the entire complement of proteins expressed in a specific state of an organism or a cell population (1). The term expresses the ambition to obtain a global view at the protein level in analogy to what has already been possible at the level of DNA and RNA. However, despite superficial similarities, proteomics is very different from genomics in almost every respect. Genomics measures the genotype of an organism, whereas proteomics measures the phenotype, which is shaped by both the genotype and the past and present environment of the organism. Genomics is founded upon a rich and long history of genetics, whereas proteomics builds upon an equally long history of biochemistry. The largest impediment to proteomics having a similar impact to genomics has, however, been technological: Genomics makes use of generic, scalable, and constantly refined technologies for decoding nucleic acid sequences, whereas proteomics has employed comparatively primitive techniques. Fortunately, this has now changed owing to the development of very powerful mass spectrometric technologies. In this review, we restrict ourselves to this area of mass spectrometry (MS)-based proteomics. Furthermore, within MS-based proteomics, we review the peptide-based method of shotgun proteomics, which has turned out to be by far the most widely used and the most generic approach (2-7). We focus on individual studies, often drawn from our own laboratory, to illustrate general principles dealing with the current and potential contribution of MS-based proteomics to biology in general and to a systems biological understanding of the cell in particular.

From a systems biology perspective, MS-based proteomics delivers three distinct types of experimental results or data types. Expression proteomics determines the relative or absolute amount of proteins in a sample. This is analogous to transcriptomics, the measurement of mRNA by microarrays or by deep sequencing methods. A principal advantage of focusing on proteins is that this automatically takes regulation at the posttranscriptional and posttranslational levels into account. A second data type uniquely accessible to proteomics is the modification state of proteins. MS can, in principle, identify and quantify all elements of the primary structure of the mature protein, including its posttranslational modifications (PTMs). MS has already increased the number of known sites 10- to 100-fold for many of the most important PTMs (5). Indeed, one of the surprises of the past years has been just how widespread and how diverse these PTMs are. Third, proteomics is exceptionally well suited to the mapping of protein interactions. This crucial contribution to network biology is not restricted to protein-protein interactions but encompasses interactions of proteins with modified peptides, with small molecules, and with specific RNA and DNA sequences. These three types of information can be woven together and applied in myriad different formats to study biological and medical questions. We first describe the technology of MS-based shotgun proteomics with a particular emphasis on the maturity of the tools.
Then, we discuss the principles of applying proteomics in each of the three areas mentioned above. Next, we review the integration of proteomics data with other large-scale data sets. Modern MS-based proteomics also offers a powerful toolbox, which not only enables large-scale protein-based studies but also accelerates small-scale, directed studies in many areas of biology. In this way, proteomics directly supports the focused, hypothesis-driven research that continues to bring about the majority of discoveries in molecular biology and biomedicine.

PTM: posttranslational modification
Proteome: the total complement of all proteins expressed in a given cellular or tissue state

PEPTIDE AND PROTEIN IDENTIFICATION AND QUANTIFICATION


In proteomics, the object of investigation is nearly always a protein mixture (8-10). These mixtures range in complexity from hundreds of proteins in affinity purifications (because of their inevitable background) to more than 10,000 different proteins in complete mammalian proteomes (see the box "How Many Proteins Are There in the Human Proteome?" for a definition of distinct proteins). The main technological goal of MS-based proteomics is the accurate characterization of as many proteins as possible in these mixtures.


Generic Shotgun Proteomics Workflow


Proteomics can investigate a wide variety of input materials, from prokaryotic or eukaryotic cells through entire tissues and body fluids (Figure 1). The protein complement of these samples is analyzed directly, fractionated, or subjected to affinity-based purification. In each case, proteins are digested to peptides, preferably by a sequence-specific enzyme such as trypsin, because peptides are easier to separate and analyze by LC-MS than proteins and because MS is much more sensitive for low-molecular-weight molecules. [The analysis of entire proteins is called "top-down" proteomics to distinguish it from peptide-based "bottom-up" proteomics (11, 12).] The resulting peptide mixture is separated with a gradient of aqueous/organic solvent in high-performance liquid chromatography (HPLC), lasting from about one to several hours depending on complexity. Both the electrospray (13) and the matrix-assisted laser desorption/ionization methods (14) can convert the peptides to gas-phase ions, a necessary step for their MS analysis, although direct on-line coupling of the effluent of the column by electrospray is most common. Many peptides coelute from the column and reach the mass spectrometer at the same time and in vastly different concentrations. This presents formidable challenges in terms of the required analysis speed and dynamic range, in addition to the challenges in peptide identification and peptide quantification.

HOW MANY PROTEINS ARE THERE IN THE HUMAN PROTEOME?

There is much uncertainty and confusion about the number of proteins in the human proteome, with estimates ranging from 20,000 to millions. This situation is somewhat reminiscent of the sequencing of the human genome, which saw generally accepted gene numbers shrink from more than 100,000 to the current 20,000 (160). Partly, the discussion is about semantics. In a "one gene-one protein" view, the human proteome has just 20,000 proteins, only three times more than yeast. Although simplistic, the one gene-one protein rule provides a good yardstick to judge how proteomics stacks up to genomics and to compare different omics data sets. Furthermore, any specific proteome, such as the proteome of a particular cell line, is of course smaller than the potential proteome because only a subset of proteins is actually expressed. We have identified more than 10,000 proteins in a single human cell line, so 10,000 proteins is a lower limit of the complexity of a human proteome. A pragmatic definition of the proteome, and the one almost universally used in practice, is based on the entries in standard protein sequence databases. If two different forms of a protein are recorded as different entries and MS finds differentiating peptides, then both proteins must have been present in the sample. Widely used databases, such as SwissProt/UniProt, contain alternative splice isoforms as separate entries if there is some evidence of their existence (but not necessarily experimental verification). We suggest defining the human proteome on the basis of what one would reasonably include as different entries in a sequence database. Specifically, one should start from one protein per gene and only add separate protein entries when there is a clear difference in the sequence, preferably with direct experimental and quantitative information (i.e., the modified form should not be an infinitesimal fraction of the main protein product).

Although it is too early to even guess, the proteome defined in this way may be severalfold larger than the number of genes. If one also counts modifications as separate proteins, one quickly arrives at absurd numbers for the human proteome, especially if one simply multiplies the possibilities. If one assumes that each protein has at least 10 different possible modifications (a conservative estimate), then there would be 2^10 = 1,024 different PTM forms of each protein. The numbers become even larger if all possible alternative splice variants, coding single-nucleotide polymorphisms (SNPs), and so on are considered. Owing to in vitro modifications of peptides, the number of isoforms observable by MS-based proteomics would be further increased into the astronomical range [up to 10 different forms of abundant peptides have been observed using the dependent-peptide modification algorithms (24)]. This clearly implies that most modifications must be coregulated or otherwise cannot have independent functions. In our view, only functionally distinct forms of proteins should be counted as different, but they should still not be used to inflate the human proteome.

[Figure 1 diagram not reproduced. Stages shown: proteins (expression: optional organellar enrichment; modifications: modification enrichment; interactions: affinity purification), optional protein fractionation, peptides, optional peptide fractionation, chromatography and mass analyzer, MS and MS/MS spectra (intensity versus m/z).]

Figure 1 Outline of a generic shotgun proteomics workflow. Depending on the study, optional organellar enrichment, enrichment of modifications, or affinity purification is performed up front as indicated in the colored boxes. For example, expression proteomics can be performed on whole cell lysates or purified organelles, peptides bearing posttranslational modifications are often specifically enriched, and interaction partners are affinity purified. Prefractionation at the protein or peptide level can be introduced to increase the coverage and dynamic range. Analysis with chromatography and mass spectrometry (MS) then results in large data sets of MS and tandem mass spectrometry (MS/MS) data containing quantitative peptide information and characteristic fragmentation patterns that are used for determining the sequence. m/z, mass-to-charge ratio.

To reduce complexity, at least one additional step of protein or peptide separation is almost always performed. For proteins, this is typically one-dimensional gel electrophoresis followed by in-gel digestion; for peptides, it is on-line or off-line strong cation exchange. Analysis time in MS-based proteomics for the above workflow is given by the length of the gradient multiplied by the number of fractions to be analyzed and by the number of replicates. For example, a proteome measurement with six peptide fractions and 2-h gradients in triplicate would occupy a mass spectrometer for 36 h, or one and a half days. Contrary to initial expectations, extensive fractionation at the proteome level has turned out to be of little help in increasing the depth of proteome coverage and is in any case usually impractical because it inflates overall measurement time. A more promising strategy is to improve HPLC separation efficiency and to make use of ever-increasing MS performance (15).
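The instrument-time rule stated in the workflow description (gradient length times number of fractions times number of replicates) can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and overheads such as sample loading and column re-equilibration, which the text does not mention, are ignored.

```python
def instrument_hours(gradient_hours: float, n_fractions: int, n_replicates: int) -> float:
    """Total LC-MS acquisition time: gradient length x fractions x replicates.

    Loading, washing, and re-equilibration overheads are not modeled (assumption).
    """
    if gradient_hours <= 0 or n_fractions < 1 or n_replicates < 1:
        raise ValueError("gradient must be positive; fractions and replicates must be >= 1")
    return gradient_hours * n_fractions * n_replicates


# The example from the text: six peptide fractions, 2-h gradients, triplicates.
hours = instrument_hours(2.0, 6, 3)
print(f"{hours:.0f} h = {hours / 24:.1f} days")  # 36 h = 1.5 days
```

Because the three factors multiply, doubling the number of fractions doubles the measurement time, which is why extensive fractionation quickly becomes impractical.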

Peptide Identification
As the peptides elute from the column and are electrosprayed, the mass spectrometer scans the entire mass range every few seconds. These mass spectra can be obtained at high resolution (mass divided by full peak width at half maximum, a dimensionless number), with a value of 60,000 now routine on the linear ion trap Orbitrap analyzers and more than 20,000 on time-of-flight analyzers. This is a huge advantage compared to previous instruments because these analyzers allow the constituents of complex peptide mixtures to be distinguished in the spectra. The acquisition software of the mass spectrometer then selects a preset number of peptides in the mass spectra and proceeds to isolate each one of them, to fragment them in the mass spectrometer, and to measure the mass spectra of the fragments. This is called tandem mass spectrometry (MS/MS). There are many peptide precursor ions in each mass spectrum, and each mass spectrum is therefore typically followed by 5-20 tandem mass spectra.

Linear ion trap: a mass spectrometer in which the ions are captured and manipulated in a linear configuration of quadrupole rods
Orbitrap analyzer: new type of mass spectrometer in which ions circle a central spindle and the mass is obtained via frequency measurement
Stable isotope labeling by amino acids in cell culture (SILAC): a quantitative proteomics technology based on isotope labeling

The peptide mass is used in conjunction with the masses of its fragments for peptide identification. These data are searched against an amino acid sequence database by algorithms that calculate the predicted spectrum for each possible linear peptide sequence and return the one that most likely gave rise to the mass spectrum and tandem mass spectrum data (16). In early proteomics experiments, peptide identification was often not well controlled, and incorrect data were frequently reported. Fortunately, this has now changed, mainly owing to robust techniques for determining false positives (17) and stringent community requirements for peptide identification (18). Likewise, peptide identification rates were often as low as a few percent (19), but today's high-resolution instrumentation combined with advances in computational proteomics routinely allows unambiguous assignment of peptide sequences to more than half of all tandem mass spectra (20).

Peptide search engines such as Mascot (21), Sequest (22), and many others can also be instructed to identify peptides with a few specified modifications, including those present in vivo (e.g., phosphorylations) and those introduced in vitro by sample handling (e.g., methionine oxidation). This still leaves a substantial number of peptides that are altered versions of already identified ones, and these peptides carry any of a vast variety of different and unexpected modifications. Algorithms have been developed that match these "dependent peptides" to their "base peptides" and determine both the mass and the exact location of any possible modification (23-25). Unassigned peptides can, in principle, also be used to find novel genes in the genome or to experimentally prove predicted splice sites. However, so far, much of this work has been done with low-accuracy data, making unambiguous assignment difficult. Some search algorithms are based on extracting partial peptide sequence information that is then used for peptide identification (26). Using a novel hybrid linear ion trap Orbitrap instrument, it has recently become possible to obtain accurate MS/MS information routinely and without loss of sensitivity, which will allow the extraction of extensive sequence information and perhaps even de novo sequencing (27-30). Because of the above developments, the "dark matter" of proteomics (unassigned peptide fragmentation spectra) is being drastically reduced, and often less than 10% of fragmented peptides remain unexplained. Nevertheless, the remaining spectra of high quality that are currently unassigned are sure to help in genome annotation and to lead to interesting discoveries related to unusual gene arrangements.
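The database search principle described in this section (predict a fragment spectrum for every candidate peptide of matching mass and return the best match) can be illustrated with a toy sketch. It is not how Mascot or Sequest score spectra: the shared-peak count, the tolerances, the simplified trypsin rule, and the two protein sequences are all assumptions made for illustration. Only the monoisotopic residue masses are standard values.

```python
from typing import List, Optional, Tuple

# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def peptide_mass(seq: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_mz(seq: str) -> List[float]:
    """Predicted singly charged b- and y-ion m/z values, interleaved (b1, y(n-1), b2, ...)."""
    ions = []
    for i in range(1, len(seq)):
        ions.append(sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON)          # b ion
        ions.append(sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON)  # y ion
    return ions


def tryptic_peptides(protein: str, min_len: int = 6) -> List[str]:
    """Cleave after K or R; missed cleavages and the proline rule are ignored."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return [p for p in peptides if len(p) >= min_len]


def search(precursor_mass: float, observed_mz: List[float], database: List[str],
           precursor_tol_ppm: float = 10.0, fragment_tol: float = 0.02
           ) -> Optional[Tuple[str, int]]:
    """Return (peptide, shared-peak count) for the best-matching candidate, or None."""
    best = None
    for protein in database:
        for pep in tryptic_peptides(protein):
            mass = peptide_mass(pep)
            if abs(mass - precursor_mass) / mass * 1e6 > precursor_tol_ppm:
                continue  # precursor mass filter
            score = sum(
                any(abs(pred - obs) <= fragment_tol for obs in observed_mz)
                for pred in fragment_mz(pep)
            )
            if best is None or score > best[1]:
                best = (pep, score)
    return best


# Made-up database; the second entry contains an isobaric peptide (ATYIAK).
database = ["MKTAYIAKQRELVISLIVESGAR", "MRATYIAKDLPEGR"]
observed = fragment_mz("TAYIAK")[0::2]  # simulate a spectrum with b ions only
hit = search(peptide_mass("TAYIAK"), observed, database)
print(hit)
```

In this toy case both TAYIAK and ATYIAK pass the precursor filter because they have the same composition; only the fragment masses separate them, which is the role the tandem mass spectrum plays in the search.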
Figure 2a summarizes data from a large-scale proteomics experiment in which about 1.8 million peptides were detected [as judged by the presence of stable isotope labeling by amino acids in cell culture (SILAC) pairs]; about half of those selected for fragmentation were successfully identified.
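The counts shown in Figure 2a can be turned into the rates quoted in its caption. The sketch below uses only the five numbers printed in the figure; the variable and key names are hypothetical.

```python
# Counts as printed in Figure 2a of the yeast study (38).
FUNNEL = {
    "detected_peptides": 1_788_451,    # SILAC pairs detected
    "sequenced_peptides": 795_186,     # pairs with a fragmentation spectrum
    "identified_peptides": 392_794,    # fragmentation spectra assigned to a sequence
    "peptide_sequences": 42_572,       # distinct sequences after removing redundancy
    "proteins": 4_074,
}


def ratio(numerator: str, denominator: str) -> float:
    """Quotient of two funnel stages."""
    return FUNNEL[numerator] / FUNNEL[denominator]


fragmented_fraction = ratio("sequenced_peptides", "detected_peptides")
identification_rate = ratio("identified_peptides", "sequenced_peptides")
overall_fraction = ratio("identified_peptides", "detected_peptides")
sequences_per_protein = ratio("peptide_sequences", "proteins")

print(f"fragmented / detected:        {fragmented_fraction:.1%}")
print(f"identified / fragmented:      {identification_rate:.1%}")
print(f"identified / detected:        {overall_fraction:.1%}")
print(f"peptide sequences per protein: {sequences_per_protein:.1f}")
```

Slightly under half of the detected pairs were fragmented and about half of those were identified, so roughly one in five detected peptides ends up with a sequence assignment.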


Peptide-based proteomics does not directly identify proteins; rather, these need to be reconstructed from the obtained peptide data. This is called the protein inference problem (31). One or two peptides are frequently sufficient for identifying a protein. However, to distinguish between isoforms and to determine all PTMs, it is desirable to fragment as many peptides as possible to maximize sequence coverage. In the example in Figure 2b, an overall sequence coverage of 27% of the proteome was achieved, which is sufficient to resolve most isoforms. Note that many more peptides were present in the mixture than were targeted for fragmentation in the experiment. This suggests that there is an opportunity for much deeper proteome coverage as mass spectrometers become even faster and more sensitive.
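Sequence coverage, as used above, is the fraction of a protein's residues that lie within at least one identified peptide. A minimal sketch follows, with a made-up protein sequence and hypothetical function name; overlapping peptides are counted once.

```python
from typing import List


def sequence_coverage(protein: str, peptides: List[str]) -> float:
    """Fraction of residues in `protein` covered by at least one peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


# Made-up 23-residue protein; the two peptides overlap and cover residues 3-10.
protein = "MKTAYIAKQRELVISLIVESGAR"
coverage = sequence_coverage(protein, ["TAYIAK", "TAYIAKQR"])
print(f"{coverage:.1%}")  # 34.8%
```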

Quantification in Proteomics
Quantification is of central importance in MS-based proteomics and has been reviewed extensively (32-34). Proteomics can determine the absolute amount of each of the proteins in a mixture or their relative change between two or more conditions. Absolute quantification, for instance, can provide the copy number of proteins in a cell, an important input for systems biology modeling. The absolute amount of a protein in a sample is determined by comparing its signal to that of a spiked-in standard peptide (35) or to a protein that is isotope labeled (36, 37), using the methods described below. It can also be estimated in silico from the signals of the peptides identifying a protein, typically within an order of magnitude (38, 39). Usually, it is the change in protein amount or the change in the level of a particular PTM upon a defined perturbation that is of biological interest.

[Figure 2 graphics not reproduced. Panel a values: 1,788,451 peptides; 795,186 sequenced peptides; 392,794 identified peptides; 42,572 peptide sequences; 4,074 proteins. Panel b axes: fractional coverage versus number of peptide identifications, with curves for proteins and sequence coverage.]

Figure 2 (a) Numbers of peptides found in a large-scale yeast study (38). Almost 1.8 million peptides were detected as Lys0-Lys8 stable isotope labeling by amino acids in cell culture (SILAC) pairs by the MaxQuant software. Note that the presence of SILAC pairs distinguishes peptide isotope patterns from chemical noise. A fragmentation spectrum was acquired for only about half of the eluting peptides. Among these, the overall identification rate is 50%. Because peptides were redundantly sequenced in many fractions, the number of different peptide sequences is substantially lower. On average, more than 10 peptides were identified for each of the 4,074 identified proteins. (b) Saturation curves for peptide identification. The fraction of identified proteins (red line) and the sequence coverage (green line) are shown as a function of the number of tandem mass spectra used for identification. Eighty percent of the protein identifications are already made with less than 10% of the tandem mass spectra. The overall sequence coverage reaches 27%.

There are two main approaches to making MS quantitative: stable isotope-based and label-free methods. Isotope-based methods incorporate heavy versions of specific molecules into the peptides, either by chemical derivatization or by metabolic labeling. For example, the amine groups of peptides can be methylated by light (normal isotope) or heavy (¹⁵N, ¹³C, ²H) versions of a chemical reagent (40, 41). After samples are combined and analyzed, peptides appear as pairs with a defined mass difference. Several ratios can be obtained from a single peptide as it elutes from the column, and several peptides identifying the same protein are usually combined. This can yield accurate and robust quantification even without replicate measurements. Popular implementations of chemical derivatization for quantification are tandem mass tags (42) and iTRAQ (43). These methods use isobaric labeling reagents, perform quantification via reporter ions in the tandem mass spectra, and lead to different constraints on quantification accuracy than the MS-based methods (44).

In metabolic labeling, cells are cultured in defined isotope media. For example, in the ¹⁵N method, one cell population is grown in media containing only the normal isotope of nitrogen (¹⁴N), and another cell population is grown in media containing only ¹⁵N-substituted molecules (45-51). Although ¹⁵N is primarily used for microbes or plants, SILAC is the method of choice in mammalian systems (52, 53). In SILAC, arginine and lysine (or other essential amino acids) are provided in light or heavy forms to the two cell populations and are incorporated into each protein after a few cell doublings. [Note that metabolic labeling by heavy amino acids in physiology predates its use in proteomics by more than half a century (54). Even in MS, heavy amino acids have already been used to count the number of leucines in intact proteins (55).] SILAC labeling leads to a defined and easily recognizable mass difference. Metabolic labeling methods have the advantage that samples can be combined directly after cell lysis, leading to very high quantitative accuracy because the variability in sample processing cannot affect measured ratios. The applicability of SILAC has recently been extended to entire organisms and even to human tissue (56, 57).

Spectral counting: a simple quantification method in which the number of peptide fragmentation events is used as a proxy for the protein amount
Label-free quantification: integration of the complete MS signal of each eluting peptide, used to quantify the same peptide in different LC-MS/MS runs
Multiple reaction monitoring (MRM): a triple quadrupole mass spectrometer is set to specifically monitor a few specific transitions from peptide to fragment masses
Quadrupole: the part of a mass spectrometer that acts as a filter by selectively passing molecules of a given mass-to-charge ratio (m/z)
Transcriptome: the total complement of all messages expressed in a given cellular or tissue state

Quantification of proteins without isotope-based methods is less accurate than the above methods. In its simplest form, the number of peptide fragmentation events is taken as an estimate of the amount of proteins (58, 59). This spectral counting technique is an improvement over purely qualitative data, but with the advent of high-resolution MS, it is much more accurate to integrate the actual peptide signals. In label-free quantification, the signals of the same peptides in different experiments are compared to each other. Combined with sophisticated algorithms, label-free quantification is now robust, and it can readily be used in situations where relatively large ratios (greater than fourfold) are expected (see the section Interaction Proteomics, below).
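For the isotope-pair quantification described in this section, two small calculations recur: locating the heavy partner of a light peptide signal, and combining several peptide ratios into one protein ratio. The sketch below assumes a Lys8 label (¹³C₆¹⁵N₂-lysine, +8.0142 Da per lysine, consistent with the Lys0-Lys8 pairs mentioned for Figure 2) and combines ratios by the median of log ratios. The text does not say how ratios are combined, so the median is an assumption, as are the function names and intensities.

```python
from math import log2
from statistics import median
from typing import List, Tuple

# Mass shift of 13C6 15N2 lysine relative to light lysine, in Da.
LYS8_SHIFT = 8.014199


def heavy_mz(light_mz: float, charge: int, n_lys: int = 1, shift: float = LYS8_SHIFT) -> float:
    """m/z at which the heavy SILAC partner of a light peptide is expected."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return light_mz + n_lys * shift / charge


def protein_ratio(pairs: List[Tuple[float, float]]) -> float:
    """Heavy/light protein ratio from (light, heavy) peptide intensity pairs.

    Uses the median of log2 ratios so that a single outlier peptide has
    little influence (a design assumption, not taken from the text).
    """
    log_ratios = [log2(heavy / light) for light, heavy in pairs if light > 0 and heavy > 0]
    if not log_ratios:
        raise ValueError("no usable peptide pairs")
    return 2 ** median(log_ratios)


# A doubly charged light peptide at m/z 500.0 with one lysine:
print(f"{heavy_mz(500.0, charge=2):.4f}")  # 504.0071

# Three illustrative peptide pairs from the same protein:
pairs = [(100.0, 200.0), (50.0, 110.0), (80.0, 150.0)]
print(f"{protein_ratio(pairs):.2f}")  # 2.00
```

The mass spacing shrinks with charge state (8.0142/z in m/z units), which is one reason high resolution helps in recognizing the pairs.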

Targeted Analysis
Once proteins of interest have been discovered by the above technologies, one may wish to measure their behavior in a large set of different conditions. It would then be attractive to focus the MS measurement only on specific peptides related to these proteins. For such purposes, the acquisition software of the mass spectrometer can be directed to acquire suitable mass spectrometric information specifically for these peptides. They can be prioritized for fragmentation (60), or narrow mass range spectra with a high dynamic range can be acquired (selected ion monitoring scans). The usual implementation of peptide targeting, however, is in the form of single or multiple reaction monitoring (MRM). In MRM, a special type of mass spectrometer, a triple quadrupole instrument, is used to selectively record fragmentation events that are predetermined and specific for the peptides of interest (61-66). The advantages of targeted approaches are potentially higher sensitivity and higher throughput, which would make them attractive in many systems biological applications that require the repeated measurement of a predefined set of proteins. Recently, the Aebersold group (67) has extended the concept of MRM targeting more generally to building a map of the cell by defining key peptides to measure in each experiment. Such peptides do not necessarily have to have been observed before. Instead, their MRM transitions can be determined from synthetic peptides, and their presence or absence can be ascertained blindly in any mixture (67). Key challenges for MRM-based targeted proteomics at present are robust statistics for false-positive signals, increasing throughput, and the synthesis and handling of the large numbers of isotope-labeled peptides that are necessary for accurate quantification in this approach.

EXPRESSION PROTEOMICS
In an expression proteomics experiment, the absolute or relative quantities of the proteins in a mixture are determined. Thus, expression proteomics is similar in concept to the more familiar measurement of the transcriptome by microarrays. However, proteomics has the principal advantage that it measures the end product of the gene expression cascade, the mature protein, which is more closely related to biological function than the message levels are. Furthermore, unlike transcriptomics, expression proteomics can determine the levels of gene products in subcellular compartments and in organelles.
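For the MRM measurements described under Targeted Analysis, each monitored "transition" is a pair of m/z values: the peptide precursor selected in the first quadrupole and one of its fragments selected in the third. The sketch below computes such pairs for an unmodified peptide. Choosing the largest singly charged y ions is a simple heuristic assumed here for illustration, not a rule from the text; in practice transitions are chosen empirically, for example from synthetic peptides as described above. The peptide sequence is made up.

```python
from dataclasses import dataclass
from typing import List

# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


@dataclass(frozen=True)
class Transition:
    peptide: str
    precursor_mz: float   # selected in the first quadrupole
    fragment: str         # ion label, e.g. "y5"
    fragment_mz: float    # selected in the third quadrupole


def precursor_mz(seq: str, charge: int) -> float:
    """m/z of the protonated, unmodified peptide at the given charge."""
    neutral = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(seq: str, n: int) -> float:
    """m/z of the singly charged y ion containing the last n residues."""
    return sum(RESIDUE_MASS[aa] for aa in seq[-n:]) + WATER + PROTON


def transitions(seq: str, charge: int = 2, n_fragments: int = 3) -> List[Transition]:
    """Transitions for the n largest y ions (heuristic choice, see lead-in)."""
    q1 = precursor_mz(seq, charge)
    y_ions = [(n, y_ion_mz(seq, n)) for n in range(1, len(seq))]
    chosen = sorted(y_ions, key=lambda item: item[1], reverse=True)[:n_fragments]
    return [Transition(seq, q1, f"y{n}", mz) for n, mz in chosen]


for t in transitions("TAYIAK"):
    print(f"{t.peptide}  {t.precursor_mz:.4f} -> {t.fragment} {t.fragment_mz:.4f}")
```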

Comprehensive Proteome Quantification


Despite its obvious attractions, expression proteomics has historically been held back by technological difficulties. For example, expression proteomics was the explicit goal of two-dimensional gel electrophoresis, but despite the presence of up to thousands of spots, in practice, the identities of at most a few hundred proteins of high abundance could be determined. Even a few years ago, it was not clear if entire proteomes could ever be quantified. This situation has changed dramatically owing to the introduction of shotgun proteomics, and the first complete proteome quantification in the yeast model organism has recently been reported (38). The completeness of a proteome has to be assessed indirectly because not every gene is expressed as a protein in every cellular state. Comparisons to genome-wide tagging experiments (68, 69) showed that the MS data were essentially complete and, in particular, that MS can identify at least as many proteins in every abundance range of the proteome (Figure 3a). The MS data also provided information on proteins that are difficult to tag, for example, proteins involved in membrane fusion. Importantly, open reading frames that are known not to be transcribed (so-called dubious open reading frames) were indeed not identified. Haploid and diploid yeast were compared using SILAC; this comparison showed that all the top differentially regulated proteins belong to the mating pathway and revealed several other interesting differences between these cell types (Figure 3b,c). This study also showed that extensive fractionation has diminishing returns (as mentioned above), whereas a straightforward single peptide fractionation step combined with high-resolution LC-MS/MS quantified nearly the entire yeast proteome in a relatively rapid manner. (This was strategy B in Reference 38, in which 24 pI fractions were analyzed in triplicate runs starting with 300 µg of material.) The yeast model system was also used to evaluate an MRM-based targeted proteomics approach (70). The copy numbers of several proteins were measured by adding labeled, synthetic peptides (including three at less than 100 copies per cell). That study showed that single-copy-number proteins should be detectable in this system.
Highlighting a strength of the targeted approach, the level of the entire glycolytic pathway was monitored at several time points during nutrient switch in a label-free MRM approach (Figure 3d). A human cell type is commonly thought to express more than 10,000 proteins, which

would only be two- to threefold more complex than that of yeast. Currently, about 7,000 proteins can be measured in mammalian systems with a reasonable measurement time of 48 h (71), and just a few thousand cells obtained from tissue material can be sufcient for analysis; for examples, see References 72 and 73. Given the ongoing technological developments in all areas of proteomics, comprehensive expression analysis of mammalian proteomes appears to be within reach.
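The copy-number measurements with labeled synthetic peptides mentioned above come down to simple arithmetic once the ratio of endogenous to spiked peptide is known. The following Python sketch illustrates this under stated assumptions: the function name, the spiked amount, and the cell count are hypothetical and are not taken from Reference 70, and peptide losses during sample preparation are ignored.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def copies_per_cell(light_to_heavy_ratio, spiked_fmol, n_cells):
    """Estimate protein copies per cell from a spiked, isotope-labeled peptide.

    light_to_heavy_ratio: endogenous (light) over spiked (heavy) peptide signal
    spiked_fmol: amount of heavy peptide added to the lysate, in femtomoles
    n_cells: number of cells that went into the lysate
    """
    endogenous_mol = light_to_heavy_ratio * spiked_fmol * 1e-15
    return endogenous_mol * AVOGADRO / n_cells

# Hypothetical numbers: 10 fmol of standard, ratio 0.5, lysate from 1e8 cells
estimate = copies_per_cell(0.5, 10.0, 1e8)
print(f"{estimate:.1f} copies per cell")
```

With these illustrative inputs the estimate is roughly 30 copies per cell, which shows why femtomole-level standards and large cell numbers are needed to reach the low-copy range.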

LC MS/MS: liquid chromatography coupled to tandem mass spectrometry


Compartment, Organellar, and Temporal Proteomics


Figure 3 (a) Identified yeast proteins per copy number for shotgun proteomics and two tagging approaches. MS, mass spectrometry; GFP, green fluorescent protein; TAP, tandem affinity purification. (b) Haploid to diploid ratios for the yeast proteome. Proteins are color coded according to their haploid/diploid expression ratio as in the color legend in panel c. (c) Pheromone response pathway with fold change color coded for each of its members according to their haploid/diploid expression ratio (see color legend). Haploid/diploid ratios are shown as red numbers. MAPK, mitogen-activated protein kinase. (d) Abundances of proteins detected by targeted proteomics using MRM assays according to Reference 70. Colored labels indicate the copy numbers per yeast cell of the proteins that they point to. Proteins for which the absolute abundance was measured are indicated on top of the graph (open circles). Panels a, b and c reproduced from Reference 38 with permission from Nature Publishing Group. Panel d reprinted from Cell (70) with permission from Elsevier.

Proteomics is uniquely suited to characterize the protein constituents of subcellular structures (74–76). However, purely qualitative studies risk adding many copurifying background proteins to the organellar proteome and have now largely been superseded by the quantitative approaches discussed above. The basic principle is to quantify an enriched fraction of the subproteome of interest against a nonenriched fraction. Although modern MS will usually detect the proteins in both fractions, the quantitative differences effectively discriminate between true members of the organelle and background proteins. In protein correlation profiling, the quantitative abundance profile for all proteins is measured over density centrifugation fractions. Proteins belonging to the structure of interest have a closely related pattern, distinguishing them from all other proteins (77, 78). Many cellular compartments have been characterized by these or similar technologies (79, 80), including nuclear bodies and other nonmembrane bound structures. A majority of proteins do not have a single cellular address but occur in several compartments (81). Proteomics can quantify these proportions and, in principle, establish localization indices for each protein (82, 83). Organellar proteomics, based on quantitative MS, is now robust and well established and complements traditional microscopy-based approaches in cell biology. Together, these technologies are likely to provide data on subcellular localization for essentially the entire proteome over the next few years. Furthermore, these studies can all be performed in a dynamic or time-resolved manner (spatiotemporal proteomics). Technically, this is often achieved by employing several different isotope labels to compare more than two proteomic states at the same time (84, 85).
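The idea behind protein correlation profiling described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of References 77 and 78: the peak normalization, the squared-distance measure, the cutoff, and all protein names and intensities are assumptions chosen for the example, and every profile is assumed to cover the same centrifugation fractions.

```python
def normalize(profile):
    """Scale a fractionation profile so that its peak fraction equals 1."""
    peak = max(profile)
    if peak <= 0:
        raise ValueError("profile needs at least one positive intensity")
    return [value / peak for value in profile]

def profile_distance(profile_a, profile_b):
    """Sum of squared differences between two peak-normalized profiles."""
    a, b = normalize(profile_a), normalize(profile_b)
    return sum((x - y) ** 2 for x, y in zip(a, b))

def members_like_marker(profiles, marker, max_distance):
    """Names of proteins whose profile lies within max_distance of the marker."""
    reference = profiles[marker]
    return sorted(
        name
        for name, profile in profiles.items()
        if profile_distance(profile, reference) <= max_distance
    )

# Hypothetical intensities over five density gradient fractions
profiles = {
    "marker": [1.0, 5.0, 10.0, 4.0, 1.0],      # known organelle protein
    "candidate": [2.0, 11.0, 20.0, 9.0, 2.0],  # same shape, higher abundance
    "background": [10.0, 8.0, 5.0, 2.0, 1.0],  # peaks in a different fraction
}
print(members_like_marker(profiles, "marker", max_distance=0.05))
```

Normalizing each profile before comparison makes the assignment depend on the shape of the distribution across fractions rather than on absolute abundance, which is the property the text relies on to separate true organelle members from copurifying background.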

Proteomes of Clinical Interest

Efforts to analyze the proteome of clinical samples, such as body fluids or tumor biopsies, date back many years (86). However, the high expectations for proteomics in biomarker discovery have not been fulfilled so far. This is mainly owing to the formidable analytical challenges presented by the large dynamic range of the body fluid proteomes in combination with the premature application of relatively unsophisticated analysis methods. Current approaches for body fluid proteomics often employ targeted techniques, such as MRM (87), which are sometimes used in combination with antibody enrichment, a technique termed stable isotope standards with capture by antipeptide antibodies (63, 88). Although the jury is still out on the analysis of body fluids by MS-based proteomics, in-depth proteomic analysis of tissue samples is clearly becoming possible. For example, the limitation of SILAC to cells that can be labeled can be circumvented by adding a SILAC-labeled cell line to a tissue of interest as an internal standard (89). Preparation of a five-cell super-SILAC mix from breast cancer cell lines of different tumor grades has recently allowed very accurate quantification of 4,500 proteins in breast cancer samples with relatively short measurement times (57). Such an approach is sufficiently streamlined that application to a large number of biopsies can be contemplated. The results can then be queried for expression patterns of specific biomarker proteins. Alternatively, or additionally, overall protein expression patterns may answer a specific clinical question, such as the risk of recurrence of breast cancer, in a way analogous to what has already been done in microarray-based work (90, 91).

PROTEOMICS OF POSTTRANSLATIONAL MODIFICATIONS

Until very recently, PTMs were characterized only on a single-protein basis and often without determining the actual modification site. When PTMs were localized, antibodies against synthetic peptides bearing these PTMs were raised to study them further. Today, MS is the tool of choice for PTM analysis because it can discover them without previous assumptions, it can locate the PTM with single-amino acid resolution, and it can be much more specific and quantitative than antibody-based methods.


Large-Scale Measurement of Posttranslational Modifications by Mass Spectrometry


MS characterization of modified peptides is challenging because the modification may be labile during fragmentation or it may otherwise make the interpretation of tandem mass spectra difficult. For example, labile modifications require dissociation methods that generate sufficient backbone fragment ion information to localize the modification. Some protein modifications of the ubiquitin family type, such as sumoylation, leave large conjugates on the modified peptide that require high-resolution methods for accurate characterization (reviewed in Reference 92). The main challenge in PTM analysis, compared to proteome analysis, is the low abundance of modified peptides. Fortunately, this problem can often be alleviated by modification-specific enrichment. Phosphorylated peptides, for example, can be enriched more than 100-fold by resins that chemically coordinate the phosphogroups (93–95). Pan-specific antibodies against a particular type of modification can also be used, with antiphosphotyrosine antibodies as a prominent example. The same generic shotgun proteomics workflow is applied in PTM-specific analyses too, with the difference that an enrichment step is inserted either at the protein level or more typically at the peptide level (Figure 1).

Identification of PTM-bearing peptides in sequence databases is more difficult than that of unmodified peptides because the search engine needs to consider additional possibilities. When several modifications are allowed simultaneously, this combinatorial explosion of the search space can lead to lower statistical confidence in search results. Furthermore, as mentioned above, it is not sufficient to unambiguously identify the peptide and its modification, but the modification also needs to be localized to a specific amino acid with high probability. Finally, the goal of PTM studies is usually not only to catalog many sites but also to determine how they change during cell signaling or in disease. This necessitates quantitative and possibly time-resolved studies of PTMs (84). Luckily, technological advances have alleviated those difficulties, and MS-based PTM analysis is now quite successful in global PTM mapping. By far the largest quantitative data sets have been produced for phosphorylation (96–103), but lysine acetylation (104, 105) or methylation (106), N-glycosylation (107, 108), ubiquitination (109–111), sumoylation (92), and many others have also been investigated at a very large scale and often in a quantitative manner.
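The combinatorial growth of the search space mentioned above can be made concrete by counting how many differently modified forms a search engine would have to consider for one peptide. The Python sketch below is only an illustration under simplifying assumptions: each residue is targeted by at most one variable modification type, a global cap limits the number of modifications per peptide, and the peptide sequence and modification table are hypothetical rather than taken from any particular search engine.

```python
from math import comb

def count_modified_forms(peptide, variable_mods, max_mods):
    """Count candidate forms of a peptide, the unmodified form included.

    variable_mods maps a modification name to the residues it may occupy.
    Residue sets of different modifications are assumed not to overlap.
    """
    # ways[j] = number of distinct ways to place exactly j modifications
    ways = [1] + [0] * max_mods
    for residues in variable_mods.values():
        n_sites = sum(peptide.count(residue) for residue in residues)
        updated = [0] * (max_mods + 1)
        for used, count in enumerate(ways):
            if count == 0:
                continue
            for extra in range(min(n_sites, max_mods - used) + 1):
                updated[used + extra] += count * comb(n_sites, extra)
        ways = updated
    return sum(ways)

peptide = "SLTYSPMK"  # hypothetical sequence with four S/T/Y sites and one M
print(count_modified_forms(peptide, {"phospho": "STY"}, max_mods=3))
print(count_modified_forms(peptide, {"phospho": "STY", "oxidation": "M"}, max_mods=3))
```

Even for this short peptide, allowing a second modification type raises the number of candidates from 15 to 26, and the same multiplication applies to every peptide in the database, which is why allowing many simultaneous modifications reduces statistical confidence.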

Making Use of Large-Scale Posttranslational Modification Data Sets

What can be learned from these large-scale PTM experiments? In the first instance, they serve as a resource to the community. Provided that the data are highly reliable, researchers can now directly test possible effects of PTMs by mutating the corresponding sites in their proteins of interest. However, it is advisable to primarily focus only on sites obtained with high-resolution methods, on sites that are regulated in a biological process of interest, or on sites for which independent information makes them likely to be functional. Otherwise, the chances of finding phenotypes may be quite low.

The global PTM data can also be directly used to investigate large-scale features of particular PTMs. For example, evolutionary conservation can be studied at the protein or, preferably, at the site level. These types of analyses have revealed, for instance, that the phosphoproteome is generally more conserved than the rest of the proteome, that tyrosine phosphorylation may have an unexpected evolutionary history (112), that N-glycosylation is highly conserved among vertebrates (108), and that the lysine acetylome is conserved even down to prokaryotes (113). PTMs also have definite preferences for localization in the secondary structure of proteins. For instance, phosphorylation occurs preferentially in loops and disordered regions of proteins, whereas N-glycosylation is often found in both loops and β-sheets (108, 114). It has become apparent in recent years that many if not most processes are regulated by an intimate interplay of more than one PTM; prominent examples are the cross talk between ubiquitination and phosphorylation (115) and between different histone modification marks (116, 117). MS-based proteomics is clearly the ideal technology to directly study integration of these pathways by analyzing quantitative changes of several PTMs in the same system.

Implications for Biology and Systems Biology


Although the large-scale analysis of PTMs is still rapidly developing, some themes have already emerged clearly. The first is the sheer number of modifications. There are many more PTMs than thought possible even a few years ago. At the same time, it is interesting that frequently studied proteins, such as the tumor suppressor p53 or the core histones, have over the years been reported to have almost any imaginable modification. This is partly an artifact and a consequence of the large number of research groups studying these particular molecules. Nevertheless, results from MS-based proteomics now suggest that extensive modification is the rule rather than the exception in the entire proteome. This observation raises the question of how many of these modifications have key functions in the cell (118).

A widespread preconception of a typical PTM is that it occurs on a key residue in a key protein involved in the cellular function of interest. An example of this paradigm is a phosphorylation on the activation loop of a kinase that controls the downstream pathway. Clearly, the very large number of PTMs already discovered suggests that such a scenario is unlikely for the majority of the tens of thousands of PTMs that are now in databases. Instead, many PTMs will be coregulated rather than occur independently, with patterns that allow sensitive switches to be constructed, and many will contribute in a small or redundant way toward achieving a cellular response to a perturbation. It follows that, to associate PTMs with cellular functions at a systems scale, it may be more promising to obtain quantitative information on modification changes upon many different perturbations rather than to perform site-specific mutagenesis for each of them. This will eventually allow placing them in modules and pathways and prioritizing them for detailed functional studies. The stoichiometry of thousands of phosphorylation sites during the cell cycle has recently been determined and found to be very high (103). Having such data more generally available should greatly help in deciding if a modification is likely to be functional in the process studied.

The large number of modifications, and especially the fact that they are often independently regulated at different sites (98), also suggests a rethinking of our conceptions of gene expression and biological control. In a textbook view, the cell primarily reacts to perturbation by regulating expression of specific genes. In a protein- and PTM-centered view, the cell may additionally often use proteins as information processing platforms upon which a large number of incoming signals impinge and which then react as analog processors by changing their cellular localization, their activity status, and even their functions.
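To make the notion of site stoichiometry mentioned above concrete, the sketch below shows one way occupancy can be derived from three quantitative ratios between two states. It is an illustration under stated assumptions and not necessarily the procedure of Reference 103: the site is assumed to exist only in a modified and an unmodified form, and the function and parameter names are hypothetical. If M and U are the modified and unmodified amounts in state 1, and x, y, and z are the state 2 over state 1 ratios of the modified peptide, the unmodified peptide, and the protein, then xM + yU = z(M + U), which gives an occupancy in state 1 of (z - y)/(x - y).

```python
def occupancy_from_ratios(x_mod, y_unmod, z_protein):
    """Return the site occupancy in state 1 and state 2.

    x_mod:     state 2 / state 1 ratio of the modified peptide
    y_unmod:   state 2 / state 1 ratio of the unmodified counterpart peptide
    z_protein: state 2 / state 1 ratio of the whole protein
    """
    if x_mod == y_unmod:
        raise ValueError("modified and unmodified ratios must differ")
    if z_protein <= 0:
        raise ValueError("protein ratio must be positive")
    occupancy_1 = (z_protein - y_unmod) / (x_mod - y_unmod)
    # The modified amount scales by x and the total protein by z
    occupancy_2 = occupancy_1 * x_mod / z_protein
    return occupancy_1, occupancy_2

# Hypothetical site: modified peptide doubles, unmodified peptide halves,
# and the protein level drops slightly
state_1, state_2 = occupancy_from_ratios(2.0, 0.5, 0.95)
print(f"occupancy: {state_1:.2f} in state 1, {state_2:.2f} in state 2")
```

The estimate becomes unstable when the modified and unmodified ratios are close to each other, so in practice it depends on accurate quantification of all three ratios.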

INTERACTION PROTEOMICS
Protein interactions are arguably the most useful pieces of data for associating a protein with a function or process. Having such data on a proteome-wide scale should reveal many aspects of cellular organization and function. Establishing networks is therefore a highly prized goal of the postgenomic field, and there are calls for large-scale and global interaction proteome projects. MS-based proteomics is a method of choice for determining protein-protein interactions because it uses proteins in their natural habitat and in their natural modification state. The principle of affinity purification followed by MS (AP-MS) is simply to pull down the bait protein of interest and identify interacting proteins. However, what sounds simple in concept is more difficult in practice because preserving an interaction from the cellular environment through a biochemical procedure involving cell lysis and protein solubilization is conceptually and practically very challenging. Furthermore, high-sensitivity MS can easily identify hundreds of background proteins in any such pull-down as well. This necessitates quantification of proteins in bait pull-downs against a control. When this is done, AP-MS can indeed provide very reliable information on protein interactions in a generic format. Furthermore, specific protein interactions with RNA and DNA as well as with other biomolecules can be determined with very similar methods.
AP-MS: affinity purification followed by mass spectrometric identification or quantification of the bound proteins


Quantitative Affinity Purification Followed by Mass Spectrometry


AP-MS first needs an affinity handle to precipitate the protein of interest and its interaction partners. Ideally, one would like to pull down endogenous proteins using antibodies because this would least disturb the system. However, despite ongoing efforts to produce antibodies against every protein of the proteome (119, 120), it is currently impractical to obtain reagents that cleanly immunoprecipitate each target. A comparable and practical alternative is to tag the gene directly in the genome, thereby ensuring endogenous expression levels. This is routinely performed in budding yeast, where homologous recombination is very efficient, and recently, this has also become possible in mouse and human cells via large-scale bacterial artificial chromosome (BAC) technology (121, 122). Most commonly, however, cDNAs are ectopically expressed; this has the advantages that existing cDNA libraries can be employed and that a minimum expression level of the bait is enforced. However, the disadvantages are that there is no guarantee that the protein product will localize correctly or that it will be processed similarly to its natural counterpart.

AP-MS protocols have advanced tremendously over the past 10 years, evolving from one-dimensional gel-based tandem affinity procedures, with implicit quantification by gel staining, to procedures that use a single tag and a single LC MS/MS run and only a single Petri dish of input material (123). However, the most important shift has been the near universal adoption of quantitative methods to distinguish background binders from specific interaction partners (124–126). In the most basic form, the above-mentioned spectral counting method is used for quantification, but much more accurate results can be obtained with sophisticated label-free algorithms or with isotope-based methods. In our laboratory, we use SILAC-based methods when small differences in binding to bait and control are expected, for example, when comparing dynamic interaction partners recruited to a bait upon stimulus. In most cases, however, ratios should reflect on-off behavior between bait versus control pull-downs, and we apply label-free quantification in the MaxQuant software environment (20). If both pull-downs are performed in triplicate, a standard Student's t-test combined with the fold-change information sensitively discriminates between interaction partners and background binders, as illustrated in Figure 4a. In that example, quantitative BAC interactomics (123), which combines BAC technology with a streamlined and very sensitive MS readout, was used on a member of the anaphase-promoting complex. This experiment revealed not only the known anaphase-promoting complex but also two novel and previously uncharacterized subunits.

In addition to identifying interaction partners, AP-MS can, in principle, also determine the stoichiometry of complexes if labeled standards are added (126). For example, MRM has been applied to determine the stoichiometry of the human spliceosomal hPrp19/CDC5L complex (127). If combined with protein cross-linking, AP-MS can also yield structural constraints in protein complexes. Direct sequencing of cross-linked peptides is challenging but increasingly possible, revealing information on interaction surfaces (128, 129). Finally, AP-MS can profitably be combined with MS of entire complexes or structural studies using electron microscopy-based techniques (130, 131). Thus, data derived from MS-based proteomic experiments are compatible with unrelated approaches, including classical biochemistry and imaging, to determine stoichiometry, three-dimensional structures, or other aspects of protein complexes.

Bacterial artificial chromosomes (BACs): chromosomes that can be used to genetically manipulate mammalian genes of interest, for example, by tagging them

Figure 4 (a) Volcano plot representing results of a label-free pull-down of the anaphase-promoting complex member CDC23 versus the untagged HeLa cell line performed in triplicate (123). Logarithmic ratio of protein intensities is plotted against the negative logarithmic p value of a t-test. The dashed curve indicates the region of significant interactors with a false discovery rate of 1%, and the black dots in the upper right corner are proteins that significantly enrich with CDC23. (b) Heatmap representing complexes resulting from a nested clustering approach for large-scale analysis of affinity purification followed by mass spectrometry data (139). (c) Networked visualization of the complexes involved in panel b. Green and brown nodes are baits and preys, respectively. (d) Stable isotope labeling by amino acids (SILAC)-based DNA-interaction screen with DNA oligonucleotides representing two variants of a single-nucleotide polymorphism (149). A single protein is clearly visible as a differential interaction partner. Forward refers to exposing one allele to the heavy SILAC lysate, and reverse refers to exposing the other allele to heavy SILAC lysate; this is also called a label-swapping experiment. (e) Results of a modification-specific forward-reverse pull-down applied to histones, detecting interactors specific for the trimethylation of histone H3 at lysine 4 (144). ANAPC1–C7, anaphase-promoting complex members C1–C7; BAP18, BPTF-associated protein of 18 kDa; hINO80, human INO80; IGF2 G3072A, G/A polymorphism at position 3072 of the insulin-like growth factor 2 locus; LAP, localization and affinity purification; NFLAP, N-terminal green fluorescent protein LAP tag; PHF8, PHD finger protein 8; P value, probability value; Prefoldin, protein used in protein folding complexes; q3umd3, UniProt protein of unknown function; SAGA, Spt-Ada-Gcn5-acetyltransferase complex; SO, regularization parameter of the t-test; SRCAP/TRRAP complex, homologs of the yeast INO80 and SWR complexes; TFIID, transcription factor IID. Panel a is from Reference 123 and is reproduced with permission from Rockefeller University Press; panels b and c are from Reference 139 and reproduced with permission from Nature Publishing Group and EMBO; panel d is from Reference 149 and reproduced with permission from Cell Press (Elsevier).
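The triplicate bait-versus-control comparison described in this section (a t-test combined with fold change, as in a volcano plot) can be sketched as follows. This is a simplified illustration, not the MaxQuant implementation: it uses fixed cutoffs instead of the false discovery rate controlled curve shown in Figure 4a, the optional s0 term is a generic way of regularizing the statistic against small variances and is an assumption here, and all intensities and cutoff values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def regularized_t(bait, control, s0=0.0):
    """Return (log2 difference, t-like statistic) for two replicate groups.

    bait, control: log2 protein intensities from replicate pull-downs.
    With s0 = 0 this is the pooled-variance Student t statistic.
    """
    n1, n2 = len(bait), len(control)
    pooled_var = ((n1 - 1) * variance(bait) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    std_err = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    diff = mean(bait) - mean(control)
    denominator = std_err + s0
    if denominator == 0:
        # Identical replicates: report an unbounded statistic unless diff is 0
        if diff == 0:
            return diff, 0.0
        return diff, float("inf") if diff > 0 else float("-inf")
    return diff, diff / denominator

def is_interactor(bait, control, min_log2_fc=2.0, min_t=4.604, s0=0.0):
    """Call a protein an interactor if enrichment and statistic both pass.

    4.604 is the two-sided 1% critical value of Student's t with 4 degrees
    of freedom (two triplicates); it only applies when s0 is 0.
    """
    diff, t_value = regularized_t(bait, control, s0)
    return diff >= min_log2_fc and t_value >= min_t

# Hypothetical log2 intensities from triplicate bait and control pull-downs
specific = ([30.0, 30.5, 29.5], [24.0, 24.4, 23.6])
background = ([25.0, 25.3, 24.7], [25.1, 24.8, 25.2])
print(is_interactor(*specific), is_interactor(*background))
```

Requiring both a minimum fold change and a minimum test statistic mirrors the two axes of the volcano plot: a background binder with similar intensity in bait and control fails the enrichment criterion regardless of how reproducible its measurement is.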

Large-Scale Data Sets for Network Biology

The first proteome-wide AP-MS experiments were performed in the yeast model system (132–135). The resulting interactome data have been analyzed extensively by bioinformatic methods, which faced the simultaneous difficulties of filtering out background binders by repeated occurrence (as the data were not quantitative) and of reconstructing meaningful biological networks. A recent study of the yeast kinase and phosphatase interaction network used spectral counting and several different tags for each bait, and the investigators found surprisingly little overlap between these different tags (136). In the mammalian system, specific pathways have been targeted (137), and the deubiquitinases have been analyzed systematically (138). Another study employed biclustering to reveal protein complexes as rectangular structures in a two-dimensional heatmap (139). Very recently, Behrends et al. (140) performed pull-downs of members of the autophagy system. This revealed a number of interactions with proteins involved in lipid phosphorylation as well as 67 possible interaction partners of the human ATG8 orthologs.

Protein Interactions with Posttranslational Modifications, DNA, and Other Biomolecules

The quantitative principles described above can be employed to discover protein interaction partners for any molecule that can be immobilized on beads. For example, the function of some PTMs involves interactions between the linear peptide sequence in which they occur and a domain that binds this modification (141). Immobilized phosphotyrosine-containing peptides or peptides with proline-rich sequences can be used to fish out the specific protein-binding partners containing SH2/PTB domains or SH3 domains, respectively (142). Recently, the main trimethyllysine modification sites of histone N-terminal tails, which have pivotal roles in epigenetics, have been subjected to such an interaction screen. This revealed the identity of novel proteins involved in activation and repression of gene expression (Figure 4e) (143, 144). These studies were followed up by full-length interaction screens with the BAC technology, which assigned the proteins to distinct complexes, and with chromatin immunoprecipitation coupled to deep DNA sequencing, which confirmed binding of the new factors to the corresponding methylation mark.

It is difficult to isolate proteins biochemically via their interaction with particular DNA sequences because many proteins bind nonspecifically and because specific binding usually involves subtle and low-affinity interactions (145). Nevertheless, the quantitative AP-MS methods described above can be employed to efficiently distinguish sequence-specific binding partners from hundreds of background binders. Concatenated DNA oligonucleotides containing the sequence of interest or a control sequence are immobilized on beads and exposed to labeled nuclear extracts (146). This technology recently revealed the identity of the protein mediating the effects of a single quantitative trait locus responsible for enhanced muscle mass and decreased fat tissue in European pigs (147). Using SILAC-based quantification, this domesticated transposable element, which turned out to be the long-sought repressor, was found independently by two groups (Figure 4d) (148, 149). The concept of quantitative AP-MS has also been adapted to identify RNA-binding proteins by using aptamer-immobilized RNAs (150).

Interactions with endogenous or exogenous small molecules can likewise be studied by immobilizing them on beads and incubating them with cell lysate. SILAC-based quantification has been used to find the interactors of chemical libraries of compounds (151), a development that should be of great interest to the pharmaceutical industry. In a similar context, a broad range of kinases can be immobilized on beads bearing ATP analogs; this can be used to enrich and study the collection of all kinases (the kinome) (152) or to define binding constants and binding specificities of the kinome in a global manner (153, 154).
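The label-swapping (forward and reverse) design used in the SILAC-based pull-downs of this section lends itself to a simple decision rule, sketched below in Python. The sketch assumes that the forward experiment exposes variant A to the heavy lysate and the reverse experiment exposes variant B to it; the category names, the cutoff of one log2 unit, and the example ratios are hypothetical and serve only to illustrate the logic.

```python
def classify_label_swap(forward_log2, reverse_log2, cutoff=1.0):
    """Classify a protein from its heavy/light log2 ratios in a label swap.

    forward_log2: ratio when variant A was incubated with the heavy lysate
    reverse_log2: ratio when variant B was incubated with the heavy lysate
    """
    if forward_log2 >= cutoff and reverse_log2 <= -cutoff:
        return "binds_variant_A"
    if forward_log2 <= -cutoff and reverse_log2 >= cutoff:
        return "binds_variant_B"
    if abs(forward_log2) < cutoff and abs(reverse_log2) < cutoff:
        return "background"
    # The ratio does not invert with the labels, for example an unlabeled
    # contaminant that appears light in both experiments
    return "inconsistent"

# Hypothetical measurements: (forward, reverse) log2 ratios per protein
measurements = {
    "protein_1": (3.0, -2.8),
    "protein_2": (-2.5, 2.9),
    "protein_3": (0.1, -0.2),
    "protein_4": (-3.0, -3.2),
}
for name, (forward, reverse) in measurements.items():
    print(name, classify_label_swap(forward, reverse))
```

Requiring the ratio to invert when the labels are swapped is what separates genuine differential binders from proteins whose ratio reflects the labeling itself, and it places specific binders in opposite quadrants of a forward-versus-reverse scatter plot.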

INTEGRATING PROTEOMICS WITH OTHER LARGE-SCALE DATA

As shown above, large-scale, high-accuracy proteomics is now as powerful as other more established omics technologies. Clearly, the largest benefit for a systems view of the cell is gained when combining proteomic data with these other data types. Indeed we argue that proteomics provides an ideal scaffold with which to integrate these multidimensional data because it intrinsically deals with the carrier of biological function.

Integration with Ontologies and Pathways

The analysis of quantitative proteomic data sets has many similarities with the analysis of microarray data. Therefore, computational proteomics can often build on the previous efforts of the microarray community. A common first analysis step is the visualization of the data by clustering algorithms or principal component analysis. Clusters of interest can then be tested for enrichment of pathways, biological processes, or any other genomewide annotation. Conversely, the data can also be mapped onto pathways already known to be involved in the process (see Figure 3c for an example). These analyses are often performed with the statistical tools available in scripting and visualization environments such as R and Bioconductor (155). The publicly available Perseus program developed by our group combines generic quantitative bioinformatics methods with proteomics-specific algorithms in a single environment (http://www.perseus-framework.org). Perseus also allows integration of genomic, transcriptomic, or metabolomic data on the basis of annotated pathway information. As progressively more accurate annotation information accumulates in databases, it will become increasingly feasible to understand and interpret the results of quantitative proteomics experiments in silico.

Physical and Functional Protein-Protein Interactions

Network data obtained by AP-MS can usefully be integrated with other protein-protein interaction data, such as those from the yeast two-hybrid system. AP-MS determines protein complexes, but the yeast two-hybrid system can resolve binary interactions and even contact points between proteins (156). Furthermore, it is possible to overlay the data from functional screens performed by RNAi with protein-protein interaction data, and this already happens in environments using the Search Tool for the Retrieval of Interacting Genes/Proteins (157). This is especially interesting for functional interaction screens, where suppressors or enhancers of a knockout or mutation are scored in a genome-wide format (158). The combination of genetic interaction maps and global AP-MS data resolves interactions into physical or purely functional ones, thereby creating a physical and functional interaction landscape of the proteome.

Correlation of Transcriptome and Proteome


Single-nucleotide polymorphism (SNP): a variation at a single nucleotide position occurring with a certain minimum frequency. SNPs can occur in coding regions or outside of them.
Amplicon: cancer cells often selectively amplify a part of their chromosomes to overexpress certain genes; this region is called an amplicon.

In a cartoon view of gene expression, genes are transcribed, and the message is translated to proteins in a direct one-to-one manner. In reality, however, there are many levels of control between message and proteome. Although it is clear that there must be an overall correlation between the message and protein changes, otherwise there would be no sense in having regulated gene expression at all, the general magnitude of this correlation is unknown. Early comparisons between microarray and proteome data usually reported very low correlation. This was at least partly owing to technical factors because any technological imperfections of the microarray or the proteomics platforms (which were at that time still often based on two-dimensional gels) would reduce the apparent correlation between the transcriptome and the proteome. The correlation between transcript change and protein change is different in varied biological situations. This becomes immediately apparent in the case of body fluids or special cell types, such as red blood cells or platelets, which have proteins but no mRNA. Another case of protein regulation without a corresponding message change is selective degradation by the ubiquitin proteasome system. The recently introduced deep DNA sequencing methods can also be applied to transcriptome measurements. It will therefore be interesting to compare transcriptome changes with proteome changes using highly accurate quantitative platforms as this is bound to shed new light on the differential regulation of gene expression.
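As an illustration of how such a comparison might be quantified, the sketch below computes Pearson and Spearman correlation coefficients between per-gene log2 transcript changes and log2 protein changes. The function names and the example values are hypothetical; they are not data from any study cited here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical log2 fold changes for the same six genes. The last gene
    # mimics regulation by degradation: protein drops, message is unchanged.
    mrna_log2 = [1.8, 0.9, -1.2, 0.1, 2.5, 0.0]
    protein_log2 = [1.1, 0.4, -0.7, 0.2, 1.6, -2.0]
    print("Pearson :", round(pearson(mrna_log2, protein_log2), 3))
    print("Spearman:", round(spearman(mrna_log2, protein_log2), 3))
```

The rank-based coefficient is less sensitive to a few extreme ratios, which is one reason both measures are often reported together.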

Combining Genetics with Proteomics

Advances in DNA sequencing technologies are leading to an explosion of genetic data, including association studies of large cohorts with common diseases or common traits. These studies often lead to single-nucleotide polymorphisms (SNPs) that are statistically linked to the trait, but the mechanism leading to this link may not be clear. Proteomics can help to tie the SNPs to cellular functions by determining the transcription factor that binds to them using the technology described above (Figure 4e). In cancer genomics, large numbers of tumors are being sequenced with next-generation technologies. It is now clear that many of the genome changes are amplifications or deletions of parts of chromosomes. This raises the interesting question of whether all or just some of the genes in the individual amplicons are overexpressed in cancer cells at the protein level. Using SILAC-based quantitative proteomics, this question can be answered (159). Despite an overall weak correlation between gene copy number and protein changes, these correlations are sufficient to correctly predict amplicons in the cancer genomes on the basis of accurately quantified proteome data (Figure 5) (159). Interestingly, changes in gene copy number caused by deletions or amplifications appear to be further amplified at the message level, whereas they can be attenuated at the protein level. Thus, there appears to be an additional level of expression control on many proteins. For example, proteins in stable complexes tend to be insensitive to gene copy numbers. Although still in its infancy, the combination of genetics or genomics with quantitative proteomics clearly has immense potential to unravel the influence of the genotype on the cellular phenotype.

OUTLOOK: TRANSFORMING BIOCHEMISTRY INTO A SYSTEMS SCIENCE VIA PROTEOMICS

One of the current limitations of proteomics is its slower throughput compared to some of the other large-scale technologies. Efforts are currently focusing on reducing measuring time by improved instrumentation, which should obviate the need for measuring many fractions. This will also reduce the needed input material proportionately. Furthermore, there are activities underway to develop simplified sample preparation protocols, more compact mass spectrometers, and increasingly capable software for computational proteomics and downstream analysis. We envision that quantitative proteomics will encompass virtually all expressed proteins with short measurement times and with an effort similar to routine techniques, such as Western blotting. The resulting data will be highly accurate, and instead of quantifying one or a few proteins, a researcher will obtain a systems view of all the changes associated with the cellular perturbation. Because of more complete annotation of proteins, computational proteomics will increasingly be able to deduce changes not only in individual proteins but also in their processes and functions. Thus, simple and fast proteomic experiments will in the future allow practical and data-driven systems biology, which could replace many of today's protein-specific assays and lead to a much more unbiased view of the process under investigation. This vision entails an opportunity and a challenge for systems biological modeling. It is clear that large amounts of accurate, quantitative data will soon be available at the proteomic level. The task ahead will then be to develop models that can deal with thousands of protein changes as well as tens of thousands of changes in PTMs. If successful, this program would help realize much of the promise of systems biology.

[Figure 5 panels: (a) protein level (HCC2218/HMEC) along the concatenated human chromosomes (billion base pairs); (b) genome copy number (HCC2218/HMEC) along the concatenated human chromosomes (billion base pairs); (c) log2 (ratio HCC2218/HMEC) of gene copy number change and protein level change versus position on chromosome 17 (x 10^6), with Grb7, C17orf37, and ErbB2 labeled.]

Figure 5 (a) Genome profiling of proteomic data (159). Regional chromosomal amplification is predicted from the quantitative proteome data. (b) The gene copy numbers corresponding to panel a, measured with comparative genomic hybridization. Colors in (a) and (b) indicate different chromosomes. (c) Zoom-in illustration of the small amplicon surrounding ErbB2. HCC2218, human breast cancer cell line; HMEC, human mammary epithelial cell. The figure is from Reference 159 and reproduced with permission from the Public Library of Science.
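One simple way to look for regional amplification in proteome data of the kind shown in Figure 5a is to smooth the protein ratios along the chromosome and flag stretches that stay elevated. The sketch below does this with a running median. It is an assumption-laden illustration, not the procedure used in Reference 159; the function names, window size, threshold, and example values are all hypothetical.

```python
from statistics import median


def running_median(values, window):
    """Median over a centered window, truncated at the ends of the list."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def candidate_amplified_regions(proteins, window=5, threshold=1.0):
    """Return (start, end) position pairs of putatively amplified regions.

    proteins  -- list of (genomic_position, log2_ratio) on one chromosome
    window    -- number of neighboring genes used for smoothing (assumed)
    threshold -- smoothed log2 ratio required to call a region (assumed)
    """
    ordered = sorted(proteins)
    smoothed = running_median([ratio for _, ratio in ordered], window)
    regions, start, previous_position = [], None, None
    for (position, _), value in zip(ordered, smoothed):
        if value >= threshold and start is None:
            start = position                      # region opens here
        elif value < threshold and start is not None:
            regions.append((start, previous_position))  # region closed
            start = None
        previous_position = position
    if start is not None:
        regions.append((start, previous_position))
    return regions


if __name__ == "__main__":
    # Hypothetical log2 protein ratios for 12 genes ordered by position.
    ratios = [0, 0.1, -0.1, 0, 2, 2.2, 1.9, 2.1, 2, 0, 0.1, -0.2]
    data = list(zip(range(1, 13), ratios))
    print(candidate_amplified_regions(data, window=3, threshold=1.0))
```

Using a median rather than a mean means that a single strongly regulated protein does not create a false region, which matters because many proteins change for reasons unrelated to gene copy number.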

SUMMARY POINTS
1. Mass spectrometry (MS)-based proteomics provides high-resolution data on hundreds of thousands of peptides in a generic workflow. Accurate quantification is possible by isotope-based methods, but label-free quantification methods are increasingly attractive.
2. Expression proteomics is conceptually similar to microarray measurements but directly measures the level of proteins, which are the biologically active molecules. The first complete proteome has been identified and quantified.
3. Posttranslational modifications (PTMs) can be identified and quantified at a large scale. There are thousands of PTMs in the proteome, but only a small fraction of them changes in specific cellular processes.
4. Protein interactions with other proteins can be accurately measured by proteomics strategies that quantify eluates from bait against eluates from control. These techniques can also be applied with DNA, RNA, or small molecules as baits.
5. Proteomics data can efficiently be integrated with other large-scale omics data using bioinformatics tools. The effect of genetic variation on the proteome can now be assessed.
6. Proteomics can provide accurate, quantitative data for a systems biology view of cellular processes.

FUTURE ISSUES
1. It will soon be possible to obtain extensive sequence information on all fragmented peptides. Will this lead to the discovery of new types of PTMs and not yet described genes? And are other surprises waiting in the proteome?
2. Will incremental or revolutionary advances be necessary to measure complete mammalian proteomes with reasonable throughput?
3. When will MS be simple and affordable enough to be as accessible as sophisticated microscopy? What needs to happen to make MS as ubiquitous as the Western blot technique?
4. How can the sensitivity of MS-based proteomics be extended to a small cell number (say 1,000) obtained from in vivo sources?
5. How can MS-based proteomics deal with the enormous dynamic range of body fluid proteomes, especially human plasma?
6. When will we have a first draft of the human interactome by AP-MS?
7. What is the function of the thousands of PTMs already discovered by MS?
8. Is it possible to create models in systems biology that take thousands of accurate protein ratios and PTM changes into account?

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

ACKNOWLEDGMENTS
We thank other members of our group for stimulating discussions and Falk Butter for critical reading of the manuscript. This work was partially supported by the European Commission's Seventh Framework Program PROteomics SPECification in Time and Space (PROSPECTS, HEALTH-F4-2008-021,648).

LITERATURE CITED
1. Wilkins MR, Pasquali C, Appel RD, Ou K, Golaz O, et al. 1996. From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Biotechnology 14:61–65
2. Aebersold R, Mann M. 2003. Mass spectrometry-based proteomics. Nature 422:198–207
3. Domon B, Aebersold R. 2006. Mass spectrometry and protein analysis. Science 312:212–17
4. Cravatt BF, Simon GM, Yates JR 3rd. 2007. The biological impact of mass-spectrometry-based proteomics. Nature 450:991–1000
5. Choudhary C, Mann M. 2010. Decoding signalling networks by mass spectrometry-based proteomics. Nat. Rev. Mol. Cell Biol. 11:427–39
6. Nilsson T, Mann M, Aebersold R, Yates JR 3rd, Bairoch A, Bergeron JJ. 2010. Mass spectrometry in high-throughput proteomics: ready for the big time. Nat. Methods 7:681–85
7. Mallick P, Kuster B. 2010. Proteomics: a pragmatic perspective. Nat. Biotechnol. 28:695–709
8. Link AJ, Eng J, Schieltz DM, Carmack E, Mize GJ, et al. 1999. Direct analysis of protein complexes using mass spectrometry. Nat. Biotechnol. 17:676–82
9. Peng J, Gygi SP. 2001. Proteomics: the move to mixtures. J. Mass Spectrom. 36:1083–91
10. Washburn MP, Wolters D, Yates JR 3rd. 2001. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19:242–47
11. McLafferty FW, Breuker K, Jin M, Han X, Infusini G, et al. 2007. Top-down MS, a powerful complement to the high capabilities of proteolysis proteomics. FEBS J. 274:6256–68
12. Kellie JF, Tran JC, Lee JE, Ahlf DR, Thomas HM, et al. 2010. The emerging process of top down mass spectrometry for protein analysis: biomarkers, protein-therapeutics, and achieving high throughput. Mol. Biosyst. 6:1532–39
13. Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM. 1989. Electrospray ionization for mass spectrometry of large biomolecules. Science 246:64–71

14. Hillenkamp F, Karas M, Beavis RC, Chait BT. 1991. Matrix-assisted laser desorption/ionization mass spectrometry of biopolymers. Anal. Chem. 63:A1193–203
15. Makarov A, Denisov E, Lange O. 2009. Performance evaluation of a high-field Orbitrap mass analyzer. J. Am. Soc. Mass Spectrom. 20:1391–96
16. Sadygov RG, Cociorva D, Yates JR 3rd. 2004. Large-scale database searching using tandem mass spectra: looking up the answer in the back of the book. Nat. Methods 1:195–202
17. Elias JE, Gygi SP. 2007. Target-decoy search strategy for increased confidence in large-scale protein identifications by mass spectrometry. Nat. Methods 4:207–14
18. Bradshaw RA, Burlingame AL, Carr S, Aebersold R. 2006. Reporting protein identification data: the next generation of guidelines. Mol. Cell Proteomics 5:787–88
19. Mallick P, Schirle M, Chen SS, Flory MR, Lee H, et al. 2007. Computational prediction of proteotypic peptides for quantitative proteomics. Nat. Biotechnol. 25:125–31
20. Cox J, Mann M. 2008. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26:1367–72
21. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. 1999. Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20:3551–67
22. Eng JK, McCormack AL, Yates JR. 1994. An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5:976–89
23. Hansen BT, Jones JA, Mason DE, Liebler DC. 2001. SALSA: a pattern recognition algorithm to detect electrophile-adducted peptides by automated evaluation of CID spectra in LC-MS-MS analyses. Anal. Chem. 73:1676–83
24. Savitski MM, Nielsen ML, Zubarev RA. 2006. ModifiComb, a new proteomic tool for mapping substoichiometric post-translational modifications, finding novel types of modifications, and fingerprinting complex protein mixtures. Mol. Cell Proteomics 5:935–48
25. Tanner S, Pevzner PA, Bafna V. 2006. Unrestrictive identification of post-translational modifications through peptide mass spectrometry. Nat. Protoc. 1:67–72
26. Mann M, Wilm M. 1994. Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal. Chem. 66:4390–99
27. Olsen JV, Macek B, Lange O, Makarov A, Horning S, Mann M. 2007. Higher-energy C-trap dissociation for peptide modification analysis. Nat. Methods 4:709–12
28. Olsen JV, Schwartz JC, Griep-Raming J, Nielsen ML, Damoc E, et al. 2009. A dual pressure linear ion trap Orbitrap instrument with very high sequencing speed. Mol. Cell Proteomics 8:2759–69
29. Frank AM, Savitski MM, Nielsen ML, Zubarev RA, Pevzner PA. 2007. De novo peptide sequencing and identification with precision mass spectrometry. J. Proteome Res. 6:114–23
30. Pan C, Park BH, McDonald WH, Carey PA, Banfield JF, et al. 2010. A high-throughput de novo sequencing approach for shotgun proteomics using high-resolution tandem mass spectrometry. BMC Bioinformatics 11:118
31. Nesvizhskii AI, Aebersold R. 2005. Interpretation of shotgun proteomic data: the protein inference problem. Mol. Cell Proteomics 4:1419–40
32. Ong SE, Mann M. 2005. Mass spectrometry-based proteomics turns quantitative. Nat. Chem. Biol. 1:252–62
33. Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. 2007. Quantitative mass spectrometry in proteomics: a critical review. Anal. Bioanal. Chem. 389:1017–31
34. Vermeulen M, Selbach M. 2009. Quantitative proteomics: a tool to assess cell differentiation. Curr. Opin. Cell Biol. 21:761–66
35. Steen H, Jebanathirajah JA, Springer M, Kirschner MW. 2005. Stable isotope-free relative and absolute quantitation of protein phosphorylation stoichiometry by MS. Proc. Natl. Acad. Sci. USA 102:3948–53
36. Hanke S, Besir H, Oesterhelt D, Mann M. 2008. Absolute SILAC for accurate quantitation of proteins in complex mixtures down to the attomole level. J. Proteome Res. 7:1118–30
37. Singh S, Springer M, Steen J, Kirschner MW, Steen H. 2009. FLEXIQuant: a novel tool for the absolute quantification of proteins, and the simultaneous identification and quantification of potentially modified peptides. J. Proteome Res. 8:2201–10

38. de Godoy LM, Olsen JV, Cox J, Nielsen ML, Hubner NC, et al. 2008. Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast. Nature 455:1251–54
39. Malmstrom J, Beck M, Schmidt A, Lange V, Deutsch EW, Aebersold R. 2009. Proteome-wide cellular protein concentrations of the human pathogen Leptospira interrogans. Nature 460:762–65
40. Hsu JL, Huang SY, Chow NH, Chen SH. 2003. Stable-isotope dimethyl labeling for quantitative proteomics. Anal. Chem. 75:6843–52
41. Boersema PJ, Raijmakers R, Lemeer S, Mohammed S, Heck AJ. 2009. Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics. Nat. Protoc. 4:484–94
42. Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, et al. 2003. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal. Chem. 75:1895–904
43. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, et al. 2004. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics 3:1154–69
44. Zhang Y, Askenazi M, Jiang J, Luckey CJ, Griffin JD, Marto JA. 2009. A robust error model for iTRAQ quantification reveals divergent signaling between oncogenic FLT3 mutants in acute myeloid leukemia. Mol. Cell Proteomics 9:780–90
45. Oda Y, Huang K, Cross FR, Cowburn D, Chait BT. 1999. Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl. Acad. Sci. USA 96:6591–96
46. Conrads TP, Alving K, Veenstra TD, Belov ME, Anderson GA, et al. 2001. Quantitative analysis of bacterial and mammalian proteomes using a combination of cysteine affinity tags and 15N-metabolic labeling. Anal. Chem. 73:2132–39
47. Washburn MP, Ulaszek R, Deciu C, Schieltz DM, Yates JR 3rd. 2002. Analysis of quantitative proteomic data generated via multidimensional protein identification technology. Anal. Chem. 74:1650–57
48. Krijgsveld J, Ketting RF, Mahmoudi T, Johansen J, Artal-Sanz M, et al. 2003. Metabolic labeling of C. elegans and D. melanogaster for quantitative proteomics. Nat. Biotechnol. 21:927–31
49. Wu CC, MacCoss MJ, Howell KE, Matthews DE, Yates JR 3rd. 2004. Metabolic labeling of mammalian organisms with stable isotopes for quantitative proteomic analysis. Anal. Chem. 76:4951–59
50. Kolkman A, Daran-Lapujade P, Fullaondo A, Olsthoorn MM, Pronk JT, et al. 2006. Proteome analysis of yeast response to various nutrient limitations. Mol. Syst. Biol. 2:2006.0026
51. Gouw JW, Tops BB, Mortensen P, Heck AJ, Krijgsveld J. 2008. Optimizing identification and quantitation of 15N-labeled proteins in comparative proteomics. Anal. Chem. 80:7796–803
52. Mann M. 2006. Functional and quantitative proteomics using SILAC. Nat. Rev. Mol. Cell Biol. 7:952–58
53. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, et al. 2002. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell Proteomics 1:376–86
54. Schoenheimer R, Rittenberg D. 1935. Deuterium as an indicator in the study of intermediary metabolism. Science 82:145–57
55. Veenstra TD, Martinovic S, Anderson GA, Pasa-Tolic L, Smith RD. 2000. Proteome analysis using selective incorporation of isotopically labeled amino acids. J. Am. Soc. Mass Spectrom. 11:78–82
56. Kruger M, Moser M, Ussar S, Thievessen I, Luber CA, et al. 2008. SILAC mouse for quantitative proteomics uncovers Kindlin-3 as an essential factor for red blood cell function. Cell 134:353–64
57. Geiger T, Juergen C, Pawel O, Wisniewski JR, Mann M. 2010. Super-SILAC mix for quantitative proteomics of human tumor tissue. Nat. Methods 7:383–85
58. Liu B, Lin Y, Darwanto A, Song X, Xu G, Zhang K. 2009. Identification and characterization of propionylation at histone H3 lysine 23 in mammalian cells. J. Biol. Chem. 284:32288–95
59. Liu H, Sadygov RG, Yates JR 3rd. 2004. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76:4193–201
60. Schmidt A, Claassen M, Aebersold R. 2009. Directed mass spectrometry: towards hypothesis-driven proteomics. Curr. Opin. Chem. Biol. 13:510–17
61. Barnidge DR, Goodmanson MK, Klee GG, Muddiman DC. 2004. Absolute quantification of the model biomarker prostate-specific antigen in serum by LC-MS/MS using protein cleavage and isotope dilution mass spectrometry. J. Proteome Res. 3:644–52

62. Kuhn E, Wu J, Karl J, Liao H, Zolg W, Guild B. 2004. Quantification of C-reactive protein in the serum of patients with rheumatoid arthritis using multiple reaction monitoring mass spectrometry and 13C-labeled peptide standards. Proteomics 4:1175–86
63. Anderson L, Hunter CL. 2006. Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol. Cell Proteomics 5:573–88
64. Wolf-Yadlin A, Hautaniemi S, Lauffenburger DA, White FM. 2007. Multiple reaction monitoring for robust quantitative proteomic analysis of cellular signaling networks. Proc. Natl. Acad. Sci. USA 104:5860–65
65. Kitteringham NR, Jenkins RE, Lane CS, Elliott VL, Park BK. 2009. Multiple reaction monitoring for quantitative biomarker analysis in proteomics and metabolomics. J. Chromatogr. B 877:1229–39
66. Keshishian H, Addona T, Burgess M, Mani DR, Shi X, et al. 2009. Quantification of cardiovascular biomarkers in patient plasma by targeted mass spectrometry and stable isotope dilution. Mol. Cell Proteomics 8:2339–49
67. Ahrens CH, Brunner E, Qeli E, Basler K, Aebersold R. 2010. Generating and navigating proteome maps using mass spectrometry. Nat. Rev. Mol. Cell Biol. 11:789–801
68. Huh WK, Falvo JV, Gerke LC, Carroll AS, Howson RW, et al. 2003. Global analysis of protein localization in budding yeast. Nature 425:686–91
69. Ghaemmaghami S, Huh WK, Bower K, Howson RW, Belle A, et al. 2003. Global analysis of protein expression in yeast. Nature 425:737–41
70. Picotti P, Bodenmiller B, Mueller LN, Domon B, Aebersold R. 2009. Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell 138:795–806
71. Wisniewski JR, Zougman A, Nagaraj N, Mann M. 2009. Universal sample preparation method for proteome analysis. Nat. Methods 6:359–62
72. Cha S, Imielinski MB, Rejtar T, Richardson EA, Thakur D, et al. 2010. In situ proteomic analysis of human breast cancer epithelial cells using laser capture microdissection (LCM)-LC/MS: annotation by protein set enrichment analysis (PSEA) and gene ontology (GO). Mol. Cell Proteomics 9:2529–44
73. Waanders LF, Chwalek K, Monetti M, Kumar C, Lammert E, Mann M. 2009. Quantitative proteomic analysis of single pancreatic islets. Proc. Natl. Acad. Sci. USA 106:18902–7
74. Bergeron JJ, Au CE, Desjardins M, McPherson PS, Nilsson T. 2010. Cell biology through proteomics ad astra per alia porci. Trends Cell Biol. 20:337–45
75. Yates JR 3rd, Gilchrist A, Howell KE, Bergeron JJ. 2005. Proteomics of organelles and large cellular structures. Nat. Rev. Mol. Cell Biol. 6:702–14
76. Walther TC, Mann M. 2010. Mass spectrometry-based proteomics in cell biology. J. Cell Biol. 190:491–500
77. Andersen JS, Wilkinson CJ, Mayor T, Mortensen P, Nigg EA, Mann M. 2003. Proteomic characterization of the human centrosome by protein correlation profiling. Nature 426:570–74
78. Foster LJ, de Hoog CL, Zhang Y, Xie X, Mootha VK, Mann M. 2006. A mammalian organelle map by protein correlation profiling. Cell 125:187–99
79. Dunkley TP, Watson R, Griffin JL, Dupree P, Lilley KS. 2004. Localization of organelle proteins by isotope tagging (LOPIT). Mol. Cell Proteomics 3:1128–34
80. Tan DJ, Dvinge H, Christoforou A, Bertone P, Martinez Arias A, Lilley KS. 2009. Mapping organelle proteins and protein complexes in Drosophila melanogaster. J. Proteome Res. 8:2667–78
81. Trinkle-Mulcahy L, Lamond AI. 2007. Toward a high-resolution view of nuclear dynamics. Science 318:1402–7
82. Boisvert FM, Lam YW, Lamont D, Lamond AI. 2010. A quantitative proteomics analysis of subcellular proteome localization and changes induced by DNA damage. Mol. Cell Proteomics 9:457–70
83. Lam YW, Evans VC, Heesom KJ, Lamond AI, Matthews DA. 2010. Proteomics analysis of the nucleolus in adenovirus-infected cells. Mol. Cell Proteomics 9:117–30
84. Blagoev B, Ong SE, Kratchmarova I, Mann M. 2004. Temporal analysis of phosphotyrosine-dependent signaling networks by quantitative proteomics. Nat. Biotechnol. 22:1139–45
85. Andersen JS, Lam YW, Leung AK, Ong SE, Lyon CE, et al. 2005. Nucleolar proteome dynamics. Nature 433:77–83

86. Hanash S, Taguchi A. 2010. The grand challenge to decipher the cancer proteome. Nat. Rev. Cancer 10:652–60
87. Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, et al. 2009. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat. Biotechnol. 27:633–41
88. Anderson NL, Anderson NG, Haines LR, Hardie DB, Olafson RW, Pearson TW. 2004. Mass spectrometric quantitation of peptides and proteins using stable isotope standards and capture by anti-peptide antibodies (SISCAPA). J. Proteome Res. 3:235–44
89. Ishihama Y, Sato T, Tabata T, Miyamoto N, Sagane K, et al. 2005. Quantitative mouse brain proteomics using culture-derived isotope tags as internal standards. Nat. Biotechnol. 23:617–21
90. van de Vijver MJ, He YD, van't Veer LJ, Dai H, Hart AA, et al. 2002. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347:1999–2009
91. van't Veer LJ, Dai H, van de Vijver MJ, He YD, Hart AA, et al. 2002. Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530–36
92. Andersen JS, Matic I, Vertegaal AC. 2009. Identification of SUMO target proteins by quantitative proteomics. Methods Mol. Biol. 497:19–31
93. Pinkse MW, Uitto PM, Hilhorst MJ, Ooms B, Heck AJ. 2004. Selective isolation at the femtomole level of phosphopeptides from proteolytic digests using 2D-NanoLC-ESI-MS/MS and titanium oxide precolumns. Anal. Chem. 76:3935–43
94. Larsen MR, Thingholm TE, Jensen ON, Roepstorff P, Jorgensen TJ. 2005. Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol. Cell Proteomics 4:873–86
95. Zhao Y, Jensen ON. 2009. Modification-specific proteomics: strategies for characterization of posttranslational modifications using enrichment techniques. Proteomics 9:4632–41
96. Ficarro SB, McCleland ML, Stukenberg PT, Burke DJ, Ross MM, et al. 2002. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20:301–5
97. Beausoleil SA, Jedrychowski M, Schwartz D, Elias JE, Villen J, et al. 2004. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. USA 101:12130–35
98. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, et al. 2006. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127:635–48
99. Bodenmiller B, Mueller LN, Mueller M, Domon B, Aebersold R. 2007. Reproducible isolation of distinct, overlapping segments of the phosphoproteome. Nat. Methods 4:231–37
100. Dephoure N, Zhou C, Villen J, Beausoleil SA, Bakalarski CE, et al. 2008. A quantitative atlas of mitotic phosphorylation. Proc. Natl. Acad. Sci. USA 105:10762–67
101. Guo A, Villen J, Kornhauser J, Lee KA, Stokes MP, et al. 2008. Signaling networks assembled by oncogenic EGFR and c-Met. Proc. Natl. Acad. Sci. USA 105:692–97
102. Choudhary C, Olsen JV, Brandts C, Cox J, Reddy PN, et al. 2009. Mislocalized activation of oncogenic RTKs switches downstream signaling outcomes. Mol. Cell 36:326–39
103. Olsen JV, Vermeulen M, Santamaria A, Kumar C, Miller ML, et al. 2010. Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Sci. Signal. 3:ra3
104. Kim SC, Sprung R, Chen Y, Xu Y, Ball H, et al. 2006. Substrate and functional diversity of lysine acetylation revealed by a proteomics survey. Mol. Cell 23:607–18
105. Choudhary C, Kumar C, Gnad F, Nielsen ML, Rehman M, et al. 2009. Lysine acetylation targets protein complexes and co-regulates major cellular functions. Science 325:834–40
106. Ong SE, Mittler G, Mann M. 2004. Identifying and quantifying in vivo methylation sites by heavy methyl SILAC. Nat. Methods 1:119–26
107. Kaji H, Kamiie J, Kawakami H, Kido K, Yamauchi Y, et al. 2007. Proteomics reveals N-linked glycoprotein diversity in Caenorhabditis elegans and suggests an atypical translocation mechanism for integral membrane proteins. Mol. Cell Proteomics 6:2100–9
108. Zielinska DF, Gnad F, Wisniewski JR, Mann M. 2010. Precision mapping of an in vivo N-glycoproteome reveals rigid topological and sequence constraints. Cell 141:897–907
109. Peng J, Schwartz D, Elias JE, Thoreen CC, Cheng D, et al. 2003. A proteomics approach to understanding protein ubiquitination. Nat. Biotechnol. 21:921–26

110. Xu G, Paige JS, Jaffrey SR. 2010. Global analysis of lysine ubiquitination by ubiquitin remnant immunoaffinity profiling. Nat. Biotechnol. 28:868–73
111. Danielsen JMR, Sylvestersen KB, Bekker-Jensen S, Szklarczyk D, Poulsen JW, et al. 2011. Mass spectrometric analysis of lysine ubiquitylation reveals promiscuity at site level. Mol. Cell Proteomics 10(3):M110.003590
112. Tan CS, Pasculescu A, Lim WA, Pawson T, Bader GD, Linding R. 2009. Positive selection of tyrosine loss in metazoan evolution. Science 325:1686–88
113. Zhang J, Sprung R, Pei J, Tan X, Kim S, et al. 2009. Lysine acetylation is a highly abundant and evolutionarily conserved modification in Escherichia coli. Mol. Cell Proteomics 8:215–25
114. Collins MO, Yu L, Campuzano I, Grant SG, Choudhary JS. 2008. Phosphoproteomic analysis of the mouse brain cytosol reveals a predominance of protein phosphorylation in regions of intrinsic sequence disorder. Mol. Cell Proteomics 7:1331–48
115. Hunter T. 2007. The age of crosstalk: phosphorylation, ubiquitination, and beyond. Mol. Cell 28:730–38
116. Strahl BD, Allis CD. 2000. The language of covalent histone modifications. Nature 403:41–45
117. Lee JS, Smith E, Shilatifard A. 2010. The language of histone crosstalk. Cell 142:682–85
118. Lienhard GE. 2008. Non-functional phosphorylations? Trends Biochem. Sci. 33:351–52
119. Berglund L, Bjorling E, Oksvold P, Fagerberg L, Asplund A, et al. 2008. A genecentric Human Protein Atlas for expression profiles based on antibodies. Mol. Cell Proteomics 7:2019–27
120. Pontén F, Gry M, Fagerberg L, Lundberg E, Asplund A, et al. 2009. A global view of protein expression in human cells, tissues, and organs. Mol. Syst. Biol. 5:337
121. Zhang Y, Buchholz F, Muyrers JP, Stewart AF. 1998. A new logic for DNA engineering using recombination in Escherichia coli. Nat. Genet. 20:123–28
122. Poser I, Sarov M, Hutchins JR, Hériché JK, Toyoda Y, et al. 2008. BAC TransgeneOmics: a high-throughput method for exploration of protein function in mammals. Nat. Methods 5:409–15
123. Hubner NC, Bird AW, Cox J, Splettstoesser B, Bandilla P, et al. 2010. Quantitative proteomics combined with BAC TransgeneOmics reveals in vivo protein interactions. J. Cell Biol. 189:739–54
124. Gingras AC, Gstaiger M, Raught B, Aebersold R. 2007. Analysis of protein complexes using mass spectrometry. Nat. Rev. Mol. Cell Biol. 8:645–54
125. Vermeulen M, Hubner NC, Mann M. 2008. High confidence determination of specific protein-protein interactions using quantitative mass spectrometry. Curr. Opin. Biotechnol. 19:331–37
126. Wepf A, Glatter T, Schmidt A, Aebersold R, Gstaiger M. 2009. Quantitative interaction proteomics using mass spectrometry. Nat. Methods 6:203–5
127. Schmidt C, Lenz C, Grote M, Luhrmann R, Urlaub H. 2010. Determination of protein stoichiometry within protein complexes using absolute quantification and multiple reaction monitoring. Anal. Chem. 82:2784–96
128. Chen ZA, Jawhari A, Fischer L, Buchen C, Tahir S, et al. 2010. Architecture of the RNA polymerase II-TFIIF complex revealed by cross-linking and mass spectrometry. EMBO J. 29:717–26
129. Leitner A, Walzthoeni T, Kahraman A, Herzog F, Rinner O, et al. 2010. Probing native protein structures by chemical cross-linking, mass spectrometry, and bioinformatics. Mol. Cell Proteomics 9:1634–49
130. Robinson CV, Sali A, Baumeister W. 2007. The molecular sociology of the cell. Nature 450:973–82
131. Nickell S, Beck F, Scheres SH, Korinek A, Forster F, et al. 2009. Insights into the molecular architecture of the 26S proteasome. Proc. Natl. Acad. Sci. USA 106:11943–47
132. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, et al. 2002. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415:180–83
133. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, et al. 2002. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415:141–47
134. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, et al. 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature 440:631–36
135. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, et al. 2006. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440:637–43
136. Breitkreutz A, Choi H, Sharom JR, Boucher L, Neduva V, et al. 2010. A global protein kinase and phosphatase interaction network in yeast. Science 328:1043–46
137. Bouwmeester T, Bauch A, Ruffner H, Angrand PO, Bergamini G, et al. 2004. A physical and functional map of the human TNF-alpha/NF-kappa B signal transduction pathway. Nat. Cell Biol. 6:97-105
138. Sowa ME, Bennett EJ, Gygi SP, Harper JW. 2009. Defining the human deubiquitinating enzyme interaction landscape. Cell 138:389-403
139. Choi H, Kim S, Gingras AC, Nesvizhskii AI. 2010. Analysis of protein complexes through model-based biclustering of label-free quantitative AP-MS data. Mol. Syst. Biol. 6:385
140. Behrends C, Sowa ME, Gygi SP, Harper JW. 2010. Network organization of the human autophagy system. Nature 466:68-76
141. Seet BT, Dikic I, Zhou MM, Pawson T. 2006. Reading protein modifications with interaction domains. Nat. Rev. Mol. Cell Biol. 7:473-83
142. Schulze WX, Mann M. 2004. A novel proteomic screen for peptide-protein interactions. J. Biol. Chem. 279:10756-64
143. Vermeulen M, Mulder KW, Denissov S, Pijnappel WW, van Schaik FM, et al. 2007. Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell 131:58-69
144. Vermeulen M, Eberl HC, Matarese F, Marks H, Denissov S, et al. 2010. Quantitative interaction proteomics and genome-wide profiling of epigenetic histone marks and their readers. Cell 142:967-80
145. Rohs R, Jin X, West SM, Joshi R, Honig B, Mann RS. 2010. Origins of specificity in protein-DNA recognition. Annu. Rev. Biochem. 79:233-69
146. Mittler G, Butter F, Mann M. 2009. A SILAC-based DNA protein interaction screen that identifies candidate binding proteins to functional DNA elements. Genome Res. 19:284-93
147. Van Laere AS, Nguyen M, Braunschweig M, Nezer C, Collette C, et al. 2003. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig. Nature 425:832-36
148. Markljung E, Jiang L, Jaffe JD, Mikkelsen TS, Wallerman O, et al. 2009. ZBED6, a novel transcription factor derived from a domesticated DNA transposon regulates IGF2 expression and muscle growth. PLoS Biol. 7:e1000256
149. Butter F, Kappei D, Buchholz F, Vermeulen M, Mann M. 2010. A domesticated transposon mediates the effects of a single-nucleotide polymorphism responsible for enhanced muscle growth. EMBO Rep. 11:305-11
150. Butter F, Scheibe M, Morl M, Mann M. 2009. Unbiased RNA-protein interaction screen by quantitative proteomics. Proc. Natl. Acad. Sci. USA 106:10626-31
151. Ong SE, Schenone M, Margolin AA, Li X, Do K, et al. 2009. Identifying the proteins to which small-molecule probes and drugs bind in cells. Proc. Natl. Acad. Sci. USA 106:4617-22
152. Daub H, Olsen JV, Bairlein M, Gnad F, Oppermann FS, et al. 2008. Kinase-selective enrichment enables quantitative phosphoproteomics of the kinome across the cell cycle. Mol. Cell 31:438-48
153. Bantscheff M, Eberhard D, Abraham Y, Bastuck S, Boesche M, et al. 2007. Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat. Biotechnol. 25:1035-44
154. Sharma K, Weber C, Bairlein M, Greff Z, Keri G, et al. 2009. Proteomics strategy for quantitative protein interaction profiling in cell extracts. Nat. Methods 6:741-44
155. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, et al. 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 5:R80
156. Boxem M, Maliga Z, Klitgord N, Li N, Lemmens I, et al. 2008. A protein domain-based interactome network for C. elegans early embryogenesis. Cell 134:534-45
157. Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, et al. 2009. STRING 8: a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37:D412-16
158. Collins SR, Weissman JS, Krogan NJ. 2009. From information to knowledge: new technologies for defining gene function. Nat. Methods 6:721-23
159. Geiger T, Cox J, Mann M. 2010. Proteomic changes resulting from gene copy number variations in cancer cells. PLoS Genet. 6(9):e1001090
160. Clamp M, Fry B, Kamal M, Xie X, Cuff J, et al. 2007. Distinguishing protein-coding and noncoding genes in the human genome. Proc. Natl. Acad. Sci. USA 104:19428-33

