
Review

Advances and Challenges in Liquid Chromatography-Mass Spectrometry-based Proteomics Profiling for Clinical Applications*
Wei-Jun Qian, Jon M. Jacobs, Tao Liu, David G. Camp II, and Richard D. Smith‡

Recent advances in proteomics technologies provide tremendous opportunities for biomarker-related clinical applications; however, the distinctive characteristics of human biofluids such as the high dynamic range in protein abundances and extreme complexity of the proteomes present tremendous challenges. In this review we summarize recent advances in LC-MS-based proteomics profiling and its applications in clinical proteomics as well as discuss the major challenges associated with implementing these technologies for more effective candidate biomarker discovery. Developments in immunoaffinity depletion and various fractionation approaches in combination with substantial improvements in LC-MS platforms have enabled the plasma proteome to be profiled with considerably greater dynamic range of coverage, allowing many proteins at low ng/ml levels to be confidently identified. Despite these significant advances and efforts, major challenges associated with the dynamic range of measurements and extent of proteome coverage, confidence of peptide/protein identifications, quantitation accuracy, analysis throughput, and the robustness of present instrumentation must be addressed before a proteomics profiling platform suitable for efficient clinical applications can be routinely implemented. Molecular & Cellular Proteomics 5:1727–1744, 2006.

Downloaded from www.mcponline.org by on February 4, 2007

Advances in MS technologies, high resolution liquid phase separations, and informatics/bioinformatics for large scale data analysis have made MS-based proteomics an indispensable research tool with the potential to broadly impact biology and laboratory medicine (1). In particular, proteomics technologies have been increasingly applied to the study of disease-related clinical samples (e.g. human blood serum/plasma, proximal fluids, and disease tissues) for the purposes of identifying novel disease-specific protein biomarkers, gaining better understandings of disease processes, and discovering novel protein targets for therapeutic interventions and drug developments (2).

Proteomics-based candidate biomarker discovery efforts have recently gained significant attention due to the power of these technologies for analyzing complex protein mixtures and their potential for identifying novel markers indicative of disease. It is widely believed that many complex human diseases, including cancers, might be more effectively cured if specific disease biomarkers were available to enable detection and treatment at very early stages of disease (3). Despite noteworthy efforts, only a handful of cancer biomarkers have been approved by the United States Food and Drug Administration (FDA)¹ for clinical use, with the majority of these being protein biomarkers (4). Although existing markers play a significant role in screening, monitoring, and staging, effective biomarkers are not currently available for most cancers and are generally nonexistent for early detection (3). Therefore, there is a clear need for applying advanced technologies such as those based on proteomics in the quest for novel candidate clinical biomarkers.

Although widely speculated that advances in genomics and proteomics would alter the landscape of clinical biomarker discovery and validation, the declining trend of new FDA-approved biomarkers reported over the last decade (5) highlights the magnitude of the challenges associated with human clinical samples and validation of candidate biomarkers. Contributing to these challenges are the substantial complexity of the human proteome and the heterogeneity of the human population, both of which make the search for biomarkers from either biofluids or disease tissues a daunting task. As a result of the heterogeneous nature of humans and the complexity of diseases, e.g. cancers, a panel of biomarkers rather than a single marker may be required to achieve the high sensitivity and specificity required for clinical applications (3). Proteomics technologies offer significant potential for discovering such marker panels.

Many different technologies have been applied for biomarker discovery and other clinical applications, including two-dimensional (2D) gel-electrophoresis (6), LC-MS, and protein- and antibody-based microarrays (7–9). LC-MS- or tandem MS

From the Biological Sciences Division and Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington 99352
Received, May 2, 2006, and in revised form, July 25, 2006
Published, MCP Papers in Press, August 3, 2006, DOI 10.1074/mcp.M600162-MCP200

¹ The abbreviations used are: FDA, Food and Drug Administration; SCX, strong cation exchange chromatography; NET, normalized elution time; AMT, accurate mass and time; IMS, ion mobility spectrometry; 2D, two-dimensional; RPLC, reversed phase LC; MARS, multiple affinity removal system; HUPO, Human Proteome Organization; LPS, lipopolysaccharide; MRM, multiple reaction monitoring.

This paper is available on line at http://www.mcponline.org Molecular & Cellular Proteomics 5.10 1727
LC-MS-based Clinical Proteomics

TABLE I
Challenges and limitations of current LC-MS-based proteomics technologies applied to biomarker discovery

Challenge: Dynamic range of measurements
  Current techniques for addressing the challenge: Immunoaffinity depletion and multidimensional fractionation coupled with high resolution LC-MS or MS/MS instrumentation
  Limitations: Low throughput, requires relatively large sample sizes

Challenge: Sensitivity
  Current techniques for addressing the challenge: Small inner diameter LC column (50 µm or less) coupled with nanoflow electrospray ionization and advanced MS instrumentation (i.e. FTICR, LTQ-FT)
  Limitations: Issues in robustness and expense

Challenge: Reproducibility and quantitation
  Current techniques for addressing the challenge: Platform automation (including sample processing), label-free direct quantitation, and isotope labeling-based quantitation
  Limitations: Variations from multistep sample processing, ionization suppression and instrument variations, labeling efficiencies

Challenge: Throughput
  Current techniques for addressing the challenge: Automated fast LC and gas phase ion mobility separations
  Limitations: Limited dynamic range or coverage

Challenge: False positive identifications
  Current techniques for addressing the challenge: Improved database searching algorithms and statistical models
  Limitations: Lack of consensus

(MS/MS)-based proteomics technologies offer highly sensitive analytical capabilities and a relatively large dynamic range of detection and have increasingly become the method of choice for in depth profiling of complex protein mixtures (1). In addition, the relatively high throughput of LC-MS technologies is amenable to clinical applications that involve human biofluids and disease tissues. The application of LC-MS/MS for human biofluid protein profiling was initiated by the first global shotgun proteomics study of human plasma/serum published in 2002 by Adkins et al. (10). An explosion of LC-MS-based applications in human plasma/serum and various biofluids soon followed due to the tremendous interest in identifying disease-related proteins (11, 12). Various depletion/fractionation/enrichment techniques have been developed along the way and coupled to LC-MS to increase coverage of the biofluid proteomes (13).

Human blood serum/plasma remains the most commonly used clinical sample to date for proteomics applications because it may include specific biomarkers for virtually all human diseases due to its either direct or indirect interaction with the entire cell complement of the body, i.e. tissue-specific proteins may be released into the blood stream upon cell damage or cell death. Additionally serum/plasma can be readily obtained by clinical sampling. However, the magnitude of the previously mentioned challenges associated with human clinical samples coupled with the anticipation that potential biomarkers of interest could be present at extremely low concentrations in plasma has raised doubts as to whether disease biomarkers can be accurately detected or identified from plasma using a proteomics approach. As a result, analysis of various other biofluids/tissues has gained increasing attention. Due to their proximity to the source of disease or perturbation in the body, tissues (14) and various biofluids such as cerebrospinal fluid (15), bronchoalveolar lavage fluid (16), synovial fluid (17), nipple aspirate fluid (18), saliva (19), and urine (20) are believed to provide a more focused pool of potential biomarkers of interest. In addition, tumor interstitial fluids have also been reported as a novel source for proteomics biomarker and therapeutic target discovery (21), offering a promising alternative to direct tissue analysis. In the following review, we highlight LC-MS-based proteomics profiling for clinical applications by summarizing recent advances as well as the major challenges facing this technology for more effective candidate biomarker discovery.

CHALLENGES AND REQUIREMENTS FOR DESIGNING A ROBUST LC-MS DISCOVERY PLATFORM

The distinctive nature of human biofluid proteomes, in particular the serum/plasma proteome, presents significant challenges for current analytical technologies aimed at quantitative protein profiling and biomarker discovery. First, the serum/plasma protein content is dominated by several very abundant proteins (i.e. the 22 most abundant proteins represent ~99% of the total protein mass in plasma) yet at the same time presents an extraordinary dynamic range (>10 orders of magnitude) in protein concentrations that begins with serum albumin at ~45 mg/ml and extends to cytokines (and potentially many disease-related proteins) at around 1–10 pg/ml or lower (5). Second, the serum/plasma proteome presents tremendous biological complexity as a result of tissue “leakage” proteins from the entire body, complex post-translational protein modifications such as glycosylation, and the existence of various forms (i.e. splice variants, proteolytic products, and the tremendous variability in the immunoglobulin class) for each expressed gene. Finally the substantial genetic and non-genetic biological variability of human clinical samples contributes significantly to the overall analytical challenge.

Despite significant recent advances, major challenges remain to prevent routine implementation of an LC-MS protein profiling platform suitable for efficient biomarker discovery (Table I). To effectively address these challenges, a protein profiling platform suitable for biomarker discovery and clinical applications must provide at the very minimum 1) overall high dynamic range of measurements and extensive coverage of the proteome for effective detection of low abundance proteins, 2) highly confident and specific protein identifications,


FIG. 1. A component diagram of an LC-MS protein profiling platform. FFE, free flow electrophoresis; 1D, one-dimensional; iTRAQ,
isobaric tags for relative and absolute quantitation.
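
As a rough check on the ">10 orders of magnitude" concentration range quoted for plasma, the span between serum albumin (~45 mg/ml) and cytokines (1–10 pg/ml) can be worked out directly. The following is only an illustrative sketch: the function name is a hypothetical helper, and the concentrations are the approximate values cited in the text.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Return log10 of the ratio between two concentrations in the same units."""
    return math.log10(high / low)


# Approximate values from the text, expressed in g/ml.
ALBUMIN = 45e-3                   # ~45 mg/ml
CYTOKINE_RANGE = (1e-12, 10e-12)  # ~1-10 pg/ml

span_to_10pg = orders_of_magnitude(ALBUMIN, CYTOKINE_RANGE[1])
span_to_1pg = orders_of_magnitude(ALBUMIN, CYTOKINE_RANGE[0])

print(f"albumin to 10 pg/ml: {span_to_10pg:.1f} orders of magnitude")
print(f"albumin to  1 pg/ml: {span_to_1pg:.1f} orders of magnitude")
```

Under these assumed values the span is roughly 9.7 to 10.7 orders of magnitude, and it grows further for proteins present below 1 pg/ml.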

3) accurate quantitation of relative protein abundances across many clinical samples, and 4) high throughput capable of analyzing large numbers of clinical samples to provide sufficient statistical power needed to address biological variability. In addition, the platform, including both sample processing and LC-MS instrumentation, must be robust and include efficient informatics software capabilities for data mining and statistical analyses. Currently there is a broad consensus that no existing platform meets all of these requirements for effective biomarker discovery.

Fig. 1 shows a component-based diagram of an LC-MS protein profiling platform. Note that such a platform is not based on a single instrument but rather on a compilation of current technologies to achieve high dynamic range quantitative proteome profiling for clinical samples. A key performance factor of any such platform is the overall dynamic range of detection and extent of proteome coverage, which in turn dictates its ability to detect low abundance proteins. Many disease-specific proteins in plasma/serum are anticipated to be present at very low levels (ng/ml or even lower), e.g. within the same range as current FDA-approved markers such as prostate-specific antigen (0.01–100 ng/ml) and Troponin-T (0.02–100 ng/ml). This is particularly obvious for cancer markers of early detection where tumor size is very small (millimeter size), and cancer-specific proteins in plasma may present at pg/ml or lower levels. This overall dynamic range presents a tremendous challenge for any MS-based technology. The achievable dynamic range or proteome coverage for a platform depends on the peak capacity (the number of chromatographic peaks that can be fit into the length of separation) of the on-line LC separations prior to MS measurements, the dynamic range of the MS instrumentation, and the efficiency of sample enrichment or fractionation steps at both protein and peptide levels prior to LC-MS analyses. Analysis throughput inevitably determines the size of any clinical study sample set and largely depends on factors such as automation of each platform component, LC-MS analysis duty cycle, and the extent of prefractionation prior to LC-MS analysis. Although the application of more extensive fractionation can lead to a higher dynamic range of detection, the overall throughput can be severely reduced. Other key performance factors are the confidence of protein identifications and the quantitative accuracy, which determine the ability of the platform to confidently identify a potential biomarker based on the abundance differences between healthy and diseased conditions. Both the reproducibility of sample processing/fractionation prior to LC-MS and the LC-MS instrumentation will contribute to the accuracy of quantitation.

ADVANCES IN LC-MS TECHNOLOGIES

A high resolution LC (or LC/LC) separation coupled on line with MS is the central component of many proteomics platforms. Over the past decade, there have been significant advances in LC separations as well as in MS instrumentation and ESI. To date, the “bottom-up” proteomics strategy that combines high efficiency separations with MS to characterize highly complex peptide mixtures still accounts for the majority of proteomics measurements. This strategy relies on the identification of peptides sufficiently unique for protein identification. Protein mixtures from cellular lysates or biofluids are typically digested by trypsin (or other proteases) into polypeptides, which are then separated by capillary LC and analyzed by MS on line via an ESI interface. Peptide sequences are identified by using automated database searching algorithms such as SEQUEST (22), MASCOT (23), or X!Tandem (24) to correlate experimental MS/MS spectra to theoretical mass spectra based on sequences in a given protein database for a specific organism. With the recent development of high speed 2D linear ion trap instruments, i.e. LTQ, the protein profiling coverage has been greatly enhanced compared with traditional three-dimensional ion trap systems (25). When coupled with SCX fractionation either on line or off line (26, 27), LC-MS/MS technologies now routinely allow for identification of thousands of proteins from complex mammalian tissues and cells. Although routinely used for peptide/protein identifications, data-dependent LC-MS/MS still has an inherent “undersampling” limitation whereby only a portion of the species observed in the survey MS scan is selected for fragmentation (28).

To overcome the undersampling issue, our laboratory developed an accurate mass and time (AMT) tag approach that


FIG. 2. A typical LC-FTICR analysis of an IgY-12 depleted human plasma sample. A, the base peak chromatogram. B, a 2D display of ~2,800 identified species at the mass and NET space. The analysis was performed using a Bruker 9.4-tesla FTICR instrument coupled with an LC system equipped with a 150-µm-inner diameter and 65-cm-long capillary column operated at 5,000 p.s.i.
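
The AMT tag idea, in which detected LC-MS features are matched to a pre-established reference library by accurate mass and normalized elution time (NET), can be illustrated with a small lookup routine. This is a minimal sketch and not the authors' software: the class names, the tolerances (5 ppm, 0.02 NET), the tie-breaking score, and the library entries are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # label of the reference peptide (hypothetical)
    mass: float    # monoisotopic mass in Da
    net: float     # normalized elution time, 0-1


@dataclass(frozen=True)
class Feature:
    mass: float    # measured monoisotopic mass in Da
    net: float     # measured normalized elution time


def match_feature(feature: Feature, library: List[AmtTag],
                  ppm_tol: float = 5.0, net_tol: float = 0.02) -> Optional[AmtTag]:
    """Return the closest library tag within both tolerances, or None."""
    best, best_score = None, None
    for tag in library:
        ppm_error = abs(feature.mass - tag.mass) / tag.mass * 1e6
        net_error = abs(feature.net - tag.net)
        if ppm_error <= ppm_tol and net_error <= net_tol:
            # Assumed tie-breaker: sum of errors scaled by their tolerances.
            score = ppm_error / ppm_tol + net_error / net_tol
            if best_score is None or score < best_score:
                best, best_score = tag, score
    return best


# Hypothetical three-entry reference library.
LIBRARY = [
    AmtTag("PEPTIDE_A", 1045.5345, 0.312),
    AmtTag("PEPTIDE_B", 1045.5600, 0.315),
    AmtTag("PEPTIDE_C", 1479.7954, 0.550),
]

hit = match_feature(Feature(1045.5350, 0.310), LIBRARY)
miss = match_feature(Feature(1479.7954, 0.700), LIBRARY)
print(hit.peptide if hit else None, miss)
```

In this toy example the first feature matches PEPTIDE_A (PEPTIDE_B is about 24 ppm away), whereas the second feature has the right mass for PEPTIDE_C but elutes too far from its library NET, so no identification is made. Requiring agreement in both dimensions is what allows identification without MS/MS.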

utilizes highly accurate mass measurements from a high resolution mass spectrometer (e.g. FTICR or TOF mass spectrometer) in conjunction with accurate elution time measurements from high resolution capillary LC separations to achieve high throughput proteome profiling without routine MS/MS measurements (29, 30). The concept of this AMT tag approach is based on the principle that the accurate mass and time measurements will allow reliable peptide identifications by correlating the mass and time of detected peaks to a pre-established peptide AMT tag reference library for a particular biological system (e.g. plasma). With this approach, LC-MS/MS proteome analyses coupled with extensive fractionation only need to be performed once to create an effective reference database of peptide markers defined by accurate masses and elution times, i.e. AMT tags. The AMT tag database then serves as a comprehensive “look-up table” for subsequent higher throughput LC-MS analyses, allowing many peptides in each spectrum to be identified without MS/MS. Fig. 2 exemplifies an LC chromatogram and 2D display of ~2,800 peptides identified using the AMT tag strategy resulting from a single LC-FTICR analysis of a ProteomeLab™ IgY-12 depleted human plasma sample.

The fact that application of the AMT tag approach obviates the need for routine MS/MS is particularly attractive in high throughput repeated analyses of similar samples (e.g. serum/plasma) in clinical proteomics studies. We have recently demonstrated the application of the AMT tag approach coupled with 18O labeling for quantitative profiling of the human plasma proteome in response to lipopolysaccharide administration (31). The availability of commercial high performance mass spectrometers (e.g. ThermoElectron Finnigan LTQ-FT and LTQ-Orbitrap) will likely lead to an even broader range of applications based on this LC-MS-only approach for higher throughput peptide identifications.

As mentioned previously, the achievable dynamic range for the LC-MS platform depends significantly on the peak capacity of the on-line gradient reversed phase separations, the dynamic range of the MS system, and the efficiency and stability of the ESI interface. A single MS spectrum can provide a dynamic range of up to 10^3 for a high resolution instrument (e.g. FTICR), and one would expect to achieve a dynamic range of at least 10^5 by coupling this instrument to an on-line high resolution LC separation that provides a peak capacity of ~1,000. However, the observed dynamic range of measurements can be significantly reduced for complex biological samples such as human plasma due to the charge competition of co-eluting high abundance species, leading to ion suppression of the relatively low abundance species. Ion suppression is a particular issue when analyzing human biofluid samples as these samples are dominated by a handful of highly abundant proteins. Significant ion suppression will occur when peptides originating from low abundance proteins of interest co-elute with peptides originating from high abundance proteins, leading to the inability to detect the co-eluting low abundance peptides.

Table II provides a summary of the relative proteome coverage and estimated dynamic ranges achieved by coupling high resolution reversed phase capillary LC separations with either MS/MS using an LTQ instrument or MS using a 9.4-tesla FTICR instrument. The enhanced coverage and dynamic ranges obtained by the removal of high abundance proteins and SCX fractionation are illustrated. All results shown in Table II are based on triplicate experiments that involved a pooled plasma sample from healthy subjects. The numbers of peptide identifications are reported with >95% confidence based on either a reversed database evaluation for MS/MS data (32) or a shifted database evaluation for the LC-FTICR


TABLE II
The proteome coverage and estimated dynamic range offered by current LC-MS technologies
A pooled reference plasma sample from healthy individuals was used for this evaluation. A prepacked 4.6 × 50-mm (loading capacity, 15 µl of plasma) MARS affinity column (Agilent, Palo Alto, CA) and a 7 × 52-mm (loading capacity, 25 µl of plasma) ProteomeLab IgY-12 affinity column (Beckman Coulter, Fullerton, CA) were used for the depletion of high abundance proteins. For each method, the samples were processed in triplicate and individually analyzed using a 150-µm-inner diameter and 65-cm-long column coupled with either a Finnigan LTQ system (MS/MS) or a Bruker 9.4-tesla FTICR instrument. 10 and 5 µg of peptide samples were loaded for each LC-MS/MS and LC-FTICR analyses, respectively. 300 µg of peptides were used for each SCX fractionation. The LC and SCX operations were the same as described previously (31). Peptides were filtered with a confidence level >95% based on reversed database evaluation (32), and proteins were identified with at least two different peptides. ALS, acid-labile subunit; vWF, von Willebrand factor; SAA, serum amyloid A; CRP, C-reactive protein; HGFA, hepatocyte growth factor activator; MSF, megakaryocyte-stimulating factor; EGFR, epidermal growth factor receptor; APOC2, apolipoprotein C-II; B2M, β2-microglobulin; NAP1L1, nucleosome assembly protein 1-like 1; MMP2, matrix metallopeptidase 2; 1D, one-dimensional. We note that more relaxed identification criteria would considerably expand the numbers of peptides and proteins identified by all approaches.

Method                                Replicate 1  Replicate 2  Replicate 3  Overlap

Non-depleted plasma and 1D LC-MS/MS
  Peptides                                  1,398        1,213        1,466      972
  Proteins                                     99           97          102       96
  Identified low abundance proteins: ALS, 25 µg/ml; Factor XII, 30 µg/ml; APOC2, 35 µg/ml
  Estimated dynamic range of coverage: ~10^3

MARS depletion and 1D LC-MS/MS
  Peptides                                  1,723        1,732        1,692    1,250
  Proteins                                    119          118          115      111
  Identified low abundance proteins: B2M, 1.1 µg/ml; vWF, 1.3 µg/ml; SAA, 10 µg/ml
  Estimated dynamic range of coverage: ~10^4

IgY-12 depletion and 1D LC-MS/MS
  Peptides                                  1,869        1,912        1,999    1,309
  Proteins                                    130          141          130      122
  Identified low abundance proteins: Myoglobin, 90 ng/ml; CRP, 500 ng/ml; HGFA, 500 ng/ml; CD14, 1.4 µg/ml
  Estimated dynamic range of coverage: ~10^5

IgY-12 depletion and 1D LC-FTICR
  Peptides                                  2,800        2,840        2,630    2,070
  Proteins                                    174          172          167      162
  Identified low abundance proteins: Myoglobin, 90 ng/ml; CRP, 500 ng/ml; HGFA, 500 ng/ml; CD14, 1.4 µg/ml
  Estimated dynamic range of coverage: ~10^5

IgY-12 depletion and SCX-LC-MS/MS
  Peptides                                  5,196        6,148        5,687    3,391
  Proteins                                    498          474          476      369
  Identified low abundance proteins: MSF, 1 ng/ml; Leptin, 5 ng/ml; NAP1L1, 7 ng/ml; MMP2, 9 ng/ml; Cathepsin D, 9 ng/ml; EGFR, 11 ng/ml
  Estimated dynamic range of coverage: ~10^6–10^7
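
The two filters named in the Table II legend, a >95% confidence level from reversed database evaluation and a minimum of two different peptides per protein, can be sketched as follows. This is an assumption-laden illustration rather than the procedure of Ref. 32: the false discovery rate is estimated here simply as decoy hits divided by target hits above a score threshold, and the `PeptideHit` type, the score scale, and the example data are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Set


class PeptideHit(NamedTuple):
    peptide: str
    protein: str
    score: float    # hypothetical search score, higher is better
    is_decoy: bool  # True if the match is to a reversed (decoy) sequence


def filter_by_decoy_rate(hits: Iterable[PeptideHit],
                         max_fdr: float = 0.05) -> List[PeptideHit]:
    """Keep target hits from the longest score-ranked prefix whose
    decoy/target ratio does not exceed max_fdr (assumed FDR estimate)."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    targets = decoys = cutoff = 0
    for i, hit in enumerate(ranked, start=1):
        if hit.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = i
    return [h for h in ranked[:cutoff] if not h.is_decoy]


def proteins_with_min_peptides(hits: Iterable[PeptideHit],
                               min_peptides: int = 2) -> Set[str]:
    """Return proteins supported by at least min_peptides distinct peptides."""
    peptides_by_protein = defaultdict(set)
    for hit in hits:
        peptides_by_protein[hit.protein].add(hit.peptide)
    return {protein for protein, peptides in peptides_by_protein.items()
            if len(peptides) >= min_peptides}


# Hypothetical search results.
HITS = [
    PeptideHit("AAK", "P1", 9.0, False),
    PeptideHit("CCR", "P1", 8.5, False),
    PeptideHit("DDK", "P2", 8.0, False),
    PeptideHit("EEK", "REV_P9", 4.0, True),
    PeptideHit("FFR", "P2", 3.0, False),
]

accepted = filter_by_decoy_rate(HITS, max_fdr=0.05)
confident_proteins = proteins_with_min_peptides(accepted)
print([h.peptide for h in accepted], confident_proteins)
```

In this toy data set the three highest-scoring peptides pass the decoy filter, but only protein P1 is retained because P2 is left with a single accepted peptide. This is why the protein counts in Table II are much smaller than the peptide counts.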

data² with all proteins identified using a minimum of two different peptides. As shown, the single LC-MS/MS analysis only identifies ~100 proteins with high confidence and provides a dynamic range of ~10^3. With the removal of either the top six (MARS) or top 12 (IgY-12) abundant proteins, the overall dynamic range is enhanced to ~10^5. LC-FTICR shows greater coverage for both peptide and protein identifications compared with LC-MS/MS, and the dynamic range is estimated to be similar to that observed for LC-MS/MS. (It should be noted that presently unassigned peptides probably include many more proteins.) When IgY-12 depletion and SCX fractionation are combined with LC-MS/MS, a dynamic range of 10^6–10^7 can be achieved, allowing identification of nearly 500 proteins in plasma with high confidence including many at the low ng/ml level, and 2D LC-FTICR analyses would be expected to increase this by approximately another order of magnitude. Note, however, that this dynamic range still falls 3 orders of magnitude short for detecting pg/ml protein concentrations. In addition, it should be noted that not all the proteins within the estimated dynamic range will be detected due to the differences in digestion efficiency and ion suppression effects for different proteins/peptides within the complex sample.

One key area of recent advances in LC-MS technologies is the improvement associated with capillary LC instrumentation that provides enhanced peak capacities and dynamic range of detection needed to analyze clinical samples. These improvements have been achieved primarily through the use of very high pressure (10–20 kp.s.i.), very small porous particles (3 µm or less), smaller inner diameter columns (50-µm inner diameter or less), nanoelectrospray interfaces, and relatively long columns and long gradients for separations (33–35). For example, high efficiency separations with peak capacities of ~1,000 have been achieved by using 15–75-µm-inner diameter and 85-cm-long capillary columns packed with 3-µm C18-bonded silica particles operated at 10 kp.s.i. By using smaller inner diameter columns (e.g. 15 µm) (34), the sensitivity of the system continues to increase inversely as the mobile phase flow rates drop to as low as 20 nl/min, demonstrating the advantages of ESI-MS analyses at very low liquid flow rates (36, 37). More recently, the use of 20 kp.s.i. capillary LC columns packed with 1.4–3-µm porous C18-bonded silica particles has been demonstrated to provide chromatographic peak capacities of 1,000–1,500 for complex peptide and metabolite mixtures (35). Although these very high pressure systems present technical challenges for robust automated op-

² V. A. Petyuk, W. J. Qian, M. H. Chin, H. Wang, E. A. Livesay, M. E. Monroe, J. N. Adkins, N. Jaitly, D. J. Anderson, D. G. Camp, D. J. Smith, and R. D. Smith, manuscript submitted.


erations, the recently commercialized Waters nanoACQUITY UPLC System that takes advantage of 1.7-µm sized particles and operates at >10 kp.s.i. demonstrates the feasibility of such high performance systems for routine applications. With further improvements in robustness, these “ultraperformance” systems may become a powerful component for separating complex mixtures such as human biofluids while concurrently providing the high dynamic range needed for candidate biomarker discovery applications.

MULTIDIMENSIONAL FRACTIONATION STRATEGIES COUPLED WITH LC-MS FOR IMPROVED PROTEOME COVERAGE

Given the tremendous dynamic range of protein abundances and the extraordinary complexity of human biofluid proteomes, many different fractionation techniques have been developed and applied in a multidimensional fashion to enhance dynamic range of detection and improve proteome coverage (13). Multicomponent immunoaffinity removal of highly abundant proteins in human plasma/serum (38, 39) has increasingly become the method of choice for prefractionating human plasma samples due to the high specificity, efficacy, and ease of coupling to other fractionation techniques. As shown in Table II, coupling the immunoaffinity depletion step to LC-MS provides an additional 1–2 orders of magnitude increase in dynamic range, allowing for detection of more low abundance proteins by effectively increasing the sample loading; similar improvements were reported in other studies (40, 41). Good reproducibility was demonstrated by performing immunoaffinity depletion with an automated LC system; however, some of the nontarget low abundance proteins have also been observed to bind to the columns but in a reproducible fashion (42). A possible approach to counter this effect is to analyze both the flow-through and bound fractions in more of a “partitioning” method instead of a pure “depletion” approach (39) with the accompanying trade-off of an increased number of required analyses. A further enhancement to the platform dynamic range will stem from the continuous improvement of antibody-based microbead technologies that will allow for removal of more highly to moderately abundant proteins.

Several different techniques for protein-level fractionation have been applied to human plasma/serum proteome profiling, including common gel-based techniques (43, 44), PF2D automated chromatofocusing/reversed phase LC (RPLC) (45) and other liquid chromatography-based separations (46), free-flow electrophoresis (41, 47), and IEF (46, 48–51). IEF is a common fractionation technique that has been applied to plasma profiling at both peptide and protein levels. Various forms of liquid phase IEF techniques have been developed, including off-gel electrophoresis (48), Rotofor (49) or Mini-Rotofor (46), microscale solution IEF (ZOOM) (50), and a preparative multichannel electrolyte system (51). A common feature of these systems is the multiple tandem electrode chambers used to partition complex protein samples. IPG IEF followed by in-gel digestion has also been used for plasma protein fractionation prior to LC-MS/MS (52). A number of recent large scale proteome profiling studies have combined different protein- and peptide-level fractionation techniques (e.g. PF2D (45), SCX/RPLC (54), free flow electrophoresis-IEF/RPLC (47), ZOOM/SDS-PAGE (50), and Rotofor/RPLC/SDS-PAGE (49) protein fractionation) with peptide-level LC-MS/MS analyses to achieve more comprehensive coverage of the plasma proteome.

An alternative to plasma protein fractionation is to specifically enrich functional “subproteomes” such as the glycoproteome or the cysteinyl subproteome by using chemical tagging or capture agents; this significantly reduces overall sample complexity and enhances detection of low abundance proteins. For example, we have recently demonstrated a simple procedure for effectively enriching cysteinyl peptides from complex proteomes (including human biofluids (55)) that provides significantly improved proteome coverage when used as a peptide-level fractionation technique (27). Additionally hydrazine chemistry can be applied to specifically enrich N-linked glycopeptides (56, 57), and multilectin affinity chromatography can be used to isolate and characterize glycoproteins from human plasma and serum samples (58). Our laboratory has recently developed a strategy that combines immunoaffinity depletion and subsequent chemical fractionation based on cysteinyl peptide and N-glycoprotein captures with 2D LC-MS/MS for in depth plasma profiling (Fig. 3) (59). Application of this “divide-and-conquer” strategy to trauma patient plasma samples resulted in confident identification of ~1,500 different proteins (with a minimum of two peptides per protein; ~99.5% confidence level based on reversed database evaluation) and illustrated an overall dynamic range of detection of >10^7 (low ng/ml concentrations for six identified low abundance proteins were verified by ELISA).

ANALYSIS THROUGHPUT

Although integration of extensive multidimensional fractionation/separations with MS greatly increases the overall proteomics analysis dynamic range and the extent of proteome coverage, this general approach suffers from the limitation of very low throughput. To date, most reports involving extensive fractionation have been limited to small scale studies of one or two pooled clinical samples rather than larger scale quantitative studies. The development of more effective depletion/fractionation strategies and improved LC-MS platforms will most likely reduce the total number of fractions necessary for the detection of low abundance and clinically relevant proteins and thus provide higher throughput.

Several recent technology developments hold potential for greatly enhancing the overall analysis throughput of clinical samples. The first is the development of very fast LC separations for proteomics analyses. Current automated LC-MS proteomics platforms typically involve LC separations with gradients of 100 min or longer, which limits throughput to ~10


FIG. 3. Schematic representation of a chemical fractionation strategy applied to the plasma proteome characterization. High
abundance proteins were first removed using immunoaffinity subtraction. The resulting less abundant proteins were split and subjected to solid
phase cysteinyl peptide and N-glycoprotein captures independently. Non-cysteinyl peptides and non-glycopeptides generated at the same
time were also collected. All four different peptide populations were then fractionated by SCX, and each fraction was analyzed by capillary
LC-MS/MS. PNGase F, peptide-N-glycosidase F (59).

sample analyses per day per MS instrument. Several reports have explored the use of smaller particle-packed columns or monolithic columns for fast LC separations (10 min or less) as well as multiplex column systems to significantly improve the throughput (60, 61). However, it is unclear whether sufficient separation power can be achieved with these fast liquid phase separations because the increase in the solvent gradient speed can degrade the separation peak capacity (60), which in turn reduces the overall dynamic range of detection. Other strategies for achieving robust fast separations include liquid phase chromatographic and electrophoretic separations on a microfluidic chip platform (62–64). Such chip-based separation devices also have the advantage of providing better robustness, reliability, and ease of operation.

Very fast (millisecond scale) gas phase separations based on ion mobility spectrometry (IMS; a separation method that is somewhat analogous to electrophoresis in the gas phase) are another powerful alternative to liquid phase separations for significant improvement in throughput. At its simplest, an IMS stage consists of a drift tube filled with a non-reactive gas (commonly helium or N2) and a uniform electric field established along the axis of separation. Mixtures of peptides, proteins, or small molecules are separated by their gas phase cross-sections (size) in addition to charge, and knowledge of their mobility provides another separation dimension to aid in identification.

The power of IMS has been advanced by several recent technical developments. IMS coupled with a TOF MS platform and combinatorial libraries (65) has been recently demonstrated for analysis of proteolytic digests (66). Because an IMS separation typically requires 1–100 ms and has a resolving power of 50–200, a single species IMS peak exits the drift tube over a ~0.1–1-ms period. Generation of a typical TOF MS spectrum requires ~30–100 μs, which allows multiple mass spectra to be obtained during the “elution” of an IMS peak. More recently, LC has been coupled to IMS-TOF MS via an ESI interface, providing 2D separations prior to MS analysis (67). Despite enormous potential for high throughput analyses of complex samples, the application of IMS-TOF MS has been limited by low sensitivity due to ion losses at the IMS-MS interface; however, the recent implementation of


FIG. 4. Schematic diagram of a prototype ESI-IMS-Q-TOF instrumentation platform that uses electrodynamic ion funnel interfaces
at both ends of the IMS drift tube and, as a result, provides very high sensitivity from high speed analyses. Reproduced with permission
from Ref. 68, copyright 2005 Am. Chem. Soc.

electrodynamic ion funnels at both the ESI-IMS and IMS-TOF MS interfaces has significantly improved the sensitivity of the overall LC-ESI-IMS-TOF MS platform (Fig. 4) (68) such that the sensitivity is now comparable to that of a commercial ESI-MS instrument. Although still in the development stage, the very fast separation speed and potential high dynamic range of measurements offered by the 2D liquid phase-gas phase separations make LC-ESI-IMS-TOF MS an attractive and practical platform for high throughput clinical applications.
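
The throughput and timing figures quoted in this section can be checked with simple arithmetic. The following is only an illustrative sketch: the function names are hypothetical, the per-run overhead (column re-equilibration and sample loading) is an assumed parameter that the text does not state, and the peak width uses the common definition of IMS resolving power as drift time divided by peak width.

```python
def analyses_per_day(gradient_min, overhead_min=0.0):
    """Upper bound on LC-MS analyses per instrument per day.

    overhead_min is an assumption for illustration; the text only
    gives the gradient length (100 min or longer).
    """
    return int((24 * 60) // (gradient_min + overhead_min))


def ims_peak_width_ms(drift_time_ms, resolving_power):
    """IMS peak width, taking resolving power = drift time / peak width."""
    return drift_time_ms / resolving_power


def tof_spectra_per_ims_peak(peak_width_ms, tof_spectrum_us):
    """How many TOF spectra fit within the width of one IMS peak."""
    return peak_width_ms * 1000.0 / tof_spectrum_us


if __name__ == "__main__":
    print(analyses_per_day(100))              # gradient time only
    print(analyses_per_day(100, 40))          # with an assumed 40 min of overhead
    print(ims_peak_width_ms(20, 100))         # 20 ms drift time, resolving power 100
    print(tof_spectra_per_ims_peak(1.0, 30))  # 1 ms peak, 30 microsecond spectra
```

With a 100-min gradient alone the bound is 14 runs per day, so the ~10 analyses per day cited above is consistent with a few tens of minutes of overhead per run.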

CONFIDENCE OF PEPTIDE/PROTEIN IDENTIFICATIONS

One of the challenges associated with MS/MS-based proteome profiling is how to assess the confidence levels of peptide and protein identifications that result from automated database searching. It is recognized that a significant portion of the protein identifications in previously published proteomics datasets of human plasma likely consists of false positive identifications (32, 69–71). For example, four different plasma proteomics datasets that originated from different methodologies were combined into a list that included 1,175 non-redundant proteins; however, only 46 of these non-redundant proteins (~4%) were observed across all four studies (70). This surprisingly low overlap suggests the potential for a very large number of false protein identifications. In a plasma profiling study using nanoscale LC-MS/MS, Shen et al. (69) reported a nearly 2-fold difference in the number of identified proteins (ranging from 800 to 1,600) depending on which set of previously published criteria was used to filter the data. This criteria-dependent difference illustrates the need for more detailed statistical evaluations to ensure high confidence protein identifications.

FIG. 5. Relative frequency of different peptides identified from the normal human protein database (solid line) and the reversed human protein database (dashed line) at different Xcorr values. Data shown are for the 2+ charge state fully tryptic peptides identified from human plasma and filtered with ΔCn ≥ 0.1. Reproduced with permission from Ref. 32, copyright 2005 Am. Chem. Soc.

To address the issue of false peptide identifications, we recently performed a probability-based evaluation of peptide identifications derived from LC-MS/MS and SEQUEST analysis in which selected human proteomes, including human plasma, were searched against a sequence-reversed human protein database (32) similar to a previous report applying the reversed database strategy to the yeast proteome (72). The reversed protein database was created by reversing the order of amino acid sequences for each protein (the carboxyl terminus becomes the amino terminus and vice versa) in the original human protein database. This approach assumes that the numbers of false positives that arise from “random” hits should be the same for both the normal database and the reversed database because the reversed database is identical in number of protein entries, protein size, and distribution of amino acids to the normal database. Fig. 5 shows a histogram of Xcorr distribution for unique peptides (charge state 2+;


TABLE III
Comparison of peptide and protein identifications from a plasma proteome profiling dataset analyzed using different criteria (59)

Filtering criteria | Difference in stringency | Peptides identified | Proteins identified(a) | Multipeptide proteins | Average peptides per protein | Estimated false positive rate(b) (%)
Reversed database (32) | >95% confidence at the unique peptide level based on statistical evaluation. Only fully and partially tryptic peptides are considered. | 22,267 | 3,654 | 1,494 (40.9%) | 6.1 | ~4
HUPO Plasma Proteome Project (77) | Inclusion of partially tryptic peptides with relatively low cutoffs. | 30,524 | 7,928 | 2,850 (35.9%) | 3.9 | ~25
Hood et al. (78) | Inclusion of partially tryptic and other enzymatically cleaved peptides as well as peptides without protease constraints with relatively low cutoffs. | 66,839 | 18,958 | 11,653 (61.5%) | 3.5 | ~66

(a) Non-redundant protein identifications generated by ProteinProphet (80).
(b) False positive rate for each filtering criteria was calculated at unique peptide level based on reversed database evaluation (32). The reversed
protein database was created by reversing the order of amino acid sequences for each protein (the carboxyl terminus becomes the amino
terminus and vice versa) in the original protein database.
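
The reversed database evaluation referred to in footnote b can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `REV_` name prefix, the function names, and the toy scores are all hypothetical, and the histogram areas are approximated by simple counts of peptide hits at or above an Xcorr cutoff.

```python
def reverse_database(proteins):
    """Build a reversed database from a dict of name -> sequence.

    Each sequence is reversed so that the carboxyl terminus becomes
    the amino terminus. The "REV_" prefix is an illustrative choice.
    """
    return {"REV_" + name: seq[::-1] for name, seq in proteins.items()}


def false_positive_rate(normal_xcorr, reversed_xcorr, cutoff):
    """Estimate the false positive rate at a given Xcorr cutoff.

    Ratio of reversed database hits to normal database hits among
    peptides scoring at or above the cutoff.
    """
    n_normal = sum(1 for x in normal_xcorr if x >= cutoff)
    n_reversed = sum(1 for x in reversed_xcorr if x >= cutoff)
    if n_normal == 0:
        raise ValueError("no normal database hits at this cutoff")
    return n_reversed / n_normal


if __name__ == "__main__":
    db = {"P1": "MKTAYIAK", "P2": "GSHMLEDR"}   # toy sequences
    print(reverse_database(db))

    normal_hits = [1.5, 2.0, 2.5, 3.0, 3.5]      # toy Xcorr values
    reversed_hits = [1.2, 1.6, 2.1]
    print(false_positive_rate(normal_hits, reversed_hits, cutoff=2.0))
```

Raising the cutoff until the estimated rate falls below 0.05 corresponds to the >95% confidence criterion in the first row of the table.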



fully tryptic) from a human plasma sample identified by searching the normal (solid line) and reversed (dashed line) databases. The Xcorr distribution allows an estimated confidence level for any given Xcorr bin as well as the overall false positive rate for a given Xcorr cutoff to be calculated by dividing the area beneath the dashed line (reversed database hits) by the area beneath the solid line (normal database hits) for a given Xcorr range. This study also revealed the high false positive rates for plasma/serum peptide/protein identifications in several previously published studies (10, 69, 70, 73, 74). For example, ~30% false positives were observed when the often cited Washburn et al. (75) filtering criteria were applied to human plasma. Thus, filtering criteria that provided overall >95% confidence at the unique peptide level for both human cell lines and human plasma were proposed. When identical filtering criteria were used, the observed false positive rates of peptide identifications for human plasma were significantly higher than those for the human cell lines, suggesting that the false positive rates are significantly dependent upon sample characteristics, particularly the number of proteins found within the detectable dynamic range for different samples. Additionally Xie and Griffin (76) reported the increased potential for false positive identifications for the 2D linear ion trap (LTQ) when compared with a traditional three-dimensional ion trap (LCQ) instrument, and more stringent filtering criteria are required for LTQ compared with LCQ to minimize false positive identifications. These results suggest that peptide/protein identification confidence levels depend not only on sample characteristics but also on components of the LC-MS platform.

Table III illustrates differences in filtering criteria stringency by comparing peptide/protein identification results from the same plasma MS/MS dataset (obtained from a recent profiling study using trauma patient plasma samples (59)) that was filtered using three different sets of criteria (77, 78). As shown, the reversed database filtering criteria generated the smallest number of peptide and protein identifications, consistent with the significantly lower percentage of false positive identifications (~4%), whereas the Human Proteome Organization (HUPO) plasma proteome project-recommended criteria (77) and the criteria recently reported by Hood et al. (78) generated approximately 25 and 66% false positives at the peptide level, respectively. The comparison shows that the number of peptide/protein identifications from an individual protein profiling study could be easily inflated if a statistical evaluation of false positives was not performed.

A similar observation was recently reported for proteins identified from data acquired on different instruments from 18 laboratories as part of the large scale HUPO plasma proteome collaborative study (77). Application of a rigorous statistical approach that used multiple hypothesis-testing techniques and took into account the length of coding regions in genes reduced the initial list of 9,504 proteins (of which 3,020 were identified with two or more peptides) to 889 proteins (containing both multipeptide and single peptide protein identifications) identified with a confidence level of at least 95% (71). Interestingly this length-dependent statistical approach was applied to reanalyze one of our previously published datasets (69) and resulted in 1,073 proteins using the HUPO criteria and 433 proteins using the >95% confidence length-dependent statistics (71). Similarly a ~2-fold difference in protein identifications between the reversed database filtering results and the HUPO criteria (Table III) was observed, suggesting similar performance between the length-dependent statistical approach and reversed database filtering with >95% confidence.

PeptideProphet provides another independent statistical model for evaluating potential false positive peptide identifications. The model utilizes the expectation-maximization algorithm to derive a mixture of correct and incorrect peptide



FIG. 6. A, mass error histograms of features detected from a single LC-FTICR dataset of a human plasma sample that matched to a human
plasma AMT tag database using different levels of NET constraints. The LC separation time is normalized to a 0 –1 scale in NET. B, mass error
histograms for features from the same dataset matching to a normal AMT tag database (gray circles) and to a shifted AMT tag database (black
squares). Note, the black squares represent random matches to the 11 Da shifted AMT tag database.
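
The matching behind Fig. 6 can be sketched in a few lines. This is a hedged illustration rather than the actual AMT tag software: the function names, the tuple layout of features and tags, and the toy values are hypothetical. The ±2 ppm window, the NET tolerance of 0.02 (2% of the 0–1 NET scale), and the 11-Da shift follow the values quoted in this review.

```python
def ppm_error(observed_mass, database_mass):
    """Mass error in parts per million relative to the database mass."""
    return (observed_mass - database_mass) / database_mass * 1e6


def match_features(features, amt_tags, ppm_tol=2.0, net_tol=0.02, mass_shift=0.0):
    """Match (mass, NET) features to (id, mass, NET) AMT tags.

    mass_shift adds a constant offset to every database mass; a
    non-zero shift gives a database that can only match at random.
    """
    matches = []
    for f_mass, f_net in features:
        for tag_id, t_mass, t_net in amt_tags:
            mass_ok = abs(ppm_error(f_mass, t_mass + mass_shift)) <= ppm_tol
            net_ok = abs(f_net - t_net) <= net_tol
            if mass_ok and net_ok:
                matches.append((f_mass, f_net, tag_id))
    return matches


def shifted_database_fpr(features, amt_tags, shift_da=11.0, **tolerances):
    """Ratio of shifted database matches to normal database matches."""
    normal = match_features(features, amt_tags, **tolerances)
    shifted = match_features(features, amt_tags, mass_shift=shift_da, **tolerances)
    if not normal:
        raise ValueError("no matches to the normal database")
    return len(shifted) / len(normal)


if __name__ == "__main__":
    tags = [("A", 1000.0005, 0.505), ("B", 1500.0, 0.31), ("C", 2000.1, 0.90)]
    feats = [(1000.0, 0.50), (1500.001, 0.30), (1011.0005, 0.52), (2000.0, 0.90)]
    print(match_features(feats, tags))
    print(shifted_database_fpr(feats, tags))
```

Tightening `net_tol` removes random matches without changing the mass window, which is the effect of the NET constraint shown in panel A.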

assignments from the data (79). This approach has been directly compared with the reversed database approach for analyzing the same dataset derived from human plasma (59). Following filtering with reversed database criteria, 6,279 unique peptides were identified from this dataset with >95% confidence, whereas 6,341 unique peptides were identified by PeptideProphet using a minimum computed probability of 0.95. Approximately 95% of peptides were common between the two datasets, suggesting comparable results from these two statistical approaches. The use of ProteinProphet, another statistical model that computes the probability of the presence of proteins, addresses the issue of whether peptides are present in more than one entry in the protein database (protein redundancy problem) (80). The list of identified peptides from both the PeptideProphet and the reversed database filtering approaches can serve as input for ProteinProphet to generate a list of non-redundant protein identifications. Several other statistical methods have been recently described for evaluating peptide assignments from MS/MS spectra (81–83). Ideally universal acceptance of a statistical model that optimizes both sensitivity and specificity for confident peptide identifications from MS/MS spectra will allow cross-comparison of protein profiling results from different laboratories, which currently remains an unresolved challenge.

Similar challenges exist for evaluating false positive identifications from MS-only approaches that utilize accurate mass measurements for peptide/protein identifications. The utility of accurate mass measurements initially was demonstrated in the “peptide mass fingerprinting” approach for protein identification in which a set of peptide fragments unique to each protein is created by digestion, and the masses of these peptide fragments are used as a “fingerprint” to identify the original protein (84–86). Thus far, this approach has been limited to simple protein mixtures or single proteins. The more recently reported AMT tag approach utilizes accurate LC retention time measurements in addition to accurate mass measurements to identify peptides and has been successfully applied to global proteome profiling, including the human plasma proteome (31, 87). With the AMT tag approach, peptides are identified by matching LC-MS observed mass and normalized elution time (NET) features to AMT tags in the pre-established reference database (look-up table of peptides) with given mass error and NET error tolerances (typically 1–5 ppm for mass and 1–3% for NET). The potential false positive identifications resulting from random matching of features to the reference database are indicated on histograms of mass error (the difference between observed mass and calculated mass for the matched peptide in the database) exemplified in Fig. 6A for a human plasma dataset analyzed by LC-FTICR. Note that the use of the NET constraint significantly reduces the level of random matches as indicated by the background level for each histogram. Similar to the reversed database approach for MS/MS, we have recently applied a shifted data-



FIG. 7. A, a partial 2D display of the detected 18O/16O-labeled peptide pairs from an LC-FTICR analysis. The elution time is shown as a
normalized scale between 0 and 1. Observed peaks (represented by spots) correspond to various eluting peptides. The heavy and light
isotope-labeled pairs are easily visualized with a 4-Da mass difference. B, normalized -fold changes for the 429 quantified proteins following
LPS administration. The abundance ratio for each protein shown was normalized to zero (R − 1) (53). For ratios smaller than 1, normalized
inverted ratios were calculated as 1 − (1/R). The error bar for each protein indicates the S.D. for the abundance ratios from multiple peptides.
Proteins without error bars were identified with single peptides.
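
The normalization described in this legend maps a ratio of 1 to zero and makes increases and decreases symmetric. A minimal sketch follows; the function names are hypothetical, and taking the protein ratio as the mean of its peptide ratios and using the sample standard deviation are assumptions that the legend does not specify.

```python
from statistics import mean, stdev


def normalized_fold_change(ratio):
    """Normalize an abundance ratio R so that no change maps to zero.

    R >= 1 gives R - 1; R < 1 gives the inverted form 1 - (1/R),
    so a 2-fold increase is +1 and a 2-fold decrease is -1.
    """
    if ratio <= 0:
        raise ValueError("abundance ratio must be positive")
    if ratio >= 1:
        return ratio - 1
    return 1 - (1 / ratio)


def summarize_protein(peptide_ratios):
    """Return (normalized fold change, S.D. of peptide ratios).

    The S.D. is None for proteins quantified from a single peptide,
    which corresponds to the proteins drawn without error bars.
    """
    protein_ratio = mean(peptide_ratios)  # assumption: simple mean
    sd = stdev(peptide_ratios) if len(peptide_ratios) > 1 else None
    return normalized_fold_change(protein_ratio), sd


if __name__ == "__main__":
    print(normalized_fold_change(2.0))
    print(normalized_fold_change(0.5))
    print(summarize_protein([1.8, 2.2, 2.0]))
    print(summarize_protein([0.25]))
```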

base approach for evaluating the false positive rate in the AMT tag process.2 As shown in Fig. 6B, an ~3% false positive rate for this human plasma dataset was estimated as the ratio of the area beneath the curve that represents matches to the shifted database (black squares) to the area beneath the curve that represents matches to the normal database within a ±2 ppm window (gray circles). In addition to being used for direct identification in the MS-only approach, the accurate mass information also has been utilized for improving the confidence of peptide identifications by MS/MS through application of the new generation of LTQ-FT and LTQ-Orbitrap mass spectrometers (88, 89).

QUANTITATION STRATEGIES

The ability to quantitatively measure relative protein abundance differences between different clinical samples is essential for identifying candidate protein biomarkers; however, the vast majority of proteomics work related to biomarker discovery published to date has been qualitative, highlighting the need for more robust quantitative approaches for such applications. Our initial application for comparative proteome analysis of human plasma following lipopolysaccharide (LPS) administration involved a semiquantitative strategy based on the total number of peptide identifications per protein (peptide hits or spectrum count) (74). In this study, standard SCX-LC-MS/MS analysis was performed at the 0-h time point (control) and a 9-h time point following LPS administration, and peptide hits were used to obtain a relative quantitative measure between the control and 9-h time point. Several known inflammatory response and acute phase proteins were observed to be up-regulated upon LPS administration. Several other studies have shown that this peptide hits approach can be used as a semiquantitative approach for initial screening when applied with proper controls and with adequate thresholds (90–93).

More recently, we have demonstrated 16O/18O labeling combined with the AMT tag strategy as an effective global quantitative approach for quantifying relative protein abundance differences in human plasma (31). When tryptic peptides are incubated in 18O water (55, 94) in the presence of trypsin, 18O atoms are incorporated into the carboxyl terminus of tryptically cleaved peptides via a postdigestion trypsin-catalyzed oxygen exchange reaction. The 16O/18O-labeled peptide pairs provide a 4-Da mass difference (Fig. 7A), which allows a high resolution mass spectrometer such as FTICR or TOF to effectively resolve the 16O- and 18O-labeled peptide pairs and accurately measure the relative abundances. The advantage is that all types of samples (e.g. tissues, cells, and biological fluids) can be effectively labeled using this simple and specific enzyme-catalyzed reaction. Fig. 7A shows a partial 2D display of detected peptide pairs in mass versus time dimensions. The 18O/16O-labeled peptides are readily visualized as co-eluting pairs (4 Da apart), and the abundance ratio can be precisely calculated for each 18O/16O pair. In this initial comparative analysis demonstration of two human plasma samples obtained from a healthy individual prior to (control) and following LPS administration, relative abundance differences between the two plasma samples were quantified for a total of 429 plasma proteins. Fig. 7B shows the normalized -fold changes in 429 quantified proteins and demon-


strates the significant changes in abundance for a set of proteins following LPS administration. The combined 16O/18O labeling-AMT tag strategy can also be easily coupled with subsequent peptide-level fractionation approaches such as cysteinyl peptide enrichment (55) and SCX fractionation.

Other stable isotope labeling methods based on relative peptide/protein abundance measurements, including metabolic labeling (95–97) and chemical labeling of specific functional groups using reagents such as ICAT (98) and iTRAQ (isobaric tags for relative and absolute quantitation) (99, 100), have been routinely used for quantitative proteomics analysis. In clinical proteomics applications, these stable isotope labeling techniques are well suited for accurately detecting changes in pairwise comparisons provided the samples can be effectively labeled; however, it is often challenging to compare across a large number of clinical samples. One alternative to the use of these labeling techniques is the use of a labeled reference sample (often a pooled composite) that is spiked into each normally processed individual clinical sample, which allows relative quantitation between each clinical sample and the reference sample and cross-comparison among the entire set of clinical samples. The 18O labeling strategy is well suited for generating such a labeled reference sample as all other clinical samples can be processed with natural 16O on the carboxyl termini without labeling; 16O/18O peptide pairs are formed after spiking the samples with the 18O-labeled reference.

Alternatively “label-free” direct quantitation approaches hold interest because of greater flexibility for comparative analyses and simpler sample processing procedures compared with labeling approaches. The isotope labeling and label-free approaches are complementary, and each approach has different sources of variation. Several initial studies suggest that normalized LC-MS peak intensities for detected peptides can be used to compare relative abundances between similar complex samples (101–103). It has been demonstrated that abundance ratios of separate model proteins may be predicted to within ~20% in complex proteome digests by using measured peptide ion intensities obtained in LC-MS analyses (101). Among the main challenges for label-free quantitation are the multiple issues that affect the usefulness of peptide peak intensities for relative quantitation, such as differences in electrospray ionization efficiencies among different peptides and different samples (37), differences in the amount of sample injected in each analysis, and sample preparation reproducibility. These issues are often peptide-dependent, leading to observed disparity among relative abundances of different peptides originating from the same protein. The significant bias and ion suppression effects caused by charge competition (ionization bias) during ESI (104) are often considered a major limitation for accurate label-free quantitation. Recent studies have demonstrated substantial advantages for ESI-MS analyses at nanoflow regimes (<100 nl/min) afforded by narrower inner diameter capillary columns for separations (36, 37). It is well demonstrated that smaller inner diameter columns with lower flow rates provide significantly higher sensitivity than larger inner diameter columns with higher flow rates (34) because of the significant improvements in both ionization and MS sampling efficiencies. Reversed phase packed nanoscale LC and monolithic nanoscale LC separations have been developed and coupled to ESI for improved ionization and quantitation (34, 105). As ionization efficiencies are increased for nanoelectrospray, detection biases are decreased because undesired matrix effects and/or ion suppression effects are either reduced or eliminated (104–106), providing the basis for improved quantitation. With further improvements to the robustness of these nano-LC-ESI-MS systems, label-free quantitation may be widely applied in clinical applications.

Another challenge for quantitative clinical proteomics applications is the variability introduced during multiple steps of sample processing. With continued development of cleanup products for more consistent performance and automated sample processing, such reproducibility issues may be minimized, leading to further improvements in quantitation when applying either the stable isotope labeling or label-free approaches.

IMPLICATIONS OF HUMAN HETEROGENEITY IN CLINICAL PROTEOMICS STUDIES

The ability to identify disease-specific differences by using a proteomics approach relies on multiple factors integral to the overall analysis pipeline. For example, when performing peptide-level measurements, achieving high peptide identification quality is a prerequisite for assuring confidence in all other downstream parameters (i.e. confidence in both protein identification and quantitation), whereas the ability to quantify differences between any two samples largely depends on the reproducibility of the overall platform. Due to inherent variations that stem from sample preparation and instrument analysis, technical replicates are often performed to evaluate and minimize technical variability arising from the overall analysis pipeline. Technical variability will be minimized as technologies continue to mature, and platforms will likely become more robust and reproducible; however, biological variability within the same comparative groups remains a challenge for identifying real differences between different conditions. Although ideally one would like to either control or minimize such biological variability by utilizing more controlled model systems such as cell cultures, an in vitro model system, or even inbred mouse strains, this is not always possible. Most clinical studies are based on “real world” human clinical samples where inherent human individual heterogeneity makes discovery efforts more difficult. The human heterogeneity challenge in proteomics studies stems from the high probability that two equally “healthy” individuals will have overall significantly different individual protein abundance levels when sampled at any given time. This heterogeneity can be



FIG. 8. Pearson correlation plot comparing peptide intensities of LC-FTICR analyses of plasma samples. A, nine technical replicates
for a pooled reference human plasma sample from multiple healthy subjects. B, nine human plasma samples from individual healthy subjects
with ages ranging from 18 to 26. C, nine mouse plasma samples isolated from individual C57BL6 mice. Each sample including the technical
replicate was separately processed by ProteomeLab IgY-12 (for human) or IgY-R7 (for mouse) depletion, and the flow-through portions were
digested with trypsin prior to LC-MS analyses.
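
The comparison summarized in this figure, an average Pearson correlation over all pairs of samples, can be sketched as below. This is an illustration only: the function names and toy intensities are hypothetical, and it assumes each sample is a list of intensities for the same peptides in the same order (whether intensities were transformed before correlation is not stated here).

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with >= 2 values")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # undefined if either list is constant


def pairwise_correlation_summary(samples):
    """Mean and S.D. of Pearson r over every pair of samples."""
    rs = [pearson(a, b) for a, b in combinations(samples, 2)]
    return mean(rs), stdev(rs)


if __name__ == "__main__":
    # Toy peptide intensities for three samples (same peptides, same order).
    samples = [
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],
        [1.0, 2.0, 3.0, 5.0],
    ]
    avg_r, sd_r = pairwise_correlation_summary(samples)
    print(round(avg_r, 3), round(sd_r, 3))
```

Nine samples give 36 pairs, so each reported average and S.D. would summarize 36 coefficients under this pairing scheme.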

due to individual genetic variability (e.g. gender and race) and/or to contributing environmental factors such as diet, overall health, and detrimental environmental exposures. The complexity of human diseases presents another degree of challenge. For example, in human cancer, each tumor type typically consists of a number of subtypes that differ with regard to their spectrum of genetic alterations (107). Therefore, a potential candidate biomarker of disease may be elevated only in a certain percentage of the pool of disease patients.

The implications of human heterogeneity in the context of LC-MS-based proteomics experiments center mostly on the measured quantitative values for peptide/protein identifications. Fig. 8 shows an initial evaluation of the technical variation and biological variations of human and mouse plasma samples based on the Pearson correlation of the identified peptide intensities between any two individual samples. The technical replicate results (Fig. 8A; nine individually processed samples from one pooled reference plasma) show overall good correlation (0.94 ± 0.02), which suggests relatively good reproducibility of the overall analytical platform. The increased variation among human subjects (Fig. 8B) is evident from the significantly reduced average correlation coefficients (0.85 ± 0.06) compared with the technical replicate results, whereas mouse plasma samples (Fig. 8C) show only slightly reduced correlation (0.92 ± 0.05), which suggests relatively small biological variation in these inbred mouse models. Such large variations observed among different healthy control subjects present a challenge for identifying disease-specific differences. To address these challenges and increase the confidence of discovery results, it is essential for the discovery platform to be able to analyze a relatively large number of clinical samples in a high throughput manner to obtain sufficient statistical power.

Other proteomics studies have also described the effects of human heterogeneity in specific model systems. Hu et al. (15) performed a limited study that compared both intra- and interindividual variability of human cerebrospinal fluid samples obtained from six individuals. Specific proteins were observed to fluctuate over time within the same individual, but overall there was a higher concordance of intraindividual results than across individuals. Interestingly results from measuring intraindividual protein levels suggested that certain proteins tended to fluctuate more than others, calling into question the effectiveness of using these proteins as potential disease markers. Other studies include a report by Zhan and Desiderio (108) that showed the heterogeneity in 2D gel electrophoresis human pituitary proteome analysis and an interesting review by Mann et al. (109) that overviewed the effects of genotypic and phenotypic variations in evaluations of the hemostatic proteome. They reported that “normal” pro- and anticoagulant concentrations were observed to vary significantly and influence downstream responses, demonstrating how heterogeneity in individual phenotypes should influence diagnosis and therapy for hemorrhagic and thrombotic diseases.


Designing experiments to minimize biological variability is imperative for clinical studies. One example is to analyze a serial sample set, i.e. plasma or biopsy tissue samples, from the same individual over a time course or disease progression; this in theory will alleviate a majority of heterogeneity effects, but such samples are traditionally more difficult to obtain, and most patients do not have a “control” blood or tissue sample in storage for comparison against a possible disease diagnosis. For most studies that use cross-sectional approaches, it is desirable to match the patients and controls in terms of age, sex, race, weight, and even diet if possible. A recent study reported that pooling can reduce the effects of biological variation in microarray studies while retaining the accuracy of identifying differentially expressed genes, provided that biological replicates are retained in the study design, with the additional benefit of greatly reducing the total number of samples to be analyzed (110). Such a strategy might be explored and extended to clinical proteomics studies.

A further aspect of heterogeneity is the presence of protein isoforms, splice variants, specific amino acid mutations, proteolytic products, and other post-translational modifications that are likely present in individual samples but are most often not explicitly included as sequences in the searchable protein database. This exclusion makes it challenging for traditional LC-MS/MS-based bottom-up approaches to identify such modified proteins and is possibly one of the main reasons that a large percentage of MS/MS spectra in clinical analyses remain unidentified. The identification of amino acid-specific post-translational modifications (e.g. phosphorylation, glycosylation, glycation, nitration, oxidation, and deamination) challenges MS/MS-based approaches due to the vast variety of possible modifications and the potentially high false positive rates that originate from database searching. Because it is recognized that many protein biomarkers may be specific protein isoforms or modified proteins, further technical developments for more effective identification and quantitation of protein isoforms and modifications would be highly desirable.

As an alternative to identifying protein isoforms and modifications, intact protein-level separations can be used to separate different protein isoforms on the basis of their different masses or other properties. The ability to use 2D gel electrophoresis for resolving different isoforms and monitoring their abundance changes has been well documented (111). The recently developed multidimensional intact protein analysis system (IPAS) separates intact proteins on the basis of charge, hydrophobicity, and molecular mass; quantitation is achieved by protein tagging with fluorophores (43). The potential for revealing different protein isoforms and specific protein cleavage products in human plasma/serum also has been demonstrated (49). The advantages offered by intact protein analysis complement the bottom-up proteomics approaches, and better integration of these two approaches may lead to more effective biomarker discovery.

TARGETED PROTEOMICS APPROACHES

The majority of proteomics applications in the search for candidate biomarkers to date have relied on global proteome characterization aimed at identifying multiple protein differences (candidate biomarkers) that correlate with specific human diseases; however, as discussed previously, there are many challenges associated with applying such a strategy to the discovery of low abundance candidate marker proteins. An alternative strategy for biomarker discovery that complements global profiling is the targeted proteomics approach that involves quantitative MS to measure a hypothesis-generated list of candidates (112). The targeted proteomics strategy often provides greater sensitivity and allows for detection of low abundance candidate proteins. Anderson and Hunter (113) recently demonstrated the use of peptide multiple reaction monitoring (MRM) for quantitative assaying of major plasma proteins. Such MRM assays provide great specificity for peptide/protein identifications and relatively good precision for quantitation. Additionally, MRM can provide a rapid and specific platform for biomarker validation, particularly when coupled with specific enrichment techniques such as the recently published SISCAPA (Stable Isotope Standards and Capture by Anti-Peptide Antibodies) method for enriching target peptides using anti-peptide antibodies (114). Activity-based protein profiling is another strategy that uses chemical probes for tagging, enriching, and isolating a specific subset of physiologically important proteins on the basis of enzymatic activity (115, 116). Coupling such strategies with LC-MS holds potential for eliminating many issues related to the dynamic range of protein abundance.

A continuing issue for current LC-MS-based profiling approaches is that many of the detected species or features from LC-MS and LC-MS/MS analyses remain unidentified. Based on our experience, ~80% of MS/MS spectra on average are not confidently identified via database searching, and more than 50% of LC-FTICR-detected features remain unidentified by the AMT tag approach. Present informatics tools and statistical algorithms have been able to utilize intensity information of these unidentified features to identify “interesting” features as potential biomarkers for specific diseases; effectively targeting these interesting features using data-directed or targeted MS/MS approaches is of current interest. One of the informatics challenges associated with identifying these features concerns different post-translational modifications. Current commercial mass spectrometers such as the LTQ offer a targeted MS/MS capability based on the selection of a list of m/z values. Developing an advanced targeted MS/MS approach (117) that incorporates “smart selection” of the targets and different but complementary fragmentation techniques will be
an integral component for an effective LC-MS profiling platform suitable for clinical applications.

CONCLUSIONS AND PERSPECTIVES

The amount of effort placed into the development and application of effective proteomics profiling of serum/plasma and other clinical samples has increased tremendously over the last several years. With the emergence of more effective LC-MS technologies and the variety of fractionation approaches, the number of proteins detectable in human plasma by global profiling has been greatly expanded (e.g. 889 proteins with >95% confidence reported in the recent HUPO study and 1,494 proteins with >99% confidence, including confident identification of many low ng/ml level plasma proteins, in our recent study (59)). Although this level of detection still falls short of the 10 orders of magnitude in dynamic range that encompasses plasma protein abundances, it nonetheless offers significant potential for the discovery of novel candidate biomarkers from clinical plasma/serum samples.

Currently there is no single platform that represents the “best” technology for such discovery applications, and integration of multiple technologies is often required for detection and quantitation of low abundance proteins. The need for improved reproducibility, throughput, dynamic range, and quantitation will continue to drive technology development and improvement efforts. Importantly, several new technological developments such as fast LC separations, gas phase IMS separations, and high efficiency nano-ESI interfaces presently appear promising for future discovery platforms and applications. With improvements in quantitation accuracy, throughput, and robustness, the LC-MS protein profiling platform may eventually become a powerful tool for clinical diagnostic testing that provides simultaneous measurements of a large number of clinically relevant analytes.

An important component of any integrated profiling platform not previously discussed is the informatics and statistical analysis. The development of more effective software packages will be essential for processing the large number of LC-MS datasets, which may include peak (or feature) detection, run-to-run feature alignment, intensity normalization, feature matching to the database, and statistical analysis to generate a list of high confidence potential candidates.

Finally, due to the complexity of large scale clinical proteomics studies, collaborative efforts from multiple laboratories with different platforms may be required for benchmarking and better cross-validation of the discovery results and for eliminating potential biases introduced into any given platform. This implies that a common set of standards is needed so that platform performance in different laboratories may be readily compared and large scale proteomics datasets can be effectively exchanged and shared.

Acknowledgments: The contributions of Marina Gritsenko, Hongliang Jiang, Matt Monroe, Ron Moore, Tom Metz, Angela Norbeck, Sam Purvine, and Yufeng Shen to the work reviewed here are gratefully acknowledged.

* Portions of the reviewed research were supported by the United States Department of Energy (DOE) Office of Biological and Environmental Research; the National Institutes of Health through the National Center for Research Resources Grant RR018522, NIGMS Large Scale Collaborative Research Grant U54 GM-62119-02, NIDDK Grant R21 DK070146, and NIDA Grant 1P30DA01562501; the Entertainment Industry Foundation (EIF) and the EIF Women’s Cancer Research Fund; and the Laboratory Directed Research Development program at Pacific Northwest National Laboratory. Our laboratories are located in the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the DOE and located at Pacific Northwest National Laboratory, which is operated by Battelle Memorial Institute for the DOE under Contract DE-AC05-76RL0 1830. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ To whom correspondence should be addressed: Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, P. O. Box 999, MSIN: K8-98, Richland, WA 99352. E-mail: rds@pnl.gov.

REFERENCES

1. Aebersold, R., and Mann, M. (2003) Mass spectrometry-based proteomics. Nature 422, 198–207
2. Hanash, S. (2003) Disease proteomics. Nature 422, 226–232
3. Etzioni, R., Urban, N., Ramsey, S., McIntosh, M., Schwartz, S., Reid, B., Radich, J., Anderson, G., and Hartwell, L. (2003) The case for early detection. Nat. Rev. Cancer 3, 243–252
4. Ludwig, J. A., and Weinstein, J. N. (2005) Biomarkers in cancer staging, prognosis and treatment selection. Nat. Rev. Cancer 5, 845–856
5. Anderson, N. L., and Anderson, N. G. (2002) The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell. Proteomics 1, 845–867
6. Zhou, G., Li, H., DeCamp, D., Chen, S., Shu, H., Gong, Y., Flaig, M., Gillespie, J. W., Hu, N., Taylor, P. R., Emmert-Buck, M. R., Liotta, L. A., Petricoin, E. F., III, and Zhao, Y. (2002) 2D differential in-gel electrophoresis for the identification of esophageal scans cell cancer-specific protein markers. Mol. Cell. Proteomics 1, 117–124
7. Zangar, R. C., Varnum, S. M., and Bollinger, N. (2005) Studying cellular processes and detecting disease with protein microarrays. Drug Metab. Rev. 37, 473–487
8. Janzi, M., Odling, J., Pan-Hammarstrom, Q., Sundberg, M., Lundeberg, J., Uhlen, M., Hammarstrom, L., and Nilsson, P. (2005) Serum microarrays for large scale screening of protein levels. Mol. Cell. Proteomics 4, 1942–1947
9. Uhlen, M., Bjorling, E., Agaton, C., Szigyarto, C. A., Amini, B., Andersen, E., Andersson, A. C., Angelidou, P., Asplund, A., Asplund, C., Berglund, L., Bergstrom, K., Brumer, H., Cerjan, D., Ekstrom, M., Elobeid, A., Eriksson, C., Fagerberg, L., Falk, R., Fall, J., Forsberg, M., Bjorklund, M. G., Gumbel, K., Halimi, A., Hallin, I., Hamsten, C., Hansson, M., Hedhammar, M., Hercules, G., Kampf, C., Larsson, K., Lindskog, M., Lodewyckx, W., Lund, J., Lundeberg, J., Magnusson, K., Malm, E., Nilsson, P., Odling, J., Oksvold, P., Olsson, I., Oster, E., Ottosson, J., Paavilainen, L., Persson, A., Rimini, R., Rockberg, J., Runeson, M., Sivertsson, A., Skollermo, A., Steen, J., Stenvall, M., Sterky, F., Stromberg, S., Sundberg, M., Tegel, H., Tourle, S., Wahlund, E., Walden, A., Wan, J., Wernerus, H., Westberg, J., Wester, K., Wrethagen, U., Xu, L. L., Hober, S., and Ponten, F. (2005) A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol. Cell. Proteomics 4, 1920–1932
10. Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., Angell, N. H., Smith, R. D., Springer, D. L., and Pounds, J. G. (2002) Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell. Proteomics 1, 947–955
11. Jacobs, J. M., Adkins, J. N., Qian, W. J., Liu, T., Shen, Y., Camp, D. G., II, and Smith, R. D. (2005) Utilizing human blood plasma for proteomic biomarker discovery. J. Proteome Res. 4, 1073–1085
12. Veenstra, T. D., Conrads, T. P., Hood, B. L., Avellino, A. M., Ellenbogen,
R. G., and Morrison, R. S. (2005) Biomarkers: mining the biofluid proteome. Mol. Cell. Proteomics 4, 409–418
13. Lee, H. J., Lee, E. Y., Kwon, M. S., and Paik, Y. K. (2006) Biomarker discovery from the plasma proteome using multidimensional fractionation proteomics. Curr. Opin. Chem. Biol. 10, 42–49
14. Wright, M. E., Han, D. K., and Aebersold, R. (2005) Mass spectrometry-based expression profiling of clinical prostate cancer. Mol. Cell. Proteomics 4, 545–554
15. Hu, Y., Malone, J. P., Fagan, A. M., Townsend, R. R., and Holtzman, D. M. (2005) Comparative proteomic analysis of intra- and interindividual variation in human cerebrospinal fluid. Mol. Cell. Proteomics 4, 2000–2009
16. Wattiez, R., and Falmagne, P. (2005) Proteomics of bronchoalveolar lavage fluid. J. Chromatogr. B Anal. Technol. Biomed. Life Sci. 815, 169–178
17. Liao, H., Wu, J., Kuhn, E., Chin, W., Chang, B., Jones, M. D., O’Neil, S., Clauser, K. R., Karl, J., Hasler, F., Roubenoff, R., Zolg, W., and Guild, B. C. (2004) Use of mass spectrometry to identify protein biomarkers of disease severity in the synovial fluid and serum of patients with rheumatoid arthritis. Arthritis Rheum. 0, 3792–3803
18. Varnum, S. M., Covington, C. C., Woodbury, R. L., Petritis, K., Kangas, L. J., Abdullah, M. S., Pounds, J. G., Smith, R. D., and Zangar, R. C. (2003) Proteomic characterization of nipple aspirate fluid: identification of potential biomarkers of breast cancer. Breast Cancer Res. Treat. 80, 87–97
19. Xie, H., Rhodus, N. L., Griffin, R. J., Carlis, J. V., and Griffin, T. J. (2005) A catalogue of human saliva proteins identified by free flow electrophoresis-based peptide separation and tandem mass spectrometry. Mol. Cell. Proteomics 4, 1826–1830
20. Theodorescu, D., Wittke, S., Ross, M. M., Walden, M., Conaway, M., Just, I., Mischak, H., and Frierson, H. F. (2006) Discovery and validation of new protein biomarkers for urothelial cancer: a prospective analysis. Lancet Oncol. 7, 230–240
21. Celis, J. E., Gromov, P., Cabezon, T., Moreira, J. M., Ambartsumian, N., Sandelin, K., Rank, F., and Gromova, I. (2004) Proteomic characterization of the interstitial fluid perfusing the breast tumor microenvironment: a novel resource for biomarker and therapeutic target discovery. Mol. Cell. Proteomics 3, 327–344
22. Yates, J. R., III, Eng, J. K., and McCormack, A. L. (1995) Mining genomes: correlating tandem mass spectra of modified and unmodified peptides to sequences in nucleotide databases. Anal. Chem. 67, 3202–3210
23. Perkins, D., Pappin, D., Creasy, D., and London, U. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567
24. Craig, R., and Beavis, R. C. (2004) TANDEM: matching proteins with tandem mass spectra. Bioinformatics 20, 1466–1467
25. Mayya, V., Rezaul, K., Cong, Y. S., and Han, D. (2005) Systematic comparison of a two-dimensional ion trap and a three-dimensional ion trap mass spectrometer in proteomics. Mol. Cell. Proteomics 4, 214–223
26. Wolters, D. A., Washburn, M. P., and Yates, J. R. (2001) An automated multidimensional protein identification technology for shotgun proteomics. Anal. Chem. 73, 5683–5690
27. Wang, H., Qian, W. J., Chin, M. H., Petyuk, V. A., Barry, R. C., Liu, T., Gritsenko, M. A., Mottaz, H. M., Moore, R. J., Camp, D. G., II, Khan, A. H., Smith, D. J., and Smith, R. D. (2006) Characterization of the mouse brain proteome using global proteomic analysis complemented with cysteinyl-peptide enrichment. J. Proteome Res. 5, 361–369
28. Tabb, D. L., MacCoss, M. J., Wu, C. C., Anderson, S. D., and Yates, J. R. (2003) Similarity among tandem mass spectra from proteomic experiments: detection, significance, and utility. Anal. Chem. 75, 2470–2477
29. Smith, R. D., Anderson, G. A., Lipton, M. S., Pasa-Tolic, L., Shen, Y., Conrads, T. P., Veenstra, T. D., and Udseth, H. R. (2002) An accurate mass tag strategy for quantitative and high throughput proteome measurements. Proteomics 2, 513–523
30. Qian, W. J., Camp, D. G., and Smith, R. D. (2004) High throughput proteomics using Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. Expert Rev. Proteomics 1, 89–97
31. Qian, W. J., Monroe, M. E., Liu, T., Jacobs, J. M., Anderson, G. A., Shen, Y., Moore, R. J., Anderson, D. J., Zhang, R., Calvano, S. E., Lowry, S. F., Xiao, W., Moldawer, L. L., Davis, R. W., Tompkins, R. G., Camp, D. G., and Smith, R. D. (2005) Quantitative proteome analysis of human plasma following in vivo lipopolysaccharide administration using 16O/18O labeling and the accurate mass and time tag approach. Mol. Cell. Proteomics 4, 700–709
32. Qian, W. J., Liu, T., Monroe, M. E., Strittmatter, E. F., Jacobs, J. M., Kangas, L. J., Petritis, K., Camp, D. G., and Smith, R. D. (2005) Probability-based evaluation of peptide and protein identifications from tandem mass spectrometry and SEQUEST analysis: the human proteome. J. Proteome Res. 4, 53–62
33. Tolley, L., Jorgenson, J. W., and Moseley, M. A. (2001) Very high pressure gradient LC/MS/MS. Anal. Chem. 73, 2985–2991
34. Shen, Y., Zhao, R., Berger, S. J., Anderson, G. A., Rodriguez, N., and Smith, R. D. (2002) High-efficiency nanoscale liquid chromatography coupled on-line with mass spectrometry using nanoelectrospray ionization for proteomics. Anal. Chem. 74, 4235–4249
35. Shen, Y., Zhang, R., Moore, R. J., Kim, J., Metz, T. O., Hixson, K. K., Zhao, R., Livesay, E. A., Udseth, H. R., and Smith, R. D. (2005) Automated 20 kpsi RPLC-MS and MS/MS with chromatographic peak capacities of 1000–1500 and capabilities in proteomics and metabolomics. Anal. Chem. 77, 3090–3100
36. Wilm, M. S., and Mann, M. (1994) Electrospray and Taylor-Cone theory, Dole’s beam of macromolecules at last? Int. J. Mass Spectrom. Ion Process. 136, 167–180
37. Smith, R. D., Shen, Y., and Tang, K. (2004) Ultrasensitive and quantitative analyses from combined separations-mass spectrometry for the characterization of proteomes. Acc. Chem. Res. 37, 269–278
38. Zolotarjova, N., Martosella, J., Nicol, G., Bailey, J., Boyes, B. E., and Barrett, W. C. (2005) Differences among techniques for high-abundant protein depletion. Proteomics 5, 3304–3313
39. Huang, L., Harvie, G., Feitelson, J. S., Gramatikoff, K., Herold, D. A., Allen, D. L., Amunngama, R., Hagler, R. A., Pisano, M. R., Zhang, W. W., and Fang, X. (2005) Immunoaffinity separation of plasma proteins by IgY microbeads: meeting the needs of proteomic sample preparation and analysis. Proteomics 5, 3314–3328
40. Echan, L. A., Tang, H. Y., Ali-Khan, N., Lee, K., and Speicher, D. W. (2005) Depletion of multiple high-abundance proteins improves protein profiling capacities of human serum and plasma. Proteomics 5, 3292–3303
41. Cho, S. Y., Lee, E. Y., Lee, J. S., Kim, H. Y., Park, J. M., Kwon, M. S., Park, Y. K., Lee, H. J., Kang, M. J., Kim, J. Y., Yoo, J. S., Park, S. J., Cho, J. W., Kim, H. S., and Paik, Y. K. (2005) Efficient prefractionation of low-abundance proteins in human plasma and construction of a two-dimensional map. Proteomics 5, 3386–3396
42. Liu, T., Qian, W. J., Mottaz, H. M., Gritsenko, M. A., Norbeck, A. D., Moore, R. J., Purvine, S. O., Camp, D. G., II, and Smith, R. D. (July 19, 2006) Evaluation of multiprotein immunoaffinity subtraction for plasma proteomics and candidate biomarker discovery using mass spectrometry. Mol. Cell. Proteomics 10.1074/mcp.T600039-MCP200
43. Wang, H., Clouthier, S. G., Galchev, V., Misek, D. E., Duffner, U., Min, C. K., Zhao, R., Tra, J., Omenn, G. S., Ferrara, J. L., and Hanash, S. M. (2005) Intact-protein-based high-resolution three-dimensional quantitative analysis system for proteome profiling of biological fluids. Mol. Cell. Proteomics 4, 618–625
44. Wang, H., and Hanash, S. (2005) Intact-protein based sample preparation strategies for proteome analysis in combination with mass spectrometry. Mass Spectrom. Rev. 24, 413–426
45. Sheng, S., Chen, D., and Van Eyk, J. E. (2006) Multidimensional liquid chromatography separation of intact proteins by chromatographic focusing and reversed phase of the human serum proteome: optimization and protein database. Mol. Cell. Proteomics 5, 26–34
46. Barnea, E., Sorkin, R., Ziv, T., Beer, I., and Admon, A. (2005) Evaluation of prefractionation methods as a preparatory step for multidimensional based chromatography of serum proteins. Proteomics 5, 3367–3375
47. Moritz, R. L., Clippingdale, A. B., Kapp, E. A., Eddes, J. S., Ji, H., Gilbert, S., Connolly, L. M., and Simpson, R. J. (2005) Application of 2-D free-flow electrophoresis/RP-HPLC for proteomic analysis of human plasma depleted of multi high-abundance proteins. Proteomics 5, 3402–3413
48. Heller, M., Michel, P. E., Morier, P., Crettaz, D., Wenz, C., Tissot, J. D., Reymond, F., and Rossier, J. S. (2005) Two-stage Off-Gel isoelectric focusing: protein followed by peptide fractionation and application to proteome analysis of human plasma. Electrophoresis 26, 1174–1188
49. Misek, D. E., Kuick, R., Wang, H., Galchev, V., Deng, B., Zhao, R., Tra, J., Pisano, M. R., Amunugama, R., Allen, D., Walker, A. K., Strahler, J. R.,
Andrews, P., Omenn, G. S., and Hanash, S. M. (2005) A wide range of protein isoforms in serum and plasma uncovered by a quantitative intact protein analysis system. Proteomics 5, 3343–3352
50. Tang, H. Y., Ali-Khan, N., Echan, L. A., Levenkova, N., Rux, J. J., and Speicher, D. W. (2005) A novel four-dimensional strategy combining protein and peptide separation methods enables detection of low-abundance proteins in human plasma and serum proteomes. Proteomics 5, 3329–3342
51. Herbert, B., and Righetti, P. G. (2000) A turning point in proteome analysis: sample prefractionation via multicompartment electrolyzers with isoelectric membranes. Electrophoresis 21, 3639–3648
52. Tu, C. J., Dai, J., Li, S. J., Sheng, Q. H., Deng, W. J., Xia, Q. C., and Zeng, R. (2005) High-sensitivity analysis of human plasma proteome by immobilized isoelectric focusing fractionation coupled to mass spectrometry identification. J. Proteome Res. 4, 1265–1273
53. Andersen, J. S., Lam, Y. W., Leung, A. K., Ong, S. E., Lyon, C. E., Lamond, A. I., and Mann, M. (2005) Nucleolar proteome dynamics. Nature 433, 77–83
54. Jin, W. H., Dai, J., Li, S. J., Xia, Q. C., Zou, H. F., and Zeng, R. (2005) Human plasma proteome analysis by multidimensional chromatography prefractionation and linear ion trap mass spectrometry identification. J. Proteome Res. 4, 613–619
55. Liu, T., Qian, W. J., Strittmatter, E. F., Camp, D. G., Anderson, G. A., Thrall, B. D., and Smith, R. D. (2004) High throughput comparative proteome analysis using a quantitative cysteinyl-peptide enrichment technology. Anal. Chem. 76, 5345–5353
56. Zhang, H., Li, X.-j., Martin, D. B., and Aerbersold, R. (2003) Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 21, 660–665
57. Liu, T., Qian, W. J., Gritsenko, M. A., Camp, D. G., II, Monroe, M. E., Moore, R. J., and Smith, R. D. (2005) Human plasma N-glycoproteome analysis by immunoaffinity subtraction, hydrazide chemistry, and mass spectrometry. J. Proteome Res. 4, 2070–2080
58. Yang, Z. P., Hancock, W. S., Chew, T. R., and Bonilla, L. (2005) A study of glycoproteins in human serum and plasma reference standards (HUPO) using multilectin affinity chromatography coupled with RPLC-MS/MS. Proteomics 5, 3353–3366
59. Liu, T., Qian, W. J., Gritsenko, M. A., Xiao, W., Moldawer, L. L., Kaushal, A., Monroe, M. E., Varnum, S. M., Moore, R. J., Purvine, S. O., Maier, R. V., Davis, R. W., Tompkins, R. G., Camp, D. G., II, and Smith, R. D. (June 8, 2006) High dynamic range characterization of the trauma patient plasma proteome. Mol. Cell. Proteomics 10.1074/mcp.M600068-MCP200
60. Shen, Y., Smith, R. D., Unger, K. K., Kumar, D., and Lubda, D. (2005) Ultrahigh-throughput proteomics using fast RPLC separations with ESI-MS/MS. Anal. Chem. 77, 6692–6701
61. Chen, H. S., Rejtar, T., Andreev, V., Moskovets, E., and Karger, B. L. (2005) High-speed, high-resolution monolithic capillary LC-MALDI MS using an off-line continuous deposition interface for proteomic analysis. Anal. Chem. 77, 2323–2331
62. Xie, J., Miao, Y., Shih, J., Tai, Y. C., and Lee, T. D. (2005) Microfluidic platform for liquid chromatography-tandem mass spectrometry analyses of complex peptide mixtures. Anal. Chem. 77, 6947–6953
63. He, B., and Regnier, F. (1998) Microfabricated liquid chromatography columns based on collocated monolith support structures. J. Pharm. Biomed. Anal. 17, 925–932
64. Li, J., LeRiche, T., Tremblay, T. L., Wang, C., Bonneil, E., Harrison, D. J., and Thibault, P. (2002) Application of microfluidic devices to proteomics research: identification of trace-level protein digests and affinity capture of target peptides. Mol. Cell. Proteomics 1, 157–168
65. Srebalus, C. A., Li, J., Marshall, W. S., and Clemmer, D. E. (2000) Determining synthetic failures in combinatorial libraries by hybrid gas-phase separation methods. J. Am. Soc. Mass Spectrom. 11, 352–355
66. Henderson, S. C., Valentine, S. J., Counterman, A. E., and Clemmer, D. E. (1999) ESI/ion trap/ion mobility/time-of-flight mass spectrometry for rapid and sensitive analysis of biomolecular mixtures. Anal. Chem. 71, 291–301
67. Valentine, S. J., Kulchania, M., Srebalus Barnes, C. A., and Clemmer, D. E. (2001) Multidimensional separations of complex peptide mixtures: a combined high-performance liquid chromatography/ion mobility/time-of-flight mass spectrometry approach. Int. J. Mass Spectrom. 212, 97–109
68. Tang, K., Shvartsburg, A. A., Lee, H. N., Prior, D. C., Buschbach, M. A., Li, F., Tolmachev, A. V., Anderson, G. A., and Smith, R. D. (2005) High-sensitivity ion mobility spectrometry/mass spectrometry using electrodynamic ion funnel interfaces. Anal. Chem. 77, 3330–3339
69. Shen, Y., Jacobs, J. M., Camp, D. G., Fang, R., Moore, R. J., Smith, R. D., Xiao, W., Davis, R. W., and Tompkins, R. G. (2004) High efficiency SCXLC/RPLC/MS/MS for high dynamic range characterization of the human plasma proteome. Anal. Chem. 76, 1134–1144
70. Anderson, N. L., Polanski, M., Pieper, R., Gatlin, T., Tirumalai, R. S., Conrads, T. P., Veenstra, T. D., Adkins, J. N., Pounds, J. G., Fagan, R., and Lobley, A. (2004) The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol. Cell. Proteomics 3, 311–316
71. States, D. J., Omenn, G. S., Blackwell, T. W., Fermin, D., Eng, J., Speicher, D. W., and Hanash, S. M. (2006) Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study. Nat. Biotechnol. 24, 333–338
72. Peng, J., Elias, J. E., Thoreen, C. C., Licklider, L. J., and Gygi, S. P. (2003) Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J. Proteome Res. 2, 43–50
73. Tirumalai, R. S., Chan, K. C., Prieto, D. A., Issaq, H. J., Conrads, T. P., and Veenstra, T. D. (2003) Characterization of the low molecular weight human serum proteome. Mol. Cell. Proteomics 2, 1096–1103
74. Qian, W. J., Jacobs, J. M., Camp II, D. G., Monroe, M. E., Moore, R. J., Gritsenko, M. A., Calvano, S. E., Lowry, S. F., Xiao, W., Moldawer, L. L., Davis, R. W., Tompkins, R. G., and Smith, R. D. (2005) Comparative proteome analyses of human plasma following in vivo lipopolysaccharide administration using multidimensional separations coupled with tandem mass spectrometry. Proteomics 5, 572–584
75. Washburn, M. P., Wolters, D., and Yates, J. R. (2001) Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat. Biotechnol. 19, 242–247
76. Xie, H., and Griffin, T. J. (2006) Trade-off between high sensitivity and increased potential for false positive peptide sequence matches using a two-dimensional linear ion trap for tandem mass spectrometry-based proteomics. J. Proteome Res. 5, 1003–1009
77. Omenn, G. S., States, D. J., Adamski, M., Blackwell, T. W., Menon, R., Hermjakob, H., Apweiler, R., Haab, B. B., Simpson, R. J., Eddes, J. S., Kapp, E. A., Moritz, R. L., Chan, D. W., Rai, A. J., Admon, A., Aebersold, R., Eng, J., Hancock, W. S., Hefta, S. A., Meyer, H., Paik, Y. K., Yoo, J. S., Ping, P., Pounds, J., Adkins, J., Qian, X., Wang, R., Wasinger, V., Wu, C. Y., Zhao, X., Zeng, R., Archakov, A., Tsugita, A., Beer, I., Pandey, A., Pisano, M., Andrews, P., Tammen, H., Speicher, D. W., and Hanash, S. M. (2005) Overview of the HUPO Plasma Proteome Project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 5, 3226–3245
78. Hood, B. L., Zhou, M., Chan, K. C., Lucas, D. A., Kim, G. J., Issaq, H. J., Veenstra, T. D., and Conrads, T. P. (2005) Investigation of the mouse serum proteome. J. Proteome Res. 4, 1561–1568
79. Keller, A., Nesvizhskii, A. I., Kolker, E., and Aebersold, R. (2002) Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383–5392
80. Nesvizhskii, A. I., Keller, A., Kolker, E., and Aebersold, R. (2003) A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658
81. MacCoss, M. J., Wu, C. C., and Yates, J. R. (2002) Probability-based validation of protein identifications using a modified SEQUEST algorithm. Anal. Chem. 74, 5593–5599
82. Anderson, D. C., Li, W., Payan, D. G., and Noble, W. S. (2003) A new algorithm for the evaluation of shotgun peptide sequencing in proteomics: support vector machine classification of peptide MS/MS spectra and SEQUEST scores. J. Proteome Res. 2, 137–146
83. Fenyo, D., and Beavis, R. C. (2003) A method for assessing the statistical significance of mass spectrometry-based protein identifications using general scoring schemes. Anal. Chem. 75, 768–774
84. Henzel, W. J., Billeci, T. M., Stults, J. T., Wong, S. C., Grimley, C., and Watanabe, C. (1993) Identifying proteins from two-dimensional gels by
molecular mass searching of peptide fragments in protein sequence databases. Proc. Natl. Acad. Sci. U. S. A. 90, 5011–5015
85. Pappin, D. J., Hojrup, P., and Bleasby, A. J. (1993) Rapid identification of proteins by peptide-mass fingerprinting. Curr. Biol. 3, 327–332
86. Yates, J. R., Speicher, S., Griffin, P. R., and Hunkapiller, T. (1993) Peptide mass maps: a highly informative approach to protein identification. Anal. Biochem. 214, 397–408
87. Zimmer, J. S., Monroe, M. E., Qian, W. J., and Smith, R. D. (2006) Advances in proteomics data analysis and display using an accurate mass and time tag approach. Mass Spectrom. Rev. 25, 450–482
88. Olsen, J. V., and Mann, M. (2004) Improved peptide identification in proteomics by two consecutive stages of mass spectrometric fragmentation. Proc. Natl. Acad. Sci. U. S. A. 101, 13417–13422
89. Dieguez-Acuna, F. J., Gerber, S. A., Kodama, S., Elias, J. E., Beausoleil, S. A., Faustman, D., and Gygi, S. P. (2005) Characterization of mouse spleen cells by subtractive proteomics. Mol. Cell. Proteomics 4, 1459–1470
90. Gao, J., Opiteck, G. J., Friedrichs, M. S., Dongre, A. R., and Hefta, S. A. (2003) Changes in the protein expression of yeast as a function of carbon source. J. Proteome Res. 2, 643–649
91. Liu, H., Sadygov, R. G., and Yates, J. R. (2004) A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201
92. Jacobs, J. M., Diamond, D. L., Chan, E. Y., Gritsenko, M. A., Qian, W. J., Stastna, M., Camp, D. G., Rice, C. M., Carithers, R. L., Katze, M. G., and Smith, R. D. (2005) Proteome analysis of Huh-7.5 cells containing full-length hepatitis C virus replicon and application to HCV infected liver biopsy samples. J. Virol. 79, 7558–7569
93. Zybailov, B., Coleman, M. K., Florens, L., and Washburn, M. P. (2005) Correlation of relative abundance ratios derived from peptide ion chromatograms and spectrum counting for quantitative proteomic analysis using stable isotope labeling. Anal. Chem. 77, 6218–6224
94. Heller, M., Mattou, H., Menzel, C., and Yao, X. (2003) Trypsin catalyzed 16O-to-18O exchange for comparative proteomics: tandem mass spectrometry comparison using MALDI-TOF, ESI-QTOF, and ESI-ion trap mass spectrometers. J. Am. Soc. Mass Spectrom. 14, 704–718
95. Pasa-Tolic, L., Jensen, P. K., Anderson, G. A., Lipton, M. S., Peden, K. K., Martinovic, S., Tolic, N., Bruce, J. E., and Smith, R. D. (1999) High throughput proteome-wide precision measurements of protein expression using mass spectrometry. J. Am. Chem. Soc. 121, 7949–7950
96. Oda, Y., Huang, K., Cross, F. R., Cowburn, D., and Chait, B. T. (1999) Accurate quantitation of protein expression and site-specific phosphorylation. Proc. Natl. Acad. Sci. U. S. A. 96, 6591–6596
97. Ong, S. E., Blagoev, B., Kratchmarova, I., Kristensen, D. B., Steen, H., Pandey, A., and Mann, M. (2002) Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376–386
98. Gygi, S. P., Rist, B., Gerber, S. A., Turecek, F., Gelb, M. H., and Aebersold, R. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999
99. Zhang, Y., Wolf-Yadlin, A., Ross, P. L., Pappin, D. J., Rush, J., Lauffenburger, D. A., and White, F. M. (2005) Time-resolved mass spectrometry of tyrosine phosphorylation sites in the epidermal growth factor receptor signaling network reveals dynamic modules. Mol. Cell. Proteomics 4, 1240–1250
100. DeSouza, L., Diehl, G., Rodrigues, M. J., Guo, J., Romaschin, A. D., Colgan, T. J., and Siu, K. W. (2005) Search for cancer markers from endometrial tissues using differentially labeled tags iTRAQ and cICAT with multidimensional liquid chromatography and tandem mass spectrometry. J. Proteome Res. 4, 377–386
101. Wang, W., Zhou, H., Lin, H., Roy, S., Shaler, T. A., Hill, L. R., Norton, S., Kumar, P., Anderle, M., and Beker, C. H. (2003) Quantification of proteins and metabolites by mass spectrometry without isotope labeling or spiked standards. Anal. Chem. 75, 4818–4826
102. Chelius, D., and Bondarenko, P. V. (2002) Quantitative profiling of proteins in complex mixtures using liquid chromatography and mass spectrometry. J. Proteome Res. 1, 317–323
103. Fang, R., Elias, D. A., Monroe, M. E., Shen, Y., McIntosh, M., Wang, P., Goddard, C. D., Callister, S. J., Moore, R. J., Gorby, Y. A., Adkins, J. N., Fredrickson, J. K., Lipton, M. S., and Smith, R. D. (2006) Differential label-free quantitative proteomic analysis of Shewanella oneidensis cultured under aerobic and suboxic conditions by accurate mass and time tag approach. Mol. Cell. Proteomics 5, 714–725
104. Tang, K., Page, J. S., and Smith, R. D. (2004) Charge competition and the linear dynamic range of detection in electrospray ionization mass spectrometry. J. Am. Soc. Mass Spectrom. 15, 1416–1423
105. Luo, Q., Shen, Y., Hixson, K. K., Zhao, R., Yang, F., Moore, R. J., Mottaz, H. M., and Smith, R. D. (2005) Preparation of 20-μm-i.d. silica-based monolithic columns and their performance for proteomics analyses. Anal. Chem. 77, 5028–5035
106. Juraschek, R., Dulcks, T., and Karas, M. (1999) Nanoelectrospray—more than just a minimized-flow electrospray ionization source. J. Am. Soc. Mass Spectrom. 10, 300–308
107. Alaiya, A., Al-Mohanna, M., and Linder, S. (2005) Clinical cancer proteomics: promises and pitfalls. J. Proteome Res. 4, 1213–1222
108. Zhan, X., and Desiderio, D. M. (2003) Heterogeneity analysis of the human pituitary proteome. Clin. Chem. 49, 1740–1751
109. Mann, K. G., Brummel-Ziedins, K., Undas, A., and Butenas, S. (2004) Does the genotype predict the phenotype? Evaluations of the hemostatic proteome. J. Thromb. Haemostasis 2, 1727–1734
110. Kendziorski, C., Irizarry, R. A., Chen, K. S., Haag, J. D., and Gould, M. N. (2005) On the utility of pooling biological samples in microarray experiments. Proc. Natl. Acad. Sci. U. S. A. 102, 4252–4257
111. Sickmann, A., Marcus, K., Schafer, H., Butt-Dorje, E., Lehr, S., Herkner, A., Suer, S., Bahr, I., and Meyer, H. E. (2001) Identification of post-translationally modified proteins in proteome studies. Electrophoresis 22, 1669–1676
112. Anderson, L. (2005) Candidate-based proteomics in the search for biomarkers of cardiovascular disease. J. Physiol. 563, 23–60
113. Anderson, L., and Hunter, C. L. (2006) Quantitative mass spectrometric multiple reaction monitoring assays for major plasma proteins. Mol. Cell. Proteomics 5, 573–588
114. Anderson, N. L., Anderson, N. G., Haines, L. R., Hardie, D. B., Olafson, R. W., and Pearson, T. W. (2004) Mass spectrometric quantitation of peptides and proteins using Stable Isotope Standards and Capture by Anti-Peptide Antibodies (SISCAPA). J. Proteome Res. 3, 235–244
115. Berger, A. B., Vitorino, P. M., and Bogyo, M. (2004) Activity-based protein profiling: applications to biomarker discovery, in vivo imaging and drug discovery. Am. J. Pharmacogenomics 4, 371–381
116. Speers, A. E., and Cravatt, B. F. (2004) Chemical strategies for activity-based proteomics. Chembiochem 5, 41–47
117. Masselon, C., Pasa-Tolic, L., Tolic, N., Anderson, G. A., Bogdanov, B., Vilkov, A. N., Shen, Y., Zhao, R., Qian, W. J., Lipton, M. S., Camp, D. G., II, and Smith, R. D. (2005) Targeted comparative proteomics by liquid chromatography-tandem Fourier ion cyclotron resonance mass spectrometry. Anal. Chem. 77, 400–406

