Advanced Review
Mass spectrometry-driven phosphoproteomics: patterning the systems biology mosaic
Martin A. Jünger1∗ and Ruedi Aebersold1,2
Protein phosphorylation is the best-studied posttranslational modification and
plays a role in virtually every biological process. Phosphoproteomics is the analysis
of protein phosphorylation on a proteome-wide scale, and mainly uses the same
instrumentation and analogous strategies as conventional mass spectrometry
(MS)-based proteomics. Measurements can be performed either in a discovery-type mode, also known as shotgun mode, or in a targeted manner which monitors a
set of a priori known phosphopeptides, such as members of a signal transduction
pathway, across biological samples. Here, we delineate the different experimental
levels at which measures can be taken to optimize the scope, reliability, and
information content of phosphoproteomic analyses. Various chromatographic and
chemical protocols exist to physically enrich phosphopeptides from proteolytic
digests of biological samples. Subsequent mass spectrometric analysis revolves
around peptide ion fragmentation to generate sequence information and identify
the backbone sequence of phosphopeptides as well as the phosphate group
attachment site(s), and different modes of fragmentation like collision-induced
dissociation (CID), electron transfer dissociation (ETD), and higher energy
collisional dissociation (HCD) have been established for phosphopeptide analysis.
Computational tools are important for the identification and quantification
of phosphopeptides and mapping of phosphorylation sites, the deposition of
large-scale phosphoproteome datasets in public databases, and the extraction
of biologically meaningful information by data mining, integration with other
data types, and descriptive or predictive modeling. Finally, we discuss how
orthogonal experimental approaches can be employed to validate newly identified
phosphorylation sites on a biochemical, mechanistic, and physiological level.
© 2013 Wiley Periodicals, Inc.
How to cite this article:
WIREs Dev Biol 2014, 3:83–112. doi: 10.1002/wdev.121
∗ Correspondence to: [email protected]
1 Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland
2 Faculty of Science, University of Zurich, Zurich, Switzerland
Conflict of interest: The authors declare that they have no conflicts of interest.
Volume 3, January/February 2014
INTRODUCTION
Systems biology is an interdisciplinary effort. It
aims at a more comprehensive understanding
of biological processes by interrogating global
cellular landscapes like genomes, transcriptomes,
proteomes, interactomes, or metabolomes instead
of single genes and proteins. It also attempts to
increase the understanding of biological processes
by considering the context of these biomolecules,
i.e., their interactions in time and space in the
living cell. The ambition to understand biological
processes on a systems level is not new. Classical
forward genetics can be thought of as systems
biology in slow motion—it aims at the identification
of all genetic components relevant for a specific
phenotype, but mapping and characterization of the
underlying genes take years of work. Modern systems
biology accelerates systems level analyses by applying
advanced analytical techniques that are able to
capture global cellular parameters.
wires.wiley.com/devbio
FIGURE 1 | Systems biology (a) and proteomic (b) research are interdisciplinary fields. Fruitful research in these areas depends on strong intersections and dialog between experts in biology, technology, and computation. Panel (a), ‘The Systems Biology Triangle’, carries the labels Biology, Computation, and ‘Biological question - conclusion’.
The success of this emerging science critically depends on the combined
expertise of biologists, analytical specialists, and
computer scientists. This interdependent constellation
is illustrated in the ‘systems biology triangle’
(Figure 1(a)) and ensures that challenging biological
questions can be addressed that are amenable to the
biological model systems and analytical techniques
available, that the analysis is performed in a way that
maximizes the amount and quality of the desired data,
and that downstream data processing and analysis
by computational means lead to results which are
the basis for a meaningful answer to the biological
questions asked. In the context of large-scale protein
analyses, the ‘proteomics triangle’ (Figure 1(b))
displays the more specific disciplines constituting
the field of proteomics. Experimentally addressing
a specific biological question by proteomics usually
involves enrichment of defined populations of proteins
or peptides and their subsequent analysis by MS.
Computational tools are then used to identify and, if
desired, quantify the proteins and peptides of interest.
Furthermore, modeling in the sense of capturing
the acquired information, possibly integrating it
with other data types such as transcriptome or
metabolome profiles or protein–protein and genetic
interactions, and visualization of the results can be
instrumental in arriving at a biological conclusion.
Corresponding to these expertise or discipline
triangles, this article gives an overview about
the different aspects involved in the analysis of
phosphoproteomes, specifically addressing the issues
of suitable biological systems and questions, sample
preparation, enrichment of phosphopeptides, MS
for phosphopeptide identification and quantitation,
data analysis, and the range of possible validation
experiments that can be employed to consolidate new
biological knowledge.
FIGURE 1 (panel labels, continued) | Technology; Computation and Modeling; Sample preparation and Mass spectrometry; The Proteomics Triangle.
PROTEIN PHOSPHORYLATION IN BIOLOGICAL SYSTEMS
Reversible protein phosphorylation is one of the
most widespread regulatory mechanisms found within
cells, and probably the most extensively studied
posttranslational protein modification. Virtually every
cellular process is, in addition to other regulatory
mechanisms, directly or indirectly regulated by protein
phosphorylation. Initially characterized in 1955 on the
metabolic enzyme glycogen phosphorylase,1 protein
phosphorylation was later identified as a central
mechanism in intracellular information processing:
signal transduction cascades coordinate the cellular
response to external cues such as hormones or nutrient
conditions, and consist of consecutively acting protein
kinases which phosphorylate downstream kinases
and other substrate proteins such as transcription
factors and adapter proteins when activated by the
external stimulus. Protein phosphorylation occurs
frequently on the side chains of serine, threonine,
and tyrosine residues (O-phosphorylation), but can
also occur on arginine, lysine, and histidine (N-phosphorylation). The vast majority of cellular
protein phosphorylation events reported are on
serine and threonine residues; while Hunter and
Sefton estimated in 1980 that the ratio of S/T/Y
phosphorylation is 90:10:0.05,2 recent analyses based
on large phosphoproteomic studies suggest that this
distribution is closer to 88:11:1 (refs 3,4). Phosphorylation on
histidine is thought to account for up to 6% of
protein phosphorylation events in eukaryotic cells,
but its analysis by standard proteomic strategies is
complicated by the lability of this modification under
acidic conditions.5 Although the true extent of
proteome phosphorylation has not yet been
mapped to completion in any species, a
frequently cited approximation is that about 30% of
the proteins in eukaryotic cells are phosphorylated.6,7
However, experimental phosphoproteome coverage
based on several large-scale phosphoproteomic studies
(refer to the coverage values for Saccharomyces
cerevisiae, Drosophila, mouse, and human in Table 2)
demonstrates that the percentage of phosphoproteins
in eukaryotic proteomes is rather in the range of
40–45%. The true percentage is likely even higher,
because even these advanced phosphoproteome
mapping efforts are still incomplete. On the basis
of extrapolations from existing phosphoproteomic
datasets, the number of potential phosphorylation
sites in a eukaryotic cell is estimated to be in the range
of 500,000 to 700,000.9,10
The Evolution and Functional Significance
of Protein Phosphorylation
Probably the most severe limitation in deriving meaningful biological conclusions from large-scale phosphoproteomic datasets is that of the thousands of
phosphorylation sites that can be measured with current technology (see below), only a small fraction has
a described function. This certainly does not mean
that all the functionally unannotated phosphorylation events are nonfunctional. Evolutionary analysis
of identified phosphoproteomes can help to focus
on the modifications that are most likely to
have a functional impact. It has
been demonstrated that functional phosphorylation
sites tend to be evolutionarily conserved; furthermore,
phosphosites on low-abundance proteins that occur
with a high stoichiometry seem to be more strongly
conserved than others.11 An interesting theory has
been formulated for activating phosphorylation sites,
which have been proposed to have evolved from acidic
amino acid (‘phospho-mimicking’) residues, thereby
transforming a constitutively active protein to a conditionally active one during the course of evolution.12
Phosphotyrosine signaling systems that include tyrosine kinases, phosphatases, and phosphotyrosine-binding protein domains seem to be primarily regulatory and therefore display more ‘functional density’
than the rest of the phosphoproteome.13 While tyrosine phosphorylation also occurs to some extent in
simpler organisms like bacteria and yeast, pTyr signaling pathways appear to be a hallmark of more complex
organisms. This notion is supported by the finding that
with increasing organismal complexity (cell number
per organism), the number of protein tyrosine kinases
increases. Conversely, genomically encoded tyrosine
content significantly decreases with complexity, possibly to reduce the amount of possible nonfunctional
or deleterious pTyr events and therefore reduce noise
and enhance regulation in pTyr-dependent signaling
systems.14
Signal Transduction Pathways as Biological
Information Processing Systems
The complexity of phosphorylation-based intracellular signaling systems ranges from relatively simple,
near-linear pathways to complex networks of interconnected signaling modules and feedback loops.
Figure 2 depicts three simplified signaling pathway
diagrams of increasing complexity. Probably the simplest signaling pathways are bacterial two-component
systems that are often mediators in virulence-related
genetic regulation (Figure 2(a)). In contrast to eukaryotic cells, protein phosphorylation in bacteria was
discovered relatively late, in 1979. Although S/T/Y
phosphorylation has also been described in bacteria, two-component systems utilize phosphorylation
of histidine and aspartate residues. As the name
implies, these signaling modules consist of two components, a signal-sensing transmembrane histidine
kinase (HK) which autophosphorylates on histidine
residues upon activation, and then transfers the phosphate groups to aspartate residues on response regulator (RR) proteins which in turn regulate gene
transcription. However, it has emerged that these
simple signaling modules also exist in more complex versions that display branching and crosstalk.15
A relatively simple eukaryotic pathway is the Janus
kinase/signal transducers and activators of transcription (JAK/STAT) system which is involved in cytokine
signaling (Figure 2(b)). In this pathway, interferon
receptors activate JAKs upon ligand binding, which in
turn autophosphorylate, become activated, and phosphorylate the receptor on tyrosine residues. Through
Src homology 2 (SH2) domains, STAT proteins are
recruited to the active receptor complex, phosphorylated on tyrosine residues, and homo- or heterodimerize in the phosphorylated state. Subsequently, the
STAT complexes translocate into the nucleus and
regulate cytokine target gene expression, acting as
transcription factors. A more complex eukaryotic signaling system is the insulin-Target of Rapamycin
(TOR) network (Figure 2(c)). The nutrient-sensing
and growth-regulating TOR pathway is conserved
from unicellular organisms such as S. cerevisiae to
humans. In multicellular organisms, additional hormonal control of growth and metabolism is provided
by the insulin/insulin-like growth factor (IGF) signaling (IIS) cascade which is interfaced with the TOR
module at several nodes. The central molecular event
in the IIS pathway is generation of the second messenger phosphatidylinositol (3,4,5)-trisphosphate (PIP3)
which is catalyzed by class IA phosphoinositide 3-kinase (PI3K). Elevated PIP3 levels in the plasma
membrane lead to membrane recruitment of Pleckstrin
homology (PH) domain-containing protein kinases
such as phosphoinositide-dependent kinase 1 (PDK1)
and protein kinase B (PKB)/AKT, which in turn
regulate downstream effectors like forkhead box O
(FOXO) transcription factors. All these pathways are
composed of kinase-substrate networks, and a particularly challenging aspect of phosphoproteomics in
general and signal transduction research in particular
is the identification of direct protein kinase substrates.
FIGURE 2 | Signaling pathways and kinase–substrate relationships in biological systems. Three schematic pathway examples of increasing
complexity are shown. (a) Bacterial two-component signaling consisting of histidine kinases (HK) and response regulators (RR) constitutes one of the
simplest signaling systems, while eukaryotic cells feature networks such as the relatively simple JAK-STAT pathway (b) and the more complex
insulin-TOR network (c). Only the core constituents of the pathways are shown for the sake of clarity. Interconnected cascades of direct
kinase–substrate relationships form the functional backbone of these networks. Another important concept in pathway architecture is the role of
phosphospecific protein binding modules like SH2 domains and 14-3-3 proteins in the dynamic regulation of phosphorylation-dependent protein
complexes.
The concepts of approaching this issue experimentally
are briefly summarized in Box 1.
Crosstalk with Other Posttranslational
Modifications
Protein phosphorylation events can influence and
be influenced by other types of posttranslational
modifications (PTMs) occurring on the same protein
machine. This kind of crosstalk between two or
more modification systems can be either positive or
negative, occur in cis (on the same protein) or trans
(on two different proteins), and is comparatively well
characterized in chromatin biology, more specifically
on histone tails that are heavily modified and
display a high density and diversity of PTMs.20 A
modification that is closely linked to phosphorylation
is the glycosylation of serine and threonine residues
by β-O-linked N-acetylglucosamine (O-GlcNAc).
Basically all known O-GlcNAc-modified proteins
are also phosphoproteins, and in a number of
cellular responses, there is a reciprocal regulation of
these two modifications.21 This dynamic interplay
is not fully understood. Known mechanisms include
competition of the two PTMs for the same sites, and
the occupancy and cross-regulation of each other at
adjacent sites or even more distant sites on the same
protein.22 Furthermore, it has been reported that
almost 70% of O-GlcNAc-modified proteins are also
phosphorylated at tyrosine residues, suggesting that
tyrosine phosphorylation might somehow impact the
crosstalk between phosphorylation and glycosylation
on serine and threonine residues.23
BOX 1
IDENTIFICATION OF DIRECT KINASE–SUBSTRATE PAIRS REQUIRES THE COMBINATION OF ORTHOGONAL EXPERIMENTAL APPROACHES
To generate a solid body of biological evidence about novel kinase–substrate pairs, several
orthogonal approaches have to be combined.
These include computational analysis of the
presence of kinase consensus motifs, in vitro
phosphorylation of the putative substrate by
the kinase, and recapitulation of the phosphorylation events in intact cells or even tissues. In vitro and yeast two-hybrid assays
have often been instrumental in determining
a direct kinase–substrate interaction. Genetic
experiments include overexpression and deletion
of the kinase and monitoring the phosphorylation event on the putative substrate. These
are often complicated by redundancy, the rather
long time frame of genetic manipulations and
the resulting occurrence of indirect and compensatory effects. A specific kinase inhibitor
is invaluable for the identification of direct
substrates, because its action is faster than a
genetic manipulation, minimizing the potential
for compensation effects. Although novel substrates are not easily identified by AP-MS type
experiments, some kinases bind their substrate
proteins with high enough affinity to facilitate
a physical co-purification, an example being the
phosphorylation of FoxO1 by PKA-α.16 Therefore, epitope-tagging strategies and tools like
kinobeads, that allow the physical enrichment of
endogenous protein kinases,17 are useful for the
elucidation of kinase interactomes and potentially the identification of new substrates. A more
direct strategy makes use of genetically engineered ‘analog-sensitive’ kinases that can utilize
ATP analogs which wild-type kinases cannot, and
thereby attach radioactive or other labels to the
substrates to be identified.18 Conventional and
systems-level approaches toward linking kinases
with their substrates have been reviewed by
Sopko and Andrews.19 It is apparent that these
types of sophisticated experiments are difficult
to scale up to a high-throughput format.
Another example
of PTM crosstalk which is especially relevant for
cellular signaling occurs within kinase consensus
sequences.24 Methylation or acetylation of arginine or
lysine residues commonly found in consensus motifs
can affect the phosphorylation event or vice versa.
For example, PRMT1 (protein arginine methyltransferase 1)-catalyzed methylation of R248 and
R250 in FOXO1 (mouse numbering), the two arginine
residues that define the AKT consensus motif, inhibits
phosphorylation of S253 by AKT, thereby preventing
the nuclear export and inhibition of FOXO1.25
Conversely, phosphorylation of RNA polymerase II
(RNAPII) at S2 and S5 by positive transcription
elongation factor b (P-TEFb) and CDK activating
kinase (CAK), respectively, prevents its methylation
at R1810 by coactivator-associated protein arginine
methyltransferase 1 (CARM1) or PRMT4.26 It should
be apparent that mechanistic and functional issues
that arise from protein phosphorylation and its
interplay with other types of modifications can only be
experimentally approached if suitable techniques for
the reliable measurement of the respective substrates
are available.
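The consensus-motif analysis mentioned above can be illustrated with a short sketch. The spacing described for FOXO1 (arginines five and three residues N-terminal of the phosphorylated serine) corresponds to the minimal pattern R-x-R-x-x-S/T, which is assumed here as a simplified AKT-type motif. The function name and the toy sequence are hypothetical and are not taken from any published tool or real protein. A motif match only flags a candidate site and does not replace the orthogonal experiments needed to establish a kinase–substrate pair.

```python
import re

# Simplified AKT-type consensus: R-x-R-x-x-[S/T], phosphoacceptor in the last position.
# The lookahead lets overlapping motif instances be reported as well.
AKT_LIKE_MOTIF = re.compile(r"(?=(R.R..[ST]))")

def find_motif_sites(sequence, pattern=AKT_LIKE_MOTIF):
    """Return (1-based position, residue) for each candidate phosphoacceptor."""
    sites = []
    for match in pattern.finditer(sequence):
        # The acceptor is the sixth residue of the six-residue window
        acceptor_index = match.start(1) + 5
        sites.append((acceptor_index + 1, sequence[acceptor_index]))
    return sites

# Hypothetical toy sequence, not a real protein
toy_sequence = "MKPRARAASLDGGRSRTLTPEE"
for position, residue in find_motif_sites(toy_sequence):
    print(f"candidate site: {residue}{position}")
```

In such a scan, an arginine-directed modification at either motif arginine (for example the methylation events described above) would be expected to sit at the first or third position of a reported window.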
In the following sections of this review,
we will provide an overview of the current
methods, achievements, and challenges in MS-based
phosphoproteomics. To illustrate some of the concepts
which are discussed, we will frequently return to
the aforementioned biological context of insulin/TOR
signaling, which is a relatively well-studied system and
therefore suitable to provide examples for research
strategies in the field of protein phosphorylation.
EXPERIMENTAL TECHNIQUES TO STUDY PHOSPHORYLATION-MEDIATED SIGNAL TRANSDUCTION
Analysis of protein phosphorylation has been an
important aspect of biological research for decades,
and several distinct experimental approaches have
evolved to this end. Figure 3 gives an overview
of commonly used protocols to investigate protein
phosphorylation. Traditionally, the incorporation
of radioactive phosphate into substrate proteins
or peptides, typically using radiolabeled ATP
and in vitro kinase reactions, was utilized to
characterize phosphorylation events. Kinase activity
assays can be performed with a radioactivity-based
readout, and phosphopeptide mapping of 32P-labeled
proteins by gel electrophoresis, proteolytic digestion,
two-dimensional thin layer chromatography and
autoradiography has been instrumental in determining
many phosphorylation sites over the years. The
advent of phospho-specific antibodies marked a
breakthrough in protein phosphorylation research, as
they are often tools with high specificity and sensitivity
that allow the detection of a specific phosphorylated
amino acid residue in a protein of interest. Several
techniques employ the antibody-based detection of
phosphoproteins.
Antibody-Based Experimental Strategies to
Investigate Protein Phosphorylation
Immunofluorescence and related antibody-based
approaches such as immuno-electron microscopy are
the only methods which can visualize the localization
of the phospho-epitope in cells or tissues. Qualitative
and semi-quantitative measurements can be performed
by Western blotting with phosphospecific antibodies;
more quantitative results can be obtained with an
ELISA (enzyme-linked immunosorbent assay) setup.
Protein array-type experimental setups have been
used to some extent in phosphoproteome and cell
signaling research,27 but are not discussed here.
With the exception of immunofluorescence, the
methods discussed until now detect phosphoproteins
in cell lysates or other cell-free contexts, but not in
individual cells or tissues. In contrast, flow cytometry
can be performed with phosphospecific antibodies
to measure specific modified residues in proteins
with single cell-resolution.28,29 Conventional flow
cytometry is performed with fluorescently marked
antibodies, and is therefore limited, by the spectral
overlap of the different fluorophores used, to a certain
number (in the range of 10) of phosphorylation
events that can be detected simultaneously. A recent
development has coupled antibody-based protein or
phosphoprotein detection with a mass spectrometric
readout that does not suffer from the spectral overlap
issue, and is referred to as mass cytometry.30,31
This approach potentially allows the simultaneous
measurement and quantification of up to 100 different
epitopes. Unless the phosphorylated epitopes are
extracellular, the analyses by both conventional flow
cytometry and mass cytometry involve fixation and
permeabilization of the cells to enable antibody
access to the intracellular epitopes. Although the
antibody-based methods of phosphoprotein analysis
summarized here have been invaluable in advancing
our understanding of phosphorylation networks and
can be multiplexed to a certain degree, they have two
main conceptual limitations. First, phosphospecific
antibodies are available for only a small fraction
of well-characterized phosphorylation sites, their
generation is time-consuming, and they inherently
vary considerably in terms of analytical parameters
such as specificity and sensitivity. Second, one
obviously cannot discover new phosphorylation sites
when working with antibody-based methods, but
rather analyze the response or characteristics of
known phosphorylation sites within a given biological
or biochemical context. For a global analysis of
phosphoproteomes in discovery-type experiments,
or the quantitative targeted analysis of specific
phosphopeptides across sets of biological samples,
mass spectrometric techniques have evolved as
the state-of-the-art approach, as will be described
below. In some cases, antibody-based enrichment of
phosphopeptides has been interfaced with subsequent
MS-based detection, most notably in the analysis of
tyrosine-phosphorylated peptides.
Mass Spectrometry of Large Phosphopeptide
Pools—Phosphoproteomics
The central analytical procedure in MS-based phosphoproteomic workflows is the fragmentation of
isolated phosphopeptide ions in the collision cell of
a tandem mass spectrometer. Several experimental
and computational steps before and after the peptide fragmentation constitute the workflow which
yields qualitative and in certain setups also quantitative information about the analyzed phosphopeptides.
As illustrated in Figure 3, several MS data acquisition schemes can be employed, depending on the
experimental goals, and either the global phosphoproteomes or defined sub-phosphoproteomes can be
investigated depending on the biological question
asked. In so-called shotgun or discovery-mode experiments, the mass spectrometer conventionally operates
in data-dependent acquisition (DDA) mode, selecting
a number of detected peptide ions, usually based on
the precursor ion intensity detected in a survey scan
(MS1 scan), for fragmentation in an MS2 step and
subsequent identification. In this mode of operation,
large catalogs of phosphopeptides identified in various
cells and tissues have been generated and deposited
in publicly accessible databases. Tandem mass spectrometers which can employ the scanning and mass
filtering capabilities of quadrupoles, such as triple
quadrupole instruments, have been used in different
setups to analyze phosphopeptides. In precursor ion
scanning, the first mass analyzer scans all present
precursor ions, while the second mass analyzer is
set to detect product ions that are diagnostic for
phosphopeptides. Upon fragmentation by collision-induced dissociation (CID), phosphopeptides usually
produce a specific PO3− ion (m/z = 79) that can
be detected in negative ion mode; this mode is incompatible with usual proteomics workflows, which are generally
performed in positive ion mode. A derivatization
strategy based on β-elimination and Michael addition has been developed which allows precursor ion
scanning geared toward detection of phosphoserine- and phosphothreonine-containing peptides in positive
ion mode.32
FIGURE 3 | Approaches to the analysis of protein phosphorylation. Experimental categories employing radioactivity, antibody-based detection,
and mass spectrometry are shown. Refer to the text for more details.
For phosphotyrosine-containing peptides,
precursor ion scanning can also be performed for monitoring the generation of a phosphotyrosine-specific
immonium ion (PSI) at m/z = 216.043 in positive ion
mode. A similar strategy which can be used to detect
serine-, threonine-, and tyrosine-phosphorylated peptides is referred to as neutral loss scanning and is
based on the characteristic of phosphopeptides to
generate the neutral phosphate fragments H3PO4
(98 Da) and HPO3 (80 Da) during CID. In neutral loss
scanning, both mass analyzers simultaneously scan
through the respective m/z range at a fixed offset corresponding to the mass of the neutral loss fragment,
thereby identifying phosphorylated precursor ions.
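The fixed offset used in neutral loss scanning depends on the charge state of the precursor, because the mass analyzers operate on m/z rather than on mass. As a minimal sketch, assuming that the product ion retains the full precursor charge and using monoisotopic masses for the two neutral fragments named above, the offsets could be computed as follows (the function names are illustrative only):

```python
# Monoisotopic masses (Da) of the neutral fragments lost from phosphopeptides
NEUTRAL_LOSS_MASS = {"H3PO4": 97.97690, "HPO3": 79.96633}

def neutral_loss_offset(loss, charge):
    """m/z offset between precursor and product ion for a given charge state.

    Assumes the product ion keeps the same charge as the precursor.
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return NEUTRAL_LOSS_MASS[loss] / charge

def expected_product_mz(precursor_mz, loss, charge):
    """m/z at which the neutral-loss product of a precursor would appear."""
    return precursor_mz - neutral_loss_offset(loss, charge)

# Offsets for the charge states most common for tryptic peptides
for z in (1, 2, 3):
    print(z, round(neutral_loss_offset("H3PO4", z), 4),
          round(neutral_loss_offset("HPO3", z), 4))
```

Under these assumptions, a scan intended to catch doubly charged phosphopeptides would use roughly half the offset of a scan for singly charged ones, which is why the assumed charge state has to be chosen before the scan is set up.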
Recently, data-independent acquisition (DIA) schemes
have been introduced, in which the entire precursor
ion population is either fragmented in one step (AIF33)
or by repeated cycling through consecutive isolation
windows (SWATH34). Because the link between precursor and fragment ion m/z values is lost in this type
of acquisition, reliable peptide identification has to
include, in contrast to conventional database searching of DDA data, chromatographic information, and
entails the alignment of precisely co-eluting precursor
and fragment (or only fragment) masses. Although
of great potential as general proteomic workflows,
these DIA approaches have not yet been routinely
applied to the analysis of phosphoproteomes. In
contrast to discovery-type strategies in which previously unknown phosphopeptides in the samples of
interest are identified, targeted analysis by selected
reaction monitoring (SRM) on triple quadrupole-type
instruments employs two mass analyzers that are set
to a pair of selected m/z values at a time. These two
values correspond to the precursor m/z of a given peptide and a specific fragment ion, and are referred to as
a transition. Several transitions in turn constitute an
analytical assay for a peptide, which can be used to
detect it within a complex sample with high sensitivity
and specificity.35 Some examples of phosphopeptide
analysis by SRM are given in the mass spectrometry
section below. A novel development for targeted proteomics implemented on the Q Exactive instrument
(which consists of an orbitrap mass analyzer equipped
with a quadrupole mass filter) is called parallel reaction monitoring.36,37 In this approach, the quadrupole
is used to select a specific precursor ion, which is then
detected with high resolution and accuracy in MS1
(single ion monitoring) and MS/MS modes (parallel
reaction monitoring). Like DIA, this technology has
not yet been applied to phosphoproteomics, but is
likely to be valuable in this type of application as well.
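A transition, as described above for SRM, is a pair of precursor and fragment ion m/z values. The following sketch shows how such pairs could be calculated for a phosphopeptide from monoisotopic residue masses. The lowercase notation for phosphorylated residues, the function names, and the example peptide are assumptions made for illustration, and the mass table covers only a few residues. Real assay development relies on empirically observed fragment intensities and not on calculated masses alone.

```python
# Monoisotopic residue masses (Da) for a small subset of amino acids
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
    "L": 113.08406, "D": 115.02694, "K": 128.09496, "E": 129.04259,
    "R": 156.10111, "Y": 163.06333,
}
PHOSPHO = 79.96633   # HPO3 added to S, T, or Y
WATER = 18.01056
PROTON = 1.00728

def residue_mass(symbol):
    """Mass of one residue; lowercase s, t, y denote phosphorylated residues."""
    mass = RESIDUE_MASS[symbol.upper()]
    if symbol.islower():
        if symbol not in "sty":
            raise ValueError(f"unsupported modified residue: {symbol}")
        mass += PHOSPHO
    return mass

def ion_mz(residues, charge):
    """m/z of an intact peptide or a y-type fragment made of the given residues."""
    neutral = sum(residue_mass(r) for r in residues) + WATER
    return (neutral + charge * PROTON) / charge

def srm_transitions(peptide, precursor_charge, y_numbers):
    """Return (Q1 m/z, Q3 m/z, label) tuples using singly charged y-ions."""
    q1 = ion_mz(peptide, precursor_charge)
    return [(q1, ion_mz(peptide[-n:], 1), f"y{n}") for n in y_numbers]

# Hypothetical phosphopeptide with a phosphoserine at position 3
for q1, q3, label in srm_transitions("AAsLDK", 2, [1, 3, 4]):
    print(f"{label}: Q1 = {q1:.4f}, Q3 = {q3:.4f}")
```

In this example, y1 and y3 do not contain the modified serine, whereas y4 does. Fragments that include the phosphorylated residue are the ones that can help to distinguish positional isomers of the same phosphopeptide.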
More Focused Analysis of Informative
Phosphoprotein Sets
The phosphoproteomes of higher organisms are of
such extensive complexity, that it is not yet possible to
achieve a full coverage with current instrumentation
and data analysis tools. First, eukaryotic proteomes
display an enormous range of protein abundances
that exceed the dynamic range of current LC-MS/MS
instrumentation (see Section ‘General Issues of Proteome Complexity’). Second, the phosphopeptidome
is much more complex than the unmodified peptidome due to the very large number of measured
and estimated phosphorylation sites (see Section ‘Protein Phosphorylation in Biological Systems’). Third,
many biologically important phosphoproteins such as
signaling pathway components are of low abundance, and
many phosphorylation events are sub-stoichiometric,
leading to low phosphopeptide abundances compared
to the levels of the respective unmodified peptides. For
these reasons, it can be of great advantage to interrogate specific subpopulations of the global phosphoproteome, which are relevant and of high information
content for the biological context under investigation. The isolation of subproteomes and their selective
analysis has been achieved by a number of approaches.
Subcellular Fractions
The large majority of large-scale phosphoproteomic
studies describe phosphopeptidome profiles derived
from whole cell extracts. During the lysis step, the
information about subcellular localization of protein
kinases and their phosphorylated substrates is lost.
However, this kind of information is very valuable for
the understanding of phosphorylation-based signaling, because spatial assembly of signaling modules in
subcellular compartments is an important structural
and regulatory element of these networks. To address
this issue and to generate compartment- or organelle-specific datasets, several studies have investigated
the phosphoproteomes of specific subcellular compartments and structures, including the nucleus, the
plasma membrane, mitochondria, endosomes, phagosomes, synaptic vesicles, and the mitotic spindle, as
reviewed by Trost et al.38 Phosphoproteome analyses performed in Drosophila cells in our laboratory
have focused mainly on cytoplasmic phosphorylation
events, as cells were lysed in hypotonic buffer, and
the supernatant of an ultracentrifugation at 100,000 × g
was used as starting material for the tryptic digests
and phosphopeptide enrichments.39,40
AP-MS of Signal Transduction Proteins
Physical enrichment and subsequent analysis of protein complexes by affinity purification-mass spectrometry (AP-MS) has proven to be a powerful approach to
shed further light on intracellular signaling cascades.
The limitation of this method is the rather laborious
generation of high quality antibodies that can be used
for the purification, or the generation of transgenic
cell lines or organisms which express an epitope-tagged variant of the bait protein. When a working
experimental system is established, these studies
can deliver two types of information: the composition
of the protein complex which is assembled around
the bait protein, and phosphorylation patterns
on the bait protein itself and its interactors. In
both cases, the information obtained can be
qualitative or quantitative, and static or dynamic,
e.g., when different cellular conditions are compared. A
recent AP-MS study identified the protein complex
compositions around several nodes in the Drosophila
insulin/TOR signaling system,41 and described the
dynamic reassembly of node complexes upon insulin
stimulation. In an investigation of the Drosophila
insulin receptor substrate (IRS) homolog CHICO,
affinity-purified protein complexes were subjected
to an analytical strategy involving differential phosphatase treatment and quantification by iTRAQ (see
Section Quantitative Phosphopeptide Measurements)
to identify and quantify both the CHICO interactome
and its phosphorylation patterns in response to the
insulin stimulus in a single analysis.42 It has also been
demonstrated that AP-MS followed by either the
phosphatase-iTRAQ analysis performed by Pflieger
et al., or phosphopeptide enrichment and subsequent
LC-MS/MS is capable of identifying substantially
more phosphorylation sites on the bait protein compared to proteome-wide phosphopeptide screening.
Two examples are CHICO and the transcription factor dFOXO40: in both cases, phosphorylation analysis
of the affinity purified complex yielded a higher number of identified phosphopeptides compared to the
CHICO and dFOXO phosphopeptides identified in
a global phosphoproteome analysis of TiO2-, IMAC-,
and PAC-enriched tryptic Drosophila cell digests.39
Antibody-Based Enrichment (pTyr)
As discussed above, the phosphotyrosine proteome
is a small subpopulation of the phosphoproteome. It
is highly informative and central in transmembrane
and intracellular signal transduction. In contrast
to phosphoserine and phosphothreonine, there are
several antibodies against phosphotyrosine which are
highly specific and very well suited for the physical
isolation of tyrosine-phosphorylated macromolecules.
Although it is also possible to isolate tyrosine phosphorylated proteins by immunoaffinity purification,
the commonly used protocols for pTyr-proteomics
employ an affinity purification after protease digestion, at the peptide level. A significant advantage
of this procedure is that protein extraction can be
performed under relatively harsh conditions that
would not be compatible with protein-level AP-MS experiments (e.g., 9 M urea). Protein extraction
is therefore more efficient, even for difficult populations such
as membrane proteins, and the subsequent coverage of
the pTyr-proteome is more satisfactory. An analysis of
the insulin signaling pathway in differentiated brown
Volume 3, January/February 2014
WIREs Developmental Biology
Mass spectrometry-driven phosphoproteomics
adipocytes employed a combination of stable isotope
labeling by amino acids in cell culture (SILAC; see Section Quantitative Phosphopeptide
Measurements) and protein immunopurification
with anti-phosphotyrosine antibodies, identifying 40
insulin-regulated effectors, 7 of which had not been
previously described.43 Boersema et al. combined anti-phosphotyrosine peptide immunoaffinity purification
and stable isotope dimethyl labeling to investigate the
pTyr proteome in HeLa cells. From 4 mg of starting
material, more than 1,100 unique nonredundant
phosphopeptides were identified, and quantitation
of the cellular response to epidermal growth factor
(EGF) stimulation revealed the regulation of 73
unique pTyr peptides.44
CHALLENGES AND EXPERIMENTAL STRATEGIES IN PHOSPHOPROTEOME RESEARCH
General Issues of Proteome Complexity
As discussed in our recent review article,36 proteomic complexity by far exceeds genomic complexity
because of the presence of multiple splice variants per
gene, alternative translation start sites, mRNA editing,
and most importantly, the great range of protein abundances found in cells, tissues, and body fluids, which
still cannot be detected in its entirety with current
instrumentation in proteomic experiments. Very complex proteomes such as the human plasma proteome
span a protein abundance range of more than 10
orders of magnitude45 (the proteome of a human cell
spans about 7, yeast about 6, and prokaryotes about
4–5 orders of magnitude). The currently most sensitive MS techniques are able to monitor an abundance
range of 4–5 orders of magnitude.46 More than 200
different types of posttranslational protein modifications have been described,47 which add an additional
layer to proteomic complexity. To reduce this complexity and separate the phosphoproteome from the
background of unmodified or differently modified
peptides, physical enrichment strategies are usually
applied to improve phosphoproteome coverage. There
are several established protocols for the physical
enrichment of phosphopeptides from protein digests.
Antibody-based enrichment is generally only used for
pTyr-containing peptides, while chromatographic and
chemical enrichment is efficient for all phosphopeptides (pTyr/pSer/pThr). These enrichment steps are
usually performed offline, that is, not directly coupled
to the analytical reversed phase LC-MS/MS system.
Figure 4 depicts the individual steps in a typical
phosphoproteome analysis, from the cells or tissues
under investigation to the fully analyzed data. While
gel electrophoresis is still performed in some
phosphoproteome studies, most large-scale screens
follow a gel-free strategy, in which the protein samples
are directly digested with a protease and subjected to
subsequent phosphopeptide enrichment. Figure 4 also
highlights the steps within the experimental pipeline
during which stable isotope labeled standards can be
introduced (metabolic labeling, protein labeling, peptide labeling), and the critical points which will impact
the obtained phosphoproteome coverage and the overall quality of the experimental data. In the context of
quantitative analyses utilizing stable isotope labeling
approaches (see Section Quantitative Phosphopeptide
Measurements), it should be noted that performing
the labeling and sample pooling steps early in the
experimental pipeline (e.g., metabolic labeling by
SILAC versus chemical peptide labeling) reduces
individual handling errors and therefore usually leads
to more accurate quantification results. As discussed
above, subcellular fractionation can be utilized to shift
the focus of the phosphoproteome screen to the cellular context most relevant for the biological question
asked. Different protein extraction protocols also have
a strong impact on the subsets of the phosphoproteome that will be detected. In particular, transmembrane
and membrane-associated proteins are extracted more
efficiently if denaturing agents and detergents are used
during protein extraction. MS-compatible acid-labile
surfactants such as RapiGest are frequently used in
this context.48 After protein extraction, enrichment or
separation on the protein level can be performed, e.g.,
by antibody-based techniques, gel electrophoresis, or
size exclusion chromatography.
Use of Different Proteases
The vast majority of proteomic experiments are
performed with trypsin as the protease. Its cleavage
specificity, which directs enzymatic protein hydrolysis
to the C-terminal side of lysine and arginine residues,
confers the practical advantage that during MS, the
resulting peptide ions usually carry a minimum of two
positive charges, one at the protonated N-terminal
amino group of the peptide, and one at the protonated
basic side chain of the lysine or arginine residue at
the C-terminus of the peptide. High mass accuracy
instruments can discriminate the higher charged
peptide ions from contaminants which are usually
singly charged, and direct precursor selection for
MS/MS to peptide ions by on-the-fly decision-making
algorithms. During fragmentation in the MS/MS step,
it is furthermore assured that all fragments of the
b- and y-ion series (which propagate throughout
FIGURE 4 | Experimental strategies to extend phosphoproteome coverage and introduce stable isotope-labeled reference molecules. The blue
boxes on the left display the typical stages in a phosphoproteomic experiment. The boxes in the central part of the figure show experimental
processes either during (mauve) or between (yellow) the stages, which can impact the success (in terms of phosphoproteome coverage and quality of
quantitation) of the experiment. The orange boxes on the right illustrate at which stages stable isotope labels can be introduced. Introduction of
isotope labels at an early experimental stage reduces individual sample handling errors because the samples are pooled earlier, and therefore usually
leads to a more accurate quantification.
the length of the peptide from both termini) carry
at least one charge and are therefore detectable by
the machine. Despite these advantages, limiting an
analysis by default to tryptic digests will exclude a
large number of phosphorylation sites, specifically
those that are not located within protein sequence
stretches that will generate tryptic peptides of a length
suitable for MS/MS analysis. While this is currently
not much of an issue for global phosphoproteomics,
which is already battling with complexity issues
even when considering only tryptic peptides,
it does become an important consideration when
analyzing selected proteins or protein complexes for
novel phosphorylation events. For example, many
kinase consensus sequences contain a high density of
basic residues, which often lead to tryptic peptides of
insufficient length. Which parts of a protein of interest
are likely to be visible in a proteomic analysis of a
specific proteolytic digest can be easily checked with
in silico digestion tools. Experimentally, it has been
demonstrated that the parallel digestion of a protein
sample with proteases of fundamentally different
specificities leads to an improved sequence coverage
(approaching 100%) and PTM discovery rate.49
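The in silico check mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a replacement for dedicated digestion tools: the protein sequence is hypothetical (it embeds a basic-rich RRxS-type motif), the cleavage rule is simplified to "cut after every K or R" with no proline exception and no missed cleavages, and the usable length window of 7 to 30 residues is an assumed value chosen for illustration.

```python
def tryptic_digest(sequence):
    """Simplified in silico trypsin digest: cleave C-terminal to every K or R.

    Assumptions: no proline rule, no missed cleavages.
    """
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder without K/R
    return peptides


def split_by_length(peptides, min_len=7, max_len=30):
    """Separate peptides into an assumed MS/MS-suitable window and the rest."""
    usable = [p for p in peptides if min_len <= len(p) <= max_len]
    missed = [p for p in peptides if not (min_len <= len(p) <= max_len)]
    return usable, missed


# Hypothetical sequence containing a basic-rich kinase motif (LRRASLG).
PROTEIN = "MDELTPEQVAALKLRRASLGKTEDVSPEELLNR"

peptides = tryptic_digest(PROTEIN)
usable, missed = split_by_length(peptides)
print("all peptides:", peptides)
print("likely visible:", usable)
print("likely missed:", missed)
```

In this toy case the serine of the basic-rich motif ends up in a five-residue peptide that falls outside the assumed window, which is the situation in which a protease of different specificity would be worth considering.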
Chromatographic Enrichment of Phosphopeptides
Various enrichment techniques have been developed
that exploit the unique physicochemical properties of
the phosphate group. Because of the low pKa value of the phosphate group,
ion exchange chromatography has been successfully
used to enrich for phosphopeptides, namely strong
cation exchange (SCX) chromatography50 as well
as strong anion exchange.51 However, the enrichment is not very specific, as peptides with other
acidic modifications are enriched as well. Hydrophilic
interaction chromatography (HILIC), which separates
analytes based on their hydrophilicity, has also successfully been used in phosphoproteomic studies52
and confers the advantage over SCX that it does not
employ high concentrations of nonvolatile salts for
elution, and can therefore be more easily coupled
online to an LC-MS/MS system. A similar chromatographic approach, which uses a weak anion exchange
resin, is electrostatic repulsion-hydrophilic interaction liquid chromatography (ERLIC); it has been
used to analyze phosphorylated and glycosylated peptides in the same experiment.53 Several studies have
used combined chromatographic setups employing
subsequent purifications to increase phosphoproteome
coverage.54 Metal-based chromatography approaches
to enrich phosphopeptides rely on the interaction
of the negatively charged phosphate with positively
charged or polarized metal atoms. Immobilized metal
affinity chromatography (IMAC) uses metal ions such
as Fe3+, Ti4+, or Zr4+ that are immobilized to the
resin by chelating agents. Metal oxide affinity chromatography (MOAC) works similarly and involves
metal oxides like TiO2 or ZrO2. Several variations of
the IMAC protocol have been described, including one
employing two consecutive IMAC steps55 to enhance
phosphoproteome coverage, and sequential elution
IMAC (SIMAC)56 for the separation of monophosphorylated from multiply phosphorylated peptides in complex samples. Hydroxyapatite (HAP),
a naturally occurring calcium-containing phosphate
compound, has also been used to enrich phosphopeptides. The electrostatic attraction between the alkaline
earth metal Ca2+ and phosphate enables this purification, similarly to the other described metal-based
chromatography approaches. In contrast to those,
phosphopeptide purification with an HAP matrix
seems to have the advantage of reduced co-enrichment
of acidic peptides.57
Chemical Derivatization
Although recently somewhat displaced by the metal-based affinity enrichment protocols, considerable
effort has been invested in the design of chemical
strategies to capture phosphopeptides. The common
principle of all these approaches is the specific
chemical modification of the phosphate group,
followed by the attachment of an affinity handle
that facilitates the physical enrichment of the
phosphopeptides. As there is no single chemical
reaction that is exclusively specific for the phosphate
group, several approaches have been developed.58
To minimize side reactions with, e.g., acidic groups
on the C-terminus or side chains of acidic amino
acid residues, protection steps such as carboxylic
acid methylation are usually employed prior to the
chemical targeting of the phosphate groups. A widely
used chemical derivatization protocol is based on
β-elimination (BE) under strongly basic
conditions, followed by a Michael addition which
attaches an affinity tag and optionally a stable isotope
label for quantification. The drawback of the β-elimination approach is that it is only able to capture
serine- and threonine- (not tyrosine-) phosphorylated
peptides and is prone to a substantial number of side
reactions.59 The other chemical approaches are
applicable to pS-, pT-, and pY-containing peptides.
The second well-established chemical strategy is
based on phosphoramidate chemistry (PAC), which
entails reaction of the phosphate group in a
phosphopeptide with primary amines to form a
phosphoramidate. Affinity handles and isotope labels
can be attached according to this strategy in various
ways. One example is the protection of carboxyl
groups by methylation and concomitant isotope
labeling, followed by the carbodiimide-catalyzed
reaction of the phosphate groups with cystamine
and subsequent reduction to generate free thiol
groups on every phosphorylation site, which are then
physically captured with maleimide-functionalized
glass beads.40 Phosphopeptides were then eluted
with trifluoroacetic acid (TFA) and analyzed by LC-MS/MS. Other chemical methods that are not based on
BE or PAC, such as diazo chemistry60 or oxidation-reduction condensation,61 have also been described
but are so far not widely used in phosphoproteomic
research.
Gas Phase Ion Separation
Phosphoproteome coverage and phosphopeptide
identification can be improved by an additional
separation technique which is applied to the ionized
peptides in the gas phase, at the interface between
the LC and MS steps in the analytical pipeline. A
conceptual weakness of shotgun-type LC-MS/MS-based phosphoproteomics concerns the analysis
of phosphoisomeric peptides. These peptides, which have
identical sequences but are phosphorylated
on different residues, are often difficult to separate in
the LC step. They cannot be resolved in the MS1 step
either, because they are isobaric, and are therefore
fragmented together during CID. This results in MS2
spectra derived from a mixed population of precursor
ions, which complicates unambiguous phosphorylation site assignment. These phosphoisomers are
estimated to account for 3–6% of the total detected
phosphoproteome in large-scale studies, and their
identification can be substantially improved by targeted LC-MS/MS analysis.62 Ion mobility separation
(IMS) can be employed to physically separate these
phosphoisomeric peptide ions in an analytical device
operating between the ionization and MS steps in the
LC-MS/MS setup (LC-IMS-MS/MS). It furthermore
facilitates the removal of chemical contaminants from
multiply charged peptide ions, enhancing spectral
quality in MS/MS. IMS separates ions in the gas
phase based on their differential mobility in a range
of physical devices. Specifically, differential or high
field asymmetric waveform IMS (FAIMS) has been
established as a postionization separation technique
that is orthogonal to MS in the context of peptide
analysis. In FAIMS (also called differential ion mobility spectrometry, DMS), peptide ions travel through
a carrier gas (or drift gas) at atmospheric pressure
within a drift tube between two electrodes, across
which a high-voltage asymmetric waveform at radio
frequency (RF) is applied. This electric field is called
the dispersion field and oscillates between high and
low electric field strength because of its asymmetric
waveform. Only ions with a specific mobility will be
balanced and move through the drift tube without
hitting one of the electrodes and being neutralized,
while ions with substantially different mobility in
high and low electric field will drift toward one of the
electrodes during their travel through the device. This
ion drift toward an electrode can be counteracted by
a small DC voltage applied to this
electrode (and superimposed on the RF field), which is
referred to as the compensation voltage (CV). The CV
required to keep a specific ion stabilized is dependent
on the mobility of this ion and therefore characteristic
of it. When analyzing ion mixtures, the CV is scanned,
and the mixture thereby separated into peaks of ions
of distinct mobility, because each ion species can only
travel through the space between the FAIMS plates at
a specific CV which stabilizes its flight. The resulting
spectrum is called a CV spectrum. In phosphopeptide
analysis, FAIMS was demonstrated to be successful in
separating phosphoisomeric peptides even in cases in
which the phosphorylation site was shifted by only one
residue, which is usually very challenging with conventional LC-MS/MS setups.63 For more complex
phosphopeptide mixtures such as TiO2-enriched cellular
phosphoproteomes, FAIMS can also be instrumental in
resolving chromatographically coeluting phosphoisomers and substantially enhancing phosphoproteome
coverage. Bridon et al. used a FAIMS interface coupled
to an LTQ-Orbitrap instrument to analyze the phosphoproteome of Drosophila S2 cells. A combination
of FAIMS separation and decision tree fragmentation
(discussed below) resulted in a 50% increase in the
number of identified unique phosphopeptide species
compared to conventional LC-MS/MS analysis.64
In the same workflow, the authors performed
label-free quantification to identify insulin-regulated
phosphopeptides.
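Why phosphoisomers cannot be resolved at the MS1 level, as described in this section, can be made concrete with a short calculation. The sketch below is only an illustration: the peptide sequence is hypothetical, the function name is invented for this example, and the mass table is restricted to the residues that occur in that sequence. The constants are standard monoisotopic values (residue masses, water, proton, and the HPO3 increment of a phosphorylation).

```python
# Monoisotopic residue masses (Da), limited to the residues used below.
RESIDUE_MASS = {
    "T": 101.04768, "E": 129.04259, "D": 115.02694, "V": 99.06841,
    "S": 87.03203, "P": 97.05276, "L": 113.08406, "N": 114.04293,
    "R": 156.10111,
}
WATER = 18.010565   # added once per peptide (free N- and C-terminus)
PROTON = 1.007276   # charge carrier in positive ion mode
HPO3 = 79.966331    # mass added by one phosphorylation


def precursor_mz(sequence, phosphosites, charge):
    """m/z of a peptide phosphorylated at the given 0-based positions."""
    for pos in phosphosites:
        if sequence[pos] not in "STY":
            raise ValueError(f"position {pos + 1} ({sequence[pos]}) is not S, T, or Y")
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + HPO3 * len(phosphosites)
    return (mass + charge * PROTON) / charge


# Hypothetical peptide with two adjacent serines (0-based positions 4 and 5).
PEPTIDE = "TEDVSSPEELLNR"
mz_site5 = precursor_mz(PEPTIDE, [4], charge=2)
mz_site6 = precursor_mz(PEPTIDE, [5], charge=2)
print(mz_site5 == mz_site6)  # the two phosphoisomers share one precursor m/z
```

Because the precursor mass depends only on composition and not on where the phosphate sits, the two isomers are co-isolated for fragmentation unless they are separated beforehand, for example chromatographically or by ion mobility.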
As discussed above, several different MS data
acquisition strategies have been developed which
are suitable for phosphoproteomic experiments,
and these can be categorized into discovery-type
(like DDA) and targeted approaches (like SRM). In
the section about MS of phosphopeptides below,
the advantages and limitations of various peptide
fragmentation techniques will be addressed, while
post-MS computational data processing is presented
in a dedicated Computational Tools Section below.
Normalization to Protein Levels
Large-scale phosphoproteomic studies suffer from
the conceptual limitation that in the absence of
proteome-wide quantitative data about protein levels,
quantitative phosphoproteome measurements cannot
distinguish between changes in phosphorylation level
and protein level, simply because they exclusively
interrogate phosphopeptides. If the abundance of a phosphopeptide increases or decreases in response to
a certain stimulus or cellular condition, this change
may be caused by an alteration of the degree of phosphorylation at the respective site, but also by changes
in the abundance of the phosphorylated protein.
The group of Steven Gygi investigated this problem
in a systematic manner. Quantitative proteome and
phosphoproteome analysis of FUS3 or STE7 mutant
yeast strains compared to wild-type strains revealed
that 25% of the observed phosphorylation changes
were in fact attributable to changes in protein expression levels
and not to the regulation of the phosphorylation
events.65 This limitation has to be kept in mind when
interpreting large-scale phosphoproteomic datasets.
It is often assumed that in contrast to the comparison
of different cell types, the analysis of short-term
manipulations such as stimulation with hormones or
brief treatment with kinase inhibitors will minimize
the impact of protein levels on the phosphoproteome
profiles. However, if novel phosphorylation sites are
later selected for validation by further experiments,
the issue of protein versus phosphorylation level
should certainly be addressed.
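One simple way to address protein versus phosphorylation level, when both a phosphopeptide ratio and a protein ratio are available for the same pair of conditions, is to subtract the two on a log2 scale. The sketch below assumes that both ratios were measured as treated/control fold changes; the function name and the example numbers are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def site_regulation(phospho_ratio, protein_ratio):
    """log2 phosphopeptide fold change corrected for the protein fold change.

    Both inputs are assumed to be treated/control abundance ratios (> 0).
    A result near 0 suggests the phosphopeptide change tracks protein level.
    """
    return math.log2(phospho_ratio) - math.log2(protein_ratio)


# Hypothetical cases: the phosphopeptide is 4-fold up in both.
tracks_protein = site_regulation(phospho_ratio=4.0, protein_ratio=4.0)
true_regulation = site_regulation(phospho_ratio=4.0, protein_ratio=1.0)
print(tracks_protein, true_regulation)
```

In the first case the corrected value is zero, so the apparent phosphorylation change is fully explained by protein abundance; in the second, the full two log2 units remain attributable to the site itself.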
MASS SPECTROMETRY OF PHOSPHOPEPTIDES
Expanding on the experimental workflow for a
typical phosphoproteomic experiment outlined in
Figure 4, Figure 5 zooms into the central part of
mass spectrometric analysis and presents a more
detailed view of this critical step. In many respects,
the mass spectrometric analysis of phosphopeptides
follows the same patterns as conventional MS-based
proteomics with unmodified peptides. The tandem MS
measurements detect intact peptide precursor ions,
fragment them by various means, and subsequently
measure the masses of the resulting product ions.
The available acquisition schemes correspond to those
used in all proteomic disciplines, with the exception
of specialized scanning techniques such as neutral loss
scanning or precursor ion scans for phosphoamino acid-specific immonium ions.
FIGURE 5 | Mass spectrometry of phosphopeptides. The different elements of mass spectrometric analysis of (phospho-) peptides are shown. The
central element is the fragmentation of peptides which generates sequence information and is therefore instrumental in identifying peptides in a
biological sample. Different fragmentation strategies used in phosphoproteomics are displayed, as are the diverse acquisition schemes that have
already been introduced in Figure 3.
DDA: Strategies and Fragmentation Techniques
Most phosphoproteomic studies that were geared
toward the identification (and sometimes quantification) of large numbers of phosphopeptides have
been performed on instruments like the LTQ-FT or
the LTQ-Orbitrap, in which precursor masses are
measured in a high resolution and accuracy analyzer
such as a Fourier transform ion cyclotron resonance
(FT-ICR) cell or an orbital trap, fragmentation is performed by CID inside the linear ion trap, and the
resulting fragment masses are measured either with relatively low resolution in the trap itself or in the high
resolution analyzer. The use of high accuracy measurements in the MS1 step has been demonstrated
to significantly improve the quality and reliability
of phosphopeptide identifications.66 An important
issue for the MS2 step in phosphoproteomics is the
lability of the phosphoester bond in peptides phosphorylated on serine or threonine residues. Upon
fragmentation by CID, these phosphopeptides (unlike
tyrosine-phosphorylated peptides) readily lose the phosphate group as neutral phosphoric acid (H3PO4, approximately 98 Da), a phenomenon that is exploited by the
neutral loss scanning strategy mentioned earlier. This
behavior is problematic, because it leads to fragmentation spectra that display a very prominent fragment
ion peak that represents the neutral loss of phosphate,
and b- and y- sequence ion series of much lower intensity. This decreases the information content of the
fragmentation spectrum, which is needed for reliable
peptide identification and phosphorylation site assignment. Different measurement techniques have been
developed to alleviate this issue while still performing
CID in the linear ion trap. Two examples of such
strategies are MS3 and multistage activation (MSA).
In MS3 (in this context more specifically referred
to as data-dependent neutral loss MS3, DDNLMS3),
the neutral loss peptide ion is fragmented again to
induce a higher degree of backbone fragmentation
and obtain more sequence information.67 The drawback is that
sequence ions which were generated in the first fragmentation step are lost when isolating the neutral
loss peak for the second step. MSA (or pseudo MSn)
addresses this issue by omitting the ion isolation step
between MS/MS and MS3. The neutral loss fragment
is collisionally activated while the fragment ions from
the MS/MS step are still present in the trap. This procedure results in a hybrid fragmentation spectrum that
contains a higher number and intensity of structurally
informative ions than the individual spectra generated
in the MS3 approach.68 Acquisition speed is enhanced
as well, because the trap does not need to be refilled for
a second isolation step. Another issue related to fragmentation is not directly related to phosphorylation,
but to the fact that phosphoproteomic datasets contain
a significant population of proline-containing peptides
due to the activity of proline-directed kinases.10 The
peptide bond amino-terminal to proline is particularly
labile in CID (‘proline effect’), complicating peptide
identification.69 A distinct fragmentation mode can
be performed in the multipole collision cell of LTQ-Orbitrap instruments. In contrast to the routine setup
in which high-resolution MS1 scans are performed in
the orbitrap analyzer and low-resolution MS2 scans
in the linear ion trap (‘high–low’ strategy), a ‘high–high’ strategy, in which
each MS1 scan is followed by several higher energy
collisional dissociation (HCD) events and subsequent
measurement of the fragment ions in the orbitrap with
high resolution, has been
demonstrated to improve phosphopeptide measurements.70,71 Database search
engines should be adapted to this kind of fragmentation data, as some, but as yet not all, engines actually
exploit the high accuracy MS2 information to generate
more robust peptide identifications.72 When analyzing
samples of moderate complexity such as protein complexes, where acquisition speed to improve coverage
is less of an issue, it can be beneficial to consecutively
perform several different fragmentation steps on phosphopeptide precursor ions. Przybylski et al. applied a
combination of iTRAQ labeling, discovery, and subsequent targeted (inclusion list-based) measurements
and a set of different fragmentations (CID/MS2, MSA
and HCD) on all targeted precursors to optimize
phosphopeptide identification and quantification on
proteins of interest.73 The FragMixer software tool
was specifically designed to optimize peptide identification and localization of phosphorylation sites
based on the combined analysis of spectra generated
by different fragmentation modes on the same phosphopeptide precursor ions.74 Finally, it was recently
reported that HCD fragmentation of phosphopeptides
also features a distinct advantage in the determination of phosphorylation sites. On an LTQ-Orbitrap,
a newly discovered fragmentation mechanism during
HCD was shown to generate a neutral loss-derived
x-fragment ion which directly pinpoints the modified
residue.75
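The neutral loss behavior of pSer/pThr peptides in CID described earlier in this section translates into a predictable m/z offset, which is what data-dependent neutral loss strategies look for. The following sketch shows the arithmetic only and is not the logic of any particular instrument: the function names, the toy spectrum, and the matching tolerance of 0.5 m/z (chosen here to mimic a low-resolution ion trap scan) are all assumptions. The monoisotopic mass of H3PO4 is a standard value.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid (Da)


def neutral_loss_mz(precursor_mz, charge, n_losses=1):
    """Expected m/z of the precursor after losing n neutral H3PO4 molecules.

    The charge is unchanged by a neutral loss, so the shift is mass / charge.
    """
    return precursor_mz - n_losses * H3PO4 / charge


def find_neutral_loss_peak(peaks, precursor_mz, charge, tol=0.5):
    """Return the most intense (m/z, intensity) peak near the expected loss.

    `tol` is an assumed m/z tolerance; returns None if nothing matches.
    """
    target = neutral_loss_mz(precursor_mz, charge)
    hits = [(mz, inten) for mz, inten in peaks if abs(mz - target) <= tol]
    return max(hits, key=lambda p: p[1]) if hits else None


# Hypothetical MS2 peak list for a doubly charged precursor at m/z 700.0.
spectrum = [(420.2, 150.0), (651.0, 9000.0), (655.3, 300.0), (812.4, 220.0)]
peak = find_neutral_loss_peak(spectrum, precursor_mz=700.0, charge=2)
print(peak)
```

For a doubly charged precursor the loss appears roughly 49 m/z units below the precursor; in the toy spectrum this peak also dominates the sequence ions, which mirrors the reduced information content that MS3 and MSA are meant to address.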
Electron-driven dissociation methods represent an alternative fragmentation
strategy for phosphoproteomics, one that is complementary to CID and especially well suited for the
analysis of highly charged phosphopeptide ions with
low m/z values (high charge density). Electron capture
dissociation (ECD) and electron transfer dissociation
(ETD) dissociate the precursor ion by direct electron capture or by electron
transfer from a singly charged radical donor anion
(such as anthracene, fluoranthene, or azobenzene),
respectively. Unlike
CID, which generates b- and y-fragment ions, ECD
and ETD result in the formation of c- and z-type
fragment ions, and the phosphate group is usually
retained on the fragment without the occurrence of a
neutral loss. This greatly facilitates peptide identification and the precise determination of phosphorylation
sites.76 A recently described strategy termed decision
tree fragmentation combines the two complementary
fragmentation methods CID and ETD on a single
instrument in a way that seeks to maximize phosphoproteome coverage in complex samples (where
acquisition speed is a crucial issue). On a modified LTQ-Orbitrap machine equipped with an ETD
module, the decision tree algorithm makes on-the-fly decisions on how to fragment peptide precursor
ions, based on charge state and m/z, which are determined during the high resolution MS1 scans in the
orbital trap analyzer. Precursors with lower charge
states and higher m/z values are fragmented by CID,
while more highly charged precursors with lower m/z
are dissociated by ETD. The authors demonstrated
that this decision tree fragmentation scheme leads to
the identification of 7,422 phosphopeptides in human
embryonic stem cell phosphopeptide samples, compared to either 2,801 (CID) or 5,874 (ETD) when
using only one fragmentation method.77 Other modes
of fragmentation that have been applied to phosphopeptide analysis and are displayed in Figure 5 include
the electron-driven methods electron detachment dissociation (EDD) and metastable atom-activated dissociation (MAD), photodissociation approaches and
MALDI post-source decay (PSD). For further reading, we refer to a recent review article about MS/MS
strategies for phosphoproteomics.76
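The decision tree logic described above (lower charge and higher m/z to CID, higher charge and lower m/z to ETD) can be sketched as a small lookup. The text does not state the actual cutoffs, so the per-charge m/z thresholds below are illustrative placeholders, and the function and table names are hypothetical; this is a sketch of the idea and not the published algorithm.

```python
# Illustrative per-charge m/z cutoffs. These numbers are placeholders chosen
# for this example and are NOT taken from the text above.
ILLUSTRATIVE_CUTOFFS = {3: 650.0, 4: 900.0, 5: 950.0}


def choose_fragmentation(charge, mz, cutoffs=ILLUSTRATIVE_CUTOFFS):
    """Pick CID or ETD for a precursor from its charge state and m/z."""
    if charge <= 2:
        return "CID"  # low charge density: collisional activation
    cutoff = cutoffs.get(charge)
    if cutoff is None:
        return "ETD"  # charge above the table: assume high charge density
    # Within one charge state, a lower m/z means a higher charge density.
    return "ETD" if mz < cutoff else "CID"


# Hypothetical precursors as (charge, m/z) pairs.
precursors = [(2, 500.0), (3, 500.0), (3, 800.0), (6, 1200.0)]
decisions = [choose_fragmentation(z, mz) for z, mz in precursors]
print(decisions)
```

Keeping the cutoffs in a table rather than in the control flow makes it straightforward to tune them for a given instrument or sample type without touching the decision logic.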
Quantitative Phosphopeptide Measurements
With the exception of a few specific issues which are
unique to the field of phosphoproteomics, the quantification strategies for phosphopeptide measurements
largely correspond to those employed in conventional
peptide-centric (bottom-up) proteomics. As these have
been recently reviewed extensively,78 we confine this
section to delineating the general concepts of quantification without referring to specific examples of
application. Figure 6 displays different approaches
to the quantification of peptides in MS-based proteomics and phosphoproteomics. They can be divided
FIGURE 6 | Quantitation strategies in phosphoproteomics. The triple color coding represents the three distinct analytical stages of LC (blue), MS1
(green), and MS2 (orange) during LC-MS/MS. The different quantitation schemes are colored according to which kind of information they exploit for
quantitation. The yellow boxes on the left indicate categories such as label-free methods or stable isotope labeling that are commonly used for the
different quantitation strategies. Note that in contrast to spectral counting, SIL techniques can also be combined with strategies that are also used for
label-free quantification. For example, chemical tags with a mass shift such as ICAT or ICPL, or sample preparation methods employing metabolic
labeling or synthetic heavy peptides can be used in conjunction with MS1-based quantification or SRM, and isobaric tags like iTRAQ or TMT can be
combined with SRM.
into two broad categories: strategies utilizing stable
isotope labeling (SIL) and label-free quantification
methods. SIL can be performed either by spiking
labeled synthetic reference peptides into the biological samples to be quantified, by metabolic protein
labeling, or by chemical labeling of proteins or peptides. Label-free approaches either use the number
of MS/MS spectra identifying a given peptide, which
correlates with its abundance (spectral counting), or
use the precursor ion signal intensity in combination
with chromatographic information. SRM uses two
physical mass filters and utilizes information on the
LC, MS1, and MS2 level. Because MS1- and SRM-based
quantification strategies are general acquisition schemes rather than
labeling methods, they can also be combined with
SIL approaches to generate a compound quantification strategy. In general, quantification employing SIL
is more accurate than label-free quantification. The
color code shown in Figure 6 illustrates which levels of information are exploited by which method to
generate quantitative evidence on peptide levels. Of the methods shown, SRM
is the only one which uses information on all
three levels (DIA schemes such as SWATH-MS do so as well, but
are not included in this figure because they have
not been applied to phosphoproteome profiling yet).
MS1-based quantification and the SIL approaches
employing synthetic reference peptides, metabolic protein labeling, and chemical non-isobaric tags utilize
LC and MS1 information. The most commonly used
metabolic labeling strategy is SILAC, which utilizes
isotopically labeled amino acids in the cell culture
medium to generate ‘light’ and ‘heavy’ proteome samples for relative quantification.79 Non-isobaric chemical tags include isotope-coded protein labels (ICPL)80
and mTRAQ reagents.81 Isotope-coded affinity tag
(ICAT)82 approaches are not considered compatible
with phosphoproteomic analyses, as they interrogate
cysteine-containing peptides, which are relatively rare
and therefore do not deliver the protein sequence
coverage needed for PTM analysis. Quantification
based on the alignment of MS1 features typically uses
MS2-derived peptide identifications to annotate the
chromatographic peaks with the corresponding peptide identities, but does not use MS2 information for
quantitative purposes. The only methods which utilize exclusively MS2 information are spectral counting and
isobaric tagging approaches; the latter generate quantitative information based on reporter ions produced
upon fragmentation in the MS2 step. These reagents
include isobaric tags for relative and absolute quantification (iTRAQ)83 and tandem mass tags (TMT).84 A
recent study suggests that iTRAQ labeling is superior to mTRAQ for the quantitative profiling of
global phosphoproteomes.85 One question for
which quantitative information is crucial in phosphoproteomics
is the determination of phosphorylation stoichiometries. For single proteins, this has been achieved
in several cases with SRM-type strategies.86,87 Wu
et al. have established a strategy based on phosphatase
treatment and SIL to determine phosphorylation stoichiometries on a global scale. The utility of the
approach was demonstrated by determining the stoichiometries for 5,033 phosphorylation sites in yeast.88
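The arithmetic behind a phosphatase-based stoichiometry estimate can be sketched as follows. This is a minimal illustration and not the published workflow: it assumes that the sample is split into two equal aliquots, that phosphatase treatment of one aliquot is complete, that the peptide carries a single phosphorylation site, and that SIL yields the signal of the unmodified peptide in both aliquots. Under these assumptions the treated aliquot reports the total amount of the peptide, and the untreated aliquot reports only the unphosphorylated fraction. The function name and its parameters are hypothetical.

```python
def site_occupancy(unmodified_untreated: float, unmodified_treated: float) -> float:
    """Estimate fractional phosphorylation occupancy of a single site.

    unmodified_untreated: signal of the non-phosphorylated peptide in the
        untreated aliquot (only the unphosphorylated fraction is seen).
    unmodified_treated: signal of the same peptide after phosphatase
        treatment (assumed to represent the total peptide amount).
    """
    if unmodified_untreated < 0 or unmodified_treated <= 0:
        raise ValueError("signals must be non-negative, treated signal must be > 0")
    occupancy = 1.0 - unmodified_untreated / unmodified_treated
    # Measurement noise can push the estimate slightly outside [0, 1].
    return min(1.0, max(0.0, occupancy))


# Illustrative numbers only: the unmodified peptide is four times more
# abundant after phosphatase treatment, so 75% of the site was occupied.
print(site_occupancy(25.0, 100.0))
```

Clamping the result to the interval from 0 to 1 is a design choice for this sketch; in a real analysis, values outside this range would more usefully be flagged as unreliable measurements.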
Targeted Mass Spectrometry of
Phosphopeptides (SRM)
A number of studies have described the targeted measurement of phosphopeptides by SRM/MRM. Being
by definition a non-discovery method in which the
mass spectrometer simply records signals according to
coordinates that have been a priori set by the user, it
is usually performed to quantify sets of known phosphopeptides in a sensitive and reproducible manner
across many biological samples. For small numbers
and single proteins, SRM can also be applied as
a hypothesis-driven discovery type approach, when
transitions are designed to monitor putative phosphopeptides on a specific protein and to thus exhaustively
explore the space of possible phosphorylation events
on it. Several studies have utilized this kind of measurement to screen for phosphorylation sites with an
SRM-MS2 strategy, in which acquisition of a signal with phosphopeptide-specific transitions triggers
the acquisition of a full MS2 spectrum to reliably
identify the phosphopeptide.89,90 Zappacosta et al.
used a combination of isotope labeling, precursor ion
scanning for PO3− in negative ion mode, and SRM
to analyze phosphorylation of the yeast transcription factor Pho4.91 Glinski and Weckwerth performed
phosphopeptide SRM to monitor phosphorylation of
a small synthetic peptide library corresponding to
trehalose-6-phosphate synthase (TPS) isoform peptides by Arabidopsis thaliana leaf protein extracts.92
In human breast cancer tissue samples, SRM was
employed to validate 15 regulated phosphopeptides
selected from a quantitative phosphoproteome comparison of low- and high-risk recurrence groups.93
A study addressing the suitability of archival clinical
cancer samples reported the successful SRM measurement of 18 phosphopeptides derived from fresh frozen
(FF) and formalin-fixed paraffin-embedded (FFPE)
cancer tissue samples.94 On a larger scale, Wolf-Yadlin et al. have used SRM to monitor 222 distinct
phosphotyrosine-peptides across seven time points following EGF treatment of cells.95 Sherrod et al. have
described a pseudo-SRM approach implemented for
phosphopeptide analysis on a linear ion trap, which
entails acquisition of full MS2 spectra of peptides
specified in an inclusion list, and subsequent computational extraction of SRM-like traces specific for the
peptides of interest with the Skyline software tool. The
authors report the quantification of six phosphopeptides derived from immunoprecipitated EGF receptor
(EGFR).96 While certainly a viable strategy for targeted quantification, this ‘pseudo’-SRM relies
on a posteriori computational extraction of SRM-like
information from full-scan data and therefore most likely does not
provide the excellent sensitivity of ‘real’ SRM. The latter
applies actual physical mass filters on the
precursor and fragment ion levels, which are instrumental for achieving high sensitivity measurements in
complex backgrounds.
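The a posteriori extraction step of such a pseudo-SRM experiment can be sketched in a few lines. The example below is only an illustration of the principle and does not reproduce the algorithm used by Skyline: it assumes that each full MS2 spectrum is available as a retention time plus a list of (m/z, intensity) pairs, sums the intensity found within a fixed m/z tolerance around one fragment ion, and integrates the resulting trace with the trapezoidal rule. The data layout, function names, and the tolerance of 0.5 m/z units are assumptions.

```python
from typing import List, Sequence, Tuple

# One MS2 spectrum: (retention time in minutes, [(m/z, intensity), ...])
Spectrum = Tuple[float, Sequence[Tuple[float, float]]]


def extract_trace(spectra: Sequence[Spectrum], fragment_mz: float,
                  tol: float = 0.5) -> List[Tuple[float, float]]:
    """Build an SRM-like trace for one fragment ion from full MS2 spectra."""
    trace = []
    for rt, peaks in spectra:
        # Sum all peak intensities that fall inside the m/z window.
        intensity = sum(i for mz, i in peaks if abs(mz - fragment_mz) <= tol)
        trace.append((rt, intensity))
    return trace


def peak_area(trace: Sequence[Tuple[float, float]]) -> float:
    """Integrate a (retention time, intensity) trace with the trapezoidal rule."""
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(trace, trace[1:]):
        area += 0.5 * (i0 + i1) * (rt1 - rt0)
    return area


# Made-up spectra for illustration only.
spectra = [
    (10.0, [(500.3, 100.0), (612.4, 50.0)]),
    (10.1, [(500.2, 300.0)]),
    (10.2, [(500.3, 100.0), (700.0, 20.0)]),
]
trace = extract_trace(spectra, fragment_mz=500.3)
print(trace, peak_area(trace))
```

Because the selection happens in software after acquisition, interfering ions within the tolerance window contribute to the trace, which is one way to picture why physical mass filtering is expected to perform better in complex backgrounds.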
COMPUTATIONAL TOOLS IN
PHOSPHOPROTEOMICS
Like conventional proteomics, the analysis of phosphoproteomes is heavily dependent on a range of
computational tools. A representative collection of
these tools is summarized in Table 1. For a detailed
explanation of computational analysis of phosphoproteomic datasets, please refer to a recent review
article by Ren et al.138 Computational analysis is
required at several points in the data analysis workflow. The earliest point at which specialized software
is required (with certain exceptions like the decision
tree algorithm governing fragmentation decisions during MS/MS) is database searching and the
subsequent assignment of peptide identification probabilities. Two-step search strategies such as the refinement search option in X! Tandem can
help to identify modified peptides without leading to
an explosion of the search space: a limited
database is created from the proteins identified in
the first round and is then searched again with
more variable modifications being considered. However, care must
be taken with respect to different false discovery rates
in comparison to one-step database searches.139 The
Ascore algorithm operates after the database search
and assigns a probability that a given phosphorylation site assignment is correct. The SLoMo
tool for ETD data. To make the results of large-scale
phosphoproteome studies publicly available and to
© 2013 Wiley Periodicals, Inc.
Volume 3, January/February 2014
WIREs Developmental Biology
Mass spectrometry-driven phosphoproteomics
TABLE 1 Computational Tools for Phosphoproteomics
Name
Description
PubMed ID (Reference)
Phosphorylation site identification and assignment
Ascore
Probability-based algorithm for phosphorylation site localization in
high-throughput proteomic datasets
1696424397
SLoMo
Adaptation of the Ascore for electron transfer dissociation (ETD) data
1927524198
FragMixer
Automated identification of phosphopeptides and phosphorylation sites
based on multiple fragmentation modes in MS/MS
2309486674
PTMProphet
Localization of PTMs in modified peptides, integrated into the transproteomic
pipeline (TPP)
Mascot Delta Score
Localization of phosphorylation sites based on peptide identifications
generated by the Mascot search engine
21057138100
LuciPHOr
Phosphosite localization based on TPP-processed MS/MS data, providing
estimates for the false localization rate (FLR)
n/a (in review)
PhosphoScore
Phosphorylation site assignment tool compatible with data from multiple MS
levels (MSn )
18543960101
MSQuant
Comprises phosphorylation site scoring feature compatible with Mascot
search results
170819833
Inspect
Contains scoring function to improve phosphopeptide identification from
unassigned MS/MS spectra
18563926102
Phosm
Contains PhosphoSiteScore feature for site assignment, suited for in-depth
analysis of small datasets, also works for unassigned MS/MS spectra
17718535103
n/a.99
Publicly accessible databases for phosphorylation sites
PhosphoPep
MS-based phosphopeptide database for Saccharomyces cerevisiae ,
Caenorhabditis elegans , Drosophila melanogaster , and Homo sapiens
21082442104
PHOSIDA
Posttranslational modification database containing phosphorylated,
acetylated, and N -glycosylated peptides
18039369105
Phospho.ELM
Database of experimentally verified phosphorylation sites in eukaryotic
proteins
21062810106
PhosphoPOINT
Integrates human phosphoproteome datasets with human kinase
interactome networks
18689816107
PhosPhAt
Depository for Arabidopsis thaliana phosphorylation sites
19880383108
P3 DB
Plant Protein Phosphorylation DataBase, contains phosphorylation sites for
six plant species
18931372109
SysPTM
Comprehensive posttranslational modification database containing datasets
covering nearly 50 different PTMs
19366988110
HPRD
Human Protein Reference Database, contains data about domain
architecture, PTMs, interaction networks and disease association
18988627111
PhosphoSitePlus
Posttranslational modification database from Cell Signaling Technology,
primarily human and mouse proteins
22135298112
PhosphoNET
Phosphorylation site database from Kinexus for human proteins, contains
experimentally validated as well as predicted sites
n/a (corporate)
Sequence pattern recognition in phosphoproteomic datasets
Motif-X
Extracts overrepresented patterns from any sequence dataset through
iterative comparison to a dynamic statistical background
16273072113
Scan-X
Detects phosphorylation motifs (identified by Motif-X) within any sequence
dataset
18974045114
HPRD PhosphoMotif
Reports the presence of known phosphorylation-based substrate and binding
17344875115
Finder
MoDL
motifs curated from the literature
Motif Description Length, discovery of motif mixtures for uncharacterized
kinases and phosphatases in phosphoproteomic datasets
Volume 3, January/February 2014
© 2013 Wiley Periodicals, Inc.
18996944116
99
wires.wiley.com/devbio
Advanced Review
TABLE 1 Continued
Name
Description
PubMed ID (Reference)
SMALI
Scoring Matrix-Assisted Ligand Identification, identification of
phosphopeptide ligands that are likely to bind to SH2 domains
17956856117 , 18424801118
Prediction of phosphorylation sites and kinase–substrate relationships
NetPhorest
Atlas of linear kinase consensus motif and phosphorylation-dependent
binding domains, interfaces to Scansite, Phospho.ELM, and PhosphoSite
18765831119
NetworKIN
Prediction of in vivo kinase–substrate relationships, integrating cellular and
molecular contexts for kinases and phosphoproteins
17570479120
NetPhosK
Kinase-specific prediction of eukaryotic protein phosphorylation sites
15174133121
Predikin
Prediction of substrate specificities of protein kinases, suitable for
proteome-wide predictions
21829434122
Scansite
Prediction of kinase–substrate and cell signaling interactions based on short
sequence motifs
12824383123
GPS
Group-based Prediction System, kinase-specific phosphorylation site predictor
18463090124
Minimotif Miner
Detection of approximately 300,000 distinct short sequence motifs in protein
sequence queries
22146221125
KEA
Kinase Enrichment Analysis, draws upon data from NetworKIN, Phospho.ELM,
MINT, HPRD, PhosphoPoint, SwissProt, and manually curated data
19176546126
KID
Kinase Interaction Database for yeast proteins, literature-curated depository
of kinase–substrate pairs
21492431127
Network analysis and protein classification
Cytoscape
Modular software package for visualizing molecular interaction networks and
integrating them with other data types
21149340128
STRING
Search Tool for the Retrieval of Interacting Genes, interaction database for
more than 1100 sequenced organisms
21045058129
BioGRID
Biological General Repository for Interaction Datasets, contains
protein–protein and genetic interactions from large- and small-scale
studies
21071413130
MINT
Molecular INTeraction Database, contains roughly 235,000 binary
protein–protein interactions captured from over 4750 publications
22096227131
IntAct
Molecular interaction database containing data interactions from the
literature or direct data depositions
22121220132
PANTHER
Protein ANalysis THrough Evolutionary Relationships, protein classification
based on gene families, GO classes and pathways
12520017133
PhosSNP
Database of nonsynonymous single-nucleotide polymorphism (nsSNPs) that
potentially influence protein phosphorylation in human cells
19995808134
PepCyber: P∼PEP
Database of protein–protein interactions mediated by
phosphoprotein-binding domains (PPDBs) in human cells
18160410135
OpenMS
Modular software package for the analysis of mass spectrometry-based
proteomic data
21063960136
Skyline
Software tool for performing quantitative proteomic experiments either in
selected reaction monitoring (SRM) or full-scan mode
20147306137
Additional resources
The tools displayed in this table tackle various challenges in the analysis and dissemination of phosphoproteomic data. See main text for details.
facilitate data mining, repositories for protein phosphorylation datasets are an important part of the
computational curation in phosphoproteomics. Large
and widely used databases include PhosphoPep, PhosphoSitePlus, PHOSIDA, and Phospho.ELM. These
resources differ substantially in the amount of data
they contain for each phosphorylation event. In
PhosphoPep, consensus fragmentation spectra can be
accessed, but the raw MS data or mzXML files are
usually not accessible in the public databases, unless
they have been deposited in e.g., PRIDE or Tranche
upon publication of the respective research paper. The
value of accessible raw data files is illustrated by the
recently reported identification of ADP-ribosylation
sites via re-searching of a published phosphoproteomic dataset.140 Furthermore, a number of specialized tools assign putative upstream
kinases to experimentally identified phosphorylation
sites, or predict phosphorylation sites based on
consensus motifs in an input protein sequence. These
tools comprise NetPhorest, NetworKIN, Scansite, and
others. Related algorithms such as Motif-X and Scan-X can be used to recognize overrepresented sequence
motifs surrounding phosphorylation sites in phosphoproteomic datasets, facilitating the identification of
putative novel kinase recognition sequences involved
in a specific biological context.
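The simplest form of consensus-motif-based site prediction can be illustrated with a regular expression scan. The sketch below is not how NetPhorest, NetworKIN, or Scansite work internally; those tools use scoring matrices, trained classifiers, and contextual information. It merely assumes the commonly cited minimal AKT consensus R-x-R-x-x-[S/T] as an example pattern and reports every serine or threonine that sits in such a context. The function name and the toy sequence are made up for illustration.

```python
import re
from typing import List, Tuple

# Assumed example pattern: minimal AKT-like consensus R-x-R-x-x-[S/T].
# The lookahead lets overlapping motif occurrences be reported as well.
AKT_LIKE_MOTIF = re.compile(r"(?=(R.R..[ST]))")


def find_candidate_sites(sequence: str,
                         motif: "re.Pattern[str]" = AKT_LIKE_MOTIF
                         ) -> List[Tuple[int, str, str]]:
    """Return (1-based position, residue, matched window) for each motif hit."""
    sites = []
    for match in motif.finditer(sequence.upper()):
        window = match.group(1)
        # The phosphoacceptor is the last residue of the matched window.
        position = match.start() + len(window)
        sites.append((position, window[-1], window))
    return sites


# Toy sequence, not a real protein.
print(find_candidate_sites("MKRARSYSLPAA"))
```

A pure pattern match of this kind yields many false positives on a proteome-wide scale, which is why the tools listed in Table 1 combine motif information with additional evidence.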
In addition, there are several computational
resources which are not specific to phosphoproteomics, but are more generic tools that can be used
for visualization of large-scale networks derived from
various datasets. These include Cytoscape, for mapping of phosphoproteome profiles onto pre-existing
(mostly protein–protein interaction) networks, or for
structuring phosphoproteome datasets according to
gene ontology (GO) annotations. Hyperlinks to the
individual database and software tool Web sites, and
the respective literature references in PubMed are
provided in Table 1.
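What mapping a phosphoproteome onto a pre-existing interaction network means in practice can be sketched without any dedicated software. The example below is a minimal illustration and unrelated to the Cytoscape implementation: it assumes that the network is given as a list of binary protein–protein interactions and the phosphoproteome as a set of protein identifiers, and it compares the mean number of interaction partners of phosphorylated and nonphosphorylated proteins. All names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Set, Tuple


def build_degree_map(interactions: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count distinct interaction partners per protein (self-loops ignored)."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def mean_degree_by_phospho_status(interactions: Iterable[Tuple[str, str]],
                                  phosphoproteins: Set[str]) -> Tuple[float, float]:
    """Return (mean degree of phosphoproteins, mean degree of all other proteins)."""
    degrees = build_degree_map(interactions)
    phospho = [d for p, d in degrees.items() if p in phosphoproteins]
    other = [d for p, d in degrees.items() if p not in phosphoproteins]
    return (mean(phospho) if phospho else 0.0,
            mean(other) if other else 0.0)


# Toy network with made-up protein names.
toy_interactions = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(mean_degree_by_phospho_status(toy_interactions, {"A"}))
```

Proteins without any recorded interaction do not appear in the degree map in this sketch; whether to count them with a degree of zero is a choice that affects the comparison and should be made explicitly in a real analysis.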
PHOSPHOPROTEOMICS IN THE
CONTEXT OF SYSTEMS BIOLOGY
Choice of Appropriate Biological Questions
The success of a phosphoproteomic analysis strongly
depends on the experimental system used and the
biological question—if any—asked. Pure cataloging
projects usually do not address a specific biological
question, but still, the outcome of the investigation depends on parameters like available amounts
of starting material, protein extraction conditions,
phosphopeptide enrichment, and downstream experimental protocols. As delineated above, the focus on
a specific subsection of the phosphoproteome may
provide more biological insight than a global profiling. Usually cell culture models provide an accessible
entry point before moving to more challenging sample types like tissues or more complex organisms.
Furthermore, model organisms with a less complex
proteome allow for higher coverage in global profiling
experiments compared to cells from higher organisms.
Several examples of how proteomic analyses of protein expression and PTM profiles have been applied
to investigate the molecular basis of developmental
biology processes are presented in a recent review
article by Alexey Veraksa in WIREs Developmental
Biology.141
Modeling and Integration with Other Data
Types
While phosphoproteome measurements contain very
valuable information about the studied biological
systems independently of other experimental observations, the combined analysis of phosphoproteomic
data with other cellular parameters, and the generation of models based on the experimental data
can certainly yield novel insights. Figure 7 illustrates which kind of datasets may be combined with
phosphoproteome profiles, and which categories of
modeling approaches are available. The overall structure of Figure 7 demonstrates that various types of
experimental data can be acquired and analyzed in
combination with the phosphoproteome data. One
example of this is the normalization of phosphopeptide to protein abundances described above. Furthermore, phosphoproteome (or combined systems-level) datasets can be mapped onto and combined
with sets of biological information stored in public
databases such as genomes, interaction networks, signaling pathways, phosphorylation site repositories,
or individual research papers. The resulting compound body of experimental evidence is then used
to facilitate the generation of computational models to visualize and describe the biological system
under investigation, or even generate models with
predictive features. Following the review article by
Terfve and Saez-Rodriguez,142 we divide the modeling
approaches into descriptive and predictive ones. It is
beyond the scope of this review to comprehensively
discuss the different types of datasets which have
been combined with phosphoproteome profiles, and
the various modeling approaches. An example of the
integration of different datasets is the combined computational analysis of the yeast phosphoproteome and
protein–protein interactome. Mapping of the phosphoproteome onto the yeast interactome and related
large-scale datasets revealed that in general, phosphorylated proteins have more interaction partners than
nonphosphorylated proteins, implying that phosphorylation plays an important role in the regulation
of protein–protein interactions.143 Data integration
and modeling are of high importance to phosphoproteomics and systems biology, because they can reveal
regulatory connections and subnetworks that would
not be apparent from the analysis of isolated datasets
alone. More information about the modeling options
FIGURE 7 | Phosphoproteomics in the framework of systems biology. Phosphoproteome profiles can be either analyzed as stand-alone datasets,
or combined with other experimental data (yellow ovals) to generate further biological knowledge. In addition, experimental datasets of different
sources which are stored in public databases or retrieved from literature can be integrated with the acquired phosphoproteome profiles in various
ways. The generation of descriptive or predictive models of processes involving phosphorylation events depends on high-quality experimental data
describing different cellular parameters.
for phosphoproteome data can be found in the recent
review article mentioned above.142
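The normalization of phosphopeptide to protein abundances mentioned in this section amounts to a subtraction on the log scale. The following is a minimal sketch, assuming that fold changes between two conditions are available for a phosphopeptide and for its parent protein; the function name and the example numbers are illustrative and not taken from any study cited here.

```python
import math


def normalized_phospho_log2fc(phospho_ratio: float, protein_ratio: float) -> float:
    """Log2 fold change of a phosphopeptide corrected for its protein's change.

    A value near 0 means the phosphopeptide change is explained by a change
    in protein abundance; a non-zero value points to altered site occupancy.
    """
    if phospho_ratio <= 0 or protein_ratio <= 0:
        raise ValueError("ratios must be positive")
    return math.log2(phospho_ratio) - math.log2(protein_ratio)


# The phosphopeptide goes up 4-fold while the protein doubles:
# a 2-fold (log2 = 1) increase remains after normalization.
print(normalized_phospho_log2fc(4.0, 2.0))
```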
Biochemical and Functional Validation of
Phosphorylation Events
The constantly growing body of accumulated
phosphoproteomic datasets in publicly available
databases creates a discrepancy between the many
phosphorylation events that have been identified
by large-scale experiments, but are otherwise
completely uncharacterized, and the relatively small
population of phosphorylation sites for which the
actual mechanistic and/or biological relevance is
known. Conversely, many well-characterized
phosphorylation events are invisible to current
standard phosphoproteomic screens, either because
of abundance issues, or because they are not
located in a protein sequence context that produces
adequately sized tryptic peptides for MS analysis.
This gap between uncharacterized and characterized
phosphorylation events will be challenging to close,
simply because of the large amount of time that has
to be invested in the mechanistic and functional
validation of phosphorylation events by classical
low-throughput biochemical and genetic experiments.
There is a tradeoff between the investment of time and
manpower and the depth or ‘biological resolution’
of knowledge that can be acquired about a specific
number of newly discovered phosphorylation events.
These different levels of validation (biochemical,
mechanistic, physiological) are discussed in this
section and are summarized in Figure 8. The first level
of validation is a biochemical one, in which the goal
is to show that the phosphorylation observed by MS
on the peptide level is indeed occurring on the protein
in question. For this purpose, so-called PhosTag gels
which facilitate the visualization of a phosphorylation-induced bandshift of a protein in immunoblotting
are a useful tool, as e.g. performed in a study
investigating rapamycin-sensitive phosphorylation
events in yeast.144 In the case of a phosphorylation
that occurs under a specific condition, a bandshift
may be seen in this condition compared to a control
(e.g., untreated) condition, and this should be sensitive
to phosphatase treatment of the protein extracts.
Constitutive phosphorylation events will display a
bandshift relative to a phosphatase-treated control
sample, or a mutated allele of the protein of interest
carrying a non-phosphorylatable amino acid in place
of the given phosphorylation site. A more detailed
and much more laborious biochemical validation
entails the identification of the protein kinase which
is catalyzing the phosphorylation of interest. Similar
FIGURE 8 | Different levels of validation for newly discovered phosphorylation sites. We divide the various types of validation experiments into
three categories of varying biological information content, namely characterization of the phosphorylation event itself, investigation of the
mechanistic relevance of the modification for the protein on which it occurs, and assessment of in vivo relevance by genetic means and
structure–function analyses.
to the identification of novel substrates for a
given kinase, this is a nontrivial task and can
be approached by a combination of computational
(identification of kinase consensus motifs surrounding
the phosphorylated residue), biochemical (in vitro and
in vivo enzymatic assays with candidate kinases) and
genetic (using the phosphorylation event as a readout
in kinase knockdown or inhibition experiments)
procedures. A chemical genetic approach which
may be used in this context utilizes engineered
analog-sensitive protein kinases.145 Once a new
phosphorylation event has been confirmed on the
protein level, a central question is how the activity,
complex composition, subcellular localization or
other cellular parameters of the modified protein
are altered by the phosphorylation. This second level
of validation addresses the issue of the mechanistic
relevance of a phosphorylation event on the protein on
which it occurs. An example of this kind of validation
is the seminal discovery that phosphorylation of
FOXO transcription factors by AKT leads to binding
to 14-3-3 proteins, sequestration in the cytoplasm and
thereby inhibition of transcription factor activity.146
Even if the phosphorylation event is well characterized
on a biochemical and mechanistic level, and even if the
upstream kinase has been identified, this evidence is
not sufficient to judge the physiological relevance of
a phosphorylation for the protein function in vivo.
This can be assessed by investigating the capability
of phosphorylation site mutant proteins to exert the
function of the wild-type protein, either by knock-in
or genetic rescue strategies. An example of how these
different levels of validation have been performed for
a specific set of phosphorylation sites on a protein is
described in Box 2. For the sake of completeness,
it should be noted that these rescue experiments are
usually performed in vivo and convey a powerful
statement about physiological relevance in an actual
organism, but can also be done with less effort in
cell culture. Depending on the experimental system,
either the endogenous gene can be replaced with the
phosphosite mutant by homologous recombination,
or a rescue construct can be combined with a null
allele or RNA interference (RNAi). In the case of
RNAi, measures have to be taken to direct the
knockdown only to the endogenous gene but not
the transgene encoded by the rescue construct. This
can be achieved by targeting the untranslated regions
(UTRs) of the endogenous messenger RNA by RNAi
to achieve the knockdown. The transgene usually
contains exogenous UTRs and will therefore not
be affected by the RNAi-mediated silencing. The
introduction of silent mutations that render the
rescue construct-encoded gene refractory to RNAi is
only suitable for short interfering RNA (siRNA)- or short
hairpin RNA (shRNA)-mediated knockdown, and
not for procedures involving long double-stranded
RNAs (dsRNAs) as used in most Drosophila cell culture
RNAi screens. These complementation-based assays
for assessing phosphorylation site relevance are time
consuming and difficult to extend to high-throughput
experimental scales. As a compromise between effort
and biological insight, newly identified proteins from
phosphoproteomic screens can be screened with RNAi
knockdown for phenotypes related to the biological
context under investigation. For example, hits from
the analysis of the insulin- or rapamycin-sensitive
phosphoproteome could be screened for cell growth
phenotypes with RNAi in cell culture or in vivo.
The proteins which elicit a growth-related phenotype
upon silencing would be strong candidates for novel
effectors or pathway components, even if no direct
evidence about the phosphorylation event is generated
by this type of experiment.
BOX 2
PHYSIOLOGICAL RELEVANCE OF
PHOSPHORYLATION SITES—THE
EXAMPLE OF TSC2 PHOSPHORYLATION
BY AKT/PKB
An example of such a well-characterized
molecular event is the phosphorylation of TSC2
by AKT. It had been shown that AKT
directly phosphorylates the protein in vitro
and in vivo at several specific sites, thereby
inactivating the TSC1/TSC2 complex. In this
way, AKT was proposed to exert an activating
stimulus on TOR activity by relieving the
inhibition by the upstream TSC protein complex.
However, it was then demonstrated by genetic
means involving non-phosphorylatable as well
as phospho-mimicking TSC2 mutants that this
phosphorylation was irrelevant for normal TSC2
protein function in the developing organism.
Phosphosite mutant alleles of TSC2 were
shown to fully rescue the lethality
of TSC2 mutants, demonstrating that the
phosphorylation events are dispensable for TSC2
protein function under normal conditions.147
In a similar study, this finding was confirmed
and it was additionally demonstrated that
although AKT also phosphorylates TSC1 in
Drosophila, this modification is dispensable for
AKT-dependent growth regulation as well.148
While these experiments show that AKT-dependent TSC phosphorylation is non-essential
during organismal development under normal
conditions, they do not rule out the possibility
that it may be functionally required for
specific processes such as cellular transformation
triggered by deregulated insulin signaling. In a
similar genetic setup in S. cerevisiae, the impact
of specific phosphorylation events on metabolic
enzyme activity was investigated using metabolic
fluxes as a phenotypic readout.149 In summary,
these genetic approaches illustrate how the in
vivo relevance of specific phosphorylation events
can be assessed.
Current State of Phosphoproteome Analysis
in Different Organisms
Table 2 summarizes the current state of experimental
phosphoproteome coverage in a range of organisms
belonging to the three domains of life: archaea,
bacteria, and eukarya. When looking at the numbers
in the table, two issues should be kept in mind.
First, it is not possible to make a general statement
about data quality and false discovery rates regarding
these reported numbers. The underlying data sources
are very heterogeneous in terms of experimental and
statistical procedures, as well as the availability of
raw data such as fragmentation spectra. Even when
using the large databases, it is usually not possible for
the user to assess the quality of the underlying data or to retrace the calculation of confidence scores,
if these are available at all. Second, the displayed coverage
values are mainly a measure of how much effort has
been invested in the phosphoproteome analysis of a
specific organism. Because basically all experimental
proteome and phosphoproteome inventories must
still be considered incomplete to varying extents,
the coverage values are likely not very accurate, but
rather rough estimates. Seminal large-scale MS-based
phosphoproteomic studies that each identified thousands of phosphorylation sites in model organisms
and human cells include phosphoproteome analyses
in Caenorhabditis elegans,150 Drosophila cultured
cells151 and embryos,152 a mouse liver cell line,153
liver extracts4 and mouse embryonic stem cells,154
and the human cancer cell lines HeLa and K562.155
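The coverage values reported in Table 2 appear to be the ratio of identified phosphoproteins to annotated protein-coding genes, expressed as a percentage. The short sketch below reproduces this arithmetic for two rows of the table; the function name is hypothetical, and rounding to one decimal place is an assumption based on how the values are printed.

```python
def percent_phosphorylated(phosphoproteins: int, protein_coding_genes: int) -> float:
    """Share of annotated protein-coding genes with an identified phosphoprotein."""
    if protein_coding_genes <= 0:
        raise ValueError("number of protein-coding genes must be positive")
    return round(100.0 * phosphoproteins / protein_coding_genes, 1)


# Values taken from Table 2.
print(percent_phosphorylated(62, 2749))    # Halobacterium salinarum -> 2.3
print(percent_phosphorylated(3006, 6692))  # Saccharomyces cerevisiae -> 44.9
```

As stated in the text, these numbers mainly reflect how much experimental effort has been invested in a given organism, so the calculation says nothing about the true extent of phosphorylation.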
CONCLUSION
Mass spectrometry has emerged as the method of
choice for the large-scale analysis of protein phosphorylation events. Although impressive progress has been
made in terms of proteome coverage and refinement
of analytical techniques regarding enrichment, fragmentation, and detection of phosphopeptides, as well
TABLE 2 Phosphoproteome Coverage in Different Organisms
Organism | Phosphoproteins (identified) | Phosphopeptides (identified) | Phosphorylation sites (identified) | Data Source | Annotated Protein Coding Genes | Percent Phosphorylated

Archaea
Halobacterium salinarum | 62 | 100 | 75 | P | 2,749 | 2.3
Haloferax volcanii | 8 | n/a | 9 | P | 4,015 | 0.2

Bacteria
Mycoplasma pneumoniae | 63 | n/a | 16 | P | 707 | 8.9
Trypanosoma cruzi | 753 | 1,494 | 2,572 | P | 19,615 (P) | 3.8
Bacillus subtilis (strain 168) | 78 | 102 | 76 | P | 4,176 | 1.9
Escherichia coli (K12 substr. MG1655) | 79 | 104 | 81 | P | 4,146 | 1.9
Lactococcus lactis | 63 | 99 | 73 | P | 2,321 | 2.7
Klebsiella pneumoniae | 81 | n/a | 117 | P | 5,779 | 1.4
Pseudomonas aeruginosa | 23 | n/a | 55 | P | 5,571 | 0.4
Pseudomonas putida | 40 | n/a | 53 | P | 5,350 | 0.7
Streptomyces coelicolor | 40 | n/a | 46 | P | 8,153 | 0.5
Mycobacterium tuberculosis | 301 | n/a | 516 | P | 4,003 | 7.5
Streptococcus pneumoniae | 84 | n/a | 163 | P | 2,148 | 3.9

Eukarya
Arabidopsis thaliana | 4,078 | n/a | 13,899 | DB (P3DB) | 37,761 (AtGDB171) | 10.8
Brassica napus | 325 | n/a | 818 | P | 101,620 | 0.3
Glycine max | 1,451 | n/a | 2,739 | DB (P3DB) | 62,442 (GMGDB163) | 2.3
Medicago truncatula | 980 | n/a | 3,351 | P | 45,888 (Phytozome) | 2.1
Nicotiana tabacum | 10 | n/a | 10 | P | 116,964 | < 0.1
Oryza sativa | 4,829 | n/a | 12,317 | P | 66,338 (Phytozome) | 7.3
Solanum tuberosum | 2 | n/a | 3 | P | 39,031 | < 0.1
Zea mays | 86 | n/a | 115 | P | 136,522 (ZmGDB181) | 0.1
Saccharomyces cerevisiae | 3,006 | 24,190 | n/a | DB (PhosphoPep¹) | 6,692 (ENSEMBL) | 44.9
Caenorhabditis elegans | 2,373 | 6,926 | 6,780 | DB (PHOSIDA) | 20,517 (ENSEMBL) | 11.6
Drosophila melanogaster | 5,786 | 23,301 | n/a | DB (PhosphoPep¹) | 13,937 (ENSEMBL) | 41.5
Mus musculus | 9,234 | 24,604 | 25,085 | DB (PHOSIDA) | 23,158 (ENSEMBL) | 39.9
Homo sapiens | 8,283 | 23,130 | 24,262 | DB (PHOSIDA) | 20,848 (ENSEMBL) | 39.7
Homo sapiens and Mus musculus | 18,768 | n/a | 177,945 | DB (PhosphoSitePlus) | 44,006 (ENSEMBL) | 42.6

Experimental phosphoproteome coverage is shown in terms of identified phosphorylated proteins, phosphopeptides, and phosphorylation sites. The distinction
between identified phosphopeptides and phosphorylation sites was only made where this was clearly stated in the data source. Data sources are either research
publications (P) or databases (DB). For organisms with several phosphoproteomic studies, database results are provided. For organisms that are represented
in several phosphorylation site databases (yeast, C. elegans, Drosophila, mouse, human), the datasets containing the highest number of phosphopeptides or
phosphorylation sites are displayed. PhosphoSitePlus, which does not feature an organism-specific proteome-wide search function, contains the highest number
of phosphorylation sites, 90% of which are from human and mouse. Total predicted proteome size is expressed as the number of annotated protein coding
genes and was retrieved, unless indicated otherwise, from the NCBI Genome database. The numbers regarding phosphoproteome coverage for most of the
prokaryotes are taken from a review article about bacterial phosphoproteomics by Mijakovic and Macek.8
1 PeptideProphet cutoff set at 0.9.
as computational curation of the acquired datasets,
current phosphoproteomic studies must still be considered incomplete because of the inability to capture the
whole phosphoproteome. Several steps in the experimental pipeline can be manipulated and optimized
Volume 3, January/February 2014
to enhance phosphoproteome coverage. A fruitful
and goal-oriented strategy is the targeted analysis
of a defined subphosphoproteome which contains
a high degree of information with respect to the
biological context under investigation. This can be
© 2013 Wiley Periodicals, Inc.
105
wires.wiley.com/devbio
Advanced Review
achieved by various sample preparation protocols
such as subcellular fractionation, isolation of subpopulations such as the phosphotyrosine proteome, or
the physical enrichment of protein classes or complexes. Furthermore, targeted MS techniques like
SRM enable the sensitive, specific, and quantitative
detection of intermediate numbers of phosphorylated
peptides and proteins in the samples of interest. As MS
instrumentation becomes more powerful and enables
ever-increasing coverage of the phosphoproteome
by conventional DDA-type or DIA-type acquisition
schemes, the need for targeted detection of phosphopeptides may decrease in the future. Probably the most
challenging aspect of phosphoproteomics for decades
to come will be the task of assigning biological significance to all the newly discovered phosphorylation
sites, a challenge that will only be met by the concerted
efforts of proteomics, classical biology and computational resources to integrate the experimental results
from different disciplines.
REFERENCES
1. Cohen P. The origins of protein phosphorylation. Nat Cell Biol 2002, 4:E127–E130.
2. Hunter T, Sefton BM. Transforming gene product of Rous sarcoma virus phosphorylates tyrosine. Proc Natl Acad Sci USA 1980, 77:1311–1315.
3. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006, 127:635–648.
4. Villen J, Beausoleil SA, Gerber SA, Gygi SP. Large-scale phosphorylation analysis of mouse liver. Proc Natl Acad Sci USA 2007, 104:1488–1493.
5. Hohenester UM, Ludwig K, Krieglstein J, Konig S. Stepchild phosphohistidine: acid-labile phosphorylation becomes accessible by functional proteomics. Anal Bioanal Chem 2010, 397:3209–3212.
6. Cohen P. The regulation of protein function by multisite phosphorylation—a 25 year update. Trends Biochem Sci 2000, 25:596–601.
7. Holt LJ, Tuch BB, Villen J, Johnson AD, Gygi SP, Morgan DO. Global analysis of Cdk1 substrate phosphorylation sites provides insights into evolution. Science 2009, 325:1682–1686.
8. Mijakovic I, Macek B. Impact of phosphoproteomics on studies of bacterial physiology. FEMS Microbiol Rev 2012, 36:877–892.
9. Lemeer S, Heck AJ. The phosphoproteomics data explosion. Curr Opin Chem Biol 2009, 13:414–420.
10. Ubersax JA, Ferrell JE Jr. Mechanisms of specificity in protein phosphorylation. Nat Rev Mol Cell Biol 2007, 8:530–541.
11. Levy ED, Michnick SW, Landry CR. Protein abundance is key to distinguish promiscuous from functional phosphorylation based on evolutionary information. Philos Trans R Soc Lond B Biol Sci 2012, 367:2594–2606.
12. Pearlman SM, Serber Z, Ferrell JE Jr. A mechanism for the evolution of phosphorylation sites. Cell 2011, 147:934–946.
13. Hunter T. Tyrosine phosphorylation: thirty years and counting. Curr Opin Cell Biol 2009, 21:140–146.
14. Tan CS, Pasculescu A, Lim WA, Pawson T, Bader GD, Linding R. Positive selection of tyrosine loss in metazoan evolution. Science 2009, 325:1686–1688.
15. Goulian M. Two-component signaling circuit structure and properties. Curr Opin Microbiol 2010, 13:184–189.
16. Lee JW, Chen H, Pullikotil P, Quon MJ. Protein kinase A-alpha directly phosphorylates FoxO1 in vascular endothelial cells to regulate expression of vascular cellular adhesion molecule-1 mRNA. J Biol Chem 2011, 286:6423–6432.
17. Bantscheff M, Eberhard D, Abraham Y, Bastuck S, Boesche M, Hobson S, Mathieson T, Perrin J, Raida M, Rau C, et al. Quantitative chemical proteomics reveals mechanisms of action of clinical ABL kinase inhibitors. Nat Biotechnol 2007, 25:1035–1044.
18. Dephoure N, Howson RW, Blethrow JD, Shokat KM, O’Shea EK. Combining chemical genetics and proteomics to identify protein kinase substrates. Proc Natl Acad Sci USA 2005, 102:17940–17945.
19. Sopko R, Andrews BJ. Linking the kinome and phosphorylome—a comprehensive review of approaches to find kinase targets. Mol Biosyst 2008, 4:920–933.
20. Hunter T. The age of crosstalk: phosphorylation, ubiquitination, and beyond. Mol Cell 2007, 28:730–738.
21. Kamemura K, Hart GW. Dynamic interplay between O-glycosylation and O-phosphorylation of nucleocytoplasmic proteins: a new paradigm for metabolic control of signal transduction and transcription. Prog Nucleic Acid Res Mol Biol 2003, 73:107–136.
22. Hart GW, Housley MP, Slawson C. Cycling of O-linked beta-N-acetylglucosamine on nucleocytoplasmic proteins. Nature 2007, 446:1017–1022.
23. Mishra S, Ande SR, Salter NW. O-GlcNAc modification: why so intimately associated with phosphorylation? Cell Commun Signal 2011, 9:1.
24. Rust HL, Thompson PR. Kinase consensus sequences: a breeding ground for crosstalk. ACS Chem Biol 2011, 6:881–892.
25. Yamagata K, Daitoku H, Takahashi Y, Namiki K, Hisatake K, Kako K, Mukai H, Kasuya Y, Fukamizu A. Arginine methylation of FOXO transcription factors inhibits their phosphorylation by Akt. Mol Cell 2008, 32:221–231.
26. Sims RJ 3rd, Rojas LA, Beck D, Bonasio R, Schuller R, Drury WJ 3rd, Eick D, Reinberg D. The C-terminal domain of RNA polymerase II is modified by site-specific methylation. Science 2011, 332:99–103.
27. Zhang H, Pelech S. Using protein microarrays to study phosphorylation-mediated signal transduction. Semin Cell Dev Biol 2012, 23:872–882.
28. Perez OD, Nolan GP. Simultaneous measurement of multiple active kinase states using polychromatic flow cytometry. Nat Biotechnol 2002, 20:155–162.
29. Perl AE, Kasner MT, Shank D, Luger SM, Carroll M. Single-cell pharmacodynamic monitoring of S6 ribosomal protein phosphorylation in AML blasts during a clinical trial combining the mTOR inhibitor sirolimus and intensive chemotherapy. Clin Cancer Res 2012, 18:1716–1725.
30. Ornatsky O, Bandura D, Baranov V, Nitz M, Winnik MA, Tanner S. Highly multiparametric analysis by mass cytometry. J Immunol Methods 2010, 361:1–20.
31. Bendall SC, Simonds EF, Qiu P, Amir el AD, Krutzik PO, Finck R, Bruggner RV, Melamed R, Trejo A, Ornatsky OI, et al. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science 2011, 332:687–696.
32. Steen H, Mann M. A new derivatization strategy for the analysis of phosphopeptides by precursor ion scanning in positive ion mode. J Am Soc Mass Spectrom 2002, 13:996–1003.
33. Geiger T, Cox J, Mann M. Proteomics on an Orbitrap benchtop mass spectrometer using all-ion fragmentation. Mol Cell Proteomics 2010, 9:2252–2261.
34. Gillet LC, Navarro P, Tate S, Rost H, Selevsek N, Reiter L, Bonner R, Aebersold R. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol Cell Proteomics 2012, 11:O111 016717.
35. Maiolica A, Junger MA, Ezkurdia I, Aebersold R. Targeted proteome investigation via selected reaction monitoring mass spectrometry. J Proteomics 2012, 75:3495–3513.
36. Gallien S, Duriez E, Crone C, Kellmann M, Moehring T, Domon B. Targeted proteomic quantification on quadrupole-orbitrap mass spectrometer. Mol Cell Proteomics 2012, 11:1709–1723.
37. Peterson AC, Russell JD, Bailey DJ, Westphall MS, Coon JJ. Parallel reaction monitoring for high resolution and high mass accuracy quantitative, targeted proteomics. Mol Cell Proteomics 2012, 11:1475–1488.
38. Trost M, Bridon G, Desjardins M, Thibault P. Subcellular phosphoproteomics. Mass Spectrom Rev 2010, 29:962–990.
39. Bodenmiller B, Mueller LN, Mueller M, Domon B, Aebersold R. Reproducible isolation of distinct, overlapping segments of the phosphoproteome. Nat Methods 2007, 4:231–237.
40. Bodenmiller B, Mueller LN, Pedrioli PG, Pflieger D, Junger MA, Eng JK, Aebersold R, Tao WA. An integrated chemical, mass spectrometric and computational strategy for (quantitative) phosphoproteomics: application to Drosophila melanogaster Kc167 cells. Mol Biosyst 2007, 3:275–286.
41. Glatter T, Schittenhelm RB, Rinner O, Roguska K, Wepf A, Junger MA, Kohler K, Jevtov I, Choi H, Schmidt A, et al. Modularity and hormone sensitivity of the Drosophila melanogaster insulin receptor/target of rapamycin interaction proteome. Mol Syst Biol 2011, 7:547.
42. Pflieger D, Junger MA, Muller M, Rinner O, Lee H, Gehrig PM, Gstaiger M, Aebersold R. Quantitative proteomic analysis of protein complexes: concurrent identification of interactors and their state of phosphorylation. Mol Cell Proteomics 2008, 7:326–346.
43. Kruger M, Kratchmarova I, Blagoev B, Tseng YH, Kahn CR, Mann M. Dissection of the insulin signaling pathway via quantitative phosphoproteomics. Proc Natl Acad Sci USA 2008, 105:2451–2456.
44. Boersema PJ, Foong LY, Ding VM, Lemeer S, van Breukelen B, Philp R, Boekhorst J, Snel B, den Hertog J, Choo AB, et al. In-depth qualitative and quantitative profiling of tyrosine phosphorylation using a combination of phosphopeptide immunoaffinity purification and stable isotope dimethyl labeling. Mol Cell Proteomics 2010, 9:84–99.
45. Anderson NL, Anderson NG. The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 2002, 1:845–867.
46. Picotti P, Bodenmiller B, Mueller LN, Domon B, Aebersold R. Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell 2009, 138:795–806.
47. Walsh CT. Posttranslational modification of proteins: expanding nature’s inventory. Greenwood Village, CO: Roberts and Co. Publishers; 2006.
48. Chen EI, Cociorva D, Norris JL, Yates JR 3rd. Optimization of mass spectrometry-compatible surfactants for shotgun proteomics. J Proteome Res 2007, 6:2529–2538.
49. MacCoss MJ, McDonald WH, Saraf A, Sadygov R, Clark JM, Tasto JJ, Gould KL, Wolters D, Washburn M, Weiss A, et al. Shotgun identification of protein modifications from protein complexes and lens tissue. Proc Natl Acad Sci USA 2002, 99:7900–7905.
50. Beausoleil SA, Jedrychowski M, Schwartz D, Elias JE, Villen J, Li J, Cohn MA, Cantley LC, Gygi SP. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc Natl Acad Sci USA 2004, 101:12130–12135.
51. Han G, Ye M, Zhou H, Jiang X, Feng S, Jiang X, Tian R, Wan D, Zou H, Gu J. Large-scale phosphoproteome analysis of human liver tissue by enrichment and fractionation of phosphopeptides with strong anion exchange chromatography. Proteomics 2008, 8:1346–1361.
52. Singer D, Kuhlmann J, Muschket M, Hoffmann R. Separation of multiphosphorylated peptide isomers by hydrophilic interaction chromatography on an aminopropyl phase. Anal Chem 2010, 82:6409–6414.
53. Hao P, Guo T, Sze SK. Simultaneous analysis of proteome, phospho- and glycoproteome of rat kidney tissue with electrostatic repulsion hydrophilic interaction chromatography. PLoS One 2011, 6:e16884.
54. Nilsson CL. Advances in quantitative phosphoproteomics. Anal Chem 2012, 84:735–746.
55. Ye J, Zhang X, Young C, Zhao X, Hao Q, Cheng L, Jensen ON. Optimized IMAC-IMAC protocol for phosphopeptide recovery from complex biological samples. J Proteome Res 2010, 9:3561–3573.
56. Thingholm TE, Jensen ON, Robinson PJ, Larsen MR. SIMAC (sequential elution from IMAC), a phosphoproteomics strategy for the rapid separation of monophosphorylated from multiply phosphorylated peptides. Mol Cell Proteomics 2008, 7:661–671.
57. Mamone G, Picariello G, Ferranti P, Addeo F. Hydroxyapatite affinity chromatography for the highly selective enrichment of mono- and multi-phosphorylated peptides in phosphoproteome analysis. Proteomics 2010, 10:380–393.
58. Leitner A, Lindner W. Chemical tagging strategies for mass spectrometry-based phospho-proteomics. Methods Mol Biol 2009, 527:229–243.
59. McLachlin DT, Chait BT. Improved beta-elimination-based affinity purification strategy for enrichment of phosphopeptides. Anal Chem 2003, 75:6826–6836.
60. Warthaka M, Karwowska-Desaulniers P, Pflum MK. Phosphopeptide modification and enrichment by oxidation-reduction condensation. ACS Chem Biol 2006, 1:697–701.
61. Lansdell TA, Tepe JJ. Isolation of phosphopeptides using solid phase enrichment. Tetrahedron Lett 2004, 45:91–93.
62. Courcelles M, Bridon G, Lemieux S, Thibault P. Occurrence and detection of phosphopeptide isomers in large-scale phosphoproteomics experiments. J Proteome Res 2012, 11:3753–3765.
63. Shvartsburg AA, Singer D, Smith RD, Hoffmann R. Ion mobility separation of isomeric phosphopeptides from a protein with variant modification of adjacent residues. Anal Chem 2011, 83:5078–5085.
64. Bridon G, Bonneil E, Muratore-Schroeder T, Caron-Lizotte O, Thibault P. Improvement of phosphoproteome analyses using FAIMS and decision tree fragmentation. Application to the insulin signaling pathway in Drosophila melanogaster S2 cells. J Proteome Res 2012, 11:927–940.
65. Wu R, Dephoure N, Haas W, Huttlin EL, Zhai B, Sowa ME, Gygi SP. Correct interpretation of comprehensive phosphorylation dynamics requires normalization by protein expression changes. Mol Cell Proteomics 2011, 10:M111 009654.
66. Haas W, Faherty BK, Gerber SA, Elias JE, Beausoleil SA, Bakalarski CE, Li X, Villen J, Gygi SP. Optimization and use of peptide mass measurement accuracy in shotgun proteomics. Mol Cell Proteomics 2006, 5:1326–1337.
67. Olsen JV, Mann M. Improved peptide identification in proteomics by two consecutive stages of mass spectrometric fragmentation. Proc Natl Acad Sci USA 2004, 101:13417–13422.
68. Schroeder MJ, Shabanowitz J, Schwartz JC, Hunt DF, Coon JJ. A neutral loss activation method for improved phosphopeptide sequence analysis by quadrupole ion trap mass spectrometry. Anal Chem 2004, 76:3590–3598.
69. Steen H, Mann M. The ABC’s (and XYZ’s) of peptide sequencing. Nat Rev Mol Cell Biol 2004, 5:699–711.
70. Nagaraj N, D’Souza RC, Cox J, Olsen JV, Mann M. Feasibility of large-scale phosphoproteomics with higher energy collisional dissociation fragmentation. J Proteome Res 2010, 9:6786–6794.
71. Zhang Y, Ficarro SB, Li S, Marto JA. Optimized Orbitrap HCD for quantitative analysis of phosphopeptides. J Am Soc Mass Spectrom 2009, 20:1425–1434.
72. Mann M, Kelleher NL. Precision proteomics: the case for high resolution and high mass accuracy. Proc Natl Acad Sci USA 2008, 105:18132–18138.
73. Przybylski C, Junger MA, Aubertin J, Radvanyi F, Aebersold R, Pflieger D. Quantitative analysis of protein complex constituents and their phosphorylation states on a LTQ-Orbitrap instrument. J Proteome Res 2010, 9:5118–5132.
74. Vandenbogaert M, Hourdel V, Jardin-Mathe O, Bigeard J, Bonhomme L, Legros V, Hirt H, Schwikowski B, Pflieger D. Automated Phosphopeptide Identification Using Multiple MS/MS Fragmentation Modes. J Proteome Res 2012, 11:5695–5703.
75. Kelstrup CD, Hekmat O, Francavilla C, Olsen JV. Pinpointing phosphorylation sites: Quantitative filtering and a novel site-specific x-ion fragment. J Proteome Res 2011, 10:2937–2948.
76. Palumbo AM, Smith SA, Kalcic CL, Dantus M, Stemmer PM, Reid GE. Tandem mass spectrometry strategies for phosphoproteome analysis. Mass Spectrom Rev 2011, 30:600–625.
77. Swaney DL, McAlister GC, Coon JJ. Decision tree-driven tandem mass spectrometry for shotgun proteomics. Nat Methods 2008, 5:959–964.
78. Bantscheff M, Lemeer S, Savitski MM, Kuster B. Quantitative mass spectrometry in proteomics: critical review update from 2007 to the present. Anal Bioanal Chem 2012, 404:939–965.
79. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 2002, 1:376–386.
80. Schmidt A, Kellermann J, Lottspeich F. A novel strategy for quantitative proteomics using isotope-coded protein labels. Proteomics 2005, 5:4–15.
81. DeSouza LV, Taylor AM, Li W, Minkoff MS, Romaschin AD, Colgan TJ, Siu KW. Multiple reaction monitoring of mTRAQ-labeled peptides enables absolute quantification of endogenous levels of a potential cancer marker in cancerous and normal endometrial tissues. J Proteome Res 2008, 7:3525–3534.
82. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 1999, 17:994–999.
83. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N, Pillai S, Dey S, Daniels S, et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics 2004, 3:1154–1169.
84. Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, Schmidt G, Neumann T, Johnstone R, Mohammed AK, Hamon C. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem 2003, 75:1895–1904.
85. Mertins P, Udeshi ND, Clauser KR, Mani DR, Patel J, Ong SE, Jaffe JD, Carr SA. iTRAQ labeling is superior to mTRAQ for quantitative global proteomics and phosphoproteomics. Mol Cell Proteomics 2012, 11:M111 014423.
86. Balasubramaniam D, Eissler CL, Stauffacher CV, Hall MC. Use of selected reaction monitoring data for label-free quantification of protein modification stoichiometry. Proteomics 2010, 10:4301–4305.
87. Jin LL, Tong J, Prakash A, Peterman SM, St-Germain JR, Taylor P, Trudel S, Moran MF. Measurement of protein phosphorylation stoichiometry by selected reaction monitoring mass spectrometry. J Proteome Res 2010, 9:2752–2761.
88. Wu R, Haas W, Dephoure N, Huttlin EL, Zhai B, Sowa ME, Gygi SP. A large-scale method to measure absolute protein phosphorylation stoichiometries. Nat Methods 2011, 8:677–683.
89. Cox DM, Zhong F, Du M, Duchoslav E, Sakuma T, McDermott JC. Multiple reaction monitoring as a method for identifying protein posttranslational modifications. J Biomol Tech 2005, 16:83–90.
90. Unwin RD, Griffiths JR, Leverentz MK, Grallert A, Hagan IM, Whetton AD. Multiple reaction monitoring to identify sites of protein phosphorylation with high sensitivity. Mol Cell Proteomics 2005, 4:1134–1144.
91. Zappacosta F, Collingwood TS, Huddleston MJ, Annan RS. A quantitative results-driven approach to analyzing multisite protein phosphorylation: the phosphate-dependent phosphorylation profile of the transcription factor Pho4. Mol Cell Proteomics 2006, 5:2019–2030.
92. Glinski M, Weckwerth W. Differential multisite phosphorylation of the trehalose-6-phosphate synthase gene family in Arabidopsis thaliana: a mass spectrometry-based process for multiparallel peptide library phosphorylation analysis. Mol Cell Proteomics 2005, 4:1614–1625.
93. Narumi R, Murakami T, Kuga T, Adachi J, Shiromizu T, Muraoka S, Kume H, Kodera Y, Matsumoto M, Nakayama K, et al. A strategy for large-scale phosphoproteomics and SRM-based validation of human breast cancer tissue samples. J Proteome Res 2012, 11:5311–5322.
94. Gamez-Pozo A, Sanchez-Navarro I, Calvo E, Diaz E, Miguel-Martin M, Lopez R, Agullo T, Camafeita E, Espinosa E, Lopez JA, et al. Protein phosphorylation analysis in archival clinical cancer samples by shotgun and targeted proteomics approaches. Mol Biosyst 2011, 7:2368–2374.
95. Wolf-Yadlin A, Hautaniemi S, Lauffenburger DA, White FM. Multiple reaction monitoring for robust quantitative proteomic analysis of cellular signaling networks. Proc Natl Acad Sci USA 2007, 104:5860–5865.
96. Sherrod SD, Myers MV, Li M, Myers JS, Carpenter KL, Maclean B, Maccoss MJ, Liebler DC, Ham AJ. Label-free quantitation of protein modifications by pseudo selected reaction monitoring with internal reference peptides. J Proteome Res 2012, 11:3467–3479.
97. Beausoleil SA, Villen J, Gerber SA, Rush J, Gygi SP. A probability-based approach for high-throughput protein phosphorylation analysis and site localization. Nat Biotechnol 2006, 24:1285–1292.
98. Bailey CM, Sweet SM, Cunningham DL, Zeller M, Heath JK, Cooper HJ. SLoMo: automated site localization of modifications from ETD/ECD mass spectra. J Proteome Res 2009, 8:1965–1971.
99. Shteynberg D, Deutsch E, Mendoza L, Slagel J, Lam HH, Nesvizhskii A, Moritz R. PTMProphet: TPP Software for Validation of Modified Site Locations on Post-Translationally Modified Peptides. 60th ASMS (American Society for Mass Spectrometry) Conference, 2012.
100. Savitski MM, Lemeer S, Boesche M, Lang M, Mathieson T, Bantscheff M, Kuster B. Confident phosphorylation site localization using the Mascot Delta Score. Mol Cell Proteomics 2011, 10:M110 003830.
101. Ruttenberg BE, Pisitkun T, Knepper MA, Hoffert JD. PhosphoScore: an open-source phosphorylation site assignment tool for MSn data. J Proteome Res 2008, 7:3054–3059.
102. Payne SH, Yau M, Smolka MB, Tanner S, Zhou H, Bafna V. Phosphorylation-specific MS/MS scoring for rapid and accurate phosphoproteome analysis. J Proteome Res 2008, 7:3373–3381.
103. Schlosser A, Vanselow JT, Kramer A. Comprehensive phosphorylation site analysis of individual phosphoproteins applying scoring schemes for MS/MS data. Anal Chem 2007, 79:7439–7449.
104. Bodenmiller B, Aebersold R. Phosphoproteome resource for systems biology research. Methods Mol Biol 2011, 694:307–322.
105. Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol 2007, 8:R250.
106. Dinkel H, Chica C, Via A, Gould CM, Jensen LJ, Gibson TJ, Diella F. Phospho.ELM: a database of phosphorylation sites—update 2011. Nucleic Acids Res 2011, 39:D261–D267.
107. Yang CY, Chang CH, Yu YL, Lin TC, Lee SA, Yen CC, Yang JM, Lai JM, Hong YR, Tseng TL, et al. PhosphoPOINT: a comprehensive human kinase interactome and phospho-protein database. Bioinformatics 2008, 24:i14–i20.
108. Durek P, Schmidt R, Heazlewood JL, Jones A, MacLean D, Nagel A, Kersten B, Schulze WX. PhosPhAt: the Arabidopsis thaliana phosphorylation site database. An update. Nucleic Acids Res 2010, 38:D828–D834.
109. Gao J, Agrawal GK, Thelen JJ, Xu D. P3DB: a plant protein phosphorylation database. Nucleic Acids Res 2009, 37:D960–D962.
110. Li H, Xing X, Ding G, Li Q, Wang C, Xie L, Zeng R, Li Y. SysPTM: a systematic resource for proteomic research on post-translational modifications. Mol Cell Proteomics 2009, 8:1839–1849.
111. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, et al. Human Protein Reference Database—2009 update. Nucleic Acids Res 2009, 37:D767–D772.
112. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, Latham V, Sullivan M. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res 2012, 40:D261–D270.
113. Schwartz D, Gygi SP. An iterative statistical approach to the identification of protein phosphorylation motifs from large-scale data sets. Nat Biotechnol 2005, 23:1391–1398.
114. Schwartz D, Chou MF, Church GM. Predicting protein post-translational modifications using meta-analysis of proteome scale data sets. Mol Cell Proteomics 2009, 8:365–379.
115. Amanchy R, Periaswamy B, Mathivanan S, Reddy R, Tattikota SG, Pandey A. A curated compendium of phosphorylation motifs. Nat Biotechnol 2007, 25:285–286.
116. Ritz A, Shakhnarovich G, Salomon AR, Raphael BJ. Discovery of phosphorylation motif mixtures in phosphoproteomics data. Bioinformatics 2009, 25:14–21.
117. Huang H, Li L, Wu C, Schibli D, Colwill K, Ma S, Li C, Roy P, Ho K, Songyang Z, et al. Defining the specificity space of the human SRC homology 2 domain. Mol Cell Proteomics 2008, 7:768–784.
118. Li L, Wu C, Huang H, Zhang K, Gan J, Li SS. Prediction of phosphotyrosine signaling networks using a scoring matrix-assisted ligand identification approach. Nucleic Acids Res 2008, 36:3263–3273.
119. Miller ML, Jensen LJ, Diella F, Jorgensen C, Tinti M, Li L, Hsiung M, Parker SA, Bordeaux J, Sicheritz-Ponten T, et al. Linear motif atlas for phosphorylation-dependent signaling. Sci Signal 2008, 1:ra2.
120. Linding R, Jensen LJ, Ostheimer GJ, van Vugt MA, Jorgensen C, Miron IM, Diella F, Colwill K, Taylor L, Elder K, et al. Systematic discovery of in vivo phosphorylation networks. Cell 2007, 129:1415–1426.
121. Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 2004, 4:1633–1649.
122. Ellis JJ, Kobe B. Predicting protein kinase specificity: Predikin update and performance in the DREAM4 challenge. PLoS One 2011, 6:e21169.
123. Obenauer JC, Cantley LC, Yaffe MB. Scansite 2.0: Proteome-wide prediction of cell signaling interactions using short sequence motifs. Nucleic Acids Res 2003, 31:3635–3641.
124. Xue Y, Ren J, Gao X, Jin C, Wen L, Yao X. GPS 2.0, a tool to predict kinase-specific phosphorylation sites in hierarchy. Mol Cell Proteomics 2008, 7:1598–1608.
125. Mi T, Merlin JC, Deverasetty S, Gryk MR, Bill
TJ, Brooks AW, Lee LY, Rathnayake V, Ross CA,
Sargeant DP, et al. Minimotif Miner 3.0: database
expansion and significantly improved reduction of
false-positive predictions from consensus sequences.
Nucleic Acids Res 2012, 40:D252–D260.
126. Lachmann A, Ma’ayan A. KEA: kinase enrichment
analysis. Bioinformatics 2009, 25:684–686.
127. Sharifpoor S, Nguyen Ba AN, Youn JY, van Dyk D,
Friesen H, Douglas AC, Kurat CF, Chong YT, Founk
K, Moses AM, et al. A quantitative literature-curated
gold standard for kinase-substrate pairs. Genome Biol
2011:12:R39.
128. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker
T. Cytoscape 2.8: new features for data integration
and network visualization. Bioinformatics 2011,
27:431–432.
129. Szklarczyk D, Franceschini A, Kuhn M, Simonovic M,
Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork
P, et al. The STRING database in 2011: functional
interaction networks of proteins, globally integrated
and scored. Nucleic Acids Res 2011, 39:D561–568.
130. Stark C, Breitkreutz BJ, Chatr-Aryamontri A, Boucher
L, Oughtred R, Livstone MS, Nixon J, Van Auken
K, Wang X, Shi X, et al. The BioGRID interaction
database: 2011 update. Nucleic Acids Res 2011,
39:D698–704.
131. Licata L, Briganti L, Peluso D, Perfetto L, Iannuccelli
M, Galeota E, Sacco F, Palma A, Nardozza AP,
Santonico E, et al. MINT, the molecular interaction
database: 2012 update. Nucleic Acids Res 2012,
40:D857–861.
132. Kerrien S, Aranda B, Breuza L, Bridge A, BroackesCarter F, Chen C, Duesbury M, Dumousseau M,
Feuermann M, Hinz U, et al. The IntAct molecular
interaction database in 2012. Nucleic Acids Res 2012,
40:D841–846.
133. Thomas PD, Kejariwal A, Campbell MJ, Mi H,
Diemer K, Guo N, Ladunga I, Ulitsky-Lazareva B,
Muruganujan A, Rabkin S, et al. PANTHER: a
browsable database of gene products organized by
biological function, using curated protein family and
subfamily classification. Nucleic Acids Res 2003,
31:334–341.
134. Ren J, Jiang C, Gao X, Liu Z, Yuan Z, Jin C, Wen L,
Zhang Z, Xue Y, Yao X. PhosSNP for systematic
analysis of genetic polymorphisms that influence
protein phosphorylation. Mol Cell Proteomics 2010,
9:623–634.
135. Gong W, Zhou D, Ren Y, Wang Y, Zuo Z, Shen
Y, Xiao F, Zhu Q, Hong A, Zhou X, et al.
PepCyber:PPEP: a database of human protein protein
interactions mediated by phosphoprotein-binding
domains. Nucleic Acids Res 2008, 36:D679–D683.
136. Bertsch A, Gropl C, Reinert K, Kohlbacher O.
OpenMS and TOPP: open source software for LC-MS
data analysis. Methods Mol Biol 2011, 696:353–367.
Volume 3, January/February 2014
137. MacLean B, Tomazela DM, Shulman N, Chambers
M, Finney GL, Frewen B, Kern R, Tabb DL, Liebler
DC, MacCoss MJ. Skyline: an open source document
editor for creating and analyzing targeted proteomics
experiments. Bioinformatics 2010, 26:966–968.
138. Ren J, Gao X, Liu Z, Cao J, Ma Q, Xue
Y. Computational analysis of phosphoproteomics:
progresses and perspectives. Curr Protein Pept Sci
2011, 12:591–601.
139. Eng JK, Searle BC, Clauser KR, Tabb DL. A face in the
crowd: recognizing peptides through database search.
Mol Cell Proteomics 2011, 10:R111 009522.
140. Matic I, Ahel I, Hay RT. Reanalysis of phosphoproteomics data uncovers ADP-ribosylation sites. Nat
Methods 2012, 9:771–772.
141. Veraksa A. Regulation of developmental processes:
insights from mass spectrometry-based proteomics.
WIREs Dev Biol 2012. doi:10.1002/WDEV.102.
142. Terfve C, Saez-Rodriguez J. Modeling signaling
networks using high-throughput phospho-proteomics.
Adv Exp Med Biol 2012, 736:19–57.
143. Yachie N, Saito R, Sugiyama N, Tomita M, Ishihama
Y. Integrative features of the yeast phosphoproteome
and protein-protein interaction map. PLoS Comput
Biol 2011, 7:e1001064.
144. Huber A, Bodenmiller B, Uotila A, Stahl M, Wanka S,
Gerrits B, Aebersold R, Loewith R. Characterization
of the rapamycin-sensitive phosphoproteome reveals
that Sch9 is a central coordinator of protein synthesis.
Genes Dev 2009, 23:1929–1943.
145. Bishop AC, Buzko O, Shokat KM. Magic bullets for
protein kinases. Trends Cell Biol 2001, 11:167–172.
146. Brunet A, Bonni A, Zigmond MJ, Lin MZ, Juo P, Hu
LS, Anderson MJ, Arden KC, Blenis J, Greenberg ME.
Akt promotes cell survival by phosphorylating and
inhibiting a Forkhead transcription factor. Cell 1999,
96:857–868.
147. Dong J, Pan D. Tsc2 is not a critical target of Akt
during normal Drosophila development. Genes Dev
2004, 18:2479–2484.
148. Schleich S, Teleman AA. Akt phosphorylates both Tsc1
and Tsc2 in Drosophila, but neither phosphorylation is
required for normal animal growth. PLoS One 2009,
4:e6305.
149. Oliveira AP, Ludwig C, Picotti P, Kogadeeva M,
Aebersold R, Sauer U. Regulation of yeast central
metabolism by enzyme phosphorylation. Mol Syst Biol
2012, 8:623.
150. Zielinska DF, Gnad F, Jedrusik-Bode M, Wisniewski
JR, Mann M. Caenorhabditis elegans has a
phosphoproteome atypical for metazoans that is
enriched in developmental and sex determination
proteins. J Proteome Res 2009, 8:4039–4049.
151. Bodenmiller B, Malmstrom J, Gerrits B, Campbell
D, Lam H, Schmidt A, Rinner O, Mueller LN,
Shannon PT, Pedrioli PG, et al. PhosphoPep—a
phosphoproteome resource for systems biology
research in Drosophila Kc167 cells. Mol Syst Biol
2007, 3:139.
152. Zhai B, Villen J, Beausoleil SA, Mintseris J,
Gygi SP. Phosphoproteome analysis of Drosophila
melanogaster embryos. J Proteome Res 2008,
7:1675–1682.
153. Pan C, Gnad F, Olsen JV, Mann M. Quantitative
phosphoproteome analysis of a mouse liver cell
line reveals specificity of phosphatase inhibitors.
Proteomics 2008, 8:4534–4546.
154. Li QR, Xing XB, Chen TT, Li RX, Dai J, Sheng QH,
Xin SM, Zhu LL, Jin Y, Pei G, et al. Large scale
phosphoproteome profiles comprehensive features of
mouse embryonic stem cells. Mol Cell Proteomics
2011, 10:M110.001750.
155. Zhou H, Di Palma S, Preisinger C, Peng M,
Polat AN, Heck AJ, Mohammed S. Toward a
comprehensive characterization of a human cancer cell
phosphoproteome. J Proteome Res 2013, 12:260–271.