Francesco Giorgianni,
Charles B. Stout Neuroscience Mass Spectrometry Laboratory, The University of Tennessee Health Science Center, Memphis, Tennessee
Advanced Article
Article Contents
Biological Background Mass Spectrometry in Phosphoproteomics Chemical Tools and Techniques
Sarka Beranova-Giorgianni,
doi: 10.1002/9780470048672.wecb304
Proteins are biological macromolecules whose structure and functions are essential to every biological process within cells. Protein phosphorylation is one of the most important post-translational modifications and it has a profound effect on protein function. Recently, concurrent advances in bioanalytical technologies and informatics enabled studies of proteins and phosphoproteins on a global scale. These large-scale approaches represent an integral component of systems biology, which is an area of scientific inquiry that focuses on a biological system as a whole. State-of-the-art mass spectrometry is a key technology for global-scale protein and phosphoprotein analyses. Phosphorylated proteins from diverse biological systems can be probed with a combination of separation methods, tandem mass spectrometry, and bioinformatics, to reveal the identity of the phosphorylated protein and the exact localization of the site(s) of phosphorylation. Characterization of the phosphoproteomes in cells, tissues and biological fluids provides an excellent foundation on which to build new knowledge of living systems.
Proteins are the final products of gene expression, and while the genome provides the blueprint for the molecular components of a living cell, proteins are the essential molecules responsible for cellular structure and function. Decades of studies of proteins in a one-by-one fashion have generated a wealth of knowledge on proteins as individual parts of the cellular machinery. Recently, interrogation of proteins in biological systems on a global scale gave rise to a new area of scientific inquiry, termed proteomics. Proteomics focuses on the study of the proteome, which is defined as the array of proteins that are present in a cell, organ, or biological fluid at a specific time, under a specific set of conditions. The goals of proteomics are diverse and include elucidation of basic molecular mechanisms that regulate cell function in physiological and pathological states, discovery of novel targets for the development of improved drug treatments, discovery of biomarkers for early detection of a disease and for design of tailored therapies, and many other objectives. From the analytical standpoint, large-scale, comprehensive analysis of proteins is an extremely challenging undertaking because of the enormous complexity of proteomes and their dynamic nature. In fact, the development of proteomics as a scientific discipline was made possible through the concurrent advances in separation sciences, mass spectrometry, and informatics. Mass spectrometry has been the essential technology that enabled interrogating proteins on a global scale, with a high degree of sensitivity and accuracy. The purpose of this chapter is to describe the basic principles of mass spectrometry in the context of proteomics. Specifically, the review focuses on the use of mass spectrometry for large-scale analysis of specific subsets of proteomes, the phosphoproteomes. The chapter includes discussion of the basics of gas-phase behavior of peptides and phosphopeptides, and shows the role of mass spectrometry as a component of a general analytical strategy for phosphoproteome analysis. Because of the diversity of the analytical platforms that are being used, this review is not intended as a comprehensive description of all approaches. Rather, this article includes an overview of selected methods, a sampling of relevant references, and an example of a mass spectrometry-based phosphoproteomics methodology used in the authors' laboratories.
WILEY ENCYCLOPEDIA OF CHEMICAL BIOLOGY 2008, John Wiley & Sons, Inc.
Biological Background
Proteins are high molecular weight organic molecules that are essential to every biological process within living systems. Proteins are structurally and functionally diverse. Some proteins are assembled in multi-unit complexes to form the cytoskeleton of cells or other mechanical structures, while others are enzymes that catalyze biochemical reactions, or they participate in signal transduction within a cell or in cell-to-cell communication. Post-translational modifications play a key role in regulatory cellular processes, and in particular, protein phosphorylation is central to most of the signaling events that ultimately determine the biological status of all eukaryotic cells. The regulation of protein phosphorylation within cells occurs via a very complex system of positive and negative feedback loops with the surrounding environment. Protein phosphorylation regulates critical protein functions such as protein-DNA, protein-RNA, and protein-protein interactions, enzyme activity, protein trafficking, protein intracellular localization, and protein degradation. Aberrations in protein phosphorylation can have deleterious consequences and have been linked to various diseases, including cancer. It is estimated that approximately 30% of all proteins in a mammalian cell are phosphorylated at any given time (1). A proteome represents the complete repertoire of proteins present in a cell at any given time. The term phosphoproteome refers to a specific subset of the proteome that includes all the phosphorylated protein species. Phosphoproteomics focuses on the comprehensive characterization of phosphorylated proteins in biological systems, including identification of phosphorylated proteins, assignment of their exact sites of phosphorylation, and quantification of changes in protein phosphorylation. The expansion of proteomics and phosphoproteomics in recent years has been driven by technological developments.
The greatest challenge for proteomics is the inherent complexity of cellular proteomes, which is due to the dynamic nature of the proteome, the large number and wide abundance range of cellular proteins, and their diverse physicochemical properties. It is recognized that the diversity and extent of proteome complexity cannot be addressed by a single technology. Instead, the trend in proteomics is to develop an array of methodologies from which a method or a set of methods can be selected to tailor the analytical strategy to suit a specific study. Chromatography and electrophoresis are the central separation technologies for proteomics. High-performance mass spectrometry and bioinformatics tools are the key components for protein identification and characterization.
[Figure 1. General analytical strategy for phosphoproteome analysis. Recoverable step label: Protein extraction (isolation of proteins and phosphoproteins from the biological system under study).]
commonly occurs on serine, threonine, or tyrosine residues. The task of characterizing phosphorylated proteins on a proteome-wide scale includes determination of protein identities and localization of the phosphorylated amino acid residues in these proteins. Mass spectrometry is the central technology for these tasks. Although there have been major advancements in the mass spectrometry analysis of intact proteins in proteomics (2), most approaches still focus on characterization of peptides and phosphopeptides from proteolytic digestion of proteins. Therefore, the discussion in this section will concentrate on these strategies. Mass spectrometry has several inherent characteristics that make it an excellent choice for peptide analysis. The technique is rapid, versatile, highly amenable to automation, and it requires low-to-mid femtomole sample quantities to yield reliable information about the amino acid sequence of a peptide. (The field of mass spectrometry is continuously moving towards improved detection limits, and cutting-edge instruments provide sensitivity in the attomole range.) For phosphoproteome analysis, the general analytical strategy (Fig. 1) includes isolation of the proteins from the biological system under study; protein or peptide fractionation and enrichment of phosphorylated proteins/peptides; mass spectrometry measurement of specific attributes of phosphopeptides, including their mass and fragmentation patterns; and searches of protein sequence databases to identify the proteins and to assign phosphorylation sites.
Protein extraction
The first step in the analysis of proteomes and phosphoproteomes involves extraction of proteins from the biological system under study; the objective is to solubilize the proteins and to prepare them for subsequent analysis. This step is critical for the overall success of the analysis, and the choice of methods should be tailored to the characteristics of the biological system and to the goals of the study. Depending on the biological system, protein extraction may involve disruption of cells, removal of contaminants such as salts (e.g., by dialysis or ultrafiltration), and/or removal of overabundant proteins (e.g., by immunoaffinity columns). For phosphoproteomics, particular care must be taken to preserve phosphorylation of the proteins, i.e., to prevent the action of protein phosphatases. This is achieved by controlling the temperature of the sample and by addition of phosphatase inhibitors to the extraction buffer.
[Figure 2 diagram: tetrapeptide backbone H2N-CH(R1)-CO-NH-CH(R2)-CO-NH-CH(R3)-CO-NH-CH(R4)-COOH, with backbone cleavage sites labeled a1-a3, b1-b3, c1-c3 (N-terminal fragments) and x1-x3, y1-y3, z1-z3 (C-terminal fragments), and with "+2H" annotations as explained in the caption.]
Figure 2 Nomenclature of peptide fragmentation. The possible product ion series that arise by cleavages along the peptide backbone are a-, b-, and c-series (N-terminal); and x-, y-, and z-series (C-terminal). (The designation +2H denotes addition of two hydrogens that are transferred onto the structures depicted in the figure to form the corresponding singly charged y- or c- product ions (21)). Under low-energy CID, y- and b-ions usually predominate. The mass differences between adjacent ions of the same series can be used to deduce portions of the peptide sequence.
analysis of the product ions and recording of the MS/MS spectrum. The process of collisional activation and dissociation is termed collision-induced dissociation (CID). Depending on the instrument type, the MS/MS events can be separated in space (tandem-in-space) or in time (tandem-in-time). In MS/MS, protonated peptide ions in the gas phase dissociate via cleavages along the peptide backbone; fragmentations can occur at any of the three types of bonds that make up the backbone of the peptide (Fig. 2). The nomenclature for peptide dissociations distinguishes six major series of sequence-determining product ions (14, 15). The N-terminal series encompass the a-, b-, and c-ion types; the C-terminal series include x-, y-, and z-ions. In addition to the six basic series, other types of product ions may also be observed (16) under certain conditions. The relative abundance of the different product ions depends on the amino acid sequence of the peptide, on the internal energy of the dissociating precursor ion, and on additional variables that affect the CID process (17). Under the low-energy CID regime, used for example in ion trap mass spectrometers, the predominant types of product ions are the b-ions and y-ions that form by cleavages of the peptide bond. As shown in Fig. 2, adjacent (singly charged) product ions from a series have a difference in mass that determines the amino acid present at that position of the peptide. For example, if the amino acid in position 3 of the tetrapeptide in Fig. 2 is a serine (R3 = CH2OH), the mass difference between the y1 and y2 product ions will be 87 Da, corresponding to the mass of the serine residue -NH-CH(CH2OH)-CO-. Therefore, when an MS/MS spectrum of a peptide ion contains high-quality data for one complete or several partial overlapping product-ion series, the sequence of the peptide can be deduced from the MS/MS data.
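The mass-difference reasoning above can be sketched in a few lines of Python. This is a minimal illustration and not part of the original article: the function names `b_ions` and `y_ions`, the example tetrapeptide GASK (serine in position 3), and the use of monoisotopic residue masses are all assumptions made for the example.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added to C-terminal (y-type) fragments
PROTON = 1.00728   # charge carrier for singly charged ions


def b_ions(peptide):
    """Singly charged b-ion m/z values, b1 to b(n-1), N-terminal series."""
    values, total = [], 0.0
    for residue in peptide[:-1]:
        total += RESIDUE_MASS[residue]
        values.append(total + PROTON)
    return values


def y_ions(peptide):
    """Singly charged y-ion m/z values, y1 to y(n-1), C-terminal series."""
    values, total = [], WATER
    for residue in reversed(peptide[1:]):
        total += RESIDUE_MASS[residue]
        values.append(total + PROTON)
    return values


# Hypothetical tetrapeptide with serine in position 3, as in the text's example.
y = y_ions("GASK")
print(round(y[1] - y[0], 2))  # 87.03, the serine residue mass
```

Reading a sequence from a spectrum amounts to inverting this calculation: each gap between adjacent ions of one series is looked up in the residue mass table.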
Specifics of phosphopeptides
Phosphorylated peptides are modified peptides and therefore most of the basic concepts discussed above also apply to mass spectrometry of phosphopeptides. In terms of ionization, the majority of large-scale phosphoproteomics strategies utilize ESI. The difficulty of effectively analyzing phosphopeptides by ESI-based approaches is often attributed, among other factors, to selective suppression of phosphorylated peptides in the presence of unmodified peptides, and to decreased ionization efficiencies of phosphopeptides relative to their non-phosphorylated counterparts. However, this notion has been challenged in a recent study (18) underscoring the complexity of the phenomena associated with analyses of highly complex peptide/phosphopeptide mixtures.
Modification by phosphorylation adds 80 Da to the mass of the corresponding peptide. The principles of gas-phase dissociations of protonated phosphopeptide ions into sequence-determining product ion series are analogous to those of non-phosphorylated peptides, with an additional issue that must be taken into consideration. Under CID conditions, phosphorylated peptide ions undergo a facile neutral loss of H3PO4, corresponding to the loss of 98 Da. The mechanisms that underlie this dissociation behavior have been studied for phosphoserine-, phosphothreonine-, and phosphotyrosine-containing peptides (19, 20). The loss of phosphoric acid from the molecular ion produces a non-sequence-specific product ion (M+nH-H3PO4)n+. This product ion can serve as a marker ion, indicating the presence of a phosphorylated peptide. An example of an MS/MS spectrum of a phosphorylated peptide is shown in Fig. 3. This spectrum illustrates the typical fragmentation behavior of protonated phosphopeptide ions in ion trap MS/MS. The spectrum is dominated by the intense (M+2H-H3PO4)2+ product ion; the spectrum further contains product ions of the y- and b-series that determine the amino acid sequence of the phosphopeptide and the location of the phosphorylation site. Often, the scenario is not so favorable: the loss of phosphoric acid dominates and not enough other product ions are observed for an unequivocal sequence determination. One way to remedy this unfavorable outcome is to perform an additional dissociation, an MS/MS/MS, where the primary product ion (M+nH-H3PO4)n+ is mass-selected and then dissociated via CID.
[Figure 3 plot: relative abundance (0-100) versus m/z (600-2000). Labeled peaks: y5 683.5; y6 812.5; (M+2H-H3PO4)2+ 952.8; (M+2H-2H2O)2+ 983.8; y8 1042.6; y9 1157.7; y10 1272.7; y11 1329.7.]
Figure 3 MS/MS spectrum of the phosphopeptide FNDS*EGDDTEETEDYR. The spectrum, which was acquired with an ion trap mass spectrometer, illustrates the typical behavior of phosphopeptide ions under low-energy CID. The molecular ion was (M+2H)2+, m/z 1001.9. The MS/MS spectrum is dominated by an intense product ion corresponding to the neutral loss of phosphoric acid from the activated precursor ion. In addition, the spectrum contains a number of product ions from the y- and b-series that determine the peptide sequence and the site of phosphorylation. (The product ions are singly charged unless noted otherwise.) This phosphopeptide belongs to Bcl-2-associated transcription factor 1 (BCLF1_HUMAN) and it was identified in the analysis of the phosphoproteome in the LNCaP prostate cancer cell line.
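The position of the neutral-loss marker ion follows from simple arithmetic: for a precursor of charge n, losing H3PO4 shifts the m/z by about 98/n. The sketch below applies this to the precursor of Fig. 3 and checks whether the most intense peak of a spectrum sits at that position, which is the kind of test that could flag a spectrum for a follow-up MS/MS/MS. The function names, the 0.5 m/z tolerance, and the peak intensities are assumptions for illustration; only the m/z labels are taken from Fig. 3.

```python
H3PO4 = 97.9769  # monoisotopic mass of phosphoric acid, Da


def neutral_loss_mz(precursor_mz, charge, loss=H3PO4):
    """m/z of the (M+nH-H3PO4)n+ ion; the neutral mass is divided by the charge."""
    return precursor_mz - loss / charge


def dominated_by_neutral_loss(peaks, precursor_mz, charge, tol=0.5):
    """True if the most intense (m/z, intensity) peak matches the neutral-loss m/z."""
    base_mz, _ = max(peaks, key=lambda peak: peak[1])
    return abs(base_mz - neutral_loss_mz(precursor_mz, charge)) <= tol


# m/z values are labels from Fig. 3; the intensities are invented for illustration.
peaks = [(683.5, 12.0), (812.5, 18.0), (952.8, 100.0), (983.8, 25.0), (1329.7, 15.0)]

expected = neutral_loss_mz(1001.9, 2)
print(round(expected, 1))                           # 952.9
print(dominated_by_neutral_loss(peaks, 1001.9, 2))  # True
```

The computed value of about 952.9 agrees with the 952.8 label in Fig. 3 to within the tolerance assumed here.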
Two issues have to be addressed. First, because of the complexity of the starting mixture, even after LC separation, multiple peptides will coelute, and therefore at any given time more than one peptide ion will be present. Second, the characteristics of the precursor ion, including its mass, that are needed to set the precursor selection in MS/MS are not known in advance. These issues are dealt with through the data-dependent acquisition mode, in which the mass spectrometer automatically cycles through a sequence of measurements of MS and MS/MS data. For peptide and phosphopeptide analysis, the instrument measures an MS spectrum to obtain masses of the analytes eluting from the LC at that particular time. Based on the information from these MS data, subsequent MS/MS events are set, for example, MS/MS measurements of the five most intense ions from the MS spectrum, provided they are above a specified intensity threshold. The cycle is repeated many times during the LC-MS/MS analysis. To maximize the information gained in the MS/MS steps, additional strategies are incorporated, such as permanent exclusion of known contaminants from MS/MS throughout the entire analysis, and temporary exclusion of peptides whose MS/MS spectra have already been measured, for the duration of the time that the peptide is expected to take to elute from the LC column. These strategies, which decrease redundancy and maximize the number of peptides surveyed in the analysis, are particularly important for phosphopeptides, which are frequently minor components in a peptide digest. In a typical LC-MS/MS analysis, a large number of MS and MS/MS spectra are acquired, for example >10,000 MS/MS spectra on state-of-the-art ion trap instruments.
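The selection logic of one data-dependent cycle can be sketched as follows. This is a simplified illustration under stated assumptions, not instrument control software: the function name `select_precursors`, the 60 s exclusion window, the 0.5 m/z matching tolerance, and all peak values are hypothetical. Only the ideas of a top-N pick, an intensity threshold, a permanent contaminant list, and a temporary exclusion list come from the text.

```python
def select_precursors(ms1_peaks, now, dynamic_exclusion, contaminants=(),
                      top_n=5, min_intensity=1e4, exclusion_s=60.0, tol=0.5):
    """Pick up to top_n precursor m/z values from one MS survey scan.

    ms1_peaks         -- list of (m/z, intensity) tuples from the survey scan
    now               -- current retention time in seconds
    dynamic_exclusion -- dict {m/z: expiry time}; updated in place
    contaminants      -- m/z values excluded for the whole run
    """
    # Release temporarily excluded ions whose exclusion window has passed.
    for mz in [m for m, expiry in dynamic_exclusion.items() if expiry <= now]:
        del dynamic_exclusion[mz]

    def is_excluded(mz):
        blocked = list(contaminants) + list(dynamic_exclusion)
        return any(abs(mz - b) <= tol for b in blocked)

    candidates = [(mz, intensity) for mz, intensity in ms1_peaks
                  if intensity >= min_intensity and not is_excluded(mz)]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    selected = [mz for mz, _ in candidates[:top_n]]

    # Ions just sent to MS/MS are excluded for the expected elution time.
    for mz in selected:
        dynamic_exclusion[mz] = now + exclusion_s
    return selected


# Hypothetical survey scan; 445.1 stands in for a known contaminant ion.
survey = [(445.1, 9e5), (1001.9, 5e4), (650.3, 8e4), (720.4, 2e3), (810.2, 3e4)]
exclusion = {}
first = select_precursors(survey, now=0.0, dynamic_exclusion=exclusion,
                          contaminants=[445.1], top_n=2)
second = select_precursors(survey, now=10.0, dynamic_exclusion=exclusion,
                           contaminants=[445.1], top_n=2)
print(first)   # [650.3, 1001.9]
print(second)  # [810.2]
```

In the second cycle the two ions already fragmented are skipped, so a weaker ion gets its turn; this is how the exclusion lists help low-abundance phosphopeptides reach MS/MS.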
Bioinformatics
In proteomics and phosphoproteomics applications, search programs are used that utilize minimally processed MS/MS data without the need for manual interpretation (21). Development of these programs, for example SEQUEST, which allow the integration of mass spectrometry data with database searching, has been one of the enabling developments in proteomics. In SEQUEST-based searches, the experimentally measured peptide mass is used to locate in the database peptide sequences whose masses match the measured mass; experimental product ion patterns are then compared to theoretical patterns for each candidate peptide, and a correlation score is calculated. The highest scoring peptide sequences are reported. For phosphopeptide characterization, the search considers possible addition of 80 Da to serine, threonine, and tyrosine residues. After completion of the search, it is imperative that the spectra and the database search outputs are inspected before an ultimate decision about the correctness of the match is reached. For phosphopeptides, this examination includes verification that the correct amino acid sequence was retrieved, and verification of the assignment of the phosphorylation site. Once the phosphoproteins are identified and their sites are characterized, additional bioinformatics resources are available for in silico analysis and functional integration. For example, with the program Scansite (scansite.mit.edu), sequences of the identified proteins are searched to locate motifs that would suggest phosphorylation by a specific kinase or a phospho-specific binding interaction. Information on protein phosphorylation is compiled in several databases, for example Phosphosite (www.phosphosite.org), Phosida (www.phosida.org), and others.
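The comparison of experimental and theoretical product ion patterns, with a variable 80 Da modification on serine, threonine, and tyrosine, can be illustrated with a deliberately simplified scorer. The sketch below is not SEQUEST's correlation score: it swaps in a plain shared-peak count, skips the precursor-mass filtering step, and considers only a single phosphate per peptide. The function names and the 0.5 m/z tolerance are assumptions; the observed m/z values are the y-ion labels from Fig. 3.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728
PHOSPHO = 79.96633  # HPO3, the "80 Da" added by phosphorylation


def fragment_mz(peptide, site):
    """Singly charged b- and y-ion m/z values for a peptide that carries
    one phosphate on the residue at 0-based index `site`."""
    masses = [RESIDUE_MASS[aa] + (PHOSPHO if i == site else 0.0)
              for i, aa in enumerate(peptide)]
    b = [sum(masses[:k]) + PROTON for k in range(1, len(peptide))]
    y = [sum(masses[-k:]) + WATER + PROTON for k in range(1, len(peptide))]
    return b + y


def shared_peak_count(observed_mz, theoretical_mz, tol=0.5):
    """Number of observed peaks explained by at least one theoretical ion."""
    return sum(any(abs(obs - theo) <= tol for theo in theoretical_mz)
               for obs in observed_mz)


def rank_phosphosites(peptide, observed_mz, tol=0.5):
    """Score every S/T/Y position as the candidate site; best score first."""
    sites = [i for i, aa in enumerate(peptide) if aa in "STY"]
    scored = [(shared_peak_count(observed_mz, fragment_mz(peptide, s), tol), s)
              for s in sites]
    return sorted(scored, reverse=True)


# y5, y6, y8, y9, y10, y11 as labeled in Fig. 3.
observed = [683.5, 812.5, 1042.6, 1157.7, 1272.7, 1329.7]
ranked = rank_phosphosites("FNDSEGDDTEETEDYR", observed)
print(ranked[0])  # (6, 3): six matched peaks, site index 3 (the serine)
```

With these six y-ions, only the serine candidate explains every peak, because placing the phosphate on a threonine or the tyrosine would shift some or all of the observed y-ions by about 80 Da. Manual inspection of a search result follows the same logic.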
IEF, the strip is divided into sections. Each section still contains multiple proteins but the complexity of these mixtures is greatly reduced compared to that of the initial mixture. The proteins in each section of the IPG strip are digested with trypsin to produce mixtures of peptides that include phosphorylated and non-phosphorylated peptides. IMAC is used to enrich for phosphorylated peptides. The steps in the IMAC procedure involve: (1) selective binding of the phosphopeptides via interaction of their phosphate groups with the immobilized metal ion (e.g., Ga3+) under carefully controlled acidic pH conditions; (2) washing to remove unbound and non-specifically bound material; (3) elution of phosphopeptides from the column under alkaline conditions. Following IMAC, desalting and volume reduction of the samples are performed with a C18 minicolumn, and the samples are analyzed by LC-MS/MS. The nano-LC setup includes a combined capillary column/spray needle packed with a C18 stationary phase. The i.d. of the column is 75 µm, the i.d. of the spray tip is 15 µm, and the flow rate is on the order of 50-150 nL/min. Mobile phases typical for reversed-phase chromatography that are compatible with mass spectrometry are used, such as water/acetonitrile/formic acid or water/methanol/formic acid (26). Peptides and phosphopeptides eluting from the nano-LC are ionized by nanoelectrospray to produce multiply protonated ions (in most cases doubly or triply charged). MS and MS/MS spectra are acquired in the data-dependent mode. MS/MS/MS may be performed in non-data-dependent mode in a separate LC-MS/MS experiment. Alternatively, this step, where MS3 is triggered when a (M+nH-H3PO4)n+ product ion is present in the MS/MS spectrum, may be incorporated into data-dependent scanning (3). The set of data is used to search a protein sequence database such as SWISSPROT or NCBInr. The search parameters include modifications on the amino acid residues where phosphorylation is expected to occur.
The searches yield lists of phosphopeptide matches with scores indicating the quality of the match. The matches are evaluated manually. This evaluation has two objectives: confirmation of the correct amino acid sequence of the phosphopeptide, which establishes the presence of the phosphorylated form of the corresponding protein in the pituitary; and assignment of the exact phosphorylation site(s) in the peptide. This validation includes inspection of the MS/MS data and the scores. Finally, the phosphorylated proteins are put into the context of current scientific knowledge, using databases such as Phosphosite that extract and compile published information on protein phosphorylation.
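The trypsin digestion step in this workflow has an in silico counterpart: to generate candidate peptides from a protein sequence database, the database sequences are cut by the same rule. The sketch below uses the commonly applied convention that trypsin cleaves C-terminal to lysine (K) or arginine (R) except when the next residue is proline (P). That rule, the function name `tryptic_digest`, the `missed_cleavages` parameter, and the example sequence are assumptions for illustration and are not taken from the article.

```python
import re


def tryptic_digest(protein, missed_cleavages=0, min_length=1):
    """Return tryptic peptides of a protein sequence (one-letter codes).

    Cleavage is after K or R unless followed by P. Peptides spanning up to
    `missed_cleavages` uncut sites are included as well.
    """
    # Zero-width split points (Python 3.7+); drop the empty trailing piece.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for start in range(len(pieces)):
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end > len(pieces):
                break
            peptide = "".join(pieces[start:end])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


# Hypothetical sequence; note that the K-R bond is cut but the R-P bond is not.
print(tryptic_digest("GASKRPEDTYRLLK"))
# ['GASK', 'RPEDTYR', 'LLK']
```

Allowing one or two missed cleavages is a common search setting, since real digests are rarely complete.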
Acknowledgments
The authors' research activities are funded by the University of Tennessee College of Pharmacy, and by Chiesi Pharmaceuticals. Funds for mass spectrometry instrumentation have been provided in part by NIH grant 1S10 RR16679.
References
1. Cohen P. The regulation of protein function by multisite phosphorylation - a 25 year update. Trends Biochem. Sci. 2000;25:596-601.
2. Siuti N, Kelleher NL. Decoding protein modifications using top-down mass spectrometry. Nat. Methods 2007;4:817-821.
3. Beausoleil SA, Jedrychowski M, Schwartz D, Elias JE, Villen J, Li J, Cohn MA, Cantley LC, Gygi SP. Large-scale characterization of HeLa cell nuclear phosphoproteins. Proc. Natl. Acad. Sci. U. S. A. 2004;101:12130-12135.
4. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006;127:635-648.
5. Rush J, Moritz A, Lee KA, Guo A, Goss VL, Spek EJ, Zhang H, Zha XM, Polakiewicz RD, Comb MJ. Immunoaffinity profiling of tyrosine phosphorylation in cancer cells. Nat. Biotechnol. 2005;23:94-101.
6. Nühse T, Yu K, Salomon A. Isolation of phosphopeptides by immobilized metal ion affinity chromatography. Curr. Protoc. Mol. Biol. 2007;18:18.13.
7. Thingholm TE, Jørgensen TJ, Jensen ON, Larsen MR. Highly selective enrichment of phosphorylated peptides using titanium dioxide. Nat. Protoc. 2006;1:1929-1935.
8. Ficarro SB, McCleland ML, Stukenberg PT, Burke DJ, Ross MM, Shabanowitz J, Hunt DF, White FM. Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 2002;20:301-305.
9. Collins MO, Yu L, Choudhary JS. Analysis of protein phosphorylation on a proteome-scale. Proteomics 2007;7:2751-2768.
10. Dass C. Fundamentals of Contemporary Mass Spectrometry. 2007. Wiley, Hoboken, NJ, pp. 15-60.
11. Kinter M, Sherman NE. Protein Sequencing and Identification Using Tandem Mass Spectrometry. 2000. Wiley, New York, pp. 31-39.
12. Dongre AR, Jones JL, Somogyi A, Wysocki VH. Influence of peptide composition, gas-phase basicity, and chemical modifications on fragmentation efficiency: Evidence for the mobile proton model. J. Am. Chem. Soc. 1996;118:8365-8374.
13. Paizs B, Suhai S. Fragmentation pathways of protonated peptides. Mass Spectrom. Rev. 2005;24:508-548.
14. Roepstorff P, Fohlman J. Proposal for a common nomenclature for sequence ions in mass spectra of peptides. Biomed. Mass Spectrom. 1984;11:601.
15. Biemann K. Contributions of mass spectrometry to peptide and protein structure. Biomed. Environ. Mass Spectrom. 1988;16:99-111.
16. Dass C. Fundamentals of Contemporary Mass Spectrometry. 2007. Wiley, Hoboken, NJ, pp. 317-322.
17. Wells JM, McLuckey SA. Collision-induced dissociation (CID) of peptides and proteins. Methods Enzymol. 2005;402:148-185.
18. Steen H, Jebanathirajah JA, Rush J, Morrice N, Kirschner MW. Phosphorylation analysis by mass spectrometry: Myths, facts, and the consequences for qualitative and quantitative measurements. Mol. Cell. Proteomics 2006;5:172-181.
19. DeGnore JP, Qin J. Fragmentation of phosphopeptides in an ion trap mass spectrometer. J. Am. Soc. Mass Spectrom. 1998;9:1175-1188.
20. Palumbo AM, Tepe JJ, Reid GE. Mechanistic insights into the multistage gas-phase fragmentation behavior of phosphoserine- and phosphothreonine-containing peptides. J. Proteome Res. 2008;7:771-779.
21. Hernandez P, Müller M, Appel RD. Automated protein identification by tandem mass spectrometry: issues and strategies. Mass Spectrom. Rev. 2006;25:235-254.
22. Syka JE, Coon JJ, Schroeder MJ, Shabanowitz J, Hunt DF. Peptide and protein sequence analysis by electron transfer dissociation mass spectrometry. Proc. Natl. Acad. Sci. U. S. A. 2004;101:9528-9533.
23. Molina H, Horn DM, Tang N, Mathivanan S, Pandey A. Global proteomic profiling of phosphopeptides using electron transfer dissociation tandem mass spectrometry. Proc. Natl. Acad. Sci. U. S. A. 2007;104:2199-2204.
24. Wolf-Yadlin A, Hautaniemi S, Lauffenburger DA, White FM. Multiple reaction monitoring for robust quantitative proteomic analysis of cellular signaling networks. Proc. Natl. Acad. Sci. U. S. A. 2007;104:5860-5865.
25. Beranova-Giorgianni S, Zhao Y, Desiderio DM, Giorgianni F. Phosphoproteomic analysis of the human pituitary. Pituitary 2006;9:109-120.
26. Giorgianni F, Cappiello A, Beranova-Giorgianni S, Palma P, Trufelli H, Desiderio DM. LC-MS/MS analysis of peptides with methanol as organic modifier: improved limits of detection. Anal. Chem. 2004;76:7028-7038.
See Also
Post-Translational Modifications, Roles in Regulating Protein Function; Proteins, Chemistry and Chemical Reactivity of