
Proteomics & Mass Spectrometry
Nathan Edwards
Center for Bioinformatics and Computational Biology
Outline

• Proteomics

• Mass Spectrometry

• Protein Identification
• Peptide Mass Fingerprint
• Tandem Mass Spectrometry

2
Proteomics
• Proteins are the machines that drive
much of biology
• Genes are merely the recipe
• The direct characterization of a
sample’s proteins en masse.
• What proteins are present?
• How much of each protein is present?

3
Systems Biology
• Establish relationships by
• Choosing related samples,
• Global characterization, and
• Comparison.
Gene / Transcript / Protein
Measurement       Predetermined      Unknown
Discrete (DNA)    Genotyping         Sequencing
Continuous        Gene Expression    Proteomics

4
Samples

• Healthy / Diseased
• Cancerous / Benign
• Drug resistant / Drug susceptible
• Bound / Unbound
• Tissue specific
• Cellular location specific
• Mitochondria, Membrane

5
2D Gel-Electrophoresis
• Protein separation
• Molecular weight (MW)
• Isoelectric point (pI)

• Staining

• Birds-eye view of
protein abundance

6
2D Gel-Electrophoresis

Bécamel et al., Biol. Proced. Online 2002;4:94-104.


7
Paradigm Shift

• Traditional protein chemistry assay methods struggle to establish identity.
• Identity requires:
• Specificity of measurement (precision): mass spectrometry
• A reference for comparison (measurement → identity): protein sequence databases

8
Mass Spectrometer

Sample → Ionizer → Mass Analyzer → Detector

• Ionizer: MALDI, Electro-Spray Ionization (ESI)
• Mass Analyzer: Time-Of-Flight (TOF), Quadrupole, Ion-Trap
• Detector: Electron Multiplier (EM)
9
Mass Spectrometer
(MALDI-TOF)
[Figure: MALDI-TOF schematic. A UV (337 nm) laser pulse strikes the analyte/matrix on the source plate. Backing plate (grounded); extraction grid (source voltage -Vs), source region of length s; field-free drift zone (Ed = 0) of length D; detector grid (-Vs); microchannel plate detector; pulse voltage.]
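The time-of-flight principle behind the schematic can be written out as a short derivation. This is a simplified sketch: it assumes an ion of mass m and charge ze (symbols introduced here, not on the slide) gains all of its kinetic energy from the source voltage Vs and then drifts at constant speed v over the field-free length D. The time spent in the source region of length s and any initial velocity spread are ignored.

```latex
% Simplified time-of-flight relation (assumed symbols: m, z, e, v, t)
\begin{aligned}
z e V_s &= \tfrac{1}{2} m v^2 \\
t &= \frac{D}{v} = D \sqrt{\frac{m}{2 z e V_s}} \\
\frac{m}{z} &= \frac{2 e V_s \, t^2}{D^2}
\end{aligned}
```

Under these assumptions the flight time grows with the square root of m/z, so heavier ions of the same charge arrive later.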

10
Mass Spectrum

11
Mass is fundamental

12
Peptide Mass Fingerprint

Cut out
2D-Gel
Spot

13
Peptide Mass Fingerprint

Trypsin Digest

14
Peptide Mass Fingerprint

MS

15
Peptide Mass Fingerprint

16
Peptide Mass Fingerprint

• Trypsin: digestion enzyme
• Highly specific
• Cuts after K & R except if followed by P
• Protein sequence from sequence database
• In silico digest
• Mass computation
• For each protein sequence in turn:
• Compare computer generated masses with observed spectrum
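The steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: the function names, the 0.05 Da tolerance, the simple match-count score, and the two-entry "database" are all assumptions made for the example. The residue masses are the monoisotopic values from the amino-acid table in this deck, the two sequences are the ones shown on the slides, and the observed masses are five entries from the Peptide Masses slide, treated as singly protonated [M+H]+ values.

```python
import re

# Monoisotopic residue masses in Da (values from the amino-acid table in this deck)
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728


def tryptic_digest(protein):
    """In silico digest: cut after K or R unless followed by P (no missed cleavages)."""
    return [p for p in re.findall(r".*?(?:[KR](?!P)|$)", protein) if p]


def mh_plus(peptide):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def pmf_score(protein, observed, tol=0.05):
    """Count observed masses that lie within tol Da of some theoretical peptide mass."""
    theoretical = [mh_plus(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


MYOGLOBIN = (
    "GLSDGEWQQVLNVWGKVEADIAGHGQEVLI"
    "RLFTGHPETLEKFDKFKHLKTEAEMKASED"
    "LKKHGTVVLTALGGILKKKGHHEAELKPLA"
    "QSHATKHKIPIKYLEFISDAIIHVLHSKHP"
    "GDFGADAQGAMTKALELFRNDIAAKYKELG"
    "FQG"
)
ALBUMIN_FRAGMENT = (
    "MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE"
    "DHVKLVNEVTEFAK"
)

# Five masses taken from the Peptide Masses slide
observed = [748.43, 1271.66, 1378.83, 1502.66, 1606.85]

# Hypothetical two-entry database, for illustration only
database = {"myoglobin (plains zebra)": MYOGLOBIN, "ALBU_HUMAN (fragment)": ALBUMIN_FRAGMENT}
scores = {name: pmf_score(seq, observed) for name, seq in database.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Real peptide mass fingerprint scoring is statistical rather than a raw match count, but the loop structure (digest, compute masses, compare, rank) is the same.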

17
Protein Sequence

• Myoglobin - Plains zebra

GLSDGEWQQV LNVWGKVEAD IAGHGQEVLI
RLFTGHPETL EKFDKFKHLK TEAEMKASED
LKKHGTVVLT ALGGILKKKG HHEAELKPLA
QSHATKHKIP IKYLEFISDA IIHVLHSKHP
GDFGADAQGA MTKALELFRN DIAAKYKELG
FQG

18
Peptide Masses

1811.90 GLSDGEWQQVLNVWGK
1606.85 VEADIAGHGQEVLIR
1271.66 LFTGHPETLEK
1378.83 HGTVVLTALGGILK
1982.05 KGHHEAELKPLAQSHATK
1853.95 GHHEAELKPLAQSHATK
1884.01 YLEFISDAIIHVLHSK
1502.66 HPGDFGADAQGAMTK
748.43 ALELFR

20
Peptide Mass Fingerprint

[Figure: mass spectrum with peaks labeled by tryptic peptide: ALELFR, LFTGHPETLEK, HGTVVLTALGGILK, HPGDFGADAQGAMTK, VEADIAGHGQEVLIR, GLSDGEWQQVLNVWGK, GHHEAELKPLAQSHATK, YLEFISDAIIHVLHSK, KGHHEAELKPLAQSHATK]

21
Mass Spectrometry

• Strengths
• Precise molecular weight
• Fragmentation
• Automated
• Weaknesses
• Best for a few molecules at a time
• Best for small molecules
• Mass-to-charge ratio, not mass
• Intensity ≠ Abundance
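The point that the instrument reports mass-to-charge ratio rather than mass can be made concrete with a small calculation. This is a sketch under the usual assumption for positive-mode peptide spectra that each charge comes from one added proton; the function names are made up for the example. The starting value 748.43 is the ALELFR entry from the Peptide Masses slide, treated as a singly charged ion.

```python
PROTON = 1.00728  # mass of a proton in Da


def mz(mass, charge):
    """m/z of a molecule with the given neutral mass carrying `charge` extra protons."""
    return (mass + charge * PROTON) / charge


def neutral_mass(mz_value, charge):
    """Neutral mass recovered from an observed m/z and an assumed charge state."""
    return charge * (mz_value - PROTON)


m = neutral_mass(748.43, 1)  # ALELFR observed as [M+H]+
for z in (1, 2, 3):
    print(z, round(mz(m, z), 2))
```

The same peptide appears at a different position on the m/z axis for each charge state, which is why the charge must be known or inferred before a peak can be converted to a mass.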

22
Sample Preparation for
MS/MS

Enzymatic Digest
and
Fractionation

23
Single Stage MS

MS

24
Tandem Mass Spectrometry
(MS/MS)

Precursor selection

25
Tandem Mass Spectrometry
(MS/MS)

Precursor selection +
collision induced dissociation
(CID)

MS/MS

26
Peptide Fragmentation

Peptides consist of amino-acids arranged in a linear backbone.

N-terminus                          C-terminus
H…-HN-CH-CO-NH-CH-CO-NH-CH-CO-…OH
      R(i-1)   R(i)     R(i+1)
AA residue i-1    AA residue i    AA residue i+1

27
Peptide Fragmentation

28
Peptide Fragmentation

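[Figure: fragmentation of the backbone -HN-CH(Ri)-CO-NH-CH(Ri+1)-CO-NH- at a peptide bond. The N-terminal fragments are the b ions (bi, bi+1); the complementary C-terminal fragments are the y ions (yn-i, yn-i-1).]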
29
Peptide Fragmentation
Peptide: S-G-F-L-E-E-D-E-L-K
MW    ion   N-terminal fragment   C-terminal fragment   ion   MW
88    b1    S                     GFLEEDELK             y9    1080
145   b2    SG                    FLEEDELK              y8    1022
292   b3    SGF                   LEEDELK               y7    875
405   b4    SGFL                  EEDELK                y6    762
534   b5    SGFLE                 EDELK                 y5    633
663   b6    SGFLEE                DELK                  y4    504
778   b7    SGFLEED               ELK                   y3    389
907   b8    SGFLEEDE              LK                    y2    260

30
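The b and y ion masses in the table can be recomputed from residue masses. The sketch below assumes singly charged fragments, monoisotopic residue masses (from the amino-acid table in this deck), b ion = residue sum + proton, and y ion = residue sum + water + proton; the function name is made up for the example. The slide values are nominal integers, so agreement is only expected to within about 1 Da.

```python
# Monoisotopic residue masses in Da (values from the amino-acid table in this deck)
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728


def fragment_ions(peptide):
    """Return ([b1, b2, ...], [y1, y2, ...]) as singly charged m/z values."""
    residues = [RESIDUE_MASS[a] for a in peptide]
    n = len(residues)
    b = [sum(residues[:i]) + PROTON for i in range(1, n)]
    y = [sum(residues[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b, y


peptide = "SGFLEEDELK"
b, y = fragment_ions(peptide)
n = len(peptide)
for i in range(1, n):
    # Row layout follows the slide: b_i with its prefix, y_(n-i) with its suffix
    print(f"{b[i - 1]:8.2f} b{i} {peptide[:i]:<10} {peptide[i:]:<10} y{n - i} {y[n - i - 1]:8.2f}")
```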
Peptide Fragmentation
88 145 292 405 534 663 778 907 1020 1166 b ions
S G F L E E D E L K
1166 1080 1022 875 762 633 504 389 260 147 y ions
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000)]
31
Peptide Fragmentation
88 145 292 405 534 663 778 907 1020 1166 b ions
S G F L E E D E L K
1166 1080 1022 875 762 633 504 389 260 147 y ions
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000), with peaks labeled y2, y3, y4, y5, y6, y7, y8, y9 and b3, b4, b5, b6, b7, b8, b9]
32
Peptide Identification

Given:
• The mass of the precursor ion, and
• The MS/MS spectrum

Output:
• The amino-acid sequence of the peptide

33
Peptide Identification

Two paradigms:

• De novo interpretation

• Sequence database search

34
De Novo Interpretation

[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000)]

35
De Novo Interpretation

[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000), annotated with the residue labels E and L]

36
De Novo Interpretation

[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000), annotated with residue labels (S, G, F, L, E, D, K) along the fragment ladders]

37
De Novo Interpretation
Amino-Acid          Residue MW    Amino-Acid          Residue MW
A Alanine            71.03712     M Methionine        131.04049
C Cysteine          103.00919     N Asparagine        114.04293
D Aspartic acid     115.02695     P Proline            97.05277
E Glutamic acid     129.04260     Q Glutamine         128.05858
F Phenylalanine     147.06842     R Arginine          156.10112
G Glycine            57.02147     S Serine             87.03203
H Histidine         137.05891     T Threonine         101.04768
I Isoleucine        113.08407     V Valine             99.06842
K Lysine            128.09497     W Tryptophan        186.07932
L Leucine           113.08407     Y Tyrosine          163.06333

38
De Novo Interpretation

…from Lu and Chen (2003), JCB 10:1

39
De Novo Interpretation

40
De Novo Interpretation

…from Lu and Chen (2003), JCB 10:1

41
De Novo Interpretation

• Find good paths in spectrum graph
• Can’t use same peak twice
• Simple peptide fragmentation model
• Usually many apparently good solutions
• Some amino-acids have duplicate masses (I and L are identical; K and Q differ by only about 0.04 Da)
• “Best” de novo interpretation may have no biological relevance
• Identifies relatively few peptides in high-throughput workflows
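The core idea of de novo interpretation, reading residues off the mass differences between peaks, can be illustrated with a toy example. This is a much simplified sketch and not the spectrum graph algorithm itself: it assumes the peaks already form a clean, complete ladder of one ion type, and the function name and tolerances are made up for the example. The input peaks are the b1 to b9 values of SGFLEEDELK rounded to two decimals.

```python
# Monoisotopic residue masses in Da (values from the amino-acid table in this deck)
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}


def read_ladder(peaks, tol=0.02):
    """Label each gap between consecutive peaks with every residue whose mass fits."""
    peaks = sorted(peaks)
    labels = []
    for low, high in zip(peaks, peaks[1:]):
        gap = high - low
        fits = sorted(a for a, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol)
        labels.append("/".join(fits) if fits else "?")
    return labels


# b1..b9 of SGFLEEDELK, assumed to be cleanly observed
b_ions = [88.04, 145.06, 292.13, 405.21, 534.26, 663.30, 778.33, 907.37, 1020.45]
print(read_ladder(b_ions))

# With a looser tolerance, K and Q can no longer be told apart
print(read_ladder([100.00, 228.07], tol=0.05))
```

Even in this idealized case the I/L positions stay ambiguous, which is one reason a de novo result often has several equally good readings.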

42
Sequence Database
Search
• Compares peptides from a protein
sequence database with spectra
• Filter peptide candidates by
• Precursor mass
• Digest motif
• Score each peptide against spectrum
• Generate all possible peptide fragments
• Match putative fragments with peaks
• Score and rank
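The filter, match, and rank steps above can be sketched as follows. This is an illustration only: the shared-fragment count used as a score, the tolerances, and the function names are assumptions, and real search engines use more elaborate scoring. The peak list uses the nominal m/z values of the labeled peaks from the SGFLEEDELK example in this deck. The candidate list is hypothetical: it assumes the digest-motif filter has already been applied, and SGFLEDEELK is a made-up permuted decoy with the same precursor mass.

```python
# Monoisotopic residue masses in Da (values from the amino-acid table in this deck)
RESIDUE_MASS = {
    "A": 71.03712, "C": 103.00919, "D": 115.02695, "E": 129.04260,
    "F": 147.06842, "G": 57.02147, "H": 137.05891, "I": 113.08407,
    "K": 128.09497, "L": 113.08407, "M": 131.04049, "N": 114.04293,
    "P": 97.05277, "Q": 128.05858, "R": 156.10112, "S": 87.03203,
    "T": 101.04768, "V": 99.06842, "W": 186.07932, "Y": 163.06333,
}
WATER, PROTON = 18.01056, 1.00728


def mh_plus(peptide):
    """Monoisotopic [M+H]+ of the intact peptide (the precursor)."""
    return sum(RESIDUE_MASS[a] for a in peptide) + WATER + PROTON


def fragments(peptide):
    """All singly charged b and y fragment m/z values of the peptide."""
    r = [RESIDUE_MASS[a] for a in peptide]
    b = [sum(r[:i]) + PROTON for i in range(1, len(r))]
    y = [sum(r[i:]) + WATER + PROTON for i in range(1, len(r))]
    return b + y


def score(peptide, peaks, tol=1.0):
    """Number of putative fragments that match some observed peak."""
    return sum(any(abs(f - p) <= tol for p in peaks) for f in fragments(peptide))


def search(candidates, precursor_mh, peaks, precursor_tol=2.0):
    """Filter candidates by precursor mass, then score and rank the survivors."""
    kept = [c for c in candidates if abs(mh_plus(c) - precursor_mh) <= precursor_tol]
    return sorted(((score(c, peaks), c) for c in kept), reverse=True)


# Labeled peaks from the example spectrum (nominal m/z values)
peaks = [260, 292, 389, 405, 504, 534, 633, 663, 762, 778, 875, 907, 1020, 1022, 1080]
candidates = ["SGFLEEDELK", "SGFLEDEELK", "LVNEVTEFAK", "LFTGHPETLEK"]

hits = search(candidates, precursor_mh=1166.56, peaks=peaks)
for s, peptide in hits:
    print(s, peptide)
```

The precursor filter removes most candidates cheaply, and the fragment comparison then separates peptides that share a precursor mass but differ in sequence.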

43
Peptide Fragmentation

S G F L E E D E L K
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000)]
44
Peptide Fragmentation
88 145 292 405 534 663 778 907 1020 1166 b ions
S G F L E E D E L K
1166 1080 1022 875 762 633 504 389 260 147 y ions
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000)]
45
Peptide Fragmentation
88 145 292 405 534 663 778 907 1020 1166 b ions
S G F L E E D E L K
1166 1080 1022 875 762 633 504 389 260 147 y ions
[Figure: MS/MS spectrum, % intensity (0 to 100) vs. m/z (250 to 1000), with peaks labeled y2, y3, y4, y5, y6, y7, y8, y9 and b3, b4, b5, b6, b7, b8, b9]
46
Sequence Database Search

• Sequence fills in gaps in the spectrum
• All candidates have biological relevance
• Practical for high-throughput peptide identification
• Correct peptide might be missing from database!

47
Peptide Candidate
Filtering
Digestion Enzyme: Trypsin
• Cuts just after K or R unless followed
by a P.
• Must allow for “missed” cleavage sites
• “Average” peptide length about 10-15
amino-acids
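Allowing for missed cleavage sites amounts to joining adjacent fragments of the complete digest. The sketch below is one straightforward way to enumerate the candidates; the function names and the `max_missed` parameter are made up for the example, and the test sequence is the ALBU_HUMAN prefix shown in this deck.

```python
def cleavage_fragments(protein):
    """Complete tryptic digest: cut after K or R unless the next residue is P."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and following != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal piece without a cut site
    return fragments


def digest(protein, max_missed=0):
    """Peptides with up to max_missed missed cleavage sites, in sequence order."""
    fragments = cleavage_fragments(protein)
    peptides = []
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            # Joining fragments i..j skips (j - i) cleavage sites
            peptides.append("".join(fragments[i:j + 1]))
    return peptides


ALBU_PREFIX = (
    "MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE"
    "DHVKLVNEVTEFAK"
)
print(digest(ALBU_PREFIX, max_missed=0))
print(digest(ALBU_PREFIX, max_missed=1))
```

Each extra allowed missed cleavage adds roughly one more peptide per cleavage site, so the candidate list grows with `max_missed`.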

48
Peptide Candidate
Filtering
>ALBU_HUMAN
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE
DHVKLVNEVTEFAK…

No missed cleavage sites

MK
WVTFISLLFLFSSAYSR
GVFR
R
DAHK
SEVAHR
FK
DLGEENFK
ALVLIAFAQYLQQCPFEDHVK
LVNEVTEFAK

49
Peptide Candidate
Filtering
>ALBU_HUMAN
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFE
DHVKLVNEVTEFAK…

One missed cleavage site

MKWVTFISLLFLFSSAYSR
WVTFISLLFLFSSAYSRGVFR
GVFRR
RDAHK
DAHKSEVAHR
SEVAHRFK
FKDLGEENFK
DLGEENFKALVLIAFAQYLQQCPFEDHVK
ALVLIAFAQYLQQCPFEDHVKLVNEVTEFAK
…

50
Peptide Scoring

• Peptide fragments vary based on
• The instrument
• The peptide’s amino-acid sequence
• The peptide’s charge state
• Etc…
• Search engines model peptide fragmentation to various degrees.
• Speed vs. sensitivity tradeoff
• y-ions & b-ions occur most frequently

51
Mascot Search Engine

52
Mascot MS/MS Ions
Search

53
Mascot MS/MS Search
Results

54
Summary

• Protein identification by mass spectrometry is a key element of proteomics and systems biology.
• Mass spectrometry + sequence databases represent a huge leap for protein (bio-)chemistry.
• Sample prep, instruments and algorithms still maturing, much work to be done.

64
Further Reading

• Matrix Science (Mascot) Web Site
• www.matrixscience.com
• Seattle Proteome Center (ISB)
• www.proteomecenter.org
• Proteomic Mass Spectrometry Lab at The Scripps Research Institute
• fields.scripps.edu
• UCSF ProteinProspector
• prospector.ucsf.edu

65
