Identification by Mass Spectrometry
Masses of Amino Acid Residues
Protein Backbone
H...-HN-CH-CO-NH-CH-CO-NH-CH-CO-OH
Side chains R(i-1), R(i), and R(i+1) attach to the three CH groups; the N-terminus is on the left and the C-terminus is on the right.
N- and C-terminal Peptides
[Figure: N- and C-terminal peptides of a peptide with total mass 486. The N-terminal and C-terminal fragment masses form complementary pairs that each sum to 486: 57/429, 154/332, 185/301, and 71/415.]
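The complementary masses in the figure can be reproduced with a short sketch. It assumes nominal (integer) residue masses and ignores the terminal H and OH groups; the peptide GPFNA is a hypothetical example chosen only because its prefix and suffix masses match the numbers shown, and the function name is illustrative.

```python
# Nominal (integer) residue masses in daltons, rounded from standard values.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def terminal_peptide_masses(peptide):
    """Return (prefix_masses, suffix_masses, total_mass) for a peptide string.

    Prefixes are N-terminal peptides, suffixes are C-terminal peptides.
    Assumption: terminal groups and charge are ignored.
    """
    residue_masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(residue_masses)
    prefixes = []
    running = 0
    for mass in residue_masses[:-1]:   # one cut between each pair of residues
        running += mass
        prefixes.append(running)
    suffixes = [total - p for p in prefixes]
    return prefixes, suffixes, total

# Hypothetical example peptide (an assumption, not named in the slides).
prefixes, suffixes, total = terminal_peptide_masses("GPFNA")
for p, s in zip(prefixes, suffixes):
    print(f"N-terminal {p:>3}  C-terminal {s:>3}  sum {p + s}")
```

Each N-terminal mass plus its matching C-terminal mass equals the total mass, which is why the pairs in the figure all sum to 486.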
Peptide Fragmentation
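[Figure: backbone of a four-residue peptide, H-N-C-C-N-C-C-N-C-C-N-C-COOH, with side chains R1 to R4. Breaks read from the N-terminus give the a2, b2, a3, and b3 ions, with neutral-loss variants b2-H2O and b3-NH3; breaks read from the C-terminus give the y3, y2, and y1 ions, with variants y3-H2O and y2-NH3.]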
Mass Spectra
[Figure: peptide identification by MS/MS. Mass spectrum of the peptide GVDLK (intensity vs. mass); gaps between peaks identify residues, for example 57 Da = G and 99 Da = V.]
Tandem Mass-Spectrometry
Breaking Proteins into Peptides
[Figure: the protein MPSERGTDIMRPAKID... is broken into the peptides MPSER, GTDIMR, and PAKID, which are separated by HPLC and passed to MS/MS.]
Mass Spectrometry
Matrix-Assisted Laser Desorption/Ionization (MALDI)
[Figure: LC-MS/MS workflow. Panels: the LC base-peak chromatogram (relative abundance vs. time in minutes), the full MS scan 1707 (relative abundance vs. m/z), and the MS/MS scan 1708 (relative abundance vs. m/z). Instrument stages: ion source, MS-1, collision cell, MS-2.]
Protein Identification by Tandem Mass Spectrometry
[Figure: an MS/MS instrument produces a tandem mass spectrum (S#: 1708, RT: 54.47, Full ms2 638.00 [165.00 - 1925.00]; relative abundance vs. m/z). The spectrum is converted into a sequence either by database search (Sequest) or by de novo interpretation (Sherenga).]
Tandem Mass Spectrum
Tandem mass spectrometry (MS/MS) mainly generates partial N- and C-terminal peptides.
The spectrum consists of different ion types because peptides can be broken in several places.
Chemical noise often complicates the spectrum.
A spectrum is represented in 2-D: mass/charge axis vs. intensity axis.
De Novo vs. Database Search
[Figure: the same tandem mass spectrum (S#: 1708, relative abundance vs. m/z) interpreted in two ways, each candidate being evaluated by mass and score.
Database search: the spectrum is compared against a database of known peptides (MDERHILNM, KLQWVCSDL, PTYWASDL, ENQIKRSACVM, TLACHGGEM, NGALPQWRT, HLLERTKMNVV, GGPASSDA, GGLITGMQSD, MQPLMNWE, ALKIIMNVRT, AVGELTK, HEWAILF, GHNLWAMNAC, GVFGSVLRA, EKLNKAATYIN..).
De novo: the search space is the database of all peptides, of size 20^n (AAAAAAAA, AAAAAAAC, ..., AVGELTI, AVGELTK, AVGELTL, AVGELTM, ..., YYYYYYYY).
Both approaches return AVGELTK.]
De Novo vs. Database Search: A Paradox
The database of all peptides is huge: O(20^n) for peptides of length n.
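To give a feel for how quickly this search space grows, here is a small sketch that counts all peptides of a given length over the 20 standard amino acids. The function name is illustrative, not from the slides.

```python
def all_peptides_count(length, alphabet_size=20):
    """Number of distinct peptides of the given length: alphabet_size ** length."""
    return alphabet_size ** length

# A 7-residue peptide such as AVGELTK is one of 20**7 candidates.
for n in (7, 8, 10):
    print(f"length {n}: {all_peptides_count(n):,} peptides")
```

Even at length 8 the count exceeds 25 billion, so enumerating every peptide and scoring each one against the spectrum is not practical.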
Theoretical Spectrum
Building Spectrum Graph
How to create vertices (from masses)
[Figure: building the spectrum step by step for the peptide SEQUENCE on a mass/charge (M/Z) axis. The b-ions spell SEQUENCE from left to right; the a-ions are a shifted copy of the b-ions ("a is an ion type shift in b"); the y-ions spell the peptide in reverse (ECNEUQES). Adding intensities and noise peaks gives the MS/MS spectrum (intensity vs. mass/charge).]
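Some Mass Differences between Peaks Correspond to Amino Acids
[Figure: peaks of the MS/MS spectrum connected wherever their mass difference equals an amino acid mass; the connections are labeled with the letters of SEQUENCE.]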
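The idea that gaps between peaks can spell out residues can be sketched as follows. This assumes nominal integer masses and exact matching; the peak list is made up for illustration (210 plays the role of a noise peak) and the function name is hypothetical.

```python
# Nominal (integer) residue masses in daltons, rounded from standard values.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

# Reverse lookup: mass -> residues with that mass (I/L and K/Q share a mass).
MASS_TO_RESIDUES = {}
for residue, mass in RESIDUE_MASS.items():
    MASS_TO_RESIDUES.setdefault(mass, []).append(residue)

def residue_gaps(peaks):
    """Return (lighter, heavier, residues) for peak pairs one residue apart."""
    ordered = sorted(peaks)
    hits = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            residues = MASS_TO_RESIDUES.get(high - low)
            if residues:
                hits.append((low, high, "/".join(residues)))
    return hits

# Illustrative peak list (an assumption, not data from the slides).
hits = residue_gaps([57, 154, 210, 301, 415, 486])
for low, high, residues in hits:
    print(f"{low} -> {high}: {residues}")
```

Real spectra need a mass tolerance rather than exact equality, and noise peaks can create spurious gaps that happen to equal a residue mass.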
Ion Types
Some masses correspond to fragment ions; others are just random noise.
Knowing the ion types $\Delta = \{\delta_1, \delta_2, \ldots, \delta_k\}$ lets us distinguish fragment ions from noise.
We can learn the ion types $\delta_i$ and their probabilities $q_i$ by analyzing a large test sample of annotated spectra.
Example of Ion Type
$\Delta = \{\delta_1, \delta_2, \ldots, \delta_k\}$
Ion types {b, b-NH3, b-H2O} correspond to $\Delta = \{0, 17, 18\}$
*Note: In reality the $\delta$ value of ion type b is -1, but we will hide it for the sake of simplicity.
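As a rough illustration of this example, the sketch below maps an N-terminal peptide mass to the ion masses it could produce, assuming an ion of type $\delta$ appears at the peptide mass minus $\delta$. The dictionary and function names are hypothetical; the offsets are the simplified values from the slide.

```python
# Simplified offsets from the slide: b -> 0, b-NH3 -> 17, b-H2O -> 18.
ION_OFFSETS = {"b": 0, "b-NH3": 17, "b-H2O": 18}

def ions_of(prefix_mass, offsets=ION_OFFSETS):
    """Map each ion type to the mass at which it would appear.

    Assumption: an ion of type delta is observed at prefix_mass - delta.
    """
    return {name: prefix_mass - delta for name, delta in offsets.items()}

ions = ions_of(154)
print(ions)
```

An N-terminal peptide of mass 154 would therefore be expected to leave peaks near 154, 137, and 136, though any of them may be missing in a real spectrum.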
Match between Spectra and the Shared Peak Count
The match between two spectra is the number of masses (peaks) they share (Shared Peak Count, or SPC).
In practice, mass spectrometrists use the weighted SPC, which reflects the intensities of the peaks.
The match between an experimental and a theoretical spectrum is defined similarly.
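A minimal sketch of the unweighted shared peak count follows. The mass tolerance and the greedy one-to-one matching are assumptions added for illustration; the slides do not specify how two masses are judged to be the same peak.

```python
def shared_peak_count(spectrum_a, spectrum_b, tolerance=0.5):
    """Count peaks of spectrum_a that have a partner in spectrum_b.

    Assumptions: spectra are lists of masses, two peaks match when they
    differ by at most `tolerance`, and each peak is matched at most once.
    """
    b_sorted = sorted(spectrum_b)
    used = [False] * len(b_sorted)
    count = 0
    for a in sorted(spectrum_a):
        for j, b in enumerate(b_sorted):
            if not used[j] and abs(a - b) <= tolerance:
                used[j] = True   # consume this peak so it is not counted twice
                count += 1
                break
    return count

theoretical = [57, 154, 301, 415]
experimental = [57.2, 154.1, 300.0, 415.4]   # illustrative values only
spc = shared_peak_count(theoretical, experimental)
print(spc)
```

A weighted variant would add up a function of the matched peaks' intensities instead of counting each match as 1.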
Peptide Sequencing Problem
Goal: Find a peptide with maximal match between
an experimental and theoretical spectrum.
Input:
S: experimental spectrum
m: parent mass
Output:
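P: peptide with mass m, whose theoretical spectrum best matches the experimental spectrum S
Vertices of Spectrum Graph
$\Delta = \{\delta_1, \delta_2, \ldots, \delta_k\}$
Every N-terminal peptide (of mass $m$) can generate up to $k$ ions: $m - \delta_1, m - \delta_2, \ldots, m - \delta_k$
Every mass $s$ in an MS/MS spectrum generates $k$ vertices $V(s) = \{s + \delta_1, s + \delta_2, \ldots, s + \delta_k\}$ corresponding to potential N-terminal peptides
Vertices of the spectrum graph: $\{\text{initial vertex}\} \cup V(s_1) \cup V(s_2) \cup \ldots \cup V(s_m) \cup \{\text{terminal vertex}\}$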
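The vertex construction can be sketched as below. It assumes the initial vertex is mass 0, the terminal vertex is the parent mass, and shifted masses outside that range are discarded; those choices and the function name are illustrative, not stated in the slides.

```python
def spectrum_graph_vertices(spectrum, parent_mass, offsets=(0, 17, 18)):
    """Return the sorted vertex masses of the spectrum graph.

    Every peak s contributes the reverse shifts s + delta, one per ion type.
    Assumptions: initial vertex = 0, terminal vertex = parent_mass, and
    shifted masses outside (0, parent_mass) are dropped.
    """
    vertices = {0, parent_mass}
    for s in spectrum:
        for delta in offsets:
            candidate = s + delta
            if 0 < candidate < parent_mass:
                vertices.add(candidate)
    return sorted(vertices)

# Illustrative peaks: 57 (a b-ion), 137 (a b-NH3 ion of mass 154), 301 (a b-ion).
vertices = spectrum_graph_vertices([57, 137, 301], parent_mass=486)
print(vertices)
```

Each peak yields one correct vertex only if its ion type is guessed correctly, so most of the generated vertices are false candidates that the later scoring step has to sort out.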
Reverse Shifts
Shift in H2O
Shift in H2O+NH3
Edges of Spectrum Graph
Two vertices whose mass difference corresponds to an amino acid A are connected by a directed edge labeled A.
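A sketch of the edge rule, assuming two vertices are joined by a directed edge (lighter to heavier) whenever their mass difference equals a nominal residue mass. Reading the labels along a path from the initial to the terminal vertex then spells a candidate peptide. The vertex list and function names are illustrative assumptions.

```python
# Nominal (integer) residue masses in daltons, rounded from standard values.
RESIDUE_MASS = {
    "G": 57, "A": 71, "S": 87, "P": 97, "V": 99, "T": 101, "C": 103,
    "L": 113, "I": 113, "N": 114, "D": 115, "K": 128, "Q": 128,
    "E": 129, "M": 131, "H": 137, "F": 147, "R": 156, "Y": 163, "W": 186,
}

def spectrum_graph_edges(vertices):
    """Return {u: [(v, residue), ...]} for all pairs with v - u a residue mass."""
    by_mass = {}
    for residue, mass in RESIDUE_MASS.items():
        by_mass.setdefault(mass, []).append(residue)
    ordered = sorted(vertices)
    edges = {u: [] for u in ordered}
    for i, u in enumerate(ordered):
        for v in ordered[i + 1:]:
            for residue in by_mass.get(v - u, []):
                edges[u].append((v, residue))
    return edges

def spell_paths(edges, source, sink):
    """Return the peptides spelled by every path from source to sink."""
    if source == sink:
        return [""]
    return [residue + rest
            for nxt, residue in edges[source]
            for rest in spell_paths(edges, nxt, sink)]

# Illustrative vertices; 210 is a noise vertex with no outgoing edges.
edges = spectrum_graph_edges([0, 57, 154, 210, 301, 415, 486])
peptides = spell_paths(edges, 0, 486)
print(peptides)
```

Because edges always point from a lighter to a heavier vertex, the graph is acyclic and the recursion terminates; a practical implementation would use dynamic programming over scored vertices instead of enumerating every path.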
$p(P, S) = \prod_{s \in S} p(P, s)$
Peak Score
For a position $t$ that represents ion type $\delta_j$:
$\prod_{i=1}^{k} (1 - q_i)$
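One way to make such a probabilistic score concrete is the sketch below. It is a simplified model under stated assumptions, not the slides' exact formula: each ion type $\delta_i$ of an N-terminal peptide is taken to produce a peak independently with probability $q_i$, so an observed ion contributes $q_i$ and a missing one contributes $(1 - q_i)$. Noise peaks are ignored, and the probabilities, tolerance, and function name are made up for illustration.

```python
import math

def log_likelihood(prefix_masses, spectrum, offsets, probs, tolerance=0.5):
    """Log-probability of the observed ion pattern under an independent-ion model.

    Assumptions: an ion of type delta appears at prefix_mass - delta, is
    produced independently with probability q, and noise peaks are ignored.
    """
    total = 0.0
    for mass in prefix_masses:
        for delta, q in zip(offsets, probs):
            expected = mass - delta
            present = any(abs(expected - s) <= tolerance for s in spectrum)
            total += math.log(q if present else 1.0 - q)
    return total

offsets = (0, 17, 18)        # b, b-NH3, b-H2O (simplified values from the slide)
probs = (0.8, 0.2, 0.3)      # illustrative probabilities, not measured values
ll = log_likelihood([57, 154], [57, 136, 154], offsets, probs)
print(math.exp(ll))
```

Working with logarithms turns the product into a sum, which avoids numerical underflow when many ion positions are multiplied together.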