
Protein Interactions Predicted by a Combination of NMR and Analysis of Protein Sequence Covariances

Anders Bergkvist

abk@molbio.gu.se 031-773 3803

Protein Structure Analysis by Nuclear Magnetic Resonance (NMR)

Protein Interaction

Figure: Human prion protein PrPC, residues 125-228

Figure panels: NMR, X-ray

Protein interaction on the cellular level

Figure: protein A with possible interaction partners B and C (unknown: ?)

When it is known that two proteins interact...

Figure: proteins A and B (!)

Ex. High-throughput mass spectrometry

Ex. NMR spectroscopy chemical shift mapping

Ex. Protein sequence alignment covariances

Protein interaction on the amino acid level


Amino acid residue contacts within a protein


Sequence Alignment Covariances

Part of a plastocyanin (Pc) sequence alignment: N positions (columns) by M species (rows)

PLAS_SCEOB  TAGEYGYFCEPHQGAANVKLGADSGALVFEPATVTIKAGDSVTWTNNAGFPHNIVFDEDA
PLAS_CHLFU  TAGTYGYFCEPHQGAVTVKLGADSGALVFEPSSVTIKAGETVTWVNNAGFPHNIVFDEDE
PLAS_CHLRE  AAGEYGYYCEPHQGAATVKLGADSGALEFVPKTLTIKSGETVNFVNNAGFPHNIVFDEDA
PLAS_RUMOB  EKGTYSFYCSPHQGAIEIKLGGDDGALAFVPGSFTVAAGEKIVFKNNAGFPHNIVFDEDE
O22646      EKGTYSIYCSPHQGAVEVLLGGSDGSLAFVPSNIEVAAGETVVFKNNAGFPHNVLFDEDE
PLAS_PEA    AKGTYKFYCSPHQGAVEVLLGASDGGLAFVPSSLEVSAGETIVFKNNAGFPHNVVFDEDE
PLAS_VICFA  AKGTYKFYCSPHQGAVEVLLGASDGGLAFVPNSFEVSAGDTIVFKNNAGFPHNVVFDEDE
PLAS_CAPBU  EAGTYSFYCAPHQGAIEVLLGGGDGSLAFVPNDFSIAKGEKIVFKNNAGFPHNVVFDEDE
PLAT_ARATH  EPGTYSFYCAPHQGAIEVLLGGGDGSLAFIPNDFSIAKGEKIVFKNNAGYPHNVVFDEDE
PLAS_ARATH  EPGSYGFYCAPHQGAMEVLLGSDDGSLAFVPSEFTVAKGEKIVFKNNAGFPHNVVFDEDE
PLAS_CUCPE  EKGSYSFYCSPHQGAIEVLLGGDDGSLAFIPNDFSVAAGEKIVFKNNAGFPHNVVFDEDE
PLAS_CUCSA  ZKGSYSFYCSPHQGAIEILLGGDDGSLAFVPNNFTVASGEKITFKNNAGFPHNVVFDEDE
PLAS_MERPE  EKGSYSFYCSPHQGALDVLLGSDDGELAFVPNNFSVPSGEKITFKNNAGFPHNVVFDEDE
PLAS_POPNI  NKGEYSFYCSPHQGAIDVLLGADDGSLAFVPSEFSISPGEKIVFKNNAGFPHNIVFDEDS
PLAT_POPNI  DKGEYTFYCSPHQGAVDVLLGADDGSLAFVPSEFSVPAGEKIVFKNNAGFPHNVLFDEDA
PLAS_LYCES  EKGTYTFYCAPHQGALEVLLGGDDGSLAFIPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
PLAS_SOLTU  EKGTYTFYCAPHQGALDVLLGGDDGSLAFIPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
PLAS_TOBAC  EKGTYTFYCAPHQGAIEVLLGSDDGGLAFVPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
PLAT_TOBAC  EKGTYTFYCAPHQGAIEVLLGSDDGGLAFVPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
PLAS_SOLCR  EKGTYSFYCSPHQGAIEVLLGSDDGGLAFVPGNFSISAGEKITFKNNAGFPHNVVFDEDE
PLAS_LACSA  EKGTYSFYCAPHQGAAEVLLGSSDGGLVFEPSTFSVASGEKIVFKNNAGFPHNVVFDEDE
PLAS_SILPR  EKGTYKFYCAPHAGAAEVLLGSSDGGLAFVPSDLSIASGEKITFKNNAGFPHNVVFDEDE
PLAS_SAMNI  ESGTYKFYCSPHQGAVEILLGGEDGSLAFIPSNFSVPSGEKITFKNNAGFPHNVVFDEDE
PLAS_SPIOL  EKGTYKFYCSPHQGAVEVLLGGGDGSLAFLPGDFSVASGEEIVFKNNAGFPHNVVFDEDE
PLAS_PHAVU  TKGTYSFYCSPHQGALEVLLGSGDGSLVFVPSEFSVPSGEKIVFKNNAGFPHNVVFDEDE

All positions in the alignment are analyzed in pairs (i, j), for a total of N(N-1)/2 analyses
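The pair count N(N-1)/2 can be checked with a minimal sketch. The function name `position_pairs` is hypothetical, and N = 60 is taken only from the length of the first row of the excerpt shown, not from the full alignment.

```python
from itertools import combinations


def position_pairs(n_positions):
    """Return every unordered pair (i, j), i < j, of alignment columns."""
    return list(combinations(range(n_positions), 2))


# Assumption: N = 60, the length of the first row of the excerpt.
n_positions = 60
pairs = position_pairs(n_positions)
n_pairs = len(pairs)

# The enumeration agrees with the closed form N(N-1)/2.
closed_form = n_positions * (n_positions - 1) // 2
```

For N = 60 this gives 1770 pairs, each of which is analyzed separately.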

Analysis of sequence alignment covariances in general

position i:  S S S D D D D D D D D D . . .
position j:  E E V V V V I V V E V L . . .

Göbel et al., Proteins 18 (1994), 309-317

and this and other studies

F(i, j)

Proposed analysis based on amino acid properties (charge)

position i:   0  0  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 . . .
position j:  -1 -1  0  0  0  0  0  0  0 -1  0  0 . . .

The algorithm can handle data on a ratio or interval scale, rather than data on a nominal scale
Reduce the number of possible entities
Analyze biochemically relevant properties

F(i, j)
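The conversion from residue letters to charges can be sketched as follows. The charge table is an assumption (formal side-chain charges near neutral pH, with histidine treated as neutral). The slides do not state which scale was used, but this table reproduces the numbers shown for the two example columns.

```python
# Assumed formal side-chain charges near neutral pH (illustrative only).
# Every residue not listed, including histidine, is treated as neutral.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def column_charges(column):
    """Map a column of one-letter residue codes to integer charges."""
    return [CHARGE.get(residue, 0) for residue in column]


# The two example columns from the slides.
column_i = "SSSDDDDDDDDD"
column_j = "EEVVVVIVVEVL"

charges_i = column_charges(column_i)
charges_j = column_charges(column_j)
```

Replacing the twenty-letter alphabet by a numeric property is what lets the columns be added and compared on an interval scale.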

Proposed analysis based on amino acid charge or volume

position i:   0  0  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 . . .
position j:  -1 -1  0  0  0  0  0  0  0 -1  0  0 . . .
sum (i + j): -1 -1  0 -1 -1 -1 -1 -1 -1 -2 -1 -1 . . .

Variance of the sum = 2.0

What is the statistical significance of this value? That is what is important!

F(i, j)
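The quoted value of 2.0 can be reproduced from the twelve values shown if "variance" is read as the sum of squared deviations from the mean, without dividing by the number of values. That reading is inferred from the numbers on the slides, not from a stated definition, and the function names below are hypothetical.

```python
def sum_of_squared_deviations(values):
    """Sum of squared deviations from the mean (no division by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def variance_of_sum(a, b):
    """Score a column pair by the spread of the element-wise sum a + b."""
    return sum_of_squared_deviations([x + y for x, y in zip(a, b)])


# Charges of the two example columns, as shown on the slides.
charges_i = [0, 0, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1]
charges_j = [-1, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0]

score = variance_of_sum(charges_i, charges_j)
```

Because shuffling a column changes neither its mean nor its own spread, any change in this score under randomization comes from how the two columns are paired, which is what makes it usable as a covariance measure.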

Introduce Xq for each randomization q = 1, 2, 3, ..., Q:
Xq = 0 if the randomized variance is higher than the original variance
Xq = 1 otherwise

p-value = (Σ Xq)/Q, summed over q = 1, ..., Q
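A minimal sketch of this randomization test, under stated assumptions: both columns are shuffled independently in every round (as the example randomizations suggest), the function names and the fixed seed are hypothetical, and the score is computed as n times the sum of squared deviations so that integer inputs compare exactly.

```python
import random


def scaled_variance_of_sum(a, b):
    """n times the sum of squared deviations of a + b (exact for integers)."""
    s = [x + y for x, y in zip(a, b)]
    return len(s) * sum(v * v for v in s) - sum(s) ** 2


def randomization_p_value(a, b, n_randomizations, seed=0):
    """Fraction of randomizations whose variance is not higher than the original."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    original = scaled_variance_of_sum(a, b)
    a_perm, b_perm = list(a), list(b)
    total = 0
    for _ in range(n_randomizations):
        rng.shuffle(a_perm)
        rng.shuffle(b_perm)
        randomized = scaled_variance_of_sum(a_perm, b_perm)
        x_q = 0 if randomized > original else 1  # X_q as defined on the slide
        total += x_q
    return total / n_randomizations


charges_i = [0, 0, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1]
charges_j = [-1, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0]
p_value = randomization_p_value(charges_i, charges_j, n_randomizations=5000)
```

A small p-value means that few random pairings give a sum as uniform as the observed one, that is, the two positions compensate each other more than chance would suggest.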

Proposed analysis based on amino acid charge or volume

Randomization 1
position i:  -1 -1 -1  0 -1  0 -1 -1 -1  0 -1 -1 . . .
position j:   0  0 -1 -1  0  0  0  0 -1  0  0  0 . . .
sum (i + j): -1 -1 -2 -1 -1  0 -1 -1 -2  0 -1 -1 . . .
Variance of the sum = 4.0 (original: 2.0)
X1=0
p-value = 0.00

Randomization 2
position i:  -1  0  0 -1 -1 -1 -1  0 -1 -1 -1 -1 . . .
position j:   0  0  0  0  0 -1  0 -1  0  0 -1  0 . . .
sum (i + j): -1  0  0 -1 -1 -2 -1 -1 -1 -1 -2 -1 . . .
Variance of the sum = 4.0 (original: 2.0)
X1=0 X2=0
p-value = 0.00

Randomization 3
position i:  -1  0 -1  0 -1 -1 -1 -1 -1 -1 -1  0 . . .
position j:   0 -1  0 -1  0  0  0 -1  0  0  0  0 . . .
sum (i + j): -1 -1 -1 -1 -1 -1 -1 -2 -1 -1 -1  0 . . .
Variance of the sum = 2.0 (original: 2.0)
X1=0 X2=0 X3=1
p-value = 0.33

Randomization 4
position i:   0 -1 -1 -1 -1  0 -1  0 -1 -1 -1 -1 . . .
position j:   0  0  0  0 -1  0 -1  0  0  0  0 -1 . . .
sum (i + j):  0 -1 -1 -1 -2  0 -2  0 -1 -1 -1 -2 . . .
Variance of the sum = 6.0 (original: 2.0)
X1=0 X2=0 X3=1 X4=0
p-value = 0.25

Randomization 5
position i:  -1 -1 -1  0  0 -1 -1 -1  0 -1 -1 -1 . . .
position j:  -1 -1  0  0  0  0  0  0  0 -1  0  0 . . .
sum (i + j): -2 -2 -1  0  0 -1 -1 -1  0 -2 -1 -1 . . .
Variance of the sum = 6.0 (original: 2.0)
X1=0 X2=0 X3=1 X4=0 X5=0
p-value = 0.20

Randomization 6
position i:  -1  0 -1 -1  0  0 -1 -1 -1 -1 -1 -1 . . .
position j:  -1  0  0  0  0  0 -1  0  0  0 -1  0 . . .
sum (i + j): -2  0 -1 -1  0  0 -2 -1 -1 -1 -2 -1 . . .
Variance of the sum = 6.0 (original: 2.0)
X1=0 X2=0 X3=1 X4=0 X5=0 X6=0
p-value = 0.17

Randomization 7
position i:  -1 -1 -1 -1 -1 -1 -1  0 -1 -1  0  0 . . .
position j:   0  0 -1 -1  0 -1  0  0  0  0  0  0 . . .
sum (i + j): -1 -1 -2 -2 -1 -2 -1  0 -1 -1  0  0 . . .
Variance of the sum = 6.0 (original: 2.0)
X1=0 X2=0 X3=1 X4=0 X5=0 X6=0 X7=0
p-value = 0.14

Randomization 8
position i:  -1 -1 -1 -1  0  0 -1 -1 -1  0 -1 -1 . . .
position j:  -1  0  0  0  0 -1  0 -1  0  0  0  0 . . .
sum (i + j): -2 -1 -1 -1  0 -1 -1 -2 -1  0 -1 -1 . . .
Variance of the sum = 4.0 (original: 2.0)
X1=0 X2=0 X3=1 X4=0 X5=0 X6=0 X7=0 X8=0
p-value = 0.13
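The running p-values quoted after each randomization follow directly from the indicator values X1 to X8. A short sketch, with a hypothetical helper name, recomputes them as the cumulative mean of the indicators:

```python
def running_p_values(indicators):
    """Cumulative mean of the X_q indicators after each randomization."""
    values = []
    total = 0
    for q, x_q in enumerate(indicators, start=1):
        total += x_q
        values.append(total / q)
    return values


# X1 ... X8 as listed for the eight example randomizations.
indicators = [0, 0, 1, 0, 0, 0, 0, 0]
p_values = running_p_values(indicators)
```

The last entry is 1/8 = 0.125, which the slides show rounded to 0.13. With so few randomizations the estimate still moves a lot from step to step, which is why the number of randomizations matters.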

How many randomizations need to be done?

To reach a significance level of 1% it is recommended that you perform a minimum of 5000 randomizations*

But, since I perform N(N-1)/2 analyses of amino acid pairs, the overall significance level = N(N-1)/2 × (the individual significance level)

Thus, if I want an overall significance level of 1%, I need to perform a minimum of N(N-1) × 2500 randomizations for each amino acid pair!

N is on the order of 100-1000

*) B. F. J. Manly, Randomization, Bootstrap and ..., Chapman & Hall, 2nd ed., 1998
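To give a feel for the cost, the sketch below evaluates N(N-1) × 2500 for the quoted range of N. It assumes, as the slide's arithmetic does, that the required number of randomizations grows in proportion to the number of pairs tested. The function names are hypothetical.

```python
def number_of_pairs(n_positions):
    """Number of position pairs, N(N-1)/2."""
    return n_positions * (n_positions - 1) // 2


def randomizations_per_pair(n_positions, base=5000):
    """Randomizations per pair for a 1% overall level: N(N-1)/2 * 5000."""
    return number_of_pairs(n_positions) * base


def total_randomizations(n_positions, base=5000):
    """Randomizations summed over all pairs of positions."""
    return number_of_pairs(n_positions) * randomizations_per_pair(n_positions, base)


per_pair_100 = randomizations_per_pair(100)
per_pair_1000 = randomizations_per_pair(1000)
total_100 = total_randomizations(100)
```

For N = 100 this is 24,750,000 randomizations per pair and about 1.2 × 10^11 in total. For N = 1000 the per-pair count alone is about 2.5 × 10^9.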

Results

Pairs of non-randomly covarying residues in a plastocyanin sequence alignment, based on charge at an overall significance level of 1%

Pairs of non-randomly covarying residues in a plastocyanin sequence alignment, based on volume at an overall significance level of 1%

Conclusions

Preliminary results indicate a possible good agreement between covariances and structural proximity of amino acid residues
The current algorithm is very computationally intensive
Protein sequence alignments are intrinsically covarying (evolutionary covariance)

Future

Devise a way to compensate for evolutionary covariances
Optimize the algorithm
Implement analysis of triplets, quadruplets, etc.
Apply the working algorithm to pairs of interacting proteins

Presented by Anders Bergkvist abk@molbio.gu.se 031-773 3803

Acknowledgements
Martin Billeter
Peter Damaschke
Gerhard Wagner
George Church

The Knut and Alice Wallenberg Foundation
The Foundation Blanceflor Boncompagni-Ludovisi, née Bildt
Carl Tryggers Stiftelse
Vetenskapsrådet
Assar Gabrielssons Stiftelse
