Anders Bergkvist
Protein Interaction
Human prion protein PrPC, residues 125-228
[Figure: panels labeled "NMR" and "X-ray"; schematic with nodes A, B, C, R, Y and Z joined by unknown ("?") links]
Sequence alignment (Pc): N positions (columns), M species (rows); i and j mark two of the positions.

TAGEYGYFCEPHQGAANVKLGADSGALVFEPATVTIKAGDSVTWTNNAGFPHNIVFDEDA
TAGTYGYFCEPHQGAVTVKLGADSGALVFEPSSVTIKAGETVTWVNNAGFPHNIVFDEDE
AAGEYGYYCEPHQGAATVKLGADSGALEFVPKTLTIKSGETVNFVNNAGFPHNIVFDEDA
EKGTYSFYCSPHQGAIEIKLGGDDGALAFVPGSFTVAAGEKIVFKNNAGFPHNIVFDEDE
EKGTYSIYCSPHQGAVEVLLGGSDGSLAFVPSNIEVAAGETVVFKNNAGFPHNVLFDEDE
AKGTYKFYCSPHQGAVEVLLGASDGGLAFVPSSLEVSAGETIVFKNNAGFPHNVVFDEDE
AKGTYKFYCSPHQGAVEVLLGASDGGLAFVPNSFEVSAGDTIVFKNNAGFPHNVVFDEDE
EAGTYSFYCAPHQGAIEVLLGGGDGSLAFVPNDFSIAKGEKIVFKNNAGFPHNVVFDEDE
EPGTYSFYCAPHQGAIEVLLGGGDGSLAFIPNDFSIAKGEKIVFKNNAGYPHNVVFDEDE
EPGSYGFYCAPHQGAMEVLLGSDDGSLAFVPSEFTVAKGEKIVFKNNAGFPHNVVFDEDE
EKGSYSFYCSPHQGAIEVLLGGDDGSLAFIPNDFSVAAGEKIVFKNNAGFPHNVVFDEDE
ZKGSYSFYCSPHQGAIEILLGGDDGSLAFVPNNFTVASGEKITFKNNAGFPHNVVFDEDE
EKGSYSFYCSPHQGALDVLLGSDDGELAFVPNNFSVPSGEKITFKNNAGFPHNVVFDEDE
NKGEYSFYCSPHQGAIDVLLGADDGSLAFVPSEFSISPGEKIVFKNNAGFPHNIVFDEDS
DKGEYTFYCSPHQGAVDVLLGADDGSLAFVPSEFSVPAGEKIVFKNNAGFPHNVLFDEDA
EKGTYTFYCAPHQGALEVLLGGDDGSLAFIPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
EKGTYTFYCAPHQGALDVLLGGDDGSLAFIPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
EKGTYTFYCAPHQGAIEVLLGSDDGGLAFVPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
EKGTYTFYCAPHQGAIEVLLGSDDGGLAFVPGNFSVSAGEKITFKNNAGFPHNVVFDEDE
EKGTYSFYCSPHQGAIEVLLGSDDGGLAFVPGNFSISAGEKITFKNNAGFPHNVVFDEDE
EKGTYSFYCAPHQGAAEVLLGSSDGGLVFEPSTFSVASGEKIVFKNNAGFPHNVVFDEDE
EKGTYKFYCAPHAGAAEVLLGSSDGGLAFVPSDLSIASGEKITFKNNAGFPHNVVFDEDE
ESGTYKFYCSPHQGAVEILLGGEDGSLAFIPSNFSVPSGEKITFKNNAGFPHNVVFDEDE
EKGTYKFYCSPHQGAVEVLLGGGDGSLAFLPGDFSVASGEEIVFKNNAGFPHNVVFDEDE
TKGTYSFYCSPHQGALEVLLGSGDGSLVFVPSEFSVPSGEKIVFKNNAGFPHNVVFDEDE
All positions in the alignment are analyzed in pairs, for a total of N(N-1)/2 analyses.
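The pairwise enumeration can be sketched as follows. This is a minimal illustration, not the author's code: the function name `position_pairs` and the value N = 60 are assumptions chosen only to make the count concrete.

```python
from itertools import combinations


def position_pairs(n_positions):
    """Return every unordered pair (i, j), i < j, of alignment columns."""
    return combinations(range(n_positions), 2)


# Hypothetical N, used only to illustrate the N(N-1)/2 count.
n = 60
pairs = list(position_pairs(n))

# The number of pairs equals N(N-1)/2.
assert len(pairs) == n * (n - 1) // 2
print(len(pairs))  # 1770
```

Because the number of analyses grows quadratically with N, the pair loop is the natural place to look when the run time becomes a concern.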
Sequences in the alignment (M = 25): PLAS_SCEOB, PLAS_CHLFU, PLAS_CHLRE, PLAS_RUMOB, O22646, PLAS_PEA, PLAS_VICFA, PLAS_CAPBU, PLAT_ARATH, PLAS_ARATH, PLAS_CUCPE, PLAS_CUCSA, PLAS_MERPE, PLAS_POPNI, PLAT_POPNI, PLAS_LYCES, PLAS_SOLTU, PLAS_TOBAC, PLAT_TOBAC, PLAS_SOLCR, PLAS_LACSA, PLAS_SILPR, PLAS_SAMNI, PLAS_SPIOL, PLAS_PHAVU
Positions i & j. Column j, residues by species:
E E V V V V I V V E V L . . .

Column j, converted to charge:
-1 -1 0 0 0 0 0 0 0 -1 0 0 . . .

Why residues are converted to property values:
The algorithm can handle data on a ratio or interval scale, rather than data on a nominal scale
Reduces the number of possible entities
Analyzes biochemically relevant properties
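The residue-to-charge conversion shown above can be sketched as follows. The charge table is an assumption (Asp and Glu as -1, Lys and Arg as +1, everything else, including His, as 0); the slides do not list the values used, and the names `CHARGE` and `column_to_charge` are hypothetical.

```python
# Assumed side-chain charges near neutral pH; not taken from the slides.
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def column_to_charge(column):
    """Map each one-letter residue in an alignment column to its charge."""
    return [CHARGE.get(residue, 0) for residue in column]


# Column j as shown on the slide (first twelve species).
column_j = "EEVVVVIVVEVL"
print(column_to_charge(column_j))
# [-1, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0]
```

Under this assumed table the output agrees with the charge values shown for column j. Other properties, such as volume, would only need a different lookup table.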
F(i, j)
The values in columns i and j are added species by species, and the variance of the sum is computed:

i + j = sum
j:   -1 -1 0 0 0 0 0 0 0 -1 0 0 . . .
sum: -1 -1 0 -1 -1 -1 -1 -1 -1 -2 -1 -1 . . .
Variance of the sum = 2.0

Introduce Xq for each randomization q = 1, 2, 3, ..., Q:
Xq = 0 if the randomized variance is higher than the original variance
Xq = 1 otherwise

p-value = (Σ Xq)/Q, summed over q = 1, ..., Q
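A minimal sketch of this randomization test follows, under stated assumptions. The slides do not say how the randomized data are generated; here column j is shuffled across species, which is one common choice. The values of column i are not visible in the slides and are inferred here as sum minus j. The "variance" is computed as the unnormalized sum of squared deviations, which gives 2.0 for the twelve values shown. All function names are hypothetical.

```python
import random


def spread_of_sum(col_i, col_j):
    """Sum of squared deviations of the element-wise sum of two columns."""
    s = [a + b for a, b in zip(col_i, col_j)]
    total = sum(s)
    return sum(v * v for v in s) - total * total / len(s)


def randomization_p_value(col_i, col_j, n_rand=1000, seed=0):
    """Fraction of randomizations whose spread is not higher than the original."""
    rng = random.Random(seed)
    original = spread_of_sum(col_i, col_j)
    shuffled = list(col_j)
    hits = 0
    for _ in range(n_rand):
        rng.shuffle(shuffled)  # assumption: permute column j across species
        if spread_of_sum(col_i, shuffled) <= original:
            hits += 1  # X_q = 1
    return hits / n_rand


# Column j from the slide; column i inferred as (sum - j), an assumption.
col_j = [-1, -1, 0, 0, 0, 0, 0, 0, 0, -1, 0, 0]
col_i = [0, 0, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1]

print(spread_of_sum(col_i, col_j))  # 2.0
print(randomization_p_value(col_i, col_j))
```

A small p-value means that few shuffles give a sum as tightly distributed as the original pairing, which is the sense in which the two positions covary non-randomly.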
Randomizations (original variance of the sum = 2.0):

q | randomized j | sum | variance of the sum | Xq | p-value
1 | 0 0 -1 -1 0 0 0 0 -1 0 0 0 . . . | -1 -1 -2 -1 -1 0 -1 -1 -2 0 -1 -1 . . . | 4.0 | 0 | 0.00
2 | 0 0 0 0 0 -1 0 -1 0 0 -1 0 . . . | -1 0 0 -1 -1 -2 -1 -1 -1 -1 -2 -1 . . . | 4.0 | 0 | 0.00
3 | 0 -1 0 -1 0 0 0 -1 0 0 0 0 . . . | -1 -1 -1 -1 -1 -1 -1 -2 -1 -1 -1 0 . . . | 2.0 | 1 | 0.33
4 | 0 0 0 0 -1 0 -1 0 0 0 0 -1 . . . | 0 -1 -1 -1 -2 0 -2 0 -1 -1 -1 -2 . . . | 6.0 | 0 | 0.25
5 | -1 -1 0 0 0 0 0 0 0 -1 0 0 . . . | -2 -2 -1 0 0 -1 -1 -1 0 -2 -1 -1 . . . | 6.0 | 0 | 0.20
6 | -1 0 0 0 0 0 -1 0 0 0 -1 0 . . . | -2 0 -1 -1 0 0 -2 -1 -1 -1 -2 -1 . . . | 6.0 | 0 | 0.17
7 | 0 0 -1 -1 0 -1 0 0 0 0 0 0 . . . | -1 -1 -2 -2 -1 -2 -1 0 -1 -1 0 0 . . . | 6.0 | 0 | 0.14
8 | -1 0 0 0 0 -1 0 -1 0 0 0 0 . . . | -2 -1 -1 -1 0 -1 -1 -2 -1 0 -1 -1 . . . | 4.0 | 0 | 0.13
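The eight randomizations above can be re-checked with a short script. This is only a sketch: it takes the "sum" rows as listed and assumes the quoted "variance" is the unnormalized sum of squared deviations over the twelve values shown, which reproduces the slide's numbers. The function names are hypothetical.

```python
def sum_sq_dev(values):
    """Unnormalized sum of squared deviations from the mean."""
    total = sum(values)
    return sum(v * v for v in values) - total * total / len(values)


original_sum = [-1, -1, 0, -1, -1, -1, -1, -1, -1, -2, -1, -1]
randomized_sums = [
    [-1, -1, -2, -1, -1, 0, -1, -1, -2, 0, -1, -1],
    [-1, 0, 0, -1, -1, -2, -1, -1, -1, -1, -2, -1],
    [-1, -1, -1, -1, -1, -1, -1, -2, -1, -1, -1, 0],
    [0, -1, -1, -1, -2, 0, -2, 0, -1, -1, -1, -2],
    [-2, -2, -1, 0, 0, -1, -1, -1, 0, -2, -1, -1],
    [-2, 0, -1, -1, 0, 0, -2, -1, -1, -1, -2, -1],
    [-1, -1, -2, -2, -1, -2, -1, 0, -1, -1, 0, 0],
    [-2, -1, -1, -1, 0, -1, -1, -2, -1, 0, -1, -1],
]


def running_p_values(original, trials):
    """Return (q, statistic, X_q, running p-value) for each randomization."""
    reference = sum_sq_dev(original)
    hits = 0
    rows = []
    for q, trial in enumerate(trials, start=1):
        stat = sum_sq_dev(trial)
        x_q = 0 if stat > reference else 1  # X_q = 0 if randomized is higher
        hits += x_q
        rows.append((q, stat, x_q, hits / q))
    return rows


for row in running_p_values(original_sum, randomized_sums):
    print(row)
```

After eight randomizations the running p-value is 1/8 = 0.125, which the slide shows rounded to 0.13.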
*) B. F. J. Manly, Randomization, Bootstrap and ..., Chapman & Hall, 2nd ed., 1998
Results
Pairs of non-randomly covarying residues in a plastocyanin sequence alignment, based on charge at an overall significance level of 1%
Pairs of non-randomly covarying residues in a plastocyanin sequence alignment, based on volume at an overall significance level of 1%
Conclusions
Preliminary results suggest good agreement between covariance and structural proximity of amino acid residues
The current algorithm is very computationally intensive
Protein sequence alignments are intrinsically covarying (evolutionary covariance)
Future
Devise a way to compensate for evolutionary covariances
Optimize the algorithm
Acknowledgements
Martin Billeter, Peter Damaschke, Gerhard Wagner, George Church

The Knut and Alice Wallenberg Foundation
The Foundation Blanceflor Boncompagni-Ludovisi, née Bildt
Carl Tryggers Stiftelse
Vetenskapsrådet
Assar Gabrielssons Stiftelse