

The sequence dictates the 3D structure

Potential problems in finding the final native structure
1. Exposed hydrophobic surfaces forcing aggregation – Solution: Chaperones
2. Incorrect disulfide bonds – Solution: PDI (Protein Disulfide Isomerases)
3. Isomerization of proline residues: all X-Pro bonds are made trans, but some need to be
cis – Solution: PPI (Peptidyl Proline cis-trans Isomerases)

Protein folding experiments
Proteins are marginally stable – important for biology (dynamics/function,
protein turnover)

Contributions to free energy of folding of soluble proteins /
Factors governing stability of proteins:

The covalent bonds are the same in U (unfolded) and F (folded), so they
play no role (with the exception of disulfides).

Entropic (ΔS) – unfavorable – many
conformations in U, a single conformation in F

Hydrophobic effect – favorable – burial of
hydrophobic side chains releases water, which results in
an increase in ΔS

Enthalpic (ΔH) – favorable – formation of noncovalent
interactions in F: H-bonds, electrostatics

Net result is a marginally stable protein.
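
The balance of these terms can be written as ΔG = ΔH − TΔS. The sketch below is only an illustration, assuming a simple two-state (U ⇌ F) model. The numerical values are made up for the example (not measured data), and `delta_g` and `fraction_folded` are hypothetical helper names.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)

def delta_g(delta_h, delta_s, temperature):
    """Gibbs free energy of folding in kJ/mol: dG = dH - T*dS."""
    return delta_h - temperature * delta_s

def fraction_folded(dg_fold, temperature):
    """Two-state model: K = [F]/[U] = exp(-dG/RT); fraction folded = K/(1+K)."""
    k = math.exp(-dg_fold / (R * temperature))
    return k / (1.0 + k)

# Assumed, illustrative values: a large favorable enthalpy is almost
# cancelled by a large unfavorable net entropy term.
dg = delta_g(delta_h=-200.0, delta_s=-0.57, temperature=298.0)
print("dG of folding (kJ/mol):", round(dg, 2))
print("fraction folded:", fraction_folded(dg, 298.0))
```

Two large, opposing contributions leave a small net ΔG, which is what "marginally stable" means here.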
Intramolecular chaperones


Type I : N-terminal peptide

Helps in folding – tertiary structure

Type II : C-terminal peptide

Helps in assembly - quaternary structure

Natively (inherently, intrinsically)
Unstructured (disordered, unfolded)

Natively unstructured proteins

How common are they?

~ 10% of all proteins are fully disordered.

~ 40% of eukaryotic proteins have at least one long (>50 amino acids) disordered loop

Why have they evolved to be unstructured?

Permits specific binding with fast on/off-rate

Provides specific binding without strong binding

They can cover a large surface with few residues

They can be removed by proteases quickly (regulated)

Natively unstructured proteins

challenge the traditional protein structure paradigm,
which states that a specific well-defined structure is required for the
correct function of a protein and that the structure defines the function
of the protein.

The Database of Protein Disorder (DisProt): http://

Coupled folding and binding
•  Many unstructured proteins undergo transitions to more ordered states upon
binding to their targets.

•  The coupled folding and binding may be local, involving only a few interacting
residues, or it might involve an entire protein domain. It was recently shown that
the coupled folding and binding allows the burial of a large surface area that
would only be possible for fully structured proteins if they were much larger.

•  The ability of disordered proteins to bind, and thus to exert a function, shows
that stability is not a required condition for function.

Example of inherently unstructured protein
BamCD complex
Components involved in the Assembly of beta-barrel membrane proteins

BamCD complex co-crystal structure

Structural changes in BamD upon binding BamC

An inherently unstructured
region of protein can cover a
large amount of protein
surface with few residues

The interface between

BamC and BamD is
conserved (maroon)

Potential control (regulation) mechanism for BAM?
By looking at how other structurally related proteins
(or even other machines) work,
you can find clues to how your protein may work

Sequence signatures of disorder
•  low content of bulky hydrophobic amino acids

•  high proportion of polar and charged amino acids.

•  Thus disordered sequences cannot bury sufficient hydrophobic core to fold like
stable globular proteins.

•  In some cases, hydrophobic clusters in disordered sequences provide the clues
for identifying the regions that undergo coupled folding and binding.

•  low complexity sequences, i.e. sequences with overrepresentation of a few
residues. While low complexity sequences are a strong indication of disorder, the reverse is not
necessarily true, that is, not all disordered proteins have low complexity sequences.

•  Disordered proteins have a low content of predicted secondary structure.
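
One simple way to put a number on "low complexity" is the Shannon entropy of the residue composition. This is a minimal sketch, not the algorithm of any particular disorder predictor; `shannon_entropy` is a hypothetical helper name and the two test sequences are invented for illustration.

```python
import math
from collections import Counter

def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the residue composition of a sequence.

    Low values mean a few residue types dominate (low complexity).
    """
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Invented example sequences (assumptions, not real proteins).
low_complexity = "SSSSGGSSSSGGSSSS"
mixed = "MKVLAETRDWQHNPFY"
print(shannon_entropy(low_complexity), shannon_entropy(mixed))
```

In practice such a score is usually computed over a sliding window rather than the whole chain, so that a short low complexity stretch inside a longer protein is not averaged away.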

Identification of intrinsically unstructured proteins
•  once purified.

•  Folded proteins have a high density (partial specific volume of 0.72-0.74 mL/g) and
commensurately small radius of gyration. Hence, unfolded proteins can be
detected by methods that are sensitive to molecular size, density or hydrodynamic
drag, such as size exclusion chromatography, analytical ultracentrifugation, and small
angle X-ray scattering (SAXS).

•  Unfolded proteins are also characterized by their lack of secondary structure, as
assessed by far-UV (170-250 nm) circular dichroism (esp. a pronounced minimum
at ~200 nm) or infrared spectroscopy.

•  Unfolded proteins have backbone peptide groups exposed to solvent, so
that they are readily cleaved by proteases, undergo rapid hydrogen-deuterium
exchange and exhibit a small dispersion (<1 ppm) in their 1H amide chemical shifts
as measured by NMR. (Folded proteins typically show dispersions as large as 5 ppm
for the amide protons.)

•  The primary method to obtain information on disordered regions of a protein is
NMR spectroscopy.
Residue sequences within IUP

Since amino acid sequence determines 3-D structure, amino acid sequence should also
determine lack of 3-D structure

Disordered protein sequences are substantially depleted in I, L, V, W, F, Y, and C.

Disordered protein sequences are enriched in: E, K, R, G, Q, S, P, and A
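
The two residue lists above can be turned into a quick composition check. This is a rough sketch only: the residue sets come from the lists above, while `composition_fractions` is a hypothetical helper name. It reports the two fractions and applies no threshold, since none is given here.

```python
# Residue classes taken from the two lists above.
ORDER_PROMOTING = set("ILVWFYC")      # depleted in disordered sequences
DISORDER_PROMOTING = set("EKRGQSPA")  # enriched in disordered sequences

def composition_fractions(sequence):
    """Return (order_fraction, disorder_fraction) for a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    order = sum(1 for aa in seq if aa in ORDER_PROMOTING) / n
    disorder = sum(1 for aa in seq if aa in DISORDER_PROMOTING) / n
    return order, disorder

# Invented example sequence (an assumption, not a real protein).
print(composition_fractions("MSEEKPRGSQAPEKKSGG"))
```

Residues that appear in neither list (for example M, T, D, N, H) count toward the length but toward neither fraction.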

Why are regions of sequence missing
in high-resolution crystal structures?
Intrinsically disordered loops?

Many crystallographic structures have missing loops -- that is, ranges of amino acids
with no atomic coordinates in the model.

These "gaps" in the model are often thought to be artifacts of inadvertent disorder
in the crystal.

In some cases, these gaps may be alerting us to the presence of intrinsically
disordered loops in an otherwise folded protein.

Such gaps are the basis for the DISOPRED2 disorder prediction server.
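
Finding such gaps in a model comes down to looking for breaks in the residue numbering of a chain. The sketch below assumes the residue numbers have already been read from the coordinate file. `find_gaps` is a hypothetical helper and is not how DISOPRED2 itself works. It ignores insertion codes and cannot see residues missing at the termini.

```python
def find_gaps(residue_numbers):
    """Return (first_missing, last_missing) ranges absent from a model.

    residue_numbers: residue numbers that have atomic coordinates.
    """
    ordered = sorted(set(residue_numbers))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps

# Invented numbering for illustration: residues 4-6 and 9-11 were not modeled.
modeled = [1, 2, 3, 7, 8, 12, 13]
print(find_gaps(modeled))
```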

Macromolecular crowding
It’s very crowded in the cell

Why don’t all the proteins aggregate and precipitate out of solution?

An artist's view of the inside of an E. coli cell.

Illustration by David S. Goodsell, The
Scripps Research Institute.
Electrostatic repulsion is important for limiting interactions

The approximate distribution of pI values predicted for proteins in Drosophila melanogaster,

based on the genome.

Similar plots are obtained for other eukaryotic organisms.

Distribution of pI values for eukaryotic proteins by location in the cell.
Cytoplasmic proteins (CP)= red
Integral membrane proteins (MP) = blue
Nuclear proteins (NP)= green

MP – attracted/stabilized by neg. lipids

CP attracted to MP
CP repelled by each other
MP repelled by each other
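
The attraction/repulsion pattern above follows from the sign of the net charge: a protein is net positive below its pI and net negative above it. The sketch below is a deliberately crude illustration. The pH of 7.4 and the pI values are assumed, the function names are hypothetical, and the model ignores charge distribution on the surface and ionic strength.

```python
def net_charge_sign(pi, ph=7.4):
    """+1 if the protein is net positive at this pH, -1 if net negative, 0 at pH == pI."""
    if ph < pi:
        return 1
    if ph > pi:
        return -1
    return 0

def interaction(pi_a, pi_b, ph=7.4):
    """Classify two proteins as 'repel', 'attract' or 'neutral' by net charge sign only."""
    product = net_charge_sign(pi_a, ph) * net_charge_sign(pi_b, ph)
    if product > 0:
        return "repel"
    if product < 0:
        return "attract"
    return "neutral"

# Assumed, illustrative pI values: acidic cytoplasmic protein (CP),
# basic integral membrane protein (MP).
CP_PI, MP_PI = 5.5, 9.0
print("CP-CP:", interaction(CP_PI, CP_PI))
print("MP-MP:", interaction(MP_PI, MP_PI))
print("CP-MP:", interaction(CP_PI, MP_PI))
```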
Protein misfolding

Protein quality control

Protein homeostasis
Protein recycling
Protein Misfolding

Change in structure

Causes fiber/filament formation

Amyloid-beta Precursor Protein (APP)

a large membrane protein in nerves.

Plays a role in neural growth and repair.


Amyloid-β peptide

first enzyme to be discovered
(18th century)

second enzyme to be crystallized
(after urease).

These crystals played an
important role in showing that
enzymes were proteins and that
they had a defined structure.

Beta-secretase
HIV protease

Pepsin works best in strong
hydrochloric acid
Transmissible spongiform encephalopathies (TSEs)

Microscopic "holes" / "spongy" architecture in brain tissue.

Prion: infectious proteins
PrP (prion protein)

PrPc (cellular) → PrPSc (Scrapie)

Scrapie is a fatal, degenerative disease that affects the nervous
systems of sheep and goats. It is one of several transmissible
spongiform encephalopathies (TSEs), which are related to bovine
spongiform encephalopathy (BSE or "mad cow disease") and chronic
wasting disease of deer.
Proteins that need to be removed –

Damaged or aggregated (misfolded) proteins

Metabolic proteins (enzymes)

Regulatory proteins (inhibitors, repressors, activators, cell cycle)

Many others…

Example protein half-lives

Name                              Half-Life
Collagen                          117 years
Eye lens crystallin               >70 years
RFC1 (part of DNA polymerase)     9 hours
RPS8 (part of ribosome)           3 hours
Ornithine decarboxylase           11 minutes
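
To see what these half-lives mean in practice, the fraction of a protein pool that remains after a given time can be estimated. The sketch assumes simple first-order (exponential) decay with no new synthesis, which is an assumption rather than something stated in the table; `fraction_remaining` is a hypothetical helper name.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of the starting pool left after `elapsed` time.

    Assumes first-order decay; `elapsed` and `half_life` must share one unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)

# Half-lives taken from the table above.
print("Ornithine decarboxylase after 33 min:", fraction_remaining(33, 11))
print("RPS8 after 9 hours:", fraction_remaining(9, 3))
print("Collagen after 1 year:", fraction_remaining(1, 117))
```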
Ubiquitin – a PTM before destruction
Ubiquitin is used to tag obsolete proteins for destruction
highly conserved, found in almost all tissues (ubiquitous).

76 residues
pI = 6.8
7 Lys.

E1: ubiquitin-activating enzyme
E2/E3: ubiquitin-conjugating enzymes

Ubiquitin Ligase
Proteasome: cell's protein recyclers
19S regulatory

20S core 14β

End view of 20S    Side view of 20S

19S regulatory
Bacterial signal peptide peptidase
SppA (protease IV)

E. coli SppA
cleaves signal peptides
Novak & Dev 1988

•  618 residues

•  Anchored to the inner membrane

•  sequence analysis
3 transmembrane segments
29-45, 398-414 and 421-441
E. coli SppA : soluble domain (Δ2–46, 1TM domain removed)
2.5 Å resolution

Ser409 / Lys209 region
at the interface
of 2 domains
Ser409 globular

Gene duplication:
2 domains: 56-316, 326-549
linker
Only 18% identity,
rmsd = 2.5 Å
SppA is a tetrameric bowl-shaped structure
Compartmentalized protease

Positively charged
Axial hole


SEC / light scattering
shows tetramer
4 active sites
SppA has structural similarity to
Ser/His/Asp protease ClpP

ClpP: 7 protomers in each layer,
2 layers to form barrel

SppA: 4 protomers (8 domains)
Gram-positive bacteria & Archaea : "half-sized" SppA

Larger SppA
Gram-negative bacteria

Half-sized SppA

Most similar to C-terminal domain
Gram-positive bacteria
Bacillus subtilis SppA (Gram-positive)
16% identity to E. coli SppA N-terminal domain
26% identity to E. coli SppA C-terminal domain

Ser147 (proposed nucleophile)

and Lys199 (proposed general base)
on same domain

But located far apart (~30Å)

2.4 Å resolution
Dyad assembled from
neighboring protomers

Octamer in solution

Active site
assembled from 2 protomers

Substrate binding groove

assembled from 3 protomers

BsSppA has a deeper S1 specificity as compared to EcSppA

Consistent with fluorometric peptide cleavage assays.
SppA self-cleaves

Likely C-terminal processing:

N-term his-tag
N-terminal sequencing
New crystal form (alcohol, under oil)

Reveals circle of difference density

Density consistent with missing
C-termini (8).

2.4 Å resolution
C-terminal peptide found within BsSppA
Intramolecular chaperones

Type I: N-terminal - folding

Type II: C-terminal – oligomeric assembly

Yu-Jen Chen and Masayori Inouye

Current Opinion in Structural Biology 2008, 18:765–770
2 potential binding sites.