
THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 257, No. 16. Issue of August 25, pp. 9724-9732, 1982
Printed in U.S.A.

Two Similar but Nonallelic Rat Pancreatic Trypsinogens


NUCLEOTIDE SEQUENCES OF THE CLONED cDNAs*

(Received for publication, March 3, 1982)

Raymond J. MacDonald, S. Jennifer Stary, and Galvin H. Swift‡


From the Division of Molecular Biology, Department of Biochemistry, The University of Texas Health Science Center at
Dallas, Dallas, Texas 75235

We have cloned and identified mRNA sequences for two rat pancreatic trypsinogens. Nucleotide sequence analysis of the cloned sequences revealed two mRNAs that encode similar, though nonallelic, pretrypsinogens. Trypsinogen I mRNA is 804 nucleotides in length, plus an estimated poly(A) tract of 100 nucleotides, and contains a short (13 nucleotide) 5' noncoding region and a 3' noncoding region of 54 nucleotides. It encodes a preproenzyme of 246 amino acids comprising a hydrophobic prepeptide (signal peptide) of 15 amino acids, an activation peptide characteristic of trypsinogens, and an active form of trypsin, 223 amino acids in length, that has 78% amino acid sequence identity with porcine trypsin. Trypsinogen II mRNA has a nucleotide sequence 88% homologous with that of trypsinogen I mRNA and encodes a protein with 89% amino acid sequence identity with trypsinogen I. The enzymes encoded by trypsinogen I and II mRNAs retain the key amino acid residues that determine the characteristic substrate cleavage preference of trypsins and, therefore, represent the rat counterparts of this digestive enzyme. Trypsinogen I mRNA is a major pancreatic mRNA comprising an estimated 2-5% of the total, whereas trypsinogen II mRNA is present at much lower levels.

* This work was supported by Grant AM27430 from the National Institutes of Health, Grant PCM8006231 from the National Science Foundation, and Biomedical Research Support (2-507-RR07175-04) to the University of Texas Health Science Center at Dallas. A preliminary report of the work was presented to the American Pancreatic Association, Chicago, IL in November, 1981. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Recipient of a Virginia Lazenby O'Hara postdoctoral fellowship.

Trypsins (EC 3.4.21.4) are members of a family of pancreatic serine proteases related by common structure, function, and expression within the acinar cells of the pancreas. As expected for secretory proteins, the trypsins appear to be synthesized initially as precursors with NH2-terminal "signal" peptides (Devillers-Thiery et al., 1975; Carne and Scheele, 1982) that specify vectorial transfer into the cisternae of the rough endoplasmic reticulum as the first step in sequestering the proteins for secretion (Blobel and Sabatini, 1971). Trypsins are secreted as inactive proenzymes (trypsinogens) and are activated in the intestine by the selective action of enteropeptidase (Maroux et al., 1971). In the manner of a simple cascade process the activated trypsins activate the other pancreatic digestive enzymes secreted as proenzymes.

Trypsin and its proenzyme have been extensively characterized. The amino acid sequences of the bovine (Walsh and Neurath, 1964; Mikes et al., 1966), porcine (Hermodson et al., 1973), and dogfish (Titani et al., 1975) trypsins demonstrate extensive homology. The 3-dimensional structures of bovine trypsin (Stroud et al., 1971) and trypsinogen (Fehlhammer et al., 1977) determined by x-ray crystallographic analyses have revealed the structural alterations involved in zymogen activation and the structural basis of serine protease catalysis and substrate cleavage preference. Comparisons of the 3-dimensional structures of trypsin (Stroud et al., 1971), chymotrypsin (Mathews et al., 1967), and elastase (Shotton and Watson, 1970) have revealed a common structural motif for simple serine proteases; differences allow their characteristic substrate cleavage preferences.

Although other serine proteases with similar catalytic properties and substrate cleavage specificities are expressed in numerous cells and tissues, the expression of the genes encoding the pancreatic digestive serine proteases appears limited to the acinar cells of the pancreas. In embryonic pancreatic development, the expression of trypsin and other digestive serine protease genes is induced during a narrow developmental window (Rutter et al., 1968). In the adult, their synthesis comprises a major fraction of the protein synthesis of the gland (Van Nest et al., 1980) and can be altered by hormone effects (R[…]d et al., 1977; Dagorn and Mongeau, 1977) and changes in diet (Dagorn and Lahaie, 1981). The events that determine the timing, the extent, and the tissue specificity of expression of these serine protease genes are as yet unknown.

Toward the goals of elucidating the control of expression of the trypsinogens and other members of the family of pancreatic serine protease genes and the steps required for the biosynthesis, maturation, and secretion of the preproenzymes, we have cloned and sequenced the mRNAs for two trypsinogen isozymes. These represent two similar but nonallelic genes that are expressed to very different levels in the pancreas of the adult rat.

EXPERIMENTAL PROCEDURES

Preparation and Analysis of Pancreatic RNA - RNA was isolated from rat pancreas using guanidine thiocyanate as described by Chirgwin et al. (1979). RNA from several adult Sprague-Dawley rats was pooled. Polyadenylated RNA was isolated from total rat pancreatic RNA by binding to oligo-dT cellulose (Collaborative Research, Type 2) (Aviv and Leder, 1972). To enhance the removal of rRNA, the RNA was dissolved in water, heated in a 68 °C water bath for 3 min, and diluted with an equal volume of twice-concentrated binding buffer immediately before passage over oligo-dT cellulose.

Polyadenylated RNA was analyzed by electrophoresis in 2% agarose slab gels containing methyl mercury hydroxide as described by Bailey and Davidson (1976). Electrophoresis at 50 mA (approximately 90 V) for 4.5 h in slab gels (0.15 × 13 × 11 cm) caused significant heating but improved resolution significantly compared to slower overnight runs. Nucleic acid size standards were yeast mitochondrial rRNA (a gift from Dr. D. Miller, The University of Texas Health Science Center at Dallas), rabbit globin mRNA (Bethesda Research

Rat Trypsinogen mRNAs 9725

Laboratories), and φX174 RF digested with HincII. After electrophoresis, staining with ethidium bromide, and photography, the nucleic acids were transferred to diazobenzyloxymethyl paper (Alwine et al., 1977) by electroblotting (Stellwag and Dahlberg, 1980) and hybridized to recombinant plasmid DNA or recombinant plasmid inserts labeled with 32P by nick translation (Rigby et al., 1977). After transfer and deactivation according to Alwine et al. (1977), background hybridization could be reduced by prolonged prehybridization for 24 h at 42 °C in 5 × SSC,¹ 50% formamide, 10 × Denhardt's solution (Denhardt, 1966), 200 µg/ml of heat-denatured salmon sperm DNA (P-L Biochemicals), 0.1% sodium dodecyl sulfate, and 50 mM sodium phosphate, pH 7.0.

Preparation of a ds-cDNA Library of Pancreatic mRNA Sequences - Rat pancreatic polyadenylated RNA, without enrichment for trypsinogen mRNA, was used for the construction of a ds-cDNA library. Details of the preparation of the library will be described elsewhere.² The library comprised approximately 1000 recombinant clones obtained by inserting pancreatic ds-cDNA into pBR322 after the addition of homopolymeric tails (Roychoudhury et al., 1976) and transformation of χ1776 (Goodman and MacDonald, 1980). The library used in this study had been previously screened to remove cloned mRNA sequences for the mRNAs of amylase (MacDonald et al., 1980), chymotrypsin,³ and elastase I (MacDonald et al., 1982). All experiments with bacteria containing recombinant plasmids were conducted according to the National Institutes of Health Guidelines for Recombinant DNA Research in force at the time.

Isolation of Plasmid DNAs - Recombinant plasmid DNAs bearing ds-cDNA inserts were prepared according to the procedure described by Birnboim and Doly (1979). For uses other than simple restriction endonuclease digestions, plasmid DNA preparations from large cultures (0.1 liter or larger) were purified further by treatment with RNase A (100 µg/ml of boiled RNase A, 37 °C, 4 h), phenol/chloroform extraction, and chromatography on A-150M (Bio-Rad) in 10 mM Tris-Cl, pH 8, 1 mM EDTA, 100 mM NaCl.

Preparation and Label of an Enriched Trypsinogen mRNA Fraction - 50 µg of pancreatic polyadenylated RNA was subjected to electrophoresis in a 1.3% agarose slab gel containing methyl mercury hydroxide. After staining for 4 min with 1 µg/ml of ethidium bromide in 5 mM 2-mercaptoethanol, the stained bands presumed to include trypsinogen mRNA (see under "Results") were cut out, and the RNA was electroeluted through an acrylamide plug into a dialysis bag. After concentrating by repeated extraction with 1-butanol, the mRNA was ethanol-precipitated three times.

The enriched mRNA fraction was labeled with 32P according to Biessman et al. (1979). The mRNA fraction was dissolved in 20 µl of 50 mM Tris-Cl, pH 9.5, and incubated at 90 °C for 25 min. After cooling and adding MgCl2 to 10 mM, dithiothreitol to 5 mM, [γ-32P]ATP to 1.7 mCi/ml, and polynucleotide kinase (New England Nuclear) to 42 units/ml, the mixture was incubated at 37 °C for 50 min. The reaction was terminated with 10 µl of 2 M sodium acetate, 3 µl of 0.5 M EDTA, and 50 µg of yeast tRNA (Miles Laboratories, Inc.), extracted with phenol/chloroform (1:1), and ethanol precipitated. The 32P-labeled RNA fragments were purified by a microcentrifugal chromatographic procedure (Helmerhorst and Stokes, 1980) as modified by J. Barry⁴ using Bio-Gel P-30 (Bio-Rad).

Screening the Library for Recombinant Plasmids Bearing Trypsinogen mRNA Sequences - The library of bacterial colonies containing recombinant plasmids was transferred to and grown on 15-cm disks of Whatman 541 paper as described by Craig et al. (1979). After treatment to lyse the bacterial cells and fix the DNA to the paper, each disk was prehybridized in a solution of 5 × SSC, 50% formamide, 0.1% sodium dodecyl sulfate, 1 mM EDTA, 2 × Denhardt's solution, 100 µg/ml of Escherichia coli DNA, 200 µg/ml of yeast tRNA, and 10 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid, pH 7.3, and then hybridized in the same solution but with 0.5 µCi of the 32P-labeled mRNA enriched for trypsinogen sequences. After autoradiography colonies were selected on the basis of hybridization to the 32P-labeled trypsinogen mRNA fraction.

ds-cDNA Sequence Analysis - Nucleotide sequence determinations were performed according to the detailed protocols of Maxam and Gilbert (1980). Five sequencing reactions (G, dimethylsulfate; G+A, formic acid; C+T, hydrazine; C, hydrazine plus NaCl; A>C, sodium hydroxide) were employed to enhance sequence accuracy. DNA fragments were labeled at their 5' termini with [γ-32P]ATP and polynucleotide kinase (New England Nuclear) after treatment with calf intestine alkaline phosphatase (Boehringer Mannheim). Labeled fragments were isolated by electrophoresis and eluted from gel slices either by shaking for several hours in 0.3 M NaOAc, pH 7.0, for fragments shorter than 250 base pairs, or by electroelution in 0.05 M Tris, 0.05 M sodium borate, 0.001 M EDTA, pH 8.3, for longer fragments.

Trypsinogen I Sequencing - The scheme for determining the full length sequence of trypsinogen I mRNA is summarized in Fig. 1. The ds-cDNA insert from recombinant plasmid pcXP4-78 was sequenced completely. Inserts from four other recombinant plasmids (see Fig. 1) identified from the library by hybridization to the inserts of either pcXP1-90 or pcXP4-78 contained sequences for trypsinogen I mRNA that overlapped with those of pcXP4-78 and were also sequenced. (Of the 1925 nucleotides that overlap among the five ds-cDNA clones for trypsinogen I, only 3 discrepancies occur.) A composite sequence of the five inserts indicated that the cloned sequence began within the 5' noncoding region six nucleotides before the initial AUG codon and extended to the 3' end of the mRNA and included 15 residues of the poly(A) tail. Except for six 5' end nucleotides each of the 804 nucleotides of the mRNA was determined by a minimum of two sequencing runs, including analysis of both strands of the complete length encoded by the inserts (Fig. 1).

The nucleotide sequence at the 5' end of trypsinogen I mRNA not included in a cloned ds-cDNA was determined by primer extension analysis. Primer extension by cDNA synthesis using a restriction endonuclease-generated fragment of a cloned ds-cDNA as a primer and total pancreatic polyadenylated RNA as template was performed as described previously (MacDonald et al., 1982). Six pmol of an Alu I-Sau 3A restriction endonuclease fragment of pcXP7-35 (Fig. 1) were end labeled with 32P, hybridized to 30 µg of pancreatic polyadenylated RNA, and extended with reverse transcriptase. After purification by polyacrylamide gel electrophoresis, the extended primer was sequenced using the protocols of Maxam and Gilbert (1980).

Trypsinogen II Sequencing - As shown in Fig. 1, the nucleotide sequence of a large part of a second trypsinogen mRNA was obtained from a single recombinant plasmid, pcXP8-64. Of the cloned length of trypsinogen II mRNA represented by pcXP8-64, 81% of the sequence was determined by two or more sequencing runs over the same regions; 53% of the sequence was determined by runs on both strands (Fig. 1).

[Figure 1: schematic maps of the trypsinogen I and II mRNAs and their cloned ds-cDNAs, drawn against scales in amino acids (-23 to 223) and nucleotides (1 to 804). Image not reproduced.]

FIG. 1. Sequencing strategies for the two rat pancreatic trypsinogen mRNAs. The horizontal rectangles delineate the amino acid coding regions; the signal peptide (Pre) and activation peptide (Pro) regions are indicated. The lines extending from the rectangles represent the lengths of the noncoding regions of each mRNA; in addition poly(A) is present at the 3' end of each mRNA. The thick horizontal lines represent the extent of each cloned ds-cDNA; the numbers are the identification numbers for each recombinant plasmid. The direction and length of each sequencing run is indicated by the horizontal arrows, each starting at the position of the restriction endonuclease site noted (P, Pst I; B, BstNI; A, Alu I; S, Sau 3A; R, Rsa I; E, Eco RI; X, Xho I). The Pst I sites in parentheses were generated upon cloning the ds-cDNA. The wavy horizontal arrow delineates the position and length of sequences determined by primer extension analysis.

¹ The abbreviations used are: SSC, 0.15 M sodium chloride, 0.015 M sodium citrate; ds-cDNA, double-stranded complementary DNA; pcXP, designation for recombinant plasmids containing ds-cDNA for an exocrine pancreas mRNA.
² W. Swain, R. J. MacDonald, and W. J. Rutter, manuscript in preparation.
³ C. Quinto, G. Bell, C. Craik, and W. J. Rutter, manuscript in preparation.
⁴ J. Barry, personal communication.
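The five chemical cleavage reactions named under "ds-cDNA Sequence Analysis" are read together: each position on the sequencing ladder is assigned a base from the combination of lanes that show a band. The following is a minimal sketch of that logic, assuming simple presence/absence scoring (real gels are read from band intensities, which this sketch ignores). The function names and the example ladder are hypothetical and are not taken from the paper.

```python
# Lane names follow the five reactions listed in the text.
# Band pattern (ignoring the A>C cross-check lane) -> base.
CALLS = {
    frozenset({"G", "G+A"}): "G",   # dimethylsulfate and formic acid both cleave at G
    frozenset({"G+A"}): "A",        # formic acid lane only
    frozenset({"C+T", "C"}): "C",   # hydrazine with and without NaCl
    frozenset({"C+T"}): "T",        # hydrazine lane only
}


def call_base(bands):
    """Call one ladder position from the set of lanes that show a band there."""
    bands = set(bands)
    base = CALLS.get(frozenset(bands - {"A>C"}), "N")
    # The A>C (sodium hydroxide) lane cleaves strongly at A and weakly at C,
    # so a band there contradicts a G or T call.
    if "A>C" in bands and base in ("G", "T"):
        return "N"
    return base


def read_ladder(positions):
    """Read a sequence from a list of band sets, one set per ladder position."""
    return "".join(call_base(p) for p in positions)


# Hypothetical four-position ladder, not data from the paper.
example = [{"G", "G+A"}, {"G+A", "A>C"}, {"C+T"}, {"C+T", "C", "A>C"}]
print(read_ladder(example))  # GATC
```

Using a fifth, partly redundant lane as a consistency check is one way such a scheme can flag ambiguous positions (returned here as "N") instead of forcing a call.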
Subcloning the Unique 3' Noncoding Regions of Trypsinogen I and II mRNAs - For trypsinogen I mRNA a Fnu 4H-Pst I restriction endonuclease fragment that extends from codon 229 up to and including 15 residues of the poly(A) tail was subcloned from pcXP4-78. For trypsinogen II mRNA, a Sac I-Pst I restriction endonuclease fragment that extends from the termination codon to the end of the cloned sequence was subcloned from pcXP8-64. Prior to subcloning, 0.1-0.3 µg of each purified restriction endonuclease fragment were treated with 4 units of T4 DNA polymerase (P-L Biochemicals) at 37 °C to make the ends blunt (Swift et al., 1981). Each blunt-ended fragment was then cloned into the HindIII site of pBR322 after the addition of HindIII restriction site dodecamers (Collaborative Research, Inc.) as described by Goodman and MacDonald (1980). The subcloned fragments were excised from the plasmid DNA and labeled with 32P by nick translation for use as specific hybridization probes for trypsinogen I and II mRNAs.

RESULTS

Screening for ds-cDNA Clones Encoding Pancreatic Trypsin mRNA - Trypsin mRNA was partially purified from total rat pancreatic polyadenylated RNA by preparative agarose gel electrophoresis. Fig. 2 illustrates the electrophoretic profile of pancreatic polyadenylated RNAs separated according to size in an agarose gel containing methyl mercury hydroxide. Because the pancreas synthesizes a few secretory proteins at very high rates (Jamieson and Palade, 1967; Van Nest et al., 1980), a few mRNAs predominate (Hading et al., 1977; Przybyla et al., 1979). Thus each RNA peak shown in Fig. 2 is predominantly one or a very few mRNA species.

Previous experiments on in vitro translation of size-fractionated pancreatic mRNA (MacDonald et al., 1977; Rutter et al., 1978; Chirgwin et al., 1979) indicated that in vitro translation products of the size expected for the precursor of trypsinogen were encoded by mRNAs 0.9 to 1.2 kilobases in length. This mRNA size range corresponds to the large double mRNA peak indicated in Fig. 2. RNA of this peak was eluted from a preparative gel, cleaved by gentle alkali treatment, labeled with [γ-32P]ATP and polynucleotide kinase, and hybridized to half the library (500 colonies) containing recombinant plasmids bearing ds-cDNA inserts for rat pancreatic mRNA (see under "Experimental Procedures"). Seventeen colonies hybridized significantly over background. Ten of these could be grouped into a unique class on the basis of cross-hybridization. After determining the length of the ds-cDNA inserts, one recombinant plasmid (pcXP1-90) from the class that cross-hybridized was chosen for preliminary nucleotide sequence analysis. The ds-cDNA insert was found to contain a portion of an mRNA coding sequence for a trypsin-like protein (see below).

Screening the ds-cDNA library with the 32P-labeled insert from pcXP1-90 detected 24 additional recombinant plasmids, including pcXP4-78, which contained a ds-cDNA insert of 850 base pairs, sufficient to encode a nearly full-length copy of trypsinogen mRNA. One recombinant plasmid (pcXP8-64) within this collection had a slightly different restriction endonuclease digestion pattern, indicating that it encoded a different mRNA. Those plasmids used to derive the nucleotide sequences of the trypsinogen mRNAs are shown in Fig. 1. By screening with the ds-cDNA insert of pcXP4-78, which contains 5' and 3' regions of the mRNA not contained within pcXP1-90, a total of 48 recombinant plasmids were detected within the total pancreatic ds-cDNA library of approximately 1000 clones. This is the number expected for an mRNA comprising approximately 5% of the total mRNA population.

The Complete Nucleotide Sequence of a Major Trypsinogen mRNA - The nucleotide sequencing strategy for the major pancreatic trypsinogen mRNA, shown in Fig. 1, utilized a collection of overlapping ds-cDNA clones. To assure a high degree of sequence accuracy, multiple sequencing runs were performed on each ds-cDNA, and the nucleotide sequences for overlapping regions were compared.

The nucleotide sequence of trypsinogen I mRNA and the derived amino acid sequence are given in Fig. 3. The single amino acid coding frame uninterrupted by termination codons prescribes a protein 246 amino acids in length with a molecular weight of 25,964. The presence and spacing of amino acid residues histidine 48, aspartate 92, and serine 185 and the conservation of amino acid sequences surrounding them are characteristic of serine proteases in general and trypsinogen in particular. Comparison of amino acid sequence homology with various pancreatic serine proteases (Table I) shows the highest homology (72-78%) with trypsin. In addition, key amino acid residues that determine the characteristic tryptic preference for cleavage of polypeptide substrates at basic amino acids are preserved (see under "Discussion").

Three nucleotide discrepancies were found among the overlapping ds-cDNAs for trypsinogen I. These discrepancies, indicated in Fig. 3, occur within the codons specifying glutamate 4, serine 25, and leucine 95; none would alter the coded amino acid. These differences may be due to either the presence of more than one nearly identical, possibly allelic, pancreatic trypsinogen I mRNA, or to sequence artifacts engendered during the cloning procedures.

An obvious cloning artifact was discovered among the collection of trypsinogen I ds-cDNA clones. A 42-base pair sequence at the 5' end of the insert of pcXP1-90 (nucleotides 291 to 332 of the mRNA) was inverted. In addition, five nucleotides between the inverted region and the continuation of the normal mRNA sequence were deleted. Because two other cloned ds-cDNAs overlapping this entire region of the mRNA gave nucleotide sequences consistent with the amino acid coding requirements for trypsinogen, the inverted region of pcXP1-90 is clearly an artifact of the cloning procedures.

[Figure 2: electrophoretic profile of pancreatic polyadenylated RNA; axes: LENGTH (kb), marked from 2.0 to 0.6, and DISTANCE (cm), marked from 0 to 8. Image not reproduced.]

FIG. 2. Size distribution of mRNAs isolated from the pancreas of the rat. Rat pancreatic polyadenylated RNA was isolated by passage over oligo(dT)-cellulose and subjected to electrophoresis in an agarose gel containing methyl mercury hydroxide as described under "Experimental Procedures." The RNA size range expected for mRNAs encoding for serine proteases (0.9 to 1.2 kilobases) is indicated by a bracket. The positions of migration of 18 S and 28 S rRNAs are also shown.
[Figure 3: nucleotide sequence of trypsinogen I mRNA with the derived amino acid sequence of pretrypsinogen I (residues -15 to 231), the differing nucleotides and amino acids of trypsinogen II on the lines below, and the 3' end of 18 S rRNA (3' ...UAGGAAGGCGU... 5') aligned with the 5' noncoding region. Sequence listing not reproduced.]

FIG. 3. The nucleotide sequences for trypsinogen I and II mRNAs and the amino acid sequences of the encoded preproenzymes. The complete nucleotide sequence for trypsinogen I mRNA is given; the line below contains the nucleotides for trypsinogen II mRNA that differ from trypsinogen I mRNA. The amino acid sequence for pretrypsinogen I, derived from its mRNA sequence is also shown; numbering starts at the first amino acid of the zymogen. The predicted prepeptide comprises amino acid residues -15 through -1, and the activation peptide residues 1 through 8. Amino acid residues important in enzyme action that are discussed in the text are enclosed in boxes. The amino acids of trypsinogen II that differ from trypsinogen I are shown. The nucleotide and amino acid sequences of trypsinogen II are incomplete; at the 5' end the cloned sequence begins within the codon for amino acid -8; the 3' ends before the poly(A) tract. The conserved sequence AAUAAA in the 3' noncoding regions of both mRNAs is underlined. The conserved nucleotide sequence near the 3' end of eukaryotic 18 S rRNA (italics) is shown at the position with the greatest number of base pairings within the 5' noncoding region of trypsinogen I mRNA. The positions of three nucleotides for trypsinogen I mRNA that differ between two cloned ds-cDNAs are indicated; the nucleotide within codon 4 is an A in pcXP1-102 and a G in pcXP7-35; the nucleotide within codon 26 is a C in both pcXP1-102 and pcXP4-78 and an A in pcXP7-35; the nucleotide within codon 95 is a G in both pcXP1-102 and pcXP1-90 and a T in pcXP4-78.
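The residue numbering used in this legend (prepeptide -15 through -1, zymogen from 1, no residue 0) can be mapped onto mRNA nucleotide positions. The sketch below is one way to do so, assuming a 5' noncoding region of 12 nucleotides, the length listed for trypsin I in Fig. 6. The abstract quotes 13, in which case every position shifts by one. The function names are illustrative only.

```python
UTR5_LEN = 12        # assumption: 5' noncoding length listed for trypsin I in Fig. 6
PREPEPTIDE_LEN = 15  # residues -15 through -1
PROTEIN_LEN = 246    # preproenzyme length in amino acids
MRNA_LEN = 804       # mRNA length without poly(A)


def codon_index(residue):
    """0-based codon index for a residue number in the Fig. 3 scheme (no residue 0)."""
    last = PROTEIN_LEN - PREPEPTIDE_LEN  # residue 231
    if residue == 0 or residue < -PREPEPTIDE_LEN or residue > last:
        raise ValueError(f"residue {residue} is outside -15..-1, 1..{last}")
    if residue < 0:
        return residue + PREPEPTIDE_LEN
    return residue + PREPEPTIDE_LEN - 1


def codon_span(residue):
    """1-based (first, last) nucleotide positions of the codon for a residue."""
    first = UTR5_LEN + 3 * codon_index(residue) + 1
    return first, first + 2


print(codon_span(-15))  # (13, 15): the initiating AUG
print(codon_span(-8))   # (34, 36)
print(codon_span(231))  # (748, 750): last sense codon
print(MRNA_LEN - codon_span(231)[1])  # 54 nucleotides remain, stop codon included
```

With a 12-nucleotide 5' region, the codon for residue -8 covers nucleotides 34 to 36, which is consistent with the Results placing the start of the trypsinogen II clone at nucleotide 36, and the 54 nucleotides left after the last sense codon match the 3' noncoding length quoted in the abstract if the termination codon is counted with them.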

Indeed, such inverted regions in cloned ds-cDNAs may be explained⁵ by an aberrant step during or after synthesis of the second cDNA strand primed by the formation of a hairpin loop (Volckaert et al., 1981; Fields and Winter, 1981). Other cloning anomalies involving the very 5' end of the mRNA sequence (e.g. Richards et al., 1979; Sippel et al., 1978) were not observed, even with pcXP7-35, which extends to all but an estimated seven nucleotides from the 5' end of the mRNA.

⁵ J. C. Dagorn and R. J. MacDonald, unpublished observations.

A Second Closely Related Trypsinogen mRNA - Nucleotide sequence analysis of pcXP8-64 (Fig. 1) confirmed that it contained a nearly full-length copy of a second trypsinogen mRNA. The mRNA sequence within pcXP8-64 begins within the prepeptide coding region (nucleotide 36 relative to trypsinogen I mRNA) and continues three nucleotides past the site of poly(A) addition of trypsinogen I mRNA. Nucleotide sequence and amino acid coding differences between trypsinogen I and II mRNAs are indicated in Fig. 3. The nucleotide homology with trypsinogen I mRNA is 88%. Amino acid sequence identity is 89% with rat trypsin I and 81% with porcine trypsin (Table I).

The 5' End of the Trypsinogen mRNAs - The nucleotide sequence at the 5' end of the trypsinogen I mRNA was determined by primer extension analysis. Within the collection of trypsinogen recombinant plasmids, pcXP7-35 contained sequences farthest toward the 5' end of the mRNA, including 6 nucleotides of the noncoding region. Nucleotide sequence analysis of the extended Alu I-Sau 3A primer (Fig. 4) revealed that the mRNA extended at least 6, and probably
TABLE I
Amino acid homologies between trypsins and other pancreatic serine proteases

                                  Percentage of identity(a) with
                  Rat         Cow       Pig       Pig            Cow               Cow
                  trypsin II  trypsin   trypsin   kallikrein(b)  chymotrypsin A(c) chymotrypsin B(c)
Rat trypsin I     89          72        78        38             45                43
Rat trypsin II                73        81        39             44                41
Bovine trypsin                          82        38             43                43

(a) Per cent of the minimum length required for alignment according to Young et al. (1978) of the active enzymes.
(b) Completed sequence for porcine pancreatic kallikrein from Tschesche et al. (1978).
(c) For the chymotrypsin residues 16-245 only.
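Footnote (a) defines each entry of Table I as identities expressed as a percentage of the aligned length. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned and padded with gap characters. The alignment procedure of Young et al. (1978) is not described in this paper and is not reproduced here, and the second sequence in the example is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical positions as a percentage of the full alignment length.

    Both inputs must be pre-aligned strings of equal length; gap columns
    count toward the length but never toward the matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


# Toy example: the first string is residues 9-18 of rat trypsinogen I in
# one-letter code; the second string is invented for illustration.
print(percent_identity("IVGGYTCPEH", "IVGGYNC-EH"))  # 80.0
```

Dividing by the full alignment length, gaps included, means that insertions and deletions lower the score in the same way as substitutions.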

7, nucleotides farther than the ds-cDNA insert of pcXP7-35 (see under "Discussion"). Thus, the complete length of trypsinogen I mRNA is 804 nucleotides, plus the 5' cap structure and poly(A).

[Figure 4: sequencing ladder of the extended primer, with the Alu I site marked and the encoded residues labeled alongside: met, ser, ala, leu, leu, ile, leu, ala, leu, val, gly, ala, ala, val, ala, phe, pro. Image not reproduced.]

FIG. 4. Primer extension determination of the nucleotide sequence for the 5' end of trypsinogen mRNA. Because the cDNA was generated by copying the mRNA, the nucleotide sequence shown is the mRNA complement. The amino acid translation for this region (the prepeptide region) of the mRNA is given. The star indicates the position of the nucleotide difference between trypsinogen I and II mRNAs; it is a G in the trypsinogen I extended primer.

[Figure 5: stained gel and autoradiographs in three panels, A (lanes 1-5), B (lanes 2-5), and C (lanes 2-5). Image not reproduced.]

FIG. 5. Determination of the molecular length and relative levels of trypsin I and II mRNAs. A, ethidium bromide staining pattern of a 2% agarose gel containing 5 mM methyl mercury hydroxide. Lane 1, RNA size standards: total yeast mitochondrial RNA (the 15 S rRNA is 1689 nucleotides (Li et al., 1982)) and rabbit globin mRNAs (Bethesda Research Laboratories; 552 and 589 nucleotides plus poly(A) tails of approximately 125 nucleotides (Heindell et al., 1978; Efstratiadis et al., 1977; Kaempfer et al., 1979)). Lanes 2 and 5, total rat pancreatic polyadenylated RNA prepared as described under "Experimental Procedures." Lane 3, DNA size standards: HincII digest of φX174 RF (New England Biolabs). Fragment sizes are 1057, 770, 612, 495, 392, 345, 341, 335, 297, 291, 210, 162, and 79 nucleotides. Lane 4, rabbit globin mRNAs. The molecular length scale was derived from the size standards of Lanes 1, 3, and 4. B, following staining, the RNA from the gel in A was transferred to diazobenzyloxymethyl paper as described under "Experimental Procedures." The paper was cut in half between Lanes 3 and 4. Hybridization probes were prepared by nick translation as described under "Experimental Procedures." Lanes 2 and 3 were hybridized with pcXP-TI, a probe specific for trypsin I. Lanes 4 and 5 were hybridized with pcXP-TII, a probe specific for trypsin II. C, to confirm that the amount of RNA transferred to the diazobenzyloxymethyl paper was equivalent in Lanes 2 and 5, both halves of the diazobenzyloxymethyl paper were hybridized to nick-translated pcXP4-78 (the nearly full-length trypsin I cDNA clone) following the hybridizations described for B. Nick-translated φX174 RF DNA was included in the hybridization in order to provide size standards on the autoradiograph (Lane 3).

The Alu I-Sau 3A primer fragment is from a region of known nucleotide sequence for both trypsinogen mRNAs. The nucleotide sequence covered by the primer is nearly identical for both mRNA species, and the 3' nucleotide will be base paired in both primer/mRNA hybrids. Therefore, the Alu I-Sau 3A fragment should have acted as a primer for both trypsinogen mRNAs, and the cDNA synthesized should have contained a mixture of two cDNA sequences. However, a unique sequence was obtained (Fig. 4) that was identical with the trypsinogen I mRNA sequence derived from pcXP7-35. The single nucleotide difference between the end of the primer and the end of the common sequences known for trypsinogen I and II mRNAs may be used to estimate the relative amounts of cDNA synthesized from the two mRNAs. This nucleotide difference is within the codon for amino acid 8 (see Fig. 3) and would give rise to a G for the cDNA extended primer for
trypsinogen I mRNA and an A for trypsinogen II mRNA. The sequence data of Fig. 4 show that a G is present, indicating that the nucleotide sequence derived from the extended primer is for trypsinogen I mRNA. This result also implies that trypsinogen II mRNA is present at lower levels and is consistent with finding only a single recombinant plasmid bearing trypsinogen II sequences but a large number for trypsinogen I. Furthermore, the absence of an extended primer longer than that shown in Fig. 4 suggests that the 5' noncoding region of trypsinogen II mRNA is either the same length or shorter than that of trypsinogen I.

                          3' ...UAGGAAGGCGU... 5'      18 S rRNA

               CCUUCuGCCACC AUGA...      Trypsin I (12)
      GUGGUCUACUCuCUCCACAAC AUGC...      Elastase I (21)
         ACAGACA&ACGGAgCACC AUGA...      Elastase II (21)
         AGUCCNGACCAGC AACUCG G...       Chymotrypsin (18)
                ACCUUCuCACC AUGA...      C-peptidase A (11)
           ACAACUUCAAAGCAAA AUGA...      Amylase (16)
 -51 nucs-//-CAGCAAAGCCACU AUGG...       RNase A (64)

FIG. 6. 5' noncoding regions of pancreatic mRNAs. Numbers in parentheses indicate the nucleotide length of each noncoding region. Potential base pairings with a conserved 3' oligonucleotide sequence of eukaryotic 18 S rRNA (Hagenbuchle et al., 1978) are underlined. Lower case u indicates potential G:U pairings. Sequence sources: elastases I and II, MacDonald et al., 1982; chymotrypsin (footnote 6); carboxypeptidase A, Quinto et al., 1982; amylase (footnote 7); RNase A (footnote 8).

Levels of Trypsinogen I and II mRNAs-The length of trypsinogen mRNA was experimentally determined by electrophoresis in an agarose gel containing the denaturant methylmercury hydroxide. After electrophoresis, the RNA was transferred to diazobenzyloxymethyl paper and hybridized with 32P-labeled pcXP4-78 DNA to localize trypsinogen mRNA.
The results are shown in Fig. 5C. The hybridizing band corresponds to the ethidium bromide-stained band at 0.91 kilobases and represents 2-4% of the total mRNA as estimated from the electrophoretic profile shown in Fig. 2. The estimated length of 910 nucleotides implies an average poly(A) length of approximately 100 nucleotides.

The sequence homology between the two 3' noncoding regions is 66%, and cloned sequences derived from these regions would not be expected to cross-hybridize under normal stringent hybridization conditions. Therefore, specific probes for each of the two trypsinogen mRNAs could be obtained by subcloning the 3' noncoding regions from pcXP4-78 and pcXP8-64 (see under "Experimental Procedures"). The specificity of each subclone was verified by the absence of detectable cross-hybridization under our standard conditions (data not shown). Hybridization of the 32P-labeled subcloned fragment for trypsinogen I readily detects trypsinogen I mRNA within a Northern blot of pancreatic polyadenylated RNA (Fig. 5B). However, no trypsinogen II mRNA is detectable with the 32P-labeled trypsinogen II subcloned fragment. By this criterion trypsinogen II mRNA levels are more than 20-fold lower than trypsinogen I mRNA levels in the adult rat pancreas.

DISCUSSION

Cloned ds-cDNA sequences that encode trypsinogen mRNA were detected within a library of ds-cDNA sequences for rat pancreatic mRNAs by hybridization to an mRNA fraction enriched in trypsinogen mRNA. Nucleotide sequence analysis of a representative sample of the cross-hybridizing ds-cDNAs demonstrated that they were derived from two distinct trypsinogen mRNAs. Trypsinogen I mRNA is a prominent pancreatic mRNA approximately 910 nucleotides in length that encodes a serine protease zymogen with a secretory protein leader peptide and an amino acid sequence 78% homologous to porcine trypsinogen and 72% homologous to bovine trypsinogen. The homologies increase to 82 and 77% for the porcine and bovine zymogens, respectively, if chemically similar amino acids (as defined by Shotton and Hartley, 1970) are accepted.

Trypsinogen II mRNA is a relatively minor pancreatic mRNA that encodes a second very closely related, but nonallelic, trypsinogen. The cloned nucleotide sequence extends from the codon specifying amino acid -8 of the prepeptide to near the presumptive 3' end of the mRNA. The nucleotide sequence homology with trypsinogen I mRNA is 88%; the amino acid sequence homology of the encoded proteins is 89%.

The significance of a second nearly identical trypsin present at such low levels as to not contribute significantly to digestion is not clear. It may be that during or following the gene duplication event that gave rise to these two trypsins genetic changes decreased the level of expression of one. In the absence of sufficient selective pressure for removal, it has been maintained.

In addition to the overall sequence homology of both rat enzymes with other trypsinogens, key amino acid residues (Asp-179, Gly-202, and Gly-212) that are largely responsible for accommodating the bulky positively charged side chains of lysine and arginine characteristic of trypsin-like substrate preference are preserved. Stroud et al. (1971) and Hermodson et al. (1973) list twelve additional components of the substrate binding site preserved in bovine, porcine, and dogfish trypsin. These amino acid residues are also preserved in both rat trypsins. Thus, in addition to general sequence homology, major structural requirements for trypsin-like activity are conserved in the enzymes encoded by the two rat trypsinogen mRNAs.

mRNA and Protein Domains

5' and 3' Noncoding Regions of the mRNAs-We estimate the length of the 5' noncoding region of the trypsinogen I mRNA to be 13 nucleotides, plus the 5' cap structure. A comparison of the sequences for the 5' end of mRNAs derived from primer extension analysis to those derived from direct analysis of purified mRNA (e.g. globin (Baralle, 1977), ovalbumin (McReynolds et al., 1978; Malek et al., 1981), and amylase (Hagenbuchle et al., 1980)) reveals that the primer extension method is capable of determining all but the first 5' nucleotide and, of course, the 7meG cap structure. Therefore, it is likely that the nucleotide sequence of trypsinogen I mRNA shown in Fig. 3 is complete, except for the first 5' nucleotide. Furthermore, the missing first nucleotide is likely to be an adenosine, since in the 50 examples compiled by Breathnach and Chambon (1981) the first transcribed nucleotide is always a purine and generally an adenine. Thus, we estimate the length of trypsinogen I mRNA to be 804 nucleotides, plus the 5' cap structure and poly(A).

Such a short 5' noncoding region is common to many, but not all, of the pancreatic mRNAs for secretory enzymes (Fig. 6). The mRNAs encoding trypsin, elastases I and II, chymotrypsin, carboxypeptidase A, and amylase have 5' noncoding leader regions between 11 and 21 nucleotides long, whereas that of RNase A is 64 nucleotides. Although homology among the 5' noncoding regions of these pancreatic mRNAs is elusive, all of the short leaders, except for amylase, contain sequences with significant complementarity to a highly conserved oli-

6 C. Craik, G. Bell, and W. Rutter, personal communication.
7 W. Swain, personal communication.
8 G. H. Swift, S. J. Stary, and R. J. MacDonald, manuscript in preparation.
Enzyme (prepeptide length)   Prepeptide sequence | residue +1

Trypsin I (15)       Met Ser Ala Leu Leu Ile Leu Ala Leu Val Gly Ala Ala Val Ala | Phe
Trypsin II           Leu Val Gly Ala Ala Val Ala | Phe
Elastase I (16)      Met Leu Arg Phe Leu Val Phe Ala Ser Leu Val Leu Tyr Gly His Ser | Thr
Elastase II (16)     Met Ile Arg Thr Leu Leu Leu Ser Ala Phe Val Ala Gly Ala Leu Ser | Cys
Chymotrypsin (18)    Met Ala Phe Leu Trp Leu Val Ser Cys Phe Ala Leu Val Gly Ala Thr Phe Gly | Cys
C-peptidase A (16)   Met Lys Arg Leu Leu Ile Leu Ser Leu Leu Leu Glu Ala Val Cys Gly | Asn
Amylase (15)         Met Lys Phe Leu Val Leu Leu Ser Leu Ile Gly Phe Cys Trp Ala | Gln
RNase A (25)         Met Gly Leu Glu Lys Ser Leu Phe Leu Phe Ser Leu Leu Val Leu Val Leu-//-Gly Trp Val Gln Pro Ser Leu Gly | Gly

The vertical bar marks the prepeptide cleavage site, between residues -1 and +1.

FIG. 7. Prepeptides of rat pancreatic secretory enzymes. Numbers in parentheses indicate the amino acid length of each prepeptide. Sequence sources: elastases I and II, MacDonald et al., 1982; chymotrypsin (footnote 6); carboxypeptidase A, Quinto et al., 1982; amylase, MacDonald et al., 1980 (footnote 9); RNase A (footnote 8).
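The prepeptides of Fig. 7 can be tabulated programmatically to check the features the Discussion attributes to them (length, a lysine or arginine near the NH2 terminus, hydrophobic clusters, a small residue at the carboxyl terminus). The following is a minimal sketch, not part of the original analysis. The one-letter sequences are a transcription of Fig. 7, with the RNase A "-//-" break treated as a simple continuation. The names `SMALL`, `HYDROPHOBIC`, and `N_WINDOW`, and the residue sets and window size they hold, are illustrative assumptions rather than definitions taken from the paper.

```python
# One-letter transcription of the Fig. 7 prepeptides, paired with the
# lengths printed in parentheses in the figure.
PREPEPTIDES = {
    "Trypsin I": ("MSALLILALVGAAVA", 15),
    "Elastase I": ("MLRFLVFASLVLYGHS", 16),
    "Elastase II": ("MIRTLLLSAFVAGALS", 16),
    "Chymotrypsin": ("MAFLWLVSCFALVGATFG", 18),
    "Carboxypeptidase A": ("MKRLLILSLLLEAVCG", 16),
    "Amylase": ("MKFLVLLSLIGFCWA", 15),
    # Assumption: the "-//-" in Fig. 7 is a line continuation, not a gap.
    "RNase A": ("MGLEKSLFLFSLLVLVL" + "GWVQPSLG", 25),
}

SMALL = set("AGS")             # assumption: residues counted as "small side chain"
HYDROPHOBIC = set("AVLIFMWC")  # assumption: residues counted as hydrophobic
N_WINDOW = 5                   # assumption: "near the NH2 terminus" = first 5 residues


def longest_hydrophobic_run(sequence):
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = run = 0
    for residue in sequence:
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def summarize(sequence):
    """Collect the prepeptide features discussed in the text."""
    return {
        "length": len(sequence),
        "basic_near_n_terminus": any(r in "KR" for r in sequence[:N_WINDOW]),
        "longest_hydrophobic_run": longest_hydrophobic_run(sequence),
        "c_terminal_residue": sequence[-1],
        "c_terminal_small": sequence[-1] in SMALL,
    }


for name, (sequence, printed_length) in PREPEPTIDES.items():
    info = summarize(sequence)
    print(f"{name:<20}{info['length']:>3} (Fig. 7: {printed_length})  "
          f"K/R near N-terminus: {info['basic_near_n_terminus']}  "
          f"C-terminal residue: {info['c_terminal_residue']}")
```

Changing `N_WINDOW` or the residue sets changes which prepeptides pass each check, so the output should be read as a bookkeeping aid for Fig. 7 rather than as a signal-peptide predictor.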

9 R. G. Lahaie and R. J. MacDonald, unpublished observations.

gonucleotide sequence at the 3' end of eukaryotic 18 S rRNA (Fig. 6). In accordance with the sequence bias reported by Kozak (1981), the positions three and four nucleotides upstream of the initiator codon of each pancreatic mRNA are adenosine and cytosine, respectively, and, except for elastase I mRNA, the nucleotide following the initiator codon is a purine. The presence of a short noncoding leader (Young et al., 1981), a sequence within the leader region capable of base pairing with the 3' end of 18 S rRNA (Hagenbuchle et al., 1978), and prescribed nucleotides near the initiator codon that improve binding to ribosomes (Kozak, 1981) have been suggested to enhance the efficiency of initiation of translation of eukaryotic mRNAs.

In contrast to the highly conserved nucleotide sequence within the amino acid coding regions (88%), the 3' noncoding regions of the two trypsinogen mRNAs are only 66% homologous. The region of longest homology contains the hexanucleotide AAUAAA, which is preserved (sometimes as a slightly altered sequence (MacDonald et al., 1980)) near the 3' end of polyadenylated eukaryotic mRNAs (Proudfoot and Brownlee, 1976). This hexanucleotide sequence is present 15 nucleotides from the poly(A) of trypsinogen I mRNA. Although a poly(A) tract was not present in the single cloned ds-cDNA from trypsinogen II mRNA, the conserved hexanucleotide sequence is present 18 nucleotides from the last cloned nucleotide.

Prepeptides-Pretrypsinogen I has a prepeptide (signal peptide) similar in length and hydrophobic character to the prepeptides of other secretory proteins (for examples see Blobel et al. (1979)). The partial trypsinogen II prepeptide sequence (Fig. 3) containing the seven carboxyl-terminal amino acid residues is identical with the trypsinogen I sequence. Based upon the presence of a characteristic trypsinogen activation peptide (see below), cleavage of the prepeptides is predicted to occur between amino acids -1 and +1 (Fig. 3).

Through nucleotide sequence analysis of cloned ds-cDNA sequences, a number of prepeptides for rat pancreatic secretory enzymes are now known, including those for four serine proteases (Fig. 7). Even though the pancreatic serine proteases arose from a common ancestral protease (Neurath et al., 1967), with the exception of the two very similar trypsins, their prepeptides appear no more related to each other than to the prepeptides of amylase and carboxypeptidase. The pancreatic prepeptides, generally 15 or 16 amino acids in length, are among the shortest prepeptides known. They contain an arginine or lysine near the NH2 terminus, two short clusters of hydrophobic amino acid residues, and end at the carboxyl terminus with an amino acid residue with a small side chain. The RNase A prepeptide is a clear contrast, with Glu-Lys near the NH2 terminus and an extended second cluster of hydrophobic amino acids within a much longer prepeptide of 25 amino acids.

The prepeptide sequences of these rat pancreatic secretory proteins (Fig. 7) contrast with those reported by Devillers-Thiery et al. (1975), which indicated very similar, possibly identical, prepeptides for a number of dog pancreatic secretory proteins. Carne and Scheele (1982) have recently reinvestigated the sequences of the dog prepeptides and found them not to be closely homologous. The divergent pancreatic prepeptide sequences of the rat and now the dog should dispel the notion of a tissue- or cell-specific signal peptide.

Activation Peptides-Based upon the homology with the trypsinogen activation peptides of other species (Fig. 8), the activation peptides of rat trypsinogens I and II are predicted to comprise amino acids 1 through 8 (Fig. 3). A comparison of the 12 known trypsinogen activation peptides (Fig. 8) illustrates the diversity of acceptable amino acid sequences recognized by intestinal enteropeptidase for the specific activation of trypsinogen. The common feature is the preservation of four contiguous amino acid residues with carboxylic acid side chains (generally aspartate) followed by lysine where the peptide cleavage for activation is known to occur. An exception has been found by Brodrick et al. (1978) for human cationic trypsinogen.

The Enzymes-The predicted molecular weights of rat trypsinogens I and II are 24,509 and 24,734, respectively. These estimates closely agree with the experimental estimates of Vandermeers and Christophe (1969). Similarly, the amino acid compositions of the zymogens derived from the mRNA sequences are nearly identical with those reported for the purified rat cationic trypsinogen (Vandermeers and Christophe, 1969).

Both rat trypsins contain 12 half-cystinyl residues present at the same residue positions as the bovine, porcine, and dogfish trypsins. Although the pairing of the rat trypsin half-cystinyl residues cannot be determined from the nucleotide sequence data, it seems likely that the manner of pairing is

common to all trypsins. Thus, the six disulfide bridges, including the two unique to trypsins (Cys-15-Cys-145 and Cys-117-Cys-218), appear preserved in both rat enzymes.

Phe Pro Leu Glu Asp Asp Asp Lys    Rat I
Phe Pro Val Asp Asp Asp Asp Lys    Rat II
Phe Pro Val Asp Asp Asp Asp Lys    Sheep
Phe Pro Val Asp Asp Asp Asp Lys    Pig
Phe Pro Ile Asp Asp Asp Asp Lys    Dog 2 & 3
Thr Pro Thr Asp Asp Asp Asp Lys    Dog 1
    Ala Pro Asp Asp Asp Asp Lys    Dogfish
        Val Asp Asp Asp Asp Lys    Cow
        Val Asp Asp Asp Asp Lys    Goat
        Val Asp Asp Asp Asp Lys    Turkey
                        Asp Lys    Human cationic

FIG. 8. Activation peptides of trypsinogens. Sequence sources: sheep, Bricteux-Gregoire et al., 1966; pig, Hermodson et al., 1973; bovine, Davie and Neurath, 1955; turkey, Kishida and Liener, 1968; dogfish, Titani et al., 1975; goat, Bricteux-Gregoire et al., 1968; dog 1, 2, and 3, Carne and Scheele, 1982 and Borgstrom, 1979; human, Brodrick et al., 1978.

The binding of calcium ions by trypsinogen promotes autocatalytic activation by the cleavage of one critical peptide bond of the activation peptide while suppressing cleavage at certain other peptide bonds that results in loss of enzyme activity. Bode and Schwager (1975) have identified five amino acid residues that act as ligands for calcium ion binding. Both rat trypsinogens retain these residues and, therefore, may be predicted to bind calcium ion as one aspect of their activation. Furthermore, five conserved glycine residues have been proposed to act as flexible "hinges" to facilitate the conformation changes required to form the active site domain upon trypsinogen activation (Huber and Bode, 1978). Three of the glycine hinges have neighboring aromatic residues that appear to serve as anchors for the hinges. The glycine hinges and adjacent aromatic residues are preserved in the rat trypsinogens, indicating that similar structural changes can occur upon activation.

Preservation of key amino acid residues in the rat trypsins indicates that each would be a functional enzyme with characteristic tryptic cleavage activity. Amino acid residues 179-182 and 200-205 line the substrate binding pocket and determine the specificity toward polypeptide substrate side chains (Huber and Bode, 1978) and are preserved in the rat trypsins. The conserved residues Asp-179, Ser-180, and Gly-204 form ionic and hydrogen bonds with lysyl and arginyl side chains and are, therefore, the principal determinants of specificity. In addition, hydrogen bonding with the substrate polypeptide chain includes an interaction between the peptide carbonyl group of P,lo and the side chain amide of Gln-182, between the peptide NH of P3 and the phenolic hydroxyl of Tyr-30, and an essential interaction of the peptide carbonyl of P1 with the peptide NH of Gly-183 and Ser-185; these amino acid residues are also preserved in both rat enzymes.

The predictions of catalytic activity, substrate specificity, and other properties of these two rat trypsins could be verified by the expression of the cloned mRNA sequences in alternative host cells such as bacteria or yeast and characterization of the purified enzymes.

Comparisons between bovine, porcine, and dogfish trypsins have revealed 88 mutable loci within the 223 amino acid residues of the active enzyme (Titani et al., 1975). Only seven of these are in the enzyme interior, and the substitutions that occur are always conservative. The amino acid sequences of the rat trypsins reveal eight additional mutable sites. Assuming similar structures for the rat and bovine trypsins, only two of these sites are in the interior and both are conservative changes (Ala-102 to Val and Leu-146 to Val).

No gaps are required to align the amino acid sequences of the rat, bovine, and porcine trypsins. The alignments reveal 62 amino acid replacements between rat I and cow and 48 between rat I and pig. Applying the equation of Dickerson (1971), the actual number of replacements per 100 residues (m) for rat trypsin I can be calculated to be 32.6 and 24.2 for comparisons with the bovine and porcine trypsins, respectively. Assuming that the rodent-ungulate divergence occurred at the mammalian radiation 90 million years ago (Dickerson, 1971), the unit evolutionary period, which is the time required to fix a 1% sequence divergence, is 3.5 million years for rat and bovine trypsins and 2.6 million years for rat and porcine trypsins. The unit evolutionary period calculated by Hermodson et al. (1973) for the divergence of bovine and porcine trypsins is very similar, 3.0 million years. Assuming an average unit evolutionary period for mammalian trypsins of 3 million years, the duplication event that gave rise to rat trypsins I and II may be estimated to have occurred 34 million years ago, prior to the divergence of Cricetidae and Muridae (New and Old World rats and mice) (Wood, 1959).

10 The nomenclature of Schechter and Berger (1967) is used to identify amino acid positions of the polypeptide substrate.

Acknowledgments-We thank Drs. W. Swain and W. Rutter for providing the ds-cDNA library, Drs. W. Swain, C. Craik, G. Bell, W. Rutter, and G. Scheele for providing sequence information prior to publication, Dr. H. Martinez for providing the computer programs for DNA sequence analyses, Drs. J-C. Dagorn and M. Waterman for critically reading the manuscript, and Marie Rotondi for preparing the manuscript.

REFERENCES

Alwine, J. C., Kemp, D. J., and Stark, G. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5350-5354
Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1408-1412
Bailey, J. M., and Davidson, N. (1976) Anal. Biochem. 70, 75-85
Baralle, F. E. (1977) Cell 12, 1085-1095
Biessman, H., Craig, E. A., and McCarthy, B. J. (1979) Nucleic Acids Res. 7, 981-996
Birnboim, H. C., and Doly, J. (1979) Nucleic Acids Res. 7, 1513-1523
Blobel, G., and Sabatini, D. (1971) in Biomembranes (Manson, L. A., ed) Vol. 2, pp. 193-195, Plenum Press, New York
Blobel, G., Walter, P., Chang, C. N., Goldman, B. M., Erickson, A. H., and Lingappa, V. R. (1979) Symp. Soc. Exp. Biol. 33, 9-36
Bode, W., and Schwager, P. (1975) FEBS Lett. 56, 139-143
Borgstrom, A. (1979) Hoppe-Seyler's Z. Physiol. Chem. 360, 657-661
Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
Bricteux-Gregoire, S., Schyns, R., and Florkin, M. (1966) Biochim. Biophys. Acta 127, 277-279
Bricteux-Gregoire, S., Schyns, R., and Florkin, M. (1968) Arch. Int. Physiol. Biochim. 76, 571-572
Brodrick, J. W., Largman, C., Hsiang, M. W., Johnson, J. H., and Geokas, M. C. (1978) J. Biol. Chem. 253, 2737-2742
Carne, T., and Scheele, G. (1982) J. Biol. Chem. 257, 4133-4140
Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
Craig, E. A., McCarthy, B. J., and Wadsworth, S. C. (1979) Cell 16, 574-588
Dagorn, J-C., and Lahaie, R. G. (1981) Biochim. Biophys. Acta 654, 111-118
Dagorn, J-C., and Mongeau, R. (1977) Biochim. Biophys. Acta 498, 76-82
Davie, E. W., and Neurath, H. (1955) J. Biol. Chem. 212, 515-529
Denhardt, D. T. (1966) Biochem. Biophys. Res. Commun. 23, 641-646
Devillers-Thiery, A., Kindt, T., Scheele, G., and Blobel, G. (1975)
Proc. Natl. Acad. Sci. U. S. A. 72, 5016-5020
Dickerson, R. E. (1971) J. Mol. Evolution 1, 26-45
Efstratiadis, A., Kafatos, F. C., and Maniatis, T. (1977) Cell 10, 571-585
Fehlhammer, H., Bode, W., and Huber, R. (1977) J. Mol. Biol. 111, 415-438
Fields, S., and Winter, G. (1981) Gene (Amst.) 15, 207-214
Goodman, H. M., and MacDonald, R. J. (1980) Methods Enzymol. 68, 75-90
Hagenbuchle, O., Santer, M., Steitz, J. A., and Mans, R. J. (1978) Cell 13, 551-563
Hagenbuchle, O., Bovey, R., and Young, R. A. (1980) Cell 21, 179-187
Harding, J. D., MacDonald, R. J., Przybyla, A. E., Chirgwin, J. M., Pictet, R. L., and Rutter, W. J. (1977) J. Biol. Chem. 252, 7391-7397
Heindell, H. C., Liu, A., Paddock, G. V., Studnicka, G. M., and Salser, W. A. (1978) Cell 15, 43-54
Helmerhorst, E., and Stokes, G. B. (1980) Anal. Biochem. 104, 130-135
Hermodson, M. A., Ericsson, L. H., Neurath, H., and Walsh, K. A. (1973) Biochemistry 12, 3146-3153
Huber, R., and Bode, W. (1978) Accts. Chem. Res. 11, 114-122
Jamieson, J. D., and Palade, G. E. (1967) J. Cell. Biol. 34, 597-615
Kaempfer, R., Hollender, R., Soreq, H., and Nudel, V. (1979) Eur. J. Biochem. 94, 591-600
Kishida, T., and Liener, I. E. (1968) Arch. Biochem. Biophys. 126, 111-120
Kozak, M. (1981) Nucleic Acids Res. 9, 5233-5252
Li, M., Tzagoloff, A., Underbrink-Lyon, K., and Martin, N. C. (1982) J. Biol. Chem. 257, 5921-5928
MacDonald, R. J., Przybyla, A. E., and Rutter, W. J. (1977) J. Biol. Chem. 252, 5522-5528
MacDonald, R. J., Crerar, M. M., Swain, W. F., Pictet, R. L., Thomas, G., and Rutter, W. J. (1980) Nature 287, 117-122
MacDonald, R. J., Swift, G. H., Quinto, C., Swain, W., Pictet, R. L., Nickovits, W., and Rutter, W. J. (1982) Biochemistry 21, 1453-1463
Malek, L. T., Eschenfeldt, W. H., Munns, T. W., and Rhoads, R. E. (1981) Nucleic Acids Res. 9, 1657-1673
Maroux, S., Baratti, J., and Desnuelle, P. (1971) J. Biol. Chem. 246, 5031-5039
Mathews, B. W., Sigler, P. B., Henderson, R., and Blow, D. M. (1967) Nature 214, 652-656
Maxam, A., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560
McReynolds, L., O'Malley, B. W., Nisbet, A. D., Fothergill, J. E., Givol, D., Fields, S., Robertson, J., and Brownlee, G. G. (1978) Nature 273, 723-728
Mikes, O., Holeysovsky, V., Tomasek, V., and Sorm, F. (1966) Biochem. Biophys. Res. Commun. 24, 346-352
Neurath, H., Walsh, K. A., and Winter, W. P. (1967) Science 158, 1638-1644
Proudfoot, N. J., and Brownlee, G. G. (1976) Nature 263, 211-214
Przybyla, A. E., MacDonald, R. J., Harding, J. D., Pictet, R. L., and Rutter, W. J. (1979) J. Biol. Chem. 254, 2154-2159
Quinto, C., Quiroga, M., Swain, W. F., Nickovits, W. C., Standring, D. N., Pictet, R. L., Valenzuela, P., and Rutter, W. J. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 31-35
Rall, L., Pictet, R., Githens, S., and Rutter, W. J. (1977) J. Cell. Biol. 75, 398-409
Richards, R. I., Shine, J., Ullrich, A., Wells, J. R. E., and Goodman, H. M. (1979) Nucleic Acids Res. 7, 1137-1146
Rigby, P. W. J., Dieckmann, M., Rhodes, C., and Berg, P. (1977) J. Mol. Biol. 113, 237-251
Roychoudhury, R., Jay, E., and Wu, R. (1976) Nucleic Acids Res. 3, 101-116
Rutter, W. J., Kemp, J. D., Bradshaw, W. S., Clark, W. R., Ronzio, R. A., and Sanders, T. G. (1968) J. Cell. Physiol. 72, Suppl. 1, 1-18
Rutter, W. J., Przybyla, A. E., MacDonald, R. J., Harding, J. D., Chirgwin, J. M., and Pictet, R. L. (1978) in Differentiation and Neoplasia (Saunders, G. F., ed) pp. 487-508, Raven Press, New York
Schechter, I., and Berger, A. (1967) Biochem. Biophys. Res. Commun. 27, 157-162
Shotton, D. M., and Hartley, B. S. (1970) Nature 225, 802-806
Shotton, D. M., and Watson, H. C. (1970) Nature 225, 811-816
Sippel, A. E., Land, H., Lindenmaier, W., Nguyen-Huu, M. C., Wurtz, T., Timmis, K. N., Giesecke, K., and Schutz, G. (1978) Nucleic Acids Res. 5, 3275-3294
Stellwag, E. J., and Dahlberg, A. E. (1980) Nucleic Acids Res. 8, 299-317
Stroud, R. M., Ray, K. M., and Dickerson, R. E. (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 125-140
Swift, G., McCarthy, B. J., and Heffron, F. (1981) Mol. Gen. Genet. 181, 441-447
Titani, K., Ericsson, L. H., Neurath, H., and Walsh, K. A. (1975) Biochemistry 14, 1358-1366
Tschesche, H., Mair, G., Godec, G., Fiedler, F., Ehret, W., Hirschauer, C., Lemon, M., and Fritz, H. (1978) in Kinins-II (Fujii, S., Moriya, H., and Suzuki, T., eds) pp. 245-260, Plenum Press, New York
Vandermeers, A., and Christophe, J. (1969) Biochim. Biophys. Acta 188, 101-112
Van Nest, G. A., MacDonald, R. J., Raman, R. K., and Rutter, W. J. (1980) J. Cell Biol. 86, 784-794
Volckaert, G., Tavernier, J., Derynck, R., Devos, R., and Fiers, W. (1981) Gene (Amst.) 15, 215-223
Walsh, K. A., and Neurath, H. (1964) Proc. Natl. Acad. Sci. U. S. A. 52, 884-889
Wood, A. E. (1959) Evolution 13, 354-361
Young, C. L., Barker, W. C., Tomaselli, C. M., and Dayhoff, M. O. (1978) in Atlas of Protein Sequence and Structure (Dayhoff, M. O., ed) Vol. 5, Supplement 3, pp. 73-93, National Biomedical Research Foundation, Silver Spring, MD
Young, R. A., Hagenbuchle, O., and Schibler, U. (1981) Cell 23, 451-458
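
The correction for multiple replacements used in the evolutionary comparison of the Discussion can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions that are not spelled out in the text: that "the equation of Dickerson (1971)" is the Poisson correction m/100 = -ln(1 - n/100), where n is the observed percentage of differing residues, and that the comparison spans the 223 residues of the active enzyme. The function name `corrected_replacements_per_100` is hypothetical. Under these assumptions the quoted values of m (32.6 and 24.2) are recovered from the 62 and 48 observed replacements.

```python
import math

ACTIVE_TRYPSIN_LENGTH = 223  # residues in the active enzyme, as stated in the text


def corrected_replacements_per_100(observed_differences, aligned_residues):
    """Estimate replacements per 100 residues (m) from observed differences.

    Assumes the Poisson correction m/100 = -ln(1 - n/100), where n is the
    observed percentage of positions that differ in a gap-free alignment.
    """
    fraction_different = observed_differences / aligned_residues
    if not 0.0 <= fraction_different < 1.0:
        raise ValueError("observed differences must be fewer than aligned residues")
    return -100.0 * math.log(1.0 - fraction_different)


# Observed replacements reported for rat trypsin I against cow and pig.
m_bovine = corrected_replacements_per_100(62, ACTIVE_TRYPSIN_LENGTH)
m_porcine = corrected_replacements_per_100(48, ACTIVE_TRYPSIN_LENGTH)

print(f"rat I vs bovine:  m = {m_bovine:.1f}")   # 32.6
print(f"rat I vs porcine: m = {m_porcine:.1f}")  # 24.2
```

The sketch stops at m on purpose: converting m into a unit evolutionary period depends on the divergence time and on how the two lineages are counted, which the text does not state explicitly.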
