
American Society of Plant Biologists (ASPB)

Analyses of Expressed Sequence Tags from Apple


Author(s): Richard D. Newcomb, Ross N. Crowhurst, Andrew P. Gleave, Erik H. A. Rikkerink,
Andrew C. Allan, Lesley L. Beuning, Judith H. Bowen, Emma Gera, Kim R. Jamieson, Bart J.
Janssen, William A. Laing, Steve McArtney, Bhawana Nain, Gavin S. Ross, Kimberley C.
Snowden, Edwige J. F. Souleyre, Eric F. Walton and Yar-Khing Yauk
Source: Plant Physiology, Vol. 141, No. 1 (May, 2006), pp. 147-166
Published by: American Society of Plant Biologists (ASPB)
Stable URL: http://www.jstor.org/stable/20205729
Accessed: 11-10-2015 08:36 UTC


This content downloaded from 103.254.86.9 on Sun, 11 Oct 2015 08:36:15 UTC
All use subject to JSTOR Terms and Conditions

Analyses of Expressed Sequence Tags from Apple

Richard D. Newcomb*, Ross N. Crowhurst, Andrew P. Gleave, Erik H.A. Rikkerink, Andrew C. Allan, Lesley L. Beuning, Judith H. Bowen, Emma Gera, Kim R. Jamieson, Bart J. Janssen, William A. Laing, Steve McArtney, Bhawana Nain, Gavin S. Ross, Kimberley C. Snowden, Edwige J.F. Souleyre, Eric F. Walton, and Yar-Khing Yauk

Horticultural and Food Research Institute of New Zealand Limited, Mt. Albert Research Centre, Auckland, New Zealand

The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5'-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop.

This work was supported by the Foundation for Research, Science, and Technology (grant no. C06X0207), and the Horticultural and Food Research Institute of New Zealand Limited.
* Corresponding author; e-mail rnewcomb@hortresearch.co.nz; fax 64-9-8154200.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Richard D. Newcomb (rnewcomb@hortresearch.co.nz).
Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.105.076208.

Plant Physiology, May 2006, Vol. 141, pp. 147-166, www.plantphysiol.org © 2006 American Society of Plant Biologists

Apples are recognized by consumers for their flavor, health, and nutritional attributes (Harker et al., 2003). Because of this, they have become the major temperate fruit crop and a significant component of horticultural fresh fruit traded internationally (Zohary and Hopf, 2000). The domestic apple (Malus domestica; also known as Malus pumila Mill.) belongs to the family Rosaceae. Together with other commercial fruit and ornamental species, it forms the subfamily Maloideae (Challice, 1974), which is thought to have evolved by hybridization from the families Spiraeoideae (x = 9) and Prunoideae (x = 8; Lespinasse et al., 2000). The resulting allopolyploid has a basic haploid number of x = 17 and an estimated genome size of 743 to 796 Mb (Arumuganathan and Earle, 1991).

Apple has become a model for understanding important traits in tree fruiting crops. The ability to graft scions to speed propagation and mass-produce genetically uniform fruit from an outbreeding plant has contributed to the success of apple and many other horticultural crops. Also, other important traits, including dwarfing and some insect resistance traits, can be conferred by rootstocks (Ferree and Carlson, 1987).

Compounds in the skin and flesh of the fruit confer flavor, taste, and health benefits that are important consumer traits in apple. Presumably, these compounds evolved as attractants and bribes for their seed dispersers. Flavor compounds increase substantially during fruit ripening, which takes place toward the end of 20 to 21 weeks of fruit development. This increase in flavor is caused by an autocatalytic burst of ethylene production late in fruit development, characteristic of all climacteric fruit (Fellman et al., 2000). Also triggered by ethylene are a marked increase in cell wall and starch breakdown and a general progression through ripening, senescence, and breakdown (Giovannoni, 2001).

The genes involved in many of the aforementioned traits are yet to be identified in apple. However, with the advent of high-throughput sequencing, isolation of genes potentially involved in such traits is now more readily attainable. One approach is the single pass sequencing of cloned cDNAs representing RNA transcripts (mRNAs). These are otherwise known as expressed sequence tags (ESTs) and have become an established method for rapidly developing gene databases (Adams et al., 1993). By sequencing numbers of clones from cDNA libraries derived from RNA from different source tissues, the total set of genes sampled from the genome can be maximized. Bioinformatic sorting and clustering of the resulting sequences yield databases of putative genes that form the basis of a functional genomics program. Gene mining of these databases, aided by techniques such as microarrays, can be used to select candidate genes that are implicated in particular crop traits. In addition, ESTs have been identified as useful sources of both simple sequence repeats (SSRs) and single-nucleotide polymorphisms (SNPs), both useful markers for creating genetic maps in plants (Morgante et al., 2002; Rafalski, 2002).
ESTs have been collected for many plant species. The most comprehensively surveyed are Arabidopsis (Arabidopsis thaliana; 418,563 in GenBank) and rice (Oryza sativa; 406,624 in GenBank), both of which have also had their entire genome sequenced (Arabidopsis Genome Initiative, 2000; International Rice Genome Sequencing Project, 2005). Whereas fruit crops have been less extensively surveyed using an EST approach, there have recently been a number of reports on fruit EST projects. There is an extensive EST collection available from tomato (Lycopersicon esculentum; Van der Hoeven et al., 2002), and genes likely to be involved in the ripening process have been identified by virtual northern analysis (Fei et al., 2004). ESTs of strawberry (Fragaria ananassa) fruit have been analyzed by microarray technology (Aharoni et al., 2000) also during ripening (Aharoni and O'Connell, 2002). In an EST collection from the fruit of pineapple (Ananas comosus), Moyle et al. (2005) found a very high abundance of metallothionein gene transcripts, whereas reports from grape (Vitis vinifera) identify many new SSRs useful for grape mapping (Moser et al., 2005) and many candidate genes involved in fruit development traits (Goes da Silva et al., 2005). The only other Rosaceae species to have a significant number of ESTs described is apricot (Prunus armeniaca; Grimplet et al., 2005).

Here we describe the first EST sequencing project in apple. We report the collection and analysis of 151,687 high-quality apple ESTs, largely from the commercial apple cultivar Royal Gala. From this sequence information, we put sequences into functional categories in preparation for functional genomics programs and describe SSRs and SNPs in the sequence data that will be useful in marker-assisted breeding programs. In addition, we show that there are many ESTs that potentially encode enzymes of important flavor and health compound biosynthetic pathways, and explore whether there has been an expansion of the number of genes from gene families involved in secondary metabolite biosynthesis and regulation that are expressed in fruit tissues.

RESULTS

EST Sequencing and Clustering

cDNA libraries were constructed from a range of different tissues and developmental time points using material from the apple cultivars Royal Gala, Pinkie, Pacific Rose, and the dwarfing rootstock M9. Libraries were also constructed from some tissues, plants, and cell lines that were subjected to biotic and abiotic stresses. The libraries were sequenced to varying depths (Table I), depending on library quality and novelty. Over the 43 cDNA libraries sequenced, 151,687 good quality sequences were recovered. The average edited length of the sequences was 468 bases.

Clustering of the sequences using a 95% threshold yielded 17,460 tentative consensus (TC) sequences with 25,478 sequences remaining unclustered (singletons). TC sequences range in length from 66 to 6,145 bases with an average of 745 bases, whereas singletons range in size from 47 to 790 bases with an average of 394 bases. The GC ratio of singletons ranged from 13% to 78%, with an average of 44%, whereas that for TC sequences ranged from 14% to 69%, also with an average of 44%. Together, the TC sequences and singletons yielded an apple EST dataset of 42,938 sequences. Hereafter, the singletons and TC sequences are collectively referred to as the nonredundant (NR) gene set. Clustering using a threshold of 90% generated fewer TC sequences (16,756) and singletons (17,858). However, using this lower threshold increased the number of instances of paralogs being incorporated into the same cluster and was therefore not used subsequently.
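The length and GC summaries reported for the TC sequences and singletons can be illustrated with a minimal Python sketch. The function names `gc_fraction` and `summarize` are hypothetical and are not taken from the study's pipeline, and the treatment of ambiguous bases (ignored) is an assumption; only the TC and singleton counts come from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at) if (gc + at) else 0.0


def _mean(values):
    return sum(values) / len(values)


def summarize(seqs):
    """Return ((min, mean, max) length, (min, mean, max) GC fraction)
    for a non-empty list of sequences."""
    lengths = [len(s) for s in seqs]
    gcs = [gc_fraction(s) for s in seqs]
    return (
        (min(lengths), _mean(lengths), max(lengths)),
        (min(gcs), _mean(gcs), max(gcs)),
    )


# Counts reported in the text: TC sequences plus singletons form the NR set.
tentative_consensus = 17_460
singletons = 25_478
nr_total = tentative_consensus + singletons  # 42,938 NR sequences
```

Applied separately to the TC sequences and to the singletons, `summarize` would give the kind of range and average figures quoted above.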
Codon usage was assessed on a set of 545 apple cDNA sequences predicted to contain full-length coding regions. These cDNAs were checked by manual inspection of BLASTx versus NRDB90 reports to make sure they are devoid of introns and frameshift errors. From these data, the open reading frames were defined and a codon usage table created from the 203,267 codons (Table II). All codons are found in the full-length cDNA dataset, with the least frequent codon represented over 100 times. The GC content at the third position of the codon is 52%.
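As a rough illustration of how a codon usage table of this kind can be derived, the following Python sketch counts codons over in-frame open reading frames and reports, for each codon, its frequency per 1,000 codons, its fraction among synonymous codons, and the GC content at the third codon position. It is not the EMBOSS CUSP program used in the study; the function names `codon_usage` and `gc3` are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_usage(orfs):
    """Count codons over in-frame ORFs (each starting at codon position 1)."""
    counts = Counter()
    for orf in orfs:
        orf = orf.upper()
        for pos in range(0, len(orf) - 2, 3):
            codon = orf[pos:pos + 3]
            if codon in CODON_TABLE:  # skip codons containing ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {
        codon: {
            "amino_acid": CODON_TABLE[codon],
            "count": n,
            "per_1000": 1000 * n / total,
            "fraction": n / per_amino_acid[CODON_TABLE[codon]],
        }
        for codon, n in counts.items()
    }


def gc3(usage):
    """GC content at the third codon position, weighted by codon counts."""
    total = sum(row["count"] for row in usage.values())
    gc = sum(row["count"] for codon, row in usage.items() if codon[2] in "GC")
    return gc / total if total else 0.0
```

For example, `codon_usage(["ATGGCTGCCTAA"])` yields four codons, with GCT and GCC each making up half of the Ala codons.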
SSRs and SNPs

cDNA sequences are a useful source of microsatellites or SSRs in plants. SSRs are particularly common in the 5'-untranslated region (UTR) and, to a lesser extent, in the 3'-UTR of transcribed plant sequences (Morgante et al., 2002). We analyzed the nature of the perfect SSRs in the apple sequence dataset. Approximately 17% of the apple sequences contained one or more di-, tri-, or tetranucleotide SSRs. The relative frequency of di- and trinucleotide repeats is similar (4,018 versus 4,010, respectively; Table III). Just over one-half (57%) of the repeats were between 12 and 14 bases in length and only 17% of the di-, tri-, and tetranucleotide repeats were longer than 20 nucleotides in

Table

I. Summary

AAFA
AAFB
AAGA
AAKA

Minimum

Average

Maximum

Sequence

Sequence

Sequence

Length

Length

Length

50
47

767425
7,018
382

821

6,184

5,389

50

331

682

1,556

1,058

Library
Description
fruit, seeds removed
fruit 150 DAFB
skin peel, tree-ripened
fruit 150 DAFB
skin peel, tree-ripened
51 342 702 530
leaf, normalized

59 DAFBb

Royal Gala
Royal Gala

apple

Royal Gala apple


Pinkie expanding
Pinkie expanding

51

leaf, normalized

687

365

Total
No.
ESTs

No. ESTs
Assigned
toTCa

Royal Gala

senescing

partially

50

leaf

318

729

1,888

50

leaf

451

3,059

770

partially

51 553 767 4,860


10 DAFB fruit
777
562 4,240
187
10 DAFB fruit
200 440 697 405
10 DAFB fruit
182
511 748
87 DAFB fruitcortex
2,458
59 DAFB seeds 101 520 766 5,472
195
514
766
126 DAFB fruitcore
5,092

AAYA

Royal Gala

126 DAFB

ABCA

fruit cortex

521

118

736
751
740
753

512
589
524
591

at 0.5?C

for 24

ABEB
ABKA
ABLC

infected with

V.

inaequalisc

leaves infected with


seedling
stress leaves
Gala
temperature
Royal
Braeburn cell culture 3 d after subculture

V.

inaequalisc
128

539

72

573

Royal Gala
Royal Gala

leaves

seedling

Braeburn

cultured

fruit cells,

boron

3,946

5,004

523

4,693 756

165

552

741

4,808

4,616

170

564

775

4,798

3,967

198

521

562
766

4,082

557
209

exposed

781

4,900

AELA

Aotea

expanding
Royal Gala young

51

leaf

758

349
leaf

expanding

777
3,963

831

1,665
50 318

1,524
712

5,457

4,388

AENA Northern Spy expanding leaf 52 428 759 1,222


50 370 828 4,786
AEPA Pinkie expanding leaf
50 350 877 4,959
AFBC Royal Gala preopened floral bud
24 DAFB

51

344

777

918

shoot

50

296

737

5,572

4,511

AVBC

Royal Gala

young

shoot

57

544

790

17,967

15,874

AYFB Royal Gala 10 DAFB fruit


Total

for all

54 304 693 645


47

libraries combined

1,077

468

877

Total
to a TC sequence
of ESTs from each
(contig).
library that have been assigned
bDays after
V. inaequalis
derived from the pathogen
(K. Plummer, W. Cui, and M. Templeton,
fungal sequences
in the entire database
TC sequences
(i.e. this figure is not additive).

aNumber
contain
unique

length. We compared the relative frequency of repeats with different dinucleotide compositions and found there to be a striking bias to one of four possible repeat classes (Table III). AG repeats were by far the most common dinucleotide repeat, constituting nearly 88% of dinucleotide repeats. A similar bias to AG repeats has been found in Arabidopsis (83%) and indeed other plants (Zhang et al., 2004). AT repeats were the next most common at 7.6% in apple compared with 8.8%

unpublished

8.6
11.9
21.8
24.0
22.2
17.8
18.6
10.2
4.0
17.3
29.7
13.8
13.4
10.6
17.7
17.7
10.3
8.5
19.6
24.9
18.7
20.7

295

27.4
19.0
11.6

2,093

235

25,478
17,460d
no. of NR sequences
43,
full bloom.

3.9
12.8

1,061

410

151,687

7.0
6.7

1,026

782

young

14.2

304
895

3,933

Royal Gala
Royal Gala

27.4

1,069

3,891

AVBB

AOFA

fruit

4,383

754
585 944

32.3
24.4

189
108
192
92
478
192
831
167
148
633
517
167
850
95
141

395
927

770
1,075
764
4,715

31.3
23.0

1,090

4,215

ABPB M9 root tips 86 562 760 4,813


199 504 749 926
ABQA Royal Gala flower
AEAA

3,914

186

108

ABMA M9 phloem
ABNB

4,655

600
378
886
403

ABDA RoyalGala fruitstored for 24 h under lowoxygen/high C02


ABEA

2,361
4,769

756

26.8

691
295
27
97
703
437
535

378

506

21.8

2,426

3,945

789
486
1,078
495

24.6

646

4,169

7424,481

21.1

1,460

6,412

97
139
105
201
226

spur buds
spur buds
spur buds
spur buds
fruit stored

Royal Gala

8,838

12.9
32.0

342
309
435

2,003

Royal Gala
Royal Gala
Royal Gala
Royal Gala
Royal Gala
Royal Gala

21.7

1,152

1,453

Royal Gala

Percentage
of Singletons
per Library

795
498
112
127

933
679

AASA
AASB
AASC
AAUA
AAWA
AAXA

M9xylem
Pacific Rose
Pacific Rose
Pacific Rose
Pacific Rose

from Apple

1,523

4,130

AARA

senescing

No.

418
389

516

AAOA Royal Gala phloem 50 371 734 4,519


50 482 771 2,649
AAPA Royal Gala 24 DAFB fruit

AAZA
ABAA
ABAB
ABBA
ABBB

Tags

Singletons

5,495

50 348 721 5,282


AALA Royal Gala 150 DAFB fruit cortex
51 341 739 1,275
AALB Royal Gala 150 DAFB fruitcortex
51 374 668 988
AAMA Royal Gala spur bud autumn
AANA

Sequence

ESTs

of apple

Library
Code
AAAA

of Expressed

cThese
data).

36.4

938
libraries also will
dTotal number

of

for Arabidopsis, followed by AC repeats at 4% in apple compared with 8% in Arabidopsis. CG repeats are very infrequent in plants at 0.05% in apple and 0.14% in Arabidopsis.
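The grouping of dinucleotide repeats into four classes (for example AG, GA, CT, and TC counted together) follows from treating a motif, its rotations, and its reverse complements as equivalent. A minimal Python sketch of a perfect-SSR scan with this grouping is shown below. The function names and the 12-base minimum length are assumptions made for illustration; the text does not state the detection thresholds or software used.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_motif(motif: str) -> str:
    """Map a motif to the alphabetically smallest of its rotations and
    reverse-complement rotations, so AG, GA, CT, and TC all become AG."""
    reverse_complement = motif.translate(_COMPLEMENT)[::-1]
    forms = set()
    for m in (motif, reverse_complement):
        for i in range(len(m)):
            forms.add(m[i:] + m[:i])
    return min(forms)


def find_perfect_ssrs(seq: str, min_len: int = 12):
    """Return sorted (start, motif_class, length) tuples for perfect di-, tri-,
    and tetranucleotide repeats of at least min_len bases (assumed cutoff)."""
    seq = seq.upper()
    found = []
    for unit in (2, 3, 4):
        pattern = re.compile(r"([ACGT]{%d})\1+" % unit)
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. AA, or ATAT, which is a dinucleotide repeat).
            if any(
                motif == motif[:k] * (unit // k)
                for k in range(1, unit)
                if unit % k == 0
            ):
                continue
            length = len(match.group(0))
            if length >= min_len:
                found.append((match.start(), canonical_motif(motif), length))
    return sorted(found)
```

With this grouping, a CT repeat on one strand and an AG repeat on the other are tallied in the same class, which is how the AG/GA/CT/TC class is reported in Table III.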
Next we investigated the position of the SSRs in relation to putative initiation (Met) and stop codons within the apple sequence dataset. First, we identified sequences containing a dinucleotide repeat with more than 100 bp of flanking DNA and ranked the

Table II. Codon usage calculated using 545 full-length apple cDNA sequences^a

Codon   Amino Acid   Fraction^b   Per 1,000^c   No.
GCA     A            0.27         18.92         3,846
GCC     A            0.26         18.42         3,745
GCG     A            0.14          9.82         1,997
GCT     A            0.33         23.36         4,748
TGC     C            0.59         10.38         2,110
TGT     C            0.41          7.20         1,464
GAC     D            0.44         23.91         4,860
GAT     D            0.56         29.98         6,095
GAA     E            0.45         28.70         5,834
GAG     E            0.55         34.67         7,047
TTC     F            0.53         22.14         4,500
TTT     F            0.47         19.95         4,056
GGA     G            0.29         19.85         4,034
GGC     G            0.23         16.09         3,270
GGG     G            0.22         15.40         3,130
GGT     G            0.25         17.43         3,542
CAC     H            0.51         13.34         2,712
CAT     H            0.49         12.67         2,575
ATA     I            0.21          9.98         2,028
ATC     I            0.37         17.49         3,556
ATT     I            0.42         19.74         4,012
AAA     K            0.39         23.77         4,832
AAG     K            0.60         36.10         7,337
CTA     L            0.09          7.77         1,579
CTC     L            0.21         18.83         3,828
CTG     L            0.17         15.33         3,116
CTT     L            0.21         18.76         3,813
TTA     L            0.09          8.09         1,644
TTG     L            0.23         21.11         4,291
ATG     M            1            24.87         5,056
AAC     N            0.51         23.02         4,679
AAT     N            0.49         21.98         4,467
CCA     P            0.30         17.37         3,530
CCC     P            0.20         11.60         2,358
CCG     P            0.21         11.84         2,406
CCT     P            0.29         16.37         3,328
CAA     Q            0.48         19.08         3,879
CAG     Q            0.52         20.52         4,172
AGA     R            0.26         13.02         2,646
AGG     R            0.27         13.60         2,764
CGA     R            0.11          5.28         1,073
CGC     R            0.12          6.17         1,254
CGG     R            0.12          6.17         1,255
CGT     R            0.10          5.14         1,044
AGC     S            0.16         14.65         2,977
AGT     S            0.14         12.18         2,476
TCA     S            0.19         16.63         3,381
TCC     S            0.19         16.99         3,453
TCG     S            0.13         11.21         2,279
TCT     S            0.19         17.28         3,512
ACA     T            0.26         12.97         2,636
ACC     T            0.30         15.15         3,079
ACG     T            0.14          7.23         1,469
ACT     T            0.29         14.68         2,985
GTA     V            0.11          7.27         1,478
GTC     V            0.23         14.47         2,942
GTG     V            0.33         20.81         4,231
GTT     V            0.33         20.57         4,181
TGG     W            1            14.01         2,847
TAC     Y            0.57         14.83         3,015
TAT     Y            0.43         11.16         2,269
TAA     *            0.34          0.91           185
TAG     *            0.25          0.66           135
TGA     *            0.41          1.11           225

a Codon usage calculated using the CUSP program from EMBOSS (Rice et al., 2000).
b Proportion of usage of a given codon among its redundant set (i.e. the set of codons that code for this codon's amino acid).
c Codon frequency normalized per 1,000 codons.

sequences in order of significance of match to public sequences using BLASTx (Altschul et al., 1990). This ensured that we had identified the correct open reading frame and, therefore, the start and stop codons accurately (Fig. 1A). Of the top 100 in this ranking, we found that 83% contained dinucleotide repeats in the putative 5'-UTR, 2% in the putative coding region, and 15% in the putative 3'-UTR. These figures are similar for the Arabidopsis genome (5'-UTR 83%, coding region 0.4%, and 3'-UTR 16% as deduced from relative frequencies per Megabase pair given by Zhang et al. [2004]). These data suggested that repeats in the 5'-UTR are disproportionately high within 100 bp of the translation start site. We then analyzed all dinucleotide (Fig. 1B) and trinucleotide (Fig. 1C) repeats longer than six repeats present in the entire apple database using a BLASTx E-value significance cutoff criterion of e-20 to identify all sequences with a reasonable protein match in GenBank on which to base putative start sites. At least for the dinucleotide repeats, the pattern seen on this global data is consistent with that manually collected for the top 100 ranked genes used above. Both datasets show a consistent pattern, with SSRs clustered in the 5'-UTR closest to the start codon.
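The positional analysis amounts to a histogram of repeat distances from the putative start codon. A minimal Python sketch of that binning step is given below; the function name `bin_distances` is hypothetical, and the default 15-base bin width and 500-base cap are assumptions of this sketch rather than settings taken from the authors' code.

```python
from collections import Counter


def bin_distances(distances, bin_size=15, max_distance=500):
    """Count repeats per distance bin (in bases from the putative start codon).

    Distances beyond max_distance are pooled into the last bin, so that very
    distant repeats do not stretch the histogram.
    """
    bins = Counter()
    for distance in distances:
        capped = min(distance, max_distance)
        bins[(capped // bin_size) * bin_size] += 1
    return dict(sorted(bins.items()))
```

Each key in the returned dictionary is the lower edge of a bin, so a cluster of repeats near the start codon shows up as large counts under the smallest keys.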

In addition to SSRs, EST sequences are also a useful source of SNPs, which can also be used in mapping and marker-assisted breeding. The major cultivar sequenced in this study was Royal Gala (78.9%). However, some sequences were also from other apple cultivars, including M9 (9.7%), Pinkie (3.8%), Braeburn (3.7%), Pacific Rose (1.9%), Aotea (1.1%), and Northern Spy (0.8%). Apple is also an outbreeder, which will increase levels of heterozygosity within cultivars. Together, these factors should increase the instances of SNPs in the apple EST data. This seems to be the case with evidence for 18,408 bi-allelic SNPs confirmed by more than one sequence per allele from the 13.0 Mb of aligned NR sequences analyzed. Bi-allelic SNPs are therefore found, on average, every 706 bp of transcribed DNA. Transitions were more common than transversions. There were 4,592 AG and 5,112 CT transitions compared with 2,032 AC, 2,372 AT, 2,228 CG, and 2,072 GT transversions (Table IV). Furthermore, one or more candidate SNPs with restriction endonuclease cleavage site polymorphisms were revealed in approximately 82% of NR sequences with predicted SNPs.
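The SNP figures quoted in the text can be checked with a short calculation. The Python sketch below classifies each substitution class as a transition or transversion and recomputes the totals and the average spacing; the counts and the 13.0 Mb figure are those given in the text, while the function and variable names are hypothetical.

```python
# Purine-purine (A/G) and pyrimidine-pyrimidine (C/T) changes are transitions.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def classify_snp(base_a: str, base_b: str) -> str:
    """Classify a bi-allelic SNP as a 'transition' or 'transversion'."""
    pair = frozenset((base_a.upper(), base_b.upper()))
    if len(pair) != 2 or not pair <= set("ACGT"):
        raise ValueError("expected two different bases from A, C, G, T")
    return "transition" if pair in TRANSITIONS else "transversion"


# Bi-allelic SNP counts per substitution class, as reported in the text.
snp_counts = {"AG": 4592, "CT": 5112, "AC": 2032, "AT": 2372, "CG": 2228, "GT": 2072}

totals = {"transition": 0, "transversion": 0}
for pair, count in snp_counts.items():
    totals[classify_snp(pair[0], pair[1])] += count

total_snps = sum(totals.values())      # 18,408 bi-allelic SNPs
aligned_bp = 13.0e6                    # 13.0 Mb of aligned NR sequence
bp_per_snp = aligned_bp / total_snps   # about 706 bp per SNP
```

The totals come to 9,704 transitions and 8,704 transversions, which together give the 18,408 SNPs and the spacing of roughly one SNP per 706 bp stated above.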

Table III. Summary of microsatellites in apple ESTs and comparison with Arabidopsis

Dinucleotide Repeat   No. NR      Percentage of      Apple   Percentage of
Composition           Sequences   Apple Di Repeats   Rank    Arabidopsis Di Repeats
AC/CA/GT/TG             162        4.0               3        8.0
AG/GA/CT/TC           3,548       88.3               1       83.0
AT/TA                   306        7.62              2        8.8
CG/GC                     2        0.1               4        0.14
Totals                4,018      100                        100

Trinucleotide Repeat      No. NR      Percentage of       Apple   Arabidopsis
Composition               Sequences   Apple Tri Repeats   Rank    Rank
AAC/ACA/CAA/GTT/TGT/TTG     223        5.6                 7       3
AAG/AGA/GAA/CTT/TCT/TTC     918       22.9                 1       1
AAT/ATA/TAA/TTA/TAT/ATT     126        3.1                 9       5
ACC/CAC/CCA/GGT/GTG/TGG     635       15.8                 3       4
ACG/CGA/GAC/CGT/GTC/TCG     202        5.0                 8       9
ACT/CTA/TAC/AGT/TAG/GTA      52        1.3                10       8
AGC/CAG/GCA/TGC/CTG/GCT     544       13.6                 4       7
AGG/GGA/GAG/TCC/CTC/CCT     752       18.8                 2       6
ATC/CAT/TCA/GAT/ATG/TGA     291        7.3                 5       2
CCG/CGC/GCC/GGC/GCG/CGG     267        6.7                 6      10
Totals                    4,010      100%

Functional Categorization

Annotation of the apple sequences was based on similarity to Arabidopsis genes and transfer of their annotation to apple sequences. BLASTx comparisons to predicted proteins from Arabidopsis were used to assign apple NR sequences into 21 functional categories based on functional annotations available for the Arabidopsis proteins following the Munich Information Center for Protein Sequences (MIPS; http://mips.gsf.de) Functional Catalogue (FunCat) schema (Ruepp et al., 2004). Only 5.82% of the apple NR sequences did not have a match in Arabidopsis. Of those that do have a match, 72.79% are most similar to unclassified proteins in Arabidopsis (83.39%). Of the classified proteins in apple, the category Metabolism (5.39%) contains the most genes, as it does in Arabidopsis (4.09%; Table V).

The representation of protein families, domains, and functional sites within the apple sequence dataset was determined by comparison to the Inter-Pro (Zdobnov and Apweiler, 2001; Mulder et al., 2003) protein family database. In total, matches to 2,692 Inter-Pro families were found. The Inter-Pro families with the most frequent representation in the apple sequence dataset are presented in Table VI. Protein kinases are the most abundant family (IPR000719), with 801 NR sequences identified from apple. We used automated predictions based on comparisons to the Inter-Pro database to analyze transcription factors in greater detail and identified the most common transcription factor families in the apple sequences and compared the rankings of these with Arabidopsis (Table VII). The MYB transcription factor family is the most common within the apple NR sequences.

Genes Encoding Important Traits in Apple

This collection of ESTs contains signatures of many genes involved in important traits in apple. Whereas much of primary metabolism and basic plant physiological processes are not peculiar to apple, some elements of the biology of apple are unique to the species, or at least members of the Rosaceae or other climacteric fruit.

Fruit Ripening

Apples are a climacteric fruit, displaying a rapid increase in respiration at the onset of ripening simultaneous with an increase in the production of the hormone ethylene (Knee, 1993). This process alters the biochemistry and physiology of the fruit to produce the attributes we associate with fruit that are ready to eat, including color, texture, flavor, and nutritional content (Fellman et al., 2000). Many of these processes are under the control of ethylene, the synthesis of which is autocatalytic (McKeon and Yang, 1987; Fig. 2). In the first biosynthetic step, Met is converted to S-adenosyl-L-Met (SAM) by S-adenosyl-L-Met synthetase (EC 2.5.1.6), represented by 28 NR sequences in the apple sequence dataset. Next, SAM is converted to 1-aminocyclopropane-1-carboxylic acid (ACC) by ACC synthase (EC 4.4.1.14; 10 NR sequences) in what is the rate-limiting step in the pathway. Finally, ethylene is synthesized by ACC oxidase (EC 1.14.17.4; 13 NR sequences). In apple, an ACC synthase and an ACC oxidase gene have each been silenced in transgenic lines, revealing that many of the flavor and texture traits are under ethylene control (Dandekar et al., 2004).


Figure 1. Microsatellite (SSR) positions relative to predicted coding regions of the apple NR sequences. A, Manual analysis of the number of NR sequences containing dinucleotide repeats and their distance from the putative initiating Met (5'-SSR) or stop codon (3'-SSR) in 50-base bin sets. NR sequences were ranked in order of most significant to least significant BLASTx match to public domain databases to decrease the influence of incorrect putative initiating Met and stop codon identifications. The 100 top-ranked NR sequences with a BLASTx match more significant than e-38 were manually inspected for repeat position. Also shown are the lengths of the UTRs (% of total for the same dataset) that fit into the same bin set. The stop and start positions are shown and the numbering in between these two indicates the distance from the start codon. Repeats more than 500 bases from the putative start and stop (and UTRs longer than 500) have been combined into one bin set at the beginning and end of the range, respectively. B, Position of dinucleotide repeats in relation to the putative initiating Met from an automated analysis, including all NR sequences with a BLASTx match more significant than e-20 in 15-base bin sets. Note that, in contrast to the manual analysis in A, the stop sites have not been predicted with this automated analysis. C, Position of trinucleotide repeats in relation to the putative initiating Met from an automated analysis including all NR sequences with a BLASTx match more significant than e-20 in 15-base bin sets. As for the automated analysis in B, the stop sites have not been predicted with this automated analysis.

Proteins involved in the perception of ethylene have been isolated almost exclusively through the positional cloning of genetic mutants in tomato and Arabidopsis (Giovannoni, 2001; Adams-Phillips et al., 2004). Using these sequences, many of the apple representatives can be found in the apple sequence dataset (Fig. 2). Ethylene receptors are members of the His kinase class of receptor of which there are 17 NR sequences in the apple sequence dataset. These receptors transduce their signal through a mitogen-activated protein (MAP) kinase cascade. MAP kinase kinase kinases are negative regulators of the receptors with mutants for these genes showing constitutive activation of the pathway. Two representative families have relatives in the apple sequence dataset (CTR1, 27 NR sequences; CTR2, eight NR sequences). Alternatively, MPK6, an ethylene-inducible MAP kinase, is represented by six NR sequences. Downstream of these are a membrane-bound receptor ethylene insensitive-2 (EIN2; eight NR sequences) and then two sets of transcription factors, including the ethylene-insensitive like (EIL) family (18 NR sequences) and the ethylene response factors (ERF1, 21 NR sequences; ERF2, six NR sequences; ERF3, 15 NR sequences; ERF4, 10 NR sequences). The targets of the ERFs are likely to include genes involved in flavor biosynthesis and texture modification in ripening fruit.

Flavor Biosynthesis

Sugars are important contributors and modulators of flavor in fruit. Whereas in most fruit species Suc is the major transported photosynthate, in members of the Rosaceae, including apple, sorbitol accounts for more than 50% of the fixed carbon and the carbon exported from the leaves (Bieleski, 1982). The enzymatic steps required for the synthesis of sorbitol in source tissues and its metabolism in sink tissues are well known. Genes encoding these enzymes are represented in this apple NR set (Fig. 3). In source leaves, sorbitol is derived from the same hexose phosphate pool as Suc. The enzyme aldose 6-P reductase (EC 1.1.1.200; 11 NR sequences) is the rate-limiting step for the conversion to sorbitol from the hexose phosphate pool (Negm and Loescher, 1981), synthesizing sorbitol 6-P from Glc 6-P. Antisense suppression of the aldose 6-P reductase

Table IV. SNP analysis of apple NR sequences

Cumulative NR sequences                                          13.0 Mb
Total predicted bi-allelic SNPs from 42,938 NR sequences
  (25,478 singletons + 17,460 TC)                                18,408
Average occurrence of predicted bi-allelic SNPs                  1 in 706 bp
Average occurrence of predicted bi-allelic SNPs +
  sequences with one polymorphic base                            1 in 144 bp
NR sequences with predicted bi-allelic SNPs                      20.61%
Average no. predicted bi-allelic SNPs per NR contig sequence     1.05

Transitions and transversions
AG transitions                                                    4,592
CT transitions                                                    5,112
Total transitions                                                 9,704
AC transversions                                                  2,032
AT transversions                                                  2,372
CG transversions                                                  2,228
GT transversions                                                  2,072
Total transversions                                               8,704
Total                                                            18,408

6-P reductase transcript to 15% to 30% that of control apple plants results in a switch from sorbitol production to starch synthesis with no overall effect on the amount of CO2 fixation (Cheng et al., 2005). Sorbitol is then produced through dephosphorylation of sorbitol 6-P by an as yet unidentified 61-kD phosphatase (Zhou et al., 2003; EC 3.1.3.50). The sorbitol is then transported to sink tissues by a family of specialized sorbitol transporters (18 NR sequences) for unloading (Watari et al., 2004). Once in the sink tissue such as fruit, sorbitol dehydrogenase (EC 1.1.1.14; 23 NR sequences) converts sorbitol to Fru (Loescher et al., 1982), thereby reentering the pool of available sugars.

Upon ripening, apple fruit produce large quantities of volatiles presumably to attract and provide a taste reward for seed dispersers (Yahia, 1994; Fellman et al., 2000). In addition, other parts of the plant, such as the leaves, also produce volatiles that can act as attractants for insects (Bengtsson et al., 2001). The volatiles produced by apple, including alcohols, aldehydes, esters, terpenes, terpenoids, and polyphenolics, are derived from secondary metabolite pathways. The apple terpenes (E,E)- and (Z,E)-α-farnesene are produced via the mevalonate pathway (Ju and Curry, 2000; Fig. 4). In addition to being important flavor compounds in apple, (E,E)- and (Z,E)-α-farnesene are attractants for codling moth (Bengtsson et al., 2001) and have been implicated in the storage disorder scald (Pechous et al., 2005). All the enzymes involved in the mevalonate pathway are represented in the apple sequence dataset. Initial steps from mevalonate include the enzymes mevalonate kinase (EC 2.7.1.36; three NR sequences), phosphomevalonate kinase (EC 2.7.4.2; one NR sequence), mevalonate diphosphate decarboxylase (EC 4.1.1.33; four NR sequences), and isopentenyl-diphosphate Δ-isomerase (EC 5.3.3.2; four NR sequences). The progenitors of the terpenoids, geranyl diphosphate, farnesyl diphosphate, and geranylgeranyl diphosphate, are synthesized by polyisoprene synthases (EC 2.5.1.x; 12 NR sequences). The sesquiterpenes (E,E)- and (Z,E)-α-farnesene are produced from farnesyl diphosphate by the enzyme α-farnesene synthase. The gene encoding α-farnesene synthase has been isolated and shown to be up-regulated in fruit during ripening (Pechous and Whitaker, 2004). The α-farnesene synthase gene is represented by three NR sequences in the apple dataset. Other sesquiterpenes (e.g. β-caryophyllene, β-farnesene, germacrene D) and monoterpenes (e.g. ocimene, linalool) are produced by apple (Bengtsson et al., 2001); however, the terpene synthases responsible for their biosynthesis are yet to be identified.

The major group of compounds produced from ripe fruit of apple cultivars, such as Royal Gala, is esters (Young et al., 1996, 2004), including straight-chain esters derived from fatty acids (Rowan et al., 1999) and branched-chain esters derived from branched-chain amino acids (Rowan et al., 1996). Of the straight-chain esters, C-6 constituents are thought to be derived from linoleic acid via the lipoxygenase pathway (Fig. 5). The first committed step is performed by members of the lipoxygenase family (EC 1.13.11.12), which produces 13-hydroperoxide linoleic acid from linoleic acid. A large number of candidate lipoxygenases have

Table V. MIPS
with

FunCat

of apple NR

analysis

sequences

compared

Apple NR
Sequences

Arabidopsis

Arabidopsis
No.

Functional Category

01

Metabolism

02

Cell

04

Storage protein
Cell cycle and DNA

10
11

5.39

fate

12

Transcription
Protein synthesis

14

Protein

16

Protein with

18

Protein
Cellular

30

Cellular

32

Cell

0.26

0.13

0.07
0.47

0.44

0.42

0.41

0.03

0.02

0.64
0.93

function

binding

2.29

1.17
2.30

or cofactor
20

2.13

processing
2.47

fate

%
4.09

requirement

activity

regulation
2.19

transport

1.39
2.07

communication/signal
transduction
mechanism

34

and virulence
rescue, defense,
Interaction with
the cellular

36

Interaction

2.02

1.19
1.02

0.10

0.03

0.16

0.07

0.01

0.48

environment

38

with

Transposable
and plasmid

40

Cell

41

Development

42
70

Biogenesis
Subcellular

98

Classification

99

Unclassified

the environment
viral

elements,
proteins

fate

0.24

0.11

0.31
of cellular

0.11

components

localization
not yet clear
proteins


cut

1.36

0.40

1.75

0.31

2.49

2.36

72.79

83.39


Table VI. Fifty most common Inter-Pro families found within the apple NR sequences

Inter-Pro No.   Description   Frequency
IPR000719   Protein kinase   801
IPR002290   Ser-Thr protein kinase   359
IPR001611   LRR   346
IPR008271   Ser-Thr protein kinase, active site   274
IPR001245   Tyr protein kinase   269
IPR000504   RNA-binding region RNP-1 (RNA recognition motif)   202
IPR001680   G-protein β WD-40 repeat   193
IPR007090   LRR, plant specific   170
IPR001128   Cytochrome P450   159
IPR001841   Zinc finger, RING   156
IPR001005   Myb DNA-binding domain   133
IPR001806   Ras GTPase superfamily   124
IPR000626   Ubiquitin   124
IPR002048   Calcium-binding EF-hand   118
IPR002885   PPR repeat   117
IPR002110   Ankyrin   111
IPR001810   Cyclin-like F-box   106
IPR006662   Thioredoxin-type domain   100
IPR000608   Ubiquitin-conjugating enzymes   98
IPR001410   DEAD/DEAH box helicase   96
IPR001440   TPR repeat   95
IPR002401   E-class P450, group I   95
IPR001932   Protein phosphatase 2C-like   83
IPR001993   Mitochondrial substrate carrier   81
IPR001623   Heat shock protein DnaJ, N terminus   78
IPR001471   Pathogenesis-related transcriptional factor and ERF   76
IPR002198   Short-chain dehydrogenase/reductase SDR   73
IPR001087   Lipolytic enzyme, G-D-S-L   65
IPR001344   Chlorophyll a/b-binding protein   63
IPR001356   Homeobox   63
IPR007087   Zinc finger, C2H2 type   61
IPR002130   Peptidyl-prolyl cis-trans isomerase, cyclophilin type   60
IPR003439   ABC transporter   60
IPR002347   Glucose/ribitol dehydrogenase   60
IPR001878   Zinc finger, CCHC type   57
IPR000157   TIR   56
IPR000795   Protein synthesis factor, GTP binding   54
IPR000425   Major intrinsic protein   53
IPR002016   Haem peroxidase, plant/fungal/bacterial   52
IPR001395   Aldo/keto reductase   51
IPR000008   C2 domain   49
IPR002423   Chaperonin Cpn60/TCP-1   48
IPR004087   KH   46
IPR000916   Bet v I allergen   45
IPR001092   Basic helix-loop-helix dimerization domain bHLH   43
IPR001023   Heat-shock protein Hsp70   42
IPR000571   Zinc finger, C-x8-C-x5-C-x3-H type   41
IPR000217   Tubulin   40
IPR001938   Thaumatin, pathogenesis-related   39
IPR000823   Plant peroxidase   38
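A frequency table of this kind can be tallied from per-sequence domain annotations. The sketch below is illustrative only: the input layout (a mapping from NR sequence identifier to Inter-Pro accessions), the sequence identifiers, and the choice to count each family once per sequence are assumptions, not a description of how the published table was built.

```python
from collections import Counter

def most_common_families(annotations, top_n=50):
    """Rank Inter-Pro families by the number of NR sequences carrying them.

    `annotations` maps an NR sequence ID to an iterable of Inter-Pro
    accessions. Each family is counted at most once per sequence
    (an assumption of this sketch).
    """
    counts = Counter()
    for accessions in annotations.values():
        counts.update(set(accessions))
    return counts.most_common(top_n)

# Hypothetical annotations, for illustration only.
example_annotations = {
    "nr1": ["IPR000719", "IPR002290"],
    "nr2": ["IPR000719", "IPR000719"],
    "nr3": ["IPR000719", "IPR001611"],
    "nr4": ["IPR002290"],
}
top_two = most_common_families(example_annotations, top_n=2)
```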

been represented in the apple sequence dataset (41 NR sequences); however, not all of these will necessarily be involved in ester biosynthesis in fruit. In tomato, at least five lipoxygenase genes have been identified, but only one of these has been directly implicated in the production of flavor compounds (Chen et al., 2004). From 13-hydroperoxide linoleic acid, the cytochrome P450, hydroperoxide lyase, is responsible for the conversion to the aldehyde, hex-3-enal (four NR sequences), which can be converted to hex-2-enal by another cytochrome P450, hydroperoxide lyase (EC 4.2.1.92; three NR sequences). Alcohol dehydrogenases can reduce the aldehydes to alcohols (EC 1.1.1.1; 37 NR sequences). To date, one alcohol dehydrogenase has been identified from apple and shown not to be under the control of ethylene (Defilippi et al., 2005b). In addition to alcohols, aldehydes can be converted to acids by aldehyde dehydrogenases (EC 1.2.1.3; 27 NR sequences). Alcohols are then able to be esterified with CoA acids by alcohol acyl transferases (EC 2.3.1.84; three NR sequences). For example, the common apple ester, hexyl acetate, is synthesized from hexanol and acetyl CoA. One apple alcohol acyl transferase (MpAAT1) has been well characterized. MpAAT1 is up-regulated during ripening in response to ethylene (Defilippi et al., 2005b; Souleyre et al., 2005). The MpAAT1 enzyme is able to synthesize a wide range of esters, including many found in Royal Gala fruit (Souleyre et al., 2005). It is possible that the other alcohol acyl transferases found in the apple sequence dataset may also be contributing to the production of volatile esters.

Table VII. The 10 most common transcription factor (TF) families in apple identified by searches using automated Inter-Pro predictions

Top 10 TF Family Descriptions   No. Apple NR Sequences   Inter-Pro Accession Nos.
MYB   138   IPR001005, IPR006447, IPR000818
Pathogenesis related   76   IPR001471
C2C2 Zn finger   74   IPR002991, IPR000315, IPR000679, IPR003851, IPR006780
Homeobox   66   IPR001356, IPR003106, IPR000047
C2H2 Zn finger   64   IPR007087, IPR003656
NAC   52   IPR008917, IPR003441
Basic helix-loop-helix   43   IPR001092
C3H-type 1 Zn finger   41   IPR000571
WRKY   40   IPR003657
bZip   36   IPR004827

Branched-chain esters are characteristic of many cultivars of apple, including Royal Gala (Young et al., 1996, 2004). These are produced from Ile (Rowan et al., 1996), which increases in quantity in apple fruit skin during ripening (Defilippi et al., 2005a). Ile is synthesized from the amino acid Thr via five enzymatic steps. NR sequences representing each of these steps are present in the apple sequence dataset (Fig. 6). Ile is then metabolized first by branched-chain aminotransferases (EC 2.6.1.42; 11 NR sequences) and then pyruvate decarboxylase (EC 4.1.1.1; 11 NR sequences) to derive aldehydes available for alcohol dehydrogenases and alcohol acyl transferases to form esters. In addition to contributing to the pool of alcohols, branched hydrocarbons derived from Ile can also form CoA acids. For example, in addition to the branched-chain ester 2-methyl butyl acetate, the same branched chain can form ethyl-2-methyl butanoate, the predominant branched-chain ester found in Granny Smith apples (Rowan et al., 1996). Once produced, esters are available to be hydrolyzed by esterases (EC 3.1.1.1; two NR sequences). Such esterases are perhaps responsible for the large quantities of alcohols found in very ripe apple fruit and apple juice.

Color and Health-Related Compound Biosynthesis

Flavonoids, including anthocyanins and flavanols, are a class of secondary metabolites, derived from the amino acid Phe, that impart important beneficial health attributes probably through their antioxidant activity (Wolfe et al., 2003; Boyer and Liu, 2004). Representative anthocyanins and flavanols are found in apple (McGhie et al., 2005). The major anthocyanins in apple are the glycosylated cyanidins, which produce the red color observed in the fruit of many cultivars, including Royal Gala. The flavanol quercetin is found in apple fruit and has also been associated with health benefits. Representatives of the genes involved in flavonoid biosynthesis are present in the apple sequence dataset (Fig. 7). There are multiple NR sequences for all the genes in the pathway, including Phe ammonia lyase (EC 4.3.1.5; eight NR sequences), cinnamate 4-hydroxylase (EC 1.14.13.11; eight NR sequences), 4-coumarate CoA ligase (EC 6.2.1.12; 14 NR sequences), chalcone synthase (EC 2.3.1.74; 25 NR sequences), chalcone isomerase (EC 5.5.1.6; nine NR sequences), flavanone 3-hydroxylase (EC 1.14.11.9; seven NR sequences), and flavanone 3'-hydroxylase (EC 1.14.13.21; six NR sequences). From dihydroquercetin to the production of anthocyanins, the pathway


proceeds to leucoanthocyanidins produced by dihydroflavonol reductase (EC 1.1.1.219; eight NR sequences) then to anthocyanidins by anthocyanidin synthase (EC 1.14.11.19; six NR sequences). Finally, the red-colored cyanidin 3-glycosides are formed through the transfer of a sugar onto a hydroxyl group by a glycosyl transferase (EC 2.4.1.91; 26 NR sequences). Also, from dihydroquercetin, quercetin can be synthesized by flavonol synthase (EC 1.14.11.23; seven NR sequences), which in turn can be glycosylated by glycosyl transferases. Some members of these gene families that are expressed in the apple skin have been isolated and shown to be inducible by UV and coordinately up-regulated in the skins of red apple varieties (Kim et al., 2003; Ben-Yehudah et al., 2005). In addition, members of the MYB transcription factor family have been identified that can interact with promoters of these genes (Hellens et al., 2005). Furthermore, one MYB (MdMYB10; one NR sequence) has been identified that up-regulates this pathway in apple skin (R.V. Espley, R.P. Hellens, J. Putterill, and A.C. Allan, personal communication).

Figure 2. Ethylene synthesis and signal transduction as inferred from Giovannoni (2001) and Adams-Phillips et al. (2004). Apple sequences encoding enzymes and signal transduction proteins were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003) or sequences quoted in Adams-Phillips et al. (2004). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively.

Ethylene synthesis and signal transduction (compound -> compound: enzyme; ESTs)
methionine -> S-adenosyl-L-methionine: S-adenosyl-L-methionine synthase, EC 2.5.1.6; (28,17,11,323)
S-adenosyl-L-methionine -> 1-aminocyclopropane-1-carboxylic acid: 1-aminocyclopropane-1-carboxylate synthase, EC 4.4.1.14; (10,10,0,10)
1-aminocyclopropane-1-carboxylic acid -> ethylene: 1-aminocyclopropane-1-carboxylate oxidase, EC 1.14.17.4; (13,7,6,139)
Ethylene receptors: ETR1, ETR2, ERS1, ERS2, EIN4; (17,11,6,32)
MAP kinases: CTR1 (27,12,15,59); CTR2 (8,5,3,13); MPK6 (6,5,1,7)
EIN2; (8,4,4,17)
Ethylene insensitive transcription factors: EIN3, EIL1, EIL2; (18,7,11,51)
Ethylene response factors: ERF1 (21,8,13,319); ERF2 (6,2,4,19); ERF3 (15,4,11,238); ERF4 (10,3,7,104)
Ethylene response genes

Gene Family Evolution

Within the apple sequence dataset, there are representatives of many large gene families involved in the biosynthesis of phytochemicals, such as the flavor and health compounds described above. Such multigene families include the acyl transferases, methyl transferases, glycosyl transferases, and cytochrome P450s. We have compared the predicted amino acid sequences of members of selected biosynthetic gene families from Arabidopsis and apple using phylogenetic methods to identify clades where apple genes may have expanded in number, presumably by gene duplication. An example of this type of analysis is shown for the acyl transferases (Fig. 8), a gene family that contains members that are involved in ester biosynthesis in apple (Souleyre et al., 2005). For this gene family, there is at least one clade with more representatives from apple than Arabidopsis. Expansions in gene number are frequent in apple genes that are found in fruit cDNA libraries. For example, there have been more expansions in clades of apple P450s that contain

representative ESTs from fruit libraries over duplicated genes containing only ESTs from exclusively nonfruit libraries (15 versus four duplicated genes). Comparisons of numbers of nonduplicated P450 genes expressed in fruit compared with genes that contained ESTs exclusively from nonfruit were 13 versus 11 nonduplicated genes. We also examined gene families involved in gene regulation because presumably additional transcription factor families and control genes might be required to regulate new biosynthetic pathways. Transcription factor gene families also show an expansion of clades of apple members of these families that are found in cDNA libraries made from fruit tissues. For example, the bZIP transcription factors contain eight duplicated genes that were expressed in fruit tissues compared to three that were not, whereas the MYB transcription factors contain nine expressed in fruit compared to six that are not. Comparisons for orthologous genes between fruit- and nonfruit-expressed genes for the bZIPs were four versus eight and for the MYBs were 16 versus 10.

Figure 3. The sorbitol metabolism pathway in apple. Apple sequences encoding enzymes involved in sorbitol metabolism were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively. *, No PIR reference gene for EC 3.1.3.50 currently available because the gene has yet to be isolated.

Sorbitol metabolism (compound -> compound: enzyme; ESTs)
glucose 6-phosphate -> sorbitol 6-phosphate: aldose 6-phosphate reductase, EC 1.1.1.200; (11,6,5,177)
sorbitol 6-phosphate -> sorbitol: sorbitol 6-phosphatase, EC 3.1.3.50; *
sorbitol -> sorbitol: sorbitol transporters; (18,7,11,97)
sorbitol -> fructose: sorbitol dehydrogenase, EC 1.1.1.14; (23,15,8,299)
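The duplicated and nonduplicated gene counts quoted in the text for fruit and nonfruit libraries can be compared as simple proportions. The following is a rough sketch, not an analysis performed in the paper; it assumes that the "orthologous" counts given for the bZIPs and MYBs play the same role as the "nonduplicated" counts given for the P450s.

```python
def duplicated_fraction(duplicated, nonduplicated):
    """Fraction of genes in a group that fall in duplicated (expanded) clades."""
    return duplicated / (duplicated + nonduplicated)

# (duplicated, nonduplicated) counts as quoted in the text.
# Treating "orthologous" as "nonduplicated" is an assumption of this sketch.
counts = {
    "P450": {"fruit": (15, 13), "nonfruit": (4, 11)},
    "bZIP": {"fruit": (8, 4), "nonfruit": (3, 8)},
    "MYB": {"fruit": (9, 16), "nonfruit": (6, 10)},
}

fractions = {
    family: {tissue: duplicated_fraction(*pair) for tissue, pair in groups.items()}
    for family, groups in counts.items()
}

for family, by_tissue in fractions.items():
    print(f"{family}: fruit {by_tissue['fruit']:.1%}, nonfruit {by_tissue['nonfruit']:.1%}")
```

On these counts the P450s and bZIPs show a higher duplicated fraction among fruit-expressed genes, whereas the two MYB fractions are close to each other.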
DISCUSSION

We report here a significant sample of transcripts taken from 43 apple cDNA libraries. We constructed cDNA libraries from various tissue types, but with a bias toward fruit tissues. A staged series of developing and then ripening Royal Gala fruit were sampled for ESTs (53,620 ESTs). This series included flower, whole fruit, fruit cortex, skin, and seed samples. Such a series will become a useful resource of genes for experiments aimed at understanding important processes and transformations in fruit development, such as early cell proliferation, cell expansion, and, finally, ripening. Also, from these libraries will come genes encoding enzymes and transcription factors involved in the biosynthesis of health and flavor compounds from apple fruit. Other major plant tissues were also sampled, including buds, shoots, leaves, roots, phloem, and xylem (76,472 ESTs). Finally, many genes are only expressed in response to external effects. We therefore sampled ESTs from various tissues that had been exposed to biotic and abiotic stresses. These included harvested fruit stored at low temperature and altered storage atmospheric conditions, leaves that had been infected with the fungal pathogen Venturia inaequalis and exposed to high temperature, and fruit cell lines that had been exposed to boron (21,595 ESTs).
As is typical for EST gene-sampling strategies, there is a high degree of redundancy in the sequences collected. Clustering of the sequences reduced the number of sequences to 42,938 NR sequences composed of 17,460 TC sequences and 25,478 singletons. The proportion of singletons compared to the total number of ESTs can provide a measure of the overall contribution of the library to the dataset. No single


library contained more than 8% of the total number of singletons, indicating that much of the diversity is derived by sequencing different sources of tissue. The AARA library contained the greatest proportion of singletons per total number of apple ESTs (7.8%), with the next highest being the library sequenced to greatest depth, the leaf library AVBC (6.7% of singletons) that contained 11.8% of all apple sequences. Sequencing a number of different genotypes is also a good strategy for identifying new genes. The extreme of this is illustrated by a comparison of the NR clusters shared between the two largest expanding leaf libraries from the cultivars Royal Gala (AELA, 2,629 NR sequences) and Pinkie (AEPA, 2,074 NR sequences). These two libraries only share 14 NR sequences between them, which only comprises 0.3% of the total NR sequences represented in the combined dataset of the two libraries (4,689). These differences are not solely due to genotype-specific expression profiles, but also will include differences introduced by the two separate cloning procedures involved with making the libraries. Tissues where further sequencing would be useful are indicated by the percentage of singletons by library figures. High percentages of singletons in libraries such as AYFB (36.4%), AAFB (32%), AAMA (31.3%), and AAOA (32.3%) suggest that these libraries could be targeted for sampling for further genes from apple.

Figure 4. The terpene biosynthesis pathway via the mevalonate pathway. Apple sequences encoding enzymes involved in the mevalonate pathway were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively.

Terpenoid biosynthesis via the mevalonate pathway (compound -> compound: enzyme; ESTs)
mevalonate -> 5-phosphomevalonate: mevalonate kinase, EC 2.7.1.36; (3,3,0,3)
5-phosphomevalonate -> 5-diphosphomevalonate: phosphomevalonate kinase, EC 2.7.4.2; (1,0,1,4)
5-diphosphomevalonate -> isopentenyl diphosphate: mevalonate diphosphate decarboxylase, EC 4.1.1.33; (4,2,2,17)
isopentenyl diphosphate -> dimethylallyl diphosphate: isopentenyl-diphosphate delta-isomerase, EC 5.3.3.2; (4,1,3,22)
dimethylallyl diphosphate -> geranyl diphosphate [monoterpenes]: polyisoprene synthase, EC 2.5.1.x; (12,3,9,103)
geranyl diphosphate -> farnesyl diphosphate [sesquiterpenes]: polyisoprene synthase, EC 2.5.1.x; (12,3,9,103)
farnesyl diphosphate -> geranylgeranyl diphosphate [diterpenes]: polyisoprene synthase, EC 2.5.1.x; (12,3,9,103)
farnesyl diphosphate -> squalene [triterpenes]: squalene synthase, EC 2.5.1.21; (4,0,4,26)
squalene -> sterols: squalene monooxygenase, EC 1.14.99.7; (13,7,6,38)
geranylgeranyl diphosphate -> phytoene [carotenoids]: phytoene synthase, EC 2.5.1.32; (4,2,2,18)
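The overlap figures quoted for the AELA and AEPA leaf libraries follow from simple set arithmetic. This is a minimal sketch; the function name is illustrative, and the only inputs are the three counts given in the text.

```python
def shared_fraction(n_a, n_b, n_shared):
    """Return (size of the combined NR set, fraction of it shared by both libraries)."""
    combined = n_a + n_b - n_shared  # shared sequences are counted once
    return combined, n_shared / combined

# Counts quoted in the text: AELA (Royal Gala), AEPA (Pinkie), shared NR sequences.
combined, fraction = shared_fraction(2_629, 2_074, 14)
print(combined, f"{fraction:.1%}")
```

The combined size reproduces the 4,689 NR sequences given in the text, and the shared fraction rounds to the quoted 0.3%.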

Figure 5. The straight-chain ester biosynthetic pathway from fatty acids. Apple sequences encoding enzymes involved in straight-chain ester biosynthesis were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively.

Straight-chain ester biosynthesis from fatty acids (compound -> compound: enzyme; ESTs)
linoleic acid -> 13-hydroperoxide linoleic acid: lipoxygenase, EC 1.13.11.12; (41,34,17,165)
13-hydroperoxide linoleic acid -> hex-3-enal: hydroperoxide lyase [CYTP450 74B]; (4,1,3,29)
hex-3-enal -> hex-2-enal: hydroperoxide isomerase, EC 4.2.1.92; (3,1,2,9)
hex-3-enal -> hex-3-enoic acid: aldehyde dehydrogenase, EC 1.2.1.3; (27,10,17,143)
hex-3-enal -> hex-3-enol: alcohol dehydrogenase, EC 1.1.1.1; (37,19,18,153)
hex-3-enol -> hex-3-enyl acetate: alcohol acyl transferase, EC 2.3.1.84; (3,1,2,10)
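Because each NR sequence is either a singleton or a TC sequence, the four numbers given under ESTs in the pathway figures can be checked for internal consistency. The sketch below applies that check to the Figure 5 tuples as they are reproduced here; the function name and the second check (a TC sequence is assumed to contain at least two ESTs) are assumptions. A flagged entry may simply reflect a transcription error in the tuple.

```python
# Tuples as given under "ESTs" in Figure 5:
# (NR sequences, singletons, TC sequences, total ESTs)
FIGURE_5 = {
    "lipoxygenase": (41, 34, 17, 165),
    "hydroperoxide lyase": (4, 1, 3, 29),
    "hydroperoxide isomerase": (3, 1, 2, 9),
    "aldehyde dehydrogenase": (27, 10, 17, 143),
    "alcohol dehydrogenase": (37, 19, 18, 153),
    "alcohol acyl transferase": (3, 1, 2, 10),
}

def inconsistent_entries(entries):
    """Return enzyme names whose tuple fails either consistency check."""
    flagged = []
    for name, (nr, singletons, tc, total_ests) in entries.items():
        nr_mismatch = nr != singletons + tc
        # Assumed: a singleton holds one EST, a TC sequence at least two.
        too_few_ests = total_ests < singletons + 2 * tc
        if nr_mismatch or too_few_ests:
            flagged.append(name)
    return flagged

flagged = inconsistent_entries(FIGURE_5)
print(flagged)
```

With these values only the lipoxygenase tuple is flagged, because 34 singletons plus 17 TC sequences does not equal 41 NR sequences.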

Overall, it is expected that 42,938 NR sequences is an overestimate of the number of protein-coding transcripts (protein-coding genes) represented in apple and that more sequencing, both of the cDNAs sampled here and novel cDNAs from apple, would reduce this number of NR sequences. Other EST projects undertaken in fruit crops of a similar size in terms of total number of ESTs collected have reported lower numbers of NR sequences. For example, a study of 152,635 tomato ESTs produced 31,012 NR sequences (Fei et al., 2004), whereas a collection of 146,075 grape ESTs rendered 25,746 NR sequences (Goes da Silva et al., 2005). This is likely due to the higher clustering threshold (95%) used in this study compared to the tomato and grape studies (90%). If the apple EST dataset is analyzed using a 90% clustering threshold, numbers of NR sequences similar to those in the other fruit EST studies are attained (34,614 NR sequences composed of 16,756 TC sequences and 17,858 singletons). However, even this lower number of NR sequences is likely to be an overestimate of the number of genes in the apple genome. The Arabidopsis unigene set estimated from all Arabidopsis ESTs produces a 35% overestimate of the actual number of protein-coding genes estimated from the genome sequence. Using this figure, the actual number of predicted apple genes may be approximately 27,000. However, when data from full cDNA sequences are incorporated and homology to Arabidopsis genes is taken into account, it is likely the apple NR set presented here represents approximately one-half the number of expressed genes found in apple.
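The comparison of redundancy between EST projects can be made concrete by computing ESTs per NR sequence. This is a rough sketch under stated assumptions: the apple EST total is taken as the sum of the three sample groups quoted in the Discussion (53,620, 76,472, and 21,595), on the assumption that the groups do not overlap, and each apple NR count is taken as TC sequences plus singletons.

```python
# Apple EST total: sum of the three sample groups quoted in the Discussion
# (assumed not to overlap).
apple_ests = 53_620 + 76_472 + 21_595

# (total ESTs, NR sequences); apple NR counts are TC sequences + singletons.
projects = {
    "apple, 95% threshold": (apple_ests, 17_460 + 25_478),
    "apple, 90% threshold": (apple_ests, 16_756 + 17_858),
    "tomato, 90% threshold": (152_635, 31_012),
    "grape, 90% threshold": (146_075, 25_746),
}

ests_per_nr = {name: ests / nr for name, (ests, nr) in projects.items()}

for name, ratio in ests_per_nr.items():
    print(f"{name}: {ratio:.2f} ESTs per NR sequence")
```

Lowering the apple threshold from 95% to 90% moves the ratio toward the tomato and grape values, which is the direction the text describes.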
A common feature of the cDNA sequences obtained from apple, and indeed other plants (Morgante et al., 2002), is the high frequency of SSRs contained within them, with 8,028 of the 42,938 apple NR sequences (19%) containing di- or trinucleotide repeats. Dinucleotide repeats were most frequent in the 100 bp immediately 5' of the presumptive start AUG, whereas trinucleotide repeats were more common in the coding region. By far the most common class of dinucleotide repeat were AG repeats, making up 88.3% of all dinucleotide repeats. Least frequent were CG repeats at 0.1%. This bias toward AG and/or against CG repeats may be due to the tendency of CpG sequences to be methylated (Finnegan et al., 1998), which might potentially inhibit transcription. Another interesting feature of apple SSRs is the difference between the relative frequency of AG to the other dinucleotide repeat types in transcribed sequences compared with those found in genomic DNA (Guilford et al., 1997). For example, in apple genomic DNA, the AG repeats are approximately 60% more common than AC repeats, whereas in transcribed sequences AG repeats are almost 22 times (i.e. 2,200%) more common than AC repeats. A similar bias is found in Arabidopsis (Zhang et al., 2004). One possible explanation for this phenomenon is that there is an active role being


played by these AG repeats in plant species. This could also account for why they are so common in transcribed sequences compared to other repeats. Factors that bind AG repeats in regulatory regions are known in both animals and plants (Epplen et al., 1996; Sangwan and O'Brian, 2002; Iglesias et al., 2004). Other potential ways in which SSRs affect regulation are by hypermethylation and/or secondary structure (Jacobsen et al., 2000).

Figure 6. Branched-chain ester biosynthesis via Ile. Apple sequences encoding enzymes involved in branched-chain ester biosynthesis were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively.

Branched-chain ester biosynthesis via isoleucine (compound -> compound: enzyme; ESTs)
threonine -> 2-ketobutyrate: threonine deaminase, EC 4.3.1.19; (3,3,0,3)
2-ketobutyrate -> 2-acetohydroxybutyrate: acetolactate synthase, EC 2.2.1.6; (8,2,6,28)
2-acetohydroxybutyrate -> 2,3-dihydroxy-3-methylvalerate: acetohydroxyacid isomeroreductase, EC 1.1.1.86; (2,1,1,14)
2,3-dihydroxy-3-methylvalerate -> 2-keto-3-methylvalerate: dihydroxyacid dehydratase, EC 4.2.1.9; (2,0,2,10)
2-keto-3-methylvalerate -> isoleucine: aminotransferase, EC 2.6.1.42; (11,5,6,34)
isoleucine -> 2-oxo-3-methylpentanoic acid: aminotransferase, EC 2.6.1.42; (11,5,6,34)
2-oxo-3-methylpentanoic acid -> 2-methylbutanal: pyruvate decarboxylase, EC 4.1.1.1; (11,6,5,257)
2-methylbutanal -> 2-methylbutanol: alcohol dehydrogenase, EC 1.1.1.1; (37,19,18,153)
2-methylbutanol -> 2-methylbutylacetate: alcohol acyl transferase, EC 2.3.1.84; (3,1,2,10)
2-methylbutylacetate (hydrolysis): carboxylesterase, EC 3.1.1.1; (2,0,2,17)
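The SSR proportions quoted in the Discussion can be reproduced with a few lines of arithmetic. This is a hedged sketch: the NR total is computed from the TC and singleton counts given in the text, and reading "approximately 60% more common" as a ratio of 1.6 and "almost 22 times more common" as a ratio of 22 are assumptions about how those phrases should be interpreted.

```python
nr_total = 17_460 + 25_478       # TC sequences + singletons quoted in the Discussion
ssr_containing = 8_028           # NR sequences with di- or trinucleotide repeats
ssr_percent = 100 * ssr_containing / nr_total

genomic_ag_over_ac = 1.6         # assumed reading of "approximately 60% more common"
transcribed_ag_over_ac = 22.0    # assumed reading of "almost 22 times more common"
enrichment = transcribed_ag_over_ac / genomic_ag_over_ac

print(f"{ssr_percent:.1f}% of NR sequences contain SSRs")
print(f"AG:AC ratio is about {enrichment:.1f}-fold higher in transcribed sequences")
```

Under these readings, the AG to AC ratio is roughly 14-fold higher in transcribed sequences than in genomic DNA.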
Numerous SNPs were detected in the apple sequence dataset. From a cumulative length of 13.0 Mb of contiguous NR sequences sampled, 18,408 bi-allelic SNPs were detected. Bi-allelic SNPs occur with a frequency of one in every 706 bp of sequence. This is a relatively high level of variation probably due to two factors. The apple NR sequences, while predominantly from the cultivar Royal Gala, also contain sequences from six other cultivars, including Aotea, Braeburn, Pacific Rose, Pinkie, M9, and Northern Spy. Also, apple utilizes a strong incompatibility system selecting against self-crosses. Therefore, high levels of heterozygosity are expected. The ratio of transitions to transversions in the apple bi-allelic SNPs is close to 1:1, with 52.7% transitions. Similarly, in a SNP analysis of a comparison of the Columbia and Landsberg erecta accessions of Arabidopsis, 52.8% of the SNPs were transitions (Jander et al., 2002). With the advent of high-throughput detection systems, these SSRs and SNPs will form a large resource for mapping and marker-assisted breeding programs in apple and closely related crops.
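The SNP summary statistics can be recomputed from the counts reported in the text and in Table IV. This is a minimal sketch; the variable names are illustrative, and the inputs are the cumulative length, the SNP total, and the two transition counts.

```python
cumulative_bp = 13.0e6            # cumulative length of analyzed NR sequences (13.0 Mb)
total_snps = 18_408               # predicted bi-allelic SNPs
transitions = 4_592 + 5_112       # AG + CT transitions from Table IV
transversions = total_snps - transitions

bp_per_snp = cumulative_bp / total_snps
transition_percent = 100 * transitions / total_snps
ti_tv_ratio = transitions / transversions

print(f"one SNP per {bp_per_snp:.0f} bp")
print(f"{transition_percent:.1f}% transitions, Ti/Tv = {ti_tv_ratio:.2f}")
```

The results match the quoted frequency of one SNP in every 706 bp and the 52.7% transitions, with a transition to transversion ratio a little above 1:1.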
Knowledge of GC content of a genome and codon usage is useful when devising PCR-based strategies for mapping and gene isolation, as well as for hybridization studies by microarray. The GC content in the third base position of the full-length cDNA sequences sampled (52% GC) is higher than the overall GC ratio of 44% of the NR sequences. This indicates some pressure to a more balanced GC ratio in coding regions compared with UTRs. Similar GC ratios in coding regions are found in grape (51%) and pear (Pyrus communis; 52%). Overall, the codon usage of apple shares many similarities with that of other dicots represented in the codon usage database (Nakamura et al., 2000). Apple codon usage differs

Figure 7. Anthocyanin and flavanol biosynthesis in apple. Apple sequences encoding enzymes involved in anthocyanin and flavanol biosynthesis were identified by BLASTx (e-05 cutoff) using the PIR NREF database (Wu et al., 2003). Numbers in parentheses under ESTs refer to the number of apple NR sequences, singletons, TC sequences, and total number of ESTs, respectively.

Pathway steps shown in the figure (compounds; enzyme; EC number; ESTs):
phenylalanine to cinnamate; phenylalanine ammonia lyase; EC 4.3.1.5; (8,2,6,91)
cinnamate to p-coumarate; cinnamate 4-hydroxylase; EC 1.14.13.11; (8,3,5,261)
p-coumarate to p-coumaroyl-CoA; 4-coumarate CoA ligase; EC 6.2.1.12; (14,5,9,63)
p-coumaroyl-CoA to chalcone; chalcone synthase; EC 2.3.1.74; (25,16,9,67)
chalcone to naringenin; chalcone isomerase; EC 5.5.1.6; (9,4,5,40)
naringenin to dihydrokaempferol; flavanone 3-hydroxylase; EC 1.14.11.9; (7,4,3,17)
dihydrokaempferol to dihydroquercetin; flavanone 3'-hydroxylase; EC 1.14.13.21; (6,3,3,33)
dihydroquercetin to leucoanthocyanidin; dihydroflavonol 4-reductase; EC 1.1.1.219; (8,3,5,22)
leucoanthocyanidin to anthocyanidin; anthocyanidin synthase; EC 1.14.11.19; (6,3,3,35)
anthocyanidin to cyanidin 3-glycosides; UDP Glyc:flavanoid 3-O-glycosyl transferase; EC 2.4.1.91; (26,15,11,78)
quercetin (flavonol branch); flavonol synthase; EC 1.14.11.23; ([illegible],6,84)
quercetin to quercetin glycosides; UDP Glyc:flavanoid 3-O-glycosyl transferase; EC 2.4.1.91; (26,15,11,78)

markedly from Arabidopsis for 12 amino acids. Further comparisons with grape, pear, peach (Prunus persica), loblolly pine (Pinus taeda), poplar, tomato, citrus, potato (Solanum tuberosum), and tobacco (Nicotiana tabacum) showed that apple codon preference is most similar to that of grape and pear, differing from grape only in its preference for His (CAC), Leu (TTG), and Ser (TCT) and to pear in its preference for an additional three codons, Arg (AGG), Val (GTG), and the stop codon (TGA). The codon usage is correlated with trinucleotide repeat class frequency (R² = 0.75). These are also correlated in Arabidopsis (Zhang et al., 2004), arguing that codon usage is selecting repeat classes because most trinucleotide repeats are found in the coding region (Morgante et al., 2002). CpG suppression is also evident in apple with a XCG:XCC ratio of 0.65, similar to that of tomato (0.58). This modest level of suppression of the CpG dinucleotides differs markedly from that of nearly no suppression in Arabidopsis (0.92) to the high level found in grape (0.35). This may well reflect different


Figure 8. Phylogenetic tree of Arabidopsis (AtATs) and apple (numbers) members of the acyl transferase (AT) family. Apple ATs that have duplicated in the apple lineage are colored green, orthologous apple ATs are colored red, and apple ATs for which assignment is ambiguous are colored blue. The whole apple image on the right of the apple ATs identifies those that include ESTs from fruit tissue cDNA libraries, whereas the crossed apple image identifies apple ATs that do not have EST representatives from fruit libraries.

levels of methylation in the coding sequences used by different species of plants.
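
As an illustration of the two summary statistics discussed above, the following Python sketch computes a GC fraction and an XCG:XCC ratio from a set of coding sequences. It is not the analysis code used for this study. It assumes that the ratio is taken as the number of in-frame codons ending in CG divided by the number of in-frame codons ending in CC, and that each input string is a coding sequence already in frame; the function names and the toy sequence are hypothetical.

```python
from collections import Counter


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return (counts["G"] + counts["C"]) / total


def codons(cds: str):
    """Yield successive in-frame codons, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        yield cds[i:i + 3]


def xcg_xcc_ratio(cds_list) -> float:
    """Ratio of codons ending in CG to codons ending in CC (assumed definition)."""
    xcg = 0
    xcc = 0
    for cds in cds_list:
        for codon in codons(cds):
            if codon.endswith("CG"):
                xcg += 1
            elif codon.endswith("CC"):
                xcc += 1
    if xcc == 0:
        raise ValueError("no XCC codons found; ratio is undefined")
    return xcg / xcc


if __name__ == "__main__":
    # Hypothetical toy coding sequence: ATG GCC TCG ACC GCG CCC TAA
    toy = ["ATGGCCTCGACCGCGCCCTAA"]
    print(gc_fraction(toy[0]))
    print(xcg_xcc_ratio(toy))
```

On this reading, a ratio near 1 indicates little CpG suppression and lower values indicate stronger suppression, which matches the ordering reported above for Arabidopsis (0.92), apple (0.65), tomato (0.58), and grape (0.35).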
Predictive bioinformatics methods were employed to suggest the function of encoded proteins predicted from the apple NR sequences. Analyses included BLASTx, comparison with the MIPS-based role classification of Arabidopsis, and matches to the Inter-Pro protein family. Overall, many gene families commonly found in plant genomes are represented in the apple sequence dataset. Only approximately 5% of the apple NR sequences did not have a match in the Arabidopsis genome. Of the apple NR sequences, the most frequently represented class of genes were the protein kinases, followed by Leu-rich repeat (LRR) proteins and RNA-binding proteins. The ability to rapidly predict gene function using bioinformatic methods will speed efforts to identify genes likely to be involved in certain economically important traits. For example, the LRR class of protein (IPR001611) contains 321 NR sequences from apple, including genes involved in disease resistance, RNases involved in self incompatibility, and many other cellular processes where protein-protein interactions are important. Included within the LRR class (IPR001611) are the NBS-LRR

type of resistance genes, of which there are 59 NR sequences within the apple sequence dataset (e-20). In another Inter-Pro class (IPR007090) of plant-specific LRRs, 47 NR sequences were found in the apple sequence dataset. Disease-resistance gene candidates will also be common in other Inter-Pro classes; for example, the NB-ARC (IPR002182) and TIR (IPR000157) domain classes (63 and 56 NR sequences, respectively) are likely to consist largely of genes involved with disease resistance. In addition, the pathogenesis-related transcriptional factor and ERF (IPR001471) and protein kinase (IPR000719) classes (63 and 564 NR sequences, respectively) are also likely to contain a subset of genes that play important roles in plant defense. Other functional classes of proteins, such as many putative transcription factors, could be identified in our database. We compared the frequency of the most common transcription factor families with similar data available from the fully sequenced plants Arabidopsis (Riechmann et al., 2000) and rice (Goff et al., 2002), and found the rankings between these three species to be quite similar. The MADS-box family, ranked 17th in our analysis, was the only striking omission in apple from the top 10 in both Arabidopsis and rice.
The biosynthesis and maintenance of flavor and health compounds in fruit are important agronomic traits in apple. Presumably, such compounds have evolved as attractants and potential rewards for seed dispersers. Apple fruit produce more than 200 volatile flavor compounds, including alcohols, aldehydes, esters, ketones, and sesquiterpenes (Dimick and Hoskin, 1983). Apple also contains health-promoting phytochemicals, including complex sugars, acids, and vitamins, as well as secondary metabolites, such as polyphenolics and triterpenes (Boyer and Liu, 2004; Ma et al., 2005). Gene families potentially involved in the biosynthesis of flavor and health-related secondary metabolites are well represented within the apple NR sequences. We have shown this in two ways. First, the enzymes involved in particular biosynthetic pathways that produce secondary metabolites associated with flavor and health traits are well represented in the apple EST dataset (Figs. 4-7). Second, the classes of enzyme often found in these secondary metabolite pathways are well represented, including the cytochrome P450s (IPR001128, 159 in apple) and transferase classes (IPR003480, 62 in apple; IPR004159, 59 in apple; Table VI), for example. One hypothesis to explain how fruit have evolved the ability to produce these compounds is by duplicating these biosynthetic genes (Ohno, 1970), with the duplicate being co-opted by a new biosynthetic pathway that operates in fruit. In addition, regulatory genes would also be required to coordinate the expression of these new genes and potentially entire new biosynthetic pathways (Grotewold, 2005). We have tested the hypothesis that biosynthetic genes and regulatory genes have duplicated within the lineage leading to apple more often for genes expressed in fruit tissues compared with genes not expressed in fruit tissues. Because our sample of apple genes is not

complete, we can only infer duplications relative to a plant for which the full complement of genes is known (e.g. Arabidopsis). We find evidence for more frequent duplications of genes expressed in fruit tissues than those that are not expressed in fruit. An explanation for these results may be that we have sampled genes preferentially from fruit libraries. However, less than one-half the apple ESTs are derived from fruit libraries. Another possible explanation is that there has been gene loss in the lineages leading to the Arabidopsis genes. Given these limitations, however, it will be interesting in the future to determine just how many members of these gene families are involved in compound biosynthesis and whether gene family expansion has played an important role in allowing apple fruit to produce these compounds (Schwab, 2003) and how different biosynthetic genes have been recruited to make new secondary compounds (Ober, 2005).
There are two extreme types of gene duplication that could have given rise to the phylogenetic pattern we observe: either whole-genome duplication or multiple local duplication events. A palaeopolyploidy event has been predicted in the origin of the Maloideae (Lespinasse et al., 2000) that may well have provided an opportunity to expand the set of secondary metabolite genes and their regulators. In Arabidopsis, however, Maere et al. (2005) observed that regulatory proteins, such as transcription factors and signal transduction proteins that duplicated by palaeopolyploidy, tended to be retained, whereas these genes appear to be more rapidly lost when derived by segmental duplication. They estimate that these whole-genome duplication events are responsible for about 90% of the transcription factors in higher plants. Indeed, the rate of transcription factor evolution in plants is thought to be higher than in animals (Shiu et al., 2005). This contrasts with proteins involved in secondary metabolism or abiotic stress, where those derived by small segmental duplications tended to be retained, whereas those derived by whole-genome duplications were more rapidly lost (Maere et al., 2005). If this pattern of evolution is conserved in apple, we would expect many of the genes in secondary metabolism to be derived from segmental duplication events and therefore to be clustered.
In summary, we present an extensive set of ESTs representing what we predict is approximately one-half of the expressed genes from apple. The dataset contains SSR and SNP markers that will be useful for breeding, as well as many genes that can be tested directly for their roles in various crop traits. This gene set is also forming the basis for a microarray for apple that is being used in further experiments to identify genes encoding biosynthetic enzymes and their regulators.
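
The SSR markers referred to here were detected by searching for tandem repetition of sequence words, with motifs such as AG, GA, CT, and TC grouped into a single repeat class (see Materials and Methods). A minimal Python sketch of that idea follows. It is not the BioPipe PERL program itself, and the minimum repeat counts, function names, and test sequence are assumptions made for illustration only.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def repeat_class(unit: str) -> str:
    """Canonical class label: the alphabetically smallest string among all
    rotations of the unit and of its reverse complement (AG, GA, CT, TC -> AG)."""
    candidates = []
    for s in (unit, reverse_complement(unit)):
        candidates.extend(s[i:] + s[:i] for i in range(len(s)))
    return min(candidates)


def find_ssrs(seq: str, min_repeats=None):
    """Return (start, unit, repeat_count, class) for di-, tri-, and
    tetranucleotide repeats. Thresholds are hypothetical defaults."""
    if min_repeats is None:
        min_repeats = {2: 6, 3: 4, 4: 3}  # assumed minimum repeat counts
    seq = seq.upper()
    hits = []
    for unit_len, n in min_repeats.items():
        # A unit of the given length followed by at least n - 1 further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            unit = m.group(2)
            if len(set(unit)) == 1:
                continue  # homopolymer run, not an SSR of this unit length
            if unit_len == 4 and unit[:2] == unit[2:]:
                continue  # e.g. GAGA is already reported as a GA repeat
            count = len(m.group(1)) // unit_len
            hits.append((m.start(), unit, count, repeat_class(unit)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_ssrs("TTT" + "GA" * 8 + "CCC"))
```

Reducing each motif to one canonical label means that a repeat is counted in the same class whichever strand or phase it was read in, which is the grouping described for the repeat-class frequencies.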

MATERIALS AND METHODS

Library Construction and EST Sequencing

Tissues were collected from apples at HortResearch sites in Auckland and Havelock North in New Zealand over two seasons (see Table I for details of tissues, cultivar, and treatments). Total RNA was extracted from apple tissues by the method of either Lopez-Gomez and Gomez-Lim (1992) or Chang et al. (1993). Messenger RNA was isolated from total RNA by passage through oligo(dT)-cellulose columns (Amersham Biosciences), and either phage (Zap cDNA synthesis kit and Zap-cDNA gigapack III gold cloning kit; Stratagene) or plasmid cDNA libraries (Superscript system for cDNA synthesis and cloning; Invitrogen) were constructed. Normalized libraries were produced essentially as described in normalization method 4 from Bonaldo et al. (1996), with the following modifications. Single-stranded DNA (ssDNA) was prepared from an apple (cv Pinkie) leaf library by production of M13 phage from the library and isolation of phage DNA, as described by Sambrook et al. (1989), and ssDNA and double-stranded DNA was selected using QiaexII resin, according to the manufacturer's instructions. ssDNA was hybridized with PCR-amplified driver DNA before isolation of rare ssDNA using hydroxyapatite chromatography. Rare ssDNA was made double stranded and transformed into DH10B using electroporation, as described by Bonaldo et al. (1996). Plasmids from the phage cDNA libraries were mass excised, according to the manufacturer's recommendations (Stratagene). Plasmid extractions were then undertaken on individual bacterial colonies of either the phage-derived or the plasmid-derived cDNA libraries and the corresponding cDNA inserts sequenced from the 5' end. Big Dye Terminator sequencing reactions were resolved predominantly on ABI377, ABI3100, or ABI3700 sequencers, according to the manufacturer's instructions (Applied Biosystems).

For determination of the complete sequence of cDNA clones, M13R and M13F or T3 and T7 primers were used for 5' confirmatory resequencing and 3' end sequencing. In situations where EST clones had long poly(A) tails (generally >40 nucleotides) and therefore failed to yield good-quality sequence with standard sequencing primers, an anchored T24VN primer was used. Resulting sequences were edited manually and assembled using Sequencher software, version 4.0.5 (GeneCodes). Sequencing progress for each cDNA library was assessed manually for clone length and sequence quality. Decisions were made on the depth a library was sequenced to based on the levels of predicted sequence redundancy. This resulted in libraries made from meristematic tissues being sequenced to greater depths than libraries made from other tissues (see Table I).

Bioinformatics

EST sequences were automatically trimmed of vector, adapter, and low quality sequence regions, and uploaded to a relational database. Automatic annotation was performed using the HortResearch BioPipe sequence annotation pipeline (a cluster-based annotation system written in PERL [R.N. Crowhurst, unpublished data]) and utilizing a relational database (MySQL; http://www.mysql.com). The EST clustering phase was performed using The Institute for Genomic Research (TIGR) gene indices clustering tools (http://www.tigr.org/tdb/tgi/software). The representation of protein families, domains, and functional sites within the apple NR sequences was determined using Inter-ProScan. The proteome for Arabidopsis (Arabidopsis thaliana) was obtained from The Arabidopsis Information Resource (TAIR; http://Arabidopsis.org; Garcia-Hernandez et al., 2002), and comparisons to proteins from Arabidopsis using BLASTx (Altschul et al., 1990) were used to identify apple NR sequences with similarity to Arabidopsis proteins. These apple NR sequences were then categorized into 21 functional categories based on functional annotations available for Arabidopsis proteins following MIPS (http://mips.gsf.de) FunCat schema (Ruepp et al., 2004). Apple sequences encoding enzymes involved in secondary metabolite biosynthetic pathways were identified by BLASTx (e-05 cutoff) using the Protein Information Resource (PIR) NREF database (Wu et al., 2003).

Detection of SSRs was undertaken using the PERL program within BioPipe that identified tandem repetition of sequence words in target sequences. SSRs were characterized by repeat type (di-, tri-, or tetranucleotide repeat units), repeat length, and position. For the purpose of reporting the frequency of repeat classes, different di- and trinucleotide sequences were combined by type; for example, AG repeats also encompassed repeats identified as GA and their complementary sequences CT or TC repeats. The repeat motifs combined are described in detail in Table II.

Prediction of SNPs and insertion/deletions and sequencing errors was performed using PERL scripts that parsed the output of contig sequences generated by the CAP3 (Huang and Madan, 1999) sequence assembly program running as part of the TIGR gene indices clustering tools.

Codon usage tables were derived from sequences of cDNAs encoding predicted full-length proteins. Clones were predicted to be full length only if they started with an ATG codon at a similar position to that of other plant genes (or have an in-frame stop codon upstream of the putative ATG) and end with an in-frame stop codon at a position equivalent to that of other plant genes. Codon usage was calculated from sequences using the CUSP program implemented within EMBOSS (Rice et al., 2000).

Members of gene families from Arabidopsis were extracted from GenBank and compared with predicted full-length family members from apple. Alignments and trees were constructed using ClustalX (version 1.81) using the default settings (Thompson et al., 1997). Tree View (version 1.6.6) was used to display resulting trees (Page, 1996).

Sequence data from this article can be found in the GenBank/EMBL data libraries under accession numbers CN848772 to CN851520, CN851527 to CN852114, CN854524 to CN860109, CN860111 to CN861528, CN861730 to CN862087, CN862091 to CN865258, CN865263 to CN870966, CN870969 to CN875894, CN875896 to CN881602, CN881608 to CN881609, CN881619 to CN884429, CN884434 to CN886998, CN887004 to CN890357, CN890361 to CN890409, CN890413 to CN896142, CN896144 to CN900284, CN900286 to CN901293, CN901299 to CN906863, CN906869 to CN907638, CN907715 to CN914192, CN914230 to CN914912, CN916097 to CN920835, CN920840 to CN925026, CN925028 to CN925934, CN925939 to CN929310, CN929396 to CN932721, CN932727 to CN933610, CN933676 to CN937515, CN937517 to CN943462, CN943466 to CN949201, CN949206 to CN949208, CN949216 to CN949629, CV126090 to CV126104, CV126106 to CV126115, DR033885 to DR033893, EB105831 to EB157590, and EB175250 to EB178034.

ACKNOWLEDGMENTS

We thank Robert Simpson, Dave Greenwood, Maysoon Rasam, Matt Templeton, and Ross Atkinson for their work on gene annotation systems; David Chagne for advice on SNPs; Colm Carraher and Tim Holmes for graphics; and Ian Ferguson and Richard Forster for support. The ESTs reported in this article were sequenced at Genesis Research and Development Corporation, Auckland, New Zealand.

Received January 11, 2006; revised February 21, 2006; accepted February 22, 2006; published March 10, 2006.

LITERATURE CITED

Adams MD, Soares MB, Kerlavage AR, Fields C, Venter JC (1993) Rapid cDNA sequencing (expressed sequence tags) from a directionally cloned human infant brain cDNA library. Nat Genet 4: 373-380
Adams-Phillips L, Barry C, Giovannoni J (2004) Signal transduction systems regulating fruit ripening. Trends Pharmacol Sci 9: 331-338
Aharoni A, Keizer LCP, Bouwmeester HJ, Sun Z, Alvarez-Huerta M, Verhoeven HA, Blaas J, van Houwelingen AMML, De Vos RCH, van der Voet H, et al (2000) Identification of the SAAT gene involved in strawberry flavor biogenesis by use of DNA microarrays. Plant Cell 12: 647-662
Aharoni A, O'Connell AP (2002) Gene expression analysis of strawberry achene and receptacle maturation using DNA microarrays. J Exp Bot 53: 2073-2087
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403-410
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-820
Arumuganathan K, Earle ED (1991) Nuclear DNA content of some important plant species. Plant Mol Biol Rep 9: 208-218
Bengtsson M, Backman A-C, Liblikas I, Ramirez MI, Borg-Karlson A-K, Ansebo L, Anderson P, Lofqvist J, Witzgall P (2001) Plant odor analysis of apple: antennal response of codling moth females to apple volatiles during phenological development. J Agric Food Chem 49: 3736-3741
Ben-Yehudah G, Korchinsky R, Redel G, Ovadya R, Oren-Shamir M, Cohen Y (2005) Colour accumulation patterns and the anthocyanin biosynthetic pathway in 'Red Delicious' apple variants. J Hortic Sci Biotechnol 80: 187-192
Bieleski RL (1982) Sugar alcohols. In FA Loewus, W Tanner, eds, Plant Carbohydrates I. Intracellular Carbohydrates. Encyclopaedia of Plant Physiology New Series, Vol 13A. Springer, New York, pp 158-192


Bonaldo MF, Lennon G, Soares MB (1996) Normalization and subtraction: two approaches to facilitate gene discovery. Genome Res 6: 791-806
Boyer J, Liu RH (2004) Apple phytochemicals and their health benefits. Nutr J 3: 5
Challice JS (1974) Rosaceae chemotaxonomy and the origin of the Pomoideae. Bot J Linn Soc 69: 239-259
Chang S, Puryear J, Cairney J (1993) A simple and efficient method for isolating RNA from pine trees. Plant Mol Biol Rep 11: 113-116
Chen G, Hackett R, Walker D, Taylor A, Lin Z, Grierson D (2004) Identification of a specific isoform of tomato lipoxygenase (TomloxC) involved in the generation of fatty acid-derived flavor compounds. Plant Physiol 136: 2641-2651
Cheng L, Zhou R, Reidel EJ, Sharkey TD, Dandekar AM (2005) Antisense inhibition of sorbitol synthesis leads to up-regulation of starch synthesis without altering CO2 assimilation in apple leaves. Planta 220: 767-776
Dandekar AM, Teo G, Defilippi BG, Uratsu SL, Passey AJ, Kader AA, Stow JR, Colgan RJ, James DJ (2004) Effect of down-regulation of ethylene biosynthesis on fruit flavour complex in apple fruit. Transgenic Res 13: 373-384
Defilippi BG, Dandekar AM, Kader AA (2005a) Relationship of ethylene biosynthesis to volatile production, related enzymes, and precursor availability in apple peel and flesh tissues. J Agric Food Chem 53: 3133-3141
Defilippi BG, Kader AA, Dandekar AM (2005b) Apple aroma: alcohol acyltransferase, a rate limiting step for ester biosynthesis, is regulated by ethylene. Plant Sci 168: 1199-1210
Dimick PS, Hoskin JC (1983) Review of apple flavor - state of the art. Crit Rev Food Sci Nutr 4: 387-409
Epplen JT, Kyas A, Maueler W (1996) Genomic simple repetitive DNAs are targets for differential binding of nuclear proteins. FEBS Lett 389: 92-95
Fei ZJ, Tang X, Alba RM, White JA, Ronning CM, Martin GB, Tanksley SD, Giovannoni JJ (2004) Comprehensive EST analysis of tomato and comparative genomics of fruit ripening. Plant J 40: 47-59
Fellman JK, Miller TW, Mattinson DS, Mattheis JP (2000) Factors that influence biosynthesis of volatile flavor compound in apple fruits. HortScience 35: 1026-1033
Ferree DC, Carlson RF (1987) Apple rootstocks. In RC Rom, RF Carlson, eds, Rootstocks for Fruit Crops. John Wiley & Sons, New York, pp 107-144
Finnegan EJ, Genger RK, Peacock WJ, Dennis ES (1998) DNA methylation in plants. Annu Rev Plant Physiol Plant Mol Biol 49: 223-247
Garcia-Hernandez M, Berardini TZ, Chen G, Crist D, Doyle A, Huala E, Knee E, Lambrecht M, Miller N, Mueller LA, et al (2002) TAIR: a resource for integrated Arabidopsis data. Funct Integr Genomics 2: 239-253
Giovannoni J (2001) Molecular biology of fruit maturation and ripening. Annu Rev Plant Physiol 52: 725-749
Goes da Silva F, Iandolino A, Al-Kayal F, Bohlmann MC, Cushman MA, Lim H, Ergül A, Figueroa R, Kabuloglu EK, Osborne C, et al (2005) Characterizing the grape transcriptome. Analysis of expressed sequence tags from multiple Vitis species and development of a compendium of gene expression during berry development. Plant Physiol 139: 574-597
Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, et al (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296: 92-100
Grimplet J, Romieu C, Audergon J-M, Marty I, Albagnac G, Lambert P, Bouchet J-P, Terrier N (2005) Transcriptomic study of apricot fruit (Prunus armeniaca) ripening among 13,006 expressed sequence tags. Physiol Plant 125: 281-292
Grotewold E (2005) Plant metabolic diversity: a regulatory perspective. Trends Plant Sci 10: 57-62
Guilford PS, Prakash S, Zhu JM, Rikkerink E, Gardiner S, Bassett H, Forster R (1997) Microsatellites in Malus X domestica (apple): abundance, polymorphism and cultivar identification. Theor Appl Genet 94: 249-254
Harker FR, Gunson FA, Jaeger SR (2003) The case for fruit quality: an interpretive review of consumer attitudes, and preferences for apples. Postharvest Biol Technol 28: 333-347
Hellens RP, Allan A, Friel E, Bolitho K, Grafton K, Templeton MD, Karunairetnam S, Gleave AP, Laing WA (2005) Transient expression vectors for functional genomics, quantification of promoter activity and RNA silencing in plants. Plant Methods 1: 13
Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9: 868-877
Iglesias AR, Kindlund E, Tammi M, Wadelius C (2004) Some microsatellites may act as novel polymorphic cis-regulatory elements through transcription factor binding. Gene 341: 149-165
International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436: 793-800
Jacobsen SE, Sakai H, Finnegan EJ, Cao X, Meyerowitz EM (2000) Ectopic hypermethylation of flower-specific genes in Arabidopsis. Curr Biol 10: 179-186
Jander G, Norris SR, Rounsley SD, Bush DF, Levin IM, Last RL (2002) Arabidopsis map-based cloning in the post-genome era. Plant Physiol 129: 440-450
Ju Z, Curry E (2000) Lovastatin inhibits α-farnesene synthesis without affecting ethylene production during fruit ripening in 'Golden Supreme' apples. J Am Soc Hortic Sci 215: 105-110
Kim S-H, Lee J-R, Hong S-T, Yoo Y-K, An G, Kim S-R (2003) Molecular cloning and analysis of anthocyanin biosynthesis genes preferentially expressed in apple skin. Plant Sci 165: 403-413
Knee M (1993) Pomme fruits. In GB Seymour, JE Taylor, GA Tucker, eds, Biochemistry of Fruit Ripening. Chapman and Hall, London, pp 325-346
Lespinasse D, Grivet L, Troispoux V, Rodier-Goud M, Pinard F, Seguin M (2000) Identification of QTLs involved in the resistance to South American leaf blight (Microcyclus ulei) in the rubber tree. Theor Appl Genet 100: 975-984
Loescher WH, Marlow GC, Kennedy RA (1982) Sorbitol metabolism and source-sink interconversions in developing apple leaves. Plant Physiol 70: 335-339
Lopez-Gomez R, Gomez-Lim MA (1992) A method for extraction of intact RNA from fruits rich in polysaccharides using ripe mango mesocarp. HortScience 27: 440-442
Ma CM, Cai SQ, Cui JR, Wang RQ, Tu PF, Masao H, Mohsen D (2005) The cytotoxic activity of ursolic acid derivatives. Eur J Med Chem 40: 582-589
Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M, Van de Peer Y (2005) Modeling gene and genome duplications in eukaryotes. Proc Natl Acad Sci USA 102: 5454-5459
McGhie TK, Hunt M, Barnett LE (2005) Cultivar and growing region determine the antioxidant polyphenolic concentration and composition of apples grown in New Zealand. J Agric Food Chem 53: 3065-3070
McKeon T, Yang SF (1987) Biosynthesis and metabolism of ethylene. In PJ Davies, ed, Plant Hormones and Their Role in Plant Growth and Development. Martinus Nijhoff, Boston, pp 94-112
Morgante M, Hanafey M, Powell W (2002) Microsatellites are preferentially associated with non repetitive DNA in plant genomes. Nat Genet 30: 194-200
Moser C, Segala C, Fontana P, Salakhudtinov I, Gatto P, Pindo M, Zyprian E, Toepfer R, Grando MS, Velasco R (2005) Comparative analysis of expressed sequence tags from different organs of Vitis vinifera L. Funct Integr Genomics 5: 208-217
Moyle R, Fairbairn DJ, Ripi J, Crowe M, Botella JR (2005) Developing pineapple fruit has a small transcriptome dominated by metallothionein. J Exp Bot 56: 101-112
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman A, Binns D, Biswas M, Bradley P, Bork P, et al (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res 31: 315-318
Nakamura Y, Gojobori T, Ikemura T (2000) Codon usage tabulated from the international DNA sequence databases: status for the year 2000. Nucleic Acids Res 28: 292
Negm FB, Loescher WH (1981) Characterization of aldose 6-phosphate reductase (alditol 6-phosphate: NADP 1-oxidoreductase) from apple leaves. Plant Physiol 67: 139-142
Ober D (2005) Seeing double: gene duplication and diversification in plant secondary metabolism. Trends Plant Sci 10: 444-449
Ohno S (1970) Evolution by Gene Duplication. Springer-Verlag, New York
Page RDM (1996) Treeview: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12: 357-358
Pechous SW, Watkins CB, Whitaker BD (2005) Expression of α-farnesene synthase gene AFS1 in relation to levels of α-farnesene and conjugated trienols in peel tissue of scald-susceptible 'Law Rome' and scald-resistant 'Idared' apple fruit. Postharvest Biol Technol 35: 125-132
Pechous SW, Whitaker BD (2004) Cloning and functional expression of an (E,E)-α-farnesene synthase cDNA from peel tissue of apple fruit. Planta 219: 84-94


Rafalski A (2002) Applications of single nucleotide polymorphisms in crop genetics. Curr Opin Plant Biol 5: 94-100
Rice P, Longden I, Bleasby A (2000) EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet 16: 276-277
Riechmann JL, Heard J, Martin GLR, Jiang C, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR, Creelman R, et al (2000) Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science 290: 2105-2110
Rowan DD, Allen JM, Fielder S, Hunt MB (1999) Biosynthesis of straight-chain ester volatiles in Red Delicious and Granny Smith apples using deuterium-labeled precursors. J Agric Food Chem 47: 2553-2562
Rowan DD, Lane HP, Allen JM, Fielder S, Hunt MB (1996) Biosynthesis of 2-methylbutyl, 2-methyl-2-butenyl, and 2-methylbutanoate esters in red delicious and Granny Smith apples using deuterium-labeled substrates. J Agric Food Chem 44: 3276-3285
Ruepp A, Zollner A, Maier D, Albermann K, Hani J, Mokrejs M, Tetko I, Guldener U, Mannhaupt G, Munsterkotter M, et al (2004) The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Res 32: 5539-5545
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Sangwan I, O'Brian MR (2002) Identification of a soybean protein that interacts with GAGA element dinucleotide repeat DNA. Plant Physiol 129: 1788-1794
Schwab W (2003) Metabolome diversity: too few genes, too many metabolites? Phytochemistry 62: 837-849
Shiu S-H, Shih M-C, Li W-H (2005) Transcription factor families have much higher expansion rates in plants than in animals. Plant Physiol 139: 18-26
Souleyre EJF, Greenwood DR, Friel EN, Karunairetnam S, Newcomb RD (2005) An alcohol acyl transferase from apple (cv. Royal Gala), MpAAT1, produces esters involved in apple fruit flavour. FEBS J 272: 3123-3144
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 24: 4876-4882
Van der Hoeven R, Ronning C, Giovannoni J, Tanksley MG (2002) Deductions about the number, organization, and evolution of genes in the tomato genome based on analysis of a larger expressed sequence tag collection and selective genomic sequencing. Plant Cell 14: 1441-1456
Watari J, Kobae Y, Yamaki S, Yamada K, Toyofuku K, Tabuchi T, Shiratake K (2004) Identification of sorbitol transporters expressed in the phloem of apple source leaves. Plant Cell Physiol 45: 1031-1041
Wolfe K, Wu XZ, Liu RH (2003) Antioxidant activity of apple peels. J Agric Food Chem 51: 609-614
Wu CH, Yeh L-SL, Huang H, Arminski L, Castro-Alvear J, Chen Y, Hu Z, Kourtesis P, Ledley RS, Suzek BE, et al (2003) The Protein Information Resource. Nucleic Acids Res 31: 345-347
Yahia EM (1994) Apple flavour. Hortic Rev (Am Soc Hortic Sci) 16: 197-234
Young H, Gilbert JM, Murray SH, Ball RD (1996) Causal effects of aroma compounds on Royal Gala apple flavours. J Sci Food Agric 71: 329-336
Young JC, Chu CLG, Lu X, Zhu H (2004) Ester variability in apple varieties as determined by solid-phase microextraction and gas chromatography mass spectrometry. J Agric Food Chem 52: 8086-8093
Zdobnov EM, Apweiler R (2001) InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17: 847-878
Zhang L, Yuan D, Yu S, Li Z, Cao Y, Miao Z, Qian H, Tang K (2004) Preference of simple sequence repeats in coding and non-coding regions of Arabidopsis thaliana. Bioinformatics 20: 1081-1086
Zhou R, Cheng L, Wayne R (2003) Purification and characterization of sorbitol-6-phosphate phosphatase from apple leaves. Plant Sci 165: 227-232
Zohary D, Hopf M (2000) Domestication of Plants in the Old World, Ed 3. Oxford University Press, New York
