
Short Report

New primers for amplifying and sequencing the mitochondrial ND4/ND5 gene region of the Cypriniformes (Actinopterygii: Ostariophysi)
Masaki Miya1*, Kenji Saitoh2, Robert Wood3, Mutsumi Nishida4, and Richard L. Mayden3

1 Department of Zoology, Natural History Museum & Institute, Chiba, 955-2 Aoba-cho, Chuo-ku, Chiba 260-8682, Japan (e-mail: miya@chiba-muse.or.jp)
2 Tohoku National Fisheries Research Institute, 3-27-5 Shinhama, Shiogama, Miyagi 985-0001, Japan
3 Department of Biology, St. Louis University, 357 Laclede Ave., St. Louis, MO 63103-2010, USA
4 Ocean Research Institute, University of Tokyo, 1-15-1 Minamidai, Nakano-ku, Tokyo 164-8639, Japan

Received: February 9, 2004 / Revised: June 10, 2005 / Accepted: July 4, 2005

Abstract We provide 15 new primers for amplifying and sequencing the mitochondrial ND4/ND5 gene region of the Cypriniformes in an attempt to resolve relationships of this diverse group of freshwater fishes with extensive taxonomic sampling. Sequences from this region have the following desirable characteristics for phylogenetic analyses, some of which are lacking from the more commonly used cyt b and 12S/16S rRNA genes: they are (1) easy to align, (2) relatively long (ca. 3.4 kb), and (3) contain more phylogenetically informative variation at 1st and 2nd codon positions. Moreover, the ND4/ND5 gene region is easy to amplify and sequence when employing the protocol suggested herein.

Key words Cypriniformes Tree of Life (CToL) · Molecular systematics · Phylogeny

Ichthyological Research
The Ichthyological Society of Japan 2006

Ichthyol Res (2006) 53: 75–81 DOI 10.1007/s10228-005-0303-5

Despite repeated cautions against the use of sequences from the mitochondrial cyt b and 12S/16S rRNA genes in molecular systematic studies at higher taxonomic levels (Meyer, 1994; Ortí and Meyer, 1997; Farias et al., 2001), these genes continue to be used as standard phylogenetic markers in many fish groups and have been sequenced more often than all other mitochondrial genes combined (82% vs 18%; Table 1). The widespread use of these genes derives essentially from the early availability of polymerase chain reaction (PCR) primers that allowed partial sequences to be easily generated (Kocher et al., 1989; Palumbi, 1996), not from any inherent superiority in their performance in phylogenetic analyses over other genes (Miya and Nishida, 2000).

Miya and Nishida (1999) developed a method for sequencing the whole mitochondrial genome (mitogenome), employing a long-PCR technique and many fish-versatile primers. This technical breakthrough facilitated the use of the much longer sequences necessary to resolve phylogenetic questions in fishes, and whole mitogenome sequences have been empirically demonstrated to be useful phylogenetic markers at various taxonomic levels (e.g., see Miya et al., 2005 and references therein). With limited time and resources, however, using whole mitogenome sequences to resolve intrarelationships of a speciose, monophyletic group of fishes is unrealistic when the phylogenetic question at hand involves several hundreds to thousands of species. A more practical approach to working with so many taxa is to use sequences from genes with better phylogenetic performance than commonly used genes such as cyt b and 12S/16S rRNA. Unfortunately, limited attention has been given to bioinformatic exercises that explore questions of this nature to locate highly efficient genes for specific phylogenetic or evolutionary questions. Consequently, few alternatives to these three standard genes exist, irrespective of the phylogenetic questions or the taxonomic groups being examined.

Recently, a large international project called the Cypriniformes Tree of Life (CToL) (http://cypriniformes.org) was initiated in an attempt to resolve interrelationships of this largest clade of freshwater fishes, with roughly 3344 described species in 281 genera and five (or six) families (FishBase; http://www.fishbase.org). One of the objectives of this project is to reconstruct relationships of 1000 targeted species, a goal that may be perceived as not only difficult but possibly unattainable using currently available gene systems. However, we argue that the ND4/ND5 gene region is an outstanding molecular marker for resolving taxon relationships at this scale. In this article, we provide 15 new primers specifically designed for the cypriniforms and suggest a protocol for sequencing this region using these primers.
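The 82% vs 18% split quoted above can be checked directly against the counts in Table 1. The following is a minimal sketch of that arithmetic; the variable names are ours, and the only data used are the per-gene counts listed in Table 1.

```python
# Sequence counts per gene, copied from Table 1
# (actinopterygian mtDNA sequences in DDBJ as of December 25, 2004).
counts = {
    "Cyt b": 10938, "16S rRNA": 4251, "12S rRNA": 3475, "ND4": 607,
    "COI": 517, "ND5": 408, "ATPase 6": 378, "ND6": 374, "ND4L": 368,
    "ATPase 8": 359, "ND1": 269, "ND3": 248, "COIII": 202, "COII": 202,
    "ND2": 177,
}

total = sum(counts.values())

# Share of the three "standard" genes versus everything else.
standard_genes = ("Cyt b", "16S rRNA", "12S rRNA")
standard_share = 100 * sum(counts[g] for g in standard_genes) / total
other_share = 100 - standard_share

# Share of the two genes proposed here as an alternative marker.
nd4_nd5_share = 100 * (counts["ND4"] + counts["ND5"]) / total

print(f"total sequences: {total}")
print(f"cyt b + 12S + 16S: {standard_share:.1f}%")
print(f"all other genes:   {other_share:.1f}%")
print(f"ND4 + ND5:         {nd4_nd5_share:.1f}%")
```

The counts sum to the 22,773 sequences given in the table caption, and ND4 and ND5 together account for only a few percent of the deposited sequences.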

Selection of Genes
In the selection of phylogenetically useful genes we argue that the following four criteria must be met. They should (1) be easy to align, (2) contain considerable phylogenetically informative variation, (3) be of sufficient length (e.g., >3.0 kb), and (4) be easy to amplify and sequence. For the rRNA genes (12S and 16S), assessment of positional homology should minimally be based on secondary structure of molecules; this requires manual adjustments to alignments with reference to the suggested secondary structure models (Miya and Nishida, 1998). As such, one is faced with multiple alignments for a vast number of sequences from these two genes, an extremely time-consuming effort and a problem that will very likely lead to some degree of

Table 1. Approximate representation of genes among 22,773 actinopterygian mtDNA sequences deposited in DDBJ (as of December 25, 2004)

Gene^a       No.       Percent (%)
Cyt b        10,938    48.0
16S rRNA      4,251    18.7
12S rRNA      3,475    15.3
ND4             607     2.7
COI             517     2.3
ND5             408     1.8
ATPase 6        378     1.7
ND6             374     1.6
ND4L            368     1.6
ATPase 8        359     1.6
ND1             269     1.2
ND3             248     1.1
COIII           202     0.9
COII            202     0.9
ND2             177     0.8

a 12S and 16S rRNA, 12S and 16S ribosomal RNAs; ND1–6, 4L, NADH dehydrogenase subunits 1–6, 4L; COI–III, cytochrome c oxidase subunits I–III; ATPase 6 and 8, ATPase subunits 6 and 8; Cyt b, cytochrome b

erroneous assessment of positional homology. Furthermore, as Ortí and Meyer (1997) convincingly demonstrated, these two genes do not contain sufficient information for resolving the higher-level relationships in their target group, the Characiformes, a similarly diverse group of freshwater fishes that, like the Cypriniformes, is placed in the Ostariophysi. Accordingly, these two genes do not satisfy the first two requirements mentioned above and should not be considered further.

Multiple alignment of the 13 mitochondrial protein-coding genes is relatively easy and straightforward because triplets code for amino acids. Interestingly, patterns of amino acid variation differ among genes. Pairwise comparisons of aligned amino acids from seven published cypriniform mitogenome sequences indicate that cyt b and COI–III are the four most conservative genes, whereas the other nine genes (ATPase 6/8, ND1–6/4L) are notably more variable between taxa (Table 2). Thus, the former four genes (cyt b, COI–III) accumulate most nucleotide substitutions at the 3rd codon position, a pattern of gene evolution that is likely to result in a high level of homoplasy in sequence comparisons among distantly related species, such as those from different genera, tribes, subfamilies, or families. The other nine genes (ATPase 6/8, ND1–6/4L) are, on the other hand, relatively variable in amino acids, indicating that these genes accumulate more informative variation at 1st and 2nd codon positions.

Sequence length is also becoming an increasingly important attribute in evaluating genes for reconstructing higher-level relationships. With longer sequences the historical signal is expected to predominate over noise, because the former is additive (synapomorphies support the same correct tree) and the latter is random (homoplasies do not collectively support any particular tree). While we are not sure how much sequence is required, sequences of about 3 kb would be a good starting point considering the number of taxa involved (several hundreds to thousands of species)

Table 2. Mean pairwise amino acid differences of the 13 mitochondrial genes from seven cypriniform species published in Saitoh et al. (2003)

Rank   Gene^a      Length (bp)^b   Mean pairwise difference (range)^c
1      ATPase 8     165            0.269 (0.000–0.407)
2      ND5         1842            0.217 (0.059–0.371)
3      ND2         1044            0.199 (0.083–0.267)
4      ND6          519            0.180 (0.017–0.320)
5      ND3          348            0.123 (0.017–0.224)
6      ND1          972            0.107 (0.019–0.198)
7      ND4         1380            0.103 (0.026–0.154)
8      ND4L         298            0.099 (0.000–0.235)
9      ATPase 6     681            0.087 (0.040–0.150)
10     Cyt b       1140            0.083 (0.018–0.132)
11     COII         690            0.065 (0.026–0.109)
12     COIII        783            0.043 (0.004–0.092)
13     COI         1548            0.031 (0.008–0.052)

a For abbreviations of genes, see footnote of Table 1
b Calculated from length of aligned amino acid sequences
c Uncorrected pairwise amino acid differences

Fig. 1. Ostariophysan relationships based on the 50% majority rule consensus tree of the 810 pooled trees from the three independent Bayesian analyses of the cyt b gene (A) and ND4/ND5 gene region (B). Numerals beside internal branches indicate Bayesian posterior probabilities (shown as percentages). Data are cited from Saitoh et al. (2003)


and the limited resources in a given project. In addition to the length of genes, the ease with which they can be amplified and sequenced is also very important. Given the numerous mitogenomes that we have generated for species in the Cypriniformes, we have evaluated all the genes for these properties and we unequivocally selected the ND4 and ND5 genes. These two genes have the greatest length among the nine most variable protein-coding genes (Table 2) and are adjacently located between two relatively conservative tRNA genes (tRNA-Arg and tRNA-Glu). Thus, we can easily purify (or amplify) this region using a long-PCR technique, and the products may be used as a template for subsequent, full-nested short PCRs, as in the whole mitogenome sequencing strategy (Miya and Nishida, 1999).
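The statistic reported in Table 2 is the uncorrected pairwise amino acid difference, summarized as a mean and range over all species pairs. As a minimal sketch of how such values can be computed from an alignment, the code below uses a toy three-sequence alignment that is purely hypothetical; skipping gapped columns is our assumption, since the treatment of gaps is not stated for Table 2.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Columns with a gap in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def summarize(alignment: Dict[str, str]) -> Tuple[float, float, float]:
    """Return (mean, minimum, maximum) p-distance over all sequence pairs."""
    distances = [
        p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    ]
    return sum(distances) / len(distances), min(distances), max(distances)


# Hypothetical toy alignment of amino acids, for illustration only.
toy_alignment = {
    "species_1": "MTLIHFSF",
    "species_2": "MTLIHFSL",
    "species_3": "MSLVHF-L",
}

mean_d, min_d, max_d = summarize(toy_alignment)
print(f"mean {mean_d:.3f} (range {min_d:.3f}-{max_d:.3f})")
```

Applied gene by gene to the seven cypriniform mitogenomes, a calculation of this kind would yield one row of Table 2 per gene, which can then be ranked by mean difference.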

Phylogenetic Performance
We compared the phylogenetic performance of the cyt b (1140 bp) and ND4/ND5 (3408 bp including the three intervening tRNA genes) gene sequences from 15 species of the Ostariophysi published in Saitoh et al. (2003). Note that this is a gross comparison between the two gene regions and is not standardized for gene length. Sequences were aligned manually, and nucleotide positions that include gaps and stop codons were eliminated from the analysis. Partitioned Bayesian analyses were conducted using MrBayes 3.04b (Ronquist and Huelsenbeck, 2003). We set three partitions (1st, 2nd, and 3rd codon positions) for protein-coding genes from the two data sets. For the ND4/ND5 data set, an additional partition was set for the three tRNA genes. MrModeltest (http://www.ebc.uu.se/systzoo/staff/nylander.html) selected GTR + I + G and GTR + G as the best-fit models of nucleotide substitution for the 1st/2nd/3rd codon positions and tRNAs, respectively. We assumed that all the model parameters were unlinked and the rate multipliers were variable across partitions. The Markov chain Monte Carlo (MCMC) process was set so that four chains (three heated and one cold) ran simultaneously. We conducted three independent runs for each data set and continued the runs for 3.0 × 10^5 cycles, with 1 in every 1000 trees being sampled. Parameters of the model of sequence evolution for the two data sets were in excellent agreement after reaching stationarity (30,000 cycles), and the first 31 trees were discarded as burn-in. Posterior probabilities were calculated from the 810 trees pooled from the three independent runs.

Figure 1 shows 50% majority rule consensus trees of the 810 pooled trees for the two data sets. The ND4/ND5 gene tree (Fig. 1B) is better resolved, and its internal branches are supported by higher posterior probabilities, than the gene tree derived from cyt b sequences of the same organisms (Fig. 1A). In the cyt b gene tree, the Cypriniformes is not reproduced as a monophyletic group because the four representatives of Cyprinidae are shown as more closely related to members of Characiphysi than to the three representatives of Cobitidae + Balitoridae. On the other hand, the ND4/ND5 gene tree is fully bifurcated and confidently supports monophyly of the Characiphysi and Cypriniformes with 100% posterior probability, a relationship consistent with morphological and other data.
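The sampling figures quoted above are mutually consistent, which the following short calculation illustrates. It is a sketch only; the assumption that the starting tree (cycle 0) is written to the sample, giving 301 rather than 300 trees per run, is ours and is what reconciles the 31-tree burn-in with the 810 pooled trees.

```python
cycles_per_run = 300_000   # 3.0 x 10^5 cycles per run
sample_every = 1_000       # 1 in every 1000 trees sampled
burn_in_trees = 31         # first 31 sampled trees discarded
independent_runs = 3

# Assumption: the tree at cycle 0 is also sampled, so each run
# yields 300 + 1 trees.
trees_per_run = cycles_per_run // sample_every + 1
kept_per_run = trees_per_run - burn_in_trees
pooled_trees = independent_runs * kept_per_run

# The 31 discarded trees are those sampled at cycles 0, 1000, ..., 30000.
last_burn_in_cycle = (burn_in_trees - 1) * sample_every

print(trees_per_run, kept_per_run, pooled_trees, last_burn_in_cycle)
```

Under that assumption each run contributes 270 post-burn-in trees, and the last discarded tree falls at cycle 30,000, the point at which stationarity was reached.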

Primer Design
Fifteen primers (Figs. 2–4, Table 3) were newly designed with reference to the aligned and published sequences from 15 ostariophysans (for a list of species, see Saitoh et al., 2003) and unpublished sequences from 36 cypriniforms (17 cyprinids, one gyrinocheilid, six catostomids, six cobitids, and six balitorids; Saitoh et al., unpublished data). In the primer design process we considered a number of factors (Palumbi, 1996). Primers were preferentially located in the most conservative portions of tRNA and protein-coding genes. Primers included some G/C at the 3′-ends to strengthen primer–template annealing at this position. Considering the unconventional base pairing in the T/G bond, the designed primers used G rather than A in the primer when the template was variably C or T, and T rather than C


Fig. 2. A Schematic representation of the ND4/ND5 gene region and relative positions of the 15 newly designed primers. For abbreviations of protein-coding genes, see footnote of Table 1. Transfer RNA genes are represented by the single-letter code. B Sequencing strategy for the ND4/ND5 gene region. Numbers above arrows correspond to those for primer pairs indicated in Table 4

Fig. 3. Aligned sequences from seven cypriniforms published in Saitoh et al. (2003), with newly designed L primers indicated above. Degenerate positions are indicated using IUPAC code

Fig. 4. Aligned sequences from seven cypriniforms published in Saitoh et al. (2003), with newly designed H primers indicated above. Degenerate positions are indicated using IUPAC code


Table 3. A list of 15 newly designed PCR primers

Primer^a        Sequence^b (5′→3′)
L primers
L10474-Arg-C    GGT TWG AKT CCG YGG TTC CCT TAT GAC
L10681-ND4-C    GCK TTT TCT GCK TGT GAR GC
L11427-ND4-C    CCW AAG GCS CAT GTW GAR GC
L12170-His-C    GTA AGT ATA GTT TAA KTW AAA TRT TAG ATT GTG
L12319-Leu-C    TTG GTC TTA GGA ACC AAA AAC TCT TGG TGC
L12328-Leu-C    AAC TCT TGG TGC AAM TCC AAG
L13058-ND5-C    TCK GCT ATG GAG GGY CCK AC
L13559-ND5-C    TCK TAT CTK AAC GCC TGR GC
H primers
H11618-ND4-C    TGG CTK ACK GAK GAG TAK GC
H12296-Leu-C    CAA GAG TTT TTG GTT CCT AAG
H12632-ND5-C    TTC TAG GAT KGA TCA GGT GAC GWA KAG KGC
H13393-ND5-C    CCT ATT TTK CGG ATG TCT TGY TC
H13721-ND5-C    ATG CTT CCT CAG GCR AGK CG
H14473-ND6-C    GCG GCW TTG GCK GCK GAG CC
H14710-Glu-C    CTT GTA GTT GAA TWA CAA CGG TGG TTY TTC

a Primers are designated by their 3′-ends, which correspond to the position on the human mitochondrial genome (Anderson et al., 1981) by convention; L and H denote light and heavy strands, respectively
b Degenerate positions are denoted using IUPAC codes


when the template was A or G. Finally, we incorporated some degenerate positions in many primers, thereby accommodating much of the variation in the mtDNA sequences among cypriniform taxa.
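To make the role of the degenerate positions concrete, the sketch below expands one primer into the set of concrete oligonucleotides it represents, using the standard IUPAC ambiguity codes. The function names are ours, and the example sequence is L10681-ND4-C as we read it from Table 3 with the spaces removed.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer: str) -> List[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def has_gc_clamp(primer: str) -> bool:
    """True if the 3'-terminal base can only be G or C."""
    return set(IUPAC[primer[-1]]) <= {"G", "C"}


# L10681-ND4-C, as read from Table 3 (spaces removed).
primer = "GCKTTTTCTGCKTGTGARGC"

print(len(primer), degeneracy(primer), has_gc_clamp(primer))
for variant in expand(primer):
    print(variant)
```

With two K positions (G or T) and one R position (A or G), this 20-mer stands for eight distinct oligonucleotides, and it ends in C at the 3′-end.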

Concluding Remarks
Although the sequencing strategy presented herein has been specifically developed for the Cypriniformes, it is applicable to various groups of fishes with minor modification. Optimization of primer design for a specific group is feasible for many fish groups with reference to the published whole mitogenome sequences from 190 species (see MitoFish Web site at http://mitofish.ori.u-tokyo.ac.jp/). In some cases, however, such optimization is unnecessary, as the new primers work very well with a variety of actinopterygians from the base to the top of the tree (Miya et al., unpublished data). One potential problem with the ND4/ND5 gene region is that its actual phylogenetic performance for the large data set is unknown at present. This area of uncertainty is being explored empirically with the assembly of an actual data set for 1000 species of Cypriniformes. With continued international collaborative efforts in the CToL project, we hope to be able to demonstrate the phylogenetic performance of the ND4/ND5 gene region for this diverse group of organisms with taxonomically intensive sampling.
Acknowledgments This study was conducted as a part of the CToL project. We thank Hank Bart for his thoughtful and useful comments on an earlier version of this manuscript. We also thank participants of the CToL meeting during the 11th European Congress of Ichthyology in Tallinn, Estonia, for fruitful discussions on systematics of the Cypriniformes. A portion of this study was supported by Grants-in-Aid from the Ministry of Education, Culture, Sports, Science and Technology, Japan (Nos. 15570090, 15380131 and 17207007).

A Suggested Protocol
The following protocol consists of three steps: (1) long PCR, (2) full-nested short PCRs, and (3) direct cycle sequencing. We have confirmed that the new primers and protocol work very well with representative(s) of all five cypriniform families and produce consistent results.

Long PCR. Five longer primers (L10474-Arg-C, L12170-His-C, L12319-Leu-C, H12632-ND5-C, and H14710-Glu-C; see Fig. 2A, Table 3) may be used for long PCRs in various combinations depending on the primer specificity and quality of the template DNA. In all cases we have examined, however, single reactions using a primer pair of L10474-Arg-C and H14710-Glu-C (ca. 4.2 kb) have proven successful. This result indicates that use of the long-PCR technique (Cheng et al., 1994) may not be necessary with this moderately sized gene region. In our laboratories, reactions are carried out with 30 cycles of a 15-µl reaction volume containing 8.3 µl sterile distilled H2O, 1.5 µl 10× PCR buffer (Takara, Kyoto, Japan), 1.2 µl dNTP (4 mM), 1.5 µl each primer (5 µM), 0.07 Taq polymerase (Z Taq; Takara), and 1.0 µl template (30 ng/µl). The thermal cycle profile after an initial 2-min denaturation at 94°C is as follows: denaturation at 98°C for 1 s; annealing at 55°C for 5 s; and extension at 72°C for 40 s (10 s per 1 kb). The long-PCR products are diluted with sterile TE buffer (1 : 19) for subsequent full-nested short PCRs.

Full-nested short PCRs. We use six primer pairs (Table 4) to amplify contiguous, overlapping short segments (ca. 900–1400 bp) that cover the entire ND4/ND5 gene region (Fig. 2B). Experimental conditions are the same as those of the long PCR except for the shorter extension period (15 s instead of 40 s).

Direct sequencing. Of the 12 primers used in the six short PCRs, two primers (L12328-Leu-C and H14473-ND6-C) are used twice (Table 4), and we conduct ten direct cycle sequencing reactions to avoid duplication. In our laboratories, double-stranded short-PCR products are purified using a PreSequencing kit (USB) to remove redundant dNTPs and oligonucleotides from primers. Direct cycle sequencing is accomplished with dye-labeled terminators (BigDye terminator version 1.1/3.1; Applied Biosystems). Primers used are the same as those for the short PCRs. All sequencing reactions are performed according to the manufacturer's instructions.
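The bookkeeping behind the protocol can be sketched in a few lines: which primers need a sequencing reaction, roughly how long each short-PCR product is, and what the 10 s per 1 kb rule implies for the long PCR. The primer pairs are those of Table 4. Estimating product length from the numbers embedded in the primer names is our assumption; those numbers are 3′-end positions in human mitochondrial coordinates, so the spans are only rough guides for cypriniform templates.

```python
# Primer pairs for the six full-nested short PCRs (Table 4).
short_pcr_pairs = [
    ("L10681-ND4-C", "H11618-ND4-C"),
    ("L11427-ND4-C", "H12632-ND5-C"),
    ("L12328-Leu-C", "H13393-ND5-C"),
    ("L12328-Leu-C", "H13721-ND5-C"),
    ("L13058-ND5-C", "H14473-ND6-C"),
    ("L13559-ND5-C", "H14473-ND6-C"),
]
long_pcr_pair = ("L10474-Arg-C", "H14710-Glu-C")


def position(primer_name: str) -> int:
    """Extract the coordinate from a name such as 'L10681-ND4-C'."""
    return int(primer_name[1:].split("-")[0])


def approx_span(pair) -> int:
    """Rough product length: difference between the two 3'-end coordinates."""
    l_primer, h_primer = pair
    return position(h_primer) - position(l_primer)


# Each distinct primer is used once for direct cycle sequencing.
sequencing_primers = sorted({p for pair in short_pcr_pairs for p in pair})

short_spans = [approx_span(pair) for pair in short_pcr_pairs]
long_span = approx_span(long_pcr_pair)

# Extension time under the stated rule of 10 s per 1 kb.
long_extension_s = 10 * long_span / 1000

print(len(sequencing_primers), "sequencing reactions")
print("short PCR spans (bp):", short_spans)
print("long PCR span (bp):", long_span)
print(f"long PCR extension by rule: {long_extension_s:.1f} s")
```

The ten distinct primers match the ten sequencing reactions described above, the estimated spans fall within the stated range of roughly 900 to 1400 bp, and the rule of thumb gives an extension time close to the 40 s used for the long PCR.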
Table 4. Suggested six primer pairs for full-nested short PCRs

No.^a   L primer        H primer
1       L10681-ND4-C    H11618-ND4-C
2       L11427-ND4-C    H12632-ND5-C
3       L12328-Leu-C    H13393-ND5-C
4       L12328-Leu-C    H13721-ND5-C
5       L13058-ND5-C    H14473-ND6-C
6       L13559-ND5-C    H14473-ND6-C

a Numbers correspond to those denoted in Fig. 2B

Literature Cited
Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457–465
Cheng S, Chang SY, Gravitt P, Respess R (1994) Long PCR. Nature 369:684–685
Farias IP, Ortí G, Sampaio I, Schneider I, Meyer A (2001) The cytochrome b gene as a phylogenetic marker: The limits of resolution for analyzing relationships among cichlid fishes. J Mol Evol 53:89–103
Kocher TD, Thomas WK, Meyer A, Edwards SV, Pääbo S, Villablanca FX, Wilson AC (1989) Dynamics of mtDNA evolution in animals: amplification and sequencing with conserved primers. Proc Natl Acad Sci USA 86:6196–6200
Meyer A (1994) Shortcomings of the cytochrome b gene as a molecular marker. Trends Ecol Evol 9:278–280
Miya M, Nishida M (1998) Molecular phylogeny and evolution of the deep-sea fish genus Sternoptyx. Mol Phylogenet Evol 10:11–22
Miya M, Nishida M (1999) Organization of the mitochondrial genome of a deep-sea fish Gonostoma gracile (Teleostei: Stomiiformes): first example of transfer RNA gene rearrangements in bony fishes. Mar Biotechnol 1:416–426
Miya M, Nishida M (2000) Use of mitogenomic information in teleostean molecular phylogenetics: A tree-based exploration under the maximum-parsimony optimality criterion. Mol Phylogenet Evol 17:437–455
Miya M, Satoh TP, Nishida M (2005) The phylogenetic position of toadfishes (order Batrachoidiformes) in the higher ray-finned fish as inferred from partitioned Bayesian analysis of 102 whole mitochondrial genome sequences. Biol J Linn Soc 85:289–306
Ortí G, Meyer A (1997) The radiation of characiform fishes and the limits of resolution of mitochondrial ribosomal DNA sequences. Syst Biol 46:75–100
Palumbi SR (1996) Nucleic acids II. The polymerase chain reaction. In: Hillis DM, Moritz C, Mable BK (eds) Molecular systematics, 2nd edn. Sinauer, Sunderland, MA, pp 205–221
Ronquist F, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574
Saitoh K, Miya M, Inoue JG, Ishiguro NB, Nishida M (2003) Mitochondrial genomics of ostariophysan fish: Perspectives on phylogeny and biogeography. J Mol Evol 56:464–472
