Significance of the study
Scope
Objectives
▪ Identify the differentially expressed genes between a mature (green) and ripe (red) tomato.
▪ Functional enrichment analysis in tomato (Solanum lycopersicum).
▪ Source: http://www.slideshare.net/mkim8/a-comparison-of-ngs-platforms
NGS Applications
▪ Whole genome sequencing
▪ Whole exome sequencing
▪ RNA sequencing
▪ ChIP-seq/ChIP-exo
▪ CLIP-seq
▪ GRO-seq/PRO-seq
▪ Bisulfite-Seq
RNA-seq
▪ Gene expression can be estimated by measuring RNA in the cell
▪ Northern Blots: one gene per experiment
▪ Microarray: pre-built probes for lots of genes
▪ RNA-seq: sequence and count millions of RNA molecules present in the sample
RNA-seq has a larger dynamic range than microarrays, correlates more closely with qPCR,
identifies transcript isoforms, and discovers novel genes.
Overview of RNA-Seq
Applications of RNA-Seq
▪ Differential expression
▪ Gene fusion
▪ Alternative splicing
▪ Novel transcribed regions
▪ Allele-specific expression
▪ RNA editing
▪ Transcriptome for non-model organisms
Benefits & Challenges
Benefits:
▪ Independence from prior knowledge
▪ High resolution, sensitivity and large dynamic range
▪ Unravels previously inaccessible complexities
Challenges:
▪ Interpretation is not straightforward
▪ Procedures continue to evolve
Objective 1
Identify the differentially expressed genes between a mature (green) and ripe (red) tomato.
FASTQ
FASTQ files
Line 1: sequence identifier (begins with @)
Line 2: raw sequence
Line 3: + separator, optionally followed by the description again (usually ignored)
Line 4: quality values for the sequence, one character per base
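The four-line record layout above can be read with a few lines of Python. This is a minimal sketch, not the pipeline used in the study: the record in `example` is made up, and the quality decoding assumes the Phred+33 encoding, which should be checked for the actual data.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (identifier, sequence, scores) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # line 3: '+' separator, ignored
        quality = handle.readline().rstrip("\n")
        # Assumption: Phred+33 encoding (ASCII code minus 33 is the score).
        scores = [ord(ch) - 33 for ch in quality]
        yield header[1:], sequence, scores  # drop the leading '@'


# Hypothetical single-read example.
example = StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
print(records)
```

Each score is a Phred value, so under this assumption `I` corresponds to Q40 and `!` to Q0.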
Preprocessing
Quality Control using FastQC
Sequencing QC
Information we need to check
▪ Basic information (total reads, sequence length, etc.)
▪ Per base sequence quality
▪ Overrepresented sequences
▪ GC content
▪ Duplication level
▪ Etc.
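To make the checks listed above concrete, the sketch below computes a few of them on a toy read set. It is an illustration only and does not reproduce FastQC's own algorithms (its duplication estimate, for instance, is computed differently). The reads are invented and Phred+33 quality encoding is assumed.

```python
from collections import Counter


def qc_summary(reads):
    """Summarise a list of (sequence, quality_string) pairs; Phred+33 assumed."""
    total = len(reads)
    bases = "".join(seq for seq, _ in reads)
    gc_percent = 100.0 * (bases.count("G") + bases.count("C")) / len(bases)

    # Mean quality at each position, over the reads long enough to cover it.
    longest = max(len(qual) for _, qual in reads)
    per_base_quality = []
    for pos in range(longest):
        scores = [ord(qual[pos]) - 33 for _, qual in reads if len(qual) > pos]
        per_base_quality.append(sum(scores) / len(scores))

    # Fraction of reads that are exact copies of an earlier read.
    counts = Counter(seq for seq, _ in reads)
    duplicated = sum(n - 1 for n in counts.values())

    return {
        "total_reads": total,
        "gc_percent": gc_percent,
        "per_base_quality": per_base_quality,
        "duplicate_fraction": duplicated / total,
    }


# Hypothetical toy data: three 4-base reads.
toy_reads = [("ACGT", "IIII"), ("ACGT", "II55"), ("GGCC", "5555")]
summary = qc_summary(toy_reads)
print(summary)
```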
MAPPING READS
(Bowtie2 and TopHat2)
▪ TopHat is a fast splice junction mapper for RNA-Seq reads.
▪ It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput
short read aligner Bowtie, and then analyzes the mapping results to
identify splice junctions between exons.
▪ TopHat aligns FASTQ data files to a reference genome. It also makes use of
genome annotation (gene names, location of exons on the genome).
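The mapping step described above can be sketched as two command lines, assembled here in Python without being executed. All file and directory names are hypothetical placeholders, and the thread count is arbitrary. The options used are `-p` (threads), `-G` (gene annotation) and `-o` (output directory) of `tophat2`; the exact options used in the study are not given in the slides.

```python
import shlex


def tophat_commands(genome_fasta, index_base, annotation, reads, out_dir, threads=4):
    """Return the index-building and alignment commands as argument lists."""
    # Step 1: build the Bowtie2 index from the reference genome FASTA.
    build = ["bowtie2-build", genome_fasta, index_base]
    # Step 2: align the FASTQ reads, guided by the gene annotation.
    align = [
        "tophat2",
        "-p", str(threads),
        "-G", annotation,
        "-o", out_dir,
        index_base,
        reads,
    ]
    return build, align


# Hypothetical file names for one green-fruit sample.
build_cmd, align_cmd = tophat_commands(
    "tomato_genome.fa", "tomato_index", "tomato_genes.gtf",
    "green_rep1.fastq", "tophat_green_rep1",
)
print(shlex.join(build_cmd))
print(shlex.join(align_cmd))
```

TopHat writes its alignments to `accepted_hits.bam` inside the output directory, which is the file passed on to the differential expression step.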
Output of TopHat
Differential expression
Cuffdiff
▪ There are two workflows you can choose from when looking for differentially
expressed and regulated genes using the Cufflinks package. The first workflow is
simpler and is a good choice when you aren't looking for novel genes and
transcripts. This workflow requires that you not only have a reference genome,
but also a reference gene annotation in GFF format.
▪ The second workflow, which includes steps to discover new genes and new
splice variants of known genes, is more complex and requires more computing
power. The second workflow can use and augment a reference gene annotation
GFF if one is available.
▪ Cuffdiff calculates differential expression between two sets of RNA-seq data files
(treatment vs. control), using BAM files created by TopHat.
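Once Cuffdiff has run, the differentially expressed genes are usually pulled out of its gene-level table. The sketch below filters such a table held in memory. The column names follow the layout of Cuffdiff's `gene_exp.diff` file as an assumption to verify against real output; the gene names, values and thresholds are invented for illustration and are not results from this study.

```python
import csv
from io import StringIO

COLUMNS = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
           "status", "value_1", "value_2", "log2(fold_change)",
           "test_stat", "p_value", "q_value", "significant"]

# Hypothetical rows: one clear hit, one unchanged gene, one untested gene.
ROWS = [
    ["g1", "g1", "geneA", "ch01:1-100", "Green", "Red", "OK",
     "10", "80", "3", "5.2", "0.0001", "0.001", "yes"],
    ["g2", "g2", "geneB", "ch02:1-100", "Green", "Red", "OK",
     "50", "55", "0.1375", "0.3", "0.7", "0.8", "no"],
    ["g3", "g3", "geneC", "ch03:1-100", "Green", "Red", "NOTEST",
     "0", "0", "0", "0", "1", "1", "no"],
]
TOY_DIFF = "\n".join("\t".join(row) for row in [COLUMNS] + ROWS) + "\n"


def significant_genes(handle, max_q=0.05, min_abs_log2fc=1.0):
    """Return (gene, log2 fold change) for tested genes passing both cutoffs."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["status"] != "OK":
            continue  # skip genes that were not successfully tested
        log2fc = float(row["log2(fold_change)"])
        if float(row["q_value"]) <= max_q and abs(log2fc) >= min_abs_log2fc:
            hits.append((row["gene"], log2fc))
    return hits


hits = significant_genes(StringIO(TOY_DIFF))
print(hits)
```

The fold change is expressed as log2(value_2 / value_1), so a value of 3 means an eight-fold higher estimate in the second sample.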
Result
Objective: Identify the differentially expressed genes between a mature (green) and ripe (red) tomato.
Objective 2
Functional enrichment analysis in Tomato (Solanum lycopersicum)
Functional enrichment analysis
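The slides do not state which statistic was used for the enrichment analysis. As an assumption, the sketch below shows one common choice, the hypergeometric over-representation test: given a gene list, it asks how likely it is to see at least the observed number of genes carrying a functional term by chance. All numbers are toy values.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes in total, 4 carry the term,
# 3 genes are differentially expressed and 2 of those carry the term.
p_value = enrichment_p_value(population=10, annotated=4, selected=3, overlap=2)
print(p_value)
```

In practice the test is repeated for every term, so the resulting p-values need a multiple-testing correction.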
Results:
Objective 3
Identification & Validation of EST-SSR Markers in Tomato Fruit (Solanum lycopersicum).
SSR Marker
• A simple sequence repeat (SSR) is a short DNA motif, most often 2-5 base pairs long, repeated in tandem,
such as (AT)n, (CTC)n, (GAGT)n, (CTCGA)n
Fingerprinting
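The definition above translates into a simple scan for tandem repeats. This is a naive sketch for illustration, not the algorithm of any particular SSR tool: the minimum repeat count of 4 is an arbitrary assumption (real tools typically use a different threshold per motif length), and the test sequence is invented.

```python
def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_len=2, max_len=5, min_repeats=4):
    """Return sorted (start, motif, repeat_count) tuples for tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(min_len, max_len + 1):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            # Count consecutive copies of the motif starting at position i.
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_repeats:
                hits.append((i, motif, n))
                i += n * k  # jump past the repeat just reported
            else:
                i += 1
    return sorted(hits)


# Hypothetical sequence containing (AT)6 and (CTC)5.
toy_sequence = "GG" + "AT" * 6 + "CCGT" + "CTC" * 5 + "A"
ssrs = find_ssrs(toy_sequence)
print(ssrs)
```

Start positions are 0-based. Skipping non-primitive motifs keeps (AT)6 from being reported a second time as (ATAT)3.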
assembles
transcripts
Transcript Abundance:
MIcroSAtellite identification tool (MISA).
WebSat
▪ This tool is used for microsatellite molecular marker prediction and development.
▪ WebSat is accessible through the Internet, requiring no program installation.
▪ WebSat allows the submission of sequences, visualization of microsatellites and
the design of primers suitable for their amplification.
▪ The program allows full control of parameters and the easy export of the resulting
data, thus facilitating the development of microsatellite markers.
Primer Analysis Software
NetPrimer
▪ NetPrimer combines the latest primer analysis algorithms with a web-based interface allowing
the user to analyze primers over the Internet.
▪ All primers are analyzed for primer melting temperature using the nearest neighbor
thermodynamic theory to ensure accurate Tm prediction.
▪ Primers are analyzed for all primer secondary structures including hairpins, self-dimers, and
cross-dimers in primer pairs.
▪ This ensures the availability of the primer for the reaction and minimizes the
formation of primer dimers. The program eases quantitation of primers by calculating primer
molecular weight and optical activity.
▪ To facilitate the selection of an optimal primer, each primer is given a rating based on the
stability of its secondary structures. A comprehensive analysis report can be printed for
individual primers or primer pairs.
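For a quick sanity check of a primer before it is submitted for full analysis, a rough melting temperature can be computed by hand. The sketch below uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which is a deliberate simplification and not the nearest-neighbor thermodynamic model described above. It is mainly a rule of thumb for short oligonucleotides, and the primer sequence is invented.

```python
def primer_summary(primer):
    """Length, GC content and Wallace-rule Tm (degrees C) of a primer."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return {
        "length": len(primer),
        "gc_percent": 100.0 * gc / len(primer),
        # Wallace rule: 2 degrees per A/T base, 4 degrees per G/C base.
        "tm_wallace": 2 * at + 4 * gc,
    }


# Hypothetical 20-mer.
info = primer_summary("ATGCATGCATGCATGCATGC")
print(info)
```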
Results: MISA
Statistic                                   SRR363116  SRR363117  SRR363118  SRR363119  SRR363120  SRR363121  SRR363122
Sequences examined
Total size of examined sequences              1119192    1874915     946216     436494    1030179     372280     598433
Total No. of identified SSRs                      109        171         80         52         90         27         53
Sequences containing more than 1 SSR
No. of SSRs present in compound formation           3         10          4          1          3          1          1
Results:
NetPrimer Results
Conclusion
▪ The results showed that genes such as PG2, TBG6,
ACS2 and ACS4 play a crucial role during ripening
and may have important roles in shelf life.
▪ Understanding their roles during the ripening process
will help us design better strategies for controlling
shelf life in the agro-industry.
▪ This translational genomic study of Solanum
lycopersicum provides valuable information that
can be used in population genetics, marker-assisted
selection for breeding programs,
quantitative trait loci (QTL) analysis and the
estimation of genetic diversity.
▪ Shelf life improvement studies through
overexpression of genes.