Open SeqMonk
Launch SeqMonk
The first thing you will see is the opening page. SeqMonk checks your installation and confirms that
everything is in order, as indicated by the green check marks.
1. Under the top menu, go to File and select New project ...
2. When prompted to select a genome, choose GRCh37 under the Homo sapiens folder
3. Click OK to proceed
List Panel:
A listing of all the imported and created files
Chromosome Panel:
A quick bird's eye view of data signal on the chromosomes
Track Panel:
A detail view of annotation and data tracks
Import BAM
Importing BAM files into SeqMonk
1. To import data into SeqMonk, go to File, choose Import Data, then BAM/SAM ...
2. Navigate to the location of the BAM files, select all of the BAM files (.bam), and click Open.
Import In Progress
BAM files are large, so please allow some time for the import to finish.
Note: Different software packages name the mitochondrial chromosome differently. In this case,
SeqMonk expects the mitochondrial chromosome to be named "M", but our BAM files name it "chrM".
As a result, SeqMonk is unable to import the mitochondrial reads.
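The naming mismatch in the note above can be illustrated with a short Python sketch. It is not part of SeqMonk; the function name normalize_chromosome_name is hypothetical, and the target name "M" is taken from the note. It only shows the kind of mapping that would reconcile the two conventions.

```python
def normalize_chromosome_name(name: str) -> str:
    """Map a prefixed name such as 'chrM' or 'chr4' to a bare name ('M', '4').

    Illustration only: the expected name 'M' comes from the tutorial note,
    and this helper is hypothetical, not a SeqMonk function.
    """
    # Drop a leading 'chr' prefix, ignoring case.
    if name.lower().startswith("chr"):
        name = name[3:]
    return name


# Names as they might appear in a BAM header versus the bare form.
bam_names = ["chr1", "chr4", "chrX", "chrM"]
bare_names = [normalize_chromosome_name(n) for n in bam_names]
```

Actually renaming the chromosomes inside a BAM file requires rewriting its header with a dedicated tool, which is outside the scope of this sketch.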
Define Probe
1. A Probe is a predefined region on the genome. Probes can be defined from several annotation
feature types: gene, mRNA, or CDS
Quantitation
Quantitation is the process of counting the number of reads within the Probe region
2. Define Probe by gene/mRNA. Here a Probe is defined using the gene/mRNA region, and the
read quantitation is shown for this region
Note: When mRNA is used to define a Probe, the algorithm includes only reads in exons, not those in
introns. If gene is used instead, reads in both exons and introns are included.
3. Define Probe by CDS. Here a Probe is defined using the CDS region, and the read
quantitation is shown for this region.
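The difference between gene-based and mRNA-based Probes described in the note above can be sketched in Python. This is a simplified illustration, not SeqMonk's algorithm: the coordinates are invented, and the helper names overlaps and count_reads are hypothetical. A gene Probe is treated as one span covering exons and introns, and an mRNA Probe as the exons only.

```python
def overlaps(read_start, read_end, region_start, region_end):
    """Return True if the read and the region share at least one base."""
    return read_start <= region_end and read_end >= region_start


def count_reads(reads, regions):
    """Count reads that overlap any of the given regions."""
    return sum(
        1
        for r_start, r_end in reads
        if any(overlaps(r_start, r_end, g_start, g_end) for g_start, g_end in regions)
    )


# Invented example: a gene with two exons separated by one intron.
exons = [(100, 200), (400, 500)]   # mRNA-style Probe: exons only
gene = [(100, 500)]                # gene-style Probe: exons plus intron

# Three reads: one in each exon and one inside the intron.
reads = [(120, 155), (250, 285), (450, 485)]

gene_count = count_reads(reads, gene)    # intronic read is counted
mrna_count = count_reads(reads, exons)   # intronic read is ignored
```

With these invented reads, the gene-style count is 3 and the mRNA-style count is 2, because the read at 250-285 falls entirely within the intron.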
A new Define Quantitation window appears with more options. Please choose:
2. Select RNA-Seq quantitation pipeline Option
3. Transcript features: mRNA
4. Library type: Non-strand specific
5. Merge transcript isoforms: check
6. Log transform: check
7. Apply transcript length correction: check
This is shown in the List Panel, under Probe Lists
QC Inspection of Reads
We will do a visual inspection on the imported samples
1. At the Chromosome Panel, use your mouse to highlight the left most region of Chromosome 4.
2. Careful examination reveals that sample ABC_Ly3.bam is particularly noisy, with reads
scattered all over the region
See the next step for how to assign samples to each replicate set.
1. To add data track, go to View, and select Set Data Tracks ...
2. In the new Select Data Track window, highlight both ABC and GCB Replicate Sets
3. Click Add to add these data onto the Track Panel
4. Here, it shows that the new data has been added
Note: Examine the Track Panel, where the replicate sets ABC and GCB have been added to the bottom
of the tracks.
Example:
Given a Probe of 2000 base pairs in length, 20 reads were mapped to this Probe.
Therefore, the intensity value would be:
2. Adjust the Division level for a more granular view of the signal
Note: The Probe Value Histogram gives us a sense of the distribution of positive vs. negative probe
(mRNA in this case) quantitation values. Here, we see that the negative probe values are slightly
higher than the positive ones.
2. Here, the plot shows that all reads have the same length, 86 nucleotides.
Note: The original read length is 36. Recall that during the Import BAM step we extended the reads
by 50 bp (see Page 4).
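The arithmetic behind the 86-nucleotide length (36 + 50) can be sketched as follows. This is a minimal illustration that assumes reads are extended at their downstream end according to strand and that positions are 1-based and inclusive. The function extend_read and the coordinates are hypothetical and do not show how SeqMonk performs the extension internally.

```python
def extend_read(start, end, strand, extension):
    """Extend a read by `extension` bases in its direction of travel.

    Assumes 1-based inclusive coordinates. Forward reads grow at the end
    position; reverse reads grow at the start position (clamped at 1).
    """
    if strand == "+":
        return start, end + extension
    return max(1, start - extension), end


def read_length(start, end):
    """Length of a read with 1-based inclusive coordinates."""
    return end - start + 1


# A 36 bp read on each strand (invented positions).
forward = extend_read(1000, 1035, "+", 50)
reverse = extend_read(1000, 1035, "-", 50)

forward_length = read_length(*forward)
reverse_length = read_length(*reverse)
```

Both extended reads come out at 86 bases, matching the length shown in the plot.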
1. Go to Plots, then select Box Whisker Plot, followed by Visible Data Stores...
2. The Box Whisker Plot shows a very even distribution among the samples, which indicates that the
normalization process was appropriate.
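What "even distribution among the samples" means can be checked numerically by comparing the summary statistics a box-and-whisker plot draws. The sketch below uses Python's statistics module on invented values; the sample names and numbers are hypothetical, and the quartile method may differ from the one SeqMonk uses.

```python
import statistics


def box_summary(values):
    """Return the five numbers a box-and-whisker plot is built from."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Invented quantitation values for two samples.
samples = {
    "sample_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sample_b": [1.5, 2.0, 3.0, 4.0, 4.5],
}

summaries = {name: box_summary(values) for name, values in samples.items()}

# Similar medians and quartiles across samples suggest comparable distributions.
medians = [summary["median"] for summary in summaries.values()]
median_spread = max(medians) - min(medians)
```

A small spread in medians and quartiles between samples is what the plot shows visually as boxes lining up at the same height.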
Plot MA Plot
Plot MA Plot to show how well the normalization works
Note: The MA Plot shows the difference vs. the average between ABC and GCB. The difference is
plotted on the Y-axis and the average on the X-axis. What we want to see is that the same difference
is exhibited throughout the data range.
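The two axes described in the note can be computed directly. The sketch below assumes each Probe has one log-transformed value per replicate set; the function name ma_values and the numbers are hypothetical. For each Probe it returns the average (X-axis) and the difference (Y-axis).

```python
def ma_values(abc, gcb):
    """Return (average, difference) pairs for paired log-scale values.

    average    -> X-axis of the MA plot
    difference -> Y-axis of the MA plot (ABC minus GCB)
    """
    return [((a + g) / 2, a - g) for a, g in zip(abc, gcb)]


# Invented log-transformed quantitation values for three Probes.
abc_values = [4.0, 6.0, 9.0]
gcb_values = [2.0, 6.0, 10.0]

points = ma_values(abc_values, gcb_values)
```

If normalization worked well, the differences should scatter around the same level across the whole range of averages rather than drifting up or down at low or high values.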
7. In the new Found XXX probes window, give the gene list a meaningful name
8. The gene list will show up on the List Panel, under Probe Lists. In this case, we found 748
statistically significant genes.
Note: We did not use mRNA as the annotation choice here, because it would return gene isoform
information. Instead, we have chosen to use gene, which collapses all the isoforms into a single,
easy-to-handle entry.
1. In Start at Row, select 1 since the data starts from row number one
2. In Chr Col (Chromosome Column), select 2 for the chromosome column
3. In Start Col (start of genomic region), select 3 for the beginning of the genomic region (or TF
binding site)
4. In End Col (end of genomic region), select 4 for the end of the genomic region.
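The column settings above describe how each row of the imported text file is read. As a rough illustration, the sketch below parses a file with the same layout: data starting at row 1, chromosome in column 2, start in column 3, and end in column 4. It assumes a tab-delimited file with 1-based row and column numbers; the function parse_regions and the example rows are hypothetical.

```python
import csv
import io


def parse_regions(text, start_row=1, chr_col=2, start_col=3, end_col=4, delimiter="\t"):
    """Read (chromosome, start, end) tuples from delimited text.

    Row and column numbers are 1-based, mirroring the dialog settings.
    The tab delimiter is an assumption about the input file.
    """
    regions = []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row_number, row in enumerate(reader, start=1):
        # Skip rows before the first data row and any blank rows.
        if row_number < start_row or not row:
            continue
        regions.append((row[chr_col - 1], int(row[start_col - 1]), int(row[end_col - 1])))
    return regions


# Invented example: name, chromosome, start, end.
example = "site1\t4\t1000\t1500\nsite2\tX\t200\t260\n"
regions = parse_regions(example)
```

If the file had a header line, the same sketch would be called with start_row=2 so that the header is skipped.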
Warning:
One of the major limitations of SeqMonk is that it can store only one set of Probes. Therefore, when a
new set of Probes is defined here, the old set will be removed.
Probe Quantitation
Once we have set up the Probe Definition, we are ready to quantify the reads within those
Probes
7. In the Found XXX probes window, give the list a meaningful name.
8. The newly created significant Probes list will show up on the List Panel under Probe Lists