Beruflich Dokumente
Kultur Dokumente
#How to run scripts/codes written in other languages like perl/python etc in linux
python myscript.py # runs a python script
Genome/Transcriptome Assembly using velvet
#Kmer Generation
#foldername can be any output folder name in which you store data. If this name is not given file
will get stored in present directory.
#31 is kmer length. Velvet comes with by-default maximum 31kmer length. This can be changed
during installation.
#For velvet you need to tell the program file format & type of input file. So that it can distinguish
between different file formats.
#Making Contigs
velvetg 31mer
#min_contig_lgth this should must be greater than kmer length. So that contigs can be made using
overlaps.
Align the reads with indexes and store them in file alignments.sam
bowtie –S my-indexes mu1.txt > mu1.sam # will align mu1 reads to reference
(transcripts) and store it to file mu1.sam or whatever name you give it
bowtie my-indexes mu2.txt > mu2.sam # will do the same for mu2 and so on.
Get read counts while using counter script (python code) ‘getCountMatrix.py’ as;
create a folder with name bowtieOutputs and place the aligned bowtie outputs (mu1.sam and
mu2.sam) in it.
Make sure that the getCountMatrix.py, GeneIDs.txt and bowtieOutputs (folder) are in the same
folder.
python getCounts.py
Nothing will be shown on shell but a file with name ‘countMatrix.txt’ will be created that store counts
of reads hitting to each gene.
Recheck some random gene counts and confirm it with actual sam output file.