M.Tech Final Year Project (Phase 3)

This document describes an M.Tech final year project to evaluate the performance of the TOPHAT RNA-seq alignment tool using different algorithms on an OpenStack cloud environment. The goals are to study available alignment algorithms, understand how TOPHAT works, and tweak aspects of TOPHAT to evaluate and potentially improve its performance. Key aspects that will be examined include RNA sequences, alignment, transcripts, splice junctions, introns, exons, and coding versus non-coding regions. The document provides background on RNA-Seq, sequence alignment, and TOPHAT and describes plans to test TOPHAT using different algorithms like BWT and evaluate results based on mapping time and accuracy.

Original title: Phase3.odp

Uploaded by: anshulvyas23

M.Tech Final Year Project
(Phase 3)

Submitted to: Varshapriya J.N (Computer & IT department)
Submitted by: Anshul Vyas (M.Tech 2nd Year)

Target

Performance evaluation of TopHat.

Using an OpenStack cloud environment.

Using the black-box technique with different algorithms.

Next Target

Study all available alignment algorithms.

Study the algorithm on which TopHat works.

Tweak something to evaluate and improve the performance.

Challenging keywords

RNA sequence

Alignment

Transcript/Transcriptome

Splice junction

Intron/Exon

Coding/non-coding part

RNA Sequence

RNA-seq (RNA sequencing), also called whole transcriptome shotgun sequencing (WTSS), is used to reveal the presence and quantity of RNA in a biological sample at a given moment in time.

RNA-seq is used to analyze the continually changing cellular transcriptome, including:

Spliced transcripts,

Post-transcriptional modifications,

Gene fusion.

RNA-seq can also be used to determine exon/intron boundaries and to verify or amend previously annotated 5′ and 3′ gene boundaries.

Alignment

In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.

Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns.

Sequence alignments are also used for non-biological sequences, such as calculating the edit distance cost between strings in a natural language or in financial data.
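To make the "rows within a matrix, with gaps inserted" idea concrete, here is a minimal sketch of global alignment by dynamic programming (the Needleman-Wunsch scheme). It is an illustration only, not part of the project: the function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment sketch; returns (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from the slides.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover the two aligned rows.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-")   # gap in b
            i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1])   # gap in a
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


if __name__ == "__main__":
    s, ra, rb = global_align("ACGT", "AGT")
    print(s)
    print(ra)
    print(rb)
```

The two returned strings have equal length, so printing them one above the other gives the matrix-style view described above, with `-` marking an inserted gap.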

Transcriptome

The sum total of all the messenger RNA molecules expressed from the genes of an organism.

Exon and Intron

Sequence Editor generates FASTQ.

Splice Junction

In molecular biology, splicing is the editing of the precursor messenger RNA (pre-mRNA) transcript in which introns are removed and exons are joined together (ligated).

Splicing

Refer to the RNA-seq paper presentation.

All Available Algorithms

1. Hash-based algorithms (Mosaik, SwiftWat, AGILE (AliGnIng Long Reads))
2. Prefix/suffix tree based algorithms
3. Merge sort based algorithms
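As a rough illustration of the hash-based family, the sketch below indexes every k-mer of a reference in a hash table, looks up a read's first k-mer as a seed, and then verifies the full read at each candidate position. It is a generic seed-and-verify toy, not the method of any tool named above; `build_kmer_index` and `map_read` are hypothetical names, and only exact matches are handled.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer of the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k):
    """Seed with the read's first k-mer, then verify the whole read at each hit."""
    if len(read) < k:
        return []
    return [pos for pos in index.get(read[:k], [])
            if reference.startswith(read, pos)]


if __name__ == "__main__":
    ref = "ACGTACGTGA"
    idx = build_kmer_index(ref, 3)
    print(map_read("ACGT", ref, idx, 3))
    print(map_read("CGTG", ref, idx, 3))
```

Real hash-based mappers also tolerate mismatches and gaps during the verification step; that part is left out here to keep the sketch short.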

Current limitations:

High mapping error rates,

Low mapping speed,

Read length limitation

Study TopHat

It is open source.

Created at Johns Hopkins University.

It is not a single piece of software but a combination of different bioinformatics tools.

FASTQ Format

@Read_id_1
CTGATGTGCCGCCTCACTTCGGTGGT
+
@@@DDDDDH8<BAHG@BHGIHIII>(
@Read_id_2
TGATGTGCCGCCTCACTACGGTGGTG
+
FHHHHHJIJIJIJIIIJJIIJGIGII
@Read_id_3
...

The four lines are:

1. The name/ID of the read, preceded by a "@". For read pairs, there will be two entries with that name, either in the same or a second FASTQ file.
2. The sequence of the read.
3. A "+" sign. In very old FASTQ files, this is followed by the read name from the first line. Today, this line is present for historical reasons and backwards compatibility only.
4. The quality scores of the bases from line 2. The scores are generated by the sequencing machine and encoded as ASCII (33 + score) characters. The line should have the same length as line 2, as there is one quality score per base.
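The four-line layout and the ASCII (33 + score) encoding can be sketched in a few lines of Python. This is a minimal reader for illustration, assuming one record per four non-blank lines and no line wrapping; `FastqRecord` and `parse_fastq` are hypothetical names, not part of TopHat or any library.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    qualities: List[int]  # one decoded score per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from an iterable of text lines (for example an open file)."""
    rows = [ln.strip() for ln in lines if ln.strip()]  # drop blank lines
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per read")
    for k in range(0, len(rows), 4):
        header, seq, plus, qual = rows[k:k + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {k + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        # Each quality character encodes (33 + score) as its ASCII code.
        yield FastqRecord(header[1:], seq, [ord(c) - 33 for c in qual])


if __name__ == "__main__":
    sample = [
        "@Read_id_1",
        "CTGATGTGCCGCCTCACTTCGGTGGT",
        "+",
        "@@@DDDDDH8<BAHG@BHGIHIII>(",
    ]
    for rec in parse_fastq(sample):
        print(rec.read_id, len(rec.sequence), rec.qualities[:3])
```

For the first read above, the leading `@` characters in the quality line decode to a score of 31 each, since the ASCII code of `@` is 64.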

Black-Box Architecture

Input: FASTQ-1, FASTQ-2

Black box: ?

Output: mapped sequence, unmapped sequence

Set up TopHat on the machine

1. Take FASTQ as input.
2. Process it for mapping.
3. Use different algorithms for mapping.
4. Evaluate the result.
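The four steps above can be sketched as a small black-box harness: the mapper is passed in as a function, so different algorithms can be swapped without changing the measurement code. Everything here is a hypothetical stand-in; `evaluate_mapper` and `naive_mapper` are illustrative names, and the toy mapper (exact substring search) is not how TopHat maps reads.

```python
import time


def naive_mapper(read, reference):
    """Toy mapper: position of the first exact match, or -1 if unmapped."""
    return reference.find(read)


def evaluate_mapper(mapper, reads, reference):
    """Run a mapper over all reads; report wall-clock time and mapped/unmapped reads."""
    mapped, unmapped = [], []
    start = time.perf_counter()
    for read in reads:
        if mapper(read, reference) >= 0:
            mapped.append(read)
        else:
            unmapped.append(read)
    elapsed = time.perf_counter() - start
    fraction = len(mapped) / len(reads) if reads else 0.0
    return {"seconds": elapsed, "mapped": mapped,
            "unmapped": unmapped, "mapped_fraction": fraction}


if __name__ == "__main__":
    result = evaluate_mapper(naive_mapper, ["ACGT", "TTTT"], "GGACGTCC")
    print(result["mapped"], result["unmapped"], result["mapped_fraction"])
```

Passing the mapper as a parameter keeps the timing and counting identical across algorithms, which is the point of treating the mapper as a black box.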

Result after Implementing BWT

Default TopHat algorithm: 8 seconds for rice RNA.

After replacing it with the BWT algorithm: 7 seconds.
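For readers unfamiliar with BWT (the Burrows-Wheeler transform), the sketch below shows the transform itself on a short string. It only illustrates the transform and its reversibility; the slides do not say how BWT was integrated into the pipeline, and this naive rotation-sorting version would be far too slow for real genomes. The `$` sentinel and the function names are assumptions of the example.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform: last column of the sorted rotations of text + sentinel."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel character")
    text += sentinel
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending the last column and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The transform groups identical characters together, which is what makes the compressed index structures built on top of it practical for read mapping.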

Next tasks

On the basis of algorithm, analyze the performance of the mapper with:

1) BWT-SW
2) FM index
3) GFM index
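As background for the FM index item, here is a minimal sketch of an FM index with backward search, which counts exact occurrences of a pattern using only the BWT and two small tables. It is a teaching toy under stated assumptions: the class name `FMIndex` is hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored in full rather than sampled as a practical implementation would.

```python
from collections import Counter


class FMIndex:
    """Toy FM index supporting exact-match counting via backward search."""

    def __init__(self, text, sentinel="$"):
        if sentinel in text:
            raise ValueError("text must not contain the sentinel character")
        text += sentinel
        # Naive suffix array: sort suffix start positions by suffix content.
        suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
        # BWT character = character preceding each sorted suffix (wraps at 0).
        self.bwt = "".join(text[i - 1] for i in suffix_array)
        # first[ch] = number of characters in the text strictly smaller than ch.
        counts = Counter(text)
        self.first = {}
        total = 0
        for ch in sorted(counts):
            self.first[ch] = total
            total += counts[ch]
        # occ[ch][k] = number of occurrences of ch in bwt[:k].
        self.occ = {ch: [0] for ch in counts}
        for c in self.bwt:
            for ch, column in self.occ.items():
                column.append(column[-1] + (c == ch))

    def count(self, pattern):
        """Number of occurrences of pattern in the indexed text."""
        lo, hi = 0, len(self.bwt)
        for ch in reversed(pattern):  # extend the match one character to the left
            if ch not in self.first:
                return 0
            lo = self.first[ch] + self.occ[ch][lo]
            hi = self.first[ch] + self.occ[ch][hi]
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    index = FMIndex("banana")
    print(index.count("ana"))  # 2
    print(index.count("nan"))  # 1
```

Each step of the loop narrows a range of sorted suffixes, so the search cost depends on the pattern length rather than on the length of the indexed text.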

Next tasks

On the basis of length of RNA sequence:

1) Humans and animals
2) Different organic crops
3) Change the base length (bp)

Thank You
