
Difficulties in Learning Bioinformatics
Presented To: Dr. Khalid Hussain
Presented By: Ali Zohaib
Roll No: 15211506-063
Department of Botany
Contents
• Introduction
• Multidisciplinary Field
• 3 Problems with Using NCBI BLAST for Sequence Alignments in IP
Searching
  ◦ You want to search the entire sequence, not just a piece of it
  ◦ You need objective and repeatable results
  ◦ Searching for short sequences is tricky
• Lack of Faculty Training
• Too Much Data
• Constant Stream of Software and Tools
Introduction
• Bioinformatics, a discipline combining biology, statistics,
computer science and mathematics, is increasingly
important for biological research
• The most significant barriers to learning bioinformatics are a
lack of student interest owing to its multidisciplinary nature,
limited access to resources such as hardware and software, and
a lack of faculty training
Multidisciplinary Field
• The multidisciplinary nature of the field is a barrier to
learning because many students lack mathematical and
computational knowledge and skills
• Students' ability to think critically and evaluate information
effectively suffers when they must work across several
unfamiliar disciplines
• Students rely on being told exactly what to do and struggle to
solve problems when things don’t turn out as they anticipate
• Biology students shy away from mathematical and statistical
methods, and few have programming experience
3 Problems with Using NCBI BLAST for Sequence
Alignments in IP Searching
• NCBI BLAST is the most popular sequence comparison
algorithm, but it was not built with intellectual property (IP)
sequence searching in mind
• There are three problems with using NCBI BLAST alone for
sequence alignments in this setting
  ◦ You want to search the entire sequence, not just a piece of it
  ◦ You need objective and repeatable results
  ◦ Searching for short sequences is tricky
1. You want to search the entire sequence,
not just a piece of it
• NCBI BLAST is a local alignment algorithm: it tries to find
short stretches of your query that match a database sequence
with very high similarity
• This is ideal in a biological context, where one is looking for
conserved sequences
• In patent searching we often ask a different question: “What
are all of the sequences that are 70% identical to my query?”
For that question a local alignment is the wrong tool, because
identity has to be measured over the whole query
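The difference between a local hit and full-length identity can be sketched in a few lines of Python. This is not how BLAST works internally: it is a deliberately simplified, ungapped comparison of two equal-length sequences, and the function names and example sequences are made up for illustration.

```python
def full_length_identity(query: str, subject: str) -> float:
    """Percent identity over the whole query (ungapped, equal lengths only)."""
    if len(query) != len(subject):
        raise ValueError("this sketch only handles equal-length sequences")
    matches = sum(q == s for q, s in zip(query, subject))
    return 100.0 * matches / len(query)


def longest_identical_stretch(query: str, subject: str) -> int:
    """Length of the longest run of identical positions.

    Used here as a crude stand-in for what a local alignment would report.
    """
    best = run = 0
    for q, s in zip(query, subject):
        run = run + 1 if q == s else 0
        best = max(best, run)
    return best


# Hypothetical sequences: the 10-base core is identical, the flanks differ.
query = "ACGTT" + "GCAAGCTTAG" + "GCTAA"
subject = "TGCAA" + "GCAAGCTTAG" + "CGATT"

print(longest_identical_stretch(query, subject))  # length of the perfect local stretch
print(full_length_identity(query, subject))       # identity over the whole query
```

A local search would report the 10-base core as a perfect match, yet the subject is only 50% identical to the query overall, so it would not satisfy a "70% identical to my query" criterion.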
2. You need objective and repeatable results
• NCBI BLAST is a heuristic algorithm: it does not report every
alignment it finds, because a statistical model decides
whether each match is significant
• The decision is based on the length of the alignment and the
size of the database
• If the database grows, previously reported hits may
disappear
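Why a hit can vanish as the database grows can be illustrated with the standard relation between bit score and expect value, E = m · n · 2^(−S'), where m is the query length, n the database length and S' the bit score. The sketch below ignores BLAST's effective-length corrections, and the score, lengths and cutoff are assumed values chosen only for illustration.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """E = m * n * 2**(-S'), without effective-length corrections."""
    return query_len * db_len * 2.0 ** (-bit_score)


BIT_SCORE = 40.0   # assumed score of one fixed alignment
QUERY_LEN = 100    # assumed query length
CUTOFF = 1.0       # assumed reporting threshold, not a BLAST default

e_small_db = expect_value(BIT_SCORE, QUERY_LEN, 10**9)
e_large_db = expect_value(BIT_SCORE, QUERY_LEN, 10**11)

for label, e in (("10^9 residues", e_small_db), ("10^11 residues", e_large_db)):
    status = "reported" if e <= CUTOFF else "dropped"
    print(f"{label}: E = {e:.3g} -> {status}")
```

The alignment itself is unchanged, but its expect value scales linearly with database size, so the same hit passes the cutoff against the smaller database and fails it against the larger one.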
3. Searching for short sequences is tricky
• NCBI BLAST makes this harder because it uses algorithmic
shortcuts to run faster
• The most important heuristic is its word size parameter: by
default it requires an uninterrupted stretch of eleven identical
nucleotides, or three identical amino acids, before it even
attempts to align two sequences
• This makes it less than ideal for searching short sequences
such as primers and small RNA molecules
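The effect of the word size on short queries can be shown with a small Python sketch. It only checks whether two sequences share at least one exact word of a given length, which is the precondition described above; it does not reproduce the rest of the BLAST pipeline. The function name and the primer sequence are hypothetical.

```python
def has_seed(query: str, subject: str, word_size: int = 11) -> bool:
    """Return True if query and subject share an exact word of length word_size."""
    words = {query[i:i + word_size] for i in range(len(query) - word_size + 1)}
    return any(
        subject[j:j + word_size] in words
        for j in range(len(subject) - word_size + 1)
    )


# Hypothetical 20-base primer and a target that differs at one central base.
primer = "ACGTTGCAAGCTTAGGCTAA"
target = "ACGTTGCAATCTTAGGCTAA"  # position 10: G -> T, so 19 of 20 bases match

print(has_seed(primer, target, word_size=11))  # no exact 11-base word survives
print(has_seed(primer, target, word_size=7))   # a shorter word still finds a seed
```

A single mismatch near the middle of a 20-base primer leaves identical stretches of only 9 and 10 bases, so no 11-base seed exists even though the two sequences are 95% identical. Lowering the word size recovers the seed at the cost of speed.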
Lack of Faculty Training
• Training faculty who are equally well versed in the multiple
fields needed to teach the breadth of bioinformatics is a
long-standing dilemma
• A lack of faculty training hinders the integration of
bioinformatics into the life sciences and creates difficulties for
students
• The training required to produce good bioinformaticians
depends on funding
• Preparing training material is also challenging, because new
information and tools must continually be incorporated into
presentations
Too Much Data
• Too much data, too little solid annotation
• Next-generation sequencing (NGS) methods flood us with DNA
sequences from many organisms, including viruses and bacteria.
With so much data it is very difficult to identify and classify
each gene
• Data management issues arise with large amounts of raw
data
• Perl is easy to use, but simple scripts cannot cope once the
data volume exceeds what they can manage
• Large amounts of CPU power are needed, which is expensive,
starting at 2 lakh € (€200,000)
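One common way to keep a script usable on large files is to stream records instead of loading everything into memory. The sketch below does this for FASTA input in Python rather than Perl; the function name `iter_fasta` and the tiny in-memory example are assumptions for illustration, and streaming only addresses memory use, not the CPU cost mentioned above.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs one record at a time."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous record
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # emit the final record


# Small in-memory stand-in for a large file on disk.
demo = io.StringIO(">seq1\nACGT\nACGT\n>seq2\nGGCC\n")
lengths = {name: len(seq) for name, seq in iter_fasta(demo)}
print(lengths)
```

With a real file, the same generator can be passed an open file handle, so only one record is held in memory at any time.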
Constant Stream of Software and Tools
• Advances in nucleotide sequencing techniques have brought
with them a wealth of bioinformatics programs and software
packages
• It can be difficult for students, researchers and teachers to
choose the right software for their needs, especially if they
don’t have a bioinformatics background
• Installing a program and getting it to work is sometimes a
problem because of dependency hell
• The software that works best and has the potential to
streamline analyses is often very costly
Examples, features and comparisons of some commonly used commercial bioinformatics
software

Software                Company                         Cost   NGS       Evolutionary  Database   Teaching
                                                               Analyses  Analyses      Searching  Suitability
Avadis NGS              Strand Scientific Intelligence  $4500  ✓         ✗             ✗          ✗
Genamics Expression     Genamics                        $295   ✗         ✓             ✓          ✗
NextGENe                SoftGenetics                    $4049  ✓         ✗             ✗          ✗
Sequencher              Gene Codes                      $2500  ✓         ✓             ✓          ✓
CLC Genomics Workbench  CLC bio, Qiagen                 $5500  ✓         ✓             ✓          ✓
CodonCode Aligner       CodonCode                       $720   ✓         ✓             ✗          ✓
Geneious                Biomatters                      $795   ✓         ✓             ✓          ✓