
BioinQA: Addressing bottlenecks of Biomedical Domain through

Biomedical Question Answering System




Sparsh Mittal
Department of Electronics and
Computer Engineering,
IIT Roorkee, India, 247667.
sparsuch@iitr.ernet.in


Saket Gupta
Department of Electronics and
Computer Engineering,
IIT Roorkee, India, 247667.
freshuce@iitr.ernet.in


Dr. Ankush Mittal
Department of Electronics and
Computer Engineering,
IIT Roorkee, India, 247667
ankumfec@iitr.ernet.in


Sumit Bhatia
Department of Electrical
Engineering,
IIT Roorkee, India, 247667
sumi1uee@iitr.ernet.in
ABSTRACT
Recent advances in the realm of biomedicine and
genetics in the post-genomics era have resulted in an explosion
in the amount of biomedical literature available. Large text
banks comprising thousands of full-text biology papers are
rapidly becoming available. Owing to the gigantic volumes of
information available and the lack of efficient, domain-specific
information retrieval tools, it has become extremely difficult to
extract relevant information from these large repositories of
data. While specialized information retrieval tools are not
suitable for beginners, general-purpose search engines are not
intelligent enough to respond to domain-specific questions,
predominantly from the medical, bioinformatics, and related fields.
In this paper, we present an intelligent question answering
system that responds efficiently and quickly to the natural-language
questions of novices and is smart enough to answer the
questions of advanced users (researchers). It employs various
natural language processing techniques to answer user
questions, even across multiple documents. The system also
makes use of metadata knowledge to address specific
biomedical-domain concerns such as heterogeneity and acronyms.

1. INTRODUCTION
Recent technological advancements in biomedicine
and genetics have resulted in an unimaginably vast amount of
data that keeps growing at an ever-increasing rate. An idea of
the growth of the biomedical literature can be had from the fact
that the PUBMED database of the National Library of Medicine
(NLM) contains more than 14 million articles, and hundreds of
thousands more are added every year [1].
However, such a huge repository of data is useful
only if it can be easily accessed and its contents retrieved
according to user requirements, providing information in a
suitable form [2]. Modern search engines such as Google have
a huge storehouse of information, but users face the limitation
of manually searching through the documents returned for their
queries, which may be vast in number. The meaning of a query
is very relevant: for example, "How is snake poison employed
in counteracting neurotoxic venom?", "When is snake poison
employed in counteracting neurotoxic venom?", and "Why is
snake poison employed in counteracting neurotoxic venom?"
all have different meanings.
Very few research groups are working on medical,
domain-specific question answering [3]. Generic open-domain
question answering systems are not suitable for dealing with
biomedical text, where complex technical terms are used in
domain-specific contexts [4]. Term-order variations and
abbreviations are also quite common in this field [5], and
definitional questions [6] are few, for mostly an analytical
explanation is needed. Specialized information retrieval
systems such as PUBMED are generally used by researchers
and experts, whereas general information retrieval systems
such as Google, which are preferred by neophytes of the field,
suffer from the inherent drawbacks of open-domain information
retrieval systems. [5] and [7] provide a good study of the
feasibility of medical question answering systems and of the
limitations of open-domain question answering systems when
applied to the biomedical domain. Also, differences between
the jargon used by novices and by advanced users cause
heterogeneity (see Literature Review). Our QA system assigns
the task of resolving differences between the terms and formats
used by the user and those in the corpus entirely to the system,
requiring no prior knowledge of the complex jargon of the
subject on the part of the user.
In our efforts to develop a system that makes the
process of information retrieval over large amounts of biomedical
data efficient and user friendly, we built a system with
the following contributions:
1. In the field of multi-document search, our QA system is a
step toward next-generation systems capable of extracting
answers from diverse documents.
2. This paper proposes tools to resolve and mitigate the
semantic (essential) heterogeneity problem for
bioinformatics metadata. Information integration of
bioinformatics data is becoming a very vital problem.
3. To make the system useful for a wide range of users, we use
different weighting and ranking schemes to present to each
user the information most important to him (see Sections 3
and 4).
4. Our system is capable of answering a wide variety of
questions in addition to definitional questions (e.g., How
does Tat enhance the ability of RNA polymerase to
elongate?).
5. The system integrates diverse resources such as scanned
books, doc files, and PowerPoint slides, which differ in their
information and presentation methods. Books are illustrative
and give detailed analyses of concepts; slides are condensed,
highlighting the key points. The system integrates
information from these different types of documents.
The paper is organized as follows: Section 2 describes
previous work in the biomedical field and related areas, and
their bottlenecks. Section 3 describes the operational facets of the
question answering system. Section 4 presents the multi-
document extraction problem and our solution. Section 5
describes our implementation to resolve the heterogeneity
problem and use of metadata. Section 6 discusses experiments
and results. Section 7 gives conclusions and briefly discusses
future research.

2. LITERATURE REVIEW
An overview of the role and importance of question
answering systems in biomedicine is provided by [3]. TREC,
one of the major forums for discussions related to question
answering systems, has included a track on Genomics. EQueR,
the French evaluation campaign of question-answering
systems, is the first to provide a medical track [8].
MedQA, developed by Lee et al. [6], is a biomedical
question answering system that caters to the needs of
practicing physicians. However, it is still limited by its
ability to answer only definitional questions. Another question
answering system, restricted to the genomics domain, has been
developed by Rinaldi et al. [9]. They adapted an open-domain
question answering system to answer genomic questions, with
emphasis on identifying term relations based on a linguistically
rich full parser. In a study conducted with a test set
of 100 medical questions collected from medical students in a
specialized domain, a thorough search in Google was unable to
obtain relevant documents within the top five hits for 40% of the
questions [10]. Moreover, due to busy practice schedules,
physicians spend less than two minutes on average seeking an
answer to a question; thus, most clinical questions remain
unanswered [11]. Our system, in contrast, answers the user's
questions quickly and succinctly.
The Heterogeneity Problem - Current systems are
not flexible enough to adapt themselves to the knowledge level
and requirements of a user. Agreeing on a common standard of
uniformity for similar concepts or data relationships has
drawbacks: as with mediators/wrappers, it remains difficult for
standards to keep up with a dynamically changing domain
(e.g., novices and researchers generally conform to different
vocabularies).
[12] aims to semantically integrate metadata in
bioinformatics data sources. Heterogeneity of metadata is
either of an accidental or an essential nature. Accidental
heterogeneity arises from the use of different formats and
representation systems (e.g., XML, flat files, or other formats)
and can be solved through translation systems, which perform
format conversion. Essential heterogeneity, also called
semantic heterogeneity, arises from using varied vocabulary to
describe similar concepts or data relationships, or from using
the same metadata to describe different concepts or data
relationships. The mediator/wrapper-based strategy [13], [14]
has not been widely successful because it solves the problem
reactively, after it occurs (which is more difficult).

3. SYSTEM ARCHITECTURE
System Overview
Figure 1 shows the system block diagram.

[Figure 1. System block diagram: the user's question undergoes Question Classification; general questions pass through Question Parsing (question focus, noun phrases, POS information), Answer Extraction (passage retrieval and scoring), and Answer Selection (NP matching and ranking) over the corpus entity file generated by the Link Parser; multi-document questions undergo Segmentation, mapping of components to their respective domains, answer mining from the respective domain documents, and Passage Sieving using entity clusters; metadata information (heterogeneity resolution, acronym expansion, and the CRG for implicit assumptions) then produces the final answer.]

The question answering system is based on searching the entities
of the corpus in context for effective extraction of answers. The
system recognizes entities by searching the course material using the
Link Parser. This is especially useful in the biomedical domain,
where extended lexicon terms (e.g., immunodeficiency, hematopoietic,
nonmyeloablative) are classified as entities. The
question is parsed using the Link Parser during question parsing.
Query formulation translates the question into a set of queries
that is given as keyword input to the retrieval engine. We used
Seft for context-based retrieval and answer re-ranking.
Answer Mining
In this QA system, the Link Grammar Parser determines the
question's syntactic structure to extract part-of-speech
information. The question classifier then uses pattern matching
based on wh-words (e.g., "when" refers to an event, "why" to a
reasoning type) and simple part-of-speech information to
determine question types [15]. Questions seeking a comparison
may need the answer to be extracted from more than one
passage or document; these are dealt with separately (Section 4).
In the next step, the question focus is identified by
finding the object of the verb. Importance is given to the question
focus by assigning it more weight during answer retrieval.
Quite logically, the answers are most appropriate
when there is a local similarity between the text and the query.
For example, for the question "Is nonmyeloablative allogeneic
transplantation feasible in patients having HIV infection?", the
query terms "nonmyeloablative", "allogeneic",
"transplantation", etc. have a local similarity that is identified in
the text by the locality-based similarity algorithm. The
contribution of each occurrence of each query term is summed
to arrive at a similarity score for any particular location in any
document in the collection. The software tool Seft [16] matches
conventional information retrieval systems in accuracy and is fast
enough to be useful on hundreds of megabytes of text.
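The locality-based scoring just described can be sketched as follows. The decay function, window size, and whitespace tokenization below are illustrative assumptions on our part, not Seft's actual internals:

```python
def locality_score(tokens, query_terms, window=20):
    """Score each token position by summing contributions of nearby
    query-term hits. A hit contributes 1/(1 + distance) to positions
    within `window` tokens, so locations where several query terms
    co-occur score highest. (Sketch only; decay and window are
    illustrative assumptions, not the Seft implementation.)"""
    query = {t.lower() for t in query_terms}
    hits = [i for i, tok in enumerate(tokens) if tok.lower() in query]
    scores = [0.0] * len(tokens)
    for pos in range(len(tokens)):
        for h in hits:
            d = abs(pos - h)
            if d <= window:
                scores[pos] += 1.0 / (1 + d)
    return scores

tokens = ("nonmyeloablative allogeneic transplantation is feasible in "
          "selected patients with HIV infection").split()
scores = locality_score(tokens, ["nonmyeloablative", "allogeneic", "transplantation"])
# The highest-scoring location lies where the three query terms cluster.
best = max(range(len(tokens)), key=scores.__getitem__)
```

Positions surrounded by several query-term occurrences accumulate the largest scores, which is the locality effect the retrieval engine exploits.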
The query formulation module finds query words
from the question to provide input to the retrieval engine.
The system constructs a hash table of the entities identified
in the question based on the entity file, which is built from
the table of contents, the index, or the glossary of the biomedical
corpus. These keywords (entities) are considered most
important and are given the maximum weight; this also avoids
ranking passages higher merely because of the frequent
occurrence of common noun words (as is done in search engines).
Most importantly, the key issues of resolving
heterogeneity, expanding acronyms, and understanding the user's
implicit assumptions are also addressed in the answer
extraction module (detailed in Section 5). BioinQA then
performs phrase-matching-based re-ranking by searching for
occurrences of the noun phrases identified by the question parser
above. After phrase matching, the system processes the passages
according to the classification done during question classification.
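The entity-weighted query formulation above can be sketched as a simple lookup table. The entity file contents and the weight values here are toy assumptions for illustration:

```python
# Hypothetical entity file built from the corpus index/glossary
# (illustrative entries only; the real file is corpus-derived).
ENTITY_FILE = {"tat", "rna polymerase", "intron", "exon", "glycoprotein"}

def query_weights(question_terms, entity_weight=3.0, default_weight=1.0):
    """Assign the maximum weight to corpus entities and a baseline
    weight to other content words, so frequent common nouns cannot
    dominate passage ranking. Weight values are illustrative."""
    weights = {}
    for term in question_terms:
        key = term.lower()
        weights[term] = entity_weight if key in ENTITY_FILE else default_weight
    return weights

w = query_weights(["Tat", "enhance", "elongate"])
```

Passage scores then multiply each matched term's contribution by its weight, which favours passages containing the domain entities over those that merely repeat ordinary nouns.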
4. MULTI-DOCUMENT RETRIEVAL
To explain the differences between components, a new
algorithm was developed and implemented.
Segregate Algorithm
Answers to domain questions involving a
comparison, differentiation, or contrast between two
different entities of the corpus generally lie in different
documents. For example:
o What is the difference between Introns and Exons?
o Contrast between Lymphadenopathy and Leukoreduction.
We developed the Segregate Algorithm, which maps the two
separate ingredients (components) of the question (for example,
Lymphadenopathy and Leukoreduction) to their respective
information documents. The actual answer may be found in
different documents owing to the different natures of the
entities involved in the question. Documents are then scanned
for these components, and the top n documents thus obtained
are re-ranked based on passage sieving.
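The component-to-document mapping step of the Segregate Algorithm might be sketched as follows; ranking documents by raw mention counts is a simplifying assumption of ours, not necessarily the paper's exact scoring:

```python
def segregate(component_a, component_b, documents, top_n=3):
    """Map each question component to the documents that mention it,
    returning the top-n documents per component by mention count.
    (Sketch of the Segregate Algorithm's mapping step; frequency-based
    ranking is a simplifying assumption.)"""
    def rank_for(term):
        term = term.lower()
        counts = {name: text.lower().count(term) for name, text in documents.items()}
        ranked = sorted((n for n, c in counts.items() if c > 0),
                        key=lambda n: -counts[n])
        return ranked[:top_n]
    return {component_a: rank_for(component_a), component_b: rank_for(component_b)}

# Toy corpus: the two components of a difference question live in
# different documents, as the algorithm expects.
docs = {
    "doc1": "Introns are non-coding sequences removed during splicing. Introns...",
    "doc2": "Exons are the coding regions retained in mature mRNA.",
    "doc3": "Splicing joins exons after introns are excised.",
}
mapping = segregate("introns", "exons", docs)
```

The passages drawn from each component's document list are then re-ranked by the entity-cluster sieving described next.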
Entity Cluster Matching Based Passage Sieving
The obtained passages depict a contrast most accurately
when their parameters, or entity clusters (linked lists
of the entities of a passage along with their frequency of
occurrence in that passage), are very similar (e.g., the possible
parameters for comparing medicines would be duration of
action, dosage, ingredient levels, side effects, etc.). Thus, re-ranking
is performed by generating such entity clusters for
each document and matching them. The link parser in the
system recognizes the entities of the passages of one component
and matches them against those of the second by the following
procedure:
Let G_{i,n} be the entity cluster set of the n-th answer in
the i-th component, where 1 <= i <= 2 and 1 <= n <= 10. The
score obtained from the Entity Cluster Matching Based Re-ranking
algorithm, ECRScore, is given by

    ECRScore_{i,n} = sum_{k=1}^{10} C_{n,k},    1 <= i <= 2, 1 <= n <= 10.

Here C_{n,k} is the similarity function, defined as

    C_{n,k} = G_{1,n} (x) G_{2,k} = G_{2,k} (x) G_{1,n}

where the operator (x) matches the entities present in both its
operands to measure the similarity between them. Now, the
FinalScore_{i,n} of all the passages is calculated as

    FinalScore_{i,n} = w_1 * CurrentScore_{i,n} + w_2 * ECRScore_{i,n},
    where w_1 + w_2 = 1.

Here CurrentScore_{i,n} is the score of the passage obtained
from the answer selection phase, and w_1, w_2 are weights given to
the two scores to incorporate the contributions of both modules;
they were chosen empirically in our system as 0.7 and 0.3
respectively. Finally, the answer passages are ranked according to
their FinalScore and the top 5 passages are presented to the user.
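The scoring above can be sketched in code. Reading the matching operator as the count of entities shared by two clusters is a minimal interpretation on our part; a frequency-weighted variant would also fit the definition:

```python
from collections import Counter

def ecr_score(cluster_1n, clusters_2):
    """ECRScore for answer n of component 1: the sum over k of C(n,k),
    where C(n,k) here counts the entities shared by the two clusters
    (a minimal reading of the matching operator; an assumption)."""
    return sum(len(set(cluster_1n) & set(c)) for c in clusters_2)

def final_score(current_score, ecr, w1=0.7, w2=0.3):
    # w1 + w2 = 1; 0.7/0.3 are the empirically chosen weights.
    return w1 * current_score + w2 * ecr

# Toy entity clusters: entity -> frequency in the passage.
g1 = Counter({"dosage": 2, "side-effects": 1, "duration": 1})
g2_clusters = [Counter({"dosage": 1, "duration": 3}),
               Counter({"ingredient": 2})]
score = final_score(current_score=4.0, ecr=ecr_score(g1, g2_clusters))
```

Passages whose clusters share many parameters with the other component's passages gain ECRScore, so the pair finally presented tends to describe both components along the same axes of comparison.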
Figure 2. Sample output of BioinQA for the question "What is the
difference between glycoprotein and lipoprotein?"
5. SEMANTIC HETEROGENEITY RESOLUTION THROUGH METADATA
Bioinformatics is a multidisciplinary field, with users
of all levels, from general readers and students to
researchers and medical practitioners. To bridge the gap
between the levels of understanding of an experienced researcher
and a novice, our system employs metadata information during
answer extraction:
I. Utilization of scientific and general terminology: A non-biology
student is not likely to access information via
"homo sapiens" but via "humans". The user himself decides
whether to use the system in novice-search or
advanced-user-search mode (see clip). We have developed the
Advanced-and-Learner-Knowledge Adaptive (ALKA)
algorithm, which performs selective ranking (of the
initial 10 passages) on these principles: researchers use
scientific terms and the terminology of the jargon more
frequently. These may also include equations, numeric data
(numbers, percentage signs), and words of great length such as
Lymphadenopathy. Thus, documents relevant for
researchers will include more of such terms with a higher
frequency, because this actually fulfils the need of the user,
whereas those meant for the novice will include simple
(short) words with few numbers or equations. The entity
file of the corpus constructed in the initial phase is configured to
classify the terms as either biological (e.g., Efavirenz),
scientific (e.g., homo sapiens), or general (e.g., human), using
metadata information. If a passage contains many frequently
occurring scientific terms, it is given a lower rank for the
novice and a higher one for the advanced user.
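A sketch of ALKA-style selective ranking follows. The "long word" cutoff and the jargon-density measure are illustrative assumptions; the actual algorithm also uses the classified entity file:

```python
import re

def alka_rank(passages, advanced_user, min_long=12):
    """Re-rank passages by jargon density: passages dense in long
    words, numbers, and percent signs rank higher for advanced users
    and lower for novices. Thresholds are illustrative assumptions."""
    def jargon_density(text):
        tokens = re.findall(r"\S+", text)
        hits = sum(1 for t in tokens
                   if len(t) >= min_long
                   or any(ch.isdigit() for ch in t)
                   or "%" in t)
        return hits / max(len(tokens), 1)
    return sorted(passages, key=jargon_density, reverse=advanced_user)

p_simple = "TB is a disease of the lungs spread by coughing."
p_expert = ("Lymphadenopathy in 47% of immunodeficiency cohorts "
            "suggests hematopoietic involvement.")
ranked = alka_rank([p_simple, p_expert], advanced_user=True)
```

The same passage set thus yields opposite orderings for the two user types, which is the adaptive behaviour Figure 3 illustrates.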
II. Use of acronyms: Acronyms are of great importance in a field
like biomedicine, where precise scientific terms are used and
any error introduced by the need to type long names can be
critical. Solving the acronym problem not only saves the user's
time but also relieves him of the burden of remembering long
scientific names to single-character accuracy.
Manually built acronym lists have been employed to
resolve the differences in meaning caused by the use of an
acronym in one place and its full form in another. Many acronym
lists have been compiled and published, and many are
available on the Web (e.g., Acronym Finder and the Canonical
Abbreviation/Acronym List). As the purpose of this study was
to demonstrate the use of information about acronym expansions
in enhancing the answers obtained from a question
answering system, the use of a manually built acronym list is
justified.
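The manually built acronym list amounts to a lookup-and-expand step over the question, sketched here with a toy list (the entries and the exact expansion format are illustrative assumptions):

```python
# A tiny manually built acronym list (illustrative entries only;
# the real list, as in the paper, would be far larger).
ACRONYMS = {
    "TB": "Tuberculosis",
    "HIV": "Human Immunodeficiency Virus",
    "RNA": "Ribonucleic Acid",
}

def expand_acronyms(question):
    """Replace each known all-caps acronym with 'acronym (expansion)'
    so that both the short and the long form match during retrieval."""
    out = []
    for w in question.split():
        key = w.strip("?.,").upper()
        if key in ACRONYMS and w.strip("?.,").isupper():
            out.append(w.replace(key, f"{key} ({ACRONYMS[key]})"))
        else:
            out.append(w)
    return " ".join(out)

q = expand_acronyms("How is TB transmitted?")
```

Expanding at query time means the corpus need not be rewritten, and passages using either form remain retrievable.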
III. Comprehending the implicit assumptions of the user: It
is a common observation that a typical user question rarely
contains the full information required to answer it.
Rather, it essentially contains many unstated assumptions and
also requires extending or narrowing the meaning of the
question to broaden or restrict the search. This is
actually the case in real life for humans, as their
conversations hardly include full detail but leave many
things for the listener to assume. For example, a user may ask
"How does Tat enhance the ability of RNA polymerase to
elongate?" It is up to the system to decide between the 3 RNA
polymerases (1, 2, and 3).
To perform in such circumstances, the system is built
with a Concepts Relation Graph (CRG), which enhances the search
with the knowledge represented in the graph (the CRG is a form
of metadata information).

Figure 3. Different outputs for advanced and novice users
respectively, and a display of acronym expansion.

CRG is a one-to-many
relation graph representation of concepts and data of the
biomedical domain (the entities corresponding to the nodes of
the graph are obtained by this relation). For example, in the
above-mentioned question, the concept of RNA will be related
to the three possible variants, namely the RNA polymerases 1,
2, and 3. The CRG is meant to fill in the missing information
required to answer the question, or to remove the ambiguity
from the question. Given an ambiguous question, as determined
by the CRG, the user can either be prompted to supply more
information, or the system can still answer the question with
the aid of the CRG. Clearly, a general user is not likely
to know everything about the searched concept at the
beginning, so the latter approach is better. Hence, the system
recognizes the keywords present in the question, as well as
those in the CRG, augmenting the search with the CRG
entities. The user is then presented with the answer, along with
the knowledge from the CRG. The user can choose to take the help

provided by the CRG, and thus can select the suitable answer
without having to search again by supplying more
information. This approach is general enough to solve a
variety of problems. If more precise information about
the user's background is available, the system can be
configured to provide a unique and unambiguous answer to the
user by selecting just one entity from the CRG. This
approach paves the way for a friendly QA system that saves
the user from having to enter elaborate information in the
question (although at the expense of accuracy). Figure 3 shows
the difference in the levels of answers obtained for the novice
and the advanced user, with acronym expansion (Tuberculosis
retrieved from the word TB).
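A minimal sketch of CRG-based query augmentation follows. The toy graph entries and the augmentation policy (always add variants, flag ambiguous concepts for the user) are our assumptions; the real CRG is built from the biomedical corpus:

```python
# A toy Concepts Relation Graph: each concept maps to its related
# variants (illustrative entries only).
CRG = {
    "RNA polymerase": ["RNA polymerase 1", "RNA polymerase 2", "RNA polymerase 3"],
    "TB": ["Tuberculosis"],
}

def augment_with_crg(question_keywords):
    """Return the keyword list augmented with CRG variants, plus a map
    of concepts that are ambiguous (more than one variant), which the
    user may later be asked to disambiguate."""
    augmented = list(question_keywords)
    ambiguous = {}
    for kw in question_keywords:
        variants = CRG.get(kw, [])
        if len(variants) > 1:
            ambiguous[kw] = variants
        augmented.extend(variants)
    return augmented, ambiguous

keywords = ["Tat", "RNA polymerase", "elongate"]
augmented, ambiguous = augment_with_crg(keywords)
```

Because the variants are added rather than forcing one choice, the system can still answer an underspecified question and let the user pick among the CRG-suggested alternatives afterwards.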

6. EXPERIMENTATION
As the sample resource, abstracts were taken from
PUBMED to experiment on the system. Difference-seeking
questions are not generally available for use as test
questions, unlike in open-domain evaluations, where test
questions can be mined from question logs (Encarta, Excite,
AskJeeves); we therefore had them constructed by one of the
biomedical students.
To build a set of questions, we took 40 normal
questions and 20 difference-seeking questions from
general students by conducting a survey. The group comprised
beginners as well as sophomores, to simulate use
of the system by both novice and expert users. The questions
thus received were of widely varying difficulty levels, covering
various topics of the subject. For each question the system
presents the top 5 answers to the user (and 3 for difference-seeking
questions). A question is considered answered only if the
answer is available in the text presented to the user (and not
merely in the document from which the text was retrieved).
Comparison of BioinQA with the Google Search Engine
We compared our system with the most sophisticated
search engine, Google. The questions were posed to Google, and
the top 5 documents were checked for the presence of the answer.
Evaluation metrics
For general questions we used the popular metric
Mean Reciprocal Answer Rank (MRAR), suggested in TREC
[17] for the assessment of question answering systems, which
is defined as follows:

    MRAR = (1/n) * sum_{i=1}^{n} RR_i = (1/n) * sum_{i=1}^{n} 1/rank(i)

where n is the number of questions and RR_i = 1/rank(i) is the
Reciprocal Rank of the i-th question.
For the evaluation of comparison-based questions, no metric has
been suggested in the literature. To evaluate BioinQA's
performance on such questions, a novel metric was adopted,
called Mean Correlational Reciprocal Rank (MCRR), defined
as follows. Let rank1 and rank2 be the ranks of the correct
answers given by the system for the two components respectively.
Then

    MCRR = (1/n) * sum_{i=1}^{n} 1/(rank1(i) * rank2(i))

where n is the number of questions.
If the answer to a question is not found in the passages
presented to the user, then the rank of that question is assumed
to be a default value that is large compared to the number of
passages. For the calculation of MRAR this value is taken as
infinity (so the reciprocal rank is zero). To calculate
MCRR, it is taken as a much smaller value, as this avoids
punishing the case where the system provided the answer to only
one of the two components; in our experiments we took it as 10.
The use of MCRR, being very similar to MRAR, can be
justified because it is symmetric with respect to the objects being
compared (it takes the difference between A and B and the
difference between B and A to be the same), and because the
answer to a comparison question is complete only when both
components (e.g., lipoprotein and glycoprotein) are described,
not just one. It thus punishes answers in which only one
component has been answered.
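Both metrics are straightforward to compute. A sketch follows, with None marking an unanswered question or component, and the MCRR default rank of 10 as described above:

```python
def mrar(ranks):
    """Mean Reciprocal Answer Rank. A rank of None means 'not
    answered' and contributes 0 (i.e., rank taken as infinity)."""
    return sum(0.0 if r is None else 1.0 / r for r in ranks) / len(ranks)

def mcrr(rank_pairs, default_rank=10):
    """Mean Correlational Reciprocal Rank for comparison questions.
    A missing component gets the default rank (10 in our experiments),
    which is milder than infinity when only one component was found."""
    total = 0.0
    for r1, r2 in rank_pairs:
        r1 = default_rank if r1 is None else r1
        r2 = default_rank if r2 is None else r2
        total += 1.0 / (r1 * r2)
    return total / len(rank_pairs)

m1 = mrar([1, 2, None, 1])      # (1 + 0.5 + 0 + 1) / 4 = 0.625
m2 = mcrr([(1, 1), (2, None)])  # (1 + 1/20) / 2 = 0.525
```

Note how the second MCRR question still earns partial credit for the one answered component instead of being zeroed out, which is the behaviour the default rank is chosen to produce.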
Results: We calculated MRAR and MCRR for our system and
for the Google search engine. The following table and graphs
summarize the results of our experiments:





Table 1: Experimental results of BioinQA and Google on
the data set

            MRAR      MCRR
BioinQA    0.7333    0.3096
Google     0.6328    0.2195

Figure 4. Plot of (a) MRAR vs. % of questions asked; (b)
MCRR vs. % of questions asked.

Evaluation of the results:
As opposed to BioinQA, where the answer passages
provided to the user were taken as correct, for Google the authors
had to manually search the whole document returned by it, to check
whether it somewhere contained the answer. This makes the
user effort exorbitantly large for Google. Moreover, this
strategy completely fails for comparison-based questions if
Google does not happen to find a direct answer in the same
words as those of the question. Figure 5 illustrates this
ineffectiveness of Google.


Figure 5. No answer from the Google search for the question
"What is the difference between lipoprotein and
glycoprotein?"

7. CONCLUSIONS AND FUTURE WORK
Our biomedical QA system uses the techniques of
entity recognition and matching. The system is based on searching
in context and utilizes syntactic information. BioinQA also
answers comparison-type questions from multiple
documents, a feature which contrasts sharply with existing
search engines, which merely return answers from a single
document or passage. The use of metadata to understand
the implicit assumptions of the user, to accommodate acronyms,
and to answer the question based on the expertise of the user
(rather than giving a fixed answer to every user irrespective of
background) makes the system adaptive to the needs of the user.
Our future work will focus on developing a
systematic framework for image (jpeg, bmp, etc.) extraction
and a method for its contextual presentation alongside the
textual data in the answer to a question, which will greatly
enhance the user's understanding. Along with images, the focus
will be on incorporating the audio lectures available in
e-learning facilities, and other sources such as PUBMED.

8. REFERENCES
[1]. http://www.ncbi.nlm.nih.gov/ - National Center for
Biotechnology Information. Last accessed 27 September,
2007.
[2]. Stergos Afantenos, Vangelis Karkaletsis, Panagiotis
Stamatopoulos. Summarization from medical documents: a
survey. 13th April, 2005.
[3]. Zweigenbaum P. Question answering in biomedicine.
Workshop on Natural Language Processing for Question
Answering, EACL 2003.
[4]. Schultz S., Honeck M., and Hahn H. Biomedical text
retrieval in languages with complex morphology. Proceedings
of the Workshop on Natural Language Processing in the
Biomedical Domain, July 2002, pp. 61-68.
[5]. Song Y., Kim S., and Rim H.. Terminology indexing and
reweighting methods for biomedical text retrieval.
Proceedings of the SIGIR'04 workshop on search and
discovery in bioinformatics, ACM, Sheffield, UK, 2004.
[6]. Minsuk Lee, James Cimino, Hai Ran Zhu, Carl Sable,
Vijay Shanker, John Ely, Hong Yu. Beyond Information
Retrieval: Medical Question Answering. AMIA, 2006.
[7]. Jacquemart P. & Zweigenbaum P. Towards a medical
question-answering system: a feasibility study. In R. Baud, M.
Fieschi, P. Le Beux & P. Ruch, Eds., Proceedings Medical
Informatics Europe, volume 95 of Studies in Health
Technology and Informatics, pp. 463-468, Amsterdam: IOS
Press (2003).
[8]. Ayache, C. Rapport final de la campagne EQueR-EVALDA,
Evaluation en Question-Réponse, 2005. Technolangue web site:
http://www.technolangue.net/article61.html.
Last accessed 15th June 2007.
[9]. Rinaldi F., Dowdall J., Schneider G. & Persidis A.
Answering questions in the genomics domain. ACL 2004 QA
Workshop, 2004.
[10]. P. Jacquemart, and P. Zweigenbaum, Towards a medical
question-answering system: a feasibility study, In
Proceedings Medical Informatics Europe, P. L. Beux, and R.
Baud, Eds., 2003, Amsterdam. IOS Press.
[11]. J. Ely, J. A. Osheroff, M. H. Ebell, et al. Analysis of
questions asked by family doctors regarding patient care.
BMJ, vol. 319, 1999, pp. 358-361.
[12]. Lei Li, Roop G. Singh, Guangzhi Zheng, Art
Vandenberg, Vijay Vaishnavi, Sham Navathe. A
Methodology for Semantic Integration of Metadata in
Bioinformatics Data Sources. 43rd ACM Southeast
Conference, March 18-20, 2005, Kennesaw, GA, USA.
[13]. Chen, L., Jamil, H. M., and Wang, N. Automatic
Composite Wrapper Generation for Semi-Structured
Biological Data Based on Table Structure Identification.
SIGMOD Record 33(2):58-64, 2004.
[14] Stoimenov, L., Djordjevic, K., Stojanovic, D. Integration
of GIS Data Sources over the Internet Using Mediator and
Wrapper Technology. Proceedings of the 2000 10th
Mediterranean Electrotechnical Conference. Information
Technology and Electrotechnology for the Mediterranean
Countries (MeleCon 2000), pp. 334-336.
[15]. Kumar P., Kashyap S., Mittal A., Gupta S. A Fully
Automatic Question Answering System for Intelligent Search
in E-Learning Documents. International Journal on E-Learning
(2005) 4(1), 149-166.
[16]. Owen de Kretser, Alistair Moffat. Needles and Haystacks:
A Search Engine for Personal Information Collections.
Australasian Computer Science Conference, p. 58, 2000.
[17]. Giovanni Aloisio, Massimo Cafaro, Sandro Fiore, Maria
Mirto. ProGenGrid: a Workflow Service Infrastructure for
Composing and Executing Bioinformatics Grid Services.
Proceedings of the 18th IEEE Symposium on Computer-Based
Medical Systems (CBMS'05).



ABOUT THE AUTHORS
Dr. Ankush Mittal: Dr Ankush Mittal
is a faculty member at Indian Institute of
Technology Roorkee, India. He has
published many papers in the international
and national journals and conferences. He
has been an editorial board member of the
International Journal on Recent Patents on
Biomedical Engineering and a reviewer for IEEE
Transactions on Multimedia, IEEE Transactions on Circuits
and Systems for Video Technology, IEEE Transactions on
Image Processing, IEEE Transactions on Fuzzy Systems,
IEEE TKDE, etc. He was awarded the Young Scientist Award
by The National Academy of Sciences, India (2006) for
contributions to e-learning in the country, a best paper award
(with Rs. 10,000) at the IEEE ICISIP conference, 2005, and
Star Performer, 2004-05, IIT Roorkee, based on overall
performance (teaching, research, thesis supervision, etc.).
His research interests include Image Processing and Object
Tracking, Bioinformatics, E-Learning, Content-Based
Retrieval, AI and Bayesian Networks.

Sparsh Mittal: Sparsh Mittal is a senior
undergraduate student of Electronics &
Communications Engineering Department at
Indian Institute of Technology Roorkee,
India. His research interests include natural
language processing, data mining, FPGA
implementation using VHDL and Verilog
and image processing.

Saket Gupta: Saket Gupta is a senior
undergraduate in Electronics and
Communication Engineering Department at
Indian Institute of Technology Roorkee,
India. He has worked on Content Based
Retrieval, QA Systems and other NLP
applications for e-learning. His current field
of research includes MIMO communication systems; Image
processing; and FPGA synthesis and design using VHDL. He
has been awarded many scholarships from IIT Roorkee and
from other institutions.

Sumit Bhatia: Sumit Bhatia is a senior
undergraduate student in Electrical
Engineering Department at Indian Institute
of Technology Roorkee, India. His current
research interests include Content Based
Information retrieval and Data Mining. In
the past, he has worked in the areas of
Digital Image processing and Remote Sensing.
