
Example Database

English-German
Dictionary




Zibilianu
Example Database
Consists of sentences extracted from the
bilingual corpus.
It is an XML document.
The entry structure:
-ID
-language
-equivalent
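
As a rough illustration of the entry structure above, here is a minimal Python sketch that reads such entries with the standard library's DOM parser. The element name `entry`, the attribute names and the sample sentences are assumptions made for this example; the slides name only the three fields, not the actual XML layout.

```python
from xml.dom.minidom import parseString

# Hypothetical layout: the slides list only the fields (ID, language,
# equivalent); the tag names, attribute names and sentences are invented.
SAMPLE_DATABASE = """<database>
  <entry id="s1" language="en" equivalent="s2">There is a book on the table.</entry>
  <entry id="s2" language="de" equivalent="s1">Es gibt ein Buch auf dem Tisch.</entry>
</database>"""


def read_entries(xml_text):
    """Return a dict mapping each entry ID to its fields."""
    document = parseString(xml_text)
    entries = {}
    for node in document.getElementsByTagName("entry"):
        entries[node.getAttribute("id")] = {
            "language": node.getAttribute("language"),
            "equivalent": node.getAttribute("equivalent"),
            "text": node.firstChild.data.strip(),
        }
    return entries


entries = read_entries(SAMPLE_DATABASE)
# Follow the "equivalent" link from the English sentence to the German one.
print(entries[entries["s1"]["equivalent"]]["text"])
```

The "equivalent" field is treated here as the ID of the matching sentence in the other language, which is one plausible reading of the slide.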

The English-German Dictionary
Consists of two XML files containing
English and German words and
connections between them.

The dictionary does not cover the entire
corpus, but most of the words in the
Database can be found here as well.
Dictionary - Word Structure
Each entry has the following fields:
ID
Grammatical Category
Grammatical Subcategory
Equivalent (the connection to the equivalent or
equivalents the word has in the other
language)
Spelling
Attributes (which contains specific information
for each part of speech)
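
The fields listed above could be modelled in memory roughly as follows. This is only a sketch: the class name `WordEntry`, the field names, the IDs and the sample values are assumptions chosen for illustration, not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class WordEntry:
    """One dictionary entry; names are hypothetical, fields follow the slide."""
    entry_id: str
    category: str        # grammatical category, e.g. "NOUN"
    subcategory: str     # grammatical subcategory, e.g. "Common"
    equivalents: list    # IDs of the equivalent entries in the other language
    spelling: str
    attributes: dict = field(default_factory=dict)  # depends on part of speech


# Invented sample pair: the German noun carries gender and case,
# the English noun does not.
house = WordEntry("en_1", "NOUN", "Common", ["de_1"], "house",
                  {"number": "singular"})
haus = WordEntry("de_1", "NOUN", "Common", ["en_1"], "Haus",
                 {"gender": "neuter", "number": "singular", "case": "nominative"})

print(house.equivalents, sorted(haus.attributes))
```

Keeping the part-of-speech details in a free-form `attributes` mapping mirrors the slide's description of a single Attributes field whose content varies by category.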


Dictionary Structure
The parts of speech contained in the dictionary
are:
Nouns
Verbs
Pronouns
Prepositions
Conjunctions
Adverbs
Adjectives
Articles


Dictionary Structure - Nouns
Fields:
Category: NOUN
Subcategory: Common/Proper
Attributes: gender, number, case
Difficulties:
-no cases or genders in English
-compound words in German
Dictionary Structure - Verbs
Fields:
Category: VERB
Subcategory: Personal/Impersonal
Attributes: mood, tense, person, number,
voice
Gaps:
-a given verb should be found in every
mood, tense, person and number
-when translating, a verb can change its tense


Dictionary Structure - Adjectives
Fields:
Category: ADJECTIVE

Attributes: gender, number, case, degree

Difficulties:
-in English, adjectives vary only in degree
-in German, the adjective changes to agree with
the number, case and gender of the noun it
modifies
Dictionary Structure
Articles - full coverage
Prepositions - full coverage
Adverbs - only the ones in the Database
Pronouns - only the needed ones

In addition,
the dictionary contains specific expressions
with their translations (there is - es gibt)

Accessing the documents
In order to use the information contained
in the XML documents, a parser was
required.
DOM parser
builds an internal representation of the
document: a tree of nodes
Methods to handle the information from
Dictionary and Database
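
To make the "tree of nodes" concrete, the sketch below parses a tiny XML fragment with Python's `xml.dom.minidom` and prints the resulting node tree. The slides do not say which DOM implementation or programming language the project used, and the fragment's tag names are invented for the example.

```python
from xml.dom.minidom import parseString

# Invented fragment; the real tag names are not given in the slides.
SAMPLE = "<dictionary><word id='en_1'><spelling>house</spelling></word></dictionary>"


def outline(node, depth=0):
    """Return one indented line per element or non-empty text node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data.strip()))
    return lines


tree = parseString(SAMPLE)              # the whole document is loaded into memory
tree_outline = outline(tree.documentElement)
print("\n".join(tree_outline))
```

Because a DOM parser keeps the full tree in memory, any entry can be revisited at will, which suits lookup-style methods; the cost is memory proportional to the document size.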

Handling the documents
Common methods
get an entry when the ID is given

Dictionary
-get grammatical category for a given
word
-get equivalent
-get translation
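
The methods listed above might look roughly like this in Python. Everything here is an assumption for illustration: the function names, the XML layout, and in particular the reading that "get equivalent" returns the linked IDs while "get translation" returns the spelling of the linked entries in the other language's file.

```python
from xml.dom.minidom import parseString

# Hypothetical contents of the two dictionary files (one per language).
ENGLISH = """<dictionary>
  <word id="en_1" category="NOUN" equivalent="de_1"><spelling>house</spelling></word>
</dictionary>"""
GERMAN = """<dictionary>
  <word id="de_1" category="NOUN" equivalent="en_1"><spelling>Haus</spelling></word>
</dictionary>"""


def load(xml_text):
    """Parse one dictionary file into a dict keyed by entry ID."""
    words = {}
    for node in parseString(xml_text).getElementsByTagName("word"):
        spelling = node.getElementsByTagName("spelling")[0].firstChild.data
        words[node.getAttribute("id")] = {
            "category": node.getAttribute("category"),
            "equivalents": node.getAttribute("equivalent").split(),
            "spelling": spelling,
        }
    return words


def get_entry(words, entry_id):
    """Common method: get an entry when the ID is given."""
    return words.get(entry_id)


def get_category(words, spelling):
    """Get the grammatical category for a given word, or None if unknown."""
    for word in words.values():
        if word["spelling"] == spelling:
            return word["category"]
    return None


def get_equivalents(words, spelling):
    """Get the IDs of the equivalent entries in the other language."""
    for word in words.values():
        if word["spelling"] == spelling:
            return word["equivalents"]
    return []


def get_translations(source, target, spelling):
    """Resolve equivalent IDs to spellings in the target dictionary."""
    return [target[i]["spelling"]
            for i in get_equivalents(source, spelling) if i in target]


english = load(ENGLISH)
german = load(GERMAN)
print(get_category(english, "house"), get_translations(english, german, "house"))
```

A lookup by spelling scans all entries here for brevity; a real implementation would likely keep a second index keyed by spelling.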
Integration in the whole project
The Sentence Database as well as the
Dictionary are used by the Alignment,
Matching, Parsing and Recombination
sections
