
Handling of Prepositions in English to Bengali Machine Translation

Sudip Kumar Naskar
Dept. of Comp. Sc. & Engg., Jadavpur University, Kolkata, India
sudip_naskar@hotmail.com

Sivaji Bandyopadhyay
Dept. of Comp. Sc. & Engg., Jadavpur University, Kolkata, India
sivaji_cse_ju@yahoo.com

Abstract

The present study focuses on the lexical meanings of prepositions rather than on the thematic meanings because it is intended for use in an English-Bengali machine translation (MT) system, where the meaning of a lexical unit must be preserved in the target language, even though it may take a different syntactic form in the source and target languages. Bengali ranks fifth in the world in number of native speakers and is an important language in India. There is no concept of preposition in Bengali. English prepositions are translated to Bengali by attaching appropriate inflections to the head noun of the prepositional phrase (PP), i.e., the object of the preposition. The choice of the inflection depends on the spelling pattern of the translated Bengali head noun. Further, postpositional words may also appear in the Bengali translation for some prepositions. The choice of the appropriate postpositional word depends on the WordNet synset information of the head noun. Idiomatic or metaphoric PPs are translated into Bengali by looking into a bilingual example base. The analysis presented here is general and applicable to translation from English to many other Indo-Aryan languages that handle prepositions using inflections and postpositions.

1 Introduction

Prepositions have been studied from a variety of perspectives. Both linguistic and computational (monolingual and cross-lingual) aspects of prepositions have been considered by several researchers. Jackendoff (1977), Emonds (1985), Rauh (1993) and Pullum and Huddleston (2002) have investigated the syntactic characteristics of prepositions. Cognitive theorists have examined the polysemous nature of prepositions and explored the conceptual relationships of the polysemy, proposing graphical mental images (Lakoff and Johnson, 1980; Brugman, 1981, 1988; Herskovits, 1986; Langacker, 1987; Tyler and Evans, 2003). Fauconnier (1994) and Visetti and Cadiot (2002) have canvassed the pragmatic aspects of prepositions. A practical study of the usage of prepositions was carried out for the purpose of teaching English as a second language (Wahlen, 1995; Lindstromberg, 1997; Yates, 1999). The deictic properties of spatial prepositions have been studied by Hill (1982), while the geographical information provided by them has been of interest to computational research (Xu and Badler, 2000; Tezuka et al., 2001).

In the field of natural language processing, the problem of PP attachment has been a topic of research for quite a long time, and in recent years the problem has been explored with a neural network-based approach (Sopena et al., 1998) and with a syntax-based trainable approach (Yeh and Vilain, 1998). Although past research has revealed various aspects of prepositions, there is not much semantic research on prepositions available for computational use, which requires a rigorous formalization of the representation of the semantics. A recent semantic study of prepositions for computational use is found in (Voss, 2002), with a focus on spatial prepositions. Spatial prepositions are divided into three categories according to which one of the two thematic meanings between place and path they acquire when they are in argument, adjunct and non-subcategorized positions of particular types of

Proceedings of the Third ACL-SIGSEM Workshop on Prepositions, pages 89–94, Trento, Italy, April 2006. © 2006 Association for Computational Linguistics
verbs. The semantics of spatial prepositions dealt with in (Voss, 2002) is not lexical but thematic.

There are some prepositions (e.g., over, with) which have many senses as prepositions. By making use of the semantic features of the Complements (reference object) and Heads (verb, verb phrase, noun or noun phrase governing a preposition or a PP), the meaning of the polysemous prepositions can be computationally disambiguated. The different meanings of over call for different semantic features in its heads and complements (Alam, 2004).

Prepositional systems across languages vary to a considerable degree, and this cross-linguistic diversity increases as we move from core, physical senses of prepositions into the metaphoric extensions of prepositional meaning (metaphor or rather, idiomaticity is one of the main realms of usage with prepositions) (Brala, 2000).

The present study focuses on the lexical meanings of prepositions rather than on the thematic meanings because it is intended for use in an English-Bengali machine translation (MT) system, where the meaning of a sentence, a phrase or a lexical entry of the source language must be preserved in the target language, even though it may take a different syntactic form in the source and target languages. Bengali ranks fifth in the world in number of native speakers and is an important language in India. It is the official language of neighboring Bangladesh. There is no concept of preposition in Bengali. English prepositions are translated to Bengali by attaching appropriate inflections to the head noun of the PP, i.e., the object of the preposition. The choice of the inflection depends on the spelling pattern of the translated Bengali head noun. Further, postpositional words may also appear in the Bengali translation for some prepositions. The choice of the appropriate postpositional word depends on the WordNet (Fellbaum, 1998) synset information of the head noun. Idiomatic or metaphoric PPs are translated into Bengali by looking into a bilingual example base.

A brief overview of the English-Bengali MT System is presented in Section 2. Different types of English prepositions and their identification in the MT system are described in Section 3. Inflections and postpositions in Bengali are outlined in Section 4. Translation of English prepositions to inflections and postpositions in Bengali is detailed in Section 5. The conclusion is drawn in Section 6.

2 A Brief Overview of the English-Bengali MT System

The handling of English prepositions during translation to Bengali has been studied with respect to an English-Bengali MT system (Naskar and Bandyopadhyay, 2005) that is under development. In order to translate from English to Bengali, the first step is lexical analysis of the English sentence using WordNet, to gather the lexical features of the morphemes. During morphological analysis, the root words / terms (including idioms and named entities), along with associated grammatical information and semantic categories, are extracted. A shallow parser identifies the constituent phrases of the source language sentence and tags them to encode all relevant information that might be needed to translate these phrases and perhaps resolve ambiguities in other phrases. Then these phrases are translated individually to the target language (Bengali) using Bengali synthesis rules. The noun phrases and PPs are translated using example bases of syntactic transfer rules. The verb phrase translation scheme is rule based and uses Morphological Paradigm Suffix Tables. Finally, the target language phrases are arranged using some heuristics, based on the word ordering rules of Bengali, to form the target language representation of the source language sentence.

3 Prepositions in English

A preposition is a word placed before a “noun” to show in what relation the noun stands with regard to the other nouns and verbs in the same sentence. The noun that follows a preposition, i.e., the reference object, is in the accusative case and is governed by the preposition. Prepositions can also be defined as words that begin prepositional phrases (PP). A PP is a group of words containing a preposition, an object of the preposition, and any modifiers of the object.

Syntactically, prepositions can be arranged into three classes: simple prepositions (e.g., at, by, for, from etc.), compound prepositions and phrase prepositions. A compound preposition is made up of a set of words which starts with and acts like a preposition (e.g., in spite of, in favor of, on behalf of etc.). A phrase preposition is a simple preposition preceded by a word from another category, such as an adverb, adjective, or conjunction (e.g., instead of, prior to, because of, according to etc.).

Prepositions frequently follow verbs, together forming phrasal verbs while remaining separate words. A word that looks like a preposition but is actually part of a phrasal verb is often called a particle. For example, in “Four men held up the bank.”, held up is a verb [“to rob”]. Therefore, up is not a preposition, and bank is not the object of a preposition. Instead, bank is a direct object of the verb held up. A particle may not always appear immediately after the verb with which it makes up a phrasal verb (e.g., Four men held the bank up.).

An idiomatic (metaphoric) PP starts with a preposition, but its meaning cannot be ascertained from the meaning of its components. Examples of idiomatic PPs are: at times, by hook or crook etc.

All these syntactic characteristics are used to identify prepositions in the English-Bengali MT system. Moreover, the inventory of prepositions in English is a closed set, so identification of prepositions is not much of a problem in English. A simple list serves the purpose. The prepositions, compound prepositions, phrase prepositions and idiomatic PPs are identified during morphological analysis. Some of the phrasal verbs (when the phrasal verb appears as a whole) are identified during the morphological analysis phase and some during parsing (when the particle does not accompany the verb).

However, there are some words that act as prepositions and fall into other POS categories as well. For example, the word before can be used as an adverb (e.g., I could not come before), a preposition (e.g., He came before me) or a conjunction (e.g., He came before I came). Similarly, the word round can be used as an adjective (e.g., Rugby is not played with a round ball), noun (e.g., Rafter was knocked out of the tournament in the third round), adverb (e.g., They have moved all the furniture round), preposition (e.g., The earth revolves round the sun) and verb (e.g., His eyes rounded with anger). Depending on the POS of the neighboring words/terms, the parser identifies the correct POS of the word in the particular context.

A preposition is usually placed in front of (is “pre-positioned” before) its object, but it may sometimes follow it (e.g., What are you looking at?). The preposition is often placed at the end when the reference object is an interrogative pronoun (e.g., Where are you coming from?) or a relative pronoun (e.g., My grandfather was a collector of coins, which we used to fight over). In such cases, the system finds out that the preposition is not a particle and is not followed by a noun either, so it must be a stranded preposition. It searches for the pronoun (relative or interrogative) that appears to its left and relates the stranded preposition to the pronoun. Thus during translation, the following conversion takes place.

(1) Where are you coming from? ↔ From where are you coming?
(2) My grandfather was a collector of coins, which we used to fight over. ↔ My grandfather was a collector of coins, over which we used to fight.

But if the pronoun is missing, then the system has to find out the elliptical pronoun first.

(3) I am grateful to the man I have spoken to. → I am grateful to the man [whom] I have spoken to. → I am grateful to the man to [whom] I have spoken.

Prepositions represent several relations with the nouns governed by them. Spatial and temporal prepositions (which indicate a place or time relation) have received relatively in-depth study for a number of languages. The semantics of other types of prepositions describing manner, instrument, amount or accompaniment largely remain unexplored. In an MT system, when a preposition has different representations in the target language for the different relations indicated by it, identification of the relation is necessary. The WordNet synset information of the head noun of the PP, i.e., the object of the preposition, serves to identify the relation.

4 Inflections and Postpositions in Bengali

In Bengali, there is no concept of preposition. English prepositions are handled in Bengali using inflections (vibhaktis) attached to the reference objects and/or post-positional words placed after them. Inflections get attached to the reference objects. An inflection has no existence of its own in the language, and it does not have any meaning of its own either. There are only a few inflections in Bengali: Φ (null), -ে (-e), -য় (-y), -য়ে (-ye), -তে (-te), -েতে (-ete), -কে (-ke), -রে (-re), -েরে (-ere), -র (-r) and -ের (-er) (an inflection is represented as a word with a leading ‘-’ in this paper). The placeholder indicated by a dashed circle represents a consonant or a conjunct. For example, if the -ে inflection is attached to the word বাজার (bazar [market]), the inflected word is বাজারে (bazar-e [market-to]). On the other hand, post-positional words are independent words. They have meanings of their own and are used independently like other words. A post-positional word is positioned after an inflected noun (the reference object). Some examples of post-positional words in (colloquial) Bengali are: দিয়ে (diye [by]), থেকে (theke [from]), জন্য (jonno [for]), কাছে (kachhe [near]), সামনে (samne [in front of]) etc.

5 Translating English prepositions to Bengali

When an English PP is translated into Bengali, the following transformation takes place: (preposition) (reference object) ↔ (reference object) [(inflection)] [(postpositional-word)]. The correspondence between English prepositions and Bengali postpositions (inflections and post-positional words) is not direct. As far as the selection of the appropriate target language representation of a preposition is concerned, the reference object plays a major role in determining the correct preposition sense. Whether the preposition is used in a spatial sense, as opposed to a temporal or other sense, is determined by the semantics of the head noun of the reference object. A noun phrase (NP) denoting a place gives rise to a spatial PP. Similarly, an object referring to a time entity produces a temporal expression. These relationships can be established by looking at the WordNet synset information of the head noun of the PP.

5.1 Translating English prepositions using Inflections in Bengali

The translation of the three English prepositions 'in', 'on', and 'at' involves identifying the possible inflection to be attached to the head noun of the PP. No postpositional words are placed after the head noun for these prepositions. The three prepositions 'in', 'on', and 'at' (which are both spatial and temporal in nature) can be translated into the Bengali inflections -ে (-e), -তে (-te), -েতে (-ete) and -য় (-y). Any of these 4 Bengali inflections can be placed after the reference object for any of these 3 English spatial and temporal prepositions. The choice depends on the spelling of the translated reference object. The rule is: if the last letter of the Bengali representation of the reference object is a consonant, -ে (-e) or -েতে (-ete) is added to it (e.g., at/in market → বাজারে [bazar-e / bazar-ete]); else if the last letter of the Bengali word is a matra (vowel modifier) and the matra is ‘া’ (-a), either -তে (-te) or -য় (-y) can be added to the Bengali reference word (e.g., in evening → সন্ধ্যাতে / সন্ধ্যায় [sandhya-te / sandhya-y]); otherwise -তে (-te) is added to it (e.g., at home → বাড়িতে [badi-te]).

When translating temporal expressions, if ‘on’ is followed by a day (like Sunday, Monday etc.) or by a date in English, the null inflection is added.

To translate this type of PP, we take the help of an example base, which contains bilingual translation examples. Here are some translation examples from the example base (TLR = target language representation of the reference object).

(1) at / in (place) ↔ (TLR) - ( ে / য়ে / তে ) [ - ( e / ye / te ) ]
(2) of (NP) ↔ (TLR) - ( র / ের / য়ের ) [ - ( r / er / yer ) ]

5.2 Translating English prepositions using Inflections and Postpositions in Bengali

Most of the English prepositions are translated to Bengali as inflections and postpositions attached to the noun representing the reference object. To translate this type of PP, we take the help of an example base, which contains bilingual translation examples. Here are some translation examples from the example base (TLR = target language representation of the reference object).

(1) before (artifact) ↔ (TLR) - ( র / ের / য়ের ) সামনে [ - ( r / er / yer ) samne ]
(2) before (!artifact) ↔ (TLR) - ( র / ের / য়ের ) আগে [ - ( r / er / yer ) age ]
(3) round (place / physical object) ↔ (TLR) - ( র / ের / য়ের ) চারদিকে [ - ( r / er / yer ) chardike ]
(4) after (time) ↔ (TLR) - ( র / ের / য়ের ) পরে [ - ( r / er / yer ) pare ]
(5) since (place / physical object / time) ↔ (TLR) থেকে [ theke ]

The choice of inflection depends on the spelling of the translated reference object, as said before. If the translated reference object ends with a vowel, -য়ের (-yer) is added to it; else if it ends with a consonant, -ের (-er) is added to it; otherwise (it ends with a matra) -র (-r) is appended to it. The postpositional word is placed after the inflected reference object in Bengali. The choice of the postpositional word depends on the semantic information about the reference object as collected from WordNet. For prepositions with only one postpositional word, there is no need to know the semantic features of the reference object. For example, ‘since’, as a preposition, is always translated as থেকে (theke) in Bengali, irrespective of the reference object. Then again, in some cases this semantic information about the reference object does not suffice to translate the preposition properly.

Consider the following examples that include the preposition before in two different senses.

(1) He stood before the door. ↔ সে দরজার সামনে দাঁড়াল (se [he] darja-r samne [the door before] dandalo [stood])
(2) He reached before evening. ↔ সে সন্ধ্যার আগে পৌঁছাল (se [he] sondhya-r age [evening before] pouchhalo [reached])
(3) He reached before John. ↔ সে জনের আগে পৌঁছাল (se [he] jan-er age [John before] pouchhalo [reached])

From WordNet, the system acquires the semantic information that ‘door’ is a hyponym of ‘artifact’, whereas ‘evening’ and ‘John’ (which represents a person) are not. Thus ‘before’ is translated to Bengali as -র সামনে (-r samne) in sentence (1), and takes the meaning - ( র / ের / য়ের ) আগে (age) in sentences (2) and (3).

As there is no ambiguity in the meaning of compound prepositions and phrase prepositions, a simple listing of them (along with their Bengali representations) suffices to translate them. We have prepared a list that contains the phrase prepositions and compound prepositions in English along with their Bengali translations.

English        Bengali
in spite of    সত্ত্বেও [ satteo ]
away from      - ( র / ের / য়ের ) থেকে দূরে [ - ( r / er / yer ) theke dure ]
owing to       - ( র / ের / য়ের ) কারণে [ - ( r / er / yer ) karane ]
apart from     ছাড়াও [ chhadao ]
instead of     - ( র / ের / য়ের ) পরিবর্তে [ - ( r / er / yer ) paribarte ]
along with     - ( র / ের / য়ের ) সাথে [ - ( r / er / yer ) sathhe ]

5.3 Translation of English Idiomatic PPs

The meaning of an idiomatic PP cannot be derived from the meanings of its components. The simplest way to tackle idiomatic PPs is to maintain a listing of them. A list, or direct Example Base, is used which contains idioms that start with prepositions, along with their Bengali translations. Such an idiom is treated like any other PP during the word-reordering phase. Here are some examples:

(1) at times ↔ সময়ে সময়ে (samaye samaye)
(2) by hook or crook ↔ যেভাবেই হোক (jebhabei hok)
(3) to a fault ↔ মাত্রাতিরিক্ত (matratirikto)

6 Conclusion

In the present study, the handling of English prepositions in Bengali has been studied with reference to a machine translation system from English to Bengali. English prepositions are handled in Bengali using inflections and / or post-positional words. In machine translation, sense disambiguation of a preposition is necessary when the target language has different representations for the same preposition. In Bengali, the choice of the appropriate inflection depends on the spelling of the reference object. The choice of the postpositional word depends on the semantic information about the reference object obtained from WordNet.

Acknowledgements

Our thanks go to the Council of Scientific and Industrial Research, Human Resource Development Group, New Delhi, India for supporting Sudip Kumar Naskar under Senior Research Fellowship Award (9/96(402) 2003-EMR-I).

References

Alam, Yukiko Sasaki. 2004. Decision Trees for Sense Disambiguation of Prepositions: Case of Over. In HLT/NAACL-04.

Brala, Marija M. 2000. Understanding and translating (spatial) prepositions: An exercise in cognitive semantics for lexicographic purposes.

Brugman, Claudia. 1988. The story of over: Polysemy, semantics and the structure of the lexicon. New York: Garland Press. [1981. The story of over. Berkeley, CA: UC-Berkeley MA thesis.]

Emonds, Joseph. 1985. A unified theory of syntactic categories. Dordrecht: Foris.

Fauconnier, Gilles. 1994. Mental spaces. Cambridge: Cambridge University Press.

Fellbaum, Christiane D. ed. 1998. WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA.

Herskovits, Annette. 1986. Language and spatial cognition: An interdisciplinary study of the prepositions in English. Cambridge: Cambridge University Press.

Hill, Clifford. 1982. Up/down, front/back, left/right. A contrastive study of Hausa and English. In Weissenborn and Klein, 13-42.

Jackendoff, Ray. 1977. The architecture of the language. Cambridge, MA: MIT Press.

Lakoff, George and Mark Johnson. 1980. Metaphors we live by. Chicago: University of Chicago Press.

Langacker, Ronald. 1987. Foundations of cognitive grammar, vol. 1. Stanford, CA: Stanford University Press.

Lindstromberg, Seth. 1997. English prepositions explained. Amsterdam: John Benjamins.

Naskar, Sudip Kr. and Sivaji Bandyopadhyay. 2005. A Phrasal EBMT System for Translating English to Bangla. In MT Summit X.

Pullum, Geoffrey and Rodney Huddleston. 2002. Prepositions and prepositional phrases. In Huddleston and Pullum (eds.), 597-661.

Rauh, Gisa. 1993. On the grammar of lexical and nonlexical prepositions in English. In Zelinsky-Wibbelt (eds.), 99-150.

Sopena, Joseph M., Agusti Lloberas and Joan L. Moliner. 1998. A connectionist approach to prepositional phrase attachment for real world texts. In COLING-ACL ’98, 1233-1237.

Tezuka, Taro, Ryong Lee, Yahiko Kambayashi and Hiroki Takakura. 2001. Web-based inference rules for processing conceptual geographical relationships. Proceedings of Web Information Systems Engineering, 14-21.

Tyler, A. and V. Evans. 2003. Reconsidering prepositional polysemy networks: the case of over. In B. Nerlich, Z. Todd, V. Herman, & D. D. Clarke (Eds.), Polysemy: Flexible patterns of meanings in mind and language, pp. 95-160. Berlin: Mouton de Gruyter.

Visetti, Yves-Marie and Pierre Cadiot. 2002. Instability and the theory of semantic forms: Starting from the case of prepositions. In Feigenbaum, Susanne and Dennis Kurzon (eds.), 9-39.

Voss, Clare. 2002. Interlingua-based machine translation of spatial expressions. University of Maryland: Ph.D. Dissertation.

Wahlen, Gloria. 1995. Prepositions illustrated. Michigan: The University of Michigan Press.

Xu, Yilun Dianna and Norman Badler. 2000. Algorithms for generating motion trajectories described by prepositions. Proceedings of Computer Animation 2000, 30-35.

Yates, Jean. 1999. The ins and outs of prepositions: A guidebook for ESL students. New York: Barron’s.

Yeh, Alexander S. and Marc B. Vilain. 1998. Some properties of preposition and subordinate conjunction attachments. In COLING-ACL ’98, 1436-1442.
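
As an illustration of the spelling-based inflection rules stated in Sections 5.1 and 5.2, the following Python sketch may be helpful. It is not the authors' implementation: the function names, the Unicode-range test for consonants, vowels and matras, and the `is_day_or_date` flag are assumptions made here for illustration, and the example base lookup that the paper also uses is left out.

```python
# Illustrative sketch of the spelling-based inflection rules (Sections 5.1, 5.2).
# All names below are hypothetical; the paper does not give an implementation.

NUKTA = "\u09bc"      # combining dot used in decomposed letters such as ড়
AA_MATRA = "\u09be"   # া
E, TE, ETE, Y = "\u09c7", "\u09a4\u09c7", "\u09c7\u09a4\u09c7", "\u09df"  # -e, -te, -ete, -y
R, ER, YER = "\u09b0", "\u09c7\u09b0", "\u09df\u09c7\u09b0"              # -r, -er, -yer


def last_letter_class(word: str) -> str:
    """Classify the final letter as 'consonant', 'vowel', 'matra' or 'other'."""
    stem = word.rstrip(NUKTA)  # ignore a trailing combining nukta
    if not stem:
        raise ValueError("empty word")
    cp = ord(stem[-1])
    if 0x0995 <= cp <= 0x09B9 or cp in (0x09CE, 0x09DC, 0x09DD, 0x09DF):
        return "consonant"
    if 0x0985 <= cp <= 0x0994:   # independent vowels অ..ঔ
        return "vowel"
    if 0x09BE <= cp <= 0x09CC or cp == 0x09D7:   # dependent vowel signs
        return "matra"
    return "other"


def locative_forms(word: str, prep: str = "in", is_day_or_date: bool = False) -> list:
    """Candidate forms for 'in' / 'on' / 'at' + reference object (Section 5.1)."""
    if prep == "on" and is_day_or_date:
        return [word]                       # null inflection
    if last_letter_class(word) == "consonant":
        return [word + E, word + ETE]       # -e or -ete
    if word.endswith(AA_MATRA):
        return [word + TE, word + Y]        # -te or -y
    return [word + TE]                      # otherwise -te


def genitive_form(word: str) -> str:
    """Form used before a postpositional word (Section 5.2)."""
    kind = last_letter_class(word)
    if kind == "vowel":
        return word + YER
    if kind == "consonant":
        return word + ER
    return word + R                         # word ends with a matra
```

With these definitions, `locative_forms` applied to বাজার (bazar) yields the -e and -ete forms, and `genitive_form` applied to দরজা (darja) yields দরজার (darja-r), matching the examples given in the paper.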

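
The selection of a postpositional word from the semantic class of the reference object (Section 5.2) can be sketched in the same spirit. This is a minimal sketch under stated assumptions: the `RULES` table only encodes the five example-base entries quoted in the paper, the semantic classes are passed in as a plain set of strings standing in for a WordNet hypernym lookup, and `choose_postposition` and `render_pp` are hypothetical names, not part of the described system.

```python
# Illustrative sketch of postposition selection (Section 5.2).
# The semantic classes are supplied by the caller as a stand-in for WordNet.
from typing import Optional, Set, Tuple

SAMNE = "\u09b8\u09be\u09ae\u09a8\u09c7"                 # সামনে (samne)
AGE = "\u0986\u0997\u09c7"                               # আগে (age)
CHARDIKE = "\u099a\u09be\u09b0\u09a6\u09bf\u0995\u09c7"  # চারদিকে (chardike)
PARE = "\u09aa\u09b0\u09c7"                              # পরে (pare)
THEKE = "\u09a5\u09c7\u0995\u09c7"                       # থেকে (theke)

# preposition -> ordered list of (required classes or None, needs genitive, word)
RULES = {
    "before": [(frozenset({"artifact"}), True, SAMNE), (None, True, AGE)],
    "round": [(frozenset({"place", "physical object"}), True, CHARDIKE)],
    "after": [(frozenset({"time"}), True, PARE)],
    "since": [(None, False, THEKE)],
}


def choose_postposition(prep: str, classes: Set[str]) -> Optional[Tuple[bool, str]]:
    """Return (needs_genitive_inflection, postposition), or None if no rule fits."""
    for required, needs_genitive, word in RULES.get(prep, []):
        if required is None or required & classes:
            return needs_genitive, word
    return None  # the caller would fall back to the example base


def render_pp(prep: str, tlr: str, genitive_tlr: str, classes: Set[str]) -> Optional[str]:
    """Build '(reference object)[inflection] postposition' for a PP."""
    choice = choose_postposition(prep, classes)
    if choice is None:
        return None
    needs_genitive, word = choice
    head = genitive_tlr if needs_genitive else tlr
    return head + " " + word
```

For "before the door", calling `render_pp` with the classes `{"artifact"}` selects সামনে (samne), while for "before evening" with `{"time"}` the fallback rule selects আগে (age). The ordered rule list keeps the specific sense ahead of the default one, which mirrors the `artifact` / `!artifact` pair in the example base.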
