
The Sanskrit He

Version 2.85 [2015

Welcome to the Sanskrit Heritage site. It provides various services for the com

The first service is dictionary access. The dictionary is a hypertext structure g
information. There are currently two versions of the dictionary.
The first one is the original Heritage Sanskrit-French dictionary, which serves
tools. Furthermore, it offers rich encyclopedic content about Indian culture.
explained below. A fully hypertext version in the Goldendict format is also ava

The second lexicon is a digital version of the Monier-Williams Sanskrit-English
is derived from Thomas Malten's digitization of the Monier-Williams at Köln Un
adapted to the HTML Heritage look and feel by Pawan Goyal. The Sanskrit Her
compatibility of the grammatical tools.

The choice of the dictionary is set to a default by the configuration of the serv
respectively Sanskrit Heritage and Monier-Williams.

This site offers a number of linguistic services for the Sanskrit language, such
Sanskrit banks of tagged hypertext. Various phonological and morphological t

Sanskrit Heritage dictionary in book form

You may download the pdf file of the Heritage dictionary from PDF. This docum
management software from Adobe freely available on the Internet. Since the docu
4.5 Mb. This is an ongoing effort; lexical acquisition implies quick obsolescen

The Sanskrit Heritage dictionary is also available in an ebook format, u
the Golden Sanskrit Heritage page.

Multilingual hypertext dictionary

Interactive browsing

The dictionary may be accessed through an indexing engine: Sanskrit Herit

Your browser must be HTML5 compliant, and for proper viewing of Sanskrit te
transliteration with diacritics, and for devanāgarī. For instance, install fonts In
devanāgarī with proper ligatures is Apple's Devanagari MT for Macintosh OS X
advised for proper rendering.

You may have to fiddle with the controls of your browser, so that the font dec
selection, and thus encoding is specified as Unicode compliant (UTF-8 encodin

Note that many words are given with their etymology as hypertext links. You
Also, the gender declarations of the main entries are mouse-sensitive, and giv
present class mark of the verbal roots gives access to the conjugation schem
prefixed derived verbs.

All these grammatical tools, originally developed for the Heritage dictionary, a
Thus our HTML Monier-Williams is linked with the Heritage declension engine

Sanskrit made easy

If you want to search for a Sanskrit word without knowing its exact translitera
allows you to search for words without knowing precise diacritics usage. For in
is limited for the moment to the Sanskrit Heritage dictionary.
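
As a rough illustration of searching without precise diacritics, the sketch below compares words after stripping their diacritical marks. It is only an assumption about how such a lookup could work, not the site's actual algorithm; the function names and the tiny word list are hypothetical.

```python
import unicodedata


def strip_diacritics(word):
    """Reduce a romanized word to plain lowercase letters without diacritics."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()


def simplified_search(query, entries):
    """Return the entries that match the query once diacritics are ignored."""
    key = strip_diacritics(query)
    return [entry for entry in entries if strip_diacritics(entry) == key]


# Hypothetical mini-lexicon, for illustration only.
LEXICON = ["kṛṣṇa", "śiva", "yoga", "rāma"]

print(simplified_search("krsna", LEXICON))
print(simplified_search("Rama", LEXICON))
```

Note that this simplification also ignores vowel length, so several distinct entries may share the same simplified key.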

Sanskrit Grammarian

This interface gives the declension tables for Sanskrit substantives. Try out th
The same transliteration conventions as for the dictionary index apply. For ins
"brahman" with gender Neu. The fourth button, labeled "Any", may be used fo
pronouns ("aham", "tvad"), or numeral words such as "dva", "tri", etc.

A conjugation engine for roots is also available. It handles the full present sys
the passive present, the perfect, the aorist and the future. Participial stems, a
conjugations (causative, intensive, desiderative) are also generated, for the fu
such as "bhuu" 1, "as" 2, "m.rj" 2, "han" 2, "haa" 3, "hu" 3, "daa" 4, "su" 5, "p
secondary conjugations of a root, enter code 0. You may cascade by generatin

A word of caution is called for here. The only safe way to get correct inflected
consistently with their specification in the Heritage dictionary. This is especially
various Sanskrit grammars. For instance, root hū is called hū, hvā or hve acco
two items have the same phonetic realisation, their respective lexemes are di
there are three roots named mā in the Sanskrit Heritage dictionary. They are
maa#4. If you ask for the conjugated forms of maa in present classes 2 or 3,
maa#3 (to mow) or maa#4 (to exchange) you have to enter explicitly their st
morphology parameters may yield random results or error messages.
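
The homonym notation used above (a stem followed by "#" and an index, as in "maa#3") can be split mechanically. The helper below is a minimal sketch with a hypothetical name; it only separates the two parts and does not decide which homonym is meant when the index is left out.

```python
def parse_lexeme(name):
    """Split a lexeme name such as "maa#3" into (stem, homonym index).

    The index is None when the name carries no "#" suffix.
    """
    stem, separator, index = name.partition("#")
    if not separator:
        return stem, None
    return stem, int(index)


print(parse_lexeme("maa#3"))
print(parse_lexeme("bhuu"))
```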

Lemmatizer

Conversely, a lemmatizer attempts to tag inflected words. Try for instance (in V
(clicking on Noun) or "apibat", "akaar.siit", "dudoha", "vaahyate" etc. (clicking
stems in some secondary derivations. For instance, "darzayi.syati" is found as
{ int. pr. m. sg. 3 }[d_1], "did.rk.sate" yields { des. pr. m. sg. 3 }[d_1] and

N.B. Do not attempt to lemmatize verbal forms with preverbs - this will not wo
forms is possible through the Sanskrit Reader interface, as we shall see below

Morphology

A dictionary of inflected forms of Sanskrit words is provided in XML form unde
resources site.

Sanskrit Reader

Try our interactive Sanskrit Reader. It is able to segment simple sentences. Try
"tryambaka.myajaamahesugandhi.mpu.s.tivardhanam" (we assume Velthuis t
tagged sentence. You will see two segmentations, one with an identified comp
"tryambakam". Note that each segment is indicated with a lemma giving its s
segment form from its stem. The stem is hyperlinked to the dictionary of choi

Note also that segments are separated by phonological information in the sha
sentence by successive sandhi application. For instance, solution 1 explains t
"ambakam" by rule i|a → ya.

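The sandhi annotation between segments can be read as a rewrite at the junction of two words. The sketch below applies the one rule quoted above (final i meeting initial a gives ya); the rule table and function name are hypothetical, and any junction not in the table is simply concatenated, which is not correct Sanskrit in general.

```python
# Hypothetical rule table: (final sound, initial sound) -> merged result.
# It holds only the single rule quoted in the text.
SANDHI_RULES = {
    ("i", "a"): "ya",
}


def join_with_sandhi(left, right):
    """Join two words, rewriting the junction when a known rule applies."""
    key = (left[-1:], right[:1])
    if key in SANDHI_RULES:
        return left[:-1] + SANDHI_RULES[key] + right[1:]
    # Fallback for junctions the tiny table does not cover.
    return left + right


print(join_with_sandhi("tri", "ambakam"))
```

Segmentation runs this process in reverse: the Reader looks for places where a rule could have produced the surface text.
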
The reader may be helped by inserting blanks in the input at word junctions. F
yajaamahe sugandhi.m pu.s.tivardhanam". But compounds should stay in one
"tacchrutvaasa~njaya uvaaca".

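The example inputs in this manual are typed in an ASCII transliteration. The sketch below converts such input to romanization with diacritics. The mapping table is partial and hypothetical: most entries follow the standard Velthuis scheme, while "z" for ś is inferred from forms such as "darzayi.syati" and is an assumption about the site's conventions.

```python
# Partial mapping from ASCII transliteration to romanization with diacritics.
VELTHUIS_TO_IAST = {
    "aa": "ā", "ii": "ī", "uu": "ū",
    ".rr": "ṝ", ".r": "ṛ", ".l": "ḷ",
    ".m": "ṃ", ".h": "ḥ",
    ".t": "ṭ", ".d": "ḍ", ".n": "ṇ", ".s": "ṣ",
    "~n": "ñ",
    "z": "ś",  # assumption inferred from the examples in this manual
}

# Try longer keys first so that ".rr" wins over ".r".
_KEYS = sorted(VELTHUIS_TO_IAST, key=len, reverse=True)


def velthuis_to_iast(text):
    """Convert ASCII-transliterated text, leaving unknown characters as-is."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(VELTHUIS_TO_IAST[key])
                i += len(key)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(velthuis_to_iast("tryambaka.m yajaamahe sugandhi.m pu.s.tivardhanam"))
```
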
Many options are provided in the menu of the Reader page. For instance, click
where each chunk is in terminal sandhi form. For instance "tryambakam yajaa

Two strengths of the Reader are provided. The Simplified mode, offered as a d
powerful, using the full range of participles of verbs, privative compounds, etc
impractical, and other facilities must be used.

The grammar used to recognize sentences is explained as a local automaton
of the segmenter automaton control. A simpler one, close to the Simplified m
Complete mode of the reader, is Complete automaton. The color codes of these

In these diagrams, transparent nodes are non-generative, and colored nodes
category Auxi is the subset of Verb consisting of conjugated forms of roots "k.
denotes sequences of preverbs.

Sanskrit Parser

If in the reader you press the "Parsing" button, many irrelevant pseudo-solutio
"pratilekhanenaak.saraa.nisundaraa.nibhavanti". In Simplified mode, it shows

Each solution returned with the parser is marked with a green check sign, wh
terms of roles (kāraka).

The parser recognizes sentences. It may be made to recognize nominal phras
intended gender. You may for instance analyze the compound: "pravaran.rpam
masculine nominal. Alternatively, one can ask to recognize this form as a sing
category. When breaking the text with spaces, the Word mode allows one to recog
sequences of chunks in final sandhi form separated by spaces, where sandhi
"Unsandhied" mode in the reader interface.

Sanskrit Tagger

The semantic analysis may still be ambiguous, since a given segment may be
presented under the role matrix, sorted by increasing penalty. Check for your
heart symbol. The system will return the corresponding unambiguously tagge
Iterating this process allows you to progressively tag a Sanskrit text with the S

Alternatively, you may select the ambiguous morphology choices, each being
the first choice, but you may override this default and choose manually e.g. th
"Submit" button and you will get the corresponding deterministically tagged s

Summary mode

Now that you are more familiar with using the various modes of the Reader on
sentences. Obviously the listing of all solutions is out of the question with long
semi-automatic segmentation. This new Summary mode is actually now propo

Try for instance "satya.mbruuyaatpriya.mbruuyaannabruuyaatsatyamapriya.m
presents a summary of the union of all solutions, as a chart of segments align
segment sanātanas proposed first, on top of a forest of smaller words combin
blue, and the forest of irrelevant combinations vanishes. Do the same under t
top candidates. Now choose the particle ca (and thus na). Now only one choic
finish the job. Indeed only one solution remains, as may be checked by clickin
point on the process. You are now viewing the same output as given by the Re
in the Summary. You may alternatively check the "Show Preferred Solutions" c
penalties. If you make a mistake in the selection of segments, it is easy to bac
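
One way to picture the selection step is as interval pruning: choosing a segment discards every candidate that overlaps it in the input. The sketch below is only an assumption about the idea, not the Reader's implementation. It treats each candidate as a character interval, ignores sandhi at the junctions, and uses hypothetical names and made-up offsets.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    form: str
    start: int  # offset of the first character covered
    end: int    # offset just past the last character covered


def overlaps(a, b):
    """True when two segments cover at least one common position."""
    return a.start < b.end and b.start < a.end


def select(chart, chosen):
    """Keep the chosen segment and drop every candidate that overlaps it."""
    return [s for s in chart if s == chosen or not overlaps(s, chosen)]


# Made-up chart: one long candidate, smaller rivals, and an unrelated segment.
long_word = Segment("sanaatanas", 10, 20)
chart = [
    Segment("dharma.h", 0, 8),
    long_word,
    Segment("sa", 10, 12),
    Segment("tanas", 15, 20),
]

print(select(chart, long_word))
```

Repeating the selection on the remaining ambiguous positions narrows the chart until one segmentation is left.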

Other Sanskrit Resources

We have an ongoing cooperation with the Department of Sanskrit Studies of


the Indian Institute of Technology at Kharagpur on computational linguistics fo
scholars from the Sanskrit Library. This team is actively developing cooperatin

In October 2007 we organized the First International Sanskrit Computational L
followed by the Second Symposium in May 2008 at Brown University, by a third
2010 at JNU, and a fifth one in January 2013 at IIT Bombay.

The Zen Library

This site reflects an ongoing project of Sanskrit processing on a comprehensiv
database, compiled from the Sanskrit Heritage dictionary, and on the Zen com
implemented in Pidgin ML, the functional core of the Objective Caml programming
software under the GNU Lesser General Public License (LGPL) from the Zen site

The Sanskrit Portal

Please visit our Sanskrit Portal to find links to other Sanskrit resources.

Artwork credits

Orissan artwork at this site courtesy of Shauraj Rath. Screenex, Bhubanesh
Wallpaper om images courtesy of Vishvarupa.com.
Ganesh wallpaper courtesy of François Patte.
Shri Yantra design Gérard Huet 1990.



Gérard Huet 1
