Guidance:
Sanskrit holds a position in India similar to that of Latin and Greek in Europe,
and is a central part of the Hindu/Vedic traditions. Many Indian languages, and
several foreign languages, have their origins in Sanskrit. Paninian
grammar is still regarded as the mother of all grammars.
Sanskrit has thus retained its position and charm in its original form.
Motivation
People both in India and abroad are surely and steadily realizing the
importance of ancient scripts in the diverse fields of science,
commerce and arts. Also, as spiritual awareness sweeps the world,
great efforts are being made to present the Vedic scriptures in different
languages to the common people. Swami Dayanand Saraswati… once
rightly said, “Back to the Vedas.”
Objective
[Diagram: Sentences -> (sandhi) -> (vibhakti) -> root words (gender, form) + marker]
Since a given word may be composed of two words, one may begin by
separating these words, e.g. रवीन्द्र (raviindra) = रवि (ravi) + इन्द्र (indra).
Similarly, if the marker is removed from a word, the root word may be obtained.
Sandhi (सन्धि विग्रह)
• What is sandhi?
The modification that arises when two letters (sounds) come into very close
proximity is called sandhi.
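1) Effect on the first word only (i.e. the last letter of the first word
changes):
सत् + जनः = सज्जनः (sat + janaH = sajjanaH)
e.g. अकः सवर्णे दीर्घः: this means that if an ak vowel is followed by a savarna (homogeneous) vowel, then both
are replaced by the corresponding long vowel.
अ + अ = आ (a + a = aa)
इ + ई = ई (i + ii = ii)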
• Rule format, i.e. the format in which the rules are stored on the
machine.
e.g. इ + इ = ई (i + i = ii)
i.e. rav-ii-ndra: if “ii” is found, then add “i” on the left, i.e. rav + “i”
Rule format: effectOn--result--left--right
The rules are stored in files indexed on the starting letter of the result part. I.e. for
the above case, b--ii--i--i, the rule will be in the file “i.txt”.
Step 2: try iteratively to break the word into two parts (left and right),
e.g. if the word is kastu then:
kast u
kas tu
ka stu
k astu
e.g. kastu = kaH + tu. So when "kastu" is received for sandhi wichched (splitting), one attempt takes
"ka" as left and "stu" as right, and passes "ka" (left) and "stu" (right) on for
trying sandhi wichched.
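The iterative breaking shown above can be sketched as follows. This is a minimal illustration, not the project's code: the class name `SplitEnumerator` and the method `splits` are hypothetical, and the sketch assumes a JDK newer than the 1.4.2 listed in the requirements (it uses generics).

```java
import java.util.ArrayList;
import java.util.List;

class SplitEnumerator {
    /** Returns every (left, right) split of the word, longest left part first. */
    static List<String[]> splits(String word) {
        List<String[]> result = new ArrayList<>();
        // k is the cut position; both parts must be non-empty.
        for (int k = word.length() - 1; k >= 1; k--) {
            result.add(new String[] { word.substring(0, k), word.substring(k) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the same four splits listed in the text for "kastu".
        for (String[] pair : splits("kastu")) {
            System.out.println(pair[0] + " " + pair[1]);
        }
    }
}
```

Each pair would then be handed to the rule-application step as leftGuess and rightGuess.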
3) Try applying each of the sandhi rules (present in the rule file; here, the sandhi
rules present in s.txt) to left and right (i.e. to ka and stu).
example:
f--y--i|ii|I|--a|aa|A|u|uu|U|e|ai|o|au|aM|aH|RRi|R^i|R^I|L^i|L^I|
a) Effect on the first word only (effectOn == 'f')
If forming the sandhi affects only the first word and the second word
remains as it is, then append something (depending on the rule) to leftGuess;
rightGuess is not changed.
Check left + right = result: if the rightGuess word exists in the
database, also check that the right part of the rule (f--result--left--right) could be applied,
since both sandhi vigrah (splitting) and sandhi formation under the given rule should
be checked.
Then, depending on the presence of the left-hand word (leftGuess plus the appended part)
in the database, the result may be declared:
left + right = sandhi
Even if the left word is not present, rightGuess is present in the
database, so there is still a chance of a sandhi vigrah; record it in tryResult.txt.
b) Effect on the second word only (effectOn == 's')
If forming the sandhi affects only the second word and the first word remains
as it is, then prefix something (depending on the rule) to rightGuess;
leftGuess is not changed.
Check left + right = result: check whether the right word (prefix + rightGuess) exists
in the database, and whether the rule s--result--left--right could be applied.
Then, depending on the presence of the left-hand word in the database, the result may be declared.
Also check that leftGuess ends with one of the strings present in the left part of the
rule, since both sandhi vigrah and sandhi formation should be
checked.
rule: effectOn--result--left--right
left, right: orSeparatedString (string|string|......|string|)
e.g. for the rule s--Dh--Sh|shh|T|Th|D|Dh|N|--dh|
left: Sh|shh|T|Th|D|Dh|N|
e.g. leftGuess sheshh ends with shh (which is present in
Sh|shh|T|Th|D|Dh|N|)
c) Effect on both words (effectOn == 'b')
Extract each possible prefix for rightGuess from the right part of the rule:
rule: b--e--a|aa|A|--i|ii|I|
right: i|ii|I|
potential prefixes: i, ii, I
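One possible reading of the three cases, as a hedged sketch rather than the project's actual applyRule: the class `SandhiRule` and its method `candidates` are hypothetical names. The sketch assumes that rightGuess begins with the rule's result part (as in the split ka | stu for the rule f--s--H|--t|th|), that the result is stripped before the original letters are restored, and that the returned pairs would still have to be checked against the word database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one rule line: effectOn--result--left--right. */
class SandhiRule {
    final char effectOn;   // 'f' = first word, 's' = second word, 'b' = both
    final String result;   // letters seen at the junction of the joined word
    final String[] left;   // possible endings of the first word
    final String[] right;  // possible beginnings of the second word

    SandhiRule(String line) {
        String[] fields = line.split("--");
        effectOn = fields[0].charAt(0);
        result = fields[1];
        left = fields[2].split("\\|");   // the trailing empty piece is dropped
        right = fields[3].split("\\|");
    }

    /** Candidate {first, second} pairs for the split leftGuess | rightGuess. */
    List<String[]> candidates(String leftGuess, String rightGuess) {
        List<String[]> out = new ArrayList<>();
        if (!rightGuess.startsWith(result)) {
            return out;  // the rule does not match this split
        }
        String rest = rightGuess.substring(result.length());
        switch (effectOn) {
            case 'f':  // restore the ending of the first word
                if (startsWithAny(rest, right)) {
                    for (String l : left) out.add(new String[] { leftGuess + l, rest });
                }
                break;
            case 's':  // restore the beginning of the second word
                if (endsWithAny(leftGuess, left)) {
                    for (String r : right) out.add(new String[] { leftGuess, r + rest });
                }
                break;
            case 'b':  // restore both sides
                for (String l : left) {
                    for (String r : right) out.add(new String[] { leftGuess + l, r + rest });
                }
                break;
            default:
                break;
        }
        return out;
    }

    private static boolean startsWithAny(String word, String[] options) {
        for (String o : options) if (word.startsWith(o)) return true;
        return false;
    }

    private static boolean endsWithAny(String word, String[] options) {
        for (String o : options) if (word.endsWith(o)) return true;
        return false;
    }
}
```

For example, `new SandhiRule("f--s--H|--t|th|").candidates("ka", "stu")` yields the single pair kaH and tu, matching kastu = kaH + tu from the text.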
2. For k = n to 1,
break the string at the kth position into two: leftString, rightString
-- if RootDatabase(rightString):
if (RootDatabase(leftString)) then set sandhiPossible: true
if (sandhiWichched(leftString)) then set sandhiPossible: true
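The recursion in the loop above might look like the sketch below. It is deliberately simplified: the words are joined by plain concatenation (no sandhi rule is applied at the junction), and a `Set<String>` stands in for the root database. The class name `SandhiPossible` is an assumption made for illustration.

```java
import java.util.Set;

class SandhiPossible {
    private final Set<String> rootDb;  // stand-in for the root-word database

    SandhiPossible(Set<String> rootDb) {
        this.rootDb = rootDb;
    }

    /** True if the word can be cut into two or more database words. */
    boolean sandhiWichched(String word) {
        for (int k = word.length() - 1; k >= 1; k--) {
            String left = word.substring(0, k);
            String right = word.substring(k);
            if (!rootDb.contains(right)) {
                continue;  // the right part must be a known word
            }
            // The left part is either a known word or itself splittable.
            if (rootDb.contains(left) || sandhiWichched(left)) {
                return true;
            }
        }
        return false;
    }
}
```

Because the left part is split recursively, the pieces are discovered from right to left, which is consistent with the text's remark elsewhere that results are retrieved in reverse order.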
Sample Output :
Vibhakti (विभक्ति)
Vibhakti
vachan (वचन)
Genders (लिंग)
compared to
two in Hindi/English.
The first task is to detect markers. Once the marker is detected, most of the work is
done.
By marker I mean the following.
Consider “goes”. Here the root word is “go” and the marker is “es”.
Similarly, in the word “going”, the root word is “go” and the marker is “ing”.
So if “bhyaam” is encountered at the end of the given word, then it can be said to be tritiya
vibhakti (third case), dvivachan (dual).
Vibhakti Algorithm
vachan: dvivachan (dual)
gender: male, with the word ending in i,
i.e. इकारान्त पुल्लिंग शब्द (ikaarant pulling shabda)
Storage used:
The rules are stored in files indexed on their starting letter. I.e. for the
above case, ibhyaam--i--3|4|5--d--m--i, the rule will be in the file
“i.txt”.
Steps:
b) Obtain all possible rules that can be applied to the given word.
How are the rules obtained?
e.g. ramA H
Open the file H.txt, read all possible rules from it and
store them in the array rule[].
c) Iteratively apply all rules stored in the array rule[].
d) Get the starting and ending index of the part to be added (the add
part of the rule format), and hence extract the add part from the
rule.
rule: AH--a--1|8--b--m--a
add: a
first: ram
second: AH
rule: AH--a--1|8--b--m--a
add: a
firstRes: rama
g) Check whether firstRes is in the DATABASE.
If the word is found, then report the result, i.e. ling, vachan, karak;
i.e. vibhakti is possible, so break the corresponding rule into the result.
rule: AH--a--1|8--b--m--a
e.g. root: rama
karak: 1|8
vachan: bahuvachan
ling: male
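A rule line such as AH--a--1|8--b--m--a could be broken into the result roughly as sketched here. The field order (marker, add, karak, vachan, ling, ending) is inferred from the examples in the text, the class `VibhaktiRule` is a hypothetical name, and the code "d" for dvivachan is an assumption based on the ibhyaam rule; unknown codes are passed through unchanged.

```java
/** Hypothetical holder for one rule line: marker--add--karak--vachan--ling--ending. */
class VibhaktiRule {
    final String marker, add, karak, vachan, ling, ending;

    VibhaktiRule(String line) {
        String[] f = line.split("--");
        marker = f[0];
        add = f[1];
        karak = f[2];
        vachan = f[3];
        ling = f[4];
        ending = f[5];
    }

    /** Strips the marker and appends the add part; null if the marker is absent. */
    String rootOf(String word) {
        if (!word.endsWith(marker)) {
            return null;
        }
        return word.substring(0, word.length() - marker.length()) + add;
    }

    /** Builds the result line in the layout shown in the text, or null. */
    String describe(String word) {
        String root = rootOf(word);
        if (root == null) {
            return null;
        }
        return "root: " + root + ", karak: " + karak
                + ", vachan: " + expandVachan() + ", ling: " + expandLing();
    }

    private String expandVachan() {
        if (vachan.equals("b")) return "bahuvachan";
        if (vachan.equals("d")) return "dvivachan";  // assumed code
        return vachan;                                // unknown codes stay as-is
    }

    private String expandLing() {
        return ling.equals("m") ? "male" : ling;
    }
}
```

With the rule from the text, `describe("ramAH")` gives root rama, karak 1|8, vachan bahuvachan, ling male. Whether the root actually exists would still be decided by the database lookup in step g.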
Algorithmic Steps:
Step 1:
fetch the word
Step2:
extract the vibhakti: (raamaabhyaam) ====> possible markers: am, aam, abhyaam
==> extract using the last letter, i.e. m; it is better to store the rules indexed on the last letter
if you extract am and add a, then left: raamaabhyaa, not found in the db
if you extract aam and add a, then left: raamaabhya, not found in the db
if you extract abhyaam and add a, then left: raamaa, found in the db
==> declare its ling, vachan and karak based on its ending, e.g. based on its
rup, i.e. akaaraant pulling shabda,
aakaarant pulling shabda,
ikaarant pulling shabda, etc.
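The trial-and-lookup in Step 2 can be sketched as below. The markers, the add part "a" and the database entry "raamaa" are taken directly from the example in the text; the class `MarkerStripper`, the method `findRoot` and the use of a `Set<String>` as the database are assumptions for illustration only.

```java
import java.util.Set;

class MarkerStripper {
    /**
     * Tries each marker in the given order. For a marker that ends the word,
     * the marker is removed, the add part is appended, and the candidate is
     * looked up in the database. Returns the first hit, or null if none.
     */
    static String findRoot(String word, String[] markers, String add, Set<String> db) {
        for (String marker : markers) {
            if (!word.endsWith(marker)) {
                continue;
            }
            String stem = word.substring(0, word.length() - marker.length());
            String candidate = stem + add;
            if (db.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

For "raamaabhyaam" with the markers am, aam and abhyaam, the candidates tried are raamaabhyaa, raamaabhya and raamaa, and only the last one is found.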
Requirement Analysis:
Hardware:
CPU with a minimum clock speed of 2.40 GHz
256 MB of RAM
Software:
JDK version 1.4.2 or above
How to run:
Package : sandhi.sandhi
Src/sandhi/sandhi : contains all java files
1) frame.java
2) Sanskrit.java
3) Sandhi.java
4) Vibhakti.java
5) Utils.java
Input / Output:
Input:
Step1:
Single word input:
Either enter text into the text box, or select a word from the list (on the
left-hand side).
File input:
NOTE: processing a word may take a second or two, so please wait.
If a file is being processed, please wait for around 2-3 minutes.
Output:
To view the result in Devanagari form, follow these steps.
USE-CASE : 1
Use-case description:
getHelp:
Sandhi:
pre condition: a word is selected or entered for getting
the desired work done OR a file is selected
action performed: calls for sandhiOrVibhakti(1)
//triggerSandhi
post condition: gets the result on output window
vibhakti
pre condition: a word is selected or entered for getting the
desired work done OR a file is selected
action performed: calls for sandhiOrVibhakti(2)
//triggerVibhakti
post condition: gets the result on output window
uploadWordDB
pre condition: a word has been entered in the corresponding box
uploadRootWord
JList1
action performed: gets the list of words in the list (left side large
window)
output
action performed: shows the result in the output window (right
side large window)
USE-CASE : 2
Use-Case 2 Description:
Note (linguistics):
sandhi is possible: the word can be broken up into two
or more distinct words
vibhakti is possible: given the word, one can get its
karak, vachan, ling
vibhakti is probable: given the word, there is some
chance that one may detect its karak, vachan, ling
trigger :
pre conditions:
1)A word(NOT NULL) has been received
2)either sandhi or vibhakti has been selected(through GUI )
Action Performed :triggers either the "triggerSandhi" or
“triggerVibhakti"
triggerVibhakti:
pre condition:
A word has been received from function "trigger".
Action Performed:
1)calls getRoop(word) for getting karak,vachan,ling
2)records the result of action performed in result.txt(if vibhakti
is possible, OR only some chances(probability) of vibhakti to
be formed)
post condition: result of the action has been recorded.
getResult:
pre condition:
name of the file from where result to be extracted has been
received.
Action Performed:
1) checks whether "couldn't be formed" is written in "result.txt".
2) if result.txt doesn't contain "couldn't be formed", it means
the desired operation has been performed, so retrieve the result.
Since sandhi wichched is a recursive process, retrieve the
result in reverse order.
post condition:
User(GUI/frame) gets desired result.
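A sketch of the getResult logic described above, working on lines already read from result.txt. The class name `ResultReader` and the choice to return an empty list on failure are assumptions; the text does not say what is returned when "couldn't be formed" is present.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ResultReader {
    static final String FAILURE = "couldn't be formed";

    /** Returns the recorded lines in reverse order, or an empty list on failure. */
    static List<String> getResult(List<String> lines) {
        for (String line : lines) {
            if (line.contains(FAILURE)) {
                return Collections.emptyList();
            }
        }
        // The recursive split records the innermost pieces last,
        // so reversing restores the reading order.
        List<String> reversed = new ArrayList<>(lines);
        Collections.reverse(reversed);
        return reversed;
    }
}
```

The input list is copied before reversing, so the caller's list is left unchanged.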
USE-CASE : 3
USE-CASE:3 Description
sandhiWichhed
action performed:
1) tries iteratively to break the word into two parts (left and
right)
e.g. if the word is kastu then:
kast u
kas tu
ka stu
k astu
trySandhiwichhed
/*
sandhi rule format: effectOn--result--left--right
e.g. f--s--H|--t|th|
*/
applyRule
pre condition:
receives one rule,firstString,secondString
e.g. rule : f--s--H|--t|th|
first : ka
second: stu
checkWordInDB
USE-CASE:4 Description
getRoop
pre condition:
boolean variables probable and possible set to false
receives a word for detecting karak,ling and vachan
action performed:
applies rules to extract the root word, its karak, ling and vachan
post condition: sets the status by turning on either or both of the
boolean variables (possible, probable)
breakRuleIntoResult
pre condition:
receives rule,first word,second word,status(probableOrPossible)
action performed:
breaks the (i.e. processes ) rule and builds up the result.
/* rule : AH--a--1|8--b--m--a
first : ram
second: AH
e.g. root: rama
karak: 1|8
vachan: bahuvachan
ling: male
*/
post condition:
result is recorded in result.txt(if vibhakti possible),otherwise
result is recorded in tryResult.txt(vibhakti is probable).
getProbable
action performed: returns whether vibhakti is probable for the word (the probable flag)
getPossible
action performed: returns whether vibhakti is possible for the word (the possible flag)
USE-CASE : 5
USE-CASE:5 Description
attchPackagePath:
pre condition:
receives a filename to which the pathname of package to
be attached.
action performed: attaches the relative path of package to the
filename.
post condition: relative path of the filename is returned.
createFile:
pre conditions:
1) receives a file name for creating the file
2) receives some string which is to be written to the file.(may
be "",i.e. blank)
Action performed: creates a file with the name provided, and appends
the string to the file.
post condition:
the file has been created and the string has been appended.
pre conditions:
1) receives a file which is to be broken into subfiles based on
some parameter.
(parameter: read each line and based on starting letter of the
line,put that line in the file(name as startingLetter.txt))
2)receives the folder name in which the subfiles are to be
stored.
Action performed:
reads the file fileName line-by-line and appends each line to the
file based on starting letter of the line.(used mainly for
organising the database/rules,based on starting letters)
post condition:
file has been broken into subfiles.
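The grouping step of this utility might be sketched as follows, working in memory on lines already read from the source file. The class `RuleFileSplitter` and the method `groupByStartingLetter` are hypothetical names, and blank lines are skipped here as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RuleFileSplitter {
    /** Groups non-empty lines under the file name "<starting letter>.txt". */
    static Map<String, List<String>> groupByStartingLetter(List<String> lines) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;  // nothing to index on
            }
            String fileName = line.charAt(0) + ".txt";
            List<String> bucket = groups.get(fileName);
            if (bucket == null) {
                bucket = new ArrayList<>();
                groups.put(fileName, bucket);
            }
            bucket.add(line);
        }
        return groups;
    }
}
```

Each group could then be appended to its file in the target folder, for example with `java.nio.file.Files.write`. Since the transliteration distinguishes upper and lower case (a versus A), file names such as a.txt and A.txt may collide on case-insensitive file systems, which is worth keeping in mind when choosing the storage layout.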
getContentWholeFile
pre conditions:
receives the fileName from which the content to be
retrieved.
Action performed:
reads the file line-by-line and concatenates the lines into one string.
post condition
string containing the content of whole file is returned.
appendToFile
pre condition:
receives the filename and string to be appended to the
file.
Action performed:
appends the string to the filename.
checkWordInFile
post condition:
returns true if the rule(corresponding to the result part) is
present in the rulefile.
pre condition:
receives a word and orSeparatedString (string|string|......|string|)
post condition:
returns true (if the word starts with one of the strings) or false (if it doesn't)
checkWordInRightOrSeparatedWord (used for sandhi only)
pre condition:
receives a word and orSeparatedString (string|string|......|string|)
Action performed:
checks if the word ends with one of the strings
present in orSeparatedString (string|string|......|string|)
post condition:
returns true (if it does end with one) or false (if it doesn't)
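The two checks against an orSeparatedString can be sketched like this. The class name `OrSeparated` and the method names `startsWithAny` and `endsWithAny` are illustrative assumptions and are not the names used in Utils.java.

```java
class OrSeparated {
    /** True if the word starts with one of the |-separated strings. */
    static boolean startsWithAny(String word, String orSeparated) {
        for (String s : orSeparated.split("\\|")) {
            if (!s.isEmpty() && word.startsWith(s)) {
                return true;
            }
        }
        return false;
    }

    /** True if the word ends with one of the |-separated strings. */
    static boolean endsWithAny(String word, String orSeparated) {
        for (String s : orSeparated.split("\\|")) {
            if (!s.isEmpty() && word.endsWith(s)) {
                return true;
            }
        }
        return false;
    }
}
```

For instance, `endsWithAny("sheshh", "Sh|shh|T|Th|D|Dh|N|")` is true because sheshh ends with shh. Empty pieces are skipped so that a stray separator never matches every word.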
CLASS-DIAGRAM :
Description
Architecture Overview
[Architecture diagram: Sandhi and Vibhakti; Sandhi rules and Vibhakti rules, each based on starting letter; Database of root words and non-root words]
Depending on whether Sandhi or Vibhakti is to be performed, it passes the query to the
respective server.
The respective server, with the help of the utilities, interacts with the rules database and the
word database, does some processing and gives the result back to the Sanskrit
server, which passes the result on to the result server.
NOTE:
If the input is given through a file, the file should not contain any blank lines.