
Sanskrit Computational Linguistics


(Algorithms and methods for determining meanings)
Language, in other words the storehouse of all human knowledge, is represented by words and meanings. Language by itself has an
ontological structure, epistemological underpinnings and grammar. Across languages, even though words and usages differ, the
concept of meaning remains the same in the respective communications. Yet "meanings" are understood by human beings on a
contextual, relative, tonal and gestural basis. The dictionary or 'as it is' meanings are rarely taken into consideration;
thus human language is ambiguous in one sense and flexible in another.

Computers on the other hand are hard-coded to go by the dictionary meanings. Thus teaching (programming) computers to
understand natural language (human language) has been the biggest challenge haunting scientists ever since the idea of Artificial
Intelligence (AI) came into existence. In addition this has led to the obvious question of "What is intelligence?" from a computational
perspective. Since defining intelligence precisely is impossible, this field of study has taken many shapes, such as Computational
Linguistics, Natural Language Processing and "Machine Learning". Artificial Intelligence, instead of being used as a blanket term,
is now increasingly being used as "Analytics" in many critical applications.

Sanskrit, being the oldest, is also the most scientific and structured language. Sanskrit has many hidden algorithms built into it as
part of its vast scientific treatises, which have analysed "Meanings" or "Word sense" from many perspectives since time immemorial. "It is
perhaps our job to discover and convert the scientific methods inherent in Sanskrit into usable Computational models and Tools for
Natural Language Processing rather than reinventing the wheel" - as some scientists put it. This blog's purpose is to expose some
of the hidden, intricate tools and methodologies used in Sanskrit for centuries to derive precise meanings of human language to a
larger audience, particularly Computational Linguists, for further study, analysis and deployment in Natural Language Processing.

In addition, Sanskrit, even though flexible as a human language, is the least ambiguous, as the structure of the language is
precisely defined from a semantic and syntactic point of view. From a psycholinguistic perspective this blog could also give us a
glimpse of the advanced linguistic capabilities of our forefathers as well as their highly disciplined approach towards structure and
usage.

Sunday, May 25, 2014

National Sanskrit TV Channel


India wants a dedicated National Sanskrit TV Channel

Over 120 million first time voters (out of the 550 million total voters) have cast their
votes in the 2014 election. What is the significance of this number, and what does it have to
do with Sanskrit TV?

First, the number: The 120 million+ first time (young) voters are far more than the populations
of the UK (63.7 million) and Canada (34.8 million) put together. The vast majority of
these people are the new breed of young, resurgent, educated Indians who are well
connected with each other. These people are now slowly coming to understand the
depths of the scientific underpinnings of Indian culture.

Second, the connection to Sanskrit: As the core of the Indian culture is intertwined
with the ancient language Sanskrit, this Young India wants to revive its connection with
the roots through Sanskrit.

Evidence of this claim:

Hypothetically, if one concludes that

- 5% of the 120 million young voters,
- 5% of the total 550 million voters, and
- 1% of India's total population (of a billion+) want Sanskrit, then

collectively more than 40 million people would desire Sanskrit to be revived. For this, one
important way forward is to start a National Sanskrit TV Channel.

Third, the specific audiences for this TV Channel are:

- Over 40 million middle school students who have chosen Sanskrit as the elective second
language in the school curriculum.
- 25,000+ Sanskrit university students from 25+ Sanskrit university campuses
- 60,000 students of Ayurveda medicine colleges


- 250,000 college students who have chosen Sanskrit in their studies from 93 Sanskrit
departments in colleges across the nation
- 50,000+ Sanskrit paatashaala students from over 5,000 Paatashaalas
- 300,000 students from Oriental schools across the nation who are studying Sanskrit
- 12 million+ adults who have been trained in spoken Sanskrit by Samskrita Bharati
- 3 million+ people who have undergone certificate courses conducted by Rashtriya
Sanskrit Sansthan, Bharatiya Vidya Bhavan and many other such organisations.
- In January 2012, during the Vishwa Samskrita Pustaka mela in Bangalore, within a
span of 4 days over 6 million Sanskrit books (only Sanskrit) worth 70 million Rupees
were sold by over 60 publishers.
- Every single Sanskrit teaching video /audio clip on YouTube and other online media
has been registering 1000s of hits, with multiple repeated hits, during the past 5 years!
- A search in Google for the word "Sanskrit" or "Samskrit" returns over 43 million
page results - even some of the European languages themselves don't have this
many pages on the Web.
- Anyone who speaks chaste Hindi can understand Sanskrit, as almost 90% of Hindi
words are in fact Sanskrit words (truncated at the "Praatipadika" level - without the
nominal affix /suffix fully inflected). Thus a large audience readily exists for
a National Sanskrit TV Channel.
- Thus the total potential audience for Sanskrit TV, over 50 million, is almost
equivalent to the population of a State in India.

Fourth, in addition to all these people becoming viewers of the National Sanskrit TV
channel, school and college students will benefit directly from the scientific
underpinnings of the Sanskrit language.

However, this National Sanskrit TV channel should be professionally managed, with
innovative programming and packaging, so that it not only helps to spread the
language but is also commercially viable. On this channel all content,
including advertisements, should be in Sanskrit only.

Currently about 5 minutes of Sanskrit news is shown once or twice a day on the National
Television channel. This is insufficient to promote Sanskrit, which was the objective of the
first Sanskrit Commission, formed in 1956-57 by the Prime Minister Sri. Nehru. To date
many of the recommendations in the commission's report are yet to be implemented. A
National Television channel will not only help to promote Sanskrit; it will also allow the
recommendations of the Sanskrit Commission to be deliberated.


[The Oath taking ceremony in Sanskrit by Union ministers and Parliamentarians was
witnessed with great enthusiasm by the Indian public]

This National Sanskrit TV channel can telecast simple spoken Sanskrit lessons,
Ayurveda /Jyotisha /Vastu /Yoga /Vedic Maths classes, UGC coaching classes for
Sanskrit including NET, Sanskrit classes for children and school students, Sanskrit
News, Sanskrit Dramas, Cultural Quiz programs, classes for Vyakaranam (Linguistics) &
Nyaya (Epistemology), Maths, and spiritual discourses like the Bhagavadgita in simple
Sanskrit, etc.

When there is a TV channel for every single official language of India, including
Urdu, why can't there be one for Sanskrit?

PS: Note that the picture is Copyright of Doordarshan of India, and is graphically
altered for visual appeal and not for any commercial purpose. Currently DD-Sanskrit
doesn't exist! Let us hope it will soon become a reality!

Posted by Krishnamurthi CG at 5:06 PM

Labels: Ayurveda, Bhagavadgita, Doordarshan, Indian Culture, Jyotish, National TV,
Samskrit, Samskritam, Samskriti, Sanskrit, Sanskrit Channel, Sanskrit Television,
Sanskrit TV, School Students, Vaastu, Vedanta, Yoga

Thursday, April 18, 2013

"Zero" is in Veda itself...


When we count from 1 onwards and go beyond 9, how can we proceed if we don't have the number 10? To have the number
10 we must have the number "0"; otherwise how can 10 be written, if not by first writing 1 and following it with a 0? We are not
familiar with any other method of writing in the decimal system (which originated in ancient India). If that is so, how could the
Vedic rishis have mentioned such large numbers as ayuta for ten thousand, niyuta for hundred thousand, prayuta for million,
arbuda for ten million, nyarbuda for hundred million, etc. (these are used in the Yajur Veda)?
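
The place-value argument can be made concrete with a small sketch. It only illustrates the arithmetic: the function name `place_values` and the `NAMES` table are hypothetical helpers written for this post, and the number names are simply the ones listed above.

```python
# Sanskrit names for powers of ten, as listed in the paragraph above.
NAMES = {
    10**4: "ayuta",
    10**5: "niyuta",
    10**6: "prayuta",
    10**7: "arbuda",
    10**8: "nyarbuda",
}

def place_values(n):
    """Return (digit, power_of_ten) pairs, most significant place first."""
    digits = [int(d) for d in str(n)]
    k = len(digits)
    return [(d, 10 ** (k - 1 - i)) for i, d in enumerate(digits)]

# Writing ten needs a digit 0 to hold the units place.
print(place_values(10))  # [(1, 10), (0, 1)]

# Every named power of ten is a 1 followed only by zeros.
for value, name in NAMES.items():
    zeros = sum(1 for d, _ in place_values(value) if d == 0)
    print(name, value, "zeros:", zeros)
```

Summing digit times place value always gives back the original number, which is the sense in which the written form depends on "0" as a placeholder.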

Today all encyclopedias wrongly attribute the invention of "0" to Babylonian mathematics in the 7th century BCE, and give only a
passing remark about Acharya Pingala in the 3rd century BCE as the one who used "0" in the Chandas shastram. Chandas shastram is a
Vedanga - a limb of Veda. Acharya Pingala's Chandas shastram, like Paninian grammar, was written for both the Vedic and worldly branches of
Samskritam. The original Chandas shastram was a part of Veda itself in the earlier Era. Thus it is evident that "0" was there from the time of Veda
- which is time immemorial. References to "0" in Pingala's Chandas shastram are in Sutra 8.29 "rupe shunyam" and Sutra 8.30 "dvihi shunye" -
both these sutras employ the 'valueless' usage of "0".

The "Brhadaranyaka Upanishad 5.1" of the Shukla Yajur Veda (1.4.10) quotes "Kham Brahma"; based on the "adhibhautika" meaning of this
passage, "Zero is Brahman" (complete, infinite, etc.).

Again, Yajurveda sukta 17.2 elaborates on the decimal place value system. Without "0", how can decimal place value be represented?

Sri. Aryabhata used the word "Kha" widely to denote emptiness ("Kham"). Sri. Suryadeva, commenting on Aryabhata's "Kha", says
"khani sunya upa lakshitani". In Brahmagupta's work, the word "Kha" gets prominence. "Kha" and Shunya (void) are used synonymously. In
Lilavati, when one comes across the chapter on the description of Shunya (zero), it is a veritable carnival of kha. The verse reads as follows:

"Yoge kham kshepsamam, vargado kham, khabhajito rashi Khahara syat, khaguna kham, khaguna nishchantayashcha sheshavidhau!!"
- verse 46, Lilavati

From the above it is evident that "0" as a place value system and also "0" as a number were there since the Vedic times - in
other words "anaadi" - beginningless or time immemorial.

So let's not keep repeating the mistake that Sri. Aryabhatta invented "0". No doubt Sri. Aryabhatta was a great mathematician and
scientist. But saying that Sri. Aryabhatta invented "0" would be an insult to our scientific advancements before him. To elaborate further,
during the Mahabharata war Astras (missiles) were widely used. The launch of such aerial weaponry requires precise calculations involving
topography, geometry, trigonometry, etc. Such calculations certainly require the use of "0", just as today a missile launch can't be done
without precise calculations requiring the use of "0".

Furthermore, Maharishi Vyasa wrote slokas on celestial maps with references to three sequential solar eclipses and to planetary positions.
The reference to the first solar eclipse comes in the Sabha Parva 79.29. The second solar eclipse, just before the Mahabharata war, is in the
Bhisma Parva 3.29, following a lunar eclipse occurring within the same fortnight. He warns that these successive eclipses are a sign of bad
times (we can now use these celestial positions to draw a detailed astronomical map and to date the Mahabharata war precisely).
All such complex calculations require the usage of "0"; thus "0" was in use in the Mahabharata time and even
before.

The English word zero came via French zéro, which is from Venetian zero, which came (together with Cipher /Cypher) via Italian
zefiro from Arabic ṣafira = "it was empty", ṣifr = "zero", "nothing". This was a translation of the Samskritam word shoonya
/shunya, meaning "Valueless" or "empty".

The etymological chain confirms that only the word "Shunya" (which is used to denote "0" as a valueless number) travelled, and the
same word is used for all other purposes of "0" even today, such as "0" as a valueless number, in the place value system, in fractions, etc.

Though various mathematical calculations using "0" for other purposes travelled later, the other Samskritam words didn't travel till the
20th century. Later, in the 20th century, words such as Void (from the Sanskrit word Vyoma) started to be used in computer
programming languages.

In Samskritam we have many words for "0" depending on its value. They are below:

- poojya /sat = Holy (complete) - from the word Wholly
- shunya, rikta, randra = Valueless
- Aabhu, avyakta = Inexpressible (value can't be determined)
- purna, ananta = Complete, full, endless (infinite value)
- kha /kham, diba, vyoma = Infinity
- bindu = Point /Dot (used in fractions)
- avyaya = NaN / Indeclinable
- saankheya, drabinam = Ordinal (while counting "0" as a number)

Such a wide variety of names denoting "0" is found in many places, starting from the Vedas, Kalpa sutras, Chandas shastra, and many
other treatises. Many of the mathematicians of ancient Bharatam were Vaiyakaranaas - as the entire vyakarana sutras of Maharishi Panini
are themselves based on Bija Ganita (Algebra) principles. Maharishi Panini in the Ashtadyayi refers to an equivalent of "0" as "lopa" - in this
kind of usage, the value which was originally there has been removed after a particular phonetic change and the loss of a phoneme.

The Ganita shaastra (mathematics) developed into a separate branch of study very long ago, starting with the Shulba sutras of
Sri. Bodhayanacharya and Jyotisha shastra times. "0" was in wide usage for a very long time even before the development of Ganita as a
separate branch of study. Sri. Aryabhatta, Sri. Bhaskara, Sri. Brahmagupta, Sri. Neelakanta Somayaji, etc. were Ganita Shastragnas after
the period of Sri. Gautama Buddha.

Even before and after the period of Sri. Gautama Buddha, Jain mathematicians were quite popular, and even before Jainism came,
Vaiyakaranaas were great mathematicians as well as linguists, as the entire Samskritam language is based on mathematics and thus it is
most suitable for Computing.

In ancient times the Ganita shaastra (mathematics) had as its branches - Geometry (Gyamiti), the study of shapes and their
applications; Algebra (Bija Ganita), the study of operations and their applications; Trigonometry (Trikonamiti), the study of triangles and the
relationships between their sides and angles; and Calculus (Chalana-kalana Ganita), the study of change.

The standard arithmetic algorithms actually originated in India, where they were known by various names such as patiganita (slate
arithmetic). However, the word algorithm comes from algorithmus: the Latinised name of al Khwarizmi of the 9th century House of
Wisdom in Baghdad. He wrote an expository book on Indian arithmetic called "Hisab al Hind". Gerbert d'Aurillac (later Pope Sylvester II),
the leading European mathematician of the 10th century, imported these arithmetic techniques from the Umayyad Khilafat of Córdoba. He
did so because the primitive Greek and Roman system of arithmetic (tied to the abacus), then prevailing in Europe, was no match for
Indian arithmetic. However, accustomed to the abacus (on which he wrote a tome), Gerbert was perplexed by algorithms based on the
place-value system, and foolishly got a special abacus (apices) constructed for these Arabic numerals in 976 CE.

Hence the name Arabic numerals: a learned pope amusingly thought there was some magic in the shape of the numerals
which made arithmetic efficient. Later, Florentine merchants realised that efficient Indian arithmetic algorithms conferred a competitive
advantage in commerce. Fibonacci, who traded across Islamic Africa, translated al Khwarizmi's work, as did many others, which is why they
came to be known as algorithms. Eventually, after 600 years, Indian algorithms displaced the European abacus and were introduced in the
Jesuit syllabus as practical mathematics circa 1570 by Christoph Clavius. These algorithms are found in many early Indian texts, such as
the Patiganita of Sridhara, the "Ganita Sara Sangraha" of Mahavira, or the Lilavati of Bhaskara II.

Sri. Ananda Coomaraswamy wrote a short piece on the concept in 1934, "Kha and other words denoting Zero, in connection with the
Indian Metaphysics of space". He tried to trace the origin of the use of "kha" for space to the Rigveda, in the context of the hole in the nave
of a wheel through which the axle runs. He states that "sunya" (void) as well as "purna" (full) have a common reference in the Vedas.
Since the Vedic seers were enamoured of the wheel (chakra - cycle), the names of various parts of the wheel were used to explain
metaphysical concepts. Now, "kha" is the "Naabhi" of the wheel, the space within the hub. "Naabhi" is also the navel, the navel of beings and
things. Thus, "kha" is the central space of things and beings. In the Rigveda, "kha" or "Naabhi" of the world wheel is regarded as the
receptacle and fountain of all order, formative ideas and goods - Ananda K. Coomaraswamy, "Kha and other words denoting Zero, in
connection with the Indian Metaphysics of space", Bulletin of the School of Oriental Studies, VII (1934).

Posted by Krishnamurthi CG at 12:19 AM

Labels: 0, Buddha, Chandas, Cipher, Cypher, Jain, Mathematics, Maths, Panini, Pingala,
Pujya, Pujyam, Shoonya, Shunya, Soonya, Sunya, Veda, Vedanga, Vyakarana, Zero

Saturday, March 23, 2013

What is "Sabda" - Shikshaa and Vyakarana in Samskrit - Science of Sound

The science of sound is Vyakaranam.

Both knowledge (meaning) and the original "sound" are associated with "Sabda" in Samskrit.
Sound is given more importance in the Samskrit language than the word, as the natural
sound in itself carries an inherent meaning. The "word", being its derivative with a suffix
added, conveys the derived /modified meaning of the original sound. Thus the language is
also its derivative. Although vyakarana is usually taken to mean grammar, it is more
than just grammar, as vyaakarana deals with all the derivations of the primordial
sounds, such as words, word-sense, phrases, sentences, figures of speech, etc. The shaastra
that deals with 'sound' in its basic form is called Shikshaa, which is primarily a
Vedanga (a part of Veda, like Vyaakarana).

Shikshaa shaastra is the foundation for studying the 2 branches of the Samskrit
language: Vaidika (Vedic Samskrit) and Laukika (Classical Samskrit). Laukika is the
part of the language that is in use for all purposes other than Vedic - including
science, literature, medicine, and all other worldly things.

Shikshaa shaastra in its full form is a complex and intricate science based on the
human vocal system. It is used in its full capacity in the Vedic part of Samskrit.
The same shaastra is also used in a limited manner in the all purpose
non-Vedic part of Samskrit.

Even though Sandhi is studied along with grammar, Sandhi deals only with the
pronunciation of the syllable (Varna) with respect to factors such as Place,
Effort, Duration, Pitch, etc. All these deal with syllabicity (syllables) and
phonological aspects (which are part of Shikshaa) rather than with words and
meanings. Thus Sandhi is primarily a subject of Shikshaa rather than a grammatical process.
Even though Sandhi rules are given in grammar texts, the place, etc. are elaborated
in the 'Varnochaarana shikshaa' and the 'Paniniya Shiksha' of Maharishi Panini.
The rules for changing syllables when syllables are joined, though found in
grammar texts, are in essence part of Shikshaa /Phonology.

Thus the linguistics treatise Ashtaadyayi not only deals with grammar (which is
primarily syntax & semantics); it also deals with the rules of phonetics
- Shikshaa - and in general all aspects of "Sound", which forms the basis for
language - including morphology, etc.
Not just Sandhi: the fundamental formations in Sanskrit, Roots + Suffixes, and
the word generation processes are essentially based on phonetic processes
like Vriddhi, Guna, Samprasaaranam, etc. - processes of expansion of syllables,
which are purely natural sound modifications while joining syllables. These are
evident across word formations - in both noun forms and verb forms, from singular
to plural forms, and also in declensions. The Vriddhi, Guna, etc. in primary and
secondary noun derivatives from noun roots, nouns and verb roots, etc. again
strictly follow the rules of phonetics. In addition, phonetic features such as
Natvam and Shatvam are also part of shikshaa. The letter 'h' becoming the fourth
letter of the group consonants is again shikshaa. Similarly, almost all the
Dhaatus - verbal roots - are single syllable phonetic (sound) forms, as are the
suffixes. The prefixes are again mostly dual syllable sound forms.
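
The Guna process mentioned above can be illustrated with a small sketch. This is only a toy model under loud assumptions: it handles a few roots ending in i, ī, u or ū, written in IAST transliteration, and applies just two steps, Guna of the final vowel (i/ī to e, u/ū to o) followed by the sandhi change of e/o to ay/av before the vowel a. The function name `present_3sg` and the two tables are hypothetical; the real derivation in the Ashtaadyayi involves many more rules.

```python
import unicodedata

def _nfc(text):
    """Normalise to composed Unicode so that 'ū' and 'ī' are single characters."""
    return unicodedata.normalize("NFC", text)

# Guna grade of a simple final vowel (toy subset only).
GUNA = {_nfc(k): v for k, v in {"i": "e", "ī": "e", "u": "o", "ū": "o"}.items()}

# Sandhi: e and o become ay and av before a following vowel.
BEFORE_VOWEL = {"e": "ay", "o": "av"}

def present_3sg(root):
    """Toy derivation: root -> Guna -> sandhi before 'a' -> add 'a' + 'ti'."""
    root = _nfc(root)
    final = root[-1]
    if final not in GUNA:
        raise ValueError("this sketch only handles roots ending in i, ī, u or ū")
    guna_vowel = GUNA[final]                      # e.g. ū -> o
    stem = root[:-1] + BEFORE_VOWEL[guna_vowel]   # e.g. bho + a -> bhav + a
    return stem + "a" + "ti"

for root in ("bhū", "nī", "ji"):
    print(root, "->", present_3sg(root))
# bhū -> bhavati
# nī -> nayati
# ji -> jayati
```

Both steps are purely sound changes; no meaning is consulted anywhere in the derivation, which is the point the paragraph above makes.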

There are many shikshaa shaastras - more than 40 have come down to us so far for the
four Vedas and their shaakhaas. Of these, the Paniniya shikshaa for the Laukika
(non-Vedic) branch of Samskrit is famous. Thus Maharishi Panini integrated all these
branches of "the science of language" in his monumental work Ashtaadyayi. It appears
that the entire work of Maharishi Panini is to make rules for pronouncing the "Word"
correctly, as the "Word" in itself has an inseparable meaning attached to it - thus the
perfect pronunciation of just one "Word" takes you to heaven, as per Maharishi
Patanjali.

The word "vyaakaranam - vi+aa+kr+lyut (suffix)" itself means a "Special form" (of
language) - with stress on the verb (creation of the form). The other 2 similar
words, (1) "aakaarah - aa+kr+ghan (suffix)" and (2) "aakritih - aa+kr+ktin (suffix)",
represent form and shape respectively in common usage. The extra 'vi' prefix gives
the meaning of Special. Thus the word vyaakaranam itself means the entire science of
the creation of the language, which includes abiding by the natural phonetic
capabilities of the human vocal faculties and also reflecting the natural and eternal
"sound-meaning" combination.
The entire Shikshaa shaastram is based on human vocal anatomy, and its primary
purpose in the laukika part of the language is to make pronunciation easy and natural
(similar to Veda), in addition to the shortening, softening, replacing, adding, etc. of
syllables based on the natural movement of the tongue and the natural functioning of
the vocal cords. This has also helped in making the entire language musical - which in
turn helped in the easy communication and retention of huge volumes of treatises over
1000s of years, generation after generation, based on the most natural and easy to
remember phonological sounds.

The natural intertwining of phonology and language - music and literature - is a true
wonder! Let us hope we understand it, hold it dear (on our tongues) and preserve it by
passing it to the next generation without any deterioration...

This shikshaa shaastra is primarily a Vedanga - which means a part of Veda - and is also
used in Yoga, Tantra and Shastras. The natural relationship between language and
phonology proves that Samskrit is a well constructed (not by human) language and is
indeed the greatest gift to mankind, from who? - who else...!
------------

Personally this has led me to the conclusion that originally all 6 Vedangaas (Shikshaa,
Chandas, Nirukta, Vyaakaranam, Jyotisha and Kalpa) must have been a single shaastra
(maybe called Vyaakaranam - based on the Yogaartha of the word) and must have
been an integral part of Veda in the earlier Era (Dwapara Yuga), when Veda was just
one!

Basic details of Shikshaa can be found at
https://vedavichara.com/the-vedas/vedangas-the-limbs-of-vedas.html
and
http://en.wikipedia.org/wiki/Shiksha

To continue...

Posted by Krishnamurthi CG at 11:24 PM

Labels: Ashtadyayi, Grammar, Language, Linguistics, Meanings, Panini, Phonetics,
Phonology, Samskrit, Sanskrit, shiksha, shikshaa, siksha, Sound, Vedanga, Vyakarana,
Word

Friday, March 1, 2013

Why Sanskrit? in Computational Linguistics - Part 2


First of all, the confusion that needs to be cleared up is whether Sanskrit is best suited for
Computing or for Computer Programming - my view is both. Yet this paper is not about
Sanskrit as a computer programming tool, even though there are scientists and
academicians who are developing programming languages based on Paninian
principles; this paper deals with Sanskrit as a Computing tool. Computing here
refers to concepts, algorithms and methodologies.

Computer Programming is an entirely different thing, as it deals with a human being
writing code in a high level computer language, which is in turn translated to low
level code through compilers /linkers, which is in turn translated to operating system
instructions, which are in turn translated to microprocessor instructions (based on the CPU
instruction set), which are internally converted into binary instructions, which are further
converted to digital electronic (electrical) signals for flip-flops /counters etc.
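
One link in this chain can be observed directly. The sketch below uses Python's built-in `compile` to turn one line of source text into low-level instructions stored as plain bytes. It is an illustration only: Python bytecode targets the Python virtual machine rather than a CPU instruction set, the helper name `lower` is hypothetical, and the exact byte values differ between Python versions.

```python
def lower(source):
    """Compile source text and return the code object plus its raw instruction bytes."""
    code = compile(source, "<example>", "exec")
    return code, list(code.co_code)

code, raw_bytes = lower("total = 1 + 2")

# The low-level form is just a sequence of small integers (one byte each).
print(all(0 <= b <= 255 for b in raw_bytes))  # True

# Executing the compiled form gives the result the human-readable line describes.
namespace = {}
exec(code, namespace)
print(namespace["total"])  # 3
```

Calling `dis.dis(code)` from the standard `dis` module prints the same instructions with readable names, which makes the gap between the written line and what the machine runs easy to see.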

The entire chain of programming is based on mathematics /symbol language and not on
any human language, spoken or written - even though the symbols include a few
human understandable characters such as the numerals 0-9, the letters a-z, and some
signs of mathematics such as +, -, /, %, etc. All these are part of ASCII, which defines
128 characters or symbols (8-bit extensions of ASCII have 256) - in other words each
symbol can fit into a single byte. These symbols are assigned to certain operative values
in digital electronics - thus the programming languages are not human languages. That's
precisely why human beings want Natural Language Processing or human
language processing capabilities in computers - which literally means our languages
being understood by computers. So far computers understand only computer
language and human beings only human language.
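
The "symbol fits into a byte" point can be checked in a few lines. Standard ASCII assigns codes 0 to 127, so any ASCII symbol fits in one byte, while Devanagari text does not fit in ASCII at all. The helper names `ascii_codes` and `fits_in_ascii` are hypothetical, chosen only for this illustration.

```python
def ascii_codes(text):
    """Return the numeric ASCII code of each character (raises if not ASCII)."""
    return list(text.encode("ascii"))

def fits_in_ascii(text):
    """True when every character has a code below 128."""
    return all(ord(ch) < 128 for ch in text)

print(ascii_codes("a+1"))        # [97, 43, 49]
print(fits_in_ascii("shunya"))   # True
print(fits_in_ascii("शून्य"))     # False
```

The machine only ever sees the numbers; that a given number is drawn as "a" or "+" is a convention for human readers.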

With respect to computer languages, the instructions (let's say commands /actions
/verbs) are very limited - about 15 are widely used - such as go to, break, compare,
copy, reverse, assign, operators (+, -, *, /), receive, display, etc. A few other actions
(verbs) can be written as functions, such as sort, list, etc. Thus a computer language's
capability is very limited in comparison with human language.

In comparison, in the Ashtadyayi - Panini's 1000s of years old Sanskrit grammar treatise -
the meta language used is not Sanskrit itself but uses certain words and
rules of Sanskrit, and it is used to teach Sanskrit grammar to the readers of the
Ashtadyayi. That meta language has a larger instruction set, yet without using any
explicit verb. Thus if one can make a high-level programming language exactly
mimicking the meta language of the Ashtadyayi, we will have a powerful tool with which
a computer can generate words, form sentences, etc. - yet associating meanings will be
the biggest challenge.

The hypothesis is this: if there is a highly structured human language, can that
language be used for Natural Language Processing? The answer is yes, and to a
wonderful degree, even for complex human sentences. How? In linguistics, and
most importantly in computational linguistics, the following are essential for scientific
analysis (for computers to do the analysis) of the language, which consists of
sentences - and sentences have inherent meanings.

How then can analysis take place?

Conveying the sentence meaning without ambiguity is the first and most important
thing, because computers don't have intelligence. A computer's understanding of
language is based on a particular structure (let's say a word or phrase and its
meaning), and it tries to combine or mix and match these to arrive at a particular meaning.
Again, the computer doesn't care about the meaning; it responds to a question which has
a particular meaning, and from a set of answers the most suitable
answer is chosen and given - which is again based on the particular individual, popularity,
number of occurrences, place, time, etc.

In natural language for example sentences like -

a- The committee chair chairs the meetings where the chair is elected as the chair for
one more chair-term.
b- All committees' chairs chair their meetings to elect the chair and the past chair is
elected as the new chair for the next chair-term.

Now, we can easily understand the meanings of these sentences, but a computer can't.
This is where the language's ambiguities matter: ambiguities in word meanings, in the
declensions and usage of words, and in the meanings of phrases, both alongside other
phrases and within a sentence.

English is the most complex language - it takes even a native speaker 8-10 years to
achieve proficiency. It takes just 2 years to achieve proficiency in Sanskrit, and another 2
years in literary Sanskrit. In addition, the written form of English is non-phonetic,
which adds its own problems in converting text to voice. Due to many
borrowed words, spellings and pronunciations differ and add complexity.
Moreover, there are the regional flavours.

The interpretation of a text in computing goes through, in order: parsing, part of speech
tagging, lexical analysis, morphological analysis, syntactical analysis, and then
semantical analysis. Thus the more structured and scientific the language is, the fewer
the problems in computer based natural language processing and its applications such as
Machine Translation and Machine Assisted Translation.

In general, a language is a collection of sentences, and a sentence is a collection of
meaningfully associated words such as Subject, Object, Verb, etc. Each word in a
sentence should have clear and easily understandable verbal and nominal declensions;
if not, confusion starts. In the above example the word "chair" is both a verb and a
noun - this causes enormous confusion for a computer.
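
To make the "chair" problem concrete, here is a minimal Python sketch. It assumes a
hypothetical two-entry lexicon (AMBIGUOUS) in which "chair" and "chairs" may each be a
noun or a verb, and it treats every other word as unambiguous. The names
candidate_tags, tokenize and count_readings are made up for illustration and do not
come from any NLP library. The sketch only counts how many part-of-speech assignments
a program would have to choose between before any context is used.

```python
import re
from math import prod

# Hypothetical toy lexicon (an assumption for this sketch, not a real resource):
# only "chair" and "chairs" are listed as ambiguous between noun and verb.
AMBIGUOUS = {
    "chair": ("NOUN", "VERB"),
    "chairs": ("NOUN", "VERB"),
}


def candidate_tags(word):
    """Return every part of speech the toy lexicon allows for a word."""
    return AMBIGUOUS.get(word, ("OTHER",))


def tokenize(sentence):
    """Lowercase the sentence and keep hyphenated words such as 'chair-term' whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())


def count_readings(sentence):
    """Count the tag sequences possible if each word is tagged independently."""
    return prod(len(candidate_tags(word)) for word in tokenize(sentence))


sentence_a = ("The committee chair chairs the meetings where the chair is "
              "elected as the chair for one more chair-term.")
sentence_b = ("All committees' chairs chair their meetings to elect the chair "
              "and the past chair is elected as the new chair for the next "
              "chair-term.")

print(count_readings(sentence_a))
print(count_readings(sentence_b))
```

With this toy lexicon, sentence (a) has 16 candidate tag sequences and sentence (b)
has 32: every additional ambiguous word doubles the count. A real tagger has to use
the surrounding words to narrow these down to one.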

Let's explore further. Human language is highly ambiguous, primarily because of the
ambiguities of word sense: the meaning of a word on its own, in association with
another word (in a phrase), and in association with verbs and other words in a
sentence. At each level the complexities multiply. With quotations and idioms the
complexities only increase further in a sentence or a part of speech. Now add
acronyms, and what do we get? The most complex thing known to human beings, next only
to the human mind - or are language and mind one and the same?

Thus, in linguistic terms, the complexity increases exponentially at each successive
step. The 6 most important areas of Linguistics, in order, and how each is handled in
Sanskrit, are given below.

(1) Phonetics and Phonology - knowledge about linguistic sounds. In Sanskrit this is
known as Shiksha shastra. Sanskrit has over 40 Shikshas for each shakha of the Veda,
but for the language in general the Paniniya Shiksha is the most suitable, as it
connects directly to the Grammar, and the rules of the grammar also abide by the rules
of the Phonetics.

(2) Morphology - knowledge of the meaningful components of words, starting from stems,
and of their generation and usage. In Sanskrit this is called 'pada vyutpatti'. There
are 4 types of vrittis (word generators) for this purpose, namely Krit, Taddhita,
Samasa and Sannaadyanta. In addition, the method for generating words is explained
step by step in Panini's Ashtadhyayi, like a mathematical equation - thus programming
a computer to generate words is easiest.

(3) Lexical - knowledge of meanings and equivalent words. Every Sanskrit lexical item
has a one-to-one correspondence, so a particular word used in one place means the
same when used elsewhere too, from a semantics point of view. The Amara Kosha, Nirukta
and Nighantu all have the complete lexical database of Sanskrit words and associated
word connections.

(4) Syntax - knowledge of the structural relationships between words, that is, the
declensions of nominal forms/stems. In Sanskrit the Vibhaktis play this role. The
rules are very tight, thus there is no ambiguity. Also, Sanskrit is a language without
prepositions, thus a major complexity is removed. This too is explained step by step
in the Ashtadhyayi, like a mathematical equation.

(5) Semantics - knowledge of the meaning of words in a sentence. In Sanskrit this is
discussed in detail in many works, and in Sanskrit vyakarana it is called "Kaarakam".
Many ways of analysing sentence meanings on a scientific basis are available in
Sanskrit, across the different schools of linguistic sciences such as Vyakarana,
Nyaya and Mimamsa.

(6) Pragmatics - knowledge of the relationship of meaning to the context. This is the
most complex, as meanings change based on context and many other factors. In Sanskrit
there is a wonderful Vyakarana treatise on pragmatics called "Vakyapadiyam", by
Maharishi Bhartrhari. It is a pity that many Sanskritists are not aware of it, but
this treatise is very popular among European linguists, in particular German, Belgian
and French ones.

Almost all kinds of meaning analysis based on the relationship between 2 words
(sameness, opposites, connection, association, context, etc.) are dealt with in detail
in the "Vakyapadiyam". Scholars in the West have been inspired by it in making their
theories of Semantics and Pragmatics.
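
As a small illustration of point (4) above, the role of the Vibhakti can be sketched
in Python. This is only a toy under loud assumptions: the ending table covers just two
singular endings of masculine a-stem nouns (-ah for the nominative, -am for the
accusative), the words are written without diacritics, anything else is treated as
the verb, and the names CASE_ENDINGS, role_of and analyse are made up for this sketch.
It is not an implementation of the Ashtadhyayi. The sentence used is "balakah gramam
gacchati" (the boy goes to the village).

```python
from itertools import permutations

# Assumed toy table: two singular case endings of masculine a-stem nouns,
# in plain ASCII transliteration. A real analyser needs the full paradigms.
CASE_ENDINGS = {
    "ah": "subject (nominative)",
    "am": "object (accusative)",
}


def role_of(word):
    """Guess the role of a word from its ending alone."""
    for ending, role in CASE_ENDINGS.items():
        if word.endswith(ending):
            return role
    return "verb"  # toy assumption: whatever is not a listed noun form is the verb


def analyse(sentence):
    """Map each role to the word that fills it, ignoring word order."""
    return {role_of(word): word for word in sentence.split()}


words = ["balakah", "gramam", "gacchati"]

# Analyse every possible ordering of the three words.
analyses = [analyse(" ".join(order)) for order in permutations(words)]

print(analyses[0])
print(all(a == analyses[0] for a in analyses))
```

In this toy, all six orderings of the three words give the same role assignment,
because the role is read from the ending and not from the position. The English
"chair" sentences have no such marking, which is why position and context must do all
the work there.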

Some references from Wikipedia follow; further references from linguistic journals
and scientific publications are listed below. Wikipedia - The Link (as of March 1st,
2013):

"Pāṇini's work became known in 19th-century Europe, where it influenced modern
linguistics initially through Franz Bopp, who mainly looked at Pāṇini. Subsequently, a
wider body of work influenced Sanskrit scholars such as Ferdinand de Saussure,
Leonard Bloomfield, and Roman Jakobson. Frits Staal (1930-2012) discussed the impact
of Indian ideas on language in Europe. After outlining the various aspects of the
contact, Staal notes that the idea of formal rules in language proposed by Ferdinand
de Saussure in 1894 and developed by Noam Chomsky in 1957 has origins in the
European exposure to the formal rules of Pāṇinian grammar. In particular, de
Saussure, who lectured on Sanskrit for three decades, may have been influenced by
Pāṇini and Bhartrihari; his idea of the unity of signifier-signified in the sign somewhat
resembles the notion of Sphoṭa. More importantly, the very idea that formal rules can
be applied to areas outside of logic or mathematics may itself have been catalyzed by
Europe's contact with the work of Sanskrit grammarians.

de Saussure
Pāṇini, and the later Indian linguist Bhartrihari, had a significant influence on many of
the foundational ideas proposed by Ferdinand de Saussure, professor of Sanskrit, who
is widely considered the father of modern structural linguistics. Saussure himself cited
Indian grammar as an influence on some of his ideas. In his Mémoire sur le système
primitif des voyelles dans les langues indo-européennes (Memoir on the Original
System of Vowels in the Indo-European Languages) published in 1879, he mentions
Indian grammar as an influence on his idea that "reduplicated aorists represent
imperfects of a verbal class." In his De l'emploi du génitif absolu en sanscrit (On the
Use of the Genitive Absolute in Sanskrit) published in 1881, he specifically mentions
Pāṇini as an influence on the work."

Sanskrit is referred to as Devabhasa (the language of the Gods) not because it is the
oldest language, but because it is so perfect in its structure, morphology, semantics,
etc., which have not changed for 1000s of years. Only a good linguist understands that
such a perfect language can't have been created by human beings, neither by cavemen
nor by evolved ones. What is the evidence? We have seen English evolve from a
structured language, with some formal structures and usage initially, to one that now
has no structure and is highly ambiguous - not with respect to human understanding but
with respect to Linguistics. With respect to human understanding it has become easy
and flexible. As a result, can we say that the human mind has become unstructured and
dull? Maybe the scientists of past centuries, without labs and tools, found many
things. The mathematician Ramanujan's tools were just a pencil and paper.

References:

1. The science of language, Chapter 16, in Gavin D. Flood, ed., The Blackwell
Companion to Hinduism, Blackwell Publishing, 2003, 599 pages, ISBN 0-631-21535-2,
ISBN 978-0-631-21535-6, p. 357-358
2. George Cardona (2000), "Book review: Pāṇinis Grammatik", Journal of the
American Oriental Society 120 (July-September, 2000): 464-5, JSTOR 606023
3. Leonard Bloomfield (1927). "On some rules of Pāṇini". Journal of the American
Oriental Society (American Oriental Society) 47: 61-70. doi:10.2307/593241.
JSTOR 593241
4. Ashtadyayi Reference: http://avagraha.wordpress.com/
5. Sanskrit Programming - Reference: 2 sites, both containing a lot of information -
(1) http://vagartham.blogspot.in/ and
(2) http://uttishthabharata.wordpress.com/
6. Functional programming - Reference:
http://vishk.wordpress.com/2007/02/11/backus-naur-form-and-ashtadhyayisanskrit-grammar/
7. Parser /Tokenizer for Samasa - Vaakkriti: Sanskrit Tokenizer, Aasish Pappu and
Ratna Sanyal, Indian Institute of Information Technology, Allahabad (U.P.), India,
Proceedings from the paper submitted in Third International Joint Conference on
Natural Language Processing, 2008, Hyderabad, India
8. The methods used inside the Ashtadhyayi are similar to today's arrays, inheritance
(including multiple inheritance), polymorphism, etc. used in OOPS - Reference:
Recent Research in Science and Technology, 2011, 3(7): 109-111, ISSN: 2076-5061,
www.scholarjournals.org
9. Computational Linguistics - Reference: Hyman, Malcolm D., From Pāṇinian
Sandhi to Finite State Calculus, Sanskrit Computational Linguistics: First and
Second International Symposia, Revised Selected and Invited Papers,
ISBN: 978-3-642-00154-3, Springer-Verlag, 2009.

Posted by Krishnamurthi CG at 12:45 PM



Labels: Ashtadyayi, Computational Linguistics, Computing, Linguistics, Programming,
Samskrit, Samskritam, Sanskrit, Sanskrit Lingusitics, Semantics, Shastra
