Tuesday, November 12, 2019

Myths around - Is Sanskrit really the best language for computer programming?

Dependency Parser [from paper by Amba Kulkarni]

I recently received a forward about an article on the net, "Is Sanskrit really the best language for computer programming?", at https://techzoworld.wordpress.com. It seems to be the only article on the blog, dating from April 2018. No author name is given.

The article laments the over-glorification of Sanskrit, especially in the context of its use in Computer Science. Indeed, there are many such comments, posts, and ideas floating around the net, inspired by Rick Briggs' NASA paper of 1985.

The exaggerations are almost always made by people who don't understand Sanskrit or Computer Science or both.

At the outset, one must understand that Sanskrit is not being proposed as a language in which to write programs. That would mean using Devanagari letters and Sanskrit words instead of English ones (or French, Chinese or Hindi). That is never the claim, though it is fully possible. It could surely encourage folks in India to write programs in local languages and scripts, just as people elsewhere write programs and use computers in languages other than English.

The importance of Sanskrit in Computer Science has to do with the way its grammar was encoded by Panini. He described the entire gamut and power of the language (i.e. its grammar) with a set of about 4000 formulas, pretty much the way the grammar of a modern computer language is defined. This fact is well documented and not disputed by anyone.
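To make "the way the grammar of a modern computer language is defined" concrete, here is a minimal sketch in Python. The grammar below is a made-up toy in a BNF-like style (the symbol names `statement`, `expr`, `term` and so on are my own assumptions, not anything from Panini or from a real programming language). It only shows how a handful of rewrite rules can pin down every valid sentence of a small language.

```python
from itertools import product

# Hypothetical toy grammar: each symbol maps to a list of alternatives,
# and each alternative is a sequence of symbols or literal tokens.
GRAMMAR = {
    "statement": [["name", "=", "expr"]],
    "expr": [["term"], ["term", "+", "term"]],
    "term": [["name"], ["number"]],
    "name": [["x"], ["y"]],
    "number": [["1"], ["2"]],
}

def expand(symbol):
    """Return every token sequence that the rules derive from `symbol`."""
    if symbol not in GRAMMAR:  # a literal token such as "=" or "x"
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        # Combine one derivation of each part, in order.
        for combo in product(*parts):
            results.append([token for part in combo for token in part])
    return results

statements = [" ".join(tokens) for tokens in expand("statement")]
print(len(statements))              # number of valid statements
print("x = y + 2" in statements)    # a well-formed statement
print("x = = y" in statements)      # not derivable from the rules
```

Five rules are enough to decide which strings belong to this tiny language and which do not. A real grammar, whether of a programming language or of Sanskrit, needs far more rules, but the idea of a finite rule set describing an entire language is the same.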

The author of the article on techzoworld, however, went to the other extreme, most probably due to a lack of knowledge of Sanskrit. Towards the end, the author got so carried away in the bashing that he gave space to any and every claim made by just about anybody. Demystifying just a few would have been enough for the intelligent reader.

The aim of this post is to point out other fallacies or misunderstandings in the above article at https://techzoworld.wordpress.com, and not to vindicate any exaggerated claims.

Why me? Well, I know enough of Computer Science and Sanskrit to be dangerous enough to write a post about it!

Below, the original text from the article is given, and then my explanation/comment.

Storage issues
Despite the arguably best verbal efficiency, there are a few issues with the language in actual knowledge representation. Sanskrit has a glyph based script rather than the alphabet based script as with Latin and its derivatives.

Any language can be written in any script that has enough symbols, and a new script can even be devised. Sanskrit can be written in almost all Indian scripts, and a few of those are not glyph based. Back in the early 1990s, Indian students in the US developed ITRANS, an ASCII-based mapping for the entire Devanagari script, which is not glyph based at all! With it, you can write any Sanskrit verse without ambiguity. For example:
IAST: yā vīṇāvaradaṇḍamaṇḍitakarā yā śvetapadmāsanā
Devanagari: या वीणावरदण्डमण्डितकरा या श्वेतपद्मासना
Plain ASCII (ITRANS): yA vINAvaradaNDamaNDitakarA yA shvetapadmAsanA
There are other mappings as well.

Script and language are different, and should not be confused.
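As a small illustration that script is only a mapping, here is a hedged Python sketch that converts ITRANS text to IAST. The table covers only a handful of ITRANS spellings (enough for the verse above) and the function name `itrans_to_iast` is my own. It is not a complete or official ITRANS implementation.

```python
# Partial ITRANS -> IAST table (an illustrative subset, not the full scheme).
ITRANS_TO_IAST = {
    "RRi": "ṛ",
    "sh": "ś", "Sh": "ṣ",
    "A": "ā", "I": "ī", "U": "ū",
    "T": "ṭ", "D": "ḍ", "N": "ṇ",
    "M": "ṃ", "H": "ḥ",
}
# Try longer spellings first so "sh" is not read as "s" followed by "h".
KEYS = sorted(ITRANS_TO_IAST, key=len, reverse=True)

def itrans_to_iast(text):
    """Convert ITRANS text to IAST for the subset of letters in the table."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(ITRANS_TO_IAST[key])
                i += len(key)
                break
        else:
            out.append(text[i])  # letters that are the same in both schemes
            i += 1
    return "".join(out)

print(itrans_to_iast("yA vINAvaradaNDamaNDitakarA yA shvetapadmAsanA"))
```

The same words come out in a different notation, and nothing about the language itself has changed. Going on to Devanagari would need more rules (for example, for the inherent vowel), but it is the same kind of table-driven mapping.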

Sanskrit’s naturalness
The fact is that Sanskrit, unlike other languages, hasn’t had a natural evolution. Nearly everything about Sanskrit, as is known today, was codified sometime around the year 500 BCE by one person, Panini, who was bent on making it as precise and concise as was humanly possible. Sanskrit didn’t simply happen to have the required characteristics of an artificial language by coincidence. It’s there by design. It is indeed the work of a primitive computer scientist without the hardware. This is not to say Panini intended for his language to be used with machines. At best, his work caught the eye of a pattern seeking human in need of an answer to a difficult, perhaps unsolvable problem – it was bound to happen sooner or later.

This is indeed a complete misunderstanding.

The Sanskrit spoken before Panini and the Sanskrit spoken after him do not differ like black and white! Even the Sanskrit of the Vedas is very similar in structure, vocabulary and grammar to the Sanskrit that came after Panini.
For example, 'na duruktAya spRihayet' 'न दुरुक्ताय स्पृहयेत्' is from Rigveda 1.41.9 (pre-Panini) and is perfect Sanskrit even today (post Panini).

Panini did not define Sanskrit grammar. He described it, not prescribed it.

He created rules that could encapsulate the existing language of his times.

He, like a true scientist, labored hard to make his product (the set of rules, not the language itself) as precise as possible, using as few resources (words) as possible. So it is not at all easy to understand Panini's aShTAdhyAyI on one's own; it is almost impossible. But the commentaries that explain the formulas do help a lot.

But children learned the Sanskrit language before and after Panini without learning the grammar first, just as English kids learn English without learning the grammar first. Sanskrit is indeed a natural language. It was not, and is not, like the Klingon language.

Panini was not the first grammarian either. He was one in a chain, and he himself cites at least ten grammarians before him. His genius was that he captured the essence of the entire existing language in a set of about 4000 formulas which, once encoded, were adhered to in learned circles. His grammar text, the Ashtadhyayi, is not meant for novices at all, and even die-hard students take years to master it, because it is encoded succinctly for the express purpose of brevity. That doesn't make the language difficult; it just makes mastering the formulas difficult.

From Wiki source:

The text takes material from lexical lists (Dhatupatha, Ganapatha) as input and describes algorithms to be applied to them for the generation of well-formed words. It is highly systematised and technical. Inherent in its approach are the concepts of the phoneme, the morpheme and the root. His rules have a reputation for perfection – that is, they tersely describe Sanskrit morphology unambiguously and completely. A consequence of his grammar's focus on brevity is its highly unintuitive structure, reminiscent of modern notations such as the "Backus–Naur form". His sophisticated logical rules and technique have been widely influential in ancient and modern linguistics.

The Aṣṭādhyāyī was not the first description of Sanskrit grammar, but it is the earliest that has survived in full.
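The idea in the quoted passage, a lexical list as input and rules that generate well-formed words, can be sketched in a few lines of Python. This is a drastically simplified toy of my own and not Panini's actual derivation: the three roots are chosen because they need no internal sound change, the stem is formed by simply adding "a", and only one joining rule is modelled. All names (`ROOTS`, `ENDINGS`, `join`, `present_forms`) are hypothetical. Words are written in ITRANS.

```python
# Toy lexical list: a few verb roots (ITRANS spelling) with rough glosses.
ROOTS = {"paTh": "read", "vad": "speak", "khAd": "eat"}

# Third-person present-tense endings for singular, dual and plural.
ENDINGS = {"singular": "ti", "dual": "taH", "plural": "anti"}

def join(stem, ending):
    """Join stem and ending; a stem-final 'a' merges with an ending-initial 'a'."""
    if stem.endswith("a") and ending.startswith("a"):
        return stem + ending[1:]
    return stem + ending

def present_forms(root):
    """Generate the three third-person present forms of a root from the list."""
    stem = root + "a"  # simplified stem formation, valid only for these roots
    return {number: join(stem, ending) for number, ending in ENDINGS.items()}

for root, gloss in ROOTS.items():
    print(root, "(" + gloss + ")", present_forms(root))
```

For paTh this produces paThati, paThataH and paThanti. The real Ashtadhyayi handles every root class, tense, person and sound change with the same kind of machinery, which is why it needs thousands of terse rules rather than two.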

'it was bound to happen sooner or later.' - this surely reeks of arrogance. This way, anything can be dismissed, be it Newton or Wolfgang Pauli!

The work of Panini was in wide circulation and use in India to teach grammar. When the British got to know of it and published it in Europe, the entire science of modern linguistics was born. While much of the world was at war, or enjoying wine, women and land, there were minds in India obsessed with the tone, scale and stress of the voiced utterances we call speech. One man, Panini, was also obsessed with finding a way to describe the entire language as succinctly as possible, with the minimum of syllables to remember. And he succeeded.

That is an extraordinary feat for all of humankind.

The Sanskrit of today, the one reportedly spoken by a few tens of thousands, is about the same as that codified two and a half millennia ago. The language doesn’t evolve, it can’t evolve. Unlike natural languages, speakers of Sanskrit cannot be classified as proficient or eloquent as its precision does not allow gradations. You either speak the language or you don’t; there is no grey. Even artificial languages do not suffer that restriction.

This too is incorrect. Just as there are poor, average and great programmers in computer languages, and the same gradations in English (students do get grades, right?), one can be a good, better or best speaker of Sanskrit.

Sanskrit doesn't force you to make long words or never-ending strings of syllables. It allows them, but doesn't demand them. The dreaded sandhi is also mostly optional. Simple spoken Sanskrit is still possible. Will that help you understand deep philosophical texts from 2000 years ago? No. But then, six years of modern English won't help you understand even 500-year-old Shakespeare either.

Samskrit Bharati, a Bangalore based organization, focuses on spoken Sanskrit, and has trained thousands of people from all walks of life who otherwise had no spoken exposure to the language. Learning a language is not difficult; excelling in it, to the point of precision and grandeur, is. Every child learns his or her mother tongue very easily. But it takes extreme toil to learn a new language later in life, or to master one's own mother tongue! Not every English speaking person is a Frost, not every musician a Bach.

Sanskrit was never widely spoken. During the past two and a half millennia, Sanskrit scholarship was an exclusive club. None other than the Brahmins were allowed to use it. That all literary works in Sanskrit was made accessible only to the Brahmins, spelt its doom. The thing about languages is that, like living organisms, languages too evolve by natural selection.

Natural languages thrive by fitting the need of the era. The flexible of the lot flourish organically forcing the less prominent ones to wither away. Sanskrit’s resistance to change was the reason of its demise. This is essentially why every attempt to revive the language will fail, no exceptions.

This is just a hearsay remark, as ridiculous as the claims about Sanskrit that the author objected to. Both arise from venturing into unknown territory.

Sanskrit was not the exclusive realm of Brahmins. Its rigorous teaching may have been done mainly by Brahmins, but as a language it was taught in every gurukula, and even until the mid 19th century the gurukulas had students from all walks of society (as noted in East India Company/British journals of the time). Since every proof has to come from a third party source, one should check "The Third Report on The State of Education in Bengal" by William Adam, 1838! Yes, 1838.
As per Western and liberated opinions, Hindu society must by this time have reached its nadir in terms of all its ills. But refer to pages 14 and 18 of the book to get an idea of the castes of the teachers and students. These are only for the sake of example; read the full book for more. And this is just one province in one state.

Page 14 - "The Third Report on The State of Education in Bengal" by William Adam

Page 18 - "The Third Report on The State of Education in Bengal" by William Adam

Sanskrit does allow borrowing words, and there has actually been a healthy exchange of vocabulary between Sanskrit and other languages. It has its own rules for making new words, like any language. It can take loan words as well, with a simple trick. For all vehicles, it can take the loan word and add -yAnam (-यानम्) at the end, thus enabling the same word formation as for native Sanskrit words. For example, kAra-yAnam (कार-यानम् = car), basa-yAnam (बस-यानम् = bus), Trena-yAnam (ट्रेन-यानम् = train), eyarplena-yAnam (एयरप्लेन-यानम् = airplane, or simply the native Sanskrit word vAyu-yAnam वायु-यानम् = air-vehicle). This is just an example.

The degree of precision that Sanskrit affords its speakers prevents verbosity i.e. purposefully lengthening prose for effect. Attempts at verbosity leads to a redundant prose. Translating to Sanskrit from any other language would thus lead to loss of data. This data isn’t particularly useful in the context of the prose, but having it allows one to deduce information about the author – things like their personality and state of mind while writing. A language that attains precision does so at the expense of creativity. This clearly doesn’t happen with Sanskrit considering the abundance of Sanskrit works.

There is nothing preventing one from being verbose in Sanskrit. It is just not considered very erudite to yap in 15 words what can be said in one or two. For example, take mantram (one word). Its dictionary meaning is a phrase: mananam trAyate iti mantram = that which, when contemplated upon, saves/rescues/redeems. So instead of mantram, one can keep using the phrase 'yasya mananam trAyate, tat' (whose contemplation rescues, that), or simply use 'mantram'.

Translating from other languages can be done in a Sanskrit way, going for the message rather than just the words, or in a very literal way if one chooses to. One also has to look at how words are formed in various languages. Most words in Sanskrit are formed from an attribute of the thing being described. For example, a lotus is called aravindam (one whose petals are spread like spokes), jalajam (one born in water), pankajam (one born in muddy water), kamalam (that which adorns water) etc. The particular 'synonym' is chosen based on context and meter. But in English, all of them become 'lotus'. I am not aware of any 'meaning' of the word 'lotus' except that the sound combination was chosen to mean a certain flowering plant. (I may be wrong here.)

As for 'A language that attains precision does so at the expense of creativity', think of any game. The more rules there are, the more evolved the game is considered! Chess has rules, and you have to play within the rules, and yet players are creative!

Indian classical music has ragas, and each raga has a set sequence of allowed and prohibited notes, in ascent and descent, that must be observed. One may ask, how can one create or enjoy music with so many restrictions!? Yet some of the most heavenly renditions have been created by maestros while playing perfectly by the rules. Here is an example of raga Kirwani on santoor. So rules don't suppress creativity, they can actually enhance it!

Coming back to Sanskrit, the creativity of the masters of Sanskrit grammar is astounding, and the best examples are rarely understood by mere mortals. They can get so cryptic that they would put obfuscated C code contests to shame!

There is an aspect called varNa-chitra, where the masters of Sanskrit grammar show off, sometimes just for the sake of showing off, and create amazingly creative verses.

Here are some crazier examples (from The Wonder That is Sanskrit):

  1. jajaujojAjijijjAjI, taM tato'titatAtatut |
    bhAbho'bhIbhAbhibhUbhAbhU-rArArirarirIraraH || (uses same consonants within each quarter) - shishupAlavadham 19.3
  2. dAdado duddaduddAdI dAdado dUdadIdadoH |
    duddAdaM dadade dudde dAdAdadadado'dadaH || (same consonant in both lines) - shishupAlavadham 19.114
  3. kShitisthitimitikShiptividhivinnidhisiddhiliT |
    mama tryakSha namaddakSha hara smarahara smara || (same vowel in each line) - sarasvatI kaNThAbharaNam 2.278
  4. yAyAyAyAyAyAyAyAyAyAyAyAyAyAyAyA |
    yAyAyAyAyAyAyAyAyAyAyAyAyAyAyAyA || (all yA's) - pAdukAsahasram #936

Example 1 - shishupAlavadham 19.3

Example 2 - shishupAlavadham 19.114
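The constraints claimed for these verses can be checked mechanically. Below is a rough Python sketch that lists which consonants or vowels occur in an ITRANS line. The sound inventories are my own simplified assumption of ITRANS spellings, the function name `sounds_used` is hypothetical, and the anusvara (M), visarga (H) and avagraha (') are simply skipped.

```python
# Simplified ITRANS inventories (an assumption, not the full scheme).
CONSONANTS = ["chh", "kh", "gh", "ch", "jh", "Th", "Dh", "th", "dh", "ph", "bh",
              "sh", "Sh", "~N", "~n", "k", "g", "j", "T", "D", "N", "t", "d",
              "n", "p", "b", "m", "y", "r", "l", "v", "s", "h"]
VOWELS = ["RRi", "ai", "au", "a", "A", "i", "I", "u", "U", "e", "o"]
# Longest spellings first, so "bh" is read as one sound, not "b" + "h".
SOUNDS = sorted(CONSONANTS + VOWELS, key=len, reverse=True)

def sounds_used(text, wanted):
    """Return the set of sounds from `wanted` that occur in the ITRANS text."""
    found = set()
    i = 0
    while i < len(text):
        for sound in SOUNDS:
            if text.startswith(sound, i):
                if sound in wanted:
                    found.add(sound)
                i += len(sound)
                break
        else:
            i += 1  # spaces, M, H, avagraha and punctuation are skipped
    return found

# Example 1: each quarter uses a single consonant.
for quarter in ["jajaujojAjijijjAjI", "taM tato'titatAtatut",
                "bhAbho'bhIbhAbhibhUbhAbhU", "rArArirarirIraraH"]:
    print(quarter, sounds_used(quarter, CONSONANTS))

# Example 3: each line uses a single vowel.
print(sounds_used("kShitisthitimitikShiptividhivinnidhisiddhiliT", VOWELS))
print(sounds_used("mama tryakSha namaddakSha hara smarahara smara", VOWELS))
```

Each call reports exactly one sound, which is the whole point of the verse: a meaningful, metrical stanza built under a constraint that most of us could not satisfy even in a single sentence.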

These verses are just as crazy as this prize-winning code of a chess-playing C program:

B,i,y,u,b,I[411],*G=I,x=10,z=15,M=1e4;X(w,c,h,e,S,s){int t,o,L,E,d,O=e,N=-M*M,K
=78-h<<x,p,*g,n,*m,A,q,r,C,J,a=y?-x:x;y^=8;G++;d=w||s&&s>=h&&v 0,0)>M;do{_ o=I[
p=O]){q=o&z^y _ q<7){A=q--&2?8:4;C=o-9&z?q["& .$  "]:42;do{r=I[p+=C[l]-64]_!w|p
==w){g=q|p+a-S?0:I+S _!r&(q|A<3||g)||(r+1&z^y)>9&&q|A>2){_ m=!(r-2&7))P G[1]=O,
K;J=n=o&z;E=I[p-a]&z;t=q|E-7?n:(n+=2,6^y);Z n<=t){L=r?l[r&7]*9-189-h-q:0 _ s)L
!(I[p+1]^n)+l[n&7]*9-386+!!g*99+(A<2))+!(E^y^9)_ s>h||1<s&s==h&&L>z|d){p[I]=n,O
-O|i-n|p-b|LM))P y^=8,u=J;J=q-1|A<7||m||!s|d|r|o<z||v 0,0)>M;O[I]=o;p[I]=r;m?
*m=*g,*g=0:g?*g=9^y:0;}_ L>N){*G=O _ s>1){_ h&&c-L<0)P L _!h)i=n,B=O,b=p;}N=L;}
!r&&++C*--A));}}}Z++O>98?O=20:e-O);P N+M*M&&N>-K+1924|d?N:0;}main(){Z++B<121)*G
++=B/x%x<2|B%x<2?7:B/x&4?0:*l++&31;Z B=19){Z B++<99)putchar(B%x?l[B[I]|16]:x)_
x-(B=F)){i=I[B+=(x-F)*x]&z;b=F;b+=(x-F)*x;Z x-(*G=F))i=*G^8^y;}else v u,5);v u,

Here’s the thing though. People who praise Sanskrit for its precision are the same people who suggest that works in the language need interpretation by scholars. They’re the same people who bend their scriptures to make them appear to reference newly discovered scientific facts. They say Sanskrit doesn’t need disambiguation while failing at translating all of the “ancient knowledge” trapped in their literature.

While there may be some instances of force-fitting and extrapolation by people who are not conversant in science or Sanskrit (or both), overall this point doesn't hold.
Sanskrit works that are 14 or even 20 centuries old can be easily understood even today by any college-going Sanskrit student, and some even by school-going students (with five to six years of study of Sanskrit as a first language). The works of Kalidasa, the Panchatantra and the kathA-sarit-sAgara (Ocean of rivers of stories) are all in simple enough Sanskrit, and thanks to Panini's standardization, all these works can be understood perfectly well even today. They are actually standard texts in higher classes and college.

What does need a lot of interpretation are the deeper philosophical and spiritual texts. For that matter, even the original Bible has not been fully understood. It is just one book, what to say of the hundreds and thousands of spiritual works in Sanskrit!

Of course, that doesn’t in and of itself mean anything. It is possible that the paper just gets quoted a lot for having kickstarted all of that research into Sanskrit. The logical next question is, is there any research at all? So, I dedicated about two hours of my info-binging time to look up research related to Sanskrit. Almost all of the publicly accessible real academic research on the language is about its literature, its cultural impact and decoding its complex grammatical rules – yes, that’s still a work in progress apparently. Every research that relates to both, the language and computation, are conducted under dedicated Sanskrit research academies based in India. I’m not saying research done in India is any less worthy than elsewhere. However, there is none to back the claims about Sanskrit gaining a foothold in modern computing.

You obviously understood all about it in two hours of net binging! I just learned Latin and Greek with Duolingo in 15 minutes! English classes are still discussing Shakespeare, the nuances and shades of meaning in his works, and what not! Why? Haven't they figured out mundane works created only 500 years ago?

True, work in the area of Sanskrit computational linguistics has not picked up speed in India, for the same reason: there are not very many who know both areas deeply. Incidentally, just a few weeks back the 6th International Sanskrit Computational Linguistic Symposium concluded at IIT Kharagpur, and here is the link to the papers presented.

6th International Sanskrit Computational Linguistic Symposium was concluded in IIT Kharagpur

I would argue that English is in fact the best language to test the scope of natural language parsing simply because the evolution of English isn’t regulated by an academy like many others. It’s free to change and vary depending on the culture that speaks it. English linguists are almost exclusively descriptivists – they don’t police one’s speech as long as everyone understands what they’ve meant. It constantly borrows words from other languages for their own use. When new non-existent words become mainstream, they embrace rather than despise. It thrives by adaptation. An AI system that adapts itself to the evolving language rather than requiring people to speak with precision – that’s intelligence.

Sanskrit grammar is also descriptive. And 'as long as everyone understands' does not happen by magic. Every time someone deviates, the deviation has to be explained. And every time a new person stumbles upon a quirk, it has to be learned all over again as a new fact. It is like English spelling. It is not just 26 letters and their sounds; one has to learn each combination that is spelled one way but sounds another. It all seems easy if one learns it from childhood. Even there, one can see how a kid intuitively writes the 'wrong' spelling, because English is a lousy language for spelling! It is not a great language; it is a language enforced upon the world by British imperialism.

In my somewhat arrogant but educated opinion, Sanskrit as a spoken language is worse than useless today. It’s extremely difficult to learn as is, and it’s not spoken widely.

Opera is not appreciated by many. Percentage-wise, fewer people of Western (European) descent appreciate opera than there are people who can understand and appreciate basic Sanskrit, or its musical hymns, prayers and shloka verses! Are ballet and opera dead?

The most famous formula of the world E= mc^2 is utter nonsense to a fifth grader. One has to earn the reward of understanding it, by rigorous Physics and Maths. Is it then all useless? There are only a handful of people who can actually use that formula to actually create something nuclear with it!

Language carries its thoughts, culture, knowledge system with it. From that angle Sanskrit is extremely important and useful. But that is drifting from the topic, which was its use in programming. So, let us not venture in other territories!

The purpose of this article is not to make false claims about Sanskrit the language or its literature. But humans have this tendency, to exaggerate. Everyone does it, in different spheres.

Sanskrit as a language - amazing.
As a store house of human inner wisdom - unparalleled.
As a discipline - mind-blowing.
As the most ancient human oral sounds alive in form of Vedic chanting - unbelievable.
As a language whose study itself promotes logical and analytical thinking - true. If you do it with rigor and sincerity, not in a fly-by Learn-Sanskrit-in-30-days class.

Let us not fly high in exaggeration, nor dig ditches in shaming it.
Either way, loss is one's own, not of Sanskrit.
For truth needs no saving.
It is.
It just is.

[ As usual, if I made a mistake, it is my limited understanding. Please send me your opinion, or comment below. But do try to understand the spirit of the post.]

(c) Shashikant Joshi । शशिकांत जोशी । ॐ सर्वे भवन्तु सुखिनः ।
Practical Sanskrit. All rights reserved. Check us on Facebook.