Bi-Weekly BTP Meeting Bhargava Reddy 110050078.


1 Bi-Weekly BTP Meeting Bhargava Reddy

2 Contents
Origin and importance of language; Linguistic tree; Formal definition of a language; Is English a finite or infinite language?; How do we learn a language?

3 Origin of Language? We know that all human beings share the same ancestors and have evolved into the most intelligent beings. But we need to remember that languages have evolved too. Can we say, along the same lines, that all languages share a common ancestor? We know that Telugu, Tamil, Kannada and Malayalam have the same origin, the Dravidian family of languages. Do all languages descend from a common one? Atkinson QD, Meade A, Venditti C, Greenhill SJ, Pagel M (2008) Languages evolve in punctuational bursts. Science 319: 588.

4 Importance of Languages
The emergence of the human language faculty represents one of the major transitions in the evolution of life. For the first time, the exchange of highly complex information between individuals became possible. Parallels between genetic and language evolution were noticed by Charles Darwin, but the issue is debatable. It is generally accepted that language has evolved and diversified obeying mechanisms similar to those of biological evolution. There are nearly 7000 extant languages, divided into 19 linguistic families. Atkinson QD, Meade A, Venditti C, Greenhill SJ, Pagel M (2008) Languages evolve in punctuational bursts. Science 319: 588.

5 Linguistic Tree The numbers in red show the estimated ages of the languages. From the diagram we can interpret that the language family originated about 8,700 years ago. This says that our Indian languages are not older than years. Our Indian languages are still the oldest languages in the tree. The languages are broadly classified as: Celtic, Italic, French/Iberian, West Germanic, North Germanic, Baltic, Slavic, Indic, Iranian, Albanian, Greek, Armenian, Tocharian, Anatolian.

6 Controversies in the origin of Sanskrit
The main key to the Indo-Iranian languages is the Sanskrit language. People believe that it originated around 1500 BCE (radiocarbon dating of the scripts is cited as evidence), which makes it nearly 3500 years old. But according to the Ramayana, Sita resided in the ashram of Valmiki, who wrote the Ramayana in Sanskrit. Some estimates place the events of the Ramayana nearly 1.6 lakh years ago, which contradicts this assumption.

7 Formal Definition of Language
Alphabet: An alphabet is a finite set of symbols. Without loss of generality, we can consider the binary alphabet, {0,1}, by enumerating the actual alphabet in binary code. Sentence: A sentence is defined as a string of symbols. The set of all sentences over the binary alphabet is {0,1,00,01,10,11,000,...}. There are infinitely many sentences, as many as there are integers; the set of all sentences is ‘countable’. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova
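The countability claim can be made concrete with a short Python sketch. It lists the sentences over {0,1} in the order shown on the slide (shorter strings first, then in dictionary order) and gives each sentence its position in that list. The function names `binary_sentences` and `sentence_index` are illustrative choices, not taken from the cited paper.

```python
from itertools import count, islice, product


def binary_sentences():
    """Yield every non-empty string over {0,1}: by length, then in dictionary order."""
    for length in count(1):
        for symbols in product("01", repeat=length):
            yield "".join(symbols)


def sentence_index(sentence):
    """Return the 1-based position of a sentence in the enumeration above.

    Prefixing a '1' and reading the result as a binary number pairs each
    sentence with exactly one positive integer, which is what 'countable' means.
    """
    return int("1" + sentence, 2) - 1


first_seven = list(islice(binary_sentences(), 7))
print(first_seven)
print([sentence_index(s) for s in first_seven])
```

Every sentence appears at some finite position, and every positive integer is the position of exactly one sentence, so the set is infinite but countable.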

8 Formal Definition of a Language
Language: A language is a set of sentences. Among all possible sentences some are part of the language and some are not. A finite language contains a finite number of sentences. An infinite language contains an infinite number of sentences. Grammar: A grammar is a finite list of rules specifying a language. A grammar is expressed in terms of ‘rewrite rules’: a certain string can be rewritten as another string. Strings contain elements of the alphabet together with ‘non-terminals’, which are placeholders. After iterated application of the rewrite rules, the final string contains only symbols of the alphabet. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova
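As a minimal sketch of rewrite rules, consider a hypothetical grammar (chosen here for illustration, not taken from the slides) with a single non-terminal S and the two rules S → 0S1 and S → 01. The Python code below applies the rules repeatedly and collects the strings that contain only alphabet symbols. It assumes every rule makes the string longer, so discarding strings above `max_len` loses nothing.

```python
from collections import deque

# Hypothetical rewrite rules: the non-terminal S may become "0S1" or "01".
RULES = {"S": ["0S1", "01"]}


def derive(start="S", max_len=6):
    """Return all sentences of length <= max_len derivable from `start`."""
    finished = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Locate the leftmost non-terminal, if any remains.
        position = next((i for i, ch in enumerate(current) if ch in RULES), None)
        if position is None:
            finished.add(current)  # only alphabet symbols left: a sentence
            continue
        for replacement in RULES[current[position]]:
            rewritten = current[:position] + replacement + current[position + 1:]
            if len(rewritten) <= max_len:  # safe because rules never shrink strings
                queue.append(rewritten)
    return sorted(finished, key=len)


print(derive())
```

Two rules are enough to specify an infinite language here (a block of 0s followed by an equally long block of 1s), which is the sense in which a grammar is a finite description of a possibly infinite set.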

9 Is the English Language finite or infinite?
The number of words in English is nearly 500,000 as per the Oxford dictionary. We also know that each letter is one of 26 characters. Practically speaking, a spoken English sentence is in the worst case about 100 words long (though even that number is absurd). But if we consider written English, the length of a sentence is unbounded (e.g., given a sentence, a longer one can be made by joining it with another). So if we consider written English, the cardinality of the set of all English sentences is infinite, which makes English an infinite language. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova
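The contrast between the bounded and unbounded cases can be put in numbers. The sketch below takes the slide's figures (about 500,000 words, at most 100 words per spoken sentence) and counts every possible word sequence up to that length. This is only an upper bound, since most sequences are not grammatical; the variable names are illustrative.

```python
VOCABULARY_SIZE = 500_000   # figure quoted on the slide
MAX_SPOKEN_LENGTH = 100     # worst-case spoken sentence length from the slide

# Count all word sequences of length 1..100; grammatical sentences are a subset.
upper_bound = sum(VOCABULARY_SIZE ** k for k in range(1, MAX_SPOKEN_LENGTH + 1))
print(f"finite upper bound with {len(str(upper_bound))} digits")


def join_sentences(first, second):
    """Build a longer sentence from two sentences, as in the slide's example."""
    return f"{first} and {second}"


sentence = "it rains"
for _ in range(3):
    sentence = join_sentences(sentence, "it rains")
print(sentence)
```

With a length cap the count is astronomically large but finite. Without the cap, the joining step can be repeated indefinitely, so no finite bound exists.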

10 Learning Theory of a Language
The learner is presented with data and has to infer the rules that generate these data. The difference between ‘learning’ and ‘memorization’ is the ability to generalize beyond one’s own experience to novel circumstances. So if we consider language: the child will generalize to novel sentences never heard before. Learning theory describes the mathematics of learning with the aim of outlining conditions for successful generalization. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova

11 Learning theory of a Language
Children learn their native language by hearing grammatical sentences from their parents or others. From this ‘environmental input’, children construct an internal representation of the underlying grammar. Children are not told the grammatical rules. Neither children nor adults are ever aware of the grammatical rules that specify their own language. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova

12 Learnability Imagine a speaker-hearer pair. The speaker uses grammar, G, to construct sentences of language L. The hearer receives sentences and should after some time be able to use grammar G to construct other sentences of L. Mathematically speaking, the hearer is described by an algorithm, A, which takes a list of sentences as input and generates a language as output. Furthermore, a set of languages is learnable by an algorithm if each language of this set is learnable. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova

13 Learnability We are interested in what set of languages, L = {L1,L2,…}, can be learned by a given algorithm. Gold’s theorem implies there exists no algorithm that can learn the set of regular languages. As a consequence, no algorithm can learn the larger sets of context-free languages, context-sensitive languages or computable languages. Gold’s theorem formally states there exists no algorithm that can learn a set of ‘super-finite’ languages. Such a set includes all finite languages and at least one infinite language. Intuitively, if the learner infers that the target language is an infinite language, whereas the actual target language is a finite language that is contained in the infinite language, then the learner will not encounter any contradicting evidence, and will never converge onto the correct language.
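The intuition can be illustrated with a toy super-finite set. This is a sketch under assumptions made here, not a proof: the finite languages are L_n = {0, 00, ..., 0^n} and the single infinite language, labelled `L_inf`, contains every string of 0s. The two learners and all function names are hypothetical.

```python
from itertools import count, cycle, islice


def text_for_finite(n):
    """Endless presentation of L_n = {'0', '00', ..., '0' * n}, repeating forever."""
    return cycle(["0" * k for k in range(1, n + 1)])


def text_for_infinite():
    """Presentation of the infinite language: every string of 0s, shortest first."""
    return ("0" * k for k in count(1))


def eager_learner(heard):
    """Always guesses the infinite language; no sentence can ever contradict it."""
    return "L_inf"


def conservative_learner(heard):
    """Guesses the smallest L_n consistent with the data; returns n."""
    return max(len(s) for s in heard)


def guesses(learner, text, steps):
    """Feed the first `steps` sentences to the learner and record each guess."""
    heard, out = [], []
    for sentence in islice(text, steps):
        heard.append(sentence)
        out.append(learner(heard))
    return out


print(guesses(eager_learner, text_for_finite(3), 6))         # stuck on L_inf
print(guesses(conservative_learner, text_for_finite(3), 6))  # settles on n = 3
print(guesses(conservative_learner, text_for_infinite(), 6)) # keeps changing
```

The eager learner is wrong on every finite target and never sees evidence against its guess. The conservative learner succeeds on every finite target but changes its guess forever on the infinite one. Each strategy fails on some member of the set, which is the pattern Gold's theorem shows to be unavoidable.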

14 Learning a Language We might think of this only in the context of infinite languages. Let us look at finite languages. In the context of statistical learning theory, the set of all finite languages cannot be learned. In the Gold framework, the set of all finite languages can be learned, but only by memorization: the learner will identify the correct language only after having heard all sentences of this language. Computational and evolutionary aspects of language (2011), Martin A Nowak, Natalia L Komarova
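Learning by memorization in the Gold framework can be sketched as follows. The learner's guess is simply the set of sentences heard so far, so it matches the target only once every sentence of the target has appeared. The target language and the presentations below are made-up examples, and the function names are illustrative.

```python
def memorizing_learner(heard):
    """Guess that the language is exactly the set of sentences heard so far."""
    return frozenset(heard)


def steps_until_identified(target, presentation):
    """Return the first step at which the guess equals the target, else None."""
    heard = []
    for step, sentence in enumerate(presentation, start=1):
        heard.append(sentence)
        if memorizing_learner(heard) == frozenset(target):
            return step
    return None


target_language = {"0", "01", "110"}  # hypothetical finite language

# The last unseen sentence, "110", arrives at step 5.
print(steps_until_identified(target_language, ["0", "0", "01", "0", "110", "01"]))

# If one sentence is never presented, the learner never identifies the language.
print(steps_until_identified(target_language, ["0", "01", "0", "01"]))
```

The learner never generalizes: it cannot name the correct language one sentence early, which is why this counts as memorization rather than learning in the sense of the earlier slides.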

15 References
Atkinson QD, Meade A, Venditti C, Greenhill SJ, Pagel M (2008) Languages evolve in punctuational bursts. Science 319: 588.
Montemurro MA, Zanette DH (2011) Universal Entropy of Word Ordering Across Linguistic Families. PLoS ONE, e doi: / journal.pone
Nowak MA, Komarova NL (2011) Computational and evolutionary aspects of language.

