Language For Bots – Cognitive Perspective July 2006 Jerry T. Ball Senior Research Psychologist Air Force Research Laboratory Mesa, AZ.

1 Language For Bots – Cognitive Perspective July 2006 Jerry T. Ball Senior Research Psychologist Air Force Research Laboratory Mesa, AZ

2 2 Language for Bots The ability of synthetic entities to communicate with humans is highly desirable The ability of synthetic entities to communicate with each other is also highly desirable A key question is whether the same language can or should be used for bot-to-human and bot-to-bot communication

3 3 Natural Languages Humans communicate with each other in natural language Natural languages have evolved into functional systems of communication in human communities all around the world Natural languages are complex – Ambiguous – Redundant – Often fuzzy, vague or inexact in meaning – Not well formalized – Full of exceptions to rules – Multiple levels of representation and multiple dimensions of meaning

4 4 Natural Languages Are these features of natural language non-functional? Is it desirable to eliminate them in a language for bots?

5 5 Logical Languages Philosophers have devised logical languages to overcome the undesirable features of natural language Logical languages are relatively simple – Unambiguous – Non-redundant – Precise in meaning – Formal – Exceptionless – Single level of representation – Limited number of dimensions of meaning

6 6 Logical Languages Do these features of logical languages represent an improvement over natural language? Are the philosophers right? Is natural language so riddled with flaws as to be inappropriate for logical reasoning and computation?

7 7 Natural Languages Natural languages have evolved to satisfy the desire of humans to communicate With few exceptions, in evolution, non-functional features fail to survive – The human appendix may be an exception There may be better global solutions (e.g. humans vs. other species), but evolution leads to locally satisfactory solutions Natural language expressions are grounded in our experience of the world and reflect that experience Natural languages are constrained by human performance limitations

8 8 Natural Languages Redundancy is functional – Facilitates understanding given human limitations and noise Ambiguity is inevitable – Not possible to represent each concept individually – Same word must be used to represent multiple concepts Take a hike; Take five; Take out; Take over; Take place Black bird; Blackbird; Black Ice; Black person; Black listed – Implies that words will come to have multiple meanings Fuzziness is necessary – The world isn’t divided up into simple, discrete entities – Most concepts do not have necessary and sufficient features

9 9 Logical Languages Logical languages were designed to support rational thought and reasoning Complex sentences in a logical language may be uninterpretable by most humans – Multiple center embedding; More generally, the use of recursion – Too many terms in one sentence exceeding short-term working memory limits – No discourse markers to indicate relations between sentences No real grounding of logical expressions – Just pushing symbols around Are logical languages satisfactory for communication?

10 10 Two Approaches to Designing a Viable Bot Language Start with a logical language and extend it to handle the kinds of meaning distinctions that are needed Start with a natural language and simplify it to the maximum extent practicable Both approaches may ultimately lead to a similar design If the meaning distinctions made in natural language are relevant, then the resulting design is likely to be closer to natural language than to a logical language If the complexity can’t be avoided, then it should be embraced! It is probably easier to look at natural language and decide what can be eliminated, than to look at a logical language and decide what needs to be added

11 11 Examples of Each Approach Starting with Predicate Logic – Zadeh’s Precisiated Natural Language Adds fuzzy logic to improve on predicate logic’s limited quantification capabilities – Kamp’s Discourse Representation Theory Adds features for representing Discourse level information Starting with Natural Language – Nirenburg, McShane et al.’s OntoSem Uses an ontology of concepts and text meaning representations to represent the meaning of linguistic expressions – Marked Up Natural Language Uses language itself as the basis for representing meaning

12 12 Starting with Predicate Logic From the perspective of predicate-argument structure, the following sentences appear to have the same basic meaning despite their different forms: – The man gave Jim a book – Jim was given a book by the man – The book was given to Jim by the man They are typically represented by a formula like ∃(x), ∃(y), ∃(z): give(x,y,z) Or if events are treated as objects so that tense can be represented ∃(x), ∃(y), ∃(z), ∃(e): give(e,x,y,z) & past(e)
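The collapse of the three surface forms into one predicate-argument structure can be sketched in Python. This is only an illustration: the `GiveEvent` class and the hand-written `analyses` table are assumptions introduced here, and no parser is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GiveEvent:
    """Predicate-argument structure for give(e, x, y, z) & past(e)."""
    giver: str      # x
    recipient: str  # y
    theme: str      # z
    tense: str      # stands in for past(e)


# Hand-written analyses (an assumption for illustration, not parser output).
analyses = {
    "The man gave Jim a book": GiveEvent("man", "Jim", "book", "past"),
    "Jim was given a book by the man": GiveEvent("man", "Jim", "book", "past"),
    "The book was given to Jim by the man": GiveEvent("man", "Jim", "book", "past"),
}

# All three sentences map to a single structure, so the set has one member.
distinct_structures = set(analyses.values())
print(len(distinct_structures))
```

Nothing in `GiveEvent` records which participant was mentioned first, so the information carried by word order is lost once the sentences are reduced to this form.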

13 13 Starting with Predicate Logic However, this grammatical variation is actually functional – The man… Jim… The book… – The first object described is the Topic (who or what the sentence is about) and contrasts with the Focus (what is predicated of the topic) – Coherent discourse takes account of this contrast The need to encode discourse level dimensions of meaning often leads to grammatical variation Predicate Logic provides no obvious mechanism for encoding discourse dimensions of meaning – Predicate Logic was designed to encode predicate-argument structure

14 14 Starting with Predicate Logic The following two sentences also appear to be very similar in meaning from the perspective of predicate-argument structure – Bill cooked the shrimp – It was Bill who cooked the shrimp Again, the difference in form is functional, encoding a distinction between Given and New Information – Bill is given in the first sentence – That what Bill did is cook the shrimp is new – Someone cooked the shrimp is given in the second sentence – That the someone was Bill is new

15 15 Predicate-Argument Structure A disjointed collection of conjoined/disjoined predicate-argument structures will fail to convey important dimensions of meaning The representation of the meaning of “the red ball is on a table” as something like… ∃(x), ∃(y): ball(x) & red(x) & table(y) & on(x,y)

16 16 Predicate-Argument Structure Fails to capture important dimensions of meaning – “the red ball” is what is being talked about – the subject – “on a table” refers to a location – “the red ball” refers to a definite ball, presuming the reader will access an existing referent in their situation model – “a table” refers to a non-definite, but perhaps specific, table, presuming the reader will introduce a new referent into their situation model – In “the red ball”, “red” modifies the head “ball” indicating a subcategory of ball and “the” provides the definite specification

17 17 Predicate-Argument Structure Predicate-argument structure reflects two important dimensions of meaning – Relational meaning – on(x, y), book(x), table(y) – And to some extent Referential meaning – ∃(x), ∃(y) Predicate-argument structure does not come close to reflecting all the dimensions of meaning encoded in natural language – Cannot represent the full meaning of sentences – Cannot represent discourse level dimensions of meaning Since these additional dimensions of meaning are largely functional, predicate logic needs to be extended to handle them if it is to be used as the basis for mapping to natural language – Natural Language provides the best indication of what these additional dimensions are and how they interact

18 18 Precisiated Natural Language Precisiated Natural Language is an attempt to extend Predicate Logic to better handle the full range of quantification as expressed in natural language According to Zadeh, natural language is a system for describing perceptions The imprecision of natural language is a direct consequence of the imprecision of perceptions Precisiated Natural Language (PNL) provides a basis for computation with perceptions The conceptual structure of PNL mirrors two fundamental facets of human cognition – Partiality – most human concepts are not bivalent, they are a matter of degree – Granularity – the clumping of values of attributes, forming granules with words as labels

19 19 Precisiated Natural Language PNL abandons bivalence, a key tenet of Predicate Logic – According to Zadeh, “The precisiation of a natural language cannot be achieved with the conceptual structure of bivalent logic” Inference is computational rather than logical

20 20 Precisiated Natural Language Zadeh defines a Generalized Constraint Language based on the template X isr R where X is a constrained variable, R is a constraining relation, and the r in isr is one of the following modal relations: – Possibilistic – multiple discrete possibilities with likelihoods – Probabilistic – Veristic – Usuality – Random sets – Fuzzy graph – Bimodal – Pawlak set

21 21 Precisiated Natural Language Using the modal relations and some additional features, PNL can represent the meaning of sentences like – Tandy is much older than Dana – Most Swedes are tall Tandy is much older than Dana → (Age(Tandy), Age(Dana)) is much.older Where much.older is a binary fuzzy relation composed from much and older Most Swedes are tall → ΣCount(tall.Swedes/Swedes) is most Where most is a fuzzy number and ΣCount is a computation over all Swedes The modal relation in these examples is Possibilistic by default
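The "Most Swedes are tall" computation can be sketched in Python as a relative sigma-count followed by a fuzzy quantifier. Everything numeric here is an assumption made for illustration: the piecewise-linear shapes of `tall` and `most`, their thresholds, and the sample heights are not taken from Zadeh's work.

```python
def ramp(x, lo, hi):
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def tall(height_cm):
    # Assumed membership function for "tall" (thresholds are illustrative).
    return ramp(height_cm, 165.0, 185.0)


def most(proportion):
    # Assumed fuzzy quantifier "most" over proportions in [0, 1].
    return ramp(proportion, 0.5, 0.8)


def relative_sigma_count(heights):
    """Sum of degrees of tallness divided by the number of individuals.

    Every individual is assumed to be fully a member of the base set
    (Swedes), so the denominator is simply the population size.
    """
    return sum(tall(h) for h in heights) / len(heights)


# Hypothetical sample of heights in centimetres.
heights = [160.0, 175.0, 180.0, 185.0, 190.0]
ratio = relative_sigma_count(heights)
truth = most(ratio)
print(round(ratio, 2), round(truth, 2))
```

With these assumed functions the degrees of tallness sum to 3.25 over five people, giving a ratio of about 0.65 and a degree of truth of about 0.5. The result is a matter of degree rather than true or false, which is the sense in which inference is computational rather than logical.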

22 22 Advantages of Precisiated Natural Language Formalized enough to be able to support a computational implementation Good at Quantification – Able to represent the full range of quantification in natural language Good at Inexact Reasoning – Doesn’t require bounded, discrete categories – Not restricted to bivalent logic

23 23 Disadvantages of Precisiated Natural Language The syntax is not clearly defined below the template – X isr R (default r if not specified is possibilistic) As a result the mapping to and from Natural Language is at best unclear – Most Swedes are tall → ΣCount(tall.Swedes/Swedes) is most – Tandy is much older than Dana → (Age(Tandy), Age(Dana)) is much.older Not clear if this mapping is to Concepts or Words – Words are highly ambiguous – Mapping to Concepts presumes disambiguation – How does this disambiguation occur? The formalism doesn’t address the encoding of discourse level dimensions of meaning – Representation of each sentence is independent of other sentences unless there are explicit logical connectives

24 24 Discourse Representation Theory Kamp’s Discourse Representation Theory is an attempt to provide a formalism for representing the relationships between a collection of propositions as is required for modeling discourse Primary mechanism is argument overlap – Two propositions that share the same argument are related via the shared argument Formalism is based on Predicate Logic Allows for the representation of events and states as objects within First Order Logic similar to Davidson Supports the representation of time Discourse Representation Structures can be nested – Able to handle anaphoric references and scope of quantification – Donkey sentences – If a farmer has a donkey, he beats it
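For reference, the reading usually assigned to the donkey sentence can be written in first-order logic as follows. The predicate names are chosen here to match the wording of the example and are not taken from the slides.

```latex
\forall x\,\forall y\,\bigl((\mathit{farmer}(x) \land \mathit{donkey}(y) \land \mathit{has}(x,y)) \rightarrow \mathit{beats}(x,y)\bigr)
```

The difficulty is that the indefinites "a farmer" and "a donkey" look existential, yet translating them as existentials inside the antecedent leaves the pronouns "he" and "it" in the consequent outside their scope. Nested Discourse Representation Structures are intended to make the referents introduced in the antecedent accessible to the consequent.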

25 25 Discourse Representation Theory Example: John sleeps – Discourse Representation Structure (DRS): referents x, e, t; conditions John(x), sleep(e,t,x), now(t) – Predicate Logic With Event and Time Objects: ∃x ∃e ∃t : John(x) & sleep(e,t,x) & now(t)
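The correspondence between the two notations for "John sleeps" can be sketched in Python. The `DRS` class and its `to_predicate_logic` method are hypothetical names introduced for this sketch; it handles only a flat DRS, and nested structures such as those needed for donkey sentences are out of scope.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DRS:
    """A flat Discourse Representation Structure (no nesting)."""
    referents: List[str]   # discourse referents, e.g. x, e, t
    conditions: List[str]  # conditions over those referents

    def to_predicate_logic(self) -> str:
        # Each referent becomes an existentially quantified variable,
        # and the conditions are conjoined.
        quantifiers = " ".join(f"∃{r}" for r in self.referents)
        return f"{quantifiers} : " + " & ".join(self.conditions)


john_sleeps = DRS(
    referents=["x", "e", "t"],
    conditions=["John(x)", "sleep(e,t,x)", "now(t)"],
)
print(john_sleeps.to_predicate_logic())
```

Because the referents are held in one shared list, a later sentence could add conditions that reuse `x`, which is one simple way to picture the argument overlap that relates propositions in a discourse.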

26 26 Advantages of Discourse Representation Theory Explicitly addresses the representation of some aspects of discourse level information Doesn’t exceed the power of First Order Logic – Efficient, computational proof mechanisms are available to support inferencing within First Order Logic According to Kamp, DRT as a knowledge representation formalism may be preferable to Predicate Logic in its standard form, even though DRT representations are translatable into Predicate Logic – “DRT may hold an important advantage over a notational variant (i.e. Predicate Logic) which, instead of revealing logically relevant structure, does more to obscure it” – Representation Matters! Includes mechanisms for processing sentences into DRT representations

27 27 Disadvantages of Discourse Representation Theory Argument overlap, the primary mechanism for representing discourse information, is only one type of coherence relation between sentences. Others include: – Same topic (implicit or explicit) – Different words with similar meanings – Temporal and spatial proximity – Causal relations (implicit or explicit) DRT is a notational variant of Predicate Logic – Zadeh’s criticism of Predicate Logic applies The mapping from DRT representations to and from Natural Language is like that of Predicate Logic – Predicates are assumed to be unambiguous concepts – Like Predicate Logic, the syntax of DRT is quite limited relative to that of Natural Language

28 28 Starting with Natural Language Ball (2006) – Can NLP Systems be a Cognitive Black Box? Nirenburg and McShane’s OntoSem – Maps natural language into Text Meaning Representations based on a semantic ontology which is intended to be language neutral – Doesn’t attempt to capture all the nuances in meaning that occur in natural language, just the relevant ones for a particular domain Marked up Natural Language

29 29 Some Basic Language Constraints Natural Language sentences are limited to 3 or 4 arguments – 1 John 1 bet 2 Jim 2 3 $50 3 4 the American league would win 4 – Probably a short-term working memory limitation Basic level categories are preferred – The couch is comfortable vs. The furniture is comfortable – Basic level categories tend to be the most general categories that are visually salient for humans Don’t violate expectations – Garden path sentences: The horse raced past the barn fell While Mary bathed the baby spat up on the bed

30 30 Sentences (Normal) Humans Can’t Process (Normally) The horse raced past the barn fell – Many normal humans can’t make sense of this sentence – Humans don’t appear to use exhaustive search and algorithmic backtracking to understand Garden Path sentences The mouse the cat the dog chased bit ate the cheese – Humans are unable to understand multiply center embedded sentences despite the fact that a simple stack mechanism makes them easy for parsers – Humans are very bad at processing truly recursive structures While Mary dressed the baby spit up on the bed – 40% of humans conclude in a post test that Mary dressed the baby – Humans often can’t ignore locally coherent meanings, despite their global incoherence (despite the claims of Chomsky and collaborators)

31 31 Sentences (Normal) Humans Can Process (Normally) The horse that was ridden past the barn, fell – Given enough grammatical cues, humans have little difficulty making sense of linguistic expressions The dog chased the cat, that bit the mouse, that ate the cheese – Humans can process “right embedded” expressions – These sentences appear to be processed iteratively rather than recursively – AI/Computer Science provides a model for converting recursive processes into iterative processes which require less memory – Inability of humans to process recursive structures is likely due to short-term working memory limitations Does not appear to be a perceptual limitation!
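The memory difference between center embedding and right embedding can be illustrated with a toy Python model. The tagging scheme is an assumption made for this sketch: "S" marks a subject that is still waiting for its verb and "V" marks a verb that resolves the most recent waiting subject. It is not a claim about how human sentence processing actually works.

```python
def max_pending_subjects(tags):
    """Return the largest number of subjects held open at any one time."""
    depth = 0
    max_depth = 0
    for tag in tags:
        if tag == "S":
            depth += 1
            max_depth = max(max_depth, depth)
        elif tag == "V":
            if depth == 0:
                raise ValueError("verb with no pending subject")
            depth -= 1
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return max_depth


# "The mouse the cat the dog chased bit ate the cheese"
# mouse, cat, dog are all opened before any verb arrives.
center_embedded = ["S", "S", "S", "V", "V", "V"]

# "The dog chased the cat, that bit the mouse, that ate the cheese"
# each subject (dog, that, that) is resolved before the next one opens.
right_embedded = ["S", "V", "S", "V", "S", "V"]

print(max_pending_subjects(center_embedded))  # 3
print(max_pending_subjects(right_embedded))   # 1
```

In this toy model the center-embedded sentence requires three items to be held at once, while the right-embedded sentence never requires more than one, however long it grows. That is consistent with the suggestion that short-term working memory is the limiting factor.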

32 32 Some Basic Language Constraints These constraints matter for humans! Any mapping from a formal language to natural language must avoid violating them – The system has to be capable of comprehending and generating appropriate Natural Language in order to communicate with humans! – Systems which attempt to appear natural without actually being so, are only likely to succeed in very narrow and well defined domains (Eliza, SHRDLU)

33 33 OntoSem OntoSem is a system for representing meaning that was specifically designed to support Natural Language Processing It contains an ontology of concepts which includes thousands of concepts organized into a multiple inheritance hierarchy – Concepts are types – Concepts are encoded in a natural language independent metalanguage – Concepts are unambiguous Below the upper ontology of general concepts, the depth of the hierarchy depends on the domain of application and the practical need for meaningful distinctions

34 34 OntoSem Natural Language is mapped into Text Meaning Representations (TMRs) TMRs contain instances of concepts organized into predicate-argument (or dependency) structures with many additional features to support encoding of additional dimensions of meaning – Semantic roles – Procedural hooks for computing information – Lexical information in addition to concepts


36 36 Advantages of OntoSem OntoSem’s Conceptual Ontology is intended to be language independent OntoSem is specifically developed for large-scale Natural Language Processing applications OntoSem includes a language processing system OntoSem directly addresses the issue of lexical disambiguation – The mapping from Natural Language to and from Text Meaning Representations is explicit – In Predicate Logic words and concepts are often used interchangeably as though words were unambiguous OntoSem has mechanisms for dealing with difficult problems like ellipsis

37 37 Advantages of OntoSem OntoSem comes with a large conceptual ontology of over 8000 concepts OntoSem has been extended to support multi-word units OntoSem has been extended to support domain and workflow scripts OntoSem provides computational tools to support adding to the conceptual ontology and creating lexicons

38 38 Disadvantages of OntoSem The mapping from Natural Language to and from Text Meaning Representations is still non-trivial – “ask” maps to REQUEST-ACTION-69 – “authorized” maps to ACCEPT-70 Creating TMRs requires disambiguation of all words in the input even if the application doesn’t require it Encoding the knowledge needed to support disambiguation is labor intensive and intellectually challenging TMRs are much larger and more complex than corresponding logical representations – Note that this is only a disadvantage if the construction of complex TMRs is unnecessary for some particular application – If the complexity cannot be avoided, it should be embraced!

39 39 Marked Up Natural Language Marked up natural language is based on the idea of building representations that are system independent, offering the possibility of standardization Mark up languages are currently being defined for the “semantic web” Most of these are based on XML and predicate logic with extensions OWL Web Ontology Language OWL Lite – Classification hierarchy and simple constraints OWL DL – Description Logic based but Computationally Complete (all conclusions are guaranteed to be computable) OWL Full – No guarantee of computational completeness DAML+OIL – DARPA Agent Markup Language + Ontology Inference Layer

40 40 Advantages of Marked Up Natural Language Mark up can facilitate machine interpretation relative to raw text No loss of information as when using abstract concepts or other forms of non-recoverable abstraction Natural language text can be generated by simply removing the markup Different tag sets can be designed for specific applications If markup is XML, then cross application portability is facilitated
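The claim that plain text can be regenerated by removing the markup can be sketched with Python's standard `xml.etree.ElementTree` module. The tag set (`sentence`, `np`, `vp`, `pp`) and the `role` attribute are hypothetical and chosen only for this example; they do not correspond to any particular markup standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for "The red ball is on a table".
marked_up = (
    '<sentence>'
    '<np role="subject">The red ball</np> '
    '<vp>is <pp role="location">on a table</pp></vp>'
    '</sentence>'
)

root = ET.fromstring(marked_up)

# itertext() walks the tree in document order and yields only the text,
# so joining the pieces recovers the original sentence.
plain_text = "".join(root.itertext())
print(plain_text)

# The markup remains available to a machine reader.
subject = root.find("np[@role='subject']").text
print(subject)
```

The same document serves both audiences: a program can query the tagged roles, while a human reader sees the unmodified sentence once the tags are stripped.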

41 41 Disadvantages of Marked Up Natural Language Marked up natural language is non-optimal for machine to machine communication Natural language may be too complex to be handled efficiently with mark up and without abstraction

42 42 Closing Considerations The closer the bot language is to natural language, the easier will be the mapping The better formalized the bot language is, the easier it will be for computers to process Successful bot languages are likely to fall somewhere within the union of natural languages and logical languages It may be that our bots will need to be bilingual – speaking a more formalized bot language which is not concerned with human limitations to each other and natural language to humans

43 43 Closing Considerations Paying attention to cognitive science principles may actually facilitate, not hinder, the development of a language for bots A language for bots which fails to consider human language representation and processing capabilities in sufficient detail is unlikely to be successful

44 44 Questions?
