Steps towards Integrated Intelligence
Naoyuki OKADA (Professor Emeritus), Kyushu Institute of Technology
Aug. 22nd, 2003
Progress
Step 1 Conceptual taxonomy of vocabulary
Step 2 Natural language understanding of moving picture patterns
Step 3 Emotion processing vs. knowledge processing
Step 4 Integrated intelligence
1. Introduction
The history of research in Artificial Intelligence (AI), like that of other research fields, is a repetition of diversification and specialization of its subfields.
At the beginning of the 1960s:
- Natural language processing
- Pattern recognition
- Learning
- Problem solving
At the beginning of the 2000s:
- Fundamentals/Theory: knowledge representation, reasoning, algorithm, fuzzy theory, ...
- Learning/Discovery: inductive/deductive learning, example-based reasoning, data mining, ...
- Infrastructure of knowledge: knowledge acquisition, knowledge base, Web search, ...
- AI architecture/language
- Agent/Distributed AI: problem solving by collaboration, agent society, ...
- Life/Brain system: artificial life, genetic algorithm, connectionism, ...
- Natural language: natural language understanding, dialog processing, corpus, speech recognition, ...
- Pattern understanding: image recognition, scene analysis, image sequence processing, ...
- Cognition/Body: intelligent robot, symbol grounding, cognitive psychology, ...
- ...
However, too much diversification and specialization weakens the study of the relations among subfields.
Those relations are important above all in human intelligence.
So we should sometimes stop, look back, put the various results in order, and integrate them into a system.
Approach towards integration
Multi-modal
Human intelligence accepts multi-modal inputs.
- Natural language in letters/voices
- Picture patterns
Intellect and sensitivity
Knowledge and emotions are like the two wheels of a cart.
2. Conceptual taxonomy of vocabulary
Language is "the window" of the mind.
The semantic content of language, that is, the system of concepts, is the most important object of study in clarifying intelligence.
Research in early years
- C. J. Fillmore '68: Case grammar
- M. R. Quillian '68: Semantic network
- R. Schank '72: Conceptual dependency
- Y. Wilks '75: Preference semantics
Conceptual analysis
Categories of concepts
Concepts are formed for everything in nature.
- There are five categories from the linguistic viewpoint: substance, attribute, event, space/time, and miscellaneous.
- But each category is vague.
Computational definition
- What is substance?
An individual whose quantity and quality can be recognized by sensors.
Fig. 2.1 Substance sensed by eyes: Mountain
- What is state?
Fundamentally, a static relation among several substances.
Fig. 2.2 State: Man in the car
- What is attribute?
A special case of state. Fundamentally, the difference between an object and a standard.
Fig. 2.3 Object-standard pair: The mountain is higher than the tree. (Figure labels: Object, Standard, Difference)
- A measure is necessary for the detection of the difference.
Measure: height (length in the perpendicular direction)
This measure brings an attribute to the object.
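The object-standard definition of an attribute can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system described in the talk: the `Substance` class, the `height` function, the `tolerance` parameter, and the numeric heights are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Substance:
    """A sensed individual; height_m stands in for a hypothetical sensor reading."""
    name: str
    height_m: float


def height(s: Substance) -> float:
    """Measure: length in the perpendicular direction."""
    return s.height_m


def attribute(obj: Substance, std: Substance,
              measure: Callable[[Substance], float],
              tolerance: float = 0.0) -> str:
    """Name the attribute of obj as its difference from the standard std."""
    diff = measure(obj) - measure(std)
    if diff > tolerance:
        return "higher"
    if diff < -tolerance:
        return "lower"
    return "equal"


mountain = Substance("mountain", 800.0)  # illustrative values only
tree = Substance("tree", 12.0)
print(f"The {mountain.name} is {attribute(mountain, tree, height)} than the {tree.name}.")
```

Swapping in another measure function (for example a width or weight reading) would yield a different attribute for the same object-standard pair, which is the point of the slide: the measure brings the attribute to the object.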
- What is event?
Fundamentally, a change from a before-state to an after-state.
Fig. 2.4 Before-after state pair: A man gets out of a car. (Figure labels: Before-state, Change, After-state)
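As a rough sketch of this definition, an event can be read off by comparing the relations that hold in the before-state with those that hold in the after-state. The triple encoding of a state, the relation names `in` and `outside`, and the `EVENT_RULES` table are assumptions made for this example; the talk does not specify a representation.

```python
from typing import FrozenSet, List, Tuple

Relation = Tuple[str, str, str]   # (relation, substance, substance)
State = FrozenSet[Relation]       # a state is a set of static relations

# Hypothetical rule table: (relation that ended, relation that began) -> event name
EVENT_RULES = {
    ("in", "outside"): "get_out_of",
    ("outside", "in"): "get_into",
}


def detect_events(before: State, after: State) -> List[Tuple[str, str, str]]:
    """Return events implied by the change from the before-state to the after-state."""
    events = []
    for rel_b, a, b in sorted(before - after):          # relations that ended
        for rel_a, a2, b2 in sorted(after - before):    # relations that began
            if (a, b) == (a2, b2) and (rel_b, rel_a) in EVENT_RULES:
                events.append((EVENT_RULES[(rel_b, rel_a)], a, b))
    return events


before = frozenset({("in", "man", "car")})
after = frozenset({("outside", "man", "car")})
print(detect_events(before, after))  # [('get_out_of', 'man', 'car')]
```

If the two states are identical, no relation has ended or begun, so no event is reported; this matches the view that a state is static and an event is a change.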
- What is space and time?
Fundamentally, they identify the location of a substance, attribute, or event.
Space: position
Time: passage
Primitive and complex
- Primitive
A concept which cannot be decomposed any further (by referring to its word)
- Complex
A concept which can be decomposed into one or more primitives
Formation of complex concept
- Compound
Type A: Two primitives are connected with a logical/syntactic relation.
Type B: Primitives are connected with a scenario.
- Derivative
Derived from a primitive
Conceptual classification
Why classification?
- Verification of the proposed theory
- Acquisition of conceptual data for machine processing
Target vocabulary
- About 32,000 words used in everyday language
Table 2.2 Case-frame of events
Type               Example
v(sbj)             Fall (leaf)
v(sbj,org)         Come (smoke, chimney)
v(sbj,goal)        Go (Taro, post office)
v(sbj,ptn)         Collide (truck, bus)
v(sbj,std)         Resemble (children, parents)
v(sbj,obj)         Break (boy, cup)
v(sbj,obj,org)     Unload (driver, box, truck)
v(sbj,obj,goal)    Put (girl, candy, pocket)
v(sbj,obj,inst)    Scoop (Hanako, sugar, spoon)
v(sbj,obj,att)     Feel (Jiro, breeze, cool)
Others
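The case-frame table lends itself to a small lookup structure. The sketch below is an assumption about how such data might be held in a program, not the format used in the original classification work; `CASE_FRAMES` and `instantiate` are hypothetical names, and the role abbreviations are kept exactly as they appear in the table.

```python
from typing import Dict, Tuple

# Verb -> ordered case roles, transcribed from the example column of Table 2.2.
CASE_FRAMES: Dict[str, Tuple[str, ...]] = {
    "fall": ("sbj",),
    "come": ("sbj", "org"),
    "go": ("sbj", "goal"),
    "collide": ("sbj", "ptn"),
    "resemble": ("sbj", "std"),
    "break": ("sbj", "obj"),
    "unload": ("sbj", "obj", "org"),
    "put": ("sbj", "obj", "goal"),
    "scoop": ("sbj", "obj", "inst"),
    "feel": ("sbj", "obj", "att"),
}


def instantiate(verb: str, *fillers: str) -> Dict[str, str]:
    """Bind fillers to the case roles of verb, in table order."""
    roles = CASE_FRAMES[verb]
    if len(fillers) != len(roles):
        raise ValueError(f"{verb} expects {len(roles)} fillers, got {len(fillers)}")
    return dict(zip(roles, fillers))


print(instantiate("put", "girl", "candy", "pocket"))
# {'sbj': 'girl', 'obj': 'candy', 'goal': 'pocket'}
```

Rejecting a wrong number of fillers is one simple way a case frame can constrain interpretation: "put" without a goal is flagged rather than silently accepted.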
Number of classified concepts
Substance      4,200
Attribute      2,060
Event          3,720
Space/time     1,800
Total         11,780
Evaluation
Our theory covers 70% of the target vocabulary, and almost the whole of it if slightly extended.
Fundamental data on concepts were obtained, which later contributed to the construction of the EDR concept dictionaries.
Main publications
1973 N. Okada & T. Tamati: Analysis and Classification of Simple Matter Concepts for the Interpretation of Natural Language and Picture Patterns, IECE Trans., Vol. 56D, No. 9, pp
N. Okada: Conceptual Taxonomy of Japanese Verbs for Understanding Natural Language, Proc. COLING'80, pp
3. Natural language understanding of moving picture patterns
R. A. Kirsch, pioneer
- Kirsch proposed integrated processing of natural language and picture patterns through a common representation of their meanings [Kirsch64].
- But he processed only static picture patterns.
Approaches to moving picture patterns in early years
- N. Badler '75: Temporal scene analysis
Sentence generation as the result of temporal scene analysis
- Minsky '75: Frame theory
Universal data structure, particularly representation of events
Our approach
- Input
Sequential pictures, each of which is a hand-drawn line drawing
- Meanings captured
Events of change_in_location, the most numerous kind of event
- Output
Japanese and English sentences
Flow of processing
Fig. 3.1 Natural language understanding of picture sequences. Boxes between start and end: picture reading, noise cleaning, primitive picture recognition, reasoning of occurring events, structural analysis among primitives, understanding events, sentence generation. Arrows are labeled bottom up and top down.
Bottom up process
Picture reading
A TV camera follows a line segment by octagonal scanning.
Primitive picture recognition
An input line drawing with graph structure is matched against a template, in the manner of wave propagation.
Fig. 3.2 Reading a line segment: (a) Octagonal scanning, (b) Line following
Fig. 3.5 Boolean judgment of "inside/outside" between Si and Sj by logical computation
- Gestalt processing
Metzger's rule
Continuation: two line segments meeting at an angle of 180°
Enclosure: a domain enclosed by contours
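The continuation rule can be checked with elementary geometry. The following is a minimal sketch, not the original picture-reading code: the function names and the `tolerance_deg` allowance are assumptions, the latter added because hand-drawn segments rarely meet at exactly 180°.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def meeting_angle(p: Point, joint: Point, q: Point) -> float:
    """Angle in degrees at joint between the segments joint-p and joint-q."""
    ax, ay = p[0] - joint[0], p[1] - joint[1]
    bx, by = q[0] - joint[0], q[1] - joint[1]
    dot = ax * bx + ay * by
    cross = ax * by - ay * bx
    return abs(math.degrees(math.atan2(cross, dot)))


def is_continuation(p: Point, joint: Point, q: Point,
                    tolerance_deg: float = 10.0) -> bool:
    """True if the two segments meet at (approximately) 180 degrees."""
    return abs(180.0 - meeting_angle(p, joint, q)) <= tolerance_deg


print(is_continuation((0, 0), (1, 0), (2, 0)))  # straight line
print(is_continuation((0, 0), (1, 0), (1, 1)))  # right-angle corner
```

Two segments that pass this test would be merged into one longer line, while a corner is kept as a junction between distinct segments.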
Experiments
- Reading and recognition of hand-drawn line drawings
- Structural analysis of static pictures
- Natural language understanding (NLU) of before-after state pairs
- NLU of picture sequences
Fig. 3.7 NLU of a before-after state pair: (a) Before-state, (b) After-state
Generated sentences:
1) A man (4) moves (1).
2) A man (4) passes (1).
3) A man (4) walks.
4) A man (4) goes forward (1).
5) A man (4) goes out (1) of a house.
6) A man (4) heads for a car.
7) A man (4) goes to (1) a car.
8) A man (4) comes to (1) a car.
9) A man (4) gets near a car.
Fig. 3.8 NLU of a picture sequence (frames t = t0 to t5)
Generated sentences, in order:
- A man (4) is in a house.
- A man (4) goes out (1) of a house.
- A man (4) gets on (1) a car.
- A car runs (2).
- A car collides with a tree.
- A bird (1) leaves (1) a tree.
- A man (3) gets off (1) a car.
Evaluation
Reading and recognition
About 150 primitive pictures were input; 88% were recognized correctly, and 95% could be reached with some improvement.
Structural analysis of before-after state pairs
Note that current image processing technology can process gray-scale image sequences in real time.
Meaning understanding
Our technology is still useful for all the subcategories of events except mental ones.
Historical significance
This research took the lead in the field of NLU of moving picture patterns in the '70s.
Main publications
1976 N. Okada & T. Tamachi: Interpretation of Moving Picture Patterns and its Description in Natural Language---Semantic Analysis, IEICE Trans. (D), Vol. J59-D, No. 5, pp
N. Okada: SUPP---Understanding Moving Picture Patterns Based on Linguistic Knowledge, Proc. IJCAI, pp
4. Emotion processing vs. knowledge processing
Why does AI need emotion processing?
(1) Texts, e.g. social articles in newspapers, often touch on human matters such as glad/sad or gain/loss.
(2) Some intelligent agents should be friendly to humans.
(3) Some kinds of processing need a mechanism for evaluating input information.
Research in early years
- J. G. Carbonell '80: Story understanding by personality
- Pfeifer & Nicholas '85: Simulation of emotion mechanism by "interruption"
- Okada '87: Emotion model in NLU
Our approach
- Evocation and response
Analysis of general properties and algorithms
- Roles shared by emotion and knowledge
Analysis of emotion
Multi-factor analysis by Plutchik
- Plutchik divided emotions into two categories: "primary" and "complex" [Plutchik 60].
- We follow this idea, and take the following as primary emotions: gladness/sadness, like/dislike, surprise, expectancy, anger, and fear.
Fig. 4.1 Hierarchical features of gladness
Gladness: the current state is better than the previous
- physiological: inner pleasure; outer pleasure
- psychological
  - goal achievement
    - information collection: expected; discover; become clear
    - plan: planning
    - results: completion; gain; useful
  - personal relations
    - companion mind: agreement; sympathy; collaboration; make_friends_again
    - superiority/inferiority: superior; praise; obedience; hospitality; protection
- others
Evocation of emotion
- Reflective
Evoked unconsciously by a sudden stimulus from the external world or a remarkable change in the internal state. A reflective response follows it.
- Deliberative
Evoked consciously by a cognitive process. Deliberate reasoning mediates between the input and its response.
Response of emotion
- General trends
If an input brings one "pleasure", one promotes the input stimulus through one's response; otherwise one inhibits it.
- Type of response
* Free
* Constrained
- Free
An emotion is evoked directly by a stimulus, and a promoting/inhibitory response follows it. The response may cause a task under execution to be given up.
- Constrained
Even if a free emotion is evoked internally, some task under execution inhibits its direct expression.
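The distinction between free and constrained responses can be illustrated with a small decision function. This is a deliberately simplified sketch under stated assumptions, not the Aesopworld implementation: the `Response` record, the `respond` function, and the `task_must_continue` flag are hypothetical, and a free response is modelled as always abandoning the running task, whereas the slide says only that it may.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    tendency: str         # "promote" or "inhibit" the input stimulus
    expressed: bool       # True if the emotion is expressed directly
    task_abandoned: bool  # True if the task under execution is given up


def respond(pleasant: bool, task: Optional[str], task_must_continue: bool) -> Response:
    """Choose a free or constrained response to an evoked emotion."""
    # General trend: pleasure promotes the stimulus, displeasure inhibits it.
    tendency = "promote" if pleasant else "inhibit"
    if task is not None and task_must_continue:
        # Constrained: the emotion is evoked internally, but the task
        # under execution inhibits its direct expression.
        return Response(tendency, expressed=False, task_abandoned=False)
    # Free: the response follows the emotion directly (simplification:
    # any running task is given up).
    return Response(tendency, expressed=True, task_abandoned=task is not None)


print(respond(pleasant=False, task="go_to_pond", task_must_continue=False))
print(respond(pleasant=False, task="dialog", task_must_continue=True))
```

In the constrained case the tendency is still computed, which reflects the slide's point that the emotion is evoked internally even when it is not expressed.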
Emotion vs. knowledge
Language expression
Emotion is expressed by adjectives, whereas knowledge is expressed by verbs.
This implies that emotions are attributes. Since an attribute gives a measure for detecting the difference within an object-standard pair, the evocation of an emotion is a measurement of the input stimulus.
Subjective and objective
Emotion: subjective evaluation of information
Knowledge: memory of objective information
Pattern of evaluation
Formation of personality
Experiments
Simulation of protagonists of fables
- Free evocation in a series of actions (shown in Chapter 5)
- Constrained evocation in dialog process
A dialog: invitation
K1 Hi.
P2 Hi.
K3 Where are you going?
P4 To the river for fishing.
K5 Sounds good.
P6 And you?
K7 I'm going to the mansion to drink water of the pond. I'm very thirsty.
P8 The mansion is dangerous.
K9 Why?
P10 Because I heard a voice when I passed it a while ago.
K11 Really? I wonder what I should do.
P12 Why don't you come to the river with me?
K13 Well, it's far, isn't it?
P14 But the water there is colder and more tasty.
K15 O.K. I'll come with you.
Fig. 4.2 Interaction between discourse and mental analyses
Modules shown: language analysis, dialogue model, emotion, action planning, utterance planning, language generation
Other labels: message flow, top-down prediction, dialogue state tracking, dialogue state transition, intention recognition
E-PLAN nodes: understand, persuade_to_abandon, inform, seek_drawback
R-PLAN nodes: tentative_acceptance, accept, refuse_for_drawback, persuade_to_accept, deny_drawback, inform_advantage, understand_drawback, emphasize_advantage, seek_advantage, seek_drawback
Utterances: (E7) "I'm going to the pond in the mansion to"; (E13) "Well, it's far."; (R12) "Why don't you come to the river with me?"; (R14) "But the water there is colder and more tasty."
Evaluation
- Conceptual analysis: The properties of primitive emotions of children were made clear.
- Evocation: The so-called "non-logical" algorithm was clarified.
- Response: Complicated responses in behavior and dialog were verified.
Main publications
1987 N. Okada: Representation of knowledge and emotions, Proc. Kyushu Symp. Information processing, pp.
1997 M. Tokuhisa & N. Okada: A Pattern Recognition Approach to Emotion Arousal of Intelligent Agents, Trans. JSAI, Vol. 39, No. 8, pp
5. Integrated intelligence
Intelligence dwells in the mind.
Recent research in cognitive science (CS) and AI throws light on the comprehensive mechanism of the mind.
Computer Models of the Mind
Existing models
- M. Minsky '85: System of multi-agents
- Okada '87: Mind composed of six domains and five levels
- P. N. Johnson-Laird '88: Systematization of the results of research in CS
The author's model
Fundamentally, we follow Minsky's multi-agent model.
A micro-processor, the "μ-agent", and its "chain activation" are introduced.
μ-agent(
  name (identifier),
  domain (attached),
  input (premise of activation),
  execution (program),
  memory (data),
  description (result),
  output (message))
Fig. 5.1 Frame representation of μ-agent
- Chain activation
The various functions of the mind are executed by a "chain activation", that is, a series of activations of μ-agents.
Fig. 5.2 Chain activation: Recognition, Reasoning, Behavior
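The μ-agent frame and chain activation can be sketched as a data class plus a simple dispatch loop. This is an illustrative reading of the slides, not the actual Aesopworld code: the slot names follow the frame representation, but the message strings, the three example agents, the first-match dispatch policy, and the `max_steps` guard are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MuAgent:
    name: str                          # identifier
    domain: str                        # domain the agent is attached to
    input: str                         # premise of activation (message awaited)
    execution: Callable[[dict], str]   # program; returns the output message
    memory: dict = field(default_factory=dict)  # data
    description: str = ""              # result of the last activation
    output: str = ""                   # last message sent


def chain_activate(agents: List[MuAgent], first_message: str,
                   max_steps: int = 100) -> List[str]:
    """Activate agents one after another, each triggered by the previous output."""
    trace = []
    message = first_message
    for _ in range(max_steps):
        ready = [a for a in agents if a.input == message]
        if not ready:
            break                      # no premise satisfied: the chain ends
        agent = ready[0]
        agent.output = agent.execution(agent.memory)
        agent.description = f"activated by {message}"
        message = agent.output
        trace.append(agent.name)
    return trace


agents = [
    MuAgent("sense_thirst", "Recognition", "thirst", lambda m: "desire:relieve_thirst"),
    MuAgent("plan_drink", "Reasoning&Design", "desire:relieve_thirst", lambda m: "plan:drink_water"),
    MuAgent("move", "Expression", "plan:drink_water", lambda m: "done"),
]
print(chain_activate(agents, "thirst"))  # ['sense_thirst', 'plan_drink', 'move']
```

The trace passes from a recognition agent through a reasoning agent to an expression agent, mirroring the Recognition, Reasoning, Behavior chain of Fig. 5.2.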
Domains of processing
The mind consists of six domains, which function as follows:
(1) Recognition
(2) Reasoning&Design
(3) Emotion
(4) Expression
(5) Memory
(6) Language
Fig. 5.3 Domains of processing
Mind (brain): Recognition, Expression, Reasoning&Design, Memory, Emotion, Language
Body: Sensors, Actuators (thirst, hunger, ...)
External world: scene, speech, ... (input); behavior, speech, ... (output)
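Detail of the Reasoning&Design domain (same layout as Fig. 5.3)
Labels shown: Control, Planning, Reasoning, plan controller, interrupt controller, plan generator, evaluator, simulator, reasoning of dialog, reasoning of behavior
Categories of μ-agents listed: danger, feasibility, existence, and others
Examples of μ-agent names:
- recognition / existence of human / intersection 1
- recognition / existence of human / front 1 of mansion 1
- recognition / existence of human / pond 1 of mansion 1
- recognition / existence of human / pond 1
- recognition / existence of human / front 1 of hunter's lodge 1
- recognition / existence of human / grape trellis 1 of mansion 1
- recognition / existence of human / east 1 of bridge 1
- recognition / possibility of slipping / pond 1
- recognition / possibility of falling down / pond 1
- recognition / possibility of falling in / pond 1
- recognition / possibility of drowning / pond 1
- recognition / possibility of catching a cold / pond 1
- recognition / possibility of freezing to death / pond 1
- recognition / possibility of slipping / pond 1 of mansion 1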
Levels of data
Along the concept formation process:
Level 5 Connected concept
Level 4 Simple concept
Level 3 Conceptual feature
Level 2 Cognitive feature
Level 1 Raw data
Fig. 5.4 Levels of data
Levels shown: raw data, cognitive feature, conceptual feature, primitive concept, connected concept
Examples shown: substance (roof, wall, room, ...; house), event (movement_from_inside_to_outside, ...; go), attribute (difference_in_length, ...; high), connected concept (Shopping(buy, cash/card, store, ...))
Other labels: external, internal, visual, extracted, associated, composed, human, car, movable, inside, is_a, agent, origin, Vi, Ni, Ai
Aesopworld Project
- Implementation of our theory
- Simulation of the physical and mental activities of the protagonists of Aesop's Fables, e.g. The Fox and the Grapes
Fig. 5.5 Chain activation of μ-agents
Labels shown: Sensors, Recognition, Emotion, Memory, plan-knowledge, nature-reasoning, Reasoning&Design, controller, planner, plan generator, simulator, evaluator, reasoner, Language, Expression, Actuators
Activated chain: physiology (thirst), desire (relieve thirst), goal (relieve thirst), reasoning (water in pond; pot in house; human near pond), plan (drink water; eat fruits)
Fig. 5.5 (continued): plan (go to mansion to drink water), action (movement to mansion)
Experiments
Main system
Four PCs and fifteen interpreters (subdomains)
Fig. 5.6 Composition
PC1 (Turbo Linux 8): Subdomains 1-4
PC2 (Turbo Linux 8): Subdomains 5-8
PC3 (RedHat Linux 5.2J): Subdomains 9-12
PC4 (RedHat Linux 5.2J): Subdomains 13-15
Also shown: message server, LAN
Fig. 5.7 Snapshot 1
Fig. 5.8 Snapshot 2
Generated monolog by the Fox
It’s very hot today. I’m on the animal trail 300 meters from the intersection. I’m very thirsty. I’d like to relieve my thirst in a safe way in a hurry. I’ll search for and drink water. I’ll go home. My home is far. I give up going there. I’ll go under the bridge. It’s far. I give up going there… I study other ways. I’ll search for a place with water. I remember a pond. I’ll find it. I remember the B pond. It’s in the Aesopworld. I’ll go there. A hunter’s lodge is close to it. He’ll probably be in it. He is man. Man is dangerous. I give up going there… I’ll eat watery foods. I’ll search for and eat fruits…
Table 5.2 Comparison with Minsky and Johnson-Laird
Columns: Minsky '85 | Okada '87 | Johnson-Laird '88
Approach: Bottom up; Top down
Domains: Many; Six
Levels: Many; Five; Many
Technology: Multi-agents; Turing machine
Experiment: No; Yes; No
Evaluation
- Various mental activities discussed in CS and AI could be captured by our six domains and five levels.
- An interface to physiology is placed at the level of raw data.
- This model can be implemented if the number of μ-agents is fewer than ten thousand.
- Our integrated intelligence took the lead in verifying its validity by experiments.
Main publications
1990 N. Okada and T. Endo: Story Generation Based on Dynamics of the Mind, Computational Intelligence, Vol. 8, No. 1, pp
N. Okada: Integrating Vision, Motion, and Language through the Mind, Artificial Intelligence Review, Vol. 10, pp
6. Residual problems and social applications
Problems
- Learning through experiences
- Implementation on robots
Applications
- Support agents for education or diagnosis
- Partners for handicapped/elderly people
7. Conclusions
Concepts of substance, attribute, event, and space/time were systematically analyzed and classified.
A system for NLU of picture sequences was constructed.
Primitive emotions were analyzed and implemented in the tasks of action and dialog planning.
A computer model of the mind with six domains of processing and five levels of data was proposed, and was implemented with twelve hundred μ-agents on computers.
These results led us to the conclusion that an infrastructure for constructing complex intelligence covering many subfields could be obtained.