CONFUCIUS: An Intelligent MultiMedia Storytelling Interpretation and Presentation System. Minhua Eunice Ma. Supervisor: Prof. Paul Mc Kevitt, School of Computing and Intelligent Systems.


CONFUCIUS: An Intelligent MultiMedia Storytelling Interpretation and Presentation System Minhua Eunice Ma Supervisor: Prof. Paul Mc Kevitt School of Computing and Intelligent Systems Faculty of Engineering University of Ulster, Magee

Faculty Research Student Conference, Jordanstown, 15 Jan 2004

Outline
 Related research
 Overview of CONFUCIUS
 Automatic generation of 3D animation
 Semantic representation
 Natural language processing
 Current state of implementation
 Relation to other work
 Conclusion & future work

Related research
 3D visualisation
   Virtual humans & embodied agents: Jack, Improv, BEAT
   Multimodal interactive storytelling: AesopWorld, KidsRoom, Larsen & Petersen's Interactive Storytelling, computer games
   Automatic text-to-graphics systems: WordsEye, CD-based language animation
 Related research in NLP
   Lexical semantics
   Levin's verb classes
   Jackendoff's Lexical Conceptual Structure
   Schank's scripts

Objectives of CONFUCIUS
 To interpret natural language sentences/stories and to extract conceptual semantics from the natural language
 To generate 3D animation and virtual worlds automatically from natural language
 To integrate 3D animation with speech and non-speech audio, forming an intelligent multimedia storytelling system
[Diagram: a storywriter/playwright supplies a story in natural language or a movie/drama script (via a tailored menu for script input); CONFUCIUS produces 3D animation, speech (dialogue) and non-speech audio for the user/story listener.]

Architecture of CONFUCIUS
[Architecture diagram: natural language stories and movie/drama scripts pass through a script parser and natural language processing, drawing on language knowledge (an LCS lexicon, grammar and semantic representations). Semantic representations are then mapped to visual knowledge (a 3D graphic library of prefabricated objects, built with 3D authoring tools from existing 3D models & character models). Animation generation, text-to-speech and sound effects are synchronised and fused into a 3D world with audio in VRML.]

Software & Standards
 Java: parsing semantic representations, changing VRML code to add/modify animation, integrating modules
 Natural language processing tools
   Connexor Machinese FDG parser (morphological and syntactic parsing)
   WordNet (lexicon, semantic inference)
 3D graphic modelling
   Existing 3D models (virtual humans/objects) on the Internet
   Authoring tools: Character Studio (humanoid characters), 3D Studio Max (props & stage), Microsoft Agent (narrator)
   Modelling languages & standards: VRML 97 for modelling the geometry of objects, props and environments; the H-Anim specification for humanoid modelling

Agents and avatars: how much autonomy?
[Diagram: a spectrum of autonomy & intelligence for virtual humans, from low (avatars) through interface agents to high (autonomous agents); characters in non-interactive storytelling lie between agents and avatars.]
 Autonomous agents have higher requirements for sensing, memory, reasoning, planning, behaviour control & emotion (a sense-emotion-control-action structure)
 "User-controlled" avatars require fewer autonomous actions; basic naïve physics such as collision detection and reaction is still required
 A virtual character in non-interactive storytelling lies between agents and avatars: its behaviours, emotions and responses to the changing environment are described in the story input

Graphics library
[Diagram: the graphics library comprises simple geometry files for objects/props, geometry & joint-hierarchy files (H-Anim) for characters, and an animation library of keyframed motions; character animation is produced by instantiating motions from the library.]

Level of Articulation (LOA) of H-Anim
[Figure: the joints and segments of LOA1, with example Site nodes on the hands for pushing and holding objects.]
 CONFUCIUS adopts LOA1 for human animation
 The animation engine adds ROUTEs dynamically, based on H-Anim joints & animation keyframes
 CONFUCIUS' human animation can be adapted to other LOAs
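The dynamic ROUTE insertion described above can be sketched in Java, the system's implementation language. This is a minimal illustrative sketch: the class and method names are assumptions, not CONFUCIUS' actual API, though the ROUTE syntax and the hanim_ joint naming follow the VRML 97 and H-Anim conventions.

```java
// Sketch: generating the VRML ROUTE statements that wire a TimeSensor
// and an OrientationInterpolator to an H-Anim joint, so keyframed
// rotations drive the joint during playback. Class and method names
// are hypothetical, not CONFUCIUS' real API.
public class RouteGenerator {

    // Build the two ROUTE lines needed to animate one joint:
    // timer fraction -> interpolator input, interpolator output -> joint rotation.
    public static String routesForJoint(String timer, String interpolator,
                                        String joint) {
        return "ROUTE " + timer + ".fraction_changed TO "
             + interpolator + ".set_fraction\n"
             + "ROUTE " + interpolator + ".value_changed TO "
             + joint + ".set_rotation\n";
    }

    public static void main(String[] args) {
        // e.g. animate the right shoulder for a "wave" keyframe set
        System.out.print(routesForJoint("WaveTimer", "RShoulderInterp",
                                        "hanim_r_shoulder"));
    }
}
```

Appending such ROUTE strings to the generated scene file is enough to activate a stored keyframe animation on any LOA1 joint.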

Semantic representations

Lexical Visual Semantic Representation
 Lexical Visual Semantic Representation (LVSR): a semantic representation between language syntax and 3D models
 LVSR is based on Jackendoff's LCS, adapted to the task of language visualisation (and enhanced with Schank's scripts)
 Ontological categories: OBJ, HUMAN, EVENT, STATE, PLACE, PATH, PROPERTY
   OBJ: props/places (e.g. buildings)
   HUMAN: human beings and other articulated animated characters (e.g. animals), as long as their skeleton hierarchy is defined
   EVENT: actions, movements and manners
   STATE: static existence
   PROPERTY: attributes of OBJ/HUMAN
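The ontological categories above can be captured as a small Java type. This is an illustrative sketch only; the enum mirrors the categories listed on the slide, but the toy lexicon and lookup method are invented for the example and are not CONFUCIUS' real lexicon.

```java
// Sketch: LVSR's ontological categories as a Java enum, with a toy
// four-word lexicon assigning a category to a lexical item.
// The lexicon entries are illustrative, not CONFUCIUS' actual data.
import java.util.Map;

public class Lvsr {
    public enum Category { OBJ, HUMAN, EVENT, STATE, PLACE, PATH, PROPERTY }

    private static final Map<String, Category> LEXICON = Map.of(
        "building", Category.OBJ,      // props/places
        "waiter",   Category.HUMAN,    // articulated character
        "push",     Category.EVENT,    // action/movement
        "red",      Category.PROPERTY  // attribute of OBJ/HUMAN
    );

    public static Category categoryOf(String word) {
        // default to STATE for unknown words (an arbitrary choice here)
        return LEXICON.getOrDefault(word, Category.STATE);
    }

    public static void main(String[] args) {
        System.out.println("push -> " + categoryOf("push"));
    }
}
```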

PATH & PLACE predicates
 Interpret the spatial movement of OBJs/HUMANs
 Cover 62 common English prepositions
 7 PATH predicates & 11 PLACE predicates
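A preposition-to-predicate mapping of this kind can be sketched as a lookup table. The predicate names below (TO, FROM, VIA, IN, ON, AT) echo Jackendoff-style LCS primitives and are assumptions for illustration; the slide does not enumerate CONFUCIUS' actual 7 PATH and 11 PLACE predicates.

```java
// Sketch: mapping English prepositions to PATH/PLACE predicates.
// Predicate names echo Jackendoff-style LCS primitives; the concrete
// inventory used by CONFUCIUS is not reproduced here.
import java.util.Map;

public class SpatialPredicates {
    public enum Kind { PATH, PLACE }

    public record Predicate(String name, Kind kind) {}

    private static final Map<String, Predicate> PREPOSITIONS = Map.of(
        "to",      new Predicate("TO",   Kind.PATH),   // goal of motion
        "from",    new Predicate("FROM", Kind.PATH),   // source of motion
        "through", new Predicate("VIA",  Kind.PATH),   // route of motion
        "in",      new Predicate("IN",   Kind.PLACE),  // containment
        "on",      new Predicate("ON",   Kind.PLACE),  // support/contact
        "at",      new Predicate("AT",   Kind.PLACE)   // co-location
    );

    public static Predicate lookup(String preposition) {
        return PREPOSITIONS.get(preposition.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println("through -> " + lookup("through").name());
    }
}
```

Separating PATH from PLACE matters for animation: a PATH predicate drives an object's trajectory over time, while a PLACE predicate only constrains its static position.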

NLP in CONFUCIUS
[Diagram: pre-processing feeds a part-of-speech tagger, a morphological parser and a syntactic parser (the Connexor FDG parser), followed by coreference resolution, disambiguation and semantic inference (using WordNet, the LCS database and FEATURES), plus temporal reasoning over lexical and post-lexical temporal relations.]

Visual valency & verb ontology
Human action verbs:
 One visual valency (the single role is a human; (partial) body movement)
   Biped kinematics: arm actions (wave, scratch), leg actions (walk, jump, kick), torso actions (bow), combined actions (climb)
   Facial expressions & lip movement, e.g. laugh, fear, say, sing, order
 Two visual valencies (at least one role is human)
   One human and one object (vt., or vi. + instrument), e.g. throw, push, kick, open, eat, drink, bake, trolley
   Two humans, e.g. fight, chase, guide
 Visual valency ≥ 3 (at least one role is human)
   Two humans and one object (incl. ditransitive verbs), e.g. give, show
   One human and two or more objects (vt. + object + implicit instrument/goal/theme), e.g. cut, write, butter, pocket, dig, cook
 Verbs without distinct visualisation out of context: verbs of trying, helping, letting, creating/destroying
 High-level behaviours (routine events), political and social activities, e.g. interview, eat out (go to a restaurant), go shopping
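A visual-valency table over this verb ontology can be sketched as a simple lookup. The verb-to-valency entries follow the slide's own examples; the class and method names are illustrative assumptions.

```java
// Sketch: a visual-valency table for human action verbs.
// Entries follow the slide's examples (walk = 1 role, push = 2 roles,
// give = 3 roles); names are illustrative, not CONFUCIUS' real API.
import java.util.Map;

public class VisualValency {
    private static final Map<String, Integer> VALENCY = Map.of(
        "walk",  1,  // one human role: leg action
        "laugh", 1,  // one human role: facial expression
        "push",  2,  // one human + one object
        "fight", 2,  // two humans
        "give",  3,  // two humans + one object (ditransitive)
        "cut",   3   // one human + object + implicit instrument
    );

    // 0 signals no distinct visualisation out of context
    // (e.g. verbs of trying, helping, letting).
    public static int valencyOf(String verb) {
        return VALENCY.getOrDefault(verb, 0);
    }

    public static void main(String[] args) {
        System.out.println("give -> " + valencyOf("give"));
    }
}
```

The valency tells the animation engine how many participants (characters, objects) must be instantiated in the scene before the verb's motion can be played.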

Level-of-Detail (LOD): basic-level verbs & troponyms
[Diagram: a verb hierarchy under EVENT. Event-level verbs: go, run, cause, …; manner-level verbs: walk, climb, jump; troponym-level verbs: limp, stride, swagger, trot, skip, bounce, hop, jog, romp, …]
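This hierarchy can be sketched as a parent map in which each troponym points to its more general verb, so an animation engine could fall back to a coarser motion when no specific keyframes exist. The entries follow the diagram; the fallback mechanism and names are illustrative assumptions, not the system's actual implementation.

```java
// Sketch: the verb level-of-detail hierarchy as a troponym -> parent
// map. Climbing the map yields the event-level verb, a possible
// fallback when no troponym-specific animation is available.
import java.util.Map;

public class VerbLod {
    private static final Map<String, String> PARENT = Map.of(
        "limp",    "walk",  // troponym-level -> manner-level
        "swagger", "walk",
        "stride",  "walk",
        "jog",     "run",
        "hop",     "jump",
        "walk",    "go",    // manner-level -> event-level
        "run",     "go",
        "jump",    "go"
    );

    // Follow parents until an event-level verb (no parent) is reached.
    public static String eventLevel(String verb) {
        String v = verb;
        while (PARENT.containsKey(v)) {
            v = PARENT.get(v);
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println("limp -> " + eventLevel("limp"));
    }
}
```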

Current status of implementation
 Collision detection example (contact verbs: hit, collide, scratch, touch)
   "The car collided with a wall."
   Uses ParallelGraphics' VRML extension for object-to-object collision
   Non-speech sound effects
 H-Anim examples
   3-visual-valency verbs: "John put a cup of coffee on the table." (H-Anim Site nodes; locative tags on objects, e.g. an on_table tag for the table object)
   2-visual-valency verbs: "John pushed the door." "John ate the bread." "Nancy sat on the chair."
   1-visual-valency verbs: "The waiter came to me: 'Can I help you, Sir?'" (speech modality & lip synchronisation; camera direction from the avatar's point of view)

Relation to other work
 Domain-independent, general-purpose humanoid character animation
 CONFUCIUS' character animation focuses on the language-to-humanoid-animation process, rather than on human modelling & motion alone
 An implementable semantic representation, LVSR, connecting linguistic semantics to visual semantics and suitable for action execution (animation)
 Categorisation and visualisation of eventive verbs based on visual valency
 A reusable common-sense knowledge base to elicit implied actions, instruments, goals and themes underspecified in the language input

Prospective applications
 Children's education
 Multimedia presentation
 Movie/drama production
 Computer games
 Virtual reality

Conclusion & Future work
 Humanoid animation explores problems in language visualisation & automatic animation production
 Formalises the meaning of action verbs and spatial prepositions
 Maps language primitives to visual primitives
 Provides a reusable common-sense knowledge base for other systems

Further work
 Discourse-level interpretation
 Action composition for simultaneous activities
 Verbs involving multiple characters' synchronisation & coordination (e.g. introduce)