Author: James Allen, Nathanael Chambers, etc. By: Rex, Linger, Xiaoyi Nov. 23, 2009.

Outline: Introduction; The PLOW System; Demonstration; Learning Tasks; Evaluation; Strength & Weakness; Related Works; Q&A and Discussion

Introduction: Aim to further human-machine interaction. Quickly learn new tasks from human subjects using modest amounts of interaction. Acquire task models from intuitive, language-rich demonstration.

Background: Previous work learned new tasks by observation, through observing experts' demonstrations. The paper's contribution is to acquire tasks much more quickly, typically from a single example, possibly with some clarification dialogue.

The Interface

Agent Architecture

Language Processing: The focus is on how language is used for task learning, rather than on how language processing is accomplished. Uses the TRIPS system, which is based on a domain-independent representation.

Instrumentation: The key issue is to get the right level of analysis for the instrumentation. Candidate levels: backend API calls of the GUI; identifiers evoked by the users; the middle layer?; mouse movements and gestures; keyboard hits...; DOMs.

A demo

Task Learning Challenges: Identifying the correct parameterization; hierarchical structure; identifying the boundaries of iterative loops; loop termination conditions; task goals.

Primitive Action Learning: NL interpretation + GUI interpretation. Heuristic search through the DOM, using a semantic metric and a structural distance metric.
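
The slide names the two metrics but not their definitions, so the following is only a minimal sketch of how a heuristic DOM search could combine them. The `Node` class, the word-overlap semantic score, the edge-count distance, and the `weight` value are all assumptions made for illustration, not PLOW's actual method.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical, simplified DOM node (not a real browser DOM API)."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def ancestors(node):
    """Return the path from node up to the root, node first."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def structural_distance(a, b):
    """Number of tree edges between a and b via their lowest common ancestor."""
    depth_from_a = {id(n): i for i, n in enumerate(ancestors(a))}
    for j, n in enumerate(ancestors(b)):
        if id(n) in depth_from_a:
            return depth_from_a[id(n)] + j
    raise ValueError("nodes are not in the same tree")


def semantic_score(description, node):
    """Assumed metric: fraction of description words found in the node text."""
    words = set(description.lower().split())
    label = set(node.text.lower().split())
    return len(words & label) / len(words) if words else 0.0


def walk(node):
    """Yield every node of the subtree in depth-first order."""
    yield node
    for child in node.children:
        yield from walk(child)


def find_target(root, clicked, description, weight=0.1):
    """Pick the node with the best semantic score minus a distance penalty."""
    return max(
        walk(root),
        key=lambda n: semantic_score(description, n)
        - weight * structural_distance(clicked, n),
    )


# Toy page: a form with a "Title" label next to an input, plus a distant footer.
root = Node("body")
form = root.add(Node("form"))
label = form.add(Node("label", "Title"))
box = form.add(Node("input"))
footer = root.add(Node("div", "title of page footer"))

# The user clicks the input box and mentions "title".
target = find_target(root, box, "title")
```

In this toy page both the label and the footer mention "title", and the structural term breaks the tie in favour of the label that sits next to the clicked field.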

Parameterization Learning: Identify the appropriate parameterization: object roles; input/output parameters; constants; relational dependencies. Uses information from language.

Parameterization Learning (example): Output: Hotels. Input: Address. Constant: Hotels. Relational dependency: Zip is a role of Address.
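
One way to picture the parameter kinds listed on this slide is as a small table that is bound before a learned step is replayed. The sketch below is an assumption-laden illustration: the `Param` class, the `bind` helper, the `category` name for the constant, and the sample address are all hypothetical and not taken from PLOW.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Param:
    name: str
    kind: str                       # "input", "output", "constant" or "dependent"
    value: Optional[str] = None     # used by constants only
    role_of: Optional[str] = None   # used by dependents: the parameter they are a role of


def bind(params, inputs):
    """Resolve every parameter that is known before the step is executed."""
    env = {}
    for p in params:
        if p.kind == "input":
            env[p.name] = inputs[p.name]      # supplied anew on every run
        elif p.kind == "constant":
            env[p.name] = p.value             # fixed at teaching time
    for p in params:
        if p.kind == "dependent":
            env[p.name] = env[p.role_of][p.name]   # e.g. zip is a role of address
    return env                                # outputs stay unbound until execution


# The slide's example, with hypothetical names and a made-up address.
params = [
    Param("address", "input"),
    Param("category", "constant", value="hotels"),
    Param("zip", "dependent", role_of="address"),
    Param("hotels", "output"),
]
bound = bind(params, {"address": {"street": "1 Main St", "zip": "12345"}})
```

Here `bound` contains the address, the constant, and the zip code derived from the address, while the output parameter is left for the execution step to fill in.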

Hierarchical Structure Learning: Beginning of new procedures; a goal; end of procedure.

Iteration Learning: Iterative procedure learning; language support; PLOW's attempt at iteration; user corrections / more examples; rule/pattern learned.
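
The slide does not say how the rule or pattern is learned from corrections. As a rough stand-in, the sketch below generalises by keeping only the attributes shared by all examples the user has pointed at. The attribute names and the intersection rule are assumptions for illustration, not PLOW's algorithm.

```python
def learn_pattern(examples):
    """Keep only the attribute values shared by every example item."""
    pattern = dict(examples[0])
    for example in examples[1:]:
        pattern = {k: v for k, v in pattern.items() if example.get(k) == v}
    return pattern


def select(pattern, items):
    """Return the items that agree with the pattern on every attribute."""
    return [it for it in items if all(it.get(k) == v for k, v in pattern.items())]


# Hypothetical flattened view of a results page.
page = [
    {"tag": "a", "cls": "title", "row": 1, "text": "Paper A"},
    {"tag": "a", "cls": "title", "row": 2, "text": "Paper B"},
    {"tag": "a", "cls": "author", "row": 1, "text": "Smith"},
    {"tag": "a", "cls": "title", "row": 3, "text": "Paper C"},
]

# One example is too specific: the pattern matches only that item.
first_guess = select(learn_pattern([page[0]]), page)

# After the user supplies another title, the pattern generalises.
pattern = learn_pattern([page[0], page[1]])
titles = [item["text"] for item in select(pattern, page)]
```

With a second example the instance-specific attributes (`row`, `text`) drop out, and the remaining pattern picks up every title on the page, including one the user never pointed at.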

Iteration Learning

Evaluation: 16 test subjects with general training. 3 other systems: one learned entirely from passive observation; one used a sophisticated GUI primarily designed for editing procedures but extended to allow the definition of new procedures; one used an NL-like query and specification language that requires detailed knowledge of the HTML producing the web page.

Evaluation, first part: Subjects taught the different systems some subset of the predefined test questions. Evaluators created new test examples by specifying values for the input parameters, then scored the execution results. PLOW scored 2.82 out of 4 (the other systems' scores are not mentioned).

Evaluation, second part: 10 new test questions were designed by an outside group and were unknown to the developers prior to the test. Subjects had one work day to teach whichever of these tasks they wished, using whichever of the task learning systems they wished. PLOW was used to teach 30 out of 55 task models. PLOW also received the highest average score in the test (2.2/4).

Strength: Integrates natural language recognition and understanding (TRIPS, 2001). "Play by play" mode, great user experience. Easier to identify parameters, loop boundaries, and termination conditions, to build hierarchical structure, and to realize goals. Generalization from one short task. Learns not only the task, but also the rule.

Strength: Error correction from users ("This is wrong. Here is another title"). PLOW asks users to confirm correctness when generating data from lists. Less domain knowledge and less training required. Close to "one-click automation".

Weakness: Some remarks on the evaluation: PLOW was ensured of being able to learn the 17 predetermined test questions; what about the other systems? The 10 new tasks have different levels of difficulty, ex: For what reason did travel for between and ? There is no detailed analysis of the evaluation results, so does PLOW really learn robust task models from a single example, or is it just better on certain types of tasks?

Weakness: Learning and reasoning rely on NL understanding: what happens when it encounters new concepts? Does it require certain patterns of speaking? Are its NL understanding capabilities sufficient? Still needs one full work day to teach 3 simple tasks per person. Users have to construct good task models; PLOW has no error detection mechanism for users.

Related works: Sheepdog, 2004: an implemented system for capturing, learning, and playing back technical support procedures on the Windows desktop. Handles complex technical support procedures, versus the relatively simple procedures in PLOW. Records traces to form an alignment problem and uses I/O HMMs to build procedure models; needs many training examples.

Related works: Tailor, 2005: a system that allows users to modify task information through instruction. Recognizes the user's instruction by combining rules with a parser (JavaNLP). Maps sentences to hypothesized changes. Reasons about the effects of changes and detects unanticipated behavior. Also handles relatively simple tasks.

Related works: CHINLE, 2008: a system which automatically constructs PBD systems for applications based on their interface specification. Learns from incomplete data and partially from inconsistent data; PLOW can learn a subset of certain tasks, but users cannot make mistakes.

Questions?

More time for the latest PLOW demo?

Thank you!
