Case Base Maintenance (CBM). Fabiana Prabhakar, CSE 435, November 6, 2006.



Introduction
The growing use of CBR applications has brought increased awareness of the importance of case-base maintenance (CBM). Large-scale CBR systems are becoming more prevalent, with case library sizes ranging from thousands to millions of cases. Large case bases raise concern about the utility problem for case retrieval, underlining the potential need to control case-base growth through case deletion policies.

Definition
CBM is the process of refining a CBR system’s case base to improve the system’s performance.

Standard CBR learning
The system always adds each new case to the case base.
A domain expert adds a variable number of new cases.
Indexing of the cases.

Knowledge-based Systems Utility Problem
The utility problem arises when the cost associated with searching for relevant knowledge outweighs the benefit of applying the knowledge.

Traditional Deletion Policies
Random deletion: a simple policy that removes a random item from the knowledge base once the knowledge-base size exceeds some predefined limit.
Minton’s utility metric [Minton, 1990]: chooses a knowledge item for deletion based on an estimate of its performance benefits.
Utility = (ApplicationFreq * AverageSavings) - MatchCost
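The utility metric above can be sketched in a few lines of Python. This is only an illustration: the `KnowledgeItem` class, the field names, and the sample numbers are assumptions made for the example, and the three quantities are assumed to be measured in the same cost units.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeItem:
    # Hypothetical record; the field names mirror the terms of the metric.
    name: str
    application_freq: float  # how often the item is applied
    average_savings: float   # average cost saved per application
    match_cost: float        # cost of checking whether the item applies


def utility(item: KnowledgeItem) -> float:
    """Utility = (ApplicationFreq * AverageSavings) - MatchCost."""
    return item.application_freq * item.average_savings - item.match_cost


def select_for_deletion(items):
    """Pick the item with the lowest estimated utility."""
    return min(items, key=utility)


# Made-up sample values, for illustration only.
items = [
    KnowledgeItem("a", 10, 2.0, 5.0),
    KnowledgeItem("b", 1, 1.0, 4.0),
    KnowledgeItem("c", 3, 3.0, 1.0),
]
victim = select_for_deletion(items)
```

An item whose match cost exceeds its expected savings has negative utility, which is the situation the utility problem describes.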

Remembering to Forget: A Competence-Preserving Case Deletion Policy for CBR Systems (Smyth and Keane, 1995)

Coverage and Reachability
The coverage of a case is the set of target problems that the case can solve.
The reachability of a target problem is the set of cases that can be used to provide a solution for that target problem.
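A minimal sketch of the two definitions, assuming a toy domain in which cases and target problems are points on a number line. The `solves` predicate is hypothetical and stands in for whatever retrieval and adaptation a real CBR system would perform.

```python
def solves(case, target):
    # Hypothetical stand-in: a case solves a target within distance 1.
    return abs(case - target) <= 1


def coverage(case, targets):
    """Set of target problems that the given case can solve."""
    return {t for t in targets if solves(case, t)}


def reachability(target, cases):
    """Set of cases that can provide a solution for the given target."""
    return {c for c in cases if solves(c, target)}


cases = [1, 2, 5]
targets = [0, 1, 2, 3, 5, 6]
```

With these sample values, `coverage(2, targets)` contains 1, 2 and 3, while `reachability(6, cases)` contains only case 5.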

Case Competence Categories
Pivotal cases: their deletion directly reduces the competence of the system. A case is pivotal if it is reachable by no other case but itself.
Auxiliary cases: they do not affect competence at all. A case is auxiliary if the coverage it provides is subsumed by the coverage of one of its reachable cases.

Case Competence Categories (Cont.)
Spanning cases: they do not directly affect competence. Their coverage spaces link regions of the problem space that are independently covered by other cases. If cases from these linked regions are deleted, then the spanning case might become necessary.
Support cases: a special class of spanning cases. They exist in groups. The deletion of a whole group is analogous to removing a pivotal case.
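The pivotal and auxiliary definitions can be checked mechanically once coverage and reachability sets are available. The sketch below assumes that the cases themselves serve as the sample of target problems and reuses a hypothetical `solves` predicate. It does not try to separate spanning from support cases, since that needs the group structure, so both fall under one placeholder label.

```python
def solves(case, target):
    # Hypothetical stand-in: a case solves a target within distance 1.
    return abs(case - target) <= 1


def classify(cases):
    """Label each case as pivotal, auxiliary, or spanning-or-support."""
    cov = {c: {t for t in cases if solves(c, t)} for c in cases}
    reach = {t: {c for c in cases if solves(c, t)} for t in cases}
    labels = {}
    for c in cases:
        others = reach[c] - {c}
        if not others:
            # Reachable by no other case but itself.
            labels[c] = "pivotal"
        elif any(cov[c] <= cov[o] for o in others):
            # Coverage subsumed by a case that reaches it.
            labels[c] = "auxiliary"
        else:
            labels[c] = "spanning-or-support"
    return labels


labels = classify([1, 2, 3, 10])
```

Because `<=` accepts equal sets, two cases with identical coverage are both labelled auxiliary here. Categories therefore need recomputing after each deletion.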

Case Competence Categories (Cont.)

The case categories provide a means of ordering cases for deletion in terms of their competence contributions:
1. Auxiliary cases (they make no direct contribution to competence)
2. Support cases
3. Spanning cases
4. Pivotal cases

Modeling Case Competence
Competence categories are computed at start-up. During future problem solving, as cases are learned, the case categories must be updated:
1. Re-compute the coverage and reachability sets of the appropriate cases;
2. Adjust the categories accordingly.

The Footprint Deletion Policy
Ideally, a deletion policy should remove irrelevant cases, guiding the case base toward an optimal configuration of cases. The competence footprint is this optimal case base: it provides the same competence as the entire case base but with fewer cases.

The Footprint Deletion Algorithm
DeleteCase(Cases):
  If there are auxiliary cases then
    SelectAuxiliary(AuxiliaryCases)
  ElseIf there are support cases then
    With the largest support group
      SelectSupport(SupportGroup)
  ElseIf there are spanning cases then
    SelectSpanning(SpanningCases)
  ElseIf there are pivotal cases then
    SelectPivot(PivotalCases)
  EndIf
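The footprint deletion procedure can be sketched in Python as follows. The slide does not spell out the Select procedures, so this sketch assumes that each one picks the candidate with the least coverage, and that the categories and support groups have already been computed. The function signature and the `coverage_size` mapping are hypothetical.

```python
def delete_case(auxiliary, support_groups, spanning, pivotal, coverage_size):
    """Return the case to delete, or None if the case base is empty.

    auxiliary, spanning, pivotal: sets of case identifiers.
    support_groups: list of sets of case identifiers.
    coverage_size: maps a case identifier to the size of its coverage set.
    """
    def pick(candidates):
        # Assumed selection rule: least coverage, ties broken by name.
        return min(sorted(candidates), key=lambda c: coverage_size[c])

    if auxiliary:
        return pick(auxiliary)
    if support_groups:
        # Work within the largest support group.
        return pick(max(support_groups, key=len))
    if spanning:
        return pick(spanning)
    if pivotal:
        return pick(pivotal)
    return None


# Made-up sample values, for illustration only.
sizes = {"a": 3, "b": 1, "s1": 2, "s2": 2, "s3": 1, "p": 4}
first = delete_case({"a", "b"}, [{"s1", "s2"}], set(), {"p"}, sizes)
```

Pivotal cases are reached only when every other category is empty, which matches the deletion order given by the competence categories.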

The Footprint Utility Deletion Policy
Combines footprint deletion and utility deletion.
Minton’s utility metric: an item is selected based on an estimate of its performance benefits.
Utility = (ApplicationFreq * AverageSavings) - MatchCost
The footprint method is used to select candidates for deletion. If there is only one such candidate, it is deleted. If, however, there are a number of candidates, then rather than selecting the one with the least coverage or largest reachability set, the candidate with the lowest utility is chosen. In other words, the utility metric is used within the SelectPivot, SelectSpanning, SelectSupport, and SelectAuxiliary procedures.
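A minimal sketch of the selection step under the footprint utility policy, assuming the footprint method has already produced the candidate set. The `stats` mapping, which holds (ApplicationFreq, AverageSavings, MatchCost) per case, is a hypothetical structure with made-up values.

```python
def utility(application_freq, average_savings, match_cost):
    """Utility = (ApplicationFreq * AverageSavings) - MatchCost."""
    return application_freq * average_savings - match_cost


def select_candidate(candidates, stats):
    """Choose among the candidates proposed by the footprint method."""
    if len(candidates) == 1:
        # A single candidate is deleted without consulting utility.
        return next(iter(candidates))
    # Several candidates: take the lowest utility, ties broken by name.
    return min(sorted(candidates), key=lambda c: utility(*stats[c]))


# Made-up sample values, for illustration only.
stats = {"x": (5, 2.0, 1.0), "y": (2, 1.0, 3.0), "z": (4, 1.0, 1.0)}
chosen = select_candidate({"x", "y", "z"}, stats)
```

The same function could stand in for each of the four Select procedures, since the category decides which candidates are passed in.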

Further Applications
The competence modeling approach may be used during the initial case acquisition stage of system development. It is often undesirable to store every available case in the initial case base:
1. Utility problem;
2. Irrelevant cases may introduce noise into the retrieval stage and lead to the selection of suboptimal cases or difficulties in tuning the similarity metric.
The competence modeling approach may also be used during the authoring process.

CBR Systems Authoring Process
Case-base authoring can be a long, difficult, and tedious process, and the only advice given to the author is often of the “choose representative cases” variety. This can ultimately lead to the development of poor case bases, which offer limited coverage of the target problem space and which include significant redundancy.

CASCADE (Case Authoring Support & Development Environment)
Keeps the knowledge engineer informed about how case authoring is progressing and, in particular, how case-base competence is evolving.
Extends the case competence model proposed by Smyth and Keane.

Competence Groups
A competence group is a collection of related cases. The key idea underlying the definition of a competence group is that of shared coverage. Two cases exhibit shared coverage if their coverage or reachability sets overlap.
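One way to compute such groups is sketched below. It assumes that each case comes with a precomputed set combining its coverage and reachability sets (the hypothetical `related` mapping), and that a group is a maximal collection of cases linked, directly or through a chain, by shared coverage. The second assumption is an interpretation and is not stated on the slide.

```python
def competence_groups(related):
    """Partition cases into groups linked by overlapping related sets.

    related: maps a case identifier to the union of its coverage and
    reachability sets (assumed to be computed elsewhere).
    """
    groups = []
    unvisited = set(related)
    while unvisited:
        start = unvisited.pop()
        group, frontier = {start}, [start]
        while frontier:
            current = frontier.pop()
            # Cases that share coverage with the current case.
            linked = {o for o in unvisited if related[current] & related[o]}
            unvisited -= linked
            group |= linked
            frontier.extend(linked)
        groups.append(group)
    return groups


# Made-up sample values, for illustration only.
related = {"c1": {1, 2}, "c2": {2, 3}, "c3": {3, 4}, "c4": {8, 9}}
groups = competence_groups(related)
```

Here c1 and c3 land in the same group even though their sets do not overlap directly, because both share coverage with c2.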

The Evolution of Competence
In general, as cases are added to the case base, one of four things can happen:
1. New groups are created;
2. Existing competence groups grow in size and coverage;
3. A number of existing groups merge to form a new ‘super’ group;
4. Existing groups grow in size but without increasing coverage.
Conversely, as cases are deleted, groups may disappear altogether, or they may split into smaller ‘sub’ groups.

The Competence Visualization Tool

Competence Regions

The Competence Visualization Tool – Examples

The Competence Visualization Tool – Examples (Cont.)

Conclusion
Experience with the growing number of large-scale CBR systems has led to increasing recognition of the importance of case-base maintenance. Multiple researchers have addressed pieces of the CBM problem, considering such issues as maintaining consistency and controlling case-base growth. The authoring process can be improved in order to avoid the development of poor case bases.

References
Smyth, B. and Keane, M. “Remembering to forget: A competence-preserving case deletion policy for case-based reasoning systems”, in Proc. of the 14th International Joint Conf. on Artificial Intelligence, Montreal, Canada, Morgan Kaufmann, 1995, pp
Leake, D. B. and Wilson, D. C. “Categorizing case-base maintenance: Dimensions and directions”, in Proc. of the 4th European Workshop on Case-Based Reasoning, Dublin, Ireland, Springer-Verlag, 1998, pp
McKenna, E. and Smyth, B. “An Interactive Visualisation Tool for Case-Based Reasoners”, Journal of Applied Intelligence: Special Issue on Interactive Case-Based Reasoning, 2000