Case-Based Reasoning Ramon López de Mántaras Badia IIIA - CSIC www.iiia.csic.es/~mantaras.

A physician, after examining a patient, is reminded of another patient treated a few days earlier. Assuming that the reminding was caused by a similarity among relevant symptoms, the physician takes into account the diagnosis and treatment of the previous patient in order to diagnose and treat the current one. A financial advisor working on a credit allocation to a company recommends that the credit be refused, based on a previous case involving a company in a similar financial situation that went bankrupt. These are two typical CBR situations.

Case-Based Reasoning (CBR)
Case-based reasoning (CBR) is a technique based on analogical reasoning. A case-based reasoner solves new problems by adapting solutions that were used to solve old (similar) problems. The main idea is to reuse previous experiences for current problems. The difficulty arises when the current situation is not identical to the previous one, so that only a partial match is available. The central notion in CBR is therefore the concept of similarity. Here we will briefly describe its main features.

Case-Based Reasoning (CBR)
Basic ideas:
–Memorize previous experiences (cases)
–Solve new problems by retrieving and reusing similar cases
–Store the new experience again
CBR is a well-founded technology:
–Mathematically (works by Dubois, Prade, Godo, Esteva, LdeM)
–Algorithmically
–Cognitively (works by Schank and his students)
–Supported by experiments and applications
–Business success

What is a Case?
A case has two parts:
–Description of a problem or a set of problems (generalized case)
–Description of the solution of this problem (formally or informally)
Possible additions include explanations, comments on the quality of the solution, etc.
Cases represent experiences: they record the solution of a problem solved in the past (and sometimes also how the problem was solved).

Case Representation
Many different case representations are used, depending on the requirements of the domain and task:
Flat feature-value list
–Simple case structure is sometimes sufficient for problem solving
–Easy to store and retrieve in a CBR system…but inefficient
Object-oriented representations
–Case: collection of objects (instances of classes)
–Required for complex and structured objects
For special tasks:
–Predicate logic: case = set of atomic formulas
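A flat feature-value case can be sketched in Python roughly as follows. The class name `Case`, its field names, and the method `is_solved` are illustrative assumptions rather than anything the slides prescribe; the example values are taken from the car-diagnosis overview later in the deck.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Case:
    """A flat feature-value case: a problem description plus its solution."""
    problem: Dict[str, Any]                                  # feature name -> observed value
    solution: Dict[str, Any] = field(default_factory=dict)   # empty for an unsolved problem

    def is_solved(self) -> bool:
        # A new problem is simply a case whose solution part is still empty.
        return bool(self.solution)


# Illustrative values only (feature names are assumptions).
example = Case(
    problem={"symptom": "engine does not start", "battery_voltage": 6.3},
    solution={"diagnosis": "battery empty", "repair": "charge battery"},
)
```

A dictionary keeps the structure as simple as the slide suggests; an object-oriented representation would replace the dictionary values with instances of domain classes.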

How to Use a Case Solution?
[Figure labels: Input Problem, Problem of the case, Solution of the case, Solution adaptation]
In general, there is no guarantee of getting good solutions, because the case may be “too far away” from the problem. The question therefore arises of how to define when a case is “close enough”.

How to Use a Case Base
–A case base is a database of cases.
–If a new problem arises, use a case from the case base to solve it.
–The more cases we have, the higher the chance of finding one with a suitable solution.
–Because the given problem is usually not exactly in the base, one wants to retrieve a case that solved a problem which is “similar enough to be useful”.
–Hence, the notion of similarity is central to CBR.

The Classical CBR R4-Cycle [from: Aamodt & Plaza, 1994]
–Retrieve: Determine the most similar case(s).
–Reuse: Solve the new problem by reusing information and knowledge in the retrieved case(s).
–Revise: Evaluate the applicability of the proposed solution in the real world.
–Retain: Update the case base with the newly learned case for future problem solving.
[Figure: cycle diagram with the steps Retrieve, Reuse, Revise, Retain; labels: Problem, New Case, Retrieved Case, Solved Case, Suggested Solution, Tested/Repaired Case, Confirmed Solution, Learned Case, Case Base, Knowledge]
This cycle shows the main activities in CBR.
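The four steps can be sketched as a single function. This is a minimal sketch, not the formulation from Aamodt & Plaza: the function name `cbr_cycle`, the type aliases, and the caller-supplied `similarity`, `adapt` and `confirm` callables are all assumptions introduced for illustration, and the case base is assumed to be non-empty.

```python
from typing import Callable, Dict, List, Tuple

Problem = Dict[str, object]
Solution = Dict[str, object]
CaseBase = List[Tuple[Problem, Solution]]


def cbr_cycle(
    new_problem: Problem,
    case_base: CaseBase,
    similarity: Callable[[Problem, Problem], float],
    adapt: Callable[[Problem, Problem, Solution], Solution],
    confirm: Callable[[Problem, Solution], bool],
) -> Solution:
    # Retrieve: pick the stored case whose problem is most similar
    # (assumes the case base is not empty).
    old_problem, old_solution = max(
        case_base, key=lambda case: similarity(new_problem, case[0])
    )
    # Reuse: adapt the retrieved solution to the new problem.
    suggested = adapt(new_problem, old_problem, old_solution)
    # Revise: evaluate the suggested solution (here a caller-supplied check).
    if confirm(new_problem, suggested):
        # Retain: store the confirmed case for future problem solving.
        case_base.append((new_problem, suggested))
    return suggested
```

Passing the three steps in as functions keeps the skeleton independent of any particular similarity measure or adaptation strategy.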

Retrieve: Modeling Similarity
Similarity-based retrieval computes an inexact match. Different approaches are used depending on the case representation:
Similarity measures:
–Are functions that compare two cases, sim: Case × Case → [0, 1]
–Local similarity measure: similarity on feature level
–Global similarity measure: similarity on case or object level
Graph matching
–Uses anti-unification (most specific generalization)
Knowledge-intensive retrieval

Similarities (1)
They operate on problem descriptions.
Basic assumption: The more similar two problem descriptions C and D are, the more useful it is to use one of the solutions also for the other problem (the more similar the problems, the more similar the solutions).

Similarities (2)
Given a fixed problem C, a similarity measure induces a partial ordering (to be more or less similar to C) on the set of problems and therefore also on the case base. Under the basic assumption, “more similar” also means “more useful” with respect to the solutions. Therefore the similarity measure controls the utility when inexact solutions are employed.

Similarities (4)
The similarity measure is the central element for navigating through the space of possible solutions. Rather than delivering the exact solution, similarity is a means of approximating it. Even when the exact or optimal solution is not available or is too difficult to achieve, one still comes up with at least a suggestion for the solution.

A Typical Similarity Measure
Given two problem descriptions C1, C2 and p attributes y_1, ..., y_p used for the representation:
$$\mathrm{SIM}(C_1, C_2) = \frac{\sum_{j=1}^{p} w_j \cdot \mathrm{sim}_j(C_1, C_2)}{\sum_{j=1}^{p} w_j}$$
–sim_j: similarity for attribute y_j (local measure)
–w_j: describes the relevance of attribute j for the problem
Problem: How to evaluate the importance of the different features? (Importance is context dependent.) One solution: learning the weights.
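The weighted average above translates almost directly into code. The following is a sketch under assumptions: cases are plain mappings from attribute names to values, and the names `global_similarity`, `local_sims` and `weights` are hypothetical.

```python
from typing import Callable, Dict, Mapping

LocalSim = Callable[[object, object], float]


def global_similarity(
    c1: Mapping[str, object],
    c2: Mapping[str, object],
    local_sims: Dict[str, LocalSim],
    weights: Dict[str, float],
) -> float:
    """Weighted average of local similarities over the attributes in `weights`."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(
        w * local_sims[attr](c1[attr], c2[attr]) for attr, w in weights.items()
    )
    # Dividing by the sum of weights keeps the result in [0, 1]
    # as long as every local similarity is in [0, 1].
    return weighted / total_weight
```

Keeping the weights in a separate dictionary makes it straightforward to swap in learned weights later without touching the local measures.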

Retrieval: Finding the Nearest Neighbor
For a new problem N, the nearest neighbor in the case base is the case (C,S) whose problem C has the greatest similarity to N. Its solution S is expected to be the most useful and is then the best solution the case base can offer. Classical databases always use total similarity (i.e. equality). In similarity-based systems, database access is replaced by the search for the nearest neighbor, which can be regarded as an optimization process. This requires more effort but can be much more useful.
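For a small case base, nearest-neighbor retrieval can be sketched as a linear scan. The function name `nearest_neighbor` and the representation of the case base as a list of (problem, solution) pairs are assumptions made for this illustration.

```python
from typing import Callable, List, Tuple, TypeVar

P = TypeVar("P")
S = TypeVar("S")


def nearest_neighbor(
    new_problem: P,
    case_base: List[Tuple[P, S]],
    sim: Callable[[P, P], float],
) -> Tuple[P, S, float]:
    """Linear scan returning the (problem, solution, similarity) with the highest similarity."""
    if not case_base:
        raise ValueError("case base is empty")
    best_problem, best_solution = max(
        case_base, key=lambda case: sim(new_problem, case[0])
    )
    return best_problem, best_solution, sim(new_problem, best_problem)
```

Unlike an equality lookup, this always returns some case, which is why the returned similarity value is worth inspecting before the solution is trusted.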

Thresholds
The nearest neighbor (in the given case base) is not always sufficient for providing an acceptable solution. On the other hand, a case which is not the nearest neighbor may still be sufficient. For this purpose one can introduce two thresholds α and β, 0 < α < β < 1, with the intention:
–If sim(newproblem, caseproblem) < α then the case is not accepted;
–If sim(newproblem, caseproblem) > β then the case is accepted.
This partitions the case base (for the current problem) into three parts: accepted cases, unaccepted cases and an uncertainty set.
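The three-way partition induced by the two thresholds can be sketched as follows. The parameter names `lower` and `upper` stand for the smaller and larger threshold; they, the function name, and the dictionary keys are assumptions chosen for readability.

```python
from typing import Dict, List, Tuple


def partition_by_thresholds(
    scored_cases: List[Tuple[str, float]],
    lower: float,
    upper: float,
) -> Dict[str, List[str]]:
    """Split (case_id, similarity) pairs into accepted, rejected and uncertain sets."""
    if not 0 < lower < upper < 1:
        raise ValueError("expected 0 < lower < upper < 1")
    parts: Dict[str, List[str]] = {"accepted": [], "rejected": [], "uncertain": []}
    for case_id, similarity in scored_cases:
        if similarity > upper:
            parts["accepted"].append(case_id)
        elif similarity < lower:
            parts["rejected"].append(case_id)
        else:
            # Between the two thresholds (inclusive): the uncertainty set.
            parts["uncertain"].append(case_id)
    return parts
```

Cases in the uncertainty set are the natural candidates for a closer look, for example by the user or by a more expensive similarity assessment.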

Retrieve: Efficiency Issues
Efficient case retrieval is essential for large case bases. Different approaches are used depending on:
–the representation
–the complexity of the similarity computation
–the size of the base
Organization of the base:
–Linear lists, only for small bases
–Index structures for large bases, e.g., discrimination trees
How to store cases:
–Databases: for large bases or if shared with other applications
–Main memory: for small bases, not shared

Reuse: How to Adapt the Solution
–No modification of the solution: just copy.
–Manual/interactive solution adaptation by the user.
–Automatic solution adaptation:
  –Transformational analogy: transformation of the solution. Rules or operators adjust the solution w.r.t. differences in the problems; this requires knowledge about the impact of those differences.
  –Compositional adaptation: combine several cases into a single solution.

Revise: Verify and Correct Solution
Revision or adaptation phase:
–No revise phase
–Verification of the solution by computer simulation
–Verification / evaluation of the solution in the real world
Criteria for revision:
–Correctness of the solution
–Quality of the solution
–Other, e.g., user preferences

Retain: Learning from Problem Solving
What can be learned:
–New experience (new case)
–Improved similarity assessment, importance of features
–Organization/indexing of the case base to improve efficiency
–Knowledge for solution adaptation
–Forgetting cases, e.g., for efficiency or because they are out of date
Methods:
–Storing cases in the case base (may include a generalization process)
–Deleting cases from the case base
CBR is a “lazy learning” method.
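Storing and forgetting can be combined in one small retain step. This sketch assumes one particular forgetting policy (drop the oldest cases once a size limit is exceeded); the class `StoredCase`, its `year_recorded` field and the `max_size` parameter are hypothetical, and other policies are equally possible.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredCase:
    name: str
    year_recorded: int   # hypothetical bookkeeping field used for forgetting


def retain(case_base: List[StoredCase], new_case: StoredCase, max_size: int) -> None:
    """Store a new case; if the base grows beyond max_size, forget the oldest cases."""
    case_base.append(new_case)
    if len(case_base) > max_size:
        # Sort oldest first, then delete the surplus from the front.
        case_base.sort(key=lambda c: c.year_recorded)
        del case_base[: len(case_base) - max_size]
```

Because learning is "lazy", nothing is generalized at storage time here; all the work is deferred to retrieval.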

A Simple Example (I) Overview
Technical Diagnosis of Car Faults:
–symptoms are observed (e.g., engine doesn’t start) and values are measured (e.g., battery voltage = 6.3 V)
–goal: find the cause of the failure (e.g., battery empty) and a repair strategy (e.g., charge battery)
Case-Based Diagnosis:
–a case describes a diagnostic situation and contains:
  –problem part: description of the symptoms
  –solution part: description of the failure and the cause, and description of a repair strategy
What to do:
–store a collection of cases in a case base
–find a case similar to the current problem and reuse its repair strategy

A Simple Example (II) What Does a Case Look Like?
–A case describes one particular diagnostic situation.
–A case records several features and the specific values that occurred in that situation.
CASE 1
Problem (Symptoms)
  Problem: Front light doesn’t work
  Car: VW Golf IV, 1.6 l
  Year: 1998
  Battery voltage: 13.6 V
  State of lights: OK
  State of light switch: OK
Solution
  Diagnosis: Front light fuse defect
  Repair: Replace front light fuse

A Simple Example (III) A Case Base With Two Cases
–Each case describes one particular situation.
–All cases are independent of each other.
CASE 1
Problem (Symptoms)
  Problem: Front light doesn’t work
  Car: VW Golf IV, 1.6 l
  Year: 1998
  Battery voltage: 13.6 V
  State of lights: OK
  State of light switch: OK
Solution
  Diagnosis: Front light fuse defect
  Repair: Replace front light fuse
CASE 2
Problem (Symptoms)
  Problem: Front light doesn’t work
  Car: Audi A4
  Year: 1997
  Battery voltage: 12.9 V
  State of lights: surface damaged
  State of light switch: OK
Solution
  Diagnosis: Bulb defect
  Repair: Replace front light

A Simple Example (IV) Solving a New Diagnostic Problem
–A new problem has to be solved.
–We make several observations in the current situation.
–The observations define a new problem.
–Not all feature values have to be known.
Note: The new problem is a “case” without a solution part.
Problem (Symptoms)
  Problem: Brake light doesn’t work
  Car: Audi 80
  Year: 1989
  Battery voltage: 12.6 V
  State of lights: OK

A Simple Example (V)
Compare the new problem with each case and select the most similar case.
–When are two cases similar?
–How to rank the cases according to their similarity?
We can assess similarity based on the similarity of each feature. The similarity of each feature depends on the feature values. BUT: the importance of different features may differ.
[Figure: the new problem is compared with each case x ("Similar?")]

A Simple Example (VI) Similarity Computation
Compute similarities on feature values, expressing the degree of similarity by a real number between 0 (not similar) and 1 (equal).
Examples:
–Feature: Problem
  sim(Front light doesn’t work, Brake light doesn’t work) = 0.8
  sim(Front light doesn’t work, Engine doesn’t start) = 0.1
  [Figure: taxonomy with the nodes Car comp., Electric components, Front lights, Brake lights]
–Feature: Battery voltage (similarity depends on the difference)
  sim(12.6 V, 13.6 V) = 0.9
  sim(12.6 V, 6.7 V) = 0.1
Different features have different importance (weights)!
–High importance: Problem, Battery voltage, State of lights, ...
–Low importance: Car, Year, ...
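Local similarity measures for the two kinds of feature above might look like the following sketch. The linear fall-off with an assumed `value_range` is only one possible choice and is not claimed to reproduce the slide's voltage values; the lookup table holds the two problem similarities as read from the slide, and treating unlisted pairs as 0.0 is an assumption.

```python
def numeric_similarity(a: float, b: float, value_range: float) -> float:
    """Local similarity that decreases linearly with the absolute difference."""
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    return max(0.0, 1.0 - abs(a - b) / value_range)


# Symbolic features can use a lookup table of pairwise similarities.
PROBLEM_SIMILARITY = {
    frozenset({"front light doesn't work", "brake light doesn't work"}): 0.8,
    frozenset({"front light doesn't work", "engine doesn't start"}): 0.1,
}


def symbolic_similarity(a: str, b: str) -> float:
    """Table-based local similarity; identical values are fully similar."""
    if a == b:
        return 1.0
    # Assumption: pairs missing from the table count as not similar.
    return PROBLEM_SIMILARITY.get(frozenset({a, b}), 0.0)
```

Using `frozenset` keys makes the table symmetric, so the order of the two arguments does not matter.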

A Simple Example (VII) Compare New Problem and Case 1
Similarity computation by weighted average (very important feature: weight = 6; less important feature: weight = 1):
similarity(new, case 1) = 1/20 * [ 6*0.8 + 1*0.4 + 1*0.6 + 6*0.9 + 6*1.0 ] = 0.86
Feature | New problem | Case 1 | Similarity | Weight
Problem | Brake light doesn’t work | Front light doesn’t work | 0.8 | 6
Car | Audi 80 | VW Golf III, 1.6 l | 0.4 | 1
Year | 1989 | 1996 | 0.6 | 1
Battery voltage | 12.6 V | 13.6 V | 0.9 | 6
State of lights | OK | OK | 1.0 | 6
State of light switch | (not observed) | OK | (not compared) |
Solution of Case 1
  Diagnosis: Front light fuse defect
  Repair: Replace front light fuse

A Simple Example (VIII) Compare New Problem and Case 2
Similarity computation by weighted average (very important feature: weight = 6; less important feature: weight = 1):
similarity(new, case 2) = 1/20 * [ 6*0.8 + 1*0.8 + 1*0.4 + 6*0.95 + 6*0 ] = 0.585
Feature | New problem | Case 2 | Similarity | Weight
Problem | Brake light doesn’t work | Front light doesn’t work | 0.8 | 6
Car | Audi 80 | Audi A4 | 0.8 | 1
Year | 1989 | 1997 | 0.4 | 1
Battery voltage | 12.6 V | 12.9 V | 0.95 | 6
State of lights | OK | surface damaged | 0 | 6
State of light switch | (not observed) | OK | (not compared) |
Solution of Case 2
  Diagnosis: Bulb defect
  Repair: Replace front light
Case 1 is more similar, due to the feature “State of lights”.
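The two weighted averages can be checked with a few lines of Python. The feature keys are hypothetical names; the weights and local similarity values are the ones used in the two formulas above, assigned to features in the order in which the features are listed.

```python
# Weights: 6 for very important features, 1 for less important ones.
weights = {"problem": 6, "car": 1, "year": 1, "battery_voltage": 6, "state_of_lights": 6}

# Local similarities between the new problem and each stored case.
local_sims_case1 = {"problem": 0.8, "car": 0.4, "year": 0.6,
                    "battery_voltage": 0.9, "state_of_lights": 1.0}
local_sims_case2 = {"problem": 0.8, "car": 0.8, "year": 0.4,
                    "battery_voltage": 0.95, "state_of_lights": 0.0}


def weighted_average(local_sims: dict, weights: dict) -> float:
    """Sum of weight * local similarity, divided by the sum of the weights (here 20)."""
    total = sum(weights.values())
    return sum(weights[f] * local_sims[f] for f in weights) / total


sim1 = weighted_average(local_sims_case1, weights)  # approximately 0.86
sim2 = weighted_average(local_sims_case2, weights)  # approximately 0.585
```

The gap between the two results comes almost entirely from the heavily weighted "State of lights" feature, which contributes 6*1.0 for Case 1 and 6*0 for Case 2.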

A Simple Example (IX) Reuse the Solution of Case 1
CASE 1
Problem (Symptoms)
  Problem: Front light doesn’t work
  ...
Solution
  Diagnosis: Front light fuse defect
  Repair: Replace front light fuse
New problem
Problem (Symptoms)
  Problem: Brake light doesn’t work
  Car: Audi 80
  Year: 1989
  Battery voltage: 12.6 V
  State of lights: OK
Adapt solution: How do differences in the problem affect the solution?
New solution
  Diagnosis: Brake light fuse defect
  Repair: Replace brake light fuse
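One very simple way to carry out this adaptation automatically is a substitution rule that replaces the component named in the old solution with the component named in the new problem. This is only a sketch of one transformational rule; the function name `adapt_by_substitution` and the dictionary layout are assumptions, and real adaptation knowledge is usually richer than a text substitution.

```python
def adapt_by_substitution(solution: dict, old_component: str, new_component: str) -> dict:
    """Transformational adaptation by substitution: swap the component named in each field."""
    adapted = {}
    for key, text in solution.items():
        # Handle both capitalised and lower-case mentions of the component.
        text = text.replace(old_component.capitalize(), new_component.capitalize())
        text = text.replace(old_component, new_component)
        adapted[key] = text
    return adapted


case1_solution = {
    "diagnosis": "Front light fuse defect",
    "repair": "Replace front light fuse",
}
new_solution = adapt_by_substitution(case1_solution, "front light", "brake light")
```

The rule encodes the domain assumption that a fuse fault in one light circuit has an analogous fault in another; whether that holds is exactly what the revise step has to check.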

A Simple Example (X) Store the New Experience
If the diagnosis is correct: store the new case in the memory.
CASE 3
Problem (Symptoms)
  Problem: Brake light doesn’t work
  Car: Audi 80
  Year: 1989
  Battery voltage: 12.6 V
  State of lights: OK
  State of light switch: OK
Solution
  Diagnosis: Brake light fuse defect
  Repair: Replace brake light fuse

A glimpse at a complex system (SaxEx)

Advantages of CBR over Other Techniques
–Reduces (but does not eliminate) the knowledge acquisition effort
–Requires less maintenance effort
–Improves problem-solving performance through reuse
–Improves over time and adapts to changes in the environment
–High user acceptance

Reduce Knowledge Acquisition Effort
CBR systems:
–Require less general knowledge
–Most knowledge is in the case base
–Case knowledge may be easier to acquire (sometimes already available)
–Problem: adaptation knowledge
Traditional knowledge-based systems (KBS):
–Acquisition of general knowledge is more difficult!
[Figure labels: Problem, Solution, KBS, Domain Knowledge, Knowledge Acquisition, Knowledge Base]

Less Effort Required for Maintenance
What is the impact of changes?
Rule bases or models are difficult to maintain:
–Many dependencies between rules: the effects of changes to the rule base are hard to predict
–Rules of a KBS are often difficult to understand for non-AI experts
–Maintenance by the domain expert is almost impossible!
Case bases are easier to maintain:
–Cases are independent of each other
–Domain experts and novices understand cases quite easily
–Maintenance of a CBR system is done (partially) by adding/deleting cases

When CBR is relevant
–When a domain theory does not exist, but example cases are easy to find
–When an expert in the domain is not available, is too expensive, or is unable to articulate verbally how they perform, but example cases are easy to find
–When it is difficult to specify domain rules, but example cases are easy to find
–When cases with similar solutions have similar problem descriptions (i.e. there exists a similarity metric for problem descriptions and a corresponding set of adaptation rules)
–When a case base already exists

Summary
–CBR is a (relatively) new AI approach able to use specific knowledge of past experiences.
–CBR has produced interesting techniques and applications, but further research is needed, particularly in:
  –How to index cases
  –Structural and quantitative matching methods
  –Integration with other AI techniques
  –Enriched representations
  –Forgetting mechanisms
  –Case base maintenance
  –Explanation mechanisms
