Similarity in CBR (Cont’d) Sources: –Chapter 4 –www.iiia.csic.es/People/enric/AICom.html –www.ai-cbr.org.

1 Similarity in CBR (Cont’d) Sources: –Chapter 4 –www.iiia.csic.es/People/enric/AICom.html –www.ai-cbr.org

2 Simple-Matching-Coefficient (SMC)
- $H(X,Y) = n - (A + D) = B + C$
- Another distance-similarity compatible function is $f(x) = 1 - x/\mathit{max}$ (where $\mathit{max}$ is the maximum value for $x$)
- We can define the SMC similarity, $\mathrm{sim}_H$: $\mathrm{sim}_H(X,Y) = 1 - \frac{n - (A+D)}{n} = \frac{A+D}{n} = 1 - \frac{B+C}{n}$, where $\frac{B+C}{n}$ is the proportion of the difference
Solution (I): Show that $f(x)$ is order inverting: if $x < y$ then $f(x) > f(y)$
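The SMC formulas above can be tried out on small binary vectors. The sketch below is only an illustration: the slide does not restate what A, B, C and D count, so it assumes the usual contingency reading (A = both 1, B = X is 1 and Y is 0, C = X is 0 and Y is 1, D = both 0). The function names are made up for this example.

```python
def contingency_counts(x, y):
    """Return (A, B, C, D) for two equal-length binary vectors.

    Assumed reading (not stated on the slide):
    A = both 1, B = x is 1 and y is 0, C = x is 0 and y is 1, D = both 0.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal length")
    a = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 1)
    b = sum(1 for xi, yi in zip(x, y) if xi == 1 and yi == 0)
    c = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 1)
    d = sum(1 for xi, yi in zip(x, y) if xi == 0 and yi == 0)
    return a, b, c, d


def hamming(x, y):
    """H(X,Y) = n - (A + D) = B + C: the number of mismatching positions."""
    _, b, c, _ = contingency_counts(x, y)
    return b + c


def sim_smc(x, y):
    """sim_H(X,Y) = (A + D) / n = 1 - (B + C) / n."""
    a, _, _, d = contingency_counts(x, y)
    return (a + d) / len(x)


x = [1, 0, 1, 1, 0]
y = [1, 1, 0, 1, 0]
print(hamming(x, y), sim_smc(x, y))
```

Here the two vectors differ in two of five positions, so the distance is 2 and the similarity is 3/5.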

3 Simple-Matching-Coefficient (SMC) (II)
If we write $\mathrm{sim}_H(X,Y) = 1 - \frac{B+C}{n} = \mathrm{factor}(A, B, C, D)$:
- Monotonic:
  - If $A \le A'$ then: $\mathrm{factor}(A,B,C,D) \le \mathrm{factor}(A',B,C,D)$
  - If $B \le B'$ then: $\mathrm{factor}(A,B',C,D) \le \mathrm{factor}(A,B,C,D)$
  - If $C \le C'$ then: $\mathrm{factor}(A,B,C',D) \le \mathrm{factor}(A,B,C,D)$
  - If $D \le D'$ then: $\mathrm{factor}(A,B,C,D) \le \mathrm{factor}(A,B,C,D')$
- Symmetric: $\mathrm{sim}_H(X,Y) = \mathrm{sim}_H(Y,X)$
Solution (II): Show that $\mathrm{sim}_H(X,Y)$ is monotonic

4 Variations of SMC (III)
- $\mathrm{sim}_H(X,Y) = \frac{A+D}{n} = \frac{A+D}{A+B+C+D}$
- We introduce a weight, $\alpha$, with $0 < \alpha < 1$: $\mathrm{sim}_\alpha(X,Y) = \frac{\alpha(A+D)}{\alpha(A+D) + (1-\alpha)(B+C)}$
- For which $\alpha$ is $\mathrm{sim}_\alpha(X,Y) = \mathrm{sim}_H(X,Y)$? $\alpha = 0.5$
- $\mathrm{sim}_\alpha(X,Y)$ preserves the monotonic and symmetric conditions
Solution (III): Show that $\mathrm{sim}_\alpha(X,Y)$ is monotonic
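A small numeric check makes the role of the weight concrete. The following is a minimal sketch, assuming the weight symbol lost in the transcript is written `alpha` and that the counts A, B, C, D are already available; the function name `sim_alpha` is invented for this example.

```python
def sim_alpha(a, b, c, d, alpha):
    """Weighted SMC: alpha*(A+D) / (alpha*(A+D) + (1-alpha)*(B+C)), 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    matches = alpha * (a + d)            # weighted agreement
    mismatches = (1 - alpha) * (b + c)   # weighted disagreement
    total = matches + mismatches
    if total == 0:
        raise ValueError("at least one of A, B, C, D must be positive")
    return matches / total


# With A=2, B=1, C=1, D=1 the plain SMC is (A+D)/n = 3/5.
for alpha in (0.2, 0.5, 0.8):
    print(alpha, sim_alpha(2, 1, 1, 1, alpha))
```

At `alpha = 0.5` the two factors of 0.5 cancel and the value equals the plain SMC; larger weights emphasize the matches, smaller weights emphasize the mismatches.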

5 Homework (Part IV): Attributes May Have Multiple Values
- $X = (X_1, \ldots, X_n)$ where $X_i \in T_i$
- $Y = (Y_1, \ldots, Y_n)$ where $Y_i \in T_i$
- Each $T_i$ is finite
Define a formula for the Hamming distance in this context

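6 Tversky Contrast Model
- Defines a non-monotonic distance
- Comparison of a situation S with a prototype P (i.e., a case)
- S and P are sets of features
- The following sets: $A = S \cap P$, $B = P - S$, $C = S - P$
[Figure: Venn diagram of S and P, with A the overlap, B the part of P outside S, and C the part of S outside P]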

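7 Tversky Contrast Model (2)
Tversky-distance: $T(P,S) = \alpha f(A) - \beta f(B) - \gamma f(C)$
where $f: \mathrm{Sets} \to [0, \infty)$ and $\alpha$, $\beta$, and $\gamma$ are constants; $f$, $\alpha$, $\beta$, and $\gamma$ are fixed and defined by the user
Example:
- If $f(A)$ = number of elements in $A$ and $\alpha = \beta = \gamma = 1$, then $T$ counts the number of elements in common minus the differences
The Tversky-distance is not symmetric in general: swapping P and S exchanges B and C, so $T(P,S) \ne T(S,P)$ whenever $\beta \ne \gamma$ and $f(B) \ne f(C)$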
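The contrast model is easy to experiment with on Python sets. This is a minimal sketch, assuming the three constants lost in the transcript are called `alpha`, `beta` and `gamma`, and using set cardinality as the default `f`; the feature names are invented for illustration.

```python
def tversky(p, s, alpha=1.0, beta=1.0, gamma=1.0, f=len):
    """T(P,S) = alpha*f(A) - beta*f(B) - gamma*f(C),
    with A = S & P, B = P - S, C = S - P."""
    a = s & p   # features shared by situation and prototype
    b = p - s   # features only in the prototype
    c = s - p   # features only in the situation
    return alpha * f(a) - beta * f(b) - gamma * f(c)


prototype = {"wheels", "engine", "doors"}
situation = {"wheels", "engine", "sail", "mast"}

# Equal constants: common features minus all differences.
print(tversky(prototype, situation))

# Unequal beta and gamma: the order of the arguments matters.
print(tversky(prototype, situation, beta=2.0))
print(tversky(situation, prototype, beta=2.0))
```

With all constants equal to 1 the result is 2 - 1 - 2 = -1. With `beta=2.0` the two argument orders give different values, which illustrates the lack of symmetry.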

8 Local versus Global Similarity Metrics
In many situations we have similarity metrics between attributes of the same type (called local similarity metrics). Example: for a complex engine, we may have a similarity for the temperature of the engine.
In such situations a reasonable approach to define a global similarity $\mathrm{sim}_\Phi(x,y)$ is to "aggregate" the local similarity metrics $\mathrm{sim}_i(x_i,y_i)$. A widely used practice is to require $\mathrm{sim}_\Phi(x,y)$ to increase monotonically with each $\mathrm{sim}_i(x_i,y_i)$.
What requirements should we give to $\mathrm{sim}_\Phi(x,y)$ in terms of the use of $\mathrm{sim}_i(x_i,y_i)$?

9 Local versus Global Similarity Metrics (Formal Definitions)
- A local similarity metric on an attribute $T_i$ is a similarity metric $\mathrm{sim}_i: T_i \times T_i \to [0,1]$
- A function $\Phi: [0,1]^n \to [0,1]$ is an aggregation function if:
  - $\Phi(0,0,\ldots,0) = 0$
  - $\Phi$ is monotonic non-decreasing on every argument
- Given a collection of $n$ similarity metrics $\mathrm{sim}_1, \ldots, \mathrm{sim}_n$, for attributes taking values from $T_i$, a global similarity metric is a similarity metric $\mathrm{sim}: V \times V \to [0,1]$, $V \subseteq T_1 \times \ldots \times T_n$, such that there is an aggregation function $\Phi$ with: $\mathrm{sim}(X,Y) = \mathrm{sim}_\Phi(X,Y) = \Phi(\mathrm{sim}_1(X_1,Y_1), \ldots, \mathrm{sim}_n(X_n,Y_n))$
Homework: provide an example of an aggregation function and of a non-aggregation function, and prove it. Show a global similarity metric.
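To see how the definitions fit together, here is one possible sketch that combines two local similarities through a weighted average. Everything concrete in it is an assumption made for illustration: the two-attribute case layout (temperature, fuel type), the temperature range 0 to 200, the equal weights, and all function names. A weighted average with non-negative weights maps all-zero inputs to 0 and never decreases when an argument grows, so it meets both conditions listed above.

```python
def local_numeric(x, y, lo, hi):
    """Hypothetical local similarity for a numeric attribute with values in [lo, hi]."""
    return 1.0 - abs(x - y) / (hi - lo)


def local_symbolic(x, y):
    """Hypothetical local similarity for a symbolic attribute: exact match or nothing."""
    return 1.0 if x == y else 0.0


def weighted_average(sims, weights):
    """Candidate aggregation function: normalized weighted sum of local similarities."""
    if len(sims) != len(weights):
        raise ValueError("need exactly one weight per local similarity")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)


def global_sim(x, y, weights=(0.5, 0.5)):
    """Global similarity of two cases of the assumed form (temperature, fuel type)."""
    sims = [
        local_numeric(x[0], y[0], 0.0, 200.0),  # assumed temperature range
        local_symbolic(x[1], y[1]),
    ]
    return weighted_average(sims, weights)


print(global_sim((90.0, "diesel"), (70.0, "diesel")))
```

In this example the temperature similarity is 1 - 20/200 = 0.9 and the fuel similarity is 1, so the global value is their average.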

10 Solution
Suppose that cases use an object oriented representation:
- Suppose that cases use a taxonomical representation. Describe how you would measure similarity and give a concrete example illustrating the process you described.
- Suppose that cases use a compositional representation. Describe how you would measure similarity and give a concrete example illustrating the process you described.
Suggestion: look at the book!

11 Frontiers of Knowledge
Dealing with numerical and non-numerical values:
- Aggregation of local similarity metrics into a global similarity metric helps
- but sometimes we don't have local similarity metrics

12 Homework (II)
From Chapter 5: what is the difference between completion and adaptation functions? What is their role in adaptation? Provide an example.
Show that Graph coloring is NP-complete:
- Assume that Constraint-SAT is NP-complete
- Definition. A constraint is a formula of the form $(x = y)$ or $(x \ne y)$, where $x$ and $y$ are variables that can take values from a set (e.g., {yellow, white, black, red, …})
- Definition. Constraint-SAT: given a conjunction of constraints, is there an instantiation of the variables that makes the conjunction true?
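The Constraint-SAT definition can be made concrete with a brute-force checker. This sketch only illustrates the definition and is not part of the requested proof; the encoding of a constraint as a triple `(x, op, y)` with `op` being `"=="` or `"!="` is an assumption for this example, and the search is exponential in the number of variables.

```python
from itertools import product


def constraint_sat(constraints, domain):
    """Return a satisfying assignment as a dict, or None if none exists.

    constraints: list of (x, op, y) triples, op is "==" or "!=" (assumed encoding).
    domain: the finite set of values every variable may take.
    """
    for _, op, _ in constraints:
        if op not in ("==", "!="):
            raise ValueError("op must be '==' or '!='")
    variables = sorted({v for x, _, y in constraints for v in (x, y)})
    # Try every instantiation of the variables over the domain.
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all((assignment[x] == assignment[y]) == (op == "==")
               for x, op, y in constraints):
            return assignment
    return None


# A triangle of "different" constraints, read as colouring three mutually adjacent nodes.
triangle = [("x", "!=", "y"), ("y", "!=", "z"), ("x", "!=", "z")]
print(constraint_sat(triangle, ["red", "white"]))
print(constraint_sat(triangle, ["red", "white", "black"]))
```

The triangle cannot be satisfied with two values but can with three, which mirrors the fact that a triangle graph needs three colours.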

