1
Mining the Most Interesting Rules
Roberto J. Bayardo Jr., Rakesh Agrawal
Presented by: Mohamed G. Elfeky
2
Introduction
Algorithms for mining rules:
  Constraint-based
  Heuristic (predictive rules)
  Interestingness-metric
Several interestingness metrics: confidence, support, laplace, gain, conviction
3
Generic Problem Statement
The rule: A → C
The input is (U, D, ≤, C, N):
  U is a set of conditions for the rule antecedent.
  D is a data-set.
  ≤ is a total order on rules.
  C is a condition for the rule consequent.
  N is a set of constraints on rules.
4
Optimized Rule Mining
Find a set A₁ ⊆ U such that:
  A₁ satisfies N,
  ¬∃ A₂ ⊆ U: A₂ satisfies N ∧ A₁ < A₂.
Any rule A → C with A = A₁ is optimal.
In general, this problem is NP-hard.
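The definition above can be made concrete with an exhaustive search. This is a minimal sketch, not the algorithm from the paper: it enumerates every subset of U, so it is exponential in |U| and only usable on toy inputs. The data-set `D`, the conditions in `U`, the consequent `C`, the minimum-coverage constraint and the use of confidence as the total order are all hypothetical choices for illustration.

```python
from itertools import combinations

def satisfies(record, conditions):
    """True if the record meets every (attribute, value) condition."""
    return all(record.get(attr) == val for attr, val in conditions)

def confidence(data, antecedent, consequent):
    """Fraction of records covered by the antecedent that also meet the consequent."""
    covered = [r for r in data if satisfies(r, antecedent)]
    if not covered:
        return 0.0
    return sum(satisfies(r, [consequent]) for r in covered) / len(covered)

def brute_force_optimal(U, value, constraint):
    """Scan every subset of U and keep the first highest-valued one that passes N."""
    best, best_value = None, None
    for size in range(len(U) + 1):
        for A in combinations(U, size):
            if not constraint(A):
                continue
            v = value(A)
            if best is None or v > best_value:
                best, best_value = A, v
    return best, best_value

# Hypothetical toy data-set D, condition set U and consequent C.
D = [
    {"outlook": "sunny", "windy": False, "play": "yes"},
    {"outlook": "sunny", "windy": True, "play": "no"},
    {"outlook": "rain", "windy": False, "play": "yes"},
    {"outlook": "rain", "windy": True, "play": "no"},
    {"outlook": "sunny", "windy": False, "play": "yes"},
]
U = [("outlook", "sunny"), ("windy", False)]
C = ("play", "yes")

best, best_value = brute_force_optimal(
    U,
    value=lambda A: confidence(D, A, C),  # total order: higher confidence is better
    constraint=lambda A: sum(satisfies(r, A) for r in D) >= 2,  # N: cover at least 2 records
)
```

Because ties are broken by enumeration order, the search returns one optimal antecedent; other antecedents with the same value are equally optimal under the total order.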
5
Partial-Order Optimized Rule Mining
Partial order vs. total order:
  Some rules may be incomparable.
  Several equivalence classes for optimal rules.
Find a set O ⊆ P(U) such that:
  ∀ A ∈ O: A is optimal,
  for each equivalence class that contains an optimal rule, exactly one member of this class is within O.
6
Monotonicity
f(x) is said to be monotone in x if:
  x₁ < x₂ ⇒ f(x₁) ≤ f(x₂)
f(x) is said to be anti-monotone in x if:
  x₁ < x₂ ⇒ f(x₁) ≥ f(x₂)
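The two definitions can be checked numerically on a finite sample of points. The sketch below is only a spot check under that assumption: passing on a sample does not prove monotonicity over the whole domain. The helper names are illustrative.

```python
def is_monotone(f, xs):
    """True if x1 < x2 implies f(x1) <= f(x2) for every pair of sample points."""
    return all(f(a) <= f(b) for a in xs for b in xs if a < b)

def is_anti_monotone(f, xs):
    """True if x1 < x2 implies f(x1) >= f(x2) for every pair of sample points."""
    return all(f(a) >= f(b) for a in xs for b in xs if a < b)

square_on_nonneg = is_monotone(lambda x: x * x, [0, 1, 2, 3])   # increasing on x >= 0
square_on_mixed = is_monotone(lambda x: x * x, [-2, -1, 0, 1])  # fails: (-2)^2 > (-1)^2
negation = is_anti_monotone(lambda x: -x, [1, 2, 3])
```

Note that the inequalities are non-strict, so a constant function is both monotone and anti-monotone.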
7
Optimality
SC-Optimality, PC-Optimality
  Definition
  Theoretical Implications
  Practical Implications
8
SC-Optimality: Definition
The partial order ≤_sc
For rules r₁ and r₂:
r₁ <_sc r₂ if and only if:
  sup(r₁) ≤ sup(r₂) ∧ conf(r₁) < conf(r₂), or
  sup(r₁) < sup(r₂) ∧ conf(r₁) ≤ conf(r₂).
Also, r₁ =_sc r₂ if and only if:
  sup(r₁) = sup(r₂) ∧ conf(r₁) = conf(r₂).
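The support/confidence order translates directly into a pair of predicates. In this minimal sketch, the `Rule` tuple and the sample values are hypothetical, and a third helper makes explicit that two rules can be incomparable, which is what distinguishes a partial order from a total one.

```python
from typing import NamedTuple

class Rule(NamedTuple):
    name: str
    sup: float
    conf: float

def sc_less(r1, r2):
    """r1 <_sc r2: r2 is at least as good on both measures and strictly better on one."""
    return (r1.sup <= r2.sup and r1.conf < r2.conf) or \
           (r1.sup < r2.sup and r1.conf <= r2.conf)

def sc_equal(r1, r2):
    """r1 =_sc r2: identical support and identical confidence."""
    return r1.sup == r2.sup and r1.conf == r2.conf

def sc_incomparable(r1, r2):
    """Neither rule is below the other and they are not equivalent."""
    return not (sc_less(r1, r2) or sc_less(r2, r1) or sc_equal(r1, r2))

a = Rule("a", sup=0.2, conf=0.9)
b = Rule("b", sup=0.5, conf=0.6)
c = Rule("c", sup=0.5, conf=0.9)
```

Here `a` and `b` are incomparable (one has higher confidence, the other higher support), while both lie below `c`.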
9
SC-Optimality: Definition (cont.)
The partial order ≤_s¬c
For rules r₁ and r₂:
r₁ <_s¬c r₂ if and only if:
  sup(r₁) ≤ sup(r₂) ∧ conf(r₁) > conf(r₂), or
  sup(r₁) < sup(r₂) ∧ conf(r₁) ≥ conf(r₂).
Also, r₁ =_s¬c r₂ if and only if:
  sup(r₁) = sup(r₂) ∧ conf(r₁) = conf(r₂).
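Given a finite set of rules, the optimal rules under either order are those with no rule strictly above them. The sketch below assumes rules are plain `(sup, conf)` pairs with made-up values, and uses a quadratic scan rather than any algorithm from the paper. It also shows that the s¬c test is the sc test applied with confidence negated.

```python
def sc_less(r1, r2):
    """r1 <_sc r2 on (sup, conf) pairs: higher support and higher confidence are better."""
    (s1, c1), (s2, c2) = r1, r2
    return (s1 <= s2 and c1 < c2) or (s1 < s2 and c1 <= c2)

def s_not_c_less(r1, r2):
    """r1 <_s¬c r2: higher support and LOWER confidence are better."""
    (s1, c1), (s2, c2) = r1, r2
    return sc_less((s1, -c1), (s2, -c2))  # negating confidence flips its direction

def optimal(rules, less):
    """Rules with no other rule strictly above them under the given order."""
    return [r for r in rules if not any(less(r, other) for other in rules)]

# Hypothetical (sup, conf) pairs.
rules = [(0.2, 0.9), (0.5, 0.6), (0.4, 0.7), (0.3, 0.1), (0.5, 0.2), (0.3, 0.5)]

upper_border = optimal(rules, sc_less)       # sc-optimal rules
lower_border = optimal(rules, s_not_c_less)  # s¬c-optimal rules
```

On this sample, `(0.3, 0.5)` appears in neither list: it is a non-optimal rule lying between the two borders.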
10
SC-Optimality: Definition (cont.)
[Figure: rules plotted on confidence and support axes. Legend: sc-optimal rule, s¬c-optimal rule, non-optimal rule.]
No optimal rules fall outside the borders.
11
SC-Optimality: Theoretical Implications
A total order ≤_t is implied by ≤_sc if:
  r₁ ≤_sc r₂ ⇒ r₁ ≤_t r₂, ∧
  r₁ =_sc r₂ ⇒ r₁ =_t r₂.
Consequence: a rule that is optimal for ≤_t can be found among the rules that are optimal for ≤_sc.
≤_t defined by f(r) is implied by ≤_sc if:
  f(r) is monotone in support, and
  f(r) is monotone in confidence.
12
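SC-Optimality: Theoretical Implications (cont.)
Interestingness metrics (θ and k are constants):
  laplace(r) = (sup(r) + 1) / (sup(r)/conf(r) + k)
  gain(r) = sup(r) × (1 − θ/conf(r))
  conviction(r) = P(¬C) / (1 − conf(r)), where P(¬C) is the fraction of records in D that do not satisfy C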
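Each metric on this slide depends on a rule only through its support and confidence, which a few small functions make visible. This is a sketch under stated assumptions: `sup` is treated as a record count (so `sup / conf` is the number of records matching the antecedent), `conf` is assumed to be greater than 0, the default values of `k` and `theta` are arbitrary, and `frac_not_c` is a hypothetical parameter name for the fraction of records that do not satisfy C.

```python
import math

def laplace(sup, conf, k=2):
    """(sup + 1) / (sup/conf + k); assumes conf > 0."""
    return (sup + 1) / (sup / conf + k)

def gain(sup, conf, theta=0.5):
    """sup * (1 - theta/conf); assumes conf > 0."""
    return sup * (1 - theta / conf)

def conviction(conf, frac_not_c):
    """frac_not_c / (1 - conf); treated as infinite when the rule never fails."""
    if conf >= 1.0:
        return math.inf
    return frac_not_c / (1 - conf)

# A rule matched by 30 records with confidence 0.6 (its antecedent covers 50 records).
example_laplace = laplace(30, 0.6, k=2)
example_gain = gain(30, 0.6, theta=0.5)
example_conviction = conviction(0.6, frac_not_c=0.5)
```

Because the consequent C is fixed for a given mining problem, `frac_not_c` is the same for every rule, so conviction ranks rules purely by confidence.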
13
PC-Optimality: Definition
The partial order ≤_pc
For rules r₁ and r₂:
r₁ <_pc r₂ if and only if:
  pop(r₁) ⊆ pop(r₂) ∧ conf(r₁) < conf(r₂), or
  pop(r₁) ⊂ pop(r₂) ∧ conf(r₁) ≤ conf(r₂).
Also, r₁ =_pc r₂ if and only if:
  pop(r₁) = pop(r₂) ∧ conf(r₁) = conf(r₂).
14
PC-Optimality: Definition (cont.)
pop(A → C) is the set of records from D that satisfy both A and C.
|pop(r)| = sup(r) × |D|
The partial order ≤_p¬c is defined analogously.
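Replacing support by the population set turns the numeric comparison into a subset test. In the sketch below, populations are hypothetical sets of record ids and Python's `<=` and `<` on frozensets stand for ⊆ and ⊂. For the side-by-side sc comparison, support is taken as the population size, which is an assumption of this example.

```python
from typing import FrozenSet, NamedTuple

class Rule(NamedTuple):
    pop: FrozenSet[int]  # ids of the records satisfying both A and C
    conf: float

def pc_less(r1, r2):
    """r1 <_pc r2: subset test on populations combined with the confidence comparison."""
    return (r1.pop <= r2.pop and r1.conf < r2.conf) or \
           (r1.pop < r2.pop and r1.conf <= r2.conf)

def sc_less(r1, r2):
    """r1 <_sc r2, with support taken as the population size."""
    s1, s2 = len(r1.pop), len(r2.pop)
    return (s1 <= s2 and r1.conf < r2.conf) or (s1 < s2 and r1.conf <= r2.conf)

r1 = Rule(frozenset({1, 2}), 0.5)
r2 = Rule(frozenset({3, 4, 5}), 0.8)  # larger but disjoint population
r3 = Rule(frozenset({1, 2, 3}), 0.8)  # superset of r1's population
```

`r1` and `r2` are comparable under the sc order but incomparable under the pc order, because neither population contains the other.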
15
PC-Optimality: Theoretical Implications
≤_sc is implied by ≤_pc, and ≤_s¬c by ≤_p¬c.
≤_pc results in more incomparable rule pairs.
A pc-optimal rule set will therefore contain more rules than an sc-optimal rule set.
16
Optimality: Practical Implications
Two algorithms are proposed, one for each type of optimality.
Each algorithm produces a set of optimal rules without requiring an interestingness metric to be specified.
The produced set is guaranteed to contain the most interesting rules according to several metrics.
17
Optimality: Practical Implications (cont.)
These algorithms facilitate interactivity:
  Examine the optimal rules according to some metric without additional querying or mining.
  Find the most interesting rule that characterizes any given subset of the population.
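The interactive use can be illustrated by computing the sc-optimal set once and then ranking it under several metrics with no further pass over the data. This is a minimal sketch: the rule names and `(sup, conf)` values are made up, support is treated as a count, and the constants θ = 0.5 and k = 2 are arbitrary.

```python
def sc_less(r1, r2):
    """r1 <_sc r2 on (sup, conf) pairs."""
    (s1, c1), (s2, c2) = r1, r2
    return (s1 <= s2 and c1 < c2) or (s1 < s2 and c1 <= c2)

# Hypothetical mined rules: name -> (support count, confidence).
rules = {
    "A": (10, 0.95),
    "B": (40, 0.80),
    "C": (80, 0.60),
    "D": (30, 0.70),
    "E": (70, 0.50),
}

# Metrics expressed as functions of support and confidence only.
metrics = {
    "confidence": lambda sup, conf: conf,
    "support": lambda sup, conf: sup,
    "gain": lambda sup, conf: sup * (1 - 0.5 / conf),
    "laplace": lambda sup, conf: (sup + 1) / (sup / conf + 2),
}

# Computed once: rules with no other rule strictly above them.
frontier = {
    name: r for name, r in rules.items()
    if not any(sc_less(r, other) for other in rules.values())
}

# Re-ranking the frontier under each metric needs no new mining.
best = {m: max(frontier, key=lambda n: f(*frontier[n])) for m, f in metrics.items()}
best_overall = {m: max(rules, key=lambda n: f(*rules[n])) for m, f in metrics.items()}
```

On this sample, the winner under each metric is the same whether it is searched for in the full rule set or only in the frontier.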