PROBABILISTIC AND LOGIC APPROACHES TO MACHINE LEARNING AND DATA MINING


PROBABILISTIC AND LOGIC APPROACHES TO MACHINE LEARNING AND DATA MINING Marek Perkowski Portland State University

Essence of logic synthesis approach to learning

Example of Logical Synthesis: John, Dave, Mark, Jim, Alan, Nick, Mate, Robert

Good guys: Dave, Jim, John, Mark.
Bad guys: Alan, Nick, Mate, Robert.
Attributes: A - size of hair, B - size of nose, C - size of beard, D - color of eyes.

Good guys (Karnaugh map with AB on one axis and CD on the other, in the order 00 01 11 10): Mark, John, Dave, Jim.
Minterms: A’B’CD, A’B’CD, A’BCD, A’BCD’.

Bad guys (Karnaugh map with AB on one axis and CD on the other, in the order 00 01 11 10): Alan, Nick, Mate, Robert.
Minterms: A’B’C’D, ABCD, A’BC’D’, AB’C’D.
Group marked on the map: A’C.

Generalization 1: Bald guys with beards are good (the group A’C on the Karnaugh map).
Generalization 2: All other guys are no good.
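The generalization can be checked mechanically against the training examples. The sketch below is only an illustration: the bit strings are the minterms as they appear in this transcript (attribute order A, B, C, D), the pairing of names with minterms follows the order in which they are listed and is an assumption, and the function name `bald_with_beard` is made up for the example.

```python
# Attribute order: A (hair), B (nose), C (beard), D (eyes); '1' = uncomplemented literal.
# Name-to-minterm pairing follows the listing order in the slides (an assumption).
good = {"Mark": "0011", "John": "0011", "Dave": "0111", "Jim": "0110"}
bad = {"Alan": "0001", "Nick": "1111", "Mate": "0100", "Robert": "1001"}


def bald_with_beard(bits: str) -> bool:
    """Generalization 1 as the product A'C: A = 0 and C = 1."""
    a, _b, c, _d = (ch == "1" for ch in bits)
    return (not a) and c


# The rule must accept every good example and reject every bad one.
covers_all_good = all(bald_with_beard(m) for m in good.values())
covers_no_bad = not any(bald_with_beard(m) for m in bad.values())
print(covers_all_good, covers_no_bad)
```

Because the product uses only two of the four attributes, it also classifies people who were never seen in training, which is the sense in which it generalizes.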

SOP (DNF) approach to learning

Sum of Products: AND gates, followed by an OR gate that produces the output. (Inverters are used as needed.)
There are many algorithms to minimize SOP. They were created either in the ML community or in the Logic Synthesis community.
We will illustrate three different algorithms.

Method 1: SOP minimization based on graph coloring

Reduction of SOP (DNF) Machine Learning to graph coloring: SOP through Graph Coloring
In the previous example there were 4 binary variables. Here there are two variables, each with 4 values.
We encode every group or minterm using the encoding shown on the right.
We check for every two groups whether they can be combined. If they can be combined, the combined group does not cover zeros. If the combined group covers zeros, the groups cannot be combined.
Let us try to combine a1 and b1. We do bitwise OR:
a1 = 1001 1000
b1 = 1001 0100
a1 OR b1 = 1001 1100
The combined group does not cover zeros.

Reduction of SOP (DNF) Machine Learning to graph coloring: SOP through Graph Coloring
Let us try to combine a2 and b2. We do bitwise OR:
a2 = 0010 1000
b2 = 0100 0100
a2 OR b2 = 0110 1100
The combined group covers zeros, so groups a2 and b2 are not compatible.
Every two incompatible nodes in the graph are connected by an edge.
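The two combination attempts can be reproduced with a few lines of code. This is a sketch under stated assumptions: each group is written as one 4-bit mask per variable, exactly as in the slides, but the list `zeros` is a hypothetical off-set chosen for illustration, because the map with the actual zeros is not part of the text. The helper names `combine`, `covers` and `compatible` are also made up.

```python
# A group is a tuple of bit masks, one mask per 4-valued variable.
# Bit i of a mask is '1' when the group allows value i of that variable.
a1, b1 = ("1001", "1000"), ("1001", "0100")
a2, b2 = ("0010", "1000"), ("0100", "0100")


def combine(g, h):
    """Smallest group containing both g and h: bitwise OR of each mask."""
    return tuple(
        "".join("1" if x == "1" or y == "1" else "0" for x, y in zip(m, n))
        for m, n in zip(g, h)
    )


def covers(group, cell):
    """True if the single cell `cell` lies inside `group`."""
    return all(
        not (c == "1" and g == "0")
        for group_mask, cell_mask in zip(group, cell)
        for g, c in zip(group_mask, cell_mask)
    )


# Hypothetical off-set: one cell where the function is 0 (an assumption).
zeros = [("0100", "1000")]


def compatible(g, h):
    """Two groups are compatible if their combination covers no zero."""
    merged = combine(g, h)
    return not any(covers(merged, z) for z in zeros)


print(combine(a1, b1))  # ('1001', '1100')
print(combine(a2, b2))  # ('0110', '1100')
print(compatible(a1, b1), compatible(a2, b2))  # True False
```

With this off-set, neither a2 nor b2 contains the zero cell on its own; only their combination does, which is why the pair gets an edge in the incompatibility graph.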

Reduction of SOP (DNF) Machine Learning to graph coloring: SOP through Graph Coloring
Based on the incompatibility of groups we create the INCOMPATIBILITY GRAPH. Every two incompatible nodes (for incompatible groups) are connected by an edge.
We color the graph with the minimum number of colors. The minimum number of colors is called the chromatic number.
We combine the nodes that have the same color.

The minimum coloring corresponds to the minimum number of combined groups in the final solution. These groups are usually products, but they may also be of the form PRODUCT1 * (PRODUCT2)’
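The coloring step can be sketched as follows. The graph below is hypothetical (only the edge between a2 and b2 comes from the slides), and the exhaustive search in `min_coloring` is a stand-in that is practical only for very small graphs; it is not claimed to be the algorithm used in the lecture.

```python
from itertools import product


def min_coloring(nodes, edges):
    """Exhaustive minimum coloring; returns (number of colors, node -> color)."""
    for k in range(1, len(nodes) + 1):
        for colors in product(range(k), repeat=len(nodes)):
            assignment = dict(zip(nodes, colors))
            if all(assignment[u] != assignment[v] for u, v in edges):
                return k, assignment
    return 0, {}


# Hypothetical incompatibility graph over four groups.
nodes = ["a1", "b1", "a2", "b2"]
edges = [("a2", "b2"), ("a1", "a2"), ("b1", "b2")]

k, assignment = min_coloring(nodes, edges)

# Nodes with the same color are combined into one group of the final solution.
merged = {}
for node, color in assignment.items():
    merged.setdefault(color, []).append(node)
print(k, merged)
```

Here `k` is the chromatic number of the example graph, and each list in `merged` is one combined group.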

Method 2: SOP minimization based on set covering with primes

SOP through Set Covering
1) Find all prime implicants of the function.
2) Create a table with columns being true minterms and rows being prime implicants. This is called the covering problem.
3) Find the smallest subset of rows that covers all columns.
There are many algorithms for this problem. Some use BDDs, some SAT, some matrices.
The same method can be used for Boolean minimization, for test generation (to cover all faults with the minimum number of tests), and to select the best positions of robots guarding a building from terrorists.

SOP through Set Covering
Columns correspond to minterms with value 1. Rows correspond to prime implicants (T0, T1, T2, T3).
T0 and T2 is not a solution because column b0 is not covered.
T0, T2 and T3 is a solution.
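A small covering table makes the two statements above concrete. The table contents are hypothetical: the slide's actual table is not in the text, so the rows below were chosen only so that {T0, T2} misses column b0 while {T0, T2, T3} covers everything. The exhaustive search is one simple way to solve tiny instances, not a recommendation for large ones.

```python
from itertools import combinations

# Hypothetical covering table: prime implicant -> set of true minterms it covers.
table = {
    "T0": {"a0", "a1"},
    "T1": {"a1", "b0"},
    "T2": {"b1"},
    "T3": {"b0"},
}
columns = set().union(*table.values())


def is_cover(rows):
    """True if the chosen rows together cover every column."""
    covered = set()
    for r in rows:
        covered |= table[r]
    return covered == columns


def minimum_cover():
    """Try subsets of rows in order of increasing size; return the first cover."""
    for size in range(1, len(table) + 1):
        for rows in combinations(sorted(table), size):
            if is_cover(rows):
                return rows
    return None


print(is_cover(["T0", "T2"]))        # False: column b0 is not covered
print(is_cover(["T0", "T2", "T3"]))  # True
print(minimum_cover())
```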

Method 3: SOP minimization based on sequential finding of essential and secondary essential primes

Machine Learning SOP through sequential finding of essential and secondary essential primes
1. Find essential primes. (Karnaugh map over AB and CD with the essential primes marked.)

Machine Learning SOP through sequential finding of essential and secondary essential primes
2. Remove essential primes. (Karnaugh map over AB and CD: a secondary essential prime is marked; the orange prime is redundant and the yellow prime is redundant.)

3. Iterate. (Karnaugh map over AB and CD with essential primes and secondary essential primes marked.)
The solution consists of the essential primes and the secondary essential primes of all levels.
If the algorithm does not terminate, make a random choice and iterate, OR use another algorithm.
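The three steps can be sketched on a covering table. Everything concrete here is an assumption: the table is hypothetical, a prime is treated as essential when it is the only one covering some remaining minterm, and a prime is treated as redundant when the minterms it still covers are also covered by another remaining prime. The function name `essential_cover` is made up.

```python
def essential_cover(table):
    """Repeat: pick essential rows, drop covered columns, drop redundant rows."""
    rows = {r: set(cols) for r, cols in table.items()}
    chosen = []
    while any(rows.values()):
        remaining = set().union(*rows.values())
        essential = set()
        for col in remaining:
            owners = [r for r, cols in rows.items() if col in cols]
            if len(owners) == 1:          # only one prime covers this minterm
                essential.add(owners[0])
        if not essential:
            break                         # no essential prime left: make a choice
        chosen.extend(sorted(essential))
        covered = set().union(*(rows[r] for r in essential))
        rows = {r: c - covered for r, c in rows.items() if r not in essential}
        survivors = {}
        for r, c in rows.items():
            dominated = any(
                c < d or (c == d and s < r) for s, d in rows.items() if s != r
            )
            if c and not dominated:       # keep only non-empty, non-redundant rows
                survivors[r] = c
        rows = survivors
    return chosen, rows


# Hypothetical table: P1 is essential (only cover of m0); after it is removed,
# P2 and P4 become redundant and P3 becomes a secondary essential prime.
table = {
    "P1": {"m0", "m1"},
    "P2": {"m1", "m2"},
    "P3": {"m2", "m3"},
    "P4": {"m3"},
}
chosen, leftover = essential_cover(table)
print(chosen, leftover)
```

A non-empty `leftover` signals the case mentioned above, where no essential prime exists and a choice or a different algorithm is needed.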

Multivalued relations approach to learning

Short Introduction: multiple-valued logic
Signals can have values from some set, for instance {0,1,2} or {0,1,2,3}.
{0,1} - binary logic (a special case)
{0,1,2} - a ternary logic
{0,1,2,3} - a quaternary logic, etc.
MIN gate: outputs the minimal value of its inputs. MAX gate: outputs the maximal value of its inputs.

Functional Decomposition
Evaluates the data function and attempts to decompose it into simpler functions.
F(X) = H( G(B), A ), X = A ∪ B
A - free set, B - bound set
If A ∩ B = ∅, it is a disjoint decomposition.
If A ∩ B ≠ ∅, it is a non-disjoint decomposition.

Pros and cons
In generating the final combinational network, BDD decomposition (based on multiplexers) and SOP decomposition trade flexibility in circuit topology for time efficiency.
Generalized functional decomposition sacrifices speed for a higher likelihood of minimizing the complexity of the final network.

A Standard Map of function ‘z’ (map axes a b \ c, with the bound set and the free set marked)
Columns 0 and 1 and columns 0 and 2 are compatible.
Column compatibility = 2

Principle of finding patterns
We have a tabular representation of data. We want to find patterns. In this case we are looking for patterns in columns.
Columns have the same pattern if the symbols in each row can be combined. We say that these columns are COMPATIBLE.
If in every row we have 0 and 0, 1 and 1, 0 and -, 1 and -, or - and -, then the columns are compatible.
If we have a 0 and a relation 0,1 then the columns are compatible, as one can select 0 from the choice of 0,1.
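The compatibility rule can be written as a row-by-row intersection test. In this sketch each table entry is turned into the set of values it allows: a plain value allows only itself, "-" allows everything, and a relation is given directly as a set. The three columns are hypothetical and were picked so that C0 is compatible with both C1 and C2 while C1 and C2 are not compatible with each other.

```python
VALUES = (0, 1)  # binary outputs here; a multi-valued table would list more values


def cell(symbol):
    """Return the set of output values that a table entry allows."""
    if isinstance(symbol, (set, frozenset)):
        return frozenset(symbol)          # relation: a choice among several values
    if symbol == "-":
        return frozenset(VALUES)          # don't care: any value may be selected
    return frozenset({symbol})


def compatible(col_a, col_b):
    """Columns are compatible if every row has at least one common value."""
    return all(cell(x) & cell(y) for x, y in zip(col_a, col_b))


def merge(col_a, col_b):
    """Row-wise intersection: the pattern shared by two compatible columns."""
    return [cell(x) & cell(y) for x, y in zip(col_a, col_b)]


# Hypothetical columns of a decomposition table.
C0 = [0, "-", {0, 1}, 1]
C1 = [0, 1, "-", 1]
C2 = [0, 0, 1, 1]

print(compatible(C0, C1), compatible(C0, C2), compatible(C1, C2))  # True True False
```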

Decomposition of Multi-Valued Relations
F(X) = H( G(B), A ), X = A ∪ B
Here F, G and H are relations.
If A ∩ B = ∅, it is a disjoint decomposition.
If A ∩ B ≠ ∅, it is a non-disjoint decomposition.

Forming a CCG from a K-Map (map of z with axes a b \ c, with the bound set and the free set marked)
Columns 0 and 1 and columns 0 and 2 are compatible.
Column compatibility index = 2
Column Compatibility Graph: nodes C0, C1, C2

Forming a CIG from a K-Map (map of z with axes a b \ c)
Columns 1 and 2 are incompatible.
Chromatic number = 2
Column Incompatibility Graph: nodes C0, C1, C2

CCG and CIG are complementary
Column Compatibility Graph (nodes C0, C1, C2): maximal clique covering, clique partitioning
Column Incompatibility Graph (nodes C0, C1, C2): graph coloring, graph multi-coloring
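The complementarity can be checked directly on the three-column example: the CIG is obtained by removing the CCG edges from the set of all pairs, and a grouping of columns is a clique partition of the CCG exactly when it is a proper coloring of the CIG. The helper names below are made up for this sketch.

```python
from itertools import combinations

columns = ["C0", "C1", "C2"]
# Compatible pairs as stated above: columns 0 and 1, columns 0 and 2.
ccg = {frozenset(("C0", "C1")), frozenset(("C0", "C2"))}
all_pairs = {frozenset(p) for p in combinations(columns, 2)}
cig = all_pairs - ccg  # complement graph: the incompatible pairs


def is_clique_partition(blocks, edges):
    """Every pair inside a block must be an edge of the compatibility graph."""
    return all(frozenset(p) in edges for b in blocks for p in combinations(b, 2))


def is_proper_coloring(blocks, edges):
    """No edge of the incompatibility graph may lie inside one color class."""
    return not any(frozenset(p) in edges for b in blocks for p in combinations(b, 2))


blocks = [["C0", "C1"], ["C2"]]      # two blocks = two colors
bad_blocks = [["C1", "C2"], ["C0"]]  # puts the incompatible pair together

print(cig == {frozenset(("C1", "C2"))})
print(is_clique_partition(blocks, ccg), is_proper_coloring(blocks, cig))
print(is_clique_partition(bad_blocks, ccg), is_proper_coloring(bad_blocks, cig))
```

The two-block grouping agrees with the chromatic number 2 stated for the CIG: one color is not enough because C1 and C2 are incompatible.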

clique partitioning example.

Maximal clique covering example.

Map of relation G (G \ c), after induction from the CIG:
g = a high pass filter whose acceptance threshold begins at c > 1

The Meaning of Attributes

Attributes
Static: facial features; gestures; objects to grasp; objects to avoid; symptoms of illness; view of body cells; crystallization parameters of liquids.
Dynamic: changes in the stock market; changes in facial features (facial gestures); change of an object’s view when a robot approaches it; dynamical change of a body part in motion; changes of moles on the skin; changing symptoms of an illness.

Static: one vector of attributes p1 p2 p3 p4 p5 p6 p7 p8.
Dynamic: one such vector for each time step t0, t1, t2.
Three vectors in time are represented as one long vector for Machine Learning: the attributes p1 ... p8 at time t0, followed by p1 ... p8 at time t1, followed by p1 ... p8 at time t2.
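Turning dynamic attributes into one long vector amounts to concatenating the per-time-step vectors in a fixed order. The attribute values below are invented for illustration, and the naming scheme `t0.p1` is an assumption of this sketch.

```python
TIMES = ("t0", "t1", "t2")

# Hypothetical observations: eight attributes p1..p8 at each time step.
frames = {
    "t0": [0, 1, 0, 0, 1, 1, 0, 1],
    "t1": [0, 1, 1, 0, 1, 0, 0, 1],
    "t2": [1, 1, 1, 0, 0, 0, 0, 1],
}

# One long vector: all attributes at t0, then all at t1, then all at t2.
long_vector = [value for t in TIMES for value in frames[t]]

# Matching attribute names, so each position keeps its meaning.
names = [f"{t}.p{i}" for t in TIMES for i in range(1, 9)]

print(len(long_vector), names[0], names[8], names[-1])  # 24 t0.p1 t1.p1 t2.p8
```

After this step a dynamic example looks like a static one with 24 attributes, so the same learning methods apply.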

Representation Models for Logic Based Machine Learning

Types of Logical Synthesis
1) Sum of Products
2) Decision Trees
3) Decision Diagrams
4) Functional Decomposition (the method we are using)

Binary Decision Diagrams There are many types of Decision Trees and many generalizations of them, used in logic and in ML

Decision Diagrams
A decision diagram breaks down a Karnaugh map into a set of decision trees.
A decision diagram ends when all of the branches have a yes, no, or do not care solution.
Example Karnaugh Map: this diagram can become quite complex if the data is spread out, as in the following example.

Decision Tree for Example Karnaugh Map

BDD Representation of function: incompletely specified function (Karnaugh map over AB and CD with entries 1 and -)

BDD Representation of function: completely specified function (Karnaugh map over AB and CD)
The problem is how to find the minimum tree or decision diagram for your given data.

Absolutely Minimum Background on Binary Decision Diagrams (BDD)
BDDs are based on the recursive Shannon expansion: F = x Fx + x’ Fx’
Compact data structure for Boolean logic: can represent sets of objects (states) encoded as Boolean functions.
Canonical representation: reduced ordered BDDs (ROBDD) are canonical, which is essential for simulation, analysis, synthesis and verification.

Other expansions, other trees, other diagrams.
The standard Decision Tree is based on the Shannon Expansion: F = x Fx + x’ Fx’
This is the same concept as data separation in ML. A Shannon node for variable x splits F into Fx’ (branch x’) and Fx (branch x). A node for attribute WIND splits all examples into all examples for which WIND was WEAK (branch WIND=WEAK) and all examples for which WIND was STRONG (branch WIND=STRONG).
Data Separation in ML is the same as Shannon Expansion in Logic.
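The correspondence can be seen by writing both operations in code: a cofactor keeps the truth-table rows where a variable has a given value, and a data split keeps the examples where an attribute has a given value. The on-set, the example records and the `LABEL` field are hypothetical and serve only as an illustration.

```python
def cofactor(on_set, index, value):
    """Rows of the on-set (tuples of bits) in which variable `index` equals `value`."""
    return {m for m in on_set if m[index] == value}


def split(examples, attribute):
    """Partition examples by the value of one attribute (a Shannon-style split)."""
    parts = {}
    for ex in examples:
        parts.setdefault(ex[attribute], []).append(ex)
    return parts


# Logic side: on-set of a function of (a, b, c), split on variable a (index 0).
on_set = {(0, 1, 1), (1, 0, 1), (1, 1, 1)}
f_a0 = cofactor(on_set, 0, 0)   # rows under the x' branch
f_a1 = cofactor(on_set, 0, 1)   # rows under the x branch

# ML side: hypothetical examples split on the attribute WIND.
examples = [
    {"WIND": "WEAK", "LABEL": 1},
    {"WIND": "STRONG", "LABEL": 0},
    {"WIND": "WEAK", "LABEL": 1},
    {"WIND": "STRONG", "LABEL": 1},
]
parts = split(examples, "WIND")

print(f_a0 | f_a1 == on_set)            # the two branches together give F back
print({k: len(v) for k, v in parts.items()})
```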

Absolutely Minimum Background on Binary Decision Diagrams (BDD) and Kronecker Functional Decision Diagrams
BDDs are based on the recursive Shannon expansion: F = x Fx + x’ Fx’
Fx is the positive cofactor of F with respect to variable x.
Fx’ is the negative cofactor of F with respect to variable x.

BDD Construction
Typically done using the APPLY operator.
Reduction rules:
1) remove duplicate terminals
2) merge duplicate nodes (isomorphic subgraphs)
3) remove redundant nodes
Redundant nodes: nodes with identical children.

BDD Construction – your first BDD
Construction of a Reduced Ordered BDD for f = ac + bc, starting from the truth table and the decision tree (with 1 edges and 0 edges).
Truth table:
a b c | f
0 0 0 | 0
0 0 1 | 0
0 1 0 | 0
0 1 1 | 1
1 0 0 | 0
1 0 1 | 1
1 1 0 | 0
1 1 1 | 1

BDD Construction – cont’d
f = (a+b)c
1. Remove duplicate terminals
2. Merge duplicate nodes
3. Remove redundant nodes
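The three reduction rules can be folded into a small constructor. This sketch does not use the APPLY operator mentioned earlier; instead it expands the function by Shannon expansion over the whole truth table (exponential in the number of variables) and applies the rules while building, which is enough to reproduce the example f = (a+b)c. The function name `build_robdd` and the node numbering are assumptions of the sketch.

```python
def build_robdd(f, order):
    """Build a reduced ordered BDD of f for the given variable order.

    Terminals are the integers 0 and 1 (so duplicate terminals never exist).
    Internal nodes get ids 2, 3, ... and are stored as (variable, low, high).
    """
    unique = {}  # (variable, low, high) -> node id; merges duplicate nodes

    def mk(var, low, high):
        if low == high:                   # redundant node: identical children
            return low
        key = (var, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
        return unique[key]

    def expand(i, env):
        if i == len(order):
            return int(bool(f(**env)))    # terminal 0 or 1
        var = order[i]
        low = expand(i + 1, {**env, var: 0})    # 0 edge: negative cofactor
        high = expand(i + 1, {**env, var: 1})   # 1 edge: positive cofactor
        return mk(var, low, high)

    root = expand(0, {})
    return root, {node: key for key, node in unique.items()}


root, nodes = build_robdd(lambda a, b, c: (a or b) and c, ["a", "b", "c"])
for node_id in sorted(nodes):
    print(node_id, nodes[node_id])
print("root:", root)
```

For the order a, b, c the result has three internal nodes, one per variable, compared with seven in the full decision tree.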

ROBDD and MULTIPLEXOR CIRCUIT (figure: ROBDD of F over variables a, b, c, d with E and T edges, next to the corresponding MUX circuit)
What is a Decision Diagram in theory and ML is a logic circuit built from multiplexers in logic design.

Kronecker Decision Diagrams
There are many types of Decision Trees and many generalizations of them, used in logic and in ML.
Kronecker Decision Diagrams are a generalization of BDDs.

Decomposition types
Decomposition types are associated with the variables in Xn with the help of a decomposition type list (DTL) d := (d1,…,dn), where di ∈ {S, pD, nD}.

KFDD Definition
In KF trees and KFDDs we can have any variable and any of the three expansions in any level.

The nodes of a KFDD or KF tree can be interpreted as logic gates.
Example: F = a’b ⊕ ac (variables a, b, c)
Shannon expansion: F = x Fx + x’ Fx’, where Fx is the positive cofactor of F with respect to variable x and Fx’ is the negative cofactor of F with respect to variable x.
Shannon cell; Dipal cell representation with reversible gates.
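The three expansion types S, pD and nD can be verified on the example function. The slides spell out only the Shannon form; the positive and negative Davio forms used below, F = Fx’ ⊕ x (Fx ⊕ Fx’) and F = Fx ⊕ x’ (Fx ⊕ Fx’), are the standard textbook definitions and are stated here as an assumption about what the lecture intends. The expansion variable is a.

```python
from itertools import product


def F(a, b, c):
    """Example function F = a'b XOR ac on 0/1 integers."""
    return ((1 - a) & b) ^ (a & c)


def expansions(f, a, b, c):
    """Evaluate the Shannon, positive Davio and negative Davio expansions in a."""
    f0 = f(0, b, c)                           # negative cofactor (a = 0)
    f1 = f(1, b, c)                           # positive cofactor (a = 1)
    shannon = (a & f1) | ((1 - a) & f0)       # S : a*F1 + a'*F0
    pos_davio = f0 ^ (a & (f0 ^ f1))          # pD: F0 xor a*(F0 xor F1)
    neg_davio = f1 ^ ((1 - a) & (f0 ^ f1))    # nD: F1 xor a'*(F0 xor F1)
    return shannon, pos_davio, neg_davio


# All three expansions must agree with F on every input combination.
agree = all(
    expansions(F, a, b, c) == (F(a, b, c),) * 3
    for a, b, c in product((0, 1), repeat=3)
)
print(agree)  # True
```

For this function the cofactors with respect to a are simply b (for a = 0) and c (for a = 1), so each node type yields a different but equivalent gate-level structure.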

Three different reduction types
Type I: Each node in a DD is a candidate for the application. (Figure: nodes labeled xi with subfunctions f and g.)
This applies to BDD and KDD.

Three different reduction types (cont’d)
(Figure: node xi above nodes xj, with subfunctions f and g.)
This applies to BDD and KDD.

Three different reduction types (cont’d)
Type D (Figure: node xi above nodes xj, with subfunctions f and g.)
This is used in functional diagrams such as KFDDs.

Example for OKFDD (figure: decision diagram of F with Shannon nodes)

Example for OKFDD (cont’d)
This diagram explains the expansions from the previous slide: X1 is an S-node, X2 is a pD-node, X3 is an nD-node.

How to combine trees with any other Classifiers

You have a tree on top. It can be a Shannon Tree or a KFDD or any tree.
Below the tree: any cofactor of the variables on top of the tree, realized by any remainder logic.

Overview of data mining

What is Data Mining?
Databases with millions of records and thousands of fields are now common in business, medicine, engineering, and the sciences. To extract useful information from such data sets is an important practical problem.
Data Mining is the study of methods to find useful information in a database and to use the data to make predictions about the people or events the data was developed from.
Goals: classification with small error; understanding data (patterns, rules, statistics).

Some Examples of Data Mining 1) Stock Market Predictions 2) Large companies tracking sales 3) Military and intelligence applications

Data Mining in Epidemiology
Epidemiologists track the spread of infectious disease and try to determine the disease’s original source.
Often epidemiologists have only an initial suspicion about what is causing an illness. They interview people to find out what the people who got sick have in common.
Currently they have to sort through this data by hand to try to determine the initial source of the disease.
A data mining application would speed up this process and allow them to quickly track the source of an infectious disease.

Types of Data Mining
Data Mining applications use, among others, three methods to process data:
1) Neural Nets
2) Statistical (Probabilistic) Analysis (the method we will discuss here)
3) Logical Methods (the method we will discuss here)

DATA MINING: understanding of the underlying area, like medicine or finance.
MACHINE LEARNING: formal mathematics and programming.

Conclusions

Logic Synthesis and ML classifier design are similar. In ML we have continuous or MV data and many don’t cares. Also we design for accuracy and interpretability. Occam’s Razor is similar.
SOP is a very good method, not recognized sufficiently by the ML community. We have shown three algorithms to deal with SOP. There are many more.
The compatibility of columns is important to find patterns in data. We will use this method. It can be reduced to clique partitioning, clique covering and graph coloring.
Decision Trees as shown in the previous lectures are only some particular examples of hierarchically decomposed structures. Other structures are Decision Diagrams and Functional Decompositions.
Kronecker Diagrams and Trees generalize Decision Diagrams and Trees.

Questions and Problems
1) What is the principle of logic-based representation of data for Machine Learning?
2) Give an example of learning representations based on logic.
3) Static versus dynamic attributes.
4) What is the difference between decision trees and decision diagrams?
5) How to generalize Boolean concepts to MV concepts in Machine Learning?
6) Binary versus ternary Ashenhurst Decomposition.
7) What is Data Mining and how can it be used?
8) Give your own example of Data Mining using Machine Learning methods known to you.
9) What are bound and free sets in decomposition?
10) What is disjoint versus non-disjoint decomposition?
11) Can decomposition be combined with other learning methods? How?

Questions and Problems
1) Explain the graph coloring approach to SOP.
2) Explain the essential prime approach to SOP.
3) Explain the set covering (clique covering) approach to SOP.
4) Compare covering and coloring ideas.
5) What is a maximum clique?
6) What is a maximum independent set?
7) Apply Shannon Expansion to a function from any Kmap in these slides. Select any variable you want.
8) Apply Positive Davio Expansion to any function above.
9) Apply Negative Davio Expansion to any function above.
10) What is the difference between clique partitioning and clique covering?
11) What is the relation between clique covering and SOP?
12) What is the difference between a Decision Tree and a Decision Diagram?
13) How to create a KFDD for a function with don’t cares? (DIFFICULT)