A Multi-Relational Decision Tree Learning Algorithm – Implementation and Experiments
Anna Atramentov
Major: Computer Science
Program of Study Committee: Vasant Honavar (Major Professor), Drena Leigh Dobbs, Yan-Bin Jia
Iowa State University, Ames, Iowa, 2003

KDD and Relational Data Mining
The term KDD stands for Knowledge Discovery in Databases.
Traditional KDD techniques work with instances represented by a single table.
Relational Data Mining is a subfield of KDD in which the instances are represented by several tables.

Single-table example (PlayTennis):
Day | Outlook  | Temp-re | Humidity | Wind   | Play Tennis
d1  | Sunny    | Hot     | High     | Weak   | No
d2  | Sunny    | Hot     | High     | Strong | No
d3  | Overcast | Hot     | High     | Weak   | Yes
d4  | Overcast | Cold    | Normal   | Weak   | No

Multi-table example (university database):
Department: ID | Specialization   | #Students
            d1 | Math             | 1000
            d2 | Physics          | 300
            d3 | Computer Science | 400

Staff: ID | Name   | Department | Position          | Salary
       p1 | Dale   | d1         | Professor         | 70-80k
       p2 | Martin | d3         | Postdoc           | 30-40k
       p3 | Victor | d2         | Visitor Scientist | 40-50k
       p4 | David  | d3         | Professor         | 80-100k

Graduate Student: ID | Name   | GPA | #Publications | Advisor | Department
                  s1 | John   | 2.0 | 4             | p1      | d3
                  s2 | Lisa   | 3.5 | 10            | p4      | d3
                  s3 | Michel | 3.9 | 3             | p4      | d4

Motivation
Importance of relational learning:
- growth of data stored in multi-relational databases (MRDBs)
- techniques for learning from unstructured data often extract the data into an MRDB
Promising approach to relational learning:
- the MRDM (Multi-Relational Data Mining) framework developed by Knobbe et al. (1999)
- the MRDTL (Multi-Relational Decision Tree Learning) algorithm implemented by Leiva (2002)
Goals:
- speed up the MRDM framework, and in particular the MRDTL algorithm
- incorporate handling of missing values
- perform a more extensive experimental evaluation of the algorithm

Relational Learning Literature
- Inductive Logic Programming (Dzeroski and Lavrac, 2001; Dzeroski et al., 2001; Blockeel, 1998; De Raedt, 1997)
- First-order extensions of probabilistic models, combining first-order logic and probability theory:
  - Relational Bayesian Networks (Jaeger, 1997)
  - Probabilistic Relational Models (Getoor, 2001; Koller, 1999)
  - Bayesian Logic Programs (Kersting et al., 2000)
- Multi-Relational Data Mining (Knobbe et al., 1999)
- Propositionalization methods (Krogel and Wrobel, 2001)
- PRM extension for cumulative learning, for learning and reasoning as agents interact with the world (Pfeffer, 2000)
- Approaches for mining data in the form of graphs (Holder and Cook, 2000; Gonzalez et al., 2000)

Problem Formulation
Given: data stored in a relational database
Goal: build a decision tree for predicting the target attribute in the target table

Example of a multi-relational database:
Schema:
  Department(ID, Specialization, #Students)
  Staff(ID, Name, Department, Position, Salary)
  Grad.Student(ID, Name, GPA, #Publications, Advisor, Department)
Instances: the Department, Staff and Graduate Student tables shown on the previous slide.

Propositional decision tree algorithm. Construction phase

Tree_induction(D: data)
  A = optimal_attribute(D)
  if stopping_criterion(D)
    return leaf(D)
  else
    D_left  := split(D, A)
    D_right := split_complement(D, A)
    child_left  := Tree_induction(D_left)
    child_right := Tree_induction(D_right)
    return node(A, child_left, child_right)

Example on the PlayTennis table {d1, d2, d3, d4}: the root test on Outlook sends the sunny instances {d1, d2} to a leaf labeled No, while the remaining instances {d3, d4} are split further on Temperature, separating d3 (hot, Yes) from d4 (not hot, No).
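For reference, here is a minimal Python sketch of the binary tree-induction recursion above. The equality-test splits, the purity-based stopping rule and the helper names are simplifying assumptions, not the exact criteria used in the thesis; identifier attributes such as Day would need to be excluded before running it on the example table.

```python
# Minimal sketch of binary decision-tree induction by information gain.
# Data is a list of dicts; 'target' names the class attribute.
from collections import Counter
import math

def entropy(rows, target):
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def best_split(rows, target):
    # Pick the (attribute, value) equality test with the highest information gain.
    base = entropy(rows, target)
    best, best_gain = None, 0.0
    for attr in rows[0]:
        if attr == target:
            continue
        for value in {r[attr] for r in rows}:
            left = [r for r in rows if r[attr] == value]
            right = [r for r in rows if r[attr] != value]
            if not left or not right:
                continue
            gain = (base
                    - len(left) / len(rows) * entropy(left, target)
                    - len(right) / len(rows) * entropy(right, target))
            if gain > best_gain:
                best, best_gain = (attr, value), gain
    return best

def tree_induction(rows, target):
    labels = {r[target] for r in rows}
    split = best_split(rows, target) if len(labels) > 1 else None
    if split is None:  # stopping criterion: pure node or no useful split left
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    attr, value = split
    left = [r for r in rows if r[attr] == value]
    right = [r for r in rows if r[attr] != value]
    return (f"{attr} = {value}",
            tree_induction(left, target),
            tree_induction(right, target))
```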

MR setting. Splitting data with Selection Graphs
In the multi-relational setting a selection graph and its complement play the role that an attribute test plays in the propositional case: they split the instances of the target table into disjoint subsets.
[Figure: the example Department, Staff and Graduate Student tables, with the Staff rows partitioned into disjoint subsets by selection graphs involving Grad.Student (GPA > 2.0) and their complements]

What is a selection graph?
- It corresponds to a subset of the instances of the target table
- Nodes correspond to tables in the database
- Edges correspond to associations between tables
- Open edge = "have at least one"
- Closed edge = "have none of"
[Figure: example selection graphs over Staff, Grad.Student (GPA > 3.9), Grad.Student and Department (Specialization = math)]
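To make the structure concrete, here is a small Python sketch of one possible selection-graph representation. The class names, the alias field and the string form of join conditions are illustrative assumptions, not the data structures of the thesis implementation.

```python
# Hypothetical, minimal representation of a selection graph: nodes are tables
# (with an SQL alias and optional attribute conditions), edges are foreign-key
# associations, and each edge is either open ("at least one") or closed ("none of").
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    table: str
    alias: str
    conditions: List[str] = field(default_factory=list)   # e.g. ["T1.GPA > 3.9"]

@dataclass
class Edge:
    child: Node
    join: str           # e.g. "T0.ID = T1.Advisor"
    open: bool = True   # True = "have at least one", False = "have none of"

@dataclass
class SelectionGraph:
    root: Node                                     # the target table
    edges: List[Edge] = field(default_factory=list)

# Example: staff members who advise at least one grad student with GPA > 3.9
staff = Node("Staff", "T0")
good_student = Node("Graduate_Student", "T1", ["T1.GPA > 3.9"])
sg = SelectionGraph(staff, [Edge(good_student, "T0.ID = T1.Advisor")])
```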

Transforming selection graphs into SQL queries

Staff with the condition Position = Professor:
  select distinct T0.ID
  from Staff T0
  where T0.Position = 'Professor'

Staff with an open edge to Grad.Student (advises at least one grad student):
  select distinct T0.ID
  from Staff T0, Graduate_Student T1
  where T0.ID = T1.Advisor

Staff with a closed edge to Grad.Student (advises no grad student):
  select distinct T0.ID
  from Staff T0
  where T0.ID not in (select T1.Advisor from Graduate_Student T1)

Staff with an open edge to Grad.Student and a closed edge to Grad.Student with GPA > 3.9:
  select distinct T0.ID
  from Staff T0, Graduate_Student T1
  where T0.ID = T1.Advisor
    and T0.ID not in (select T1.Advisor
                      from Graduate_Student T1
                      where T1.GPA > 3.9)

Generic query:
  select distinct T0.primary_key
  from table_list
  where join_list and condition_list
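Building on the hypothetical SelectionGraph sketch above, the following Python function illustrates how the generic query could be assembled: open edges contribute tables and join conditions, closed edges become "not in" subqueries. The fk parameter (the foreign key used in subqueries) and the overall structure are simplifying assumptions for illustration, not the translation code used in the thesis.

```python
# Illustrative translation of a selection graph into the generic query
#   select distinct T0.primary_key from table_list where join_list and condition_list
def graph_to_sql(sg: SelectionGraph, primary_key: str = "ID",
                 fk: str = "Advisor") -> str:
    tables = [f"{sg.root.table} {sg.root.alias}"]
    predicates = list(sg.root.conditions)
    for e in sg.edges:
        if e.open:
            # open edge: join the child table and keep its conditions
            tables.append(f"{e.child.table} {e.child.alias}")
            predicates.append(e.join)
            predicates.extend(e.child.conditions)
        else:
            # closed edge: "no matching child row" expressed as a NOT IN subquery
            sub_where = " and ".join(e.child.conditions) or "1 = 1"
            predicates.append(
                f"{sg.root.alias}.{primary_key} not in "
                f"(select {e.child.alias}.{fk} from {e.child.table} {e.child.alias} "
                f"where {sub_where})")
    where = " and ".join(predicates) or "1 = 1"
    return (f"select distinct {sg.root.alias}.{primary_key} "
            f"from {', '.join(tables)} where {where}")

print(graph_to_sql(sg))
# select distinct T0.ID from Staff T0, Graduate_Student T1
# where T0.ID = T1.Advisor and T1.GPA > 3.9
```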

MR decision tree
- Each node contains a selection graph
- Each child's selection graph is a supergraph of its parent's selection graph
[Figure: decision tree whose nodes hold selection graphs over Staff and Grad.Student (GPA > 3.9)]

How to choose selection graphs in nodes?
Problem: there are too many supergraph selection graphs to choose from at each node.
Solution:
- start with an initial selection graph
- use a greedy heuristic to choose supergraph selection graphs: refinements
- use binary splits for simplicity
- for each refinement, also build its complement refinement
- choose the best refinement based on the information gain criterion (see the sketch below)
Problem: some potentially good refinements may give no immediate benefit.
Solution: look-ahead capability.
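A sketch of the greedy choice, assuming that for each candidate refinement (and its complement) class counts can be obtained, for instance via the counting SQL queries shown later; the counts_for helper and the dict-of-counts format are assumptions made for illustration.

```python
# Greedy selection of the best binary split among candidate refinements,
# scored by information gain over class-count dictionaries {class_label: count}.
import math

def entropy(counts):
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)

def information_gain(parent_counts, left_counts, right_counts):
    n = sum(parent_counts.values())
    return (entropy(parent_counts)
            - sum(left_counts.values()) / n * entropy(left_counts)
            - sum(right_counts.values()) / n * entropy(right_counts))

def best_refinement(parent_counts, candidates, counts_for):
    # candidates: list of (refinement, complement_refinement) pairs
    best, best_gain = None, 0.0
    for ref, comp in candidates:
        gain = information_gain(parent_counts, counts_for(ref), counts_for(comp))
        if gain > best_gain:
            best, best_gain = (ref, comp), gain
    return best, best_gain
```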

Refinements of selection graph
Two kinds of refinements are considered:
- add a condition to a node: explore attribute information in the tables
- add a present edge and open node: explore relational properties between the tables
[Figure: the running example selection graph over Staff, Grad.Student (GPA > 3.9), Grad.Student and Department (Specialization = math)]

Refinements of selection graph
[Figure: an "add condition to the node" refinement that adds Position = Professor to the Staff node of the running selection graph, shown together with its complement refinement (Position != Professor)]

Refinements of selection graph
[Figure: an "add condition to the node" refinement that adds GPA > 2.0 to the open Grad.Student node of the running selection graph, shown together with its complement refinement]

Refinements of selection graph
[Figure: an "add condition to the node" refinement that adds #Students > 200 to the Department node of the running selection graph, shown together with its complement refinement]

Refinements of selection graph
[Figures, four slides: "add present edge and open node" refinements of the running selection graph, each adding a new edge and node (to Department, Staff or Grad.Student) and shown together with its complement refinement; for one of these refinements the information gain is 0]


Look ahead capability
[Figures, two slides: a look-ahead refinement applies two refinement steps at once, e.g. adding a present edge to the Department table and, in the same step, the condition #Students > 200, shown together with its complement refinement]

MRDTL algorithm. Construction phase
For each non-leaf node:
- consider all possible refinements of the node's selection graph, together with their complements
- choose the best ones based on the information gain criterion
- create the children nodes
[Figure: tree of selection graphs grown over the Staff and Grad.Student (GPA > 3.9) example]
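Putting the pieces together, a hedged sketch of the construction phase, building on the best_refinement function sketched earlier; the refinements_of, counts_for and stop helpers are assumptions standing in for the refinement generator, the counting queries and the stopping criterion.

```python
# Sketch of MRDTL tree construction: nodes hold selection graphs, and each node is
# split by the refinement/complement pair with the highest information gain.
def majority_class(counts):
    return max(counts, key=counts.get)

def mrdtl_induction(graph, counts_for, refinements_of, stop):
    counts = counts_for(graph)
    if stop(counts):
        return ("leaf", majority_class(counts))
    best, gain = best_refinement(counts, refinements_of(graph), counts_for)
    if best is None or gain <= 0:
        return ("leaf", majority_class(counts))
    refined, complement = best
    return ("node", refined,
            mrdtl_induction(refined, counts_for, refinements_of, stop),
            mrdtl_induction(complement, counts_for, refinements_of, stop))
```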

MRDTL algorithm. Classification phase
For each leaf:
- apply the selection graph of the leaf to the test data
- classify the resulting instances with the classification stored in the leaf
[Figure: leaves of the example tree; their selection graphs (e.g. Grad.Student GPA > 3.9 with Department Spec = math vs. Spec = physics, Position = Professor) assign salary classes such as 80-100k]
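A correspondingly small sketch of the classification phase; leaves is assumed to be a list of (selection_graph, label) pairs and run_query a helper executing SQL against the test database (graph_to_sql is the illustrative translator sketched earlier).

```python
# Classify test instances: every target-table instance returned by a leaf's
# selection-graph query receives that leaf's class label.
def classify_all(leaves, run_query):
    predictions = {}
    for sg, label in leaves:
        for (instance_id,) in run_query(graph_to_sql(sg)):
            predictions[instance_id] = label
    return predictions
```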

The most time consuming operations of MRDTL
Entropy associated with this selection graph:
  E = - Σ_i (n_i / N) log (n_i / N)
Query associated with the counts n_i:
  select distinct Staff.Salary, count(distinct Staff.ID)
  from Staff, Grad.Student, Department
  where join_list and condition_list
  group by Staff.Salary
The result of the query is a list of pairs (c_i, n_i): for each class label c_i (a salary range), the number n_i of covered instances.
[Figure: the running selection graph and the covered Staff instances with their class labels c_1, c_2, ...]

The most time consuming operations of MRDTL
The entropy associated with each of the refinements requires a query of the same form:
  select distinct Staff.Salary, count(distinct Staff.ID)
  from table_list
  where join_list and condition_list
  group by Staff.Salary
[Figure: the add-condition refinement GPA > 2.0 and its complement, each needing such a counting query]

A way to speed up: eliminate redundant calculations
Problem: for a selection graph with 162 nodes, executing a single query takes more than 3 minutes!
Redundancy in the calculation: for this selection graph, the Staff and Grad.Student tables are joined over and over again for all the children refinements in the tree.
A way to fix it: compute the join only once and save it for all further calculations.
[Figure: the running selection graph over Staff, Grad.Student (GPA > 3.9), Grad.Student and Department (Specialization = math)]

Speed Up Method. Sufficient tables
The join defined by the current selection graph is materialized once as a sufficient table: one row per combination of matching primary keys, together with the target attribute.

Staff_ID | Grad.Student_ID | Dep_ID | Salary
p1       | s1              | d1     | c1
p2       | s1              | d1     | c1
p3       | s6              | d4     | c1
p4       | s3              | d3     | c1
p5       | s1              | d2     | c2
p6       | s9              | d3     | c2
...      | ...             | ...    | ...

Speed Up Method. Sufficient tables
Entropy associated with this selection graph:
  E = - Σ_i (n_i / N) log (n_i / N)
Query associated with the counts n_i, now run against the sufficient table S alone:
  select S.Salary, count(distinct S.Staff_ID)
  from S
  group by S.Salary
The result of the query is again the list of pairs (c_i, n_i).

Speed Up Method. Sufficient tables
Queries associated with the "add condition" refinement (X denotes the table whose attribute A is being tested):
  select S.Salary, X.A, count(distinct S.Staff_ID)
  from S, X
  where S.X_ID = X.ID
  group by S.Salary, X.A
Calculations for the complement refinement:
  count(c_i, R_comp(S)) = count(c_i, S) - count(c_i, R(S))
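The complement-count identity above means only the refinement itself needs a query; in code (assuming class counts stored as dicts keyed by label) it is a one-line subtraction:

```python
# count(c_i, R_comp(S)) = count(c_i, S) - count(c_i, R(S)), per class label c_i
def complement_counts(parent_counts, refinement_counts):
    return {c: parent_counts[c] - refinement_counts.get(c, 0)
            for c in parent_counts}
```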

Speed Up Method. Sufficient tables
Queries associated with the "add edge" refinement (Y denotes the table reached by the new edge e):
  select S.Salary, count(distinct S.Staff_ID)
  from S, X, Y
  where S.X_ID = X.ID and e.cond
  group by S.Salary
Calculations for the complement refinement:
  count(c_i, R_comp(S)) = count(c_i, S) - count(c_i, R(S))

Speed Up Method
- Significant speed-up in obtaining the counts needed to compute the entropy and information gain
- The speed-up is obtained at the cost of the additional space used by the algorithm

Handling Missing Values
For each attribute that has missing values we build a Naïve Bayes model: the attribute with missing values (e.g. Staff.Position) plays the role of the class variable, and related attributes such as Staff.Name, Staff.Dep and Department.Spec are the predictors, with conditional probability tables P(a | b) estimated from the instances where the value is present.
[Figure: the example database, in which Staff.Position is missing for p1, p2 and p4]

Handling Missing Values
The most probable value for the missing attribute is then calculated by the formula:
  P(v_i | X1.A1, X2.A2, X3.A3, ...)
    = P(X1.A1, X2.A2, X3.A3, ... | v_i) P(v_i) / P(X1.A1, X2.A2, X3.A3, ...)
    = P(X1.A1 | v_i) P(X2.A2 | v_i) P(X3.A3 | v_i) ... P(v_i) / P(X1.A1, X2.A2, X3.A3, ...)
and the value v_i maximizing this probability is substituted for the missing value.
[Figure: the rows of Department, Staff and Graduate Student used when imputing p1's missing Position]
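A compact Python sketch of this Naïve Bayes imputation; the rows-as-dicts format, the feature set and the add-one smoothing are simplifying assumptions, not the exact estimator used in the thesis.

```python
# Impute a missing attribute value as argmax_v P(v) * prod_j P(x_j | v),
# with probabilities estimated from instances where the attribute is observed.
from collections import Counter, defaultdict

def fit_nb(rows, target):
    # rows: list of dicts containing the target attribute and predictor attributes
    prior = Counter(r[target] for r in rows)
    cond = defaultdict(Counter)          # (attribute, target_value) -> value counts
    for r in rows:
        for a, x in r.items():
            if a != target:
                cond[(a, r[target])][x] += 1
    return prior, cond

def impute(prior, cond, features):
    total = sum(prior.values())
    def score(v):
        p = prior[v] / total
        for a, x in features.items():
            counts = cond[(a, v)]
            p *= (counts[x] + 1) / (sum(counts.values()) + len(counts) + 1)  # add-one smoothing
        return p
    return max(prior, key=score)

# Example: fill in a missing Position from observed Staff rows
prior, cond = fit_nb(
    [{"Position": "Professor", "Dep": "d1"}, {"Position": "Postdoc", "Dep": "d3"}],
    target="Position")
print(impute(prior, cond, {"Dep": "d3"}))   # -> Postdoc
```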

Experimental results. Mutagenesis
- The most widely used database in ILP. It describes molecules of certain nitroaromatic compounds.
- Goal: predict their mutagenic activity (the label attribute), i.e. the ability to cause DNA to mutate. High mutagenic activity can cause cancer.
- Two subsets: regression friendly (188 molecules) and regression unfriendly (42 molecules). We used only the regression friendly subset.
- 5 levels of background knowledge: B0, B1, B2, B3, B4, providing increasingly richer descriptions of the examples. We used the B2 level.

Experimental results. Mutagenesis
Results of 10-fold cross-validation on the regression friendly set:

Data set    | Accuracy | Sel. graph size (max) | Tree size | Time with speed up | Time without speed up
mutagenesis | 87.5%    |                       |           |                    |

Best-known reported accuracy is 86%.
[Figure: schema of the mutagenesis database]

Experimental results. KDD Cup 2001
- The data set consists of a variety of details about the various genes of one particular type of organism. Genes code for proteins, and these proteins tend to localize in various parts of cells and interact with one another in order to perform crucial functions.
- 2 tasks: prediction of gene/protein localization and of gene/protein function
- 862 training genes, 381 test genes
- Many attribute values are missing: 70% of the CLASS attribute, 50% of COMPLEX, and 50% of MOTIF in the composition table
[Figure: schema of the genes database, including the FUNCTION attribute]

Experimental results. KDD Cup 2001

Localization task               | Accuracy | Sel. graph size (max) | Tree size | Time with speed up | Time without speed up
With handling missing values    | 76.11%   |                       |           | secs               | secs
Without handling missing values | 50.14%   |                       |           | secs               | secs
Best-known reported accuracy is 72.1%.

Function task                   | Accuracy | Sel. graph size (max) | Tree size (max) | Time with speed up | Time without speed up
With handling missing values    | 91.44%   |                       |                 | secs               | secs
Without handling missing values | 88.56%   |                       |                 | secs               | secs
Best-known reported accuracy is 93.6%.

Experimental results. PKDD 2001 Discovery Challenge
- The data set consists of 5 tables: PATIENT_INFO, DIAGNOSIS, THROMBOSIS, ANTIBODY_EXAM, ANA_PATTERN
- The target table consists of 1239 records
- The task is to predict the degree of the thrombosis attribute from the ANTIBODY_EXAM table
Results for 5:2 cross-validation:

Data set   | Accuracy | Sel. graph size (max) | Tree size | Time with speed up | Time without speed up
thrombosis | 98.1%    |                       |           |                    |

Best-known reported accuracy is 99.28%.

Summary
- The improved algorithm significantly outperforms the earlier MRDTL implementation in terms of running time
- The accuracy results are comparable with the best reported results obtained using different data-mining algorithms
Future work
- Incorporation of more sophisticated techniques for handling missing values
- Incorporation of more sophisticated pruning techniques or complexity regularization
- More extensive evaluation of MRDTL on real-world data sets
- Development of ontology-guided multi-relational decision tree learning algorithms to generate classifiers at multiple levels of abstraction [Zhang et al., 2002]
- Development of variants of MRDTL that can learn from heterogeneous, distributed, autonomous data sources, based on recently developed techniques for distributed learning and ontology-based data integration

Thanks
- to Dr. Honavar for providing guidance, help and support throughout this research
- to colleagues from the Artificial Intelligence Lab for various helpful discussions
- to my committee members, Drena Dobbs and Yan-Bin Jia, for their help
- to the professors and lecturers of the Computer Science department for the knowledge they gave me through lectures and discussions
- to Iowa State University and the Computer Science department for funding this research in part