Learning Support Vector Machine Classifiers from Distributed Data Sources. Artificial Intelligence Research Laboratory; Bioinformatics and Computational Biology Program; Computational Intelligence, Learning, and Discovery Program.

Presentation transcript:

Artificial Intelligence Research Laboratory; Bioinformatics and Computational Biology Program; Computational Intelligence, Learning, and Discovery Program; Department of Computer Science. AAAI 2005. Acknowledgements: This work is supported in part by grants from the National Science Foundation (IIS) and the National Institutes of Health (GM) to Vasant Honavar.

Learning Support Vector Machine Classifiers from Distributed Data Sources
Cornelia Caragea, Doina Caragea and Vasant Honavar

Learning from Data
Given a data set D, a hypothesis class H, and a performance criterion P, the learning algorithm L outputs a hypothesis h ∈ H that optimizes P.

Learning from Distributed Data
Given the data fragments D_1, ..., D_N of a data set D distributed across N sites, a set of constraints Z, a hypothesis class H, and a performance criterion P, the task of the learner L_d is to output a hypothesis h ∈ H that optimizes P, using only operations allowed by Z.

[Figure: two diagrams contrasting learning from data (the algorithm L applied to the complete data set D produces the classifier h) and learning from distributed data (the algorithm L_d applied to the fragments D_1, D_2, D_3 produces the classifier h).]

Support Vector Machines
The SVM finds a separating hyperplane that maximizes the margin of separation between the classes when the data are linearly separable. Kernels can be used to make data sets separable in high-dimensional feature spaces. The SVM is among the most effective machine learning algorithms for classification problems.

Our approach relies on identifying sufficient statistics for learning SVMs. We present an algorithm that learns SVMs from distributed data by iteratively computing the set of refinement sufficient statistics. Our algorithm is exact with respect to its centralized counterpart and efficient in terms of time complexity.

Sufficient Statistics
A statistic s_L(D) is a sufficient statistic for learning a hypothesis h using a learning algorithm L applied to a data set D if there exists a procedure that takes s_L(D) as input and outputs h. Usually, we cannot compute all the sufficient statistics at once. Instead, we can only compute the sufficient statistics for the refinement of a hypothesis h_i into a hypothesis h_{i+1}.

Exactness
An algorithm L_d for learning from the distributed data sets D_1, ..., D_N is said to be exact relative to its centralized counterpart L if the hypothesis produced by L_d is identical to the one obtained by L from the complete data set D formed by appropriately combining D_1, ..., D_N. Exactness condition: q(D) = C(q_1(D_1), ..., q_N(D_N)), i.e., the answer to a statistical query q against D can be composed from the answers to the queries posed against the individual fragments.

Statistical Query Formulation
[Figure: the learner alternates between statistical query formulation and hypothesis generation, h_{i+1} ← R(h_i, s(D, h_i -> h_{i+1})): starting from the partial hypothesis h_i, it poses the query s(D, h_i -> h_{i+1}) to a query answering engine, which decomposes it into queries q_1, ..., q_K against the data sources D_1, ..., D_N and composes their answers (information extraction from distributed data + hypothesis generation).]

For SVMs, the support vectors (x_i, y_i) and their corresponding coefficients λ_i can be seen as sufficient statistics.

[Figure: a linearly separable two-class data set, the separating hyperplane w · x + b = 0, the margin of separation, and the support vectors.] The optimal solution is the maximal-margin hyperplane:
\[
w^{*} = \arg\min_{w,b}\ \tfrac{1}{2}\,\|w\|^{2} \quad \text{subject to} \quad y_i\,(w \cdot x_i + b) \ge 1 \ \text{for all } i = 1, \dots, N.
\]

Learning Support Vector Machines from Distributed Data
Naïve approach: each data source D_1, D_2, D_3 returns its local support vectors SV(D_i); the query answering engine takes their union to form the set SV = {(x_i, y_i) | (x_i, y_i) is a support vector}; the SVM is then applied to the set SV to obtain SV(D). The resulting algorithm is not exact!

[Figure: counterexample to the naïve distributed SVM, showing that it can produce a separating hyperplane different from that of the centralized SVM.]

Exact learning requires all of the boundary information, VConv(D+) ∪ VConv(D−), where VConv(D) is the set of vertices that define the convex hull of D.
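For concreteness, the naïve union-of-support-vectors approach described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn's SVC as both the local and the central SVM solver, and the helper names (local_support_vectors, naive_distributed_svm, fragments) are hypothetical.

```python
# Minimal sketch of the naive distributed SVM: each site ships only its
# local support vectors, and a final SVM is trained on their union.
# Assumes scikit-learn's SVC; helper names are illustrative, not from the poster.
import numpy as np
from sklearn.svm import SVC


def local_support_vectors(X_i, y_i, kernel="linear", C=1.0):
    """Train an SVM at one data source and return its support vectors and labels."""
    clf = SVC(kernel=kernel, C=C).fit(X_i, y_i)
    return X_i[clf.support_], y_i[clf.support_]


def naive_distributed_svm(fragments, kernel="linear", C=1.0):
    """fragments: list of (X_i, y_i) array pairs, one pair per data source."""
    sv_X, sv_y = zip(*(local_support_vectors(X_i, y_i, kernel, C)
                       for X_i, y_i in fragments))
    X_union, y_union = np.vstack(sv_X), np.concatenate(sv_y)
    # Central step: apply the SVM to the union of the local support vectors.
    # As the counterexample above indicates, the resulting classifier need
    # not coincide with the SVM trained on the complete data set D.
    return SVC(kernel=kernel, C=C).fit(X_union, y_union)
```

A support vector of the complete data set D need not be a support vector of any single fragment D_i, so the union computed here can miss boundary points; this is why the naïve algorithm fails to be exact.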
An algorithm based on computing the convex hull vertices VConv(D+) ∪ VConv(D−), however, is exponential in the number of dimensions.

Exact and Efficient Learning of SVMs from Distributed Data
[Table: training and test accuracy of the naïve, iterative, and centralized algorithms, together with the number of iterations, on artificially generated data and on Human-Yeast protein data.]

SVM Algorithm
Learning Phase: SVM(D: data, K: kernel). Solve the optimization problem
\[
\max_{\lambda}\ \sum_{i} \lambda_i \;-\; \frac{1}{2} \sum_{i}\sum_{j} \lambda_i \lambda_j\, y_i y_j\, K(x_i, x_j)
\quad \text{subject to} \quad \lambda_i \ge 0,\ \ \sum_{i} \lambda_i y_i = 0,
\]
and let λ* be the solution of this optimization problem.
Classification Phase: For a new instance x, assign x to the class sign(Σ_i λ_i* y_i K(x_i, x) + b).

SVM from Horizontally Distributed Data
Learning Phase:
Initialize SV ← ∅ (the global set of support vectors).
repeat {
    Let SV_prev ← SV.
    Send SV to all data sources.
    for (each data source D_i) {
        Apply SVM to D_i ∪ SV and find the support vectors SV_i.
        Send the support vectors SV_i to the central location.
    }
    At the central location: compute the union SV_1 ∪ ... ∪ SV_N.
    Apply SVM to this union to find the new set SV.
} until SV = SV_prev.
Let SV be the set of final support vectors and λ_i* their corresponding weights.
Classification Phase: For a new instance x, assign x to the class sign(Σ_{(x_i, y_i) ∈ SV} λ_i* y_i K(x_i, x) + b).

We ran experiments on artificially generated data and on protein function classification data. The results showed that our algorithm converges to the exact solution in a relatively small number of steps, which makes it preferable to previous algorithms for learning from distributed data.
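To make the iterative learning phase concrete, here is a minimal sketch under the same assumptions as before (scikit-learn's SVC as the SVM subroutine; fragments, max_iter, and the helper names are hypothetical, not from the poster). It follows the loop above: ship the current global support-vector set SV to every site, retrain locally on D_i ∪ SV, take the union of the returned support vectors at the central location, retrain, and stop when SV no longer changes.

```python
# Sketch of the iterative SVM for horizontally distributed data, assuming
# scikit-learn's SVC as the SVM subroutine; names such as `fragments` and
# `max_iter` are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC


def _fit_svs(X, y, kernel, C):
    """Run the SVM subroutine and return (support vectors, labels, model)."""
    clf = SVC(kernel=kernel, C=C).fit(X, y)
    return X[clf.support_], y[clf.support_], clf


def iterative_distributed_svm(fragments, kernel="linear", C=1.0, max_iter=100):
    n_features = fragments[0][0].shape[1]
    sv_X = np.empty((0, n_features))                     # global SV set, initially empty
    sv_y = np.empty((0,), dtype=fragments[0][1].dtype)
    clf = None
    for _ in range(max_iter):
        local_X, local_y = [], []
        for X_i, y_i in fragments:
            # Each data source trains on D_i augmented with the current
            # global support vectors and returns its own support vectors.
            Xs, ys, _ = _fit_svs(np.vstack([X_i, sv_X]),
                                 np.concatenate([y_i, sv_y]), kernel, C)
            local_X.append(Xs)
            local_y.append(ys)
        # Central location: take the union and retrain to get the new SV set.
        new_X, new_y, clf = _fit_svs(np.vstack(local_X),
                                     np.concatenate(local_y), kernel, C)
        unchanged = (new_X.shape == sv_X.shape and
                     {tuple(r) for r in np.round(new_X, 12)} ==
                     {tuple(r) for r in np.round(sv_X, 12)})
        sv_X, sv_y = new_X, new_y
        if unchanged:
            break  # the global support-vector set has stabilized
    # clf.decision_function(x) realizes sum_i lambda_i* y_i K(x_i, x) + b,
    # and clf.predict(x) assigns its sign, matching the classification phase.
    return clf
```

Stopping once the support-vector set stabilizes mirrors the "until SV does not change" test in the learning phase above; the experiments reported on the poster indicate that this fixed point is reached in a relatively small number of iterations.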