Feature Selection for Pattern Recognition J.-S. Roger Jang ( 張智星 ) CSIE Dept., National Taiwan University ( 台灣大學 資訊工程系 )

Presentation transcript:


Feature Selection: Goal & Benefits
Feature selection is also known as input selection.
Goal: select a subset of the original feature set that gives a better recognition rate.
Benefits:
- Improved recognition rate
- Reduced computation load
- Insight into the relationships between features and classes

Exhaustive Search
Steps for direct exhaustive search (a code sketch follows):
1. Use KNNC as the classifier and LOO for the RR (recognition rate) estimate.
2. Generate all combinations of features and evaluate them one by one.
3. Select the feature combination that has the best RR.
Drawback: for d = 10 features there are already 2^10 - 1 = 1023 models to evaluate, so the search is time-consuming.
Advantage: the optimal feature set is guaranteed to be found.
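Below is a minimal sketch of this procedure, assuming scikit-learn's KNeighborsClassifier (with k = 1 as the KNNC) and LeaveOneOut for the RR estimate; the slides do not prescribe an implementation, so all names here are illustrative.

```python
# Exhaustive feature-subset search: a sketch, not the author's code.
from itertools import combinations

import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def exhaustive_search(X, y):
    """Evaluate every non-empty feature subset; return the best one and its LOO RR."""
    d = X.shape[1]
    best_subset, best_rr = None, -np.inf
    for k in range(1, d + 1):
        for subset in combinations(range(d), k):
            # Mean of the per-sample LOO accuracies = LOO recognition rate
            rr = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                                 X[:, list(subset)], y,
                                 cv=LeaveOneOut()).mean()
            if rr > best_rr:
                best_subset, best_rr = subset, rr
    return best_subset, best_rr
```

Note that the inner loop body runs 2^d - 1 times in total, which is exactly why the slide warns about d = 10.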

Exhaustive Search
[Diagram: the search tree of direct exhaustive search over five features x1..x5, enumerating every subset level by level: 1 input (x1, ..., x5), 2 inputs (x1+x2, x1+x3, ...), 3 inputs, 4 inputs, and so on.]

Exhaustive Search
Characteristics of exhaustive search for feature selection (a generalized sketch follows):
- The process is time-consuming, but the identified feature set is optimal.
- Classifiers other than KNNC can be used.
- Performance indices other than LOO can be used.
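Since the slide notes that both the classifier and the performance index are interchangeable, here is a hypothetical generalization of the sketch above, parameterized by both; SVC and StratifiedKFold are just example stand-ins, not choices made by the slides.

```python
# Hypothetical variant: any scikit-learn classifier and CV scheme can
# replace 1-NN + LOO when scoring a feature subset.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

def subset_score(X, y, subset, clf=None, cv=None, scoring="accuracy"):
    """Score one feature subset with a pluggable classifier and performance index."""
    clf = clf if clf is not None else SVC(kernel="linear")
    cv = cv if cv is not None else StratifiedKFold(n_splits=5)
    return cross_val_score(clf, X[:, list(subset)], y,
                           cv=cv, scoring=scoring).mean()
```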

Heuristic Search
Heuristic search methods for input selection:
- One-pass ranking
- Sequential forward selection
- Generalized sequential forward selection
- Sequential backward selection
- Generalized sequential backward selection
- "Add m, remove n" selection
- Generalized "add m, remove n" selection

Sequential Forward Selection
Steps for sequential forward selection (a code sketch follows):
1. Use KNNC as the classifier and LOO for the RR estimate.
2. Select the single feature that has the best RR.
3. Select the next feature (among all unselected features) that, together with the already selected features, gives the best RR.
4. Repeat the previous step until all features are selected.
Advantage: with d features we only need to evaluate d + (d-1) + ... + 1 = d(d+1)/2 models, which is far more efficient than exhaustive search.
Drawback: the selected feature set is not always optimal.
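A minimal sketch of SFS under the same assumptions as the exhaustive-search sketch (1-NN classifier, LOO recognition rate); again, the function and variable names are my own, not from the slides.

```python
# Sequential forward selection: a sketch, not the author's code.
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def sequential_forward_selection(X, y):
    """Greedily add, at each step, the feature giving the best LOO RR."""
    selected, history = [], []
    remaining = list(range(X.shape[1]))
    while remaining:
        # LOO RR of each candidate feature joined with the selected ones
        scores = {f: cross_val_score(KNeighborsClassifier(n_neighbors=1),
                                     X[:, selected + [f]], y,
                                     cv=LeaveOneOut()).mean()
                  for f in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), scores[best]))
    return history  # one (feature list, RR) entry per model size
```

After the loop, one typically keeps the entry of `history` with the highest RR rather than the full feature set.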

Sequential Forward Selection
[Diagram: the SFS search path over five features. At the 1-input level x2 wins; at the 2-input level x2+x4 wins; at the 3-input level x2+x4+x3 wins; and so on, adding one feature per level.]

Example: Iris Dataset
[Plots: feature-selection results on the Iris dataset, one panel for sequential forward selection and one for exhaustive search.]

Example: Wine Dataset
[Plots: two panels, SFS and SFS with input normalization, on the Wine dataset; see the normalization snippet below.]
- SFS: 3 selected features, LOO RR = 93.8%
- SFS with input normalization: 6 selected features, LOO RR = 97.8%
- For comparison, exhaustive search selects 8 features with LOO RR = 99.4%.
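The normalization variant can be illustrated as follows: z-score each feature before the distance-based 1-NN so that no feature's scale dominates the distance computation. StandardScaler is scikit-learn's z-score transform; sequential_forward_selection refers to the sketch above.

```python
# Illustrative only: input normalization before SFS.
from sklearn.preprocessing import StandardScaler

X_norm = StandardScaler().fit_transform(X)  # zero mean, unit variance per feature
history = sequential_forward_selection(X_norm, y)
best_features, best_rr = max(history, key=lambda entry: entry[1])
```

Strictly speaking, the scaler should be refit inside each LOO fold to avoid a small information leak; fitting it once on all the data, as here, is a common simplification.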

Use of Input Selection
Common use of input selection (a sketch follows):
- Increase the model complexity sequentially by adding more inputs.
- Select the model that has the best test RR.
[Plot: typical curve of error rate vs. model complexity (number of selected inputs). Training error keeps decreasing as inputs are added, while test error falls and then rises; the optimal structure is the one with the least test error.]
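One way to realize this, sketched under the same assumptions as before (1-NN, and the sequential_forward_selection function above), is to grow the model on a training split and pick the input count with the best accuracy on a held-out test split:

```python
# Sketch: choose the number of inputs by held-out test RR.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
history = sequential_forward_selection(X_tr, y_tr)
test_rrs = [KNeighborsClassifier(n_neighbors=1)
            .fit(X_tr[:, subset], y_tr)
            .score(X_te[:, subset], y_te)
            for subset, _loo_rr in history]
best_n_inputs = int(np.argmax(test_rrs)) + 1  # history entry k has k+1 inputs
```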