Design of Virtual Metrology Models by Machine Learning: A Matlab Prototype
A. Ferreira 1, G. Pages 1, Y. Oussar 2
1 Ecole Nat. Sup. des Mines de Saint-Etienne, 2 ESPCI Paristech

Motivation / Conclusion

Two EMSE-CMP original contributions in variable ranking and selection were implemented with the LS-SVM regression method in a Matlab prototype for the design of VM (virtual metrology) models. The prototype was validated using real data from two case studies:
- Austriamicrosystems: prediction of PECVD (Plasma Enhanced Chemical Vapor Deposition) oxide thickness for Inter-Metal Dielectric (IMD) layers.
- STMicroelectronics, Rousset: prediction of the overlay of the photolithography process.

Variable ranking and selection

The main scientific contributions of EMSE-CMP are filter and wrapper methods for variable ranking and selection. Some manufacturing processes have a very large number of input variables, which yields complex predictive models with poor generalization capabilities: confidence in a model is higher when it uses a small number of adjusted parameters. In addition, taking irrelevant variables into account introduces noise that leads to overfitting, and hence again to poor generalization. The goal of variable ranking and selection is therefore to determine the smallest subset of variables that carries as much information as possible for explaining the dependent variable, while discarding redundant and/or irrelevant (i.e., poorly informative) variables. EMSE-CMP makes two main contributions here:
1) A filter method: mutual-information-based variable selection using a probe feature (sketched below).
2) A wrapper method based on a meta-heuristic, namely a Tabu search algorithm (TabuWrap; also sketched below).

Methodology

We propose a methodology for building VM models using machine learning techniques. After standard data pre-processing, a variable ranking and selection procedure is applied to determine the variables that are most relevant for predicting the metrology variables. Techniques from statistical learning theory, such as PLS regression and Least Squares Support Vector Machine (LS-SVM) regression, are implemented. For each technique, the model parameters are estimated using a training algorithm, and a k-fold cross-validation (or leave-one-out) procedure is used to select the model that exhibits the best generalization capabilities; its performance is then estimated on a test dataset (both steps are sketched below). The EMSE-CMP methodology was implemented in a Matlab prototype dedicated to the design of virtual metrology models.

[Figure: basic diagram for the design of VM models based on machine learning techniques.]

Filter and wrapper approaches

Approach | Pros                                             | Cons
Filter   | Model-free; low computational cost               | Fairly irregular; may degrade performance
Wrapper  | Consistent; high accuracy; improves performance  | Computational burden

Matlab prototype: case study

STMicroelectronics, Rousset site: prediction of the overlay of the photolithography process. 43 variables out of 169 were selected.
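Sketch 1: LS-SVM regression. The slides name LS-SVM regression but do not show the training equations. Below is a minimal Matlab sketch of the standard (Suykens-style) dual formulation with an RBF kernel; the function names lssvm_train/lssvm_predict and the hyperparameter names gamma and sigma2 are ours, not necessarily the prototype's. Training reduces to solving one linear system; pdist2 requires the Statistics and Machine Learning Toolbox.

% LS-SVM regression (standard dual formulation).
% Training solves the (n+1)-by-(n+1) linear system
%   [ 0   1'              ] [ b     ]   [ 0 ]
%   [ 1   Omega + I/gamma ] [ alpha ] = [ y ]
% where Omega(i,j) = exp(-||x_i - x_j||^2 / (2*sigma2)) is the RBF kernel.
function [alpha, b] = lssvm_train(X, y, gamma, sigma2)
    n = size(X, 1);
    Omega = exp(-pdist2(X, X).^2 / (2 * sigma2));   % RBF kernel matrix
    A = [0, ones(1, n); ones(n, 1), Omega + eye(n) / gamma];
    sol = A \ [0; y];                               % y is an n-by-1 column
    b = sol(1);
    alpha = sol(2:end);
end

function yhat = lssvm_predict(Xnew, Xtrain, alpha, b, sigma2)
    K = exp(-pdist2(Xnew, Xtrain).^2 / (2 * sigma2));
    yhat = K * alpha + b;
end

Here gamma trades off data fit against smoothness and sigma2 is the kernel width; these are exactly the quantities the cross-validation procedure has to select.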
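Sketch 2: k-fold cross-validation for model selection. A plain sketch built on the hypothetical lssvm_train/lssvm_predict above; the prototype's actual selection procedure may differ in details such as fold construction or the error criterion.

% Cross-validated mean squared error of an LS-SVM for one (gamma, sigma2)
% pair; model selection loops over a candidate grid and keeps the minimizer.
function cvErr = lssvm_kfold(X, y, k, gamma, sigma2)
    n = size(X, 1);
    fold = mod(randperm(n), k) + 1;     % random assignment to folds 1..k
    err = zeros(k, 1);
    for f = 1:k
        tr = (fold ~= f);               % training mask for this fold
        [alpha, b] = lssvm_train(X(tr, :), y(tr), gamma, sigma2);
        yhat = lssvm_predict(X(~tr, :), X(tr, :), alpha, b, sigma2);
        err(f) = mean((y(~tr) - yhat).^2);
    end
    cvErr = mean(err);                  % setting k = n gives leave-one-out
end

A grid search then amounts to calling lssvm_kfold for each candidate pair and retraining on all data with the best one; the final performance estimate must come from a held-out test set, as the slides stress.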
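Sketch 3: mutual-information filter with a probe feature. The filter contribution is only named on the slides, so the following shows one common form of the probe idea: rank variables by estimated mutual information with the target, and keep only those that score above random probes known to carry no information. Everything here, including probe_filter and the deliberately crude histogram estimator mi_hist, is our illustration under that assumption, not the authors' estimator.

% Keeps the variables whose estimated mutual information with the target
% exceeds that of every random probe (assumes no NaNs in the data).
function keep = probe_filter(X, y, nProbes)
    [n, p] = size(X);
    mi = zeros(p, 1);
    for j = 1:p
        mi(j) = mi_hist(X(:, j), y);
    end
    probeMI = zeros(nProbes, 1);
    for k = 1:nProbes
        probeMI(k) = mi_hist(randn(n, 1), y);   % uninformative probe
    end
    keep = find(mi > max(probeMI));             % conservative threshold
end

function v = mi_hist(a, b)
    % Crude 2-D histogram estimate of I(a; b) in nats, for illustration only
    nb = 10;
    [~, ~, ia] = histcounts(a, nb);
    [~, ~, ib] = histcounts(b, nb);
    P = accumarray([ia, ib], 1, [nb, nb]) / numel(a);   % joint distribution
    Pa = sum(P, 2);
    Pb = sum(P, 1);
    R = P .* log(P ./ (Pa * Pb));   % 0*log(0) terms handled by the mask below
    v = sum(R(P > 0));
end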
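Sketch 4: TabuWrap. Likewise, TabuWrap is only named on the slides; a generic tabu-search wrapper over variable subsets might look as follows. evalFcn would be the cross-validated error of the model restricted to the candidate subset, e.g. @(m) lssvm_kfold(X(:, m), y, 10, gamma, sigma2); the authors' actual algorithm may use a different move set or stopping rule.

% Explores variable subsets by flipping one variable in or out per
% iteration; a just-flipped variable is tabu for 'tenure' iterations.
% Kept deliberately simple (no aspiration criterion).
function best = tabu_wrap(p, evalFcn, nIter, tenure)
    cur = false(1, p);
    cur(randi(p)) = true;              % seed with one random variable
    best = cur;
    bestErr = evalFcn(cur);
    tabuUntil = zeros(1, p);           % iteration until which a flip is tabu
    for t = 1:nIter
        moveErr = inf(1, p);
        for j = 1:p
            if tabuUntil(j) >= t, continue; end
            cand = cur;
            cand(j) = ~cand(j);
            if any(cand)               % never evaluate the empty subset
                moveErr(j) = evalFcn(cand);
            end
        end
        [e, j] = min(moveErr);         % best non-tabu neighbor
        if isinf(e), break; end        % every move is tabu or empty
        cur(j) = ~cur(j);
        tabuUntil(j) = t + tenure;
        if e < bestErr
            best = cur;
            bestErr = e;
        end
    end
end

Because every iteration retrains and cross-validates the model for up to p candidate subsets, this sketch also makes the "computational burden" entry in the wrapper column of the table above concrete.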