Integrating Genetic Algorithms with Conditional Random Fields to Enhance Question Informer Prediction

Presentation transcript:

Integrating Genetic Algorithms with Conditional Random Fields to Enhance Question Informer Prediction
Min-Yuh Day 1,2, Chun-Hung Lu 1,2, Chorng-Shyong Ong 2, Shih-Hung Wu 3, and Wen-Lian Hsu 1,* (Fellow, IEEE)
1 Institute of Information Science, Academia Sinica, Taiwan
2 Department of Information Management, National Taiwan University, Taiwan
3 Department of CSIE, Chaoyang University of Technology, Taiwan
IEEE IRI 2006, Waikoloa, Hawaii, USA, Sep 16-18, 2006

Min-Yuh Day (NTU; SINICA)

Outline
- Introduction
- Research Background
- The Hybrid GA-CRF Model
- Experimental Design
- Experimental Results
- Conclusions

Introduction
- Question informers play an important role in enhancing question classification for factual question answering.
- Question informer: a minimal, appropriate contiguous span of one or more question tokens, chosen as the informer span of a question, that is adequate for question classification.
- An example of a question informer
  - Question: "What is the biggest city in the United States?"
  - Question informer: "city"
  - "city" is the most important clue in the question for question classification.

Introduction (cont.)
- Previous work has used Conditional Random Fields (CRFs) to identify question informer spans.
- We propose a hybrid approach that integrates a Genetic Algorithm (GA) with a CRF to optimize feature subset selection in CRF-based question informer prediction models.

Research Background
- Conditional Random Fields (CRFs)
  - A framework for building probabilistic models to segment and label sequence data
  - A CRF models Pr(y|x) using a Markov random field
  - Advantage over traditional models: Hidden Markov Models (HMMs) and Maximum Entropy Markov Models (MEMMs)
- CRF++
  - Open source implementation of CRFs
  - Segmenting and labeling sequence data
  - Flexible to redefine feature sets in feature templates
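The slide says only that a CRF models Pr(y|x) with a Markov random field. For sequence labeling this is usually the linear-chain form, written below in the standard notation of the CRF literature. The symbols f_k (feature functions), lambda_k (weights), T (sequence length) and Z(x) (normalizer) are not taken from the slides.

```latex
\Pr(y \mid x) = \frac{1}{Z(x)}
  \exp\!\Bigg( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, x, t) \Bigg),
\qquad
Z(x) = \sum_{y'} \exp\!\Bigg( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, x, t) \Bigg)
```

Because Z(x) normalizes over whole label sequences rather than per position, the model is not restricted to locally normalized transitions, which is the usual argument for CRFs over MEMMs.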

Research Background (cont.)
- Genetic Algorithms (GAs)
  - A class of heuristic search methods and computational models of adaptation and evolution based on the mechanics of natural selection and genetics
  - Feature selection in machine learning
  - Feature subset optimization

The Hybrid GA-CRF Model
- Encode a feature subset of the CRF with the structure of chromosomes
- Initialize the population
- Evaluate each chromosome (fitness function): CRF model with 10-fold cross-validation
- Check whether the stopping criteria are satisfied
- If not, apply the GA operators and produce a new generation
- Apply the selected feature subset to the CRF on the test dataset
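The evaluation step can be sketched as a fitness function that averages a held-out score over 10 folds. This is a minimal illustration, not the authors' code: `train_and_score` is a hypothetical callback standing in for "train a CRF on these questions with this feature subset and score it on the held-out ones", and `dummy_scorer` is a placeholder so the sketch runs without a CRF toolkit.

```python
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, held-out) index pairs."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_items) if i not in held]
        splits.append((train, held_out))
    return splits


def cv_fitness(
    chromosome: Sequence[int],
    n_items: int,
    train_and_score: Callable[[Sequence[int], List[int], List[int]], float],
    k: int = 10,
) -> float:
    """Mean held-out score of one feature subset over k folds."""
    scores = [
        train_and_score(chromosome, train, held_out)
        for train, held_out in k_fold_indices(n_items, k)
    ]
    return sum(scores) / len(scores)


def dummy_scorer(chromosome: Sequence[int], train: List[int], held_out: List[int]) -> float:
    """Hypothetical stand-in for CRF training and scoring; ignores the data."""
    return sum(chromosome) / len(chromosome)


splits = k_fold_indices(5500, k=10)  # 5500 training questions, as in the slides
fitness = cv_fitness([1, 0, 1, 1], n_items=5500, train_and_score=dummy_scorer)
```

In a real run the callback would be the expensive part, since every fitness evaluation trains ten models.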

Hybrid GA-CRF Approach for Question Informer Prediction
[Flowchart]
GA-CRF learning (training dataset):
1. Encode a feature subset of the CRF with the structure of chromosomes
2. Initialization: create the population
3. Evaluate (fitness function): for each feature subset x, compute the fitness F(x) with a CRF model under 10-fold cross-validation
4. Stopping criteria satisfied?
   - No: apply the GA operators (reproduction, crossover, mutation) to produce a new population, then evaluate again
   - Yes: output the near optimal feature subset of the CRF
CRF-based question informer prediction (test dataset):
5. Build the near optimal CRF prediction model from the near optimal feature subset and apply it to the test dataset
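The loop in this flowchart can be sketched as follows. The default rates are the ones reported on the experimental results slide (population 40, crossover 80%, mutation 10%, 100 generations). Everything else is an assumption made for illustration: fitness-proportional selection for "reproduction", single-point crossover, one bit flip per mutated chromosome, a fixed generation count as the stopping criterion, and `toy_fitness` in place of the CRF cross-validation score.

```python
import random
from typing import Callable, List

Chromosome = List[int]


def run_ga(
    n_genes: int,
    fitness: Callable[[Chromosome], float],
    pop_size: int = 40,          # should be even so parents pair up
    crossover_rate: float = 0.8,
    mutation_rate: float = 0.1,
    generations: int = 100,
    seed: int = 0,
) -> Chromosome:
    """Return the best bit-string found; fitness must be non-negative."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Reproduction: sample parents in proportion to fitness.
        weights = [fitness(c) + 1e-9 for c in population]
        parents = rng.choices(population, weights=weights, k=pop_size)
        next_pop: List[Chromosome] = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]  # copy so parents are never modified
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)
                a[cut:], b[cut:] = b[cut:], a[cut:]  # single-point crossover
            next_pop.extend([a, b])
        for c in next_pop:
            if rng.random() < mutation_rate:
                c[rng.randrange(n_genes)] ^= 1  # flip one gene
        population = next_pop
        best = max(population + [best], key=fitness)  # keep the best seen so far
    return best


# Hypothetical target mask, used only to give the toy fitness something to find.
TARGET = [i % 2 for i in range(20)]


def toy_fitness(chromosome: Chromosome) -> float:
    """Fraction of genes that agree with TARGET (stand-in for F(x))."""
    return sum(int(g == t) for g, t in zip(chromosome, TARGET)) / len(TARGET)


best = run_ga(n_genes=20, fitness=toy_fitness)
```

Swapping `toy_fitness` for a cross-validated CRF score turns the sketch into the learning half of the flowchart; the returned chromosome is then the near optimal feature subset.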

Gene structure of chromosomes for a feature subset
[Figure: a population of m chromosomes (Chromosome 1, Chromosome 2, Chromosome 3, ..., Chromosome m-2, Chromosome m-1, Chromosome m). Each chromosome has n genes F1, F2, F3, ..., Fn-2, Fn-1, Fn, used for feature subset selection.]

Example of feature subset encoding for GA
[Figure: one chromosome with genes F1, F2, F3, ..., Fn-2, Fn-1, Fn, one gene per feature]
Feature subset = {F1, F3, ..., Fn-1, Fn}
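A small sketch of this encoding, assuming the usual convention that a gene value of 1 means "feature selected" and 0 means "feature dropped" (the slide does not spell out the bit values). The concrete case n = 6 with subset {F1, F3, F5, F6} is a made-up instance of the pattern {F1, F3, ..., Fn-1, Fn}.

```python
from typing import List, Set


def encode(selected: Set[int], n_features: int) -> List[int]:
    """Turn 1-based feature numbers (F1..Fn) into a bit list of length n."""
    return [1 if i in selected else 0 for i in range(1, n_features + 1)]


def decode(chromosome: List[int]) -> List[str]:
    """Turn a bit list back into the names of the selected features."""
    return [f"F{i}" for i, bit in enumerate(chromosome, start=1) if bit]


chromosome = encode({1, 3, 5, 6}, n_features=6)
subset = decode(chromosome)
```

With a fixed-length bit list, crossover and mutation can operate on any chromosome without knowing what the features mean.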

Experimental Design
- Data set
  - UIUC QC dataset (Li and Roth, 2002)
  - Question informer dataset (Krishnan et al., 2005)
  - Training questions: 5500
  - Test questions: 500

Features of Question Informer
- Question informer tags for the CRF model
  - O-QIF0: outside and before a question informer
  - B-QIF1: the start of a question informer
  - O-QIF2: outside and after a question informer
- 21 basic feature candidates
  - Word, POS, heuristic informer, parser information, token information, question wh-word, length, position
- 5 sliding windows
- We generate 105 (21 * 5) features (genes) for each chromosome
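The count of 105 genes is the product of 21 feature columns and 5 window offsets. The sketch below enumerates them. The ordering (all five offsets of column 0 first, then column 1, and so on) is an assumption inferred from the F1 to F10 example later in the deck, and `gene_id` is a hypothetical helper name.

```python
N_BASIC_FEATURES = 21               # feature columns 0..20
WINDOW_OFFSETS = (-2, -1, 0, 1, 2)  # 5 sliding-window positions

# One gene per (window offset, feature column) pair, column by column.
genes = [
    (offset, column)
    for column in range(N_BASIC_FEATURES)
    for offset in WINDOW_OFFSETS
]


def gene_id(offset: int, column: int) -> int:
    """1-based gene number (F1..F105) under the assumed ordering."""
    return column * len(WINDOW_OFFSETS) + WINDOW_OFFSETS.index(offset) + 1
```

Under this ordering the word two tokens to the left is F1, and the position feature two tokens to the right is F105.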

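Features for question informer prediction

| ID | Feature name | Description | Feature template for CRF++ |
|----|--------------|-------------|----------------------------|
| 1 | Word | | U01:%x[0,0] |
| 2 | POS | | U01:%x[0,1] |
| 3 | HQI | Heuristic Informer | U01:%x[0,2] |
| 4 | Token | | U01:%x[0,3] |
| 5 | ParserL0 | Parser Level 0 | U01:%x[0,4] |
| 6 | ParserL1 | Parser Level 1 | U01:%x[0,5] |
| 7 | ParserL2 | Parser Level 2 | U01:%x[0,6] |
| 8 | ParserL3 | Parser Level 3 | U01:%x[0,7] |
| 9 | ParserL4 | Parser Level 4 | U01:%x[0,8] |
| 10 | ParserL5 | Parser Level 5 | U01:%x[0,9] |
| 11 | ParserL6 | Parser Level 6 | U01:%x[0,10] |
| 12 | IsTag | | U01:%x[0,11] |
| 13 | IsNum | Is Number | U01:%x[0,12] |
| 14 | IsPrevTag | Is Previous Tag | U01:%x[0,13] |
| 15 | IsNextTag | | U01:%x[0,14] |
| 16 | IsEdge | | U01:%x[0,15] |
| 17 | IsBegin | | U01:%x[0,16] |
| 18 | IsEnd | | U01:%x[0,17] |
| 19 | Wh-word | Question Wh-word (6W1H1O) | U01:%x[0,18] |
| 20 | Length | Question Length | U01:%x[0,19] |
| 21 | Position | Token Position | U01:%x[0,20] |

[F-score and Feature Rank columns: values not recoverable]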

Data format for CRF model
Question: "What is the oldest city in the United States?"

Features f_ij for x_i

Data for the question (column j = 0 holds the token x_i, column j = 1 holds the POS tag, y_i is the informer tag):

| i | x_i (j = 0) | POS (j = 1) | y_i |
|---|-------------|-------------|-----|
| 0 | What | WP | O-QIF0 |
| 1 | is | VBZ | O-QIF0 |
| 2 | the | DT | O-QIF0 |
| 3 | oldest | JJS | O-QIF0 |
| 4 | city | NN | B-QIF1 |
| 5 | in | IN | O-QIF2 |
| 6 | the | DT | O-QIF2 |
| 7 | United | NNP | O-QIF2 |
| 8 | States | NNPS | O-QIF2 |
| 9 | ? | . | |

Sliding windows i-2, i-1, i+0, i+1, i+2, shown around the current token "city":

| i | x_i (j = 0) | POS (j = 1) | y_i |
|---|-------------|-------------|-----|
| -2 | the [-2,0] | DT [-2,1] | O-QIF0 |
| -1 | oldest [-1,0] | JJS [-1,1] | O-QIF0 |
| 0 | city [0,0] | NN [0,1] | B-QIF1 |
| +1 | in [+1,0] | IN [+1,1] | O-QIF2 |
| +2 | the [+2,0] | DT [+2,1] | O-QIF2 |

Features f_ij for x_i = Uid:%x[i, j]
the => f(-2,0) => U00:%x[-2,0] => F1
oldest => f(-1,0) => U01:%x[-1,0] => F2
city => f(0,0) => U02:%x[0,0] => F3
in => f(+1,0) => U03:%x[+1,0] => F4
the => f(+2,0) => U04:%x[+2,0] => F5
DT => f(-2,1) => U05:%x[-2,1] => F6
JJS => f(-1,1) => U06:%x[-1,1] => F7
NN => f(0,1) => U07:%x[0,1] => F8
IN => f(+1,1) => U08:%x[+1,1] => F9
DT => f(+2,1) => U09:%x[+2,1] => F10
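To make the %x[i, j] notation concrete, the sketch below resolves a relative row offset i and a column j against the example question, with "city" as the current token. It is a simplified illustration: `expand` is a hypothetical helper, and returning `None` outside the sentence is a simplification rather than a description of how CRF++ handles sentence boundaries.

```python
from typing import List, Optional, Tuple

# (token, POS) rows for "What is the oldest city in the United States?"
ROWS: List[Tuple[str, str]] = [
    ("What", "WP"), ("is", "VBZ"), ("the", "DT"), ("oldest", "JJS"),
    ("city", "NN"), ("in", "IN"), ("the", "DT"), ("United", "NNP"),
    ("States", "NNPS"), ("?", "."),
]


def expand(rows: List[Tuple[str, str]], position: int, offset: int, column: int) -> Optional[str]:
    """Value that %x[offset,column] points at when the current token is rows[position]."""
    i = position + offset
    if 0 <= i < len(rows):
        return rows[i][column]
    return None  # outside the sentence (simplified boundary handling)


CURRENT = 4  # index of "city"
words = [expand(ROWS, CURRENT, off, 0) for off in (-2, -1, 0, 1, 2)]
pos_tags = [expand(ROWS, CURRENT, off, 1) for off in (-2, -1, 0, 1, 2)]
```

The two lists correspond to features F1 to F5 (words) and F6 to F10 (POS tags) for the token "city".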

Feature generation and feature template for CRF++

| Feature | Features f_ij | Feature Template | Feature ID |
|---------|---------------|------------------|------------|
| the | f(-2,0) | U00:%x[-2,0] | F1 |
| oldest | f(-1,0) | U01:%x[-1,0] | F2 |
| city | f(0,0) | U02:%x[0,0] | F3 |
| in | f(+1,0) | U03:%x[+1,0] | F4 |
| the | f(+2,0) | U04:%x[+2,0] | F5 |
| DT | f(-2,1) | U05:%x[-2,1] | F6 |
| JJS | f(-1,1) | U06:%x[-1,1] | F7 |
| NN | f(0,1) | U07:%x[0,1] | F8 |
| IN | f(+1,1) | U08:%x[+1,1] | F9 |
| DT | f(+2,1) | U09:%x[+2,1] | F10 |
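The template column of this table follows a regular pattern, so it can be generated rather than typed. The sketch below is one way to do that under two assumptions: identifiers are numbered from U00 in table order, and positive offsets are written without a plus sign (as in the selected-subset slide later in the deck). `template_lines` is a hypothetical helper name.

```python
from typing import List, Sequence


def template_lines(n_columns: int, offsets: Sequence[int] = (-2, -1, 0, 1, 2)) -> List[str]:
    """Unigram template lines for every (offset, column) pair, column by column."""
    lines: List[str] = []
    for column in range(n_columns):
        for offset in offsets:
            uid = len(lines)  # U00, U01, ... in generation order
            lines.append(f"U{uid:02d}:%x[{offset},{column}]")
    return lines


# Two columns (word and POS) reproduce the ten rows F1..F10 of the table.
lines = template_lines(n_columns=2)
feature_ids = {f"F{n}": line for n, line in enumerate(lines, start=1)}
```

Calling `template_lines(21)` would list all 105 candidate templates, although the two-digit identifier width only stays uniform up to U99.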

Encoding a feature subset with the structure of chromosomes for GA

Features f_ij for x_i = Uid:%x[i, j]

| Feature | Features f_ij | Feature Template | Feature ID |
|---------|---------------|------------------|------------|
| the | f(-2,0) | U00:%x[-2,0] | F1 |
| oldest | f(-1,0) | U01:%x[-1,0] | F2 |
| city | f(0,0) | U02:%x[0,0] | F3 |
| in | f(+1,0) | U03:%x[+1,0] | F4 |
| the | f(+2,0) | U04:%x[+2,0] | F5 |
| DT | f(-2,1) | U05:%x[-2,1] | F6 |
| JJS | f(-1,1) | U06:%x[-1,1] | F7 |
| NN | f(0,1) | U07:%x[0,1] | F8 |
| IN | f(+1,1) | U08:%x[+1,1] | F9 |
| DT | f(+2,1) | U09:%x[+2,1] | F10 |

[Figure: one chromosome with genes F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, ...]
Feature subset = {F1, F3, F4, F7, F8, F10}

There are 105 features in total (21 basic features * 5 sliding windows), so each chromosome has 105 genes.
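Going from a chromosome to a usable template list can be sketched as a simple filter. The bit values below are an assumption: they are the 0/1 pattern implied by the subset {F1, F3, F4, F7, F8, F10} if 1 means "selected". Renumbering the surviving templates from U001 mirrors the style of the selected-subset slide but is likewise an illustrative choice.

```python
from typing import List, Sequence, Tuple

OFFSETS = (-2, -1, 0, 1, 2)
# Candidate genes F1..F10: word column 0 and POS column 1 over five offsets.
CANDIDATES: List[Tuple[int, int]] = [(off, col) for col in (0, 1) for off in OFFSETS]


def selected_templates(chromosome: Sequence[int], candidates: Sequence[Tuple[int, int]]) -> List[str]:
    """Keep candidates whose gene is 1 and renumber them U001, U002, ..."""
    kept = [pair for bit, pair in zip(chromosome, candidates) if bit]
    return [f"U{n:03d}:%x[{off},{col}]" for n, (off, col) in enumerate(kept, start=1)]


# Genes for {F1, F3, F4, F7, F8, F10}, assuming 1 = selected.
chromosome = [1, 0, 1, 1, 0, 0, 1, 1, 0, 1]
templates = selected_templates(chromosome, CANDIDATES)
```

Writing `templates` to a file, one per line, would give the feature template for training a model on just that subset.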

Experimental Results

Experimental results of CRF-based question informer prediction using GA
GA parameters: population 40, crossover 80%, mutation 10%, generations 100
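A rough back-of-the-envelope calculation shows why a heuristic search is used here. It combines these GA settings with figures from earlier slides (105 genes, 10-fold cross-validation) and assumes, as an upper bound, that every chromosome is scored once per generation with no caching of repeated subsets.

```python
N_GENES = 105      # 21 basic features * 5 sliding windows
POPULATION = 40
GENERATIONS = 100
FOLDS = 10         # 10-fold cross-validation per fitness evaluation

n_subsets = 2 ** N_GENES                  # every possible feature subset
evaluations = POPULATION * GENERATIONS    # fitness evaluations (upper bound)
crf_trainings = evaluations * FOLDS       # CRF models trained (upper bound)

print(f"possible subsets:    {n_subsets:.3e}")
print(f"fitness evaluations: {evaluations}")
print(f"CRF trainings:       {crf_trainings}")
```

Under these assumptions the GA examines at most a few thousand of roughly 4 x 10^31 possible subsets, so the result is a near optimal subset rather than a proven optimum.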

Optimal feature subset for the CRF model selected by GA
GA-CRF Model
Near Optimal Chromosome:
Near optimal feature subset for the CRF model:
U001:%x[-2,1] U002:%x[0,1] U003:%x[1,1] U004:%x[-1,2] U005:%x[0,2]
U006:%x[1,2] U007:%x[2,2] U008:%x[-2,3] U009:%x[-2,5] U010:%x[-1,5]
U011:%x[-2,6] U012:%x[1,6] U013:%x[2,6] U014:%x[2,7] U015:%x[0,8]
U016:%x[1,8] U017:%x[-2,9] U018:%x[1,9] U019:%x[1,10] U020:%x[-2,11]
U021:%x[-2,12] U022:%x[0,12] U023:%x[0,13] U024:%x[2,13] U025:%x[-2,14]
U026:%x[-1,14] U027:%x[2,14] U028:%x[0,16] U029:%x[1,16] U030:%x[2,16]
U031:%x[-2,17] U032:%x[-1,17] U033:%x[0,17] U034:%x[1,17] U035:%x[-2,19]
U036:%x[-1,19] U037:%x[2,19] U038:%x[-2,20] U039:%x[-1,20] U040:%x[2,20]
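The template list above can be read back into (offset, column) pairs to see which feature columns survived selection. The sketch below parses the 40 templates and rebuilds a 105-gene bit list. The gene ordering (five offsets per column, column by column) is an assumption, so the rebuilt bit list is illustrative and not necessarily the authors' chromosome.

```python
import re

SELECTED = """
U001:%x[-2,1] U002:%x[0,1] U003:%x[1,1] U004:%x[-1,2] U005:%x[0,2]
U006:%x[1,2] U007:%x[2,2] U008:%x[-2,3] U009:%x[-2,5] U010:%x[-1,5]
U011:%x[-2,6] U012:%x[1,6] U013:%x[2,6] U014:%x[2,7] U015:%x[0,8]
U016:%x[1,8] U017:%x[-2,9] U018:%x[1,9] U019:%x[1,10] U020:%x[-2,11]
U021:%x[-2,12] U022:%x[0,12] U023:%x[0,13] U024:%x[2,13] U025:%x[-2,14]
U026:%x[-1,14] U027:%x[2,14] U028:%x[0,16] U029:%x[1,16] U030:%x[2,16]
U031:%x[-2,17] U032:%x[-1,17] U033:%x[0,17] U034:%x[1,17] U035:%x[-2,19]
U036:%x[-1,19] U037:%x[2,19] U038:%x[-2,20] U039:%x[-1,20] U040:%x[2,20]
"""

# Capture the row offset and column index from each "Uid:%x[offset,column]".
PATTERN = re.compile(r"U\d+:%x\[(-?\d+),(\d+)\]")
pairs = [(int(off), int(col)) for off, col in PATTERN.findall(SELECTED)]

# Rebuild a bit list: gene index = column * 5 + (offset + 2), an assumed ordering.
chromosome = [0] * 105
for offset, column in pairs:
    chromosome[column * 5 + offset + 2] = 1

# Feature columns with no selected template at any window position.
unused_columns = sorted(set(range(21)) - {col for _, col in pairs})
```

Here `unused_columns` comes out as 0, 4, 15 and 18. If the template column indexes line up with the feature table earlier in the deck, those are Word, ParserL0, IsEdge and Wh-word.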

Experimental result of the proposed hybrid GA-CRF model for question informer prediction

| Question Informer Prediction | Accuracy | Recall | Precision | F-score |
|------------------------------|----------|--------|-----------|---------|
| Traditional CRF model (all features, 105 features) | 93.16% | 94.33% | 84.07% | 88.90% |
| GA-CRF model (near optimal feature subset, 40 features) | 95.58% | 95.79% | 92.04% | 93.87% |
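The F-score column is consistent with the balanced F-measure, the harmonic mean of precision and recall. The slides do not state the formula, so that reading is an assumption. The sketch below recomputes both rows from the rounded precision and recall values in the table.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


baseline = f_score(precision=0.8407, recall=0.9433)  # traditional CRF, 105 features
ga_crf = f_score(precision=0.9204, recall=0.9579)    # GA-CRF, 40 features

print(f"traditional CRF: {baseline * 100:.2f}")
print(f"GA-CRF:          {ga_crf * 100:.2f}")
```

Both recomputed values agree with the table to within about 0.01 percentage points, a difference that rounding of the inputs would explain. Most of the gain comes from precision (84.07% to 92.04%), while recall moves by less than two points.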

Conclusions
- We have proposed a hybrid approach that integrates a Genetic Algorithm (GA) with a Conditional Random Field (CRF) to optimize feature subset selection in a CRF-based model for question informer prediction.
- The experimental results show that the proposed hybrid GA-CRF model improves on the accuracy of the traditional CRF model for question informer prediction (95.58% versus 93.16%).
- By using the GA to optimize the selection of the feature subset in CRF-based question informer prediction, we improve the F-score from 88.90% to 93.87% and reduce the number of features from 105 to 40.

Q & A
Integrating Genetic Algorithms with Conditional Random Fields to Enhance Question Informer Prediction