
Genetic Algorithm and Poker Rule Induction
Wendy Wenjie Xu
Supervised by Professor David Aldous, UC Berkeley Statistics Department

Introduction

Algorithms
An algorithm is a procedure for solving a problem. To apply an algorithm, we specify a set of inputs and then manipulate them, by running computations, logic arguments, and so on, in order to obtain a solution. The main focus of this research is to apply a long-established algorithm to a specific problem, namely poker rule induction. Pretend you are in a foreign land, have never played the game before, are given a history of thousands of games, and are asked to come up with the rules. It is potentially difficult to discover rules that correctly classify poker hands, yet it is trivial for a human to validate the rules objectively. The goal is to train the computer to predict the best hand it can play based on the cards it has been dealt, a task known as automatic rule induction.

Genetic Algorithms
Genetic algorithms (GAs) are a general class of algorithms inspired by natural selection. They have been widely applied to optimizing solutions to a great variety of problems, in fields including medicine, robotics, and laser technology. The method rests on rather simple and intuitive concepts. In short, it treats solutions to a specific problem as “genes” and simulates the process of natural selection, with the hope of eventually obtaining better solutions. The main steps are: generating a somewhat random initial population of solutions; selecting the proportion of the population with the best performance according to certain criteria; breeding a population of offspring from the selected “genes”; and treating the parents and children as the new initial population and repeating the process. Mutation is usually included to add randomness.

Rule Induction
Rule induction is one of the most important topics in machine learning. A great number of algorithms have been developed to uncover rules, or regularities, hidden in a set of data and thereby facilitate making predictions or understanding critical facts. Rules usually take the form “if attribute 1 = value 1 and attribute 2 = value 2 and so on, then decision = value”. It is therefore sensible to use decision tree structures to represent and manipulate rules. The ideal goal of a rule induction method is to induce a rule set R that is consistent (no case contradicts the rules) and complete (all attribute-value pairs are covered). Such a rule set is called discriminant. There is always a trade-off between these two characteristics, as they can hardly be achieved at the same time.

A Glance at the Data
The training data come from the kaggle.com competition “Poker Rule Induction”. There are 25,010 poker hands. Each hand consists of 5 cards with a given suit and rank, drawn from a standard deck of 52. Suits and ranks are represented as ordinal categories (table not included in this transcript). Each row in the training set has the accompanying class label for the poker hand it comprises. Hands are classified into 10 ordinal categories (table not included in this transcript).

Methods
Most of the computer programming work is done in Python.

1. Validation set method: The data set is randomly divided into two parts, with about 60% as the training set and 40% as the test set.

2. Tree building: A number of functions are developed to build trees out of random samples drawn from the training set (decision tree figure not included in this transcript). A tree is grown from the training data by choosing, at each node, the attribute-value pair whose split yields resulting sets with the lowest Shannon entropy, i.e. each of the split sets has the least variation in class label.

3. Initial population: Random samples of a specified size are drawn from the training data. Each sample gives rise to a tree using the tree-building algorithm from step 2. Altogether, the trees form the initial population, or first generation, for the GA.

4. Selection, cross-over, and mutation: These are the essential parts of the GA. Given a tree population, the half whose predictions have the lowest sum of squared errors is selected. The selected trees are then split into two groups and randomly paired up to exchange a random branch. Also, according to a specified mutation rate, some part of a tree may be randomly changed: either deleted or replaced by a randomly generated branch. The new population, consisting of parents and children and possibly including mutated individuals, then goes through the process again.

Current Findings
Several parameters in the process, including sample size and population size, can be tested and adjusted to improve performance. Fitness is calculated either by summing the squared prediction errors or by taking the percentage of accurate predictions. It is the critical tool for choosing the parameters. For example, the Fitness versus Sample Size graph (not included in this transcript) shows that a sample size of 5000 for building the initial tree population gives the best performance.

References
Coley, David A. An Introduction to Genetic Algorithms for Scientists and Engineers. Singapore: World Scientific, 1999. Print.
Segaran, Toby. Programming Collective Intelligence: Building Smart Web 2.0 Applications. Beijing: O'Reilly, 2007. Print.
Websites:
http://www.kaggle.com/c/poker-rule-induction
http://whatis.techtarget.com/definition/algorithm
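
The tree-building step in Methods chooses the attribute-value pair whose split gives the lowest Shannon entropy. The poster does not show its code, so the following is only a minimal sketch of that idea. It assumes each row is a tuple whose last element is the class label and that a split tests "attribute >= value"; the function names entropy, divide, and best_split are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy (in bits) of the class labels; the label is row[-1]."""
    total = len(rows)
    if total == 0:
        return 0.0
    counts = Counter(row[-1] for row in rows)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def divide(rows, column, value):
    """Split rows on one attribute-value pair (assumed test: attribute >= value)."""
    matched = [r for r in rows if r[column] >= value]
    rest = [r for r in rows if r[column] < value]
    return matched, rest


def best_split(rows):
    """Return (gain, column, value) for the split with the lowest weighted entropy.

    Gain is the drop from the parent entropy, so the largest gain corresponds
    to the lowest entropy of the resulting sets. Returns (0.0, None, None)
    when no split improves on the parent.
    """
    base = entropy(rows)
    best = (0.0, None, None)
    for column in range(len(rows[0]) - 1):
        for value in {r[column] for r in rows}:
            matched, rest = divide(rows, column, value)
            if not matched or not rest:
                continue  # a split that leaves one side empty is useless
            p = len(matched) / len(rows)
            gain = base - p * entropy(matched) - (1 - p) * entropy(rest)
            if gain > best[0]:
                best = (gain, column, value)
    return best
```

Applying best_split recursively to each resulting subset, until no split has positive gain, would grow a full tree; the sketch stops at a single node to stay short.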

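
The selection, cross-over, and mutation step in Methods can be illustrated with one generation of a GA. This is a hedged sketch, not the poster's implementation: to stay short, individuals are plain lists of integers standing in for decision trees, a swapped list tail stands in for a swapped branch, and TARGET, error, crossover, mutate, next_generation, and evolve are all hypothetical names.

```python
import random

TARGET = [3, 1, 4, 1, 5]  # hypothetical labels the toy individuals try to predict


def error(individual):
    """Sum of squared errors against TARGET (lower is fitter)."""
    return sum((p - t) ** 2 for p, t in zip(individual, TARGET))


def crossover(a, b, rng):
    """Exchange a random tail segment, standing in for a swapped branch."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(individual, rng):
    """Replace one randomly chosen position with a random value."""
    changed = individual[:]
    changed[rng.randrange(len(changed))] = rng.randint(0, 9)
    return changed


def next_generation(population, mutation_rate, rng):
    """Run one round of selection, cross-over, and mutation."""
    # Selection: keep the half with the lowest sum of squared errors.
    parents = sorted(population, key=error)[: len(population) // 2]
    # Split the parents into two groups and pair them up at random.
    shuffled = parents[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    children = []
    for a, b in zip(shuffled[:half], shuffled[half : 2 * half]):
        children.extend(crossover(a, b, rng))
    # Parents and children form the new population; each may mutate.
    return [
        mutate(ind, rng) if rng.random() < mutation_rate else ind
        for ind in parents + children
    ]


def evolve(population, generations, mutation_rate, seed=0):
    """Repeat next_generation a fixed number of times."""
    rng = random.Random(seed)
    for _ in range(generations):
        population = next_generation(population, mutation_rate, rng)
    return population
```

With a population size divisible by four, each generation keeps the same size: half are surviving parents and half are their children. Because parents may also mutate, the best error can occasionally get worse from one generation to the next unless the mutation rate is zero.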
