Presentation on theme: "1 Universidad de Buenos Aires, Maestría en Data Mining y Knowledge Discovery, Machine Learning 1: Introduction, Eduardo Poggi". Presentation transcript:
1 Universidad de Buenos Aires. Maestría en Data Mining y Knowledge Discovery. Machine Learning 1: Introduction. Eduardo Poggi (firstname.lastname@example.org), Ernesto Mislej (firstname.lastname@example.org). Autumn 2008.
2 Agenda: Machine Learning; Learning systems; Assignments.
3 Field of study. Artificial Intelligence: Planning, Natural Language, Robotics, Knowledge Representation, …, Machine Learning. “Knowledge” Discovery: Clusters, Rules, Concepts, Patterns, …
4 Multidisciplinary Field. Machine Learning: Probability & Statistics, Computational Complexity Theory, Information Theory, Philosophy, Neurobiology, Artificial Intelligence.
5 Intelligence. “People possess processes that allow them to solve complex problems; the set of these processes, which we do not understand, is what we call intelligence.” (M. Minsky) A definition in permanent change = “unexplored regions of Africa”.
6 Machine Learning Machine learning is the study of how to make computers automatically learn; the goal is to make computers improve their performance through experience. The purpose of this course is to present the key concepts, algorithms and theory that form the core of Machine Learning.
7 ML & DM Information: Set of patterns or expectations that underlie the data. Data Mining: Extraction of implicit, previously unknown and potentially useful information from data. Machine Learning: Provides the technical basis (algorithms) of data mining.
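8 Epistemological differences among Computer Science, ML and DM. Compared fields: Classic data processing; Machine Learning (and Statistics); DM.
Reasoning. Classic data processing: simulates deductive reasoning (= applies an existing model). Machine Learning (and Statistics): simulates inductive reasoning (= invents a model). DM: simulates inductive reasoning ("even more inductive").
Validation. Classic data processing and Machine Learning (and Statistics): validation according to precision. DM: validation according to utility and comprehensibility.
Results. Classic data processing and Machine Learning (and Statistics): results as universal as possible. DM: results relative to particular cases.
Elegance. Classic data processing and Machine Learning (and Statistics): elegance = conciseness. DM: elegance = adequacy to the user's model.
Relation to AI. Classic data processing: tends to reject AI. Machine Learning (and Statistics): either tends to reject AI (Statistics) or claims belonging to AI (ML). DM: naturally integrates AI, DB, Stat., and MMI.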
9 Model of learning systems. Experience (E); Computer + Learning Algorithm; Class of Tasks (T); Performance (P).
10 Class of Tasks. It is the kind of activity on which the computer will learn to improve its performance. Examples: learning to play chess; recognizing images of handwritten words; diagnosing patients coming into the hospital; “discovering” patterns in data.
11 Settings for learning. Tasks are generated by a random process outside the learner; the learner can pose queries to a teacher; the learner explores its surroundings autonomously. Example: learning to play chess. Learn from a specific sequence; ask "what if the sequence is this?"; ask for an amateur player and then an expert player.
12 Experience and Memory. Texts learned “by heart”: “… quien podría soportar tan duras …”; “Lasciate ogni speranza, voi ch'intrate”. Relations. Without stored data there is no learning. Reasoning capabilities do not compensate for ignorance. Relation, Generalization and Abstraction.
13 Experience and Performance. Experience: what has been recorded in the past. Performance: a measure of the quality of the response or action. Example: handwriting recognition using neural networks. Experience: a database of handwritten images with their correct classification. Performance: accuracy in classifications.
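As a minimal sketch of the performance measure named on this slide, accuracy can be computed as the fraction of predictions that agree with the recorded correct classifications. The label lists below are hypothetical stand-ins for a database of handwritten images and a classifier's output; they are not from the course material.

```python
def accuracy(predictions, correct_labels):
    """Fraction of predictions that match the recorded correct labels."""
    if len(predictions) != len(correct_labels):
        raise ValueError("predictions and labels must have the same length")
    if not correct_labels:
        raise ValueError("cannot measure accuracy on an empty experience set")
    hits = sum(p == y for p, y in zip(predictions, correct_labels))
    return hits / len(correct_labels)


# Hypothetical experience: the correct class recorded for five handwritten digit images.
recorded_labels = [3, 7, 7, 1, 0]
# Hypothetical output of some classifier on those same five images.
predicted_labels = [3, 7, 1, 1, 0]

# Four of the five predictions match the recorded labels.
print(accuracy(predicted_labels, recorded_labels))
```

Here the experience is the list of recorded labels and the performance is a single number between 0 and 1.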
14 Performance. Effectiveness = Qr / Qp. Efficacy = Effectiveness * Tp / Tr. Efficiency = Efficacy * Rp / Rr. Q = quantity of units (includes quality); T = time; R = resources; p = planned; r = real (actual).
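The three ratios on this slide (Efectividad, Eficacia, Eficiencia, rendered here as effectiveness, efficacy and efficiency) chain into each other, which a short sketch makes concrete. The function names and the example figures are assumptions for illustration only.

```python
def effectiveness(q_real, q_planned):
    """Qr / Qp: units actually produced relative to units planned."""
    return q_real / q_planned


def efficacy(q_real, q_planned, t_real, t_planned):
    """Effectiveness * Tp / Tr: penalizes taking longer than planned."""
    return effectiveness(q_real, q_planned) * t_planned / t_real


def efficiency(q_real, q_planned, t_real, t_planned, r_real, r_planned):
    """Efficacy * Rp / Rr: rewards using fewer resources than planned."""
    return efficacy(q_real, q_planned, t_real, t_planned) * r_planned / r_real


# Hypothetical job: 100 units planned in 10 hours with 4 resources;
# 90 units actually delivered in 12 hours with 3 resources.
print(effectiveness(90, 100))
print(efficacy(90, 100, t_real=12, t_planned=10))
print(efficiency(90, 100, t_real=12, t_planned=10, r_real=3, r_planned=4))
```

In this made-up case the job falls short on quantity and time, but the saving in resources brings the final ratio back up, which is the kind of trade-off the three measures are meant to expose.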
15 Designing a Learning System. 1. Define the knowledge to learn. 2. Define the representation of the target knowledge. 3. Define the learning mechanism. + Define the monitoring mechanism. Example: handwriting recognition using neural networks. 1. A function to classify handwritten images. 2. A linear combination of handwritten features. 3. A linear classifier.
16 The Knowledge to Learn. Supervised learning: a function to predict the class of new examples. Let X be the space of possible examples. Let Y be the space of possible classes. Learn F : X → Y. Example: in learning to play chess, the following are possible interpretations. X: the space of board configurations. Y: the space of legal moves.
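One way to read "learn F" is as a procedure that takes labeled examples from X × Y and returns a function from X to Y. The sketch below uses 1-nearest-neighbor purely as an illustrative learner (it is one of the algorithm families the slides list later, not a method the slides prescribe here); the name `learn_nearest_neighbor` and the toy data are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]
Example = Tuple[Point, str]  # one (x, y) pair with x in X and y in Y


def learn_nearest_neighbor(examples: Sequence[Example]) -> Callable[[Point], str]:
    """Return a function F: X -> Y that copies the class of the closest stored example."""
    if not examples:
        raise ValueError("at least one labeled example is required")

    def F(x: Point) -> str:
        def squared_distance(example: Example) -> float:
            stored_x, _ = example
            return sum((a - b) ** 2 for a, b in zip(x, stored_x))

        _, label = min(examples, key=squared_distance)
        return label

    return F


# Hypothetical training set: two-dimensional points labeled "A" or "B".
training = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((5.0, 5.0), "B"), ((4.8, 5.3), "B")]
F = learn_nearest_neighbor(training)

# Classify two unseen points.
print(F((0.9, 1.1)), F((5.1, 4.9)))
```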
17 The Representation of the Target Knowledge. Example: diagnosing a patient coming into the hospital. Features: x1: temperature; x2: blood pressure; x3: blood type; x4: age; x5: weight; etc. Given a new example X = (x1, x2, …, xn), compute F(X) = w1 x1 + w2 x2 + … + wn xn. If F(X) > T, predict heart disease; otherwise predict no heart disease.
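The decision rule on this slide, a weighted sum compared against a threshold T, can be sketched in a few lines. The weights, threshold and patient values below are invented for illustration and have no clinical meaning; blood type is left out because a categorical feature would first need a numeric encoding, which the slide does not specify.

```python
def linear_score(weights, features):
    """F(X) = w1*x1 + w2*x2 + ... + wn*xn."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def predict_heart_disease(weights, features, threshold):
    """Predict heart disease when F(X) exceeds the threshold T."""
    return linear_score(weights, features) > threshold


# Hypothetical feature order: temperature, blood pressure, age, weight.
weights = [0.0, 0.02, 0.03, 0.01]   # assumed values, not learned from data
threshold = 5.0                     # assumed value of T

patient_a = [36.8, 150.0, 60.0, 85.0]
patient_b = [36.6, 110.0, 30.0, 70.0]

print(predict_heart_disease(weights, patient_a, threshold))
print(predict_heart_disease(weights, patient_b, threshold))
```

In a learning system the weights and the threshold would be fitted from past cases rather than set by hand.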
18 The Representation of the Target Knowledge. There are many possibilities. The class of functions is very expressive: you can represent almost any function, but to be effective the method needs lots of examples. The class of functions is very limited: it does not need many examples, but it may fail to contain the true target function. Characteristics: validity, expressiveness, ease of inference, adaptability.
19 The Learning Mechanism 1. Machine learning algorithms abound: decision trees, rule-based systems, neural networks, nearest-neighbor, support vector machines, Bayesian methods. Important characteristics of the learning mechanism: What is the class of functions? How do you search over the class of functions?
20 The Learning Mechanism 2. Example: look over the space of all possible decision trees. Prefer small trees to large trees. [Figure: two candidate trees, one labeled “Higher score” and the other “Lower score”.]
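A preference for small trees can be expressed as a score that rewards accuracy and subtracts a penalty per node. The sketch below is only one possible scoring scheme under stated assumptions: the tuple encoding of trees, the `size_penalty` value and the hand-written candidate list are all hypothetical, and it compares a few given trees rather than searching the space of all possible trees.

```python
# A tree is either ("leaf", label) or ("split", feature_index, threshold, left, right).

def predict(tree, x):
    """Follow splits until a leaf is reached, then return its label."""
    while tree[0] == "split":
        _, feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree[1]


def size(tree):
    """Number of nodes (internal splits plus leaves)."""
    if tree[0] == "leaf":
        return 1
    return 1 + size(tree[3]) + size(tree[4])


def score(tree, examples, size_penalty=0.01):
    """Accuracy on the examples minus a penalty proportional to tree size."""
    correct = sum(predict(tree, x) == y for x, y in examples)
    return correct / len(examples) - size_penalty * size(tree)


def best_tree(candidates, examples, size_penalty=0.01):
    """Pick the highest-scoring tree among the given candidates."""
    return max(candidates, key=lambda t: score(t, examples, size_penalty))


# Hypothetical one-feature data set.
examples = [((1,), "no"), ((2,), "no"), ((3,), "yes"), ((4,), "yes")]

stump = ("leaf", "no")
small = ("split", 0, 2.5, ("leaf", "no"), ("leaf", "yes"))
large = ("split", 0, 2.5,
         ("split", 0, 1.5, ("leaf", "no"), ("leaf", "no")),
         ("split", 0, 3.5, ("leaf", "yes"), ("leaf", "yes")))

chosen = best_tree([stump, small, large], examples)
print(chosen is small)
```

The small and the large tree classify every example correctly, so the size penalty is what separates them; the single leaf is the smallest of all but loses on accuracy.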
21 Choices in designing a learning program. Determine type of training experience: games against experts; games against self; table of correct moves. Determine target function: Board -> Value; Board -> Move. Determine representation of learned function: function; rules; artificial neural network. Determine learning algorithm: gradient descent; linear programming.
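To make one path through these choices concrete, the sketch below picks "Board -> Value" as the target function, a linear function of board features as the representation, and gradient descent (in the form of the least-mean-squares update) as the learning algorithm. The three board features and the training values are hypothetical, chosen only so the example runs.

```python
def value(weights, features):
    """Linear estimate of the value of a board described by numeric features."""
    return sum(w * f for w, f in zip(weights, features))


def lms_update(weights, features, target, learning_rate):
    """One gradient-descent step on the squared error (target - value)^2."""
    error = target - value(weights, features)
    return [w + learning_rate * error * f for w, f in zip(weights, features)]


# Hypothetical features per board: (constant 1, own pieces, opponent pieces).
# Hypothetical training values: +1 for a winning board, -1 for a losing one, 0 for a balanced one.
training = [
    ((1.0, 3.0, 1.0), 1.0),
    ((1.0, 1.0, 3.0), -1.0),
    ((1.0, 2.0, 2.0), 0.0),
]

weights = [0.0, 0.0, 0.0]
for _ in range(1000):
    for features, target in training:
        weights = lms_update(weights, features, target, learning_rate=0.05)

# After training, the estimates should be close to the training values.
for features, target in training:
    print(target, round(value(weights, features), 3))
```

The other branches on the slide (for example "Board -> Move" or a rule-based representation) would lead to a different program structure.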
23 Application 1: Automatic Car Driving. Class of Tasks: learning to drive on highways from stereo vision. Knowledge: images and steering commands recorded while observing a human driver. Performance Module: accuracy in classification.
24 Application 2: Learning to classify astronomical structures. Labels in the figure: galaxy, stars, unknown. Features: color, size, mass, temperature, luminosity.
25 Application 2 Classifying Astronomical Objects Class of Tasks: Learning to classify new objects. Knowledge: database of images with correct classification. Performance Module: Accuracy in classification
26 Other Applications. Bio-Technology: Protein Folding Prediction, Micro-array gene expression; Computer Systems Performance Prediction; Credit Applications; Fraud Detection; Detection of consumption patterns (repeat and sporadic purchases, related products); Character Recognition (US Postal Service); Web Applications: Document Classification, Learning User Preferences.
27 Different models. Deductive: Memorization. Inductive: Classification, Clustering, Theorization. Hybrid. EBL: Explanation-based learning; SBL: Similarity-based learning; CBL: Case-based learning.
28 Should I care about Machine Learning at all? Machine learning is becoming increasingly popular and has become a cornerstone in many industrial applications. Machine learning provides algorithms for data mining, where the goal is to extract useful pieces of information (i.e., patterns) from large databases. The computer industry is heading towards systems that will be able to adapt and heal themselves automatically. The electronic game industry is now focusing on games where characters adapt and learn through time.
29 Summary. Machine learning is the study of how to make computers automatically learn. A learning algorithm needs the following elements: class of tasks, performance metric, and body of experience. Designing a learning algorithm requires defining the knowledge to learn, the representation of the target knowledge, and the learning mechanism. Machine learning has many successful applications and is becoming increasingly important in science and industry.
30 Assignments. Read: Chapter 1 of Mitchell; Kodratoff, Yves: Machine Learning and Data Mining; Kodratoff, Yves: Cuando el ordenador aprende. Suggested reading: Kvitca, Adolfo: Resolución de problemas con IA. EBAI 1998. Chapters 1-4. Rich, Elaine: AI. McGraw-Hill, 1984. Chapters 2 and 3. Design a learning system for some game.