1st Escuela Red ProTIC (Red ProTIC School), Tandil, April 18-28, 2006
Introduction to Machine Learning
Alejandro Ceccatto, Instituto de Física Rosario, CONICET-UNR


1 Introduction to Machine Learning
Alejandro Ceccatto
Instituto de Física Rosario, CONICET-UNR

2 Bibliography
– Machine Learning, Tom Mitchell (McGraw-Hill, 1997)
– Principal Component Analysis, Ian Jolliffe (Springer-Verlag, 2002)
– An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Nello Cristianini and John Shawe-Taylor (Cambridge University Press, 2000)
– The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer, 2001)

3 Machine Learning
– The field of Machine Learning is concerned with the question of how to construct computer programs that automatically improve with experience
– The purpose of this course is to present key algorithms and theory that form the core of Machine Learning

4 Machine Learning
– Interdisciplinary nature of the material: Statistics, Artificial Intelligence, Information Theory, etc.
– Basic question: How to program computers to learn?

5 Machine Learning
– Intelligent Data Analysis:
    Intelligent application of data analytic tools (Statistics)
    Application of “intelligent” data analytic tools (Machine Learning)
– Modern world: Data-driven world (industrial, commercial, financial, scientific activities)

6 Why Machine Learning?
– Recent progress in algorithms and theory
– Growing flood of online data
– Computational power available

7 Why Machine Learning?
Niches for Machine Learning:
– Data Mining: using historical data to improve decisions
    Medical records → medical knowledge
– Software applications we can’t program by hand
    Autonomous driving
    Speech recognition
– Self-customizing programs
    Newsreader that learns user interests

8 Why Machine Learning?
Data Mining
– Data: Recorded facts
– Information: Set of patterns, or expectations, that underlie the data
– Data Mining: Extraction of implicit, previously unknown, and potentially useful information from data
– Machine Learning: Provides the technical basis of data mining

9 Why Machine Learning?
Typical Data Mining Tasks
– Risk of Emergency Cesarean Section
    Given 9714 patient records, each describing a pregnancy and birth
    Each patient record contains 215 features
    Learn to predict: Classes of patients at high risk for emergency cesarean section

10 Why Machine Learning?

11 Why Machine Learning?
One of the learned rules:
IF    No previous vaginal delivery, and
      Abnormal 2nd Trimester Ultrasound, and
      Malpresentation at admission
THEN  Probability of Emergency C-Section is 0.6
Over training data: 26/41 = 0.63
Over test data: 12/20 = 0.60
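
As a rough illustration of how the figures attached to such a rule can be obtained, the sketch below counts, among the records that satisfy the IF part, the fraction that ended in an emergency C-section, separately for training and test records. The `Record` fields, the function names, and the toy records are all hypothetical and invented for illustration; they are not the course's data or code.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical field names mirroring the three conditions of the rule.
    previous_vaginal_delivery: bool
    abnormal_2nd_trimester_ultrasound: bool
    malpresentation_at_admission: bool
    emergency_c_section: bool  # observed outcome


def rule_matches(r: Record) -> bool:
    """True when the IF part of the rule holds for this record."""
    return (not r.previous_vaginal_delivery
            and r.abnormal_2nd_trimester_ultrasound
            and r.malpresentation_at_admission)


def rule_probability(records):
    """Return (positives, matches, fraction) over the records the rule covers."""
    matched = [r for r in records if rule_matches(r)]
    if not matched:
        return 0, 0, None  # the rule covers no record, so no estimate exists
    positives = sum(r.emergency_c_section for r in matched)
    return positives, len(matched), positives / len(matched)


# Toy records, made up for this sketch only.
train = [
    Record(False, True, True, True),
    Record(False, True, True, True),
    Record(False, True, True, False),
    Record(True, True, True, False),   # rule does not apply
    Record(False, False, True, True),  # rule does not apply
]
test = [
    Record(False, True, True, True),
    Record(False, True, True, False),
    Record(True, False, False, False),  # rule does not apply
]

for name, data in (("training", train), ("test", test)):
    pos, n, frac = rule_probability(data)
    print(f"Over {name} data: {pos}/{n} = {frac:.2f}")
```

Comparing the fraction on held-out test records with the fraction on training records is what indicates whether the rule generalizes beyond the data it was learned from.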

12 Why Machine Learning?
– Credit Risk Analysis

13 Why Machine Learning?
– Customer Retention

14 Why Machine Learning?
– Problems Too Difficult to Program by Hand

15 Why Machine Learning?
– Software that Customizes to User

16 Where is This Headed?
Today: tip of the iceberg
– First-generation algorithms: neural nets, decision trees, regression, ...
– Applied to well-formatted databases
Tomorrow: enormous impact
– Learn across mixed-media data and multiple databases
– Learn by active experimentation
– Learn decisions rather than predictions
– Cumulative, life-long learning

17 Where is This Headed?
Autonomous entities?
“I'm sorry, Dave. I'm afraid I can't do that.”
–HAL 9000 in 2001: A Space Odyssey, by Arthur C. Clarke

