Data Mining, dr Iwona Schab (27-18 September 2012)


1 Data Mining. dr Iwona Schab. 27-18 September 2012

2 Semester timetable
Organizational issues, introduction to data mining
1. Sources of data in business, administration, science and technology.
2. The process of discovering knowledge in data; the role of data mining in this process.
3. Data mining and Business Intelligence.
4. SEMMA methodology.
5. Data preparation: sampling, cleaning, normalization and standardization.
6. Association rules discovery.
7. Classification problems: case studies.

3 Semester timetable (continued)
8. Rule induction systems: algorithms, knowledge representation.
9. Decision trees: partition rules and pruning.
10. Classification based on probability distributions: naive Bayes estimation and Bayesian networks.
11. Grouping problems: case studies.
12. Cluster analysis: combinatorial and hierarchical methods.
13. Modeling response to direct mail marketing.
14. Churn analysis.
15. Text mining.
16. Web mining.
17. Data mining in Life Science.
18. Comparative analysis of algorithms implemented in SAS Enterprise Miner and WEKA software.
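Item 5 of the timetable names normalization and standardization without defining them. A minimal sketch of one common reading follows, assuming "normalization" means min-max rescaling to [0, 1] and "standardization" means z-scores; the slides do not state which variants the course uses, and the function names and the income figures are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1] (assumed meaning of 'normalization')."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Convert values to z-scores: zero mean, unit population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical monthly incomes, used only to illustrate the two transformations.
incomes = [2000, 3000, 4000, 5000, 6000]
normalized = min_max_normalize(incomes)
standardized = standardize(incomes)

print(normalized)
print([round(z, 3) for z in standardized])
```

Min-max rescaling keeps the original shape of the distribution but is sensitive to outliers, while z-scores express each value in standard deviations from the mean.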

4 Literature
Basic:
Paolo Giudici, Applied Data Mining. Statistical Methods for Business and Industry, Wiley, New York 2011
Supplementary:
Selected papers to be circulated
Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, Wiley, New York 2005
Daniel T. Larose, Data Mining Methods and Models, Wiley, New York 2006

5 Statistical Analysis?

6 Data Mining
to mine = to extract (e.g. precious, hidden resources from the Earth)
Definitions and understanding differ depending on the user
A new discipline developed from computing and statistics
In-depth search to find additional information (previously unnoticed in the mass of data available)
Data preparation and "structuring the unstructured" are needed
Machine learning = finding relations and regularities in data
Generalisation from the observed data to new, unobserved cases
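The last two points on this slide (finding regularities in observed data, then generalising to new, unobserved cases) can be illustrated with a very small sketch. It uses a 1-nearest-neighbour rule, which is chosen here only for brevity and is not taken from the slides; the function name and the customer records are hypothetical.

```python
import math


def nearest_neighbour_predict(train, new_case):
    """Return the label of the observed example closest to new_case (1-nearest neighbour)."""
    _, closest_label = min(
        train, key=lambda example: math.dist(example[0], new_case)
    )
    return closest_label


# Hypothetical observed data: (age, monthly income) -> responded to an offer?
observed = [
    ((25, 1200), "no"),
    ((30, 1500), "no"),
    ((45, 5200), "yes"),
    ((50, 6100), "yes"),
]

# A new case that does not appear in the observed data.
prediction = nearest_neighbour_predict(observed, (48, 5800))
print(prediction)
```

Because income is on a much larger scale than age, it dominates the Euclidean distance in this sketch; in practice the features would typically be rescaled first.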

7 KDD Process (Knowledge Discovery in Databases)

8 Software
www.sgh.waw.pl/ogolnouczelniane/ci/aplikacje/oprogramowanie/
SAS/STAT
SAS Enterprise Miner
Other: Statistica, SPSS, WEKA

