SBM411 Data Mining (資料探勘), 陳春賢. Lecture I: Class Introduction

1 SBM411 Data Mining (資料探勘), 陳春賢

2 Lecture I: Class Introduction

3 Instructor Information
- Name: 陳春賢
- Ph.D. from Iowa State University, USA
- M.S. from Iowa State University, USA
- B.E. from National Tsing Hua University (新竹清華大學), Hsinchu
- Technical specialty: databases and intelligent decision support systems
- Research interests: data mining, biomedical informatics, artificial intelligence, artificial neural networks

4 Contact Info
- Tel: (03) 211-8800 ext. 5816
- Email: cchen@mail.cgu.edu.tw

5 Course Objectives
To learn:
- the terms, concepts, and applications of data mining
- the processes, techniques, and models of data mining
- data preprocessing techniques
- data warehouse and OLAP technology
- how to use the free data mining software Weka to analyze data sets

6 Learning Goals and Objectives (AACSB international business accreditation)
Learning goal assessed under AoL:
2. Our students will be able to solve problems effectively.
2-3. Our students will be able to assess alternatives and make a decision.
- Criterion I: Criteria identification
- Criterion II: Criteria application (evaluate how criteria are applied to alternatives)
- Criterion III: Decision making based on assessments

7 Learning Goals and Objectives (AACSB international business accreditation)
Learning goal to be announced under AoL:
1. Our students will be able to apply theories in practice.
1-1. Our students will be able to identify appropriate theories and their associated boundary conditions.
- Criterion I: Identifying the appropriate theory
- Criterion II: Understanding the theory
- Criterion III: Understanding the boundary conditions of the theory

8 Course Content
- Introduction to data mining
- Main data mining techniques
  - Association rule mining
  - Classification and prediction
  - Cluster analysis
- Open-source DM software in Java: Weka 3.x
- Data preprocessing techniques
- Data warehouse and OLAP technology

9 Textbook and References
- Textbook
  - Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd edition, Morgan Kaufmann Publishers, San Francisco, CA, USA, 2007.
- References
  - Margaret H. Dunham, Data Mining: Introductory and Advanced Topics, Prentice Hall, Upper Saddle River, NJ, USA, 2002.
  - 王派洲 (translator), 資料探勘: 概念與方法, 2nd edition (Chinese translation of Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2/e), 滄海書局, 2008.

10 Grading Policy
- 10%: Class participation
- 40%: Midterm exam
  - One-hour closed-book exam (8/15, Class 9)
  - Take-home exam (due 8/22, Class 10)
- 50%: Final project
  - 5%: Proposal (problem analysis)
  - 10%: Final report
  - 35%: Data analysis and presentation

11 Project Proposal (8/29, Class 11)
The proposal is a plan for your project. It should include at least:
- Title
- Team members
- Motivation
- Problem and data description, including the data source, a description of the data and of its important attributes, the year of the data, the number of records, the number of attributes, and other relevant details
- Schedule
- A short description of the DM techniques to be used
- Data analysis process: data preprocessing, data mining, knowledge presentation/evaluation
- Performance evaluation method
- Others

12 Final Project
- A project on a DM application
- Use Weka to analyze data sets
- A presentation and a report introducing your project, including at least:
  - Title and motivation
  - Problem, data description, data range, basic data statistics
  - How the problem can be solved
  - The DM algorithms you use or implement, and related literature
  - Analysis process: data preprocessing, data mining, knowledge presentation/evaluation
  - Class distribution at each attribute
  - Performance evaluation method
  - Result and value of the discovered knowledge
  - Discussion
- Each student has 25 min for the presentation slot: 17~20 min for the presentation, 3 min for Q&A, 2 min for getting ready
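
The final report asks for the class distribution at each attribute. The course itself uses Weka for analysis; the Python sketch below is only one possible way to tabulate those counts by hand for a quick check. The function name `class_distribution`, the attribute names, and the toy records are all hypothetical and made up for illustration.

```python
from collections import Counter, defaultdict


def class_distribution(records, class_attr):
    """Count class labels for each value of every non-class attribute.

    records: list of dicts mapping attribute name -> nominal value.
    class_attr: name of the attribute that holds the class label.
    Returns dist[attribute][value] -> Counter of class labels.
    """
    dist = defaultdict(lambda: defaultdict(Counter))
    for row in records:
        label = row[class_attr]
        for attr, value in row.items():
            if attr != class_attr:
                dist[attr][value][label] += 1
    return dist


# Hypothetical toy data set (not from the course materials).
records = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
]

dist = class_distribution(records, "play")
for attr, by_value in dist.items():
    for value, counts in by_value.items():
        print(attr, value, dict(counts))
```

A table like this makes it easy to see which attribute values are dominated by a single class, which is useful background for the discussion section of the report.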

13 Class Schedule
- Class 1: Class introduction / Introduction to data mining (6/6)
- Class 2-3: Classification and prediction (6/13, 7/4)
- Class 4: Cluster analysis (7/11)
- Class 5: The applications of data mining (7/18, instructor: 林詩偉)
- Class 6: Cluster analysis (7/25)
- Class 7: Big data analysis (8/1, instructor: 林詩偉)
- Class 8-9: Association rule mining (8/8, 8/15) (one-hour closed-book exam on 8/15)
- Class 10: Weka (open-source DM software) introduction and lab (8/22) (take-home exam due 8/22)
- Class 11: Data preprocessing (8/29) (final project proposal due 8/29)
- Class 12: Data warehouse (9/5, morning)
- Class 13: Final project presentations (9/5, afternoon)

14 Internet Resources
- Lecture slides: open ftp://163.25.117.117/cchen in a browser, then go to 104Summer → 104S_Data Mining_eMIS → 上課投影片 (lecture slides)
- Open-source DM software in Java: Weka 3.x.x, http://www.cs.waikato.ac.nz/~ml/weka/index.html

15 Dataset Web Sites for Mining
- UCI Machine Learning Repository: http://www1.ics.uci.edu/~mlearn/MLRepository.html
- Taiwan Food and Drug Administration, Ministry of Health and Welfare (衛生福利部食品藥物管理署), open data sets: http://data.fda.gov.tw
- Government Open Data Platform (政府資料開放平臺), list of all data sets: http://data.gov.tw/data_list
- DASL: http://lib.stat.cmu.edu/DASL/Datafiles/
- JSE Data Archive: http://www.amstat.org/publications/jse/jse_data_archive.html
- KDNuggets: http://www.kdnuggets.com/datasets/index.html
- MLnet Online Information Service: http://www.mlnet.org/cgi-bin/mlnetois.pl/?File=datasets.html

16 Question & Answer

