1
Computing & Information Sciences, Kansas State University
CIS 690 Data Mining in Mobile and Cloud Computing Environments
Data Mining in Mobile and Cloud Computing Environments: Course Organization and Survey
Lecture 0 of 27: Part A – Course Organization
William H. Hsu, Computing and Information Sciences
Shih-Hsiung Chou, Industrial and Manufacturing Systems Engineering
Kansas State University
KSOL course page: http://bit.ly/a68KuL
Course web site: http://www.kddresearch.org/Courses/CIS690
Instructor home page: http://www.cis.ksu.edu/~bhsu
Reading for Next Class:
  Syllabus and Introductory Handouts
  Instructions for Labs 0 – 1
  Han & Kamber 2e, Sections 1.1 – 1.4.3 (pp. 1 – 25), 6.1 (pp. 285 – 289)
2
Course Administration
Course Page (KSOL): http://bit.ly/a68KuL
Class Web Page: www.kddresearch.org/Courses/CIS690
Instructional E-Mail Addresses – Best Way to Reach Instructor
  CIS690TA-L@listserv.ksu.edu (always use this to reach instructor and TA)
  CIS690-L@listserv.ksu.edu
Instructor: William Hsu, Nichols 324C
  Office phone: +1 785 532 7905; home phone: +1 785 539 7180
  IM: AIM/MSN/YIM hsuwh/rizanabsith, ICQ 28651394/191317559, Google banazir
  Office hours: after class Mon/Wed/Fri; other times by appointment
Graduate Teaching Assistant: To Be Announced
  Office location: Nichols 124 (CIS Visualization Lab) & Nichols 218
  Office hours: to be announced on class web board
Grading Policy: Overview
  Midterm exam: 15%
  Homework: 15%
  Term project: 50%
  Labs: 20% (1% each; see calendar)
3
Course Policies
Letter Grades
  15% gradations (85+%: A, 70+%: B, etc.)
  Cutoffs may be more lenient, but a) never higher and b) seldom much lower
Grading Policy
  Exams: midterm (in-class, open-book/notes) 15%
  Homework: 15% (2 written, 2 programming, 2 mixed; drop lowest 2, 3% each)
  Term project (including proposal, interim, final reports): 50%
  Labs (upload solutions to K-State On-Line file dropbox): 20%
Late Homework Policy
  Allowed only in case of medical excusal
  All other late homework: see drop policy
Attendance Policy
  Absence due to travel or personal reasons: e-mail CIS690TA-L in advance
  See instructor, Office of the Dean of Student Life as needed
Honor System Policy: http://www.ksu.edu/honor/
  On plagiarism: cite sources, use quotes if verbatim, includes textbooks
  OK to discuss work, but turn in your own work only
  When in doubt, ask instructor
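The grading arithmetic can be sketched in a few lines of Python. This is an illustration only, not the instructor's grade book: the function names are hypothetical, all scores are assumed to be on a 0-100 scale, and the C and D cutoffs are an assumption extrapolated from the stated 15% gradation (the slide gives only the A and B cutoffs and "etc.").

```python
# Weights taken from the grading policy (fractions of the course grade).
WEIGHTS = {"midterm": 0.15, "homework": 0.15, "project": 0.50, "labs": 0.20}

# Nominal cutoffs. The slide states 85+%: A and 70+%: B; the C and D rows
# are ASSUMED by continuing the 15% gradation and may not match the course.
CUTOFFS = [(85.0, "A"), (70.0, "B"), (55.0, "C"), (40.0, "D")]


def homework_percent(scores, drop=2):
    """Average homework percentage after dropping the `drop` lowest scores."""
    if len(scores) <= drop:
        raise ValueError("need more homework scores than the number dropped")
    kept = sorted(scores)[drop:]  # ascending sort, so the lowest come first
    return sum(kept) / len(kept)


def course_percent(midterm, homework_scores, project, labs):
    """Weighted course percentage; every input is on a 0-100 scale."""
    parts = {
        "midterm": midterm,
        "homework": homework_percent(homework_scores),
        "project": project,
        "labs": labs,
    }
    return sum(WEIGHTS[name] * value for name, value in parts.items())


def letter_grade(percent):
    """Map a percentage to a letter using the nominal (strictest) cutoffs."""
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    # Hypothetical student: six homeworks, of which the lowest two are dropped.
    total = course_percent(80, [90, 70, 100, 60, 80, 95], 90, 100)
    print(total, letter_grade(total))
```

Because the policy says cutoffs may be more lenient but never higher, the letter returned here is a lower bound on the grade actually assigned.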
4
Class Resources
Course Content Management System (CMS): http://www.kddresearch.org/Courses/CIS690
  Lecture notes (MS PowerPoint 97-2010, PDF)
  Homeworks (MS Word 97-2010, PDF)
  Exam and homework solutions (MS PowerPoint 97-2010, PDF)
  Class announcements (students’ responsibility) and grade postings
Course Notes Online and at Copy Center (Required)
Mailing List (Automatic): CIS690-L@listserv.ksu.edu
  Homework/exams (before uploading to CMS, KSOL), sample data, solutions
  Class participation
  Project info, course calendar reminders
  Dated research announcements (seminars, conferences, calls for papers)
LISTSERV Web Archive: http://listserv.ksu.edu/archives/cis690-l.html
  Stores e-mails to class mailing list as browsable/searchable posts
5
Textbook and Recommended References
Recommended Text
  Witten, I. H. & Frank, E. (2006). Data Mining: Practical Machine Learning Tools and Techniques, second edition. San Francisco, CA, USA: Morgan Kaufmann.
Other References [on Reserve in Main or CIS Library]
  Han, J. & Kamber, M. (2006). Data Mining: Concepts and Techniques, second edition. San Francisco, CA, USA: Morgan Kaufmann.
  Mitchell, T. M. (1997). Machine Learning. New York, NY, USA: McGraw-Hill.
  Tan, P.-N., Steinbach, M., & Kumar, V. (2006). Introduction to Data Mining. Reading, MA, USA: Addison-Wesley.
[Cover image captions: Mitchell (1997); Witten & Frank 2e; Tan et al. (2006); 1st edition (outdated); Han & Kamber 2nd edition]
6
Background Expected
Both Courses
  Proficiency in high-level programming language (C++/C#, Java, Python, etc.)
  Required: course in data structures
  Recommended: discrete mathematics, probability
  At least 80 hours for semester (up to 120 depending on term project)
  Textbook – Data Mining: Concepts and Techniques, 2e, Han & Kamber (2006)
  Reserve texts: Mitchell’s Machine Learning, several other outside references
CIS 690 Data Mining in Mobile and Cloud Computing Environments
  Fresh background in symbolic logic, discrete math (sets, relations, counting)
  Some background assumed in linear algebra, calculus
  New topics: classification/regression, association, optimization, clustering
  “Mathematical maturity”: ready to learn more
CIS 798 Topics in Computer Science
  Recommended: two programming courses
  Read up on heuristic search, games, constraints, knowledge representation
  AI programming experience helps (background lectures as needed)
  Watch advanced topics lectures; see list before choosing project topic
7
Syllabus [1]: First Half of Course
8
Syllabus [2]: Second Half of Course
9
Math Background To Be Covered
Basics: First Two Weeks (Hours 2 – 9 of Course)
  Review of mathematical foundations: set theory, discrete math, probability
  Types of machine learning algorithms
  Combinatorial analysis: mappings and counting
  Bayesian classification
Bayesian Inference
  Hour 3: association rules, statistical evaluation
  Hours 6 – 10: Naïve Bayes, classification in R
  Hours 15 – 18: clustering, Expectation-Maximization (EM)
Other Math Topics to be Covered
  Information theory: decision tree induction, rule induction
  Basic statistical hypothesis testing
  Frequent itemsets: association rule mining
  Convex optimization: constraints, linear and quadratic programming (QP)
  Distance measures: clustering
  Logic: propositional, first-order, resolution
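As a taste of the counting involved in the association rule and frequent itemset topics listed above, here is a minimal sketch of support and confidence in Python. The transaction data and function names are made up for illustration and are not taken from the course materials; the definitions are the standard ones (support is the fraction of transactions containing an itemset, and the confidence of a rule X → Y is the number of transactions containing both X and Y divided by the number containing X).

```python
# Toy market-basket data, invented for this example.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    covered = count(antecedent, transactions)
    if covered == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    both = count(set(antecedent) | set(consequent), transactions)
    return both / covered


if __name__ == "__main__":
    print(support({"bread", "milk"}, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

Here {bread, milk} appears in 3 of 5 transactions, and 3 of the 4 transactions with bread also contain milk, so the rule bread → milk has confidence 3/4.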
10
Computing Platform: Mobile/Cloud Environments
Android
  Operating system: modified Linux
  For mobile devices (Motorola Droid, HTC Incredible, etc.)
  Android, Inc. & Open Handset Alliance
  Software development kit: download from http://developer.android.com/sdk/
Software Environment for the Advancement of Scholarly Research
  Originally developed for compute clusters
  Adapted for cloud computing environments
  SEASR – overall environment: http://seasr.org
  Meandre – data mining flows: http://seasr.org/meandre/
  © 2005 – present, National Center for Supercomputing Applications (NCSA)
11
Computing Platform: Data Mining Software
Waikato Environment for Knowledge Analysis (WEKA)
  Data mining package
  Most popular machine learning and data mining software at present
  Download from http://www.cs.waikato.ac.nz/ml/weka/
R Interpreter
  R: popular programming language for computational statistics
  Used for data mining implementations
  Comprehensive R Archive Network (CRAN): http://cran.r-project.org
Apache Hadoop
  Java software framework
  Data-intensive distributed applications
  Inspired by Google MapReduce and Google File System (GFS)
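To give a rough idea of the MapReduce programming model that inspired Hadoop, the following single-process Python sketch counts words in three phases: map, shuffle, and reduce. It does not use the Hadoop API (which is Java) and does no distribution at all; the function names are hypothetical and only mirror the shape of the model.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence in one process (no parallelism)."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["data mining", "mining data streams"]))
```

In a real cluster the map and reduce calls would run on different machines over partitions of the input, with the framework handling the shuffle; the sketch keeps only the data flow.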
12
About Project Proposals
Proposals
  About 1-2 pages; due at end of second week of course, one revision allowed
  Team projects: up to 2 people
  Contents: at least one paragraph on each of
    1. Problem statement: describe task, objectives, purpose
    2. Background: survey related work and applicable approaches
    3. Methodology: describe planned approach
    4. Evaluation criteria: how will performance be assessed?
    5. Milestones: what will be done, when?
Post Questions and Drafts to Class Mailing List