Published by Dwayne Powell. Modified over 9 years ago.
CSC 478: Programming Data Mining Applications
Course Summary
Bamshad Mobasher
DePaul University
What we did
  Data Mining Overview
    The KDD Process
  Data Preprocessing and Understanding
    Using Python, NumPy, Pandas
    Using Scikit-learn modules
    Some emphasis on visualizing and understanding characteristics of the data
  Supervised Knowledge Discovery
    Classification
    Regression Analysis
    Techniques such as KNN, Ridge Regression, Decision Tree and Bayesian classification
    Lots of emphasis on model evaluation
      Evaluation metrics
      Train-Test methodologies such as cross-validation
      Systematic parameter selection (e.g., grid search)
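As a reminder of how the evaluation pieces fit together, here is a minimal sketch that combines a train-test split, cross-validation, and grid search for a KNN classifier in scikit-learn. The Iris dataset, the parameter grid, and the 75/25 split are illustrative assumptions, not choices taken from the course material.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption; any labeled feature matrix works).
X, y = load_iris(return_X_y=True)

# Hold out a test set that the parameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so each CV fold is scaled
# using statistics from its own training portion only.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical grid: candidate neighborhood sizes and weighting schemes.
param_grid = {
    "knn__n_neighbors": [1, 3, 5, 7, 9],
    "knn__weights": ["uniform", "distance"],
}

# 5-fold cross-validation over every combination in the grid.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_                # mean CV accuracy of the best setting
test_accuracy = search.score(X_test, y_test)    # accuracy on the held-out test set

print("Best parameters:", best_params)
print("Cross-validated accuracy: %.3f" % cv_accuracy)
print("Held-out test accuracy: %.3f" % test_accuracy)
```

Reporting the held-out test accuracy separately from the cross-validated score keeps the final estimate independent of the data used to pick the parameters.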
What we did
  Unsupervised Knowledge Discovery
    Cluster analysis
    Using PCA and SVD for dimensionality reduction, data characterization, and noise reduction
    Association rule discovery
    Emphasis on using unsupervised approaches as components of larger knowledge discovery efforts
      E.g., using PCA before clustering; using clustering as the basis for classification
  Real application domains
    Text Mining and document analysis/filtering
    Recommender systems
    Predictive modeling for marketing/business applications
    Image analysis
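The idea of using PCA before clustering can be sketched in a few lines of scikit-learn. The dataset, the choice of two components, and the choice of three clusters are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Illustrative data; the class labels are discarded because this is unsupervised.
X, _ = load_iris(return_X_y=True)

# Standardize so that no single feature dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components (assumed dimensionality).
pca = PCA(n_components=2, random_state=0)
X_reduced = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# Cluster in the reduced space (three clusters is an assumption).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

print("Variance retained by 2 components: %.3f" % explained)
print("Cluster sizes:", [int((labels == c).sum()) for c in range(3)])
```

The resulting cluster labels could then serve as an input feature for a later classifier, which is one way to use clustering as the basis for classification.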
What we did not do (and you should learn later)
  Approaches for mining sequential/temporal data
    Markov models; time series analysis; sequential pattern mining
  More Ensemble and Hybrid Classifiers/Predictors
    Combining multiple classifiers
    Random Forest classifiers
    Other Meta-learners such as AdaBoost
  Support Vector Machines and Kernel-Based Classifiers
  Topic modeling with Latent factor models
    Latent Dirichlet Allocation (LDA)
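As a possible starting point for exploring some of these topics, the sketch below compares a Random Forest, an AdaBoost ensemble, and an RBF-kernel SVM using the same cross-validation approach covered in the course. The dataset and all hyperparameter values are assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative dataset (an assumption).
X, y = load_iris(return_X_y=True)

# Hyperparameters below are untuned placeholder values.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
    # SVMs are sensitive to feature scale, so the SVC is wrapped with a scaler.
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
}

# 5-fold cross-validated accuracy for each model.
mean_accuracy = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print("%-14s mean accuracy: %.3f" % (name, mean_accuracy[name]))
```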