Overview
Definition of Data Mining
Applications of Data Mining
Data Mining
Data mining refers to the mining or discovery of new information, in the form of patterns or rules, from vast amounts of data. To be useful, data mining must be carried out efficiently on large files and databases.
KDD: Knowledge Discovery in Databases
Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation
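The KDD stages can be sketched as a chain of small functions. This is only an illustrative sketch: the store records, the field names, and the helper functions (clean, integrate, select, mine, evaluate) are hypothetical and are not taken from the slides, and the "mining" step is reduced to simple frequency counting.

```python
from collections import Counter

def clean(records):
    """Data cleaning: drop records that contain missing (None) values."""
    return [r for r in records if None not in r.values()]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Selection: keep only the task-relevant fields."""
    return [{f: r[f] for f in fields} for r in records]

def mine(records, field):
    """Data mining (simplified): count how often each value occurs."""
    return Counter(r[field] for r in records)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns that occur often enough."""
    return {value: n for value, n in patterns.items() if n >= min_count}

# Hypothetical sales records from two source databases.
store_a = [
    {"customer": "c1", "item": "milk", "price": 1.2},
    {"customer": "c2", "item": None, "price": 0.9},  # incomplete record
]
store_b = [
    {"customer": "c3", "item": "milk", "price": 1.1},
    {"customer": "c4", "item": "bread", "price": 2.0},
]

warehouse = clean(integrate(store_a, store_b))
relevant = select(warehouse, ["item"])
patterns = evaluate(mine(relevant, "item"), min_count=2)
print(patterns)
```

Each function stands in for one box of the KDD diagram, so a real system would replace each body with a far more involved step.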
Data Mining vs. Data Warehousing
The goal of a data warehouse is to support decision making with data. Data mining can be used in conjunction with a data warehouse to help with certain types of decisions.
Goals of Data Mining and Knowledge Discovery
Prediction – Data mining can show how certain attributes within the data will behave in the future.
Identification – Data patterns can be used to identify the existence of an item, an event, or an activity.
Goals of Data Mining and Knowledge Discovery (cont.)
Classification – Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters.
Optimization – One eventual goal of data mining may be to optimize the use of limited resources such as time, space… and to maximize output variables such as sales or profits under a given set of constraints.
Types of Knowledge Discovered During Data Mining
Association rules
Classification hierarchies
Sequential patterns
Patterns within time series
Clustering
Classification Hierarchies
Classification is the process of learning a model that describes different classes of data.
A decision tree is one such model.
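As a rough illustration of learning a model from labeled data, the sketch below learns a one-level decision tree (a decision stump) rather than a full tree. It picks the single attribute whose branches misclassify the fewest training records. The credit-risk records, attribute names, and function names are hypothetical.

```python
from collections import Counter, defaultdict

def learn_stump(records, attributes, label):
    """Pick the attribute whose values best separate the class labels."""
    best = None
    for attr in attributes:
        # Count the class labels seen for each value of this attribute.
        counts = defaultdict(Counter)
        for r in records:
            counts[r[attr]][r[label]] += 1
        # Each branch predicts the majority class among its records.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(r[label] != rule[r[attr]] for r in records)
        if best is None or errors < best[0]:
            best = (errors, attr, rule)
    _, attr, rule = best
    return attr, rule

def classify(stump, record, default=None):
    """Follow the branch that matches the record's attribute value."""
    attr, rule = stump
    return rule.get(record[attr], default)

# Hypothetical training data: each record is already labeled with a class.
records = [
    {"income": "high", "married": "yes", "risk": "low"},
    {"income": "high", "married": "no", "risk": "low"},
    {"income": "low", "married": "yes", "risk": "high"},
    {"income": "low", "married": "no", "risk": "high"},
    {"income": "medium", "married": "yes", "risk": "low"},
    {"income": "medium", "married": "no", "risk": "high"},
]

stump = learn_stump(records, ["income", "married"], "risk")
print(stump[0], classify(stump, {"income": "low", "married": "yes"}))
```

A full decision tree would repeat this attribute choice recursively on each branch, usually with a measure such as information gain instead of a raw error count.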
Sequential Patterns
The discovery of sequential patterns is based on the concept of a sequence of itemsets. The problem is to find all subsequences from the given sets of sequences that have a user-defined minimum support.
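To make "minimum support" concrete, the sketch below checks whether a candidate sequence of itemsets occurs as a subsequence of each customer sequence and reports the fraction that contain it. It only tests a single candidate and does not enumerate all frequent subsequences. The shopping data, the threshold, and the function names are hypothetical.

```python
def is_subsequence(candidate, sequence):
    """True if the candidate's itemsets are contained, in order, in
    distinct itemsets of the sequence."""
    pos = 0
    for itemset in candidate:
        # Advance to the next itemset in the sequence that contains this one.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # later candidate itemsets must match later positions
    return True

def support(candidate, sequences):
    """Fraction of sequences that contain the candidate as a subsequence."""
    return sum(is_subsequence(candidate, s) for s in sequences) / len(sequences)

# Hypothetical data: one sequence of itemsets (purchases) per customer.
sequences = [
    [{"milk", "bread"}, {"juice"}],
    [{"milk"}, {"bread"}, {"juice"}],
    [{"bread"}, {"milk"}],
    [{"milk", "bread"}, {"eggs"}, {"juice", "bread"}],
]

candidate = [{"milk"}, {"juice"}]
min_support = 0.5  # user-defined threshold (assumed value)
s = support(candidate, sequences)
print(s, s >= min_support)
```

Here the candidate "milk, then later juice" is contained in three of the four sequences, so it meets the assumed threshold.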
Patterns Within Time Series
Time series are sequences of events.
Each event may be a given fixed type of transaction.
For example, the closing price of a stock or a fund is an event that occurs every weekday for each stock and fund.
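One simple kind of pattern in such a series is a run of consecutive rising closing prices. The sketch below finds those runs. The prices, the minimum run length, and the function name are hypothetical, and this is only one of many patterns that could be searched for.

```python
def rising_runs(closes, min_length=3):
    """Return (start, end) index pairs of runs in which every close is
    higher than the previous one and the run spans at least min_length events."""
    runs = []
    start = 0
    for i in range(1, len(closes) + 1):
        # A run ends at the end of the series or when the price fails to rise.
        if i == len(closes) or closes[i] <= closes[i - 1]:
            if i - start >= min_length:
                runs.append((start, i - 1))
            start = i
    return runs

# Hypothetical weekday closing prices for one stock.
closes = [10.0, 10.5, 11.0, 10.8, 10.9, 11.2, 11.6, 11.5]
runs = rising_runs(closes)
print(runs)
```

The indices refer to positions in the series, so with one event per weekday they map directly to trading days.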
Applications of Data Mining
Marketing – Applications include analysis of consumer behavior based on buying patterns.
Finance – Applications include analysis of creditworthiness of clients, segmentation of account receivables…
Applications of Data Mining (cont.)
Manufacturing – Applications involve optimization of resources like machines, manpower, and materials.
Health Care – Applications include discovering patterns in radiological images, analyzing side effects of drugs…
Real-Life Application
The Los Angeles Police Department's counterterrorism unit is using a new data-analysis system designed to identify and connect related pieces of intelligence, to help officers deter and respond to terrorist attacks.
References
Elmasri, Ramez. Fundamentals of Database Systems. Pearson, Singapore, 2004.
LAPD turns to data analysis to fight terrorism.