LÊ QUỐC HUY, ID: QLU13069

Presentation transcript:

LÊ QUỐC HUY ID: QLU

OUTLINE
• What is data mining?
• Major issues in data mining

WHAT IS DATA MINING?
• The process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both.


MAJOR ISSUES IN DATA MINING
• Mining Methodology
• User Interaction
• Diversity of Database Types
• Data Mining and Society


MINING METHODOLOGY
• Mining various and new kinds of knowledge:
  - Data mining covers a wide spectrum of data analysis and knowledge discovery tasks, from data characterization and discrimination to association and correlation analysis, classification, clustering, and more.
  - These tasks require the development of numerous data mining techniques.

MINING METHODOLOGY
• Mining knowledge in multidimensional space:
  - Searching for knowledge in large data sets.
  - Searching for interesting patterns among combinations of dimensions (attributes) at varying levels of abstraction.
  - Such mining is known as (exploratory) multidimensional data mining.
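To make "combinations of dimensions at varying levels of abstraction" concrete, here is a minimal sketch in plain Python. The sales records, the dimension names, and the city-to-region hierarchy are all invented for illustration; a real system would typically rely on a data cube or OLAP engine rather than in-memory counting.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sales records; dimension names and values are invented.
records = [
    {"city": "Hanoi", "item": "phone", "quarter": "Q1"},
    {"city": "Hanoi", "item": "phone", "quarter": "Q2"},
    {"city": "Hue", "item": "laptop", "quarter": "Q1"},
    {"city": "Hanoi", "item": "laptop", "quarter": "Q1"},
]

# Hypothetical concept hierarchy: each city rolls up to a region.
CITY_TO_REGION = {"Hanoi": "North", "Hue": "Central"}


def cuboid_counts(rows, dimensions):
    """Count records for every non-empty combination of the given dimensions."""
    result = {}
    for size in range(1, len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            result[dims] = Counter(tuple(row[d] for d in dims) for row in rows)
    return result


def roll_up(rows, dimension, mapping, new_name):
    """Replace one dimension with a coarser level of abstraction."""
    return [
        {**{k: v for k, v in row.items() if k != dimension},
         new_name: mapping[row[dimension]]}
        for row in rows
    ]


# Counts at the detailed level, for all 7 combinations of 3 dimensions.
counts = cuboid_counts(records, ["city", "item", "quarter"])

# The same kind of counts after rolling city up to region.
regional = cuboid_counts(
    roll_up(records, "city", CITY_TO_REGION, "region"), ["region", "item"]
)
```

Inspecting `counts[("city", "item")]` or `regional[("region",)]` corresponds to looking at one group of cells in the cube; an exploratory tool would let the analyst move between such views.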

MINING METHODOLOGY
• Boosting the power of discovery in a networked environment:
  - Most data objects reside in a linked or interconnected environment, whether it be the Web, database relations, files, or documents.
  - Semantic links across multiple data objects can be used to advantage in data mining.
  - Knowledge derived in one set of objects can be used to boost the discovery of knowledge in a “related” or semantically linked set of objects.

MINING METHODOLOGY
• Handling uncertainty, noise, or incompleteness of data:
  - Data often contain noise, errors, exceptions, or uncertainty, or are incomplete.
  - Errors and noise may confuse the data mining process, leading to the derivation of erroneous patterns.

MINING METHODOLOGY
• Handling uncertainty, noise, or incompleteness of data:
  - Data cleaning, data preprocessing, outlier detection and removal, and uncertainty reasoning are examples of techniques that need to be integrated with the data mining process.
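As a rough illustration of the cleaning and outlier-removal steps named above, the sketch below drops missing readings and then flags values outside the interquartile-range fences. The IQR rule and the 1.5 multiplier are a common convention chosen here as an assumption; the slides do not prescribe a specific method, and the sample readings are invented.

```python
from statistics import quantiles


def clean_and_split(values, fence=1.5):
    """Drop missing values, then separate inliers from IQR-based outliers.

    `fence` is the usual multiplier on the interquartile range; 1.5 is a
    conventional default, not something taken from the slides.
    """
    present = [v for v in values if v is not None]  # data cleaning step
    q1, _, q3 = quantiles(present, n=4)             # quartile cut points
    iqr = q3 - q1
    low, high = q1 - fence * iqr, q3 + fence * iqr
    kept = [v for v in present if low <= v <= high]
    outliers = [v for v in present if v < low or v > high]
    return kept, outliers


# Hypothetical sensor readings with one missing value and one suspicious spike.
readings = [10, 12, None, 11, 13, 12, 95, 11]
kept, outliers = clean_and_split(readings)
```

Only `kept` would be passed on to the mining step; `outliers` can be reviewed separately, since an exception is sometimes the interesting pattern rather than an error.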


USER INTERACTION
• Interactive mining:
  - The data mining process should be highly interactive.
  - It is therefore important to build flexible user interfaces and an exploratory mining environment that facilitate the user’s interaction with the system.

USER INTERACTION
• Interactive mining: the system should allow users to
  + dynamically change the focus of a search based on returned results
  + refine mining requests based on returned results
  + dynamically explore “cube space” while mining

USER INTERACTION
• Incorporation of background knowledge:
  - Background knowledge, constraints, rules, and other information regarding the domain under study should be incorporated into the knowledge discovery process.
  - Such knowledge can be used for pattern evaluation as well as to guide the search toward interesting patterns.
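One simple way to picture background knowledge guiding pattern evaluation is a filter over mined patterns. The sketch below counts co-occurring item pairs and then applies two assumed forms of domain knowledge: a set of associations the analyst already knows, and a set of items the analyst wants to focus on. The transactions, `KNOWN_PAIRS`, and `FOCUS_ITEMS` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "diapers"},
    {"bread", "milk", "diapers"},
]

# Hypothetical background knowledge supplied by a domain expert.
KNOWN_PAIRS = {frozenset({"bread", "butter"})}  # already known, not interesting
FOCUS_ITEMS = {"milk"}                          # constraint: must involve these


def frequent_pairs(baskets, min_support):
    """Return item pairs that co-occur in at least `min_support` baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[frozenset(pair)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def apply_background_knowledge(patterns, known, focus):
    """Keep only patterns that are new to the analyst and touch a focus item."""
    return {p: n for p, n in patterns.items() if p not in known and p & focus}


candidates = frequent_pairs(transactions, min_support=2)
interesting = apply_background_knowledge(candidates, KNOWN_PAIRS, FOCUS_ITEMS)
```

Here the constraints are applied after mining; a more efficient design would push them into the search itself so that uninteresting candidates are never generated.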

USER INTERACTION
• Presentation and visualization of data mining results: how can a data mining system present its results vividly and flexibly, so that the discovered knowledge can be easily understood and directly used by humans?
  - This is especially crucial if the data mining process is interactive.
  - It requires the system to adopt expressive knowledge representations, user-friendly interfaces, and visualization techniques.


DIVERSITY OF DATABASE TYPES
• Handling complex types of data:
  - Diverse applications generate a wide spectrum of new data types, from simple data objects to temporal data, sensor data, software program code, social network data, and more.
  - It is therefore unrealistic to expect one data mining system to mine all kinds of data, given the diversity of data types and the different goals of data mining.

DIVERSITY OF DATABASE TYPES
• Handling complex types of data:
  - Domain- or application-dedicated data mining systems are being constructed for in-depth mining of specific kinds of data.
  - The construction of effective and efficient data mining tools for diverse applications remains a challenging and active area of research.

DIVERSITY OF DATABASE TYPES
• Mining dynamic, networked, and global data repositories:
  - Multiple sources of data are connected by the Internet and various kinds of networks, forming gigantic, distributed, and heterogeneous global information systems.

DIVERSITY OF DATABASE TYPES
• Mining dynamic, networked, and global data repositories:
  - Mining such gigantic, interconnected information networks may help disclose many more patterns and knowledge in heterogeneous data sets than can be discovered from a small set of isolated data repositories.

DIVERSITY OF DATABASE TYPES
• Mining dynamic, networked, and global data repositories:
  - Web mining, multisource data mining, and information network mining have become challenging and fast-evolving data mining fields.


DATA MINING AND SOCIETY
• Social impacts of data mining:
  - The improper disclosure or use of data and the potential violation of individual privacy and data protection rights are areas of concern that need to be addressed.

DATA MINING AND SOCIETY
• Privacy-preserving data mining:
  - Data mining will help scientific discovery, business management, economic recovery, and security protection (e.g., the real-time discovery of intruders and cyberattacks).
  - However, it poses the risk of disclosing an individual’s personal information.
  - Studies on privacy-preserving data publishing and data mining are ongoing.
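The slides do not name a particular privacy technique. As one illustrative example from the privacy-preserving data publishing literature, the sketch below uses k-anonymity: quasi-identifying attributes are generalized until every combination of them is shared by at least k records. The records, the attribute names, and the generalization rules (age into decades, ZIP code truncated to three digits) are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical records; "age" and "zip" are treated as quasi-identifiers.
people = [
    {"age": 23, "zip": "70001", "diagnosis": "flu"},
    {"age": 27, "zip": "70002", "diagnosis": "cold"},
    {"age": 31, "zip": "70011", "diagnosis": "flu"},
    {"age": 38, "zip": "70015", "diagnosis": "asthma"},
]


def generalize(row):
    """Coarsen quasi-identifiers: age to a decade range, ZIP to a 3-digit prefix."""
    decade = row["age"] // 10 * 10
    return {
        "age": f"{decade}-{decade + 9}",
        "zip": row["zip"][:3] + "**",
        "diagnosis": row["diagnosis"],
    }


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values.

    A table is k-anonymous with respect to these attributes when this value
    is at least k.
    """
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


k_raw = smallest_group(people, ["age", "zip"])
k_generalized = smallest_group([generalize(p) for p in people], ["age", "zip"])
```

In the raw table every person is unique on age and ZIP code, so `k_raw` is 1; after generalization each person shares those values with one other record. k-anonymity alone does not prevent every kind of disclosure, which is part of why research in this area continues.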

DATA MINING AND SOCIETY
• Invisible data mining:
  - More and more systems should have data mining functions built in, so that people can perform data mining or use data mining results simply by mouse clicking, without any knowledge of data mining algorithms.

DATA MINING AND SOCIETY
• Invisible data mining:
  - Intelligent search engines and Internet-based stores perform such invisible data mining by incorporating data mining into their components to improve their functionality and performance. This is often done unbeknownst to the user.
