1 A Conceptual Framework of Data Mining Y.Y. Yao Department of Computer Science, University of Regina Regina, Sask., Canada S4S 0A2

Presentation transcript:

1 A Conceptual Framework of Data Mining Y.Y. Yao Department of Computer Science, University of Regina Regina, Sask., Canada S4S 0A2

2 Acknowledgements Thanks to Professors Wang Jue Zhou Zhi-Hua Zhou Aoying for the kind invitation and this opportunity.

3 Motivations “The question typically is not what is an ecosystem, but how do we measure certain relationships between populations, how do some variables correlate with other variables, and how can we use this knowledge to extend our domain.” Salthe, S.N. Evolving Hierarchical Systems, Their Structure and Representation

4 Motivations “… the scientist is usually not, on the other hand, a self-conscious epistemologist. That would mean going beyond his area of narrow training for the purpose of questioning its point. Functioning as a scientist means functioning within the rules of a game learned during the apprenticeship in which examination of the philosophic foundations of the game plays a characteristically tiny role.”

5 Motivations (Data Mining) One is more interested in algorithms for finding “knowledge” than in what knowledge is. One is more interested in an implementation-oriented view or framework of data mining than in a conceptual framework for understanding the nature of data mining.

6 Data mining Function-oriented approaches: requirements. Theory-oriented approaches: mathematical/statistical methods. Procedure/process-oriented approaches: KDD processes. There does not exist a conceptual framework for data mining.

7 Motivations (General) We are more interested in doing than understanding. We are more interested in actual systems and methods than a powerful point of view. We are more interested in solving a real world problem than acquisition of knowledge. We have enough knowledge, but not sufficient wisdom in using the knowledge.

8 Motivations Four international workshops have been held on the foundations of data mining. There still does not exist a well-accepted and non-controversial framework. Many papers do not cover the “foundations of data mining”.

9 The question How to view and study data mining? What can we learn from our experiences? From other fields. From well established branches.

10 Knowledge structure and problem solving in physics Reif and Heller: “Effective problem solving in a realistic domain depends crucially on the content and structure of the knowledge about the particular domain.” The knowledge about physics “specifies special descriptive concepts and relations described at various level of abstractness, is organized hierarchically, and is accompanied by explicit guidelines specifying when and how this knowledge is to be applied.”

11 Knowledge structure and education Experts and novices differ in their knowledge organization. Experts are able to establish multiple representations of the same problem at different levels of granularity. Experts are able to see the connections between different grain-sized knowledge.

12 Cognitive Science Posner, 1989: According to the cognitive science approach, to learn a new field is to build appropriate cognitive structures and to learn to perform computations that will transform what is known into what is not yet known.

13 A New View Data mining as a field of study, rather than simply a collection of algorithms or a combination of several fields. The study of data mining may be viewed as a scientific enquiry into the nature of data mining and the scope of data mining methods.

14 Three basic questions What are the foundations of data mining? What is the scope of the foundations of data mining? What are the differences between existing research and research on the foundations of data mining?

15 A potential solution The study of the nature of data mining: the philosophical foundations, the theoretical foundations, the mathematical foundations. The study of data mining methods: the philosophical foundations, the theoretical foundations, the mathematical foundations, the technological foundations.

16 A conceptual framework A layered framework can be established. Each layer/level deals with the problem in a different context: in mind and in the abstract, in machine, and in application.

17 A layered model of Data Mining Philosophy level, algorithm/technique level, application level. [Figure: philosophy layer, technique layer, application layer]

18 A layered model Philosophy level: What is knowledge? The study of knowledge & knowledge discovery in mind and in the abstract. What is knowledge representation? How to express and communicate knowledge? What is the relationship between knowledge in mind and in the real world? How to classify knowledge? How to organize knowledge?

19 A layered model Technique level: How to discover knowledge? The study of knowledge & knowledge discovery in machine. How to code, store, and retrieve knowledge in a computer? How to develop an efficient algorithm? How to improve an existing technique?

20 A layered model Application level: How to use the discovered knowledge? The study of the applications of discovered knowledge. Is the discovered knowledge useful? Is the discovered knowledge meaningful? How to use the knowledge?

21 A layered model Philosophy level: The study of knowledge & knowledge discovery in mind and in the abstract. Technique level: The study of knowledge & knowledge discovery in machine. Application level: The study of the applications of discovered knowledge. 1. The division among the three levels is not clear-cut, and the levels may overlap with each other. 2. The inner layers establish a foundation for the outer layers. 3. The outer layers may raise questions for the inner layers.

22 A layered model of KDD The results from the philosophy level provide guidelines and set the stage for the algorithm and application levels. Philosophical study does not depend on the availability of specific techniques. Technical study is not constrained by a particular application. The existence of a type of knowledge in data is unrelated to whether we have an algorithm to extract it. The existence of an algorithm does not necessarily imply that the discovered knowledge is meaningful and useful.

23 A layered model of KDD The three levels represent the understanding, discovery, and utilization of knowledge. Each of them is indispensable in the study of intelligence and intelligent systems. They must be considered together in a common framework through multi-disciplinary studies, rather than in isolation.

24 Application of the layered framework Concept formation and learning can be studied within the layered framework. The reconsideration brings a better understanding of the problem.

25 Application of the layered framework Concept formation and learning can be studied within the layered framework. The reconsideration brings a better understanding of the problem.

26 Philosophy level study of concept Classical view A concept is described jointly by its intension and extension.

27 Philosophy level study of concept

28 Philosophy level study of concept Two basic issues of concept formation Aggregation aims at the identification of a group of objects so that they form the extension of a concept. Characterization attempts to describe a set of objects as their intension.

29 Philosophy level study of concept Classical view Differentiation Integration Aggregation Characterization Concept formation Concept formation

30 Philosophy level study of concept Classical view Aggregation Characterization vs. Differences Concept formation Concept formation

31 Philosophy level study of concept Classical view Aggregation Characterization vs. Differences Similarities Concept formation Concept formation

32 Philosophy level study of concept Classical view Aggregation Characterization vs. Extension Intension Concept formation Concept formation

33 Philosophy level study of concept Context Hierarchy Concept learning Concept learning Concept formation Concept formation

34 Philosophy level study of concept Context Hierarchy Concept learning Concept learning Concept formation Concept formation

35 Technique level study of concept Search for the intension Given a context - Search for the extension Analyze the concepts relationship

36 Technique level study of concept Intensions of concepts defined by a language

37 Technique level study of concept Intensions of concepts defined by a language

38 Technique level study of concept Conjunctive concept space

39 Technique level study of concept Conjunctive concept space

40 Technique level study of concept

41 Technique level study of concept Extensions of concepts defined by an information table

42 Technique level study of concept

43 Technique level study of concept Extensions of concepts defined by an information table
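
The slide's own definitions did not survive in this transcript, so the following is only a minimal sketch of the idea named in its title. It assumes an information table is a mapping from objects to attribute-value rows, and that a conjunctive formula is a set of attribute-value pairs. The table contents and the name `extension` are hypothetical.

```python
# Hypothetical information table: each object maps attribute names to values.
TABLE = {
    "o1": {"height": "short", "hair": "blond", "eyes": "blue"},
    "o2": {"height": "short", "hair": "blond", "eyes": "brown"},
    "o3": {"height": "tall", "hair": "red", "eyes": "blue"},
    "o4": {"height": "tall", "hair": "dark", "eyes": "blue"},
}


def extension(table, formula):
    """Return the set of objects that satisfy every (attribute, value) pair.

    The formula plays the role of the intension; the returned set of
    objects is the corresponding extension within this table.
    """
    return {
        obj
        for obj, row in table.items()
        if all(row.get(attr) == value for attr, value in formula.items())
    }


blond = extension(TABLE, {"hair": "blond"})
blond_blue = extension(TABLE, {"hair": "blond", "eyes": "blue"})
```

Adding a conjunct can only shrink the extension, which is why `blond_blue` is a subset of `blond`.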

44 Technique level study of concept Relationship between concepts in an information table

45 Technique level study of concept Relationship between concepts in an information table

46 Technique level study of concept Probabilistic measures:

47 Technique level study of concept Probabilistic measures:
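
The formulas on this slide are missing from the transcript. As an illustration only, the sketch below computes three commonly used set-based measures for a relationship between two concepts: generality, confidence, and coverage. Whether these are the measures the slide shows is an assumption; the table and all function names are hypothetical.

```python
# Hypothetical information table (assumption, not taken from the slides).
TABLE = {
    "o1": {"hair": "blond", "eyes": "blue"},
    "o2": {"hair": "blond", "eyes": "brown"},
    "o3": {"hair": "red", "eyes": "blue"},
    "o4": {"hair": "dark", "eyes": "blue"},
}


def extension(table, formula):
    """Objects satisfying every (attribute, value) pair of the formula."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(a) == v for a, v in formula.items())
    }


def generality(table, phi):
    """Fraction of all objects that belong to the extension of phi."""
    return len(extension(table, phi)) / len(table)


def confidence(table, phi, psi):
    """Fraction of phi's extension that also satisfies psi."""
    m_phi = extension(table, phi)
    if not m_phi:
        return 0.0  # convention chosen here for an empty extension
    return len(m_phi & extension(table, psi)) / len(m_phi)


def coverage(table, phi, psi):
    """Fraction of psi's extension that is covered by phi."""
    m_psi = extension(table, psi)
    if not m_psi:
        return 0.0  # convention chosen here for an empty extension
    return len(extension(table, phi) & m_psi) / len(m_psi)


phi = {"hair": "blond"}
psi = {"eyes": "blue"}
g = generality(TABLE, phi)
c = confidence(TABLE, phi, psi)
v = coverage(TABLE, phi, psi)
```

All three values are estimated purely from the sizes of extensions, so they describe the table at hand rather than any population beyond it.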

48 Technique level study of concept Concept learning as search

49 Technique level study of concept Concept learning as search

50 Technique level study of concept Concept learning as search

51 Technique level study of concept Concept learning as search
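
The search procedure on these slides is not preserved in the transcript. The following is a rough sketch of one way to read "concept learning as search": enumerate conjunctive formulas from general to specific and keep those whose extension is non-empty and lies inside a target set of objects. The table, the target set, and the names `candidate_formulas` and `learn` are hypothetical; the exhaustive enumeration is a simplification, not the author's algorithm.

```python
from itertools import combinations, product

# Hypothetical information table and target concept (assumptions).
TABLE = {
    "o1": {"height": "short", "hair": "blond", "eyes": "blue"},
    "o2": {"height": "short", "hair": "blond", "eyes": "brown"},
    "o3": {"height": "tall", "hair": "red", "eyes": "blue"},
    "o4": {"height": "tall", "hair": "dark", "eyes": "blue"},
}
TARGET = {"o1", "o3"}  # extension of the concept to be characterized


def extension(table, formula):
    """Objects satisfying every (attribute, value) pair of the formula."""
    return {
        obj
        for obj, row in table.items()
        if all(row.get(a) == v for a, v in formula.items())
    }


def candidate_formulas(table, max_len):
    """Yield conjunctive formulas with 1..max_len conjuncts, shortest first."""
    attrs = sorted({a for row in table.values() for a in row})
    values = {a: sorted({row[a] for row in table.values()}) for a in attrs}
    for k in range(1, max_len + 1):
        for chosen in combinations(attrs, k):
            for vals in product(*(values[a] for a in chosen)):
                yield dict(zip(chosen, vals))


def learn(table, target, max_len=2):
    """Collect the most general formulas whose extension lies in the target."""
    found = []
    for formula in candidate_formulas(table, max_len):
        ext = extension(table, formula)
        if not ext or not ext <= target:
            continue
        # Skip formulas that merely specialize one already accepted.
        if any(g.items() <= formula.items() for g in found):
            continue
        found.append(formula)
    return found


rules = learn(TABLE, TARGET)
covered = set().union(*(extension(TABLE, r) for r in rules))
```

Because shorter formulas are visited first, a longer formula is kept only when no accepted formula already generalizes it. The exhaustive enumeration grows quickly with the number of attributes, so a practical search would prune or order candidates more carefully.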

52 Application level study of concept The main purposes of science are to describe and predict, to improve or manipulate the world around us, and to explain our world. Concept learning should serve the same purposes.

53 Application level study of concept to describe

54 Application level study of concept to predict

55 Application level study Domain specific The usefulness of concepts needs to be defined and interpreted based on other more familiar notions.

56 Conclusions It is important to treat data mining as a field of scientific enquiry. One needs to consider all aspects of data mining. The layered framework may provide a better understanding of data mining.

57 Conclusions We need to find the cognitive structures or knowledge structures of data mining. We need to move beyond algorithm-centered and application-centered views of data mining. We need to avoid seductive semantics.

58 Conclusions Data mining can be studied in the context of scientific discovery and research methods. Data mining and machine learning systems may be viewed as support systems for the exploration of data, such as research support systems.

59 Thank you! The ideas are preliminary and need fine-tuning. Your comments, suggestions, and criticisms are welcome!