Data Mining. David Eichmann, School of Library and Information Science, The University of Iowa


1 Data Mining David Eichmann School of Library and Information Science The University of Iowa

2 Why? Given enough data represented through enough dimensions, we lose the ability to see the patterns

3 How? Decision Trees Nearest Neighbor Clustering Neural Networks Rule Induction K-Means Clustering
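The slide only names the techniques, so as an illustration of one of them, here is a minimal sketch of K-Means clustering in plain Python. The function name `kmeans`, its parameters, and the sample points are assumptions made for this example; they do not come from the presentation.

```python
def kmeans(points, centroids, iterations=10):
    """Cluster points (tuples of floats) around the given starting centroids.

    Hypothetical helper for illustration; returns (centroids, clusters).
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid
        # (squared Euclidean distance).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # assignments are stable, so the clustering has converged
        centroids = updated
    return centroids, clusters


# Made-up sample data: two visibly separate groups of points.
sample = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, groups = kmeans(sample, [(0.0, 0.0), (10.0, 10.0)])
```

The starting centroids are passed in explicitly so the run is repeatable; real uses often pick them at random and restart several times.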

4 What is it? The automated extraction of hidden predictive information from databases. Key points: Automated, Hidden, Predictive

5 The Typical Process

6 Evaluation Criteria Receiver Operating Characteristic Curves
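The slide names ROC curves without showing how one is built. A minimal sketch, assuming a binary classifier that outputs a score per example: sweep the decision threshold from high to low and record the false-positive and true-positive rates at each step. The function names `roc_points` and `auc` and the sample labels and scores are made up for this example.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    labels are 1 (positive) or 0 (negative); a higher score means "more positive".
    Assumes at least one positive and one negative label.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: six documents, three of them truly relevant.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

An area of 1.0 means every positive is ranked above every negative, while 0.5 is what random ranking would give.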

7 But Nobody Said We Had To Do MATH….

8 Forms of Data Structured: databases, forms. Semi-structured: tables on the Web, bibliographic citations, graphs & charts. Unstructured: full text (e.g., journal articles, physician chart notes), images.

9 Text Mining Corpus now is a collection of text artifacts: full text when you've got it (e.g., newswire), metadata when you don't (e.g., MEDLINE). The trick then becomes extracting interesting relationships between interesting entities: who killed who, who works for who, who makes what.

10 The Classic Entities Persons, Organizations, Places (Geography), Events

11 A Newswire Example APW [Israel (0.271), Jonathan Pollard (0.153), Benjamin Netanyahu (0.102), Bill Clinton (0.102), United States (0.055), ...] Persons: Bill Clinton (3), Jonathan Pollard (8), Moshe Fogel (2), Benjamin Netanyahu (2), Israeli Embassy (1). Organizations: Cabinet (1). Places: Israel (16), United States (5), Washington (2).

12 In the Medical/Health Realm UMLS an excellent framework: Organism, Chemical, Activity, Disease

13 A MEDLINE Example Document: Reconstructive surgery in Nicaragua. Provided MeSH Keywords: Human Nicaragua Z Surgery, Plastic/* G. Phrases: [Reconstructive, surgery] [Nicaragua] [letter]. MeSH Terms: Surgery (1) G Letter [Publication Type] (1). Other Phrases: Reconstructive surgery (1).

14 Concept Extraction Example Roman forces under Julius Caesar invade Britain. (S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)).) Entity Attributes: Concepts:
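The bracketed parse on this slide can be read mechanically. The sketch below is an illustration only, not the system demonstrated in the talk: it parses the bracketed string into a nested tree and collects the innermost noun phrases as candidate entities. The helper names `parse_tree`, `leaf_words`, and `base_noun_phrases` are hypothetical.

```python
import re


def parse_tree(text):
    """Parse a bracketed parse string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    position = 0

    def node():
        nonlocal position
        position += 1                      # skip the opening "("
        label = tokens[position]
        position += 1
        children = []
        while tokens[position] != ")":
            if tokens[position] == "(":
                children.append(node())    # nested constituent
            else:
                children.append(tokens[position])  # plain word
                position += 1
        position += 1                      # skip the closing ")"
        return (label, children)

    return node()


def leaf_words(tree):
    """Return the words under a tree node, left to right."""
    words = []
    for child in tree[1]:
        words.extend(leaf_words(child) if isinstance(child, tuple) else [child])
    return words


def base_noun_phrases(tree):
    """Collect NPs that contain no smaller NP, as candidate entities."""
    label, children = tree
    found = []
    for child in children:
        if isinstance(child, tuple):
            found.extend(base_noun_phrases(child))
    if label == "NP" and not found:
        return [" ".join(leaf_words(tree))]
    return found


parse = "(S (NP (NP Roman forces) (PP under (NP Julius Caesar))) (VP invade (NP Britain)).)"
entities = base_noun_phrases(parse_tree(parse))
```

Candidate entities found this way would still need to be typed as persons, organizations, places, or events, which this sketch does not attempt.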

15 And a Small Demo…

