Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1:

1 Zdravko Markov and Daniel T. Larose, Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, Wiley, 2007. Slides for Chapter 1: Information Retrieval and Web Search
Part II: Web Content Mining
Chapter 3: Clustering
    Introduction
    Hierarchical Agglomerative Clustering
    K-Means Clustering
    Probability-Based Clustering
    Collaborative Filtering

2 Two Class Mixture
    Class  Value
    A      0
    B      0.961
    A      0
    A      0
    B      0.780
    A      0
    B      0
    A      0
    B      0.980
    A      0
    B      0.135
    A      0.490
    B      0.928
    B      0
    B      0.658
    A      0
    A      0
    A      0.387
    A      0.570
    B      0

    Class  Mean  Standard deviation  Probability of sampling
    A
    B

3 Finite Mixture Problem
Given the labeled data, for each class C compute:
    mean
    standard deviation
    probability of sampling
Generative document model
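
The three per-class quantities named on this slide can be computed directly from the labeled values on the "Two Class Mixture" slide. The sketch below is one possible reading, not the authors' code: the names `DATA`, `class_parameters`, and `PARAMS` are hypothetical, the standard deviation is assumed to be the sample version (dividing by n - 1), and the probability of sampling is assumed to be the class's share of all observations.

```python
import math

# (class, value) pairs copied from the "Two Class Mixture" slide.
DATA = [
    ("A", 0.0), ("B", 0.961), ("A", 0.0), ("A", 0.0), ("B", 0.780),
    ("A", 0.0), ("B", 0.0), ("A", 0.0), ("B", 0.980), ("A", 0.0),
    ("B", 0.135), ("A", 0.490), ("B", 0.928), ("B", 0.0), ("B", 0.658),
    ("A", 0.0), ("A", 0.0), ("A", 0.387), ("A", 0.570), ("B", 0.0),
]


def class_parameters(data):
    """Return {label: (mean, std, prior)} for every class label in data."""
    n_total = len(data)
    params = {}
    for label in sorted({c for c, _ in data}):
        xs = [x for c, x in data if c == label]
        mean = sum(xs) / len(xs)
        # Assumption: sample standard deviation (divide by n - 1).
        variance = sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)
        # Assumption: probability of sampling = fraction of observations in the class.
        prior = len(xs) / n_total
        params[label] = (mean, math.sqrt(variance), prior)
    return params


PARAMS = class_parameters(DATA)
for label, (mean, std, prior) in PARAMS.items():
    print(f"{label}: mean={mean:.3f} std={std:.3f} P={prior:.2f}")
# A: mean=0.132 std=0.229 P=0.55
# B: mean=0.494 std=0.449 P=0.45
```

Dividing by n instead of n - 1 would give slightly smaller standard deviations; which convention the slides intend is not recoverable from the transcript.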

4 Finite Mixture Problem (2)
    Class  Value
    A      0
    B      0.961
    A      0
    A      0
    B      0.780
    A      0
    B      0
    A      0
    B      0.980
    A      0
    B      0.135
    A      0.490
    B      0.928
    B      0
    B      0.658
    A      0
    A      0
    A      0.387
    A      0.570
    B      0

5 Classification Problem
Given the mean, standard deviation, and probability of sampling for classes A and B, compute P(A|x) and P(B|x).
Use Bayes' rule:
    if x is a discrete variable, with the class-conditional probability of x
    if x is a continuous variable, with the probability density function in its place
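
For the continuous case, the classification step might look like the sketch below. It assumes each class is modeled by a normal density, which the transcript does not confirm, and it applies Bayes' rule in the form P(C|x) = f_C(x) P(C) / sum over classes of f_C'(x) P(C'). The names `normal_pdf`, `posteriors`, and `PARAMS` are hypothetical; the parameter values are rounded estimates computed from the labeled two-class data earlier in the slides (sample standard deviation, class share as probability of sampling), not figures quoted from the deck.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, params):
    """Return {label: P(label | x)} from {label: (mean, std, prior)}."""
    # Numerator of Bayes' rule for each class: density at x times the prior.
    joint = {c: normal_pdf(x, m, s) * p for c, (m, s, p) in params.items()}
    # The denominator is the same for every class, so it only normalizes.
    evidence = sum(joint.values())
    return {c: j / evidence for c, j in joint.items()}


# Assumed parameters: (mean, standard deviation, probability of sampling).
PARAMS = {"A": (0.132, 0.229, 0.55), "B": (0.494, 0.449, 0.45)}

for x in (0.0, 0.5, 0.9):
    post = posteriors(x, PARAMS)
    print(x, {c: round(p, 3) for c, p in post.items()})
```

With these assumed parameters, a value of 0 is assigned mostly to class A and a value near 0.9 mostly to class B, which matches how the labeled values are distributed.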

