
1 Relational Learning with Gaussian Processes
By Wei Chu, Vikas Sindhwani, Zoubin Ghahramani, S. Sathiya Keerthi (Columbia, Chicago, Cambridge, Yahoo!)
Presented by Nesreen Ahmed, Nguyen Cao, Sebastian Moreno, Philip Schatz

2 Outline
Introduction
Relational Gaussian Processes
Application
– Linkage prediction
– Semi-supervised learning
Experiments & Results
Conclusion & Discussion

3 Introduction
Many domains involve relational data:
– Web: document links
– Document categorization: citations
– Computational biology: protein interactions
Inter-relationships between instances can be informative for learning tasks.
Relations reflect network structure and enrich how instances are correlated.

4 Introduction
Relational information is represented by a graph G = (V, E).
In supervised learning, the graph provides structural knowledge; in semi-supervised learning, it can also be derived from the input attributes.
The graph estimates the global geometric structure of the data.

5 Gaussian Processes
A Gaussian process is a joint Gaussian distribution over the set of function values {f_x} of any arbitrary set of n instances x:
P(f) = N(0, Σ), where f = (f_{x_1}, ..., f_{x_n})ᵀ and the covariance matrix has entries Σ_ij = K(x_i, x_j) for a kernel function K.
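
For concreteness, a minimal sketch (not from the slides) of drawing one sample from a zero-mean GP prior; the Gaussian kernel and its length-scale are illustrative choices:

```python
import numpy as np

def rbf_kernel(X, Z, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                       # 5 instances, 2 attributes each
Sigma = rbf_kernel(X, X)                          # Sigma_ij = K(x_i, x_j)
f = rng.multivariate_normal(np.zeros(5), Sigma)   # one draw f ~ N(0, Sigma)
```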

6 Relational Gaussian Processes
Linkages: the uncertainty in observing a linkage ε_ij induces Gaussian noise N(0, σ²) in observing the corresponding instances' function values.
[Figure: two nodes x_i and x_j joined by an edge labeled ε_ij]

7 Relational Gaussian Processes
Approximate inference: the posterior couples the GP prior with one factor per linkage, where (i, j) runs over the set of observed undirected linkages.
The EP algorithm approximates each linkage factor by an unnormalized bivariate Gaussian whose precision Π̃_ij is a 2×2 symmetric matrix, as written out below.
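
In symbols (a reconstruction consistent with the slide; t̃_ij and Π̃_ij denote the EP site approximation and its 2×2 precision):

```latex
P(\mathbf{f} \mid \mathcal{E}) \propto P(\mathbf{f}) \prod_{(i,j) \in \mathcal{E}} P(\varepsilon_{ij} \mid f_{x_i}, f_{x_j})
\approx P(\mathbf{f}) \prod_{(i,j) \in \mathcal{E}} \tilde t_{ij},
\qquad
\tilde t_{ij} \propto \exp\!\left( -\tfrac{1}{2}
  \begin{bmatrix} f_{x_i} & f_{x_j} \end{bmatrix}
  \tilde\Pi_{ij}
  \begin{bmatrix} f_{x_i} \\ f_{x_j} \end{bmatrix} \right)
```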

8 Relational Gaussian Processes
The approximate posterior is P(f | ε) ≈ N(0, (Σ⁻¹ + Π)⁻¹), where Π = Σ_{(i,j)} Π̃_ij and each Π̃_ij is an n×n matrix with four non-zero entries, augmented from the corresponding 2×2 site matrix.
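
The assembly step in code: a minimal sketch, assuming the 2×2 site precisions have already been fitted by EP (the prior covariance and site values below are placeholders):

```python
import numpy as np

def augment(site, i, j, n):
    """Embed a 2x2 site precision at rows/cols (i, j) of an n x n matrix."""
    P = np.zeros((n, n))
    P[np.ix_([i, j], [i, j])] = site
    return P

n = 5
Sigma = np.eye(n) + 0.1 * np.ones((n, n))      # stand-in prior covariance
edges = [(0, 1), (2, 3)]                       # observed undirected linkages
site = np.array([[1.0, -0.8], [-0.8, 1.0]])    # placeholder 2x2 EP site precision

Pi = sum(augment(site, i, j, n) for (i, j) in edges)
Sigma_post = np.linalg.inv(np.linalg.inv(Sigma) + Pi)   # posterior covariance of f
```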

9 Relational Gaussian Processes
For any finite collection of data points X, the set of random variables {f_x} conditioned on ε has a multivariate Gaussian distribution:
P(f_X | ε) = N(0, Σ̃),
where the elements of the covariance matrix Σ̃ are given by evaluating a data-dependent (covariance) kernel function, sketched below.
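
A sketch of this data-dependent kernel, using the standard GP posterior-covariance identity K̃(x, z) = K(x, z) − k(x)ᵀ(I + ΠΣ)⁻¹Π k(z) implied by the Gaussian approximation above; the linear kernel and site values are placeholders:

```python
import numpy as np

def posterior_kernel(x, z, K, X_graph, Sigma, Pi):
    """Data-dependent covariance: K~(x,z) = K(x,z) - k(x)^T (I + Pi Sigma)^-1 Pi k(z)."""
    n = Sigma.shape[0]
    kx = np.array([K(x, v) for v in X_graph])   # prior covariances to the graph vertices
    kz = np.array([K(z, v) for v in X_graph])
    return K(x, z) - kx @ np.linalg.solve(np.eye(n) + Pi @ Sigma, Pi @ kz)

# Tiny self-contained example with a linear kernel and one relational site.
K = lambda a, b: float(np.dot(a, b)) + 1.0
X_graph = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
Sigma = np.array([[K(a, b) for b in X_graph] for a in X_graph])
Pi = np.array([[1.0, -0.8], [-0.8, 1.0]])       # placeholder EP site precision
print(posterior_kernel(X_graph[0], X_graph[1], K, X_graph, Sigma, Pi))
```

Plugging K̃ into an ordinary GP classifier is how the relational evidence enters supervised learning as a new prior.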

10 Linkage Prediction
Joint probability of the function values at x_r and x_s under the posterior GP; from it, the probability of an edge between x_r and x_s.
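
A hedged reconstruction of the two quantities the slide names; the bivariate posterior comes from the data-dependent kernel K̃, and the edge probability integrates the linkage likelihood against it:

```latex
P\big(f_{x_r}, f_{x_s} \mid \mathcal{E}\big)
  = \mathcal{N}\!\left(\mathbf{0},\;
    \begin{bmatrix}
      \tilde K(x_r, x_r) & \tilde K(x_r, x_s)\\
      \tilde K(x_s, x_r) & \tilde K(x_s, x_s)
    \end{bmatrix}\right),
\qquad
P(\varepsilon_{rs} \mid \mathcal{E})
  = \iint P(\varepsilon_{rs} \mid f_{x_r}, f_{x_s})\,
    P(f_{x_r}, f_{x_s} \mid \mathcal{E})\; df_{x_r}\, df_{x_s}
```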

11 Semi-supervised learning
[Figure: a cloud of mostly unlabeled points (?) with two labeled samples (1)]

12 Semi-supervised learning
[Figure: the same points connected by a nearest-neighbor graph, K = 1]

13 Semi-supervised learning
[Figure: the same points connected by a nearest-neighbor graph, K = 2]
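
A minimal sketch of deriving linkages from input attributes with a K-nearest-neighbor rule, as in the figures above; the Euclidean metric and the symmetrization are assumptions:

```python
import numpy as np

def knn_edges(X, k):
    """Undirected K-nearest-neighbor linkages over the rows of X."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                 # forbid self-edges
    edges = set()
    for i in range(len(X)):
        for j in np.argsort(d[i])[:k]:           # the k closest neighbors of point i
            edges.add((min(i, int(j)), max(i, int(j))))  # store each edge once
    return sorted(edges)

X = np.random.default_rng(1).normal(size=(10, 2))
print(knn_edges(X, k=2))
```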

14 Semi-supervised learning
Apply RGP to obtain the posterior P(f | ε).
Labels and function values are related through a probit noise model.
Applying Bayes' rule yields the posterior over f given both labels and linkages.
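
Spelled out (a hedged reconstruction; Φ is the standard normal CDF, σ_n the label-noise level quoted on slide 16, and L the set of labeled instances):

```latex
P(y_x \mid f_x) = \Phi\!\left(\frac{y_x f_x}{\sigma_n}\right),
\qquad
P(\mathbf{f} \mid \mathbf{y}, \mathcal{E}) \;\propto\; P(\mathbf{f} \mid \mathcal{E}) \prod_{x \in L} P(y_x \mid f_x)
```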

15 Semi-supervised learning
Predictive distribution for the function value at a test point.
Integrating out the function value gives a Bernoulli distribution for classification.
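
A hedged reconstruction of the two formulas, using the standard probit-GP results; μ_t and σ_t² denote the predictive mean and variance at a test point x_t:

```latex
P(f_t \mid \mathbf{y}, \mathcal{E}) = \mathcal{N}(f_t;\, \mu_t, \sigma_t^2),
\qquad
P(y_t = 1 \mid \mathbf{y}, \mathcal{E})
  = \int \Phi\!\left(\frac{f_t}{\sigma_n}\right) \mathcal{N}(f_t;\, \mu_t, \sigma_t^2)\, df_t
  = \Phi\!\left(\frac{\mu_t}{\sqrt{\sigma_n^2 + \sigma_t^2}}\right)
```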

16 Experiments
Experimental setup:
– Kernel function: centralized kernel (a linear or Gaussian kernel shifted to the empirical mean)
– Noise levels: label noise = 10⁻⁴ (for RGP and GPC); edge noise = [5 : 0.05]
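
A small sketch of the centering step, assuming "shifted to the empirical mean" means the usual feature-space centering of the Gram matrix:

```python
import numpy as np

def center_kernel(K):
    """Center a Gram matrix in feature space: phi(x) -> phi(x) - mean_i phi(x_i)."""
    n = K.shape[0]
    one = np.ones((n, n)) / n
    return K - one @ K - K @ one + one @ K @ one

A = np.random.default_rng(2).normal(size=(4, 3))
K = A @ A.T                                # a linear-kernel Gram matrix
Kc = center_kernel(K)
print(np.allclose(Kc.sum(axis=0), 0.0))    # rows/columns of a centered Gram sum to 0
```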

17 Results
[Figure: 30 samples drawn from a Gaussian mixture with two components along the x-axis; two labeled samples are marked by a diamond and a circle. K = 3; the best value of 0.4 was chosen by approximate model evidence.]

18 Results
[Figure: the posterior covariance matrix of the RGP learnt from the data; it captures the density information of the unlabeled data. Using this posterior covariance matrix as the new prior, supervised learning is carried out; curves show the predictive distribution for each class.]

19 Real-world experiment
– Subset of the WebKB dataset: pages collected from the CS departments of 4 universities, with hyperlinks interconnecting them; pages are classified into 7 categories (e.g. student, course, other)
– Documents are preprocessed as vectors of input attributes
– Hyperlinks are translated into undirected positive linkages: two pages are likely to be positively correlated if they are hyperlinked by the same hub page; there are no negative linkages
– Compared with GPC and LapSVM (Sindhwani et al. 2005)

20 Results
Two classification tasks: student vs. non-student and other vs. non-other. 10% of the samples were randomly selected as labeled data; the selection was repeated 100 times; linear kernel. The tables show average AUC for predicting the labels of unlabeled cases.

Student or not:
Univ. | GPC         | LapSVM      | RGP
Corn. | 0.825±0.016 | 0.987±0.008 | 0.989±0.009
Texa. | 0.899±0.016 | 0.994±0.007 | 0.999±0.001
Wash. | 0.839±0.018 | 0.957±0.014 | 0.961±0.009
Wisc. | 0.883±0.013 | 0.976±0.029 | 0.992±0.008

Other or not:
Univ. | GPC         | LapSVM      | RGP
Corn. | 0.708±0.021 | 0.865±0.038 | 0.884±0.025
Texa. | 0.799±0.021 | 0.932±0.026 | 0.906±0.026
Wash. | 0.782±0.023 | 0.828±0.025 | 0.877±0.024
Wisc. | 0.839±0.014 | 0.812±0.030 | 0.899±0.015

21 Conclusion
A novel Bayesian framework to learn from relational data, based on GPs.
The RGP provides a data-dependent covariance function for supervised learning tasks (classification).
It was also applied to semi-supervised learning tasks.
RGP requires very few labels to generalize on unseen test points:
– unlabeled data is incorporated in model selection.

22 Discussion
The proposed framework can be extended to model:
– directed (asymmetric) relations as well as undirected relations
– multiple classes of relations
– graphs with weighted edges
The model should be compared to other models.
The results can be sensitive to the choice of K in KNN.

23 Thanks. Questions?

