
1 A Bayesian Matrix Factorization Model for Relational Data (UAI 2010) / Relational Learning via Collective Matrix Factorization (SIGKDD 2008). Authors: Ajit P. Singh & Geoffrey J. Gordon. Presenter: Xian Xing Zhang.

2 Basic ideas Collective matrix factorization (CMF) is proposed for relational learning when an entity participates in multiple relations. Several matrices (with different types of support) are factored simultaneously, with shared parameters tying the factorizations together; the joint objective is sketched below. CMF is then extended to a hierarchical Bayesian model to enhance the sharing of statistical strength.
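A minimal sketch of the joint objective (notation assumed from the SIGKDD 2008 paper; the weight alpha and link f are illustrative): with X ≈ f(U V^T) and Y ≈ f(V W^T) sharing the factor V, CMF minimizes a weighted sum of per-matrix losses,

    \min_{U,V,W} \; \alpha \, D_X\big(X, f(U V^\top)\big) + (1-\alpha) \, D_Y\big(Y, f(V W^\top)\big)

where D_X and D_Y are the divergences matching each matrix's likelihood (e.g., Bernoulli for X, Gaussian for Y), plus regularizers on U, V, W.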

3 An example of application Functional Magnetic Resonance Imaging (fMRI): – fMRI data can be viewed as a real-valued relation, Response(stimulus, voxel) ∈ [0, 1] – stimulus side-information is a binary relation, Co-occurs(word, stimulus) ∈ {0, 1}, collected as statistics of whether the stimulus word co-occurs with other commonly-used words in a large text corpus – The goal is to predict unobserved values of the Response relation

4 Basic model description In the fMRI example, the Co-occurs relation is an m×n matrix X; the Response relation is an n×r matrix Y. Each matrix has its own likelihood: Co-occurs (p_X) is modeled by a Bernoulli distribution, Response (p_Y) by a Gaussian.
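The likelihood formulas did not survive the transcript; a plausible reconstruction, assuming a logistic link sigma and factor matrices U (words, m×k), V (stimuli, n×k), W (voxels, r×k):

    p(X \mid U, V) = \prod_{i,j} \mathrm{Bernoulli}\big(x_{ij} \mid \sigma(u_i v_j^\top)\big)
    p(Y \mid V, W) = \prod_{j,k} \mathcal{N}\big(y_{jk} \mid v_j w_k^\top, \sigma_Y^2\big)

The stimulus factor V enters both likelihoods, which is what couples the two relations.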

5 Hierarchical Collective Matrix Factorization Information between entities can only be shared indirectly, through another factor: e.g., in f(UV'), two distinct rows of U are correlated only through V. The hierarchical prior acts as a shrinkage estimator for the rows of a factor, pooling information indirectly through the shared hyperparameters Θ (see the sketch below).
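A minimal sketch of the hierarchical prior, assuming the standard Gaussian / Normal-Inverse-Wishart form (the paper's exact choice of hyperprior may differ):

    u_i \sim \mathcal{N}(\mu_U, \Sigma_U), \quad i = 1, \dots, m, \qquad \Theta_U = (\mu_U, \Sigma_U) \sim \mathrm{NIW}(\mu_0, \kappa_0, \nu_0, \Lambda_0)

Because every row u_i is drawn from the same Θ_U, rows with few observations are shrunk toward the pooled mean μ_U, which is the shrinkage behavior described above.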

6 Bayesian Inference Hessian Metropolis-Hastings (HMH): – Random-walk Metropolis-Hastings samples from a Gaussian proposal centered at the current sample F_i(t) with a fixed covariance matrix, which must be tuned by hand and mixes poorly. – HMH instead uses both the gradient and the Hessian of the log-posterior to automatically construct a proposal distribution at each sampling step; a sketch follows. This is claimed as the main technical contribution of the UAI 2010 paper.
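A minimal, runnable sketch of one Hessian-preconditioned Metropolis-Hastings step on a toy log-posterior (the target, names, and Newton-style proposal are illustrative assumptions, not the paper's exact construction):

    import numpy as np

    A = np.array([[2.0, 0.5], [0.5, 1.0]])

    def log_post(x):
        # Toy log-posterior: anisotropic Gaussian, standing in for log p(F_i | rest).
        return -0.5 * x @ A @ x

    def grad(x):
        return -A @ x

    def hess(x):
        return -A  # constant Hessian for a Gaussian target

    def log_q(to, mean, cov):
        # Gaussian proposal log-density up to an additive constant.
        d = to - mean
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d @ np.linalg.solve(cov, d) + logdet)

    def hmh_step(x, rng):
        # Newton-style proposal: mean = x + (-H)^{-1} g, covariance = (-H)^{-1}.
        cov = np.linalg.inv(-hess(x))
        mean = x + cov @ grad(x)
        x_new = rng.multivariate_normal(mean, cov)
        # Reverse-move quantities for the MH correction (proposal is asymmetric).
        cov2 = np.linalg.inv(-hess(x_new))
        mean2 = x_new + cov2 @ grad(x_new)
        log_alpha = (log_post(x_new) - log_post(x)
                     + log_q(x, mean2, cov2) - log_q(x_new, mean, cov))
        return x_new if np.log(rng.uniform()) < log_alpha else x

    rng = np.random.default_rng(0)
    x = np.array([3.0, -2.0])
    for _ in range(1000):
        x = hmh_step(x, rng)

Because the proposal's covariance adapts to the local curvature, no step size needs hand-tuning, which is the advantage claimed over random-walk MH.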

7 Related work

8 Experiment setting The Co-occurs(word, stimulus) relation is collected by measuring whether or not the stimulus word occurs within five tokens of a word in the Google Tera-word corpus. Two prediction tasks are evaluated: – Hold-out prediction (predict held-out entries of Y) – Fold-in prediction (predict a new row in Y; a sketch of fold-in follows)
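A minimal sketch of fold-in for the Gaussian case under Y ≈ V W^T (the function name and ridge penalty lam are illustrative assumptions): estimate the new row's factor from its observed entries with W held fixed, then predict the remaining entries.

    import numpy as np

    def fold_in(y_new, observed, W, lam=0.1):
        """Estimate a factor for a new row of Y, then predict all its entries.

        y_new:    length-r vector, valid at the `observed` indices
        observed: boolean mask of length r
        W:        r x k factor matrix (e.g., voxels), held fixed
        """
        Wo = W[observed]                  # rows of W for the observed entries
        k = W.shape[1]
        # Ridge solution: v = (Wo'Wo + lam*I)^{-1} Wo' y_obs
        v = np.linalg.solve(Wo.T @ Wo + lam * np.eye(k), Wo.T @ y_new[observed])
        return W @ v                      # predictions for all r entries

    rng = np.random.default_rng(0)
    W = rng.normal(size=(50, 5))
    y = W @ rng.normal(size=5)            # synthetic new row with known factor
    mask = rng.uniform(size=50) < 0.6     # observe ~60% of the entries
    pred = fold_in(y, mask, W, lam=1e-8)
    print(np.allclose(pred[~mask], y[~mask], atol=1e-3))  # True on clean data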

9 Experiment results

10 Discussions Existing methods force one to choose between ignoring parameter uncertainty and making Gaussianity assumptions. Modeling non-Gaussian response types significantly improves predictive accuracy, even though non-Gaussianity complicates the construction of proposal distributions for Metropolis-Hastings.

