Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial.

Presentation transcript:

Author(s): Rahul Sami, 2009 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution Noncommercial Share Alike 3.0 License: We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material. Copyright holders of content included in this material should contact with any questions, corrections, or clarification regarding the use of content. For more information about how to cite these materials visit

Citation Key
for more information see:

Use + Share + Adapt
Make Your Own Assessment

Creative Commons – Attribution License
Creative Commons – Attribution Share Alike License
Creative Commons – Attribution Noncommercial License
Creative Commons – Attribution Noncommercial Share Alike License
GNU – Free Documentation License
Creative Commons – Zero Waiver
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (USC 17 § 102(b)) *laws in your jurisdiction may differ
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Government: Works that are produced by the U.S. Government. (USC 17 § 105)
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (USC 17 § 107) *laws in your jurisdiction may differ
Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair. To use this content you should do your own independent analysis to determine whether or not your use will be Fair.

{ Content the copyright holder, author, or law permits you to use, share and adapt. }
{ Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
{ Content Open.Michigan has used under a Fair Use determination. }

SCHOOL OF INFORMATION UNIVERSITY OF MICHIGAN si.umich.edu Lecture 9: Page Rank; Singular Value Decomposition SI583: Recommender Systems

Recap: PageRank
• Google’s big original idea [Brin & Page, 1998]
• Idea: ranking is based on a “random web surfer”:
  – start from any page at random
  – pick a random link from the page, and follow it
  – repeat!
  – ultimately, this process will converge to a stable distribution over pages (with some tricks...)
  – the most likely page in this stable distribution is ranked highest
• Strong points:
  – Pages linked to by many pages tend to be ranked higher (not always)
  – A link (“vote”) from a highly-ranked page carries more weight
  – Relatively hard to manipulate

Some Intuitions
[Figure: link graph over pages A, B, C, D]
• Will D’s Rank be more or less than ¼?
• Will C’s Rank be more or less than B’s?
• How will A’s Rank compare to D’s?

Third Iteration
• AR + E
• Normalized (divide by 1.18)
[Figure: rank vectors r before and after normalization; values not preserved]
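The update on this slide (compute AR + E, then divide by the total so the ranks sum to 1) can be sketched in a few lines. This is only an illustrative sketch: the four-page graph, the value 0.05 for each entry of E, and the function name `pagerank` are assumptions made for the example, since the graph and numbers from the slide were not preserved.

```python
def pagerank(links, e, iterations=50):
    """Repeat r <- normalize(A r + E) on a link graph.

    links: dict mapping each page to the list of pages it links to
    e:     dict giving every page's entry of the E vector
    """
    pages = sorted(links)
    rank = {p: 1.0 / len(pages) for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        new_rank = dict(e)  # the E term
        for page, outlinks in links.items():
            for target in outlinks:
                # the A r term: a page splits its rank evenly among its outlinks
                new_rank[target] += rank[page] / len(outlinks)
        total = sum(new_rank.values())
        rank = {p: v / total for p, v in new_rank.items()}  # normalize to sum to 1
    return rank


# Hypothetical graph (NOT the one from the slide): C is linked to by A, B and D;
# nothing links to D.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
e = {page: 0.05 for page in links}
ranks = pagerank(links, e)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph C ends up ranked highest and D lowest, which matches the intuition that pages with many incoming links tend to rank higher.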

Personalized PageRank
• Pick E to be some sites that I like
  – My bookmarks
  – Links from my home page
• Rank flows more from these initial links than from other pages
  – But much of it may still flow to the popular sites, and from them to others that are not part of my initial set
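A small sketch of the personalization idea: the same iteration, but with all of E concentrated on the pages I like. The graph, the `jump_mass` of 0.2, and the function name are hypothetical choices for illustration, not values from the lecture.

```python
def personalized_pagerank(links, favorites, jump_mass=0.2, iterations=100):
    """PageRank iteration where E is nonzero only on the `favorites` pages."""
    pages = sorted(links)
    # E: spread jump_mass evenly over the favorite pages, zero elsewhere
    e = {p: (jump_mass / len(favorites) if p in favorites else 0.0) for p in pages}
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = dict(e)
        for page, outlinks in links.items():
            for target in outlinks:
                new_rank[target] += rank[page] / len(outlinks)
        total = sum(new_rank.values())
        rank = {p: v / total for p, v in new_rank.items()}
    return rank


# Hypothetical graph; D is an obscure page that nothing links to.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
everyone = personalized_pagerank(links, favorites={"A", "B", "C", "D"})
mine = personalized_pagerank(links, favorites={"D"})  # D is my only bookmark
print("D, uniform E:     ", round(everyone["D"], 3))
print("D, personalized E:", round(mine["D"], 3))
print("top page for me:  ", max(mine, key=mine.get))
```

Bookmarking D raises D's rank, yet the popular page C still comes out on top, which is the effect described in the last bullet.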

Other applications for PageRank?

Another method: Singular Value Decomposition (SVD)
• Back to the product recommendation setting
• SVD-based collaborative filtering is often used in place of User-User / Item-Item
• Two different advantages:
  – Accuracy benefits: identifies “latent features” of items that are useful for predictions
  – Scalability: easier to compute when ratings are sparse
• Related terms: Principal Component Analysis, Latent Semantic Indexing

Motivating SVD
• Consider the following scenario
  – Joe rates items A, B, C, D; likes A and C, dislikes B and D
  – Sue rates items C, D, E, F; likes C and E, dislikes D and F
  – John rates items E, F, G, H; likes E and G, dislikes F and H
• Will Joe like item G?
  – user-user fails because Joe and John have no common ratings
  – item-item fails
  – intuitively, one can argue that Joe is likely to like G
• Idea: capture the intuition in a CF algorithm
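To see concretely why both neighborhood methods come up empty here, the scenario can be written down as data. Encoding a like as +1 and a dislike as -1 is an assumption made for this sketch, as are the helper names.

```python
# Likes are encoded as +1 and dislikes as -1 (an assumed encoding).
ratings = {
    "Joe":  {"A": 1, "B": -1, "C": 1, "D": -1},
    "Sue":  {"C": 1, "D": -1, "E": 1, "F": -1},
    "John": {"E": 1, "F": -1, "G": 1, "H": -1},
}


def co_rated(user_a, user_b):
    """Items both users have rated: the overlap that user-user similarity needs."""
    return set(ratings[user_a]) & set(ratings[user_b])


def raters(item):
    """Users who have rated the item."""
    return {user for user, items in ratings.items() if item in items}


# User-user: John is the only user who rated G, but he shares no items with Joe.
joe_john_overlap = co_rated("Joe", "John")

# Item-item: comparing G with an item Joe rated needs a user who rated both.
g_overlap = {item: raters(item) & raters("G") for item in ratings["Joe"]}

print(joe_john_overlap)          # set()
print(g_overlap)                 # every value is an empty set
print(co_rated("Joe", "Sue"))    # Joe and Sue do overlap (on C and D)
print(co_rated("Sue", "John"))   # Sue and John do overlap (on E and F)
```

The only connection between Joe and G runs through Sue (Joe to Sue via C and D, Sue to John via E and F), which is exactly the indirect evidence the neighborhood methods cannot use.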

Motivating SVD (continued)
• One intuitive explanation for why Joe might like G:
  – A, C, E, G have some common “feature”, which is why users who like one like the others
  – e.g., A, C, E, G may be funny movies; Joe, Sue, and John all like funny movies
• Generalize this idea to multiple features
• Important features have to be automatically discovered from ratings
  – or from a hybrid of content and collaborative filtering

Software modules: User-User
[Diagram components: UI (visit site, reco. items); Ratings DB; Pearson computation; Similarities; sort, normalize; Indexed DB; Reco. generation. Clip art: Clker.com]

Software modules
[Diagram components: UI (visit site, reco. items); Ratings DB; Learn features; Feature weights; sort, normalize; Indexed DB; Reco. generation. Clip art: Clker.com]

SVD Conceptual Model
• Fit previous data to a model with k features
• Weights v_{A,f1}, etc. indicate the extent to which item A has features f1, f2
• Weights u_{Joe,f1}, etc. indicate the extent to which Joe likes features f1, f2
• Predict Joe’s preference for X from the fitted weights
[Figure: users (Joe, Sue) connected to latent features (f1, f2) by weights such as u_{Joe,f1}; latent features connected to items (A, B, X) by weights such as v_{A,f1}]
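The slide leaves the prediction step to the diagram. One natural reading, and the one consistent with the factorization X = USV used later in the lecture, is that Joe's predicted (normalized) preference for X adds up, over the k features, how much Joe likes each feature times how much X has it. The hat notation below is introduced here for illustration only.

```latex
\hat{x}_{\text{Joe},X} \;=\; \sum_{f=1}^{k} u_{\text{Joe},f}\, v_{X,f}
```

For example, with k = 2, weights u = (1.0, -0.5) for Joe and v = (0.8, 0.2) for X would give 1.0 · 0.8 + (-0.5) · 0.2 = 0.7; these numbers are made up purely to show the arithmetic.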

Learning the weights: SVD
• Start with the mean-normalized rating matrix $X$
• SVD decomposition: calculate $U, S, V$ such that
  – $U$: $m \times k$, $S$: $k \times k$, $V$: $k \times n$
  – $X = USV$
  – $S$ is a diagonal matrix (zero off the diagonal)
  – $U, V$ are “orthogonal” => features are independent
• $S$ indicates the “intensity” of each feature
  – $S_{ii}$: singular value of feature $i$
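As a quick check of these properties, the decomposition can be computed with NumPy's `np.linalg.svd`. The ratings matrix below is hypothetical and fully observed, which sidesteps the question of missing ratings. Note that NumPy returns the third factor already in the k × n orientation the slide calls V.

```python
import numpy as np

# Hypothetical ratings: 4 users (rows) x 5 items (columns), fully observed.
ratings = np.array([
    [5.0, 1.0, 4.0, 2.0, 5.0],
    [4.0, 2.0, 5.0, 1.0, 4.0],
    [1.0, 5.0, 2.0, 4.0, 1.0],
    [2.0, 4.0, 1.0, 5.0, 3.0],
])

# Mean-normalize: subtract each user's mean rating from that user's row.
X = ratings - ratings.mean(axis=1, keepdims=True)

# Thin SVD: U is m x k, s holds the k singular values, Vt is k x n, k = min(m, n).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
S = np.diag(s)  # diagonal matrix of singular values, largest first

print(U.shape, S.shape, Vt.shape)         # (4, 4) (4, 4) (4, 5)
print(np.allclose(U @ S @ Vt, X))         # True: X = U S V
print(np.allclose(U.T @ U, np.eye(4)))    # True: columns of U are orthonormal
print(np.allclose(Vt @ Vt.T, np.eye(4)))  # True: rows of V are orthonormal
```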

Fitting the weights: SVD
• Model weights from SVD $(U, S, V)$:
  – Weight (item $j$, feature $f$) $= \sqrt{s_{ff}}\, V_{fj}$
  – Weight (user $i$, feature $f$) $= \sqrt{s_{ff}}\, U_{if}$
• Alternative: get a software package to calculate the weights directly
[Figure: users (Joe, Sue) connected to latent features (f1, f2) by weights such as u_{Joe,f1}; latent features connected to items (A, B, X) by weights such as v_{A,f1}]
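Splitting each singular value into two square roots, one for the user side and one for the item side, means that a user's weights dotted with an item's weights reproduce the corresponding entry of USV. A minimal sketch, assuming a small made-up mean-normalized matrix:

```python
import numpy as np

# Hypothetical mean-normalized ratings: rows = users, columns = items.
X = np.array([
    [ 2.0, -2.0,  1.0, -1.0],
    [ 1.0, -1.0,  2.0, -2.0],
    [-2.0,  2.0, -1.0,  1.0],
])

U, s, Vt = np.linalg.svd(X, full_matrices=False)
root_s = np.sqrt(s)

user_weights = U * root_s            # entry (i, f) = sqrt(s_ff) * U_if
item_weights = root_s[:, None] * Vt  # entry (f, j) = sqrt(s_ff) * V_fj

# Predicted normalized rating of item j by user i: dot product over features.
i, j = 0, 2
prediction = user_weights[i] @ item_weights[:, j]
print(np.isclose(prediction, X[i, j]))              # True when all features are kept
print(np.allclose(user_weights @ item_weights, X))  # True: the weights rebuild X
```

With every feature kept the model just reproduces the data; predictions become interesting once only the strongest k features are retained.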

SVD: selecting features
• More features => better fit possible
  – but also more noise in weights
  – and harder to compute (matrices are larger)
• In practice, do the best fit with a small number of features (10, say)
• Which features are picked?
  – Those with the highest singular value (intensity)
  – Small singular value => feature has negligible effect on predictions
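The claim that small singular values barely matter can be checked numerically: when only the top k features are kept, the fit error (Frobenius norm) equals the square root of the sum of the squared singular values that were dropped. The sketch below uses synthetic data built from two hidden features plus a little noise; the sizes, seed, and noise level are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "ratings": 20 users x 15 items driven by 2 hidden features plus noise.
X = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15)) + 0.1 * rng.normal(size=(20, 15))

U, s, Vt = np.linalg.svd(X, full_matrices=False)


def rank_k_fit(k):
    """Best fit using only the k features with the largest singular values."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


for k in range(1, 6):
    error = np.linalg.norm(X - rank_k_fit(k))  # Frobenius norm of the residual
    dropped = np.sqrt(np.sum(s[k:] ** 2))      # what the discarded features carried
    print(k, round(float(error), 4), round(float(dropped), 4))
```

The two printed columns agree for every k, and the error can only shrink as features are added, so the choice of k is about noise and cost rather than fit alone.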

SVD-based CF: Summary
• Pick a number of features k
• Normalize ratings
• Use SVD to find the best fit with k features
• Use the fitted model to predict the value of Joe’s normalized rating for item X
• Denormalize (add Joe’s mean) to predict Joe’s rating for X
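The five steps can be strung together on the Joe, Sue, and John scenario from earlier in the lecture. Several choices below are assumptions rather than anything the slides specify: a like is a 5 and a dislike is a 1, unrated entries are filled with 0 after mean-normalization (that is, with the user's own mean), and k = 1 because the scenario has a single shared feature. The function name `svd_predict` is made up for this sketch.

```python
import numpy as np


def svd_predict(ratings, user, item, k):
    """Predict one rating with a rank-k SVD fit.

    ratings: 2-D float array, np.nan where the user has not rated the item.
    """
    observed = ~np.isnan(ratings)
    user_means = np.nanmean(ratings, axis=1, keepdims=True)  # mean over rated items
    # Normalize; unrated entries become 0, i.e. "no deviation from the mean" (assumption).
    X = np.where(observed, ratings - user_means, 0.0)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]                     # best fit with k features
    return float(user_means[user, 0] + X_k[user, item])     # denormalize


nan = np.nan
#                   A    B    C    D    E    F    G    H
ratings = np.array([
    [5,   1,   5,   1,   nan, nan, nan, nan],   # Joe
    [nan, nan, 5,   1,   5,   1,   nan, nan],   # Sue
    [nan, nan, nan, nan, 5,   1,   5,   1],     # John
])

joe, item_g, item_h = 0, 6, 7
print(svd_predict(ratings, joe, item_g, k=1))  # about 3.5, above Joe's mean of 3
print(svd_predict(ratings, joe, item_h, k=1))  # about 2.5, below Joe's mean of 3
```

With one feature the model places G above Joe's average and H below it, matching the intuition that user-user and item-item could not express. The result is sensitive to k in a dataset this tiny, so it should be read as an illustration rather than evidence about real data.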

SVD Practicalities
• SVD is a common mathematical operation; numerous libraries exist
• There are efficient algorithms to compute SVD for the typical case of sparse ratings
• A fast, simple implementation of an SVD-based recommender (by Simon Funk/Brandyn Webb) was shown to do very well on the Netflix challenge
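One way such a simple, sparse-friendly recommender can work is to skip the full decomposition and fit the user and item feature weights directly by stochastic gradient descent over the observed ratings only. The sketch below follows that general idea; it is not the Funk/Webb code, and the learning rate, regularization, epoch count, and all names are assumptions chosen for illustration.

```python
import random


def train_factors(observed, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Fit user and item feature vectors by stochastic gradient descent.

    observed: list of (user_index, item_index, normalized_rating) triples;
    unrated pairs are simply absent, so sparse data costs nothing extra.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]  # user weights
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]  # item weights
    for _ in range(epochs):
        for u, i, r in observed:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # gradient step on squared error, with a small shrinkage term
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(observed, P, Q):
    """Root-mean-square error of the fitted model on the observed ratings."""
    sq = [(r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in observed]
    return (sum(sq) / len(sq)) ** 0.5


# Joe (0), Sue (1), John (2) on items A..H (0..7); +2 = like, -2 = dislike (assumed scale).
observed = [
    (0, 0, 2), (0, 1, -2), (0, 2, 2), (0, 3, -2),
    (1, 2, 2), (1, 3, -2), (1, 4, 2), (1, 5, -2),
    (2, 4, 2), (2, 5, -2), (2, 6, 2), (2, 7, -2),
]

P0, Q0 = train_factors(observed, n_users=3, n_items=8, epochs=0)  # untrained baseline
P, Q = train_factors(observed, n_users=3, n_items=8)
print("RMSE before training:", rmse(observed, P0, Q0))
print("RMSE after training: ", rmse(observed, P, Q))
```

Each update touches only one user vector and one item vector, so the cost per pass grows with the number of ratings rather than with the size of the full user-by-item matrix.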

SVD and Content Filtering
• Similar idea: Latent Semantic Indexing, used in content filtering
  – Fit item descriptions and keywords by a set of features
  – Related words map onto the same feature
  – Similar items have similar feature vectors
• Useful to combine content + collaborative filtering
  – Learn some features from content, some from ratings
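A toy illustration of the Latent Semantic Indexing idea, assuming a tiny made-up vocabulary and item descriptions: two items that share no keyword at all ("funny" versus "comedy") end up with matching feature vectors, because other descriptions use the two words together.

```python
import numpy as np

terms = ["funny", "comedy", "scary", "horror"]
# Hypothetical item descriptions as keyword counts (rows = items, columns = terms).
docs = np.array([
    [1, 1, 0, 0],  # item 0: "funny comedy"
    [1, 1, 0, 0],  # item 1: "funny comedy"
    [1, 0, 0, 0],  # item 2: "funny"
    [0, 1, 0, 0],  # item 3: "comedy"
    [0, 0, 1, 1],  # item 4: "scary horror"
    [0, 0, 1, 0],  # item 5: "scary"
    [0, 0, 0, 1],  # item 6: "horror"
], dtype=float)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


k = 2
U, s, Vt = np.linalg.svd(docs, full_matrices=False)
item_features = U[:, :k] * s[:k]  # each item as a k-dimensional feature vector
term_features = Vt[:k, :].T       # each term as a k-dimensional feature vector

print("raw keywords,    item 2 vs 3:", cosine(docs[2], docs[3]))                    # 0.0
print("latent features, item 2 vs 3:", cosine(item_features[2], item_features[3]))  # close to 1
print("latent features, item 2 vs 5:", cosine(item_features[2], item_features[5]))  # close to 0
print("terms funny vs comedy:       ", cosine(term_features[0], term_features[1]))  # close to 1
```

The same item feature vectors could sit alongside features learned from ratings, which is one way to read the suggestion to combine content and collaborative filtering.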