Experimental Study on Item-based P-Tree Collaborative Filtering for Netflix Prize.


1 Experimental Study on Item-based P-Tree Collaborative Filtering for Netflix Prize

2 Agenda
- Introduction to recommendation systems and Collaborative Filtering
- Item-based P-Tree CF algorithm
- P-Tree
- Similarity measurements
- Experimental results
- Conclusion and future work

3 Recommendation System
- A recommendation system identifies the items most likely to interest, or be purchased by, a user
- More online retailers are recognizing the importance of recommendation systems

4 Collaborative Filtering
- Collaborative Filtering (CF) algorithms are widely used in recommendation systems
- User-based CF is limited by its computational complexity
- Item-based CF has fewer scalability concerns

5 Item-based P-Tree CF
// Convert the data into P-Tree structure
PTree.load_binary();
// Build the movie-based similarity matrix
for i in I {
  for j in I {
    sim[i,j] = similarity(i, j)   // any of the item-based similarity measures
  }
}
// Keep only the top K nearest neighbors of item i
for j in I {
  if sim[i,j] is not among the top K largest values
    sim[i,j] = 0.0
}
// Predict the rating of item i by user u
sum = 0.0, weight = 0.0;
for each j among the top K neighbors of i {
  sum += r[u,j] * sim[i,j]
  weight += sim[i,j]
}
pred = sum/weight
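The top-K selection and weighted-average prediction steps of the pseudocode can be sketched in plain Python. This is a minimal illustration, not the authors' P-Tree implementation: the names `top_k_neighbors` and `predict`, the dictionary layout of `sim`, and the choice to skip neighbors the user has not rated are all assumptions made for the example.

```python
def top_k_neighbors(sim, i, k):
    """Return the k items most similar to item i as (item, similarity) pairs.

    `sim` is assumed to be a dict of dicts: sim[i][j] is the similarity
    between items i and j (hypothetical layout, not from the slides).
    """
    others = [(j, s) for j, s in sim[i].items() if j != i]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return others[:k]


def predict(ratings_u, sim, i, k):
    """Predict one user's rating of item i.

    The prediction is the similarity-weighted average of the user's ratings
    over the top-k neighbors of i. Neighbors the user has not rated are
    skipped (an assumption; the slide does not say how they are handled).
    Returns None when no rated neighbor carries any weight.
    """
    total, weight = 0.0, 0.0
    for j, s in top_k_neighbors(sim, i, k):
        if j in ratings_u:
            total += ratings_u[j] * s
            weight += s
    return total / weight if weight else None


# Toy data: similarities of item "A" to three other items,
# and one user's ratings of those items.
sim = {"A": {"B": 0.9, "C": 0.5, "D": 0.1}}
ratings_u = {"B": 4.0, "C": 2.0, "D": 5.0}

# With K=2 only "B" and "C" contribute: (4.0*0.9 + 2.0*0.5) / (0.9 + 0.5)
pred = predict(ratings_u, sim, "A", 2)
print(round(pred, 4))
```

With K=2 the low-similarity item "D" is excluded, so its rating of 5.0 has no effect on the prediction.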

6 P-Tree
- A P-Tree is a lossless, compressed, data-mining-ready vertical data structure
- P-Trees are used for fast computation of counts and for masking specific phenomena
- Data is first converted to P-Trees

7 P-Tree API
size()          Get size of PTree
get_count()     Get bit count of PTree
setbit()        Set a single bit of PTree
reset()         Clear the bits of PTree
&               AND operation of PTree
|               OR operation of PTree
~               NOT operation of PTree
dump()          Print the binary representation of PTree
load_binary()   Load the binary representation of PTree
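To make the operations listed above concrete, here is a small Python stand-in that mirrors part of the API on an uncompressed bit vector. It is only a sketch: a real P-Tree is a compressed tree structure, and the class name `BitVector`, the constructor arguments, and the decision to have `dump()` return a string rather than print it are assumptions for illustration. `load_binary()` is omitted because the slides do not describe the file format.

```python
class BitVector:
    """Uncompressed stand-in for a P-Tree: a fixed-length vector of bits."""

    def __init__(self, size, bits=0):
        self._size = size
        self._mask = (1 << size) - 1   # keeps results within `size` bits
        self._bits = bits & self._mask

    def size(self):
        """Number of bit positions in the vector."""
        return self._size

    def get_count(self):
        """Number of bits set to 1."""
        return bin(self._bits).count("1")

    def setbit(self, pos):
        """Set the bit at position `pos` to 1."""
        if not 0 <= pos < self._size:
            raise IndexError("bit position out of range")
        self._bits |= 1 << pos

    def reset(self):
        """Clear every bit."""
        self._bits = 0

    def __and__(self, other):
        return BitVector(self._size, self._bits & other._bits)

    def __or__(self, other):
        return BitVector(self._size, self._bits | other._bits)

    def __invert__(self):
        return BitVector(self._size, ~self._bits & self._mask)

    def dump(self):
        """Return the bits as a string, position 0 first."""
        return "".join(
            "1" if (self._bits >> p) & 1 else "0" for p in range(self._size)
        )


# Toy example: one bit per user, set when that user rated the movie.
rated_a = BitVector(8)
for user in (0, 2, 5):
    rated_a.setbit(user)

rated_b = BitVector(8)
for user in (2, 3, 5, 7):
    rated_b.setbit(user)

# Users who rated both movies: one AND followed by a bit count.
both = rated_a & rated_b
print(both.dump(), both.get_count())
```

Counting the users two movies have in common reduces to one AND and one bit count, which is the kind of count an item-to-item similarity computation needs repeatedly.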

8 Item-Based Similarity (I)
- Cosine-based
- Pearson correlation

9 Item-Based Similarity (II)
- Adjusted Cosine
- Binary-based
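The formulas from the two similarity slides did not survive in this transcript. As a reference point, the sketch below implements the commonly used textbook definitions of cosine, Pearson, and adjusted cosine similarity between two items; whether the presentation used exactly these forms is an assumption. The function names and the `ratings[user][item]` layout are hypothetical, and each measure is computed over users who rated both items. The binary-based measure is left out because its definition cannot be recovered from the slides.

```python
from math import sqrt


def co_raters(ratings, i, j):
    """Users who rated both item i and item j."""
    return [u for u, r in ratings.items() if i in r and j in r]


def cosine(ratings, i, j):
    """Cosine of the angle between the two items' raw rating vectors."""
    users = co_raters(ratings, i, j)
    num = sum(ratings[u][i] * ratings[u][j] for u in users)
    den = sqrt(sum(ratings[u][i] ** 2 for u in users)) * sqrt(
        sum(ratings[u][j] ** 2 for u in users)
    )
    return num / den if den else 0.0


def pearson(ratings, i, j):
    """Pearson correlation: ratings are centered on each ITEM's mean."""
    users = co_raters(ratings, i, j)
    if not users:
        return 0.0
    mean_i = sum(ratings[u][i] for u in users) / len(users)
    mean_j = sum(ratings[u][j] for u in users) / len(users)
    di = [ratings[u][i] - mean_i for u in users]
    dj = [ratings[u][j] - mean_j for u in users]
    den = sqrt(sum(x * x for x in di)) * sqrt(sum(y * y for y in dj))
    return sum(x * y for x, y in zip(di, dj)) / den if den else 0.0


def adjusted_cosine(ratings, i, j):
    """Adjusted cosine: ratings are centered on each USER's mean rating."""
    users = co_raters(ratings, i, j)
    means = {u: sum(ratings[u].values()) / len(ratings[u]) for u in users}
    di = [ratings[u][i] - means[u] for u in users]
    dj = [ratings[u][j] - means[u] for u in users]
    den = sqrt(sum(x * x for x in di)) * sqrt(sum(y * y for y in dj))
    return sum(x * y for x, y in zip(di, dj)) / den if den else 0.0


# Toy ratings: three users, three items.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 3, "B": 2, "C": 5},
    "u3": {"A": 4, "B": 3, "C": 3},
}

for name, fn in [("cosine", cosine), ("pearson", pearson),
                 ("adjusted cosine", adjusted_cosine)]:
    print(name, round(fn(ratings, "A", "B"), 4), round(fn(ratings, "A", "C"), 4))
```

The difference between the last two measures is the centering: Pearson subtracts the item's mean, while adjusted cosine subtracts the user's mean, which compensates for users who rate everything high or low.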

10 Experimental Results
        Cosine    Pearson   Adj. Cos  Binary
K=10    1.01906   1.01736   0.96452   1.05802
K=20    1.01775   1.01483   0.95144   1.03562
K=30    1.02530   1.02182   0.94408   1.03055
K=40    1.02766   1.02964   0.94549   1.02882
K=50    1.03251   1.02863   0.94466   1.02959
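The table can be read more easily by finding, for each similarity measure, the neighborhood size with the smallest value. The short script below does that, assuming the figures are prediction errors where lower is better (the slide does not name the metric, although the conclusion's wording points that way). The numbers are copied from the table; the names `results` and `best_k` are illustrative.

```python
# Values copied from the results table: results[K][measure]
results = {
    10: {"Cosine": 1.01906, "Pearson": 1.01736, "Adj. Cos": 0.96452, "Binary": 1.05802},
    20: {"Cosine": 1.01775, "Pearson": 1.01483, "Adj. Cos": 0.95144, "Binary": 1.03562},
    30: {"Cosine": 1.02530, "Pearson": 1.02182, "Adj. Cos": 0.94408, "Binary": 1.03055},
    40: {"Cosine": 1.02766, "Pearson": 1.02964, "Adj. Cos": 0.94549, "Binary": 1.02882},
    50: {"Cosine": 1.03251, "Pearson": 1.02863, "Adj. Cos": 0.94466, "Binary": 1.02959},
}


def best_k(results, measure):
    """Neighborhood size K with the lowest value for the given measure."""
    return min(results, key=lambda k: results[k][measure])


for measure in ("Cosine", "Pearson", "Adj. Cos", "Binary"):
    k = best_k(results, measure)
    print(f"{measure}: best K = {k} ({results[k][measure]:.5f})")
```

On these figures the minimum falls at K=20 for Cosine and Pearson, K=30 for Adjusted Cosine, and K=40 for Binary, and Adjusted Cosine has the lowest value at every K.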

11 Neighborhood Size

12 Similarity Algorithm

13 Conclusion
- Experiments were run with the Cosine, Pearson, Adjusted Cosine and Binary similarity algorithms
- The results show that Adjusted Cosine similarity gives more accurate predictions than the other algorithms
- The optimal neighborhood size ranges from 20 to 30

14 Future Work
- Variant forms of the similarity algorithms were not included
- Experiments were run on 50 randomly selected movies
- Statistical confidence would improve if the experiments were run on more movies

15 Questions and Comments?

