The Matrix: Using Intermediate Features to Classify and Predict Friends in a Social Network Michael Matczynski 6.338 Status Report April 14, 2006.


1 The Matrix: Using Intermediate Features to Classify and Predict Friends in a Social Network Michael Matczynski 6.338 Status Report April 14, 2006

2 Vision In order to successfully classify users in a social network such as facebook.com, we should leverage intermediate features.

3 Steps 1. Gather profile, friend, and group data from all MIT users on facebook.com 2. Build graph 3. Develop PageRank algorithm to determine profile popularity 4. Generate intermediate features from profiles 5. Develop algorithm to identify similarities between all users 6. Develop online interface for users

4 1. Gather Data Gathered data from 11,744 MIT profiles: profile data (major, living group, etc.) and friend information (to build the graph).

5 2. Build Graph Due to privacy settings, not all friend information is available. Nonetheless, because a friendship link is undirected, the friends of users with strict privacy settings can mostly be deduced.
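The deduction described on this slide can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `build_undirected_graph`, the input layout, and the toy user ids are all assumptions made for the example.

```python
from collections import defaultdict


def build_undirected_graph(visible_friends):
    """Return a symmetric adjacency map from partially visible friend lists.

    visible_friends maps a user id to the friend ids shown on that user's
    profile. A user with strict privacy settings maps to an empty list
    (hypothetical input format, assumed for this sketch).
    """
    graph = defaultdict(set)
    for user, friends in visible_friends.items():
        graph[user]  # touch the key so every known user appears as a node
        for friend in friends:
            # A friendship link is undirected, so record it in both directions.
            graph[user].add(friend)
            graph[friend].add(user)
    return dict(graph)


# Hypothetical toy data: "carol" hides her friend list.
visible = {"alice": ["bob", "carol"], "bob": ["alice", "carol"], "carol": []}
graph = build_undirected_graph(visible)
print(sorted(graph["carol"]))
```

Friendships between two users who both hide their lists stay invisible, which is why the slide says the friends can "mostly" be deduced.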

6 3. PageRank Algorithm Google’s PageRank algorithm determines important nodes of a graph by using each link as a vote for that particular node. Run time: <1 sec / iteration; PageRank converges within 20 iterations. Results: Due to the undirected nature of social networks, PageRank is highly correlated with number of friends → not that useful.
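A small power-iteration sketch makes the result on this slide easier to see. The damping factor of 0.85, the function name `pagerank`, and the toy graph are assumptions for illustration; the slides do not say which variant or parameters were used.

```python
def pagerank(graph, damping=0.85, iterations=20):
    """Power-iteration PageRank on an adjacency map {node: set(neighbors)}.

    Every neighbor must also appear as a key of `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            neighbors = graph[node]
            if neighbors:
                # Each link acts as a vote: split this node's rank evenly.
                share = damping * rank[node] / len(neighbors)
                for neighbor in neighbors:
                    new_rank[neighbor] += share
            else:
                # A node with no links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical undirected friendship graph: degrees are a=3, b=2, c=2, d=1.
friends = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
ranks = pagerank(friends)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, len(friends[node]), round(ranks[node], 3))
```

On this toy graph the PageRank ordering follows the friend counts, which is the correlation the slide reports. Without damping, a random walk on a connected undirected graph has a stationary distribution exactly proportional to node degree, so the ranking adds little beyond counting friends.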

7 4. Generate Intermediate Features from Profiles
Classical Musicians → +Music
John Williams Fans → +Music
College Democrats → +Politics, +Liberal
College Republicans → +Politics, -Liberal
Napolean Dynamite is a Retarded Movie → +Movies, -Napolean
Pirates of the Caribbean → +Movies, +Pirates
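One way to read this slide is as a lookup from group membership to signed features. The sketch below encodes the six groups shown; summing the signs per feature across a user's groups is an assumption made here, since the slides do not say how features are combined, and the names `GROUP_FEATURES` and `intermediate_features` are hypothetical.

```python
# Group names and signed features as they appear on the slide.
GROUP_FEATURES = {
    "Classical Musicians": {"Music": +1},
    "John Williams Fans": {"Music": +1},
    "College Democrats": {"Politics": +1, "Liberal": +1},
    "College Republicans": {"Politics": +1, "Liberal": -1},
    "Napolean Dynamite is a Retarded Movie": {"Movies": +1, "Napolean": -1},
    "Pirates of the Caribbean": {"Movies": +1, "Pirates": +1},
}


def intermediate_features(groups):
    """Sum the signed features of every known group a user belongs to."""
    totals = {}
    for group in groups:
        # Groups without a mapping contribute nothing.
        for feature, sign in GROUP_FEATURES.get(group, {}).items():
            totals[feature] = totals.get(feature, 0) + sign
    return totals


# Hypothetical user belonging to three of the groups.
profile = intermediate_features(
    ["Classical Musicians", "John Williams Fans", "College Republicans"]
)
print(profile)
```

Two users in different music groups end up sharing a `Music` feature even though they have no group in common, which is the point of introducing an intermediate layer.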

8 5. Identify Similar Users Modified PageRank Algorithm: one network for each attribute (e.g. Music); the resulting PageRank would indicate clusters of similar interest. Neural Networks: train a neural network with known friends and learn about similarities / classifications.
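The slide does not say how the per-attribute networks would be built. One possible construction, shown purely as an assumption, keeps only the friendship links between users who both have a positive value for the attribute; the function name `attribute_subgraph` and the toy data are hypothetical.

```python
def attribute_subgraph(graph, features, attribute):
    """Restrict a friendship graph to users with a positive attribute value.

    graph:    {user: set(friends)}
    features: {user: {feature_name: signed_value}}
    """
    members = {
        user for user, values in features.items()
        if values.get(attribute, 0) > 0
    }
    # Keep only the edges whose two endpoints are both members.
    return {
        user: {friend for friend in graph.get(user, ()) if friend in members}
        for user in members
    }


# Hypothetical toy data.
friendships = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
}
user_features = {
    "alice": {"Music": 1},
    "bob": {"Music": 2},
    "carol": {"Politics": 1},
}
music_network = attribute_subgraph(friendships, user_features, "Music")
print(music_network)
```

A ranking algorithm could then be run separately on each such network, one per attribute.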

9 6. Online Interface If interesting results emerge, develop an online interface so members of the MIT community can learn about themselves.

10 Next Steps Generate intermediate features. Determine classification algorithm. Parallel computation.

