Social Networks, CompSci 49s, 11/16/2006: Social Networks as a Foundation for Computer Science, Jeffrey Forbes

Presentation transcript:

1 Social Networks, CompSci 49s, 11/16/2006
Social Networks as a Foundation for Computer Science
Jeffrey Forbes
http://www.cs.duke.edu/csed/socialnet

2 A Future for Computer Science?

3 Is there a Science of Networks?
• From Erdős numbers to random graphs to the Internet
  - From FOAF to Selfish Routing: apparent similarities between many human and technological systems & organization
  - Modeling, simulation, and hypotheses
  - Compelling concepts
      · Metaphor of viral spread
      · Properties of connectivity have qualitative and quantitative effects
  - Computer Science?
• From the facebook to tomogravity
  - How do we model networks, measure them, and reason about them?
  - What mathematics is necessary?
  - Will the real world intrude?

4 Physical Networks
• The Internet
  - Vertices: routers
  - Edges: physical connections
• Another layer of abstraction
  - Vertices: autonomous systems
  - Edges: peering agreements
  - Both a physical and business network
• Other examples
  - US Power Grid
  - Interdependence and August 2003 blackout

5 What does the Internet look like?

6 US Power Grid

7 Business & Economic Networks
• Example: eBay bidding
  - vertices: eBay users
  - links: represent bidder-seller or buyer-seller
  - fraud detection: bidding rings
• Example: corporate boards
  - vertices: corporations
  - links: between companies that share a board member
• Example: corporate partnerships
  - vertices: corporations
  - links: represent formal joint ventures
• Example: goods exchange networks
  - vertices: buyers and sellers of commodities
  - links: represent “permissible” transactions

8 Content Networks
• Example: Document similarity
  - Vertices: documents on web
  - Edges: weights defined by similarity
  - See TouchGraph GoogleBrowser
• Conceptual network: thesaurus
  - Vertices: words
  - Edges: synonym relationships

9 Enron

10 Social networks
• Example: Acquaintanceship networks
  - vertices: people in the world
  - links: have met in person and know last names
  - hard to measure
• Example: scientific collaboration
  - vertices: math and computer science researchers
  - links: between coauthors on a published paper
  - Erdős numbers: distance to Paul Erdős
  - Erdős was definitely a hub or connector; had 507 coauthors
• How do we navigate in such networks?
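An Erdős number is a shortest-path distance in the coauthorship graph, so breadth-first search computes it directly. The following is a minimal sketch, assuming the graph fits in memory as an adjacency dictionary; the function name and the toy graph are hypothetical placeholders, not real collaboration data.

```python
from collections import deque

def collaboration_distances(coauthors, source):
    """Breadth-first search: hop count from source to every reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other not in dist:          # first visit is the shortest path
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist

# Hypothetical toy coauthorship graph (placeholder names, not real data).
toy_graph = {
    "Erdos": ["A", "B"],
    "A": ["Erdos", "C"],
    "B": ["Erdos"],
    "C": ["A", "D"],
    "D": ["C"],
    "E": [],                               # no coauthors: unreachable
}
erdos_numbers = collaboration_distances(toy_graph, "Erdos")
```

Authors who cannot be reached from the source are simply absent from the result, which corresponds to an undefined (infinite) Erdős number.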

11 [slide contains no text]

12 Acquaintanceship & more

13 Network Models (Barabási)
• Differences between Internet, Kazaa, Chord
  - Building, modeling, predicting
• Static networks, dynamic networks
  - Modeling and simulation
• Random and scale-free
  - Implications?
• Structure and evolution
  - Modeling via TouchGraph

14 Web-based social networks (http://trust.mindswap.org)
• Myspace: 73,000,000
• Passion.com: 23,000,000
• Friendster: 21,000,000
• Black Planet: 17,000,000
• Facebook: 8,000,000
• Who’s using these, what are they doing, how often are they doing it, why are they doing it?

15 Golbeck’s Criteria
• Accessible over the web via a browser
• Users explicitly state relationships
  - Not mined or inferred
• Relationships visible and browsable by others
  - Reasons?
• Support for users to make connections
  - Simple HTML pages don’t suffice

16 CSE 112, Networked Life (UPenn)
• Find the person in Facebook with the most friends
  - Document your process
• Find the person with the fewest friends
  - What does this mean?
• Search for profiles with some phrase that yields 30-100 matches
  - Graph degrees/friends: what is the distribution?
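The exercise above amounts to computing each vertex's degree and then a histogram of those degrees. A minimal sketch follows, assuming friend lists have already been collected into a dictionary; the names and data are hypothetical, and nothing here queries Facebook.

```python
from collections import Counter

def degree_distribution(friends):
    """Return per-person degree and a histogram {degree: number of people}."""
    degrees = {person: len(set(neighbors)) for person, neighbors in friends.items()}
    histogram = Counter(degrees.values())
    return degrees, dict(sorted(histogram.items()))

# Hypothetical hand-collected friend lists (placeholder names).
toy_friends = {
    "ann": ["bob", "cat", "dan"],
    "bob": ["ann", "cat"],
    "cat": ["ann", "bob"],
    "dan": ["ann"],
    "eve": [],
}
degrees, histogram = degree_distribution(toy_friends)
most_connected = max(degrees, key=degrees.get)
least_connected = min(degrees, key=degrees.get)
```

Plotting `histogram` (degree on the x-axis, count on the y-axis) gives the distribution the exercise asks about.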

17 CompSci 1: Overview CS0
• Audioscrobbler and last.fm
  - Collaborative filtering
  - What is a neighbor?
  - What is the network?

18 What can we do with real data?
• How do we find a graph’s diameter?
  - This is the longest of the shortest paths between any pair of vertices
  - Can we do this in big graphs?
• What is the center of a graph?
  - From rumor mills to DDoS attacks
  - How is this related to diameter?
• Demo GUESS (as augmented at Duke)
  - IM data, Audioscrobbler data
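One way to make the diameter and center questions concrete is to run a breadth-first search from every vertex. This is a sketch under stated assumptions: the graph is undirected, unweighted, and connected, and "center" is taken to mean the vertices of minimum eccentricity (a standard definition, though the slide does not say which one it intends). The function names are illustrative.

```python
from collections import deque

def eccentricity(graph, start):
    """Largest shortest-path distance from start to any other vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if len(dist) != len(graph):
        raise ValueError("graph is not connected")
    return max(dist.values())

def diameter_and_center(graph):
    """Diameter = max eccentricity; center = vertices with min eccentricity."""
    ecc = {v: eccentricity(graph, v) for v in graph}
    diameter = max(ecc.values())
    radius = min(ecc.values())
    center = sorted(v for v, e in ecc.items() if e == radius)
    return diameter, center

# A path on five vertices: 1 - 2 - 3 - 4 - 5
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
diameter, center = diameter_and_center(path)
```

Running one search per vertex costs on the order of V·(V+E) steps, which is why the exact computation becomes expensive on big graphs. For a connected graph the radius r (the eccentricity of a center vertex) and the diameter d satisfy r ≤ d ≤ 2r, which is one way the two notions are related.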

19 My recommendations at Amazon

20 And again…

21 How do search engines work?
• Hotbot, Yahoo, Alta Vista, Excite, …
• Inverted index with buckets of words
  - Insight: use a matrix to represent how many times a term appears in one page
  - Columns: pages & Rows: terms
  - Problems?
• Return pages that have the keyword - in what order?
  - Early solution: return those pages with most occurrences of term first
  - Problems?
  - Solution? Use structure of the web to do the work for us
      · What did Google do?
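The term-by-page count matrix and the "most occurrences first" ordering can be sketched in a few lines. This is an illustration only, assuming whitespace tokenization and a sparse dictionary in place of a dense matrix; the page contents and function names are made up.

```python
from collections import Counter, defaultdict

def build_index(pages):
    """Inverted index: term -> Counter of {page_id: occurrences}."""
    index = defaultdict(Counter)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word][page_id] += 1
    return index

def rank_by_occurrences(index, term):
    """Early-style ranking: pages with the most occurrences come first."""
    counts = index.get(term.lower(), Counter())
    return sorted(counts, key=lambda page: (-counts[page], page))

# Hypothetical pages; p3 repeats one keyword and nothing else.
pages = {
    "p1": "graph theory and graph search",
    "p2": "search engines index the web",
    "p3": "graph graph graph graph",
}
index = build_index(pages)
results = rank_by_occurrences(index, "graph")
```

Here the keyword-stuffed page `p3` outranks `p1`, which illustrates one of the problems with ordering by raw term counts.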

22 Google’s PageRank
[diagram of linked web sites: xxx, yyyy, a b c d e f g, pdq]
• Inlinks are “good” (recommendations)
• Inlinks from a “good” site are better than inlinks from a “bad” site
• but inlinks from sites with many outlinks are not as “good”...
• “Good” and “bad” are relative.

23 Google’s PageRank
[diagram of linked web sites: xxx, yyyy, a b c d e f g, pdq]
• Imagine a “pagehopper” that always either follows a random link, or jumps to a random page

24 Google’s PageRank (Brin & Page, http://www-db.stanford.edu/~backrub/google.html)
[diagram of linked web sites: xxx, yyyy, a b c d e f g, pdq]
• Imagine a “pagehopper” that always either follows a random link, or jumps to a random page
• PageRank ranks pages by the amount of time the pagehopper spends on a page
• or, if there were many pagehoppers, PageRank is the expected “crowd size”
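The pagehopper picture can be sketched as a power iteration: at each step the hopper follows a random outlink with probability `damping` and jumps to a uniformly random page otherwise. The value 0.85, the fixed iteration count, and the handling of pages without outlinks are assumptions for this sketch, not details taken from the slides.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate the long-run fraction of time a random pagehopper spends on each page."""
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            # Assumption: a page with no outlinks behaves as if it linked everywhere.
            targets = links[p] or pages
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: page -> pages it links to.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
```

In this toy web, `c` has the most inlinks and ends up with the largest share of the hopper's time, while `d`, which nothing links to, is reached only by random jumps.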

25 Collaborative Filtering
• Goal: predict the utility of an item to a particular user based on a database of user profiles
  - User profiles contain user preference information
  - Preference may be explicit or implicit
      · Explicit means that a user votes explicitly on some scale
      · Implicit means that the system interprets user behavior or selections to impute a vote
• Problems
  - Missing data: voting is neither complete nor uniform
  - Preferences may change over time
  - Interface issues

26 Memory-based methods
• Store all user votes and generalize from them to predict the vote for a new item
• Predicted vote of active user a for item j: [equation not preserved in transcript]
  - where there are n users with non-zero weights,
  - v_{i,j} is the vote of user i on item j,
  - κ is a normalizing factor,
  - w() is a weighting function between users
      · Distance metric
      · Correlation or similarity
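The prediction formula itself is missing from this transcript. A form consistent with the slide's definitions is the memory-based predictor of Breese, Heckerman and Kadie (1998): p(a,j) = mean(v_a) + κ · Σ_i w(a,i) · (v(i,j) − mean(v_i)), with κ chosen so that the absolute weights sum to one. Whether the slide showed exactly this form is an assumption; the sketch below implements it with hypothetical names and made-up votes.

```python
def predict_vote(votes, weights, active, item):
    """p(a,j) = mean(v_a) + kappa * sum_i w(a,i) * (v(i,j) - mean(v_i)).

    votes:   {user: {item: vote}}
    weights: {user: w(active, user)}, e.g. a correlation or similarity
    """
    def mean_vote(user):
        return sum(votes[user].values()) / len(votes[user])

    # Only users with a non-zero weight who voted on the item contribute.
    neighbors = [
        u for u in votes
        if u != active and item in votes[u] and weights.get(u, 0.0) != 0.0
    ]
    total_weight = sum(abs(weights[u]) for u in neighbors)
    if total_weight == 0:
        return mean_vote(active)       # no usable neighbors: fall back to the mean
    kappa = 1.0 / total_weight         # normalizing factor
    offset = sum(weights[u] * (votes[u][item] - mean_vote(u)) for u in neighbors)
    return mean_vote(active) + kappa * offset

# Made-up votes on a 1-6 scale; user "a" has not voted on item "z".
votes = {
    "a": {"x": 4, "y": 2},
    "b": {"x": 4, "y": 2, "z": 6},
    "c": {"x": 4, "y": 4, "z": 1},
}
weights = {"b": 0.75, "c": 0.25}       # hypothetical w(a, b) and w(a, c)
prediction = predict_vote(votes, weights, "a", "z")
```

User `b` rated `z` two points above their own mean and `c` two points below; because `b` carries three times the weight, the prediction lands one point above the active user's mean.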

27 Computing weights - Cosine Correlation
• In information retrieval, documents are represented as vectors of word frequencies
  - For CF, we treat preferences as vectors
      · Documents -> users
      · Word frequencies -> votes
• Similarity is then the cosine of the angle between two vectors
  - Dot product of the vectors divided by the product of their magnitudes
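The cosine weight described above is the dot product of two vote vectors divided by the product of their magnitudes. A minimal sketch, assuming each user's votes are stored as a dictionary and that an item a user has not voted on counts as zero (an assumption; the slide does not say how missing votes are handled):

```python
import math

def cosine_similarity(votes_a, votes_b):
    """Cosine of the angle between two sparse vote vectors."""
    shared = set(votes_a) & set(votes_b)
    dot = sum(votes_a[k] * votes_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in votes_a.values()))
    norm_b = math.sqrt(sum(v * v for v in votes_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0                     # an all-zero vector has no direction
    return dot / (norm_a * norm_b)

# Made-up votes for illustration.
alice = {"x": 3, "y": 4}
dave = {"x": 4, "y": 3}
carol = {"z": 5}
similar = cosine_similarity(alice, dave)
unrelated = cosine_similarity(alice, carol)
```

Users with no items in common get a weight of zero, so they drop out of a weighted prediction.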

28 Computing weights - Pearson & Spearman correlation
• Pearson Correlation
  - First used for CF in GroupLens project [Resnick et al., 1994]
  - Relatively efficient to calculate incrementally
• Spearman Correlation
  - Same as Pearson, but calculations are done on the ranks of v_{a,j} and v_{i,j}
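Both correlations can be sketched with the standard library: Pearson on the raw votes, Spearman by applying the same function to ranks. The sketch assumes the two lists hold the votes of users a and i on the items both have rated, in the same order, and that ties share an average rank (a common convention, assumed here).

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length vote lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0                     # a constant voter carries no signal
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average
        pos = end + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Made-up votes: ys grows with xs, but not linearly.
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
```

Because the second list is a monotone but non-linear function of the first, Spearman reports perfect agreement while Pearson falls slightly short of 1.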

29 Model-based methods
• Really what we want is the expected value of the user’s vote
  - Cluster Models
      · Users belong to certain classes in C with common tastes
      · Naive Bayes formulation
      · Calculate Pr(v_i | C = c) from training set
  - Bayesian Network Models
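The two ideas on this slide can be written out as equations. The notation below follows Breese, Heckerman and Kadie (1998) and is an assumption about what the slide intended: votes are integers from 0 to m, I_a is the set of items user a has voted on, and C is the hidden class variable.

```latex
% Expected value of the active user's vote on item j
p_{a,j} = E(v_{a,j}) = \sum_{k=0}^{m} k \cdot \Pr\left(v_{a,j} = k \mid v_{a,i},\ i \in I_a\right)

% Naive Bayes cluster model: votes are independent given the class
\Pr(C = c, v_1, \ldots, v_n) = \Pr(C = c) \prod_{i=1}^{n} \Pr(v_i \mid C = c)
```

The class priors Pr(C = c) and the conditionals Pr(v_i | C = c) are the quantities estimated from the training set.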

