CS522: Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian

1 CS522: Algorithmic and Economic Aspects of the Internet
Instructors: Nicole Immorlica (nickle@microsoft.com), Mohammad Mahdian (mahdian@microsoft.com)

2 Previously in this class
Ranking using the hyperlink structure:
• HITS
• PageRank

3 Today
• Dealing with web spam
• An axiomatic approach to PageRank
Next lecture: Kamal Jain

4 Recap
The PageRank of a page p is the probability of p in the stationary distribution of a random walk that, at each step, follows a random link from the current page with probability 1 – ε and restarts from a random page with probability ε. Typically, ε = 0.15.
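The random walk in the recap can be sketched as a power iteration. This is a minimal illustration, not the course's code: the function name `pagerank`, the `toy` graph, and the treatment of pages without out-links (spread uniformly over all pages) are assumptions, since the slide does not specify them. Every link target must also appear as a key of `links`.

```python
def pagerank(links, eps=0.15, iters=100):
    """Approximate the stationary distribution of the PageRank random walk.

    links: dict mapping each page to the list of pages it links to.
    Assumption: a page with no out-links behaves as if it linked to every page.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # With probability eps the walk restarts at a random page.
        new = {p: eps / n for p in pages}
        for p in pages:
            out = links[p] or pages  # dangling page: spread uniformly (assumed)
            share = (1.0 - eps) * rank[p] / len(out)
            # With probability 1 - eps the walk follows a random out-link.
            for q in out:
                new[q] += share
        rank = new
    return rank


# Hypothetical four-page web used only for illustration.
toy = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Page "d" has no in-links, so its rank comes only from restarts and equals ε/n.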

5 The collusion problem
What if a group of nodes “collude” to increase the PageRank of one or more nodes in the group?
Zhang, Goel, Govindan, Mason, and Van Roy (WAW 2004) define the “amplification” of a group of nodes and prove that it is always at most O(1/ε).

6 The collusion problem
Question: Is collusion really a problem?
Experiment (on a web subgraph, and blogstreet):
• Take, say, the 1000th and the 1001st nodes in the PageRank order.
• Each of these nodes removes all its links to other pages and adds a link to the other.
• Compute PageRanks in the new graph.
Results: The ranks of the colluding nodes increase significantly.
Exercise: Go to eBay and search for PageRank.
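The experiment can be imitated on a small synthetic graph. The sketch below is an assumption-laden toy, not the web subgraph or blogstreet data from the slide: it uses a hypothetical 10-node ring in which every node starts with the same PageRank, and lets nodes 0 and 1 play the role of the two colluders.

```python
def pagerank(links, eps=0.15, iters=200):
    """Power iteration; links maps each node to its non-empty list of out-links."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: eps / n for v in nodes}  # restart mass
        for v in nodes:
            share = (1.0 - eps) * rank[v] / len(links[v])
            for w in links[v]:
                new[w] += share
        rank = new
    return rank


# Hypothetical toy graph: node i links to the nodes 1 and 3 steps ahead on a ring.
n = 10
honest = {i: [(i + 1) % n, (i + 3) % n] for i in range(n)}

# Nodes 0 and 1 collude: each drops its links to other pages and links to the other.
colluding = dict(honest)
colluding[0] = [1]
colluding[1] = [0]

before = pagerank(honest)
after = pagerank(colluding)
for v in (0, 1):
    print(f"node {v}: before={before[v]:.3f} after={after[v]:.3f}")
```

In this toy the two colluders trap the walk in a two-node cycle that it leaves only by restarting, which is why their ranks rise.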

7 Finding colluding groups
Approach 1: Find a set S with the largest amplification. However, it can be shown that this problem is NP-hard.

8 Finding colluding nodes
Approach 2: Identify colluding individuals.
Observation: If we increase ε, the PageRank of a colluding individual decreases (often in proportion to 1/ε).
Heuristic: Compute PageRanks for multiple values of ε, and compute the correlation of each node’s PageRank with 1/ε. Nodes with high correlation are probably colluding.
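A rough sketch of this heuristic follows. The choice of Pearson correlation, the list `eps_values`, and the toy ring graph with colluders 0 and 1 are all assumptions made for illustration; the slide does not fix any of them.

```python
def pagerank(links, eps, iters=300):
    """Power iteration; links maps each node to its non-empty list of out-links."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: eps / n for v in nodes}
        for v in nodes:
            share = (1.0 - eps) * rank[v] / len(links[v])
            for w in links[v]:
                new[w] += share
        rank = new
    return rank


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx * sy > 0 else 0.0


# Hypothetical toy graph in which nodes 0 and 1 link only to each other.
n = 10
links = {i: [(i + 1) % n, (i + 3) % n] for i in range(n)}
links[0], links[1] = [1], [0]

eps_values = [0.1, 0.15, 0.2, 0.3, 0.5]  # assumed sample of restart probabilities
runs = [pagerank(links, eps) for eps in eps_values]
inv_eps = [1.0 / eps for eps in eps_values]

# Correlate each node's PageRank across the runs with 1/eps.
corr = {v: pearson(inv_eps, [run[v] for run in runs]) for v in links}
suspects = sorted(corr, key=corr.get, reverse=True)[:2]
print("most correlated with 1/eps:", suspects)
```

On this toy graph the non-colluding nodes gain rank as ε grows, so their correlation with 1/ε is negative and the two colluders stand out. Real graphs would need a threshold chosen from data rather than a fixed top-2 cut.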

9 Dealing with collusion
We can “punish” colluding individuals by increasing their ε, so that they cannot pass their reputation on to others.
Experimental results
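One way to read the punishment idea is to give each node its own restart probability. The sketch below assumes that reading: `pagerank_node_eps` and the punished value 0.9 are hypothetical choices, and the restart target is assumed to stay a random page. It reuses a toy ring graph in which nodes 0 and 1 collude.

```python
def pagerank_node_eps(links, eps_of, iters=300):
    """PageRank where a walk sitting at node v restarts with probability eps_of[v]."""
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Total probability of restarting this step, spread evenly over all nodes.
        restart = sum(eps_of[v] * rank[v] for v in nodes)
        new = {v: restart / n for v in nodes}
        for v in nodes:
            share = (1.0 - eps_of[v]) * rank[v] / len(links[v])
            for w in links[v]:
                new[w] += share
        rank = new
    return rank


# Hypothetical toy graph in which nodes 0 and 1 link only to each other.
n = 10
links = {i: [(i + 1) % n, (i + 3) % n] for i in range(n)}
links[0], links[1] = [1], [0]

uniform_eps = {v: 0.15 for v in links}
punishing_eps = dict(uniform_eps)
punishing_eps[0] = punishing_eps[1] = 0.9  # assumed penalty for suspected colluders

plain = pagerank_node_eps(links, uniform_eps)
punished = pagerank_node_eps(links, punishing_eps)
print("colluders' total rank:",
      round(plain[0] + plain[1], 3), "->", round(punished[0] + punished[1], 3))
```

With a high ε, a colluder forwards only a small fraction of its rank along its single link, so the pair can no longer recycle rank between themselves.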

10 Explaining PageRank
Axiomatic approach:
• Define a set of “natural” axioms.
• Prove that PageRank satisfies these axioms.
• Prove that any page ranking algorithm satisfying these axioms outputs the same ranking as PageRank.

11 Axiomatic Approaches: Voting
Consider a democracy where people submit preference lists over candidates. A voting rule (or social welfare function) outputs a global ordering of candidates for every set of preference lists.

12 Voting Axioms
Unanimity: If everyone prefers candidate x to y, then the global ordering also ranks x above y.
Independence of irrelevant alternatives (IIA): For any two candidates x and y, changes in people’s rankings of candidates other than x and y should not affect the relative position of x and y in the global ordering.

13 Arrow’s (Im)possibility Theorem
Theorem [Arrow, 1951]: With three or more candidates, the only voting rule satisfying unanimity and IIA is dictatorship.
Extensions:
• Similar results hold for social choice functions, where a single candidate (winner) must be chosen [Muller-Satterthwaite, 1977].
• Majority rule arises naturally when we relax IIA or restrict the preference domain of people (i.e., impose rules on how they can rank candidates).

14 Axiomatic Approach: PageRank
Agents are the nodes of a graph. Each agent casts a “vote” over other agents, and the votes are represented by a directed graph G. A ranking algorithm is a function mapping every directed graph to an ordering of its nodes.

15 PageRank Axioms
Isomorphism: The ranking procedure should be independent of the names of the nodes.
Self edge: Adding a self loop should not harm a node and should not affect other nodes.
Vote by committee: The importance a gives to b and c by voting shouldn’t change if a votes via a committee.
Collapsing: If two nodes vote similarly, and are linked to by disjoint sets of nodes, the ranking does not change when they are collapsed to one node.
Proxy: There is an equal distribution of importance.

16 PageRank: Altman and Tennenholtz
Theorem: PageRank satisfies these axioms.
Theorem: PageRank is the only ranking algorithm that satisfies these axioms (i.e., every other ranking algorithm that satisfies them outputs the same ranking as PageRank).

