Email Alias Detection Using Social Network Analysis Ralf Holzer, Bradley Malin, Latanya Sweeney LinkKDD 2005 Advisor: Dr. Koh Jia-Ling Reporter: Che-Wei,


1 Email Alias Detection Using Social Network Analysis
Ralf Holzer, Bradley Malin, Latanya Sweeney
LinkKDD 2005
Advisor: Dr. Koh Jia-Ling
Reporter: Che-Wei, Liang
Date: 2008/08/14

2 Outline
Introduction
Alias Detection Method
– Data Representation
– Ranking Algorithms
Experiment

3 Introduction
Individuals use aliases for various communication purposes
Alias detection
– Useful to both legitimate and illegitimate applications
– Important to understand the extent to which the process can be automated

4 Introduction
Aliases listed on the same webpage can indicate that some form of relationship exists between them
Many people use several email addresses
– This paper attempts to determine which email addresses correspond to the same entity

5 Introduction
Email addresses, a type of alias, can be distilled from a large number of web pages
– Such as class rosters, research papers, and discussion boards
Email addresses provide a unique mapping from an address to a specific entity

6 Data Representation
Let S represent the set of sources
The data is modeled as an undirected graph G = (I, E)
– I is the set of unique email addresses
– C_ab = |e_ab| denotes the number of sources associated with the edge connecting a and b
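The representation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that an edge joins two addresses whenever they appear in at least one common source, and the names `build_graph`, `collocation_count`, and the sample addresses are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sources):
    """Map each unordered address pair to the set of sources it appears in.

    `sources` maps a source id (e.g. a web page) to the addresses found on it.
    Assumption: two addresses are connected when they share at least one source.
    """
    edge_sources = defaultdict(set)
    for source_id, addresses in sources.items():
        # Deduplicate within a source so a repeated address is counted once.
        for a, b in combinations(sorted(set(addresses)), 2):
            edge_sources[frozenset((a, b))].add(source_id)
    return edge_sources


def collocation_count(edge_sources, a, b):
    """C_ab: the number of sources associated with the edge between a and b."""
    return len(edge_sources.get(frozenset((a, b)), ()))


# Hypothetical sample data: two pages listing overlapping addresses.
sources = {
    "roster.html": ["a@x.edu", "b@x.edu", "c@x.edu"],
    "paper.html": ["a@x.edu", "b@x.edu"],
}
edge_sources = build_graph(sources)
```

Storing the set of sources per edge, rather than only a count, keeps the source sizes available for the later weighting schemes.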

7 Ranking Algorithms
Ranking method
– Top-k list of possible aliases
– Shortest path algorithm
Geodesic distance is used to generate a ranking of the nodes closest to a given originating node
Relationship strength is augmented with
– Number of aliases on a source
– Number of collocations of aliases

8 Ranking Algorithms
Geodesic distance
– Length of the shortest path from a to b
– Potential aliases are ranked from lowest to highest geodesic distance
Multiple Collocation
– Two aliases that collocate on more than one webpage have a stronger relationship
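As a rough sketch of the geodesic ranking, a breadth-first search from the originating node yields the shortest-path length to every reachable node, and sorting by that length gives a top-k list. The function name `geodesic_ranking`, the alphabetical tie-breaking, and the toy graph are assumptions made for illustration; the slides do not say how ties are handled.

```python
from collections import deque


def geodesic_ranking(adjacency, origin, k=None):
    """Rank nodes by unweighted shortest-path length from `origin`.

    `adjacency` maps each node to the set of its neighbours.
    Returns (node, distance) pairs, closest first; unreachable nodes are omitted.
    """
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    # Assumption: ties in distance are broken alphabetically by node name.
    ranked = sorted(
        ((n, d) for n, d in dist.items() if n != origin),
        key=lambda pair: (pair[1], pair[0]),
    )
    return ranked if k is None else ranked[:k]


# Hypothetical toy graph: a chain a-b-d-e with an extra neighbour c of a.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d"},
}
top3 = geodesic_ranking(adjacency, "a", k=3)
```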

9 Ranking Algorithms
Source Size
– The strength of the relationship between two aliases is inversely correlated with the number of aliases in a source
Combined
– Integrates both of the previous assumptions
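The slides do not give the weighting formulas, so the following is only one plausible way to turn the three assumptions into edge weights for a weighted shortest-path ranking. The strength definitions (a count of shared sources, the largest 1/|s| over shared sources, and the sum of 1/|s| over shared sources), the conversion of strength to distance as 1/strength, and the function names are all assumptions rather than the paper's definitions.

```python
import heapq


def edge_weights(sources, mode="combined"):
    """Return {(a, b): distance} with a < b; stronger ties get shorter edges.

    Illustrative strengths (assumptions, not the paper's formulas):
      collocation: number of sources shared by a and b
      source_size: largest 1/|s| over the shared sources s
      combined:    sum of 1/|s| over the shared sources s
    """
    strength = {}
    for addresses in sources.values():
        members = sorted(set(addresses))
        size = len(members)
        for i, a in enumerate(members):
            for b in members[i + 1:]:
                key = (a, b)
                if mode == "collocation":
                    strength[key] = strength.get(key, 0.0) + 1.0
                elif mode == "source_size":
                    strength[key] = max(strength.get(key, 0.0), 1.0 / size)
                elif mode == "combined":
                    strength[key] = strength.get(key, 0.0) + 1.0 / size
                else:
                    raise ValueError(f"unknown mode: {mode}")
    return {key: 1.0 / s for key, s in strength.items()}


def rank_by_weighted_distance(weights, origin):
    """Dijkstra's algorithm; returns (node, distance) pairs, closest first."""
    adjacency = {}
    for (a, b), w in weights.items():
        adjacency.setdefault(a, []).append((b, w))
        adjacency.setdefault(b, []).append((a, w))
    dist = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, w in adjacency.get(node, ()):
            candidate = d + w
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return sorted(
        ((n, d) for n, d in dist.items() if n != origin),
        key=lambda pair: (pair[1], pair[0]),
    )


# Hypothetical data: a four-address roster and a two-address paper.
weighted_sources = {"roster": ["a", "b", "c", "d"], "paper": ["a", "b"]}
combined = edge_weights(weighted_sources, mode="combined")
ranking = rank_by_weighted_distance(combined, "a")
```

In this toy example, b shares both a small source and a second collocation with a, so it ranks ahead of c and d under the combined weighting.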

10 Experiment
Data set derived from CMU web pages
– 1978 distinct email aliases
Data Set Statistics

11 Experiment

12 Experiment
Geodesic Alias Distances

13 Experiment

14 Experiment

15 Experiment

