Xiaowei Ying, Xintao Wu, Daniel Barbara: Spectrum Based Fraud Detection in Social Networks

Presentation transcript:


Random Link Attack (Shrivastava et al., ICDE08)
An abstraction of collaborative attacks, including spam, viral marketing, and individual re-identification via active/passive attacks. The attacker creates some fake nodes and uses them to attack a large set of randomly selected regular nodes. The fake nodes also mimic the real graph structure among themselves to evade detection.
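The attack model above can be simulated in a few lines. This is a minimal sketch, not the authors' generator: the function name `add_random_link_attack`, the parameter `inner_p`, and the ring-shaped toy graph are all hypothetical, and the links among attackers are drawn independently at random as a stand-in for "mimicking the real graph structure".

```python
import random

def add_random_link_attack(edges, n, n_attackers, victims_per_attacker,
                           inner_p=0.3, seed=0):
    """Add one RLA to a graph on regular nodes 0..n-1.

    Attackers get ids n..n+n_attackers-1. Each attacker links to
    `victims_per_attacker` regular nodes chosen uniformly at random.
    Attacker-attacker links appear with probability `inner_p`
    (a simplifying assumption, not the slide's construction).
    """
    rng = random.Random(seed)
    attacked = {(min(u, v), max(u, v)) for u, v in edges}
    attackers = list(range(n, n + n_attackers))
    for p in attackers:
        for v in rng.sample(range(n), victims_per_attacker):
            attacked.add((v, p))          # v < n <= p, so the pair is ordered
    for i, p in enumerate(attackers):
        for q in attackers[i + 1:]:
            if rng.random() < inner_p:
                attacked.add((p, q))
    return attacked, attackers

# Toy regular graph: a ring of 50 nodes.
ring = [(i, (i + 1) % 50) for i in range(50)]
attacked_edges, attackers = add_random_link_attack(
    ring, n=50, n_attackers=5, victims_per_attacker=10)
```

Because victims are sampled without replacement, every attacker ends up with exactly `victims_per_attacker` links into the regular part of the graph.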

Topology Approach (Shrivastava et al., ICDE08)
Idea: count external triangles around each node. Neighbors of a regular user have many triangles, but random victims do not.
Algorithm: detect suspects with a clustering test and a neighborhood independence test; detect RLAs with GREEDY and TRWALK.
Limitations: too many parameters; high computational cost; difficult to detect when multiple RLAs exist.
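The triangle intuition can be illustrated with a plain per-node triangle count. This is a simplified sketch: it counts ordinary triangles through each node rather than the "external triangles" of Shrivastava et al., and the toy graph and the function name `triangles_per_node` are hypothetical.

```python
from itertools import combinations

def triangles_per_node(adj):
    """Count, for each node, the pairs of its neighbors that are themselves linked."""
    counts = {}
    for u, nbrs in adj.items():
        counts[u] = sum(1 for v, w in combinations(nbrs, 2) if w in adj[v])
    return counts

# Nodes 0-3 form a clique (regular users); node 4 links to two unrelated nodes.
adj = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2},
    4: {0, 5},
    5: {4, 6},
    6: {5},
}
counts = triangles_per_node(adj)
```

In this toy graph the clique members each sit in three triangles, while node 4, whose neighbors do not know each other, sits in none.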

Our Approach
Examine the spectral space of the graph topology. The graph is undirected, unweighted, and unsigned, and link/node attribute information is not considered.
Adjacency matrix A (symmetric).
Adjacency eigenspace.

Spectral coordinate (Ying and Wu, SDM09). Figure: Polbook network.
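Spectral coordinates can be computed directly from the adjacency matrix. A minimal NumPy sketch, assuming the coordinate of node u is row u of the matrix whose columns are the eigenvectors of the k largest eigenvalues of A; the choice of the largest eigenvalues, the function name `spectral_coordinates`, and the toy graph are assumptions, not taken from the slide.

```python
import numpy as np

def spectral_coordinates(A, k):
    """Return (eigenvalues, coordinates) for the k largest eigenvalues of symmetric A.

    Row u of the returned matrix is the k-dimensional coordinate of node u.
    """
    vals, vecs = np.linalg.eigh(A)          # eigh returns ascending eigenvalues
    top = np.argsort(vals)[::-1][:k]
    return vals[top], vecs[:, top]

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

lam, X = spectral_coordinates(A, k=2)
```

For this toy graph the second coordinate has one sign on the first triangle and the opposite sign on the second, which is the kind of community separation the Polbook figure illustrates.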

Spectrum Based Fraud Detection
RLA from the matrix perturbation point of view.

Spectrum Based Fraud Detection
Approximate the spectral coordinate.

Approximating the eigenvector under a random link attack
Regular nodes. Approximation: first order, second order. Attacking nodes.
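The slide's formulas did not survive extraction, so the sketch below is not the authors' derivation. It only checks a block identity that follows from the eigen-equation of the attacked graph: if the attacked adjacency matrix is written as [[A, B], [Bᵀ, C]] (A for regular nodes, B for attack links, C for links among attackers; these names are assumptions), then the attackers' eigenvector entries satisfy x_att = (λI − C)⁻¹ Bᵀ x_reg. Dropping C gives a simpler first-order-style estimate that ignores the attackers' inner structure.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 40, 4                                   # regular nodes, attacking nodes

def random_symmetric(size, p):
    """Random 0/1 symmetric matrix with zero diagonal."""
    upper = np.triu(rng.random((size, size)) < p, 1).astype(float)
    return upper + upper.T

A = random_symmetric(n, 0.3)                   # regular graph
C = random_symmetric(m, 0.5)                   # links among attackers
B = (rng.random((n, m)) < 0.15).astype(float)  # random attack links

A_tilde = np.block([[A, B], [B.T, C]])
vals, vecs = np.linalg.eigh(A_tilde)
lam1, x1 = vals[-1], vecs[:, -1]               # leading eigenpair of the attacked graph
x_reg, x_att = x1[:n], x1[n:]

# Exact identity from the bottom block of the eigen-equation.
exact = np.linalg.solve(lam1 * np.eye(m) - C, B.T @ x_reg)

# Estimate that ignores the attackers' inner structure C.
no_inner = (B.T @ x_reg) / lam1
```

Comparing `no_inner` with `x_att` shows how much the inner structure C contributes for a given graph; the gap shrinks as the leading eigenvalue grows relative to the eigenvalues of C.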

Illustrating network data
Network of the political blogs on the 2004 U.S. election (polblogs; 1,222 nodes and 16,714 edges). The blogs were labeled as either liberal or conservative.

Illustrating example
Political blogs (1,222 nodes, 16,714 edges): each node is labeled as either liberal or conservative. Add one RLA with 20 attacking nodes that have the same degree distribution as the regular ones.

Problem
We do not know which nodes are attackers or victims in the graph topology. For random link attacks, we can derive the distribution of the attacking nodes' spectral coordinates.

Distribution of attackers' spectral coordinates
The spectral coordinate of attacking node p follows a normal distribution whose mean and variance are bounded. From these bounds we can get the region in the spectral space where RLA attacking nodes appear with high probability. The inner structure of the attackers does not affect the region.
Example: polblogs (1,222 nodes, 16,714 edges), 20 attackers, each randomly attacking 30 victims.
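The bound itself was lost from the slide, but the mean of an attacker's coordinate can be worked out under one explicit assumption: that the i-th coordinate of attacker p is approximately the sum of its victims' i-th eigenvector entries divided by the i-th eigenvalue. If the d_p victims are drawn uniformly at random from n regular nodes, the mean is then d_p · Σ_j x_ji / (n · λ_i). The sketch below checks this by brute-force enumeration on made-up numbers; the function name and all values are hypothetical.

```python
from itertools import combinations
from statistics import mean

def expected_attacker_coordinate(x_i, lam_i, d_p):
    """Mean of (1/lam_i) * sum of x_i over d_p victims drawn uniformly without replacement."""
    n = len(x_i)
    return d_p * sum(x_i) / (n * lam_i)

# Hypothetical i-th eigenvector entries of 6 regular nodes and eigenvalue.
x_i = [0.5, 0.1, -0.2, 0.4, 0.3, 0.6]
lam_i, d_p = 2.0, 3

closed_form = expected_attacker_coordinate(x_i, lam_i, d_p)

# Brute force: average over every possible victim set of size d_p.
brute_force = mean(
    sum(x_i[j] for j in victims) / lam_i
    for victims in combinations(range(len(x_i)), d_p)
)
```

Nothing about links among attackers enters this calculation, which is consistent with the slide's remark that the inner structure does not affect the region.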

Using node non-randomness
It is tedious to check every dimension one by one. Instead, use the node non-randomness of RLA attackers: we derive upper bounds on its mean and variance and obtain a decision line.

Identifying suspects
Using the node non-randomness of RLA attackers, nodes below the decision line are suspects.
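The slide does not define node non-randomness or give the decision line. The sketch below assumes the definition from Ying and Wu (SDM09) as understood here: edge non-randomness is R(u, v) = Σ_i λ_i x_ui x_vi over the top k eigenpairs, and node non-randomness is the sum of R(u, v) over the edges of u. The function name, the toy graph, and the degree-normalized ratio are illustrative assumptions, and no decision line is implemented.

```python
import numpy as np

def node_nonrandomness(A, k):
    """Sum of edge non-randomness over each node's edges, using the top-k eigenpairs."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(vals)[::-1][:k]
    lam, X = vals[top], vecs[:, top]
    edge_R = (X * lam) @ X.T            # edge_R[u, v] = sum_i lam_i * x_ui * x_vi
    return (A * edge_R).sum(axis=1)     # keep only pairs that are actual edges

# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

degree = A.sum(axis=1)
R_top2 = node_nonrandomness(A, k=2)
ratio = R_top2 / degree                 # share of a node's degree explained by the top-2 eigenpairs
```

Under this definition the score equals Σ_i λ_i² x_ui², and with all eigenpairs it equals the node's degree, so plotting the score against degree gives a natural plane in which a decision line could separate low-scoring suspects.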

RLAs with varied inner structure

SPCTRA Algorithm

Evaluation
Baseline: topology-based RLA detection approach (Shrivastava et al., ICDE08), with the clustering test and neighborhood independence test, and GREEDY and TRWALK.
Experimental setting: political blogs (1,222 nodes, 16,714 edges), add 1 RLA with 20 attackers; Web Spam Challenge data (114K nodes and 1.8M links), add a mix of 8 RLAs with varied sizes and connection patterns.

Evaluation
Evaluation on political blogs (1 RLA each time).

Accuracy
Evaluation on Web Spam Challenge data: a snapshot of websites in the .uk domain (2007).
SPCTRA: based on spectral space.
GREEDY: based on outer triangles [Shrivastava, ICDE, 2008].

Execution time
TRWALK is 10 times faster than GREEDY (with lower accuracy), but still 100 times slower than SPCTRA. The complexity discussion is in the paper.

Bipartite Core Attacks
The attacker creates two types of nodes.
Accomplices: behave like normal users, except that they connect heavily to fraudsters to enhance the fraudsters' ratings.
Fraudsters: nodes that actually commit fraud; they mostly connect to accomplices.
No link exists within accomplices or within fraudsters.
Figure (bipartite core) from: Duen Horng Chau et al., Detecting Fraudulent Personalities in Networks of Online Auctioneers.
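A bipartite core of this kind can be generated as follows. This is a hedged sketch of the structure described above, not the setup used in the experiments: the function name `add_bipartite_core`, the parameters `core_p` and `normal_links`, and the ring-shaped regular graph are all hypothetical.

```python
import random

def add_bipartite_core(edges, n, n_fraudsters, n_accomplices,
                       core_p=0.8, normal_links=5, seed=0):
    """Attach a bipartite core to a graph on regular nodes 0..n-1.

    Accomplices link to `normal_links` random regular nodes (to look normal)
    and to each fraudster with probability `core_p`. No fraudster-fraudster
    or accomplice-accomplice link is ever created.
    """
    rng = random.Random(seed)
    out = {(min(u, v), max(u, v)) for u, v in edges}
    fraudsters = list(range(n, n + n_fraudsters))
    accomplices = list(range(n + n_fraudsters, n + n_fraudsters + n_accomplices))
    for a in accomplices:
        for v in rng.sample(range(n), normal_links):
            out.add((v, a))             # regular ids are always smaller
        for f in fraudsters:
            if rng.random() < core_p:
                out.add((f, a))         # fraudster ids precede accomplice ids
    return out, fraudsters, accomplices

ring = [(i, (i + 1) % 100) for i in range(100)]
core_edges, fraudsters, accomplices = add_bipartite_core(
    ring, n=100, n_fraudsters=5, n_accomplices=10)
```

Fraudsters here connect only to accomplices, which is a stricter reading of "mostly connect to accomplices"; adding a few fraudster-to-regular links would relax it.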

Bipartite Core Attacks fraudsters and 30 accomplices.

DDoS attacks
The attacker controls 10% of the normal nodes to attack one victim node.

Conclusion
We present a framework that exploits the spectral space of the graph topology to detect attacks. Theoretical analysis shows that attackers are located in a different region of the spectral space from regular nodes. We develop the SPCTRA algorithm for detecting RLAs and demonstrate its effectiveness and efficiency through empirical evaluation.

Future Work
Explore other attack scenarios in both social networks and communication networks. In Sybil attacks, attackers may choose victims purposely rather than randomly. Track how the graph evolves dynamically.

Questions?
Acknowledgments: This work was done in collaboration with Xiaowei Ying and Daniel Barbara, and was supported in part by the U.S. National Science Foundation (IIS, CNS, and CCF).
Thank You!

Another Example

Adjacency Eigenspace
Spectral coordinate (Ying and Wu, SDM09). Figure: Polbook network.