CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks (WWW2013) BEUTEL, ALEX, WANHONG XU, VENKATESAN GURUSWAMI, CHRISTOPHER.

Presentation transcript:

CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks (WWW2013) BEUTEL, ALEX, WANHONG XU, VENKATESAN GURUSWAMI, CHRISTOPHER PALOW, AND CHRISTOS FALOUTSOS. GROUP 20

Outline 1. Motivation 2. Problem Formulation 3. Solutions ◦ 3.1 Serial Algorithm ◦ 3.2 MapReduce Implementation 4. Experiments 5. Conclusion

1. Motivation 1. Misleading feedback. 2. Boosting a Facebook Page's Like count with ill-gotten Likes. An ill-gotten Like is a Like that doesn’t come from someone truly interested in connecting with a Page. 3. Existing defense mechanisms in Facebook: anti-phishing, anti-malware, fake account detection.

2. Problem Formulation 1. Detecting ill-gotten Page Likes on Facebook (and other deceitful user feedback in many other online settings). 2. Lockstep behavior: groups of users acting together, generally Liking the same Pages at around the same time.
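The lockstep definition above can be made concrete with a small check. This is a minimal sketch, not the authors' method: the function name `is_lockstep`, the `likes` dictionary layout, and the rule "all Likes on a Page fall within a span of 2 * delta_t" are assumptions introduced here for illustration, and the strict requirement that every user in the group Liked every Page is a simplification.

```python
def is_lockstep(likes, users, pages, delta_t):
    """Return True if every user Liked every page within a shared time window.

    likes   -- dict mapping (user, page) -> Like time (hypothetical layout)
    delta_t -- half-width of the time window (assumed meaning of the parameter)
    """
    for page in pages:
        times = []
        for user in users:
            if (user, page) not in likes:
                return False  # strict version: a missing Like breaks lockstep
            times.append(likes[(user, page)])
        # All Likes fit within delta_t of some center iff their span is <= 2 * delta_t.
        if times and max(times) - min(times) > 2 * delta_t:
            return False
    return True


likes = {
    ("u1", "A"): 100, ("u2", "A"): 103,
    ("u1", "B"): 500, ("u2", "B"): 900,
}
print(is_lockstep(likes, ["u1", "u2"], ["A"], delta_t=2))
print(is_lockstep(likes, ["u1", "u2"], ["A", "B"], delta_t=2))
```

The two users act in lockstep on Page A alone, but not on Pages A and B together, because their Likes of B are far apart in time.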

2. Problem Formulation

2. Problem Formulation

Finding the bipartite core is NP-hard.

2. Problem Formulation Maximize the number of suspicious users and the number of their Page Likes that are suspicious (i.e., that fall within the designated time window).
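One way to read this objective is as a count of in-window Likes, summed over users who qualify as suspicious. The sketch below is an illustration under stated assumptions: the function name, the `center` dictionary (Page to window center time), and in particular the threshold `rho` that decides when a user counts as suspicious are hypothetical and are not given in the slides.

```python
def suspicious_like_count(likes, users, center, delta_t, rho):
    """Count in-window Likes over users who hit enough of the cluster's Pages.

    likes   -- dict mapping (user, page) -> Like time (hypothetical layout)
    center  -- dict mapping page -> center time of its window
    delta_t -- half-width of the time window
    rho     -- assumed fraction of Pages a user must hit to count as suspicious
    """
    total = 0
    for user in users:
        in_window = sum(
            1
            for page, c in center.items()
            if (user, page) in likes and abs(likes[(user, page)] - c) <= delta_t
        )
        if in_window >= rho * len(center):
            total += in_window  # only suspicious users contribute
    return total


likes = {
    ("u1", "A"): 10, ("u1", "B"): 20,
    ("u2", "A"): 11, ("u2", "B"): 90,
    ("u3", "A"): 50,
}
center = {"A": 10, "B": 20}
print(suspicious_like_count(likes, ["u1", "u2", "u3"], center, delta_t=2, rho=1.0))
```

Lowering `rho` admits users who match only part of the Page set, which raises the count at the cost of a looser notion of "suspicious".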

3. Solutions Question: What do we have? User-Page Like relationship → bipartite graph. Like time → edge creation time. 3.1 A Serial Algorithm: iterative updates. 3.2 MapReduce Implementation: parallel execution.
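The mapping from Like data to a timestamped bipartite graph could be represented as follows. This is only a sketch: the record layout `(user, page, time)` and the function name `build_bipartite` are assumptions made for illustration.

```python
from collections import defaultdict


def build_bipartite(like_records):
    """Build both sides of a user-Page bipartite graph with edge creation times.

    like_records -- iterable of (user, page, like_time) tuples (assumed layout)
    """
    user_to_pages = defaultdict(dict)  # user -> {page: like_time}
    page_to_users = defaultdict(dict)  # page -> {user: like_time}
    for user, page, like_time in like_records:
        user_to_pages[user][page] = like_time
        page_to_users[page][user] = like_time
    return user_to_pages, page_to_users


records = [("u1", "A", 100), ("u2", "A", 103), ("u1", "B", 500)]
user_to_pages, page_to_users = build_bipartite(records)
print(dict(user_to_pages["u1"]))
print(dict(page_to_users["A"]))
```

Keeping both directions makes it cheap to ask either "which Pages did this user Like, and when?" or "who Liked this Page, and when?".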

3.1 Serial Algorithm Alternate between two steps: keep P′ constant and update c, then keep c constant and update P′.

3.1 Serial Algorithm In the updateCenter() function: keep P′ constant and update c. Loosen the window width to a multiple of Δt, with a multiplier greater than 1. In the updateSubspace() function: keep c constant and update P′. Any user that was covered before will still be covered.
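The center update with a loosened window can be sketched for a single Page as below. This is a simplified stand-in and not the algorithm from the talk: the slides do not give the update rule, so taking the mean of nearby Like times is an assumption, as are the function name `update_center` and the multiplier name and default `widen=1.5`.

```python
def update_center(times, center, delta_t, widen=1.5):
    """Move one Page's window center toward the Like times near it.

    times   -- Like times on this Page from the users under consideration
    center  -- current center time for this Page
    delta_t -- half-width of the time window
    widen   -- assumed multiplier (> 1) that loosens the window during the update
    """
    window = widen * delta_t
    nearby = [t for t in times if abs(t - center) <= window]
    if not nearby:
        return center  # nothing in range: leave the center where it is
    return sum(nearby) / len(nearby)  # assumed rule: mean of nearby Like times


new_center = update_center([10, 11, 12, 50], center=10.5, delta_t=2)
print(new_center)
```

Using a window wider than Δt during the update lets the center drift toward Likes just outside the current window, while the outlier at time 50 is ignored.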

3.1 Serial Algorithm Convergence:

3.2 MapReduce Implementation Brief Introduction to MapReduce

3.2 MapReduce Implementation Input: a set of clusters. Each cluster has a center c and P′. Output: updated c and P′ for each cluster.

3.2 MapReduce Implementation Algorithm: (1) Mapper: take L and I as input. For each user i, check whether the user belongs to cluster k. If yes, output (k, (L_{i,*}, I_{i,*})). (2) Reducer: take (k, (L_{i,*}, I_{i,*})) as input. Then, for each cluster k, update c and P′.
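The mapper and reducer roles described above can be simulated in plain Python, without a Hadoop cluster. This is a sketch under several assumptions, not the authors' implementation: a user's row is a pair of lists (`L_row` of Like times and `I_row` of 0/1 indicators, indexed by Page), membership means hitting at least `min_pages` of the cluster's Pages within `delta_t`, the reducer only updates c (as a mean of in-window times) and leaves P′ unchanged, and every function and parameter name is hypothetical.

```python
from collections import defaultdict


def belongs(L_row, I_row, center, pages, delta_t, min_pages):
    """Assumed membership test: enough of the cluster's Pages Liked in-window."""
    hits = sum(
        1 for j in pages if I_row[j] and abs(L_row[j] - center[j]) <= delta_t
    )
    return hits >= min_pages


def mapper(rows, clusters, delta_t, min_pages):
    """Emit (k, (L_row, I_row)) for each cluster k that the user belongs to."""
    for L_row, I_row in rows:
        for k, (center, pages) in clusters.items():
            if belongs(L_row, I_row, center, pages, delta_t, min_pages):
                yield k, (L_row, I_row)


def reducer(k, values, clusters, delta_t):
    """Update cluster k's center from its members; P' is kept as is here."""
    center, pages = clusters[k]
    new_center = dict(center)
    for j in pages:
        times = [
            L_row[j]
            for L_row, I_row in values
            if I_row[j] and abs(L_row[j] - center[j]) <= delta_t
        ]
        if times:
            new_center[j] = sum(times) / len(times)
    return new_center, pages


def run_round(rows, clusters, delta_t, min_pages):
    """One map, shuffle, and reduce pass over all clusters."""
    grouped = defaultdict(list)
    for k, value in mapper(rows, clusters, delta_t, min_pages):
        grouped[k].append(value)
    return {
        k: reducer(k, grouped[k], clusters, delta_t) if k in grouped else clusters[k]
        for k in clusters
    }


rows = [
    ([10, 20, 0], [1, 1, 0]),
    ([12, 22, 0], [1, 1, 0]),
    ([100, 0, 5], [1, 0, 1]),
]
clusters = {0: ({0: 11, 1: 20}, [0, 1])}
result = run_round(rows, clusters, delta_t=2, min_pages=2)
print(result)
```

Because each user's row is processed independently in the map step and each cluster independently in the reduce step, both steps can run in parallel across machines.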

4. Experiments 4.1 Scalability 4.2 Convergence 4.3 Effectiveness Dataset: real Facebook Like data and synthetic data

4.1 Scalability Good Scalability: Linear relationship between the data size and runtime

4.2 Convergence Although the MapReduce algorithm is not provably convergent, it converges in practice.

4.3 Effectiveness The algorithm finds most attacks in practice.

Conclusion (1) A novel problem formulation, with a simple, concrete definition of suspicious behavior in terms of graph structure and edge constraints. (2) Two algorithms to find such suspicious lockstep behavior: ◦ one provably convergent iterative algorithm ◦ one approximate, scalable MapReduce implementation

Thanks