1
Non-Negative Residual Matrix Factorization w/ Application to Graph Anomaly Detection
Hanghang Tong and Ching-Yung Lin, IBM Research
SIAM-DM 2011, Mesa, AZ, USA, April 28-30, 2011
© 2011 IBM Corporation

2
Large Graphs are Everywhere!
Internet Map [Koren 2009], Food Web [2007], Protein Network [Salthe 2004], Social Network [Newman 2005], Web Graph, Terrorist Network [Krebs 2002]
Q: How to find patterns (e.g., communities, anomalies)?

3
A Typical Procedure: Matrix Tool for Finding Graph Patterns
Graph → Adjacency Matrix A
A = F x G + R (F x G: low-rank matrices; R: residual matrix)

4
A Typical Procedure: Matrix Tool for Finding Graph Patterns
An Illustrative Example
Graph → Adjacency Matrix A
A = F x G + R (F x G: low-rank matrices, used to find communities; R: residual matrix, used to find anomalies)

5
A Typical Procedure: An Example
Improve Interpretation by Non-negativity
Graph → Adjacency Matrix A
A = F x G + R (F x G: community; R: anomalies)
Non-negative Matrix Factorization: F >= 0; G >= 0 (for community detection)
Non-negative Residual Matrix Factorization: R(i,j) >= 0 for A(i,j) > 0 (for anomaly detection) [This Paper]
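The non-negative residual constraint can be checked mechanically for a given factorization. The following is a minimal sketch, assuming small dense list-of-lists matrices; the function names `residual` and `is_nonnegative_residual` and the toy data are illustrative and do not come from the talk.

```python
def residual(A, F, G):
    """Return R = A - F x G for dense list-of-lists matrices."""
    n, m, r = len(A), len(A[0]), len(G)
    return [
        [A[i][j] - sum(F[i][k] * G[k][j] for k in range(r)) for j in range(m)]
        for i in range(n)
    ]


def is_nonnegative_residual(A, F, G, tol=1e-12):
    """Check the NrMF constraint: R(i,j) >= 0 wherever A(i,j) > 0."""
    R = residual(A, F, G)
    return all(
        R[i][j] >= -tol
        for i in range(len(A))
        for j in range(len(A[0]))
        if A[i][j] > 0
    )


# Toy graph (hypothetical): unit-weight edges plus one heavier edge A[0][0].
A = [[3.0, 1.0], [1.0, 1.0]]
F = [[1.0], [1.0]]   # n x r, with r = 1
G = [[1.0, 1.0]]     # r x m
R = residual(A, F, G)  # the only non-zero residual sits on the heavy edge
```

With this choice of F and G the residual is zero everywhere except on the heavy edge, which is the kind of entry the slide labels as an anomaly. Scaling the first row of F to 2.0 would over-explain the edge A[0][1] and violate the constraint.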

6
Anomaly Detection on Graphs
Social Networks
–'Popularity contest'
Computer Networks
–Spammers, port scanners, vulnerable machines, etc.
Financial Transaction Networks
–Fraudulent transactions (e.g., money-laundering rings), scammers
Criminal Networks
–New criminal trends
Tele-communication Networks
–Tele-marketers
Key Observation: Abnormal Behavior → Actual Activities

7
Challenges and Core Ideas
Challenge 1: Lack of 'ground truth'
Core Idea 1: Use the residual graph to improve the usability of anomaly detection results
–(which is in turn achieved by non-negative residual matrix factorization methods)
Challenge 2: Large data
Core Idea 2: A carefully designed method that scales linearly with respect to the size of the graph

8
Optimization Formulation: General Case
[Equation annotations: Weighted Frobenius Form; Weight; Common in Any Matrix Factorization]

9
Optimization Formulation: General Case
[Equation annotations: Weighted Frobenius Form; Weight; Common in Any Matrix Factorization; Non-negative residual (Unique in This Paper)]
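The slide's equation did not survive extraction. One plausible written-out form, reconstructed from the slide's labels and the constraint R(i,j) >= 0 for A(i,j) > 0 stated earlier, is sketched below. The weight matrix symbol W and the row/column notation F(i,:), G(:,j) are assumptions made for illustration.

```latex
\begin{aligned}
\operatorname*{argmin}_{F,\,G}\quad
  & \sum_{i,j} W(i,j)\,\bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
  && \text{(weighted Frobenius form)} \\
\text{subject to}\quad
  & R(i,j) = A(i,j) - F(i,:)\,G(:,j) \;\ge\; 0
  \quad \text{for all } (i,j) \text{ with } A(i,j) > 0
  && \text{(non-negative residual)}
\end{aligned}
```

The objective is the part shared with other matrix factorizations; the inequality on the residual is the part the slide marks as unique to this paper.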

10
Optimization Formulation: 0/1 Weight Matrix (Major Focus of the Paper)
[Equation annotations: 0/1 weight; Common in Any Matrix Factorization; Non-negative residual (Unique in This Paper)]
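As a hedged illustration of what a 0/1 weight matrix does to a weighted squared-error objective: if the weight W(i,j) is taken to be 1 on observed edges and 0 elsewhere (an assumption, since the slide's definition was not captured), the sum collapses to the edges of the graph.

```latex
W(i,j) =
\begin{cases}
1 & \text{if } A(i,j) > 0 \\
0 & \text{otherwise}
\end{cases}
\qquad\Longrightarrow\qquad
\sum_{i,j} W(i,j)\,\bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
= \sum_{(i,j)\,:\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2
```

Under this reading, both the objective and the non-negative residual constraint involve only the existing edges, so the cost of evaluating them depends on the number of edges rather than on the full matrix size.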

11
Optimization Formulation with 0/1 Weight Matrix
NrMF with 0/1 Weight Matrix
Q: How to find 'optimal' F and G?
–D1: Quality; C1: non-convexity of the optimization objective
–D2: Scalability; C2: large size of the graph

12
Optimization Method: Batch Mode
Basic Idea 1: Alternating
–Not convex with respect to F and G jointly, but convex if either F or G is fixed
–With F fixed: argmin over G, subject to the non-negative residual constraint
Basic Idea 2: Separation
–The argmin over G separates into one problem for each j, each a standard quadratic programming problem
Overall Complexity: Polynomial
Can we do better?
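The separation step can be made concrete with a sketch of one per-column subproblem. It assumes the 0/1-weight objective sums squared errors over existing edges only; the slide's own formulas were not captured, so the notation below is a reconstruction rather than a quotation.

```latex
\text{For fixed } F \text{ and each column } j:\qquad
\begin{aligned}
\operatorname*{argmin}_{G(:,j)}\quad
  & \sum_{i\,:\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^2 \\
\text{subject to}\quad
  & F(i,:)\,G(:,j) \;\le\; A(i,j)
  \quad \text{for all } i \text{ with } A(i,j) > 0
\end{aligned}
```

The objective is quadratic in G(:,j) and the constraints are linear in G(:,j), which is why each subproblem is a standard quadratic program. Swapping the roles of F and G gives the analogous problem for each row of F.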

13
Optimization Method: Incremental Mode
Basic Idea 1: Recursive
–Adjacency Matrix A → Initialize: R = A → (Rank-1 Approximation → Update Residual Matrix R), repeated r times → Output Final Residual Matrix
Basic Idea 2: Alternating
Basic Idea 3: Separation
–QP for a single variable with boundary constraints; can be solved in constant time
Overall Complexity: Linear with respect to the number of edges
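The recursive, alternating, and separated steps can be sketched in plain Python. This is an illustrative reading of the slide, not the authors' implementation: it assumes a 0/1 weight matrix (only existing edges enter the objective), stores the graph as a dict mapping `(i, j)` to a positive weight, and solves each single-variable QP by clipping the unconstrained least-squares value to the bounds implied by the constraints. All function names, the initialization of g to ones, and the fixed number of sweeps are assumptions.

```python
def solve_single_variable(edge_list, other, R):
    """Minimise sum (R[e] - x * other[k])**2 subject to x * other[k] <= R[e].

    The unconstrained minimiser q = num / den is clipped to [low, up].
    """
    num = den = 0.0
    low, up = float("-inf"), float("inf")
    for e, k in edge_list:
        r, o = R[e], other[k]
        num += r * o
        den += o * o
        if o > 0:
            up = min(up, r / o)    # x <= r / o
        elif o < 0:
            low = max(low, r / o)  # x >= r / o
    if den == 0.0:
        return 0.0                 # nothing to fit for this node
    return min(max(num / den, low), up)


def rank1_nrmf(R, n_rows, n_cols, sweeps=10):
    """Alternate between f (g fixed) and g (f fixed); each entry is separate."""
    rows = [[] for _ in range(n_rows)]
    cols = [[] for _ in range(n_cols)]
    for (i, j) in R:
        rows[i].append(((i, j), j))
        cols[j].append(((i, j), i))
    f, g = [0.0] * n_rows, [1.0] * n_cols  # assumed initialisation
    for _ in range(sweeps):
        f = [solve_single_variable(rows[i], g, R) for i in range(n_rows)]
        g = [solve_single_variable(cols[j], f, R) for j in range(n_cols)]
    return f, g


def nrmf_incremental(edges, n_rows, n_cols, rank=1, sweeps=10):
    """Initialise R = A, then peel off `rank` rank-1 approximations."""
    R = dict(edges)
    factors = []
    for _ in range(rank):
        f, g = rank1_nrmf(R, n_rows, n_cols, sweeps)
        for (i, j) in R:
            # Clamp rounding noise so the residual stays non-negative.
            R[(i, j)] = max(0.0, R[(i, j)] - f[i] * g[j])
        factors.append((f, g))
    return factors, R


# Toy graph (hypothetical): a 3x3 block of unit edges with one heavy edge.
edges = {(i, j): 1.0 for i in range(3) for j in range(3)}
edges[(0, 0)] = 5.0
factors, R = nrmf_incremental(edges, 3, 3, rank=1)
# R[(0, 0)] is 4.0; every other residual is 0.0
```

Each sweep visits every edge a constant number of times, so for a fixed rank and a fixed number of sweeps the running time of this sketch grows linearly with the number of edges. In the toy graph the rank-1 step explains the uniform block and leaves the extra weight of the heavy edge in the residual.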

14
Experimental Evaluation
[Figure: Effectiveness, accuracy by anomaly type]
[Figure: Efficiency, wall-clock time vs. # of edges]

15
Experimental Evaluation
[Figure: Effectiveness, accuracy by anomaly type]
[Figures: Efficiency, time vs. # of edges, # of type-1 nodes, and # of type-2 nodes]

16
Batch Method vs. Incremental Method
[Figure: log wall-clock time (sec.) per data set, Incremental Method vs. Batch Method]

17
Conclusion
Problem Formulation: Non-negative Residual Matrix Factorization
–a new matrix factorization for interpretable graph anomaly detection
Optimization Methods
–Batch: straightforward, polynomial time complexity
–Incremental: linear time complexity
Future Work
–Other interpretable properties (e.g., sparseness) for anomaly detection
–Matrix factorization w/ total non-negativity

18
Thank you!
htong@us.ibm.com
(We are hiring at IBM Research!)

19
Visual Comparison

20
[Figure: position of the value q relative to the bounds low and up]
