Algorithms For Solving History Sensitive Cascade in Diffusion Networks Research Proposal Georgi Smilyanov, Maksim Tsikhanovich Advisor Dr Yu Zhang Trinity.


Algorithms for Solving History Sensitive Cascade in Diffusion Networks
Research Proposal
Georgi Smilyanov, Maksim Tsikhanovich
Advisor: Dr. Yu Zhang
Trinity University CS REU, 5 June 2009

Motivation
– Network diffusion: the process by which some nodes in a network influence neighboring nodes and change their state
– Applications: brand recognition
– Diffusion in other domains: infectious diseases, ideas, new technologies

Modeling Network Diffusion: Common Models
– Linear Threshold Model: a node activates when a certain (weighted) fraction of its neighbors is active
– Independent Cascade Model: an active node has a one-time chance of activating each neighbor, and succeeds with a certain probability
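The two common models can be sketched as follows. This is a minimal illustration, not code from the presentation: the function names, the dictionary-based graph layout, and the per-vertex `thresholds` table are all assumptions made for the example.

```python
import random


def linear_threshold_step(in_weights, thresholds, active):
    """One synchronous Linear Threshold step (illustrative layout).

    in_weights[v] maps each in-neighbor u of v to the weight of edge (u, v);
    thresholds[v] is the weighted fraction v needs before it activates.
    """
    newly_active = set()
    for v, neighbors in in_weights.items():
        if v in active:
            continue
        # Sum the weights of the in-neighbors that are already active.
        total = sum(w for u, w in neighbors.items() if u in active)
        if total >= thresholds[v]:
            newly_active.add(v)
    return set(active) | newly_active


def independent_cascade(out_probs, seeds, rng):
    """Run Independent Cascade until no new activations occur.

    out_probs[u] maps each out-neighbor v of u to the success probability.
    Each node tries each neighbor exactly once, in the round after it activates.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in out_probs.get(u, {}).items():
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active
```

For example, `independent_cascade(graph, ["a"], random.Random(0))` runs one random cascade from seed `"a"`; averaging over many runs would estimate the expected spread.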

Modeling Network Diffusion: New Model
– History Sensitive Cascade Model (HSCM)
– Main idea: allows nodes to try to activate neighbors multiple times
– Benefit: more plausible, since in reality people have multiple interactions with each other

History Sensitive Cascade Model: Application
– A company releases a new product: what should the advertising target audience be?
– Consumers with the highest willingness to pay?
– More influential consumers?
– Model consumers as nodes that have both “intrinsic” value and “network” value

History Sensitive Cascade Model: Application
– A company releases a new product: what should the advertising target audience be?
– A consumer with low intrinsic value may be worth marketing to just because of her network value
– Marketing to a profitable consumer may be redundant if the network effect already makes her likely to buy

History Sensitive Cascade Model: Problems
– Vertex Activation Problem: given a node, what is the probability that this node becomes active by a given time?
– Optimization Problem: what is the best subset of nodes to activate initially so as to maximize the number of active nodes, given a certain time for interaction?

History Sensitive Cascade Model: Problems
– The current algorithm implementing HSCM runs in exponential time
– We hope to invent an approximation algorithm that runs in polynomial time

3. Problem Definition
The problems we are trying to solve

Outline
Vertex Activation Problem
– Approximating it
Optimization Problems
– Time Minimization
– Activation Maximization
– Approximating them

Vertex Activation Problem
Given a directed, weighted graph G
– Each edge weight is the probability that the edge's source activates its target in one time step
– What is the probability that a certain vertex v is active at the k-th time step?
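A small worked case may help fix ideas. It assumes (this is not stated on the slide) that an active source retries its edge independently at every time step and that activated vertices stay active. For a graph with a single edge from u to v of weight p, with u active at step 0, v is still inactive after k steps only if all k attempts fail:

```latex
\Pr[\, v \text{ active by step } k \,] \;=\; 1 - (1 - p)^{k},
\qquad \text{e.g. } p = 0.5,\ k = 3 \;\Rightarrow\; 1 - 0.5^{3} = 0.875 .
```

On general graphs no such closed form is available, because the activation times of different vertices are correlated.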

Vertex Activation Approximation Problem
Given a directed, weighted graph G, a vertex v and a time step k
Suppose we have a program P that takes (G,k,v) and returns the exact probability of v being active by the k-th time step
Create a program A such that
– |P(G,k,v)-A(G,k,v)|≤ε
– 0<ε<1
– The error bound ε is guaranteed to hold for all G,k,v

Possible Problems With the Approximation
We may not be able to create a polynomial-time approximation algorithm for general graphs for any ε<1, because of the complexity of the HSCM
– We will explore this; if it cannot be done for general graphs, we will do it for restricted classes of graphs
– A polynomial-time solution for tree graphs was created during last year's REU

What We Can Do with a Vertex Activation Solver
Use the concept of Θ-certitude
– We are Θ-certain that a particular vertex v is active by the k-th time step if P(G,k,v)≥Θ
Determine whether we are Θ-certain that a subset U of V is active by time step k
– We simply check that P(G,k,u)≥Θ for all u in U
We use Θ-certitude to define two optimization problems.
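The Θ-certitude check is a thin wrapper around whatever solver is available. In the sketch below, `solver` stands in for the program P (or an approximation A) and is assumed to be callable as `solver(graph, k, v)`; the function names are illustrative only.

```python
def theta_certain(solver, graph, targets, k, theta):
    """Return True if every vertex in `targets` is theta-certain by step k.

    `solver(graph, k, v)` is assumed to return the probability that v is
    active by the k-th time step.
    """
    return all(solver(graph, k, u) >= theta for u in targets)


def theta_active_set(solver, graph, vertices, k, theta):
    """Return the set of vertices that are theta-certain by step k."""
    return {v for v in vertices if solver(graph, k, v) >= theta}
```

The size of the set returned by `theta_active_set` is the quantity that the Activation Maximization Problem tries to maximize.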

Time Minimization Problem
Given G and a number m<|V|
– Which subset U of V, with |U|≤m, should be selected
– So that k is minimized, where k is the first time step at which all v in V are activated with Θ-certitude?

Activation Maximization Problem
Given G and m<|V|
– Which subset U of V, with |U|≤m, should be selected such that at the k-th time step
– The size |A_Θ| of the set of nodes activated with Θ-certitude is maximized?
Both optimization problems are NP-complete, so in order to work with large data sets, we need to create approximations.

Approximating the Activation Maximization Problem
Given G and m<|V|, which subset U of V should be selected such that
– At time step k, the size |A_Θ| of the set of vertices activated with Θ-certitude is at least ε|A_Θ*|
– 0<ε<1
– |A_Θ*| denotes the size of the set of vertices activated with Θ-certitude if the optimal U is chosen

4. Proposed Solution
The strategies we expect to use to solve our problems

Solving the Vertex Activation Problem
– Building on the work of last year's REU, we have created and implemented an algorithm
– It uses Markov chains to calculate the probability of a vertex being activated by the k-th time step
– It involves multiplying a state transition matrix; since there are 2^|V| states the graph can take, this matrix has 2^|V| × 2^|V| = 2^(2|V|) entries
– The matrix can be multiplied in time polynomial in its size, but that size forces the overall algorithm to run in time exponential in |V|
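The Markov-chain idea can be sketched in a few lines. This is not the authors' implementation: it assumes that every active vertex independently retries each outgoing edge at every time step and that vertices never deactivate, and the names `transition_matrix` and `activation_probability` are made up for the example. States are bitmasks over the vertices 0..n-1.

```python
from itertools import product


def transition_matrix(n, prob):
    """Build the 2^n x 2^n one-step transition matrix.

    prob[(u, v)] is the assumed per-step probability that active u activates v.
    """
    size = 1 << n
    T = [[0.0] * size for _ in range(size)]
    for s in range(size):
        inactive = [v for v in range(n) if not (s >> v) & 1]
        p_on = []  # chance that each inactive vertex turns on this step
        for v in inactive:
            stay_off = 1.0
            for u in range(n):
                if (s >> u) & 1:
                    stay_off *= 1.0 - prob.get((u, v), 0.0)
            p_on.append(1.0 - stay_off)
        # Enumerate every subset of inactive vertices that could switch on.
        for flips in product((0, 1), repeat=len(inactive)):
            t, p = s, 1.0
            for v, flip, q in zip(inactive, flips, p_on):
                if flip:
                    t |= 1 << v
                    p *= q
                else:
                    p *= 1.0 - q
            T[s][t] += p
    return T


def activation_probability(n, prob, seeds, v, k):
    """Probability that vertex v is active after k steps, starting from seeds."""
    T = transition_matrix(n, prob)
    size = 1 << n
    dist = [0.0] * size
    dist[sum(1 << u for u in set(seeds))] = 1.0
    for _ in range(k):  # propagate the state distribution one step at a time
        dist = [sum(dist[s] * T[s][t] for s in range(size)) for t in range(size)]
    return sum(p for s, p in enumerate(dist) if (s >> v) & 1)
```

Because a state can only gain active vertices under these assumptions, every transition goes from a bitmask to a numerically larger or equal one, which is why the matrix comes out upper-triangular. The table alone has 4^n entries, so this sketch is only usable for very small graphs.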

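A Graph and the State Transition Matrix
[Figure: a graph on vertices 0, 1, 2 and its 8 × 8 state transition matrix. Rows and columns are indexed by the set of active vertices: [], [0], [1], [0, 1], [2], [0, 2], [1, 2], [0, 1, 2].]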

Empirical Evidence of Intractability

Wrapping Up the Vertex Activation Problem
– Provide a rigorous analysis of the space and time complexities
– Optimize the matrix calculation and matrix multiplication
– It is easy to determine which state transitions are impossible for our graph, and which states it cannot leave
– Take advantage of the fact that the matrix is upper-triangular

Some (Unexplored) Ideas for Approximating the Vertex Activation Problem
– Instead of using the Vertex Activation Problem to decide how good a set U is, heuristically determine a set of the most influential nodes in the graph; this might be done using standard graph search, path, or spanning tree algorithms
– Simulate the History Sensitive Cascade Model, without paying too much attention to the cyclical nature of the graph
– Use Bayesian networks to solve the Vertex Activation Problem, and determine whether they are easier to simulate
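The simulation idea could take the form of a Monte Carlo estimate. The sketch below is hypothetical: it assumes that each active vertex retries every outgoing edge once per time step and that activations are permanent, and `simulate_hscm` and `estimate_activation` are names invented for the example.

```python
import random


def simulate_hscm(out_probs, seeds, k, rng):
    """Run k time steps of the assumed dynamics and return the active set.

    out_probs[u] maps each out-neighbor v of u to the per-step probability
    that an active u activates v.
    """
    active = set(seeds)
    for _ in range(k):
        newly_active = set()
        for u in active:
            for v, p in out_probs.get(u, {}).items():
                if v not in active and rng.random() < p:
                    newly_active.add(v)
        active |= newly_active  # activations take effect at the end of the step
    return active


def estimate_activation(out_probs, seeds, v, k, trials=10000, seed=0):
    """Estimate the probability that v is active by step k from random trials."""
    rng = random.Random(seed)
    hits = sum(v in simulate_hscm(out_probs, seeds, k, rng) for _ in range(trials))
    return hits / trials
```

Each trial costs time proportional to k times the number of edges, so the estimate is cheap, but it only comes with a statistical error bound rather than the worst-case guarantee asked for in the approximation problem.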

Approximating the Optimization Problems
The solutions we have in mind depend on our being able to determine how good a proposed solution U is (U is a subset of V).
– Hopefully we will be able to do this with our approximation to the Vertex Activation Problem; otherwise we might use a heuristic as described before
Given this, we hope to explore several strategies for calculating U:
– Algorithms that greedily add vertices to U
– Hill-climbing and simulated-annealing algorithms
– A genetic algorithm
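The greedy strategy can be sketched independently of how a candidate set is scored. Here `score` is a placeholder for any evaluation of U (for example, the number of Θ-certain vertices at step k); the function name and signature are assumptions for illustration.

```python
def greedy_seed_set(vertices, m, score):
    """Greedily build a seed set of at most m vertices.

    At each round, add the vertex whose inclusion gives the highest score.
    `score(candidate_list)` is assumed to return a number, higher is better.
    """
    chosen = []
    remaining = list(vertices)
    for _ in range(min(m, len(remaining))):
        best = max(remaining, key=lambda v: score(chosen + [v]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Each round calls `score` once per remaining vertex, so the total number of evaluations is at most m·|V|; the cost of the evaluator therefore dominates the running time.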

Proposed Experiment Domain
– Difficult to test
– Need two datasets: feed the initial state of the network to the algorithm and compare against the final state
– The Vertex Activation Problem is NP-complete: the approximation algorithm will not fully reflect the expressive power of the model

Proposed Experiment Domain: Simulation
– Test approximations against optimal predictions (Kempe et al., Maximizing the Spread of Influence through a Social Network)

Proposed Experiment Domain: Comparison of HSCM with Collected Data
The arXiv database
– Contains citations between scientific papers
– Probability of a certain author being cited at a given point, depending on the set of all others he cited and who cited him
A keyboard
– The keys you press influence which keys you will press next
– Interesting optimization problems: Dvorak vs. QWERTY, etc.

Timeline
– End of next week: whole system up and running, using the exponential-time algorithm
– In three weeks: approximation of the Vertex Activation Problem
– In four weeks: genetic algorithm to approximate the Optimization Problem
– In five weeks: other ways to approximate the Optimization Problem

Conclusion
– Novel research
– We understand the problem, but maybe not in its whole complexity
– A venture into algorithm design: we have not had much experience in this
– We will learn a lot, even if the goal is not reached
– Algorithms + AI: approximation techniques + applications of the model (future work)

Thank you for listening
We will take questions