Kijung Shin, Jisu Kim, Bryan Hooi, Christos Faloutsos

Slides:



Advertisements
Similar presentations
Request Dispatching for Cheap Energy Prices in Cloud Data Centers
Advertisements

SpringerLink Training Kit
Luminosity measurements at Hadron Colliders
From Word Embeddings To Document Distances
Choosing a Dental Plan Student Name
Virtual Environments and Computer Graphics
Chương 1: CÁC PHƯƠNG THỨC GIAO DỊCH TRÊN THỊ TRƯỜNG THẾ GIỚI
THỰC TIỄN KINH DOANH TRONG CỘNG ĐỒNG KINH TẾ ASEAN –
D. Phát triển thương hiệu
NHỮNG VẤN ĐỀ NỔI BẬT CỦA NỀN KINH TẾ VIỆT NAM GIAI ĐOẠN
Điều trị chống huyết khối trong tai biến mạch máu não
BÖnh Parkinson PGS.TS.BS NGUYỄN TRỌNG HƯNG BỆNH VIỆN LÃO KHOA TRUNG ƯƠNG TRƯỜNG ĐẠI HỌC Y HÀ NỘI Bác Ninh 2013.
Nasal Cannula X particulate mask
Evolving Architecture for Beyond the Standard Model
HF NOISE FILTERS PERFORMANCE
Electronics for Pedestrians – Passive Components –
Parameterization of Tabulated BRDFs Ian Mallett (me), Cem Yuksel
L-Systems and Affine Transformations
CMSC423: Bioinformatic Algorithms, Databases and Tools
Some aspect concerning the LMDZ dynamical core and its use
Bayesian Confidence Limits and Intervals
实习总结 (Internship Summary)
Current State of Japanese Economy under Negative Interest Rate and Proposed Remedies Naoyuki Yoshino Dean Asian Development Bank Institute Professor Emeritus,
Front End Electronics for SOI Monolithic Pixel Sensor
Face Recognition Monday, February 1, 2016.
Solving Rubik's Cube By: Etai Nativ.
CS284 Paper Presentation Arpad Kovacs
انتقال حرارت 2 خانم خسرویار.
Summer Student Program First results
Theoretical Results on Neutrinos
HERMESでのHard Exclusive生成過程による 核子内クォーク全角運動量についての研究
Wavelet Coherence & Cross-Wavelet Transform
yaSpMV: Yet Another SpMV Framework on GPUs
Creating Synthetic Microdata for Higher Educational Use in Japan: Reproduction of Distribution Type based on the Descriptive Statistics Kiyomi Shirakawa.
MOCLA02 Design of a Compact L-­band Transverse Deflecting Cavity with Arbitrary Polarizations for the SACLA Injector Sep. 14th, 2015 H. Maesaka, T. Asaka,
Hui Wang†*, Canturk Isci‡, Lavanya Subramanian*,
Fuel cell development program for electric vehicle
Overview of TST-2 Experiment
Optomechanics with atoms
داده کاوی سئوالات نمونه
Inter-system biases estimation in multi-GNSS relative positioning with GPS and Galileo Cecile Deprez and Rene Warnant University of Liege, Belgium  
ლექცია 4 - ფული და ინფლაცია
10. predavanje Novac i financijski sustav
Wissenschaftliche Aussprache zur Dissertation
FLUORECENCE MICROSCOPY SUPERRESOLUTION BLINK MICROSCOPY ON THE BASIS OF ENGINEERED DARK STATES* *Christian Steinhauer, Carsten Forthmann, Jan Vogelsang,
Particle acceleration during the gamma-ray flares of the Crab Nebular
Interpretations of the Derivative Gottfried Wilhelm Leibniz
Advisor: Chiuyuan Chen Student: Shao-Chun Lin
Widow Rockfish Assessment
SiW-ECAL Beam Test 2015 Kick-Off meeting
On Robust Neighbor Discovery in Mobile Wireless Networks
Chapter 6 并发:死锁和饥饿 Operating Systems: Internals and Design Principles
You NEED your book!!! Frequency Distribution
Y V =0 a V =V0 x b b V =0 z
Fairness-oriented Scheduling Support for Multicore Systems
Climate-Energy-Policy Interaction
Hui Wang†*, Canturk Isci‡, Lavanya Subramanian*,
Ch48 Statistics by Chtan FYHSKulai
The ABCD matrix for parabolic reflectors and its application to astigmatism free four-mirror cavities.
Measure Twice and Cut Once: Robust Dynamic Voltage Scaling for FPGAs
Online Learning: An Introduction
Factor Based Index of Systemic Stress (FISS)
What is Chemistry? Chemistry is: the study of matter & the changes it undergoes Composition Structure Properties Energy changes.
THE BERRY PHASE OF A BOGOLIUBOV QUASIPARTICLE IN AN ABRIKOSOV VORTEX*
Quantum-classical transition in optical twin beams and experimental applications to quantum metrology Ivano Ruo-Berchera Frascati.
The Toroidal Sporadic Source: Understanding Temporal Variations
FW 3.4: More Circle Practice
ارائه یک روش حل مبتنی بر استراتژی های تکاملی گروه بندی برای حل مسئله بسته بندی اقلام در ظروف
Decision Procedures Christoph M. Wintersteiger 9/11/2017 3:14 PM
Limits on Anomalous WWγ and WWZ Couplings from DØ
Presentation transcript:

Kijung Shin, Jisu Kim, Bryan Hooi, Christos Faloutsos Think before You Discard: Accurate Triangle Counting in Graph Streams with Deletions Kijung Shin, Jisu Kim, Bryan Hooi, Christos Faloutsos

Triangles in a Graph Graphs are everywhere! Introduction Algorithm Experiments Problem Conclusion Triangles in a Graph Graphs are everywhere! social networks, the web, citation networks Triangles are a fundamental primitive 3 nodes connected to each other Counting triangles has many applications community detection, anomaly detection, query optimization Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Application: Anomaly Detection Introduction Algorithm Experiments Problem Conclusion Application: Anomaly Detection [KMF11] [LJK18] # Incident Triangles # Incident Triangles Telemarketer Degree Degree Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Remaining Challenges Counting triangles in real-world graphs Introduction Algorithm Experiments Problem Conclusion Remaining Challenges Counting triangles in real-world graphs Real-world graphs are Large: not fitting in main memory Fully dynamic: both growing and shrinking online social networks Web Citation networks Call networks Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Previous Work - Given: a large and fully-dynamic graph Introduction Algorithm Experiments Problem Conclusion Previous Work - Given: a large and fully-dynamic graph - Goal: accurately estimate the count of triangles Large Graph Fully dynamic Graph Accurate Insertion Deletion MASCOT [LJK18] Triest-IMPR [DERU17] WRS [Shi17] ESD [HS17] Triest-FD [DERU17] ~𝟒× ~𝟒× ThinkD (Proposed) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Our Contributions We propose ThinkD (Think before You Discard) Introduction Algorithm Experiments Problem Conclusion Our Contributions We propose ThinkD (Think before You Discard) Fast and Accurate: outperforming competitors Scalable: linear data scalability Theoretically Sound: unbiased estimates Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Road Map Problem Definition Proposed Method: ThinkD Experiments Conclusions Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Time 𝑡 1 Change (given) +(𝑎,𝑏) Graph (unmate-rialized) 𝑎 𝑏 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Time 𝑡 1 2 Change (given) +(𝑎,𝑏) +(𝑎,𝑐) Graph (unmate-rialized) 𝑐 𝑎 𝑏 𝑎 𝑏 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Time 𝑡 1 2 3 Change (given) +(𝑎,𝑏) +(𝑎,𝑐) +(𝑏,𝑐) Graph (unmate-rialized) 𝑐 𝑐 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Time 𝑡 1 2 3 4 Change (given) +(𝑎,𝑏) +(𝑎,𝑐) +(𝑏,𝑐) −(𝑎,𝑏) Graph (unmate-rialized) 𝑐 𝑐 𝑐 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Time 𝑡 1 2 3 4 5 … Change (given) +(𝑎,𝑏) +(𝑎,𝑐) +(𝑏,𝑐) −(𝑎,𝑏) +(𝑏,𝑑) Graph (unmate-rialized) 𝑐 𝑐 𝑐 𝑐 𝑑 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 𝑎 𝑏 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Fully Dynamic Graph Stream Introduction Algorithm Experiments Problem Conclusion Fully Dynamic Graph Stream Model for a large and fully-dynamic graph Discrete time 𝑡, starting from 1 and ever increasing At each time 𝑡, a change in the input graph arrives change: either an insertion or deletion of an edge Examples: … Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Problem Definition Given: Introduction Algorithm Experiments Problem Conclusion Problem Definition Given: a fully-dynamic graph stream (possibly infinite) memory space (finite) Estimate: the counts of global and local triangles To Minimize: estimation error at each time 𝑡 Time 𝑡 1 2 3 4 5 … Changes +(𝑎,𝑏) +(𝑎,𝑐) +(𝑏,𝑐) −(𝑎,𝑏) +(𝑏,𝑑) # Triangles Given Estimate Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Problem Definition (cont.) Introduction Algorithm Experiments Problem Conclusion Problem Definition (cont.) Global Triangles: all triangles in the graph Local triangles: the triangles incident to each node 3 2 1 4 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Road Map Problem Definition Proposed Method: ThinkD << Experiments Conclusions Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Overview of ThinkD Maintains and updates ∆ Introduction Algorithm Experiments Problem Conclusion Overview of ThinkD Maintains and updates ∆ Number of (non-deleted) triangles that it has observed How it processes an insertion: test arrive count store Yes No - arrive: an insertion of an edge arrives - count: count new triangles and increase ∆ - test: toss a coin - store: store the edge in memory Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Overview of ThinkD (cont.) Introduction Algorithm Experiments Problem Conclusion Overview of ThinkD (cont.) Maintains and updates ∆ Number of (non-deleted) triangles that it has observed How it processes a deletion: test arrive count delete Yes No - arrive: a deletion of an edge arrives - count: count deleted triangles and decrease ∆ - test: test whether the edge is stored in memory - delete: delete the edge in memory Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Comparison with Triest-FD Introduction Algorithm Experiments Problem Conclusion Comparison with Triest-FD ThinkD (Think before You Discard): every arrived change is used to update ∆ test arrive count: update ∆ store / delete Yes No (discard) Triest-FD [DERU17]: some changes are discarded without being used to update ∆ test arrive count: update ∆ store / delete Yes No (discard) → information loss! Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Two Versions of ThinkD Q1: How to test in the test step Introduction Algorithm Experiments Problem Conclusion Two Versions of ThinkD Q1: How to test in the test step Q2: How to estimate the count of all triangles from ∆ ThinkD-FAST: simple and fast using independent Bernoulli trials ThinkD-ACC: accurate and parameter-free using random pairing [GLH08] Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Details Details Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Details ∆ : number of (non-deleted) triangles observed so far Toy example: - time : 𝑡−1 - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Arrive Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Arrive Step A new change in the input graph arrives change: either insertion or deletion of an edge test arrive count store / delete Yes No - time : 𝑡 - change: +(𝑢,𝑣) - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Count Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Count Step Count added or deleted triangles and update ∆ test arrive count store / delete Yes No - time : 𝑡 - change: +(𝑢,𝑣) - count : ∆ - # added triangles: 2 - memory : +=2 𝑢 𝑢 (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) 𝑥 𝑣 𝑦 𝑣 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Test Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Test Step Simulate a Bernoulli trial with probability 𝑝 𝑝 is an input parameter test arrive count store / delete Yes No - time : 𝑡 - change: +(𝑢,𝑣) - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Store Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Store Step Store the new edge in memory test arrive count store / delete Yes No - time : 𝑡 - change: +(𝑢,𝑣) - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) (𝑢,𝑣) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Arrive Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Arrive Step A new change in the input graph arrives change: either insertion or deletion of an edge test arrive count store / delete Yes No - time : 𝑡+1 - change: −(𝑣,𝑥) - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) (𝑢,𝑣) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Count Step Introduction Algorithm Experiments Problem Conclusion Details ThinkD-FAST: Count Step Count added or deleted triangles and update ∆ test arrive count store / delete Yes No - time : 𝑡+1 - change: −(𝑣,𝑥) - count : ∆ - # deleted triangles: 1 - memory : −=1 𝑢 (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) (𝑢,𝑣) 𝑥 𝑣 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Test Step Introduction Algorithm Experiments Problem Conclusion Analysis Details ThinkD-FAST: Test Step Test if the arrived edge is in memory test arrive count store / delete Yes No - time : 𝑡+1 - change: −(𝑣,𝑥) - count : ∆ - memory : (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) (𝑢,𝑣) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-FAST: Store Step Introduction Algorithm Experiments Problem Conclusion Analysis Details ThinkD-FAST: Store Step Remove the arrived edge from memory test arrive count store / delete Yes No - time : 𝑡+1 - change: −(𝑣,𝑥) - count : ∆ - memory : (𝑢,𝑥) (𝑢,𝑦) (𝑣,𝑦) (𝑣,𝑥) (𝑢,𝑣) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Unbiased Estimation 𝔼 ∆ 𝒑 𝟐 =𝜟 ∆ : count of observed triangles Introduction Algorithm Experiments Problem Conclusion Unbiased Estimation ∆ : count of observed triangles ∆ 𝒑 𝟐 : estimated count of all triangles ∆: true count of all triangles [ Theorem 1 ] At any time 𝒕, Proof and a variance of ∆ / p 2 : see the paper 𝔼 ∆ 𝒑 𝟐 =𝜟 Unbiased estimate of 𝜟 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Disadvantages of ThinkD-FAST Introduction Algorithm Experiments Problem Conclusion Disadvantages of ThinkD-FAST Parameter: setting the parameter 𝑝 is non-trivial small 𝑝→ underutilize memory → inaccurate estimation large 𝑝→ out-of-memory error Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Disadvantages of ThinkD-FAST (cont.) Introduction Algorithm Experiments Problem Conclusion Disadvantages of ThinkD-FAST (cont.) Information loss: it may discard inserted edges even when memory is not full test arrive count store / delete Yes No - time : 𝑡 - change: +(𝑢,𝑣) - estimate : ∆ (𝑢,𝑥) (𝑣,𝑥) (𝑢,𝑦) (𝑣,𝑦) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

ThinkD-ACC: Even Better! Introduction Algorithm Experiments Problem Conclusion ThinkD-ACC: Even Better! Random pairing [RLH08] instead of Bernoulli trials Advantages: no parameter less information loss: utilizes memory as fully as possible → more accurate estimation still unbiased Disadvantages: complicated → slower than ThinkD-FAST Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Scalability of ThinkD 𝑂(𝑘⋅𝑡) 𝑂(𝑘⋅𝑡) Let 𝑘 be the size of memory Introduction Algorithm Experiments Problem Conclusion Scalability of ThinkD Let 𝑘 be the size of memory For processing 𝑡 changes in the input stream, [ Theorem 2 ] ThinkD-ACC takes [ Theorem 3 ] ThinkD-FAST with 𝑝=𝑂 𝑘 𝑡 takes 𝑂(𝑘⋅𝑡) linear in data size 𝑂(𝑘⋅𝑡) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Advantages of ThinkD Fast and Accurate: outperforming competitors Introduction Algorithm Experiments Problem Conclusion Advantages of ThinkD Fast and Accurate: outperforming competitors Scalable: linear data scalability (Theorems 2 & 3) Theoretically Sound: unbiased estimates (Theorem 1) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Road Map Problem Definition Proposed Method: ThinkD Experiments << Conclusions Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Experimental Settings Introduction Algorithm Experiments Problem Conclusion Experimental Settings Competitors: Triest-FD [DERU17] & ESD [HS17] state-of-the-art algorithms for triangle counting in fully-dynamic graph streams Implementations: Datasets: insertions (edges in graphs) + deletions (random 20%) ER Synthetic (100B edges) Social Networks (1.8B+ edges, …) Citation (16M+) Web (6M+) Trust (0.7M+) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

EXP1. Bias Analysis [THM1] Introduction Algorithm Experiments Problem Conclusion EXP1. Bias Analysis [THM1] “Does ThinkD give unbiased estimates?” True Count ThinkD-ACC ThinkD-FAST Triest-FD - memory budget: 5% of the edges - dataset: - #repeats: 10,000 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Introduction Algorithm Experiments Problem Conclusion EXP2. Variance Analysis “Does ThinkD maintain estimates with small variance? Triest-FD ThinkD-FAST ThinkD-ACC True Count Number of Processed Changes - memory budget: 5% of the edges - dataset: - #repeats: 10,000 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

EXP3. Scalability [THM 2 & 3] Introduction Algorithm Experiments Problem Conclusion EXP3. Scalability [THM 2 & 3] “Does ThinkD scale linearly with the size of the input stream?” (THM 2 & 3) ThinkD-ACC ThinkD-FAST Linear scalability (slope=1) Number of Changes - dataset: ER - memory budget: fixed to 10 7 (i.e., 𝑝 = size / 10 7 ) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Advantages of ThinkD Fast and Accurate: outperforming competitors Introduction Algorithm Experiments Problem Conclusion Advantages of ThinkD Fast and Accurate: outperforming competitors Scalable: linear data scalability (Theorems 2 & 3) Theoretically Sound: unbiased estimates (Theorem 1) Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Introduction Algorithm Experiments Problem Conclusion EXP4. Space & Accuracy “Is ThinkD more accurate than its best competitors?” Triest-FD ThinkD-FAST Estimation Error (ratio) ESD ThinkD -ACC Memory budget (ratio) - dataset: - # repeats: 1000 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

EXP5. Speed & Accuracy “Is ThinkD faster than its best competitors?” Introduction Algorithm Experiments Problem Conclusion EXP5. Speed & Accuracy “Is ThinkD faster than its best competitors?” ThinkD-ACC ESD Triest-FD Estimation Error (ratio) ThinkD -FAST Running time (Sec) - dataset: - # repeats: 1000 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Advantages of ThinkD Fast and Accurate: outperforming competitors Introduction Algorithm Experiments Problem Conclusion Advantages of ThinkD Fast and Accurate: outperforming competitors Scalable: linear data scalability Theoretically Sound: unbiased estimates Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Road Map Problem Definition Proposed Method: ThinkD Experiments Conclusions << Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Conclusions ThinkD We propose ThinkD Introduction Algorithm Experiments Problem Conclusion Conclusions We propose ThinkD accurately estimates the count of triangles in large and fully-dynamic graphs Fast and Accurate: outperforming competitors Scalable: linear data scalability Theoretically Sound: unbiased estimates ThinkD Download Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

References Introduction Algorithm Experiments Problem Conclusion [RLH08] Rainer Gemulla et al., “Maintaining bounded-size sample synopses of evolving datasets.” The VLDB Journal 2008 [KMF11] U Kang et al., “Spectral analysis for billion-scale graphs: Discoveries and implementation.” PAKDD 2011 [DERU17] Lorenzo De Stefani et al., “TRIÈST: Counting Local and Global Triangles in Fully-Dynamic Streams with Fixed Memory Size.” TKDD 2017 [Shi17] Kijung Shin, “WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams”, ICDM 2017 [HS17] Han, Guyue, and Harish Sethu, "Edge sample and discard: A new algorithm for counting triangles in large dynamic graphs." ASONAM 2017 [LJK18] Yongsub Lim et al., “Memory-efficient and Accurate Sampling for Counting Local Triangles in Graph Streams: From Simple to Multigraphs”, TKDD 2018 Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Backup Slides Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)

Proof of Unbiasedness Proof Sketch: ∆ =# 𝑎𝑙𝑙 𝑎𝑑𝑑𝑒𝑑 𝑡𝑟𝑖𝑎𝑛𝑙𝑔𝑒𝑠 Introduction Algorithm Experiments Problem Conclusion Proof of Unbiasedness Proof Sketch: ∆ =# 𝑎𝑙𝑙 𝑎𝑑𝑑𝑒𝑑 𝑡𝑟𝑖𝑎𝑛𝑙𝑔𝑒𝑠 − #(𝑎𝑙𝑙 𝑟𝑒𝑚𝑜𝑣𝑒𝑑 𝑡𝑟𝑖𝑎𝑛𝑔𝑙𝑒𝑠) ∆ =# 𝑜𝑏𝑠𝑒𝑟𝑣𝑒𝑑 𝑎𝑑𝑑𝑒𝑑 𝑡𝑟𝑖𝑎𝑛𝑙𝑔𝑒𝑠 − #(𝑜𝑏𝑠𝑒𝑟𝑣𝑒𝑑 𝑟𝑒𝑚𝑜𝑣𝑒𝑑 𝑡𝑟𝑖𝑎𝑛𝑔𝑙𝑒𝑠) 𝑝 2 = Prob. that each added triangle is observed 𝑝 2 = Prob. that each removed triangle is observed Thus, 𝐸𝑥𝑝 ∆ =∆⋅ 𝑝 2 ⇔ 𝐸𝑥𝑝 ∆ / 𝑝 2 =Δ Accurate Triangle Counting in Graph Streams with Deletions (by Kijung Shin)