A Local Facility Location Algorithm

Denis Krivitski
Supervisor: Assaf Schuster
Technion – Israel Institute of Technology

Outline

1. Introduction
2. Related Work
3. Prerequisites
4. Local Majority Voting
5. Distributed Facility Location
6. Speculative Execution
7. Local ArgMin
8. Experimental Results

Large-Scale Distributed Systems

LSD – Large-Scale Distributed

- Peer-to-peer file sharing networks
  – eMule: 2-3 million peers
  – Skype: 5,801,651 peers (as of 4/4/06, 16:14)
- Grid systems
  – EGEE: 10,000 CPUs, 10 petabytes of storage
- Wireless sensor networks

Computing in LSD Systems is Difficult

- Global synchronization is impossible
  – Synchronization would be needed after each iteration
- The input constantly changes
  – It is hard to keep a large system static
- Failures are frequent
  – If a PC fails once a week, a system with a million PCs will see about 2 failures every second (10^6 failures per week / 604,800 seconds per week ≈ 1.7 per second)
- And of course, scalability is necessary

Current state of LSD computing

- Embarrassingly parallel tasks
  – Many interesting problems are not embarrassingly parallel
  – Used in current grid systems
- Data storage and retrieval
  – No computation here
  – Used in current peer-to-peer systems

Desired state of LSD computing

We want to be able to solve more elaborate problems:
- Data mining
- Optimization problems

In this research we solve the facility location problem in LSD systems.

The Facility Location Problem

We are given:
- A set of facilities
- A set of clients
- A cost function

We need to choose:
- Which facilities to open
- Which facility serves each client

such that the total cost is minimized.
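
To make the objective concrete, here is a minimal Python sketch of the cost being minimized, assuming the standard uncapacitated formulation; the names configuration_cost, facility_cost, and service_cost are illustrative, not taken from the talk:

```python
# Minimal sketch of the facility location objective (uncapacitated setting,
# assumed); opening cost of every open facility, plus each client's cost of
# being served by its cheapest open facility.

def configuration_cost(open_facilities, clients, facility_cost, service_cost):
    opening = sum(facility_cost[f] for f in open_facilities)
    service = sum(min(service_cost[f][c] for f in open_facilities)
                  for c in clients)
    return opening + service

# Two candidate facilities, three clients.
facility_cost = {"f1": 10, "f2": 12}
service_cost = {"f1": {"c1": 1, "c2": 5, "c3": 9},
                "f2": {"c1": 8, "c2": 2, "c3": 1}}
clients = ["c1", "c2", "c3"]
print(configuration_cost({"f1"}, clients, facility_cost, service_cost))        # 10 + (1+5+9) = 25
print(configuration_cost({"f1", "f2"}, clients, facility_cost, service_cost))  # 22 + (1+2+1) = 26
```

Here opening only f1 is the cheaper configuration: the second facility's opening cost outweighs the service cost it saves.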

Related Work – Data Mining

Most distributed data mining algorithms were designed for small systems:
- Extensive use of global synchronization
- Do not tolerate failures

Meta-learning:
- No synchronization, and tolerates failures
- Result quality decreases with the number of nodes

Related Work – LSD Computing Approaches

Gossip:
- Random walk based
- Asymptotically converges to the exact result with high probability

Local algorithms (our approach):
- Eventually achieve the exact result

What is a local algorithm?

- The output of each node depends only on the data of a group of neighbours
- Eventual correctness is guaranteed
- The size of the group may depend on the problem at hand

Local vs. Centralized

- Local: 2 link delays, independent of the network size
- Centralized: 16 link delays, equal to the network diameter (in the slide's example network)

Local Facility Location Architecture

[Architecture figure: the algorithm builds on local majority voting, which was proposed by Wolff and Schuster in ICDM and used in a local association rule mining algorithm; the remaining layers, labeled "our contribution" and "our extension" on the slide, are built on top of it.]

Majority Voting

Each node holds a poll with votes for the red or the green party. Each node wants to know which party won the election.

Global constants:
- λ – majority threshold
- γ – bias

Input of node u:
- c_u – the number of local votes
- s_u – the number of local red votes
- G_u – a set of neighbors

Output of node u:
- true if the inequality Σ_u s_u − λ·Σ_u c_u ≥ γ holds
- false otherwise

The input is free to change. An ad-hoc output is always available; its accuracy gradually increases, and eventually it becomes exact.
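
As an illustration, the global predicate the protocol converges to fits in one line; the placement of the bias γ in the inequality is an assumption reconstructed from the definitions above:

```python
# Sketch of the global condition that local majority voting converges to.
# The exact placement of the bias gamma is an assumption, reconstructed
# from the slide's definitions of lambda, gamma, c_u, and s_u.

def global_majority(s, c, lam, gamma):
    """s[u]: red votes at node u; c[u]: total votes at node u.
    True iff the red votes clear the threshold lambda, with bias gamma."""
    return sum(s) - lam * sum(c) >= gamma

# Three nodes holding 7/10, 4/10, and 6/10 red votes; threshold 1/2, no bias.
print(global_majority([7, 4, 6], [10, 10, 10], 0.5, 0))  # True: 17 - 15 >= 0
```

Each node approximates this global test using only messages exchanged with its neighbors in G_u.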

Distributed Facility Location

Global constants:
- M – a set of possible facility locations
- the facility cost

Input of node u:
- DB_u – a set of clients local to node u
- G_u – a set of neighbors
- the service cost

Output of node u:
- C_u – a set of open facilities such that Cost(C_u) is minimal

The input is free to change. An ad-hoc output is always available; its accuracy gradually increases, and eventually it becomes exact.

Finding the optimal solution

- Facility location is NP-hard
- We use a hill-climbing heuristic
  – In this case, hill climbing provides a factor-3 approximation
- In each step we move, open, or close one facility
- We stop when the cost no longer improves
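
A minimal, centralized sketch of this loop follows; in the actual algorithm every cost comparison is carried out distributively via the majority voting and ArgMin primitives, and the neighbor set here omits facility moves for brevity:

```python
# Centralized sketch of the hill-climbing loop (the distributed protocol
# replaces each cost comparison with a majority vote / ArgMin).

def hill_climb(initial, neighbors, cost):
    """neighbors(C) yields the configurations reachable from C by
    opening, closing, or moving a single facility."""
    current = initial
    while True:
        best = min(neighbors(current), key=cost, default=current)
        if cost(best) >= cost(current):  # no improving step: stop
            return current
        current = best

def neighbors(C, M=frozenset({"f1", "f2", "f3"})):
    """Open one closed facility or close one open facility
    (facility moves, i.e. swaps, are omitted for brevity)."""
    for f in M - C:
        yield C | {f}
    for f in C:
        yield C - {f}
```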

Choosing the next step configuration

[Figure: the current configuration C0 and candidate next configurations C1 through C5. The configurations themselves are known to every node, while the data needed to evaluate them is distributed over the whole network.]

The local majority voting algorithm can be used to compare the costs of two configurations.

Comparing two configurations

Each node votes in favor of one configuration or the other. The configuration that wins the election has a lower cost than the other.

[Slide formulas: the number of green votes and the number of red votes of node u, computed from the local costs of C1 and C2, together with the associated global constants.]
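
The slide's exact vote-count formulas did not survive the transcript, so the following is only an illustrative stand-in: a node favors whichever configuration is cheaper on its own clients, and the global majority then picks the winner:

```python
# Illustrative stand-in for a node's vote between two configurations;
# the slide's actual vote-count formulas are not reproduced here.

def local_vote(node_clients, C1, C2, local_cost):
    """Vote 'green' if C1 is cheaper on this node's own clients,
    'red' otherwise; the global majority decides between C1 and C2."""
    if local_cost(C1, node_clients) <= local_cost(C2, node_clients):
        return "green"
    return "red"
```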

Why the ArgMin?

- Finding the best next step out of k possible options using majority votes would require O(k^2) comparisons
- The local ArgMin algorithm does it in O(k) comparisons on average
- The internals of the ArgMin are described at the end

The ArgMin interface

Global constants:
- B – the bias vector

Input of node u:
- A_u – the addendum vector
- G_u – a set of neighbors

Output of node u:
- The index i for which B[i] + Σ_u A_u[i] is minimal

ArgMin is anytime: its output may change, and like majority voting it never terminates!
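
Globally, the primitive therefore computes the following; this sketch assumes the bias vector B and the per-node addendum vectors A_u are combined coordinate-wise, as the definitions above suggest:

```python
# Sketch of what ArgMin computes globally, assuming B and the per-node
# addendum vectors A_u are summed coordinate-wise.

def global_argmin(B, A):
    """A: list of per-node addendum vectors A_u.
    Returns the index i minimizing B[i] + sum over nodes of A_u[i]."""
    totals = [b + sum(a[i] for a in A) for i, b in enumerate(B)]
    return totals.index(min(totals))

B = [0, 1, 0]                 # public bias vector
A = [[3, 0, 5], [2, 0, 4]]    # addendum vectors of two nodes
print(global_argmin(B, A))    # 1 -- the coordinate totals are [5, 1, 9]
```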

Speculative execution

If we never finish computing the first step, how can we start the second one?

The answer: we make a guess and base the next step on it. If the guess turns out to be wrong, we backtrack and recompute.

Every node speculates

1. Eventually, the first iteration will converge to the exact result, which will be the same at every node.
2. Then the second iteration will be able to converge, and so on, until all iterations are exact.
3. When all iterations are exact, every node outputs the exact solution.
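
A toy sketch of the backtracking this implies at a single node; this is an illustrative stand-in, not the paper's actual bookkeeping:

```python
# Toy sketch of speculation at one node: keep a chain of per-iteration
# results; when an earlier iteration's ad-hoc output changes, rebuild
# everything above it. Illustrative only -- not the paper's protocol.

def backtrack_and_recompute(chain, first_changed, run_step):
    """chain[i] holds the (possibly speculative) result of iteration i;
    redo every iteration after `first_changed` on the corrected base."""
    for i in range(first_changed + 1, len(chain)):
        chain[i] = run_step(chain[i - 1])
    return chain
```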

ArgMin internals

- ArgMin uses majority votes to compare pairs of vector elements
- ArgMin is also speculative
- In every iteration, each configuration is compared to a pivot

Experimental Results

The number of messages each node sends does not depend on the network size.

A majority of the nodes provide the exact result even if the input continuously changes.

Conclusions

We have described a new facility location algorithm suitable for large-scale distributed systems. The algorithm is scalable, communication-efficient, and able to withstand failures efficiently.

The End

Special thanks to Ran Wolff for his help in supervising this research.