Internet Coordinate Systems
Dr. Laurent Mathy, Computing Department, Lancaster University, UK

Presentation transcript:

Internet Coordinate Systems
Dr. Laurent Mathy
Computing Department, Lancaster University, UK
RESCOM 2007

Aims of the talk
Review the main Internet Coordinate Systems (ICS) and techniques
Discuss the properties of the Internet as a delay space and the resulting embedding issues
Highlight (some) security issues for ICS and approaches to solving them

Why Internet Coordinate Systems?
Many applications, distributed systems and overlays benefit from “network topology awareness”
  – Closest server/neighbour selection
  – Distance ranking (which node is closer?)
  – Network-overlay topology congruence, e.g. CAN
This needs measurements
  – But with potentially high overhead
      – Many nodes to measure against
      – Many different overlays/applications measuring simultaneously
  – Especially as distance changes need to be tracked
  – “Ping storms” on PlanetLab

Why Internet Coordinate Systems? (2)
Luckily, delays (RTTs) are statistically constant and predictable
  – Constant at least on the order of several minutes
  – Changes mostly appear as sporadic “level shifts”
  – Predictable within 20% of the real value, 95% of the time
The idea is to map (“embed”) the Internet delay space onto an appropriate metric space so that
  – Each node's coordinate is computed/tracked via sample measurements to a small number of nodes
  – The distance between any 2 nodes can be estimated without the need for further measurements
Advantages
  – Low distance estimation/computation overhead
  – Low “full-mesh” distance communication overhead: O(k*d) vs O(k^2), with k nodes and d dimensions
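
To make the O(k*d) vs O(k^2) comparison concrete, here is a small arithmetic sketch. The function names are illustrative, and the choice of k = 1000 nodes and d = 7 dimensions is an assumption made only for the example.

```python
def full_mesh_values(k: int) -> int:
    """Number of distinct pairwise RTT values in a full mesh of k nodes."""
    return k * (k - 1) // 2


def coordinate_values(k: int, d: int) -> int:
    """Number of scalar values needed if every node publishes a d-dimensional coordinate."""
    return k * d


k, d = 1000, 7  # illustrative sizes, not taken from the slides
print(full_mesh_values(k))      # 499500 pairwise RTTs to measure and exchange
print(coordinate_values(k, d))  # 7000 coordinate components to exchange
```

With these assumed sizes, exchanging coordinates takes roughly 70 times fewer values than exchanging the full RTT matrix.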

Relative positioning without ICS
Not all relative positioning problems need coordinates
Binning
  – Measure distances to a set of landmarks (8 to 15)
  – Order the landmarks by increasing RTT to get the bin “Id”
  – Rationale: nodes close to each other will see similar RTTs to the landmarks and end up in the same bin
  – Improvement: add “range levels” to bins
      – E.g. ]0, 100] ms = level 0; ]100, 200] ms = level 1; >200 ms = level 2
[Figure: nodes and landmarks L1, L2, L3, with example bin Ids L1L3L2:012 and L2L3L1:002]
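
The binning steps above can be sketched in a few lines. This is a minimal illustration, not the original implementation: the function name `bin_id`, the dictionary input format and the sample RTT values are assumptions; only the level boundaries (100 ms and 200 ms) and the Id format come from the slide.

```python
def bin_id(rtts_ms, level_bounds=(100.0, 200.0)):
    """Build a bin Id such as 'L1L3L2:012' from landmark RTTs.

    rtts_ms maps a landmark name to the measured RTT in milliseconds.
    level_bounds are the upper bounds of levels 0, 1, ...; anything above
    the last bound falls into the final level.
    """
    # Order landmarks by increasing RTT.
    ordered = sorted(rtts_ms, key=rtts_ms.get)

    def level(rtt):
        for lvl, bound in enumerate(level_bounds):
            if rtt <= bound:
                return lvl
        return len(level_bounds)

    levels = "".join(str(level(rtts_ms[name])) for name in ordered)
    return "".join(ordered) + ":" + levels


# Hypothetical measurements chosen to reproduce the two Ids on the slide.
print(bin_id({"L1": 40, "L2": 230, "L3": 150}))  # L1L3L2:012
print(bin_id({"L1": 250, "L2": 20, "L3": 60}))   # L2L3L1:002
```

Two nodes are then considered close when their bin Ids match, with no coordinate computation involved.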

ICS Embedding Principles
The main goal is to embed the Internet delay (RTT) space in a metric space to allow easy distance estimation
  – Metric space: given D(a,b), the distance function between a and b
      – (anti-reflexivity) D(a,b) = 0 iff a = b
      – (symmetry) D(a,b) = D(b,a)
      – (triangle inequality) D(a,b) <= D(a,c) + D(c,b)
  – An embedding is a mapping from one metric space to another
  – For now, ignore the fact that the Internet delay space is not metric…
Goodness-of-embedding metric: relative absolute error
  – d(a,b) is the estimated distance (= D(a,b))
  – δ(a,b) is the (real) measured distance
  – Relative error: |d(a,b) - δ(a,b)| / δ(a,b)
  – This will be directly or indirectly minimized
  – Stabilisation of relative errors is often equated with system convergence
      – But bear in mind that in some pathological cases, errors may stabilise while the system is in chaos!
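
As a small illustration of the relative error defined above, the sketch below estimates a distance from two coordinates and compares it with a measured RTT. It assumes a Euclidean target space; the coordinates and the measured value are made up for the example.

```python
import math


def estimated_distance(coord_a, coord_b):
    """d(a,b): Euclidean distance between two coordinates."""
    return math.dist(coord_a, coord_b)


def relative_error(estimated, measured):
    """|d(a,b) - delta(a,b)| / delta(a,b), with delta the measured RTT."""
    return abs(estimated - measured) / measured


d_ab = estimated_distance((0.0, 0.0), (3.0, 4.0))  # 5.0
print(relative_error(d_ab, 4.0))                   # 0.25, i.e. 25% off
```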

Global Network Positioning (GNP)
The pioneering ICS
Infrastructure-based: uses landmarks
[Figure: landmarks L1, L2, L3 and hosts placed in a 2-D space with axes x and y, at coordinates (x1, y1) to (x4, y4)]

GNP (2)
Goal: find coordinates so that the overall error between measured distances and estimated distances is minimized
Embedding in 2 phases, based on multi-dimensional global minimization
Phase 1
  – Computed centrally, from full-mesh measurements between the landmarks
  – Minimize [objective function not preserved in the transcript]
  – where ε(.) is an error measurement function, e.g. [example not preserved in the transcript]
Phase 2
  – Minimize [objective function not preserved in the transcript]
Solved with the simplex downhill method
  – Should find the global minimum, but risks getting stuck in a local minimum
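
The two objective functions were images on the slide and did not survive in the transcript. As a hedged reconstruction, the following follows the formulation in the published GNP paper by Ng and Zhang rather than the slide itself, rewritten with the δ (measured) and d (estimated) notation used earlier in this talk. The symbols c (a coordinate), N (number of landmarks) and H (an ordinary host) are introduced here for illustration, and the squared error is only one possible choice for ε.

```latex
\begin{align}
f_{\mathrm{obj1}}(c_{L_1},\dots,c_{L_N})
  &= \sum_{1 \le j < i \le N} \varepsilon\bigl(\delta(L_i,L_j),\, d(L_i,L_j)\bigr)
  && \text{(phase 1: landmarks)} \\
\varepsilon(\delta, d) &= (\delta - d)^2
  && \text{(simple squared error)} \\
f_{\mathrm{obj2}}(c_H)
  &= \sum_{i=1}^{N} \varepsilon\bigl(\delta(H,L_i),\, d(H,L_i)\bigr)
  && \text{(phase 2: ordinary host } H\text{)}
\end{align}
```

In phase 1 the unknowns are all landmark coordinates at once; in phase 2 the landmark coordinates are fixed and each host solves only for its own coordinate.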

GNP (3)
Landmarks embed more often than normal nodes, for accuracy
For a space with D dimensions, there must be at least D+1 landmarks
A 7-D Euclidean space was found to provide the best accuracy vs overhead trade-off
  – In practice, 8 to 20 well-placed landmarks are enough
But there is a risk of high measurement overhead at the landmarks
And landmarks represent points of failure
→ Enter GNP’s “derivatives”

Network Positioning System (NPS)
GNP’s little brother
Hierarchical architecture for scalability
Membership servers designate positioned hosts as “reference points” (RPs) when existing landmarks/RPs are congested
The optimum is 3 layers, due to error amplification across layers
[Figure: landmarks L1, L2, L3 at layer 0, with reference points and hosts at layers 1, 2 and 3]

NPS (2)
Landmark positioning is distributed
  – Based on the observation that the GNP objective function F(.) can be rewritten as [equation not preserved in the transcript]
  – Have each landmark minimize its “corresponding” term
Better accuracy when all landmarks reposition at roughly the same time
  – When a change in RTT is detected, a landmark triggers the others to reposition with a special probe
Malicious reference point detection
  – On embedding, a node computes its relative error E_Ri to each of its reference points R_i
  – It eliminates the RP with the maximum relative error if max_i(E_Ri) > 0.01 and max_i(E_Ri) > C × median_i(E_Ri)
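
The elimination rule above is easy to express directly. The sketch below is an illustration under assumptions: the function name, the dictionary input and the sample errors are made up, and the value C = 4 is a placeholder since the slide does not give the constant.

```python
from statistics import median


def reference_point_to_drop(rel_errors, c=4.0, floor=0.01):
    """Return the reference point to eliminate, or None.

    rel_errors maps a reference point name to the relative error E_Ri
    observed against it. The worst point is dropped only if its error
    exceeds both the absolute floor and c times the median error.
    c=4.0 is an assumed value; the slide only calls it C.
    """
    worst = max(rel_errors, key=rel_errors.get)
    worst_err = rel_errors[worst]
    if worst_err > floor and worst_err > c * median(rel_errors.values()):
        return worst
    return None


print(reference_point_to_drop({"R1": 0.05, "R2": 0.04, "R3": 0.06, "R4": 0.9}))  # R4
print(reference_point_to_drop({"R1": 0.05, "R2": 0.04, "R3": 0.06}))             # None
```

Because the threshold is relative to the median, the test weakens as more of the reference points misbehave, which is the limitation raised later in the security slides.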

Practical Internet Coordinates (PIC)
Kind of an infrastructureless NPS!
  – No more points of failure!
The idea is that any node with a computed coordinate can be used as an RP/landmark
  – Again, D dimensions need at least D+1 RPs
  – If there are not enough nodes in the system yet, just work in a lower-dimensional space
  – Better results when using roughly ½ close nodes and ½ randomly chosen nodes
      – Hey, how do you know which nodes are close, as you’ve just arrived?
      – Do a first embedding with only random landmarks, then pick close neighbours based on these rough coordinates, and start again!
PIC has malicious node detection based on the triangle inequality property

Lighthouse
Any node can be a landmark
Pick any D+1 nodes for a D-dimensional space, and use them as a local basis
Local bases are usually oblique
  – A node's coordinates therefore depend on oblique projections

Lighthouse (2)
In a local basis [basis definition not preserved in the transcript]
A node's coordinates can be expressed as [equation not preserved in the transcript]
And computed by solving [equation not preserved in the transcript] where [definition not preserved in the transcript]
With this, the full-mesh measurements between the nodes in the local basis, and general triangle formulas, we get the node's coordinates in the local basis

Lighthouse (3)
How do we “reconcile” all those local bases, and all those coordinates?
By a simple change-of-basis operation: given 2 bases [definitions not preserved in the transcript], we have [equation not preserved in the transcript] where [definition not preserved in the transcript]
Pick any local basis as the global one, and have each node maintain the transition matrix from its local basis to the global basis
  – All that’s needed are the coordinates of the (local) lighthouses in the global basis

Vivaldi
Main peer-to-peer based proposal (no infrastructure)
Based on the simulation of a network of springs
  – Spring between 2 nodes
      – The rest position is the measured distance δ(i,j)
      – If the estimated distance d(i,j) is smaller, the embedding node is pushed away from the other node
      – If the estimated distance d(i,j) is bigger, the embedding node is pulled towards the other node
  – Nodes should attach to about ½ close nodes and ½ far nodes
      – Each node has, say, 32 or 64 neighbours
  – The initial coordinate is the origin

Vivaldi (2)
For stability, don’t overreact if the other node has low confidence in its coordinates, and don’t move too much if you are confident in yours
For convergence, try to move more when you are not confident in your coordinates
→ Each node keeps a “local error”
The local error can be seen as the inverse of the confidence a node has in its coordinates
It is used to compute an adaptive timestep

Vivaldi (3)
Algorithm summary (embedding step for node i, sampling node j):
  – Sample weight, balancing local and remote error: w = e_i / (e_i + e_j)
  – Sample relative error: ε_s = |d(i,j) - δ(i,j)| / δ(i,j)
  – Update local error: e_i = ε_s * c_e * w + e_i * (1 - c_e * w)
  – Compute time step: Δ = c_c * w
  – Update coordinate: x_i = x_i + Δ * (δ(i,j) - d(i,j)) * u(x_i - x_j), where u(.) is the unit vector in the given direction
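
The embedding step above translates almost line for line into code. The following is a minimal sketch rather than a reference implementation: the function name, the tuple-based coordinates, the tuning constants c_e = c_c = 0.25 and the random direction used when two nodes coincide are assumptions made for the example.

```python
import math
import random

C_E = 0.25  # assumed tuning constant c_e for the local error update
C_C = 0.25  # assumed tuning constant c_c for the time step


def vivaldi_step(x_i, e_i, x_j, e_j, rtt):
    """One embedding step for node i after measuring `rtt` to node j.

    x_i, x_j are coordinate sequences; e_i, e_j are local errors (e_i + e_j > 0).
    Returns the new coordinate and new local error of node i.
    """
    d = math.dist(x_i, x_j)              # estimated distance d(i,j)
    w = e_i / (e_i + e_j)                # sample weight
    eps_s = abs(d - rtt) / rtt           # sample relative error
    e_new = eps_s * C_E * w + e_i * (1 - C_E * w)
    delta = C_C * w                      # adaptive time step

    if d > 0:
        u = [(a - b) / d for a, b in zip(x_i, x_j)]
    else:
        # Coincident nodes (e.g. both still at the origin): pick a random direction.
        g = [random.gauss(0, 1) for _ in x_i]
        norm = math.sqrt(sum(v * v for v in g)) or 1.0
        u = [v / norm for v in g]

    x_new = [a + delta * (rtt - d) * c for a, c in zip(x_i, u)]
    return x_new, e_new


# Estimated distance is 5 but the measured RTT is 10, so node i is pushed away from j.
x, e = vivaldi_step((0.0, 0.0), 1.0, (3.0, 4.0), 1.0, rtt=10.0)
print(x, e)
```

In this example the estimate grows from 5 to about 5.625: the node moves only a fraction of the spring displacement, scaled by the time step.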

Internet delay space characteristics
A study by Shavitt et al. has shown that the Internet RTT space most resembles a hyperbolic space
  – This can be approximated by a 2-D Euclidean space augmented with a height vector
      – This is the preferred Vivaldi space
  – The Euclidean component represents the Internet core, with latencies proportional to geographic distances (no congestion)
  – The height vector represents the access link
      – Issue when estimating distances between nodes behind the same access link
But is the Internet delay space a metric space anyway?
  – … NO!
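
A short sketch of distance estimation in the height-vector model may help here. In this model, as used by Vivaldi, the estimated distance is the Euclidean distance in the core plus the heights of both endpoints. The function name, the (x, y, h) tuple layout and the sample values are assumptions for illustration.

```python
import math


def height_distance(a, b):
    """Estimated RTT between two nodes given as (x, y, h) tuples.

    (x, y) is the position in the 2-D Euclidean core; h is the height,
    modelling the delay of the node's access link.
    """
    (ax, ay, ah), (bx, by, bh) = a, b
    return math.hypot(ax - bx, ay - by) + ah + bh


print(height_distance((0.0, 0.0, 5.0), (3.0, 4.0, 10.0)))  # 20.0: core 5 + heights 15

# Two nodes at the same core position: both heights are still added,
# even if the real path between them never crosses the core.
print(height_distance((0.0, 0.0, 5.0), (0.0, 0.0, 10.0)))  # 15.0
```

The second call shows one way to read the issue noted on the slide: for nodes behind the same access link, the model still sums both heights and may overestimate the real RTT.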

Internet delay space characteristics (2)
Is the Internet a metric space?
  – (anti-reflexivity) D(a,b) = 0 iff a = b
      – Holds if the timing facility has high enough resolution
  – (symmetry) D(a,b) = D(b,a)
      – Paths are not symmetrical
      – Holds for a round-trip path metric and a bit of good will
      – That’s why “delay” here always means “RTT”
  – (triangle inequality) D(a,b) <= D(a,c) + D(c,b)
      – Does not hold
      – Estimates are that between 4% and 20% of all Internet paths exhibit Triangle Inequality Violations (TIVs)
The Internet is therefore a quasi-metric space, and embedding it into a metric space will create inaccuracies

Where are TIVs from?
They can have several causes:
Intra-domain routing
  – Intra-domain routing is based on shortest-path routing
  – Discrepancies between actual link delays and link weights can create TIVs
      – Traffic engineering anyone? ;-)
Hot-potato routing
[Figure: routers R1, R2, R3] Example: d(2,3) = 13, d(2,1) = 4, d(1,3) = 8, so d(2,3) > d(2,1) + d(1,3): a TIV!
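
Checking a set of measured RTTs for triangle inequality violations is straightforward. The sketch below is illustrative: the function name and the `frozenset`-keyed dictionary are assumptions, while the three RTT values are the ones from the example on this slide.

```python
from itertools import combinations


def find_tivs(nodes, rtt):
    """Return (a, b, c) triples where d(a,b) > d(a,c) + d(c,b).

    rtt maps frozenset({a, b}) to the measured RTT between a and b,
    so symmetry is assumed, as with round-trip times.
    """
    def d(a, b):
        return rtt[frozenset((a, b))]

    tivs = []
    for a, b in combinations(nodes, 2):
        for c in nodes:
            if c not in (a, b) and d(a, b) > d(a, c) + d(c, b):
                tivs.append((a, b, c))
    return tivs


# RTTs from the slide: d(2,1) = 4, d(1,3) = 8, d(2,3) = 13.
rtt = {frozenset((1, 2)): 4, frozenset((1, 3)): 8, frozenset((2, 3)): 13}
print(find_tivs([1, 2, 3], rtt))  # [(2, 3, 1)]: going via node 1 is shorter
```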

Where are TIVs from? (2)
Private peering links
Multihoming, bilateral non-transitive peering relationships, interactions between intra- and inter-domain routing, etc. are further causes of TIVs
[Figure: routers R1 to R5] Example: d(2,3) = 28, d(2,1) = 2, d(1,3) = 4, so d(2,3) > d(2,1) + d(1,3): a TIV!

Impact of TIVs
At best, TIVs will just cause inaccuracies in the embedding
  – If TIVs are encountered during embedding, the resulting coordinate will lie “in-between”
  – If not encountered during embedding, coordinates will still inadequately predict real distances
At worst, coordinates will “oscillate”
  – Typically the case in Vivaldi
  – Because TIVs have a nasty habit of “pulling” on nodes, which then get pushed back by other neighbours
TIVs are the major cause of errors in ICS

Other Oddity
ICS have been observed to drift
  – The centroid of the points in the metric space moves in a fairly constant direction, at a rate of a few hundred milliseconds per day
      – This has been observed on a large-scale Vivaldi system
  – This is probably due to the accumulation of errors caused by TIVs, RTT level shifts, embedding errors, etc.
  – For all practical purposes, this can be ignored as long as the embedding refresh period is small compared to a day

What are ICS good for anyway?
Some studies suggest that although the relative errors can be very small, coordinate systems can perform badly in specific applications
  – Especially closest neighbour selection and neighbour ranking
However, I have some doubts about the representativeness of the data used
  – Don’t get me wrong, it is actually very hard to get a snapshot of measurements that actually represents the network
True, you say, but that’s the same for the computation of the relative error
  – You believe who you want…
So the theory goes: the relative error may be too much of an aggregated metric to tell a good story…
… but the alternative is, so far, application-specific metrics (yuck!)

ICS security
Most ICS actually trade convergence time for scalability
Also, most of them become more accurate as the number of nodes increases
Because of this, you should expect ICS to be deployed as an always-on service
  – You must have a coordinate by the time you need one!
Great, but then they may become a prime target for attackers
  – Think of all the nice applications, distributed systems and overlays you can bring down with one stone!!!
      – Large-scale DDoS attack anyone?
What can an attacker do?
  – That depends on where/who they are…

ICS security (2)
Insider attack
  – Most ICS rely on full cooperation between the nodes to operate
  – Untrusted nodes can easily
      – Lie about their coordinate (to mess up estimation)
      – Tamper with your probes (usually delay them, to mess up measurement)
      – Lie about anything else they can lie about (e.g. local error in Vivaldi)
      – Any combination of these
  – Has been shown to be very effective
  – The result is a distortion of the coordinate space
      – This is insidious, because unsuspecting honest nodes will propagate errors for the bad guys!
Outsider attack
  – Inject rogue probes into the system to fool measurements
  – DDoS attacks on links
      – Impact still under study

ICS security (3)
Defending against insider attacks
  – Early methods are too primitive
      – The NPS median test can start working for the attacker when the attacker dominates the set of measurements (and skews the median)
      – The PIC defence is based on the triangle inequality: the Internet messes that up for us even without bad guys!
  – Trust propagation models
      – But you must trust the trust propagation
      – Can be complex

ICS security (3, continued)
  – Signal processing
      – It was shown that the evolution of the relative error can be modelled by a linear state space model (and tracked by a Kalman filter)
      – It was also shown that the model of error evolution for one node is a good match for the error model of nearby nodes
          – This means that a Kalman filter calibrated on one node can be used to predict the errors observed on a nearby node
      – The Kalman filter gives you the mean and variance of its innovation process, which is the difference between the input (measured error) and the predicted one
          – A simple hypothesis test is therefore possible on the deviation between the measured error and the predicted one
Idea: have a set of trusted infrastructure nodes (surveyors) that embed exclusively against each other, so that they see a “clean” space
  – Surveyors also help embed other nodes
  – Have each node use a Kalman filter calibrated at a (close-by) surveyor
  – At each embedding step, use the filter to test whether the observed error is compatible with the prediction
      – If not, ignore/change your neighbour
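
To illustrate the innovation test described above, here is a deliberately simplified scalar Kalman filter. It is a sketch under assumptions, not the calibrated model from the work the slide refers to: the class name, the state model x' = beta * x + noise, the parameter values and the 1.96 threshold (a two-sided 5% test under a Gaussian assumption) are all illustrative.

```python
import math


class ScalarKalman:
    """Tracks a node's relative error with a one-dimensional linear model.

    State model: x_next = beta * x + process noise (variance q).
    Measurement: z = x + measurement noise (variance r).
    In the surveyor scheme, beta, q and r would come from a nearby surveyor.
    """

    def __init__(self, beta, q, r, x0=0.0, p0=1.0):
        self.beta, self.q, self.r = beta, q, r
        self.x, self.p = x0, p0

    def _predict(self):
        x_pred = self.beta * self.x
        p_pred = self.beta ** 2 * self.p + self.q
        return x_pred, p_pred

    def is_suspicious(self, measured, z=1.96):
        """Hypothesis test on the innovation (measured minus predicted error)."""
        x_pred, p_pred = self._predict()
        innovation_var = p_pred + self.r
        return abs(measured - x_pred) > z * math.sqrt(innovation_var)

    def update(self, measured):
        """Standard Kalman update; call only for measurements that passed the test."""
        x_pred, p_pred = self._predict()
        gain = p_pred / (p_pred + self.r)
        self.x = x_pred + gain * (measured - x_pred)
        self.p = (1 - gain) * p_pred


kf = ScalarKalman(beta=0.9, q=0.01, r=0.01, x0=0.2, p0=0.01)
for observed_error in (0.2, 1.5):
    if kf.is_suspicious(observed_error):
        print(observed_error, "rejected: ignore or change this neighbour")
    else:
        kf.update(observed_error)
        print(observed_error, "accepted")
```

Rejected samples are not fed back into the filter, so a misbehaving neighbour cannot gradually drag the predicted error towards its own values through this path.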

ICS security (4)
The previous signal processing method cannot defend against a node that lies about its coordinate during distance estimation (the application phase)
In that case you need something else
  – Trust again?
  – Validity certificates?
  – ???

Conclusions
ICS are a relatively new field, and still very much a hot topic
  – Our understanding of them is still improving steadily
On the other hand, several large-scale trials have shown that they are mostly fit for practical purposes
They are poised to play a critical role in supporting future overlays and intelligent applications
Serious deployment could be only a few years away
  – Most structured P2P systems have some kind of ICS prototype available to them
But these could of course become “famous last words” ;-)

Thank you for your attention!