Improving Internet Availability with Path Splicing Nick Feamster Georgia Tech.

Presentation transcript:

Improving Internet Availability with Path Splicing Nick Feamster Georgia Tech

2 Can the Internet be Always On? It is not difficult to create a list of desired characteristics for a new Internet. Deciding how to design and deploy a network that achieves these goals is much harder. Over time, our list will evolve. It should be: 1. Robust and available. The network should be as robust, fault-tolerant and available as the wire-line telephone network is today. 2. … Various studies (Paxson, etc.) show the Internet is at about 2.5 nines More critical (or at least availability-centric) applications on the Internet At the same time, the Internet is getting more difficult to debug –Scale, complexity, disconnection, etc.

3 Availability of Other Services Carrier Airlines (2002 FAA Fact Book) –41 accidents, 6.7M departures – % availability 911 Phone service (1993 NRIC report +) –29 minutes per year per line –99.994% availability Std. Phone service (various sources) –53+ minutes per line per year –99.99+% availability
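
The availability figures on this slide follow from simple arithmetic, sketched below. The function names are illustrative and not from the talk, and a 365-day year is assumed. The airline percentage did not survive in this transcript; the sketch derives one under the assumption that each accident counts as one "unavailable" departure, which is a reading of the slide rather than a confirmed method.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored (assumption)


def availability_from_downtime(minutes_down_per_year: float) -> float:
    """Fraction of the year a service is up, given annual downtime in minutes."""
    return 1.0 - minutes_down_per_year / MINUTES_PER_YEAR


def availability_from_events(failures: int, attempts: int) -> float:
    """Fraction of attempts that succeed (e.g., departures without an accident)."""
    return 1.0 - failures / attempts


def availability_from_nines(nines: float) -> float:
    """Convert 'number of nines' to a fraction: 2 nines -> 0.99, 3 nines -> 0.999."""
    return 1.0 - 10.0 ** (-nines)


def nines_from_availability(availability: float) -> float:
    """Inverse of availability_from_nines."""
    return -math.log10(1.0 - availability)


if __name__ == "__main__":
    for label, value in [
        ("911 phone service (29 min/year)", availability_from_downtime(29)),
        ("Std. phone service (53 min/year)", availability_from_downtime(53)),
        ("Airlines (41 of 6.7M departures)", availability_from_events(41, 6_700_000)),
        ("Internet (2.5 nines)", availability_from_nines(2.5)),
    ]:
        print(f"{label}: {value * 100:.4f}%")
```

Under these assumptions, 29 minutes of downtime per year works out to roughly 99.994%, matching the slide, while 2.5 nines corresponds to roughly 99.7%.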

4 Threats to Availability Natural disasters Physical device failures (node, link) –Drunk network administrators

5 Threats to Availability Natural disasters Physical failures (node, link) –Drunk network administrators –Cisco bugs Misconfiguration Mis-coordination Denial-of-service (DoS) attacks Changes in traffic patterns (e.g., flash crowd) …

6 Availability: Two Aspects Reliability: Connectivity in the routing tables should approach that of the underlying graph –If two nodes s and t remain connected in the underlying graph, there is some sequence of hops in the routing tables that will result in traffic reaching t from s Recovery: In case of failure (i.e., link or node removal), nodes should quickly be able to discover a new path

7 Where Today's Protocols Stand Reliability –Approach: Compute backup paths –Challenge: Many possible failure scenarios! Recovery –Approach: Switch to a backup when a failure occurs –Challenge: Must quickly discover a new working path –Meanwhile, packets are dropped, reordered, etc.

8 Multipath: Promise and Problems Bad: If a link fails on each of the two paths, s is disconnected from t Want: End systems remain connected unless the underlying graph has a cut

9 Path Splicing: Main Idea Step 1 (Perturbations): Run multiple instances of the routing protocol, each with a slightly perturbed version of the configuration Step 2 (Slicing): Allow traffic to switch between instances at any node in the protocol Compute multiple forwarding trees per destination. Allow packets to switch slices midstream.

10 Outline Path Splicing –Achieving Reliable Connectivity Generating slices Constructing paths –Forwarding –Recovery Properties –High Reliability –Bounded Stretch –Fast recovery Ongoing Work

11 Generating Slices Goal: Each instance provides different paths Mechanism: Each edge is given a weight that is a slightly perturbed version of the original weight –Two schemes: Uniform and degree-based [Figure: base graph and perturbed graph, each with endpoints s and t]

12 How to Perturb the Link Weights? Uniform: Perturbation is a function of the initial weight of the link Degree-based: Perturbation is a linear function of the degrees of the incident nodes –Intuition: Deflect traffic away from nodes where traffic might tend to pass through by default
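
A minimal sketch of the two perturbation schemes follows. The slide does not give formulas, so the exact forms here are assumptions: the uniform scheme adds a random amount up to `strength` times the original weight, and the degree-based scheme scales that random amount by a linear function `a * (deg(u) + deg(v)) + b` of the endpoint degrees. All names and parameters are hypothetical.

```python
import random


def degrees(edges):
    """Count the degree of every node in an undirected edge-weight dict."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def perturb_uniform(edges, strength, rng):
    """Assumed form: w' = w + strength * U(0, w), same scaling for every link."""
    return {e: w + strength * rng.uniform(0, w) for e, w in edges.items()}


def perturb_degree_based(edges, a, b, rng):
    """Assumed form: w' = w + (a * (deg(u) + deg(v)) + b) * U(0, w).

    Links touching high-degree nodes get larger perturbations, which tends to
    deflect paths away from nodes that traffic crosses by default.
    """
    deg = degrees(edges)
    perturbed = {}
    for (u, v), w in edges.items():
        scale = a * (deg[u] + deg[v]) + b
        perturbed[(u, v)] = w + scale * rng.uniform(0, w)
    return perturbed


def make_slices(edges, k, perturb, rng):
    """Slice 0 keeps the original weights; slices 1..k-1 are perturbed copies."""
    return [dict(edges)] + [perturb(edges, rng) for _ in range(k - 1)]


if __name__ == "__main__":
    base = {("s", "a"): 1.0, ("a", "t"): 2.0, ("s", "t"): 4.0}
    rng = random.Random(0)
    slices = make_slices(base, 3, lambda e, r: perturb_degree_based(e, 0.1, 0.0, r), rng)
    for i, weights in enumerate(slices):
        print(i, weights)
```

Each slice would then run its own shortest-path computation over its perturbed weights.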

13 Constructing Paths Goal: Allow multiple instances to co-exist Mechanism: Virtual forwarding tables [Figure: forwarding tables for Slice 1 and Slice 2, with columns dst and next-hop]
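
One way to picture the virtual forwarding tables is as one next-hop table per slice, each derived from that slice's link weights. The sketch below assumes an undirected graph and uses Dijkstra's algorithm rooted at the destination; the topology, the weights, and the function name `next_hops_to` are made up for illustration and are not taken from the slide's figure.

```python
import heapq


def next_hops_to(dst, weights):
    """Return {node: next hop toward dst} for one slice.

    weights maps undirected edges (u, v) to a positive link weight. Running
    Dijkstra from dst gives a shortest-path tree; each node's parent in that
    tree is its next hop toward dst.
    """
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    dist = {dst: 0.0}
    next_hop = {}
    heap = [(0.0, dst)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                next_hop[v] = u  # v reaches dst through u
                heapq.heappush(heap, (nd, v))
    return next_hop


# Hypothetical example: slice 0 uses the base weights, slice 1 uses weights
# in which the links through node a toward s and t have been made expensive.
SLICE_WEIGHTS = [
    {("s", "a"): 1, ("a", "t"): 1, ("s", "b"): 1, ("b", "c"): 1, ("c", "t"): 1, ("a", "c"): 1},
    {("s", "a"): 3, ("a", "t"): 3, ("s", "b"): 1, ("b", "c"): 1, ("c", "t"): 1, ("a", "c"): 1},
]

# tables[slice_id][node] is the next hop toward destination "t" in that slice.
tables = [next_hops_to("t", w) for w in SLICE_WEIGHTS]

if __name__ == "__main__":
    for slice_id, table in enumerate(tables):
        print(slice_id, table)
```

The state kept at each node grows linearly with the number of slices: one next-hop entry per destination per slice.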

14 Forwarding Traffic Packet has shim header with forwarding bits Routers use lg(k) bits to index forwarding tables –Shift bits after inspection To access different (or multiple) paths, end systems simply change the forwarding bits –Incremental deployment is trivial –Persistent loops cannot occur Various optimizations are possible
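
The bit handling described above can be sketched as follows. The header layout is an assumption (the slide does not fix one): the first hop's slice index sits in the lowest-order bits, and each router shifts those bits away after reading them. Function names are illustrative.

```python
def bits_per_hop(k: int) -> int:
    """Number of bits needed to name one of k slices, i.e. ceil(lg k)."""
    return (k - 1).bit_length()


def encode_header(slice_choices, k: int) -> int:
    """Pack a per-hop list of slice indices into an integer shim header."""
    width = bits_per_hop(k)
    header = 0
    for hop, slice_id in enumerate(slice_choices):
        if not 0 <= slice_id < k:
            raise ValueError(f"slice index {slice_id} out of range for k={k}")
        header |= slice_id << (hop * width)
    return header


def pop_slice(header: int, k: int):
    """What a router does: read the low-order bits, then shift them away.

    Returns (slice index for this hop, header to forward to the next hop).
    """
    width = bits_per_hop(k)
    mask = (1 << width) - 1
    return header & mask, header >> width


if __name__ == "__main__":
    k = 4
    header = encode_header([2, 0, 3, 1], k)
    while header:
        slice_id, header = pop_slice(header, k)
        print("use slice", slice_id, "remaining header", bin(header))
```

Once the bits run out, the header is zero and every later hop reads slice 0, so a packet cannot keep switching slices indefinitely. When k is not a power of two, some bit patterns name no slice; how to handle them is left open here.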

15 Putting It Together End system sets forwarding bits in packet header Forwarding bits specify slice to be used at any hop Router: examines/shifts forwarding bits, and forwards
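
The three steps on this slide can be traced end to end with a small simulation. Everything concrete here is assumed for illustration: a five-node topology, two slices (so one forwarding bit per hop), and hand-written next-hop tables toward a single destination `t`.

```python
# TABLES[slice_id][node][dst] -> next hop. Hypothetical tables for two slices.
TABLES = [
    {"s": {"t": "a"}, "a": {"t": "t"}, "b": {"t": "a"}, "c": {"t": "t"}},
    {"s": {"t": "b"}, "a": {"t": "c"}, "b": {"t": "c"}, "c": {"t": "t"}},
]


def forward(tables, src, dst, header, bits=1, max_hops=16):
    """Follow a packet hop by hop and return the list of nodes it visits.

    At each hop the router reads the low `bits` bits of the header to pick a
    slice, shifts them away, and forwards on that slice's next hop. When the
    header is exhausted (zero), the packet stays on slice 0.
    """
    mask = (1 << bits) - 1
    node, path = src, [src]
    while node != dst and len(path) <= max_hops:
        slice_id = header & mask   # router examines the forwarding bits
        header >>= bits            # ...and shifts them before forwarding
        node = tables[slice_id][node][dst]
        path.append(node)
    return path


if __name__ == "__main__":
    for header in (0b0, 0b111, 0b10):
        print(bin(header), forward(TABLES, "s", "t", header))
```

Changing only the header bits at the end system yields three different paths over the same two tables, including one (`s, a, c, t`) that exists in neither slice alone.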

16 Evaluation Defining reliability Does path splicing improve reliability? –How close can splicing get to the best possible reliability (i.e., that of the underlying graph)? Can path splicing enable fast recovery? –Can end systems (or intermediate nodes) find alternate paths fast enough?

17 Defining Reliability Reliability: the probability that, upon failing each edge with probability p, the graph remains connected Reliability curve: the fraction of source-destination pairs that remain connected for various link failure probabilities p The underlying graph has an underlying reliability (and reliability curve) –Goal: Reliability of routing system should approach that of the underlying graph.
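
The reliability curve of an underlying graph can be estimated by Monte Carlo simulation, as sketched below. This is an illustrative estimator under simple assumptions (undirected graph, independent edge failures, union-find for connectivity), not the evaluation code behind the talk; the function names are hypothetical.

```python
import random


def connected_pairs_fraction(nodes, edges):
    """Fraction of unordered node pairs joined by some path over `edges`."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    sizes = {}
    for n in nodes:
        root = find(n)
        sizes[root] = sizes.get(root, 0) + 1

    connected = sum(s * (s - 1) // 2 for s in sizes.values())
    total = len(nodes) * (len(nodes) - 1) // 2
    return connected / total


def reliability_curve(nodes, edges, probs, trials, rng):
    """For each p, fail every edge independently with probability p and
    average the fraction of pairs that stay connected over `trials` runs."""
    curve = {}
    for p in probs:
        acc = 0.0
        for _ in range(trials):
            surviving = [e for e in edges if rng.random() >= p]
            acc += connected_pairs_fraction(nodes, surviving)
        curve[p] = acc / trials
    return curve


if __name__ == "__main__":
    ring_nodes = [0, 1, 2, 3]
    ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(reliability_curve(ring_nodes, ring_edges, [0.0, 0.1, 0.5, 1.0], 1000, random.Random(1)))
```

To compare a routing scheme against this baseline, the same procedure would be run over only the edges that the scheme's forwarding tables can actually use.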

18 Reliability Curve: Illustration [Figure: fraction of source-dest pairs disconnected, plotted against probability of link failure (p)] More edges available to end systems -> Better reliability

19 Reliability Approaches Optimal Setup: Sprint (Rocketfuel) topology, degree-based perturbations, 1,000 trials; p indicates the probability that an edge was removed from the base graph Result: Reliability approaches optimal; average stretch is only 1.3

20 Recovery: Two Mechanisms End-system recovery –Switch slices at every hop with probability 0.5 Network-based recovery –Router switches to a random slice if next hop is unreachable –Continue for a fixed number of hops till destination is reached
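
End-system recovery can be sketched as a retry loop over re-randomized forwarding bits. This is one reading of "switch slices at every hop with probability 0.5": for each hop, the end system keeps the current slice or, with probability 0.5, picks a different one. The `works` callback is hypothetical and stands in for sending a packet and observing whether it arrives; the default of five trials is likewise only an illustrative choice.

```python
import random


def resample_slices(current, k, rng, switch_prob=0.5):
    """Return a new per-hop slice list; each hop switches with switch_prob."""
    new_choices = []
    for slice_id in current:
        if k > 1 and rng.random() < switch_prob:
            slice_id = rng.choice([s for s in range(k) if s != slice_id])
        new_choices.append(slice_id)
    return new_choices


def end_system_recover(works, initial, k, rng, max_trials=5):
    """Try up to max_trials re-randomized headers.

    Returns (working slice list, number of trials used), or (None, max_trials)
    if no trial succeeded.
    """
    choice = list(initial)
    for trial in range(1, max_trials + 1):
        choice = resample_slices(choice, k, rng)
        if works(choice):
            return choice, trial
    return None, max_trials


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical failure: any path that uses slice 0 at the first hop is broken.
    result = end_system_recover(lambda c: c[0] != 0, [0, 0, 0], k=2, rng=rng)
    print(result)
```

Trials run sequentially here, costing one round-trip time each; they could also be issued in parallel.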

21 Simple Recovery Strategies Work Well Which paths can be recovered within 5 trials? –Sequential trials: 5 round-trip times –…but trials could also be made in parallel Recovery approaches maximum possible Adding a few more slices improves recovery beyond best possible reliability with fewer slices.

22 What About Loops? Persistent loops are avoidable –In the simple scheme, path bits are exhausted from the header –Never switching back to the same slice Transient loops can still be a problem because they increase end-to-end delay (stretch) –Longer end-to-end paths –Wasted capacity

23 Significant Novelty for Modest Stretch Novelty: difference in nodes in a perturbed shortest path from the original shortest path –Computed as 1 minus the fraction of edges on short path shared with long path Example: Novelty = 1 – (1/3) = 2/3
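
The worked example on this slide can be reproduced with a few lines. The sketch assumes the edge-based reading (one minus the fraction of the short path's edges that also appear on the long path) and treats links as undirected; the two node sequences are made up so that one of the three original edges is shared.

```python
from fractions import Fraction


def path_edges(path):
    """Set of undirected edges along a node sequence."""
    return {frozenset(edge) for edge in zip(path, path[1:])}


def novelty(original, perturbed):
    """1 - (edges of `original` also on `perturbed`) / (edges of `original`)."""
    original_edges = path_edges(original)
    shared = original_edges & path_edges(perturbed)
    return 1 - Fraction(len(shared), len(original_edges))


if __name__ == "__main__":
    short_path = ["s", "a", "b", "d"]       # 3 edges
    long_path = ["s", "a", "c", "e", "d"]   # shares only edge s-a
    print(novelty(short_path, long_path))
```

With one shared edge out of three, the result is 2/3, as on the slide.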

24 Splicing Improves Availability Reliability: Connectivity in the routing tables should approach that of the underlying graph –Approach: Overlay trees generated using random link-weight perturbations. Allow traffic to switch between them. –Result: Splicing ~ 10 trees achieves near-optimal reliability Recovery: In case of failure, nodes should quickly be able to discover a new path –Approach: End nodes randomly select new bits. –Result: Recovery within 5 trials approaches best possible.

25 Extension: Interdomain Paths Observation: Many routers already learn multiple alternate routes to each destination. Idea: Use the forwarding bits to index into these alternate routes at an AS's ingress and egress routers. Required new functionality: –Storing multiple entries per prefix –Indexing into them based on packet headers –Selecting the best k routes for each destination [Figure: default and alternate routes to destination d; splice paths at ingress and egress routers]

26 Open Questions and Ongoing Work How does splicing interact with traffic engineering? Sources controlling traffic? What are the best mechanisms for generating slices and recovering paths? Can splicing eliminate dynamic routing?

27 Conclusion Simple: Forwarding bits provide access to different paths through the network Scalable: Exponential increase in available paths, linear increase in state Stable: Fast recovery does not require fast routing protocols

Network-Level Spam Filtering

29 Spam: More Than Just a Nuisance 75-90% of all email traffic –PDF Spam: ~11% and growing –Content filters cannot catch! As of August 2007, one in every 87 emails constituted a phishing attack Targeted attacks on the rise –20k-30k unique phishing attacks per month –Spam targeted at CEOs, social networks on the rise

30 Problem #1: Content-Based Filtering is Malleable Low cost of evasion: Spammers can easily alter features of an email's content Customized emails are easy to generate: Content-based filters need fuzzy hashes over content, etc. High cost to filter maintainers: Filters must be continually updated as content-changing techniques become more sophisticated

31 Problem #2: IPs are Ephemeral Spammers use various techniques to change the IP addresses from which they send traffic –One technique: BGP Spectrum Agility (hijack IP address space, send spam, withdraw route; ~10 minutes) Humans must notice changing behavior Existing blacklists cannot stay up to date

32 Our Approach: Network-Based Filtering Filter email based on how it is sent, in addition to simply what is sent. Network-level properties are more fixed –Hosting or upstream ISP (AS number) –Botnet membership –Location in the network –IP address block Challenge: Which properties are most useful for distinguishing spam traffic from legitimate email?

33 SpamTracker: Main Idea and Intuition Idea: Blacklist sending behavior (Behavioral Blacklisting) –Identify sending patterns that are commonly used by spammers Intuition: Much more difficult for a spammer to change the technique by which mail is sent than it is to change the content

34 SpamTracker Design Approach: –For each sender, construct a behavioral fingerprint –Cluster senders with similar fingerprints –Filter new senders that map to existing clusters [Figure: pipeline with stages labeled IP x domain x time, Collapse, Cluster, Classify, Lookup, Score]

35 Building the Classifier: Clustering Feature: Distribution of sending volumes across recipient domains Clustering Approach –Build initial seed list of bad IP addresses –For each IP address, compute feature vector: volume per domain per time interval –Collapse into a single IP x domain matrix: –Compute clusters
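
The collapse step can be illustrated as follows. The slide's formula did not survive in this transcript, so summing volumes over the time axis is an assumption, and the record layout `(ip, domain, time_slot, count)` and function names are hypothetical. The clustering step itself is not shown.

```python
from collections import defaultdict


def collapse(observations):
    """Collapse (ip, domain, time_slot, count) records into an IP x domain
    matrix by summing counts over time slots (assumed collapse rule)."""
    matrix = defaultdict(lambda: defaultdict(int))
    for ip, domain, _time_slot, count in observations:
        matrix[ip][domain] += count
    return {ip: dict(row) for ip, row in matrix.items()}


def to_vector(row, domains):
    """Lay out one IP's row as a list ordered by a fixed domain list."""
    return [row.get(domain, 0) for domain in domains]


if __name__ == "__main__":
    records = [
        ("192.0.2.1", "a.example", 0, 3),
        ("192.0.2.1", "a.example", 1, 2),
        ("192.0.2.1", "b.example", 1, 7),
        ("192.0.2.9", "b.example", 0, 1),
    ]
    matrix = collapse(records)
    domains = ["a.example", "b.example"]
    for ip, row in matrix.items():
        print(ip, to_vector(row, domains))
```

Each resulting row is the per-IP feature vector (sending volume across recipient domains) that the clustering step consumes.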

36 Clustering: Output and Fingerprint For each cluster, compute fingerprint vector: New IPs will be compared to this fingerprint IP x IP Matrix: Intensity indicates pairwise similarity

37 Classifying IP Addresses Given new IP address, build a feature vector based on its sending pattern across domains Compute the similarity of this sending pattern to that of each known spam cluster –Normalized dot product of the two feature vectors –Spam score is maximum similarity to any cluster
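
The scoring rule on this slide (normalized dot product, then the maximum over clusters) can be sketched directly. The fingerprint formula is not preserved in this transcript; taking the column-wise mean of a cluster's feature vectors is an assumption made here only so the example is complete. Function names are illustrative.

```python
import math


def cluster_fingerprint(rows):
    """Assumed fingerprint: column-wise mean of the cluster's feature vectors."""
    count = len(rows)
    return [sum(column) / count for column in zip(*rows)]


def normalized_dot(u, v):
    """Dot product of u and v divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_u * norm_v)


def spam_score(vector, fingerprints):
    """Maximum similarity of a new IP's vector to any known spam cluster."""
    return max((normalized_dot(vector, f) for f in fingerprints), default=0.0)


if __name__ == "__main__":
    cluster_a = [[9, 1, 0], [11, 1, 0]]   # mostly sends to the first domain
    cluster_b = [[0, 5, 5], [0, 7, 3]]    # spreads over the other two
    fingerprints = [cluster_fingerprint(cluster_a), cluster_fingerprint(cluster_b)]
    print(spam_score([20, 2, 0], fingerprints))
    print(spam_score([1, 1, 1], fingerprints))
```

Because the dot product is normalized, the score depends on the shape of the sending pattern across domains rather than on total volume.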

38 Summary Spam is on the rise and becoming more clever –12% of spam now PDF spam. Content filters are falling behind. Also becoming more targeted IP-Based blacklists are evadable –Up to 30% of spam not listed in common blacklists at receipt. ~20% remains unlisted after a month –Spammers commonly steal IP addresses New approach: Behavioral blacklisting –Blacklist how the mail was sent, not what was sent –SpamTracker being deployed and evaluated by a large spam filtering company