Lecture 12 Page 1 CS 239, Spring 2007 Other Topics in Experiment Design CS 239 Experimental Methodologies for System Software Peter Reiher May 17, 2007.

Presentation transcript:

Lecture 12 Page 1 CS 239, Spring 2007 Other Topics in Experiment Design CS 239 Experimental Methodologies for System Software Peter Reiher May 17, 2007

Lecture 12 Page 2 CS 239, Spring 2007 Outline Experiment order and randomization Important traces Useful models for experimentation

Lecture 12 Page 3 CS 239, Spring 2007 Randomization of Experimental Order Uncontrollable parameters may vary during experimentation –In non-random ways Plotting error vs. experiment number detects this –But doesn’t control it Randomization controls the problem –Becomes error parameter

Lecture 12 Page 4 CS 239, Spring 2007 An Example Data from sample one factor experiment with replications Assumed order is all A levels first, then B levels, etc.

Lecture 12 Page 5 CS 239, Spring 2007 What Does This Chart Tell Us? Bigger errors for early replications of the experiment Eventually settling down to a narrow range So maybe our A experiments observed some different conditions than later experiments Might get different results if A experiments were run last, instead of first
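
The check described on the previous slides (error vs. experiment number) can be sketched as follows. This is a minimal illustration, not the lecture's own analysis: the data values are made up, and the function names `residuals_in_run_order` and `spread` are hypothetical.

```python
from statistics import mean

def residuals_in_run_order(runs):
    """runs: list of (level, measurement) tuples in the order they were executed.
    Returns each run's error: its measurement minus the mean for its level."""
    by_level = {}
    for level, value in runs:
        by_level.setdefault(level, []).append(value)
    level_mean = {level: mean(values) for level, values in by_level.items()}
    return [value - level_mean[level] for level, value in runs]

def spread(values):
    """Range of a list of errors (largest minus smallest)."""
    return max(values) - min(values)

# Made-up measurements: all A runs first, then B, then C.
runs = [("A", 10.0), ("A", 16.0), ("A", 7.0),
        ("B", 12.0), ("B", 12.5), ("B", 11.5),
        ("C", 9.0), ("C", 9.2), ("C", 8.8)]

errors = residuals_in_run_order(runs)
early, late = errors[:3], errors[3:]
print("error range, early runs:", spread(early))
print("error range, later runs:", spread(late))
```

A much wider error range in the early runs than in the later ones is the pattern the chart shows; it says the order mattered, but not why.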

Lecture 12 Page 6 CS 239, Spring 2007 Why Might This Kind of Thing Happen? Consider measuring disk performance: –Benchmark creates 1000 small files, 10 large ones, writes them, then deletes them –File size is varied as experimental parameter –One run takes several hours –Other people use system daily Disk fragmentation may increase over time, changing results

Lecture 12 Page 7 CS 239, Spring 2007 Another Possible Reason Cyclic effects Something is happening on the computer every hour/day/week Experiments run while this thing is happening behave differently Ideally, should get rid of cyclic effect –But that’s not always possible There are many other similar reasons for this kind of behavior

Lecture 12 Page 8 CS 239, Spring 2007 Another Reason These kinds of effects are very common when you run live tests Also when you run raw traces –Of sufficient length and complexity to capture them –Not a problem if all tests get same trace –But potentially a problem if you divide the trace into pieces for different runs –That includes dividing traces for training purposes

Lecture 12 Page 9 CS 239, Spring 2007 Complete Randomization Plan experiment first –Levels of each parameter –Number of replications List experiments by levels and replication number Choose experiments from list randomly –Selection without replacement
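
A minimal sketch of complete randomization, assuming the plan is small enough to list in memory. The factor name and levels in the example are made up, and `randomized_run_order` is a hypothetical helper. Shuffling the full list is equivalent to drawing experiments from it at random without replacement.

```python
import itertools
import random

def randomized_run_order(levels_by_factor, replications, seed=None):
    """List every (level combination, replication) pair, then shuffle the list."""
    factors = list(levels_by_factor)
    combos = itertools.product(*(levels_by_factor[f] for f in factors))
    plan = [dict(zip(factors, combo), replication=r)
            for combo in combos
            for r in range(1, replications + 1)]
    # A seeded generator makes the order reproducible for the lab notebook.
    random.Random(seed).shuffle(plan)
    return plan

# Illustrative one-factor experiment with three levels and four replications.
for run in randomized_run_order({"file_size": ["A", "B", "C"]}, replications=4, seed=1):
    print(run)
```

Recording the seed along with the results lets the same order be regenerated later.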

Lecture 12 Page 10 CS 239, Spring 2007 More Advanced Techniques Complete randomization sometimes impossible –E.g., might need to install different hardware for each level Too much intervention to potentially change HW after each run Divide experiments into blocks –Randomize within each block –Not that helpful if only one factor Block effect confounded with true effect

Lecture 12 Page 11 CS 239, Spring 2007 An Example Testing DDoS defense boxes Your experiment has three factors –Which of three boxes –Varying number of attack sites (3 levels) –Makeup of DDoS traffic (3 levels) The boxes are hardware appliances

Lecture 12 Page 12 CS 239, Spring 2007 Why Is This Problematic? Boxes need to be put in-line in testing framework Requiring someone to switch cables (at least) With complete randomization, need to switch cables on roughly two-thirds of experimental runs
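
The "roughly two-thirds" figure can be checked with a short calculation. The sketch assumes every box gets the same number of runs, the run order is uniformly random, and a cable change is needed exactly when two consecutive runs use different boxes. Given the box used in one run, the next run is equally likely to be any of the other N - 1 runs, of which m - 1 use the same box. The function name is hypothetical.

```python
from fractions import Fraction

def switch_probability(boxes, runs_per_box):
    """Probability that two consecutive runs in a random order use different boxes."""
    total = boxes * runs_per_box
    return 1 - Fraction(runs_per_box - 1, total - 1)

# Three boxes, 3 x 3 levels of the other two factors, one replication (assumed).
p = switch_probability(boxes=3, runs_per_box=9)
print(p, float(p))
```

With more replications per box the probability moves closer to 2/3, since (m - 1)/(N - 1) approaches 1/3 when there are three boxes.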

Lecture 12 Page 13 CS 239, Spring 2007 A Block Design for This Case Set up blocks of experiments with single box tested in each block –But multiple blocks for each box E.g., all tests for box A with maximum number of attack sites are in one block Randomize order of block testing Randomize within the block
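
One way to generate such a schedule is sketched below. It assumes each block is one (box, number of attack sites) pair containing the three traffic mixes, following the slide's example; the level names are made up and `blocked_schedule` and `cable_changes` are hypothetical helpers.

```python
import itertools
import random

def blocked_schedule(boxes, site_levels, traffic_levels, seed=None):
    """Randomize the order of blocks, then randomize the runs inside each block."""
    rng = random.Random(seed)
    blocks = list(itertools.product(boxes, site_levels))
    rng.shuffle(blocks)                      # random block order
    schedule = []
    for box, sites in blocks:
        runs = [(box, sites, traffic) for traffic in traffic_levels]
        rng.shuffle(runs)                    # random order within the block
        schedule.extend(runs)
    return schedule

def cable_changes(schedule):
    """Count consecutive runs that use different boxes."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev[0] != cur[0])

schedule = blocked_schedule(["A", "B", "C"],
                            ["few", "some", "many"],
                            ["mix1", "mix2", "mix3"], seed=7)
print(len(schedule), "runs,", cable_changes(schedule), "cable changes")
```

With nine blocks there are at most eight box changes, since the box can only change at a block boundary.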

Lecture 12 Page 14 CS 239, Spring 2007 What Have We Gained? Many fewer cable changes than with complete randomization But less danger than with a fixed order that unforeseen effects depending on experiment order will cause problems Haven’t removed the problem, but have decreased it

Lecture 12 Page 15 CS 239, Spring 2007 Something To Keep In Mind Experimenters tend to think of periodic or startup effects as a nuisance They are actually real phenomena –Possibly important phenomena When designing experiments, think seriously about whether you want to avoid these effects –Or, alternately, capture them –The latter requires careful thought

Lecture 12 Page 16 CS 239, Spring 2007 Traces Traces are often an important part of a workload Many kinds of traces are hard to gather for yourself In some cases, traces are publicly available Sometimes you can use those

Lecture 12 Page 17 CS 239, Spring 2007 Some Useful Traces NLANR packet header traces CAIDA traces and data sets U. of Oregon Routeviews traces File system traces Web traces Crawdad wireless traces

Lecture 12 Page 18 CS 239, Spring 2007 NLANR Network Traces Traces of Internet packet activities –Just packet headers Variety of traces gathered at different places in Internet Of varying length Useful if you want to generate “realistic” internal Internet traffic NLANR is out of business; its traces are now run by CAIDA

Lecture 12 Page 19 CS 239, Spring 2007 CAIDA Traces and Data Sets CAIDA is organization dedicated to measuring Internet phenomena They’ve gathered a bunch of useful data –Some of which they’ve made publicly available Likely to be adding more over course of time

Lecture 12 Page 20 CS 239, Spring 2007 Some CAIDA Datasets Skitter topology data Denial-of-service backscatter data Internet worm activity data Packet traces from OC12 and OC48 ISP points DNS root server traffic activity

Lecture 12 Page 21 CS 239, Spring 2007 Skitter Data Sets Skitter is CAIDA project to gather Internet topology data Skitter sends probe packets from many sites around globe to Internet addresses Gathers data based on responses Data can be used to build map of current topology/routing state of Internet

Lecture 12 Page 22 CS 239, Spring 2007 Denial of Service Backscatter Data Typical DoS attacks result in victim’s sending lots of response packets –If attack spoofed addresses, they go to random sites –This is called backscatter CAIDA watches backscatter and has made some backscatter data available Provides insight into DoS attack numbers, sizes, targets, etc.

Lecture 12 Page 23 CS 239, Spring 2007 Internet Worm Activity Worms spread to randomly chosen addresses CAIDA has data on worm probe attempts to their addresses For Code Red and Witty Some parts of data available to all Others available on a restricted access basis Useful for modeling worm activity

Lecture 12 Page 24 CS 239, Spring 2007 Routeviews Data Gathered at University of Oregon BGP updates and routing tables from several participating ASes –From 2001 to date –Gathered every two hours, mostly

Lecture 12 Page 25 CS 239, Spring 2007 What Does Routeviews Data Show? Full picture of routing from perspective of particular points on Internet Partial view of overall Internet topology and routing Data can be used to deduce lots of useful things

Lecture 12 Page 26 CS 239, Spring 2007 What Could Experimenters Use Routeviews Data For? Generating Internet topology maps Generating realistic BGP update traffic Generating models of path lengths in Internet
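
As a small illustration of the path length use, the sketch below assumes the AS paths have already been extracted from the routing tables into space-separated strings of AS numbers (an assumed input format; the extraction step is not shown). The AS numbers are from the range reserved for documentation, and the function names are hypothetical. Consecutive repeats of the same AS (prepending) are counted once.

```python
from collections import Counter
from itertools import groupby

def as_path_length(path):
    """Number of ASes on a path, collapsing consecutive repeats of the same AS."""
    hops = [asn for asn, _ in groupby(path.split())]
    return len(hops)

def path_length_distribution(paths):
    """Fraction of paths at each length, as a dict sorted by length."""
    counts = Counter(as_path_length(p) for p in paths)
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}

# Made-up paths for illustration only.
paths = ["64496 64497 64498",
         "64496 64499",
         "64496 64497 64497 64497 64500",
         "64496 64501 64502 64503"]
print(path_length_distribution(paths))
```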

Lecture 12 Page 27 CS 239, Spring 2007 File System Traces Surprisingly few traces of significant amounts of file system activity But some are available –Many are old More might become so in near future Best place to start looking is SNIA IOTTA repository –

Lecture 12 Page 28 CS 239, Spring 2007 Some File System Traces Seer trace –Gathered in my research group (1996/1997) –Real activity by real users –575 Mbytes LASR trace –Also gathered in my group (2000/2001) –Real activity by real users –3.2 Gbytes TraceFS data –16 minutes worth of activity (2007) –Based on running benchmarks –58 Mbytes

Lecture 12 Page 29 CS 239, Spring 2007 Typical File System Trace Contents Records of file system related system calls Recorded every time file system was invoked Indicates file accessed, type of access, time, size, perhaps user and process –With significant anonymization

Lecture 12 Page 30 CS 239, Spring 2007 What Can You Do With File System Traces? Replay them when testing file systems Use them to build models of file system activity Use them to generate profiles of what files in a file system are actually used –One big weakness in most traces is they show what was accessed –No info about the rest of the file system’s contents
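
A minimal sketch of building a usage profile from a trace. The record layout is an assumption based on the fields listed earlier (file, type of access, time, size); real trace formats differ, and `TraceRecord` and `access_profile` are hypothetical names.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class TraceRecord:
    time: float   # seconds since start of trace
    path: str     # (anonymized) file name
    op: str       # type of access, e.g. "open", "read", "write"
    size: int     # bytes transferred, 0 if not applicable

def access_profile(records):
    """Count how many traced operations touched each file."""
    return Counter(r.path for r in records)

# Made-up records for illustration only.
trace = [TraceRecord(0.0, "f1", "open", 0),
         TraceRecord(0.1, "f1", "read", 4096),
         TraceRecord(0.5, "f2", "write", 512),
         TraceRecord(0.9, "f1", "read", 4096)]
print(access_profile(trace).most_common())
```

The profile only ever contains files that appear in the trace, which is the weakness noted above: files that were never accessed are invisible.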

Lecture 12 Page 31 CS 239, Spring 2007 Other Interesting File System Traces Cello traces –block level access to disk Plan 9 traces –Possibly deceptive, due to unusual system model of Plan 9 –Seem to have disappeared from web Werner Vogels Windows traces –Also seem to have disappeared

Lecture 12 Page 32 CS 239, Spring 2007 Web Server Traces Usually traces of HTTP requests made to some web server –Suitably anonymized Many available –But many are old –Web moves fast enough that it’s not clear how representative they are

Lecture 12 Page 33 CS 239, Spring 2007 Lawrence Berkeley Web Trace Repository Various web traces kept at LBL – Some are quite extensive –E.g., 1.3 billion web requests for 1998 World Cup site None from after 2000

Lecture 12 Page 34 CS 239, Spring 2007 IRCache Traces Weekly traces of a proxy cache Latest currently available from January 2007 ftp://ftp.ircache.net/Traces/ Free for academic users Commercial users have to pay

Lecture 12 Page 35 CS 239, Spring 2007 Web Caching Trace Site Run by Brian D. Davison Contains pointers to several web caches Except IRCache, none newer than 1999 Many are pointers to same traces as LBL –But not all

Lecture 12 Page 36 CS 239, Spring 2007 Crawdad Wireless Traces Crawdad is project to gather useful data on wireless networks –Based at Dartmouth – Contains large quantity of data on various wireless phenomena

Lecture 12 Page 37 CS 239, Spring 2007 The Dartmouth Wireless Traces Maybe the best stuff in the Crawdad data archives Dartmouth’s campus has had complete wireless coverage for several years –And all students have wireless-enabled computers They’ve kept complete data on associations to wireless access points for five full years –Still gathering and making data available

Lecture 12 Page 38 CS 239, Spring 2007 What Can You Do With Dartmouth’s Data? Lots of stuff Traces of activity at wireless access points Models of user mobility Analysis of malware propagation via user movement Models of typical patterns of user network access

Lecture 12 Page 39 CS 239, Spring 2007 Other Neat Stuff in Crawdad Repository Other records of user mobility through wireless networks Data on Bluetooth activity in various environments Placelab data on use of wireless for localization Link quality information for mesh networks Ongoing data gathering project, so more will be added

Lecture 12 Page 40 CS 239, Spring 2007 Useful Experimental Models In many cases, we can’t test in real conditions Typically try to mimic real conditions by using models –Workload models –Network topology models –Models of other experimental conditions There are already useful models for many things –Often widely accepted as valid within certain research communities –Might be better using them than trying to create your own

Lecture 12 Page 41 CS 239, Spring 2007 Some Important Model Categories Network topology models Network traffic models

Lecture 12 Page 42 CS 239, Spring 2007 Network Topology Models Many experiments nowadays investigate network/distributed systems behavior They need a realistic network to test the system –Usually embedded in testbed hardware Where do you get that from? In some cases, it’s obvious or you have a map of a suitable network In other cases, more challenging

Lecture 12 Page 43 CS 239, Spring 2007 Some Challenging Cases You need the Internet in the middle You are investigating a large enterprise network You are doing scalability testing that requires networks of several sizes

Lecture 12 Page 44 CS 239, Spring 2007 Network Generation Models The typical response to this problem Run a program that generates a suitable network Map the resulting network onto your available hardware –Could be challenging, if you don’t have enough machines –Some generators create networks of specified size But theoretically like whatever they’re modeling

Lecture 12 Page 45 CS 239, Spring 2007 Network Topologies and Power Law Behavior Much debate on whether the Internet (and other computer networks) follow power law behavior –P(k) ∝ k^(-γ), where P(k) is probability a node connects to k other nodes and γ is a constant exponent Generally some agreement that power law topology generators do better job than hierarchical models –Less agreement on how power law properties arise in networks like Internet
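
If P(k) falls off as a power of k, a log-log plot of P(k) against k is roughly a straight line whose slope is the negative of the exponent. The sketch below computes an empirical P(k) from an edge list and fits that slope by ordinary least squares. This is a crude check under stated assumptions, not a rigorous test for power law behavior; the function names are hypothetical.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Empirical P(k): fraction of nodes that have exactly k neighbors."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}

def loglog_slope(pk):
    """Least-squares slope of log P(k) against log k."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Tiny made-up star graph: one hub connected to three leaves.
print(degree_distribution([(0, 1), (0, 2), (0, 3)]))
```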

Lecture 12 Page 46 CS 239, Spring 2007 Some Popular Topology Generators GT-ITM BRITE INET

Lecture 12 Page 47 CS 239, Spring 2007 GT-ITM Supports various ways to randomly generate network graphs –Including transit-stub model Which doesn’t produce power law graphs Still, very widely used

Lecture 12 Page 48 CS 239, Spring 2007 BRITE Parameterizable network generation tool Outputs its networks in NS-2 syntax Places nodes randomly in a plane Randomly selects some number of nodes to connect to each new node –From a limited set of candidates Some experiments suggest it produces graphs matching power law behavior Topology generator of choice for Emulab
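
The growth process described on this slide can be illustrated with a simplified sketch. This is not BRITE's actual algorithm or output format; it only mirrors the three steps named above (random placement in a plane, a limited candidate set, a fixed number of links per new node), and all names and parameters are hypothetical. Neighbors are chosen uniformly here, so the sketch makes no attempt to reproduce power law behavior.

```python
import random

def grow_random_topology(n, links_per_node, candidate_pool, seed=None):
    """Add nodes one at a time; link each to a few nodes drawn from a limited candidate set."""
    rng = random.Random(seed)
    positions = {}
    edges = []
    for node in range(n):
        positions[node] = (rng.random(), rng.random())   # random point in unit square
        existing = list(range(node))
        # Limited set of candidates, then a fixed number of neighbors from it.
        candidates = rng.sample(existing, min(candidate_pool, len(existing)))
        neighbors = rng.sample(candidates, min(links_per_node, len(candidates)))
        edges.extend((node, nb) for nb in neighbors)
    return positions, edges

positions, edges = grow_random_topology(n=10, links_per_node=2, candidate_pool=4, seed=3)
print(len(positions), "nodes,", len(edges), "links")
```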

Lecture 12 Page 49 CS 239, Spring 2007 INET Topology generator specifically intended to produce Internet-like graphs Much effort to match various network characteristics

Lecture 12 Page 50 CS 239, Spring 2007 A Different Approach Map the real Internet accurately Use that map for your topology Rocketfuel project is one approach to this mapping – etworking/rocketfuel/ Issue of producing small representative topology you can actually test with remains

Lecture 12 Page 51 CS 239, Spring 2007 Network Traffic Models Frequently necessary to feed network traffic into an experiment Could use a trace But sometimes better to use a generator The generator needs a model to tell it how to generate traffic What kind of model?

Lecture 12 Page 52 CS 239, Spring 2007 Different Network Traffic Model Approaches Trace analysis –Derive properties from traces of network behavior –Generate traffic according to those properties Structural models –Pretend you’re running an application –Generate traffic as it would do

Lecture 12 Page 53 CS 239, Spring 2007 Harpoon Discussed in earlier lecture Uses network traces to determine type of network traffic to mimic –Gathered with other tools Generates traffic from TCP and UDP sessions

Lecture 12 Page 54 CS 239, Spring 2007 Swing A trace-based generator Analyzes trace –Looking at users, networks, apps Calculate CDFs based on these parameters Traffic generator creates traffic based on these Produces very realistic results –Improves on Harpoon by allowing application-based variation of traffic –And produces fidelity at finer time scales (~RTT) Apparently not yet available for general use
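
The general idea of generating traffic from CDFs derived from a trace can be sketched as follows. This is not Swing's implementation; it is a minimal illustration of building an empirical CDF for one parameter and drawing new values from it by inverse transform sampling. The flow sizes are made up and the function names are hypothetical.

```python
import bisect
import random

def empirical_cdf(samples):
    """Return sorted values and the cumulative probability reached at each one."""
    values = sorted(samples)
    n = len(values)
    probs = [(i + 1) / n for i in range(n)]
    return values, probs

def draw(values, probs, rng):
    """Pick a uniform random number, then return the first value whose CDF reaches it."""
    u = rng.random()
    return values[bisect.bisect_left(probs, u)]

# Made-up flow sizes (bytes) standing in for values measured from a trace.
observed_flow_sizes = [300, 1200, 1200, 4500, 52000, 800, 300, 1500]
values, probs = empirical_cdf(observed_flow_sizes)
rng = random.Random(42)
generated = [draw(values, probs, rng) for _ in range(5)]
print(generated)
```

Sampling this way only ever reproduces values seen in the trace; a generator that needs unseen values would have to smooth or fit the distribution first.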

Lecture 12 Page 55 CS 239, Spring 2007 Netspec A structural model generator Able to emulate traffic generation behavior of multiple types of applications –HTTP, FTP, Telnet, voice, video, etc. You decide how many you want of each Netspec generates them Doesn’t seem to be downloadable, at the moment –No actual link on the “distribution” place on Netspec web page