
1 Traceback Pat Burke Yanos Saravanos

2 Agenda Introduction Problem Definition Benchmarks and Metrics Traceback Methods  Packet Marking  Hash-based Conclusion References

3 Why Use Traceback? General Network Monitoring  Check users on FTP server Network Threats  SPAM  DoS  Insider attacks

4 Why Use Traceback? Network Threats  Worms / Viruses Code Red (2001) spreading at 8 hosts/sec Slammer Worm (2003) spreading at 125 hosts/sec Illegal file sharing

5 Why Use Traceback? Currently very difficult to find spammers and virus authors  Easy to spoof source IPs  No inherent tracing mechanism in IP The Blaster virus author left clues in his code and was eventually caught What if we could trace packets back to their point of origin?

6 Packet Tracing

7 Monitoring applications currently exist  Ethereal, tcpdump, ngrep, etc.  These only work with untampered packets Worms, viruses, and spam are sent with spoofed IPs from compromised computers We need solutions that can trace all packets

8 Preliminary Solutions Routers add identifiers to the packet as it moves along the Internet  Packet size increases with every hop Effective throughput decreases very quickly Routers keep a log of all the packets that have been routed  Large overhead required of all routers Huge database containing packet information When should you clear packet information?
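A back-of-the-envelope sketch in Python of the first approach's cost; the payload and per-hop mark sizes here are hypothetical illustration values, not figures from the slides:

    # Hypothetical numbers: goodput fraction when each router appends a
    # 4-byte identifier to every packet it forwards.
    PAYLOAD = 1460      # assumed original payload bytes per packet
    MARK_BYTES = 4      # assumed bytes appended per hop

    for hops in (5, 15, 30):
        extra = MARK_BYTES * hops
        print(f"{hops:2d} hops: +{extra} bytes, "
              f"goodput = {PAYLOAD / (PAYLOAD + extra):.3f}")

Even with a small mark, the appended bytes grow linearly with path length, which is why per-hop appending scales poorly.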

9 Benchmarks Effect on throughput  Amount of overhead added to the packets False positive rate  Percentage of paths traced back to the incorrect source Computational intensity  Time required to trace an attack  Amount of data required to trace an attack  CPU/memory usage on router

10 Benchmarks Traceback’s effect on network  Does the traceback traffic itself flood the network? Susceptibility to spoofing Collisions  For hash-based traceback methods

11 Some Assumptions Attackers can create/spoof any packet Packets from an attack may take different routes to victim Attacker-victim routes are stable Routers are not compromised

12 Packet Marking

13 Add information to the packets so that paths can be retraced to the original source Methods for marking packets  Probabilistic (node marking or edge marking)  Deterministic

14 Probabilistic Packet Marking (PPM) Each router marks a packet with some probability  With its IP address (node marking)  With an edge of the path (edge marking) Node marking  95% accuracy requires ~300,000 packets Edge marking  More state information required, but converges much faster

15 PPM Nodes Each router writes its address in a 32-bit field only with probability p  Address field can be overwritten by routers closer to the victim  Probability of seeing the mark of a router d hops away is p(1-p)^(d-1)  Need many packets before we see a mark from a distant router
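A minimal Python sketch of the survival probability above; the marking probability p = 0.04 is an illustrative value, not one given on the slide:

    # Probability that the mark arriving at the victim belongs to the router
    # d hops away: that router marked the packet (p) and none of the d-1
    # downstream routers overwrote the 32-bit field ((1-p)^(d-1)).
    def mark_probability(p, d):
        return p * (1 - p) ** (d - 1)

    p = 0.04  # illustrative marking probability
    for d in (1, 10, 20, 30):
        print(f"d={d:2d}: P(mark survives) = {mark_probability(p, d):.6f}")

The geometric decay in d is exactly why many packets are needed before a distant router's mark is ever observed.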

16 PPM Nodes – Pros Not every packet is marked  Lower overhead on routers  Higher throughput (packet size remains small) Fixed space is required for the packets  Packet size + 32 bits

17 PPM Nodes - Cons Large number of false positives  DDoS with 25 hosts requires several days to trace and yields thousands of false positives Slow convergence rate  For 95% success, we need ~300,000 packets Attacker can still inject modified packets into a PPM network (mark spoofing) These figures assume a single attacker

18 PPM Edge Sampling Reserve a distance field and two 32-bit address fields (“start” and “end”) A router that decides to mark a packet writes its address in the “start” field and zeroes the distance field A router that sees a zero distance field writes its address in the “end” field A router that decides not to mark a packet increments the distance field  Must use saturating addition, since the distance field has a limited width
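A sketch of one router's edge-sampling decision following the procedure above; the field widths, the 5-bit distance limit, and p = 0.04 are illustrative assumptions:

    import random
    from dataclasses import dataclass

    MAX_DIST = 31  # saturating limit for an assumed 5-bit distance field

    @dataclass
    class Packet:
        start: int = 0     # "start" address field
        end: int = 0       # "end" address field
        distance: int = 0  # hops since the last mark

    def edge_sample(pkt, router_addr, p=0.04):
        """One router's marking step (sketch of PPM edge sampling)."""
        if random.random() < p:
            # Start a new edge: record our address, reset the distance.
            pkt.start = router_addr
            pkt.distance = 0
        else:
            if pkt.distance == 0:
                # We are the far end of an edge started one hop upstream.
                pkt.end = router_addr
            # Saturating increment so the field cannot wrap around.
            pkt.distance = min(pkt.distance + 1, MAX_DIST)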

19 PPM Edge Sampling Expected number of packets needed to reconstruct an attack path is bounded by ln(d)/(p(1-p)^(d-1))  Requires fewer packets than node marking Edge sampling allows reconstruction of the whole attack tree  Packets carry additional overhead Encoding start, end, and distance fields breaks compatibility with networks not using PPM
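Plugging illustrative values into the bound (p = 0.04 and the path lengths below are assumptions for the sake of the example):

    import math

    # Bound from the slide: E[packets] < ln(d) / (p * (1-p)^(d-1))
    def expected_packets(p, d):
        return math.log(d) / (p * (1 - p) ** (d - 1))

    p = 0.04
    for d in (10, 20, 30):
        print(f"path length {d}: ~{expected_packets(p, d):.0f} packets")

Compare this with the ~300,000 packets quoted for node marking: sampling edges rather than single nodes converges orders of magnitude faster.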

20 Deterministic Packet Marking (DPM) Every packet is marked Spoofed marks are overwritten with correct marks

21 DPM Incoming packets are marked Outgoing packets are unaltered Requires more overhead than PPM Less computation required Probability of generating the ingress IP address is (1-p)^(d-1)

22 DPM The 32-bit ingress address is split into two 16-bit fields (bits 0-15 and 16-31) plus a flag The ingress router populates one of the two fields with probability 0.5  Flag is set to 1 if the high-order bits are used Only part of the address is exposed to the attacker at any time Can be made more secure by using non-uniform probability distributions
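A minimal sketch of this split-and-flag encoding, assuming a 17-bit mark (16 address bits plus the half-selector flag) carried in the packet header; the function names are hypothetical:

    import random

    def dpm_mark(ingress_ip):
        """One ingress router's mark: 16 address bits plus a 1-bit flag."""
        if random.random() < 0.5:
            return (ingress_ip >> 16) & 0xFFFF, 1   # high half, flag = 1
        return ingress_ip & 0xFFFF, 0               # low half, flag = 0

    def dpm_reconstruct(marks):
        """Recombine the ingress address once both halves have been seen."""
        high = next((bits for bits, flag in marks if flag == 1), None)
        low = next((bits for bits, flag in marks if flag == 0), None)
        if high is None or low is None:
            return None   # need at least one packet carrying each half
        return (high << 16) | low

This is why so few packets suffice: the victim only needs to observe one packet carrying each half of the ingress address.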

23 DPM Claimed to have 0 false positives Claimed to converge very quickly  99% probability of success with 7 packets  99.9% probability of success with only 10 packets Has not been tested on large networks Cannot deal with NAT

24 HASH-BASED TRACEBACK Source Path Isolation Engine (SPIE)

25 SPIE - Overview Each router along a packet’s transmission path computes a set of hash codes (digests) for each packet The time-tagged digests are stored in router memory for some time period  Limited by available router resources Traceback is initiated only by “authenticated agent requests” to the SPIE Traceback Manager (STM)  Executed by means of a broadcast message Results in the construction of a complete attack graph within the STM

26 SPIE - Assumptions Packets may be addressed to multiple destinations Attackers are aware they are being traced Routers may be subverted, but not often Routing within the network may be unstable  Traceback must deal with divergent paths Packet size should not grow as a result of traceback  1-byte increase in size = 1% increase in resource use  A very controversial, self-enabling assumption End hosts may be resource-constrained Traceback is an infrequent operation  Broadcast messages can have a significant impact on Internet performance Traceback should return the entire path, not just the source

27 SPIE - Architecture DGA (Data Generation Agent)  Resides in SPIE-enhanced routers to produce digests and store them in time-stamped digest tables  Implemented as software agents, interface cards, or dedicated auxiliary boxes SCAR (SPIE Collection and Reduction Agent)  Data concentration point for some regional area  When traceback is requested, SCARs initiate a broadcast request and produce regional attack graphs from the data of their constituent DGAs STM (SPIE Traceback Manager)  Controls the SPIE system: verifies the authenticity of a traceback request, dispatches it to the appropriate SCARs, gathers the regional attack graphs, and assembles the complete attack graph
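A structural sketch of the three components and the request flow; the component names mirror the slide, but the data structures and method names are assumptions for illustration:

    from dataclasses import dataclass, field

    @dataclass
    class DGA:
        """Digests packets at one SPIE-enhanced router (sketch)."""
        router_id: str
        digest_tables: list = field(default_factory=list)  # time-stamped tables

        def saw(self, digest):
            return any(digest in table for table in self.digest_tables)

    @dataclass
    class SCAR:
        """Regional data concentration point: queries its constituent DGAs."""
        dgas: list

        def regional_graph(self, digest):
            return [d.router_id for d in self.dgas if d.saw(digest)]

    @dataclass
    class STM:
        """Verifies requests, dispatches to SCARs, assembles the full graph."""
        scars: list

        def traceback(self, digest, authenticated):
            if not authenticated:
                raise PermissionError("only authenticated agents may trace")
            return [hop for s in self.scars for hop in s.regional_graph(digest)]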

28 SPIE - Hashing Multiple hash codes are calculated for each packet, over different groupings of 24 relatively invariant fields within the first 32 bytes of the packet  A packet is judged to have been received only if all of its hashes are positive Hash functions can be simple (no cryptographic hardness required) and relatively fast [Figure: masked (gray) header fields are NOT used in the hash calculation] Observed digest collision rates: 0.00092% (WAN trace), 0.139% (LAN trace)
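"Received only if all hashes are positive" is Bloom-filter membership, which is how SPIE stores digests. A minimal sketch, with an assumed table size and SHA-256 slices standing in for SPIE's simpler non-cryptographic hash functions:

    import hashlib

    TABLE_BITS = 1 << 20  # assumed digest-table size (1 Mbit)
    K = 3                 # assumed number of hash functions

    def invariant_prefix(pkt):
        # Mask the mutable IPv4 header fields so the digest is identical
        # at every router along the path.
        p = bytearray(pkt[:32].ljust(32, b"\x00"))
        p[1] = 0                      # TOS
        p[8] = 0                      # TTL
        p[10:12] = b"\x00\x00"        # header checksum
        return bytes(p)

    def bit_positions(pkt):
        h = hashlib.sha256(invariant_prefix(pkt)).digest()
        return [int.from_bytes(h[4*i:4*i+4], "big") % TABLE_BITS
                for i in range(K)]

    class DigestTable:
        def __init__(self):
            self.bits = bytearray(TABLE_BITS // 8)

        def insert(self, pkt):
            for b in bit_positions(pkt):
                self.bits[b // 8] |= 1 << (b % 8)

        def probably_saw(self, pkt):
            # Positive only if every hash position is set. False positives
            # (the collision rates above) are possible; false negatives are not.
            return all(self.bits[b // 8] & (1 << (b % 8))
                       for b in bit_positions(pkt))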

29 SPIE – Implementation Issues Pros  Single-packet tracing is feasible  Automated processing by SPIE-enhanced routers makes spoofing difficult, at best  Relatively low storage required (only digests and timestamps are stored)  Does not aid in eavesdropping of payload data (payload is not stored) Cons  Requires specially configured (SPIE-enhanced) routers; probability of detection is directly related to the number of SPIE-enhanced routers in the network in question  Router storage limits the window of time in which a packet may be successfully traced; may require some sort of filtering of which packets are digested  May give the appearance of a loss of anonymity across the Internet

30 Conclusions DoS attacks, worms, and viruses are continuously becoming more dangerous Attacks must be shut down quickly and be traceable Integrating traceback into the next-generation Internet is critical

31 Conclusions Probabilistic Packet Marking  Keeps packet overhead low  Not 100% accurate, and traceback is slow Deterministic Packet Marking  No false positives  Much higher packet overhead; needs more testing Hash-based Traceback  No packet overhead  Requires new, more capable routers

32 Conclusions Cooperation is required  Routers must be built to handle new tracing protocols  ISPs must provide compliance with protocols  Internet is no longer anonymous Some issues must still be solved  NATs  Collisions

33 References Belenky, A. and Ansari, N. “IP Traceback with Deterministic Packet Marking.” IEEE Communications Letters, April 2003. Savage, S., et al. “Practical Network Support for IP Traceback.” Department of Computer Science, University of Washington. Snoeren, A., Partridge, C., et al. “Single-Packet IP Traceback.” IEEE/ACM Transactions on Networking, December 2002.

