
1 ECE 526 – Network Processing Systems Design: System Implementation Principles II (Varghese, Chapter 3)

2 Outline Review of Principles 1-7 Implementation Principles 8-15 ─ reflecting what we have learned Example: TCAM updating Cautionary questions

3 Review P1: Avoid Obvious Waste ─ Example: copy a packet pointer instead of the packet P2: Shift Computation in Time ─ precompute (table lookup); ─ evaluate lazily (network forensics); ─ share expenses (batch processing) P3: Relax Subsystem Requirements ─ trade certainty for time (random sampling); ─ trade accuracy for time (hashing, Bloom filters); ─ shift computation in space (fast path/slow path)
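
As a concrete illustration of P3's "trade accuracy for time" bullet, here is a minimal Bloom filter sketch in C; the filter size, hash-mixing constants, and key type are illustrative assumptions rather than anything specified in the slides. A query answers either "definitely absent" or "probably present" (false positives are possible), which is often good enough on a fast path.

```c
/* Minimal Bloom filter sketch for P3 (trade accuracy for time).
 * Sizes and hash constants are illustrative choices, not from the slides. */
#include <stdint.h>

#define FILTER_BITS 8192                 /* power of two, multiple of 8 */

static uint8_t filter[FILTER_BITS / 8];

/* Two cheap hash mixes of a 32-bit key (e.g., a flow ID). */
static uint32_t hash1(uint32_t k) { k ^= k >> 16; k *= 0x7feb352dU; k ^= k >> 15; return k; }
static uint32_t hash2(uint32_t k) { k ^= k >> 15; k *= 0x846ca68bU; k ^= k >> 16; return k; }

static void set_bit(uint32_t h)
{
    uint32_t i = h % FILTER_BITS;
    filter[i / 8] |= (uint8_t)(1u << (i % 8));
}

static int get_bit(uint32_t h)
{
    uint32_t i = h % FILTER_BITS;
    return (filter[i / 8] >> (i % 8)) & 1;
}

void bloom_insert(uint32_t key)
{
    set_bit(hash1(key));
    set_bit(hash2(key));
}

/* Returns 0 = definitely absent, 1 = probably present (false positives possible). */
int bloom_query(uint32_t key)
{
    return get_bit(hash1(key)) && get_bit(hash2(key));
}
```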

4 Review P4: Leverage Off-System Components ─ Examples: on-board address recognition and filtering, caches P5: Add Hardware to Improve Performance ─ use memory interleaving and pipelining (= parallelism); ─ use wide-word parallelism (saves memory accesses); ─ combine SRAM and DRAM (keep the low-order bits of each counter in SRAM when maintaining a large number of counters) P6: Replace Inefficient General Routines with Efficient Specialized Ones ─ Example: NAT using forwarding and reverse tables P7: Avoid Unnecessary Generality ─ Examples: RISC, microengines
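
The SRAM/DRAM counter combination under P5 can be sketched as follows; the array sizes, flush threshold, and the use of plain arrays to stand in for SRAM and DRAM are assumptions made only for illustration. The per-packet update touches just the narrow fast-memory counter, and the wide slow-memory counter is written only on the rare flush.

```c
/* Sketch of P5's SRAM/DRAM counter split: keep only the low-order bits of each
 * counter in fast memory, and flush to the full-width slow-memory counter
 * before the narrow counter can overflow.  Sizes and the flush threshold are
 * illustrative assumptions. */
#include <stdint.h>

#define NUM_COUNTERS    1024
#define FLUSH_THRESHOLD 0xF0             /* flush well before the 8-bit value wraps */

static uint8_t  sram_low[NUM_COUNTERS];  /* stands in for on-chip SRAM */
static uint64_t dram_full[NUM_COUNTERS]; /* stands in for off-chip DRAM */

/* Per-packet update touches only the narrow fast-memory counter. */
void counter_increment(unsigned idx)
{
    if (++sram_low[idx] >= FLUSH_THRESHOLD) {   /* rare case: spill to DRAM */
        dram_full[idx] += sram_low[idx];
        sram_low[idx] = 0;
    }
}

/* Full value = already-flushed DRAM part plus the pending SRAM residue. */
uint64_t counter_read(unsigned idx)
{
    return dram_full[idx] + sram_low[idx];
}
```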

5 P8: Don't Be Tied to Reference Implementations Key Concept: ─ implementations are sometimes given (e.g., by manufacturers) as a way to make the specification of an interface precise, or to show how to use a device ─ these do not necessarily show the right way to think about the problem; they are chosen for conceptual clarity! Example: ─ using parallel packet classification instead of sequential demultiplexing in TCP/IP protocol processing

6 P9: Pass Hints Across Interfaces Key Concept: if the caller knows something the callee will have to compute, pass it (or something that makes it easier to compute) as an argument! ─ "hint" = something that makes the recipient's life easier, but may not be correct ─ "tip" = a hint that is guaranteed to be correct ─ Caveat: the callee must either trust the caller or verify the hint (probably should do both) Example ─ Active messages: the message carries the address of its interrupt handler for fast dispatching
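
A minimal sketch of P9, assuming a hypothetical session table: the caller passes the slot index where it last found the key as a hint, and the callee verifies the hint before using it, falling back to the normal lookup when the hint is stale. None of these names or structures come from the slides.

```c
/* Sketch of P9: the caller passes a hint (a guessed table slot) that the callee
 * verifies before using; a wrong hint is harmless and triggers the normal lookup.
 * The table layout and names are hypothetical, for illustration only. */
#include <stddef.h>
#include <stdint.h>

struct session { uint32_t key; uint32_t state; };

#define TABLE_SIZE 256
static struct session table[TABLE_SIZE];

static struct session *slow_lookup(uint32_t key)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (table[i].key == key)
            return &table[i];
    return NULL;
}

/* 'hint' is the slot where the caller last saw this key; it may be stale. */
struct session *lookup_with_hint(uint32_t key, size_t hint)
{
    if (hint < TABLE_SIZE && table[hint].key == key)   /* hint verified: fast path */
        return &table[hint];
    return slow_lookup(key);                           /* hint wrong: fall back */
}
```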

7 P10: Pass Hints in Protocol Headers Key Concept: if the sender knows something the receiver will have to compute, pass it in the header Example: ─ tag switching: the packet carries extra information besides the destination address to enable fast lookup
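
In the spirit of the tag-switching example, here is a sketch of a label-swapping forwarder in which the header carries an index that is used directly, with no destination-address lookup on the fast path. The header fields, table layout, and names are hypothetical illustrations, not the actual tag-switching format.

```c
/* Sketch of P10 as in tag switching: the packet header carries a label that
 * directly indexes the forwarding table, so no address lookup is needed on
 * the fast path.  Field names and the table layout are hypothetical. */
#include <stdint.h>

#define LABEL_SPACE 4096

struct label_hdr   { uint16_t label; uint8_t ttl; };
struct label_entry { int valid; int out_port; uint16_t out_label; };

static struct label_entry lfib[LABEL_SPACE];   /* label forwarding table */

/* Forward by direct indexing and label swap; returns the output port, or -1
 * for an unknown/expired packet (which would be handled on the slow path). */
int label_forward(struct label_hdr *h)
{
    if (h->label >= LABEL_SPACE || h->ttl == 0)
        return -1;
    struct label_entry e = lfib[h->label];
    if (!e.valid)
        return -1;
    h->label = e.out_label;                    /* swap in the downstream label */
    h->ttl--;
    return e.out_port;
}
```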

8 P11: Optimize the Expected Case Key Concept: if 80% of the cases can be handled similarly, optimize for those cases P11a: Use Caches ─ a form of using state to improve performance Example: ─ TCP input "header prediction": if an incoming packet is in order and does what is expected, it can be processed in a small number of instructions
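
A rough sketch of the header-prediction test, with deliberately simplified structures: if the segment carries no special flags, has exactly the next in-order sequence number, and leaves the advertised window unchanged, the fast path applies. The field and flag names below are assumptions for illustration, not the real stack's data structures.

```c
/* Sketch of the P11 idea behind TCP "header prediction": take a short fast
 * path when the incoming segment is exactly what the connection expects.
 * Structures are simplified assumptions, not an actual TCP implementation. */
#include <stdbool.h>
#include <stdint.h>

struct tcp_hdr  { uint32_t seq; uint32_t ack; uint16_t window; uint8_t flags; };
struct tcp_conn { uint32_t rcv_nxt; uint16_t snd_wnd; };

#define TCP_FLAG_ACK          0x10
#define TCP_FLAG_MASK_SPECIAL 0xE7   /* SYN/FIN/RST/URG and reserved bits */

bool header_prediction_hit(const struct tcp_conn *c, const struct tcp_hdr *h)
{
    return (h->flags & TCP_FLAG_MASK_SPECIAL) == 0 &&  /* nothing unusual set */
           (h->flags & TCP_FLAG_ACK) != 0 &&
           h->seq == c->rcv_nxt &&                      /* exactly in order */
           h->window == c->snd_wnd;                     /* window unchanged */
}
```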

9 P12: Add or Exploit State to Gain Speed Key Concept: remember things to make them easier to compute later P12a: Compute Incrementally ─ the idea is to "accumulate" as you go, rather than computing everything at the end Example: ─ incremental computation of the IP checksum
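
For P12a, the incremental IP checksum update can be written directly from one's-complement arithmetic (this follows the HC' = ~(~HC + ~m + m') form of RFC 1624): when a single 16-bit header word changes, only the old/new pair is folded into the existing checksum instead of re-summing the whole header.

```c
/* Sketch of P12a: incrementally update the IP header checksum when one
 * 16-bit header word changes, instead of recomputing over the whole header.
 * Follows HC' = ~(~HC + ~m + m') (RFC 1624); values in host order. */
#include <stdint.h>

uint16_t checksum_update16(uint16_t old_csum, uint16_t old_word, uint16_t new_word)
{
    uint32_t sum = (uint16_t)~old_csum;
    sum += (uint16_t)~old_word;
    sum += new_word;
    /* Fold the carries back in (one's-complement addition). */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```

For example, decrementing the TTL changes a single 16-bit word of the IP header, so the checksum can be patched with one call instead of a full recomputation.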

10 P13: Optimize Degrees of Freedom Key Concept: be aware of the variables under one's control and the evaluation criteria used to determine good performance Example: memory-based string matching ─ each state has 256 possible transitions per input character (2^8, with 8-bit ASCII encoding); ─ the bit-split algorithm uses 8 state machines, each checking a single bit, so the total number of possible transitions per character is 16 (2^1 × 8)

11 P14: Use Special Techniques for Finite Universes (e.g., small integers) Key Concept: when the domain of a function is small, techniques like bucket sorting, bitmaps, etc. become feasible Example: ─ bucket sorting for NAT table lookup: the NAT table is very sparse, and each bucket is accessed by hashing ─ Bucket sort: partition an array into a finite number of buckets, then sort each bucket individually
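
A sketch of the hashed-bucket NAT lookup described above, with illustrative structure layouts and an illustrative hash function: because the set of keys actually in use is sparse, each (inside address, inside port) key hashes to a small bucket, and the few colliding entries are kept on a short chain.

```c
/* Sketch of P14 for the NAT example: hash each (address, port) key into a
 * small array of buckets and chain the rare collisions.  Structure sizes
 * and the hash are illustrative assumptions. */
#include <stdint.h>

#define NUM_BUCKETS 4096

struct nat_entry {
    uint32_t inside_ip;  uint16_t inside_port;
    uint32_t outside_ip; uint16_t outside_port;
    struct nat_entry *next;            /* chain for the rare collisions */
};

static struct nat_entry *buckets[NUM_BUCKETS];

static unsigned nat_hash(uint32_t ip, uint16_t port)
{
    uint32_t h = ip ^ (((uint32_t)port << 16) | port);
    h ^= h >> 13;
    h *= 0x5bd1e995U;                  /* illustrative multiplicative mix */
    return (h ^ (h >> 15)) % NUM_BUCKETS;
}

struct nat_entry *nat_lookup(uint32_t inside_ip, uint16_t inside_port)
{
    struct nat_entry *e = buckets[nat_hash(inside_ip, inside_port)];
    while (e && !(e->inside_ip == inside_ip && e->inside_port == inside_port))
        e = e->next;
    return e;                          /* NULL if no translation exists */
}
```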

12 P15: Use Algorithmic Techniques to Create Efficient Data Structures Key Concept: once P1-P14 have been applied, think about how to build an ingenious data structure that exploits what you know Example ─ IP forwarding lookups: PATRICIA trees were used first (a special trie in which each edge is labeled with a sequence of characters); many more efficient approaches followed
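
As a small sketch of P15 for IP lookups, here is a one-bit-at-a-time (unibit) trie performing longest-prefix match. Note that this is deliberately simpler than a PATRICIA tree, which would path-compress single-child chains; the node layout and next-hop representation are illustrative assumptions.

```c
/* Sketch for P15: longest-prefix match with a simple unibit trie.
 * (A real PATRICIA tree path-compresses single-child chains; this
 * uncompressed version keeps the sketch short.)  Prefixes are assumed
 * left-aligned in a 32-bit word. */
#include <stdint.h>
#include <stdlib.h>

struct trie_node {
    struct trie_node *child[2];
    int next_hop;                       /* -1 if this node stores no prefix */
};

static struct trie_node *node_new(void)
{
    struct trie_node *n = calloc(1, sizeof *n);
    n->next_hop = -1;
    return n;
}

void trie_insert(struct trie_node *root, uint32_t prefix, int len, int next_hop)
{
    struct trie_node *n = root;
    for (int i = 0; i < len; i++) {
        int bit = (prefix >> (31 - i)) & 1;
        if (!n->child[bit])
            n->child[bit] = node_new();
        n = n->child[bit];
    }
    n->next_hop = next_hop;
}

/* Walk the destination address bit by bit, remembering the last prefix seen. */
int trie_lookup(const struct trie_node *root, uint32_t dst)
{
    const struct trie_node *n = root;
    int best = -1;
    for (int i = 0; i < 32 && n; i++) {
        if (n->next_hop != -1)
            best = n->next_hop;
        n = n->child[(dst >> (31 - i)) & 1];
    }
    if (n && n->next_hop != -1)
        best = n->next_hop;
    return best;
}
```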

13 TCAM Ternary: 0, 1, and * (wildcard) A TCAM stores keys of a specified length together with associated actions TCAM lookup: compare the query with all keys in parallel and output (in one cycle) the lowest memory location whose key matches the input IP forwarding uses longest-prefix matching ─ DIP 010001 matches both 010001* and 01* Using a TCAM for IP forwarding therefore requires that all longer prefixes be placed before any shorter ones
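
A software model of this behavior may help: each entry is a (value, mask) pair in which mask bits of 0 play the role of '*', entries are examined in memory order, and the first (lowest-address) match wins, which is exactly why longer prefixes must sit before shorter ones. The entry format is an assumption; a real TCAM performs all comparisons in parallel in one cycle rather than scanning.

```c
/* Software model of the TCAM lookup described above.  The entry format is
 * an illustrative assumption; hardware does all comparisons in parallel. */
#include <stdint.h>

struct tcam_entry {
    uint32_t value;     /* key bits to compare */
    uint32_t mask;      /* 1 = bit must match, 0 = wildcard (*) */
    int      action;    /* e.g., next-hop index */
};

/* Returns the action of the lowest-index matching entry, or -1 on a miss. */
int tcam_lookup(const struct tcam_entry *t, int n, uint32_t query)
{
    for (int i = 0; i < n; i++)
        if ((query & t[i].mask) == (t[i].value & t[i].mask))
            return t[i].action;
    return -1;
}
```

In the example on the next slide, a destination of 110001 would match both the P3 and P5 entries, and the entry stored at the lower address (the longer prefix) wins.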

14 IP Lookup All prefixes of the same length are grouped together; the shortest prefix 0* sits at the highest memory address A packet with DIP 110001 matches the prefixes of both P3 and P5; P5 is chosen because of longest-prefix matching

15 Routing Table Update Suppose 11* with next hop P1 must be inserted into the routing table Naïve approach: create space in the group of length-2 prefixes by pushing up one position all prefixes of length 2 and longer A core routing table can have 100,000 entries → 100,000 memory accesses

16 Routing Table Update P13: understand and exploit degrees of freedom ─ 11* can be added at any position within group 2; it is not required to sit right after 10*. In particular, it can be added at the boundary between group 2 and group 3

17 Clever Routing Table Updating Keep the free space adjacent to the longest-prefix group and create a hole by moving only the boundary entry of each intervening group; to insert a prefix of length i, the maximum number of memory accesses is 32 - i
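
A sketch of this update algorithm, under the assumptions that free space is kept adjacent to the length-32 group and that each length group is stored contiguously: the hole is walked toward the target group by moving only the boundary entry of each non-empty longer-prefix group, so inserting a prefix of length len costs at most 32 - len entry writes. The table size and bookkeeping arrays below are illustrative.

```c
/* Sketch of the clever TCAM update.  Groups are stored in order 32, 31, ..., 1
 * from low to high addresses; slots 0 .. free_top-1 are free.  group_first /
 * group_last give the index range of each length group, or -1 when empty. */
#include <assert.h>
#include <stdint.h>

#define TCAM_SIZE 1024

struct route { uint32_t prefix; int len; int next_hop; };

static struct route tcam[TCAM_SIZE];
static int group_first[33], group_last[33];
static int free_top;

void tcam_table_init(void)
{
    for (int g = 0; g <= 32; g++)
        group_first[g] = group_last[g] = -1;
    free_top = TCAM_SIZE;                 /* whole table is initially free */
}

void tcam_insert(uint32_t prefix, int len, int next_hop)
{
    assert(free_top > 0 && len >= 1 && len <= 32);
    int hole = --free_top;                /* free slot adjacent to group 32 */

    /* Walk the hole past every non-empty group of longer prefixes by moving
     * only that group's boundary entry: at most 32 - len writes. */
    for (int g = 32; g > len; g--) {
        if (group_first[g] < 0)
            continue;                     /* empty group, nothing to move */
        tcam[hole] = tcam[group_last[g]]; /* shift the boundary entry up */
        hole = group_last[g];             /* hole moves to the old boundary */
        group_first[g]--;
        group_last[g]--;
    }

    /* The hole now lies after all longer prefixes and before all shorter
     * ones, so the new entry can be written there directly. */
    tcam[hole] = (struct route){ prefix, len, next_hop };
    if (group_first[len] < 0)
        group_last[len] = hole;           /* group was previously empty */
    group_first[len] = hole;
}
```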

18 Cautionary Questions Q1: Is improvement really needed? Q2: Is this really the bottleneck? Q3: What impact will the change have on the rest of the system? Q4: Does a back-of-the-envelope analysis indicate significant improvement? Q5: Is it worth adding custom hardware? Q6: Can a protocol change be avoided? Q7: Do prototypes confirm the initial promise? Q8: Will performance gains be lost if the environment changes?

19 Summary P1-P5: System-oriented Principles ─ These recognize/leverage the fact that a system is made up of components ─ Basic idea: move the problem to somebody else's subsystem P6-P10: Improve efficiency without destroying modularity ─ "Pushing the envelope" of module specifications ─ Basic engineering: system should satisfy spec but not do more P11-P15: Local optimization techniques ─ Speeding up a key routine ─ Apply these after you have looked at the big picture

20 Reminder

