Authors: Haiquan (Chuck) Zhao, Hao Wang, Bill Lin, Jun (Jim) Xu Conf. : The 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems.

1 Design and Performance Analysis of a DRAM-based Statistics Counter Array Architecture. Authors: Haiquan (Chuck) Zhao, Hao Wang, Bill Lin, Jun (Jim) Xu. Conf.: The 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS '09). Presenter: JHAO-YAN JIAN. Date: 2010/12/15.

2 Outline: Introduction, Scheme, Analysis, Evaluation

3 Network Measurement. Traffic statistics are widely used in router management, traffic engineering, network anomaly detection, etc. Fine-grained network measurement: possibly tens of millions of flows (and counters). Wirespeed statistics counting: 8 ns update time at 40 Gb/s (OC-768).

4 Naive Implementations. SRAM is fast, but too expensive: e.g., 10 million counters × 64 bits = 80 MB, which is prohibitively expensive (infeasible on-chip). DRAM is cheap, but a naïve implementation is too slow: e.g., with the typically quoted 50 ns DRAM random access time, one update (read, increment, then write) takes 2 × 50 ns = 100 ns, far more than the 8 ns required for wirespeed updates.
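The arithmetic on this slide can be checked with a short sketch. The 40-byte minimum packet size used to derive the 8 ns budget is an assumption added here (the slide states only the 8 ns figure), and the function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted on the slide.
NUM_COUNTERS = 10_000_000
COUNTER_BITS = 64
DRAM_ACCESS_NS = 50        # typically quoted DRAM random access time
LINE_RATE_BPS = 40e9       # 40 Gb/s (OC-768)
MIN_PACKET_BYTES = 40      # assumption: minimum-size packet, not stated on the slide


def sram_bytes(num_counters, counter_bits):
    """Memory needed to hold every full-width counter."""
    return num_counters * counter_bits // 8


def update_budget_ns(line_rate_bps, packet_bytes):
    """Time available per packet (and thus per counter update) at line rate."""
    return packet_bytes * 8 * 1e9 / line_rate_bps


def naive_dram_update_ns(access_ns):
    """A read-modify-write costs one read access plus one write access."""
    return 2 * access_ns


print(sram_bytes(NUM_COUNTERS, COUNTER_BITS) / 1e6, "MB of SRAM")
print(update_budget_ns(LINE_RATE_BPS, MIN_PACKET_BYTES), "ns budget per update")
print(naive_dram_update_ns(DRAM_ACCESS_NS), "ns per naive DRAM update")
```

Under these assumptions the naive DRAM update is more than ten times slower than the per-packet budget.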

5 Hybrid SRAM/DRAM Architectures (1/2). Based on the premise that DRAM is too slow, hybrid SRAM/DRAM architectures have been proposed, e.g., Shah'02 [25], Ramabhadran'03 [21], Roeder'04 [23], Zhao'06 [31]. All are based on the following idea: store the full counters in DRAM (64 bits); keep, say, a 5-bit SRAM counter per flow; perform wirespeed increments on the 5-bit SRAM counters; "flush" SRAM counters to DRAM before they "overflow". Once flushed, an SRAM counter won't overflow again for at least another 2^5 = 32 (or 2^b in general) cycles.
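The hybrid idea can be illustrated with a minimal sketch. It is a simplification, not any of the cited schemes: the class name and the flush policy (flush as soon as the small counter reaches its maximum value) are assumptions, whereas the published designs schedule flushes with a dedicated counter management algorithm so that DRAM bandwidth is not exceeded.

```python
class HybridCounterArray:
    """Toy model: small per-flow SRAM counters backed by full counters in DRAM."""

    def __init__(self, num_counters, sram_bits=5):
        self.max_sram = (1 << sram_bits) - 1   # largest value a b-bit counter holds
        self.sram = [0] * num_counters         # fast, narrow counters
        self.dram = [0] * num_counters         # slow, full-width counters
        self.flushes = 0                       # number of DRAM updates issued

    def increment(self, idx):
        """Wirespeed path: touch only SRAM unless the small counter is full."""
        self.sram[idx] += 1
        if self.sram[idx] == self.max_sram:    # flush before the next increment overflows
            self.flush(idx)

    def flush(self, idx):
        """Slow path: fold the SRAM value into the DRAM counter."""
        self.dram[idx] += self.sram[idx]
        self.sram[idx] = 0
        self.flushes += 1

    def read(self, idx):
        """The true count is the DRAM value plus whatever is still in SRAM."""
        return self.dram[idx] + self.sram[idx]


counters = HybridCounterArray(num_counters=4)
for _ in range(100):
    counters.increment(2)
print(counters.read(2), counters.flushes)  # 100 3
```

With 5-bit counters, 100 increments to one flow cause only 3 DRAM updates in this model, which is why the DRAM path can be much slower than the SRAM path.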

6 Hybrid SRAM/DRAM Architectures (2/2). The 10 to 57 MB of SRAM needed far exceeds the available on-chip SRAM, which is also needed for other network processing. The amount of SRAM depends on how often SRAM counters have to be flushed: if arbitrary increments are allowed (e.g., byte counting), more SRAM is needed. The schemes are integer-specific and do not support decrements.

7 Basic Architecture: Randomized Scheme. Proposed by Lin & Xu in HotMetrics '08. Counters are randomly distributed across B memory banks, where B > 1/µ and µ is the SRAM-to-DRAM access latency ratio.
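The random distribution of counters over banks can be sketched as follows. The slides describe a pseudorandom permutation of counter indices; this sketch substitutes a keyed hash (BLAKE2b from the Python standard library) as a stand-in, and the function name, key, and counter count are illustrative assumptions. The values µ = 1/16 and B = 32 are the ones used later in the deck.

```python
import hashlib
from collections import Counter

MU = 1 / 16        # SRAM-to-DRAM access latency ratio
NUM_BANKS = 32     # B, chosen so that B > 1/MU
KEY = b"demo-secret-key"   # hypothetical secret, unknown to the outside world


def bank_of(counter_index, key, num_banks):
    """Map a counter index to a bank with a keyed hash (stand-in for a permutation)."""
    data = counter_index.to_bytes(8, "big")
    digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_banks


assert NUM_BANKS > 1 / MU  # enough banks to keep up with the SRAM-side rate

# Spread 100,000 distinct counters and look at how evenly the banks are loaded.
loads = Counter(bank_of(i, KEY, NUM_BANKS) for i in range(100_000))
print("min load:", min(loads.values()), "max load:", max(loads.values()))
```

Because the key is secret, an outside party cannot predict which distinct counters share a bank, although repeated updates to one counter still land on one bank.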

8 Adversarial Access Patterns. Even though random address permutation is applied, the memory loads on the DRAM banks may not be balanced. The counter index permutation in our scheme makes it difficult for an adversary to purposely trigger a large number of consecutive updates to distinct counters in the same memory bank, since the pseudorandom permutation function (or the key it uses) is not known to the outside world. An adversary can only try to trigger consecutive updates to the same counter, which would result in consecutive accesses to the same memory bank.

9 Extended Architecture to Handle Adversaries. A fixed pipelined delay module absorbs repeated updates to the same memory location. It is implemented as a fully associative cache with a FIFO replacement policy.
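A simplified software model of such a delay module is sketched below. The class name, the one-update-per-cycle interface, and the rule that a merged update keeps the original entry's age are assumptions made for illustration; the slide specifies only a fixed delay, full associativity, and FIFO replacement.

```python
from collections import OrderedDict


class DelayCache:
    """Toy fixed-delay buffer: updates to an address already present are merged."""

    def __init__(self, delay):
        self.delay = delay
        self.entries = OrderedDict()   # addr -> (entry_cycle, pending_increment), FIFO order
        self.cycle = 0

    def tick(self, addr=None, inc=1):
        """Advance one cycle, optionally accepting one update.

        Returns the list of (addr, increment) requests released to DRAM this cycle.
        """
        self.cycle += 1
        released = []
        # Release entries that have spent the full fixed delay in the buffer.
        while self.entries:
            oldest, (born, pending) = next(iter(self.entries.items()))
            if self.cycle - born < self.delay:
                break
            self.entries.popitem(last=False)
            released.append((oldest, pending))
        if addr is not None:
            if addr in self.entries:
                # Repeated update: absorb it into the pending entry (FIFO position kept).
                born, pending = self.entries[addr]
                self.entries[addr] = (born, pending + inc)
            else:
                self.entries[addr] = (self.cycle, inc)
        return released


cache = DelayCache(delay=4)
out = []
for a in [7, 7, 7, None, None, None]:
    out += cache.tick(a)
print(out)  # [(7, 3)]
```

Three back-to-back updates to counter 7 leave the module as a single DRAM request carrying an increment of 3, so the bank sees one access instead of three.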

10 Worst Case Request Pattern. $q + r$ requests for distinct counters $a_1, \ldots, a_{q+r}$: $q$ requests repeat $T$ times each, and $r$ requests repeat $T - 1$ times each.
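The shape of this pattern can be written down as a vector of per-counter repeat counts. The helper names below are illustrative, and the sketch says nothing about the order in which the requests arrive, which the slide does not specify.

```python
def worst_case_counts(q, r, T):
    """Repeat counts for q + r distinct counters: q appear T times, r appear T - 1 times."""
    if q < 0 or r < 0 or T < 1:
        raise ValueError("need q >= 0, r >= 0 and T >= 1")
    return [T] * q + [T - 1] * r


def total_requests(q, r, T):
    """Total number of requests in the pattern."""
    return q * T + r * (T - 1)


counts = worst_case_counts(q=3, r=2, T=4)
print(counts, sum(counts))  # [4, 4, 4, 3, 3] 18
```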

11 A Few Definitions (1/2). E.g., $(1, 1, 1) \le_M (2, 1, 0) \le_M (3, 0, 0)$.
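The example uses the majorization order $\le_M$. Assuming the standard definition (equal totals, and every prefix sum of the descending-sorted smaller vector is at most the corresponding prefix sum of the larger one), the chain can be checked with a short sketch; the function name is illustrative.

```python
from itertools import accumulate


def is_majorized_by(x, y):
    """Return True if x <=_M y under the standard definition of majorization."""
    if len(x) != len(y) or sum(x) != sum(y):
        return False
    prefix_x = accumulate(sorted(x, reverse=True))
    prefix_y = accumulate(sorted(y, reverse=True))
    return all(a <= b for a, b in zip(prefix_x, prefix_y))


print(is_majorized_by((1, 1, 1), (2, 1, 0)))  # True
print(is_majorized_by((2, 1, 0), (3, 0, 0)))  # True
print(is_majorized_by((3, 0, 0), (1, 1, 1)))  # False
```

Intuitively, a vector higher in this order concentrates the same total on fewer entries, which matches the concern about many updates hitting few counters.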

12 A Few Definitions (2/2)

13 A Useful Theorem. The following theorem relates majorization, exchangeable random variables, and convex order.

14 Chernoff Bound (1/2). We want to bound the probability that a request queue will overflow in $n$ cycles. $X_{s,t}$ is the number of updates to the bank during cycles $[s, t]$, $\tau = t - s$, and $K$ is the length of the request queue. For the total overflow probability bound, multiply by $B$.
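One way to read these definitions is sketched below. It is a generic form of such a bound, not necessarily the exact expression used by the authors. It assumes that a bank serves requests at rate $\mu$ per cycle, so that a queue of length $K$ can overflow during $[s, t]$ only if the arrivals exceed the $\mu\tau$ requests served plus the queue capacity. The parameter $\theta > 0$ is introduced here for the standard Chernoff argument, and the second line is the union bound over the $B$ banks.

```latex
\Pr\left[ X_{s,t} \ge K + \mu\tau \right]
  \le \min_{\theta > 0} \; e^{-\theta (K + \mu\tau)} \, \mathbb{E}\left[ e^{\theta X_{s,t}} \right],
\qquad
\Pr\left[ \text{some bank overflows} \right]
  \le B \cdot \Pr\left[ \text{a given bank overflows} \right].
```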

15 Chernoff Bound (2/2). Let $m_i$, $1 \le i \le n$, be the number of appearances of the $i$th address; let $m^*$ be the worst-case counter update sequence; and let $X_i$, $1 \le i \le n$, be the indicator random variable for whether the $i$th address is mapped to the DRAM bank.

16 Overflow Probability. Overflow probability for 16 million counters, µ = 1/16, B = 32.

17 Memory Usage Comparison

