Computer Networks Seminar Spring 2007 A Power Management Proxy with a New Best-of-N Bloom Filter Design to Reduce False Positives Miguel Jimeno Ken Christensen.

1 Computer Networks Seminar, Spring 2007
A Power Management Proxy with a New Best-of-N Bloom Filter Design to Reduce False Positives
Miguel Jimeno, Ken Christensen
Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620
{mjimeno, christen}@cse.usf.edu

2 Outline
-Introduction & Background
-Research Problem
-The SmartNIC
-The New Design: Best-of-N Method
-Analysis of Best-of-N Method
-Numerical Results & Experiments Evaluation
-Summary & Future Work

3 Introduction
-The Internet consumes 2% of all the electricity consumed in the US. [1]
-An average PC consumes 120 W when fully powered on. [10]
-PCs could add 10% to typical US residential consumption.
-P2P applications keep the PC “on the net” all the time, even though these PCs are idle 99% of the time.
[1] K. Kawamoto, J. Koomey, B. Nordman, R. Brown, M. Piette, M. Ting, and A. Meier, “Electricity Used by Office Equipment and Network Equipment in the U.S.: Detailed Report and Appendices,” Technical Report LBNL-45917, Energy Analysis Department, Lawrence Berkeley National Laboratory, 2001.

4 Introduction
-Can a P2P application be run on a small, low-power microcontroller? The PC could then be power managed.
-The microcontroller cannot store a large list of file names.
-Bloom filters: Bloom filters are a well-known probabilistic data structure that can represent a list of file name strings compactly.

5 Introduction
Bloom filters: A group of hash functions is used to map elements into an array of bits.
Figure 1. Bloom filter of size m bits with k = 4 hash functions. Image taken from [9].
False negatives are not possible, but there is a probability of generating false positives: $\Pr[\text{false positive}] = (s/m)^k$, where m = size of the Bloom filter in bits, k = number of hash functions used to calculate a Bloom filter, and s = number of bits set.
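The structure described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name, the `salt` parameter, and the use of prefixed MD5 digests as the k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter with m bits and k hash functions (illustrative)."""

    def __init__(self, m, k, salt=0):
        self.m = m
        self.k = k
        self.salt = salt          # hypothetical knob selecting a hash group
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _indexes(self, item):
        # Assumption: k hash functions are emulated by prefixing the item
        # with the salt and the function number before hashing with MD5.
        for i in range(self.k):
            data = f"{self.salt}:{i}:{item}".encode("utf-8")
            digest = hashlib.md5(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx] = 1

    def __contains__(self, item):
        # True for every added item (no false negatives); may also be
        # True for an item that was never added (false positive).
        return all(self.bits[idx] for idx in self._indexes(item))

    def bits_set(self):
        return sum(self.bits)

    def false_positive_estimate(self):
        # (s / m) ** k, with s the number of bits currently set.
        return (self.bits_set() / self.m) ** self.k
```

After adding a list of file names, `name in bloom` tests membership and `bloom.false_positive_estimate()` evaluates the expression in s, m, and k for the current state of the filter.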

6 Background
-Bloom filters were first proposed by Bloom. [2]
-Kirsch and Mitzenmacher proposed a way to calculate a Bloom filter with less hashing. [6]
-Lumetta and Mitzenmacher used the Power of Two Choices to calculate the Bloom filter. [7]
[2] B. Bloom, “Space/Time Tradeoffs in Hash Coding with Allowable Errors,” Communications of the ACM, Vol. 13, No. 7, pp. 422-426, 1970.
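The "less hashing" idea of Kirsch and Mitzenmacher derives all k positions from two base hashes as g_i(x) = h1(x) + i*h2(x) mod m. A minimal sketch follows; taking h1 and h2 from the two halves of a single MD5 digest is an assumption made for this example, and the function name is hypothetical.

```python
import hashlib


def double_hash_positions(item, m, k):
    """k bit positions of the form (h1 + i * h2) mod m, for i = 0..k-1."""
    digest = hashlib.md5(item.encode("utf-8")).digest()
    # Assumption: split one 16-byte digest into two 8-byte base hashes.
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:], "big")
    return [(h1 + i * h2) % m for i in range(k)]
```

Only one digest is computed per item regardless of k, which is where the savings in hashing time come from.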

8 Research Problem
We investigated new methods for reducing the probability of false positives of a Bloom filter for fixed m and n, where n is the number of elements stored. The target is the implementation of this structure in a power management proxy.

10 The SmartNIC
-NICs support protocols up to the MAC layer, but cannot respond to higher-layer packets.
-A PC needs to be fully powered on in order to respond to packets.
-Applications like P2P file sharing require the PC to be fully powered on all the time.
To manage power in PCs running P2P applications:
-We are studying the idea of using a small controller to proxy for a sleeping PC.

11 The SmartNIC
-This proxy will be able to maintain P2P TCP connections and respond to query messages.
-We are exploring locating the controller on the NIC, so it’s a “SmartNIC”.

13 The New Design: Best-of-N Method
Best-of-N method: N instances of a Bloom filter are generated, each with its own group of hash functions, and the instance with the fewest bits set to 1 is selected. The “winner” hash group is then used to test the Bloom filter.
1) What improvement in Pr[false positive] can be achieved?
2) What is the computational cost to generate the filter?
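The selection step can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: each "hash group" is emulated by prefixing the item with a group number before an MD5 hash, and all function names are hypothetical.

```python
import hashlib


def positions(item, m, k, group):
    """k bit positions for item under hash group number `group`."""
    out = []
    for i in range(k):
        data = f"{group}:{i}:{item}".encode("utf-8")
        digest = hashlib.md5(data).digest()
        out.append(int.from_bytes(digest[:8], "big") % m)
    return out


def build_filter(items, m, k, group):
    """Build one Bloom filter instance as a bytearray of 0/1 values."""
    bits = bytearray(m)
    for item in items:
        for idx in positions(item, m, k, group):
            bits[idx] = 1
    return bits


def best_of_n(items, m, k, n_instances):
    """Build n_instances filters; return (winning group, its bit array)."""
    best_group, best_bits = None, None
    for group in range(n_instances):
        bits = build_filter(items, m, k, group)
        # Keep the instance with the fewest bits set to 1.
        if best_bits is None or sum(bits) < sum(best_bits):
            best_group, best_bits = group, bits
    return best_group, best_bits


def query(item, bits, k, group):
    """Membership test; must use the same hash group that won."""
    return all(bits[idx] for idx in positions(item, len(bits), k, group))
```

The winning group number has to be stored alongside the filter, since a query hashed with any other group would test the wrong bit positions.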

14 The New Design: Best-of-N Method
In order to compute N instances quickly, we developed a new pseudo-hashing method called “RNG hashing”. This method, based on a random number generator, generates multiple hashes from one initial “seed” hash.
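The slides do not say which seed hash or generator is used, so the following is only a sketch of the general idea. It assumes CRC32 as the seed hash and Python's built-in generator as the RNG; the function name is hypothetical.

```python
import random
import zlib


def rng_hashes(item, m, k, n_instances):
    """Return n_instances groups of k pseudo-hash positions in [0, m).

    Assumptions: CRC32 provides the single seed hash, and one seeded
    generator stream supplies the positions for every instance in turn.
    """
    seed = zlib.crc32(item.encode("utf-8"))
    rng = random.Random(seed)  # deterministic for a given item
    return [[rng.randrange(m) for _ in range(k)] for _ in range(n_instances)]
```

Because the generator is seeded only by the item, the same item always yields the same positions, so a query can regenerate the positions of the winning instance without storing them.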

16 Analysis of Best-of-N Method
We define S to be the random variable for the number of bits set in a Bloom filter. Using order statistics, we can determine the distribution of the minimum value of the independent samples $S_1, S_2, \ldots, S_N$ (the minimum is the instance selected as Best-of-N). For order statistics, if the density f(s) and the distribution function F(s) are known, then the density of the minimum is $f_{\min}(s) = N f(s) [1 - F(s)]^{N-1}$.

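17 Analysis of Best-of-N Method
For a continuous distribution, the mean can be computed as $E[S_{\min}] = \int_{-\infty}^{\infty} s \, f_{\min}(s) \, ds$, where $f_{\min}$ is the density of the minimum. Based on heuristic and empirical evidence, the distribution of S appears to be close to normal, so f(s) is taken as the normal density with $\mu = E[S]$ and $\sigma = \sigma[S]$. We know that [equation not preserved in the transcript].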

18 Analysis of Best-of-N Method
We derive [equation not preserved in the transcript]. The probability of false positive for our method is then $\Pr[\text{false positive}] = (E[S_{\min}]/m)^k$, where $E[S_{\min}]$ is computed by substituting above.
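The analysis can be evaluated numerically along the following lines. This is a hedged sketch, not the authors' derivation: it assumes each of the k·n hash outputs picks a bit uniformly and independently (which gives the mean and variance of S used below), treats S as normal, and integrates the minimum-order-statistic density with the trapezoidal rule. All function names are hypothetical.

```python
import math


def mean_std_bits_set(m, n, k):
    """Mean and std. dev. of S under k*n independent uniform bit choices."""
    t = k * n
    p1 = (1 - 1 / m) ** t  # Pr[a given bit stays 0]
    p2 = (1 - 2 / m) ** t  # Pr[two given bits both stay 0]
    mean = m * (1 - p1)
    var = m * p1 + m * (m - 1) * p2 - (m * p1) ** 2
    return mean, math.sqrt(max(var, 0.0))


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def expected_min_standard_normal(n_instances, lo=-10.0, hi=10.0, steps=20000):
    """E[min of N standard normals]: integrate z * N f(z) [1 - F(z)]^(N-1)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        f_min = (n_instances * normal_pdf(z)
                 * (1 - normal_cdf(z)) ** (n_instances - 1))
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        total += weight * z * f_min
    return total * h


def best_of_n_false_positive(m, n, k, n_instances):
    """(E[S_min] / m) ** k under the normal approximation for S."""
    mu, sigma = mean_std_bits_set(m, n, k)
    e_s_min = mu + sigma * expected_min_standard_normal(n_instances)
    return (e_s_min / m) ** k
```

With `n_instances = 1` the expected minimum equals the mean of S, so the function reduces to the single-filter estimate; larger values shift the expected minimum below the mean and lower the result.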

20 Numerical Results
For a given m and n where k is chosen optimally, we study the probability of false positive as a function of N.
[Figure annotation: 30%]

21 Numerical Results
For Figure 5, n = 1000 and m = 16,000. For Figure 6, n is the same, but m = 32,000.
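For reference, the usual textbook approximations give the optimal k and the single-filter (N = 1) false positive probability for these two configurations. The sketch below uses k = (m/n) ln 2 rounded to an integer and Pr ≈ (1 − e^(−kn/m))^k; these are the standard Bloom filter formulas, not figures taken from the slides, and the function names are hypothetical.

```python
import math


def optimal_k(m, n):
    """Integer k closest to (m / n) * ln 2, at least 1."""
    return max(1, round((m / n) * math.log(2)))


def baseline_false_positive(m, n, k):
    """Standard approximation for a single Bloom filter."""
    return (1 - math.exp(-k * n / m)) ** k


for m in (16_000, 32_000):
    k = optimal_k(m, 1000)
    print(m, k, baseline_false_positive(m, 1000, k))
```

These values serve as the N = 1 baseline against which the Best-of-N curves can be compared.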

23 Experiments Evaluation
Environment
-Dell OptiPlex GX620 PC (Pentium 4, 3.4 GHz, 2 MBytes cache) with 1 GByte RAM.
-Windows XP, gcc compiler (version 3.4.2 mingw-special from Dev-C++).
-A list of 25,000 strings of unique music file names was obtained using BearShare 5.2.
Response variables
-Probability of false positive for the Bloom filter.
-Execution time to generate a Bloom filter.

24 Experiments Evaluation
Control variables
-Hashing method used: CRC32, MD5, RNG method, Kirsch method.
-Bloom filter parameters m, n, and k.
-Best-of-N parameter N.
-Number of strings used in the string test set.
Experiments description
-False positive experiment 1: Vary N, measure probability of false positive.
-False positive experiment 2: Vary N, measure false positives.
-Run-time experiment: Collect CPU time for each N.

25 Experiments Evaluation
-The experimental results for the probability of false positive perfectly agree with the analysis.
-CPU time results of the RNG method were as good as those of the Kirsch method, and better than those of CRC32.
[Figure label: Kirsch and RNG]

27 Summary & Future Work
Two improvements to Bloom filters:
-A new Best-of-N method that reduces the probability of false positive by generating N instances of a Bloom filter and selecting the best one.
-A new RNG hashing method that generates pseudo-hashes given a single seed hash.
Bloom filters could be implemented in a power management proxy for P2P applications.
Savings of up to 85 Mill. could be obtained if 25% of PCs running P2P applications use SmartNICs.

28 References
3. A. Broder and M. Mitzenmacher, “Network Applications of Bloom Filters: A Survey,” Internet Mathematics, Vol. 1, No. 4, pp. 485-509, 2005.
4. Energy Information Administration, “U.S. Household Electricity Report,” July 2005. Available: http://www.eia.doe.gov/emeu/reps/enduse/er01_us.html
5. L. Fan, P. Cao, and J. Almeida, “Bloom Filters - The Math,” 2000. Available: http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html
6. A. Kirsch and M. Mitzenmacher, “Less Hashing, Same Performance: Building a Better Bloom Filter,” Technical Report TR-02-5, Computer Science Group, Harvard University, 2005.
7. S. Lumetta and M. Mitzenmacher, “Using the Power of Two Choices to Improve Bloom Filters,” unpublished, 2006. Available: http://www.eecs.harvard.edu/~michaelm/postscripts/bftwo.ps
8. A. Pagh, R. Pagh, and S. Rao, “An Optimal Bloom Filter Replacement,” Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 823-829, 2005.
9. http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html
10. US Department of Energy, Energy Efficiency and Renewable Energy, “Estimating Appliance and Home Electronic Energy Use,” 2005. Available: http://www.eere.energy.gov/consumer/your_home/appliances/index.cfm/mytopic=10040

29 Thanks! I’ll be happy to answer any questions.

