1 Department of Electrical and Computer Engineering University of Massachusetts, Amherst Xin Huang and Tilman Wolf {xhuang,wolf}@ecs.umass.edu A Methodology for Evaluating Runtime Support in Network Processors

2 Runtime Support in Network Processor
- Network processor (NP)
  - Multi-core system-on-chip
  - Programmability & high packet processing rate
- Heterogeneous resources
  - Control processors
  - Multiple packet processors
  - Co-processors
  - Memory hierarchy
  - Interconnection
- Runtime support
  - Dynamic task allocation
(Figure: IXP 2800)

3 General Operation of Runtime Support in NP
- Input
  - Hardware resources
  - Workload
- Mapping method
- Output
  - Task allocation
- Dynamic adaptation
Different runtime support systems are difficult to compare.
(Figure labels: AP1, AP2, AP3)

4 Contributions
- Evaluation methodology
  - Traffic representation
  - Analytical system model based on queuing networks
- Results specific to 3 example runtime support systems
  - I. Ideal Allocation
  - II. Full Processor Allocation: R. Kokku, T. Riche, A. Kunze, J. Mudigonda, J. Jason, and H. Vin. A case for run-time adaptation in packet processing systems. In Proc. of the 2nd Workshop on Hot Topics in Networks (HOTNETS-II), Cambridge, MA, Nov. 2003
  - III. Partitioned Application Allocation: T. Wolf, N. Weng, and C.-H. Tai. Design considerations for network processor operating systems. In Proc. of ACM/IEEE Symposium on Architectures for Networking and Communication Systems (ANCS), pages 71-80, Princeton, NJ, Oct. 2005

5 Outline
- Introduction
- Evaluation Methodology
  - Dynamic Workload Model
  - Runtime System Model
- Results
- Summary

6 Workload
- NP workload is characterized by applications and traffic
- How to represent workload?

7 Dynamic Workload Model
- Workload graph
  - Application/Task: T
  - Traffic
  - Processing requirement
- Example: (figure)
- Processing requirement: R. Ramaswamy and T. Wolf. PacketBench: A tool for workload characterization of network processing. In Proc. of IEEE 6th Annual Workshop on Workload Characterization (WWC-6), pages 42-50, Austin, TX, Oct. 2003

8 Outline
- Introduction
- Evaluation Methodology
  - Dynamic Workload Model
  - Runtime System Model
- Results
- Summary

9 Runtime System Model
- Unified approach for all runtime systems
  - Queuing networks
  - Specific solution for each runtime system
  - Runtime mapping
  - Graph
  - Packet arrival rate
  - Service time
- Metrics for all runtime systems
  - Processor utilization
  - Average number of packets in the system

10 Three Example Runtime Support Systems
- System I: Ideal Allocation
- System II: Full Processor Allocation
- System III: Partitioned Application Allocation

11 Example Evaluation Model: System I
- Ideal Allocation
  - All processors can process all packets completely
  - Unrealistic, but provides a baseline
  - M/G/m FCFS single station

12 M/G/m Single Station Queuing System
- Cosmetatos approximation
- Evaluation metrics
References:
G. Cosmetatos. Some Approximate Equilibrium Results for the Multiserver Queue (M/G/r). Operational Research Quarterly, pages 615-620, 1976
G. Bolch, S. Greiner, H. de Meer, and K. S. Trivedi. Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. John Wiley & Sons, Inc., New York, NY, August 1998
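
The formulas on this slide were not captured in the transcript. As a rough illustration of how the two metrics can be computed for an M/G/m station, the sketch below uses the simpler Allen-Cunneen scaling of the M/M/m (Erlang C) result. This is a swapped-in approximation and not the Cosmetatos formula cited above; the function and parameter names are hypothetical.

```python
from math import factorial


def erlang_c(m: int, offered_load: float) -> float:
    """Probability that an arriving packet has to wait in an M/M/m queue."""
    rho = offered_load / m
    if rho >= 1.0:
        raise ValueError("station is unstable: utilization must be below 1")
    # Term for the 'all m servers busy' states, including the geometric tail.
    tail = offered_load ** m / factorial(m) / (1.0 - rho)
    # Terms for the states with fewer than m packets in the station.
    head = sum(offered_load ** k / factorial(k) for k in range(m))
    return tail / (head + tail)


def mgm_metrics(arrival_rate: float, service_rate: float, m: int, cs2: float):
    """Return (processor utilization, mean number of packets in the station).

    cs2 is the squared coefficient of variation of the service time.
    Assumption: Allen-Cunneen scaling of the M/M/m queue length,
    used here in place of the Cosmetatos approximation.
    """
    offered_load = arrival_rate / service_rate
    rho = offered_load / m
    queue_mmm = erlang_c(m, offered_load) * rho / (1.0 - rho)
    queue_mgm = queue_mmm * (1.0 + cs2) / 2.0
    # Packets in the station = packets waiting + packets in service.
    return rho, queue_mgm + offered_load
```

With m = 1 and cs2 = 1 the sketch reduces to the M/M/1 result, and with m = 1 and cs2 = 0 it matches the exact M/D/1 (Pollaczek-Khinchine) value, which gives a quick sanity check.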

13 Example Evaluation Model: System II
- Full Processor Allocation
  - Allocate entire tasks to subsets of processors
  - Allocate as few processors as possible to save power
  - Each processor runs one type of task
  - Reallocation is triggered by queue length
  - BCMP M/M/1-FCFS model (Jackson network)
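
The slide does not give the allocation rule itself. A minimal sketch of "as few processors as possible", assuming a task's traffic is split evenly over its processors, each processor behaves as an M/M/1 queue, and the queue-length trigger is expressed as a bound on the mean number of packets per processor (all hypothetical choices, as are the names):

```python
import math


def processors_needed(arrival_rate: float, service_rate: float,
                      max_mean_packets: float) -> int:
    """Smallest processor count that keeps each M/M/1 queue under the bound.

    Assumptions (not from the slides): traffic is split evenly, and
    reallocation aims at a mean of at most max_mean_packets per processor.
    """
    if arrival_rate <= 0:
        return 0
    # M/M/1: mean packets K = rho / (1 - rho), so K <= K_max
    # is equivalent to rho <= K_max / (1 + K_max).
    max_utilization = max_mean_packets / (1.0 + max_mean_packets)
    return max(1, math.ceil(arrival_rate / (service_rate * max_utilization)))
```

For example, a task receiving 250 packets per time unit on processors that each serve 100 packets per time unit, with a bound of 4 packets per processor, needs 4 processors under these assumptions.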

14 BCMP Network
- BCMP: Baskett, Chandy, Muntz, and Palacios
- Characteristics:
  - Open, closed, and mixed queuing networks
  - Several job classes
  - Four types of nodes: M/M/m-FCFS (class-independent service time), M/G/1-PS, M/G/∞-IS, and M/G/1-LCFS PR
- Product-form steady-state solution
- Open M/M/1-FCFS BCMP queuing network: evaluation metrics
Reference:
F. Baskett, K. Chandy, R. Muntz, and F. Palacios. Open, Closed, and Mixed Networks of Queues with Different Classes of Customers. Journal of the ACM, 22(2): 248-260, April 1975
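
The equations on this slide were lost in the transcript. For an open network of M/M/1-FCFS nodes (a Jackson network), the standard results are a per-node utilization of arrival rate over service rate, a mean of rho / (1 - rho) packets per node, and a joint state probability that is the product of the per-node geometric terms. A minimal sketch, assuming the per-node arrival rates are already known (for instance from the workload graph); the names are hypothetical:

```python
def jackson_metrics(arrival_rates, service_rates):
    """Per-node utilization and mean packet count for open M/M/1 nodes."""
    utilizations = [lam / mu for lam, mu in zip(arrival_rates, service_rates)]
    if any(rho >= 1.0 for rho in utilizations):
        raise ValueError("every node needs utilization below 1")
    mean_packets = [rho / (1.0 - rho) for rho in utilizations]
    return utilizations, mean_packets


def state_probability(utilizations, state):
    """Product-form probability of seeing state[i] packets at node i."""
    probability = 1.0
    for rho, packets in zip(utilizations, state):
        probability *= (1.0 - rho) * rho ** packets
    return probability
```

Summing `mean_packets` over all nodes gives the average number of packets in the system, the second metric named on the Runtime System Model slide.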

15 Example Evaluation Model: System III
- Partitioned Application Allocation
  - Tasks are partitioned across multiple processors
  - Synchronized pipelines
  - Allocate tasks equally across all processors to maximize throughput
  - Reallocate at fixed time intervals
  - Equations for evaluation metrics are the same as for System II
  - BCMP M/M/1-FCFS model (Jackson network)

16 Outline
- Introduction
- Evaluation Methodology
  - Dynamic Workload Model
  - Runtime System Model
- Results
- Summary

17 Setup
- System
  - 16 processing engines at 100 MIPS each
  - Queue lengths are infinite
- Workload
- Other assumptions
  - Partition applications into 7-15 subtasks
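
To connect this setup to the queuing models, a per-packet processing requirement in instructions can be converted into a service rate on a 100 MIPS engine. The sketch below shows only that arithmetic; the 2000-instruction figure in the usage note is a made-up example value, not a number from the slides, and the names are hypothetical.

```python
NUM_ENGINES = 16   # from the setup slide
ENGINE_MIPS = 100  # from the setup slide


def service_rate(instructions_per_packet: float,
                 mips: float = ENGINE_MIPS) -> float:
    """Packets per second one engine can process."""
    return mips * 1e6 / instructions_per_packet


def max_packet_rate(instructions_per_packet: float,
                    engines: int = NUM_ENGINES,
                    mips: float = ENGINE_MIPS) -> float:
    """Upper bound on packets per second if all engines are fully used."""
    return engines * service_rate(instructions_per_packet, mips)
```

For a hypothetical application needing 2000 instructions per packet, `service_rate(2000)` is 50,000 packets per second per engine, so 16 engines bound the aggregate rate at 800,000 packets per second.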

18 Processor Allocation Over Time
- Ideal: 16 processors
- Full Processor: changes with traffic
- Partitioned Application: 16 processors
(Figure: full processor allocation system)

19 Processor Utilization Over Time
- Ideal: lowest processor utilization
- Full Processor: highest processor utilization because it uses fewer processors
- Partitioned Application: low processor utilization, but not equal to the ideal case because of unbalanced task allocation and pipeline overhead

20 Packets in System Over Time
- Ideal: fewest packets
- Full Processor: packets queue up because of its high processor utilization
- Partitioned Application:
  - Most packets, because of unbalanced task allocation and pipeline overhead
  - More stable performance because of finer processor allocation granularity

21 Performance for Different Data Rates
- Ideal: smooth increase
- Full Processor: periodic peaks
- Partitioned Application: smooth increase
- Maximum data rate supported by each system
  - Ideal: 100%
  - Full Processor: 79.6%
  - Partitioned Application: 75.1%

22 Implication of the Results
- Ideal Allocation
  - Provides a baseline
- Full Processor Allocation
  - Allocates as few processors as possible to save power
  - Uses an entire processor as the allocation granularity
  - Good: high processor utilization
  - Bad: high performance variance
- Partitioned Application Allocation
  - Distributes tasks equally across all processors
  - Finer processor allocation granularity
  - Good: stable performance
  - Bad: an optimized solution is difficult to find, which leads to pipeline synchronization overhead

23 Summary
- Analytical methodology for evaluating different NP runtime support systems
- Dynamic workload model and runtime system model
- Results: 3 example runtime support systems
  - Quantitative metrics
  - Tradeoffs

24 Questions?

