Performance Tradeoffs for Static Allocation of Zero-Copy Buffers Pål Halvorsen, Espen Jorde, Karl-André Skevik, Vera Goebel, and Thomas Plagemann Institute.


1 Performance Tradeoffs for Static Allocation of Zero-Copy Buffers Pål Halvorsen, Espen Jorde, Karl-André Skevik, Vera Goebel, and Thomas Plagemann Institute for Informatics, University of Oslo, Norway Multimedia and Telecommunications Track (MTT ’02) – 28th EUROMICRO Conference, Dortmund, Germany, September 2002

2 Overview
© 2002 Pål Halvorsen MTT’02, Dortmund, Germany, September 2002
- Application scenario
- The INSTANCE project
- Zero-copy data paths
  - static buffer allocation
  - performance evaluation
- Summary and conclusions

3 Application Scenario
- Media-on-Demand server: applicable to services such as News- or Video-on-Demand provided by city-wide cable or pay-per-view companies
- Project goals: optimize performance within a single server
  - Reduce resource requirements
  - Maximize number of clients
- Retrieval is the bottleneck. Some important factors:
  - Memory management
  - Communication protocol processing
  - Error management
Figure labels: multimedia storage server, network

4 The INSTANCE Project
- We try to make optimal use of a given set of resources:
  - network level framing
  - integrated error management
  - memory architecture
- Also listed on the slide: periodic broadcast service, dynamic zero-copy buffers, static zero-copy buffers
- Project goals: optimize performance within a single server
  - Reduce resource requirements
  - Maximize number of clients

5 General Operating System Structure and Data Path
Figure labels: file system, communication system, application, user space, kernel space

6 Example: Intel Hub Architecture (850 Chipset) – II
Figure labels:
- Pentium 4 processor: registers, cache(s)
- system bus (64-bit, 400/533 MHz)
- memory controller hub
- RAM interface (two 64-bit, 200 MHz)
- RDRAM
- hub interface (four 8-bit, 66 MHz)
- I/O controller hub
- PCI bus (32-bit, 33 MHz)
- PCI slots
- network card, disk
- file system, communication system, application
Note: these transfers only show data movement between sub-systems. Additionally, data-touching operations within a sub-system require that data is moved from memory to the CPU, e.g.:
- checksum calculation
- encryption
- data encoding
- forward error correction
Thus, copy operations are expensive:
- bandwidth is limited
- they consume CPU cycles
- they affect the cache

7 Zero-Copy: Basic Idea
Figure labels: file system, communication system, application, user space, kernel space, bus(es), mbuf, buf, b_data, m_data

8 Zero-Copy: Dynamic Allocation
Figure labels:
- file system, communication system, application
- mbuf memory pools: mbuf, mbuf cluster
- buf memory pools: buf, buf cluster
- user space memory
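The dynamic variant takes headers and data regions from memory pools on demand. As a rough illustration of why a pool get or put can be much cheaper than a general-purpose allocator call, here is a minimal user-space sketch of a fixed-size free-list pool. All names (`item_pool`, `pool_get`, `pool_put`) are hypothetical and are not the NetBSD kernel interfaces used in the talk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size item pool; NOT the NetBSD kernel pool API. */
struct pool_item { struct pool_item *next; };

struct item_pool {
    struct pool_item *free_list; /* singly linked list of free items */
    void *backing;               /* one allocation that holds every item */
    size_t item_size;            /* rounded-up size of one item */
    size_t free_count;           /* number of items currently free */
};

/* Carve nitems fixed-size items out of a single malloc'ed region. */
static int pool_init(struct item_pool *p, size_t item_size, size_t nitems)
{
    size_t align = sizeof(struct pool_item);

    /* Round the size up so each item can hold, and align, the link. */
    if (item_size < align)
        item_size = align;
    item_size = (item_size + align - 1) / align * align;

    p->backing = malloc(item_size * nitems);
    if (p->backing == NULL)
        return -1;
    p->item_size = item_size;
    p->free_list = NULL;
    p->free_count = 0;
    for (size_t i = 0; i < nitems; i++) {
        struct pool_item *it =
            (struct pool_item *)((char *)p->backing + i * item_size);
        it->next = p->free_list;
        p->free_list = it;
        p->free_count++;
    }
    return 0;
}

/* Constant-time get: pop the head of the free list (NULL if empty). */
static void *pool_get(struct item_pool *p)
{
    struct pool_item *it = p->free_list;

    if (it == NULL)
        return NULL;
    p->free_list = it->next;
    p->free_count--;
    return it;
}

/* Constant-time put: push the item back onto the free list. */
static void pool_put(struct item_pool *p, void *item)
{
    struct pool_item *it = item;

    assert(item != NULL);
    it->next = p->free_list;
    p->free_list = it;
    p->free_count++;
}

static void pool_destroy(struct item_pool *p)
{
    free(p->backing);
    p->backing = NULL;
    p->free_list = NULL;
    p->free_count = 0;
}
```

Both operations are a couple of pointer updates with no searching or coalescing. A kernel pool would additionally need locking and a policy for growing or shrinking, which this sketch leaves out.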

9 Zero-Copy: Static Allocation
- Allocate all needed memory during stream initialization
- If possible, set all buf and mbuf data pointers
- Use alternating buffers
Figure labels: header, data area, data pointer, mbuf pointer, buf pointer, bufs, mbufs

10 Zero-Copy: Operations
- Stream initialization
- Read operation
- Send operation
- Stream close
Figure labels: currently used buffer, header, data area, bufs, mbufs, send offset
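The four operations on statically allocated, alternating buffers can be sketched in user-space C as follows. This is an assumption-laden illustration, not the NetBSD implementation: `sbuf` and `smbuf` are simplified stand-ins for the kernel's buf and mbuf headers, `memcpy` stands in for the disk transfer, and every function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NBUFS 2 /* two alternating buffers per stream */

/* Hypothetical, simplified stand-ins for the kernel buf and mbuf headers. */
struct sbuf  { char *b_data; size_t b_len; };
struct smbuf { char *m_data; size_t m_len; };

struct static_buffer {
    char *data;          /* data area shared by the buf and all mbufs */
    struct sbuf buf;     /* file system view of the data area */
    struct smbuf *mbufs; /* one preset header per packet */
    size_t filled;       /* valid bytes after a read */
    int full;            /* 1 while the block waits to be sent */
};

struct stream {
    struct static_buffer sb[NBUFS];
    size_t block_size, pkt_size, npkts;
    int read_idx;    /* buffer the next read fills */
    int send_idx;    /* buffer currently being sent */
    size_t send_off; /* index of the next packet to send */
};

/* Stream close: release everything allocated at initialization. */
static void stream_close(struct stream *s)
{
    for (int i = 0; i < NBUFS; i++) {
        free(s->sb[i].data);
        free(s->sb[i].mbufs);
    }
    memset(s, 0, sizeof *s);
}

/* Stream initialization: allocate all memory and preset every pointer. */
static int stream_init(struct stream *s, size_t block_size, size_t pkt_size)
{
    assert(pkt_size > 0 && block_size % pkt_size == 0);
    memset(s, 0, sizeof *s);
    s->block_size = block_size;
    s->pkt_size = pkt_size;
    s->npkts = block_size / pkt_size;
    for (int i = 0; i < NBUFS; i++) {
        struct static_buffer *b = &s->sb[i];

        b->data = malloc(block_size);
        b->mbufs = malloc(s->npkts * sizeof *b->mbufs);
        if (b->data == NULL || b->mbufs == NULL) {
            stream_close(s);
            return -1;
        }
        b->buf.b_data = b->data;
        b->buf.b_len = block_size;
        for (size_t k = 0; k < s->npkts; k++) {
            b->mbufs[k].m_data = b->data + k * pkt_size;
            b->mbufs[k].m_len = pkt_size;
        }
    }
    return 0;
}

/* Read: fill the next buffer; -1 means the caller must wait for a send. */
static int stream_read(struct stream *s, const char *src, size_t len)
{
    struct static_buffer *b = &s->sb[s->read_idx];

    if (b->full || len == 0 || len > s->block_size)
        return -1;
    memcpy(b->buf.b_data, src, len); /* stands in for the disk transfer */
    b->filled = len;
    b->full = 1;
    s->read_idx = (s->read_idx + 1) % NBUFS;
    return 0;
}

/* Send: hand out the next preset mbuf; no allocation and no data copy. */
static struct smbuf *stream_send_next(struct stream *s)
{
    struct static_buffer *b = &s->sb[s->send_idx];
    struct smbuf *m;
    size_t off, left;

    if (!b->full)
        return NULL; /* nothing has been read into this buffer yet */
    off = s->send_off * s->pkt_size;
    m = &b->mbufs[s->send_off];
    left = b->filled - off;
    m->m_len = left < s->pkt_size ? left : s->pkt_size;
    s->send_off++;
    if (off + m->m_len >= b->filled) {
        /* Block drained. A real sender would release the buffer only
         * after the network card has finished transmitting. */
        b->full = 0;
        s->send_off = 0;
        s->send_idx = (s->send_idx + 1) % NBUFS;
    }
    return m;
}
```

In this sketch the read and send paths touch only fields that were set up once in `stream_init`. A read into a buffer that has not been fully sent fails, so the caller has to wait; that is the price of fixing the buffer set in advance.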

11 Performance: Test Setup
- Implemented in NetBSD
- Dell Precision Workstation 620
  - PIII 1 GHz CPU
  - 100 Mbps network card
  - Single disk storage
- Software probe to measure allocation times
  - RDTSC instruction
  - CPUID instruction
  - probe overhead 206 cycles
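A cycle-counting probe of this kind can be approximated in user space as below. This is only a sketch under assumptions: it uses the GCC/Clang helpers `__cpuid` (from `<cpuid.h>`) and `__rdtsc` (from `<x86intrin.h>`) on x86, falls back to the standard `clock()` elsewhere (which does not count cycles), and the function names are hypothetical. The authors' in-kernel probe is not shown in the slides.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <x86intrin.h>

/* Read the time-stamp counter. CPUID is executed first because it is a
 * serializing instruction: earlier instructions finish before RDTSC runs. */
static inline uint64_t probe_now(void)
{
    unsigned int a, b, c, d;

    __cpuid(0, a, b, c, d);
    (void)a; (void)b; (void)c; (void)d;
    return __rdtsc();
}
#else
#include <time.h>

/* Fallback for non-x86 builds: processor time ticks, NOT CPU cycles. */
static inline uint64_t probe_now(void)
{
    return (uint64_t)clock();
}
#endif

/* Estimate the cost of the probe itself as the smallest difference
 * between two back-to-back readings. */
static uint64_t probe_overhead(unsigned int samples)
{
    uint64_t best = UINT64_MAX;

    assert(samples > 0);
    for (unsigned int i = 0; i < samples; i++) {
        uint64_t t0 = probe_now();
        uint64_t t1 = probe_now();

        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Elapsed count with the probe overhead removed, clamped at zero. */
static uint64_t probe_net(uint64_t start, uint64_t end, uint64_t overhead)
{
    uint64_t elapsed = end - start;

    return elapsed > overhead ? elapsed - overhead : 0;
}
```

With the 206-cycle overhead from the slide, a raw reading of 500 cycles would be reported as `probe_net(0, 500, 206)`, that is, 294 cycles. At 1 GHz one cycle is 1 ns, so cycle counts on the test machine convert directly to nanoseconds.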

12 Evaluation: Zero-Copy Transfer Rate
- Throughput increase of ~2.7 times per stream (can at least double the number of clients)
- Zero-copy transfer rate limited by network card and storage system
- A later dynamic version:
  - saturated a 1 Gbps NIC
  - reduced processing time by approximately 50 %
  - huge improvement in number of concurrent streams
Figure labels: approx. 12 Mbps, approx. 6 Mbps

13 Evaluation: Static Allocation
- Saves time to get and free memory regions
  - malloc – 5.80 µs, free – 6.48 µs
  - get_poolitem – 0.15 µs, put_poolitem – 0.15 µs
  - e.g., 1 GB file, 64 KB disk blocks, 1 KB packets
    - retrieving 1 GB: 16 K disk I/Os (1 buf, 1 region each)
    - sending 1 GB: 1 M packets (2 mbufs each, sharing data region)
    - in total 2 M + 32 K get and free operations: 0.63 s for sending the whole file, assuming a pool (sending takes about 10 s in total, or 7 s kernel time, with fast devices)
- Might save time to set data pointers and length fields
- Inflexible (variable bit rate streams)
- Strict waiting on static buffers
- Saves CPU cycles at the cost of statically allocating memory
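The 0.63 s figure can be reproduced with a short calculation. The sketch below assumes one reading of the slide: each of the 2 M + 32 K items is obtained once and released once, at the measured per-call costs. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct alloc_cost {
    uint64_t ops;   /* items that are each obtained once and freed once */
    double seconds; /* total time spent in the get and free calls */
};

/* Allocation overhead for streaming one file with dynamic buffers. */
static struct alloc_cost alloc_overhead(uint64_t file_bytes,
                                        uint64_t block_bytes,
                                        uint64_t packet_bytes,
                                        double get_us, double put_us)
{
    struct alloc_cost c;
    uint64_t disk_ios, packets;

    assert(block_bytes > 0 && packet_bytes > 0);
    disk_ios = file_bytes / block_bytes; /* 1 buf + 1 data region each */
    packets = file_bytes / packet_bytes; /* 2 mbufs each */
    c.ops = 2 * disk_ios + 2 * packets;
    c.seconds = (double)c.ops * (get_us + put_us) * 1e-6;
    return c;
}
```

Calling `alloc_overhead(1ULL << 30, 64ULL << 10, 1ULL << 10, 0.15, 0.15)` gives 2,129,920 items (32 K for the disk side plus 2 M for the network side) and about 0.639 s, which matches the slide's 0.63 s for the pool case. That is roughly 6 % of the quoted 10 s total send time, or about 9 % of the 7 s kernel time.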

14 Conclusions and Future Work
- Zero-copy reduces data movement overhead in the OS (reduces processing time by approximately 50 %)
- Static versus dynamic allocation of zero-copy buffers
  - tradeoff between flexibility and CPU resources
  - static saves CPU, but is inflexible
  - dynamic is flexible, but adds allocation costs
  - we will use our dynamic implementation in our future work
- Ongoing and future work:
  - Tune dynamic implementation (ongoing)
  - Zero-copy network–disk path (ongoing)
  - Add memory caching

15 Questions?

