Rev PA1 1 Performance energy trade-offs with Silicon Photonics Sébastien Rumley, Robert Hendry, Dessislava Nikolova, Keren Bergman.


1 Performance energy trade-offs with Silicon Photonics
Sébastien Rumley, Robert Hendry, Dessislava Nikolova, Keren Bergman

2 Goal of the study
Suppose (silicon photonics based) optical data movement between end-points
– Small connectivity (4 – 16)
– Between chips (not on the same chip), potentially several meters apart
What is the design space?
– Selection of the “topology”
– Choice of optical devices
– Amount of WDM parallelism
– Type of modulation and rate

3 Topology selection
[Diagrams: Send/Receive end-point pairs for each topology]
Basically, all-to-all, switched, or bus
… and all the possible combinations thereof, or hybrids
But let us start by analyzing the two extremes of this design space:
– All-to-all (a.k.a. full-mesh)
– Switched (a.k.a. star network)

4 Other aspects
Type of modulation and rate
– Simply 10 Gb/s per channel, OOK: considered a good trade-off between SERDES complexity and optical channel utilization
– To be extended in the future
Choice of optical devices and amount of WDM parallelism:
– Interrelated!
– Optical device parameters have to be optimized for a given number of wavelengths AND for a given topology
– The worst-case path determines the parameters, and the maximal number of channels supported
→ Design space: between 1 and max
→ Selecting the max is NOT the obvious choice!
[Diagram: Topology, Optical devices, Number of channels (interrelated)]
[1] S. Rumley, et al., "Modeling Silicon Photonics in Distributed Computing Systems: From the Device to the Rack".
[2] R. Hendry, D. Nikolova, S. Rumley, N. Ophir, K. Bergman, "Physical layer analysis and modeling of silicon photonic WDM bus architectures".
[3] R. Hendry, et al., "Modeling and Evaluation of Chip-to-Chip Scale Silicon Photonic Networks," IEEE Symposium on High Performance Interconnects (HOTI), 2014.

5 Why shouldn’t the number of channels (hence the bandwidth) always be maximized?
Each channel (color) needs its own modulator and detector devices
Each channel needs its own amount of initial optical power
– Provided by a (so far, rather inefficient) laser
→ This laser power dominates other power requirements
→ More channels generally DOES NOT make the system MORE energy efficient
More channels induce inter-channel effects. To (partly) compensate for those, more initial optical power is required
– More channels also means bigger, more “lossy” optical devices
→ More channels generally DOES make the system LESS energy efficient
Ideal (power-wise) number of channels: 1 (but adding a few will not drastically change the per-channel consumption)
– Except in cases involving devices whose consumption is independent of the number of channels (common to all channels)
– In these cases, the ideal (power-wise) number of channels is larger than one

6 Relation energy efficiency - channels
When going from POWER to ENERGY-PER-BIT efficiency, the utilization plays a major role
For a FIXED load (traffic, average network activity over time), the energy-per-bit looks like this:
– Flat if the resulting bandwidth is lower than the load (resulting in 100% utilization, and buffer overflow)
– Proportional to the number of channels once the bandwidth exceeds the load (each channel consumes power almost independently of the utilization)
– For a high number of channels, optical signal effects affect the power consumption super-linearly (for a low number, this is negligible)
[Plot: energy-per-bit (J) versus number of channels; x-axis marks at 1 channel, average load / channel rate, and max channels]
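The three regimes of this curve can be reproduced with a toy model. The sketch below is not the authors' physical-layer model: the per-channel power (10 mW), the maximum channel count (100) and the shape of the penalty term are hypothetical values chosen only to make the flat, proportional and super-linear regions visible.

```python
def energy_per_bit_pj(n_channels, load_gbps, channel_rate_gbps=10.0,
                      power_per_channel_mw=10.0, max_channels=100,
                      penalty_exponent=4):
    """Toy energy-per-bit model (pJ/bit) for one WDM link at a fixed load.

    All default parameter values are illustrative assumptions.
    """
    # Hypothetical super-linear penalty standing in for inter-channel effects:
    # negligible for few channels, significant close to max_channels.
    penalty = 1.0 + (n_channels / max_channels) ** penalty_exponent
    power_mw = n_channels * power_per_channel_mw * penalty
    # The link cannot carry more than its capacity (saturation below the load).
    carried_gbps = min(load_gbps, n_channels * channel_rate_gbps)
    # mW / (Gb/s) = 1e-3 W / 1e9 b/s = pJ per bit.
    return power_mw / carried_gbps


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 50, 100):
        print(n, round(energy_per_bit_pj(n, load_gbps=100.0), 3))
```

With a 100 Gb/s load, the output stays near 1 pJ/bit up to 10 channels (load / channel rate), grows roughly linearly after that, and bends upward as the channel count approaches the assumed maximum.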

7 So, how many channels?
From a computer architecture point of view, more channels, hence more bandwidth, is generally welcome
– Less queuing time when links are heavily loaded
– SHORTER serialization times
[Plot: latency (log) versus number of channels (log), with x-axis marks at average load / channel rate and max channels. Curves: queuing time; serialization time, inversely proportional to the bandwidth; their sum = head-to-tail latency]
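The two latency components can be sketched with a textbook queue. The code below assumes an M/D/1 model (Poisson arrivals, fixed-size packets, one server), with each packet striped evenly over all channels of the link. This is a stand-in chosen for illustration, not the simulator used for the results in these slides, and the 1024-byte packet size is an assumption.

```python
import math


def head_to_tail_latency_ns(n_channels, load_gbps, packet_bytes=1024,
                            channel_rate_gbps=10.0):
    """Queuing + serialization latency (ns) of one link, M/D/1 assumption."""
    bandwidth = n_channels * channel_rate_gbps      # Gb/s equals bits per ns
    serialization = packet_bytes * 8 / bandwidth    # ns to push one packet out
    utilization = load_gbps / bandwidth
    if utilization >= 1.0:
        return math.inf                             # saturated: queue grows without bound
    # Mean M/D/1 waiting time: rho * S / (2 * (1 - rho)).
    queuing = utilization * serialization / (2.0 * (1.0 - utilization))
    return queuing + serialization


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, head_to_tail_latency_ns(n, load_gbps=10.0))
```

Under these assumptions the latency is unbounded at one channel for a 10 Gb/s load, then falls quickly: queuing time vanishes as utilization drops and serialization time shrinks in inverse proportion to the bandwidth.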

8 Performance-energy trade-off for a link
Plotting one against the other
[Plot: energy-per-bit (J, log) versus head-to-tail latency (log). Annotations: optical signal effects with high number of channels; high latency (saturation, overflow); trade energy efficiency for latency; trade latency for energy efficiency]

9 Going back to the topology choice
In the all-to-all case, the total number of channels (i.e. bisection bandwidth) is a multiple of N(N-1)
– So at least N(N-1): with N=16, 240 channels → 2.4 Tb/s
– At most the maximum number supported by a link (typically 100*) times N(N-1): with N=16, 24,000 channels → 240 Tb/s
In the switched case, it is a multiple of N
– So at least N: with N=16, 16 channels → 160 Gb/s
– At most the maximum number supported by the switch, e.g. 40*: with N=16, 640 channels → 6.4 Tb/s
→ The two topologies differ in the range of bandwidths they can offer
– For low loads, the all-to-all might be overkill, even with a single channel
– For high loads, the switch might be short of a few Tb/s, even with 40 channels
But there are several other very important differences…
* Depending on the assumptions made on the device behavior; the numbers mentioned here are indicative
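The bandwidth ranges quoted on this slide follow from simple arithmetic, sketched below. The function name is hypothetical. The 10 Gb/s channel rate comes from the earlier slide on modulation, and the per-link maxima of 100 and 40 channels are the indicative figures given above.

```python
CHANNEL_RATE_GBPS = 10  # per-channel rate assumed throughout the slides


def bandwidth_range_gbps(n_endpoints, max_channels_per_link, topology):
    """Return (min, max) aggregate bandwidth in Gb/s for a topology."""
    if topology == "all-to-all":
        links = n_endpoints * (n_endpoints - 1)  # one link per ordered pair
    elif topology == "switched":
        links = n_endpoints                      # one link per end-point
    else:
        raise ValueError(f"unknown topology: {topology}")
    low = links * CHANNEL_RATE_GBPS                           # 1 channel per link
    high = links * max_channels_per_link * CHANNEL_RATE_GBPS  # fully populated
    return low, high


if __name__ == "__main__":
    print(bandwidth_range_gbps(16, 100, "all-to-all"))  # (2400, 240000)
    print(bandwidth_range_gbps(16, 40, "switched"))     # (160, 6400)
```

The ranges overlap only between 2.4 Tb/s and 6.4 Tb/s, which is where the two topologies can be compared at equal bandwidth.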

10 More differences – on the energy side
Consider a case where we want to provide 4.8 Tb/s of bisection bandwidth between 16 endpoints
– Falls in the range of both all-to-all and switched (480 channels in total)
– All-to-all means 2 channels per link (we have 16x15 = 240 links)
– Switched means 30 channels per link (with 16 links)
– For a total traffic of (e.g.) 2.4 Tb/s, utilization is 50% in both cases
Same total number of wavelengths, same traffic
BUT, more wavelengths per link in the switched case (and switch signal attenuation to be compensated)
→ Switched architecture less energy efficient (more energy-per-bit)
What about the latency?
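The channel counts in this example can be checked with a short calculation. The helper name is hypothetical, and the sketch assumes channels are spread evenly over links at 10 Gb/s each.

```python
def channels_per_link(total_bw_gbps, n_endpoints, topology, channel_rate_gbps=10):
    """Channels needed on each link to reach a target aggregate bandwidth."""
    total_channels = total_bw_gbps // channel_rate_gbps
    if topology == "all-to-all":
        links = n_endpoints * (n_endpoints - 1)
    elif topology == "switched":
        links = n_endpoints
    else:
        raise ValueError(f"unknown topology: {topology}")
    if total_channels % links:
        raise ValueError("target bandwidth is not a multiple of the link count")
    return total_channels // links


if __name__ == "__main__":
    target_gbps, traffic_gbps = 4800, 2400
    print(channels_per_link(target_gbps, 16, "all-to-all"))  # 2
    print(channels_per_link(target_gbps, 16, "switched"))    # 30
    print(traffic_gbps / target_gbps)                        # 0.5 utilization
```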

11 Topology impact – on the latency side
In the previous example, all-to-all and switched provide the same bisection bandwidth.
– Same asymptotic throughput, same saturation load.
– But…
Switched topology implies resource sharing among flows
– Impacts queuing latency: packets in a flow are not only delayed by previous packets in the same flow, but also by other flows’ packets.
→ Less predictable, potentially higher latency distribution
– On the other hand, serialization latency is improved by the fact that all outgoing channels can be used in parallel for a single packet. From 2 to 30 channels → up to 15x improvement in serialization latency
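The 15x figure is the ratio of the per-link channel counts, as the sketch below shows. It assumes a 1024-byte packet striped evenly across all channels of a link and ignores any striping overhead. Both are simplifying assumptions, not statements from the slides.

```python
def serialization_ns(packet_bytes, n_channels, channel_rate_gbps=10.0):
    """Time (ns) to serialize one packet striped over n_channels."""
    # 1 Gb/s equals 1 bit per ns, so bits / (Gb/s) gives nanoseconds.
    return packet_bytes * 8 / (n_channels * channel_rate_gbps)


if __name__ == "__main__":
    all_to_all = serialization_ns(1024, 2)   # 409.6 ns with 2 channels
    switched = serialization_ns(1024, 30)    # about 27.3 ns with 30 channels
    print(all_to_all, switched, all_to_all / switched)
```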

12 Differences among topologies in terms of performance-energy trade-off
At constant bisection bandwidth:
– All-to-all is intrinsically energy optimal
– Switched is intrinsically latency optimal (at least below the saturation load)
But a given bisection bandwidth is not a requirement; we can “test” different degrees of WDM parallelism
– See how these populate the Pareto front for each topology
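A Pareto front over (latency, energy-per-bit) design points can be extracted as sketched below. The sample points are made up for illustration and do not come from the study. Lower is better on both axes.

```python
def pareto_front(points):
    """Return the non-dominated (latency, energy) points, sorted by latency."""
    front = []
    best_energy = float("inf")
    # After sorting by latency, a point is non-dominated only if its energy
    # is strictly lower than that of every faster point seen so far.
    for latency, energy in sorted(points):
        if energy < best_energy:
            front.append((latency, energy))
            best_energy = energy
    return front


if __name__ == "__main__":
    # Hypothetical design points: (head-to-tail latency in ns, pJ/bit).
    designs = [(600, 1.0), (240, 2.0), (300, 2.5), (100, 6.0), (120, 7.0)]
    print(pareto_front(designs))  # [(100, 6.0), (240, 2.0), (600, 1.0)]
```

Running this once per topology, with one point per WDM channel count, gives the per-topology fronts that the following slides compare.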

13 Main result
– 16 endpoints
– 10 Gb/s average load between pairs (150 Gb/s per client, 2.4 Tb/s total)
– 1 KB packets, Poisson arrivals
– Includes physical layer analysis
– Power consumption of components taken from the literature
– Switch realizes round-robin arbitration
[Plot annotations: All-to-all allows the shortest latencies; gap; switched offers solutions “in between”; all-to-all achieves the best energy efficiencies; 1 channel per link → 100% utilization = saturation load]

14 Other loads
[Plots: 100 Gb/s per client; 225 Gb/s per client. Annotations: No solution below 1 pJ/bit; switched topology is completely dominated]

15 Measuring latency with Poisson traffic is only an indicator
→ Let’s test the designs with traffic generated by a simple application skeleton.
[Figure panels: All-to-all, 2 channels per link, 480 total; Switched, 30 channels per link, 480 total]
Initial broadcast takes the same time to complete
In the switched case, some messages arrive earlier.
The “shift down” communication phase fully benefits from the switch (no congestion)

16 Pareto trade-off for application skeleton
[Plot: x-axis is time-to-solution (ns)]
Switched architecture almost totally dominated!
Does this contradict the previous slide? → NO

17 Performance and energy relations
[Plot: point labels 2, 4, 3, 5, 30; annotations “All-to-all, 2 channels per link” and “Switched, 30 channels per link”]
The switched architecture, with the same total number of channels, DOES lead to a shorter time-to-solution than all-to-all
But the presence of the switch AND the multiple channels induce a penalty in terms of energy
In this particular case, a larger latency gain can be achieved by doubling the channels in the all-to-all, at a far smaller energy penalty.

18 Application results discussion
Sensitive to physical layer parameters.
– Depending on the assumptions made on future fabrication possibilities, results for the switched topology might improve slightly
Sensitive to network size
– With 8 or even 4 clients, the switch penalty is far smaller
Sensitive to the application itself
So far, arbitration latency neglected
– That will push the green curves to the right
Arbitration power consumption neglected, too
– That will push the green curves up
But silicon area of all-to-all neglected, too…

19 Conclusions
Main conclusion: too close to call!
– Although the all-to-all architecture seems to do pretty well for a “brute force” approach, the switched architecture does not seem to be far behind
– For a given context, one may be slightly better than the other
Important to expose the “solution diversity” to the higher layers
→ Integration of the resulting models (all-to-all and switched) in SSTMicro!
Good potential probably lies in hybrid architectures
– Example: one switch for the even end-points, another for the odd end-points
– Doubles the number of links, shrinks the switch radix by a factor of two
→ Explore the possible hybrids and integrate in SSTMicro, too.
Sensitivity analysis
– Physical layer: around 20 parameters; arbitration: 3-4 parameters; application traffic type: 3-4 parameters…

