Exploring Concentration and Channel Slicing in On-chip Network Router


1 Exploring Concentration and Channel Slicing in On-chip Network Router
Prabhat Kumar (1), Yan Pan (1), John Kim (2), Gokhan Memik (1), Alok Choudhary (1)
(1) Northwestern University  (2) KAIST, South Korea

2 Contributions of the Work
Performance implication of concentration
Integrated vs. external concentration
  47% reduction in area
  36% reduction in energy
  10% performance degradation
Channel slicing
Virtual concentration for efficient resource utilization
  69% reduction in area
  32% reduction in energy

3 Outline
Motivation
Concentration
Channel slicing
Virtual concentration
Results
Conclusion

4 Motivation
Limited Budget
  Area
  Energy
Efficiency
  Performance
  Cost optimization
Design Options
  Concentration
  Channel Slicing
Previous Work
  Concentration: CMESH, Flattened Butterfly, Firefly, Multidrop Express Channels (on-chip)
  Channel Slicing: Dragonfly (off-chip)
On-chip communication is critical for CMPs because it determines the performance of the system. Concentration and slicing have been proposed before, but their trade-offs are not well investigated.

5 Motivation Firefly [Pan ISCA’09] Simplified Router Microarchitecture

6 Typical Topology
2D Mesh: # Processor Nodes = # Routers

7 Solution: Concentration
Multiple cores share one router
Benefits
  Resource sharing
  Network diameter decreases
  Local communication cost decreases
Drawbacks
  Router complexity increases significantly
Example (figure): concentration C = 4. Router radix increases from 5 to 8, and if the bandwidth density is kept constant, the channel width is 2x the channel width of the 2D mesh.
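The example's numbers (C = 4, radix 8, width 2x) can be reproduced with a small sketch. It assumes a square 2D mesh in which every router keeps four network ports, and it reads "constant bandwidth density" as constant bisection bandwidth. The function names are illustrative and do not come from the presentation.

```python
import math

def concentrated_radix(c: int, network_ports: int = 4) -> int:
    """Router radix when c cores share one 2D-mesh router (assumed: 4 network ports)."""
    return network_ports + c

def channel_width_factor(c: int) -> float:
    """Channel width relative to the unconcentrated mesh.

    Assumption: concentrating c cores per router shrinks each mesh dimension
    by sqrt(c), so sqrt(c) times fewer channels cross the bisection and each
    one must be sqrt(c) times wider to keep bisection bandwidth constant.
    """
    return math.sqrt(c)

print(concentrated_radix(1), concentrated_radix(4))  # 5 8
print(channel_width_factor(4))                       # 2.0
```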

8 Issue : Router Complexity
Router components
  Crossbar switch: area ~ (radix)^2
  Arbitration logic
Figures (2D mesh example): 5x5 crossbar for the plain 2D mesh; 8x8 crossbar for the 2D mesh with C = 4 (integrated concentration)
Crossbar switch area increases quadratically with the radix, and a higher radix also increases the complexity of the arbitration logic. How can we reduce the complexity of the crossbar switch?
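As a rough illustration of the quadratic scaling, the sketch below compares relative crossbar area for the 5x5 and 8x8 cases at equal channel width. It is a first-order model only (area proportional to radix squared, constant factors ignored) and is not the area model used in the presentation. The function name is hypothetical.

```python
def relative_crossbar_area(radix: int, baseline_radix: int = 5) -> float:
    """Crossbar area relative to a baseline, assuming area ~ radix^2 at equal channel width."""
    return (radix / baseline_radix) ** 2

ratio = relative_crossbar_area(8)
print(f"8x8 vs 5x5 crossbar area: {ratio:.2f}x")  # 2.56x
```

Because the concentrated channels are also wider, the real gap is larger than this radix-only estimate suggests.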

9 Design Option: External Concentration
Multiplex injection ports
De-multiplex ejection ports
Benefits
  Router radix decreases
  Area decreases
Cons
  Reduced switching capacity
Traffic going from the injection ports to the ejection ports still sees the equivalent of an 8x8 crossbar, while intermediate traffic (east, west, north, south) sees a 5x5 crossbar.

10 Issue: Arbitration
External Concentration
  Two levels of arbitration
Parallel Arbitration
  Use router switch information for concentration arbitration

11 Outline
Motivation
Concentration
Channel slicing
Virtual concentration
Results
Conclusion

12 Issue: Wide Channels
Constant bandwidth density => wider channels
Inefficient utilization
  Cache lines ~ bits wide
  Request, control, coherency packets much narrower
Router area
  Switch area ~ (channel width)^2
Example (figure): C = 4, Radix = 8, Width = 2x

13 Design Option: Channel Slicing
Slice wide channels
Wide channels imply larger router components
Pros
  Complexity reduces further
  Better channel utilization
Cons
  Serialization latency increases (for long packets)
Example (figure): C = 4, Slicing Factor = 4
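The slicing trade-off can be sketched numerically. Under the first-order model from the previous slide (switch area ~ channel width squared), splitting one channel of width W into s slices gives s switches of width W/s, so total switch area scales as 1/s, while a long packet needs more cycles to cross one narrower slice. The channel width and packet sizes below are hypothetical values chosen for illustration, not figures from the presentation.

```python
import math

def sliced_switch_area(width: int, slices: int) -> float:
    """Total area of `slices` switches, each width/slices bits wide (area ~ width^2)."""
    return slices * (width / slices) ** 2

def serialization_cycles(packet_bits: int, width: int, slices: int = 1) -> int:
    """Cycles to send one packet over a single slice of width/slices bits."""
    return math.ceil(packet_bits / (width / slices))

W = 256  # hypothetical unsliced channel width in bits
for s in (1, 2, 4):
    area = sliced_switch_area(W, s) / sliced_switch_area(W, 1)
    long_pkt = serialization_cycles(512, W, s)   # hypothetical long (data) packet
    short_pkt = serialization_cycles(64, W, s)   # hypothetical short (control) packet
    print(f"s={s}: area={area:.2f}x, long={long_pkt} cycles, short={short_pkt} cycles")
```

With these assumed values, a slicing factor of 4 cuts the modeled switch area to a quarter and keeps the short packet at one cycle, while the long packet's serialization grows from 2 to 8 cycles.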

14 Outline
Motivation
Concentration
Channel slicing
Virtual concentration
Results
Conclusion

15 Combining Concentration and Slicing
Slicing + Concentration
Virtual Concentration
  Nodes dedicated to a sliced layer
  No sharing of input bandwidth

16 Outline
Motivation
Concentration
Channel slicing
Virtual concentration
Results
Conclusion

17 Evaluation Setup
Simulation environment
  Booksim simulator
Constant on-chip resources
  Equal bisection bandwidth for all configurations
  Equal amount of buffer storage

18 External Concentration
Zero-load latency
  21% reduction for UR
  25% reduction for Bitcomp
Throughput
  10% reduction for UR
  No change for Bitcomp
Area
  47% reduction compared to Integrated
Energy
  36% reduction compared to Integrated

19 Virtual Concentration
Zero-load latency
  No change compared to MESH
  16% increase for UR compared to Integrated
  12% increase for Bitcomp compared to Integrated
  (the 16% and 12% figures were flagged by the authors for revision against the plots)
Throughput
  No significant difference for UR
  4.5% increase for Bitcomp
Area
  69% reduction compared to MESH
Energy
  32% reduction compared to MESH

20 Area and Energy Consumption
Area
  69% reduction compared to MESH
  88% reduction compared to Integrated concentration
Energy
  32% reduction compared to MESH
  35% reduction compared to Integrated concentration

21 Conclusion
Combining concentration and channel slicing provides an efficient NoC design.
External concentration reduces complexity with some performance degradation.
Virtual concentration saves 69% area and 32% energy compared to 2D MESH.

22 Thank you for your patience!!
Questions?

