Physical Planning for the Architectural Exploration of Large-Scale Chip Multiprocessors Javier de San Pedro, Nikita Nikitin, Jordi Cortadella and Jordi.


1 Physical Planning for the Architectural Exploration of Large-Scale Chip Multiprocessors Javier de San Pedro, Nikita Nikitin, Jordi Cortadella and Jordi Petit Universitat Politècnica de Catalunya (Barcelona) Project funded by Intel Corp.

2 Outline
Introduction
– Architectural exploration of CMPs
Physical planning
– Integration with architectural exploration
– Floorplanning and wire planning technology
Results and conclusions
NOCS 2013

3 Designing a Chip Multiprocessor
[Figure labels: CMP, Off-Chip Memory, DSP, Graphics, Data Mining, Bioinformatics]
– How many cores?
– How much L2/L3 on-chip cache?
– Interconnect: mesh/ring/bus?
– How many memory controllers?

4 Automated exploration
Models (performance/power): cores, on-chip caches, off-chip memories, interconnect fabrics, cache protocol
Workloads
Architectural configuration: number of cores, cluster size, L2/L3 size, hierarchical interconnect, memory controllers
Exploration tool
Constraints: area, throughput, power
Huge design space: billions of configurations
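To make the idea of constrained exploration concrete, here is a minimal sketch that enumerates a tiny design space and keeps only the configurations under an area limit. Everything in it is an assumption for illustration: the parameter values in `OPTIONS`, the function names, and especially `toy_area_mm2`, whose coefficients are invented and do not come from the presentation. Only the area constraint is modeled; throughput and power would need their own models.

```python
import math
from itertools import product

# Hypothetical parameter ranges (not taken from the slides).
OPTIONS = {
    "clusters": [4, 8, 16, 24],
    "cores_per_cluster": [2, 4, 6, 8],
    "l2_kb_per_core": [64, 96, 256],
    "mem_controllers": [1, 2, 4],
}


def toy_area_mm2(cfg):
    """Made-up area model: fixed cost per core, per KB of L2, per controller."""
    cores = cfg["clusters"] * cfg["cores_per_cluster"]
    core_area = cores * 2.0
    l2_area = cores * cfg["l2_kb_per_core"] * 0.005
    mc_area = cfg["mem_controllers"] * 3.0
    return core_area + l2_area + mc_area


def explore(options, area_limit_mm2):
    """Yield every configuration whose toy area fits under the limit."""
    keys = list(options)
    for values in product(*(options[k] for k in keys)):
        cfg = dict(zip(keys, values))
        if toy_area_mm2(cfg) <= area_limit_mm2:
            yield cfg


# Size of the full (unfiltered) space: product of the option counts.
space_size = math.prod(len(v) for v in OPTIONS.values())
feasible = list(explore(OPTIONS, area_limit_mm2=200.0))
print(f"{len(feasible)} of {space_size} configurations fit the area limit")
```

Exhaustive enumeration only works for a toy space like this one; with billions of configurations, an exploration tool has to sample or search the space instead.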

5 Our exploration flow
[Figure: flow diagram with stages Analytical Modeling* and Simulation; labels: Architectural configurations, Promising configurations]
*N. Nikitin et al., "Analytical Performance Modeling of Hierarchical Interconnect Fabrics," NOCS 2012

6 Configuration example
[Figure: one cluster; labels: C1, C2, L2, L3, Bus, NI, R]
- 6x4 mesh, 24 clusters
- total 144 cores
- 6 cores/cluster
- 1 C1, 128K L1, 256K L2
- 5 C2, 64K L1, 96K L2
- 146 Mb total shared L3
Throughput = 85.71 IPC
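The totals on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes that the L1 and L2 sizes are per core (the slide does not say so explicitly); the `CoreType` class and all variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreType:
    name: str
    count_per_cluster: int
    l1_kb: int  # assumed to be per core
    l2_kb: int  # assumed to be per core


MESH_COLS, MESH_ROWS = 6, 4
CORE_TYPES = [
    CoreType("C1", count_per_cluster=1, l1_kb=128, l2_kb=256),
    CoreType("C2", count_per_cluster=5, l1_kb=64, l2_kb=96),
]
THROUGHPUT_IPC = 85.71  # value reported on the slide

clusters = MESH_COLS * MESH_ROWS
cores_per_cluster = sum(ct.count_per_cluster for ct in CORE_TYPES)
total_cores = clusters * cores_per_cluster

# Aggregate private L2 capacity under the per-core assumption.
l2_kb_per_cluster = sum(ct.count_per_cluster * ct.l2_kb for ct in CORE_TYPES)
total_l2_kb = clusters * l2_kb_per_cluster

ipc_per_core = THROUGHPUT_IPC / total_cores

print(clusters, cores_per_cluster, total_cores)
print(total_l2_kb, round(ipc_per_core, 3))
```

The cluster and core counts reproduce the slide's figures (24 clusters, 6 cores per cluster, 144 cores); the aggregate L2 and per-core IPC are derived values that do not appear on the slide.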

7 Motivation for physical planning
Block area is used during exploration
Min area, but …
– Unbalanced aspect ratio
– L2 not adjacent to cores
– Uneven distribution of ring routers
[Figure: floorplan; labels: C, L2, r, R, L3, N, S, W, E]

8 Abutability in tiled CMPs
[Figure labels: N, S, E, W]

9 Over-the-cell routing
ISPD 2013, Tiled CMPs
[Figure labels: Cache, Core, Router; ≈ 500-1000 wires; 1000 wires ≈ 10 µm]

10 Physical planning
[Figure: exploration flow with stages Analytical Modeling, Physical Planning and Simulation; cluster labels: C, L2, L3, R, Local IC]
Estimations: area, wirelength, routability

11 Floorplanner
Slicing structures & Simulated Annealing*
Lightweight maze router
Constraints:
– Adjacency (Core ↔ L2)
– Balanced links (rings)
– Maximum length for critical nets
*D.F. Wong and C.L. Liu, "A New Algorithm for Floorplan Design," DAC 1986, pages 101-107.
[Figure: slicing floorplan with blocks 1-5 and its slicing tree with V and H cuts]
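As a rough illustration of slicing floorplans with simulated annealing, the sketch below evaluates the bounding box of a slicing floorplan written as a postfix (Polish) expression and anneals over it. It is a simplified sketch, not the floorplanner from the talk: the block dimensions are invented, blocks are not rotated, the only move is swapping two operands (the Wong-Liu algorithm uses a richer move set), and none of the slide's constraints (adjacency, balanced links, net length) are modeled.

```python
import math
import random

OPERATORS = ("H", "V")  # H: stack vertically, V: place side by side


def bounding_box(expr, dims):
    """Return (width, height) of the floorplan encoded by a postfix expression."""
    stack = []
    for tok in expr:
        if tok in OPERATORS:
            if len(stack) < 2:
                raise ValueError("malformed expression")
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if tok == "H":
                stack.append((max(w1, w2), h1 + h2))
            else:
                stack.append((w1 + w2, max(h1, h2)))
        else:
            stack.append(dims[tok])
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def area(expr, dims):
    w, h = bounding_box(expr, dims)
    return w * h


def anneal(expr, dims, steps=2000, t0=5.0, cooling=0.995, seed=0):
    """Simulated annealing using only operand swaps, which keep the expression valid."""
    rng = random.Random(seed)
    operand_pos = [i for i, tok in enumerate(expr) if tok not in OPERATORS]
    current = list(expr)
    cur_area = area(current, dims)
    best, best_area = list(current), cur_area
    t = t0
    for _ in range(steps):
        i, j = rng.sample(operand_pos, 2)
        current[i], current[j] = current[j], current[i]
        new_area = area(current, dims)
        delta = new_area - cur_area
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur_area = new_area
            if cur_area < best_area:
                best, best_area = list(current), cur_area
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo swap
        t *= cooling
    return best, best_area


# Hypothetical block sizes (width, height) for five blocks.
DIMS = {"1": (2, 3), "2": (2, 1), "3": (1, 1), "4": (3, 2), "5": (1, 2)}
START = ["1", "2", "H", "3", "V", "4", "5", "V", "H"]

start_area = area(START, DIMS)
best_expr, best_area = anneal(START, DIMS)
print(start_area, best_area, " ".join(best_expr))
```

The postfix form is convenient because a single stack pass yields the bounding box, so each annealing step is cheap to evaluate.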

12 Wire planner
SAT-based approach for gridded routing
Grid unit: link width (≈ 500-1000 wires)
Customizable for any type of Boolean-encoded constraints (abutability, 1D/2D routing, …)
[Figure: top view and cross-section view]
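To give a feel for what a Boolean-encoded routing constraint can look like, the sketch below generates CNF clauses for one simple rule: each grid cell carries at most one link. This rule, the variable numbering, and all function names are assumptions for illustration; the slides do not describe the actual encoding, and a complete formulation would also need connectivity constraints for each link.

```python
from itertools import combinations


def var(net, cell, n_cells):
    """1-based variable id meaning 'net occupies cell' (DIMACS-style numbering)."""
    return net * n_cells + cell + 1


def capacity_clauses(n_nets, n_cells):
    """Pairwise at-most-one encoding: no two nets may share a grid cell."""
    clauses = []
    for cell in range(n_cells):
        for a, b in combinations(range(n_nets), 2):
            clauses.append([-var(a, cell, n_cells), -var(b, cell, n_cells)])
    return clauses


def assignment_from(placement, n_nets, n_cells):
    """Build a full truth assignment from a set of occupied (net, cell) pairs."""
    return {
        var(n, c, n_cells): (n, c) in placement
        for n in range(n_nets)
        for c in range(n_cells)
    }


def satisfies(clauses, assignment):
    """A clause holds if at least one literal matches the assignment."""
    return all(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )


N_NETS, N_CELLS = 3, 4
clauses = capacity_clauses(N_NETS, N_CELLS)

legal = assignment_from({(0, 0), (1, 1), (2, 2)}, N_NETS, N_CELLS)
clash = assignment_from({(0, 0), (1, 0)}, N_NETS, N_CELLS)
print(len(clauses), satisfies(clauses, legal), satisfies(clauses, clash))
```

Clauses in this list-of-integers form can be handed to an off-the-shelf SAT solver; other rules, such as abutability, would be added as further clauses over the same variables.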

13 Filtering floorplans
[Figure label: Area limit]

14 After physical planning

15 Conclusions
Physical planning has significant impact during architectural exploration of tiled CMPs
Future work:
– New topologies (mesh of rings)
– Multi-module memory floorplanning
– Regularity and choppability

