May 10-02 Mike Drob Grant Furgiuele Ben Winters Advisor: Dr. Chris Chu Client: IBM IBM Contact – Karl Erickson.


1 May 10-02 Mike Drob Grant Furgiuele Ben Winters Advisor: Dr. Chris Chu Client: IBM IBM Contact – Karl Erickson

2 Project Overview
- The circuit placement problem is a bottleneck of physical design
- FastPlace is currently single-core only, with no threads
- Will attempt to parallelize some functions of the FastPlace algorithm using the Linux pthreads library
- Implement the RQL idea (IBM) in FastPlace

3 Project Plan
- Start with the existing serial FastPlace algorithm
- Parallelize the FastPlace algorithm to decrease run-time
  - Hope to get as close to an N-times speedup (N = number of cores) as possible
  - Realistically, expect 0.75N or 0.5N
- End goal is mostly a proof of concept
  - IBM uses an in-house algorithm
  - It contains proprietary circuit processing

4 Project Design
- Written in C
- Runs under Linux using the POSIX thread library
- Consider scalability: 2, 4, 8, etc. cores
- RQL implementation
  - IBM concept
  - Netlist optimization for placement

5 Implementation – Overall
- Using data parallelism as the scheme
  - Assigning loop iterations to threads
  - Localizing variable usage
  - Using thread synchronization (mutexes, etc.) only where absolutely necessary
- To maximize the speed improvement from threads, minimize the total number of tasks the threads have to accomplish
  - Have individual threads do as much as possible
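The scheme on this slide can be illustrated with a minimal sketch. It is not FastPlace code: the array sum, the names `slice_t`, `sum_slice`, and `parallel_sum`, and the `MAX_THREADS` limit are all hypothetical, chosen only to show loop iterations split across threads, a thread-local accumulator, and a single mutex-protected update per thread.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64   /* hypothetical upper bound for this sketch */

/* Hypothetical work descriptor: one contiguous slice of the loop per thread. */
typedef struct {
    const double *data;
    size_t begin, end;        /* half-open iteration range [begin, end) */
    double *total;            /* shared accumulator */
    pthread_mutex_t *lock;    /* protects *total */
} slice_t;

static void *sum_slice(void *arg)
{
    slice_t *s = arg;
    double local = 0.0;       /* localized variable: no locking inside the loop */
    for (size_t i = s->begin; i < s->end; i++)
        local += s->data[i];
    pthread_mutex_lock(s->lock);      /* synchronize only once per thread */
    *s->total += local;
    pthread_mutex_unlock(s->lock);
    return NULL;
}

/* Sums data[0..n) using up to nthreads threads. */
double parallel_sum(const double *data, size_t n, int nthreads)
{
    pthread_t tid[MAX_THREADS];
    int started[MAX_THREADS];
    slice_t work[MAX_THREADS];
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    double total = 0.0;

    if (nthreads < 1) nthreads = 1;
    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    size_t chunk = (n + (size_t)nthreads - 1) / (size_t)nthreads;

    for (int t = 0; t < nthreads; t++) {
        size_t b = (size_t)t * chunk;
        if (b > n) b = n;
        size_t e = (b + chunk > n) ? n : b + chunk;
        work[t] = (slice_t){ data, b, e, &total, &lock };
        started[t] = pthread_create(&tid[t], NULL, sum_slice, &work[t]) == 0;
        if (!started[t])
            sum_slice(&work[t]);      /* fall back to running the slice here */
    }
    for (int t = 0; t < nthreads; t++)
        if (started[t])
            pthread_join(tid[t], NULL);
    pthread_mutex_destroy(&lock);
    return total;
}
```

Each thread touches the shared total exactly once, so the lock is rarely contended. This sketch creates threads per call for brevity; the deck itself reuses threads from a pool.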

6 Implementation – Thread Pool
- Threads are created once at start
- Benefits:
  - Minimizes overhead from thread creation
  - Increases cache performance
  - Allows core scalability: the number of threads running can equal the number of cores available
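A thread pool of the kind described here could look roughly like the following sketch. The deck does not show its pool, so every name (`pool_t`, `pool_init`, `pool_run`, `pool_destroy`, `count_job`) and the fork-join design, in which each posted job runs once on every worker, are assumptions made for illustration.

```c
#include <pthread.h>

#define POOL_MAX 64   /* hypothetical limit for this sketch */

typedef void (*job_fn)(int tid, int nthreads, void *arg);
typedef struct pool pool_t;
typedef struct { pool_t *pool; int id; } worker_t;

struct pool {
    pthread_t tid[POOL_MAX];
    worker_t worker[POOL_MAX];
    int nthreads;
    pthread_mutex_t lock;
    pthread_cond_t start;      /* broadcast when a job is posted */
    pthread_cond_t done;       /* signalled when the last worker finishes */
    job_fn job;
    void *arg;
    unsigned long generation;  /* incremented once per posted job */
    int pending;               /* workers still running the current job */
    int shutdown;
};

static void *worker_main(void *p)
{
    worker_t *w = p;
    pool_t *pool = w->pool;
    unsigned long seen = 0;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (!pool->shutdown && pool->generation == seen)
            pthread_cond_wait(&pool->start, &pool->lock);
        if (pool->shutdown) {
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        seen = pool->generation;
        job_fn job = pool->job;
        void *arg = pool->arg;
        pthread_mutex_unlock(&pool->lock);

        job(w->id, pool->nthreads, arg);   /* run outside the lock */

        pthread_mutex_lock(&pool->lock);
        if (--pool->pending == 0)
            pthread_cond_signal(&pool->done);
        pthread_mutex_unlock(&pool->lock);
    }
}

/* Stops and joins all workers. Must not be called while pool_run is active. */
void pool_destroy(pool_t *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->shutdown = 1;
    pthread_cond_broadcast(&pool->start);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->tid[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->start);
    pthread_cond_destroy(&pool->done);
}

/* Creates the workers once. Returns 0 on success, -1 on failure. */
int pool_init(pool_t *pool, int nthreads)
{
    if (nthreads < 1 || nthreads > POOL_MAX) return -1;
    pool->nthreads = nthreads;
    pool->job = NULL;
    pool->arg = NULL;
    pool->generation = 0;
    pool->pending = 0;
    pool->shutdown = 0;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->start, NULL);
    pthread_cond_init(&pool->done, NULL);
    for (int i = 0; i < nthreads; i++) {
        pool->worker[i] = (worker_t){ pool, i };
        if (pthread_create(&pool->tid[i], NULL, worker_main, &pool->worker[i]) != 0) {
            pool->nthreads = i;            /* only i workers exist */
            pool_destroy(pool);
            return -1;
        }
    }
    return 0;
}

/* Runs job once on every worker and blocks until all have finished. */
void pool_run(pool_t *pool, job_fn job, void *arg)
{
    pthread_mutex_lock(&pool->lock);
    pool->job = job;
    pool->arg = arg;
    pool->pending = pool->nthreads;
    pool->generation++;
    pthread_cond_broadcast(&pool->start);
    while (pool->pending > 0)
        pthread_cond_wait(&pool->done, &pool->lock);
    pthread_mutex_unlock(&pool->lock);
}

/* Example job: each worker increments its own slot of an int array. */
void count_job(int tid, int nthreads, void *arg)
{
    (void)nthreads;
    ((int *)arg)[tid] += 1;
}
```

Calling `pool_init` once and then `pool_run` for each parallel phase pays the thread-creation cost a single time, which is the first benefit the slide lists.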

7 Implementation - RQL
- Force-vector modulation
- Forces acting upon cells
  - Forces are modeled as a spring potential energy problem
  - The native force in the algorithm tries to reduce wire length by bringing connected cells closer to each other
  - The spreading force tries to move cells into sparse areas within the placement region
  - A balance of the two is needed to meet placement and wire length objectives
- Modulate the spreading forces
  - A high spreading force means the connection belongs to a fan-out net or boundary
  - Therefore, cells with connections in the top 5 percent of spreading forces are skipped in quadratic placement
  - Skipping these leaves the cell’s other connections minimized instead of degrading them
  - Results in placing cells in their overall optimal location
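The slide does not write out the spring model. Assuming the standard quadratic placement formulation, with hypothetical symbols $w_{ij}$ for the connection weight between cells $i$ and $j$ and $(x_i, y_i)$ for cell coordinates, the energy and the resulting native force on cell $i$ in the x direction would be:

```latex
E = \frac{1}{2} \sum_{i<j} w_{ij} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right],
\qquad
F_i^{x} = -\frac{\partial E}{\partial x_i} = -\sum_{j \neq i} w_{ij} \, (x_i - x_j)
```

Under this assumption the native force pulls each cell toward its connected neighbors, and the spreading force is an additional term that opposes it in dense regions.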

8 Implementation - RQL
- During quadratic placement (the global placement process):
  - Calculate the magnitude of spreading forces for all cells in each iteration
  - Calculate the force on the current cell
  - If the current cell’s force is above the top-5% threshold, skip its placement
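The three steps above can be sketched as follows. This is an illustration under assumptions, not the RQL or FastPlace implementation: the function name `mark_skipped_cells`, the per-cell force arrays `fx` and `fy`, and the sort-based cutoff are hypothetical. Squared magnitudes are compared because they order cells the same way as true magnitudes.

```c
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sets skip[i] = 1 for cells whose spreading-force magnitude lies above the
 * cutoff that separates the top `fraction` of cells (e.g. 0.05 for 5%).
 * Returns the number of cells marked, or -1 if memory allocation fails.
 * Hypothetical helper: names and data layout are assumptions. */
long mark_skipped_cells(const double *fx, const double *fy, size_t n,
                        double fraction, unsigned char *skip)
{
    if (n == 0) return 0;
    double *mag2 = malloc(n * sizeof *mag2);
    double *sorted = malloc(n * sizeof *sorted);
    if (!mag2 || !sorted) {
        free(mag2);
        free(sorted);
        return -1;
    }
    /* Step 1: squared magnitude of the spreading force for every cell. */
    for (size_t i = 0; i < n; i++) {
        mag2[i] = fx[i] * fx[i] + fy[i] * fy[i];
        sorted[i] = mag2[i];
    }
    /* Step 2: the cutoff is the largest value NOT in the top k cells. */
    qsort(sorted, n, sizeof *sorted, cmp_double);
    size_t k = (size_t)(fraction * (double)n);
    if (k >= n) k = n - 1;
    double cutoff = sorted[n - 1 - k];
    /* Step 3: cells strictly above the cutoff are skipped this iteration. */
    long count = 0;
    for (size_t i = 0; i < n; i++) {
        skip[i] = mag2[i] > cutoff;
        count += skip[i];
    }
    free(mag2);
    free(sorted);
    return count;
}
```

Because the comparison is strict, ties at the cutoff are kept rather than skipped, so the sketch never skips more than the requested fraction.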

9 Implementation - Functions
- move_8pt family
  - move_8pt, move_8pt_withMap, move_8pt_mixedMode, move_8pt_mixedMode_withMap, move_8pt_clustering, move_8pt_clustering_withMap
  - Calculates a score based on cell coordinates and bin utilization
  - Doesn’t lend itself well to parallelization
- The fix?
  - If a new cell is within the 3x3 box of the cell currently being calculated for, the new cell is skipped
  - Helps remove significant wirelength degradation
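One possible reading of the 3x3 rule is sketched below. The deck does not say what the box is measured in, so this assumes bin coordinates and assumes the check is made against cells that other threads are currently processing. The type `bin_pos_t` and both function names are hypothetical.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical bin coordinates of a cell (an assumption of this sketch). */
typedef struct { int bx, by; } bin_pos_t;

/* Returns 1 if b lies inside the 3x3 block of bins centered on a. */
int in_3x3_box(bin_pos_t a, bin_pos_t b)
{
    return abs(a.bx - b.bx) <= 1 && abs(a.by - b.by) <= 1;
}

/* Returns 1 if the candidate cell should be skipped because it falls in the
 * 3x3 box of any cell that is currently being calculated for. */
int should_skip(bin_pos_t candidate, const bin_pos_t *active, size_t nactive)
{
    for (size_t i = 0; i < nactive; i++)
        if (in_3x3_box(candidate, active[i]))
            return 1;
    return 0;
}
```

Skipping nearby cells keeps two threads from scoring moves against the same bin utilization at the same time, which fits the slide's note about avoiding wirelength degradation.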

10 Implementation - Functions
- swap_move family
  - swap_move_FM, vswap_move, local_order3_FM, flipAllCells
  - Row-based data processing
  - Break up the matrix into segments based on the number of threads
  - Assign each thread X rows
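The row split can be computed with a small helper like the one below. The name `row_range` and the choice to give leftover rows to the lowest-numbered threads are assumptions; the slide only says each thread gets X rows.

```c
/* Computes the half-open row range [*begin, *end) owned by thread `tid`
 * when `nrows` rows are divided among `nthreads` threads. Leftover rows go
 * one each to the lowest-numbered threads so sizes differ by at most one. */
void row_range(int nrows, int nthreads, int tid, int *begin, int *end)
{
    int base = nrows / nthreads;    /* rows every thread gets */
    int extra = nrows % nthreads;   /* threads that get one more row */
    *begin = tid * base + (tid < extra ? tid : extra);
    *end = *begin + base + (tid < extra ? 1 : 0);
}
```

With 10 rows and 4 threads this yields the ranges [0, 3), [3, 6), [6, 8), and [8, 10), so no row is shared and no synchronization is needed inside a range.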

11 Testing
- Profiled the original FastPlace algorithm
  - gprof gives CPU time per function
- Profiling parallel FastPlace
  - Valgrind
  - FastPlace code outputs actual time elapsed
    - Can be used to compare performance
    - Not 100% consistent
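For threaded code, elapsed wall-clock time is the relevant measure, since CPU time sums over all cores. A minimal sketch of such a measurement is shown here; it is not the FastPlace timing code, and the helper names `wall_seconds`, `speedup`, and `efficiency` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Current reading of a monotonic wall clock, in seconds. */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Speedup of a parallel run relative to the serial run. */
double speedup(double serial_seconds, double parallel_seconds)
{
    return serial_seconds / parallel_seconds;
}

/* Fraction of the ideal N-times speedup achieved on `cores` cores. */
double efficiency(double serial_seconds, double parallel_seconds, int cores)
{
    return speedup(serial_seconds, parallel_seconds) / (double)cores;
}
```

Taking `wall_seconds()` before and after a placement run gives the elapsed time. An efficiency of 0.75 or 0.5 corresponds to the 0.75N and 0.5N speedups the project plan expects. Because run-to-run timings vary, averaging several runs is a reasonable precaution.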

12 Testing & Results
- Test results for correctness
  - Compare “wire length” results
  - Average total wirelength no worse than 1% greater
  - Thread pool is tested and working
- Test results for speedup
  - Compared actual run-time
  - See the following slides

13 Test Results – RQL Implementation
- Wire length results
  - Between 0.12% and 2.08% decreased wire length on ISPD98 benchmarks, with an average of 0.98%
  - Between 0.11% and 3.18% decreased wire length on ISPD2005 benchmarks, with an average of 1.39%
- Run-time results
  - Some run-time slowdown
  - Average increase of 3.36% on ISPD98
  - Average increase of 4.02% on ISPD2005

14 Test Results – Global Placement

15 Test Results – Detailed Placement

16 Project Impact
- Shows that threads can be used to speed up the placement process
- With the availability of multi-core CPUs and the scalability of the thread implementation, speed improvement could continue
- Reduces a bottleneck in the development process

17 Questions?

