
A Comparison-FREE SORTING ALGORITHM ON CPUs




1 A Comparison-FREE SORTING ALGORITHM ON CPUs
Saleh Abdel-hafeez, Jordan (JUST); Ann Gordon-Ross, USA (UF); Samer AbuBaker, Jordan (JUST)

2 Highlights
Principle Example
Potential Key Factors
CPU Simulation
  Single-threaded (no parallelism): C code (memory locality), execution time simulations
  Multi-threaded (parallelism): C code (atomic and semaphore vs. memory)
Conclusions

3 Principle Example

4 Potential Key Factors
Two representations: binary and one-hot, with N = 2^K
Fewer computations
Memory transpose and memory mapping
Idea: reduce the size of the one-hot matrix from N x N to N x 1, and improve locality (spatial and temporal)
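The slides do not reproduce the authors' C code, so the following is only a minimal sketch of how an N x 1 one-hot column can sort K-bit keys without comparing any two keys. It is a counting-style approximation, not the authors' implementation; the names `K_BITS`, `N_VALUES`, and `comparison_free_sort` and the choice of K = 8 are assumptions made for illustration, and the two loops may or may not correspond to Loop1 and Loop2 on slide 6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define K_BITS 8                  /* assumed key width K (hypothetical) */
#define N_VALUES (1u << K_BITS)   /* N = 2^K possible key values */

/* Sorts n K-bit keys in place without comparing any two keys. */
void comparison_free_sort(uint8_t *data, size_t n)
{
    size_t count[N_VALUES];       /* the N x 1 column: one slot per value */

    assert(data != NULL || n == 0);
    memset(count, 0, sizeof count);

    /* First loop: each key addresses its own slot (its one-hot position). */
    for (size_t i = 0; i < n; i++)
        count[data[i]]++;

    /* Second loop: walk the slots in index order and write the keys back. */
    size_t pos = 0;
    for (size_t v = 0; v < N_VALUES; v++)
        for (size_t c = count[v]; c > 0; c--)
            data[pos++] = (uint8_t)v;
}
```

Both loops touch the one-dimensional `count` array, and the second loop walks it sequentially, which is one way to read the slide's point about spatial and temporal locality.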

5 CPU Single Thread

6 Loop1 Time vs. Loop2 Time (MEMORY LOCALITY)

7 Dependent Less on Input Distribution

8 CPU Single thread (Time Simulation)

9 CPU Single Thread
Significant:
The fastest
Minor effect of data type and distribution
One-dimensional memory
Fewer computations
Easy to work with
Less energy and power
Data size: 7 8 10 12 14 16 18 20 22 24 26 28 29
Comparison-free: 6 41 145 584 2317 6839 31414 69519 418684
Quick: 15 30 140 602 2673 11409 47064 148004 456904

10 CPU Multiple Threads (8-Threads & 4-Core)

11 CPU Multi-threaded (Time)

12 Execution Time vs. Data Sizes
Data size: 7 8 10 12 14 16 18 20 22 24 26 28 30 32 34
8-thread: 345 333 363 386 1070 2085 7658 17309 58822 234639
Non-thread: 6 41 145 584 2317 6839 31414 69519 418684 2.22E+08 8.12E+08

13 Memory Usage

14 Comparison with Parallel Sorting Algorithms
Avoid mutual exclusion (memory blocked)
Use more memory for the threaded version
Use atomics for less memory
Execution time (seconds) by data size:
Algorithm                          14       20      24      26
Comparison-free                    0.0005   0.002   0.235   1.08
[1] 2011, Bitonic sort, CPU & GPU  0.0012   0.076   1.97    2.23
[2] 2010, Intel (radix), CPU       0.0075   0.025   0.081   0.33
[3] 2009, NVIDIA (radix), GPU      0.008    0.031   0.12    0.27
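The trade-off named on this slide (atomics for less memory, extra memory to avoid blocking) can be sketched as two ways of filling the counter array. This is an illustration under stated assumptions, not the authors' multi-threaded code: it swaps in OpenMP pragmas instead of whatever atomic and semaphore primitives the original used, and the names `count_shared`, `count_private`, `emit_sorted`, `K_BITS`, and `N_VALUES` are hypothetical. Compiled with `-fopenmp` the loops run in parallel; without that flag the pragmas are ignored and the code runs serially with the same result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define K_BITS 8                  /* assumed key width K (hypothetical) */
#define N_VALUES (1u << K_BITS)   /* N = 2^K possible key values */

/* Option 1 (less memory): one shared counter array, atomic increments. */
void count_shared(const uint8_t *in, size_t n, uint32_t *count)
{
    assert(count != NULL);
    memset(count, 0, N_VALUES * sizeof *count);
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        #pragma omp atomic
        count[in[i]]++;
    }
}

/* Option 2 (more memory): one private counter array per worker, merged
 * at the end, so no worker ever waits on another during counting.
 * Returns 0 on success, -1 if the allocation fails. */
int count_private(const uint8_t *in, size_t n, size_t workers, uint32_t *count)
{
    assert(count != NULL && workers > 0);
    uint32_t *priv = calloc(workers * N_VALUES, sizeof *priv);
    if (priv == NULL)
        return -1;

    #pragma omp parallel for
    for (long w = 0; w < (long)workers; w++) {
        size_t lo = n * (size_t)w / workers;        /* start of this chunk */
        size_t hi = n * ((size_t)w + 1) / workers;  /* end of this chunk */
        uint32_t *mine = priv + (size_t)w * N_VALUES;
        for (size_t i = lo; i < hi; i++)
            mine[in[i]]++;
    }

    /* Serial merge of the private arrays into the shared result. */
    memset(count, 0, N_VALUES * sizeof *count);
    for (size_t w = 0; w < workers; w++)
        for (size_t v = 0; v < N_VALUES; v++)
            count[v] += priv[w * N_VALUES + v];
    free(priv);
    return 0;
}

/* Final pass: write the keys out in index order; out must hold n keys. */
void emit_sorted(const uint32_t *count, uint8_t *out)
{
    size_t pos = 0;
    for (size_t v = 0; v < N_VALUES; v++)
        for (uint32_t c = count[v]; c > 0; c--)
            out[pos++] = (uint8_t)v;
}
```

Option 1 keeps a single array of 2^K counters but serializes updates that hit the same slot; option 2 needs `workers` times that memory and a merge step, in exchange for contention-free counting.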

15 CONCLUSION
The design is novel and is not an incremental variant of other hybrid sorting algorithms (future work); the C code is clear and is available.
Comparison-free, single-threaded: the fastest for data sizes < 2^16
Comparison-free, multi-threaded:
  CPU (simple 4-core): fastest at data size 2^20
  CPU (advanced multi-core): needs investigation
  GPU (simple and advanced): needs investigation
Uses less memory and is expected to use less energy

