OCR on Knights Landing (Xeon-Phi)


1 OCR on Knights Landing (Xeon-Phi)
31st Mar 2016
Acknowledgment: This material is based upon work supported by the Department of Energy Office of Science under cooperative agreement DE-SC and DE-SC , and Lawrence Livermore National Labs subcontract B

2 Knights Landing Overview
Three modes
  Self-boot processor
  Self-boot w/ integrated fabric
  Co-processor (PCIe add-on card)
MCDRAM: three memory modes
  Flat – entirely addressable
  Cache – direct-mapped cache in front of DDR
  Hybrid – part cache, part memory
Cluster modes (cache-coherent mesh interconnect)
  All-to-all: addresses uniformly hashed
  Quadrant: software-transparent; address hashed to a directory in the same quadrant as the memory
  Sub-NUMA: exposed as 4 NUMA nodes
Source: KNL presentation at Hot Chips '15
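To make "direct-mapped" concrete: in a direct-mapped cache, each memory line can live in exactly one cache slot, so two addresses that are a multiple of the cache capacity apart compete for the same slot. The sketch below is only an illustration. It assumes a 16 GiB cache and 64-byte lines (commonly quoted KNL figures, not stated on the slide) and a plain modulo index, which may differ from how the hardware actually maps addresses. The function names are hypothetical.

```python
# Assumed parameters (not from the slide): 16 GiB of MCDRAM used as cache, 64-byte lines.
CACHE_BYTES = 16 * 2**30
LINE_BYTES = 64
NUM_SLOTS = CACHE_BYTES // LINE_BYTES


def cache_slot(addr: int) -> int:
    """Slot a physical address maps to in a simple modulo-indexed direct-mapped cache."""
    return (addr // LINE_BYTES) % NUM_SLOTS


def conflicts(a: int, b: int) -> bool:
    """True if two addresses lie in different memory lines but share one cache slot."""
    different_lines = a // LINE_BYTES != b // LINE_BYTES
    return different_lines and cache_slot(a) == cache_slot(b)
```

Under these assumptions, `conflicts(0, CACHE_BYTES)` is true: the two lines evict each other, which is one reason a working set larger than the cache can behave poorly in cache mode.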

3 OCR on KNL
1 policy domain with up to 288 workers
MCDRAM in flat mode, with two allocators
$ numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 255
node 0 size: MB
node 0 free: MB
node 1 cpus:
node 1 size: MB
node 1 free: MB
node distances:
node 0: 1:
Memory hints to choose allocator on MCDRAM (OCR_HINT_DB_HIGHBW)
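In the numactl listing above, node 1 has no CPUs, which is how flat-mode MCDRAM shows up: as a memory-only NUMA node. A small sketch of detecting such a node from `numactl -H` output follows. The function name and the SAMPLE text are hypothetical, and the SAMPLE values are invented for illustration; they are not the numbers missing from the slide.

```python
import re


def cpuless_nodes(numactl_output: str) -> list[int]:
    """Return the NUMA node ids whose 'cpus:' line lists no CPUs."""
    nodes = []
    for line in numactl_output.splitlines():
        m = re.match(r"\s*node (\d+) cpus:(.*)$", line)
        if m and not m.group(2).strip():
            nodes.append(int(m.group(1)))
    return nodes


# Invented sample input, NOT the slide's data.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 1000 MB
node 1 cpus:
node 1 size: 500 MB
"""

print(cpuless_nodes(SAMPLE))
```

Outside of OCR, a whole process can be steered to such a node with `numactl --membind=<node>` or `numactl --preferred=<node>`. The slide's approach is finer-grained: a per-data-block hint selects the MCDRAM allocator.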

4 Results – Stencil 2D weak scaling
[Figure: Stencil 2D weak scaling, Xeon and KNL]
Preliminary results! Software under optimization
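Weak scaling holds the work per worker fixed while workers are added, so the ideal runtime stays flat as the problem grows. The sketch below shows the arithmetic under the assumption of a square 2D grid. The helper names are hypothetical and nothing here reproduces the measurements in the figure.

```python
import math


def weak_scaled_side(base_side: int, workers: int) -> int:
    """Side length of a square grid that keeps cells per worker roughly constant."""
    return round(base_side * math.sqrt(workers))


def weak_scaling_efficiency(t_one: float, t_n: float) -> float:
    """T(1 worker) / T(N workers) at fixed work per worker; 1.0 is ideal."""
    return t_one / t_n
```

For instance, going from 1 to 4 workers with a 1000 x 1000 base grid gives a 2000 x 2000 grid, and an efficiency below 1.0 means the larger run took longer than the single-worker run.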

5 Results – MCDRAM vs DDR
Stencil 2D with 256 threads
Preliminary results! Software under optimization

6 Results – Stream
Runtime bottlenecks?
Profiling underway
Limited vectorization opportunities?
Preliminary results! Software under optimization
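Assuming "Stream" here refers to a STREAM-style memory bandwidth benchmark, its triad kernel and the usual bandwidth accounting can be sketched as follows. This is an illustration with hypothetical function names, not the code that was measured, and pure Python is far too slow to measure real memory bandwidth.

```python
def triad(b, c, scalar):
    """STREAM-style triad: a[i] = b[i] + scalar * c[i]."""
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bandwidth_gbs(n: int, seconds: float, bytes_per_elem: int = 8) -> float:
    """Bandwidth in GB/s, counting two reads and one write per element."""
    return 3 * bytes_per_elem * n / seconds / 1e9
```

The kernel does only two floating-point operations per 24 bytes moved (with 8-byte elements), so it is normally limited by memory bandwidth rather than compute. That is why this test is a natural place to look for runtime overheads and missed vectorization.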

7 Next Steps
Root-cause & fix MCDRAM performance
Study all-to-all vs. sub-NUMA modes
Single vs. multiple policy domains
Performance counters & introspection

