Algorithm Engineering: Parallel Search. Stefan Edelkamp.

1 Algorithm Engineering: Parallel Search. Stefan Edelkamp

2 Overview: Motivation; PRAM; Termination; Depth-Slicing; Hash-based Partitioning & Transposition Table Scheduling; Stack Splitting & Parallel Window Search; Parallel Search with Treaps

3 Parallel Shared Memory Graph Search (single-core CPU vs. multi-core CPU). Parallelization is important for multi-core CPUs, but parallelizing graph-search algorithms such as breadth-first search, Dijkstra's algorithm, and A* is challenging. Issues: load balancing, locking, …

4 Parallel Shared Memory Graph Search (single-core CPU vs. multi-core GPU). Parallelization is even more important for GPUs, but parallelizing graph-search algorithms such as breadth-first search, Dijkstra's algorithm, and A* is challenging. Issues: kernel function design, load balancing, locking, …

5 Parallel External Memory Graph Search (single-core CPU+HDD vs. multi-core C/GPU+HDD) …

6 Motivation: Parallel and External Memory Graph Search. Synergies: Both need partitioned access to large sets of data. The data in each partition needs to be processed individually. Information transfer between two partitions is limited. Streaming in external memory programs corresponds to communication queues in distributed programs (communication is often realized via files). Good external implementations often lead to good parallel implementations.

7 Experiments

8 Further Experiments

9 Parallel Random Access Machine Common Read/Exclusive Write (CREW PRAM)

10 Parallel Addition

11 In Pseudocode
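The pseudocode of this slide is not part of the transcript. As a rough illustration only, parallel addition on a PRAM is commonly done as a tree reduction; the sketch below simulates that scheme sequentially. The function name `parallel_sum` and the round counter are assumptions made for this example, not taken from the slides.

```python
def parallel_sum(values):
    """Simulate a PRAM-style tree reduction; returns (sum, number of rounds)."""
    a = list(values)
    n = len(a)
    if n == 0:
        return 0, 0
    rounds = 0
    stride = 1
    while stride < n:
        # All additions of one round touch disjoint cells, so on a PRAM
        # each iteration of this inner loop could run on its own processor.
        for i in range(0, n - stride, 2 * stride):
            a[i] += a[i + stride]
        stride *= 2
        rounds += 1
    return a[0], rounds


if __name__ == "__main__":
    print(parallel_sum(range(1, 9)))
```

For 8 inputs the simulation needs 3 rounds, i.e. a logarithmic number of parallel steps instead of 7 sequential additions.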

12 Definitions: problem size; parallel running time; work; sequential time; efficiency; speedup (with its value in the example); linear speedup; efficient parallelization (with its value in the example)
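The formulas of this slide did not survive in the transcript. For orientation, the usual textbook definitions are given below; the symbols $n$, $p(n)$, $T_{\mathrm{par}}$, $T_{\mathrm{seq}}$, $W$, $S$ and $E$ are chosen here and may differ from the notation on the slide.

```latex
\begin{align*}
  n                   &: \text{problem size} \\
  T_{\mathrm{par}}(n) &: \text{parallel running time with } p(n) \text{ processors} \\
  T_{\mathrm{seq}}(n) &: \text{running time of the sequential algorithm} \\
  W(n) &= p(n) \cdot T_{\mathrm{par}}(n) && \text{(work)} \\
  S(n) &= \frac{T_{\mathrm{seq}}(n)}{T_{\mathrm{par}}(n)} && \text{(speedup)} \\
  E(n) &= \frac{T_{\mathrm{seq}}(n)}{W(n)} = \frac{S(n)}{p(n)} && \text{(efficiency)}
\end{align*}
```

Under these definitions a speedup is called linear when $S(n) = \Theta(p(n))$, which is the same as saying that the efficiency stays bounded below by a constant.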

13 Prefix Sum
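The slide content for the prefix sum is missing from the transcript. One common data-parallel formulation is the doubling scheme sketched below (often attributed to Hillis and Steele); whether the lecture used this variant or a work-efficient one is not known. The name `parallel_prefix_sum` is an assumption for this example.

```python
def parallel_prefix_sum(values):
    """Simulate an inclusive prefix sum computed in O(log n) parallel rounds."""
    a = list(values)
    n = len(a)
    stride = 1
    while stride < n:
        # Every cell reads from the previous round's snapshot only, so all
        # updates of one round are independent and could run in parallel.
        prev = a[:]
        for i in range(stride, n):
            a[i] = prev[i] + prev[i - stride]
        stride *= 2
    return a


if __name__ == "__main__":
    print(parallel_prefix_sum([1, 2, 3, 4, 5]))
```

This variant performs O(n log n) additions in total, so it trades extra work for a small number of rounds.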

14 Termination

15 Depth-Slicing

16 In Source Code

17 Hash-based Partitioning
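As a minimal sketch of the idea behind hash-based partitioning: a hash function assigns every state to exactly one owning process, so all duplicates of a state end up at the same process and can be detected locally. The helper names `owner` and `partition`, and the use of CRC32 over `repr(state)`, are assumptions made for this illustration.

```python
import zlib


def owner(state, num_procs):
    """Map a state to the index of the process responsible for it."""
    # repr() plus CRC32 gives a hash that is stable across runs,
    # unlike Python's built-in hash() for strings.
    key = repr(state).encode("utf-8")
    return zlib.crc32(key) % num_procs


def partition(states, num_procs):
    """Distribute states into one bucket per process."""
    buckets = [[] for _ in range(num_procs)]
    for s in states:
        buckets[owner(s, num_procs)].append(s)
    return buckets


if __name__ == "__main__":
    print(partition([(1, 2), (3, 4), (1, 2), "abc"], 3))
```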

18 Transposition Driven Scheduling

19 In Source Code

20 Parallel Depth-First Search (Parallel Branch-and-Bound): master, slave

21 In Source Code

22 Load-Balancing via Stack Splitting
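The transcript contains only the title of this slide. As a hedged illustration of the general idea: a busy process hands part of its depth-first search stack to an idle process, and both continue independently. The split rule used here (alternate stack entries) is just one possible strategy; `split_stack` and `dfs_count` are hypothetical helper names.

```python
def split_stack(stack):
    """Return (kept, donated): the donor keeps entries 0, 2, 4, ...; the rest is given away."""
    return stack[0::2], stack[1::2]


def dfs_count(stack, successors):
    """Exhaust a DFS stack and return the number of expanded nodes."""
    count = 0
    while stack:
        node = stack.pop()
        count += 1
        stack.extend(successors(node))
    return count


if __name__ == "__main__":
    # Toy search tree: a node is its depth; binary branching down to depth 3.
    succ = lambda d: [d + 1, d + 1] if d < 3 else []
    keep, give = split_stack([1, 1, 2, 2])
    print(dfs_count(keep, succ), dfs_count(give, succ))
```

Because the two halves are disjoint, the total number of expanded nodes is the same as in the sequential search; only the distribution over processes changes.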

23 Parallel Window Search (Iterative-Deepening Search)

24 Treaps: A Mix of Heaps and Search Trees

25 Use. With a treap, the need for exclusive locks can be alleviated to some extent. Every operation on the treap manipulates the data structure in the same top-down direction and can be decomposed into successive elementary operations. Tree partial locking protocol: every process holds exclusive access to a sliding window of nodes in the tree. It can move this window down a path in the tree, which allows other processes to access different, non-overlapping windows at the same time. Parallel search using a treap with partial locking has been tested for the FIFTEENPUZZLE on different architectures, with a speedup between 2 and 5 on 8 processors.
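The sliding-window idea can be sketched with lock coupling (hand-over-hand locking). The example below is a deliberate simplification and not the protocol from the slides: it uses a plain binary search tree instead of a treap (no priorities, no rotations) and a window of only two nodes. The class names `Node` and `LockCoupledTree` are assumptions.

```python
import threading


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.lock = threading.Lock()


class LockCoupledTree:
    def __init__(self):
        # Sentinel head: the real tree hangs off head.right, so every
        # real node has a lockable parent.
        self.head = Node(None)

    def insert(self, key):
        """Insert key top-down; returns False if it was already present."""
        parent = self.head
        parent.lock.acquire()
        go_right = True
        child = parent.right
        while child is not None:
            child.lock.acquire()          # window now covers parent + child
            if key == child.key:
                child.lock.release()
                parent.lock.release()
                return False
            parent.lock.release()         # slide the window one level down
            parent = child
            go_right = key > parent.key
            child = parent.right if go_right else parent.left
        node = Node(key)
        if go_right:
            parent.right = node
        else:
            parent.left = node
        parent.lock.release()
        return True

    def keys(self):
        """In-order traversal; call only when no writer is active."""
        out, stack, node = [], [], self.head.right
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            out.append(node.key)
            node = node.right
        return out
```

Since every operation acquires locks strictly top-down, two processes can never wait for each other in a cycle, and processes working in different subtrees do not block one another.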

26 Self-Adjusting Trees via the Splay Operation. See extra slides.

27 Parallel External-Memory Graph Search: Motivation; Shared and Distributed Environments; Parallel Delayed Duplicate Detection; Parallel Expansion; Distributed Sorting; Parallel Structured Duplicate Detection; Finding Disjoint Duplicate Detection Scopes; Locking

28 Distributed Search over the Network. A distributed setting provides more space. Experiments show that internal computation time dominates I/O time.

29 Exploiting Independence. Since the states in a bucket are independent of each other, they can be expanded in parallel. Duplicate removal can be distributed over different processors. Bulk (streamed) transfers perform much better than single ones.
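A minimal sketch of this independence, assuming a toy state space on integers: the states of one bucket are expanded by a pool of workers, and the results are collected in bulk before duplicates are removed. The `successors` function and the name `expand_bucket` are hypothetical and only serve the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def successors(state):
    """Hypothetical successor function of a toy integer state space."""
    return [state + 1, state * 2]


def expand_bucket(bucket, workers=4):
    """Expand all states of a bucket in parallel and remove duplicates."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each state is expanded independently of the others.
        chunks = pool.map(successors, bucket)
        children = [c for chunk in chunks for c in chunk]
    # Duplicate removal is done in one place here; it could itself be
    # distributed, for example by hashing children to processors.
    return sorted(set(children))


if __name__ == "__main__":
    print(expand_bucket([1, 2, 3]))
```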

30 Parallel Breadth-First Frontier Search: Enumerating the 15-Puzzle. A hash function partitions both layers into files. When a layer is done, the child files are renamed into parent files. For parallel processing, a work queue contains parent files waiting to be expanded and child files waiting to be merged.
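The layer-by-layer scheme can be sketched as follows, under strong simplifications: Python sets stand in for the files, the work queue is processed by a single thread, and the graph is assumed to be undirected so that duplicates of a child can only occur in the current, the parent, or the grandparent layer. The names `part` and `bfs_layer_sizes` are assumptions for this example.

```python
from collections import deque


def part(state, k):
    """Hash function that selects the file (here: set) for a state."""
    return hash(state) % k


def bfs_layer_sizes(start, neighbors, k=4):
    """Return the sizes of all BFS layers of an undirected graph."""
    parents = [set() for _ in range(k)]    # stand-ins for parent files
    previous = [set() for _ in range(k)]   # layer before the parents
    parents[part(start, k)].add(start)
    sizes = []
    while any(parents):
        sizes.append(sum(len(b) for b in parents))
        children = [set() for _ in range(k)]   # stand-ins for child files
        # Work queue: first all parent files to expand, then all child
        # files to merge (remove states seen in the last two layers).
        work = deque([("expand", i) for i in range(k)] +
                     [("merge", i) for i in range(k)])
        while work:
            task, i = work.popleft()
            if task == "expand":
                for s in parents[i]:
                    for t in neighbors(s):
                        children[part(t, k)].add(t)
            else:
                children[i] -= parents[i] | previous[i]
        # "Rename" child files into parent files for the next layer.
        previous, parents = parents, children
    return sizes


if __name__ == "__main__":
    ring = lambda s: [(s + 1) % 8, (s - 1) % 8]
    print(bfs_layer_sizes(0, ring))
```

In a parallel version the tasks in the queue would be picked up by several workers; a child file may only be merged once every parent file that can write into it has been expanded.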



33 Distributed Queue for Parallel Best-First Search (diagram: processes P0, P1, P2 accessing TOP). Beware of the mutual exclusion problem!

34 Distributed Delayed Duplicate Detection. Each state can appear several times in a bucket, so a bucket has to be searched completely for duplicates. (Diagram: processors P0, P1, P2, P3; GOAL; sorted buffers; single files.) Problem: concurrent writes!

35 Multiple Processors, Multiple Disks Variant. (Diagram: processors P1, P2, P3, P4 hold buffers sorted w.r.t. the hash value and write sorted files. The data is divided w.r.t. the hash ranges h_0 … h_{k-1}, h_k … h_{l-1}; the sorted buffers from every processor are combined into a sorted file.)
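One way to read the diagram is sketched below, with in-memory lists standing in for buffers and files and with states identified by their hash values. Each processor sorts its own buffer, the buffers are cut at common hash-range boundaries, and the processor responsible for a range merges the matching pieces from everyone while dropping duplicates. All function names and the `bounds` parameter are assumptions for this illustration.

```python
import bisect
import heapq


def split_by_range(sorted_buffer, bounds):
    """Cut a sorted buffer into pieces, one per range [bounds[j], bounds[j+1])."""
    pieces = []
    for lo, hi in zip(bounds, bounds[1:]):
        start = bisect.bisect_left(sorted_buffer, lo)
        end = bisect.bisect_left(sorted_buffer, hi)
        pieces.append(sorted_buffer[start:end])
    return pieces


def merge_unique(sorted_pieces):
    """Merge sorted pieces into one sorted list without duplicates."""
    out = []
    for h in heapq.merge(*sorted_pieces):
        if not out or out[-1] != h:
            out.append(h)
    return out


def distributed_ddd(buffers, bounds):
    """Return one duplicate-free sorted 'file' per hash range."""
    # Step 1: every processor sorts its own buffer.
    sorted_buffers = [sorted(b) for b in buffers]
    # Step 2: every processor divides its buffer by the hash ranges.
    pieces = [split_by_range(b, bounds) for b in sorted_buffers]
    # Step 3: the processor for range j merges the j-th piece of everyone.
    return [merge_unique([p[j] for p in pieces])
            for j in range(len(bounds) - 1)]


if __name__ == "__main__":
    print(distributed_ddd([[5, 1, 9, 5], [3, 9, 0], [7, 1]], [0, 4, 8, 12]))
```

Because each range is written by exactly one processor, no two processors write to the same file, which sidesteps the concurrent-write problem.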
