Implementing Task Decompositions Intel Software College Introduction to Parallel Programming – Part 5.

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

Analyzing Parallel Performance Intel Software College Introduction to Parallel Programming – Part 6.
Confronting Race Conditions Intel Software College Introduction to Parallel Programming – Part 4.
Shared-Memory Model and Threads Intel Software College Introduction to Parallel Programming – Part 2.
Improving Parallel Performance Intel Software College Introduction to Parallel Programming – Part 7.
Implementing Domain Decompositions Intel Software College Introduction to Parallel Programming – Part 3.
Simplifications of Context-Free Grammars
1
Copyright © 2003 Pearson Education, Inc. Slide 1.
Chapter 7 Constructors and Other Tools. Copyright © 2006 Pearson Addison-Wesley. All rights reserved. 7-2 Learning Objectives Constructors Definitions.
Chapter 17 Linked Data Structures. Copyright © 2006 Pearson Addison-Wesley. All rights reserved Learning Objectives Nodes and Linked Lists Creating,
© 2008 Pearson Addison Wesley. All rights reserved Chapter Seven Costs.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc. All rights reserved. Extended Learning Module D (Office 2007 Version) Decision Analysis.
Copyright © 2011, Elsevier Inc. All rights reserved. Chapter 6 Author: Julia Richards and R. Scott Hawley.
Author: Julia Richards and R. Scott Hawley
Properties Use, share, or modify this drill on mathematic properties. There is too much material for a single class, so you’ll have to select for your.
UNITED NATIONS Shipment Details Report – January 2006.
David Burdett May 11, 2004 Package Binding for WS CDL.
© 2010 Pearson Addison-Wesley. All rights reserved. Addison Wesley is an imprint of Chapter 5: Repetition and Loop Statements Problem Solving & Program.
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Custom Statutory Programs Chapter 3. Customary Statutory Programs and Titles 3-2 Objectives Add Local Statutory Programs Create Customer Application For.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt BlendsDigraphsShort.
1 Chapter 12 File Management Patricia Roy Manatee Community College, Venice, FL ©2008, Prentice Hall Operating Systems: Internals and Design Principles,
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
Break Time Remaining 10:00.
Turing Machines.
Table 12.1: Cash Flows to a Cash and Carry Trading Strategy.
Intel Software College Tuning Threading Code with Intel® Thread Profiler for Explicit Threads.
PP Test Review Sections 6-1 to 6-6
Chapter 17 Linked Lists.
1 DATA STRUCTURES. 2 LINKED LIST 3 PROS Dynamic in nature, so grow and shrink in size during execution Efficient memory utilization Insertion can be.
Bright Futures Guidelines Priorities and Screening Tables
Health Artifact and Image Management Solution (HAIMS)
Bellwork Do the following problem on a ½ sheet of paper and turn in.
INTEL CONFIDENTIAL Implementing a Task Decomposition Introduction to Parallel Programming – Part 9.
CS 6143 COMPUTER ARCHITECTURE II SPRING 2014 ACM Principles and Practice of Parallel Programming, PPoPP, 2006 Panel Presentations Parallel Processing is.
Exarte Bezoek aan de Mediacampus Bachelor in de grafische en digitale media April 2014.
Copyright © 2013, 2009, 2006 Pearson Education, Inc. 1 Section 5.5 Dividing Polynomials Copyright © 2013, 2009, 2006 Pearson Education, Inc. 1.
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
Basel-ICU-Journal Challenge18/20/ Basel-ICU-Journal Challenge8/20/2014.
1..
CONTROL VISION Set-up. Step 1 Step 2 Step 3 Step 5 Step 4.
Adding Up In Chunks.
Copyright © 2013 by John Wiley & Sons. All rights reserved. HOW TO CREATE LINKED LISTS FROM SCRATCH CHAPTER Slides by Rick Giles 16 Only Linked List Part.
1 Processes and Threads Chapter Processes 2.2 Threads 2.3 Interprocess communication 2.4 Classical IPC problems 2.5 Scheduling.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
© 2010 Pearson Addison-Wesley. All rights reserved. Addison Wesley is an imprint of CHAPTER 11: Priority Queues and Heaps Java Software Structures: Designing.
1 hi at no doifpi me be go we of at be do go hi if me no of pi we Inorder Traversal Inorder traversal. n Visit the left subtree. n Visit the node. n Visit.
10 -1 Chapter 10 Amortized Analysis A sequence of operations: OP 1, OP 2, … OP m OP i : several pops (from the stack) and one push (into the stack)
Types of selection structures
Speak Up for Safety Dr. Susan Strauss Harassment & Bullying Consultant November 9, 2012.
Essential Cell Biology
Clock will move after 1 minute
PSSA Preparation.
Chapter 11 Creating Framed Layouts Principles of Web Design, 4 th Edition.
Essential Cell Biology
Immunobiology: The Immune System in Health & Disease Sixth Edition
Chapter 13 Web Page Design Studio
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 13 Pointers and Linked Lists.
 2003 Prentice Hall, Inc. All rights reserved. 1 Chapter 13 - Exception Handling Outline 13.1 Introduction 13.2 Exception-Handling Overview 13.3 Other.
Physics for Scientists & Engineers, 3rd Edition
Energy Generation in Mitochondria and Chlorplasts
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
INTEL CONFIDENTIAL OpenMP for Domain Decomposition Introduction to Parallel Programming – Part 5.
INTEL CONFIDENTIAL Reducing Parallel Overhead Introduction to Parallel Programming – Part 12.
INTEL CONFIDENTIAL Shared Memory Considerations Introduction to Parallel Programming – Part 4.
Presentation transcript:

Implementing Task Decompositions Intel Software College Introduction to Parallel Programming – Part 5

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 2 Implementing Task Decompositions Objectives At the end of this module, you should be able to Describe how threads can be used to implement parallel programs using a task decomposition Implement a task decomposition based on work pools Implement a task decomposition in which different threads execute different functions

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 3 Implementing Task Decompositions Case Study: The N Queens Problem Is there a way to place N queens on an N-by-N chessboard such that no queen threatens another queen?

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 4 Implementing Task Decompositions A Solution to the 4 Queens Problem

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 5 Implementing Task Decompositions Exhaustive Search

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 6 Implementing Task Decompositions Design #1 for Parallel Search Create threads to explore different parts of the search tree simultaneously If a node has children The thread creates child nodes The thread explores one child node itself Thread creates a new thread for every other child node

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 7 Implementing Task Decompositions Design #1 for Parallel Search Thread W New Thread X New Thread Y New Thread Z

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 8 Implementing Task Decompositions Pros and Cons of Design #1 Pros Simple design, easy to implement Balances work among threads Cons Too many threads created Lifetime of threads too short Overhead costs too high

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 9 Implementing Task Decompositions Design #2 for Parallel Search One thread created for each subtree rooted at a particular depth Each thread sequentially explores its subtree

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 10 Implementing Task Decompositions Design #2 in Action Thread 1 Thread 2 Thread 3

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 11 Implementing Task Decompositions Pros and Cons of Design #2 Pros Thread creation/termination time minimized Cons Subtree sizes may vary dramatically Some threads may finish long before others Imbalanced workloads lower efficiency

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 12 Implementing Task Decompositions Design #3 for Parallel Search Main thread creates work poollist of subtrees to explore Main thread creates finite number of co-worker threads Each subtree exploration is done by a single thread Inactive threads go to pool to get more work

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 13 Implementing Task Decompositions Work Pool Analogy More rows than workers Each worker takes an unpicked row and picks the crop After completing a row, the worker takes another unpicked row Process continues until all rows have been harvested

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 14 Implementing Task Decompositions Design #3 in Action Thread 1 Thread 2 Thread 3 Thread 3 Thread 1

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 15 Implementing Task Decompositions Pros and Cons of Strategy #3 Pros Thread creation/termination time minimized Workload balance better than strategy #2 Cons Threads need exclusive access to data structure containing work to be done, a sequential component Workload balance worse than strategy #1 Conclusion Good compromise between designs 1 and 2

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 16 Implementing Task Decompositions Implementing Strategy #3 for N Queens Work pool consists of N boards representing N possible placements of queen on first row

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 17 Implementing Task Decompositions Parallel Program Design One thread creates list of partially filled-in boards Fork: Create one thread per CPU Each thread repeatedly gets board from list, searches for solutions, and adds to solution count, until no more board on list Join: Occurs when list is empty One thread prints number of solutions found

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 18 Implementing Task Decompositions Search Tree Node Structure /* The board struct contains information about a node in the search tree; i.e., partially filled- in board. The work pool is a singly linked list of board structs. */ struct board { int pieces;/* # of queens on board*/ int places[MAX_N]; /* Queens pos in each row */ struct board *next; /* Next search tree node */ };

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 19 Implementing Task Decompositions Key Code in main Function struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; search_for_solutions (n, stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 20 Implementing Task Decompositions Insertion of OpenMP Code struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; omp_set_num_threads (omp_get_num_procs()); #pragma omp parallel search_for_solutions (n, stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 21 Implementing Task Decompositions Original C Function to Get Work void search_for_solutions (int n, struct board *stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (stack != NULL) { ptr = stack; stack = stack->next; search (n, ptr, num_solutions); free (ptr); }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 22 Implementing Task Decompositions C/OpenMP Function to Get Work void search_for_solutions (int n, struct board *stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (stack != NULL) { #pragma omp critical { ptr = stack; stack = stack->next; } search (n, ptr, num_solutions); free (ptr); }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 23 Implementing Task Decompositions Original C Search Function void search (int n, struct board *ptr, int *num_solutions) { int i; int no_threats (struct board *); if (ptr->pieces == n) { (*num_solutions)++; } else { ptr->pieces++; for (i = 0; i < n; i++) { ptr->places[ptr->pieces-1] = i; if (no_threats(ptr)) search (n, ptr, num_solutions); } ptr->pieces--; }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 24 Implementing Task Decompositions C/OpenMP Search Function void search (int n, struct board *ptr, int *num_solutions) { int i; int no_threats (struct board *); if (ptr->pieces == n) { #pragma omp critical (*num_solutions)++; } else { ptr->pieces++; for (i = 0; i < n; i++) { ptr->places[ptr->pieces-1] = i; if (no_threats(ptr)) search (n, ptr, num_solutions); } ptr->pieces--; }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 25 Implementing Task Decompositions Only One Problem: It Doesnt Work! OpenMP program throws an exception Culprit: Variable stack Heap stack

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 26 Implementing Task Decompositions Problem Site int main () { struct board *stack;... #pragma omp parallel search_for_solutions (n, stack, &num_solutions);... } void search_for_solutions (int n, struct board *stack, int *num_solutions) {... while (stack != NULL)...

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 27 Implementing Task Decompositions 1. Both Threads Point to Top stack Thread 1Thread 2

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 28 Implementing Task Decompositions 2. Thread 1 Grabs First Element stack Thread 1Thread 2 stack ptr

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 29 Implementing Task Decompositions 3. Error #1: Thread 2 grabs same element Thread 1Thread 2 stack ptr stack ptr

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 30 Implementing Task Decompositions 4. Error #2: Thread 1 deletes element and then Thread 2s stack ptr dangles stack Thread 1Thread 2 stack ptr ?

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 31 Implementing Task Decompositions Remedy 1: Make stack Static Thread 1Thread 2 stack

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 32 Implementing Task Decompositions Remedy 2: Use Indirection stack Thread 1Thread 2 stack

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 33 Implementing Task Decompositions Corrected main Function struct board *stack;... stack = NULL; for (i = 0; i < n; i++) { initial=(struct board *)malloc(sizeof(struct board)); initial->pieces = 1; initial->places[0] = i; initial->next = stack; stack = initial; } num_solutions = 0; omp_set_num_threads (omp_get_num_procs()); #pragma omp parallel search_for_solutions (n, &stack, &num_solutions); printf ("The %d-queens puzzle has %d solutions\n", n, num_solutions);

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 34 Implementing Task Decompositions Corrected Stack Access Function void search_for_solutions (int n, struct board **stack, int *num_solutions) { struct board *ptr; void search (int, struct board *, int *); while (*stack != NULL) { #pragma omp critical { ptr = *stack; *stack = (*stack)->next; } search (n, ptr, num_solutions); free (ptr); }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 35 Implementing Task Decompositions Case Study: Fancy Web Browser

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 36 Implementing Task Decompositions Case Study: Fancy Web Browser You can see snapshot of page before deciding whether to click on the link

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 37 Implementing Task Decompositions C Code page = retrieve_page (url); find_links (page, &num_links, &link_url); for (i = 0; i < num_links; i++) snapshots[i].image = NULL; for (i = 0; i < num_links; i++) generate_preview (&snapshots[i]); display_page (page);

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 38 Implementing Task Decompositions Pseudocode, Option A Retrieve page Identify links Enter parallel region Thread gets ID number (id) If id = 0 draw page else fetch page & build snapshot image (id-1) Exit parallel region

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 39 Implementing Task Decompositions Timeline of Option A Retrieve page Identify links Enter parallel block Display page Fetch page, create snapshot Fetch, create Barrier at end of parallel block

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 40 Implementing Task Decompositions C/OpenMP Code, Option A page = retrieve_page (url); find_links (page, &num_links, &link_url); for (i = 0; i < num_links; i++) snapshots[i].image = NULL; omp_set_num_threads (num_links + 1); #pragma omp parallel private (id) { id = omp_get_thread_num(); if (id == 0) display_page (page); else generate_preview (&snapshots[id-1]); }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 41 Implementing Task Decompositions Pseudocode, Option B Retrieve page Identify links Two activities happen in parallel 1. Draw page 2. For all links do in parallel Fetch page and build snapshot image

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 42 Implementing Task Decompositions Parallel Sections #pragma omp parallel sections { #pragma omp section #pragma omp section } Meaning: The following block contains sub-blocks that may execute in parallel Dividers between sections Each block executed by one thread

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 43 Implementing Task Decompositions Nested Parallelism We can use parallel sections to specify two different concurrent activities: drawing the Web page and creating the snapshots We are using a for loop to create multiple snapshots; number of iterations is known only at run time We would like to make for loop parallel OpenMP allows nested parallelism: a parallel region inside another parallel region A thread entering a parallel region creates a new team of threads to execute it

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 44 Implementing Task Decompositions Timeline of Option B Retrieve page Identify links Enter parallel sections Display page Fetch page, create snapshot Fetch, create Barrier Enter parallel for Barrier

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 45 Implementing Task Decompositions C/OpenMP Code, Option B page = retrieve_page (url); find_links (page, &num_links, &link_url); omp_set_num_threads (2); #pragma omp parallel sections { display_page (page); #pragma omp section omp_set_num_threads (num_links); #pragma omp parallel for for (i = 0; i < num_links; i++) generate_preview (&snapshots[i]); }

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 46 Implementing Task Decompositions References David R. Butenhof, Programming with POSIX Threads, Addison-Wesley (1997). Rohit Chandra, Leonardo Dagum, Dave Kohr, Dror Maydan, Jeff McDonald, and Ramesh Menon, Parallel Programming in OpenMP, Morgan Kaufmann (2001). Ian Foster, Designing and Building Parallel Programs, Addison-Wesley (1995). Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill (2004).

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Intel ® Software College 47 Implementing Task Decompositions