cOMPunity: The OpenMP Community. Barbara Chapman, cOMPunity / University of Houston.


1 cOMPunity: The OpenMP Community
Barbara Chapman
cOMPunity / University of Houston

2 Contents
- A brief cOMPunity history
- Board of Directors
- Finances
- Workshops
- Meantime, in the rest of the world…
- Membership
- Our web presence
- Participation in ARB committee work
- OpenMP futures

3 cOMPunity Goals
- Provide continuity for workshops
- Participate in the work of the ARB
- Promote the API
- Provide information, primarily via the website
To join the ARB, it was necessary to found a company:
- Officially based in the US state of Delaware
- Non-profit company, founded at the end of 2001

4 Board of Directors
Must have 4 directors according to the by-laws. Current:
- Barbara Chapman (CEO, finances)
- Mark Bull (Secretary)
- Dieter an Mey (web services)
- Mitsuhisa Sato (Asia)
- Alistair Rendell (Australasia)
- Eduard Ayguade (OpenMP language)
- Mike Voss (workshops)
- Rudi Eigenmann (workshops)
We need to hold elections soon.

5 Finances
Goal: build up a reserve to ensure workshops.
Expenses:
- Annual fee to the agent in the State of Delaware
- Delaware franchise tax (only $25 at present)
- Web domain registration
- (The ARB has waived the membership fee)
Income: membership fees and surplus from workshops.
Current balance: ca. $18,400.
Non-profit status recognized for federal tax purposes.
All cOMPunity work is unpaid.

6 OpenMP Focus Workshops
First workshop at Lund, Sweden, 1999. Organized annually since 2000:
- EWOMP in Europe
- WOMPAT in North America
- WOMPEI in Asia
Strong regional participation. Aachen introduced the OMPlab to the format.

7 (Figure slide; contact: wallcraf@nlr.com)

8 Comments from the First Workshop
- It is easy to get OpenMP code up and running, and it can be done incrementally.
- It is also easy to combine OpenMP and MPI: a straightforward migration path for MPI code.
- OpenMP is well suited to SPMD-style coding.
- It is not easy to optimize for cache, but doing so is essential for good performance; compilers should do the cache optimization.

9 OpenMP Language: Comments
- Some workarounds are required when porting:
  - Fortran 90 constructs not truly supported
  - Array reductions not possible
  - Need threadprivate for variables
- I/O needs more consideration; extensions proposed for I/O and synchronization.
- Extensions may be required for new kinds of HPC applications.
- Libraries are needed.

10 Performance: Comments
- Scalable applications have been developed.
- Some specific performance problems, e.g. I/O and memory allocation.
- Significant differences in overheads for OpenMP constructs on different platforms.
- On cc-NUMA systems, performance is too dependent on OS and system load.
- EPCC OpenMP Microbenchmark available at http://www.epcc.ed.ac.uk/research/openmpbench/

11 OpenMP Language: Comments
- A major problem with OpenMP is its range of applicability.
- Needs significant extension for use on cc-NUMA and distributed memory systems.
- Data and thread placement may need to be coordinated explicitly by the user.
- Some vendor-specific extensions exist, but no standards.

12 Summary: First Meeting
- High level of satisfaction with the development experience, but understanding of cache optimizations often limited.
- Mostly SPMD programming style adopted.
- Many using OpenMP together with MPI.
- Some language and performance problems identified.
- Much discussion of cc-NUMA performance.
- Confidence in the market for OpenMP expressed.

13 Uptake of OpenMP
- Widely available
- Single source code
- Ease of programming
- Major industry codes ported to OpenMP
- Many hybrid codes in use
- Many more users: experts and novice parallel programmers

14 ECHAM5: OpenMP vs. MPI
(Figure: speedup vs. number of processors P on an IBM p690. Courtesy of Oswald Hahn, MPI for Meteorology.)

15 What has the ARB Been Doing?
- OpenMP 1.0 to 2.0 to 2.5
- Much clearer specs
- Some nasty parts, especially flush and the memory model; this is actually a problem elsewhere too (e.g. Pthreads)
- Tools work didn't produce interfaces for performance tools or debuggers
- Lots of ideas for new features waiting for discussion

16 Life is Short, Remember?
It's official: OpenMP is easier to use than MPI!

17 Life is Short, Remember?
It's official: OpenMP is easier to use than MPI and UPC! (Not actually tested on real subjects.)

18 Workshops: What has Changed?
- Workshops now merged into one international workshop (IWOMP), which will rotate location:
  - IWOMP 2006 in Europe
  - IWOMP 2007 in Australia
- Need a steering committee; email suggestions for this to chapman@cs.uh.edu
- How does this affect the date of the event?

19 Workshops: Format and Content
- What about the content?
- Should we be working harder to get new kinds of users? If so, how?
- Publishable papers?
- Contributions from other sources: OMPlab, tutorial

20 Status: Membership
- The BOD wanted membership in cOMPunity to be more-or-less free to academics.
- The solution was to make it part of workshop registration; in other words, participants at workshops are members.
- In the past, a de facto discount for attendance at multiple workshops.
- Now there is only one annual workshop, so we are (pretty much) the current members.
- You can join individually too (the fee is $50).

21 Membership, ctd.
What is the benefit of being a member?
- Ability to participate in ARB deliberations; this needs to be better organized
- Members-only discussion list
New proposal: membership runs for two years from the attended workshop, up to the workshop in that year (no matter what the date).

22 Our Web Presence
- www.compunity.org seems to be pretty useful
- Was managed at UH, with input from BOD members; now managed by RWTH Aachen
- www.iwomp.org

23 Participation in ARB
Participation in ARB committees: ARB, Tools, Futures and Language.
- ARB: Barbara Chapman; reports are produced by Matthijs van Waveren (Fujitsu)
- Tools: various, including originators of the POMP interface
- Language: UPC Barcelona, but no regular participation on the OpenMP 2.5 committee

24 Challenges and Opportunities
- Single processor optimization: multiple virtual processors on a single chip need multi-threading.
- Applications outside scientific computing: compute-intensive commercial and financial applications need HPC technology; multiprocessor game platforms are coming.
- Clusters and distributed shared memory: clusters are the fastest growing HPC platform. Can OpenMP play a greater role?
- Does OpenMP have the right language features for these?

25 Completeness
- If we don't cover a broad enough range of parallel applications, someone else will.
- Explicit threading, distributed programming? Is OpenMP able to meet the needs of asynchronous or scalable computing?
- Is there an inherent problem, or is some work on the language needed?
- The risk: fragmentation of parallel programming APIs, which would be bad for HPC.

26 OpenMP 3.0
- A list of proposed features was prepared this week.
- Not all of them have a concrete proposal.
- They are listed in the following slides.
- The order of listing does NOT imply anything about priority, overall importance or status of a proposal.

27 OpenMP 3.0: Suggested Features
- Task queues (there is a proposal)
- Semaphores (there is a proposal)
- A collapse clause to allow parallelization of perfect loop nests (there is a proposal)

28 OpenMP 3.0: Suggested Features
- Parallelization of wavefront loop nests (there is a proposal)
- Thread groups, named sections and precedence relationships (there is a proposal)
- An internal control variable and environment variable to control slave thread stack size (there is a proposal)

29 OpenMP 3.0: Suggested Features
- Automatic data scope clause (there is a proposal)
- SCHEDULE clause for sections (there is no proposal)
- Error reporting mechanism (there is no proposal)

30 OpenMP 3.0: Suggested Features
- More kinds of schedules, including one where enough can be assumed to make NOWAIT useful (there are several proposals)
- Reductions with user-defined functions (esp. min/max reductions in C/C++) (there is no proposal)
- Array reductions in C/C++ (there is no proposal)

31 OpenMP 3.0: Suggested Features
- A reduce clause/construct to force a reduction inside a parallel region (there is no proposal)
- Insist on (instead of merely permitting) multiple copies of internal control variables (there is a proposal)
- Define interactions with standard thread APIs (there is no proposal)

32 OpenMP 3.0: Suggested Features
- An INDIRECT clause to specify partially parallel loops (there is a proposal)
- Library routines to support nested parallelism (team ids, global thread ids, etc.) (there is no proposal)
- If a POMP-like profiling interface never happens, some basic profiling mechanism (there is no proposal)

33 OpenMP 3.0: Suggested Features
- Support for default(private) in C/C++ (there is no proposal)
- Additional clauses to make workshare more flexible (there is no proposal)
- Include F2003 in the set of base languages (there is no proposal)

34 OpenMP 3.0: Suggested Features
- Non-flushing lock routines (there is no proposal)
- Support for atomic writes (there is no proposal)

35 OpenMP 3.0: Proposed Fixes
- Remove the possibility of storage reuse for private variables
- Define more clearly where constructors/destructors are called
- Define clearly how threadprivate objects should be initialized
- Widen the scope of persistence of threadprivate in nested parallel regions

36 OpenMP 3.0: Proposed Fixes
- Allow unsigned integers as parallel loop iteration variables in C/C++
- Fix the C/C++ directive grammar
- Address reading of environment variables when libraries are loaded multiple times

37 Validating OpenMP 2.5 for Fortran and C/C++
Mathias Mueller
HLRS (High Performance Computing Center Stuttgart) and University of Houston

38 Moving OpenMP Forward
- What else matters? Modularity? Libraries? …?
- Even more widely: some users have been asking for a variety of hints and/or assertions to give more information to the compiler. This is not really OpenMP specific.

39 Moving OpenMP Forward
Tools committee:
- Many users complain about a relative lack of tools
- How can we help get better tools?
- Can we share infrastructure to get more open-source tools?
- What kind of tool support is (most) important?

40 cOMPunity Activities
- Participation in ARB committees (ARB, Futures/Language, Tools); this requires commitment
- Workshops
- Web presence
- Other?
- Need to participate in the 3.0 effort

41 Outlook
Let's round up those cycles!

42 Elections
- Current officers are willing to serve
- Must have at least four
- Roles: Chair, Secretary, Finances, Outreach, Workshops, Regional events

43 OpenMP ARB: Current Organization
OpenMP Board of Directors:
- Greg Astfalk, HP (Chair)
- David Klepacki, IBM
- Ken Miura, Fujitsu
- Sanjiv Shah, KSL/Intel
- Josh Simons, Sun
OpenMP ARB (administrative), one representative per member.
OpenMP Officers:
- Sanjiv Shah, CEO
- David Poulsen, CFO
- Larry Meadows, Secretary
OpenMP Committees (where the actual work happens), one representative per member:
- Language: Mark Bull
- Futures: Mark Bull
- Debug: Bronis de Supinski
- Performance: Bronis de Supinski
- MPIT: Bronis de Supinski

44 OpenMP 3.0: Pointer Chasing Loops
Can OpenMP today handle pointer chasing loops?

    nodeptr list, p;
    for (p = list; p != NULL; p = p->next)
        process(p->data);

Yes it can, at least for simple cases:

    nodeptr list, p;
    #pragma omp parallel private(p)
    for (p = list; p != NULL; p = p->next)
        #pragma omp single nowait
        process(p->data);

45 OpenMP 3.0: Pointer Chasing Loops
A better way has been proposed: workqueuing.

    #pragma omp parallel taskq private(p)
    for (p = list; p != NULL; p = p->next)
        #pragma omp task
        process(p->data);

Key concept: separate work iteration from work generation, which are combined in omp for. Syntactic variations have been proposed by Sun and the Nanos threads group. This method is very flexible.
Reference: Shah, Haab, Petersen and Throop, EWOMP 1999.

46 Parallelization of Loop Nests

    do i = 1, 33
      do j = 1, 33
        ... loop body ...
      end do
    end do

With 32 threads, how can we get good load balance without manually collapsing the loops? Can we handle non-rectangular and/or imperfect nests?

47 Portability of OpenMP
- Thread stacksize: different vendor defaults, different ways to request a given size; need to standardize this.
- Behavior of code between parallel regions: do threads sleep? Busy-wait? Can the user control this? Again, need to standardize the options.

48 OpenMP Enhancements: OpenMP Must Be More Modular
Define how OpenMP interfaces to "other stuff":
- How can an OpenMP program work with components implemented with OpenMP?
- How can OpenMP work with other thread environments?
Support library writers: OpenMP needs an analog of MPI's contexts.
We don't have any solid proposals on the table to deal with these problems.

49 Automatic Data Scoping
Create a standard way to ask the compiler to figure out data scoping; when in doubt, the compiler serializes the construct.

    int j;
    double x, result[COUNT];
    #pragma omp parallel for default(automatic)
    for (j = 0; j < COUNT; j++) {
        x = bigCalc(j);
        result[j] = hugeCalc(x);
    }

Here we ask the compiler to figure out that x should be private.

50 Execution with Reduced Synchronization
Part of the computation of the gradient of hydrostatic pressure in the POP code. (Figure: runtime execution model, where c stands for chunk, and the dataflow execution model associated with the translated code.)

    !$OMP PARALLEL
    !$OMP DO
    do i = 1, imt
      RHOKX(imt,i) = 0.0
    enddo
    !$OMP ENDDO
    !$OMP DO
    do i = 1, imt
      do j = 1, jmt
        if (k .le. KMU(j,i)) then
          RHOKX(j,i) = DXUR(j,i)*p5*RHOKX(j,i)
        endif
      enddo
    enddo
    !$OMP ENDDO
    !$OMP DO
    do i = 1, imt
      do j = 1, jmt
        if (k > KMU(j,i)) then
          RHOKX(j,i) = 0.0
        endif
      enddo
    enddo
    !$OMP ENDDO
    if (k == 1) then
    !$OMP DO
      do i = 1, imt
        do j = 1, jmt
          RHOKMX(j,i) = RHOKX(j,i)
        enddo
      enddo
    !$OMP ENDDO
    !$OMP DO
      do i = 1, imt
        do j = 1, jmt
          SUMX(j,i) = 0.0
        enddo
      enddo
    !$OMP ENDDO
    endif
    !$OMP SINGLE
    factor = dzw(kth-1)*grav*p5
    !$OMP END SINGLE
    !$OMP DO
    do i = 1, imt
      do j = 1, jmt
        SUMX(j,i) = SUMX(j,i) + factor * &
                    (RHOKX(j,i) + RHOKMX(j,i))
      enddo
    enddo
    !$OMP ENDDO
    !$OMP END PARALLEL

51 Producer/Consumer Example
The correct version according to 2.5:

Producer:

    data = ...
    !$omp flush(data,flag)
    flag = 1
    !$omp flush(flag)

Consumer:

    do
      !$omp flush(flag)
      if (flag .ne. 0) exit
    end do
    !$omp flush(data)
    ... = data

52 Workshops
Organized annually since 2000:
- EWOMP in Europe
- WOMPAT in North America
- WOMPEI in Asia
Strong regional participation; Aachen introduced the OMPlab to the format.
These have been a niche event. Most OpenMP users are satisfied (or at least not thinking about how it could evolve). OpenMP is supposed to be easy, right?

53 53 What’s in a Flush? Flush writes data to and reads from memory It doesn’t synchronize threads According to the new rules Compiler is free to reorder flush directives if they are on different variables Two flushes on same variables must be seen by all threads in the same order

