
1
CISC 879 Parallel Computation: High Performance Fortran (HPF) — Ibrahim Halil Saruhan

"Although the [Fortran] group broke new ground … they never lost sight of their main objective, namely to produce a product that would be acceptable to practical users with real problems to solve. Fortran … is still by far the most popular language for numerical computation." — Maurice V. Wilkes

2
Outline
- Introduction
- Brief History of Fortran and HPF
- HPF Directives and Syntax
- Data Mapping
- Data Parallelism
- Putting It All Together
- Intrinsic Procedures
- Extrinsic Procedures
- References and Further Information

3
Introduction

HPF is:
- a language that combines the full Fortran 90 language with special user annotations dealing with data distribution;
- intended as a standard programming language for computationally intensive applications on many types of machines;
- a set of extensions to Fortran expressing parallel execution at a high level;
- designed to provide a portable extension to Fortran 90 for writing data-parallel applications.

HPFF is the group of people who developed HPF. Since its introduction almost four decades ago, Fortran has been the language of choice for scientific and engineering programming, and HPF is the latest set of extensions to this venerable language.

4
Brief History of Fortran and HPF

- Early 1950s: The first programming language to be called Fortran was developed by IBM.
- 1957: Became popular after the first compiler was delivered to a customer.
- 1966: ANSI published the first formal standard for Fortran, including features like integer, real, and double precision types, DO loops, IF conditionals, subroutines, functions, the Hollerith data type (later replaced with the character type), and global variables. This standard is called FORTRAN 66.
- 1978: ANSI and ISO published a new standard (Fortran 77), including IF / THEN / ELSE IF / END IF conditional statements; the complex data type with complex constants and complex numbers; the character type; and formatted, unformatted, and direct-access file input and output.

5
Brief History of Fortran and HPF

- To satisfy the need for efficient programming on the new generation of parallel machines, Fortran needed extensions, and that need led to the beginning of HPF.
- The desire for a revision of the FORTRAN 77 standard led to work on Fortran under the title Fortran 8X, which resulted in a 1991 ISO standard renamed Fortran 90. Its goal was to modernize Fortran so that it might continue its long history as a scientific and engineering programming language.
- The first group to discuss standardization of parallel Fortran features was the Parallel Computing Forum (PCF). The original goals of the group were to standardize the language features for task-oriented parallelism and shared-memory machines.
- November 1991: Digital Equipment Corporation organized a meeting at the Supercomputing '91 conference in Albuquerque, New Mexico to discuss HPF.

6
Brief History of Fortran and HPF

- January 1992: Kickoff meeting for HPFF in Houston, Texas, hosted by the Center for Research on Parallel Computation at Rice University. Over 130 people attended; because the meeting was larger than expected, a series of smaller "working group" meetings was scheduled to create the language draft.
- March 1992: The HPFF working group, nearly 40 people, met for the first time in Dallas, Texas. Eight further meetings were held.
- May 1993: The HPFF working group produced the HPF Language Specification, version 1.0.

7
HPF Directives and Their Syntax

The form of an hpf-directive-line (H201) is:

  directive-origin hpf-directive

where a directive-origin (H202) is one of: !HPF$, CHPF$, *HPF$

Fortran 90 allows comments to begin with "C" and "*" as well as "!" in the fixed source form, but allows only "!" to begin a comment in free source form.

There are two forms of directive in HPF:
- specification-directive (H204): must be in the specification part of the program unit;
- executable-directive (H205): appears with the other Fortran 90 executable constructs in the program unit.

Examples: ALIGN, DISTRIBUTE, PROCESSORS, …
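As a sketch, the three directive origins look like this in source code (fragments only; the array A is assumed to be declared elsewhere, and an HPF compiler is required for the directives to have any effect):

```fortran
! Free source form: only "!HPF$" may introduce a directive.
!HPF$ DISTRIBUTE A (BLOCK)

! Fixed source form additionally accepts "CHPF$" and "*HPF$",
! starting in column 1:
CHPF$ DISTRIBUTE A (BLOCK)
*HPF$ DISTRIBUTE A (BLOCK)
```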

8
HPF-conforming or not?

  !HPF$ DISTRIBUTE (CYCLIC) :: PERIODIC_TABLE                              RIGHT

  REAL PERIODIC_TABLE (103); !HPF$ DISTRIBUTE PERIODIC_TABLE (CYCLIC)      WRONG
  (a directive cannot follow a statement on the same line)

  REAL PERIODIC_TABLE (103)
  !HPF$ DISTRIBUTE PERIODIC_TABLE (CYCLIC)                                 RIGHT

  !HPF$ DISTRIBUTE PERIODIC_TABLE (CYCLIC); DISTRIBUTE LOG_TABLE (BLOCK)   WRONG
  (two directives cannot share one directive line)

  !HPF$ DISTRIBUTE PERIODIC_TABLE (CYCLIC)
  !HPF$ DISTRIBUTE LOG_TABLE (BLOCK)                                       RIGHT

9
Programming Model of HPF

- Programming model: parallelism plus communication.
- Parallelism constructs: FORALL, DO with INDEPENDENT, INTRINSIC and standard library functions, EXTRINSIC functions.

10
Data Mapping

HPF describes data-to-processor mapping using two kinds of operations.

DISTRIBUTE: a directive that describes how an array is divided into even-sized pieces and distributed to processors in a regular way. With 4 processors and the array declaration

  REAL A (100,100)

  !HPF$ DISTRIBUTE A (BLOCK, BLOCK)
  Result: each processor receives a 50x50 block of A; e.g., P1 gets A(1:50,1:50).

  !HPF$ DISTRIBUTE A (CYCLIC, *)
  Result: each processor receives every 4th row of A; e.g., P1 gets A(1,1:100), A(5,1:100), A(9,1:100), …
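Per dimension, BLOCK and CYCLIC ownership reduces to simple index arithmetic. The sketch below is plain Fortran 90 (no HPF compiler needed); the program and its names are illustrative only, not part of HPF:

```fortran
PROGRAM OWNERSHIP
  ! Which of P processors owns row I of an N-row array?
  INTEGER, PARAMETER :: N = 100, P = 4
  INTEGER :: I
  DO I = 1, N, 25
     ! BLOCK: contiguous chunks of CEILING(N/P) = 25 rows each
     PRINT *, 'Row', I, ' BLOCK owner: ',  (I - 1) / ((N + P - 1) / P) + 1
     ! CYCLIC: rows dealt round-robin, one at a time
     PRINT *, 'Row', I, ' CYCLIC owner:',  MOD(I - 1, P) + 1
  END DO
END PROGRAM OWNERSHIP
```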

11
Data Mapping

ALIGN: a directive that describes how two arrays "line up" together.

  !HPF$ ALIGN X(I) WITH Y(I)
  Result: X and Y are always distributed the same way.

  !HPF$ ALIGN X(I) WITH Y(2*I-1)
  Result: elements of X correspond to the odd-indexed elements of Y (X can have at most half as many elements as Y).
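ALIGN only relates arrays to one another; combining it with a DISTRIBUTE on the align target then places the whole group. A minimal sketch (sizes are my choice, picked to respect the at-most-half constraint; requires an HPF compiler):

```fortran
REAL X(50), Y(100)
!HPF$ ALIGN X(I) WITH Y(2*I-1)   ! X(I) is stored wherever Y(2*I-1) is
!HPF$ DISTRIBUTE Y (BLOCK)       ! distributing Y now places X as well
```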

12
Data Mapping Example (4 processors)

  REAL DECK_OF_CARDS (52)
  !HPF$ DISTRIBUTE DECK_OF_CARDS (CYCLIC)
  Result: P1 gets DECK_OF_CARDS (1:49:4), P2 gets (2:50:4), P3 gets (3:51:4), P4 gets (4:52:4).

  REAL DECK_OF_CARDS (52)
  !HPF$ DISTRIBUTE DECK_OF_CARDS (CYCLIC(5))
  Result: P1 gets DECK_OF_CARDS (1:5), (21:25), and (41:45).
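The CYCLIC(5) dealing can be checked with ordinary integer arithmetic. This plain-Fortran sketch (a hypothetical checker, no HPF needed) prints the cards held by processor 1, which come out as 1:5, 21:25, and 41:45, matching the slide:

```fortran
PROGRAM DEAL
  ! CYCLIC(5) over 4 processors: blocks of 5 consecutive cards are
  ! dealt round-robin, so 0-based block B lands on MOD(B, P) + 1.
  INTEGER, PARAMETER :: N = 52, P = 4, K = 5
  INTEGER :: I
  DO I = 1, N
     IF (MOD((I - 1) / K, P) + 1 == 1) PRINT *, 'PROC 1 holds card', I
  END DO
END PROGRAM DEAL
```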

13
HPF Data Mapping Model

  Arrays or other objects
      |  ALIGN (static) or REALIGN (dynamic)
      v
  Group of aligned objects
      |  DISTRIBUTE (static) or REDISTRIBUTE (dynamic)
      v
  Abstract processors, as a user-declared Cartesian mesh
      |  optional implementation-dependent directive
      v
  Physical processors

14
Data Mapping Example

  REAL, DIMENSION (16) :: A, B, C
  REAL, DIMENSION (32) :: D
  REAL, DIMENSION (8) :: X
  REAL, DIMENSION (0:9) :: Y
  INTEGER, DIMENSION (16) :: INX
  !HPF$ PROCESSORS, DIMENSION(4) :: PROCS
  !HPF$ DISTRIBUTE (BLOCK) ONTO PROCS :: A, B, D, INX
  !HPF$ DISTRIBUTE (CYCLIC) ONTO PROCS :: C
  !HPF$ ALIGN (I) WITH Y(I+1) :: X

15
HPF Data Mapping Declaration

(Figure: resulting layout of A, B, INX, D, C, X, and Y across PROCS(1)–PROCS(4).)

16
HPF Data Mapping Example 1

  FORALL (I=1:16) A(I) = B(I)

No communication: A and B are BLOCK-distributed identically, so every element is local.

17
HPF Data Mapping Example 2

  FORALL (I=1:16) A(I) = C(I)

Total communication is 12 elements: A is BLOCK-distributed but C is CYCLIC-distributed, so only 4 of the 16 elements are already local.

18
HPF Data Mapping Example 3

  FORALL (I=1:15) A(I) = B(I+1)

Total communication is 3 elements: only the values at the block boundaries (B(5), B(9), B(13)) must move between processors.

19
Data Parallelism

The most important features used by HPF for parallelism are FORALL and INDEPENDENT.

- FORALL generalizes the Fortran 90 array assignment to handle new shapes of arrays. FORALL is not a loop, nor is it a parallel loop as defined in some languages; FORALL does not iterate in any well-defined order.
- The INDEPENDENT directive gives the compiler more information about a DO loop or FORALL statement. It tells the compiler that a DO loop makes no data accesses that would force the loop to be run sequentially:

  !HPF$ INDEPENDENT
  DO I=1,N
     X(INDX(I)) = Y(I)
  END DO
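INDEPENDENT is an assertion by the programmer, not something the compiler verifies. A sketch of a valid and an invalid use (the first loop assumes INDX holds a permutation, i.e. no repeated values):

```fortran
! Valid: if INDX contains no repeated values, no two iterations
! touch the same element of X, so they may run in any order.
!HPF$ INDEPENDENT
DO I = 1, N
   X(INDX(I)) = Y(I)
END DO

! NOT independent: iteration I reads X(I-1), which the previous
! iteration writes, so asserting INDEPENDENT here would be wrong.
DO I = 2, N
   X(I) = X(I-1) + Y(I)
END DO
```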

20
Data Parallelism

There are two kinds of FORALL statement: single-statement and multi-statement. A single-statement FORALL:

  FORALL (I = 2:5) A(I,I) = A(I-1,I-1)

21
Data Parallelism

A multi-statement FORALL:

  FORALL (I = 1:8)
     A(I,I) = SQRT(A(I,I))
     FORALL (J = I-3:I+3, J /= I .AND. J >= 1 .AND. J <= 8) A(I,J) = A(I,I) * A(J,J)
  END FORALL
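One consequence of FORALL not being a loop: every right-hand side is evaluated before any assignment takes place. A small sketch of how this differs from DO (fragment; I is assumed declared INTEGER):

```fortran
REAL :: A(5) = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)

! FORALL evaluates all the A(I-1) values first, then assigns, so
! each A(I) receives the OLD neighbour: A becomes 1, 1, 2, 3, 4.
FORALL (I = 2:5) A(I) = A(I-1)

! The look-alike DO loop propagates the NEW values instead:
!   DO I = 2, 5
!      A(I) = A(I-1)
!   END DO
! which would leave A = 1, 1, 1, 1, 1.
```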

22
Data Parallelism

(Slides 22–27 contained figures only; no text content survives.)
Putting It All Together

The total performance of an HPF program is the combination of parallelism and communication. The performance of an HPF program will depend on the programming model, compiler design, target machine characteristics, and other factors. A simple model for the total computation time of a parallel program is:

  Ttotal = Tpar / Pactive + Tserial + Tcomm

where:
- Ttotal is the total execution time,
- Tpar is the total work that can be executed in parallel,
- Pactive is the number of (physical) processors that are active, executing the work in Tpar,
- Tserial is the total work that is done serially,
- Tcomm is the cost of communications.
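Plugging hypothetical numbers into the model shows why Tserial and Tcomm matter: they are not divided by Pactive, so they cap the achievable speedup. A small sketch with made-up inputs (all values in the same arbitrary time units):

```fortran
PROGRAM MODEL
  ! Ttotal = Tpar/Pactive + Tserial + Tcomm, with illustrative inputs.
  REAL, PARAMETER :: TPAR = 400.0, TSERIAL = 10.0, TCOMM = 30.0
  INTEGER :: P
  DO P = 1, 8
     PRINT *, 'Pactive =', P, '  Ttotal =', TPAR / P + TSERIAL + TCOMM
  END DO
  ! Going from 4 to 8 processors only improves Ttotal from 140 to 90:
  ! the serial and communication terms dominate.
END PROGRAM MODEL
```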

28
Example

  REAL, DIMENSION(16,16) :: X, Y
  …
  FORALL (J = 2:15, K = 2:15)
     Y(J,K) = (X(J,K) + X(J-1,K) + X(J+1,K) + X(J,K-1) + X(J,K+1)) / 5.0
  END FORALL

Various distributions of the 16x16 array onto four processors:
  DISTRIBUTE X (*, BLOCK)     — four 16x4 column panels, one per processor
  DISTRIBUTE X (BLOCK, BLOCK) — four 8x8 quadrants, one per processor
  DISTRIBUTE X (*, CYCLIC)    — columns dealt round-robin to the processors

29
Example

DISTRIBUTE X (*, BLOCK) — four 16x4 column panels (P1 P2 P3 P4):
- P2 and P3 each must compute 56 elements of Y (a 14x4 sub-array of Y); P1 and P4 each must compute only 42 elements (a 14x3 sub-array), so they have less work to do.
- P2 must exchange 14 elements of X with P1 and another 14 elements of X with P3; P3 has the same communication as P2.
- Overall completion time: Tpar/Pactive is 56 element-computations, and the communication overhead (Tcomm) is 28 element-exchanges.

DISTRIBUTE X (BLOCK, BLOCK) — four 8x8 quadrants:
- Each processor holds an 8x8 sub-array and must compute 49 elements of Y; e.g., P1 must compute Y(2:8,2:8).
- Each processor can compute 36 elements of Y without requiring communication; for the remaining 13 elements it must obtain 7 elements of X from each of two other processors.
- Overall: Tpar/Pactive is 49 element-computations, and Tcomm is 14 element-exchanges.

30
Intrinsic and Library Procedures

- System inquiry functions (e.g., NUMBER_OF_PROCESSORS, PROCESSORS_SHAPE, SIZE)
- Mapping inquiry subroutines (e.g., HPF_ALIGNMENT, HPF_TEMPLATE, HPF_DISTRIBUTION)
- Computational functions:
  - Bit manipulation functions (e.g., ILEN, LEADZ, POPCNT, POPPAR)
  - Array location functions (e.g., MAXLOC, MINLOC)
  - Array reduction functions (e.g., IALL, IANY, IPARITY, PARITY)
  - Array combining scatter functions (e.g., SUM_SCATTER, ALL_SCATTER, ANY_SCATTER)
  - Array prefix and suffix functions (e.g., SUM_PREFIX, SUM_SUFFIX)
  - Array sorting functions (e.g., GRADE_UP, GRADE_DOWN)

31
Extrinsic Procedures

HPF provides a mechanism by which HPF programs may call procedures written in other parallel programming languages. Because such procedures are outside of HPF, they are called extrinsic procedures. For instance:

  INTERFACE
     EXTRINSIC (COBOL) SUBROUTINE PRINT_REPORT(DATA_ARRAY)
        REAL DATA_ARRAY(:,:)
     END SUBROUTINE PRINT_REPORT
  END INTERFACE

32
References and Further Information

- The High Performance Fortran Handbook, Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, The MIT Press, 1994 (in the library)
- Designing and Building Parallel Programs, Ian Foster, unix.mcs.anl.gov/dbpp/text/node82.html#SECTION
- HPF material at Rice University and Syracuse University

33
Questions?
