Application of Fortran 90 to ocean model codes Mark Hadfield National Institute of Water and Atmospheric Research New Zealand.

Presentation transcript:

Application of Fortran 90 to ocean model codes Mark Hadfield National Institute of Water and Atmospheric Research New Zealand

Slide 2: It’s not about me but…
©2003 NIWA. All Rights Reserved.
I have used ocean models:
  – ROMS
  – POL3D
  – MOMA
I program in
  – Fortran
  – IDL
  – Python, Matlab, C++, …
I run models and analysis software on
  – PC, Win 2000, Compaq Fortran or Cygwin g77
  – Cray T3E
  – Compaq Alpha machines with 1 or 2 CPUs

Slide 3: Outline of this talk
Fortran 90/95 new features
Some Fortran 90/95 features in more detail
Application in ROMS
Tales from the front line
ROMS 1 and ROMS 2 performance
Conclusions

Slide 4: Fortran 90/95 new features (1)
Dynamic data objects
  – Allocatable arrays
  – Pointers
  – Automatic data objects
Array processing
  – Whole-array operations
  – Array subset syntax
  – New intrinsic functions
  – Pointers
Free-format source
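As a minimal sketch of several features listed on this slide (an allocatable array, an automatic array, whole-array assignment, an array section and the SUM intrinsic), consider the module below. The names `array_features_demo` and `fill_and_sum` are hypothetical and are not taken from the talk or from any ocean model.

```fortran
module array_features_demo
  implicit none
contains
  ! Hypothetical example routine: builds an n-by-n array and sums it.
  subroutine fill_and_sum(n, total)
    integer, intent(in) :: n
    real, intent(out) :: total
    real, allocatable :: a(:,:)   ! allocatable array, sized at run time
    real :: work(n)               ! automatic array, sized by the argument n

    allocate (a(n,n))
    a = 1.0                       ! whole-array assignment
    a(:,1) = 2.0                  ! array section: the first column
    work = sum(a, dim=2)          ! intrinsic: sum along dimension 2 (one value per row)
    total = sum(work)
    deallocate (a)
  end subroutine fill_and_sum
end module array_features_demo
```

With n = 4 each row sums to 5, so a caller would receive a total of 20.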

Slide 5: Fortran 90/95 new features (2)
Modules
  – Replace COMMON for packaging global data
  – Hide information
  – Bundle operations with data
INCLUDE statement!
Procedure enhancements
  – Explicit interfaces (MODULE procedures & INTERFACE statement)
  – Optional & named arguments
  – Generic procedures
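The module and procedure features above can be illustrated with a small sketch. It assumes a hypothetical module `tally_mod` that hides its data behind PRIVATE, exposes an optional argument, and offers one generic name (`bump`) for two specific procedures. None of these names come from the talk.

```fortran
module tally_mod
  implicit none
  private                          ! hide everything unless listed as public
  public :: reset, bump, current

  integer, save :: total = 0       ! hidden module data (the role COMMON used to play)

  interface bump                   ! generic name resolved by argument type
    module procedure bump_int, bump_real
  end interface

contains

  subroutine reset(start)
    integer, intent(in), optional :: start
    total = 0
    if (present(start)) total = start
  end subroutine reset

  subroutine bump_int(step)
    integer, intent(in) :: step
    total = total + step
  end subroutine bump_int

  subroutine bump_real(step)
    real, intent(in) :: step
    total = total + nint(step)     ! round to the nearest integer
  end subroutine bump_real

  function current() result(c)
    integer :: c
    c = total
  end function current

end module tally_mod
```

A caller could write `call reset(start=10)`, then `call bump(2)` and `call bump(2.6)`. Because the procedures live in a module, the compiler sees explicit interfaces and can check each call.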

Slide 6: Fortran 90/95 new features (3)
Data structures
User-defined types and operators
New control structures
Portable ways to specify numeric precision
I/O enhancements
  – Name lists
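A short sketch of two of these features follows: a kind parameter chosen with SELECTED_REAL_KIND, and a user-defined type with an overloaded `+` operator. The kind name `r8` also appears in the ROMS declaration quoted later in the talk, but its definition here is an assumption, as are the names `vec_mod`, `vec2` and `add_vec2`.

```fortran
module vec_mod
  implicit none

  ! Ask for at least 12 decimal digits and an exponent range of 300
  ! instead of relying on a compiler-specific REAL*8.
  integer, parameter :: r8 = selected_real_kind(12, 300)

  type vec2                        ! user-defined type
    real(r8) :: x, y
  end type vec2

  interface operator(+)            ! user-defined meaning for + on vec2
    module procedure add_vec2
  end interface

contains

  function add_vec2(a, b) result(c)
    type(vec2), intent(in) :: a, b
    type(vec2) :: c
    c%x = a%x + b%x
    c%y = a%y + b%y
  end function add_vec2

end module vec_mod
```

After `use vec_mod`, an expression such as `vec2(1.0_r8, 2.0_r8) + vec2(3.0_r8, 4.0_r8)` yields a `vec2` with components 4 and 6.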

Slide 7: Fortran 90 Pros & Cons (1)
Pros
  – Data organisation can match the problem better
  – Enables more readable, maintainable code
  – Stricter argument checking
  – Libraries can present a cleaner interface
  – Supported by all (?) commercial compilers

Slide 8: Fortran 90 Pros & Cons (2)
Cons
  – Greater feature load can reduce readability & maintainability
  – Potential reduction in performance
  – No satisfactory open-source compiler
  – Modules complicate building a program with multiple source files
  – Commercial compilers still have performance problems and bugs

Slide 9: ALLOCATABLE and POINTER (1)
Allocatable arrays can be allocated
    real, allocatable :: a(:,:)
    …
    allocate (a(10,20))
So can pointers
    real, pointer :: a(:,:)
    …
    allocate (a(10,20))
Pointers can be associated with arrays & array subsets (& scalars)
    real, pointer :: a(:,:),b(:,:)
    real, target :: c(10,20)
    …
    a => c
    b => c(1:5,1:10)

Slide 10: ALLOCATABLE and POINTER (2)
Pointers can be components of a structure
    type mytype
      real, pointer :: a(:,:)
    end type mytype
    …
    type (mytype) x
    …
    allocate (x%a(10,20))
Allocatable arrays can’t be structure components. This is unfortunate because pointers are harder to manage and optimise
  – A pointer may be associated with non-contiguous memory locations
  – Pointers complicate “ownership”
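To make the two caveats concrete, here is a hedged sketch in which a structure owns a pointer component and a second, local pointer is associated with every other row of the same storage, which is a non-contiguous section. The names `field_mod`, `field`, `create`, `destroy` and `mark_odd_rows` are hypothetical. The `=> null()` initialiser is a Fortran 95 feature, not Fortran 90.

```fortran
module field_mod
  implicit none

  type field
    real, pointer :: a(:,:) => null()   ! pointer component (null() is Fortran 95)
  end type field

contains

  subroutine create(f, m, n)
    type(field), intent(inout) :: f
    integer, intent(in) :: m, n
    allocate (f%a(m,n))                 ! the structure now "owns" this storage
    f%a = 0.0
  end subroutine create

  subroutine destroy(f)
    type(field), intent(inout) :: f
    ! Nothing frees the storage automatically: the owner must do it.
    if (associated(f%a)) deallocate (f%a)
  end subroutine destroy

  subroutine mark_odd_rows(f, value)
    type(field), intent(inout) :: f
    real, intent(in) :: value
    real, pointer :: p(:,:)
    p => f%a(1::2, :)                   ! rows 1, 3, 5, ...: non-contiguous in memory
    p = value                           ! writes through to f%a
  end subroutine mark_odd_rows

end module field_mod
```

The alias `p` shares storage with `f%a` but does not own it. A compiler that cannot prove where a pointer points may have to assume such strided targets, which is one source of the optimisation difficulty mentioned above.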

Slide 11: Passing arrays to subprograms: Explicit-shape
Example:
    program main
      real :: a(10,20), b(200)
      call mysub (10,20,a)
      call mysub (10,20,b)
      write (unit=*,fmt=*) a,b
    end program main
    subroutine mysub (m,n,x)
      integer, intent(in) :: m,n
      real, intent(out) :: x(m,n)
      x(:,:) = 1.0
    end subroutine mysub
The subprogram is given access to a contiguous block of data (m × n elements). It should not write beyond that block. (But what if it does?)

Slide 12: Passing arrays to subprograms: Assumed-shape
Example:
    program main
      real :: a(10,20)
      call mysub (a)
      write (unit=*,fmt=*) a
    contains
      subroutine mysub (x)
        real, intent(out) :: x(:,:)
        x(:,:) = 1.0
      end subroutine mysub
    end program main
The subprogram is given an array descriptor that tells it where to find the data

Slide 13: Explicit shape vs assumed-shape
Explicit-shape:
  – Dummy array need not match actual array in rank & shape
  – Subprogram can be more efficient because contiguous data guaranteed
  – If actual array not contiguous, a temporary copy is needed
Assumed-shape:
  – No need for array-shape arguments. This is simpler but less self-documenting
  – Passing array descriptor more costly?
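The contrast can be sketched by passing the same kind of non-contiguous section to both forms of dummy argument. The module `shape_demo` and its two routines are hypothetical names. Whether a compiler actually creates a temporary copy for the explicit-shape call is implementation-dependent, so the comments describe the typical behaviour rather than a guarantee.

```fortran
module shape_demo
  implicit none
contains

  ! Assumed-shape: the routine receives a descriptor (bounds and strides),
  ! so a strided section can be used in place.
  subroutine fill_assumed(x, value)
    real, intent(out) :: x(:,:)
    real, intent(in) :: value
    x = value
  end subroutine fill_assumed

  ! Explicit-shape: the routine expects m*n contiguous elements. For a
  ! strided actual argument the compiler typically copies in and out.
  subroutine fill_explicit(m, n, x, value)
    integer, intent(in) :: m, n
    real, intent(out) :: x(m,n)
    real, intent(in) :: value
    x = value
  end subroutine fill_explicit

end module shape_demo
```

For an array `a(10,20)`, the calls `call fill_assumed(a(1:10:2,:), 1.0)` and `call fill_explicit(5, 20, a(2:10:2,:), 2.0)` both work. The second needs the extents passed by hand, which is the self-documenting but more verbose style noted on the slide.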

Slide 14: ROMS 2 structure
Model data are in structure variables packaged in modules
Main model arrays are pointer components in the structures and are dynamically allocated
Subroutines access model arrays by USEing the module or via arguments
The model currently supports explicit-shape and assumed-shape dummy array declarations, selected via the ASSUMED_SHAPE preprocessor symbol
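The layout described on this slide might look roughly like the sketch below. It is not ROMS source code: the module names `mod_ocean` and `mod_step`, the type `t_ocean`, the field `zeta` and both routines are invented for illustration, and only the explicit-shape variant is shown, without the preprocessor switch.

```fortran
module mod_ocean
  implicit none
  integer, parameter :: r8 = selected_real_kind(12, 300)   ! assumed definition

  type t_ocean
    real(r8), pointer :: zeta(:,:)      ! main model array as a pointer component
  end type t_ocean

  type(t_ocean), save :: ocean          ! structure variable packaged in the module

contains

  subroutine allocate_ocean(LBi, UBi, LBj, UBj)
    integer, intent(in) :: LBi, UBi, LBj, UBj
    allocate (ocean%zeta(LBi:UBi, LBj:UBj))   ! dynamic allocation with model bounds
    ocean%zeta = 0.0_r8
  end subroutine allocate_ocean

end module mod_ocean

module mod_step
  use mod_ocean
  implicit none
contains

  ! Driver: reaches the data by USEing the module, then hands it on as an argument.
  subroutine step()
    call step_tile(lbound(ocean%zeta,1), ubound(ocean%zeta,1), &
                   lbound(ocean%zeta,2), ubound(ocean%zeta,2), ocean%zeta)
  end subroutine step

  ! Kernel: explicit-shape dummy array with the bounds passed in.
  subroutine step_tile(LBi, UBi, LBj, UBj, zeta)
    integer, intent(in) :: LBi, UBi, LBj, UBj
    real(r8), intent(inout) :: zeta(LBi:UBi, LBj:UBj)
    zeta = zeta + 1.0_r8
  end subroutine step_tile

end module mod_step
```

In an assumed-shape build the kernel would instead declare `zeta(LBi:,LBj:)`, keeping the lower bounds while taking the extents from the descriptor.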

Slide 15: Why explicit- and assumed-shape?
Assumed-shape required because
  – With some older compilers, there is a drastic (5×) performance drop with explicit-shape declarations. Recall that pointers can be associated with non-contiguous data, in which case a temporary, contiguous copy must be made. Newer compilers apparently detect that pointers in ROMS are associated with contiguous data and avoid the copy. Older ones do not.
  – Some (Compaq) compilers have trouble with the form of the explicit-shape declarations in ROMS
        real(r8) :: tke(LBi:UBi,LBj:UBj,0:N(ng),3)
Explicit-shape retained because
  – On some platforms, the model runs faster with this option (6% faster on Cray T3E)
  – Weaker checking allows some subroutines to accept 2D or 3D data, avoiding duplication of code
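The last point, one routine serving both 2D and 3D data, relies on explicit-shape dummies accepting an actual array of a different rank. A minimal sketch under that assumption is given below, with the hypothetical names `scale_mod` and `scale_block`. It is not taken from ROMS.

```fortran
module scale_mod
  implicit none
contains

  ! The dummy array is declared as n contiguous elements, so the caller
  ! may pass a whole array of any rank as long as it supplies the total size.
  subroutine scale_block(n, x, factor)
    integer, intent(in) :: n
    real, intent(inout) :: x(n)
    real, intent(in) :: factor
    x = factor * x
  end subroutine scale_block

end module scale_mod
```

Given `real :: a2(3,4), a3(2,3,4)`, both `call scale_block(size(a2), a2, 2.0)` and `call scale_block(size(a3), a3, 3.0)` are accepted. An assumed-shape dummy `x(:)` would reject both calls at compile time, which is the stricter checking that forces separate 2D and 3D versions.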

Slide 16: Another tale from the front line
MOM 4 is a substantially revised version of the GFDL MOM model, written in Fortran 90, supporting STATIC and DYNAMIC memory options
Early tests showed a substantial (40%) drop in performance with the DYNAMIC option. This was traced to a performance bug in the SGI test compiler, triggered by the combination of array syntax operations on dynamically allocated arrays within derived types (from message on MOM4 mailing list by Christopher Kerr).

Slide 17: ROMS 2 performance (1)
Tested on two platforms
  – PC, Pentium GHz, Windows 2000, Compaq Visual Fortran 6.6
  – NIWA’s Cray T3E (Kupe), 140 PE, Alpha EV5 600 MHz
The CPU on the PC is 6–8 times as fast as one PE on Kupe. We need to get 20 PEs on the job to make Kupe worthwhile; the maximum reasonable number is usually 60.

Slide 18: ROMS 2 performance (2)
On the PC
  – For a small problem (UPWELLING) execution time in ROMS 2 is 40% less than in ROMS 1.
  – This advantage does not hold for medium-size problems (BENCHMARK1). For these, ROMS 2 speed is similar to ROMS 1.
  – ROMS 2 supports multiple tiles in serial mode. For medium-size problems using a modest number of tiles (20–60) reduces execution time by 25%.
  – Explicit-shape code doesn’t work due to compiler bugs, but if we work around these bugs explicit- and assumed-shape speeds are similar.

Slide 19: ROMS 2 performance (3)
On Kupe
  – Serial runs show same speedup of ROMS 2 relative to ROMS 1 for small problems, same lack of speedup for large problems
  – No advantage using multiple tiles in serial mode
  – Explicit-shape code 6% faster than assumed-shape
  – MPI runs on n processors show an n× speedup as long as tile size is 25 × 25 or more
  – PE performance, scaling & memory size make Kupe an effective tool for model runs up to (say) 400 × 400 × 30

Slide 20: Conclusions
ROMS 2 (Fortran 90) is not significantly slower than ROMS 1 (Fortran 77)
For small problems ROMS 2 is significantly faster than ROMS 1
There are performance pitfalls in Fortran 90, mostly related to inefficient treatment of pointers
Widespread use of array operations is probably not a good idea
Compiler bugs remain a problem
A community model needs to be tested on a variety of platforms to uncover problems