HDF - 1 - Mike Folk, Elena Pourmal, Bob McGrath National Center for Supercomputing Applications University of Illinois at Urbana-Champaign NOBUGS 2004.

1 HDF - 1 - Mike Folk, Elena Pourmal, Bob McGrath National Center for Supercomputing Applications University of Illinois at Urbana-Champaign NOBUGS 2004 HDF-EOS Workshop VIII HDF Software Process Lessons Learned & Success Factors HDF

2 HDF - 2 - Outline What is HDF? and Who is HDF? HDF “Architecture” Some statistics How do we measure success? How can we achieve success? Group practices Summing up – strengths, weaknesses, needs

3 HDF - 3 - What is HDF? Who is HDF?

4 HDF - 4 - HDF in a nutshell – what it is File format and I/O Libraries for storing, managing and archiving large complex scientific and other data Tools and utilities Open source, free for any use (U of I license) Well maintained and supported From HDF group, NCSA Univ of Illinois http://hdf.ncsa.uiuc.edu

5 HDF - 5 - HDF in a nutshell - features General –simple and flexible data model Flexible –store data of diverse origins, sizes, types –supports complex data structures and types Portable –available for many operating systems and machines Scalable –works in high end computing environments –accommodates data of any size or multiplicity Efficient –fast access, including parallel I/O –stores big data efficiently

6 HDF - 6 - HDF in a nutshell - users Apps in industry, academia, government –More than 200 distinct applications Large user base –E.g. NASA estimates 1.6 million users Underlying format for community standards –E.g. HDF-EOS, SAF, CGNS, NPOESS, NeXus

7 HDF - 7 - Example of HDF file: mixing and grouping objects. Diagram labels: Raster image, palette, 3-D array, 2-D array, Raster image, Table, Text; names: a, b, c, z, x, _foo_y, foo; 1GB
Table:
lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6
Text: This file was created as a part of…
see http://hdf.ncsa.uiuc.edu
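The grouping idea on this slide can be sketched as a small tree of groups and datasets addressed by slash-separated paths. This is a minimal illustration in plain Python, not the HDF5 API: the `Group` and `Dataset` classes, the object names below the groups, the shapes, and the way objects are arranged under `a`, `b` and `c` are all assumptions, since the slide's layout is not recoverable from the transcript.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple, Union


@dataclass
class Dataset:
    """A leaf object; 'kind' mirrors the labels on the slide."""
    kind: str                      # e.g. "raster image", "3-D array", "table"
    shape: Tuple[int, ...] = ()    # hypothetical shapes, for illustration only


@dataclass
class Group:
    """A container that can hold datasets and other groups."""
    members: Dict[str, Union["Group", Dataset]] = field(default_factory=dict)

    def add(self, name, node):
        self.members[name] = node
        return node

    def lookup(self, path):
        """Resolve a path such as '/b/c/volume' starting from this group."""
        node = self
        for part in (p for p in path.split("/") if p):
            if not isinstance(node, Group):
                raise KeyError(path)
            node = node.members[part]
        return node


# Hypothetical arrangement of the objects named on the slide.
root = Group()
a = root.add("a", Group())
a.add("image", Dataset("raster image", (256, 256)))
a.add("palette", Dataset("palette", (256, 3)))
b = root.add("b", Group())
b.add("table", Dataset("table", (3,)))       # three lat/lon/temp records
c = b.add("c", Group())
c.add("volume", Dataset("3-D array", (10, 20, 30)))

print(root.lookup("/b/c/volume").kind)
```

Mixing different object kinds under one group, and nesting groups inside groups, is what lets a single file carry an image, its palette, arrays and a table together.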

8 HDF - 8 - HDF “Architecture”

9 HDF - 9 - HDF “Architecture” File or other data source HDF I/O library – High-level, object-specific APIs. – Low-level API for I/O to files, etc. Utilities and applications for managing, manipulating, viewing, & analyzing data. Low level Interface HDF5 Applications Programming Interface Tools & Applications File

10 HDF - 10 - User-controlled I/O and “storage” Data pipeline –Data transformation –Compression –Encryption –Storage layout Virtual file options –Stdio (normal file) –Split file –MPI-IO & other parallel –Network –Memory –Custom HDF I/O Library HDF “File”
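The data pipeline on this slide can be pictured as a chain of reversible stages applied to each block of data on the way to storage and undone in reverse order on the way back. The sketch below is an illustration of that idea only, not the HDF library's interface: the `Stage` type, `write_chunk`, `read_chunk`, and the choice of a byte-shuffle transform followed by zlib compression are assumptions made for the example.

```python
import zlib
from typing import Callable, List, Tuple

# Each stage is a (forward, inverse) pair applied to one chunk of raw bytes.
Stage = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]


def shuffle(data: bytes, width: int = 4) -> bytes:
    """Transformation: store byte 0 of every element, then byte 1, and so on."""
    return b"".join(data[i::width] for i in range(width))


def unshuffle(data: bytes, width: int = 4) -> bytes:
    """Inverse of shuffle; assumes len(data) is a multiple of width."""
    count = len(data) // width
    return bytes(data[(i % width) * count + i // width] for i in range(len(data)))


def write_chunk(raw: bytes, pipeline: List[Stage]) -> bytes:
    """Apply every forward stage in order before the bytes reach storage."""
    for forward, _ in pipeline:
        raw = forward(raw)
    return raw


def read_chunk(stored: bytes, pipeline: List[Stage]) -> bytes:
    """Undo the stages in reverse order to recover the original bytes."""
    for _, inverse in reversed(pipeline):
        stored = inverse(stored)
    return stored


# Transformation first, then compression.
PIPELINE: List[Stage] = [(shuffle, unshuffle), (zlib.compress, zlib.decompress)]

# 1000 little-endian 32-bit integers as sample data.
raw = b"".join(n.to_bytes(4, "little") for n in range(1000))
stored = write_chunk(raw, PIPELINE)
assert read_chunk(stored, PIPELINE) == raw
print(len(raw), "bytes in,", len(stored), "bytes stored")
```

An encryption stage would slot into the same list as another (forward, inverse) pair. The virtual file options are a separate choice about where the resulting bytes go (a normal file, memory, a network, and so on).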

11 HDF - 11 - Supported languages and compilers C Wrappers: –C++ –Fortran90 –Java Vendors’ compilers (SUN, IBM, HP, etc.) PGI and Absoft (Fortran) GNU C (e.g. gcc 3.3.2)

12 HDF - 12 - Supported Machines and OS Solaris 2.7, 2.8 (32/64-bit) IRIX 6.5 IRIX64-6.5 HP-UX 11.00 AIX 5.1 (32/64-bit modes) OSF1 FreeBSD Linux (SuSE, RH8, RH9) including 64-bit Altix (SGI Linux) IA-32 and IA-64 Windows 2000, XP Mac OS X Crays (T3E, SV1, T90IEEE) DOE National Labs machines Linux Clusters

13 HDF - 13 - Architecture in context. Diagram labels: File; Serial, Parallel; Linux RH, IRIX32, XP, SV1; IA32, SGI, Wintel, Cray; Low level Interface; HDF5 Applications Programming Interface; Tools & Applications; C, C++, F90, Java

14 HDF - 14 - Architecture in context. Diagram labels: File; Serial, Parallel; Linux RH, IRIX32, XP, SV1; IA32, SGI, Wintel, Cray; Low level Interface; HDF5 Applications Programming Interface; C, C++, F90, Java; Tools & Applications

15 HDF - 15 - Architecture in context. Diagram labels: File; Serial, Parallel; Linux RH, IRIX32, XP, SV1; IA32, SGI, Wintel, Cray; Low level Interface; HDF5 Applications Programming Interface; C, C++, F90, Java; HDF-EOS, SAF, CGNS; Tools & Applications

16 HDF - 16 - The testing challenge Machines × operating systems × compilers × languages × serial and parallel × compression options × configuration options × virtual file options × backward compatibility = a large number
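To get a feel for how quickly this product grows, the sketch below multiplies out a small, illustrative subset of the dimensions. The value lists are assumptions chosen for the example (drawn loosely from the platforms, languages and options named elsewhere in the talk), and the validity rule is hypothetical; the real matrix is larger and has its own constraints.

```python
from itertools import product
from math import prod

# Illustrative subsets only; the real lists are longer.
DIMENSIONS = {
    "platform": ["Solaris", "IRIX", "HP-UX", "AIX", "FreeBSD",
                 "Linux", "Windows", "Mac OS X"],
    "compiler": ["vendor", "gcc", "PGI"],
    "language": ["C", "C++", "Fortran90", "Java"],
    "mode": ["serial", "parallel"],
    "compression": ["none", "zlib", "szip"],
    "virtual_file": ["stdio", "split", "MPI-IO", "memory"],
}


def matrix_size(dimensions):
    """Number of raw combinations: the product of the dimension sizes."""
    return prod(len(values) for values in dimensions.values())


def is_valid(config):
    """Hypothetical pruning rule: MPI-IO only makes sense in parallel mode."""
    return not (config["virtual_file"] == "MPI-IO" and config["mode"] == "serial")


def configurations(dimensions):
    """Yield every valid combination as a dict of dimension -> value."""
    names = list(dimensions)
    for combo in product(*dimensions.values()):
        config = dict(zip(names, combo))
        if is_valid(config):
            yield config


print(matrix_size(DIMENSIONS))                       # raw combinations
print(sum(1 for _ in configurations(DIMENSIONS)))    # after pruning
```

Even this reduced example gives 2,304 raw combinations (2,016 after the one pruning rule), before configuration options and backward compatibility are added as further factors.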

17 HDF - 17 - “Diversity makes our code better…” Todd Smith, Geospiza

18 HDF - 18 - Some statistics

19 HDF - 19 - HDF Statistics HDF Group –15 FTE + 3-5 students –$2.1 million annual budget HDF5 source code distribution –2073 files –917,186 lines of code HDF Project –HDF5, HDF4, H4toH5, H5Lite, Java –3,000,000 lines of code (estimate)

20 HDF - 20 - HDF5 source distribution by categories (lines of code)

21 HDF - 21 - HDF5 staff investment

22 HDF - 22 - How do we measure success?

23 HDF - 23 - How do we measure success? Mission Goals and objectives Strong and continuing relationships with users High quality software Strong committed development team Great working environment Adequate funding

24 HDF - 24 - Mission, goals and objectives Mission –To develop, promote, deploy, and support open and free technologies that facilitate scientific data exchange, access, analysis, archiving and discovery Goals (examples) –Innovate and evolve the technologies in concert with a changing world of technologies –Maintain a high level of quality and reliability –Collaborate and build communities –Build a team

25 HDF - 25 - Mission, goals and objectives Objectives - how we reach the goal Example: –Goal Maintain a high level of quality and reliability –Objectives Improve testing Implement a program to ensure excellent software engineering practices Develop and execute a plan to meet quality/reliability standards

26 HDF - 26 - Users Number of users Happy users Unhappy users Users achieve their goals by using HDF technologies Users coming back with new needs Financial support from users

27 HDF - 27 - Software Technology that addresses users’ needs and demands (current and future) –E.g. big files, parallel access, multiple objects Usability –Number and types of applications –Appropriate APIs and data models –Available tools –Interoperability with other software E.g. IDL, MATLAB, Mathematica

28 HDF - 28 - Software Stability –Can data be shared? –Can software run on needed platforms? Sustainability –Can we read data written 15 years ago on an obsolete platform? –Will software be available in 15 years? Acceptability –De facto standard Open standard for exchange of remote-sensed data Over 3,000,000,000,000,000 bytes stored in HDF and HDF-EOS

29 HDF - 29 - How can we achieve success?

30 HDF - 30 - How can we achieve success? Maintain strong, responsible, and continuing relationships with users An approach to needs identification, software design, and software implementation based on sound principles of software engineering Effective technical processes for developing, testing, integrating and maintaining software Business and social processes based on sound group management principles

31 HDF - 31 - Stages of software development at HDF Getting started Creating an implementation approach Implementation and maintenance Relations with users and sponsors Group practices

32 HDF - 32 - Getting started Discover a need Identify a sponsor Clarify the need, its role, and its importance Enter task into the project plan –Make initial estimate of time and resources for the task –Give it a priority –Identify task’s lead –Identify a person who will work on the task

33 HDF - 33 - Creating implementation approach Write up a needs/approach RFC (Request For Comment) –Actively solicit feedback from developers/sponsors –Revise until satisfied Write up a design/approach RFC –Get feedback from developers/sponsors –Revise until satisfied Revise project plan according to RFC results Archive RFC

34 HDF - 34 - Implementation and maintenance Identify validation plan (needs improvement) Implement –Library or tool –Tests –Documentation Ask sponsor and friendly users for feedback Review results and repeat appropriate steps above as needed Clean up (documentation, Web, etc.) and announce Support (debug, fix, add more tests, advertise)

35 HDF - 35 - Relations with users and sponsors Who are our sponsors? –Organizations and communities with institutional and financial commitment to HDF NCSA, NASA, DOE ASCI, Boeing, … –Agencies supporting R&D NCSA, NASA, DOE, NSF, … –Collaborators who make in-kind contributions Cactus, PyTables, NeXus, CGNS … –HDF group members

36 HDF - 36 - Relations with users and sponsors Each task is associated with a sponsor Each task has a priority, which should be confirmed with sponsor Each task falls into one of these categories –Research –R&D (research, possibly integrate into product) –Development Technology infusion Library or tools enhancement
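The rules on this slide, together with the project-plan entries from the "Getting started" stage, suggest a simple record per task. The sketch below is one hypothetical way to model it; the `Task` fields, the 1-is-highest priority scale, the `priority_confirmed` flag and the sample task names are all assumptions for illustration, not a description of the group's actual project plan.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Category(Enum):
    """The three task categories listed on the slide."""
    RESEARCH = "research"
    R_AND_D = "R&D"
    DEVELOPMENT = "development"


@dataclass
class Task:
    name: str
    sponsor: str                  # every task is associated with a sponsor
    priority: int                 # assumed scale: 1 = highest
    category: Category
    lead: str
    assignee: str
    estimate_weeks: float         # initial estimate of time
    priority_confirmed: bool = False  # has the sponsor confirmed the priority?


def needs_confirmation(tasks: List[Task]) -> List[Task]:
    """Tasks whose priority the sponsor has not confirmed, most urgent first."""
    pending = [t for t in tasks if not t.priority_confirmed]
    return sorted(pending, key=lambda t: t.priority)


# Hypothetical plan entries.
tasks = [
    Task("parallel I/O tuning", "DOE ASCI", 2, Category.DEVELOPMENT,
         "lead A", "developer B", 6.0),
    Task("new compression filter", "NASA", 1, Category.R_AND_D,
         "lead C", "developer D", 3.0, priority_confirmed=True),
    Task("format study", "NSF", 3, Category.RESEARCH,
         "lead A", "developer E", 4.0),
]

for task in needs_confirmation(tasks):
    print(task.priority, task.sponsor, task.name)
```

Keeping sponsor, priority and category on the same record makes it easy to list, per sponsor, which priorities still need to be confirmed.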

37 HDF - 37 - Group practices

38 HDF - 38 - Group practices - technical Source code management: CVS Bug tracking: Bugzilla –Bugs entered by support staff and developers –Prioritized by staff –Easy bugs fixed “on the fly”

39 HDF - 39 - Group practices - technical The testing challenge Code testing –Testing before code check-in –Regression testing –Remote testing –Different configurations testing –Backward compatibility testing

41 HDF - 41 - Daily test report
From: HDF group system admin
To: hdf5lib@ncsa.uiuc.edu
Subject: HDF5_Daily_Tests_FAILED!!!
*** HDF5 Tests on 041022 ***
============================= Watchers List =============================
HDF5 Daily test features/platforms watchers and procedure
---------------------------------------------------------
Procedure: The watcher will investigate and report the cause of failure by 11am. The developer who checked in the error code may report so by then too. The watcher or the developer should get the failure fixed and report it by 3pm.
Platforms watchers:
  AIX 5.1 (copper)            Albert
  FreeBSD                     Quincey
  HP-UX                       Elena
  IA32 (tungsten)             Raymond
  IA64 (tg-login)             Albert
  IRIX64-6.5 32,64-bit        Raymond
  IRIX 6.5                    Raymond
  Linux 2.4                   Peter
  Solaris 2.7&8 32,64-bit     Elena
  Windows                     Kent
Features watchers:
  General Library             Quincey
  General parallel            Albert
  configuration               Quincey, James
  mpich                       Raymond
  Fortran                     Elena
  Intel compilers             Elena + Kent (for windows)
  PGI compilers               Elena
  C++                         Binh-Minh
  Thread-safety               Quincey
  Tools                       Padro
--- updated: 2004/10/01
============================= Tests Summary =============================
****FAILED eirene: setenv CC icc setenv F9X ifc setenv CXX icc --enable-fortran --enable-cxx****
PASSED arabica: setenv CC /afs/ncsa/projects/hdf/packages/mpich_1.2.4/SunOS64_5.7/bin/mpicc setenv F9X /afs/ncsa/projects/hdf/packages/mpich_1.2.4/SunOS64_5.7/bin/mpif90 setenv ALL_LOCAL 1 --enable-fortran standard
PASSED arabica: setenv CC mpicc setenv ALL_LOCAL 1 standard
PASSED arabica: setenvN 2 CC cc -xarch=v9 setenvN 2 F9X f90 -xarch=v9 setenvN 2 CXX CC -xarch=v9 standard --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/SunOS_5.7-64bit
PASSED arabica: standard --enable-cxx --enable-fortran --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/SunOS_5.7
PASSED Cu12: --enable-parallel
PASSED Cu12: --enable-parallel setenv CFLAGS -q64 setenv FFLAGS -q64 setenvN 3 AR ar -X 64 --enable-fortran --with-zlib=/afs/ncsa/projects/hdf/packages/zlib/AIX5.1-64bit --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/AIX5.1-64bit
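A watcher could pick the failures out of the Tests Summary section of a report like the one above with a few lines of scripting. The sketch below is a hypothetical helper, not part of the HDF test system: it assumes one result per line in the shape `PASSED host: configuration`, with failed entries wrapped in `****` as in the sample, and the names `Result`, `parse_summary` and `failed_hosts` are invented for the example.

```python
import re
from typing import List, NamedTuple


class Result(NamedTuple):
    status: str   # "PASSED" or "FAILED"
    host: str     # test machine name, e.g. "eirene"
    config: str   # environment settings and configure flags, kept verbatim


# Optional leading/trailing asterisks, a status word, "host:", then the config.
LINE = re.compile(r"^\**(PASSED|FAILED)\s+(\S+):\s*(.*?)\**$")


def parse_summary(text: str) -> List[Result]:
    """Return one Result per summary line; lines in other shapes are skipped."""
    results = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            results.append(Result(*match.groups()))
    return results


def failed_hosts(results: List[Result]) -> List[str]:
    """Hosts with at least one failed configuration, in report order."""
    hosts = []
    for result in results:
        if result.status == "FAILED" and result.host not in hosts:
            hosts.append(result.host)
    return hosts


SAMPLE = """\
****FAILED eirene: setenv CC icc setenv F9X ifc setenv CXX icc --enable-fortran --enable-cxx****
PASSED arabica: setenv CC mpicc setenv ALL_LOCAL 1 standard
PASSED Cu12: --enable-parallel
"""

results = parse_summary(SAMPLE)
print(failed_hosts(results))
```

The watcher procedure in the report (cause reported by 11am, fix reported by 3pm) then applies to whichever platforms or features the failed hosts correspond to.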

42 HDF - 42 - Group practices - technical Release levels –Development release –Official release –Past releases

43 HDF - 43 - Group practices - technical Coding standards Maintaining platform-independence Maintaining time-independence Rules for changing APIs Documentation Rapid prototyping

44 HDF - 44 - Group practices – business and social Staff breakdown –User support –Documentation –QA –Software development –Testing –Team leadership –System administration HDF Project teams: Basic library development; Support, doc, QA, maintenance; Tools and Java; Parallel I/O, Grid, big machines Team lead for each team Most staff in two or more teams Staff relationships –Complement each other –Overlap each other –Keep each other honest

45 HDF - 45 - Group practices – business and social Accountability of everyone to the whole process Help desk Approaches to carrying out tasks –Paying attention to technical proposals –Weekly HDF5 developers’ meetings –HDF seminars Management and administration –Performance reviews with emphasis on goals, development –Critical to success –That’s another talk

46 HDF - 46 - Summing up Strengths, weaknesses, needs

47 HDF - 47 - Strengths User support Staff –High quality, diverse staff with good morale –Staff commitment and enthusiasm Ability to address all aspects of product development –Emphasis on quality control –Fast bug fixing and frequent releases –Ability to focus on a single product over a long term High level of support from sponsors Project’s visibility through NCSA, NASA, DOE, users

48 HDF - 48 - Weaknesses Software development team –Library expertise still concentrated among too few developers –Team communication is challenging Processes –Release/maintenance take too much time and resources –Configuration and porting are a huge time sink –We don’t do enough prototyping –Hard to keep up with new technologies –Parallel I/O hard to support

49 HDF - 49 - More weaknesses & challenges Usability –Software too hard to use for casual users –Insufficient documentation –Insufficient tools for high level users –Insufficient interoperability with common tools and formats Marketing –Marketing effort is inadequate –Need to connect better with users and potential users Viable long-term support

50 HDF - 50 - Most immediate needs Configuration and build Testing and prototyping Marketing Reporting –Performance reports –General reports to users –HDF book Sustainable business model

51 HDF - 51 - Thank you

