HDF - 1 - Mike Folk, Elena Pourmal, Bob McGrath National Center for Supercomputing Applications University of Illinois at Urbana-Champaign NOBUGS 2004.

Presentation transcript:

HDF Software Process: Lessons Learned & Success Factors
Mike Folk, Elena Pourmal, Bob McGrath
National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
NOBUGS 2004, HDF-EOS Workshop VIII

Outline
– What is HDF? Who is HDF?
– HDF “Architecture”
– Some statistics
– How do we measure success?
– How can we achieve success?
– Group practices
– Summing up: strengths, weaknesses, needs

What is HDF? Who is HDF?

HDF in a nutshell: what it is
– File format and I/O libraries for storing, managing and archiving large, complex scientific and other data
– Tools and utilities
– Open source, free for any use (U of I license)
– Well maintained and supported
– From the HDF group, NCSA, University of Illinois

HDF in a nutshell: features
– General
  – Simple and flexible data model
– Flexible
  – Stores data of diverse origins, sizes, types
  – Supports complex data structures and types
– Portable
  – Available for many operating systems and machines
– Scalable
  – Works in high-end computing environments
  – Accommodates data of any size or multiplicity
– Efficient
  – Fast access, including parallel I/O
  – Stores big data efficiently

HDF in a nutshell: users
– Apps in industry, academia, government
  – More than 200 distinct applications
– Large user base
  – E.g. NASA estimates 1.6 million users
– Underlying format for community standards
  – E.g. HDF-EOS, SAF, CGNS, NPOESS, NeXus

Example of HDF file: mixing and grouping objects
[Figure: a single HDF file that groups objects of different kinds: raster images, a palette, a 3-D array, a 2-D array, a table with columns lat, lon and temp, and a text annotation beginning “This file was created as a part of…”]

HDF “Architecture”

HDF “Architecture”
– Tools & Applications: utilities and applications for managing, manipulating, viewing, & analyzing data
– HDF I/O library
  – HDF5 Applications Programming Interface: high-level, object-specific APIs
  – Low level Interface: low-level API for I/O to files, etc.
– File: file or other data source

User-controlled I/O and “storage”
– Data pipeline
  – Data transformation
  – Compression
  – Encryption
  – Storage layout
– Virtual file options
  – Stdio (normal file)
  – Split file
  – MPI-IO & other parallel
  – Network
  – Memory
  – Custom
[Figure labels: HDF I/O Library, HDF “File”]
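
The data pipeline listed above can be illustrated with a minimal Python sketch. This is not the HDF5 filter API; it only shows the general idea that each stage applied on write has an inverse applied in the opposite order on read. The names `PIPELINE`, `write_chunk`, `read_chunk` and `xor_bytes` are hypothetical, and the XOR step is a toy stand-in for encryption, not a secure cipher.

```python
import zlib


def xor_bytes(data: bytes, key: int = 0x5A) -> bytes:
    """Toy reversible transform standing in for an encryption stage (not secure)."""
    return bytes(b ^ key for b in data)


# Each stage is a (apply_on_write, undo_on_read) pair. Hypothetical layout.
PIPELINE = [
    (zlib.compress, zlib.decompress),  # compression stage
    (xor_bytes, xor_bytes),            # "encryption" stage (XOR is its own inverse)
]


def write_chunk(raw: bytes) -> bytes:
    """Run a chunk through every stage in order before it is stored."""
    for forward, _ in PIPELINE:
        raw = forward(raw)
    return raw


def read_chunk(stored: bytes) -> bytes:
    """Undo the stages in reverse order to recover the original chunk."""
    for _, inverse in reversed(PIPELINE):
        stored = inverse(stored)
    return stored


if __name__ == "__main__":
    chunk = b"temperature" * 100
    stored = write_chunk(chunk)
    print(len(chunk), len(stored), read_chunk(stored) == chunk)
```

Keeping the stages in a list means a user could add, remove or reorder them without touching the read and write functions, which is the kind of control the slide describes.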

Supported languages and compilers
– C
– Wrappers:
  – C++
  – Fortran90
  – Java
– Vendors’ compilers (SUN, IBM, HP, etc.)
– PGI and Absoft (Fortran)
– GNU C (e.g. gcc 3.3.2)

Supported machines and OS
– Solaris 2.7, 2.8 (32/64-bit)
– IRIX 6.5
– IRIX
– HPUX
– AIX 5.1 (32/64-bit modes)
– OSF1
– FreeBSD
– Linux (SuSe, RH8, RH9), including 64-bit
– Altix (SGI Linux)
– IA-32 and IA-64
– Windows 2000, XP
– Mac OS X
– Crays (T3E, SV1, T90IEEE)
– DOE National Labs machines
– Linux clusters

Architecture in context
[Figure: the HDF layers shown with their surroundings. Labels: Tools & Applications; HDF-EOS, SAF, CGNS; C, C++, F90, Java; HDF5 Applications Programming Interface; Low level Interface; File (serial, parallel); platform labels Linux RH, IRIX32, XP, SV1, IA32, SGI, Wintel, Cray.]

The testing challenge
Machines × operating systems × compilers × languages × serial and parallel × compression options × configuration options × virtual file options × backward compatibility = a large number
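
The multiplication on this slide can be made concrete with a short Python sketch. The option lists below are hypothetical placeholders chosen for illustration; they are not the HDF group's actual test matrix, which covers more dimensions and more values per dimension.

```python
from itertools import product
from math import prod

# Hypothetical option lists; a real matrix would be considerably larger.
DIMENSIONS = {
    "platform": ["linux-ia32", "solaris-64", "aix-5.1"],
    "compiler": ["gcc", "vendor-cc"],
    "language": ["C", "C++", "Fortran90", "Java"],
    "io_mode": ["serial", "parallel"],
    "compression": ["none", "zlib", "szip"],
}


def matrix_size(dims):
    """Number of distinct configurations: the product of the option counts."""
    return prod(len(options) for options in dims.values())


def configurations(dims):
    """Yield every configuration as a dict mapping dimension name to one option."""
    names = list(dims)
    for combo in product(*(dims[name] for name in names)):
        yield dict(zip(names, combo))


if __name__ == "__main__":
    print(matrix_size(DIMENSIONS))
    print(next(configurations(DIMENSIONS)))
```

Even with only five small dimensions the count grows multiplicatively, which is why exhaustive testing of every combination quickly becomes impractical and a chosen subset has to be tested instead.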

“Diversity makes our code better…” Todd Smith, Geospiza

Some statistics

HDF statistics
– HDF Group
  – 15 FTE students
  – $2.1 million annual budget
– HDF5 source code distribution
  – 2073 files
  – 917,186 lines of code
– HDF Project
  – HDF5, HDF4, H4toH5, H5Lite, Java
  – 3,000,000 lines of code (estimate)

[Figure: HDF5 source distribution by categories (lines of code)]

[Figure: HDF5 staff investment]

How do we measure success?

How do we measure success?
– Mission, goals and objectives
– Strong and continuing relationships with users
– High quality software
– Strong, committed development team
– Great working environment
– Adequate funding

Mission, goals and objectives
– Mission
  – To develop, promote, deploy, and support open and free technologies that facilitate scientific data exchange, access, analysis, archiving and discovery
– Goals (examples)
  – Innovate and evolve the technologies in concert with a changing world of technologies
  – Maintain a high level of quality and reliability
  – Collaborate and build communities
  – Build a team

Mission, goals and objectives
– Objectives: how we reach the goal
– Example:
  – Goal
    – Maintain a high level of quality and reliability
  – Objectives
    – Improve testing
    – Implement a program to ensure excellent software engineering practices
    – Develop and execute a plan to meet quality/reliability standards

Users
– Number of users
– Happy users
– Unhappy users
– Users achieve their goals by using HDF technologies
– Users coming back with new needs
– Financial support from users

Software
– Technology that addresses users’ needs and demands (current and future)
  – E.g. big files, parallel access, multiple objects
– Usability
  – Number and types of applications
  – Appropriate APIs and data models
  – Available tools
  – Interoperability with other software, e.g. IDL, MATLAB, Mathematica

Software
– Stability
  – Can data be shared?
  – Can the software run on the needed platforms?
– Sustainability
  – Can we read data written 15 years ago on an obsolete platform?
  – Will the software be available in 15 years?
– Acceptability
  – De facto standard
  – Open standard for exchange of remote-sensed data
  – Over 3,000,000,000,000,000 bytes (about 3 petabytes) stored in HDF and HDF-EOS

How can we achieve success?

How can we achieve success?
– Maintain strong, responsible, and continuing relationships with users
– An approach to needs identification, software design, and software implementation based on sound principles of software engineering
– Effective technical processes for developing, testing, integrating and maintaining software
– Business and social processes based on sound group management principles

Stages of software development at HDF
– Getting started
– Creating an implementation approach
– Implementation and maintenance
– Relations with users and sponsors
– Group practices

Getting started
– Discover a need
– Identify a sponsor
– Clarify the need, its role, and its importance
– Enter task into the project plan
  – Make initial estimate of time and resources for the task
  – Give it a priority
  – Identify task’s lead
  – Identify a person who will work on the task

Creating an implementation approach
– Write up a needs/approach RFC (Request For Comment)
  – Actively solicit feedback from developers/sponsors
  – Revise until satisfied
– Write up a design/approach RFC
  – Get feedback from developers/sponsors
  – Revise until satisfied
– Revise project plan according to RFC results
– Archive RFC

Implementation and maintenance
– Identify validation plan (needs improvement)
– Implement
  – Library or tool
  – Tests
  – Documentation
– Ask sponsor and friendly users for feedback
– Review results and repeat appropriate steps above as needed
– Clean up (documentation, Web, etc.) and announce
– Support (debug, fix, add more tests, advertise)

Relations with users and sponsors
– Who are our sponsors?
  – Organizations and communities with institutional and financial commitment to HDF
    – NCSA, NASA, DOE ASCI, Boeing, …
  – Agencies supporting R&D
    – NCSA, NASA, DOE, NSF, …
  – Collaborators who make in-kind contributions
    – Cactus, PyTables, NeXus, CGNS, …
  – HDF group members

Relations with users and sponsors
– Each task is associated with a sponsor
– Each task has a priority, which should be confirmed with the sponsor
– Each task falls into one of these categories
  – Research
  – R&D (research, possibly integrate into product)
  – Development
    – Technology infusion
    – Library or tools enhancement
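
The task attributes named on the “Getting started” and “Relations with users and sponsors” slides (sponsor, priority, category, lead, assignee, initial estimate) could be recorded along the following lines. This is a hypothetical Python sketch, not the HDF group's actual project plan format; the field names, the priority scale and the `unconfirmed` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The three task categories listed on the slide."""
    RESEARCH = "research"
    R_AND_D = "r&d"
    DEVELOPMENT = "development"


@dataclass
class Task:
    name: str
    sponsor: str                      # every task is associated with a sponsor
    priority: int                     # assumed scale: 1 is the most urgent
    category: Category
    lead: str
    assignee: str
    estimate_weeks: float             # initial estimate of time for the task
    priority_confirmed: bool = False  # has the sponsor confirmed the priority?


def unconfirmed(tasks):
    """Tasks whose priority still needs sponsor confirmation, most urgent first."""
    pending = (t for t in tasks if not t.priority_confirmed)
    return sorted(pending, key=lambda t: t.priority)


if __name__ == "__main__":
    plan = [
        Task("task-a", "sponsor-x", 2, Category.DEVELOPMENT, "lead-1", "dev-1", 6.0),
        Task("task-b", "sponsor-y", 1, Category.R_AND_D, "lead-2", "dev-2", 12.0,
             priority_confirmed=True),
        Task("task-c", "sponsor-x", 1, Category.RESEARCH, "lead-1", "dev-3", 3.0),
    ]
    for task in unconfirmed(plan):
        print(task.name, task.sponsor, task.priority)
```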

Group practices

Group practices: technical
– Source code management: CVS
– Bug tracking: Bugzilla
  – Bugs entered by support staff and developers
  – Prioritized by staff
  – Easy bugs fixed “on the fly”

Group practices: technical
– The testing challenge
– Code testing
  – Testing before code check-in
  – Regression testing
  – Remote testing
  – Different configurations testing
  – Backward compatibility testing

Daily test report
From: HDF group system admin
To:
Subject: HDF5_Daily_Tests_FAILED!!!
*** HDF5 Tests on ***
============================= Watchers List =============================
HDF5 Daily test features/platforms watchers and procedure
Procedure:
The watcher will investigate and report the cause of failure by 11am.
The developer who checked in the error code may report so by then too.
The watcher or the developer should get the failure fixed and report it by 3pm.
Platforms watchers:
  AIX 5.1 (copper): Albert
  FreeBSD: Quincey
  HP-UX: Elena
  IA32 (tungsten): Raymond
  IA64 (tg-login): Albert
  IRIX ,64-bit: Raymond
  IRIX 6.5: Raymond
  Linux 2.4: Peter
  Solaris 2.7&8 32,64-bit: Elena
  Windows: Kent
Features watchers:
  General Library: Quincey
  General parallel: Albert
  configuration: Quincey, James
  mpich: Raymond
  Fortran: Elena
  Intel compilers: Elena + Kent (for windows)
  PGI compilers: Elena
  C++: Binh-Minh
  Thread-safety: Quincey
  Tools: Padro
--- updated: 2004/10/01
============================= Tests Summary =============================
****FAILED eirene: setenv CC icc setenv F9X ifc setenv CXX icc --enable-fortran --enable-cxx****
PASSED arabica: setenv CC /afs/ncsa/projects/hdf/packages/mpich_1.2.4/SunOS64_5.7/bin/mpicc setenv F9X /afs/ncsa/projects/hdf/packages/mpich_1.2.4/SunOS64_5.7/bin/mpif90 setenv ALL_LOCAL 1 --enable-fortran standard
PASSED arabica: setenv CC mpicc setenv ALL_LOCAL 1 standard
PASSED arabica: setenvN 2 CC cc -xarch=v9 setenvN 2 F9X f90 -xarch=v9 setenvN 2 CXX CC -xarch=v9 standard --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/SunOS_5.7-64bit
PASSED arabica: standard --enable-cxx --enable-fortran --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/SunOS_5.7
PASSED Cu12: --enable-parallel
PASSED Cu12: --enable-parallel setenv CFLAGS -q64 setenv FFLAGS -q64 setenvN 3 AR ar -X 64 --enable-fortran --with-zlib=/afs/ncsa/projects/hdf/packages/zlib/AIX5.1-64bit --with-szlib=/afs/ncsa/projects/hdf/packages/szip_new/AIX5.1-64bit
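
A report like this one lends itself to automatic summarizing, so that watchers can see at a glance which hosts failed. The Python sketch below is an illustration only, not the HDF group's tooling. It assumes each result in the Tests Summary sits on its own line in the form `STATUS host: configuration`, optionally wrapped in asterisks as the failed entry is; the function name `summarize` and the shortened sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape: optional asterisks, PASSED or FAILED, a host name,
# a colon, then the configuration string, then optional trailing asterisks.
RESULT = re.compile(r"^\**\s*(PASSED|FAILED)\s+(\S+):\s*(.*?)\**$")


def summarize(report: str):
    """Group (host, configuration) pairs by test status."""
    results = defaultdict(list)
    for line in report.splitlines():
        match = RESULT.match(line.strip())
        if match:
            status, host, config = match.groups()
            results[status].append((host, config.strip()))
    return dict(results)


if __name__ == "__main__":
    # Shortened sample lines modeled on the report above.
    sample = "\n".join([
        "****FAILED eirene: setenv CC icc --enable-fortran --enable-cxx****",
        "PASSED arabica: setenv CC mpicc setenv ALL_LOCAL 1 standard",
        "PASSED Cu12: --enable-parallel",
    ])
    summary = summarize(sample)
    for host, config in summary.get("FAILED", []):
        print("needs a watcher:", host, "|", config)
```

Lines that do not match the assumed shape, such as the watcher tables and the separator rules, are simply skipped.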

Group practices: technical
– Release levels
  – Development release
  – Official release
  – Past releases

Group practices: technical
– Coding standards
– Maintaining platform-independence
– Maintaining time-independence
– Rules for changing APIs
– Documentation
– Rapid prototyping

Group practices: business and social
– Staff breakdown
  – User support
  – Documentation
  – QA
  – Software development
  – Testing
  – Team leadership
  – System administration
– HDF Project teams
  – Basic library development
  – Support, doc, QA, maintenance
  – Tools and Java
  – Parallel I/O, Grid, big machines
– Team lead for each team
– Most staff in two or more teams
– Staff relationships
  – Complement each other
  – Overlap each other
  – Keep each other honest

Group practices: business and social
– Accountability of everyone to the whole process
– Help desk
– Approaches to carrying out tasks
  – Paying attention to technical proposals
  – Weekly HDF5 developers’ meetings
  – HDF seminars
– Management and administration
  – Performance reviews with emphasis on goals, development
  – Critical to success
  – That’s another talk

Summing up: strengths, weaknesses, needs

Strengths
– User support
– Staff
  – High quality, diverse staff with good morale
  – Staff commitment and enthusiasm
– Ability to address all aspects of product development
  – Emphasis on quality control
  – Fast bug fixing and frequent releases
  – Ability to focus on a single product over a long term
– High level of support from sponsors
– Project’s visibility through NCSA, NASA, DOE, users

Weaknesses
– Software development team
  – Library expertise still concentrated among too few developers
  – Team communication is challenging
– Processes
  – Release/maintenance take too much time and resources
  – Configuration and porting are a huge time sink
  – We don’t do enough prototyping
  – Hard to keep up with new technologies
  – Parallel I/O hard to support

More weaknesses & challenges
– Usability
  – Software too hard to use for casual users
  – Insufficient documentation
  – Insufficient tools for high level users
  – Insufficient interoperability with common tools and formats
– Marketing
  – Marketing effort is inadequate
  – Need to connect better with users and potential users
– Viable long-term support

Most immediate needs
– Configuration and build
– Testing and prototyping
– Marketing
– Reporting
  – Performance reports
  – General reports to users
  – HDF book
– Sustainable business model

Thank you