HDF5 at a Glance: Quick overview of known topics. HDF5 Workshop at PSI, May 30-31, 2012.

Presentation transcript:

HDF5 at a Glance
Quick overview of known topics

Outline
  Overview of HDF5
  Topics not covered by PSI HDF5 Tutorial
  Groups and Links

HDF5 File
An HDF5 file is a container that holds data objects.
[Figure: a file holding a table with columns lat, lon, temp and a text object "Experiment Notes: Serial Number: Date: 3/13/09 Configuration: Standard 3"]

HDF5 File
HDF5 groups and links organize data objects.
Every HDF5 file has a root group.
Groups are similar to UNIX directories.
[Figure: root group "/" with groups SimOut and Viz; objects shown include the lat/lon/temp table, Parameters 10;100;1000, and Timestep 36,000]

HDF5 Software Layers & Storage
Tools: h5dump tool, h5repack tool, HDFview tool, …
API: High Level APIs; Language Interfaces (C, Fortran, C++); Java Interface
HDF5 Library
  HDF5 Data Model Objects: Groups, Datasets, Attributes, …
  Tunable Properties: Chunk Size, I/O Driver, …
  Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on…
  Virtual File Layer: Posix I/O, Split Files, MPI I/O, Custom
Storage
  HDF5 File Format: File, Split Files, File on Parallel Filesystem, Other I/O Drivers

GROUPS AND LINKS

Groups and Links
  Groups are containers for links (graph edges)
  Links were added in
  Warning: Many APIs in the H5G interface are obsolete; use the H5L interfaces to discover and manipulate file structure

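Example h5_links.py
Different kinds of links
[Figure: file links.h5 with root group "/", groups A and B, a dataset "a", a soft link "soft", a dangling link "dangling", and an external link "External" to a dataset in the file dset.h5]
  Dataset "a" can be “reached” using three paths: /A/a, /a, /soft
  External link: the dataset is in a different file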
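The original h5_links.py is not reproduced in this transcript, so the following is only a minimal h5py sketch of the layout the slide describes. The file names links.h5 and dset.h5 and the link names come from the slide; the scratch directory, the dataset contents, the target of the dangling link, and the placement of "External" in the root group are assumptions made for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

workdir = tempfile.mkdtemp()  # assumption: scratch directory for both files
links_path = os.path.join(workdir, "links.h5")
dset_path = os.path.join(workdir, "dset.h5")

# Target file for the external link; the contents are made up.
with h5py.File(dset_path, "w") as target:
    target.create_dataset("dset", data=np.arange(4))

with h5py.File(links_path, "w") as f:
    group_a = f.create_group("A")   # hard link "A" is created with the group
    f.create_group("B")
    group_a.create_dataset("a", data=np.arange(3))    # hard link /A/a
    f["a"] = group_a["a"]                             # second hard link /a
    f["soft"] = h5py.SoftLink("/A/a")                 # alias stored as a path
    f["dangling"] = h5py.SoftLink("/no/such/object")  # hypothetical target
    # Assumption: the external link is placed in the root group.
    f["External"] = h5py.ExternalLink("dset.h5", "dset")

with h5py.File(links_path, "r") as f:
    # All three paths resolve to the same dataset object.
    same_object = f["/A/a"] == f["/a"] == f["/soft"]
    # getlink=True returns the link itself instead of following it.
    link_classes = {
        name: type(f.get(name, getlink=True)).__name__
        for name in ("a", "soft", "dangling")
    }
    external_values = f["External"][...].tolist()
```

Opening f["dangling"] would raise KeyError, because the soft link's target does not exist; only the link itself can be inspected.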

Links
  Name
    Example: “A”, “B”, “a”, “dangling”, “soft”
    Unique within a group; “/” is not allowed in names
  Type
    Hard Link
      Value is the object’s address in a file
      Created automatically when an object is created
      Can be added to point to an existing object
    Soft Link
      Value is a string, for example, “/A/a”, but can be anything
      Use to create aliases

Links (cont.)
  Type
    External Link
      Value is a pair of strings, for example, (“dset.h5”, “dset”)
      Use to access data in other HDF5 files
      HDF introduced caching of files opened via external links
        H5Pset_elink_file_cache_size

Links Properties
  ASCII or UTF-8 encoding for names
  Create intermediate groups
    Saves programming effort
    C example

      lcpl_id = H5Pcreate(H5P_LINK_CREATE);
      H5Pset_create_intermediate_group(lcpl_id, 1);
      H5Gcreate(fid, "A/B", lcpl_id, H5P_DEFAULT, H5P_DEFAULT);

    Group “A” will be created if it doesn’t exist
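For comparison, a short h5py sketch of the same idea. To my knowledge h5py's high-level interface turns on intermediate-group creation by default, so no property list has to be set up by hand; the file name below is hypothetical.

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "intermediate.h5")  # hypothetical file

with h5py.File(path, "w") as f:
    # "A" does not exist yet; it is created on the way to "A/B".
    f.create_group("A/B")
    created = (sorted(f.keys()), sorted(f["A"].keys()))
    a_is_group = isinstance(f["A"], h5py.Group)
```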

Operations on Links
See H5L interface in Reference Manual
  Create
  Delete
  Copy
  Iterate
  Check if exists
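The operations listed above can be sketched with h5py's dictionary-style interface. This is an illustration, not the H5L C API itself: in particular, the "copy" step re-creates a soft link from its stored value rather than calling H5Lcopy, and the file and link names are hypothetical.

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "link_ops.h5")  # hypothetical file

with h5py.File(path, "w") as f:
    f.create_dataset("A/a", data=[1, 2, 3])

    # Create: a second hard link and a soft link to the same dataset.
    f["a"] = f["A/a"]
    f["soft"] = h5py.SoftLink("/A/a")

    # Copy (stand-in): read the soft link's value, store it under a new name.
    f["soft_copy"] = f.get("soft", getlink=True)

    # Check if exists.
    exists_before = "a" in f

    # Delete: removes the link /a; the dataset stays reachable via /A/a.
    del f["a"]
    exists_after = "a" in f
    still_readable = f["A/a"][...].tolist()

    # Iterate: names of the links stored directly in the root group.
    root_links = sorted(f)
    copied_target = f.get("soft_copy", getlink=True).path
```

Deleting a hard link only removes that name; the object's storage is released when no hard link refers to it any longer.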

Groups Properties
Creation properties
  Type of links storage
    Compact (in 1.8.* versions)
      Used with a few members (default under 8)
    Dense (default behavior)
      Used with many (>16) members (default)
  Tunable size for a local heap
    Save space by providing an estimate of the storage required for link names
  Can be compressed (in and later)
    Many links with similar names (XXX-abc, XXX-d, XXX-efgh, etc.)
    Requires more time to compress/uncompress data

Groups Properties
Creation properties
  Links may have creation order tracked and indexed
    Indexing by name (default)
      A, B, a, dangling, soft
    Indexing by creation order (has to be enabled)
      A, B, a, soft, dangling
ples-by-api/api18-c.html
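A small h5py sketch of the two index orders, using the link names from the slide. It assumes an h5py release that supports the track_order argument of create_group (2.9 or later, as far as I know); the file name, group names, and soft-link targets are hypothetical.

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "order.h5")  # hypothetical file
creation_order = ["A", "B", "a", "soft", "dangling"]

with h5py.File(path, "w") as f:
    by_name_group = f.create_group("by_name")  # default: name index only
    tracked_group = f.create_group("tracked", track_order=True)
    for group in (by_name_group, tracked_group):
        for name in creation_order:
            # Soft links keep the sketch small; the targets need not exist.
            group[name] = h5py.SoftLink("/target/" + name)
    by_name = list(by_name_group)      # iterated by name
    by_creation = list(tracked_group)  # iterated by creation order
```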

Discovering HDF5 file’s structure
HDF5 provides C and Fortran 2003 APIs for recursive and non-recursive iterations over the groups and attributes
  H5Ovisit and H5Literate (H5Giterate)
  H5Aiterate
Life is much easier with h5py (h5_visita.py)

    import h5py

    def print_info(name, obj):
        # Print the object's path, then each of its attributes.
        print(name)
        for attr_name, value in obj.attrs.items():
            print(attr_name + ":", value)

    f = h5py.File('GATMO-SATMS-npp.h5', 'r+')
    f.visititems(print_info)
    f.close()
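To make the recursive versus non-recursive distinction concrete, here is a self-contained sketch that builds a tiny file and walks it both ways. The file layout, attribute names, and the helper collect are invented for illustration; keys() plays the role of a non-recursive iteration over one group and visititems the role of a recursive visit.

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "structure.h5")  # hypothetical file

with h5py.File(path, "w") as f:
    dset = f.create_dataset("A/B/d", data=[1.0, 2.0])
    dset.attrs["units"] = "K"
    f["A"].attrs["note"] = "demo"


def collect(h5file):
    """Return (top-level names, all visited paths, attributes by path)."""
    top_level = sorted(h5file.keys())  # non-recursive: root group only
    visited = []
    attributes = {}

    def record(name, obj):
        visited.append(name)
        for key, value in obj.attrs.items():
            attributes[name + ":" + key] = value

    h5file.visititems(record)          # recursive: whole hierarchy
    return top_level, visited, attributes


with h5py.File(path, "r") as f:
    top_level, visited, attributes = collect(f)
```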

Checking a path in HDF5
HDF5 provides HL C and Fortran 2003 APIs for checking if a path exists
  H5LTpath_valid (h5ltpath_valid_f)
  Example: Is there an object with a path /A/B/C/d ?
  TRUE if there is a path, FALSE otherwise
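In h5py the closest everyday analogue is the membership test on a file or group, which is what this sketch uses; it is not a wrapper around the high-level C call named above. The file name and the missing paths are hypothetical.

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "paths.h5")  # hypothetical file

with h5py.File(path, "w") as f:
    f.create_dataset("A/B/C/d", data=[0])

with h5py.File(path, "r") as f:
    has_d = "/A/B/C/d" in f             # every component exists
    has_x = "/A/B/C/x" in f             # last component is missing
    has_broken_middle = "/A/X/C/d" in f # an intermediate group is missing
```

The test returns False rather than raising when an intermediate group is absent, so no step-by-step checking is needed in user code.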

Hints
Use latest file format (see H5Pset_libver_bounds function in RM)
  Save space when creating a lot of groups in a file
  Save time when accessing many objects (>1000)
  Caution: Tools built with the HDF5 versions prior to will not work on the files created with this property
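In h5py the library version bounds are chosen with the libver argument when the file is opened. A minimal sketch, with a hypothetical file name and group count:

```python
import os
import tempfile

import h5py

path = os.path.join(tempfile.mkdtemp(), "latest_format.h5")  # hypothetical file

# libver="latest" asks the library to use the newest file format features
# it supports, including the newer group storage.
with h5py.File(path, "w", libver="latest") as f:
    for i in range(100):
        f.create_group("group_%03d" % i)
    n_groups = len(f)
```

The caution on the slide applies here too: files written this way may not be readable by tools built against older library versions.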

Informal Benchmark
  Create a file and a group in a file
  Create up to 10^6 groups with one dataset in each group
  Compare file sizes and performance of HDF using the latest group format with the performance of HDF (default, old format) and
  Note: Default and became very slow after groups
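The benchmark script itself is not part of the transcript. The sketch below shows one way a comparison of this kind could be set up with h5py, under several assumptions: it compares libver="earliest" against libver="latest" within a single installed library rather than across library releases, the creation time includes closing the file, the helper run_case and all names are hypothetical, and the group count is kept small so it runs quickly. It reports no reference numbers.

```python
import os
import tempfile
import time

import h5py


def run_case(libver, n_groups, workdir):
    """Create n_groups groups with one dataset each; return timings and size."""
    path = os.path.join(workdir, "bench_%s.h5" % libver)

    start = time.perf_counter()
    with h5py.File(path, "w", libver=libver) as f:
        parent = f.create_group("groups")
        for i in range(n_groups):
            parent.create_group("g%07d" % i).create_dataset("d", data=[i])
    create_seconds = time.perf_counter() - start  # includes closing the file

    start = time.perf_counter()
    with h5py.File(path, "r") as f:
        last_value = int(f["groups/g%07d/d" % (n_groups - 1)][0])
    read_seconds = time.perf_counter() - start    # open, read one dataset, close

    return {
        "libver": libver,
        "create_seconds": create_seconds,
        "read_seconds": read_seconds,
        "file_size_bytes": os.path.getsize(path),
        "last_value": last_value,
    }


workdir = tempfile.mkdtemp()
results = [run_case(libver, 1000, workdir) for libver in ("earliest", "latest")]
```

Raising n_groups toward the 10^6 used on the slide is a matter of changing one argument, at the cost of a much longer run.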

[Chart: Time to Open and Read a Dataset]

[Chart: Time to Close the File]

[Chart: File Size]

DATATYPES

Datatypes
See Tutorial examples

Thank You! Questions?