Data Formats (HPC Visualization with ParaView Workshop)

1 Data Formats (HPC Visualization with ParaView Workshop)
Shuaib Arshad April 23, 2014

2 Supported Data Types
ParaView Data (.pvd); VTK (.vtp, .vtu, .vti, .vts, .vtr); VTK Legacy (.vtk); VTK Multi Block (.vtm, .vtmb, .vtmg, .vthd, .vthb); Partitioned VTK (.pvtu, .pvti, .pvts, .pvtr); ADAPT (.nc, .cdf, .elev, .ncd); ANALYZE (.img, .hdr); ANSYS (.inp); AVS UCD (.inp); BOV (.bov); BYU (.g); CAM NetCDF (.nc, .ncdf); CCSM MTSD (.nc, .cdf, .elev, .ncd); CCSM STSD (.nc, .cdf, .elev, .ncd); CEAucd (.ucd, .inp); CMAT (.cmat); CML (.cml); CTRL (.ctrl); Chombo (.hdf5, .h5); Claw (.claw); Comma Separated Values (.csv); Cosmology Files (.cosmo, .gadget2); Curve2D (.curve, .ultra, .ult, .u); DDCMD (.ddcmd); Digital Elevation Map (.dem); Dyna3D (.dyn); EnSight (.case, .sos); Enzo boundary and hierarchy; ExodusII (.g, .e, .exo, .ex2, .ex2v.., etc); ExtrudedVol (.exvol); FVCOM (MTMD, MTSD, Particle, STSD); Facet Polygonal Data; Flash multiblock files; Fluent Case Files (.cas); GGCM (.3df, .mer); GTC (.h5); GULP (.trg); Gadget (.gadget); Gaussian Cube File (.cube); JPEG Image (.jpg, .jpeg); LAMMPS Dump (.dump); LAMMPS Structure Files; LODI (.nc, .cdf, .elev, .ncd); LODI Particle (.nc, .cdf, .elev, .ncd); LS-DYNA (.k, .lsdyna, .d3plot, d3plot); M3DC1 (.h5); MFIX Unstructured Grid (.RES); MM5 (.mm5); MPAS NetCDF (.nc, .ncdf); Meta Image (.mhd, .mha); Miranda (.mir, .raw); Multilevel 3d Plasma (.m3d, .h5); NASTRAN (.nas, .f06); Nek5000 Files; Nrrd Raw Image (.nrrd, .nhdr); OpenFOAM Files (.foam); PATRAN (.neu); PFLOTRAN (.h5); PLOT2D (.p2d); PLOT3D (.xyz, .q, .x, .vp3d); PLY Polygonal File Format; PNG Image Files; POP Ocean Files; ParaDIS Files; Phasta Files (.pht); Pixie Files (.h5); ProSTAR (.cel, .vrt); Protein Data Bank (.pdb, .ent); Raw Image Files; Raw NRRD image files (.nrrd); SAMRAI (.samrai); SAR (.SAR, .sar); SAS (.sasgeom, .sas, .sasdata); SESAME Tables; SLAC netCDF mesh and mode data; SLAC netCDF particle data; Silo (.silo, .pdb); Spheral (.spheral, .sv); SpyPlot CTH; SpyPlot (.case); SpyPlot History (.hscth); Stereo Lithography (.stl); TFT Files; TIFF Image Files; TSurf Files; Tecplot ASCII (.tec, .tp); Tecplot Binary (.plt); Tetrad (.hdf5, .h5); UNIC (.h5); VASP CHGCAR (.CHG); VASP OUTCAR (.OUT); VASP POSCAR (.POS); VPIC (.vpc); VRML (.wrl); Velodyne (.vld, .rst); VizSchema (.h5, .vsh5); Wavefront Polygonal Data (.obj); WindBlade (.wind); XDMF and hdf5 (.xmf, .xdmf); XMol Molecule

3 ParaView Data Model Uses VTK Data Model
Fundamental data structure is data object Scientific dataset (Rectilinear grid, FE mesh) Abstract data structure (graph, tree) Data structure Building blocks Mesh (topology, geometry) Attributes

4 VTK Data Model

5 Mesh Actual data structure varies Common abstractions:
Vertices Cells Used to discretize a region Various types (tetrahedra, hexahedra) Cells mapped to vertices by connectivity Faces stored only for polyhedra Completely defined by topology and spatial coordinates of vertices

6 Attributes Defines discrete values of a field over the mesh (pressure, temperature, velocity, stress tensor) Stored as data arrays, and can have arbitrary number of components Can be associated with points, cells, or neither

7 Uniform Rectilinear Grid
Implicit definition of topology and point coordinates Complete definition requires: Extents – min, max indices in each direction Origin – position of the index (0, 0, 0) Spacing – inter-point distance, each direction independently defined npts_total = npts_x * npts_y * npts_z coord = origin + index * spacing (i, j, k) flat index = k * (npts_x * npts_y) + j * npts_x + i All cells are of the same type Regular nature, require less storage, some algorithms optimized to take advantage
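The indexing formulas on this slide can be tried out with a few lines of Python. This is only a sketch: the grid size, origin, and spacing below are made-up values, and the helper names `flat_index` and `point_coord` are illustrative, not part of VTK or ParaView.

```python
def flat_index(i, j, k, npts_x, npts_y):
    """Flat index of point (i, j, k); x varies fastest, then y, then z."""
    return k * (npts_x * npts_y) + j * npts_x + i


def point_coord(index, origin, spacing):
    """coord = origin + index * spacing, applied independently per axis."""
    return tuple(o + n * s for o, n, s in zip(origin, index, spacing))


# Hypothetical grid: 4 x 3 x 2 points
npts_x, npts_y, npts_z = 4, 3, 2
origin = (0.0, 0.0, 0.0)
spacing = (0.5, 1.0, 2.0)

npts_total = npts_x * npts_y * npts_z
idx = flat_index(1, 2, 1, npts_x, npts_y)
xyz = point_coord((1, 2, 1), origin, spacing)

print(npts_total, idx, xyz)
```

No coordinate array is ever stored: every position is computed from the origin and spacing, which is why this grid type needs so little memory.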

8 Rectilinear Grid Implicit definition of topology and semi-implicit definition of point coordinates Complete definition requires: Extents – min, max indices in each direction 3 Arrays defining coordinates in x-, y-, and z- directions, having lengths npts_x, npts_y and npts_z respectively coord = (coord_array_x(i), coord_array_y(j), coord_array_z(k)) (i, j, k) flat index = k * (npts_x * npts_y) + j * npts_x + i All cells are of the same type
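As a small illustration of the semi-implicit coordinates, the sketch below looks up a point by indexing each per-axis array with its own index (i for x, j for y, k for z). The coordinate arrays are made-up values and `rectilinear_coord` is an illustrative helper, not a VTK call.

```python
def rectilinear_coord(i, j, k, xs, ys, zs):
    """Position of point (i, j, k): one lookup per axis array."""
    return (xs[i], ys[j], zs[k])


# Hypothetical non-uniform axis coordinates
xs = [0.0, 0.1, 0.3, 0.7]   # npts_x = 4
ys = [0.0, 1.0, 2.5]        # npts_y = 3
zs = [-1.0, 1.0]            # npts_z = 2

p = rectilinear_coord(2, 1, 0, xs, ys, zs)
n_points = len(xs) * len(ys) * len(zs)
n_stored = len(xs) + len(ys) + len(zs)

print(p, n_points, n_stored)
```

Only npts_x + npts_y + npts_z coordinate values are stored, even though the grid has npts_x * npts_y * npts_z points.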

9 Curvilinear Grid Also called Structured Grid
Implicit definition of topology and explicit definition of point coordinates Complete definition requires: Extents – min, max indices in each direction Array of point coords – stores position of every vertex explicitly coord = coord_array (idx_flat) (i, j, k) flat index = k * (npts_x * npts_y) + j * npts_x + i All cells are of the same type

10 AMR Dataset Native support
Collection of Uniform Rectilinear grids grouped under increasing refinement ratios Support for masking (blanking) sub-regions of the rectilinear grids using an array of bytes

11 Unstructured Grid Most general primitive dataset type
Explicit definition of topology and point coordinates Significantly increased memory requirement, so use only if previous options can’t be used Supports large number of cell types, all of which can exist within one grid
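To make the memory remark concrete, the rough count below compares how many numbers each grid type has to store for the same nx x ny x nz block of points. It is a back-of-the-envelope sketch under stated assumptions: the unstructured case is assumed to hold only hexahedra with 8 point ids each, and the extra bookkeeping a real VTK unstructured grid carries (cell types, offsets) is ignored, so that figure is a lower bound.

```python
def stored_values(nx, ny, nz):
    """Approximate count of stored numbers per grid type (illustrative only)."""
    n_points = nx * ny * nz
    n_cells = (nx - 1) * (ny - 1) * (nz - 1)
    return {
        # extents (6) + origin (3) + spacing (3)
        "uniform": 6 + 3 + 3,
        # extents (6) + one coordinate array per axis
        "rectilinear": 6 + nx + ny + nz,
        # extents (6) + explicit xyz for every point
        "curvilinear": 6 + 3 * n_points,
        # explicit xyz for every point + 8 point ids per hexahedral cell
        "unstructured": 3 * n_points + 8 * n_cells,
    }


counts = stored_values(3, 3, 3)
print(counts)
```

For a 3 x 3 x 3 block the counts are already 12, 15, 87, and 145; the gap widens quickly as the grid grows.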

12 Polygonal Grid Polydata
Specialized version of unstructured grid for efficient rendering Consists of: 0D cells (vertices and polyvertices) 1D cells (lines and polylines) 2D cells (polygons and triangle strips)

13 Table Tabular dataset consisting of rows and columns
Can be loaded using various file formats like CSV Can be converted to other datasets Filters operating on tables: Table to Points Table to Structured Grid

14 Multiblock Dataset Tree of datasets where leaf nodes are simple datasets (all of the above except AMR) Used to group together related datasets

15 Multipiece Dataset Similar to Multiblock
Group together datasets that are part of a whole mesh – same type and same attributes Used to collect datasets produced by a parallel sim without having to append the meshes Can be produced only using certain readers Not possible to extract individual pieces

16 Introduction to HDF5

17 What is HDF5? HDF5 == Hierarchical Data Format, v5 Open file format
Designed for high volume or complex data Open source software Works with data in the format A data model Structures for data organization and specification August 7, 2013 Extreme Scale Computing HDF5

19 HDF5 is designed … for high volume and/or complex data
for every size and type of system (portable) for flexible, efficient storage and I/O to enable applications to evolve in their use of HDF5 and to accommodate new models to support long-term data preservation

21 HDF5 File An HDF5 file is a container that holds data objects.
lat | lon | temp
----|-----|-----
    | 23  | 3.1
 15 | 24  | 4.2
 17 | 21  | 3.6

22 HDF5 File Container holding data objects

23 HDF5 Data Model HDF5 Objects Dataset Link Group Datatype Attribute
Dataspace File a.k.a. HDF5 Abstract Data Model a.k.a. HDF5 Logical Data Model

24 HDF5 Data Model aka HDF5 Abstract Data Model Objects Group Dataspace
Datatype Dataset Link Attribute

25 HDF5 Dataset
Multi-dimensional array of identically typed data elements, plus specifications for a single data element and the array dimensions. HDF5 Datatype: Integer, 32-bit, LE. HDF5 Dataspace: Rank 3; Dimensions Dim_0 = 4, Dim_1 = 5, Dim_2 = 7. HDF5 datasets organize and contain “raw data values”. HDF5 datatype describes individual data elements. HDF5 dataspace describes the logical layout of the data elements.

26 HDF5 Dataset Organizes and contains “raw data values”
Datatype describes the individual data elements Dataspace describes the logical layout of data elements

27 HDF5 Dataspace Describes the logical layout of the elements in an HDF5 dataset. NULL: no elements. Scalar: single element. Simple array (most common): multiple elements organized in a rectangular array; rank = number of dimensions; dimension sizes = number of elements in each dimension; maximum number of elements in each dimension may be fixed or unlimited.

29 HDF5 Dataspace Two roles: Dataspace contains spatial information
Rank and dimensions; permanent part of dataset definition (e.g., Rank = 2, Dimensions = 4x6). Partial I/O: Dataspace describes application’s data buffer and data elements participating in I/O (e.g., Rank = 1, Dimension = 10)

30 HDF5 Datatypes Describe individual data elements in an HDF5 dataset
Wide range of datatypes supported Integer Float Enum Array User-defined (e.g., 13-bit integer) Variable length types (e.g., strings) Compound (similar to C structs) Many more …

32 HDF5 Dataset
Datatype: 32-bit Integer Dataspace: Rank = 2, Dimensions = 5 x 3

33 How is data stored?
Buffer in memory vs. data in the file. Contiguous (default): data elements stored physically adjacent to each other. Chunked: better access time for subsets; extendible. Chunked & Compressed: improves storage efficiency, transmission speed.

34 HDF5 Dataset with Compound Datatype
Compound Datatype: int16, char, int32, 2x3x2 array of float32 Dataspace: Rank = 2, Dimensions = 5 x 3

35 HDF5 Attributes Typically contain user metadata
Have a name and a value Attributes “decorate” HDF5 objects Value is described by a datatype and a dataspace Analogous to a dataset, but do not support partial I/O operations; nor can they be compressed or extended

36 HDF5 Groups and Links HDF5 groups and links organize data objects.
Every HDF5 file has a root group (“/”). Figure: example file hierarchy with Viz and SimOut; Experiment Notes (Serial Number: ; Date: 3/13/09; Configuration: Standard 3); Parameters 10;100;1000; Timestep 36,000; and the lat | lon | temp table.

37 HDF5 File An HDF5 file is a smart container that holds data objects.

38 HDF5 Home Page HDF5 home page: http://hdfgroup.org/HDF5/
Latest release: HDF ( coming in November 2013) HDF5 source code: Written in C, and includes optional C++, Fortran 90 APIs, and High Level APIs. Contains command-line utilities (h5dump, h5repack, h5diff, ..) and compile scripts. HDF5 pre-built binaries: When possible, include C, C++, F90, and High Level libraries. Check ./lib/libhdf5.settings file. Built with and require the SZIP and ZLIB external libraries.

39 HDF5 Software Layers & Storage
Apps: H5Part, netCDF-4, h5dump, HDFview. API: High Level APIs, Java Interface; Language Interfaces: C, Fortran, C++. HDF5 Data Model: Objects (Groups, Datasets, Attributes, …); Tunable Properties (Chunk Size, I/O Driver, …). HDF5 Library Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on… Virtual File Layer, I/O Drivers: Posix I/O, Split Files, MPI I/O, Custom. Storage: HDF5 File Format; File, Split Files, File on Parallel Filesystem, Other.

40 Useful Tools For New Users
h5dump: Tool to “dump” or display contents of HDF5 files h5cc, h5c++, h5fc: Scripts to compile applications HDFView: Java browser to view HDF5 files HDF5 Examples (C, Fortran, Java, Python, Matlab)

41 General Programming Paradigm
Object is opened or created Object is accessed, possibly many times Object is closed Properties of object are optionally defined Creation properties (e.g., use chunking storage) Access properties

42 The General HDF5 API C, Fortran, Java, C++, and .NET bindings
IDL, MATLAB, Python (H5Py, PyTables) C routines begin with prefix H5?, where ? is a character corresponding to the type of object the function acts on. Example interfaces: H5D: Dataset interface, e.g., H5Dread; H5F: File interface, e.g., H5Fopen; H5S: dataSpace interface, e.g., H5Sclose

43 The HDF5 API For flexibility, the API is extensive
300+ functions Victorinox Swiss Army Cybertool 34 This can be daunting… but there is hope A few functions can do a lot Start simple Build up knowledge as more features are needed

44 Basic Functions
H5Fcreate (H5Fopen): create (open) File
H5Screate_simple/H5Screate: create dataSpace
H5Dcreate (H5Dopen): create (open) Dataset
H5Dread, H5Dwrite: access Dataset
H5Dclose: close Dataset
H5Sclose: close dataSpace
H5Fclose: close File
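The same create / access / close sequence can be sketched with h5py, the Python binding mentioned in this deck. The comments map each step loosely onto the C calls above; the correspondence is approximate, since h5py wraps several C calls in one Python call. The file and dataset names are made up, and the in-memory `core` driver is used only so that the sketch leaves no file on disk.

```python
import numpy as np
import h5py

data = np.arange(15, dtype="int32").reshape(5, 3)

# Roughly H5Fcreate: create a new file (kept in memory, nothing written to disk).
with h5py.File("basic_example.h5", "w", driver="core", backing_store=False) as f:
    # Roughly H5Screate_simple + H5Dcreate: a 5 x 3 dataset of 32-bit LE integers.
    dset = f.create_dataset("values", shape=(5, 3), dtype="<i4")
    dset[...] = data           # roughly H5Dwrite
    read_back = dset[...]      # roughly H5Dread (returns a NumPy array)
    shape, dtype = dset.shape, dset.dtype
# Leaving the with-block closes the file and the objects inside it
# (roughly H5Dclose, H5Sclose, H5Fclose).

print(shape, dtype, read_back.sum())
```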

45 Other Common Functions
DataSpaces: H5Sselect_hyperslab (Partial I/O), H5Sselect_elements (Partial I/O), H5Dget_space. Groups: H5Gcreate, H5Gopen, H5Gclose. Attributes: H5Acreate, H5Aopen_name, H5Aclose, H5Aread, H5Awrite. Property lists: H5Pcreate, H5Pclose, H5Pset_chunk, H5Pset_deflate.
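A hedged h5py sketch of two ideas from this list: chunked, deflate-compressed storage (the job of H5Pset_chunk and H5Pset_deflate in C) and partial I/O through a strided selection (the job of H5Sselect_hyperslab). The dataset name, chunk shape, and compression level are arbitrary choices for illustration.

```python
import numpy as np
import h5py

with h5py.File("partial_io.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset(
        "grid",
        data=np.arange(24, dtype="f8").reshape(4, 6),
        chunks=(2, 3),         # chunked layout
        compression="gzip",    # deflate filter
        compression_opts=4,    # compression level (arbitrary)
    )
    # Strided slice = hyperslab with start (1, 0), stride (1, 2), count (2, 3).
    # Only the selected elements are read from the dataset.
    block = dset[1:3, 0:6:2]
    chunks, compression = dset.chunks, dset.compression

print(block)
```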

46 h5py Package Pythonic interface to HDF5 binary data format
Allows storage of large numerical datasets Uses NumPy and Python metaphors like dictionary and NumPy array syntax More information:

47 h5py – Create File

48 h5py – Create Dataset

49 h5py – Create Dataset (2)

50 h5py – Create Attribute
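The slide's original code is not preserved in this transcript. As a stand-in, here is a minimal sketch of attaching attributes with h5py; the attribute names and values are made up for illustration.

```python
import numpy as np
import h5py

with h5py.File("attrs_example.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("temp", data=np.array([3.1, 4.2, 3.6]))

    # Attributes behave like a small dictionary attached to an object.
    dset.attrs["units"] = "degC"       # hypothetical metadata
    dset.attrs["timestep"] = 36000
    f.attrs["date"] = "3/13/09"        # attributes can also decorate the file's root group

    units = dset.attrs["units"]
    timestep = int(dset.attrs["timestep"])
    names = sorted(dset.attrs.keys())

print(units, timestep, names)
```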

51 h5py – Create Group

52 h5py – Create Groups

53 h5py – Create Datasets in Groups
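The code from these group slides is also missing from the transcript. A minimal sketch of creating groups and placing datasets inside them with h5py might look like the following; the group and dataset names are assumptions chosen to echo the earlier figure.

```python
import numpy as np
import h5py

with h5py.File("groups_example.h5", "w", driver="core", backing_store=False) as f:
    sim = f.create_group("SimOut")                    # group under the root group "/"
    viz = f.create_group("Viz")
    step = f.create_group("SimOut/Timestep_36000")    # nested group, addressed by path

    # Datasets can be created through a group object or by path assignment.
    step.create_dataset("temp", data=np.array([3.1, 4.2, 3.6]))
    f["Viz/lon"] = np.array([23, 24, 21])

    top_level = sorted(f.keys())
    temp_path = step["temp"].name
    temp = f["/SimOut/Timestep_36000/temp"][...]

print(top_level, temp_path, temp)
```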

54 Introduction to XDMF

55 XDMF eXtensible Data Model and Format XML based
Standardized method to exchange scientific data between HPC codes and tools

56 Data Format Raw data to be manipulated
Type, precision, location, rank, and dimensions completely describe any dataset. Light data: data description (metadata); typically less than 1000 values; can be passed around easily; stored in XML. Heavy data: actual raw values of the dataset; megabytes, terabytes, etc.; movement needs to be kept to a minimum; typically stored in HDF5, raw, or similar data formats. The data format description is stored redundantly in both XML and HDF5.

57 Data Model Describes the intended use of the data Stored using XML
Targeted at scientific simulation data focusing on scalars, vector, and tensors defined on a grid Structured and Unstructured grids are described using their topology and geometry Calculated, time varying values are attributes of the grid The actual values for the grid geometry, connectivity and attribute values are contained in data format Separation of data format and model allows for efficient storage

58 Data Model contd… HPC data is viewed as hierarchy of Domains
Domain must contain at least 1 grid Grid Basic representation of both geometric and computed/measured values Group of elements with structured or unstructured topology Geometry Specifies X, Y, and Z positions of the Grid One or more attributes Store any values associated with the Grid or individual cells

59 XDMF API C++ API to read and write XDMF data from applications
Wrappers for Python, Tcl, and Java

60 XML
Case sensitive. Made up of: Elements, Entities, Processing information. Element: <ElementTag AttributeName="AttributeValue" … > Cdata </ElementTag>, e.g. <tag Name1="Value1" Name2="Value2"> Cdata </tag> Comment: <!-- This is a comment --> “Well formed” XML: syntactically correct (quotes match, elements end properly). “Valid” XML: conforms to the Schema or DTD. 2 extensions used (XInclude and XPath).

61 XInclude Allows for inclusion of files that are not well formed XML
<Xdmf Version="2.0" xmlns:xi="…"> <xi:include href="Example3.xmf"/> </Xdmf>

62 XPath Allows for elements in the XML document and the API to reference specific elements /Xdmf/Domain/Grid /Xdmf/Domain/Grid[10] Plate”]
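Path expressions of this kind can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. This is a sketch, not the XDMF library's own API: the two grid names are made up, and because `fromstring` returns the `<Xdmf>` element itself, the paths are written relative to it.

```python
import xml.etree.ElementTree as ET

doc = """<Xdmf Version="2.0">
  <Domain>
    <Grid Name="Top Plate"/>
    <Grid Name="Bottom Plate"/>
  </Domain>
</Xdmf>"""

root = ET.fromstring(doc)  # root is the <Xdmf> element

all_grids = root.findall("./Domain/Grid")                    # every Grid in the Domain
second = root.find("./Domain/Grid[2]")                       # positions are 1-based
by_name = root.find("./Domain/Grid[@Name='Bottom Plate']")   # select by attribute value

print(len(all_grids), second.get("Name"), by_name.get("Name"))
```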

63 Minimal file All valid XDMF should appear between <Xdmf> and </Xdmf> <?xml version="1.0" ?> <!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" []> <Xdmf Version="2.0"> </Xdmf>

64 Entities XML’s basic substitution mechanism; entities are good for improving readability <?xml version="1.0" ?> <!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" [ <!ENTITY cellDimZXY " "> ]> <Xdmf Version="2.0"> ... &cellDimZXY; </Xdmf>

65 Elements <?xml version="1.0" ?>
<!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" [ <!ENTITY cellDimZXY " "> ]> <Xdmf Version="2.0"> <Domain> <Grid> <Topology> </Topology> <Geometry> </Geometry> <Attribute> </Attribute> </Grid> </Domain> </Xdmf>
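The element nesting on this slide (Xdmf, Domain, Grid, then Topology, Geometry, and Attribute) can be generated programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the XDMF API, and it produces only the empty skeleton, with no DOCTYPE and no data.

```python
import xml.etree.ElementTree as ET

# Build the skeleton: Xdmf > Domain > Grid > (Topology, Geometry, Attribute)
xdmf = ET.Element("Xdmf", Version="2.0")
domain = ET.SubElement(xdmf, "Domain")
grid = ET.SubElement(domain, "Grid")
for tag in ("Topology", "Geometry", "Attribute"):
    ET.SubElement(grid, tag)

text = ET.tostring(xdmf, encoding="unicode")
child_tags = [child.tag for child in grid]

print(text)
```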

66 DataItem Uniform - single array of values
<DataItem Dimensions="3"> </DataItem>

67 DataItem Uniform contd … The short form <DataItem Dimensions="3"> 1.0 2.0 3.0 </DataItem> written out in full:
<DataItem ItemType="Uniform" Format="XML" NumberType="Float" Precision="4" Rank="1" Dimensions="3"> 1.0 2.0 3.0 </DataItem>

68 DataItem Uniform contd … <DataItem ItemType="Uniform" Format="HDF"
NumberType="Float" Precision="8" Dimensions=" "> OutputData.h5:/Results/Iteration 100/Part 2/Pressure </DataItem> <DataItem ItemType="Uniform" Format="Binary" Dimensions=" "> PressureFile.bin </DataItem>
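To illustrate how the three formats differ, the sketch below reads a single DataItem with Python's standard XML parser. For Format="XML" the values are inline text; for Format="HDF" the text is a `file:path` reference into an HDF5 file; for Format="Binary" it is a file name. `read_data_item` is a hypothetical helper, not part of the XDMF API, and treating a missing Format attribute as XML is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def read_data_item(xml_text):
    """Return inline values, an (hdf5_file, internal_path) pair, or a binary file name."""
    item = ET.fromstring(xml_text)
    fmt = item.get("Format", "XML")      # assumption: XML when Format is omitted
    body = (item.text or "").strip()
    if fmt == "XML":
        return [float(v) for v in body.split()]
    if fmt == "HDF":
        file_name, _, internal_path = body.partition(":")
        return (file_name, internal_path)
    return body                          # e.g. Binary: name of the raw file


inline = read_data_item('<DataItem Dimensions="3"> 1.0 2.0 3.0 </DataItem>')
heavy = read_data_item(
    '<DataItem ItemType="Uniform" Format="HDF" NumberType="Float" Precision="8">'
    " OutputData.h5:/Results/Iteration 100/Part 2/Pressure </DataItem>"
)
raw = read_data_item('<DataItem Format="Binary"> PressureFile.bin </DataItem>')

print(inline, heavy, raw)
```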

69 DataItem Collection – 1D array of DataItem
Tree – Hierarchical structure of DataItem <DataItem Name="Tree Example" ItemType="Tree"> <DataItem ItemType="Tree"> <DataItem Name="Collection1" ItemType="Collection"> <DataItem Dimensions="3"> </DataItem> <DataItem Dimensions="4"> <DataItem Name="Collection2" ItemType="Collection">

70 7 8 9 </DataItem> <DataItem Dimensions="4"> <DataItem ItemType="Uniform" Format="HDF" NumberType="Float" Precision="8" Dimensions=" "> OutputData.h5:/Results/Iteration 100/Part 2/Pressure

71 (Figure: tree diagram of the example, showing Tree Example, Collection 1, Collection 2, and OutputData.h5:/Results/Iteration 100/Part 2/Pressure)

72 DataItem HyperSlab – subset of some other DataItem, specified by:
Start Stride Count Example: Source data: HDF5 file Source dimensions: 100 x 200 x 300 x 3 Start at [0, 0, 0, 0] End at [50, 100, 150, 2] Include every other plane
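The start / stride / count triple can be illustrated on a small 2D array in plain Python. The numbers below are made up and much smaller than the slide's 100 x 200 x 300 x 3 example; `hyperslab_indices` and `select_2d` are illustrative helpers, not XDMF or HDF5 calls.

```python
def hyperslab_indices(start, stride, count):
    """Indices picked along one axis: start, start + stride, ... (count of them)."""
    return [start + n * stride for n in range(count)]


def select_2d(rows, start, stride, count):
    """Apply a (start, stride, count) selection independently on each axis."""
    r = hyperslab_indices(start[0], stride[0], count[0])
    c = hyperslab_indices(start[1], stride[1], count[1])
    return [[rows[i][j] for j in c] for i in r]


# Hypothetical 4 x 6 source array; element value encodes its (row, column)
source = [[10 * i + j for j in range(6)] for i in range(4)]

# Every other row starting at row 0, every other column starting at column 1
subset = select_2d(source, start=(0, 1), stride=(2, 2), count=(2, 3))

print(subset)
```

The selection has shape count[0] x count[1], here 2 x 3.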

73 DataItem Hyperslab contd …
<DataItem ItemType="Hyperslab" Dimensions=" " Type="Hyperslab"> <DataItem Dimensions="3 4" Format="XML"> </DataItem> <DataItem Name="Points" Dimensions=" " Format="HDF"> MyData.h5:/XYZ </DataItem> </DataItem>

74 Grid Container of info related to 2D and 3D points, structured or unstructured connectivity, and assigned values Types: Uniform – a homogeneous single grid Collection – array of uniform grids with same attributes Tree – hierarchical group SubSet – portion of another grid

75 Grid contd … <Grid Name="Car Wheel" GridType="Tree">
<Grid Name="Tire" GridType="Uniform"> <Topology ... <Geometry ... </Grid> <Grid Name="Lug Nuts" GridType="Collection"> <Grid Name="Lug Nut 0" GridType="Uniform"> </Grid> <Grid Name="Lug Nut 1" GridType="Uniform"> ...

76 Topology Describes general organization of data
Structured (2DSMesh, 2DRectMesh, 2DCoRectMesh, 3DSMesh, 3DRectMesh, 3DCoRectMesh) Linear (Polyvertex, Polyline, Polygon, …) Quadratic (Edge_3, Tri_6, Quad_8, …) Arbitrary (Mixed)

77 Geometry Describe XYZ values of the mesh Organization XYZ XY X_Y_Z
VXVYVZ ORIGIN_DXDYDZ ORIGIN_DXDY

78 Attribute Defines values associated with the mesh
Values: Scalar, Vector, Tensor, Tensor6, Matrix Centered: Node, Edge, Face, Cell, Grid

79 Example <?xml version="1.0" ?> <!DOCTYPE Xdmf SYSTEM "Xdmf.dtd" [ <!ENTITY HeavyData "claw.ptc0000"> ]> <Xdmf Version="2.0"> <Domain> <Grid GridType="Uniform"> <Topology TopologyType="3DCoRectMesh" Dimensions=" "/> <Geometry GeometryType="Origin_DxDyDz"> <DataItem Dimensions="3" Format="XML"> </DataItem>

80 Example contd … <DataItem Dimensions="3" Format="XML"> </DataItem> </Geometry> <Attribute Name="A1" AttributeType="Scalar" Center="Cell"> <DataItem ItemType="HyperSlab" Dimensions=" " Type="HyperSlab"> <DataItem Dimensions="3 4" Format="XML">

81 Example contd … <DataItem Dimensions=" " NumberType="Float" Precision="8" Format="Binary" Endian="Big" Seek="8"> &HeavyData; </DataItem> </Attribute> </Grid> </Domain> </Xdmf>

82 Resources XDMF main page

83 Questions?

