2 The Situation: A Stream of Data
An instrument takes measurements at regular intervals
  Data arrives in "packets", one value at a time
  Data is arriving in real time
Or multiple instruments are being used
  Packets consist of one or more measurements
  Packets vary in size and content
3 The Solution: Packet Tables
A high-level API for HDF5
Designed to support streams of data
High-performance for real-time data
Supports both fixed-length packets and variable-length packets
Available in C and C++

Packet Tables are always 1-D lists of packets. They're not higher-performance than normal HDF5 calls, but they are much faster than other High-Level APIs.
4 Packet Tables vs. H5TB Tables
The "Packet Table" and "Table" interfaces both create tables in HDF5.
H5TB Tables are flexible.
H5TB Tables support insertions.
Packet Tables are high-performance and support variable-length entries.
A table is one or the other, but not both!

Packet Tables are lower-level; they have to be opened and closed. H5TB Table calls are atomic. H5TB Tables store metadata about field names, allow tables to be combined, etc. Packet Tables support appends, but not insertions (this feature could be added if there is demand for it, but it would be much slower than appending).
5 Example – Boeing flight test
HDF5 “Packet”
Some other HDF5 “Table” package
6 Using Packet Tables
A Packet Table contains either fixed-length or variable-length packets.
  Use H5PTcreate_fl or H5PTcreate_vl
  Once set, a Packet Table's type never changes
Packet Tables need to be opened and closed like HDF5 datasets.
  Use H5PTopen and H5PTclose
7 Using Packet Tables
Write packets from the data stream
  Use H5PTappend
Read packets back in order
  Set the starting point with H5PTset_index
  Use H5PTget_next to move through the data
…Or, out of order
  Use H5PTread_packets

If you set the index to point to packet 1 and call H5PTget_next, you'll get packet 1. Next time you call H5PTget_next, you'll get packet 2, and so on. You can also get more than one packet at a time. H5PTread_packets gives you random read access without bothering with indices, etc.
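The append, cursor, and random-access behavior described above can be illustrated with a small in-memory model. This is only a sketch of the semantics, not the HDF5 implementation: the toy_table type and the toy_* functions are hypothetical names invented here, packets are plain ints, and positions are counted from zero as C arrays are.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_CAPACITY 16

/* Hypothetical in-memory stand-in for a fixed-length packet table of ints. */
typedef struct {
    int    packets[TOY_CAPACITY]; /* stored packets, in arrival order */
    size_t count;                 /* number of packets appended so far */
    size_t index;                 /* cursor: next packet toy_get_next returns */
} toy_table;

/* Append n packets at the end. Returns 0 on success, -1 if they do not fit. */
int toy_append(toy_table *t, size_t n, const int *data)
{
    assert(t != NULL && data != NULL);
    if (n > TOY_CAPACITY - t->count)
        return -1;
    memcpy(&t->packets[t->count], data, n * sizeof *data);
    t->count += n;
    return 0;
}

/* Move the cursor to position pos (zero-based in this sketch). */
int toy_set_index(toy_table *t, size_t pos)
{
    assert(t != NULL);
    if (pos > t->count)
        return -1;
    t->index = pos;
    return 0;
}

/* Copy n packets starting at the cursor, then advance the cursor past them. */
int toy_get_next(toy_table *t, size_t n, int *out)
{
    assert(t != NULL && out != NULL);
    if (n > t->count - t->index)
        return -1;
    memcpy(out, &t->packets[t->index], n * sizeof *out);
    t->index += n;
    return 0;
}

/* Random access: copy n packets starting at start; the cursor is untouched. */
int toy_read_packets(const toy_table *t, size_t start, size_t n, int *out)
{
    assert(t != NULL && out != NULL);
    if (start > t->count || n > t->count - start)
        return -1;
    memcpy(out, &t->packets[start], n * sizeof *out);
    return 0;
}
```

In this model, sequential reads share one cursor that each toy_get_next call advances, while toy_read_packets takes its starting position as an argument and leaves the cursor alone, which is the distinction the slide draws between reading in order and out of order.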
8 Fixed-length vs. Variable-length
[Figure, labeled Time and Data: a. Fixed-length packets. b. Variable-length packets.]

This is what we mean when we talk about "Fixed-length" and "Variable-length" packets. This is also a good picture of what Packet Tables look like in general.
9 Fixed-Length vs. Variable-Length
Both types of Packet Table use the same API calls
Fixed-length tables use HDF5 datatypes
Variable-length Packet Tables use hvl_t structs
  HDF5's natural support for variable-size data
  During reads, a buffer is allocated and must be freed: use H5PTfree_vlen_readbuff

Essentially, variable-length packet tables are fixed-length packet tables that take a different kind of data. All the functions are the same, but fixed-length tables expect buffers full of some HDF5 datatype, and variable-length tables expect buffers full of hvl_t's.
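To make "buffers full of hvl_t's" concrete, here is a minimal sketch of what such a buffer looks like in memory. It assumes only that hvl_t is a struct holding a length and a pointer; the vl_packet type below is a local stand-in with those two fields so the example compiles without HDF5, and the vl_* helpers are hypothetical names, not library calls. Real code would include hdf5.h and use hvl_t directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in with the same two fields as HDF5's hvl_t. */
typedef struct {
    size_t len; /* number of elements in this packet */
    void  *p;   /* heap-allocated storage for the elements */
} vl_packet;

/* Hypothetical helper: fill one packet with a copy of n doubles.
 * Returns 0 on success, -1 if allocation fails. */
int vl_packet_fill(vl_packet *pkt, const double *src, size_t n)
{
    assert(pkt != NULL);
    pkt->len = 0;
    pkt->p = NULL;
    if (n == 0)
        return 0;
    pkt->p = malloc(n * sizeof *src);
    if (pkt->p == NULL)
        return -1;
    memcpy(pkt->p, src, n * sizeof *src);
    pkt->len = n;
    return 0;
}

/* Hypothetical helper: count the elements held by all packets in a buffer. */
size_t vl_total_elements(const vl_packet *buf, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += buf[i].len;
    return total;
}

/* Hypothetical helper: release the element storage of every packet.
 * It plays the role the slide assigns to H5PTfree_vlen_readbuff:
 * memory allocated per packet must be freed once the caller is done. */
void vl_packets_free(vl_packet *buf, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(buf[i].p);
        buf[i].p = NULL;
        buf[i].len = 0;
    }
}
```

The buffer itself is an ordinary fixed-size array of structs; only the storage each struct points to varies in size. That is why the same calls can serve both kinds of table, and why a read of variable-length packets leaves per-packet allocations that must be released afterwards.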
10 Packet Tables in Action
An overview of Packet Tables
See the Packet Table use cases:
  Simple examples of Packet Tables in use
11 SQL, Science data and HDF5
While the commercial world has standardized on the relational data model and SQL, no single standard or tool has critical mass in the scientific community. There are many parallel and competing efforts to build these tool suites – at least one per discipline. Data interchange outside each group is problematic. In the next decade, as data interchange among scientific disciplines becomes increasingly important, a common HDF-like format and package for all the sciences will likely emerge.

Jim Gray, Distinguished Engineer at Microsoft, 1998 Turing Award winner

“Scientific Data Management in the Coming Decade,” Jim Gray, et al. Cyberinfrastructure Technology Watch Quarterly, Volume 1, Number 2, February