E-Science Data Information and Knowledge Transformation BinX – A Tool for Binary File Access eDIKT project team Ted Wen

1 e-Science Data Information and Knowledge Transformation BinX – A Tool for Binary File Access eDIKT project team Ted Wen Robert Carroll

2 Agenda About the BinX project Introduction to the BinX language Introduction to the BinX library Example application Overview of the BinX API Discussion

3 The problem Most scientific data are in binary files Binary data files are not all standardized Binary data files are platform-dependent XML is useful to represent metadata Scientific datasets can be too large in XML

4 What is BinX? Binary in XML –Annotation language Using XML Descriptive Low-level –Software components BinX library Generic utilities API

5 How and Why BinX is used [Diagram: binary data (0101…) accessed either through special application programs or through the BinX library shared by several application programs]

6 e-Science Data Information and Knowledge Transformation The BinX Language Annotating a binary data stream Mark up data types Mark up sequences Mark up arrays Complex structures

7 Data elements Primitive data elements –Byte, character, integer, real Complex data elements –Arrays, struct, union User-defined data elements

8 Primitive Data Types Character – (Fixed length, variable length and delimited) Integer – –, Real –

9 Primitive Data Types Mark up data types
1. FF 7F = 32767
2. 7F FF FF FF = 2147483647
3. 00 00 C8 42 = 100.0
4. 42 C8 00 00 = 100.0
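The four byte sequences on this slide only yield the listed values when the byte order is known: sequences 1 and 3 are little-endian, 2 and 4 are big-endian. The following is a minimal C++ sketch of that decoding step, not BinX library code; `ByteOrder`, `decodeUnsigned` and `decodeFloat32` are hypothetical names, and the host is assumed to use 32-bit IEEE 754 floats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type (not part of the BinX API): byte order of the stored value.
enum class ByteOrder { Little, Big };

// Assemble an unsigned integer of 1 to 4 bytes from raw bytes in the given order.
std::uint32_t decodeUnsigned(const unsigned char* bytes, std::size_t size, ByteOrder order) {
    assert(size >= 1 && size <= 4);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < size; ++i) {
        // Visit bytes from most to least significant; for little-endian
        // data the most significant byte is stored last.
        const std::size_t index = (order == ByteOrder::Big) ? i : size - 1 - i;
        value = (value << 8) | bytes[index];
    }
    return value;
}

// Reinterpret four stored bytes as a single-precision float.
// Assumption: the host float is a 32-bit IEEE 754 value.
float decodeFloat32(const unsigned char* bytes, ByteOrder order) {
    static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");
    const std::uint32_t bits = decodeUnsigned(bytes, 4, order);
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, `decodeUnsigned` applied to `FF 7F` with `ByteOrder::Little` gives 32767, and `decodeFloat32` applied to `42 C8 00 00` with `ByteOrder::Big` gives 100.0.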

10 Abstract struct types Mark up a sequence Screen descriptor in GIF: Screen width: unsigned short; Screen height: unsigned short; Packed field: a byte Background colour index: byte Pixel aspect ratio: byte
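To make the sequence concrete, here is a hedged C++ sketch that reads the five fields of the GIF screen descriptor from a 7-byte buffer in the order listed on the slide. It does not use the BinX library; `ScreenDescriptor`, `readU16LE` and `parseScreenDescriptor` are hypothetical names, and the two unsigned shorts are read as little-endian, as GIF stores them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Field order follows the slide; the struct itself is an illustrative assumption.
struct ScreenDescriptor {
    std::uint16_t width;
    std::uint16_t height;
    std::uint8_t packed;
    std::uint8_t backgroundColourIndex;
    std::uint8_t pixelAspectRatio;
};

// Two shorts plus three bytes.
constexpr std::size_t kScreenDescriptorSize = 7;

// Read an unsigned 16-bit value stored least significant byte first.
std::uint16_t readU16LE(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Decode the descriptor field by field instead of casting the buffer to the
// struct, so compiler padding and host byte order do not matter.
ScreenDescriptor parseScreenDescriptor(const unsigned char* bytes, std::size_t length) {
    assert(length >= kScreenDescriptorSize);
    ScreenDescriptor d;
    d.width = readU16LE(bytes);
    d.height = readU16LE(bytes + 2);
    d.packed = bytes[4];
    d.backgroundColourIndex = bytes[5];
    d.pixelAspectRatio = bytes[6];
    return d;
}
```

Hand-written readers like this are what a declarative description of the same sequence is meant to replace.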

11 Abstract array types Mark up an array A 2-dimensional array containing 10-by-100, 32-bit integers
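Knowing the dimensions and element size is enough to locate any element in the binary stream. The sketch below computes byte offsets for the 10-by-100 array of 32-bit integers, assuming row-major storage (the last index varies fastest); the slide does not state the storage order, so that is an assumption, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Dimensions and element size from the slide.
constexpr std::size_t kRows = 10;
constexpr std::size_t kCols = 100;
constexpr std::size_t kElementSize = 4;  // a 32-bit integer occupies 4 bytes
constexpr std::size_t kTotalBytes = kRows * kCols * kElementSize;

// Byte offset of element (row, col) from the start of the array.
// Assumption: row-major layout, i.e. each row of 100 integers is contiguous.
std::size_t elementOffset(std::size_t row, std::size_t col) {
    assert(row < kRows && col < kCols);
    return (row * kCols + col) * kElementSize;
}
```

Under these assumptions the whole array occupies 4000 bytes and the second row starts at byte offset 400.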

12 Embedded abstract types Complex structures

13 User-defined metadata Label the data types and structures

14 Reusable type definitions Define macros for reuse

15 Linking to binary data Reference the binary data file … …

16 The BinX document

17 A BinX document [XML listing not preserved; callouts: Root element, Data class section, Data instance section, Abstract data type]

18 DataBinX DataBinX = BinX with Data 100 1000 5.257 1 2

19 e-Science Data Information and Knowledge Transformation The BinX Library Core library Utilities Applications

20 Output from the library DataBinX combined data and BinX document SchemaBinX Binary data stream DataBinX = SchemaBinX + Binary data

21 BinX Components The library has core functionality to support generic utilities and applications BinX Library Core (BinX core functionality) –Parse/Gen BinX doc –Read/write binary data –Parse/Gen DataBinX Utilities (Generic tools) –DataBinX pack/unpack –Extractor Applications –Domain-specific

22 BinX application models Data manipulation model Data transportation model Data service model Data query model Data catalogue model

23 Data manipulation model Extraction –Subset of a dataset Combination –Merge several datasets Transformation –Conversion of data types –Change of sequence order –Transposition of array dimensions Transparency –Automatic change of byte order
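Two of the transformations listed here, transposition of array dimensions and change of byte order, can be sketched in a few lines of plain C++. This only illustrates the operations themselves, not how the BinX library implements them; `transpose` and `swapBytes32` are hypothetical names, and the input array is assumed to be stored row-major.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Transpose a rows-by-cols array stored row-major into a cols-by-rows array.
std::vector<std::int32_t> transpose(const std::vector<std::int32_t>& in,
                                    std::size_t rows, std::size_t cols) {
    assert(in.size() == rows * cols);
    std::vector<std::int32_t> out(in.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c) of the input becomes element (c, r) of the output.
            out[c * rows + r] = in[r * cols + c];
        }
    }
    return out;
}

// Reverse the byte order of a 32-bit value (big-endian <-> little-endian).
std::uint32_t swapBytes32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

Applying `swapBytes32` twice returns the original value, which is why one routine serves both conversion directions.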

24 Data transportation model DataBinX as interlingua [Diagram: Send: XML document, XSLT, DataBinX, BinX Util, SchemaBinX + Binary, ZIP tool, ZIP (MIME); Receive: ZIP (MIME), ZIP tool, SchemaBinX + Binary, BinX Util, DataBinX, XSLT, XML document]

25 Data service model Publishing logical datasets in BinX –Dataset from one binary file –Dataset from several binary files –Dataset from multiple data sources [Diagram: Client, BinX, Grid, DB, binary files]

26 Data query model Create DataBinX –From Binary and BinX Query DataBinX –Use XPath Create New DataBinX –Results from query Parse DataBinX –Create new Binary and BinX [Diagram: BinX + Binary, DataBinX, XPath, New DataBinX, BinX + Binary]

27 Data catalogue model Primary storage Binary data files Metadata Syntactic annotation Semantic annotation Classification Domain specific Cross-reference XLink [Diagram: hierarchy of BinX documents (BinX 1; BinX 1.1, BinX 1.2; BinX 1.2.1, BinX 1.2.2, BinX 1.2.3) above the binary files; axis labels: METADATA, Abstract to Detailed, BINARY]

28 e-Science Data Information and Knowledge Transformation Application in Astronomy Case Study Data Conversion Between FITS and VOTable

29 Application in astronomy FITS and VOTable conversion [Diagram: FITS file (SIMPLE = T … END plus binary data), DataBinX Utility, BinX library Core, VOTable document (<?xml version=. …)]

30 FITS file
Primary HDU
  Header (columns 0 to 79):
    SIMPLE = T / file does conform to FITS standard
    BITPIX = 8 / number of bits per data pixel
    NAXIS = 1 / number of data axes
    …
    END
  Data: 3D 4A 14 0F 1C FE 25 04 … …
Extension
  Header:
    XTENSION= BINTABLE / binary table extension
    BITPIX = 8 / 8-bit bytes
    NAXIS = 2 / 2-dimensional binary table
    …
    END
  Data: 7B 3E 40 2C 16 70 E7 6F … …
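The header records shown above follow the fixed FITS card layout: the keyword occupies the first 8 characters, a value indicator `= ` follows in characters 9 and 10, and an optional comment comes after a `/`. The following is a simplified C++ sketch of splitting one card into those parts; `FitsCard`, `trim` and `parseCard` are hypothetical names, and quoted string values containing `/` are deliberately not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative container for one header card (not a BinX or FITS library type).
struct FitsCard {
    std::string keyword;
    std::string value;
    std::string comment;
};

// Strip leading and trailing spaces.
std::string trim(const std::string& s) {
    const std::size_t first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Split a header card into keyword, value and comment.
// Simplification: a '/' inside a quoted string value is treated as a comment start.
FitsCard parseCard(const std::string& card) {
    FitsCard result;
    result.keyword = trim(card.substr(0, 8));
    // Only cards carrying the "= " value indicator have a value field.
    if (card.size() >= 10 && card.compare(8, 2, "= ") == 0) {
        const std::string rest = card.substr(10);
        const std::size_t slash = rest.find('/');
        result.value = trim(rest.substr(0, slash));
        if (slash != std::string::npos) {
            result.comment = trim(rest.substr(slash + 1));
        }
    }
    return result;
}
```

A card such as `END` has no value indicator, so only its keyword is filled in.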

31 VOTable Procyon 114.827 5.227 4 5 3 4 3 2 1 2 3 3 5 6

32 FITS to VOTable conversion FITS, DataBinX, VOTable [Diagram: FITS, Preprocessor, SchemaBinX, DataBinX Utility, DataBinX, XSLT transformer, VOTable]

33 VOTable to FITS conversion VOTable, DataBinX, FITS [Diagram: VOTable, XSLT, XSLT transformer, DataBinX, DataBinX Utility, SchemaBinX, Binary Data, FITS Header, Post processor, FITS]

34 Support Information and software download: – Questions: – Requirements and suggestions: – –

35 e-Science Data Information and Knowledge Transformation BinX API

36 Parsing a BinX document
    BxBinxFile* pReader = new BxBinxFile();
    if (pReader->parse("mybinx.xml")) {
        BxDataset* pDataset = pReader->getDataset();
    }

37 Reading a BinX document
    // Get an array object, by index or alternatively by name
    BxArrayFixed* pArray = pDataset->getArray(0);
    // BxArrayFixed* pArray = pDataset->getArray("fixed");
    // Get a struct from the array
    BxDataset* pStruct = pArray->get(0, 0);

38 Reading a BinX document
    // Get the data value
    BxFloat32* pReal = pStruct->getFloat("Real");
    float real = pReal->getFloat();

39 Creating BinX document
    // Create an object to write out the document
    BxBinxFileWriter* pWriter = new BxBinxFileWriter();
    // Create a new dataset (in-memory BinX document)
    BxDataset* pData = new BxDataset();
    BxShort16* i16 = new BxShort16(100);
    pData->addDataObject(i16);

40 Creating BinX document
    // Create a new binary file
    BxBinaryFile* pbf = new BxBinaryFile();
    // Create a link to the BinX document
    pbf->setDatasetPointer(pData);
    // Save the BinX document
    pWriter->setBinaryFilePtr(pbf);
    pWriter->save("TestDataset.xml");

41 Merge binary data
    BxBinxFileReader* pFile1 = new BxBinxFileReader("file1.xml");
    BxBinxFileReader* pFile2 = new BxBinxFileReader("file2.xml");
    BxDataset* pDataset1 = pFile1->getDataset();
    BxDataset* pDataset2 = pFile2->getDataset();
    BxArray* pArray1 = pDataset1->getArray(0);
    BxArray* pArray2 = pDataset2->getArray(0);
    BxDataObject* pData1 = pArray1->getNext();
    BxDataObject* pData2 = pArray2->getNext();
    // Write both data objects to one output file
    FILE* fo = fopen("output.dat", "wb");
    pData1->toStreamBinary(fo);
    pData2->toStreamBinary(fo);
    fclose(fo);

42 Summary One BinX document can describe many binary files Generate BinX document from code Easy to use interfaces Flexible
