The European ITM Task Force data structure, F. Imbeaux (www.efda-taskforce-itm.org)


1 The European ITM Task Force data structure
F. Imbeaux
www.efda-taskforce-itm.org

2 Outline
Data structure: logic, tools
Universal Access Layer (UAL)
Database organisation, storage, and catalogue system

3 Why a data structure?
Key elements for the communication between different codes:
– Definition of a common format
– This common format should be generic enough to be shared by all codes and to apply to all machines / circumstances for a given physical problem
Practical point of view:
– The code I/O interface uses the data structure directly
– Many modern languages (Fortran 90, C++, …) can use structured objects → the data structure design is language independent
– The content of the structured object can be changed (e.g. by adding a new variable) without having to modify the I/O interface of the codes
Conceptual point of view:
– Object-oriented approach to Consistent Physical Objects (CPO)

4 Object-oriented data structure
Full description of a tokamak: physics quantities + subsystem characteristics + diagnostic measurements → object-oriented data structure
High degree of organisation: several subtrees corresponding to « Consistent Physical Objects » (this avoids flat structures with long lists of parameter names).
Substructures correspond to Consistent Physical Objects:
– Subsystem (e.g. a heating system, or a diagnostic): will contain structured information on the hardware setup and the data measured by / related to this object.
– Code results (e.g. a given plasma equilibrium, or the various source terms and fast particle distribution function from an RF code): will contain structured information on the code parameters and the physics results.
Programming language flexibility: use of recent software technologies; the database structure is defined using XML schemas

5 Use of XML schemas
XML is a generic and standardised markup language, well suited to describing hierarchical structures
XML files can also contain the actual data, but we do not use this possibility (ASCII format is not convenient for large numerical data)
XML schemas are used to define the data structure (tree layout, type of the objects, …).
User-friendly tools (XML editors) allow fast and easy design of the data structure.
Small translation scripts allow an automated translation of the structure for various purposes:
– Generate type definitions in various languages
– Generate access routines to CPOs in various languages
– Generate documentation
– Extract specific parts of the data structure (e.g. machine description)
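The schema-to-type-definition translation can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard `xml.etree.ElementTree` instead of the translation scripts used in the project, and the miniature schema, the field names (`time`, `ip`, `npsi`), the `type_` prefix and the type mapping are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical miniature schema; a real data-structure schema is far larger.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="equilibrium">
    <xs:sequence>
      <xs:element name="time" type="xs:double"/>
      <xs:element name="ip" type="xs:double"/>
      <xs:element name="npsi" type="xs:int"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

# Assumed mapping from XML Schema built-in types to Fortran 90 declarations.
F90_TYPES = {
    "xs:double": "real(kind=8)",
    "xs:int": "integer",
    "xs:string": "character(len=132)",
}

def generate_f90_types(schema_text):
    """Emit one Fortran 90 derived type per complexType found in the schema."""
    root = ET.fromstring(schema_text)
    lines = []
    for ctype in root.iter(XS + "complexType"):
        name = ctype.get("name")
        lines.append(f"type type_{name}")
        for elem in ctype.iter(XS + "element"):
            lines.append(f"  {F90_TYPES[elem.get('type')]} :: {elem.get('name')}")
        lines.append(f"end type type_{name}")
    return "\n".join(lines)

f90_source = generate_f90_types(SCHEMA)
print(f90_source)
```

Because the type definitions are derived from the schema, adding a variable to the schema and regenerating is enough to update them; nothing is edited by hand.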

6 Detailed data structure: TOP
Just below the top: subtrees representing Consistent Physical Objects: subsystems of the tokamak, or topical code results (equilibrium, MHD, RF, …)
One general bookkeeping node (contains in particular the reference GPN)
Set of reduced data summarising the main simulation parameters ("0D") for the database catalogue

7 Detailed data structure: CPO diagnostic
Typical diagnostic structure
Each subtree (CPO) has its own time array.
Each subtree (CPO) has one bookkeeping structure.

8 Detailed data structure: CPO code
Typical code structure
Each subtree (CPO) has its own time array.
Each subtree (CPO) has one bookkeeping structure.
Stores code results and code parameters
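A minimal sketch of what such a code CPO could look like as a structured object, written in Python for brevity. Only the three ingredients named on the slide (own time array, one bookkeeping structure, code parameters alongside results) come from the presentation; every class and field name here is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Bookkeeping:
    # Hypothetical bookkeeping fields.
    source: str = ""
    comment: str = ""

@dataclass
class CodeParameters:
    # Hypothetical description of the code that produced the results.
    code_name: str = ""
    code_version: str = ""

@dataclass
class EquilibriumCPO:
    # Each CPO carries its own time array ...
    time: List[float] = field(default_factory=list)
    # ... exactly one bookkeeping structure ...
    bookkeeping: Bookkeeping = field(default_factory=Bookkeeping)
    # ... and the code parameters next to the physics results.
    codeparam: CodeParameters = field(default_factory=CodeParameters)
    ip: List[float] = field(default_factory=list)  # one value per time slice

eq = EquilibriumCPO(
    time=[0.0, 0.5],
    ip=[1.0e6, 1.2e6],
    codeparam=CodeParameters(code_name="demo_code", code_version="0.1"),
)
```

Adding a new result field to `EquilibriumCPO` does not change how the object is passed between codes, which is the practical benefit claimed for structured objects.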

9 The data structure is a key central object
A database entry is an instance of the data structure
During simulations, the « plasma state component » is an instance of the data structure
All data exchanged between codes are part of the data structure (CPOs)
Access to the data structure is managed by the Universal Access Layer

10 The Universal Access Layer
Library providing access to the data structure
– From the database
– From the plasma state component during the run
Low-level routines: Get/Put a single variable. Developed in C.
User-level routines: Get/Put a whole CPO, with time interpolation / resampling options. Developed for F90 and C++.
Transport layer: access to the data (knows about the storage method)
Since the « user-level » routines deal with CPOs, they must adhere to the data structure → they are generated by XSL (XML) scripts from the schemas
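The layering described above can be illustrated with a small Python sketch. It is not the UAL API: the class `MemoryTransport`, the functions `put_cpo` and `get_cpo_at`, the path convention and the in-memory storage are all assumptions made for the example, and only linear interpolation in time is shown.

```python
from bisect import bisect_right

class MemoryTransport:
    """Stand-in transport layer: the only part that knows how data is stored."""
    def __init__(self):
        self._store = {}

    def put(self, path, value):   # low level: put a single variable
        self._store[path] = value

    def get(self, path):          # low level: get a single variable
        return self._store[path]

def put_cpo(transport, cpo_name, cpo):
    """User level: write a whole CPO, one low-level put per field."""
    for field_name, value in cpo.items():
        transport.put(f"{cpo_name}/{field_name}", value)

def get_cpo_at(transport, cpo_name, field_names, t):
    """User level: read a CPO linearly interpolated at time t."""
    times = transport.get(f"{cpo_name}/time")
    if not times[0] <= t <= times[-1]:
        raise ValueError("requested time outside stored range")
    hi = min(bisect_right(times, t), len(times) - 1)
    lo = max(hi - 1, 0)
    span = times[hi] - times[lo]
    w = 0.0 if span == 0 else (t - times[lo]) / span
    out = {"time": t}
    for name in field_names:
        values = transport.get(f"{cpo_name}/{name}")
        out[name] = values[lo] + w * (values[hi] - values[lo])
    return out

transport = MemoryTransport()
put_cpo(transport, "equilibrium", {"time": [0.0, 1.0, 2.0], "ip": [1.0, 2.0, 4.0]})
result = get_cpo_at(transport, "equilibrium", ["ip"], 1.5)
print(result)
```

Swapping `MemoryTransport` for another backend would leave the user-level functions unchanged, which is the point of isolating the storage method in the transport layer.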

11 Database storage
The data is presently stored on an MDS+ server
– Widely used data access system in the fusion community
– Interfaces already exist with many languages
– Convenient for storing multi-dimensional arrays, no problem with large data size
– Not really object oriented (arrays of objects not possible), slow for a large number of data calls
The XML schemas defining the data structure are used to build the MDS+ model tree (automated script)
The data storage system may evolve in the future
The data structure is in principle storage-independent: CPOs can be stored with different methods. They contain nodes describing the storage method, to be used by the Transport Layer to access data.

12 Definition of database entries
A database entry is defined by:
– The tokamak name (MDS tree name)
– The shot number
– The version of the data / reference number of the simulation
Use of the MDS+ shot number as a Generalised Pulse Number (GPN)
A full tokamak simulator should be able to compute all possible experimental quantities → unique data structure for experimental data and all kinds of simulations
Guarantee the consistency of a given dataset with minimum bookkeeping effort → each entry of the database corresponds to a unique consistent physics dataset
Each new simulation or version of the experimental data creates a new entry
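As a rough illustration of the three-part entry definition and the "every new version is a new entry" rule, here is a Python sketch. The names `EntryKey`, `create_entry` and `demo_tokamak`, the in-memory dictionary, and the choice to reject overwrites with a `KeyError` are assumptions of the example, not part of the ITM database.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EntryKey:
    tokamak: str   # tokamak name (MDS tree name)
    shot: int      # shot number
    version: int   # data version / simulation reference number

database = {}  # hypothetical in-memory stand-in for the database

def create_entry(key, data):
    """Store a new entry; an existing entry is never overwritten."""
    if key in database:
        raise KeyError(f"entry {key} already exists; create a new version")
    database[key] = data

create_entry(EntryKey("demo_tokamak", 53522, 1), {"origin": "experimental data"})
create_entry(EntryKey("demo_tokamak", 53522, 2), {"origin": "simulation"})
```

Making the key immutable and refusing to overwrite keeps each stored entry a fixed, self-consistent dataset.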

13 Referencing system
Guarantee data consistency within one entry → each new simulation or version of the experimental data creates a new entry.
Copying all data present in the structure would cost a lot of storage space.
Only data that are modified are explicitly written in the « output » GPN
The unmodified data can be tracked down using a signal referencing the « input » GPN.
– This signal would be located at the top of the tree
– Valid for all subtrees (subtrees of different origin are not allowed, since this could violate data consistency) → simple and efficient bookkeeping
A catalogue system based on a relational database keeps track of all existing entries in the ITM DB and their relations (input/output)

14 Referencing system
Exp. Data 0535220001, Ref: none
Simulation #1 0535220002, Ref: 0535220001
Simulation #4 0535220003, Ref: 0535220002
Exp. Data 0535220004, Ref: none
Simulation #2 0535220005, Ref: 0535220004
Simulation #3 0535220006, Ref: 0535220004
Guarantees data consistency
Referencing system → recursive search, hidden from users who do not need to know about it
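The recursive search through input references could look like the following Python sketch. The entry numbers (1, 2, 3), the CPO names and the dictionary layout are hypothetical; the only behaviour taken from the slides is that an entry stores just the data it modified plus a single reference to its input entry.

```python
# Hypothetical entries: each holds only the CPOs it modified and one input reference.
entries = {
    1: {"ref": None, "data": {"equilibrium": "exp-eq", "coreprof": "exp-prof"}},
    2: {"ref": 1, "data": {"equilibrium": "sim-a-eq"}},
    3: {"ref": 2, "data": {"coreprof": "sim-b-prof"}},
}

def resolve(gpn, cpo_name):
    """Return (gpn_where_found, value), following input references recursively."""
    if gpn is None:
        raise KeyError(f"{cpo_name} not found in any referenced entry")
    entry = entries[gpn]
    if cpo_name in entry["data"]:
        return gpn, entry["data"][cpo_name]
    return resolve(entry["ref"], cpo_name)

found = resolve(3, "equilibrium")
print(found)
```

A caller asks entry 3 for the equilibrium and receives the data actually stored in entry 2, without needing to know where they were written.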

15 Present status and perspectives
Data structure completed for IMP #1 (equilibrium). Design ongoing for IMP3 (integration: core and edge transport) and IMP5 (H&CD sources)
MDS+ server set up, model tree adheres to the DS
UAL: a first prototype has been produced and is being tested. Hope to deliver the first user version in June (C++ and F90)
Extend the DS to other physics areas
Database catalogue and referencing system designed conceptually; tools must be developed.
First test of the whole set DS + UAL: benchmarking equilibrium codes (IMP1 task)

