ECMWF ODB Training 2006 slide 1 Introduction to Observational DataBase (ODB) Sami Saarinen, Paul Burton ECMWF 22-Mar-2006.

1 ECMWF ODB Training 2006 slide 1 Introduction to Observational DataBase (ODB) Sami Saarinen, Paul Burton ECMWF 22-Mar-2006

2 ECMWF ODB Training 2006 slide 2 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

3 ECMWF ODB Training 2006 slide 3 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

4 ECMWF ODB Training 2006 slide 4 Introduction to ODB
- ODB is tailor-made database software developed at ECMWF to manage very large observational data volumes through the IFS/4DVAR system, and to enable flexible post-processing of observational data
- An observational database usually contains the following items:
  - Observation identification, position and time coordinates
  - Observation value, pressure levels, channel numbers
  - Various quality control flags
  - Observation departures from background and analysis fields
  - Satellite-specific information
  - Other closely related information

5 ECMWF ODB Training 2006 slide 5 AMSU-A data before screening

6 ECMWF ODB Training 2006 slide 6 Basic components of ODB
- ODB/SQL language
  - Data Definition Language: describes which data items belong to the database, what their data types are, and how they are related (if at all) to each other
  - Data Query Language: queries and returns a subset of data that satisfies user-specified conditions. This is the key feature of the ODB software!
- Fortran90 interface layer
  - Data manipulation: create, update and remove data
  - Execute ODB/SQL queries and retrieve filtered data
  - Control MPI and OpenMP parallelization

7 ECMWF ODB Training 2006 slide 7 ODB/SQL compilation system

8 ECMWF ODB Training 2006 slide 8 Typical ODB usage patterns
- A database can be created interactively or in batch mode
  - We usually run our in-house BUFR2ODB in batch
  - New observation types can also be fed in via a text file
- Complete database manipulation is currently best done through the Fortran90 interface, but a read-only database can also be accessed via a rudimentary client-server interface (C/C++)
- Once the database has been created, the application program normally queries data and places the result (also known as a view) into a data matrix allocated by the user
- There can be virtually any number of active views at any given time. These can be updated and fed back to the database

9 ECMWF ODB Training 2006 slide 9 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

10 ECMWF ODB Training 2006 slide 10 Creating a simple database
- We will create a very simple database using text files
- The 3 text files describe
  - Data layout, i.e. what data items comprise this ODB
  - Location and time information of observations
  - Actual observation measurement information for each location at the given pressure levels
- Feed these files into the simulobs2odb program
- Inspect the data values in the database by using odbviewer

11 ECMWF ODB Training 2006 slide 11 Data definition layout: MYDB.ddl
    CREATE TABLE hdr AS (
      seqno pk1int,
      obstype pk1int,
      codetype pk1int,
      lat pk9real,
      lon pk9real,
      date yyyymmdd,
      time hhmmss,
      body @LINK,
    );
    CREATE TABLE body AS (
      entryno pk1int,
      varno pk1int,
      vertco_type pk1int,
      press pk9real,
      obsvalue pk9real,
    );

12 ECMWF ODB Training 2006 slide 12 Input file #2: hdr.txt
    #hdr
    obstype = 2
    codetype = 141
    seqno lat lon date time body.len
    1 45 -15 20041101 000000 1

13 ECMWF ODB Training 2006 slide 13 Input file #3: body.txt
    #body
    entryno varno vertco_type press obsvalue
    1 2 1 50000 251.0
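
The layout of these input files can be illustrated with a small parser. This is only a Python sketch and not how simulobs2odb itself reads its input. It assumes, from the slide text alone, that a `#name` line names the table, `key = value` lines set defaults that apply to every row, the next line lists the column names, and the remaining lines are whitespace-separated data rows. The function name `parse_table` is hypothetical.

```python
def parse_table(text):
    """Parse one simulobs2odb-style text table (format assumed from the slides)."""
    name, defaults, columns, rows = None, {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            name = line[1:].strip()              # table name, e.g. "hdr"
        elif "=" in line:
            key, value = line.split("=", 1)      # default applied to every row
            defaults[key.strip()] = value.strip()
        elif columns is None:
            columns = line.split()               # column header line
        else:
            row = dict(defaults)                 # start from the defaults
            row.update(zip(columns, line.split()))
            rows.append(row)
    return name, rows


HDR_TXT = """#hdr
obstype = 2
codetype = 141
seqno lat lon date time body.len
1 45 -15 20041101 000000 1
"""

name, rows = parse_table(HDR_TXT)
print(name, rows[0])
```

All values are kept as strings here; converting them to the column types named in MYDB.ddl is left out of the sketch.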

14 ECMWF ODB Training 2006 slide 14 Running simulobs2odb
- Initialize the ODB interactive environment:
    use odb
- Create the database using the following simple command:
    simulobs2odb -l MYDB -i hdr.txt -i body.txt
- As a result of these commands, a small database called MYDB has been created. It contains one data pool with two tables, hdr and body, which are linked (related) to each other via the special @LINK data type
- It is now easy to extend the database by providing more data, specifying more data items, adding more tables, or all of the above at the same time
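
One way to picture the @LINK relationship: each hdr row records how many body rows belong to it (the `body.len` column in hdr.txt). The Python sketch below additionally assumes that the body rows of consecutive hdr rows are stored back to back, so that a running offset locates them. The slides do not say how the link is stored, so the offset bookkeeping, the helper name `linked_body_rows` and the second report are all hypothetical.

```python
hdr = [
    {"seqno": 1, "lat": 45.0, "lon": -15.0, "body.len": 1},
    {"seqno": 2, "lat": 50.0, "lon": 10.0, "body.len": 2},   # made-up second report
]
body = [
    {"entryno": 1, "press": 50000.0, "obsvalue": 251.0},
    {"entryno": 1, "press": 85000.0, "obsvalue": 280.0},     # made-up
    {"entryno": 2, "press": 50000.0, "obsvalue": 255.0},     # made-up
]


def linked_body_rows(hdr_rows, body_rows):
    """Yield (hdr_row, [its body rows]) using cumulative lengths as offsets."""
    offset = 0
    for h in hdr_rows:
        n = h["body.len"]
        yield h, body_rows[offset:offset + n]
        offset += n


pairs = list(linked_body_rows(hdr, body))
for h, children in pairs:
    print(h["seqno"], [b["obsvalue"] for b in children])
```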

15 ECMWF ODB Training 2006 slide 15 Visualizing with odbviewer
- History: odbviewer was originally written as a debugging tool for ODB software development
- Linked with the ECMWF graphics package MAGICS/MAGICS++, it displays coverage plots
- Also a textual report generator
  - Displays output of data queries
- "Sensitive" to the ODB/SQL language: it tries to automatically produce both a coverage plot and a textual report for the user
- The textual report itself can be an invaluable source of information for further post-processing tasks

16 ECMWF ODB Training 2006 slide 16 Running odbviewer
- Go to the database directory:
    cd MYDB
- Run:
    odbviewer -q 'SELECT lat,lon,press,obsvalue \
                  FROM hdr, body \
                  WHERE obstype = 2'
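
Conceptually, this query pairs each hdr row with its body rows, keeps the pairs that satisfy the WHERE condition and returns only the listed columns. The Python sketch below spells out that meaning on the one-observation example. It assumes the two tables are combined through their link (represented here by a nested `body` list) and says nothing about how ODB/SQL actually executes the query; `select_obs` is a hypothetical name.

```python
hdr = [
    {"seqno": 1, "obstype": 2, "codetype": 141, "lat": 45.0, "lon": -15.0,
     "body": [{"varno": 2, "press": 50000.0, "obsvalue": 251.0}]},
]


def select_obs(hdr_rows, obstype):
    """SELECT lat,lon,press,obsvalue FROM hdr, body WHERE obstype = <obstype>."""
    result = []
    for h in hdr_rows:
        if h["obstype"] != obstype:        # WHERE clause
            continue
        for b in h["body"]:                # follow the link to the body rows
            result.append((h["lat"], h["lon"], b["press"], b["obsvalue"]))
    return result


rows = select_obs(hdr, obstype=2)
print(rows)
```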

17 ECMWF ODB Training 2006 slide 17 odbviewer coverage plot [Figure; annotation: "Our observation!"]

18 ECMWF ODB Training 2006 slide 18 Some odbviewer [options]
    -h                  List of options ("help")
    -q 'SQL-stmt'       Provide ODB/SQL statement inline
    -v viewname/poolno  Choose SQL name (and optionally pool number)
    -p "1-10,12,15"     Choose from a subset of pools
    -R                  No radians-to-degrees conversion for (lat,lon)
    -r                  Enforce radians-to-degrees conversion
    -c                  Clean start (i.e. recompile all)
    -e editor           Choose preferred editor
    -e batch            Run in batch mode (same as -e pipe)
    -N                  Do not produce a report at all
    -I                  Do not show plot immediately
    -P projection       Change projection
    -C file.cmap        Supply a color map file
    -A plot_area        Choose plotting area
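
The pool subset given to -p reads as a comma-separated list of single numbers and ranges. A Python sketch of expanding such a specification follows; it assumes that ranges are inclusive at both ends, which the slide does not state explicitly for this option, and `parse_pool_spec` is a hypothetical helper rather than part of ODB.

```python
def parse_pool_spec(spec):
    """Expand a spec such as "1-10,12,15" into a sorted list of pool numbers."""
    pools = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-", 1)
            pools.update(range(int(first), int(last) + 1))   # inclusive range
        else:
            pools.add(int(part))
    return sorted(pools)


selected = parse_pool_spec("1-10,12,15")
print(selected)
```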

19 ECMWF ODB Training 2006 slide 19 ODBTk: The ODB Toolkit
- GUI-based ODB visualisation tool
- Easy way for non-experts to build SQL
- Interactive viewing of observational data
- Can refine the SQL "WHERE" clause as you view the data
- Portable, lightweight application
  - Requires ODB, perl, Fortran90 and C compilers

20 ECMWF ODB Training 2006 slide 20 ODBTk: Building an SQL query
- Twin views on structure
  - Hierarchical structure
    - Allows relationships between tables/columns to be seen
  - "Flat structure"
    - Easy to find a given column/member or table
    - Allows the user to sort the structure
- SQL library
  - Both local and shared

21 ECMWF ODB Training 2006 slide 21 Visualising Coverage

22 ECMWF ODB Training 2006 slide 22 Visualising X-Y plots

23 ECMWF ODB Training 2006 slide 23 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

24 ECMWF ODB Training 2006 slide 24 AMSU-A data after screening [Figure; annotation: "Under 10% left active!"]

25 ECMWF ODB Training 2006 slide 25 ODB within IFS/4DVAR-system [Diagram; labels: ECMA/ODB, CCMA/ODB, Output BUFRs]

26 ECMWF ODB Training 2006 slide 26 A more complex database
- In the real world a database may contain many more tables (>>5) than in the simple example earlier
- Each table can contain 10 to 50 data columns
- There can also be a sophisticated data hierarchy (next slide) to describe potentially complex relationships between tables
- To provide good parallel performance on supercomputers, data tables are further divided into data pools
  - They behave like sub-databases within a database
  - They allow much bigger data sets than would otherwise be possible

27 ECMWF ODB Training 2006 slide 27 Comprehensive data hierarchy

28 ECMWF ODB Training 2006 slide 28 ECMWF BUFR to ODB conversion
- ODBs at ECMWF are normally created by using bufr2odb
  - Enables MPI-parallel, and therefore efficient, database creation
  - Allows retrospective inspection of feedback BUFR data by converting it into ODB
- bufr2odb can also be used interactively, for example:
    bufr2odb -i bufr_input_file -I 1-20 -n 4
- The preceding example creates 4 pools of the ECMA database from the given BUFR input file, but includes only BUFR subtypes from 1 to 20 (inclusive)
- Feedback BUFR to ODB works similarly:
    fb2odb -i feedback_bufr_file -n 8 -u 2

29 ECMWF ODB Training 2006 slide 29 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

30 ECMWF ODB Training 2006 slide 30 Manipulating ODB from Fortran90
- Currently Fortran90 is the only way to fill an ODB database
  - simulobs2odb is also a Fortran90 program underneath
  - Likewise odbviewer and practically any other ODB tool
- Fortran90 is also necessary to fetch and update data
- The ODB Fortran90 interface layer offers a comprehensive set of functions to
  - Open and close a database
  - Attach to and execute precompiled ODB/SQL queries
  - Load, update and store queried data

31 ECMWF ODB Training 2006 slide 31 An example ODB program
    program main
      use odb_module
      implicit none
      integer(4) :: h, rc, nra, nrows, ncols, npools, j, jp
      real(8), allocatable :: x(:,:)
      npools = 0
      ! ODB_open: open the existing database MYDB; npools is passed by keyword
      h = ODB_open('MYDB', 'OLD', npools=npools)
      ! ODB_close: close the database, saving any changes
      rc = ODB_close(h, save=.TRUE.)
    end program main

32 ECMWF ODB Training 2006 slide 32 Data manipulation loop
    DO jp=1,npools
      ! Execute SQL, allocate space, get data into matrix
      rc = ODB_select(h,'sqlview',nrows,ncols,poolno=jp)
      allocate(x(nrows,0:ncols))
      rc = ODB_get(h,'sqlview',x,nrows,ncols,poolno=jp)
      ! Update data, put back to DB, deallocate space
      call update(x,nrows,ncols) ! Not an ODB-routine
      rc = ODB_put(h,'sqlview',x,nrows,ncols,poolno=jp)
      deallocate(x)
      rc = ODB_cancel(h,'sqlview',poolno=jp)
      ! Use the following only with READONLY-databases
      ! rc = ODB_release(h,poolno=jp)
    ENDDO
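
The control flow of this loop (per pool: fetch the view into a matrix, update it, write it back) can be mimicked without ODB. The following Python stand-in uses an in-memory dictionary in place of the database. The functions `select`, `update` and `put` are hypothetical analogues and are not the ODB API, the data values are made up, and the extra column 0 of the Fortran matrix is left out.

```python
import copy

# pool number -> rows of (press, obsvalue); made-up values
database = {
    1: [[50000.0, 251.0], [85000.0, 280.0]],
    2: [[50000.0, 255.0]],
}


def select(db, poolno):
    """Return a private copy of the pool's rows (the 'view' matrix)."""
    return copy.deepcopy(db[poolno])


def update(x):
    """User routine: here, add 1.0 to obsvalue (column index 1)."""
    for row in x:
        row[1] += 1.0


def put(db, poolno, x):
    """Write the modified matrix back to the pool."""
    db[poolno] = x


for jp in sorted(database):      # one pass per data pool
    x = select(database, jp)
    update(x)
    put(database, jp, x)

print(database)
```

Working on a copy mirrors the slide's pattern, in which the database only changes when the updated matrix is explicitly put back.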

33 ECMWF ODB Training 2006 slide 33 Compile, link and run
    (1) use odb                                       # once per session
    (2) odbcomp MYDB.ddl                              # once only; often from file MYDB.sch
    (3) odbcomp sqlview.sql                           # recompile only when changed
    (4) odbf90 main.F90 update.F90 -lMYDB -o main.x   # link
    (5) ./main.x                                      # run

34 ECMWF ODB Training 2006 slide 34 Overview
- Introduction to ODB
- Creating a simple database
  - Use of the simulobs2odb program
  - Visualizing data using odbviewer, ODBTk
- The bigger picture
  - ODB within the IFS/4DVAR system
  - A more complex database
- Manipulating ODB from Fortran90
- Tools: odbless, odbdiff, odbcompress, odbdup, odb2netcdf

35 ECMWF ODB Training 2006 slide 35 odbless
- A textual browser for looking at ODB data page by page (a little like the Unix less command):
  - By default calculates a statistical summary for each retrieved data column
  - Cheap, with a near-optimal ODB data access pattern
  - The user can specify the starting row
- Usage:
    odbless -q 'SELECT column(s) FROM table(s) WHERE …' \
            -s starting_row -n number_of_rows_to_display \
            [-b buffer_size -X]
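
The two ideas behind odbless, a per-column summary and a window of rows chosen by starting row and row count, can be sketched in Python as below. Which statistics odbless actually reports is not stated on the slide, so minimum, maximum and mean are assumptions, as is counting rows from 1. The helper names and data are made up.

```python
from statistics import mean


def column_summary(rows, ncols):
    """Compute min, max and mean for each column of the retrieved rows."""
    summary = []
    for c in range(ncols):
        values = [r[c] for r in rows]
        summary.append({"min": min(values), "max": max(values), "mean": mean(values)})
    return summary


def page(rows, start_row, nrows):
    """Return rows start_row .. start_row+nrows-1, counting from 1."""
    return rows[start_row - 1:start_row - 1 + nrows]


rows = [(50000.0, 251.0), (85000.0, 280.0), (100000.0, 290.0)]   # (press, obsvalue)
summary = column_summary(rows, 2)
shown = page(rows, start_row=2, nrows=2)
print(summary[1], shown)
```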

36 ECMWF ODB Training 2006 slide 36 odbdiff
- Compares two ODB databases for differences
- A very useful tool when trying to identify errors/differences between operational and experimental 4DVAR runs
- Usage:
    odbdiff -q 'SELECT …' DATABASE1 DATABASE2
- By default brings up an xdiff window showing the differences
- If latitude and longitude were given in the data query, it also produces a difference plot using the odbviewer tool
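
The comparison idea, running the same query on two databases and diffing the textual results, can be illustrated with Python's standard `difflib`. This is a sketch of the concept only: odbdiff itself displays differences in an xdiff window, and the row values and formatting used here are made up.

```python
import difflib


def format_rows(rows):
    """Render each (lat, lon, press, obsvalue) row as one fixed-width text line."""
    return ["%8.2f %8.2f %10.1f %8.2f" % r for r in rows]


operational = [(45.0, -15.0, 50000.0, 251.0), (50.0, 10.0, 85000.0, 280.0)]
experimental = [(45.0, -15.0, 50000.0, 251.0), (50.0, 10.0, 85000.0, 280.5)]

diff = list(difflib.unified_diff(
    format_rows(operational), format_rows(experimental),
    fromfile="DATABASE1", tofile="DATABASE2", lineterm=""))

# Keep only the changed rows, skipping the "---"/"+++" file header lines.
changed = [line for line in diff
           if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))]
print("\n".join(diff))
```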

37 ECMWF ODB Training 2006 slide 37 odbcompress
- Enables creation of a very compact database from an existing one
  - For archiving purposes, or for a smaller footprint
- Makes post-processing considerably faster
- At this point the user can choose both
  - Truncating the data precision
  - Leaving out columns that are of less importance
- Early tests show that this new tool achieves compression factors from 2.5X to 11X
  - The higher compression being for satellite data
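
The trade-off behind precision truncation can be shown with plain byte counting. The Python sketch below assumes the simplest possible scheme, storing 64-bit reals as 32-bit reals; the slide does not describe how odbcompress truncates precision, so this is not its method, and the values are made up.

```python
import struct

values = [251.0, 280.123456789, 255.987654321]   # made-up observation values

full = struct.pack("<%dd" % len(values), *values)        # 8 bytes per value
truncated = struct.pack("<%df" % len(values), *values)   # 4 bytes per value

restored = struct.unpack("<%df" % len(values), truncated)
max_error = max(abs(a - b) for a, b in zip(values, restored))
factor = len(full) / len(truncated)
print(factor, max_error)
```

Halving the width gives a factor of 2 at the cost of a small rounding error per value. The larger factors quoted on the slide would need further measures, such as the column removal it mentions.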

38 ECMWF ODB Training 2006 slide 38 odbdup
- Duplicates database(s) by copying metadata (low volume), but shares the actual data (high volume)
- Allows database sharing between multiple users
  - Over a shared (e.g. NFS-mounted) disk
- Enables creation of a time-series database, for example:
    odbdup -i "200601*/ECMA.conv" -o USERDB
- The previous example creates a new database labelled USERDB, which presumably spans all the conventional observations during January 2006
  - The eureka moment: the user now has access to a whole month of data as if it were located in one single database!
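
Which input databases a pattern like "200601*/ECMA.conv" picks up can be checked with Python's `fnmatch`. The directory names below are hypothetical (a date-and-cycle prefix is assumed, and `ECMA.amsua` is an invented counter-example); only the pattern itself comes from the slide.

```python
from fnmatch import fnmatch

candidates = [
    "2005123112/ECMA.conv",
    "2006010100/ECMA.conv",
    "2006013112/ECMA.conv",
    "2006013112/ECMA.amsua",   # hypothetical non-conventional database
    "2006020100/ECMA.conv",
]
pattern = "200601*/ECMA.conv"

matched = [path for path in candidates if fnmatch(path, pattern)]
print(matched)
```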

39 ECMWF ODB Training 2006 slide 39 odb2netcdf
- Translates the given ODB query (or whole ODB table) into a series of NetCDF files, by default one file for each ODB data pool
- Usage:
    odb2netcdf -q 'SELECT …'
- The result files can be viewed with standard NetCDF tools like ncdump and ncview
- The files can also be produced in NetCDF packed format (with the caveat of truncated precision)
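
The precision caveat of packed output follows from how packing usually works in NetCDF files: values are stored as small integers together with `scale_factor` and `add_offset` attributes, and are unpacked as `code * scale_factor + add_offset`. The Python sketch below assumes 16-bit codes and this common convention; whether odb2netcdf packs exactly this way is not stated on the slide, and the values are made up.

```python
def pack_params(vmin, vmax, nbits=16):
    """Choose scale and offset so [vmin, vmax] maps onto the signed integer codes."""
    # Leave the most negative code free (commonly reserved for a fill value).
    scale = (vmax - vmin) / (2 ** nbits - 2)
    offset = (vmax + vmin) / 2.0
    return scale, offset


def pack(value, scale, offset):
    return int(round((value - offset) / scale))


def unpack(code, scale, offset):
    return code * scale + offset


values = [251.0, 263.37, 280.125, 290.0]
scale, offset = pack_params(min(values), max(values))
codes = [pack(v, scale, offset) for v in values]
restored = [unpack(c, scale, offset) for c in codes]
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(codes, max_error)
```

The round-trip error is bounded by half the scale factor, so the wider the value range of a column, the coarser its packed precision.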

40 ECMWF ODB Training 2006 slide 40 Also … Some interesting facts
- Written mainly in C
  - Except the Fortran90 interface and the IFS/4DVAR interface
  - Except BUFR to ODB conversion (by Milan Dragosavac)
- ODB/SQL is currently converted into C code
  - 10 lines of SQL generate >> 100 lines of C code
- A standalone ODB installation (without IFS) is also available
  - Can be built in about 30 minutes on a Linux laptop
- Tested at least on the following machines
  - SGI/Altix, IBM Power3/4, Linux Intel/AMD, VPP, …
- Automatic binary data conversion guarantees database portability between different machines

41 ECMWF ODB Training 2006 slide 41 … and some ODB "limitations"
- ODB software is clearly meant for large-scale computation since, given lots of memory and disk space and fast CPUs:
  - A single program can handle up to 2^31 ODB databases
  - A single database can have up to 2^31 data pools
  - A single database can have any number of tables
  - A single table in a data pool can have up to 2^31 rows and (by default) 9999 columns
  - A single ODB/SQL query over active data pools can retrieve up to 2^31 rows in one go
- These really big numbers show that ODB's potential is on parallel computers, but we haven't forgotten desktop PCs!
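
To get a feel for these limits, a little arithmetic helps. The Python sketch below computes 2^31 and the memory one full-length column would occupy if every value took 8 bytes uncompressed. The 8-byte figure is an assumption borrowed from the `real(8)` matrix in the example program, not a statement about ODB's storage format.

```python
limit = 2 ** 31                        # the row/pool/database limit quoted above
bytes_per_value = 8                    # assumed: 64-bit reals, uncompressed
one_column_bytes = limit * bytes_per_value
one_column_gib = one_column_bytes / 2 ** 30

print(limit, one_column_gib)
```

A single such column already amounts to 16 GiB, which is why the limits matter mainly on large parallel machines.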

42 ECMWF ODB Training 2006 slide 42 Finally…
- ODB software is developed to allow unprecedented amounts of satellite data through the IFS/4DVAR system
  - Software has been operational at ECMWF since June 2000, but is still evolving
  - Emphasis is now on graphical post-processing and on enabling fast access to very large amounts of data
- Other ECMWF member states and co-operating countries are also using, or are just becoming users of, ODB
  - MeteoFrance, DWD, Hungary, Aladin/HIRLAM nations
  - MetOffice is considering it via collaboration with BoM
  - University of Vienna via the ERA40 re-analysis collaboration

