ECMWF A short ODB Training 2007 slide 1 Introduction to Observational DataBase (ODB) 25-Apr-2007.

Slide 1: Introduction to Observational DataBase (ODB), 25-Apr-2007

Slide 2: Overview
- Introduction to ODB
- Creating a simple database
- Use of the simulobs2odb program
- Visualizing data using basic odbviewer
- More complex databases
- ODB within the IFS/4DVAR system
- Manipulating ODB data from Fortran90
- A few tools: odbsql, odbdiff, odbcompress, odbdup, odb2netcdf
- ODBTk: a GUI-based ODB visualisation toolkit (a separate presentation and demo by Paul Burton)

Slide 3: Introduction to ODB
- ODB is tailor-made (hierarchical) database software developed at ECMWF to manage very large observational data volumes through the ECMWF IFS/4DVAR system on highly parallel supercomputer systems
- ODB also enables flexible post-processing of observational data, even on a desktop computer
- ODB software is written in C and Fortran90 and is available on virtually any Unix system (and now also on Windows/CYGWIN)
- The software can be installed from source code (tarball), normally in less than an hour

Slide 4: Snapshot of AIRS channel #1873 brightness temperature

Slide 5: A snapshot of SATOB/AMV winds

Slide 6: One month of averaged brightness temperature: HIRS, channel #4

Slide 7: … Introduction to ODB
An observational database usually contains the following items:
- Observation identification, position and time coordinates
- Observation value, pressure levels, channel numbers
- Various quality control flags
- Observation departures from background and analysis fields
- Satellite-specific information
- Other closely related information
All information can be accessed via the ODB/SQL language and the Fortran90 interface.
Direct (read-only) access to ODB data is now also available: no programming effort is needed to scan ODB data.

Slide 8: Basic components of ODB
ODB/SQL language
- Data Definition Language: describes which data items belong to the database, what their data types are and how they are related (if at all) to each other
- Data Query Language: queries and returns a subset of data that satisfies user-specified conditions. This is the key feature of the ODB software!
Fortran90 interface layer
- Data manipulation: create, update and remove data
- Execute ODB/SQL queries and retrieve filtered data
- Control MPI and OpenMP parallelization

Slide 9: Creating a simple ODB database
- We will create a very simple database using text files
- The 3 text files describe
  - Data layout, i.e. what data items will go into ODB
  - Location and time information of observations
  - Actual observation measurement information for each location at the given pressure levels
- Feed these files into the simulobs2odb program
- Discover the data values in the database by using odbviewer

Slide 10: Data definition layout: MYDB.ddl

    CREATE TABLE hdr AS (
      seqno pk1int,
      obstype pk1int,
      codetype pk1int,
      lat pk9real,
      lon pk9real,
      date yyyymmdd,
      time hhmmss,
    );

    CREATE TABLE body AS (
      entryno pk1int,
      varno pk1int,
      vertco_type pk1int,
      press pk9real,
      obsvalue pk9real,
    );

Slide 11: Input file #2: hdr.txt

    #hdr obstype = 2 codetype = 141 seqno lat lon date time body.len

Slide 12: Input file #3: body.txt

    #body entryno varno vertco_type press obsvalue

Slide 13: Running simulobs2odb
- Initialize the ODB interactive environment:
    use odb
- Create the database using the following simple command:
    simulobs2odb -l MYDB -i hdr.txt -i body.txt
- As a result of these commands, a small database called MYDB has been created. It contains one data pool with two tables, hdr and body, which are linked (related) to each other via a data type
- It is now easy to extend the database by providing more data, specifying more data items, adding more tables, or all of the above at the same time
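
The link between hdr and body can be pictured with a short Python sketch. This is not ODB code. It assumes that each hdr row carries a length (the body.len column of hdr.txt) and that the body rows of consecutive hdr rows are stored contiguously, so that each offset is a running sum of the preceding lengths. All data values are invented for illustration.

```python
from itertools import accumulate

# Hypothetical header rows: (seqno, lat, lon, body_len); values are illustrative only
hdr = [
    (1, 60.2, 24.9, 2),
    (2, 51.5, -0.1, 3),
]

# Hypothetical body rows: (entryno, press, obsvalue), stored contiguously per header
body = [
    (1, 100000.0, 281.5),
    (2, 85000.0, 272.1),
    (1, 100000.0, 284.0),
    (2, 85000.0, 276.3),
    (3, 50000.0, 252.7),
]

# Assumed link layout: offset of a header's first body row = sum of preceding lengths
lengths = [row[3] for row in hdr]
offsets = [0] + list(accumulate(lengths))[:-1]


def body_rows(i):
    """Return the body rows that belong to header row i."""
    start = offsets[i]
    return body[start:start + lengths[i]]


# Walk the hierarchy: every header row followed by its own body rows
for i, (seqno, lat, lon, _) in enumerate(hdr):
    for entryno, press, obsvalue in body_rows(i):
        print(seqno, lat, lon, entryno, press, obsvalue)
```

With such a layout a query over "hdr, body" never has to search for matching rows: the header row itself says where its body rows start and how many there are.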

Slide 14: Visualizing with odbviewer
- History: odbviewer was originally written as a debugging tool for ODB software development
- Linked with the ECMWF graphics package MAGICS/MAGICS++
- Displays coverage plots
- Also a textual report generator
- Displays output of data queries
- Sensitive to the ODB/SQL language: tries to produce both a coverage plot and a textual report for the user automatically
- The textual report itself can be an invaluable source of information for further post-processing tasks
- Makes use of the new and more economical tool odbsql

Slide 15: Running odbviewer
- Go to the database directory:
    cd MYDB
- Run:
    odbviewer -q "SELECT lat,lon,press,obsvalue \
                  FROM hdr, body \
                  WHERE obstype = 2"
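
What such a query returns can be tried out without an ODB installation by using Python's built-in sqlite3 module as a stand-in. This is only an analogy: ODB/SQL joins hdr and body through their link, whereas standard SQL needs an explicit key, so the sketch assumes a seqno column in body that is not part of MYDB.ddl. All data values are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE hdr  (seqno INTEGER, obstype INTEGER, codetype INTEGER,
                       lat REAL, lon REAL);
    -- body.seqno is an assumption: it replaces the ODB link between the tables
    CREATE TABLE body (seqno INTEGER, entryno INTEGER, varno INTEGER,
                       press REAL, obsvalue REAL);
""")

con.executemany("INSERT INTO hdr VALUES (?,?,?,?,?)", [
    (1, 2, 141, 60.2, 24.9),   # obstype 2: selected by the query
    (2, 1, 11, 51.5, -0.1),    # obstype 1: filtered out
])
con.executemany("INSERT INTO body VALUES (?,?,?,?,?)", [
    (1, 1, 1, 100000.0, 281.5),
    (1, 2, 1, 85000.0, 272.1),
    (2, 1, 39, 101300.0, 288.0),
])

# Standard-SQL counterpart of the odbviewer query, with the join spelled out
rows = con.execute("""
    SELECT lat, lon, press, obsvalue
    FROM hdr JOIN body ON hdr.seqno = body.seqno
    WHERE obstype = 2
    ORDER BY body.entryno
""").fetchall()

for row in rows:
    print(row)
```

Each result row pairs the position of one observation report with one of its measurements, which is the information odbviewer needs for a coverage plot and a textual report.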

Slide 16: odbviewer coverage plot (annotation on the plot: "Our observation!")

Slide 17: Some odbviewer options

    -h                  List of options (gimme some help!)
    -q SQL-stmt         Provide ODB/SQL statement inline
    -v viewname/poolno  Choose SQL name (and optionally pool number)
    -p 1-10,12,15       Choose from a subset of pools
    -R                  No radians-to-degrees conversion for (lat,lon)
    -r                  Enforce radians-to-degrees conversion
    -k                  Show (lat,lon) in degrees even if they were in radians in DB
    -c                  Clean start (i.e. recompile all)
    -e editor           Choose preferred editor
    -e batch            Run in batch mode (same as -e pipe)
    -N                  Do not produce a report at all
    -I                  Do not show plot immediately
    -P projection       Change display projection
    -C file.cmap        Supply a color map file
    -A plot_area        Choose plotting area
    -F                  (En)force use of the old-style odbviewer over odbsql
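
The pool subset syntax of the -p option (for example 1-10,12,15) can be read as a comma-separated list of single pool numbers and inclusive ranges. The Python sketch below shows that reading. The function name parse_pool_spec is hypothetical, and the inclusive-range interpretation is an assumption based on the example in the option list, not on odbviewer's source code.

```python
def parse_pool_spec(spec):
    """Expand a pool specification such as "1-10,12,15" into a sorted list.

    Assumption: "a-b" means every pool from a to b inclusive.
    """
    pools = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            first, last = part.split("-", 1)
            pools.update(range(int(first), int(last) + 1))
        else:
            pools.add(int(part))
    return sorted(pools)


print(parse_pool_spec("1-10,12,15"))
```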

Slide 18: More complex databases
- In reality, databases usually contain many more tables (>>5) than in the simple example earlier
- Each table can contain 1050 data columns
- There can also be a sophisticated data hierarchy (see the next slide) to describe potentially quite complex relationships between tables
- To provide good parallel performance on supercomputers, data tables are furthermore divided into data pools, which also enables parallel I/O:
  - They behave like sub-databases within a database
  - They allow much bigger data sets than otherwise possible
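
The idea of dividing a table into pools and working on the pools in parallel can be illustrated with a short Python sketch. The round-robin assignment of rows to pools and the helper names split_into_pools and pool_sum are assumptions for illustration. The slides do not say how ODB distributes rows, and ODB itself parallelizes with MPI and OpenMP rather than with Python threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_pools(rows, npools):
    """Distribute rows over npools pools (round-robin; an assumed policy)."""
    pools = [[] for _ in range(npools)]
    for i, row in enumerate(rows):
        pools[i % npools].append(row)
    return pools


def pool_sum(pool):
    """Work done independently on one pool, here a simple partial sum."""
    return sum(pool)


# Ten made-up observation values, split into 4 pools
obsvalues = [float(v) for v in range(1, 11)]
pools = split_into_pools(obsvalues, 4)

# Each pool is processed on its own; only the small partial results are combined
with ThreadPoolExecutor(max_workers=4) as executor:
    partial_sums = list(executor.map(pool_sum, pools))

total = sum(partial_sums)
print(partial_sums, total)
```

Because each pool is self-contained, no worker needs data held by another one, which is what lets a pool behave like a sub-database.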

Slide 19: Comprehensive data hierarchy

Slide 20: ODB within the IFS/4DVAR system (diagram labels: ECMA/ODB, CCMA/ODB, output BUFRs)

Slide 21: AMSU-A data before screening

Slide 22: AMSU-A data after screening: under 10% left active!

Slide 23: Typical ODB usage at ECMWF …
- A database can be created interactively or in batch mode
  - We usually run our in-house BUFR2ODB in batch mode
  - New observation types can also be fed in via a text file
- For complete database manipulation, prefer the Fortran90 interface; any read-only database can also be accessed via a rudimentary client-server interface (C/C++)
- Another possibility is to run the new tool, odbsql
  - No need to use the ODB/SQL compilation system
  - No need to write a single line of Fortran90
  - The tool is under development

Slide 24: … Typical ODB usage at ECMWF
- Once a database has been created, the application program queries data via precompiled ODB/SQL and places the result data (also known as a view) into a data matrix allocated by the user program
- There can be virtually any number of active views at any given time. These can be updated and fed back to the database
- Thanks to ODB, the use of WMO BUFR has been minimized at ECMWF in order to enable faster and more robust processing of observations

Slide 25: ECMWF BUFR to ODB conversion
- ODBs at ECMWF are normally created by using bufr2odb
  - Enables MPI-parallel database creation, which is efficient
  - Allows retrospective inspection of feedback BUFR data by converting it into ODB (slow, and not all data is in BUFR)
- bufr2odb can also be used interactively, for example:
    bufr2odb -i bufr_input_file -I 1-20 -n 4
- The preceding example creates 4 pools of the ECMA database from the given BUFR input file, but includes only BUFR subtypes from 1 to 20 (inclusive)
- Feedback BUFR to ODB works similarly:
    fb2odb -i feedback_bufr_file -n 8 -u 2

Slide 26: Manipulating ODB from Fortran90
- Currently Fortran90 is the only way to fill an ODB database
  - simulobs2odb is also a Fortran90 program underneath
  - likewise odbviewer, or practically any other ODB tool
- Also: Fortran90 is necessary to fetch and update data
- The ODB Fortran90 interface layer offers a comprehensive set of functions to
  - Open and close a database
  - Attach to and execute precompiled ODB/SQL queries
  - Load, update and store queried data

Slide 27: An example ODB program

    program main
      use odb_module
      implicit none
      integer(4) :: h, rc, nra, nrows, ncols, npools, j, jp
      real(8), allocatable :: x(:,:)
      npools = 0
      ! ODB_open: open the existing database, returns its handle and pool count
      h = ODB_open("MYDB", "OLD", npools=npools)
      ! ... data manipulation loop goes here ...
      ! ODB_close: close the database and save the changes
      rc = ODB_close(h, save=.TRUE.)
    end program main

Slide 28: Data manipulation loop

    DO jp=1,npools
      ! Execute SQL, allocate space, get data into matrix
      rc = ODB_select(h,"sqlview",nrows,ncols,poolno=jp)
      allocate(x(nrows,0:ncols))
      rc = ODB_get(h,"sqlview",x,nrows,ncols,poolno=jp)
      ! Update data, put back to DB, deallocate space
      call update(x,nrows,ncols)   ! Not an ODB routine
      rc = ODB_put(h,"sqlview",x,nrows,ncols,poolno=jp)
      deallocate(x)
      rc = ODB_cancel(h,"sqlview",poolno=jp)
      ! Use the following only with READONLY databases
      ! rc = ODB_release(h,poolno=jp)
    ENDDO

Slide 29: Compile, link and run

    (1) use odb                                      # once per session
    (2) odbcomp MYDB.ddl                             # once only; often from file MYDB.sch
    (3) odbcomp sqlview.sql                          # recompile only when changed
    (4) odbf90 main.F90 update.F90 -lMYDB -o main.x  # link
    (5) ./main.x                                     # run

Slide 30: ODB/SQL compilation system

Slide 31: odbsql
- A new tool to access ODB data in read-only mode
- Does not generate C code, but dives directly into the data
- Usually faster than generated C code, with the exception of accessing large amounts of satellite data (investigated)
- The tool is under active development right now
- Usage:
    odbsql -q 'SELECT column(s) FROM table(s) WHERE …' \
           -s starting_row -n number_of_rows_to_display \
           [-X] [other_options]

Slide 32: ODB/SQL examples (1)

    SET $t2m = 39;     // Scalar parameters, whose values …
    SET $synop = 1;    // … can be overridden in Fortran90

    CREATE VIEW t2m AS
      SELECT an_depar, fg_depar, lat, lon, obsvalue
      FROM hdr, body
      WHERE obstype = $synop         // Give me synops
        AND varno = $t2m             // Give me 2 meter temperatures
        AND obsvalue IS NOT NULL     // Don't want missing data
    ;

Slide 34: ODB/SQL examples (2)

    SELECT count(*), avg(obsvalue), stdev(fg_depar)
    FROM hdr, body
    WHERE obstype = $synop && varno = $t2m
      AND obsvalue IS NOT NULL;

    // Observation count per (obstype,codetype)-pair :
    SELECT obstype, codetype, count(*)
    FROM hdr;

    SELECT varno, avg(fg_depar), CORR(fg_depar, an_depar)
    FROM body
    WHERE fg_depar IS NOT NULL;
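
The grouping behaviour of these aggregate queries can be reproduced with Python's built-in sqlite3 module. This is a stand-in sketch, not ODB/SQL: the slide's queries carry no GROUP BY clause, while standard SQL needs one to get a count per (obstype, codetype) pair, so the clause below is part of the translation. The stdev and CORR functions are left out because SQLite does not provide them. All data values are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Observation count per (obstype, codetype) pair
con.execute("CREATE TABLE hdr (seqno INTEGER, obstype INTEGER, codetype INTEGER)")
con.executemany("INSERT INTO hdr VALUES (?,?,?)", [
    (1, 1, 11), (2, 1, 11), (3, 1, 14), (4, 2, 141),
])
counts = con.execute("""
    SELECT obstype, codetype, count(*)
    FROM hdr
    GROUP BY obstype, codetype
    ORDER BY obstype, codetype
""").fetchall()
print(counts)

# Average first-guess departure per varno, skipping missing values
con.execute("CREATE TABLE body (varno INTEGER, fg_depar REAL)")
con.executemany("INSERT INTO body VALUES (?,?)", [
    (39, 0.5), (39, -0.5), (39, None), (1, 2.0),
])
averages = con.execute("""
    SELECT varno, avg(fg_depar)
    FROM body
    WHERE fg_depar IS NOT NULL
    GROUP BY varno
    ORDER BY varno
""").fetchall()
print(averages)
```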

Slide 36: odbdiff
- Enables comparison of two ODB databases for differences
- A very useful tool when trying to identify errors/differences between operational and experimental 4DVAR runs
  - Usually a non-trivial task
- Usage:
    odbdiff -q 'SELECT …' /dir1/DATABASE1 /dir2/DATABASE2
- By default the command brings up an xdiff window showing the differences
- If latitude and longitude are also given in the data query, it also produces a difference plot using the odbviewer tool

Slide 37: odbcompress
- Creates very compact databases from existing ones for archiving purposes, or for a smaller database footprint (disk occupancy)
- Makes post-processing considerably faster
- The user can choose to
  - Truncate the data precision, and/or
  - Leave out columns that are of lesser importance
- Typical compression ratios vary between 2.5x and 11x; the high compression is achieved for satellite data!
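
The two choices offered to the user, truncating precision and leaving out columns, can be illustrated with a small Python sketch. It is not the odbcompress algorithm, which the slides do not describe. It only assumes that values start as 8-byte reals, are truncated to 4-byte reals, and that one of three columns is dropped. Column names and values are invented.

```python
import struct

# Made-up rows: (obsvalue, fg_depar, less_important_column)
rows = [
    (281.537, 0.25, 7.0),
    (272.125, -0.5, 3.0),
]

# Full footprint: three 8-byte reals per row
full = b"".join(struct.pack("<3d", *row) for row in rows)

# Compact footprint: drop the third column and store 4-byte reals
compact = b"".join(struct.pack("<2f", row[0], row[1]) for row in rows)

ratio = len(full) / len(compact)
print(len(full), len(compact), ratio)

# The price of truncation: the value read back is close, but not identical
restored = struct.unpack("<f", struct.pack("<f", 281.537))[0]
print(restored, abs(restored - 281.537))
```

Halving the width of each value and dropping one column in three already gives a ratio of 3 in this toy case, which falls within the range quoted on the slide.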

Slide 38: odbdup/odbmerge
- Allows, for example, database sharing between multiple users
  - Over shared (e.g. NFS, Lustre, GPFS, GFS) disks
- Duplicates [merges] database(s) by copying the metadata (low in volume), but shares the actual (high-volume) binary data
- Also enables creation of a time-series database, for example:
    odbdup -i */ECMA.conv -o USERDB
- The previous example creates a new database labelled USERDB, which presumably spans all conventional observations during January 2007
- The main point: the user now has access to a whole month of data as if it were a single database!

Slide 39: odb2netcdf
- Translates the result of a given ODB query (or a whole ODB table) into a series of NetCDF files, by default one file for each ODB data pool (i.e. partition)
- Usage:
    odb2netcdf -q 'SELECT …' [-p pool_number] [-P]
- The result files can be viewed with standard NetCDF tools like ncdump and ncview
- The files can also be created in the NetCDF packed format (caveat: truncated data precision) if the -P option is used

Slide 40: Some interesting facts on ODB
- Written mainly in C
  - Except the Fortran90 interface and the IFS/4DVAR interface
  - Except BUFR to ODB conversion (by Milan Dragosavac, ECMWF)
- ODB/SQL is currently converted into C code
  - 10 lines of SQL generate >> 100 lines of C code
- A standalone ODB installation (without IFS) is also available
- Tested at least on the following machines
  - SGI/Altix, IBM Power3/4/5, Linux Intel/AMD
  - Fujitsu VPPs, NEC SX, Cray XT3/4
- Automatic binary data conversion guarantees database portability between different machines

Slide 41: … and some ODB limitations
ODB software is clearly meant for large-scale computation since, given lots of memory and disk space and fast CPUs:
- A single program can handle up to 2^31 ODB databases
- A single database can have up to 2^31 data pools
- A single database can have any number of tables
- A single table in a data pool can have up to 2^31 rows and (by default) 9999 columns
- A single ODB/SQL query over active data pools can retrieve up to 2^31 rows in one go
These really big numbers show that ODB's potential is on parallel computers. Yet we haven't forgotten the PCs!
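
To get a rough sense of scale for the row limit, the arithmetic below works out what a single fully populated column would occupy. It assumes one unpacked 8-byte real per value, matching the real(8) data matrix of the Fortran90 example. Actual ODB storage is packed and will differ.

```python
# Limits quoted on the slide
max_rows = 2 ** 31          # rows per table in one data pool
max_columns = 9999          # default column limit

# Assumption: each value held as an unpacked 8-byte real
bytes_per_value = 8

bytes_per_column = max_rows * bytes_per_value
gib_per_column = bytes_per_column / 2 ** 30

print(max_rows)             # number of rows at the limit
print(gib_per_column)       # GiB needed for one full column
```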

Slide 42: Finally…
- ODB software was developed to allow unprecedented amounts of satellite data through the IFS/4DVAR system
- The software has been operational at ECMWF since June 2000, but is still evolving
- Emphasis is now on graphical post-processing and on how to enable fast access to very large amounts of data
- Who is using ODB outside ECMWF? At least …
  - MeteoFrance, Hungarian MS, SMHI, FMI
  - Aladin and some HIRLAM nations
  - Australian Bureau of Meteorology
  - University of Vienna, via the ERA40 re-analysis collaboration
