ICOADS Archive Practices at NCAR JCOMM ETMC-III 9-12 February 2010 Steven Worley.

1 ICOADS Archive Practices at NCAR JCOMM ETMC-III 9-12 February 2010 Steven Worley

2 Topics
- Environment setting
- Data management tools and principles
- ICOADS NCAR Release 2.5 contributions
- Background Collections
- Future Challenges

3 Environment Setting
- ICOADS is part of a larger collection called the Research Data Archive (RDA)
- RDA – briefly
  - 600+ datasets (atmosphere, ocean, geosciences)
  - 4.3M files, 462 TB (primary data)
  - unique users annually, including ICOADS
  - Staff: 7 scientific programmers (M.S. degrees), me, and an administrative assistant

4 Data management principles
- Always archive 2 copies of observational data
  - 3rd copy at a partner center (disaster recovery)
- Free and open data access world-wide
  - Internet
  - Past – other media, CD-ROMs, tapes, etc.
- Share what we have to build archives
  - E.g. digitization of Maury data in China in exchange for global land surface data

5 Data Management Tools
- Old System: specialized software to manage each data input. Inefficient, difficult to scale.
- New System: common RDA tools that homogenize data management. Efficient, scalable.
[Diagram labels: Unidata Server, University Server, NWP Server, GCMD Metadata Server, RDA Metadata Database, RDA Data Server, Online Disk, Tape Storage, Specialized Software Packages 1-3 (old system), RDA Data Management Common Tool Set (new system)]

6 Data Management tools – a few details
- Common scripting structure to do routine dataset updates (dsupdt)
  - Very tunable
  - Frequency, multiple server priority list, validation
  - Fully integrated with RDADB
  - Users' view is updated automatically and is therefore always current
- Common single archiving function (dsarch)
  - Location and copy control (MSS/HPSS storage, and online disk)
  - Fills all DB entries (e.g. file and dataset relationships)
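
The "multiple server priority list, validation" idea can be illustrated with a minimal Python sketch. This is not the dsupdt implementation, which the slides do not show; the function name `fetch_with_priority`, the stand-in fetcher, and the `.example` server names are all hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple


def fetch_with_priority(
    servers: Iterable[str],
    fetch: Callable[[str], bytes],
    validate: Callable[[bytes], bool],
) -> Optional[Tuple[str, bytes]]:
    """Try each server in priority order; return the first payload that validates."""
    for server in servers:
        try:
            payload = fetch(server)
        except OSError:
            continue  # server unreachable: fall through to the next one
        if validate(payload):
            return server, payload
    return None  # no server produced a valid payload


# Hypothetical stand-in for a real download, so the sketch runs offline.
_FAKE_REMOTE = {"primary.example": b"", "backup.example": b"obs-data"}


def fake_fetch(server: str) -> bytes:
    if server not in _FAKE_REMOTE:
        raise OSError(f"cannot reach {server}")
    return _FAKE_REMOTE[server]


result = fetch_with_priority(
    ["down.example", "primary.example", "backup.example"],
    fake_fetch,
    validate=lambda data: len(data) > 0,  # assumed check: payload is non-empty
)
print(result)
```

Here the first server is unreachable and the second returns an empty payload, so the update falls back to the third server in the list.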

7 Data management tools
- Harvest file level metadata (gatherxml)
  - Handle various formats (GRIB1, GRIB2, netCDF, BUFR, IMMA, ON29, etc.)
  - Save as and populate DB
- Benefits
  - Problem detection
  - Versioning, replacement, extension
  - Inventory information
  - Drive better data service for users
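
As a rough illustration of what file-level metadata harvesting can yield (inventory information and problem detection), here is a small Python sketch. It is not gatherxml; the function `harvest_metadata`, the record layout, the file name, and the coordinate checks are assumptions made for the example.

```python
from datetime import date


def harvest_metadata(file_name, records):
    """Summarize one file: record count, time range, and rows with bad coordinates."""
    # Assumed validity check: latitude in [-90, 90], longitude in [-180, 360].
    problems = [
        i
        for i, rec in enumerate(records)
        if not (-90.0 <= rec["lat"] <= 90.0 and -180.0 <= rec["lon"] <= 360.0)
    ]
    good = [rec for i, rec in enumerate(records) if i not in problems]
    summary = {"file": file_name, "count": len(records), "problem_rows": problems}
    if good:  # the time range is only defined when at least one row is usable
        summary["start"] = min(rec["date"] for rec in good)
        summary["end"] = max(rec["date"] for rec in good)
    return summary


# Hypothetical observations; the third row has an impossible latitude.
sample = [
    {"date": date(2009, 1, 20), "lat": 10.0, "lon": 220.0},
    {"date": date(2009, 1, 3), "lat": -45.5, "lon": 15.0},
    {"date": date(2009, 1, 9), "lat": 95.0, "lon": 30.0},
]
summary = harvest_metadata("example_2009_01.imma", sample)
print(summary)
```

A summary like this could be stored per file and used to build inventories or to flag files that need a closer look.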

8 Data management tools
- Provide access to data in tape storage archive (dsrqst)
  - Relatively new, not yet universally available across RDA
  - Delayed mode – with DB control (many details)
- Why – RDA holds 462 TB
  - 40 TB online – most popular small scale products
  - Access to more products for greater community

9 ICOADS Release 2.5 NCAR
- Data Preparation – format evaluations, translate native formats to IMMA format
  - Moored research buoy delayed mode archives
    - TAO, PIRATA (PMEL, JAMSTEC)
  - World Ocean Database 2005
    - Multiple ocean profile types (NODC)
- Receive/archive ICOADS data processing results
  - NOAA/ESRL does processing: source merging, duplicate elimination, preconditioning deletion and fixes, etc.

10 ICOADS Release 2.5 NCAR
- Create and maintain user data access interfaces
  - File access
    - IMMA and binary (observations, monthly summary statistics)
  - Sub-selection (time, space, parameter)
    - Example coming.
    - Output is ASCII tabular format
    - Runs automatically – nearly all requests completed in 10 minutes
- Keep user metrics
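
Sub-selection by time, space, and parameter with ASCII tabular output can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the record fields, the sample values, and the functions `subset` and `to_ascii_table` are hypothetical and do not describe the NCAR interface. For simplicity the longitude test does not handle boxes that cross the dateline.

```python
from datetime import date

# Hypothetical observations; field names and values are made up for the example.
RECORDS = [
    {"date": date(2008, 1, 5), "lat": 10.0, "lon": -140.0, "sst": 27.1, "slp": 1011.2},
    {"date": date(2008, 2, 9), "lat": 45.5, "lon": -30.0, "sst": 12.4, "slp": 1002.8},
    {"date": date(2009, 1, 1), "lat": 12.0, "lon": -135.0, "sst": 26.8, "slp": 1010.5},
]


def subset(records, start, end, lat_range, lon_range, params):
    """Keep records inside the time window and lat/lon box; return string rows."""
    rows = []
    for rec in records:
        if not (start <= rec["date"] <= end):
            continue
        if not (lat_range[0] <= rec["lat"] <= lat_range[1]):
            continue
        if not (lon_range[0] <= rec["lon"] <= lon_range[1]):
            continue
        row = [rec["date"].isoformat(), f"{rec['lat']:.2f}", f"{rec['lon']:.2f}"]
        row += [str(rec[p]) for p in params]  # only the requested parameters
        rows.append(row)
    return rows


def to_ascii_table(header, rows):
    """Left-align every column to the width of its widest cell."""
    widths = [max(len(cell) for cell in col) for col in zip(header, *rows)]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in [header] + rows
    ]
    return "\n".join(lines)


params = ["sst"]
rows = subset(RECORDS, date(2008, 1, 1), date(2008, 12, 31), (0.0, 20.0), (-150.0, -130.0), params)
table = to_ascii_table(["date", "lat", "lon"] + params, rows)
print(table)
```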

11 ICOADS Release 2.5 NCAR
- Near-term preliminary extensions to R2.5
  - Beginning with data in 2008 and forward
  - Based on NCEP GTS compilation/merge
  - Runs on day 2 of each month – processes previous month.
  - Create IMMA observations and binary monthly summary statistics
  - Harvest file level metadata
  - Do all archiving of original and processed files
  - Automatically update user interfaces
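
The schedule "runs on day 2 of each month, processes previous month" reduces to a small date calculation. The Python sketch below shows one way to compute it; the function names `should_run` and `previous_month` are hypothetical and not taken from the NCAR scripts.

```python
from datetime import date, timedelta
from typing import Tuple


def should_run(today: date, run_day: int = 2) -> bool:
    """True on the day of the month when the monthly update is scheduled."""
    return today.day == run_day


def previous_month(run_date: date) -> Tuple[int, int]:
    """Return (year, month) of the month before run_date, handling January."""
    last_day_of_previous = run_date.replace(day=1) - timedelta(days=1)
    return last_day_of_previous.year, last_day_of_previous.month


today = date(2010, 1, 2)
if should_run(today):
    year, month = previous_month(today)
    print(f"processing {year}-{month:02d}")
```

Stepping back one day from the first of the month avoids special-casing the January to December rollover.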

12 Brief drive through NCAR

13 World-wide User Access

14 File Level Metadata – ICOADS IMMA Example

18 8 pages of information like this

19 A look at 2009

21 What is happening in 2009?

22 World-wide User Access

26 Similar service for the monthly summary statistics

27 Who uses the sub-setting interfaces? Countries

28 Background Collections
- Historical
  - Most complete set of ALL source data used to create ALL ICOADS Releases
    - Beginning in mid-1980s
  - Copies of ALL ICOADS Releases
  - We do not delete any files

29 Background Collections
- Ongoing / Routine data receipts
  - Format conversions are done at NCDC

Description         Source       Frequency
Marine Surface GTS  NCEP (BUFR)  Monthly
Marine Surface GTS  NCDC (IMMA)  Monthly
SEAS                NCDC (IMMA)  Monthly
Keyed               NCDC (IMMA)  Monthly (nominally)
GCC                 NCDC (IMMA)  Quarterly (nominally)
VOSClim             NCDC (IMMA)  Monthly

30 Future Challenges
- Eliminate user interface dependency on Java applets – deploy JavaScript instead.
- Support “advanced” ICOADS initiative
  - Bias adjusted / corrected observations
  - Serve as a central DB / handle data ingest
  - Build a user interface
- Continue as a full U.S. partner.

