Space Physics Interactive Data Resource – SPIDR
Dr. Mikhail ZHIZHIN (Geophysical Center Russian Acad. Sci.)
Dr. Eric KIHN (National Geophysical Data Center NOAA)
Co-Authors:
Mr. Dmitry MEDVEDEV (Geophysical Center Russian Acad. Sci.)
Mr. Rob REDMON (National Geophysical Data Center NOAA)
Mr. Dmitry MISHIN (Institute of Physics of the Earth Russian Acad. Sci.)
50 years ago – International Geophysical Year – IGY1957 Total data volume ~ 1 Gb Exchange ~ 1 Mb/year
Yesterday – databases, Internet, web – Y2K Total data volume ~ 1 Tb Exchange ~ 1 Gb/year
Tomorrow – Electronic Geophysical Year – EGY2007 Total data volume ~ 1 Pb Exchange ~ 1 Tb/year
SPIDR mission SPIDR is a de facto standard data source on solar-terrestrial physics, functioning within the framework of the ICSU World Data Centers. It is a distributed database and application server network, built to select, visualize and model historical space weather data distributed across the Internet. SPIDR can work as a fully functional web application (portal) or as a grid of web services, providing functions for other applications to access its data holdings.
SPIDR databases Currently SPIDR archives include solar activity and solar wind data, geomagnetic variations and indices, ionospheric and cosmic-ray data, radio-telescope ground observations, and telemetry and images from NOAA, NASA, and DMSP satellites. SPIDR database clusters and portals are installed in the USA, Russia, China, Japan, Australia, South Africa, and India.
SPIDR components The SPIDR portal combines the central XML metadata repository with a set of distributed data web services and data file collections. A user can search for data using the metadata inventory, use a persistent data basket to save the selection for the next session, and plot or download the selected data in parallel in different formats, including XML and NetCDF.
Data service: common data model serialization + URL All grid data services in SPIDR share the same Common Data Model and compatible metadata schema.
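A minimal sketch of what a shared data model with one serialization could look like, so that every service emits the same structure. The slides do not show the actual SPIDR Common Data Model; the `TimeSeries` class, its fields, and the XML element names below are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import xml.etree.ElementTree as ET


@dataclass
class TimeSeries:
    """Hypothetical common record: one parameter from one station."""
    station: str
    parameter: str
    units: str
    samples: list = field(default_factory=list)  # (datetime, float) pairs


def to_xml(series: TimeSeries) -> str:
    """Serialize a record to XML; element and attribute names are illustrative."""
    root = ET.Element("series", station=series.station,
                      parameter=series.parameter, units=series.units)
    for when, value in series.samples:
        ET.SubElement(root, "sample", time=when.isoformat(), value=repr(value))
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> TimeSeries:
    """Rebuild a record from the XML produced by to_xml."""
    root = ET.fromstring(text)
    samples = [(datetime.fromisoformat(s.get("time")), float(s.get("value")))
               for s in root.findall("sample")]
    return TimeSeries(root.get("station"), root.get("parameter"),
                      root.get("units"), samples)
```

Because every service would read and write the same record type, a client needs only one parser regardless of which database the data came from.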
Local and/or remote data service: output data stream A local data source (accessed via JDBC) and a remote data service (accessed via SOAP) can be used at the same time. The protocol for each source is defined by the SPIDR configuration.
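The configuration-driven choice between a local and a remote source can be sketched as two interchangeable classes behind one interface. This is an illustration only: `LocalSource`, `RemoteSource`, `build_source`, and the `"protocol"` configuration key are hypothetical names, an in-memory dictionary stands in for the JDBC connection, and a plain callable stands in for the SOAP call.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Common interface: callers do not care where the rows come from."""

    @abstractmethod
    def fetch(self, table, start, end):
        ...


class LocalSource(DataSource):
    """Stand-in for a direct database connection (JDBC in SPIDR)."""

    def __init__(self, rows):
        self._rows = rows  # dict: table name -> list of (time, value) rows

    def fetch(self, table, start, end):
        return [r for r in self._rows.get(table, []) if start <= r[0] <= end]


class RemoteSource(DataSource):
    """Stand-in for a remote web service (SOAP in SPIDR)."""

    def __init__(self, call):
        self._call = call  # any callable that performs the remote request

    def fetch(self, table, start, end):
        return self._call(table, start, end)


def build_source(config, local_rows=None, remote_call=None):
    """Pick the implementation named in the (hypothetical) configuration."""
    kind = config["protocol"]
    if kind == "local":
        return LocalSource(local_rows or {})
    if kind == "remote":
        return RemoteSource(remote_call)
    raise ValueError(f"unknown protocol: {kind}")
```

The calling code only ever sees `fetch`, so switching a database from local to remote access would be a configuration change rather than a code change.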
Data upload and synchronization: input data stream A database administrator can upload new files into the SPIDR databases using the web services directly or through the web portal. SPIDR databases are self-synchronizing via the web services.
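The slides do not describe how the self-synchronization works. One simple way to picture it, purely as an assumption, is that each node compares record keys with a peer and copies whatever it lacks. The functions `missing_keys` and `synchronize` below are hypothetical, and plain dictionaries stand in for the databases.

```python
def missing_keys(local, remote):
    """Keys present in the remote store but absent from the local one."""
    return sorted(set(remote) - set(local))


def synchronize(a, b):
    """Copy records each side lacks; return the number of records copied."""
    copied = 0
    for src, dst in ((a, b), (b, a)):
        for key in missing_keys(dst, src):
            dst[key] = src[key]
            copied += 1
    return copied
```

Running `synchronize` a second time on the same pair copies nothing, which is the property a periodic synchronization job would rely on.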
SPIDR metadata “compromise”
XML database (high-level, low-granularity metadata) = Virtual Observatory (VxO)
– Hierarchy of the data categories, key words, textual descriptions
– Methods and credentials to access the data (web service, FTP directory)
– User Forum for data quality and usability support
SQL database (low-level, high-granularity metadata) = Data Inventory
– Parameters (name, physical meaning, units of measurement, virtual formula) or database schema
– Availability and accreditation of the data (inventory)
– Visualization details (type of the plot and coordinate system, scales, labels)
– Input-output formats
Different workflows and interfaces for different user groups
– Simplistic for novice users, to be driven by Guru
– Advanced user interface
– System administrator interface
– SPIDR usage tutorial
– Data description and help
SPIDR homepage: http://spidr.ngdc.noaa.gov
Real-time usage statistics for a given time interval
– User sessions per day
– Total ~20 000 registered users
– Per-database requests for plot (red) and export (blue)
Numerical modeling on the Grid: Space Weather Reanalysis (SWR)
Input: ground and satellite data from SPIDR data services
Space weather numerical models
Output: high-resolution rendering of the near-Earth space
SWR Computer Resources
JET Supercomputer, FSL/NOAA, Boulder
– 768 Intel Pentium 4 Xeon nodes (dual 2.2 GHz processors)
– Myricom Myrinet CLOS64 (2.4 Gbs)
– ADIC Fileserve MSS (100 Tbytes)
– NGDC was the #2 JET user for 2004-2005
– The SWR consumed 400,000+ CPU hours
– The SWR has produced over 2.5 Tb of data; this exceeds all of NGDC’s non-satellite holdings!
The SWR requires a tremendous array of computer support in order to meet its goals. Challenges include sufficient CPU power, integrating distributed model runs, and storage space for input and output data sets. The SWR project makes use of shared time on FSL’s JET supercomputer as well as RAID and Tivoli based storage systems at NGDC NOAA.
SPIDR integration with VxO and Grid infrastructure
Two reasons to move to the Grid middleware:
1. The digital certificates for security and authentication simplify inter-site communication
2. Processing large environmental archives requires an asynchronous web-service call mechanism
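The asynchronous call mechanism named in the second reason can be pictured as a submit-then-poll pattern: the client receives a ticket at once and collects the result later, so a long archive-processing job does not hold a connection open. The sketch below is not Grid middleware; `AsyncService` and its methods are hypothetical, and a local thread pool stands in for the remote job queue.

```python
from concurrent.futures import ThreadPoolExecutor
import uuid


class AsyncService:
    """Toy service: submit returns a ticket immediately, results come later."""

    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs = {}  # ticket -> Future

    def submit(self, func, *args):
        """Queue the job and return a ticket without waiting for it."""
        ticket = uuid.uuid4().hex
        self._jobs[ticket] = self._pool.submit(func, *args)
        return ticket

    def status(self, ticket):
        """Report whether the job behind the ticket has finished."""
        return "done" if self._jobs[ticket].done() else "running"

    def result(self, ticket, timeout=None):
        """Block until the job finishes (or the timeout expires)."""
        return self._jobs[ticket].result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)
```

A client would call `submit`, poll `status` at its own pace, and call `result` once the job reports "done".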
Some conclusions
– Grid (web) data services accessible from the SPIDR portal and a number of clients in Java, C#, Matlab, MS Excel
– Near-real-time IMF, ionosphere and geomagnetic data input streams
– Data accreditation, FTP file depository synchronous with the database
– Metadata service with high-level data description and low-level data inventory
– Virtual Observatory and User Community functionality: forum, bookmarks, i-mail, external metadata services
– Integration with Web Map Services
– “Fork” of the SPIDR-based data resource on solid Earth
– “Proprietary” SPIDR common data model becomes limiting; a generic model like NetCDF is needed
– SPIDR as a resource on the Space Physics Grid