An Australian Geoscience Data Cube


1 An Australian Geoscience Data Cube
Aaron Sedgmen Geoscience Australia

2 Overview
Organisational background
Data cube concept
Geoscience Australia’s data cube implementation
The shift from traditional methods of managing EO data
Example applications of the data cube
Where to with the data cube

3 Organisational Background
Geoscience Australia – a government agency providing advice and information to the Australian Government and geoscientific information to industry and other stakeholders.
National Earth Observations Group – provides earth observation products and services, as well as expert advice and information for decision makers.

4 The Space-Time Data Cube: a new paradigm for managing and using environmental data

5 The Data Cube concept
[Figure: a data cube with spatial (x), time (t) and spectral band (λ) axes. Extracted axis labels: 191610 (x) 575 (t) x 7 (λ)]

6 “Cubing” Landsat images
[Figure: Landsat images are tiled into squares in space, stacked through time, and diced.]
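The tile, stack and dice steps on this slide can be sketched in a few lines of Python. This is an illustration only, not GA's implementation: the tile size, the scene dictionaries, the function names and the reading of "dice" as extracting one tile's time range are all assumptions, and only scene metadata (no pixels) is handled.

```python
import math
from collections import defaultdict
from datetime import date

TILE_SIZE_DEG = 1.0  # assumed tile edge length; the slides do not state it


def tiles_for_footprint(min_lon, min_lat, max_lon, max_lat, size=TILE_SIZE_DEG):
    """Return the (x, y) index of every tile the bounding box touches."""
    x0, x1 = math.floor(min_lon / size), math.floor(max_lon / size)
    y0, y1 = math.floor(min_lat / size), math.floor(max_lat / size)
    return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]


def build_cube(scenes):
    """Tile each scene in space, then stack each tile's entries in time order."""
    cube = defaultdict(list)
    for scene in scenes:
        for tile in tiles_for_footprint(*scene["bbox"]):
            cube[tile].append((scene["acquired"], scene["id"]))
    for stack in cube.values():
        stack.sort()  # chronological order within each tile
    return dict(cube)


def dice(cube, tile, start, end):
    """Return the part of one tile's stack acquired within [start, end]."""
    return [(d, sid) for d, sid in cube.get(tile, []) if start <= d <= end]


# Hypothetical scenes: same footprint, two acquisition dates.
scenes = [
    {"id": "scene_b", "acquired": date(2001, 3, 1), "bbox": (148.2, -35.8, 149.6, -34.9)},
    {"id": "scene_a", "acquired": date(2000, 1, 15), "bbox": (148.2, -35.8, 149.6, -34.9)},
]
cube = build_cube(scenes)
subset = dice(cube, (149, -35), date(2001, 1, 1), date(2001, 12, 31))
```

Keying the cube by tile index means a time series for any location is a single lookup followed by a date filter, which is the property the stacking step is meant to provide.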

7 GA’s data cube implementation
GA developed a working data cube prototype in early 2013 to undertake time-series analysis of Landsat data
Contains fifteen years ( ) of the Landsat 5 & 7 archive covering the Australian land mass
3,960,528 tiles sourced from a total of 550,537 Level 1T, ARG25, Pixel Quality & some Fractional Cover datasets
110 TB of compressed GeoTIFF files
Access to the cube is via a Python API that enables generation of mosaiced time slices and temporal stacks of derived quantities
Users can apply their own algorithms via the API to generate derived quantities
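To make the idea of user-supplied algorithms producing temporal stacks of derived quantities concrete, here is a minimal sketch. It does not use the GA Python API, whose function names are not given in the slides; `derive_stack`, the band names and the sample values are hypothetical, and NDVI is used only as a familiar example of a derived quantity.

```python
from datetime import date


def ndvi(red, nir):
    """Normalised difference vegetation index for one pixel; None if undefined."""
    total = nir + red
    return (nir - red) / total if total else None


def derive_stack(time_slices, algorithm, band_names):
    """Apply a per-pixel algorithm to every time slice, in chronological order.

    time_slices: list of (acquisition_date, {band_name: 2D grid}) pairs.
    Returns a list of (acquisition_date, derived 2D grid) pairs.
    """
    derived = []
    for acquired, bands in sorted(time_slices, key=lambda s: s[0]):
        inputs = [bands[name] for name in band_names]
        # zip(*inputs) pairs up rows across bands; zip(*rows) pairs up pixels.
        grid = [[algorithm(*pixel) for pixel in zip(*rows)] for rows in zip(*inputs)]
        derived.append((acquired, grid))
    return derived


# Hypothetical 1 x 2 pixel tile with scaled integer reflectance values.
time_slices = [
    (date(2010, 6, 1), {"red": [[1000, 2000]], "nir": [[3000, 2000]]}),
    (date(2009, 6, 1), {"red": [[0, 1000]], "nir": [[0, 4000]]}),
]
ndvi_stack = derive_stack(time_slices, ndvi, ["red", "nir"])
```

Passing the algorithm as a plain function keeps the stacking logic independent of the science code, so a different index or classifier can be swapped in without touching the loop over time.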

8 Hosting of the data cube at NCI
The GA data cube is hosted on the National Computational Infrastructure (NCI), located at the Australian National University in Canberra.
The Raijin supercomputer at the NCI is currently ranked around 27th in the world, based on the following specifications:
57,472 cores
160 Tbytes memory
10 Pbytes spinning disk
1.2 Pflops compute performance
The storage and processing power available at NCI is a critical enabler for the data cube

9 GA’s Traditional EO product process
EO products have traditionally been produced on demand for areas of interest from tape archives of scene-based raw data:
Client requests product
Identify footprint of product in space or time
Search catalogue, order scenes
1 petabyte hierarchical archive: millions of individual scenes, tape store accessed by robot
Orthorectification, calibration, cloud masking, atmospheric correction, mosaicing
Feature extraction, algorithm application, spectral unmixing
Product packaging and delivery

10 A paradigm shift from traditional methods
The data cube holds multiple Landsat products for the entire archive – removes the need to generate products at time of request
Hosting the data cube at NCI co-locates “big data” with high performance computing – enables in-situ analysis of the whole archive
Computational analysis is moved from the scientist’s local environment to a central HPC facility
  Removes the need to download and replicate the data
  Provides computing power not otherwise available to many scientists
Opens up possibilities to integrate the Landsat archive with other “big data” datasets hosted at the HPC facility

11 Surface water
Menindee Lakes time series
Total observations per grid cell ~ *4000 grid cells
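A per-grid-cell observation count of the kind shown on this slide could be computed along the following lines. This is a sketch under stated assumptions, not the method used for the Menindee Lakes analysis: the function name is hypothetical, each time slice is assumed to be already classified as water or dry, and `None` stands in for cells with no clear observation (for example, cloud flagged by a pixel quality product).

```python
def summarise_water(observations):
    """Count clear and wet observations for every grid cell.

    observations: list of 2D grids, one per time slice. Each cell is
    True (water), False (dry) or None (no clear observation).
    Returns (total, wet): two 2D grids of per-cell counts.
    """
    rows, cols = len(observations[0]), len(observations[0][0])
    total = [[0] * cols for _ in range(rows)]
    wet = [[0] * cols for _ in range(rows)]
    for grid in observations:
        for r in range(rows):
            for c in range(cols):
                value = grid[r][c]
                if value is None:
                    continue  # skip cells without a clear observation
                total[r][c] += 1
                wet[r][c] += int(value)
    return total, wet


# Hypothetical 1 x 2 grid observed at three time steps.
observations = [
    [[True, None]],
    [[True, False]],
    [[False, False]],
]
total, wet = summarise_water(observations)

# Fraction of clear observations in which each cell was wet (None if never observed).
frequency = [
    [w / t if t else None for w, t in zip(wet_row, total_row)]
    for wet_row, total_row in zip(wet, total)
]
```

Dividing wet counts by clear-observation counts, rather than by the number of time slices, avoids biasing the result in cells that are frequently obscured.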

12 Continental-Scale surface water results
Time series analysis of the entire 15-year archive of ARG25 data at 25 m resolution.
~2 days processing time (pre Raijin HPC facility)

13 What the GA data cube is not (yet)
A publicly available production system
  Still a working prototype being used for internal environmental science projects.
A real-time delivery system for time-series data serving large numbers of concurrent users (i.e. a web-delivery system)
  A number of OGC specifications, including CF-netCDF, Web Coverage Service (WCS), Web Processing Service (WPS) and Web Coverage Processing Service (WCPS), are being investigated for enabling this capability.
Yet another system for delivering “pretty pictures” (à la GeoServer or Google Earth Engine)
  The data cube environment is optimised for scientific analysis. The delivery of portrayal data (e.g. map images via WMS) is best served by systems optimised for data distribution.

14 Acknowledgements
Dr Stuart Minchin – Chief, Environmental Geoscience Division, Geoscience Australia
Alex Ip – Senior Developer, eResearch Infrastructure

15 Questions?
Thank you
Phone: +61 2 6249 9576
Web:
Address: Cnr Jerrabomberra Avenue and Hindmarsh Drive, Symonston ACT 2609
Postal Address: GPO Box 378, Canberra ACT 2601

