
1 High Performance Computing and High Performance Data: exploring the growing use of Supercomputers in Oil and Gas Exploration & Production
Lesley Wyborn¹, Ben Evans¹, David Lescinsky² and Clinton Foster²
16 September 2014
¹ National Computational Infrastructure (NCI), ² Geoscience Australia (GA)
nci.org.au @NCInews

2 Outline
1. Current drivers for supercomputers in Oil and Gas Exploration and Production (E&P)
2. Overview of the concepts of:
– High Performance Computing (HPC)
– High Performance Data (HPD)
– Data-intensive Science
3. Present some new research directions in HP environments: are they applicable to Oil and Gas E&P?
4. Discuss the advantages of the Oil and Gas industry, Academia and Government working together on Data-intensive Science while still enabling competitive E&P analytics
5. Key take-home messages

3 Potential drivers: relevant facts on Oil and Gas E&P
– 'Easy' oil is running out: easily accessible fields are becoming scarcer
– People are no longer drilling wildcat wells and hoping for the best
– As exploration goes deeper and into harsher environments (e.g., the Arctic, deeper water), the risk of miscalculating drill sites increases
– The costs of finding and then bringing discoveries into production are now substantially higher (e.g., offshore rigs can cost $1,000,000 per day)
– Exponentially growing volumes of E&P data are being collected
– In all parts of E&P the risks of getting it wrong are far greater than ever
Sources: http://www.economist.com/node/16326356 and http://media.economist.com/images/images-magazine/2010/10/TQ/201010TQC941.gif

4 Background of paper: Government working with Academia in partnership
– GA and its predecessors have managed scientific digital data since 1977
– GA and NCI have collaborated since 2010, in particular on research into large-scale High Performance Data (HPD), High Performance Computing (HPC) and multi-disciplinary Data-intensive Science
– NCI is a partnership between Academia and Government: ANU, the Bureau of Meteorology, GA and CSIRO
– The Australian Government (former Department of Innovation, Industry, Science and Research) has funded ~$360M of eResearch infrastructure since 2007 (2 petaflop computers, a 24,000-node research cloud, ~30 PB of data storage at 8 nodes, data services, networks and 12 virtual laboratories)
– Raijin: the NCI 57,000-core petascale machine (currently No 38 on the Top 500 supercomputer list)

5 We are entering the 4th Paradigm of Scientific Discovery
– First paradigm (thousands of years ago): Empirical Science, describing natural phenomena (~250 BC, Archimedes of Syracuse)
– Second paradigm (last few hundred years): Theoretical Science, using models and generalisations (~1650 AD, Sir Isaac Newton)
– Third paradigm (last few decades): Computational Science, CPU-intensive simulation of complex phenomena (~1940 AD, Alan Turing)
Sources: http://www.aps.org/publications/apsnews/200908/zerogravity.cfm, http://couldhavebeenacoconuttree.wordpress.com/2011/05/07/volume-archimedes-and-the-golden-crown-2/, http://www.rutherfordjournal.org/article040101.html, http://www.turing.org.uk/turing/scrapbook/electronic.html

6 The 4th Paradigm of Data-intensive Science
– Concept developed in 2007 by Jim Gray: http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_complete_lr.pdf
– Data-intensive supercomputing is where large-volume data stores and large-capacity computation are co-located
– Such HP hybrid systems are designed, programmed and operated to let users interactively invoke different forms of computation in situ over large-volume data collections
– High Performance Data (HPD) is data that is carefully prepared, standardised and structured to be used in Data-intensive Science on HPC
– Very different to the compute-intensive paradigm (cf. the iPhone 5 of 2012)

7 Australian HPC in the Top 500: June 2014
– Tier 0 (Top 10), petascale (>100,000 cores): No 1 Tianhe-2, China (33.86 PFlops); No 10 (3.14 PFlops)
– Tier 1 (Top 500), terascale (>10,000 cores): No 38 NCI (979 TFlops), with No 11 ENI and No 16 TOTAL above it; No 57 LS Vic (715 TFlops); No 181 CSIRO (168 TFlops); No 266 Pawsey (192 TFlops); No 363 and No 364 Defence (162 TFlops each); No 500 (134.2 TFlops). GA usage spans internal and external facilities
– Tier 2, gigascale (>1,000 cores): institutional facilities, grid and cloud
– Tier 3, megascale (>100 cores): local machines and clusters, local Condor pools; desktops are 2–8 cores
Based on European Climate Computing Environments, Bryan Lawrence (http://home.badc.rl.ac.uk/lawrence/blog/2010/08/02) and the Top 500 list, June 2014 (http://www.top500.org)

8 Oil and Gas E&P in the global Top 500 supercomputers list
– From the earliest days, Oil and Gas E&P has had a high demand for HPC [1]
– Developments in geophysical data processing software closely tracked (drove?) developments in HPC architecture [1]
– Oil and Gas E&P use cases appear on the Top 500 list, but not all users are recorded or identifiable
– June 2013: Pangea (TOTAL) was No 11 (2.09 PFlops); June 2014: ENI (Italy) was No 11 (3 PFlops)
– 2005–2012 marked significant shifts in HPC everywhere; the iPhone 5 of 2012 was ~80 GFlops
Chart: http://www.top500.org/statistics/perfdevel/ (annotated with No 11 Pangea, No 38 Raijin and the iPhone 5S)
[1] Supercomputing and Energy in China: How Investment in HPC affects Oil Security, http://igcc.ucsd.edu/assets/001/505220.pdf

9 The growth in HPC capacity is no longer driven by increasing the number of CPUs
– Moore's law: transistor density doubles every 2 years
– Limitations: power, heat dissipation, atomic limits
– Impacts:
– CPU clock speeds plateaued; the power wall forced the shift to multi-core
– The number of cores increased, and parallelisation became king
– New algorithms are required for parallelism
– Much commercial software does not scale, and/or its business model is inappropriate
Sutter 2009, The Free Lunch Is Over: http://www.gotw.ca/publications/concurrency-ddj.htm
Slide courtesy of Brett Bryan, CSIRO
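The shift described above can be seen in miniature in the sketch below: once clock speeds plateaued, extra throughput had to come from running many cores at once. This is a minimal, illustrative example using only the Python standard library; the function names and sample counts are our own, not any NCI or vendor code.

```python
# Embarrassingly parallel Monte Carlo estimate of pi: each worker runs
# independently, so throughput scales with the number of cores rather
# than with clock speed.
import random
from multiprocessing import Pool

def count_hits(n_samples: int) -> int:
    """Count random points falling inside the unit quarter-circle."""
    rng = random.Random()
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

if __name__ == "__main__":
    n_workers, n_per_worker = 8, 1_000_000
    with Pool(n_workers) as pool:
        hits = sum(pool.map(count_hits, [n_per_worker] * n_workers))
    print("pi ~", 4 * hits / (n_workers * n_per_worker))
```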

10 New algorithms are being developed for supercomputing by Oil and Gas E&P
Source: http://www.ewdn.com/2013/05/21/shell-and-russian-software-developers-team-up-to-focus-on-supercomputing-for-oil-exploration/

11 Supercomputers assisting Oil and Gas E&P
Supercomputers assist Oil and Gas E&P in three primary ways:
– seismic data processing
– reservoir simulation
– computational visualisation at all stages from exploration to production (e.g., four-dimensional visualisations showing how oil, gas and water flow through the reservoir during production, which are hard to 'see' algorithmically)
Source: http://www.analytics-magazine.org/november-december-2011/695-how-big-data-is-changing-the-oil-a-gas-industry

12 Supercomputers "de-risking" Oil and Gas E&P
They help "de-risk" the whole process from exploration to production by:
– enabling processing and combination of vast amounts of data from well logs and seismic, gravity and magnetic surveys to produce 3D models of the subsurface
– assisting in identifying drilling locations that maximise the chance of finding exploitable resources and minimise the drilling of dry holes
– producing four-dimensional visualisations that show how oil, gas and water flow through the reservoir during production
– enabling field engineers to plan the optimum layout of production and injection wells, and to extract residual oil and gas after primary production
– allowing ensemble runs to test multiple scenarios and quantify uncertainties across all parts of the process from exploration through to production
– above all, enabling integration with non-Oil and Gas data sets to maximise extraction from the subsurface safely, with minimal risks and environmental impacts

13 Quotes on supercomputers assisting Oil and Gas E&P
– They save the industry time: "Projects that used to take two years now take six months"; "Pangea helped analyze seismic data from TOTAL's Kaombo project in Angola in just 9 days, or 4 months quicker than it would have taken previously"
– They produce better products: "It is like having a bigger lens, so that you get a sharper picture"
– They allow for more interaction within teams: "faster processors allow those collecting the data and the geologists, who interpret the data, to exchange information and make needed adjustments"
– They open new possibilities: "BP's industry-leading development of 'digital rocks' .. enable calculating petrophysical rock properties and modeling fluid flow directly from high-resolution 3D images – at a scale equivalent to 1/50th of the thickness of a human hair"

14 Impact on cost-benefit analysis
Supercomputers can change an oil/gas company's cost-benefit calculations by:
– allowing it to process data more quickly
– creating more accurate models with fewer assumptions, helping pinpoint the best drilling locations and thus reducing the number of dry holes
– monitoring changes in a site/field over time
– "de-risking" the process to make drilling in complex environments more affordable and safer
Image: http://subseaworldnews.com/wp-content/uploads/2013/05/Jasper-Explorer-Starts-Drilling-1st-Well-for-CNOOC-Congo-SA.jpg

15 In HPC, parallelising code is only one part of it
– The elephant in the room is data access
– There needs to be a balance between processing power and the ability to access data (data scaling)
– The focus is no longer on feeds and speeds: it is on on-demand direct access to large data sources
– It is now on content, and on enabling HPC analytics directly on that content
Chart: http://www.top500.org/statistics/perfdevel/ (annotated with No 11 Pangea, No 38 Raijin and the iPhone 5S)

16 Ways to better utilise HPC capacity and transition to petascale computing
Scaling axes (local → gigascale → terascale → petascale):
– Increase model complexity: Monte Carlo simulations, multiple ensemble runs
– Increase model size and data types: single passes at larger scales, integrating more data types
– Timescale speed-up: use longer-duration runs; use more and shorter time intervals
– Increase data resolution: use higher-resolution data
– Data access: self-describing data cubes and data arrays
Based on European Climate Computing Environments, Bryan Lawrence (http://home.badc.rl.ac.uk/lawrence/blog/2010/08/02)

17 The High Performance systems tetrahedron in balance
(Tetrahedron vertices: Data Accessibility; Tools; Bandwidth; High Performance Computing Infrastructures)

18 The High Performance systems tetrahedron in 2014: totally out of balance!
(Tetrahedron vertices: Data Accessibility; Tools, Codes; Bandwidth; High Performance Computing Infrastructures)

19 HPD is now an essential prelude to Data-intensive Science
– We have new opportunities to process large volumes of data at resolutions and scales never before possible
– But data volumes are growing exponentially, and scalable data access is increasingly difficult
– Traditional find-and-download data technologies are well past their effective limit for Data-intensive Science
– 'Big data' IS the new oil, but unrefined it is of little value: it must be refined, processed and analysed
– We need to convert 'Big data' collections into High Performance Data (HPD) by:
– aggregating data into seamless 'pre-processed' data products
– creating hyper-cubes and self-describing data arrays
Image: http://www.tsunami.org/images/student/art/hokusai.jpg
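To make "self-describing data arrays" concrete, here is a minimal sketch assuming the xarray/netCDF stack; the variable name, coordinates, units and grid are illustrative, not the actual NCI data model.

```python
# A self-describing data cube: the file carries its own coordinates,
# units and projection metadata, so an HPC job can interpret it without
# side-channel documentation.
import numpy as np
import pandas as pd
import xarray as xr

data = np.random.rand(12, 200, 200).astype("float32")  # stand-in pixels
cube = xr.DataArray(
    data,
    dims=("time", "y", "x"),
    coords={
        "time": pd.date_range("1998-01-01", periods=12, freq="MS"),
        "y": np.linspace(-10.0, -44.0, 200),   # latitude, degrees north
        "x": np.linspace(112.0, 154.0, 200),   # longitude, degrees east
    },
    attrs={"units": "reflectance", "crs": "EPSG:4326"},  # assumed metadata
    name="surface_reflectance",
)
cube.to_netcdf("cube.nc")  # hypothetical output path
```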

20 Creating HPD collections: e.g. the Landsat Cube
– A research project with 15 years of Landsat data (1998–2012), funded by the Department of Innovation, Industry, Science and Research
– The Landsat cube arranges 636,000 Landsat source scenes spatially and temporally to allow flexible but efficient large-scale analysis
– The data is partitioned into spatially-regular, time-stamped, band-aggregated tiles which can be presented as temporal stacks
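The tiling idea can be sketched in a few lines. This is an assumption-laden illustration: the 1-degree tile size, key layout and helper names are ours, not the actual Landsat cube scheme.

```python
# Spatially-regular, time-stamped tiles addressed by a simple index;
# all observations of one tile across time form a temporal stack.
import math
from typing import Tuple

TILE_DEG = 1.0  # hypothetical tile size in degrees

def tile_index(lon: float, lat: float) -> Tuple[int, int]:
    """Map a coordinate to the integer index of its spatial tile."""
    return math.floor(lon / TILE_DEG), math.floor(lat / TILE_DEG)

def tile_key(lon: float, lat: float, timestamp: str) -> str:
    """Key a spatially-regular, time-stamped tile in a flat store."""
    ix, iy = tile_index(lon, lat)
    return f"tile_{ix:+04d}_{iy:+04d}/{timestamp}"

print(tile_key(149.1, -35.3, "1998-07-01"))  # tile_+149_-036/1998-07-01
print(tile_key(149.1, -35.3, "2012-07-01"))  # same tile, later epoch
```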

21 Current Landsat holdings as HPD
636,000 Landsat source scenes (~52 x 10^12 pixels) are repackaged as 4M spatially-regular, time-stamped tiles (0.5 PB)

22 High-resolution, multi-decade, continental-scale analysis of HPD
– Water detection over 15 years, 1998–2012
– Sampled 1,312,087 tiles at 25 m nominal pixel resolution, i.e. ~21 x 10^12 pixels
– The actual data can be sampled at national or local (farm) scale
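A sketch of the kind of per-pixel, whole-of-archive analysis this slide describes follows: counting how often each pixel looks like water across a temporal stack. The toy threshold classifier and array sizes are placeholders, not the actual water-detection algorithm.

```python
# Per-pixel water frequency over a temporal stack of one spatial tile.
import numpy as np

def water_mask(tile: np.ndarray) -> np.ndarray:
    """Toy classifier: flag pixels below a reflectance threshold as water."""
    return tile < 0.05

# (time, y, x): a cut-down temporal stack standing in for 15 years of tiles
stack = np.random.rand(150, 400, 400).astype("float32")

water_count = water_mask(stack).sum(axis=0)      # observations as water
water_frequency = water_count / stack.shape[0]   # fraction of the record
print(water_frequency.shape)  # (400, 400): one summary value per pixel
```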

23 Can we create an equivalent HPD array of seismic reflection data?
– What would a calibrated HPD array of all Australian seismic data look like? That is, direct access to the actual data content, rather than to metadata on files of data which then need to be downloaded, integrated and processed locally
– Such an array could be sampled and processed directly at national, basin or prospect scale
– It could then be integrated with full-resolution HPD point clouds of magnetic, gravity and magneto-telluric survey data
Source: http://www.ga.gov.au/__data/assets/image/0003/15645/13-7749-8-allstates.jpg
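One possible shape for such an array, sketched under explicit assumptions: HDF5 via h5py, with survey dimensions, chunking, projection code and file name all invented for illustration; nothing here reflects an actual GA data model.

```python
# A chunked, self-describing on-disk seismic volume: trace-aligned chunks
# allow direct sampling at prospect scale without reading the whole store.
import numpy as np
import h5py

n_inlines, n_crosslines, n_samples = 2000, 1500, 1000

with h5py.File("seismic_cube.h5", "w") as f:
    cube = f.create_dataset(
        "amplitude",
        shape=(n_inlines, n_crosslines, n_samples),
        dtype="float32",
        chunks=(64, 64, n_samples),  # whole traces per chunk
        compression="gzip",
    )
    # Metadata travels with the array rather than in side files:
    cube.attrs["sample_interval_ms"] = 4.0
    cube.attrs["crs"] = "EPSG:28350"  # hypothetical projection
    # Surveys can be loaded incrementally, one slab at a time:
    cube[0:64, 0:64, :] = np.zeros((64, 64, n_samples), dtype="float32")
```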

24 Realities of HPD data collections
– HPD collections are simply too large to move: bandwidth limits the capacity to move them and data transfers are too slow; even if they can be moved, few can afford to store them locally, and the energy costs are also substantial
– HPD is about moving processing to the data, moving users to the data, and having online applications to process the data
– HPD enables cross-domain integration
– Domain-neutral international standards for data collections and interoperability are critical for allowing complex interactions in HP environments, both within and between HPD collections
– HPD enables scalable data access, but it also means rethinking the algorithms that operate on the data (not again?)
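"Moving processing to the data" looks like the sketch below in practice: open a remote dataset lazily and pull back only the needed subset, instead of downloading whole files. The OPeNDAP endpoint and variable names are placeholders, assuming the xarray stack with a netCDF/OPeNDAP backend.

```python
# Server-side subsetting: only the requested slab crosses the network.
import xarray as xr

url = "https://example.org/opendap/collection/cube.nc"  # hypothetical endpoint
ds = xr.open_dataset(url)  # lazy: no pixel data transferred yet

subset = ds["surface_reflectance"].sel(
    time=slice("2010-01-01", "2012-12-31"),
    x=slice(115.0, 117.0),
    y=slice(-31.0, -33.0),  # assumes a descending latitude coordinate
)
print(subset.mean().values)  # computation pulls just the needed bytes
```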

25 The Oil and Gas industry can take a bow
– It has been amongst the leaders in the development of globally standardised formats (e.g., SEG-Y)
– Energistics, a global consortium that facilitates the development, management and adoption of data exchange standards for the upstream oil and gas industry (e.g., WITSML, PRODML and RESQML), has driven the next generation of the ISO 19115 metadata standard, THE global standard for discovery of geospatial data
– In 2012 the Oil and Gas industries formed the Standards Leadership Council, which links many oil and gas standards bodies as well as the OGC and the SEG
– But these standards may need to evolve to increase the uptake of data in HP environments, particularly at exascale

26 Rethinking hardware architectures for Data-intensive Science
Work at NCI has highlighted the need for balanced systems to enable Data-intensive Science, including:
– interconnects and high throughput between processes to reduce inefficiencies
– the need to really care about placement of data resources
– better communications between the nodes
– large persistent storage (on spinning disk) in addition to the traditional 'scratch' spaces
– I/O capability to match the computational power: NCI's I/O speed is ~50 GB/s to persistent storage and ~120 GB/s to scratch
– close coupling of cluster, cloud and storage
Since the NCI/GA data-intensive journey began in 2010, software has been progressively rewritten; data and hardware architectures also need to change to create balanced systems (NCI's integrated High Performance environment).
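A back-of-envelope calculation (ours, not from the slide) shows why those I/O numbers matter: scanning a 0.5 PB collection, such as the Landsat holdings above, at ~50 GB/s sequential bandwidth takes about 500,000 GB / 50 GB/s = 10,000 s, roughly 2.8 hours, for a single full pass. Chunked layouts and server-side subsetting that avoid full scans therefore matter as much as raw compute.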

27 Warning: exascale is just around the corner
– Addressing the data access problem is the highest priority as supercomputing heads towards exascale
– Climate and weather research is already there: can we learn from these academic communities?
– Looking backwards: the capacity of an iPhone 5 today is equivalent to the supercomputers of 1995
– Looking forwards: we are starting to slip...
Chart: http://www.top500.org/statistics/perfdevel/ (gigascale to exascale, annotated with No 38 Raijin, the next NCI machine??, No 11 Pangea, the next Pangea or ENI??, and the iPhone 5S)

28 Energy: the main limitation on growth in HP environments
– Future HP systems, both storage and HPC, will be energy-limited: are we reaching the tops in flops?
– Architecture will matter more: energy efficiency will be achieved through careful architecture
– Increased performance will be determined by new algorithms and far more efficient data access
Chart: Top 200, June 2014, http://www.top500.org/statistics/perfdevel/

29 Future HPC challenges for Oil and Gas E&P
Future HPC challenges for everyone:
– power, programmability and scalability: programming needs to operate at extreme scale, using massive parallelism
– data movement is THE current bottleneck: the precise and efficient flow of data will take centre stage, and hardware that enables that level of control will become critical
– balanced systems will be crucial (architecture, software, data access)
Specific HPC challenges for Oil and Gas E&P:
– there will probably need to be a transition to high-volume, collaborative high performance data stores against which competitive algorithms can be deployed by individual companies
– the competitive advantage will no longer be in what data a company holds: it will be in smarter proprietary algorithms applied to collaborative HPD collections sited close to HPC

30 Can we do this as a 3-way collaboration?
– Industry: driving developments in HPD/HPC
– Government agencies: data rich
– Academia: cutting-edge HPC/HPD research, particularly scaling to exascale
Existing pairings:
– Government and Academia in partnership on HPC and HPD: developing new approaches to in-situ Data-intensive Science
– Academia and Industry in partnership on HPC: developing new systems
– Industry and Government in partnership for traditional data supply
The open question is whether all three can meet in the middle on Data-intensive Science.

31 Take-home messages for Oil and Gas E&P
– HPC is now an integral part of Oil and Gas E&P, and there IS capacity for far more growth
– Moving to HPD collections will be key to enabling data to be integrated and processed at high resolution, giving more accurate models and predictions
– The Oil and Gas industry will need to continue to drive standards globally and ensure they are compatible with the rapidly approaching exascale HPC/HPD environments
– Three-way partnerships (Government, Industry, Academia) need to be investigated
– Then together we can continue to refine 'New Oil' from 'Raw Materials' via standard specifications, feed it to HPC environments to 'fuel' quality assessments, and provide burning new insights to support the environmentally sustainable and safe development of our Oil and Gas resources

32 Any questions?
Dr Lesley Wyborn lesley.wyborn@anu.edu.au
Dr Ben Evans ben.evans@anu.edu.au
Dr David Lescinsky david.lescinsky@ga.gov.au
Dr Clinton Foster clinton.foster@ga.gov.au
nci.org.au @NCInews
Source: http://www.analytics-magazine.org/november-december-2011/695-how-big-data-is-changing-the-oil-a-gas-industry

