UK NeSC Meeting, November 18th, 2004. Terry Sloan, EPCC, The University of Edinburgh. INWA: using OGSA-DAI in a commercial environment.

Presentation transcript:

1 UK NeSC Meeting, November 18th, 2004. Terry Sloan, EPCC, The University of Edinburgh. INWA: using OGSA-DAI in a commercial environment

2 Overview
– The Grid vision
– The INWA project
– Demo of data browse via FirstDIG Browser and OGSA-DAI
– Data Fusion
– Data Fusion demo
– Future Plans

3 The Grid Vision
“… flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources - what we refer to as virtual organisations.”
The Anatomy of the Grid: Enabling Scalable Virtual Organizations. I. Foster, C. Kesselman, S. Tuecke. International J. Supercomputer Applications, 15(3), 2001.

4 The INWA Project

5 The INWA virtual organisation

6 INWA Resources & Participants
Resources:
– UK mortgage data
– UK property data
– Australian telco data
– Australian property data
– Compute power at EPCC
– Compute power at Curtin
Individuals and Organisations:
– Analyst at EPCC, UK
– Analyst at Curtin, Australia
– EPCC, UK – compute resource provider and host
– Curtin, Australia – compute resource host
– Sun Microsystems, Aus – compute resource provider
– Bank, UK – data provider
– ESPC, UK – data provider
– Telco, Aus – data provider
– VGO, WA, Aus – data provider

7 Background
Funded by the UK Economic & Social Research Council (ESRC) in the Pilot Projects in E-Social Science
– Small scale projects to explore the potential of Grid technologies within the social sciences
– Informing Business & Regional Policy: Grid enabled fusion of global data & local knowledge
– INWA: Innovation Node Western Australia
Started November 2003
– Initial phase finished August 2004

8 Project Aims
• Evaluate the suitability of existing grid solutions for secure distributed data mining and analysis on commercially sensitive data
• Investigate the advantages of fusing public and private data enabled by a grid environment

9 INWA Grid software
Transfer-queue Over Globus (TOG) v1.1 from the UK e-Science Sun Data and Compute Grids project
– provides access to remote compute resources
Open Grid Services Architecture – Data Access and Integration (OGSA-DAI) Release 3.1
– provides access control and discovery of distributed heterogeneous data resources
First Data Investigation on the Grid (FirstDIG)
– grid data service browser provides SQL access to OGSA-DAI enabled resources
– now part of OGSA-DAI Release 4.0
Globus Toolkit 2 and 3
– Grid middleware

10 The INWA Grid
[Diagram: two sites, Curtin, Australia and EPCC, UK. Component labels: Grid Engine, OGSA-DAI, TOG, Data Browser. Data labels: Telco data, Bank data, Australian property, UK property.]

11 Demonstration
• Scenario
– A bank wants to predict whether home owners are likely to move house within 5 years of taking out a loan to buy the house
– This type of loan is a mortgage
– The bank wants to use its own data and publicly available data to help improve the prediction
– Demo uses dummy data
– Data stored in Australia in OGSA-DAI enabled databases
– Demo shows an example of a workflow used in the project to browse and analyse data
– FirstDIG browser and OGSA-DAI were used to browse and fuse data

12 Access OGSA-DAI Registry
• FirstDIG browser started
• OGSA-DAI registry at Curtin selected

13 Browse demo bank data
• Grid data service factories appear
• demoBank GDSF selected
• SQL query input
– select * from demoBankData LIMIT 50
• Run select query
• Query results appear
– example bank data
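The browse query on this slide can be tried locally. The sketch below is only a stand-in: it uses Python's built-in sqlite3 module rather than an OGSA-DAI grid data service, and the column names and rows are invented dummy values (the slide names only the table demoBankData).

```python
import sqlite3

# Local stand-in for an OGSA-DAI enabled database (assumption: the real
# demo queried a remote grid data service, not SQLite).
conn = sqlite3.connect(":memory:")

# Hypothetical columns; the slide only names the table demoBankData.
conn.execute(
    "CREATE TABLE demoBankData (account_id INTEGER, postcode TEXT, loan INTEGER)"
)
conn.executemany(
    "INSERT INTO demoBankData VALUES (?, ?, ?)",
    [(i, "ZZ%d" % (i % 17 + 1), 100000 + 1000 * i) for i in range(200)],
)

# The query from the slide: LIMIT caps the number of rows returned,
# so browsing a large table only fetches a small sample.
rows = conn.execute("select * from demoBankData LIMIT 50").fetchall()
print(len(rows))
```

Capping the result with LIMIT is a reasonable choice for browsing, since the point is to inspect a sample of the data rather than transfer the whole table.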

14 Browse demo public data
• Select demo public GDSF
• Run select query
– select * from demoPublicdata limit 50
• Query results appear
– example public data

15 Data Fusion
Fusing commercial data with public property data
Commercial data
| Account ID | Address | Loan | Date | … |
| | Downing Street, … | 200,000 | 10/2/2002 | … |
| | My Street, … | 100,000 | 14/8/1980 | … |
+ Public property data
| Address | #Bedrooms | #Garages | … |
| 10 Downing Street, … | 4 | 3 | … |
| 20 My Street, … | 3 | 0 | … |
= Fused data
| Account ID | Address | Loan | Date | #Bedrooms | #Garages | … |
| | Downing … | 200,000 | 10/2/2002 | 4 | 3 | … |
| | My Street, … | 100,000 | 14/8/1980 | 3 | 0 | … |
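The fusion on this slide amounts to a join of the two tables on the address column. A minimal sketch in plain Python follows. The account IDs ("A1", "A2") are invented placeholders, and the bank addresses are assumed to carry the same house numbers as the public rows so that an exact match succeeds. The function name fuse is hypothetical, not part of OGSA-DAI.

```python
# Dummy rows modelled on the slide; account IDs are invented placeholders.
bank = [
    {"account_id": "A1", "address": "10 Downing Street", "loan": 200000, "date": "10/2/2002"},
    {"account_id": "A2", "address": "20 My Street", "loan": 100000, "date": "14/8/1980"},
]
public = [
    {"address": "10 Downing Street", "bedrooms": 4, "garages": 3},
    {"address": "20 My Street", "bedrooms": 3, "garages": 0},
]


def fuse(bank_rows, public_rows):
    """Inner join on exact address: one output row per matching bank row."""
    by_address = {row["address"]: row for row in public_rows}
    fused = []
    for row in bank_rows:
        match = by_address.get(row["address"])
        if match is not None:
            # Merge the public columns into a copy of the bank row.
            fused.append({**row, **match})
    return fused


fused = fuse(bank, public)
for row in fused:
    print(row)
```

An exact-address join like this identifies individual properties, which is the anonymity concern the next slide raises.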

16 Data Fusion
Why do it?
– Prospect of better models/predictions
– Added value
But
– need a distributed-aggregated approach to preserve anonymity
So simulated this over the Grid
– Using a less specific join key
– Not a 1-1 join but a 1-n, so averaging necessary
– Limited the potential gains from fusion
Fuzzy joins
– e.g. postcode formats, addresses (St=Street, flat numbers)
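The two ideas on this slide, a 1-n join that needs averaging and fuzzy key matching, can be sketched as follows. This is an illustration under assumptions, not the project's code: the postcodes are made up, the only normalisation shown is case and whitespace, and all function and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def normalise_postcode(raw):
    """Remove all whitespace and upper-case, so 'zz1 1aa' equals 'ZZ1  1AA'."""
    return "".join(raw.split()).upper()


def average_by_postcode(public_rows):
    """Collapse the n public rows sharing a postcode into one mean value."""
    groups = defaultdict(list)
    for row in public_rows:
        groups[normalise_postcode(row["postcode"])].append(row["bedrooms"])
    return {postcode: mean(values) for postcode, values in groups.items()}


def fuse_on_postcode(bank_rows, public_rows):
    """Attach the per-postcode average to each bank row (1-n join, averaged)."""
    averages = average_by_postcode(public_rows)
    fused = []
    for row in bank_rows:
        postcode = normalise_postcode(row["postcode"])
        if postcode in averages:
            fused.append({**row, "avg_bedrooms": averages[postcode]})
    return fused


# Made-up postcodes written in three different formats.
public = [
    {"postcode": "ZZ1 1AA", "bedrooms": 4},
    {"postcode": "zz1  1aa", "bedrooms": 2},
    {"postcode": "ZZ2 2BB", "bedrooms": 3},
]
bank = [{"account_id": "A1", "postcode": "ZZ11AA", "loan": 200000}]

result = fuse_on_postcode(bank, public)
print(result)
```

Because each bank row receives only an area average, no single public record is exposed, but the fused value is less informative than an exact match. This is the limit on gains that the slide mentions.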

17 Demo Data fusion
• Select Database Join activity
• Load SQL for data fusion pattern

18 Demo Data fusion 2
• Configure join pattern
• Select source databases
• Join on postcode
• Set destination database
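The slides do not show the SQL loaded for the data fusion pattern. The sketch below is a guess at what a join on postcode with averaging, written to a destination table, might look like. It runs against a local SQLite database as a stand-in for the OGSA-DAI enabled sources; the column names, the destination table name fusedData and all rows are assumptions.

```python
import sqlite3

# Local stand-in: in the demo the sources and destination were separate
# OGSA-DAI enabled databases; here one in-memory database holds all tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE demoBankData (account_id TEXT, postcode TEXT, loan INTEGER);
    CREATE TABLE demoPublicdata (postcode TEXT, bedrooms INTEGER);
    INSERT INTO demoBankData VALUES ('A1', 'ZZ1', 200000), ('A2', 'ZZ2', 100000);
    INSERT INTO demoPublicdata VALUES ('ZZ1', 4), ('ZZ1', 2), ('ZZ2', 3);
    """
)

# Join on postcode, average the n matching public rows per account,
# and store the result in a (hypothetical) destination table.
conn.execute(
    """
    CREATE TABLE fusedData AS
    SELECT b.account_id, b.postcode, b.loan, AVG(p.bedrooms) AS avg_bedrooms
    FROM demoBankData AS b
    JOIN demoPublicdata AS p ON p.postcode = b.postcode
    GROUP BY b.account_id, b.postcode, b.loan
    """
)

fused_rows = conn.execute(
    "SELECT account_id, avg_bedrooms FROM fusedData ORDER BY account_id"
).fetchall()
print(fused_rows)
```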

19 Data fusion results

20 Future Plans

21 Future Plans
Include Chinese Academy of Sciences (CNIC) as a node in the INWA grid infrastructure
Upgrade from OGSA-DAI R3.1 to R4.0
– Addresses security and performance issues
Investigate ODBC connections to OGSA-DAI data services
– ODBC typically available in the data analysis software used in business and social science research
…then we can start to explore the impact of Grid capabilities on innovation processes and hence the Grid’s potential to support (virtual) industry clusters