Using OGSA-DAI in a commercial environment Terry Sloan EPCC Telephone: +44 131 650 5155


Overview
• FirstDIG
• INWA
• Outstanding issues raised by these projects

First Data Investigation on the Grid: FirstDIG

Motivation
• Few UK e-Science projects involve service companies such as First plc
• First plc
  – Operates worldwide in a variety of transport sectors
  – Over vehicles in the UK, 23% of the market
  – UK’s largest operator
• The challenge for First
  – Meeting the needs of the travelling public whilst making money
  – Data integration and mining may assist, but there is a huge range of fragmented data sources

Data Sources in the Bus Industry
• Many different kinds of data involved with running a bus company
  – Mileage, revenue, customer contact, schedule, fuel consumption, vehicle maintenance, routes…
• Many means to collect data
  – Manually entered data at depot
  – Data collected on buses from ticket machines
  – Data collected on buses from GPS systems
  – GPS system notes when bus passes through a predefined “footprint” and records the time at which this happens

Answering Business Questions
• Want to combine data from more than one source:
  – Complaints versus Lateness
  – Revenue versus Lost Miles
  – Complaints versus Lost Miles
• Want data aggregated in some way:
  – By Service
  – By Day
• Want to consider subsets of the data
  – e.g. weekdays only
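The three requirements on this slide (combine sources, aggregate, restrict to a subset) map onto a SQL join with GROUP BY and WHERE. The following is a minimal sketch only, using an in-memory SQLite database. The table names, column names, and figures are invented for illustration and are not First's schema or data.

```python
import sqlite3


def build_demo():
    """Create a throwaway database with invented complaint and mileage rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE complaints (service TEXT, day TEXT, nature TEXT)")
    conn.execute("CREATE TABLE mileage (service TEXT, day TEXT, lost_miles REAL)")
    conn.executemany(
        "INSERT INTO complaints VALUES (?, ?, ?)",
        [
            ("52", "2004-11-01", "late"),      # Monday
            ("52", "2004-11-01", "late"),      # Monday
            ("52", "2004-11-06", "no show"),   # Saturday, filtered out below
        ],
    )
    conn.executemany(
        "INSERT INTO mileage VALUES (?, ?, ?)",
        [
            ("52", "2004-11-01", 12.5),   # Monday
            ("52", "2004-11-02", 3.0),    # Tuesday
            ("52", "2004-11-06", 40.0),   # Saturday, filtered out below
            ("X78", "2004-11-01", 0.0),   # Monday
        ],
    )
    return conn


def complaints_vs_lost_miles(conn):
    """Per service: lost miles and complaint count, weekdays only."""
    query = """
        SELECT m.service,
               SUM(m.lost_miles)     AS lost_miles,
               COALESCE(SUM(c.n), 0) AS complaints
        FROM mileage AS m
        LEFT JOIN (
            -- Count complaints per service and day before joining,
            -- so mileage rows are not duplicated by the join.
            SELECT service, day, COUNT(*) AS n
            FROM complaints
            GROUP BY service, day
        ) AS c ON c.service = m.service AND c.day = m.day
        -- strftime('%w') gives 0 for Sunday through 6 for Saturday.
        WHERE CAST(strftime('%w', m.day) AS INTEGER) BETWEEN 1 AND 5
        GROUP BY m.service
        ORDER BY m.service
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in complaints_vs_lost_miles(build_demo()):
        print(row)
```

Swapping `GROUP BY m.service` for `GROUP BY m.day` would give the "By Day" view of the same question.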

Disparate Databases
• Data is typically stored in disparate databases
  – Various reasons for this: incremental construction of systems
  – Not a problem for day-to-day running and querying but…
• Introduces challenges for Data Analysis
  – Systems introduced at different times
  – Different database engines
  – Different front-ends
  – Different operating systems
  – Different physical locations
  – Different ways of representing data
• These issues are NOT unique to buses

OGSA-DAI
• OGSA-DAI
  – Open Grid Services Architecture: Data Access and Integration
  – Potentially provides a solution
  – Need business users to make transition from science to commerce
• Grid middleware:
  – Assists with the access and integration of data from separate data sources via the Grid
  – Represents databases as Grid Services
  – Enables access from other machines in a secure manner

FirstDIG Achievements
• Deployment at First South Yorkshire
• Combined two databases to answer real business questions
  – The Customer Contact System (Microsoft Access): information on customer complaints, e.g. time, service, nature
  – The Mileage database (dBASE IV): information on bus mileage, e.g. lost miles
• Produced generic Grid Data Service Browser
  – SQL access including joins across the databases
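The slides do not say how the browser implemented joins across the two databases. One common approach, shown here purely as an assumption, is to query each source separately and join the rows in the client on a shared key. In this sketch two independent in-memory SQLite databases stand in for the Access and dBASE IV sources; every table name, column name, and value is hypothetical.

```python
import sqlite3
from collections import defaultdict


def make_source(schema, insert_sql, rows):
    """Create one independent in-memory database (a stand-in for one source)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert_sql, rows)
    return conn


def join_on_service(complaints_conn, mileage_conn):
    """Client-side hash join: index one source by key, probe with the other."""
    lost_by_service = defaultdict(float)
    for service, lost in mileage_conn.execute(
        "SELECT service, lost_miles FROM mileage"
    ):
        lost_by_service[service] += lost

    joined = []
    for service, count in complaints_conn.execute(
        "SELECT service, COUNT(*) FROM complaints GROUP BY service ORDER BY service"
    ):
        # Inner-join semantics: keep only services present in both sources.
        if service in lost_by_service:
            joined.append((service, count, lost_by_service[service]))
    return joined


# Invented sample data for the two stand-in sources.
complaints_db = make_source(
    "CREATE TABLE complaints (service TEXT, nature TEXT)",
    "INSERT INTO complaints VALUES (?, ?)",
    [("52", "late"), ("52", "late"), ("X78", "driver"), ("120", "late")],
)
mileage_db = make_source(
    "CREATE TABLE mileage (service TEXT, lost_miles REAL)",
    "INSERT INTO mileage VALUES (?, ?)",
    [("52", 12.5), ("52", 3.0), ("X78", 0.0)],
)

if __name__ == "__main__":
    for row in join_on_service(complaints_db, mileage_db):
        print(row)
```

Joining in the client keeps each source untouched, at the cost of pulling both key sets across the network, which ties in with the large result set issue raised later in the talk.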

First Grid Data Service Browser

Informing Business & Regional Policy: Grid-enabled fusion of global data & ‘local’ knowledge (INWA)

INWA
• An e-Social Science demonstrator
  – Demonstrates how grid technologies can improve business
  – Combining private and public data sources
  – Finance and Telecommunications
• Uses many grid technologies
  – TOG from Sun DCG provides access to remote HPC resource
  – OGSA-DAI provides access control and discovery of distributed heterogeneous data resources
  – FirstDIG grid data service browser provides SQL access to OGSA-DAI enabled resources
  – Globus Toolkit 2 and 3

INWA Grid Infrastructure
[Diagram. Labels: EPCC; Curtin; UK Property data service; Australian Property data service; Globus Grid; FirstDIG; Grid Engine; TOG; Bank data; Telco data]

References
• EPCC
• FirstDIG
• OGSA-DAI
• INWA
• Sun Data & Compute Grids
• Transfer-queue Over Globus (TOG)

Outstanding issues raised by FirstDIG & INWA

Outstanding Issues: Usability
• OGSA-DAI is middleware; the client toolkit helps
• Incorporation of the demo First browser is helpful-ish, but what users really want is…
• Interfaces to real data analysis and DBMS packages, e.g. SPSS
• Otherwise users could end up building applications that replicate these, e.g. the First Grid Data Service Browser
• Want to be able to point Access, Excel, etc. at a grid data source and examine it

Outstanding Issues: Data
• CSV (comma-separated value) data sources
  – Are common, but current JDBC-ODBC drivers do not have sufficient functionality (NOT an OGSA-DAI issue per se)
• No support for BIT type field
  – And others, e.g. BOOLEAN, BINARY, etc.
• Certain characters (e.g. &, >) are not handled by the OGSA-DAI XML parser
  – Company names often have & in them
• Dates from certain sources not handled properly
  – First Grid Data Service has to handle this internally
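The special-character issue above is a general XML one: a raw & in element text makes the document ill-formed unless it is escaped when the data is serialised. The sketch below illustrates that point with Python's standard library only. It is not OGSA-DAI's parser or its fix, and the `row_to_xml` helper, field names, and company name are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def row_to_xml(fields):
    """Serialise one result row as an XML fragment, escaping &, < and >."""
    cells = "".join(
        f"<{name}>{escape(str(value))}</{name}>" for name, value in fields.items()
    )
    return f"<row>{cells}</row>"


def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical row containing the characters mentioned on the slide.
fragment = row_to_xml({"company": "Smith & Sons", "note": "a > b"})

if __name__ == "__main__":
    print(fragment)
    # The unescaped form is rejected by a conforming XML parser.
    print(parses("<row><company>Smith & Sons</company></row>"))
    # The escaped form round-trips back to the original text.
    print(ET.fromstring(fragment).find("company").text)
```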

Outstanding Issues: Miscellaneous
• Security
  – Rolemap file is not encrypted
  – If one GDS accesses another GDS, the user's security credentials are not passed on, so it does not work
• Installation & Testing
  – Install and set-up are well explained, but still involve a fair amount of user effort
  – Lack of an example OGSA-DAI site to point at to test that your OGSA-DAI installation works

Outstanding Issues: Miscellaneous
• Large result sets
  – Can increase JVM size but this is not scalable
  – This occurred on most datasets
• Integration
  – DQP is a start … (Linux, OQL)
• Why use OGSA-DAI?
  – Easysoft etc.
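On the large result sets point: raising the JVM heap only moves the limit, whereas consuming results in fixed-size chunks keeps client memory bounded regardless of result size. The sketch below shows that general pattern with Python's `sqlite3` cursor and `fetchmany`. It is an illustration of the idea, not a description of how OGSA-DAI delivers results, and the table and data are invented.

```python
import sqlite3


def iter_rows(conn, query, chunk_size=1000):
    """Yield rows chunk by chunk so at most chunk_size rows are held at once."""
    cursor = conn.execute(query)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield from chunk


def total_lost_miles(conn, chunk_size=1000):
    """Aggregate over the whole table without materialising it in memory."""
    return sum(
        lost
        for (lost,) in iter_rows(conn, "SELECT lost_miles FROM mileage", chunk_size)
    )


def build_large_demo(n_rows=10_000):
    """Create an in-memory table with n_rows invented mileage records."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mileage (service TEXT, lost_miles REAL)")
    conn.executemany(
        "INSERT INTO mileage VALUES (?, ?)",
        (("52", 0.5) for _ in range(n_rows)),
    )
    return conn


if __name__ == "__main__":
    print(total_lost_miles(build_large_demo(), chunk_size=500))
```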

Why use OGSA-DAI?
‘a RDBMS engine that appears to client apps as a fully conformant ODBC 3.5 data source … can be used to provide real-time, heterogeneous access to multiple target data sources.’