Using OGSA-DAI in a commercial environment Terry Sloan EPCC Telephone: +44 131 650 5155


1 Using OGSA-DAI in a commercial environment
Terry Sloan, EPCC
Telephone: +44 131 650 5155
Email: tsloan@epcc.ed.ac.uk

2 Overview
• FirstDIG
• INWA
• Outstanding issues raised by these projects

3 First Data Investigation on the Grid: FirstDIG
http://www.epcc.ed.ac.uk/firstdig/

4 Motivation
• Few UK e-Science projects involve service companies such as First plc
• First plc
  – Operates worldwide in a variety of transport sectors
  – Over 10,000 vehicles in the UK, 23% of the market
  – UK’s largest operator
• The challenge for First
  – Meeting the needs of the travelling public whilst making money
  – Data integration and mining may assist, but there is a huge range of fragmented data sources

5 Data Sources in the Bus Industry
• Many different kinds of data involved with running a bus company
  – Mileage, revenue, customer contact, schedule, fuel consumption, vehicle maintenance, routes…
• Many means to collect data
  – Manually entered data at depot
  – Data collected on buses from ticket machines
  – Data collected on buses from GPS systems
  – GPS system notes when bus passes through a predefined “footprint” and records the time at which this happens

6 Answering Business Questions
• Want to combine data from more than one source:
  – Complaints versus Lateness
  – Revenue versus Lost Miles
  – Complaints versus Lost Miles
• Want data aggregated in some way:
  – By Service
  – By Day
• Want to consider subsets of the data
  – e.g. weekdays only
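The kind of question on this slide (complaints versus lost miles, by service, weekdays only) maps onto a SQL join with grouping and a filter. The following is a minimal sketch only, not the FirstDIG implementation: it uses Python's built-in sqlite3 module, with SQLite's ATTACH DATABASE standing in for two separate databases. The table names, column names, and sample rows are all invented for illustration.

```python
import sqlite3


def build_demo():
    """Create two in-memory databases with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    # A second, separate in-memory database stands in for the mileage system.
    conn.execute("ATTACH DATABASE ':memory:' AS mileage_db")
    conn.execute("CREATE TABLE complaints (service TEXT, day TEXT, nature TEXT)")
    conn.execute(
        "CREATE TABLE mileage_db.mileage (service TEXT, day TEXT, lost_miles REAL)"
    )
    conn.executemany(
        "INSERT INTO complaints VALUES (?, ?, ?)",
        [
            ("52", "2004-03-01", "late"),   # Monday
            ("52", "2004-03-01", "late"),   # Monday
            ("52", "2004-03-06", "other"),  # Saturday
            ("120", "2004-03-02", "late"),  # Tuesday
        ],
    )
    conn.executemany(
        "INSERT INTO mileage_db.mileage VALUES (?, ?, ?)",
        [
            ("52", "2004-03-01", 12.5),
            ("52", "2004-03-06", 3.0),
            ("120", "2004-03-02", 0.0),
            ("120", "2004-03-03", 4.0),
        ],
    )
    conn.commit()
    return conn


def complaints_vs_lost_miles(conn):
    """Per-service weekday totals of lost miles and complaints."""
    query = """
        SELECT m.service,
               SUM(m.lost_miles)     AS lost_miles,
               COALESCE(SUM(c.n), 0) AS complaints
        FROM mileage_db.mileage AS m
        LEFT JOIN (
            -- Count complaints per service and day first, so the join
            -- cannot duplicate mileage rows.
            SELECT service, day, COUNT(*) AS n
            FROM complaints
            GROUP BY service, day
        ) AS c ON c.service = m.service AND c.day = m.day
        -- strftime('%w') yields 0 (Sunday) to 6 (Saturday); keep Mon-Fri.
        WHERE CAST(strftime('%w', m.day) AS INTEGER) BETWEEN 1 AND 5
        GROUP BY m.service
        ORDER BY m.service
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in complaints_vs_lost_miles(build_demo()):
        print(row)
```

Counting complaints per service and day before joining is a deliberate choice here: joining raw complaint rows directly would repeat each mileage row once per complaint and inflate the lost-miles total.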

7 Disparate Databases
• Data is typically stored in disparate databases
  – Various reasons for this:
      Incremental construction of systems
  – Not a problem for day-to-day running and querying but…
• Introduces challenges for Data Analysis
  – Systems introduced at different times
  – Different database engines
  – Different front-ends
  – Different operating systems
  – Different physical locations
  – Different ways of representing data
• These issues are NOT unique to buses

8 OGSA-DAI
• OGSA-DAI
  – Open Grid Services Architecture: Data Access and Integration
  – Potentially provides a solution
  – Need business users to make transition from science to commerce
• Grid middleware:
  – Assists with the access and integration of data from separate data sources via the Grid
  – Represents databases as Grid Services
  – Enables access from other machines in a secure manner

9 FirstDIG Achievements
• Deployment at First South Yorkshire
• Combined two databases to answer real business questions
  – The Customer Contact System
      Microsoft Access
      Information on customer complaints, e.g. time, service, nature
  – The Mileage database
      dBASE IV
      Information on bus mileage, e.g. lost miles
• Produced generic Grid Data Service Browser
  – SQL access including joins across the databases

10 First Grid Data Service Browser

11 Informing Business & Regional Policy: Grid-enabled fusion of global data & ‘local’ knowledge
INWA
http://www.epcc.ed.ac.uk/~inwa/

12 INWA
• An e-Social Science demonstrator
  – Demonstrates how grid technologies can improve business
  – Combining private and public data sources
  – Finance and Telecommunications
• Uses many grid technologies
  – TOG from Sun DCG provides access to remote HPC resource
  – OGSA-DAI provides access control and discovery of distributed heterogeneous data resources
  – FirstDIG grid data service browser provides SQL access to OGSA-DAI enabled resources
  – Globus Toolkit 2 and 3

13 INWA Grid Infrastructure
[Diagram. Labels: EPCC; Curtin; User@Curtin; User@Edinburgh; UK Property data service; Australian Property data service; Globus Grid; FirstDIG; Grid Engine; TOG; Bank; Telco; Bank data; Telco data]

14 References
• EPCC
  – http://www.epcc.ed.ac.uk/
• FirstDIG
  – http://www.epcc.ed.ac.uk/firstdig/
• OGSA-DAI
  – http://www.ogsadai.org.uk
• INWA
  – http://www.epcc.ed.ac.uk/~inwa
• Sun Data & Compute Grids
  – http://www.epcc.ed.ac.uk/sungrid/
• Transfer-queue Over Globus (TOG)
  – http://gridengine.sunsource.net/project/gridengine/tog.html

15 Outstanding issues raised by FirstDIG & INWA

16 Outstanding Issues: Usability
• OGSA-DAI is middleware; the client toolkit helps
• Incorporation of the demo First browser is helpful-ish
But really want …
• Interfaces to real data analysis & DBMS packages, e.g. SPSS
• Otherwise users could end up building applications that replicate these, e.g. the First Grid Data Service Browser
• Want to be able to point Access, Excel, etc. at a grid data source and examine it

17 Outstanding Issues: Data
• CSV (comma-separated value) data sources
  – Are common, but current JDBC-ODBC drivers do not have sufficient functionality (NOT an OGSA-DAI issue per se)
• No support for BIT type field
  – And others, e.g. BOOLEAN, BINARY, etc.
• Certain characters (e.g. &, >) are not handled by the OGSA-DAI XML parser
  – Company names often have & in them
• Dates from certain sources not handled properly
  – First Grid Data Service has to handle this internally
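The character problem above comes from & and > having special meaning in XML, so text containing them has to be escaped before it is embedded in a document. The sketch below is a general illustration using Python's standard library only; it does not reproduce the OGSA-DAI parser or its fix, and the `<value>` element and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def wrap_value(value):
    """Embed a text value in a (hypothetical) <value> element, escaped."""
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    return "<value>" + escape(value) + "</value>"


def parses(xml_text):
    """Report whether a string is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def roundtrip(value):
    """Escape a value, parse the result, and return the recovered text."""
    return ET.fromstring(wrap_value(value)).text


if __name__ == "__main__":
    name = "Smith & Sons"  # invented company name
    print(parses("<value>" + name + "</value>"))  # raw ampersand
    print(parses(wrap_value(name)))               # escaped ampersand
    print(roundtrip(name))
```

A bare ampersand makes the document ill-formed, while the escaped form parses and returns the original text unchanged.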

18 Outstanding Issues: Miscellaneous
• Security
  – Rolemap file is not encrypted
  – If one GDS accesses another GDS, the user's security credentials are not passed on, so the access fails
• Installation & Testing
  – Install & Set-up
      Well explained, but still a fair amount of user effort involved
  – Lack of an example OGSA-DAI site to point at to test that your OGSA-DAI installation works

19 Outstanding Issues: Miscellaneous
• Installation & Testing
  – Lack of an example OGSA-DAI site to point at to test that your OGSA-DAI installation works
• Large result sets
  – Can increase JVM size but this is not scalable
  – This occurred on most datasets
• Integration
  – DQP is a start ….(Linux, OQL)
• Why use OGSA-DAI?
  – Easysoft etc.
  – http://www.easysoft.com/products/2001/main.phtml
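On the large result sets point: the slides do not say how this was or should be addressed. One general approach, shown here purely as an illustration, is to consume results in fixed-size chunks so that memory use depends on the chunk size rather than on the size of the result. The sketch uses Python's DB-API with an in-memory SQLite table; it is not OGSA-DAI code, and the table and function names are hypothetical.

```python
import sqlite3


def iter_rows(cursor, chunk_size=1000):
    """Yield rows from an executed cursor, fetching chunk_size at a time."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield from rows


def make_demo(n_rows):
    """Build an in-memory table holding the integers 0 .. n_rows - 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE numbers (n INTEGER)")
    conn.executemany("INSERT INTO numbers VALUES (?)", ((i,) for i in range(n_rows)))
    conn.commit()
    return conn


def sum_column(conn, chunk_size=1000):
    """Sum the column without ever holding the full result in memory."""
    cursor = conn.execute("SELECT n FROM numbers")
    return sum(row[0] for row in iter_rows(cursor, chunk_size))


if __name__ == "__main__":
    print(sum_column(make_demo(10_000), chunk_size=500))
```

The same pattern applies to any consumer that can process rows incrementally; it does not help when the whole result has to be materialised at once.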

20 Why use OGSA-DAI ? ‘a RDBMS engine that appears to client apps as a fully conformant ODBC 3.5 data source….can be used to provide real-time, heterogeneous access to multiple target data sources.’

