GEODE: Grid Enabled Occupational Data Environment
NeSC workshop, Oct 2006
Paul Lambert and Larry Tan, University of Stirling
www.geode.stir.ac.uk


1 GEODE - NeSC workshop, Oct 2006
GEODE: Grid Enabled Occupational Data Environment
Paul Lambert and Larry Tan, University of Stirling
Project team:
– Paul Lambert, Larry Tan, Ken Turner, & Vernon Gayle, University of Stirling
– Ken Prandy, Cardiff University
– Richard Sinnott, University of Glasgow
– Erik Bihagen, Stockholm University
– Marco van Leeuwen, Intl. Institute for Social History (Amsterdam)

2 The Grid and New Technologies of Data Collection
The Grid and eScience:
1. Online coordination of electronic resources and collaborations (distributed computing)
   – Large scale
   – Collaborative
   – Heterogeneous
2. Standard protocols / information management systems
UK eSocial Science:
1) Investment in assessing / implementing technology
2) Computationally demanding data analysis
3) Qualitative and quantitative data collection technologies
4) **Data sharing, processing and access**

3 GEODE: Survey records occupational data
The importance of occupational micro-data
Collecting occupational data:
1) Initial occupational records (textual description)
2) Processing occupational records
Good practice: preservation of original, OUG and substantive variables
NSIs favour transparent occupational data coding (1) and translation systems (2):
– Text descriptions
– (1) Standardised Occupational Index (e.g. unit group: OUG)
– (2) Substantive occupational summary (e.g. social class code)
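The good-practice point on this slide (keep the original text, the OUG and the substantive variable together) can be illustrated with a small record type. This is only a sketch: the class name, field names and example values are hypothetical and are not taken from GEODE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OccupationRecord:
    """One respondent's occupational data at the three levels named on the slide."""
    text_description: str               # original textual record, kept as collected
    oug: Optional[str] = None           # occupational unit group, added by coding (1)
    social_class: Optional[str] = None  # substantive summary, added by translation (2)

    def coded(self, oug: str) -> "OccupationRecord":
        # Coding adds the OUG without discarding the original text.
        return OccupationRecord(self.text_description, oug, self.social_class)

    def translated(self, social_class: str) -> "OccupationRecord":
        # Translation adds the summary measure and keeps both earlier levels.
        return OccupationRecord(self.text_description, self.oug, social_class)


# All values below are invented for illustration.
record = OccupationRecord("primary school teacher")
record = record.coded("2315")
record = record.translated("II")
```

Because each step returns a new record that carries the earlier levels forward, a later change of coding or translation scheme can be re-applied from the preserved text.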

4 Occupational data collection and processing
(1) Text records to OUG data
Currently:
– Text coding software (e.g. CASCOT)
– Manual look-up
GEODE:
– Linkage to existing resources
– Further facilities possible but not planned (users typically have adequate resources)
(2) OUG data to summary indicators
Currently:
– Numerous aggregate occupational information resources
– Bespoke data programming requirements
GEODE:
– Core provision: management and access of these data resources
– Service to large volumes of users

5 Some illustrative occupational information resources
Resource                                 Index units        # distinct files (average size kb)   Updates?
CAMSIS                                   Local OUG*(e.s.)   200 (100)                            y
CAMSIS value labels                      Local OUG          50 (50)                              n
ISEI tools, home.fsw.vu.nl/~ganzeboom    Int. OUG           20 (50)                              y
E-Sec matrices                           Int. OUG*(e.s.)    20 (200)                             n
Hakim gender seg codes (Hakim 1998)      Local OUG          2 (paper)                            n

6 What's the problem?
Occupational information resources are indexed mainly by Occupational Unit Group (OUG). But…
– Numerous alternative occupational data files (time; country; format)
– Alternative OUG schemes; other index factors (employment status)
– Inconsistent translations to social classifications (by file or by fiat)
– Dynamic updates to occupational data resources
– Low uptake of existing occupational information resources
– Strict security constraints on users' micro-social survey data
[Figure: an external user's micro-social data (id, oug, sex, ...) and an aggregate occupational information index file (oug, CS-M, CS-F, EGP) are combined into the user's output micro-social data (id, oug, CS, ...); example EGP values shown: I, II, VIIa]
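The linkage pictured on this slide, where a user's micro-data are combined with an aggregate index file keyed on OUG, can be sketched as a simple lookup. The column names follow the slide (id, oug, sex, CS-M, CS-F); every value below, and the function name, is invented for illustration and does not come from CAMSIS or GEODE.

```python
# Hypothetical user micro-data: one row per respondent.
survey = [
    {"id": 1, "oug": "100", "sex": "M"},
    {"id": 2, "oug": "100", "sex": "F"},
    {"id": 3, "oug": "999", "sex": "F"},  # OUG that is absent from the index file
]

# Hypothetical aggregate index file: one entry per OUG, with separate
# male (CS-M) and female (CS-F) scores.
index_file = {
    "100": {"CS-M": 55.0, "CS-F": 60.0},
    "200": {"CS-M": 40.0, "CS-F": 45.0},
}


def link_scores(rows, index):
    """Attach a CS score to each respondent by OUG and sex; None when unmatched."""
    linked = []
    for row in rows:
        entry = index.get(row["oug"])
        column = "CS-M" if row["sex"] == "M" else "CS-F"
        score = entry[column] if entry is not None else None
        linked.append({"id": row["id"], "oug": row["oug"], "CS": score})
    return linked


output = link_scores(survey, index_file)
```

In this sketch only the aggregate index file needs to be shared; the micro-data stay with the user, which is one way to respect the security constraint listed above.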

7 GEODE: Grid Enabled Occupational Data Environment
Strategy:
1) Occupational data index service (depository)
   i. Semantic data curation (DDI)
   ii. Data storage (OGSA-DAI)
   iii. Data indexing / access (OGSA-DAI)
2) User-friendly portal access
   – Entry to an international virtual organisation for data depositors and users (GridSphere, GT4, OGSA-DAI)
   – Facilitate linking occupational information to users' datasets (OGSA-DAI) (initial focus on CAMSIS resources)

8 Occupational information depository
1.1) Semantic curation of occupational information
– Establish a GEODE-M meta-data subset (.xml)
– Founded on Michigan Data Documentation Initiative
– Minimise curation requirements
– Web proforma entry [via Portal using GridSphere]
Release date; Country; Time period; Author; Format; Missing data; Data extensions; OUG variable; Other identifier variables; Output variables
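To make the idea of a small .xml meta-data record concrete, the sketch below builds one from the fields listed on this slide. The element names are hypothetical, derived from the slide's field labels; they are not actual DDI or GEODE-M tag names, and the example values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names based on the slide's field labels
# (NOT real DDI or GEODE-M tags).
FIELDS = [
    "release_date", "country", "time_period", "author", "format",
    "missing_data", "data_extensions", "oug_variable",
    "other_identifier_variables", "output_variables",
]


def build_metadata(values):
    """Return an XML string for one resource, refusing incomplete entries."""
    missing = [name for name in FIELDS if name not in values]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    root = ET.Element("geode_m")
    for name in FIELDS:
        ET.SubElement(root, name).text = str(values[name])
    return ET.tostring(root, encoding="unicode")


# Invented example entry, as a depositor might type it into a web proforma.
example = {
    "release_date": "2006-10-01", "country": "GB", "time_period": "2000",
    "author": "A. Depositor", "format": "csv", "missing_data": "-9",
    "data_extensions": "none", "oug_variable": "oug",
    "other_identifier_variables": "sex", "output_variables": "cs_m cs_f",
}
xml_text = build_metadata(example)
```

Keeping the field list short and rejecting incomplete entries is one possible reading of "minimise curation requirements": a depositor fills in a fixed, small form once per resource.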

9 Technical Objectives
Create a virtual community of occupational information researchers
– Gateway for occupational information
– Data abstraction
– Uniform access to resources
Accessible via a portal
Occupational data curation
– Annotation of data using DDI
Occupational matching services
– e.g. linking surveyed data to CAMSIS scores

11 GEODE - Architecture
VO members can deploy their own data services and occupational matching services
– Scalable
– Distributed
Possible application for other types of social science data
– Annotation with DDI
– Custom services can be deployed

12 GEODE - Prototype
– Simple occupational matching services
– VO of Occupational Data Resources
– Portal for searching external resources

13 GEODE - Prototype

14 GEODE - Prototype
Windows environment
Java GridSphere Portal Framework
Globus Toolkit 4
– Index Service (Virtual Organization)
– OGSA-DAI WSRF (Data Access Middleware)
Custom OGSA-DAI resources and activities
Accesses CSV and relational data resources

15 GEODE - Prototype
Data Documentation Initiative
– Annotates the data resources
Occupational Matching Grid Services
– Checks if DDI of target resource is compatible (e.g. category specified matches requirement)
– Maps occupational unit group to data
– Returns mapped/matched results
Demonstration of prototype
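The three steps listed for the matching service (check the annotation, map the OUG codes, return the results) can be sketched as follows. This is not the prototype's code, which the slides describe as Java grid services over OGSA-DAI. Here a plain dictionary stands in for the DDI annotation, and the names `match_occupations`, `oug_scheme` and `SCHEME-A` are hypothetical.

```python
class IncompatibleResource(Exception):
    """Raised when a resource's annotation does not meet the request's requirement."""


def match_occupations(oug_codes, resource, required_scheme):
    """Return (matched, unmatched) for the requested OUG codes."""
    # Step 1: check that the target resource's annotation is compatible.
    scheme = resource["annotation"].get("oug_scheme")
    if scheme != required_scheme:
        raise IncompatibleResource(
            f"resource uses {scheme!r}, request needs {required_scheme!r}"
        )
    # Step 2: map each occupational unit group to the resource's data.
    table = resource["scores"]
    matched = {code: table[code] for code in oug_codes if code in table}
    # Step 3: return the matched results, plus the codes that found no match.
    unmatched = [code for code in oug_codes if code not in table]
    return matched, unmatched


# Invented resource: the annotation dict is a stand-in for a DDI document.
resource = {
    "annotation": {"oug_scheme": "SCHEME-A"},
    "scores": {"100": 55.0, "200": 40.0},
}
matched, unmatched = match_occupations(["100", "300"], resource, "SCHEME-A")
```

Failing early on an incompatible annotation avoids silently attaching scores from the wrong OUG scheme, which relates to the inconsistent-translation problem raised on slide 6.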

16 Future Work
Possible extension of VO to other social science related datasets
– With services
Variety of occupational data analysis services

