IWIR-CRIS '06: Data retrieval in PURE. Data retrieval in the 4-year-old PURE CRIS project at 9 universities.


1 IWIR-CRIS '06: Data retrieval in PURE. Data retrieval in the 4-year-old PURE CRIS project at 9 universities

2 Agenda
■ Overview
■ Retrieval
  - Validated manual data gathering
  - Dynamic integration to local back-end systems
  - Aggregation, enrichment and import of historic data
  - Experiments with automated imports of historic data
■ Exposure
  - Two web services
  - OAI
  - Z39.50
  - Reports
  - Portal framework
■ Archiving
■ Near future
(atira, Niels Jernes Vej 10, DK-9220 Aalborg, +45 9635 6100, www.atira.dk)

3 Overview
■ Brief overview
■ … in order to discuss ingestion, integration, conversion and import in a specific context

4 Overview
■ Brief overview
■ History
  - Development began in 2002
■ Users
  - 9 universities (DK+SE), several hospitals + other research institutions
■ Platform and architecture
  - J2EE enterprise application
  - Release management: all users run instances of the same release version, built from the same code base
■ Business model
  - Commercial software licenses, powerful user group, shared budgets
■ Modular
  - Basic module, Reporting module, Student thesis module, External publications module, Bibliometrics module, Press module

5 Overview

6 Retrieval
■ Manual data gathering
■ User roles/rights + workflow:
  - = de-centralized data gathering
  - = validated data gathering
  - = continuous data gathering
■ GUI example
■ Management focus is necessary
  - Reports and statistics, KPI management, etc.
■ Adding value for researchers is necessary
  - Instantly in Google indexes, instantly updated personal websites, instantly updated CV, increased citations (source in paper), etc.

7 Retrieval
■ Dynamic integration
■ Dynamic integration to local back-end systems:
  - Personnel systems, payroll systems (for data retrieval)
  - LDAPs, Active Directories (for data retrieval + authentication)
  - Single sign-on systems (for authentication)
  - … to automatically create object types such as “person” or “organization”
■ … and yes, PURE hosts data, too
  - We need complete objects according to the meta-data model
■ Plug-in architecture in PURE:
  - Pro = individually adapted integration
  - Con = individually programmed plug-in necessary
  - Future = GUI, standardized plug-ins
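The plug-in idea on this slide can be illustrated with a minimal sketch. It is not PURE's actual plug-in API: the class names `BackendPlugin`, `PayrollPlugin` and `Person`, the function `synchronize`, and the payroll field names are all hypothetical and chosen only for illustration. The point is that each institution supplies its own mapping from a local back-end record to a complete "person" object.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical minimal "person" object; the real meta-data model is richer.
    external_id: str
    name: str
    organization: str


class BackendPlugin(ABC):
    """Hypothetical plug-in contract: one subclass per local back-end system."""

    @abstractmethod
    def fetch_records(self) -> list[dict]:
        """Return raw records from the back-end system."""

    @abstractmethod
    def to_person(self, record: dict) -> Person:
        """Map one raw record to a complete person object."""


class PayrollPlugin(BackendPlugin):
    """Illustrative plug-in for an imagined payroll export (field names assumed)."""

    def __init__(self, rows: list[dict]):
        self._rows = rows

    def fetch_records(self) -> list[dict]:
        return list(self._rows)

    def to_person(self, record: dict) -> Person:
        return Person(
            external_id=str(record["emp_no"]),
            name=f'{record["first"]} {record["last"]}',
            organization=record["dept"],
        )


def synchronize(plugin: BackendPlugin) -> list[Person]:
    # The core only talks to the plug-in contract, never to the back-end itself.
    return [plugin.to_person(record) for record in plugin.fetch_records()]


people = synchronize(
    PayrollPlugin([{"emp_no": 17, "first": "Ada", "last": "Lovelace", "dept": "Mathematics"}])
)
```

The "con" named on the slide shows up directly: every new back-end needs its own hand-written subclass.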

8 Retrieval
■ Import
■ Historic data
■ Many sources
  - More or less useful data
  - More or less consistent use of formats :-)
■ The PXA format
  - PURE XML Archive format, .zip-based
  - Meta-data, relations between entities, binary files
■ Aggregation > enrichment > conversion > import
  - The process is external to PURE
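The slide only says that PXA is a .zip-based archive holding meta-data, relations between entities and binary files. As a rough sketch of what such a container could look like, the example below bundles an XML document and binary payloads into one zip file. The internal layout (`metadata.xml`, a `files/` folder) and the element and attribute names are assumptions made for illustration, not the real PXA schema.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def build_archive(publications, relations, files):
    """Bundle meta-data, relations and binaries into one zip (assumed layout)."""
    root = ET.Element("archive")
    for pub in publications:
        node = ET.SubElement(root, "publication", id=pub["id"])
        ET.SubElement(node, "title").text = pub["title"]
    for source, kind, target in relations:
        # One element per relation between two entities.
        ET.SubElement(root, "relation", source=source, type=kind, target=target)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("metadata.xml", ET.tostring(root, encoding="utf-8"))
        for name, payload in files.items():
            archive.writestr(f"files/{name}", payload)
    return buffer.getvalue()


archive_bytes = build_archive(
    publications=[{"id": "p1", "title": "Data retrieval in PURE"}],
    relations=[("p1", "author", "person1")],
    files={"p1.pdf": b"%PDF-1.4"},
)
```

Keeping the conversion step outside the importing system, as the slide describes, means a converter like this can be rewritten per source without touching the import itself.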

9 Retrieval
■ Experiments
■ Experiments with automated imports of historic data from specific, identified sources
■ [source format] > PXA conversion > import > enrichment/validation
■ Very poor data quality demands the concept of “draft objects” in PURE
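The "draft object" concept can be sketched as an import step that never rejects a record but marks it as a draft when validation finds problems, so that it can be enriched and validated later. The required fields, the status names and the class `ImportedObject` below are assumptions for illustration; the slides do not describe how PURE implements this.

```python
from dataclasses import dataclass, field

# Assumed minimal validation rule set; the real meta-data model is not given in the slides.
REQUIRED_FIELDS = ("title", "year", "authors")


@dataclass
class ImportedObject:
    data: dict
    status: str = "draft"
    problems: list = field(default_factory=list)


def import_record(record: dict) -> ImportedObject:
    """Import a record, keeping it as a draft if it fails validation."""
    problems = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    year = record.get("year")
    if year and not str(year).isdigit():
        problems.append("year is not numeric")
    status = "draft" if problems else "valid"
    return ImportedObject(data=record, status=status, problems=problems)


good = import_record({"title": "A", "year": "2005", "authors": ["X"]})
bad = import_record({"title": "B", "year": "200x"})
```

Recording the list of problems alongside the draft gives the later enrichment/validation step a concrete to-do list per object.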

10 Exposure
■ Web services
■ RPC/encoded + document/literal
■ Rich libraries of methods
■ Including format-specific methods: APA, MLA, HARVARD, VANCOUVER and CBE
■ Free and near-instant addition of methods
■ WS code example (if time)

11 Exposure
■ OAI support
■ OAI-PMH data provider
■ OAI-PMH formats
  - DC
  - DDF-MXD (Danish national format)
  - SVEP (Swedish national format)
  - … more to come
■ Also used to harvest other PURE-repositories for “external publications”
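For readers unfamiliar with OAI-PMH, a harvest request is a plain HTTP GET with a `verb` and, for `ListRecords`, a `metadataPrefix`. The sketch below builds such a URL. The base URL is a placeholder, not a real PURE endpoint; `oai_dc` is the Dublin Core prefix defined by the OAI-PMH specification, while the prefixes a PURE installation uses for DDF-MXD or SVEP are not given in the slides and are not guessed here.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec       # optional selective harvesting by set
    if from_date:
        params["from"] = from_date     # optional lower date bound, e.g. "2006-01-01"
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint used purely for illustration.
url = list_records_url("https://pure.example.org/oai", from_date="2006-01-01")
```

The same request shape serves both directions mentioned on the slide: exposing a repository as a data provider and harvesting other repositories.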

12 Exposure
■ Z39.50
■ Enabling of searches in PURE from library systems
■ SRW/SRU
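SRU, the URL-based sibling of SRW, expresses a search as an HTTP GET carrying a CQL query. A minimal sketch of a `searchRetrieve` request follows. The endpoint is a placeholder and the CQL index `dc.title` is an assumption about what a given server supports; the parameter names come from the SRU 1.1 specification.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_retrieve_url(base_url, cql_query, maximum_records=10, start_record=1):
    """Build an SRU 1.1 searchRetrieve request URL."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and an assumed CQL index, for illustration only.
sru_url = search_retrieve_url("https://pure.example.org/sru", 'dc.title = "data retrieval"')
sru_params = parse_qs(urlsplit(sru_url).query)
```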

13 Exposure
■ Reports
■ PURE reporting module
■ GUI example

14 Exposure
■ Reference manager
■ Export of data to local Reference Manager installation
■ Using RM-formatted export file
■ Promotes registering in the repository rather than in RM
■ GUI example
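The slides do not say which file format the "RM-formatted export file" uses. Assuming it is the tagged RIS format that Reference Manager can import, an export could be sketched as below. The function `to_ris` and the input dictionary keys are hypothetical, and only a handful of RIS tags are shown.

```python
def to_ris(publication: dict) -> str:
    """Render one publication as a RIS record (assumed export format)."""
    lines = ["TY  - JOUR"]  # record type: journal article
    lines += [f"AU  - {author}" for author in publication["authors"]]
    lines.append(f"TI  - {publication['title']}")
    lines.append(f"PY  - {publication['year']}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines) + "\n"


ris_text = to_ris(
    {"authors": ["Jensen, A.", "Nilsson, B."], "title": "An example article", "year": 2005}
)
```

Each RIS line is a two-character tag, two spaces, a hyphen and a space, followed by the value; one `AU` line is written per author.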

15 Exposure
■ Portal framework
■ PUREportal: free PURE-specific framework for custom development of research exhibition portals
■ Online example
■ Typical cost scenario: € 20,000
■ Typical delivery time: 1 month
■ Little need for requirements specification
■ Automatic PURE-API maintenance

16 Archiving
■ Data archiving: 2 levels
■ SQL environment
  - Meta-data and relations
  - Binary files just stored in server file system
■ FEDORA via connector (not PURE-specific, Open Source)
  - Facilitates:
    - Higher quality archival of binary files
    - Long term preservation in general
    - Adoption of PURE in institutions’ general FEDORA strategies
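The first archiving level, meta-data and relations in SQL with binaries on the server file system, can be illustrated with a small sketch. It uses SQLite and an invented three-table schema purely for demonstration; the actual database, schema and directory layout used by PURE are not described in the slides.

```python
import sqlite3
import tempfile
from pathlib import Path

# Invented schema for illustration only.
SCHEMA = """
CREATE TABLE publication (id TEXT PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE relation (source_id TEXT, kind TEXT, target_id TEXT);
CREATE TABLE binary_file (publication_id TEXT REFERENCES publication(id), path TEXT NOT NULL);
"""


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def archive_publication(conn, file_root, pub_id, title, filename, payload, relations=()):
    """Store the binary on disk and only its path, meta-data and relations in SQL."""
    target = Path(file_root) / pub_id / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    conn.execute("INSERT INTO publication (id, title) VALUES (?, ?)", (pub_id, title))
    conn.executemany(
        "INSERT INTO relation (source_id, kind, target_id) VALUES (?, ?, ?)",
        [(pub_id, kind, other) for kind, other in relations],
    )
    conn.execute(
        "INSERT INTO binary_file (publication_id, path) VALUES (?, ?)", (pub_id, str(target))
    )
    conn.commit()
    return target


with tempfile.TemporaryDirectory() as root:
    conn = open_store()
    archive_publication(
        conn, root, "p1", "Data retrieval in PURE", "paper.pdf", b"%PDF-1.4",
        relations=[("author", "person1")],
    )
    row = conn.execute(
        "SELECT path FROM binary_file WHERE publication_id = ?", ("p1",)
    ).fetchone()
    stored_ok = Path(row[0]).read_bytes() == b"%PDF-1.4"
    relation_count = conn.execute("SELECT COUNT(*) FROM relation").fetchone()[0]
```

Because the database holds only a path, the binaries themselves get no integrity or preservation guarantees at this level, which is the gap the slide assigns to the FEDORA connector.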

17 Near future
■ The near future regarding data retrieval
■ More automated imports using increasingly advanced converters
■ Automated data delivery (push and harvest) to:
  - Industry-specific search services (e.g. PubMed, Nordicom)
  - Documentary data collections (such as clinicaltrials.org) and national collections (such as DDF (DK), ForskDok (NO), etc.)
■ Temporary import objects
  - When imported data are not of sufficient quality to create valid objects
  - When data cannot be properly related to other objects upon import

