1 Repositories Update (UK) Peter Burnhill Director, EDINA National Data Centre, University of Edinburgh, Scotland UK JISC/CNI Conference, Edinburgh, 1 & 2 July 2010 Managing Data in Difficult Times
2 Overview policies/strategies/technologies/infrastructure to manage research/teaching Scope –Digital repositories at the level of the institution (for itself), at a level above the campus: for institutions, for UK, for much much more *within the European and wider international context *in support of research, learning & teaching …. and management Having voice as … –a provider of common services and national infrastructure [EDINA] –a user of repository software [Eprints, DSpace, IntraLibrary] –a member of SONEX and indirectly of COAR and UK-CORR and focus on repository-related progress in the UK since last JISC/CNI; where is the value, how this is assessed/expressed? –Size of investment in recent times –Cost-effectiveness and impact of provision *Effort at institutional & inter/national level and the shared services agenda? Wondering what Dorothea said next …
3 Managing Data in Difficult Times Nostalgia for interesting but not difficult times? JISC Repositories & Preservation Programme - April 2006 to March 2009 £14m investment in H.E. repository and digital content infrastructure This included the JISC RepositoryNet, as four support services: Repository Support Project Repository Research Project Intute Repository Search interim repository | Prospero | the Depot | OpenDepot Checking the JISC website today –under the heading of key digital repository activities are 21 funding programmes and 216 funded projects. Including some that are just being awarded … & then there is: OR10: Open Repositories Conference, 6-9 July 2010, Madrid RepoFringe2010: Repository Fringe 2-3 September, Edinburgh and several others
4 R is for Repository What are Repositories? –Facility/technology to support at least three basic types of service: PUT: a service interface that allows one or more use communities to deposit/issue digital content (+ metadata on that content) KEEP: a service that ensures the integrity of that content, for the life of the repository GET: a service interface that allows one or more use communities to search/extract that content *Use community: persons or machines/software; appropriate interface Digital Repositories Review (R.Heery and S.Anderson, 2005) –Digital repository differs from other digital collections in that: *"content is deposited, whether by content creator, owner or third party *architecture manages content as well as metadata; *repository offers a minimum set of basic services [put, get, search, access control] *must be sustainable & trusted, well-supported & well-managed." "a university-based institutional repository is a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access..." (C. Lynch, 2003)
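The PUT / KEEP / GET split on this slide can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the class name SketchRepository, the in-memory storage, the use of SHA-256 checksums for the KEEP integrity check, and exact-match metadata search for GET. It is not the interface of Eprints, DSpace, IntraLibrary or any other product.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Item:
    content: bytes
    metadata: dict
    checksum: str  # SHA-256 recorded at deposit time


class SketchRepository:
    """Hypothetical in-memory repository; not any real product's API."""

    def __init__(self):
        self._items = {}

    def put(self, item_id, content, metadata):
        # PUT: deposit content plus metadata on that content.
        checksum = hashlib.sha256(content).hexdigest()
        self._items[item_id] = Item(content, dict(metadata), checksum)
        return checksum

    def keep(self):
        # KEEP: integrity audit; return ids whose content no longer
        # matches the checksum recorded at deposit.
        return [
            item_id
            for item_id, item in self._items.items()
            if hashlib.sha256(item.content).hexdigest() != item.checksum
        ]

    def get(self, **criteria):
        # GET (search): ids whose metadata matches every criterion exactly.
        return [
            item_id
            for item_id, item in self._items.items()
            if all(item.metadata.get(k) == v for k, v in criteria.items())
        ]

    def fetch(self, item_id):
        # GET (extract): return the deposited content itself.
        return self._items[item_id].content
```

A use community, person or machine, would call put() to deposit, get() and fetch() to search and extract, while keep() would be run periodically by the repository itself.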
5 R is for Repository Who has Repositories and why?
8 R is for Repository What are Repositories and what are they for? –Allowing deposit of and holding all sorts of digital things/stuff *Metadata + Objects; Metadata + pointers; Metadata only *All sorts of objects: images, datasets, theses, articles, etc etc Special interest in serving our central task: –ease & continuity of access to scholarly resources
Ensuring researchers, students and their teachers have ease and continuing access to online scholarly resources (diagram, P.Burnhill, Edinburgh 2009) Use case: article-length work published in e-journals, but other use cases apply Diagram labels: ease | continuing | anytime/place convenience | authorisation | licence to use | projects | usability | open | preservation | post-cancellation | restricted | access to content & services | reliability | well-seamed interoperability | functionality | who/WAYF | authentication
Diagram labels: UK funding councils | JISC Sub-Committees | JISC Collections | acting as platform for network-level services & helping to build the JISC Integrated Information Environment | research, learning & teaching in UK universities & colleges | Research Councils UK | National Data Centres
11 1&2 provider of services & user of software EDINA-run repositories, with and without JISC –DataShare: for research data (institutional, U of Ed) *Open Data; using DSpace –Jorum: for learning materials [with Mimas] *OER and turnstile (UK); using DSpace & IntraLibrary –OpenDepot (the Depot): for research papers *OA (world); using Eprints –ShareGeo: for geo-spatial data *Open Data and turnstile (UK); using DSpace –OA Repository Junction as shared service tool *using own code and Eprints as an 'escrow' repository during the transfer process. –& maybe others … depending on definition of repository
Jorum: for learning materials [with Mimas] OER and turnstile (UK); using DSpace & IntraLibrary
ShareGeo: for geo-spatial data Turnstile (UK) Data & Open Data; using DSpace
15 3. SONEX four individuals in JISC-sponsored mini think-tank –from Denmark, Spain & UK –Mogens Sandfaer, Pablo de Castro (Chair) & Jim Downing (Richard Jones) and Peter Burnhill came out of international workshop Amsterdam, March 2009 –charged with looking at how repositories should inter-operate –the focus group given name of repository handshake –3 other focus groups on citation, identifiers and organisation * the latter an exit strategy for EU-funded DRIVER project? focus switched to deposit opportunities –semi-automatic issue/deposit, under terms of Open Access *concern about risk of hollow ring of repositories *avoid diktat about standards and techno babble –looking to interoperability via SWORD
16 3. SONEX focus switched to deposit opportunities –Initial categorisation of repositories into which authors deposit –Looking to onward interworking/interoperability (SWORD) *Not just technical interoperability but workflow –Role of repository managers But also recognition of other network-attached systems: –Authoring tools *Desktop software –Bibliography tools –Non-Author-based workflows *CRIS *REF
17 SONEX: Scholarly Output Notification & EXchange Re-branded ourselves as SONEX, to signal … –scholarly output, not just research publications –notification using metadata only –exchange as two-way interoperability/negotiation *push metadata; pull content; exploit always-on Internet SONEX use case: multi-person & multi-institutional SONEX activities: –Identify/analyse deposit opportunities (use cases) for ingest into the repository space. –Identify/promote projects tackling deposit use cases –Gap analysis | machine (third party systems) as user (PUT & GET) http://sonexworkgroup.blogspot.com
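The "push metadata; pull content" pattern on this slide can be sketched as follows. This is only an illustration under stated assumptions: the class names SourceRepository and ReceivingRepository, the notification dictionary layout and the "institutions" metadata field are all hypothetical and do not come from SONEX or from any repository software.

```python
class SourceRepository:
    """Hypothetical repository holding the deposited item."""

    def __init__(self, name):
        self.name = name
        self._metadata = {}
        self._content = {}

    def add(self, item_id, metadata, content):
        self._metadata[item_id] = dict(metadata)
        self._content[item_id] = content

    def notification(self, item_id):
        # Notification carries metadata only, never the content itself.
        return {
            "source": self.name,
            "item_id": item_id,
            "metadata": dict(self._metadata[item_id]),
        }

    def pull(self, item_id):
        # Exchange: content is handed over only when a receiver asks.
        return self._content[item_id]


class ReceivingRepository:
    """Hypothetical repository interested in one institution's output."""

    def __init__(self, name, institution):
        self.name = name
        self.institution = institution
        self.records = {}   # metadata-only records
        self.content = {}   # full content, pulled on demand

    def receive(self, notification, source):
        item_id = notification["item_id"]
        self.records[item_id] = notification["metadata"]
        # Pull content only for work associated with this institution.
        if self.institution in notification["metadata"].get("institutions", []):
            self.content[item_id] = source.pull(item_id)
            return True
        return False
```

The design choice shown is that the notification is cheap and can be sent widely, while the receiver decides whether the content is worth pulling.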
18 SONEX Use Case Actors Use case Actor 1: Individual author/researcher [person] author of multi-authored article, other author(s) at other institution(s) sole author with entire career at a single institution [exception] –Variant: author making deposit is the PI of funded research project (compliance with mandate from funder to deposit) –Variant: author making deposit is not the PI of funded research project but work is associated with one or more funded research projects (PI) Use case Actors 2&3: Depositor is not author (Mediated deposit) –Variant: support staff in research group –Variant: Library's own resources and document collections –Variant: Institutional Research Support Systems (CRIS systems) [machine] Use case Actor 4: Repository Manager (RM) of an IR –wishing to be notified & obtain copy from a subject repository (SR) or another IR Use case Actor 5: Publisher (which work is published) [machine] a) deposit under OA of the author's final copy (OA-RJ & PEER projects) b) OA of published copy c) Pointer supply to published copy Other Actors: Vendor of authoring or repository software
19 SONEX Use Case Scenarios Given opportunity, and motivation, to deposit content into the repository space, for onward notification and exchange: 1. PI(s) as co-author *with felt obligation to notify grant funders of OA deposit *via web-based or desktop environment 2. Publisher(s) *assisting their author(s) in supply of full-text into appropriate repositories 3. CRIS, a campus research information system, *managed support for researchers, including note of publications for the Project/Grant 4. Bibliography *web-based publications lists *as maintained by individual researchers, Research Groups, Departments, etc. *including RAE/REF driven institutional actions
20 OA Repository Junction Project m2m broker supports: –Discovery of user & content type –Get /ingest package of data (metadata + digital object) –Deduce /parse data object & deduce target repository(s) –Pass /deposit package into repository targets –Notify /send alert to appropriate 3rd party(s) eg repository managers Working with Publisher and Subject Repository via Broker Service Theo Andrew & Ian Stuart (EDINA)
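The Get / Deduce / Pass / Notify steps listed for the broker can be sketched roughly as below. This is not the OA Repository Junction code: the names Package, TargetRepository, deduce_targets and broker are hypothetical, targets are deduced here simply by matching author affiliations, and an in-memory list stands in for a real deposit (which the earlier slides suggest would go via SWORD).

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    # Get: the ingested package is metadata plus a digital object.
    metadata: dict
    content: bytes


@dataclass
class TargetRepository:
    name: str
    institution: str
    manager_contact: str
    holdings: list = field(default_factory=list)

    def deposit(self, package):
        # Stand-in for a real deposit call into the repository.
        self.holdings.append(package)


def deduce_targets(package, repositories):
    # Deduce: assumed rule, one target per author affiliation that
    # matches a known repository's institution.
    affiliations = {a["institution"] for a in package.metadata.get("authors", [])}
    return [r for r in repositories if r.institution in affiliations]


def broker(package, repositories, notify):
    """Pass the package to each deduced target, then notify its manager."""
    targets = deduce_targets(package, repositories)
    for repo in targets:
        repo.deposit(package)                                          # Pass
        notify(repo.manager_contact, package.metadata.get("title", ""))  # Notify
    return [r.name for r in targets]
```

The notify argument is any callable taking a contact and a title, so an email sender, a log writer or a test stub can be swapped in without changing the broker.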
22 O is for Open OA (for publications) not the only open policy: –OER: Open Educational Resources *UKOER: Jorum and other subject/institutional repositories *Open CourseWare – as open webpages –Open Data *Both repository and open databases; Linked Open Data –Open Source Software Open Access –the regime used for Subject Repositories –seemed to be motive for creation of Institutional Repositories *Green OA self-archiving by authors: Creative Commons Is this how we should judge success of Repositories? –OA now becoming mainstream, including uptake by publishers –"One fifth of 2008's research papers now open access" The Great Beyond, Nature blog, June 25, 2010 Are Repositories the only way to support OA? –Repositories to align themselves with, and support funder-mandates for open access if they are to be successful
24 Informal discussion with JISC programme managers Dealing with institutional processes now, rather than repository technology. Depending on type of content, the projects would fit much more closely in: managing research data programme research information programme open educational resources programme as they have much more in common with those projects than they do with each other. repositories have found their core business proposition via the REF and making sure Universities list research outputs to obtain research ratings - have not succeeded in making the business case that IRs should be doing the job of archiving, a core library platform, or the job of an institutional demonstrator/poster space. Repositories fit in the University Enterprise Stack by virtue of being a system that delivers a business solution to a real financial problem.
26 UK-CORR: UK Council of Research Repositories individual rather than institutional, UKCORRemail@example.com UK has rich heterogeneous repository landscape (C.Awre); lurk following comment from Dorothea Salo: US mainly about OA full texts; UK mainly about … serving research assessment! UKCORRfirstname.lastname@example.org –Is there more to IRs than the REF: lots of bibliographic records & little full text? –Should IRs only accept full text, not metadata only? 1. in absence of a CRIS, our IR had to do REF (Lancaster & Northampton) 2. was OA but then RAE2008, but should aim to include all (OU) 3. motive for IR was digital preservation, with different REF system; funder mandate compliance for OA; visibility via OA (Oxford/Bodleian) 4. RAE/REF is opportunity to engage institution-wide (Warwick) 5. Advent of CRIS (which don't manage outputs well) may be opportunity for IRs to have role, including use of metadata only as lever to obtain full text (Hull) 6. REF & research management information allows IRs to be embedded as platform for OA (Southampton) 7. RAE/REF has different goals to OA and IRs with low % of full text may undermine OA movement (Nottingham)
27 COAR: Confederation of Open Access Repositories New: 1st General Assembly in Madrid in March 2010 48 members drawn largely from Europe, but including both JISC & CNI, and also EDINA (University of Edinburgh) Work Plan for 2010/12, including 1. Advocacy on behalf of OA and repositories (Rs) [both together?] 2. Populating (OA) Rs 3. Best practice documents 4. Facilitate and ensure data interoperability of (across?) Rs 5. interoperability with other systems (such as CRIS systems) 6. Support national helpdesks 7. Guidance on how Rs will form essential elements for global e-infrastructure 8. Promote R manager profession 9. Provide advice & guidance on suitable R infrastructure technologies 10. Global (meta)data store 11. Strategic partner other infrastructure-related initiatives worldwide
28 Managing Data in Difficult [Interesting] Times End of an era? End of the R word? Embedded in domain-specific processes? 1. Moving from technology to policy & practice: some domain-specific, some common to repositories a)Collection management: active curation & Linked relationships * versions, data|article|learning material * Collections, see also b)First point of public issue (availability); Take-down regimes 2. Institutional stewardship responsibility for its born-digital [and digitised] content –"a university-based institutional repository [supports] a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access..." (C. Lynch, 2003) 3. What of the (new) shared services imperative? –Who does what, at what level/scale?
29 Theoretical basis for digital library? Mix of document tradition & computation tradition considerable simplification, … helpful to think … of two traditions, or mentalities, even cultures, co-exist in area of Information Science 1.Approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like 2.approaches based on uses for formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical (as in algorithmic procedures). Michael Buckland, UC Berkeley, 1998 http://people.ischool.berkeley.edu/~buckland/asis62.html
30 Time for me to stop Hoping that I have left some space/place for questions Thank you Acknowledgements Theo Andrew, Pablo de Castro & Robin Rice, Dave Flanders & Andy McGregor
31 Multimedia resources: candidate for repository? platform for search and download of film, video and audio –wide range of subject coverage, including documentary film –Licensed for use in learning, teaching and research Being re-worked as the Digital Media Hub, combining –Film & Sound Online *initial 600 hours of film, digitised for downloading –NewsFilm Online *3000 hours of material from ITN & Reuters *Over 4TBs of clips to download –Release of product from JISC Digitisation programmes *Plus Education Image Gallery of still photography –Visual and Sound Materials Portal project *Discovering all sorts of audio-visual material Special interest for social science as non-print record of the 20th Century: the first A-V century –With new forms of research material to use and to master