Institutional Repositories & Discipline Based Repositories Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005.


1 Institutional Repositories & Discipline Based Repositories Pauline Simpson National Oceanography Centre, Southampton GRADE Kick Off Meeting 28 Sep 2005

2 Outline
- Geospatial data =
- Institutional and Subject Repositories
- Repository choices
- Data Centres
- Possible solutions

3 Geospatial Data
- Scope within GRADE
- Numerical data, raw and analyzed
- Information Products
- Publications
- CD-ROMs, DVDs
- Learning Objects

4 Repositories are spreading because …
- Supplementary to traditional publication
- Do not affect current research publication processes
- Give easy access
- Give rapid access
- Give long-term access
- Increase readership and use of material
- They offer advantages to institutions
- They offer advantages to research funders
- They offer new ways for information to be linked and used

5 Subject/Discipline Based Repositories
- Relies on peer interaction – no mandate
- Individual agreements have to be struck
- No definitive boundaries
- Quality control issues
- Sustainability issues
- Transitory – collection at risk
- Responsibility for preservation
- Issues over the return on the money and effort invested
? A trusted repository? Supported by ….
Subject repositories often managed by an individual for a group

6 Subject repositories are archives which collect and manage material relating to one or more related subject areas. A number currently exist, mainly within science subjects.
Significant subject repositories include many using e-Prints or DSpace software:
- ArXiv - http://xxx.arxiv.cornell.edu/ (physics, mathematics, non-linear science and computer science)
- Cogprints - http://cogprints.ecs.soton.ac.uk/ (cognitive sciences including psychology, neuroscience, linguistics and other related areas)
- CiteSeer - http://citeseer.nj.nec.com/cs (computer science)
- HTP Prints - http://htpprints.yorku.ca/ (history and theory of psychology)
- PubMedCentral - http://www.pubmedcentral.nih.gov/ (US National Library of Medicine's digital archive of life sciences journal literature)
- PhilSci Archive - http://philsci-archive.pitt.edu/ (philosophy of science)
- E-LIS - http://eprints.rclis.org/ (library and information science)
- RePEc (Research Papers in Economics)

7 Institutional Repositories
Freely accessible web-based databases providing access to the full text of scholarly material produced by members of an institution. Digital collections that capture and preserve the intellectual output of the communities.
What are the essential elements?
- Institutionally defined: content generated by the community
- Scholarly content: published articles, books, book sections, preprints and working papers, conference papers, enduring teaching materials, student theses, data-sets, etc.
- Cumulative & perpetual: preserve ongoing access to material
- Interoperable & open access: free, online, global, utilising standards: OAI, Dublin Core etc.
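The interoperability element above (OAI, Dublin Core) can be illustrated with a minimal sketch of how a repository item might be described as an unqualified Dublin Core record in the `oai_dc` format that OAI-PMH uses. The two namespace URIs are the standard ones; the function name `build_dc_record` and the field values are hypothetical placeholders, not taken from any repository discussed in the presentation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the OAI-PMH "oai_dc" metadata format.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Serialise {element name: [values]} as an oai_dc XML string."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; the values are placeholders, not a real deposit.
example_fields = {
    "title": ["Example cruise report"],
    "creator": ["Example, A."],
    "type": ["Text"],
    "date": ["2005-09-28"],
}
record_xml = build_dc_record(example_fields)
print(record_xml)
```

Because every element is optional and repeatable in unqualified Dublin Core, a plain mapping from element names to lists of values is enough for this sketch; a richer discipline-specific schema would need more structure.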

8 Institutional Repositories
Institutions are logical implementers of repositories because they can take responsibility for:
- Centralising a distributed activity
- Framework and Infrastructure
- Permanence that can sustain changes
- Stewardship of Digital assets
- Preservation policy for long term access
- Provide central digital showcase for the research, teaching and scholarship of the institution
a trusted repository supported by the Information Community

9 Institutional Repository Software for geo data
OSI Directory of Institutional Repository Software V.3 http://www.soros.org/openaccess/software/
- E-Prints (GNU) [http://software.eprints.org/]. Open-source OAI-compliant software developed at University of Southampton to enable anyone to set up their own Open Archives-compliant institutional archive. Originally programmed for subject repositories but now re-engineered for IR. Does not identify treatment of datasets, though can cover bibliographic description.
- DSpace: Durable Digital Depository [http://dspace.org/]. Open-source software developed at MIT for their own repository; released as open source software in Nov. 2002. Overtly identifies datasets. Offers opportunity to explore the issues surrounding the incorporation of different metadata standards within one system…. Different disciplines have adopted different sets of metadata standards to accommodate their particular data needs. Two examples are the CSDGM standard for geospatial data and the DICOM standard for digital imaging in medicine. … develop more general standards, such as Dublin Core, which proposes a basic set of common elements that can be used across many different disciplines and document types.
(DC and MARC are norms)

10 https://dspace.ucalgary.ca/handle/1880/33 (need to register to search)

11 http://careo.ucalgary.ca/cgi-bin/WebObjects/CAREO.woa - information products

12 Repository Choices
- Subject: arXiv, Cogprints, RePEc
- Institutional: Southampton, Glasgow, Nottingham (SHERPA), MBA UK
- National: DARE (all universities in the Netherlands), Scotland, British Library (proposal)
- National / Subject: ODINPubAfrica
- International: Internet Archive Universal, OAIster
- Regional: White Rose UK
- Consortia: SHERPA-LEAP (London E-prints Access Project)
- Funding Agency: NIH (PubMed), Wellcome Trust (UK PubMed), NERC
- Project: Public Knowledge Project EPrint Archive
- Conference: 11th Joint Symposium on Neural Computation, May 15 2004
- Personal: peer to peer, web pages etc
- Media Type: VCILT Learning Objects Repository, NTDL (Theses)
- Publisher: journal archives
- Data Repositories/Archives: NODC, BODC, DOD, JODC, BADC etc
  - Science, particularly Environmental Science, is well served
  - Logical host for numeric datasets

13 Data Centres / Archives / Repositories
- Within organisational infrastructures but not defined by it
- National responsibilities
- Subject and Technical Specialists, quality control of content
- Secure storage and migration policies
- Well developed Metadata schema & Standards
  - DIF – Directory Interchange Format, FGDC etc
  - ISO 19115
    - the minimum set of metadata required to serve the full range of metadata applications (data discovery, determining data fitness for use, data access, data transfer, and use of digital data);
    - optional metadata elements, to allow for a more extensive standard description of geographic data, if required;
    - a method for extending metadata to fit specialized needs.
Though ISO 19115:2003 is applicable to digital data, its principles can be extended to many other forms of geographic data such as maps, charts, and textual documents as well as non-geographic data.
a trusted repository supported by the Data Management Community
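The three-part ISO 19115 structure listed above (a minimum set, optional elements, and an extension method) can be sketched as a simple record type. This is an illustration under stated assumptions only: the element names in `CORE_ELEMENTS`, the class `GeoMetadataRecord` and all sample values are hypothetical stand-ins, and the standard itself remains the authority on which elements are mandatory.

```python
from dataclasses import dataclass, field

# Illustrative core element names only; consult ISO 19115 itself for the
# authoritative list of mandatory elements.
CORE_ELEMENTS = ("title", "reference_date", "abstract", "topic_category")


@dataclass
class GeoMetadataRecord:
    core: dict                                      # minimum set
    optional: dict = field(default_factory=dict)    # fuller standard description
    extensions: dict = field(default_factory=dict)  # community-specific additions

    def missing_core(self):
        """Return the core element names that are absent or empty."""
        return [name for name in CORE_ELEMENTS if not self.core.get(name)]

    def is_valid(self):
        """A record is acceptable once every core element is filled in."""
        return not self.missing_core()


# Hypothetical records with placeholder values.
record = GeoMetadataRecord(
    core={"title": "Example CTD dataset", "reference_date": "2005-09-28",
          "abstract": "Placeholder abstract.", "topic_category": "oceans"},
    optional={"spatial_resolution": "1 km"},
    extensions={"cruise_id": "EXAMPLE-001"},
)
incomplete = GeoMetadataRecord(core={"title": "Untitled"})
print(record.is_valid(), incomplete.missing_core())
```

Keeping extensions in their own mapping means a data centre could add discipline-specific fields without touching the validation of the core set.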

14
- ARCHIMEDE: A Canadian software solution for institutional repositories [http://archimede.bibl.ulaval.ca/di/Welcome.do]. OAI compliant software developed by Laval University Library. Archimede has been developed in a multilingual perspective, with internationalization as a focus. The text (or content) of the interface is independent and not embedded in the code, making it relatively easy to develop an interface in a specific language without having to work on the code itself. English, French and Spanish interfaces are already offered in Archimede. That feature also allows the user to switch easily from language to language anywhere and anytime during the search and retrieval process.
- Berkeley Electronic Press [http://www.bepress.com/repositories.html]. Commercial OAI-compliant software used by the University of California's eScholarship Repository.
- CERN Document Server Software (CDSware) [http://cdsware.cern.ch/]. OAI compliant software developed by, maintained by, and used at, the CERN Document Server.
- Project Tapir [http://sourceforge.net/projects/tapir-eul]: Tapir provides additional functionality to digital asset management software DSpace, primarily designed for Electronic Theses and Dissertations supervision, submission and dissemination. See Queen's University Project.
- Fedora Project: An Open-Source Digital Repository Management System [http://www.fedora.info/]. Jointly developed by the University of Virginia and Cornell University, Fedora is a general-purpose digital object repository system that can be used in whole or part to support a variety of use cases including: institutional repositories, digital libraries, content management, digital asset management, scholarly publishing, and digital preservation.
- Greenstone [http://www.greenstone.org/cgi-bin/library?a=p&p=home]. Suite of open-source multilingual software for building and distributing digital library collections. Produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed (since 2000) in cooperation with UNESCO and the Human Info NGO. Presently in limited use at New Zealand Digital Library Project and some other sites.
- OCLC Research Software [http://www.oclc.org/research/software/default.htm]. A list of open source software developed by the Online Computer Library Center (OCLC) to build a repository and harvest data according to OAI-PMH standards.
- FIGARO, i-TOR, etc

15 Dilemma for Researcher
- Mandates from major funding agencies now require grantees to deposit research output in a designated repository or any
  - Wellcome Trust (UK PubMed) - £400 million producing 3500 papers per year
  - RCUK
- Where should the full text of their research be deposited?
- Researcher wants to enter metadata and deposit only once and perhaps deposit all related material in one place?
- Situation at present
  - Harvesting, but harvester is not the choice of the depositor
  - Duplicate keying metadata into repositories of choice
  - Cannot target multiple repositories with one exercise
- Does it matter where it is deposited since Google Scholar, Yahoo, Scopus, will pick it up wherever it is?

16 Repositories taking over the world?
- Turf War
  - Not between Institutional and Subject Repositories – complementary and should coexist
  - Possibly between Text based and Numeric based repositories
  - Repositories of whatever flavour v. Data Centres
  - Are both spilling over into each other's territory?
- The Cavalry: JISC Digital Repositories Programme
  - Strand: Linking Text and Data

17 [Diagram: learning & teaching workflows and research & e-Science workflows feed repositories (institutional, e-prints, subject, data, learning objects) through deposit / self-archiving; peer-reviewed publications (journals, conference proceedings) and quality assurance bodies provide validation; aggregator services harvest metadata; institutional and subject presentation services (portals, Learning Management Systems) support resource discovery, linking and embedding.] From: Lyon: CNI - JISC - SURF Conference, May 2005

18 CLADDIER Project** (Citation, Location And Deposition in Discipline and Institutional Repositories)
The CLADDIER system will be a step on the road to a situation where (in this case, environmental) scientists will be able to move seamlessly from information discovery (location), through acquisition, to deposition of new material, with all the digital objects correctly identified and cited. The lessons learned will be applicable to the relationships between other discipline based repositories and institutional repositories.
**JISC Digital Repositories Programme 2005 -

19 Persistent identifiers semantically transparent Versioning Dataset Citations Publishing practice Automated Linking both ways

20 Where to Deposit
- One outcome of CLADDIER Project
  - pull = Harvesting
  - push = CLADDIER outcome
- Enable researcher to deposit in one repository and choose to upload (push) the metadata to another repository of choice.
- Logical to push from IR to Subject?
- Redundancy of records?
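The pull/push distinction above can be sketched roughly as follows. The `list_records_url` helper builds a standard OAI-PMH `ListRecords` request, which is how harvesting (pull) is normally initiated; the `Repository` class, its methods, the identifiers and the base URL are hypothetical and only model the idea in memory. In particular, `push_to` is an assumption about what a push could look like, not a description of what CLADDIER delivered.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL (the 'pull' side)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


class Repository:
    """Toy in-memory repository holding metadata records by identifier."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def deposit(self, identifier, metadata):
        self.records[identifier] = dict(metadata)

    def harvest_from(self, source):
        """Pull: the harvester decides to copy everything the source exposes."""
        for identifier, metadata in source.records.items():
            # setdefault keeps an existing copy, avoiding redundant records.
            self.records.setdefault(identifier, dict(metadata))

    def push_to(self, target, identifier):
        """Push: the depositor chooses one record and one target repository."""
        target.deposit(identifier, self.records[identifier])


institutional = Repository("institutional")
subject = Repository("subject")
aggregator = Repository("aggregator")

institutional.deposit("example:1", {"title": "Example paper"})
institutional.deposit("example:2", {"title": "Example dataset"})

aggregator.harvest_from(institutional)       # pull: all exposed records
institutional.push_to(subject, "example:1")  # push: one chosen record

print(list_records_url("http://repository.example.org/oai"))
print(sorted(aggregator.records), sorted(subject.records))
```

Keying records by a shared identifier is one simple way to approach the redundancy question raised on the slide: a repository that already holds an identifier does not store a second copy when it is harvested again.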

21 Thank You
Pauline Simpson (ps@noc.soton.ac.uk)

22 Data Centres
- Discovery metadata - What data sets hold the sort of data I am interested in? This enables organisations to know and publicise what data holdings they have.
- Exploration metadata - Do the identified data sets contain sufficient information to enable a sensible analysis to be made for my purposes? This is documentation to be provided with the data to ensure that others use the data correctly and wisely.
- Exploitation metadata - What is the process of obtaining and using the data that are required? This helps end users and provider organisations to effectively store, reuse, maintain and archive their data holdings.

