Institution update KB DK


Institution update KB DK NAS workshop Vienna, April 2017 Sabine and Tue

Netarchive 2015-2017 highlights: 10-year anniversary; full-text search; mapping special collections; Web Danica; upgrading to Heritrix 3; broad crawl analysis; new collection strategy; social media collection strategy; studies on content collection via API; image search (pilot project); using Archive-It and testing Brozzler; ISO statistics; access using Citrix; Secure Research Bitarchive Instance; OAI harvesting of research libraries; a new e- and audio-books workflow; archive compression on the way.

10-year anniversary. Preparation: gathering lots of information on Netarchive; tables and information sheets.

Full-text search. Solr search. Wayback: https://netarkiv-wayback.kb.dk/Citrix/KBWeb/ Possibly a demonstration?

Mapping special collections

Web Danica

Upgrading to Heritrix 3

New collection strategy. Now / Before. We will talk about the details later in this workshop.

Social media collection strategy: analysis; choice in process; training of the curators; external partners (e.g. journalists).

Studies on Facebook collection via API. Digital Footprints: tool for researchers; needs user consent, even for open profiles; further development needed. Whale: can collect all active, open, Danish Facebook profiles and posts; license (999€/month); cannot solve the problem of being blocked by Facebook.

Image search (pilot project). Shine. MimeTypeSearch (more information is missing).

Using Archive-It and testing Brozzler. We are using Archive-It to: test sites we have problems with in NAS; harvest a limited number of Facebook profiles; download the harvested WARC files and preserve them locally. We have tested Brozzler successfully and want to do more integration.

ISO statistics. Based on ISO/TR 14873 (https://www.iso.org/standard/55211.html).

Access using Citrix. Part of the Royal Danish Library's common Citrix platform (about 40 concurrent users during a workday). Different access restriction setups (e.g. Researcher from Home / Researcher only in Reading Room) using Citrix GPOs (Group Policy Object templates and Active Directory groups). Using a given browser (IE v.11) with a given proxy, plugin setup and workspace (plan to change to Chrome). Plans for digital reference, e.g. Zotero integration.

Secure Research Bitarchive Instance. Batch jobs are validated and executed by the operational manager in a secure environment, using a separate NAS instance. Extracted data is exported to a secure researcher processing server.

OAI harvesting of research libraries. Based on https://www.openarchives.org/
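
As a rough illustration of what OAI-based harvesting involves, here is a minimal Python sketch of paging through an OAI-PMH `ListRecords` response. The slide does not say which verbs, metadata formats or endpoints are used, so the base URL `https://example.org/oai`, the `oai_dc` prefix and both function names are assumptions made only for this example.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url: str,
                           metadata_prefix: str = "oai_dc",
                           resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL (hypothetical helper).

    In OAI-PMH a resumptionToken is an exclusive argument: when it is
    present, only the verb and the token are sent.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text: str) -> Tuple[List[str], Optional[str]]:
    """Return the record identifiers on one page and the next token, if any."""
    root = ET.fromstring(xml_text)
    identifiers = [el.text for el in root.iter(OAI_NS + "identifier")]
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    # An empty resumptionToken element marks the last page.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token
```

A harvester would fetch the first URL, call `parse_page` on the response, and repeat with the returned token until it is `None`.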

A new workflow for e- and audio-books. Daily deduplicated extracts of 85% of all e-/audio-books through one aggregator. Feeds into a new structured workflow. Testing GoAnywhere for secure exchange of data. Facilitating upload of 15% from personal publishers. Enhanced with metadata from the National Bibliography aggregator. Integrated with the library system.

Archive compression (on the way). All new harvests have been gzipped since February 2017 (without deduplication). Gzipping of the old part is expected to be finished in autumn 2017. Basically using JWAT, recreating new CDXs and "middleware" data files for creating revisits, creating new metadata files, and facilitating creation of new non-deduplicated CDX indexes (8 TB). The precompress job of 800 TB is expected to take 7-14 days for the distributed archive in CPH.
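
To show why compressing an archive means recreating the CDX indexes, here is a small Python sketch of record-by-record gzip, the usual convention for compressed WARC files. It is not the JWAT-based job described on the slide: the records are stand-in byte strings, and the function names are hypothetical.

```python
import gzip
from typing import List, Tuple


def compress_records(records: List[bytes]) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Gzip each record as its own member and concatenate the members.

    Returns the compressed blob plus an (offset, length) pair per record,
    which is the kind of position information a CDX line carries.
    """
    out = bytearray()
    index = []
    for record in records:
        member = gzip.compress(record)
        index.append((len(out), len(member)))  # offsets refer to the compressed file
        out += member
    return bytes(out), index


def read_record(blob: bytes, offset: int, length: int) -> bytes:
    """Decompress a single record without touching the rest of the file."""
    return gzip.decompress(blob[offset:offset + length])
```

Because every offset now points into the compressed file, any index built against the uncompressed files no longer matches and has to be rebuilt.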