A Weekend with Nanite: Large scale characterisation of web archives. Per Møldrup-Dalum, State and University Library. SCAPE Information Day, State and University Library, Denmark, 2014-06-25.

Presentation transcript:

Per Møldrup-Dalum, State and University Library. SCAPE Information Day, State and University Library, Denmark. A Weekend with Nanite: Large scale characterisation of web archives.

Agenda: a short introduction to the experiment; a live demonstration; a look at the data for characterisation; a look at the input for the job; running the job; analysis of the output and of the run itself. This work was partially supported by the SCAPE Project. The SCAPE project is co-funded by the European Union under FP7 ICT‐ (Grant Agreement number ).

Task at Hand: performance-testing the tools. SCAPE User Story: As a Web Archive, I need a Digital Preservation System that can process both ARC and WARC files and identify the file formats of, and characterise, the items they contain, so that I can assess preservation risks and plan which tools will be required for access to those formats.
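To make the user story concrete, here is a minimal sketch of walking the records in a WARC stream and tallying the content types they declare. It is not how Nanite works internally: it assumes an uncompressed stream, it uses hand-built `resource` records that omit fields a real WARC file requires (such as `WARC-Record-ID` and `WARC-Date`), and the names `iter_warc_records` and `make_record` are hypothetical helpers invented for this illustration.

```python
import io
from collections import Counter


def iter_warc_records(stream):
    """Yield (headers, payload) for each record in an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # skip the blank lines that separate records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected record start: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the header block
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        # The record block is exactly Content-Length bytes long.
        payload = stream.read(int(headers["content-length"]))
        yield headers, payload


def make_record(content_type, payload):
    """Build one simplified 'resource' record (illustrative test data only)."""
    head = (
        "WARC/1.0\r\n"
        "WARC-Type: resource\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + payload + b"\r\n\r\n"


sample = (
    make_record("text/html", b"<html></html>")
    + make_record("image/png", b"\x89PNG\r\n\x1a\n")
    + make_record("text/html", b"<p>hi</p>")
)

# Tally the content types as declared in the record headers.
declared = Counter(
    headers.get("content-type", "unknown")
    for headers, _payload in iter_warc_records(io.BytesIO(sample))
)
print(declared.most_common())
```

A declared type is only a claim made when the record was written, which is why the user story asks for identification of the content itself. Real archives are also usually gzip-compressed and contain `response` records whose block includes HTTP headers, which this sketch does not handle.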

Tools at Hand: Apache Tika; DROID from The National Archives; (libmagic). Not a word on FITS...
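Identification tools of this kind rely, among other techniques, on matching known byte signatures ("magic numbers") at the start of a file. The sketch below illustrates only that basic idea with a deliberately tiny signature table. It is not the actual logic of Tika, DROID, or libmagic, and the names `SIGNATURES` and `identify` are hypothetical.

```python
# Leading-byte signatures for a few common formats (a deliberately tiny table).
SIGNATURES = [
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
]


def identify(data):
    """Return a MIME type guessed from the first bytes of `data`."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    # No signature matched: fall back to the generic binary type.
    return "application/octet-stream"


print(identify(b"%PDF-1.4\n..."))
print(identify(b"GIF89a\x01\x00\x01\x00"))
print(identify(b"plain text without a signature"))
```

Production tools go well beyond this, for example with signatures at offsets other than zero and container inspection, so results from a table like this should be treated as a rough first guess.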

Nanite: created and maintained by the British Library; improved by SCAPE and sustained by the Open Planets Foundation; Tika and libmagic support added; advanced Tika support through a "persistent" Tika server; ARC header extraction added; more to come…

References: SCAPE User Story for web archive data: http://wiki.opf-labs.org/display/SP/File+Format+Identification+and+Characterisation+of+Web+Archives; Nanite; A Weekend With Nanite blog post: weekend-nanite; Open Planets Blogs.