NetarchiveSuite Sabine Schostag The Netarchive

Presentation transcript:

How we use NetarchiveSuite
Questions and answers on NetarchiveSuite:
- Lifecycle: What aspects of the web archiving life cycle model does the tool cover? What aspects of the model would you like to/do you intend to build into the tool? What functionality does the tool provide that isn't reflected in the model?
- Development: What resources are committed to the tool's ongoing development? What are major features in the roadmap? Is the code open source?
- Adoption: What is the user base for the tool? How environment-specific is the tool as opposed to readily reusable by other organizations?
- Functionality: What are the tool's unique features? What are its shortcomings?

NetarchiveSuite Lifecycle
What aspects of the web archiving life cycle model does the tool cover?
What aspects of the model would you like to/do you intend to build into the tool?
- Extended documentation, search functions, time schedules ≤ 1 hour
What functionality does the tool provide that isn't reflected in the model?
- Time schedules min: once an hour – max ??

NetarchiveSuite
Development:
What resources are committed to the tool's ongoing development?
- 2,6 MP
What are major features in the roadmap?
- Technical improvements
- Upgrade to or support Heritrix 3
- Replacing current NetarchiveSuite Archive module
- Better integration of documentation
Is the code open source?

NetarchiveSuite
Adoption:
What is the user base for the tool?
How environment-specific is the tool as opposed to readily reusable by other organizations?
Even though the NetarchiveSuite software is developed in Java, and is therefore mostly platform-independent, we do have a couple of external calls to the Unix sort command. The parts of our software that use this external command therefore run only on Linux/Unix, or on Windows with Cygwin installed.
See installation manual:
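
The dependency on an external sort command can be illustrated with a minimal sketch. It is written in Python for brevity and is not taken from NetarchiveSuite, which is Java; the function names find_sort_command and sort_file_externally are hypothetical. It assumes a POSIX-style sort that accepts the -o option, as found on Linux/Unix or under Cygwin.

```python
import os
import shutil
import subprocess


def find_sort_command():
    """Return the path of an external `sort` executable on PATH, or None."""
    return shutil.which("sort")


def sort_file_externally(in_path, out_path):
    """Sort the lines of in_path into out_path by calling the external sort.

    Hypothetical helper for illustration only. It assumes a POSIX-style
    sort (Linux/Unix, or Cygwin on Windows); a native Windows sort.exe
    may not accept the same options.
    """
    sort_cmd = find_sort_command()
    if sort_cmd is None:
        raise RuntimeError("no external 'sort' command found on PATH")
    # LC_ALL=C requests plain byte-wise ordering, independent of the user's locale.
    env = dict(os.environ, LC_ALL="C")
    subprocess.run([sort_cmd, "-o", out_path, in_path], check=True, env=env)
```

Calling sort_file_externally("in.txt", "out.txt") on a machine without a sort executable raises an error instead of silently producing unsorted output, which mirrors why these parts of the software are tied to Linux/Unix or Cygwin.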

NetarchiveSuite
Functionality:
What are the tool's unique features? What are its shortcomings?
- Multifaceted application
  Selective Harvests
  Snapshot Harvests
  Domains
  Schedules
  Extended fields
  Heritrix GUI Access
  Global Crawler Traps
  Harvest History
  Harvester Templates
  Quality Assurance
  System State
  Bit Preservation
See:

NetarchiveSuite
Netarchive use of NAS / overview
- Broad crawls
- Selective crawls
  "Selective crawls"
  Event crawls
  Special crawls (e.g. at a scholar's request)
  Focused crawls: social media (special templates), very big sites, ...