IIPC GA Curator Tools Fair, May 2014
WEB CURATOR TOOL
Nicola Bingham, Web Archivist

www.bl.uk

Introduction
Jointly developed by the BL and NLNZ in 2006 under the auspices of the IIPC
WCT manages the selective web harvesting process
Designed for use in libraries by non-technical users
Open source
Uses the Heritrix web crawler

What it does and doesn’t do
Appraisal and selection: choosing websites for capture
  – Subject specialists, curators, external agencies
  – BL uses a selection permission tool plugged into WCT
Metadata/Description
  – Basic Dublin Core metadata
  – Titles, description, subject and collection tagging
Scoping and data capture
  – Scheduling
  – Crawl parameters, e.g. path depth, size of download
QA and analysis
  – Heritrix log files
  – Browse tools
  – Recommendations based on indicators
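
As an illustration of the "Basic Dublin Core metadata" point above, the sketch below serialises a made-up target description as simple Dublin Core XML. The slides do not show how WCT stores or exports its metadata, so the function name `target_to_dublin_core`, the wrapper element `metadata`, and the example values are all assumptions; only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element set namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def target_to_dublin_core(target: dict) -> str:
    """Serialise a hypothetical target description as simple Dublin Core XML.

    This is not WCT's own export format; it only shows what title,
    description and subject fields look like as Dublin Core elements.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for field in ("title", "description", "subject"):
        values = target.get(field, [])
        if isinstance(values, str):
            values = [values]  # allow a single string or a list of strings
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{field}")
            element.text = value
    return ET.tostring(root, encoding="unicode")


# Made-up example values, for illustration only.
example_target = {
    "title": "Example Site",
    "description": "A placeholder website selected for capture.",
    "subject": ["Web archiving", "Libraries"],
}

xml_text = target_to_dublin_core(example_target)
print(xml_text)
```

Repeating `dc:subject` once per tag, rather than joining tags into one string, keeps each subject term separately indexable.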

What it does and doesn’t do (continued)
Storage and organisation
  – WARC files created in WCT
  – Passed out of WCT for indexing and long-term storage
Access/Use/Reuse
  – Wayback is plugged in as the access tool
  – Harvested sites can be viewed within the tool
Risk management
  – Harvest Authorisation module, rights metadata
  – Records the outcome of publisher communications
  – Control the display of Targets
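
To make the Wayback access point above more concrete, here is a minimal sketch of the usual Wayback-style replay URL: a base address, a 14-digit UTC timestamp, then the original URL. The slides do not describe how WCT itself calls Wayback, so the helper name `wayback_replay_url` and the base address `http://localhost:8080/wayback` are assumptions for illustration.

```python
from datetime import datetime, timezone


def wayback_replay_url(base: str, captured_at: datetime, original_url: str) -> str:
    """Build a Wayback-style replay URL: <base>/<YYYYMMDDhhmmss>/<original URL>."""
    # Wayback timestamps are expressed in UTC with 14 digits.
    timestamp = captured_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{base.rstrip('/')}/{timestamp}/{original_url}"


# Hypothetical Wayback endpoint and capture time, for illustration only.
replay_url = wayback_replay_url(
    "http://localhost:8080/wayback",
    datetime(2014, 5, 19, 12, 0, 0, tzinfo=timezone.utc),
    "http://www.example.org/",
)
print(replay_url)
# prints http://localhost:8080/wayback/20140519120000/http://www.example.org/
```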

Development
Latest version 1.6.1 available now
New UI features and improvements (x17) including…
  – Date pickers for date fields
  – Scheduling heat map
  – Harvest optimisation
Bug fixes (x11)
Development related, e.g.
  – No longer need to install Apache Tomcat server or database etc.
NLNZ budgeted NZD 50,000 for 2014-15
Open development process up to all WCT users
  – WCT pages: http://webcurator.sourceforge.net/
  – Wiki: http://sourceforge.net/projects/webcurator/ (code, support, mailing lists, bug tracker)

Thank you
UK Web Archive: http://www.webarchive.org.uk
http://britishlibrary.typepad.co.uk/webarchive/
@UKWebArchive
Nicola.Bingham@bl.uk