NetarchiveSuite Meeting, Tallinn, 29./30.01.2015
Austria: Updates and Plans for 2015
Michaela Mayr, Andreas Predikaka
Austrian National Library

Presentation transcript:


Harvesting 2014
Ongoing Collections:
–Media (since 2011)
–Politics (since 2013) incl. 1 regional election
Olympic Winter Games Sochi
–3 seeds daily, 96 seeds weekly
EU elections
–132 seeds daily, 33 seeds weekly
World War I
–151 seeds
Budget = 2 TB

Harvesting 2015
Ongoing Collections:
–Media (since 2011)
–Politics (since 2013) incl. 4 regional elections
4th Broad Crawl
–New TLDs .wien, .tirol
–ARC format, NAS 4.4, PostgreSQL
Eurovision Song Contest
Content behind paywalls?
Budget = 10 TB

Statistics
Approximately 1.4 million domains
60 TB raw / 30 TB compressed
2 billion files
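The statistics above imply a few derived figures that are not stated on the slide. The following is a minimal sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes) and treating the rounded slide values as exact; the function name and dictionary keys are illustrative only.

```python
# Figures quoted on the Statistics slide (rounded values, taken as exact here).
TB = 10**12                      # assumption: decimal terabytes
RAW_BYTES = 60 * TB              # 60 TB raw
COMPRESSED_BYTES = 30 * TB       # 30 TB compressed
FILES = 2_000_000_000            # 2 billion files
DOMAINS = 1_400_000              # approx. 1.4 million domains


def archive_ratios(raw, compressed, n_files, n_domains):
    """Return simple derived ratios for an archive of the given size."""
    return {
        "compression_ratio": raw / compressed,
        "avg_raw_bytes_per_file": raw / n_files,
        "avg_files_per_domain": n_files / n_domains,
    }


ratios = archive_ratios(RAW_BYTES, COMPRESSED_BYTES, FILES, DOMAINS)
for name, value in ratios.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions the archive compresses at roughly 2:1, the average file is about 30 KB uncompressed, and each domain contributes on the order of 1,400 files.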

Access
Prototype for online search interface (no access to data)
–Improved search possibilities (partial full-text search of selected seeds)
–User tracking (in-house, online) and data handling with ELK stack (Elasticsearch, Logstash, Kibana)
External access for 4 libraries

NAS & other tech stuff
Notification Tool (for selective crawls)
NAS Release tests
File Format Identification (DROID, as part of ONB risk management)

NAS & other tech stuff
Hadoop
–Responsibilities changed
–Problem solving in progress
To do until broad crawl (03/15):
–Database Migration MySQL to PostgreSQL
–Switch to NAS 4.4
Switch to OpenWayback