

BnF projects and priorities

2014

On the collection side
- Perform broad and focused crawls with a maximum of 100 TB
- Set up the legal deposit of ebooks (not the same IT team)
- Design an organization for harvesting news websites (PDFs)

On the preservation / back office side
- Move to the WARC format
- Replace the Petabox architecture
- Improve the performance of ingest in SPAR (BnF digital repository)

On the access side
- Give access in regional libraries (start with 3)
- Launch a data mining project around WWI

On the international side
- Contribute to NAS (NetarchiveSuite) and Wayback development
- Release BCWeb as open source

2015

On the collection side
- Perform broad and focused crawls with a maximum of ? TB
- Legal deposit of ebooks (not the same IT team)
- Extend the number of news websites (PDFs) / experiment with digital newspapers deposit?
- Crawl YouTube and Vimeo?

On the preservation side
- Ingest "historical" web archive collections in SPAR

On the access side
- Extend access to web archives in regional libraries
- Redesign indexing processes: better search, full-text search (???)