Building Collections on the Web (BCWeb)



What is BCWeb?
BCWeb was developed entirely by the BnF for its content curators, to replace the old selection tools and the Excel spreadsheet.
It is based on the organization of the BnF:
– A network of librarians (content curators) who select websites
– The BnF's use of NetarchiveSuite (NAS)
– The organization of the collections

The network of content curators
BCWeb is used by content curators inside the BnF and by external partners:
– regional libraries
– university libraries
– research laboratories...

BCWeb: a tool to select websites

The curator defines the scope of the crawl

The technical settings
The content curator chooses several technical settings, in particular:
– The type of crawl: broad or selective
– The budget: the number of URLs collected per website (set according to the website's size)
– The frequency
– The depth
The lists of websites are then transferred to NetarchiveSuite (NAS).
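The four settings listed above can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names and the example labels for frequency and depth are assumptions made for this example, not BCWeb's actual data model.

```python
from dataclasses import dataclass
from enum import Enum


class CrawlType(Enum):
    """The two crawl types named on the slide."""
    BROAD = "broad"
    SELECTIVE = "selective"


@dataclass
class TechnicalSettings:
    """Hypothetical container for the settings a curator chooses."""
    crawl_type: CrawlType
    budget: int      # number of URLs collected for the website
    frequency: str   # assumed label, e.g. "monthly"
    depth: str       # assumed label, e.g. "whole site"

    def validate(self) -> None:
        # A budget is a count of URLs, so it has to be positive.
        if self.budget <= 0:
            raise ValueError("budget must be a positive number of URLs")


settings = TechnicalSettings(
    crawl_type=CrawlType.SELECTIVE,
    budget=5000,
    frequency="monthly",
    depth="whole site",
)
settings.validate()
```

Keeping the budget as a plain URL count mirrors the slide's definition; how BCWeb actually encodes frequency and depth is not stated in the presentation.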

BCWeb settings are related to NAS settings

The transfer is done by the digital legal deposit team

A record also contains…
– Management information: the creator, the manager of the record…
– A status: the curator can deactivate a record to stop its harvest
– Additional URLs, which the curator can add
– A description
– Notes, which the curator and the digital legal deposit team can add (on QA, for example)
– A theme, assigned by the curator
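As a rough illustration of how these fields fit together, the sketch below models a record and shows how deactivating it would keep its URLs out of the next harvest. All names here (`Record`, `seeds_to_harvest`, the field names, the example URLs) are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """Hypothetical BCWeb-style record for one selected website."""
    url: str
    creator: str
    manager: str
    theme: str = ""
    description: str = ""
    additional_urls: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    active: bool = True

    def deactivate(self) -> None:
        # A deactivated record stays in the tool but is no longer harvested.
        self.active = False


def seeds_to_harvest(records: List[Record]) -> List[str]:
    """Collect the main and additional URLs of every active record."""
    seeds: List[str] = []
    for record in records:
        if record.active:
            seeds.append(record.url)
            seeds.extend(record.additional_urls)
    return seeds


blog = Record("http://example.org/", creator="curator_a", manager="curator_a",
              additional_urls=["http://example.org/archive/"])
news = Record("http://example.com/", creator="curator_b", manager="curator_b")
news.deactivate()
seeds = seeds_to_harvest([blog, news])
```

In this sketch the deactivated record is kept rather than deleted, so its description and notes remain available.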

The BnF’s collections

Two types of collections
– Ongoing focused collections: selections maintained by the different departments of the library, related to their expertise and to their collection policy
– Project collections: selections on a specific theme or in connection with an event, usually organized by members of several departments of the library, sometimes in cooperation with external partners
The domain list of the broad crawl is not in BCWeb.
In the admin section, a collection can be mapped to a harvest definition.

Correspondence between the collections and the harvest definitions
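One way to picture this correspondence is a lookup table from collection name to harvest definition, used to group selected URLs before they are handed over. The collection names, harvest definition names and the `group_by_harvest` helper below are invented for illustration and do not come from BCWeb or NetarchiveSuite.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical mapping maintained in the admin section.
COLLECTION_TO_HARVEST: Dict[str, str] = {
    "Literature (ongoing)": "hd_literature",
    "Elections (project)": "hd_elections",
}


def group_by_harvest(
    selections: List[Tuple[str, str]],
    mapping: Dict[str, str],
) -> Dict[str, List[str]]:
    """Group (collection, url) pairs under their harvest definition."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for collection, url in selections:
        if collection not in mapping:
            # Every collection needs a harvest definition before transfer.
            raise KeyError(f"no harvest definition for collection {collection!r}")
        grouped[mapping[collection]].append(url)
    return dict(grouped)


selections = [
    ("Literature (ongoing)", "http://example.org/"),
    ("Elections (project)", "http://example.com/"),
    ("Literature (ongoing)", "http://example.net/"),
]
grouped = group_by_harvest(selections, COLLECTION_TO_HARVEST)
```

Failing loudly on an unmapped collection is a design choice of this sketch; the presentation does not say how BCWeb handles that case.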

To conclude…
BCWeb is tailored to the BnF's workflow and needs:
– It does not manage information about producers or rights
– It is not a QA tool (there are no links between BCWeb and the archives, and the historical URLs of a website are not visible)
Development is now limited:
– It concerns improvements and bug fixes (for example, performance)
– The BnF is starting to open its code