1 NetarchiveSuite Workshop Paris November 24 - 25, 2011.



2 Reflection on metrics

Objectives of the discussion
– To present the five main metrics used by each institution
– To share how we communicate them to partners and researchers
– BnF has 45 statistics for harvesting activity and is now trying to define the most relevant ones for external partners
– What about the libraries of Denmark and Austria?

ISO report
– An ISO working group has clarified definitions and proposed statistics and indicators for:
    better advocacy of web archiving initiatives in a wider environment, national or international;
    international measures and comparisons;
    best evaluation practices within institutions.
– The discussion will focus on the first point

3 Core statistics for collection development

No. | Statistic | Purpose | Example
01 | Total number of URLs (or responses) | Quantity of information in Web Archive | 14 billion URLs
02 | Distribution of URLs by status codes, and especially number of harvested files (2XX) | Number of items in Web Archive | 2 million harvested files or items
03 | Total size stored (compressed size, in bytes) | Overall size of Web Archive | 200 terabytes
04 | Total size of the harvested files (uncompressed size, in bytes) | Size of Web Archive for collected items | 180 terabytes
05 | Number of targets | Objectives of the collection | 8,000 targets
06 | Number of target instances (captures) | Resulting content | 14,000 target instances
07 | Number of WARC files or any other container files | Number of conservation units in Archive | 18,000 WARC files
08 | Number of domains or hosts | Number of website-alike items in Web Archive | core indicator or not?
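As an illustration of statistics 01, 02 and 04, here is a minimal Python sketch that derives them from a list of harvested responses. The Response record and the sample values are assumptions made for the example; they are not a NetarchiveSuite or Heritrix format, and an institution would feed in its own crawl data.

```python
from collections import Counter
from typing import NamedTuple


class Response(NamedTuple):
    """One harvested response (hypothetical record layout for this sketch)."""
    url: str
    status: int
    size_bytes: int  # uncompressed size of the response body


def core_statistics(responses):
    """Compute statistics 01, 02 and 04 from an iterable of Response records."""
    responses = list(responses)
    # 02: group status codes into classes such as "2XX", "3XX", "4XX".
    by_class = Counter(f"{r.status // 100}XX" for r in responses)
    # 04: uncompressed size of the harvested (2XX) files only.
    harvested_bytes = sum(
        r.size_bytes for r in responses if 200 <= r.status < 300
    )
    return {
        "total_urls": len(responses),               # 01
        "status_distribution": dict(by_class),      # 02
        "harvested_files": by_class.get("2XX", 0),  # 02, harvested files
        "harvested_bytes": harvested_bytes,         # 04
    }


# Made-up sample data, for illustration only.
sample = [
    Response("http://example.fr/", 200, 5120),
    Response("http://example.fr/logo.png", 200, 20480),
    Response("http://example.fr/old", 404, 312),
    Response("http://example.dk/", 301, 0),
]
stats = core_statistics(sample)
print(stats)
```

Statistic 03 (compressed size stored) is left out because it depends on the container files on disk rather than on individual responses.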

4 Core statistics for collection characterisation

No. | Statistic | Purpose | Example
09 | Distribution by top level or second level domain | Geographic distribution | 70% of collection in .fr TLD
10 | Distribution by size of domains | Size characterisation and crawl management | 3% of collection provides 30% of the collected URLs
11 | Distribution by format types | Format/Document type characterisation | 60% of collection is in html/text
12 | Distribution by languages | Linguistic distribution | 80% of collection includes Danish language
13 | Chronological coverage | Temporal analysis of collection | Oldest captures in Archive holdings date from 1996
14 | Number of granted permissions | Efficiency of permission management | 60% of permissions granted by publishers
15 | Number of nominations | Selection activity | 30% of collection is the outcome of human selection
16 | Distribution by subject domains using descriptive or provenance metadata | Subject distribution and content/topical analysis | 20% of collection related to Arts subjects and topics
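The distributions in statistics 09 and 11 are percentage breakdowns of the collected URLs by some category. A minimal Python sketch follows, assuming each record is a (URL, MIME type) pair; that layout, the helper names and the sample data are hypothetical and only serve to show the calculation.

```python
from collections import Counter
from urllib.parse import urlsplit


def share_by(items, key):
    """Return {category: percentage of items}, rounded to one decimal."""
    counts = Counter(key(item) for item in items)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}


def top_level_domain(record):
    """Last label of the host name, e.g. 'fr' for www.example.fr."""
    host = urlsplit(record[0]).hostname or ""
    return host.rsplit(".", 1)[-1]


# (url, MIME type) pairs: made-up sample data, for illustration only.
records = [
    ("http://www.example.fr/", "text/html"),
    ("http://www.example.fr/a.pdf", "application/pdf"),
    ("http://blog.example.fr/", "text/html"),
    ("http://www.example.dk/", "text/html"),
]
tld_share = share_by(records, top_level_domain)        # statistic 09
format_share = share_by(records, lambda r: r[1])       # statistic 11
print(tld_share, format_share)
```

The same share_by helper could be reused for the language or subject distributions (statistics 12 and 16), provided the records carry that metadata.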

5 The case of BnF

Main indicators
– 01/ Total number of URLs
– 03/ Total size stored
– 05/ Number of targets
– 09/ Distribution by top level or second level domain
– 11/ Distribution by format types

Ways of communication
– Annual report on the BnF website
– Conferences and workshops with librarians or researchers
– Drafting, for 2012, of a report on documents received by legal deposit, for institutional partners
– Nothing special for readers

