Metadata Normalisation in Europeana The Hague, 13 & 14 January 2009 Julie Verleyen Scientific Coordinator, Europeana Office EuropeanaLocal Knowledge Sharing.

1 Metadata Normalisation in Europeana The Hague, 13 & 14 January 2009 Julie Verleyen Scientific Coordinator, Europeana Office EuropeanaLocal Knowledge Sharing Workshop

2 A. Workflow B. Metadata normalisation with ESE C. Approach in practice: Demo of tools used D. Knowledge SHARING Workshop: Discussion of the practice for EuropeanaLocal Session

4 CONTENT SURVEY #0

5 Stage #0: Content survey. Input: questionnaire. Output: specifications of content contribution (Excel specs).

9 CONTENT SURVEY #0

10 Stage #1: Harvesting and package creation. Input: Excel specs. Output: harvested data in XML (XML raw data); collection-specific analysis tool (HTML analysis tool); sample of source data, 1000 records (XML sample raw data); mapping specifications template (TXT mapping template).

11 CONTENT SURVEY #0

12 Stage #2: Analysis and mapping specifications. Input: Excel specs; HTML analysis tool; XML sample raw data; TXT mapping template. Output: TXT mapping specs.

14 CONTENT SURVEY #0

15 Stage #3: Mapping and normalisation. Input: XML raw data; TXT mapping specs. Output: XML normalised mapped data; XML profile. Quality check.

16 NORMALISER

17 STAGE 3

18 CONTENT SURVEY #0

19 Stage #4: Database storage and indexing. Input: XML normalised mapped data. Output: DB; index.

20 A. Workflow B. Metadata normalisation with ESE C. Approach in practice: Demo of tools used D. Knowledge SHARING Workshop: Discussion of the practice for EuropeanaLocal Session

21 Europeana Semantic Elements (ESE): Europeana “Schema” for the Prototype. Based on the Dublin Core Metadata Element Set (DCMES) (ISO ). 49 Elements (26 Elements & 23 Refinements). Created through discussions in July/August 2008.

22 ESE specialities: europeana:country; europeana:provider (dc:source); europeana:language (dc:language); europeana:type (dc:type, dc:format); europeana:year (dc:date); europeana:isShownBy (dc:relation); europeana:isShownAt (dc:relation); europeana:object; europeana:uri (dc:identifier).

23 ESE specialities. All normalised: syntax and value. Let’s examine their characteristics.

24 Normalised ESE terms: Country. Definition: Country of content provider; if several countries: Europe. Format: String, e.g. switzerland, germany, … Reference: TEL controlled list; supports TEL interface translation mechanism. Mechanism: Manual. In portal: Facet browsing of search results.

26 Normalised ESE terms: Provider. Definition: Organisation sending the data to Europeana. Format: String, e.g. Musées lausannois, Nasjonalbiblioteket, … Reference: Europeana controlled list of content providers. Mechanism: Manual, but can potentially be automated. In portal: Facet browsing of search results.

29 Normalised ESE terms: Language. Definition: Language of provider’s country (ESE: languages of the metadata). Format: 2 letters, e.g. it, no, fr, en, es, … Reference: ISO 639-1 language codes. Exception: if several languages: “mul”. Mechanism: Manual, but can potentially be automated. In portal: Facet browsing of search results.
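The Language rule above could be sketched as follows. This is only an illustration: `normalise_language` is a hypothetical helper, not a Europeana tool, and it checks only the two-letter shape of a code, not whether the code actually exists in ISO 639-1.

```python
import re

# Shape check only: two ASCII letters. Membership in ISO 639-1 is NOT verified.
TWO_LETTER = re.compile(r"^[a-z]{2}$")


def normalise_language(codes):
    """Return one europeana:language value for a set of declared codes.

    Hypothetical helper: a single code is lower-cased and returned,
    several distinct codes collapse to "mul" as the slide describes.
    """
    cleaned = sorted({c.strip().lower() for c in codes if c and c.strip()})
    if not cleaned:
        raise ValueError("no language code supplied")
    if len(cleaned) > 1:
        return "mul"
    code = cleaned[0]
    if not TWO_LETTER.match(code):
        raise ValueError(f"not a 2-letter code: {code!r}")
    return code
```

For instance, `normalise_language(["IT"])` gives `"it"`, while a collection declaring both `fr` and `en` gets `"mul"`.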

30 Normalised ESE terms: Type. Definition: Type of the original object. Format: String. Reference: 4 Europeana types: IMAGE, TEXT, SOUND, VIDEO. Mechanism: Manual: mapping specified by content provider. In portal: Categorisation display; facet browsing of search results.
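A provider-specified type mapping might look like the sketch below. The function name, the lower-casing of keys, and the example values are assumptions made for illustration; only the four target types come from the slide.

```python
# The four Europeana types named on the slide.
EUROPEANA_TYPES = {"IMAGE", "TEXT", "SOUND", "VIDEO"}


def map_type(source_value, provider_mapping):
    """Map a provider's own type value to a Europeana type, or None if unmapped.

    `provider_mapping` is a dict with lower-cased source values as keys
    (an assumption of this sketch, not a documented format).
    """
    target = provider_mapping.get(source_value.strip().lower())
    if target is None:
        return None
    if target not in EUROPEANA_TYPES:
        raise ValueError(f"not a Europeana type: {target!r}")
    return target


# Hypothetical mapping as a content provider might specify it.
example_mapping = {"photograph": "IMAGE", "film": "VIDEO", "manuscript": "TEXT"}
```

Returning `None` for unmapped values keeps the decision about what to do with such records outside the mapping step.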

32 Normalised ESE terms: Year. Definition: Date of creation of the original object (analog or born digital). Format: 4 digits [YYYY], e.g. 1950. Reference: Europeana year. Mechanism: Automatic extraction with “YearExtractor” converter. In portal: Facet browsing of search results; browsing by time (timeline).
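The slides do not show how the “YearExtractor” converter works. As a rough illustration of the idea only, a regular expression can pull standalone 4-digit years out of a free-text dc:date value; `extract_years` below is a hypothetical stand-in, not the actual converter.

```python
import re

# A run of exactly four digits, not embedded in a longer digit run.
YEAR = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_years(date_value):
    """Return the distinct 4-digit years in a date string, in order of appearance."""
    years = []
    for match in YEAR.finditer(date_value):
        year = match.group(1)
        if year not in years:
            years.append(year)
    return years
```

With this sketch, `"ca. 1950"` yields `["1950"]` and `"1914-1918"` yields `["1914", "1918"]`; a compact value such as `"19500101"` yields nothing, so a real converter would need additional rules for such formats.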

35 Normalised ESE terms: isShownBy. Definition: URL to the digital object. Format: URL (http://...). Mechanism: Automatic or manual. In portal: Linking.

37 Normalised ESE terms: isShownAt. Definition: URL to the digital object with context. Format: URL (http://...). Mechanism: Automatic or manual. In portal: Linking.

39 Normalised ESE terms: Object. Definition: URL to the digital object as thumbnail. Format: URL (http://...). Mechanism: Automatic or manual. In portal: Display.

41 Normalised ESE terms: URI. Definition: Record identifier for Europeana system. Format: URI. Mechanism: Automatic: special algorithm guaranteeing uniqueness (and integrity) of records (example: http://www.europeana.eu/resolve/record/91101/0BAF44EDF8B98F1322DEEAD4AB989778E6394418). In portal: MyEuropeana; full digital object view in Europeana.
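The algorithm behind the record URI is not described in the slides. The sketch below shows one way such an identifier could be built: the 40 hexadecimal digits in the example URI have the length of a SHA-1 digest, but the use of SHA-1, the choice of what gets hashed, and the name `make_record_uri` are all assumptions.

```python
import hashlib

# Base path taken from the example URI on the slide.
RESOLVE_BASE = "http://www.europeana.eu/resolve/record"


def make_record_uri(collection_id, source_identifier):
    """Build a deterministic record URI from a collection id and a source identifier.

    Assumption: the last path segment is an upper-case SHA-1 hex digest of the
    provider's own identifier. The real Europeana algorithm may differ.
    """
    digest = hashlib.sha1(source_identifier.encode("utf-8")).hexdigest().upper()
    return f"{RESOLVE_BASE}/{collection_id}/{digest}"
```

Hashing makes the identifier reproducible: the same source identifier always maps to the same URI, so re-ingesting a collection does not create new record identifiers.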

42 A. Workflow B. Metadata normalisation with ESE C. Approach in practice: Demo of tools used D. Knowledge SHARING Workshop: Discussion of the practice for EuropeanaLocal Session

43 Metadata normalisation in practice. Demo of stage #3’s workflow (mapping & normalisation): 1. Go through data of example collection #1. 2. Practical exercise: let’s normalise example collection #2 for Europeana! 3. Two examples of known issues.

44 SUBVERSION (SVN)

45 COLLECTION FOLDER: SOURCE XML; MAPPING SPECS TXT; OUTPUT XML; MAPPING/NORM. SPECS XML

46 Example 1: “Midas” collection. 83 moving image records from the Association des Cinémathèques Européennes: harvested data; fields mapping/Type values mapping specs; analysis file (source data); mapping file; profile file; analysis file + sample (normalised data).

47 Example 2: “Outsider Art Museum” collection 4142 records from the Musées Lausannois

48 Known issues with mapping/profile files 1. Wrong syntax in mapping file causes errors in profile.xml: using “=>” in a comment in mapping.txt creates a mapping entry in profile.xml! Ex: ………

52 BEFORE

53 AFTER

54 Known issues with mapping/profile files 2. Wrong syntax in mapping file causes errors in profile.xml: there should be 2 blanks between “=>” and “N/A”, not one; otherwise the mapping specification is not correctly formatted as XML in profile.xml. Ex: ………………….

55 MAPPING.TXT / PROFILE.XML: profile.xml with error: 2 white spaces!
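Both known issues could be caught before the profile is generated with a small check over mapping.txt. The actual mapping.txt syntax is not reproduced in this transcript, so the sketch assumes that comment lines start with `#` and that mapping lines have the shape `source => target`; `lint_mapping` is a hypothetical helper.

```python
import re

# Assumption: the real comment marker of mapping.txt is not shown in the slides.
COMMENT_PREFIX = "#"


def lint_mapping(lines):
    """Return (line_number, message) pairs for the two known issues."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        if stripped.startswith(COMMENT_PREFIX):
            # Issue 1: "=>" inside a comment is picked up as a mapping entry.
            if "=>" in stripped:
                problems.append((number, 'comment contains "=>"'))
            continue
        # Issue 2: exactly two blanks are expected between "=>" and "N/A".
        match = re.search(r"=>( *)N/A", line)
        if match and len(match.group(1)) != 2:
            problems.append((number, 'expected 2 blanks between "=>" and "N/A"'))
    return problems
```

Running it on the lines of a mapping file returns an empty list when neither issue is present.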

56 Documentation in Europeana context: Europeana Semantic Elements (ESE) v3.1; “Europeana – Data Offline Preparation”; commented version of “profile.xml”; “Quality Control Checklist”.

59 A. Workflow B. Metadata normalisation with ESE C. Approach in practice: Demo of tools used D. Knowledge SHARING Workshop: Discussion of the practice for EuropeanaLocal Session

60 Discussion. Questions about the Europeana metadata ingestion/normalisation process? Integration and/or compatibility of this process with the EuropeanaLocal content strategy: Where will normalisation take place? By whom? …

61 Thank you Julie.Verleyen@kb.nl

64 Discarding factors during normalisation: duplicated records; records without URLs to digital object; records without Europeana type (SOUND, TEXT, IMAGE, VIDEO); records pointing to copyright-protected digital objects.
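The discarding factors can be read as a filter applied to each record. The sketch below is an illustration under assumptions: records are plain dicts, the keys `id`, `isShownBy`, `isShownAt`, `type` and `copyright_protected` are hypothetical names, and the four types are those listed on the Type slide (IMAGE, TEXT, SOUND, VIDEO).

```python
EUROPEANA_TYPES = {"IMAGE", "TEXT", "SOUND", "VIDEO"}


def discard_reason(record, seen_ids):
    """Return why a record would be discarded, or None if it is kept."""
    record_id = record.get("id")
    if record_id is not None and record_id in seen_ids:
        return "duplicated record"
    if not (record.get("isShownBy") or record.get("isShownAt")):
        return "no URL to digital object"
    if record.get("type") not in EUROPEANA_TYPES:
        return "no Europeana type"
    if record.get("copyright_protected"):
        return "copyright-protected digital object"
    return None


def filter_records(records):
    """Split records into kept ones and (record, reason) pairs for discarded ones."""
    kept, discarded, seen_ids = [], [], set()
    for record in records:
        reason = discard_reason(record, seen_ids)
        if reason is None:
            kept.append(record)
        else:
            discarded.append((record, reason))
        if record.get("id") is not None:
            seen_ids.add(record["id"])
    return kept, discarded
```

Keeping the reason next to each discarded record makes it possible to report back to the content provider why records were dropped.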

